Query CGI_ID Hit type PSSM-ID From To E-Value Bitscore Accession Short name Incomplete Superfamily Definition Q#2 - CGI_10000456 superfamily 241841 11 135 8.33E-46 147.279 cl00399 MoaE superfamily - - "MoaE family. Members of this family are involved in biosynthesis of the molybdenum cofactor (Moco), an essential cofactor for a diverse group of redox enzymes. Moco biosynthesis is an evolutionarily conserved pathway present in eubacteria, archaea and eukaryotes. Moco contains a tricyclic pyranopterin, termed molybdopterin (MPT), which carries the cis-dithiolene group responsible for molybdenum ligation. This dithiolene group is generated by MPT synthase in the second major step in Moco biosynthesis. MPT synthase is a heterotetramer consisting of two large (MoaE) and two small (MoaD) subunits." Q#4 - CGI_10000774 superfamily 220249 54 121 1.85E-18 74.564 cl09695 H_lectin superfamily - - "H-type lectin domain; The H-type lectin domain is a unit of six beta chains, combined into a homo-hexamer. It is involved in self/non-self recognition of cells, through binding with carbohydrates. It is sometimes found in association with the F5_F8_type_C domain pfam00754." Q#6 - CGI_10000861 superfamily 217473 50 320 1.53E-25 106.68 cl03978 Mab-21 superfamily - - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#7 - CGI_10000994 superfamily 245612 46 539 0 645.522 cl11426 Amidase superfamily - - Amidase; Amidase. Q#8 - CGI_10000643 superfamily 241600 1 181 1.26E-76 230.975 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. 
The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#9 - CGI_10000763 superfamily 247684 57 82 0.00144517 34.6287 cl17037 NBD_sugar-kinase_HSP70_actin superfamily C - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#10 - CGI_10000610 superfamily 243072 181 303 4.10E-33 120.951 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#10 - CGI_10000610 superfamily 243072 12 131 5.08E-16 73.5718 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#10 - CGI_10000610 superfamily 243072 278 363 2.46E-06 45.4523 cl02529 ANK superfamily C - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#13 - CGI_10001333 superfamily 241739 152 460 1.49E-174 496.312 cl00268 class_II_aaRS-like_core superfamily - - "Class II tRNA amino-acyl synthetase-like catalytic core domain. Class II amino acyl-tRNA synthetases (aaRS) share a common fold and generally attach an amino acid to the 3' OH of ribose of the appropriate tRNA. 
PheRS is an exception in that it attaches the amino acid at the 2'-OH group, like class I aaRSs. These enzymes are usually homodimers. This domain is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. The substrate specificity of this reaction is further determined by additional domains. Interestingly, this domain is also found in asparagine synthase A (AsnA), in the accessory subunit of mitochondrial polymerase gamma and in the bacterial ATP phosphoribosyltransferase regulatory subunit HisZ." Q#13 - CGI_10001333 superfamily 217020 2 94 2.13E-16 75.3238 cl03574 Seryl_tRNA_N superfamily - - Seryl-tRNA synthetase N-terminal domain; This domain is found associated with the Pfam tRNA synthetase class II domain (pfam00587) and represents the N-terminal domain of seryl-tRNA synthetase. Q#14 - CGI_10002404 superfamily 241609 45 113 8.30E-23 88.5891 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#15 - CGI_10002405 superfamily 216897 190 269 3.48E-15 69.2473 cl03463 Gal_Lectin superfamily - - Galactose binding lectin domain; Galactose binding lectin domain. Q#17 - CGI_10002407 superfamily 197676 415 437 0.000585722 38.9861 cl18194 ZnF_C2H2 superfamily - - zinc finger; zinc finger. Q#18 - CGI_10001404 superfamily 115560 136 177 0.00926158 34.0824 cl06117 MEA1 superfamily N - "Male enhanced antigen 1 (MEA1); This family consists of several mammalian male enhanced antigen 1 (MEA1) proteins. The Mea-1 gene is found to be localised in primary and secondary spermatocytes and spermatids, but the protein products are detected only in spermatids. 
Intensive transcription of Mea-1 gene and specific localisation of the gene product suggest that Mea-1 may play an important role in the late stage of spermatogenesis." Q#19 - CGI_10001405 superfamily 243066 30 131 1.56E-25 101.54 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#19 - CGI_10001405 superfamily 198867 142 240 2.28E-12 63.7167 cl06652 BACK superfamily - - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation)." Q#19 - CGI_10001405 superfamily 243146 381 426 3.97E-09 53.4342 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#19 - CGI_10001405 superfamily 243146 430 472 1.66E-08 51.5082 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. 
" Q#19 - CGI_10001405 superfamily 243146 331 378 3.89E-08 50.3526 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#19 - CGI_10001405 superfamily 243146 487 538 3.32E-06 44.8567 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#19 - CGI_10001405 superfamily 243146 528 574 9.06E-05 40.5093 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#20 - CGI_10001406 superfamily 241581 136 234 1.17E-19 85.901 cl00062 FHA superfamily - - "Forkhead associated domain (FHA); found in eukaryotic and prokaryotic proteins. Putative nuclear signalling domain. FHA domains may bind phosphothreonine, phosphoserine and sometimes phosphotyrosine. In eukaryotes, many FHA domain-containing proteins localize to the nucleus, where they participate in establishing or maintaining cell cycle checkpoints, DNA repair, or transcriptional regulation. Members of the FHA family include: Dun1, Rad53, Cds1, Mek1, KAPP(kinase-associated protein phosphatase),and Ki-67 (a human nuclear protein related to cell proliferation)." Q#20 - CGI_10001406 superfamily 190615 338 404 2.46E-07 49.5276 cl04028 dsRNA_bind superfamily - - "Double stranded RNA binding domain; This domain is a divergent double stranded RNA-binding domain. It is found in members of the Dicer protein family which function in RNA interference, an evolutionarily conserved mechanism for gene silencing using double-stranded RNA (dsRNA) molecules." Q#21 - CGI_10001407 superfamily 243187 494 666 4.19E-103 317.205 cl02789 EFG_like_IV superfamily - - "Elongation Factor G-like domain IV. This family includes the translational elongation factor termed EF-2 (for Archaea and Eukarya) and EF-G (for Bacteria), ribosomal protection proteins that mediate tetracycline resistance and, an evolutionarily conserved U5 snRNP-specific protein (U5-116kD). 
In complex with GTP, EF-G/EF-2 promotes the translocation step of translation. During translocation the peptidyl-tRNA is moved from the A site to the P site of the small subunit of ribosome and the mRNA is shifted one codon relative to the ribosome. It has been shown that EF-G/EF-2_IV domain mimics the shape of anticodon arm of the tRNA in the structurally homologous ternary complex of Phe-tRNA, EF-Tu (another translational elongation factor) and GTP analog. The tip portion of this domain is found in a position that overlaps the anticodon arm of the A-site tRNA, implying that EF-G/EF-2 displaces the A-site tRNA to the P-site by physical interaction with the anticodon arm." Q#21 - CGI_10001407 superfamily 243185 314 407 2.30E-46 160.421 cl02787 Translation_Factor_II_like superfamily - - "Translation_Factor_II_like: Elongation factor Tu (EF-Tu) domain II-like proteins. Elongation factor Tu consists of three structural domains, this family represents the second domain. Domain II adopts a beta barrel structure and is involved in binding to charged tRNA. Domain II is found in other proteins such as elongation factor G and translation initiation factor IF-2. This group also includes the C2 subdomain of domain IV of IF-2 that has the same fold as domain II of (EF-Tu). Like IF-2 from certain prokaryotes such as Thermus thermophilus, mitochondrial IF-2 lacks domain II, which is thought to be involved in binding of E.coli IF-2 to 30S subunits." Q#21 - CGI_10001407 superfamily 243183 662 741 3.71E-39 139.983 cl02785 Elongation_Factor_C superfamily - - "Elongation factor G C-terminus. This domain includes the carboxyl terminal regions of elongation factors (EFs) bacterial EF-G, eukaryotic and archaeal EF-2 and eukaryotic mitochondrial mtEFG1s and mtEFG2s. 
This group also includes proteins similar to the ribosomal protection proteins Tet(M) and Tet(O), BipA, LepA and, spliceosomal proteins: human 116kD U5 small nuclear ribonucleoprotein (snRNP) protein (U5-116 kD) and yeast counterpart Snu114p. This domain adopts a ferredoxin-like fold consisting of an alpha-beta sandwich with anti-parallel beta-sheets, resembling the topology of domain III found in the elongation factors EF-G and eukaryotic EF-2, with which it forms the C-terminal block. The two domains however are not superimposable and domain III lacks some of the characteristics of this domain. EF-2/EF-G in complex with GTP, promotes the translocation step of translation. During translocation the peptidyl-tRNA is moved from the A site to the P site, the uncharged tRNA from the P site to the E-site and, the mRNA is shifted one codon relative to the ribosome. Tet(M) and Tet(O) mediate Tc resistance. Typical Tcs bind to the ribosome and inhibit the elongation phase of protein synthesis, by inhibiting the occupation of site A by aminoacyl-tRNA. Tet(M) and Tet(O) catalyze the release of tetracycline (Tc) from the ribosome in a GTP-dependent manner. BipA is a highly conserved protein with global regulatory properties in Escherichia coli. Yeast Snu114p is essential for cell viability and for splicing in vivo. Experiments suggest that GTP binding and probably GTP hydrolysis is important for the function of the U5-116 kD/Snu114p. The function of LepA proteins is unknown." Q#21 - CGI_10001407 superfamily 247724 1 171 1.26E-67 224.033 cl17170 Ras_like_GTPase superfamily N - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. 
This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#22 - CGI_10002515 superfamily 245323 530 801 2.04E-132 405.088 cl10511 Beach superfamily - - "BEACH (Beige and Chediak-Higashi) domains, implicated in membrane trafficking, are present in a family of proteins conserved throughout eukaryotes. This group contains human lysosomal trafficking regulator (LYST), LPS-responsive and beige-like anchor (LRBA) and neurobeachin. Disruption of LYST leads to Chediak-Higashi syndrome, characterized by severe immunodeficiency, albinism, poor blood coagulation and neurologic problems. Neurobeachin is a candidate gene linked to autism. LRBA seems to be upregulated in several cancer types. It has been shown that the BEACH domain itself is important for the function of these proteins." 
Q#22 - CGI_10002515 superfamily 243092 858 1073 7.02E-33 130.148 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#22 - CGI_10002515 superfamily 247725 421 515 1.35E-15 74.6391 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." 
Q#22 - CGI_10002515 superfamily 248312 36 187 5.01E-10 58.9041 cl17758 PMP22_Claudin superfamily - - PMP-22/EMP/MP20/Claudin family; PMP-22/EMP/MP20/Claudin family. Q#23 - CGI_10002516 superfamily 245596 9 236 1.79E-125 357.999 cl11394 Glyco_tranf_GTA_type superfamily - - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#25 - CGI_10001165 superfamily 241592 38 78 1.22E-21 84.6107 cl00074 H2A superfamily NC - "Histone 2A; H2A is a subunit of the nucleosome. The nucleosome is an octamer containing two H2A, H2B, H3, and H4 subunits. The H2A subunit performs essential roles in maintaining structural integrity of the nucleosome, chromatin condensation, and binding of specific chromatin-associated proteins." 
Q#27 - CGI_10001233 superfamily 241563 66 97 1.80E-06 45.3559 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#27 - CGI_10001233 superfamily 248318 14 37 0.00496026 35.4486 cl17764 FYVE superfamily C - "FYVE domain; Zinc-binding domain; targets proteins to membrane lipids via interaction with phosphatidylinositol-3-phosphate, PI3P; present in Fab1, YOTB, Vac1, and EEA1;" Q#30 - CGI_10002446 superfamily 243179 558 636 2.50E-08 52.3501 cl02781 tetraspanin_LEL superfamily - - "Tetraspanin, extracellular domain or large extracellular loop (LEL). Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. The tetraspanin family contains CD9, CD63, CD37, CD53, CD82, CD151, and CD81, amongst others. Tetraspanins are involved in diverse processes such as cell activation and proliferation, adhesion and motility, differentiation, cancer, and others. Their various functions may relate to their ability to act as molecular facilitators, grouping specific cell-surface proteins and affecting formation and stability of signaling complexes. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web", which may also include integrins." Q#32 - CGI_10002757 superfamily 245864 166 224 1.29E-14 71.1554 cl12078 p450 superfamily N - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. 
They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#34 - CGI_10000261 superfamily 217293 1 147 1.19E-34 121.586 cl03788 Neur_chan_LBD superfamily N - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#37 - CGI_10003755 superfamily 241572 93 138 2.62E-05 42.2257 cl00050 CYCLIN superfamily C - "Cyclin box fold. Protein binding domain functioning in cell-cycle and transcription control. Present in cyclins, TFIIB and Retinoblastoma (RB). The cyclins consist of 8 classes of cell cycle regulators that regulate cyclin dependent kinases (CDKs). TFIIB is a transcription factor that binds the TATA box. Cyclins, TFIIB and RB contain 2 copies of the domain." Q#37 - CGI_10003755 superfamily 219547 313 408 1.63E-12 63.8141 cl06669 BRF1 superfamily - - Brf1-like TBP-binding domain; This region covers both the Brf homology II and III regions. 
This region is involved in binding TATA binding protein. Q#37 - CGI_10003755 superfamily 203895 6 44 2.67E-09 53.4414 cl07036 TF_Zn_Ribbon superfamily - - TFIIB zinc-binding; The transcription factor TFIIB contains a zinc-binding motif near the N-terminus. This domain is involved in the interaction with RNA pol II and TFIIF and plays a crucial role in selecting the transcription initiation site. The domain adopts a zinc ribbon like structure. Q#37 - CGI_10003755 superfamily 241572 168 212 0.00318072 35.7227 cl00050 CYCLIN superfamily C - "Cyclin box fold. Protein binding domain functioning in cell-cycle and transcription control. Present in cyclins, TFIIB and Retinoblastoma (RB). The cyclins consist of 8 classes of cell cycle regulators that regulate cyclin dependent kinases (CDKs). TFIIB is a transcription factor that binds the TATA box. Cyclins, TFIIB and RB contain 2 copies of the domain." Q#38 - CGI_10003756 superfamily 243066 91 194 1.88E-21 89.2137 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." 
Q#38 - CGI_10003756 superfamily 198867 204 307 1.09E-10 58.5068 cl06652 BACK superfamily - - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation)." Q#40 - CGI_10003758 superfamily 217255 473 667 3.20E-50 174.864 cl03746 DDHD superfamily - - "DDHD domain; The DDHD domain is 180 residues long and contains four conserved residues that may form a metal binding site. The domain is named after these four residues. This pattern of conservation of metal binding residues is often seen in phosphoesterase domains. This domain is found in retinal degeneration B proteins, as well as a family of probable phospholipases. It has been shown that this domain is found in a longer C terminal region that binds to PYK2 tyrosine kinase. These proteins have been called N-terminal domain-interacting receptor (Nir1, Nir2 and Nir3). This suggests that this region is involved in functionally important interactions in other members of this family." Q#42 - CGI_10001931 superfamily 245201 694 886 1.28E-71 238.205 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#42 - CGI_10001931 superfamily 241584 536 616 6.04E-05 42.4835 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. 
Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#43 - CGI_10002150 superfamily 245546 97 120 3.36E-05 39.8097 cl11198 zf-ribbon_3 superfamily - - "zinc-ribbon domain; This family consists of a single zinc ribbon domain, ie half of a pair as in family DZR. pfam12773." Q#44 - CGI_10003110 superfamily 243054 384 610 1.85E-29 117.547 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#44 - CGI_10003110 superfamily 241559 135 238 2.37E-25 102.389 cl00030 CH superfamily - - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." 
Q#44 - CGI_10003110 superfamily 241559 30 124 5.74E-18 81.2031 cl00030 CH superfamily - - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." Q#44 - CGI_10003110 superfamily 243054 266 489 2.54E-15 75.5599 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#44 - CGI_10003110 superfamily 243054 502 723 1.55E-14 73.2487 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#44 - CGI_10003110 superfamily 247856 738 805 6.65E-12 62.5653 cl17302 EFh superfamily - - 
"EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#46 - CGI_10002840 superfamily 245202 77 165 3.58E-41 135.081 cl09927 S1_like superfamily - - "S1_like: Ribosomal protein S1-like RNA-binding domain. Found in a wide variety of RNA-associated proteins. Originally identified in S1 ribosomal protein. This superfamily also contains the Cold Shock Domain (CSD), which is a homolog of the S1 domain. Both domains are members of the Oligonucleotide/oligosaccharide Binding (OB) fold." Q#46 - CGI_10002840 superfamily 243703 1 77 7.75E-38 126.528 cl04309 RNAP_Rpb7_N_like superfamily - - "RNAP_Rpb7_N_like: This conserved domain represents the N-terminal ribonucleoprotein (RNP) domain of the Rpb7 subunit of eukaryotic RNA polymerase (RNAP) II and its homologs, Rpa43 of eukaryotic RNAP I, Rpc25 of eukaryotic RNAP III, and RpoE (subunit E) of archaeal RNAP. These proteins have, in addition to their N-terminal RNP domain, a C-terminal oligonucleotide-binding (OB) domain. Each of these subunits heterodimerizes with another RNAP subunit (Rpb7 to Rpb4, Rpc25 to Rpc17, RpoE to RpoF, and Rpa43 to Rpa14). The heterodimer is thought to tether the RNAP to a given promoter via its interactions with a promoter-bound transcription factor.The heterodimer is also thought to bind and position nascent RNA as it exits the polymerase complex." Q#47 - CGI_10002841 superfamily 243039 336 515 7.94E-108 322.283 cl02446 MATH superfamily - - "MATH (meprin and TRAF-C homology) domain; an independent folding unit with an eight-stranded beta-sandwich structure found in meprins, TRAFs and other proteins. 
Meprins comprise a class of extracellular metalloproteases which are anchored to the membrane and are capable of cleaving growth factors, extracellular matrix proteins, and biologically active peptides. TRAF molecules serve as adapter proteins that link cell surface receptors of the Tumor Necrosis Factor and Interleukin-1/Toll-like families to downstream kinase cascades, which results in the activation of transcription factors and the regulation of cell survival, proliferation and stress responses in the immune and inflammatory systems. Other members include the ubiquitin ligases, TRIM37 and SPOP, and the ubiquitin-specific proteases, HAUSP and Ubp21p. A large number of uncharacterized members mostly from lineage-specific expansions in C. elegans and rice contain MATH and BTB domains, similar to SPOP. The MATH domain has been shown to bind peptide/protein substrates in TRAFs and HAUSP. It is possible that the MATH domain in other members of this superfamily also interacts with various protein substrates. The TRAF domain may also be involved in the trimerization of TRAFs. Based on homology, it is postulated that the MATH domain in meprins may be involved in its tetramer assembly and that the MATH domain, in general, may take part in diverse modular arrangements defined by adjacent multimerization domains." 
Q#47 - CGI_10002841 superfamily 247792 26 57 0.00154299 36.7182 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#48 - CGI_10002842 superfamily 241645 1 72 2.65E-26 95.2179 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#48 - CGI_10002842 superfamily 248233 73 125 8.68E-18 72.4039 cl17679 Ribosomal_S30 superfamily - - Ribosomal protein S30; Ribosomal protein S30. Q#50 - CGI_10004394 superfamily 247804 809 844 4.44E-08 52.1926 cl17250 SANT superfamily - - "'SWI3, ADA2, N-CoR and TFIIIB' DNA-binding domains. 
Tandem copies of the domain bind telomeric DNA tandem repeats as part of the capping complex. Binding is sequence dependent for repeats which contain the G/C rich motif [C2-3 A (CA)1-6]. The domain is also found in regulatory transcriptional repressor complexes where it also binds DNA." Q#50 - CGI_10004394 superfamily 212559 524 567 2.95E-07 49.9203 cl18297 SANT_MTA3_like superfamily - - "Myb-Like Dna-Binding Domain of MTA3 and related proteins; Members in this SANT/myb family include domains found in mouse metastasis-associated protein 3 (MTA3) proteins and arginine-glutamic dipeptide (RERE) repeats proteins. SANT (SWI3, ADA2, N-CoR and TFIIIB) DNA-binding domains are a diverse set of proteins that share a common 3 alpha-helix bundle. MTA3 has been shown to interact with nucleosome remodeling and deacetylase (NuRD) proteins CHD4 and HDAC1, and the core cohesin complex protein RAD21 in the ovary, and regulate G2/M progression in proliferating granulosa cells. RERE belongs to the atrophin family and has been identified as a nuclear receptor corepressor; altered expression levels of RERE are associated with cancer in humans while mutations of Rere in mice cause failure in closing the anterior neural tube and fusion of the telencephalic and optic vesicles during embryogenesis." Q#51 - CGI_10004395 superfamily 243176 7 526 0 876.973 cl02777 chaperonin_like superfamily - - "chaperonin_like superfamily. Chaperonins are involved in productive folding of proteins. They share a common general morphology, a double toroid of 2 stacked rings, each composed of 7-9 subunits. There are 2 main chaperonin groups. The symmetry of type I is seven-fold and they are found in eubacteria (GroEL) and in organelles of eubacterial descent (hsp60 and RBP). The symmetry of type II is eight- or nine-fold and they are found in archaea (thermosome), thermophilic bacteria (TF55) and in the eukaryotic cytosol (CTT). 
Their common function is to sequester nonnative proteins inside their central cavity and promote folding by using energy derived from ATP hydrolysis. This superfamily also contains related domains from Fab1-like phosphatidylinositol 3-phosphate (PtdIns3P) 5-kinases that only contain the intermediate and apical domains." Q#52 - CGI_10004396 superfamily 244539 274 670 0 652.401 cl06868 FNR_like superfamily - - "Ferredoxin reductase (FNR), an FAD and NAD(P) binding protein, was initially identified as a chloroplast reductase activity, catalyzing the electron transfer from reduced iron-sulfur protein ferredoxin to NADP+ as the final step in the electron transport mechanism of photosystem I. FNR transfers electrons from reduced ferredoxin to FAD (forming FADH2 via a semiquinone intermediate) and then transfers a hydride ion to convert NADP+ to NADPH. FNR has since been shown to utilize a variety of electron acceptors and donors and has a variety of physiological functions including nitrogen assimilation, dinitrogen fixation, steroid hydroxylation, fatty acid metabolism, oxygenase activity, and methane assimilation in many organisms. FNR has an NAD(P)-binding sub-domain of the alpha/beta class and a discrete (usually N-terminal) flavin sub-domain which vary in orientation with respect to the NAD(P) binding domain. The N-terminal moiety may contain a flavin prosthetic group (as in flavoenzymes) or use flavin as a substrate. Because flavins such as FAD can exist in oxidized, semiquinone (one-electron reduced), or fully reduced hydroquinone forms, FNR can interact with one and 2 electron carriers. FNR has a strong preference for NADP(H) vs NAD(H)." Q#52 - CGI_10004396 superfamily 241863 79 216 7.38E-37 136.368 cl00438 Flavodoxin_2 superfamily - - Flavodoxin-like fold; This family consists of a domain with a flavodoxin-like fold. The family includes bacterial and eukaryotic NAD(P)H dehydrogenase (quinone) EC:1.6.99.2. 
These enzymes catalyze the NAD(P)H-dependent two-electron reductions of quinones and protect cells against damage by free radicals and reactive oxygen species. This enzyme uses a FAD co-factor. The equation for this reaction is:- NAD(P)H + acceptor <=> NAD(P)(+) + reduced acceptor. This enzyme is also involved in the bioactivation of prodrugs used in chemotherapy. The family also includes acyl carrier protein phosphodiesterase EC:3.1.4.14. This enzyme converts holo-ACP to apo-ACP by hydrolytic cleavage of the phosphopantetheine residue from ACP. This family is related to pfam03358 and pfam00258. Q#53 - CGI_10004397 superfamily 243490 52 299 2.78E-66 208.284 cl03656 PS_Dcarbxylase superfamily - - "Phosphatidylserine decarboxylase; This is a family of phosphatidylserine decarboxylases, EC:4.1.1.65. These enzymes catalyze the reaction: Phosphatidyl-L-serine <=> phosphatidylethanolamine + CO2. Phosphatidylserine decarboxylase plays a central role in the biosynthesis of aminophospholipids by converting phosphatidylserine to phosphatidylethanolamine." Q#55 - CGI_10004399 superfamily 247723 773 844 1.83E-33 124.69 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. 
The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#55 - CGI_10004399 superfamily 247723 863 943 3.30E-37 135.617 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#59 - CGI_10004069 superfamily 213389 162 328 5.63E-11 62.3067 cl17092 STING_C superfamily - - "C-terminal domain of STING; STING (stimulator of interferon genes, also known as MITA, ERIS, MPYS and TMEM173) is a master regulator that mediates cytokine production in response to microbial invasion by directly sensing bacterial secondary messengers such as the cyclic dinucleotide bis-(3'-5')-cyclic dimeric GMP (c-di-GMP) and leading to the activation of IFN regulatory factor 3 (IRF3) through TANK-binding kinase 1 (TBK1) stimulation. STING is also a signaling adaptor in the IFN response to cytosolic DNA. This detection of foreign materials is the first step to a successful immune responses. 
STING is localized in the ER and comprised of an predicted N-terminal transmembrane region and a C-terminal c-di-GMP binding domain." Q#59 - CGI_10004069 superfamily 248012 10 90 1.23E-08 54.1209 cl17458 TIR_2 superfamily C - TIR domain; This is a family of bacterial Toll-like receptors. Q#60 - CGI_10004070 superfamily 241883 69 117 1.67E-20 79.8934 cl00466 ATP-synt_C superfamily N - ATP synthase subunit C; ATP synthase subunit C. Q#62 - CGI_10004072 superfamily 241641 55 117 1.23E-11 58.6293 cl00150 TY superfamily - - Thyroglobulin type I repeats.; The N-terminal region of human thyroglobulin contains 11 type-1 repeats TY repeats are proposed to be inhibitors of cysteine proteases Q#62 - CGI_10004072 superfamily 241641 141 184 7.51E-10 54.0069 cl00150 TY superfamily N - Thyroglobulin type I repeats.; The N-terminal region of human thyroglobulin contains 11 type-1 repeats TY repeats are proposed to be inhibitors of cysteine proteases Q#64 - CGI_10002683 superfamily 219541 2 125 1.56E-19 79.4347 cl18516 Cu-oxidase_2 superfamily N - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. Q#65 - CGI_10002684 superfamily 215896 20 112 4.04E-12 58.8456 cl18351 Cu-oxidase superfamily N - Multicopper oxidase; Many of the proteins in this family contain multiple similar copies of this plastocyanin-like domain. Q#66 - CGI_10003316 superfamily 222006 720 795 1.02E-09 57.2322 cl16182 Hydrolase_like2 superfamily N - Putative hydrolase of sodium-potassium ATPase alpha subunit; This is a putative hydrolase of the sodium-potassium ATPase alpha subunit. Q#66 - CGI_10003316 superfamily 215733 319 406 2.06E-06 49.1007 cl02811 E1-E2_ATPase superfamily C - E1-E2 ATPase; E1-E2 ATPase. Q#71 - CGI_10001243 superfamily 247805 35 168 1.52E-06 44.2504 cl17251 DEXDc superfamily C - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. 
This domain contains the ATP-binding region. Q#72 - CGI_10003166 superfamily 248097 1 109 1.79E-20 80.387 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#76 - CGI_10001588 superfamily 247745 106 447 5.10E-151 456.342 cl17191 GH38-57_N_LamB_YdjC_SF superfamily - - "Catalytic domain of glycoside hydrolase (GH) families 38 and 57, lactam utilization protein LamB/YcsF family proteins, YdjC-family proteins, and similar proteins; The superfamily possesses strong sequence similarities across a wide range of all three kingdoms of life. It mainly includes four families, glycoside hydrolases family 38 (GH38), heat stable retaining glycoside hydrolases family 57 (GH57), lactam utilization protein LamB/YcsF family, and YdjC-family. The GH38 family corresponds to class II alpha-mannosidases (alphaMII, EC 3.2.1.24), which contain intermediate Golgi alpha-mannosidases II, acidic lysosomal alpha-mannosidases, animal sperm and epididymal alpha -mannosidases, neutral ER/cytosolic alpha-mannosidases, and some putative prokaryotic alpha-mannosidases. AlphaMII possess a-1,3, a-1,6, and a-1,2 hydrolytic activity, and catalyzes the degradation of N-linked oligosaccharides by employing a two-step mechanism involving the formation of a covalent glycosyl enzyme complex. GH57 is a purely prokaryotic family with the majority of thermostable enzymes from extremophiles (many of them are archaeal hyperthermophiles), which exhibit the enzyme specificities of alpha-amylase (EC 3.2.1.1), 4-alpha-glucanotransferase (EC 2.4.1.25), amylopullulanase (EC 3.2.1.1/41), and alpha-galactosidase (EC 3.2.1.22). This family also includes many hypothetical proteins with uncharacterized activity and specificity. GH57 cleaves alpha-glycosidic bond by employing a retaining mechanism, which involves a glycosyl-enzyme intermediate, allowing transglycosylation. 
Although the exact molecular function of LamB/YcsF family and YdjC-family remains unclear, they show high sequence and structure homology to the members of GH38 and GH57. Their catalytic domains adopt a similar parallel 7-stranded beta/alpha barrel, which is remotely related to catalytic NodB homology domain of the carbohydrate esterase 4 superfamily." Q#76 - CGI_10001588 superfamily 245003 442 526 4.03E-20 86.8381 cl08536 Alpha-mann_mid superfamily - - "Alpha mannosidase, middle domain; Members of this family adopt a structure consisting of three alpha helices, in an immunoglobulin/albumin-binding domain-like fold. They are predominantly found in the enzyme alpha-mannosidase." Q#77 - CGI_10001589 superfamily 241593 67 117 5.09E-05 41.8634 cl00075 HATPase_c superfamily C - "Histidine kinase-like ATPases; This family includes several ATP-binding proteins for example: histidine kinase, DNA gyrase B, topoisomerases, heat shock protein HSP90, phytochrome-like ATPases and DNA mismatch repair proteins" Q#79 - CGI_10001648 superfamily 247724 9 179 5.59E-131 368.53 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. 
The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#80 - CGI_10001649 superfamily 247727 53 119 6.69E-08 46.2691 cl17173 AdoMet_MTases superfamily C - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#80 - CGI_10001649 superfamily 247844 32 70 0.00167229 35.3629 cl17290 Methyltransf_4 superfamily C - Putative methyltransferase; This is a family of putative methyltransferases. The aligned region contains the GXGXG S-AdoMet binding site suggesting a putative methyltransferase activity. Q#82 - CGI_10001049 superfamily 241862 116 252 4.37E-20 85.8708 cl00437 COG0428 superfamily N - Predicted divalent heavy-metal cations transporter [Inorganic ion transport and metabolism] Q#83 - CGI_10001050 superfamily 245201 286 568 3.71E-178 515.082 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. 
These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#83 - CGI_10001050 superfamily 243040 23 118 1.60E-29 114.799 cl02447 CRD_FZ superfamily N - "CRD_domain cysteine-rich domain, also known as Fz (frizzled) domain; CRD_FZ is an essential component of a number of cell surface receptors, which are involved in multiple signal transduction pathways, particularly in modulating the activity of the Wnt proteins, which play a fundamental role in the early development of metazoans. CRD is also found in secreted frizzled related proteins (SFRPs), which lack the transmembrane segment found in the frizzled protein. The CRD domain is also present in the alpha-1 chain of mouse type XVIII collagen, in carboxypeptidase Z, several receptor tyrosine kinases, and the mosaic transmembrane serine protease corin. The CRD domain is well conserved in metazoans - 10 frizzled proteins have been identified in mammals, 4 in Drosophila and 3 in Caenorhabditis elegans. CRD domains have also been identified in multiple tandem copies in a Dictyostelium discoideum protein. Very little is known about the mechanism by which CRD domains interact with their ligands. The domain contains 10 conserved cysteines." Q#83 - CGI_10001050 superfamily 241609 143 215 1.05E-21 90.9003 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." 
Q#85 - CGI_10004307 superfamily 149284 224 370 2.32E-44 153.419 cl06952 CPL superfamily - - CPL (NUC119) domain; This C terminal domain is found in Penguin-like proteins associated with Pumilio like repeats. Q#85 - CGI_10004307 superfamily 243032 126 255 9.02E-05 42.9639 cl02427 Pumilio superfamily NC - "Pumilio-family RNA binding domain; Puf repeats (also labelled PUM-HD or Pumilio homology domain) mediate sequence specific RNA binding in fly Pumilio, worm FBF-1 and FBF-2, and many other proteins such as vertebrate Pumilio. These proteins function as translational repressors in early embryonic development by binding to sequences in the 3' UTR of target mRNAs, such as the nanos response element (NRE) in fly Hunchback mRNA, or the point mutation element (PME) in worm fem-3 mRNA. Other proteins that contain Puf domains are also plausible RNA binding proteins. Yeast PUF1 (JSN1), for instance, appears to contain a single RNA-recognition motif (RRM) domain. Puf repeat proteins have been observed to function asymmetrically and may be responsible for creating protein gradients involved in the specification of cell fate and differentiation. Puf domains usually occur as a tandem repeat of 8 domains. This model encompasses all 8 tandem repeats. Some proteins may have fewer (canonical) repeats." Q#87 - CGI_10001530 superfamily 247916 114 204 5.92E-19 81.6602 cl17362 Transglut_core superfamily - - "Transglutaminase-like superfamily; This family includes animal transglutaminases and other bacterial proteins of unknown function. Sequence conservation in this superfamily primarily involves three motifs that centre around conserved cysteine, histidine, and aspartate residues that form the catalytic triad in the structurally characterized transglutaminase, the human blood clotting factor XIIIa'. 
On the basis of the experimentally demonstrated activity of the Methanobacterium phage pseudomurein endoisopeptidase, it is proposed that many, if not all, microbial homologues of the transglutaminases are proteases and that the eukaryotic transglutaminases have evolved from an ancestral protease." Q#87 - CGI_10001530 superfamily 216198 439 536 0.00248928 36.5229 cl08295 Transglut_C superfamily - - "Transglutaminase family, C-terminal ig like domain; Transglutaminase family, C-terminal ig like domain. " Q#88 - CGI_10001531 superfamily 201479 64 188 8.04E-30 108.485 cl02994 Transglut_N superfamily - - Transglutaminase family; Transglutaminase family. Q#89 - CGI_10001532 superfamily 220376 45 136 3.77E-08 48.1736 cl10729 DUF2040 superfamily C - "Coiled-coil domain-containing protein 55 (DUF2040); This entry is a conserved domain of approximately 130 residues of proteins conserved from fungi to humans. The proteins do contain a coiled-coil domain, but the function is unknown." Q#90 - CGI_10001586 superfamily 245864 2 462 8.89E-58 199.391 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. 
While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#91 - CGI_10001587 superfamily 247856 64 124 6.39E-07 42.5349 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#91 - CGI_10001587 superfamily 247856 1 38 0.000681633 34.4457 cl17302 EFh superfamily C - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#92 - CGI_10005941 superfamily 241563 63 98 4.08E-05 41.504 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#92 - CGI_10005941 superfamily 243092 308 439 0.00139174 39.6256 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#92 - CGI_10005941 superfamily 241563 8 53 0.0019754 36.3032 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#96 - CGI_10005945 superfamily 247637 63 89 0.00933949 33.2412 cl16912 MDR superfamily N - "Medium chain reductase/dehydrogenase (MDR)/zinc-dependent alcohol dehydrogenase-like family; The medium chain reductase/dehydrogenases (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH. MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P) binding-Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES. The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH) , quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones. ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines. Other MDR members have only a catalytic zinc, and some contain no coordinated zinc." Q#97 - CGI_10005946 superfamily 245864 4 435 1.60E-52 184.019 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. 
They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." 
Q#99 - CGI_10005948 superfamily 247792 8 56 6.06E-07 45.8996 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#99 - CGI_10005948 superfamily 241563 157 188 0.00156573 36.1611 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#101 - CGI_10009164 superfamily 195671 13 143 1.80E-44 145.286 cl08257 Ribosomal_L11 superfamily - - "Ribosomal protein L11. Ribosomal protein L11, together with proteins L10 and L7/L12, and 23S rRNA, form the L7/L12 stalk on the surface of the large subunit of the ribosome. The homologous eukaryotic cytoplasmic protein is also called 60S ribosomal protein L12, which is distinct from the L12 involved in the formation of the L7/L12 stalk. The C-terminal domain (CTD) of L11 is essential for binding 23S rRNA, while the N-terminal domain (NTD) contains the binding site for the antibiotics thiostrepton and micrococcin. L11 and 23S rRNA form an essential part of the GTPase-associated region (GAR). Based on differences in the relative positions of the L11 NTD and CTD during the translational cycle, L11 is proposed to play a significant role in the binding of initiation factors, elongation factors, and release factors to the ribosome. 
Several factors, including the class I release factors RF1 and RF2, are known to interact directly with L11. In eukaryotes, L11 has been implicated in regulating the levels of ubiquinated p53 and MDM2 in the MDM2-p53 feedback loop, which is responsible for apoptosis in response to DNA damage. In bacteria, the "stringent response" to harsh conditions allows bacteria to survive, and ribosomes that lack L11 are deficient in stringent factor stimulation." Q#102 - CGI_10009165 superfamily 197827 305 342 2.86E-06 44.8161 cl02725 CARP superfamily - - Domain in CAPs (cyclase-associated proteins) and X-linked retinitis pigmentosa 2 gene product; Domain in CAPs (cyclase-associated proteins) and X-linked retinitis pigmentosa 2 gene product. Q#103 - CGI_10009166 superfamily 246925 52 300 2.78E-09 57.3654 cl15309 LRR_RI superfamily - - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#104 - CGI_10009167 superfamily 241754 12 369 0 588.497 cl00286 Motor_domain superfamily - - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. 
Q#106 - CGI_10009169 superfamily 243092 88 401 4.80E-53 180.223 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#107 - CGI_10009170 superfamily 243072 182 277 5.35E-13 66.6382 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#107 - CGI_10009170 superfamily 243072 570 648 4.28E-09 55.0822 cl02529 ANK superfamily N - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. 
The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#107 - CGI_10009170 superfamily 243072 444 614 9.71E-05 41.6003 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#108 - CGI_10009171 superfamily 241609 102 173 1.65E-15 68.1858 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#108 - CGI_10009171 superfamily 241629 59 84 0.000115088 39.221 cl00133 SCP superfamily C - "SCP: SCP-like extracellular protein domain, found in eukaryotes and prokaryotes. This family includes plant pathogenesis-related protein 1 (PR-1), which accumulates after infections with pathogens, and may act as an anti-fungal agent or be involved in cell wall loosening. This family also includes CRISPs, mammalian cysteine-rich secretory proteins, which combine SCP with a C-terminal cysteine rich domain, and allergen 5 from vespid venom. 
Roles for CRISP, in response to pathogens, fertilization, and sperm maturation have been proposed. One member, Tex31 from the venom duct of Conus textile, has been shown to possess proteolytic activity sensitive to serine protease inhibitors. The human GAPR-1 protein has been reported to dimerize, and such a dimer may form an active site containing a catalytic triad. SCP has also been proposed to be a Ca++ chelating serine protease. The Ca++-chelating function would fit with various signaling processes that members of this family, such as the CRISPs, are involved in, and is supported by sequence and structural evidence of a conserved pocket containing two histidines and a glutamate. It also may explain how helothermine, a toxic peptide secreted by the beaded lizard, blocks Ca++ transporting ryanodine receptors. Little is known about the biological roles of the bacterial and archaeal SCP domains." Q#111 - CGI_10009174 superfamily 219670 1 69 2.79E-06 45.4557 cl06834 zf-C3HC superfamily N - "C3HC zinc finger-like; This zinc-finger like domain is distributed throughout the eukaryotic kingdom in NIPA (Nuclear interacting partner of ALK) proteins. NIPA is implicated to perform some sort of antiapoptotic role in nucleophosmin-anaplastic lymphoma kinase (ALK) mediated signaling events. The domain is often repeated, with the second domain usually containing a large insert (approximately 90 residues) after the first three cysteine residues. In Schizosaccharomyces pombe the protein containing this domain is involved in mRNA export from the nucleus." Q#111 - CGI_10009174 superfamily 219926 102 137 2.37E-05 42.416 cl07279 Rsm1 superfamily C - Rsm1-like; Rsm1 is a protein involved in mRNA export from the nucleus Q#112 - CGI_10009175 superfamily 242042 1 67 2.24E-41 131.032 cl00712 RNA_pol_N superfamily - - RNA polymerases N / 8 kDa subunit; RNA polymerases N / 8 kDa subunit. 
Q#113 - CGI_10009176 superfamily 244265 230 508 3.95E-42 152.632 cl05973 FAM20_C_like superfamily - - "C-terminal putative kinase domain of FAM20 (family with sequence similarity 20), Drosophila Four-jointed (Fj), and related proteins; Drosophila Fj is a Golgi kinase that phosphorylates Ser or Thr residues within extracellular cadherin domains of a transmembrane receptor Fat and its ligand, Dachsous (Ds). The Fat signaling pathway regulates growth, gene expression, and planar cell polarity (PCP). Defects from mutation in the Drosophila fj gene include loss of the intermediate leg joint, and a PCP defect in the eye. Fjx1, the murine homologue of Fj, has been shown to be involved in both the Fat and Hippo signaling pathways, these two pathways intersect at multiple points. The Hippo pathway is important in organ size control and in cancer. FAM20B is a xylose kinase that may regulate the number of glycosaminoglycan chains by phosphorylating the xylose residue in the glycosaminoglycan-protein linkage region of proteoglycans. This domain has homology to a kinase-active site, mutation of three conserved Asp residues at the Drosophila Fj putative active site abolished its ability to phosphorylate Ft and Ds cadherin domains. FAM20A may participate in enamel development and gingival homeostasis, FAM20B in proteoglycan production, and FAM20C in bone development. FAM20C, also called Dentin Matrix Protein 4, is abundant in the dentin matrix, and may participate in the differentiation of mesenchymal precursor cells into functional odontoblast-like cells. Mutations in FAM20C are associated with lethal Osteosclerotic Bone Dysplasia (Raine Syndrome), and mutations in FAM20A with Amelogenesis imperfecta (AI) and Gingival Hyperplasia Syndrome. This model includes the FAM20_C domain family, previously known as DUF1193; FAM20_C appears to be homologous to the catalytic domain of the phosphoinositide 3-kinase (PI3K)-like family." 
Q#115 - CGI_10009178 superfamily 217962 31 79 1.02E-06 43.4032 cl09558 TPD52 superfamily N - "Tumour protein D52 family; The hD52 gene was originally identified through its elevated expression level in human breast carcinoma. Cloning of D52 homologues from other species has indicated that D52 may play roles in calcium-mediated signal transduction and cell proliferation. Two human homologues of hD52, hD53 and hD54, have also been identified, demonstrating the existence of a novel gene/protein family. These proteins have an amino terminal coiled-coil that allows members to form homo- and heterodimers with each other." Q#117 - CGI_10009180 superfamily 246918 741 793 5.38E-12 62.9895 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#117 - CGI_10009180 superfamily 246918 986 1038 4.11E-11 60.2931 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#117 - CGI_10009180 superfamily 246918 476 528 1.25E-10 59.1375 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#117 - CGI_10009180 superfamily 246918 419 471 7.89E-10 56.8263 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#117 - CGI_10009180 superfamily 246918 930 976 5.13E-09 54.5151 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#117 - CGI_10009180 superfamily 246918 627 678 6.11E-09 54.1299 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#117 - CGI_10009180 superfamily 246918 1080 1133 7.20E-09 53.7447 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#117 - CGI_10009180 superfamily 246918 684 736 5.53E-08 51.4335 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. 
Q#117 - CGI_10009180 superfamily 246918 886 925 0.000413935 39.8775 cl15278 TSP_1 superfamily N - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#117 - CGI_10009180 superfamily 246918 544 585 0.000661586 39.1071 cl15278 TSP_1 superfamily N - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#119 - CGI_10009182 superfamily 243066 20 124 5.63E-28 105.777 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#119 - CGI_10009182 superfamily 243146 232 278 6.41E-12 59.9826 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#119 - CGI_10009182 superfamily 243146 342 386 7.54E-11 57.1831 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#119 - CGI_10009182 superfamily 243146 198 243 1.84E-10 56.0275 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#119 - CGI_10009182 superfamily 243146 135 183 2.48E-06 44.1894 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. 
" Q#119 - CGI_10009182 superfamily 243146 294 341 1.55E-05 42.1603 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#120 - CGI_10000371 superfamily 245201 11 56 3.17E-25 93.5584 cl09925 PKc_like superfamily C - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#122 - CGI_10008750 superfamily 242893 40 140 1.75E-54 169.739 cl02121 Med31 superfamily - - "SOH1; The family consists of Saccharomyces cerevisiae SOH1 homologues. SOH1 is responsible for the repression of temperature sensitive growth of the HPR1 mutant and has been found to be a component of the RNA polymerase II transcription complex. SOH1 not only interacts with factors involved in DNA repair, but transcription as well. Thus, the SOH1 protein may serve to couple these two processes." Q#124 - CGI_10008752 superfamily 147513 5 57 1.91E-14 62.2842 cl05104 UCR_UQCRX_QCR9 superfamily - - "Ubiquinol-cytochrome C reductase, UQCRX/QCR9 like; The UQCRX/QCR9 protein is the 9/10 subunit of complex III, encoding a protein of about 7-kDa. Deletion of QCR9 results in the inability of cells to grow on grow on-fermentable carbon source n yeast." Q#125 - CGI_10008753 superfamily 241636 141 332 2.11E-114 338.793 cl00145 TBOX superfamily - - "T-box DNA binding domain of the T-box family of transcriptional regulators. The T-box family is an ancient group that appears to play a critical role in development in all animal species. 
These genes were uncovered on the basis of similarity to the DNA binding domain of murine Brachyury (T) gene product, the defining feature of the family. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development and conserved expression patterns, most of the known genes in all species being expressed in mesoderm or mesoderm precursors." Q#126 - CGI_10008754 superfamily 247725 12 109 5.23E-60 198.728 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#126 - CGI_10008754 superfamily 216381 402 815 1.25E-120 370.382 cl03136 Oxysterol_BP superfamily - - Oxysterol-binding protein; Oxysterol-binding protein. Q#127 - CGI_10008755 superfamily 221858 34 76 8.43E-13 59.0855 cl15169 MOZART2 superfamily N - "Mitotic-spindle organizing gamma-tubulin ring associated; FAM128A and FAM128B proteins have been re-named MOZART2A and B. The name MOZART is derived from letters of 'mitotic-spindle organizing proteins associated with a ring of gamma-tubulin'. This family operates as part of the gamma-tubulin ring complex, gamma-TuRC, one of the complexes necessary for chromosome segregation. This complex is located at centrosomes and mediates the formation of bipolar spindles in mitosis; it consists of six subunits. However, unlike the other four known subunits, the MOZART proteins, both 1 and 2, do not carry the conserved 'Spc97-Spc98' GCP domain, so the TUBCGP nomenclature cannot be used for it. 
The exact function of MOZART2 is not clear." Q#129 - CGI_10008757 superfamily 241992 369 868 0 606.184 cl00628 Piwi-like superfamily - - "Piwi-like: PIWI domain. Domain found in proteins involved in RNA silencing. RNA silencing refers to a group of related gene-silencing mechanisms mediated by short RNA molecules, including siRNAs, miRNAs, and heterochromatin-related guide RNAs. The central component of the RNA-induced silencing complex (RISC) and related complexes is Argonaute. The PIWI domain is the C-terminal portion of Argonaute and consists of two subdomains, one of which provides the 5' anchoring of the guide RNA and the other, the catalytic site for slicing. This domain is also found in closely related proteins, including the Piwi subfamily, where it is believed to perform a crucial role in germline cells, via a similar mechanism." Q#129 - CGI_10008757 superfamily 241765 244 360 1.16E-51 177.067 cl00301 PAZ superfamily - - "PAZ domain, named PAZ after the proteins Piwi Argonaut and Zwille. PAZ is found in two families of proteins that are essential components of RNA-mediated gene-silencing pathways, including RNA interference, the piwi and Dicer families. PAZ functions as a nucleic-acid binding domain, with a strong preference for single-stranded nucleic acids (RNA or DNA) or RNA duplexes with single-stranded 3' overhangs. It has been suggested that the PAZ domain provides a unique mode for the recognition of the two 3'-terminal nucleotides in single-stranded nucleic acids and buries the 3' OH group, and that it might recognize characteristic 3' overhangs in siRNAs within RISC (RNA-induced silencing) and other complexes. This parent model also contains structures of an archaeal PAZ domain." Q#131 - CGI_10007100 superfamily 220628 323 440 3.64E-16 74.637 cl10890 Ada3 superfamily - - "Histone acetyltransferases subunit 3; Ada3 is a family of proteins conserved from yeasts to humans. 
It is an essential component of the Ada transcriptional coactivator (alteration/deficiency in activation) complex. Ada3 plays a key role in linking histone acetyltransferase-containing complexes to p53 (tumour suppressor protein) thereby regulating p53 acetylation, stability and transcriptional activation following DNA damage." Q#132 - CGI_10007101 superfamily 217951 25 220 2.08E-19 84.5076 cl18437 Mannosyl_trans2 superfamily N - "Mannosyltransferase (PIG-V); This is a family of eukaryotic ER membrane proteins that are involved in the synthesis of glycosylphosphatidylinositol (GPI), a glycolipid that anchors many proteins to the eukaryotic cell surface. Proteins in this family are involved in transferring the second mannose in the biosynthetic pathway of GPI." Q#134 - CGI_10007103 superfamily 247725 7 129 5.05E-77 234.456 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." 
Q#134 - CGI_10007103 superfamily 248318 155 210 2.90E-16 71.6981 cl17764 FYVE superfamily - - "FYVE domain; Zinc-binding domain; targets proteins to membrane lipids via interaction with phosphatidylinositol-3-phosphate, PI3P; present in Fab1, YOTB, Vac1, and EEA1;" Q#134 - CGI_10007103 superfamily 248318 250 286 2.22E-09 52.8233 cl17764 FYVE superfamily N - "FYVE domain; Zinc-binding domain; targets proteins to membrane lipids via interaction with phosphatidylinositol-3-phosphate, PI3P; present in Fab1, YOTB, Vac1, and EEA1;" Q#135 - CGI_10007104 superfamily 242165 22 209 3.10E-56 177.712 cl00880 Ribosomal_S8e_like superfamily - - "Eukaryotic/archaeal ribosomal protein S8e and similar proteins; This family contains the eukaryotic/archaeal ribosomal protein S8, a component of the small ribosomal subunits, as well as the NSA2 gene product." Q#136 - CGI_10007105 superfamily 245201 55 310 0 536.464 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#136 - CGI_10007105 superfamily 152065 487 535 1.72E-21 88.1801 cl13134 Mst1_SARAH superfamily - - "C terminal SARAH domain of Mst1; This family of proteins represents the C terminal SARAH domain of Mst1. SARAH controls apoptosis and cell cycle arrest via the Ras, RASSF, MST pathway. The Mst1 SARAH domain interacts with Rassf1 and Rassf5 by forming a heterodimer which mediates the apoptosis process." 
Q#137 - CGI_10007106 superfamily 190308 48 192 3.90E-09 55.0175 cl18163 Fringe superfamily C - "Fringe-like; The drosophila protein fringe (FNG) is a glucosaminyltransferase that controls the response of the Notch receptor to specific ligands. FNG is localised to the Golgi apparatus (not secreted as previously thought). Modification of Notch occurs through glycosylation by FNG. The xenopus homologue, lunatic fringe, has been implicated in a variety of functions." Q#138 - CGI_10007107 superfamily 222150 376 401 7.54E-05 40.4529 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#138 - CGI_10007107 superfamily 222150 349 372 0.00046678 38.1417 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#138 - CGI_10007107 superfamily 246975 308 328 0.00247813 36.1709 cl15478 zf-C2H2 superfamily - - "Zinc finger, C2H2 type; The C2H2 zinc finger is the classical zinc finger domain. The two conserved cysteines and histidines co-ordinate a zinc ion. The following pattern describes the zinc finger. #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C] Where X can be any amino acid, and numbers in brackets indicate the number of residues. The positions marked # are those that are important for the stable fold of the zinc finger. The final position can be either his or cys. The C2H2 zinc finger is composed of two short beta strands followed by an alpha helix. The amino terminal part of the helix binds the major groove in DNA binding zinc fingers. The accepted consensus binding sequence for Sp1 is usually defined by the asymmetric hexanucleotide core GGGCGG but this sequence does not include, among others, the GAG (=CTC) repeat that constitutes a high-affinity site for Sp1 binding to the wt1 promoter." Q#140 - CGI_10007109 superfamily 217915 663 948 1.54E-43 166.528 cl14957 Spc97_Spc98 superfamily N - Spc97 / Spc98 family; The spindle pole body (SPB) functions as the microtubule-organising centre in yeast. 
Members of this family are spindle pole body (SPB) components such as Spc97 and Spc98 that form a complex with gamma-tubulin. This family of proteins includes the grip motif 1 and grip motif 2. Q#140 - CGI_10007109 superfamily 217915 271 731 2.09E-13 72.5388 cl14957 Spc97_Spc98 superfamily - - Spc97 / Spc98 family; The spindle pole body (SPB) functions as the microtubule-organising centre in yeast. Members of this family are spindle pole body (SPB) components such as Spc97 and Spc98 that form a complex with gamma-tubulin. This family of proteins includes the grip motif 1 and grip motif 2. Q#143 - CGI_10007112 superfamily 203134 392 454 8.15E-05 40.3577 cl04866 CHORD superfamily - - "CHORD; CHORD represents a Zn binding domain. Silencing of the C. elegans CHORD-containing gene results in semisterility and embryo lethality, suggesting an essential function of the wild-type gene in nematode development." Q#143 - CGI_10007112 superfamily 241701 151 195 0.00808728 36.3963 cl00223 NusB_Sun superfamily C - "RNA binding domain of NusB (N protein-Utilization Substance B) and Sun (also known as RrmB or Fmu) proteins. This family includes two orthologous groups exemplified by the transcription termination factor NusB and the N-terminal domain of the rRNA-specific 5-methylcytidine transferase (m5C-methyltransferase) Sun. The NusB protein plays a key role in the regulation of ribosomal RNA biosynthesis in eubacteria by modulating the efficiency of transcriptional antitermination. NusB along with other Nus factors (NusA, NusE/S10 and NusG) forms the core complex with the boxA element of the nut site of the rRNA operons. These interactions help RNA polymerase to counteract polarity during transcription of rRNA operons and allow stable antitermination. The transcription antitermination system can be appropriated by some bacteriophages such as lambda, which use the system to switch between the lysogenic and lytic modes of phage propagation. 
The m5C-methyltransferase Sun shares the N-terminal non-catalytic RNA-binding domain with NusB." Q#144 - CGI_10007113 superfamily 246680 9 87 3.26E-12 62.6044 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#145 - CGI_10007114 superfamily 246680 9 87 3.35E-13 66.0712 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." 
Q#145 - CGI_10007114 superfamily 246680 385 474 1.32E-09 55.9792 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#147 - CGI_10001026 superfamily 248020 24 356 1.06E-52 182.664 cl17466 Sulfatase superfamily - - Sulfatase; Sulfatase. Q#149 - CGI_10001514 superfamily 248097 265 385 1.88E-28 108.121 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#149 - CGI_10001514 superfamily 242406 50 113 0.00168013 37.358 cl01271 DUF1768 superfamily C - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. Q#151 - CGI_10001610 superfamily 245847 1 67 2.66E-15 66.0409 cl12042 FA58C superfamily C - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." 
Q#153 - CGI_10003986 superfamily 218702 76 128 3.88E-06 41.5038 cl05324 Dimer_Tnp_hAT superfamily N - hAT family dimerisation domain; This dimerisation domain is found at the C terminus of the transposases of elements belonging to the Activator superfamily (hAT element superfamily). The isolated dimerisation domain forms extremely stable dimers in vitro. Q#154 - CGI_10003987 superfamily 217926 294 430 5.50E-49 170.819 cl04418 YTH superfamily - - "YT521-B-like domain; A protein of the YTH family has been shown to selectively remove transcripts of meiosis-specific genes expressed in mitotic cells. It has been speculated that in higher eukaryotic YTH-family members may be involved in similar mechanisms to suppress gene regulation during gametogenesis or general silencing. The rat protein YT521-B is a tyrosine-phosphorylated nuclear protein, that interacts with the nuclear transcriptosomal component scaffold attachment factor B, and the 68-kDa Src substrate associated during mitosis, Sam68. In vivo splicing assays demonstrated that YT521-B modulates alternative splice site selection in a concentration-dependent manner. The YTH domain has been identified as part of the PUA superfamily." Q#154 - CGI_10003987 superfamily 217926 734 870 5.50E-49 170.819 cl04418 YTH superfamily - - "YT521-B-like domain; A protein of the YTH family has been shown to selectively remove transcripts of meiosis-specific genes expressed in mitotic cells. It has been speculated that in higher eukaryotic YTH-family members may be involved in similar mechanisms to suppress gene regulation during gametogenesis or general silencing. The rat protein YT521-B is a tyrosine-phosphorylated nuclear protein, that interacts with the nuclear transcriptosomal component scaffold attachment factor B, and the 68-kDa Src substrate associated during mitosis, Sam68. In vivo splicing assays demonstrated that YT521-B modulates alternative splice site selection in a concentration-dependent manner. 
The YTH domain has been identified as part of the PUA superfamily." Q#156 - CGI_10003989 superfamily 217643 91 253 6.69E-08 51.7745 cl04182 Solute_trans_a superfamily N - "Organic solute transporter Ostalpha; This family is a transmembrane organic solute transport protein. In vertebrates these proteins form a complex with Ostbeta, and function as bile transporters. In plants they may transport brassinosteroid-like compounds and act as regulators of cell death." Q#158 - CGI_10003991 superfamily 241597 23 86 2.30E-25 96.2133 cl00082 HMG-box superfamily - - "High Mobility Group (HMG)-box is found in a variety of eukaryotic chromosomal proteins and transcription factors. HMGs bind to the minor groove of DNA and have been classified by DNA binding preferences. Two phylogenically distinct groups of Class I proteins bind DNA in a sequence specific fashion and contain a single HMG box. One group (SOX-TCF) includes transcription factors, TCF-1, -3, -4; and also SRY and LEF-1, which bind four-way DNA junctions and duplex DNA targets. The second group (MATA) includes fungal mating type gene products MC, MATA1 and Ste11. Class II and III proteins (HMGB-UBF) bind DNA in a non-sequence specific fashion and contain two or more tandem HMG boxes. Class II members include non-histone chromosomal proteins, HMG1 and HMG2, which bind to bent or distorted DNA such as four-way DNA junctions, synthetic DNA cruciforms, kinked cisplatin-modified DNA, DNA bulges, cross-overs in supercoiled DNA, and can cause looping of linear DNA. Class III members include nucleolar and mitochondrial transcription factors, UBF and mtTF1, which bind four-way DNA junctions." Q#159 - CGI_10003992 superfamily 216971 199 368 1.55E-22 92.2984 cl03532 Octopine_DH superfamily - - "NAD/NADP octopine/nopaline dehydrogenase, alpha-helical domain; This group of enzymes act on the CH-NH substrate bond using NAD(+) or NADP(+) as an acceptor. 
The Pfam family consists mainly of octopine and nopaline dehydrogenases from Ti plasmids." Q#159 - CGI_10003992 superfamily 217105 51 117 0.000595319 38.755 cl18391 ApbA superfamily NC - "Ketopantoate reductase PanE/ApbA; This is a family of 2-dehydropantoate 2-reductases also known as ketopantoate reductases, EC:1.1.1.169. The reaction catalyzed by this enzyme is: (R)-pantoate + NADP(+) <=> 2-dehydropantoate + NADPH. AbpA catalyzes the NADPH reduction of ketopantoic acid to pantoic acid in the alternative pyrimidine biosynthetic (APB) pathway. ApbA and PanE are allelic. ApbA, the ketopantoate reductase enzyme is required for the synthesis of thiamine via the APB biosynthetic pathway." Q#161 - CGI_10001640 superfamily 241758 83 130 1.20E-15 68.5506 cl00292 AANH_like superfamily N - "Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases, ATP sulphurylases Universal Stress Response protein and electron transfer flavoprotein (ETF). The domain forms a apha/beta/apha fold which binds to Adenosine nucleotide." Q#164 - CGI_10001745 superfamily 219525 44 92 6.06E-06 40.095 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#164 - CGI_10001745 superfamily 219525 3 37 0.000721192 34.7022 cl06646 GCC2_GCC3 superfamily N - GCC2 and GCC3; GCC2 and GCC3. Q#167 - CGI_10004224 superfamily 193687 6 154 3.49E-59 183.681 cl00160 LbetaH superfamily - - "Left-handed parallel beta-Helix (LbetaH or LbH) domain: The alignment contains 5 turns, each containing three imperfect tandem repeats of a hexapeptide repeat motif (X-[STAV]-X-[LIV]-[GAED]-X). Proteins containing hexapeptide repeats are often enzymes showing acyltransferase activity, however, some subfamilies in this hierarchy also show activities related to ion transport or translation initiation. Many are trimeric in their active forms." 
Q#168 - CGI_10004225 superfamily 201540 4 70 0.00378836 35.9873 cl16960 Troponin superfamily N - "Troponin; Troponin (Tn) contains three subunits, Ca2+ binding (TnC), inhibitory (TnI), and tropomyosin binding (TnT). this Pfam contains members of the TnT subunit. Troponin is a complex of three proteins, Ca2+ binding (TnC), inhibitory (TnI), and tropomyosin binding (TnT). The troponin complex regulates Ca++ induced muscle contraction. This family includes troponin T and troponin I. Troponin I binds to actin and troponin T binds to tropomyosin." Q#173 - CGI_10003728 superfamily 245746 56 114 1.96E-17 77.2654 cl11668 Lig_chan-Glu_bd superfamily - - "Ligated ion channel L-glutamate- and glycine-binding site; This region, sometimes called the S1 domain, is the luminal domain just upstream of the first, M1, transmembrane region of transmembrane ion-channel proteins, and it binds L-glutamate and glycine. It is found in association with Lig_chan, pfam00060." Q#173 - CGI_10003728 superfamily 197504 311 438 7.28E-14 68.4701 cl18192 PBPe superfamily - - Eukaryotic homologues of bacterial periplasmic substrate binding proteins; Prokaryotic homologues are represented by a separate alignment: PBPb Q#174 - CGI_10003729 superfamily 244881 194 493 1.12E-140 416.209 cl08267 ISOPREN_C2_like superfamily - - "This group contains class II terpene cyclases, protein prenyltransferases beta subunit, two broadly specific proteinase inhibitors alpha2-macroglobulin (alpha (2)-M) and pregnancy zone protein (PZP) and, the C3 C4 and C5 components of vertebrate complement. Class II terpene cyclases include squalene cyclase (SQCY) and 2,3-oxidosqualene cyclase (OSQCY), these integral membrane proteins catalyze a cationic cyclization cascade converting linear triterpenes to fused ring compounds. 
The protein prenyltransferases include protein farnesyltransferase (FTase) and geranylgeranyltransferase types I and II (GGTase-I and GGTase-II) which catalyze the carboxyl-terminal lipidation of Ras, Rab, and several other cellular signal transduction proteins, facilitating membrane associations and specific protein-protein interactions. Alpha (2)-M is a major carrier protein in serum and involved in the immobilization and entrapment of proteases. PZP is a pregnancy associated protein. Alpha (2)-M and PZP are known to bind to and, may modulate, the activity of placental protein-14 in T-cell growth and cytokine production thereby protecting the allogeneic fetus from attack by the maternal immune system." Q#174 - CGI_10003729 superfamily 215788 2 92 6.07E-34 125.369 cl08251 A2M superfamily - - Alpha-2-macroglobulin family; This family includes the C-terminal region of the alpha-2-macroglobulin family. Q#174 - CGI_10003729 superfamily 203720 596 676 3.00E-20 86.4481 cl08457 A2M_recep superfamily - - A-macroglobulin receptor; This family includes the receptor domain region of the alpha-2-macroglobulin family. Q#175 - CGI_10003336 superfamily 215647 58 176 4.24E-05 41.4401 cl18338 7tm_2 superfamily N - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GCPRs).They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97 ; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. 
Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#177 - CGI_10003338 superfamily 241568 87 125 3.45E-05 40.1388 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#188 - CGI_10002770 superfamily 222070 7 43 0.00392058 31.8793 cl18634 DDE_3 superfamily N - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#190 - CGI_10009092 superfamily 241559 32 132 7.19E-13 66.9507 cl00030 CH superfamily - - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). 
The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." Q#190 - CGI_10009092 superfamily 241559 141 231 2.71E-06 47.3055 cl00030 CH superfamily - - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." Q#190 - CGI_10009092 superfamily 216033 637 725 4.76E-16 75.8332 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#190 - CGI_10009092 superfamily 216033 417 545 6.49E-14 69.67 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#190 - CGI_10009092 superfamily 216033 1205 1279 9.60E-13 66.2032 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#190 - CGI_10009092 superfamily 216033 921 992 2.41E-11 62.3512 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#190 - CGI_10009092 superfamily 216033 728 812 4.16E-11 61.5808 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#190 - CGI_10009092 superfamily 216033 549 634 5.74E-11 61.1956 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#190 - CGI_10009092 superfamily 216033 819 902 4.43E-10 58.4992 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#190 - CGI_10009092 superfamily 216033 1089 1174 1.68E-07 50.7952 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. 
Q#190 - CGI_10009092 superfamily 216033 329 413 2.03E-07 50.41 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#190 - CGI_10009092 superfamily 216033 1033 1086 3.15E-07 50.0248 cl16959 Filamin superfamily N - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#190 - CGI_10009092 superfamily 241559 1 25 0.00568501 36.9051 cl00030 CH superfamily N - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." Q#191 - CGI_10009093 superfamily 216033 629 713 1.29E-18 82.3816 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#191 - CGI_10009093 superfamily 216033 721 810 1.82E-18 81.9964 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#191 - CGI_10009093 superfamily 216033 533 620 1.23E-15 73.9072 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#191 - CGI_10009093 superfamily 216033 358 427 1.67E-09 56.188 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#191 - CGI_10009093 superfamily 216033 430 524 3.45E-07 49.2544 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#192 - CGI_10009094 superfamily 241638 115 239 2.18E-10 55.4521 cl00147 TNF superfamily - - "Tumor Necrosis Factor; TNF superfamily members include the cytokines: TNF (TNF-alpha), LT (lymphotoxin-alpha, TNF-beta), CD40 ligand, Apo2L (TRAIL), Fas ligand, and osteoprotegerin (OPG) ligand. These proteins generally have an intracellular N-terminal domain, a short transmembrane segment, an extracellular stalk, and a globular TNF-like extracellular domain of about 150 residues. 
They initiate apoptosis by binding to related receptors, some of which have intracellular death domains. They generally form homo- or hetero- trimeric complexes.TNF cytokines bind one elongated receptor molecule along each of three clefts formed by neighboring monomers of the trimer with ligand trimerization a requiste for receptor binding." Q#193 - CGI_10009096 superfamily 241638 148 258 2.42E-09 53.5028 cl00147 TNF superfamily - - "Tumor Necrosis Factor; TNF superfamily members include the cytokines: TNF (TNF-alpha), LT (lymphotoxin-alpha, TNF-beta), CD40 ligand, Apo2L (TRAIL), Fas ligand, and osteoprotegerin (OPG) ligand. These proteins generally have an intracellular N-terminal domain, a short transmembrane segment, an extracellular stalk, and a globular TNF-like extracellular domain of about 150 residues. They initiate apoptosis by binding to related receptors, some of which have intracellular death domains. They generally form homo- or hetero- trimeric complexes.TNF cytokines bind one elongated receptor molecule along each of three clefts formed by neighboring monomers of the trimer with ligand trimerization a requiste for receptor binding." Q#194 - CGI_10009097 superfamily 241638 135 274 2.08E-10 56.5844 cl00147 TNF superfamily - - "Tumor Necrosis Factor; TNF superfamily members include the cytokines: TNF (TNF-alpha), LT (lymphotoxin-alpha, TNF-beta), CD40 ligand, Apo2L (TRAIL), Fas ligand, and osteoprotegerin (OPG) ligand. These proteins generally have an intracellular N-terminal domain, a short transmembrane segment, an extracellular stalk, and a globular TNF-like extracellular domain of about 150 residues. They initiate apoptosis by binding to related receptors, some of which have intracellular death domains. They generally form homo- or hetero- trimeric complexes.TNF cytokines bind one elongated receptor molecule along each of three clefts formed by neighboring monomers of the trimer with ligand trimerization a requiste for receptor binding." 
Q#195 - CGI_10009098 superfamily 243555 23 215 2.02E-16 75.1202 cl03871 Chitin_bind_3 superfamily - - "Chitin binding domain; This domain is found associated with a wide variety of cellulose binding domain. This domain however is a chitin binding domain. This domain is found in isolation in baculoviral spheroidins and spindolins, protein of unknown function." Q#196 - CGI_10009099 superfamily 247684 7 99 1.28E-22 89.6439 cl17037 NBD_sugar-kinase_HSP70_actin superfamily C - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#197 - CGI_10007489 superfamily 248458 47 192 1.75E-08 53.4717 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#198 - CGI_10007490 superfamily 246680 27 105 1.13E-07 49.6296 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. 
DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#198 - CGI_10007490 superfamily 248012 207 306 1.15E-05 44.2364 cl17458 TIR_2 superfamily N - TIR domain; This is a family of bacterial Toll-like receptors. Q#200 - CGI_10007492 superfamily 192487 96 342 3.34E-67 216.117 cl10912 DUF2215 superfamily - - Uncharacterized conserved protein (DUF2215); This entry is the central 200 residues of a family of proteins conserved from worms to humans. The function is unknown. Q#201 - CGI_10007493 superfamily 198867 134 234 6.59E-39 138.06 cl06652 BACK superfamily - - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation)." Q#201 - CGI_10007493 superfamily 243066 22 126 7.32E-32 118.489 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. 
The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#201 - CGI_10007493 superfamily 243146 460 505 1.35E-14 68.8422 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#201 - CGI_10007493 superfamily 243146 507 551 4.92E-14 67.3014 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#201 - CGI_10007493 superfamily 243146 378 424 5.80E-14 67.1983 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#201 - CGI_10007493 superfamily 243146 413 457 8.19E-13 63.8346 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#201 - CGI_10007493 superfamily 243146 331 376 9.34E-13 63.7315 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#201 - CGI_10007493 superfamily 243146 283 330 2.92E-11 59.4943 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#203 - CGI_10007495 superfamily 241546 845 886 8.11E-09 54.5892 cl00011 PLAT superfamily C - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats.The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." 
Q#204 - CGI_10007496 superfamily 216434 434 571 3.48E-21 94.0664 cl08318 PPDK_N superfamily C - "Pyruvate phosphate dikinase, PEP/pyruvate binding domain; This enzyme catalyzes the reversible conversion of ATP to AMP, pyrophosphate and phosphoenolpyruvate (PEP)." Q#205 - CGI_10007497 superfamily 241874 9 531 0 598.696 cl00456 SLC5-6-like_sbd superfamily - - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#207 - CGI_10007499 superfamily 201540 1 44 1.42E-13 62.9513 cl16960 Troponin superfamily NC - "Troponin; Troponin (Tn) contains three subunits, Ca2+ binding (TnC), inhibitory (TnI), and tropomyosin binding (TnT). this Pfam contains members of the TnT subunit. Troponin is a complex of three proteins, Ca2+ binding (TnC), inhibitory (TnI), and tropomyosin binding (TnT). The troponin complex regulates Ca++ induced muscle contraction. This family includes troponin T and troponin I. Troponin I binds to actin and troponin T binds to tropomyosin." 
Q#212 - CGI_10003787 superfamily 243029 47 105 1.86E-11 60.4421 cl02422 HRM superfamily - - Hormone receptor domain; This extracellular domain contains four conserved cysteines that probably for disulphide bridges. The domain is found in a variety of hormone receptors. It may be a ligand binding domain. Q#224 - CGI_10010951 superfamily 216363 239 344 9.18E-25 96.7705 cl08312 UPF0029 superfamily - - Uncharacterized protein family UPF0029; Uncharacterized protein family UPF0029. Q#229 - CGI_10010956 superfamily 245847 7 146 3.68E-14 68.3521 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#229 - CGI_10010956 superfamily 241619 241 287 0.000221313 38.7173 cl00112 PAN_APPLE superfamily C - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#230 - CGI_10010957 superfamily 241568 187 212 0.00273195 35.5164 cl00043 CCP superfamily N - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. 
The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#230 - CGI_10010957 superfamily 245847 224 369 5.63E-18 79.5229 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#230 - CGI_10010957 superfamily 241619 44 115 0.00160844 36.4061 cl00112 PAN_APPLE superfamily - - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#232 - CGI_10010959 superfamily 222429 4 51 2.54E-07 44.1536 cl18676 Myb_DNA-bind_5 superfamily C - Myb/SANT-like DNA-binding domain; This presumed domain appears to be related to other Myb/SANT like DNA binding domains. This family is greatly expanded in arthropods and higher eukaryotes. 
Q#239 - CGI_10006397 superfamily 246723 14 521 0 659.437 cl14813 GluZincin superfamily - - "Peptidase Gluzincin family (thermolysin-like proteinases, TLPs) includes peptidases M1, M2, M3, M4, M13, M32 and M36 (fungalysins); Gluzincin family (thermolysin-like peptidases or TLPs) includes several zinc-dependent metallopeptidases such as the M1, M2, M3, M4, M13, M32, M36 peptidases (MEROPS classification), and contain HEXXH and EXXXD motifs as part of their active site. All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. M1 family includes aminopeptidase N (APN) and leukotriene A4 hydrolase (LTA4H). APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types. LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity such that the two activities occupy different, but overlapping sites. The peptidase M3 or neurolysin-like family, includes M3, M2 and M32 metallopeptidases. The M3 peptidases have two subfamilies: M3A, includes thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (3.4.24.16), and the mitochondrial intermediate peptidase; M3B contains oligopeptidase F. M2 peptidase angiotensin converting enzyme (ACE, EC 3.4.15.1) catalyzes the conversion of decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II. ACE is a key part of the renin-angiotensin system that regulates blood pressure, thus ACE inhibitors are important for the treatment of hypertension. M32 family includes two eukaryotic enzymes from protozoa Trypanosoma cruzi, a causative agent of Chagas' disease, and Leishmania major, a parasite that causes leishmaniasis, making them attractive targets for drug development. 
The M4 family includes secreted protease thermolysin (EC 3.4.24.27), pseudolysin, aureolysin, neutral protease as well as fungalysin and bacillolysin (EC 3.4.24.28) that degrade extracellular proteins and peptides for bacterial nutrition, especially prior to sporulation. Thermolysin is widely used as a nonspecific protease to obtain fragments for peptide sequencing as well as in production of the artificial sweetener aspartame. M13 family includes neprilysin (EC 3.4.24.11) and endothelin-converting enzyme I (ECE-1, EC 3.4.24.71), which fulfill a broad range of physiological roles due to the greater variation in the S2' subsite allowing substrate specificity and are prime therapeutic targets for selective inhibition. Peptidase M36 (fungamysin) family includes endopeptidases from pathogenic fungi. Fungalysin hydrolyzes extracellular matrix proteins such as elastin and keratin. Aspergillus fumigatus causes the pulmonary disease aspergillosis by invading the lungs of immuno-compromised animals and secreting fungalysin that possibly breaks down proteinaceous structural barriers." Q#240 - CGI_10006398 superfamily 241563 64 105 4.49E-05 41.3108 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#241 - CGI_10006399 superfamily 243047 7 120 1.59E-43 151.233 cl02464 ArfGap superfamily - - "Putative GTPase activating protein for Arf; Putative zinc fingers with GTPase activating proteins (GAPs) towards the small GTPase, Arf. The GAP of ARD1 stimulates GTPase hydrolysis for ARD1 but not ARFs." Q#244 - CGI_10006402 superfamily 241596 107 168 2.12E-13 62.2315 cl00081 HLH superfamily - - "Helix-loop-helix domain, found in specific DNA- binding proteins that act as transcription factors; 60-100 amino acids long. 
A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE (5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#245 - CGI_10006403 superfamily 221913 1094 1284 4.46E-30 119.953 cl18626 AAA_12 superfamily - - AAA domain; This family of domains contains a P-loop motif that is characteristic of the AAA superfamily. Many of the proteins in this family are conjugative transfer proteins. Q#245 - CGI_10006403 superfamily 222005 788 854 2.96E-08 52.7396 cl18632 AAA_19 superfamily - - Part of AAA domain; Part of AAA domain. Q#246 - CGI_10006404 superfamily 241563 68 108 3.90E-06 44.3924 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#247 - CGI_10006405 superfamily 241706 506 574 4.20E-22 91.0855 cl00229 eIF1_SUI1_like superfamily - - "Eukaryotic initiation factor 1 and related proteins; Members of the eIF1/SUI1 (eukaryotic initiation factor 1) family are found in eukaryotes, archaea, and some bacteria; eukaryotic members are understood to play an important role in accurate initiator codon recognition during translation initiation. eIF1 interacts with 18S rRNA in the 40S ribosomal subunit during eukaryotic translation initiation. Point mutations in the yeast eIF1 implicate the protein in maintaining accurate start-site selection but its mechanism of action is unknown. The function of non-eukaryotic family members is also unclear." Q#247 - CGI_10006405 superfamily 211517 6 81 1.86E-20 86.1722 cl16921 eIF2D_N_like superfamily - - "N-terminal domain of eIF2D, malignant T cell-amplified sequence 1 and related proteins; This N-terminal domain of various proteins co-occurs with a PUA domain. Members of this family are: (1) MCTS-1 (malignant T cell-amplified sequence 1) or MCT-1 (multiple copies T cell malignancies), which may play roles in the regulation of the cell cycle, (2) the eukaryotic translation initiation factor 2D, and (3) an uncharacterized archaeal family." Q#247 - CGI_10006405 superfamily 241977 64 178 1.04E-09 56.2875 cl00607 PUA superfamily - - "PUA domain; The PUA domain named after Pseudouridine synthase and Archaeosine transglycosylase, was detected in archaeal and eukaryotic pseudouridine synthases, archaeal archaeosine synthases, a family of predicted ATPases that may be involved in RNA modification, a family of predicted archaeal and bacterial rRNA methylases. Additionally, the PUA domain was detected in a family of eukaryotic proteins that also contain a domain homologous to the translation initiation factor eIF1/SUI1; these proteins may comprise a novel type of translation factors. 
Unexpectedly, the PUA domain was detected also in bacterial and yeast glutamate kinases; this is compatible with the demonstrated role of these enzymes in the regulation of the expression of other genes. It is predicted that the PUA domain is an RNA binding domain." Q#248 - CGI_10006406 superfamily 241559 5 153 2.92E-26 107.952 cl00030 CH superfamily - - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." Q#248 - CGI_10006406 superfamily 247744 728 910 4.51E-15 75.7362 cl17190 NK superfamily N - "Nucleoside/nucleotide kinase (NK) is a protein superfamily consisting of multiple families of enzymes that share structural similarity and are functionally related to the catalysis of the reversible phosphate group transfer from nucleoside triphosphates to nucleosides/nucleotides, nucleoside monophosphates, or sugars. Members of this family play a wide variety of essential roles in nucleotide metabolism, the biosynthesis of coenzymes and aromatic compounds, as well as the metabolism of sugar and sulfate." Q#248 - CGI_10006406 superfamily 247807 571 613 0.00550305 37.2746 cl17253 AAA_17 superfamily C - AAA domain; AAA domain. Q#249 - CGI_10006407 superfamily 216056 35 135 5.36E-31 120.107 cl08279 Peptidase_M16 superfamily - - Insulinase (Peptidase family M16); Insulinase (Peptidase family M16). Q#249 - CGI_10006407 superfamily 218490 181 363 6.95E-22 94.8507 cl08432 Peptidase_M16_C superfamily - - "Peptidase M16 inactive domain; Peptidase M16 consists of two structurally related domains. One is the active peptidase, whereas the other is inactive. The two domains hold the substrate like a clamp." 
Q#249 - CGI_10006407 superfamily 218490 623 805 7.10E-09 55.1751 cl08432 Peptidase_M16_C superfamily - - "Peptidase M16 inactive domain; Peptidase M16 consists of two structurally related domains. One is the active peptidase, whereas the other is inactive. The two domains hold the substrate like a clamp." Q#251 - CGI_10003515 superfamily 241636 71 257 1.44E-83 255.205 cl00145 TBOX superfamily - - "T-box DNA binding domain of the T-box family of transcriptional regulators. The T-box family is an ancient group that appears to play a critical role in development in all animal species. These genes were uncovered on the basis of similarity to the DNA binding domain of murine Brachyury (T) gene product, the defining feature of the family. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development and conserved expression patterns, most of the known genes in all species being expressed in mesoderm or mesoderm precursors." Q#252 - CGI_10003516 superfamily 241597 69 134 1.31E-17 78.0533 cl00082 HMG-box superfamily - - "High Mobility Group (HMG)-box is found in a variety of eukaryotic chromosomal proteins and transcription factors. HMGs bind to the minor groove of DNA and have been classified by DNA binding preferences. Two phylogenically distinct groups of Class I proteins bind DNA in a sequence specific fashion and contain a single HMG box. One group (SOX-TCF) includes transcription factors, TCF-1, -3, -4; and also SRY and LEF-1, which bind four-way DNA junctions and duplex DNA targets. The second group (MATA) includes fungal mating type gene products MC, MATA1 and Ste11. Class II and III proteins (HMGB-UBF) bind DNA in a non-sequence specific fashion and contain two or more tandem HMG boxes. 
Class II members include non-histone chromosomal proteins, HMG1 and HMG2, which bind to bent or distorted DNA such as four-way DNA junctions, synthetic DNA cruciforms, kinked cisplatin-modified DNA, DNA bulges, cross-overs in supercoiled DNA, and can cause looping of linear DNA. Class III members include nucleolar and mitochondrial transcription factors, UBF and mtTF1, which bind four-way DNA junctions." Q#252 - CGI_10003516 superfamily 222150 514 539 9.93E-05 40.4529 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#252 - CGI_10003516 superfamily 197676 500 522 0.00397734 35.5193 cl18194 ZnF_C2H2 superfamily - - zinc finger; zinc finger. Q#252 - CGI_10003516 superfamily 220222 249 316 0.0079053 35.6567 cl09651 FadA superfamily C - Adhesion protein FadA; FadA (Fusobacterium adhesin A) is an adhesin which forms two alpha helices. Q#253 - CGI_10003517 superfamily 247684 5 366 3.28E-18 84.6363 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. 
The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#255 - CGI_10003135 superfamily 216363 129 208 1.42E-12 61.3322 cl08312 UPF0029 superfamily C - Uncharacterized protein family UPF0029; Uncharacterized protein family UPF0029. Q#256 - CGI_10003136 superfamily 222429 17 93 1.14E-07 45.3092 cl18676 Myb_DNA-bind_5 superfamily - - Myb/SANT-like DNA-binding domain; This presumed domain appears to be related to other Myb/SANT like DNA binding domains. This family is greatly expanded in arthropods and higher eukaryotes. Q#261 - CGI_10006311 superfamily 191128 69 109 8.14E-08 47.5336 cl04846 Ninjurin superfamily C - Ninjurin; Ninjurin (nerve injury-induced protein) is involved in nerve regeneration and in the formation and function of some tissues. Q#263 - CGI_10006313 superfamily 234167 115 180 5.21E-21 84.141 cl11877 ygfZ_signature superfamily - - "folate-binding protein YgfZ; YgfZ is a protein from Escherichia coli, homologous to the glycine cleavage system T protein, or aminomethyltransferase, GcvT (TIGR00528). Homologs of YgfZ other than members of the GcvT family share a well-conserved signature region that includes the motif, KGCYxGQE. Elsewhere, sequence divergence and length variation are substantial. Members of this family are mostly bacterial, largely absent from the Firmicutes and otherwise usually present. A few eukaryotic examples are found among the Apicomplexa, and a few archaeal sequences are found. 
Two functions implicated for this folate-binding protein are RNA modification (a function likely to be conserved) and replication initiation (a function likely to be highly variable). Many members of this family are, at the time of construction of this model, misnamed as the glycine cleavage system T protein [Protein synthesis, tRNA and rRNA base modification]." Q#264 - CGI_10006314 superfamily 245864 122 223 9.70E-19 85.4078 cl12078 p450 superfamily N - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." 
Q#265 - CGI_10006315 superfamily 247792 61 103 1.02E-06 41.6624 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#266 - CGI_10006316 superfamily 247856 64 124 6.39E-07 42.5349 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#266 - CGI_10006316 superfamily 247856 1 38 0.000681633 34.4457 cl17302 EFh superfamily C - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#268 - CGI_10006318 superfamily 198850 38 105 1.34E-15 68.3143 cl04907 L51_S25_CI-B8 superfamily - - "Mitochondrial ribosomal protein L51 / S25 / CI-B8 domain; The proteins in this family are located in the mitochondrion. The family includes ribosomal protein L51, and S25. This family also includes mitochondrial NADH-ubiquinone oxidoreductase B8 subunit (CI-B8) EC:1.6.5.3. 
It is not known whether all members of this family form part of the NADH-ubiquinone oxidoreductase and whether they are also all ribosomal proteins." Q#269 - CGI_10006319 superfamily 243072 3 108 2.09E-22 92.8318 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#269 - CGI_10006319 superfamily 243072 387 451 5.23E-08 50.845 cl02529 ANK superfamily C - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#269 - CGI_10006319 superfamily 248006 178 209 0.00999194 34.0839 cl17452 TPR_10 superfamily N - Tetratricopeptide repeat; Tetratricopeptide repeat. Q#270 - CGI_10006320 superfamily 247103 43 387 2.69E-153 441.8 cl15852 COX15-CtaA superfamily - - Cytochrome oxidase assembly protein; This is a family of integral membrane proteins. CtaA is required for cytochrome aa3 oxidase assembly in Bacillus subtilis. COX15 is required for cytochrome c oxidase assembly in yeast. Q#271 - CGI_10006321 superfamily 248054 35 81 1.26E-05 43.2296 cl17500 NAD_binding_8 superfamily - - NAD(P)-binding Rossmann-like domain; NAD(P)-binding Rossmann-like domain. 
Q#272 - CGI_10006322 superfamily 247755 253 406 7.60E-93 300.716 cl17201 ABC_ATPase superfamily C - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#272 - CGI_10006322 superfamily 247755 1339 1431 4.17E-53 188.238 cl17201 ABC_ATPase superfamily N - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#272 - CGI_10006322 superfamily 243179 114 231 1.14E-30 119.33 cl02781 tetraspanin_LEL superfamily - - "Tetraspanin, extracellular domain or large extracellular loop (LEL). Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. The tetraspanin family contains CD9, CD63, CD37, CD53, CD82, CD151, and CD81, amongst others. 
Tetraspanins are involved in diverse processes such as cell activation and proliferation, adhesion and motility, differentiation, cancer, and others. Their various functions may relate to their ability to act as molecular facilitators, grouping specific cell-surface proteins and affecting formation and stability of signaling complexes. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web", which may also include integrins." Q#272 - CGI_10006322 superfamily 244201 777 889 2.96E-27 109.243 cl05797 SMC_hinge superfamily - - SMC proteins Flexible Hinge Domain; This family represents the hinge region of the SMC (Structural Maintenance of Chromosomes) family of proteins. The hinge region is responsible for formation of the DNA interacting dimer. It is also possible that the precise structure of it is an essential determinant of the specificity of the DNA-protein interaction. Q#272 - CGI_10006322 superfamily 248228 59 112 0.000356764 40.2357 cl17674 COG5487 superfamily - - Small integral membrane protein [Function unknown] Q#272 - CGI_10006322 superfamily 151039 452 591 0.000852119 39.7815 cl11115 Cenp-F_leu_zip superfamily - - "Leucine-rich repeats of kinetochore protein Cenp-F/LEK1; Cenp-F, a centromeric kinetochore, microtubule-binding protein consisting of two 1,600-amino acid-long coils, is essential for the full functioning of the mitotic checkpoint pathway. There are several leucine-rich repeats along the sequence of LEK1 that are considered to be zippers, though they do not appear to be binding DNA directly in this instance." Q#273 - CGI_10006323 superfamily 243179 120 229 3.38E-29 108.545 cl02781 tetraspanin_LEL superfamily - - "Tetraspanin, extracellular domain or large extracellular loop (LEL). Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. 
Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. The tetraspanin family contains CD9, CD63, CD37, CD53, CD82, CD151, and CD81, amongst others. Tetraspanins are involved in diverse processes such as cell activation and proliferation, adhesion and motility, differentiation, cancer, and others. Their various functions may relate to their ability to act as molecular facilitators, grouping specific cell-surface proteins and affecting formation and stability of signaling complexes. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web", which may also include integrins." Q#276 - CGI_10003305 superfamily 246683 61 261 9.28E-89 270.146 cl14648 Aldose_epim superfamily N - "aldose 1-epimerase superfamily; Aldose 1-epimerases or mutarotases are key enzymes of carbohydrate metabolism; they catalyze the interconversion of the alpha- and beta-anomers of hexose sugars such as glucose and galactose. This interconversion is an important step that allows anomer specific metabolic conversion of sugars. Studies of the catalytic mechanism of the best known member of the family, galactose mutarotase, have shown a glutamate and a histidine residue to be critical for catalysis; the glutamate serves as the active site base to initiate the reaction by removing the proton from the C-1 hydroxyl group of the sugar substrate and the histidine as the active site acid to protonate the C-5 ring oxygen." Q#277 - CGI_10003306 superfamily 217293 68 239 4.63E-66 214.034 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. 
Q#277 - CGI_10003306 superfamily 202474 246 472 5.58E-33 124.688 cl08379 Neur_chan_memb superfamily - - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#278 - CGI_10003307 superfamily 217293 5 211 1.19E-81 254.094 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#278 - CGI_10003307 superfamily 202474 218 450 3.58E-33 125.074 cl08379 Neur_chan_memb superfamily - - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#279 - CGI_10013066 superfamily 248097 61 183 1.51E-21 85.7798 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#280 - CGI_10013067 superfamily 241584 21 63 0.00531587 32.8535 cl00065 FN3 superfamily C - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#282 - CGI_10013069 superfamily 248097 4 112 3.79E-23 87.7058 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#283 - CGI_10013070 superfamily 248097 93 194 3.55E-16 71.1422 cl17543 C1q superfamily C - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. 
Q#284 - CGI_10013071 superfamily 248097 3 111 9.33E-18 73.4534 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#285 - CGI_10013072 superfamily 248097 4 59 7.32E-13 59.201 cl17543 C1q superfamily C - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#286 - CGI_10013073 superfamily 248097 81 204 4.21E-20 82.313 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#287 - CGI_10013074 superfamily 248054 15 69 0.000491261 38.6072 cl17500 NAD_binding_8 superfamily - - NAD(P)-binding Rossmann-like domain; NAD(P)-binding Rossmann-like domain. Q#287 - CGI_10013074 superfamily 248054 213 267 0.00139816 37.4516 cl17500 NAD_binding_8 superfamily - - NAD(P)-binding Rossmann-like domain; NAD(P)-binding Rossmann-like domain. Q#288 - CGI_10013075 superfamily 241832 234 349 2.12E-62 201.939 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." 
Q#288 - CGI_10013075 superfamily 241645 438 521 8.54E-29 109.215 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#289 - CGI_10013076 superfamily 241832 7 78 2.83E-17 72.9896 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." 
Q#289 - CGI_10013076 superfamily 243175 123 180 4.01E-10 53.8106 cl02776 GST_C_family superfamily N - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." 
Q#290 - CGI_10013077 superfamily 241832 7 78 2.17E-17 73.3748 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#290 - CGI_10013077 superfamily 243175 123 180 7.21E-10 53.0402 cl02776 GST_C_family superfamily N - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). 
Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." Q#292 - CGI_10013079 superfamily 202715 67 167 3.11E-39 130.776 cl04194 Tctex-1 superfamily - - Tctex-1 family; Tctex-1 is a dynein light chain. It has been shown that Tctex-1 can bind to the cytoplasmic tail of rhodopsin. C-terminal rhodopsin mutations responsible for retinitis pigmentosa inhibit this interaction. Q#294 - CGI_10013081 superfamily 247725 95 185 2.67E-45 156.637 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting proteins to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and other proteins." 
Q#294 - CGI_10013081 superfamily 241647 39 68 1.97E-07 48.293 cl00157 WW superfamily - - Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs; 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs. Q#296 - CGI_10013083 superfamily 245670 676 859 3.11E-53 183.936 cl11519 DENN superfamily - - DENN (AEX-3) domain; DENN (after differentially expressed in neoplastic vs normal cells) is a domain which occurs in several proteins involved in Rab-mediated processes or regulation of MAPK signalling pathways. Q#296 - CGI_10013083 superfamily 243635 588 666 9.02E-14 68.5153 cl04085 uDENN superfamily - - uDENN domain; This region is always found associated with pfam02141. It is predicted to form an all beta domain. Q#297 - CGI_10013084 superfamily 245840 24 167 2.12E-86 260.339 cl12022 Ribosomal_L18e superfamily - - Ribosomal protein L18e/L15; This family includes eukaryotic L18 as well as prokaryotic L15. Q#297 - CGI_10013084 superfamily 245840 240 364 2.55E-73 226.827 cl12022 Ribosomal_L18e superfamily - - Ribosomal protein L18e/L15; This family includes eukaryotic L18 as well as prokaryotic L15. 
Q#299 - CGI_10013086 superfamily 241659 90 161 3.48E-22 89.1162 cl00175 alpha-crystallin-Hsps_p23-like superfamily - - "alpha-crystallin domain (ACD) found in alpha-crystallin-type small heat shock proteins, and a similar domain found in p23 (a cochaperone for Hsp90) and in other p23-like proteins.; The alpha-crystallin-Hsps_p23-like superfamily includes the alpha-crystallin domain (ACD) of alpha-crystallin-type small heat shock proteins (sHsps) and a similar domain found in p23-like proteins. sHsps are small stress induced proteins with monomeric masses between 12-43 kDa, whose common feature is this ACD. sHsps are generally active as large oligomers consisting of multiple subunits, and are believed to be ATP-independent chaperones that prevent aggregation and are important in refolding in combination with other Hsps. p23 is a cochaperone of the Hsp90 chaperoning pathway. It binds Hsp90 and participates in the folding of a number of Hsp90 clients including the progesterone receptor. p23 also has a passive chaperoning activity. p23 in addition may act as the cytosolic prostaglandin E2 synthase. Included in this superfamily is the p23-like C-terminal CHORD-SGT1 (CS) domain of suppressor of G2 allele of Skp1 (Sgt1) and the p23-like domains of human butyrate-induced transcript 1 (hB-ind1), NUD (nuclear distribution) C, Melusin, and NAD(P)H cytochrome b5 (NCB5) oxidoreductase (OR)." Q#299 - CGI_10013086 superfamily 241659 283 360 2.52E-21 86.805 cl00175 alpha-crystallin-Hsps_p23-like superfamily - - "alpha-crystallin domain (ACD) found in alpha-crystallin-type small heat shock proteins, and a similar domain found in p23 (a cochaperone for Hsp90) and in other p23-like proteins.; The alpha-crystallin-Hsps_p23-like superfamily includes the alpha-crystallin domain (ACD) of alpha-crystallin-type small heat shock proteins (sHsps) and a similar domain found in p23-like proteins. 
sHsps are small stress induced proteins with monomeric masses between 12-43 kDa, whose common feature is this ACD. sHsps are generally active as large oligomers consisting of multiple subunits, and are believed to be ATP-independent chaperones that prevent aggregation and are important in refolding in combination with other Hsps. p23 is a cochaperone of the Hsp90 chaperoning pathway. It binds Hsp90 and participates in the folding of a number of Hsp90 clients including the progesterone receptor. p23 also has a passive chaperoning activity. p23 in addition may act as the cytosolic prostaglandin E2 synthase. Included in this superfamily is the p23-like C-terminal CHORD-SGT1 (CS) domain of suppressor of G2 allele of Skp1 (Sgt1) and the p23-like domains of human butyrate-induced transcript 1 (hB-ind1), NUD (nuclear distribution) C, Melusin, and NAD(P)H cytochrome b5 (NCB5) oxidoreductase (OR)." Q#301 - CGI_10013088 superfamily 245596 116 395 2.10E-147 434.209 cl11394 Glyco_tranf_GTA_type superfamily - - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. 
But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#301 - CGI_10013088 superfamily 222439 445 706 1.44E-44 162.023 cl16461 Glyco_transf_49 superfamily - - "Glycosyl-transferase for dystroglycan; This glycosyl-transferase brings about the glycosylation of the alpha-dystroglycan subunit. Dystroglycan is an integral member of the skeletal muscular dystrophin glycoprotein complex, which links dystrophin to proteins in the extracellular matrix." Q#302 - CGI_10013089 superfamily 219932 16 354 9.29E-93 283.534 cl07288 Pex16 superfamily - - Peroxisomal membrane protein (Pex16); Pex16 is a peripheral protein located at the matrix face of the peroxisomal membrane. Q#303 - CGI_10013090 superfamily 245874 15 79 1.60E-18 81.7037 cl12111 TNFR superfamily C - "Tumor necrosis factor receptor (TNFR) domain; superfamily of TNF-like receptor domains. When bound to TNF-like cytokines, TNFRs trigger multiple signal transduction pathways, they are involved in inflammation response, apoptosis, autoimmunity and organogenesis. TNFRs domains are elongated with generally three tandem repeats of cysteine-rich domains (CRDs). They fit in the grooves between protomers within the ligand trimer. Some TNFRs, such as NGFR and HveA, bind ligands with no structural similarity to TNF and do not bind ligand trimers." 
Q#304 - CGI_10013091 superfamily 247684 43 100 2.20E-18 78.7801 cl17037 NBD_sugar-kinase_HSP70_actin superfamily N - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#304 - CGI_10013091 superfamily 247675 3 66 0.00958995 32.5321 cl17011 Arginase_HDAC superfamily C - "Arginase-like and histone-like hydrolases; Arginase-like/histone-like hydrolase superfamily includes metal-dependent enzymes that belong to Arginase-like amidino hydrolase family and histone/histone-like deacetylase class I, II, IV family, respectively. These enzymes catalyze hydrolysis of amide bond. 
Arginases are known to be involved in control of cellular levels of arginine and ornithine, in histidine and arginine degradation and in clavulanic acid biosynthesis. Deacetylases play a role in signal transduction through histone and/or other protein modification and can repress/activate transcription of a number of different genes. They participate in different cellular processes including cell cycle regulation, DNA damage response, embryonic development, cytokine signaling important for immune response and post-translational control of the acetyl coenzyme A synthetase. Mammalian histone deacetylases are known to be involved in progression of different tumors. Specific inhibitors of mammalian histone deacetylases are an emerging class of promising novel anticancer drugs." Q#305 - CGI_10009028 superfamily 241625 9 133 1.51E-27 100.092 cl00123 PROF superfamily - - "Profilin binds actin monomers, membrane polyphosphoinositides such as PI(4,5)P2, and poly-L-proline. Profilin can inhibit actin polymerization into F-actin by binding to monomeric actin (G-actin) and terminal F-actin subunits, but - as a regulator of the cytoskeleton - it may also promote actin polymerization. It plays a role in the assembly of branched actin filament networks, by activating WASP via binding to WASP's proline rich domain. Profilin may link the cytoskeleton with major signalling pathways by interacting with components of the phosphatidylinositol cycle and Ras pathway." Q#307 - CGI_10009030 superfamily 216371 48 436 2.38E-57 196.12 cl18365 ERG4_ERG24 superfamily - - Ergosterol biosynthesis ERG4/ERG24 family; Ergosterol biosynthesis ERG4/ERG24 family. 
Q#308 - CGI_10009031 superfamily 243066 161 251 9.16E-29 110.72 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#308 - CGI_10009031 superfamily 219619 504 573 1.52E-10 57.9879 cl18518 Ion_trans_2 superfamily - - Ion channel; This family includes the two membrane helix type ion channels found in bacteria. Q#309 - CGI_10009032 superfamily 243066 78 168 6.70E-23 93.7716 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. 
POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#309 - CGI_10009032 superfamily 219619 424 499 1.03E-09 55.6767 cl18518 Ion_trans_2 superfamily - - Ion channel; This family includes the two membrane helix type ion channels found in bacteria. Q#309 - CGI_10009032 superfamily 241672 534 622 0.00463683 38.488 cl00192 ribokinase_pfkB_like superfamily C - "ribokinase/pfkB superfamily: Kinases that accept a wide variety of substrates, including carbohydrates and aromatic small molecules, all are phosphorylated at a hydroxyl group. The superfamily includes ribokinase, fructokinase, ketohexokinase, 2-dehydro-3-deoxygluconokinase, 1-phosphofructokinase, the minor 6-phosphofructokinase (PfkB), inosine-guanosine kinase, and adenosine kinase. Even though there is a high degree of structural conservation within this superfamily, their multimerization level varies widely, monomeric (e.g. adenosine kinase), dimeric (e.g. ribokinase), and trimeric (e.g THZ kinase)." Q#311 - CGI_10009034 superfamily 247875 63 233 0.00796951 35.3313 cl17321 2OG-FeII_Oxy_2 superfamily - - 2OG-Fe(II) oxygenase superfamily; 2OG-Fe(II) oxygenase superfamily. 
Q#312 - CGI_10009035 superfamily 243092 343 585 1.55E-11 64.2784 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#312 - CGI_10009035 superfamily 246925 24 100 0.00450112 38.1054 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." 
Q#313 - CGI_10009036 superfamily 243092 57 81 0.00385314 31.1286 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#314 - CGI_10009037 superfamily 219502 72 280 3.03E-57 186.111 cl06625 Nucleos_tra2_C superfamily - - Na+ dependent nucleoside transporter C-terminus; This family consists of nucleoside transport proteins. Rat CNT 2 is a purine-specific Na+-nucleoside cotransporter localised to the bile canalicular membrane. CNT 1 is a Na+-dependent nucleoside transporter selective for pyrimidine nucleosides and adenosine; it also transports the anti-viral nucleoside analogues AZT and ddC. This alignment covers the C-terminus of this family of transporters. 
Q#315 - CGI_10009038 superfamily 219502 229 444 2.31E-52 179.563 cl06625 Nucleos_tra2_C superfamily - - Na+ dependent nucleoside transporter C-terminus; This family consists of nucleoside transport proteins. Rat CNT 2 is a purine-specific Na+-nucleoside cotransporter localised to the bile canalicular membrane. CNT 1 is a Na+-dependent nucleoside transporter selective for pyrimidine nucleosides and adenosine; it also transports the anti-viral nucleoside analogues AZT and ddC. This alignment covers the C-terminus of this family of transporters. Q#315 - CGI_10009038 superfamily 201962 46 117 8.93E-18 78.5716 cl03347 Nucleos_tra2_N superfamily - - Na+ dependent nucleoside transporter N-terminus; This family consists of nucleoside transport proteins. Rat CNT 2 is a purine-specific Na+-nucleoside cotransporter localised to the bile canalicular membrane. Rat CNT 1 is a Na+-dependent nucleoside transporter selective for pyrimidine nucleosides and adenosine; it also transports the anti-viral nucleoside analogues AZT and ddC. This alignment covers the N terminus of this family Q#315 - CGI_10009038 superfamily 219507 126 224 1.95E-09 55.3231 cl18514 Gate superfamily - - "Nucleoside recognition; This region in the nucleoside transporter proteins are responsible for determining nucleoside specificity in the human CNT1 and CNT2 proteins. In the FeoB proteins, which are believed to be Fe2+ transporters, it includes the membrane pore region, so the function of this region is likely to be more general than just nucleoside specificity. This family may represent the pore and gate, with a wide potential range of specificity. Hence its name 'Gate'." 
Q#317 - CGI_10009040 superfamily 243092 14 347 6.59E-45 164.815 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#317 - CGI_10009040 superfamily 219469 753 905 1.13E-21 95.031 cl15655 Hira superfamily - - TUP1-like enhancer of split; The Hira proteins are found in a range of eukaryotes and are implicated in the assembly of repressive chromatin. These proteins also contain pfam00400. Q#318 - CGI_10009041 superfamily 243109 2370 2524 6.08E-76 252.815 cl02614 SPRY superfamily - - "SPRY domain; SPRY domains, first identified in the SP1A kinase of Dictyostelium and rabbit Ryanodine receptor (hence the name), are homologous to B30.2. 
SPRY domains have been identified in at least 11 protein families, covering a wide range of functions, including regulation of cytokine signaling (SOCS), RNA metabolism (DDX1 and hnRNP), immunity to retroviruses (TRIM5alpha), intracellular calcium release (ryanodine receptors or RyR) and regulatory and developmental processes (HERC1 and Ash2L). B30.2 also contains residues in the N-terminus that form a distinct PRY domain structure; i.e. B30.2 domain consists of PRY and SPRY subdomains. B30.2 domains comprise the C-terminus of three protein families: BTNs (receptor glycoproteins of immunoglobulin superfamily); several TRIM proteins (composed of RING/B-box/coiled-coil or RBCC core); Stonutoxin (secreted poisonous protein of the stonefish Synanceia horrida). While SPRY domains are evolutionarily ancient, B30.2 domains are a more recent adaptation where the SPRY/PRY combination is a possible component of immune defense. Mutations found in the SPRY-containing proteins have shown to cause Mediterranean fever and Opitz syndrome." Q#318 - CGI_10009041 superfamily 241594 3963 4328 4.90E-68 237.463 cl00077 HECTc superfamily - - "HECT domain; C-terminal catalytic domain of a subclass of Ubiquitin-protein ligase (E3). It binds specific ubiquitin-conjugating enzymes (E2), accepts ubiquitin from E2, transfers ubiquitin to substrate lysine side chains, and transfers additional ubiquitin molecules to the end of growing ubiquitin chains." 
Q#319 - CGI_10001089 superfamily 243092 305 451 5.66E-06 46.9444 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#321 - CGI_10014903 superfamily 243035 225 356 7.88E-28 106.145 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. 
Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. 
Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#323 - CGI_10014905 superfamily 243035 25 96 1.58E-14 64.1781 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. 
Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#324 - CGI_10014906 superfamily 243035 22 46 0.000107247 36.4238 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#326 - CGI_10014908 superfamily 243034 672 767 1.44E-11 62.0124 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#326 - CGI_10014908 superfamily 242008 573 636 0.00687659 37.9876 cl00656 Cas1_I-II-III superfamily N - "CRISPR/Cas system-associated protein Cas1; CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas1 is the most universal CRISPR system protein thought to be involved in spacer integration; Cas1 is metal-dependent deoxyribonuclease, also binds RNA; Shown to possess a unique fold consisting of a N-terminal beta-strand domain and a C-terminal alpha-helical domain" Q#328 - CGI_10014910 
superfamily 245201 617 667 4.02E-09 56.776 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#328 - CGI_10014910 superfamily 245201 402 459 0.000408299 41.9484 cl09925 PKc_like superfamily NC - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#331 - CGI_10014913 superfamily 220692 48 263 9.01E-05 42.1913 cl18570 7TM_GPCR_Srw superfamily N - Serpentine type 7TM GPCR chemoreceptor Srw; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srw is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. The genes encoding Srw do not appear to be under as strong an adaptive evolutionary pressure as those of Srz. 
Q#333 - CGI_10014915 superfamily 241741 60 651 0 966.323 cl00270 PEPCK_HprK superfamily - - "Phosphoenolpyruvate carboxykinase (PEPCK), a critical gluconeogenic enzyme, catalyzes the first committed step in the diversion of tricarboxylic acid cycle intermediates toward gluconeogenesis. It catalyzes the reversible decarboxylation and phosphorylation of oxaloacetate to yield phosphoenolpyruvate and carbon dioxide, using a nucleotide molecule (ATP or GTP) for the phosphoryl transfer, and has a strict requirement for divalent metal ions for activity. PEPCK's separate into two phylogenetic groups based on their nucleotide substrate specificity (the ATP-, and GTP-dependent groups).HprK/P, the bifunctional histidine-containing protein kinase/phosphatase, controls the phosphorylation state of the phosphocarrier protein HPr and regulates the utilization of carbon sources by gram-positive bacteria. It catalyzes both the ATP-dependent phosphorylation of HPr and its dephosphorylation by phosphorolysis. PEPCK and the C-terminal catalytic domain of HprK/P are structurally similar with conserved active site residues suggesting that these two phosphotransferases have related functions." Q#334 - CGI_10014916 superfamily 241741 36 626 0 948.219 cl00270 PEPCK_HprK superfamily - - "Phosphoenolpyruvate carboxykinase (PEPCK), a critical gluconeogenic enzyme, catalyzes the first committed step in the diversion of tricarboxylic acid cycle intermediates toward gluconeogenesis. It catalyzes the reversible decarboxylation and phosphorylation of oxaloacetate to yield phosphoenolpyruvate and carbon dioxide, using a nucleotide molecule (ATP or GTP) for the phosphoryl transfer, and has a strict requirement for divalent metal ions for activity. 
PEPCK's separate into two phylogenetic groups based on their nucleotide substrate specificity (the ATP-, and GTP-dependent groups).HprK/P, the bifunctional histidine-containing protein kinase/phosphatase, controls the phosphorylation state of the phosphocarrier protein HPr and regulates the utilization of carbon sources by gram-positive bacteria. It catalyzes both the ATP-dependent phosphorylation of HPr and its dephosphorylation by phosphorolysis. PEPCK and the C-terminal catalytic domain of HprK/P are structurally similar with conserved active site residues suggesting that these two phosphotransferases have related functions." Q#335 - CGI_10014917 superfamily 245234 15 66 1.50E-05 42.6646 cl10022 ABM superfamily C - Antibiotic biosynthesis monooxygenase; This domain is found in monooxygenases involved in the biosynthesis of several antibiotics by Streptomyces species. Its occurrence as a repeat in Streptomyces coelicolor SCO1909 is suggestive that the other proteins function as multimers. There is also a conserved histidine which is likely to be an active site residue. Q#336 - CGI_10014918 superfamily 209898 18 40 0.000427673 38.1534 cl14787 MORN superfamily - - MORN repeat; The MORN (Membrane Occupation and Recognition Nexus) repeat is found in multiple copies in several proteins including junctophilins (See Takeshima et al. Mol. Cell 2000;6:11-22). A MORN-repeat protein has been identified in the parasite Toxoplasma gondii, where it is a dynamic component of the cell division apparatus. It has been hypothesised to function as a linker protein between certain membrane regions and the parasite's cytoskeleton. Q#336 - CGI_10014918 superfamily 209898 65 87 0.00594323 35.0718 cl14787 MORN superfamily - - MORN repeat; The MORN (Membrane Occupation and Recognition Nexus) repeat is found in multiple copies in several proteins including junctophilins (See Takeshima et al. Mol. Cell 2000;6:11-22). 
A MORN-repeat protein has been identified in the parasite Toxoplasma gondii, where it is a dynamic component of the cell division apparatus. It has been hypothesised to function as a linker protein between certain membrane regions and the parasite's cytoskeleton. Q#336 - CGI_10014918 superfamily 209898 86 103 0.00998121 34.2383 cl14787 MORN superfamily C - MORN repeat; The MORN (Membrane Occupation and Recognition Nexus) repeat is found in multiple copies in several proteins including junctophilins (See Takeshima et al. Mol. Cell 2000;6:11-22). A MORN-repeat protein has been identified in the parasite Toxoplasma gondii, where it is a dynamic component of the cell division apparatus. It has been hypothesised to function as a linker protein between certain membrane regions and the parasite's cytoskeleton. Q#338 - CGI_10014920 superfamily 243050 610 671 6.39E-37 133.285 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." 
Q#338 - CGI_10014920 superfamily 243050 550 602 1.07E-34 126.689 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#338 - CGI_10014920 superfamily 243050 440 493 1.48E-29 112.526 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." 
Q#340 - CGI_10014922 superfamily 242966 107 181 0.00260004 35.3989 cl02288 DUF1330 superfamily C - Protein of unknown function (DUF1330); This family consists of several hypothetical bacterial proteins of around 90 residues in length. The function of this family is unknown. Q#340 - CGI_10014922 superfamily 242966 33 87 0.0033624 35.0137 cl02288 DUF1330 superfamily NC - Protein of unknown function (DUF1330); This family consists of several hypothetical bacterial proteins of around 90 residues in length. The function of this family is unknown. Q#341 - CGI_10014923 superfamily 242966 621 688 0.000210758 39.8452 cl02288 DUF1330 superfamily - - Protein of unknown function (DUF1330); This family consists of several hypothetical bacterial proteins of around 90 residues in length. The function of this family is unknown. Q#342 - CGI_10014924 superfamily 242966 81 157 0.00242903 34.6285 cl02288 DUF1330 superfamily - - Protein of unknown function (DUF1330); This family consists of several hypothetical bacterial proteins of around 90 residues in length. The function of this family is unknown. Q#343 - CGI_10014925 superfamily 241578 317 478 0.000424557 40.0234 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. 
Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#344 - CGI_10004680 superfamily 192535 18 132 0.00034405 40.657 cl18179 7TM_GPCR_Srsx superfamily C - Serpentine type 7TM GPCR chemoreceptor Srsx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srsx is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#346 - CGI_10004683 superfamily 241563 62 101 0.00054935 38.2292 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#347 - CGI_10004684 superfamily 241953 366 424 0.0041339 37.4952 cl00567 Colicin_V superfamily N - Colicin V production protein; Colicin V production protein is required in E. Coli for colicin V production from plasmid pColV-K30. This protein is coded for in the purF operon. Q#350 - CGI_10004687 superfamily 192997 443 553 4.50E-09 55.2803 cl18184 Sterol-sensing superfamily N - "Sterol-sensing domain of SREBP cleavage-activation; Sterol regulatory element-binding proteins (SREBPs) are membrane-bound transcription factors that promote lipid synthesis in animal cells. They are embedded in the membranes of the endoplasmic reticulum (ER) in a helical hairpin orientation and are released from the ER by a two-step proteolytic process. Proteolysis begins when the SREBPs are cleaved at Site-1, which is located at a leucine residue in the middle of the hydrophobic loop in the lumen of the ER. 
Upon proteolytic processing SREBP can activate the expression of genes involved in cholesterol biosynthesis and uptake. SCAP stimulates cleavage of SREBPs via fusion of their two C-termini. This domain is the transmembrane region that traverses the membrane eight times and is the sterol-sensing domain of the cleavage protein. WD40 domains are found towards the C-terminus." Q#351 - CGI_10000262 superfamily 217915 14 127 2.40E-18 79.8576 cl14957 Spc97_Spc98 superfamily N - Spc97 / Spc98 family; The spindle pole body (SPB) functions as the microtubule-organising centre in yeast. Members of this family are spindle pole body (SPB) components such as Spc97 and Spc98 that form a complex with gamma-tubulin. This family of proteins includes the grip motif 1 and grip motif 2. Q#352 - CGI_10005014 superfamily 241568 139 193 0.000841951 36.672 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#352 - CGI_10005014 superfamily 246918 201 253 1.24E-12 61.4487 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#352 - CGI_10005014 superfamily 246918 258 308 2.89E-06 43.7295 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. 
Q#352 - CGI_10005014 superfamily 241619 27 98 0.00332274 34.862 cl00112 PAN_APPLE superfamily - - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#353 - CGI_10005015 superfamily 243072 15 163 2.92E-24 93.6022 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#360 - CGI_10015340 superfamily 243161 4 91 5.08E-19 76.2789 cl02739 THAP superfamily - - "THAP domain; The THAP domain is a putative DNA-binding domain (DBD) and probably also binds a zinc ion. It features the conserved C2CH architecture (consensus sequence: Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His). Other universal features include the location of the domain at the N-termini of proteins, its size of about 90 residues, a C-terminal AVPTIF box and several other conserved residues. Orthologues of the human THAP domain have been identified in other vertebrates and probably worms and flies, but not in other eukaryotes or any prokaryotes." 
Q#361 - CGI_10015341 superfamily 220695 27 150 1.28E-06 47.9587 cl18571 7TM_GPCR_Srx superfamily C - Serpentine type 7TM GPCR chemoreceptor Srx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srx is part of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#364 - CGI_10015344 superfamily 247803 127 298 5.20E-109 326.455 cl17249 YlqF_related_GTPase superfamily - - "Circularly permuted YlqF-related GTPases; These proteins are found in bacteria, eukaryotes, and archaea. They all exhibit a circular permutation of the GTPase signature motifs so that the order of the conserved G box motifs is G4-G5-G1-G2-G3, with G4 and G5 being permuted from the C-terminal region of proteins in the Ras superfamily to the N-terminus of YlqF-related GTPases." Q#365 - CGI_10015345 superfamily 243050 282 337 1.85E-25 97.6754 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." 
Q#365 - CGI_10015345 superfamily 243050 219 276 2.85E-19 81.1084 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#365 - CGI_10015345 superfamily 243050 344 400 1.51E-15 70.8852 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." 
Q#365 - CGI_10015345 superfamily 195146 173 211 1.14E-06 46.0883 cl05674 PET superfamily N - "PET ((Prickle Espinas Testin) domain is involved in protein-protein interactions; PET domain is involved in protein-protein interactions and is usually found in conjunction with LIM domain, which is also a protein-protein interaction domain. The PET containing proteins serve as adaptors or scaffolds to support the assembly of multimeric protein complexes. The PET domain has been found at the N-terminal of four known groups of proteins: prickle, testin, LIMPETin/LIM-9 and overexpressed breast tumor protein (OEBT). Prickle has been implicated in regulation of cell movement through its association with the Dishevelled (Dsh) protein in the planar cell polarity (PCP) pathway. Testin is a cytoskeleton associated focal adhesion protein that localizes along actin stress fibers, at cell contact areas, and at focal adhesion plaques. It interacts with a variety of cytoskeletal proteins, including zyxin, mena, VASP, talin, and actin, and is involved in cell motility and adhesion events. Knockout mice experiments reveal tumor repressor function of Testin. LIMPETin/LIM-9 contains an N-terminal PET domain and 6 LIM domains at the C-terminal. In Schistosoma mansoni, where LIMPETin was first identified, it is down regulated in sexually mature adult females compared to sexually immature adult females and adult males. Its differential expression indicates that it is a transcription regulator. In C. elegans, LIM-9 may play a role in regulating the assembly and maintenance of the muscle A-band by forming a protein complex with SCPL-1 and UNC-89 and other proteins. OEBT displays a PET domain with two LIM domains, and is predicted to be localized in the nucleus with a possible role in cancer differentiation." Q#366 - CGI_10015346 superfamily 248097 93 138 8.23E-09 49.1858 cl17543 C1q superfamily C - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. 
Q#367 - CGI_10015347 superfamily 248097 8 122 3.40E-16 69.6014 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#368 - CGI_10015348 superfamily 248097 62 188 6.19E-20 81.5426 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#370 - CGI_10015350 superfamily 216152 3 262 1.54E-81 252.235 cl02988 Glyco_transf_10 superfamily N - "Glycosyltransferase family 10 (fucosyltransferase); This family of Fucosyltransferases are the enzymes transferring fucose from GDP-Fucose to GlcNAc in an alpha1,3 linkage. This family is known as glycosyltransferase family 10." Q#372 - CGI_10015352 superfamily 245206 5 245 2.24E-73 226.413 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." 
Q#373 - CGI_10015353 superfamily 247868 288 352 8.73E-06 46.3725 cl17314 PRK07608 superfamily N - ubiquinone biosynthesis hydroxylase family protein; Provisional Q#373 - CGI_10015353 superfamily 247868 1 169 1.46E-05 45.7154 cl17314 PRK07608 superfamily C - ubiquinone biosynthesis hydroxylase family protein; Provisional Q#374 - CGI_10015354 superfamily 247755 442 683 1.08E-117 354.924 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#374 - CGI_10015354 superfamily 216049 141 396 4.41E-41 150.899 cl18356 ABC_membrane superfamily - - ABC transporter transmembrane region; This family represents a unit of six transmembrane helices. Many members of the ABC transporter family (pfam00005) have two such regions. Q#377 - CGI_10015357 superfamily 242274 71 359 1.18E-119 367.053 cl01053 SGNH_hydrolase superfamily - - "SGNH_hydrolase, or GDSL_hydrolase, is a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the typical Ser-His-Asp(Glu) triad from other serine hydrolases, but may lack the carboxylic acid." 
Q#377 - CGI_10015357 superfamily 247743 627 794 6.11E-28 111.084 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperones, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#378 - CGI_10015358 superfamily 241607 46 92 5.88E-05 36.1182 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chymotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. 
The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#380 - CGI_10015360 superfamily 241641 43 106 5.81E-19 82.8969 cl00150 TY superfamily - - Thyroglobulin type I repeats.; The N-terminal region of human thyroglobulin contains 11 type-1 repeats TY repeats are proposed to be inhibitors of cysteine proteases Q#380 - CGI_10015360 superfamily 241641 292 343 8.05E-09 53.6217 cl00150 TY superfamily N - Thyroglobulin type I repeats.; The N-terminal region of human thyroglobulin contains 11 type-1 repeats TY repeats are proposed to be inhibitors of cysteine proteases Q#380 - CGI_10015360 superfamily 238155 147 206 2.12E-10 58.9254 cl08547 SPARC_EC superfamily N - "SPARC_EC; extracellular Ca2+ binding domain (containing 2 EF-hand motifs) of SPARC and related proteins (QR1, SC1/hevin, testican and tsc-36/FRP). SPARC (BM-40) is a multifunctional glycoprotein, a matricellular protein, that functions to regulate cell-matrix interactions; binds to such proteins as collagen and vitronectin and binds to endothelial cells thus inhibiting cellular proliferation. The EC domain interacts with a follistatin-like (FS) domain which appears to stabilize Ca2+ binding. The two EF-hands interact canonically but their conserved disulfide bonds confer a tight association between the EF-hand pair and an acid/amphiphilic N-terminal helix. Proposed active form involves a Ca2+ dependent symmetric homodimerization of EC-FS modules." Q#380 - CGI_10015360 superfamily 238155 366 449 0.000317471 40.0506 cl08547 SPARC_EC superfamily - - "SPARC_EC; extracellular Ca2+ binding domain (containing 2 EF-hand motifs) of SPARC and related proteins (QR1, SC1/hevin, testican and tsc-36/FRP). 
SPARC (BM-40) is a multifunctional glycoprotein, a matricellular protein, that functions to regulate cell-matrix interactions; binds to such proteins as collagen and vitronectin and binds to endothelial cells thus inhibiting cellular proliferation. The EC domain interacts with a follistatin-like (FS) domain which appears to stabilize Ca2+ binding. The two EF-hands interact canonically but their conserved disulfide bonds confer a tight association between the EF-hand pair and an acid/amphiphilic N-terminal helix. Proposed active form involves a Ca2+ dependent symmetric homodimerization of EC-FS modules." Q#383 - CGI_10008773 superfamily 241600 82 246 3.01E-71 219.805 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#384 - CGI_10008774 superfamily 241570 35 92 4.01E-05 43.0834 cl00047 CAP_ED superfamily C - "effector domain of the CAP family of transcription factors; members include CAP (or cAMP receptor protein (CRP)), which binds cAMP, FNR (fumarate and nitrate reduction), which uses an iron-sulfur cluster to sense oxygen) and CooA, a heme containing CO sensor. In all cases binding of the effector leads to conformational changes and the ability to activate transcription. 
Cyclic nucleotide-binding domain similar to CAP are also present in cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) and vertebrate cyclic nucleotide-gated ion-channels. Cyclic nucleotide-monophosphate binding domain; proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues; the best studied is the prokaryotic catabolite gene activator, CAP, where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure; three conserved glycine residues are thought to be essential for maintenance of the structural integrity of the beta-barrel; CooA is a homodimeric transcription factor that belongs to CAP family; cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain; cAPK's are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain; cGPK's are single chain enzymes that include the two copies of the domain in their N-terminal section; also found in vertebrate cyclic nucleotide-gated ion-channels" Q#384 - CGI_10008774 superfamily 241570 168 226 0.000151104 41.1574 cl00047 CAP_ED superfamily N - "effector domain of the CAP family of transcription factors; members include CAP (or cAMP receptor protein (CRP)), which binds cAMP, FNR (fumarate and nitrate reduction), which uses an iron-sulfur cluster to sense oxygen) and CooA, a heme containing CO sensor. In all cases binding of the effector leads to conformational changes and the ability to activate transcription. Cyclic nucleotide-binding domain similar to CAP are also present in cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) and vertebrate cyclic nucleotide-gated ion-channels. 
Cyclic nucleotide-monophosphate binding domain; proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues; the best studied is the prokaryotic catabolite gene activator, CAP, where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure; three conserved glycine residues are thought to be essential for maintenance of the structural integrity of the beta-barrel; CooA is a homodimeric transcription factor that belongs to CAP family; cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain; cAPK's are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain; cGPK's are single chain enzymes that include the two copies of the domain in their N-terminal section; also found in vertebrate cyclic nucleotide-gated ion-channels" Q#384 - CGI_10008774 superfamily 241570 246 296 0.000728101 39.2314 cl00047 CAP_ED superfamily C - "effector domain of the CAP family of transcription factors; members include CAP (or cAMP receptor protein (CRP)), which binds cAMP, FNR (fumarate and nitrate reduction), which uses an iron-sulfur cluster to sense oxygen) and CooA, a heme containing CO sensor. In all cases binding of the effector leads to conformational changes and the ability to activate transcription. Cyclic nucleotide-binding domain similar to CAP are also present in cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) and vertebrate cyclic nucleotide-gated ion-channels. 
Cyclic nucleotide-monophosphate binding domain; proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues; the best studied is the prokaryotic catabolite gene activator, CAP, where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure; three conserved glycine residues are thought to be essential for maintenance of the structural integrity of the beta-barrel; CooA is a homodimeric transcription factor that belongs to CAP family; cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain; cAPK's are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain; cGPK's are single chain enzymes that include the two copies of the domain in their N-terminal section; also found in vertebrate cyclic nucleotide-gated ion-channels" Q#385 - CGI_10008775 superfamily 248289 49 106 0.00597594 31.3324 cl17735 VWC superfamily - - von Willebrand factor type C domain; The high cutoff was used to prevent overlap with pfam00094. 
Q#388 - CGI_10008778 superfamily 243092 106 419 3.01E-51 175.986 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel beta-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#388 - CGI_10008778 superfamily 219730 8 76 1.18E-19 82.5683 cl06962 NLE superfamily - - NLE (NUC135) domain; This domain is located N terminal to WD40 repeats. It is found in the microtubule-associated yeast protein YTM1. Q#390 - CGI_10008780 superfamily 247725 639 751 7.07E-67 222.14 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting proteins to the appropriate cellular location or interacting with a binding partner. 
This domain family possesses multiple functions including the ability to bind inositol phosphates and other proteins." Q#390 - CGI_10008780 superfamily 247755 5 151 2.59E-46 166.958 cl17201 ABC_ATPase superfamily N - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#390 - CGI_10008780 superfamily 247789 197 356 2.05E-29 118.515 cl17235 ABC2_membrane superfamily - - ABC-2 type transporter; ABC-2 type transporter. Q#390 - CGI_10008780 superfamily 215882 559 662 1.79E-27 109.678 cl09511 FERM_M superfamily - - FERM central domain; This domain is the central structural domain of the FERM domain. Q#390 - CGI_10008780 superfamily 220215 474 550 3.35E-25 101.918 cl09630 FERM_N superfamily - - FERM N-terminal domain; This domain is the N-terminal ubiquitin-like structural domain of the FERM domain. Q#390 - CGI_10008780 superfamily 192138 764 782 7.90E-07 47.9988 cl07378 FA superfamily C - "FERM adjacent (FA); This region is found adjacent to Band 4.1 / FERM domains (pfam00373) in a subset of FERM-containing proteins. The region has been hypothesised to play a role in regulatory adaptation, based on similarity to other protein kinase substrates." 
Q#394 - CGI_10009161 superfamily 243035 22 148 3.49E-23 89.2161 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#395 - CGI_10009162 superfamily 219079 13 70 0.00767766 32.1128 cl14967 PHA-1 superfamily NC - Regulator protein PHA-1; This family represents the protein product of the gene pha-1 which coordinates with lin-35 Rb during animal development. The protein is expressed during embryonic development and functions in the cytoplasm. PHA-1 acts in a parallel pathway with UBC-18 to regulate the activity of a common cellular target. Q#396 - CGI_10004142 superfamily 202865 37 142 6.72E-22 93.8807 cl04378 Sec8_exocyst superfamily N - Sec8 exocyst complex component specific domain; Sec8 exocyst complex component specific domain. Q#397 - CGI_10004144 superfamily 245847 33 164 0.000900076 37.482 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#398 - CGI_10004145 superfamily 218609 15 66 0.00143857 34.2715 cl05189 Destabilase superfamily C - "Destabilase; Destabilase is an endo-epsilon(gamma-Glu)-Lys isopeptidase, which cleaves isopeptide bonds formed by transglutaminase (Factor XIIIa) between glutamine gamma-carboxamide and the epsilon-amino group of lysine." 
Q#399 - CGI_10004146 superfamily 245847 3 134 0.00752483 35.1708 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#400 - CGI_10005018 superfamily 245213 316 352 1.04E-07 49.1722 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#400 - CGI_10005018 superfamily 245213 267 304 0.000187463 39.9274 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#400 - CGI_10005018 superfamily 243061 67 156 5.98E-33 122.836 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. 
Q#400 - CGI_10005018 superfamily 243061 158 256 5.70E-32 120.139 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#400 - CGI_10005018 superfamily 243068 368 591 6.40E-23 98.3864 cl02523 Zona_pellucida superfamily - - Zona pellucida-like domain; Zona pellucida-like domain. Q#401 - CGI_10005019 superfamily 243035 21 121 0.000162434 37.5994 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. 
Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#402 - CGI_10005020 superfamily 245213 64 101 5.10E-05 40.6978 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. 
EGF_CA can be found in tandem repeat arrangements." Q#402 - CGI_10005020 superfamily 243068 125 362 5.73E-35 130.743 cl02523 Zona_pellucida superfamily - - Zona pellucida-like domain; Zona pellucida-like domain. Q#403 - CGI_10005021 superfamily 227404 391 725 3.43E-28 117.717 cl18810 ALK1 superfamily - - Serine/threonine kinase of the haspin family [Cell division and chromosome partitioning] Q#404 - CGI_10005022 superfamily 245206 8 262 2.22E-88 264.928 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#405 - CGI_10005023 superfamily 241578 1 187 5.19E-81 248.818 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and immune defenses. In integrins these domains form heterodimers, while in vWF they form multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed the MIDAS motif, which is a characteristic feature of most, if not all, A domains." Q#406 - CGI_10005024 superfamily 245226 152 329 1.63E-72 227.103 cl10012 DnaQ_like_exo superfamily - - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases is accompanied by horizontal gene transfer." 
Q#406 - CGI_10005024 superfamily 207684 97 131 0.000120176 39.2844 cl02640 SAP superfamily - - "SAP domain; The SAP (after SAF-A/B, Acinus and PIAS) motif is a putative DNA/RNA binding domain found in diverse nuclear and cytoplasmic proteins." Q#410 - CGI_10006746 superfamily 247792 81 118 3.74E-05 37.4252 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-(N/C/H)-X2-C-X(4-48)-C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#412 - CGI_10006748 superfamily 248024 37 182 9.89E-29 110.065 cl17470 SBF superfamily C - "Sodium Bile acid symporter family; This family consists of Na+/bile acid co-transporters. These transmembrane proteins function in the liver in the uptake of bile acids from portal blood plasma, a process mediated by the co-transport of Na+. Also in the family is ARC3 from S. cerevisiae, a putative transmembrane protein involved in resistance to arsenic compounds." Q#413 - CGI_10006749 superfamily 248024 67 204 3.97E-18 80.7901 cl17470 SBF superfamily C - "Sodium Bile acid symporter family; This family consists of Na+/bile acid co-transporters. These transmembrane proteins function in the liver in the uptake of bile acids from portal blood plasma, a process mediated by the co-transport of Na+. Also in the family is ARC3 from S. cerevisiae, a putative transmembrane protein involved in resistance to arsenic compounds." 
Q#414 - CGI_10006750 superfamily 215866 6 129 1.64E-20 86.6103 cl18349 Arrestin_N superfamily - - "Arrestin (or S-antigen), N-terminal domain; Ig-like beta-sandwich fold. Scop reports duplication with C-terminal domain." Q#414 - CGI_10006750 superfamily 243212 184 287 2.05E-08 51.5758 cl02844 Arrestin_C superfamily - - "Arrestin (or S-antigen), C-terminal domain; Ig-like beta-sandwich fold. Scop reports duplication with N-terminal domain." Q#415 - CGI_10006751 superfamily 245205 3 47 2.22E-05 39.1433 cl09930 RPA_2b-aaRSs_OBF_like superfamily N - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." 
Q#416 - CGI_10006752 superfamily 245205 14 72 7.77E-06 38.7581 cl09930 RPA_2b-aaRSs_OBF_like superfamily C - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." Q#426 - CGI_10015168 superfamily 248458 17 87 1.38E-06 45.3825 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. 
Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#427 - CGI_10015169 superfamily 222429 7 54 1.96E-07 43.3832 cl18676 Myb_DNA-bind_5 superfamily C - Myb/SANT-like DNA-binding domain; This presumed domain appears to be related to other Myb/SANT like DNA binding domains. This family is greatly expanded in arthropods and higher eukaryotes. Q#428 - CGI_10015170 superfamily 248458 105 169 9.05E-07 46.1529 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. 
Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#429 - CGI_10015171 superfamily 248458 13 76 5.60E-06 43.4565 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. 
The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#430 - CGI_10015172 superfamily 248458 179 308 3.46E-09 57.3237 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. 
Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#430 - CGI_10015172 superfamily 248458 433 541 3.02E-05 44.9973 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 
Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#431 - CGI_10015173 superfamily 248458 57 186 6.18E-12 65.4129 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. 
Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#431 - CGI_10015173 superfamily 248458 273 449 1.02E-06 49.2345 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." 
Q#432 - CGI_10015174 superfamily 245213 160 196 1.33E-07 46.4758 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#432 - CGI_10015174 superfamily 245213 198 234 3.34E-07 45.3202 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#432 - CGI_10015174 superfamily 245213 84 120 6.33E-06 41.8534 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#432 - CGI_10015174 superfamily 245213 122 158 0.000202999 37.6162 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#433 - CGI_10005525 superfamily 244906 1373 1432 5.68E-26 104.144 cl08315 CAP_GLY superfamily - - "CAP-Gly domain; Cytoskeleton-associated proteins (CAPs) are involved in the organisation of microtubules and transportation of vesicles and organelles along the cytoskeletal network. A conserved motif, CAP-Gly, has been identified in a number of CAPs, including CLIP-170 and dynactins. The crystal structure of Caenorhabditis elegans F53F4.3 protein CAP-Gly domain was recently solved. The domain contains three beta-strands. The most conserved sequence, GKNDG, is located in two consecutive sharp turns on the surface, forming the entrance to a groove." Q#433 - CGI_10005525 superfamily 221593 643 757 2.34E-22 95.9038 cl13857 DUF3694 superfamily - - "Kinesin protein; This domain family is found in eukaryotes, and is typically between 131 and 151 amino acids in length. The family is found in association with pfam00225, pfam00498. There is a single completely conserved residue W that may be functionally important." Q#433 - CGI_10005525 superfamily 221571 233 285 1.57E-10 58.6683 cl13810 KIF1B superfamily - - "Kinesin protein 1B; This domain family is found in eukaryotes, and is approximately 50 amino acids in length. The family is found in association with pfam00225, pfam00498. 
KIF1B is an anterograde motor for transport of mitochondria in axons of neuronal cells." Q#434 - CGI_10005527 superfamily 243146 66 104 6.70E-06 42.6486 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#435 - CGI_10005528 superfamily 243100 169 222 4.31E-09 50.6901 cl02576 B_zip1 superfamily - - "basic leucine zipper DNA-binding and multimerization region of GCN4 and related proteins; Basic leucine zipper (bZIP) transcription factors act in networks of homo- and hetero-dimers in the regulation in a diverse set of cellular pathways. Classical leucine zippers have alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization creates a pair of basic regions that bind DNA and undergo conformational change. GCN4 was identified in Saccharomyces cerevisiae from mutations in a deficiency in activation with the general amino acid control pathway. GCN4 encodes a trans-activator of amino acid biosynthetic genes containing 2 acidic activation domains and a C-terminal bZIP domain, comprised of a basic alpha-helical DNA-binding region and a coiled-coil dimerization region." Q#436 - CGI_10005529 superfamily 241571 74 168 8.93E-07 47.0219 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#436 - CGI_10005529 superfamily 243035 190 245 3.98E-09 54.143 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. 
Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. 
A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#436 - CGI_10005529 superfamily 245847 264 394 0.000127152 41.003 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#438 - CGI_10005531 superfamily 188051 141 432 1.90E-25 105.627 cl18155 nop2p superfamily - - "NOL1/NOP2/sun family putative RNA methylase; [Protein synthesis, tRNA and rRNA base modification]." Q#442 - CGI_10012222 superfamily 248020 22 226 1.48E-06 48.6148 cl17466 Sulfatase superfamily C - Sulfatase; Sulfatase. Q#445 - CGI_10012225 superfamily 243093 71 156 8.09E-08 50.1613 cl02568 WSC superfamily - - WSC domain; This domain may be involved in carbohydrate binding. 
Q#447 - CGI_10012227 superfamily 247792 16 59 7.32E-08 48.596 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-(N/C/H)-X2-C-X(4-48)-C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#447 - CGI_10012227 superfamily 241563 154 189 6.01E-06 43.2368 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#449 - CGI_10012229 superfamily 243238 38 506 4.64E-170 499.102 cl02915 Voltage_gated_ClC superfamily - - "CLC voltage-gated chloride channel. The ClC chloride channels catalyse the selective flow of Cl- ions across cell membranes, thereby regulating electrical excitation in skeletal muscle and the flow of salt and water across epithelial barriers. This domain is found in the halogen ions (Cl-, Br- and I-) transport proteins of the ClC family. The ClC channels are found in all three kingdoms of life and perform a variety of functions including cellular excitability regulation, cell volume regulation, membrane potential stabilization, acidification of intracellular organelles, signal transduction, transepithelial transport in animals, and the extreme acid resistance response in eubacteria. They lack any structural or sequence similarity to other known ion channels and exhibit unique properties of ion permeation and gating. 
Unlike cation-selective ion channels, which form oligomers containing a single pore along the axis of symmetry, the ClC channels form two-pore homodimers with one pore per subunit without axial symmetry. Although lacking the typical voltage-sensor found in cation channels, all studied ClC channels are gated (opened and closed) by transmembrane voltage. The gating is conferred by the permeating ion itself, acting as the gating charge. In addition, eukaryotic and some prokaryotic ClC channels have two additional C-terminal CBS (cystathionine beta synthase) domains of putative regulatory function." Q#449 - CGI_10012229 superfamily 246936 524 688 4.52E-19 83.8408 cl15354 CBS_pair superfamily - - "The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase)." 
Q#451 - CGI_10012231 superfamily 216033 749 837 7.91E-19 83.152 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#451 - CGI_10012231 superfamily 216033 625 710 2.30E-15 73.1368 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#451 - CGI_10012231 superfamily 216033 57 141 4.28E-15 72.3664 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#451 - CGI_10012231 superfamily 216033 530 617 8.52E-15 71.596 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#451 - CGI_10012231 superfamily 216033 245 329 1.56E-14 70.8256 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#451 - CGI_10012231 superfamily 216033 353 428 8.12E-12 62.7364 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#451 - CGI_10012231 superfamily 216033 460 523 9.11E-11 59.6548 cl16959 Filamin superfamily N - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#451 - CGI_10012231 superfamily 216033 178 236 8.59E-05 41.5504 cl16959 Filamin superfamily N - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#451 - CGI_10012231 superfamily 216033 4 49 0.00231685 37.3132 cl16959 Filamin superfamily N - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#452 - CGI_10012232 superfamily 245309 150 228 0.000609837 38.6292 cl10471 LU superfamily - - "Ly-6 antigen / uPA receptor -like domain; occurs singly in GPI-linked cell-surface glycoproteins (Ly-6 family,CD59, thymocyte B cell antigen, Sgp-2) or as three-fold repeated domain in urokinase-type plasminogen activator receptor. Topology of these domains is similar to that of snake venom neurotoxins." Q#452 - CGI_10012232 superfamily 248097 259 365 4.98E-22 92.3282 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. 
Q#452 - CGI_10012232 superfamily 248097 503 619 8.95E-12 62.6678 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#452 - CGI_10012232 superfamily 248097 375 487 1.79E-06 46.4894 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#453 - CGI_10012233 superfamily 245309 86 164 0.00133461 37.0884 cl10471 LU superfamily - - "Ly-6 antigen / uPA receptor -like domain; occurs singly in GPI-linked cell-surface glycoproteins (Ly-6 family,CD59, thymocyte B cell antigen, Sgp-2) or as three-fold repeated domain in urokinase-type plasminogen activator receptor. Topology of these domains is similar to that of snake venom neurotoxins." Q#453 - CGI_10012233 superfamily 248097 193 299 2.74E-25 101.188 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#453 - CGI_10012233 superfamily 248097 449 556 3.47E-08 51.9193 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#453 - CGI_10012233 superfamily 248097 309 353 6.29E-06 44.5634 cl17543 C1q superfamily C - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#455 - CGI_10012235 superfamily 243212 356 484 3.70E-15 72.7617 cl02844 Arrestin_C superfamily - - "Arrestin (or S-antigen), C-terminal domain; Ig-like beta-sandwich fold. Scop reports duplication with N-terminal domain." Q#455 - CGI_10012235 superfamily 215866 165 287 7.43E-12 63.1132 cl18349 Arrestin_N superfamily - - "Arrestin (or S-antigen), N-terminal domain; Ig-like beta-sandwich fold. Scop reports duplication with C-terminal domain." 
Q#456 - CGI_10012236 superfamily 247725 913 1017 9.87E-34 127.508 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting proteins to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and other proteins." Q#456 - CGI_10012236 superfamily 201217 88 139 1.27E-12 64.8544 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#456 - CGI_10012236 superfamily 201217 582 629 1.67E-12 64.4692 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#456 - CGI_10012236 superfamily 205718 616 645 1.82E-07 49.411 cl16296 RCC1_2 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#456 - CGI_10012236 superfamily 205718 72 101 9.40E-06 44.4034 cl16296 RCC1_2 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#456 - CGI_10012236 superfamily 201217 143 204 2.69E-05 43.2832 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#456 - CGI_10012236 superfamily 201217 632 679 3.46E-05 43.2832 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. 
Q#456 - CGI_10012236 superfamily 201217 209 253 0.000555424 39.4312 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#456 - CGI_10012236 superfamily 209898 1124 1142 0.00190174 37.7051 cl14787 MORN superfamily - - MORN repeat; The MORN (Membrane Occupation and Recognition Nexus) repeat is found in multiple copies in several proteins including junctophilins (See Takeshima et al. Mol. Cell 2000;6:11-22). A MORN-repeat protein has been identified in the parasite Toxoplasma gondii as a dynamic component of the cell division apparatus. It has been hypothesised to function as a linker protein between certain membrane regions and the parasite's cytoskeleton. Q#456 - CGI_10012236 superfamily 209898 1152 1169 0.00927199 35.7791 cl14787 MORN superfamily C - MORN repeat; The MORN (Membrane Occupation and Recognition Nexus) repeat is found in multiple copies in several proteins including junctophilins (See Takeshima et al. Mol. Cell 2000;6:11-22). A MORN-repeat protein has been identified in the parasite Toxoplasma gondii as a dynamic component of the cell division apparatus. It has been hypothesised to function as a linker protein between certain membrane regions and the parasite's cytoskeleton. Q#457 - CGI_10012237 superfamily 128469 243 322 0.00030407 38.5904 cl17971 VPS9 superfamily C - Domain present in VPS9; Domain present in yeast vacuolar sorting protein 9 and other proteins. Q#463 - CGI_10004485 superfamily 215866 13 147 3.41E-22 90.0771 cl18349 Arrestin_N superfamily - - "Arrestin (or S-antigen), N-terminal domain; Ig-like beta-sandwich fold. Scop reports duplication with C-terminal domain." Q#463 - CGI_10004485 superfamily 243212 166 206 1.37E-05 42.7162 cl02844 Arrestin_C superfamily C - "Arrestin (or S-antigen), C-terminal domain; Ig-like beta-sandwich fold. Scop reports duplication with N-terminal domain." 
Q#467 - CGI_10007406 superfamily 245876 5 107 2.82E-50 166.693 cl12113 HSF_DNA-bind superfamily - - HSF-type DNA-binding; HSF-type DNA-binding. Q#467 - CGI_10007406 superfamily 219081 251 394 0.000702796 39.6848 cl05853 Vert_HS_TF superfamily C - "Vertebrate heat shock transcription factor; This family represents the C-terminal region of vertebrate heat shock transcription factors. Heat shock transcription factors regulate the expression of heat shock proteins - a set of proteins that protect the cell from damage caused by stress and aid the cell's recovery after the removal of stress. This C-terminal region is found with the N-terminal pfam00447, and may contain a three-stranded coiled-coil trimerisation domain and a CE2 regulatory region, the latter of which is involved in sustained heat shock response." Q#468 - CGI_10007407 superfamily 241571 198 309 1.68E-24 99.7942 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#468 - CGI_10007407 superfamily 241571 5 99 2.36E-17 78.9934 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." 
Q#468 - CGI_10007407 superfamily 241613 104 136 1.78E-08 51.4386 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#468 - CGI_10007407 superfamily 241613 399 430 1.49E-07 49.1274 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#468 - CGI_10007407 superfamily 241613 155 194 0.000587255 38.3418 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it 
into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#469 - CGI_10007408 superfamily 248289 26 80 6.94E-05 37.4035 cl17735 VWC superfamily - - von Willebrand factor type C domain; The high cutoff was used to prevent overlap with pfam00094. Q#470 - CGI_10007409 superfamily 245864 97 243 3.06E-11 61.9106 cl12078 p450 superfamily C - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." 
Q#471 - CGI_10007410 superfamily 238191 10 507 4.26E-130 391.31 cl18907 Esterase_lipase superfamily - - "Esterases and lipases (includes fungal lipases, cholinesterases, etc.) These enzymes act on carboxylic esters (EC: 3.1.1.-). The catalytic apparatus involves three residues (catalytic triad): a serine, a glutamate or aspartate and a histidine. These catalytic residues are responsible for the nucleophilic attack on the carbonyl carbon atom of the ester bond. In contrast with other alpha/beta hydrolase fold family members, p-nitrobenzyl esterase and acetylcholine esterase have a Glu instead of Asp at the active site carboxylate." Q#473 - CGI_10007412 superfamily 247856 102 120 0.00526491 32.25 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#474 - CGI_10011934 superfamily 241782 34 429 2.93E-141 413.118 cl00321 AAT_I superfamily - - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. 
The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into five major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), D-amino acid superfamily (fold type IV), and glycogen phosphorylase family (fold type V)." Q#476 - CGI_10011936 superfamily 222150 243 267 1.69E-05 41.6085 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#476 - CGI_10011936 superfamily 222150 270 295 3.08E-05 40.8381 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#476 - CGI_10011936 superfamily 222150 298 322 0.000428054 37.7565 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#476 - CGI_10011936 superfamily 222150 327 351 0.000867075 36.6009 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#476 - CGI_10011936 superfamily 246975 230 250 0.00813496 33.8597 cl15478 zf-C2H2 superfamily - - "Zinc finger, C2H2 type; The C2H2 zinc finger is the classical zinc finger domain. The two conserved cysteines and histidines co-ordinate a zinc ion. The following pattern describes the zinc finger. #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C] Where X can be any amino acid, and numbers in brackets indicate the number of residues. The positions marked # are those that are important for the stable fold of the zinc finger. The final position can be either his or cys. The C2H2 zinc finger is composed of two short beta strands followed by an alpha helix. 
The amino terminal part of the helix binds the major groove in DNA binding zinc fingers. The accepted consensus binding sequence for Sp1 is usually defined by the asymmetric hexanucleotide core GGGCGG but this sequence does not include, among others, the GAG (=CTC) repeat that constitutes a high-affinity site for Sp1 binding to the wt1 promoter." Q#477 - CGI_10011937 superfamily 248458 18 194 7.69E-14 71.1909 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. 
Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#478 - CGI_10011938 superfamily 241572 63 115 0.000345107 36.8329 cl00050 CYCLIN superfamily C - "Cyclin box fold. Protein binding domain functioning in cell-cycle and transcription control. Present in cyclins, TFIIB and Retinoblastoma (RB). The cyclins consist of 8 classes of cell cycle regulators that regulate cyclin dependent kinases (CDKs). TFIIB is a transcription factor that binds the TATA box. Cyclins, TFIIB and RB contain 2 copies of the domain." Q#479 - CGI_10011939 superfamily 247769 588 761 7.93E-06 45.4081 cl17215 HDc superfamily - - Metal dependent phosphohydrolases with conserved 'HD' motif Q#479 - CGI_10011939 superfamily 248010 343 493 2.03E-16 77.4216 cl17456 GAF superfamily - - "GAF domain; This domain is present in cGMP-specific phosphodiesterases, adenylyl and guanylyl cyclases, phytochromes, FhlA and NifA. Adenylyl and guanylyl cyclases catalyze ATP and GTP to the second messengers cAMP and cGMP, respectively, these products up-regulating catalytic activity by binding to the regulatory GAF domain(s). The opposite hydrolysis reaction is catalyzed by phosphodiesterase. cGMP-dependent 3',5'-cyclic phosphodiesterase catalyzes the conversion of guanosine 3',5'-cyclic phosphate to guanosine 5'-phosphate. Here too, cGMP regulates catalytic activity by GAF-domain binding. Phytochromes are regulatory photoreceptors in plants and bacteria which exist in two thermally-stable states that are reversibly inter-convertible by light: the Pr state absorbs maximally in the red region of the spectrum, while the Pfr state absorbs maximally in the far-red region. 
This domain is also found in FhlA (formate hydrogen lyase transcriptional activator) and NifA, a transcriptional activator which is required for activation of most Nif operons which are directly involved in nitrogen fixation. NifA interacts with sigma-54." Q#479 - CGI_10011939 superfamily 248010 162 242 0.00492498 36.9756 cl17456 GAF superfamily N - "GAF domain; This domain is present in cGMP-specific phosphodiesterases, adenylyl and guanylyl cyclases, phytochromes, FhlA and NifA. Adenylyl and guanylyl cyclases catalyze ATP and GTP to the second messengers cAMP and cGMP, respectively, these products up-regulating catalytic activity by binding to the regulatory GAF domain(s). The opposite hydrolysis reaction is catalyzed by phosphodiesterase. cGMP-dependent 3',5'-cyclic phosphodiesterase catalyzes the conversion of guanosine 3',5'-cyclic phosphate to guanosine 5'-phosphate. Here too, cGMP regulates catalytic activity by GAF-domain binding. Phytochromes are regulatory photoreceptors in plants and bacteria which exist in two thermally-stable states that are reversibly inter-convertible by light: the Pr state absorbs maximally in the red region of the spectrum, while the Pfr state absorbs maximally in the far-red region. This domain is also found in FhlA (formate hydrogen lyase transcriptional activator) and NifA, a transcriptional activator which is required for activation of most Nif operons which are directly involved in nitrogen fixation. NifA interacts with sigma-54." Q#482 - CGI_10011942 superfamily 245814 606 679 1.00E-09 56.3435 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#482 - CGI_10011942 superfamily 245814 38 99 8.30E-08 50.5655 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#482 - CGI_10011942 superfamily 245814 515 571 4.33E-06 45.5579 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#482 - CGI_10011942 superfamily 245814 134 205 4.73E-06 45.1727 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. 
The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#482 - CGI_10011942 superfamily 245814 242 300 0.00290884 36.6983 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#482 - CGI_10011942 superfamily 245814 417 477 1.52E-07 49.7127 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#482 - CGI_10011942 superfamily 245814 310 375 2.31E-07 49.297 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#482 - CGI_10011942 superfamily 245814 492 532 0.00496426 36.367 cl11960 Ig superfamily C - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#484 - CGI_10011944 superfamily 241599 273 329 2.56E-06 44.9269 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." 
Q#485 - CGI_10011945 superfamily 241563 68 109 2.61E-05 42.0812 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#487 - CGI_10011947 superfamily 245819 1075 1214 5.57E-51 178.927 cl11967 Nucleotidyl_cyc_III superfamily C - "Class III nucleotidyl cyclases; Class III nucleotidyl cyclases are the largest, most diverse group of nucleotidyl cyclases (NC's) containing prokaryotic and eukaryotic proteins. They can be divided into two major groups; the mononucleotidyl cyclases (MNC's) and the diguanylate cyclases (DGC's). The MNC's, which include the adenylate cyclases (AC's) and the guanylate cyclases (GC's), have a conserved cyclase homology domain (CHD), while the DGC's have a conserved GGDEF domain, named after a conserved motif within this subgroup. Their products, cyclic guanylyl and adenylyl nucleotides, are second messengers that play important roles in eukaryotic signal transduction and prokaryotic sensory pathways." Q#487 - CGI_10011947 superfamily 245225 270 626 1.99E-51 187.455 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily - - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein couples receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine- binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). 
In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#487 - CGI_10011947 superfamily 245201 790 1003 4.26E-28 115.712 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." 
Q#487 - CGI_10011947 superfamily 219526 1005 1062 0.000833874 40.6803 cl06648 HNOBA superfamily N - "Heme NO binding associated; The HNOBA domain is found associated with the HNOB domain and pfam00211 in soluble cyclases and signalling proteins. The HNOB domain is predicted to function as a heme-dependent sensor for gaseous ligands, and transduce diverse downstream signals, in both bacteria and animals." Q#488 - CGI_10011948 superfamily 217228 204 422 3.59E-129 377.466 cl07843 G6PD_C superfamily C - "Glucose-6-phosphate dehydrogenase, C-terminal domain; Glucose-6-phosphate dehydrogenase, C-terminal domain. " Q#488 - CGI_10011948 superfamily 215937 29 202 1.35E-84 258.971 cl02877 G6PD_N superfamily - - "Glucose-6-phosphate dehydrogenase, NAD binding domain; Glucose-6-phosphate dehydrogenase, NAD binding domain. " Q#489 - CGI_10011949 superfamily 205277 121 282 1.39E-16 74.586 cl16092 ShortName superfamily - - Family description; Family description. Q#490 - CGI_10011950 superfamily 242900 1 157 8.73E-27 100.423 cl02137 PRA1 superfamily - - PRA1 family protein; This family includes the PRA1 (Prenylated rab acceptor) protein which is a Rab guanine dissociation inhibitor (GDI) displacement factor. This family also includes the glutamate transporter EAAC1 interacting protein GTRAP3-18. Q#491 - CGI_10011951 superfamily 247750 67 363 0 525.768 cl17196 E1_enzyme_family superfamily - - "Superfamily of activating enzymes (E1) of the ubiquitin-like proteins. This family includes classical ubiquitin-activating enzymes E1, ubiquitin-like (ubl) activating enzymes and other mechanistic homologes, like MoeB, Thif1 and others. 
The common reaction mechanism catalyzed by MoeB, ThiF and the E1 enzymes begins with a nucleophilic attack of the C-terminal carboxylate of MoaD, ThiS and ubiquitin, respectively, on the alpha-phosphate of an ATP molecule bound at the active site of the activating enzymes, leading to the formation of a high-energy acyladenylate intermediate and subsequently to the formation of a thiocarboxylate at the C termini of MoaD and ThiS." Q#491 - CGI_10011951 superfamily 192164 370 457 1.54E-18 80.3472 cl07434 E2_bind superfamily - - "E2 binding domain; E1 and E2 enzymes play a central role in ubiquitin and ubiquitin-like protein transfer cascades. This is an E2 binding domain that is found on NEDD8 activating E1 enzyme. The domain resembles ubiquitin, and recruits the catalytic core of the E2 enzyme Ubc12 in a similar manner to that in which ubiquitin interacts with ubiquitin binding domains." Q#493 - CGI_10008351 superfamily 241574 69 122 3.90E-17 74.5445 cl00053 PTPc superfamily N - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#494 - CGI_10008352 superfamily 247724 253 427 4.18E-41 147.297 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. 
The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#494 - CGI_10008352 superfamily 247724 106 222 2.12E-33 125.34 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. 
The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#494 - CGI_10008352 superfamily 247724 7 62 3.24E-05 43.7528 cl17170 Ras_like_GTPase superfamily N - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#495 - CGI_10008353 superfamily 247805 24 98 3.08E-10 54.1875 cl17251 DEXDc superfamily C - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 
Q#496 - CGI_10008354 superfamily 241600 132 287 2.70E-66 215.953 cl00085 FReD superfamily C - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#498 - CGI_10008356 superfamily 245208 1 55 1.98E-11 57.381 cl09933 ACAD superfamily N - "Acyl-CoA dehydrogenase; Both mitochondrial acyl-CoA dehydrogenases (ACAD) and peroxisomal acyl-CoA oxidases (AXO) catalyze the alpha,beta dehydrogenation of the corresponding trans-enoyl-CoA by FAD, which becomes reduced. The reduced form of ACAD is reoxidized in the oxidative half-reaction by electron-transferring flavoprotein (ETF), from which the electrons are transferred to the mitochondrial respiratory chain coupled with ATP synthesis. In contrast, AXO catalyzes a different oxidative half-reaction, in which the reduced FAD is reoxidized by molecular oxygen. The ACAD family includes the eukaryotic beta-oxidation enzymes, short (SCAD), medium (MCAD), long (LCAD) and very-long (VLCAD) chain acyl-CoA dehydrogenases. These enzymes all share high sequence similarity, but differ in their substrate specificities. The ACAD family also includes amino acid catabolism enzymes such as Isovaleryl-CoA dehydrogenase (IVD), short/branched chain acyl-CoA dehydrogenases (SBCAD), Isobutyryl-CoA dehydrogenase (IBDH), glutaryl-CoA dehydrogenase (GCD) and Crotonobetainyl-CoA dehydrogenase. 
The mitochondrial ACAD's are generally homotetramers, except for VLCAD, which is a homodimer. Related enzymes include the SOS adaptive response protein aidB, Naphthocyclinone hydroxylase (NcnH), and Dibenzothiophene (DBT) desulfurization enzyme C (DszC)" Q#500 - CGI_10011017 superfamily 241563 38 77 1.45E-05 42.6595 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#505 - CGI_10011023 superfamily 246713 10 34 0.00532324 34.302 cl14786 ENDO3c superfamily - - "endonuclease III; includes endonuclease III (DNA-(apurinic or apyrimidinic site) lyase), alkylbase DNA glycosidases (Alka-family) and other DNA glycosidases" Q#506 - CGI_10011024 superfamily 243164 103 136 3.97E-11 54.466 cl02748 zf-CDGSH superfamily - - "Iron-binding zinc finger CDGSH type; The CDGSH-type zinc finger domain binds iron rather than zinc as a redox-active pH-labile 2Fe-2S cluster. The conserved sequence C-X-C-X2-(S/T)-X3-P-X-C-D-G-(S/A/T)-H is a defining feature of this family. The domain is oriented towards the cytoplasm and is tethered to the mitochondrial membrane by a more N-terminal domain found in higher vertebrates, MitoNEET_N, pfam10660. The domain forms a uniquely folded homo-dimer and spans the outer mitochondrial membrane, orienting the iron-binding residues towards the cytoplasm." Q#506 - CGI_10011024 superfamily 243164 59 92 6.67E-09 47.9176 cl02748 zf-CDGSH superfamily - - "Iron-binding zinc finger CDGSH type; The CDGSH-type zinc finger domain binds iron rather than zinc as a redox-active pH-labile 2Fe-2S cluster. The conserved sequence C-X-C-X2-(S/T)-X3-P-X-C-D-G-(S/A/T)-H is a defining feature of this family. 
The domain is oriented towards the cytoplasm and is tethered to the mitochondrial membrane by a more N-terminal domain found in higher vertebrates, MitoNEET_N, pfam10660. The domain forms a uniquely folded homo-dimer and spans the outer mitochondrial membrane, orienting the iron-binding residues towards the cytoplasm." Q#507 - CGI_10011025 superfamily 243058 77 195 3.32E-21 89.2959 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#507 - CGI_10011025 superfamily 243058 161 279 6.81E-19 82.7475 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#507 - CGI_10011025 superfamily 243058 245 407 2.30E-07 48.8499 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. 
An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#507 - CGI_10011025 superfamily 243058 28 89 0.00376149 36.1384 cl02500 ARM superfamily N - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#509 - CGI_10011027 superfamily 243035 2 107 2.90E-17 71.8821 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. 
Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. 
Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#510 - CGI_10011028 superfamily 243072 24 147 1.21E-25 97.069 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#518 - CGI_10006220 superfamily 242830 11 62 2.08E-27 102.628 cl02008 CAT superfamily C - Chloramphenicol acetyltransferase; Chloramphenicol acetyltransferase. Q#519 - CGI_10006223 superfamily 248230 1 81 1.07E-38 142.016 cl17676 Rep_3 superfamily C - Initiator Replication protein; This protein is an initiator of plasmid replication. RepB possesses nicking-closing (topoisomerase I) like activity. It is also able to perform a strand transfer reaction on ssDNA that contains its target. This family also includes RepA which is an E.coli protein involved in plasmid replication. The RepA protein binds to DNA repeats that flank the repA gene. Q#520 - CGI_10006224 superfamily 221913 292 385 2.01E-08 53.6983 cl18626 AAA_12 superfamily C - AAA domain; This family of domains contain a P-loop motif that is characteristic of the AAA superfamily. Many of the proteins in this family are conjugative transfer proteins. Q#521 - CGI_10006225 superfamily 245206 4 192 1.09E-72 223.093 cl09931 NADB_Rossmann superfamily C - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. 
The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#522 - CGI_10006226 superfamily 241600 22 159 1.91E-47 155.091 cl00085 FReD superfamily C - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#524 - CGI_10006228 superfamily 241600 281 377 1.71E-42 149.313 cl00085 FReD superfamily C - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. 
C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#524 - CGI_10006228 superfamily 241600 111 192 3.08E-22 92.6886 cl00085 FReD superfamily NC - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#524 - CGI_10006228 superfamily 241619 226 261 0.0011018 37.0291 cl00112 PAN_APPLE superfamily NC - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#525 - CGI_10006229 superfamily 241600 2 157 1.31E-49 160.869 cl00085 FReD superfamily C - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. 
Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#529 - CGI_10014571 superfamily 245342 625 703 1.31E-20 88.5598 cl10594 ERCC4 superfamily - - ERCC4 domain; This domain is a family of nucleases. The family includes EME1 which is an essential component of a Holliday junction resolvase. EME1 interacts with MUS81 to form a DNA structure-specific endonuclease. Q#531 - CGI_10014573 superfamily 220207 433 485 1.23E-17 77.6748 cl09622 Sas10_Utp3_C superfamily C - Sas10 C-terminal domain; This family contains a C-terminal presumed domain in Sas10 which has been identified as a regulator of chromatin silencing. Q#531 - CGI_10014573 superfamily 217836 224 304 1.53E-14 69.2317 cl09556 Sas10_Utp3 superfamily - - "Sas10/Utp3/C1D family; This family contains Utp3 and LCP5 which are components of the U3 ribonucleoprotein complex. It also includes the human C1D protein and Saccharomyces cerevisiae YHR081W (rrp47), an exosome-associated protein required for the 3' processing of stable RNAs, and Sas10 which has been identified as a regulator of chromatin silencing. This family also includes the human protein Neuroguidin, an initiation factor 4E (eIF4E) binding protein." 
Q#532 - CGI_10014574 superfamily 245201 252 511 0 560.891 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#532 - CGI_10014574 superfamily 246908 137 232 5.80E-63 202.735 cl15255 SH2 superfamily - - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." Q#532 - CGI_10014574 superfamily 247683 78 129 8.38E-30 111.135 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. 
SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#533 - CGI_10014575 superfamily 243072 1328 1377 7.05E-09 55.4674 cl02529 ANK superfamily N - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#533 - CGI_10014575 superfamily 152510 171 204 0.000101474 41.918 cl13504 KN_motif superfamily - - "KN motif; This small motif is found at the N-terminus of Kank proteins and has been called the KN (for Kank N-terminal) motif. This protein is found in eukaryotes. Proteins in this family are typically between 413 to 1202 amino acids in length. This protein is found associated with pfam00023. This protein has two conserved sequence motifs: TPYG and LDLDF. Kank1 was obtained by positional cloning of a tumor suppressor gene in renal cell carcinoma, while the other members were found by homology search. The family is involved in the regulation of actin polymerization and cell motility through signaling pathways containing PI3K/Akt and/or unidentified modulators/effectors." Q#534 - CGI_10014576 superfamily 218954 1 84 0.00194946 36.5295 cl05646 Isy1 superfamily C - Isy1-like splicing family; Isy1 protein is important in the optimisation of splicing. 
Q#535 - CGI_10014577 superfamily 238147 30 161 3.45E-42 148.491 cl18906 TFIIFa superfamily N - "Transcription initiation factor IIF, alpha subunit, N-terminal region of RAP74. Subunit of transcription initiation complex involved in initiation, elongation and promoter escape. Tetramer of 2 alpha and 2 beta TFIIF subunits interacts directly with RNA polymerase II. TFIIF inhibits non-specific transcription initiation by PolII and recruits the polymerase to the preinitiation complex on promoter DNA for site-specific transcription initiation. The PolII/TFIIF-complex attaches through direct interactions of TFIIF with promoter DNA, TFIIB and the TAF250 subunit of TFIID, and provides scaffolding for addition of TFIIE and TFIIH. Together with TFIIE, TFIIF participates in DNA strand separation (open complex formation). N-terminal domains of RAP30 and RAP74 co-fold to form a single core structure, a triple barrel heterodimer, and has pseudo-2-fold symmetry." Q#536 - CGI_10014578 superfamily 212555 135 348 9.73E-37 133.386 cl17024 FBX4_GTPase_like superfamily - - "C-terminal GTPase-like domain of F-Box Only Protein 4; F-box proteins are involved in substrate recognition as part of SCF (Skp1-Cul1-Rbx1-F-box protein) ubiquitin ligase complexes. Fbx4 (or Fbxo4) binds to the telomere repeat binding factor 1 (TRF1), whose activity at telomeres is regulated in part by selective ubiquitination and degradation. This ubiquitination of TRF1 is mediated by Fbx4, which binds to the TRFH domain of TRF1, via the C-terminal domain characterized by this model, a module resembling a small GTPase domain that lacks the GTP-binding site. When bound to telomeres, TIN2 acts to protect TRF1 from SCF-Fbx4 mediated ubiquitination. Tankyrase-mediated ADP-ribosylation releases TRF1 from telomeres, rendering them susceptible to ubiquitination and degradation, which in turn promotes telomere elongation. 
Fbx4 has also been reported to target cyclin D1 for degradation by the proteasome, a mechanism ensuring the fidelity of DNA replication. More recently, these findings have been disputed." Q#536 - CGI_10014578 superfamily 243074 37 83 1.73E-13 64.0649 cl02535 F-box-like superfamily - - F-box-like; This is an F-box-like family. Q#537 - CGI_10014579 superfamily 242200 1 91 4.29E-28 99.0852 cl00932 Ribosomal_L37e superfamily - - Ribosomal protein L37e; This family includes ribosomal protein L37 from eukaryotes and archaebacteria. The family contains many conserved cysteines and histidines suggesting that this protein may bind to zinc. Q#540 - CGI_10014582 superfamily 246902 22 84 1.25E-31 109.236 cl15239 PLDc_SF superfamily N - "Catalytic domain of phospholipase D superfamily proteins; Catalytic domain of phospholipase D (PLD) superfamily proteins. The PLD superfamily is composed of a large and diverse group of proteins including plant, mammalian and bacterial PLDs, bacterial cardiolipin (CL) synthases, bacterial phosphatidylserine synthases (PSS), eukaryotic phosphatidylglycerophosphate (PGP) synthase, eukaryotic tyrosyl-DNA phosphodiesterase 1 (Tdp1), and some bacterial endonucleases (Nuc and BfiI), among others. PLD enzymes hydrolyze phospholipid phosphodiester bonds to yield phosphatidic acid and a free polar head group. They can also catalyze the transphosphatidylation of phospholipids to acceptor alcohols. The majority of members in this superfamily contain a short conserved sequence motif (H-x-K-x(4)-D, where x represents any amino acid residue), called the HKD signature motif. There are varying expanded forms of this motif in different family members. Some members contain variant HKD motifs. Most PLD enzymes are monomeric proteins with two HKD motif-containing domains. Two HKD motifs from two domains form a single active site. 
Some PLD enzymes have only one copy of the HKD motif per subunit but form a functionally active dimer, which has a single active site at the dimer interface containing the two HKD motifs from both subunits. Different PLD enzymes may have evolved through domain fusion of a common catalytic core with separate substrate recognition domains. Despite their various catalytic functions and a very broad range of substrate specificities, the diverse group of PLD enzymes can bind to a phosphodiester moiety. Most of them are active as bi-lobed monomers or dimers, and may possess similar core structures for catalytic activity. They are generally thought to utilize a common two-step ping-pong catalytic mechanism, involving an enzyme-substrate intermediate, to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group." Q#541 - CGI_10014583 superfamily 218768 21 164 5.47E-43 143.549 cl05419 DUF846 superfamily - - Eukaryotic protein of unknown function (DUF846); This family consists of several of unknown function from a variety of eukaryotic organisms. Q#543 - CGI_10014585 superfamily 243555 25 211 4.03E-21 88.6022 cl03871 Chitin_bind_3 superfamily - - "Chitin binding domain; This domain is found associated with a wide variety of cellulose binding domain. This domain however is a chitin binding domain. This domain is found in isolation in baculoviral spheroidins and spindolins, protein of unknown function." 
Q#545 - CGI_10014587 superfamily 245595 29 352 8.06E-158 453.588 cl11393 Peptidase_M14_like superfamily - - "M14 family of metallocarboxypeptidases and related proteins; The M14 family of metallocarboxypeptidases (MCPs), also known as funnelins, are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Two major subfamilies of the M14 family, defined based on sequence and structural homology, are the A/B and N/E subfamilies. Enzymes belonging to the A/B subfamily are normally synthesized as inactive precursors containing preceding signal peptide, followed by an N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases. The A/B enzymes can be further divided based on their substrate specificity; Carboxypeptidase A-like (CPA-like) enzymes favor hydrophobic residues while carboxypeptidase B-like (CPB-like) enzymes only cleave the basic residues lysine or arginine. The A forms have slightly different specificities, with Carboxypeptidase A1 (CPA1) preferring aliphatic and small aromatic residues, and CPA2 preferring the bulky aromatic side chains. Enzymes belonging to the N/E subfamily enzymes are not produced as inactive precursors and instead rely on their substrate specificity and subcellular compartmentalization to prevent inappropriate cleavage. They contain an extra C-terminal transthyretin-like domain, thought to be involved in folding or formation of oligomers. 
MCPs can also be classified based on their involvement in specific physiological processes; the pancreatic MCPs participate only in alimentary digestion and include carboxypeptidase A and B (A/B subfamily), while others, namely regulatory MCPs or the N/E subfamily, are involved in more selective reactions, mainly in non-digestive tissues and fluids, acting on blood coagulation/fibrinolysis, inflammation and local anaphylaxis, pro-hormone and neuropeptide processing, cellular response and others. Another MCP subfamily, is that of succinylglutamate desuccinylase /aspartoacylase, which hydrolyzes N-acetyl-L-aspartate (NAA), and deficiency in which is the established cause of Canavan disease. Another subfamily (referred to as subfamily C) includes an exceptional type of activity in the MCP family, that of dipeptidyl-peptidase activity of gamma-glutamyl-(L)-meso-diaminopimelate peptidase I which is involved in bacterial cell wall metabolism." Q#545 - CGI_10014587 superfamily 248053 356 438 2.49E-20 85.2684 cl17499 Peptidase_M14NE-CP-C_like superfamily - - "Peptidase associated domain: C-terminal domain of M14 N/E carboxypeptidase; putative folding, regulation, or interaction domain; This domain is found C-terminal to the M14 carboxypeptidase (CP) N/E subfamily containing zinc-binding enzymes that hydrolyze single C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. The N/E subfamily includes enzymatically active members (carboxypeptidase N, E, M, D, and Z), as well as non-active members (carboxypeptidase-like protein 1, -2, aortic CP-like protein, and adipocyte enhancer binding protein-1) which lack the critical active site and substrate-binding residues considered necessary for activity. 
The active N/E enzymes fulfill a variety of cellular functions, including prohormone processing, regulation of peptide hormone activity, alteration of protein-protein or protein-cell interactions and transcriptional regulation. For M14 CPs, it has been suggested that this domain may assist in folding of the CP domain, regulate enzyme activity, or be involved in interactions with other proteins or with membranes; for carboxypeptidase M, it may interact with the bradykinin 1 receptor at the cell surface. This domain may also be found in other peptidase families." Q#548 - CGI_10014590 superfamily 248279 588 652 1.90E-12 63.9247 cl17725 zf-HC5HC2H superfamily C - "PHD-like zinc-binding domain; The members of this family are annotated as containing PHD domain, but the zinc-binding region here is not typical of PHD domains. The conformation here is a well-conserved cysteine-histidine rich region spanning 90 residues, where the Cys and His are arranged as HxxC(31)CxxC(6)CxxCxxxxCxxxxHxxC (21)CxxH." Q#549 - CGI_10007841 superfamily 242903 23 97 7.53E-48 152.327 cl02148 APC10-like superfamily C - "APC10-like DOC1 domains in E3 ubiquitin ligases that mediate substrate ubiquitination; This family contains the single domain protein, APC10, a subunit of the anaphase-promoting complex (APC), as well as the DOC1 domain of multi-domain proteins present in E3 ubiquitin ligases. E3 ubiquitin ligases mediate substrate ubiquitination (or ubiquitylation), a component of the ubiquitin-26S proteasome pathway for selective proteolytic degradation. The APC, a multi-protein complex (or cyclosome), is a cell cycle-regulated, E3 ubiquitin ligase that controls important transitions in mitosis and the G1 phase by ubiquitinating regulatory proteins, thereby targeting them for degradation. 
APC10-like DOC1 domains such as those present in HECT (Homologous to the E6-AP Carboxyl Terminus) and Cullin-RING (Really Interesting New Gene) E3 ubiquitin ligase proteins, HECTD3, and CUL7, respectively, are also included in this hierarchy. CUL7 is a member of the Cullin-RING ligase family and functions as a molecular scaffold assembling a SCF-ROC1-like E3 ubiquitin ligase complex consisting of Skp1, CUL7, Fbx29 F-box protein, and ROC1 (RING-box protein 1) and promotes ubiquitination. CUL7 is a multi-domain protein with a C-terminal cullin domain that binds ROC1 and a centrally positioned APC10/DOC1 domain. HECTD3 contains a C-terminal HECT domain which contains the active site for ubiquitin transfer onto substrates, and an N-terminal APC10 domain which is responsible for substrate recognition and binding. An APC10/DOC1 domain homolog is also present in HERC2 (HECT domain and RLD2), a large multi-domain protein with three RCC1-like domains (RLDs), additional internal domains including zinc finger ZZ-type and Cyt-b5 (Cytochrome b5-like Heme/Steroid binding) domains, and a C-terminal HECT domain. Recent studies have shown that the protein complex HERC2-RNF8 coordinates ubiquitin-dependent assembly of DNA repair factors on damaged chromosomes. Also included in this hierarchy is an uncharacterized APC10/DOC1-like domain found in a multi-domain protein, which also contains CUB, zinc finger ZZ-type, and EF-hand domains. The APC10/DOC1 domain forms a beta-sandwich structure that is related in architecture to the galactose-binding domain-like fold; their sequences are quite dissimilar, however, and are not included here." Q#554 - CGI_10007847 superfamily 245847 9 156 6.65E-17 72.9745 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." 
Q#557 - CGI_10007850 superfamily 243104 14 55 6.89E-05 37.9097 cl02601 PSI superfamily - - "Plexin repeat; A cysteine rich repeat found in several different extracellular receptors. The function of the repeat is unknown. Three copies of the repeat are found in Plexin. Two copies of the repeat are found in mahogany protein. A related C. elegans protein contains four copies of the repeat. The Met receptor contains a single copy of the repeat. The Pfam alignment shows 6 conserved cysteine residues that may form three conserved disulphide bridges, whereas shows 8 conserved cysteines. The pattern of conservation suggests that cysteines 5 and 7 (that are not absolutely conserved) form a disulphide bridge (Personal observation. A Bateman)." Q#557 - CGI_10007850 superfamily 245205 99 156 0.00379233 33.3653 cl09930 RPA_2b-aaRSs_OBF_like superfamily N - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). 
aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." Q#560 - CGI_10000417 superfamily 242240 9 151 1.14E-34 123.941 cl00997 DUF297 superfamily N - TM1410 hypothetical-related protein; TM1410 hypothetical-related protein. Q#561 - CGI_10000522 superfamily 216901 28 229 0.000264258 39.4913 cl03466 Rap_GAP superfamily - - Rap/ran-GAP; Rap/ran-GAP. Q#562 - CGI_10000553 superfamily 247724 17 93 3.05E-21 88.3611 cl17170 Ras_like_GTPase superfamily N - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." 
Q#563 - CGI_10013033 superfamily 241611 1293 1451 3.33E-16 80.1252 cl00102 PTX superfamily - - "Pentraxins are plasma proteins characterized by their pentameric discoid assembly and their Ca2+ dependent ligand binding, such as Serum amyloid P component (SAP) and C-reactive Protein (CRP), which are cytokine-inducible acute-phase proteins implicated in innate immunity. CRP binds to ligands containing phosphocholine, SAP binds to amyloid fibrils, DNA, chromatin, fibronectin, C4-binding proteins and glycosaminoglycans. "Long" pentraxins have N-terminal extensions to the common pentraxin domain; one group, the neuronal pentraxins, may be involved in synapse formation and remodeling, and they may also be able to form heteromultimers." Q#563 - CGI_10013033 superfamily 207627 232 294 1.25E-08 56.1039 cl02522 Calx-beta superfamily N - Calx-beta domain; Calx-beta domain. Q#563 - CGI_10013033 superfamily 207627 1529 1616 3.90E-08 54.5631 cl02522 Calx-beta superfamily - - Calx-beta domain; Calx-beta domain. Q#563 - CGI_10013033 superfamily 207627 2049 2136 2.03E-07 52.2519 cl02522 Calx-beta superfamily - - Calx-beta domain; Calx-beta domain. Q#563 - CGI_10013033 superfamily 207627 976 1044 3.86E-07 51.4815 cl02522 Calx-beta superfamily N - Calx-beta domain; Calx-beta domain. Q#563 - CGI_10013033 superfamily 207627 4302 4392 9.12E-07 50.3307 cl02522 Calx-beta superfamily - - Calx-beta domain; Calx-beta domain. Q#563 - CGI_10013033 superfamily 207627 2531 2603 1.02E-06 50.3259 cl02522 Calx-beta superfamily - - Calx-beta domain; Calx-beta domain. Q#563 - CGI_10013033 superfamily 207627 3926 4017 1.42E-06 49.9455 cl02522 Calx-beta superfamily - - Calx-beta domain; Calx-beta domain. Q#563 - CGI_10013033 superfamily 207627 4933 5012 1.42E-06 49.9407 cl02522 Calx-beta superfamily - - Calx-beta domain; Calx-beta domain. Q#563 - CGI_10013033 superfamily 207627 325 417 3.38E-06 48.7899 cl02522 Calx-beta superfamily - - Calx-beta domain; Calx-beta domain. 
Q#563 - CGI_10013033 superfamily 207627 2752 2852 2.98E-05 45.7083 cl02522 Calx-beta superfamily - - Calx-beta domain; Calx-beta domain. Q#563 - CGI_10013033 superfamily 207627 2919 3014 3.40E-05 45.7083 cl02522 Calx-beta superfamily - - Calx-beta domain; Calx-beta domain. Q#563 - CGI_10013033 superfamily 207627 3544 3635 0.000106299 44.1627 cl02522 Calx-beta superfamily - - Calx-beta domain; Calx-beta domain. Q#563 - CGI_10013033 superfamily 207627 1908 2010 0.000254021 43.0071 cl02522 Calx-beta superfamily - - Calx-beta domain; Calx-beta domain. Q#563 - CGI_10013033 superfamily 207627 4168 4257 0.000412536 42.2367 cl02522 Calx-beta superfamily - - Calx-beta domain; Calx-beta domain. Q#563 - CGI_10013033 superfamily 207627 814 916 0.000535749 41.8563 cl02522 Calx-beta superfamily - - Calx-beta domain; Calx-beta domain. Q#563 - CGI_10013033 superfamily 207627 103 173 0.000750086 41.4711 cl02522 Calx-beta superfamily N - Calx-beta domain; Calx-beta domain. Q#563 - CGI_10013033 superfamily 207627 2629 2717 0.000956225 41.0859 cl02522 Calx-beta superfamily - - Calx-beta domain; Calx-beta domain. Q#564 - CGI_10013034 superfamily 248302 661 776 2.05E-22 93.7279 cl17748 VRR_NUC superfamily - - VRR-NUC domain; VRR-NUC domain. Q#565 - CGI_10013035 superfamily 243088 2 112 1.82E-34 119.301 cl02563 PX_domain superfamily - - "The Phox Homology domain, a phosphoinositide binding module; The PX domain is a phosphoinositide (PI) binding module involved in targeting proteins to membranes. Proteins containing PX domains interact with PIs and have been implicated in highly diverse functions such as cell signaling, vesicular trafficking, protein sorting, lipid modification, cell polarity and division, activation of T and B cells, and cell survival. Many members of this superfamily bind phosphatidylinositol-3-phosphate (PI3P) but in some cases, other PIs such as PI4P or PI(3,4)P2, among others, are the preferred substrates. 
In addition to protein-lipid interaction, the PX domain may also be involved in protein-protein interaction, as in the cases of p40phox, p47phox, and some sorting nexins (SNXs). The PX domain is conserved from yeast to humans and is found in more than 100 proteins. The majority of PX domain-containing proteins are SNXs, which play important roles in endosomal sorting." Q#566 - CGI_10013036 superfamily 246908 39 139 4.64E-48 158.458 cl15255 SH2 superfamily - - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." Q#566 - CGI_10013036 superfamily 246908 225 300 2.41E-25 97.1086 cl15255 SH2 superfamily - - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." 
Q#566 - CGI_10013036 superfamily 247683 156 214 2.29E-23 90.9008 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#567 - CGI_10013037 superfamily 241546 1369 1488 2.73E-53 185.557 cl00011 PLAT superfamily - - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats. The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#567 - CGI_10013037 superfamily 248011 134 208 1.13E-09 57.8917 cl17457 PKD superfamily - - "polycystic kidney disease I (PKD) domain; similar to other cell-surface modules, with an IG-like fold; domain probably functions as a ligand binding site in protein-protein or protein-carbohydrate interactions; a single instance of the repeat is presented here. The domain is also found in microbial collagenases and chitinases." 
Q#567 - CGI_10013037 superfamily 248011 41 112 2.46E-05 44.7106 cl17457 PKD superfamily - - "polycystic kidney disease I (PKD) domain; similar to other cell-surface modules, with an IG-like fold; domain probably functions as a ligand binding site in protein-protein or protein-carbohydrate interactions; a single instance of the repeat is presented here. The domain is also found in microbial collagenases and chitinases." Q#567 - CGI_10013037 superfamily 248011 3 33 0.000797984 40.127 cl17457 PKD superfamily N - "polycystic kidney disease I (PKD) domain; similar to other cell-surface modules, with an IG-like fold; domain probably functions as a ligand binding site in protein-protein or protein-carbohydrate interactions; a single instance of the repeat is presented here. The domain is also found in microbial collagenases and chitinases." Q#567 - CGI_10013037 superfamily 243086 1288 1321 0.00898446 36.9694 cl02559 GPS superfamily C - "Latrophilin/CL-1-like GPS domain; Domain present in latrophilin/CL-1, sea urchin REJ and polycystin." Q#567 - CGI_10013037 superfamily 219520 1727 1780 0.00942925 37.9716 cl18515 5TM-5TMR_LYT superfamily NC - 5TMR of 5TMR-LYT; This entry represents the transmembrane region of the 5TM-LYT (5TM Receptors of the LytS-YhcK type). Q#568 - CGI_10013038 superfamily 241636 1 149 5.89E-44 149.275 cl00145 TBOX superfamily - - "T-box DNA binding domain of the T-box family of transcriptional regulators. The T-box family is an ancient group that appears to play a critical role in development in all animal species. These genes were uncovered on the basis of similarity to the DNA binding domain of murine Brachyury (T) gene product, the defining feature of the family. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development and conserved expression patterns, most of the known genes in all species being expressed in mesoderm or mesoderm precursors." 
Q#570 - CGI_10013040 superfamily 216033 235 285 0.00147918 36.928 cl16959 Filamin superfamily N - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#570 - CGI_10013040 superfamily 128778 25 150 0.00208976 36.8591 cl17972 BBC superfamily - - B-Box C-terminal domain; Coiled coil region C-terminal to (some) B-Box domains Q#571 - CGI_10013041 superfamily 241758 229 466 3.98E-55 188.397 cl00292 AANH_like superfamily - - "Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases, ATP sulphurylases, Universal Stress Response protein and electron transfer flavoprotein (ETF). The domain forms an alpha/beta/alpha fold which binds to Adenosine nucleotide." Q#571 - CGI_10013041 superfamily 241780 2 198 8.46E-53 180.446 cl00319 Gn_AT_II superfamily - - "Glutamine amidotransferases class-II (GATase). The glutaminase domain catalyzes an amide nitrogen transfer from glutamine to the appropriate substrate. In this process, glutamine is hydrolyzed to glutamic acid and ammonia. This domain is related to members of the Ntn (N-terminal nucleophile) hydrolase superfamily and is found at the N-terminus of enzymes such as glucosamine-fructose 6-phosphate synthase (GLMS or GFAT), glutamine phosphoribosylpyrophosphate (Prpp) amidotransferase (GPATase), asparagine synthetase B (AsnB), beta lactam synthetase (beta-LS) and glutamate synthase (GltS). GLMS catalyzes the formation of glucosamine 6-phosphate from fructose 6-phosphate and glutamine in amino sugar synthesis. GPATase catalyzes the first step in purine biosynthesis, an amide transfer from glutamine to PRPP, resulting in phosphoribosylamine, pyrophosphate and glutamate. Asparagine synthetase B synthesizes asparagine from aspartate and glutamine. Beta-LS catalyzes the formation of the beta-lactam ring in the beta-lactamase inhibitor clavulanic acid. GltS synthesizes L-glutamate from 2-oxoglutarate and L-glutamine. These enzymes are generally dimers, but GPATase also exists as a homotetramer." 
Q#572 - CGI_10013042 superfamily 247856 35 96 1.77E-07 45.6165 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#572 - CGI_10013042 superfamily 247856 70 138 9.25E-07 43.6905 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#575 - CGI_10013045 superfamily 214781 216 316 1.02E-14 71.2192 cl02747 NRF superfamily - - N-terminal domain in C. elegans NRF-6 (Nose Resistant to Fluoxetine-4) and NDG-4 (resistant to nordihydroguaiaretic acid-4); Also present in several other worm and fly proteins. Q#576 - CGI_10013046 superfamily 216316 1278 1718 6.79E-164 509.093 cl10574 CD36 superfamily - - CD36 family; The CD36 family is thought to be a novel class of scavenger receptors. There is also evidence suggesting a possible role in signal transduction. CD36 is involved in cell adhesion. Q#581 - CGI_10001052 superfamily 246723 226 610 2.92E-40 151.816 cl14813 GluZincin superfamily - - "Peptidase Gluzincin family (thermolysin-like proteinases, TLPs) includes peptidases M1, M2, M3, M4, M13, M32 and M36 (fungalysins); Gluzincin family (thermolysin-like peptidases or TLPs) includes several zinc-dependent metallopeptidases such as the M1, M2, M3, M4, M13, M32, M36 peptidases (MEROPS classification), and contain HEXXH and EXXXD motifs as part of their active site. 
All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. M1 family includes aminopeptidase N (APN) and leukotriene A4 hydrolase (LTA4H). APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types. LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity such that the two activities occupy different, but overlapping sites. The peptidase M3 or neurolysin-like family, includes M3, M2 and M32 metallopeptidases. The M3 peptidases have two subfamilies: M3A, includes thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (3.4.24.16), and the mitochondrial intermediate peptidase; M3B contains oligopeptidase F. M2 peptidase angiotensin converting enzyme (ACE, EC 3.4.15.1) catalyzes the conversion of decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II. ACE is a key part of the renin-angiotensin system that regulates blood pressure, thus ACE inhibitors are important for the treatment of hypertension. M32 family includes two eukaryotic enzymes from protozoa Trypanosoma cruzi, a causative agent of Chagas' disease, and Leishmania major, a parasite that causes leishmaniasis, making them attractive targets for drug development. The M4 family includes secreted protease thermolysin (EC 3.4.24.27), pseudolysin, aureolysin, neutral protease as well as fungalysin and bacillolysin (EC 3.4.24.28) that degrade extracellular proteins and peptides for bacterial nutrition, especially prior to sporulation. Thermolysin is widely used as a nonspecific protease to obtain fragments for peptide sequencing as well as in production of the artificial sweetener aspartame. 
M13 family includes neprilysin (EC 3.4.24.11) and endothelin-converting enzyme I (ECE-1, EC 3.4.24.71), which fulfill a broad range of physiological roles due to the greater variation in the S2' subsite allowing substrate specificity and are prime therapeutic targets for selective inhibition. Peptidase M36 (fungalysin) family includes endopeptidases from pathogenic fungi. Fungalysin hydrolyzes extracellular matrix proteins such as elastin and keratin. Aspergillus fumigatus causes the pulmonary disease aspergillosis by invading the lungs of immuno-compromised animals and secreting fungalysin that possibly breaks down proteinaceous structural barriers." Q#582 - CGI_10001053 superfamily 243092 25 131 3.38E-13 63.8932 cl02567 WD40 superfamily NC - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#583 - CGI_10011092 superfamily 241546 1 63 0.000576447 34.9637 cl00011 PLAT superfamily C - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats.The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#584 - CGI_10011093 superfamily 246751 214 435 7.94E-89 274.891 cl14883 Lipase superfamily - - "Lipase. Lipases are esterases that can hydrolyze long-chain acyl-triglycerides into di- and monoglycerides, glycerol, and free fatty acids at a water/lipid interface. A typical feature of lipases is "interfacial activation", the process of becoming active at the lipid/water interface, although several examples of lipases have been identified that do not undergo interfacial activation . The active site of a lipase contains a catalytic triad consisting of Ser - His - Asp/Glu, but unlike most serine proteases, the active site is buried inside the structure. A "lid" or "flap" covers the active site, making it inaccessible to solvent and substrates. The lid opens during the process of interfacial activation, allowing the lipid substrate access to the active site." Q#584 - CGI_10011093 superfamily 246751 56 199 2.62E-42 152.012 cl14883 Lipase superfamily C - "Lipase. Lipases are esterases that can hydrolyze long-chain acyl-triglycerides into di- and monoglycerides, glycerol, and free fatty acids at a water/lipid interface. A typical feature of lipases is "interfacial activation", the process of becoming active at the lipid/water interface, although several examples of lipases have been identified that do not undergo interfacial activation . 
The active site of a lipase contains a catalytic triad consisting of Ser - His - Asp/Glu, but unlike most serine proteases, the active site is buried inside the structure. A "lid" or "flap" covers the active site, making it inaccessible to solvent and substrates. The lid opens during the process of interfacial activation, allowing the lipid substrate access to the active site." Q#585 - CGI_10011094 superfamily 246751 47 332 2.36E-96 293.766 cl14883 Lipase superfamily - - "Lipase. Lipases are esterases that can hydrolyze long-chain acyl-triglycerides into di- and monoglycerides, glycerol, and free fatty acids at a water/lipid interface. A typical feature of lipases is "interfacial activation", the process of becoming active at the lipid/water interface, although several examples of lipases have been identified that do not undergo interfacial activation . The active site of a lipase contains a catalytic triad consisting of Ser - His - Asp/Glu, but unlike most serine proteases, the active site is buried inside the structure. A "lid" or "flap" covers the active site, making it inaccessible to solvent and substrates. The lid opens during the process of interfacial activation, allowing the lipid substrate access to the active site." Q#585 - CGI_10011094 superfamily 241546 342 435 5.69E-05 41.5121 cl00011 PLAT superfamily C - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats.The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." 
Q#587 - CGI_10011096 superfamily 245201 222 514 0 583.404 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#587 - CGI_10011096 superfamily 241620 61 106 7.87E-20 83.4735 cl00113 CRIB superfamily - - "PAK (p21 activated kinase) Binding Domain (PBD), binds Cdc42p- and/or Rho-like small GTPases; also known as the Cdc42/Rac interactive binding (CRIB) motif; has been shown to inhibit transcriptional activation and cell transformation mediated by the Ras-Rac pathway. CRIB-containing effector proteins are functionally diverse and include serine/threonine kinases, tyrosine kinases, actin-binding proteins, and adapter molecules." Q#588 - CGI_10011097 superfamily 217740 65 245 1.77E-23 95.1209 cl18427 Scramblase superfamily - - Scramblase; Scramblase is palmitoylated and contains a potential protein kinase C phosphorylation site. Scramblase exhibits Ca2+-activated phospholipid scrambling activity in vitro. There are also possible SH3 and WW binding motifs. Scramblase is involved in the redistribution of phospholipids after cell activation or injury. Q#589 - CGI_10011098 superfamily 217311 69 504 2.19E-119 365.891 cl18402 DUF229 superfamily - - Protein of unknown function (DUF229); Members of this family are uncharacterized. They are 500-1200 amino acids in length and share a long region conservation that probably corresponds to several domains. The Go annotation for the protein indicates that it is involved in nematode larval development and has a positive regulation on growth rate. 
Q#590 - CGI_10011099 superfamily 248338 15 275 9.38E-07 49.1369 cl17784 Peptidase_C48 superfamily N - "Ulp1 protease family, C-terminal catalytic domain; This domain contains the catalytic triad Cys-His-Asn." Q#590 - CGI_10011099 superfamily 247999 291 330 8.22E-06 42.5842 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#592 - CGI_10011101 superfamily 215502 84 301 9.62E-16 76.2292 cl18335 PLN02929 superfamily C - NADH kinase Q#594 - CGI_10011103 superfamily 241563 157 194 1.37E-05 43.0447 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#594 - CGI_10011103 superfamily 128778 201 318 0.00133317 38.0147 cl17972 BBC superfamily - - B-Box C-terminal domain; Coiled coil region C-terminal to (some) B-Box domains Q#595 - CGI_10011104 superfamily 247792 72 114 0.00070665 38.5808 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#595 - CGI_10011104 superfamily 241563 217 245 6.91E-05 41.696 cl00034 BBOX superfamily N - "B-Box-type zinc finger; zinc binding domain 
(CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#596 - CGI_10011105 superfamily 241767 64 170 2.80E-42 144.332 cl00304 TP_methylase superfamily N - "S-AdoMet dependent tetrapyrrole methylases; This family uses S-AdoMet (S-adenosyl-L-methionine or SAM) in the methylation of diverse substrates. Most members catalyze various methylation steps in cobalamin (vitamin B12) biosynthesis. There are two distinct cobalamin biosynthetic pathways in bacteria. The aerobic pathway requires oxygen, and cobalt is inserted late in the pathway; the anaerobic pathway does not require oxygen, and cobalt insertion is the first committed step towards cobalamin synthesis. The enzymes involved in the aerobic pathway are prefixed Cob and those of the anaerobic pathway Cbi. Most of the enzymes are shared by both pathways and a few enzymes are pathway-specific. Diphthine synthase and Ribosomal RNA small subunit methyltransferase I (RsmI) are two superfamily members that are not involved in cobalamin biosynthesis. Diphthine synthase participates in the posttranslational modification of a specific histidine residue in elongation factor 2 (EF-2) of eukaryotes and archaea to diphthamide. RsmI catalyzes the 2-O-methylation of the ribose of cytidine 1402 (C1402) in 16S rRNA using S-adenosylmethionine (Ado-Met) as the methyl donor." Q#597 - CGI_10011106 superfamily 241594 710 1067 9.89E-138 421.588 cl00077 HECTc superfamily - - "HECT domain; C-terminal catalytic domain of a subclass of Ubiquitin-protein ligase (E3). It binds specific ubiquitin-conjugating enzymes (E2), accepts ubiquitin from E2, transfers ubiquitin to substrate lysine side chains, and transfers additional ubiquitin molecules to the end of growing ubiquitin chains." 
Q#598 - CGI_10011108 superfamily 215821 13 95 5.99E-35 116.571 cl18346 FKBP_C superfamily - - FKBP-type peptidyl-prolyl cis-trans isomerase; FKBP-type peptidyl-prolyl cis-trans isomerase. Q#599 - CGI_10011109 superfamily 215821 228 304 9.87E-36 126.586 cl18346 FKBP_C superfamily - - FKBP-type peptidyl-prolyl cis-trans isomerase; FKBP-type peptidyl-prolyl cis-trans isomerase. Q#599 - CGI_10011109 superfamily 215821 30 91 3.29E-11 59.1763 cl18346 FKBP_C superfamily - - FKBP-type peptidyl-prolyl cis-trans isomerase; FKBP-type peptidyl-prolyl cis-trans isomerase. Q#600 - CGI_10011110 superfamily 215821 32 106 1.71E-17 72.273 cl18346 FKBP_C superfamily - - FKBP-type peptidyl-prolyl cis-trans isomerase; FKBP-type peptidyl-prolyl cis-trans isomerase. Q#601 - CGI_10011111 superfamily 215821 84 161 4.38E-32 111.949 cl18346 FKBP_C superfamily - - FKBP-type peptidyl-prolyl cis-trans isomerase; FKBP-type peptidyl-prolyl cis-trans isomerase. Q#602 - CGI_10011112 superfamily 241565 1037 1102 4.96E-06 45.7755 cl00038 BRCT superfamily - - "Breast Cancer Suppressor Protein (BRCA1), carboxy-terminal domain. The BRCT domain is found within many DNA damage repair and cell cycle checkpoint proteins. The unique diversity of this domain superfamily allows BRCT modules to interact forming homo/hetero BRCT multimers, BRCT-non-BRCT interactions, and interactions within DNA strand breaks." Q#602 - CGI_10011112 superfamily 241565 1139 1212 0.006362 36.5846 cl00038 BRCT superfamily - - "Breast Cancer Suppressor Protein (BRCA1), carboxy-terminal domain. The BRCT domain is found within many DNA damage repair and cell cycle checkpoint proteins. The unique diversity of this domain superfamily allows BRCT modules to interact forming homo/hetero BRCT multimers, BRCT-non-BRCT interactions, and interactions within DNA strand breaks." 
Q#605 - CGI_10001286 superfamily 243362 354 395 0.00022821 40.4863 cl03262 DnaJ_C superfamily N - C-terminal substrate binding domain of DnaJ and HSP40; The C-terminal region of the DnaJ/Hsp40 protein mediates oligomerization and binding to denatured polypeptide substrate. DnaJ/Hsp40 is a widely conserved heat-shock protein. It prevents the aggregation of unfolded substrate and forms a ternary complex with both substrate and DnaK/Hsp70; the N-terminal J-domain of DnaJ/Hsp40 stimulates the ATPase activity of DnaK/Hsp70. Q#605 - CGI_10001286 superfamily 241563 60 96 0.000739732 37.844 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#605 - CGI_10001286 superfamily 128778 121 216 0.00142486 37.6295 cl17972 BBC superfamily - - B-Box C-terminal domain; Coiled coil region C-terminal to (some) B-Box domains Q#608 - CGI_10005120 superfamily 110440 91 117 0.000977636 34.3057 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase); proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." 
Q#609 - CGI_10005121 superfamily 247755 1026 1246 2.75E-127 393.012 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#609 - CGI_10005121 superfamily 247755 458 681 3.13E-104 329.815 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#609 - CGI_10005121 superfamily 216049 168 415 1.11E-10 62.3034 cl18356 ABC_membrane superfamily - - ABC transporter transmembrane region; This family represents a unit of six transmembrane helices. Many members of the ABC transporter family (pfam00005) have two such regions. Q#609 - CGI_10005121 superfamily 216049 778 970 4.93E-05 45.3546 cl18356 ABC_membrane superfamily - - ABC transporter transmembrane region; This family represents a unit of six transmembrane helices. Many members of the ABC transporter family (pfam00005) have two such regions. 
Q#613 - CGI_10001940 superfamily 247792 10 69 4.66E-05 45.4616 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#614 - CGI_10001839 superfamily 220253 53 129 1.33E-07 48.8287 cl09706 Cobl superfamily C - "Cordon-bleu domain; The Cordon-bleu protein domain is highly conserved among vertebrates. The sequence contains three repeated lysine, arginine, and proline-rich regions, the KKRAP motif. The exact function of the protein is unknown but it is thought to be involved in mid-brain neural tube closure. It is expressed specifically in the node." Q#616 - CGI_10011442 superfamily 243072 332 454 5.75E-23 94.3726 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#616 - CGI_10011442 superfamily 243072 404 519 9.44E-20 85.513 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). 
ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#616 - CGI_10011442 superfamily 243072 311 335 0.000154551 39.4596 cl02529 ANK superfamily N - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#618 - CGI_10011444 superfamily 245847 21 161 9.22E-16 69.8929 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#624 - CGI_10011450 superfamily 248097 76 194 1.92E-22 88.4762 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#626 - CGI_10011453 superfamily 217473 11 33 0.00692743 33.8778 cl03978 Mab-21 superfamily C - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#628 - CGI_10002181 superfamily 243110 135 356 8.98E-19 85.5589 cl02616 MACPF superfamily - - "MAC/Perforin domain; The membrane-attack complex (MAC) of the complement system forms transmembrane channels. These channels disrupt the phospholipid bilayer of target cells, leading to cell lysis and death. A number of proteins participate in the assembly of the MAC. Freshly activated C5b binds to C6 to form a C5b-6 complex, then to C7 forming the C5b-7 complex. 
The C5b-7 complex binds to C8, which is composed of three chains (alpha, beta, and gamma), thus forming the C5b-8 complex. C5b-8 subsequently binds to C9 and acts as a catalyst in the polymerisation of C9. Active MAC has a subunit composition of C5b-C6-C7-C8-C9{n}. Perforin is a protein found in cytolytic T-cells and killer cells. In the presence of calcium, perforin polymerises into transmembrane tubules and is capable of lysing, non-specifically, a variety of target cells. There are a number of regions of similarity in the sequences of complement components C6, C7, C8-alpha, C8-beta, C9 and perforin. The X-ray crystal structure of a MACPF domain reveals that it shares a common fold with bacterial cholesterol dependent cytolysins (pfam01289) such as perfringolysin O. Three key pieces of evidence suggest that MACPF domains and CDCs are homologous: Functional similarity (pore formation), conservation of three glycine residues at a hinge in both families and conservation of a complex core fold." Q#630 - CGI_10001384 superfamily 204434 14 39 7.86E-10 54.5081 cl10963 zf-CCHH superfamily - - "Zinc-finger (CX5CX6HX5H) motif; This domain is a zinc-finger motif that in humans is part of the APLF, aprataxin- and PNK-like forkhead association domain-containing protein. The ZnF is highly conserved both in primary sequence and in the spacing between the putative zinc coordinating residues and is configured CX5CX6HX5H. Many of the proteins containing the APLF-like ZnF are involved in DNA strand break repair and/or contain domains implicated in DNA metabolism." Q#630 - CGI_10001384 superfamily 241752 245 394 3.14E-09 55.4143 cl00283 ADP_ribosyl superfamily - - "ADP_ribosylating enzymes catalyze the transfer of ADP_ribose from NAD+ to substrates. Bacterial toxins are cytoplasmic and catalyze the transfer of a single ADP_ribose unit to eukaryotic elongation factor 2, halting protein synthesis and killing the cell. 
Poly(ADP-ribose) polymerases (PARPS 1-3, VPARP, tankyrase) catalyze the addition of up to 100 ADP_ribose units from NAD+. PARPs 1 and 2 are localized in the nucleus, bind DNA, and are activated by DNA damage. VPARP is part of the vault ribonucleoprotein complex. Tankyrases regulate telomere length in part through poly(ADP_ribosylation) of telomere repeat binding factor 1 (TRF1). Poly(ADP-ribose) polymerase catalyses the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active." Q#630 - CGI_10001384 superfamily 204434 71 94 1.52E-07 47.9597 cl10963 zf-CCHH superfamily - - "Zinc-finger (CX5CX6HX5H) motif; This domain is a zinc-finger motif that in humans is part of the APLF, aprataxin- and PNK-like forkhead association domain-containing protein. The ZnF is highly conserved both in primary sequence and in the spacing between the putative zinc coordinating residues and is configured CX5CX6HX5H. Many of the proteins containing the APLF-like ZnF are involved in DNA strand break repair and/or contain domains implicated in DNA metabolism." Q#632 - CGI_10012121 superfamily 241743 255 344 1.80E-14 69.9094 cl00274 ML superfamily C - "The ML (MD-2-related lipid-recognition) domain is present in MD-1, MD-2, GM2 activator protein, Niemann-Pick type C2 (Npc2) protein, phosphatidylinositol/phosphatidylglycerol transfer protein (PG/PI-TP), mite allergen Der p 2 and several proteins of unknown function in plants, animals and fungi. 
These single-domain proteins form two anti-parallel beta-pleated sheets stabilized by three disulfide bonds and with an accessible central hydrophobic cavity, and are predicted to mediate diverse biological functions through interaction with specific lipids." Q#633 - CGI_10012122 superfamily 241743 30 161 9.05E-24 92.251 cl00274 ML superfamily - - "The ML (MD-2-related lipid-recognition) domain is present in MD-1, MD-2, GM2 activator protein, Niemann-Pick type C2 (Npc2) protein, phosphatidylinositol/phosphatidylglycerol transfer protein (PG/PI-TP), mite allergen Der p 2 and several proteins of unknown function in plants, animals and fungi. These single-domain proteins form two anti-parallel beta-pleated sheets stabilized by three disulfide bonds and with an accessible central hydrophobic cavity, and are predicted to mediate diverse biological functions through interaction with specific lipids." Q#634 - CGI_10012123 superfamily 241743 83 138 3.81E-09 50.6494 cl00274 ML superfamily C - "The ML (MD-2-related lipid-recognition) domain is present in MD-1, MD-2, GM2 activator protein, Niemann-Pick type C2 (Npc2) protein, phosphatidylinositol/phosphatidylglycerol transfer protein (PG/PI-TP), mite allergen Der p 2 and several proteins of unknown function in plants, animals and fungi. These single-domain proteins form two anti-parallel beta-pleated sheets stabilized by three disulfide bonds and with an accessible central hydrophobic cavity, and are predicted to mediate diverse biological functions through interaction with specific lipids." Q#635 - CGI_10012124 superfamily 202367 1 137 1.75E-21 88.3656 cl18226 3HCDH_N superfamily - - "3-hydroxyacyl-CoA dehydrogenase, NAD binding domain; This family also includes lambda crystallin." Q#635 - CGI_10012124 superfamily 216084 142 214 3.60E-15 68.7725 cl08285 3HCDH superfamily C - "3-hydroxyacyl-CoA dehydrogenase, C-terminal domain; This family also includes lambda crystallin. Some proteins include two copies of this domain." 
Q#636 - CGI_10012125 superfamily 245201 19 335 0 598.579 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#637 - CGI_10012126 superfamily 221377 1506 1672 9.36E-63 213.484 cl13449 DUF3504 superfamily - - Domain of unknown function (DUF3504); This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 156 to 173 amino acids in length. Q#637 - CGI_10012126 superfamily 242184 826 864 0.00137507 38.512 cl00909 Ribosomal_L24e_L24 superfamily - - "Ribosomal protein L24e/L24 is a ribosomal protein found in eukaryotes (L24) and in archaea (L24e, distinct from archaeal L24). L24e/L24 is located on the surface of the large subunit, adjacent to proteins L14 and L3, and near the translation factor binding site. L24e/L24 appears to play a role in the kinetics of peptide synthesis, and may be involved in interactions between the large and small subunits, either directly or through other factors. In mouse, a deletion mutation in L24 has been identified as the cause for the belly spot and tail (Bst) mutation that results in disrupted pigmentation, somitogenesis and retinal cell fate determination. L24 may be an important protein in eukaryotic reproduction: in shrimp, L24 expression is elevated in the ovary, suggesting a role in oogenesis, and in Arabidopsis, L24 has been proposed to have a specific function in gynoecium development. 
No protein with sequence or structural homology to L24e/L24 has been identified in bacteria, but a functionally equivalent protein may exist. Bacterial L19 forms an interprotein beta sheet with L14 that is similar to the L24e/L14 interprotein beta sheet observed in the archaeal L24e structures. Some eukaryotic L24 proteins were initially identified as L30, and this alignment model contains several sequences called L30." Q#638 - CGI_10012127 superfamily 244551 286 375 1.15E-34 125.819 cl06904 eNOPS_SF superfamily - - "NOPS domain, including C-terminal helical extension region, in the p54nrb/PSF/PSP1 family; All members in this family contain a DBHS domain (for Drosophila behavior, human splicing), which comprises two conserved RNA recognition motifs (RRM1 and RRM2), also termed RBDs (RNA binding domains) or RNPs (ribonucleoprotein domains), and a charged protein-protein interaction NOPS (NONA and PSP1) domain with a long helical C-terminal extension. The NOPS domain specifically binds to RRM2 domain of the partner DBHS protein via a substantial interaction surface. Its highly conserved C-terminal residues are critical for functional DBHS dimerization while the highly conserved C-terminal helical extension, forming a right-handed antiparallel heterodimeric coiled-coil, is essential for localization of these proteins to subnuclear bodies. PSF has an additional large N-terminal domain that differentiates it from other family members. The p54nrb/PSF/PSP1 family includes 54 kDa nuclear RNA- and DNA-binding protein (p54nrb), polypyrimidine tract-binding protein (PTB)-associated-splicing factor (PSF) and paraspeckle protein 1 (PSP1), which are ubiquitously expressed and are well conserved in vertebrates. 
p54nrb, also termed NONO or NMT55, is a multi-functional protein involved in numerous nuclear processes including transcriptional regulation, splicing, DNA unwinding, nuclear retention of hyperedited double-stranded RNA, viral RNA processing, control of cell proliferation, and circadian rhythm maintenance. PSF, also termed POMp100, is also a multi-functional protein that binds RNA, single-stranded DNA (ssDNA), double-stranded DNA (dsDNA) and many factors, and mediates diverse activities in the cell. PSP1, also termed PSPC1, is a novel nucleolar factor that accumulates within a new nucleoplasmic compartment, termed paraspeckles, and diffusely distributes in the nucleoplasm. The cellular function of PSP1 currently remains unknown. The family also includes some p54nrb/PSF/PSP1 homologs from invertebrate species. For instance, the Drosophila melanogaster gene no-on-transient A (nonA) encodes puff-specific protein Bj6 (also termed NONA), and the Chironomus tentans hrp65 gene encodes protein Hrp65. D. melanogaster NONA is involved in eye development and behavior and may play a role in circadian rhythm maintenance, similar to vertebrate p54nrb. C. tentans Hrp65 is a component of nuclear fibers associated with ribonucleoprotein particles in transit from the gene to the nuclear pore." Q#638 - CGI_10012127 superfamily 247723 140 210 4.63E-29 110.059 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight.
The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#638 - CGI_10012127 superfamily 247723 216 295 2.42E-28 108.162 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#640 - CGI_10012129 superfamily 248458 231 382 6.97E-25 108.94 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates.
Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#640 - CGI_10012129 superfamily 247743 772 848 0.00231575 39.8756 cl17189 AAA superfamily C - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperones, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases."
Q#641 - CGI_10012130 superfamily 247727 70 185 4.53E-14 67.455 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#643 - CGI_10012132 superfamily 243035 283 382 3.26E-06 45.6886 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration.
CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#643 - CGI_10012132 superfamily 243035 5 111 4.67E-06 45.3034 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. 
Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer.
A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#643 - CGI_10012132 superfamily 243035 433 553 0.00452061 36.0586 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors.
C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#644 - CGI_10012133 superfamily 243035 5 92 3.87E-08 47.7686 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model."
Q#646 - CGI_10012135 superfamily 243035 198 317 3.39E-11 59.9409 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations.
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#647 - CGI_10002104 superfamily 243092 1 163 7.50E-07 47.3296 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#651 - CGI_10004233 superfamily 247725 401 528 1.22E-72 231.031 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting proteins to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and other proteins." Q#651 - CGI_10004233 superfamily 241631 5 180 2.08E-63 208.616 cl00136 Sec7 superfamily - - Sec7 domain; Domain named after the S. cerevisiae SEC7 gene product. The Sec7 domain is the central domain of the guanine-nucleotide-exchange factors (GEFs) of the ADP-ribosylation factor family of small GTPases (ARFs). It carries the exchange factor activity. Q#655 - CGI_10011430 superfamily 245201 9 252 4.22E-144 425.16 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#656 - CGI_10011431 superfamily 247740 241 493 1.19E-129 384.562 cl17186 TIM_phosphate_binding superfamily N - "TIM barrel proteins share a structurally conserved phosphate binding motif and in general share an eight beta/alpha closed barrel structure. Specific for this family is the conserved phosphate binding site at the edges of strands 7 and 8.
The phosphate comes either from the substrate, as in the case of inosine monophosphate dehydrogenase (IMPDH), or from ribulose-5-phosphate 3-epimerase (RPE) or from cofactors, like FMN." Q#656 - CGI_10011431 superfamily 246936 115 230 1.67E-52 175.854 cl15354 CBS_pair superfamily - - "The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase)." Q#656 - CGI_10011431 superfamily 247740 29 109 1.26E-34 132.641 cl17186 TIM_phosphate_binding superfamily C - "TIM barrel proteins share a structurally conserved phosphate binding motif and in general share an eight beta/alpha closed barrel structure. Specific for this family is the conserved phosphate binding site at the edges of strands 7 and 8. 
The phosphate comes either from the substrate, as in the case of inosine monophosphate dehydrogenase (IMPDH), or from ribulose-5-phosphate 3-epimerase (RPE) or from cofactors, like FMN." Q#659 - CGI_10011434 superfamily 243072 1 40 9.10E-06 44.6819 cl02529 ANK superfamily C - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#659 - CGI_10011434 superfamily 241832 662 722 1.24E-05 44.2214 cl00388 Thioredoxin_like superfamily N - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#661 - CGI_10011436 superfamily 220651 23 203 1.03E-42 144.593 cl10932 Mlf1IP superfamily - - "Myelodysplasia-myeloid leukemia factor 1-interacting protein; This entry is the conserved central region of a group of proteins that are putative transcriptional repressors. 
The structure contains a putative 14-3-3 binding motif involved in the subcellular localisation of various regulatory molecules, and it may be that interaction with the transcription factor DREF could be regulated through this motif. DREF regulates proliferation-related genes in Drosophila. Mlf1IP is expressed in both the nuclei and the cytoplasm and thus may have multi-functions." Q#662 - CGI_10011437 superfamily 217473 794 960 3.22E-21 95.5097 cl03978 Mab-21 superfamily N - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#665 - CGI_10011440 superfamily 242849 40 113 1.59E-27 99.5856 cl02041 Cyt-b5 superfamily - - Cytochrome b5-like Heme/Steroid binding domain; This family includes heme binding domains from a diverse range of proteins. This family also includes proteins that bind to steroids. The family includes progesterone receptors. Many members of this subfamily are membrane anchored by an N-terminal transmembrane alpha helix. This family also includes a domain in some chitin synthases. There is no known ligand for this domain in the chitin synthases. Q#666 - CGI_10011441 superfamily 247999 533 580 1.59E-10 57.5004 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#666 - CGI_10011441 superfamily 247999 421 456 2.58E-07 48.2556 cl17445 PHD superfamily C - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. 
Q#666 - CGI_10011441 superfamily 247999 363 421 7.82E-05 40.9368 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#666 - CGI_10011441 superfamily 247999 475 533 7.82E-05 40.9368 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#668 - CGI_10008204 superfamily 241610 2707 2759 1.53E-20 89.2314 cl00101 KU superfamily - - BPTI/Kunitz family of serine protease inhibitors; Structure is a disulfide rich alpha+beta fold. BPTI (bovine pancreatic trypsin inhibitor) is an extensively studied model structure. Q#668 - CGI_10008204 superfamily 246671 2058 2180 3.75E-11 63.596 cl14606 Reeler_cohesin_like superfamily - - "Domains similar to the eukaryotic reeler domain and bacterial cohesins; This diverse family summarizes a set of distantly related domains, as revealed by structural similarity." 
Q#668 - CGI_10008204 superfamily 243034 1647 1738 0.000140192 43.1376 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#668 - CGI_10008204 superfamily 219042 2308 2498 4.32E-71 239.967 cl05795 Spond_N superfamily - - Spondin_N; This conserved region is found at the in the N-terminal half of several Spondin proteins. Spondins are involved in patterning axonal growth trajectory through either inhibiting or promoting adhesion of embryonic nerve cells. Q#668 - CGI_10008204 superfamily 246918 2849 2900 2.86E-12 65.3007 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. 
Q#668 - CGI_10008204 superfamily 246918 2566 2612 2.68E-08 53.3595 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#668 - CGI_10008204 superfamily 246918 2647 2693 7.78E-06 46.4259 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#668 - CGI_10008204 superfamily 246918 2905 2956 0.000373574 41.4183 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#669 - CGI_10008205 superfamily 246669 271 407 1.19E-79 244.315 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#669 - CGI_10008205 superfamily 246669 140 264 3.01E-30 113.129 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. 
Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#671 - CGI_10008207 superfamily 221176 79 401 9.89E-32 128.977 cl13202 Npa1 superfamily - - "Ribosome 60S biogenesis N-terminal; Npa1p is required for ribosome biogenesis and operates in the same functional environment as Rsa3p and Dbp6p during early maturation of 60S ribosomal subunits. The protein partners of Npa1p include eight putative helicases as well as the novel Npa2p factor. Npa1p can also associate with a subset of H/ACA and C/D small nucleolar RNPs (snoRNPs) involved in the chemical modification of residues in the vicinity of the peptidyl transferase centre. The protein has also been referred to as Urb1, and this domain at the N-terminal is one of several conserved regions along the length." Q#672 - CGI_10008208 superfamily 227778 45 161 9.43E-12 59.8219 cl17122 VPS24 superfamily C - Conserved protein implicated in secretion [Cell motility and secretion] Q#673 - CGI_10008209 superfamily 241563 13 44 1.55E-06 45.3559 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#673 - CGI_10008209 superfamily 192987 81 187 0.00561351 35.6259 cl13724 TMF_TATA_bd superfamily N - "TATA element modulatory factor 1 TATA binding; This is the C-terminal conserved coiled coil region of a family of TATA element modulatory factor 1 proteins conserved in eukaryotes. The proteins bind to the TATA element of some RNA polymerase II promoters and repress their activity by competing with the binding of TATA binding protein. TMF1_TATA_bd is the most conserved part of the TMFs. TMFs are evolutionarily conserved golgins that bind Rab6, a ubiquitous ras-like GTP-binding Golgi protein, and contribute to Golgi organisation in animal and plant cells. The Rab6-binding domain appears to be the same region as this C-terminal family." Q#674 - CGI_10008210 superfamily 245610 6 287 2.64E-94 299 cl11424 nitrilase superfamily - - "Nitrilase superfamily, including nitrile- or amide-hydrolyzing enzymes and amide-condensing enzymes; This superfamily (also known as the C-N hydrolase superfamily) contains hydrolases that break carbon-nitrogen bonds; it includes nitrilases, cyanide dihydratases, aliphatic amidases, N-terminal amidases, beta-ureidopropionases, biotinidases, pantetheinase, N-carbamyl-D-amino acid amidohydrolases, the glutaminase domain of glutamine-dependent NAD+ synthetase, apolipoprotein N-acyltransferases, and N-carbamoylputrescine amidohydrolases, among others. These enzymes depend on a Glu-Lys-Cys catalytic triad, and work through a thiol acylenzyme intermediate. Members of this superfamily generally form homomeric complexes, the basic building block of which is a homodimer. These oligomers include dimers, tetramers, hexamers, octamers, tetradecamers, octadecamers, as well as variable length helical arrangements and homo-oligomeric spirals. These proteins have roles in vitamin and co-enzyme metabolism, in detoxifying small molecules, in the synthesis of signaling molecules, and in the post-translational modification of proteins. 
They are used industrially, as biocatalysts in the fine chemical and pharmaceutical industry, in cyanide remediation, and in the treatment of toxic effluent. This superfamily has been classified previously in the literature, based on global and structure-based sequence analysis, into thirteen different enzyme classes (referred to as 1-13). This hierarchy includes those thirteen classes and a few additional subfamilies. A putative distant relative, the plasmid-borne TraB family, has not been included in the hierarchy." Q#674 - CGI_10008210 superfamily 241758 326 649 5.86E-83 268.262 cl00292 AANH_like superfamily - - "Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases, ATP sulphurylases, Universal Stress Response protein and electron transfer flavoprotein (ETF). The domain forms an alpha/beta/alpha fold which binds to Adenosine nucleotide." Q#675 - CGI_10008211 superfamily 243082 826 1045 3.68E-58 202.231 cl02553 Peptidase_C19 superfamily N - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." Q#675 - CGI_10008211 superfamily 243082 313 431 4.22E-26 109.784 cl02553 Peptidase_C19 superfamily C - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. 
They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." Q#675 - CGI_10008211 superfamily 245220 116 173 4.57E-15 72.0522 cl09957 zf-UBP superfamily - - Zn-finger in ubiquitin-hydrolases and other protein; Zn-finger in ubiquitin-hydrolases and other protein. Q#675 - CGI_10008211 superfamily 243082 214 247 8.28E-09 56.626 cl02553 Peptidase_C19 superfamily C - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." Q#676 - CGI_10008212 superfamily 241581 206 319 4.98E-25 97.0718 cl00062 FHA superfamily - - "Forkhead associated domain (FHA); found in eukaryotic and prokaryotic proteins. Putative nuclear signalling domain. FHA domains may bind phosphothreonine, phosphoserine and sometimes phosphotyrosine. In eukaryotes, many FHA domain-containing proteins localize to the nucleus, where they participate in establishing or maintaining cell cycle checkpoints, DNA repair, or transcriptional regulation. 
Members of the FHA family include: Dun1, Rad53, Cds1, Mek1, KAPP(kinase-associated protein phosphatase),and Ki-67 (a human nuclear protein related to cell proliferation)." Q#677 - CGI_10008213 superfamily 248458 93 270 4.50E-07 51.1605 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." 
Q#677 - CGI_10008213 superfamily 248458 414 493 0.000601914 41.1453 cl17904 MFS superfamily NC - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." 
Q#687 - CGI_10007559 superfamily 247684 73 160 2.48E-05 43.7316 cl17037 NBD_sugar-kinase_HSP70_actin superfamily C - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#687 - CGI_10007559 superfamily 202746 212 442 1.10E-80 252.216 cl08402 Hexokinase_2 superfamily - - Hexokinase; Hexokinase (EC:2.7.1.1) contains two structurally similar domains represented by this family and pfam00349. Some members of the family have two copies of each of these domains. 
Q#688 - CGI_10007560 superfamily 247896 21 505 1.66E-162 473.725 cl17342 Pyruvate_Kinase superfamily - - "Pyruvate kinase (PK): Large allosteric enzyme that regulates glycolysis through binding of the substrate, phosphoenolpyruvate, and one or more allosteric effectors. Like other allosteric enzymes, PK has a high substrate affinity R state and a low affinity T state. PK exists as several different isozymes, depending on organism and tissue type. In mammals, there are four PK isozymes: R, found in red blood cells, L, found in liver, M1, found in skeletal muscle, and M2, found in kidney, adipose tissue, and lung. PK forms a homotetramer, with each subunit containing three domains. The T state to R state transition of PK is more complex than in most allosteric enzymes, involving a concerted rotation of all 3 domains of each monomer in the homotetramer." Q#689 - CGI_10007561 superfamily 219909 3 279 1.20E-133 384.656 cl07252 Mo25 superfamily - - Mo25-like; Mo25-like proteins are involved in both polarised growth and cytokinesis. In fission yeast Mo25 is localised alternately to the spindle pole body and to the site cell division in a cell cycle dependent manner. Q#690 - CGI_10007562 superfamily 248054 13 235 2.99E-12 64.6311 cl17500 NAD_binding_8 superfamily - - NAD(P)-binding Rossmann-like domain; NAD(P)-binding Rossmann-like domain. Q#691 - CGI_10007563 superfamily 243077 348 404 1.51E-14 68.3409 cl02542 DnaJ superfamily - - "DnaJ domain or J-domain. DnaJ/Hsp40 (heat shock protein 40) proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperonine family. Hsp40 proteins are characterized by the presence of a J domain, which mediates the interaction with Hsp70. They may contain other domains as well, and the architectures provide a means of classification." 
Q#691 - CGI_10007563 superfamily 243034 17 95 8.52E-12 61.6272 cl02429 TPR superfamily N - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#691 - CGI_10007563 superfamily 243034 226 325 1.58E-09 55.0788 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR 
motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#691 - CGI_10007563 superfamily 243034 114 208 1.55E-08 51.9972 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of 
O-GlcNAc transferase; three copies of the repeat are present here" Q#692 - CGI_10016397 superfamily 241597 2829 2867 6.76E-06 47.2374 cl00082 HMG-box superfamily C - "High Mobility Group (HMG)-box is found in a variety of eukaryotic chromosomal proteins and transcription factors. HMGs bind to the minor groove of DNA and have been classified by DNA binding preferences. Two phylogenically distinct groups of Class I proteins bind DNA in a sequence specific fashion and contain a single HMG box. One group (SOX-TCF) includes transcription factors, TCF-1, -3, -4; and also SRY and LEF-1, which bind four-way DNA junctions and duplex DNA targets. The second group (MATA) includes fungal mating type gene products MC, MATA1 and Ste11. Class II and III proteins (HMGB-UBF) bind DNA in a non-sequence specific fashion and contain two or more tandem HMG boxes. Class II members include non-histone chromosomal proteins, HMG1 and HMG2, which bind to bent or distorted DNA such as four-way DNA junctions, synthetic DNA cruciforms, kinked cisplatin-modified DNA, DNA bulges, cross-overs in supercoiled DNA, and can cause looping of linear DNA. Class III members include nucleolar and mitochondrial transcription factors, UBF and mtTF1, which bind four-way DNA junctions." Q#692 - CGI_10016397 superfamily 243250 4731 4984 7.75E-78 269.518 cl02959 Glyco_hydro_9 superfamily C - Glycosyl hydrolase family 9; Glycosyl hydrolase family 9. Q#692 - CGI_10016397 superfamily 215827 336 512 7.86E-37 142.222 cl02830 Tyrosinase superfamily - - Common central domain of tyrosinase; This family also contains polyphenol oxidases and some hemocyanins. Binds two copper ions via two sets of three histidines. This family is related to pfam00372. Q#692 - CGI_10016397 superfamily 248279 1738 1817 1.75E-17 82.0291 cl17725 zf-HC5HC2H superfamily - - "PHD-like zinc-binding domain; The members of this family are annotated as containing PHD domain, but the zinc-binding region here is not typical of PHD domains. 
The conformation here is a well-conserved cysteine-histidine rich region spanning 90 residues, where the Cys and His are arranged as HxxC(31)CxxC(6)CxxCxxxxCxxxxHxxC (21)CxxH." Q#692 - CGI_10016397 superfamily 215827 824 908 1.96E-15 78.6643 cl02830 Tyrosinase superfamily N - Common central domain of tyrosinase; This family also contains polyphenol oxidases and some hemocyanins. Binds two copper ions via two sets of three histidines. This family is related to pfam00372. Q#692 - CGI_10016397 superfamily 247999 1878 1926 1.19E-12 66.7452 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#692 - CGI_10016397 superfamily 247999 2196 2244 1.37E-12 66.7452 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#692 - CGI_10016397 superfamily 247999 1829 1879 5.55E-09 56.3448 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#692 - CGI_10016397 superfamily 247999 2146 2194 6.00E-08 52.9846 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#692 - CGI_10016397 superfamily 247999 2274 2326 7.41E-06 47.1 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. 
Several PHD fingers have been identified as binding modules of methylated histone H3. Q#692 - CGI_10016397 superfamily 247999 1962 2016 0.00261936 39.5026 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#693 - CGI_10016398 superfamily 245206 157 367 1.23E-83 256.068 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#694 - CGI_10016399 superfamily 247829 23 405 8.86E-180 512.504 cl17275 PRTase_typeII superfamily - - "Phosphoribosyltransferase (PRTase) type II; This family contains two enzymes that play an important role in NAD production by either allowing quinolinic acid (QA) , quinolinate phosphoribosyl transferase (QAPRTase), or nicotinic acid (NA), nicotinate phosphoribosyltransferase (NAPRTase), to be used in the synthesis of NAD. 
QAPRTase catalyses the reaction of quinolinic acid (QA) with 5-phosphoribosyl-1-pyrophosphate (PRPP) in the presence of Mg2+ to produce nicotinic acid mononucleotide (NAMN), pyrophosphate and carbon dioxide, an important step in the de novo synthesis of NAD. NAPRTase catalyses a similar reaction leading to NAMN and pyrophosphate, using nicotinic acid and PRPP as substrates, used in the NAD salvage pathway." Q#695 - CGI_10016400 superfamily 248312 73 167 0.000636771 37.3329 cl17758 PMP22_Claudin superfamily N - PMP-22/EMP/MP20/Claudin family; PMP-22/EMP/MP20/Claudin family. Q#696 - CGI_10016401 superfamily 243051 190 318 1.16E-31 124.798 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#696 - CGI_10016401 superfamily 243051 2303 2453 6.34E-30 119.79 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. 
Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#696 - CGI_10016401 superfamily 243051 2 158 7.75E-25 105.152 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#696 - CGI_10016401 superfamily 243051 321 448 1.62E-23 101.3 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. 
MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#696 - CGI_10016401 superfamily 243051 624 790 2.13E-23 100.915 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." 
Q#696 - CGI_10016401 superfamily 243051 463 615 1.01E-19 90.1297 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#696 - CGI_10016401 superfamily 243051 2136 2294 5.79E-19 87.8185 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. 
In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#696 - CGI_10016401 superfamily 243051 2857 3003 1.18E-18 86.6629 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#696 - CGI_10016401 superfamily 243051 2470 2614 1.10E-15 78.1885 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. 
In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#696 - CGI_10016401 superfamily 241613 1297 1332 6.13E-09 55.2906 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#696 - CGI_10016401 superfamily 241613 3204 3238 1.98E-08 53.7498 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and 
maintaining the modular structure" Q#696 - CGI_10016401 superfamily 241613 2817 2851 1.49E-07 51.4386 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#696 - CGI_10016401 superfamily 241613 3010 3042 7.96E-05 43.3494 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#696 - CGI_10016401 superfamily 241613 2619 2653 0.00409589 38.3418 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the 
receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#696 - CGI_10016401 superfamily 241613 3163 3201 0.0042323 37.9566 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#696 - CGI_10016401 superfamily 243061 1916 2017 1.19E-44 160.2 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. 
Q#696 - CGI_10016401 superfamily 241640 3346 3410 1.14E-19 91.569 cl00149 Tryp_SPc superfamily C - Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues. Q#696 - CGI_10016401 superfamily 243051 2659 2814 4.86E-11 63.9089 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#696 - CGI_10016401 superfamily 243061 3063 3159 6.45E-10 59.4026 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#696 - CGI_10016401 superfamily 241640 3410 3462 9.41E-09 57.2862 cl00149 Tryp_SPc superfamily N - Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. 
Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues. Q#696 - CGI_10016401 superfamily 243061 1673 1718 1.10E-06 49.6478 cl02509 SRCR superfamily C - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#696 - CGI_10016401 superfamily 243051 2007 2131 0.000162468 43.4933 cl02479 MAM superfamily N - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#696 - CGI_10016401 superfamily 243051 1339 1485 0.00020441 43.1009 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. 
Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#696 - CGI_10016401 superfamily 243051 799 984 0.000571423 41.9798 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#697 - CGI_10016402 superfamily 243051 151 289 3.94E-14 68.9165 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. 
MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#697 - CGI_10016402 superfamily 241571 322 381 1.92E-06 45.4811 cl00049 CUB superfamily C - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#697 - CGI_10016402 superfamily 243060 24 79 0.000160123 39.6696 cl02507 SEA superfamily C - "SEA domain; Domain found in Sea urchin sperm protein, Enterokinase, Agrin (SEA). Proposed function of regulating or binding carbohydrate side chains. Recently a proteolytic activity has been shown for a SEA domain." Q#698 - CGI_10016403 superfamily 243060 240 303 1.06E-09 55.0776 cl02507 SEA superfamily C - "SEA domain; Domain found in Sea urchin sperm protein, Enterokinase, Agrin (SEA). Proposed function of regulating or binding carbohydrate side chains. Recently a proteolytic activity has been shown for a SEA domain." Q#698 - CGI_10016403 superfamily 243060 118 180 5.61E-06 43.9068 cl02507 SEA superfamily C - "SEA domain; Domain found in Sea urchin sperm protein, Enterokinase, Agrin (SEA). Proposed function of regulating or binding carbohydrate side chains. 
Recently a proteolytic activity has been shown for a SEA domain." Q#700 - CGI_10016405 superfamily 248458 121 500 5.80E-39 146.305 cl17904 MFS superfamily - - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#701 - CGI_10016406 superfamily 247856 151 213 2.58E-15 70.6545 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. 
Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#701 - CGI_10016406 superfamily 247856 377 441 1.73E-10 56.7873 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#701 - CGI_10016406 superfamily 247856 300 366 4.65E-07 47.1573 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#701 - CGI_10016406 superfamily 247856 187 242 0.000550105 37.9125 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#703 - CGI_10016408 superfamily 245206 28 299 4.51E-151 428.538 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. 
NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#706 - CGI_10016411 superfamily 241563 59 99 7.97E-06 43.4799 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#706 - CGI_10016411 superfamily 110440 484 510 0.000302119 38.9281 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." 
Q#707 - CGI_10016412 superfamily 243092 170 280 0.000430797 40.396 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#707 - CGI_10016412 superfamily 110440 357 383 0.00145818 36.2317 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." 
Q#709 - CGI_10001189 superfamily 243035 101 230 3.83E-14 67.2597 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#709 - CGI_10001189 superfamily 243035 29 77 2.63E-06 44.533 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. 
CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#711 - CGI_10004174 superfamily 245309 59 134 1.87E-05 41.7108 cl10471 LU superfamily - - "Ly-6 antigen / uPA receptor -like domain; occurs singly in GPI-linked cell-surface glycoproteins (Ly-6 family,CD59, thymocyte B cell antigen, Sgp-2) or as three-fold repeated domain in urokinase-type plasminogen activator receptor. Topology of these domains is similar to that of snake venom neurotoxins." 
Q#711 - CGI_10004174 superfamily 243035 219 323 2.97E-05 41.4514 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#712 - CGI_10004175 superfamily 217658 24 114 2.79E-25 94.8872 cl04196 UPF0041 superfamily - - Uncharacterized protein family (UPF0041); Uncharacterized protein family (UPF0041). Q#713 - CGI_10004176 superfamily 247723 53 125 2.11E-45 154.319 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." 
Q#713 - CGI_10004176 superfamily 222683 144 228 2.59E-31 115.39 cl16803 CSTF2_hinge superfamily - - "Hinge domain of cleavage stimulation factor subunit 2; The hinge domain of cleavage stimulation factor subunit 2 proteins, CSTF2, is necessary for binding to the subunit CstF-77 within the polyadenylation complex and subsequent nuclear localisation. This suggests that nuclear import of a pre-formed CSTF complex is an essential step in polyadenylation. Accurate and efficient polyadenylation is essential for transcriptional termination, nuclear export, translation, and stability of eukaryotic mRNAs. CSTF2 is an important regulatory subunit of the polyadenylation complex." Q#713 - CGI_10004176 superfamily 206472 437 479 5.23E-08 49.4189 cl16788 CSTF_C superfamily - - "Transcription termination and cleavage factor C-terminal; The C-terminal section of CSTF proteins is a discrete structure that is crucial for mRNA 3'-end processing. This domain interacts with Pcf11 and possibly PC4, thus linking CstF2 to transcription, transcriptional termination, and cell growth." Q#714 - CGI_10004177 superfamily 248097 330 453 9.21E-17 76.1498 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#715 - CGI_10004178 superfamily 247743 121 233 0.000172117 40.646 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." 
Q#716 - CGI_10004179 superfamily 243088 29 153 2.26E-69 208.466 cl02563 PX_domain superfamily - - "The Phox Homology domain, a phosphoinositide binding module; The PX domain is a phosphoinositide (PI) binding module involved in targeting proteins to membranes. Proteins containing PX domains interact with PIs and have been implicated in highly diverse functions such as cell signaling, vesicular trafficking, protein sorting, lipid modification, cell polarity and division, activation of T and B cells, and cell survival. Many members of this superfamily bind phosphatidylinositol-3-phosphate (PI3P) but in some cases, other PIs such as PI4P or PI(3,4)P2, among others, are the preferred substrates. In addition to protein-lipid interaction, the PX domain may also be involved in protein-protein interaction, as in the cases of p40phox, p47phox, and some sorting nexins (SNXs). The PX domain is conserved from yeast to humans and is found in more than 100 proteins. The majority of PX domain-containing proteins are SNXs, which play important roles in endosomal sorting." Q#717 - CGI_10004180 superfamily 219918 12 102 2.14E-23 95.8336 cl07265 DUF1767 superfamily - - Domain of unknown function (DUF1767); Eukaryotic domain of unknown function. This domain is found to the N-terminus of the nucleic acid binding domain. Q#718 - CGI_10004181 superfamily 242611 63 410 1.49E-127 374.139 cl01629 TPP_enzymes superfamily - - "Thiamine pyrophosphate (TPP) enzyme family, TPP-binding module; found in many key metabolic enzymes which use TPP (also known as thiamine diphosphate) as a cofactor. These enzymes include, among others, the E1 components of the pyruvate, the acetoin and the branched chain alpha-keto acid dehydrogenase complexes." 
Q#721 - CGI_10002999 superfamily 245008 814 880 3.00E-14 69.1392 cl09101 E_set superfamily - - "Early set domain associated with the catalytic domain of sugar utilizing enzymes at either the N or C terminus; The E or "early" set domains of sugar utilizing enzymes are associated with different types of catalytic domains at either the N-terminal or C-terminal end. These domains may be related to the immunoglobulin and/or fibronectin type III superfamilies. Members of this family include alpha amylase, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitobiase, and chitinase. A subset of these members were recently identified as members of the CBM48 (Carbohydrate Binding Module 48) family. Members of the CBM48 family include pullulanase, maltooligosyl trehalose synthase, starch branching enzyme, glycogen branching enzyme, glycogen debranching enzyme, isoamylase, and the beta subunit of AMP-activated protein kinase." Q#721 - CGI_10002999 superfamily 207794 372 794 0 560.372 cl02948 GH20_hexosaminidase superfamily - - "Beta-N-acetylhexosaminidases of glycosyl hydrolase family 20 (GH20) catalyze the removal of beta-1,4-linked N-acetyl-D-hexosamine residues from the non-reducing ends of N-acetyl-beta-D-hexosaminides including N-acetylglucosides and N-acetylgalactosides. These enzymes are broadly distributed in microorganisms, plants and animals, and play roles in various key physiological and pathological processes. These processes include cell structural integrity, energy storage, cellular signaling, fertilization, pathogen defense, viral penetration, the development of carcinomas, inflammatory events and lysosomal storage disorders. The GH20 enzymes include the eukaryotic beta-N-acetylhexosaminidases A and B, the bacterial chitobiases, dispersin B, and lacto-N-biosidase. 
The GH20 hexosaminidases are thought to act via a catalytic mechanism in which the catalytic nucleophile is not provided by the solvent or the enzyme, but by the substrate itself." Q#721 - CGI_10002999 superfamily 243574 65 223 5.22E-15 73.7005 cl03918 CHB_HEX superfamily - - Putative carbohydrate binding domain; This domain represents the N terminal domain in chitobiases and beta-hexosaminidases EC:3.2.1.52. It is composed of a beta sandwich structure that is similar in structure to the cellulose binding domain of cellulase from Cellulomonas fimi. This suggests that this may be a carbohydrate binding domain. Q#722 - CGI_10003000 superfamily 248097 606 720 4.60E-26 104.269 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#723 - CGI_10003001 superfamily 241574 170 316 1.67E-33 126.546 cl00053 PTPc superfamily N - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#723 - CGI_10003001 superfamily 241574 366 511 2.76E-14 71.4629 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. 
Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#724 - CGI_10004794 superfamily 247905 4 117 1.26E-25 104.242 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#724 - CGI_10004794 superfamily 248281 605 667 0.000542173 39.5611 cl17727 GT1 superfamily C - "GT1, myb-like, SANT family; GT-1, a myb-like protein, is one of the GT trihelix transcription factors. GT-1 binds the GT cis-element of rbcS-3A, a light-induced gene, as a dimer. Arabidopsis GT-1 is a trans-activator and acts in the stabilization of components of the transcription pre-initiation complex comprised of TFIIA-TBP-TATA. The isolated GT-1 DNA-binding domain is sufficient to bind DNA. This region closely resembles the myb domain, but with longer helices. It has been proposed that GT-1 may respond to light signals via calcium-dependent phosphorylation to create a light-modulated molecular switch. These proteins are members of the SANT/myb group. SANT is named after 'SWI3, ADA2, N-CoR and TFIIIB', several factors that share this domain. The SANT domain resembles the 3 alpha-helix bundle of the DNA-binding Myb domains and is found in a diverse set of proteins." 
Q#726 - CGI_10005632 superfamily 241815 5 241 2.31E-40 140.272 cl00361 Transcrip_reg superfamily - - "Transcriptional regulator; This is a family of transcriptional regulators. In mammals, it activates the transcription of mitochondrially-encoded COX1. In bacteria, it negatively regulates the quorum-sensing response regulator by binding to its promoter region." Q#727 - CGI_10005633 superfamily 247856 308 367 2.60E-11 58.7133 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#727 - CGI_10005633 superfamily 247856 271 330 0.00291279 35.2161 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#732 - CGI_10005638 superfamily 246921 8 53 2.98E-10 56.6149 cl15299 FG-GAP superfamily - - "FG-GAP repeat; This family contains the extracellular repeat that is found in up to seven copies in alpha integrins. This repeat has been predicted to fold into a beta propeller structure. The repeat is called the FG-GAP repeat after two conserved motifs in the repeat. The FG-GAP repeats are found in the N terminus of integrin alpha chains, a region that has been shown to be important for ligand binding. A putative Ca2+ binding motif is found in some of the repeats." 
Q#733 - CGI_10005639 superfamily 246921 186 226 0.00499747 33.8881 cl15299 FG-GAP superfamily C - "FG-GAP repeat; This family contains the extracellular repeat that is found in up to seven copies in alpha integrins. This repeat has been predicted to fold into a beta propeller structure. The repeat is called the FG-GAP repeat after two conserved motifs in the repeat. The FG-GAP repeats are found in the N terminus of integrin alpha chains, a region that has been shown to be important for ligand binding. A putative Ca2+ binding motif is found in some of the repeats." Q#734 - CGI_10005640 superfamily 207627 132 189 0.00363857 34.9227 cl02522 Calx-beta superfamily N - Calx-beta domain; Calx-beta domain. Q#741 - CGI_10003432 superfamily 110440 378 404 5.90E-05 40.4689 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#742 - CGI_10003880 superfamily 217900 2 34 2.23E-12 63.7551 cl04403 APG9 superfamily N - "Autophagy protein Apg9; In yeast, 15 Apg proteins coordinate the formation of autophagosomes. Autophagy is a bulk degradation process induced by starvation in eukaryotic cells. Apg9 plays a direct role in the formation of the cytoplasm to vacuole targeting and autophagic vesicles, possibly serving as a marker for a specialised compartment essential for these vesicle-mediated alternative targeting pathways." 
Q#745 - CGI_10003883 superfamily 241613 301 335 1.11E-11 60.6833 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#745 - CGI_10003883 superfamily 241613 223 257 5.13E-11 58.7574 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#745 - CGI_10003883 superfamily 241613 262 296 2.65E-10 56.4462 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into 
cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#747 - CGI_10007918 superfamily 241592 100 211 2.76E-71 215.539 cl00074 H2A superfamily - - "Histone 2A; H2A is a subunit of the nucleosome. The nucleosome is an octamer containing two H2A, H2B, H3, and H4 subunits. The H2A subunit performs essential roles in maintaining structural integrity of the nucleosome, chromatin condensation, and binding of specific chromatin-associated proteins." Q#748 - CGI_10007919 superfamily 247792 536 574 1.04E-05 43.5224 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#749 - CGI_10007920 superfamily 241688 39 157 7.78E-30 112.645 cl00210 Isoprenoid_Biosyn_C1 superfamily NC - "Isoprenoid Biosynthesis enzymes, Class 1; Superfamily of trans-isoprenyl diphosphate synthases (IPPS) and class I terpene cyclases which either synthesize geranyl/farnesyl diphosphates (GPP/FPP) or longer chained 
products from isoprene precursors, isopentenyl diphosphate (IPP) and dimethylallyl diphosphate (DMAPP), or use geranyl (C10)-, farnesyl (C15)-, or geranylgeranyl (C20)-diphosphate as substrate. These enzymes produce a myriad of precursors for such end products as steroids, cholesterol, sesquiterpenes, heme, carotenoids, retinoids, and diterpenes; and are widely distributed among archaea, bacteria, and eukaryota. The enzymes in this superfamily share the same 'isoprenoid synthase fold' and include several subgroups. The head-to-tail (HT) IPPS catalyze the successive 1'-4 condensation of the 5-carbon IPP to the growing isoprene chain to form linear, all-trans, C10-, C15-, C20-, C25-, C30-, C35-, C40-, C45-, or C50-isoprenoid diphosphates. Cyclic monoterpenes, diterpenes, and sesquiterpenes, are formed from their respective linear isoprenoid diphosphates by class I terpene cyclases. The head-to-head (HH) IPPS catalyze the successive 1'-1 condensation of 2 farnesyl or 2 geranylgeranyl isoprenoid diphosphates. Cyclization of these 30- and 40-carbon linear forms are catalyzed by class II cyclases. Both the isoprenoid chain elongation reactions and the class I terpene cyclization reactions proceed via electrophilic alkylations in which a new carbon-carbon single bond is generated through interaction between a highly reactive electron-deficient allylic carbocation and an electron-rich carbon-carbon double bond. The catalytic site consists of a large central cavity formed by mostly antiparallel alpha helices with two aspartate-rich regions located on opposite walls. These residues mediate binding of prenyl phosphates via bridging Mg2+ ions, inducing proposed conformational changes that close the active site to solvent, stabilizing reactive carbocation intermediates. Generally, the enzymes in this family exhibit an all-trans reaction pathway; an exception is the cis-trans terpene cyclase, trichodiene synthase. 
Mechanistically and structurally distinct, class II terpene cyclases and cis-IPPS are not included in this CD." Q#750 - CGI_10007921 superfamily 243109 319 511 2.63E-70 224.323 cl02614 SPRY superfamily - - "SPRY domain; SPRY domains, first identified in the SP1A kinase of Dictyostelium and rabbit Ryanodine receptor (hence the name), are homologous to B30.2. SPRY domains have been identified in at least 11 protein families, covering a wide range of functions, including regulation of cytokine signaling (SOCS), RNA metabolism (DDX1 and hnRNP), immunity to retroviruses (TRIM5alpha), intracellular calcium release (ryanodine receptors or RyR) and regulatory and developmental processes (HERC1 and Ash2L). B30.2 also contains residues in the N-terminus that form a distinct PRY domain structure; i.e. B30.2 domain consists of PRY and SPRY subdomains. B30.2 domains comprise the C-terminus of three protein families: BTNs (receptor glycoproteins of immunoglobulin superfamily); several TRIM proteins (composed of RING/B-box/coiled-coil or RBCC core); Stonutoxin (secreted poisonous protein of the stonefish Synanceia horrida). While SPRY domains are evolutionarily ancient, B30.2 domains are a more recent adaptation where the SPRY/PRY combination is a possible component of immune defense. Mutations found in the SPRY-containing proteins have shown to cause Mediterranean fever and Opitz syndrome." Q#750 - CGI_10007921 superfamily 247999 54 89 0.00130508 37.1914 cl17445 PHD superfamily N - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. 
Q#751 - CGI_10007922 superfamily 248304 421 513 2.19E-05 43.2792 cl17750 CTD superfamily - - "Spt5 C-terminal nonapeptide repeat binding Spt4; The C-terminal domain of the transcription elongation factor protein Spt5 is necessary for binding to Spt4 to form the functional complex that regulates early transcription elongation by RNA polymerase II. The complex may be involved in pre-mRNA processing through its association with mRNA capping enzymes. This CTD domain carries a regular nonapeptide repeat that can be present in up to 18 copies, as in S. pombe. The repeat has a characteristic TPA motif." Q#752 - CGI_10007923 superfamily 245201 31 289 2.47E-44 163.053 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#753 - CGI_10007924 superfamily 243540 85 308 2.85E-21 89.2292 cl03831 HlyIII superfamily - - "Haemolysin-III related; Members of this family are integral membrane proteins. This family includes a protein with hemolytic activity from Bacillus cereus. It has been proposed that YOL002c encodes a Saccharomyces cerevisiae protein that plays a key role in metabolic pathways that regulate lipid and phosphate metabolism. In eukaryotes, members are seven-transmembrane pass molecules found to encode functional receptors with a broad range of apparent ligand specificities, including progestin and adipoQ receptors, and hence have been named PAQR proteins. The mammalian members include progesterone binding proteins. 
Unlike the case with GPCR receptor proteins, the evolutionary ancestry of the members of this family can be traced back to the Archaea." Q#754 - CGI_10007925 superfamily 114591 4 153 1.58E-07 48.3083 cl05445 Mt_ATP-synt_D superfamily - - "ATP synthase D chain, mitochondrial (ATP5H); This family consists of several ATP synthase D chain, mitochondrial (ATP5H) proteins. Subunit d has no extensive hydrophobic sequences, and is not apparently related to any subunit described in the simpler ATP synthases in bacteria and chloroplasts." Q#755 - CGI_10007926 superfamily 243066 26 112 1.61E-25 96.468 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#756 - CGI_10007927 superfamily 245201 17 277 9.37E-149 425.839 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. 
These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#756 - CGI_10007927 superfamily 247725 340 395 2.96E-10 56.8396 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting proteins to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and other proteins." Q#757 - CGI_10007928 superfamily 207684 638 672 2.65E-10 57.0035 cl02640 SAP superfamily - - "SAP domain; The SAP (after SAF-A/B, Acinus and PIAS) motif is a putative DNA/RNA binding domain found in diverse nuclear and cytoplasmic proteins." Q#757 - CGI_10007928 superfamily 210068 153 177 1.24E-05 43.4274 cl15286 RPEL superfamily - - RPEL repeat; The RPEL repeat is named after four conserved amino acids it contains. The function of the RPEL repeat is unknown however it might be a DNA binding repeat based on the observation that the Drosophila myocardin-related transcription factor contains a pfam02037 domain that is also implicated in DNA binding. Q#757 - CGI_10007928 superfamily 210068 197 222 0.000512995 38.805 cl15286 RPEL superfamily - - RPEL repeat; The RPEL repeat is named after four conserved amino acids it contains. The function of the RPEL repeat is unknown however it might be a DNA binding repeat based on the observation that the Drosophila myocardin-related transcription factor contains a pfam02037 domain that is also implicated in DNA binding. 
Q#759 - CGI_10007930 superfamily 218390 644 763 3.72E-30 122.039 cl04895 PARG_cat superfamily C - "Poly (ADP-ribose) glycohydrolase (PARG); Poly(ADP-ribose) glycohydrolase (PARG), is a ubiquitously expressed exo- and endoglycohydrolase which mediates oxidative and excitotoxic neuronal death." Q#760 - CGI_10007931 superfamily 242164 36 133 1.59E-28 104.726 cl00878 Ribosomal_S24e superfamily C - Ribosomal protein S24e; Ribosomal protein S24e. Q#761 - CGI_10015876 superfamily 218118 1357 1437 1.18E-13 68.7948 cl04552 CD225 superfamily - - "Interferon-induced transmembrane protein; This family includes the human leukocyte antigen CD225, which is an interferon inducible transmembrane protein, and is associated with interferon induced cell growth suppression." Q#761 - CGI_10015876 superfamily 247792 1267 1292 0.00319407 37.3724 cl17238 RING superfamily C - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#762 - CGI_10015877 superfamily 217293 62 195 6.22E-13 66.8875 cl03788 Neur_chan_LBD superfamily C - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. 
Q#762 - CGI_10015877 superfamily 202474 270 350 6.84E-08 51.8857 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#763 - CGI_10015878 superfamily 202474 1 119 1.73E-12 62.6713 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#765 - CGI_10015880 superfamily 243035 108 222 3.30E-22 88.4457 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. 
Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#766 - CGI_10015881 superfamily 247856 110 168 2.44E-05 39.4533 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#766 - CGI_10015881 superfamily 247856 39 100 0.000329702 36.3717 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#767 - CGI_10015882 superfamily 241575 461 527 8.52E-17 76.1571 cl00054 DSRM superfamily - - "Double-stranded RNA binding motif. Binding is not sequence specific but is highly specific for double stranded RNA. Found in a variety of proteins including dsRNA dependent protein kinase PKR, RNA helicases, Drosophila staufen protein, E. coli RNase III, RNases H1, and dsRNA dependent adenosine deaminases." Q#767 - CGI_10015882 superfamily 241575 356 405 2.69E-07 48.8079 cl00054 DSRM superfamily C - "Double-stranded RNA binding motif. Binding is not sequence specific but is highly specific for double stranded RNA. Found in a variety of proteins including dsRNA dependent protein kinase PKR, RNA helicases, Drosophila staufen protein, E. coli RNase III, RNases H1, and dsRNA dependent adenosine deaminases." Q#767 - CGI_10015882 superfamily 241575 137 188 3.57E-07 48.4227 cl00054 DSRM superfamily C - "Double-stranded RNA binding motif. Binding is not sequence specific but is highly specific for double stranded RNA. Found in a variety of proteins including dsRNA dependent protein kinase PKR, RNA helicases, Drosophila staufen protein, E. coli RNase III, RNases H1, and dsRNA dependent adenosine deaminases." Q#767 - CGI_10015882 superfamily 241575 279 318 2.19E-06 46.1115 cl00054 DSRM superfamily N - "Double-stranded RNA binding motif. Binding is not sequence specific but is highly specific for double stranded RNA. 
Found in a variety of proteins including dsRNA dependent protein kinase PKR, RNA helicases, Drosophila staufen protein, E. coli RNase III, RNases H1, and dsRNA dependent adenosine deaminases." Q#768 - CGI_10015883 superfamily 241568 42 92 1.67E-05 38.2128 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#769 - CGI_10015884 superfamily 245206 2 237 1.36E-81 248.462 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." 
Q#771 - CGI_10015886 superfamily 191913 32 129 2.74E-16 71.2098 cl07876 NIPSNAP superfamily - - NIPSNAP; Members of this family include many hypothetical proteins. It also includes members of the NIPSNAP family which have putative roles in vesicular transport. This domain is often found in duplicate. Q#774 - CGI_10015889 superfamily 243859 255 327 1.90E-05 41.9318 cl04722 PLAC8 superfamily - - PLAC8 family; This family includes the Placenta-specific gene 8 protein. Q#775 - CGI_10015890 superfamily 219525 495 533 0.00114672 37.3986 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#775 - CGI_10015890 superfamily 219525 344 384 0.00230787 36.6282 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#775 - CGI_10015890 superfamily 219525 415 452 0.00411758 35.8578 cl06646 GCC2_GCC3 superfamily N - GCC2 and GCC3; GCC2 and GCC3. Q#775 - CGI_10015890 superfamily 219525 455 503 0.00859334 34.7022 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#779 - CGI_10015896 superfamily 247913 161 531 6.47E-37 140.91 cl17359 PTR2 superfamily - - POT family; The POT (proton-dependent oligopeptide transport) family all appear to be proton dependent transporters. Q#779 - CGI_10015896 superfamily 247913 31 250 3.47E-05 45.3243 cl17359 PTR2 superfamily C - POT family; The POT (proton-dependent oligopeptide transport) family all appear to be proton dependent transporters. Q#781 - CGI_10015898 superfamily 243072 137 220 5.77E-16 75.1126 cl02529 ANK superfamily N - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#781 - CGI_10015898 superfamily 243047 10 126 4.78E-40 142.48 cl02464 ArfGap superfamily - - "Putative GTPase activating protein for Arf; Putative zinc fingers with GTPase activating proteins (GAPs) towards the small GTPase, Arf. The GAP of ARD1 stimulates GTPase hydrolysis for ARD1 but not ARFs." Q#781 - CGI_10015898 superfamily 209407 330 360 4.44E-08 50.2716 cl11983 GIT_SHD superfamily - - "Spa2 homology domain (SHD) of GIT; GIT proteins are signaling integrators with GTPase-activating function which may be involved in the organisation of the cytoskeletal matrix assembled at active zones (CAZ). The function of the CAZ might be to define sites of neurotransmitter release. Mutations in the Spa2 homology domain (SHD) domain of GIT1 described here interfere with the association of GIT1 with Piccolo, beta-PIX, and focal adhesion kinase." Q#781 - CGI_10015898 superfamily 209407 274 295 9.21E-06 43.338 cl11983 GIT_SHD superfamily N - "Spa2 homology domain (SHD) of GIT; GIT proteins are signaling integrators with GTPase-activating function which may be involved in the organisation of the cytoskeletal matrix assembled at active zones (CAZ). The function of the CAZ might be to define sites of neurotransmitter release. Mutations in the Spa2 homology domain (SHD) domain of GIT1 described here interfere with the association of GIT1 with Piccolo, beta-PIX, and focal adhesion kinase." Q#782 - CGI_10015899 superfamily 243072 11 101 3.94E-18 80.8906 cl02529 ANK superfamily C - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#782 - CGI_10015899 superfamily 221304 155 437 8.87E-94 291.627 cl13359 GPCR_chapero_1 superfamily - - "GPCR-chaperone; This domain, and the associated ANK family repeat pfam00023 domain, together act as a chaperone for biogenesis and folding of the DP receptor for prostaglandin D2." Q#783 - CGI_10015900 superfamily 152787 151 216 5.29E-13 61.4597 cl18053 V-SNARE_C superfamily - - Snare region anchored in the vesicle membrane C-terminus; Within the SNARE proteins interactions in the C-terminal half of the SNARE helix are critical to the driving of membrane fusion; whereas interactions in the N-terminal half of the SNARE domain are important for promoting priming or docking of the vesicle pfam05008. Q#786 - CGI_10015903 superfamily 113482 2 50 1.83E-18 74.7139 cl04700 BCL_N superfamily - - "BCL7, N-terminal conserved region; Members of the BCL family have significant sequence similarity at their N-terminus, represented in this family. The function of BCL7 proteins is unknown. They may be involved in early development. In addition, BCL7B is commonly hemizygously deleted in patients with Williams syndrome." Q#787 - CGI_10015904 superfamily 241703 5 304 8.41E-121 352.332 cl00226 nuc_hydro superfamily - - "nuc_hydro: Nucleoside hydrolases. Nucleoside hydrolases cleave the N-glycosidic bond in nucleosides generating ribose and the respective base. These enzymes vary in their substrate specificity. This group contains eukaryotic, bacterial and archaeal proteins similar to the inosine-uridine preferring nucleoside hydrolase from Crithidia fasciculata, the xanthosine-inosine-uridine-adenosine-preferring nucleoside hydrolase RihC from Salmonella enterica serovar Typhimurium, the purine-specific inosine-adenosine-guanosine-preferring nucleoside hydrolase from Trypanosoma vivax, and pyrimidine-specific uridine-cytidine preferring nucleoside hydrolases such as URH1 from Saccharomyces cerevisiae, RihA and RihB from Escherichia coli. 
Nucleoside hydrolases are of interest as a target for antiprotozoan drugs, as no nucleoside hydrolase activity or genes encoding these enzymes have been detected in humans, and parasitic protozoans lack de novo purine synthesis, relying on nucleoside hydrolase to scavenge purine and/or pyrimidines from the environment." Q#788 - CGI_10015905 superfamily 238191 20 461 4.12E-117 356.642 cl18907 Esterase_lipase superfamily - - "Esterases and lipases (includes fungal lipases, cholinesterases, etc.) These enzymes act on carboxylic esters (EC: 3.1.1.-). The catalytic apparatus involves three residues (catalytic triad): a serine, a glutamate or aspartate and a histidine. These catalytic residues are responsible for the nucleophilic attack on the carbonyl carbon atom of the ester bond. In contrast with other alpha/beta hydrolase fold family members, p-nitrobenzyl esterase and acetylcholine esterase have a Glu instead of Asp at the active site carboxylate." Q#789 - CGI_10015906 superfamily 241619 51 105 5.30E-05 37.9469 cl00112 PAN_APPLE superfamily N - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#791 - CGI_10008711 superfamily 219541 88 140 1.03E-18 78.6643 cl18516 Cu-oxidase_2 superfamily N - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. Q#792 - CGI_10008712 superfamily 241644 95 252 2.00E-23 92.6505 cl00154 UBCc superfamily - - "Ubiquitin-conjugating enzyme E2, catalytic (UBCc) domain. 
This is part of the ubiquitin-mediated protein degradation pathway in which a thiol-ester linkage forms between a conserved cysteine and the C-terminus of ubiquitin and complexes with ubiquitin protein ligase enzymes, E3. This pathway regulates many fundamental cellular processes. There are also other E2s which form thiol-ester linkages without the use of E3s as well as several UBC homologs (TSG101, Mms2, Croc-1 and similar proteins) which lack the active site cysteine essential for ubiquitination and appear to function in DNA repair pathways which were omitted from the scope of this CD." Q#793 - CGI_10008713 superfamily 190534 608 698 3.34E-28 109.42 cl18165 bZIP_Maf superfamily - - "bZIP Maf transcription factor; Maf transcription factors contain a conserved basic region leucine zipper (bZIP) domain, which mediates their dimerisation and DNA binding property. Thus, this family is probably related to pfam00170." Q#793 - CGI_10008713 superfamily 245716 182 207 0.001905 36.8385 cl11592 zf-CCCH superfamily - - Zinc finger C-x8-C-x5-C-x3-H type (and similar); Zinc finger C-x8-C-x5-C-x3-H type (and similar). Q#793 - CGI_10008713 superfamily 245716 134 155 0.00278718 36.3823 cl11592 zf-CCCH superfamily - - Zinc finger C-x8-C-x5-C-x3-H type (and similar); Zinc finger C-x8-C-x5-C-x3-H type (and similar). 
Q#794 - CGI_10008714 superfamily 241647 15 44 1.14E-13 64.8566 cl00157 WW superfamily - - Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs. Q#794 - CGI_10008714 superfamily 241647 56 86 1.50E-06 44.8262 cl00157 WW superfamily - - Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs. Q#794 - CGI_10008714 superfamily 245206 120 401 3.64E-132 384.256 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. 
NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#795 - CGI_10008715 superfamily 243082 622 890 8.49E-39 145.704 cl02553 Peptidase_C19 superfamily - - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." Q#795 - CGI_10008715 superfamily 243082 521 640 6.90E-07 50.7844 cl02553 Peptidase_C19 superfamily C - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. 
The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." Q#796 - CGI_10008716 superfamily 241739 224 413 5.49E-95 297.965 cl00268 class_II_aaRS-like_core superfamily N - "Class II tRNA amino-acyl synthetase-like catalytic core domain. Class II amino acyl-tRNA synthetases (aaRS) share a common fold and generally attach an amino acid to the 3' OH of ribose of the appropriate tRNA. PheRS is an exception in that it attaches the amino acid at the 2'-OH group, like class I aaRSs. These enzymes are usually homodimers. This domain is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. The substrate specificity of this reaction is further determined by additional domains. Interestingly, this domain is also found in asparagine synthase A (AsnA), in the accessory subunit of mitochondrial polymerase gamma and in the bacterial ATP phosphoribosyltransferase regulatory subunit HisZ." Q#796 - CGI_10008716 superfamily 241738 541 650 1.17E-47 164.653 cl00266 HGTP_anticodon superfamily - - "HGTP anticodon binding domain, as found at the C-terminus of histidyl, glycyl, threonyl and prolyl tRNA synthetases, which are classified as a group of class II aminoacyl-tRNA synthetases (aaRS). In aaRSs, the anticodon binding domain is responsible for specificity in tRNA-binding, so that the activated amino acid is transferred to a ribose 3' OH group of the appropriate tRNA only. This domain is also found in the accessory subunit of mitochondrial polymerase gamma (Pol gamma b)." Q#796 - CGI_10008716 superfamily 241805 9 57 2.26E-19 83.3046 cl00349 S15_NS1_EPRS_RNA-bind superfamily - - "S15/NS1/EPRS_RNA-binding domain. This short domain consists of a helix-turn-helix structure, which can bind to several types of RNA. 
It is found in the ribosomal protein S15, the influenza A viral nonstructural protein (NSA) and in several eukaryotic aminoacyl tRNA synthetases (aaRSs), where it occurs as a single or a repeated unit. It is involved in both protein-RNA interactions by binding tRNA and protein-protein interactions in the formation of tRNA-synthetases into multienzyme complexes. While this domain lacks significant sequence similarity between the subgroups in which it is found, they share similar electrostatic surface potentials and thus are likely to bind to RNA via the same mechanism." Q#796 - CGI_10008716 superfamily 241739 68 140 4.18E-25 105.365 cl00268 class_II_aaRS-like_core superfamily C - "Class II tRNA amino-acyl synthetase-like catalytic core domain. Class II amino acyl-tRNA synthetases (aaRS) share a common fold and generally attach an amino acid to the 3' OH of ribose of the appropriate tRNA. PheRS is an exception in that it attaches the amino acid at the 2'-OH group, like class I aaRSs. These enzymes are usually homodimers. This domain is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. The substrate specificity of this reaction is further determined by additional domains. Interestingly, this domain is also found in asparagine synthase A (AsnA), in the accessory subunit of mitochondrial polymerase gamma and in the bacterial ATP phosphoribosyltransferase regulatory subunit HisZ." Q#796 - CGI_10008716 superfamily 241738 651 722 1.87E-12 64.532 cl00266 HGTP_anticodon superfamily N - "HGTP anticodon binding domain, as found at the C-terminus of histidyl, glycyl, threonyl and prolyl tRNA synthetases, which are classified as a group of class II aminoacyl-tRNA synthetases (aaRS). In aaRSs, the anticodon binding domain is responsible for specificity in tRNA-binding, so that the activated amino acid is transferred to a ribose 3' OH group of the appropriate tRNA only. 
This domain is also found in the accessory subunit of mitochondrial polymerase gamma (Pol gamma b)." Q#797 - CGI_10008717 superfamily 241754 1 546 0 956.267 cl00286 Motor_domain superfamily N - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. Q#799 - CGI_10008719 superfamily 247684 52 474 7.41E-89 284.17 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#802 - CGI_10008722 superfamily 243352 39 315 9.63E-112 328.014 cl03224 Porin3 superfamily - - "Eukaryotic porin family that forms channels in the mitochondrial outer membrane; The porin family 3 contains two sub-families that play vital roles in the mitochondrial outer membrane, a translocase for unfolded pre-proteins (Tom40) and the voltage-dependent anion channel (VDAC) that regulates the flux of mostly anionic metabolites through the outer mitochondrial membrane." Q#803 - CGI_10002489 superfamily 243035 30 139 1.12E-05 40.681 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. 
Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#804 - CGI_10002490 superfamily 243035 50 118 1.88E-06 42.9922 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. 
Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. 
Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#805 - CGI_10005003 superfamily 246908 107 255 7.14E-13 65.2982 cl15255 SH2 superfamily - - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." Q#806 - CGI_10005004 superfamily 247723 62 141 1.74E-37 128.548 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." 
Q#807 - CGI_10005005 superfamily 241659 130 207 7.86E-28 102.213 cl00175 alpha-crystallin-Hsps_p23-like superfamily - - "alpha-crystallin domain (ACD) found in alpha-crystallin-type small heat shock proteins, and a similar domain found in p23 (a cochaperone for Hsp90) and in other p23-like proteins.; The alpha-crystallin-Hsps_p23-like superfamily includes the alpha-crystallin domain (ACD) of alpha-crystallin-type small heat shock proteins (sHsps) and a similar domain found in p23-like proteins. sHsps are small stress induced proteins with monomeric masses between 12-43 kDa, whose common feature is this ACD. sHsps are generally active as large oligomers consisting of multiple subunits, and are believed to be ATP-independent chaperones that prevent aggregation and are important in refolding in combination with other Hsps. p23 is a cochaperone of the Hsp90 chaperoning pathway. It binds Hsp90 and participates in the folding of a number of Hsp90 clients including the progesterone receptor. p23 also has a passive chaperoning activity. p23 in addition may act as the cytosolic prostaglandin E2 synthase. Included in this superfamily is the p23-like C-terminal CHORD-SGT1 (CS) domain of suppressor of G2 allele of Skp1 (Sgt1) and the p23-like domains of human butyrate-induced transcript 1 (hB-ind1), NUD (nuclear distribution) C, Melusin, and NAD(P)H cytochrome b5 (NCB5) oxidoreductase (OR)." Q#808 - CGI_10005006 superfamily 241717 152 311 7.12E-29 109.508 cl00240 RRF superfamily - - "Ribosome recycling factor (RRF). Ribosome recycling factor dissociates the posttermination complex, composed of the ribosome, deacylated tRNA, and mRNA, after termination of translation. Thus ribosomes are "recycled" and ready for another round of protein synthesis. RRF is believed to bind the ribosome at the A-site in a manner that mimics tRNA, but the specific mechanisms remain unclear. RRF is essential for bacterial growth. 
It is not necessary for cell growth in archaea or eukaryotes, but is found in mitochondria or chloroplasts of some eukaryotic species." Q#809 - CGI_10005007 superfamily 243563 69 334 2.22E-158 447.367 cl03888 PTPA superfamily - - "Phosphotyrosyl phosphatase activator (PTPA) is also known as protein phosphatase 2A (PP2A) phosphatase activator. PTPA is an essential, well conserved protein that stimulates the tyrosyl phosphatase activity of PP2A. It also reactivates the serine/threonine phosphatase activity of an inactive form of PP2A. Together, PTPA and PP2A constitute an ATPase. It has been suggested that PTPA alters the relative specificity of PP2A from phosphoserine/phosphothreonine substrates to phosphotyrosine substrates in an ATP-hydrolysis-dependent manner. Basal expression of PTPA is controlled by the transcription factor Yin Yang1 (YY1). PTPA has been suggested to play a role in the insertion of metals to the PP2A catalytic subunit (PP2Ac) active site, to act as a chaperone, and more recently, to have peptidyl prolyl cis/trans isomerase activity that specifically targets human PP2Ac." 
Q#811 - CGI_10005009 superfamily 247684 1 256 4.88E-57 194.033 cl17037 NBD_sugar-kinase_HSP70_actin superfamily N - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#814 - CGI_10002737 superfamily 243088 12 130 2.43E-30 113.274 cl02563 PX_domain superfamily - - "The Phox Homology domain, a phosphoinositide binding module; The PX domain is a phosphoinositide (PI) binding module involved in targeting proteins to membranes. Proteins containing PX domains interact with PIs and have been implicated in highly diverse functions such as cell signaling, vesicular trafficking, protein sorting, lipid modification, cell polarity and division, activation of T and B cells, and cell survival. 
Many members of this superfamily bind phosphatidylinositol-3-phosphate (PI3P) but in some cases, other PIs such as PI4P or PI(3,4)P2, among others, are the preferred substrates. In addition to protein-lipid interaction, the PX domain may also be involved in protein-protein interaction, as in the cases of p40phox, p47phox, and some sorting nexins (SNXs). The PX domain is conserved from yeast to humans and is found in more than 100 proteins. The majority of PX domain-containing proteins are SNXs, which play important roles in endosomal sorting." Q#817 - CGI_10004590 superfamily 219525 326 373 2.84E-07 49.3397 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#817 - CGI_10004590 superfamily 111397 26 94 1.23E-06 48.1063 cl03620 HYR superfamily - - "HYR domain; This domain is known as the HYR (Hyalin Repeat) domain, after the protein hyalin that is composed exclusively of this repeat. This domain probably corresponds to a new superfamily in the immunoglobulin fold. The function of this domain is uncertain it may be involved in cell adhesion." Q#817 - CGI_10004590 superfamily 205157 1196 1231 9.47E-06 44.4507 cl18264 EGF_3 superfamily - - EGF domain; This family includes a variety of EGF-like domain homologues. This family includes the C-terminal domain of the malaria parasite MSP1 protein. Q#817 - CGI_10004590 superfamily 219525 927 972 1.48E-05 44.3322 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#817 - CGI_10004590 superfamily 219525 820 867 1.49E-05 44.3322 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#817 - CGI_10004590 superfamily 219525 600 647 1.51E-05 44.3322 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#817 - CGI_10004590 superfamily 219525 1037 1084 8.17E-05 42.021 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#817 - CGI_10004590 superfamily 219525 711 758 0.000332292 40.095 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. 
Q#817 - CGI_10004590 superfamily 219525 547 592 0.000605073 39.3246 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#817 - CGI_10004590 superfamily 219525 1144 1192 0.000693415 39.3246 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#817 - CGI_10004590 superfamily 219525 979 1028 0.00124553 38.5542 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#819 - CGI_10004592 superfamily 207794 9 146 1.15E-54 179.716 cl02948 GH20_hexosaminidase superfamily N - "Beta-N-acetylhexosaminidases of glycosyl hydrolase family 20 (GH20) catalyze the removal of beta-1,4-linked N-acetyl-D-hexosamine residues from the non-reducing ends of N-acetyl-beta-D-hexosaminides including N-acetylglucosides and N-acetylgalactosides. These enzymes are broadly distributed in microorganisms, plants and animals, and play roles in various key physiological and pathological processes. These processes include cell structural integrity, energy storage, cellular signaling, fertilization, pathogen defense, viral penetration, the development of carcinomas, inflammatory events and lysosomal storage disorders. The GH20 enzymes include the eukaryotic beta-N-acetylhexosaminidases A and B, the bacterial chitobiases, dispersin B, and lacto-N-biosidase. The GH20 hexosaminidases are thought to act via a catalytic mechanism in which the catalytic nucleophile is not provided by the solvent or the enzyme, but by the substrate itself." Q#820 - CGI_10002678 superfamily 246723 70 154 1.56E-41 146.179 cl14813 GluZincin superfamily N - "Peptidase Gluzincin family (thermolysin-like proteinases, TLPs) includes peptidases M1, M2, M3, M4, M13, M32 and M36 (fungalysins); Gluzincin family (thermolysin-like peptidases or TLPs) includes several zinc-dependent metallopeptidases such as the M1, M2, M3, M4, M13, M32, M36 peptidases (MEROPS classification), and contain HEXXH and EXXXD motifs as part of their active site. 
All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. M1 family includes aminopeptidase N (APN) and leukotriene A4 hydrolase (LTA4H). APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types. LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity such that the two activities occupy different, but overlapping sites. The peptidase M3 or neurolysin-like family, includes M3, M2 and M32 metallopeptidases. The M3 peptidases have two subfamilies: M3A, includes thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (3.4.24.16), and the mitochondrial intermediate peptidase; M3B contains oligopeptidase F. M2 peptidase angiotensin converting enzyme (ACE, EC 3.4.15.1) catalyzes the conversion of decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II. ACE is a key part of the renin-angiotensin system that regulates blood pressure, thus ACE inhibitors are important for the treatment of hypertension. M32 family includes two eukaryotic enzymes from protozoa Trypanosoma cruzi, a causative agent of Chagas' disease, and Leishmania major, a parasite that causes leishmaniasis, making them attractive targets for drug development. The M4 family includes secreted protease thermolysin (EC 3.4.24.27), pseudolysin, aureolysin, neutral protease as well as fungalysin and bacillolysin (EC 3.4.24.28) that degrade extracellular proteins and peptides for bacterial nutrition, especially prior to sporulation. Thermolysin is widely used as a nonspecific protease to obtain fragments for peptide sequencing as well as in production of the artificial sweetener aspartame. 
M13 family includes neprilysin (EC 3.4.24.11) and endothelin-converting enzyme I (ECE-1, EC 3.4.24.71), which fulfill a broad range of physiological roles due to the greater variation in the S2' subsite allowing substrate specificity and are prime therapeutic targets for selective inhibition. Peptidase M36 (fungamysin) family includes endopeptidases from pathogenic fungi. Fungalysin hydrolyzes extracellular matrix proteins such as elastin and keratin. Aspergillus fumigatus causes the pulmonary disease aspergillosis by invading the lungs of immuno-compromised animals and secreting fungalysin that possibly breaks down proteinaceous structural barriers." Q#821 - CGI_10002679 superfamily 217473 92 314 3.14E-27 111.688 cl03978 Mab-21 superfamily - - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#823 - CGI_10002967 superfamily 243035 102 128 0.000460011 36.4238 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#824 - CGI_10007233 superfamily 243091 58 106 1.37E-05 42.094 cl02566 SET superfamily N - "SET domain; SET domains are protein lysine methyltransferase enzymes. 
SET domains appear to be protein-protein interaction domains. It has been demonstrated that SET domains mediate interactions with a family of proteins that display similarity with dual-specificity phosphatases (dsPTPases). A subset of SET domains have been called PR domains. These domains are divergent in sequence from other SET domains, but also appear to mediate protein-protein interaction. The SET domain consists of two regions known as SET-N and SET-C. SET-C forms an unusual and conserved knot-like structure of probably functional importance. Additionally to SET-N and SET-C, an insert region (SET-I) and flanking regions of high structural variability form part of the overall structure." Q#825 - CGI_10007234 superfamily 222150 347 366 0.000440805 39.2973 cl16282 zf-H2C2_2 superfamily C - Zinc-finger double domain; Zinc-finger double domain. Q#825 - CGI_10007234 superfamily 246975 334 355 0.00176589 37.7117 cl15478 zf-C2H2 superfamily - - "Zinc finger, C2H2 type; The C2H2 zinc finger is the classical zinc finger domain. The two conserved cysteines and histidines co-ordinate a zinc ion. The following pattern describes the zinc finger. #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C] Where X can be any amino acid, and numbers in brackets indicate the number of residues. The positions marked # are those that are important for the stable fold of the zinc finger. The final position can be either his or cys. The C2H2 zinc finger is composed of two short beta strands followed by an alpha helix. The amino terminal part of the helix binds the major groove in DNA binding zinc fingers. The accepted consensus binding sequence for Sp1 is usually defined by the asymmetric hexanucleotide core GGGCGG but this sequence does not include, among others, the GAG (=CTC) repeat that constitutes a high-affinity site for Sp1 binding to the wt1 promoter." 
Q#825 - CGI_10007234 superfamily 246975 829 850 0.00239389 37.3265 cl15478 zf-C2H2 superfamily - - "Zinc finger, C2H2 type; The C2H2 zinc finger is the classical zinc finger domain. The two conserved cysteines and histidines co-ordinate a zinc ion. The following pattern describes the zinc finger. #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C] Where X can be any amino acid, and numbers in brackets indicate the number of residues. The positions marked # are those that are important for the stable fold of the zinc finger. The final position can be either his or cys. The C2H2 zinc finger is composed of two short beta strands followed by an alpha helix. The amino terminal part of the helix binds the major groove in DNA binding zinc fingers. The accepted consensus binding sequence for Sp1 is usually defined by the asymmetric hexanucleotide core GGGCGG but this sequence does not include, among others, the GAG (=CTC) repeat that constitutes a high-affinity site for Sp1 binding to the wt1 promoter." Q#825 - CGI_10007234 superfamily 222150 814 839 0.00290206 36.9861 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#825 - CGI_10007234 superfamily 222150 317 343 0.00709273 35.8305 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. 
Q#827 - CGI_10007236 superfamily 241613 654 690 8.27E-10 55.6758 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#827 - CGI_10007236 superfamily 193258 86 194 0.000795128 40.7901 cl15087 Innate_immun superfamily NC - "Invertebrate innate immunity transcript family; The immune response of the purple sea urchin appears to be more complex than previously believed in that it uses immune-related gene families homologous to vertebrate Toll-like and NOD/NALP-like receptor families as well as C-type lectins and a rudimentary complement system. In addition, the species also produces this unusual family of mRNAs, also known as 185/333, which is strongly upregulated in response to pathogen challenge." Q#828 - CGI_10007237 superfamily 247856 757 801 1.54E-08 52.5501 cl17302 EFh superfamily C - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#828 - CGI_10007237 superfamily 248020 15 346 3.57E-44 163.404 cl17466 Sulfatase superfamily - - Sulfatase; Sulfatase. Q#828 - CGI_10007237 superfamily 221634 535 592 3.56E-12 65.0984 cl13923 DUF3740 superfamily N - "Sulfatase protein; This domain family is found in eukaryotes, and is typically between 144 and 173 amino acids in length. The family is found in association with pfam00884." Q#828 - CGI_10007237 superfamily 247856 827 884 0.00353179 36.7569 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#829 - CGI_10007238 superfamily 220388 98 457 2.93E-118 360.525 cl12372 FimP superfamily - - "Fms-interacting protein; This entry carries part of the crucial 144 N-terminal residues of the FmiP protein, which is essential for the binding of the protein to the cytoplasmic domain of activated Fms-molecules in M-CSF induced haematopoietic differentiation of macrophages. The C-terminus contains a putative nuclear localisation sequence and a leucine zipper which suggest further, as yet unknown, nuclear functions. The level of FMIP expression might form a threshold that determines whether cells differentiate into macrophages or into granulocytes." Q#830 - CGI_10007239 superfamily 218267 32 101 2.52E-12 60.9112 cl04754 LMBR1 superfamily C - "LMBR1-like membrane protein; Members of this family are integral membrane proteins that are around 500 residues in length. LMBR1 is not involved in preaxial polydactyly, as originally thought. Vertebrate members of this family may play a role in limb development. 
A member of this family has been shown to be a lipocalin membrane receptor" Q#831 - CGI_10001051 superfamily 241563 62 98 2.62E-05 42.0812 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#831 - CGI_10001051 superfamily 110440 492 518 0.000272142 38.9281 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is me diated by the NHL repeats." Q#831 - CGI_10001051 superfamily 241563 8 53 0.000685283 37.844 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#834 - CGI_10012802 superfamily 222150 753 777 0.000429555 38.9121 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#835 - CGI_10012803 superfamily 218079 108 194 0.00739043 34.5657 cl04507 CHD5 superfamily C - CHD5-like protein; Members of this family are probably coiled-coil proteins that are similar to the CHD5 (Congenital heart disease 5) protein. In Saccharomyces cerevisiae this protein localises to the ER and is thought to play a homeostatic role. 
Q#836 - CGI_10012804 superfamily 241599 1159 1215 2.19E-17 79.5948 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#836 - CGI_10012804 superfamily 241599 1402 1460 8.56E-14 69.1944 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#836 - CGI_10012804 superfamily 241599 1683 1741 1.11E-13 68.8092 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#836 - CGI_10012804 superfamily 241599 1052 1110 4.38E-11 61.4904 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#838 - CGI_10012806 superfamily 245816 239 405 1.77E-40 143.671 cl11964 CYTH-like_Pase superfamily - - "CYTH-like (also known as triphosphate tunnel metalloenzyme (TTM)-like) Phosphatases; CYTH-like superfamily enzymes hydrolyze triphosphate-containing substrates and require metal cations as cofactors. They have a unique active site located at the center of an eight-stranded antiparallel beta barrel tunnel (the triphosphate tunnel). The name CYTH originated from the gene designation for bacterial class IV adenylyl cyclases (CyaB), and from thiamine triphosphatase. Class IV adenylate cyclases catalyze the conversion of ATP to 3',5'-cyclic AMP (cAMP) and PPi. 
Thiamine triphosphatase is a soluble cytosolic enzyme which converts thiamine triphosphate to thiamine diphosphate. This domain superfamily also contains RNA triphosphatases, membrane-associated polyphosphate polymerases, tripolyphosphatases, nucleoside triphosphatases, nucleoside tetraphosphatases and other proteins with unknown functions." Q#840 - CGI_10012808 superfamily 241810 14 86 1.64E-26 96.3906 cl00354 KOW superfamily - - "KOW: an acronym for the authors' surnames (Kyrpides, Ouzounis and Woese); KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. The KOW motif contains an invariants glycine residue and comprises alternating blocks of hydrophilic and hydrophobic residues." Q#840 - CGI_10012808 superfamily 190164 54 126 7.34E-27 97.2503 cl03394 Ribosomal_L14e superfamily - - Ribosomal protein L14; This family includes the eukaryotic ribosomal protein L14. Q#841 - CGI_10012809 superfamily 216574 19 155 2.98E-34 125.013 cl14794 FAD_binding_4 superfamily - - "FAD binding domain; This family consists of various enzymes that use FAD as a co-factor, most of the enzymes are similar to oxygen oxidoreductase. One of the enzymes Vanillyl-alcohol oxidase (VAO) has a solved structure, the alignment includes the FAD binding site, called the PP-loop, between residues 99-110. The FAD molecule is covalently bound in the known structure, however the residue that links to the FAD is not in the alignment. VAO catalyzes the oxidation of a wide variety of substrates, ranging form aromatic amines to 4-alkylphenols. 
Other members of this family include D-lactate dehydrogenase, this enzyme catalyzes the conversion of D-lactate to pyruvate using FAD as a co-factor; mitomycin radical oxidase, this enzyme oxidises the reduced form of mitomycins and is involved in mitomycin resistance. This family includes MurB an UDP-N-acetylenolpyruvoylglucosamine reductase enzyme EC:1.1.1.158. This enzyme is involved in the biosynthesis of peptidoglycan." Q#842 - CGI_10012811 superfamily 243555 55 148 0.000281496 38.141 cl03871 Chitin_bind_3 superfamily N - "Chitin binding domain; This domain is found associated with a wide variety of cellulose binding domain. This domain however is a chitin binding domain. This domain is found in isolation in baculoviral spheroidins and spindolins, protein of unknown function." Q#843 - CGI_10012812 superfamily 247907 58 140 2.60E-09 51.6501 cl17353 LamG superfamily C - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." Q#844 - CGI_10012813 superfamily 218427 222 299 3.10E-28 105.551 cl18456 CIAPIN1 superfamily - - "Cytokine-induced anti-apoptosis inhibitor 1, Fe-S biogenesis; Anamorsin, subsequently named CIAPIN1 for cytokine-induced anti-apoptosis inhibitor 1, in humans is the homologue of yeast Dre2, a conserved soluble eukaryotic Fe-S cluster protein, that functions in cytosolic Fe-S protein biogenesis. It is found in both the cytoplasm and in the mitochondrial intermembrane space (IMS). CIAPIN1 is found to be up-regulated in hepatocellular cancer, is considered to be a downstream effector of the receptor tyrosine kinase-Ras signalling pathway, and is essential in mouse definitive haematopoiesis. 
Dre2 has been found to interact with the yeast reductase Tah18, forming a tight cytosolic complex implicated in the response to high levels of oxidative stress." Q#844 - CGI_10012813 superfamily 247727 36 98 3.03E-06 44.1844 cl17173 AdoMet_MTases superfamily N - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#845 - CGI_10012814 superfamily 192955 439 572 7.99E-15 73.3483 cl13625 TPX2_importin superfamily C - Cell cycle regulated microtubule associated protein; This domain is found in eukaryotes. This domain is typically between 127 to 182 amino acids in length. This domain is found associated with pfam06886. This domain is found in the protein TPX2 (a.k.a p100) which is involved in cell cycling. It is only expressed between the start of the S phase and completion of cytokinesis. The microtubule-associated protein TPX2 has been reported to be crucial for mitotic spindle formation. This domain is close to the C terminal of TPX2. The protein importin alpha regulates the activity of TPX2 by binding to the nuclear localisation signal in this domain. Q#847 - CGI_10014134 superfamily 241584 1 92 1.14E-19 81.7739 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. 
Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#847 - CGI_10014134 superfamily 241584 104 187 1.66E-14 67.9067 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#847 - CGI_10014134 superfamily 241584 213 293 2.54E-13 64.4399 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#848 - CGI_10014135 superfamily 110440 284 311 0.00140971 35.8465 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. 
The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#849 - CGI_10014136 superfamily 241574 1531 1758 6.79E-108 344.569 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#849 - CGI_10014136 superfamily 241584 1154 1241 1.16E-10 60.5879 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#849 - CGI_10014136 superfamily 241584 422 511 5.49E-07 49.8023 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin.
Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#849 - CGI_10014136 superfamily 241584 602 690 5.62E-07 49.8023 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#849 - CGI_10014136 superfamily 241584 795 877 1.41E-06 48.2615 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#849 - CGI_10014136 superfamily 241584 513 585 2.13E-05 44.7947 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. 
Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#849 - CGI_10014136 superfamily 241584 148 230 0.000216971 41.7131 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#849 - CGI_10014136 superfamily 241584 885 970 0.000398145 40.9427 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#849 - CGI_10014136 superfamily 241584 692 770 0.00425243 37.4759 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. 
Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#849 - CGI_10014136 superfamily 241584 991 1058 0.00622876 37.0907 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#849 - CGI_10014136 superfamily 197431 1274 1428 0.000800033 40.8704 cl06408 UP_III_II superfamily - - "Uroplakin IIIb, IIIa and II; Uroplakins (UPs) are a family of proteins that associate with each other to form plaques on the apical surface of the urothelium, the pseudo-stratified epithelium lining the urinary tract from renal pelvis to the bladder outlet. UPs are classified into 3 types: UPIa and UPIb, UPII, and UPIIIa and IIIb. UPIs are tetraspanins that have four transmembrane domains separating one large and one small extracellular domain while UPII and UPIIIs are single-pass transmembrane proteins. UPIa and UPIb form specific heterodimers with UPII and UPIII, respectively, which allows them to exit the endoplasmic reticulum. UPII/UPIa and UPIIIs/UPIb form heterotetramers; six of these tetramers form the 16nm particle, seen in the hexagonal array of the asymmetric unit membrane, which is believed to form a urinary tract barrier. Uroplakins are also believed to play a role during urinary tract morphogenesis."
Q#853 - CGI_10014141 superfamily 247736 12 45 0.000131603 35.5883 cl17182 NAT_SF superfamily N - "N-Acyltransferase superfamily: Various enzymes that characteristically catalyze the transfer of an acyl group to a substrate; NAT (N-Acyltransferase) is a large superfamily of enzymes that mostly catalyze the transfer of an acyl group to a substrate and are implicated in a variety of functions, ranging from bacterial antibiotic resistance to circadian rhythms in mammals. Members include GCN5-related N-Acetyltransferases (GNAT) such as Aminoglycoside N-acetyltransferases, Histone N-acetyltransferase (HAT) enzymes, and Serotonin N-acetyltransferase, which catalyze the transfer of an acetyl group to a substrate. The kinetic mechanism of most GNATs involves the ordered formation of a ternary complex: the reaction begins with Acetyl Coenzyme A (AcCoA) binding, followed by binding of substrate, then direct transfer of the acetyl group from AcCoA to the substrate, followed by product and subsequent CoA release. Other family members include Arginine/ornithine N-succinyltransferase, Myristoyl-CoA: protein N-myristoyltransferase, and Acyl-homoserinelactone synthase which have a similar catalytic mechanism but differ in types of acyl groups transferred. Leucyl/phenylalanyl-tRNA-protein transferase and FemXAB nonribosomal peptidyltransferases which catalyze similar peptidyltransferase reactions are also included." Q#854 - CGI_10014142 superfamily 247736 4 69 2.43E-08 46.2365 cl17182 NAT_SF superfamily C - "N-Acyltransferase superfamily: Various enzymes that characteristically catalyze the transfer of an acyl group to a substrate; NAT (N-Acyltransferase) is a large superfamily of enzymes that mostly catalyze the transfer of an acyl group to a substrate and are implicated in a variety of functions, ranging from bacterial antibiotic resistance to circadian rhythms in mammals. 
Members include GCN5-related N-Acetyltransferases (GNAT) such as Aminoglycoside N-acetyltransferases, Histone N-acetyltransferase (HAT) enzymes, and Serotonin N-acetyltransferase, which catalyze the transfer of an acetyl group to a substrate. The kinetic mechanism of most GNATs involves the ordered formation of a ternary complex: the reaction begins with Acetyl Coenzyme A (AcCoA) binding, followed by binding of substrate, then direct transfer of the acetyl group from AcCoA to the substrate, followed by product and subsequent CoA release. Other family members include Arginine/ornithine N-succinyltransferase, Myristoyl-CoA: protein N-myristoyltransferase, and Acyl-homoserinelactone synthase which have a similar catalytic mechanism but differ in types of acyl groups transferred. Leucyl/phenylalanyl-tRNA-protein transferase and FemXAB nonribosomal peptidyltransferases which catalyze similar peptidyltransferase reactions are also included." Q#855 - CGI_10014143 superfamily 241818 7 212 1.27E-143 402.712 cl00366 PMSR superfamily - - Peptide methionine sulfoxide reductase; This enzyme repairs damaged proteins. Methionine sulfoxide in proteins is reduced to methionine. Q#857 - CGI_10014145 superfamily 248019 83 135 3.47E-05 41.7943 cl17465 DAGK_cat superfamily NC - "Diacylglycerol kinase catalytic domain; Diacylglycerol (DAG) is a second messenger that acts as a protein kinase C activator. The catalytic domain is assumed from the finding of bacterial homologues. YegS is the Escherichia coli protein in this family whose crystal structure reveals an active site in the inter-domain cleft formed by four conserved sequence motifs, revealing a novel metal-binding site. The residues of this site are conserved across the family." Q#858 - CGI_10007636 superfamily 245882 25 407 3.11E-162 488.724 cl12119 Alpha_L_fucos superfamily - - Alpha-L-fucosidase; Alpha-L-fucosidase. 
Q#858 - CGI_10007636 superfamily 219542 526 636 2.09E-40 146.618 cl18517 Cu-oxidase_3 superfamily - - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. Q#858 - CGI_10007636 superfamily 219541 884 1025 1.23E-24 101.776 cl18516 Cu-oxidase_2 superfamily N - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. Q#858 - CGI_10007636 superfamily 215896 644 824 1.63E-14 72.3276 cl18351 Cu-oxidase superfamily - - Multicopper oxidase; Many of the proteins in this family contain multiple similar copies of this plastocyanin-like domain. Q#862 - CGI_10007640 superfamily 220070 335 536 1.83E-37 139.468 cl18542 SF3b1 superfamily - - "Splicing factor 3B subunit 1; This family consists of several eukaryotic splicing factor 3B subunit 1 proteins, which associate with p14 through a C-terminus beta-strand that interacts with beta-3 of the p14 RNA recognition motif (RRM) beta-sheet, which is in turn connected to an alpha-helix by a loop that makes extensive contacts with both the shorter C-terminal helix and RRM of p14. This subunit is required for 'A' splicing complex assembly (formed by the stable binding of U2 snRNP to the branchpoint sequence in pre-mRNA) and 'E' splicing complex assembly." Q#863 - CGI_10007641 superfamily 111646 256 392 1.72E-76 235.764 cl03707 S-AdoMet_synt_C superfamily - - "S-adenosylmethionine synthetase, C-terminal domain; The three domains of S-adenosylmethionine synthetase have the same alpha+beta fold." Q#863 - CGI_10007641 superfamily 217221 132 254 6.24E-70 218.059 cl03706 S-AdoMet_synt_M superfamily - - "S-adenosylmethionine synthetase, central domain; The three domains of S-adenosylmethionine synthetase have the same alpha+beta fold." 
Q#863 - CGI_10007641 superfamily 201226 20 119 2.08E-62 198.077 cl02868 S-AdoMet_synt_N superfamily - - "S-adenosylmethionine synthetase, N-terminal domain; The three domains of S-adenosylmethionine synthetase have the same alpha+beta fold." Q#864 - CGI_10007642 superfamily 111646 30 166 2.91E-78 231.912 cl03707 S-AdoMet_synt_C superfamily - - "S-adenosylmethionine synthetase, C-terminal domain; The three domains of S-adenosylmethionine synthetase have the same alpha+beta fold." Q#864 - CGI_10007642 superfamily 217221 1 28 1.04E-07 46.6455 cl03706 S-AdoMet_synt_M superfamily N - "S-adenosylmethionine synthetase, central domain; The three domains of S-adenosylmethionine synthetase have the same alpha+beta fold." Q#865 - CGI_10007643 superfamily 111646 268 404 2.82E-75 233.068 cl03707 S-AdoMet_synt_C superfamily - - "S-adenosylmethionine synthetase, C-terminal domain; The three domains of S-adenosylmethionine synthetase have the same alpha+beta fold." Q#865 - CGI_10007643 superfamily 217221 144 266 2.53E-68 214.207 cl03706 S-AdoMet_synt_M superfamily - - "S-adenosylmethionine synthetase, central domain; The three domains of S-adenosylmethionine synthetase have the same alpha+beta fold." Q#865 - CGI_10007643 superfamily 201226 32 131 4.09E-63 200.003 cl02868 S-AdoMet_synt_N superfamily - - "S-adenosylmethionine synthetase, N-terminal domain; The three domains of S-adenosylmethionine synthetase have the same alpha+beta fold." Q#867 - CGI_10007645 superfamily 241600 2 153 1.44E-58 183.596 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. 
C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#868 - CGI_10007646 superfamily 203136 138 267 4.15E-12 61.5915 cl04867 LRAT superfamily - - "Lecithin retinol acyltransferase; The full-length members of this family are representatives of a novel class II tumour-suppressor family, designated as H-REV107-like. This domain is the catalytic N-terminal proline-rich region of the protein. The downstream region is a putative C-terminal transmembrane domain which is found to be crucial for cellular localisation, but not necessary for the enzyme activity. H-REV107-like proteins are homologous to lecithin retinol acyltransferase (LRAT), an enzyme that catalyzes the transfer of the sn-1 acyl group of phosphatidylcholine to all-trans-retinol and forming a retinyl ester." Q#869 - CGI_10007647 superfamily 245206 1 126 5.49E-30 108.927 cl09931 NADB_Rossmann superfamily N - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. 
Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#874 - CGI_10006086 superfamily 247905 294 349 2.66E-09 58.7885 cl17351 HELICc superfamily N - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#874 - CGI_10006086 superfamily 247805 19 179 2.20E-05 46.8687 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#876 - CGI_10011579 superfamily 215847 20 155 9.41E-19 82.4942 cl09510 Lipoxygenase superfamily N - Lipoxygenase; Lipoxygenase. Q#879 - CGI_10011582 superfamily 247057 15 63 1.47E-05 41.3869 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. 
They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#879 - CGI_10011582 superfamily 245595 214 255 0.00240682 37.2145 cl11393 Peptidase_M14_like superfamily NC - "M14 family of metallocarboxypeptidases and related proteins; The M14 family of metallocarboxypeptidases (MCPs), also known as funnelins, are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Two major subfamilies of the M14 family, defined based on sequence and structural homology, are the A/B and N/E subfamilies. Enzymes belonging to the A/B subfamily are normally synthesized as inactive precursors containing preceding signal peptide, followed by an N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases. The A/B enzymes can be further divided based on their substrate specificity; Carboxypeptidase A-like (CPA-like) enzymes favor hydrophobic residues while carboxypeptidase B-like (CPB-like) enzymes only cleave the basic residues lysine or arginine. The A forms have slightly different specificities, with Carboxypeptidase A1 (CPA1) preferring aliphatic and small aromatic residues, and CPA2 preferring the bulky aromatic side chains. Enzymes belonging to the N/E subfamily enzymes are not produced as inactive precursors and instead rely on their substrate specificity and subcellular compartmentalization to prevent inappropriate cleavage. They contain an extra C-terminal transthyretin-like domain, thought to be involved in folding or formation of oligomers. 
MCPs can also be classified based on their involvement in specific physiological processes; the pancreatic MCPs participate only in alimentary digestion and include carboxypeptidase A and B (A/B subfamily), while others, namely regulatory MCPs or the N/E subfamily, are involved in more selective reactions, mainly in non-digestive tissues and fluids, acting on blood coagulation/fibrinolysis, inflammation and local anaphylaxis, pro-hormone and neuropeptide processing, cellular response and others. Another MCP subfamily, is that of succinylglutamate desuccinylase /aspartoacylase, which hydrolyzes N-acetyl-L-aspartate (NAA), and deficiency in which is the established cause of Canavan disease. Another subfamily (referred to as subfamily C) includes an exceptional type of activity in the MCP family, that of dipeptidyl-peptidase activity of gamma-glutamyl-(L)-meso-diaminopimelate peptidase I which is involved in bacterial cell wall metabolism." Q#881 - CGI_10011584 superfamily 221377 19 86 4.12E-06 44.3819 cl13449 DUF3504 superfamily C - Domain of unknown function (DUF3504); This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 156 to 173 amino acids in length. Q#882 - CGI_10011585 superfamily 241568 425 478 0.000298676 39.3684 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." 
Q#883 - CGI_10011586 superfamily 243099 414 524 1.68E-13 68.5136 cl02575 Bcl-2_like superfamily N - "Apoptosis regulator proteins of the Bcl-2 family, named after B-cell lymphoma 2. This alignment model spans what have been described as Bcl-2 homology regions BH1, BH2, BH3, and BH4. Many members of this family have an additional C-terminal transmembrane segment. Some homologous proteins, which are not included in this model, may miss either the BH4 (Bax, Bak) or the BH2 (Bcl-X(S)) region, and some appear to only share the BH3 region (Bik, Bim, Bad, Bid, Egl-1). This family is involved in the regulation of the outer mitochondrial membrane's permeability and in promoting or preventing the release of apoptogenic factors, which in turn may trigger apoptosis by activating caspases. Bcl-2 and the closely related Bcl-X(L) are anti-apoptotic key regulators of programmed cell death. They are assumed to function via heterodimeric protein-protein interactions, binding pro-apoptotic proteins such as Bad (BCL2-antagonist of cell death), Bid, and Bim, by specifically interacting with their BH3 regions. Interfering with this heterodimeric interaction via small-molecule inhibitors may prove effective in targeting various cancers. This family also includes the Caenorhabditis elegans Bcl-2 homolog CED-9, which binds to CED-4, the C. elegans homolog of mammalian Apaf-1. Apaf-1, however, does not seem to be inhibited by Bcl-2 directly." Q#884 - CGI_10011587 superfamily 213465 89 247 1.50E-08 52.0653 cl17074 PRK03963 superfamily N - V-type ATP synthase subunit E; Provisional Q#892 - CGI_10010319 superfamily 241578 44 195 9.39E-31 115.852 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold.
The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#897 - CGI_10010325 superfamily 241592 14 74 2.01E-08 46.8362 cl00074 H2A superfamily - - "Histone 2A; H2A is a subunit of the nucleosome. The nucleosome is an octamer containing two H2A, H2B, H3, and H4 subunits. The H2A subunit performs essential roles in maintaining structural integrity of the nucleosome, chromatin condensation, and binding of specific chromatin-associated proteins." Q#898 - CGI_10010326 superfamily 241568 146 184 9.38E-05 39.3684 cl00043 CCP superfamily N - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function."
Q#901 - CGI_10014457 superfamily 241889 149 280 6.49E-33 120.428 cl00474 PAP2_like superfamily - - "PAP2_like proteins, a super-family of histidine phosphatases and vanadium haloperoxidases, includes type 2 phosphatidic acid phosphatase or lipid phosphate phosphatase (LPP), Glucose-6-phosphatase, Phosphatidylglycerophosphatase B and bacterial acid phosphatase, vanadium chloroperoxidases, vanadium bromoperoxidases, and several other mostly uncharacterized subfamilies. Several members of this superfamily have been predicted to be transmembrane proteins." Q#902 - CGI_10014458 superfamily 246671 1 91 6.06E-08 47.0325 cl14606 Reeler_cohesin_like superfamily N - "Domains similar to the eukaryotic reeler domain and bacterial cohesins; This diverse family summarizes a set of distantly related domains, as revealed by structural similarity." Q#903 - CGI_10014459 superfamily 243047 18 132 4.72E-54 177.427 cl02464 ArfGap superfamily - - "Putative GTPase activating protein for Arf; Putative zinc fingers with GTPase activating proteins (GAPs) towards the small GTPase, Arf. The GAP of ARD1 stimulates GTPase hydrolysis for ARD1 but not ARFs." Q#904 - CGI_10014460 superfamily 142634 1488 1681 1.08E-89 295.641 cl11429 RNAP_largest_subunit_C superfamily N - "Largest subunit of RNA polymerase (RNAP), C-terminal domain; RNA polymerase (RNAP) is a large multi-subunit complex responsible for the synthesis of RNA. It is the principal enzyme of the transcription process, and is the final target in many regulatory pathways that control gene expression in all living cells. At least three distinct RNAP complexes are found in eukaryotic nuclei, RNAP I, RNAP II, and RNAP III, for the synthesis of ribosomal RNA precursor, mRNA precursor, and 5S and tRNA, respectively. A single distinct RNAP complex is found in prokaryotes and archaea, which may be responsible for the synthesis of all RNAs. 
Structure studies revealed that prokaryotic and eukaryotic RNAPs share a conserved crab-claw-shape structure. The largest and the second largest subunits each make up one clamp, one jaw, and part of the cleft. The largest RNAP subunit (Rpb1) interacts with the second-largest RNAP subunit (Rpb2) to form the DNA entry and RNA exit channels in addition to the catalytic center of RNA synthesis. The region covered by this domain makes up part of the foot and jaw structures. In archaea, some photosynthetic organisms, and some organelles, this domain exists as a separate subunit, while it forms the C-terminal region of the RNAP largest subunit in eukaryotes and bacteria." Q#904 - CGI_10014460 superfamily 245715 410 655 5.52E-91 299.049 cl11591 RNA_pol_Rpb1_2 superfamily N - "RNA polymerase Rpb1, domain 2; RNA polymerases catalyze the DNA dependent polymerisation of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). This domain, domain 2, contains the active site. The invariant motif -NADFDGD- binds the active site magnesium ion." Q#904 - CGI_10014460 superfamily 142634 1198 1325 3.35E-53 191.252 cl11429 RNAP_largest_subunit_C superfamily C - "Largest subunit of RNA polymerase (RNAP), C-terminal domain; RNA polymerase (RNAP) is a large multi-subunit complex responsible for the synthesis of RNA. It is the principal enzyme of the transcription process, and is the final target in many regulatory pathways that control gene expression in all living cells. At least three distinct RNAP complexes are found in eukaryotic nuclei, RNAP I, RNAP II, and RNAP III, for the synthesis of ribosomal RNA precursor, mRNA precursor, and 5S and tRNA, respectively. A single distinct RNAP complex is found in prokaryotes and archaea, which may be responsible for the synthesis of all RNAs. Structure studies revealed that prokaryotic and eukaryotic RNAPs share a conserved crab-claw-shape structure.
The largest and the second largest subunits each make up one clamp, one jaw, and part of the cleft. The largest RNAP subunit (Rpb1) interacts with the second-largest RNAP subunit (Rpb2) to form the DNA entry and RNA exit channels in addition to the catalytic center of RNA synthesis. The region covered by this domain makes up part of the foot and jaw structures. In archaea, some photosynthetic organisms, and some organelles, this domain exists as a separate subunit, while it forms the C-terminal region of the RNAP largest subunit in eukaryotes and bacteria." Q#904 - CGI_10014460 superfamily 218361 633 815 2.22E-36 137.367 cl04873 RNA_pol_Rpb1_3 superfamily - - "RNA polymerase Rpb1, domain 3; RNA polymerases catalyze the DNA dependent polymerisation of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). This domain, domain 3, represents the pore domain. The 3' end of RNA is positioned close to this domain. The pore delimited by this domain is thought to act as a channel through which nucleotides enter the active site and/or where the 3' end of the RNA may be extruded during back-tracking." Q#904 - CGI_10014460 superfamily 218372 899 964 2.05E-19 86.6554 cl04881 RNA_pol_Rpb1_4 superfamily N - "RNA polymerase Rpb1, domain 4; RNA polymerases catalyze the DNA dependent polymerisation of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). This domain, domain 4, represents the funnel domain. The funnel contains the binding site for some elongation factors." Q#904 - CGI_10014460 superfamily 218370 27 123 6.61E-17 82.7329 cl04880 RNA_pol_Rpb1_1 superfamily C - "RNA polymerase Rpb1, domain 1; RNA polymerases catalyze the DNA dependent polymerisation of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases).
This domain, domain 1, represents the clamp domain, which is a mobile domain involved in positioning the DNA, maintenance of the transcription bubble and positioning of the nascent RNA strand." Q#905 - CGI_10014461 superfamily 241563 67 99 0.00361497 35.5328 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#907 - CGI_10014463 superfamily 216686 71 260 1.01E-39 139.766 cl18377 Galactosyl_T superfamily - - "Galactosyltransferase; This family includes the galactosyltransferases UDP-galactose:2-acetamido-2-deoxy-D-glucose3beta-galactosyltransferase and UDP-Gal:beta-GlcNAc beta 1,3-galactosyltransferase. Specific galactosyltransferases transfer galactose to GlcNAc terminal chains in the synthesis of the lacto-series oligosaccharides types 1 and 2." Q#908 - CGI_10014464 superfamily 219958 1 162 1.25E-55 177.057 cl18536 Alg14 superfamily - - Oligosaccharide biosynthesis protein Alg14 like; Alg14 is involved in dolichol-linked oligosaccharide biosynthesis and anchors the catalytic subunit Alg13 to the ER membrane. Q#909 - CGI_10014465 superfamily 241832 57 232 1.54E-84 252.426 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). 
Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#910 - CGI_10014466 superfamily 243166 67 213 2.08E-22 96.1906 cl02759 TRAM_LAG1_CLN8 superfamily N - TLC domain; TLC domain. Q#910 - CGI_10014466 superfamily 193049 414 499 0.000396475 39.9611 cl13867 DUF3702 superfamily N - ImpA domain protein; This family of proteins is found in bacteria. Proteins in this family are typically between 207 and 469 amino acids in length. The family is found in association with pfam06812. Q#910 - CGI_10014466 superfamily 150420 331 429 0.000419972 40.1039 cl18042 Jnk-SapK_ap_N superfamily N - JNK_SAPK-associated protein-1; This is the N-terminal 200 residues of a set of proteins conserved from yeasts to humans. Most of the proteins in this entry have an RhoGEF pfam00621 domain at their C-terminal end. Q#911 - CGI_10014467 superfamily 242793 148 372 8.07E-26 102.131 cl01947 MT-A70 superfamily - - "MT-A70; MT-A70 is the S-adenosylmethionine-binding subunit of human mRNA:m6A methyl-transferase (MTase), an enzyme that sequence-specifically methylates adenines in pre-mRNAs." Q#911 - CGI_10014467 superfamily 247727 101 180 0.0025093 36.8914 cl17173 AdoMet_MTases superfamily C - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." 
Q#912 - CGI_10014468 superfamily 243065 970 1112 5.49E-20 89.7685 cl02516 VWD superfamily - - von Willebrand factor type D domain; Luciferin-2-monooxygenase from Vargula hilgendorfii contains a vwd domain. Its function is unrelated but the similarity is very strong by several methods. Q#913 - CGI_10014469 superfamily 220131 494 774 7.60E-62 215.988 cl11721 DUF1943 superfamily - - "Domain of unknown function (DUF1943); Members of this family adopt a structure consisting of several large open beta-sheets. Their exact function has not, as yet, been determined." Q#913 - CGI_10014469 superfamily 219034 805 895 1.91E-06 48.4866 cl05778 DUF1081 superfamily - - Domain of Unknown Function (DUF1081); This region is found in Apolipophorin proteins. Q#914 - CGI_10014470 superfamily 243065 568 710 1.20E-20 90.9241 cl02516 VWD superfamily - - von Willebrand factor type D domain; Luciferin-2-monooxygenase from Vargula hilgendorfii contains a vwd domain. Its function is unrelated but the similarity is very strong by several methods. Q#914 - CGI_10014470 superfamily 248070 1 23 0.00455485 36.7711 cl17516 AAA_29 superfamily NC - P-loop containing region of AAA domain; P-loop containing region of AAA domain. Q#915 - CGI_10014471 superfamily 245864 52 461 5.85E-107 328.084 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. 
These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#916 - CGI_10014472 superfamily 245864 13 402 7.18E-108 329.239 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#917 - CGI_10014473 superfamily 247805 129 334 1.20E-99 306.333 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 
Q#917 - CGI_10014473 superfamily 247905 371 477 1.92E-38 138.91 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#918 - CGI_10014474 superfamily 241782 22 439 0 630.004 cl00321 AAT_I superfamily - - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. 
Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phosphorylase family (fold type V)." Q#920 - CGI_10014476 superfamily 245201 30 281 9.55E-58 186.288 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#921 - CGI_10014477 superfamily 243310 24 210 3.22E-44 150.467 cl03120 ELO superfamily C - "GNS1/SUR4 family; Members of this family are involved in long chain fatty acid elongation systems that produce the 26-carbon precursors for ceramide and sphingolipid synthesis. Predicted to be integral membrane proteins, in eukaryotes they are probably located on the endoplasmic reticulum. Yeast ELO3 affects plasma membrane H+-ATPase activity, and may act on a glucose-signaling pathway that controls the expression of several genes that are transcriptionally regulated by glucose such as PMA1." Q#922 - CGI_10014478 superfamily 241832 70 154 9.24E-19 78.169 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. 
They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#923 - CGI_10014479 superfamily 241622 1084 1140 2.60E-13 67.5918 cl00117 PDZ superfamily N - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#923 - CGI_10014479 superfamily 216736 690 793 8.09E-12 63.7432 cl03379 DIL superfamily - - DIL domain; The DIL domain has no known function. 
Q#923 - CGI_10014479 superfamily 241645 70 185 4.52E-09 55.0323 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#924 - CGI_10014480 superfamily 243078 6 145 5.04E-67 220.967 cl02544 VHS_ENTH_ANTH superfamily - - "VHS, ENTH and ANTH domain superfamily; composed of proteins containing a VHS, ENTH or ANTH domain. The VHS domain is present in Vps27 (Vacuolar Protein Sorting), Hrs (Hepatocyte growth factor-regulated tyrosine kinase substrate) and STAM (Signal Transducing Adaptor Molecule). It is located at the N-termini of proteins involved in intracellular membrane trafficking. The epsin N-terminal homology (ENTH) domain is an evolutionarily conserved protein module found primarily in proteins that participate in clathrin-mediated endocytosis. A set of proteins previously designated as harboring an ENTH domain in fact contains a highly similar, yet unique module referred to as an AP180 N-terminal homology (ANTH) domain. VHS, ENTH and ANTH domains are structurally similar and are composed of a superhelix of eight alpha helices. 
ENTH and ANTH (E/ANTH) domains bind both inositol phospholipids and proteins and contribute to the nucleation and formation of clathrin coats on membranes. ENTH domains also function in the development of membrane curvature through lipid remodeling during the formation of clathrin-coated vesicles. E/ANTH domain-bearing proteins have recently been shown to function with adaptor protein-1 and GGA adaptors at the trans-Golgi network, which suggests that E/ANTH domains are universal components of the machinery for clathrin-mediated membrane budding." Q#924 - CGI_10014480 superfamily 248318 162 218 3.97E-23 94.8101 cl17764 FYVE superfamily - - "FYVE domain; Zinc-binding domain; targets proteins to membrane lipids via interaction with phosphatidylinositol-3-phosphate, PI3P; present in Fab1, YOTB, Vac1, and EEA1;" Q#924 - CGI_10014480 superfamily 152645 376 456 2.46E-33 124.937 cl13621 Hrs_helical superfamily - - "Hepatocyte growth factor-regulated tyrosine kinase substrate; This domain family is found in eukaryotes, and is approximately 100 amino acids in length. The family is found in association with pfam00790, pfam01363, pfam02809. This domain is the helical region of Hrs which forms the core complex of ESCRT with STAM." Q#925 - CGI_10014481 superfamily 218885 5 141 3.21E-47 157.232 cl18483 DUF938 superfamily C - Protein of unknown function (DUF938); This family consists of several hypothetical proteins from both prokaryotes and eukaryotes. The function of this family is unknown. Q#925 - CGI_10014481 superfamily 247724 142 231 2.26E-16 72.9944 cl17170 Ras_like_GTPase superfamily N - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. 
This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#927 - CGI_10014483 superfamily 241584 345 439 1.67E-07 49.8023 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#927 - CGI_10014483 superfamily 241584 552 645 3.01E-06 45.9503 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. 
Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#927 - CGI_10014483 superfamily 241584 654 744 1.54E-05 44.0243 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#927 - CGI_10014483 superfamily 241584 462 541 0.00111811 38.2463 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#927 - CGI_10014483 superfamily 245814 57 137 3.11E-15 72.6328 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#927 - CGI_10014483 superfamily 245814 254 334 7.69E-14 68.2896 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#927 - CGI_10014483 superfamily 245814 158 235 3.01E-09 54.7578 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#927 - CGI_10014483 superfamily 245814 14 50 0.00307239 36.9119 cl11960 Ig superfamily N - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. 
The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#929 - CGI_10006264 superfamily 243066 20 121 5.78E-10 56.8569 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#932 - CGI_10006267 superfamily 245205 120 201 1.03E-15 69.9593 cl09930 RPA_2b-aaRSs_OBF_like superfamily - - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. 
One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." Q#932 - CGI_10006267 superfamily 245205 12 80 0.000913363 36.4469 cl09930 RPA_2b-aaRSs_OBF_like superfamily - - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. 
RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." Q#935 - CGI_10013762 superfamily 242902 17 98 2.44E-06 45.3155 cl02144 TLD superfamily C - TLD; This domain is predicted to be an enzyme and is often found associated with pfam01476. Q#936 - CGI_10013764 superfamily 216939 90 135 8.95E-05 37.2573 cl03492 PC4 superfamily N - Transcriptional Coactivator p15 (PC4); p15 has a bipartite structure composed of an amino-terminal regulatory domain and a carboxy-terminal cryptic DNA-binding domain. The DNA-binding activity of the carboxy-terminal is disguised by the amino-terminal p15 domain. Activity is controlled by protein kinases that target the regulatory domain. Q#937 - CGI_10013765 superfamily 245882 25 406 0 544.578 cl12119 Alpha_L_fucos superfamily - - Alpha-L-fucosidase; Alpha-L-fucosidase. Q#938 - CGI_10013766 superfamily 201217 799 845 6.60E-12 62.5432 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#938 - CGI_10013766 superfamily 201217 716 770 4.36E-10 57.5356 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#938 - CGI_10013766 superfamily 201217 849 896 1.33E-06 47.1352 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#938 - CGI_10013766 superfamily 205718 951 980 1.59E-06 46.7146 cl16296 RCC1_2 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. 
Q#938 - CGI_10013766 superfamily 205718 757 785 3.59E-06 45.559 cl16296 RCC1_2 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#938 - CGI_10013766 superfamily 201217 967 1014 9.27E-05 41.7424 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#938 - CGI_10013766 superfamily 201217 1019 1044 0.00537781 36.3496 cl08266 RCC1 superfamily C - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#941 - CGI_10013769 superfamily 243061 1 102 1.99E-39 130.54 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#941 - CGI_10013769 superfamily 243061 108 153 1.95E-15 67.7522 cl02509 SRCR superfamily C - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#942 - CGI_10013770 superfamily 241874 12 491 3.03E-170 500.47 cl00456 SLC5-6-like_sbd superfamily - - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. 
NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#943 - CGI_10013771 superfamily 207684 7 39 7.78E-07 43.5215 cl02640 SAP superfamily - - "SAP domain; The SAP (after SAF-A/B, Acinus and PIAS) motif is a putative DNA/RNA binding domain found in diverse nuclear and cytoplasmic proteins." Q#944 - CGI_10013772 superfamily 241547 69 347 2.94E-56 185.564 cl00012 alpha_CA superfamily - - "Carbonic anhydrase alpha (vertebrate-like) group. Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionary distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidine residues and a fourth conserved histidine plays a potential role in proton transfer." Q#948 - CGI_10005274 superfamily 243035 7 84 6.81E-18 72.6525 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. 
This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. 
In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#949 - CGI_10005275 superfamily 241600 12 154 9.39E-25 95.385 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#950 - CGI_10005276 superfamily 111000 12 480 0 539.995 cl15499 Glyco_hydro_59 superfamily C - Glycosyl hydrolase family 59; Glycosyl hydrolase family 59. Q#951 - CGI_10005277 superfamily 247916 12 59 8.98E-06 43.9107 cl17362 Transglut_core superfamily N - "Transglutaminase-like superfamily; This family includes animal transglutaminases and other bacterial proteins of unknown function. Sequence conservation in this superfamily primarily involves three motifs that centre around conserved cysteine, histidine, and aspartate residues that form the catalytic triad in the structurally characterized transglutaminase, the human blood clotting factor XIIIa'. 
On the basis of the experimentally demonstrated activity of the Methanobacterium phage pseudomurein endoisopeptidase, it is proposed that many, if not all, microbial homologues of the transglutaminases are proteases and that the eukaryotic transglutaminases have evolved from an ancestral protease." Q#951 - CGI_10005277 superfamily 245008 554 583 0.00108844 37.8575 cl09101 E_set superfamily NC - "Early set domain associated with the catalytic domain of sugar utilizing enzymes at either the N or C terminus; The E or "early" set domains of sugar utilizing enzymes are associated with different types of catalytic domains at either the N-terminal or C-terminal end. These domains may be related to the immunoglobulin and/or fibronectin type III superfamilies. Members of this family include alpha amylase, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitobiase, and chitinase. A subset of these members were recently identified as members of the CBM48 (Carbohydrate Binding Module 48) family. Members of the CBM48 family include pullulanase, maltooligosyl trehalose synthase, starch branching enzyme, glycogen branching enzyme, glycogen debranching enzyme, isoamylase, and the beta subunit of AMP-activated protein kinase." Q#953 - CGI_10008184 superfamily 248312 35 190 9.35E-08 48.5124 cl17758 PMP22_Claudin superfamily - - PMP-22/EMP/MP20/Claudin family; PMP-22/EMP/MP20/Claudin family. Q#954 - CGI_10008185 superfamily 241574 681 801 1.40E-45 165.066 cl00053 PTPc superfamily C - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. 
Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#954 - CGI_10008185 superfamily 243051 127 281 2.75E-25 104.382 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal cord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#954 - CGI_10008185 superfamily 241609 547 621 3.90E-25 101.301 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#954 - CGI_10008185 superfamily 241609 379 440 6.30E-24 97.8339 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. 
They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#954 - CGI_10008185 superfamily 241609 288 361 1.67E-19 85.1223 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#954 - CGI_10008185 superfamily 245213 498 539 1.03E-05 44.1646 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#954 - CGI_10008185 superfamily 241574 872 1052 1.56E-21 94.9601 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. 
Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#954 - CGI_10008185 superfamily 241609 471 502 1.63E-07 50.3822 cl00100 KR superfamily N - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#954 - CGI_10008185 superfamily 241609 635 665 0.000187189 40.8366 cl00100 KR superfamily C - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#956 - CGI_10008187 superfamily 241609 170 238 1.06E-19 80.4999 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#956 - CGI_10008187 superfamily 241609 83 156 2.20E-17 73.9638 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. 
They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#956 - CGI_10008187 superfamily 241609 6 37 4.82E-07 45.3746 cl00100 KR superfamily N - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#956 - CGI_10008187 superfamily 245213 44 74 0.00403367 33.7642 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#957 - CGI_10008188 superfamily 241609 261 336 3.63E-32 116.323 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." 
Q#957 - CGI_10008188 superfamily 243051 58 187 1.13E-20 86.2777 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal cord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#957 - CGI_10008188 superfamily 241609 193 262 1.98E-15 70.1118 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#958 - CGI_10008189 superfamily 241571 373 499 1.93E-12 63.9706 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#958 - CGI_10008189 superfamily 245213 335 365 0.00619364 34.9198 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#958 - CGI_10008189 superfamily 241583 149 292 4.95E-40 143.481 cl00064 ZnMc superfamily C - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#959 - CGI_10008190 superfamily 241609 816 891 5.46E-27 107.079 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#959 - CGI_10008190 superfamily 241609 1008 1081 1.44E-24 100.145 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#959 - CGI_10008190 superfamily 243051 578 732 2.75E-22 95.5225 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. 
MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal cord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#959 - CGI_10008190 superfamily 241609 1095 1163 1.01E-21 91.6707 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#959 - CGI_10008190 superfamily 241609 744 812 3.29E-18 81.6555 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." 
Q#959 - CGI_10008190 superfamily 241571 402 528 6.59E-10 58.1926 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#959 - CGI_10008190 superfamily 245213 959 1000 8.02E-05 41.8534 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#959 - CGI_10008190 superfamily 241583 176 352 6.00E-40 147.333 cl00064 ZnMc superfamily - - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#959 - CGI_10008190 superfamily 241609 895 965 2.41E-14 70.497 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#960 - CGI_10008191 superfamily 246975 42 65 0.00304541 35.3063 cl15478 zf-C2H2 superfamily - - "Zinc finger, C2H2 type; The C2H2 zinc finger is the classical zinc finger domain. 
The two conserved cysteines and histidines co-ordinate a zinc ion. The following pattern describes the zinc finger. #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C] Where X can be any amino acid, and numbers in brackets indicate the number of residues. The positions marked # are those that are important for the stable fold of the zinc finger. The final position can be either his or cys. The C2H2 zinc finger is composed of two short beta strands followed by an alpha helix. The amino terminal part of the helix binds the major groove in DNA binding zinc fingers. The accepted consensus binding sequence for Sp1 is usually defined by the asymmetric hexanucleotide core GGGCGG but this sequence does not include, among others, the GAG (=CTC) repeat that constitutes a high-affinity site for Sp1 binding to the wt1 promoter." Q#961 - CGI_10000472 superfamily 247866 70 213 1.50E-29 111.776 cl17312 PhyH superfamily N - "Phytanoyl-CoA dioxygenase (PhyH); This family is made up of several eukaryotic phytanoyl-CoA dioxygenase (PhyH) proteins, ectoine hydroxylases and a number of bacterial deoxygenases. PhyH is a peroxisomal enzyme catalyzing the first step of phytanic acid alpha-oxidation. PhyH deficiency causes Refsum's disease (RD) which is an inherited neurological syndrome biochemically characterized by the accumulation of phytanic acid in plasma and tissues." Q#962 - CGI_10000165 superfamily 245201 1 118 1.15E-28 107.322 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." 
Q#963 - CGI_10000166 superfamily 246680 9 106 1.02E-23 93.2461 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#963 - CGI_10000166 superfamily 245201 267 320 5.68E-09 54.85 cl09925 PKc_like superfamily C - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." 
Q#964 - CGI_10001128 superfamily 247684 7 436 1.92E-91 289.562 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#967 - CGI_10006799 superfamily 246925 244 341 1.26E-05 46.1946 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. 
This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#967 - CGI_10006799 superfamily 246925 276 513 1.33E-05 46.1946 cl15309 LRR_RI superfamily - - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#967 - CGI_10006799 superfamily 214507 511 566 2.22E-05 42.4172 cl15307 LRRCT superfamily - - Leucine rich repeat C-terminal domain; Leucine rich repeat C-terminal domain. Q#970 - CGI_10006802 superfamily 215827 58 229 2.49E-36 134.133 cl02830 Tyrosinase superfamily - - Common central domain of tyrosinase; This family also contains polyphenol oxidases and some hemocyanins. Binds two copper ions via two sets of three histidines. This family is related to pfam00372. Q#972 - CGI_10006804 superfamily 220533 52 706 0 727.982 cl12375 Dpy19 superfamily - - "Q-cell neuroblast polarisation; Dpy-19, formerly known as DUF2211, is a transmembrane domain family that is required to orient the neuroblast cells, QR and QL accurately on the anterior-posterior axis: QL and QR are born in the same anterior-posterior position, but polarise and migrate left-right asymmetrically, QL migrating towards the posterior and QR migrating towards the anterior. It is also required, with unc-40, to express mab-5 correctly in the Q cell descendants. The Dpy-19 protein derives from the C. elegans DUMPY mutant." 
Q#976 - CGI_10006808 superfamily 247856 114 161 0.0069236 32.9049 cl17302 EFh superfamily C - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#978 - CGI_10000646 superfamily 242849 33 106 6.14E-27 96.504 cl02041 Cyt-b5 superfamily - - Cytochrome b5-like Heme/Steroid binding domain; This family includes heme binding domains from a diverse range of proteins. This family also includes proteins that bind to steroids. The family includes progesterone receptors. Many members of this subfamily are membrane anchored by an N-terminal transmembrane alpha helix. This family also includes a domain in some chitin synthases. There is no known ligand for this domain in the chitin synthases. Q#979 - CGI_10012480 superfamily 241645 9 85 5.45E-10 53.083 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." 
Q#979 - CGI_10012480 superfamily 243076 115 188 1.33E-06 44.13 cl02539 BAG superfamily - - BAG domain; Domain present in Hsp70 regulators. Q#980 - CGI_10012481 superfamily 241645 9 81 4.79E-09 48.8458 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#981 - CGI_10012482 superfamily 241554 8 142 9.85E-34 128.917 cl00019 Macro superfamily - - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." 
Q#981 - CGI_10012482 superfamily 241554 231 357 3.60E-28 112.739 cl00019 Macro superfamily - - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#981 - CGI_10012482 superfamily 241554 797 912 1.23E-26 108.117 cl00019 Macro superfamily - - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#981 - CGI_10012482 superfamily 241554 936 1063 1.55E-21 93.4791 cl00019 Macro superfamily - - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. 
Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#981 - CGI_10012482 superfamily 241752 1428 1551 2.47E-15 75.0485 cl00283 ADP_ribosyl superfamily - - "ADP_ribosylating enzymes catalyze the transfer of ADP_ribose from NAD+ to substrates. Bacterial toxins are cytoplasmic and catalyze the transfer of a single ADP_ribose unit to eukaryotic elongation factor 2, halting protein synthesis and killing the cell. Poly(ADP-ribose) polymerases (PARPS 1-3, VPARP, tankyrase) catalyze the addition of up to 100 ADP_ribose units from NAD+. PARPs 1 and 2 are localized in the nucleus, bind DNA, and are activated by DNA damage. VPARP is part of the vault ribonucleoprotein complex. Tankyrases regulate telomere length in part through poly(ADP_ribosylation) of telomere repeat binding factor 1 (TRF1). Poly(ADP-ribose) polymerase catalyses the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active." Q#981 - CGI_10012482 superfamily 241554 1119 1222 1.42E-13 69.9819 cl00019 Macro superfamily - - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. 
Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#981 - CGI_10012482 superfamily 241752 660 775 8.23E-08 53.1031 cl00283 ADP_ribosyl superfamily N - "ADP_ribosylating enzymes catalyze the transfer of ADP_ribose from NAD+ to substrates. Bacterial toxins are cytoplasmic and catalyze the transfer of a single ADP_ribose unit to eukaryotic elongation factor 2, halting protein synthesis and killing the cell. Poly(ADP-ribose) polymerases (PARPS 1-3, VPARP, tankyrase) catalyze the addition of up to 100 ADP_ribose units from NAD+. PARPs 1 and 2 are localized in the nucleus, bind DNA, and are activated by DNA damage. VPARP is part of the vault ribonucleoprotein complex. Tankyrases regulate telomere length in part through poly(ADP_ribosylation) of telomere repeat binding factor 1 (TRF1). Poly(ADP-ribose) polymerase catalyses the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active." Q#982 - CGI_10012483 superfamily 241554 232 357 2.16E-28 112.354 cl00019 Macro superfamily - - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. 
Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#982 - CGI_10012483 superfamily 241554 411 551 8.72E-27 107.731 cl00019 Macro superfamily - - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#982 - CGI_10012483 superfamily 241752 771 897 7.20E-17 78.5153 cl00283 ADP_ribosyl superfamily - - "ADP_ribosylating enzymes catalyze the transfer of ADP_ribose from NAD+ to substrates. Bacterial toxins are cytoplasmic and catalyze the transfer of a single ADP_ribose unit to eukaryotic elongation factor 2, halting protein synthesis and killing the cell. Poly(ADP-ribose) polymerases (PARPS 1-3, VPARP, tankyrase) catalyze the addition of up to 100 ADP_ribose units from NAD+. PARPs 1 and 2 are localized in the nucleus, bind DNA, and are activated by DNA damage. VPARP is part of the vault ribonucleoprotein complex. Tankyrases regulate telomere length in part through poly(ADP_ribosylation) of telomere repeat binding factor 1 (TRF1). 
Poly(ADP-ribose) polymerase catalyses the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active." Q#982 - CGI_10012483 superfamily 241554 618 721 2.23E-13 68.8263 cl00019 Macro superfamily - - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#982 - CGI_10012483 superfamily 241752 907 1004 1.33E-09 56.9441 cl00283 ADP_ribosyl superfamily N - "ADP_ribosylating enzymes catalyze the transfer of ADP_ribose from NAD+ to substrates. Bacterial toxins are cytoplasmic and catalyze the transfer of a single ADP_ribose unit to eukaryotic elongation factor 2, halting protein synthesis and killing the cell. Poly(ADP-ribose) polymerases (PARPS 1-3, VPARP, tankyrase) catalyze the addition of up to 100 ADP_ribose units from NAD+. PARPs 1 and 2 are localized in the nucleus, bind DNA, and are activated by DNA damage. VPARP is part of the vault ribonucleoprotein complex. Tankyrases regulate telomere length in part through poly(ADP_ribosylation) of telomere repeat binding factor 1 (TRF1). 
Poly(ADP-ribose) polymerase catalyses the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active." Q#982 - CGI_10012483 superfamily 222429 14 91 1.72E-06 46.85 cl18676 Myb_DNA-bind_5 superfamily - - Myb/SANT-like DNA-binding domain; This presumed domain appears to be related to other Myb/SANT like DNA binding domains. This family is greatly expanded in arthropods and higher eukaryotes. Q#983 - CGI_10012484 superfamily 241554 60 149 2.39E-16 71.1375 cl00019 Macro superfamily C - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#984 - CGI_10012485 superfamily 241884 762 1006 5.32E-115 356.198 cl00467 Ntn_hydrolase superfamily - - "The Ntn hydrolases (N-terminal nucleophile) are a diverse superfamily of enzymes that are activated autocatalytically via an N-terminally located nucleophilic amino acid. N-terminal nucleophile (NTN-) hydrolase superfamily, which contains a four-layered alpha, beta, beta, alpha core structure. This family of hydrolases includes penicillin acylase, the 20S proteasome alpha and beta subunits, and glutamate synthase. 
The mechanism of activation of these proteins is conserved, although they differ in their substrate specificities. All known members catalyze the hydrolysis of amide bonds in either proteins or small molecules, and each one of them is synthesized as a preprotein. For each, an autocatalytic endoproteolytic process generates a new N-terminal residue. This mature N-terminal residue is central to catalysis and acts as both a polarizing base and a nucleophile during the reaction. The N-terminal amino group acts as the proton acceptor and activates either the nucleophilic hydroxyl in a Ser or Thr residue or the nucleophilic thiol in a Cys residue. The position of the N-terminal nucleophile in the active site and the mechanism of catalysis are conserved in this family, despite considerable variation in the protein sequences." Q#984 - CGI_10012485 superfamily 241554 206 351 1.09E-30 118.902 cl00019 Macro superfamily - - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#984 - CGI_10012485 superfamily 241554 55 195 8.69E-27 107.731 cl00019 Macro superfamily - - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. 
Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#984 - CGI_10012485 superfamily 241752 543 668 1.87E-21 91.9973 cl00283 ADP_ribosyl superfamily - - "ADP_ribosylating enzymes catalyze the transfer of ADP_ribose from NAD+ to substrates. Bacterial toxins are cytoplasmic and catalyze the transfer of a single ADP_ribose unit to eukaryotic elongation factor 2, halting protein synthesis and killing the cell. Poly(ADP-ribose) polymerases (PARPS 1-3, VPARP, tankyrase) catalyze the addition of up to 100 ADP_ribose units from NAD+. PARPs 1 and 2 are localized in the nucleus, bind DNA, and are activated by DNA damage. VPARP is part of the vault ribonucleoprotein complex. Tankyrases regulate telomere length in part through poly(ADP_ribosylation) of telomere repeat binding factor 1 (TRF1). Poly(ADP-ribose) polymerase catalyses the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active." Q#985 - CGI_10012486 superfamily 241570 304 412 1.52E-17 80.0626 cl00047 CAP_ED superfamily - - "effector domain of the CAP family of transcription factors; members include CAP (or cAMP receptor protein (CRP)), which binds cAMP, FNR (fumarate and nitrate reduction), which uses an iron-sulfur cluster to sense oxygen) and CooA, a heme containing CO sensor. 
In all cases binding of the effector leads to conformational changes and the ability to activate transcription. Cyclic nucleotide-binding domain similar to CAP are also present in cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) and vertebrate cyclic nucleotide-gated ion-channels. Cyclic nucleotide-monophosphate binding domain; proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues; the best studied is the prokaryotic catabolite gene activator, CAP, where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure; three conserved glycine residues are thought to be essential for maintenance of the structural integrity of the beta-barrel; CooA is a homodimeric transcription factor that belongs to CAP family; cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain; cAPK's are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain; cGPK's are single chain enzymes that include the two copies of the domain in their N-terminal section; also found in vertebrate cyclic nucleotide-gated ion-channels" Q#986 - CGI_10012487 superfamily 220692 137 260 0.000909103 39.8801 cl18570 7TM_GPCR_Srw superfamily C - Serpentine type 7TM GPCR chemoreceptor Srw; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srw is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. The genes encoding Srw do not appear to be under as strong an adaptive evolutionary pressure as those of Srz. 
Q#987 - CGI_10012488 superfamily 243092 179 473 4.69E-82 258.804 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#989 - CGI_10012490 superfamily 241874 207 806 0 657.599 cl00456 SLC5-6-like_sbd superfamily - - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. 
SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#990 - CGI_10012491 superfamily 247684 16 442 5.60E-94 296.496 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. 
The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#991 - CGI_10012492 superfamily 247684 34 453 1.01E-99 311.904 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#992 - CGI_10012493 superfamily 204202 63 95 0.000252723 34.9237 cl07827 Vps4_C superfamily N - Vps4 C terminal oligomerisation domain; This domain is found at the C terminal of ATPase proteins involved in vacuolar sorting. It forms an alpha helix structure and is required for oligomerisation. 
Q#1001 - CGI_10005913 superfamily 241593 29 148 1.85E-22 95.4316 cl00075 HATPase_c superfamily - - "Histidine kinase-like ATPases; This family includes several ATP-binding proteins for example: histidine kinase, DNA gyrase B, topoisomerases, heat shock protein HSP90, phytochrome-like ATPases and DNA mismatch repair proteins" Q#1001 - CGI_10005913 superfamily 219431 506 549 1.67E-14 70.1524 cl06504 zf-CW superfamily - - "CW-type Zinc Finger; This domain appears to be a zinc finger. The alignment shows four conserved cysteine residues and a conserved tryptophan. It was first identified by, and is predicted to be a "highly specialised mononuclear four-cysteine zinc finger...that plays a role in DNA binding and/or promoting protein-protein interactions in complicated eukaryotic processes including...chromatin methylation status and early embryonic development." Weak homology to pfam00628 further evidences these predictions (personal obs: C Yeats). Twelve different CW-domain-containing protein subfamilies are described, with different subfamilies being characteristic of vertebrates, higher plants and other animals in which this domain is found." Q#1002 - CGI_10005914 superfamily 241636 76 261 1.54E-126 369.609 cl00145 TBOX superfamily - - "T-box DNA binding domain of the T-box family of transcriptional regulators. The T-box family is an ancient group that appears to play a critical role in development in all animal species. These genes were uncovered on the basis of similarity to the DNA binding domain of murine Brachyury (T) gene product, the defining feature of the family. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development and conserved expression patterns, most of the known genes in all species being expressed in mesoderm or mesoderm precursors." 
Q#1003 - CGI_10005915 superfamily 247724 37 187 1.15E-40 142.215 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#1003 - CGI_10005915 superfamily 247724 277 301 0.000564519 38.9817 cl17170 Ras_like_GTPase superfamily N - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#1006 - CGI_10005918 superfamily 247684 17 448 5.30E-107 331.164 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. 
The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#1007 - CGI_10017521 superfamily 245201 695 1068 0 634.54 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#1008 - CGI_10017522 superfamily 148406 25 138 1.25E-07 47.4752 cl06034 UPF0240 superfamily N - Uncharacterized protein family (UPF0240); Uncharacterized protein family (UPF0240). Q#1013 - CGI_10017528 superfamily 243034 262 371 1.50E-16 75.1091 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins 
include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#1013 - CGI_10017528 superfamily 215821 29 121 9.70E-38 133.52 cl18346 FKBP_C superfamily - - FKBP-type peptidyl-prolyl cis-trans isomerase; FKBP-type peptidyl-prolyl cis-trans isomerase. Q#1013 - CGI_10017528 superfamily 215821 146 236 3.73E-22 90.3774 cl18346 FKBP_C superfamily - - FKBP-type peptidyl-prolyl cis-trans isomerase; FKBP-type peptidyl-prolyl cis-trans isomerase. Q#1016 - CGI_10017531 superfamily 243058 102 209 1.00E-08 51.9315 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#1016 - CGI_10017531 superfamily 243058 224 311 2.02E-08 51.1611 cl02500 ARM superfamily C - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. 
ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#1016 - CGI_10017531 superfamily 243058 45 137 0.000114371 39.9904 cl02500 ARM superfamily N - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#1017 - CGI_10017532 superfamily 242611 255 445 6.89E-88 268.412 cl01629 TPP_enzymes superfamily - - "Thiamine pyrophosphate (TPP) enzyme family, TPP-binding module; found in many key metabolic enzymes which use TPP (also known as thiamine diphosphate) as a cofactor. These enzymes include, among others, the E1 components of the pyruvate, the acetoin and the branched chain alpha-keto acid dehydrogenase complexes." Q#1017 - CGI_10017532 superfamily 245606 73 189 6.46E-20 86.0463 cl11410 TPP_enzyme_PYR superfamily C - "Pyrimidine (PYR) binding domain of thiamine pyrophosphate (TPP)-dependent enzymes; Thiamine pyrophosphate (TPP) family, pyrimidine (PYR) binding domain; found in many key metabolic enzymes which use TPP (also known as thiamine diphosphate) as a cofactor. TPP binds in the cleft formed by a PYR domain and a PP domain. The PYR domain, binds the aminopyrimidine ring of TPP, the PP domain binds the diphosphate residue. 
A polar interaction between the conserved glutamate of the PYR domain and the N1' of the TPP aminopyrimidine ring is shared by most TPP-dependent enzymes, and participates in the activation of TPP. The PYR and PP domains have a common fold, but do not share strong sequence conservation. The PP domain is not included in this group. Most TPP-dependent enzymes have the PYR and PP domains on the same subunit although these domains can be alternatively arranged in the primary structure. In the case of 2-oxoisovalerate dehydrogenase (2OXO), sulfopyruvate decarboxylase (ComDE), and the E1 component of human pyruvate dehydrogenase complex (E1- PDHc) the PYR and PP domains appear on different subunits. TPP-dependent enzymes are multisubunit proteins, the smallest catalytic unit being a dimer-of-active sites. For many of these enzymes the active sites lie between PP and PYR domains on different subunits. However, for the homodimeric enzymes 1-deoxy-D-xylulose 5-phosphate synthase (DXS) and Desulfovibrio africanus pyruvate:ferredoxin oxidoreductase (PFOR), each active site lies at the interface of the PYR and PP domains from the same subunit." Q#1018 - CGI_10017533 superfamily 218118 66 133 4.63E-08 46.4533 cl04552 CD225 superfamily - - "Interferon-induced transmembrane protein; This family includes the human leukocyte antigen CD225, which is an interferon inducible transmembrane protein, and is associated with interferon induced cell growth suppression." Q#1019 - CGI_10017534 superfamily 204301 7 1185 0 1415.27 cl14974 Nckap1 superfamily - - Membrane-associated apoptosis protein; Expression of this protein was found to be markedly reduced in patients with Alzheimer's disease. It is involved in the regulation of actin polymerisation in the brain as part of a WAVE2 signalling complex. Q#1020 - CGI_10017535 superfamily 243077 54 106 4.46E-16 71.4225 cl02542 DnaJ superfamily - - "DnaJ domain or J-domain. 
DnaJ/Hsp40 (heat shock protein 40) proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperonine family. Hsp40 proteins are characterized by the presence of a J domain, which mediates the interaction with Hsp70. They may contain other domains as well, and the architectures provide a means of classification." Q#1021 - CGI_10017536 superfamily 216112 483 747 5.06E-60 207.149 cl02964 RNB superfamily - - RNB domain; This domain is the catalytic domain of ribonuclease II. Q#1022 - CGI_10017537 superfamily 221913 406 615 5.12E-56 189.674 cl18626 AAA_12 superfamily - - AAA domain; This family of domains contain a P-loop motif that is characteristic of the AAA superfamily. Many of the proteins in this family are conjugative transfer proteins. Q#1022 - CGI_10017537 superfamily 222258 358 395 2.25E-05 44.4812 cl18656 AAA_30 superfamily NC - AAA domain; This family of domains contain a P-loop motif that is characteristic of the AAA superfamily. Many of the proteins in this family are conjugative transfer proteins. There is a Walker A and Walker B. Q#1023 - CGI_10017538 superfamily 245213 226 260 5.38E-07 49.1722 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#1023 - CGI_10017538 superfamily 245213 300 334 1.64E-05 44.935 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#1023 - CGI_10017538 superfamily 245213 1805 1846 5.67E-05 43.3942 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#1023 - CGI_10017538 superfamily 245213 2296 2329 8.82E-05 42.6238 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#1023 - CGI_10017538 superfamily 245213 1765 1796 0.00112031 39.5422 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#1023 - CGI_10017538 superfamily 243065 1851 2002 1.07E-28 115.962 cl02516 VWD superfamily - - von Willebrand factor type D domain; Luciferin-2-monooxygenase from Vargula hilgendorfii contains a vwd domain. Its function is unrelated but the similarity is very strong by several methods. Q#1023 - CGI_10017538 superfamily 246918 2511 2560 3.53E-07 50.2779 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#1023 - CGI_10017538 superfamily 241600 1515 1551 7.42E-06 48.0055 cl00085 FReD superfamily C - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." 
Q#1023 - CGI_10017538 superfamily 205157 145 171 5.24E-05 43.2951 cl18264 EGF_3 superfamily - - EGF domain; This family includes a variety of EGF-like domain homologues. This family includes the C-terminal domain of the malaria parasite MSP1 protein. Q#1023 - CGI_10017538 superfamily 245213 1715 1747 0.000259071 41.5656 cl09941 EGF_CA superfamily C - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#1023 - CGI_10017538 superfamily 241607 2384 2426 0.00265321 38.4294 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chyomotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. 
Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#1023 - CGI_10017538 superfamily 241600 343 385 0.00307478 40.3015 cl00085 FReD superfamily C - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#1023 - CGI_10017538 superfamily 248288 2655 2713 0.0036237 38.1534 cl17734 DAN superfamily - - "DAN domain; This domain contains 9 conserved cysteines and is extracellular. Therefore the cysteines may form disulphide bridges. This family of proteins has been termed the DAN family after the first member to be reported. This family includes DAN, Cerberus and Gremlin. The gremlin protein is an antagonist of bone morphogenetic protein signaling. It is postulated that all members of this family antagonise different TGF beta pfam00019 ligands. 
Recent work shows that the DAN protein is not an efficient antagonist of BMP-2/4 class signals, we found that DAN was able to interact with GDF-5 in a frog embryo assay, suggesting that DAN may regulate signaling by the GDF-5/6/7 class of BMPs in vivo." Q#1025 - CGI_10017540 superfamily 246751 52 298 3.11E-76 238.682 cl14883 Lipase superfamily - - "Lipase. Lipases are esterases that can hydrolyze long-chain acyl-triglycerides into di- and monoglycerides, glycerol, and free fatty acids at a water/lipid interface. A typical feature of lipases is "interfacial activation", the process of becoming active at the lipid/water interface, although several examples of lipases have been identified that do not undergo interfacial activation . The active site of a lipase contains a catalytic triad consisting of Ser - His - Asp/Glu, but unlike most serine proteases, the active site is buried inside the structure. A "lid" or "flap" covers the active site, making it inaccessible to solvent and substrates. The lid opens during the process of interfacial activation, allowing the lipid substrate access to the active site." Q#1028 - CGI_10017543 superfamily 245213 83 107 0.00689835 34.1494 cl09941 EGF_CA superfamily N - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#1032 - CGI_10017547 superfamily 246612 72 135 2.56E-07 49.4309 cl14057 BPL_LplA_LipB superfamily NC - "Biotin/lipoate A/B protein ligase family; This family includes biotin protein ligase, lipoate-protein ligase A and B. 
Biotin is covalently attached at the active site of certain enzymes that transfer carbon dioxide from bicarbonate to organic acids to form cellular metabolites. Biotin protein ligase (BPL) is the enzyme responsible for attaching biotin to a specific lysine at the active site of biotin enzymes. Each organism probably has only one BPL. Biotin attachment is a two step reaction that results in the formation of an amide linkage between the carboxyl group of biotin and the epsilon-amino group of the modified lysine. Lipoate-protein ligase A (LPLA) catalyzes the formation of an amide linkage between lipoic acid and a specific lysine residue in lipoate dependent enzymes. The unusual biosynthesis pathway of lipoic acid is mechanistically intertwined with attachment of the cofactor." Q#1034 - CGI_10017549 superfamily 216686 80 256 5.22E-45 153.633 cl18377 Galactosyl_T superfamily - - "Galactosyltransferase; This family includes the galactosyltransferases UDP-galactose:2-acetamido-2-deoxy-D-glucose3beta-galactosyltransferase and UDP-Gal:beta-GlcNAc beta 1,3-galactosyltranferase. Specific galactosyltransferases transfer galactose to GlcNAc terminal chains in the synthesis of the lacto-series oligosaccharides types 1 and 2." Q#1036 - CGI_10017551 superfamily 241600 30 79 1.81E-05 40.7234 cl00085 FReD superfamily NC - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. 
C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#1037 - CGI_10004351 superfamily 247805 18 217 4.41E-86 264.732 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#1037 - CGI_10004351 superfamily 247905 232 362 3.23E-33 122.346 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#1040 - CGI_10009521 superfamily 247789 161 252 0.00152795 37.6234 cl17235 ABC2_membrane superfamily N - ABC-2 type transporter; ABC-2 type transporter. Q#1041 - CGI_10013161 superfamily 217473 175 341 1.00E-26 110.532 cl03978 Mab-21 superfamily N - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. 
Q#1041 - CGI_10013161 superfamily 241750 408 426 0.00865753 37.4638 cl00281 metallo-dependent_hydrolases superfamily NC - "Superfamily of metallo-dependent hydrolases (also called amidohydrolase superfamily) is a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. The family includes urease alpha, adenosine deaminase, phosphotriesterase dihydroorotases, allantoinases, hydantoinases, AMP-, adenine and cytosine deaminases, imidazolonepropionase, aryldialkylphosphatase, chlorohydrolases, formylmethanofuran dehydrogenases and others." Q#1049 - CGI_10013169 superfamily 141815 177 291 1.18E-21 91.6564 cl04275 Mtc superfamily N - Tricarboxylate carrier; Tricarboxylate carrier. Q#1050 - CGI_10013170 superfamily 243084 73 179 9.28E-63 209.585 cl02556 Bromodomain superfamily - - Bromodomain. Bromodomains are found in many chromatin-associated proteins and in nuclear histone acetyltransferases. They interact specifically with acetylated lysine. Q#1050 - CGI_10013170 superfamily 243084 364 448 1.96E-48 168.996 cl02556 Bromodomain superfamily - - Bromodomain. Bromodomains are found in many chromatin-associated proteins and in nuclear histone acetyltransferases. They interact specifically with acetylated lysine. Q#1051 - CGI_10013171 superfamily 216363 297 375 0.000304174 38.9906 cl08312 UPF0029 superfamily C - Uncharacterized protein family UPF0029; Uncharacterized protein family UPF0029. 
Q#1052 - CGI_10013172 superfamily 241563 196 233 5.13E-05 39.3848 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#1053 - CGI_10013173 superfamily 243092 145 275 0.000765218 39.6256 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#1055 - CGI_10013175 superfamily 217293 525 726 1.88E-37 141.231 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. 
Q#1055 - CGI_10013175 superfamily 241563 8 42 2.68E-06 45.9332 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#1056 - CGI_10013176 superfamily 247044 39 151 2.92E-60 193.208 cl15697 ADF_gelsolin superfamily - - Actin depolymerization factor/cofilin- and gelsolin-like domains; Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. Q#1056 - CGI_10013176 superfamily 247044 166 249 6.54E-31 113.872 cl15697 ADF_gelsolin superfamily - - Actin depolymerization factor/cofilin- and gelsolin-like domains; Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. Q#1056 - CGI_10013176 superfamily 247044 315 390 5.11E-18 78.8304 cl15697 ADF_gelsolin superfamily C - Actin depolymerization factor/cofilin- and gelsolin-like domains; Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. Q#1059 - CGI_10005505 superfamily 216363 47 97 0.000763328 34.3682 cl08312 UPF0029 superfamily C - Uncharacterized protein family UPF0029; Uncharacterized protein family UPF0029. 
Q#1064 - CGI_10005510 superfamily 243035 78 186 5.62E-21 84.2085 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#1065 - CGI_10005511 superfamily 243035 5 100 1.88E-15 69.5709 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. 
CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#1065 - CGI_10005511 superfamily 243035 110 171 1.13E-09 53.0687 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. 
Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. 
A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#1066 - CGI_10005979 superfamily 247743 163 275 0.000147255 41.8624 cl17189 AAA superfamily N - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#1067 - CGI_10005980 superfamily 241758 825 1004 5.89E-56 192.847 cl00292 AANH_like superfamily - - "Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases, ATP sulphurylases Universal Stress Response protein and electron transfer flavoprotein (ETF). The domain forms a apha/beta/apha fold which binds to Adenosine nucleotide." Q#1067 - CGI_10005980 superfamily 241782 111 446 5.91E-37 143.928 cl00321 AAT_I superfamily - - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. 
PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phosphorylase family (fold type V)." Q#1067 - CGI_10005980 superfamily 246680 453 528 0.00498247 36.713 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 
They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#1068 - CGI_10005981 superfamily 247727 57 139 5.44E-13 66.2994 cl17173 AdoMet_MTases superfamily C - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#1068 - CGI_10005981 superfamily 247727 549 625 9.61E-05 41.2615 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#1069 - CGI_10005982 superfamily 243146 57 103 9.65E-11 53.4342 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#1069 - CGI_10005982 superfamily 243146 33 68 3.20E-06 41.0047 cl02701 Kelch_3 superfamily N - "Galactose oxidase, central domain; Galactose oxidase, central domain. 
" Q#1072 - CGI_10005985 superfamily 219080 460 581 2.01E-25 102.034 cl05851 DUF1115 superfamily - - Protein of unknown function (DUF1115); This family represents the C-terminus of hypothetical eukaryotic proteins of unknown function. Q#1072 - CGI_10005985 superfamily 243141 307 430 2.74E-12 63.8746 cl02687 RWD superfamily - - "RWD domain; This domain was identified in WD40 repeat proteins and Ring finger domain proteins. The function of this domain is unknown. GCN2 is the alpha-subunit of the only translation initiation factor (eIF2 alpha) kinase that appears in all eukaryotes. Its function requires an interaction with GCN1 via the domain at its N-terminus, which is termed the RWD domain after three major RWD-containing proteins: RING finger-containing proteins, WD-repeat-containing proteins, and yeast DEAD (DEXD)-like helicases. The structure forms an alpha + beta sandwich fold consisting of two layers: a four-stranded antiparallel beta-sheet, and three side-by-side alpha-helices." Q#1074 - CGI_10017767 superfamily 241810 73 228 5.38E-84 249.387 cl00354 KOW superfamily - - "KOW: an acronym for the authors' surnames (Kyrpides, Ouzounis and Woese); KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. The KOW motif contains an invariant glycine residue and comprises alternating blocks of hydrophilic and hydrophobic residues." Q#1075 - CGI_10017768 superfamily 150884 9 45 1.16E-09 51.3665 cl10958 Med19 superfamily NC - Mediator of RNA pol II transcription subunit 19; Med19 represents a family of conserved proteins which are members of the multi-protein co-activator Mediator complex. Mediator is required for activation of RNA polymerase II transcription by DNA binding transactivators. 
Q#1077 - CGI_10017770 superfamily 247745 159 434 2.94E-33 130.13 cl17191 GH38-57_N_LamB_YdjC_SF superfamily - - "Catalytic domain of glycoside hydrolase (GH) families 38 and 57, lactam utilization protein LamB/YcsF family proteins, YdjC-family proteins, and similar proteins; The superfamily possesses strong sequence similarities across a wide range of all three kingdoms of life. It mainly includes four families, glycoside hydrolases family 38 (GH38), heat stable retaining glycoside hydrolases family 57 (GH57), lactam utilization protein LamB/YcsF family, and YdjC-family. The GH38 family corresponds to class II alpha-mannosidases (alphaMII, EC 3.2.1.24), which contain intermediate Golgi alpha-mannosidases II, acidic lysosomal alpha-mannosidases, animal sperm and epididymal alpha -mannosidases, neutral ER/cytosolic alpha-mannosidases, and some putative prokaryotic alpha-mannosidases. AlphaMII possess a-1,3, a-1,6, and a-1,2 hydrolytic activity, and catalyzes the degradation of N-linked oligosaccharides by employing a two-step mechanism involving the formation of a covalent glycosyl enzyme complex. GH57 is a purely prokaryotic family with the majority of thermostable enzymes from extremophiles (many of them are archaeal hyperthermophiles), which exhibit the enzyme specificities of alpha-amylase (EC 3.2.1.1), 4-alpha-glucanotransferase (EC 2.4.1.25), amylopullulanase (EC 3.2.1.1/41), and alpha-galactosidase (EC 3.2.1.22). This family also includes many hypothetical proteins with uncharacterized activity and specificity. GH57 cleaves alpha-glycosidic bond by employing a retaining mechanism, which involves a glycosyl-enzyme intermediate, allowing transglycosylation. Although the exact molecular function of LamB/YcsF family and YdjC-family remains unclear, they show high sequence and structure homology to the members of GH38 and GH57. 
Their catalytic domains adopt a similar parallel 7-stranded beta/alpha barrel, which is remotely related to catalytic NodB homology domain of the carbohydrate esterase 4 superfamily." Q#1077 - CGI_10017770 superfamily 191851 895 1023 5.32E-32 124.279 cl06708 DUF1640 superfamily - - Protein of unknown function (DUF1640); This family consists of sequences derived from hypothetical eukaryotic proteins. A region approximately 100 residues in length is featured. Q#1078 - CGI_10017771 superfamily 241644 14 134 7.51E-55 171.231 cl00154 UBCc superfamily - - "Ubiquitin-conjugating enzyme E2, catalytic (UBCc) domain. This is part of the ubiquitin-mediated protein degradation pathway in which a thiol-ester linkage forms between a conserved cysteine and the C-terminus of ubiquitin and complexes with ubiquitin protein ligase enzymes, E3. This pathway regulates many fundamental cellular processes. There are also other E2s which form thiol-ester linkages without the use of E3s as well as several UBC homologs (TSG101, Mms2, Croc-1 and similar proteins) which lack the active site cysteine essential for ubiquitination and appear to function in DNA repair pathways which were omitted from the scope of this CD." Q#1079 - CGI_10017772 superfamily 247805 404 543 1.35E-23 99.7191 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#1079 - CGI_10017772 superfamily 241575 3 69 1.06E-11 62.6751 cl00054 DSRM superfamily - - "Double-stranded RNA binding motif. Binding is not sequence specific but is highly specific for double stranded RNA. Found in a variety of proteins including dsRNA dependent protein kinase PKR, RNA helicases, Drosophila staufen protein, E. coli RNase III, RNases H1, and dsRNA dependent adenosine deaminases." 
Q#1079 - CGI_10017772 superfamily 247905 635 776 3.70E-10 59.5589 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#1079 - CGI_10017772 superfamily 241575 177 247 5.30E-07 48.8079 cl00054 DSRM superfamily - - "Double-stranded RNA binding motif. Binding is not sequence specific but is highly specific for double stranded RNA. Found in a variety of proteins including dsRNA dependent protein kinase PKR, RNA helicases, Drosophila staufen protein, E. coli RNase III, RNases H1, and dsRNA dependent adenosine deaminases." Q#1079 - CGI_10017772 superfamily 243778 831 921 7.75E-29 113.088 cl04503 HA2 superfamily - - "Helicase associated domain (HA2); This presumed domain is about 90 amino acid residues in length. It is found in a diverse set of RNA helicases. Its function is unknown, however it seems likely to be involved in nucleic acid binding." Q#1079 - CGI_10017772 superfamily 219532 960 1071 5.39E-22 93.9182 cl06657 OB_NTP_bind superfamily - - "Oligonucleotide/oligosaccharide-binding (OB)-fold; This family is found towards the C-terminus of the DEAD-box helicases (pfam00270). In these helicases it is apparently always found in association with pfam04408. There do seem to be a couple of instances where it occurs by itself. The structure PDB:3i4u adopts an OB-fold. 
This C-terminal domain of the yeast helicase contains an oligonucleotide/oligosaccharide-binding (OB)-fold which seems to be placed at the entrance of the putative nucleic acid cavity. It also constitutes the binding site for the G-patch-containing domain of Pfa1p. When found on DEAH/RHA helicases, this domain is central to the regulation of the helicase activity through its binding of both RNA and G-patch domain proteins." Q#1080 - CGI_10017773 superfamily 247787 792 1035 3.25E-81 268.683 cl17233 RecA-like_NTPases superfamily - - "RecA-like NTPases. This family includes the NTP binding domain of F1 and V1 H+ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. This group also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion." Q#1080 - CGI_10017773 superfamily 247723 1371 1453 3.93E-41 148.336 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." 
Q#1080 - CGI_10017773 superfamily 243066 558 652 6.00E-17 79.1985 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#1080 - CGI_10017773 superfamily 243066 343 469 3.62E-11 62.2497 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." 
Q#1080 - CGI_10017773 superfamily 243146 30 68 5.03E-07 48.8118 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#1080 - CGI_10017773 superfamily 243146 77 108 5.89E-06 45.6807 cl02701 Kelch_3 superfamily C - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#1080 - CGI_10017773 superfamily 243146 255 306 4.56E-05 43.0467 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#1080 - CGI_10017773 superfamily 198867 676 724 0.000697862 39.8344 cl06652 BACK superfamily C - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation)." Q#1080 - CGI_10017773 superfamily 243146 91 146 0.00184221 38.4243 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#1080 - CGI_10017773 superfamily 198867 472 530 0.00321034 37.7061 cl06652 BACK superfamily C - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation)." Q#1080 - CGI_10017773 superfamily 243146 200 254 0.004851 36.8835 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#1082 - CGI_10017775 superfamily 191444 2460 2524 6.45E-05 43.8521 cl05558 IL17 superfamily - - Interleukin-17; IL-17 is a potent proinflammatory cytokine produced by activated memory T cells. The IL-17 family is thought to represent a distinct signaling system that appears to have been highly conserved across vertebrate evolution. 
Q#1085 - CGI_10017778 superfamily 241750 84 385 4.44E-109 329.98 cl00281 metallo-dependent_hydrolases superfamily - - "Superfamily of metallo-dependent hydrolases (also called amidohydrolase superfamily) is a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. The family includes urease alpha, adenosine deaminase, phosphotriesterase dihydroorotases, allantoinases, hydantoinases, AMP-, adenine and cytosine deaminases, imidazolonepropionase, aryldialkylphosphatase, chlorohydrolases, formylmethanofuran dehydrogenases and others." Q#1085 - CGI_10017778 superfamily 241750 451 518 1.86E-27 111.552 cl00281 metallo-dependent_hydrolases superfamily N - "Superfamily of metallo-dependent hydrolases (also called amidohydrolase superfamily) is a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. The family includes urease alpha, adenosine deaminase, phosphotriesterase dihydroorotases, allantoinases, hydantoinases, AMP-, adenine and cytosine deaminases, imidazolonepropionase, aryldialkylphosphatase, chlorohydrolases, formylmethanofuran dehydrogenases and others." 
Q#1085 - CGI_10017778 superfamily 246664 418 451 1.46E-07 52.6974 cl14561 An_peroxidase_like superfamily C - "Animal heme peroxidases and related proteins; A diverse family of enzymes, which includes prostaglandin G/H synthase, thyroid peroxidase, myeloperoxidase, linoleate diol synthase, lactoperoxidase, peroxinectin, peroxidasin, and others. Despite its name, this family is not restricted to metazoans: members are found in fungi, plants, and bacteria as well." Q#1086 - CGI_10017779 superfamily 243072 151 265 7.40E-17 77.4238 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#1086 - CGI_10017779 superfamily 243072 34 200 2.16E-07 49.3042 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#1086 - CGI_10017779 superfamily 243072 378 524 0.00381718 36.5927 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#1088 - CGI_10017781 superfamily 241691 363 489 2.30E-05 43.6548 cl00213 DNA_BRE_C superfamily N - "DNA breaking-rejoining enzymes, C-terminal catalytic domain. The DNA breaking-rejoining enzyme superfamily includes type IB topoisomerases and tyrosine recombinases that share the same fold in their catalytic domain containing six conserved active site residues. The best-studied members of this diverse superfamily include human topoisomerase I, the bacteriophage lambda integrase, the bacteriophage P1 Cre recombinase, the yeast Flp recombinase and the bacterial XerD/C recombinases. Their overall reaction mechanism is essentially identical and involves cleavage of a single strand of a DNA duplex by nucleophilic attack of a conserved tyrosine to give a 3' phosphotyrosyl protein-DNA adduct. In the second rejoining step, a terminal 5' hydroxyl attacks the covalent adduct to release the enzyme and generate duplex DNA. The enzymes differ in that topoisomerases cleave and then rejoin the same 5' and 3' termini, whereas a site-specific recombinase transfers a 5' hydroxyl generated by recombinase cleavage to a new 3' phosphate partner located in a different duplex region. Many DNA breaking-rejoining enzymes also have N-terminal domains, which show little sequence or structure similarity." Q#1089 - CGI_10017782 superfamily 248458 82 466 6.44E-35 133.593 cl17904 MFS superfamily - - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. 
They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#1092 - CGI_10017785 superfamily 241546 4 111 2.10E-21 90.0908 cl00011 PLAT superfamily - - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats. The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#1092 - CGI_10017785 superfamily 215847 212 554 2.53E-70 240.426 cl09510 Lipoxygenase superfamily N - Lipoxygenase; Lipoxygenase. 
Q#1095 - CGI_10002601 superfamily 241750 503 699 4.54E-35 133.468 cl00281 metallo-dependent_hydrolases superfamily N - "Superfamily of metallo-dependent hydrolases (also called amidohydrolase superfamily) is a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. The family includes urease alpha, adenosine deaminase, phosphotriesterase dihydroorotases, allantoinases, hydantoinases, AMP-, adenine and cytosine deaminases, imidazolonepropionase, aryldialkylphosphatase, chlorohydrolases, formylmethanofuran dehydrogenases and others." Q#1096 - CGI_10023555 superfamily 247736 120 185 3.94E-08 48.0946 cl17182 NAT_SF superfamily - - "N-Acyltransferase superfamily: Various enzymes that characteristically catalyze the transfer of an acyl group to a substrate; NAT (N-Acyltransferase) is a large superfamily of enzymes that mostly catalyze the transfer of an acyl group to a substrate and are implicated in a variety of functions, ranging from bacterial antibiotic resistance to circadian rhythms in mammals. Members include GCN5-related N-Acetyltransferases (GNAT) such as Aminoglycoside N-acetyltransferases, Histone N-acetyltransferase (HAT) enzymes, and Serotonin N-acetyltransferase, which catalyze the transfer of an acetyl group to a substrate. The kinetic mechanism of most GNATs involves the ordered formation of a ternary complex: the reaction begins with Acetyl Coenzyme A (AcCoA) binding, followed by binding of substrate, then direct transfer of the acetyl group from AcCoA to the substrate, followed by product and subsequent CoA release. 
Other family members include Arginine/ornithine N-succinyltransferase, Myristoyl-CoA: protein N-myristoyltransferase, and Acyl-homoserinelactone synthase which have a similar catalytic mechanism but differ in types of acyl groups transferred. Leucyl/phenylalanyl-tRNA-protein transferase and FemXAB nonribosomal peptidyltransferases which catalyze similar peptidyltransferase reactions are also included." Q#1102 - CGI_10023561 superfamily 247736 157 196 0.000584851 36.9238 cl17182 NAT_SF superfamily C - "N-Acyltransferase superfamily: Various enzymes that characteristically catalyze the transfer of an acyl group to a substrate; NAT (N-Acyltransferase) is a large superfamily of enzymes that mostly catalyze the transfer of an acyl group to a substrate and are implicated in a variety of functions, ranging from bacterial antibiotic resistance to circadian rhythms in mammals. Members include GCN5-related N-Acetyltransferases (GNAT) such as Aminoglycoside N-acetyltransferases, Histone N-acetyltransferase (HAT) enzymes, and Serotonin N-acetyltransferase, which catalyze the transfer of an acetyl group to a substrate. The kinetic mechanism of most GNATs involves the ordered formation of a ternary complex: the reaction begins with Acetyl Coenzyme A (AcCoA) binding, followed by binding of substrate, then direct transfer of the acetyl group from AcCoA to the substrate, followed by product and subsequent CoA release. Other family members include Arginine/ornithine N-succinyltransferase, Myristoyl-CoA: protein N-myristoyltransferase, and Acyl-homoserinelactone synthase which have a similar catalytic mechanism but differ in types of acyl groups transferred. Leucyl/phenylalanyl-tRNA-protein transferase and FemXAB nonribosomal peptidyltransferases which catalyze similar peptidyltransferase reactions are also included." 
Q#1104 - CGI_10023563 superfamily 216566 978 1076 4.34E-08 53.3453 cl18370 Peptidase_M23 superfamily - - "Peptidase family M23; Members of this family are zinc metallopeptidases with a range of specificities. The peptidase family M23 is included in this family, these are Gly-Gly endopeptidases. Peptidase family M23 are also endopeptidases. This family also includes some bacterial lipoproteins such as Escherichia coli murein hydrolase activator NlpD, for which no proteolytic activity has been demonstrated. This family also includes leukocyte cell-derived chemotaxin 2 (LECT2) proteins. LECT2 is a liver-specific protein which is thought to be linked to hepatocyte growth although the exact function of this protein is unknown." Q#1105 - CGI_10023564 superfamily 207637 4395 4462 2.49E-07 51.7699 cl02541 CIDE_N superfamily - - "CIDE_N domain, found at the N-terminus of the CIDE (cell death-inducing DFF45-like effector) proteins, as well as CAD nuclease (caspase-activated DNase/DNA fragmentation factor, DFF40) and its inhibitor, ICAD(DFF45). These proteins are associated with the chromatin condensation and DNA fragmentation events of apoptosis; the CIDE_N domain is thought to regulate the activity of ICAD/DFF45, and the CAD/DFF40 and CIDE nucleases during apoptosis. The CIDE-N domain is also found in the FSP27/CIDE-C protein." Q#1106 - CGI_10023565 superfamily 242406 137 240 1.86E-10 57.2161 cl01271 DUF1768 superfamily N - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. 
Q#1108 - CGI_10023567 superfamily 241737 7 179 3.79E-46 150.772 cl00264 Ferritin_like superfamily - - "Ferritin-like superfamily of diiron-containing four-helix-bundle proteins; Ferritin-like, diiron-carboxylate proteins participate in a range of functions including iron regulation, mono-oxygenation, and reactive radical production. These proteins are characterized by the fact that they catalyze dioxygen-dependent oxidation-hydroxylation reactions within diiron centers; one exception is manganese catalase, which catalyzes peroxide-dependent oxidation-reduction within a dimanganese center. Diiron-carboxylate proteins are further characterized by the presence of duplicate metal ligands, glutamates and histidines (ExxH) and two additional glutamates within a four-helix bundle. Outside of these conserved residues there is little obvious homology. Members include bacterioferritin, ferritin, rubrerythrin, aromatic and alkene monooxygenase hydroxylases (AAMH), ribonucleotide reductase R2 (RNRR2), acyl-ACP-desaturases (Acyl_ACP_Desat), manganese (Mn) catalases, demethoxyubiquinone hydroxylases (DMQH), DNA protecting proteins (DPS), and ubiquinol oxidases (AOX), and the aerobic cyclase system, Fe-containing subunit (ACSF)." Q#1109 - CGI_10023568 superfamily 241737 53 207 1.06E-39 135.364 cl00264 Ferritin_like superfamily - - "Ferritin-like superfamily of diiron-containing four-helix-bundle proteins; Ferritin-like, diiron-carboxylate proteins participate in a range of functions including iron regulation, mono-oxygenation, and reactive radical production. These proteins are characterized by the fact that they catalyze dioxygen-dependent oxidation-hydroxylation reactions within diiron centers; one exception is manganese catalase, which catalyzes peroxide-dependent oxidation-reduction within a dimanganese center. 
Diiron-carboxylate proteins are further characterized by the presence of duplicate metal ligands, glutamates and histidines (ExxH) and two additional glutamates within a four-helix bundle. Outside of these conserved residues there is little obvious homology. Members include bacterioferritin, ferritin, rubrerythrin, aromatic and alkene monooxygenase hydroxylases (AAMH), ribonucleotide reductase R2 (RNRR2), acyl-ACP-desaturases (Acyl_ACP_Desat), manganese (Mn) catalases, demethoxyubiquinone hydroxylases (DMQH), DNA protecting proteins (DPS), and ubiquinol oxidases (AOX), and the aerobic cyclase system, Fe-containing subunit (ACSF)." Q#1111 - CGI_10023570 superfamily 220763 292 332 2.20E-10 57.3737 cl11101 NUFIP1 superfamily N - "Nuclear fragile X mental retardation-interacting protein 1 (NUFIP1); Proteins in this family have been implicated in the assembly of the large subunit of the ribosome and in telomere maintenance. Some proteins in this family contain a CCCH zinc finger. This family contains a protein called human fragile X mental retardation-interacting protein 1, which is known to bind RNA and is phosphorylated upon DNA damage." Q#1112 - CGI_10023571 superfamily 243072 73 192 6.15E-32 119.025 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#1112 - CGI_10023571 superfamily 243072 133 298 3.00E-26 103.617 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). 
ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#1112 - CGI_10023571 superfamily 243072 7 118 5.33E-23 94.3726 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#1112 - CGI_10023571 superfamily 245010 364 456 0.000226671 39.5235 cl09111 Prefoldin superfamily - - "Prefoldin is a hexameric molecular chaperone complex, found in both eukaryotes and archaea, that binds and stabilizes newly synthesized polypeptides allowing them to fold correctly. The complex contains two alpha and four beta subunits, the two subunits being evolutionarily related. In archaea, there is usually only one gene for each subunit while in eukaryotes there are two or more paralogous genes encoding each subunit adding heterogeneity to the structure of the hexamer. The structure of the complex consists of a double beta barrel assembly with six protruding coiled-coils." Q#1113 - CGI_10023572 superfamily 247724 13 222 1.56E-105 305.953 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. 
This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#1114 - CGI_10023573 superfamily 149515 563 584 5.72E-05 41.6724 cl07204 SRP72 superfamily N - SRP72 RNA-binding domain; This region has been identified as the binding site of the SRP72 protein to SRP RNA. Q#1115 - CGI_10023574 superfamily 247792 157 204 2.11E-06 45.8996 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#1115 - CGI_10023574 superfamily 115400 730 753 0.00014775 40.2713 cl06002 SBBP superfamily N - Beta-propeller repeat; This family is related to pfam00400 and is likely to also form a beta-propeller. 
SBBP stands for Seven Bladed Beta Propeller. Q#1115 - CGI_10023574 superfamily 241563 301 330 0.00403117 36.1611 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#1116 - CGI_10023575 superfamily 243045 131 215 4.51E-11 60.3395 cl02459 PAS superfamily - - "PAS domain; PAS motifs appear in archaea, eubacteria and eukarya. Probably the most surprising identification of a PAS domain was that in EAG-like K+-channels. PAS domains have been found to bind ligands, and to act as sensors for light and oxygen in signal transduction." Q#1118 - CGI_10023577 superfamily 245864 9 318 1.01E-47 167.455 cl12078 p450 superfamily C - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." 
Q#1119 - CGI_10023578 superfamily 247986 16 123 1.65E-07 50.4494 cl17432 PBPb superfamily C - "Bacterial periplasmic transport systems use membrane-bound complexes and substrate-bound, membrane-associated, periplasmic binding proteins (PBPs) to transport a wide variety of substrates, such as, amino acids, peptides, sugars, vitamins and inorganic ions. PBPs have two cell-membrane translocation functions: bind substrate, and interact with the membrane bound complex. A diverse group of periplasmic transport receptors for lysine/arginine/ornithine (LAO), glutamine, histidine, sulfate, phosphate, molybdate, and methanol are included in the PBPb CD." Q#1119 - CGI_10023578 superfamily 197504 235 371 5.64E-19 82.7225 cl18192 PBPe superfamily - - Eukaryotic homologues of bacterial periplasmic substrate binding proteins; Prokaryotic homologues are represented by a separate alignment: PBPb Q#1120 - CGI_10023579 superfamily 245864 74 167 7.99E-13 64.607 cl12078 p450 superfamily C - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. 
Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#1121 - CGI_10023580 superfamily 246676 187 331 1.64E-38 138.247 cl14616 Cyt_b561 superfamily C - "Eukaryotic cytochrome b(561); Cytochrome b(561) is a family of endosomal or secretory vesicle-specific electron transport proteins. They are integral membrane proteins that bind two heme groups non-covalently, and may have six alpha-helical trans-membrane segments. This is an exclusively eukaryotic family. Members of the prokaryotic cytochrome b561 family are not deemed homologous." Q#1121 - CGI_10023580 superfamily 246710 37 187 7.45E-24 96.728 cl14783 DOMON_like superfamily - - "Domon-like ligand-binding domains; DOMON-like domains can be found in all three kingdoms of life and are a diverse group of ligand binding domains that have been shown to interact with sugars and hemes. DOMON domains were initially thought to confer protein-protein interactions. They were subsequently found as a heme-binding motif in cellobiose dehydrogenase, an extracellular fungal oxidoreductase that degrades both lignin and cellulose, and in ethylbenzene dehydrogenase, an enzyme that aids in the anaerobic degradation of hydrocarbons. The domain interacts with sugars in the type 9 carbohydrate binding modules (CBM9), which are present in a variety of glycosyl hydrolases, and it can also be found at the N-terminus of sensor histidine kinases." Q#1122 - CGI_10023581 superfamily 247856 85 156 4.28E-09 49.4685 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#1122 - CGI_10023581 superfamily 247856 49 105 1.38E-06 42.9201 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#1122 - CGI_10023581 superfamily 244899 11 72 0.00037606 36.699 cl08302 S-100 superfamily - - "S-100: S-100 domain, which represents the largest family within the superfamily of proteins carrying the Ca-binding EF-hand motif. Note that this S-100 hierarchy contains only S-100 EF-hand domains, other EF-hands have been modeled separately. S100 proteins are expressed exclusively in vertebrates, and are implicated in intracellular and extracellular regulatory activities. Intracellularly, S100 proteins act as Ca-signaling or Ca-buffering proteins. The most unusual characteristic of certain S100 proteins is their occurrence in extracellular space, where they act in a cytokine-like manner through RAGE, the receptor for advanced glycation products. Structural data suggest that many S100 members exist within cells as homo- or heterodimers and even oligomers; oligomerization contributes to their functional diversification. Upon binding calcium, most S100 proteins change conformation to a more open structure exposing a hydrophobic cleft. This hydrophobic surface represents the interaction site of S100 proteins with their target proteins. There is experimental evidence showing that many S100 proteins have multiple binding partners with diverse mode of interaction with different targets. In addition to S100 proteins (such as S100A1,-3,-4,-6,-7,-10,-11,and -13), this group includes the ''fused'' gene family, a group of calcium binding S100-related proteins. 
The ''fused'' gene family includes multifunctional epidermal differentiation proteins - profilaggrin, trichohyalin, repetin, hornerin, and cornulin; functionally these proteins are associated with keratin intermediate filaments and partially crosslinked to the cell envelope. These ''fused'' gene proteins contain N-terminal sequence with two Ca-binding EF-hands motif, which may be associated with calcium signaling in epidermal cells and autoprocessing in a calcium-dependent manner. In contrast to S100 proteins, "fused" gene family proteins contain an extraordinary high number of almost perfect peptide repeats with regular array of polar and charged residues similar to many known cell envelope proteins." Q#1123 - CGI_10023582 superfamily 247856 66 127 1.20E-13 62.5653 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#1123 - CGI_10023582 superfamily 247856 101 174 1.58E-12 59.4837 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#1124 - CGI_10023583 superfamily 241619 530 583 8.81E-08 50.5461 cl00112 PAN_APPLE superfamily C - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#1124 - CGI_10023583 superfamily 241619 408 480 0.000131056 40.9161 cl00112 PAN_APPLE superfamily - - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#1125 - CGI_10023584 superfamily 241619 795 866 9.48E-08 50.9313 cl00112 PAN_APPLE superfamily - - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." 
Q#1125 - CGI_10023584 superfamily 241619 650 728 0.000606571 39.4844 cl00112 PAN_APPLE superfamily - - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#1125 - CGI_10023584 superfamily 241619 874 925 0.00075372 39.0992 cl00112 PAN_APPLE superfamily C - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#1129 - CGI_10023589 superfamily 217062 40 285 8.26E-50 168.217 cl12266 Branch superfamily - - "Core-2/I-Branching enzyme; This is a family of two different beta-1,6-N-acetylglucosaminyltransferase enzymes, I-branching enzyme and core-2 branching enzyme . I-branching enzyme is responsible for the production of the blood group I-antigen during embryonic development. Core-2 branching enzyme forms crucial side-chain branches in O-glycans." 
Q#1130 - CGI_10023590 superfamily 217062 13 258 3.80E-51 169.372 cl12266 Branch superfamily - - "Core-2/I-Branching enzyme; This is a family of two different beta-1,6-N-acetylglucosaminyltransferase enzymes, I-branching enzyme and core-2 branching enzyme . I-branching enzyme is responsible for the production of the blood group I-antigen during embryonic development. Core-2 branching enzyme forms crucial side-chain branches in O-glycans." Q#1131 - CGI_10023591 superfamily 217062 13 222 3.10E-47 159.742 cl12266 Branch superfamily - - "Core-2/I-Branching enzyme; This is a family of two different beta-1,6-N-acetylglucosaminyltransferase enzymes, I-branching enzyme and core-2 branching enzyme . I-branching enzyme is responsible for the production of the blood group I-antigen during embryonic development. Core-2 branching enzyme forms crucial side-chain branches in O-glycans." Q#1133 - CGI_10023593 superfamily 245864 4 418 1.93E-102 314.602 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. 
Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#1138 - CGI_10023598 superfamily 243110 103 308 1.10E-12 67.4545 cl02616 MACPF superfamily - - "MAC/Perforin domain; The membrane-attack complex (MAC) of the complement system forms transmembrane channels. These channels disrupt the phospholipid bilayer of target cells, leading to cell lysis and death. A number of proteins participate in the assembly of the MAC. Freshly activated C5b binds to C6 to form a C5b-6 complex, then to C7 forming the C5b-7 complex. The C5b-7 complex binds to C8, which is composed of three chains (alpha, beta, and gamma), thus forming the C5b-8 complex. C5b-8 subsequently binds to C9 and acts as a catalyst in the polymerisation of C9. Active MAC has a subunit composition of C5b-C6-C7-C8-C9{n}. Perforin is a protein found in cytolytic T-cell and killer cells. In the presence of calcium, perforin polymerises into transmembrane tubules and is capable of lysing, non-specifically, a variety of target cells. There are a number of regions of similarity in the sequences of complement components C6, C7, C8-alpha, C8-beta, C9 and perforin. The X-ray crystal structure of a MACPF domain reveals that it shares a common fold with bacterial cholesterol dependent cytolysins (pfam01289) such as perfringolysin O. Three key pieces of evidence suggest that MACPF domains and CDCs are homologous: Functional similarity (pore formation), conservation of three glycine residues at a hinge in both families and conservation of a complex core fold." Q#1139 - CGI_10023599 superfamily 220692 14 303 2.66E-14 71.0813 cl18570 7TM_GPCR_Srw superfamily - - Serpentine type 7TM GPCR chemoreceptor Srw; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. 
Srw is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. The genes encoding Srw do not appear to be under as strong an adaptive evolutionary pressure as those of Srz. Q#1144 - CGI_10004423 superfamily 216566 778 844 0.000330674 40.6337 cl18370 Peptidase_M23 superfamily N - "Peptidase family M23; Members of this family are zinc metallopeptidases with a range of specificities. The peptidase family M23 is included in this family, these are Gly-Gly endopeptidases. Peptidase family M23 are also endopeptidases. This family also includes some bacterial lipoproteins such as Escherichia coli murein hydrolase activator NlpD, for which no proteolytic activity has been demonstrated. This family also includes leukocyte cell-derived chemotaxin 2 (LECT2) proteins. LECT2 is a liver-specific protein which is thought to be linked to hepatocyte growth although the exact function of this protein is unknown." Q#1145 - CGI_10004424 superfamily 207684 36 67 0.00140004 33.8916 cl02640 SAP superfamily - - "SAP domain; The SAP (after SAF-A/B, Acinus and PIAS) motif is a putative DNA/RNA binding domain found in diverse nuclear and cytoplasmic proteins." Q#1148 - CGI_10011233 superfamily 241584 368 461 0.000286703 39.7871 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." 
Q#1148 - CGI_10011233 superfamily 245814 185 263 0.000348544 39.344 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#1149 - CGI_10011234 superfamily 215859 685 881 5.73E-58 197.824 cl18347 Peptidase_S9 superfamily - - Prolyl oligopeptidase family; Prolyl oligopeptidase family. Q#1149 - CGI_10011234 superfamily 215859 586 616 0.000164382 42.5887 cl18347 Peptidase_S9 superfamily C - Prolyl oligopeptidase family; Prolyl oligopeptidase family. Q#1151 - CGI_10011237 superfamily 247755 862 1082 3.12E-123 379.145 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." 
Q#1151 - CGI_10011237 superfamily 247755 214 416 3.26E-109 340.986 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#1151 - CGI_10011237 superfamily 216049 584 773 1.42E-20 92.7341 cl18356 ABC_membrane superfamily - - ABC transporter transmembrane region; This family represents a unit of six transmembrane helices. Many members of the ABC transporter family (pfam00005) have two such regions. Q#1151 - CGI_10011237 superfamily 216049 1 170 7.30E-17 81.1782 cl18356 ABC_membrane superfamily N - ABC transporter transmembrane region; This family represents a unit of six transmembrane helices. Many members of the ABC transporter family (pfam00005) have two such regions. Q#1153 - CGI_10011239 superfamily 246925 141 216 5.05E-11 62.373 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." 
Q#1157 - CGI_10011243 superfamily 243066 28 120 4.13E-16 73.0353 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#1157 - CGI_10011243 superfamily 198867 128 236 2.29E-07 48.1064 cl06652 BACK superfamily - - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation)." 
Q#1160 - CGI_10011246 superfamily 247905 311 363 4.48E-09 53.7809 cl17351 HELICc superfamily N - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#1160 - CGI_10011246 superfamily 247805 34 217 7.56E-05 41.1688 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#1163 - CGI_10014591 superfamily 220672 9 174 5.21E-18 78.0574 cl10957 Frag1 superfamily N - "Frag1/DRAM/Sfk1 family; This family includes Frag1, DRAM and Sfk1 proteins. Frag1 (FGF receptor activating protein 1) is a protein that is conserved from fungi to humans. There are four potential iso-prenylation sites throughout the peptide, viz CILW, CIIW and CIGL. Frag1 is a membrane-spanning protein that is ubiquitously expressed in adult tissues suggesting an important cellular function. Dram is a family of proteins conserved from nematodes to humans with six hydrophobic transmembrane regions and an Endoplasmic Reticulum signal peptide. It is a lysosomal protein that induces macro-autophagy as an effector of p53-mediated death, where p53 is the tumour-suppressor gene that is frequently mutated in cancer. Expression of Dram is stress-induced. 
This region is also part of a family of small plasma membrane proteins, referred to as Sfk1, that may act together with or upstream of Stt4p to generate normal levels of the essential phospholipid PI4P, thus allowing proper localisation of Stt4p to the actin cytoskeleton." Q#1164 - CGI_10014592 superfamily 241974 504 579 1.65E-13 67.2666 cl00604 STAS superfamily N - "Sulphate Transporter and Anti-Sigma factor antagonist domain found in the C-terminal region of sulphate transporters as well as in bacterial and archaeal proteins involved in the regulation of sigma factors; The STAS (Sulphate Transporter and Anti-Sigma factor antagonist) domain is found in the C-terminal region of sulphate transporters as well as in bacterial and archaeal proteins involved in the regulation of sigma factors, like anti-anti-sigma factors and "stressosome" components. The sigma factor regulators are involved in protein-protein interaction which is regulated by phosphorylation." Q#1164 - CGI_10014592 superfamily 216188 14 293 3.49E-52 181.647 cl18360 Sulfate_transp superfamily - - Sulfate transporter family; Mutations in human SLC26A2 lead to several human diseases. Q#1165 - CGI_10014593 superfamily 241974 706 780 1.91E-12 64.9554 cl00604 STAS superfamily N - "Sulphate Transporter and Anti-Sigma factor antagonist domain found in the C-terminal region of sulphate transporters as well as in bacterial and archaeal proteins involved in the regulation of sigma factors; The STAS (Sulphate Transporter and Anti-Sigma factor antagonist) domain is found in the C-terminal region of sulphate transporters as well as in bacterial and archaeal proteins involved in the regulation of sigma factors, like anti-anti-sigma factors and "stressosome" components. The sigma factor regulators are involved in protein-protein interaction which is regulated by phosphorylation." 
Q#1165 - CGI_10014593 superfamily 241974 552 596 7.39E-05 41.8435 cl00604 STAS superfamily C - "Sulphate Transporter and Anti-Sigma factor antagonist domain found in the C-terminal region of sulphate transporters as well as in bacterial and archaeal proteins involved in the regulation of sigma factors; The STAS (Sulphate Transporter and Anti-Sigma factor antagonist) domain is found in the C-terminal region of sulphate transporters as well as in bacterial and archaeal proteins involved in the regulation of sigma factors, like anti-anti-sigma factors and "stressosome" components. The sigma factor regulators are involved in protein-protein interaction which is regulated by phosphorylation." Q#1165 - CGI_10014593 superfamily 216188 218 497 6.06E-53 185.499 cl18360 Sulfate_transp superfamily - - Sulfate transporter family; Mutations in human SLC26A2 lead to several human diseases. Q#1165 - CGI_10014593 superfamily 205965 77 160 7.94E-26 102.876 cl18285 Sulfate_tra_GLY superfamily - - "Sulfate transporter N-terminal domain with GLY motif; This domain is found usually at the N-terminus of sulfate-transporter proteins. It carries a highly conserved GLY sequence motif, but the function of the domain is not known." Q#1166 - CGI_10014594 superfamily 241594 34 317 2.83E-14 70.7971 cl00077 HECTc superfamily - - "HECT domain; C-terminal catalytic domain of a subclass of Ubiquitin-protein ligase (E3). It binds specific ubiquitin-conjugating enzymes (E2), accepts ubiquitin from E2, transfers ubiquitin to substrate lysine side chains, and transfers additional ubiquitin molecules to the end of growing ubiquitin chains." 
Q#1167 - CGI_10014595 superfamily 241614 134 239 5.39E-32 118.137 cl00105 LMWPc superfamily N - Low molecular weight phosphatase family; Q#1169 - CGI_10014597 superfamily 241571 989 1096 7.81E-18 81.3046 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#1169 - CGI_10014597 superfamily 205157 216 251 1.92E-08 52.1547 cl18264 EGF_3 superfamily - - EGF domain; This family includes a variety of EGF-like domain homologues. This family includes the C-terminal domain of the malaria parasite MSP1 protein. Q#1169 - CGI_10014597 superfamily 219525 947 984 2.59E-08 52.0361 cl06646 GCC2_GCC3 superfamily N - GCC2 and GCC3; GCC2 and GCC3. Q#1169 - CGI_10014597 superfamily 219525 826 874 1.57E-07 49.7249 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#1169 - CGI_10014597 superfamily 219525 881 928 1.16E-06 47.4137 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#1169 - CGI_10014597 superfamily 245213 509 538 1.98E-06 46.4712 cl09941 EGF_CA superfamily C - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#1169 - CGI_10014597 superfamily 241578 546 585 0.000123782 43.5276 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). 
Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#1169 - CGI_10014597 superfamily 241578 463 506 0.000126336 43.5276 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." 
Q#1169 - CGI_10014597 superfamily 205157 313 348 0.000423605 39.4431 cl18264 EGF_3 superfamily - - EGF domain; This family includes a variety of EGF-like domain homologues. This family includes the C-terminal domain of the malaria parasite MSP1 protein. Q#1169 - CGI_10014597 superfamily 241578 248 289 0.000900149 40.8312 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#1169 - CGI_10014597 superfamily 221695 450 471 0.00243792 37.4346 cl18612 cEGF superfamily - - "Complement Clr-like EGF-like; cEGF, or complement Clr-like EGF, domains have six conserved cysteine residues disulfide-bonded into the characteristic pattern 'ababcc'. They are found in blood coagulation proteins such as fibrillin, Clr and Cls, thrombomodulin, and the LDL receptor. The core fold of the EGF domain consists of two small beta-hairpins packed against each other. Two major structural variants have been identified based on the structural context of the C-terminal cysteine residue of disulfide 'c' in the C-terminal hairpin: hEGFs and cEGFs. 
In cEGFs the C-terminal thiol resides on the C-terminal beta-sheet, resulting in long loop-lengths between the cysteine residues of disulfide 'c', typically C[10+]XC. These longer loop-lengths may have arisen by selective cysteine loss from a four-disulfide EGF template such as laminin or integrin. Tandem cEGF domains have five linking residues between terminal cysteines of adjacent domains. cEGF domains may or may not bind calcium in the linker region. cEGF domains with the consensus motif CXN4X[F,Y]XCXC are hydroxylated exclusively on the asparagine residue." Q#1169 - CGI_10014597 superfamily 241578 385 426 0.00902978 37.7496 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." 
Q#1170 - CGI_10014598 superfamily 247723 72 144 4.58E-44 145.268 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#1171 - CGI_10014599 superfamily 221616 161 221 1.21E-11 63.2209 cl13896 DUF3719 superfamily C - "Protein of unknown function (DUF3719); This domain family is found in eukaryotes, and is approximately 70 amino acids in length. There is a conserved HLR sequence motif. There are two completely conserved residues (W and H) that may be functionally important." Q#1173 - CGI_10014601 superfamily 247856 770 829 5.28E-07 48.3129 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#1173 - CGI_10014601 superfamily 247856 886 944 0.000325597 39.8385 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#1173 - CGI_10014601 superfamily 247856 282 323 0.00788725 35.6013 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#1174 - CGI_10014602 superfamily 243082 176 625 0 636.283 cl02553 Peptidase_C19 superfamily - - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." Q#1175 - CGI_10014603 superfamily 119093 8 77 3.16E-09 50.3456 cl11200 UPF0561 superfamily - - Uncharacterized protein family UPF0561; This family of proteins has no known function. Q#1176 - CGI_10014604 superfamily 241992 571 1005 0 555.338 cl00628 Piwi-like superfamily - - "Piwi-like: PIWI domain. 
Domain found in proteins involved in RNA silencing. RNA silencing refers to a group of related gene-silencing mechanisms mediated by short RNA molecules, including siRNAs, miRNAs, and heterochromatin-related guide RNAs. The central component of the RNA-induced silencing complex (RISC) and related complexes is Argonaute. The PIWI domain is the C-terminal portion of Argonaute and consists of two subdomains, one of which provides the 5' anchoring of the guide RNA and the other, the catalytic site for slicing. This domain is also found in closely related proteins, including the Piwi subfamily, where it is believed to perform a crucial role in germline cells, via a similar mechanism." Q#1176 - CGI_10014604 superfamily 241765 444 563 1.15E-46 164.356 cl00301 PAZ superfamily - - "PAZ domain, named PAZ after the proteins Piwi Argonaut and Zwille. PAZ is found in two families of proteins that are essential components of RNA-mediated gene-silencing pathways, including RNA interference, the piwi and Dicer families. PAZ functions as a nucleic-acid binding domain, with a strong preference for single-stranded nucleic acids (RNA or DNA) or RNA duplexes with single-stranded 3' overhangs. It has been suggested that the PAZ domain provides a unique mode for the recognition of the two 3'-terminal nucleotides in single-stranded nucleic acids and buries the 3' OH group, and that it might recognize characteristic 3' overhangs in siRNAs within RISC (RNA-induced silencing) and other complexes. This parent model also contains structures of an archaeal PAZ domain." Q#1176 - CGI_10014604 superfamily 241765 343 413 1.19E-25 104.264 cl00301 PAZ superfamily C - "PAZ domain, named PAZ after the proteins Piwi Argonaut and Zwille. PAZ is found in two families of proteins that are essential components of RNA-mediated gene-silencing pathways, including RNA interference, the piwi and Dicer families. 
PAZ functions as a nucleic-acid binding domain, with a strong preference for single-stranded nucleic acids (RNA or DNA) or RNA duplexes with single-stranded 3' overhangs. It has been suggested that the PAZ domain provides a unique mode for the recognition of the two 3'-terminal nucleotides in single-stranded nucleic acids and buries the 3' OH group, and that it might recognize characteristic 3' overhangs in siRNAs within RISC (RNA-induced silencing) and other complexes. This parent model also contains structures of an archaeal PAZ domain." Q#1176 - CGI_10014604 superfamily 241992 1003 1126 1.64E-64 226.378 cl00628 Piwi-like superfamily N - "Piwi-like: PIWI domain. Domain found in proteins involved in RNA silencing. RNA silencing refers to a group of related gene-silencing mechanisms mediated by short RNA molecules, including siRNAs, miRNAs, and heterochromatin-related guide RNAs. The central component of the RNA-induced silencing complex (RISC) and related complexes is Argonaute. The PIWI domain is the C-terminal portion of Argonaute and consists of two subdomains, one of which provides the 5' anchoring of the guide RNA and the other, the catalytic site for slicing. This domain is also found in closely related proteins, including the Piwi subfamily, where it is believed to perform a crucial role in germline cells, via a similar mechanism." Q#1182 - CGI_10014610 superfamily 241554 22 94 2.13E-17 74.8785 cl00019 Macro superfamily N - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. 
Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#1183 - CGI_10014611 superfamily 217311 41 525 1.80E-131 397.863 cl18402 DUF229 superfamily - - Protein of unknown function (DUF229); Members of this family are uncharacterized. They are 500-1200 amino acids in length and share a long region of conservation that probably corresponds to several domains. The Go annotation for the protein indicates that it is involved in nematode larval development and has a positive regulation on growth rate. Q#1184 - CGI_10014612 superfamily 243077 36 89 4.13E-18 77.9709 cl02542 DnaJ superfamily - - "DnaJ domain or J-domain. DnaJ/Hsp40 (heat shock protein 40) proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperonine family. Hsp40 proteins are characterized by the presence of a J domain, which mediates the interaction with Hsp70. They may contain other domains as well, and the architectures provide a means of classification." Q#1184 - CGI_10014612 superfamily 247804 385 426 0.00194789 36.0142 cl17250 SANT superfamily - - "'SWI3, ADA2, N-CoR and TFIIIB' DNA-binding domains. Tandem copies of the domain bind telomeric DNA tandem repeats as part of the capping complex. Binding is sequence dependent for repeats which contain the G/C rich motif [C2-3 A (CA)1-6]. The domain is also found in regulatory transcriptional repressor complexes where it also binds DNA." Q#1184 - CGI_10014612 superfamily 247804 244 295 2.15E-07 47.6894 cl17250 SANT superfamily - - "'SWI3, ADA2, N-CoR and TFIIIB' DNA-binding domains. 
Tandem copies of the domain bind telomeric DNA tandem repeats as part of the capping complex. Binding is sequence dependent for repeats which contain the G/C rich motif [C2-3 A (CA)1-6]. The domain is also found in regulatory transcriptional repressor complexes where it also binds DNA." Q#1185 - CGI_10014613 superfamily 217293 14 223 4.47E-88 271.814 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#1185 - CGI_10014613 superfamily 202474 230 496 4.15E-30 116.599 cl08379 Neur_chan_memb superfamily - - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#1186 - CGI_10014614 superfamily 248097 2 78 3.08E-06 40.3262 cl17543 C1q superfamily N - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#1188 - CGI_10011248 superfamily 246669 866 984 1.42E-45 160.921 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. 
C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#1188 - CGI_10011248 superfamily 241623 340 689 2.83E-173 512.374 cl00119 PI3Kc_like superfamily - - "Phosphoinositide 3-kinase (PI3K)-like family, catalytic domain; The PI3K-like catalytic domain family is part of a larger superfamily that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and RIO kinases. Members of the family include PI3K, phosphoinositide 4-kinase (PI4K), PI3K-related protein kinases (PIKKs), and TRansformation/tRanscription domain-Associated Protein (TRRAP). PI3Ks catalyze the transfer of the gamma-phosphoryl group from ATP to the 3-hydroxyl of the inositol ring of D-myo-phosphatidylinositol (PtdIns) or its derivatives, while PI4K catalyze the phosphorylation of the 4-hydroxyl of PtdIns. PIKKs are protein kinases that catalyze the phosphorylation of serine/threonine residues, especially those that are followed by a glutamine. PI3Ks play an important role in a variety of fundamental cellular processes, including cell motility, the Ras pathway, vesicle trafficking and secretion, immune cell activation and apoptosis. PI4Ks produce PtdIns(4)P, the major precursor to important signaling phosphoinositides. PIKKs have diverse functions including cell-cycle checkpoints, genome surveillance, mRNA surveillance, and translation control." Q#1188 - CGI_10011248 superfamily 241742 164 335 4.63E-44 158.575 cl00271 PI3Ka superfamily - - "Phosphoinositide 3-kinase family, accessory domain (PIK domain); PIK domain is conserved in PI3 and PI4-kinases. Its role is unclear, but it has been suggested to be involved in substrate presentation. 
Phosphoinositide 3-kinases play an important role in a variety of fundamental cellular processes and can be divided into three main classes, defined by their substrate specificity and domain architecture." Q#1188 - CGI_10011248 superfamily 243088 716 834 1.01E-34 129.46 cl02563 PX_domain superfamily - - "The Phox Homology domain, a phosphoinositide binding module; The PX domain is a phosphoinositide (PI) binding module involved in targeting proteins to membranes. Proteins containing PX domains interact with PIs and have been implicated in highly diverse functions such as cell signaling, vesicular trafficking, protein sorting, lipid modification, cell polarity and division, activation of T and B cells, and cell survival. Many members of this superfamily bind phosphatidylinositol-3-phosphate (PI3P) but in some cases, other PIs such as PI4P or PI(3,4)P2, among others, are the preferred substrates. In addition to protein-lipid interaction, the PX domain may also be involved in protein-protein interaction, as in the cases of p40phox, p47phox, and some sorting nexins (SNXs). The PX domain is conserved from yeast to humans and is found in more than 100 proteins. The majority of PX domain-containing proteins are SNXs, which play important roles in endosomal sorting." Q#1188 - CGI_10011248 superfamily 246669 3 146 3.70E-25 103.977 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. 
Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#1193 - CGI_10011253 superfamily 243035 75 199 1.79E-15 74.1933 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. 
Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#1193 - CGI_10011253 superfamily 241571 397 502 2.62E-13 67.8226 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#1193 - CGI_10011253 superfamily 241568 572 625 6.11E-11 59.3988 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. 
The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#1193 - CGI_10011253 superfamily 241571 286 387 1.46E-10 59.7334 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#1193 - CGI_10011253 superfamily 241568 630 689 0.00524521 35.9016 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#1193 - CGI_10011253 superfamily 111397 738 817 7.09E-07 48.1063 cl03620 HYR superfamily - - "HYR domain; This domain is known as the HYR (Hyalin Repeat) domain, after the protein hyalin that is composed exclusively of this repeat. This domain probably corresponds to a new superfamily in the immunoglobulin fold. The function of this domain is uncertain it may be involved in cell adhesion." Q#1193 - CGI_10011253 superfamily 241571 221 274 3.22E-05 43.1699 cl00049 CUB superfamily C - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." 
Q#1194 - CGI_10011254 superfamily 245213 8 42 2.50E-06 45.3202 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#1194 - CGI_10011254 superfamily 219525 619 666 8.95E-09 52.8065 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#1194 - CGI_10011254 superfamily 219525 460 507 5.79E-06 44.7174 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#1194 - CGI_10011254 superfamily 219525 520 559 2.69E-05 42.7914 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#1194 - CGI_10011254 superfamily 219525 573 611 0.000579585 38.5542 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#1197 - CGI_10003094 superfamily 242889 319 416 1.02E-18 81.1101 cl02111 PCI superfamily - - "PCI domain; This domain has also been called the PINT motif (Proteasome, Int-6, Nip-1 and TRIP-15)." Q#1198 - CGI_10003095 superfamily 247740 17 167 3.34E-61 191.939 cl17186 TIM_phosphate_binding superfamily C - "TIM barrel proteins share a structurally conserved phosphate binding motif and in general share an eight beta/alpha closed barrel structure. Specific for this family is the conserved phosphate binding site at the edges of strands 7 and 8. The phosphate comes either from the substrate, as in the case of inosine monophosphate dehydrogenase (IMPDH), or from ribulose-5-phosphate 3-epimerase (RPE) or from cofactors, like FMN." 
Q#1200 - CGI_10023970 superfamily 243035 18 129 1.45E-07 45.6886 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#1207 - CGI_10023977 superfamily 219459 175 272 2.06E-26 104.251 cl06530 NOC3p superfamily - - Nucleolar complex-associated protein; Nucleolar complex-associated protein (Noc3p) is conserved in eukaryotes and has essential roles in replication and rRNA processing in Saccharomyces cerevisiae. Q#1207 - CGI_10023977 superfamily 245319 524 664 1.97E-24 100.368 cl10505 CBF superfamily - - CBF/Mak21 family; CBF/Mak21 family. Q#1208 - CGI_10023978 superfamily 243035 308 432 4.00E-20 86.9049 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. 
For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#1208 - CGI_10023978 superfamily 243035 162 286 2.57E-18 81.9673 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#1208 - CGI_10023978 superfamily 243035 49 147 2.50E-05 43.3574 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. 
CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#1209 - CGI_10023979 superfamily 241599 150 208 7.13E-22 86.9136 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#1210 - CGI_10023980 superfamily 241622 125 206 1.86E-22 93.7854 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). 
Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95 (post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#1210 - CGI_10023980 superfamily 241622 254 332 1.97E-16 76.4514 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. 
The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95 (post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#1210 - CGI_10023980 superfamily 241622 1070 1147 3.35E-08 52.5691 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95 (post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#1210 - CGI_10023980 superfamily 143751 876 945 1.04E-09 56.7322 cl11968 harmonin_N_like superfamily - - "N-terminal protein-binding module of harmonin and similar domains; This domain is found in harmonin, and similar proteins such as delphilin, and whirlin. These are postsynaptic density-95/discs-large/ZO-1 (PDZ) domain-containing scaffold proteins. 
Harmonin and whirlin are organizers of the Usher protein network of the inner ear and the retina, delphilin is found at the cerebellar parallel fiber-Purkinje cell synapses. This harmonin_N_like domain is found in either one or two copies. Harmonin contains a single copy, which is found at its N-terminus and binds specifically to a short internal peptide fragment of the cadherin 23 cytoplasmic domain; cadherin 23 is a component of the Usher protein network. Whirlin contains two copies of the harmonin_N_like domain; the first of these has been assayed for interaction with the cytoplasmic domain of cadherin 23 and no interaction could be detected." Q#1210 - CGI_10023980 superfamily 143751 7 85 5.42E-06 45.7966 cl11968 harmonin_N_like superfamily - - "N-terminal protein-binding module of harmonin and similar domains; This domain is found in harmonin, and similar proteins such as delphilin, and whirlin. These are postsynaptic density-95/discs-large/ZO-1 (PDZ) domain-containing scaffold proteins. Harmonin and whirlin are organizers of the Usher protein network of the inner ear and the retina, delphilin is found at the cerebellar parallel fiber-Purkinje cell synapses. This harmonin_N_like domain is found in either one or two copies. Harmonin contains a single copy, which is found at its N-terminus and binds specifically to a short internal peptide fragment of the cadherin 23 cytoplasmic domain; cadherin 23 is a component of the Usher protein network. Whirlin contains two copies of the harmonin_N_like domain; the first of these has been assayed for interaction with the cytoplasmic domain of cadherin 23 and no interaction could be detected." Q#1211 - CGI_10023981 superfamily 199166 400 492 7.82E-14 70.0488 cl15308 AMN1 superfamily NC - "Antagonist of mitotic exit network protein 1; Amn1 has been functionally characterized in Saccharomyces cerevisiae as a component of the Antagonist of MEN pathway (AMEN). 
The AMEN network is activated by MEN (mitotic exit network) via an active Cdc14, and in turn switches off MEN. Amn1 constitutes one of the alternative mechanisms by which MEN may be disrupted. Specifically, Amn1 binds Tem1 (Termination of M-phase, a GTPase that belongs to the RAS superfamily), and disrupts its association with Cdc15, the primary downstream target. Amn1 is a leucine-rich repeat (LRR) protein, with 12 repeats in the S. cerevisiae ortholog. As a negative regulator of the signal transduction pathway MEN, overexpression of AMN1 slows the growth of wild type cells. The function of the vertebrate members of this family has not been determined experimentally, they have fewer LRRs that determine the extent of this model." Q#1211 - CGI_10023981 superfamily 199166 46 233 3.50E-05 43.8552 cl15308 AMN1 superfamily - - "Antagonist of mitotic exit network protein 1; Amn1 has been functionally characterized in Saccharomyces cerevisiae as a component of the Antagonist of MEN pathway (AMEN). The AMEN network is activated by MEN (mitotic exit network) via an active Cdc14, and in turn switches off MEN. Amn1 constitutes one of the alternative mechanisms by which MEN may be disrupted. Specifically, Amn1 binds Tem1 (Termination of M-phase, a GTPase that belongs to the RAS superfamily), and disrupts its association with Cdc15, the primary downstream target. Amn1 is a leucine-rich repeat (LRR) protein, with 12 repeats in the S. cerevisiae ortholog. As a negative regulator of the signal transduction pathway MEN, overexpression of AMN1 slows the growth of wild type cells. The function of the vertebrate members of this family has not been determined experimentally, they have fewer LRRs that determine the extent of this model." 
Q#1213 - CGI_10023983 superfamily 247792 1761 1804 7.68E-12 62.8484 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-(N/C/H)-X2-C-X(4-48)-C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#1215 - CGI_10023985 superfamily 241626 463 584 5.48E-58 192.049 cl00125 RHOD superfamily - - "Rhodanese Homology Domain (RHOD); an alpha beta fold domain found duplicated in the rhodanese protein. The cysteine containing enzymatically active version of the domain is also found in the Cdc25 class of protein phosphatases and a variety of proteins such as sulfide dehydrogenases and certain stress proteins such as senescence specific protein 1 in plants, PspE and GlpE in bacteria and cyanide and arsenate resistance proteins. Inactive versions (no active site cysteine) are also seen in dual specificity phosphatases, ubiquitin hydrolases from yeast and in sulfuryltransferases, where they are believed to play a regulatory role in multidomain proteins." Q#1217 - CGI_10023987 superfamily 243120 215 244 0.00139407 37.9808 cl02633 ARID superfamily C - "ARID/BRIGHT DNA binding domain; This domain is known as ARID for AT-Rich Interaction Domain, and also known as the BRIGHT domain." 
Q#1218 - CGI_10023988 superfamily 247684 211 393 2.26E-06 47.1983 cl17037 NBD_sugar-kinase_HSP70_actin superfamily C - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#1220 - CGI_10023990 superfamily 212156 35 248 6.39E-161 459.591 cl17007 COE_DBD superfamily - - "Collier/Olf/Early B-cell factor (EBF) DNA Binding Domain; COE_DBD is the amino-terminal DNA binding domain of the COE protein family. The COE transcription factor is a regulator of development in several organs and tissues that contain the DBD domain as well as IPT/TIG (immunoglobulin-like, Plexins, transcription factors/transcription factor immunoglobulin) and basic helix-loop-helix (bHLH) domains. 
COE has four members in mammals (COE1-4) with high sequence similarity at the amino-terminal region. COE_DBD requires a zinc ion to bind DNA and contains a zinc finger motif (H-X(3)-C-X(2)-C-X(5)-C) termed the zinc knuckle. COE is homo- or heterodimerized through the bHLH domain to bind DNA. COE1-4 each has a variant due to alternative splicing. However, this alternative splicing does not occur at the DBD domain." Q#1220 - CGI_10023990 superfamily 247038 280 364 6.39E-46 156.655 cl15674 IPT superfamily - - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated T cells (NFAT) transcription factors which are mainly monomers." Q#1222 - CGI_10023992 superfamily 241868 69 223 2.57E-48 161.465 cl00447 Nudix_Hydrolase superfamily - - "Nudix hydrolase is a superfamily of enzymes found in all three kingdoms of life, and it catalyzes the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+ for their activity. Members of this family are recognized by a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which forms a structural motif that functions as a metal binding and catalytic site. Substrates of nudix hydrolase include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. 
These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance and "house-cleaning" enzymes. Substrate specificity is used to define child families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. This superfamily consists of at least nine families: IPP (isopentenyl diphosphate) isomerase, ADP ribose pyrophosphatase, mutT pyrophosphohydrolase, coenzyme-A pyrophosphatase, MTH1-7,8-dihydro-8-oxoguanine-triphosphatase, diadenosine tetraphosphate hydrolase, NADH pyrophosphatase, GDP-mannose hydrolase and the c-terminal portion of the mutY adenine glycosylase." Q#1223 - CGI_10023993 superfamily 217575 94 219 7.13E-34 123.153 cl04090 eRF1_2 superfamily - - "eRF1 domain 2; The release factor eRF1 terminates protein biosynthesis by recognising stop codons at the A site of the ribosome and stimulating peptidyl-tRNA bond hydrolysis at the peptidyl transferase centre. The crystal structure of human eRF1 is known. The overall shape and dimensions of eRF1 resemble a tRNA molecule with domains 1, 2, and 3 of eRF1 corresponding to the anticodon loop, aminoacyl acceptor stem, and T stem of a tRNA molecule, respectively. The position of the essential GGQ motif at an exposed tip of domain 2 suggests that the Gln residue coordinates a water molecule to mediate the hydrolytic activity at the peptidyl transferase centre. A conserved groove on domain 1, 80 A from the GGQ motif, is proposed to form the codon recognition site. 
This family also includes other proteins for which the precise molecular function is unknown. Many of them are from Archaebacteria. These proteins may also be involved in translation termination but this awaits experimental verification." Q#1223 - CGI_10023993 superfamily 146221 222 359 4.52E-23 91.8451 cl04091 eRF1_3 superfamily - - "eRF1 domain 3; The release factor eRF1 terminates protein biosynthesis by recognising stop codons at the A site of the ribosome and stimulating peptidyl-tRNA bond hydrolysis at the peptidyl transferase centre. The crystal structure of human eRF1 is known. The overall shape and dimensions of eRF1 resemble a tRNA molecule with domains 1, 2, and 3 of eRF1 corresponding to the anticodon loop, aminoacyl acceptor stem, and T stem of a tRNA molecule, respectively. The position of the essential GGQ motif at an exposed tip of domain 2 suggests that the Gln residue coordinates a water molecule to mediate the hydrolytic activity at the peptidyl transferase centre. A conserved groove on domain 1, 80 A from the GGQ motif, is proposed to form the codon recognition site. This family also includes other proteins for which the precise molecular function is unknown. Many of them are from Archaebacteria. These proteins may also be involved in translation termination but this awaits experimental verification." Q#1223 - CGI_10023993 superfamily 217574 17 90 4.01E-08 50.6846 cl04089 eRF1_1 superfamily C - "eRF1 domain 1; The release factor eRF1 terminates protein biosynthesis by recognising stop codons at the A site of the ribosome and stimulating peptidyl-tRNA bond hydrolysis at the peptidyl transferase centre. The crystal structure of human eRF1 is known. The overall shape and dimensions of eRF1 resemble a tRNA molecule with domains 1, 2, and 3 of eRF1 corresponding to the anticodon loop, aminoacyl acceptor stem, and T stem of a tRNA molecule, respectively. 
The position of the essential GGQ motif at an exposed tip of domain 2 suggests that the Gln residue coordinates a water molecule to mediate the hydrolytic activity at the peptidyl transferase centre. A conserved groove on domain 1, 80 A from the GGQ motif, is proposed to form the codon recognition site. This family also includes other proteins for which the precise molecular function is unknown. Many of them are from Archaebacteria. These proteins may also be involved in translation termination but this awaits experimental verification." Q#1224 - CGI_10023994 superfamily 188340 2 68 9.65E-08 47.5723 cl18158 selen_PSTK_euk superfamily N - "L-seryl-tRNA(Sec) kinase, eukaryotic; Members of this protein are L-seryl-tRNA(Sec) kinase. This enzyme is part of a two-step pathway in Eukaryota and Archaea for performing selenocysteine biosynthesis by changing serine misacylated on selenocysteine-tRNA to selenocysteine. This enzyme performs the first step, phosphorylation of the OH group of the serine side chain. This family represents eukaryotic proteins with this activity." Q#1225 - CGI_10023995 superfamily 146221 11 63 3.67E-05 37.1467 cl04091 eRF1_3 superfamily N - "eRF1 domain 3; The release factor eRF1 terminates protein biosynthesis by recognising stop codons at the A site of the ribosome and stimulating peptidyl-tRNA bond hydrolysis at the peptidyl transferase centre. The crystal structure of human eRF1 is known. The overall shape and dimensions of eRF1 resemble a tRNA molecule with domains 1, 2, and 3 of eRF1 corresponding to the anticodon loop, aminoacyl acceptor stem, and T stem of a tRNA molecule, respectively. The position of the essential GGQ motif at an exposed tip of domain 2 suggests that the Gln residue coordinates a water molecule to mediate the hydrolytic activity at the peptidyl transferase centre. A conserved groove on domain 1, 80 A from the GGQ motif, is proposed to form the codon recognition site. 
This family also includes other proteins for which the precise molecular function is unknown. Many of them are from Archaebacteria. These proteins may also be involved in translation termination but this awaits experimental verification." Q#1226 - CGI_10023996 superfamily 245225 246 438 3.64E-36 137.379 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily C - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein coupled receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine-binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. 
Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#1226 - CGI_10023996 superfamily 245225 40 235 2.20E-25 105.471 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily C - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein coupled receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine-binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. 
Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#1228 - CGI_10023998 superfamily 245225 5 182 2.38E-33 129.289 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily C - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein couples receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine- binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. 
Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#1228 - CGI_10023998 superfamily 245225 244 416 1.06E-25 106.948 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily C - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein couples receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine- binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. 
Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#1229 - CGI_10023999 superfamily 245225 50 235 8.38E-40 142.386 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily C - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein couples receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine- binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. 
Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#1230 - CGI_10024000 superfamily 245819 854 1030 1.11E-63 213.98 cl11967 Nucleotidyl_cyc_III superfamily - - "Class III nucleotidyl cyclases; Class III nucleotidyl cyclases are the largest, most diverse group of nucleotidyl cyclases (NC's) containing prokaryotic and eukaryotic proteins. They can be divided into two major groups; the mononucleotidyl cyclases (MNC's) and the diguanylate cyclases (DGC's). The MNC's, which include the adenylate cyclases (AC's) and the guanylate cyclases (GC's), have a conserved cyclase homology domain (CHD), while the DGC's have a conserved GGDEF domain, named after a conserved motif within this subgroup. Their products, cyclic guanylyl and adenylyl nucleotides, are second messengers that play important roles in eukaryotic signal transduction and prokaryotic sensory pathways." Q#1230 - CGI_10024000 superfamily 245201 562 779 6.41E-34 130.82 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#1230 - CGI_10024000 superfamily 245225 27 394 7.65E-132 407.789 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily - - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. 
This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein couples receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine- binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#1230 - CGI_10024000 superfamily 219526 793 840 0.000193597 42.6063 cl06648 HNOBA superfamily N - "Heme NO binding associated; The HNOBA domain is found associated with the HNOB domain and pfam00211 in soluble cyclases and signalling proteins. 
The HNOB domain is predicted to function as a heme-dependent sensor for gaseous ligands, and transduce diverse downstream signals, in both bacteria and animals." Q#1231 - CGI_10024001 superfamily 242232 19 68 6.26E-14 67.198 cl00984 TM2 superfamily - - "TM2 domain; This family is composed of a pair of transmembrane alpha helices connected by a short linker. The function of this domain is unknown, however it occurs in a wide range of protein contexts." Q#1231 - CGI_10024001 superfamily 242232 225 298 2.64E-13 66.1462 cl00984 TM2 superfamily - - "TM2 domain; This family is composed of a pair of transmembrane alpha helices connected by a short linker. The function of this domain is unknown, however it occurs in a wide range of protein contexts." Q#1231 - CGI_10024001 superfamily 242232 150 200 7.47E-06 43.7008 cl00984 TM2 superfamily - - "TM2 domain; This family is composed of a pair of transmembrane alpha helices connected by a short linker. The function of this domain is unknown, however it occurs in a wide range of protein contexts." Q#1231 - CGI_10024001 superfamily 242232 94 117 0.00330325 35.9968 cl00984 TM2 superfamily C - "TM2 domain; This family is composed of a pair of transmembrane alpha helices connected by a short linker. The function of this domain is unknown, however it occurs in a wide range of protein contexts." Q#1236 - CGI_10024006 superfamily 247723 37 117 2.71E-55 177.627 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. 
RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#1236 - CGI_10024006 superfamily 247723 308 348 1.43E-17 76.6398 cl17169 RRM_SF superfamily N - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#1239 - CGI_10024009 superfamily 241828 30 102 3.19E-13 61.375 cl00382 Ribosomal_L21p superfamily C - Ribosomal prokaryotic L21 protein; Ribosomal prokaryotic L21 protein. Q#1240 - CGI_10024010 superfamily 243082 810 1116 8.14E-94 300.744 cl02553 Peptidase_C19 superfamily - - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. 
They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." Q#1240 - CGI_10024010 superfamily 241647 690 720 3.48E-08 51.3746 cl00157 WW superfamily - - Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs. Q#1240 - CGI_10024010 superfamily 241626 165 295 2.22E-07 50.3582 cl00125 RHOD superfamily - - "Rhodanese Homology Domain (RHOD); an alpha beta fold domain found duplicated in the rhodanese protein. The cysteine containing enzymatically active version of the domain is also found in the Cdc25 class of protein phosphatases and a variety of proteins such as sulfide dehydrogenases and certain stress proteins such as senescence specific protein 1 in plants, PspE and GlpE in bacteria and cyanide and arsenate resistance proteins. Inactive versions (no active site cysteine) are also seen in dual specificity phosphatases, ubiquitin hydrolases from yeast and in sulfuryltransferases, where they are believed to play a regulatory role in multidomain proteins." 
Q#1240 - CGI_10024010 superfamily 117535 6 111 2.52E-27 108.781 cl07540 DUF1873 superfamily - - Domain of unknown function (DUF1873); This domain is predominantly found in the amino terminal region of Ubiquitin carboxyl-terminal hydrolase 8 (USP8). It has no known function. Q#1242 - CGI_10024012 superfamily 243146 261 307 4.43E-10 55.3602 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#1242 - CGI_10024012 superfamily 243146 227 272 5.49E-06 43.7011 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#1246 - CGI_10024016 superfamily 243066 108 181 5.61E-12 60.7089 cl02518 BTB superfamily C - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." 
Q#1247 - CGI_10024017 superfamily 246598 22 287 5.93E-153 432.447 cl13996 MPN superfamily - - "Mpr1p, Pad1p N-terminal (MPN) domains; MPN (also known as Mov34, PAD-1, JAMM, JAB, MPN+) domains are found in the N-terminal termini of proteins with a variety of functions; they are components of the proteasome regulatory subunits, the signalosome (CSN), eukaryotic translation initiation factor 3 (eIF3) complexes, and regulators of transcription factors. These domains are isopeptidases that release ubiquitin from ubiquitinated proteins (thus having deubiquitinating (DUB) activity) that are tagged for degradation. Catalytically active MPN domains contain a metalloprotease signature known as the JAB1/MPN/Mov34 metalloenzyme (JAMM) motif. For example, Rpn11 (also known as POH1 or PSMD14), a subunit of the 19S proteasome lid is involved in the ATP-dependent degradation of ubiquitinated proteins, contains the conserved JAMM motif involved in zinc ion coordination. Poh1 is a regulator of c-Jun, an important regulator of cell proliferation, differentiation, survival and death. JAB1 is a component of the COP9 signalosome (CSN), a regulatory particle of the ubiquitin (Ub)/26S proteasome system occurring in all eukaryotic cells; it cleaves the ubiquitin-like protein NEDD8 from the cullin subunit of the SCF (Skp1, Cullins, F-box proteins) family of E3 ubiquitin ligases. AMSH (associated molecule with the SH3 domain of STAM, also known as STAMBP), a member of JAMM/MPN+ deubiquitinases (DUBs), specifically cleaves Lys 63-linked polyubiquitin (poly-Ub) chains, thus facilitating the recycling and subsequent trafficking of receptors to the cell surface. Similarly, BRCC36, part of the nuclear complex that includes BRCA1 protein and is targeted to DNA damage foci after irradiation, specifically disassembles K63-linked polyUb. BRCC36 is aberrantly expressed in sporadic breast tumors, indicative of a potential role in the pathogenesis of the disease. 
Some variants of the JAB1/MPN domains lack key residues in their JAMM motif and are unable to coordinate a metal ion. Comparisons of key catalytic and metal binding residues explain why the MPN-containing proteins Mov34/PSMD7, Rpn8, CSN6, Prp8p, and the translation initiation factor 3 subunits f (p47) and h (p40) do not show catalytic isopeptidase activity. It has been proposed that the MPN domain in these proteins has a primarily structural function." Q#1249 - CGI_10024019 superfamily 220830 40 92 2.52E-11 61.5728 cl11246 Ofd1_CTDD superfamily N - "Oxoglutarate and iron-dependent oxygenase degradation C-term; Ofd1 is a prolyl 4-hydroxylase-like 2-oxoglutarate-Fe(II) dioxygenase that accelerates the degradation of Sre1N in the presence of oxygen. The domain is conserved from yeasts to humans. Yeast Sre1 is the orthologue of mammalian sterol regulatory element binding protein (SREBP), and it responds to changes in oxygen-dependent sterol synthesis as an indirect measure of oxygen availability. However, unlike the prolyl 4-hydroxylases that regulate mammalian hypoxia-inducible factor, Ofd1 uses multiple domains to regulate Sre1N degradation by oxygen; the Ofd1 N-terminal dioxygenase domain is required for oxygen sensing and this Ofd1 C-terminal domain accelerates Sre1N degradation in yeasts." Q#1249 - CGI_10024019 superfamily 248293 202 283 0.00296015 35.4123 cl17739 MADF_DNA_bdg superfamily - - Alcohol dehydrogenase transcription factor Myb/SANT-like; The myb/SANT-like domain in Adf-1 (MADF) is an approximately 80-amino-acid module that directs sequence specific DNA binding to a site consisting of multiple tri-nucleotide repeats. The MADF domain is found in one or more copies in eukaryotic and viral proteins and is often associated with the BESS domain. It is likely that the MADF domain is more closely related to the myb/SANT domain than it is to other HTH domains. 
Q#1250 - CGI_10024020 superfamily 244363 51 99 1.43E-15 68.6247 cl06336 Commd superfamily C - "COMM_Domain, a family of domains found at the C-terminus of HCarG, the copper metabolism gene MURR1 product, and related proteins. Presumably all COMM_Domain containing proteins are located in the nucleus and the COMM domain plays a role in protein-protein interactions. Several family members have been shown to bind and inhibit NF-kappaB. Murr1/Commd1 is a protein involved in copper homeostasis, which has also been identified as a regulator of the human delta epithelial sodium channel. HCaRG, a nuclear protein that might be involved in cell proliferation, is negatively regulated by extracellular calcium concentration, and its basal mRNA levels are higher in hypertensive animals." Q#1251 - CGI_10024021 superfamily 245847 31 175 4.39E-29 111.29 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#1251 - CGI_10024021 superfamily 245847 347 443 3.48E-15 72.1499 cl12042 FA58C superfamily C - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#1254 - CGI_10002901 superfamily 245864 27 435 5.56E-38 143.573 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. 
Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#1255 - CGI_10007895 superfamily 241733 6 81 3.19E-24 88.0866 cl00259 Sm_like superfamily - - "Sm and related proteins; The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes." Q#1256 - CGI_10007896 superfamily 245213 38 74 1.62E-07 44.1646 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. 
Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#1256 - CGI_10007896 superfamily 245213 76 112 4.30E-07 43.009 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#1257 - CGI_10007897 superfamily 245213 265 302 0.000520922 37.231 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#1259 - CGI_10007899 superfamily 247057 2 42 2.36E-06 40.0382 cl15755 SAM_superfamily superfamily NC - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. 
SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain sites of phosphorylation and/or kinase docking sites, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#1260 - CGI_10007900 superfamily 152053 16 32 0.00519167 31.3467 cl13123 Cu-binding_MopE superfamily N - "Protein metal binding site; This family of proteins represents a unique protein copper binding site that involves a tryptophan metabolite, kynurenine in the protein MopE. The production of kynurenine by modification of tryptophan and its involvement in copper binding is an innate property of MopE." Q#1261 - CGI_10007901 superfamily 247792 16 66 6.57E-05 39.3512 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#1263 - CGI_10006069 superfamily 199166 37 245 1.10E-11 62.3448 cl15308 AMN1 superfamily - - "Antagonist of mitotic exit network protein 1; Amn1 has been functionally characterized in Saccharomyces cerevisiae as a component of the Antagonist of MEN pathway (AMEN). The AMEN network is activated by MEN (mitotic exit network) via an active Cdc14, and in turn switches off MEN. Amn1 constitutes one of the alternative mechanisms by which MEN may be disrupted. 
Specifically, Amn1 binds Tem1 (Termination of M-phase, a GTPase that belongs to the RAS superfamily), and disrupts its association with Cdc15, the primary downstream target. Amn1 is a leucine-rich repeat (LRR) protein, with 12 repeats in the S. cerevisiae ortholog. As a negative regulator of the signal transduction pathway MEN, overexpression of AMN1 slows the growth of wild type cells. The function of the vertebrate members of this family has not been determined experimentally; they have fewer LRRs that determine the extent of this model." Q#1263 - CGI_10006069 superfamily 199166 159 335 6.15E-10 57.3372 cl15308 AMN1 superfamily - - "Antagonist of mitotic exit network protein 1; Amn1 has been functionally characterized in Saccharomyces cerevisiae as a component of the Antagonist of MEN pathway (AMEN). The AMEN network is activated by MEN (mitotic exit network) via an active Cdc14, and in turn switches off MEN. Amn1 constitutes one of the alternative mechanisms by which MEN may be disrupted. Specifically, Amn1 binds Tem1 (Termination of M-phase, a GTPase that belongs to the RAS superfamily), and disrupts its association with Cdc15, the primary downstream target. Amn1 is a leucine-rich repeat (LRR) protein, with 12 repeats in the S. cerevisiae ortholog. As a negative regulator of the signal transduction pathway MEN, overexpression of AMN1 slows the growth of wild type cells. The function of the vertebrate members of this family has not been determined experimentally; they have fewer LRRs that determine the extent of this model." Q#1263 - CGI_10006069 superfamily 243074 5 31 8.28E-06 42.4937 cl02535 F-box-like superfamily N - F-box-like; This is an F-box-like family. 
Q#1264 - CGI_10006070 superfamily 246908 493 579 3.65E-29 112.621 cl15255 SH2 superfamily - - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." Q#1264 - CGI_10006070 superfamily 245201 608 856 6.82E-148 438.435 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#1264 - CGI_10006070 superfamily 245835 10 246 1.94E-54 188.747 cl12013 BAR superfamily - - "The Bin/Amphiphysin/Rvs (BAR) domain, a dimerization module that binds membranes and detects membrane curvature; BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions including organelle biogenesis, membrane trafficking or remodeling, and cell division and migration. Mutations in BAR containing proteins have been linked to diseases and their inactivation in cells leads to altered membrane dynamics. 
A BAR domain with an additional N-terminal amphipathic helix (an N-BAR) can drive membrane curvature. These N-BAR domains are found in amphiphysins and endophilins, among others. BAR domains are also frequently found alongside domains that determine lipid specificity, such as the Pleckstrin Homology (PH) and Phox Homology (PX) domains which are present in beta centaurins (ACAPs and ASAPs) and sorting nexins, respectively. A FES-CIP4 Homology (FCH) domain together with a coiled coil region is called the F-BAR domain and is present in Pombe/Cdc15 homology (PCH) family proteins, which include Fes/Fes tyrosine kinases, PACSIN or syndapin, CIP4-like proteins, and srGAPs, among others. The Inverse (I)-BAR or IRSp53/MIM homology Domain (IMD) is found in multi-domain proteins, such as IRSp53 and MIM, that act as scaffolding proteins and transducers of a variety of signaling pathways that link membrane dynamics and the underlying actin cytoskeleton. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The I-BAR domain induces membrane protrusions in the opposite direction compared to classical BAR and F-BAR domains, which produce membrane invaginations. BAR domains that also serve as protein interaction domains include those of arfaptin and OPHN1-like proteins, among others, which bind to Rac and Rho GAP domains, respectively." Q#1267 - CGI_10019202 superfamily 241832 22 96 4.07E-16 70.3496 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. 
The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#1267 - CGI_10019202 superfamily 243175 108 224 2.62E-13 63.3208 cl02776 GST_C_family superfamily - - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. 
In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." Q#1268 - CGI_10019203 superfamily 241550 129 396 2.57E-49 173.904 cl00015 nt_trans superfamily - - "nucleotidyl transferase superfamily; nt_trans (nucleotidyl transferase) This superfamily includes the class I amino-acyl tRNA synthetases, pantothenate synthetase (PanC), ATP sulfurylase, and the cytidylyltransferases, all of which have a conserved dinucleotide-binding domain." Q#1268 - CGI_10019203 superfamily 245839 480 574 1.06E-06 47.9587 cl12020 Anticodon_Ia_like superfamily C - "Anticodon-binding domain of class Ia aminoacyl tRNA synthetases and similar domains; This domain is found in a variety of class Ia aminoacyl tRNA synthetases, C-terminal to the catalytic core domain. It recognizes and specifically binds to the anticodon of the tRNA. Aminoacyl tRNA synthetases catalyze the transfer of cognate amino acids to the 3'-end of their tRNAs by specifically recognizing cognate from non-cognate amino acids. Members include valyl-, leucyl-, isoleucyl-, cysteinyl-, arginyl-, and methionyl-tRNA synthethases. This superfamily also includes a domain from MshC, an enzyme in the mycothiol biosynthetic pathway." Q#1269 - CGI_10019204 superfamily 220403 100 276 5.55E-51 171.949 cl18555 Tmemb_55A superfamily N - "Transmembrane protein 55A; Members of this family catalyze the hydrolysis of the 4-position phosphate of phosphatidylinositol 4,5-bisphosphate, in the reaction: 1-phosphatidyl-myo-inositol 4,5-bisphosphate + H(2)O = 1-phosphatidyl-1D-myo-inositol 5-phosphate + phosphate." 
Q#1272 - CGI_10019207 superfamily 241563 190 223 3.35E-07 48.0523 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#1272 - CGI_10019207 superfamily 247792 8 62 2.15E-05 42.818 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#1272 - CGI_10019207 superfamily 216033 396 489 1.14E-21 90.856 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#1272 - CGI_10019207 superfamily 110440 634 661 4.24E-08 50.4841 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#1272 - CGI_10019207 superfamily 110440 565 592 9.31E-07 46.6321 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. 
It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#1272 - CGI_10019207 superfamily 110440 518 545 9.40E-07 46.6321 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#1272 - CGI_10019207 superfamily 110440 681 708 5.17E-06 44.3209 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#1273 - CGI_10019208 superfamily 241782 62 482 2.45E-147 430.837 cl00321 AAT_I superfamily - - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. 
PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phophorylase family (fold type V)." Q#1274 - CGI_10019209 superfamily 222599 9 99 1.58E-20 79.6021 cl16717 DUF4326 superfamily - - "Domain of unknown function (DUF4326); This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria, archaea, eukaryotes and viruses. Proteins in this family are typically between 100 and 162 amino acids in length. There are two completely conserved residues (P and C) that may be functionally important." Q#1275 - CGI_10019210 superfamily 222599 10 98 1.55E-10 54.9493 cl16717 DUF4326 superfamily C - "Domain of unknown function (DUF4326); This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria, archaea, eukaryotes and viruses. 
Proteins in this family are typically between 100 and 162 amino acids in length. There are two completely conserved residues (P and C) that may be functionally important." Q#1277 - CGI_10019212 superfamily 243035 182 301 1.56E-29 109.632 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. 
Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#1278 - CGI_10019213 superfamily 215896 60 113 1.45E-08 48.8304 cl18351 Cu-oxidase superfamily NC - Multicopper oxidase; Many of the proteins in this family contain multiple similar copies of this plastocyanin-like domain. Q#1279 - CGI_10019214 superfamily 247725 156 283 1.99E-74 228.334 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#1279 - CGI_10019214 superfamily 241631 10 129 1.59E-42 147.369 cl00136 Sec7 superfamily N - Sec7 domain; Domain named after the S. cerevisiae SEC7 gene product. 
The Sec7 domain is the central domain of the guanine-nucleotide-exchange factors (GEFs) of the ADP-ribosylation factor family of small GTPases (ARFs) . It carries the exchange factor activity. Q#1280 - CGI_10019215 superfamily 243035 77 193 2.28E-19 79.9713 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. 
Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#1283 - CGI_10019218 superfamily 192997 293 439 1.90E-28 112.675 cl18184 Sterol-sensing superfamily - - "Sterol-sensing domain of SREBP cleavage-activation; Sterol regulatory element-binding proteins (SREBPs) are membrane-bound transcription factors that promote lipid synthesis in animal cells. They are embedded in the membranes of the endoplasmic reticulum (ER) in a helical hairpin orientation and are released from the ER by a two-step proteolytic process. Proteolysis begins when the SREBPs are cleaved at Site-1, which is located at a leucine residue in the middle of the hydrophobic loop in the lumen of the ER. Upon proteolytic processing SREBP can activate the expression of genes involved in cholesterol biosynthesis and uptake. SCAP stimulates cleavage of SREBPs via fusion of the their two C-termini. This domain is the transmembrane region that traverses the membrane eight times and is the sterol-sensing domain of the cleavage protein. WD40 domains are found towards the C-terminus." 
Q#1284 - CGI_10019220 superfamily 245210 5 68 4.61E-24 92.2358 cl09938 cond_enzymes superfamily C - "Condensing enzymes; Family of enzymes that catalyze a (decarboxylating or non-decarboxylating) Claisen-like condensation reaction. Members share strong structural similarity, and are involved in the synthesis and degradation of fatty acids, and the production of polyketides, a diverse group of natural products." Q#1286 - CGI_10019222 superfamily 247724 31 87 0.00303107 36.5358 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." 
Q#1288 - CGI_10019224 superfamily 243613 62 118 0.00514474 33.6967 cl04011 DPBB_1 superfamily C - "Rare lipoprotein A (RlpA)-like double-psi beta-barrel; Rare lipoprotein A (RlpA) contains a conserved region that has the double-psi beta-barrel (DPBB) fold. The function of RlpA is not well understood, but it has been shown to act as a prc mutant suppressor in Escherichia coli. The DPBB fold is often an enzymatic domain. The members of this family are quite diverse, and if catalytic this family may contain several different functions. Another example of this domain is found in the N terminus of pollen allergen." Q#1289 - CGI_10019225 superfamily 244910 134 205 0.000857301 36.0085 cl08320 Pollen_allerg_1 superfamily - - "Pollen allergen; This family contains allergens lol PI, PII and PIII from Lolium perenne." Q#1292 - CGI_10019228 superfamily 241578 35 197 3.01E-44 155.527 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." 
Q#1292 - CGI_10019228 superfamily 241578 286 446 1.66E-41 148.208 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#1293 - CGI_10019230 superfamily 245364 632 756 3.25E-77 246.409 cl10717 CactinC_cactus superfamily - - "Cactus-binding C-terminus of cactin protein; CactinC_cactus is the C-terminal 200 residues of the cactin protein which are necessary for the association of cactin with IkappaB-cactus as one of the intracellular members of the Rel complex. The Rel (NF-kappaB) pathway is conserved in invertebrates and vertebrates. In mammals, it controls the activities of the immune and inflammatory response genes as well as viral genes, and is critical for cell growth and survival. In Drosophila, the Rel pathway functions in the innate cellular and humoral immune response, in muscle development, and in the establishment of dorsal-ventral polarity in the early embryo. Most members of the family also have a Cactin_mid domain pfam10312 further upstream." 
Q#1293 - CGI_10019230 superfamily 220686 246 431 2.22E-57 194.449 cl10987 Cactin_mid superfamily - - "Conserved mid region of cactin; This is the conserved middle region of a family of proteins referred to as cactins. The region contains two of three predicted coiled-coil domains. Most members of this family have a CactinC_cactus pfam09732 domain at the C-terminal end. Upstream of Mid_cactin in Drosophila members are a serine-rich region, some non-typical RD motifs and three predicted bipartite nuclear localisation signals, none of which are well-conserved. Cactin associates with IkappaB-cactus as one of the intracellular members of the Rel (NF-kappaB) pathway which is conserved in invertebrates and vertebrates. In mammals, this pathway controls the activities of the immune and inflammatory response genes as well as viral genes, and is critical for cell growth and survival. In Drosophila, the Rel pathway functions in the innate cellular and humoral immune response, in muscle development, and in the establishment of dorsal-ventral polarity in the early embryo." Q#1297 - CGI_10019234 superfamily 245226 9 181 2.61E-105 302.158 cl10012 DnaQ_like_exo superfamily - - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). 
The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#1298 - CGI_10019235 superfamily 243138 648 901 3.47E-111 344.744 cl02675 DZF superfamily - - DZF domain; The function of this domain is unknown. It is often found associated with pfam00098 or pfam00035. This domain has been predicted to belong to the nucleotidyltransferase superfamily. Q#1298 - CGI_10019235 superfamily 197732 215 244 2.41E-06 45.7063 cl18195 ZnF_U1 superfamily - - "U1-like zinc finger; Family of C2H2-type zinc fingers, present in matrin, U1 small nuclear ribonucleoprotein C and other RNA-binding proteins." Q#1298 - CGI_10019235 superfamily 197732 439 467 2.64E-05 42.6247 cl18195 ZnF_U1 superfamily - - "U1-like zinc finger; Family of C2H2-type zinc fingers, present in matrin, U1 small nuclear ribonucleoprotein C and other RNA-binding proteins." Q#1298 - CGI_10019235 superfamily 197732 265 294 0.000112181 41.0839 cl18195 ZnF_U1 superfamily - - "U1-like zinc finger; Family of C2H2-type zinc fingers, present in matrin, U1 small nuclear ribonucleoprotein C and other RNA-binding proteins." 
Q#1300 - CGI_10019237 superfamily 247907 20 162 5.17E-24 101.726 cl17353 LamG superfamily - - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." Q#1300 - CGI_10019237 superfamily 247907 197 359 4.84E-09 57.0429 cl17353 LamG superfamily - - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." Q#1301 - CGI_10019238 superfamily 247743 131 296 1.96E-21 91.8239 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." 
Q#1301 - CGI_10019238 superfamily 247743 402 589 7.57E-16 75.6455 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#1302 - CGI_10019239 superfamily 219904 23 95 4.37E-15 64.6767 cl07245 Ribosomal_L37 superfamily - - Mitochondrial ribosomal protein L37; This family includes yeast MRPL37 a mitochondrial ribosomal protein. Q#1304 - CGI_10019241 superfamily 190882 148 363 5.70E-64 205.146 cl04416 SCAMP superfamily - - "SCAMP family; In vertebrates, secretory carrier membrane proteins (SCAMPs) 1-3 constitute a family of putative membrane-trafficking proteins composed of cytoplasmic N-terminal sequences with NPF repeats, four central transmembrane regions (TMRs), and a cytoplasmic tail. SCAMPs probably function in endocytosis by recruiting EH-domain proteins to the N-terminal NPF repeats but may have additional functions mediated by their other sequences." Q#1305 - CGI_10019242 superfamily 241874 25 477 5.78E-136 405.75 cl00456 SLC5-6-like_sbd superfamily - - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. 
SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#1306 - CGI_10019243 superfamily 242065 29 156 1.67E-49 167.284 cl00749 UPF0066 superfamily - - "Escherichia coli YaeB and related proteins; Uncharacterized protein family UPF0066. This domain includes Escherichia coli YeaB, Archeoglobus fulgidus AF0241, and Agrobacterium tumefaciens VirR. Proteins with this domain are probable S-adenosylmethionine-dependent methyltransferases but they have not been functionally characterized and the substrate is unknown." Q#1306 - CGI_10019243 superfamily 242065 179 306 1.53E-44 153.802 cl00749 UPF0066 superfamily - - "Escherichia coli YaeB and related proteins; Uncharacterized protein family UPF0066. This domain includes Escherichia coli YeaB, Archeoglobus fulgidus AF0241, and Agrobacterium tumefaciens VirR. Proteins with this domain are probable S-adenosylmethionine-dependent methyltransferases but they have not been functionally characterized and the substrate is unknown." Q#1308 - CGI_10019245 superfamily 241680 33 275 9.54E-72 223.67 cl00200 MIP superfamily - - "Major intrinsic protein (MIP) superfamily. Members of the MIP superfamily function as membrane channels that selectively transport water, small neutral molecules, and ions out of and between cells. 
The channel proteins share a common fold: the N-terminal cytosolic portion followed by six transmembrane helices, which might have arisen through gene duplication. On the basis of sequence similarity and functional characteristics, the superfamily can be subdivided into two major groups: water-selective channels called aquaporins (AQPs) and glycerol uptake facilitators (GlpFs). AQPs are found in all three kingdoms of life, while GlpFs have been characterized only within microorganisms." Q#1309 - CGI_10019246 superfamily 241752 1 63 1.32E-19 76.5893 cl00283 ADP_ribosyl superfamily N - "ADP_ribosylating enzymes catalyze the transfer of ADP_ribose from NAD+ to substrates. Bacterial toxins are cytoplasmic and catalyze the transfer of a single ADP_ribose unit to eukaryotic elongation factor 2, halting protein synthesis and killing the cell. Poly(ADP-ribose) polymerases (PARPS 1-3, VPARP, tankyrase) catalyze the addition of up to 100 ADP_ribose units from NAD+. PARPs 1 and 2 are localized in the nucleus, bind DNA, and are activated by DNA damage. VPARP is part of the vault ribonucleoprotein complex. Tankyrases regulate telomere length in part through poly(ADP_ribosylation) of telomere repeat binding factor 1 (TRF1). Poly(ADP-ribose) polymerase catalyses the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active." Q#1317 - CGI_10004325 superfamily 241574 268 332 9.06E-07 47.9657 cl00053 PTPc superfamily C - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. 
The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#1317 - CGI_10004325 superfamily 241574 202 235 0.000500909 39.8766 cl00053 PTPc superfamily NC - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#1319 - CGI_10004327 superfamily 241574 25 118 1.18E-22 93.8045 cl00053 PTPc superfamily NC - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." 
Q#1319 - CGI_10004327 superfamily 241574 275 331 0.00687248 36.0246 cl00053 PTPc superfamily N - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#1320 - CGI_10002266 superfamily 110440 462 488 0.000115003 40.0837 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#1320 - CGI_10002266 superfamily 241563 37 73 0.000234776 38.9996 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#1322 - CGI_10003945 superfamily 247724 37 155 8.25E-47 164.988 cl17170 Ras_like_GTPase superfamily N - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. 
The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#1322 - CGI_10003945 superfamily 247724 162 206 6.76E-20 88.382 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#1322 - CGI_10003945 superfamily 241578 447 628 1.69E-10 60.7219 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." 
Q#1322 - CGI_10003945 superfamily 192987 413 470 0.00605975 36.3963 cl13724 TMF_TATA_bd superfamily C - "TATA element modulatory factor 1 TATA binding; This is the C-terminal conserved coiled coil region of a family of TATA element modulatory factor 1 proteins conserved in eukaryotes. The proteins bind to the TATA element of some RNA polymerase II promoters and repress their activity by competing with the binding of TATA binding protein. TMF1_TATA_bd is the most conserved part of the TMFs. TMFs are evolutionarily conserved golgins that bind Rab6, a ubiquitous ras-like GTP-binding Golgi protein, and contribute to Golgi organisation in animal and plant cells. The Rab6-binding domain appears to be the same region as this C-terminal family." Q#1327 - CGI_10001540 superfamily 245847 7 79 7.40E-12 58.7221 cl12042 FA58C superfamily C - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#1328 - CGI_10003327 superfamily 243056 198 305 1.01E-08 53.1317 cl02495 RabGAP-TBC superfamily N - "Rab-GTPase-TBC domain; Identification of a TBC domain in GYP6_YEAST and GYP7_YEAST, which are GTPase activator proteins of yeast Ypt6 and Ypt7, implies that these domains are GTPase activator proteins of Rab-like small GTPases." Q#1329 - CGI_10003328 superfamily 220691 119 191 0.00591057 36.827 cl18569 7TM_GPCR_Srv superfamily NC - Serpentine type 7TM GPCR chemoreceptor Srv; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srv is a member of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. 
Q#1342 - CGI_10009551 superfamily 241578 512 646 9.26E-10 58.3462 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#1342 - CGI_10009551 superfamily 241578 963 1120 5.88E-22 94.9744 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. 
Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#1342 - CGI_10009551 superfamily 241578 169 325 3.01E-16 77.8213 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#1345 - CGI_10009554 superfamily 221683 66 144 7.27E-09 53.0415 cl15002 UPF0489 superfamily - - UPF0489 domain; This family is probably an enzyme which is related to the Arginase family. Q#1351 - CGI_10006012 superfamily 243074 9 55 0.000691335 41.3381 cl02535 F-box-like superfamily - - F-box-like; This is an F-box-like family. Q#1352 - CGI_10012847 superfamily 243161 5 61 3.27E-11 55.0929 cl02739 THAP superfamily C - "THAP domain; The THAP domain is a putative DNA-binding domain (DBD) and probably also binds a zinc ion. It features the conserved C2CH architecture (consensus sequence: Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His). 
Other universal features include the location of the domain at the N-termini of proteins, its size of about 90 residues, a C-terminal AVPTIF box and several other conserved residues. Orthologues of the human THAP domain have been identified in other vertebrates and probably worms and flies, but not in other eukaryotes or any prokaryotes." Q#1357 - CGI_10012855 superfamily 248264 325 439 0.000532924 39.913 cl17710 DDE_4 superfamily C - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#1360 - CGI_10012858 superfamily 110440 84 110 0.00195237 33.9205 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#1365 - CGI_10009791 superfamily 245201 5 254 5.57E-138 392.086 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. 
These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#1366 - CGI_10009792 superfamily 242059 9 442 3.93E-40 149.819 cl00738 MBOAT superfamily - - "MBOAT, membrane-bound O-acyltransferase family; The MBOAT (membrane bound O-acyl transferase) family of membrane proteins contains a variety of acyltransferase enzymes. A conserved histidine has been suggested to be the active site residue." Q#1367 - CGI_10009793 superfamily 241571 242 361 2.40E-25 102.876 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#1367 - CGI_10009793 superfamily 241571 95 213 3.33E-24 99.409 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#1367 - CGI_10009793 superfamily 241571 529 638 9.22E-18 80.9194 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#1367 - CGI_10009793 superfamily 241571 392 505 3.45E-17 79.3786 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#1367 - CGI_10009793 superfamily 241571 648 775 2.51E-12 65.1262 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." 
Q#1367 - CGI_10009793 superfamily 241613 783 816 0.000707617 38.7685 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#1370 - CGI_10009796 superfamily 244881 10 311 1.24E-146 419.377 cl08267 ISOPREN_C2_like superfamily - - "This group contains class II terpene cyclases, protein prenyltransferases beta subunit, two broadly specific proteinase inhibitors alpha2-macroglobulin (alpha (2)-M) and pregnancy zone protein (PZP) and, the C3 C4 and C5 components of vertebrate complement. Class II terpene cyclases include squalene cyclase (SQCY) and 2,3-oxidosqualene cyclase (OSQCY), these integral membrane proteins catalyze a cationic cyclization cascade converting linear triterpenes to fused ring compounds. The protein prenyltransferases include protein farnesyltransferase (FTase) and geranylgeranyltransferase types I and II (GGTase-I and GGTase-II) which catalyze the carboxyl-terminal lipidation of Ras, Rab, and several other cellular signal transduction proteins, facilitating membrane associations and specific protein-protein interactions. Alpha (2)-M is a major carrier protein in serum and involved in the immobilization and entrapment of proteases. PZP is a pregnancy associated protein. 
Alpha (2)-M and PZP are known to bind to and, may modulate, the activity of placental protein-14 in T-cell growth and cytokine production thereby protecting the allogeneic fetus from attack by the maternal immune system." Q#1371 - CGI_10009797 superfamily 241788 440 488 5.21E-19 82.5386 cl00327 Ribosomal_L22 superfamily N - "Ribosomal protein L22/L17e. L22 (L17 in eukaryotes) is a core protein of the large ribosomal subunit. It is the only ribosomal protein that interacts with all six domains of 23S rRNA, and is one of the proteins important for directing the proper folding and stabilizing the conformation of 23S rRNA. L22 is the largest protein contributor to the surface of the polypeptide exit channel, the tunnel through which the polypeptide product passes. L22 is also one of six proteins located at the putative translocon binding site on the exterior surface of the ribosome." Q#1371 - CGI_10009797 superfamily 245814 236 309 1.24E-05 43.2467 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#1371 - CGI_10009797 superfamily 245814 143 201 2.10E-11 60.1131 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#1371 - CGI_10009797 superfamily 245814 32 98 6.94E-05 41.259 cl11960 Ig superfamily C - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#1372 - CGI_10009798 superfamily 245201 78 430 0 573.228 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#1373 - CGI_10009799 superfamily 247948 99 146 2.12E-12 62.3114 cl17394 RINGv superfamily - - RING-variant domain; RING-variant domain. 
Q#1374 - CGI_10009800 superfamily 217293 32 136 2.70E-06 46.4719 cl03788 Neur_chan_LBD superfamily C - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#1375 - CGI_10009801 superfamily 115363 170 224 1.27E-09 54.6854 cl05972 MIB_HERC2 superfamily - - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). Q#1375 - CGI_10009801 superfamily 241578 10 116 5.18E-05 42.4416 cl00057 vWFA superfamily C - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#1376 - CGI_10009802 superfamily 242406 49 107 3.41E-12 58.3717 cl01271 DUF1768 superfamily C - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. 
Q#1377 - CGI_10014543 superfamily 246616 1 307 3.70E-34 128.194 cl14105 MetH superfamily - - "Methionine synthase I (cobalamin-dependent), methyltransferase domain [Amino acid transport and metabolism]" Q#1378 - CGI_10014544 superfamily 246616 218 305 0.00161364 38.4871 cl14105 MetH superfamily C - "Methionine synthase I (cobalamin-dependent), methyltransferase domain [Amino acid transport and metabolism]" Q#1381 - CGI_10014547 superfamily 245201 14 348 3.86E-139 406.743 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#1381 - CGI_10014547 superfamily 221460 461 492 2.46E-05 42.0207 cl12053 OSR1_C superfamily - - "Oxidative-stress-responsive kinase 1 C terminal; This domain family is found in eukaryotes, and is approximately 40 amino acids in length. The family is found in association with pfam00069. There is a single completely conserved residue F that may be functionally important. OSR1 is involved in the signalling cascade which activates Na/K/2Cl cotransporter during osmotic stress. This domain is the C terminal domain of OSR1 which recognises a motif (Arg-Phe-Xaa-Val) on the OSR1-activating protein WNK1." Q#1383 - CGI_10014549 superfamily 198827 19 63 0.000449235 34.3272 cl03803 BAF superfamily NC - Barrier to autointegration factor; The BAF protein has a SAM-domain-like bundle of orthogonally packed alpha-hairpins - one classic and one pseudo helix-hairpin-helix motif. The protein is involved in the prevention of retroviral DNA integration. 
Q#1384 - CGI_10014550 superfamily 243119 108 152 9.56E-06 42.8061 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#1384 - CGI_10014550 superfamily 247907 282 384 0.000177293 40.0154 cl17353 LamG superfamily C - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." Q#1384 - CGI_10014550 superfamily 243119 66 101 0.000668119 37.4133 cl02629 CBM_14 superfamily N - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. 
Q#1385 - CGI_10014551 superfamily 241611 77 238 2.45E-08 52.0056 cl00102 PTX superfamily - - "Pentraxins are plasma proteins characterized by their pentameric discoid assembly and their Ca2+ dependent ligand binding, such as Serum amyloid P component (SAP) and C-reactive Protein (CRP), which are cytokine-inducible acute-phase proteins implicated in innate immunity. CRP binds to ligands containing phosphocholine, SAP binds to amyloid fibrils, DNA, chromatin, fibronectin, C4-binding proteins and glycosaminoglycans. "Long" pentraxins have N-terminal extensions to the common pentraxin domain; one group, the neuronal pentraxins, may be involved in synapse formation and remodeling, and they may also be able to form heteromultimers." Q#1388 - CGI_10014554 superfamily 241571 64 179 5.17E-13 67.4374 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#1388 - CGI_10014554 superfamily 238012 838 882 2.17E-05 43.497 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#1388 - CGI_10014554 superfamily 238012 884 929 6.97E-05 41.9562 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three 
resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#1388 - CGI_10014554 superfamily 243146 309 348 3.22E-06 46.1154 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#1388 - CGI_10014554 superfamily 243146 422 473 0.00412865 36.8835 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#1389 - CGI_10014555 superfamily 247948 12 67 1.22E-16 71.9414 cl17394 RINGv superfamily - - RING-variant domain; RING-variant domain. Q#1390 - CGI_10014556 superfamily 247948 12 67 1.12E-14 63.8522 cl17394 RINGv superfamily - - RING-variant domain; RING-variant domain. Q#1391 - CGI_10014557 superfamily 247723 377 497 9.52E-56 185.343 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." 
Q#1391 - CGI_10014557 superfamily 247723 514 594 2.05E-54 181.529 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." 
Q#1392 - CGI_10014558 superfamily 247684 9 461 1.63E-87 279.933 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#1393 - CGI_10014559 superfamily 206634 65 186 3.63E-54 170.635 cl16904 AKAP28 superfamily - - "28 kDa A-kinase anchor; 28 kDa AKAP (AKAP28) is highly enriched in human airway axonemes. The mRNA for AKAP28 is up-regulated as primary airway cells differentiate and is specifically expressed in tissues containing cilia and/or flagella. 
Homologs of AKAP28 are present in all animals and in some, including mice the AKAP28-like domain are preceded by another uncharacterized domain" Q#1394 - CGI_10014560 superfamily 206050 28 123 2.06E-21 85.7887 cl16449 KIAA1430 superfamily - - KIAA1430 homologue; This is a family of KIAA1430 homologues. The function is not known. Q#1395 - CGI_10014561 superfamily 243555 22 214 2.73E-13 67.031 cl03871 Chitin_bind_3 superfamily - - "Chitin binding domain; This domain is found associated with a wide variety of cellulose binding domain. This domain however is a chitin binding domain. This domain is found in isolation in baculoviral spheroidins and spindolins, protein of unknown function." Q#1399 - CGI_10014565 superfamily 248100 72 132 0.000557878 36.3632 cl17546 PQ-loop superfamily - - "PQ loop repeat; Members of this family are all membrane bound proteins possessing a pair of repeats each spanning two transmembrane helices connected by a loop. The PQ motif found on loop 2 is critical for the localisation of cystinosin to lysosomes. However, the PQ motif appears not to be a general lysosome-targeting motif. It is thought likely to possess a more general function. Most probably this involves a glutamine residue." Q#1401 - CGI_10014567 superfamily 246925 434 585 6.40E-05 44.6538 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." 
Q#1401 - CGI_10014567 superfamily 243051 53 195 0.00427231 37.7426 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal cord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#1404 - CGI_10002873 superfamily 247780 310 588 5.25E-143 418.876 cl17226 NAD_bind_amino_acid_DH superfamily - - "NAD(P) binding domain of amino acid dehydrogenase-like proteins; Amino acid dehydrogenase(DH)-like NAD(P)-binding domains are members of the Rossmann fold superfamily and are found in glutamate, leucine, and phenylalanine DHs (DHs), methylene tetrahydrofolate DH, methylene-tetrahydromethanopterin DH, methylene-tetrahydrofolate DH/cyclohydrolase, Shikimate DH-like proteins, malate oxidoreductases, and glutamyl tRNA reductase. Amino acid DHs catalyze the deamination of amino acids to keto acids with NAD(P)+ as a cofactor. 
The NAD(P)-binding Rossmann fold superfamily includes a wide variety of protein families including NAD(P)- binding domains of alcohol DHs, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate DH, lactate/malate DHs, formate/glycerate DHs, siroheme synthases, 6-phosphogluconate DH, amino acid DHs, repressor rex, NAD-binding potassium channel domain, CoA-binding, and ornithine cyclodeaminase-like domains. These domains have an alpha-beta-alpha configuration. NAD binding involves numerous hydrogen and van der Waals contacts." Q#1404 - CGI_10002873 superfamily 215894 122 300 2.88E-103 313.044 cl02855 malic superfamily - - "Malic enzyme, N-terminal domain; Malic enzyme, N-terminal domain. " Q#1407 - CGI_10008634 superfamily 217293 33 233 2.86E-34 127.364 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#1407 - CGI_10008634 superfamily 202474 240 340 2.18E-13 68.0641 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#1413 - CGI_10008641 superfamily 241600 287 502 1.39E-94 288.755 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. 
C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#1415 - CGI_10011964 superfamily 247725 265 356 8.52E-59 196.024 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#1415 - CGI_10011964 superfamily 149993 456 613 1.15E-48 170.386 cl07673 Talin_middle superfamily - - "Talin, middle domain; Members of this family adopt a structure consisting of five alpha helices that fold into a bundle. They contain a Vinculin binding site (VBS) composed of a hydrophobic surface spanning five turns of helix four. Activation of the VBS causes subsequent recruitment of Vinculin, which enables maturation of small integrin/talin complexes into more stable adhesions. Formation of the complex between VBS and Vinculin requires prior unfolding of this middle domain: once released from the talin hydrophobic core, the VBS helix is then available to induce the 'bundle conversion' conformational change within the vinculin head domain thereby displacing the intramolecular interaction with the vinculin tail, allowing vinculin to bind actin." Q#1415 - CGI_10011964 superfamily 215882 162 269 8.52E-28 109.678 cl09511 FERM_M superfamily - - FERM central domain; This domain is the central structural domain of the FERM domain. 
Q#1415 - CGI_10011964 superfamily 220215 46 154 3.45E-09 54.9238 cl09630 FERM_N superfamily - - FERM N-terminal domain; This domain is the N-terminal ubiquitin-like structural domain of the FERM domain. Q#1418 - CGI_10011967 superfamily 245226 145 341 4.19E-43 153.209 cl10012 DnaQ_like_exo superfamily - - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#1420 - CGI_10011969 superfamily 218405 134 246 0.000169644 40.9429 cl18455 DUF676 superfamily C - Putative serine esterase (DUF676); This family of proteins are probably serine esterase type enzymes with an alpha/beta hydrolase fold. 
Q#1420 - CGI_10011969 superfamily 247101 241 311 0.000400199 40.2413 cl15849 Palm_thioest superfamily N - Palmitoyl protein thioesterase; Palmitoyl protein thioesterase. Q#1421 - CGI_10011970 superfamily 247692 415 530 5.45E-18 83.8798 cl17068 AFD_class_I superfamily N - "Adenylate forming domain, Class I; This family includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain." Q#1421 - CGI_10011970 superfamily 247101 160 340 4.17E-05 44.2094 cl15849 Palm_thioest superfamily - - Palmitoyl protein thioesterase; Palmitoyl protein thioesterase. Q#1424 - CGI_10011973 superfamily 248262 7 275 1.10E-131 381.962 cl17708 HMBS superfamily - - "Hydroxymethylbilane synthase (HMBS), also known as porphobilinogen deaminase (PBGD), is an intermediate enzyme in the biosynthetic pathway of tetrapyrrolic ring systems, such as heme, chlorophylls, and vitamin B12. HMBS catalyzes the conversion of porphobilinogen (PBG) into hydroxymethylbilane (HMB). HMBS consists of three domains, and is believed to bind substrate through a hinge-bending motion of domains I and II. HMBS is found in all organisms except viruses." Q#1425 - CGI_10011974 superfamily 242432 248 309 3.57E-10 58.9123 cl01321 SURF1 superfamily C - "SURF1 superfamily. Surf1/Shy1 has been implicated in the posttranslational steps of the biogenesis of the mitochondrially-encoded Cox1 subunit of cytochrome c oxidase (complex IV). Cytochrome c oxidase (complex IV), the terminal electron-transferring complex of the respiratory chain, is an assemblage of nuclear and mitochondrially-encoded subunits. 
Its assembly is mediated by nuclear encoded assembly factors, one of which is Surf1/Shy1. Mutations in human Surf1 are a major cause of Leigh syndrome, a severe neurodegenerative disorder." Q#1425 - CGI_10011974 superfamily 242432 317 388 4.65E-10 57.7387 cl01321 SURF1 superfamily N - "SURF1 superfamily. Surf1/Shy1 has been implicated in the posttranslational steps of the biogenesis of the mitochondrially-encoded Cox1 subunit of cytochrome c oxidase (complex IV). Cytochrome c oxidase (complex IV), the terminal electron-transferring complex of the respiratory chain, is an assemblage of nuclear and mitochondrially-encoded subunits. Its assembly is mediated by nuclear encoded assembly factors, one of which is Surf1/Shy1. Mutations in human Surf1 are a major cause of Leigh syndrome, a severe neurodegenerative disorder." Q#1426 - CGI_10011975 superfamily 191068 20 75 1.08E-14 63.7971 cl04701 ETC_C1_NDUFA5 superfamily - - ETC complex I subunit conserved region; Family of eukaryotic NADH-ubiquinone oxidoreductase subunits (EC:1.6.5.3) (EC:1.6.99.3) from complex I of the electron transport chain initially identified in Neurospora crassa as a 29.9 kDa protein. The conserved region is found at the N-terminus of the member proteins. Q#1427 - CGI_10011976 superfamily 247725 307 417 1.85E-59 198.132 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." 
Q#1430 - CGI_10011979 superfamily 247755 717 937 2.54E-88 284.076 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#1430 - CGI_10011979 superfamily 248376 428 710 6.21E-26 109.42 cl17822 MutS_III superfamily - - "MutS domain III; This domain is found in proteins of the MutS family (DNA mismatch repair proteins) and is found associated with pfam00488, pfam05188, pfam01624 and pfam05190. The MutS family of proteins is named after the Salmonella typhimurium MutS protein involved in mismatch repair; other members of the family included the eukaryotic MSH 1,2,3, 4,5 and 6 proteins. These have various roles in DNA repair and recombination. Human MSH has been implicated in non-polyposis colorectal carcinoma (HNPCC) and is a mismatch binding protein. The aligned region corresponds with domain III, which is central to the structure of Thermus aquaticus MutS as characterized in." Q#1430 - CGI_10011979 superfamily 218486 285 405 1.72E-11 62.7637 cl04975 MutS_II superfamily - - "MutS domain II; This domain is found in proteins of the MutS family (DNA mismatch repair proteins) and is found associated with pfam00488, pfam01624, pfam05192 and pfam05190. The MutS family of proteins is named after the Salmonella typhimurium MutS protein involved in mismatch repair; other members of the family included the eukaryotic MSH 1,2,3, 4,5 and 6 proteins. These have various roles in DNA repair and recombination. 
Human MSH has been implicated in non-polyposis colorectal carcinoma (HNPCC) and is a mismatch binding protein. This domain corresponds to domain II in Thermus aquaticus MutS as characterized in, and resembles RNAse-H-like domains (see pfam00075)." Q#1431 - CGI_10011980 superfamily 247800 106 130 0.000113839 37.1776 cl17246 MarR_2 superfamily NC - "MarR family; The Mar proteins are involved in the multiple antibiotic resistance, a non-specific resistance system. The expression of the mar operon is controlled by a repressor, MarR. A large number of compounds induce transcription of the mar operon. This is thought to be due to the compound binding to MarR, and the resulting complex stops MarR binding to the DNA. With the MarR repression lost, transcription of the operon proceeds. The structure of MarR is known and shows MarR as a dimer with each subunit containing a winged-helix DNA binding motif." Q#1434 - CGI_10001708 superfamily 243035 23 146 4.52E-11 56.8593 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#1436 - CGI_10004256 superfamily 215647 21 118 0.00084485 38.3585 cl18338 7tm_2 superfamily C - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GPCRs). They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#1437 - CGI_10004257 superfamily 241596 77 120 8.08E-10 51.8311 cl00081 HLH superfamily C - "Helix-loop-helix domain, found in specific DNA- binding proteins that act as transcription factors; 60-100 amino acids long. 
A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE (5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#1439 - CGI_10004259 superfamily 247724 58 224 1.54E-11 59.4825 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. 
The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#1440 - CGI_10004260 superfamily 243092 1572 1687 6.45E-11 63.8932 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#1440 - CGI_10004260 superfamily 205451 42 135 4.21E-06 46.8027 cl16203 DUF4062 superfamily - - "Domain of unknown function (DUF4062); This presumed domain is functionally uncharacterized. This domain family is found in bacteria, archaea and eukaryotes, and is approximately 80 amino acids in length. There is a conserved SST sequence motif." 
Q#1440 - CGI_10004260 superfamily 243092 1176 1337 0.000274456 43.4776 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#1440 - CGI_10004260 superfamily 247743 395 547 0.000904535 40.2608 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." 
Q#1443 - CGI_10005280 superfamily 203136 76 160 5.56E-05 40.0204 cl04867 LRAT superfamily N - "Lecithin retinol acyltransferase; The full-length members of this family are representatives of a novel class II tumour-suppressor family, designated as H-REV107-like. This domain is the catalytic N-terminal proline-rich region of the protein. The downstream region is a putative C-terminal transmembrane domain which is found to be crucial for cellular localisation, but not necessary for the enzyme activity. H-REV107-like proteins are homologous to lecithin retinol acyltransferase (LRAT), an enzyme that catalyzes the transfer of the sn-1 acyl group of phosphatidylcholine to all-trans-retinol and forming a retinyl ester." Q#1448 - CGI_10005285 superfamily 246680 9 84 1.90E-16 74.1603 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#1448 - CGI_10005285 superfamily 241567 164 294 1.12E-08 54.5287 cl00042 CASc superfamily C - "Caspase, interleukin-1 beta converting enzyme (ICE) homologues; Cysteine-dependent aspartate-directed proteases that mediate programmed cell death (apoptosis). Caspases are synthesized as inactive zymogens and activated by proteolysis of the peptide backbone adjacent to an aspartate. 
The resulting two subunits associate to form an (alpha)2(beta)2-tetramer which is the active enzyme. Activation of caspases can be mediated by other caspase homologs." Q#1449 - CGI_10005286 superfamily 243175 194 265 3.27E-16 71.8883 cl02776 GST_C_family superfamily - - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. 
Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." Q#1449 - CGI_10005286 superfamily 241832 46 112 1.69E-17 74.9678 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#1451 - CGI_10022918 superfamily 245225 3 86 3.36E-06 48.0765 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily N - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein couples receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine- binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). 
In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#1451 - CGI_10022918 superfamily 243199 170 235 4.65E-05 43.4338 cl02808 RT_like superfamily N - "RT_like: Reverse transcriptase (RT, RNA-dependent DNA polymerase)_like family. An RT gene is usually indicative of a mobile element such as a retrotransposon or retrovirus. RTs occur in a variety of mobile elements, including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, and caulimoviruses. These elements can be divided into two major groups. One group contains retroviruses and DNA viruses whose propagation involves an RNA intermediate. They are grouped together with transposable elements containing long terminal repeats (LTRs). 
The other group, also called poly(A)-type retrotransposons, contain fungal mitochondrial introns and transposable elements that lack LTRs." Q#1452 - CGI_10022920 superfamily 247908 54 199 8.29E-56 177.871 cl17354 NIF superfamily - - NLI interacting factor-like phosphatase; This family contains a number of NLI interacting factor isoforms and also an N-terminal regions of RNA polymerase II CTC phosphatase and FCP1 serine phosphatase. This region has been identified as the minimal phosphatase domain. Q#1453 - CGI_10022921 superfamily 243072 52 157 4.96E-27 101.691 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#1454 - CGI_10022922 superfamily 246669 2 62 5.78E-05 41.8747 cl14603 C2 superfamily C - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. 
C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#1456 - CGI_10022924 superfamily 246683 59 155 9.89E-37 130.703 cl14648 Aldose_epim superfamily C - "aldose 1-epimerase superfamily; Aldose 1-epimerases or mutarotases are key enzymes of carbohydrate metabolism; they catalyze the interconversion of the alpha- and beta-anomers of hexose sugars such as glucose and galactose. This interconversion is an important step that allows anomer specific metabolic conversion of sugars. Studies of the catalytic mechanism of the best known member of the family, galactose mutarotase, have shown a glutamate and a histidine residue to be critical for catalysis; the glutamate serves as the active site base to initiate the reaction by removing the proton from the C-1 hydroxyl group of the sugar substrate and the histidine as the active site acid to protonate the C-5 ring oxygen." Q#1457 - CGI_10022925 superfamily 218118 94 151 7.18E-10 56.0833 cl04552 CD225 superfamily - - "Interferon-induced transmembrane protein; This family includes the human leukocyte antigen CD225, which is an interferon inducible transmembrane protein, and is associated with interferon induced cell growth suppression." Q#1458 - CGI_10022926 superfamily 243238 77 552 0 582.283 cl02915 Voltage_gated_ClC superfamily - - "CLC voltage-gated chloride channel. The ClC chloride channels catalyse the selective flow of Cl- ions across cell membranes, thereby regulating electrical excitation in skeletal muscle and the flow of salt and water across epithelial barriers. This domain is found in the halogen ions (Cl-, Br- and I-) transport proteins of the ClC family. 
The ClC channels are found in all three kingdoms of life and perform a variety of functions including cellular excitability regulation, cell volume regulation, membrane potential stabilization, acidification of intracellular organelles, signal transduction, transepithelial transport in animals, and the extreme acid resistance response in eubacteria. They lack any structural or sequence similarity to other known ion channels and exhibit unique properties of ion permeation and gating. Unlike cation-selective ion channels, which form oligomers containing a single pore along the axis of symmetry, the ClC channels form two-pore homodimers with one pore per subunit without axial symmetry. Although lacking the typical voltage-sensor found in cation channels, all studied ClC channels are gated (opened and closed) by transmembrane voltage. The gating is conferred by the permeating ion itself, acting as the gating charge. In addition, eukaryotic and some prokaryotic ClC channels have two additional C-terminal CBS (cystathionine beta synthase) domains of putative regulatory function." Q#1458 - CGI_10022926 superfamily 246936 822 868 2.28E-19 85.3816 cl15354 CBS_pair superfamily N - "The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. 
It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase)." Q#1460 - CGI_10022928 superfamily 209871 345 525 3.96E-85 269.909 cl14608 P53 superfamily - - "P53 DNA-binding domain; P53 is a tumor suppressor gene product; mutations in p53 or lack of expression are found associated with a large fraction of all human cancers. P53 is activated by DNA damage and acts as a regulator of gene expression that ultimately blocks progression through the cell cycle. P53 binds to DNA as a tetrameric transcription factor. In its inactive form, p53 is bound to the ring finger protein Mdm2, which promotes its ubiquitinylation and subsequent proteasomal degradation. Phosphorylation of p53 disrupts the Mdm2-p53 complex, while the stable and active p53 binds to regulatory regions of its target genes, such as the cyclin-kinase inhibitor p21, which complexes and inactivates cdk2 and other cyclin complexes." Q#1460 - CGI_10022928 superfamily 247057 674 732 2.99E-30 114.722 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. 
They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#1460 - CGI_10022928 superfamily 149007 554 595 1.63E-16 74.9011 cl06653 P53_tetramer superfamily - - P53 tetramerisation motif; P53 tetramerisation motif. Q#1460 - CGI_10022928 superfamily 149567 210 234 7.11E-06 44.1366 cl07246 P53_TAD superfamily - - P53 transactivation motif; The binding of the p53 transactivation domain by regulatory proteins regulates p53 transcription activation. This motif is comprised of a single amphipathic alpha helix and contains a highly conserved sequence. Q#1461 - CGI_10022929 superfamily 243092 47 94 0.00199089 34.618 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins 
and/or small ligands; 7 copies of the repeat are present in this alignment." Q#1462 - CGI_10022930 superfamily 241645 320 393 6.34E-07 46.4955 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#1463 - CGI_10022931 superfamily 241597 14 84 2.05E-26 100.45 cl00082 HMG-box superfamily - - "High Mobility Group (HMG)-box is found in a variety of eukaryotic chromosomal proteins and transcription factors. HMGs bind to the minor groove of DNA and have been classified by DNA binding preferences. Two phylogenically distinct groups of Class I proteins bind DNA in a sequence specific fashion and contain a single HMG box. One group (SOX-TCF) includes transcription factors, TCF-1, -3, -4; and also SRY and LEF-1, which bind four-way DNA junctions and duplex DNA targets. The second group (MATA) includes fungal mating type gene products MC, MATA1 and Ste11. Class II and III proteins (HMGB-UBF) bind DNA in a non-sequence specific fashion and contain two or more tandem HMG boxes. 
Class II members include non-histone chromosomal proteins, HMG1 and HMG2, which bind to bent or distorted DNA such as four-way DNA junctions, synthetic DNA cruciforms, kinked cisplatin-modified DNA, DNA bulges, cross-overs in supercoiled DNA, and can cause looping of linear DNA. Class III members include nucleolar and mitochondrial transcription factors, UBF and mtTF1, which bind four-way DNA junctions." Q#1464 - CGI_10022932 superfamily 220692 56 295 2.82E-15 74.5481 cl18570 7TM_GPCR_Srw superfamily - - Serpentine type 7TM GPCR chemoreceptor Srw; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srw is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. The genes encoding Srw do not appear to be under as strong an adaptive evolutionary pressure as those of Srz. Q#1465 - CGI_10022933 superfamily 243239 351 439 2.54E-31 116.651 cl02916 POLO_box superfamily - - "Polo-box domain (PBD), a C-terminal tandemly repeated region of polo-like kinases; The polo-like Ser/Thr kinases (Plk1, Plk2/Snk, Plk3/Prk/Fnk, Plk4/Sak, and the inactive kinase Plk5) play various roles in cytokinesis and mitosis. At their C-terminus, they contain a tandemly repeated polo-box domain (in the case of Plk4, a tandem repeat of cryptic PBDs is found in the middle of the protein followed by a C-terminal single repeat), which appears to be involved in autoinhibition and in mediating the subcellular localization. The latter may be controlled via interactions between the polo-box domain and phospho-peptide motifs. The phosphopeptide binding site is formed at the interface between the two tandemly repeated PBDs. 
The PBDs of Plk4/Sak appear unique in participating in homodimer interactions, though it is not clear whether and how they interact with phosphopeptides." Q#1465 - CGI_10022933 superfamily 243239 454 537 9.12E-29 109.209 cl02916 POLO_box superfamily - - "Polo-box domain (PBD), a C-terminal tandemly repeated region of polo-like kinases; The polo-like Ser/Thr kinases (Plk1, Plk2/Snk, Plk3/Prk/Fnk, Plk4/Sak, and the inactive kinase Plk5) play various roles in cytokinesis and mitosis. At their C-terminus, they contain a tandemly repeated polo-box domain (in the case of Plk4, a tandem repeat of cryptic PBDs is found in the middle of the protein followed by a C-terminal single repeat), which appears to be involved in autoinhibition and in mediating the subcellular localization. The latter may be controlled via interactions between the polo-box domain and phospho-peptide motifs. The phosphopeptide binding site is formed at the interface between the two tandemly repeated PBDs. The PBDs of Plk4/Sak appear unique in participating in homodimer interactions, though it is not clear whether and how they interact with phosphopeptides." Q#1465 - CGI_10022933 superfamily 245201 18 223 1.30E-60 202.365 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#1468 - CGI_10022937 superfamily 241574 15 82 1.55E-10 54.5539 cl00053 PTPc superfamily NC - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. 
The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#1471 - CGI_10008552 superfamily 241743 97 187 7.40E-10 53.1397 cl00274 ML superfamily N - "The ML (MD-2-related lipid-recognition) domain is present in MD-1, MD-2, GM2 activator protein, Niemann-Pick type C2 (Npc2) protein, phosphatidylinositol/phosphatidylglycerol transfer protein (PG/PI-TP), mite allergen Der p 2 and several proteins of unknown function in plants, animals and fungi. These single-domain proteins form two anti-parallel beta-pleated sheets stabilized by three disulfide bonds and with an accessible central hydrophobic cavity, and are predicted to mediate diverse biological functions through interaction with specific lipids." Q#1472 - CGI_10008553 superfamily 241743 155 193 0.00103381 36.5762 cl00274 ML superfamily N - "The ML (MD-2-related lipid-recognition) domain is present in MD-1, MD-2, GM2 activator protein, Niemann-Pick type C2 (Npc2) protein, phosphatidylinositol/phosphatidylglycerol transfer protein (PG/PI-TP), mite allergen Der p 2 and several proteins of unknown function in plants, animals and fungi. These single-domain proteins form two anti-parallel beta-pleated sheets stabilized by three disulfide bonds and with an accessible central hydrophobic cavity, and are predicted to mediate diverse biological functions through interaction with specific lipids." Q#1473 - CGI_10008554 superfamily 241607 23 62 2.68E-07 42.257 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. 
Kazal inhibitors inhibit serine proteases, such as trypsin, chymotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as BM-40 or osteonectin, the Gallus gallus Flik protein, as well as agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters), which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#1474 - CGI_10008555 superfamily 245206 6 244 5.52E-76 233.347 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. 
Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#1475 - CGI_10008556 superfamily 220791 148 204 8.75E-06 42.8284 cl11149 Borealin superfamily C - "Cell division cycle-associated protein 8; The chromosomal passenger complex of Aurora B kinase, INCENP, and Survivin has essential regulatory roles at centromeres and the central spindle in mitosis. Borealin is also a member of the complex. Approximately half of Aurora B in mitotic cells is complexed with INCENP, Borealin, and Survivin. Depletion of Borealin by RNA interference delays mitotic progression and results in kinetochore-spindle mis-attachments and an increase in bipolar spindles associated with ectopic asters." Q#1476 - CGI_10008557 superfamily 246675 296 448 9.99E-92 295.875 cl14615 PI-PLCc_GDPD_SF superfamily C - "Catalytic domain of phosphoinositide-specific phospholipase C-like phosphodiesterases superfamily; The PI-PLC-like phosphodiesterases superfamily represents the catalytic domains of bacterial phosphatidylinositol-specific phospholipase C (PI-PLC, EC 4.6.1.13), eukaryotic phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11), glycerophosphodiester phosphodiesterases (GP-GDE, EC 3.1.4.46), sphingomyelinases D (SMases D) (sphingomyelin phosphodiesterase D, EC 3.1.4.41) from spider venom, SMases D-like proteins, and phospholipase D (PLD) from several pathogenic bacteria, as well as their uncharacterized homologs found in organisms ranging from bacteria and archaea to metazoans, plants, and fungi. 
PI-PLCs are ubiquitous enzymes hydrolyzing the membrane lipid phosphoinositides to yield two important second messengers, inositol phosphates and diacylglycerol (DAG). GP-GDEs play essential roles in glycerol metabolism and catalyze the hydrolysis of glycerophosphodiesters to sn-glycerol-3-phosphate (G3P) and the corresponding alcohols that are major sources of carbon and phosphate. Both, PI-PLCs and GP-GDEs, can hydrolyze the 3'-5' phosphodiester bonds in different substrates, and utilize a similar mechanism of general base and acid catalysis with conserved histidine residues, which consists of two steps, a phosphotransfer and a phosphodiesterase reaction. This superfamily also includes Neurospora crassa ankyrin repeat protein NUC-2 and its Saccharomyces cerevisiae counterpart, Phosphate system positive regulatory protein PHO81, glycerophosphodiester phosphodiesterase (GP-GDE)-like protein SHV3 and SHV3-like proteins (SVLs). The residues essential for enzyme activities and metal binding are not conserved in these sequence homologs, which might suggest that the function of catalytic domains in these proteins might be distinct from those in typical PLC-like phosphodiesterases." Q#1476 - CGI_10008557 superfamily 247725 14 134 2.94E-59 200.558 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." 
Q#1476 - CGI_10008557 superfamily 246908 640 742 4.69E-51 176.687 cl15255 SH2 superfamily - - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." Q#1476 - CGI_10008557 superfamily 246908 528 626 4.96E-48 167.913 cl15255 SH2 superfamily - - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." Q#1476 - CGI_10008557 superfamily 246669 1077 1203 4.25E-45 160.4 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. 
Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#1476 - CGI_10008557 superfamily 246708 940 1060 1.20E-49 173.086 cl14781 PI-PLC-Y superfamily - - "Phosphatidylinositol-specific phospholipase C, Y domain; This associates with pfam00388 to form a single structural unit." Q#1476 - CGI_10008557 superfamily 247683 795 842 2.94E-15 72.3647 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." 
Q#1476 - CGI_10008557 superfamily 247725 854 920 9.62E-07 48.578 cl17171 PH-like superfamily N - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#1476 - CGI_10008557 superfamily 247725 468 505 1.34E-06 48.1928 cl17171 PH-like superfamily C - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#1477 - CGI_10008558 superfamily 248458 228 341 3.47E-06 48.4641 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. 
MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#1478 - CGI_10008559 superfamily 243072 173 292 5.16E-29 109.395 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#1478 - CGI_10008559 superfamily 243072 63 193 3.51E-25 98.6098 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#1478 - CGI_10008559 superfamily 243072 2 121 1.08E-14 69.3346 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#1478 - CGI_10008559 superfamily 243072 266 347 3.26E-13 65.4826 cl02529 ANK superfamily C - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#1479 - CGI_10008560 superfamily 241580 79 156 4.02E-39 135.759 cl00061 FH superfamily - - "Forkhead (FH), also known as a "winged helix". FH is named for the Drosophila fork head protein, a transcription factor which promotes terminal rather than segmental development. This family of transcription factor domains, which bind to B-DNA as monomers, are also found in the Hepatocyte nuclear factor (HNF) proteins, which provide tissue-specific gene regulation. The structure contains 2 flexible loops or "wings" in the C-terminal region, hence the term winged helix." 
Q#1480 - CGI_10008561 superfamily 241832 7 106 8.93E-12 60.7268 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#1480 - CGI_10008561 superfamily 241832 114 172 0.000201118 39.1271 cl00388 Thioredoxin_like superfamily N - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). 
Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#1481 - CGI_10008562 superfamily 243157 16 95 1.11E-37 127.44 cl02720 PB1 superfamily - - "The PB1 domain is a modular domain mediating specific protein-protein interactions which play a role in many critical cell processes, such as osteoclastogenesis, angiogenesis, early cardiovascular development, and cell polarity. A canonical PB1-PB1 interaction, which involves heterodimerization of two PB1 domain, is required for the formation of macromolecular signaling complexes ensuring specificity and fidelity during cellular signaling. The interaction between two PB1 domain depends on the type of PB1. There are three types of PB1 domains: type I which contains an OPCA motif, acidic aminoacid cluster, type II which contains a basic cluster, and type I/II which contains both an OPCA motif and a basic cluster. Interactions of PB1 domains with other protein domains have been described as a noncanonical PB1-interactions. The PB1 domain module is conserved in amoebas, fungi, animals, and plants." Q#1482 - CGI_10001583 superfamily 245814 236 293 1.93E-05 42.1828 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#1483 - CGI_10007614 superfamily 241583 209 316 4.94E-24 97.8297 cl00064 ZnMc superfamily C - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#1483 - CGI_10007614 superfamily 216572 41 135 5.63E-05 40.7211 cl03265 Pep_M12B_propep superfamily - - Reprolysin family propeptide; This region is the propeptide for members of peptidase family M12B. The propeptide contains a sequence motif similar to the "cysteine switch" of the matrixins. This motif is found at the C terminus of the alignment but is not well aligned. Q#1485 - CGI_10007617 superfamily 243072 18 143 1.43E-18 79.3498 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#1485 - CGI_10007617 superfamily 243072 91 272 8.32E-14 65.8678 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#1489 - CGI_10007621 superfamily 245215 28 459 0 676.927 cl09945 XylA superfamily - - Xylose isomerase [Carbohydrate transport and metabolism] Q#1489 - CGI_10007621 superfamily 246976 460 851 3.66E-151 467.999 cl15483 Dymeclin superfamily C - "Dyggve-Melchior-Clausen syndrome protein; Dymeclin (Dyggve-Melchior-Clausen syndrome protein) contains a large number of leucine and isoleucine residues and a total of 17 repeated dileucine motifs. It is characteristically about 700 residues long and present in plants and animals. Mutations in the gene coding for this protein in humans give rise to the disorder Dyggve-Melchior-Clausen syndrome (DMC, MIM 223800) which is an autosomal-recessive disorder characterized by the association of a spondylo-epi-metaphyseal dysplasia and mental retardation. DYM transcripts are widely expressed throughout human development and Dymeclin is not an integral membrane protein of the ER, but rather a peripheral membrane protein dynamically associated with the Golgi apparatus." Q#1490 - CGI_10007622 superfamily 247999 503 550 1.36E-06 46.3296 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#1491 - CGI_10007623 superfamily 247805 1 77 0.000251028 38.4724 cl17251 DEXDc superfamily N - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#1492 - CGI_10007624 superfamily 247724 23 45 9.10E-05 35.7332 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. 
The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#1496 - CGI_10012255 superfamily 248264 65 125 1.60E-06 45.691 cl17710 DDE_4 superfamily N - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." 
Q#1498 - CGI_10012257 superfamily 241613 225 259 3.07E-12 61.0685 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#1498 - CGI_10012257 superfamily 241613 186 220 3.80E-12 60.6833 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#1498 - CGI_10012257 superfamily 245814 103 177 0.000251694 39.0257 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. 
The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#1500 - CGI_10012259 superfamily 241750 2 155 1.67E-42 143.098 cl00281 metallo-dependent_hydrolases superfamily N - "Superfamily of metallo-dependent hydrolases (also called amidohydrolase superfamily) is a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. The family includes urease alpha, adenosine deaminase, phosphotriesterase dihydroorotases, allantoinases, hydantoinases, AMP-, adenine and cytosine deaminases, imidazolonepropionase, aryldialkylphosphatase, chlorohydrolases, formylmethanofuran dehydrogenases and others." Q#1501 - CGI_10012260 superfamily 246680 9 91 1.11E-07 48.7372 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. 
They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#1504 - CGI_10012263 superfamily 110440 181 208 0.0072863 32.7649 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#1505 - CGI_10012264 superfamily 246918 99 153 4.66E-10 51.8187 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#1508 - CGI_10012267 superfamily 241563 68 109 1.58E-06 45.548 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#1508 - CGI_10012267 superfamily 241563 28 59 0.000480661 38.2292 cl00034 BBOX superfamily N - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#1509 - CGI_10010396 superfamily 247792 510 551 2.51E-09 53.9888 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#1509 - CGI_10010396 superfamily 222330 3 479 1.66E-124 379.805 cl16356 TRC8_N superfamily - - TRC8 N-terminal domain; This region is found at the N-terminus of the TRC8 protein. TRC8 is an E3 ubiquitin-protein ligase also known as RNF139. This region contains 12 transmembrane domains. This region has been suggested to contain a sterol sensing domain. It has been found that TRC8 protein levels are sterol responsive and that it binds and stimulates ubiquitylation of the endoplasmic reticulum anchor protein INSIG. 
Q#1510 - CGI_10010397 superfamily 247792 528 569 1.35E-07 48.9812 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#1510 - CGI_10010397 superfamily 222330 31 497 7.14E-116 358.619 cl16356 TRC8_N superfamily - - TRC8 N-terminal domain; This region is found at the N-terminus of the TRC8 protein. TRC8 is an E3 ubiquitin-protein ligase also known as RNF139. This region contains 12 transmembrane domains. This region has been suggested to contain a sterol sensing domain. It has been found that TRC8 protein levels are sterol responsive and that it binds and stimulates ubiquitylation of the endoplasmic reticulum anchor protein INSIG. Q#1511 - CGI_10010398 superfamily 241599 155 211 4.49E-20 81.5208 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#1512 - CGI_10010399 superfamily 243362 84 150 3.48E-20 82.473 cl03262 DnaJ_C superfamily N - C-terminal substrate binding domain of DnaJ and HSP40; The C-terminal region of the DnaJ/Hsp40 protein mediates oligomerization and binding to denatured polypeptide substrate. DnaJ/Hsp40 is a widely conserved heat-shock protein. 
It prevents the aggregation of unfolded substrate and forms a ternary complex with both substrate and DnaK/Hsp70; the N-terminal J-domain of DnaJ/Hsp40 stimulates the ATPase activity of DnaK/Hsp70. Q#1512 - CGI_10010399 superfamily 199908 20 69 1.37E-11 56.4951 cl16908 DnaJ_zf superfamily C - "Zinc finger domain of DnaJ and HSP40; Central/middle or CxxCxGxG-motif containing domain of DnaJ/Hsp40 (heat shock protein 40). DnaJ proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperonin family. Hsp40 proteins are characterized by the presence of an N-terminal J domain, which mediates the interaction with Hsp70. This central domain contains four repeats of a CxxCxGxG motif and binds to two Zinc ions. It has been implicated in substrate binding." Q#1515 - CGI_10010402 superfamily 241748 162 348 5.59E-39 140.401 cl00279 APP_MetAP superfamily - - "A family including aminopeptidase P, aminopeptidase M, and prolidase. Also known as metallopeptidase family M24. This family of enzymes is able to cleave amido-, imido- and amidino-containing bonds. Members exibit relatively narrow substrate specificity compared to other metallo-aminopeptidases, suggesting they play roles in regulation of biological processes rather than general protein degradation." Q#1515 - CGI_10010402 superfamily 244955 24 80 4.15E-05 41.7981 cl08433 AMP_N superfamily C - "Aminopeptidase P, N-terminal domain; This domain is structurally very similar to the creatinase N-terminal domain (pfam01321). However, little or no sequence similarity exists between the two families." Q#1516 - CGI_10010403 superfamily 245201 180 368 2.08E-15 76.036 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. 
It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#1516 - CGI_10010403 superfamily 245201 633 695 1.84E-08 55.0488 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#1517 - CGI_10010404 superfamily 245201 511 745 8.76E-17 81.0436 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#1517 - CGI_10010404 superfamily 245201 944 1048 1.05E-12 68.332 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. 
These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#1518 - CGI_10010405 superfamily 245819 452 636 1.21E-59 199.728 cl11967 Nucleotidyl_cyc_III superfamily - - "Class III nucleotidyl cyclases; Class III nucleotidyl cyclases are the largest, most diverse group of nucleotidyl cyclases (NC's) containing prokaryotic and eukaryotic proteins. They can be divided into two major groups; the mononucleotidyl cyclases (MNC's) and the diguanylate cyclases (DGC's). The MNC's, which include the adenylate cyclases (AC's) and the guanylate cyclases (GC's), have a conserved cyclase homology domain (CHD), while the DGC's have a conserved GGDEF domain, named after a conserved motif within this subgroup. Their products, cyclic guanylyl and adenylyl nucleotides, are second messengers that play important roles in eukaryotic signal transduction and prokaryotic sensory pathways." Q#1518 - CGI_10010405 superfamily 219812 126 362 1.90E-11 63.4792 cl07121 NIT superfamily - - "Nitrate and nitrite sensing; The nitrate- and nitrite sensing domain (NIT) is found in receptor components of signal transducing pathways in bacteria which control gene expression, cellular motility and enzyme activity in response to nitrate and nitrite concentrations. The NIT domain is predicted to be all alpha-helical in structure." Q#1518 - CGI_10010405 superfamily 219526 394 439 6.67E-08 52.2363 cl06648 HNOBA superfamily N - "Heme NO binding associated; The HNOBA domain is found associated with the HNOB domain and pfam00211 in soluble cyclases and signalling proteins. The HNOB domain is predicted to function as a heme-dependent sensor for gaseous ligands, and transduce diverse downstream signals, in both bacteria and animals." 
Q#1519 - CGI_10010406 superfamily 247856 155 216 2.64E-16 70.2693 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#1519 - CGI_10010406 superfamily 247856 81 142 1.66E-11 57.1725 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#1520 - CGI_10010407 superfamily 216981 59 173 5.57E-16 73.7209 cl17087 OTU superfamily - - "OTU-like cysteine protease; This family is comprised of a group of predicted cysteine proteases, homologous to the Ovarian Tumour (OTU) gene in Drosophila. Members include proteins from eukaryotes, viruses and pathogenic bacterium. The conserved cysteine and histidine, and possibly the aspartate, represent the catalytic residues in this putative group of proteases." Q#1520 - CGI_10010407 superfamily 243130 237 262 0.0089632 33.9779 cl02655 CUE superfamily N - "CUE domain; CUE domains have been shown to bind ubiquitin. It has been suggested that CUE domains are related to pfam00627 and this has been confirmed by the structure of the domain. CUE domains also occur in two protein of the IL-1 signal transduction pathway, tollip and TAB2." Q#1521 - CGI_10010408 superfamily 218533 231 498 1.96E-98 308.993 cl10638 DUF726 superfamily N - Protein of unknown function (DUF726); This family consists of several uncharacterized eukaryotic proteins. 
Q#1523 - CGI_10010410 superfamily 221911 40 146 1.47E-43 145.794 cl18625 Fer2_3 superfamily - - 2Fe-2S iron-sulfur cluster binding domain; The 2Fe-2S ferredoxin family have a general core structure consisting of beta(2)-alpha-beta(2) which is a beta-grasp type fold. The domain is around one hundred amino acids with four conserved cysteine residues to which the 2Fe-2S cluster is ligated. Q#1524 - CGI_10010411 superfamily 248312 63 167 2.54E-08 51.2088 cl17758 PMP22_Claudin superfamily N - PMP-22/EMP/MP20/Claudin family; PMP-22/EMP/MP20/Claudin family. Q#1527 - CGI_10009921 superfamily 247916 137 198 0.000148824 41.2322 cl17362 Transglut_core superfamily C - "Transglutaminase-like superfamily; This family includes animal transglutaminases and other bacterial proteins of unknown function. Sequence conservation in this superfamily primarily involves three motifs that centre around conserved cysteine, histidine, and aspartate residues that form the catalytic triad in the structurally characterized transglutaminase, the human blood clotting factor XIIIa'. On the basis of the experimentally demonstrated activity of the Methanobacterium phage pseudomurein endoisopeptidase, it is proposed that many, if not all, microbial homologues of the transglutaminases are proteases and that the eukaryotic transglutaminases have evolved from an ancestral protease." Q#1528 - CGI_10009922 superfamily 241599 144 195 1.21E-16 72.276 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#1529 - CGI_10009923 superfamily 243072 515 626 5.62E-22 92.4466 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). 
ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#1529 - CGI_10009923 superfamily 245008 60 120 3.02E-06 45.2424 cl09101 E_set superfamily - - "Early set domain associated with the catalytic domain of sugar utilizing enzymes at either the N or C terminus; The E or "early" set domains of sugar utilizing enzymes are associated with different types of catalytic domains at either the N-terminal or C-terminal end. These domains may be related to the immunoglobulin and/or fibronectin type III superfamilies. Members of this family include alpha amylase, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitobiase, and chitinase. A subset of these members were recently identified as members of the CBM48 (Carbohydrate Binding Module 48) family. Members of the CBM48 family include pullulanase, maltooligosyl trehalose synthase, starch branching enzyme, glycogen branching enzyme, glycogen debranching enzyme, isoamylase, and the beta subunit of AMP-activated protein kinase." Q#1530 - CGI_10009924 superfamily 248020 221 395 1.26E-23 100.617 cl17466 Sulfatase superfamily N - Sulfatase; Sulfatase. Q#1530 - CGI_10009924 superfamily 248020 28 135 9.15E-16 77.1196 cl17466 Sulfatase superfamily C - Sulfatase; Sulfatase. Q#1531 - CGI_10009925 superfamily 241645 75 153 5.00E-12 62.2756 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. 
The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#1532 - CGI_10009926 superfamily 241626 31 109 2.57E-28 101.556 cl00125 RHOD superfamily N - "Rhodanese Homology Domain (RHOD); an alpha beta fold domain found duplicated in the rhodanese protein. The cysteine containing enzymatically active version of the domain is also found in the Cdc25 class of protein phosphatases and a variety of proteins such as sulfide dehydrogenases and certain stress proteins such as senescence specific protein 1 in plants, PspE and GlpE in bacteria and cyanide and arsenate resistance proteins. Inactive versions (no active site cysteine) are also seen in dual specificity phosphatases, ubiquitin hydrolases from yeast and in sulfuryltransferases, where they are believed to play a regulatory role in multidomain proteins." Q#1533 - CGI_10009927 superfamily 241626 11 100 9.94E-33 115.409 cl00125 RHOD superfamily N - "Rhodanese Homology Domain (RHOD); an alpha beta fold domain found duplicated in the rhodanese protein. The cysteine containing enzymatically active version of the domain is also found in the Cdc25 class of protein phosphatases and a variety of proteins such as sulfide dehydrogenases and certain stress proteins such as senescence specific protein 1 in plants, PspE and GlpE in bacteria and cyanide and arsenate resistance proteins. Inactive versions (no active site cysteine) are also seen in dual specificity phosphatases, ubiquitin hydrolases from yeast and in sulfuryltransferases, where they are believed to play a regulatory role in multidomain proteins." 
Q#1534 - CGI_10009928 superfamily 148050 5 136 1.12E-29 107.054 cl05627 GRIM-19 superfamily - - "GRIM-19 protein; This family consists of several eukaryotic gene associated with retinoic-interferon-induced mortality 19 (GRIM-19) proteins. GRIM-19, was reported to encode a small protein primarily distributed in the nucleus and was able to promote cell death induced by IFN-beta and RA. A bovine homologue of GRIM-19 was co-purified with mitochondrial NADH:ubiquinone oxidoreductase (complex I) in bovine heart. Therefore, its exact cellular localisation and function are unclear. It has now been discovered that GRIM-19 is a specific interacting protein which negatively regulates Stat3 activity." Q#1537 - CGI_10009931 superfamily 243092 410 694 1.15E-16 80.0716 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#1539 - CGI_10009933 superfamily 243093 85 160 4.90E-07 47.525 cl02568 WSC superfamily - - WSC domain; This domain may be involved in carbohydrate binding. Q#1544 - CGI_10007414 superfamily 241563 59 95 0.000151318 39.77 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#1545 - CGI_10007415 superfamily 247727 160 257 1.18E-14 68.6106 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#1546 - CGI_10007416 superfamily 247727 165 257 1.40E-12 62.8326 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." 
Q#1547 - CGI_10007417 superfamily 247727 159 256 8.98E-13 63.603 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#1548 - CGI_10007418 superfamily 247727 123 221 1.13E-11 60.1362 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#1549 - CGI_10007419 superfamily 245226 5 70 0.000414565 39.5907 cl10012 DnaQ_like_exo superfamily N - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. 
The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#1551 - CGI_10003628 superfamily 247044 36 148 1.35E-59 190.512 cl15697 ADF_gelsolin superfamily - - Actin depolymerization factor/cofilin- and gelsolin-like domains; Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. Q#1551 - CGI_10003628 superfamily 247044 163 258 8.88E-22 88.0632 cl15697 ADF_gelsolin superfamily - - Actin depolymerization factor/cofilin- and gelsolin-like domains; Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. 
Q#1551 - CGI_10003628 superfamily 247044 278 373 1.68E-24 95.7791 cl15697 ADF_gelsolin superfamily - - Actin depolymerization factor/cofilin- and gelsolin-like domains; Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. Q#1552 - CGI_10006576 superfamily 149284 208 354 3.04E-44 152.649 cl06952 CPL superfamily - - CPL (NUC119) domain; This C terminal domain is found in Penguin-like proteins associated with Pumilio like repeats. Q#1552 - CGI_10006576 superfamily 243032 110 239 8.53E-05 42.9639 cl02427 Pumilio superfamily NC - "Pumilio-family RNA binding domain; Puf repeats (also labelled PUM-HD or Pumilio homology domain) mediate sequence specific RNA binding in fly Pumilio, worm FBF-1 and FBF-2, and many other proteins such as vertebrate Pumilio. These proteins function as translational repressors in early embryonic development by binding to sequences in the 3' UTR of target mRNAs, such as the nanos response element (NRE) in fly Hunchback mRNA, or the point mutation element (PME) in worm fem-3 mRNA. Other proteins that contain Puf domains are also plausible RNA binding proteins. Yeast PUF1 (JSN1), for instance, appears to contain a single RNA-recognition motif (RRM) domain. Puf repeat proteins have been observed to function asymmetrically and may be responsible for creating protein gradients involved in the specification of cell fate and differentiation. Puf domains usually occur as a tandem repeat of 8 domains. This model encompasses all 8 tandem repeats. Some proteins may have fewer (canonical) repeats." Q#1553 - CGI_10006577 superfamily 114591 430 528 1.03E-10 59.8642 cl05445 Mt_ATP-synt_D superfamily N - "ATP synthase D chain, mitochondrial (ATP5H); This family consists of several ATP synthase D chain, mitochondrial (ATP5H) proteins. 
Subunit d has no extensive hydrophobic sequences, and is not apparently related to any subunit described in the simpler ATP synthases in bacteria and chloroplasts." Q#1553 - CGI_10006577 superfamily 243092 232 344 0.0070652 37.3144 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#1555 - CGI_10006579 superfamily 217814 1 146 1.16E-18 83.527 cl04345 Jun superfamily N - Jun-like transcription factor; Jun-like transcription factor. Q#1555 - CGI_10006579 superfamily 243100 177 214 0.000363556 38.4592 cl02576 B_zip1 superfamily N - "basic leucine zipper DNA-binding and multimerization region of GCN4 and related proteins; Basic leucine zipper (bZIP) transcription factors act in networks of homo- and hetero-dimers in the regulation in a diverse set of cellular pathways. 
Classical leucine zippers have alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization creates a pair of basic regions that bind DNA and undergo conformational change. GCN4 was identified in Saccharomyces cerevisiae from mutations in a deficiency in activation with the general amino acid control pathway. GCN4 encodes a trans-activator of amino acid biosynthetic genes containing 2 acidic activation domains and a C-terminal bZIP domain, comprised of a basic alpha-helical DNA-binding region and a coiled-coil dimerization region." Q#1555 - CGI_10006579 superfamily 243100 356 393 0.000363556 38.4592 cl02576 B_zip1 superfamily N - "basic leucine zipper DNA-binding and multimerization region of GCN4 and related proteins; Basic leucine zipper (bZIP) transcription factors act in networks of homo- and hetero-dimers in the regulation in a diverse set of cellular pathways. Classical leucine zippers have alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization creates a pair of basic regions that bind DNA and undergo conformational change. GCN4 was identified in Saccharomyces cerevisiae from mutations in a deficiency in activation with the general amino acid control pathway. GCN4 encodes a trans-activator of amino acid biosynthetic genes containing 2 acidic activation domains and a C-terminal bZIP domain, comprised of a basic alpha-helical DNA-binding region and a coiled-coil dimerization region." Q#1556 - CGI_10006580 superfamily 247755 496 647 5.42E-59 199.45 cl17201 ABC_ATPase superfamily N - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. 
The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#1556 - CGI_10006580 superfamily 248376 282 502 1.62E-18 85.5378 cl17822 MutS_III superfamily - - "MutS domain III; This domain is found in proteins of the MutS family (DNA mismatch repair proteins) and is found associated with pfam00488, pfam05188, pfam01624 and pfam05190. The MutS family of proteins is named after the Salmonella typhimurium MutS protein involved in mismatch repair; other members of the family included the eukaryotic MSH 1,2,3, 4,5 and 6 proteins. These have various roles in DNA repair and recombination. Human MSH has been implicated in non-polyposis colorectal carcinoma (HNPCC) and is a mismatch binding protein. The aligned region corresponds with domain III, which is central to the structure of Thermus aquaticus MutS as characterized in." Q#1558 - CGI_10005836 superfamily 192535 39 248 0.00654023 36.4198 cl18179 7TM_GPCR_Srsx superfamily C - Serpentine type 7TM GPCR chemoreceptor Srsx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srsx is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#1559 - CGI_10005837 superfamily 192535 33 70 0.00531716 36.4198 cl18179 7TM_GPCR_Srsx superfamily C - Serpentine type 7TM GPCR chemoreceptor Srsx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. 
Srsx is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#1560 - CGI_10018322 superfamily 241640 70 194 2.61E-45 157.053 cl00149 Tryp_SPc superfamily C - Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues. Q#1561 - CGI_10018323 superfamily 241596 30 89 1.71E-15 69.1651 cl00081 HLH superfamily - - "Helix-loop-helix domain, found in specific DNA- binding proteins that act as transcription factors; 60-100 amino acids long. A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE 5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved in both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#1561 - CGI_10018323 superfamily 243123 105 149 4.38E-10 54.1013 cl02638 Hairy_orange superfamily - - "Hairy Orange; The Orange domain is found in the Drosophila proteins Hesr-1, Hairy, and Enhancer of Split. 
The Orange domain is proposed to mediate specific protein-protein interaction between Hairy and Scute." Q#1563 - CGI_10018325 superfamily 216152 206 544 3.63E-64 215.641 cl02988 Glyco_transf_10 superfamily - - "Glycosyltransferase family 10 (fucosyltransferase); This family of Fucosyltransferases are the enzymes transferring fucose from GDP-Fucose to GlcNAc in an alpha1,3 linkage. This family is known as glycosyltransferase family 10." Q#1563 - CGI_10018325 superfamily 216152 5 170 2.41E-42 155.55 cl02988 Glyco_transf_10 superfamily N - "Glycosyltransferase family 10 (fucosyltransferase); This family of Fucosyltransferases are the enzymes transferring fucose from GDP-Fucose to GlcNAc in an alpha1,3 linkage. This family is known as glycosyltransferase family 10." Q#1564 - CGI_10018326 superfamily 247068 236 332 2.98E-12 63.8717 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#1564 - CGI_10018326 superfamily 247068 154 227 2.18E-07 49.2342 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#1564 - CGI_10018326 superfamily 247068 340 437 5.06E-06 44.997 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#1564 - CGI_10018326 superfamily 247068 457 536 0.000559224 38.8338 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#1565 - CGI_10018327 superfamily 248458 129 268 1.52E-10 61.5609 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#1565 - CGI_10018327 superfamily 248458 367 505 6.19E-10 59.6349 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. 
MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#1566 - CGI_10018328 superfamily 241584 443 530 9.16E-05 41.3279 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. 
FN3-like domains are also found in bacterial glycosyl hydrolases." Q#1566 - CGI_10018328 superfamily 245213 5 37 0.00124754 37.4562 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#1566 - CGI_10018328 superfamily 245201 620 690 5.94E-05 44.2517 cl09925 PKc_like superfamily C - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#1573 - CGI_10018335 superfamily 202823 94 152 1.04E-17 80.6608 cl08408 Ribosomal_L2_C superfamily N - "Ribosomal Proteins L2, C-terminal domain; Ribosomal Proteins L2, C-terminal domain. " Q#1573 - CGI_10018335 superfamily 109247 11 90 7.79E-11 59.5129 cl02816 Ribosomal_L2 superfamily - - "Ribosomal Proteins L2, RNA binding domain; Ribosomal Proteins L2, RNA binding domain. " Q#1574 - CGI_10018336 superfamily 247805 293 514 3.58E-98 305.178 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 
Q#1574 - CGI_10018336 superfamily 247905 524 655 8.50E-40 143.532 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#1576 - CGI_10018338 superfamily 241795 250 379 5.37E-72 223.665 cl00335 NDPk superfamily - - "Nucleoside diphosphate kinases (NDP kinases, NDPks): NDP kinases, responsible for the synthesis of nucleoside triphosphates (NTPs), are involved in numerous regulatory processes associated with proliferation, development, and differentiation. They are vital for DNA/RNA synthesis, cell division, macromolecular metabolism and growth. The enzymes generate NTPs or their deoxy derivatives by terminal (gamma) phosphotransfer from an NTP such as ATP or GTP to any nucleoside diphosphate (NDP) or its deoxy derivative. The sequence of NDPk has been highly conserved through evolution. There is a single histidine residue conserved in all known NDK isozymes, which is involved in the catalytic mechanism. The first confirmed metastasis suppressor gene was the NDP kinase protein encoded by the nm23 gene. Unicellular organisms generally possess only one gene encoding NDP kinase, while most multicellular organisms possess not only an ortholog that provides most of the NDP kinase enzymatic activity but also multiple divergent paralogous genes. 
The human genome codes for at least nine NDP kinases and can be classified into two groups, Groups I and II, according to their genomic architecture and distinct enzymatic activity. Group I isoforms (A-D) are well-conserved, catalytically active, and share 58-88% identity between each other, while Group II are more divergent, with only NDPk6 shown to be active. NDP kinases exist in two different quaternary structures; all known eukaryotic enzymes are hexamers, while some bacterial enzymes are tetramers, as in Myxococcus. The hexamer can be viewed as trimer of dimers, while tetramers are dimers of dimers, with the dimerization interface conserved." Q#1576 - CGI_10018338 superfamily 241795 105 235 1.89E-70 219.623 cl00335 NDPk superfamily - - "Nucleoside diphosphate kinases (NDP kinases, NDPks): NDP kinases, responsible for the synthesis of nucleoside triphosphates (NTPs), are involved in numerous regulatory processes associated with proliferation, development, and differentiation. They are vital for DNA/RNA synthesis, cell division, macromolecular metabolism and growth. The enzymes generate NTPs or their deoxy derivatives by terminal (gamma) phosphotransfer from an NTP such as ATP or GTP to any nucleoside diphosphate (NDP) or its deoxy derivative. The sequence of NDPk has been highly conserved through evolution. There is a single histidine residue conserved in all known NDK isozymes, which is involved in the catalytic mechanism. The first confirmed metastasis suppressor gene was the NDP kinase protein encoded by the nm23 gene. Unicellular organisms generally possess only one gene encoding NDP kinase, while most multicellular organisms possess not only an ortholog that provides most of the NDP kinase enzymatic activity but also multiple divergent paralogous genes. The human genome codes for at least nine NDP kinases and can be classified into two groups, Groups I and II, according to their genomic architecture and distinct enzymatic activity. 
Group I isoforms (A-D) are well-conserved, catalytically active, and share 58-88% identity between each other, while Group II are more divergent, with only NDPk6 shown to be active. NDP kinases exist in two different quaternary structures; all known eukaryotic enzymes are hexamers, while some bacterial enzymes are tetramers, as in Myxococcus. The hexamer can be viewed as trimer of dimers, while tetramers are dimers of dimers, with the dimerization interface conserved." Q#1576 - CGI_10018338 superfamily 207712 17 97 2.74E-23 92.7601 cl02728 DUF1126 superfamily - - Repeat of unknown function (DUF1126); This family consists of several eukaryote specific repeats of around 35 residues in length. The function of this family is unknown. Q#1578 - CGI_10018340 superfamily 220131 687 969 1.67E-59 210.21 cl11721 DUF1943 superfamily - - "Domain of unknown function (DUF1943); Members of this family adopt a structure consisting of several large open beta-sheets. Their exact function has not, as yet, been determined." Q#1578 - CGI_10018340 superfamily 219034 982 1080 2.46E-10 61.1982 cl05778 DUF1081 superfamily - - Domain of Unknown Function (DUF1081); This region is found in Apolipophorin proteins. Q#1578 - CGI_10018340 superfamily 222032 2360 2405 0.00237836 39.9231 cl16218 CPSF100_C superfamily N - "Cleavage and polyadenylation factor 2 C-terminal; This family lies at the C-terminus of many fungal and plant cleavage and polyadenylation specificity factor subunit 2 proteins. The exact function of the domain is not known, but is likely to function as a binding domain for the protein within the overall CPSF complex." Q#1581 - CGI_10013404 superfamily 241599 157 214 1.22E-21 85.758 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." 
Q#1582 - CGI_10013405 superfamily 154937 171 243 1.14E-27 102.296 cl02489 SWIB superfamily - - SWIB/MDM2 domain; This family includes the SWIB domain and the MDM2 domain. The p53-associated protein (MDM2) is an inhibitor of the p53 tumour suppressor gene binding the transactivation domain and down regulating the ability of p53 to activate transcription. This family contains the p53 binding domain of MDM2. Q#1582 - CGI_10013405 superfamily 204056 5 56 2.55E-09 51.3297 cl07395 DEK_C superfamily - - DEK C terminal domain; DEK is a chromatin associated protein that is linked with cancers and autoimmune disease. This domain is found at the C terminal of DEK and is of clinical importance since it can reverse the characteristic abnormal DNA-mutagen sensitivity in fibroblasts from ataxia-telangiectasia (A-T) patients. The structure of this domain shows it to be homologous to the E2F/DP transcription factor family. This domain is also found in chitin synthase proteins and in protein phosphatases. Q#1586 - CGI_10013409 superfamily 220692 1 72 8.55E-06 41.8061 cl18570 7TM_GPCR_Srw superfamily N - Serpentine type 7TM GPCR chemoreceptor Srw; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srw is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. The genes encoding Srw do not appear to be under as strong an adaptive evolutionary pressure as those of Srz. Q#1588 - CGI_10013411 superfamily 220692 64 373 1.20E-22 95.7341 cl18570 7TM_GPCR_Srw superfamily - - Serpentine type 7TM GPCR chemoreceptor Srw; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. 
Srw is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. The genes encoding Srw do not appear to be under as strong an adaptive evolutionary pressure as those of Srz. Q#1593 - CGI_10013416 superfamily 216731 12 49 8.57E-05 37.6235 cl12258 A2M_N superfamily N - MG2 domain; This is the MG2 (macroglobulin) domain of alpha-2-macroglobulin. Q#1594 - CGI_10013417 superfamily 244881 228 531 1.25E-118 363.436 cl08267 ISOPREN_C2_like superfamily - - "This group contains class II terpene cyclases, protein prenyltransferases beta subunit, two broadly specific proteinase inhibitors alpha2-macroglobulin (alpha (2)-M) and pregnancy zone protein (PZP) and, the C3 C4 and C5 components of vertebrate complement. Class II terpene cyclases include squalene cyclase (SQCY) and 2,3-oxidosqualene cyclase (OSQCY), these integral membrane proteins catalyze a cationic cyclization cascade converting linear triterpenes to fused ring compounds. The protein prenyltransferases include protein farnesyltransferase (FTase) and geranylgeranyltransferase types I and II (GGTase-I and GGTase-II) which catalyze the carboxyl-terminal lipidation of Ras, Rab, and several other cellular signal transduction proteins, facilitating membrane associations and specific protein-protein interactions. Alpha (2)-M is a major carrier protein in serum and involved in the immobilization and entrapment of proteases. PZP is a pregnancy associated protein. Alpha (2)-M and PZP are known to bind to and, may modulate, the activity of placental protein-14 in T-cell growth and cytokine production thereby protecting the allogeneic fetus from attack by the maternal immune system." Q#1594 - CGI_10013417 superfamily 215788 8 98 1.45E-27 108.035 cl08251 A2M superfamily - - Alpha-2-macroglobulin family; This family includes the C-terminal region of the alpha-2-macroglobulin family. 
Q#1594 - CGI_10013417 superfamily 203720 640 725 9.31E-26 103.012 cl08457 A2M_recep superfamily - - A-macroglobulin receptor; This family includes the receptor domain region of the alpha-2-macroglobulin family. Q#1594 - CGI_10013417 superfamily 147487 60 157 5.84E-05 42.3898 cl05075 SVA superfamily N - "Seminal vesicle autoantigen (SVA); This family consists of seminal vesicle autoantigen and prolactin-inducible (PIP) proteins. Seminal vesicle autoantigen (SVA) is specifically present in the seminal plasma of mice. This 19-kDa secretory glycoprotein suppresses the motility of spermatozoa by interacting with phospholipid. PIP, has several known functions. In saliva, this protein plays a role in host defence by binding to microorganisms such as Streptococcus. PIP is an aspartyl proteinase and it acts as a factor capable of suppressing T-cell apoptosis through its interaction with CD4." Q#1595 - CGI_10013418 superfamily 215827 2 131 2.47E-17 80.5903 cl02830 Tyrosinase superfamily N - Common central domain of tyrosinase; This family also contains polyphenol oxidases and some hemocyanins. Binds two copper ions via two sets of three histidines. This family is related to pfam00372. Q#1596 - CGI_10013419 superfamily 247856 80 136 2.72E-08 46.7721 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#1596 - CGI_10013419 superfamily 247856 13 72 1.91E-06 41.3793 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. 
Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#1598 - CGI_10013421 superfamily 247856 82 137 2.49E-12 58.3281 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#1598 - CGI_10013421 superfamily 247856 13 75 1.56E-10 52.9353 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#1600 - CGI_10013423 superfamily 154937 280 356 1.81E-26 102.39 cl02489 SWIB superfamily - - SWIB/MDM2 domain; This family includes the SWIB domain and the MDM2 domain. The p53-associated protein (MDM2) is an inhibitor of the p53 tumour suppressor gene binding the transactivation domain and down regulating the ability of p53 to activate transcription. This family contains the p53 binding domain of MDM2. Q#1602 - CGI_10013425 superfamily 243066 19 72 2.35E-09 50.3085 cl02518 BTB superfamily C - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. 
The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#1604 - CGI_10013427 superfamily 245213 326 380 0.0029501 36.0754 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#1604 - CGI_10013427 superfamily 241571 74 160 0.00220995 37.0067 cl00049 CUB superfamily N - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#1604 - CGI_10013427 superfamily 245213 383 414 0.0073 34.9152 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. 
Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#1605 - CGI_10013428 superfamily 245596 127 426 2.00E-159 460.133 cl11394 Glyco_tranf_GTA_type superfamily - - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#1605 - CGI_10013428 superfamily 247085 439 553 7.70E-24 97.191 cl15820 RICIN superfamily - - "Ricin-type beta-trefoil; Carbohydrate-binding domain formed from presumed gene triplication. The domain is found in a variety of molecules serving diverse functions such as enzymatic activity, inhibitory toxicity and signal transduction. Highly specific ligand binding occurs on exposed surfaces of the compact domain structure." 
Q#1607 - CGI_10004635 superfamily 241571 1057 1182 3.40E-10 58.963 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#1607 - CGI_10004635 superfamily 245213 1014 1049 0.00233918 37.231 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#1607 - CGI_10004635 superfamily 241583 799 980 2.97E-37 139.629 cl00064 ZnMc superfamily - - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#1612 - CGI_10004191 superfamily 247724 2 128 1.25E-40 138.822 cl17170 Ras_like_GTPase superfamily N - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#1613 - CGI_10022176 superfamily 217293 526 713 2.79E-22 96.5479 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#1615 - CGI_10022178 superfamily 243078 20 137 1.64E-56 188.53 cl02544 VHS_ENTH_ANTH superfamily - - "VHS, ENTH and ANTH domain superfamily; composed of proteins containing a VHS, ENTH or ANTH domain. The VHS domain is present in Vps27 (Vacuolar Protein Sorting), Hrs (Hepatocyte growth factor-regulated tyrosine kinase substrate) and STAM (Signal Transducing Adaptor Molecule). It is located at the N-termini of proteins involved in intracellular membrane trafficking. The epsin N-terminal homology (ENTH) domain is an evolutionarily conserved protein module found primarily in proteins that participate in clathrin-mediated endocytosis. A set of proteins previously designated as harboring an ENTH domain in fact contains a highly similar, yet unique module referred to as an AP180 N-terminal homology (ANTH) domain. VHS, ENTH and ANTH domains are structurally similar and are composed of a superhelix of eight alpha helices. 
ENTH and ANTH (E/ANTH) domains bind both inositol phospholipids and proteins and contribute to the nucleation and formation of clathrin coats on membranes. ENTH domains also function in the development of membrane curvature through lipid remodeling during the formation of clathrin-coated vesicles. E/ANTH domain-bearing proteins have recently been shown to function with adaptor protein-1 and GGA adaptors at the trans-Golgi network, which suggests that E/ANTH domains are universal components of the machinery for clathrin-mediated membrane budding." Q#1621 - CGI_10022184 superfamily 243065 1028 1183 3.02E-35 134.452 cl02516 VWD superfamily - - von Willebrand factor type D domain; Luciferin-2-monooxygenase from Vargula hilgendorfii contains a vwd domain. Its function is unrelated but the similarity is very strong by several methods. Q#1621 - CGI_10022184 superfamily 243065 579 726 7.16E-33 127.173 cl02516 VWD superfamily - - von Willebrand factor type D domain; Luciferin-2-monooxygenase from Vargula hilgendorfii contains a vwd domain. Its function is unrelated but the similarity is very strong by several methods. Q#1621 - CGI_10022184 superfamily 243065 221 372 1.27E-32 126.788 cl02516 VWD superfamily - - von Willebrand factor type D domain; Luciferin-2-monooxygenase from Vargula hilgendorfii contains a vwd domain. Its function is unrelated but the similarity is very strong by several methods. Q#1621 - CGI_10022184 superfamily 222049 1729 1821 2.38E-21 91.6314 cl16239 Mucin2_WxxW superfamily - - "Mucin-2 protein WxxW repeating region; This family is repeating region found on mucins 2 and 5. The function is not known, but the repeat can be present in up to 32 copies, as in a member from Branchiostoma floridae. The region carries a highly conserved WxxW sequence motif and also has at least six well conserved cysteine residues." 
Q#1621 - CGI_10022184 superfamily 244710 1223 1298 9.86E-20 86.6753 cl07383 C8 superfamily - - "C8 domain; This domain contains 8 conserved cysteine residues, but this family only contains 7 of them due to overlaps with other domains. It is found in disease-related proteins including von Willebrand factor, Alpha tectorin, Zonadhesin and Mucin. It is often found on proteins containing pfam00094 and pfam01826." Q#1621 - CGI_10022184 superfamily 244710 408 482 1.07E-14 72.0377 cl07383 C8 superfamily - - "C8 domain; This domain contains 8 conserved cysteine residues, but this family only contains 7 of them due to overlaps with other domains. It is found in disease-related proteins including von Willebrand factor, Alpha tectorin, Zonadhesin and Mucin. It is often found on proteins containing pfam00094 and pfam01826." Q#1621 - CGI_10022184 superfamily 244710 764 830 7.33E-14 69.7265 cl07383 C8 superfamily - - "C8 domain; This domain contains 8 conserved cysteine residues, but this family only contains 7 of them due to overlaps with other domains. It is found in disease-related proteins including von Willebrand factor, Alpha tectorin, Zonadhesin and Mucin. It is often found on proteins containing pfam00094 and pfam01826." Q#1621 - CGI_10022184 superfamily 222049 1465 1550 6.75E-13 66.9786 cl16239 Mucin2_WxxW superfamily - - "Mucin-2 protein WxxW repeating region; This family is repeating region found on mucins 2 and 5. The function is not known, but the repeat can be present in up to 32 copies, as in a member from Branchiostoma floridae. The region carries a highly conserved WxxW sequence motif and also has at least six well conserved cysteine residues." Q#1622 - CGI_10022185 superfamily 245213 660 684 0.00212571 36.8458 cl09941 EGF_CA superfamily N - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#1622 - CGI_10022185 superfamily 245864 387 584 9.16E-21 94.6526 cl12078 p450 superfamily C - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#1622 - CGI_10022185 superfamily 221981 174 222 0.00348597 36.5764 cl18630 Big_5 superfamily N - Bacterial Ig-like domain; Bacterial Ig-like domain. 
Q#1623 - CGI_10022186 superfamily 243100 43 74 0.00074595 33.0664 cl02576 B_zip1 superfamily N - "basic leucine zipper DNA-binding and multimerization region of GCN4 and related proteins; Basic leucine zipper (bZIP) transcription factors act in networks of homo- and hetero-dimers in the regulation in a diverse set of cellular pathways. Classical leucine zippers have alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization creates a pair of basic regions that bind DNA and undergo conformational change. GCN4 was identified in Saccharomyces cerevisiae from mutations in a deficiency in activation with the general amino acid control pathway. GCN4 encodes a trans-activator of amino acid biosynthetic genes containing 2 acidic activation domains and a C-terminal bZIP domain, comprised of a basic alpha-helical DNA-binding region and a coiled-coil dimerization region." Q#1624 - CGI_10022187 superfamily 145367 67 104 8.88E-11 55.8148 cl03479 pKID superfamily - - pKID domain; CBP and P300 bind to the pKID (phosphorylated kinase-inducible-domain) domain of CREB. Q#1624 - CGI_10022187 superfamily 243100 251 282 5.61E-07 45.778 cl02576 B_zip1 superfamily N - "basic leucine zipper DNA-binding and multimerization region of GCN4 and related proteins; Basic leucine zipper (bZIP) transcription factors act in networks of homo- and hetero-dimers in the regulation in a diverse set of cellular pathways. Classical leucine zippers have alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization creates a pair of basic regions that bind DNA and undergo conformational change. GCN4 was identified in Saccharomyces cerevisiae from mutations in a deficiency in activation with the general amino acid control pathway. 
GCN4 encodes a trans-activator of amino acid biosynthetic genes containing 2 acidic activation domains and a C-terminal bZIP domain, comprised of a basic alpha-helical DNA-binding region and a coiled-coil dimerization region." Q#1625 - CGI_10022188 superfamily 245206 33 305 3.15E-89 270.688 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#1626 - CGI_10022189 superfamily 244824 29 417 0 527.131 cl07893 AmyAc_family superfamily - - "Alpha amylase catalytic domain family; The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; and C is the C-terminal extension characterized by a Greek key. 
The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost this catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase." Q#1626 - CGI_10022189 superfamily 243149 423 511 6.71E-32 117.339 cl02706 Alpha-amylase_C superfamily - - "Alpha amylase, C-terminal all-beta domain; Alpha amylase is classified as family 13 of the glycosyl hydrolases. The structure is an 8 stranded alpha/beta barrel containing the active site, interrupted by a ~70 a.a. calcium-binding domain protruding between beta strand 3 and alpha helix 3, and a carboxyl-terminal Greek key beta-barrel domain." Q#1627 - CGI_10022190 superfamily 244824 1 364 7.13E-168 478.981 cl07893 AmyAc_family superfamily - - "Alpha amylase catalytic domain family; The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; and C is the C-terminal extension characterized by a Greek key. 
The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost this catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase." Q#1627 - CGI_10022190 superfamily 243149 370 458 2.04E-32 118.109 cl02706 Alpha-amylase_C superfamily - - "Alpha amylase, C-terminal all-beta domain; Alpha amylase is classified as family 13 of the glycosyl hydrolases. The structure is an 8 stranded alpha/beta barrel containing the active site, interrupted by a ~70 a.a. calcium-binding domain protruding between beta strand 3 and alpha helix 3, and a carboxyl-terminal Greek key beta-barrel domain." Q#1628 - CGI_10022191 superfamily 243181 2 83 4.45E-47 160.093 cl02783 TopoII_MutL_Trans superfamily N - "MutL_Trans: transducer domain, having a ribosomal S5 domain 2-like fold, conserved in the C-terminal domain of type II DNA topoisomerases (Topo II) and DNA mismatch repair (MutL/MLH1/PMS2) proteins. This transducer domain is homologous to the second domain of the DNA gyrase B subunit, which is known to be important in nucleotide hydrolysis and the transduction of structural signals from ATP-binding site to the DNA breakage/reunion regions of the enzymes. The GyrB dimerizes in response to ATP binding, and is homologous to the N-terminal half of eukaryotic Topo II and the ATPase fragment of MutL. 
Type II DNA topoisomerases catalyze the ATP-dependent transport of one DNA duplex through another, in the process generating transient double strand breaks via covalent attachments to both DNA strands at the 5' positions. Included in this group are proteins similar to human MLH1 and PMS2. MLH1 forms a heterodimer with PMS2 which functions in meiosis and in DNA mismatch repair (MMR). Cells lacking either hMLH1 or hPMS2 have a strong mutator phenotype and display microsatellite instability (MSI). Mutation in hMLH1 accounts for a large fraction of Lynch syndrome (HNPCC) families." Q#1635 - CGI_10022199 superfamily 221377 100 143 0.00607213 34.3667 cl13449 DUF3504 superfamily NC - Domain of unknown function (DUF3504); This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 156 to 173 amino acids in length. Q#1637 - CGI_10022202 superfamily 245226 195 238 1.93E-06 46.1397 cl10012 DnaQ_like_exo superfamily C - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. 
DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#1641 - CGI_10010184 superfamily 190709 5 104 2.54E-52 161.896 cl04204 UPF0139 superfamily - - Uncharacterized protein family (UPF0139); Uncharacterised protein family (UPF0139). Q#1642 - CGI_10010185 superfamily 246748 21 329 4.24E-134 404.278 cl14876 Zinc_peptidase_like superfamily - - "Zinc peptidases M18, M20, M28, and M42; Zinc peptidases play vital roles in metabolic and signaling pathways throughout all kingdoms of life. This family corresponds to several clans in the MEROPS database, including the MH clan, which contains 4 families (M18, M20, M28, M42). The peptidase M20 family includes carboxypeptidases such as the glutamate carboxypeptidase from Pseudomonas, the thermostable carboxypeptidase Ss1 of broad specificity from archaea and yeast Gly-X carboxypeptidase. The dipeptidases include bacterial dipeptidase, peptidase V (PepV), a eukaryotic, non-specific dipeptidase, and two Xaa-His dipeptidases (carnosinases). There is also the bacterial aminopeptidase, peptidase T (PepT) that acts only on tripeptide substrates and has therefore been termed a tripeptidase. Peptidase family M28 contains aminopeptidases and carboxypeptidases, and has co-catalytic zinc ions. However, several enzymes in this family utilize other first row transition metal ions such as cobalt and manganese. Each zinc ion is tetrahedrally co-ordinated, with three amino acid ligands plus activated water; one aspartate residue binds both metal ions. The aminopeptidases in this family are also called bacterial leucyl aminopeptidases, but are able to release a variety of N-terminal amino acids. 
IAP aminopeptidase and aminopeptidase Y preferentially release basic amino acids while glutamate carboxypeptidase II preferentially releases C-terminal glutamates. Glutamate carboxypeptidase II and plasma glutamate carboxypeptidase hydrolyze dipeptides. Peptidase families M18 and M42 contain metalloaminopeptidases. M18 is widely distributed in bacteria and eukaryotes. However, only yeast aminopeptidase I and mammalian aspartyl aminopeptidase have been characterized in detail. Some of M42 (also known as glutamyl aminopeptidase) enzymes exhibit aminopeptidase specificity while others also have acylaminoacylpeptidase activity (i.e. hydrolysis of acylated N-terminal residues)." Q#1643 - CGI_10010186 superfamily 207690 92 116 0.000777182 36.5269 cl02656 zf-RanBP superfamily - - Zn-finger in Ran binding protein and others; Zn-finger in Ran binding protein and others. Q#1643 - CGI_10010186 superfamily 207690 15 39 0.00145163 35.7565 cl02656 zf-RanBP superfamily - - Zn-finger in Ran binding protein and others; Zn-finger in Ran binding protein and others. Q#1645 - CGI_10010188 superfamily 245847 17 123 1.12E-09 51.7344 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#1646 - CGI_10010189 superfamily 238191 425 894 1.11E-114 365.116 cl18907 Esterase_lipase superfamily - - "Esterases and lipases (includes fungal lipases, cholinesterases, etc.) These enzymes act on carboxylic esters (EC: 3.1.1.-). The catalytic apparatus involves three residues (catalytic triad): a serine, a glutamate or aspartate and a histidine. These catalytic residues are responsible for the nucleophilic attack on the carbonyl carbon atom of the ester bond. 
In contrast with other alpha/beta hydrolase fold family members, p-nitrobenzyl esterase and acetylcholine esterase have a Glu instead of Asp at the active site carboxylate." Q#1646 - CGI_10010189 superfamily 242406 208 358 1.46E-52 181.636 cl01271 DUF1768 superfamily - - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. Q#1646 - CGI_10010189 superfamily 245201 32 99 1.40E-30 122.222 cl09925 PKc_like superfamily NC - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#1647 - CGI_10010190 superfamily 245201 7 261 1.86E-112 337.934 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#1647 - CGI_10010190 superfamily 242406 436 549 1.43E-31 119.233 cl01271 DUF1768 superfamily N - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. 
The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. Q#1648 - CGI_10010191 superfamily 245201 492 746 6.21E-82 268.212 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#1648 - CGI_10010191 superfamily 242406 894 1044 4.42E-53 183.176 cl01271 DUF1768 superfamily - - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. Q#1648 - CGI_10010191 superfamily 242406 314 464 2.54E-51 178.169 cl01271 DUF1768 superfamily - - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. Q#1649 - CGI_10010192 superfamily 242406 233 382 9.18E-51 169.096 cl01271 DUF1768 superfamily - - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. Q#1650 - CGI_10010193 superfamily 241754 202 533 0 545.701 cl00286 Motor_domain superfamily - - Myosin and Kinesin motor domain. 
These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. Q#1651 - CGI_10010194 superfamily 241832 567 661 7.12E-28 108.029 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#1651 - CGI_10010194 superfamily 248022 203 516 9.25E-07 50.3539 cl17468 Aa_trans superfamily - - "Transmembrane amino acid transporter protein; This transmembrane region is found in many amino acid transporters including UNC-47 and MTR. UNC-47 encodes a vesicular amino butyric acid (GABA) transporter, (VGAT). UNC-47 is predicted to have 10 transmembrane domains. MTR is a N system amino acid transporter system protein involved in methyltryptophan resistance. Other members of this family include proline transporters and amino acid permeases." Q#1651 - CGI_10010194 superfamily 248022 59 101 0.000824379 40.7239 cl17468 Aa_trans superfamily C - "Transmembrane amino acid transporter protein; This transmembrane region is found in many amino acid transporters including UNC-47 and MTR. UNC-47 encodes a vesicular amino butyric acid (GABA) transporter, (VGAT). 
UNC-47 is predicted to have 10 transmembrane domains. MTR is a N system amino acid transporter system protein involved in methyltryptophan resistance. Other members of this family include proline transporters and amino acid permeases." Q#1653 - CGI_10010196 superfamily 241563 114 144 0.00590007 32.4512 cl00034 BBOX superfamily N - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#1655 - CGI_10010198 superfamily 247792 16 45 0.000111314 35.114 cl17238 RING superfamily C - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#1656 - CGI_10010199 superfamily 241563 75 110 4.72E-07 47.0888 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#1658 - CGI_10010201 superfamily 241563 75 110 0.000174327 35.918 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#1659 - CGI_10010202 superfamily 247792 16 59 1.21E-07 48.2108 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#1659 - CGI_10010202 superfamily 241563 154 189 8.87E-06 42.8516 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#1661 - CGI_10013462 superfamily 243555 21 216 7.73E-14 69.3422 cl03871 Chitin_bind_3 superfamily - - "Chitin binding domain; This domain is found associated with a wide variety of cellulose binding domain. This domain however is a chitin binding domain. This domain is found in isolation in baculoviral spheroidins and spindolins, protein of unknown function." Q#1662 - CGI_10013463 superfamily 246925 312 514 0.000342083 41.5722 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. 
This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#1664 - CGI_10013465 superfamily 248458 75 413 1.81E-23 99.6956 cl17904 MFS superfamily - - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." 
Q#1666 - CGI_10013467 superfamily 248458 1 90 0.000315988 37.6785 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#1667 - CGI_10013468 superfamily 241609 10 85 2.07E-24 95.1375 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. 
They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#1667 - CGI_10013468 superfamily 241609 284 360 1.68E-17 76.2627 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#1667 - CGI_10013468 superfamily 245213 167 196 0.000285893 38.0014 cl09941 EGF_CA superfamily N - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#1667 - CGI_10013468 superfamily 241609 211 279 4.19E-16 72.4107 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." 
Q#1667 - CGI_10013468 superfamily 241609 89 161 4.64E-12 61.2522 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#1668 - CGI_10013469 superfamily 241610 509 559 3.02E-14 68.0454 cl00101 KU superfamily - - BPTI/Kunitz family of serine protease inhibitors; Structure is a disulfide rich alpha+beta fold. BPTI (bovine pancreatic trypsin inhibitor) is an extensively studied model structure. Q#1668 - CGI_10013469 superfamily 241610 392 442 2.09E-13 65.7342 cl00101 KU superfamily - - BPTI/Kunitz family of serine protease inhibitors; Structure is a disulfide rich alpha+beta fold. BPTI (bovine pancreatic trypsin inhibitor) is an extensively studied model structure. Q#1670 - CGI_10013471 superfamily 243051 496 643 3.55E-25 103.997 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. 
In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal cord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#1670 - CGI_10013471 superfamily 241609 732 807 6.01E-25 100.915 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#1670 - CGI_10013471 superfamily 241609 1012 1080 7.07E-20 86.2779 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#1670 - CGI_10013471 superfamily 241609 923 999 4.95E-19 83.9667 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#1670 - CGI_10013471 superfamily 241571 359 485 1.06E-12 66.2818 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." 
Q#1670 - CGI_10013471 superfamily 245213 888 916 0.000342463 39.9274 cl09941 EGF_CA superfamily N - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#1670 - CGI_10013471 superfamily 241583 131 309 1.35E-38 143.481 cl00064 ZnMc superfamily - - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#1670 - CGI_10013471 superfamily 241609 657 728 1.96E-14 70.4847 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#1670 - CGI_10013471 superfamily 241609 819 879 1.71E-13 67.7162 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. 
Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#1672 - CGI_10013473 superfamily 241574 27 72 1.73E-14 69.5369 cl00053 PTPc superfamily N - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#1672 - CGI_10013473 superfamily 241574 99 270 3.40E-14 68.7665 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#1673 - CGI_10013474 superfamily 244824 70 86 0.00885295 34.9627 cl07893 AmyAc_family superfamily C - "Alpha amylase catalytic domain family; The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. 
A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; and C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost this catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase." Q#1678 - CGI_10013479 superfamily 241574 512 667 2.65E-70 232.862 cl00053 PTPc superfamily C - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#1678 - CGI_10013479 superfamily 241574 768 829 0.00558042 37.9506 cl00053 PTPc superfamily N - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. 
The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#1681 - CGI_10000828 superfamily 241583 53 149 2.80E-17 75.6854 cl00064 ZnMc superfamily N - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#1682 - CGI_10000898 superfamily 241698 20 85 7.12E-29 103.443 cl00220 cysteine_hydrolases superfamily N - "Cysteine hydrolases; This family contains amidohydrolases, like CSHase (N-carbamoylsarcosine amidohydrolase), involved in creatine metabolism and nicotinamidase, converting nicotinamide to nicotinic acid and ammonia in the pyridine nucleotide cycle. It also contains isochorismatase, an enzyme that catalyzes the conversion of isochorismate to 2,3-dihydroxybenzoate and pyruvate, via the hydrolysis of the vinyl ether bond, and other related enzymes with unknown function." Q#1684 - CGI_10005103 superfamily 241563 59 99 1.56E-05 42.7095 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#1684 - CGI_10005103 superfamily 128778 98 212 0.00379937 36.4739 cl17972 BBC superfamily - - B-Box C-terminal domain; Coiled coil region C-terminal to (some) B-Box domains Q#1686 - CGI_10005105 superfamily 110440 112 139 0.00171035 33.5353 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#1687 - CGI_10004514 superfamily 241832 71 139 2.11E-18 76.8938 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." 
Q#1688 - CGI_10004515 superfamily 148314 1 342 4.09E-26 108.048 cl05919 XRCC4 superfamily - - "DNA double-strand break repair and V(D)J recombination protein XRCC4; This family consists of several eukaryotic DNA double-strand break repair and V(D)J recombination protein XRCC4 sequences. In the non-homologous end joining pathway of DNA double-strand break repair, the ligation step is catalyzed by a complex of XRCC4 and DNA ligase IV. It is thought that XRCC4 and ligase IV are essential for alignment-based gap filling, as well as for final ligation of the breaks." Q#1688 - CGI_10004515 superfamily 148314 365 539 4.34E-08 53.735 cl05919 XRCC4 superfamily C - "DNA double-strand break repair and V(D)J recombination protein XRCC4; This family consists of several eukaryotic DNA double-strand break repair and V(D)J recombination protein XRCC4 sequences. In the non-homologous end joining pathway of DNA double-strand break repair, the ligation step is catalyzed by a complex of XRCC4 and DNA ligase IV. It is thought that XRCC4 and ligase IV are essential for alignment-based gap filling, as well as for final ligation of the breaks." Q#1689 - CGI_10004516 superfamily 241832 47 117 7.63E-26 97.6946 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). 
Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#1689 - CGI_10004516 superfamily 243175 137 242 2.54E-14 66.1104 cl02776 GST_C_family superfamily - - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. 
Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." Q#1690 - CGI_10004517 superfamily 243175 63 194 3.38E-17 73.4291 cl02776 GST_C_family superfamily - - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. 
Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." Q#1690 - CGI_10004517 superfamily 241832 1 46 1.38E-09 51.8558 cl00388 Thioredoxin_like superfamily N - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#1691 - CGI_10004518 superfamily 243175 137 268 2.24E-15 69.1919 cl02776 GST_C_family superfamily - - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. 
This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." Q#1691 - CGI_10004518 superfamily 241832 48 120 1.50E-19 80.3606 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). 
Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#1692 - CGI_10004519 superfamily 243175 143 211 6.96E-14 64.1844 cl02776 GST_C_family superfamily N - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. 
Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." Q#1692 - CGI_10004519 superfamily 241832 26 91 2.48E-21 84.5978 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#1693 - CGI_10004520 superfamily 194545 18 105 9.45E-29 100.71 cl03131 Dynein_light superfamily - - Dynein light chain type 1; Dynein light chain type 1. Q#1694 - CGI_10004521 superfamily 243072 90 215 1.40E-41 140.211 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#1694 - CGI_10004521 superfamily 243072 37 116 1.52E-11 58.9342 cl02529 ANK superfamily N - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#1695 - CGI_10004522 superfamily 217895 21 134 1.52E-05 44.1711 cl04401 CD20 superfamily - - "CD20-like family; This family includes the CD20 protein and the beta subunit of the high affinity receptor for IgE Fc. The high affinity receptor for IgE is a tetrameric structure consisting of a single IgE-binding alpha subunit, a single beta subunit, and two disulfide-linked gamma subunits. The alpha subunit of Fc epsilon RI and most Fc receptors are homologous members of the Ig superfamily. By contrast, the beta and gamma subunits from Fc epsilon RI are not homologous to the Ig superfamily. Both molecules have four putative transmembrane segments and a probable topology where both amino- and carboxy termini protrude into the cytoplasm. This family also includes LR8 like proteins from humans, mice and rats. The function of the human LR8 protein is unknown although it is known to be strongly expressed in the lung fibroblasts. This family also includes sarcospan, a transmembrane component of dystrophin-associated glycoprotein. Loss of the sarcoglycan complex and sarcospan alone is sufficient to cause muscular dystrophy. The role of the sarcoglycan complex and sarcospan is thought to be to strengthen the dystrophin axis connecting the basement membrane with the cytoskeleton." Q#1696 - CGI_10001186 superfamily 248020 28 132 2.52E-17 77.4015 cl17466 Sulfatase superfamily C - Sulfatase; Sulfatase. 
Q#1697 - CGI_10001187 superfamily 241752 277 411 7.52E-17 77.7559 cl00283 ADP_ribosyl superfamily N - "ADP_ribosylating enzymes catalyze the transfer of ADP_ribose from NAD+ to substrates. Bacterial toxins are cytoplasmic and catalyze the transfer of a single ADP_ribose unit to eukaryotic elongation factor 2, halting protein synthesis and killing the cell. Poly(ADP-ribose) polymerases (PARPS 1-3, VPARP, tankyrase) catalyze the addition of up to 100 ADP_ribose units from NAD+. PARPs 1 and 2 are localized in the nucleus, bind DNA, and are activated by DNA damage. VPARP is part of the vault ribonucleoprotein complex. Tankyrases regulate telomere length in part through poly(ADP_ribosylation) of telomere repeat binding factor 1 (TRF1). Poly(ADP-ribose) polymerase catalyses the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active." Q#1698 - CGI_10005726 superfamily 238191 11 503 1.36E-108 335.071 cl18907 Esterase_lipase superfamily - - "Esterases and lipases (includes fungal lipases, cholinesterases, etc.) These enzymes act on carboxylic esters (EC: 3.1.1.-). The catalytic apparatus involves three residues (catalytic triad): a serine, a glutamate or aspartate and a histidine. These catalytic residues are responsible for the nucleophilic attack on the carbonyl carbon atom of the ester bond. In contrast with other alpha/beta hydrolase fold family members, p-nitrobenzyl esterase and acetylcholine esterase have a Glu instead of Asp at the active site carboxylate." 
Q#1699 - CGI_10005727 superfamily 238191 28 518 3.33E-111 342.005 cl18907 Esterase_lipase superfamily - - "Esterases and lipases (includes fungal lipases, cholinesterases, etc.) These enzymes act on carboxylic esters (EC: 3.1.1.-). The catalytic apparatus involves three residues (catalytic triad): a serine, a glutamate or aspartate and a histidine. These catalytic residues are responsible for the nucleophilic attack on the carbonyl carbon atom of the ester bond. In contrast with other alpha/beta hydrolase fold family members, p-nitrobenzyl esterase and acetylcholine esterase have a Glu instead of Asp at the active site carboxylate." Q#1700 - CGI_10005728 superfamily 238191 23 513 8.07E-110 338.153 cl18907 Esterase_lipase superfamily - - "Esterases and lipases (includes fungal lipases, cholinesterases, etc.) These enzymes act on carboxylic esters (EC: 3.1.1.-). The catalytic apparatus involves three residues (catalytic triad): a serine, a glutamate or aspartate and a histidine. These catalytic residues are responsible for the nucleophilic attack on the carbonyl carbon atom of the ester bond. In contrast with other alpha/beta hydrolase fold family members, p-nitrobenzyl esterase and acetylcholine esterase have a Glu instead of Asp at the active site carboxylate." Q#1702 - CGI_10005730 superfamily 246676 186 346 2.35E-37 135.55 cl14616 Cyt_b561 superfamily - - "Eukaryotic cytochrome b(561); Cytochrome b(561) is a family of endosomal or secretory vesicle-specific electron transport proteins. They are integral membrane proteins that bind two heme groups non-covalently, and may have six alpha-helical trans-membrane segments. This is an exclusively eukaryotic family. Members of the prokaryotic cytochrome b561 family are not deemed homologous." 
Q#1702 - CGI_10005730 superfamily 246710 28 171 2.23E-16 75.9272 cl14783 DOMON_like superfamily - - "Domon-like ligand-binding domains; DOMON-like domains can be found in all three kingdoms of life and are a diverse group of ligand binding domains that have been shown to interact with sugars and hemes. DOMON domains were initially thought to confer protein-protein interactions. They were subsequently found as a heme-binding motif in cellobiose dehydrogenase, an extracellular fungal oxidoreductase that degrades both lignin and cellulose, and in ethylbenzene dehydrogenase, an enzyme that aids in the anaerobic degradation of hydrocarbons. The domain interacts with sugars in the type 9 carbohydrate binding modules (CBM9), which are present in a variety of glycosyl hydrolases, and it can also be found at the N-terminus of sensor histidine kinases." Q#1703 - CGI_10005731 superfamily 246676 430 602 5.30E-40 147.491 cl14616 Cyt_b561 superfamily - - "Eukaryotic cytochrome b(561); Cytochrome b(561) is a family of endosomal or secretory vesicle-specific electron transport proteins. They are integral membrane proteins that bind two heme groups non-covalently, and may have six alpha-helical trans-membrane segments. This is an exclusively eukaryotic family. Members of the prokaryotic cytochrome b561 family are not deemed homologous." Q#1703 - CGI_10005731 superfamily 246710 119 277 3.07E-20 89.7944 cl14783 DOMON_like superfamily - - "Domon-like ligand-binding domains; DOMON-like domains can be found in all three kingdoms of life and are a diverse group of ligand binding domains that have been shown to interact with sugars and hemes. DOMON domains were initially thought to confer protein-protein interactions. They were subsequently found as a heme-binding motif in cellobiose dehydrogenase, an extracellular fungal oxidoreductase that degrades both lignin and cellulose, and in ethylbenzene dehydrogenase, an enzyme that aids in the anaerobic degradation of hydrocarbons. 
The domain interacts with sugars in the type 9 carbohydrate binding modules (CBM9), which are present in a variety of glycosyl hydrolases, and it can also be found at the N-terminus of sensor histidine kinases." Q#1703 - CGI_10005731 superfamily 246710 294 399 3.43E-13 68.6084 cl14783 DOMON_like superfamily - - "Domon-like ligand-binding domains; DOMON-like domains can be found in all three kingdoms of life and are a diverse group of ligand binding domains that have been shown to interact with sugars and hemes. DOMON domains were initially thought to confer protein-protein interactions. They were subsequently found as a heme-binding motif in cellobiose dehydrogenase, an extracellular fungal oxidoreductase that degrades both lignin and cellulose, and in ethylbenzene dehydrogenase, an enzyme that aids in the anaerobic degradation of hydrocarbons. The domain interacts with sugars in the type 9 carbohydrate binding modules (CBM9), which are present in a variety of glycosyl hydrolases, and it can also be found at the N-terminus of sensor histidine kinases." Q#1703 - CGI_10005731 superfamily 246676 863 934 1.14E-05 45.7986 cl14616 Cyt_b561 superfamily N - "Eukaryotic cytochrome b(561); Cytochrome b(561) is a family of endosomal or secretory vesicle-specific electron transport proteins. They are integral membrane proteins that bind two heme groups non-covalently, and may have six alpha-helical trans-membrane segments. This is an exclusively eukaryotic family. Members of the prokaryotic cytochrome b561 family are not deemed homologous." Q#1703 - CGI_10005731 superfamily 246710 760 825 0.000271468 41.2592 cl14783 DOMON_like superfamily C - "Domon-like ligand-binding domains; DOMON-like domains can be found in all three kingdoms of life and are a diverse group of ligand binding domains that have been shown to interact with sugars and hemes. DOMON domains were initially thought to confer protein-protein interactions. 
They were subsequently found as a heme-binding motif in cellobiose dehydrogenase, an extracellular fungal oxidoreductase that degrades both lignin and cellulose, and in ethylbenzene dehydrogenase, an enzyme that aids in the anaerobic degradation of hydrocarbons. The domain interacts with sugars in the type 9 carbohydrate binding modules (CBM9), which are present in a variety of glycosyl hydrolases, and it can also be found at the N-terminus of sensor histidine kinases." Q#1709 - CGI_10004796 superfamily 248264 36 200 1.04E-49 160.866 cl17710 DDE_4 superfamily - - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#1710 - CGI_10004797 superfamily 110440 158 185 0.00108511 35.4613 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#1711 - CGI_10001418 superfamily 243082 109 201 1.75E-33 121.126 cl02553 Peptidase_C19 superfamily N - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. 
They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." Q#1711 - CGI_10001418 superfamily 243082 25 157 0.00470137 35.8048 cl02553 Peptidase_C19 superfamily NC - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." Q#1712 - CGI_10001591 superfamily 245201 8 227 1.05E-83 252.457 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." 
Q#1715 - CGI_10006691 superfamily 241578 202 355 4.57E-42 146.663 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#1715 - CGI_10006691 superfamily 241578 6 151 6.40E-23 93.5053 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. 
Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#1716 - CGI_10006692 superfamily 247980 5 182 2.48E-56 177.33 cl17426 DHFR superfamily - - "Dihydrofolate reductase (DHFR). Reduces 7,8-dihydrofolate to 5,6,7,8-tetrahydrofolate with NADPH as a cofactor. This is an essential step in the biosynthesis of deoxythymidine phosphate since 5,6,7,8-tetrahydrofolate is required to regenerate 5,10-methylenetetrahydrofolate which is then utilized by thymidylate synthase. Inhibition of DHFR interrupts thymidylate synthesis and DNA replication, inhibitors of DHFR (such as Methotrexate) are used in cancer chemotherapy. 5,6,7,8-tetrahydrofolate also is involved in glycine, serine, and threonine metabolism and aminoacyl-tRNA biosynthesis." Q#1717 - CGI_10006693 superfamily 243061 133 221 8.92E-09 51.5738 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#1717 - CGI_10006693 superfamily 243061 32 97 0.00122765 36.2907 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#1718 - CGI_10006694 superfamily 247980 25 56 4.39E-10 51.7343 cl17426 DHFR superfamily N - "Dihydrofolate reductase (DHFR). Reduces 7,8-dihydrofolate to 5,6,7,8-tetrahydrofolate with NADPH as a cofactor. This is an essential step in the biosynthesis of deoxythymidine phosphate since 5,6,7,8-tetrahydrofolate is required to regenerate 5,10-methylenetetrahydrofolate which is then utilized by thymidylate synthase. 
Inhibition of DHFR interrupts thymidylate synthesis and DNA replication; inhibitors of DHFR (such as Methotrexate) are used in cancer chemotherapy. 5,6,7,8-tetrahydrofolate also is involved in glycine, serine, and threonine metabolism and aminoacyl-tRNA biosynthesis." Q#1721 - CGI_10006697 superfamily 241578 1 159 4.24E-27 108.918 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#1721 - CGI_10006697 superfamily 243119 417 470 0.0016003 37.4133 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. 
Q#1722 - CGI_10006698 superfamily 241574 276 332 1.87E-09 57.5957 cl00053 PTPc superfamily N - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins; only one copy may be active." Q#1724 - CGI_10001942 superfamily 245818 20 106 3.47E-37 128.467 cl11966 Rel-Spo_like superfamily N - "RelA- and SpoT-like ppGpp Synthetases and Hydrolases, catalytic domain; The Rel-Spo superfamily includes the catalytic domains of Escherichia coli ppGpp synthetase (RelA), ppGpp synthetase/hydrolase (SpoT), and related proteins. RelA synthesizes (p)ppGpp in response to amino-acid starvation and in association with ribosomes. (p)ppGpp triggers the bacterial stringent response. SpoT catalyzes (p)ppGpp synthesis under carbon limitation in a ribosome-independent manner. It also catalyzes (p)ppGpp degradation. Gram-negative bacteria have two enzymes involved in (p)ppGpp metabolism while most Gram-positive organisms have a single Rel-Spo enzyme (Rel), which both synthesizes and degrades (p)ppGpp. The Arabidopsis thaliana Rel-Spo proteins, At-RSH1,-2, and-3 appear to regulate a rapid (p)ppGpp-mediated response to pathogens and other stresses. This catalytic domain is found in association with an N-terminal HD domain and a C-terminal metal dependent phosphohydrolase domain (TGS). Some Rel-Spo proteins also have a C-terminal regulatory ACT domain." 
Q#1725 - CGI_10001943 superfamily 245818 10 193 5.49E-76 233.241 cl11966 Rel-Spo_like superfamily C - "RelA- and SpoT-like ppGpp Synthetases and Hydrolases, catalytic domain; The Rel-Spo superfamily includes the catalytic domains of Escherichia coli ppGpp synthetase (RelA), ppGpp synthetase/hydrolase (SpoT), and related proteins. RelA synthesizes (p)ppGpp in response to amino-acid starvation and in association with ribosomes. (p)ppGpp triggers the bacterial stringent response. SpoT catalyzes (p)ppGpp synthesis under carbon limitation in a ribosome-independent manner. It also catalyzes (p)ppGpp degradation. Gram-negative bacteria have two enzymes involved in (p)ppGpp metabolism while most Gram-positive organisms have a single Rel-Spo enzyme (Rel), which both synthesizes and degrades (p)ppGpp. The Arabidopsis thaliana Rel-Spo proteins, At-RSH1,-2, and-3 appear to regulate a rapid (p)ppGpp-mediated response to pathogens and other stresses. This catalytic domain is found in association with an N-terminal HD domain and a C-terminal metal dependent phosphohydrolase domain (TGS). Some Rel-Spo proteins also have a C-terminal regulatory ACT domain." Q#1727 - CGI_10001074 superfamily 248458 21 137 6.83E-07 49.6197 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. 
MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#1727 - CGI_10001074 superfamily 248458 272 400 2.64E-05 44.6121 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. 
Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#1728 - CGI_10013178 superfamily 248012 137 227 1.18E-08 52.2757 cl17458 TIR_2 superfamily C - TIR domain; This is a family of bacterial Toll-like receptors. Q#1728 - CGI_10013178 superfamily 248012 2 100 0.000492997 38.4584 cl17458 TIR_2 superfamily N - TIR domain; This is a family of bacterial Toll-like receptors. Q#1729 - CGI_10013179 superfamily 248012 137 264 1.60E-11 60.7501 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#1729 - CGI_10013179 superfamily 248012 2 100 1.50E-05 43.0808 cl17458 TIR_2 superfamily N - TIR domain; This is a family of bacterial Toll-like receptors. Q#1730 - CGI_10013180 superfamily 243030 546 578 0.00664983 34.9155 cl02423 LRRNT superfamily - - Leucine rich repeat N-terminal domain; Leucine Rich Repeats pfam00560 are short sequence motifs present in a number of proteins with diverse functions and cellular locations. Leucine Rich Repeats are often flanked by cysteine rich domains. This domain is often found at the N-terminus of tandem leucine rich repeats. 
Q#1732 - CGI_10013182 superfamily 241563 819 851 0.000450752 40.0131 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#1732 - CGI_10013182 superfamily 241563 1124 1156 0.000551992 39.77 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#1734 - CGI_10013184 superfamily 217293 317 514 4.26E-48 168.965 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#1734 - CGI_10013184 superfamily 202474 522 742 1.48E-43 157.045 cl08379 Neur_chan_memb superfamily - - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#1734 - CGI_10013184 superfamily 216290 83 203 2.67E-16 76.5581 cl03089 Cu2_monooxygen superfamily - - "Copper type II ascorbate-dependent monooxygenase, N-terminal domain; The N and C-terminal domains of members of this family adopt the same PNGase F-like fold." Q#1734 - CGI_10013184 superfamily 217685 219 298 1.21E-11 63.122 cl04225 Cu2_monoox_C superfamily C - "Copper type II ascorbate-dependent monooxygenase, C-terminal domain; The N and C-terminal domains of members of this family adopt the same PNGase F-like fold." 
Q#1735 - CGI_10013185 superfamily 245212 108 173 4.76E-06 44.5478 cl09940 S4 superfamily - - "S4/Hsp/ tRNA synthetase RNA-binding domain; The domain surface is populated by conserved, charged residues that define a likely RNA-binding site; Found in stress proteins, ribosomal proteins and tRNA synthetases; This may imply a hitherto unrecognized functional similarity between these three protein classes." Q#1735 - CGI_10013185 superfamily 217293 217 373 6.05E-37 136.223 cl03788 Neur_chan_LBD superfamily C - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#1735 - CGI_10013185 superfamily 202474 382 558 4.09E-27 109.28 cl08379 Neur_chan_memb superfamily - - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#1736 - CGI_10013186 superfamily 243072 2041 2176 6.43E-28 112.477 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#1736 - CGI_10013186 superfamily 243072 1946 2101 7.27E-20 88.9798 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#1736 - CGI_10013186 superfamily 245213 528 567 8.91E-09 54.565 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#1736 - CGI_10013186 superfamily 245213 254 292 1.58E-07 50.713 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#1736 - CGI_10013186 superfamily 245213 877 911 1.67E-07 50.713 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. 
Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#1736 - CGI_10013186 superfamily 245213 837 873 2.01E-07 50.3278 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#1736 - CGI_10013186 superfamily 245213 1066 1101 2.73E-07 49.9426 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#1736 - CGI_10013186 superfamily 245213 607 643 3.32E-07 49.9426 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. 
Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#1736 - CGI_10013186 superfamily 245213 645 681 3.52E-07 49.5574 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#1736 - CGI_10013186 superfamily 245213 1225 1261 5.20E-07 49.1722 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#1736 - CGI_10013186 superfamily 245213 1028 1064 1.88E-06 47.6314 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. 
Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#1736 - CGI_10013186 superfamily 245213 683 719 2.76E-06 47.2462 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#1736 - CGI_10013186 superfamily 245213 569 605 3.51E-06 46.861 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#1736 - CGI_10013186 superfamily 245213 990 1026 5.51E-06 46.0906 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. 
Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#1736 - CGI_10013186 superfamily 245213 1106 1140 2.16E-05 44.5498 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#1736 - CGI_10013186 superfamily 245213 412 447 3.19E-05 44.1646 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#1736 - CGI_10013186 superfamily 245213 1459 1495 3.47E-05 43.7794 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. 
Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#1736 - CGI_10013186 superfamily 245213 799 834 6.01E-05 43.009 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#1736 - CGI_10013186 superfamily 245213 1339 1376 0.000135618 42.2386 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#1736 - CGI_10013186 superfamily 245213 371 409 0.000141435 42.2386 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. 
Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#1736 - CGI_10013186 superfamily 245213 761 796 0.000186932 41.8534 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#1736 - CGI_10013186 superfamily 245213 1263 1299 0.000347516 41.083 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#1736 - CGI_10013186 superfamily 245213 721 759 0.000736031 39.9274 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. 
Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#1736 - CGI_10013186 superfamily 245213 333 369 0.000740441 39.9274 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#1736 - CGI_10013186 superfamily 245213 1310 1337 0.000834379 39.9274 cl09941 EGF_CA superfamily N - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#1736 - CGI_10013186 superfamily 245213 299 331 0.00141052 39.157 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. 
Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#1736 - CGI_10013186 superfamily 245213 1196 1223 0.00312278 38.0014 cl09941 EGF_CA superfamily N - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#1736 - CGI_10013186 superfamily 245213 1147 1180 0.00809482 36.8458 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#1736 - CGI_10013186 superfamily 245213 217 251 0.00826439 36.8458 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. 
Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#1736 - CGI_10013186 superfamily 191614 1635 1685 5.37E-05 43.7622 cl08449 NOD superfamily - - "NOTCH protein; NOTCH signalling plays a fundamental role during a great number of developmental processes in multicellular animals. NOD and NODP represent a region present in many NOTCH proteins and NOTCH homologs in multiple species such as NOTCH2 and NOTCH3, LIN12, SC1 and TAN1. Role of NOD domain remains to be elucidated." Q#1736 - CGI_10013186 superfamily 243028 1515 1545 0.000827318 39.9698 cl02419 Notch superfamily - - LNR domain; The LNR (Lin-12/Notch repeat) domain is found in three tandem copies in Notch related proteins. The structure of the domain has been determined by NMR and was shown to contain three disulphide bonds and coordinate a calcium ion. Three repeats are also found in the PAPP-A peptidase. Q#1736 - CGI_10013186 superfamily 243028 1599 1625 0.00126374 39.1994 cl02419 Notch superfamily N - LNR domain; The LNR (Lin-12/Notch repeat) domain is found in three tandem copies in Notch related proteins. The structure of the domain has been determined by NMR and was shown to contain three disulphide bonds and coordinate a calcium ion. Three repeats are also found in the PAPP-A peptidase. Q#1736 - CGI_10013186 superfamily 243028 1550 1587 0.00139022 39.1994 cl02419 Notch superfamily - - LNR domain; The LNR (Lin-12/Notch repeat) domain is found in three tandem copies in Notch related proteins. The structure of the domain has been determined by NMR and was shown to contain three disulphide bonds and coordinate a calcium ion. Three repeats are also found in the PAPP-A peptidase. 
Q#1737 - CGI_10013187 superfamily 244901 8 224 2.89E-121 350.387 cl08306 Peptidase_C12 superfamily - - "Cysteine peptidase C12 contains ubiquitin carboxyl-terminal hydrolase (UCH) families L1, L3, L5 and BAP1; The ubiquitin C-terminal hydrolase (UCH; ubiquitinyl hydrolase; ubiquitin thiolesterase) family of deubiquitinating enzymes (DUBs) consists of four members to date: UCH-L1, UCH-L3, UCH-L5 (UCH37) and BRCA1-associated protein-1 (BAP1), all containing a conserved catalytic domain with cysteine peptidase activity. UCH-L1 hydrolyzes carboxyl terminal esters and amides of ubiquitin (Ub). Dysfunction of this hydrolase activity can lead to an accumulation of alpha-synuclein, which is linked to Parkinson's disease (PD) and neurofibrillary tangles, linked to Alzheimer's disease (AD). UCH-L1, in its dimeric form, has additional enzymatic activity as a ubiquitin ligase. UCH-L3 hydrolyzes isopeptide bonds at the C-terminal glycine of either Ub or Nedd8, a ubiquitin-like protein. UCH-L3 can also interact with Lys48-linked Ub dimers to protect it from degradation while inhibiting its hydrolase activity at the same time. UCH-L1 and UCH-L3 are the most closely related of the UCH members. UCH-L5 (UCH37) is involved in the deubiquitinating activity in the 19S proteasome regulatory complex. It is also associated with the human Ino80 chromatin-remodeling complex (hINO80) in the nucleus. BAP1 binds to the wild-type BRCA1 RING finger domain, localized in the nucleus. It consists of the N-terminal UCH domain and two predicted nuclear localization signals (NLSs), only one of which is functional. The full-length human BRCA1 is a ubiquitin ligase. However, BAP1 does not appear to function in the deubiquitination of autoubiquitinated BRCA1. There is growing evidence that UCH enzymes and human malignancies are closely correlated. Studies show that UCH enzymes play a crucial role in some signaling pathways and in cell-cycle regulation." 
Q#1738 - CGI_10013188 superfamily 243164 38 55 7.03E-07 42.1396 cl02748 zf-CDGSH superfamily N - "Iron-binding zinc finger CDGSH type; The CDGSH-type zinc finger domain binds iron rather than zinc as a redox-active pH-labile 2Fe-2S cluster. The conserved sequence C-X-C-X2-(S/T)-X3-P-X-C-D-G-(S/A/T)-H is a defining feature of this family. The domain is oriented towards the cytoplasm and is tethered to the mitochondrial membrane by a more N-terminal domain found in higher vertebrates, MitoNEET_N, pfam10660. The domain forms a uniquely folded homo-dimer and spans the outer mitochondrial membrane, orienting the iron-binding residues towards the cytoplasm." Q#1738 - CGI_10013188 superfamily 243164 57 91 0.00142506 33.2801 cl02748 zf-CDGSH superfamily - - "Iron-binding zinc finger CDGSH type; The CDGSH-type zinc finger domain binds iron rather than zinc as a redox-active pH-labile 2Fe-2S cluster. The conserved sequence C-X-C-X2-(S/T)-X3-P-X-C-D-G-(S/A/T)-H is a defining feature of this family. The domain is oriented towards the cytoplasm and is tethered to the mitochondrial membrane by a more N-terminal domain found in higher vertebrates, MitoNEET_N, pfam10660. The domain forms a uniquely folded homo-dimer and spans the outer mitochondrial membrane, orienting the iron-binding residues towards the cytoplasm." 
Q#1739 - CGI_10013189 superfamily 245456 166 418 1.90E-149 427.536 cl10970 AP_MHD_Cterm superfamily - - "C-terminal domain of adaptor protein (AP) complexes medium mu subunits and its homologs (MHD); This family corresponds to the C-terminal domain of heterotetrameric AP complexes medium mu subunits and its homologs existing in monomeric stonins, delta-subunit of the heteroheptameric coat protein I (delta-COPI), a protein encoded by a pro-death gene referred as MuD (also known as MUDENG, mu-2 related death-inducing gene), an endocytic adaptor syp1, the mammalian FCH domain only proteins (FCHo1/2), SH3-containing GRB2-like protein 3-interacting protein 1 (SGIP1), and related proteins. AP complexes participate in the formation of intracellular coated transport vesicles and select cargo molecules for incorporation into the coated vesicles in the late secretory and endocytic pathways. Stonins have been characterized as clathrin-dependent AP-2 mu chain related factors and may act as cargo-specific sorting adaptors in endocytosis. Coat protein complex I (COPI)-coated vesicles function in the early secretory pathway. They mediate the retrograde transport from the Golgi to the ER, and intra-Golgi transport. MuD is distantly related to the C-terminal domain of mu2 subunit of AP-2. It is able to induce cell death by itself and plays an important role in cell death in various tissues. Syp1 represents a novel type of endocytic adaptor protein that participates in endocytosis, promotes vesicle tabulation, and contributes to cell polarity and stress responses. It shares the same domain architecture with its two ubiquitously expressed mammalian counterparts, FCHo1/2, which represent key initial proteins ultimately controlling cellular nutrient uptake, receptor regulation, and synaptic vesicle retrieval. 
They bind specifically to the plasma membrane and recruit the scaffold proteins eps15 and intersectin, which subsequently engage the adaptor complex AP2 and clathrin, leading to coated vesicle formation. Another mammalian neuronal-specific protein SGIP1 does have a C-terminal MHD and has been classified into this family as well. It is an endophilin-interacting protein that plays an obligatory role in the regulation of energy homeostasis. It is also involved in clathrin-mediated endocytosis by interacting with phospholipids and eps15." Q#1739 - CGI_10013189 superfamily 242876 1 120 4.72E-06 44.6525 cl02092 Clat_adaptor_s superfamily - - Clathrin adaptor complex small chain; Clathrin adaptor complex small chain. Q#1741 - CGI_10013191 superfamily 247724 17 121 8.58E-38 137.899 cl17170 Ras_like_GTPase superfamily N - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. 
Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#1742 - CGI_10013192 superfamily 199156 191 206 0.000179906 38.2041 cl15298 zf-CCHC superfamily - - "Zinc knuckle; The zinc knuckle is a zinc binding motif composed of the the following CX2CX4HX4C where X can be any amino acid. The motifs are mostly from retroviral gag proteins (nucleocapsid). Prototype structure is from HIV. Also contains members involved in eukaryotic gene regulation, such as C. elegans GLH-1. Structure is an 18-residue zinc finger." Q#1744 - CGI_10012664 superfamily 241609 222 289 2.03E-23 95.5227 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#1744 - CGI_10012664 superfamily 241571 49 168 4.72E-11 60.889 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#1744 - CGI_10012664 superfamily 245213 178 207 0.000113277 40.6978 cl09941 EGF_CA superfamily N - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. 
EGF_CA can be found in tandem repeat arrangements." Q#1744 - CGI_10012664 superfamily 247042 332 616 3.92E-09 58.111 cl15693 Sema superfamily C - "The Sema domain, a protein interacting module, of semaphorins and plexins; Both semaphorins and plexins have a Sema domain on their N-termini. Plexins function as receptors for the semaphorins. Evolutionarily, plexins may be the ancestor of semaphorins. Semaphorins are regulatory molecules in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems, and cancer. Semaphorins can be divided into 7 classes. Vertebrates have members in classes 3-7, whereas classes 1 and 2 are known only in invertebrates. Class 2 and 3 semaphorins are secreted; classes 1 and 4 through 6 are transmembrane proteins; and class 7 is membrane associated via glycosylphosphatidylinositol (GPI) linkage. Plexins are a large family of transmembrane proteins, which are divided into four types (A-D) according to sequence similarity. In vertebrates, type A plexins serve as co-receptors for neuropilins to mediate the signalling of class 3 semaphorins. Plexins serve as direct receptors for several other members of the semaphorin family: class 6 semaphorins signal through type A plexins and class 4 semaphorins through type B plexins. This family also includes the MET and RON receptor tyrosine kinases. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves to recognize and bind receptors." 
Q#1746 - CGI_10012666 superfamily 247044 603 696 6.00E-19 83.8201 cl15697 ADF_gelsolin superfamily - - Actin depolymerization factor/cofilin- and gelsolin-like domains; Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. Q#1746 - CGI_10012666 superfamily 247044 111 213 2.90E-21 90.7452 cl15697 ADF_gelsolin superfamily - - Actin depolymerization factor/cofilin- and gelsolin-like domains; Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. Q#1746 - CGI_10012666 superfamily 247044 711 807 2.28E-20 88.1233 cl15697 ADF_gelsolin superfamily - - Actin depolymerization factor/cofilin- and gelsolin-like domains; Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. Q#1746 - CGI_10012666 superfamily 247044 485 581 5.73E-17 78.0814 cl15697 ADF_gelsolin superfamily - - Actin depolymerization factor/cofilin- and gelsolin-like domains; Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. 
Q#1746 - CGI_10012666 superfamily 247044 227 317 6.71E-13 66.1068 cl15697 ADF_gelsolin superfamily - - Actin depolymerization factor/cofilin- and gelsolin-like domains; Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. Q#1746 - CGI_10012666 superfamily 247044 357 436 8.03E-08 51.096 cl15697 ADF_gelsolin superfamily - - Actin depolymerization factor/cofilin- and gelsolin-like domains; Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. Q#1746 - CGI_10012666 superfamily 207613 872 905 1.45E-06 46.5434 cl02491 VHP superfamily - - Villin headpiece domain; Villin headpiece domain. Q#1747 - CGI_10012667 superfamily 243034 561 659 5.54E-09 55.464 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of 
TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#1747 - CGI_10012667 superfamily 243034 775 893 2.90E-06 47.3748 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#1747 - CGI_10012667 superfamily 243034 989 1085 3.04E-06 46.9896 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, 
cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#1747 - CGI_10012667 superfamily 243034 9 102 0.000765354 39.6708 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many 
instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#1748 - CGI_10012668 superfamily 241571 195 288 3.15E-15 69.7486 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#1748 - CGI_10012668 superfamily 241571 50 168 1.81E-14 67.8226 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#1751 - CGI_10012671 superfamily 245815 33 513 0 953.717 cl11961 ALDH-SF superfamily - - "NAD(P)+-dependent aldehyde dehydrogenase superfamily; The aldehyde dehydrogenase superfamily (ALDH-SF) of NAD(P)+-dependent enzymes, in general, oxidize a wide range of endogenous and exogenous aliphatic and aromatic aldehydes to their corresponding carboxylic acids and play an important role in detoxification. Besides aldehyde detoxification, many ALDH isozymes possess multiple additional catalytic and non-catalytic functions such as participating in metabolic pathways, or as binding proteins, or osmoregulants, to mention a few. The enzyme has three domains, a NAD(P)+ cofactor-binding domain, a catalytic domain, and a bridging domain; and the active enzyme is generally either homodimeric or homotetrameric. The catalytic mechanism is proposed to involve cofactor binding, resulting in a conformational change and activation of an invariant catalytic cysteine nucleophile. 
The cysteine and aldehyde substrate form an oxyanion thiohemiacetal intermediate resulting in hydride transfer to the cofactor and formation of a thioacylenzyme intermediate. Hydrolysis of the thioacylenzyme and release of the carboxylic acid product occurs, and in most cases, the reduced cofactor dissociates from the enzyme. The evolutionary phylogenetic tree of ALDHs appears to have an initial bifurcation between what has been characterized as the classical aldehyde dehydrogenases, the ALDH family (ALDH) and extended family members or aldehyde dehydrogenase-like (ALDH-L) proteins. The ALDH proteins are represented by enzymes which share a number of highly conserved residues necessary for catalysis and cofactor binding and they include such proteins as retinal dehydrogenase, 10-formyltetrahydrofolate dehydrogenase, non-phosphorylating glyceraldehyde 3-phosphate dehydrogenase, delta(1)-pyrroline-5-carboxylate dehydrogenases, alpha-ketoglutaric semialdehyde dehydrogenase, alpha-aminoadipic semialdehyde dehydrogenase, coniferyl aldehyde dehydrogenase and succinate-semialdehyde dehydrogenase. Included in this larger group are all human, Arabidopsis, Tortula, fungal, protozoan, and Drosophila ALDHs identified in families ALDH1 through ALDH22 with the exception of families ALDH18, ALDH19, and ALDH20 which are present in the ALDH-like group. The ALDH-like group is represented by such proteins as gamma-glutamyl phosphate reductase, LuxC-like acyl-CoA reductase, and coenzyme A acylating aldehyde dehydrogenase. All of these proteins have a conserved cysteine that aligns with the catalytic cysteine of the ALDH group." Q#1759 - CGI_10012680 superfamily 247725 205 317 7.45E-60 203.657 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. 
All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#1759 - CGI_10012680 superfamily 215882 125 229 2.90E-25 104.285 cl09511 FERM_M superfamily - - FERM central domain; This domain is the central structural domain of the FERM domain. Q#1759 - CGI_10012680 superfamily 220215 40 117 1.31E-21 92.2882 cl09630 FERM_N superfamily - - FERM N-terminal domain; This domain is the N-terminal ubiquitin-like structural domain of the FERM domain. Q#1759 - CGI_10012680 superfamily 192138 328 374 1.23E-07 51.0803 cl07378 FA superfamily - - "FERM adjacent (FA); This region is found adjacent to Band 4.1 / FERM domains (pfam00373) in a subset of FERM containing protein. The region has been hypothesised to play a role in regulatory adaptation, based on similarity to other protein kinase substrates." Q#1761 - CGI_10002187 superfamily 245814 8 75 6.07E-07 43.2467 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#1761 - CGI_10002187 superfamily 245814 91 131 0.00119136 34.4033 cl11960 Ig superfamily N - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. 
The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#1762 - CGI_10008685 superfamily 243092 430 708 2.89E-60 206.417 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#1762 - CGI_10008685 superfamily 243074 347 391 4.77E-10 56.3609 cl02535 F-box-like superfamily - - F-box-like; This is an F-box-like family. 
Q#1763 - CGI_10008686 superfamily 218753 4 355 4.37E-97 298.043 cl05390 Tcp11 superfamily - - T-complex protein 11; This family consists of several eukaryotic T-complex protein 11 (Tcp11) related sequences. Tcp11 is only expressed in fertile adult mammalian testes and is thought to be important in sperm function and fertility. The family also contains the yeast Sok1 protein which is known to suppress cyclic AMP-dependent protein kinase mutants. Q#1764 - CGI_10008687 superfamily 243095 237 460 3.33E-35 131.684 cl02570 RhoGAP superfamily - - "RhoGAP: GTPase-activator protein (GAP) for Rho-like GTPases; GAPs towards Rho/Rac/Cdc42-like small GTPases. Small GTPases (G proteins) cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when bound to GDP. The Rho family of small G proteins, which includes Cdc42Hs, activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. G proteins generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. The RhoGAPs are one of the major classes of regulators of Rho G proteins." Q#1764 - CGI_10008687 superfamily 243038 40 132 8.25E-26 101.67 cl02442 DEP superfamily - - "DEP domain, named after Dishevelled, Egl-10, and Pleckstrin, where this domain was first discovered. The function of this domain is still not clear, but it is believed to be important for the membrane association of the signaling proteins in which it is present. New studies show that the DEP domain of Sst2, a yeast RGS protein is necessary and sufficient for receptor interaction." 
Q#1766 - CGI_10008689 superfamily 247907 91 224 1.73E-08 52.8057 cl17353 LamG superfamily - - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." Q#1766 - CGI_10008689 superfamily 245213 412 449 0.000114028 39.9274 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#1766 - CGI_10008689 superfamily 245213 450 485 0.00138811 36.8458 cl09941 EGF_CA superfamily C - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#1766 - CGI_10008689 superfamily 248289 290 344 6.67E-13 63.9823 cl17735 VWC superfamily - - von Willebrand factor type C domain; The high cutoff was used to prevent overlap with pfam00094. Q#1766 - CGI_10008689 superfamily 248289 348 408 7.68E-09 52.4263 cl17735 VWC superfamily - - von Willebrand factor type C domain; The high cutoff was used to prevent overlap with pfam00094. Q#1770 - CGI_10008693 superfamily 243263 120 413 5.20E-55 201.097 cl02990 ASC superfamily C - Amiloride-sensitive sodium channel; Amiloride-sensitive sodium channel. Q#1772 - CGI_10002498 superfamily 248097 104 227 6.00E-26 98.4914 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#1772 - CGI_10002498 superfamily 248097 7 69 0.00294265 35.3186 cl17543 C1q superfamily N - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#1775 - CGI_10002646 superfamily 243119 167 213 0.00015376 38.1938 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#1778 - CGI_10002013 superfamily 242274 43 115 0.000116727 38.9326 cl01053 SGNH_hydrolase superfamily C - "SGNH_hydrolase, or GDSL_hydrolase, is a diverse family of lipases and esterases. 
The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the typical Ser-His-Asp(Glu) triad from other serine hydrolases, but may lack the carboxlic acid." Q#1779 - CGI_10001004 superfamily 241677 73 113 0.00150725 36.6247 cl00197 cyclophilin superfamily N - "cyclophilin: cyclophilin-type peptidylprolyl cis- trans isomerases. This family contains eukaryotic, bacterial and archeal proteins which exhibit a peptidylprolyl cis- trans isomerases activity (PPIase, Rotamase) and in addition bind the immunosuppressive drug cyclosporin (CsA). Immunosuppression in vertebrates is believed to be the result of the cyclophilin A-cyclosporin protein drug complex binding to and inhibiting the protein-phosphatase calcineurin. PPIase is an enzyme which accelerates protein folding by catalyzing the cis-trans isomerization of the peptide bonds preceding proline residues. Cyclophilins are a diverse family in terms of function and have been implicated in protein folding processes which depend on catalytic /chaperone-like activities. This group contains human cyclophilin 40, a co-chaperone of the hsp90 chaperone system; human cyclophilin A, a chaperone in the HIV-1 infectious process and; human cyclophilin H, a component of the U4/U6 snRNP, whose isomerization or chaperoning activities may play a role in RNA splicing." Q#1779 - CGI_10001004 superfamily 241677 30 51 0.00839458 34.3135 cl00197 cyclophilin superfamily C - "cyclophilin: cyclophilin-type peptidylprolyl cis- trans isomerases. This family contains eukaryotic, bacterial and archeal proteins which exhibit a peptidylprolyl cis- trans isomerases activity (PPIase, Rotamase) and in addition bind the immunosuppressive drug cyclosporin (CsA). Immunosuppression in vertebrates is believed to be the result of the cyclophilin A-cyclosporin protein drug complex binding to and inhibiting the protein-phosphatase calcineurin. 
PPIase is an enzyme which accelerates protein folding by catalyzing the cis-trans isomerization of the peptide bonds preceding proline residues. Cyclophilins are a diverse family in terms of function and have been implicated in protein folding processes which depend on catalytic /chaperone-like activities. This group contains human cyclophilin 40, a co-chaperone of the hsp90 chaperone system; human cyclophilin A, a chaperone in the HIV-1 infectious process and; human cyclophilin H, a component of the U4/U6 snRNP, whose isomerization or chaperoning activities may play a role in RNA splicing." Q#1780 - CGI_10001005 superfamily 192529 32 87 7.77E-15 68.1229 cl10984 DUF2414 superfamily - - Protein of unknown function (DUF2414); This is a family of proteins conserved from fungi to mammals. One mouse member is referred to as ELG protein but this is not a homologue of human ELG protein. The function is not known. Q#1781 - CGI_10003222 superfamily 247856 441 495 4.25E-07 47.5425 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#1781 - CGI_10003222 superfamily 246925 173 326 1.34E-18 85.4849 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. 
This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#1782 - CGI_10003223 superfamily 246925 357 620 3.60E-17 82.0181 cl15309 LRR_RI superfamily - - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#1782 - CGI_10003223 superfamily 247856 634 689 0.00199656 37.1421 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#1783 - CGI_10003224 superfamily 241578 385 554 1.64E-19 87.2362 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#1783 - CGI_10003224 superfamily 247057 196 254 0.00272314 36.9309 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#1784 - CGI_10003225 superfamily 245596 236 421 2.94E-49 174.803 cl11394 Glyco_tranf_GTA_type superfamily - - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. 
Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#1786 - CGI_10003227 superfamily 242129 24 113 0.007049 34.7359 cl00832 DUF359 superfamily N - Protein of unknown function (DUF359); This family of archaebacterial proteins are about 170 amino acids in length. They have no known function. The most conserved portion of the protein contains the sequence GEEDL that may be important for its function. Q#1787 - CGI_10021082 superfamily 243352 40 261 8.45E-82 249.433 cl03224 Porin3 superfamily N - "Eukaryotic porin family that forms channels in the mitochondrial outer membrane; The porin family 3 contains two sub-families that play vital roles in the mitochondrial outer membrane, a translocase for unfolded pre-proteins (Tom40) and the voltage-dependent anion channel (VDAC) that regulates the flux of mostly anionic metabolites through the outer mitochondrial membrane." Q#1788 - CGI_10021084 superfamily 243035 140 260 2.60E-18 78.1153 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. 
Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. 
A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#1789 - CGI_10021085 superfamily 243035 95 205 4.01E-26 98.4609 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. 
C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#1789 - CGI_10021085 superfamily 245226 29 99 0.00848567 34.5034 cl10012 DnaQ_like_exo superfamily NC - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). 
The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#1790 - CGI_10021087 superfamily 247792 138 182 5.57E-07 47.8256 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#1790 - CGI_10021087 superfamily 243109 696 834 5.62E-08 51.9117 cl02614 SPRY superfamily - - "SPRY domain; SPRY domains, first identified in the SP1A kinase of Dictyostelium and rabbit Ryanodine receptor (hence the name), are homologous to B30.2. 
SPRY domains have been identified in at least 11 protein families, covering a wide range of functions, including regulation of cytokine signaling (SOCS), RNA metabolism (DDX1 and hnRNP), immunity to retroviruses (TRIM5alpha), intracellular calcium release (ryanodine receptors or RyR) and regulatory and developmental processes (HERC1 and Ash2L). B30.2 also contains residues in the N-terminus that form a distinct PRY domain structure; i.e. B30.2 domain consists of PRY and SPRY subdomains. B30.2 domains comprise the C-terminus of three protein families: BTNs (receptor glycoproteins of immunoglobulin superfamily); several TRIM proteins (composed of RING/B-box/coiled-coil or RBCC core); Stonutoxin (secreted poisonous protein of the stonefish Synanceia horrida). While SPRY domains are evolutionarily ancient, B30.2 domains are a more recent adaptation where the SPRY/PRY combination is a possible component of immune defense. Mutations found in the SPRY-containing proteins have shown to cause Mediterranean fever and Opitz syndrome." Q#1790 - CGI_10021087 superfamily 216033 527 574 2.07E-05 43.4764 cl16959 Filamin superfamily N - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#1790 - CGI_10021087 superfamily 241563 276 319 0.00336616 36.5463 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#1791 - CGI_10021088 superfamily 243161 3 37 0.000383531 37.759 cl02739 THAP superfamily C - "THAP domain; The THAP domain is a putative DNA-binding domain (DBD) and probably also binds a zinc ion. It features the conserved C2CH architecture (consensus sequence: Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His). 
Other universal features include the location of the domain at the N-termini of proteins, its size of about 90 residues, a C-terminal AVPTIF box and several other conserved residues. Orthologues of the human THAP domain have been identified in other vertebrates and probably worms and flies, but not in other eukaryotes or any prokaryotes." Q#1795 - CGI_10021094 superfamily 243056 112 320 1.93E-61 205.286 cl02495 RabGAP-TBC superfamily - - "Rab-GTPase-TBC domain; Identification of a TBC domain in GYP6_YEAST and GYP7_YEAST, which are GTPase activator proteins of yeast Ypt6 and Ypt7, implies that these domains are GTPase activator proteins of Rab-like small GTPases." Q#1796 - CGI_10021095 superfamily 241563 60 95 0.000770051 35.3907 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#1797 - CGI_10021096 superfamily 243092 27 130 0.000782312 38.8552 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#1798 - CGI_10021097 superfamily 241563 37 73 0.000151298 39.77 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#1798 - CGI_10021097 superfamily 110440 502 528 0.00285985 35.8465 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. 
The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#1799 - CGI_10021098 superfamily 219295 20 133 4.05E-11 56.4332 cl06225 DUF1358 superfamily - - Protein of unknown function (DUF1358); This family consists of several hypothetical eukaryotic proteins of around 125 residues in length. The function of this family is unknown. Q#1800 - CGI_10021099 superfamily 216554 107 233 2.04E-37 134.914 cl15977 zf-DHHC superfamily C - DHHC palmitoyltransferase; This family includes the well known DHHC zinc binding domain as well as three of the four conserved transmembrane regions found in this family of palmitoyltransferase enzymes. Q#1801 - CGI_10021100 superfamily 241563 59 95 0.000105352 40.1552 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#1801 - CGI_10021100 superfamily 110440 484 510 0.00564133 35.0761 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." 
Q#1803 - CGI_10021103 superfamily 241563 59 95 5.32E-05 41.3108 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#1803 - CGI_10021103 superfamily 110440 486 512 0.00073894 37.7725 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#1803 - CGI_10021103 superfamily 110440 527 554 0.00866407 34.3057 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#1808 - CGI_10021108 superfamily 217293 1 156 5.51E-32 119.66 cl03788 Neur_chan_LBD superfamily N - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. 
Q#1808 - CGI_10021108 superfamily 202474 163 270 7.84E-15 71.1457 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#1809 - CGI_10021109 superfamily 217293 374 573 2.32E-38 142.387 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#1809 - CGI_10021109 superfamily 202474 581 689 6.81E-15 73.8421 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#1810 - CGI_10021110 superfamily 245312 153 256 0.00340049 37.2191 cl10482 KefB superfamily C - "Kef-type K+ transport systems, membrane components [Inorganic ion transport and metabolism]" Q#1811 - CGI_10021111 superfamily 243072 91 215 1.45E-36 134.819 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#1811 - CGI_10021111 superfamily 243072 157 285 1.78E-27 109.01 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#1811 - CGI_10021111 superfamily 243072 31 149 9.97E-26 104.003 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#1814 - CGI_10021114 superfamily 246680 19 74 1.03E-05 43.3444 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#1815 - CGI_10021115 superfamily 246680 11 77 4.45E-08 50.278 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. 
DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#1816 - CGI_10021116 superfamily 246680 43 100 0.000963233 37.1812 cl14633 DD_superfamily superfamily N - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#1817 - CGI_10021117 superfamily 247743 376 512 3.99E-28 110.699 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. 
The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#1817 - CGI_10021117 superfamily 247743 10 86 1.07E-09 56.7707 cl17189 AAA superfamily N - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#1820 - CGI_10021120 superfamily 247743 356 393 1.58E-09 55.6151 cl17189 AAA superfamily C - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#1821 - CGI_10021121 superfamily 218536 29 99 0.0045042 33.4724 cl05038 AAR2 superfamily C - AAR2 protein; This family consists of several eukaryotic AAR2-like proteins. The yeast protein AAR2 is involved in splicing pre-mRNA of the a1 cistron and other genes that are important for cell growth. 
Q#1823 - CGI_10021123 superfamily 215859 440 643 9.54E-49 170.09 cl18347 Peptidase_S9 superfamily - - Prolyl oligopeptidase family; Prolyl oligopeptidase family. Q#1825 - CGI_10021125 superfamily 208690 705 1025 2.74E-157 471.471 cl07396 CRM1_C superfamily - - "CRM1 C terminal; CRM1 (also known as Exportin1) mediates the nuclear export of proteins bearing a leucine-rich nuclear export signal (NES). CRM1 forms a complex with the NES containing protein and the small GTPase Ran. This region forms an alpha helical structure formed by six helical hairpin motifs that are structurally similar to the HEAT repeat, but share little sequence similarity to the HEAT repeat." Q#1825 - CGI_10021125 superfamily 219817 118 263 1.88E-45 161.632 cl07129 Xpo1 superfamily - - "Exportin 1-like protein; The sequences featured in this family are similar to a region close to the N-terminus of yeast exportin 1 (Xpo1, Crm1). This region is found just C-terminal to an importin-beta N-terminal domain (pfam03810) in many members of this family. Exportin 1 is a nuclear export receptor that interacts with leucine-rich nuclear export signal (NES) sequences, and Ran-GTP, and is involved in translocation of proteins out of the nucleus." Q#1825 - CGI_10021125 superfamily 243689 43 107 2.39E-11 61.1053 cl04271 IBN_N superfamily - - Importin-beta N-terminal domain; Importin-beta N-terminal domain. Q#1826 - CGI_10021126 superfamily 220818 299 661 1.71E-36 140.685 cl11215 UPF0564 superfamily - - "Uncharacterized protein family UPF0564; This family of proteins has no known function. However, one of the members is annotated as an EF-hand family protein." 
Q#1827 - CGI_10021127 superfamily 247868 2 414 9.92E-63 209.04 cl17314 PRK07608 superfamily - - ubiquinone biosynthesis hydroxylase family protein; Provisional Q#1828 - CGI_10021128 superfamily 202558 1 256 7.69E-136 409.427 cl03915 XRN_N superfamily - - "XRN 5'-3' exonuclease N-terminus; This family aligns residues towards the N-terminus of several proteins with multiple functions. The members of this family all appear to possess 5'-3' exonuclease activity EC:3.1.11.-. Thus, the aligned region may be necessary for 5' to 3' exonuclease function. The family also contains several Xrn1 and Xrn2 proteins. The 5'-3' exoribonucleases Xrn1p and Xrn2p/Rat1p function in the degradation and processing of several classes of RNA in Saccharomyces cerevisiae. Xrn1p is the main enzyme catalyzing cytoplasmic mRNA degradation in multiple decay pathways, whereas Xrn2p/Rat1p functions in the processing of rRNAs and small nucleolar RNAs (snoRNAs) in the nucleus." Q#1829 - CGI_10021129 superfamily 241599 157 214 1.16E-22 89.61 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#1832 - CGI_10021132 superfamily 241563 61 99 0.000429596 39.3848 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#1833 - CGI_10021133 superfamily 215821 28 118 3.00E-09 50.7019 cl18346 FKBP_C superfamily - - FKBP-type peptidyl-prolyl cis-trans isomerase; FKBP-type peptidyl-prolyl cis-trans isomerase. 
Q#1835 - CGI_10003618 superfamily 241874 22 397 6.28E-40 153.12 cl00456 SLC5-6-like_sbd superfamily C - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#1835 - CGI_10003618 superfamily 241874 488 579 2.06E-08 56.0501 cl00456 SLC5-6-like_sbd superfamily N - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. 
NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#1836 - CGI_10003619 superfamily 241874 138 405 6.57E-32 129.238 cl00456 SLC5-6-like_sbd superfamily NC - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." 
Q#1836 - CGI_10003619 superfamily 241874 496 587 3.45E-09 58.3613 cl00456 SLC5-6-like_sbd superfamily N - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#1837 - CGI_10003620 superfamily 203591 93 223 4.29E-28 111.309 cl06275 DUF1399 superfamily - - Protein of unknown function (DUF1399); This family represents a conserved region approximately 150 residues long within a number of hypothetical plant proteins of unknown function. Q#1838 - CGI_10003621 superfamily 203990 53 115 6.72E-06 41.1383 cl07882 Cmc1 superfamily - - Cytochrome c oxidase biogenesis protein Cmc1 like; Cmc1 is a metallo-chaperone like protein which is known to localise to the inner mitochondrial membrane in Saccharomyces cerevisiae. It is essential for full expression of cytochrome c oxidase and respiration. Cmc1 contains two Cx9C motifs and is able to bind copper(I). Cmc1 is thought to play a role in mitochondrial copper trafficking and transfer to cytochrome c oxidase. 
Q#1839 - CGI_10003622 superfamily 247786 15 278 1.04E-116 340.01 cl17232 F420_oxidored superfamily - - NADP oxidoreductase coenzyme F420-dependent; NADP oxidoreductase coenzyme F420-dependent. Q#1840 - CGI_10002904 superfamily 248264 114 277 4.40E-48 159.325 cl17710 DDE_4 superfamily - - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#1840 - CGI_10002904 superfamily 222263 39 126 3.29E-08 49.6237 cl16321 DDE_4_2 superfamily - - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#1841 - CGI_10002905 superfamily 247725 29 156 1.35E-60 190.622 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. 
This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#1842 - CGI_10002906 superfamily 248097 9 119 2.06E-25 93.869 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#1843 - CGI_10002907 superfamily 248097 57 181 2.44E-28 103.499 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#1845 - CGI_10004107 superfamily 244906 87 152 6.78E-26 103.759 cl08315 CAP_GLY superfamily - - "CAP-Gly domain; Cytoskeleton-associated proteins (CAPs) are involved in the organisation of microtubules and transportation of vesicles and organelles along the cytoskeletal network. A conserved motif, CAP-Gly, has been identified in a number of CAPs, including CLIP-170 and dynactins. The crystal structure of Caenorhabditis elegans F53F4.3 protein CAP-Gly domain was recently solved. The domain contains three beta-strands. The most conserved sequence, GKNDG, is located in two consecutive sharp turns on the surface, forming the entrance to a groove." Q#1846 - CGI_10004108 superfamily 241596 225 277 6.08E-13 65.6983 cl00081 HLH superfamily - - "Helix-loop-helix domain, found in specific DNA- binding proteins that act as transcription factors; 60-100 amino acids long. 
A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE (5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#1847 - CGI_10004109 superfamily 243157 5 88 1.53E-20 85.0775 cl02720 PB1 superfamily - - "The PB1 domain is a modular domain mediating specific protein-protein interactions which play a role in many critical cell processes, such as osteoclastogenesis, angiogenesis, early cardiovascular development, and cell polarity. A canonical PB1-PB1 interaction, which involves heterodimerization of two PB1 domain, is required for the formation of macromolecular signaling complexes ensuring specificity and fidelity during cellular signaling. The interaction between two PB1 domain depends on the type of PB1. There are three types of PB1 domains: type I which contains an OPCA motif, acidic aminoacid cluster, type II which contains a basic cluster, and type I/II which contains both an OPCA motif and a basic cluster. Interactions of PB1 domains with other protein domains have been described as a noncanonical PB1-interactions. 
The PB1 domain module is conserved in amoebas, fungi, animals, and plants." Q#1847 - CGI_10004109 superfamily 241643 347 383 0.00758305 33.9631 cl00153 UBA superfamily - - "Ubiquitin Associated domain. The UBA domain is a commonly occurring sequence motif in some members of the ubiquitination pathway, UV excision repair proteins, and certain protein kinases. Although its specific role is so far unknown, it has been suggested that UBA domains are involved in conferring protein target specificity. The domain, a compact three helix bundle, has a conserved GFP-loop and the proline is thought to be critical for binding. The UBA domain is distinct from the conserved three helical domain seen in the N-terminus of EF-TS and eukaryotic NAC proteins." Q#1848 - CGI_10004110 superfamily 241760 110 151 4.05E-23 91.5525 cl00295 ZZ superfamily - - "Zinc finger, ZZ type. Zinc finger present in dystrophin, CBP/p300 and many other proteins. The ZZ motif coordinates one or two zinc ions and most likely participates in ligand binding or molecular scaffolding. Many proteins containing ZZ motifs have other zinc-binding motifs as well, and the majority serve as scaffolds in pathways involving acetyltransferase, protein kinase, or ubiquitin-related activity. ZZ proteins can be grouped into the following functional classes: chromatin modifying, cytoskeletal scaffolding, ubiquitin binding or conjugating, and membrane receptor or ion-channel modifying proteins." Q#1848 - CGI_10004110 superfamily 243157 3 89 6.85E-35 125.138 cl02720 PB1 superfamily - - "The PB1 domain is a modular domain mediating specific protein-protein interactions which play a role in many critical cell processes, such as osteoclastogenesis, angiogenesis, early cardiovascular development, and cell polarity. A canonical PB1-PB1 interaction, which involves heterodimerization of two PB1 domain, is required for the formation of macromolecular signaling complexes ensuring specificity and fidelity during cellular signaling. 
The interaction between two PB1 domain depends on the type of PB1. There are three types of PB1 domains: type I which contains an OPCA motif, acidic aminoacid cluster, type II which contains a basic cluster, and type I/II which contains both an OPCA motif and a basic cluster. Interactions of PB1 domains with other protein domains have been described as a noncanonical PB1-interactions. The PB1 domain module is conserved in amoebas, fungi, animals, and plants." Q#1850 - CGI_10004649 superfamily 241609 367 433 3.02E-25 98.219 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#1850 - CGI_10004649 superfamily 241568 323 358 0.0032152 35.5164 cl00043 CCP superfamily N - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#1851 - CGI_10004650 superfamily 241609 134 197 2.33E-21 84.3519 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. 
They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#1851 - CGI_10004650 superfamily 241568 85 120 0.00249783 34.0393 cl00043 CCP superfamily N - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#1852 - CGI_10004651 superfamily 248097 105 180 6.03E-09 53.0378 cl17543 C1q superfamily C - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#1853 - CGI_10004652 superfamily 247743 2759 2922 4.30E-10 61.3931 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." 
Q#1853 - CGI_10004652 superfamily 247743 3111 3201 2.84E-05 46.3703 cl17189 AAA superfamily C - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#1853 - CGI_10004652 superfamily 193257 3828 4055 2.84E-44 164.388 cl15086 AAA_9 superfamily - - "ATP-binding dynein motor region D5; The 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This particular family is the D5 ATP-binding region of the motor, but has lost its P-loop." Q#1853 - CGI_10004652 superfamily 222302 1186 1303 4.79E-27 110.604 cl16342 DUF4151 superfamily - - Domain of unknown function (DUF4151); This domain is found on dynein heavy chain proteins. The exact function is not known but it is conserved from plants to Sch. pombe to human. Q#1853 - CGI_10004652 superfamily 193253 3570 3811 8.55E-21 96.6445 cl15084 MT superfamily N - "Microtubule-binding stalk of dynein motor; the 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. 
The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This family is the region between D4 and D5 and is the two predicted alpha-helical coiled coil segments that form the stalk supporting the ATP-sensitive microtubule binding component." Q#1853 - CGI_10004652 superfamily 193253 3389 3549 2.10E-14 76.9993 cl15084 MT superfamily C - "Microtubule-binding stalk of dynein motor; the 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This family is the region between D4 and D5 and is the two predicted alpha-helical coiled coil segments that form the stalk supporting the ATP-sensitive microtubule binding component." Q#1853 - CGI_10004652 superfamily 247743 2411 2551 1.10E-12 68.8612 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." 
Q#1854 - CGI_10004653 superfamily 248097 31 135 4.00E-13 61.5122 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#1856 - CGI_10017595 superfamily 241874 7 421 3.56E-51 183.14 cl00456 SLC5-6-like_sbd superfamily - - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#1857 - CGI_10017596 superfamily 241874 1 66 5.41E-17 76.7509 cl00456 SLC5-6-like_sbd superfamily NC - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. 
SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#1857 - CGI_10017596 superfamily 241874 64 99 0.00817001 34.379 cl00456 SLC5-6-like_sbd superfamily N - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." 
Q#1859 - CGI_10017598 superfamily 247744 75 276 1.07E-27 107.057 cl17190 NK superfamily - - "Nucleoside/nucleotide kinase (NK) is a protein superfamily consisting of multiple families of enzymes that share structural similarity and are functionally related to the catalysis of the reversible phosphate group transfer from nucleoside triphosphates to nucleosides/nucleotides, nucleoside monophosphates, or sugars. Members of this family play a wide variety of essential roles in nucleotide metabolism, the biosynthesis of coenzymes and aromatic compounds, as well as the metabolism of sugar and sulfate." Q#1860 - CGI_10017599 superfamily 243072 110 235 6.22E-30 112.477 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#1860 - CGI_10017599 superfamily 243072 176 302 3.06E-25 99.3802 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#1860 - CGI_10017599 superfamily 243072 15 136 1.33E-14 70.105 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). 
ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#1860 - CGI_10017599 superfamily 243073 382 417 2.95E-09 52.8577 cl02533 SOCS superfamily - - "SOCS (suppressors of cytokine signaling) box. The SOCS box is found in the C-terminal region of CIS/SOCS family proteins (in combination with a SH2 domain), ASBs (ankyrin repeat-containing proteins with a SOCS box), SSBs (SPRY domain-containing proteins with a SOCS box), and WSBs (WD40 repeat-containing proteins with a SOCS box), as well as, other miscellaneous proteins. The function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions." Q#1861 - CGI_10017600 superfamily 242889 193 293 2.77E-20 83.8065 cl02111 PCI superfamily - - "PCI domain; This domain has also been called the PINT motif (Proteasome, Int-6, Nip-1 and TRIP-15)." Q#1863 - CGI_10017602 superfamily 245213 102 132 0.00555043 32.6086 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#1865 - CGI_10017604 superfamily 245870 14 140 1.07E-13 69.2698 cl12097 DUF1772 superfamily - - Domain of unknown function (DUF1772); This domain is of unknown function. Q#1865 - CGI_10017604 superfamily 152105 156 277 3.41E-10 58.3131 cl13169 WBP-1 superfamily - - "WW domain-binding protein 1; This family of proteins represents WBP-1, a ligand of the WW domain of Yes-associated protein. This protein has a proline-rich domain. WBP-1 does not bind to the SH3 domain." Q#1866 - CGI_10017605 superfamily 245870 38 149 5.42E-09 50.3951 cl12097 DUF1772 superfamily - - Domain of unknown function (DUF1772); This domain is of unknown function. Q#1867 - CGI_10017606 superfamily 243948 15 213 1.16E-15 73.0928 cl04955 LanC_like superfamily C - "LanC-like proteins. LanC is the cyclase enzyme of the lanthionine synthetase. Lanthionine is a lantibiotic, a unique class of peptide antibiotics. They are ribosomally synthesized as a precursor peptide and then post-translationally modified to contain thioether cross-links called lanthionines (Lans) or methyllanthionines (MeLans), in addition to 2,3-didehydroalanine (Dha) and (Z)-2,3-didehydrobutyrine (Dhb). These unusual amino acids are introduced by the dehydration of serine and threonine residues, followed by thioether formation via addition of cysteine thiols, catalysed by LanB and LanC or LanM. LanC, the cyclase component, is a zinc metalloprotein, whose bound metal has been proposed to activate the thiol substrate for nucleophilic addition. A related domain is also present in LanM and other pro- and eukaryotic proteins of unknown function." Q#1869 - CGI_10017608 superfamily 149077 55 117 2.38E-06 43.7673 cl06719 TMC superfamily - - "TMC domain; These sequences are similar to a region conserved amongst various protein products of the transmembrane channel-like (TMC) gene family, such as Transmembrane channel-like protein 3 and EVIN2 - this region is termed the TMC domain. 
Mutations in these genes are implicated in a number of human conditions, such as deafness and epidermodysplasia verruciformis. TMC proteins are thought to have important cellular roles, and may be modifiers of ion channels or transporters." Q#1871 - CGI_10017610 superfamily 245201 9 290 0 578.457 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#1872 - CGI_10017611 superfamily 243072 515 637 4.79E-33 124.418 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#1872 - CGI_10017611 superfamily 243072 183 301 2.31E-23 97.069 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#1872 - CGI_10017611 superfamily 243072 46 171 9.77E-20 86.6686 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#1872 - CGI_10017611 superfamily 247856 341 396 0.000237004 39.8385 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#1873 - CGI_10017612 superfamily 241832 1 121 1.04E-34 119.542 cl00388 Thioredoxin_like superfamily N - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). 
Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#1875 - CGI_10017614 superfamily 247057 5 69 4.06E-36 127.414 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#1876 - CGI_10017615 superfamily 241622 18 96 2.46E-18 75.2958 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. 
The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#1877 - CGI_10017616 superfamily 205477 310 368 2.02E-26 99.9101 cl16217 Telomere_Sde2_2 superfamily - - "Telomere stability C-terminal; This short C-terminal domain is found in higher eukaryotes further downstream from the Sde2 family, pfam13019. It is found in all Sde2-related proteins except those from fission yeast, fly, and mosquito. Its exact function in telomere formation and maintenance has not yet been established." Q#1878 - CGI_10017617 superfamily 241832 33 178 1.11E-62 193.548 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." 
Q#1879 - CGI_10017618 superfamily 241672 355 775 9.57E-123 378.282 cl00192 ribokinase_pfkB_like superfamily - - "ribokinase/pfkB superfamily: Kinases that accept a wide variety of substrates, including carbohydrates and aromatic small molecules, all are phosphorylated at a hydroxyl group. The superfamily includes ribokinase, fructokinase, ketohexokinase, 2-dehydro-3-deoxygluconokinase, 1-phosphofructokinase, the minor 6-phosphofructokinase (PfkB), inosine-guanosine kinase, and adenosine kinase. Even though there is a high degree of structural conservation within this superfamily, their multimerization level varies widely, monomeric (e.g. adenosine kinase), dimeric (e.g. ribokinase), and trimeric (e.g THZ kinase)." Q#1879 - CGI_10017618 superfamily 243146 236 283 2.14E-07 48.8247 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#1879 - CGI_10017618 superfamily 243146 183 233 1.92E-06 46.1283 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#1879 - CGI_10017618 superfamily 243146 49 92 0.000227907 39.9522 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. 
" Q#1880 - CGI_10017619 superfamily 243092 231 349 8.01E-14 70.8268 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#1882 - CGI_10017621 superfamily 241593 37 185 2.08E-09 55.7306 cl00075 HATPase_c superfamily - - "Histidine kinase-like ATPases; This family includes several ATP-binding proteins for example: histidine kinase, DNA gyrase B, topoisomerases, heat shock protein HSP90, phytochrome-like ATPases and DNA mismatch repair proteins" Q#1884 - CGI_10017623 superfamily 241975 92 137 2.16E-12 60.245 cl00605 RNase_P_Rpp14 superfamily N - "Rpp14/Pop5 family; tRNA processing enzyme ribonuclease P (RNase P) consists of an RNA molecule associated with at least eight protein subunits, hPop1, Rpp14, Rpp20, Rpp25, Rpp29, Rpp30, Rpp38, and Rpp40. This protein is known as Pop5 in eukaryotes." 
Q#1885 - CGI_10017624 superfamily 247868 295 348 2.27E-07 51.3634 cl17314 PRK07608 superfamily N - ubiquinone biosynthesis hydroxylase family protein; Provisional Q#1885 - CGI_10017624 superfamily 247868 106 333 0.000152572 42.4508 cl17314 PRK07608 superfamily NC - ubiquinone biosynthesis hydroxylase family protein; Provisional Q#1888 - CGI_10017627 superfamily 248458 320 439 2.15E-08 54.6273 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." 
Q#1888 - CGI_10017627 superfamily 248458 131 207 4.32E-08 53.4717 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." 
Q#1890 - CGI_10005309 superfamily 241563 71 104 1.82E-05 42.4664 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#1890 - CGI_10005309 superfamily 241563 21 59 0.000813494 37.844 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#1892 - CGI_10005311 superfamily 243068 171 302 9.55E-09 54.7002 cl02523 Zona_pellucida superfamily N - Zona pellucida-like domain; Zona pellucida-like domain. Q#1894 - CGI_10005313 superfamily 241563 3 37 4.79E-05 41.3108 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#1895 - CGI_10005314 superfamily 243400 114 495 0 640.173 cl03362 AICARFT_IMPCHas superfamily - - "AICARFT/IMPCHase bienzyme; This is a family of bifunctional enzymes catalyzing the last two steps in de novo purine biosynthesis. The bifunctional enzyme is found in both prokaryotes and eukaryotes. The second last step is catalyzed by 5-aminoimidazole-4-carboxamide ribonucleotide formyltransferase EC:2.1.2.3 (AICARFT), this enzyme catalyzes the formylation of AICAR with 10-formyl-tetrahydrofolate to yield FAICAR and tetrahydrofolate. This is catalyzed by a pair of C-terminal deaminase fold domains in the protein, where the active site is formed by the dimeric interface of two monomeric units. 
The last step is catalyzed by the N-terminal IMP (Inosine monophosphate) cyclohydrolase domain EC:3.5.4.10 (IMPCHase), cyclizing FAICAR (5-formylaminoimidazole-4-carboxamide ribonucleotide) to IMP." Q#1895 - CGI_10005314 superfamily 241720 15 104 1.60E-41 147.746 cl00245 MGS-like superfamily N - "MGS-like domain. This domain composes the whole protein of methylglyoxal synthetase, which catalyzes the enolization of dihydroxyacetone phosphate (DHAP) to produce methylglyoxal. The family also includes the C-terminal domain in carbamoyl phosphate synthetase (CPS) where it catalyzes the last phosphorylation of a carboxyphosphate intermediate to form the product carbamoyl phosphate and may also play a regulatory role. This family also includes inosine monophosphate cyclohydrolase. The known structures in this family show a common phosphate binding site." Q#1898 - CGI_10017289 superfamily 150884 34 132 7.30E-42 142.659 cl10958 Med19 superfamily C - Mediator of RNA pol II transcription subunit 19; Med19 represents a family of conserved proteins which are members of the multi-protein co-activator Mediator complex. Mediator is required for activation of RNA polymerase II transcription by DNA binding transactivators. Q#1899 - CGI_10017291 superfamily 247068 149 243 1.65E-20 84.6725 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#1899 - CGI_10017291 superfamily 247068 56 137 4.00E-14 67.3385 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#1899 - CGI_10017291 superfamily 247068 255 315 8.06E-06 43.4562 cl15786 CA_like superfamily C - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." 
Q#1900 - CGI_10017292 superfamily 242064 423 515 2.76E-57 188.122 cl00748 Ribosomal_L32_L32e superfamily - - "Ribosomal_L32_L32e: L32 is a protein from the large subunit that contains a surface-exposed globular domain and a finger-like projection that extends into the RNA core to stabilize the tertiary structure. L32 does not appear to play a role in forming the A (aminacyl), P (peptidyl) or E (exit) sites of the ribosome, but does interact with 23S rRNA, which has a "kink-turn" secondary structure motif. L32 is overexpressed in human prostate cancer and has been identified as a stably expressed housekeeping gene in macrophages of human chronic obstructive pulmonary disease (COPD) patients. In Schizosaccharomyces pombe, L32 has also been suggested to play a role as a transcriptional regulator in the nucleus. Found in archaea and eukaryotes, this protein is known as L32 in eukaryotes and L32e in archaea." Q#1901 - CGI_10017293 superfamily 246680 259 330 5.92E-05 40.3966 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#1901 - CGI_10017293 superfamily 245874 35 129 4.85E-12 61.2882 cl12111 TNFR superfamily - - "Tumor necrosis factor receptor (TNFR) domain; superfamily of TNF-like receptor domains. 
When bound to TNF-like cytokines, TNFRs trigger multiple signal transduction pathways, they are involved in inflammation response, apoptosis, autoimmunity and organogenesis. TNFRs domains are elongated with generally three tandem repeats of cysteine-rich domains (CRDs). They fit in the grooves between protomers within the ligand trimer. Some TNFRs, such as NGFR and HveA, bind ligands with no structural similarity to TNF and do not bind ligand trimers." Q#1902 - CGI_10017294 superfamily 247637 25 40 0.00890379 31.6977 cl16912 MDR superfamily NC - "Medium chain reductase/dehydrogenase (MDR)/zinc-dependent alcohol dehydrogenase-like family; The medium chain reductase/dehydrogenases (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH. MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P) binding-Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES. The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH) , quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones. ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. 
The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines. Other MDR members have only a catalytic zinc, and some contain no coordinated zinc." Q#1904 - CGI_10017296 superfamily 245814 27 107 1.07E-16 69.7358 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#1905 - CGI_10017297 superfamily 245814 237 300 1.59E-06 45.2163 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#1905 - CGI_10017297 superfamily 247807 125 217 0.00135482 36.5042 cl17253 AAA_17 superfamily - - AAA domain; AAA domain. Q#1907 - CGI_10017299 superfamily 245814 251 331 1.34E-15 70.5062 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. 
The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#1907 - CGI_10017299 superfamily 245814 51 118 2.46E-15 69.3579 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#1907 - CGI_10017299 superfamily 245814 148 214 1.77E-07 48.2979 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#1909 - CGI_10017302 superfamily 243134 354 473 7.50E-36 131.232 cl02663 Fasciclin superfamily - - "Fasciclin domain; This extracellular domain is found repeated four times in grasshopper fasciclin I as well as in proteins from mammals, sea urchins, plants, yeast and bacteria." Q#1909 - CGI_10017302 superfamily 243134 29 189 6.10E-31 117.75 cl02663 Fasciclin superfamily - - "Fasciclin domain; This extracellular domain is found repeated four times in grasshopper fasciclin I as well as in proteins from mammals, sea urchins, plants, yeast and bacteria." Q#1909 - CGI_10017302 superfamily 243134 510 610 5.07E-26 103.883 cl02663 Fasciclin superfamily N - "Fasciclin domain; This extracellular domain is found repeated four times in grasshopper fasciclin I as well as in proteins from mammals, sea urchins, plants, yeast and bacteria." Q#1909 - CGI_10017302 superfamily 243134 226 321 1.26E-23 96.9495 cl02663 Fasciclin superfamily N - "Fasciclin domain; This extracellular domain is found repeated four times in grasshopper fasciclin I as well as in proteins from mammals, sea urchins, plants, yeast and bacteria." Q#1910 - CGI_10017303 superfamily 243134 29 148 2.55E-37 130.077 cl02663 Fasciclin superfamily - - "Fasciclin domain; This extracellular domain is found repeated four times in grasshopper fasciclin I as well as in proteins from mammals, sea urchins, plants, yeast and bacteria." Q#1910 - CGI_10017303 superfamily 243134 185 283 2.77E-24 95.0235 cl02663 Fasciclin superfamily N - "Fasciclin domain; This extracellular domain is found repeated four times in grasshopper fasciclin I as well as in proteins from mammals, sea urchins, plants, yeast and bacteria." Q#1911 - CGI_10017304 superfamily 243134 29 148 1.15E-37 131.232 cl02663 Fasciclin superfamily - - "Fasciclin domain; This extracellular domain is found repeated four times in grasshopper fasciclin I as well as in proteins from mammals, sea urchins, plants, yeast and bacteria." 
Q#1911 - CGI_10017304 superfamily 243134 185 280 4.58E-25 97.3347 cl02663 Fasciclin superfamily N - "Fasciclin domain; This extracellular domain is found repeated four times in grasshopper fasciclin I as well as in proteins from mammals, sea urchins, plants, yeast and bacteria." Q#1912 - CGI_10017305 superfamily 246908 288 359 2.11E-10 56.6951 cl15255 SH2 superfamily - - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." Q#1914 - CGI_10017307 superfamily 202388 114 221 1.68E-48 156.688 cl03708 Sod_Fe_C superfamily - - "Iron/manganese superoxide dismutases, C-terminal domain; superoxide dismutases (SODs) catalyze the conversion of superoxide radicals to hydrogen peroxide and molecular oxygen. Three evolutionarily distinct families of SODs are known, of which the Mn/Fe-binding family is one. In humans, there is a cytoplasmic Cu/Zn SOD, and a mitochondrial Mn/Fe SOD. C-terminal domain is a mixed alpha/beta fold." Q#1914 - CGI_10017307 superfamily 200985 28 109 8.72E-34 117.773 cl02809 Sod_Fe_N superfamily - - "Iron/manganese superoxide dismutases, alpha-hairpin domain; superoxide dismutases (SODs) catalyze the conversion of superoxide radicals to hydrogen peroxide and molecular oxygen. Three evolutionarily distinct families of SODs are known, of which the Mn/Fe-binding family is one. 
In humans, there is a cytoplasmic Cu/Zn SOD, and a mitochondrial Mn/Fe SOD. N-terminal domain is a long alpha antiparallel hairpin. A small fragment of YTRE_LEPBI matches well - sequencing error?" Q#1916 - CGI_10017309 superfamily 247725 217 313 1.09E-69 221.737 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting the protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and other proteins." Q#1916 - CGI_10017309 superfamily 215882 110 223 1.53E-32 121.234 cl09511 FERM_M superfamily - - FERM central domain; This domain is the central structural domain of the FERM domain. Q#1916 - CGI_10017309 superfamily 220215 26 103 2.96E-22 91.5178 cl09630 FERM_N superfamily - - FERM N-terminal domain; This domain is the N-terminal ubiquitin-like structural domain of the FERM domain. Q#1917 - CGI_10017310 superfamily 220668 37 434 4.41E-134 395.62 cl10953 Tmp39 superfamily - - Putative transmembrane protein; This is a family of conserved proteins found from worms to humans. They are putative transmembrane proteins but the function is unknown. Q#1918 - CGI_10017311 superfamily 219739 38 297 4.70E-69 221.882 cl06991 PIH1 superfamily - - pre-RNA processing PIH1/Nop17; This domain is involved in pre-rRNA processing. It has been shown to be required either for nucleolar retention or correct assembly of the box C/D snoRNP in Saccharomyces cerevisiae. The C-terminal region of this family has similarity to the CS domain pfam04969. 
Q#1920 - CGI_10017313 superfamily 247684 15 35 7.98E-06 41.5084 cl17037 NBD_sugar-kinase_HSP70_actin superfamily N - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#1921 - CGI_10017314 superfamily 243040 156 274 2.86E-23 92.1846 cl02447 CRD_FZ superfamily - - "CRD_domain cysteine-rich domain, also known as Fz (frizzled) domain; CRD_FZ is an essential component of a number of cell surface receptors, which are involved in multiple signal transduction pathways, particularly in modulating the activity of the Wnt proteins, which play a fundamental role in the early development of metazoans. 
CRD is also found in secreted frizzled related proteins (SFRPs), which lack the transmembrane segment found in the frizzled protein. The CRD domain is also present in the alpha-1 chain of mouse type XVIII collagen, in carboxypeptidase Z, several receptor tyrosine kinases, and the mosaic transmembrane serine protease corin. The CRD domain is well conserved in metazoans - 10 frizzled proteins have been identified in mammals, 4 in Drosophila and 3 in Caenorhabditis elegans. CRD domains have also been identified in multiple tandem copies in a Dictyostelium discoideum protein. Very little is known about the mechanism by which CRD domains interact with their ligands. The domain contains 10 conserved cysteines." Q#1921 - CGI_10017314 superfamily 243040 24 130 3.37E-23 91.7994 cl02447 CRD_FZ superfamily - - "CRD_domain cysteine-rich domain, also known as Fz (frizzled) domain; CRD_FZ is an essential component of a number of cell surface receptors, which are involved in multiple signal transduction pathways, particularly in modulating the activity of the Wnt proteins, which play a fundamental role in the early development of metazoans. CRD is also found in secreted frizzled related proteins (SFRPs), which lack the transmembrane segment found in the frizzled protein. The CRD domain is also present in the alpha-1 chain of mouse type XVIII collagen, in carboxypeptidase Z, several receptor tyrosine kinases, and the mosaic transmembrane serine protease corin. The CRD domain is well conserved in metazoans - 10 frizzled proteins have been identified in mammals, 4 in Drosophila and 3 in Caenorhabditis elegans. CRD domains have also been identified in multiple tandem copies in a Dictyostelium discoideum protein. Very little is known about the mechanism by which CRD domains interact with their ligands. The domain contains 10 conserved cysteines." 
Q#1923 - CGI_10017316 superfamily 243109 8 72 9.67E-07 47.8824 cl02614 SPRY superfamily C - "SPRY domain; SPRY domains, first identified in the SP1A kinase of Dictyostelium and rabbit Ryanodine receptor (hence the name), are homologous to B30.2. SPRY domains have been identified in at least 11 protein families, covering a wide range of functions, including regulation of cytokine signaling (SOCS), RNA metabolism (DDX1 and hnRNP), immunity to retroviruses (TRIM5alpha), intracellular calcium release (ryanodine receptors or RyR) and regulatory and developmental processes (HERC1 and Ash2L). B30.2 also contains residues in the N-terminus that form a distinct PRY domain structure; i.e. B30.2 domain consists of PRY and SPRY subdomains. B30.2 domains comprise the C-terminus of three protein families: BTNs (receptor glycoproteins of immunoglobulin superfamily); several TRIM proteins (composed of RING/B-box/coiled-coil or RBCC core); Stonutoxin (secreted poisonous protein of the stonefish Synanceia horrida). While SPRY domains are evolutionarily ancient, B30.2 domains are a more recent adaptation where the SPRY/PRY combination is a possible component of immune defense. Mutations found in the SPRY-containing proteins have been shown to cause Mediterranean fever and Opitz syndrome." Q#1924 - CGI_10017317 superfamily 241583 211 361 4.35E-29 113.238 cl00064 ZnMc superfamily C - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#1924 - CGI_10017317 superfamily 216572 57 166 1.24E-09 55.3587 cl03265 Pep_M12B_propep superfamily - - Reprolysin family propeptide; This region is the propeptide for members of peptidase family M12B. The propeptide contains a sequence motif similar to the "cysteine switch" of the matrixins. 
This motif is found at the C terminus of the alignment but is not well aligned. Q#1925 - CGI_10017318 superfamily 243072 902 1023 8.73E-23 97.4542 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#1925 - CGI_10017318 superfamily 243072 2054 2160 4.09E-22 95.5282 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#1925 - CGI_10017318 superfamily 243072 2243 2358 8.07E-22 94.7578 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#1925 - CGI_10017318 superfamily 243072 1764 1878 8.71E-22 94.3726 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. 
The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#1925 - CGI_10017318 superfamily 243072 1572 1687 1.30E-21 93.9874 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#1925 - CGI_10017318 superfamily 243072 1859 1974 2.28E-21 93.217 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#1925 - CGI_10017318 superfamily 243072 2113 2230 3.28E-21 92.8318 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#1925 - CGI_10017318 superfamily 243072 1413 1527 5.02E-21 92.4466 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#1925 - CGI_10017318 superfamily 243072 1347 1463 2.18E-20 90.5206 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#1925 - CGI_10017318 superfamily 243072 1920 2038 4.44E-20 89.7502 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#1925 - CGI_10017318 superfamily 243072 1102 1217 1.50E-19 88.2094 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#1925 - CGI_10017318 superfamily 243072 1636 1782 7.60E-16 77.0386 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#1925 - CGI_10017318 superfamily 243072 1200 1336 2.51E-13 69.7198 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#1925 - CGI_10017318 superfamily 243072 973 1114 1.79E-12 67.0234 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#1925 - CGI_10017318 superfamily 248357 227 305 0.000521648 42.4065 cl17803 FMN_bind_2 superfamily C - "Putative FMN-binding domain; In Bacillus subtilis, family member PAI 2/ORF-2 was found to be essential for growth. The SUPERFAMILY database finds that this domain is related to FMN-binding domains, suggesting this protein is also FMN-binding." Q#1929 - CGI_10001890 superfamily 243091 2 49 3.51E-09 49.798 cl02566 SET superfamily N - "SET domain; SET domains are protein lysine methyltransferase enzymes. SET domains appear to be protein-protein interaction domains. It has been demonstrated that SET domains mediate interactions with a family of proteins that display similarity with dual-specificity phosphatases (dsPTPases). A subset of SET domains have been called PR domains. These domains are divergent in sequence from other SET domains, but also appear to mediate protein-protein interaction. The SET domain consists of two regions known as SET-N and SET-C. SET-C forms an unusual and conserved knot-like structure of probably functional importance. Additionally to SET-N and SET-C, an insert region (SET-I) and flanking regions of high structural variability form part of the overall structure." Q#1930 - CGI_10001891 superfamily 243091 45 93 3.98E-09 49.0276 cl02566 SET superfamily C - "SET domain; SET domains are protein lysine methyltransferase enzymes. SET domains appear to be protein-protein interaction domains. It has been demonstrated that SET domains mediate interactions with a family of proteins that display similarity with dual-specificity phosphatases (dsPTPases). A subset of SET domains have been called PR domains. These domains are divergent in sequence from other SET domains, but also appear to mediate protein-protein interaction. The SET domain consists of two regions known as SET-N and SET-C. SET-C forms an unusual and conserved knot-like structure of probably functional importance. 
Additionally to SET-N and SET-C, an insert region (SET-I) and flanking regions of high structural variability form part of the overall structure." Q#1932 - CGI_10002146 superfamily 241583 46 154 1.34E-22 89.6051 cl00064 ZnMc superfamily N - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#1935 - CGI_10004162 superfamily 216897 77 156 8.59E-16 71.5585 cl03463 Gal_Lectin superfamily - - Galactose binding lectin domain; Galactose binding lectin domain. Q#1935 - CGI_10004162 superfamily 216897 20 59 2.31E-09 53.4541 cl03463 Gal_Lectin superfamily N - Galactose binding lectin domain; Galactose binding lectin domain. Q#1936 - CGI_10004163 superfamily 220965 173 302 9.45E-26 98.6824 cl12631 DUF2870 superfamily - - Protein of unknown function (DUF2870); This is a eukaryotic family of proteins with unknown function. Q#1937 - CGI_10004164 superfamily 241659 156 233 1.13E-14 66.3895 cl00175 alpha-crystallin-Hsps_p23-like superfamily - - "alpha-crystallin domain (ACD) found in alpha-crystallin-type small heat shock proteins, and a similar domain found in p23 (a cochaperone for Hsp90) and in other p23-like proteins.; The alpha-crystallin-Hsps_p23-like superfamily includes the alpha-crystallin domain (ACD) of alpha-crystallin-type small heat shock proteins (sHsps) and a similar domain found in p23-like proteins. sHsps are small stress induced proteins with monomeric masses between 12-43 kDa, whose common feature is this ACD. sHsps are generally active as large oligomers consisting of multiple subunits, and are believed to be ATP-independent chaperones that prevent aggregation and are important in refolding in combination with other Hsps. p23 is a cochaperone of the Hsp90 chaperoning pathway. 
It binds Hsp90 and participates in the folding of a number of Hsp90 clients including the progesterone receptor. p23 also has a passive chaperoning activity. p23 in addition may act as the cytosolic prostaglandin E2 synthase. Included in this superfamily is the p23-like C-terminal CHORD-SGT1 (CS) domain of suppressor of G2 allele of Skp1 (Sgt1) and the p23-like domains of human butyrate-induced transcript 1 (hB-ind1), NUD (nuclear distribution) C, Melusin, and NAD(P)H cytochrome b5 (NCB5) oxidoreductase (OR)." Q#1937 - CGI_10004164 superfamily 241659 51 125 4.29E-12 59.4559 cl00175 alpha-crystallin-Hsps_p23-like superfamily - - "alpha-crystallin domain (ACD) found in alpha-crystallin-type small heat shock proteins, and a similar domain found in p23 (a cochaperone for Hsp90) and in other p23-like proteins.; The alpha-crystallin-Hsps_p23-like superfamily includes the alpha-crystallin domain (ACD) of alpha-crystallin-type small heat shock proteins (sHsps) and a similar domain found in p23-like proteins. sHsps are small stress induced proteins with monomeric masses between 12-43 kDa, whose common feature is this ACD. sHsps are generally active as large oligomers consisting of multiple subunits, and are believed to be ATP-independent chaperones that prevent aggregation and are important in refolding in combination with other Hsps. p23 is a cochaperone of the Hsp90 chaperoning pathway. It binds Hsp90 and participates in the folding of a number of Hsp90 clients including the progesterone receptor. p23 also has a passive chaperoning activity. p23 in addition may act as the cytosolic prostaglandin E2 synthase. Included in this superfamily is the p23-like C-terminal CHORD-SGT1 (CS) domain of suppressor of G2 allele of Skp1 (Sgt1) and the p23-like domains of human butyrate-induced transcript 1 (hB-ind1), NUD (nuclear distribution) C, Melusin, and NAD(P)H cytochrome b5 (NCB5) oxidoreductase (OR)." 
Q#1938 - CGI_10004165 superfamily 241659 57 122 2.69E-16 71.0119 cl00175 alpha-crystallin-Hsps_p23-like superfamily - - "alpha-crystallin domain (ACD) found in alpha-crystallin-type small heat shock proteins, and a similar domain found in p23 (a cochaperone for Hsp90) and in other p23-like proteins.; The alpha-crystallin-Hsps_p23-like superfamily includes the alpha-crystallin domain (ACD) of alpha-crystallin-type small heat shock proteins (sHsps) and a similar domain found in p23-like proteins. sHsps are small stress induced proteins with monomeric masses between 12-43 kDa, whose common feature is this ACD. sHsps are generally active as large oligomers consisting of multiple subunits, and are believed to be ATP-independent chaperones that prevent aggregation and are important in refolding in combination with other Hsps. p23 is a cochaperone of the Hsp90 chaperoning pathway. It binds Hsp90 and participates in the folding of a number of Hsp90 clients including the progesterone receptor. p23 also has a passive chaperoning activity. p23 in addition may act as the cytosolic prostaglandin E2 synthase. Included in this superfamily is the p23-like C-terminal CHORD-SGT1 (CS) domain of suppressor of G2 allele of Skp1 (Sgt1) and the p23-like domains of human butyrate-induced transcript 1 (hB-ind1), NUD (nuclear distribution) C, Melusin, and NAD(P)H cytochrome b5 (NCB5) oxidoreductase (OR)." Q#1938 - CGI_10004165 superfamily 241659 153 231 1.55E-14 66.0043 cl00175 alpha-crystallin-Hsps_p23-like superfamily - - "alpha-crystallin domain (ACD) found in alpha-crystallin-type small heat shock proteins, and a similar domain found in p23 (a cochaperone for Hsp90) and in other p23-like proteins.; The alpha-crystallin-Hsps_p23-like superfamily includes the alpha-crystallin domain (ACD) of alpha-crystallin-type small heat shock proteins (sHsps) and a similar domain found in p23-like proteins. 
sHsps are small stress induced proteins with monomeric masses between 12-43 kDa, whose common feature is this ACD. sHsps are generally active as large oligomers consisting of multiple subunits, and are believed to be ATP-independent chaperones that prevent aggregation and are important in refolding in combination with other Hsps. p23 is a cochaperone of the Hsp90 chaperoning pathway. It binds Hsp90 and participates in the folding of a number of Hsp90 clients including the progesterone receptor. p23 also has a passive chaperoning activity. p23 in addition may act as the cytosolic prostaglandin E2 synthase. Included in this superfamily is the p23-like C-terminal CHORD-SGT1 (CS) domain of suppressor of G2 allele of Skp1 (Sgt1) and the p23-like domains of human butyrate-induced transcript 1 (hB-ind1), NUD (nuclear distribution) C, Melusin, and NAD(P)H cytochrome b5 (NCB5) oxidoreductase (OR)." Q#1939 - CGI_10004166 superfamily 241659 44 121 2.31E-14 63.6931 cl00175 alpha-crystallin-Hsps_p23-like superfamily - - "alpha-crystallin domain (ACD) found in alpha-crystallin-type small heat shock proteins, and a similar domain found in p23 (a cochaperone for Hsp90) and in other p23-like proteins.; The alpha-crystallin-Hsps_p23-like superfamily includes the alpha-crystallin domain (ACD) of alpha-crystallin-type small heat shock proteins (sHsps) and a similar domain found in p23-like proteins. sHsps are small stress induced proteins with monomeric masses between 12-43 kDa, whose common feature is this ACD. sHsps are generally active as large oligomers consisting of multiple subunits, and are believed to be ATP-independent chaperones that prevent aggregation and are important in refolding in combination with other Hsps. p23 is a cochaperone of the Hsp90 chaperoning pathway. It binds Hsp90 and participates in the folding of a number of Hsp90 clients including the progesterone receptor. p23 also has a passive chaperoning activity. 
p23 in addition may act as the cytosolic prostaglandin E2 synthase. Included in this superfamily is the p23-like C-terminal CHORD-SGT1 (CS) domain of suppressor of G2 allele of Skp1 (Sgt1) and the p23-like domains of human butyrate-induced transcript 1 (hB-ind1), NUD (nuclear distribution) C, Melusin, and NAD(P)H cytochrome b5 (NCB5) oxidoreductase (OR)." Q#1940 - CGI_10004167 superfamily 241643 377 411 0.000454299 39.3647 cl00153 UBA superfamily - - "Ubiquitin Associated domain. The UBA domain is a commonly occurring sequence motif in some members of the ubiquitination pathway, UV excision repair proteins, and certain protein kinases. Although its specific role is so far unknown, it has been suggested that UBA domains are involved in conferring protein target specificity. The domain, a compact three helix bundle, has a conserved GFP-loop and the proline is thought to be critical for binding. The UBA domain is distinct from the conserved three helical domain seen in the N-terminus of EF-TS and eukaryotic NAC proteins." Q#1940 - CGI_10004167 superfamily 219122 901 1172 3.01E-56 197.517 cl05933 DUF1162 superfamily - - Protein of unknown function (DUF1162); This family represents a conserved region within several hypothetical eukaryotic proteins. Family members might be vacuolar protein sorting related-proteins. Q#1941 - CGI_10004168 superfamily 219797 155 738 2.83E-171 513.016 cl09596 ACC_central superfamily - - "Acetyl-CoA carboxylase, central region; The region featured in this family is found in various eukaryotic acetyl-CoA carboxylases, N-terminal to the catalytic domain (pfam01039). This enzyme (EC:6.4.1.2) is involved in the synthesis of long-chain fatty acids, as it catalyzes the rate-limiting step in this process." 
Q#1941 - CGI_10004168 superfamily 219797 19 85 1.22E-09 60.407 cl09596 ACC_central superfamily N - "Acetyl-CoA carboxylase, central region; The region featured in this family is found in various eukaryotic acetyl-CoA carboxylases, N-terminal to the catalytic domain (pfam01039). This enzyme (EC:6.4.1.2) is involved in the synthesis of long-chain fatty acids, as it catalyzes the rate-limiting step in this process." Q#1942 - CGI_10000340 superfamily 248458 1 130 4.23E-06 43.8417 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. 
Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#1943 - CGI_10004523 superfamily 241574 496 651 2.92E-52 181.245 cl00053 PTPc superfamily N - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#1943 - CGI_10004523 superfamily 241584 139 263 0.00066345 38.6315 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#1943 - CGI_10004523 superfamily 241584 38 126 0.00281452 36.7055 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. 
Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#1944 - CGI_10004524 superfamily 192535 46 250 0.00342204 37.9606 cl18179 7TM_GPCR_Srsx superfamily C - Serpentine type 7TM GPCR chemoreceptor Srsx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srsx is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#1945 - CGI_10004525 superfamily 238191 1 419 7.01E-95 297.707 cl18907 Esterase_lipase superfamily - - "Esterases and lipases (includes fungal lipases, cholinesterases, etc.) These enzymes act on carboxylic esters (EC: 3.1.1.-). The catalytic apparatus involves three residues (catalytic triad): a serine, a glutamate or aspartate and a histidine. These catalytic residues are responsible for the nucleophilic attack on the carbonyl carbon atom of the ester bond. In contrast with other alpha/beta hydrolase fold family members, p-nitrobenzyl esterase and acetylcholine esterase have a Glu instead of Asp at the active site carboxylate." Q#1946 - CGI_10004526 superfamily 238191 1 392 8.16E-93 290.388 cl18907 Esterase_lipase superfamily - - "Esterases and lipases (includes fungal lipases, cholinesterases, etc.) These enzymes act on carboxylic esters (EC: 3.1.1.-). The catalytic apparatus involves three residues (catalytic triad): a serine, a glutamate or aspartate and a histidine. These catalytic residues are responsible for the nucleophilic attack on the carbonyl carbon atom of the ester bond. 
In contrast with other alpha/beta hydrolase fold family members, p-nitrobenzyl esterase and acetylcholine esterase have a Glu instead of Asp at the active site carboxylate." Q#1947 - CGI_10004527 superfamily 241743 66 208 1.90E-23 92.6362 cl00274 ML superfamily - - "The ML (MD-2-related lipid-recognition) domain is present in MD-1, MD-2, GM2 activator protein, Niemann-Pick type C2 (Npc2) protein, phosphatidylinositol/phosphatidylglycerol transfer protein (PG/PI-TP), mite allergen Der p 2 and several proteins of unknown function in plants, animals and fungi. These single-domain proteins form two anti-parallel beta-pleated sheets stabilized by three disulfide bonds and with an accessible central hydrophobic cavity, and are predicted to mediate diverse biological functions through interaction with specific lipids." Q#1948 - CGI_10003169 superfamily 202587 288 325 7.67E-14 64.9166 cl03974 EB1 superfamily - - "EB1-like C-terminal motif; This motif is found at the C-terminus of proteins that are related to the EB1 protein. The EB1 proteins contain an N-terminal CH domain pfam00307. The human EB1 protein was originally discovered as a protein interacting with the C-terminus of the APC protein. This interaction is often disrupted in colon cancer, due to deletions affecting the APC C-terminus. Several EB1 orthologues are also included in this family. The interaction between EB1 and APC has been shown to have a potent synergistic effect on microtubule polymerisation. Neither of EB1 or APC alone has this effect. It is thought that EB1 targets APC to the + ends of microtubules, where APC promotes microtubule polymerisation. This process is regulated by APC phosphorylation by Cdc2, which disrupts APC-EB1 binding. Human EB1 protein can functionally substitute for the yeast EB1 homologue Mal3. In addition, Mal3 can substitute for human EB1 in promoting microtubule polymerisation with APC." 
Q#1948 - CGI_10003169 superfamily 241559 77 191 6.20E-11 58.4448 cl00030 CH superfamily - - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." Q#1949 - CGI_10003170 superfamily 241884 1 192 8.78E-115 328.005 cl00467 Ntn_hydrolase superfamily - - "The Ntn hydrolases (N-terminal nucleophile) are a diverse superfamily of enzymes that are activated autocatalytically via an N-terminally located nucleophilic amino acid. N-terminal nucleophile (NTN-) hydrolase superfamily, which contains a four-layered alpha, beta, beta, alpha core structure. This family of hydrolases includes penicillin acylase, the 20S proteasome alpha and beta subunits, and glutamate synthase. The mechanism of activation of these proteins is conserved, although they differ in their substrate specificities. All known members catalyze the hydrolysis of amide bonds in either proteins or small molecules, and each one of them is synthesized as a preprotein. For each, an autocatalytic endoproteolytic process generates a new N-terminal residue. This mature N-terminal residue is central to catalysis and acts as both a polarizing base and a nucleophile during the reaction. The N-terminal amino group acts as the proton acceptor and activates either the nucleophilic hydroxyl in a Ser or Thr residue or the nucleophilic thiol in a Cys residue. The position of the N-terminal nucleophile in the active site and the mechanism of catalysis are conserved in this family, despite considerable variation in the protein sequences." 
Q#1950 - CGI_10003171 superfamily 241755 3 337 3.29E-165 468.715 cl00288 EPT_RTPC-like superfamily - - "This domain family includes the Enolpyruvate transferase (EPT) family and the RNA 3' phosphate cyclase family (RTPC). These 2 families differ in that EPT is formed by 3 repeats of an alpha-beta structural domain while RTPC has 3 similar repeats with a 4th slightly different domain inserted between the 2nd and 3rd repeat. They evidently share the same active site location, although the catalytic residues differ." Q#1951 - CGI_10003172 superfamily 243092 44 363 2.30E-39 143.244 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#1952 - CGI_10003173 superfamily 248305 127 459 4.62E-95 307.346 cl17751 Glyco_transf_22 superfamily C - Alg9-like mannosyltransferase family; Members of this family are mannosyltransferase enzymes. 
At least some members are localised in endoplasmic reticulum and involved in GPI anchor biosynthesis. Q#1952 - CGI_10003173 superfamily 248305 562 697 6.32E-28 117.442 cl17751 Glyco_transf_22 superfamily N - Alg9-like mannosyltransferase family; Members of this family are mannosyltransferase enzymes. At least some members are localised in endoplasmic reticulum and involved in GPI anchor biosynthesis. Q#1953 - CGI_10003174 superfamily 247740 555 857 1.11E-178 521.074 cl17186 TIM_phosphate_binding superfamily - - "TIM barrel proteins share a structurally conserved phosphate binding motif and in general share an eight beta/alpha closed barrel structure. Specific for this family is the conserved phosphate binding site at the edges of strands 7 and 8. The phosphate comes either from the substrate, as in the case of inosine monophosphate dehydrogenase (IMPDH), or from ribulose-5-phosphate 3-epimerase (RPE) or from cofactors, like FMN." Q#1953 - CGI_10003174 superfamily 248054 216 261 2.93E-06 45.926 cl17500 NAD_binding_8 superfamily C - NAD(P)-binding Rossmann-like domain; NAD(P)-binding Rossmann-like domain. Q#1954 - CGI_10004989 superfamily 241583 209 448 2.29E-79 256.532 cl00064 ZnMc superfamily - - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#1954 - CGI_10004989 superfamily 216572 40 137 2.11E-07 49.9659 cl03265 Pep_M12B_propep superfamily - - Reprolysin family propeptide; This region is the propeptide for members of peptidase family M12B. The propeptide contains a sequence motif similar to the "cysteine switch" of the matrixins. This motif is found at the C terminus of the alignment but is not well aligned. 
Q#1955 - CGI_10004990 superfamily 245226 52 173 3.39E-13 63.0884 cl10012 DnaQ_like_exo superfamily C - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#1956 - CGI_10004991 superfamily 247736 59 120 1.36E-06 42.1688 cl17182 NAT_SF superfamily N - "N-Acyltransferase superfamily: Various enzymes that characteristically catalyze the transfer of an acyl group to a substrate; NAT (N-Acyltransferase) is a large superfamily of enzymes that mostly catalyze the transfer of an acyl group to a substrate and are implicated in a variety of functions, ranging from bacterial antibiotic resistance to circadian rhythms in mammals. 
Members include GCN5-related N-Acetyltransferases (GNAT) such as Aminoglycoside N-acetyltransferases, Histone N-acetyltransferase (HAT) enzymes, and Serotonin N-acetyltransferase, which catalyze the transfer of an acetyl group to a substrate. The kinetic mechanism of most GNATs involves the ordered formation of a ternary complex: the reaction begins with Acetyl Coenzyme A (AcCoA) binding, followed by binding of substrate, then direct transfer of the acetyl group from AcCoA to the substrate, followed by product and subsequent CoA release. Other family members include Arginine/ornithine N-succinyltransferase, Myristoyl-CoA: protein N-myristoyltransferase, and Acyl-homoserinelactone synthase which have a similar catalytic mechanism but differ in types of acyl groups transferred. Leucyl/phenylalanyl-tRNA-protein transferase and FemXAB nonribosomal peptidyltransferases which catalyze similar peptidyltransferase reactions are also included." Q#1957 - CGI_10004992 superfamily 241547 42 277 7.67E-97 288.014 cl00012 alpha_CA superfamily - - "Carbonic anhydrase alpha (vertebrate-like) group. Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionary distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidine residues and a fourth conserved histidine plays a potential role in proton transfer." 
Q#1960 - CGI_10002604 superfamily 241563 66 100 1.99E-05 42.2744 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#1960 - CGI_10002604 superfamily 110440 478 502 0.00960455 34.3057 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#1962 - CGI_10004667 superfamily 245882 3 345 2.28E-154 444.041 cl12119 Alpha_L_fucos superfamily - - Alpha-L-fucosidase; Alpha-L-fucosidase. Q#1965 - CGI_10004670 superfamily 241568 427 485 4.74E-06 44.376 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#1967 - CGI_10022335 superfamily 241600 7 175 1.51E-72 220.19 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. 
Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#1968 - CGI_10022336 superfamily 241600 12 128 2.49E-48 164.336 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#1968 - CGI_10022336 superfamily 241600 150 191 1.98E-17 78.8215 cl00085 FReD superfamily NC - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. 
C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#1969 - CGI_10022338 superfamily 241600 4 189 1.18E-79 238.679 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#1971 - CGI_10022340 superfamily 241600 5 217 8.99E-89 263.332 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." 
Q#1973 - CGI_10022342 superfamily 247856 66 85 0.00444169 31.7493 cl17302 EFh superfamily N - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#1974 - CGI_10022343 superfamily 241567 105 283 4.26E-32 125.02 cl00042 CASc superfamily C - "Caspase, interleukin-1 beta converting enzyme (ICE) homologues; Cysteine-dependent aspartate-directed proteases that mediate programmed cell death (apoptosis). Caspases are synthesized as inactive zymogens and activated by proteolysis of the peptide backbone adjacent to an aspartate. The resulting two subunits associate to form an (alpha)2(beta)2-tetramer which is the active enzyme. Activation of caspases can be mediated by other caspase homologs." Q#1975 - CGI_10022344 superfamily 241567 741 993 3.60E-33 129.643 cl00042 CASc superfamily - - "Caspase, interleukin-1 beta converting enzyme (ICE) homologues; Cysteine-dependent aspartate-directed proteases that mediate programmed cell death (apoptosis). Caspases are synthesized as inactive zymogens and activated by proteolysis of the peptide backbone adjacent to an aspartate. The resulting two subunits associate to form an (alpha)2(beta)2-tetramer which is the active enzyme. Activation of caspases can be mediated by other caspase homologs." Q#1976 - CGI_10022345 superfamily 241563 14 46 0.00491065 34.9556 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#1977 - CGI_10022346 superfamily 243050 493 544 6.13E-31 114.358 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#1977 - CGI_10022346 superfamily 243050 375 426 1.06E-30 113.638 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." 
Q#1977 - CGI_10022346 superfamily 243050 434 486 5.71E-30 111.658 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#1977 - CGI_10022346 superfamily 243050 316 368 4.94E-25 98.168 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." 
Q#1977 - CGI_10022346 superfamily 217602 108 187 2.92E-06 46.8385 cl04141 Paxillin superfamily N - Paxillin family; Paxillin family. Q#1978 - CGI_10022347 superfamily 245843 11 108 4.41E-56 171.653 cl12033 Spt4 superfamily - - "Transcription elongation factor Spt4; Spt4 is a transcription elongation factor. Three transcription-elongation factors Spt4, Spt5, and Spt6, are conserved among eukaryotes and are essential for transcription via the modulation of chromatin structure. It is known that Spt4, Spt5, and Spt6 are general transcription-elongation factors, controlling transcription both positively and negatively in important regulatory and developmental roles. Spt4 functions entirely in the context of the Spt4-Spt5 heterodimer and it has been found only as a complex to Spt5 in Yeast and Human. Spt4 is a small protein that has zinc finger at the N-terminus. Spt5 is a large protein that has several interesting structural features of an acidic N-terminus, a single NGN domain, five or six KOW domains, and a set of simple C-terminal repeats. Spt4 binds to Spt5 NGN domain. Unlike Spt5, Spt4 is not essential for viability in yeast, however Spt4 is critical for normal function of the Spt4-Spt5 complex. Spt4 homolog is not found in bacteria." Q#1979 - CGI_10022348 superfamily 247755 1362 1582 2.86E-111 352.181 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." 
Q#1979 - CGI_10022348 superfamily 247755 755 960 2.80E-103 329.045 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#1979 - CGI_10022348 superfamily 216049 322 585 5.12E-21 95.0453 cl18356 ABC_membrane superfamily - - ABC transporter transmembrane region; This family represents a unit of six transmembrane helices. Many members of the ABC transporter family (pfam00005) have two such regions. Q#1979 - CGI_10022348 superfamily 216049 1144 1268 1.60E-13 71.5482 cl18356 ABC_membrane superfamily C - ABC transporter transmembrane region; This family represents a unit of six transmembrane helices. Many members of the ABC transporter family (pfam00005) have two such regions. Q#1986 - CGI_10022355 superfamily 245814 906 971 1.05E-05 45.5579 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#1986 - CGI_10022355 superfamily 245814 811 878 0.00615667 37.0835 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#1986 - CGI_10022355 superfamily 245814 1090 1172 4.79E-12 64.4488 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#1986 - CGI_10022355 superfamily 245814 1001 1063 5.29E-11 60.8835 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. 
A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#1986 - CGI_10022355 superfamily 204945 1557 1592 5.96E-06 45.8412 cl13890 Smoothelin superfamily N - "Smoothelin cytoskeleton protein; This domain family is found in eukaryotes, and is approximately 50 amino acids in length. The family is found in association with pfam00307. Smoothelin is a cytoskeletal protein specifically expressed in differentiated smooth muscle cells and has been shown to co-localize with smooth muscle alpha actin." Q#1987 - CGI_10022356 superfamily 241559 2 102 2.01E-24 90.4479 cl00030 CH superfamily - - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." Q#1993 - CGI_10022362 superfamily 243039 233 368 1.50E-64 205.042 cl02446 MATH superfamily - - "MATH (meprin and TRAF-C homology) domain; an independent folding unit with an eight-stranded beta-sandwich structure found in meprins, TRAFs and other proteins. Meprins comprise a class of extracellular metalloproteases which are anchored to the membrane and are capable of cleaving growth factors, extracellular matrix proteins, and biologically active peptides. TRAF molecules serve as adapter proteins that link cell surface receptors of the Tumor Necrosis Factor and Interleukin-1/Toll-like families to downstream kinase cascades, which results in the activation of transcription factors and the regulation of cell survival, proliferation and stress responses in the immune and inflammatory systems. 
Other members include the ubiquitin ligases, TRIM37 and SPOP, and the ubiquitin-specific proteases, HAUSP and Ubp21p. A large number of uncharacterized members mostly from lineage-specific expansions in C. elegans and rice contain MATH and BTB domains, similar to SPOP. The MATH domain has been shown to bind peptide/protein substrates in TRAFs and HAUSP. It is possible that the MATH domain in other members of this superfamily also interacts with various protein substrates. The TRAF domain may also be involved in the trimerization of TRAFs. Based on homology, it is postulated that the MATH domain in meprins may be involved in its tetramer assembly and that the MATH domain, in general, may take part in diverse modular arrangements defined by adjacent multimerization domains." Q#1993 - CGI_10022362 superfamily 190233 135 194 1.22E-14 68.2498 cl08341 zf-TRAF superfamily - - TRAF-type zinc finger; TRAF-type zinc finger. Q#1993 - CGI_10022362 superfamily 190233 26 79 1.43E-07 48.2194 cl08341 zf-TRAF superfamily - - TRAF-type zinc finger; TRAF-type zinc finger. Q#1995 - CGI_10022364 superfamily 245819 716 859 5.96E-48 168.527 cl11967 Nucleotidyl_cyc_III superfamily - - "Class III nucleotidyl cyclases; Class III nucleotidyl cyclases are the largest, most diverse group of nucleotidyl cyclases (NC's) containing prokaryotic and eukaryotic proteins. They can be divided into two major groups; the mononucleotidyl cyclases (MNC's) and the diguanylate cyclases (DGC's). The MNC's, which include the adenylate cyclases (AC's) and the guanylate cyclases (GC's), have a conserved cyclase homology domain (CHD), while the DGC's have a conserved GGDEF domain, named after a conserved motif within this subgroup. Their products, cyclic guanylyl and adenylyl nucleotides, are second messengers that play important roles in eukaryotic signal transduction and prokaryotic sensory pathways." 
Q#1995 - CGI_10022364 superfamily 245201 417 643 4.92E-25 104.626 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#1995 - CGI_10022364 superfamily 245225 26 316 5.57E-33 131.601 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily C - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein couples receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine- binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. 
The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#1995 - CGI_10022364 superfamily 219526 655 702 3.79E-06 47.2287 cl06648 HNOBA superfamily N - "Heme NO binding associated; The HNOBA domain is found associated with the HNOB domain and pfam00211 in soluble cyclases and signalling proteins. The HNOB domain is predicted to function as a heme-dependent sensor for gaseous ligands, and transduce diverse downstream signals, in both bacteria and animals." Q#2001 - CGI_10022372 superfamily 247743 1438 1583 7.59E-07 50.9927 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." 
Q#2001 - CGI_10022372 superfamily 247792 3072 3111 0.00507477 38.2589 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#2002 - CGI_10022373 superfamily 247916 265 320 2.46E-08 52.3851 cl17362 Transglut_core superfamily N - "Transglutaminase-like superfamily; This family includes animal transglutaminases and other bacterial proteins of unknown function. Sequence conservation in this superfamily primarily involves three motifs that centre around conserved cysteine, histidine, and aspartate residues that form the catalytic triad in the structurally characterized transglutaminase, the human blood clotting factor XIIIa'. On the basis of the experimentally demonstrated activity of the Methanobacterium phage pseudomurein endoisopeptidase, it is proposed that many, if not all, microbial homologues of the transglutaminases are proteases and that the eukaryotic transglutaminases have evolved from an ancestral protease." Q#2003 - CGI_10022374 superfamily 242251 27 122 0.000760474 34.9527 cl01015 FUN14 superfamily - - FUN14 family; This family of short proteins are found in eukaryotes and some archaea. Although the function of these proteins is not known they may contain transmembrane helices. 
Q#2005 - CGI_10024709 superfamily 245201 414 618 8.32E-18 83.4401 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#2005 - CGI_10024709 superfamily 247684 639 1047 3.55E-58 208.3 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. 
The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#2007 - CGI_10024711 superfamily 243058 878 972 1.83E-06 47.3091 cl02500 ARM superfamily N - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#2007 - CGI_10024711 superfamily 218493 13 160 3.98E-42 151.741 cl08434 GMC_oxred_C superfamily - - GMC oxidoreductase; This domain found associated with pfam00732. Q#2013 - CGI_10024717 superfamily 241578 17 115 2.09E-05 44.3676 cl00057 vWFA superfamily C - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. 
There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#2015 - CGI_10024719 superfamily 243035 315 351 0.000544474 39.5054 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. 
C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#2015 - CGI_10024719 superfamily 245309 52 157 0.00608558 35.9734 cl10471 LU superfamily - - "Ly-6 antigen / uPA receptor -like domain; occurs singly in GPI-linked cell-surface glycoproteins (Ly-6 family,CD59, thymocyte B cell antigen, Sgp-2) or as three-fold repeated domain in urokinase-type plasminogen activator receptor. Topology of these domains is similar to that of snake venom neurotoxins." Q#2019 - CGI_10024723 superfamily 243035 371 438 4.03E-05 42.607 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. 
Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. 
A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#2021 - CGI_10024725 superfamily 241564 36 102 1.02E-25 98.1067 cl00035 BIR superfamily - - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." Q#2021 - CGI_10024725 superfamily 241564 132 199 2.38E-24 94.2547 cl00035 BIR superfamily - - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." 
Q#2022 - CGI_10024726 superfamily 243092 216 371 6.46E-21 91.2424 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#2023 - CGI_10024727 superfamily 243120 13 140 2.58E-19 85.7084 cl02633 ARID superfamily - - "ARID/BRIGHT DNA binding domain; This domain is known as ARID for AT-Rich Interaction Domain, and also known as the BRIGHT domain." Q#2023 - CGI_10024727 superfamily 190261 566 620 0.000105865 42.5358 cl03504 RFX_DNA_binding superfamily - - RFX DNA-binding domain; RFX is a regulatory factor which binds to the X box of MHC class II genes and is essential for their expression. The DNA-binding domain of RFX is the central domain of the protein and binds ssDNA as either a monomer or homodimer. 
Q#2024 - CGI_10024728 superfamily 218913 135 293 5.47E-05 43.1201 cl18486 Trehalose_recp superfamily N - "Trehalose receptor; In Drosophila, taste is perceived by gustatory neurons located in sensilla distributed on several different appendages throughout the body of the animal. This family represents the taste receptor sensitive to trehalose." Q#2025 - CGI_10024729 superfamily 241607 80 104 0.00016849 35.3234 cl00097 KAZAL_FS superfamily C - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chyomotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." 
Q#2025 - CGI_10024729 superfamily 241607 40 64 0.000255066 34.553 cl00097 KAZAL_FS superfamily C - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chyomotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#2026 - CGI_10024730 superfamily 218766 23 184 3.14E-27 102.53 cl05413 NDUF_B8 superfamily - - "NADH-ubiquinone oxidoreductase ASHI subunit (CI-ASHI or NDUFB8); This family consists of several eukaryotic NADH-ubiquinone oxidoreductase ASHI subunit (CI-ASHI) proteins. NADH:ubiquinone oxidoreductase (complex I) is an extremely complicated multiprotein complex located in the inner mitochondrial membrane. 
Its main function is the transport of electrons from NADH to ubiquinone, which is accompanied by translocation of protons from the mitochondrial matrix to the intermembrane space. Human complex I appears to consist of 41 subunits." Q#2028 - CGI_10024732 superfamily 241563 66 99 8.42E-06 43.622 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#2029 - CGI_10024733 superfamily 246680 10 82 4.95E-09 55.313 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#2029 - CGI_10024733 superfamily 241554 1173 1348 5.75E-49 174.381 cl00019 Macro superfamily - - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. 
Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#2029 - CGI_10024733 superfamily 241752 1663 1783 1.99E-24 101.627 cl00283 ADP_ribosyl superfamily - - "ADP_ribosylating enzymes catalyze the transfer of ADP_ribose from NAD+ to substrates. Bacterial toxins are cytoplasmic and catalyze the transfer of a single ADP_ribose unit to eukaryotic elongation factor 2, halting protein synthesis and killing the cell. Poly(ADP-ribose) polymerases (PARPS 1-3, VPARP, tankyrase) catalyze the addition of up to 100 ADP_ribose units from NAD+. PARPs 1 and 2 are localized in the nucleus, bind DNA, and are activated by DNA damage. VPARP is part of the vault ribonucleoprotein complex. Tankyrases regulate telomere length in part through poly(ADP_ribosylation) of telomere repeat binding factor 1 (TRF1). Poly(ADP-ribose) polymerase catalyses the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active." Q#2029 - CGI_10024733 superfamily 247723 422 495 1.12E-09 57.2772 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. 
This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#2029 - CGI_10024733 superfamily 247723 510 579 7.47E-05 43.0308 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." 
Q#2029 - CGI_10024733 superfamily 247723 326 396 0.00126384 39.2077 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#2029 - CGI_10024733 superfamily 247723 586 653 0.00175094 38.8225 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. 
The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#2030 - CGI_10024734 superfamily 247736 65 118 1.28E-08 49.5817 cl17182 NAT_SF superfamily C - "N-Acyltransferase superfamily: Various enzymes that characteristically catalyze the transfer of an acyl group to a substrate; NAT (N-Acyltransferase) is a large superfamily of enzymes that mostly catalyze the transfer of an acyl group to a substrate and are implicated in a variety of functions, ranging from bacterial antibiotic resistance to circadian rhythms in mammals. Members include GCN5-related N-Acetyltransferases (GNAT) such as Aminoglycoside N-acetyltransferases, Histone N-acetyltransferase (HAT) enzymes, and Serotonin N-acetyltransferase, which catalyze the transfer of an acetyl group to a substrate. The kinetic mechanism of most GNATs involves the ordered formation of a ternary complex: the reaction begins with Acetyl Coenzyme A (AcCoA) binding, followed by binding of substrate, then direct transfer of the acetyl group from AcCoA to the substrate, followed by product and subsequent CoA release. Other family members include Arginine/ornithine N-succinyltransferase, Myristoyl-CoA: protein N-myristoyltransferase, and Acyl-homoserinelactone synthase which have a similar catalytic mechanism but differ in types of acyl groups transferred. Leucyl/phenylalanyl-tRNA-protein transferase and FemXAB nonribosomal peptidyltransferases which catalyze similar peptidyltransferase reactions are also included." 
Q#2030 - CGI_10024734 superfamily 247736 92 147 0.000142775 38.4646 cl17182 NAT_SF superfamily N - "N-Acyltransferase superfamily: Various enzymes that characteristically catalyze the transfer of an acyl group to a substrate; NAT (N-Acyltransferase) is a large superfamily of enzymes that mostly catalyze the transfer of an acyl group to a substrate and are implicated in a variety of functions, ranging from bacterial antibiotic resistance to circadian rhythms in mammals. Members include GCN5-related N-Acetyltransferases (GNAT) such as Aminoglycoside N-acetyltransferases, Histone N-acetyltransferase (HAT) enzymes, and Serotonin N-acetyltransferase, which catalyze the transfer of an acetyl group to a substrate. The kinetic mechanism of most GNATs involves the ordered formation of a ternary complex: the reaction begins with Acetyl Coenzyme A (AcCoA) binding, followed by binding of substrate, then direct transfer of the acetyl group from AcCoA to the substrate, followed by product and subsequent CoA release. Other family members include Arginine/ornithine N-succinyltransferase, Myristoyl-CoA: protein N-myristoyltransferase, and Acyl-homoserinelactone synthase which have a similar catalytic mechanism but differ in types of acyl groups transferred. Leucyl/phenylalanyl-tRNA-protein transferase and FemXAB nonribosomal peptidyltransferases which catalyze similar peptidyltransferase reactions are also included." Q#2031 - CGI_10024735 superfamily 241572 13 85 7.65E-11 58.404 cl00050 CYCLIN superfamily - - "Cyclin box fold. Protein binding domain functioning in cell-cycle and transcription control. Present in cyclins, TFIIB and Retinoblastoma (RB).The cyclins consist of 8 classes of cell cycle regulators that regulate cyclin dependent kinases (CDKs). TFIIB is a transcription factor that binds the TATA box. Cyclins, TFIIB and RB contain 2 copies of the domain." 
Q#2033 - CGI_10024737 superfamily 247755 459 606 1.27E-75 243.566 cl17201 ABC_ATPase superfamily N - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#2034 - CGI_10024738 superfamily 150796 15 104 9.12E-32 109.134 cl10865 C6_DPF superfamily - - Cysteine-rich domain; This is the N-terminal approximately 100 amino acids of a family of proteins found from nematodes to humans. It contains between six and eight highly conserved cysteine residues and a characteristic DPF sequence motif. One member is putatively named as receptor for egg jelly protein but this could not be confirmed. Q#2036 - CGI_10024740 superfamily 245864 285 457 3.10E-25 106.594 cl12078 p450 superfamily N - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. 
These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#2036 - CGI_10024740 superfamily 245864 32 81 0.00171431 39.1838 cl12078 p450 superfamily C - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#2037 - CGI_10024741 superfamily 244859 98 325 5.36E-07 49.1069 cl08171 HtrL_YibB superfamily - - "Bacterial protein of unknown function (HtrL_YibB); The protein from this rare, uncharacterized protein family is designated HtrL or YibB in E. 
coli, where its gene is found in a region of LPS core biosynthesis genes. Homologues are found in Shigella flexneri, Campylobacter jejuni, and Caenorhabditis elegans only. The htrL gene may represent an insertion to the LPS core biosynthesis region, rather than an LPS biosynthetic protein." Q#2038 - CGI_10024742 superfamily 248347 1 225 1.15E-20 92.5316 cl17793 Peptidase_C69 superfamily C - Peptidase family C69; Peptidase family C69. Q#2039 - CGI_10024743 superfamily 241810 36 118 6.98E-36 121.522 cl00354 KOW superfamily - - "KOW: an acronym for the authors' surnames (Kyrpides, Ouzounis and Woese); KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. The KOW motif contains an invariant glycine residue and comprises alternating blocks of hydrophilic and hydrophobic residues." Q#2039 - CGI_10024743 superfamily 201966 100 166 2.75E-34 117.332 cl03349 Ribosomal_L27e superfamily N - Ribosomal L27e protein family; The N-terminal region of the eukaryotic ribosomal L27 has the KOW motif. C-terminal region is represented by this family. Q#2040 - CGI_10024744 superfamily 202711 47 213 2.33E-68 209.901 cl04190 Mob1_phocein superfamily - - "Mob1/phocein family; Mob1 is an essential Saccharomyces cerevisiae protein, identified from a two-hybrid screen, that binds Mps1p, a protein kinase essential for spindle pole body duplication and mitotic checkpoint regulation. Mob1 contains no known structural motifs; however MOB1 is a member of a conserved gene family and shares sequence similarity with a nonessential yeast gene, MOB2. Mob1 is a phosphoprotein in vivo and a substrate for the Mps1p kinase in vitro. Conditional alleles of MOB1 cause a late nuclear division arrest at restrictive temperature. 
This family also includes phocein, a rat protein that by yeast two hybrid interacts with striatin." Q#2042 - CGI_10024746 superfamily 247941 54 230 1.60E-06 45.9665 cl17387 Methyltransf_21 superfamily - - "Methyltransferase FkbM domain; This family has members from bacteria to human, and appears to be a methyltransferase." Q#2043 - CGI_10024747 superfamily 245864 34 348 4.58E-46 164.374 cl12078 p450 superfamily N - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#2044 - CGI_10024748 superfamily 243555 17 206 2.99E-19 86.291 cl03871 Chitin_bind_3 superfamily - - "Chitin binding domain; This domain is found associated with a wide variety of cellulose binding domain. This domain however is a chitin binding domain. This domain is found in isolation in baculoviral spheroidins and spindolins, protein of unknown function." 
Q#2046 - CGI_10002738 superfamily 189857 68 155 3.40E-09 51.4818 cl07832 Caveolin superfamily N - "Caveolin; All three known Caveolin forms have the FEDVIAEP caveolin 'signature motif' within their hydrophilic N-terminal domain. Caveolin 2 (Cav-2) is co-localised and co-expressed with Cav-1/VIP21, forms heterodimers with it and needs Cav-1 for proper membrane localisation. Cav-3 has greater protein sequence similarity to Cav-1 than to Cav-2. Cellular processes caveolins are involved in include vesicular transport, cholesterol homeostasis, signal transduction, and tumour suppression." Q#2047 - CGI_10002739 superfamily 189857 1 117 2.99E-33 115.04 cl07832 Caveolin superfamily - - "Caveolin; All three known Caveolin forms have the FEDVIAEP caveolin 'signature motif' within their hydrophilic N-terminal domain. Caveolin 2 (Cav-2) is co-localised and co-expressed with Cav-1/VIP21, forms heterodimers with it and needs Cav-1 for proper membrane localisation. Cav-3 has greater protein sequence similarity to Cav-1 than to Cav-2. Cellular processes caveolins are involved in include vesicular transport, cholesterol homeostasis, signal transduction, and tumour suppression." Q#2049 - CGI_10002741 superfamily 189857 3 123 5.17E-36 122.359 cl07832 Caveolin superfamily - - "Caveolin; All three known Caveolin forms have the FEDVIAEP caveolin 'signature motif' within their hydrophilic N-terminal domain. Caveolin 2 (Cav-2) is co-localised and co-expressed with Cav-1/VIP21, forms heterodimers with it and needs Cav-1 for proper membrane localisation. Cav-3 has greater protein sequence similarity to Cav-1 than to Cav-2. Cellular processes caveolins are involved in include vesicular transport, cholesterol homeostasis, signal transduction, and tumour suppression." Q#2050 - CGI_10002742 superfamily 247746 99 213 0.00105611 38.0082 cl17192 ATP-synt_B superfamily - - "ATP synthase B/B' CF(0); Part of the CF(0) (base unit) of the ATP synthase. 
The base unit is thought to translocate protons through membrane (inner membrane in mitochondria, thylakoid membrane in plants, cytoplasmic membrane in bacteria). The B subunits are thought to interact with the stalk of the CF(1) subunits. This domain should not be confused with the ab CF(1) proteins (in the head of the ATP synthase) which are found in pfam00006" Q#2050 - CGI_10002742 superfamily 241563 62 103 0.00452125 35.5328 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#2053 - CGI_10008123 superfamily 241578 205 385 6.92E-46 162.77 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." 
Q#2053 - CGI_10008123 superfamily 207701 15 77 1.35E-17 80.4162 cl02699 VIT superfamily N - Vault protein inter-alpha-trypsin domain; Inter-alpha-trypsin inhibitors (ITIs) consist of one light chain and a variable set of heavy chains. ITIs play a role in extracellular matrix (ECM) stabilisation and tumour metastasis as well as in plasma protease inhibition. The vault protein inter-alpha-trypsin (VIT) domain described here is found to the N-terminus of a von Willebrand factor type A domain (pfam00092) in ITI heavy chains (ITIHs) and their precursors. Q#2053 - CGI_10008123 superfamily 148333 670 779 2.72E-11 62.6838 cl05947 ITI_HC_C superfamily C - "Inter-alpha-trypsin inhibitor heavy chain C-terminus; This family represents the C-terminal region of inter-alpha-trypsin inhibitor heavy chains. Inter-alpha-trypsin inhibitors are glycoproteins with a high inhibitory activity against trypsin, built up from different combinations of four polypeptides: bikunin and the three heavy chains that belong to this family (HC1, HC2, HC3). The heavy chains do not have any protease inhibitory properties but have the capacity to interact in vitro and in vivo with hyaluronic acid, which promotes the stability of the extra-cellular matrix. All family members contain the pfam00092 domain." Q#2057 - CGI_10006631 superfamily 241596 83 138 2.10E-14 64.5427 cl00081 HLH superfamily - - "Helix-loop-helix domain, found in specific DNA- binding proteins that act as transcription factors; 60-100 amino acids long. 
A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE (5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#2058 - CGI_10006632 superfamily 220691 118 232 0.00180177 38.3678 cl18569 7TM_GPCR_Srv superfamily NC - Serpentine type 7TM GPCR chemoreceptor Srv; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srv is a member of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#2059 - CGI_10006633 superfamily 217816 73 269 1.50E-59 190.958 cl18428 FSH1 superfamily - - Serine hydrolase (FSH1); This is a family of serine hydrolases. 
Q#2060 - CGI_10006634 superfamily 245601 32 334 8.68E-26 103.993 cl11399 HP superfamily - - "Histidine phosphatase domain found in a functionally diverse set of proteins, mostly phosphatases; contains a His residue which is phosphorylated during the reaction; Catalytic domain of a functionally diverse set of proteins, most of which are phosphatases. The conserved catalytic core of this domain contains a His residue which is phosphorylated in the reaction. This set of proteins includes cofactor-dependent and cofactor-independent phosphoglycerate mutases (dPGM, and BPGM respectively), fructose-2,6-bisphosphatase (F26BP)ase, Sts-1, SixA, histidine acid phosphatases, phytases, and related proteins. Functions include roles in metabolism, signaling, or regulation, for example F26BPase affects glycolysis and gluconeogenesis through controlling the concentration of F26BP; BPGM controls the concentration of 2,3-BPG (the main allosteric effector of hemoglobin in human blood cells); human Sts-1 is a T-cell regulator; Escherichia coli Six A participates in the ArcB-dependent His-to-Asp phosphorelay signaling system; phytases scavenge phosphate from extracellular sources. Deficiency and mutation in many of the human members result in disease, for example erythrocyte BPGM deficiency is a disease associated with a decrease in the concentration of 2,3-BPG. Clinical applications include the use of prostatic acid phosphatase (PAP) as a serum marker for prostate cancer. Agricultural applications include the addition of phytases to animal feed." Q#2061 - CGI_10006635 superfamily 243072 544 665 5.73E-31 119.796 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#2061 - CGI_10006635 superfamily 243072 707 822 3.74E-23 97.4542 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#2061 - CGI_10006635 superfamily 115363 88 153 1.77E-16 76.2565 cl05972 MIB_HERC2 superfamily - - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). Q#2061 - CGI_10006635 superfamily 241760 162 205 1.59E-12 64.4019 cl00295 ZZ superfamily - - "Zinc finger, ZZ type. Zinc finger present in dystrophin, CBP/p300 and many other proteins. The ZZ motif coordinates one or two zinc ions and most likely participates in ligand binding or molecular scaffolding. Many proteins containing ZZ motifs have other zinc-binding motifs as well, and the majority serve as scaffolds in pathways involving acetyltransferase, protein kinase, or ubiquitin-related activity. ZZ proteins can be grouped into the following functional classes: chromatin modifying, cytoskeletal scaffolding, ubiquitin binding or conjugating, and membrane receptor or ion-channel modifying proteins." Q#2061 - CGI_10006635 superfamily 115363 233 297 1.05E-09 56.6114 cl05972 MIB_HERC2 superfamily - - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). 
Q#2065 - CGI_10010082 superfamily 243092 299 610 6.87E-50 178.297 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#2065 - CGI_10010082 superfamily 243092 70 378 3.21E-25 106.65 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#2065 - CGI_10010082 superfamily 217837 747 854 1.48E-20 88.8037 cl04367 Utp12 superfamily - - Dip2/Utp12 Family; This domain is found at the C-terminus of proteins containing WD40 repeats. These proteins are part of the U3 ribonucleoprotein; the yeast protein is called Utp12 or DIP2. Q#2068 - CGI_10010085 superfamily 241597 76 147 8.48E-36 125.488 cl00082 HMG-box superfamily - - "High Mobility Group (HMG)-box is found in a variety of eukaryotic chromosomal proteins and transcription factors. HMGs bind to the minor groove of DNA and have been classified by DNA binding preferences. Two phylogenetically distinct groups of Class I proteins bind DNA in a sequence specific fashion and contain a single HMG box. 
One group (SOX-TCF) includes transcription factors, TCF-1, -3, -4; and also SRY and LEF-1, which bind four-way DNA junctions and duplex DNA targets. The second group (MATA) includes fungal mating type gene products MC, MATA1 and Ste11. Class II and III proteins (HMGB-UBF) bind DNA in a non-sequence specific fashion and contain two or more tandem HMG boxes. Class II members include non-histone chromosomal proteins, HMG1 and HMG2, which bind to bent or distorted DNA such as four-way DNA junctions, synthetic DNA cruciforms, kinked cisplatin-modified DNA, DNA bulges, cross-overs in supercoiled DNA, and can cause looping of linear DNA. Class III members include nucleolar and mitochondrial transcription factors, UBF and mtTF1, which bind four-way DNA junctions." Q#2068 - CGI_10010085 superfamily 152771 146 175 2.62E-05 41.4358 cl13733 SOXp superfamily C - "SOX transcription factor; This domain family is found in eukaryotes, and is approximately 80 amino acids in length. The family is found in association with pfam00505. There are two conserved sequence motifs: KKDK and LPG. This family is made up of SOX transcription factors. These are involved in upregulation of nestin, a neural promoter." Q#2069 - CGI_10010086 superfamily 241666 95 336 1.31E-57 189.531 cl00184 CAS_like superfamily - - "Clavaminic acid synthetase (CAS) -like; CAS is a trifunctional Fe(II)/ 2-oxoglutarate (2OG) oxygenase carrying out three reactions in the biosynthesis of clavulanic acid, an inhibitor of class A serine beta-lactamases. 
In general, Fe(II)-2OG oxygenases catalyze a hydroxylation reaction, which leads to the incorporation of an oxygen atom from dioxygen into a hydroxyl group and conversion of 2OG to succinate and CO2" Q#2071 - CGI_10010088 superfamily 241550 29 254 9.41E-103 301.144 cl00015 nt_trans superfamily - - "nucleotidyl transferase superfamily; nt_trans (nucleotidyl transferase) This superfamily includes the class I amino-acyl tRNA synthetases, pantothenate synthetase (PanC), ATP sulfurylase, and the cytidylyltransferases, all of which have a conserved dinucleotide-binding domain." Q#2072 - CGI_10010089 superfamily 243116 519 702 1.72E-39 150.649 cl02626 DNA_pol_A superfamily C - "Family A polymerase primarily fills DNA gaps that arise during DNA repair, recombination and replication; DNA polymerase family A, 5'-3' polymerase domain. Family A polymerase functions primarily to fill DNA gaps that arise during DNA repair, recombination and replication. DNA-dependent DNA polymerases can be classified into six main groups based upon phylogenetic relationships with E. coli polymerase I (classA), E. coli polymerase II (class B), E.coli polymerase III (class C), euryarchaeota polymerase II (class D), human polymerase beta (class X), E. coli UmuC/DinB and eukaryotic RAP 30/Xeroderma pigmentosum variant (class Y). Family A polymerases are found primarily in organisms related to prokaryotes and include prokaryotic DNA polymerase I, mitochondrial polymerase gamma, and several bacteriophage polymerases including those from odd-numbered phage (T3, T5, and T7). Prokaryotic polymerase I (pol I) has two functional domains located on the same polypeptide; a 5'-3' polymerase and a 5'-3' exonuclease. Pol I uses its 5' nuclease activity to remove the ribonucleotide portion of newly synthesized Okazaki fragments and the DNA polymerase activity to fill in the resulting gap. 
The structure of these polymerases resembles in overall morphology a cupped human right hand, with fingers (which bind an incoming nucleotide and interact with the single-stranded template), palm (which harbors the catalytic amino acid residues and also binds an incoming dNTP) and thumb (which binds double-stranded DNA) subdomains." Q#2072 - CGI_10010089 superfamily 243116 682 829 2.84E-22 98.4523 cl02626 DNA_pol_A superfamily N - "Family A polymerase primarily fills DNA gaps that arise during DNA repair, recombination and replication; DNA polymerase family A, 5'-3' polymerase domain. Family A polymerase functions primarily to fill DNA gaps that arise during DNA repair, recombination and replication. DNA-dependent DNA polymerases can be classified into six main groups based upon phylogenetic relationships with E. coli polymerase I (classA), E. coli polymerase II (class B), E.coli polymerase III (class C), euryarchaeota polymerase II (class D), human polymerase beta (class X), E. coli UmuC/DinB and eukaryotic RAP 30/Xeroderma pigmentosum variant (class Y). Family A polymerases are found primarily in organisms related to prokaryotes and include prokaryotic DNA polymerase I, mitochondrial polymerase gamma, and several bacteriophage polymerases including those from odd-numbered phage (T3, T5, and T7). Prokaryotic polymerase I (pol I) has two functional domains located on the same polypeptide; a 5'-3' polymerase and a 5'-3' exonuclease. Pol I uses its 5' nuclease activity to remove the ribonucleotide portion of newly synthesized Okazaki fragments and the DNA polymerase activity to fill in the resulting gap. The structure of these polymerases resembles in overall morphology a cupped human right hand, with fingers (which bind an incoming nucleotide and interact with the single-stranded template), palm (which harbors the catalytic amino acid residues and also binds an incoming dNTP) and thumb (which binds double-stranded DNA) subdomains." 
Q#2073 - CGI_10010090 superfamily 246908 1044 1142 7.26E-42 150.197 cl15255 SH2 superfamily - - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." Q#2073 - CGI_10010090 superfamily 243073 1166 1213 2.55E-17 78.2153 cl02533 SOCS superfamily - - "SOCS (suppressors of cytokine signaling) box. The SOCS box is found in the C-terminal region of CIS/SOCS family proteins (in combination with a SH2 domain), ASBs (ankyrin repeat-containing proteins with a SOCS box), SSBs (SPRY domain-containing proteins with a SOCS box), and WSBs (WD40 repeat-containing proteins with a SOCS box), as well as, other miscellaneous proteins. The function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions." Q#2074 - CGI_10010091 superfamily 243072 12 100 7.66E-11 61.2454 cl02529 ANK superfamily N - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#2074 - CGI_10010091 superfamily 247792 222 268 0.00448852 36.6548 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#2075 - CGI_10010092 superfamily 243119 119 169 6.48E-05 38.1938 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#2075 - CGI_10010092 superfamily 243119 20 66 9.50E-05 37.7985 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. 
It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#2077 - CGI_10016490 superfamily 248345 287 413 3.44E-12 64.197 cl17791 SAC3_GANP superfamily N - "SAC3/GANP/Nin1/mts3/eIF-3 p25 family; This large family includes diverse proteins involved in large complexes. The alignment contains one highly conserved negatively charged residue and one highly conserved positively charged residue that are probably important for the function of these proteins. The family includes the yeast nuclear export factor Sac3, and mammalian GANP/MCM3-associated proteins, which facilitate the nuclear localisation of MCM3, a protein that associates with chromatin in the G1 phase of the cell-cycle. The 26S protease (or 26S proteasome) is responsible for degrading ubiquitin conjugates. It consists of 19S regulatory complexes associated with the ends of 20S proteasomes. The 19S regulatory complex is composed of about 20 different polypeptides and confers ATP-dependence and substrate specificity to the 26S enzyme. The conserved region occurs at the C-terminal of the Nin1-like regulatory subunit. This family includes several eukaryotic translation initiation factor 3 subunit 11 (eIF-3 p25) proteins. Eukaryotic initiation factor 3 (eIF3) is a multisubunit complex that is required for binding of mRNA to 40 S ribosomal subunits, stabilisation of ternary complex binding to 40 S subunits, and dissociation of 40 and 60 S subunits." Q#2078 - CGI_10016491 superfamily 241733 24 109 2.39E-58 177.008 cl00259 Sm_like superfamily - - "Sm and related proteins; The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. 
Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes." Q#2079 - CGI_10016492 superfamily 241574 405 599 2.32E-82 264.063 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#2079 - CGI_10016492 superfamily 241574 689 751 0.000393842 41.4174 cl00053 PTPc superfamily N - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#2083 - CGI_10016496 superfamily 242063 103 278 0.00315222 35.7366 cl00746 RDD superfamily - - "RDD family; This family of proteins contain three highly conserved amino acids: one arginine and two aspartates, hence the name of RDD family. This region contains two predicted transmembrane regions. 
The arginine occurs at the N terminus of the first helix and the first aspartate occurs in the middle of this helix. The molecular function of this region is unknown. However this region may be involved in transport of an as yet unknown set of ligands (Bateman A pers. obs.)." Q#2084 - CGI_10016497 superfamily 247736 76 127 0.000360472 36.0926 cl17182 NAT_SF superfamily N - "N-Acyltransferase superfamily: Various enzymes that characteristically catalyze the transfer of an acyl group to a substrate; NAT (N-Acyltransferase) is a large superfamily of enzymes that mostly catalyze the transfer of an acyl group to a substrate and are implicated in a variety of functions, ranging from bacterial antibiotic resistance to circadian rhythms in mammals. Members include GCN5-related N-Acetyltransferases (GNAT) such as Aminoglycoside N-acetyltransferases, Histone N-acetyltransferase (HAT) enzymes, and Serotonin N-acetyltransferase, which catalyze the transfer of an acetyl group to a substrate. The kinetic mechanism of most GNATs involves the ordered formation of a ternary complex: the reaction begins with Acetyl Coenzyme A (AcCoA) binding, followed by binding of substrate, then direct transfer of the acetyl group from AcCoA to the substrate, followed by product and subsequent CoA release. Other family members include Arginine/ornithine N-succinyltransferase, Myristoyl-CoA: protein N-myristoyltransferase, and Acyl-homoserinelactone synthase which have a similar catalytic mechanism but differ in types of acyl groups transferred. Leucyl/phenylalanyl-tRNA-protein transferase and FemXAB nonribosomal peptidyltransferases which catalyze similar peptidyltransferase reactions are also included." 
Q#2087 - CGI_10016500 superfamily 247743 4 120 0.000206043 39.8756 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#2088 - CGI_10016501 superfamily 243034 153 262 1.08E-07 50.4564 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 
subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#2090 - CGI_10016503 superfamily 243034 484 593 0.000139435 41.5968 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#2090 - CGI_10016503 superfamily 243034 806 912 0.000579871 39.6708 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in 
chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#2091 - CGI_10016504 superfamily 245213 220 258 0.00963833 33.7642 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#2093 - CGI_10016506 superfamily 245864 39 469 4.50E-36 138.18 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. 
Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#2094 - CGI_10016507 superfamily 118716 17 190 1.96E-79 243.732 cl10882 Oscp1 superfamily - - "Organic solute transport protein 1; Oscp1 is a family of proteins conserved from plants to humans. It is called organic solute transport protein or oxido-red- nitro domain-containing protein 1, however no reference could be find to confirm the function of the protein." Q#2095 - CGI_10016508 superfamily 247637 1 330 5.78E-139 400.442 cl16912 MDR superfamily - - "Medium chain reductase/dehydrogenase (MDR)/zinc-dependent alcohol dehydrogenase-like family; The medium chain reductase/dehydrogenases (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH. MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P) binding-Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES. 
The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH) , quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones. ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines. Other MDR members have only a catalytic zinc, and some contain no coordinated zinc." Q#2096 - CGI_10016509 superfamily 247637 1 330 5.78E-139 400.442 cl16912 MDR superfamily - - "Medium chain reductase/dehydrogenase (MDR)/zinc-dependent alcohol dehydrogenase-like family; The medium chain reductase/dehydrogenases (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH. MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P) binding-Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES. The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH) , quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. 
The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones. ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines. Other MDR members have only a catalytic zinc, and some contain no coordinated zinc." Q#2099 - CGI_10016512 superfamily 248012 58 168 6.17E-20 81.8552 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#2100 - CGI_10016513 superfamily 248012 84 194 5.08E-17 75.692 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#2102 - CGI_10016515 superfamily 243040 29 153 2.74E-58 192.709 cl02447 CRD_FZ superfamily - - "CRD_domain cysteine-rich domain, also known as Fz (frizzled) domain; CRD_FZ is an essential component of a number of cell surface receptors, which are involved in multiple signal transduction pathways, particularly in modulating the activity of the Wnt proteins, which play a fundamental role in the early development of metazoans. CRD is also found in secreted frizzled related proteins (SFRPs), which lack the transmembrane segment found in the frizzled protein. The CRD domain is also present in the alpha-1 chain of mouse type XVIII collagen, in carboxypeptidase Z, several receptor tyrosine kinases, and the mosaic transmembrane serine protease corin. The CRD domain is well conserved in metazoans - 10 frizzled proteins have been identified in mammals, 4 in Drosophila and 3 in Caenorhabditis elegans. 
CRD domains have also been identified in multiple tandem copies in a Dictyostelium discoideum protein. Very little is known about the mechanism by which CRD domains interact with their ligands. The domain contains 10 conserved cysteines." Q#2103 - CGI_10016516 superfamily 246936 247 369 7.28E-16 73.2936 cl15354 CBS_pair superfamily - - "The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase)." Q#2103 - CGI_10016516 superfamily 216595 131 200 2.40E-10 58.3458 cl03275 DUF21 superfamily N - "Domain of unknown function DUF21; This transmembrane region has no known function. Many of the sequences in this family are annotated as hemolysins, however this is due to a similarity to Treponema hyodysenteriae hemolysin C that does not contain this domain. 
This domain is found in the N-terminus of the proteins adjacent to two intracellular CBS domains pfam00571." Q#2104 - CGI_10003713 superfamily 241874 26 109 4.50E-10 55.1798 cl00456 SLC5-6-like_sbd superfamily NC - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#2105 - CGI_10003714 superfamily 241874 7 315 1.13E-120 367.962 cl00456 SLC5-6-like_sbd superfamily C - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. 
SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#2105 - CGI_10003714 superfamily 241874 352 471 1.03E-10 62.5725 cl00456 SLC5-6-like_sbd superfamily N - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." 
Q#2106 - CGI_10003715 superfamily 245608 84 292 8.50E-78 237.982 cl11421 FAA_hydrolase superfamily - - "Fumarylacetoacetate (FAA) hydrolase family; This family consists of fumarylacetoacetate (FAA) hydrolase, or fumarylacetoacetate hydrolase (FAH) and it also includes HHDD isomerase/OPET decarboxylase from E. coli strain W. FAA is the last enzyme in the tyrosine catabolic pathway, it hydrolyses fumarylacetoacetate into fumarate and acetoacetate which then join the citric acid cycle. Mutations in FAA cause type I tyrosinemia in humans this is an inherited disorder mainly affecting the liver leading to liver cirrhosis, hepatocellular carcinoma, renal tubular damages and neurologic crises amongst other symptoms. The enzymatic defect causes the toxic accumulation of phenylalanine/tyrosine catabolites. The E. coli W enzyme HHDD isomerase/OPET decarboxylase contains two copies of this domain and functions in fourth and fifth steps of the homoprotocatechuate pathway; here it decarboxylates OPET to HHDD and isomerises this to OHED. The final products of this pathway are pyruvic acid and succinic semialdehyde. This family also includes various hydratases and 4-oxalocrotonate decarboxylases which are involved in the bacterial meta-cleavage pathways for degradation of aromatic compounds. 2-hydroxypentadienoic acid hydratase, encoded by mhpD in E. coli, is involved in the phenylpropionic acid pathway of E. coli and catalyzes the conversion of 2-hydroxy pentadienoate to 4-hydroxy-2-keto-pentanoate and uses a Mn2+ co-factor. OHED hydratase encoded by hpcG in E. coli is involved in the homoprotocatechuic acid (HPC) catabolism. XylI in P. putida is a 4-Oxalocrotonate decarboxylase." Q#2109 - CGI_10000725 superfamily 241802 5 34 0.00793326 34.3447 cl00342 Trp-synth-beta_II superfamily NC - "Tryptophan synthase beta superfamily (fold type II); this family of pyridoxal phosphate (PLP)-dependent enzymes catalyzes beta-replacement and beta-elimination reactions. 
This CD corresponds to aminocyclopropane-1-carboxylate deaminase (ACCD), tryptophan synthase beta chain (Trp-synth_B), cystathionine beta-synthase (CBS), O-acetylserine sulfhydrylase (CS), serine dehydratase (Ser-dehyd), threonine dehydratase (Thr-dehyd), diaminopropionate ammonia lyase (DAL), and threonine synthase (Thr-synth). ACCD catalyzes the conversion of 1-aminocyclopropane-1-carboxylate to alpha-ketobutyrate and ammonia. Tryptophan synthase folds into a tetramer, where the beta chain is the catalytic PLP-binding subunit and catalyzes the formation of L-tryptophan from indole and L-serine. CBS is a tetrameric hemeprotein that catalyzes condensation of serine and homocysteine to cystathionine. CS is a homodimer that catalyzes the formation of L-cysteine from O-acetyl-L-serine. Ser-dehyd catalyzes the conversion of L- or D-serine to pyruvate and ammonia. Thr-dehyd is active as a homodimer and catalyzes the conversion of L-threonine to 2-oxobutanoate and ammonia. DAL is also a homodimer and catalyzes the alpha, beta-elimination reaction of both L- and D-alpha, beta-diaminopropionate to form pyruvate and ammonia. Thr-synth catalyzes the formation of threonine and inorganic phosphate from O-phosphohomoserine." 
Q#2113 - CGI_10001655 superfamily 247684 1 185 1.07E-35 130.475 cl17037 NBD_sugar-kinase_HSP70_actin superfamily NC - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#2115 - CGI_10001657 superfamily 247692 4 94 4.47E-12 60.7678 cl17068 AFD_class_I superfamily N - "Adenylate forming domain, Class I; This family includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. 
The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain." Q#2117 - CGI_10007116 superfamily 245213 208 230 0.000296002 40.3126 cl09941 EGF_CA superfamily N - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#2120 - CGI_10007119 superfamily 245213 184 206 0.00418182 37.231 cl09941 EGF_CA superfamily N - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#2121 - CGI_10002984 superfamily 245874 115 178 1.09E-09 55.125 cl12111 TNFR superfamily C - "Tumor necrosis factor receptor (TNFR) domain; superfamily of TNF-like receptor domains. When bound to TNF-like cytokines, TNFRs trigger multiple signal transduction pathways, they are involved in inflammation response, apoptosis, autoimmunity and organogenesis. TNFRs domains are elongated with generally three tandem repeats of cysteine-rich domains (CRDs). They fit in the grooves between protomers within the ligand trimer. 
Some TNFRs, such as NGFR and HveA, bind ligands with no structural similarity to TNF and do not bind ligand trimers." Q#2121 - CGI_10002984 superfamily 245874 19 115 4.85E-07 47.421 cl12111 TNFR superfamily - - "Tumor necrosis factor receptor (TNFR) domain; superfamily of TNF-like receptor domains. When bound to TNF-like cytokines, TNFRs trigger multiple signal transduction pathways, they are involved in inflammation response, apoptosis, autoimmunity and organogenesis. TNFRs domains are elongated with generally three tandem repeats of cysteine-rich domains (CRDs). They fit in the grooves between protomers within the ligand trimer. Some TNFRs, such as NGFR and HveA, bind ligands with no structural similarity to TNF and do not bind ligand trimers." Q#2121 - CGI_10002984 superfamily 246680 300 365 0.000979893 36.9838 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#2122 - CGI_10002985 superfamily 246680 52 109 0.000951773 33.8482 cl14633 DD_superfamily superfamily N - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. 
DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#2123 - CGI_10002986 superfamily 243146 103 153 8.40E-09 52.2915 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#2123 - CGI_10002986 superfamily 243074 23 58 1.73E-06 45.5753 cl02535 F-box-like superfamily C - F-box-like; This is an F-box-like family. Q#2123 - CGI_10002986 superfamily 243146 214 260 0.000413614 38.8095 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#2123 - CGI_10002986 superfamily 243146 267 318 0.00108566 37.2687 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#2124 - CGI_10002987 superfamily 219566 776 982 3.06E-59 202.478 cl06691 DUF1620 superfamily - - Protein of unknown function (DUF1620); These sequences are mainly derived from predicted eukaryotic proteins. The region in question lies towards the C-terminus of these large proteins and is approximately 300 amino acid residues long. 
Q#2128 - CGI_10001279 superfamily 241874 1 77 2.36E-25 103.139 cl00456 SLC5-6-like_sbd superfamily N - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#2128 - CGI_10001279 superfamily 241874 168 227 1.47E-06 47.1918 cl00456 SLC5-6-like_sbd superfamily N - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. 
NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#2129 - CGI_10001280 superfamily 241874 8 47 3.71E-17 74.2494 cl00456 SLC5-6-like_sbd superfamily NC - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#2130 - CGI_10012753 superfamily 243948 51 410 1.46E-130 382.408 cl04955 LanC_like superfamily - - "LanC-like proteins. LanC is the cyclase enzyme of the lanthionine synthetase. Lanthionine is a lantibiotic, a unique class of peptide antibiotics. 
They are ribosomally synthesized as a precursor peptide and then post-translationally modified to contain thioether cross-links called lanthionines (Lans) or methyllanthionines (MeLans), in addition to 2,3-didehydroalanine (Dha) and (Z)-2,3-didehydrobutyrine (Dhb). These unusual amino acids are introduced by the dehydration of serine and threonine residues, followed by thioether formation via addition of cysteine thiols, catalysed by LanB and LanC or LanM. LanC, the cyclase component, is a zinc metalloprotein, whose bound metal has been proposed to activate the thiol substrate for nucleophilic addition. A related domain is also present in LanM and other pro- and eukaryotic proteins of unknown function." Q#2132 - CGI_10012755 superfamily 201708 1 33 0.000216982 37.0735 cl03146 Ribosomal_L9_N superfamily N - "Ribosomal protein L9, N-terminal domain; Ribosomal protein L9, N-terminal domain. " Q#2133 - CGI_10012756 superfamily 243035 174 263 5.22E-15 68.8005 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#2133 - CGI_10012756 superfamily 245309 58 149 0.00103063 36.318 cl10471 LU superfamily - - "Ly-6 antigen / uPA receptor -like domain; occurs singly in GPI-linked cell-surface glycoproteins (Ly-6 family, CD59, thymocyte B cell antigen, Sgp-2) or as three-fold repeated domain in urokinase-type plasminogen activator receptor. Topology of these domains is similar to that of snake venom neurotoxins." Q#2135 - CGI_10012759 superfamily 218541 105 450 2.37E-85 269.227 cl09364 MCD superfamily - - "Malonyl-CoA decarboxylase (MCD); This family consists of several eukaryotic malonyl-CoA decarboxylase (MLYCD) proteins. Malonyl-CoA, in addition to being an intermediate in the de novo synthesis of fatty acids, is an inhibitor of carnitine palmitoyltransferase I, the enzyme that regulates the transfer of long-chain fatty acyl-CoA into mitochondria, where they are oxidized. After exercise, malonyl-CoA decarboxylase participates with acetyl-CoA carboxylase in regulating the concentration of malonyl-CoA in liver and adipose tissue, as well as in muscle. Malonyl-CoA decarboxylase is regulated by AMP-activated protein kinase (AMPK)." Q#2136 - CGI_10012760 superfamily 241688 2 327 2.17E-40 145.277 cl00210 Isoprenoid_Biosyn_C1 superfamily - - "Isoprenoid Biosynthesis enzymes, Class 1; Superfamily of trans-isoprenyl diphosphate synthases (IPPS) and class I terpene cyclases which either synthesize geranyl/farnesyl diphosphates (GPP/FPP) or longer chained products from isoprene precursors, isopentenyl diphosphate (IPP) and dimethylallyl diphosphate (DMAPP), or use geranyl (C10)-, farnesyl (C15)-, or geranylgeranyl (C20)-diphosphate as substrate. These enzymes produce a myriad of precursors for such end products as steroids, cholesterol, sesquiterpenes, heme, carotenoids, retinoids, and diterpenes; and are widely distributed among archaea, bacteria, and eukaryota. The enzymes in this superfamily share the same 'isoprenoid synthase fold' and include several subgroups. 
The head-to-tail (HT) IPPS catalyze the successive 1'-4 condensation of the 5-carbon IPP to the growing isoprene chain to form linear, all-trans, C10-, C15-, C20-, C25-, C30-, C35-, C40-, C45-, or C50-isoprenoid diphosphates. Cyclic monoterpenes, diterpenes, and sesquiterpenes are formed from their respective linear isoprenoid diphosphates by class I terpene cyclases. The head-to-head (HH) IPPS catalyze the successive 1'-1 condensation of 2 farnesyl or 2 geranylgeranyl isoprenoid diphosphates. Cyclization of these 30- and 40-carbon linear forms is catalyzed by class II cyclases. Both the isoprenoid chain elongation reactions and the class I terpene cyclization reactions proceed via electrophilic alkylations in which a new carbon-carbon single bond is generated through interaction between a highly reactive electron-deficient allylic carbocation and an electron-rich carbon-carbon double bond. The catalytic site consists of a large central cavity formed by mostly antiparallel alpha helices with two aspartate-rich regions located on opposite walls. These residues mediate binding of prenyl phosphates via bridging Mg2+ ions, inducing proposed conformational changes that close the active site to solvent, stabilizing reactive carbocation intermediates. Generally, the enzymes in this family exhibit an all-trans reaction pathway; an exception is the cis-trans terpene cyclase, trichodiene synthase. Mechanistically and structurally distinct, class II terpene cyclases and cis-IPPS are not included in this CD." Q#2140 - CGI_10012764 superfamily 248458 99 483 9.72E-17 80.4357 cl17904 MFS superfamily - - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. 
They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." 
Q#2143 - CGI_10012767 superfamily 247684 24 227 4.46E-26 104.281 cl17037 NBD_sugar-kinase_HSP70_actin superfamily N - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#2144 - CGI_10012768 superfamily 247684 34 152 1.05E-29 111.985 cl17037 NBD_sugar-kinase_HSP70_actin superfamily NC - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#2145 - CGI_10012769 superfamily 247684 13 432 4.08E-73 241.413 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#2148 - CGI_10012772 superfamily 243072 48 153 5.38E-26 104.388 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin; repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#2151 - CGI_10012776 superfamily 241622 932 1002 3.85E-11 61.8138 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either an N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95 (post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#2151 - CGI_10012776 superfamily 216901 603 791 4.72E-84 274.848 cl03466 Rap_GAP superfamily - - Rap/ran-GAP; Rap/ran-GAP. Q#2151 - CGI_10012776 superfamily 221287 1455 1702 1.90E-13 71.001 cl13342 DUF3401 superfamily - - "Domain of unknown function (DUF3401); This domain is functionally uncharacterized. This domain is found in eukaryotes. This presumed domain is typically between 231 and 250 amino acids in length. This domain is found associated with pfam02145, pfam00595." 
Q#2151 - CGI_10012776 superfamily 248213 1686 1725 9.42E-05 42.5621 cl17659 DivIC superfamily C - Septum formation initiator; DivIC from B. subtilis is necessary for both vegetative and sporulation septum formation. These proteins are mainly composed of an amino terminal coiled-coil. Q#2153 - CGI_10012778 superfamily 243035 35 159 1.32E-28 108.091 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. 
C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#2153 - CGI_10012778 superfamily 243035 293 374 1.57E-15 71.8821 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#2153 - CGI_10012778 superfamily 243035 171 296 3.58E-22 90.3517 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#2154 - CGI_10020123 superfamily 241609 294 372 6.51E-31 118.635 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#2154 - CGI_10020123 superfamily 241609 383 458 1.22E-27 109.39 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#2154 - CGI_10020123 superfamily 241609 543 626 1.00E-21 92.4411 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. 
They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#2154 - CGI_10020123 superfamily 241609 464 539 1.76E-15 74.3367 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#2154 - CGI_10020123 superfamily 241613 791 825 6.10E-07 48.357 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#2154 - CGI_10020123 superfamily 241613 661 691 8.58E-06 44.8902 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the 
N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#2154 - CGI_10020123 superfamily 246925 935 1139 6.60E-09 58.1358 cl15309 LRR_RI superfamily - - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#2154 - CGI_10020123 superfamily 220695 1227 1451 6.16E-07 51.4255 cl18571 7TM_GPCR_Srx superfamily - - Serpentine type 7TM GPCR chemoreceptor Srx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srx is part of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#2154 - CGI_10020123 superfamily 220695 6 229 8.38E-05 44.8771 cl18571 7TM_GPCR_Srx superfamily - - Serpentine type 7TM GPCR chemoreceptor Srx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srx is part of the Srg superfamily of chemoreceptors. 
Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#2155 - CGI_10020124 superfamily 241609 117 192 4.58E-20 80.8851 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#2155 - CGI_10020124 superfamily 241609 74 105 4.42E-10 53.0786 cl00100 KR superfamily N - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#2156 - CGI_10020125 superfamily 202715 118 218 2.65E-30 109.205 cl04194 Tctex-1 superfamily - - Tctex-1 family; Tctex-1 is a dynein light chain. It has been shown that Tctex-1 can bind to the cytoplasmic tail of rhodopsin. C-terminal rhodopsin mutations responsible for retinitis pigmentosa inhibit this interaction. Q#2157 - CGI_10020126 superfamily 245818 627 743 1.77E-34 129.599 cl11966 Rel-Spo_like superfamily - - "RelA- and SpoT-like ppGpp Synthetases and Hydrolases, catalytic domain; The Rel-Spo superfamily includes the catalytic domains of Escherichia coli ppGpp synthetase (RelA), ppGpp synthetase/hydrolase (SpoT), and related proteins. RelA synthesizes (p)ppGpp in response to amino-acid starvation and in association with ribosomes. (p)ppGpp triggers the bacterial stringent response. 
SpoT catalyzes (p)ppGpp synthesis under carbon limitation in a ribosome-independent manner. It also catalyzes (p)ppGpp degradation. Gram-negative bacteria have two enzymes involved in (p)ppGpp metabolism while most Gram-positive organisms have a single Rel-Spo enzyme (Rel), which both synthesizes and degrades (p)ppGpp. The Arabidopsis thaliana Rel-Spo proteins, At-RSH1, -2, and -3 appear to regulate a rapid (p)ppGpp-mediated response to pathogens and other stresses. This catalytic domain is found in association with an N-terminal HD domain and a C-terminal metal dependent phosphohydrolase domain (TGS). Some Rel-Spo proteins also have a C-terminal regulatory ACT domain." Q#2157 - CGI_10020126 superfamily 245818 121 219 1.77E-15 74.5159 cl11966 Rel-Spo_like superfamily - - "RelA- and SpoT-like ppGpp Synthetases and Hydrolases, catalytic domain; The Rel-Spo superfamily includes the catalytic domains of Escherichia coli ppGpp synthetase (RelA), ppGpp synthetase/hydrolase (SpoT), and related proteins. RelA synthesizes (p)ppGpp in response to amino-acid starvation and in association with ribosomes. (p)ppGpp triggers the bacterial stringent response. SpoT catalyzes (p)ppGpp synthesis under carbon limitation in a ribosome-independent manner. It also catalyzes (p)ppGpp degradation. Gram-negative bacteria have two enzymes involved in (p)ppGpp metabolism while most Gram-positive organisms have a single Rel-Spo enzyme (Rel), which both synthesizes and degrades (p)ppGpp. The Arabidopsis thaliana Rel-Spo proteins, At-RSH1, -2, and -3 appear to regulate a rapid (p)ppGpp-mediated response to pathogens and other stresses. This catalytic domain is found in association with an N-terminal HD domain and a C-terminal metal dependent phosphohydrolase domain (TGS). Some Rel-Spo proteins also have a C-terminal regulatory ACT domain." 
Q#2157 - CGI_10020126 superfamily 217750 838 891 1.05E-11 62.2114 cl04280 PAP_assoc superfamily - - Cid1 family poly A polymerase; This domain is found in poly(A) polymerases and has been shown to have polynucleotide adenylyltransferase activity. Proteins in this family have been located to both the nucleus and the cytoplasm. Q#2157 - CGI_10020126 superfamily 217750 275 328 2.17E-10 58.3594 cl04280 PAP_assoc superfamily - - Cid1 family poly A polymerase; This domain is found in poly(A) polymerases and has been shown to have polynucleotide adenylyltransferase activity. Proteins in this family have been located to both the nucleus and the cytoplasm. Q#2158 - CGI_10020127 superfamily 245226 967 1116 5.72E-84 269.74 cl10012 DnaQ_like_exo superfamily - - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. 
Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#2159 - CGI_10020128 superfamily 243077 25 79 8.45E-15 70.2669 cl02542 DnaJ superfamily - - "DnaJ domain or J-domain. DnaJ/Hsp40 (heat shock protein 40) proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperonine family. Hsp40 proteins are characterized by the presence of a J domain, which mediates the interaction with Hsp70. They may contain other domains as well, and the architectures provide a means of classification." Q#2159 - CGI_10020128 superfamily 203591 249 385 1.02E-33 126.332 cl06275 DUF1399 superfamily - - Protein of unknown function (DUF1399); This family represents a conserved region approximately 150 residues long within a number of hypothetical plant proteins of unknown function. Q#2159 - CGI_10020128 superfamily 203591 156 249 6.29E-05 42.3584 cl06275 DUF1399 superfamily N - Protein of unknown function (DUF1399); This family represents a conserved region approximately 150 residues long within a number of hypothetical plant proteins of unknown function. Q#2160 - CGI_10020129 superfamily 203591 86 219 4.94E-36 130.569 cl06275 DUF1399 superfamily - - Protein of unknown function (DUF1399); This family represents a conserved region approximately 150 residues long within a number of hypothetical plant proteins of unknown function. 
Q#2160 - CGI_10020129 superfamily 226728 184 258 1.07E-05 45.6909 cl18775 COG4278 superfamily NC - Uncharacterized conserved protein [Function unknown] Q#2162 - CGI_10020131 superfamily 241638 134 258 9.29E-12 59.6893 cl00147 TNF superfamily - - "Tumor Necrosis Factor; TNF superfamily members include the cytokines: TNF (TNF-alpha), LT (lymphotoxin-alpha, TNF-beta), CD40 ligand, Apo2L (TRAIL), Fas ligand, and osteoprotegerin (OPG) ligand. These proteins generally have an intracellular N-terminal domain, a short transmembrane segment, an extracellular stalk, and a globular TNF-like extracellular domain of about 150 residues. They initiate apoptosis by binding to related receptors, some of which have intracellular death domains. They generally form homo- or hetero- trimeric complexes. TNF cytokines bind one elongated receptor molecule along each of three clefts formed by neighboring monomers of the trimer with ligand trimerization a requisite for receptor binding." Q#2163 - CGI_10020132 superfamily 201778 26 108 5.72E-13 64.1522 cl18219 GFO_IDH_MocA superfamily N - "Oxidoreductase family, NAD-binding Rossmann fold; This family of enzymes utilise NADP or NAD. This family is called the GFO/IDH/MOCA family in swiss-prot." Q#2163 - CGI_10020132 superfamily 217272 123 230 1.62E-11 59.8556 cl18400 GFO_IDH_MocA_C superfamily - - "Oxidoreductase family, C-terminal alpha/beta domain; This family of enzymes utilise NADP or NAD. This family is called the GFO/IDH/MOCA family in swiss-prot." Q#2164 - CGI_10020133 superfamily 247725 1754 1892 1.13E-70 235.222 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. 
They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#2164 - CGI_10020133 superfamily 243096 1567 1748 4.25E-50 177.875 cl02571 RhoGEF superfamily - - Guanine nucleotide exchange factor for Rho/Rac/Cdc42-like GTPases; Also called Dbl-homologous (DH) domain. It appears that PH domains invariably occur C-terminal to RhoGEF/DH domains. Q#2164 - CGI_10020133 superfamily 247683 1479 1531 2.63E-33 125.186 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#2166 - CGI_10020135 superfamily 241600 435 646 3.16E-98 302.237 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. 
The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#2166 - CGI_10020135 superfamily 241600 259 383 2.30E-62 207.863 cl00085 FReD superfamily C - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#2166 - CGI_10020135 superfamily 241600 17 151 9.63E-57 192.455 cl00085 FReD superfamily C - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." 
Q#2166 - CGI_10020135 superfamily 241600 184 218 0.000137942 42.4177 cl00085 FReD superfamily NC - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#2166 - CGI_10020135 superfamily 241600 161 190 0.00465548 37.6418 cl00085 FReD superfamily C - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." 
Q#2170 - CGI_10020139 superfamily 241647 113 143 0.000142067 38.2778 cl00157 WW superfamily - - Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs. Q#2172 - CGI_10020141 superfamily 216347 206 623 4.75E-129 389.973 cl08309 Cu_amine_oxid superfamily - - "Copper amine oxidase, enzyme domain; Copper amine oxidases are a ubiquitous and novel group of quinoenzymes that catalyze the oxidative deamination of primary amines to the corresponding aldehydes, with concomitant reduction of molecular oxygen to hydrogen peroxide. The enzymes are dimers of identical 70-90 kDa subunits, each of which contains a single copper ion and a covalently bound cofactor formed by the post-translational modification of a tyrosine side chain to 2,4,5-trihydroxyphenylalanine quinone (TPQ). This family corresponds to the catalytic domain of the enzyme." Q#2174 - CGI_10020143 superfamily 241547 77 305 1.61E-50 168.615 cl00012 alpha_CA superfamily - - "Carbonic anhydrase alpha (vertebrate-like) group. Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. 
They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionary distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidine residues and a fourth conserved histidine plays a potential role in proton transfer." Q#2176 - CGI_10020145 superfamily 241636 78 248 1.61E-56 181.247 cl00145 TBOX superfamily - - "T-box DNA binding domain of the T-box family of transcriptional regulators. The T-box family is an ancient group that appears to play a critical role in development in all animal species. These genes were uncovered on the basis of similarity to the DNA binding domain of murine Brachyury (T) gene product, the defining feature of the family. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development and conserved expression patterns, most of the known genes in all species being expressed in mesoderm or mesoderm precursors." Q#2177 - CGI_10020146 superfamily 247723 26 96 6.45E-22 87.0423 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. 
The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#2177 - CGI_10020146 superfamily 247723 100 167 4.84E-15 68.1347 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#2177 - CGI_10020146 superfamily 247723 205 272 5.25E-14 65.0531 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. 
The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#2178 - CGI_10020147 superfamily 247799 89 154 3.84E-15 69.8127 cl17245 KH-I superfamily - - "K homology RNA-binding domain, type I. KH binds single-stranded RNA or DNA. It is found in a wide variety of proteins including ribosomal proteins, transcription factors and post-transcriptional modifiers of mRNA. There are two different KH domains that belong to different protein folds, but they share a single KH motif. The KH motif is folded into a beta alpha alpha beta unit. In addition to the core, type II KH domains (e.g. ribosomal protein S3) include N-terminal extension and type I KH domains (e.g. hnRNP K) contain C-terminal extension." Q#2178 - CGI_10020147 superfamily 247799 12 72 3.15E-13 64.5035 cl17245 KH-I superfamily - - "K homology RNA-binding domain, type I. KH binds single-stranded RNA or DNA. It is found in a wide variety of proteins including ribosomal proteins, transcription factors and post-transcriptional modifiers of mRNA. There are two different KH domains that belong to different protein folds, but they share a single KH motif. The KH motif is folded into a beta alpha alpha beta unit. In addition to the core, type II KH domains (e.g. ribosomal protein S3) include N-terminal extension and type I KH domains (e.g. hnRNP K) contain C-terminal extension." Q#2178 - CGI_10020147 superfamily 247799 321 385 2.13E-10 56.3307 cl17245 KH-I superfamily - - "K homology RNA-binding domain, type I. KH binds single-stranded RNA or DNA. It is found in a wide variety of proteins including ribosomal proteins, transcription factors and post-transcriptional modifiers of mRNA. 
There are two different KH domains that belong to different protein folds, but they share a single KH motif. The KH motif is folded into a beta alpha alpha beta unit. In addition to the core, type II KH domains (e.g. ribosomal protein S3) include N-terminal extension and type I KH domains (e.g. hnRNP K) contain C-terminal extension." Q#2178 - CGI_10020147 superfamily 247799 223 299 4.14E-08 49.8659 cl17245 KH-I superfamily - - "K homology RNA-binding domain, type I. KH binds single-stranded RNA or DNA. It is found in a wide variety of proteins including ribosomal proteins, transcription factors and post-transcriptional modifiers of mRNA. There are two different KH domains that belong to different protein folds, but they share a single KH motif. The KH motif is folded into a beta alpha alpha beta unit. In addition to the core, type II KH domains (e.g. ribosomal protein S3) include N-terminal extension and type I KH domains (e.g. hnRNP K) contain C-terminal extension." Q#2179 - CGI_10020148 superfamily 241600 217 420 1.08E-55 185.137 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." 
Q#2179 - CGI_10020148 superfamily 192987 64 150 0.000233815 39.4779 cl13724 TMF_TATA_bd superfamily - - "TATA element modulatory factor 1 TATA binding; This is the C-terminal conserved coiled coil region of a family of TATA element modulatory factor 1 proteins conserved in eukaryotes. The proteins bind to the TATA element of some RNA polymerase II promoters and repress their activity. by competing with the binding of TATA binding protein. TMF1_TATA_bd is the most conserved part of the TMFs. TMFs are evolutionarily conserved golgins that bind Rab6, a ubiquitous ras-like GTP-binding Golgi protein, and contribute to Golgi organisation in animal and plant cells. The Rab6-binding domain appears to be the same region as this C-terminal family." Q#2180 - CGI_10020149 superfamily 246681 191 332 3.44E-55 179.205 cl14643 SRPBCC superfamily - - "START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC (SRPBCC) ligand-binding domain superfamily; SRPBCC domains have a deep hydrophobic ligand-binding pocket; they bind diverse ligands. Included in this superfamily are the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of mammalian STARD1-STARD15, and the C-terminal catalytic domains of the alpha oxygenase subunit of Rieske-type non-heme iron aromatic ring-hydroxylating oxygenases (RHOs_alpha_C), as well as the SRPBCC domains of phosphatidylinositol transfer proteins (PITPs), Bet v 1 (the major pollen allergen of white birch, Betula verrucosa), CoxG, CalC, and related proteins. Other members of this superfamily include PYR/PYL/RCAR plant proteins, the aromatase/cyclase (ARO/CYC) domains of proteins such as Streptomyces glaucescens tetracenomycin, and the SRPBCC domains of Streptococcus mutans Smu.440 and related proteins." Q#2183 - CGI_10020152 superfamily 242274 10 158 5.13E-06 43.6541 cl01053 SGNH_hydrolase superfamily - - "SGNH_hydrolase, or GDSL_hydrolase, is a diverse family of lipases and esterases. 
The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the typical Ser-His-Asp(Glu) triad from other serine hydrolases, but may lack the carboxylic acid." Q#2186 - CGI_10017321 superfamily 220601 1 305 1.22E-87 267.771 cl10846 TM231 superfamily - - "Transmembrane protein 231; This is a family of transmembrane proteins, given the number 231, of unknown function. It is conserved in eukaryotes." Q#2188 - CGI_10017323 superfamily 243146 295 341 5.61E-09 51.8934 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#2188 - CGI_10017323 superfamily 243146 257 306 2.03E-06 44.8567 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#2189 - CGI_10017324 superfamily 241874 1 129 9.51E-21 87.9468 cl00456 SLC5-6-like_sbd superfamily N - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. 
They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#2190 - CGI_10017325 superfamily 241874 209 369 2.09E-57 194.648 cl00456 SLC5-6-like_sbd superfamily N - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#2190 - CGI_10017325 superfamily 241874 31 162 6.20E-51 179.682 cl00456 SLC5-6-like_sbd superfamily C - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. 
SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#2192 - CGI_10017327 superfamily 243066 22 120 1.99E-11 58.8564 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#2195 - CGI_10017330 superfamily 216962 1 77 6.40E-34 113.94 cl03517 SRP14 superfamily - - Signal recognition particle 14kD protein; The signal recognition particle (SRP) is a multimeric protein involved in targeting secretory proteins to the rough endoplasmic reticulum membrane. SRP14 and SRP9 form a complex essential for SRP RNA binding. 
Q#2197 - CGI_10017332 superfamily 246669 4 129 8.88E-62 189.406 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#2198 - CGI_10017333 superfamily 247741 50 278 1.08E-75 233.192 cl17187 Aldolase_Class_I superfamily - - "Class I aldolases; Class I aldolases. The class I aldolases use an active-site lysine which stabilizes a reaction intermediate via Schiff base formation, and have TIM beta/alpha barrel fold. The members of this family include 2-keto-3-deoxy-6-phosphogluconate (KDPG) and 2-keto-4-hydroxyglutarate (KHG) aldolases, transaldolase, dihydrodipicolinate synthase sub-family, Type I 3-dehydroquinate dehydratase, DeoC and DhnA proteins, and metal-independent fructose-1,6-bisphosphate aldolase. Although structurally similar, the class II aldolases use a different mechanism and are believed to have an independent evolutionary origin." Q#2200 - CGI_10017335 superfamily 241842 55 150 7.12E-39 129.238 cl00400 Fe-S_biosyn superfamily - - Iron-sulphur cluster biosynthesis; This family is involved in iron-sulphur cluster biosynthesis. 
Its members include proteins that are involved in nitrogen fixation such as the HesB and HesB-like proteins. Q#2201 - CGI_10017336 superfamily 247725 181 252 3.12E-27 107.772 cl17171 PH-like superfamily N - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#2201 - CGI_10017336 superfamily 220215 21 109 1.74E-12 64.5538 cl09630 FERM_N superfamily - - FERM N-terminal domain; This domain is the N-terminal ubiquitin-like structural domain of the FERM domain. Q#2204 - CGI_10017339 superfamily 150852 29 109 7.52E-24 95.4689 cl10926 KxDL superfamily - - Uncharacterized conserved protein; This is a family of short proteins which are conserved over a region of 80 residues. There is a characteristic KxDL motif towards the C-terminus. The function is unknown. Q#2208 - CGI_10017343 superfamily 245670 504 686 3.18E-66 223.973 cl11519 DENN superfamily - - DENN (AEX-3) domain; DENN (after differentially expressed in neoplastic vs normal cells) is a domain which occurs in several proteins involved in Rab- mediated processes or regulation of MAPK signalling pathways. Q#2208 - CGI_10017343 superfamily 243635 386 470 6.62E-20 87.39 cl04085 uDENN superfamily - - uDENN domain; This region is always found associated with pfam02141. It is predicted to form an all beta domain. Q#2208 - CGI_10017343 superfamily 208095 763 833 5.13E-18 81.4859 cl04084 dDENN superfamily - - dDENN domain; This region is always found associated with pfam02141. It is predicted to form a globular domain. 
This domain is predicted to be completely alpha helical. Although not statistically supported it has been suggested that this domain may be similar to members of the Rho/Rac/Cdc42 GEF family. Q#2208 - CGI_10017343 superfamily 245029 139 198 5.45E-05 43.7904 cl09190 MAPEG superfamily N - "MAPEG family; This family has been called MAPEG (Membrane Associated Proteins in Eicosanoid and Glutathione metabolism). It includes proteins such as Prostaglandin E synthase. This enzyme catalyzes the synthesis of PGE2 from PGH2 (produced by cyclooxygenase from arachidonic acid). Because of structural similarities in the active sites of FLAP, LTC4 synthase and PGE synthase, substrates for each enzyme can compete with one another and modulate synthetic activity." Q#2209 - CGI_10017344 superfamily 241555 4 162 5.24E-66 206.017 cl00020 GAT_1 superfamily N - "Type 1 glutamine amidotransferase (GATase1)-like domain; Type 1 glutamine amidotransferase (GATase1)-like domain. This group contains proteins similar to Class I glutamine amidotransferases, the intracellular PH1704 from Pyrococcus horikoshii, the C-terminal of the large catalase: Escherichia coli HP-II, Sinorhizobium meliloti Rm1021 ThuA, the A4 beta-galactosidase middle domain and peptidase E. The majority of proteins in this group have a reactive Cys found in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow. For Class I glutamine amidotransferases proteins which transfer ammonia from the amide side chain of glutamine to an acceptor substrate, this Cys forms a Cys-His-Glu catalytic triad in the active site. 
Glutamine amidotransferases activity can be found in a range of biosynthetic enzymes included in this cd: glutamine amidotransferase, formylglycinamide ribonucleotide, GMP synthetase, anthranilate synthase component II, glutamine-dependent carbamoyl phosphate synthase (CPSase), cytidine triphosphate synthetase, gamma-glutamyl hydrolase, imidazole glycerol phosphate synthase and, cobyric acid synthase. For Pyrococcus horikoshii PH1704, the Cys of the nucleophile elbow together with a different His and, a Glu from an adjacent monomer form a catalytic triad different from the typical GATase1 triad. Peptidase E is believed to be a serine peptidase having a Ser-His-Glu catalytic triad which differs from the Cys-His-Glu catalytic triad of typical GATase1 domains, by having a Ser in place of the reactive Cys at the nucleophile elbow. The E. coli HP-II C-terminal domain, S. meliloti Rm1021 ThuA and the A4 beta-galactosidase middle domain lack the catalytic triad typical GATaseI domains. GATase1-like domains can occur either as single polypeptides, as in Class I glutamine amidotransferases, or as domains in a much larger multifunctional synthase protein, such as CPSase. Peptidase E has a circular permutation in the common core of a typical GTAse1 domain." Q#2210 - CGI_10017345 superfamily 247912 2 188 8.37E-10 57.8965 cl17358 Beta-lactamase superfamily N - Beta-lactamase; This family appears to be distantly related to pfam00905 and PF00768 D-alanyl-D-alanine carboxypeptidase. Q#2212 - CGI_10017347 superfamily 247912 36 294 4.93E-20 89.0976 cl17358 Beta-lactamase superfamily N - Beta-lactamase; This family appears to be distantly related to pfam00905 and PF00768 D-alanyl-D-alanine carboxypeptidase. Q#2212 - CGI_10017347 superfamily 221337 330 413 0.00925312 34.2147 cl13401 DUF3471 superfamily - - "Domain of unknown function (DUF3471); This presumed domain is functionally uncharacterized. This domain is found in bacteria, archaea and eukaryotes. 
This domain is typically between 98 and 114 amino acids in length. This domain is found associated with pfam00144." Q#2214 - CGI_10017349 superfamily 242232 338 392 6.85E-13 64.2202 cl00984 TM2 superfamily C - "TM2 domain; This family is composed of a pair of transmembrane alpha helices connected by a short linker. The function of this domain is unknown, however it occurs in a wide range of protein contexts." Q#2214 - CGI_10017349 superfamily 242232 203 255 1.55E-11 60.3682 cl00984 TM2 superfamily C - "TM2 domain; This family is composed of a pair of transmembrane alpha helices connected by a short linker. The function of this domain is unknown, however it occurs in a wide range of protein contexts." Q#2214 - CGI_10017349 superfamily 242232 125 174 2.86E-10 56.0272 cl00984 TM2 superfamily - - "TM2 domain; This family is composed of a pair of transmembrane alpha helices connected by a short linker. The function of this domain is unknown, however it occurs in a wide range of protein contexts." Q#2214 - CGI_10017349 superfamily 242232 263 311 4.06E-05 41.3896 cl00984 TM2 superfamily - - "TM2 domain; This family is composed of a pair of transmembrane alpha helices connected by a short linker. The function of this domain is unknown, however it occurs in a wide range of protein contexts." Q#2214 - CGI_10017349 superfamily 220708 413 474 0.00783375 35.0715 cl11017 WWbp superfamily N - "WW-domain ligand protein; The WWbp domain is characterized by several short PY and PT-like motifs of the PPPPY form. These appear to bind directly to the WW domains of WWP1 and WWP2 and other such diverse proteins as dystrophin and YAP (Yes-associated protein). This is the WW-domain binding protein WWbp via PY and PY_like motifs. The presence of a phosphotyrosine residue in the pWBP-1 peptide abolishes WW domain binding which suggests a potential regulatory role for tyrosine phosphorylation in modulating WW domain-ligand interactions. 
Given the likelihood that WWP1 and WWP2 function as E3 ubiquitin-protein ligases, it is possible that initial substrate-specific recognition occurs via WW domain-substrate protein interaction followed by ubiquitin transfer and subsequent proteolysis. This domain lies just downstream of the GRAM (pfam02893) in many members." Q#2215 - CGI_10017350 superfamily 242232 101 174 5.08E-14 64.6054 cl00984 TM2 superfamily - - "TM2 domain; This family is composed of a pair of transmembrane alpha helices connected by a short linker. The function of this domain is unknown, however it occurs in a wide range of protein contexts." Q#2216 - CGI_10017351 superfamily 216686 142 333 4.01E-53 176.36 cl18377 Galactosyl_T superfamily - - "Galactosyltransferase; This family includes the galactosyltransferases UDP-galactose:2-acetamido-2-deoxy-D-glucose3beta-galactosyltransferase and UDP-Gal:beta-GlcNAc beta 1,3-galactosyltransferase. Specific galactosyltransferases transfer galactose to GlcNAc terminal chains in the synthesis of the lacto-series oligosaccharides types 1 and 2." Q#2221 - CGI_10008344 superfamily 248264 113 169 1.50E-08 50.3134 cl17710 DDE_4 superfamily C - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." 
Q#2231 - CGI_10005609 superfamily 213107 12 50 1.43E-16 74.1941 cl02594 DD_R_PKA superfamily - - "Dimerization/Docking domain of the Regulatory subunit of cAMP-dependent protein kinase and similar domains; cAMP-dependent protein kinase (PKA) is a serine/threonine kinase (STK), catalyzing the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The inactive PKA holoenzyme is a heterotetramer composed of two phosphorylated and active catalytic subunits with a dimer of regulatory (R) subunits. Activation is achieved through the binding of the important second messenger cAMP to the R subunits, which leads to the dissociation of PKA into the R dimer and two active subunits. There are two classes of R subunits, RI and RII; each exists as two isoforms (alpha and beta) from distinct genes. These functionally non-redundant R isoforms allow for specificity in PKA signaling. The R subunit contains an N-terminal dimerization/docking (D/D) domain, a linker with an inhibitory sequence (IS), and two c-AMP binding domains. RI and RII subunits are distinguished by their IS; RII subunits contain a phosphorylation site and are both substrates and inhibitors while RI subunits are pseudo-substrates. RI subunits require ATP and Mg ions to form a stable holoenzyme while RII subunits do not. The D/D domain dimerizes to form a four-helix bundle that serves as a docking site for A-kinase-anchoring proteins (AKAPs), which facilitates the localization of PKA to specific sites in the cell. PKA is present ubiquitously in cells and interacts with many different downstream targets. It plays a role in the regulation of diverse processes such as growth, development, memory, metabolism, gene expression, immunity, and lipolysis." Q#2231 - CGI_10005609 superfamily 210118 252 274 0.00155406 36.5347 cl15479 IQ superfamily - - IQ calmodulin-binding motif; Calmodulin-binding motif. 
Q#2231 - CGI_10005609 superfamily 210118 540 562 0.00561185 34.9939 cl15479 IQ superfamily - - IQ calmodulin-binding motif; Calmodulin-binding motif. Q#2231 - CGI_10005609 superfamily 210118 492 513 0.00838641 34.6087 cl15479 IQ superfamily - - IQ calmodulin-binding motif; Calmodulin-binding motif. Q#2232 - CGI_10005610 superfamily 199575 1 115 2.83E-51 168.942 cl15439 BTG superfamily - - BTG family; BTG family. Q#2233 - CGI_10005611 superfamily 243082 338 517 7.44E-30 119.895 cl02553 Peptidase_C19 superfamily C - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." Q#2233 - CGI_10005611 superfamily 197845 578 597 0.0035825 36.3502 cl11736 UIM superfamily - - Ubiquitin-interacting motif; Present in proteasome subunit S5a and other ubiquitin-associated proteins. 
Q#2236 - CGI_10005614 superfamily 247684 37 458 1.39E-81 264.525 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#2237 - CGI_10005615 superfamily 245226 49 133 5.18E-14 67.3586 cl10012 DnaQ_like_exo superfamily N - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. 
The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#2238 - CGI_10005616 superfamily 245226 187 223 0.00244689 36.1245 cl10012 DnaQ_like_exo superfamily C - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. 
DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#2240 - CGI_10005618 superfamily 247684 1 204 6.54E-46 162.832 cl17037 NBD_sugar-kinase_HSP70_actin superfamily N - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#2242 - CGI_10013734 superfamily 247740 14 292 3.24E-178 497.154 cl17186 TIM_phosphate_binding superfamily - - "TIM barrel proteins share a structurally conserved phosphate binding motif and in general share an eight beta/alpha closed barrel structure. Specific for this family is the conserved phosphate binding site at the edges of strands 7 and 8. The phosphate comes either from the substrate, as in the case of inosine monophosphate dehydrogenase (IMPDH), or from ribulose-5-phosphate 3-epimerase (RPE) or from cofactors, like FMN." Q#2243 - CGI_10013735 superfamily 241555 15 222 6.15E-81 242.816 cl00020 GAT_1 superfamily - - "Type 1 glutamine amidotransferase (GATase1)-like domain; Type 1 glutamine amidotransferase (GATase1)-like domain. This group contains proteins similar to Class I glutamine amidotransferases, the intracellular PH1704 from Pyrococcus horikoshii, the C-terminal of the large catalase: Escherichia coli HP-II, Sinorhizobium meliloti Rm1021 ThuA, the A4 beta-galactosidase middle domain and peptidase E. The majority of proteins in this group have a reactive Cys found in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow. For Class I glutamine amidotransferases proteins which transfer ammonia from the amide side chain of glutamine to an acceptor substrate, this Cys forms a Cys-His-Glu catalytic triad in the active site. Glutamine amidotransferases activity can be found in a range of biosynthetic enzymes included in this cd: glutamine amidotransferase, formylglycinamide ribonucleotide, GMP synthetase, anthranilate synthase component II, glutamine-dependent carbamoyl phosphate synthase (CPSase), cytidine triphosphate synthetase, gamma-glutamyl hydrolase, imidazole glycerol phosphate synthase and, cobyric acid synthase. For Pyrococcus horikoshii PH1704, the Cys of the nucleophile elbow together with a different His and, a Glu from an adjacent monomer form a catalytic triad different from the typical GATase1 triad. 
Peptidase E is believed to be a serine peptidase having a Ser-His-Glu catalytic triad which differs from the Cys-His-Glu catalytic triad of typical GATase1 domains, by having a Ser in place of the reactive Cys at the nucleophile elbow. The E. coli HP-II C-terminal domain, S. meliloti Rm1021 ThuA and the A4 beta-galactosidase middle domain lack the catalytic triad typical GATaseI domains. GATase1-like domains can occur either as single polypeptides, as in Class I glutamine amidotransferases, or as domains in a much larger multifunctional synthase protein, such as CPSase. Peptidase E has a circular permutation in the common core of a typical GTAse1 domain." Q#2244 - CGI_10013736 superfamily 247856 53 78 0.000185946 34.8309 cl17302 EFh superfamily C - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#2245 - CGI_10013737 superfamily 241754 32 364 0 624.026 cl00286 Motor_domain superfamily - - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. Q#2249 - CGI_10013742 superfamily 242428 34 324 8.19E-50 176.42 cl01315 NRDE superfamily - - "NRDE protein; In eukaryotes this family is predicted to play a role in protein secretion and Golgi organisation. In plants this family includes Solanum habrochaites Cwp, which is involved in water permeability in the cuticles of fruit. Mouse T10 has been found to be expressed during early embryogenesis in mice. This protein contains a conserved NRDE motif." 
Q#2250 - CGI_10013743 superfamily 243035 226 337 2.22E-06 44.9182 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#2250 - CGI_10013743 superfamily 243035 19 115 0.000828086 37.2142 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. 
CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#2251 - CGI_10013744 superfamily 241563 162 200 9.37E-05 39.77 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#2252 - CGI_10013745 superfamily 241563 159 197 0.000121898 39.3848 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#2254 - CGI_10008723 superfamily 243157 3 82 2.14E-10 57.6729 cl02720 PB1 superfamily - - "The PB1 domain is a modular domain mediating specific protein-protein interactions which play a role in many critical cell processes, such as osteoclastogenesis, angiogenesis, early cardiovascular development, and cell polarity. A canonical PB1-PB1 interaction, which involves heterodimerization of two PB1 domains, is required for the formation of macromolecular signaling complexes ensuring specificity and fidelity during cellular signaling. The interaction between two PB1 domains depends on the type of PB1. There are three types of PB1 domains: type I which contains an OPCA motif, acidic amino acid cluster, type II which contains a basic cluster, and type I/II which contains both an OPCA motif and a basic cluster. Interactions of PB1 domains with other protein domains have been described as noncanonical PB1 interactions. The PB1 domain module is conserved in amoebas, fungi, animals, and plants." Q#2255 - CGI_10008724 superfamily 216878 16 116 7.60E-54 166.317 cl03452 DAD superfamily - - "DAD family; Members of this family are thought to be integral membrane proteins. Some members of this family have been shown to cause apoptosis if mutated; these proteins are known as DAD for defender against death. The family also includes the epsilon subunit of the oligosaccharyltransferase that is involved in N-linked glycosylation." 
Q#2256 - CGI_10008725 superfamily 244906 624 700 3.67E-09 54.0684 cl08315 CAP_GLY superfamily - - "CAP-Gly domain; Cytoskeleton-associated proteins (CAPs) are involved in the organisation of microtubules and transportation of vesicles and organelles along the cytoskeletal network. A conserved motif, CAP-Gly, has been identified in a number of CAPs, including CLIP-170 and dynactins. The crystal structure of Caenorhabditis elegans F53F4.3 protein CAP-Gly domain was recently solved. The domain contains three beta-strands. The most conserved sequence, GKNDG, is located in two consecutive sharp turns on the surface, forming the entrance to a groove." Q#2258 - CGI_10008727 superfamily 150815 7 54 3.96E-09 47.7817 cl10887 Phospho_p8 superfamily - - "DNA-binding nuclear phosphoprotein p8; P8 is a short 80-82 amino acid protein that is conserved from nematodes to humans. It carries at least one protein kinase C domain suggesting a possible role in signal transduction and it is thought to be a phosphoprotein, but the sites of phosphorylation and the kinases involved remain to be determined." Q#2259 - CGI_10008728 superfamily 243090 739 929 6.11E-25 104.152 cl02565 RGS superfamily - - "Regulator of G protein signaling (RGS) domain superfamily; The RGS domain is an essential part of the Regulator of G-protein Signaling (RGS) protein family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play critical regulatory roles as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. While inactive, G-alpha-subunits bind GDP, which is released and replaced by GTP upon agonist activation. GTP binding leads to dissociation of the alpha-subunit and the beta-gamma-dimer, allowing them to interact with effectors molecules and propagate signaling cascades associated with cellular growth, survival, migration, and invasion. 
Deactivation of the G-protein signaling controlled by the RGS domain accelerates GTPase activity of the alpha subunit by hydrolysis of GTP to GDP, which results in the reassociation of the alpha-subunit with the beta-gamma-dimer and thereby inhibition of downstream activity. As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. RGS proteins are also involved in apoptosis and cell proliferation, as well as modulation of cardiac development. Several RGS proteins can fine-tune immune responses, while others play important roles in neuronal signals modulation. Some RGS proteins are principal elements needed for proper vision." Q#2259 - CGI_10008728 superfamily 243090 964 1066 1.03E-13 69.7284 cl02565 RGS superfamily - - "Regulator of G protein signaling (RGS) domain superfamily; The RGS domain is an essential part of the Regulator of G-protein Signaling (RGS) protein family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play critical regulatory roles as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. While inactive, G-alpha-subunits bind GDP, which is released and replaced by GTP upon agonist activation. GTP binding leads to dissociation of the alpha-subunit and the beta-gamma-dimer, allowing them to interact with effectors molecules and propagate signaling cascades associated with cellular growth, survival, migration, and invasion. Deactivation of the G-protein signaling controlled by the RGS domain accelerates GTPase activity of the alpha subunit by hydrolysis of GTP to GDP, which results in the reassociation of the alpha-subunit with the beta-gamma-dimer and thereby inhibition of downstream activity. 
As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. RGS proteins are also involved in apoptosis and cell proliferation, as well as modulation of cardiac development. Several RGS proteins can fine-tune immune responses, while others play important roles in neuronal signals modulation. Some RGS proteins are principal elements needed for proper vision." Q#2259 - CGI_10008728 superfamily 243090 97 225 0.00302343 37.5901 cl02565 RGS superfamily - - "Regulator of G protein signaling (RGS) domain superfamily; The RGS domain is an essential part of the Regulator of G-protein Signaling (RGS) protein family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play critical regulatory roles as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. While inactive, G-alpha-subunits bind GDP, which is released and replaced by GTP upon agonist activation. GTP binding leads to dissociation of the alpha-subunit and the beta-gamma-dimer, allowing them to interact with effectors molecules and propagate signaling cascades associated with cellular growth, survival, migration, and invasion. Deactivation of the G-protein signaling controlled by the RGS domain accelerates GTPase activity of the alpha subunit by hydrolysis of GTP to GDP, which results in the reassociation of the alpha-subunit with the beta-gamma-dimer and thereby inhibition of downstream activity. 
As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. RGS proteins are also involved in apoptosis and cell proliferation, as well as modulation of cardiac development. Several RGS proteins can fine-tune immune responses, while others play important roles in neuronal signals modulation. Some RGS proteins are principal elements needed for proper vision." Q#2260 - CGI_10008729 superfamily 203990 501 556 4.19E-06 44.9903 cl07882 Cmc1 superfamily C - Cytochrome c oxidase biogenesis protein Cmc1 like; Cmc1 is a metallo-chaperone like protein which is known to localise to the inner mitochondrial membrane in Saccharomyces cerevisiae. It is essential for full expression of cytochrome c oxidase and respiration. Cmc1 contains two Cx9C motifs and is able to bind copper(I). Cmc1 is thought to play a role in mitochondrial copper trafficking and transfer to cytochrome c oxidase. Q#2261 - CGI_10008730 superfamily 218518 19 282 1.16E-19 86.6802 cl08436 CENP-N superfamily N - "Kinetochore protein CHL4 like; CHL4 is a protein involved in chromosome segregation. It is a component of the central kinetochore which mediates the attachment of the centromere to the mitotic spindle. CENP-N is one of the components that assembles onto the CENP-A-nucleosome-associated (NAC) centromere. The centromere, which is the basic element of chromosome inheritance, is epigenetically determined in mammals. CENP-A, the centromere-specific histone H3 variant, assembles an array of nucleosomes and it is this that seems to be the prime candidate for specifying centromere identity. CENP-A nucleosomes directly recruit a proximal CENP-A nucleosome associated complex (NAC) comprised of CENP-M, CENP-N and CENP-T, CENP-U(50), CENP-C and CENP-H. 
Assembly of the CENP-A NAC at centromeres is dependent on CENP-M, CENP-N and CENP-T. Additionally, there are seven other subunits which make up the CENP-A-nucleosome distal (CAD) centromere, CENP-K, CENP-L, CENP-O, CENP-P, CENP-Q, CENP-R and CENP-S, also assembling on the CENP-A NAC." Q#2262 - CGI_10008731 superfamily 242010 21 193 4.29E-06 46.4047 cl00660 PRK01198 superfamily C - V-type ATP synthase subunit C; Provisional Q#2263 - CGI_10008732 superfamily 241571 24 91 0.000256377 35.4659 cl00049 CUB superfamily C - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#2268 - CGI_10008737 superfamily 215827 198 381 8.44E-38 140.681 cl02830 Tyrosinase superfamily - - Common central domain of tyrosinase; This family also contains polyphenol oxidases and some hemocyanins. Binds two copper ions via two sets of three histidines. This family is related to pfam00372. Q#2269 - CGI_10008739 superfamily 241547 54 170 3.46E-38 136.139 cl00012 alpha_CA superfamily C - "Carbonic anhydrase alpha (vertebrate-like) group. Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionary distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidine residues and a fourth conserved histidine plays a potential role in proton transfer." 
Q#2271 - CGI_10008741 superfamily 246597 1 165 2.45E-126 360.91 cl13995 MPP_superfamily superfamily N - "metallophosphatase superfamily, metallophosphatase domain; Metallophosphatases (MPPs), also known as metallophosphoesterases, phosphodiesterases (PDEs), binuclear metallophosphoesterases, and dimetal-containing phosphoesterases (DMPs), represent a diverse superfamily of enzymes with a conserved domain containing an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. This superfamily includes: the phosphoprotein phosphatases (PPPs), Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination." Q#2272 - CGI_10008742 superfamily 241600 68 280 8.33E-88 263.332 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." 
Q#2273 - CGI_10008743 superfamily 246597 1 68 1.54E-44 147.509 cl13995 MPP_superfamily superfamily N - "metallophosphatase superfamily, metallophosphatase domain; Metallophosphatases (MPPs), also known as metallophosphoesterases, phosphodiesterases (PDEs), binuclear metallophosphoesterases, and dimetal-containing phosphoesterases (DMPs), represent a diverse superfamily of enzymes with a conserved domain containing an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. This superfamily includes: the phosphoprotein phosphatases (PPPs), Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination." Q#2274 - CGI_10008744 superfamily 241600 67 194 2.69E-46 154.742 cl00085 FReD superfamily C - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." 
Q#2275 - CGI_10008745 superfamily 246597 7 108 1.11E-61 191.807 cl13995 MPP_superfamily superfamily C - "metallophosphatase superfamily, metallophosphatase domain; Metallophosphatases (MPPs), also known as metallophosphoesterases, phosphodiesterases (PDEs), binuclear metallophosphoesterases, and dimetal-containing phosphoesterases (DMPs), represent a diverse superfamily of enzymes with a conserved domain containing an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. This superfamily includes: the phosphoprotein phosphatases (PPPs), Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination." Q#2276 - CGI_10008746 superfamily 246597 1 118 4.54E-91 269.232 cl13995 MPP_superfamily superfamily N - "metallophosphatase superfamily, metallophosphatase domain; Metallophosphatases (MPPs), also known as metallophosphoesterases, phosphodiesterases (PDEs), binuclear metallophosphoesterases, and dimetal-containing phosphoesterases (DMPs), represent a diverse superfamily of enzymes with a conserved domain containing an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. This superfamily includes: the phosphoprotein phosphatases (PPPs), Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). 
The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination." Q#2277 - CGI_10008747 superfamily 241600 67 280 5.61E-86 258.71 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#2278 - CGI_10008748 superfamily 218118 81 149 1.98E-15 67.6392 cl04552 CD225 superfamily - - "Interferon-induced transmembrane protein; This family includes the human leukocyte antigen CD225, which is an interferon inducible transmembrane protein, and is associated with interferon induced cell growth suppression." Q#2279 - CGI_10006547 superfamily 241599 331 387 9.68E-16 71.8908 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#2279 - CGI_10006547 superfamily 198730 227 301 2.03E-42 146.43 cl02582 Pou superfamily - - Pou domain - N-terminal to homeobox domain; Pou domain - N-terminal to homeobox domain. 
Q#2280 - CGI_10006548 superfamily 248097 159 285 2.13E-19 81.5426 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#2280 - CGI_10006548 superfamily 192445 32 139 0.00271122 36.6211 cl10818 Med4 superfamily C - "Vitamin-D-receptor interacting Mediator subunit 4; Members of this family function as part of the Mediator (Med) complex, which links DNA-bound transcriptional regulators and the general transcription machinery, particularly the RNA polymerase II enzyme. They play a role in basal transcription by mediating activation or repression according to the specific complement of transcriptional regulators bound to the promoter." Q#2281 - CGI_10006549 superfamily 246680 7 66 2.14E-21 86.4651 cl14633 DD_superfamily superfamily N - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#2282 - CGI_10006550 superfamily 222150 513 538 0.00486315 35.8305 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. 
Q#2283 - CGI_10006551 superfamily 247723 78 160 1.82E-21 89.9806 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#2283 - CGI_10006551 superfamily 247727 433 522 2.34E-07 49.3507 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#2283 - CGI_10006551 superfamily 247875 235 358 2.04E-17 81.17 cl17321 2OG-FeII_Oxy_2 superfamily N - 2OG-Fe(II) oxygenase superfamily; 2OG-Fe(II) oxygenase superfamily. 
Q#2283 - CGI_10006551 superfamily 216897 38 85 3.35E-08 51.5281 cl03463 Gal_Lectin superfamily C - Galactose binding lectin domain; Galactose binding lectin domain. Q#2284 - CGI_10006552 superfamily 247856 236 292 0.000120538 39.0681 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#2284 - CGI_10006552 superfamily 247856 199 259 0.000274522 38.2977 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#2284 - CGI_10006552 superfamily 247856 83 133 0.00394151 34.8309 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#2287 - CGI_10006555 superfamily 247725 42 135 3.52E-51 173.991 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. 
They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#2287 - CGI_10006555 superfamily 248318 714 766 1.83E-14 69.3869 cl17764 FYVE superfamily - - "FYVE domain; Zinc-binding domain; targets proteins to membrane lipids via interaction with phosphatidylinositol-3-phosphate, PI3P; present in Fab1, YOTB, Vac1, and EEA1;" Q#2287 - CGI_10006555 superfamily 219103 186 304 2.17E-50 172.556 cl05893 Myotub-related superfamily - - "Myotubularin-related; This family represents a region within eukaryotic myotubularin-related proteins that is sometimes found with pfam02893. Myotubularin is a dual-specific lipid phosphatase that dephosphorylates phosphatidylinositol 3-phosphate and phosphatidylinositol (3,5)-bi-phosphate. Mutations in gene encoding myotubularin-related proteins have been associated with disease." Q#2287 - CGI_10006555 superfamily 206020 364 418 5.08E-34 124.928 cl18286 Y_phosphatase_m superfamily - - "Myotubularin Y_phosphatase-like; This short region is highly conserved and seems to be common to many myotubularin proteins with protein tyrosine pyrophosphate activity. As the family has a number of highly conserved residues such as histidine, cysteine, glutamine and aspartate, it is possible that this represents a catalytic core of the active enzymatic part of the proteins." Q#2288 - CGI_10006556 superfamily 247725 5 137 1.28E-35 130.255 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. 
They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#2289 - CGI_10006557 superfamily 241573 502 805 5.45E-100 320.433 cl00051 CysPc superfamily - - "Calpains, domains IIa, IIb; calcium-dependent cytoplasmic cysteine proteinases, papain-like. Functions in cytoskeletal remodeling processes, cell differentiation, apoptosis and signal transduction." Q#2290 - CGI_10012497 superfamily 245213 40 79 2.33E-05 37.7136 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#2290 - CGI_10012497 superfamily 221695 20 43 5.77E-05 36.279 cl18612 cEGF superfamily - - "Complement Clr-like EGF-like; cEGF, or complement Clr-like EGF, domains have six conserved cysteine residues disulfide-bonded into the characteristic pattern 'ababcc'. They are found in blood coagulation proteins such as fibrillin, Clr and Cls, thrombomodulin, and the LDL receptor. The core fold of the EGF domain consists of two small beta-hairpins packed against each other. Two major structural variants have been identified based on the structural context of the C-terminal cysteine residue of disulfide 'c' in the C-terminal hairpin: hEGFs and cEGFs. 
In cEGFs the C-terminal thiol resides on the C-terminal beta-sheet, resulting in long loop-lengths between the cysteine residues of disulfide 'c', typically C[10+]XC. These longer loop-lengths may have arisen by selective cysteine loss from a four-disulfide EGF template such as laminin or integrin. Tandem cEGF domains have five linking residues between terminal cysteines of adjacent domains. cEGF domains may or may not bind calcium in the linker region. cEGF domains with the consensus motif CXN4X[F,Y]XCXC are hydroxylated exclusively on the asparagine residue." Q#2290 - CGI_10012497 superfamily 245213 1 29 0.00148554 32.604 cl09941 EGF_CA superfamily C - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#2291 - CGI_10012498 superfamily 241578 84 124 2.80E-06 43.5276 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#2291 - CGI_10012498 superfamily 221695 28 51 7.32E-06 39.7458 cl18612 cEGF superfamily - - "Complement Clr-like EGF-like; cEGF, or complement Clr-like EGF, domains have six conserved cysteine residues disulfide-bonded into the characteristic pattern 'ababcc'. They are found in blood coagulation proteins such as fibrillin, Clr and Cls, thrombomodulin, and the LDL receptor. The core fold of the EGF domain consists of two small beta-hairpins packed against each other. Two major structural variants have been identified based on the structural context of the C-terminal cysteine residue of disulfide 'c' in the C-terminal hairpin: hEGFs and cEGFs. In cEGFs the C-terminal thiol resides on the C-terminal beta-sheet, resulting in long loop-lengths between the cysteine residues of disulfide 'c', typically C[10+]XC. These longer loop-lengths may have arisen by selective cysteine loss from a four-disulfide EGF template such as laminin or integrin. Tandem cEGF domains have five linking residues between terminal cysteines of adjacent domains. cEGF domains may or may not bind calcium in the linker region. cEGF domains with the consensus motif CXN4X[F,Y]XCXC are hydroxylated exclusively on the asparagine residue." Q#2291 - CGI_10012498 superfamily 241578 43 83 0.000410168 37.3644 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). 
Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#2291 - CGI_10012498 superfamily 245213 4 41 0.00355346 32.6086 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#2292 - CGI_10012499 superfamily 245213 133 169 0.00352194 34.9198 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. 
Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#2292 - CGI_10012499 superfamily 221695 29 50 0.00725673 33.9678 cl18612 cEGF superfamily - - "Complement Clr-like EGF-like; cEGF, or complement Clr-like EGF, domains have six conserved cysteine residues disulfide-bonded into the characteristic pattern 'ababcc'. They are found in blood coagulation proteins such as fibrillin, Clr and Cls, thrombomodulin, and the LDL receptor. The core fold of the EGF domain consists of two small beta-hairpins packed against each other. Two major structural variants have been identified based on the structural context of the C-terminal cysteine residue of disulfide 'c' in the C-terminal hairpin: hEGFs and cEGFs. In cEGFs the C-terminal thiol resides on the C-terminal beta-sheet, resulting in long loop-lengths between the cysteine residues of disulfide 'c', typically C[10+]XC. These longer loop-lengths may have arisen by selective cysteine loss from a four-disulfide EGF template such as laminin or integrin. Tandem cEGF domains have five linking residues between terminal cysteines of adjacent domains. cEGF domains may or may not bind calcium in the linker region. cEGF domains with the consensus motif CXN4X[F,Y]XCXC are hydroxylated exclusively on the asparagine residue." Q#2294 - CGI_10012502 superfamily 247907 248 402 8.43E-30 115.978 cl17353 LamG superfamily - - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." 
Q#2294 - CGI_10012502 superfamily 247907 34 180 4.42E-28 110.971 cl17353 LamG superfamily - - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." Q#2294 - CGI_10012502 superfamily 247907 441 598 1.43E-24 100.956 cl17353 LamG superfamily - - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." Q#2294 - CGI_10012502 superfamily 245213 628 661 0.000723843 38.3866 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#2294 - CGI_10012502 superfamily 247907 666 713 0.000236155 40.8645 cl17353 LamG superfamily C - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. 
Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." Q#2295 - CGI_10012503 superfamily 245213 334 370 3.90E-09 52.639 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#2296 - CGI_10012504 superfamily 241574 915 1156 3.92E-90 290.641 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#2296 - CGI_10012504 superfamily 247725 199 307 3.71E-29 113.934 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. 
They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#2296 - CGI_10012504 superfamily 215882 112 224 2.94E-23 97.3514 cl09511 FERM_M superfamily - - FERM central domain; This domain is the central structural domain of the FERM domain. Q#2296 - CGI_10012504 superfamily 220215 25 106 3.09E-22 93.0586 cl09630 FERM_N superfamily - - FERM N-terminal domain; This domain is the N-terminal ubiquitin-like structural domain of the FERM domain. Q#2297 - CGI_10012505 superfamily 241941 63 127 9.36E-14 62.6117 cl00551 Acylphosphatase superfamily N - Acylphosphatase; Acylphosphatase. Q#2300 - CGI_10012508 superfamily 243063 2 176 4.72E-78 234.443 cl02511 GH64-TLP-SF superfamily N - "glycoside hydrolase family 64 (beta-1,3-glucanases which produce specific pentasaccharide oligomers) and thaumatin-like proteins; This superfamily includes glycoside hydrolases of family 64 (GH64), these are mostly bacterial beta-1,3-glucanases which cleave long-chain polysaccharide beta-1,3-glucans, into specific pentasaccharide oligomers and are implicated in fungal cell wall degradation. Also included in this superfamily are thaumatin, the sweet-tasting protein from the African berry Thaumatococcus daniellii, and thaumatin-like proteins (TLPs) which are involved in host defense and a wide range of developmental processes in fungi, plants, and animals. Like GH64s, some TLPs also hydrolyze the beta-1,3-glucans of the type commonly found in fungal walls. Plant TLPs are classified as pathogenesis-related (PR) protein family 5 (PR5), their expression is induced by environmental stresses such as pathogen/pest attack, drought and cold. Several members of the plant TLP family have been reported as food allergens from fruits, and pollen allergens from conifers. 
Streptomyces matensis laminaripentaose-producing, beta-1,3-glucanase (GH64-LPHase), and TLPs have in common a core N-terminal barrel domain (domain I) composed of 10 beta-strands, two coming from the C-terminal region of the protein. In TLPs, this core domain is flanked by two shorter domains (domains II and III). Small TLPs, such as Triticum aestivum thaumatin-like xylanase inhibitor, have a deletion in the third domain (domain II). GH64-LPHase has a second C-terminal domain which corresponds positionally to, but is much larger than, domain III of TLP. GH64-LPHase and TLPs are described as crescent-fold structures. Critical functional residues, common to GH64-LPHase and TLPs are a Glu and an Asp residue. LPHase has an electronegative, substrate-binding cleft and the aforementioned conserved Glu and Asp residues are the catalytic residues essential for beta-1,3-glucan cleavage. In TLPs, these residues are two of the four conserved residues which contribute to the strong electronegative character of the cleft which is associated with the antifungal activity of TLPs." Q#2305 - CGI_10012513 superfamily 247724 11 97 3.08E-20 84.5205 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#2306 - CGI_10012514 superfamily 248012 4 129 4.37E-15 67.3484 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#2307 - CGI_10012515 superfamily 243035 6 66 1.85E-15 65.3951 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#2308 - CGI_10012516 superfamily 243035 139 240 8.70E-20 81.8973 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#2309 - CGI_10012517 superfamily 248012 1 88 4.57E-05 37.6381 cl17458 TIR_2 superfamily N - TIR domain; This is a family of bacterial Toll-like receptors. Q#2310 - CGI_10020568 superfamily 217293 22 225 7.46E-34 126.208 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#2310 - CGI_10020568 superfamily 202474 232 282 1.09E-07 51.1153 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#2311 - CGI_10020569 superfamily 217293 22 225 6.35E-33 123.512 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#2311 - CGI_10020569 superfamily 202474 232 282 8.40E-08 51.5005 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. 
Q#2312 - CGI_10020570 superfamily 217293 23 225 2.72E-35 129.675 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#2312 - CGI_10020570 superfamily 202474 233 283 0.00113733 38.7889 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#2314 - CGI_10020572 superfamily 246710 42 180 1.82E-33 125.693 cl14783 DOMON_like superfamily - - "Domon-like ligand-binding domains; DOMON-like domains can be found in all three kindgoms of life and are a diverse group of ligand binding domains that have been shown to interact with sugars and hemes. DOMON domains were initially thought to confer protein-protein interactions. They were subsequently found as a heme-binding motif in cellobiose dehydrogenase, an extracellular fungal oxidoreductase that degrades both lignin and cellulose, and in ethylbenzene dehydrogenase, an enzyme that aids in the anaerobic degradation of hydrocarbons. The domain interacts with sugars in the type 9 carbohydrate binding modules (CBM9), which are present in a variety of glycosyl hydrolases, and it can also be found at the N-terminus of sensor histidine kinases." Q#2314 - CGI_10020572 superfamily 246710 599 715 1.88E-32 122.996 cl14783 DOMON_like superfamily - - "Domon-like ligand-binding domains; DOMON-like domains can be found in all three kindgoms of life and are a diverse group of ligand binding domains that have been shown to interact with sugars and hemes. DOMON domains were initially thought to confer protein-protein interactions. 
They were subsequently found as a heme-binding motif in cellobiose dehydrogenase, an extracellular fungal oxidoreductase that degrades both lignin and cellulose, and in ethylbenzene dehydrogenase, an enzyme that aids in the anaerobic degradation of hydrocarbons. The domain interacts with sugars in the type 9 carbohydrate binding modules (CBM9), which are present in a variety of glycosyl hydrolases, and it can also be found at the N-terminus of sensor histidine kinases." Q#2314 - CGI_10020572 superfamily 216290 203 331 8.76E-37 134.723 cl03089 Cu2_monooxygen superfamily - - "Copper type II ascorbate-dependent monooxygenase, N-terminal domain; The N and C-terminal domains of members of this family adopt the same PNGase F-like fold." Q#2314 - CGI_10020572 superfamily 217685 346 485 4.46E-34 128.221 cl04225 Cu2_monoox_C superfamily - - "Copper type II ascorbate-dependent monooxygenase, C-terminal domain; The N and C-terminal domains of members of this family adopt the same PNGase F-like fold." Q#2316 - CGI_10020574 superfamily 245201 14 264 2.03E-156 454.094 cl09925 PKc_like superfamily C - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#2317 - CGI_10020575 superfamily 241645 1 112 3.33E-44 142.123 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. 
Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#2321 - CGI_10020579 superfamily 238191 1135 1659 3.31E-121 395.162 cl18907 Esterase_lipase superfamily - - "Esterases and lipases (includes fungal lipases, cholinesterases, etc.) These enzymes act on carboxylic esters (EC: 3.1.1.-). The catalytic apparatus involves three residues (catalytic triad): a serine, a glutamate or aspartate and a histidine.These catalytic residues are responsible for the nucleophilic attack on the carbonyl carbon atom of the ester bond. In contrast with other alpha/beta hydrolase fold family members, p-nitrobenzyl esterase and acetylcholine esterase have a Glu instead of Asp at the active site carboxylate." Q#2321 - CGI_10020579 superfamily 238191 568 1079 5.99E-119 388.614 cl18907 Esterase_lipase superfamily - - "Esterases and lipases (includes fungal lipases, cholinesterases, etc.) These enzymes act on carboxylic esters (EC: 3.1.1.-). The catalytic apparatus involves three residues (catalytic triad): a serine, a glutamate or aspartate and a histidine.These catalytic residues are responsible for the nucleophilic attack on the carbonyl carbon atom of the ester bond. In contrast with other alpha/beta hydrolase fold family members, p-nitrobenzyl esterase and acetylcholine esterase have a Glu instead of Asp at the active site carboxylate." Q#2321 - CGI_10020579 superfamily 238191 43 536 7.68E-108 356.642 cl18907 Esterase_lipase superfamily - - "Esterases and lipases (includes fungal lipases, cholinesterases, etc.) 
These enzymes act on carboxylic esters (EC: 3.1.1.-). The catalytic apparatus involves three residues (catalytic triad): a serine, a glutamate or aspartate and a histidine.These catalytic residues are responsible for the nucleophilic attack on the carbonyl carbon atom of the ester bond. In contrast with other alpha/beta hydrolase fold family members, p-nitrobenzyl esterase and acetylcholine esterase have a Glu instead of Asp at the active site carboxylate." Q#2321 - CGI_10020579 superfamily 243061 2130 2230 1.33E-46 165.208 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#2321 - CGI_10020579 superfamily 243061 2022 2121 1.70E-42 153.652 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#2321 - CGI_10020579 superfamily 243061 1912 2012 6.16E-40 146.333 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#2321 - CGI_10020579 superfamily 243061 1807 1904 2.62E-35 132.851 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#2321 - CGI_10020579 superfamily 243061 1700 1799 3.78E-32 123.606 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. 
Q#2321 - CGI_10020579 superfamily 243061 2238 2317 1.38E-19 87.3974 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#2322 - CGI_10020580 superfamily 247724 11 208 1.01E-120 344.584 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#2323 - CGI_10020581 superfamily 202668 504 608 7.10E-22 93.1126 cl04110 BK_channel_a superfamily - - Calcium-activated BK potassium channel alpha subunit; Calcium-activated BK potassium channel alpha subunit. 
Q#2323 - CGI_10020581 superfamily 219619 284 351 3.11E-11 61.4547 cl18518 Ion_trans_2 superfamily - - Ion channel; This family includes the two membrane helix type ion channels found in bacteria. Q#2324 - CGI_10020582 superfamily 243074 22 67 1.71E-11 61.3685 cl02535 F-box-like superfamily - - F-box-like; This is an F-box-like family. Q#2325 - CGI_10020584 superfamily 248312 28 205 3.14E-08 50.8149 cl17758 PMP22_Claudin superfamily - - PMP-22/EMP/MP20/Claudin family; PMP-22/EMP/MP20/Claudin family. Q#2326 - CGI_10020585 superfamily 241672 264 586 2.88E-66 220.263 cl00192 ribokinase_pfkB_like superfamily - - "ribokinase/pfkB superfamily: Kinases that accept a wide variety of substrates, including carbohydrates and aromatic small molecules, all are phosphorylated at a hydroxyl group. The superfamily includes ribokinase, fructokinase, ketohexokinase, 2-dehydro-3-deoxygluconokinase, 1-phosphofructokinase, the minor 6-phosphofructokinase (PfkB), inosine-guanosine kinase, and adenosine kinase. Even though there is a high degree of structural conservation within this superfamily, their multimerization level varies widely, monomeric (e.g. adenosine kinase), dimeric (e.g. ribokinase), and trimeric (e.g THZ kinase)." Q#2326 - CGI_10020585 superfamily 242231 1 234 8.70E-113 342.144 cl00983 Indigoidine_A superfamily - - "Indigoidine synthase A like protein; Indigoidine is a blue pigment synthesised by Erwinia chrysanthemi implicated in pathogenicity and protection from oxidative stress. IdgA is involved in indigoidine biosynthesis, but its specific function is unknown. The recommended name for this protein is now pseudouridine-5'-phosphate glycosidase." Q#2327 - CGI_10020586 superfamily 243072 74 177 9.76E-18 75.883 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). 
ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#2328 - CGI_10020587 superfamily 202224 175 289 1.66E-22 91.5883 cl18224 JmjC superfamily - - "JmjC domain, hydroxylase; The JmjC domain belongs to the Cupin superfamily. JmjC-domain proteins may be protein hydroxylases that catalyze a novel histone modification. This is confirmed to be a hydroxylase: the human JmjC protein named Tyw5p unexpectedly acts in the biosynthesis of a hypermodified nucleoside, hydroxy-wybutosine, in tRNA-Phe by catalyzing hydroxylation." Q#2330 - CGI_10020589 superfamily 247858 27 166 9.71E-14 64.3314 cl17304 2OG-FeII_Oxy_3 superfamily - - 2OG-Fe(II) oxygenase superfamily; This family contains members of the 2-oxoglutarate (2OG) and Fe(II)-dependent oxygenase superfamily. Q#2331 - CGI_10020590 superfamily 241613 114 149 1.18E-09 54.5202 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#2331 - CGI_10020590 superfamily 241613 155 186 1.49E-05 42.579 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a 
central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#2331 - CGI_10020590 superfamily 245814 21 104 1.75E-08 52.1225 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#2332 - CGI_10020591 superfamily 241583 52 261 6.84E-13 66.4931 cl00064 ZnMc superfamily - - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." 
Q#2333 - CGI_10020592 superfamily 218118 65 130 4.00E-18 74.1876 cl04552 CD225 superfamily - - "Interferon-induced transmembrane protein; This family includes the human leukocyte antigen CD225, which is an interferon inducible transmembrane protein, and is associated with interferon induced cell growth suppression." Q#2334 - CGI_10020593 superfamily 218118 48 125 6.72E-24 89.5956 cl04552 CD225 superfamily - - "Interferon-induced transmembrane protein; This family includes the human leukocyte antigen CD225, which is an interferon inducible transmembrane protein, and is associated with interferon induced cell growth suppression." Q#2335 - CGI_10020594 superfamily 218118 30 105 2.63E-17 71.4912 cl04552 CD225 superfamily - - "Interferon-induced transmembrane protein; This family includes the human leukocyte antigen CD225, which is an interferon inducible transmembrane protein, and is associated with interferon induced cell growth suppression." Q#2336 - CGI_10020595 superfamily 243126 253 292 3.96E-14 66.8456 cl02650 FYRN superfamily C - F/Y-rich N-terminus; This region is normally found in the trithorax/ALL1 family proteins. It is similar to SMART:SM00541. Q#2336 - CGI_10020595 superfamily 243127 299 331 1.32E-08 51.9234 cl02651 FYRC superfamily N - F/Y rich C-terminus; This region is normally found in the trithorax/ALL1 family proteins. It is similar to SMART:SM00542. Q#2337 - CGI_10020596 superfamily 241599 155 212 2.40E-15 68.0388 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." 
Q#2339 - CGI_10020599 superfamily 241599 149 203 6.53E-19 78.054 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#2343 - CGI_10020603 superfamily 216686 92 259 3.97E-27 105.483 cl18377 Galactosyl_T superfamily - - "Galactosyltransferase; This family includes the galactosyltransferases UDP-galactose:2-acetamido-2-deoxy-D-glucose3beta-galactosyltransferase and UDP-Gal:beta-GlcNAc beta 1,3-galactosyltranferase. Specific galactosyltransferases transfer galactose to GlcNAc terminal chains in the synthesis of the lacto-series oligosaccharides types 1 and 2." Q#2344 - CGI_10020604 superfamily 221377 202 348 8.07E-06 44.3819 cl13449 DUF3504 superfamily - - Domain of unknown function (DUF3504); This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 156 to 173 amino acids in length. Q#2354 - CGI_10020614 superfamily 248469 20 127 2.59E-14 68.9359 cl17915 HAD_like superfamily - - "Haloacid dehalogenase-like hydrolases. The haloacid dehalogenase-like (HAD) superfamily includes L-2-haloacid dehalogenase, epoxide hydrolase, phosphoserine phosphatase, phosphomannomutase, phosphoglycolate phosphatase, P-type ATPase, and many others, all of which use a nucleophilic aspartate in their phosphoryl transfer reaction. All members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. Members of this superfamily are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases." Q#2354 - CGI_10020614 superfamily 248469 161 258 2.10E-09 54.6835 cl17915 HAD_like superfamily - - "Haloacid dehalogenase-like hydrolases. 
The haloacid dehalogenase-like (HAD) superfamily includes L-2-haloacid dehalogenase, epoxide hydrolase, phosphoserine phosphatase, phosphomannomutase, phosphoglycolate phosphatase, P-type ATPase, and many others, all of which use a nucleophilic aspartate in their phosphoryl transfer reaction. All members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. Members of this superfamily are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases." Q#2356 - CGI_10015095 superfamily 217473 54 325 1.86E-25 106.68 cl03978 Mab-21 superfamily - - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#2358 - CGI_10015097 superfamily 219797 108 748 0 573.108 cl09596 ACC_central superfamily - - "Acetyl-CoA carboxylase, central region; The region featured in this family is found in various eukaryotic acetyl-CoA carboxylases, N-terminal to the catalytic domain (pfam01039). This enzyme (EC:6.4.1.2) is involved in the synthesis of long-chain fatty acids, as it catalyzes the rate-limiting step in this process." Q#2359 - CGI_10015098 superfamily 217473 164 324 4.96E-21 93.1985 cl03978 Mab-21 superfamily N - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#2362 - CGI_10015101 superfamily 217473 204 351 7.02E-24 102.058 cl03978 Mab-21 superfamily N - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#2365 - CGI_10015104 superfamily 192054 15 168 9.95E-72 219.002 cl07222 RFC1 superfamily - - "Replication factor RFC1 C terminal domain; This is the C terminal domain of replication factor C, RFC1. 
RFC complexes hydrolyse ATP and load sliding clamps such as PCNA (proliferating cell nuclear antigen) onto double-stranded DNA. RFC1 is essential for RFC function in vivo." Q#2372 - CGI_10004381 superfamily 241600 62 234 9.91E-55 176.662 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#2373 - CGI_10004382 superfamily 241600 2 154 4.07E-38 130.86 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." 
Q#2374 - CGI_10004383 superfamily 247725 5 117 1.77E-77 227.573 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#2375 - CGI_10004384 superfamily 247725 403 526 9.46E-71 224.494 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#2375 - CGI_10004384 superfamily 243096 209 392 1.46E-26 106.228 cl02571 RhoGEF superfamily - - Guanine nucleotide exchange factor for Rho/Rac/Cdc42-like GTPases; Also called Dbl-homologous (DH) domain. It appears that PH domains invariably occur C-terminal to RhoGEF/DH domains. Q#2376 - CGI_10004385 superfamily 247725 175 301 5.34E-73 239.526 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. 
They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#2376 - CGI_10004385 superfamily 215882 83 187 3.24E-22 94.2698 cl09511 FERM_M superfamily - - FERM central domain; This domain is the central structural domain of the FERM domain. Q#2376 - CGI_10004385 superfamily 220215 12 75 4.69E-16 75.3394 cl09630 FERM_N superfamily - - FERM N-terminal domain; This domain is the N-terminal ubiquitin-like structural domain of the FERM domain. Q#2376 - CGI_10004385 superfamily 192138 292 326 1.01E-07 50.6952 cl07378 FA superfamily C - "FERM adjacent (FA); This region is found adjacent to Band 4.1 / FERM domains (pfam00373) in a subset of FERM containing protein. The region has been hypothesised to play a role in regulatory adaptation, based on similarity to other protein kinase substrates." Q#2381 - CGI_10009185 superfamily 247892 81 429 0 669.17 cl17338 PRK06075 superfamily - - NADH dehydrogenase subunit D; Validated Q#2382 - CGI_10009186 superfamily 222090 346 541 6.69E-14 71.9202 cl18636 Methyltransf_22 superfamily - - Methyltransferase domain; This family appears to be a methyltransferase domain. Q#2382 - CGI_10009186 superfamily 243109 888 961 0.00101218 39.7932 cl02614 SPRY superfamily C - "SPRY domain; SPRY domains, first identified in the SP1A kinase of Dictyostelium and rabbit Ryanodine receptor (hence the name), are homologous to B30.2. SPRY domains have been identified in at least 11 protein families, covering a wide range of functions, including regulation of cytokine signaling (SOCS), RNA metabolism (DDX1 and hnRNP), immunity to retroviruses (TRIM5alpha), intracellular calcium release (ryanodine receptors or RyR) and regulatory and developmental processes (HERC1 and Ash2L). B30.2 also contains residues in the N-terminus that form a distinct PRY domain structure; i.e. 
B30.2 domain consists of PRY and SPRY subdomains. B30.2 domains comprise the C-terminus of three protein families: BTNs (receptor glycoproteins of immunoglobulin superfamily); several TRIM proteins (composed of RING/B-box/coiled-coil or RBCC core); Stonutoxin (secreted poisonous protein of the stonefish Synanceia horrida). While SPRY domains are evolutionarily ancient, B30.2 domains are a more recent adaptation where the SPRY/PRY combination is a possible component of immune defense. Mutations found in the SPRY-containing proteins have shown to cause Mediterranean fever and Opitz syndrome." Q#2387 - CGI_10009192 superfamily 241913 45 114 5.21E-06 42.9785 cl00509 hot_dog superfamily C - "The hotdog fold was initially identified in the E. coli FabA (beta-hydroxydecanoyl-acyl carrier protein (ACP)-dehydratase) structure and subsequently in 4HBT (4-hydroxybenzoyl-CoA thioesterase) from Pseudomonas. A number of other seemingly unrelated proteins also share the hotdog fold. These proteins have related, but distinct, catalytic activities that include metabolic roles such as thioester hydrolysis in fatty acid metabolism, and degradation of phenylacetic acid and the environmental pollutant 4-chlorobenzoate. This superfamily also includes the PaaI-like protein FapR, a non-catalytic bacterial homolog involved in transcriptional regulation of fatty acid biosynthesis." Q#2388 - CGI_10009193 superfamily 112546 373 419 9.54E-05 40.1128 cl04240 EPTP superfamily - - "EPTP domain; Mutations in the LGI/Epitempin gene can result in a special form of epilepsy, autosomal dominant lateral temporal epilepsy. The Epitempin protein contains a large repeat in its C terminal section. The architecture and structural features of this repeat make it a likely member 7-bladed beta-propeller fold." 
Q#2388 - CGI_10009193 superfamily 112546 311 367 0.000186754 39.3424 cl04240 EPTP superfamily - - "EPTP domain; Mutations in the LGI/Epitempin gene can result in a special form of epilepsy, autosomal dominant lateral temporal epilepsy. The Epitempin protein contains a large repeat in its C terminal section. The architecture and structural features of this repeat make it a likely member 7-bladed beta-propeller fold." Q#2388 - CGI_10009193 superfamily 112546 209 257 0.000442615 38.1868 cl04240 EPTP superfamily - - "EPTP domain; Mutations in the LGI/Epitempin gene can result in a special form of epilepsy, autosomal dominant lateral temporal epilepsy. The Epitempin protein contains a large repeat in its C terminal section. The architecture and structural features of this repeat make it a likely member 7-bladed beta-propeller fold." Q#2388 - CGI_10009193 superfamily 112546 157 204 0.000993782 37.0312 cl04240 EPTP superfamily - - "EPTP domain; Mutations in the LGI/Epitempin gene can result in a special form of epilepsy, autosomal dominant lateral temporal epilepsy. The Epitempin protein contains a large repeat in its C terminal section. The architecture and structural features of this repeat make it a likely member 7-bladed beta-propeller fold." Q#2388 - CGI_10009193 superfamily 112546 262 300 0.00136163 36.646 cl04240 EPTP superfamily - - "EPTP domain; Mutations in the LGI/Epitempin gene can result in a special form of epilepsy, autosomal dominant lateral temporal epilepsy. The Epitempin protein contains a large repeat in its C terminal section. The architecture and structural features of this repeat make it a likely member 7-bladed beta-propeller fold." Q#2389 - CGI_10009194 superfamily 241578 22 178 1.46E-35 135.882 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). 
Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#2389 - CGI_10009194 superfamily 241578 234 386 4.05E-32 125.867 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." 
Q#2389 - CGI_10009194 superfamily 241578 444 600 1.01E-31 124.711 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#2389 - CGI_10009194 superfamily 241578 677 823 7.49E-18 84.1546 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. 
Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#2389 - CGI_10009194 superfamily 243119 858 912 2.40E-05 44.732 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#2390 - CGI_10009195 superfamily 243119 281 339 0.000993312 37.0281 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#2390 - CGI_10009195 superfamily 243119 97 153 0.0051708 34.7169 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. 
Chitin binding has been demonstrated for a protein containing only two of these domains. Q#2391 - CGI_10000572 superfamily 247727 57 139 4.04E-12 59.751 cl17173 AdoMet_MTases superfamily C - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#2392 - CGI_10009450 superfamily 247948 14 70 1.35E-12 60.7706 cl17394 RINGv superfamily - - RING-variant domain; RING-variant domain. Q#2394 - CGI_10009452 superfamily 241572 38 134 4.21E-08 49.5445 cl00050 CYCLIN superfamily - - "Cyclin box fold. Protein binding domain functioning in cell-cycle and transcription control. Present in cyclins, TFIIB and Retinoblastoma (RB).The cyclins consist of 8 classes of cell cycle regulators that regulate cyclin dependent kinases (CDKs). TFIIB is a transcription factor that binds the TATA box. Cyclins, TFIIB and RB contain 2 copies of the domain." Q#2394 - CGI_10009452 superfamily 241572 179 250 1.23E-06 45.6966 cl00050 CYCLIN superfamily C - "Cyclin box fold. Protein binding domain functioning in cell-cycle and transcription control. Present in cyclins, TFIIB and Retinoblastoma (RB).The cyclins consist of 8 classes of cell cycle regulators that regulate cyclin dependent kinases (CDKs). TFIIB is a transcription factor that binds the TATA box. Cyclins, TFIIB and RB contain 2 copies of the domain." 
Q#2395 - CGI_10009453 superfamily 243083 499 583 9.52E-31 115.544 cl02554 PWWP superfamily - - "The PWWP domain, named for a conserved Pro-Trp-Trp-Pro motif, is a small domain consisting of 100-150 amino acids. The PWWP domain is found in numerous proteins that are involved in cell division, growth and differentiation. Most PWWP-domain proteins seem to be nuclear, often DNA-binding, proteins that function as transcription factors regulating a variety of developmental processes. The function of the PWWP domain is still not known precisely; however, based on the fact that other regions of PWWP-domain proteins are responsible for nuclear localization and DNA-binding, it is likely that the PWWP domain acts as a site for protein-protein binding interactions, influencing chromatin remodeling and thereby regulating transcriptional processes. Some PWWP-domain proteins have been linked to cancer or other diseases; some are known to function as growth factors." Q#2396 - CGI_10009454 superfamily 243034 83 184 5.59E-17 73.5683 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples 
of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#2397 - CGI_10009455 superfamily 241563 416 447 3.41E-05 42.6595 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#2397 - CGI_10009455 superfamily 247792 273 314 0.00278408 37.04 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#2397 - CGI_10009455 superfamily 212612 24 219 5.53E-27 111.226 cl08346 Rib_hydrolase superfamily - - "ADP-ribosyl cyclase, also known as cyclic ADP-ribose hydrolase or CD38; ADP-ribosyl cyclase (EC:3.2.2.5) synthesizes the second messenger cyclic-ADP ribose (cADPR), which in turn releases calcium from internal stores. Mammals possess two membrane proteins, CD38 and BST-1/CD157, which exhibit ADP-ribosyl cyclase activity, as well as intracellular soluble ADP-ribose cyclases. 
CD38 is involved in differentiation, adhesion, and cell proliferation, and has been implicated in diseases such as AIDS, diabetes, and B-cell chronic lymphocytic leukemia. The extramembrane domain of CD38 acts as a multifunctional enzyme, and can synthesize cADPR from NAD+, hydrolyze NAD+ and cADPR to ADPR, as well as catalyze the exchange of the nicotinamide group of NADP+ with nicotinic acid under acidic conditions, to yield NAADP+ (nicotinic acid-adenine dinucleotide phosphate), a metabolite involved in Ca2+ mobilization from acidic stores." Q#2397 - CGI_10009455 superfamily 110440 855 881 0.00124326 37.7725 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#2398 - CGI_10009456 superfamily 222150 236 260 0.00800398 33.9045 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#2400 - CGI_10009458 superfamily 247046 240 464 1.58E-17 80.5569 cl15705 DUF563 superfamily - - Protein of unknown function (DUF563); Family of uncharacterized proteins. Q#2401 - CGI_10000648 superfamily 248097 11 72 0.00878168 31.0814 cl17543 C1q superfamily N - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#2405 - CGI_10021462 superfamily 242169 6 83 7.76E-05 37.2135 cl00886 Robl_LC7 superfamily - - "Roadblock/LC7 domain; This family includes proteins that are about 100 amino acids long and have been shown to be related. 
Members of this family of proteins are associated with both flagellar outer arm dynein and Drosophila and rat brain cytoplasmic dynein. It is proposed that roadblock/LC7 family members may modulate specific dynein functions. This family also includes Golgi-associated MP1 adapter protein and MglB from Myxococcus xanthus, a protein involved in gliding motility. However the family also includes members from non-motile bacteria such as Streptomyces coelicolor, suggesting that the protein may play a structural or regulatory role." Q#2406 - CGI_10021463 superfamily 241571 3 114 1.07E-09 56.2666 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#2407 - CGI_10021464 superfamily 243092 406 642 1.77E-48 171.749 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in 
this alignment." Q#2407 - CGI_10021464 superfamily 243074 324 367 2.62E-12 62.5241 cl02535 F-box-like superfamily - - F-box-like; This is an F-box-like family. Q#2408 - CGI_10021465 superfamily 219448 239 338 7.71E-11 60.0213 cl06523 DRMBL superfamily - - DNA repair metallo-beta-lactamase; The metallo-beta-lactamase fold contains five sequence motifs. The first four motifs are found in pfam00753 and are common to all metallo-beta-lactamases. The fifth motif appears to be specific to function. This entry represents the fifth motif from metallo-beta-lactamases involved in DNA repair. Q#2408 - CGI_10021465 superfamily 241867 30 144 1.97E-07 50.9736 cl00446 Lactamase_B superfamily C - Metallo-beta-lactamase superfamily; Metallo-beta-lactamase superfamily. Q#2409 - CGI_10021466 superfamily 220606 60 241 1.86E-37 135.718 cl10857 Tmemb_40 superfamily - - Predicted membrane protein; This is a region of 280 amino acids from a group of proteins conserved from plants to humans. It is predicted to be a membrane protein but its function is otherwise unknown. Q#2411 - CGI_10021468 superfamily 247755 319 381 1.85E-28 110.231 cl17201 ABC_ATPase superfamily N - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." 
Q#2411 - CGI_10021468 superfamily 247755 423 495 6.53E-27 105.994 cl17201 ABC_ATPase superfamily N - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#2411 - CGI_10021468 superfamily 247755 171 220 1.33E-13 68.244 cl17201 ABC_ATPase superfamily C - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#2411 - CGI_10021468 superfamily 247755 282 346 3.39E-07 49.74 cl17201 ABC_ATPase superfamily NC - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. 
ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#2411 - CGI_10021468 superfamily 247755 388 455 0.00317082 38.299 cl17201 ABC_ATPase superfamily NC - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#2415 - CGI_10021472 superfamily 245835 32 313 2.29E-98 313.883 cl12013 BAR superfamily - - "The Bin/Amphiphysin/Rvs (BAR) domain, a dimerization module that binds membranes and detects membrane curvature; BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions including organelle biogenesis, membrane trafficking or remodeling, and cell division and migration. Mutations in BAR containing proteins have been linked to diseases and their inactivation in cells leads to altered membrane dynamics. A BAR domain with an additional N-terminal amphipathic helix (an N-BAR) can drive membrane curvature. These N-BAR domains are found in amphiphysins and endophilins, among others. BAR domains are also frequently found alongside domains that determine lipid specificity, such as the Pleckstrin Homology (PH) and Phox Homology (PX) domains which are present in beta centaurins (ACAPs and ASAPs) and sorting nexins, respectively. 
A FES-CIP4 Homology (FCH) domain together with a coiled coil region is called the F-BAR domain and is present in Pombe/Cdc15 homology (PCH) family proteins, which include Fes/Fes tyrosine kinases, PACSIN or syndapin, CIP4-like proteins, and srGAPs, among others. The Inverse (I)-BAR or IRSp53/MIM homology Domain (IMD) is found in multi-domain proteins, such as IRSp53 and MIM, that act as scaffolding proteins and transducers of a variety of signaling pathways that link membrane dynamics and the underlying actin cytoskeleton. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The I-BAR domain induces membrane protrusions in the opposite direction compared to classical BAR and F-BAR domains, which produce membrane invaginations. BAR domains that also serve as protein interaction domains include those of arfaptin and OPHN1-like proteins, among others, which bind to Rac and Rho GAP domains, respectively." Q#2415 - CGI_10021472 superfamily 243095 523 708 1.84E-87 281.232 cl02570 RhoGAP superfamily - - "RhoGAP: GTPase-activator protein (GAP) for Rho-like GTPases; GAPs towards Rho/Rac/Cdc42-like small GTPases. Small GTPases (G proteins) cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when bound to GDP. The Rho family of small G proteins, which includes Cdc42Hs, activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. G proteins generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. The RhoGAPs are one of the major classes of regulators of Rho G proteins." 
Q#2415 - CGI_10021472 superfamily 247683 762 814 2.57E-19 83.9901 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#2416 - CGI_10021473 superfamily 151051 18 111 1.38E-13 69.9627 cl11131 Nrf1_DNA-bind superfamily N - "NLS-binding and DNA-binding and dimerisation domains of Nrf1; In Drosophila, the erect wing (ewg) protein is required for proper development of the central nervous system and the indirect flight muscles. The fly ewg gene encodes a novel DNA-binding domain that is also found in four genes previously identified in sea urchin, chicken, zebrafish, and human. Nuclear respiratory factor-1 is a transcriptional activator that has been implicated in the nuclear control of respiratory chain expression in vertebrates. The first 26 amino acids of nuclear respiratory factor-1 are required for the binding of dynein light chain. The interaction with dynein light chain is observed for both ewg and Nrf-1, transcription factors that are structurally and functionally similar between humans and Drosophila. 
The highest level of expression of both ewg and Nrf-1 was found in the central nervous system, somites, first branchial arch, optic vesicle, and otic vesicle. In the mouse Nrf-1 protein, there is also an NLS domain at 88-116, and a DNA binding and dimerisation domain at 127-282. Ewg is a site-specific transcriptional activator, and evolutionarily conserved regions of ewg contribute both positively and negatively to transcriptional activity." Q#2417 - CGI_10021474 superfamily 151051 21 113 2.44E-16 78.4371 cl11131 Nrf1_DNA-bind superfamily N - "NLS-binding and DNA-binding and dimerisation domains of Nrf1; In Drosophila, the erect wing (ewg) protein is required for proper development of the central nervous system and the indirect flight muscles. The fly ewg gene encodes a novel DNA-binding domain that is also found in four genes previously identified in sea urchin, chicken, zebrafish, and human. Nuclear respiratory factor-1 is a transcriptional activator that has been implicated in the nuclear control of respiratory chain expression in vertebrates. The first 26 amino acids of nuclear respiratory factor-1 are required for the binding of dynein light chain. The interaction with dynein light chain is observed for both ewg and Nrf-1, transcription factors that are structurally and functionally similar between humans and Drosophila. The highest level of expression of both ewg and Nrf-1 was found in the central nervous system, somites, first branchial arch, optic vesicle, and otic vesicle. In the mouse Nrf-1 protein, there is also an NLS domain at 88-116, and a DNA binding and dimerisation domain at 127-282. Ewg is a site-specific transcriptional activator, and evolutionarily conserved regions of ewg contribute both positively and negatively to transcriptional activity." 
Q#2419 - CGI_10021476 superfamily 243066 18 123 2.37E-24 98.0733 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#2419 - CGI_10021476 superfamily 198867 132 231 2.51E-20 86.4435 cl06652 BACK superfamily - - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation)." Q#2419 - CGI_10021476 superfamily 243146 424 469 5.17E-13 64.5019 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#2419 - CGI_10021476 superfamily 243146 365 409 7.68E-13 63.8346 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#2419 - CGI_10021476 superfamily 243146 472 517 2.28E-09 54.1015 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. 
" Q#2419 - CGI_10021476 superfamily 243146 507 545 6.29E-08 49.5822 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#2419 - CGI_10021476 superfamily 243146 340 376 1.60E-05 42.9307 cl02701 Kelch_3 superfamily N - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#2419 - CGI_10021476 superfamily 243146 283 327 3.58E-05 41.7751 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#2420 - CGI_10021477 superfamily 241599 240 289 4.75E-06 43.3861 cl00084 homeodomain superfamily C - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#2421 - CGI_10021478 superfamily 241599 528 577 2.11E-05 42.6157 cl00084 homeodomain superfamily C - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#2421 - CGI_10021478 superfamily 245201 142 285 2.44E-12 65.3357 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." 
Q#2421 - CGI_10021478 superfamily 245201 212 387 0.00869943 37.0299 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#2422 - CGI_10021479 superfamily 247744 388 506 2.51E-38 139.59 cl17190 NK superfamily - - "Nucleoside/nucleotide kinase (NK) is a protein superfamily consisting of multiple families of enzymes that share structural similarity and are functionally related to the catalysis of the reversible phosphate group transfer from nucleoside triphosphates to nucleosides/nucleotides, nucleoside monophosphates, or sugars. Members of this family play a wide variety of essential roles in nucleotide metabolism, the biosynthesis of coenzymes and aromatic compounds, as well as the metabolism of sugar and sulfate." Q#2422 - CGI_10021479 superfamily 246925 116 306 0.00399139 38.8758 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." 
Q#2423 - CGI_10021480 superfamily 243205 1 348 0 615.863 cl02823 phosphagen_kinases superfamily - - "Phosphagen (guanidino) kinases; Phosphagen (guanidino) kinases are enzymes that transphosphorylate a high energy phosphoguanidino compound, like phosphocreatine (PCr) in the case of creatine kinase (CK) or phosphoarginine in the case of arginine kinase, which is used as an energy-storage and -transport metabolite, to ADP, thereby creating ATP. The substrate binding site is located in the cleft between the N and C-terminal domains, but most of the catalytic residues are found in the larger C-terminal domain. In higher eukaryotes, CK exists in tissue-specific (muscle, brain), as well as compartment-specific (mitochondrial and cytosolic) isoforms. They are either coupled to glycolysis (cytosolic form) or oxidative phosphorylation (mitochondrial form). Besides CK and AK, the most studied members of this family are also other phosphagen kinases with different substrate specificities, like glycocyamine kinase (GK), lombricine kinase (LK), taurocyamine kinase (TK) and hypotaurocyamine kinase (HTK). The majority of bacterial phosphagen kinases appear to lack the N-terminal domain and have not been functionally characterized." Q#2424 - CGI_10021481 superfamily 243205 6 349 0 581.966 cl02823 phosphagen_kinases superfamily - - "Phosphagen (guanidino) kinases; Phosphagen (guanidino) kinases are enzymes that transphosphorylate a high energy phosphoguanidino compound, like phosphocreatine (PCr) in the case of creatine kinase (CK) or phosphoarginine in the case of arginine kinase, which is used as an energy-storage and -transport metabolite, to ADP, thereby creating ATP. The substrate binding site is located in the cleft between the N and C-terminal domains, but most of the catalytic residues are found in the larger C-terminal domain. In higher eukaryotes, CK exists in tissue-specific (muscle, brain), as well as compartment-specific (mitochondrial and cytosolic) isoforms. 
They are either coupled to glycolysis (cytosolic form) or oxidative phosphorylation (mitochondrial form). Besides CK and AK, the most studied members of this family are also other phosphagen kinases with different substrate specificities, like glycocyamine kinase (GK), lombricine kinase (LK), taurocyamine kinase (TK) and hypotaurocyamine kinase (HTK). The majority of bacterial phosphagen kinases appear to lack the N-terminal domain and have not been functionally characterized." Q#2425 - CGI_10021482 superfamily 243205 31 98 6.74E-07 44.9976 cl02823 phosphagen_kinases superfamily C - "Phosphagen (guanidino) kinases; Phosphagen (guanidino) kinases are enzymes that transphosphorylate a high energy phosphoguanidino compound, like phosphocreatine (PCr) in the case of creatine kinase (CK) or phosphoarginine in the case of arginine kinase, which is used as an energy-storage and -transport metabolite, to ADP, thereby creating ATP. The substrate binding site is located in the cleft between the N and C-terminal domains, but most of the catalytic residues are found in the larger C-terminal domain. In higher eukaryotes, CK exists in tissue-specific (muscle, brain), as well as compartment-specific (mitochondrial and cytosolic) isoforms. They are either coupled to glycolysis (cytosolic form) or oxidative phosphorylation (mitochondrial form). Besides CK and AK, the most studied members of this family are also other phosphagen kinases with different substrate specificities, like glycocyamine kinase (GK), lombricine kinase (LK), taurocyamine kinase (TK) and hypotaurocyamine kinase (HTK). The majority of bacterial phosphagen kinases appear to lack the N-terminal domain and have not been functionally characterized." Q#2426 - CGI_10021483 superfamily 202405 16 50 5.86E-18 70.9999 cl03723 ATP-gua_PtransN superfamily C - "ATP:guanido phosphotransferase, N-terminal domain; The N-terminal domain has an all-alpha fold." 
Q#2429 - CGI_10021486 superfamily 149426 64 212 2.67E-23 94.3853 cl18038 SEFIR superfamily - - "SEFIR domain; This family comprises IL17 receptors (IL17Rs) and SEF proteins. The latter are feedback inhibitors of FGF signalling and are also thought to be receptors. Due to its similarity to the TIR domain (pfam01582), the SEFIR region is thought to be involved in homotypic interactions with other SEFIR/TIR-domain-containing proteins. Thus, SEFs and IL17Rs may be involved in TOLL/IL1R-like signalling pathways." Q#2431 - CGI_10021488 superfamily 248097 16 151 2.76E-15 68.0977 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#2432 - CGI_10021489 superfamily 248097 7 111 1.64E-13 67.2902 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#2432 - CGI_10021489 superfamily 248097 249 370 8.01E-08 50.3414 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#2432 - CGI_10021489 superfamily 248097 118 239 9.26E-08 50.3414 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#2432 - CGI_10021489 superfamily 248097 394 496 2.76E-05 42.6745 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#2433 - CGI_10021490 superfamily 203591 2 135 1.93E-30 117.087 cl06275 DUF1399 superfamily - - Protein of unknown function (DUF1399); This family represents a conserved region approximately 150 residues long within a number of hypothetical plant proteins of unknown function. Q#2434 - CGI_10021491 superfamily 243096 732 924 5.09E-23 98.524 cl02571 RhoGEF superfamily - - Guanine nucleotide exchange factor for Rho/Rac/Cdc42-like GTPases; Also called Dbl-homologous (DH) domain. 
It appears that PH domains invariably occur C-terminal to RhoGEF/DH domains. Q#2434 - CGI_10021491 superfamily 247725 914 1089 7.34E-17 80.3001 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting proteins to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and other proteins." Q#2435 - CGI_10021492 superfamily 248028 9 265 1.79E-56 183.859 cl17474 Steroid_dh superfamily - - "3-oxo-5-alpha-steroid 4-dehydrogenase; This family consists of 3-oxo-5-alpha-steroid 4-dehydrogenases, EC:1.3.99.5. Also known as Steroid 5-alpha-reductase, the reaction catalyzed by this enzyme is: 3-oxo-5-alpha-steroid + acceptor <=> 3-oxo-delta(4)-steroid + reduced acceptor. The Steroid 5-alpha-reductase enzyme is responsible for the formation of dihydrotestosterone; this hormone promotes the differentiation of male external genitalia and the prostate during fetal development. In humans mutations in this enzyme can cause a form of male pseudohermaphroditism in which the external genitalia and prostate fail to develop normally. A related enzyme also found in plants is DET2, a steroid reductase from Arabidopsis. Mutations in this enzyme cause defects in light-regulated development." Q#2436 - CGI_10021493 superfamily 248097 104 228 5.90E-16 71.1422 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system.
Q#2437 - CGI_10021494 superfamily 245367 59 156 0.00899675 38.2648 cl10727 DUF2042 superfamily NC - Uncharacterized conserved protein (DUF2042); This entry is the conserved N-terminal 300 residues of a group of proteins found from protozoa to Humans. The function is unknown. Q#2439 - CGI_10021496 superfamily 207657 47 96 0.00345015 34.4075 cl02577 HALZ superfamily - - Homeobox associated leucine zipper; Homeobox associated leucine zipper. Q#2443 - CGI_10021500 superfamily 241563 61 100 0.00059485 39.3848 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#2444 - CGI_10021501 superfamily 222429 17 95 4.23E-09 51.8576 cl18676 Myb_DNA-bind_5 superfamily - - Myb/SANT-like DNA-binding domain; This presumed domain appears to be related to other Myb/SANT like DNA binding domains. This family is greatly expanded in arthropods and higher eukaryotes. Q#2446 - CGI_10001322 superfamily 247637 306 448 5.79E-63 208.65 cl16912 MDR superfamily C - "Medium chain reductase/dehydrogenase (MDR)/zinc-dependent alcohol dehydrogenase-like family; The medium chain reductase/dehydrogenases (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH. MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P) binding-Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES. 
The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH) , quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones. ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines. Other MDR members have only a catalytic zinc, and some contain no coordinated zinc." Q#2446 - CGI_10001322 superfamily 247746 112 187 0.000912021 38.693 cl17192 ATP-synt_B superfamily NC - "ATP synthase B/B' CF(0); Part of the CF(0) (base unit) of the ATP synthase. The base unit is thought to translocate protons through membrane (inner membrane in mitochondria, thylakoid membrane in plants, cytoplasmic membrane in bacteria). The B subunits are thought to interact with the stalk of the CF(1) subunits. This domain should not be confused with the ab CF(1) proteins (in the head of the ATP synthase) which are found in pfam00006" Q#2447 - CGI_10001390 superfamily 245847 28 75 7.46E-05 37.0968 cl12042 FA58C superfamily C - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." 
Q#2449 - CGI_10000765 superfamily 241746 978 1041 0.00493518 37.1357 cl00277 Restriction_endonuclease_like superfamily N - "Superfamily of nucleases including Short Patch Repair (Vsr) Endonucleases, archaeal Holliday junction resolvases, MutH methyl-directed DNA mismatch-repair endonucleases, and catalytic domains of many restriction endonucleases, such as EcoRI, BamHI, and FokI" Q#2451 - CGI_10001595 superfamily 243362 249 290 0.00329417 36.6343 cl03262 DnaJ_C superfamily N - C-terminal substrate binding domain of DnaJ and HSP40; The C-terminal region of the DnaJ/Hsp40 protein mediates oligomerization and binding to denatured polypeptide substrate. DnaJ/Hsp40 is a widely conserved heat-shock protein. It prevents the aggregation of unfolded substrate and forms a ternary complex with both substrate and DnaK/Hsp70; the N-terminal J-domain of DnaJ/Hsp40 stimulates the ATPase activity of DnaK/Hsp70. Q#2451 - CGI_10001595 superfamily 110440 371 397 0.00509269 34.6909 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats."
Q#2452 - CGI_10001596 superfamily 243092 16 87 1.43E-14 67.7452 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#2452 - CGI_10001596 superfamily 222150 118 141 3.63E-05 38.1417 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#2455 - CGI_10001836 superfamily 241832 10 123 6.22E-60 183.494 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. 
The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#2456 - CGI_10001611 superfamily 243124 880 992 3.42E-12 65.5261 cl02648 NIDO superfamily C - Nidogen-like; This is a nidogen-like domain (NIDO) domain and is an extracellular domain found in nidogen and hypothetical proteins of unknown function. Q#2456 - CGI_10001611 superfamily 238012 172 211 0.00288964 36.9486 cl11390 EGF_Lam superfamily C - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#2457 - CGI_10007864 superfamily 216254 64 184 3.71E-23 91.5406 cl08303 Recep_L_domain superfamily - - Receptor L domain; The L domains from these receptors make up the bilobal ligand binding site. Each L domain consists of a single-stranded right hand beta-helix. This Pfam entry is missing the first 50 amino acid residues of the domain. Q#2458 - CGI_10007865 superfamily 247743 380 488 5.11E-05 42.5183 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. 
The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#2458 - CGI_10007865 superfamily 248284 140 178 0.00210238 39.3214 cl17730 GlpB superfamily C - Anaerobic glycerol-3-phosphate dehydrogenase [Amino acid transport and metabolism] Q#2459 - CGI_10007866 superfamily 247792 36 79 1.03E-08 53.2184 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#2459 - CGI_10007866 superfamily 241563 173 212 3.19E-07 48.8227 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#2459 - CGI_10007866 superfamily 150427 788 1015 9.42E-88 288.524 cl18043 DUF2048 superfamily C - Uncharacterized conserved protein (DUF2048); The proteins in this family are conserved from plants to vertebrates. The function is unknown. 
Q#2459 - CGI_10007866 superfamily 150427 1079 1190 5.23E-16 79.3607 cl18043 DUF2048 superfamily N - Uncharacterized conserved protein (DUF2048); The proteins in this family are conserved from plants to vertebrates. The function is unknown. Q#2459 - CGI_10007866 superfamily 128778 219 342 6.38E-11 61.5118 cl17972 BBC superfamily - - B-Box C-terminal domain; Coiled coil region C-terminal to (some) B-Box domains Q#2459 - CGI_10007866 superfamily 110440 511 538 1.22E-07 49.7137 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#2460 - CGI_10007867 superfamily 241745 31 172 2.33E-48 158.024 cl00276 Maf_Ham1 superfamily - - "Maf_Ham1. Maf, a nucleotide binding protein, has been implicated in inhibition of septum formation in eukaryotes, bacteria and archaea. A Ham1-related protein from Methanococcus jannaschii is a novel NTPase that has been shown to hydrolyze nonstandard nucleotides, such as hypoxanthine/xanthine NTP, but not standard nucleotides." Q#2461 - CGI_10007868 superfamily 247068 56 83 0.00473076 33.0558 cl15786 CA_like superfamily N - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium.
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2464 - CGI_10007871 superfamily 241659 272 340 0.000185785 39.1095 cl00175 alpha-crystallin-Hsps_p23-like superfamily - - "alpha-crystallin domain (ACD) found in alpha-crystallin-type small heat shock proteins, and a similar domain found in p23 (a cochaperone for Hsp90) and in other p23-like proteins.; The alpha-crystallin-Hsps_p23-like superfamily includes the alpha-crystallin domain (ACD) of alpha-crystallin-type small heat shock proteins (sHsps) and a similar domain found in p23-like proteins. sHsps are small stress induced proteins with monomeric masses between 12-43 kDa, whose common feature is this ACD. sHsps are generally active as large oligomers consisting of multiple subunits, and are believed to be ATP-independent chaperones that prevent aggregation and are important in refolding in combination with other Hsps. p23 is a cochaperone of the Hsp90 chaperoning pathway. It binds Hsp90 and participates in the folding of a number of Hsp90 clients including the progesterone receptor. p23 also has a passive chaperoning activity. p23 in addition may act as the cytosolic prostaglandin E2 synthase. Included in this superfamily is the p23-like C-terminal CHORD-SGT1 (CS) domain of suppressor of G2 allele of Skp1 (Sgt1) and the p23-like domains of human butyrate-induced transcript 1 (hB-ind1), NUD (nuclear distribution) C, Melusin, and NAD(P)H cytochrome b5 (NCB5) oxidoreductase (OR)." 
Q#2465 - CGI_10007872 superfamily 241563 7 43 0.000291243 38.6144 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#2467 - CGI_10007875 superfamily 217900 2 34 2.15E-13 68.3775 cl04403 APG9 superfamily N - "Autophagy protein Apg9; In yeast, 15 Apg proteins coordinate the formation of autophagosomes. Autophagy is a bulk degradation process induced by starvation in eukaryotic cells. Apg9 plays a direct role in the formation of the cytoplasm to vacuole targeting and autophagic vesicles, possibly serving as a marker for a specialised compartment essential for these vesicle-mediated alternative targeting pathways." Q#2470 - CGI_10020841 superfamily 243038 126 180 0.00135583 38.8609 cl02442 DEP superfamily N - "DEP domain, named after Dishevelled, Egl-10, and Pleckstrin, where this domain was first discovered. The function of this domain is still not clear, but it is believed to be important for the membrane association of the signaling proteins in which it is present. New studies show that the DEP domain of Sst2, a yeast RGS protein is necessary and sufficient for receptor interaction." Q#2473 - CGI_10020844 superfamily 245596 80 324 8.19E-37 134.531 cl11394 Glyco_tranf_GTA_type superfamily - - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. 
To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#2474 - CGI_10020845 superfamily 241578 96 212 1.46E-08 53.3386 cl00057 vWFA superfamily C - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#2474 - CGI_10020845 superfamily 247724 401 452 0.00273701 37.9114 cl17170 Ras_like_GTPase superfamily NC - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily.
The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#2476 - CGI_10020847 superfamily 247744 871 1016 1.90E-31 122.819 cl17190 NK superfamily - - "Nucleoside/nucleotide kinase (NK) is a protein superfamily consisting of multiple families of enzymes that share structural similarity and are functionally related to the catalysis of the reversible phosphate group transfer from nucleoside triphosphates to nucleosides/nucleotides, nucleoside monophosphates, or sugars. Members of this family play a wide variety of essential roles in nucleotide metabolism, the biosynthesis of coenzymes and aromatic compounds, as well as the metabolism of sugar and sulfate." Q#2476 - CGI_10020847 superfamily 241578 188 309 1.08E-11 64.1242 cl00057 vWFA superfamily C - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). 
Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#2476 - CGI_10020847 superfamily 115363 503 567 1.20E-07 50.4482 cl05972 MIB_HERC2 superfamily - - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). Q#2476 - CGI_10020847 superfamily 207713 760 797 1.66E-05 44.253 cl02729 WWE superfamily C - WWE domain; The WWE domain is named after three of its conserved residues and is predicted to mediate specific protein-protein interactions in ubiquitin and ADP ribose conjugation systems. Q#2477 - CGI_10020848 superfamily 245814 199 285 1.92E-05 42.8615 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. 
A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#2477 - CGI_10020848 superfamily 245814 82 175 2.07E-07 48.9227 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#2477 - CGI_10020848 superfamily 245814 319 388 8.90E-06 44.0564 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#2479 - CGI_10020850 superfamily 241607 51 87 8.11E-05 35.7086 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chymotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. 
The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as BM-40 or osteonectin, the Gallus gallus Flik protein, as well as agrin, which has a long array of FS domains. The Kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters), which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#2480 - CGI_10020851 superfamily 242184 1 54 2.90E-27 98.0282 cl00909 Ribosomal_L24e_L24 superfamily - - "Ribosomal protein L24e/L24 is a ribosomal protein found in eukaryotes (L24) and in archaea (L24e, distinct from archaeal L24). L24e/L24 is located on the surface of the large subunit, adjacent to proteins L14 and L3, and near the translation factor binding site. L24e/L24 appears to play a role in the kinetics of peptide synthesis, and may be involved in interactions between the large and small subunits, either directly or through other factors. In mouse, a deletion mutation in L24 has been identified as the cause for the belly spot and tail (Bst) mutation that results in disrupted pigmentation, somitogenesis and retinal cell fate determination. 
L24 may be an important protein in eukaryotic reproduction: in shrimp, L24 expression is elevated in the ovary, suggesting a role in oogenesis, and in Arabidopsis, L24 has been proposed to have a specific function in gynoecium development. No protein with sequence or structural homology to L24e/L24 has been identified in bacteria, but a functionally equivalent protein may exist. Bacterial L19 forms an interprotein beta sheet with L14 that is similar to the L24e/L14 interprotein beta sheet observed in the archaeal L24e structures. Some eukaryotic L24 proteins were initially identified as L30, and this alignment model contains several sequences called L30." Q#2481 - CGI_10020852 superfamily 245342 378 483 1.94E-22 92.797 cl10594 ERCC4 superfamily - - ERCC4 domain; This domain is a family of nucleases. The family includes EME1 which is an essential component of a Holliday junction resolvase. EME1 interacts with MUS81 to form a DNA structure-specific endonuclease. Q#2483 - CGI_10020854 superfamily 219619 118 172 5.41E-19 80.7147 cl18518 Ion_trans_2 superfamily N - Ion channel; This family includes the two membrane helix type ion channels found in bacteria. Q#2483 - CGI_10020854 superfamily 219619 306 383 6.37E-11 57.9879 cl18518 Ion_trans_2 superfamily - - Ion channel; This family includes the two membrane helix type ion channels found in bacteria. Q#2484 - CGI_10020855 superfamily 245847 635 740 1.34E-08 53.5101 cl12042 FA58C superfamily N - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#2485 - CGI_10020856 superfamily 243049 12 70 0.000255959 37.4451 cl02472 IGFBP superfamily C - Insulin-like growth factor binding protein; Insulin-like growth factor binding protein. 
Q#2486 - CGI_10020857 superfamily 217685 75 160 2.54E-27 109.346 cl04225 Cu2_monoox_C superfamily N - "Copper type II ascorbate-dependent monooxygenase, C-terminal domain; The N and C-terminal domains of members of this family adopt the same PNGase F-like fold." Q#2486 - CGI_10020857 superfamily 110440 487 514 5.33E-05 41.6245 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase); proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#2486 - CGI_10020857 superfamily 216290 10 33 0.000196643 40.7346 cl03089 Cu2_monooxygen superfamily N - "Copper type II ascorbate-dependent monooxygenase, N-terminal domain; The N and C-terminal domains of members of this family adopt the same PNGase F-like fold." Q#2486 - CGI_10020857 superfamily 110440 600 628 0.00209911 37.0021 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase); proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#2487 - CGI_10020858 superfamily 241995 4 270 1.71E-80 246.359 cl00635 Ntn_Asparaginase_2_like superfamily - - "Ntn-hydrolase superfamily, L-Asparaginase type 2-like enzymes. 
This family includes Glycosylasparaginase, Taspase 1 and L-Asparaginase type 2 enzymes. Glycosylasparaginase catalyzes the hydrolysis of the glycosylamide bond of asparagine-linked glycoprotein. Taspase1 catalyzes the cleavage of the Mix Lineage Leukemia (MLL) nuclear protein and transcription factor TFIIA. L-Asparaginase type 2 hydrolyzes L-asparagine to L-aspartate and ammonia. The proenzymes of this family undergo autoproteolytic cleavage before a threonine to generate alpha and beta subunits. The threonine becomes the N-terminal residue of the beta subunit and is the catalytic residue." Q#2488 - CGI_10020859 superfamily 241623 81 353 4.27E-153 438.601 cl00119 PI3Kc_like superfamily - - "Phosphoinositide 3-kinase (PI3K)-like family, catalytic domain; The PI3K-like catalytic domain family is part of a larger superfamily that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and RIO kinases. Members of the family include PI3K, phosphoinositide 4-kinase (PI4K), PI3K-related protein kinases (PIKKs), and TRansformation/tRanscription domain-Associated Protein (TRRAP). PI3Ks catalyze the transfer of the gamma-phosphoryl group from ATP to the 3-hydroxyl of the inositol ring of D-myo-phosphatidylinositol (PtdIns) or its derivatives, while PI4K catalyze the phosphorylation of the 4-hydroxyl of PtdIns. PIKKs are protein kinases that catalyze the phosphorylation of serine/threonine residues, especially those that are followed by a glutamine. PI3Ks play an important role in a variety of fundamental cellular processes, including cell motility, the Ras pathway, vesicle trafficking and secretion, immune cell activation and apoptosis. PI4Ks produce PtdIns(4)P, the major precursor to important signaling phosphoinositides. PIKKs have diverse functions including cell-cycle checkpoints, genome surveillance, mRNA surveillance, and translation control." 
Q#2489 - CGI_10020861 superfamily 243034 786 886 0.000739026 39.2856 cl02429 TPR superfamily C - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#2491 - CGI_10020863 superfamily 242536 272 353 5.98E-28 104.871 cl01497 ATase superfamily - - "The DNA repair protein O6-alkylguanine-DNA alkyltransferase (ATase; also known as AGT, AGAT and MGMT) reverses O6-alkylation DNA damage by transferring O6-alkyl adducts to an active site cysteine irreversibly, without inducing DNA strand breaks. ATases are specific for repair of guanines with O6-alkyl adducts, however human ATase is not limited to O6-methylguanine, repairing many other adducts at the O6-position of guanine as well. ATase is widely distributed among species. Most ATases have N- and C-terminal domains. 
The C-terminal domain contains the conserved active-site cysteine motif (PCHR), the O6-alkylguanine binding channel, and the helix-turn-helix (HTH) DNA-binding motif. The active site is located near the recognition helix of the HTH motif. While the C-terminal domain of ATase contains residues that are necessary for DNA binding and alkyl transfer, the function of the N-terminal domain is still unknown. Removal of the N-terminal domain abolishes the activity of the C-terminal domain, suggesting an important structural role for the N-terminal domain in orienting the C-terminal domain for proper catalysis. Some ATase C-terminal domain homologs are either single-domain proteins that lack an N-terminal domain, or have a tryptophan substituted in place of the acceptor cysteine (i.e. the motif PCHR is replaced by PWHR). ATase null mutant mice are viable, fertile, and have a normal lifespan." Q#2491 - CGI_10020863 superfamily 247724 1 130 3.66E-25 99.188 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. 
The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#2492 - CGI_10020864 superfamily 243082 592 751 2.28E-58 202.132 cl02553 Peptidase_C19 superfamily N - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." Q#2492 - CGI_10020864 superfamily 243082 151 354 3.84E-14 73.4639 cl02553 Peptidase_C19 superfamily C - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." 
Q#2493 - CGI_10020865 superfamily 246597 51 149 2.43E-56 179.866 cl13995 MPP_superfamily superfamily N - "metallophosphatase superfamily, metallophosphatase domain; Metallophosphatases (MPPs), also known as metallophosphoesterases, phosphodiesterases (PDEs), binuclear metallophosphoesterases, and dimetal-containing phosphoesterases (DMPs), represent a diverse superfamily of enzymes with a conserved domain containing an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. This superfamily includes: the phosphoprotein phosphatases (PPPs), Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination." Q#2493 - CGI_10020865 superfamily 246597 1 46 8.68E-23 90.8849 cl13995 MPP_superfamily superfamily C - "metallophosphatase superfamily, metallophosphatase domain; Metallophosphatases (MPPs), also known as metallophosphoesterases, phosphodiesterases (PDEs), binuclear metallophosphoesterases, and dimetal-containing phosphoesterases (DMPs), represent a diverse superfamily of enzymes with a conserved domain containing an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. This superfamily includes: the phosphoprotein phosphatases (PPPs), Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). 
The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination." Q#2494 - CGI_10020866 superfamily 243181 171 299 2.11E-30 117.753 cl02783 TopoII_MutL_Trans superfamily - - "MutL_Trans: transducer domain, having a ribosomal S5 domain 2-like fold, conserved in the C-terminal domain of type II DNA topoisomerases (Topo II) and DNA mismatch repair (MutL/MLH1/PMS2) proteins. This transducer domain is homologous to the second domain of the DNA gyrase B subunit, which is known to be important in nucleotide hydrolysis and the transduction of structural signals from ATP-binding site to the DNA breakage/reunion regions of the enzymes. The GyrB dimerizes in response to ATP binding, and is homologous to the N-terminal half of eukaryotic Topo II and the ATPase fragment of MutL. Type II DNA topoisomerases catalyze the ATP-dependent transport of one DNA duplex through another, in the process generating transient double strand breaks via covalent attachments to both DNA strands at the 5' positions. Included in this group are proteins similar to human MLH1 and PMS2. MLH1 forms a heterodimer with PMS2 which functions in meiosis and in DNA mismatch repair (MMR). Cells lacking either hMLH1 or hPMS2 have a strong mutator phenotype and display microsatellite instability (MSI). Mutation in hMLH1 accounts for a large fraction of Lynch syndrome (HNPCC) families." Q#2494 - CGI_10020866 superfamily 244692 611 754 5.78E-38 139.027 cl07336 MutL_C superfamily - - "MutL C terminal dimerisation domain; MutL and MutS are key components of the DNA repair machinery that corrects replication errors. MutS recognises mispaired or unpaired bases in a DNA duplex and in the presence of ATP, recruits MutL to form a DNA signaling complex for repair. The N terminal region of MutL contains the ATPase domain and the C terminal is involved in dimerisation." 
Q#2495 - CGI_10020867 superfamily 243187 668 845 1.91E-115 354.288 cl02789 EFG_like_IV superfamily - - "Elongation Factor G-like domain IV. This family includes the translational elongation factor termed EF-2 (for Archaea and Eukarya) and EF-G (for Bacteria), ribosomal protection proteins that mediate tetracycline resistance and, an evolutionarily conserved U5 snRNP-specific protein (U5-116kD). In complex with GTP, EF-G/EF-2 promotes the translocation step of translation. During translocation the peptidyl-tRNA is moved from the A site to the P site of the small subunit of ribosome and the mRNA is shifted one codon relative to the ribosome. It has been shown that EF-G/EF-2_IV domain mimics the shape of anticodon arm of the tRNA in the structurally homologous ternary complex of Phe-tRNA, EF-Tu (another translational elongation factor) and GTP analog. The tip portion of this domain is found in a position that overlaps the anticodon arm of the A-site tRNA, implying that EF-G/EF-2 displaces the A-site tRNA to the P-site by physical interaction with the anticodon arm." Q#2495 - CGI_10020867 superfamily 247724 140 339 9.81E-105 327.303 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#2495 - CGI_10020867 superfamily 243183 840 919 6.33E-48 165.882 cl02785 Elongation_Factor_C superfamily - - "Elongation factor G C-terminus. This domain includes the carboxyl terminal regions of elongation factors (EFs) bacterial EF-G, eukaryotic and archaeal EF-2 and eukaryotic mitochondrial mtEFG1s and mtEFG2s. This group also includes proteins similar to the ribosomal protection proteins Tet(M) and Tet(O), BipA, LepA and, spliceosomal proteins: human 116kD U5 small nuclear ribonucleoprotein (snRNP) protein (U5-116 kD) and yeast counterpart Snu114p. This domain adopts a ferredoxin-like fold consisting of an alpha-beta sandwich with anti-parallel beta-sheets, resembling the topology of domain III found in the elongation factors EF-G and eukaryotic EF-2, with which it forms the C-terminal block. The two domains however are not superimposable and domain III lacks some of the characteristics of this domain. EF-2/EF-G in complex with GTP, promotes the translocation step of translation. During translocation the peptidyl-tRNA is moved from the A site to the P site, the uncharged tRNA from the P site to the E-site and, the mRNA is shifted one codon relative to the ribosome. Tet(M) and Tet(O) mediate Tc resistance. 
Typical Tcs bind to the ribosome and inhibit the elongation phase of protein synthesis, by inhibiting the occupation of site A by aminoacyl-tRNA. Tet(M) and Tet(O) catalyze the release of tetracycline (Tc) from the ribosome in a GTP-dependent manner. BipA is a highly conserved protein with global regulatory properties in Escherichia coli. Yeast Snu114p is essential for cell viability and for splicing in vivo. Experiments suggest that GTP binding and probably GTP hydrolysis is important for the function of the U5-116 kD/Snu114p. The function of LepA proteins is unknown." Q#2495 - CGI_10020867 superfamily 243185 485 578 6.36E-48 166.215 cl02787 Translation_Factor_II_like superfamily - - "Translation_Factor_II_like: Elongation factor Tu (EF-Tu) domain II-like proteins. Elongation factor Tu consists of three structural domains, this family represents the second domain. Domain II adopts a beta barrel structure and is involved in binding to charged tRNA. Domain II is found in other proteins such as elongation factor G and translation initiation factor IF-2. This group also includes the C2 subdomain of domain IV of IF-2 that has the same fold as domain II of (EF-Tu). Like IF-2 from certain prokaryotes such as Thermus thermophilus, mitochondrial IF-2 lacks domain II, which is thought to be involved in binding of E.coli IF-2 to 30S subunits." Q#2496 - CGI_10020868 superfamily 220393 35 201 4.49E-36 131.729 cl10751 Tmem26 superfamily N - "Transmembrane protein 26; The function of this family of transmembrane proteins has not, as yet, been determined." Q#2497 - CGI_10020869 superfamily 220393 1 246 1.30E-43 153.3 cl10751 Tmem26 superfamily - - "Transmembrane protein 26; The function of this family of transmembrane proteins has not, as yet, been determined." 
Q#2498 - CGI_10020870 superfamily 247907 1482 1662 1.42E-13 70.91 cl17353 LamG superfamily - - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." Q#2498 - CGI_10020870 superfamily 247907 1250 1414 1.93E-11 64.7469 cl17353 LamG superfamily - - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." Q#2498 - CGI_10020870 superfamily 247907 1762 1913 9.07E-11 62.4357 cl17353 LamG superfamily - - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." Q#2498 - CGI_10020870 superfamily 245213 2427 2463 7.58E-09 54.565 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. 
Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#2498 - CGI_10020870 superfamily 245213 1116 1153 2.20E-08 53.4094 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#2498 - CGI_10020870 superfamily 245213 2227 2264 1.22E-07 51.0982 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#2498 - CGI_10020870 superfamily 245213 434 470 1.92E-07 50.3278 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. 
Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#2498 - CGI_10020870 superfamily 245213 472 507 6.59E-07 48.787 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#2498 - CGI_10020870 superfamily 245213 588 624 1.38E-06 48.0166 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#2498 - CGI_10020870 superfamily 245213 626 661 2.19E-06 47.2462 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. 
Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#2498 - CGI_10020870 superfamily 245213 354 390 3.23E-06 46.861 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#2498 - CGI_10020870 superfamily 245213 1990 2027 3.96E-06 46.861 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#2498 - CGI_10020870 superfamily 245213 552 586 2.95E-05 44.1646 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. 
Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#2498 - CGI_10020870 superfamily 245213 891 927 2.97E-05 44.1646 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#2498 - CGI_10020870 superfamily 245213 1078 1114 4.07E-05 43.7794 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#2498 - CGI_10020870 superfamily 245213 510 548 9.95E-05 42.6238 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. 
Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#2498 - CGI_10020870 superfamily 245213 743 797 0.000139244 42.2386 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#2498 - CGI_10020870 superfamily 245213 2388 2419 0.000232505 41.4682 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#2498 - CGI_10020870 superfamily 245213 2149 2185 0.00025437 41.4682 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. 
Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#2498 - CGI_10020870 superfamily 245213 312 348 0.000333049 41.083 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#2498 - CGI_10020870 superfamily 245213 983 1019 0.000568903 40.3126 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#2498 - CGI_10020870 superfamily 245213 1695 1730 0.000753366 39.9274 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. 
Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#2498 - CGI_10020870 superfamily 245213 1156 1199 0.00130842 39.157 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#2498 - CGI_10020870 superfamily 245213 2187 2224 0.00237384 38.3866 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#2498 - CGI_10020870 superfamily 245213 862 889 0.00272194 38.3866 cl09941 EGF_CA superfamily N - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. 
Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#2498 - CGI_10020870 superfamily 245213 2111 2146 0.00685316 37.231 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#2498 - CGI_10020870 superfamily 245213 799 843 0.00566995 37.2264 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#2499 - CGI_10020871 superfamily 243050 106 160 1.39E-30 110.205 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 
They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#2499 - CGI_10020871 superfamily 243050 45 98 4.36E-22 87.0465 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#2499 - CGI_10020871 superfamily 241599 181 239 2.93E-18 76.128 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." 
Q#2500 - CGI_10020872 superfamily 244539 210 593 4.11E-131 392.022 cl06868 FNR_like superfamily - - "Ferredoxin reductase (FNR), an FAD and NAD(P) binding protein, was initially identified as a chloroplast reductase activity, catalyzing the electron transfer from reduced iron-sulfur protein ferredoxin to NADP+ as the final step in the electron transport mechanism of photosystem I. FNR transfers electrons from reduced ferredoxin to FAD (forming FADH2 via a semiquinone intermediate) and then transfers a hydride ion to convert NADP+ to NADPH. FNR has since been shown to utilize a variety of electron acceptors and donors and has a variety of physiological functions including nitrogen assimilation, dinitrogen fixation, steroid hydroxylation, fatty acid metabolism, oxygenase activity, and methane assimilation in many organisms. FNR has an NAD(P)-binding sub-domain of the alpha/beta class and a discrete (usually N-terminal) flavin sub-domain which vary in orientation with respect to the NAD(P) binding domain. The N-terminal moiety may contain a flavin prosthetic group (as in flavoenzymes) or use flavin as a substrate. Because flavins such as FAD can exist in oxidized, semiquinone (one-electron reduced), or fully reduced hydroquinone forms, FNR can interact with one- and two-electron carriers. FNR has a strong preference for NADP(H) vs NAD(H)." Q#2500 - CGI_10020872 superfamily 241863 8 138 4.44E-32 120.96 cl00438 Flavodoxin_2 superfamily - - Flavodoxin-like fold; This family consists of a domain with a flavodoxin-like fold. The family includes bacterial and eukaryotic NAD(P)H dehydrogenase (quinone) EC:1.6.99.2. These enzymes catalyze the NAD(P)H-dependent two-electron reductions of quinones and protect cells against damage by free radicals and reactive oxygen species. This enzyme uses a FAD co-factor. The equation for this reaction is:- NAD(P)H + acceptor <=> NAD(P)(+) + reduced acceptor. This enzyme is also involved in the bioactivation of prodrugs used in chemotherapy. 
The family also includes acyl carrier protein phosphodiesterase EC:3.1.4.14. This enzyme converts holo-ACP to apo-ACP by hydrolytic cleavage of the phosphopantetheine residue from ACP. This family is related to pfam03358 and pfam00258. Q#2501 - CGI_10020873 superfamily 241599 231 289 1.04E-22 89.61 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#2501 - CGI_10020873 superfamily 238076 79 128 2.43E-23 93.2526 cl18938 PAX superfamily N - Paired Box domain Q#2502 - CGI_10020874 superfamily 214531 177 216 4.93E-05 40.6629 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#2502 - CGI_10020874 superfamily 214531 1 30 0.00173166 36.4257 cl18310 LY superfamily N - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#2502 - CGI_10020874 superfamily 205157 377 413 0.00270953 35.5911 cl18264 EGF_3 superfamily - - EGF domain; This family includes a variety of EGF-like domain homologues. This family includes the C-terminal domain of the malaria parasite MSP1 protein. Q#2504 - CGI_10020876 superfamily 218655 126 174 0.00216833 35.2901 cl05269 DUF778 superfamily N - Protein of unknown function (DUF778); This family consists of several eukaryotic proteins of unknown function. 
Q#2505 - CGI_10020877 superfamily 241613 114 147 9.35E-06 42.579 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#2505 - CGI_10020877 superfamily 241578 237 282 1.80E-08 53.1575 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." 
Q#2505 - CGI_10020877 superfamily 214531 320 357 4.65E-06 43.3593 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#2505 - CGI_10020877 superfamily 241613 170 199 0.0057444 34.4898 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#2506 - CGI_10002065 superfamily 241596 69 122 7.22E-09 52.6015 cl00081 HLH superfamily - - "Helix-loop-helix domain, found in specific DNA- binding proteins that act as transcription factors; 60-100 amino acids long. 
A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE (5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#2507 - CGI_10000980 superfamily 241832 6 201 1.64E-81 244.867 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). 
Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#2508 - CGI_10002553 superfamily 247750 355 578 3.27E-122 367.469 cl17196 E1_enzyme_family superfamily C - "Superfamily of activating enzymes (E1) of the ubiquitin-like proteins. This family includes classical ubiquitin-activating enzymes E1, ubiquitin-like (ubl) activating enzymes and other mechanistic homologs, like MoeB, Thif1 and others. The common reaction mechanism catalyzed by MoeB, ThiF and the E1 enzymes begins with a nucleophilic attack of the C-terminal carboxylate of MoaD, ThiS and ubiquitin, respectively, on the alpha-phosphate of an ATP molecule bound at the active site of the activating enzymes, leading to the formation of a high-energy acyladenylate intermediate and subsequently to the formation of a thiocarboxylate at the C termini of MoaD and ThiS." Q#2510 - CGI_10003119 superfamily 243035 101 225 3.38E-24 93.8385 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#2511 - CGI_10015594 superfamily 243109 249 427 1.06E-86 273.304 cl02614 SPRY superfamily - - "SPRY domain; SPRY domains, first identified in the SP1A kinase of Dictyostelium and rabbit Ryanodine receptor (hence the name), are homologous to B30.2. SPRY domains have been identified in at least 11 protein families, covering a wide range of functions, including regulation of cytokine signaling (SOCS), RNA metabolism (DDX1 and hnRNP), immunity to retroviruses (TRIM5alpha), intracellular calcium release (ryanodine receptors or RyR) and regulatory and developmental processes (HERC1 and Ash2L). B30.2 also contains residues in the N-terminus that form a distinct PRY domain structure; i.e. B30.2 domain consists of PRY and SPRY subdomains. B30.2 domains comprise the C-terminus of three protein families: BTNs (receptor glycoproteins of immunoglobulin superfamily); several TRIM proteins (composed of RING/B-box/coiled-coil or RBCC core); Stonutoxin (secreted poisonous protein of the stonefish Synanceia horrida). While SPRY domains are evolutionarily ancient, B30.2 domains are a more recent adaptation where the SPRY/PRY combination is a possible component of immune defense. Mutations found in the SPRY-containing proteins have shown to cause Mediterranean fever and Opitz syndrome." Q#2511 - CGI_10015594 superfamily 207684 7 40 3.23E-10 56.6183 cl02640 SAP superfamily - - "SAP domain; The SAP (after SAF-A/B, Acinus and PIAS) motif is a putative DNA/RNA binding domain found in diverse nuclear and cytoplasmic proteins." Q#2511 - CGI_10015594 superfamily 247807 462 582 0.00376167 36.5042 cl17253 AAA_17 superfamily - - AAA domain; AAA domain. 
Q#2512 - CGI_10015595 superfamily 241619 21 94 0.000806592 36.4028 cl00112 PAN_APPLE superfamily - - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#2513 - CGI_10015596 superfamily 248032 12 194 1.49E-61 191.277 cl17478 NTPase_1 superfamily - - "NTPase; This domain is found across all species from bacteria to human, and the function was determined first in a hyperthermophilic bacterium to be an NTPase. The structure of one member-sequence represents a variation of the RecA fold, and implies that the function might be that of a DNA/RNA modifying enzyme. The sequence carries both a Walker A and Walker B motif which together are characteristic of ATPases or GTPases. The protein exhibits an increased expression profile in human liver cholangiocarcinoma when compared to normal tissue." Q#2514 - CGI_10015597 superfamily 246925 85 192 0.000260673 40.0314 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." 
Q#2515 - CGI_10015598 superfamily 241642 154 213 2.05E-12 59.4314 cl00152 t_SNARE superfamily - - "Soluble NSF (N-ethylmaleimide-sensitive fusion protein)-Attachment protein (SNAP) REceptor domain; these alpha-helical motifs form twisted and parallel heterotetrameric helix bundles; the core complex contains one helix from a protein that is anchored in the vesicle membrane (synaptobrevin), one helix from a protein of the target membrane (syntaxin), and two helices from another protein anchored in the target membrane (SNAP-25); their interaction forms a core which is composed of a polar zero layer, a flanking leucine-zipper layer acts as a water tight shield to isolate ionic interactions in the zero layer from the surrounding solvent" Q#2515 - CGI_10015598 superfamily 241642 33 90 4.94E-07 44.7938 cl00152 t_SNARE superfamily - - "Soluble NSF (N-ethylmaleimide-sensitive fusion protein)-Attachment protein (SNAP) REceptor domain; these alpha-helical motifs form twisted and parallel heterotetrameric helix bundles; the core complex contains one helix from a protein that is anchored in the vesicle membrane (synaptobrevin), one helix from a protein of the target membrane (syntaxin), and two helices from another protein anchored in the target membrane (SNAP-25); their interaction forms a core which is composed of a polar zero layer, a flanking leucine-zipper layer acts as a water tight shield to isolate ionic interactions in the zero layer from the surrounding solvent" Q#2515 - CGI_10015598 superfamily 216143 100 152 2.47E-11 56.4016 cl02979 SNAP-25 superfamily - - SNAP-25 family; SNAP-25 (synaptosome-associated protein 25 kDa) proteins are components of SNARE complexes. Members of this family contain a cluster of cysteine residues that can be palmitoylated for membrane attachment. 
Q#2516 - CGI_10015600 superfamily 241642 115 156 3.30E-06 45.5642 cl00152 t_SNARE superfamily C - "Soluble NSF (N-ethylmaleimide-sensitive fusion protein)-Attachment protein (SNAP) REceptor domain; these alpha-helical motifs form twisted and parallel heterotetrameric helix bundles; the core complex contains one helix from a protein that is anchored in the vesicle membrane (synaptobrevin), one helix from a protein of the target membrane (syntaxin), and two helices from another protein anchored in the target membrane (SNAP-25); their interaction forms a core which is composed of a polar zero layer, a flanking leucine-zipper layer acts as a water tight shield to isolate ionic interactions in the zero layer from the surrounding solvent" Q#2516 - CGI_10015600 superfamily 241642 10 51 1.14E-05 44.0234 cl00152 t_SNARE superfamily N - "Soluble NSF (N-ethylmaleimide-sensitive fusion protein)-Attachment protein (SNAP) REceptor domain; these alpha-helical motifs form twisted and parallel heterotetrameric helix bundles; the core complex contains one helix from a protein that is anchored in the vesicle membrane (synaptobrevin), one helix from a protein of the target membrane (syntaxin), and two helices from another protein anchored in the target membrane (SNAP-25); their interaction forms a core which is composed of a polar zero layer, a flanking leucine-zipper layer acts as a water tight shield to isolate ionic interactions in the zero layer from the surrounding solvent" Q#2516 - CGI_10015600 superfamily 216143 61 113 1.47E-07 49.468 cl02979 SNAP-25 superfamily - - SNAP-25 family; SNAP-25 (synaptosome-associated protein 25 kDa) proteins are components of SNARE complexes. Members of this family contain a cluster of cysteine residues that can be palmitoylated for membrane attachment. 
Q#2516 - CGI_10015600 superfamily 246968 450 519 0.00211833 37.7248 cl15456 ADAM_CR superfamily - - ADAM cysteine-rich; ADAMs are membrane-anchored proteases that proteolytically modify cell surface and extracellular matrix (ECM) in order to alter cell behaviour. It has been shown that the cysteine-rich domain of ADAM13 regulates the protein's metalloprotease activity. Q#2516 - CGI_10015600 superfamily 221119 573 603 0.00819525 35.396 cl12993 Sugarporin_N superfamily N - "Maltoporin periplasmic N-terminal extension; This domain would appear to be the periplasmic, N-terminal extension of the outer membrane maltoporins, pfam02264, LamB." Q#2517 - CGI_10015601 superfamily 147609 13 178 2.07E-24 93.9603 cl05205 p25-alpha superfamily - - "p25-alpha; This family encodes a 25 kDa protein that is phosphorylated by a Ser/Thr-Pro kinase. It has been described as a brain specific protein, but it is found in Tetrahymena thermophila." Q#2519 - CGI_10015603 superfamily 216939 110 152 3.12E-07 44.9613 cl03492 PC4 superfamily N - Transcriptional Coactivator p15 (PC4); p15 has a bipartite structure composed of an amino-terminal regulatory domain and a carboxy-terminal cryptic DNA-binding domain. The DNA-binding activity of the carboxy-terminal is disguised by the amino-terminal p15 domain. Activity is controlled by protein kinases that target the regulatory domain. Q#2519 - CGI_10015603 superfamily 216939 2 80 4.99E-06 41.8797 cl03492 PC4 superfamily - - Transcriptional Coactivator p15 (PC4); p15 has a bipartite structure composed of an amino-terminal regulatory domain and a carboxy-terminal cryptic DNA-binding domain. The DNA-binding activity of the carboxy-terminal is disguised by the amino-terminal p15 domain. Activity is controlled by protein kinases that target the regulatory domain. 
Q#2523 - CGI_10015607 superfamily 147119 186 379 2.65E-90 285.233 cl04762 SMK-1 superfamily - - "Component of IIS longevity pathway SMK-1; SMK-1 is a component of the IIs longevity pathway which regulates aging in C.elegans. Specifically, SMK-1 influences DAF-16-dependant regulation of the aging process by regulating the transcriptional specificity of DAF-16 activity. SMK-1 plays a role in longevity by modulating the transcriptional specificity of DAF-16." Q#2523 - CGI_10015607 superfamily 247725 25 107 0.000125274 41.2848 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#2524 - CGI_10015608 superfamily 217584 35 157 8.24E-32 115.917 cl04100 MOSC_N superfamily - - "MOSC N-terminal beta barrel domain; This domain is found to the N-terminus of pfam03473. The function of this domain is unknown, however it is predicted to adopt a beta barrel fold." Q#2524 - CGI_10015608 superfamily 217583 190 315 1.35E-07 48.5131 cl04097 MOSC superfamily - - "MOSC domain; The MOSC (MOCO sulfurase C-terminal) domain is a superfamily of beta-strand-rich domains identified in the molybdenum cofactor sulfurase and several other proteins from both prokaryotes and eukaryotes. These MOSC domains contain an absolutely conserved cysteine and occur either as stand-alone forms, or fused to other domains such as NifS-like catalytic domain in Molybdenum cofactor sulfurase. 
The MOSC domain is predicted to be a sulfur-carrier domain that receives sulfur abstracted by the pyridoxal phosphate-dependent NifS-like enzymes, on its conserved cysteine, and delivers it for the formation of diverse sulfur-metal clusters." Q#2525 - CGI_10015609 superfamily 243065 313 468 3.94E-25 105.562 cl02516 VWD superfamily - - von Willebrand factor type D domain; Luciferin-2-monooxygenase from Vargula hilgendorfii contains a vwd domain. Its function is unrelated but the similarity is very strong by several methods. Q#2525 - CGI_10015609 superfamily 243065 1667 1820 1.01E-24 104.446 cl02516 VWD superfamily - - von Willebrand factor type D domain; Luciferin-2-monooxygenase from Vargula hilgendorfii contains a vwd domain. Its function is unrelated but the similarity is very strong by several methods. Q#2525 - CGI_10015609 superfamily 243065 756 911 1.02E-24 104.406 cl02516 VWD superfamily - - von Willebrand factor type D domain; Luciferin-2-monooxygenase from Vargula hilgendorfii contains a vwd domain. Its function is unrelated but the similarity is very strong by several methods. Q#2525 - CGI_10015609 superfamily 243065 2114 2270 1.10E-24 104.406 cl02516 VWD superfamily - - von Willebrand factor type D domain; Luciferin-2-monooxygenase from Vargula hilgendorfii contains a vwd domain. Its function is unrelated but the similarity is very strong by several methods. Q#2525 - CGI_10015609 superfamily 243065 1212 1367 1.57E-24 104.021 cl02516 VWD superfamily - - von Willebrand factor type D domain; Luciferin-2-monooxygenase from Vargula hilgendorfii contains a vwd domain. Its function is unrelated but the similarity is very strong by several methods. Q#2525 - CGI_10015609 superfamily 216897 114 191 2.10E-23 97.752 cl03463 Gal_Lectin superfamily - - Galactose binding lectin domain; Galactose binding lectin domain. 
Q#2525 - CGI_10015609 superfamily 243093 205 283 5.69E-10 59.081 cl02568 WSC superfamily - - WSC domain; This domain may be involved in carbohydrate binding. Q#2525 - CGI_10015609 superfamily 244710 1420 1494 7.44E-06 46.5647 cl07383 C8 superfamily - - "C8 domain; This domain contains 8 conserved cysteine residues, but this family only contains 7 of them to overlaps with other domains. It is found in disease-related proteins including von Willebrand factor, Alpha tectorin, Zonadhesin and Mucin. It is often found on proteins containing pfam00094 and pfam01826." Q#2525 - CGI_10015609 superfamily 244710 2319 2397 1.53E-05 45.4091 cl07383 C8 superfamily - - "C8 domain; This domain contains 8 conserved cysteine residues, but this family only contains 7 of them to overlaps with other domains. It is found in disease-related proteins including von Willebrand factor, Alpha tectorin, Zonadhesin and Mucin. It is often found on proteins containing pfam00094 and pfam01826." Q#2525 - CGI_10015609 superfamily 244710 963 1037 3.28E-05 44.6387 cl07383 C8 superfamily - - "C8 domain; This domain contains 8 conserved cysteine residues, but this family only contains 7 of them to overlaps with other domains. It is found in disease-related proteins including von Willebrand factor, Alpha tectorin, Zonadhesin and Mucin. It is often found on proteins containing pfam00094 and pfam01826." Q#2525 - CGI_10015609 superfamily 244710 507 581 4.24E-05 44.2535 cl07383 C8 superfamily - - "C8 domain; This domain contains 8 conserved cysteine residues, but this family only contains 7 of them to overlaps with other domains. It is found in disease-related proteins including von Willebrand factor, Alpha tectorin, Zonadhesin and Mucin. It is often found on proteins containing pfam00094 and pfam01826." Q#2525 - CGI_10015609 superfamily 205157 2497 2523 0.000138186 42.1395 cl18264 EGF_3 superfamily - - EGF domain; This family includes a variety of EGF-like domain homologues. 
This family includes the C-terminal domain of the malaria parasite MSP1 protein. Q#2526 - CGI_10015610 superfamily 248097 234 335 1.59E-12 63.053 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#2529 - CGI_10015613 superfamily 241581 416 509 5.70E-05 44.6846 cl00062 FHA superfamily - - "Forkhead associated domain (FHA); found in eukaryotic and prokaryotic proteins. Putative nuclear signalling domain. FHA domains may bind phosphothreonine, phosphoserine and sometimes phosphotyrosine. In eukaryotes, many FHA domain-containing proteins localize to the nucleus, where they participate in establishing or maintaining cell cycle checkpoints, DNA repair, or transcriptional regulation. Members of the FHA family include: Dun1, Rad53, Cds1, Mek1, KAPP(kinase-associated protein phosphatase),and Ki-67 (a human nuclear protein related to cell proliferation)." Q#2529 - CGI_10015613 superfamily 241754 14 322 2.35E-132 423.261 cl00286 Motor_domain superfamily - - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. Q#2530 - CGI_10015614 superfamily 247743 49 82 0.00670405 32.942 cl17189 AAA superfamily C - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." 
Q#2535 - CGI_10003444 superfamily 241563 63 100 0.0078761 34.6203 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#2537 - CGI_10003446 superfamily 243689 34 107 7.80E-06 45.6973 cl04271 IBN_N superfamily - - Importin-beta N-terminal domain; Importin-beta N-terminal domain. Q#2537 - CGI_10003446 superfamily 241563 915 956 0.00227919 38.2292 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#2540 - CGI_10003467 superfamily 247805 1238 1387 1.75E-15 76.222 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 
Q#2540 - CGI_10003467 superfamily 247905 1599 1708 1.89E-14 72.6556 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#2540 - CGI_10003467 superfamily 221397 575 1013 5.81E-81 276.124 cl14983 DUF3535 superfamily - - "Domain of unknown function (DUF3535); This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 439 to 459 amino acids in length. This domain is found associated with pfam00271, pfam02985, pfam00176. This domain has two completely conserved residues (P and K) that may be functionally important." Q#2541 - CGI_10003468 superfamily 247044 489 600 6.13E-52 177.415 cl15697 ADF_gelsolin superfamily - - Actin depolymerization factor/cofilin- and gelsolin-like domains; Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. 
Q#2541 - CGI_10003468 superfamily 247044 728 825 1.11E-32 122.743 cl15697 ADF_gelsolin superfamily - - Actin depolymerization factor/cofilin- and gelsolin-like domains; Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. Q#2541 - CGI_10003468 superfamily 247044 616 710 2.19E-10 58.5334 cl15697 ADF_gelsolin superfamily - - Actin depolymerization factor/cofilin- and gelsolin-like domains; Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. Q#2541 - CGI_10003468 superfamily 246925 71 283 0.000114552 43.8834 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#2542 - CGI_10003469 superfamily 247044 172 272 1.46E-26 101.539 cl15697 ADF_gelsolin superfamily - - Actin depolymerization factor/cofilin- and gelsolin-like domains; Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. 
Q#2542 - CGI_10003469 superfamily 247044 65 150 7.02E-14 66.6225 cl15697 ADF_gelsolin superfamily - - Actin depolymerization factor/cofilin- and gelsolin-like domains; Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. Q#2542 - CGI_10003469 superfamily 247044 283 381 8.87E-25 96.9828 cl15697 ADF_gelsolin superfamily - - Actin depolymerization factor/cofilin- and gelsolin-like domains; Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. Q#2544 - CGI_10011466 superfamily 241622 458 538 2.59E-12 63.3546 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archebacteria, and eurkayotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. 
PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#2544 - CGI_10011466 superfamily 241622 79 156 3.45E-12 62.9694 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archebacteria, and eurkayotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#2544 - CGI_10011466 superfamily 241622 171 252 1.98E-11 60.6582 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. 
PDZ domains are found in diverse signaling proteins in bacteria, archebacteria, and eurkayotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#2547 - CGI_10011469 superfamily 248097 45 159 2.46E-15 68.4458 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#2549 - CGI_10011471 superfamily 246681 95 164 4.22E-46 157.987 cl14643 SRPBCC superfamily N - "START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC (SRPBCC) ligand-binding domain superfamily; SRPBCC domains have a deep hydrophobic ligand-binding pocket; they bind diverse ligands. Included in this superfamily are the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of mammalian STARD1-STARD15, and the C-terminal catalytic domains of the alpha oxygenase subunit of Rieske-type non-heme iron aromatic ring-hydroxylating oxygenases (RHOs_alpha_C), as well as the SRPBCC domains of phosphatidylinositol transfer proteins (PITPs), Bet v 1 (the major pollen allergen of white birch, Betula verrucosa), CoxG, CalC, and related proteins. Other members of this superfamily include PYR/PYL/RCAR plant proteins, the aromatase/cyclase (ARO/CYC) domains of proteins such as Streptomyces glaucescens tetracenomycin, and the SRPBCC domains of Streptococcus mutans Smu.440 and related proteins." 
Q#2550 - CGI_10011472 superfamily 243110 7 197 1.09E-18 84.4033 cl02616 MACPF superfamily - - "MAC/Perforin domain; The membrane-attack complex (MAC) of the complement system forms transmembrane channels. These channels disrupt the phospholipid bilayer of target cells, leading to cell lysis and death. A number of proteins participate in the assembly of the MAC. Freshly activated C5b binds to C6 to form a C5b-6 complex, then to C7 forming the C5b-7 complex. The C5b-7 complex binds to C8, which is composed of three chains (alpha, beta, and gamma), thus forming the C5b-8 complex. C5b-8 subsequently binds to C9 and acts as a catalyst in the polymerisation of C9. Active MAC has a subunit composition of C5b-C6-C7-C8-C9{n}. Perforin is a protein found in cytolytic T-cell and killer cells. In the presence of calcium, perforin polymerises into transmembrane tubules and is capable of lysing, non-specifically, a variety of target cells. There are a number of regions of similarity in the sequences of complement components C6, C7, C8-alpha, C8-beta, C9 and perforin. The X-ray crystal structure of a MACPF domain reveals that it shares a common fold with bacterial cholesterol dependent cytolysins (pfam01289) such as perfringolysin O. Three key pieces of evidence suggests that MACPF domains and CDCs are homologous: Functional similarity (pore formation), conservation of three glycine residues at a hinge in both families and conservation of a complex core fold." Q#2551 - CGI_10011473 superfamily 241750 93 362 9.02E-41 145.118 cl00281 metallo-dependent_hydrolases superfamily - - "Superfamily of metallo-dependent hydrolases (also called amidohydrolase superfamily) is a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. 
In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. The family includes urease alpha, adenosine deaminase, phosphotriesterase dihydroorotases, allantoinases, hydantoinases, AMP-, adenine and cytosine deaminases, imidazolonepropionase, aryldialkylphosphatase, chlorohydrolases, formylmethanofuran dehydrogenases and others." Q#2552 - CGI_10011474 superfamily 247921 2 54 0.00798565 31.4733 cl17367 Herpes_VP19C superfamily NC - Herpesvirus capsid shell protein VP19C; Herpesvirus capsid shell protein VP19C. Q#2554 - CGI_10011476 superfamily 241563 60 95 0.000129205 40.1552 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#2556 - CGI_10011478 superfamily 241563 60 95 0.000119096 40.1552 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#2556 - CGI_10011478 superfamily 149105 104 227 0.0028165 38.5701 cl12353 TMPIT superfamily C - "TMPIT-like protein; A number of members of this family are annotated as being transmembrane proteins induced by tumour necrosis factor alpha, but no literature was found to support this." Q#2556 - CGI_10011478 superfamily 110440 484 510 0.00662842 34.6909 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. 
The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#2557 - CGI_10011479 superfamily 241563 38 73 7.56E-05 40.5404 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#2559 - CGI_10011481 superfamily 243061 4 105 5.15E-36 125.917 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#2559 - CGI_10011481 superfamily 243061 111 212 2.05E-27 102.805 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#2559 - CGI_10011481 superfamily 243061 211 268 1.44E-17 75.4562 cl02509 SRCR superfamily N - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#2560 - CGI_10011482 superfamily 203101 42 142 2.77E-22 88.9535 cl04785 Popeye superfamily N - Popeye protein conserved region; The function of Popeye proteins is not well understood. They are predominantly expressed in cardiac and skeletal muscle. This family represents a conserved region which includes three potential transmembrane domains. 
Q#2561 - CGI_10011483 superfamily 248012 10 90 9.68E-09 52.5801 cl17458 TIR_2 superfamily C - TIR domain; This is a family of bacterial Toll-like receptors. Q#2561 - CGI_10011483 superfamily 213389 147 316 7.59E-08 50.7507 cl17092 STING_C superfamily - - "C-terminal domain of STING; STING (stimulator of interferon genes, also known as MITA, ERIS, MPYS and TMEM173) is a master regulator that mediates cytokine production in response to microbial invasion by directly sensing bacterial secondary messengers such as the cyclic dinucleotide bis-(3'-5')-cyclic dimeric GMP (c-di-GMP) and leading to the activation of IFN regulatory factor 3 (IRF3) through TANK-binding kinase 1 (TBK1) stimulation. STING is also a signaling adaptor in the IFN response to cytosolic DNA. This detection of foreign materials is the first step to a successful immune response. STING is localized in the ER and comprised of a predicted N-terminal transmembrane region and a C-terminal c-di-GMP binding domain." Q#2562 - CGI_10012020 superfamily 243072 856 981 1.26E-40 147.53 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#2562 - CGI_10012020 superfamily 243072 625 750 4.00E-40 145.989 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#2562 - CGI_10012020 superfamily 243072 955 1080 2.75E-39 143.678 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#2562 - CGI_10012020 superfamily 243072 691 816 6.67E-39 142.523 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#2562 - CGI_10012020 superfamily 243072 545 684 3.91E-31 120.181 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#2562 - CGI_10012020 superfamily 243072 828 859 0.00029369 39.8448 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#2562 - CGI_10012020 superfamily 247755 39 116 0.00134278 40.143 cl17201 ABC_ATPase superfamily C - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#2564 - CGI_10012022 superfamily 222150 129 154 6.49E-05 38.5269 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#2564 - CGI_10012022 superfamily 222150 157 180 0.000225979 36.9861 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#2565 - CGI_10012023 superfamily 222150 312 337 0.000245096 38.5269 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#2565 - CGI_10012023 superfamily 222150 340 363 0.00424187 34.6749 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. 
Q#2566 - CGI_10012024 superfamily 243034 11 114 1.37E-17 80.1167 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#2566 - CGI_10012024 superfamily 243058 819 929 8.21E-05 41.9164 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. 
ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#2566 - CGI_10012024 superfamily 221172 363 508 2.16E-23 98.4662 cl13196 UNC45-central superfamily - - "Myosin-binding striated muscle assembly central; The UNC-45 or small muscle protein 1 of C.elegans is expressed in two forms from different genomic positions in mammals, as a general tissue protein UNC-45a and a specific form Unc-45b expressed only in striated and skeletal muscle. All members carry up to three amino-terminal tetratricopeptide repeat (TPR) domains towards their N-terminal, a UCS domain at the C-terminal that contains a number of Arm repeats pfam00514 and this central region of approximately 400 residues. Both the general form and the muscle form of UNC-45 function in myotube formation through cell fusion. Myofibril formation requires both GC and SM UNC-45, consistent with the fact that the cytoskeleton is necessary for the development and maintenance of organised myofibrils. The S. pombe Rng3p, is crucial for cell shape, normal actin cytoskeleton, and contractile ring assembly, and is essential for assembly of the myosin II-containing progenitors of the contractile ring. Widespread defects in the cytoskeleton are found in null mutants of all three fungal proteins. Mammalian Unc45 is found to act as a specific chaperone during the folding of myosin and the assembly of striated muscle by forming a stable complex with the general chaperone Hsp90. The exact function of this central region is not known." Q#2566 - CGI_10012024 superfamily 243058 640 749 0.00307374 37.294 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. 
An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#2567 - CGI_10012025 superfamily 241863 7 182 4.96E-25 97.0744 cl00438 Flavodoxin_2 superfamily - - Flavodoxin-like fold; This family consists of a domain with a flavodoxin-like fold. The family includes bacterial and eukaryotic NAD(P)H dehydrogenase (quinone) EC:1.6.99.2. These enzymes catalyze the NAD(P)H-dependent two-electron reductions of quinones and protect cells against damage by free radicals and reactive oxygen species. This enzyme uses a FAD co-factor. The equation for this reaction is:- NAD(P)H + acceptor <=> NAD(P)(+) + reduced acceptor. This enzyme is also involved in the bioactivation of prodrugs used in chemotherapy. The family also includes acyl carrier protein phosphodiesterase EC:3.1.4.14. This enzyme converts holo-ACP to apo-ACP by hydrolytic cleavage of the phosphopantetheine residue from ACP. This family is related to pfam03358 and pfam00258. Q#2568 - CGI_10012026 superfamily 243088 8 61 6.75E-06 42.318 cl02563 PX_domain superfamily N - "The Phox Homology domain, a phosphoinositide binding module; The PX domain is a phosphoinositide (PI) binding module involved in targeting proteins to membranes. 
Proteins containing PX domains interact with PIs and have been implicated in highly diverse functions such as cell signaling, vesicular trafficking, protein sorting, lipid modification, cell polarity and division, activation of T and B cells, and cell survival. Many members of this superfamily bind phosphatidylinositol-3-phosphate (PI3P) but in some cases, other PIs such as PI4P or PI(3,4)P2, among others, are the preferred substrates. In addition to protein-lipid interaction, the PX domain may also be involved in protein-protein interaction, as in the cases of p40phox, p47phox, and some sorting nexins (SNXs). The PX domain is conserved from yeast to humans and is found in more than 100 proteins. The majority of PX domain-containing proteins are SNXs, which play important roles in endosomal sorting." Q#2569 - CGI_10012027 superfamily 247057 384 449 2.05E-36 129.786 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#2569 - CGI_10012027 superfamily 218123 95 292 2.85E-75 240.293 cl04559 CP2 superfamily - - CP2 transcription factor; This family represents a conserved region in the CP2 transcription factor family. 
Q#2570 - CGI_10012028 superfamily 199166 474 662 9.69E-33 126.288 cl15308 AMN1 superfamily - - "Antagonist of mitotic exit network protein 1; Amn1 has been functionally characterized in Saccharomyces cerevisiae as a component of the Antagonist of MEN pathway (AMEN). The AMEN network is activated by MEN (mitotic exit network) via an active Cdc14, and in turn switches off MEN. Amn1 constitutes one of the alternative mechanisms by which MEN may be disrupted. Specifically, Amn1 binds Tem1 (Termination of M-phase, a GTPase that belongs to the RAS superfamily), and disrupts its association with Cdc15, the primary downstream target. Amn1 is a leucine-rich repeat (LRR) protein, with 12 repeats in the S. cerevisiae ortholog. As a negative regulator of the signal transduction pathway MEN, overexpression of AMN1 slows the growth of wild type cells. The function of the vertebrate members of this family has not been determined experimentally, they have fewer LRRs that determine the extent of this model." Q#2570 - CGI_10012028 superfamily 199166 372 523 1.14E-18 85.0716 cl15308 AMN1 superfamily N - "Antagonist of mitotic exit network protein 1; Amn1 has been functionally characterized in Saccharomyces cerevisiae as a component of the Antagonist of MEN pathway (AMEN). The AMEN network is activated by MEN (mitotic exit network) via an active Cdc14, and in turn switches off MEN. Amn1 constitutes one of the alternative mechanisms by which MEN may be disrupted. Specifically, Amn1 binds Tem1 (Termination of M-phase, a GTPase that belongs to the RAS superfamily), and disrupts its association with Cdc15, the primary downstream target. Amn1 is a leucine-rich repeat (LRR) protein, with 12 repeats in the S. cerevisiae ortholog. As a negative regulator of the signal transduction pathway MEN, overexpression of AMN1 slows the growth of wild type cells. 
The function of the vertebrate members of this family has not been determined experimentally, they have fewer LRRs that determine the extent of this model." Q#2570 - CGI_10012028 superfamily 243074 302 343 8.80E-13 64.0649 cl02535 F-box-like superfamily - - F-box-like; This is an F-box-like family. Q#2571 - CGI_10012029 superfamily 217210 490 971 4.29E-153 467.525 cl10595 Ald_Xan_dh_C2 superfamily - - Molybdopterin-binding domain of aldehyde dehydrogenase; Molybdopterin-binding domain of aldehyde dehydrogenase. Q#2571 - CGI_10012029 superfamily 243326 375 480 8.32E-31 118.39 cl03161 Ald_Xan_dh_C superfamily - - "Aldehyde oxidase and xanthine dehydrogenase, a/b hammerhead domain; Aldehyde oxidase and xanthine dehydrogenase, a/b hammerhead domain. " Q#2571 - CGI_10012029 superfamily 201981 88 146 2.17E-20 87.536 cl08334 Fer2_2 superfamily C - [2Fe-2S] binding domain; [2Fe-2S] binding domain. Q#2575 - CGI_10011029 superfamily 247068 238 331 9.77E-19 86.2133 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2575 - CGI_10011029 superfamily 247068 746 837 8.95E-17 80.4353 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2575 - CGI_10011029 superfamily 247068 5704 5800 2.97E-16 78.8945 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2575 - CGI_10011029 superfamily 247068 1286 1386 2.69E-15 76.1981 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2575 - CGI_10011029 superfamily 247068 966 1059 4.31E-15 75.4277 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2575 - CGI_10011029 superfamily 247068 1397 1487 3.76E-14 72.7313 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2575 - CGI_10011029 superfamily 247068 2158 2239 9.61E-14 71.5757 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2575 - CGI_10011029 superfamily 247068 3747 3836 1.08E-13 71.5757 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2575 - CGI_10011029 superfamily 247068 3324 3426 1.14E-13 71.5757 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2575 - CGI_10011029 superfamily 247068 1193 1274 1.16E-13 71.5757 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2575 - CGI_10011029 superfamily 247068 2354 2444 1.87E-13 70.8053 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2575 - CGI_10011029 superfamily 247068 5812 5899 3.94E-13 70.0349 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2575 - CGI_10011029 superfamily 247068 2046 2141 1.21E-12 68.4941 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2575 - CGI_10011029 superfamily 247068 4470 4562 3.99E-12 66.9533 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2575 - CGI_10011029 superfamily 247068 1500 1597 6.67E-12 66.1829 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2575 - CGI_10011029 superfamily 247068 4269 4354 7.49E-12 66.1829 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2575 - CGI_10011029 superfamily 247068 5912 6009 1.00E-11 65.7977 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2575 - CGI_10011029 superfamily 247068 2252 2345 1.83E-11 65.0273 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2575 - CGI_10011029 superfamily 247068 3125 3217 2.47E-11 64.6421 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2575 - CGI_10011029 superfamily 247068 4369 4457 2.67E-11 64.2569 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2575 - CGI_10011029 superfamily 247068 2913 3008 6.88E-10 60.4049 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2575 - CGI_10011029 superfamily 247068 3640 3732 9.02E-10 60.0197 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2575 - CGI_10011029 superfamily 247068 2705 2797 1.33E-09 59.2493 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2575 - CGI_10011029 superfamily 247068 3434 3523 1.43E-09 59.2493 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2575 - CGI_10011029 superfamily 247068 6197 6277 1.94E-09 58.8641 cl15786 CA_like superfamily C - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2575 - CGI_10011029 superfamily 247068 3232 3312 2.04E-09 58.8641 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2575 - CGI_10011029 superfamily 247068 5009 5081 2.44E-09 58.4789 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2575 - CGI_10011029 superfamily 247068 3023 3117 3.42E-09 58.0938 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2575 - CGI_10011029 superfamily 247068 5493 5589 1.20E-08 56.553 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2575 - CGI_10011029 superfamily 247068 4165 4251 1.90E-08 55.7826 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2575 - CGI_10011029 superfamily 247068 2808 2886 3.46E-08 55.0122 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2575 - CGI_10011029 superfamily 247068 343 434 1.03E-07 53.8566 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2575 - CGI_10011029 superfamily 247068 3961 4050 1.83E-07 53.0862 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2575 - CGI_10011029 superfamily 247068 4575 4672 2.55E-07 52.701 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2575 - CGI_10011029 superfamily 247068 1933 2034 4.21E-07 51.9306 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2575 - CGI_10011029 superfamily 247068 3535 3627 5.27E-07 51.5454 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2575 - CGI_10011029 superfamily 247068 3858 3942 5.97E-07 51.5454 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2575 - CGI_10011029 superfamily 247068 5385 5463 6.46E-07 51.1602 cl15786 CA_like superfamily C - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2575 - CGI_10011029 superfamily 247068 4779 4870 1.04E-06 50.775 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2575 - CGI_10011029 superfamily 247068 1723 1815 1.12E-06 50.775 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2575 - CGI_10011029 superfamily 247068 2536 2625 1.15E-06 50.3898 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2575 - CGI_10011029 superfamily 247068 665 734 2.16E-06 49.6194 cl15786 CA_like superfamily N - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2575 - CGI_10011029 superfamily 247068 5094 5186 2.22E-06 49.6194 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2575 - CGI_10011029 superfamily 247068 5599 5690 2.87E-06 49.2342 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2575 - CGI_10011029 superfamily 247068 1832 1920 2.88E-06 49.2342 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2575 - CGI_10011029 superfamily 247068 1072 1173 7.67E-06 48.0786 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2575 - CGI_10011029 superfamily 247068 4060 4151 1.54E-05 46.923 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2575 - CGI_10011029 superfamily 247068 41 102 1.64E-05 46.923 cl15786 CA_like superfamily C - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2575 - CGI_10011029 superfamily 247068 4883 4978 3.81E-05 45.7674 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2575 - CGI_10011029 superfamily 247068 549 637 7.91E-05 44.997 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2575 - CGI_10011029 superfamily 247068 1639 1708 0.000134035 44.2266 cl15786 CA_like superfamily N - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2575 - CGI_10011029 superfamily 247068 446 536 0.00014386 44.2266 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2575 - CGI_10011029 superfamily 247068 849 953 0.000365032 42.6858 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2575 - CGI_10011029 superfamily 247068 5278 5371 0.000444486 42.6858 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2575 - CGI_10011029 superfamily 247068 6074 6178 0.00273014 39.9894 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2576 - CGI_10011030 superfamily 241622 988 1051 1.59E-08 53.0781 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either an N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95 (post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#2576 - CGI_10011030 superfamily 247725 1119 1216 4.52E-28 111.251 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting the protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and other proteins." 
Q#2576 - CGI_10011030 superfamily 221808 541 667 6.36E-20 88.9442 cl15119 Tet_JBP superfamily C - "Oxygenase domain of the 2OGFeDO superfamily; A double-stranded beta helix (DSBH) fold domain of the 2-oxoglutarate (2OG)-Fe(II)-dependent dioxygenase (2OGFeDO) superfamily found in various eukaryotes, bacteria and bacteriophages. Members of this family catalyze nucleic acid modifications, such as thymidine hydroxylation during base J synthesis in kinetoplastids, and the conversion of 5 methyl-cytosine (5-mC) to 5-hydroxymethyl-cytosine (hmC), or further oxidation to 5-formylcytosine (5fC) and 5-carboxylcytosine (5caC). Metazoan TET proteins contain a cysteine-rich region inserted into the core of the DSBH fold. Vertebrate TET proteins are oncogenes that are mutated in various myeloid cancers. Fungal and algal versions of this family are linked to a predicted transposase and show lineage-specific expansions." Q#2576 - CGI_10011030 superfamily 243142 817 946 6.67E-12 64.5699 cl02689 RUN superfamily - - "RUN domain; This domain is present in several proteins that are linked to the functions of GTPases in the Rap and Rab families. They could hence play important roles in multiple Ras-like GTPase signalling pathways. The domain comprises six conserved regions, which in some proteins have considerable insertions between them. The domain core is thought to take up a predominantly alpha fold, with basic amino acids in regions A and D possibly playing a functional role in interactions with Ras GTPases." Q#2577 - CGI_10011031 superfamily 204502 20 71 6.23E-22 87.9398 cl11146 GalKase_gal_bdg superfamily - - "Galactokinase galactose-binding signature; This is the highly conserved galactokinase signature sequence which appears to be present in all galactokinases irrespective of how many other ATP binding sites, etc that they carry. The function of this domain appears to be to bind galactose, and the domain is normally at the N-terminus of the enzymes, EC:2.7.1.6. 
This domain is associated with the families GHMP_kinases_C, pfam08544 and GHMP_kinases_N, pfam00288." Q#2577 - CGI_10011031 superfamily 219894 295 378 0.00396558 35.1681 cl08484 GHMP_kinases_C superfamily - - "GHMP kinases C terminal; This family includes homoserine kinases, galactokinases and mevalonate kinases." Q#2578 - CGI_10011032 superfamily 241563 74 115 1.33E-06 44.7776 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#2579 - CGI_10011033 superfamily 241563 75 116 2.46E-06 45.1628 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#2583 - CGI_10011037 superfamily 203773 189 384 1.18E-69 228.784 cl18250 PGAP1 superfamily - - PGAP1-like protein; The sequences found in this family are similar to PGAP1. This is an endoplasmic reticulum membrane protein with a catalytic serine containing motif that is conserved in a number of lipases. PGAP1 functions as a GPI inositol-deacylase; this deacylation is important for the efficient transport of GPI-anchored proteins from the endoplasmic reticulum to the Golgi body. Q#2584 - CGI_10011038 superfamily 241659 188 276 2.63E-23 92.2253 cl00175 alpha-crystallin-Hsps_p23-like superfamily - - "alpha-crystallin domain (ACD) found in alpha-crystallin-type small heat shock proteins, and a similar domain found in p23 (a cochaperone for Hsp90) and in other p23-like proteins.; The alpha-crystallin-Hsps_p23-like superfamily includes the alpha-crystallin domain (ACD) of alpha-crystallin-type small heat shock proteins (sHsps) and a similar domain found in p23-like proteins. 
sHsps are small stress induced proteins with monomeric masses between 12-43 kDa, whose common feature is this ACD. sHsps are generally active as large oligomers consisting of multiple subunits, and are believed to be ATP-independent chaperones that prevent aggregation and are important in refolding in combination with other Hsps. p23 is a cochaperone of the Hsp90 chaperoning pathway. It binds Hsp90 and participates in the folding of a number of Hsp90 clients including the progesterone receptor. p23 also has a passive chaperoning activity. p23 in addition may act as the cytosolic prostaglandin E2 synthase. Included in this superfamily is the p23-like C-terminal CHORD-SGT1 (CS) domain of suppressor of G2 allele of Skp1 (Sgt1) and the p23-like domains of human butyrate-induced transcript 1 (hB-ind1), NUD (nuclear distribution) C, Melusin, and NAD(P)H cytochrome b5 (NCB5) oxidoreductase (OR)." Q#2584 - CGI_10011038 superfamily 206220 6 67 1.39E-18 78.3871 cl16569 Nudc_N superfamily - - N-terminal conserved domain of Nudc; The N-terminus of nuclear distribution gene C homolog (NUDC) proteins contains a highly conserved region consisting of a predicted three helix bundle. In the human homolog this segment has been targeted for structure determination by the Joint Center for Structural Genomics. NUDC forms a complex with other NUD proteins and is involved in several cellular division activities. Recently it was shown that NUDC regulates platelet-activating factor (PAF) acetylhydrolase with PAF being a pro-inflammatory secondary lipidic messenger. Q#2588 - CGI_10011043 superfamily 243072 309 358 0.00536599 35.4371 cl02529 ANK superfamily N - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#2591 - CGI_10011046 superfamily 243072 301 354 0.00357396 36.2075 cl02529 ANK superfamily N - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#2593 - CGI_10011048 superfamily 217293 4 94 3.81E-17 73.4359 cl03788 Neur_chan_LBD superfamily C - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#2595 - CGI_10011050 superfamily 217293 32 231 5.68E-33 123.512 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#2595 - CGI_10011050 superfamily 202474 238 324 1.53E-06 47.6485 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#2596 - CGI_10011051 superfamily 217293 39 234 7.97E-29 111.956 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. 
Q#2596 - CGI_10011051 superfamily 202474 242 326 8.96E-08 51.1153 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#2597 - CGI_10011052 superfamily 219000 211 328 2.23E-05 44.9448 cl05717 Drf_FH3 superfamily C - Diaphanous FH3 Domain; This region is found in the Formin-like and diaphanous proteins. Q#2600 - CGI_10022624 superfamily 241583 121 325 3.75E-41 152.913 cl00064 ZnMc superfamily - - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#2600 - CGI_10022624 superfamily 245321 541 605 1.69E-10 58.7911 cl10507 Disintegrin superfamily - - Disintegrin; Disintegrin. Q#2600 - CGI_10022624 superfamily 128778 647 757 0.00125281 38.7851 cl17972 BBC superfamily - - B-Box C-terminal domain; Coiled coil region C-terminal to (some) B-Box domains Q#2601 - CGI_10022625 superfamily 248019 109 534 5.76E-62 215.905 cl17465 DAGK_cat superfamily - - "Diacylglycerol kinase catalytic domain; Diacylglycerol (DAG) is a second messenger that acts as a protein kinase C activator. The catalytic domain is assumed from the finding of bacterial homologues. YegS is the Escherichia coli protein in this family whose crystal structure reveals an active site in the inter-domain cleft formed by four conserved sequence motifs, revealing a novel metal-binding site. The residues of this site are conserved across the family." 
Q#2602 - CGI_10022626 superfamily 243035 17 92 2.67E-13 62.7788 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#2602 - CGI_10022626 superfamily 243035 97 192 1.01E-11 58.0763 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. 
CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#2603 - CGI_10022627 superfamily 243035 8 112 1.79E-19 78.4305 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. 
Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. 
A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#2604 - CGI_10022628 superfamily 241568 4 60 1.54E-12 59.0136 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#2605 - CGI_10022629 superfamily 241568 6 62 9.14E-09 49.3836 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." 
Q#2606 - CGI_10022630 superfamily 241568 6 57 9.94E-09 49.3836 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#2607 - CGI_10022631 superfamily 241568 7 63 2.36E-05 40.1388 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#2608 - CGI_10022632 superfamily 248097 45 167 2.70E-16 71.1422 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#2609 - CGI_10022633 superfamily 248097 209 331 5.59E-15 69.9866 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#2609 - CGI_10022633 superfamily 248097 97 204 8.06E-11 58.0454 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. 
Q#2610 - CGI_10022634 superfamily 248097 177 302 4.34E-14 66.905 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#2611 - CGI_10022635 superfamily 241758 33 178 3.20E-23 90.507 cl00292 AANH_like superfamily - - "Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases, ATP sulphurylases Universal Stress Response protein and electron transfer flavoprotein (ETF). The domain forms a apha/beta/apha fold which binds to Adenosine nucleotide." Q#2612 - CGI_10022636 superfamily 243092 1251 1537 6.83E-06 48.4852 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#2612 - CGI_10022636 superfamily 243092 1006 1286 1.69E-05 47.3296 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#2612 - CGI_10022636 superfamily 247743 481 598 0.00239653 39.1052 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." 
Q#2613 - CGI_10022637 superfamily 241983 480 646 1.85E-23 101.666 cl00614 ADP_ribosyl_GH superfamily N - "ADP-ribosylglycohydrolase; This family includes enzymes that ADP-ribosylations, for example ADP-ribosylarginine hydrolase EC:3.2.2.19 cleaves ADP-ribose-L-arginine. The family also includes dinitrogenase reductase activating glycohydrolase. Most surprisingly the family also includes jellyfish crystallins, these proteins appear to have lost the presumed active site residues." Q#2613 - CGI_10022637 superfamily 243212 809 940 1.90E-19 86.6289 cl02844 Arrestin_C superfamily - - "Arrestin (or S-antigen), C-terminal domain; Ig-like beta-sandwich fold. Scop reports duplication with N-terminal domain." Q#2613 - CGI_10022637 superfamily 215866 677 786 3.14E-14 71.5876 cl18349 Arrestin_N superfamily N - "Arrestin (or S-antigen), N-terminal domain; Ig-like beta-sandwich fold. Scop reports duplication with C-terminal domain." Q#2613 - CGI_10022637 superfamily 241983 290 379 3.50E-11 63.9162 cl00614 ADP_ribosyl_GH superfamily C - "ADP-ribosylglycohydrolase; This family includes enzymes that ADP-ribosylations, for example ADP-ribosylarginine hydrolase EC:3.2.2.19 cleaves ADP-ribose-L-arginine. The family also includes dinitrogenase reductase activating glycohydrolase. Most surprisingly the family also includes jellyfish crystallins, these proteins appear to have lost the presumed active site residues." Q#2614 - CGI_10022638 superfamily 248305 5 334 2.32E-33 129.78 cl17751 Glyco_transf_22 superfamily - - Alg9-like mannosyltransferase family; Members of this family are mannosyltransferase enzymes. At least some members are localised in endoplasmic reticulum and involved in GPI anchor biosynthesis. Q#2615 - CGI_10022639 superfamily 243362 184 225 0.00766767 35.4787 cl03262 DnaJ_C superfamily N - C-terminal substrate binding domain of DnaJ and HSP40; The C-terminal region of the DnaJ/Hsp40 protein mediates oligomerization and binding to denatured polypeptide substrate. 
DnaJ/Hsp40 is a widely conserved heat-shock protein. It prevents the aggregation of unfolded substrate and forms a ternary complex with both substrate and DnaK/Hsp70; the N-terminal J-domain of DnaJ/Hsp40 stimulates the ATPase activity of DnaK/Hsp70. Q#2617 - CGI_10022641 superfamily 245213 570 605 6.63E-09 53.0242 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#2617 - CGI_10022641 superfamily 245213 607 638 6.30E-06 44.1646 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#2617 - CGI_10022641 superfamily 245213 533 568 1.05E-05 43.7794 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#2617 - CGI_10022641 superfamily 245213 501 529 0.00142501 37.4562 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#2618 - CGI_10022642 superfamily 247794 31 448 0 586.448 cl17240 FDH_GDH_like superfamily - - "Formate/glycerate dehydrogenases, D-specific 2-hydroxy acid dehydrogenases and related dehydrogenases; The formate/glycerate dehydrogenase like family contains a diverse group of enzymes such as formate dehydrogenase (FDH), glycerate dehydrogenase (GDH), D-lactate dehydrogenase, L-alanine dehydrogenase, and S-Adenosylhomocysteine hydrolase, that share a common 2-domain structure. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar domains of the alpha/beta Rossmann fold NAD+ binding form. The NAD(P) binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain. 
Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD(P) is bound, primarily to the C-terminal portion of the 2nd (internal) domain. While many members of this family are dimeric, alanine DH is hexameric and phosphoglycerate DH is tetrameric. 2-hydroxyacid dehydrogenases are enzymes that catalyze the conversion of a wide variety of D-2-hydroxy acids to their corresponding keto acids. The general mechanism is (R)-lactate + acceptor to pyruvate + reduced acceptor. Formate dehydrogenase (FDH) catalyzes the NAD+-dependent oxidation of formate ion to carbon dioxide with the concomitant reduction of NAD+ to NADH. FDHs of this family contain no metal ions or prosthetic groups. Catalysis occurs though direct transfer of a hydride ion to NAD+ without the stages of acid-base catalysis typically found in related dehydrogenases." Q#2618 - CGI_10022642 superfamily 245206 464 586 0.00047313 41.5196 cl09931 NADB_Rossmann superfamily C - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." 
Q#2619 - CGI_10022643 superfamily 245206 17 89 0.00016039 41.5196 cl09931 NADB_Rossmann superfamily C - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#2620 - CGI_10022644 superfamily 218109 358 418 4.16E-07 48.0906 cl12292 Gly_transf_sug superfamily N - "Glycosyltransferase sugar-binding region containing DXD motif; The DXD motif is a short conserved motif found in many families of glycosyltransferases, which add a range of different sugars to other sugars, phosphates and proteins. DXD-containing glycosyltransferases all use nucleoside diphosphate sugars as donors and require divalent cations, usually manganese. The DXD motif is expected to play a carbohydrate binding role in sugar-nucleoside diphosphate and manganese dependent glycosyltransferases." Q#2620 - CGI_10022644 superfamily 217539 385 463 0.000723852 39.9989 cl18414 Nucleotid_trans superfamily NC - Nucleotide-diphospho-sugar transferase; Proteins in this family have been predicted to be nucleotide-diphospho-sugar transferases. 
Q#2621 - CGI_10022645 superfamily 241832 11 74 5.94E-13 60.785 cl00388 Thioredoxin_like superfamily C - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#2622 - CGI_10022646 superfamily 241632 19 379 9.84E-121 386.993 cl00137 SERPIN superfamily - - "SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia." Q#2622 - CGI_10022646 superfamily 247068 1236 1332 7.71E-29 113.948 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2622 - CGI_10022646 superfamily 247068 1027 1125 1.36E-26 107.399 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2622 - CGI_10022646 superfamily 247068 1446 1548 2.32E-26 106.629 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2622 - CGI_10022646 superfamily 247068 1138 1229 5.60E-21 91.2209 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2622 - CGI_10022646 superfamily 247068 504 581 3.20E-20 88.9097 cl15786 CA_like superfamily C - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2622 - CGI_10022646 superfamily 247068 691 821 1.67E-17 80.8205 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2622 - CGI_10022646 superfamily 247068 1341 1438 4.06E-17 79.6649 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2622 - CGI_10022646 superfamily 247068 932 1002 9.62E-10 58.0938 cl15786 CA_like superfamily C - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2622 - CGI_10022646 superfamily 247068 1557 1644 1.21E-09 57.7086 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2622 - CGI_10022646 superfamily 247068 832 911 1.23E-05 45.7674 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2622 - CGI_10022646 superfamily 247068 605 677 1.49E-05 45.3822 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2623 - CGI_10022647 superfamily 241632 19 378 1.20E-132 387.379 cl00137 SERPIN superfamily - - "SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia." Q#2624 - CGI_10022648 superfamily 241632 19 378 7.68E-132 385.067 cl00137 SERPIN superfamily - - "SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. 
Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia." Q#2625 - CGI_10022649 superfamily 241563 73 112 0.000138132 38.6144 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#2626 - CGI_10022650 superfamily 110440 163 187 0.00451422 33.5353 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#2627 - CGI_10022651 superfamily 217252 364 476 6.97E-36 129.222 cl08372 Pyr_redox_dim superfamily - - "Pyridine nucleotide-disulphide oxidoreductase, dimerisation domain; This family includes both class I and class II oxidoreductases and also NADH oxidases and peroxidases." Q#2627 - CGI_10022651 superfamily 215691 188 268 2.92E-13 65.685 cl15766 Pyr_redox superfamily - - Pyridine nucleotide-disulphide oxidoreductase; This family includes both class I and class II oxidoreductases and also NADH oxidases and peroxidases. This domain is actually a small NADH binding domain within a larger FAD binding domain. 
Q#2627 - CGI_10022651 superfamily 183782 6 53 4.58E-05 44.5039 cl18137 PRK12834 superfamily C - putative FAD-binding dehydrogenase; Reviewed Q#2627 - CGI_10022651 superfamily 248054 149 218 0.000719235 39.5931 cl17500 NAD_binding_8 superfamily N - NAD(P)-binding Rossmann-like domain; NAD(P)-binding Rossmann-like domain. Q#2629 - CGI_10022653 superfamily 110440 314 340 0.00646889 34.3057 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#2632 - CGI_10022656 superfamily 241574 90 291 1.51E-20 87.2561 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#2635 - CGI_10022659 superfamily 207637 73 145 7.48E-12 57.9331 cl02541 CIDE_N superfamily - - "CIDE_N domain, found at the N-terminus of the CIDE (cell death-inducing DFF45-like effector) proteins, as well as CAD nuclease (caspase-activated DNase/DNA fragmentation factor, DFF40) and its inhibitor, ICAD(DFF45). 
These proteins are associated with the chromatin condensation and DNA fragmentation events of apoptosis; the CIDE_N domain is thought to regulate the activity of ICAD/DFF45, and the CAD/DFF40 and CIDE nucleases during apoptosis. The CIDE-N domain is also found in the FSP27/CIDE-C protein." Q#2637 - CGI_10022661 superfamily 218284 1 61 3.51E-09 48.7899 cl04786 SOUL superfamily N - SOUL heme-binding protein; This family represents a group of putative heme-binding proteins. Our family includes archaeal and bacterial homologues. Q#2638 - CGI_10022662 superfamily 243092 138 290 0.00958899 36.1588 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#2640 - CGI_10022664 superfamily 245596 1 90 3.94E-27 101.251 cl11394 Glyco_tranf_GTA_type superfamily NC - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#2641 - CGI_10022665 superfamily 217473 105 328 9.42E-27 110.532 cl03978 Mab-21 superfamily - - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#2644 - CGI_10022668 superfamily 147446 3 79 2.15E-32 109.068 cl05015 UPF0197 superfamily - - Uncharacterized protein family (UPF0197); This family of proteins is functionally uncharacterized. 
Q#2645 - CGI_10022669 superfamily 247736 73 122 3.58E-10 52.6632 cl17182 NAT_SF superfamily N - "N-Acyltransferase superfamily: Various enzymes that characteristically catalyze the transfer of an acyl group to a substrate; NAT (N-Acyltransferase) is a large superfamily of enzymes that mostly catalyze the transfer of an acyl group to a substrate and are implicated in a variety of functions, ranging from bacterial antibiotic resistance to circadian rhythms in mammals. Members include GCN5-related N-Acetyltransferases (GNAT) such as Aminoglycoside N-acetyltransferases, Histone N-acetyltransferase (HAT) enzymes, and Serotonin N-acetyltransferase, which catalyze the transfer of an acetyl group to a substrate. The kinetic mechanism of most GNATs involves the ordered formation of a ternary complex: the reaction begins with Acetyl Coenzyme A (AcCoA) binding, followed by binding of substrate, then direct transfer of the acetyl group from AcCoA to the substrate, followed by product and subsequent CoA release. Other family members include Arginine/ornithine N-succinyltransferase, Myristoyl-CoA: protein N-myristoyltransferase, and Acyl-homoserinelactone synthase which have a similar catalytic mechanism but differ in types of acyl groups transferred. Leucyl/phenylalanyl-tRNA-protein transferase and FemXAB nonribosomal peptidyltransferases which catalyze similar peptidyltransferase reactions are also included." Q#2646 - CGI_10022670 superfamily 247736 115 184 7.56E-11 57.2856 cl17182 NAT_SF superfamily - - "N-Acyltransferase superfamily: Various enzymes that characteristically catalyze the transfer of an acyl group to a substrate; NAT (N-Acyltransferase) is a large superfamily of enzymes that mostly catalyze the transfer of an acyl group to a substrate and are implicated in a variety of functions, ranging from bacterial antibiotic resistance to circadian rhythms in mammals. 
Members include GCN5-related N-Acetyltransferases (GNAT) such as Aminoglycoside N-acetyltransferases, Histone N-acetyltransferase (HAT) enzymes, and Serotonin N-acetyltransferase, which catalyze the transfer of an acetyl group to a substrate. The kinetic mechanism of most GNATs involves the ordered formation of a ternary complex: the reaction begins with Acetyl Coenzyme A (AcCoA) binding, followed by binding of substrate, then direct transfer of the acetyl group from AcCoA to the substrate, followed by product and subsequent CoA release. Other family members include Arginine/ornithine N-succinyltransferase, Myristoyl-CoA: protein N-myristoyltransferase, and Acyl-homoserinelactone synthase which have a similar catalytic mechanism but differ in types of acyl groups transferred. Leucyl/phenylalanyl-tRNA-protein transferase and FemXAB nonribosomal peptidyltransferases which catalyze similar peptidyltransferase reactions are also included." Q#2648 - CGI_10022672 superfamily 247727 177 273 4.78E-14 67.455 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#2649 - CGI_10022673 superfamily 215731 59 103 0.00215236 34.8813 cl08245 Gln-synt_C superfamily N - "Glutamine synthetase, catalytic domain; Glutamine synthetase, catalytic domain. 
" Q#2650 - CGI_10022674 superfamily 215731 75 108 0.000104956 38.7333 cl08245 Gln-synt_C superfamily NC - "Glutamine synthetase, catalytic domain; Glutamine synthetase, catalytic domain. " Q#2651 - CGI_10007687 superfamily 245226 498 546 5.34E-07 48.4509 cl10012 DnaQ_like_exo superfamily C - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#2654 - CGI_10007690 superfamily 222150 272 295 7.21E-06 42.7641 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#2654 - CGI_10007690 superfamily 222150 242 269 0.000911481 36.6009 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. 
Q#2654 - CGI_10007690 superfamily 246975 288 308 0.00503953 34.2449 cl15478 zf-C2H2 superfamily - - "Zinc finger, C2H2 type; The C2H2 zinc finger is the classical zinc finger domain. The two conserved cysteines and histidines co-ordinate a zinc ion. The following pattern describes the zinc finger. #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C] Where X can be any amino acid, and numbers in brackets indicate the number of residues. The positions marked # are those that are important for the stable fold of the zinc finger. The final position can be either his or cys. The C2H2 zinc finger is composed of two short beta strands followed by an alpha helix. The amino terminal part of the helix binds the major groove in DNA binding zinc fingers. The accepted consensus binding sequence for Sp1 is usually defined by the asymmetric hexanucleotide core GGGCGG but this sequence does not include, among others, the GAG (=CTC) repeat that constitutes a high-affinity site for Sp1 binding to the wt1 promoter." Q#2656 - CGI_10007692 superfamily 241754 13 721 0 812.267 cl00286 Motor_domain superfamily - - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. Q#2656 - CGI_10007692 superfamily 210118 747 769 0.000535612 39.2311 cl15479 IQ superfamily - - IQ calmodulin-binding motif; Calmodulin-binding motif. Q#2659 - CGI_10006128 superfamily 218286 508 809 8.03E-99 312.486 cl04794 Vps16_C superfamily - - "Vps16, C-terminal region; This protein forms part of the Class C vacuolar protein sorting (Vps) complex. Vps16 is essential for vacuolar protein sorting, which is essential for viability in plants, but not yeast. The Class C Vps complex is required for SNARE-mediated membrane fusion at the lysosome-like yeast vacuole. It is thought to play essential roles in membrane docking and fusion at the Golgi-to-endosome and endosome-to-vacuole stages of transport. 
The role of VPS16 in this complex is not known." Q#2661 - CGI_10006130 superfamily 247725 148 280 1.57E-26 101.646 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#2662 - CGI_10006131 superfamily 245201 213 406 9.56E-41 146.613 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#2662 - CGI_10006131 superfamily 245309 39 117 2.34E-07 48.6444 cl10471 LU superfamily - - "Ly-6 antigen / uPA receptor -like domain; occurs singly in GPI-linked cell-surface glycoproteins (Ly-6 family,CD59, thymocyte B cell antigen, Sgp-2) or as three-fold repeated domain in urokinase-type plasminogen activator receptor. Topology of these domains is similar to that of snake venom neurotoxins." 
Q#2663 - CGI_10006132 superfamily 243100 224 285 8.56E-09 51.0255 cl02576 B_zip1 superfamily - - "basic leucine zipper DNA-binding and multimerization region of GCN4 and related proteins; Basic leucine zipper (bZIP) transcription factors act in networks of homo- and hetero-dimers in the regulation in a diverse set of cellular pathways. Classical leucine zippers have alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization creates a pair of basic regions that bind DNA and undergo conformational change. GCN4 was identified in Saccharomyces cerevisiae from mutations in a deficiency in activation with the general amino acid control pathway. GCN4 encodes a trans-activator of amino acid biosynthetic genes containing 2 acidic activation domains and a C-terminal bZIP domain, comprised of a basic alpha-helical DNA-binding region and a coiled-coil dimerization region." Q#2664 - CGI_10006133 superfamily 245213 52 88 9.85E-06 38.3866 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#2664 - CGI_10006133 superfamily 245213 10 45 3.78E-05 36.8458 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#2667 - CGI_10006138 superfamily 241584 728 821 1.11E-09 57.1211 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#2667 - CGI_10006138 superfamily 241584 949 1050 1.49E-07 50.5727 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#2667 - CGI_10006138 superfamily 243035 5 115 1.05E-06 48.385 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. 
This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. 
In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#2667 - CGI_10006138 superfamily 241584 1059 1149 0.000221568 40.9427 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#2667 - CGI_10006138 superfamily 245814 632 714 3.02E-15 73.1086 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#2667 - CGI_10006138 superfamily 245814 526 612 3.31E-13 67.1452 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. 
The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#2667 - CGI_10006138 superfamily 245814 167 232 1.03E-09 57.2248 cl11960 Ig superfamily N - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#2667 - CGI_10006138 superfamily 245814 244 321 3.62E-09 55.6392 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#2667 - CGI_10006138 superfamily 245814 373 446 1.51E-08 53.6633 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#2667 - CGI_10006138 superfamily 245814 463 512 1.32E-05 44.32 cl11960 Ig superfamily C - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#2669 - CGI_10008263 superfamily 241758 5 151 1.83E-26 97.8258 cl00292 AANH_like superfamily - - "Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases, ATP sulphurylases Universal Stress Response protein and electron transfer flavoprotein (ETF). The domain forms an alpha/beta/alpha fold which binds to Adenosine nucleotide." 
Q#2670 - CGI_10008264 superfamily 241758 26 111 1.52E-12 59.691 cl00292 AANH_like superfamily N - "Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases, ATP sulphurylases Universal Stress Response protein and electron transfer flavoprotein (ETF). The domain forms an alpha/beta/alpha fold which binds to Adenosine nucleotide." Q#2671 - CGI_10008265 superfamily 241758 7 138 9.17E-18 74.7138 cl00292 AANH_like superfamily - - "Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases, ATP sulphurylases Universal Stress Response protein and electron transfer flavoprotein (ETF). The domain forms an alpha/beta/alpha fold which binds to Adenosine nucleotide." Q#2672 - CGI_10008266 superfamily 110440 54 80 0.00896311 29.6833 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#2673 - CGI_10008267 superfamily 241758 160 209 3.39E-14 66.2394 cl00292 AANH_like superfamily N - "Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases, ATP sulphurylases Universal Stress Response protein and electron transfer flavoprotein (ETF). The domain forms an alpha/beta/alpha fold which binds to Adenosine nucleotide." Q#2674 - CGI_10008268 superfamily 241758 825 874 5.25E-13 67.395 cl00292 AANH_like superfamily N - "Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases, ATP sulphurylases Universal Stress Response protein and electron transfer flavoprotein (ETF). 
The domain forms an alpha/beta/alpha fold which binds to Adenosine nucleotide." Q#2675 - CGI_10008269 superfamily 222324 17 83 2.58E-13 60.4834 cl16352 zf-3CxxC superfamily - - Zinc-binding domain; This is a family with several pairs of CxxC motifs possibly representing a multiple zinc-binding region. Only one pair of cysteines is associated with a highly conserved histidine residue. Q#2676 - CGI_10008270 superfamily 241845 110 274 6.34E-39 138.238 cl00407 tRNA_m1G_MT superfamily - - "tRNA (Guanine-1)-methyltransferase; This is a family of tRNA (Guanine-1)-methyltransferases EC:2.1.1.31. In E.coli K12 this enzyme catalyzes the conversion of a guanosine residue to N1-methylguanine in position 37, next to the anticodon, in tRNA." Q#2677 - CGI_10008271 superfamily 243072 300 424 9.50E-37 137.9 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#2677 - CGI_10008271 superfamily 243072 232 358 1.47E-35 134.819 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#2677 - CGI_10008271 superfamily 243072 1267 1394 1.59E-35 134.433 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#2677 - CGI_10008271 superfamily 243072 1374 1496 1.66E-30 120.181 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#2677 - CGI_10008271 superfamily 243072 1436 1562 2.46E-30 119.411 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#2677 - CGI_10008271 superfamily 243072 470 621 6.32E-29 115.559 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#2677 - CGI_10008271 superfamily 243072 142 291 1.85E-26 108.24 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#2677 - CGI_10008271 superfamily 247799 1895 1956 2.99E-14 71.0519 cl17245 KH-I superfamily - - "K homology RNA-binding domain, type I. KH binds single-stranded RNA or DNA. It is found in a wide variety of proteins including ribosomal proteins, transcription factors and post-transcriptional modifiers of mRNA. There are two different KH domains that belong to different protein folds, but they share a single KH motif. The KH motif is folded into a beta alpha alpha beta unit. In addition to the core, type II KH domains (e.g. ribosomal protein S3) include N-terminal extension and type I KH domains (e.g. hnRNP K) contain C-terminal extension." Q#2677 - CGI_10008271 superfamily 243072 1238 1270 1.23E-05 45.2376 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#2678 - CGI_10008272 superfamily 243175 146 255 3.30E-32 116.093 cl02776 GST_C_family superfamily - - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." 
Q#2678 - CGI_10008272 superfamily 241832 58 131 4.45E-27 100.758 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#2679 - CGI_10008273 superfamily 246598 351 507 3.08E-97 294.118 cl13996 MPN superfamily - - "Mpr1p, Pad1p N-terminal (MPN) domains; MPN (also known as Mov34, PAD-1, JAMM, JAB, MPN+) domains are found in the N-terminal termini of proteins with a variety of functions; they are components of the proteasome regulatory subunits, the signalosome (CSN), eukaryotic translation initiation factor 3 (eIF3) complexes, and regulators of transcription factors. These domains are isopeptidases that release ubiquitin from ubiquitinated proteins (thus having deubiquitinating (DUB) activity) that are tagged for degradation. Catalytically active MPN domains contain a metalloprotease signature known as the JAB1/MPN/Mov34 metalloenzyme (JAMM) motif. For example, Rpn11 (also known as POH1 or PSMD14), a subunit of the 19S proteasome lid is involved in the ATP-dependent degradation of ubiquitinated proteins, contains the conserved JAMM motif involved in zinc ion coordination. 
Poh1 is a regulator of c-Jun, an important regulator of cell proliferation, differentiation, survival and death. JAB1 is a component of the COP9 signalosome (CSN), a regulatory particle of the ubiquitin (Ub)/26S proteasome system occurring in all eukaryotic cells; it cleaves the ubiquitin-like protein NEDD8 from the cullin subunit of the SCF (Skp1, Cullins, F-box proteins) family of E3 ubiquitin ligases. AMSH (associated molecule with the SH3 domain of STAM, also known as STAMBP), a member of JAMM/MPN+ deubiquitinases (DUBs), specifically cleaves Lys 63-linked polyubiquitin (poly-Ub) chains, thus facilitating the recycling and subsequent trafficking of receptors to the cell surface. Similarly, BRCC36, part of the nuclear complex that includes BRCA1 protein and is targeted to DNA damage foci after irradiation, specifically disassembles K63-linked polyUb. BRCC36 is aberrantly expressed in sporadic breast tumors, indicative of a potential role in the pathogenesis of the disease. Some variants of the JAB1/MPN domains lack key residues in their JAMM motif and are unable to coordinate a metal ion. Comparisons of key catalytic and metal binding residues explain why the MPN-containing proteins Mov34/PSMD7, Rpn8, CSN6, Prp8p, and the translation initiation factor 3 subunits f (p47) and h (p40) do not show catalytic isopeptidase activity. It has been proposed that the MPN domain in these proteins has a primarily structural function." Q#2679 - CGI_10008273 superfamily 117535 93 182 0.000270219 39.4449 cl07540 DUF1873 superfamily - - Domain of unknown function (DUF1873); This domain is predominantly found in the amino terminal region of Ubiquitin carboxyl-terminal hydrolase 8 (USP8). It has no known function. Q#2680 - CGI_10008274 superfamily 243095 23 200 9.01E-38 140.675 cl02570 RhoGAP superfamily - - "RhoGAP: GTPase-activator protein (GAP) for Rho-like GTPases; GAPs towards Rho/Rac/Cdc42-like small GTPases. 
Small GTPases (G proteins) cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when bound to GDP. The Rho family of small G proteins, which includes Cdc42Hs, activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. G proteins generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. The RhoGAPs are one of the major classes of regulators of Rho G proteins." Q#2680 - CGI_10008274 superfamily 247637 849 937 4.43E-16 79.2002 cl16912 MDR superfamily C - "Medium chain reductase/dehydrogenase (MDR)/zinc-dependent alcohol dehydrogenase-like family; The medium chain reductase/dehydrogenases (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH. MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P) binding-Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES. The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH) , quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones. 
ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines. Other MDR members have only a catalytic zinc, and some contain no coordinated zinc." Q#2681 - CGI_10008275 superfamily 247742 72 396 1.02E-145 421.742 cl17188 enolase_like superfamily - - "Enolase-superfamily, characterized by the presence of an enolate anion intermediate which is generated by abstraction of the alpha-proton of the carboxylate substrate by an active site residue and is stabilized by coordination to the essential Mg2+ ion. Enolase superfamily contains different enzymes, like enolases, glutarate-, fucanate- and galactonate dehydratases, o-succinylbenzoate synthase, N-acylamino acid racemase, L-alanine-DL-glutamate epimerase, mandelate racemase, muconate lactonizing enzyme and 3-methylaspartase." Q#2682 - CGI_10008276 superfamily 247637 8 353 1.06E-101 306.083 cl16912 MDR superfamily - - "Medium chain reductase/dehydrogenase (MDR)/zinc-dependent alcohol dehydrogenase-like family; The medium chain reductase/dehydrogenases (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH. MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P) binding-Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES. 
The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH) , quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones. ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines. Other MDR members have only a catalytic zinc, and some contain no coordinated zinc." Q#2683 - CGI_10008277 superfamily 241573 10 322 4.08E-103 320.433 cl00051 CysPc superfamily - - "Calpains, domains IIa, IIb; calcium-dependent cytoplasmic cysteine proteinases, papain-like. Functions in cytoskeletal remodeling processes, cell differentiation, apoptosis and signal transduction." Q#2683 - CGI_10008277 superfamily 241653 341 484 9.92E-40 143.208 cl00165 Calpain_III superfamily - - "Calpain, subdomain III. Calpains are calcium-activated cytoplasmic cysteine proteinases, participate in cytoskeletal remodeling processes, cell differentiation, apoptosis and signal transduction. Catalytic domain and the two calmodulin-like domains are separated by C2-like domain III. Domain III plays an important role in calcium-induced activation of calpain involving electrostatic interactions with subdomain II. Proposed to mediate calpain's interaction with phospholipids and translocation to cytoplasmic/nuclear membranes. CD includes subdomain III of typical and atypical calpains." 
Q#2685 - CGI_10008279 superfamily 245864 13 447 3.45E-114 346.188 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#2688 - CGI_10003022 superfamily 245814 103 190 1.50E-15 68.6262 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#2694 - CGI_10020619 superfamily 247856 140 189 3.08E-05 39.8385 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#2695 - CGI_10020620 superfamily 245213 434 471 4.18E-08 51.0982 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#2695 - CGI_10020620 superfamily 245213 512 548 2.64E-07 48.787 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#2695 - CGI_10020620 superfamily 245213 357 394 5.05E-07 48.0166 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#2695 - CGI_10020620 superfamily 245213 589 625 8.40E-07 47.2462 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#2695 - CGI_10020620 superfamily 245213 319 355 8.63E-07 47.2462 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#2695 - CGI_10020620 superfamily 245213 473 510 5.98E-06 44.935 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#2695 - CGI_10020620 superfamily 245213 706 744 1.71E-05 43.3942 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#2695 - CGI_10020620 superfamily 245213 822 858 6.82E-05 41.8534 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#2695 - CGI_10020620 superfamily 245213 396 431 6.97E-05 41.8534 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#2695 - CGI_10020620 superfamily 245213 746 781 0.00012265 41.083 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#2695 - CGI_10020620 superfamily 245213 627 662 0.000804226 38.7718 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#2695 - CGI_10020620 superfamily 245213 784 819 0.00097964 38.3866 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#2695 - CGI_10020620 superfamily 245213 281 317 0.00258309 37.231 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#2695 - CGI_10020620 superfamily 245213 665 704 0.00328733 36.8458 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#2696 - CGI_10020621 superfamily 241750 48 427 3.18E-176 500.633 cl00281 metallo-dependent_hydrolases superfamily - - "Superfamily of metallo-dependent hydrolases (also called amidohydrolase superfamily) is a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. The family includes urease alpha, adenosine deaminase, phosphotriesterase dihydroorotases, allantoinases, hydantoinases, AMP-, adenine and cytosine deaminases, imidazolonepropionase, aryldialkylphosphatase, chlorohydrolases, formylmethanofuran dehydrogenases and others." Q#2698 - CGI_10020623 superfamily 243409 36 144 5.80E-23 90.7185 cl03389 RNase_P_p30 superfamily - - RNase P subunit p30; This protein is part of the RNase P complex that is involved in tRNA maturation. Q#2700 - CGI_10020625 superfamily 241640 125 398 1.61E-74 234.093 cl00149 Tryp_SPc superfamily - - Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues. Q#2701 - CGI_10020626 superfamily 241584 273 356 2.52E-14 69.8327 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. 
FN3-like domains are also found in bacterial glycosyl hydrolases." Q#2701 - CGI_10020626 superfamily 241584 369 462 7.23E-14 68.2919 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#2701 - CGI_10020626 superfamily 241584 171 262 1.25E-13 67.9067 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#2701 - CGI_10020626 superfamily 241584 467 563 8.27E-07 47.4911 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." 
Q#2702 - CGI_10020627 superfamily 241574 84 280 1.42E-90 282.167 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#2702 - CGI_10020627 superfamily 241574 340 566 1.37E-48 170.074 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#2703 - CGI_10020628 superfamily 243064 23 150 9.88E-31 110.911 cl02512 NTR_like superfamily - - "NTR_like domain; a beta barrel with an oligosaccharide/oligonucleotide-binding fold found in netrins, complement proteins, tissue inhibitors of metalloproteases (TIMP), and procollagen C-proteinase enhancers (PCOLCE), amongst others. In netrins, the domain plays a role in controlling axon branching in neural development, while the common function of these modules in TIMPs appears to be binding to metzincins. 
A subset of this family is also known as the C345C domain because it occurs as a C-terminal domain in complement C3, C4 and C5. In C5, the domain interacts with various partners during the formation of the membrane attack complex." Q#2704 - CGI_10020629 superfamily 243064 23 150 8.34E-35 121.697 cl02512 NTR_like superfamily - - "NTR_like domain; a beta barrel with an oligosaccharide/oligonucleotide-binding fold found in netrins, complement proteins, tissue inhibitors of metalloproteases (TIMP), and procollagen C-proteinase enhancers (PCOLCE), amongst others. In netrins, the domain plays a role in controlling axon branching in neural development, while the common function of these modules in TIMPs appears to be binding to metzincins. A subset of this family is also known as the C345C domain because it occurs as a C-terminal domain in complement C3, C4 and C5. In C5, the domain interacts with various partners during the formation of the membrane attack complex." Q#2705 - CGI_10020630 superfamily 243092 205 512 8.35E-59 197.943 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the 
closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#2705 - CGI_10020630 superfamily 128914 40 90 3.45E-05 41.7878 cl15352 CTLH superfamily - - C-terminal to LisH motif; Alpha-helical motif of unknown function. Q#2705 - CGI_10020630 superfamily 199226 5 38 0.000725805 37.414 cl11662 LisH superfamily - - "LisH; The LisH (lis homology) domain mediates protein dimerisation and tetramerisation. The LisH domain is found in Sif2, a component of the Set3 complex which is responsible for repressing meiotic genes. It has been shown that the LisH domain helps mediate interaction with components of the Set3 complex." Q#2706 - CGI_10020631 superfamily 247736 39 98 2.41E-06 42.6481 cl17182 NAT_SF superfamily - - "N-Acyltransferase superfamily: Various enzymes that characteristically catalyze the transfer of an acyl group to a substrate; NAT (N-Acyltransferase) is a large superfamily of enzymes that mostly catalyze the transfer of an acyl group to a substrate and are implicated in a variety of functions, ranging from bacterial antibiotic resistance to circadian rhythms in mammals. Members include GCN5-related N-Acetyltransferases (GNAT) such as Aminoglycoside N-acetyltransferases, Histone N-acetyltransferase (HAT) enzymes, and Serotonin N-acetyltransferase, which catalyze the transfer of an acetyl group to a substrate. The kinetic mechanism of most GNATs involves the ordered formation of a ternary complex: the reaction begins with Acetyl Coenzyme A (AcCoA) binding, followed by binding of substrate, then direct transfer of the acetyl group from AcCoA to the substrate, followed by product and subsequent CoA release. 
Other family members include Arginine/ornithine N-succinyltransferase, Myristoyl-CoA: protein N-myristoyltransferase, and Acyl-homoserinelactone synthase which have a similar catalytic mechanism but differ in types of acyl groups transferred. Leucyl/phenylalanyl-tRNA-protein transferase and FemXAB nonribosomal peptidyltransferases which catalyze similar peptidyltransferase reactions are also included." Q#2706 - CGI_10020631 superfamily 247736 138 186 0.00616182 34.5069 cl17182 NAT_SF superfamily C - "N-Acyltransferase superfamily: Various enzymes that characteristically catalyze the transfer of an acyl group to a substrate; NAT (N-Acyltransferase) is a large superfamily of enzymes that mostly catalyze the transfer of an acyl group to a substrate and are implicated in a variety of functions, ranging from bacterial antibiotic resistance to circadian rhythms in mammals. Members include GCN5-related N-Acetyltransferases (GNAT) such as Aminoglycoside N-acetyltransferases, Histone N-acetyltransferase (HAT) enzymes, and Serotonin N-acetyltransferase, which catalyze the transfer of an acetyl group to a substrate. The kinetic mechanism of most GNATs involves the ordered formation of a ternary complex: the reaction begins with Acetyl Coenzyme A (AcCoA) binding, followed by binding of substrate, then direct transfer of the acetyl group from AcCoA to the substrate, followed by product and subsequent CoA release. Other family members include Arginine/ornithine N-succinyltransferase, Myristoyl-CoA: protein N-myristoyltransferase, and Acyl-homoserinelactone synthase which have a similar catalytic mechanism but differ in types of acyl groups transferred. Leucyl/phenylalanyl-tRNA-protein transferase and FemXAB nonribosomal peptidyltransferases which catalyze similar peptidyltransferase reactions are also included." 
Q#2707 - CGI_10020632 superfamily 241563 116 158 6.07E-05 40.9256 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#2707 - CGI_10020632 superfamily 110440 517 543 0.00067231 37.7725 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#2707 - CGI_10020632 superfamily 128778 154 275 0.000927335 38.3999 cl17972 BBC superfamily - - B-Box C-terminal domain; Coiled coil region C-terminal to (some) B-Box domains Q#2709 - CGI_10020634 superfamily 247723 40 117 1.10E-53 166.568 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. 
The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#2710 - CGI_10020635 superfamily 248305 12 453 4.42E-39 147.499 cl17751 Glyco_transf_22 superfamily - - Alg9-like mannosyltransferase family; Members of this family are mannosyltransferase enzymes. At least some members are localised in endoplasmic reticulum and involved in GPI anchor biosynthesis. Q#2711 - CGI_10020636 superfamily 201217 259 306 6.65E-09 52.9132 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#2711 - CGI_10020636 superfamily 201217 208 255 1.07E-08 52.1428 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#2711 - CGI_10020636 superfamily 201217 106 154 2.58E-07 48.2908 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#2711 - CGI_10020636 superfamily 201217 157 204 1.51E-06 45.9796 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#2711 - CGI_10020636 superfamily 201217 309 360 3.27E-06 45.2092 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#2711 - CGI_10020636 superfamily 201217 54 102 8.42E-06 43.6684 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#2711 - CGI_10020636 superfamily 205718 39 67 8.33E-05 40.5514 cl16296 RCC1_2 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. 
Q#2712 - CGI_10020637 superfamily 242889 303 402 6.48E-20 84.5769 cl02111 PCI superfamily - - "PCI domain; This domain has also been called the PINT motif (Proteasome, Int-6, Nip-1 and TRIP-15)." Q#2714 - CGI_10020639 superfamily 247740 32 262 2.87E-99 300.565 cl17186 TIM_phosphate_binding superfamily - - "TIM barrel proteins share a structurally conserved phosphate binding motif and in general share an eight beta/alpha closed barrel structure. Specific for this family is the conserved phosphate binding site at the edges of strands 7 and 8. The phosphate comes either from the substrate, as in the case of inosine monophosphate dehydrogenase (IMPDH), or from ribulose-5-phosphate 3-epimerase (RPE) or from cofactors, like FMN." Q#2715 - CGI_10020640 superfamily 218802 8 147 4.71E-40 142.115 cl05462 DUF862 superfamily - - "PPPDE putative peptidase domain; The PPPDE superfamily (after Permuted Papain fold Peptidases of DsRNA viruses and Eukaryotes), consists of predicted thiol peptidases with a circularly permuted papain-like fold. The inference of the likely DUB function of the PPPDE superfamily proteins is based on the fusions of the catalytic domain to Ub-binding PUG (PUB)/UBA domains and a novel alpha-helical Ub-associated domain (the PUL domain, after PLAP, Ufd3p and Lub1p)." 
Q#2716 - CGI_10020641 superfamily 243092 464 726 6.65E-61 210.654 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#2716 - CGI_10020641 superfamily 243092 181 494 1.77E-46 169.823 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#2716 - CGI_10020641 superfamily 204007 738 873 1.64E-56 192.782 cl07300 Utp13 superfamily - - Utp13 specific WD40 associated domain; Utp13 is a component of the five protein Pwp2 complex that forms part of a stable particle subunit independent of the U3 small nucleolar ribonucleoprotein that is essential for the initial assembly steps of the 90S pre-ribosome. Pwp2 is capable of interacting directly with the 35 S pre-rRNA 5' end. 
Q#2717 - CGI_10020642 superfamily 247792 19 60 3.08E-08 49.7516 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#2719 - CGI_10020644 superfamily 241546 212 307 1.72E-24 100.428 cl00011 PLAT superfamily C - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats.The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." 
Q#2720 - CGI_10020645 superfamily 247792 139 192 0.00019315 37.8104 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#2722 - CGI_10020647 superfamily 247684 29 248 1.20E-62 204.048 cl17037 NBD_sugar-kinase_HSP70_actin superfamily C - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. 
The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#2729 - CGI_10019692 superfamily 215647 279 487 4.73E-31 120.791 cl18338 7tm_2 superfamily - - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GPCRs). They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#2729 - CGI_10019692 superfamily 243029 210 265 1.94E-11 60.4421 cl02422 HRM superfamily - - Hormone receptor domain; This extracellular domain contains four conserved cysteines that probably form disulphide bridges. The domain is found in a variety of hormone receptors. It may be a ligand binding domain. Q#2729 - CGI_10019692 superfamily 241557 43 90 3.94E-07 48.6684 cl00022 YbaK_like superfamily C - "YbaK-like. The YbaK family of deacylase domains includes the INS amino acid-editing domain of the bacterial class II prolyl tRNA synthetase (ProRS), and its trans-acting homologs, YbaK, ProX, and PrdX. The primary function of INS is to hydrolyze mischarged cysteinyl-tRNA(Pro)'s, thus helping ensure the fidelity of translation. 
Organisms whose ProRS lacks the INS domain express an INS homolog in trans (e.g. YbaK, ProX, or PrdX)." Q#2733 - CGI_10019696 superfamily 247723 105 183 1.66E-43 146.338 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#2733 - CGI_10019696 superfamily 247723 26 96 8.44E-41 138.923 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. 
The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#2735 - CGI_10019698 superfamily 238191 74 594 6.82E-108 336.612 cl18907 Esterase_lipase superfamily - - "Esterases and lipases (includes fungal lipases, cholinesterases, etc.) These enzymes act on carboxylic esters (EC: 3.1.1.-). The catalytic apparatus involves three residues (catalytic triad): a serine, a glutamate or aspartate and a histidine.These catalytic residues are responsible for the nucleophilic attack on the carbonyl carbon atom of the ester bond. In contrast with other alpha/beta hydrolase fold family members, p-nitrobenzyl esterase and acetylcholine esterase have a Glu instead of Asp at the active site carboxylate." Q#2737 - CGI_10019700 superfamily 241600 113 253 7.96E-47 157.017 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." 
Q#2737 - CGI_10019700 superfamily 241619 17 78 0.00456656 34.0949 cl00112 PAN_APPLE superfamily - - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#2738 - CGI_10019702 superfamily 241600 3 216 2.67E-76 231.36 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#2739 - CGI_10019703 superfamily 110440 237 263 0.00258211 35.0761 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. 
Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#2740 - CGI_10019704 superfamily 247809 640 811 1.94E-28 113.91 cl17255 ATP-grasp_4 superfamily - - ATP-grasp domain; This family includes a diverse set of enzymes that possess ATP-dependent carboxylate-amine ligase activity. Q#2741 - CGI_10019705 superfamily 241622 44 125 3.45E-24 90.7038 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#2742 - CGI_10019706 superfamily 242120 23 222 2.40E-92 273.25 cl00821 Ribosomal_S3Ae superfamily - - Ribosomal S3Ae family; Ribosomal S3Ae family. Q#2743 - CGI_10019707 superfamily 242120 60 264 1.40E-96 285.961 cl00821 Ribosomal_S3Ae superfamily - - Ribosomal S3Ae family; Ribosomal S3Ae family. 
Q#2745 - CGI_10019709 superfamily 245227 62 476 0 653.501 cl10013 Glycosyltransferase_GTB_type superfamily - - "Glycosyltransferases catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. The acceptor molecule can be a lipid, a protein, a heterocyclic compound, or another carbohydrate residue. The structures of the formed glycoconjugates are extremely diverse, reflecting a wide range of biological functions. The members of this family share a common GTB topology, one of the two protein topologies observed for nucleotide-sugar-dependent glycosyltransferases. GTB proteins have distinct N- and C- terminal domains each containing a typical Rossmann fold. The two domains have high structural homology despite minimal sequence homology. The large cleft that separates the two domains includes the catalytic center and permits a high degree of flexibility." Q#2746 - CGI_10019710 superfamily 243066 457 544 4.66E-08 53.8488 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." 
Q#2746 - CGI_10019710 superfamily 210118 2737 2755 0.00463737 37.6903 cl15479 IQ superfamily - - IQ calmodulin-binding motif; Calmodulin-binding motif. Q#2747 - CGI_10019711 superfamily 245213 646 675 0.00217179 38.7718 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#2747 - CGI_10019711 superfamily 234316 2427 2477 4.00E-05 44.398 cl14012 Rhs_assc_core superfamily C - "RHS repeat-associated core domain; This model represents a conserved unique core sequence shared by large numbers of proteins. It is occasional in the Archaea Methanosarcina barkeri) but common in bacteria and eukaryotes. Most fall into two large classes. One class consists of long proteins in which two classes of repeats are abundant: an FG-GAP repeat (pfam01839) class, and an RHS repeat (pfam05593) or YD repeat (TIGR01643). This class includes secreted bacterial insecticidal toxins and intercellular signalling proteins such as the teneurins in animals. The other class consists of uncharacterized proteins shorter than 400 amino acids, where this core domain of about 75 amino acids tends to occur in the N-terminal half. Over twenty such proteins are found in Pseudomonas putida alone; little sequence similarity or repeat structure is found among these proteins outside the region modeled by this domain." 
Q#2747 - CGI_10019711 superfamily 248053 936 1010 0.000254889 41.8937 cl17499 Peptidase_M14NE-CP-C_like superfamily - - "Peptidase associated domain: C-terminal domain of M14 N/E carboxypeptidase; putative folding, regulation, or interaction domain; This domain is found C-terminal to the M14 carboxypeptidase (CP) N/E subfamily containing zinc-binding enzymes that hydrolyze single C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. The N/E subfamily includes enzymatically active members (carboxypeptidase N, E, M, D, and Z), as well as non-active members (carboxypeptidase-like protein 1, -2, aortic CP-like protein, and adipocyte enhancer binding protein-1) which lack the critical active site and substrate-binding residues considered necessary for activity. The active N/E enzymes fulfill a variety of cellular functions, including prohormone processing, regulation of peptide hormone activity, alteration of protein-protein or protein-cell interactions and transcriptional regulation. For M14 CPs, it has been suggested that this domain may assist in folding of the CP domain, regulate enzyme activity, or be involved in interactions with other proteins or with membranes; for carboxypeptidase M, it may interact with the bradykinin 1 receptor at the cell surface. This domain may also be found in other peptidase families." Q#2747 - CGI_10019711 superfamily 219677 815 845 0.00217937 38.5728 cl18521 EGF_2 superfamily - - EGF-like domain; This family contains EGF domains found in a variety of extracellular proteins. Q#2747 - CGI_10019711 superfamily 245826 1699 1735 0.00741247 37.2809 cl11982 RHS_repeat superfamily - - RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be involved in ligand binding. Note that this model may not find all the repeats in a protein and that it covers two RHS repeats. 
Q#2748 - CGI_10019712 superfamily 247692 260 619 2.61E-52 184.262 cl17068 AFD_class_I superfamily - - "Adenylate forming domain, Class I; This family includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain." Q#2748 - CGI_10019712 superfamily 247692 107 283 4.31E-06 48.3449 cl17068 AFD_class_I superfamily C - "Adenylate forming domain, Class I; This family includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain." Q#2749 - CGI_10019713 superfamily 217181 103 189 5.75E-41 141.264 cl18395 Pirin superfamily - - "Pirin; This family consists of Pirin proteins from both eukaryotes and prokaryotes. The function of Pirin is unknown but the gene coding for this protein is known to be expressed in all tissues in the human body although it is expressed most strongly in the liver and heart. Pirin is known to be a nuclear protein, exclusively localised within the nucleoplasma and predominantly concentrated within dot-like subnuclear structures. A tomato homologue of human Pirin has been found to be induced during programmed cell death. 
Human Pirin interacts with Bcl-3 and NFI and hence is probably involved in the regulation of DNA transcription and replication. It appears to be an Fe(II)-containing member of the Cupin superfamily." Q#2749 - CGI_10019713 superfamily 203319 242 349 1.91E-34 123.45 cl05342 Pirin_C superfamily - - Pirin C-terminal cupin domain; This region is found in the C-terminal half of the Pirin protein. Q#2750 - CGI_10019714 superfamily 217181 33 116 1.62E-32 116.996 cl18395 Pirin superfamily - - "Pirin; This family consists of Pirin proteins from both eukaryotes and prokaryotes. The function of Pirin is unknown but the gene coding for this protein is known to be expressed in all tissues in the human body although it is expressed most strongly in the liver and heart. Pirin is known to be a nuclear protein, exclusively localised within the nucleoplasma and predominantly concentrated within dot-like subnuclear structures. A tomato homologue of human Pirin has been found to be induced during programmed cell death. Human Pirin interacts with Bcl-3 and NFI and hence is probably involved in the regulation of DNA transcription and replication. It appears to be an Fe(II)-containing member of the Cupin superfamily." Q#2750 - CGI_10019714 superfamily 203319 169 276 7.90E-30 109.583 cl05342 Pirin_C superfamily - - Pirin C-terminal cupin domain; This region is found in the C-terminal half of the Pirin protein. 
Q#2751 - CGI_10019715 superfamily 247792 13 60 2.96E-06 45.1292 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#2751 - CGI_10019715 superfamily 241563 151 188 0.000122255 40.1552 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#2751 - CGI_10019715 superfamily 128778 213 318 0.000479627 39.1703 cl17972 BBC superfamily - - B-Box C-terminal domain; Coiled coil region C-terminal to (some) B-Box domains Q#2752 - CGI_10019716 superfamily 248012 16 151 5.86E-20 81.2156 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#2753 - CGI_10019717 superfamily 248012 16 155 8.01E-20 80.8304 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#2755 - CGI_10019719 superfamily 248012 16 155 6.24E-18 75.8228 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. 
Q#2756 - CGI_10010239 superfamily 246664 19 420 4.18E-132 391.553 cl14561 An_peroxidase_like superfamily - - "Animal heme peroxidases and related proteins; A diverse family of enzymes, which includes prostaglandin G/H synthase, thyroid peroxidase, myeloperoxidase, linoleate diol synthase, lactoperoxidase, peroxinectin, peroxidasin, and others. Despite its name, this family is not restricted to metazoans: members are found in fungi, plants, and bacteria as well." Q#2757 - CGI_10010240 superfamily 246664 315 697 4.45E-132 402.338 cl14561 An_peroxidase_like superfamily - - "Animal heme peroxidases and related proteins; A diverse family of enzymes, which includes prostaglandin G/H synthase, thyroid peroxidase, myeloperoxidase, linoleate diol synthase, lactoperoxidase, peroxinectin, peroxidasin, and others. Despite its name, this family is not restricted to metazoans: members are found in fungi, plants, and bacteria as well." Q#2757 - CGI_10010240 superfamily 246664 214 256 5.63E-06 48.0754 cl14561 An_peroxidase_like superfamily C - "Animal heme peroxidases and related proteins; A diverse family of enzymes, which includes prostaglandin G/H synthase, thyroid peroxidase, myeloperoxidase, linoleate diol synthase, lactoperoxidase, peroxinectin, peroxidasin, and others. Despite its name, this family is not restricted to metazoans: members are found in fungi, plants, and bacteria as well." Q#2760 - CGI_10010243 superfamily 199166 118 317 5.63E-23 95.472 cl15308 AMN1 superfamily - - "Antagonist of mitotic exit network protein 1; Amn1 has been functionally characterized in Saccharomyces cerevisiae as a component of the Antagonist of MEN pathway (AMEN). The AMEN network is activated by MEN (mitotic exit network) via an active Cdc14, and in turn switches off MEN. Amn1 constitutes one of the alternative mechanisms by which MEN may be disrupted. 
Specifically, Amn1 binds Tem1 (Termination of M-phase, a GTPase that belongs to the RAS superfamily), and disrupts its association with Cdc15, the primary downstream target. Amn1 is a leucine-rich repeat (LRR) protein, with 12 repeats in the S. cerevisiae ortholog. As a negative regulator of the signal transduction pathway MEN, overexpression of AMN1 slows the growth of wild type cells. The function of the vertebrate members of this family has not been determined experimentally, they have fewer LRRs that determine the extent of this model." Q#2760 - CGI_10010243 superfamily 243074 29 72 1.14E-12 62.5241 cl02535 F-box-like superfamily - - F-box-like; This is an F-box-like family. Q#2760 - CGI_10010243 superfamily 199166 331 386 0.00137675 38.4624 cl15308 AMN1 superfamily C - "Antagonist of mitotic exit network protein 1; Amn1 has been functionally characterized in Saccharomyces cerevisiae as a component of the Antagonist of MEN pathway (AMEN). The AMEN network is activated by MEN (mitotic exit network) via an active Cdc14, and in turn switches off MEN. Amn1 constitutes one of the alternative mechanisms by which MEN may be disrupted. Specifically, Amn1 binds Tem1 (Termination of M-phase, a GTPase that belongs to the RAS superfamily), and disrupts its association with Cdc15, the primary downstream target. Amn1 is a leucine-rich repeat (LRR) protein, with 12 repeats in the S. cerevisiae ortholog. As a negative regulator of the signal transduction pathway MEN, overexpression of AMN1 slows the growth of wild type cells. The function of the vertebrate members of this family has not been determined experimentally, they have fewer LRRs that determine the extent of this model." 
Q#2761 - CGI_10010244 superfamily 243092 327 643 2.81E-19 87.7756 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#2761 - CGI_10010244 superfamily 151977 101 131 1.39E-07 48.9622 cl13056 Dynein_IC2 superfamily - - "Cytoplasmic dynein 1 intermediate chain 2; Intermediate chain IC 2 forms part of the complex cytoplasmic dynein 1 along with a heavy chain (HC), two light intermediate chains (LICs) and three light chains (LCs). The complex is responsible for hydrolysing ATP to generate force toward the minus end of microtubules. IC binds to the HC via the N terminal binding domain on the HC and ICs contain binding sites for the LCs. The ICs are responsible for binding to kinetochores and the Golgi apparatus through an interaction with the p150Glued subunit of dynactin which is another complex." 
Q#2762 - CGI_10010245 superfamily 243035 77 204 7.64E-29 110.787 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#2762 - CGI_10010245 superfamily 243035 229 331 1.08E-17 79.5861 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. 
CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#2762 - CGI_10010245 superfamily 243035 358 430 1.84E-07 49.5406 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. 
Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. 
A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#2762 - CGI_10010245 superfamily 243035 40 75 7.60E-06 44.6562 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. 
C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#2763 - CGI_10010246 superfamily 243035 89 203 4.03E-33 124.269 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#2763 - CGI_10010246 superfamily 243035 378 492 2.79E-22 93.4533 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#2763 - CGI_10010246 superfamily 243035 227 352 1.12E-29 115.094 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. 
CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#2763 - CGI_10010246 superfamily 243035 527 629 1.05E-24 100.143 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. 
Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. 
A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#2763 - CGI_10010246 superfamily 243035 651 756 1.27E-24 100.457 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. 
C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#2763 - CGI_10010246 superfamily 243035 4 53 5.35E-09 54.6095 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#2764 - CGI_10010247 superfamily 243035 89 142 3.96E-13 61.847 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#2765 - CGI_10010248 superfamily 243035 130 200 7.25E-06 42.2218 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. 
CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#2765 - CGI_10010248 superfamily 243035 11 120 1.56E-11 57.995 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. 
Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. 
A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#2766 - CGI_10010249 superfamily 241607 36 61 0.00470299 30.3402 cl00097 KAZAL_FS superfamily C - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chymotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters), which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." 
Q#2767 - CGI_10010250 superfamily 222269 74 285 1.05E-34 129.751 cl18657 Cupin_8 superfamily - - Cupin-like domain; This cupin like domain shares similarity to the JmjC domain. Q#2769 - CGI_10010252 superfamily 246723 7 505 0 598.781 cl14813 GluZincin superfamily - - "Peptidase Gluzincin family (thermolysin-like proteinases, TLPs) includes peptidases M1, M2, M3, M4, M13, M32 and M36 (fungalysins); Gluzincin family (thermolysin-like peptidases or TLPs) includes several zinc-dependent metallopeptidases such as the M1, M2, M3, M4, M13, M32, M36 peptidases (MEROPS classification), and contain HEXXH and EXXXD motifs as part of their active site. All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. M1 family includes aminopeptidase N (APN) and leukotriene A4 hydrolase (LTA4H). APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types. LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity such that the two activities occupy different, but overlapping sites. The peptidase M3 or neurolysin-like family, includes M3, M2 and M32 metallopeptidases. The M3 peptidases have two subfamilies: M3A, includes thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (3.4.24.16), and the mitochondrial intermediate peptidase; M3B contains oligopeptidase F. M2 peptidase angiotensin converting enzyme (ACE, EC 3.4.15.1) catalyzes the conversion of decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II. ACE is a key part of the renin-angiotensin system that regulates blood pressure, thus ACE inhibitors are important for the treatment of hypertension. 
M32 family includes two eukaryotic enzymes from protozoa Trypanosoma cruzi, a causative agent of Chagas' disease, and Leishmania major, a parasite that causes leishmaniasis, making them attractive targets for drug development. The M4 family includes secreted protease thermolysin (EC 3.4.24.27), pseudolysin, aureolysin, neutral protease as well as fungalysin and bacillolysin (EC 3.4.24.28) that degrade extracellular proteins and peptides for bacterial nutrition, especially prior to sporulation. Thermolysin is widely used as a nonspecific protease to obtain fragments for peptide sequencing as well as in production of the artificial sweetener aspartame. M13 family includes neprilysin (EC 3.4.24.11) and endothelin-converting enzyme I (ECE-1, EC 3.4.24.71), which fulfill a broad range of physiological roles due to the greater variation in the S2' subsite allowing substrate specificity and are prime therapeutic targets for selective inhibition. Peptidase M36 (fungamysin) family includes endopeptidases from pathogenic fungi. Fungalysin hydrolyzes extracellular matrix proteins such as elastin and keratin. Aspergillus fumigatus causes the pulmonary disease aspergillosis by invading the lungs of immuno-compromised animals and secreting fungalysin that possibly breaks down proteinaceous structural barriers." Q#2771 - CGI_10010254 superfamily 243095 300 514 1.57E-74 239.94 cl02570 RhoGAP superfamily - - "RhoGAP: GTPase-activator protein (GAP) for Rho-like GTPases; GAPs towards Rho/Rac/Cdc42-like small GTPases. Small GTPases (G proteins) cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when bound to GDP. The Rho family of small G proteins, which includes Cdc42Hs, activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. 
G proteins generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. The RhoGAPs are one of the major classes of regulators of Rho G proteins." Q#2772 - CGI_10010255 superfamily 241641 98 137 0.000242483 38.2137 cl00150 TY superfamily C - Thyroglobulin type I repeats.; The N-terminal region of human thyroglobulin contains 11 type-1 repeats. TY repeats are proposed to be inhibitors of cysteine proteases. Q#2773 - CGI_10010256 superfamily 245819 153 340 5.23E-47 168.141 cl11967 Nucleotidyl_cyc_III superfamily - - "Class III nucleotidyl cyclases; Class III nucleotidyl cyclases are the largest, most diverse group of nucleotidyl cyclases (NC's) containing prokaryotic and eukaryotic proteins. They can be divided into two major groups; the mononucleotidyl cyclases (MNC's) and the diguanylate cyclases (DGC's). The MNC's, which include the adenylate cyclases (AC's) and the guanylate cyclases (GC's), have a conserved cyclase homology domain (CHD), while the DGC's have a conserved GGDEF domain, named after a conserved motif within this subgroup. Their products, cyclic guanylyl and adenylyl nucleotides, are second messengers that play important roles in eukaryotic signal transduction and prokaryotic sensory pathways." Q#2774 - CGI_10010257 superfamily 247692 198 547 2.40E-45 163.616 cl17068 AFD_class_I superfamily - - "Adenylate forming domain, Class I; This family includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain." 
Q#2774 - CGI_10010257 superfamily 247692 49 240 2.93E-12 67.6708 cl17068 AFD_class_I superfamily C - "Adenylate forming domain, Class I; This family includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain." Q#2775 - CGI_10010258 superfamily 247058 51 237 1.00E-37 133.455 cl15762 crotonase-like superfamily - - "Crotonase/Enoyl-Coenzyme A (CoA) hydratase superfamily. This superfamily contains a diverse set of enzymes including enoyl-CoA hydratase, naphthoate synthase, methylmalonyl-CoA decarboxylase, 3-hydroxybutyryl-CoA dehydratase, and dienoyl-CoA isomerase. Many of these play important roles in fatty acid metabolism. In addition to a conserved structural core and the formation of trimers (or dimers of trimers), a common feature in this superfamily is the stabilization of an enolate anion intermediate derived from an acyl-CoA substrate. This is accomplished by two conserved backbone NH groups in active sites that form an oxyanion hole." Q#2777 - CGI_10002431 superfamily 241563 62 100 4.06E-05 41.1188 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#2777 - CGI_10002431 superfamily 241563 8 53 0.00436691 35.1476 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#2780 - CGI_10002434 superfamily 245596 104 318 6.08E-113 329.543 cl11394 Glyco_tranf_GTA_type superfamily - - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#2784 - CGI_10019723 superfamily 248054 54 102 0.00216193 36.296 cl17500 NAD_binding_8 superfamily N - NAD(P)-binding Rossmann-like domain; NAD(P)-binding Rossmann-like domain. Q#2785 - CGI_10019724 superfamily 247724 56 176 1.12E-11 59.3436 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. 
The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#2789 - CGI_10019729 superfamily 191128 47 143 1.70E-10 53.6968 cl04846 Ninjurin superfamily - - Ninjurin; Ninjurin (nerve injury-induced protein) is involved in nerve regeneration and in the formation and function in some tissues. Q#2790 - CGI_10019730 superfamily 191128 75 112 7.35E-05 38.674 cl04846 Ninjurin superfamily C - Ninjurin; Ninjurin (nerve injury-induced protein) is involved in nerve regeneration and in the formation and function in some tissues. Q#2791 - CGI_10019731 superfamily 241580 71 148 1.54E-48 161.568 cl00061 FH superfamily - - "Forkhead (FH), also known as a "winged helix". FH is named for the Drosophila fork head protein, a transcription factor which promotes terminal rather than segmental development. 
This family of transcription factor domains, which bind to B-DNA as monomers, are also found in the Hepatocyte nuclear factor (HNF) proteins, which provide tissue-specific gene regulation. The structure contains 2 flexible loops or "wings" in the C-terminal region, hence the term winged helix." Q#2793 - CGI_10019733 superfamily 216112 675 1023 7.13E-112 354.295 cl02964 RNB superfamily - - RNB domain; This domain is the catalytic domain of ribonuclease II. Q#2795 - CGI_10019735 superfamily 219619 290 342 1.21E-17 78.0183 cl18518 Ion_trans_2 superfamily N - Ion channel; This family includes the two membrane helix type ion channels found in bacteria. Q#2795 - CGI_10019735 superfamily 219619 432 509 7.93E-13 64.1511 cl18518 Ion_trans_2 superfamily - - Ion channel; This family includes the two membrane helix type ion channels found in bacteria. Q#2796 - CGI_10019736 superfamily 202662 235 271 0.00696015 35.6106 cl18231 B3_4 superfamily NC - B3/4 domain; This domain is found in tRNA synthetase beta subunits as well as in some non tRNA synthetase proteins. Q#2798 - CGI_10019738 superfamily 241659 146 227 6.39E-13 61.3819 cl00175 alpha-crystallin-Hsps_p23-like superfamily - - "alpha-crystallin domain (ACD) found in alpha-crystallin-type small heat shock proteins, and a similar domain found in p23 (a cochaperone for Hsp90) and in other p23-like proteins.; The alpha-crystallin-Hsps_p23-like superfamily includes the alpha-crystallin domain (ACD) of alpha-crystallin-type small heat shock proteins (sHsps) and a similar domain found in p23-like proteins. sHsps are small stress induced proteins with monomeric masses between 12-43 kDa, whose common feature is this ACD. sHsps are generally active as large oligomers consisting of multiple subunits, and are believed to be ATP-independent chaperones that prevent aggregation and are important in refolding in combination with other Hsps. p23 is a cochaperone of the Hsp90 chaperoning pathway. 
It binds Hsp90 and participates in the folding of a number of Hsp90 clients including the progesterone receptor. p23 also has a passive chaperoning activity. p23 in addition may act as the cytosolic prostaglandin E2 synthase. Included in this superfamily is the p23-like C-terminal CHORD-SGT1 (CS) domain of suppressor of G2 allele of Skp1 (Sgt1) and the p23-like domains of human butyrate-induced transcript 1 (hB-ind1), NUD (nuclear distribution) C, Melusin, and NAD(P)H cytochrome b5 (NCB5) oxidoreductase (OR)." Q#2798 - CGI_10019738 superfamily 241659 41 119 3.15E-06 43.2775 cl00175 alpha-crystallin-Hsps_p23-like superfamily - - "alpha-crystallin domain (ACD) found in alpha-crystallin-type small heat shock proteins, and a similar domain found in p23 (a cochaperone for Hsp90) and in other p23-like proteins.; The alpha-crystallin-Hsps_p23-like superfamily includes the alpha-crystallin domain (ACD) of alpha-crystallin-type small heat shock proteins (sHsps) and a similar domain found in p23-like proteins. sHsps are small stress induced proteins with monomeric masses between 12-43 kDa, whose common feature is this ACD. sHsps are generally active as large oligomers consisting of multiple subunits, and are believed to be ATP-independent chaperones that prevent aggregation and are important in refolding in combination with other Hsps. p23 is a cochaperone of the Hsp90 chaperoning pathway. It binds Hsp90 and participates in the folding of a number of Hsp90 clients including the progesterone receptor. p23 also has a passive chaperoning activity. p23 in addition may act as the cytosolic prostaglandin E2 synthase. Included in this superfamily is the p23-like C-terminal CHORD-SGT1 (CS) domain of suppressor of G2 allele of Skp1 (Sgt1) and the p23-like domains of human butyrate-induced transcript 1 (hB-ind1), NUD (nuclear distribution) C, Melusin, and NAD(P)H cytochrome b5 (NCB5) oxidoreductase (OR)." 
Q#2801 - CGI_10019741 superfamily 241563 75 115 8.24E-08 49.4 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#2801 - CGI_10019741 superfamily 217754 119 233 4.62E-05 42.233 cl04284 RasGAP_C superfamily C - RasGAP C-terminus; RasGAP C-terminus. Q#2803 - CGI_10019743 superfamily 243250 357 592 1.63E-114 355.033 cl02959 Glyco_hydro_9 superfamily C - Glycosyl hydrolase family 9; Glycosyl hydrolase family 9. Q#2803 - CGI_10019743 superfamily 243250 590 713 2.83E-29 120.446 cl02959 Glyco_hydro_9 superfamily N - Glycosyl hydrolase family 9; Glycosyl hydrolase family 9. Q#2803 - CGI_10019743 superfamily 248312 59 202 1.07E-08 53.9052 cl17758 PMP22_Claudin superfamily - - PMP-22/EMP/MP20/Claudin family; PMP-22/EMP/MP20/Claudin family. Q#2804 - CGI_10019744 superfamily 245864 48 346 3.83E-40 147.81 cl12078 p450 superfamily C - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. 
While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#2806 - CGI_10019746 superfamily 216566 1052 1151 0.00110689 40.6337 cl18370 Peptidase_M23 superfamily - - "Peptidase family M23; Members of this family are zinc metallopeptidases with a range of specificities. The peptidase family M23 is included in this family, these are Gly-Gly endopeptidases. Peptidase family M23 are also endopeptidases. This family also includes some bacterial lipoproteins such as Escherichia coli murein hydrolase activator NlpD, for which no proteolytic activity has been demonstrated. This family also includes leukocyte cell-derived chemotaxin 2 (LECT2) proteins. LECT2 is a liver-specific protein which is thought to be linked to hepatocyte growth although the exact function of this protein is unknown." Q#2807 - CGI_10019747 superfamily 203136 103 223 9.96E-10 55.4284 cl04867 LRAT superfamily - - "Lecithin retinol acyltransferase; The full-length members of this family are representatives of a novel class II tumour-suppressor family, designated as H-REV107-like. This domain is the catalytic N-terminal proline-rich region of the protein. The downstream region is a putative C-terminal transmembrane domain which is found to be crucial for cellular localisation, but not necessary for the enzyme activity. H-REV107-like proteins are homologous to lecithin retinol acyltransferase (LRAT), an enzyme that catalyzes the transfer of the sn-1 acyl group of phosphatidylcholine to all-trans-retinol and forming a retinyl ester." Q#2807 - CGI_10019747 superfamily 203136 354 396 0.0012 37.324 cl04867 LRAT superfamily N - "Lecithin retinol acyltransferase; The full-length members of this family are representatives of a novel class II tumour-suppressor family, designated as H-REV107-like. 
This domain is the catalytic N-terminal proline-rich region of the protein. The downstream region is a putative C-terminal transmembrane domain which is found to be crucial for cellular localisation, but not necessary for the enzyme activity. H-REV107-like proteins are homologous to lecithin retinol acyltransferase (LRAT), an enzyme that catalyzes the transfer of the sn-1 acyl group of phosphatidylcholine to all-trans-retinol and forming a retinyl ester." Q#2808 - CGI_10019748 superfamily 244943 21 206 8.26E-51 166.181 cl08415 TPK superfamily - - "Thiamine pyrophosphokinase; Thiamine pyrophosphokinase (TPK, EC:2.7.6.2, also spelled thiamin pyrophosphokinase) catalyzes the transfer of a pyrophosphate group from ATP to vitamin B1 (thiamine) to form the coenzyme thiamine pyrophosphate (TPP). TPP is required for central metabolic functions, and thiamine deficiency is associated with potentially fatal human diseases. The structure of thiamine pyrophosphokinase suggests that the enzyme may operate by a mechanism of pyrophosphoryl transfer similar to those described for pyrophosphokinases functioning in nucleotide biosynthesis." Q#2809 - CGI_10019749 superfamily 217882 489 783 6.76E-115 353.896 cl11712 ORC2 superfamily - - Origin recognition complex subunit 2; All DNA replication initiation is driven by a single conserved eukaryotic initiator complex termed the origin recognition complex (ORC). The ORC is a six protein complex. The function of ORC is reviewed in. Q#2809 - CGI_10019749 superfamily 242406 60 203 1.48E-17 80.7133 cl01271 DUF1768 superfamily - - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. 
Q#2810 - CGI_10019750 superfamily 220692 59 364 2.14E-25 103.823 cl18570 7TM_GPCR_Srw superfamily - - Serpentine type 7TM GPCR chemoreceptor Srw; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srw is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. The genes encoding Srw do not appear to be under as strong an adaptive evolutionary pressure as those of Srz. Q#2811 - CGI_10004417 superfamily 244307 1505 1968 0 682.961 cl06123 DHR2_DOCK superfamily - - "Dock Homology Region 2, a GEF domain, of Dedicator of Cytokinesis proteins; DOCK proteins comprise a family of atypical guanine nucleotide exchange factors (GEFs) that lack the conventional Dbl homology (DH) domain. As GEFs, they activate the small GTPases Rac and Cdc42 by exchanging bound GDP for free GTP. They are also called the CZH (CED-5, Dock180, and MBC-zizimin homology) family, after the first family members identified. Dock180 was first isolated as a binding partner for the adaptor protein Crk. The Caenorhabditis elegans protein, Ced-5, is essential for cell migration and phagocytosis, while the Drosophila ortholog, Myoblast city (MBC), is necessary for myoblast fusion and dorsal closure. DOCKs are divided into four classes (A-D) based on sequence similarity and domain architecture: class A includes Dock1 (or Dock180), 2 and 5; class B includes Dock3 and 4; class C includes Dock6, 7, and 8; and class D includes Dock9, 10 and 11. All DOCKs contain two homology domains: the DHR-1 (Dock homology region-1), also called CZH1, and DHR-2 (also called CZH2 or Docker). This alignment model represents the DHR-2 domain of DOCK proteins, which contains the catalytic GEF activity for Rac and/or Cdc42." 
Q#2811 - CGI_10004417 superfamily 246669 540 677 1.44E-65 222.228 cl14603 C2 superfamily N - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#2811 - CGI_10004417 superfamily 221285 65 146 2.14E-13 68.917 cl13339 DUF3398 superfamily - - Domain of unknown function (DUF3398); This domain is functionally uncharacterized. This domain is found in eukaryotes. This presumed domain is about 100 amino acids in length. Q#2812 - CGI_10004418 superfamily 245201 33 403 0 722.332 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins."
Q#2812 - CGI_10004418 superfamily 241566 1076 1122 7.29E-11 59.8132 cl00040 C1 superfamily - - "Protein kinase C conserved region 1 (C1). Cysteine-rich zinc binding domain. Some members of this domain family bind phorbol esters and diacylglycerol, some are reported to bind RasGTP. May occur in tandem arrangement. Diacylglycerol (DAG) is a second messenger, released by activation of Phospholipase D. Phorbol Esters (PE) can act as analogues of DAG and mimic its downstream effects in, for example, tumor promotion. Protein Kinases C are activated by DAG/PE, this activation is mediated by their N-terminal conserved region (C1). DAG/PE binding may be phospholipid dependent. C1 domains may also mediate DAG/PE signals in chimaerins (a family of Rac GTPase activating proteins), RasGRPs (exchange factors for Ras/Rap1), and Munc13 isoforms (scaffolding proteins involved in exocytosis)." Q#2812 - CGI_10004418 superfamily 247725 1044 1088 1.81E-07 50.8481 cl17171 PH-like superfamily N - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting proteins to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and other proteins." Q#2812 - CGI_10004418 superfamily 149849 877 943 0.00133433 38.6283 cl07487 Rho_Binding superfamily - - "Rho Binding; Rho Binding Domain is responsible for the recognition and binding of Rho binding domain-containing proteins (such as ROCK) to Rho, resulting in activation of the GTPase which in turn modulates the phosphorylation of various signalling proteins.
This domain is within an amphipathic alpha-helical coiled-coil and interacts with Rho through predominantly hydrophobic interactions." Q#2813 - CGI_10004419 superfamily 241862 200 349 1.13E-21 91.6488 cl00437 COG0428 superfamily N - Predicted divalent heavy-metal cations transporter [Inorganic ion transport and metabolism] Q#2816 - CGI_10007066 superfamily 248012 7 75 0.00377584 34.2212 cl17458 TIR_2 superfamily C - TIR domain; This is a family of bacterial Toll-like receptors. Q#2817 - CGI_10007067 superfamily 217740 29 183 7.70E-12 60.4529 cl18427 Scramblase superfamily C - Scramblase; Scramblase is palmitoylated and contains a potential protein kinase C phosphorylation site. Scramblase exhibits Ca2+-activated phospholipid scrambling activity in vitro. There are also possible SH3 and WW binding motifs. Scramblase is involved in the redistribution of phospholipids after cell activation or injury. Q#2818 - CGI_10007068 superfamily 247856 68 123 7.46E-05 38.6829 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#2821 - CGI_10007071 superfamily 247723 241 309 1.25E-36 133.226 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. 
RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#2821 - CGI_10007071 superfamily 207717 162 236 9.64E-26 102.133 cl02755 LAM superfamily - - "LA motif RNA-binding domain; This domain is found at the N-terminus of La RNA-binding proteins as well as in other related proteins. Typically, the domain co-occurs with an RNA-recognition motif (RRM), and together these domains function to bind primary transcripts of RNA polymerase III in the La autoantigen (Lupus La protein, LARP3, or Sjoegren syndrome type B antigen, SS-B). A variety of La-related proteins (LARPs or La ribonucleoproteins), with differing domain architecture, appear to function as RNA-binding proteins in eukaryotic cellular processes." Q#2822 - CGI_10007072 superfamily 241706 22 102 2.77E-48 151.533 cl00229 eIF1_SUI1_like superfamily - - "Eukaryotic initiation factor 1 and related proteins; Members of the eIF1/SUI1 (eukaryotic initiation factor 1) family are found in eukaryotes, archaea, and some bacteria; eukaryotic members are understood to play an important role in accurate initiator codon recognition during translation initiation. eIF1 interacts with 18S rRNA in the 40S ribosomal subunit during eukaryotic translation initiation. Point mutations in the yeast eIF1 implicate the protein in maintaining accurate start-site selection but its mechanism of action is unknown. The function of non-eukaryotic family members is also unclear."
Q#2824 - CGI_10007074 superfamily 241647 1288 1316 1.37E-07 50.219 cl00157 WW superfamily - - Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs. Q#2824 - CGI_10007074 superfamily 243091 484 607 2.12E-40 147.481 cl02566 SET superfamily - - "SET domain; SET domains are protein lysine methyltransferase enzymes. SET domains appear to be protein-protein interaction domains. It has been demonstrated that SET domains mediate interactions with a family of proteins that display similarity with dual-specificity phosphatases (dsPTPases). A subset of SET domains have been called PR domains. These domains are divergent in sequence from other SET domains, but also appear to mediate protein-protein interaction. The SET domain consists of two regions known as SET-N and SET-C. SET-C forms an unusual and conserved knot-like structure of probably functional importance. Additionally to SET-N and SET-C, an insert region (SET-I) and flanking regions of high structural variability form part of the overall structure." Q#2824 - CGI_10007074 superfamily 149349 1355 1443 5.69E-22 93.0782 cl07026 SRI superfamily - - "SRI (Set2 Rpb1 interacting) domain; The SRI (Set2 Rpb1 interacting) domain mediates RNA polymerase II interaction and couples histone H3 K36 methylation with transcript elongation. This domain is conserved from yeast to humans. 
Members of this family form a compact, closed three-helix bundle, with an up-down-up topology. The first and second helices are antiparallel to each other and are of similar length; the third helix, which is packed across helices alpha1 and alpha2 is slightly shorter, consisting of only 15 amino acids. Most conserved hydrophobic residues are largely buried in the interior of the structure and form an extensive and contiguous hydrophobic core that stabilises the packing of the three-helix bundle. This domain mediates RNA polymerase II interaction and couples histone H3 K36 methylation with transcript elongation." Q#2824 - CGI_10007074 superfamily 197795 432 483 2.26E-20 87.455 cl02673 AWS superfamily - - associated with SET domains; subdomain of PRESET Q#2824 - CGI_10007074 superfamily 214703 608 624 0.000167436 41.2368 cl02636 PostSET superfamily - - Cysteine-rich motif following a subset of SET domains; Cysteine-rich motif following a subset of SET domains. Q#2825 - CGI_10007075 superfamily 247750 44 289 5.67E-71 224.664 cl17196 E1_enzyme_family superfamily - - "Superfamily of activating enzymes (E1) of the ubiquitin-like proteins. This family includes classical ubiquitin-activating enzymes E1, ubiquitin-like (ubl) activating enzymes and other mechanistic homologs, like MoeB, Thif1 and others. The common reaction mechanism catalyzed by MoeB, ThiF and the E1 enzymes begins with a nucleophilic attack of the C-terminal carboxylate of MoaD, ThiS and ubiquitin, respectively, on the alpha-phosphate of an ATP molecule bound at the active site of the activating enzymes, leading to the formation of a high-energy acyladenylate intermediate and subsequently to the formation of a thiocarboxylate at the C termini of MoaD and ThiS." Q#2826 - CGI_10007076 superfamily 222431 338 426 1.52E-21 88.8185 cl16447 RPAP3_C superfamily - - Potential Monad-binding region of RPAP3; This domain is found at the C-terminus of RNA-polymerase II-associated proteins.
These proteins bind to Monad and are involved in regulating apoptosis. They contain TPR-repeats towards the N_terminus. Q#2828 - CGI_10007078 superfamily 221630 24 91 9.20E-07 42.3982 cl13918 CWC25 superfamily - - "Pre-mRNA splicing factor; This domain family is found in eukaryotes, and is approximately 100 amino acids in length. The family is found in association with pfam10197. There is a single completely conserved residue Y that may be functionally important. Cwc25 has been identified to associate with pre-mRNA splicing factor Cef1/Ntc85, a component of the Prp19-associated complex (NTC) involved in spliceosome activation. Cwc25 is neither tightly associated with NTC nor required for spliceosome activation, but is required for the first catalytic reaction." Q#2837 - CGI_10009832 superfamily 245213 294 332 3.29E-08 51.0982 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#2837 - CGI_10009832 superfamily 245213 525 560 5.96E-07 47.2462 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. 
Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#2837 - CGI_10009832 superfamily 245213 448 484 1.21E-06 46.4758 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#2837 - CGI_10009832 superfamily 245213 412 445 6.38E-06 44.1646 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#2837 - CGI_10009832 superfamily 245213 494 521 0.000128544 40.6978 cl09941 EGF_CA superfamily N - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. 
Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#2837 - CGI_10009832 superfamily 245213 335 369 0.000293233 39.5422 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#2837 - CGI_10009832 superfamily 245213 377 407 0.000719449 38.3866 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#2837 - CGI_10009832 superfamily 219501 28 104 2.31E-19 83.9189 cl06622 MNNL superfamily - - N terminus of Notch ligand; This entry represents a region of conserved sequence at the N terminus of several Notch ligand proteins. Q#2840 - CGI_10009835 superfamily 201526 50 123 8.99E-27 98.7644 cl09522 Synaptobrevin superfamily - - Synaptobrevin; Synaptobrevin. 
Q#2841 - CGI_10009836 superfamily 241874 51 463 0 550.356 cl00456 SLC5-6-like_sbd superfamily N - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#2841 - CGI_10009836 superfamily 241874 12 57 9.01E-16 78.8718 cl00456 SLC5-6-like_sbd superfamily C - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. 
NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#2841 - CGI_10009836 superfamily 245010 467 536 2.35E-05 43.1348 cl09111 Prefoldin superfamily - - "Prefoldin is a hexameric molecular chaperone complex, found in both eukaryotes and archaea, that binds and stabilizes newly synthesized polypeptides allowing them to fold correctly. The complex contains two alpha and four beta subunits, the two subunits being evolutionarily related. In archaea, there is usually only one gene for each subunit while in eukaryotes there are two or more paralogous genes encoding each subunit adding heterogeneity to the structure of the hexamer. The structure of the complex consists of a double beta barrel assembly with six protruding coiled-coils." Q#2841 - CGI_10009836 superfamily 241874 520 563 0.0025599 39.1334 cl00456 SLC5-6-like_sbd superfamily N - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine.
NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#2842 - CGI_10009837 superfamily 248458 52 236 1.05E-05 46.5381 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. 
Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#2842 - CGI_10009837 superfamily 247913 428 609 8.75E-29 120.656 cl17359 PTR2 superfamily N - POT family; The POT (proton-dependent oligopeptide transport) family all appear to be proton dependent transporters. Q#2843 - CGI_10009838 superfamily 243034 551 650 2.64E-21 90.5171 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#2843 - CGI_10009838 superfamily 243034 619 718 2.26E-17 78.9611 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids
[WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#2843 - CGI_10009838 superfamily 243034 59 166 4.98E-05 42.3672 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to
accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#2843 - CGI_10009838 superfamily 243034 440 581 0.000235471 40.4412 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#2844 - CGI_10009839 superfamily 247723 392 473 4.28E-39 137.31 cl17169 RRM_SF
superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#2844 - CGI_10009839 superfamily 247723 10 82 9.28E-31 113.483 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)."
Q#2845 - CGI_10009840 superfamily 241578 693 859 3.18E-13 69.517 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#2845 - CGI_10009840 superfamily 246031 319 440 3.66E-20 89.5212 cl12567 Beta-Casp superfamily - - Beta-Casp domain; The beta-CASP domain is found C terminal to the beta-lactamase domain in pre-mRNA 3'-end-processing endonuclease. The active site of this enzyme is located at the interface of these two domains. Q#2846 - CGI_10009841 superfamily 218912 51 139 1.62E-05 43.7879 cl18485 COG2 superfamily N - "COG (conserved oligomeric Golgi) complex component, COG2; The COG complex comprises eight proteins COG1-8. The COG complex plays critical roles in Golgi structure and function. The proposed function of the complex is to mediate the initial physical contact between transport vesicles and their membrane targets. A comparable role in tethering vesicles has been suggested for at least six additional large multisubunit complexes, including the exocyst, a complex that mediates trafficking to the plasma membrane. 
COG2 structure reveals a six-helix bundle with few conserved surface features but a general resemblance to recently determined crystal structures of four different exocyst subunits. These bundles in COG2 may act as platforms for interaction with other trafficking proteins including SNAREs (soluble N-ethylmaleimide factor attachment protein receptors) and Rabs." Q#2847 - CGI_10009842 superfamily 241646 462 506 3.46E-06 44.7483 cl00156 WAP superfamily - - "whey acidic protein-type four-disulfide core domains. Members of the family include whey acidic protein, elafin (elastase-specific inhibitor), caltrin-like protein (a calcium transport inhibitor) and other extracellular proteinase inhibitors. A group of proteins containing 8 characteristically-spaced cysteine residues forming disulphide bonds, have been termed '4-disulphide core' proteins. Protease inhibition occurs by insertion of the inhibitory loop into the active site pocket and interference with the catalytic residues of the protease." Q#2848 - CGI_10001210 superfamily 243077 7 59 3.48E-15 64.4889 cl02542 DnaJ superfamily - - "DnaJ domain or J-domain. DnaJ/Hsp40 (heat shock protein 40) proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperonin family. Hsp40 proteins are characterized by the presence of a J domain, which mediates the interaction with Hsp70. They may contain other domains as well, and the architectures provide a means of classification." Q#2849 - CGI_10015052 superfamily 248097 82 201 5.95E-21 84.6242 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#2850 - CGI_10015053 superfamily 216456 474 576 8.40E-22 93.5422 cl03182 RYDR_ITPR superfamily C - "RIH domain; The RIH (RyR and IP3R Homology) domain is an extracellular domain from two types of calcium channels. 
This region is found in the ryanodine receptor and the inositol-1,4,5- trisphosphate receptor. This domain may form a binding site for IP3." Q#2850 - CGI_10015053 superfamily 197746 112 166 1.21E-05 43.4839 cl02624 MIR superfamily - - Domain in ryanodine and inositol trisphosphate receptors and protein O-mannosyltransferases; Domain in ryanodine and inositol trisphosphate receptors and protein O-mannosyltransferases. Q#2850 - CGI_10015053 superfamily 197746 310 338 0.00206054 36.5503 cl02624 MIR superfamily C - Domain in ryanodine and inositol trisphosphate receptors and protein O-mannosyltransferases; Domain in ryanodine and inositol trisphosphate receptors and protein O-mannosyltransferases. Q#2851 - CGI_10015054 superfamily 241599 484 532 3.25E-10 56.4829 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#2851 - CGI_10015054 superfamily 241599 395 453 2.38E-08 51.0901 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#2852 - CGI_10015055 superfamily 241599 482 530 9.32E-11 58.0237 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#2852 - CGI_10015055 superfamily 241599 393 451 1.85E-08 51.4753 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." 
Q#2853 - CGI_10015056 superfamily 216456 610 788 4.35E-50 178.286 cl03182 RYDR_ITPR superfamily - - "RIH domain; The RIH (RyR and IP3R Homology) domain is an extracellular domain from two types of calcium channels. This region is found in the ryanodine receptor and the inositol-1,4,5- trisphosphate receptor. This domain may form a binding site for IP3." Q#2853 - CGI_10015056 superfamily 216456 1 90 1.76E-25 107.024 cl03182 RYDR_ITPR superfamily N - "RIH domain; The RIH (RyR and IP3R Homology) domain is an extracellular domain from two types of calcium channels. This region is found in the ryanodine receptor and the inositol-1,4,5- trisphosphate receptor. This domain may form a binding site for IP3." Q#2854 - CGI_10015057 superfamily 218109 144 192 1.47E-06 45.3942 cl12292 Gly_transf_sug superfamily N - "Glycosyltransferase sugar-binding region containing DXD motif; The DXD motif is a short conserved motif found in many families of glycosyltransferases, which add a range of different sugars to other sugars, phosphates and proteins. DXD-containing glycosyltransferases all use nucleoside diphosphate sugars as donors and require divalent cations, usually manganese. The DXD motif is expected to play a carbohydrate binding role in sugar-nucleoside diphosphate and manganese dependent glycosyltransferases." Q#2855 - CGI_10015058 superfamily 218109 144 192 1.40E-06 45.3942 cl12292 Gly_transf_sug superfamily N - "Glycosyltransferase sugar-binding region containing DXD motif; The DXD motif is a short conserved motif found in many families of glycosyltransferases, which add a range of different sugars to other sugars, phosphates and proteins. DXD-containing glycosyltransferases all use nucleoside diphosphate sugars as donors and require divalent cations, usually manganese. The DXD motif is expected to play a carbohydrate binding role in sugar-nucleoside diphosphate and manganese dependent glycosyltransferases." 
Q#2856 - CGI_10015059 superfamily 245847 24 143 7.56E-10 53.3294 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#2857 - CGI_10015060 superfamily 247755 1 167 5.92E-97 311.539 cl17201 ABC_ATPase superfamily C - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#2857 - CGI_10015060 superfamily 247755 1070 1142 3.68E-36 138.97 cl17201 ABC_ATPase superfamily N - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#2857 - CGI_10015060 superfamily 244201 517 634 2.77E-28 111.939 cl05797 SMC_hinge superfamily - - SMC proteins Flexible Hinge Domain; This family represents the hinge region of the SMC (Structural Maintenance of Chromosomes) family of proteins. 
The hinge region is responsible for formation of the DNA interacting dimer. It is also possible that the precise structure of it is an essential determinant of the specificity of the DNA-protein interaction. Q#2857 - CGI_10015060 superfamily 243054 822 931 0.000520825 41.6624 cl02488 SPEC superfamily C - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#2857 - CGI_10015060 superfamily 245835 671 848 0.000893023 41.1849 cl12013 BAR superfamily - - "The Bin/Amphiphysin/Rvs (BAR) domain, a dimerization module that binds membranes and detects membrane curvature; BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions including organelle biogenesis, membrane trafficking or remodeling, and cell division and migration. Mutations in BAR containing proteins have been linked to diseases and their inactivation in cells leads to altered membrane dynamics. A BAR domain with an additional N-terminal amphipathic helix (an N-BAR) can drive membrane curvature. These N-BAR domains are found in amphiphysins and endophilins, among others. BAR domains are also frequently found alongside domains that determine lipid specificity, such as the Pleckstrin Homology (PH) and Phox Homology (PX) domains which are present in beta centaurins (ACAPs and ASAPs) and sorting nexins, respectively. 
A FES-CIP4 Homology (FCH) domain together with a coiled coil region is called the F-BAR domain and is present in Pombe/Cdc15 homology (PCH) family proteins, which include Fes/Fes tyrosine kinases, PACSIN or syndapin, CIP4-like proteins, and srGAPs, among others. The Inverse (I)-BAR or IRSp53/MIM homology Domain (IMD) is found in multi-domain proteins, such as IRSp53 and MIM, that act as scaffolding proteins and transducers of a variety of signaling pathways that link membrane dynamics and the underlying actin cytoskeleton. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The I-BAR domain induces membrane protrusions in the opposite direction compared to classical BAR and F-BAR domains, which produce membrane invaginations. BAR domains that also serve as protein interaction domains include those of arfaptin and OPHN1-like proteins, among others, which bind to Rac and Rho GAP domains, respectively." Q#2860 - CGI_10015063 superfamily 194414 135 181 1.76E-08 51.5113 cl02684 zf-DBF superfamily - - DBF zinc finger; This domain is predicted to bind metal ions and is often found associated with pfam00533 and pfam02178. Q#2861 - CGI_10015064 superfamily 241750 30 297 4.44E-85 258.272 cl00281 metallo-dependent_hydrolases superfamily - - "Superfamily of metallo-dependent hydrolases (also called amidohydrolase superfamily) is a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. 
The family includes urease alpha, adenosine deaminase, phosphotriesterase dihydroorotases, allantoinases, hydantoinases, AMP-, adenine and cytosine deaminases, imidazolonepropionase, aryldialkylphosphatase, chlorohydrolases, formylmethanofuran dehydrogenases and others." Q#2862 - CGI_10015065 superfamily 246908 115 214 1.63E-47 161.211 cl15255 SH2 superfamily - - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." Q#2862 - CGI_10015065 superfamily 247683 54 106 2.57E-23 93.0308 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. 
Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#2862 - CGI_10015065 superfamily 245201 228 485 7.41E-154 442.145 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#2868 - CGI_10013626 superfamily 248312 21 159 6.32E-05 40.0293 cl17758 PMP22_Claudin superfamily - - PMP-22/EMP/MP20/Claudin family; PMP-22/EMP/MP20/Claudin family. Q#2870 - CGI_10013628 superfamily 247755 72 126 2.07E-16 72.5376 cl17201 ABC_ATPase superfamily NC - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#2872 - CGI_10013630 superfamily 245206 2 262 9.33E-124 356.46 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. 
NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#2873 - CGI_10013631 superfamily 247724 19 183 1.22E-115 333.508 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." 
Q#2874 - CGI_10013632 superfamily 248458 43 414 1.14E-18 85.4433 cl17904 MFS superfamily - - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#2875 - CGI_10013633 superfamily 147195 6 83 3.48E-45 154.626 cl04835 NCD1 superfamily - - "NAB conserved region 1 (NCD1); Nab1 and Nab2 are co-repressors that specifically interact with and repress transcription mediated by the three members of the NGFI-A (Egr-1, Krox24, zif/268) family of transcription factors. 
This region consists of the N-terminal NAB conserved region 1, which interacts with the EGR1 inhibitory domain (R1). It may also mediate multimerisation." Q#2875 - CGI_10013633 superfamily 218320 213 311 3.24E-35 129.677 cl04836 NCD2 superfamily N - "NAB conserved region 2 (NCD2); Nab1 and Nab2 are co-repressors that specifically interact with and repress transcription mediated by the three members of the NGFI-A (Egr-1, Krox24, zif/268) family of transcription factors. This family consists of NAB conserved region 2, near the C-terminus of the protein. It is necessary for transcriptional repression by the Nab proteins. It is also required for transcription activation by Nab proteins at Nab-activated promoters." Q#2877 - CGI_10013635 superfamily 247724 76 270 3.61E-95 299.794 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. 
Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#2877 - CGI_10013635 superfamily 247725 518 593 1.25E-38 140.919 cl17171 PH-like superfamily C - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#2877 - CGI_10013635 superfamily 243072 913 972 2.54E-12 65.4826 cl02529 ANK superfamily C - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#2877 - CGI_10013635 superfamily 243047 759 867 1.48E-49 172.034 cl02464 ArfGap superfamily - - "Putative GTPase activating protein for Arf; Putative zinc fingers with GTPase activating proteins (GAPs) towards the small GTPase, Arf. The GAP of ARD1 stimulates GTPase hydrolysis for ARD1 but not ARFs." Q#2877 - CGI_10013635 superfamily 247725 679 739 4.88E-18 81.9833 cl17171 PH-like superfamily N - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. 
All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#2879 - CGI_10013637 superfamily 243088 39 165 1.46E-62 200.287 cl02563 PX_domain superfamily - - "The Phox Homology domain, a phosphoinositide binding module; The PX domain is a phosphoinositide (PI) binding module involved in targeting proteins to membranes. Proteins containing PX domains interact with PIs and have been implicated in highly diverse functions such as cell signaling, vesicular trafficking, protein sorting, lipid modification, cell polarity and division, activation of T and B cells, and cell survival. Many members of this superfamily bind phosphatidylinositol-3-phosphate (PI3P) but in some cases, other PIs such as PI4P or PI(3,4)P2, among others, are the preferred substrates. In addition to protein-lipid interaction, the PX domain may also be involved in protein-protein interaction, as in the cases of p40phox, p47phox, and some sorting nexins (SNXs). The PX domain is conserved from yeast to humans and is found in more than 100 proteins. The majority of PX domain-containing proteins are SNXs, which play important roles in endosomal sorting." Q#2879 - CGI_10013637 superfamily 245835 186 430 6.95E-86 263.477 cl12013 BAR superfamily - - "The Bin/Amphiphysin/Rvs (BAR) domain, a dimerization module that binds membranes and detects membrane curvature; BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions including organelle biogenesis, membrane trafficking or remodeling, and cell division and migration. Mutations in BAR containing proteins have been linked to diseases and their inactivation in cells leads to altered membrane dynamics. 
A BAR domain with an additional N-terminal amphipathic helix (an N-BAR) can drive membrane curvature. These N-BAR domains are found in amphiphysins and endophilins, among others. BAR domains are also frequently found alongside domains that determine lipid specificity, such as the Pleckstrin Homology (PH) and Phox Homology (PX) domains which are present in beta centaurins (ACAPs and ASAPs) and sorting nexins, respectively. A FES-CIP4 Homology (FCH) domain together with a coiled coil region is called the F-BAR domain and is present in Pombe/Cdc15 homology (PCH) family proteins, which include Fes/Fes tyrosine kinases, PACSIN or syndapin, CIP4-like proteins, and srGAPs, among others. The Inverse (I)-BAR or IRSp53/MIM homology Domain (IMD) is found in multi-domain proteins, such as IRSp53 and MIM, that act as scaffolding proteins and transducers of a variety of signaling pathways that link membrane dynamics and the underlying actin cytoskeleton. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The I-BAR domain induces membrane protrusions in the opposite direction compared to classical BAR and F-BAR domains, which produce membrane invaginations. BAR domains that also serve as protein interaction domains include those of arfaptin and OPHN1-like proteins, among others, which bind to Rac and Rho GAP domains, respectively." Q#2881 - CGI_10013639 superfamily 248422 2023 2199 3.06E-05 46.9305 cl17868 CHAT superfamily - - CHAT domain; These proteins appear to be related to peptidases in peptidase clan CD that includes the caspases. This domain has been termed the CHAT domain for Caspase HetF Associated with Tprs. This family has been identified as a sister group to the separins. 
Q#2882 - CGI_10013640 superfamily 241750 24 360 1.11E-97 296.037 cl00281 metallo-dependent_hydrolases superfamily - - "Superfamily of metallo-dependent hydrolases (also called amidohydrolase superfamily) is a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. The family includes urease alpha, adenosine deaminase, phosphotriesterase dihydroorotases, allantoinases, hydantoinases, AMP-, adenine and cytosine deaminases, imidazolonepropionase, aryldialkylphosphatase, chlorohydrolases, formylmethanofuran dehydrogenases and others." Q#2883 - CGI_10013641 superfamily 241596 107 163 1.64E-14 64.9279 cl00081 HLH superfamily - - "Helix-loop-helix domain, found in specific DNA-binding proteins that act as transcription factors; 60-100 amino acids long. 
A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE 5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved in both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#2884 - CGI_10013642 superfamily 247724 50 282 4.45E-149 422.725 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. 
The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#2884 - CGI_10013642 superfamily 247063 276 351 6.35E-40 137.226 cl15768 TGS superfamily - - "The TGS domain, named after the ThrRS, GTPase, and SpoT/RelA proteins where it occurs, is structurally similar to ubiquitin. TGS is a small domain of about 50 amino acid residues with a predominantly beta-sheet structure. There is no direct information on the function of the TGS domain, but its presence in two types of regulatory proteins (the GTPases and guanosine polyphosphate phosphohydrolases/synthetases) suggests a ligand (most likely nucleotide)-binding, regulatory role." Q#2885 - CGI_10013643 superfamily 220774 38 305 2.68E-15 78.8201 cl11118 EIF4E-T superfamily C - "Nucleocytoplasmic shuttling protein for mRNA cap-binding EIF4E; EIF4E-T is the transporter protein for shuttling the mRNA cap-binding protein EIF4E protein, targeting it for nuclear import. EIF4E-T contains several key binding domains including two functional leucine-rich NESs (nuclear export signals) between residues 438-447 and 613-638 in the human protein. The other two binding domains are an EIF4E-binding site, between residues 27-42 in Q9EST3, and a bipartite NLS (nuclear localisation signals) between 194-211, and these lie in family EIF4E-T_N. EIF4E is the eukaryotic translation initiation factor 4E that is the rate-limiting factor for cap-dependent translation initiation." 
Q#2887 - CGI_10013645 superfamily 247723 153 236 3.48E-43 149.329 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#2887 - CGI_10013645 superfamily 247792 257 299 1.17E-08 51.6776 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#2887 - CGI_10013645 superfamily 245220 312 374 4.18E-24 96.3198 cl09957 zf-UBP superfamily - - Zn-finger in ubiquitin-hydrolases and other protein; Zn-finger in ubiquitin-hydrolases and other protein. 
Q#2889 - CGI_10013647 superfamily 241550 31 325 3.58E-120 351.834 cl00015 nt_trans superfamily - - "nucleotidyl transferase superfamily; nt_trans (nucleotidyl transferase) This superfamily includes the class I amino-acyl tRNA synthetases, pantothenate synthetase (PanC), ATP sulfurylase, and the cytidylyltransferases, all of which have a conserved dinucleotide-binding domain." Q#2893 - CGI_10003928 superfamily 247692 78 417 1.14E-120 361.172 cl17068 AFD_class_I superfamily C - "Adenylate forming domain, Class I; This family includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain." Q#2894 - CGI_10003929 superfamily 247692 1 92 1.75E-17 76.2804 cl17068 AFD_class_I superfamily N - "Adenylate forming domain, Class I; This family includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain." Q#2895 - CGI_10006473 superfamily 247068 39 138 6.22E-31 113.562 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2895 - CGI_10006473 superfamily 247068 146 244 2.35E-19 81.5909 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2895 - CGI_10006473 superfamily 247068 266 326 8.36E-07 46.1526 cl15786 CA_like superfamily C - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2895 - CGI_10006473 superfamily 247068 5 31 9.32E-05 39.9894 cl15786 CA_like superfamily N - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2896 - CGI_10006474 superfamily 247907 302 462 5.47E-28 113.667 cl17353 LamG superfamily - - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." 
Q#2896 - CGI_10006474 superfamily 247907 526 674 1.40E-15 77.0732 cl17353 LamG superfamily - - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." Q#2896 - CGI_10006474 superfamily 238012 830 877 2.49E-13 67.7646 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#2896 - CGI_10006474 superfamily 245213 216 251 5.83E-07 49.1722 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#2896 - CGI_10006474 superfamily 245213 486 519 7.14E-05 43.009 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#2896 - CGI_10006474 superfamily 245213 703 732 0.00245632 38.3866 cl09941 EGF_CA superfamily N - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#2896 - CGI_10006474 superfamily 215647 1302 1532 4.10E-52 186.66 cl18338 7tm_2 superfamily - - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GPCRs). They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. 
Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#2896 - CGI_10006474 superfamily 221370 957 1189 9.42E-16 78.9525 cl13441 DUF3497 superfamily - - "Domain of unknown function (DUF3497); This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 213 to 257 amino acids in length. This domain is found associated with pfam02793, pfam00002, pfam01825. This domain has a single completely conserved residue W that may be functionally important." Q#2896 - CGI_10006474 superfamily 243146 2385 2430 3.61E-11 61.5234 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#2896 - CGI_10006474 superfamily 243086 1216 1288 1.33E-09 56.9998 cl02559 GPS superfamily - - "Latrophilin/CL-1-like GPS domain; Domain present in latrophilin/CL-1, sea urchin REJ and polycystin." Q#2896 - CGI_10006474 superfamily 243146 2346 2396 9.21E-06 45.6271 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#2896 - CGI_10006474 superfamily 243029 886 942 0.000564606 40.7969 cl02422 HRM superfamily - - Hormone receptor domain; This extracellular domain contains four conserved cysteines that probably form disulphide bridges. The domain is found in a variety of hormone receptors. It may be a ligand binding domain. Q#2896 - CGI_10006474 superfamily 243146 2544 2581 0.00366625 38.0262 cl02701 Kelch_3 superfamily C - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#2903 - CGI_10005476 superfamily 220679 16 182 8.08E-41 138.614 cl18567 Methyltransf_16 superfamily - - Putative methyltransferase; Putative methyltransferase. 
Q#2904 - CGI_10005477 superfamily 247723 22 117 4.08E-13 65.8943 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#2904 - CGI_10005477 superfamily 242788 133 228 3.07E-05 43.1113 cl01936 Rad52_Rad22 superfamily N - "Rad52/22 family double-strand break repair protein; The DNA single-strand annealing proteins (SSAPs), such as RecT, Red-beta, ERF and Rad52, function in RecA-dependent and RecA-independent DNA recombination pathways. This family includes proteins related to Rad52. These proteins contain two helix-hairpin-helix motifs." Q#2905 - CGI_10005478 superfamily 241802 1900 2212 2.72E-75 255.89 cl00342 Trp-synth-beta_II superfamily - - "Tryptophan synthase beta superfamily (fold type II); this family of pyridoxal phosphate (PLP)-dependent enzymes catalyzes beta-replacement and beta-elimination reactions. 
This CD corresponds to aminocyclopropane-1-carboxylate deaminase (ACCD), tryptophan synthase beta chain (Trp-synth_B), cystathionine beta-synthase (CBS), O-acetylserine sulfhydrylase (CS), serine dehydratase (Ser-dehyd), threonine dehydratase (Thr-dehyd), diaminopropionate ammonia lyase (DAL), and threonine synthase (Thr-synth). ACCD catalyzes the conversion of 1-aminocyclopropane-1-carboxylate to alpha-ketobutyrate and ammonia. Tryptophan synthase folds into a tetramer, where the beta chain is the catalytic PLP-binding subunit and catalyzes the formation of L-tryptophan from indole and L-serine. CBS is a tetrameric hemeprotein that catalyzes condensation of serine and homocysteine to cystathionine. CS is a homodimer that catalyzes the formation of L-cysteine from O-acetyl-L-serine. Ser-dehyd catalyzes the conversion of L- or D-serine to pyruvate and ammonia. Thr-dehyd is active as a homodimer and catalyzes the conversion of L-threonine to 2-oxobutanoate and ammonia. DAL is also a homodimer and catalyzes the alpha, beta-elimination reaction of both L- and D-alpha, beta-diaminopropionate to form pyruvate and ammonia. Thr-synth catalyzes the formation of threonine and inorganic phosphate from O-phosphohomoserine." Q#2905 - CGI_10005478 superfamily 207662 163 222 7.59E-07 49.3834 cl02596 NR_DBD_like superfamily - - "DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers; DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. It interacts with a specific DNA site upstream of the target gene and modulates the rate of transcriptional initiation. Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions, from development, reproduction, to homeostasis and metabolism in animals (metazoans). 
The family contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). Most nuclear receptors bind as homodimers or heterodimers to their target sites, which consist of two hexameric half-sites. Specificity is determined by the half-site sequence, the relative orientation of the half-sites and the number of spacer nucleotides between the half-sites. However, a growing number of nuclear receptors have been reported to bind to DNA as monomers." Q#2911 - CGI_10005484 superfamily 241559 24 128 5.19E-22 93.9147 cl00030 CH superfamily - - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." Q#2911 - CGI_10005484 superfamily 241559 141 236 2.59E-14 71.5731 cl00030 CH superfamily - - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." Q#2911 - CGI_10005484 superfamily 241559 243 335 6.20E-09 55.3947 cl00030 CH superfamily - - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). 
The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." Q#2911 - CGI_10005484 superfamily 216033 775 855 1.35E-17 80.4556 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#2911 - CGI_10005484 superfamily 216033 1216 1301 5.34E-17 78.9148 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#2911 - CGI_10005484 superfamily 216033 1315 1400 4.44E-15 73.1368 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#2911 - CGI_10005484 superfamily 216033 1124 1207 1.34E-13 68.8996 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#2911 - CGI_10005484 superfamily 216033 546 628 1.32E-12 66.2032 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#2911 - CGI_10005484 superfamily 216033 858 954 2.60E-09 56.188 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#2911 - CGI_10005484 superfamily 216033 704 765 3.96E-05 43.4764 cl16959 Filamin superfamily N - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#2913 - CGI_10005486 superfamily 203591 83 216 9.59E-29 110.154 cl06275 DUF1399 superfamily - - Protein of unknown function (DUF1399); This family represents a conserved region approximately 150 residues long within a number of hypothetical plant proteins of unknown function. Q#2915 - CGI_10012147 superfamily 201217 12 61 7.22E-10 49.4464 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. 
Q#2916 - CGI_10012148 superfamily 243092 31 329 5.07E-57 196.017 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#2916 - CGI_10012148 superfamily 221242 366 666 6.36E-59 202.381 cl13285 DUF3337 superfamily - - Domain of unknown function (DUF3337); This family of proteins are functionally uncharacterized. This family is only found in eukaryotes. This presumed domain is typically between 285 to 342 amino acids in length. Q#2918 - CGI_10012150 superfamily 245206 33 320 2.33E-107 318.068 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. 
NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#2919 - CGI_10012151 superfamily 245206 32 321 1.81E-109 323.075 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#2920 - CGI_10012152 superfamily 241913 173 245 2.25E-09 53.3789 cl00509 hot_dog superfamily C - "The hotdog fold was initially identified in the E. coli FabA (beta-hydroxydecanoyl-acyl carrier protein (ACP)-dehydratase) structure and subsequently in 4HBT (4-hydroxybenzoyl-CoA thioesterase) from Pseudomonas. A number of other seemingly unrelated proteins also share the hotdog fold. 
These proteins have related, but distinct, catalytic activities that include metabolic roles such as thioester hydrolysis in fatty acid metabolism, and degradation of phenylacetic acid and the environmental pollutant 4-chlorobenzoate. This superfamily also includes the PaaI-like protein FapR, a non-catalytic bacterial homolog involved in transcriptional regulation of fatty acid biosynthesis." Q#2923 - CGI_10012155 superfamily 219996 120 172 0.000878434 37.0528 cl07379 Gon7 superfamily N - Gon7 family; In S. cerevisiae Gon7 is a member of the KEOPS protein complex. A protein complex proposed to be involved in transcription and promoting telomere uncapping and telomere elongation. Q#2924 - CGI_10012156 superfamily 247856 106 167 6.10E-12 57.5577 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#2924 - CGI_10012156 superfamily 247856 33 89 0.00198429 34.0605 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#2925 - CGI_10012157 superfamily 241868 1393 1565 9.86E-21 92.5686 cl00447 Nudix_Hydrolase superfamily - - "Nudix hydrolase is a superfamily of enzymes found in all three kingdoms of life, and it catalyzes the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+ for their activity. 
Members of this family are recognized by a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which forms a structural motif that functions as a metal binding and catalytic site. Substrates of nudix hydrolase include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance and "house-cleaning" enzymes. Substrate specificity is used to define child families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. This superfamily consists of at least nine families: IPP (isopentenyl diphosphate) isomerase, ADP ribose pyrophosphatase, mutT pyrophosphohydrolase, coenzyme-A pyrophosphatase, MTH1-7,8-dihydro-8-oxoguanine-triphosphatase, diadenosine tetraphosphate hydrolase, NADH pyrophosphatase, GDP-mannose hydrolase and the c-terminal portion of the mutY adenine glycosylase." Q#2926 - CGI_10012158 superfamily 245201 373 589 3.40E-34 132.746 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. 
These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#2926 - CGI_10012158 superfamily 216276 204 280 1.47E-06 48.3131 cl15639 Activin_recp superfamily - - "Activin types I and II receptor domain; This Pfam entry consists of both TGF-beta receptor types. This is an alignment of the hydrophilic cysteine-rich ligand-binding domains. Both receptor types (type I and II) possess a 9 amino acid cysteine box, with the consensus CCX{4-5}CN. The type I receptors also possess 7 extracellular residues preceding the cysteine box." Q#2926 - CGI_10012158 superfamily 216276 35 103 1.08E-05 45.6167 cl15639 Activin_recp superfamily - - "Activin types I and II receptor domain; This Pfam entry consists of both TGF-beta receptor types. This is an alignment of the hydrophilic cysteine-rich ligand-binding domains. Both receptor types (type I and II) possess a 9 amino acid cysteine box, with the consensus CCX{4-5}CN. The type I receptors also possess 7 extracellular residues preceding the cysteine box." Q#2927 - CGI_10012159 superfamily 243179 136 240 5.67E-29 107.434 cl02781 tetraspanin_LEL superfamily - - "Tetraspanin, extracellular domain or large extracellular loop (LEL). Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. The tetraspanin family contains CD9, CD63, CD37, CD53, CD82, CD151, and CD81, amongst others. Tetraspanins are involved in diverse processes such as cell activation and proliferation, adhesion and motility, differentiation, cancer, and others. Their various functions may relate to their ability to act as molecular facilitators, grouping specific cell-surface proteins and affecting formation and stability of signaling complexes. 
Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web", which may also include integrins." Q#2929 - CGI_10006832 superfamily 247068 64 150 0.000669757 37.7258 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#2930 - CGI_10006833 superfamily 247639 47 290 6.29E-42 147.608 cl16914 O-FucT_like superfamily - - "GDP-fucose protein O-fucosyltransferase and related proteins; O-fucosyltransferase-like proteins are GDP-fucose dependent enzymes with similarities to the family 1 glycosyltransferases (GT1). They are soluble ER proteins that may be proteolytically cleaved from a membrane-associated preprotein, and are involved in the O-fucosylation of protein substrates, the core fucosylation of growth factor receptors, and other processes." Q#2931 - CGI_10006834 superfamily 217390 174 313 3.26E-05 42.1617 cl18407 TPT superfamily - - Triose-phosphate Transporter family; This family includes transporters with a specificity for triose phosphate. Q#2933 - CGI_10006836 superfamily 219910 3 129 2.09E-10 53.7735 cl07253 DUF1761 superfamily - - Protein of unknown function (DUF1761); Family of conserved fungal and bacterial membrane proteins with unknown function. 
Q#2934 - CGI_10006837 superfamily 247724 14 176 7.20E-102 293.571 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#2935 - CGI_10000237 superfamily 247724 3 177 3.32E-131 368.432 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#2936 - CGI_10000218 superfamily 222150 67 92 0.00018971 36.2157 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#2937 - CGI_10018525 superfamily 245206 4 274 6.29E-104 304.93 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." 
Q#2940 - CGI_10018528 superfamily 245225 167 533 4.61E-28 115.803 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily - - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein coupled receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine-binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." 
Q#2942 - CGI_10018530 superfamily 247675 248 312 0.00681797 36.1625 cl17011 Arginase_HDAC superfamily C - "Arginase-like and histone-like hydrolases; Arginase-like/histone-like hydrolase superfamily includes metal-dependent enzymes that belong to Arginase-like amidino hydrolase family and histone/histone-like deacetylase class I, II, IV family, respectively. These enzymes catalyze hydrolysis of amide bond. Arginases are known to be involved in control of cellular levels of arginine and ornithine, in histidine and arginine degradation and in clavulanic acid biosynthesis. Deacetylases play a role in signal transduction through histone and/or other protein modification and can repress/activate transcription of a number of different genes. They participate in different cellular processes including cell cycle regulation, DNA damage response, embryonic development, cytokine signaling important for immune response and post-translational control of the acetyl coenzyme A synthetase. Mammalian histone deacetylases are known to be involved in progression of different tumors. Specific inhibitors of mammalian histone deacetylases are an emerging class of promising novel anticancer drugs." Q#2943 - CGI_10018531 superfamily 191913 61 164 9.72E-09 52.7203 cl07876 NIPSNAP superfamily C - NIPSNAP; Members of this family include many hypothetical proteins. It also includes members of the NIPSNAP family which have putative roles in vesicular transport. This domain is often found in duplicate. Q#2945 - CGI_10018533 superfamily 222150 880 905 0.0035342 36.9861 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#2946 - CGI_10018534 superfamily 220249 158 220 5.06E-10 53.3781 cl09695 H_lectin superfamily - - "H-type lectin domain; The H-type lectin domain is a unit of six beta chains, combined into a homo-hexamer. It is involved in self/non-self recognition of cells, through binding with carbohydrates. 
It is sometimes found in association with the F5_F8_type_C domain pfam00754." Q#2947 - CGI_10018535 superfamily 245595 103 396 5.07E-164 465.072 cl11393 Peptidase_M14_like superfamily - - "M14 family of metallocarboxypeptidases and related proteins; The M14 family of metallocarboxypeptidases (MCPs), also known as funnelins, are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Two major subfamilies of the M14 family, defined based on sequence and structural homology, are the A/B and N/E subfamilies. Enzymes belonging to the A/B subfamily are normally synthesized as inactive precursors containing preceding signal peptide, followed by an N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases. The A/B enzymes can be further divided based on their substrate specificity; Carboxypeptidase A-like (CPA-like) enzymes favor hydrophobic residues while carboxypeptidase B-like (CPB-like) enzymes only cleave the basic residues lysine or arginine. The A forms have slightly different specificities, with Carboxypeptidase A1 (CPA1) preferring aliphatic and small aromatic residues, and CPA2 preferring the bulky aromatic side chains. Enzymes belonging to the N/E subfamily enzymes are not produced as inactive precursors and instead rely on their substrate specificity and subcellular compartmentalization to prevent inappropriate cleavage. They contain an extra C-terminal transthyretin-like domain, thought to be involved in folding or formation of oligomers. 
MCPs can also be classified based on their involvement in specific physiological processes; the pancreatic MCPs participate only in alimentary digestion and include carboxypeptidase A and B (A/B subfamily), while others, namely regulatory MCPs or the N/E subfamily, are involved in more selective reactions, mainly in non-digestive tissues and fluids, acting on blood coagulation/fibrinolysis, inflammation and local anaphylaxis, pro-hormone and neuropeptide processing, cellular response and others. Another MCP subfamily, is that of succinylglutamate desuccinylase /aspartoacylase, which hydrolyzes N-acetyl-L-aspartate (NAA), and deficiency in which is the established cause of Canavan disease. Another subfamily (referred to as subfamily C) includes an exceptional type of activity in the MCP family, that of dipeptidyl-peptidase activity of gamma-glutamyl-(L)-meso-diaminopimelate peptidase I which is involved in bacterial cell wall metabolism." Q#2947 - CGI_10018535 superfamily 216944 34 84 4.31E-08 49.8883 cl03496 Propep_M14 superfamily N - "Carboxypeptidase activation peptide; Carboxypeptidases are found in abundance in pancreatic secretions. The pro-segment moiety (activation peptide) accounts for up to a quarter of the total length of the peptidase, and is responsible for modulation of folding and activity of the pro-enzyme." Q#2949 - CGI_10018537 superfamily 241750 62 274 2.39E-44 154.428 cl00281 metallo-dependent_hydrolases superfamily C - "Superfamily of metallo-dependent hydrolases (also called amidohydrolase superfamily) is a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. 
The family includes urease alpha, adenosine deaminase, phosphotriesterase dihydroorotases, allantoinases, hydantoinases, AMP-, adenine and cytosine deaminases, imidazolonepropionase, aryldialkylphosphatase, chlorohydrolases, formylmethanofuran dehydrogenases and others." Q#2952 - CGI_10018540 superfamily 220390 52 218 6.12E-42 142.794 cl10748 Peptidase_M76 superfamily - - Peptidase M76 family; This is a family of metalloproteases. Proteins in this family are also annotated as Ku70-binding proteins. Q#2953 - CGI_10018541 superfamily 243035 84 178 1.05E-06 49.5406 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. 
Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#2953 - CGI_10018541 superfamily 215647 2390 2603 6.05E-47 171.638 cl18338 7tm_2 superfamily - - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GPCRs). They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. 
Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97 ; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#2953 - CGI_10018541 superfamily 243086 2307 2351 5.65E-10 58.1554 cl02559 GPS superfamily - - "Latrophilin/CL-1-like GPS domain; Domain present in latrophilin/CL-1, sea urchin REJ and polycystin." Q#2955 - CGI_10018543 superfamily 243072 15 139 7.75E-14 64.327 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#2956 - CGI_10018544 superfamily 243072 89 218 2.09E-16 75.883 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#2956 - CGI_10018544 superfamily 243072 308 444 8.25E-15 71.2606 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#2956 - CGI_10018544 superfamily 243072 159 299 3.99E-12 63.5566 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#2961 - CGI_10018549 superfamily 245598 531 864 2.54E-144 432.14 cl11396 Patatin_and_cPLA2 superfamily - - "Patatins and Phospholipases; Patatin-like phospholipase. This family consists of various patatin glycoproteins from plants. The patatin protein accounts for up to 40% of the total soluble protein in potato tubers. Patatin is a storage protein, but it also has the enzymatic activity of a lipid acyl hydrolase, catalyzing the cleavage of fatty acids from membrane lipids. Members of this family have also been found in vertebrates. This family also includes the catalytic domain of cytosolic phospholipase A2 (PLA2; EC 3.1.1.4) hydrolyzes the sn-2-acyl ester bond of phospholipids to release arachidonic acid. At the active site, cPLA2 contains a serine nucleophile through which the catalytic mechanism is initiated. The active site is partially covered by a solvent-accessible flexible lid. 
cPLA2 displays interfacial activation as it exists in both "closed lid" and "open lid" forms." Q#2961 - CGI_10018549 superfamily 243072 341 461 2.56E-26 105.929 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#2961 - CGI_10018549 superfamily 243072 200 327 1.44E-19 86.2834 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#2964 - CGI_10018553 superfamily 217895 21 134 1.52E-05 44.1711 cl04401 CD20 superfamily - - "CD20-like family; This family includes the CD20 protein and the beta subunit of the high affinity receptor for IgE Fc. The high affinity receptor for IgE is a tetrameric structure consisting of a single IgE-binding alpha subunit, a single beta subunit, and two disulfide-linked gamma subunits. The alpha subunit of Fc epsilon RI and most Fc receptors are homologous members of the Ig superfamily. By contrast, the beta and gamma subunits from Fc epsilon RI are not homologous to the Ig superfamily. Both molecules have four putative transmembrane segments and a probable topology where both amino- and carboxy termini protrude into the cytoplasm. This family also includes LR8 like proteins from humans, mice and rats. 
The function of the human LR8 protein is unknown although it is known to be strongly expressed in the lung fibroblasts. This family also includes sarcospan, a transmembrane component of dystrophin-associated glycoprotein. Loss of the sarcoglycan complex and sarcospan alone is sufficient to cause muscular dystrophy. The role of the sarcoglycan complex and sarcospan is thought to be to strengthen the dystrophin axis connecting the basement membrane with the cytoskeleton." Q#2965 - CGI_10018554 superfamily 243035 173 262 8.64E-19 79.2009 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. 
Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#2965 - CGI_10018554 superfamily 198867 5 45 2.11E-07 47.3361 cl06652 BACK superfamily N - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation)." 
Q#2967 - CGI_10018556 superfamily 243035 78 186 9.69E-21 83.4381 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#2969 - CGI_10002834 superfamily 241609 505 581 8.28E-24 95.9079 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#2969 - CGI_10002834 superfamily 241571 371 499 3.17E-07 48.5627 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#2969 - CGI_10002834 superfamily 241583 139 321 9.79E-40 143.095 cl00064 ZnMc superfamily - - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." 
Q#2970 - CGI_10001618 superfamily 220714 54 180 1.45E-68 211.343 cl11024 Kin17_mid superfamily - - "Domain of Kin17 curved DNA-binding protein; Kin17_mid is the conserved central 169 residue region of a family of Kin17 proteins. Towards the N-terminal end there is a zinc-finger domain, and in human and mouse members there is a RecA-like domain further downstream. The Kin17 protein in humans forms intra-nuclear foci during cell proliferation and is re-distributed in the nucleoplasm during the cell cycle." Q#2970 - CGI_10001618 superfamily 217505 181 291 2.35E-46 162.414 cl04021 Serinc superfamily N - Serine incorporator (Serinc); This is a family of eukaryotic membrane proteins which incorporate serine into membranes and facilitate the synthesis of the serine-derived lipids phosphatidylserine and sphingolipid. Members of this family contain 11 transmembrane domains and form intracellular complexes with key enzymes involved in serine and sphingolipid biosynthesis. Q#2970 - CGI_10001618 superfamily 205121 28 52 0.00456817 34.0168 cl18263 zf-met superfamily - - "Zinc-finger of C2H2 type; This is a zinc-finger domain with the CxxCx(12)Hx(6)H motif, found in multiple copies in a wide range of proteins from plants to metazoans. Some member proteins, particularly those from plants, are annotated as being RNA-binding." 
Q#2971 - CGI_10001619 superfamily 247792 84 140 5.24E-11 54.5743 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#2973 - CGI_10000768 superfamily 244363 293 378 6.33E-28 108.236 cl06336 Commd superfamily N - "COMM_Domain, a family of domains found at the C-terminus of HCarG, the copper metabolism gene MURR1 product, and related proteins. Presumably all COMM_Domain containing proteins are located in the nucleus and the COMM domain plays a role in protein-protein interactions. Several family members have been shown to bind and inhibit NF-kappaB. Murr1/Commd1 is a protein involved in copper homeostasis, which has also been identified as a regulator of the human delta epithelial sodium channel. HCaRG, a nuclear protein that might be involved in cell proliferation, is negatively regulated by extracellular calcium concentration, and its basal mRNA levels are higher in hypertensive animals." Q#2976 - CGI_10004388 superfamily 245814 351 420 5.73E-12 62.5067 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#2976 - CGI_10004388 superfamily 245814 628 667 1.85E-05 43.1644 cl11960 Ig superfamily C - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#2976 - CGI_10004388 superfamily 245814 142 224 8.05E-05 41.3369 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#2984 - CGI_10004370 superfamily 243905 74 172 0.000111646 39.4778 cl04855 BLUF superfamily - - Sensors of blue-light using FAD; The BLUF domain has been shown to bind FAD in the AppA protein. 
AppA is involved in the repression of photosynthesis genes in response to blue-light. Q#2985 - CGI_10004371 superfamily 242793 160 335 3.11E-63 204.209 cl01947 MT-A70 superfamily - - "MT-A70; MT-A70 is the S-adenosylmethionine-binding subunit of human mRNA:m6A methyl-transferase (MTase), an enzyme that sequence-specifically methylates adenines in pre-mRNAs." Q#2986 - CGI_10004372 superfamily 192535 42 226 0.000129325 42.1978 cl18179 7TM_GPCR_Srsx superfamily C - Serpentine type 7TM GPCR chemoreceptor Srsx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srsx is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#2988 - CGI_10013747 superfamily 246669 5 114 1.88E-66 217.56 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." 
Q#2989 - CGI_10013748 superfamily 241610 1082 1135 1.01E-16 78.4458 cl00101 KU superfamily - - BPTI/Kunitz family of serine protease inhibitors; Structure is a disulfide rich alpha+beta fold. BPTI (bovine pancreatic trypsin inhibitor) is an extensively studied model structure. Q#2989 - CGI_10013748 superfamily 241610 362 413 5.79E-10 58.8006 cl00101 KU superfamily - - BPTI/Kunitz family of serine protease inhibitors; Structure is a disulfide rich alpha+beta fold. BPTI (bovine pancreatic trypsin inhibitor) is an extensively studied model structure. Q#2989 - CGI_10013748 superfamily 215827 1385 1558 9.12E-29 118.34 cl02830 Tyrosinase superfamily - - Common central domain of tyrosinase; This family also contains polyphenol oxidases and some hemocyanins. Binds two copper ions via two sets of three histidines. This family is related to pfam00372. Q#2989 - CGI_10013748 superfamily 243065 3904 4063 3.11E-28 115.192 cl02516 VWD superfamily - - von Willebrand factor type D domain; Luciferin-2-monooxygenase from Vargula hilgendorfii contains a vwd domain. Its function is unrelated but the similarity is very strong by several methods. Q#2989 - CGI_10013748 superfamily 215827 562 720 4.52E-25 107.554 cl02830 Tyrosinase superfamily - - Common central domain of tyrosinase; This family also contains polyphenol oxidases and some hemocyanins. Binds two copper ions via two sets of three histidines. This family is related to pfam00372. Q#2989 - CGI_10013748 superfamily 244710 4104 4174 2.72E-19 86.6753 cl07383 C8 superfamily - - "C8 domain; This domain contains 8 conserved cysteine residues, but this family only contains 7 of them to overlaps with other domains. It is found in disease-related proteins including von Willebrand factor, Alpha tectorin, Zonadhesin and Mucin. It is often found on proteins containing pfam00094 and pfam01826." 
Q#2989 - CGI_10013748 superfamily 248289 3541 3595 7.72E-10 58.5895 cl17735 VWC superfamily - - von Willebrand factor type C domain; The high cutoff was used to prevent overlap with pfam00094. Q#2989 - CGI_10013748 superfamily 248289 3851 3908 5.33E-08 53.1967 cl17735 VWC superfamily - - von Willebrand factor type C domain; The high cutoff was used to prevent overlap with pfam00094. Q#2989 - CGI_10013748 superfamily 248289 2326 2380 1.24E-07 52.0411 cl17735 VWC superfamily - - von Willebrand factor type C domain; The high cutoff was used to prevent overlap with pfam00094. Q#2989 - CGI_10013748 superfamily 248289 3426 3482 1.59E-07 51.6559 cl17735 VWC superfamily - - von Willebrand factor type C domain; The high cutoff was used to prevent overlap with pfam00094. Q#2989 - CGI_10013748 superfamily 248289 3310 3366 9.21E-07 49.7299 cl17735 VWC superfamily - - von Willebrand factor type C domain; The high cutoff was used to prevent overlap with pfam00094. Q#2989 - CGI_10013748 superfamily 248289 2090 2141 1.52E-06 48.9595 cl17735 VWC superfamily - - von Willebrand factor type C domain; The high cutoff was used to prevent overlap with pfam00094. Q#2989 - CGI_10013748 superfamily 248289 2971 3025 1.83E-06 48.5743 cl17735 VWC superfamily - - von Willebrand factor type C domain; The high cutoff was used to prevent overlap with pfam00094. Q#2989 - CGI_10013748 superfamily 248289 2564 2620 5.62E-06 47.4187 cl17735 VWC superfamily - - von Willebrand factor type C domain; The high cutoff was used to prevent overlap with pfam00094. Q#2989 - CGI_10013748 superfamily 248289 3656 3716 6.98E-06 47.0335 cl17735 VWC superfamily - - von Willebrand factor type C domain; The high cutoff was used to prevent overlap with pfam00094. Q#2989 - CGI_10013748 superfamily 248289 2444 2501 8.08E-06 46.6483 cl17735 VWC superfamily - - von Willebrand factor type C domain; The high cutoff was used to prevent overlap with pfam00094. 
Q#2989 - CGI_10013748 superfamily 248289 1927 1978 1.19E-05 46.2631 cl17735 VWC superfamily - - von Willebrand factor type C domain; The high cutoff was used to prevent overlap with pfam00094. Q#2989 - CGI_10013748 superfamily 248289 2505 2561 5.10E-05 44.3371 cl17735 VWC superfamily - - von Willebrand factor type C domain; The high cutoff was used to prevent overlap with pfam00094. Q#2989 - CGI_10013748 superfamily 248289 2625 2681 6.49E-05 43.9519 cl17735 VWC superfamily - - von Willebrand factor type C domain; The high cutoff was used to prevent overlap with pfam00094. Q#2989 - CGI_10013748 superfamily 248289 2383 2441 0.000113229 43.1815 cl17735 VWC superfamily - - von Willebrand factor type C domain; The high cutoff was used to prevent overlap with pfam00094. Q#2989 - CGI_10013748 superfamily 248289 3029 3080 0.000433749 41.6407 cl17735 VWC superfamily - - von Willebrand factor type C domain; The high cutoff was used to prevent overlap with pfam00094. Q#2989 - CGI_10013748 superfamily 248289 2855 2909 0.000471008 41.6407 cl17735 VWC superfamily - - von Willebrand factor type C domain; The high cutoff was used to prevent overlap with pfam00094. Q#2989 - CGI_10013748 superfamily 248289 3083 3137 0.000561923 41.2555 cl17735 VWC superfamily - - von Willebrand factor type C domain; The high cutoff was used to prevent overlap with pfam00094. Q#2989 - CGI_10013748 superfamily 248289 2741 2795 0.000563578 41.2555 cl17735 VWC superfamily - - von Willebrand factor type C domain; The high cutoff was used to prevent overlap with pfam00094. Q#2989 - CGI_10013748 superfamily 248289 2266 2323 0.000590722 41.2555 cl17735 VWC superfamily - - von Willebrand factor type C domain; The high cutoff was used to prevent overlap with pfam00094. Q#2989 - CGI_10013748 superfamily 248289 1982 2039 0.00105333 40.4851 cl17735 VWC superfamily - - von Willebrand factor type C domain; The high cutoff was used to prevent overlap with pfam00094. 
Q#2989 - CGI_10013748 superfamily 248289 3253 3307 0.00172277 39.7147 cl17735 VWC superfamily - - von Willebrand factor type C domain; The high cutoff was used to prevent overlap with pfam00094. Q#2989 - CGI_10013748 superfamily 248289 2798 2852 0.00486648 38.5591 cl17735 VWC superfamily - - von Willebrand factor type C domain; The high cutoff was used to prevent overlap with pfam00094. Q#2989 - CGI_10013748 superfamily 248289 3791 3843 0.00524871 38.1739 cl17735 VWC superfamily - - von Willebrand factor type C domain; The high cutoff was used to prevent overlap with pfam00094. Q#2989 - CGI_10013748 superfamily 248289 3140 3184 0.00660302 38.1739 cl17735 VWC superfamily C - von Willebrand factor type C domain; The high cutoff was used to prevent overlap with pfam00094. Q#2989 - CGI_10013748 superfamily 248289 3598 3653 0.00675819 37.7887 cl17735 VWC superfamily - - von Willebrand factor type C domain; The high cutoff was used to prevent overlap with pfam00094. Q#2990 - CGI_10013749 superfamily 243058 250 347 4.77E-08 50.3907 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." 
Q#2991 - CGI_10013750 superfamily 241574 285 514 2.63E-119 354.585 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins; only one copy may be active." Q#2991 - CGI_10013750 superfamily 197431 95 155 6.08E-05 42.7964 cl06408 UP_III_II superfamily NC - "Uroplakin IIIb, IIIa and II; Uroplakins (UPs) are a family of proteins that associate with each other to form plaques on the apical surface of the urothelium, the pseudo-stratified epithelium lining the urinary tract from renal pelvis to the bladder outlet. UPs are classified into 3 types: UPIa and UPIb, UPII, and UPIIIa and IIIb. UPIs are tetraspanins that have four transmembrane domains separating one large and one small extracellular domain while UPII and UPIIIs are single-pass transmembrane proteins. UPIa and UPIb form specific heterodimers with UPII and UPIII, respectively, which allows them to exit the endoplasmic reticulum. UPII/UPIa and UPIIIs/UPIb form heterotetramers; six of these tetramers form the 16nm particle, seen in the hexagonal array of the asymmetric unit membrane, which is believed to form a urinary tract barrier. Uroplakins are also believed to play a role during urinary tract morphogenesis." Q#2992 - CGI_10013751 superfamily 241584 459 575 7.61E-09 54.0395 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. 
Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#2992 - CGI_10013751 superfamily 241584 346 452 2.26E-06 46.7207 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#2992 - CGI_10013751 superfamily 241584 264 328 0.000703024 39.0167 cl00065 FN3 superfamily C - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#2992 - CGI_10013751 superfamily 242406 676 817 1.92E-19 86.4913 cl01271 DUF1768 superfamily - - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. 
The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. Q#2993 - CGI_10013752 superfamily 241554 47 187 2.02E-60 199.92 cl00019 Macro superfamily - - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#2993 - CGI_10013752 superfamily 247069 342 478 1.84E-21 92.063 cl15787 SEC14 superfamily - - "Sec14p-like lipid-binding domain. Found in secretory proteins, such as S. cerevisiae phosphatidylinositol transfer protein (Sec14p), and in lipid regulated proteins such as RhoGAPs, RhoGEFs and neurofibromin (NF1). SEC14 domain of Dbl is known to associate with G protein beta/gamma subunits." Q#2994 - CGI_10013753 superfamily 218702 100 147 0.000523408 36.111 cl05324 Dimer_Tnp_hAT superfamily N - hAT family dimerisation domain; This dimerisation domain is found at the C terminus of the transposases of elements belonging to the Activator superfamily (hAT element superfamily). The isolated dimerisation domain forms extremely stable dimers in vitro. 
Q#2995 - CGI_10013754 superfamily 241874 77 617 0 824.986 cl00456 SLC5-6-like_sbd superfamily - - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#2996 - CGI_10013755 superfamily 241570 240 361 2.62E-15 73.129 cl00047 CAP_ED superfamily - - "effector domain of the CAP family of transcription factors; members include CAP (or cAMP receptor protein (CRP)), which binds cAMP, FNR (fumarate and nitrate reduction), which uses an iron-sulfur cluster to sense oxygen) and CooA, a heme containing CO sensor. In all cases binding of the effector leads to conformational changes and the ability to activate transcription. Cyclic nucleotide-binding domain similar to CAP are also present in cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) and vertebrate cyclic nucleotide-gated ion-channels. 
Cyclic nucleotide-monophosphate binding domain; proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues; the best studied is the prokaryotic catabolite gene activator, CAP, where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure; three conserved glycine residues are thought to be essential for maintenance of the structural integrity of the beta-barrel; CooA is a homodimeric transcription factor that belongs to CAP family; cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain; cAPK's are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain; cGPK's are single chain enzymes that include the two copies of the domain in their N-terminal section; also found in vertebrate cyclic nucleotide-gated ion-channels" Q#2996 - CGI_10013755 superfamily 241570 140 220 1.04E-05 44.6242 cl00047 CAP_ED superfamily N - "effector domain of the CAP family of transcription factors; members include CAP (or cAMP receptor protein (CRP)), which binds cAMP, FNR (fumarate and nitrate reduction), which uses an iron-sulfur cluster to sense oxygen) and CooA, a heme containing CO sensor. In all cases binding of the effector leads to conformational changes and the ability to activate transcription. Cyclic nucleotide-binding domain similar to CAP are also present in cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) and vertebrate cyclic nucleotide-gated ion-channels. 
Cyclic nucleotide-monophosphate binding domain; proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues; the best studied is the prokaryotic catabolite gene activator, CAP, where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure; three conserved glycine residues are thought to be essential for maintenance of the structural integrity of the beta-barrel; CooA is a homodimeric transcription factor that belongs to CAP family; cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain; cAPK's are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain; cGPK's are single chain enzymes that include the two copies of the domain in their N-terminal section; also found in vertebrate cyclic nucleotide-gated ion-channels" Q#2998 - CGI_10013757 superfamily 241584 210 302 2.68E-08 50.5727 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#2998 - CGI_10013757 superfamily 245814 88 178 4.04E-08 50.1965 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#3000 - CGI_10013759 superfamily 110440 356 383 0.00356022 35.0761 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase); proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." 
Q#3002 - CGI_10013761 superfamily 243092 2 233 3.12E-28 108.962 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#3003 - CGI_10004662 superfamily 243072 479 604 4.84E-25 100.921 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#3003 - CGI_10004662 superfamily 243072 378 505 2.09E-09 55.4674 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. 
The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#3003 - CGI_10004662 superfamily 241568 253 298 5.56E-06 44.376 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#3005 - CGI_10004664 superfamily 215754 39 117 4.23E-15 68.434 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#3005 - CGI_10004664 superfamily 215754 240 275 5.45E-09 51.8704 cl02813 Mito_carr superfamily N - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#3006 - CGI_10004665 superfamily 243082 336 723 2.95E-141 436.303 cl02553 Peptidase_C19 superfamily - - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. 
The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." Q#3010 - CGI_10017878 superfamily 248458 50 135 8.87E-06 45.3825 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." 
Q#3011 - CGI_10017879 superfamily 243035 279 386 9.35E-14 68.4153 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#3011 - CGI_10017879 superfamily 241619 177 228 9.44E-06 43.9977 cl00112 PAN_APPLE superfamily N - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#3011 - CGI_10017879 superfamily 243035 56 146 3.49E-05 42.607 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. 
Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. 
Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#3012 - CGI_10017880 superfamily 241568 379 413 0.00023178 39.3684 cl00043 CCP superfamily N - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#3013 - CGI_10017881 superfamily 247755 940 1167 1.08E-121 379.145 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#3013 - CGI_10017881 superfamily 247755 312 501 1.01E-101 323.267 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. 
The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#3013 - CGI_10017881 superfamily 216049 104 228 7.87E-20 90.8082 cl18356 ABC_membrane superfamily NC - ABC transporter transmembrane region; This family represents a unit of six transmembrane helices. Many members of the ABC transporter family (pfam00005) have two such regions. Q#3013 - CGI_10017881 superfamily 216049 606 894 1.65E-15 77.3262 cl18356 ABC_membrane superfamily - - ABC transporter transmembrane region; This family represents a unit of six transmembrane helices. Many members of the ABC transporter family (pfam00005) have two such regions. Q#3014 - CGI_10017882 superfamily 243555 20 209 3.96E-20 86.6762 cl03871 Chitin_bind_3 superfamily - - "Chitin binding domain; This domain is found associated with a wide variety of cellulose binding domain. This domain however is a chitin binding domain. This domain is found in isolation in baculoviral spheroidins and spindolins, protein of unknown function." Q#3016 - CGI_10017884 superfamily 110204 679 882 1.97E-91 289.875 cl03127 Lysyl_oxidase superfamily - - Lysyl oxidase; Lysyl oxidase. Q#3016 - CGI_10017884 superfamily 243061 26 131 2.00E-31 119.369 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#3016 - CGI_10017884 superfamily 243061 585 674 4.25E-18 81.2342 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. 
These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#3017 - CGI_10017885 superfamily 241782 12 457 0 686.395 cl00321 AAT_I superfamily - - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phosphorylase family (fold type V)." Q#3018 - CGI_10017886 superfamily 192535 55 287 0.000220122 41.0422 cl18179 7TM_GPCR_Srsx superfamily - - Serpentine type 7TM GPCR chemoreceptor Srsx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srsx is a solo family amongst the superfamilies of chemoreceptors. 
Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#3019 - CGI_10017887 superfamily 247639 17 267 2.64E-35 129.118 cl16914 O-FucT_like superfamily - - "GDP-fucose protein O-fucosyltransferase and related proteins; O-fucosyltransferase-like proteins are GDP-fucose dependent enzymes with similarities to the family 1 glycosyltransferases (GT1). They are soluble ER proteins that may be proteolytically cleaved from a membrane-associated preprotein, and are involved in the O-fucosylation of protein substrates, the core fucosylation of growth factor receptors, and other processes." Q#3021 - CGI_10017889 superfamily 245213 308 344 2.24E-06 46.861 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#3021 - CGI_10017889 superfamily 245213 458 493 6.23E-06 45.3202 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#3021 - CGI_10017889 superfamily 245213 232 267 1.54E-05 44.1646 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#3021 - CGI_10017889 superfamily 245213 571 607 4.03E-05 43.009 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#3021 - CGI_10017889 superfamily 245213 271 306 7.36E-05 42.2386 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#3021 - CGI_10017889 superfamily 245213 346 382 8.95E-05 41.8534 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#3021 - CGI_10017889 superfamily 245213 118 154 0.000135894 41.4682 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#3021 - CGI_10017889 superfamily 245213 195 230 0.000328151 40.3126 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#3021 - CGI_10017889 superfamily 245213 609 645 0.000898978 39.157 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#3021 - CGI_10017889 superfamily 245213 810 844 0.00270787 37.6162 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#3021 - CGI_10017889 superfamily 245213 536 569 0.00373293 37.231 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#3021 - CGI_10017889 superfamily 245213 421 455 0.00438224 36.8458 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#3021 - CGI_10017889 superfamily 243035 1135 1263 0.00764245 36.829 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. 
CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#3022 - CGI_10017890 superfamily 247792 11 58 9.94E-06 43.9736 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#3022 - CGI_10017890 superfamily 128778 205 330 1.78E-06 46.8743 cl17972 BBC superfamily - - B-Box C-terminal domain; Coiled coil region C-terminal to (some) B-Box domains Q#3022 - CGI_10017890 superfamily 243109 545 739 6.44E-06 45.7369 cl02614 SPRY superfamily - - "SPRY domain; SPRY domains, first identified in the SP1A kinase of Dictyostelium and rabbit Ryanodine receptor (hence the name), are homologous to B30.2. SPRY domains have been identified in at least 11 protein families, covering a wide range of functions, including regulation of cytokine signaling (SOCS), RNA metabolism (DDX1 and hnRNP), immunity to retroviruses (TRIM5alpha), intracellular calcium release (ryanodine receptors or RyR) and regulatory and developmental processes (HERC1 and Ash2L). B30.2 also contains residues in the N-terminus that form a distinct PRY domain structure; i.e. B30.2 domain consists of PRY and SPRY subdomains. B30.2 domains comprise the C-terminus of three protein families: BTNs (receptor glycoproteins of immunoglobulin superfamily); several TRIM proteins (composed of RING/B-box/coiled-coil or RBCC core); Stonutoxin (secreted poisonous protein of the stonefish Synanceia horrida). 
While SPRY domains are evolutionarily ancient, B30.2 domains are a more recent adaptation where the SPRY/PRY combination is a possible component of immune defense. Mutations found in the SPRY-containing proteins have been shown to cause Mediterranean fever and Opitz syndrome." Q#3022 - CGI_10017890 superfamily 241563 159 197 0.00602615 35.5328 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#3023 - CGI_10017891 superfamily 192535 68 345 9.05E-06 45.6646 cl18179 7TM_GPCR_Srsx superfamily - - Serpentine type 7TM GPCR chemoreceptor Srsx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srsx is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#3032 - CGI_10017900 superfamily 242406 4 113 6.37E-08 47.9713 cl01271 DUF1768 superfamily N - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. Q#3034 - CGI_10004732 superfamily 245864 33 424 2.00E-92 289.178 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. 
Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#3035 - CGI_10004733 superfamily 241758 133 232 2.06E-12 61.2318 cl00292 AANH_like superfamily N - "Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases, ATP sulphurylases Universal Stress Response protein and electron transfer flavoprotein (ETF). The domain forms an alpha/beta/alpha fold which binds to Adenosine nucleotide." Q#3035 - CGI_10004733 superfamily 241758 111 143 1.93E-09 53.1426 cl00292 AANH_like superfamily N - "Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases, ATP sulphurylases Universal Stress Response protein and electron transfer flavoprotein (ETF). The domain forms an alpha/beta/alpha fold which binds to Adenosine nucleotide." Q#3036 - CGI_10004734 superfamily 241758 7 143 4.48E-22 86.2698 cl00292 AANH_like superfamily - - "Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases, ATP sulphurylases Universal Stress Response protein and electron transfer flavoprotein (ETF). The domain forms an alpha/beta/alpha fold which binds to Adenosine nucleotide." 
Q#3040 - CGI_10004738 superfamily 110440 253 280 0.000518405 36.6169 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase); proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#3042 - CGI_10006049 superfamily 241629 294 420 9.95E-39 137.858 cl00133 SCP superfamily - - "SCP: SCP-like extracellular protein domain, found in eukaryotes and prokaryotes. This family includes plant pathogenesis-related protein 1 (PR-1), which accumulates after infections with pathogens, and may act as an anti-fungal agent or be involved in cell wall loosening. This family also includes CRISPs, mammalian cysteine-rich secretory proteins, which combine SCP with a C-terminal cysteine rich domain, and allergen 5 from vespid venom. Roles for CRISP, in response to pathogens, fertilization, and sperm maturation have been proposed. One member, Tex31 from the venom duct of Conus textile, has been shown to possess proteolytic activity sensitive to serine protease inhibitors. The human GAPR-1 protein has been reported to dimerize, and such a dimer may form an active site containing a catalytic triad. SCP has also been proposed to be a Ca++ chelating serine protease. The Ca++-chelating function would fit with various signaling processes that members of this family, such as the CRISPs, are involved in, and is supported by sequence and structural evidence of a conserved pocket containing two histidines and a glutamate. It also may explain how helothermine, a toxic peptide secreted by the beaded lizard, blocks Ca++ transporting ryanodine receptors. 
Little is known about the biological roles of the bacterial and archaeal SCP domains." Q#3042 - CGI_10006049 superfamily 241629 36 135 2.55E-27 105.673 cl00133 SCP superfamily C - "SCP: SCP-like extracellular protein domain, found in eukaryotes and prokaryotes. This family includes plant pathogenesis-related protein 1 (PR-1), which accumulates after infections with pathogens, and may act as an anti-fungal agent or be involved in cell wall loosening. This family also includes CRISPs, mammalian cysteine-rich secretory proteins, which combine SCP with a C-terminal cysteine rich domain, and allergen 5 from vespid venom. Roles for CRISP, in response to pathogens, fertilization, and sperm maturation have been proposed. One member, Tex31 from the venom duct of Conus textile, has been shown to possess proteolytic activity sensitive to serine protease inhibitors. The human GAPR-1 protein has been reported to dimerize, and such a dimer may form an active site containing a catalytic triad. SCP has also been proposed to be a Ca++ chelating serine protease. The Ca++-chelating function would fit with various signaling processes that members of this family, such as the CRISPs, are involved in, and is supported by sequence and structural evidence of a conserved pocket containing two histidines and a glutamate. It also may explain how helothermine, a toxic peptide secreted by the beaded lizard, blocks Ca++ transporting ryanodine receptors. Little is known about the biological roles of the bacterial and archaeal SCP domains." Q#3042 - CGI_10006049 superfamily 241629 136 214 6.23E-19 82.6808 cl00133 SCP superfamily C - "SCP: SCP-like extracellular protein domain, found in eukaryotes and prokaryotes. This family includes plant pathogenesis-related protein 1 (PR-1), which accumulates after infections with pathogens, and may act as an anti-fungal agent or be involved in cell wall loosening. 
This family also includes CRISPs, mammalian cysteine-rich secretory proteins, which combine SCP with a C-terminal cysteine rich domain, and allergen 5 from vespid venom. Roles for CRISP, in response to pathogens, fertilization, and sperm maturation have been proposed. One member, Tex31 from the venom duct of Conus textile, has been shown to possess proteolytic activity sensitive to serine protease inhibitors. The human GAPR-1 protein has been reported to dimerize, and such a dimer may form an active site containing a catalytic triad. SCP has also been proposed to be a Ca++ chelating serine protease. The Ca++-chelating function would fit with various signaling processes that members of this family, such as the CRISPs, are involved in, and is supported by sequence and structural evidence of a conserved pocket containing two histidines and a glutamate. It also may explain how helothermine, a toxic peptide secreted by the beaded lizard, blocks Ca++ transporting ryanodine receptors. Little is known about the biological roles of the bacterial and archaeal SCP domains." Q#3042 - CGI_10006049 superfamily 241629 216 293 2.31E-15 72.7588 cl00133 SCP superfamily C - "SCP: SCP-like extracellular protein domain, found in eukaryotes and prokaryotes. This family includes plant pathogenesis-related protein 1 (PR-1), which accumulates after infections with pathogens, and may act as an anti-fungal agent or be involved in cell wall loosening. This family also includes CRISPs, mammalian cysteine-rich secretory proteins, which combine SCP with a C-terminal cysteine rich domain, and allergen 5 from vespid venom. Roles for CRISP, in response to pathogens, fertilization, and sperm maturation have been proposed. One member, Tex31 from the venom duct of Conus textile, has been shown to possess proteolytic activity sensitive to serine protease inhibitors. The human GAPR-1 protein has been reported to dimerize, and such a dimer may form an active site containing a catalytic triad. 
SCP has also been proposed to be a Ca++ chelating serine protease. The Ca++-chelating function would fit with various signaling processes that members of this family, such as the CRISPs, are involved in, and is supported by sequence and structural evidence of a conserved pocket containing two histidines and a glutamate. It also may explain how helothermine, a toxic peptide secreted by the beaded lizard, blocks Ca++ transporting ryanodine receptors. Little is known about the biological roles of the bacterial and archaeal SCP domains." Q#3044 - CGI_10006051 superfamily 243555 24 211 1.03E-17 79.3574 cl03871 Chitin_bind_3 superfamily - - "Chitin binding domain; This domain is found associated with a wide variety of cellulose binding domain. This domain however is a chitin binding domain. This domain is found in isolation in baculoviral spheroidins and spindolins, protein of unknown function." Q#3046 - CGI_10005689 superfamily 241902 32 118 5.16E-32 112.202 cl00493 trimeric_dUTPase superfamily - - "Trimeric dUTP diphosphatases; Trimeric dUTP diphosphatases, or dUTPases, are the most common family of dUTPase, found in bacteria, eukaryotes, and archaea. They catalyze the hydrolysis of the dUTP-Mg complex (dUTP-Mg) into dUMP and pyrophosphate. This reaction is crucial for the preservation of chromosomal integrity as it removes dUTP and therefore reduces the cellular dUTP/dTTP ratio, and prevents dUTP from being incorporated into DNA. It also provides dUMP as the precursor for dTTP synthesis via the thymidylate synthase pathway. dUTPases are homotrimeric, except some monomeric viral dUTPases, which have been shown to mimic a trimer. Active sites are located at the subunit interface." 
Q#3048 - CGI_10005691 superfamily 247794 5 135 1.62E-37 131.875 cl17240 FDH_GDH_like superfamily NC - "Formate/glycerate dehydrogenases, D-specific 2-hydroxy acid dehydrogenases and related dehydrogenases; The formate/glycerate dehydrogenase like family contains a diverse group of enzymes such as formate dehydrogenase (FDH), glycerate dehydrogenase (GDH), D-lactate dehydrogenase, L-alanine dehydrogenase, and S-Adenosylhomocysteine hydrolase, that share a common 2-domain structure. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar domains of the alpha/beta Rossmann fold NAD+ binding form. The NAD(P) binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD(P) is bound, primarily to the C-terminal portion of the 2nd (internal) domain. While many members of this family are dimeric, alanine DH is hexameric and phosphoglycerate DH is tetrameric. 2-hydroxyacid dehydrogenases are enzymes that catalyze the conversion of a wide variety of D-2-hydroxy acids to their corresponding keto acids. The general mechanism is (R)-lactate + acceptor to pyruvate + reduced acceptor. Formate dehydrogenase (FDH) catalyzes the NAD+-dependent oxidation of formate ion to carbon dioxide with the concomitant reduction of NAD+ to NADH. FDHs of this family contain no metal ions or prosthetic groups. Catalysis occurs through direct transfer of a hydride ion to NAD+ without the stages of acid-base catalysis typically found in related dehydrogenases." Q#3049 - CGI_10005692 superfamily 243072 203 343 1.93E-09 57.0082 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#3049 - CGI_10005692 superfamily 201924 104 157 4.43E-05 42.5169 cl03316 Cauli_VI superfamily - - "Caulimovirus viroplasmin; This family consists of various caulimovirus viroplasmin proteins. The viroplasmin protein is encoded by gene VI and is the main component of viral inclusion bodies or viroplasms. Inclusions are the site of viral assembly, DNA synthesis and accumulation. Two domains exist within gene VI corresponding approximately to the 5' third and middle third of gene VI, these influence systemic infection in a light-dependent manner." Q#3049 - CGI_10005692 superfamily 243175 1010 1078 0.00120207 38.5588 cl02776 GST_C_family superfamily - - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. 
GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." Q#3049 - CGI_10005692 superfamily 243125 23 46 0.00217932 37.3838 cl02649 LEM superfamily N - "LEM (Lap2/Emerin/Man1) domain found in emerin, lamina-associated polypeptide 2 (LAP2), inner nuclear membrane protein Man1 and similar proteins; The family corresponds to a group of inner nuclear membrane proteins containing LEM domain. Emerin occurs in four phosphorylated forms and plays a role in cell cycle-dependent events. It is absent from the inner nuclear membrane in most patients with X-linked muscular dystrophy. Emerin interacts with A-type and B-type lamins. Man1, also termed LEM domain-containing protein 3 (LEMD3) is an integral protein of the inner nuclear membrane that binds to nuclear lamins and emerin, thus playing a role in nuclear organization. LAP2, also termed thymopoietin (TP), or thymopoietin-related peptide (TPRP), is composed of isoform alpha and isoforms beta/gamma and may be involved in chromatin organization and post-mitotic reassembly. Some LAP2 isoforms are inner nuclear membrane proteins that can bind to nuclear lamins and chromatin, while others are non-membrane nuclear polypeptides. This family also contains LEM domain-containing protein LEMP-1 and LEM2. LEMP-1, also termed cancer/testis antigen 50 (CT50), is encoded by LEMD1, a novel testis-specific gene expressed in colorectal cancers. 
LEMP-1 may function as a cancer-testis antigen for immunotherapy of colorectal carcinoma (CRC). LEM2, also termed LEMD2, is a novel Man1-related ubiquitously expressed inner nuclear membrane protein required for normal nuclear envelope morphology. Association with lamin A is required for its proper nuclear envelope localization while its binding to lamin C plays an important role in the organization of lamin A/C complexes. Some uncharacterized LEM domain-containing proteins are also included in this family. Unlike other family members, these harbor an ankyrin repeat region that may mediate protein-protein interactions." Q#3050 - CGI_10005693 superfamily 220736 425 563 3.62E-21 89.6762 cl11068 PTEN_C2 superfamily - - "C2 domain of PTEN tumour-suppressor protein; This is the C2 domain-like domain, in greek key form, of the PTEN protein, phosphatidyl-inositol triphosphate phosphatase, and it is the C-terminus. This domain may well include a CBR3 loop which means it plays a central role in membrane binding. This domain associates across an extensive interface with the N-terminal phosphatase domain DSPc (pfam00782) suggesting that the C2 domain productively positions the catalytic part of the protein onto the membrane." Q#3050 - CGI_10005693 superfamily 241574 320 418 4.25E-08 52.0685 cl00053 PTPc superfamily N - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." 
Q#3051 - CGI_10005694 superfamily 247986 381 474 1.48E-10 60.8498 cl17432 PBPb superfamily C - "Bacterial periplasmic transport systems use membrane-bound complexes and substrate-bound, membrane-associated, periplasmic binding proteins (PBPs) to transport a wide variety of substrates, such as, amino acids, peptides, sugars, vitamins and inorganic ions. PBPs have two cell-membrane translocation functions: bind substrate, and interact with the membrane bound complex. A diverse group of periplasmic transport receptors for lysine/arginine/ornithine (LAO), glutamine, histidine, sulfate, phosphate, molybdate, and methanol are included in the PBPb CD." Q#3051 - CGI_10005694 superfamily 245225 20 377 8.51E-45 165.946 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily - - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein coupled receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine-binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. 
The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#3051 - CGI_10005694 superfamily 197504 587 685 9.14E-23 95.8192 cl18192 PBPe superfamily C - Eukaryotic homologues of bacterial periplasmic substrate binding proteins; Prokaryotic homologues are represented by a separate alignment: PBPb Q#3052 - CGI_10005695 superfamily 247986 176 262 1.94E-12 67.013 cl17432 PBPb superfamily C - "Bacterial periplasmic transport systems use membrane-bound complexes and substrate-bound, membrane-associated, periplasmic binding proteins (PBPs) to transport a wide variety of substrates, such as, amino acids, peptides, sugars, vitamins and inorganic ions. PBPs have two cell-membrane translocation functions: bind substrate, and interact with the membrane bound complex. A diverse group of periplasmic transport receptors for lysine/arginine/ornithine (LAO), glutamine, histidine, sulfate, phosphate, molybdate, and methanol are included in the PBPb CD." Q#3052 - CGI_10005695 superfamily 197504 375 511 5.80E-45 159.762 cl18192 PBPe superfamily - - Eukaryotic homologues of bacterial periplasmic substrate binding proteins; Prokaryotic homologues are represented by a separate alignment: PBPb Q#3052 - CGI_10005695 superfamily 245225 15 114 4.12E-23 101.618 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily C - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. 
This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein coupled receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine-binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." 
Q#3052 - CGI_10005695 superfamily 245746 903 963 3.18E-19 83.8138 cl11668 Lig_chan-Glu_bd superfamily - - "Ligated ion channel L-glutamate- and glycine-binding site; This region, sometimes called the S1 domain, is the luminal domain just upstream of the first, M1, transmembrane region of transmembrane ion-channel proteins, and it binds L-glutamate and glycine. It is found in association with Lig_chan, pfam00060." Q#3052 - CGI_10005695 superfamily 245225 611 816 6.19E-16 79.6617 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily C - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein coupled receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine-binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. 
The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#3052 - CGI_10005695 superfamily 197504 940 1028 3.69E-12 65.0033 cl18192 PBPe superfamily C - Eukaryotic homologues of bacterial periplasmic substrate binding proteins; Prokaryotic homologues are represented by a separate alignment: PBPb Q#3053 - CGI_10015136 superfamily 177822 88 269 5.83E-23 95.3721 cl18088 PLN02164 superfamily N - sulfotransferase Q#3054 - CGI_10015137 superfamily 245201 1 123 2.93E-21 87.6773 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#3055 - CGI_10015138 superfamily 246681 322 362 8.77E-18 80.6176 cl14643 SRPBCC superfamily N - "START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC (SRPBCC) ligand-binding domain superfamily; SRPBCC domains have a deep hydrophobic ligand-binding pocket; they bind diverse ligands. 
Included in this superfamily are the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of mammalian STARD1-STARD15, and the C-terminal catalytic domains of the alpha oxygenase subunit of Rieske-type non-heme iron aromatic ring-hydroxylating oxygenases (RHOs_alpha_C), as well as the SRPBCC domains of phosphatidylinositol transfer proteins (PITPs), Bet v 1 (the major pollen allergen of white birch, Betula verrucosa), CoxG, CalC, and related proteins. Other members of this superfamily include PYR/PYL/RCAR plant proteins, the aromatase/cyclase (ARO/CYC) domains of proteins such as Streptomyces glaucescens tetracenomycin, and the SRPBCC domains of Streptococcus mutans Smu.440 and related proteins." Q#3055 - CGI_10015138 superfamily 241571 148 257 0.00229936 36.2363 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#3056 - CGI_10015139 superfamily 242406 4 105 2.04E-13 63.3793 cl01271 DUF1768 superfamily N - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. Q#3057 - CGI_10015140 superfamily 222150 176 201 0.000666476 36.2157 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#3057 - CGI_10015140 superfamily 222150 148 173 0.00241065 34.6749 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#3057 - CGI_10015140 superfamily 246975 135 156 0.00789461 33.0893 cl15478 zf-C2H2 superfamily - - "Zinc finger, C2H2 type; The C2H2 zinc finger is the classical zinc finger domain. The two conserved cysteines and histidines co-ordinate a zinc ion. The following pattern describes the zinc finger. 
#-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C] Where X can be any amino acid, and numbers in brackets indicate the number of residues. The positions marked # are those that are important for the stable fold of the zinc finger. The final position can be either his or cys. The C2H2 zinc finger is composed of two short beta strands followed by an alpha helix. The amino terminal part of the helix binds the major groove in DNA binding zinc fingers. The accepted consensus binding sequence for Sp1 is usually defined by the asymmetric hexanucleotide core GGGCGG but this sequence does not include, among others, the GAG (=CTC) repeat that constitutes a high-affinity site for Sp1 binding to the wt1 promoter." Q#3058 - CGI_10015142 superfamily 221744 56 314 1.31E-28 113.685 cl18614 CABIT superfamily - - "Cell-cycle sustaining, positive selection,; The 'CABIT' domain (for 'cysteine-containing, all- in Themis') is found in a newly identified gene family that has three mammalian homologues (Themis, Icb1 and 9130404H23Rik) that encode proteins with two CABIT domains and a highly conserved proline-rich region. In contrast, Fam59A, Fam59B and related proteins from mammals to cnidarians, including the insect Serrano proteins, have a single copy of the CABIT domain, a proline-rich region and often a C-terminal SAM (sterile-motif) domain. Multiple-sequence alignment has predicted that the CABIT domain adopts an all-strand structure with at least 12 strands, ie a dyad of six-stranded beta-barrel units. The CABIT domain contains a nearly absolutely conserved cysteine residue which is likely to be central to its function. CABIT domain proteins function downstream of tyrosine kinase signalling and interact with GRB2." 
Q#3059 - CGI_10015143 superfamily 245596 190 424 3.35E-87 271.895 cl11394 Glyco_tranf_GTA_type superfamily N - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#3059 - CGI_10015143 superfamily 245596 20 161 7.73E-50 174.054 cl11394 Glyco_tranf_GTA_type superfamily C - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. 
This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#3062 - CGI_10015146 superfamily 243034 60 156 2.45E-09 53.1528 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#3063 - CGI_10015147 superfamily 222258 196 374 1.89E-09 56.4223 cl18656 AAA_30 superfamily - - AAA domain; This family of domains contain a P-loop motif that is 
characteristic of the AAA superfamily. Many of the proteins in this family are conjugative transfer proteins. There is a Walker A and Walker B. Q#3063 - CGI_10015147 superfamily 222209 494 578 0.00405544 36.2249 cl18648 UvrD_C_2 superfamily - - Family description; This domain is found at the C-terminus of a wide variety of helicase enzymes. This domain has an AAA-like structural fold. Q#3065 - CGI_10015149 superfamily 243050 32 90 9.49E-15 70.9114 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#3065 - CGI_10015149 superfamily 247916 471 539 1.28E-06 47.3775 cl17362 Transglut_core superfamily - - "Transglutaminase-like superfamily; This family includes animal transglutaminases and other bacterial proteins of unknown function. Sequence conservation in this superfamily primarily involves three motifs that centre around conserved cysteine, histidine, and aspartate residues that form the catalytic triad in the structurally characterized transglutaminase, the human blood clotting factor XIIIa'. 
On the basis of the experimentally demonstrated activity of the Methanobacterium phage pseudomurein endoisopeptidase, it is proposed that many, if not all, microbial homologues of the transglutaminases are proteases and that the eukaryotic transglutaminases have evolved from an ancestral protease." Q#3066 - CGI_10015150 superfamily 118278 230 464 2.55E-133 403.874 cl10730 Membralin superfamily N - "Tumour-associated protein; Membralin is evolutionarily highly conserved; though it seems to represent a unique protein family. The protein appears to contain several transmembrane regions. In humans it is expressed in certain cancers, particularly ovarian cancers. Membralin-like gene homologues have been identified in plants including grape, cotton and tomato." Q#3066 - CGI_10015150 superfamily 118278 19 137 1.10E-33 133.464 cl10730 Membralin superfamily C - "Tumour-associated protein; Membralin is evolutionarily highly conserved; though it seems to represent a unique protein family. The protein appears to contain several transmembrane regions. In humans it is expressed in certain cancers, particularly ovarian cancers. Membralin-like gene homologues have been identified in plants including grape, cotton and tomato." Q#3068 - CGI_10015152 superfamily 247743 225 367 5.21E-28 111.084 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." 
Q#3068 - CGI_10015152 superfamily 247743 477 644 6.26E-28 111.084 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#3068 - CGI_10015152 superfamily 216993 23 103 5.48E-11 59.8813 cl15641 CDC48_N superfamily - - "Cell division protein 48 (CDC48), N-terminal domain; This domain has a double psi-beta barrel fold and includes VCP-like ATPase and N-ethylmaleimide sensitive fusion protein N-terminal domains. Both the VAT and NSF N-terminal functional domains consist of two structural domains of which this is at the N-terminus. The VAT-N domain found in AAA ATPases pfam00004 is a substrate 185-residue recognition domain." Q#3068 - CGI_10015152 superfamily 244926 128 181 8.79E-09 52.9866 cl08380 CDC48_2 superfamily - - "Cell division protein 48 (CDC48), domain 2; This domain has a double psi-beta barrel fold and includes VCP-like ATPase and N-ethylmaleimide sensitive fusion protein N-terminal domains. Both the VAT and NSF N-terminal functional domains consist of two structural domains of which this is at the C-terminus. The VAT-N domain found in AAA ATPases pfam00004 is a substrate 185-residue recognition domain." Q#3068 - CGI_10015152 superfamily 204202 713 756 0.000695977 38.7757 cl07827 Vps4_C superfamily - - Vps4 C terminal oligomerisation domain; This domain is found at the C terminal of ATPase proteins involved in vacuolar sorting. 
It forms an alpha helix structure and is required for oligomerisation. Q#3069 - CGI_10015153 superfamily 192997 458 587 4.49E-35 132.32 cl18184 Sterol-sensing superfamily - - "Sterol-sensing domain of SREBP cleavage-activation; Sterol regulatory element-binding proteins (SREBPs) are membrane-bound transcription factors that promote lipid synthesis in animal cells. They are embedded in the membranes of the endoplasmic reticulum (ER) in a helical hairpin orientation and are released from the ER by a two-step proteolytic process. Proteolysis begins when the SREBPs are cleaved at Site-1, which is located at a leucine residue in the middle of the hydrophobic loop in the lumen of the ER. Upon proteolytic processing SREBP can activate the expression of genes involved in cholesterol biosynthesis and uptake. SCAP stimulates cleavage of SREBPs via fusion of their two C-termini. This domain is the transmembrane region that traverses the membrane eight times and is the sterol-sensing domain of the cleavage protein. WD40 domains are found towards the C-terminus." Q#3071 - CGI_10015155 superfamily 246723 19 615 0 922.337 cl14813 GluZincin superfamily - - "Peptidase Gluzincin family (thermolysin-like proteinases, TLPs) includes peptidases M1, M2, M3, M4, M13, M32 and M36 (fungalysins); Gluzincin family (thermolysin-like peptidases or TLPs) includes several zinc-dependent metallopeptidases such as the M1, M2, M3, M4, M13, M32, M36 peptidases (MEROPS classification), and contain HEXXH and EXXXD motifs as part of their active site. All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. M1 family includes aminopeptidase N (APN) and leukotriene A4 hydrolase (LTA4H). APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types. 
LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity such that the two activities occupy different, but overlapping sites. The peptidase M3 or neurolysin-like family, includes M3, M2 and M32 metallopeptidases. The M3 peptidases have two subfamilies: M3A, includes thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (3.4.24.16), and the mitochondrial intermediate peptidase; M3B contains oligopeptidase F. M2 peptidase angiotensin converting enzyme (ACE, EC 3.4.15.1) catalyzes the conversion of decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II. ACE is a key part of the renin-angiotensin system that regulates blood pressure, thus ACE inhibitors are important for the treatment of hypertension. M32 family includes two eukaryotic enzymes from protozoa Trypanosoma cruzi, a causative agent of Chagas' disease, and Leishmania major, a parasite that causes leishmaniasis, making them attractive targets for drug development. The M4 family includes secreted protease thermolysin (EC 3.4.24.27), pseudolysin, aureolysin, neutral protease as well as fungalysin and bacillolysin (EC 3.4.24.28) that degrade extracellular proteins and peptides for bacterial nutrition, especially prior to sporulation. Thermolysin is widely used as a nonspecific protease to obtain fragments for peptide sequencing as well as in production of the artificial sweetener aspartame. M13 family includes neprilysin (EC 3.4.24.11) and endothelin-converting enzyme I (ECE-1, EC 3.4.24.71), which fulfill a broad range of physiological roles due to the greater variation in the S2' subsite allowing substrate specificity and are prime therapeutic targets for selective inhibition. Peptidase M36 (fungamysin) family includes endopeptidases from pathogenic fungi. Fungalysin hydrolyzes extracellular matrix proteins such as elastin and keratin. 
Aspergillus fumigatus causes the pulmonary disease aspergillosis by invading the lungs of immuno-compromised animals and secreting fungalysin that possibly breaks down proteinaceous structural barriers." Q#3072 - CGI_10015156 superfamily 246723 33 495 0 703.158 cl14813 GluZincin superfamily - - "Peptidase Gluzincin family (thermolysin-like proteinases, TLPs) includes peptidases M1, M2, M3, M4, M13, M32 and M36 (fungalysins); Gluzincin family (thermolysin-like peptidases or TLPs) includes several zinc-dependent metallopeptidases such as the M1, M2, M3, M4, M13, M32, M36 peptidases (MEROPS classification), and contain HEXXH and EXXXD motifs as part of their active site. All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. M1 family includes aminopeptidase N (APN) and leukotriene A4 hydrolase (LTA4H). APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types. LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity such that the two activities occupy different, but overlapping sites. The peptidase M3 or neurolysin-like family, includes M3, M2 and M32 metallopeptidases. The M3 peptidases have two subfamilies: M3A, includes thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (3.4.24.16), and the mitochondrial intermediate peptidase; M3B contains oligopeptidase F. M2 peptidase angiotensin converting enzyme (ACE, EC 3.4.15.1) catalyzes the conversion of decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II. ACE is a key part of the renin-angiotensin system that regulates blood pressure, thus ACE inhibitors are important for the treatment of hypertension. 
M32 family includes two eukaryotic enzymes from protozoa Trypanosoma cruzi, a causative agent of Chagas' disease, and Leishmania major, a parasite that causes leishmaniasis, making them attractive targets for drug development. The M4 family includes secreted protease thermolysin (EC 3.4.24.27), pseudolysin, aureolysin, neutral protease as well as fungalysin and bacillolysin (EC 3.4.24.28) that degrade extracellular proteins and peptides for bacterial nutrition, especially prior to sporulation. Thermolysin is widely used as a nonspecific protease to obtain fragments for peptide sequencing as well as in production of the artificial sweetener aspartame. M13 family includes neprilysin (EC 3.4.24.11) and endothelin-converting enzyme I (ECE-1, EC 3.4.24.71), which fulfill a broad range of physiological roles due to the greater variation in the S2' subsite allowing substrate specificity and are prime therapeutic targets for selective inhibition. Peptidase M36 (fungamysin) family includes endopeptidases from pathogenic fungi. Fungalysin hydrolyzes extracellular matrix proteins such as elastin and keratin. Aspergillus fumigatus causes the pulmonary disease aspergillosis by invading the lungs of immuno-compromised animals and secreting fungalysin that possibly breaks down proteinaceous structural barriers." Q#3074 - CGI_10015158 superfamily 215647 96 246 7.98E-33 120.406 cl18338 7tm_2 superfamily C - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GCPRs).They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. 
Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#3074 - CGI_10015158 superfamily 243029 6 77 2.14E-18 76.6205 cl02422 HRM superfamily - - Hormone receptor domain; This extracellular domain contains four conserved cysteines that probably form disulphide bridges. The domain is found in a variety of hormone receptors. It may be a ligand binding domain. Q#3075 - CGI_10015159 superfamily 243179 64 155 0.00407519 34.4308 cl02781 tetraspanin_LEL superfamily - - "Tetraspanin, extracellular domain or large extracellular loop (LEL). Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. The tetraspanin family contains CD9, CD63, CD37, CD53, CD82, CD151, and CD81, amongst others. Tetraspanins are involved in diverse processes such as cell activation and proliferation, adhesion and motility, differentiation, cancer, and others. Their various functions may relate to their ability to act as molecular facilitators, grouping specific cell-surface proteins and affecting formation and stability of signaling complexes. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web", which may also include integrins." 
Q#3077 - CGI_10005107 superfamily 243035 290 390 0.000618848 39.5254 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#3077 - CGI_10005107 superfamily 246918 214 276 1.35E-08 52.9743 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#3077 - CGI_10005107 superfamily 246918 1000 1056 2.96E-08 52.2039 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#3078 - CGI_10005108 superfamily 243035 484 618 1.40E-05 43.7626 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#3079 - CGI_10005109 superfamily 241638 133 260 7.58E-16 71.2452 cl00147 TNF superfamily - - "Tumor Necrosis Factor; TNF superfamily members include the cytokines: TNF (TNF-alpha), LT (lymphotoxin-alpha, TNF-beta), CD40 ligand, Apo2L (TRAIL), Fas ligand, and osteoprotegerin (OPG) ligand. These proteins generally have an intracellular N-terminal domain, a short transmembrane segment, an extracellular stalk, and a globular TNF-like extracellular domain of about 150 residues. They initiate apoptosis by binding to related receptors, some of which have intracellular death domains. They generally form homo- or hetero- trimeric complexes. TNF cytokines bind one elongated receptor molecule along each of three clefts formed by neighboring monomers of the trimer with ligand trimerization a requisite for receptor binding." Q#3080 - CGI_10005110 superfamily 241638 53 184 1.89E-17 75.074 cl00147 TNF superfamily - - "Tumor Necrosis Factor; TNF superfamily members include the cytokines: TNF (TNF-alpha), LT (lymphotoxin-alpha, TNF-beta), CD40 ligand, Apo2L (TRAIL), Fas ligand, and osteoprotegerin (OPG) ligand. These proteins generally have an intracellular N-terminal domain, a short transmembrane segment, an extracellular stalk, and a globular TNF-like extracellular domain of about 150 residues. They initiate apoptosis by binding to related receptors, some of which have intracellular death domains. They generally form homo- or hetero- trimeric complexes. TNF cytokines bind one elongated receptor molecule along each of three clefts formed by neighboring monomers of the trimer with ligand trimerization a requisite for receptor binding." Q#3082 - CGI_10002691 superfamily 241584 426 521 6.85E-08 50.5727 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. 
Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#3082 - CGI_10002691 superfamily 241584 318 415 2.15E-06 45.9503 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#3082 - CGI_10002691 superfamily 241584 267 305 0.00122574 37.4759 cl00065 FN3 superfamily N - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#3085 - CGI_10002767 superfamily 247725 14 68 6.22E-06 44.637 cl17171 PH-like superfamily C - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. 
They are generally involved in targeting the protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and other proteins." Q#3087 - CGI_10003891 superfamily 245864 33 250 2.10E-18 82.7114 cl12078 p450 superfamily C - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#3088 - CGI_10003892 superfamily 245864 302 638 6.58E-58 203.279 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. 
Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#3089 - CGI_10003151 superfamily 245847 294 362 9.30E-05 40.9488 cl12042 FA58C superfamily C - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#3091 - CGI_10003047 superfamily 110440 453 479 3.43E-05 41.2393 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase); proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." 
Q#3091 - CGI_10003047 superfamily 241563 35 77 0.000111398 40.1552 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#3094 - CGI_10002912 superfamily 247097 86 122 0.00020734 37.4306 cl15839 ShK superfamily - - ShK domain-like; This domain is found in several C. elegans proteins. The domain is 30 amino acids long and rich in cysteine residues. There are 6 conserved cysteine positions in the domain that form three disulphide bridges. The domain is found in the potassium channel inhibitor ShK in sea anemone. Q#3099 - CGI_10002857 superfamily 242406 1 63 0.00226091 34.8745 cl01271 DUF1768 superfamily N - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. Q#3101 - CGI_10007352 superfamily 241874 88 381 6.63E-82 269.351 cl00456 SLC5-6-like_sbd superfamily C - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. 
NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#3101 - CGI_10007352 superfamily 241874 458 578 2.25E-31 126.827 cl00456 SLC5-6-like_sbd superfamily NC - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." 
Q#3101 - CGI_10007352 superfamily 241874 358 406 0.000422618 41.6978 cl00456 SLC5-6-like_sbd superfamily N - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#3102 - CGI_10007353 superfamily 243035 97 196 1.08E-21 86.1345 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. 
For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#3102 - CGI_10007353 superfamily 243035 7 98 6.75E-19 78.8157 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#3103 - CGI_10007354 superfamily 219000 339 451 1.12E-35 135.467 cl05717 Drf_FH3 superfamily C - Diaphanous FH3 Domain; This region is found in the Formin-like and diaphanous proteins. Q#3103 - CGI_10007354 superfamily 219001 103 324 9.78E-18 82.7419 cl05720 Drf_GBD superfamily - - "Diaphanous GTPase-binding Domain; This domain is bound by GTP-attached Rho proteins, leading to activation of the Drf protein." Q#3104 - CGI_10007355 superfamily 245815 2 259 8.24E-165 467.937 cl11961 ALDH-SF superfamily N - "NAD(P)+-dependent aldehyde dehydrogenase superfamily; The aldehyde dehydrogenase superfamily (ALDH-SF) of NAD(P)+-dependent enzymes, in general, oxidize a wide range of endogenous and exogenous aliphatic and aromatic aldehydes to their corresponding carboxylic acids and play an important role in detoxification. Besides aldehyde detoxification, many ALDH isozymes possess multiple additional catalytic and non-catalytic functions such as participating in metabolic pathways, or as binding proteins, or osmoregulants, to mention a few. The enzyme has three domains, a NAD(P)+ cofactor-binding domain, a catalytic domain, and a bridging domain; and the active enzyme is generally either homodimeric or homotetrameric. 
The catalytic mechanism is proposed to involve cofactor binding, resulting in a conformational change and activation of an invariant catalytic cysteine nucleophile. The cysteine and aldehyde substrate form an oxyanion thiohemiacetal intermediate resulting in hydride transfer to the cofactor and formation of a thioacylenzyme intermediate. Hydrolysis of the thioacylenzyme and release of the carboxylic acid product occurs, and in most cases, the reduced cofactor dissociates from the enzyme. The evolutionary phylogenetic tree of ALDHs appears to have an initial bifurcation between what has been characterized as the classical aldehyde dehydrogenases, the ALDH family (ALDH) and extended family members or aldehyde dehydrogenase-like (ALDH-L) proteins. The ALDH proteins are represented by enzymes which share a number of highly conserved residues necessary for catalysis and cofactor binding and they include such proteins as retinal dehydrogenase, 10-formyltetrahydrofolate dehydrogenase, non-phosphorylating glyceraldehyde 3-phosphate dehydrogenase, delta(1)-pyrroline-5-carboxylate dehydrogenases, alpha-ketoglutaric semialdehyde dehydrogenase, alpha-aminoadipic semialdehyde dehydrogenase, coniferyl aldehyde dehydrogenase and succinate-semialdehyde dehydrogenase. Included in this larger group are all human, Arabidopsis, Tortula, fungal, protozoan, and Drosophila ALDHs identified in families ALDH1 through ALDH22 with the exception of families ALDH18, ALDH19, and ALDH20 which are present in the ALDH-like group. The ALDH-like group is represented by such proteins as gamma-glutamyl phosphate reductase, LuxC-like acyl-CoA reductase, and coenzyme A acylating aldehyde dehydrogenase. All of these proteins have a conserved cysteine that aligns with the catalytic cysteine of the ALDH group." 
Q#3106 - CGI_10007357 superfamily 216209 142 350 0.00125115 42.2379 cl15638 Mononeg_RNA_pol superfamily C - "Mononegavirales RNA dependent RNA polymerase; Members of the Mononegavirales including the Paramyxoviridae, like other non-segmented negative strand RNA viruses, have an RNA-dependent RNA polymerase composed of two subunits, a large protein L and a phosphoprotein P. This is a protein family of the L protein. The L protein confers the RNA polymerase activity on the complex. The P protein acts as a transcription factor." Q#3108 - CGI_10007359 superfamily 243077 66 114 1.08E-10 57.1701 cl02542 DnaJ superfamily - - "DnaJ domain or J-domain. DnaJ/Hsp40 (heat shock protein 40) proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperonine family. Hsp40 proteins are characterized by the presence of a J domain, which mediates the interaction with Hsp70. They may contain other domains as well, and the architectures provide a means of classification." Q#3108 - CGI_10007359 superfamily 244850 218 282 7.91E-17 74.5711 cl08096 DUF1992 superfamily - - Domain of unknown function (DUF1992); This family of proteins are functionally uncharacterized. 
Q#3109 - CGI_10007360 superfamily 247684 2 64 1.45E-17 74.6211 cl17037 NBD_sugar-kinase_HSP70_actin superfamily NC - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#3110 - CGI_10013932 superfamily 247725 9 109 0.000268604 39.8343 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. 
This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#3111 - CGI_10013933 superfamily 247725 26 171 1.96E-40 144.921 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#3113 - CGI_10013935 superfamily 247725 41 198 2.04E-15 74.8145 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#3115 - CGI_10013937 superfamily 242565 215 283 0.00369176 35.6326 cl01535 Repair_PSII superfamily N - "Repair protein; In plants, this domain plays a role in the photosystem II (PSII) repair cycle. It may be involved in the regulation of synthesis/degradation of the D1 protein of the PSII core and in the assembly of PSII monomers into dimers in the grana stacks. Its function in other organisms is unknown." 
Q#3116 - CGI_10013938 superfamily 192718 6 163 5.42E-36 125.059 cl12725 DUF2962 superfamily - - Protein of unknown function (DUF2962); This eukaryotic family of proteins has no known function. Q#3120 - CGI_10013942 superfamily 128469 278 393 1.43E-09 55.5392 cl17971 VPS9 superfamily - - Domain present in VPS9; Domain present in yeast vacuolar sorting protein 9 and other proteins. Q#3121 - CGI_10013943 superfamily 247792 950 990 9.02E-09 53.2184 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-(N/C/H)-X2-C-X(4-48)-C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#3122 - CGI_10013944 superfamily 219619 215 265 1.44E-06 44.8912 cl18518 Ion_trans_2 superfamily N - Ion channel; This family includes the two membrane helix type ion channels found in bacteria. Q#3123 - CGI_10013945 superfamily 243072 136 260 2.46E-33 126.344 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#3123 - CGI_10013945 superfamily 243072 235 400 2.46E-15 74.3422 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#3124 - CGI_10013946 superfamily 246597 8 300 0 684.477 cl13995 MPP_superfamily superfamily - - "metallophosphatase superfamily, metallophosphatase domain; Metallophosphatases (MPPs), also known as metallophosphoesterases, phosphodiesterases (PDEs), binuclear metallophosphoesterases, and dimetal-containing phosphoesterases (DMPs), represent a diverse superfamily of enzymes with a conserved domain containing an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. This superfamily includes: the phosphoprotein phosphatases (PPPs), Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination." 
Q#3125 - CGI_10013947 superfamily 246597 508 791 1.46E-64 219.097 cl13995 MPP_superfamily superfamily - - "metallophosphatase superfamily, metallophosphatase domain; Metallophosphatases (MPPs), also known as metallophosphoesterases, phosphodiesterases (PDEs), binuclear metallophosphoesterases, and dimetal-containing phosphoesterases (DMPs), represent a diverse superfamily of enzymes with a conserved domain containing an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. This superfamily includes: the phosphoprotein phosphatases (PPPs), Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination." Q#3125 - CGI_10013947 superfamily 247856 253 281 0.00657909 35.502 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#3128 - CGI_10013950 superfamily 215731 54 320 8.43E-28 113.077 cl08245 Gln-synt_C superfamily - - "Glutamine synthetase, catalytic domain; Glutamine synthetase, catalytic domain. " Q#3128 - CGI_10013950 superfamily 215731 456 719 3.01E-25 105.758 cl08245 Gln-synt_C superfamily - - "Glutamine synthetase, catalytic domain; Glutamine synthetase, catalytic domain. 
" Q#3129 - CGI_10013951 superfamily 215731 54 320 7.81E-27 106.528 cl08245 Gln-synt_C superfamily - - "Glutamine synthetase, catalytic domain; Glutamine synthetase, catalytic domain. " Q#3130 - CGI_10013952 superfamily 241554 22 89 1.06E-26 97.7263 cl00019 Macro superfamily C - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#3132 - CGI_10004136 superfamily 241622 2 76 1.34E-13 61.0434 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archebacteria, and eurkayotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. 
PDZ domains have been named after PSD95 (postsynaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#3135 - CGI_10004139 superfamily 247792 48 94 2.69E-08 50.1368 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-(N/C/H)-X2-C-X(4-48)-C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#3136 - CGI_10004140 superfamily 247743 214 338 2.57E-07 50.276 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperones, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#3136 - CGI_10004140 superfamily 248006 876 917 0.00501484 36.0099 cl17452 TPR_10 superfamily - - Tetratricopeptide repeat; Tetratricopeptide repeat. 
Q#3136 - CGI_10004140 superfamily 243034 796 903 0.00540339 36.204 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#3138 - CGI_10003578 superfamily 219797 20 55 3.77E-06 43.4582 cl09596 ACC_central superfamily N - "Acetyl-CoA carboxylase, central region; The region featured in this family is found in various eukaryotic acetyl-CoA carboxylases, N-terminal to the catalytic domain (pfam01039). This enzyme (EC:6.4.1.2) is involved in the synthesis of long-chain fatty acids, as it catalyzes the rate-limiting step in this process." Q#3140 - CGI_10003580 superfamily 217473 177 324 1.12E-23 101.288 cl03978 Mab-21 superfamily N - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. 
elegans these proteins are required for several aspects of embryonic development. Q#3144 - CGI_10024751 superfamily 243092 104 400 2.05E-22 98.176 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#3144 - CGI_10024751 superfamily 243092 480 585 2.01E-06 48.8704 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#3145 - CGI_10024752 superfamily 216686 65 249 3.49E-46 156.329 cl18377 Galactosyl_T superfamily - - "Galactosyltransferase; This family includes the galactosyltransferases UDP-galactose:2-acetamido-2-deoxy-D-glucose3beta-galactosyltransferase and UDP-Gal:beta-GlcNAc beta 1,3-galactosyltranferase. Specific galactosyltransferases transfer galactose to GlcNAc terminal chains in the synthesis of the lacto-series oligosaccharides types 1 and 2." Q#3146 - CGI_10024753 superfamily 247924 138 188 5.38E-15 67.5891 cl17370 PEMT superfamily C - Phospholipid methyltransferase; The S. cerevisiae phospholipid methyltransferase (EC:2.1.1.16) has a broad substrate specificity of unsaturated phospholipids. 
Q#3150 - CGI_10024757 superfamily 247741 72 305 6.24E-42 148.227 cl17187 Aldolase_Class_I superfamily - - "Class I aldolases; Class I aldolases. The class I aldolases use an active-site lysine which stabilizes a reaction intermediate via Schiff base formation, and have a TIM beta/alpha barrel fold. The members of this family include 2-keto-3-deoxy-6-phosphogluconate (KDPG) and 2-keto-4-hydroxyglutarate (KHG) aldolases, transaldolase, dihydrodipicolinate synthase sub-family, Type I 3-dehydroquinate dehydratase, DeoC and DhnA proteins, and metal-independent fructose-1,6-bisphosphate aldolase. Although structurally similar, the class II aldolases use a different mechanism and are believed to have an independent evolutionary origin." Q#3152 - CGI_10024759 superfamily 217830 7 134 2.16E-63 199.869 cl08410 Autophagy_N superfamily - - "Autophagocytosis associated protein (Atg3), N-terminal domain; Autophagocytosis is a starvation-induced process responsible for transport of cytoplasmic proteins to the lysosome/vacuole. Atg3 is a ubiquitin like modifier that is topologically similar to the canonical E2 enzyme. It catalyzes the conjugation of Atg8 and phosphatidylethanolamine." Q#3152 - CGI_10024759 superfamily 202841 212 273 2.27E-30 110.388 cl08411 Autophagy_act_C superfamily - - "Autophagocytosis associated protein, active-site domain; Autophagocytosis is a starvation-induced process responsible for transport of cytoplasmic proteins to the vacuole. The cysteine residue within the HPC motif is the putative active-site residue for recognition of the Apg5 subunit of the autophagosome complex." Q#3152 - CGI_10024759 superfamily 150967 294 318 5.87E-10 53.6928 cl11047 Autophagy_Cterm superfamily - - Autophagocytosis associated protein C-terminal; Autophagocytosis is a starvation-induced process responsible for transport of cytoplasmic proteins to the vacuole. The small C-terminal domain is likely to be a distinct binding region for the stability of the autophagosome complex. 
It carries a highly characteristic conserved FLKF sequence motif. Q#3153 - CGI_10024760 superfamily 246669 1705 1835 1.28E-65 220.613 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#3153 - CGI_10024760 superfamily 246669 228 381 2.68E-62 211.723 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. 
However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#3153 - CGI_10024760 superfamily 246669 1470 1593 3.46E-57 196.232 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#3153 - CGI_10024760 superfamily 246669 64 174 4.56E-56 192.406 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. 
Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#3153 - CGI_10024760 superfamily 246669 967 1084 3.13E-55 190.832 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#3153 - CGI_10024760 superfamily 149289 151 222 6.29E-30 115.846 cl06959 FerI superfamily - - FerI (NUC094) domain; This domain is present in proteins of the Ferlin family. It is often located between two C2 domains. Q#3153 - CGI_10024760 superfamily 116739 614 691 1.83E-23 97.5684 cl06958 FerB superfamily - - FerB (NUC096) domain; This is central domain B in proteins of the Ferlin family. 
Q#3153 - CGI_10024760 superfamily 214777 708 764 1.43E-09 56.7644 cl02740 DysFN superfamily - - "Dysferlin domain, N-terminal region; Domain of unknown function present in yeast peroxisomal proteins, dysferlin, myoferlin and hypothetical proteins. Due to an insertion of a dysferlin domain within a second dysferlin domain we have chosen to predict these domains in two parts: the N-terminal region and the C-terminal region." Q#3153 - CGI_10024760 superfamily 149301 521 587 2.68E-05 44.2128 cl06973 FerA superfamily - - FerA (NUC095) domain; This is central domain A in proteins of the Ferlin family. Q#3153 - CGI_10024760 superfamily 246669 1145 1228 0.000259856 41.4077 cl14603 C2 superfamily C - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#3153 - CGI_10024760 superfamily 214777 777 832 0.000267067 41.3564 cl02740 DysFN superfamily - - "Dysferlin domain, N-terminal region; Domain of unknown function present in yeast peroxisomal proteins, dysferlin, myoferlin and hypothetical proteins. 
Due to an insertion of a dysferlin domain within a second dysferlin domain we have chosen to predict these domains in two parts: the N-terminal region and the C-terminal region." Q#3154 - CGI_10024761 superfamily 206035 103 144 0.00219193 36.7563 cl16440 Enkurin superfamily N - "Calmodulin-binding; This is a family of apparent calmodulin-binding proteins found at high levels in the testis and vomeronasal organ and at lower levels in certain other tissues. Enkurin is a scaffold protein that binds PI3 kinase to sperm transient receptor potential (canonical) (TRPC) channels. The mammalian transient receptor potential (canonical) channels are the primary candidates for the Ca(2+) entry pathway activated by the hormones, growth factors, and neurotransmitters that exert their effect through activation of PLC. Calmodulin binds to the C-terminus of all TRPC channels, and dissociation of calmodulin from TRPC4 results in profound activation of the channel." Q#3156 - CGI_10024763 superfamily 222090 5 199 2.48E-16 73.461 cl18636 Methyltransf_22 superfamily N - Methyltransferase domain; This family appears to be a methyltransferase domain. Q#3157 - CGI_10024764 superfamily 222090 5 182 1.19E-14 68.8386 cl18636 Methyltransf_22 superfamily N - Methyltransferase domain; This family appears to be a methyltransferase domain. Q#3158 - CGI_10024765 superfamily 243035 368 478 9.02E-16 74.1933 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. 
Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. 
Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#3159 - CGI_10024766 superfamily 243099 5 124 3.24E-32 113.197 cl02575 Bcl-2_like superfamily - - "Apoptosis regulator proteins of the Bcl-2 family, named after B-cell lymphoma 2. This alignment model spans what have been described as Bcl-2 homology regions BH1, BH2, BH3, and BH4. Many members of this family have an additional C-terminal transmembrane segment. Some homologous proteins, which are not included in this model, may miss either the BH4 (Bax, Bak) or the BH2 (Bcl-X(S)) region, and some appear to only share the BH3 region (Bik, Bim, Bad, Bid, Egl-1). This family is involved in the regulation of the outer mitochondrial membrane's permeability and in promoting or preventing the release of apoptogenic factors, which in turn may trigger apoptosis by activating caspases. Bcl-2 and the closely related Bcl-X(L) are anti-apoptotic key regulators of programmed cell death. They are assumed to function via heterodimeric protein-protein interactions, binding pro-apoptotic proteins such as Bad (BCL2-antagonist of cell death), Bid, and Bim, by specifically interacting with their BH3 regions. Interfering with this heterodimeric interaction via small-molecule inhibitors may prove effective in targeting various cancers. This family also includes the Caenorhabditis elegans Bcl-2 homolog CED-9, which binds to CED-4, the C. elegans homolog of mammalian Apaf-1. Apaf-1, however, does not seem to be inhibited by Bcl-2 directly." Q#3160 - CGI_10024767 superfamily 241629 41 175 2.76E-36 128.587 cl00133 SCP superfamily - - "SCP: SCP-like extracellular protein domain, found in eukaryotes and prokaryotes. This family includes plant pathogenesis-related protein 1 (PR-1), which accumulates after infections with pathogens, and may act as an anti-fungal agent or be involved in cell wall loosening. 
This family also includes CRISPs, mammalian cysteine-rich secretory proteins, which combine SCP with a C-terminal cysteine rich domain, and allergen 5 from vespid venom. Roles for CRISP, in response to pathogens, fertilization, and sperm maturation have been proposed. One member, Tex31 from the venom duct of Conus textile, has been shown to possess proteolytic activity sensitive to serine protease inhibitors. The human GAPR-1 protein has been reported to dimerize, and such a dimer may form an active site containing a catalytic triad. SCP has also been proposed to be a Ca++ chelating serine protease. The Ca++-chelating function would fit with various signaling processes that members of this family, such as the CRISPs, are involved in, and is supported by sequence and structural evidence of a conserved pocket containing two histidines and a glutamate. It also may explain how helothermine, a toxic peptide secreted by the beaded lizard, blocks Ca++ transporting ryanodine receptors. Little is known about the biological roles of the bacterial and archaeal SCP domains." Q#3161 - CGI_10024768 superfamily 218493 562 703 1.45E-28 113.221 cl08434 GMC_oxred_C superfamily - - GMC oxidoreductase; This domain found associated with pfam00732. Q#3161 - CGI_10024768 superfamily 207716 54 119 2.12E-13 67.2822 cl02754 zf-LITAF-like superfamily - - "LITAF-like zinc ribbon domain; Members of this family display a conserved zinc ribbon structure with the motif C-XX-C- separated from the more C-terminal HX-C(P)X-C-X4-G-R motif by a variable region of usually 25-30 (hydrophobic) residues. Although it belongs to one of the zinc finger's fold groups (zinc ribbon), this particular domain was first identified in LPS-induced tumour necrosis alpha factor (LITAF) which is produced in mammalian cells after being challenged with lipopolysaccharide (LPS). The hydrophobic region probably inserts into the membrane rather than traversing it. 
Such an insertion brings together the N- and C-terminal C-XX-C motifs to form a compact Zn2+-binding structure." Q#3162 - CGI_10024769 superfamily 243035 96 156 3.92E-18 75.7955 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. 
Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#3163 - CGI_10024770 superfamily 199156 60 74 0.00158354 35.1225 cl15298 zf-CCHC superfamily - - "Zinc knuckle; The zinc knuckle is a zinc binding motif composed of the following CX2CX4HX4C where X can be any amino acid. The motifs are mostly from retroviral gag proteins (nucleocapsid). Prototype structure is from HIV. Also contains members involved in eukaryotic gene regulation, such as C. elegans GLH-1. Structure is an 18-residue zinc finger." Q#3164 - CGI_10024771 superfamily 243035 53 82 0.00571512 32.3965 cl02432 CLECT superfamily NC - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. 
Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. 
A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#3165 - CGI_10024772 superfamily 247794 74 478 0 722.32 cl17240 FDH_GDH_like superfamily - - "Formate/glycerate dehydrogenases, D-specific 2-hydroxy acid dehydrogenases and related dehydrogenases; The formate/glycerate dehydrogenase like family contains a diverse group of enzymes such as formate dehydrogenase (FDH), glycerate dehydrogenase (GDH), D-lactate dehydrogenase, L-alanine dehydrogenase, and S-Adenosylhomocysteine hydrolase, that share a common 2-domain structure. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar domains of the alpha/beta Rossmann fold NAD+ binding form. The NAD(P) binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD(P) is bound, primarily to the C-terminal portion of the 2nd (internal) domain. While many members of this family are dimeric, alanine DH is hexameric and phosphoglycerate DH is tetrameric. 2-hydroxyacid dehydrogenases are enzymes that catalyze the conversion of a wide variety of D-2-hydroxy acids to their corresponding keto acids. The general mechanism is (R)-lactate + acceptor to pyruvate + reduced acceptor. Formate dehydrogenase (FDH) catalyzes the NAD+-dependent oxidation of formate ion to carbon dioxide with the concomitant reduction of NAD+ to NADH. FDHs of this family contain no metal ions or prosthetic groups. Catalysis occurs through direct transfer of a hydride ion to NAD+ without the stages of acid-base catalysis typically found in related dehydrogenases." 
Q#3169 - CGI_10024776 superfamily 247042 35 427 1.87E-34 132.84 cl15693 Sema superfamily - - "The Sema domain, a protein interacting module, of semaphorins and plexins; Both semaphorins and plexins have a Sema domain on their N-termini. Plexins function as receptors for the semaphorins. Evolutionarily, plexins may be the ancestor of semaphorins. Semaphorins are regulatory molecules in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems, and cancer. Semaphorins can be divided into 7 classes. Vertebrates have members in classes 3-7, whereas classes 1 and 2 are known only in invertebrates. Class 2 and 3 semaphorins are secreted; classes 1 and 4 through 6 are transmembrane proteins; and class 7 is membrane associated via glycosylphosphatidylinositol (GPI) linkage. Plexins are a large family of transmembrane proteins, which are divided into four types (A-D) according to sequence similarity. In vertebrates, type A plexins serve as co-receptors for neuropilins to mediate the signalling of class 3 semaphorins. Plexins serve as direct receptors for several other members of the semaphorin family: class 6 semaphorins signal through type A plexins and class 4 semaphorins through type B plexins. This family also includes the MET and RON receptor tyrosine kinases. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves to recognize and bind receptors." Q#3169 - CGI_10024776 superfamily 243104 406 457 4.96E-08 49.8592 cl02601 PSI superfamily - - "Plexin repeat; A cysteine rich repeat found in several different extracellular receptors. The function of the repeat is unknown. Three copies of the repeat are found in Plexin. Two copies of the repeat are found in mahogany protein. A related C. elegans protein contains four copies of the repeat. 
The Met receptor contains a single copy of the repeat. The Pfam alignment shows 6 conserved cysteine residues that may form three conserved disulphide bridges, whereas shows 8 conserved cysteines. The pattern of conservation suggests that cysteines 5 and 7 (that are not absolutely conserved) form a disulphide bridge (Personal observation. A Bateman)." Q#3170 - CGI_10024777 superfamily 245201 754 1017 4.85E-83 271.332 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#3170 - CGI_10024777 superfamily 247038 259 349 2.75E-19 85.0598 cl15674 IPT superfamily - - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated T cells (NFAT) transcription factors which are mainly monomers." Q#3170 - CGI_10024777 superfamily 247038 358 434 7.53E-12 63.0121 cl15674 IPT superfamily - - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. 
They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated T cells (NFAT) transcription factors which are mainly monomers." Q#3170 - CGI_10024777 superfamily 243104 216 257 3.78E-06 45.622 cl02601 PSI superfamily - - "Plexin repeat; A cysteine rich repeat found in several different extracellular receptors. The function of the repeat is unknown. Three copies of the repeat are found in Plexin. Two copies of the repeat are found in mahogany protein. A related C. elegans protein contains four copies of the repeat. The Met receptor contains a single copy of the repeat. The Pfam alignment shows 6 conserved cysteine residues that may form three conserved disulphide bridges, whereas shows 8 conserved cysteines. The pattern of conservation suggests that cysteines 5 and 7 (that are not absolutely conserved) form a disulphide bridge (Personal observation. A Bateman)." Q#3170 - CGI_10024777 superfamily 247038 438 544 1.06E-05 45.1005 cl15674 IPT superfamily - - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated T cells (NFAT) transcription factors which are mainly monomers." 
Q#3170 - CGI_10024777 superfamily 247038 546 635 0.00012365 41.6592 cl15674 IPT superfamily - - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated T cells (NFAT) transcription factors which are mainly monomers." Q#3171 - CGI_10024778 superfamily 223248 235 416 3.03E-12 65.0968 cl18701 SEC59 superfamily - - Dolichol kinase [Lipid metabolism] Q#3171 - CGI_10024778 superfamily 221533 496 559 2.92E-05 42.2988 cl13726 TMF_DNA_bd superfamily - - "TATA element modulatory factor 1 DNA binding; This is the middle region of a family of TATA element modulatory factor 1 proteins conserved in eukaryotes that contains at its N-terminal section a number of leucine zippers that could potentially form coiled coil structures. The whole proteins bind to the TATA element of some RNA polymerase II promoters and repress their activity by competing with the binding of TATA binding protein. TMFs are evolutionarily conserved golgins that bind Rab6, a ubiquitous ras-like GTP-binding Golgi protein, and contribute to Golgi organisation in animal and plant cells." 
Q#3171 - CGI_10024778 superfamily 241634 430 533 0.000464868 38.8661 cl00143 SynN superfamily - - "Syntaxin N-terminus domain; syntaxins are nervous system-specific proteins implicated in the docking of synaptic vesicles with the presynaptic plasma membrane; they are a family of receptors for intracellular transport vesicles; each target membrane may be identified by a specific member of the syntaxin family; syntaxins contain a moderately well conserved amino-terminal domain, called Habc, whose structure is an antiparallel three-helix bundle; a linker of about 30 amino acids connects this to the carboxy-terminal region, designated H3 (t_SNARE), of the syntaxin cytoplasmic domain; the highly conserved H3 region forms a single, long alpha-helix when it is part of the core SNARE complex and anchors the protein on the cytoplasmic surface of cellular membranes; H3 is not included in defining this domain" Q#3172 - CGI_10024779 superfamily 109875 1 118 1.28E-50 162.26 cl17923 T4_deiodinase superfamily N - "Iodothyronine deiodinase; Iodothyronine deiodinase converts thyroxine (T4) to 3,5,3'-triiodothyronine (T3)." 
Q#3174 - CGI_10024781 superfamily 243092 59 352 5.34E-78 243.781 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#3175 - CGI_10024782 superfamily 247805 18 217 4.41E-86 264.732 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 
Q#3175 - CGI_10024782 superfamily 247905 232 362 3.23E-33 122.346 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#3176 - CGI_10024783 superfamily 247743 243 322 9.42E-11 59.8523 cl17189 AAA superfamily C - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperones, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#3176 - CGI_10024783 superfamily 209247 467 551 1.80E-19 84.0323 cl11083 ClpB_D2-small superfamily - - "C-terminal, D2-small domain, of ClpB protein; This is the C-terminal domain of ClpB protein, referred to as the D2-small domain, and is a mixed alpha-beta structure. Compared with the D1-small domain (included in AAA, pfam00004) it lacks the long coiled-coil insertion, and instead of helix C4 contains a beta-strand (e3) that is part of a three stranded beta-pleated sheet. 
In Thermophilus the whole protein forms a hexamer with the D1-small and D2-small domains located on the outside of the hexamer, with the long coiled-coil being exposed on the surface. The D2-small domain is essential for oligomerisation, forming a tight interface with the D2-large domain of a neighboring subunit and thereby providing enough binding energy to stabilise the functional assembly. The domain is associated with two Clp_N, pfam02861, at the N-terminus as well as AAA, pfam00004 and AAA_2, pfam07724." Q#3177 - CGI_10024784 superfamily 243072 63 174 3.54E-26 100.536 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#3177 - CGI_10024784 superfamily 243073 251 290 6.69E-05 39.4011 cl02533 SOCS superfamily - - "SOCS (suppressors of cytokine signaling) box. The SOCS box is found in the C-terminal region of CIS/SOCS family proteins (in combination with a SH2 domain), ASBs (ankyrin repeat-containing proteins with a SOCS box), SSBs (SPRY domain-containing proteins with a SOCS box), and WSBs (WD40 repeat-containing proteins with a SOCS box), as well as, other miscellaneous proteins. The function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions." 
Q#3178 - CGI_10024785 superfamily 217473 424 635 2.69E-07 51.5969 cl03978 Mab-21 superfamily N - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#3179 - CGI_10024786 superfamily 243263 73 492 3.05E-19 89.7746 cl02990 ASC superfamily - - Amiloride-sensitive sodium channel; Amiloride-sensitive sodium channel. Q#3181 - CGI_10024788 superfamily 241862 17 341 3.14E-16 78.5821 cl00437 COG0428 superfamily - - Predicted divalent heavy-metal cations transporter [Inorganic ion transport and metabolism] Q#3181 - CGI_10024788 superfamily 191382 456 556 2.07E-11 61.1691 cl09387 CENP-H superfamily - - "Centromere protein H (CENP-H); This family consists of several eukaryotic centromere protein H (CENP-H) sequences. Macromolecular centromere-kinetochore complex plays a critical role in sister chromatid separation, but its complete protein composition as well as its precise dynamic function during mitosis has not yet been clearly determined. CENP-H contains a coiled-coil structure and a nuclear localisation signal. CENP-H is specifically and constitutively localised in kinetochores throughout the cell cycle. CENP-H may play a role in kinetochore organisation and function throughout the cell cycle. This is the C-terminus of the region, which is conserved from fungi to humans." Q#3182 - CGI_10024789 superfamily 241862 96 254 2.54E-08 52.3584 cl00437 COG0428 superfamily N - Predicted divalent heavy-metal cations transporter [Inorganic ion transport and metabolism] Q#3183 - CGI_10024790 superfamily 218212 1088 1452 3.38E-144 447.132 cl18448 DUF608 superfamily - - "Protein of unknown function, DUF608; This family represents a conserved region with a pankaryotic distribution in a number of uncharacterized proteins." 
Q#3183 - CGI_10024790 superfamily 221464 741 1021 1.41E-59 209.118 cl13626 GBA2_N superfamily - - "beta-Glucocerebrosidase 2 N terminal; This domain is found in bacteria, archaea and eukaryotes. This domain is typically between 320 to 354 amino acids in length. This domain is found associated with pfam04685. This domain is found in the protein beta-Glucocerebrosidase 2. It is found just after the extreme N terminus. This protein is located in the ER. The N terminal is thought to be the luminal domain while the C terminal is the cytosolic domain. The catalytic domain of GBA-2 is unknown." Q#3184 - CGI_10024791 superfamily 245596 14 244 3.88E-120 345.435 cl11394 Glyco_tranf_GTA_type superfamily - - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." 
Q#3185 - CGI_10024792 superfamily 218802 4 148 1.70E-54 172.161 cl05462 DUF862 superfamily - - "PPPDE putative peptidase domain; The PPPDE superfamily (after Permuted Papain fold Peptidases of DsRNA viruses and Eukaryotes), consists of predicted thiol peptidases with a circularly permuted papain-like fold. The inference of the likely DUB function of the PPPDE superfamily proteins is based on the fusions of the catalytic domain to Ub-binding PUG (PUB)/UBA domains and a novel alpha-helical Ub-associated domain (the PUL domain, after PLAP, Ufd3p and Lub1p)." Q#3186 - CGI_10024793 superfamily 242611 92 383 7.28E-149 428.067 cl01629 TPP_enzymes superfamily - - "Thiamine pyrophosphate (TPP) enzyme family, TPP-binding module; found in many key metabolic enzymes which use TPP (also known as thiamine diphosphate) as a cofactor. These enzymes include, among others, the E1 components of the pyruvate, the acetoin and the branched chain alpha-keto acid dehydrogenase complexes." Q#3187 - CGI_10024794 superfamily 111929 198 285 1.62E-21 88.2266 cl03885 Str_synth superfamily - - Strictosidine synthase; Strictosidine synthase (E.C. 4.3.3.2) is a key enzyme in alkaloid biosynthesis. It catalyzes the condensation of tryptamine with secologanin to form strictosidine. Q#3188 - CGI_10024795 superfamily 247856 173 224 1.48E-09 52.1649 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#3189 - CGI_10024796 superfamily 248458 81 433 6.78E-12 65.4129 cl17904 MFS superfamily - - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. 
MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#3190 - CGI_10024797 superfamily 247057 28 81 3.85E-27 95.8497 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. 
They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#3191 - CGI_10024798 superfamily 241566 277 326 2.08E-16 72.1395 cl00040 C1 superfamily - - "Protein kinase C conserved region 1 (C1) . Cysteine-rich zinc binding domain. Some members of this domain family bind phorbol esters and diacylglycerol, some are reported to bind RasGTP. May occur in tandem arrangement. Diacylglycerol (DAG) is a second messenger, released by activation of Phospholipase D. Phorbol Esters (PE) can act as analogues of DAG and mimic its downstream effects in, for example, tumor promotion. Protein Kinases C are activated by DAG/PE, this activation is mediated by their N-terminal conserved region (C1). DAG/PE binding may be phospholipid dependent. C1 domains may also mediate DAG/PE signals in chimaerins (a family of Rac GTPase activating proteins), RasGRPs (exchange factors for Ras/Rap1), and Munc13 isoforms (scaffolding proteins involved in exocytosis)." Q#3191 - CGI_10024798 superfamily 241566 126 175 3.45E-13 63.6652 cl00040 C1 superfamily - - "Protein kinase C conserved region 1 (C1) . Cysteine-rich zinc binding domain. Some members of this domain family bind phorbol esters and diacylglycerol, some are reported to bind RasGTP. May occur in tandem arrangement. Diacylglycerol (DAG) is a second messenger, released by activation of Phospholipase D. Phorbol Esters (PE) can act as analogues of DAG and mimic its downstream effects in, for example, tumor promotion. Protein Kinases C are activated by DAG/PE, this activation is mediated by their N-terminal conserved region (C1). DAG/PE binding may be phospholipid dependent. 
C1 domains may also mediate DAG/PE signals in chimaerins (a family of Rac GTPase activating proteins), RasGRPs (exchange factors for Ras/Rap1), and Munc13 isoforms (scaffolding proteins involved in exocytosis)." Q#3192 - CGI_10024799 superfamily 247725 30 115 1.15E-15 67.7655 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#3193 - CGI_10024800 superfamily 151109 11 136 2.76E-33 115.766 cl11199 UPF0556 superfamily - - Uncharacterized protein family UPF0556; This family of proteins has no known function. Q#3194 - CGI_10024801 superfamily 245206 5 254 2.06E-165 482.592 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. 
Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#3194 - CGI_10024801 superfamily 241913 481 602 6.50E-65 213.235 cl00509 hot_dog superfamily - - "The hotdog fold was initially identified in the E. coli FabA (beta-hydroxydecanoyl-acyl carrier protein (ACP)-dehydratase) structure and subsequently in 4HBT (4-hydroxybenzoyl-CoA thioesterase) from Pseudomonas. A number of other seemingly unrelated proteins also share the hotdog fold. These proteins have related, but distinct, catalytic activities that include metabolic roles such as thioester hydrolysis in fatty acid metabolism, and degradation of phenylacetic acid and the environmental pollutant 4-chlorobenzoate. This superfamily also includes the PaaI-like protein FapR, a non-catalytic bacterial homolog involved in transcriptional regulation of fatty acid biosynthesis." Q#3194 - CGI_10024801 superfamily 241913 321 451 4.21E-16 76.1955 cl00509 hot_dog superfamily - - "The hotdog fold was initially identified in the E. coli FabA (beta-hydroxydecanoyl-acyl carrier protein (ACP)-dehydratase) structure and subsequently in 4HBT (4-hydroxybenzoyl-CoA thioesterase) from Pseudomonas. A number of other seemingly unrelated proteins also share the hotdog fold. These proteins have related, but distinct, catalytic activities that include metabolic roles such as thioester hydrolysis in fatty acid metabolism, and degradation of phenylacetic acid and the environmental pollutant 4-chlorobenzoate. This superfamily also includes the PaaI-like protein FapR, a non-catalytic bacterial homolog involved in transcriptional regulation of fatty acid biosynthesis." Q#3194 - CGI_10024801 superfamily 242376 715 817 9.79E-11 59.5462 cl01225 SCP2 superfamily - - "SCP-2 sterol transfer family; This domain is involved in binding sterols. 
It is found in the SCP2 protein, as well as the C terminus of the enzyme estradiol 17 beta-dehydrogenase EC:1.1.1.62. The UNC-24 protein contains an SPFH domain pfam01145." Q#3194 - CGI_10024801 superfamily 242376 622 690 1.70E-05 43.7531 cl01225 SCP2 superfamily - - "SCP-2 sterol transfer family; This domain is involved in binding sterols. It is found in the SCP2 protein, as well as the C terminus of the enzyme estradiol 17 beta-dehydrogenase EC:1.1.1.62. The UNC-24 protein contains an SPFH domain pfam01145." Q#3196 - CGI_10024803 superfamily 243146 123 165 1.12E-09 53.4342 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#3196 - CGI_10024803 superfamily 243146 79 132 8.60E-05 39.5799 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#3196 - CGI_10024803 superfamily 243146 12 52 0.00015443 38.7471 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#3196 - CGI_10024803 superfamily 243146 173 213 0.00080677 36.4855 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#3197 - CGI_10024804 superfamily 247639 1 254 1.20E-33 124.111 cl16914 O-FucT_like superfamily - - "GDP-fucose protein O-fucosyltransferase and related proteins; O-fucosyltransferase-like proteins are GDP-fucose dependent enzymes with similarities to the family 1 glycosyltransferases (GT1). They are soluble ER proteins that may be proteolytically cleaved from a membrane-associated preprotein, and are involved in the O-fucosylation of protein substrates, the core fucosylation of growth factor receptors, and other processes." Q#3198 - CGI_10024805 superfamily 241629 27 147 2.45E-13 66.7748 cl00133 SCP superfamily N - "SCP: SCP-like extracellular protein domain, found in eukaryotes and prokaryotes. 
This family includes plant pathogenesis-related protein 1 (PR-1), which accumulates after infections with pathogens, and may act as an anti-fungal agent or be involved in cell wall loosening. This family also includes CRISPs, mammalian cysteine-rich secretory proteins, which combine SCP with a C-terminal cysteine rich domain, and allergen 5 from vespid venom. Roles for CRISP, in response to pathogens, fertilization, and sperm maturation have been proposed. One member, Tex31 from the venom duct of Conus textile, has been shown to possess proteolytic activity sensitive to serine protease inhibitors. The human GAPR-1 protein has been reported to dimerize, and such a dimer may form an active site containing a catalytic triad. SCP has also been proposed to be a Ca++ chelating serine protease. The Ca++-chelating function would fit with various signaling processes that members of this family, such as the CRISPs, are involved in, and is supported by sequence and structural evidence of a conserved pocket containing two histidines and a glutamate. It also may explain how helothermine, a toxic peptide secreted by the beaded lizard, blocks Ca++ transporting ryanodine receptors. Little is known about the biological roles of the bacterial and archaeal SCP domains." Q#3198 - CGI_10024805 superfamily 217895 181 263 3.92E-10 57.2679 cl04401 CD20 superfamily C - "CD20-like family; This family includes the CD20 protein and the beta subunit of the high affinity receptor for IgE Fc. The high affinity receptor for IgE is a tetrameric structure consisting of a single IgE-binding alpha subunit, a single beta subunit, and two disulfide-linked gamma subunits. The alpha subunit of Fc epsilon RI and most Fc receptors are homologous members of the Ig superfamily. By contrast, the beta and gamma subunits from Fc epsilon RI are not homologous to the Ig superfamily. 
Both molecules have four putative transmembrane segments and a probable topology where both amino- and carboxy termini protrude into the cytoplasm. This family also includes LR8 like proteins from humans, mice and rats. The function of the human LR8 protein is unknown although it is known to be strongly expressed in the lung fibroblasts. This family also includes sarcospan, a transmembrane component of dystrophin-associated glycoprotein. Loss of the sarcoglycan complex and sarcospan alone is sufficient to cause muscular dystrophy. The role of the sarcoglycan complex and sarcospan is thought to be to strengthen the dystrophin axis connecting the basement membrane with the cytoskeleton." Q#3199 - CGI_10024806 superfamily 215866 8 150 1.93E-32 120.123 cl18349 Arrestin_N superfamily - - "Arrestin (or S-antigen), N-terminal domain; Ig-like beta-sandwich fold. Scop reports duplication with C-terminal domain." Q#3199 - CGI_10024806 superfamily 243212 179 309 1.49E-21 89.3253 cl02844 Arrestin_C superfamily - - "Arrestin (or S-antigen), C-terminal domain; Ig-like beta-sandwich fold. Scop reports duplication with N-terminal domain." 
Q#3200 - CGI_10008582 superfamily 243092 67 413 2.32E-25 103.569 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#3201 - CGI_10008583 superfamily 246925 69 189 3.23E-05 45.8094 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." 
Q#3201 - CGI_10008583 superfamily 214507 615 666 0.000654016 38.9504 cl15307 LRRCT superfamily - - Leucine rich repeat C-terminal domain; Leucine rich repeat C-terminal domain. Q#3201 - CGI_10008583 superfamily 246925 190 428 0.0026904 39.6462 cl15309 LRR_RI superfamily - - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#3202 - CGI_10008584 superfamily 247805 493 651 1.10E-18 85.4668 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#3202 - CGI_10008584 superfamily 247805 1269 1415 2.28E-17 81.6148 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 
Q#3202 - CGI_10008584 superfamily 247905 783 825 1.37E-06 48.3881 cl17351 HELICc superfamily NC - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#3202 - CGI_10008584 superfamily 214946 903 1212 1.92E-96 314.296 cl15345 Sec63 superfamily - - "Sec63 Brl domain; This domain was named after the yeast Sec63 (or NPL1) (also known as the Brl domain) protein in which it was found. This protein is required for assembly of functional endoplasmic reticulum translocons. Other yeast proteins containing this domain include pre-mRNA splicing helicase BRR2, HFM1 protein and putative helicases." 
Q#3203 - CGI_10008585 superfamily 247905 19 97 1.14E-09 53.7809 cl17351 HELICc superfamily N - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#3204 - CGI_10008586 superfamily 241571 171 279 3.42E-18 79.3786 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#3204 - CGI_10008586 superfamily 216897 39 114 0.00257542 35.7349 cl03463 Gal_Lectin superfamily - - Galactose binding lectin domain; Galactose binding lectin domain. Q#3207 - CGI_10008589 superfamily 214946 12 176 2.53E-63 211.833 cl15345 Sec63 superfamily C - "Sec63 Brl domain; This domain was named after the yeast Sec63 (or NPL1) (also known as the Brl domain) protein in which it was found. This protein is required for assembly of functional endoplasmic reticulum translocons. Other yeast proteins containing this domain include pre-mRNA splicing helicase BRR2, HFM1 protein and putative helicases." Q#3207 - CGI_10008589 superfamily 214946 396 499 2.28E-13 69.6947 cl15345 Sec63 superfamily N - "Sec63 Brl domain; This domain was named after the yeast Sec63 (or NPL1) (also known as the Brl domain) protein in which it was found. This protein is required for assembly of functional endoplasmic reticulum translocons. 
Other yeast proteins containing this domain include pre-mRNA splicing helicase BRR2, HFM1 protein and putative helicases." Q#3210 - CGI_10008592 superfamily 110440 96 120 0.00309387 33.1501 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#3211 - CGI_10008593 superfamily 219188 98 302 5.33E-45 157.832 cl18498 Lung_7-TM_R superfamily C - Lung seven transmembrane receptor; This family represents a conserved region with eukaryotic lung seven transmembrane receptors and related proteins. Q#3212 - CGI_10016414 superfamily 243092 55 212 3.18E-19 87.7756 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat 
to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#3213 - CGI_10016415 superfamily 243179 127 254 2.91E-26 100.841 cl02781 tetraspanin_LEL superfamily - - "Tetraspanin, extracellular domain or large extracellular loop (LEL). Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. The tetraspanin family contains CD9, CD63, CD37, CD53, CD82, CD151, and CD81, amongst others. Tetraspanins are involved in diverse processes such as cell activation and proliferation, adhesion and motility, differentiation, cancer, and others. Their various functions may relate to their ability to act as molecular facilitators, grouping specific cell-surface proteins and affecting formation and stability of signaling complexes. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web", which may also include integrins." 
Q#3214 - CGI_10016416 superfamily 247792 462 505 5.65E-12 61.3076 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#3215 - CGI_10016417 superfamily 219843 39 251 2.25E-77 242.145 cl18528 ATP-grasp_2 superfamily - - ATP-grasp domain; ATP-grasp domain. Q#3215 - CGI_10016417 superfamily 215988 310 430 5.92E-34 123.906 cl18355 Ligase_CoA superfamily - - "CoA-ligase; This family includes the CoA ligases Succinyl-CoA synthetase alpha and beta chains, malate CoA ligase and ATP-citrate lyase. Some members of the family utilise ATP others use GTP." Q#3216 - CGI_10016418 superfamily 241583 301 402 1.62E-20 91.1459 cl00064 ZnMc superfamily N - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#3216 - CGI_10016418 superfamily 241583 236 351 7.08E-05 42.3092 cl00064 ZnMc superfamily - - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." 
Q#3216 - CGI_10016418 superfamily 246968 591 692 0.000444698 40.036 cl15456 ADAM_CR superfamily - - ADAM cysteine-rich; ADAMs are membrane-anchored proteases that proteolytically modify cell surface and extracellular matrix (ECM) in order to alter cell behaviour. It has been shown that the cysteine-rich domain of ADAM13 regulates the protein's metalloprotease activity. Q#3222 - CGI_10016424 superfamily 248097 69 177 1.36E-18 78.0758 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#3223 - CGI_10016425 superfamily 245864 11 466 3.53E-64 216.376 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#3224 - CGI_10016426 superfamily 243061 97 198 2.38E-32 113.976 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. 
These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#3224 - CGI_10016426 superfamily 243061 21 94 9.65E-07 44.255 cl02509 SRCR superfamily N - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#3229 - CGI_10016431 superfamily 241754 64 734 0 1157.21 cl00286 Motor_domain superfamily - - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. Q#3229 - CGI_10016431 superfamily 247725 1978 2153 1.78E-108 344.996 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#3229 - CGI_10016431 superfamily 247725 1448 1584 6.00E-78 256.364 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." 
Q#3229 - CGI_10016431 superfamily 243052 1153 1258 2.21E-46 164.814 cl02480 MyTH4 superfamily - - "MyTH4 domain; Domain in myosin and kinesin tails, present twice in myosin-VIIa, and also present in 3 other myosins." Q#3229 - CGI_10016431 superfamily 243052 1733 1843 1.41E-40 149.818 cl02480 MyTH4 superfamily N - "MyTH4 domain; Domain in myosin and kinesin tails, present twice in myosin-VIIa, and also present in 3 other myosins." Q#3229 - CGI_10016431 superfamily 247683 1572 1635 6.83E-33 124.546 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#3229 - CGI_10016431 superfamily 243052 1012 1051 6.70E-06 46.97 cl02480 MyTH4 superfamily C - "MyTH4 domain; Domain in myosin and kinesin tails, present twice in myosin-VIIa, and also present in 3 other myosins." Q#3229 - CGI_10016431 superfamily 222636 861 993 0.00304267 38.3915 cl16962 DUF4355 superfamily - - Domain of unknown function (DUF4355); This family of proteins is found in bacteria and viruses. Proteins in this family are typically between 180 and 214 amino acids in length. 
Q#3230 - CGI_10016432 superfamily 141815 14 320 1.40E-165 466.456 cl04275 Mtc superfamily - - Tricarboxylate carrier; Tricarboxylate carrier. Q#3232 - CGI_10016434 superfamily 241563 18 50 0.000117077 40.0131 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#3233 - CGI_10016435 superfamily 243141 1 22 8.52E-05 35.755 cl02687 RWD superfamily C - "RWD domain; This domain was identified in WD40 repeat proteins and Ring finger domain proteins. The function of this domain is unknown. GCN2 is the alpha-subunit of the only translation initiation factor (eIF2 alpha) kinase that appears in all eukaryotes. Its function requires an interaction with GCN1 via the domain at its N-terminus, which is termed the RWD domain after three major RWD-containing proteins: RING finger-containing proteins, WD-repeat-containing proteins, and yeast DEAD (DEXD)-like helicases. The structure forms an alpha + beta sandwich fold consisting of two layers: a four-stranded antiparallel beta-sheet, and three side-by-side alpha-helices." Q#3234 - CGI_10016436 superfamily 242406 451 591 3.84E-22 93.4249 cl01271 DUF1768 superfamily - - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. Q#3234 - CGI_10016436 superfamily 245814 339 410 2.01E-10 57.9004 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#3234 - CGI_10016436 superfamily 245814 48 105 1.68E-09 55.7101 cl11960 Ig superfamily N - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#3234 - CGI_10016436 superfamily 245814 419 448 6.65E-05 41.7221 cl11960 Ig superfamily C - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#3234 - CGI_10016436 superfamily 245814 227 308 0.000183275 40.1813 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. 
The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#3235 - CGI_10016437 superfamily 246669 1307 1444 6.55E-83 269.257 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#3235 - CGI_10016437 superfamily 246669 737 858 1.05E-50 177.055 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. 
Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#3235 - CGI_10016437 superfamily 241622 597 669 2.82E-13 67.5918 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either an N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein."
Q#3236 - CGI_10016438 superfamily 243078 10 154 3.38E-61 203.302 cl02544 VHS_ENTH_ANTH superfamily - - "VHS, ENTH and ANTH domain superfamily; composed of proteins containing a VHS, ENTH or ANTH domain. The VHS domain is present in Vps27 (Vacuolar Protein Sorting), Hrs (Hepatocyte growth factor-regulated tyrosine kinase substrate) and STAM (Signal Transducing Adaptor Molecule). It is located at the N-termini of proteins involved in intracellular membrane trafficking. The epsin N-terminal homology (ENTH) domain is an evolutionarily conserved protein module found primarily in proteins that participate in clathrin-mediated endocytosis. A set of proteins previously designated as harboring an ENTH domain in fact contains a highly similar, yet unique module referred to as an AP180 N-terminal homology (ANTH) domain. VHS, ENTH and ANTH domains are structurally similar and are composed of a superhelix of eight alpha helices. ENTH and ANTH (E/ANTH) domains bind both inositol phospholipids and proteins and contribute to the nucleation and formation of clathrin coats on membranes. ENTH domains also function in the development of membrane curvature through lipid remodeling during the formation of clathrin-coated vesicles. E/ANTH domain-bearing proteins have recently been shown to function with adaptor protein-1 and GGA adaptors at the trans-Golgi network, which suggests that E/ANTH domains are universal components of the machinery for clathrin-mediated membrane budding." Q#3236 - CGI_10016438 superfamily 247683 216 269 8.40E-32 118.337 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs.
They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#3237 - CGI_10016439 superfamily 217293 1 100 1.76E-12 64.1911 cl03788 Neur_chan_LBD superfamily NC - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#3237 - CGI_10016439 superfamily 202474 151 185 0.00936971 35.7073 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#3239 - CGI_10016441 superfamily 222438 378 443 6.97E-16 74.0561 cl16459 zf-C3Hc3H superfamily - - Potential DNA-binding domain; This domain is likely to be the DNA-binding domain of chromatin re-modelling proteins and helicases. Q#3240 - CGI_10016442 superfamily 222438 41 103 1.47E-12 59.4186 cl16459 zf-C3Hc3H superfamily - - Potential DNA-binding domain; This domain is likely to be the DNA-binding domain of chromatin re-modelling proteins and helicases. Q#3242 - CGI_10016444 superfamily 241571 4 111 2.39E-29 113.276 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." 
Q#3242 - CGI_10016444 superfamily 241571 115 225 5.22E-22 92.4754 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#3242 - CGI_10016444 superfamily 241571 232 346 1.54E-20 88.2382 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#3242 - CGI_10016444 superfamily 241571 452 572 1.06E-11 62.4298 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#3242 - CGI_10016444 superfamily 241571 351 449 6.96E-11 60.1186 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#3245 - CGI_10016447 superfamily 241609 32 98 4.85E-14 62.3955 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." 
Q#3246 - CGI_10017239 superfamily 243092 5 265 4.69E-81 248.404 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#3247 - CGI_10017240 superfamily 245226 96 262 8.87E-27 104.305 cl10012 DnaQ_like_exo superfamily - - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). 
The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#3249 - CGI_10017243 superfamily 241564 73 142 9.11E-23 86.5507 cl00035 BIR superfamily - - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." Q#3251 - CGI_10017246 superfamily 207794 175 397 1.35E-111 334.181 cl02948 GH20_hexosaminidase superfamily C - "Beta-N-acetylhexosaminidases of glycosyl hydrolase family 20 (GH20) catalyze the removal of beta-1,4-linked N-acetyl-D-hexosamine residues from the non-reducing ends of N-acetyl-beta-D-hexosaminides including N-acetylglucosides and N-acetylgalactosides. These enzymes are broadly distributed in microorganisms, plants and animals, and play roles in various key physiological and pathological processes. 
These processes include cell structural integrity, energy storage, cellular signaling, fertilization, pathogen defense, viral penetration, the development of carcinomas, inflammatory events and lysosomal storage disorders. The GH20 enzymes include the eukaryotic beta-N-acetylhexosaminidases A and B, the bacterial chitobiases, dispersin B, and lacto-N-biosidase. The GH20 hexosaminidases are thought to act via a catalytic mechanism in which the catalytic nucleophile is not provided by the solvent or the enzyme, but by the substrate itself." Q#3251 - CGI_10017246 superfamily 111707 46 173 1.62E-27 105.575 cl03741 Glyco_hydro_20b superfamily - - "Glycosyl hydrolase family 20, domain 2; This domain has a zincin-like fold." Q#3255 - CGI_10017250 superfamily 243072 450 548 6.26E-09 54.3118 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#3255 - CGI_10017250 superfamily 243072 254 379 2.06E-08 52.771 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#3255 - CGI_10017250 superfamily 243072 33 124 1.68E-05 43.9115 cl02529 ANK superfamily N - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#3256 - CGI_10017251 superfamily 208802 751 1042 7.20E-144 441.485 cl07974 DRE_TIM_metallolyase superfamily - - "DRE-TIM metallolyase superfamily; The DRE-TIM metallolyase superfamily includes 2-isopropylmalate synthase (IPMS), alpha-isopropylmalate synthase (LeuA), 3-hydroxy-3-methylglutaryl-CoA lyase, homocitrate synthase, citramalate synthase, 4-hydroxy-2-oxovalerate aldolase, re-citrate synthase, transcarboxylase 5S, pyruvate carboxylase, AksA, and FrbC. These members all share a conserved triose-phosphate isomerase (TIM) barrel domain consisting of a core beta(8)-alpha(8) motif with the eight parallel beta strands forming an enclosed barrel surrounded by eight alpha helices. The domain has a catalytic center containing a divalent cation-binding site formed by a cluster of invariant residues that cap the core of the barrel. In addition, the catalytic site includes three invariant residues - an aspartate (D), an arginine (R), and a glutamate (E) - which is the basis for the domain name "DRE-TIM"." Q#3256 - CGI_10017251 superfamily 245604 1306 1371 4.65E-14 69.3671 cl11404 Biotinyl_lipoyl_domains superfamily - - "Biotinyl_lipoyl_domains are present in biotin-dependent carboxylases/decarboxylases, the dihydrolipoyl acyltransferase component (E2) of 2-oxo acid dehydrogenases, and the H-protein of the glycine cleavage system (GCS). 
These domains transport CO2, acyl, or methylamine, respectively, between components of the complex/protein via a biotinyl or lipoyl group, which is covalently attached to a highly conserved lysine residue." Q#3256 - CGI_10017251 superfamily 247809 336 545 5.68E-89 288.046 cl17255 ATP-grasp_4 superfamily - - ATP-grasp domain; This family includes a diverse set of enzymes that possess ATP-dependent carboxylate-amine ligase activity. Q#3256 - CGI_10017251 superfamily 201133 222 330 2.53E-49 171.896 cl02837 CPSase_L_chain superfamily - - "Carbamoyl-phosphate synthase L chain, N-terminal domain; Carbamoyl-phosphate synthase catalyzes the ATP-dependent synthesis of carbamyl-phosphate from glutamine or ammonia and bicarbonate. This important enzyme initiates both the urea cycle and the biosynthesis of arginine and/or pyrimidines. The carbamoyl-phosphate synthase (CPS) enzyme in prokaryotes is a heterodimer of a small and large chain. The small chain promotes the hydrolysis of glutamine to ammonia, which is used by the large chain to synthesise carbamoyl phosphate. See pfam00988. The small chain has a GATase domain in the carboxyl terminus. See pfam00117." Q#3256 - CGI_10017251 superfamily 244920 560 667 2.05E-42 152.182 cl08365 Biotin_carb_C superfamily - - "Biotin carboxylase C-terminal domain; Biotin carboxylase is a component of the acetyl-CoA carboxylase multi-component enzyme which catalyzes the first committed step in fatty acid synthesis in animals, plants and bacteria. Most of the active site residues reported in reference are in this C-terminal domain." 
Q#3257 - CGI_10017252 superfamily 247949 22 318 1.87E-79 259.179 cl17395 TAF6 superfamily - - "TATA Binding Protein (TBP) Associated Factor 6 (TAF6) is one of several TAFs that bind TBP and is involved in forming Transcription Factor IID (TFIID) complex; The TATA Binding Protein (TBP) Associated Factor 6 (TAF6) is one of several TAFs that bind TBP and are involved in forming Transcription Factor IID (TFIID) complex. TFIID is one of seven General Transcription Factors (GTFs) (TFIIA, TFIIB, TFIID, TFIIE, TFIIF, and TFIID) that are involved in accurate initiation of transcription by RNA polymerase II in eukaryotes. TFIID plays an important role in the recognition of promoter DNA and assembly of the pre-initiation complex. TFIID complex is composed of the TBP and at least 13 TAFs. TAFs are named after their electrophoretic mobility in polyacrylamide gels in different species. A new, unified nomenclature has been suggested for the pol II TAFs to show the relationship between TAF orthologs and paralogs. Several hypotheses are proposed for TAFs functions such as serving as activator-binding sites, core-promoter recognition or a role in essential catalytic activity. These TAFs, with the help of specific activators, are required only for expression of a subset of genes and are not universally involved for transcription as are GTFs. In yeast and human cells, TAFs have been found as components of other complexes besides TFIID. Several TAFs interact via histone-fold (HFD) motifs; the HFD is the interaction motif involved in heterodimerization of the core histones and their assembly into nucleosome octamers. The minimal HFD contains three alpha-helices linked by two loops and is found in core histones, TAFs and many other transcription factors. TFIID has a histone octamer-like substructure. TAF6 is a shared subunit of histone acetyltransferase complex SAGA and TFIID complexes. 
TAF6 domain interacts with TAF9 and makes a novel histone-like heterodimer that is structurally related to histones H4 and H3. TAF6 may also interact with the downstream core promoter element (DPE)." Q#3258 - CGI_10017253 superfamily 243092 75 317 1.08E-44 159.808 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#3258 - CGI_10017253 superfamily 192278 340 488 2.55E-36 131.879 cl15684 UTP15_C superfamily - - "UTP15 C terminal; U3 snoRNA is ubiquitous in eukaryotes and is required for nucleolar processing of pre-18S ribosomal RNA. It is a component of the ribosomal small subunit (SSU) processome. UTP15 is needed for optimal pre-ribosomal RNA transcription by RNA polymerase I, together with a subset of U3 proteins required for transcription (t-UTPs). This entry represents the C terminal of UTP15, and is found adjacent to WD40 repeats (pfam00400)." 
Q#3259 - CGI_10017254 superfamily 243120 186 276 7.65E-35 125.769 cl02633 ARID superfamily - - "ARID/BRIGHT DNA binding domain; This domain is known as ARID for AT-Rich Interaction Domain, and also known as the BRIGHT domain." Q#3260 - CGI_10017255 superfamily 247684 2 382 0 588.339 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#3261 - CGI_10017256 superfamily 245201 20 267 9.95E-79 258.219 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases.
It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#3261 - CGI_10017256 superfamily 243239 690 800 5.12E-49 170.171 cl02916 POLO_box superfamily - - "Polo-box domain (PBD), a C-terminal tandemly repeated region of polo-like kinases; The polo-like Ser/Thr kinases (Plk1, Plk2/Snk, Plk3/Prk/Fnk, Plk4/Sak, and the inactive kinase Plk5) play various roles in cytokinesis and mitosis. At their C-terminus, they contain a tandemly repeated polo-box domain (in the case of Plk4, a tandem repeat of cryptic PBDs is found in the middle of the protein followed by a C-terminal single repeat), which appears to be involved in autoinhibition and in mediating the subcellular localization. The latter may be controlled via interactions between the polo-box domain and phospho-peptide motifs. The phosphopeptide binding site is formed at the interface between the two tandemly repeated PBDs. The PBDs of Plk4/Sak appear unique in participating in homodimer interactions, though it is not clear whether and how they interact with phosphopeptides." Q#3261 - CGI_10017256 superfamily 243239 578 689 2.85E-47 165.131 cl02916 POLO_box superfamily - - "Polo-box domain (PBD), a C-terminal tandemly repeated region of polo-like kinases; The polo-like Ser/Thr kinases (Plk1, Plk2/Snk, Plk3/Prk/Fnk, Plk4/Sak, and the inactive kinase Plk5) play various roles in cytokinesis and mitosis. At their C-terminus, they contain a tandemly repeated polo-box domain (in the case of Plk4, a tandem repeat of cryptic PBDs is found in the middle of the protein followed by a C-terminal single repeat), which appears to be involved in autoinhibition and in mediating the subcellular localization. 
The latter may be controlled via interactions between the polo-box domain and phospho-peptide motifs. The phosphopeptide binding site is formed at the interface between the two tandemly repeated PBDs. The PBDs of Plk4/Sak appear unique in participating in homodimer interactions, though it is not clear whether and how they interact with phosphopeptides." Q#3261 - CGI_10017256 superfamily 243239 910 989 1.85E-28 110.758 cl02916 POLO_box superfamily - - "Polo-box domain (PBD), a C-terminal tandemly repeated region of polo-like kinases; The polo-like Ser/Thr kinases (Plk1, Plk2/Snk, Plk3/Prk/Fnk, Plk4/Sak, and the inactive kinase Plk5) play various roles in cytokinesis and mitosis. At their C-terminus, they contain a tandemly repeated polo-box domain (in the case of Plk4, a tandem repeat of cryptic PBDs is found in the middle of the protein followed by a C-terminal single repeat), which appears to be involved in autoinhibition and in mediating the subcellular localization. The latter may be controlled via interactions between the polo-box domain and phospho-peptide motifs. The phosphopeptide binding site is formed at the interface between the two tandemly repeated PBDs. The PBDs of Plk4/Sak appear unique in participating in homodimer interactions, though it is not clear whether and how they interact with phosphopeptides." Q#3262 - CGI_10017257 superfamily 247725 261 381 1.35E-26 107.405 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." 
Q#3262 - CGI_10017257 superfamily 215882 156 282 5.63E-17 79.247 cl09511 FERM_M superfamily - - FERM central domain; This domain is the central structural domain of the FERM domain. Q#3262 - CGI_10017257 superfamily 241645 62 142 0.00643375 36.5427 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#3265 - CGI_10017260 superfamily 241868 1 216 4.63E-27 102.969 cl00447 Nudix_Hydrolase superfamily - - "Nudix hydrolase is a superfamily of enzymes found in all three kingdoms of life, and it catalyzes the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+ for their activity. Members of this family are recognized by a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which forms a structural motif that functions as a metal binding and catalytic site. Substrates of nudix hydrolase include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. 
These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance and "house-cleaning" enzymes. Substrate specificity is used to define child families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. This superfamily consists of at least nine families: IPP (isopentenyl diphosphate) isomerase, ADP ribose pyrophosphatase, mutT pyrophosphohydrolase, coenzyme-A pyrophosphatase, MTH1-7,8-dihydro-8-oxoguanine-triphosphatase, diadenosine tetraphosphate hydrolase, NADH pyrophosphatase, GDP-mannose hydrolase and the c-terminal portion of the mutY adenine glycosylase." Q#3267 - CGI_10009381 superfamily 247856 13 42 0.00368811 34.878 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#3269 - CGI_10009383 superfamily 220692 1 287 3.51E-15 73.3925 cl18570 7TM_GPCR_Srw superfamily - - Serpentine type 7TM GPCR chemoreceptor Srw; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srw is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. 
elegans which are otherwise 'blind' and 'deaf'. The genes encoding Srw do not appear to be under as strong an adaptive evolutionary pressure as those of Srz. Q#3270 - CGI_10009384 superfamily 244913 985 1423 7.80E-155 484.783 cl08327 Glyco_hydro_47 superfamily - - "Glycosyl hydrolase family 47; Members of this family are alpha-mannosidases that catalyze the hydrolysis of the terminal 1,2-linked alpha-D-mannose residues in the oligo-mannose oligosaccharide Man(9)(GlcNAc)(2)." Q#3270 - CGI_10009384 superfamily 220393 116 315 1.06E-46 171.404 cl10751 Tmem26 superfamily C - "Transmembrane protein 26; The function of this family of transmembrane proteins has not, as yet, been determined." Q#3270 - CGI_10009384 superfamily 244870 1498 1563 5.56E-23 97.4312 cl08238 PA superfamily N - "PA: Protease-associated (PA) domain. The PA domain is an insert domain in a diverse fraction of proteases. The significance of the PA domain to many of the proteins in which it is inserted is undetermined. It may be a protein-protein interaction domain. At peptidase active sites, the PA domain may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate. 
Proteins into which the PA domain is inserted include the following: i) various signal peptide peptidases including, hSPPL2a and 2b which catalyze the intramembrane proteolysis of tumor necrosis factor alpha, ii) various proteins containing a C3H2C3 RING finger including, Arabidopsis ReMembR-H2 protein and various E3 ubiquitin ligases such as human GRAIL (gene related to anergy in lymphocytes), iii) EDEM3 (ER-degradation-enhancing mannosidase-like 3 protein), iv) various plant vacuolar sorting receptors such as Pisum sativum BP-80, v) glutamate carboxypeptidase II (GCPII), vi) yeast aminopeptidase Y, vii) Vibrio metschnikovii VapT, a sodium dodecyl sulfate (SDS) resistant extracellular alkaline serine protease, viii) lactocepin (a cell envelope-associated protease from Lactobacillus paracasei subsp. paracasei NCDO 151), ix) various subtilisin-like proteases such as melon Cucumisin, and x) human TfR (transferrin receptor) 1 and 2." Q#3270 - CGI_10009384 superfamily 193258 484 587 0.000660957 42.3309 cl15087 Innate_immun superfamily N - "Invertebrate innate immunity transcript family; The immune response of the purple sea urchin appears to be more complex than previously believed in that it uses immune-related gene families homologous to vertebrate Toll-like and NOD/NALP-like receptor families as well as C-type lectins and a rudimentary complement system. In addition, the species also produces this unusual family of mRNAs, also known as 185/333, which is strongly upregulated in response to pathogen challenge." 
Q#3272 - CGI_10009386 superfamily 247684 7 161 4.88E-22 90.7995 cl17037 NBD_sugar-kinase_HSP70_actin superfamily C - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#3273 - CGI_10009387 superfamily 243072 658 783 5.17E-20 87.8242 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#3273 - CGI_10009387 superfamily 243072 731 850 1.54E-18 83.587 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#3273 - CGI_10009387 superfamily 243072 827 945 3.40E-16 76.6534 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#3273 - CGI_10009387 superfamily 247724 255 352 6.69E-10 58.5015 cl17170 Ras_like_GTPase superfamily NC - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#3275 - CGI_10009389 superfamily 216939 9 68 1.44E-06 43.0353 cl03492 PC4 superfamily N - Transcriptional Coactivator p15 (PC4); p15 has a bipartite structure composed of an amino-terminal regulatory domain and a carboxy-terminal cryptic DNA-binding domain. The DNA-binding activity of the carboxy-terminal is disguised by the amino-terminal p15 domain. Activity is controlled by protein kinases that target the regulatory domain. Q#3275 - CGI_10009389 superfamily 216939 93 135 4.99E-05 38.7981 cl03492 PC4 superfamily N - Transcriptional Coactivator p15 (PC4); p15 has a bipartite structure composed of an amino-terminal regulatory domain and a carboxy-terminal cryptic DNA-binding domain. The DNA-binding activity of the carboxy-terminal is disguised by the amino-terminal p15 domain. Activity is controlled by protein kinases that target the regulatory domain. Q#3276 - CGI_10009390 superfamily 241564 70 139 7.39E-21 81.5431 cl00035 BIR superfamily - - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. 
This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." Q#3277 - CGI_10009391 superfamily 247724 446 544 1.08E-08 53.4939 cl17170 Ras_like_GTPase superfamily NC - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." 
Q#3278 - CGI_10009392 superfamily 243034 113 209 7.64E-06 43.5228 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#3278 - CGI_10009392 superfamily 243034 180 278 2.35E-05 42.3672 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of
TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#3280 - CGI_10009394 superfamily 100115 71 424 2.41E-129 381.247 cl18930 StaR_like superfamily - - "StaR_like; a well-conserved protein found in bacteria, plants, and animals. A family member from Streptomyces toyocaensis, StaR is part of a gene cluster involved in the biosynthesis of glycopeptide antibiotics (GPAs), specifically A47934. It has been speculated that StaR could be a flavoprotein hydroxylating a tyrosine sidechain. Some family members have been annotated as proteins containing tetratricopeptide (TPR) repeats, which may at least indicate mostly alpha-helical secondary structure." Q#3281 - CGI_10009395 superfamily 247723 17 96 1.45E-57 186.201 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins.
RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#3281 - CGI_10009395 superfamily 247723 102 177 6.30E-53 173.891 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#3281 - CGI_10009395 superfamily 247723 195 275 8.42E-32 116.519 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices.
RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#3281 - CGI_10009395 superfamily 207685 405 446 1.71E-20 85.3034 cl02642 PABP superfamily N - "Poly-adenylate binding protein, unique domain; The region featured in this family is found towards the C-terminus of poly(A)-binding proteins (PABPs). These are eukaryotic proteins that, through their binding of the 3' poly(A) tail on mRNA, have very important roles in the pathways of gene expression. They seem to provide a scaffold on which other proteins can bind and mediate processes such as export, translation and turnover of the transcripts. Moreover, they may act as antagonists to the binding of factors that allow mRNA degradation, regulating mRNA longevity. PABPs are also involved in nuclear transport. PABPs interact with poly(A) tails via RNA-recognition motifs (pfam00076). Note that the PABP C-terminal region is also found in members of the hyperplastic discs protein (HYD) family of ubiquitin ligases that contain HECT domains - these are also included in this family." Q#3281 - CGI_10009395 superfamily 247723 299 324 2.16E-11 59.9404 cl17169 RRM_SF superfamily C - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability.
This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#3288 - CGI_10010664 superfamily 248312 34 158 2.17E-09 54.2904 cl17758 PMP22_Claudin superfamily - - PMP-22/EMP/MP20/Claudin family; PMP-22/EMP/MP20/Claudin family. Q#3289 - CGI_10010665 superfamily 241832 2 90 1.81E-38 130.551 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others."
Q#3289 - CGI_10010665 superfamily 243175 104 227 2.65E-35 123.585 cl02776 GST_C_family superfamily - - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." 
Q#3293 - CGI_10024060 superfamily 243066 492 584 6.76E-19 82.7388 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#3293 - CGI_10024060 superfamily 243066 335 458 1.59E-15 73.4205 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." 
Q#3293 - CGI_10024060 superfamily 247724 224 277 6.09E-06 45.5759 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#3293 - CGI_10024060 superfamily 241622 153 186 0.00920763 35.0472 cl00117 PDZ superfamily NC - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. 
PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#3294 - CGI_10024061 superfamily 246710 61 197 2.34E-34 127.233 cl14783 DOMON_like superfamily - - "Domon-like ligand-binding domains; DOMON-like domains can be found in all three kingdoms of life and are a diverse group of ligand binding domains that have been shown to interact with sugars and hemes. DOMON domains were initially thought to confer protein-protein interactions. They were subsequently found as a heme-binding motif in cellobiose dehydrogenase, an extracellular fungal oxidoreductase that degrades both lignin and cellulose, and in ethylbenzene dehydrogenase, an enzyme that aids in the anaerobic degradation of hydrocarbons. The domain interacts with sugars in the type 9 carbohydrate binding modules (CBM9), which are present in a variety of glycosyl hydrolases, and it can also be found at the N-terminus of sensor histidine kinases." Q#3294 - CGI_10024061 superfamily 217685 371 528 3.48E-33 124.754 cl04225 Cu2_monoox_C superfamily - - "Copper type II ascorbate-dependent monooxygenase, C-terminal domain; The N and C-terminal domains of members of this family adopt the same PNGase F-like fold." Q#3294 - CGI_10024061 superfamily 216290 230 355 5.16E-29 111.997 cl03089 Cu2_monooxygen superfamily - - "Copper type II ascorbate-dependent monooxygenase, N-terminal domain; The N and C-terminal domains of members of this family adopt the same PNGase F-like fold."
Q#3295 - CGI_10024062 superfamily 243109 145 282 2.20E-36 129.466 cl02614 SPRY superfamily - - "SPRY domain; SPRY domains, first identified in the SP1A kinase of Dictyostelium and rabbit Ryanodine receptor (hence the name), are homologous to B30.2. SPRY domains have been identified in at least 11 protein families, covering a wide range of functions, including regulation of cytokine signaling (SOCS), RNA metabolism (DDX1 and hnRNP), immunity to retroviruses (TRIM5alpha), intracellular calcium release (ryanodine receptors or RyR) and regulatory and developmental processes (HERC1 and Ash2L). B30.2 also contains residues in the N-terminus that form a distinct PRY domain structure; i.e. B30.2 domain consists of PRY and SPRY subdomains. B30.2 domains comprise the C-terminus of three protein families: BTNs (receptor glycoproteins of immunoglobulin superfamily); several TRIM proteins (composed of RING/B-box/coiled-coil or RBCC core); Stonutoxin (secreted poisonous protein of the stonefish Synanceia horrida). While SPRY domains are evolutionarily ancient, B30.2 domains are a more recent adaptation where the SPRY/PRY combination is a possible component of immune defense. Mutations found in the SPRY-containing proteins have shown to cause Mediterranean fever and Opitz syndrome." Q#3296 - CGI_10024063 superfamily 247724 54 215 1.30E-69 225.501 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#3296 - CGI_10024063 superfamily 243066 441 545 3.28E-17 78.0429 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." 
Q#3296 - CGI_10024063 superfamily 243066 281 419 1.31E-06 46.9152 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#3297 - CGI_10024064 superfamily 243066 192 296 1.50E-17 77.2725 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." 
Q#3297 - CGI_10024064 superfamily 243066 32 170 3.36E-07 47.6856 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#3298 - CGI_10024065 superfamily 241789 12 145 1.25E-80 236.929 cl00328 Ribosomal_L14 superfamily - - Ribosomal protein L14p/L23e; Ribosomal protein L14p/L23e. Q#3299 - CGI_10024066 superfamily 243050 5 57 5.91E-34 121.329 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. 
The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#3299 - CGI_10024066 superfamily 247683 379 429 1.13E-27 103.936 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#3299 - CGI_10024066 superfamily 159567 67 95 1.02E-05 42.4067 cl11595 Nebulin superfamily - - Nebulin repeat; Nebulin repeat. Q#3299 - CGI_10024066 superfamily 159567 172 200 0.000194234 38.9399 cl11595 Nebulin superfamily - - Nebulin repeat; Nebulin repeat. Q#3299 - CGI_10024066 superfamily 159567 97 127 0.00124554 36.6748 cl11595 Nebulin superfamily - - Nebulin repeat; Nebulin repeat. Q#3299 - CGI_10024066 superfamily 159567 205 228 0.00177666 35.9044 cl11595 Nebulin superfamily - - Nebulin repeat; Nebulin repeat. Q#3301 - CGI_10024068 superfamily 247724 192 352 3.68E-06 45.5252 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. 
The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#3301 - CGI_10024068 superfamily 242902 25 151 1.43E-14 70.3534 cl02144 TLD superfamily - - TLD; This domain is predicted to be an enzyme and is often found associated with pfam01476. Q#3302 - CGI_10024069 superfamily 247724 172 378 1.72E-05 43.5992 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#3302 - CGI_10024069 superfamily 242902 13 129 9.06E-05 41.538 cl02144 TLD superfamily N - TLD; This domain is predicted to be an enzyme and is often found associated with pfam01476. Q#3303 - CGI_10024070 superfamily 247724 224 384 2.62E-05 43.214 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. 
The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#3303 - CGI_10024070 superfamily 242902 25 176 2.95E-16 75.361 cl02144 TLD superfamily - - TLD; This domain is predicted to be an enzyme and is often found associated with pfam01476. Q#3304 - CGI_10024071 superfamily 247724 225 385 6.26E-05 42.0584 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." 
Q#3304 - CGI_10024071 superfamily 242902 26 177 1.97E-15 73.0498 cl02144 TLD superfamily - - TLD; This domain is predicted to be an enzyme and is often found associated with pfam01476. Q#3305 - CGI_10024072 superfamily 247724 224 384 3.53E-05 42.8288 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#3305 - CGI_10024072 superfamily 242902 25 176 2.57E-14 69.583 cl02144 TLD superfamily - - TLD; This domain is predicted to be an enzyme and is often found associated with pfam01476. Q#3307 - CGI_10024074 superfamily 243135 236 539 5.51E-74 243.737 cl02666 KU superfamily - - "Ku-core domain; includes the central DNA-binding beta-barrels, polypeptide rings, and the C-terminal arm of Ku proteins. 
The Ku protein consists of two tightly associated homologous subunits, Ku70 and Ku80, and was originally identified as an autoantigen recognized by the sera of patients with an autoimmunity disease. In eukaryotes, the Ku heterodimer contributes to genomic integrity through its ability to bind DNA double-strand breaks and facilitate repair by non-homologous end-joining. The bacterial Ku homologs do not contain the conserved N-terminal extension that is present in the eukaryotic Ku protein." Q#3307 - CGI_10024074 superfamily 117355 585 706 2.79E-37 135.946 cl07408 Ku_PK_bind superfamily - - Ku C terminal domain like; The non-homologous end joining (NHEJ) pathway is one method by which double stranded breaks in chromosomal DNA are repaired. Ku is a component of a multi-protein complex that is involved in the NHEJ. Ku has affinity for DNA ends and recruits the DNA-dependent protein kinase catalytic subunit (DNA-PKcs). This domain is found at the C terminal of Ku which binds to DNA-PKcs. Q#3307 - CGI_10024074 superfamily 241578 7 236 2.16E-18 84.7488 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. 
Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#3308 - CGI_10024076 superfamily 241575 48 109 2.94E-12 58.8231 cl00054 DSRM superfamily - - "Double-stranded RNA binding motif. Binding is not sequence specific but is highly specific for double stranded RNA. Found in a variety of proteins including dsRNA dependent protein kinase PKR, RNA helicases, Drosophila staufen protein, E. coli RNase III, RNases H1, and dsRNA dependent adenosine deaminases." Q#3309 - CGI_10024077 superfamily 241575 80 141 5.84E-16 73.8459 cl00054 DSRM superfamily - - "Double-stranded RNA binding motif. Binding is not sequence specific but is highly specific for double stranded RNA. Found in a variety of proteins including dsRNA dependent protein kinase PKR, RNA helicases, Drosophila staufen protein, E. coli RNase III, RNases H1, and dsRNA dependent adenosine deaminases." Q#3309 - CGI_10024077 superfamily 241575 219 280 4.88E-15 71.1495 cl00054 DSRM superfamily - - "Double-stranded RNA binding motif. Binding is not sequence specific but is highly specific for double stranded RNA. Found in a variety of proteins including dsRNA dependent protein kinase PKR, RNA helicases, Drosophila staufen protein, E. coli RNase III, RNases H1, and dsRNA dependent adenosine deaminases." Q#3309 - CGI_10024077 superfamily 243132 306 683 3.30E-159 466.855 cl02661 A_deamin superfamily - - "Adenosine-deaminase (editase) domain; Adenosine deaminases acting on RNA (ADARs) can deaminate adenosine to form inosine. In long double-stranded RNA, this process is non-specific; it occurs site-specifically in RNA transcripts. The former is important in defence against viruses, whereas the latter may affect splicing or untranslated regions. They are primarily nuclear proteins, but a longer isoform of ADAR1 is found predominantly in the cytoplasm. 
ADARs are derived from the Tad1-like tRNA deaminases that are present across eukaryotes. These in turn belong to the nucleotide/nucleic acid deaminase superfamily and are characterized by a distinct insert between the two conserved cysteines that are involved in binding zinc." Q#3310 - CGI_10024078 superfamily 199166 1649 1825 8.29E-20 90.8496 cl15308 AMN1 superfamily - - "Antagonist of mitotic exit network protein 1; Amn1 has been functionally characterized in Saccharomyces cerevisiae as a component of the Antagonist of MEN pathway (AMEN). The AMEN network is activated by MEN (mitotic exit network) via an active Cdc14, and in turn switches off MEN. Amn1 constitutes one of the alternative mechanisms by which MEN may be disrupted. Specifically, Amn1 binds Tem1 (Termination of M-phase, a GTPase that belongs to the RAS superfamily), and disrupts its association with Cdc15, the primary downstream target. Amn1 is a leucine-rich repeat (LRR) protein, with 12 repeats in the S. cerevisiae ortholog. As a negative regulator of the signal transduction pathway MEN, overexpression of AMN1 slows the growth of wild type cells. The function of the vertebrate members of this family has not been determined experimentally, they have fewer LRRs that determine the extent of this model." Q#3310 - CGI_10024078 superfamily 243074 1491 1537 2.31E-16 76.0061 cl02535 F-box-like superfamily - - F-box-like; This is an F-box-like family. Q#3311 - CGI_10024079 superfamily 216381 325 667 1.10E-74 245.578 cl03136 Oxysterol_BP superfamily - - Oxysterol-binding protein; Oxysterol-binding protein. Q#3311 - CGI_10024079 superfamily 247725 1 103 1.61E-30 116.204 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. 
They are generally involved in targeting the protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and other proteins." Q#3312 - CGI_10024080 superfamily 247725 364 410 1.62E-05 43.7861 cl17171 PH-like superfamily C - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting the protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and other proteins." Q#3313 - CGI_10024081 superfamily 209274 367 412 9.01E-17 74.4906 cl11211 NADH_4Fe-4S superfamily - - NADH-ubiquinone oxidoreductase-F iron-sulfur binding region; NADH-ubiquinone oxidoreductase-F iron-sulfur binding region. Q#3313 - CGI_10024081 superfamily 220798 278 330 5.56E-05 40.6684 cl14799 SLBB superfamily - - SLBB domain; SLBB domain. Q#3315 - CGI_10024083 superfamily 241599 29 83 4.26E-22 83.4468 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#3318 - CGI_10024086 superfamily 241599 61 118 9.33E-25 98.0844 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." 
Q#3319 - CGI_10024087 superfamily 241599 145 203 5.51E-24 93.462 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#3323 - CGI_10024091 superfamily 241599 158 216 2.26E-23 90.3804 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#3325 - CGI_10024093 superfamily 243035 79 190 1.55E-12 64.1781 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. 
Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#3325 - CGI_10024093 superfamily 243035 359 454 3.06E-10 57.6297 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. 
Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. 
Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#3325 - CGI_10024093 superfamily 243035 224 276 5.50E-05 41.4514 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. 
Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#3326 - CGI_10024094 superfamily 247727 97 193 9.24E-13 64.7586 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." 
Q#3326 - CGI_10024094 superfamily 247727 357 453 1.90E-11 61.2918 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#3327 - CGI_10024095 superfamily 247727 162 258 4.76E-14 67.0698 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#3328 - CGI_10024096 superfamily 247727 163 259 5.35E-14 67.0698 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) 
and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#3329 - CGI_10024098 superfamily 245213 357 391 1.50E-05 42.6238 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#3329 - CGI_10024098 superfamily 245213 280 315 6.83E-05 40.6978 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#3329 - CGI_10024098 superfamily 245213 93 131 0.00153399 36.8458 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. 
EGF_CA can be found in tandem repeat arrangements." Q#3331 - CGI_10024100 superfamily 245599 252 472 1.35E-85 264.854 cl11397 NR_LBD superfamily - - "The ligand binding domain of nuclear receptors, a family of ligand-activated transcription regulators; Ligand-binding domain (LBD) of nuclear receptor (NR): Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions in metazoans, from development, reproduction, to homeostasis and metabolism. The superfamily contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. The members of the family include receptors of steroids, thyroid hormone, retinoids, cholesterol by-products, lipids and heme. With few exceptions, NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD)." Q#3331 - CGI_10024100 superfamily 207662 128 209 4.43E-57 185.856 cl02596 NR_DBD_like superfamily - - "DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers; DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. It interacts with a specific DNA site upstream of the target gene and modulates the rate of transcriptional initiation. Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions, from development, reproduction, to homeostasis and metabolism in animals (metazoans). The family contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. 
NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). Most nuclear receptors bind as homodimers or heterodimers to their target sites, which consist of two hexameric half-sites. Specificity is determined by the half-site sequence, the relative orientation of the half-sites and the number of spacer nucleotides between the half-sites. However, a growing number of nuclear receptors have been reported to bind to DNA as monomers." Q#3332 - CGI_10026403 superfamily 245226 54 205 1.86E-88 261.681 cl10012 DnaQ_like_exo superfamily - - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. 
Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#3333 - CGI_10026404 superfamily 247692 49 548 1.17E-57 200.521 cl17068 AFD_class_I superfamily - - "Adenylate forming domain, Class I; This family includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain." Q#3336 - CGI_10026408 superfamily 245226 5 94 7.01E-52 164.226 cl10012 DnaQ_like_exo superfamily C - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. 
DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#3337 - CGI_10026409 superfamily 247692 42 149 1.07E-12 64.1071 cl17068 AFD_class_I superfamily C - "Adenylate forming domain, Class I; This family includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain." Q#3338 - CGI_10026410 superfamily 247692 53 323 1.00E-45 159.994 cl17068 AFD_class_I superfamily N - "Adenylate forming domain, Class I; This family includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain." Q#3339 - CGI_10026411 superfamily 242787 97 163 0.00523644 36.7251 cl01935 DUF2391 superfamily C - "Putative integral membrane protein (DUF2391); This entry is found in Nostoc sp. 
PCC 7120, Agrobacterium tumefaciens, Rhizobium meliloti, and Gloeobacter violaceus in a conserved two-gene neighborhood. Proteins containing this entry appear to span the membrane seven times." Q#3340 - CGI_10026412 superfamily 242212 69 153 2.18E-17 73.1751 cl00945 Ribosomal_L18ae superfamily - - Ribosomal L18ae/LX protein domain; This family includes eukaryotic L18ae as well as archaebacterial specific LX. Ribosomal protein L18ae forms part of the 60S ribosomal subunit. Q#3341 - CGI_10026413 superfamily 241578 265 526 4.21E-111 334.34 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#3341 - CGI_10026413 superfamily 246669 149 258 3.26E-61 199.329 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. 
Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#3341 - CGI_10026413 superfamily 246669 16 129 2.57E-59 194.325 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." 
Q#3350 - CGI_10026422 superfamily 241563 104 139 0.000746421 38.9996 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#3350 - CGI_10026422 superfamily 188588 1164 1215 0.000884133 39.1091 cl14866 exo_TIGR04073 superfamily - - "putative exosortase-associated protein, TIGR04073 family; Members of this protein family are found in beta, gamma, and delta proteobacteria, and in the verrucomicrobia. Twenty-two of twenty-four species encountered contain the PEP-CTERM/exosortase system for modulating extracellular polysaccharide biosynthesis production, suggesting a role in protein sorting. The N-terminal signal sequence is divergent and not included in the model. PSI-BLAST and HMM searches suggest a distant sequence relationship between a region of this protein of about 100 amino acids and a corresponding region of the very large eukaryotic protein vps13, associated with vacuolar protein sorting in yeast." 
Q#3350 - CGI_10026422 superfamily 243092 362 473 0.00283104 40.0108 cl02567 WD40 superfamily NC - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#3351 - CGI_10026423 superfamily 241591 25 100 2.04E-26 97.6919 cl00073 H15 superfamily - - "linker histone 1 and histone 5 domains; the basic subunit of chromatin is the nucleosome, consisting of an octamer of core histones, two full turns of DNA, a linker histone (H1 or H5) and a variable length of linker DNA; H1/H5 are chromatin-associated proteins that bind to the exterior of nucleosomes and dramatically stabilize the highly condensed states of chromatin fibers; stabilization of higher order folding occurs through electrostatic neutralization of the linker DNA segments, through a highly positively charged carboxy- terminal domain known as the AKP helix (Ala, Lys, Pro); thought to be involved in specific protein-protein and protein-DNA interactions and play a role in suppressing core histone tail domain acetylation in the chromatin fiber" Q#3352 - CGI_10026424 superfamily 241594 456 808 4.01E-162 477.828 cl00077 HECTc superfamily - - "HECT domain; C-terminal catalytic domain of a subclass of Ubiquitin-protein ligase (E3). It binds specific ubiquitin-conjugating enzymes (E2), accepts ubiquitin from E2, transfers ubiquitin to substrate lysine side chains, and transfers additional ubiquitin molecules to the end of growing ubiquitin chains." Q#3352 - CGI_10026424 superfamily 241647 389 419 2.82E-10 56.7674 cl00157 WW superfamily - - Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs. 
Q#3352 - CGI_10026424 superfamily 241647 178 207 2.69E-09 54.071 cl00157 WW superfamily - - Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs. Q#3352 - CGI_10026424 superfamily 241647 324 353 2.92E-08 50.9894 cl00157 WW superfamily - - Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs. 
Q#3352 - CGI_10026424 superfamily 241647 219 249 6.94E-08 49.8338 cl00157 WW superfamily - - Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs. Q#3352 - CGI_10026424 superfamily 246669 18 115 1.68E-39 143.262 cl14603 C2 superfamily N - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." 
Q#3354 - CGI_10026426 superfamily 243034 57 154 0.000823467 38.5152 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#3355 - CGI_10026427 superfamily 242372 50 162 1.33E-19 82.8047 cl01221 DTW superfamily N - DTW domain; This presumed domain is found in bacterial and eukaryotic proteins. Its function is unknown. The domain contains multiple conserved motifs including a DTXW motif that this domain has been named after. Q#3356 - CGI_10026428 superfamily 207637 6 82 2.40E-31 110.705 cl02541 CIDE_N superfamily - - "CIDE_N domain, found at the N-terminus of the CIDE (cell death-inducing DFF45-like effector) proteins, as well as CAD nuclease (caspase-activated DNase/DNA fragmentation factor, DFF40) and its inhibitor, ICAD(DFF45). 
These proteins are associated with the chromatin condensation and DNA fragmentation events of apoptosis; the CIDE_N domain is thought to regulate the activity of ICAD/DFF45, and the CAD/DFF40 and CIDE nucleases during apoptosis. The CIDE-N domain is also found in the FSP27/CIDE-C protein." Q#3356 - CGI_10026428 superfamily 220094 77 181 7.05E-08 48.7289 cl07593 DFF-C superfamily C - "DNA Fragmentation factor 45kDa, C terminal domain; The C terminal domain of DNA Fragmentation factor 45kDa (DFF-C) consists of four alpha-helices, which are folded in a helix-packing arrangement, with alpha-2 and alpha-3 packing against a long C-terminal helix (alpha-4). The main function of this domain is the inhibition of DFF40 by binding to its C-terminal catalytic domain through ionic interactions, thereby inhibiting the fragmentation of DNA in the apoptotic process. In addition to blocking the DNase activity of DFF40, the C-terminal region of DFF45 is also important for the DFF40-specific folding chaperone activity, as demonstrated by the ability of DFF45 to refold DFF40." Q#3357 - CGI_10026429 superfamily 246669 498 630 3.18E-77 244.18 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. 
C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#3357 - CGI_10026429 superfamily 246669 357 479 3.87E-72 230.247 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#3357 - CGI_10026429 superfamily 145459 7 93 6.46E-36 130.298 cl12264 RPH3A_effect_N superfamily - - "Rabphilin-3A effector domain N-terminal; This is a the N-terminus of a family of proteins involved in protein transport in synaptic vesicles. Rabphilin-3A has been shown to contact Rab3A, a small G protein important in neurotransmitter release, in two distinct areas. Most member proteins carry an FVHE-PHD type zinc-finger domain at the C-terminus." Q#3357 - CGI_10026429 superfamily 203315 92 145 0.00384463 35.9415 cl05335 zf-piccolo superfamily - - "Piccolo Zn-finger; This (predicted) Zinc finger is found in the bassoon and piccolo proteins. There are eight conserved cysteines, suggesting that it coordinates two zinc ligands." 
Q#3358 - CGI_10026430 superfamily 201778 7 133 8.56E-28 105.369 cl18219 GFO_IDH_MocA superfamily - - "Oxidoreductase family, NAD-binding Rossmann fold; This family of enzymes utilise NADP or NAD. This family is called the GFO/IDH/MOCA family in swiss-prot." Q#3358 - CGI_10026430 superfamily 217272 149 239 1.60E-07 48.2996 cl18400 GFO_IDH_MocA_C superfamily - - "Oxidoreductase family, C-terminal alpha/beta domain; This family of enzymes utilise NADP or NAD. This family is called the GFO/IDH/MOCA family in swiss-prot." Q#3362 - CGI_10026434 superfamily 201778 24 150 4.41E-28 106.524 cl18219 GFO_IDH_MocA superfamily - - "Oxidoreductase family, NAD-binding Rossmann fold; This family of enzymes utilise NADP or NAD. This family is called the GFO/IDH/MOCA family in swiss-prot." Q#3362 - CGI_10026434 superfamily 217272 166 253 5.81E-11 58.7 cl18400 GFO_IDH_MocA_C superfamily - - "Oxidoreductase family, C-terminal alpha/beta domain; This family of enzymes utilise NADP or NAD. This family is called the GFO/IDH/MOCA family in swiss-prot." Q#3364 - CGI_10026436 superfamily 243033 60 165 2.08E-06 43.8461 cl02428 Ependymin superfamily N - Ependymin; Ependymin. Q#3366 - CGI_10026438 superfamily 238076 44 153 3.02E-75 234.236 cl18938 PAX superfamily - - Paired Box domain Q#3366 - CGI_10026438 superfamily 241599 234 292 7.43E-25 96.9288 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#3369 - CGI_10026441 superfamily 243072 529 666 2.97E-27 108.24 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#3369 - CGI_10026441 superfamily 243072 463 588 2.33E-26 105.929 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#3369 - CGI_10026441 superfamily 115363 4 65 1.14E-15 73.1749 cl05972 MIB_HERC2 superfamily - - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). Q#3369 - CGI_10026441 superfamily 241760 76 118 2.00E-14 69.4095 cl00295 ZZ superfamily - - "Zinc finger, ZZ type. Zinc finger present in dystrophin, CBP/p300 and many other proteins. The ZZ motif coordinates one or two zinc ions and most likely participates in ligand binding or molecular scaffolding. Many proteins containing ZZ motifs have other zinc-binding motifs as well, and the majority serve as scaffolds in pathways involving acetyltransferase, protein kinase, or ubiquitin-related activity. ZZ proteins can be grouped into the following functional classes: chromatin modifying, cytoskeletal scaffolding, ubiquitin binding or conjugating, and membrane receptor or ion-channel modifying proteins." Q#3369 - CGI_10026441 superfamily 115363 145 214 1.24E-12 64.3153 cl05972 MIB_HERC2 superfamily - - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). 
Q#3369 - CGI_10026441 superfamily 247792 707 740 3.21E-06 45.4484 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#3373 - CGI_10026445 superfamily 245201 483 529 9.25E-05 42.9294 cl09925 PKc_like superfamily C - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#3374 - CGI_10026446 superfamily 247069 111 233 1.66E-19 84.359 cl15787 SEC14 superfamily - - "Sec14p-like lipid-binding domain. Found in secretory proteins, such as S. cerevisiae phosphatidylinositol transfer protein (Sec14p), and in lipid regulated proteins such as RhoGAPs, RhoGEFs and neurofibromin (NF1). SEC14 domain of Dbl is known to associate with G protein beta/gamma subunits." Q#3374 - CGI_10026446 superfamily 243095 252 409 1.08E-67 215.279 cl02570 RhoGAP superfamily - - "RhoGAP: GTPase-activator protein (GAP) for Rho-like GTPases; GAPs towards Rho/Rac/Cdc42-like small GTPases. 
Small GTPases (G proteins) cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when bound to GDP. The Rho family of small G proteins, which includes Cdc42Hs, activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. G proteins generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. The RhoGAPs are one of the major classes of regulators of Rho G proteins." Q#3375 - CGI_10026447 superfamily 243100 331 369 3.38E-05 41.9067 cl02576 B_zip1 superfamily N - "basic leucine zipper DNA-binding and multimerization region of GCN4 and related proteins; Basic leucine zipper (bZIP) transcription factors act in networks of homo- and hetero-dimers in the regulation in a diverse set of cellular pathways. Classical leucine zippers have alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization creates a pair of basic regions that bind DNA and undergo conformational change. GCN4 was identified in Saccharomyces cerevisiae from mutations in a deficiency in activation with the general amino acid control pathway. GCN4 encodes a trans-activator of amino acid biosynthetic genes containing 2 acidic activation domains and a C-terminal bZIP domain, comprised of a basic alpha-helical DNA-binding region and a coiled-coil dimerization region." Q#3376 - CGI_10026448 superfamily 247743 2518 2666 0.00259985 40.2071 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. 
The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#3376 - CGI_10026448 superfamily 193256 2837 3107 3.16E-78 264.117 cl18189 AAA_8 superfamily - - "P-loop containing dynein motor region D4; The 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This particular family is the D4 ATP-binding region of the motor." Q#3376 - CGI_10026448 superfamily 193257 3486 3715 2.50E-57 202.138 cl15086 AAA_9 superfamily - - "ATP-binding dynein motor region D5; The 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This particular family is the D5 ATP-binding region of the motor, but has lost its P-loop." Q#3376 - CGI_10026448 superfamily 193253 3120 3466 8.30E-51 187.166 cl15084 MT superfamily - - "Microtubule-binding stalk of dynein motor; the 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. 
This family is the region between D4 and D5 and is the two predicted alpha-helical coiled coil segments that form the stalk supporting the ATP-sensitive microtubule binding component." Q#3376 - CGI_10026448 superfamily 203878 128 156 1.09E-13 69.5428 cl07007 Sgf11 superfamily - - "Sgf11 (transcriptional regulation protein); The Sgf11 family is a SAGA complex subunit in Saccharomyces cerevisiae. The SAGA complex is a multisubunit protein complex involved in transcriptional regulation. SAGA combines proteins involved in interactions with DNA-bound activators and TATA-binding protein (TBP), as well as enzymes for histone acetylation and deubiquitylation." Q#3376 - CGI_10026448 superfamily 247743 2158 2304 3.47E-06 48.8308 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#3377 - CGI_10026449 superfamily 241750 84 574 9.51E-113 349.3 cl00281 metallo-dependent_hydrolases superfamily - - "Superfamily of metallo-dependent hydrolases (also called amidohydrolase superfamily) is a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. 
The family includes urease alpha, adenosine deaminase, phosphotriesterase dihydroorotases, allantoinases, hydantoinases, AMP-, adenine and cytosine deaminases, imidazolonepropionase, aryldialkylphosphatase, chlorohydrolases, formylmethanofuran dehydrogenases and others." Q#3377 - CGI_10026449 superfamily 241750 562 604 0.00767062 37.654 cl00281 metallo-dependent_hydrolases superfamily N - "Superfamily of metallo-dependent hydrolases (also called amidohydrolase superfamily) is a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. The family includes urease alpha, adenosine deaminase, phosphotriesterase dihydroorotases, allantoinases, hydantoinases, AMP-, adenine and cytosine deaminases, imidazolonepropionase, aryldialkylphosphatase, chlorohydrolases, formylmethanofuran dehydrogenases and others." Q#3378 - CGI_10026450 superfamily 221377 205 342 3.82E-10 58.6342 cl13449 DUF3504 superfamily - - Domain of unknown function (DUF3504); This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 156 to 173 amino acids in length. Q#3379 - CGI_10026451 superfamily 247724 10 210 1.08E-58 188.898 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. 
This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#3380 - CGI_10026452 superfamily 243066 31 87 6.50E-05 36.9001 cl02518 BTB superfamily C - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." 
Q#3382 - CGI_10026454 superfamily 243066 35 100 9.43E-11 53.4636 cl02518 BTB superfamily C - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#3383 - CGI_10026455 superfamily 222150 345 367 0.00343385 35.0601 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#3385 - CGI_10026457 superfamily 216347 483 907 2.18E-82 273.643 cl08309 Cu_amine_oxid superfamily - - "Copper amine oxidase, enzyme domain; Copper amine oxidases are a ubiquitous and novel group of quinoenzymes that catalyze the oxidative deamination of primary amines to the corresponding aldehydes, with concomitant reduction of molecular oxygen to hydrogen peroxide. The enzymes are dimers of identical 70-90 kDa subunits, each of which contains a single copper ion and a covalently bound cofactor formed by the post-translational modification of a tyrosine side chain to 2,4,5-trihydroxyphenylalanine quinone (TPQ). This family corresponds to the catalytic domain of the enzyme." Q#3386 - CGI_10026458 superfamily 202715 59 156 6.66E-27 98.4192 cl04194 Tctex-1 superfamily - - Tctex-1 family; Tctex-1 is a dynein light chain. 
It has been shown that Tctex-1 can bind to the cytoplasmic tail of rhodopsin. C-terminal rhodopsin mutations responsible for retinitis pigmentosa inhibit this interaction. Q#3387 - CGI_10026459 superfamily 217617 118 341 2.69E-30 116.747 cl15988 Sulfotransfer_2 superfamily - - "Sulfotransferase family; This family includes a variety of sulfotransferase enzymes. Chondroitin 6-sulfotransferase catalyzes the transfer of sulfate to position 6 of the N-acetylgalactosamine residue of chondroitin. This family also includes Heparan sulfate 2-O-sulfotransferase (HS2ST) and Heparan sulfate 6-sulfotransferase (HS6ST). Heparan sulfate (HS) is a co-receptor for a number of growth factors, morphogens, and adhesion proteins. HS biosynthetic modifications may determine the strength and outcome of HS-ligand interactions. Mice that lack HS2ST undergo developmental failure only after midgestation, the most dramatic effect being the complete failure of kidney development. Heparan sulphate 6-O-sulfotransferase (HS6ST) catalyzes the transfer of sulphate from adenosine 3'-phosphate, 5'-phosphosulphate to the 6th position of the N-sulphoglucosamine residue in heparan sulphate." Q#3390 - CGI_10026463 superfamily 241546 4 127 1.16E-47 151.554 cl00011 PLAT superfamily - - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats. The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#3392 - CGI_10026465 superfamily 245213 27 57 0.000281592 39.157 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#3392 - CGI_10026465 superfamily 241546 495 589 1.68E-19 85.02 cl00011 PLAT superfamily C - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats. The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#3393 - CGI_10026466 superfamily 189857 1 72 2.83E-14 63.0378 cl07832 Caveolin superfamily N - "Caveolin; All three known Caveolin forms have the FEDVIAEP caveolin 'signature motif' within their hydrophilic N-terminal domain. Caveolin 2 (Cav-2) is co-localised and co-expressed with Cav-1/VIP21, forms heterodimers with it and needs Cav-1 for proper membrane localisation. Cav-3 has greater protein sequence similarity to Cav-1 than to Cav-2. Cellular processes caveolins are involved in include vesicular transport, cholesterol homeostasis, signal transduction, and tumour suppression." Q#3397 - CGI_10021632 superfamily 248281 181 264 2.89E-07 47.2651 cl17727 GT1 superfamily - - "GT1, myb-like, SANT family; GT-1, a myb-like protein, is one of the GT trihelix transcription factors. GT-1 binds the GT cis-element of rbcS-3A, a light-induced gene, as a dimer. Arabidopsis GT-1 is a trans-activator and acts in the stabilization of components of the transcription pre-initiation complex comprised of TFIIA-TBP-TATA. 
The isolated GT-1 DNA-binding domain is sufficient to bind DNA. This region closely resemble the myb domain, but with longer helices. It has been proposed that GT-1 may respond to light signals via calcium-dependent phosphorylation to create a light-modulated molecular switch. These proteins are members of the SANT/myb group. SANT is named after 'SWI3, ADA2, N-CoR and TFIIIB', several factors that share this domain. The SANT domain resembles the 3 alpha-helix bundle of the DNA-binding Myb domains and is found in a diverse set of proteins." Q#3398 - CGI_10021633 superfamily 247727 245 343 1.76E-07 48.5803 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#3399 - CGI_10021634 superfamily 247692 81 616 0 556.239 cl17068 AFD_class_I superfamily - - "Adenylate forming domain, Class I; This family includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain." 
Q#3400 - CGI_10021635 superfamily 247723 42 121 1.27E-39 139.177 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#3400 - CGI_10021635 superfamily 247723 155 223 5.51E-34 123.63 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. 
The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#3404 - CGI_10021639 superfamily 152683 8 100 2.11E-08 48.4381 cl13656 Methyltransf_FA superfamily - - "Farnesoic acid O-methyl transferase; This domain family is found in bacteria and eukaryotes, and is approximately 110 amino acids in length. Farnesoic acid O-methyl transferase (FAMeT) is the enzyme that catalyzes the formation of methyl farnesoate (MF) from farnesoic acid (FA) in the biosynthetic pathway of juvenile hormone (JH)." Q#3404 - CGI_10021639 superfamily 241568 106 161 0.000950747 35.2436 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#3405 - CGI_10021640 superfamily 241750 255 598 0 524.239 cl00281 metallo-dependent_hydrolases superfamily C - "Superfamily of metallo-dependent hydrolases (also called amidohydrolase superfamily) is a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. 
The family includes urease alpha, adenosine deaminase, phosphotriesterase dihydroorotases, allantoinases, hydantoinases, AMP-, adenine and cytosine deaminases, imidazolonepropionase, aryldialkylphosphatase, chlorohydrolases, formylmethanofuran dehydrogenases and others." Q#3408 - CGI_10021643 superfamily 243092 1 223 1.67E-55 180.609 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#3409 - CGI_10021644 superfamily 241584 633 720 6.93E-07 48.2615 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. 
Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#3409 - CGI_10021644 superfamily 245814 41 114 7.21E-05 42.0911 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#3409 - CGI_10021644 superfamily 245814 431 510 0.000166785 40.9355 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#3409 - CGI_10021644 superfamily 245814 226 308 6.84E-14 68.686 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#3409 - CGI_10021644 superfamily 245814 312 388 1.34E-06 47.5206 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#3409 - CGI_10021644 superfamily 245814 129 214 0.000590189 39.0257 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#3409 - CGI_10021644 superfamily 241584 516 605 0.00521885 36.2479 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. 
Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#3410 - CGI_10021645 superfamily 243129 122 195 1.13E-10 55.3374 cl02653 MA3 superfamily C - "MA3 domain; Domain in DAP-5, eIF4G, MA-3 and other proteins. Highly alpha-helical. May contain repeats and/or regions similar to MIF4G domains." Q#3412 - CGI_10021647 superfamily 247723 17 96 2.54E-58 191.979 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." 
Q#3412 - CGI_10021647 superfamily 247723 102 177 2.42E-53 178.129 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#3412 - CGI_10021647 superfamily 247723 299 376 3.82E-53 177.426 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. 
The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#3412 - CGI_10021647 superfamily 247723 195 275 4.58E-32 119.216 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#3412 - CGI_10021647 superfamily 207685 560 630 6.30E-39 137.691 cl02642 PABP superfamily - - "Poly-adenylate binding protein, unique domain; The region featured in this family is found towards the C-terminus of poly(A)-binding proteins (PABPs). These are eukaryotic proteins that, through their binding of the 3' poly(A) tail on mRNA, have very important roles in the pathways of gene expression. They seem to provide a scaffold on which other proteins can bind and mediate processes such as export, translation and turnover of the transcripts. Moreover, they may act as antagonists to the binding of factors that allow mRNA degradation, regulating mRNA longevity. PABPs are also involved in nuclear transport. 
PABPs interact with poly(A) tails via RNA-recognition motifs (pfam00076). Note that the PABP C-terminal region is also found in members of the hyperplastic discs protein (HYD) family of ubiquitin ligases that contain HECT domains - these are also included in this family." Q#3413 - CGI_10021648 superfamily 100115 59 177 1.09E-23 94.6582 cl18930 StaR_like superfamily C - "StaR_like; a well-conserved protein found in bacteria, plants, and animals. A family member from Streptomyces toyocaensis, StaR is part of a gene cluster involved in the biosynthesis of glycopeptide antibiotics (GPAs), specifically A47934. It has been speculated that StaR could be a flavoprotein hydroxylating a tyrosine sidechain. Some family members have been annotated as proteins containing tetratricopeptide (TPR) repeats, which may at least indicate mostly alpha-helical secondary structure." Q#3416 - CGI_10021652 superfamily 241574 9 237 2.20E-84 263.678 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#3416 - CGI_10021652 superfamily 241574 305 502 8.00E-17 79.1669 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. 
This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#3417 - CGI_10021653 superfamily 247947 450 484 0.000199179 39.2901 cl17393 HTH_Hin_like superfamily - - "Helix-turn-helix domain of Hin and related proteins, a family of DNA-binding domains unique to bacteria and represented by the Hin protein of Salmonella. The basic HTH domain is a simple fold comprised of three core helices that form a right-handed helical bundle. The principal DNA-protein interface is formed by the third helix, the recognition helix, inserting itself into the major groove of the DNA. A diverse array of HTH domains participate in a variety of functions that depend on their DNA-binding properties. HTH_Hin represents one of the simplest versions of the HTH domains; the characterization of homologous relationships between various sequence-diverse HTH domain families remains difficult. The Hin recombinase induces the site-specific inversion of a chromosomal DNA segment containing a promoter, which controls the alternate expression of two genes by reversibly switching orientation. The Hin recombinase consists of a single polypeptide chain containing a DNA-binding domain (HTH_Hin) and a catalytic domain." Q#3418 - CGI_10003256 superfamily 243035 52 166 2.37E-13 63.0225 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. 
Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. 
A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#3419 - CGI_10003257 superfamily 243035 17 63 1.70E-05 38.126 cl02432 CLECT superfamily NC - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. 
C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#3420 - CGI_10003258 superfamily 241609 160 233 1.30E-18 77.8035 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#3420 - CGI_10003258 superfamily 241629 104 163 4.94E-10 55.0878 cl00133 SCP superfamily C - "SCP: SCP-like extracellular protein domain, found in eukaryotes and prokaryotes. This family includes plant pathogenesis-related protein 1 (PR-1), which accumulates after infections with pathogens, and may act as an anti-fungal agent or be involved in cell wall loosening. 
This family also includes CRISPs, mammalian cysteine-rich secretory proteins, which combine SCP with a C-terminal cysteine rich domain, and allergen 5 from vespid venom. Roles for CRISP, in response to pathogens, fertilization, and sperm maturation have been proposed. One member, Tex31 from the venom duct of Conus textile, has been shown to possess proteolytic activity sensitive to serine protease inhibitors. The human GAPR-1 protein has been reported to dimerize, and such a dimer may form an active site containing a catalytic triad. SCP has also been proposed to be a Ca++ chelating serine protease. The Ca++-chelating function would fit with various signaling processes that members of this family, such as the CRISPs, are involved in, and is supported by sequence and structural evidence of a conserved pocket containing two histidines and a glutamate. It also may explain how helothermine, a toxic peptide secreted by the beaded lizard, blocks Ca++ transporting ryanodine receptors. Little is known about the biological roles of the bacterial and archaeal SCP domains." Q#3421 - CGI_10014485 superfamily 248022 13 108 7.64E-12 62.6803 cl17468 Aa_trans superfamily N - "Transmembrane amino acid transporter protein; This transmembrane region is found in many amino acid transporters including UNC-47 and MTR. UNC-47 encodes a vesicular amino butyric acid (GABA) transporter, (VGAT). UNC-47 is predicted to have 10 transmembrane domains. MTR is a N system amino acid transporter system protein involved in methyltryptophan resistance. Other members of this family include proline transporters and amino acid permeases." Q#3424 - CGI_10014488 superfamily 248458 22 404 1.95E-18 85.4433 cl17904 MFS superfamily - - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. 
MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#3425 - CGI_10014489 superfamily 215776 34 92 1.35E-20 81.101 cl18343 OTCace superfamily N - "Aspartate/ornithine carbamoyltransferase, Asp/Orn binding domain; Aspartate/ornithine carbamoyltransferase, Asp/Orn binding domain. " Q#3426 - CGI_10014490 superfamily 241600 51 241 7.46E-76 231.36 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. 
The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#3427 - CGI_10014491 superfamily 243507 143 235 4.80E-15 72.7307 cl03728 Alpha_kinase superfamily N - "Alpha-kinase family; This family is a novel family of eukaryotic protein kinase catalytic domains, which have no detectable similarity to conventional kinases. The family contains myosin heavy chain kinases and Elongation Factor-2 kinase and a bifunctional ion channel. This family is known as the alpha-kinase family. The structure of the kinase domain revealed unexpected similarity to eukaryotic protein kinases in the catalytic core as well as to metabolic enzymes with ATP-grasp domains." Q#3430 - CGI_10014494 superfamily 247725 128 208 2.17E-11 57.9732 cl17171 PH-like superfamily C - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#3431 - CGI_10014495 superfamily 217247 286 576 1.82E-85 271.189 cl18397 Glyco_hydro_2_C superfamily - - "Glycosyl hydrolases family 2, TIM barrel domain; This family contains beta-galactosidase, beta-mannosidase and beta-glucuronidase activities." 
Q#3431 - CGI_10014495 superfamily 217248 1 174 1.66E-33 125.461 cl18398 Glyco_hydro_2_N superfamily - - "Glycosyl hydrolases family 2, sugar binding domain; This family contains beta-galactosidase, beta-mannosidase and beta-glucuronidase activities and has a jelly-roll fold." Q#3431 - CGI_10014495 superfamily 216070 176 278 9.16E-12 62.5372 cl12242 Glyco_hydro_2 superfamily - - "Glycosyl hydrolases family 2; This family contains beta-galactosidase, beta-mannosidase and beta-glucuronidase activities." Q#3433 - CGI_10014497 superfamily 243035 663 784 3.50E-31 118.876 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. 
Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#3433 - CGI_10014497 superfamily 241578 14 165 3.18E-30 117.778 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#3433 - CGI_10014497 superfamily 243119 241 286 1.44E-05 43.5865 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#3434 - CGI_10014498 superfamily 241578 10 161 3.16E-30 115.081 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. 
There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#3434 - CGI_10014498 superfamily 241611 402 479 0.00265628 37.368 cl00102 PTX superfamily N - "Pentraxins are plasma proteins characterized by their pentameric discoid assembly and their Ca2+ dependent ligand binding, such as Serum amyloid P component (SAP) and C-reactive Protein (CRP), which are cytokine-inducible acute-phase proteins implicated in innate immunity. CRP binds to ligands containing phosphocholine, SAP binds to amyloid fibrils, DNA, chromatin, fibronectin, C4-binding proteins and glycosaminoglycans. "Long" pentraxins have N-terminal extensions to the common pentraxin domain; one group, the neuronal pentraxins, may be involved in synapse formation and remodeling, and they may also be able to form heteromultimers." Q#3435 - CGI_10014499 superfamily 241600 69 279 1.29E-82 250.235 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#3436 - CGI_10014500 superfamily 241600 69 267 8.48E-77 235.212 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. 
Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#3437 - CGI_10014502 superfamily 241578 33 183 9.85E-30 114.311 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#3438 - CGI_10014503 superfamily 217903 47 198 1.38E-47 167.136 cl09317 Mak10 superfamily - - "Mak10 subunit, NatC N(alpha)-terminal acetyltransferase; NatC N(alpha)-terminal acetyltransferases contains Mak10p, Mak31p and Mak3p subunits. 
All three subunits are associated with each other to form the active complex." Q#3440 - CGI_10014505 superfamily 245306 37 135 5.63E-21 83.0187 cl10465 Peptidase_S24_S26 superfamily - - "The S24, S26 LexA/signal peptidase superfamily contains LexA-related and type I signal peptidase families. The S24 LexA protein domains include: the lambda repressor CI/C2 family and related bacterial prophage repressor proteins; LexA (EC 3.4.21.88), the repressor of genes in the cellular SOS response to DNA damage; MucA and the related UmuD proteins, which are lesion-bypass DNA polymerases, induced in response to mitogenic DNA damage; RulA, a component of the rulAB locus that confers resistance to UV, and RuvA, which is a component of the RuvABC resolvasome that catalyzes the resolution of Holliday junctions that arise during genetic recombination and DNA repair. The S26 type I signal peptidase (SPase) family also includes mitochondrial inner membrane protease (IMP)-like members. SPases are essential membrane-bound proteases which function to cleave away the amino-terminal signal peptide from the translocated pre-protein, thus playing a crucial role in the transport of proteins across membranes in all living organisms. All members in this superfamily are unique serine proteases that carry out catalysis using a serine/lysine dyad instead of the prototypical serine/histidine/aspartic acid triad found in most serine proteases." Q#3441 - CGI_10005215 superfamily 248020 206 285 0.00454955 38.9848 cl17466 Sulfatase superfamily N - Sulfatase; Sulfatase. Q#3442 - CGI_10005216 superfamily 241705 258 507 1.66E-118 354.687 cl00228 HIT_like superfamily N - "HIT family: HIT (Histidine triad) proteins, named for a motif related to the sequence HxHxH/Qxx (x, a hydrophobic amino acid), are a superfamily of nucleotide hydrolases and transferases, which act on the alpha-phosphate of ribonucleotides. 
On the basis of sequence, substrate specificity, structure, evolution and mechanism, HIT proteins are classified in the literature into three major branches: the Hint branch, which consists of adenosine 5'-monophosphoramide hydrolases, the Fhit branch, that consists of diadenosine polyphosphate hydrolases, and the GalT branch consisting of specific nucleoside monophosphate transferases. Further sequence analysis reveals several new closely related, yet uncharacterized subgroups." Q#3442 - CGI_10005216 superfamily 241705 55 212 6.16E-65 215.63 cl00228 HIT_like superfamily C - "HIT family: HIT (Histidine triad) proteins, named for a motif related to the sequence HxHxH/Qxx (x, a hydrophobic amino acid), are a superfamily of nucleotide hydrolases and transferases, which act on the alpha-phosphate of ribonucleotides. On the basis of sequence, substrate specificity, structure, evolution and mechanism, HIT proteins are classified in the literature into three major branches: the Hint branch, which consists of adenosine 5'-monophosphoramide hydrolases, the Fhit branch, that consists of diadenosine polyphosphate hydrolases, and the GalT branch consisting of specific nucleoside monophosphate transferases. Further sequence analysis reveals several new closely related, yet uncharacterized subgroups." Q#3444 - CGI_10005218 superfamily 243082 517 642 6.67E-42 154.723 cl02553 Peptidase_C19 superfamily C - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. 
The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." Q#3446 - CGI_10005220 superfamily 248097 32 160 3.46E-13 62.2826 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#3447 - CGI_10005221 superfamily 248097 62 184 7.95E-17 73.0682 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#3448 - CGI_10005222 superfamily 238076 13 138 1.40E-82 246.947 cl18938 PAX superfamily - - Paired Box domain Q#3451 - CGI_10004348 superfamily 247684 312 726 4.45E-85 279.933 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. 
The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#3454 - CGI_10004986 superfamily 247684 1 263 7.26E-51 173.232 cl17037 NBD_sugar-kinase_HSP70_actin superfamily N - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#3456 - CGI_10004988 superfamily 247069 1321 1467 4.74E-14 71.6474 cl15787 SEC14 superfamily - - "Sec14p-like lipid-binding domain. Found in secretory proteins, such as S. cerevisiae phosphatidylinositol transfer protein (Sec14p), and in lipid regulated proteins such as RhoGAPs, RhoGEFs and neurofibromin (NF1). 
SEC14 domain of Dbl is known to associate with G protein beta/gamma subunits." Q#3456 - CGI_10004988 superfamily 221603 1279 1337 1.03E-16 78.9756 cl13877 BNIP2 superfamily N - "Bcl2-/adenovirus E1B nineteen kDa-interacting protein 2; This domain family is found in eukaryotes, and is typically between 119 and 133 amino acids in length. There is a conserved HGGY sequence motif. This family is Bcl2-/adenovirus E1B nineteen kDa-interacting protein 2. It interacts with pro- and anti-apoptotic molecules in the cell." Q#3456 - CGI_10004988 superfamily 202421 7 139 3.99E-06 47.1226 cl03740 DHHA2 superfamily - - DHHA2 domain; This domain is often found adjacent to the DHH domain pfam01368 and is called DHHA2 for DHH associated domain. This domain is diagnostic of DHH subfamily 2 members. The domain is about 120 residues long and contains a conserved DXK motif at its amino terminus. Q#3458 - CGI_10006649 superfamily 241563 62 96 6.74E-05 41.9391 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#3459 - CGI_10006650 superfamily 110440 390 417 0.00828724 34.3057 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." 
Q#3462 - CGI_10006653 superfamily 241645 193 267 7.86E-11 59.6243 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#3462 - CGI_10006653 superfamily 199166 453 581 1.82E-09 57.3372 cl15308 AMN1 superfamily N - "Antagonist of mitotic exit network protein 1; Amn1 has been functionally characterized in Saccharomyces cerevisiae as a component of the Antagonist of MEN pathway (AMEN). The AMEN network is activated by MEN (mitotic exit network) via an active Cdc14, and in turn switches off MEN. Amn1 constitutes one of the alternative mechanisms by which MEN may be disrupted. Specifically, Amn1 binds Tem1 (Termination of M-phase, a GTPase that belongs to the RAS superfamily), and disrupts its association with Cdc15, the primary downstream target. Amn1 is a leucine-rich repeat (LRR) protein, with 12 repeats in the S. cerevisiae ortholog. As a negative regulator of the signal transduction pathway MEN, overexpression of AMN1 slows the growth of wild type cells. The function of the vertebrate members of this family has not been determined experimentally, they have fewer LRRs that determine the extent of this model." 
Q#3462 - CGI_10006653 superfamily 246925 684 784 0.000761089 41.187 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#3463 - CGI_10006654 superfamily 215647 1008 1230 7.96E-06 47.2181 cl18338 7tm_2 superfamily - - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GPCRs). They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling."
Q#3466 - CGI_10003894 superfamily 241563 38 73 6.68E-05 40.9256 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#3466 - CGI_10003894 superfamily 110440 462 488 0.00822535 34.3057 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#3469 - CGI_10019885 superfamily 243090 344 464 1.36E-65 209.51 cl02565 RGS superfamily - - "Regulator of G protein signaling (RGS) domain superfamily; The RGS domain is an essential part of the Regulator of G-protein Signaling (RGS) protein family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play critical regulatory roles as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. While inactive, G-alpha-subunits bind GDP, which is released and replaced by GTP upon agonist activation. GTP binding leads to dissociation of the alpha-subunit and the beta-gamma-dimer, allowing them to interact with effectors molecules and propagate signaling cascades associated with cellular growth, survival, migration, and invasion.
Deactivation of the G-protein signaling controlled by the RGS domain accelerates GTPase activity of the alpha subunit by hydrolysis of GTP to GDP, which results in the reassociation of the alpha-subunit with the beta-gamma-dimer and thereby inhibition of downstream activity. As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. RGS proteins are also involved in apoptosis and cell proliferation, as well as modulation of cardiac development. Several RGS proteins can fine-tune immune responses, while others play important roles in neuronal signals modulation. Some RGS proteins are principal elements needed for proper vision." Q#3469 - CGI_10019885 superfamily 243090 185 328 7.50E-25 100.605 cl02565 RGS superfamily N - "Regulator of G protein signaling (RGS) domain superfamily; The RGS domain is an essential part of the Regulator of G-protein Signaling (RGS) protein family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play critical regulatory roles as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. While inactive, G-alpha-subunits bind GDP, which is released and replaced by GTP upon agonist activation. GTP binding leads to dissociation of the alpha-subunit and the beta-gamma-dimer, allowing them to interact with effectors molecules and propagate signaling cascades associated with cellular growth, survival, migration, and invasion. Deactivation of the G-protein signaling controlled by the RGS domain accelerates GTPase activity of the alpha subunit by hydrolysis of GTP to GDP, which results in the reassociation of the alpha-subunit with the beta-gamma-dimer and thereby inhibition of downstream activity. 
As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. RGS proteins are also involved in apoptosis and cell proliferation, as well as modulation of cardiac development. Several RGS proteins can fine-tune immune responses, while others play important roles in neuronal signals modulation. Some RGS proteins are principal elements needed for proper vision." Q#3469 - CGI_10019885 superfamily 243090 51 110 1.60E-15 74.026 cl02565 RGS superfamily C - "Regulator of G protein signaling (RGS) domain superfamily; The RGS domain is an essential part of the Regulator of G-protein Signaling (RGS) protein family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play critical regulatory roles as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. While inactive, G-alpha-subunits bind GDP, which is released and replaced by GTP upon agonist activation. GTP binding leads to dissociation of the alpha-subunit and the beta-gamma-dimer, allowing them to interact with effectors molecules and propagate signaling cascades associated with cellular growth, survival, migration, and invasion. Deactivation of the G-protein signaling controlled by the RGS domain accelerates GTPase activity of the alpha subunit by hydrolysis of GTP to GDP, which results in the reassociation of the alpha-subunit with the beta-gamma-dimer and thereby inhibition of downstream activity. 
As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. RGS proteins are also involved in apoptosis and cell proliferation, as well as modulation of cardiac development. Several RGS proteins can fine-tune immune responses, while others play important roles in neuronal signals modulation. Some RGS proteins are principal elements needed for proper vision." Q#3470 - CGI_10019887 superfamily 216050 3 73 9.11E-06 40.7503 cl18357 rve superfamily C - Integrase core domain; Integrase mediates integration of a DNA copy of the viral genome into the host chromosome. Integrase is composed of three domains. The amino-terminal domain is a zinc binding domain pfam02022. This domain is the central catalytic domain. The carboxyl terminal domain that is a non-specific DNA binding domain pfam00552. The catalytic domain acts as an endonuclease when two nucleotides are removed from the 3' ends of the blunt-ended viral DNA made by reverse transcription. This domain also catalyzes the DNA strand transfer reaction of the 3' ends of the viral DNA to the 5' ends of the integration site. Q#3471 - CGI_10019888 superfamily 241578 420 476 0.000209925 41.0122 cl00057 vWFA superfamily C - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#3473 - CGI_10019891 superfamily 241888 85 332 4.43E-116 339.357 cl00473 BI-1-like superfamily - - "BAX inhibitor (BI)-1/YccA-like protein family; Mammalian members of the BAX inhibitor (BI)-1 like family of small transmembrane proteins have been shown to have an antiapoptotic effect either by stimulating the antiapoptotic function of Bcl-2, a well-characterized oncogene, or by inhibiting the proapoptotic effect of Bax, another member of the Bcl-2 family. Their broad tissue distribution and high degree of conservation suggests an important regulatory role. This superfamily also contains the lifeguard(LFG)-like proteins and other subfamilies which appear to be related by common descent and also function as inhibitors of apoptosis. In plants, BI-1 like proteins play a role in pathogen resistance. A prokaryotic member, Escherichia coli YccA, has been shown to interact with ATP-dependent protease FtsH, which degrades abnormal membrane proteins as part of a quality control mechanism to keep the integrity of biological membranes." Q#3474 - CGI_10019892 superfamily 218440 42 234 0.004874 38.3641 cl14936 AF-4 superfamily NC - "AF-4 proto-oncoprotein; This family consists of AF4 (Proto-oncogene AF4) and FMR2 (Fragile X E mental retardation syndrome) nuclear proteins. These proteins have been linked to human diseases such as acute lymphoblastic leukaemia and mental retardation.
The family also contains a Drosophila AF4 protein homologue Lilliputian which contains an AT-hook domain. Lilliputian represents a novel pair-rule gene that acts in cytoskeleton regulation, segmentation and morphogenesis in Drosophila." Q#3474 - CGI_10019892 superfamily 208802 240 299 0.00729625 37.0255 cl07974 DRE_TIM_metallolyase superfamily NC - "DRE-TIM metallolyase superfamily; The DRE-TIM metallolyase superfamily includes 2-isopropylmalate synthase (IPMS), alpha-isopropylmalate synthase (LeuA), 3-hydroxy-3-methylglutaryl-CoA lyase, homocitrate synthase, citramalate synthase, 4-hydroxy-2-oxovalerate aldolase, re-citrate synthase, transcarboxylase 5S, pyruvate carboxylase, AksA, and FrbC. These members all share a conserved triose-phosphate isomerase (TIM) barrel domain consisting of a core beta(8)-alpha(8) motif with the eight parallel beta strands forming an enclosed barrel surrounded by eight alpha helices. The domain has a catalytic center containing a divalent cation-binding site formed by a cluster of invariant residues that cap the core of the barrel. In addition, the catalytic site includes three invariant residues - an aspartate (D), an arginine (R), and a glutamate (E) - which is the basis for the domain name "DRE-TIM"." Q#3475 - CGI_10019893 superfamily 245202 14 56 9.03E-21 78.0607 cl09927 S1_like superfamily C - "S1_like: Ribosomal protein S1-like RNA-binding domain. Found in a wide variety of RNA-associated proteins. Originally identified in S1 ribosomal protein. This superfamily also contains the Cold Shock Domain (CSD), which is a homolog of the S1 domain. Both domains are members of the Oligonucleotide/oligosaccharide Binding (OB) fold." Q#3476 - CGI_10019894 superfamily 241866 27 357 0 523.482 cl00445 Iso_dh superfamily - - Isocitrate/isopropylmalate dehydrogenase; Isocitrate/isopropylmalate dehydrogenase. Q#3477 - CGI_10019895 superfamily 242122 1546 1676 0.000323333 41.0785 cl00824 HEPN superfamily - - HEPN domain; HEPN domain. 
Q#3478 - CGI_10019897 superfamily 242122 279 409 4.07E-06 44.5453 cl00824 HEPN superfamily - - HEPN domain; HEPN domain. Q#3480 - CGI_10019899 superfamily 241733 1077 1144 5.11E-42 149.209 cl00259 Sm_like superfamily - - "Sm and related proteins; The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes." Q#3480 - CGI_10019899 superfamily 217744 164 1064 0 588.878 cl09313 Nrap superfamily - - "Nrap protein; Members of this family are nucleolar RNA-associated proteins (Nrap) which are highly conserved from yeast (Saccharomyces cerevisiae) to human. In the mouse, Nrap is ubiquitously expressed and is specifically localised in the nucleolus. Nrap is a large nucleolar protein (of more than 1000 amino acids). Nrap appears to be associated with ribosome biogenesis by interacting with pre-rRNA primary transcript." Q#3481 - CGI_10019900 superfamily 243077 107 166 1.59E-07 46.7697 cl02542 DnaJ superfamily - - "DnaJ domain or J-domain. DnaJ/Hsp40 (heat shock protein 40) proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperonine family. Hsp40 proteins are characterized by the presence of a J domain, which mediates the interaction with Hsp70. They may contain other domains as well, and the architectures provide a means of classification." 
Q#3481 - CGI_10019900 superfamily 219548 191 262 1.54E-15 69.1811 cl06670 HSCB_C superfamily - - HSCB C-terminal oligomerisation domain; This domain is the HSCB C-terminal oligomerisation domain and is found on co-chaperone proteins. Q#3482 - CGI_10019901 superfamily 243100 331 383 4.74E-10 55.7739 cl02576 B_zip1 superfamily - - "basic leucine zipper DNA-binding and multimerization region of GCN4 and related proteins; Basic leucine zipper (bZIP) transcription factors act in networks of homo- and hetero-dimers in the regulation in a diverse set of cellular pathways. Classical leucine zippers have alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization creates a pair of basic regions that bind DNA and undergo conformational change. GCN4 was identified in Saccharomyces cerevisiae from mutations in a deficiency in activation with the general amino acid control pathway. GCN4 encodes a trans-activator of amino acid biosynthetic genes containing 2 acidic activation domains and a C-terminal bZIP domain, comprised of a basic alpha-helical DNA-binding region and a coiled-coil dimerization region." Q#3483 - CGI_10019902 superfamily 247757 59 276 4.31E-73 225.422 cl17203 Fer4_NifH superfamily - - "The Fer4_NifH superfamily contains a variety of proteins which share a common ATP-binding domain. Functionally, proteins in this superfamily use the energy from hydrolysis of NTP to transfer electron or ion." Q#3485 - CGI_10019904 superfamily 247757 1 51 3.76E-15 67.875 cl17203 Fer4_NifH superfamily N - "The Fer4_NifH superfamily contains a variety of proteins which share a common ATP-binding domain. Functionally, proteins in this superfamily use the energy from hydrolysis of NTP to transfer electron or ion." 
Q#3486 - CGI_10019905 superfamily 241563 60 100 0.000701713 37.844 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#3487 - CGI_10019906 superfamily 243072 684 810 1.43E-28 113.247 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#3487 - CGI_10019906 superfamily 220879 252 312 0.00571181 37.0281 cl12395 DUF2730 superfamily N - Protein of unknown function (DUF2730); This family of proteins with unknown function appears to be restricted to Gammaproteobacteria. Q#3487 - CGI_10019906 superfamily 247746 110 208 0.00755828 36.8526 cl17192 ATP-synt_B superfamily N - "ATP synthase B/B' CF(0); Part of the CF(0) (base unit) of the ATP synthase. The base unit is thought to translocate protons through membrane (inner membrane in mitochondria, thylakoid membrane in plants, cytoplasmic membrane in bacteria). The B subunits are thought to interact with the stalk of the CF(1) subunits. This domain should not be confused with the ab CF(1) proteins (in the head of the ATP synthase) which are found in pfam00006" Q#3488 - CGI_10019907 superfamily 245864 48 273 3.92E-24 99.6602 cl12078 p450 superfamily C - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. 
They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#3489 - CGI_10019908 superfamily 245864 54 518 6.11E-86 275.311 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes.
Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#3493 - CGI_10005140 superfamily 217403 351 451 1.31E-20 86.7078 cl18408 2OG-FeII_Oxy superfamily - - "2OG-Fe(II) oxygenase superfamily; This family contains members of the 2-oxoglutarate (2OG) and Fe(II)-dependent oxygenase superfamily. This family includes the C-terminal of prolyl 4-hydroxylase alpha subunit. The holoenzyme has the activity EC:1.14.11.2 catalyzing the reaction: Procollagen L-proline + 2-oxoglutarate + O2 <=> procollagen trans- 4-hydroxy-L-proline + succinate + CO2. The full enzyme consists of an alpha2 beta2 complex with the alpha subunit contributing most of the parts of the active site. The family also includes lysyl hydrolases, isopenicillin synthases and AlkB." Q#3493 - CGI_10005140 superfamily 222608 248 312 2.01E-15 72.287 cl18680 DIOX_N superfamily C - non-haem dioxygenase in morphine synthesis N-terminal; This is the highly conserved N-terminal region of proteins with 2-oxoglutarate/Fe(II)-dependent dioxygenase activity. Q#3493 - CGI_10005140 superfamily 248281 95 176 3.14E-07 48.0355 cl17727 GT1 superfamily - - "GT1, myb-like, SANT family; GT-1, a myb-like protein, is one of the GT trihelix transcription factors. GT-1 binds the GT cis-element of rbcS-3A, a light-induced gene, as a dimer. Arabidopsis GT-1 is a trans-activator and acts in the stabilization of components of the transcription pre-initiation complex comprised of TFIIA-TBP-TATA. The isolated GT-1 DNA-binding domain is sufficient to bind DNA. This region closely resembles the myb domain, but with longer helices. It has been proposed that GT-1 may respond to light signals via calcium-dependent phosphorylation to create a light-modulated molecular switch. These proteins are members of the SANT/myb group. SANT is named after 'SWI3, ADA2, N-CoR and TFIIIB', several factors that share this domain.
The SANT domain resembles the 3 alpha-helix bundle of the DNA-binding Myb domains and is found in a diverse set of proteins." Q#3494 - CGI_10005141 superfamily 243035 5 127 6.54E-23 87.6753 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces.
Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#3495 - CGI_10005142 superfamily 241572 53 138 6.12E-14 65.3376 cl00050 CYCLIN superfamily - - "Cyclin box fold. Protein binding domain functioning in cell-cycle and transcription control. Present in cyclins, TFIIB and Retinoblastoma (RB). The cyclins consist of 8 classes of cell cycle regulators that regulate cyclin dependent kinases (CDKs). TFIIB is a transcription factor that binds the TATA box. Cyclins, TFIIB and RB contain 2 copies of the domain." Q#3495 - CGI_10005142 superfamily 241572 146 233 2.14E-08 49.9297 cl00050 CYCLIN superfamily - - "Cyclin box fold. Protein binding domain functioning in cell-cycle and transcription control. Present in cyclins, TFIIB and Retinoblastoma (RB). The cyclins consist of 8 classes of cell cycle regulators that regulate cyclin dependent kinases (CDKs). TFIIB is a transcription factor that binds the TATA box. Cyclins, TFIIB and RB contain 2 copies of the domain."
Q#3496 - CGI_10005143 superfamily 241563 172 206 4.49E-05 40.3983 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#3497 - CGI_10005144 superfamily 241563 96 135 1.17E-06 47.474 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#3497 - CGI_10005144 superfamily 128778 146 253 0.00916432 36.4739 cl17972 BBC superfamily - - B-Box C-terminal domain; Coiled coil region C-terminal to (some) B-Box domains Q#3498 - CGI_10005145 superfamily 246669 1008 1131 2.03E-55 189.766 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions."
Q#3498 - CGI_10005145 superfamily 241622 678 772 1.76E-16 76.8366 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#3499 - CGI_10005847 superfamily 247723 215 293 8.71E-44 146.581 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight.
The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#3499 - CGI_10005847 superfamily 247723 1 73 5.81E-40 136.741 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#3502 - CGI_10024562 superfamily 241755 9 339 1.66E-133 387.345 cl00288 EPT_RTPC-like superfamily - - "This domain family includes the Enolpyruvate transferase (EPT) family and the RNA 3' phosphate cyclase family (RTPC). These 2 families differ in that EPT is formed by 3 repeats of an alpha-beta structural domain while RTPC has 3 similar repeats with a 4th slightly different domain inserted between the 2nd and 3rd repeat. They evidently share the same active site location, although the catalytic residues differ." 
Q#3504 - CGI_10024564 superfamily 241619 569 627 9.73E-06 45.1533 cl00112 PAN_APPLE superfamily N - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#3504 - CGI_10024564 superfamily 241619 633 706 0.000119236 41.7956 cl00112 PAN_APPLE superfamily - - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#3504 - CGI_10024564 superfamily 241619 42 98 0.0019411 38.2197 cl00112 PAN_APPLE superfamily N - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." 
Q#3506 - CGI_10024566 superfamily 245213 1163 1199 1.85E-10 58.8022 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#3506 - CGI_10024566 superfamily 245213 1377 1410 2.00E-06 46.861 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#3506 - CGI_10024566 superfamily 245213 1126 1160 0.000359978 40.3126 cl09941 EGF_CA superfamily N - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#3506 - CGI_10024566 superfamily 245213 1347 1372 0.0069789 36.4606 cl09941 EGF_CA superfamily N - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#3506 - CGI_10024566 superfamily 150843 238 430 3.26E-38 144.66 cl10917 Med8 superfamily - - "Mediator of RNA polymerase II transcription complex subunit 8; Arc32, or Med8, is one of the subunits of the Mediator complex of RNA polymerase II. The region conserved contains two alpha helices putatively necessary for binding to other subunits within the core of the Mediator complex. The N-terminus of Med8 binds to the essential core Head part of Mediator and the C-terminus hinges to Med18 on the non-essential part of the Head that also includes Med20." Q#3506 - CGI_10024566 superfamily 150843 24 177 3.29E-24 103.829 cl10917 Med8 superfamily - - "Mediator of RNA polymerase II transcription complex subunit 8; Arc32, or Med8, is one of the subunits of the Mediator complex of RNA polymerase II. The region conserved contains two alpha helices putatively necessary for binding to other subunits within the core of the Mediator complex. The N-terminus of Med8 binds to the essential core Head part of Mediator and the C-terminus hinges to Med18 on the non-essential part of the Head that also includes Med20." 
Q#3506 - CGI_10024566 superfamily 246680 517 598 2.18E-08 53.5836 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#3506 - CGI_10024566 superfamily 219501 934 971 0.000289397 40.7766 cl06622 MNNL superfamily C - N terminus of Notch ligand; This entry represents a region of conserved sequence at the N terminus of several Notch ligand proteins. Q#3507 - CGI_10024567 superfamily 241592 49 66 0.000187188 35.6654 cl00074 H2A superfamily C - "Histone 2A; H2A is a subunit of the nucleosome. The nucleosome is an octamer containing two H2A, H2B, H3, and H4 subunits. The H2A subunit performs essential roles in maintaining structural integrity of the nucleosome, chromatin condensation, and binding of specific chromatin-associated proteins." Q#3508 - CGI_10024568 superfamily 219632 14 92 1.14E-24 91.1804 cl06786 Eaf7 superfamily - - "Chromatin modification-related protein EAF7; The S. cerevisiae member of this family is part of NuA4, the only essential histone acetyltransferase complex in Saccharomyces cerevisiae involved in global histone acetylation." Q#3509 - CGI_10024569 superfamily 247805 244 442 7.10E-88 275.132 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. 
A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#3509 - CGI_10024569 superfamily 247905 470 584 1.39E-32 122.732 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#3509 - CGI_10024569 superfamily 247799 82 128 3.05E-10 57.1847 cl17245 KH-I superfamily N - "K homology RNA-binding domain, type I. KH binds single-stranded RNA or DNA. It is found in a wide variety of proteins including ribosomal proteins, transcription factors and post-transcriptional modifiers of mRNA. There are two different KH domains that belong to different protein folds, but they share a single KH motif. The KH motif is folded into a beta alpha alpha beta unit. In addition to the core, type II KH domains (e.g. ribosomal protein S3) include N-terminal extension and type I KH domains (e.g. hnRNP K) contain C-terminal extension." Q#3511 - CGI_10024571 superfamily 241554 82 214 3.25E-18 81.5379 cl00019 Macro superfamily - - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. 
Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#3511 - CGI_10024571 superfamily 241752 457 590 1.98E-15 74.6743 cl00283 ADP_ribosyl superfamily N - "ADP_ribosylating enzymes catalyze the transfer of ADP_ribose from NAD+ to substrates. Bacterial toxins are cytoplasmic and catalyze the transfer of a single ADP_ribose unit to eukaryotic elongation factor 2, halting protein synthesis and killing the cell. Poly(ADP-ribose) polymerases (PARPs 1-3, VPARP, tankyrase) catalyze the addition of up to 100 ADP_ribose units from NAD+. PARPs 1 and 2 are localized in the nucleus, bind DNA, and are activated by DNA damage. VPARP is part of the vault ribonucleoprotein complex. Tankyrases regulate telomere length in part through poly(ADP_ribosylation) of telomere repeat binding factor 1 (TRF1). Poly(ADP-ribose) polymerase catalyses the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active." 
Q#3512 - CGI_10024572 superfamily 247684 9 182 1.54E-20 87.2591 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#3513 - CGI_10024573 superfamily 241832 501 601 7.62E-05 41.8274 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. 
The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#3514 - CGI_10024574 superfamily 222150 379 404 0.000164019 38.9121 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#3515 - CGI_10024575 superfamily 202715 117 213 2.64E-29 106.508 cl04194 Tctex-1 superfamily - - Tctex-1 family; Tctex-1 is a dynein light chain. It has been shown that Tctex-1 can bind to the cytoplasmic tail of rhodopsin. C-terminal rhodopsin mutations responsible for retinitis pigmentosa inhibit this interaction. Q#3516 - CGI_10024576 superfamily 202715 114 212 3.02E-31 111.516 cl04194 Tctex-1 superfamily - - Tctex-1 family; Tctex-1 is a dynein light chain. It has been shown that Tctex-1 can bind to the cytoplasmic tail of rhodopsin. C-terminal rhodopsin mutations responsible for retinitis pigmentosa inhibit this interaction. Q#3517 - CGI_10024577 superfamily 202715 117 213 1.88E-36 124.998 cl04194 Tctex-1 superfamily - - Tctex-1 family; Tctex-1 is a dynein light chain. It has been shown that Tctex-1 can bind to the cytoplasmic tail of rhodopsin. C-terminal rhodopsin mutations responsible for retinitis pigmentosa inhibit this interaction. Q#3518 - CGI_10024578 superfamily 202715 11 114 2.07E-29 103.427 cl04194 Tctex-1 superfamily - - Tctex-1 family; Tctex-1 is a dynein light chain. It has been shown that Tctex-1 can bind to the cytoplasmic tail of rhodopsin. C-terminal rhodopsin mutations responsible for retinitis pigmentosa inhibit this interaction. 
Q#3519 - CGI_10024579 superfamily 247684 9 182 1.09E-20 87.6443 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#3520 - CGI_10024580 superfamily 248028 150 304 8.27E-30 111.343 cl17474 Steroid_dh superfamily - - "3-oxo-5-alpha-steroid 4-dehydrogenase; This family consists of 3-oxo-5-alpha-steroid 4-dehydrogenases, EC:1.3.99.5 Also known as Steroid 5-alpha-reductase, the reaction catalyzed by this enzyme is: 3-oxo-5-alpha-steroid + acceptor <=> 3-oxo-delta(4)-steroid + reduced acceptor. 
The Steroid 5-alpha-reductase enzyme is responsible for the formation of dihydrotestosterone; this hormone promotes the differentiation of male external genitalia and the prostate during fetal development. In humans, mutations in this enzyme can cause a form of male pseudohermaphroditism in which the external genitalia and prostate fail to develop normally. A related enzyme also found in plants is DET2, a steroid reductase from Arabidopsis. Mutations in this enzyme cause defects in light-regulated development." Q#3520 - CGI_10024580 superfamily 241645 2 78 4.38E-09 52.0769 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#3521 - CGI_10024581 superfamily 247723 469 538 0.000446714 39.5957 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. 
RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#3521 - CGI_10024581 superfamily 192930 153 266 1.78E-32 122.89 cl13496 DUF3546 superfamily - - Domain of unknown function (DUF3546); This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 93 to 114 amino acids in length. This domain has two completely conserved Y residues that may be functionally important. Q#3523 - CGI_10024583 superfamily 214560 38 237 9.08E-22 94.3464 cl18311 TSPN superfamily - - Thrombospondin N-terminal -like domains; Heparin-binding and cell adhesion domain of thrombospondin Q#3524 - CGI_10024584 superfamily 241793 16 129 4.81E-34 118.376 cl00333 Ribosomal_L13 superfamily - - "Ribosomal protein L13. Protein L13, a large ribosomal subunit protein, is one of five proteins required for an early folding intermediate of 23S rRNA in the assembly of the large subunit. L13 is situated on the bottom of the large subunit, near the polypeptide exit site. It interacts with proteins L3 and L6, and forms an extensive network of interactions with 23S rRNA. L13 has been identified as a homolog of the human breast basic conserved protein 1 (BBC1), a protein identified through its increased expression in breast cancer. L13 expression is also upregulated in a variety of human gastrointestinal cancers, suggesting it may play a role in the etiology of a variety of human malignancies." 
Q#3525 - CGI_10024585 superfamily 243058 128 238 0.00011654 40.3756 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#3526 - CGI_10024586 superfamily 218708 38 276 4.71E-54 177.927 cl05328 DUF829 superfamily - - Eukaryotic protein of unknown function (DUF829); This family consists of several uncharacterized eukaryotic proteins. Q#3527 - CGI_10024587 superfamily 248469 62 184 0.00205729 36.9643 cl17915 HAD_like superfamily - - "Haloacid dehalogenase-like hydrolases. The haloacid dehalogenase-like (HAD) superfamily includes L-2-haloacid dehalogenase, epoxide hydrolase, phosphoserine phosphatase, phosphomannomutase, phosphoglycolate phosphatase, P-type ATPase, and many others, all of which use a nucleophilic aspartate in their phosphoryl transfer reaction. All members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. Members of this superfamily are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases." Q#3529 - CGI_10024589 superfamily 241838 11 77 6.75E-13 63.6163 cl00395 FMT_core superfamily N - "Formyltransferase, catalytic core domain; Formyltransferase, catalytic core domain. 
The proteins of this superfamily contain a formyltransferase domain that hydrolyzes the removal of a formyl group from its substrate as part of a multistep transfer mechanism, and this alignment model represents the catalytic core of the formyltransferase domain. This family includes the following known members; Glycinamide Ribonucleotide Transformylase (GART), Formyl-FH4 Hydrolase, Methionyl-tRNA Formyltransferase, ArnA, and 10-Formyltetrahydrofolate Dehydrogenase (FDH). Glycinamide Ribonucleotide Transformylase (GART) catalyzes the third step in de novo purine biosynthesis, the transfer of a formyl group to 5'-phosphoribosylglycinamide. Formyl-FH4 Hydrolase catalyzes the hydrolysis of 10-formyltetrahydrofolate (formyl-FH4) to FH4 and formate. Methionyl-tRNA Formyltransferase transfers a formyl group onto the amino terminus of the acyl moiety of the methionyl aminoacyl-tRNA, which plays important role in translation initiation. ArnA is required for the modification of lipid A with 4-amino-4-deoxy-l-arabinose (Ara4N) that leads to resistance to cationic antimicrobial peptides (CAMPs) and clinical antimicrobials such as polymyxin. 10-formyltetrahydrofolate dehydrogenase (FDH) catalyzes the conversion of 10-formyltetrahydrofolate, a precursor for nucleotide biosynthesis, to tetrahydrofolate. Members of this family are multidomain proteins. The formyltransferase domain is located at the N-terminus of FDH, Methionyl-tRNA Formyltransferase and ArnA, and at the C-terminus of Formyl-FH4 Hydrolase. Prokaryotic Glycinamide Ribonucleotide Transformylase (GART) is a single domain protein while eukaryotic GART is a trifunctional protein that catalyzes the second, third and fifth steps in de novo purine biosynthesis." 
Q#3534 - CGI_10024594 superfamily 241832 12 105 7.74E-35 122.667 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#3534 - CGI_10024594 superfamily 218936 125 268 1.02E-56 181.634 cl05619 PITH superfamily - - "PITH domain; This family was formerly known as DUF1000. The full-length, Txnl1, protein which is a probable component of the 26S proteasome, uses its C-terminal, PITH, domain to associate specifically with the 26S proteasome. PITH derives from proteasome-interacting thioredoxin domain." Q#3535 - CGI_10024595 superfamily 248097 92 217 2.06E-14 66.5198 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#3536 - CGI_10024596 superfamily 248097 14 139 2.54E-10 53.423 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#3537 - CGI_10024597 superfamily 248097 24 148 1.30E-17 74.2238 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. 
Q#3538 - CGI_10024598 superfamily 216686 10 190 8.10E-39 140.921 cl18377 Galactosyl_T superfamily - - "Galactosyltransferase; This family includes the galactosyltransferases UDP-galactose:2-acetamido-2-deoxy-D-glucose3beta-galactosyltransferase and UDP-Gal:beta-GlcNAc beta 1,3-galactosyltranferase. Specific galactosyltransferases transfer galactose to GlcNAc terminal chains in the synthesis of the lacto-series oligosaccharides types 1 and 2." Q#3538 - CGI_10024598 superfamily 216686 335 517 2.38E-38 139.766 cl18377 Galactosyl_T superfamily - - "Galactosyltransferase; This family includes the galactosyltransferases UDP-galactose:2-acetamido-2-deoxy-D-glucose3beta-galactosyltransferase and UDP-Gal:beta-GlcNAc beta 1,3-galactosyltranferase. Specific galactosyltransferases transfer galactose to GlcNAc terminal chains in the synthesis of the lacto-series oligosaccharides types 1 and 2." Q#3541 - CGI_10024601 superfamily 241874 22 320 3.42E-151 444.041 cl00456 SLC5-6-like_sbd superfamily N - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. 
They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#3542 - CGI_10024602 superfamily 241600 223 438 2.41E-100 301.467 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#3543 - CGI_10024603 superfamily 247736 95 161 2.23E-09 50.7373 cl17182 NAT_SF superfamily - - "N-Acyltransferase superfamily: Various enzymes that characteristically catalyze the transfer of an acyl group to a substrate; NAT (N-Acyltransferase) is a large superfamily of enzymes that mostly catalyze the transfer of an acyl group to a substrate and are implicated in a variety of functions, ranging from bacterial antibiotic resistance to circadian rhythms in mammals. Members include GCN5-related N-Acetyltransferases (GNAT) such as Aminoglycoside N-acetyltransferases, Histone N-acetyltransferase (HAT) enzymes, and Serotonin N-acetyltransferase, which catalyze the transfer of an acetyl group to a substrate. The kinetic mechanism of most GNATs involves the ordered formation of a ternary complex: the reaction begins with Acetyl Coenzyme A (AcCoA) binding, followed by binding of substrate, then direct transfer of the acetyl group from AcCoA to the substrate, followed by product and subsequent CoA release. 
Other family members include Arginine/ornithine N-succinyltransferase, Myristoyl-CoA: protein N-myristoyltransferase, and Acyl-homoserinelactone synthase which have a similar catalytic mechanism but differ in types of acyl groups transferred. Leucyl/phenylalanyl-tRNA-protein transferase and FemXAB nonribosomal peptidyltransferases which catalyze similar peptidyltransferase reactions are also included." Q#3545 - CGI_10024605 superfamily 247723 8 73 4.56E-24 91.9044 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#3545 - CGI_10024605 superfamily 199156 91 104 0.0015423 35.1116 cl15298 zf-CCHC superfamily - - "Zinc knuckle; The zinc knuckle is a zinc binding motif composed of the following CX2CX4HX4C where X can be any amino acid. The motifs are mostly from retroviral gag proteins (nucleocapsid). Prototype structure is from HIV. Also contains members involved in eukaryotic gene regulation, such as C. elegans GLH-1. Structure is an 18-residue zinc finger." 
Q#3546 - CGI_10024606 superfamily 248097 48 170 9.81E-17 72.2978 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#3547 - CGI_10024607 superfamily 245202 79 157 4.57E-34 126.585 cl09927 S1_like superfamily - - "S1_like: Ribosomal protein S1-like RNA-binding domain. Found in a wide variety of RNA-associated proteins. Originally identified in S1 ribosomal protein. This superfamily also contains the Cold Shock Domain (CSD), which is a homolog of the S1 domain. Both domains are members of the Oligonucleotide/oligosaccharide Binding (OB) fold." Q#3547 - CGI_10024607 superfamily 247805 398 536 1.56E-28 113.201 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#3547 - CGI_10024607 superfamily 219532 896 996 5.88E-37 135.905 cl06657 OB_NTP_bind superfamily - - "Oligonucleotide/oligosaccharide-binding (OB)-fold; This family is found towards the C-terminus of the DEAD-box helicases (pfam00270). In these helicases it is apparently always found in association with pfam04408. There do seem to be a couple of instances where it occurs by itself. The structure PDB:3i4u adopts an OB-fold. This C-terminal domain of the yeast helicase contains an oligonucleotide/oligosaccharide-binding (OB)-fold which seems to be placed at the entrance of the putative nucleic acid cavity. It also constitutes the binding site for the G-patch-containing domain of Pfa1p. When found on DEAH/RHA helicases, this domain is central to the regulation of the helicase activity through its binding of both RNA and G-patch domain proteins." 
Q#3547 - CGI_10024607 superfamily 243778 772 862 6.70E-35 129.651 cl04503 HA2 superfamily - - "Helicase associated domain (HA2); This presumed domain is about 90 amino acid residues in length. It is found in a diverse set of RNA helicases. Its function is unknown, however it seems likely to be involved in nucleic acid binding." Q#3547 - CGI_10024607 superfamily 247905 607 675 2.92E-08 52.5993 cl17351 HELICc superfamily C - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#3548 - CGI_10024608 superfamily 247740 306 522 1.10E-89 277.068 cl17186 TIM_phosphate_binding superfamily - - "TIM barrel proteins share a structurally conserved phosphate binding motif and in general share an eight beta/alpha closed barrel structure. Specific for this family is the conserved phosphate binding site at the edges of strands 7 and 8. The phosphate comes either from the substrate, as in the case of inosine monophosphate dehydrogenase (IMPDH), or from ribulose-5-phosphate 3-epimerase (RPE) or from cofactors, like FMN." Q#3549 - CGI_10024609 superfamily 216901 202 393 3.58E-61 205.897 cl03466 Rap_GAP superfamily - - Rap/ran-GAP; Rap/ran-GAP. Q#3549 - CGI_10024609 superfamily 243036 450 737 1.32E-43 159.711 cl02434 CNH superfamily - - "CNH domain; Domain found in NIK1-like kinase, mouse citron and yeast ROM1, ROM2. Unpublished observations." 
Q#3552 - CGI_10024612 superfamily 248012 108 212 8.55E-20 81.47 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#3553 - CGI_10024613 superfamily 217473 24 293 5.70E-24 102.058 cl03978 Mab-21 superfamily - - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#3556 - CGI_10022585 superfamily 247684 1 337 1.16E-61 208.285 cl17037 NBD_sugar-kinase_HSP70_actin superfamily N - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#3557 - CGI_10022586 superfamily 247684 32 458 4.40E-98 307.667 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#3559 - CGI_10022588 superfamily 241758 55 80 0.00143091 33.7699 cl00292 AANH_like superfamily N - "Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases, ATP sulphurylases, Universal Stress Response protein and electron transfer flavoprotein (ETF). The domain forms an alpha/beta/alpha fold which binds to adenosine nucleotide." 
Q#3560 - CGI_10022589 superfamily 243119 14 59 1.18E-05 37.7985 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#3561 - CGI_10022590 superfamily 248097 73 170 1.41E-15 73.4534 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#3561 - CGI_10022590 superfamily 243119 448 502 4.82E-05 41.2653 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#3562 - CGI_10022591 superfamily 248097 71 201 1.75E-22 88.8614 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#3563 - CGI_10022592 superfamily 248097 45 147 5.90E-17 74.609 cl17543 C1q superfamily C - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. 
Q#3564 - CGI_10022593 superfamily 245595 430 729 3.19E-143 424.241 cl11393 Peptidase_M14_like superfamily - - "M14 family of metallocarboxypeptidases and related proteins; The M14 family of metallocarboxypeptidases (MCPs), also known as funnelins, are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Two major subfamilies of the M14 family, defined based on sequence and structural homology, are the A/B and N/E subfamilies. Enzymes belonging to the A/B subfamily are normally synthesized as inactive precursors containing preceding signal peptide, followed by an N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases. The A/B enzymes can be further divided based on their substrate specificity; Carboxypeptidase A-like (CPA-like) enzymes favor hydrophobic residues while carboxypeptidase B-like (CPB-like) enzymes only cleave the basic residues lysine or arginine. The A forms have slightly different specificities, with Carboxypeptidase A1 (CPA1) preferring aliphatic and small aromatic residues, and CPA2 preferring the bulky aromatic side chains. Enzymes belonging to the N/E subfamily enzymes are not produced as inactive precursors and instead rely on their substrate specificity and subcellular compartmentalization to prevent inappropriate cleavage. They contain an extra C-terminal transthyretin-like domain, thought to be involved in folding or formation of oligomers. 
MCPs can also be classified based on their involvement in specific physiological processes; the pancreatic MCPs participate only in alimentary digestion and include carboxypeptidase A and B (A/B subfamily), while others, namely regulatory MCPs or the N/E subfamily, are involved in more selective reactions, mainly in non-digestive tissues and fluids, acting on blood coagulation/fibrinolysis, inflammation and local anaphylaxis, pro-hormone and neuropeptide processing, cellular response and others. Another MCP subfamily, is that of succinylglutamate desuccinylase /aspartoacylase, which hydrolyzes N-acetyl-L-aspartate (NAA), and deficiency in which is the established cause of Canavan disease. Another subfamily (referred to as subfamily C) includes an exceptional type of activity in the MCP family, that of dipeptidyl-peptidase activity of gamma-glutamyl-(L)-meso-diaminopimelate peptidase I which is involved in bacterial cell wall metabolism." Q#3564 - CGI_10022593 superfamily 243119 340 384 1.53E-06 46.2728 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#3564 - CGI_10022593 superfamily 243119 45 92 6.80E-06 44.3569 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. 
It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#3564 - CGI_10022593 superfamily 243119 185 226 0.000223738 39.7245 cl02629 CBM_14 superfamily C - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#3565 - CGI_10022594 superfamily 245847 4 144 5.08E-17 72.9745 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#3579 - CGI_10022610 superfamily 241563 93 135 0.000348101 38.6144 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#3580 - CGI_10022611 superfamily 247723 32 132 2.02E-52 177.1 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. 
RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#3580 - CGI_10022611 superfamily 241563 486 528 4.80E-06 44.7776 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#3581 - CGI_10022612 superfamily 247744 3 184 6.47E-74 224.706 cl17190 NK superfamily - - "Nucleoside/nucleotide kinase (NK) is a protein superfamily consisting of multiple families of enzymes that share structural similarity and are functionally related to the catalysis of the reversible phosphate group transfer from nucleoside triphosphates to nucleosides/nucleotides, nucleoside monophosphates, or sugars. Members of this family play a wide variety of essential roles in nucleotide metabolism, the biosynthesis of coenzymes and aromatic compounds, as well as the metabolism of sugar and sulfate." Q#3583 - CGI_10022614 superfamily 247856 52 111 1.04E-05 40.6089 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#3583 - CGI_10022614 superfamily 247856 85 155 8.49E-05 38.2977 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#3584 - CGI_10022615 superfamily 247736 259 314 0.00123163 37.2553 cl17182 NAT_SF superfamily - - "N-Acyltransferase superfamily: Various enzymes that characteristically catalyze the transfer of an acyl group to a substrate; NAT (N-Acyltransferase) is a large superfamily of enzymes that mostly catalyze the transfer of an acyl group to a substrate and are implicated in a variety of functions, ranging from bacterial antibiotic resistance to circadian rhythms in mammals. Members include GCN5-related N-Acetyltransferases (GNAT) such as Aminoglycoside N-acetyltransferases, Histone N-acetyltransferase (HAT) enzymes, and Serotonin N-acetyltransferase, which catalyze the transfer of an acetyl group to a substrate. The kinetic mechanism of most GNATs involves the ordered formation of a ternary complex: the reaction begins with Acetyl Coenzyme A (AcCoA) binding, followed by binding of substrate, then direct transfer of the acetyl group from AcCoA to the substrate, followed by product and subsequent CoA release. Other family members include Arginine/ornithine N-succinyltransferase, Myristoyl-CoA: protein N-myristoyltransferase, and Acyl-homoserinelactone synthase which have a similar catalytic mechanism but differ in types of acyl groups transferred. Leucyl/phenylalanyl-tRNA-protein transferase and FemXAB nonribosomal peptidyltransferases which catalyze similar peptidyltransferase reactions are also included." 
Q#3584 - CGI_10022615 superfamily 202401 348 536 6.39E-110 328.748 cl03719 NMT_C superfamily - - "Myristoyl-CoA:protein N-myristoyltransferase, C-terminal domain; The N and C-terminal domains of NMT are structurally similar, each adopting an acyl-CoA N-acyltransferase-like fold." Q#3585 - CGI_10022616 superfamily 243090 377 509 4.48E-39 142.552 cl02565 RGS superfamily - - "Regulator of G protein signaling (RGS) domain superfamily; The RGS domain is an essential part of the Regulator of G-protein Signaling (RGS) protein family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play critical regulatory roles as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. While inactive, G-alpha-subunits bind GDP, which is released and replaced by GTP upon agonist activation. GTP binding leads to dissociation of the alpha-subunit and the beta-gamma-dimer, allowing them to interact with effector molecules and propagate signaling cascades associated with cellular growth, survival, migration, and invasion. Deactivation of the G-protein signaling controlled by the RGS domain accelerates GTPase activity of the alpha subunit by hydrolysis of GTP to GDP, which results in the reassociation of the alpha-subunit with the beta-gamma-dimer and thereby inhibition of downstream activity. As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. RGS proteins are also involved in apoptosis and cell proliferation, as well as modulation of cardiac development. Several RGS proteins can fine-tune immune responses, while others play important roles in neuronal signal modulation. 
Some RGS proteins are principal elements needed for proper vision." Q#3585 - CGI_10022616 superfamily 243089 97 281 2.29E-37 139.316 cl02564 PXA superfamily - - PXA domain; This domain is associated with PX domains pfam00787. Q#3585 - CGI_10022616 superfamily 243088 540 643 3.97E-36 133.548 cl02563 PX_domain superfamily - - "The Phox Homology domain, a phosphoinositide binding module; The PX domain is a phosphoinositide (PI) binding module involved in targeting proteins to membranes. Proteins containing PX domains interact with PIs and have been implicated in highly diverse functions such as cell signaling, vesicular trafficking, protein sorting, lipid modification, cell polarity and division, activation of T and B cells, and cell survival. Many members of this superfamily bind phosphatidylinositol-3-phosphate (PI3P) but in some cases, other PIs such as PI4P or PI(3,4)P2, among others, are the preferred substrates. In addition to protein-lipid interaction, the PX domain may also be involved in protein-protein interaction, as in the cases of p40phox, p47phox, and some sorting nexins (SNXs). The PX domain is conserved from yeast to humans and is found in more than 100 proteins. The majority of PX domain-containing proteins are SNXs, which play important roles in endosomal sorting." Q#3585 - CGI_10022616 superfamily 149621 734 845 2.75E-27 108.141 cl07303 Nexin_C superfamily - - Sorting nexin C terminal; This region is found at the C terminal of proteins belonging to the sorting nexin family. It is found on proteins which also contain pfam00787. Q#3587 - CGI_10022618 superfamily 248262 7 280 5.57E-108 321.1 cl17708 HMBS superfamily - - "Hydroxymethylbilane synthase (HMBS), also known as porphobilinogen deaminase (PBGD), is an intermediate enzyme in the biosynthetic pathway of tetrapyrrolic ring systems, such as heme, chlorophylls, and vitamin B12. HMBS catalyzes the conversion of porphobilinogen (PBG) into hydroxymethylbilane (HMB). 
HMBS consists of three domains, and is believed to bind substrate through a hinge-bending motion of domains I and II. HMBS is found in all organisms except viruses." Q#3588 - CGI_10022619 superfamily 243065 225 380 4.88E-17 78.2125 cl02516 VWD superfamily - - von Willebrand factor type D domain; Luciferin-2-monooxygenase from Vargula hilgendorfii contains a vwd domain. Its function is unrelated but the similarity is very strong by several methods. Q#3589 - CGI_10022620 superfamily 243065 583 729 2.73E-30 118.658 cl02516 VWD superfamily - - von Willebrand factor type D domain; Luciferin-2-monooxygenase from Vargula hilgendorfii contains a vwd domain. Its function is unrelated but the similarity is very strong by several methods. Q#3589 - CGI_10022620 superfamily 243065 94 265 8.31E-19 85.1461 cl02516 VWD superfamily - - von Willebrand factor type D domain; Luciferin-2-monooxygenase from Vargula hilgendorfii contains a vwd domain. Its function is unrelated but the similarity is very strong by several methods. Q#3589 - CGI_10022620 superfamily 244710 774 851 3.27E-12 63.5633 cl07383 C8 superfamily - - "C8 domain; This domain contains 8 conserved cysteine residues, but this family only contains 7 of them due to overlaps with other domains. It is found in disease-related proteins including von Willebrand factor, Alpha tectorin, Zonadhesin and Mucin. It is often found on proteins containing pfam00094 and pfam01826." 
Q#3590 - CGI_10022621 superfamily 241613 65 101 4.80E-10 58.3722 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#3590 - CGI_10022621 superfamily 241613 185 219 2.19E-09 56.4462 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#3590 - CGI_10022621 superfamily 241613 148 183 2.25E-09 56.4462 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it 
into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#3590 - CGI_10022621 superfamily 241613 112 144 3.99E-07 49.8978 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#3590 - CGI_10022621 superfamily 241613 24 61 5.79E-06 46.431 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL 
receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#3590 - CGI_10022621 superfamily 241613 1 20 0.00373277 37.9566 cl00104 LDLa superfamily N - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#3590 - CGI_10022621 superfamily 246918 2617 2667 1.58E-12 66.0711 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#3590 - CGI_10022621 superfamily 246918 2673 2722 5.68E-07 49.5075 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#3590 - CGI_10022621 superfamily 246918 2368 2415 5.46E-06 46.8111 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#3590 - CGI_10022621 superfamily 246918 2729 2774 2.88E-05 44.4999 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#3590 - CGI_10022621 superfamily 246918 2574 2612 0.000112614 42.7778 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. 
Q#3591 - CGI_10022622 superfamily 217293 28 227 2.41E-44 154.328 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#3591 - CGI_10022622 superfamily 202474 234 322 5.31E-08 51.8857 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#3592 - CGI_10006983 superfamily 245201 1 201 5.62E-61 202.467 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#3595 - CGI_10006986 superfamily 247694 37 118 2.96E-11 55.3079 cl17070 AMPKA_C_like superfamily - - "C-terminal regulatory domain of 5'-AMP-activated protein kinase (AMPK) alpha subunit and similar domains; This family is composed of AMPKs, microtubule-associated protein/microtubule affinity regulating kinases (MARKs), yeast Kcc4p-like proteins, plant calcineurin B-Like (CBL)-interacting protein kinases (CIPKs), and similar proteins. They are serine/threonine protein kinases (STKs) that catalyze the transfer of the gamma-phosphoryl group from ATP to S/T residues on protein substrates. AMPKs act as sensors for the energy status of the cell and are activated by cellular stresses that lead to ATP depletion such as hypoxia, heat shock, and glucose deprivation, among others. 
MARKs phosphorylate the tau protein and related microtubule-associated proteins (MAPs) on tubulin binding sites to induce detachment from microtubules, and are involved in the regulation of cell shape and polarity, cell cycle control, transport, and the cytoskeleton. Kcc4p and related proteins are septin-associated proteins that are involved in septin organization and in the yeast morphogenesis checkpoint coordinating the cell cycle with bud formation. CIPKs interact with the calcineurin B-like (CBL) calcium sensors to form a signaling network that decodes specific calcium signals triggered by a variety of environmental stimuli including salinity, drought, cold, light, and mechanical perturbation, among others. All members of this family contain an N-terminal catalytic kinase domain and a C-terminal regulatory domain which is also called kinase associated domain 1 (KA1) in some cases. The C-terminal regulatory domain serves as a protein interaction domain in AMPKs and CIPKs. In MARKs and Kcc4p-like proteins, this domain binds phospholipids and may be involved in membrane localization." Q#3598 - CGI_10006989 superfamily 209898 90 112 0.00330323 34.6866 cl14787 MORN superfamily - - MORN repeat; The MORN (Membrane Occupation and Recognition Nexus) repeat is found in multiple copies in several proteins including junctophilins (See Takeshima et al. Mol. Cell 2000;6:11-22). A MORN-repeat protein has been identified in the parasite Toxoplasma gondii as a dynamic component of the cell division apparatus. It has been hypothesised to function as a linker protein between certain membrane regions and the parasite's cytoskeleton. Q#3598 - CGI_10006989 superfamily 209898 42 63 0.003796 34.6235 cl14787 MORN superfamily - - MORN repeat; The MORN (Membrane Occupation and Recognition Nexus) repeat is found in multiple copies in several proteins including junctophilins (See Takeshima et al. Mol. Cell 2000;6:11-22). 
A MORN-repeat protein has been identified in the parasite Toxoplasma gondii as a dynamic component of the cell division apparatus. It has been hypothesised to function as a linker protein between certain membrane regions and the parasite's cytoskeleton. Q#3598 - CGI_10006989 superfamily 209898 65 86 0.0088717 33.4679 cl14787 MORN superfamily - - MORN repeat; The MORN (Membrane Occupation and Recognition Nexus) repeat is found in multiple copies in several proteins including junctophilins (See Takeshima et al. Mol. Cell 2000;6:11-22). A MORN-repeat protein has been identified in the parasite Toxoplasma gondii as a dynamic component of the cell division apparatus. It has been hypothesised to function as a linker protein between certain membrane regions and the parasite's cytoskeleton. Q#3604 - CGI_10000715 superfamily 241779 8 185 6.74E-50 163.893 cl00318 YjeF_N superfamily N - "YjeF-related protein N-terminus; YjeF-N domain is a novel version of the Rossmann fold with a set of catalytic residues and structural features that are different from the conventional dehydrogenases. YjeF-N domain is fused to Ribokinases in bacteria (YjeF), where they may be phosphatases, and to divergent Sm and the FDF domain in eukaryotes (Dcp3p and FLJ21128), where they may be involved in decapping and catalyze hydrolytic RNA-processing reactions." Q#3605 - CGI_10001102 superfamily 241563 75 110 3.24E-05 41.696 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction."
Q#3607 - CGI_10011661 superfamily 248192 46 426 6.78E-89 277.613 cl17638 PLN02808 superfamily - - alpha-galactosidase Q#3609 - CGI_10011663 superfamily 217645 75 338 8.69E-79 248.028 cl04187 DUF303 superfamily - - Domain of unknown function (DUF303); Distribution of this domain seems limited to prokaryotes and viruses. Q#3610 - CGI_10011664 superfamily 241600 105 260 1.12E-47 159.328 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#3610 - CGI_10011664 superfamily 241600 26 98 1.80E-15 71.5027 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." 
Q#3611 - CGI_10011665 superfamily 241600 16 141 1.23E-41 139.683 cl00085 FReD superfamily C - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#3612 - CGI_10011666 superfamily 217007 89 396 2.94E-102 319.931 cl11995 Syja_N superfamily - - SacI homology domain; This Pfam family represents a protein domain which shows homology to the yeast protein SacI. The SacI homology domain is most notably found at the amino terminal of the inositol 5'-phosphatase synaptojanin. Q#3613 - CGI_10011667 superfamily 202715 175 263 2.62E-25 96.8784 cl04194 Tctex-1 superfamily - - Tctex-1 family; Tctex-1 is a dynein light chain. It has been shown that Tctex-1 can bind to the cytoplasmic tail of rhodopsin. C-terminal rhodopsin mutations responsible for retinitis pigmentosa inhibit this interaction. Q#3614 - CGI_10011668 superfamily 215648 189 424 1.00E-66 223.241 cl02802 7tm_3 superfamily - - "7 transmembrane sweet-taste receptor of 3 GCPR; This is a domain of seven transmembrane regions that forms the C-terminus of some subclass 3 G-coupled-protein receptors. It is often associated with a downstream cysteine-rich linker domain, NCD3G pfam07562, which is the human sweet-taste receptor, and the N-terminal domain, ANF_receptor pfam01094. 
The seven TM regions assemble in such a way as to produce a docking pocket into which such molecules as cyclamate and lactisole have been found to bind and consequently confer the taste of sweetness." Q#3614 - CGI_10011668 superfamily 245225 1 93 2.70E-26 111.951 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily N - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein couples receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine- binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. 
Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#3614 - CGI_10011668 superfamily 219467 109 158 7.93E-14 67.7435 cl08456 NCD3G superfamily - - "Nine Cysteines Domain of family 3 GPCR; This conserved sequence contains several highly-conserved Cys residues that are predicted to form disulphide bridges. It is predicted to lie outside the cell membrane, tethered to the pfam00003 in several receptor proteins." Q#3615 - CGI_10011669 superfamily 243035 30 147 2.31E-29 105.78 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. 
Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#3616 - CGI_10011670 superfamily 243035 29 90 6.13E-13 61.0965 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. 
Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. 
Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#3616 - CGI_10011670 superfamily 243035 105 130 0.0030449 34.5878 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. 
Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#3617 - CGI_10011671 superfamily 243035 23 82 6.47E-13 60.3062 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#3618 - CGI_10011672 superfamily 243035 62 129 8.13E-18 74.1933 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#3619 - CGI_10011673 superfamily 243035 130 256 2.97E-26 100.002 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. 
CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#3619 - CGI_10011673 superfamily 243035 24 110 1.32E-15 70.3413 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. 
Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. 
A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#3620 - CGI_10011674 superfamily 215731 52 332 4.53E-35 129.64 cl08245 Gln-synt_C superfamily - - "Glutamine synthetase, catalytic domain; Glutamine synthetase, catalytic domain. " Q#3621 - CGI_10000223 superfamily 246597 2 88 1.51E-59 185.892 cl13995 MPP_superfamily superfamily N - "metallophosphatase superfamily, metallophosphatase domain; Metallophosphatases (MPPs), also known as metallophosphoesterases, phosphodiesterases (PDEs), binuclear metallophosphoesterases, and dimetal-containing phosphoesterases (DMPs), represent a diverse superfamily of enzymes with a conserved domain containing an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. This superfamily includes: the phosphoprotein phosphatases (PPPs), Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination." Q#3622 - CGI_10009100 superfamily 241578 1454 1617 3.66E-11 63.5206 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. 
The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and immune defenses. In integrins these domains form heterodimers, while in vWF they form multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#3622 - CGI_10009100 superfamily 241578 263 461 4.70E-06 47.7274 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and immune defenses. In integrins these domains form heterodimers, while in vWF they form multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains."
Q#3623 - CGI_10009101 superfamily 241578 265 463 1.19E-06 48.4978 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and immune defenses. In integrins these domains form heterodimers, while in vWF they form multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#3624 - CGI_10009102 superfamily 241888 161 324 1.15E-59 192.756 cl00473 BI-1-like superfamily - - "BAX inhibitor (BI)-1/YccA-like protein family; Mammalian members of the BAX inhibitor (BI)-1 like family of small transmembrane proteins have been shown to have an antiapoptotic effect either by stimulating the antiapoptotic function of Bcl-2, a well-characterized oncogene, or by inhibiting the proapoptotic effect of Bax, another member of the Bcl-2 family. Their broad tissue distribution and high degree of conservation suggest an important regulatory role. This superfamily also contains the lifeguard (LFG)-like proteins and other subfamilies which appear to be related by common descent and also function as inhibitors of apoptosis. In plants, BI-1 like proteins play a role in pathogen resistance.
A prokaryotic member, Escherichia coli YccA, has been shown to interact with ATP-dependent protease FtsH, which degrades abnormal membrane proteins as part of a quality control mechanism to keep the integrity of biological membranes." Q#3625 - CGI_10009103 superfamily 144608 308 395 0.000274082 41.3465 cl18013 Mg_chelatase superfamily N - "Magnesium chelatase, subunit ChlI; Magnesium-chelatase is a three-component enzyme that catalyzes the insertion of Mg2+ into protoporphyrin IX. This is the first unique step in the synthesis of (bacterio)chlorophyll. Due to this, it is thought that Mg-chelatase has an important role in channelling intermediates into the (bacterio)chlorophyll branch in response to conditions suitable for photosynthetic growth. ChlI and BchD have molecular weight between 38-42 kDa." Q#3626 - CGI_10009104 superfamily 202500 162 192 0.00396033 37.1121 cl03819 HEAT superfamily - - HEAT repeat; The HEAT repeat family is related to armadillo/beta-catenin-like repeats (see pfam00514). Q#3627 - CGI_10009105 superfamily 214560 170 347 5.46E-22 94.3464 cl18311 TSPN superfamily - - Thrombospondin N-terminal -like domains; Heparin-binding and cell adhesion domain of thrombospondin Q#3627 - CGI_10009105 superfamily 246918 119 165 6.16E-10 56.0559 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#3627 - CGI_10009105 superfamily 246918 419 468 1.08E-07 49.5075 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#3630 - CGI_10006280 superfamily 243077 66 102 1.97E-15 69.4965 cl02542 DnaJ superfamily N - "DnaJ domain or J-domain. DnaJ/Hsp40 (heat shock protein 40) proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperonine family. 
Hsp40 proteins are characterized by the presence of a J domain, which mediates the interaction with Hsp70. They may contain other domains as well, and the architectures provide a means of classification." Q#3633 - CGI_10006283 superfamily 219184 143 320 6.29E-47 158.999 cl12337 Clp1 superfamily - - "Pre-mRNA cleavage complex II protein Clp1; This family consists of several pre-mRNA cleavage complex II Clp1 (or HeaB) proteins. Six different protein factors are required in vitro for 3' end formation of mammalian pre-mRNAs by endonucleolytic cleavage and polyadenylation. Clp1 is a subunit of cleavage complex IIA, which is required for cleavage, but not for polyadenylation of pre-mRNA." Q#3633 - CGI_10006283 superfamily 247757 26 141 5.19E-30 111.718 cl17203 Fer4_NifH superfamily - - "The Fer4_NifH superfamily contains a variety of proteins which share a common ATP-binding domain. Functionally, proteins in this superfamily use the energy from hydrolysis of NTP to transfer electron or ion." Q#3634 - CGI_10012165 superfamily 177822 11 189 8.64E-19 81.5049 cl18088 PLN02164 superfamily N - sulfotransferase Q#3637 - CGI_10012168 superfamily 241702 644 878 2.75E-104 326.845 cl00224 PLPDE_IV superfamily - - "PyridoxaL 5'-Phosphate Dependent Enzymes class IV (PLPDE_IV). This D-amino acid superfamily, one of five classes of PLPDE, consists of branched-chain amino acid aminotransferases (BCAT), D-amino acid transferases (DAAT), and 4-amino-4-deoxychorismate lyases (ADCL). BCAT catalyzes the reversible transamination reaction between the L-branched-chain amino and alpha-keto acids. DAAT catalyzes the synthesis of D-glutamic acid and D-alanine, and ADCL converts 4-amino-4-deoxychorismate to p-aminobenzoate and pyruvate. Except for a few enzymes, i. e., Escherichia coli and Salmonella BCATs, which are homohexamers arranged as a double trimer, the class IV PLPDEs are homodimers. Homodimer formation is required for catalytic activity." 
Q#3637 - CGI_10012168 superfamily 241597 381 452 1.18E-37 136.274 cl00082 HMG-box superfamily - - "High Mobility Group (HMG)-box is found in a variety of eukaryotic chromosomal proteins and transcription factors. HMGs bind to the minor groove of DNA and have been classified by DNA binding preferences. Two phylogenically distinct groups of Class I proteins bind DNA in a sequence specific fashion and contain a single HMG box. One group (SOX-TCF) includes transcription factors, TCF-1, -3, -4; and also SRY and LEF-1, which bind four-way DNA junctions and duplex DNA targets. The second group (MATA) includes fungal mating type gene products MC, MATA1 and Ste11. Class II and III proteins (HMGB-UBF) bind DNA in a non-sequence specific fashion and contain two or more tandem HMG boxes. Class II members include non-histone chromosomal proteins, HMG1 and HMG2, which bind to bent or distorted DNA such as four-way DNA junctions, synthetic DNA cruciforms, kinked cisplatin-modified DNA, DNA bulges, cross-overs in supercoiled DNA, and can cause looping of linear DNA. Class III members include nucleolar and mitochondrial transcription factors, UBF and mtTF1, which bind four-way DNA junctions." Q#3637 - CGI_10012168 superfamily 241721 322 382 0.00289723 39.1362 cl00246 MTHFR superfamily N - "Methylenetetrahydrofolate reductase (MTHFR). 5,10-Methylenetetrahydrofolate is reduced to 5-methyltetrahydrofolate by methylenetetrahydrofolate reductase, a cytoplasmic, NAD(P)-dependent enzyme. 5-methyltetrahydrofolate is utilized by methionine synthase to convert homocysteine to methionine. The enzymatic mechanism is a ping-pong bi-bi mechanism, in which NAD(P)+ release precedes the binding of methylenetetrahydrofolate and the acceptor is free FAD. The family includes the 5,10-methylenetetrahydrofolate reductase EC:1.7.99.5 from prokaryotes and methylenetetrahydrofolate reductase EC: 1.5.1.20 from eukaryotes. 
The bacterial enzyme is a homotetramer and NADH is the preferred reductant while the eukaryotic enzyme is a homodimer and NADPH is the preferred reductant. In humans, there are several clinically significant mutations in MTHFR that result in hyperhomocysteinemia, which is a risk factor for the development of cardiovascular disease." Q#3638 - CGI_10012169 superfamily 241872 13 368 1.58E-39 145.691 cl00453 CDP-OH_P_transf superfamily - - CDP-alcohol phosphatidyltransferase; All of these members have the ability to catalyze the displacement of CMP from a CDP-alcohol by a second alcohol with formation of a phosphodiester bond and concomitant breaking of a phosphoric anhydride bond. Q#3640 - CGI_10012171 superfamily 247725 19 123 4.26E-67 210.207 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting proteins to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and other proteins." Q#3640 - CGI_10012171 superfamily 222294 275 370 3.77E-09 53.3831 cl16338 UCH37_bd superfamily - - "Ubiquitin C-terminal hydrolase 37 receptor binding site; This domain appears frequently at the C-terminus of Proteasom_Rpn13, pfam04683. It is a proteasome subunit that binds the de-ubiquitinating enzyme, UCH37." Q#3641 - CGI_10012172 superfamily 245213 38 68 0.00131118 34.5346 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#3641 - CGI_10012172 superfamily 245213 74 103 0.00162719 34.5346 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#3641 - CGI_10012172 superfamily 245213 2 31 0.00738061 32.6086 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#3642 - CGI_10012173 superfamily 241596 195 231 0.0038064 34.4971 cl00081 HLH superfamily N - "Helix-loop-helix domain, found in specific DNA- binding proteins that act as transcription factors; 60-100 amino acids long. 
A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE (5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) which function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#3643 - CGI_10012174 superfamily 243035 98 205 2.04E-10 58.0149 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#3648 - CGI_10004250 superfamily 148650 29 112 5.46E-10 56.5547 cl06269 Pex26 superfamily C - Pex26 protein; This family consists of Pex26 and related mammalian proteins. Pex26 is a type II peroxisomal membrane protein which recruits Pex6-Pex1 complexes to peroxisomes. Mutations in Pex26 can lead to human disorders. Q#3649 - CGI_10004251 superfamily 241758 62 111 1.93E-17 72.7878 cl00292 AANH_like superfamily N - "Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases, ATP sulphurylases, Universal Stress Response protein and electron transfer flavoprotein (ETF). The domain forms an alpha/beta/alpha fold which binds to adenosine nucleotide." Q#3650 - CGI_10004252 superfamily 243072 39 155 8.14E-30 109.01 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#3653 - CGI_10007661 superfamily 243125 5 41 0.000723419 36.6134 cl02649 LEM superfamily - - "LEM (Lap2/Emerin/Man1) domain found in emerin, lamina-associated polypeptide 2 (LAP2), inner nuclear membrane protein Man1 and similar proteins; The family corresponds to a group of inner nuclear membrane proteins containing LEM domain. Emerin occurs in four phosphorylated forms and plays a role in cell cycle-dependent events. It is absent from the inner nuclear membrane in most patients with X-linked muscular dystrophy. Emerin interacts with A-type and B-type lamins. Man1, also termed LEM domain-containing protein 3 (LEMD3) is an integral protein of the inner nuclear membrane that binds to nuclear lamins and emerin, thus playing a role in nuclear organization. 
LAP2, also termed thymopoietin (TP), or thymopoietin-related peptide (TPRP), is composed of isoform alpha and isoforms beta/gamma and may be involved in chromatin organization and post-mitotic reassembly. Some LAP2 isoforms are inner nuclear membrane proteins that can bind to nuclear lamins and chromatin, while others are non-membrane nuclear polypeptides. This family also contains LEM domain-containing protein LEMP-1 and LEM2. LEMP-1, also termed cancer/testis antigen 50 (CT50), is encoded by LEMD1, a novel testis-specific gene expressed in colorectal cancers. LEMP-1 may function as a cancer-testis antigen for immunotherapy of colorectal carcinoma (CRC). LEM2, also termed LEMD2, is a novel Man1-related ubiquitously expressed inner nuclear membrane protein required for normal nuclear envelope morphology. Association with lamin A is required for its proper nuclear envelope localization while its binding to lamin C plays an important role in the organization of lamin A/C complexes. Some uncharacterized LEM domain-containing proteins are also included in this family. Unlike other family members, these harbor an ankyrin repeat region that may mediate protein-protein interactions." Q#3654 - CGI_10007662 superfamily 245201 32 274 4.98E-54 177.429 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." 
Q#3656 - CGI_10007664 superfamily 241675 62 295 1.24E-92 281.442 cl00195 SIR2 superfamily - - "SIR2 superfamily of proteins includes silent information regulator 2 (Sir2) enzymes which catalyze NAD+-dependent protein/histone deacetylation, where the acetyl group from the lysine epsilon-amino group is transferred to the ADP-ribose moiety of NAD+, producing nicotinamide and the novel metabolite O-acetyl-ADP-ribose. Sir2 proteins, also known as sirtuins, are found in all eukaryotes and many archaea and prokaryotes and have been shown to regulate gene silencing, DNA repair, metabolic enzymes, and life span. The most-studied function, gene silencing, involves the inactivation of chromosome domains containing key regulatory genes by packaging them into a specialized chromatin structure that is inaccessible to DNA-binding proteins. The oligomerization state of Sir2 appears to be organism-dependent, sometimes occurring as a monomer and sometimes as a multimer. Also included in this superfamily is a group of uncharacterized Sir2-like proteins which lack certain key catalytic residues and conserved zinc binding cysteines." Q#3657 - CGI_10007665 superfamily 222150 719 744 0.00589298 35.8305 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#3657 - CGI_10007665 superfamily 222150 747 772 0.0075341 35.4453 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#3657 - CGI_10007665 superfamily 222150 578 603 0.00862079 35.4453 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#3657 - CGI_10007665 superfamily 205629 617 654 0.00994712 35.2574 cl16278 zf-trcl superfamily C - Probable zinc-binding domain; This is a probable zinc-binding domain with two CxxC sequence motifs found in various families of bacteria. 
Q#3660 - CGI_10007668 superfamily 219542 38 148 5.89E-39 139.299 cl18517 Cu-oxidase_3 superfamily - - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. Q#3660 - CGI_10007668 superfamily 219541 469 608 9.39E-27 106.013 cl18516 Cu-oxidase_2 superfamily N - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. Q#3660 - CGI_10007668 superfamily 215896 159 322 1.17E-11 62.6976 cl18351 Cu-oxidase superfamily - - Multicopper oxidase; Many of the proteins in this family contain multiple similar copies of this plastocyanin-like domain. Q#3662 - CGI_10014323 superfamily 241882 542 761 0.00314669 39.1569 cl00465 yhhT superfamily C - "Predicted permease, member of the PurR regulon [General function prediction only]" Q#3671 - CGI_10014332 superfamily 241610 151 199 1.03E-14 70.3566 cl00101 KU superfamily - - BPTI/Kunitz family of serine protease inhibitors; Structure is a disulfide rich alpha+beta fold. BPTI (bovine pancreatic trypsin inhibitor) is an extensively studied model structure. Q#3671 - CGI_10014332 superfamily 241610 300 347 7.55E-12 62.2674 cl00101 KU superfamily - - BPTI/Kunitz family of serine protease inhibitors; Structure is a disulfide rich alpha+beta fold. BPTI (bovine pancreatic trypsin inhibitor) is an extensively studied model structure. Q#3673 - CGI_10014334 superfamily 218570 128 159 0.00754401 33.9153 cl05109 Pacifastin_I superfamily - - "Pacifastin inhibitor (LCMII); Structures of members of this family show that they are comprised of a triple-stranded antiparallel beta-sheet connected by three disulfide bridges, which defines this as a novel family of serine protease inhibitors." Q#3676 - CGI_10022206 superfamily 246925 147 334 2.76E-07 51.9726 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. 
LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#3676 - CGI_10022206 superfamily 246925 302 498 7.23E-06 47.7354 cl15309 LRR_RI superfamily - - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#3677 - CGI_10022207 superfamily 248012 275 409 1.01E-13 68.1188 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#3678 - CGI_10022208 superfamily 246925 366 612 3.90E-06 48.1206 cl15309 LRR_RI superfamily - - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." 
Q#3678 - CGI_10022208 superfamily 246925 154 338 0.0082507 37.335 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#3679 - CGI_10022209 superfamily 242899 18 167 3.07E-63 194.316 cl02135 TRAPP superfamily - - "Transport protein particle (TRAPP) component; TRAPP plays a key role in the targeting and/or fusion of ER-to-Golgi transport vesicles with their acceptor compartment. TRAPP is a large multimeric protein that contains at least 10 subunits. This family contains many TRAPP family proteins. The Bet3 subunit is one of the better characterized TRAPP proteins and has a dimeric structure with hydrophobic channels. The channel entrances are located on a putative membrane-interacting surface that is distinctively flat, wide and decorated with positively charged residues. Bet3 is proposed to localise TRAPP to the Golgi." Q#3681 - CGI_10022211 superfamily 219635 147 356 1.74E-61 198.639 cl06790 Peptidase_C78 superfamily - - Peptidase family C78; This family formerly known as DUF1671 has been shown to be a cysteine peptidase called (Ufm1)-specific protease. Q#3682 - CGI_10022212 superfamily 241600 2 35 0.000999621 37.2199 cl00085 FReD superfamily NC - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. 
The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#3683 - CGI_10022213 superfamily 243360 6 302 6.74E-119 347.751 cl03253 SAM_decarbox superfamily - - "Adenosylmethionine decarboxylase; This is a family of S-adenosylmethionine decarboxylase (SAMDC) proenzymes. In the biosynthesis of polyamines SAMDC produces decarboxylated S-adenosylmethionine, which serves as the aminopropyl moiety necessary for spermidine and spermine biosynthesis from putrescine. The Pfam alignment contains both the alpha and beta chains that are cleaved to form the active enzyme." Q#3684 - CGI_10022214 superfamily 241647 99 126 3.77E-10 55.997 cl00157 WW superfamily - - Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs. 
Q#3684 - CGI_10022214 superfamily 241647 137 167 1.61E-07 48.6782 cl00157 WW superfamily - - Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs. Q#3684 - CGI_10022214 superfamily 207669 272 321 2.14E-13 65.9286 cl02610 FF superfamily - - "FF domain; This domain has been predicted to be involved in protein-protein interaction. This domain was recently shown to bind the hyperphosphorylated C-terminal repeat domain of RNA polymerase II, confirming its role in protein-protein interactions." Q#3684 - CGI_10022214 superfamily 207669 405 463 0.000108026 40.6349 cl02610 FF superfamily - - "FF domain; This domain has been predicted to be involved in protein-protein interaction. This domain was recently shown to bind the hyperphosphorylated C-terminal repeat domain of RNA polymerase II, confirming its role in protein-protein interactions." Q#3684 - CGI_10022214 superfamily 207669 491 541 0.00323039 36.2682 cl02610 FF superfamily - - "FF domain; This domain has been predicted to be involved in protein-protein interaction. This domain was recently shown to bind the hyperphosphorylated C-terminal repeat domain of RNA polymerase II, confirming its role in protein-protein interactions." Q#3688 - CGI_10022218 superfamily 243061 1 42 3.20E-19 80.4638 cl02509 SRCR superfamily C - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. 
These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#3690 - CGI_10022220 superfamily 247684 9 327 6.74E-74 241.027 cl17037 NBD_sugar-kinase_HSP70_actin superfamily N - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#3692 - CGI_10022222 superfamily 247941 480 656 0.000108544 42.1145 cl17387 Methyltransf_21 superfamily - - "Methyltransferase FkbM domain; This family has members from bacteria to human, and appears to be a methyltransferase." Q#3693 - CGI_10022223 superfamily 243193 54 86 0.000557954 34.6766 cl02797 Cyt_c_Oxidase_VIIc superfamily N - "Cytochrome c oxidase subunit VIIc. 
Cytochrome c oxidase (CcO), the terminal oxidase in the respiratory chains of eukaryotes and most bacteria, is a multi-chain transmembrane protein located in the inner membrane of mitochondria and the cell membrane of prokaryotes. It catalyzes the reduction of O2 and simultaneously pumps protons across the membrane. The number of subunits varies from three to five in bacteria and up to 13 in mammalian mitochondria. Subunits I, II, and III of mammalian CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. The VIIc subunit is found only in eukaryotes and its specific function remains unclear. Peroxide inactivation of bovine CcO coincides with the direct oxidation of tryptophan (W19) within subunit VIIc, along with other structural changes in other subunits." Q#3695 - CGI_10022225 superfamily 243061 209 306 2.12E-37 130.54 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#3697 - CGI_10022227 superfamily 245596 78 154 0.000552764 38.9825 cl11394 Glyco_tranf_GTA_type superfamily C - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. 
This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#3698 - CGI_10022228 superfamily 241584 140 223 1.28E-09 53.2691 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#3699 - CGI_10022229 superfamily 241584 635 708 2.15E-10 59.0471 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#3699 - CGI_10022229 superfamily 241584 245 335 3.17E-09 55.5803 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. 
Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#3699 - CGI_10022229 superfamily 241584 748 818 1.12E-08 54.0395 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#3699 - CGI_10022229 superfamily 241584 537 624 5.54E-05 42.4835 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#3699 - CGI_10022229 superfamily 241584 438 519 0.00414924 36.7055 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. 
Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#3699 - CGI_10022229 superfamily 245814 158 227 2.39E-05 43.6593 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#3700 - CGI_10022230 superfamily 247727 110 213 3.22E-08 50.5063 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#3701 - CGI_10004009 superfamily 241574 294 427 3.62E-12 65.2997 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. 
The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#3705 - CGI_10001832 superfamily 248312 79 117 0.00314614 35.4069 cl17758 PMP22_Claudin superfamily N - PMP-22/EMP/MP20/Claudin family; PMP-22/EMP/MP20/Claudin family. Q#3707 - CGI_10004425 superfamily 241600 1 176 4.74E-84 249.465 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." 
Q#3713 - CGI_10007306 superfamily 227554 52 481 4.16E-25 109.522 cl18816 LOC7 superfamily C - "Chromosome condensation complex Condensin, subunit H [Chromatin structure and dynamics / Cell division and chromosome partitioning]" Q#3713 - CGI_10007306 superfamily 227554 665 737 3.79E-05 45.5788 cl18816 LOC7 superfamily N - "Chromosome condensation complex Condensin, subunit H [Chromatin structure and dynamics / Cell division and chromosome partitioning]" Q#3714 - CGI_10007307 superfamily 245818 101 479 1.74E-75 241.716 cl11966 Rel-Spo_like superfamily - - "RelA- and SpoT-like ppGpp Synthetases and Hydrolases, catalytic domain; The Rel-Spo superfamily includes the catalytic domains of Escherichia coli ppGpp synthetase (RelA), ppGpp synthetase/hydrolase (SpoT), and related proteins. RelA synthesizes (p)ppGpp in response to amino-acid starvation and in association with ribosomes. (p)ppGpp triggers the bacterial stringent response. SpoT catalyzes (p)ppGpp synthesis under carbon limitation in a ribosome-independent manner. It also catalyzes (p)ppGpp degradation. Gram-negative bacteria have two enzymes involved in (p)ppGpp metabolism while most Gram-positive organisms have a single Rel-Spo enzyme (Rel), which both synthesizes and degrades (p)ppGpp. The Arabidopsis thaliana Rel-Spo proteins, At-RSH1,-2, and-3 appear to regulate a rapid (p)ppGpp-mediated response to pathogens and other stresses. This catalytic domain is found in association with an N-terminal HD domain and a C-terminal metal dependent phosphohydrolase domain (TGS). Some Rel-Spo proteins also have a C-terminal regulatory ACT domain." Q#3714 - CGI_10007307 superfamily 241565 6 82 0.00387126 35.3751 cl00038 BRCT superfamily - - "Breast Cancer Suppressor Protein (BRCA1), carboxy-terminal domain. The BRCT domain is found within many DNA damage repair and cell cycle checkpoint proteins. 
The unique diversity of this domain superfamily allows BRCT modules to interact forming homo/hetero BRCT multimers, BRCT-non-BRCT interactions, and interactions within DNA strand breaks." Q#3715 - CGI_10007308 superfamily 241563 218 254 1.42E-05 42.4664 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#3716 - CGI_10007309 superfamily 247723 33 101 1.84E-17 76.1869 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#3716 - CGI_10007309 superfamily 247723 195 263 2.19E-10 56.1216 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. 
This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#3718 - CGI_10007311 superfamily 241554 154 242 9.44E-19 80.3923 cl00019 Macro superfamily NC - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#3718 - CGI_10007311 superfamily 241554 3 68 1.60E-13 65.3696 cl00019 Macro superfamily N - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. 
Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#3725 - CGI_10008765 superfamily 248264 2 67 1.12E-10 57.6322 cl17710 DDE_4 superfamily N - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#3726 - CGI_10008766 superfamily 246723 12 209 1.74E-33 122.013 cl14813 GluZincin superfamily C - "Peptidase Gluzincin family (thermolysin-like proteinases, TLPs) includes peptidases M1, M2, M3, M4, M13, M32 and M36 (fungalysins); Gluzincin family (thermolysin-like peptidases or TLPs) includes several zinc-dependent metallopeptidases such as the M1, M2, M3, M4, M13, M32, M36 peptidases (MEROPS classification), and contain HEXXH and EXXXD motifs as part of their active site. All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. M1 family includes aminopeptidase N (APN) and leukotriene A4 hydrolase (LTA4H). APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types. 
LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity such that the two activities occupy different, but overlapping sites. The peptidase M3 or neurolysin-like family, includes M3, M2 and M32 metallopeptidases. The M3 peptidases have two subfamilies: M3A, includes thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (3.4.24.16), and the mitochondrial intermediate peptidase; M3B contains oligopeptidase F. M2 peptidase angiotensin converting enzyme (ACE, EC 3.4.15.1) catalyzes the conversion of decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II. ACE is a key part of the renin-angiotensin system that regulates blood pressure, thus ACE inhibitors are important for the treatment of hypertension. M32 family includes two eukaryotic enzymes from protozoa Trypanosoma cruzi, a causative agent of Chagas' disease, and Leishmania major, a parasite that causes leishmaniasis, making them attractive targets for drug development. The M4 family includes secreted protease thermolysin (EC 3.4.24.27), pseudolysin, aureolysin, neutral protease as well as fungalysin and bacillolysin (EC 3.4.24.28) that degrade extracellular proteins and peptides for bacterial nutrition, especially prior to sporulation. Thermolysin is widely used as a nonspecific protease to obtain fragments for peptide sequencing as well as in production of the artificial sweetener aspartame. M13 family includes neprilysin (EC 3.4.24.11) and endothelin-converting enzyme I (ECE-1, EC 3.4.24.71), which fulfill a broad range of physiological roles due to the greater variation in the S2' subsite allowing substrate specificity and are prime therapeutic targets for selective inhibition. Peptidase M36 (fungalysin) family includes endopeptidases from pathogenic fungi. Fungalysin hydrolyzes extracellular matrix proteins such as elastin and keratin. 
Aspergillus fumigatus causes the pulmonary disease aspergillosis by invading the lungs of immuno-compromised animals and secreting fungalysin that possibly breaks down proteinaceous structural barriers." Q#3727 - CGI_10008767 superfamily 246723 250 508 9.80E-59 199.824 cl14813 GluZincin superfamily - - "Peptidase Gluzincin family (thermolysin-like proteinases, TLPs) includes peptidases M1, M2, M3, M4, M13, M32 and M36 (fungalysins); Gluzincin family (thermolysin-like peptidases or TLPs) includes several zinc-dependent metallopeptidases such as the M1, M2, M3, M4, M13, M32, M36 peptidases (MEROPS classification), and contain HEXXH and EXXXD motifs as part of their active site. All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. M1 family includes aminopeptidase N (APN) and leukotriene A4 hydrolase (LTA4H). APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types. LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity such that the two activities occupy different, but overlapping sites. The peptidase M3 or neurolysin-like family, includes M3, M2 and M32 metallopeptidases. The M3 peptidases have two subfamilies: M3A, includes thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (3.4.24.16), and the mitochondrial intermediate peptidase; M3B contains oligopeptidase F. M2 peptidase angiotensin converting enzyme (ACE, EC 3.4.15.1) catalyzes the conversion of decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II. ACE is a key part of the renin-angiotensin system that regulates blood pressure, thus ACE inhibitors are important for the treatment of hypertension. 
M32 family includes two eukaryotic enzymes from protozoa Trypanosoma cruzi, a causative agent of Chagas' disease, and Leishmania major, a parasite that causes leishmaniasis, making them attractive targets for drug development. The M4 family includes secreted protease thermolysin (EC 3.4.24.27), pseudolysin, aureolysin, neutral protease as well as fungalysin and bacillolysin (EC 3.4.24.28) that degrade extracellular proteins and peptides for bacterial nutrition, especially prior to sporulation. Thermolysin is widely used as a nonspecific protease to obtain fragments for peptide sequencing as well as in production of the artificial sweetener aspartame. M13 family includes neprilysin (EC 3.4.24.11) and endothelin-converting enzyme I (ECE-1, EC 3.4.24.71), which fulfill a broad range of physiological roles due to the greater variation in the S2' subsite allowing substrate specificity and are prime therapeutic targets for selective inhibition. Peptidase M36 (fungalysin) family includes endopeptidases from pathogenic fungi. Fungalysin hydrolyzes extracellular matrix proteins such as elastin and keratin. Aspergillus fumigatus causes the pulmonary disease aspergillosis by invading the lungs of immuno-compromised animals and secreting fungalysin that possibly breaks down proteinaceous structural barriers." Q#3733 - CGI_10002567 superfamily 219561 61 284 1.91E-71 226.224 cl06682 FEZ superfamily - - "FEZ-like protein; This is a family of eukaryotic proteins thought to be involved in axonal outgrowth and fasciculation. The N-terminal regions of these sequences are less conserved than the C-terminal regions, and are highly acidic. The C. elegans homolog, UNC-76, may play structural and signalling roles in the control of axonal extension and adhesion (particularly in the presence of adjacent neuronal cells) and these roles have also been postulated for other FEZ family proteins. 
Certain homologs have been definitively found to interact with the N-terminal variable region (V1) of PKC-zeta, and this interaction causes cytoplasmic translocation of the FEZ family protein in mammalian neuronal cells. The C-terminal region probably participates in the association with the regulatory domain of PKC-zeta. The members of this family are predicted to form coiled-coil structures, which may interact with members of the RhoA family of signalling proteins, but are not thought to contain other characteristic protein motifs. Certain members of this family are expressed almost exclusively in the brain, whereas others (such as FEZ2) are expressed in other tissues, and are thought to perform similar but unknown functions in these tissues." Q#3734 - CGI_10002568 superfamily 247723 243 312 1.33E-39 138.614 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." 
Q#3734 - CGI_10002568 superfamily 247723 349 451 1.07E-37 133.647 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#3734 - CGI_10002568 superfamily 247065 25 134 1.54E-05 43.4874 cl15777 GGCT_like superfamily - - "GGCT-like domains, also called AIG2-like family. Gamma-glutamyl cyclotransferase (GGCT) catalyzes the formation of pyroglutamic acid (5-oxoproline) from dipeptides containing gamma-glutamyl, and is a dimeric protein. In Homo sapiens, the protein is encoded by the gene C7orf24, and the enzyme participates in the gamma-glutamyl cycle. Hereditary defects in the gamma-glutamyl cycle have been described for some of the genes involved, but not for C7orf24. The synthesis and metabolism of glutathione (L-gamma-glutamyl-L-cysteinylglycine) ties the gamma-glutamyl cycle to numerous cellular processes; glutathione acts as a ubiquitous reducing agent in reductive mechanisms involved in protein and DNA synthesis, transport processes, enzyme activity, and metabolism. 
AIG2 (avrRpt2-induced gene) is an Arabidopsis protein that exhibits RPS2- and avrRpt2-dependent induction early after infection with Pseudomonas syringae pv maculicola strain ES4326 carrying avrRpt2. avrRpt2 is an avirulence gene that can convert virulent strains of P. syringae to avirulence on Arabidopsis thaliana, soybean, and bean. The family also includes bacterial tellurite-resistance proteins (trgB); tellurium (Te) compounds are used in industrial processes and had been used as antimicrobial agents in the past. Some members have been described as proteins involved in cation transport (chaC)." Q#3736 - CGI_10002570 superfamily 247065 25 135 2.75E-07 46.569 cl15777 GGCT_like superfamily - - "GGCT-like domains, also called AIG2-like family. Gamma-glutamyl cyclotransferase (GGCT) catalyzes the formation of pyroglutamic acid (5-oxoproline) from dipeptides containing gamma-glutamyl, and is a dimeric protein. In Homo sapiens, the protein is encoded by the gene C7orf24, and the enzyme participates in the gamma-glutamyl cycle. Hereditary defects in the gamma-glutamyl cycle have been described for some of the genes involved, but not for C7orf24. The synthesis and metabolism of glutathione (L-gamma-glutamyl-L-cysteinylglycine) ties the gamma-glutamyl cycle to numerous cellular processes; glutathione acts as a ubiquitous reducing agent in reductive mechanisms involved in protein and DNA synthesis, transport processes, enzyme activity, and metabolism. AIG2 (avrRpt2-induced gene) is an Arabidopsis protein that exhibits RPS2- and avrRpt2-dependent induction early after infection with Pseudomonas syringae pv maculicola strain ES4326 carrying avrRpt2. avrRpt2 is an avirulence gene that can convert virulent strains of P. syringae to avirulence on Arabidopsis thaliana, soybean, and bean. The family also includes bacterial tellurite-resistance proteins (trgB); tellurium (Te) compounds are used in industrial processes and had been used as antimicrobial agents in the past. 
Some members have been described as proteins involved in cation transport (chaC)." Q#3737 - CGI_10007682 superfamily 203720 67 109 8.17E-11 56.0174 cl08457 A2M_recep superfamily N - A-macroglobulin receptor; This family includes the receptor domain region of the alpha-2-macroglobulin family. Q#3741 - CGI_10007686 superfamily 243092 35 214 4.33E-09 55.804 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#3747 - CGI_10003818 superfamily 241758 3 96 6.63E-21 81.2622 cl00292 AANH_like superfamily N - "Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases, ATP sulphurylases, Universal Stress Response protein and electron transfer flavoprotein (ETF). The domain forms an alpha/beta/alpha fold which binds to Adenosine nucleotide." 
Q#3749 - CGI_10003820 superfamily 245814 28 93 5.63E-07 42.8615 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#3750 - CGI_10003821 superfamily 245814 147 221 7.55E-05 39.7799 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#3750 - CGI_10003821 superfamily 245814 37 109 0.000401818 37.8539 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. 
A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#3756 - CGI_10000983 superfamily 248097 253 375 7.53E-21 86.5502 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#3756 - CGI_10000983 superfamily 248097 57 156 5.72E-14 67.6754 cl17543 C1q superfamily C - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#3761 - CGI_10001109 superfamily 218284 43 211 8.35E-46 151.638 cl04786 SOUL superfamily - - SOUL heme-binding protein; This family represents a group of putative heme-binding proteins. Our family includes archaeal and bacterial homologues. Q#3762 - CGI_10001175 superfamily 217617 52 285 4.68E-48 162.586 cl15988 Sulfotransfer_2 superfamily - - "Sulfotransferase family; This family includes a variety of sulfotransferase enzymes. Chondroitin 6-sulfotransferase catalyzes the transfer of sulfate to position 6 of the N-acetylgalactosamine residue of chondroitin. This family also includes Heparan sulfate 2-O-sulfotransferase (HS2ST) and Heparan sulfate 6-sulfotransferase (HS6ST). Heparan sulfate (HS) is a co-receptor for a number of growth factors, morphogens, and adhesion proteins. HS biosynthetic modifications may determine the strength and outcome of HS-ligand interactions. Mice that lack HS2ST undergo developmental failure only after midgestation, the most dramatic effect being the complete failure of kidney development. Heparan sulphate 6-O-sulfotransferase (HS6ST) catalyzes the transfer of sulphate from adenosine 3'-phosphate, 5'-phosphosulphate to the 6th position of the N-sulphoglucosamine residue in heparan sulphate." 
Q#3765 - CGI_10006498 superfamily 247725 216 363 8.30E-54 184.851 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting proteins to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and other proteins." Q#3765 - CGI_10006498 superfamily 152255 446 533 8.66E-23 96.7644 cl13287 DUF3338 superfamily N - Domain of unknown function (DUF3338); This family of proteins is functionally uncharacterized. This family is found in eukaryotes. This presumed domain is about 130 amino acids in length. Q#3765 - CGI_10006498 superfamily 215882 111 223 3.08E-21 91.5734 cl09511 FERM_M superfamily - - FERM central domain; This domain is the central structural domain of the FERM domain. Q#3765 - CGI_10006498 superfamily 220215 26 105 5.13E-20 86.8954 cl09630 FERM_N superfamily - - FERM N-terminal domain; This domain is the N-terminal ubiquitin-like structural domain of the FERM domain. Q#3765 - CGI_10006498 superfamily 245835 507 610 0.00389514 38.8862 cl12013 BAR superfamily N - "The Bin/Amphiphysin/Rvs (BAR) domain, a dimerization module that binds membranes and detects membrane curvature; BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions including organelle biogenesis, membrane trafficking or remodeling, and cell division and migration. Mutations in BAR containing proteins have been linked to diseases and their inactivation in cells leads to altered membrane dynamics. A BAR domain with an additional N-terminal amphipathic helix (an N-BAR) can drive membrane curvature. 
These N-BAR domains are found in amphiphysins and endophilins, among others. BAR domains are also frequently found alongside domains that determine lipid specificity, such as the Pleckstrin Homology (PH) and Phox Homology (PX) domains which are present in beta centaurins (ACAPs and ASAPs) and sorting nexins, respectively. A FES-CIP4 Homology (FCH) domain together with a coiled coil region is called the F-BAR domain and is present in Pombe/Cdc15 homology (PCH) family proteins, which include Fes/Fes tyrosine kinases, PACSIN or syndapin, CIP4-like proteins, and srGAPs, among others. The Inverse (I)-BAR or IRSp53/MIM homology Domain (IMD) is found in multi-domain proteins, such as IRSp53 and MIM, that act as scaffolding proteins and transducers of a variety of signaling pathways that link membrane dynamics and the underlying actin cytoskeleton. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The I-BAR domain induces membrane protrusions in the opposite direction compared to classical BAR and F-BAR domains, which produce membrane invaginations. BAR domains that also serve as protein interaction domains include those of arfaptin and OPHN1-like proteins, among others, which bind to Rac and Rho GAP domains, respectively." Q#3768 - CGI_10006501 superfamily 241868 106 212 2.53E-39 136.24 cl00447 Nudix_Hydrolase superfamily - - "Nudix hydrolase is a superfamily of enzymes found in all three kingdoms of life, and it catalyzes the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+ for their activity. Members of this family are recognized by a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which forms a structural motif that functions as a metal binding and catalytic site. 
Substrates of nudix hydrolase include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance and "house-cleaning" enzymes. Substrate specificity is used to define child families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. This superfamily consists of at least nine families: IPP (isopentenyl diphosphate) isomerase, ADP ribose pyrophosphatase, mutT pyrophosphohydrolase, coenzyme-A pyrophosphatase, MTH1-7,8-dihydro-8-oxoguanine-triphosphatase, diadenosine tetraphosphate hydrolase, NADH pyrophosphatase, GDP-mannose hydrolase and the c-terminal portion of the mutY adenine glycosylase." Q#3769 - CGI_10006502 superfamily 222269 171 403 6.07E-46 161.337 cl18657 Cupin_8 superfamily - - Cupin-like domain; This cupin like domain shares similarity to the JmjC domain. Q#3773 - CGI_10006506 superfamily 241832 82 195 3.36E-40 138.123 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. 
The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#3773 - CGI_10006506 superfamily 241645 257 308 3.70E-20 82.7534 cl00155 UBQ superfamily N - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#3774 - CGI_10006507 superfamily 215647 373 616 6.92E-56 191.668 cl18338 7tm_2 superfamily - - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GCPRs).They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. 
Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97 ; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#3774 - CGI_10006507 superfamily 243086 317 356 1.61E-10 57.7702 cl02559 GPS superfamily - - "Latrophilin/CL-1-like GPS domain; Domain present in latrophilin/CL-1, sea urchin REJ and polycystin." Q#3774 - CGI_10006507 superfamily 243029 58 108 0.000712556 38.4857 cl02422 HRM superfamily N - Hormone receptor domain; This extracellular domain contains four conserved cysteines that probably for disulphide bridges. The domain is found in a variety of hormone receptors. It may be a ligand binding domain. Q#3775 - CGI_10007987 superfamily 241600 2 65 1.18E-13 61.4875 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." 
Q#3777 - CGI_10007989 superfamily 245847 218 361 1.75E-17 77.9821 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#3777 - CGI_10007989 superfamily 241619 53 120 0.00748535 34.0949 cl00112 PAN_APPLE superfamily - - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#3778 - CGI_10007990 superfamily 245847 185 319 3.06E-19 82.2193 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#3778 - CGI_10007990 superfamily 241619 84 128 0.00520056 34.4801 cl00112 PAN_APPLE superfamily C - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." 
Q#3779 - CGI_10007991 superfamily 241600 598 818 4.15E-78 252.932 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#3779 - CGI_10007991 superfamily 241600 289 526 4.25E-71 234.057 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#3780 - CGI_10007992 superfamily 241600 346 389 2.14E-12 64.5691 cl00085 FReD superfamily C - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. 
The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#3781 - CGI_10007993 superfamily 241600 859 1078 2.97E-77 253.317 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#3781 - CGI_10007993 superfamily 241600 3 135 3.45E-49 174.736 cl00085 FReD superfamily NC - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." 
Q#3782 - CGI_10007994 superfamily 248097 34 163 1.53E-13 63.4382 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#3783 - CGI_10007995 superfamily 248097 34 163 1.53E-13 63.4382 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#3784 - CGI_10004623 superfamily 147609 47 121 5.06E-19 77.7819 cl05205 p25-alpha superfamily N - "p25-alpha; This family encodes a 25 kDa protein that is phosphorylated by a Ser/Thr-Pro kinase. It has been described as a brain specific protein, but it is found in Tetrahymena thermophila." Q#3785 - CGI_10004624 superfamily 241642 149 190 6.26E-05 40.1714 cl00152 t_SNARE superfamily C - "Soluble NSF (N-ethylmaleimide-sensitive fusion protein)-Attachment protein (SNAP) REceptor domain; these alpha-helical motifs form twisted and parallel heterotetrameric helix bundles; the core complex contains one helix from a protein that is anchored in the vesicle membrane (synaptobrevin), one helix from a protein of the target membrane (syntaxin), and two helices from another protein anchored in the target membrane (SNAP-25); their interaction forms a core which is composed of a polar zero layer, a flanking leucine-zipper layer acts as a water tight shield to isolate ionic interactions in the zero layer from the surrounding solvent" Q#3785 - CGI_10004624 superfamily 216143 95 147 8.06E-08 48.3124 cl02979 SNAP-25 superfamily - - SNAP-25 family; SNAP-25 (synaptosome-associated protein 25 kDa) proteins are components of SNARE complexes. Members of this family contain a cluster of cysteine residues that can be palmitoylated for membrane attachment. Q#3786 - CGI_10004625 superfamily 248097 443 570 1.98E-17 78.8462 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. 
Q#3786 - CGI_10004625 superfamily 248097 101 207 6.86E-13 65.7494 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#3786 - CGI_10004625 superfamily 248097 222 290 0.000115528 40.7114 cl17543 C1q superfamily C - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#3787 - CGI_10004626 superfamily 245213 107 153 0.00016362 38.3866 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#3787 - CGI_10004626 superfamily 243051 157 239 4.96E-05 41.5673 cl02479 MAM superfamily C - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. 
In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#3787 - CGI_10004626 superfamily 245213 36 69 0.00523739 33.7642 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#3788 - CGI_10004627 superfamily 245213 55 90 1.46E-11 54.9502 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#3789 - CGI_10004628 superfamily 241592 49 113 5.32E-21 84.5857 cl00074 H2A superfamily - - "Histone 2A; H2A is a subunit of the nucleosome. The nucleosome is an octamer containing two H2A, H2B, H3, and H4 subunits. The H2A subunit performs essential roles in maintaining structural integrity of the nucleosome, chromatin condensation, and binding of specific chromatin-associated proteins." 
Q#3790 - CGI_10004629 superfamily 246940 82 328 1.68E-11 62.7361 cl15377 Radical_SAM superfamily - - "Radical SAM superfamily. Enzymes of this family generate radicals by combining a 4Fe-4S cluster and S-adenosylmethionine (SAM) in close proximity. They are characterized by a conserved CxxxCxxC motif, which coordinates the conserved iron-sulfur cluster. Mechanistically, they share the transfer of a single electron from the iron-sulfur cluster to SAM, which leads to its reductive cleavage to methionine and a 5'-deoxyadenosyl radical, which, in turn, abstracts a hydrogen from the appropriately positioned carbon atom. Depending on the enzyme, SAM is consumed during this process or it is restored and reused. Radical SAM enzymes catalyze steps in metabolism, DNA repair, the biosynthesis of vitamins and coenzymes, and the biosynthesis of many antibiotics. Examples are biotin synthase (BioB), lipoyl synthase (LipA), pyruvate formate-lyase (PFL), coproporphyrinogen oxidase (HemN), lysine 2,3-aminomutase (LAM), anaerobic ribonucleotide reductase (ARR), and MoaA, an enzyme of the biosynthesis of molybdopterin." Q#3795 - CGI_10010289 superfamily 241750 91 260 1.64E-26 105.733 cl00281 metallo-dependent_hydrolases superfamily N - "Superfamily of metallo-dependent hydrolases (also called amidohydrolase superfamily) is a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. The family includes urease alpha, adenosine deaminase, phosphotriesterase dihydroorotases, allantoinases, hydantoinases, AMP-, adenine and cytosine deaminases, imidazolonepropionase, aryldialkylphosphatase, chlorohydrolases, formylmethanofuran dehydrogenases and others." 
Q#3798 - CGI_10010292 superfamily 245205 129 205 4.97E-07 47.6177 cl09930 RPA_2b-aaRSs_OBF_like superfamily - - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." Q#3798 - CGI_10010292 superfamily 245205 424 500 6.44E-07 47.6177 cl09930 RPA_2b-aaRSs_OBF_like superfamily - - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. 
RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." Q#3800 - CGI_10010294 superfamily 216363 9 82 3.50E-09 49.0058 cl08312 UPF0029 superfamily N - Uncharacterized protein family UPF0029; Uncharacterized protein family UPF0029. Q#3801 - CGI_10001408 superfamily 243051 148 222 7.56E-08 49.6838 cl02479 MAM superfamily C - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. 
In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#3803 - CGI_10001556 superfamily 245847 48 131 9.93E-05 40.1784 cl12042 FA58C superfamily C - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#3805 - CGI_10002246 superfamily 243035 565 691 2.15E-24 99.2313 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. 
Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#3805 - CGI_10002246 superfamily 207794 1 428 0 540.727 cl02948 GH20_hexosaminidase superfamily - - "Beta-N-acetylhexosaminidases of glycosyl hydrolase family 20 (GH20) catalyze the removal of beta-1,4-linked N-acetyl-D-hexosamine residues from the non-reducing ends of N-acetyl-beta-D-hexosaminides including N-acetylglucosides and N-acetylgalactosides. These enzymes are broadly distributed in microorganisms, plants and animals, and play roles in various key physiological and pathological processes. 
These processes include cell structural integrity, energy storage, cellular signaling, fertilization, pathogen defense, viral penetration, the development of carcinomas, inflammatory events and lysosomal storage disorders. The GH20 enzymes include the eukaryotic beta-N-acetylhexosaminidases A and B, the bacterial chitobiases, dispersin B, and lacto-N-biosidase. The GH20 hexosaminidases are thought to act via a catalytic mechanism in which the catalytic nucleophile is not provided by the solvent or the enzyme, but by the substrate itself." Q#3805 - CGI_10002246 superfamily 245008 450 485 7.67E-09 52.9608 cl09101 E_set superfamily C - "Early set domain associated with the catalytic domain of sugar utilizing enzymes at either the N or C terminus; The E or "early" set domains of sugar utilizing enzymes are associated with different types of catalytic domains at either the N-terminal or C-terminal end. These domains may be related to the immunoglobulin and/or fibronectin type III superfamilies. Members of this family include alpha amylase, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitobiase, and chitinase. A subset of these members were recently identified as members of the CBM48 (Carbohydrate Binding Module 48) family. Members of the CBM48 family include pullulanase, maltooligosyl trehalose synthase, starch branching enzyme, glycogen branching enzyme, glycogen debranching enzyme, isoamylase, and the beta subunit of AMP-activated protein kinase." Q#3806 - CGI_10002383 superfamily 220695 36 142 0.00825464 36.4027 cl18571 7TM_GPCR_Srx superfamily C - Serpentine type 7TM GPCR chemoreceptor Srx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srx is part of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. 
Q#3807 - CGI_10001794 superfamily 241563 61 100 8.35E-05 41.3108 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#3813 - CGI_10021997 superfamily 241570 464 573 1.95E-26 105.486 cl00047 CAP_ED superfamily - - "effector domain of the CAP family of transcription factors; members include CAP (or cAMP receptor protein (CRP)), which binds cAMP, FNR (fumarate and nitrate reduction), which uses an iron-sulfur cluster to sense oxygen) and CooA, a heme containing CO sensor. In all cases binding of the effector leads to conformational changes and the ability to activate transcription. Cyclic nucleotide-binding domain similar to CAP are also present in cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) and vertebrate cyclic nucleotide-gated ion-channels. Cyclic nucleotide-monophosphate binding domain; proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues; the best studied is the prokaryotic catabolite gene activator, CAP, where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure; three conserved glycine residues are thought to be essential for maintenance of the structural integrity of the beta-barrel; CooA is a homodimeric transcription factor that belongs to CAP family; cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain; cAPK's are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain; cGPK's are single chain enzymes that include the two copies of the domain in their N-terminal section; also found in vertebrate cyclic nucleotide-gated ion-channels" Q#3813 - CGI_10021997 superfamily 
149466 88 163 4.26E-22 92.0853 cl07147 Ion_trans_N superfamily - - Ion transport protein N-terminal; This metazoan domain is found to the N-terminus of pfam00520 in voltage- and cyclic nucleotide-gated K/Na ion channels. Q#3814 - CGI_10021998 superfamily 241754 10 681 0 777.984 cl00286 Motor_domain superfamily - - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. Q#3814 - CGI_10021998 superfamily 246908 858 954 1.29E-29 114.988 cl15255 SH2 superfamily - - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." Q#3814 - CGI_10021998 superfamily 210118 688 707 0.00306785 36.9304 cl15479 IQ superfamily - - IQ calmodulin-binding motif; Calmodulin-binding motif. Q#3815 - CGI_10021999 superfamily 245814 218 290 4.84E-07 46.7135 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. 
A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#3815 - CGI_10021999 superfamily 245814 6 98 2.88E-12 62.0597 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#3815 - CGI_10021999 superfamily 245814 116 196 1.03E-08 51.7373 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#3816 - CGI_10022000 superfamily 247805 225 364 1.42E-22 96.6375 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#3816 - CGI_10022000 superfamily 241762 60 119 1.64E-20 88.1391 cl00297 R3H superfamily - - "R3H domain. 
The name of the R3H domain comes from the characteristic spacing of the most conserved arginine and histidine residues. R3H domains are found in proteins together with ATPase domains, SF1 helicase domains, SF2 DEAH helicase domains, Cys-rich repeats, ring-type zinc fingers, and KH domains. The function of the domain is predicted to bind ssDNA or ssRNA in a sequence-specific manner." Q#3816 - CGI_10022000 superfamily 247905 610 756 8.78E-09 55.3217 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#3816 - CGI_10022000 superfamily 243072 487 566 1.27E-07 51.6154 cl02529 ANK superfamily C - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#3816 - CGI_10022000 superfamily 217926 1269 1399 2.72E-38 141.929 cl04418 YTH superfamily - - "YT521-B-like domain; A protein of the YTH family has been shown to selectively remove transcripts of meiosis-specific genes expressed in mitotic cells. 
It has been speculated that in higher eukaryotes YTH-family members may be involved in similar mechanisms to suppress gene regulation during gametogenesis or general silencing. The rat protein YT521-B is a tyrosine-phosphorylated nuclear protein, that interacts with the nuclear transcriptosomal component scaffold attachment factor B, and the 68-kDa Src substrate associated during mitosis, Sam68. In vivo splicing assays demonstrated that YT521-B modulates alternative splice site selection in a concentration-dependent manner. The YTH domain has been identified as part of the PUA superfamily." Q#3816 - CGI_10022000 superfamily 243778 813 906 6.29E-27 107.695 cl04503 HA2 superfamily - - "Helicase associated domain (HA2); This presumed domain is about 90 amino acid residues in length. It is found in a diverse set of RNA helicases. Its function is unknown; however, it seems likely to be involved in nucleic acid binding." Q#3816 - CGI_10022000 superfamily 219532 943 1065 1.62E-16 78.125 cl06657 OB_NTP_bind superfamily - - "Oligonucleotide/oligosaccharide-binding (OB)-fold; This family is found towards the C-terminus of the DEAD-box helicases (pfam00270). In these helicases it is apparently always found in association with pfam04408. There do seem to be a couple of instances where it occurs by itself - . The structure PDB:3i4u adopts an OB-fold. This C-terminal domain of the yeast helicase contains an oligonucleotide/oligosaccharide-binding (OB)-fold which seems to be placed at the entrance of the putative nucleic acid cavity. It also constitutes the binding site for the G-patch-containing domain of Pfa1p. When found on DEAH/RHA helicases, this domain is central to the regulation of the helicase activity through its binding of both RNA and G-patch domain proteins." 
Q#3817 - CGI_10022001 superfamily 241868 170 322 7.71E-50 166.116 cl00447 Nudix_Hydrolase superfamily - - "Nudix hydrolase is a superfamily of enzymes found in all three kingdoms of life, and it catalyzes the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+ for their activity. Members of this family are recognized by a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which forms a structural motif that functions as a metal binding and catalytic site. Substrates of nudix hydrolase include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance and "house-cleaning" enzymes. Substrate specificity is used to define child families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. This superfamily consists of at least nine families: IPP (isopentenyl diphosphate) isomerase, ADP ribose pyrophosphatase, mutT pyrophosphohydrolase, coenzyme-A pyrophosphatase, MTH1-7,8-dihydro-8-oxoguanine-triphosphatase, diadenosine tetraphosphate hydrolase, NADH pyrophosphatase, GDP-mannose hydrolase and the c-terminal portion of the mutY adenine glycosylase." 
Q#3817 - CGI_10022001 superfamily 247999 55 92 0.00194454 36.0358 cl17445 PHD superfamily C - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#3818 - CGI_10022002 superfamily 241578 2 164 1.93E-40 148.594 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#3818 - CGI_10022002 superfamily 241578 1256 1417 9.65E-38 140.89 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#3818 - CGI_10022002 superfamily 241581 276 389 1.34E-05 45.0698 cl00062 FHA superfamily - - "Forkhead associated domain (FHA); found in eukaryotic and prokaryotic proteins. Putative nuclear signalling domain. FHA domains may bind phosphothreonine, phosphoserine and sometimes phosphotyrosine. In eukaryotes, many FHA domain-containing proteins localize to the nucleus, where they participate in establishing or maintaining cell cycle checkpoints, DNA repair, or transcriptional regulation. Members of the FHA family include: Dun1, Rad53, Cds1, Mek1, KAPP(kinase-associated protein phosphatase),and Ki-67 (a human nuclear protein related to cell proliferation)." 
Q#3818 - CGI_10022002 superfamily 247792 540 569 3.50E-05 43.2032 cl17238 RING superfamily N - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#3818 - CGI_10022002 superfamily 241578 1102 1218 5.53E-20 89.5946 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." 
Q#3819 - CGI_10022003 superfamily 241578 85 250 1.63E-38 133.951 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#3819 - CGI_10022003 superfamily 241578 6 67 0.00350439 36.594 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. 
Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#3820 - CGI_10022004 superfamily 241578 108 269 1.30E-36 135.112 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#3820 - CGI_10022004 superfamily 241578 315 482 3.30E-36 134.278 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#3820 - CGI_10022004 superfamily 241578 530 694 3.14E-31 120.084 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#3821 - CGI_10022005 superfamily 241578 92 253 3.03E-27 104.296 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. 
The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#3821 - CGI_10022005 superfamily 241578 14 72 2.03E-08 51.4599 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." 
Q#3822 - CGI_10022006 superfamily 241578 128 287 7.20E-35 125.867 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#3823 - CGI_10022007 superfamily 246679 1668 1862 1.27E-91 297.158 cl14632 Glo_EDI_BRP_like superfamily - - "This domain superfamily is found in a variety of structurally related metalloproteins, including the type I extradiol dioxygenases, glyoxalase I and a group of antibiotic resistance proteins; This domain superfamily is found in a variety of structurally related metalloproteins, including the type I extradiol dioxygenases, glyoxalase I and a group of antibiotic resistance proteins. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). Type I extradiol dioxygenases catalyze the incorporation of both atoms of molecular oxygen into aromatic substrates, which results in the cleavage of aromatic rings. 
They are key enzymes in the degradation of aromatic compounds. Type I extradiol dioxygenases include class I and class II enzymes. Class I and II enzymes show sequence similarity; the two-domain class II enzymes evolved from a class I enzyme through gene duplication. Glyoxylase I catalyzes the glutathione-dependent inactivation of toxic methylglyoxal, requiring zinc or nickel ions for activity. The antibiotic resistance proteins in this family use a variety of mechanisms to block the function of antibiotics. Bleomycin resistance protein (BLMA) sequesters bleomycin's activity by directly binding to it. Whereas, three types of fosfomycin resistance proteins employ different mechanisms to render fosfomycin inactive by modifying the fosfomycin molecule. Although the proteins in this superfamily are functionally distinct, their structures are similar. The difference among the three dimensional structures of the three types of proteins in this superfamily is interesting from an evolutionary perspective. Both glyoxalase I and BLMA show domain swapping between subunits. However, there is no domain swapping for type 1 extradiol dioxygenases." Q#3823 - CGI_10022007 superfamily 246679 1508 1652 6.42E-58 198.523 cl14632 Glo_EDI_BRP_like superfamily - - "This domain superfamily is found in a variety of structurally related metalloproteins, including the type I extradiol dioxygenases, glyoxalase I and a group of antibiotic resistance proteins; This domain superfamily is found in a variety of structurally related metalloproteins, including the type I extradiol dioxygenases, glyoxalase I and a group of antibiotic resistance proteins. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). 
Type I extradiol dioxygenases catalyze the incorporation of both atoms of molecular oxygen into aromatic substrates, which results in the cleavage of aromatic rings. They are key enzymes in the degradation of aromatic compounds. Type I extradiol dioxygenases include class I and class II enzymes. Class I and II enzymes show sequence similarity; the two-domain class II enzymes evolved from a class I enzyme through gene duplication. Glyoxylase I catalyzes the glutathione-dependent inactivation of toxic methylglyoxal, requiring zinc or nickel ions for activity. The antibiotic resistance proteins in this family use a variety of mechanisms to block the function of antibiotics. Bleomycin resistance protein (BLMA) sequesters bleomycin's activity by directly binding to it. Whereas, three types of fosfomycin resistance proteins employ different mechanisms to render fosfomycin inactive by modifying the fosfomycin molecule. Although the proteins in this superfamily are functionally distinct, their structures are similar. The difference among the three dimensional structures of the three types of proteins in this superfamily is interesting from an evolutionary perspective. Both glyoxalase I and BLMA show domain swapping between subunits. However, there is no domain swapping for type 1 extradiol dioxygenases." Q#3823 - CGI_10022007 superfamily 241581 34 75 9.19E-06 46.2254 cl00062 FHA superfamily N - "Forkhead associated domain (FHA); found in eukaryotic and prokaryotic proteins. Putative nuclear signalling domain. FHA domains may bind phosphothreonine, phosphoserine and sometimes phosphotyrosine. In eukaryotes, many FHA domain-containing proteins localize to the nucleus, where they participate in establishing or maintaining cell cycle checkpoints, DNA repair, or transcriptional regulation. Members of the FHA family include: Dun1, Rad53, Cds1, Mek1, KAPP(kinase-associated protein phosphatase),and Ki-67 (a human nuclear protein related to cell proliferation)." 
Q#3823 - CGI_10022007 superfamily 242046 1123 1200 2.19E-05 44.5686 cl00718 TOPRIM superfamily - - "Topoisomerase-primase domain. This is a nucleotidyl transferase/hydrolase domain found in type IA, type IIA and type IIB topoisomerases, bacterial DnaG-type primases, small primase-like proteins from bacteria and archaea, OLD family nucleases from bacteria and archaea, and bacterial DNA repair proteins of the RecR/M family. This domain has two conserved motifs, one of which centers at a conserved glutamate and the other one at two conserved aspartates (DxD). This glutamate and the two aspartates cluster together to form a highly acidic surface patch. The conserved glutamate may act as a general base in nucleotide polymerization by primases and in strand joining in topoisomerases, and as a general acid in strand cleavage by topoisomerases and nucleases. The DXD motif may co-ordinate Mg2+, a cofactor required for full catalytic function." Q#3823 - CGI_10022007 superfamily 241565 84 149 0.000618146 39.9975 cl00038 BRCT superfamily - - "Breast Cancer Suppressor Protein (BRCA1), carboxy-terminal domain. The BRCT domain is found within many DNA damage repair and cell cycle checkpoint proteins. The unique diversity of this domain superfamily allows BRCT modules to interact forming homo/hetero BRCT multimers, BRCT-non-BRCT interactions, and interactions within DNA strand breaks." Q#3823 - CGI_10022007 superfamily 247787 1258 1448 3.39E-40 152.166 cl17233 RecA-like_NTPases superfamily - - "RecA-like NTPases. This family includes the NTP binding domain of F1 and V1 H+ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. This group also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion." 
Q#3825 - CGI_10022009 superfamily 245835 15 256 1.79E-41 149.549 cl12013 BAR superfamily - - "The Bin/Amphiphysin/Rvs (BAR) domain, a dimerization module that binds membranes and detects membrane curvature; BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions including organelle biogenesis, membrane trafficking or remodeling, and cell division and migration. Mutations in BAR containing proteins have been linked to diseases and their inactivation in cells leads to altered membrane dynamics. A BAR domain with an additional N-terminal amphipathic helix (an N-BAR) can drive membrane curvature. These N-BAR domains are found in amphiphysins and endophilins, among others. BAR domains are also frequently found alongside domains that determine lipid specificity, such as the Pleckstrin Homology (PH) and Phox Homology (PX) domains which are present in beta centaurins (ACAPs and ASAPs) and sorting nexins, respectively. A FES-CIP4 Homology (FCH) domain together with a coiled coil region is called the F-BAR domain and is present in Pombe/Cdc15 homology (PCH) family proteins, which include Fes/Fes tyrosine kinases, PACSIN or syndapin, CIP4-like proteins, and srGAPs, among others. The Inverse (I)-BAR or IRSp53/MIM homology Domain (IMD) is found in multi-domain proteins, such as IRSp53 and MIM, that act as scaffolding proteins and transducers of a variety of signaling pathways that link membrane dynamics and the underlying actin cytoskeleton. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The I-BAR domain induces membrane protrusions in the opposite direction compared to classical BAR and F-BAR domains, which produce membrane invaginations. 
BAR domains that also serve as protein interaction domains include those of arfaptin and OPHN1-like proteins, among others, which bind to Rac and Rho GAP domains, respectively." Q#3825 - CGI_10022009 superfamily 242067 340 518 1.19E-27 112.446 cl00751 DUF155 superfamily N - "Uncharacterized ACR, YagE family COG1723; Uncharacterized ACR, YagE family COG1723. " Q#3826 - CGI_10022010 superfamily 241739 178 319 3.45E-79 254.422 cl00268 class_II_aaRS-like_core superfamily C - "Class II tRNA amino-acyl synthetase-like catalytic core domain. Class II amino acyl-tRNA synthetases (aaRS) share a common fold and generally attach an amino acid to the 3' OH of ribose of the appropriate tRNA. PheRS is an exception in that it attaches the amino acid at the 2'-OH group, like class I aaRSs. These enzymes are usually homodimers. This domain is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. The substrate specificity of this reaction is further determined by additional domains. Interestingly, this domain is also found in asparagine synthase A (AsnA), in the accessory subunit of mitochondrial polymerase gamma and in the bacterial ATP phosphoribosyltransferase regulatory subunit HisZ." Q#3826 - CGI_10022010 superfamily 245205 45 174 3.17E-46 159.994 cl09930 RPA_2b-aaRSs_OBF_like superfamily - - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. 
This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." Q#3826 - CGI_10022010 superfamily 241739 458 588 1.26E-66 220.91 cl00268 class_II_aaRS-like_core superfamily N - "Class II tRNA amino-acyl synthetase-like catalytic core domain. Class II amino acyl-tRNA synthetases (aaRS) share a common fold and generally attach an amino acid to the 3' OH of ribose of the appropriate tRNA. PheRS is an exception in that it attaches the amino acid at the 2'-OH group, like class I aaRSs. These enzymes are usually homodimers. This domain is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. The substrate specificity of this reaction is further determined by additional domains. Intererestingly, this domain is also found is asparagine synthase A (AsnA), in the accessory subunit of mitochondrial polymerase gamma and in the bacterial ATP phosphoribosyltransferase regulatory subunit HisZ." Q#3826 - CGI_10022010 superfamily 145868 359 426 0.00304191 36.4622 cl08382 GAD superfamily - - GAD domain; This domain is found in some members of the GatB and aspartyl tRNA synthetases. 
Q#3827 - CGI_10022011 superfamily 247792 27 76 2.27E-08 51.6776 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#3827 - CGI_10022011 superfamily 241563 175 205 0.000382216 38.9996 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#3828 - CGI_10022012 superfamily 151744 126 203 6.42E-19 79.6878 cl12843 DUF3105 superfamily N - Protein of unknown function (DUF3105); Some members in this family of proteins are annotated as membrane proteins however this cannot be confirmed. Currently no function is known. Q#3829 - CGI_10022013 superfamily 247775 14 572 3.75E-34 134.776 cl17221 ArsB_NhaD_permease superfamily - - "Anion permease ArsB/NhaD. These permeases have been shown to translocate sodium, arsenate, antimonite, sulfate and organic anions across biological membranes in all three kingdoms of life. A typical anion permease contains 8-13 transmembrane helices and can function either independently as a chemiosmotic transporter or as a channel-forming subunit of an ATP-driven anion pump." Q#3830 - CGI_10022014 superfamily 247775 247 595 4.69E-28 117.442 cl17221 ArsB_NhaD_permease superfamily N - "Anion permease ArsB/NhaD. 
These permeases have been shown to translocate sodium, arsenate, antimonite, sulfate and organic anions across biological membranes in all three kingdoms of life. A typical anion permease contains 8-13 transmembrane helices and can function either independently as a chemiosmotic transporter or as a channel-forming subunit of an ATP-driven anion pump." Q#3830 - CGI_10022014 superfamily 247775 686 838 3.03E-27 114.219 cl17221 ArsB_NhaD_permease superfamily N - "Anion permease ArsB/NhaD. These permeases have been shown to translocate sodium, arsenate, antimonite, sulfate and organic anions across biological membranes in all three kingdoms of life. A typical anion permease contains 8-13 transmembrane helices and can function either independently as a chemiosmotic transporter or as a channel-forming subunit of an ATP-driven anion pump." Q#3830 - CGI_10022014 superfamily 247775 49 162 1.71E-24 105.745 cl17221 ArsB_NhaD_permease superfamily C - "Anion permease ArsB/NhaD. These permeases have been shown to translocate sodium, arsenate, antimonite, sulfate and organic anions across biological membranes in all three kingdoms of life. A typical anion permease contains 8-13 transmembrane helices and can function either independently as a chemiosmotic transporter or as a channel-forming subunit of an ATP-driven anion pump." Q#3831 - CGI_10022015 superfamily 241860 42 286 4.43E-92 277.062 cl00435 ICL_KPHMT superfamily - - "Members of the ICL/PEPM_KPHMT enzyme superfamily catalyze the formation and cleavage of either P-C or C-C bonds. Typical members are phosphoenolpyruvate mutase (PEPM), phosphonopyruvate hydrolase (PPH), carboxyPEP mutase (CPEP mutase), oxaloacetate hydrolase (OAH), isocitrate lyase (ICL), 2-methylisocitrate lyase (MICL), and ketopantoate hydroxymethyltransferase (KPHMT)." Q#3832 - CGI_10022016 superfamily 248469 584 701 0.000237135 40.8163 cl17915 HAD_like superfamily - - "Haloacid dehalogenase-like hydrolases. 
The haloacid dehalogenase-like (HAD) superfamily includes L-2-haloacid dehalogenase, epoxide hydrolase, phosphoserine phosphatase, phosphomannomutase, phosphoglycolate phosphatase, P-type ATPase, and many others, all of which use a nucleophilic aspartate in their phosphoryl transfer reaction. All members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. Members of this superfamily are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases." Q#3832 - CGI_10022016 superfamily 215733 93 323 5.99E-49 173.905 cl02811 E1-E2_ATPase superfamily - - E1-E2 ATPase; E1-E2 ATPase. Q#3832 - CGI_10022016 superfamily 216063 765 968 4.74E-48 169.721 cl02929 Cation_ATPase_C superfamily - - "Cation transporting ATPase, C-terminus; Members of this families are involved in Na+/K+, H+/K+, Ca++ and Mg++ transport. This family represents 5 transmembrane helices." Q#3832 - CGI_10022016 superfamily 222006 400 508 7.75E-27 106.538 cl16182 Hydrolase_like2 superfamily - - Putative hydrolase of sodium-potassium ATPase alpha subunit; This is a putative hydrolase of the sodium-potassium ATPase alpha subunit. Q#3832 - CGI_10022016 superfamily 243244 4 62 1.85E-17 78.7314 cl02930 Cation_ATPase_N superfamily - - "Cation transporter/ATPase, N-terminus; Members of this families are involved in Na+/K+, H+/K+, Ca++ and Mg++ transport." Q#3839 - CGI_10002812 superfamily 241550 91 375 8.80E-102 306.433 cl00015 nt_trans superfamily - - "nucleotidyl transferase superfamily; nt_trans (nucleotidyl transferase) This superfamily includes the class I amino-acyl tRNA synthetases, pantothenate synthetase (PanC), ATP sulfurylase, and the cytidylyltransferases, all of which have a conserved dinucleotide-binding domain." 
Q#3840 - CGI_10002813 superfamily 243072 13 127 7.14E-27 99.7654 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#3841 - CGI_10002814 superfamily 243072 3 124 1.91E-24 92.0614 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#3844 - CGI_10003726 superfamily 241580 81 156 8.50E-42 144.234 cl00061 FH superfamily - - "Forkhead (FH), also known as a "winged helix". FH is named for the Drosophila fork head protein, a transcription factor which promotes terminal rather than segmental development. This family of transcription factor domains, which bind to B-DNA as monomers, are also found in the Hepatocyte nuclear factor (HNF) proteins, which provide tissue-specific gene regulation. The structure contains 2 flexible loops or "wings" in the C-terminal region, hence the term winged helix." Q#3846 - CGI_10010480 superfamily 241763 465 680 8.32E-130 385.999 cl00298 Peptidase_C1 superfamily - - "C1 Peptidase family (MEROPS database nomenclature), also referred to as the papain family; composed of two subfamilies of cysteine peptidases (CPs), C1A (papain) and C1B (bleomycin hydrolase). 
Papain-like enzymes are mostly endopeptidases with some exceptions like cathepsins B, C, H and X, which are exopeptidases. Papain-like CPs have different functions in various organisms. Plant CPs are used to mobilize storage proteins in seeds while mammalian CPs are primarily lysosomal enzymes responsible for protein degradation in the lysosome. Papain-like CPs are synthesized as inactive proenzymes with N-terminal propeptide regions, which are removed upon activation. Bleomycin hydrolase (BH) is a CP that detoxifies bleomycin by hydrolysis of an amide group. It acts as a carboxypeptidase on its C-terminus to convert itself into an aminopeptidase and peptide ligase. BH is found in all tissues in mammals as well as in many other eukaryotes. It forms a hexameric ring barrel structure with the active sites imbedded in the central channel. Some members of the C1 family are proteins classified as non-peptidase homologs which lack peptidase activity or have missing active site residues." Q#3846 - CGI_10010480 superfamily 241613 155 185 0.00166382 36.801 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#3847 - CGI_10010481 superfamily 246925 458 568 4.06E-05 44.6538 cl15309 LRR_RI superfamily NC - "Leucine-rich repeats (LRRs), ribonuclease 
inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#3848 - CGI_10010482 superfamily 198867 102 202 3.34E-30 114.178 cl06652 BACK superfamily - - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation)." Q#3848 - CGI_10010482 superfamily 243066 1 94 3.18E-28 108.859 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." 
Q#3848 - CGI_10010482 superfamily 243146 336 383 1.77E-05 42.8205 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#3849 - CGI_10010483 superfamily 216411 26 146 0.000701782 36.8711 cl15974 MARVEL superfamily - - "Membrane-associating domain; MARVEL domain-containing proteins are often found in lipid-associating proteins - such as Occludin and MAL family proteins. It may be part of the machinery of membrane apposition events, such as transport vesicle biogenesis." Q#3851 - CGI_10010485 superfamily 243066 371 467 1.29E-20 86.976 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#3851 - CGI_10010485 superfamily 201217 93 143 1.10E-11 60.6172 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#3851 - CGI_10010485 superfamily 201217 146 196 3.00E-09 53.2984 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. 
Q#3851 - CGI_10010485 superfamily 201217 251 287 8.31E-08 49.4464 cl08266 RCC1 superfamily C - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#3851 - CGI_10010485 superfamily 205718 235 264 3.38E-07 47.0998 cl16296 RCC1_2 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#3851 - CGI_10010485 superfamily 201217 40 90 4.71E-07 47.1352 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#3851 - CGI_10010485 superfamily 198867 475 521 0.00345656 36.1653 cl06652 BACK superfamily C - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation)." Q#3852 - CGI_10010486 superfamily 247725 85 141 3.57E-25 100.429 cl17171 PH-like superfamily N - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#3852 - CGI_10010486 superfamily 211425 472 559 3.54E-20 86.1927 cl16935 Orc6_mid superfamily - - "Middle domain of the origin recognition complex subunit 6; Orc6 is a subunit of the origin recognition complex in eukaryotes, and it may be involved in binding to DNA. 
This model describes the central or middle domain of Orc6, whose structure resembles that of TFIIB, a DNA-binding transcription factor. Orc6 appears to form distinct complexes with DNA, and a putative DNA-binding site has been identified." Q#3852 - CGI_10010486 superfamily 241622 2 47 4.51E-09 54.1099 cl00117 PDZ superfamily N - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#3852 - CGI_10010486 superfamily 247725 164 275 0.000210844 40.143 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. 
This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#3852 - CGI_10010486 superfamily 218595 422 493 0.0010243 40.3261 cl18465 ORC6 superfamily C - "Origin recognition complex subunit 6 (ORC6); This family consists of several eukaryotic origin recognition complex subunit 6 (ORC6) proteins. Despite differences in their structure and sequences among eukaryotic replicators, ORC is a conserved feature of replication initiation in all eukaryotes. ORC-related genes have been identified in organisms ranging from S. pombe to plants to humans. All DNA replication initiation is driven by a single conserved eukaryotic initiator complex termed the origin recognition complex (ORC). The ORC is a six protein complex. The function of ORC is reviewed in." Q#3853 - CGI_10010487 superfamily 247065 8 112 6.26E-22 85.8594 cl15777 GGCT_like superfamily - - "GGCT-like domains, also called AIG2-like family. Gamma-glutamyl cyclotransferase (GGCT) catalyzes the formation of pyroglutamic acid (5-oxoproline) from dipeptides containing gamma-glutamyl, and is a dimeric protein. In Homo sapiens, the protein is encoded by the gene C7orf24, and the enzyme participates in the gamma-glutamyl cycle. Hereditary defects in the gamma-glutamyl cycle have been described for some of the genes involved, but not for C7orf24. The synthesis and metabolism of glutathione (L-gamma-glutamyl-L-cysteinylglycine) ties the gamma-glutamyl cycle to numerous cellular processes; glutathione acts as a ubiquitous reducing agent in reductive mechanisms involved in protein and DNA synthesis, transport processes, enzyme activity, and metabolism. AIG2 (avrRpt2-induced gene) is an Arabidopsis protein that exhibits RPS2- and avrRpt2-dependent induction early after infection with Pseudomonas syringae pv maculicola strain ES4326 carrying avrRpt2. avrRpt2 is an avirulence gene that can convert virulent strains of P. 
syringae to avirulence on Arabidopsis thaliana, soybean, and bean. The family also includes bacterial tellurite-resistance proteins (trgB); tellurium (Te) compounds are used in industrial processes and had been used as antimicrobial agents in the past. Some members have been described proteins involved in cation transport (chaC)." Q#3854 - CGI_10010488 superfamily 221130 37 153 1.07E-19 81.1679 cl13043 ARL2_Bind_BART superfamily - - The ARF-like 2 binding protein BART; BART binds specifically to ARL2.GTP with a high affinity however it does not bind to ARL2.GDP. It is thought that this specific interaction is due to BART being the first identified ARL2-specific effector. The function is not completely characterized. BART is predominantly cytosolic but can also be found to be associated with mitochondria. BART is also involved in binding to the adenine nucleotide transporter ANT1. Q#3855 - CGI_10010489 superfamily 243092 479 608 7.59E-12 66.5896 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the 
propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#3855 - CGI_10010489 superfamily 243092 42 223 2.55E-11 65.0488 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#3856 - CGI_10010490 superfamily 247724 3 170 1.07E-129 365.207 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. 
This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#3857 - CGI_10010491 superfamily 247724 148 216 1.12E-10 57.5171 cl17170 Ras_like_GTPase superfamily N - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. 
The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#3858 - CGI_10014165 superfamily 248012 51 153 3.14E-24 92.2556 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#3858 - CGI_10014165 superfamily 248012 4 34 0.00148714 35.2461 cl17458 TIR_2 superfamily NC - TIR domain; This is a family of bacterial Toll-like receptors. Q#3859 - CGI_10014166 superfamily 222150 315 339 5.73E-05 40.4529 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#3859 - CGI_10014166 superfamily 222150 398 422 0.000153547 39.2973 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#3859 - CGI_10014166 superfamily 222150 342 367 0.000361149 38.1417 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#3859 - CGI_10014166 superfamily 222150 374 395 0.00138326 36.6009 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#3860 - CGI_10014167 superfamily 241578 187 376 1.65E-24 102.472 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#3863 - CGI_10014170 superfamily 241547 53 360 2.65E-34 128.543 cl00012 alpha_CA superfamily - - "Carbonic anhydrase alpha (vertebrate-like) group. Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionary distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidine residues and a fourth conserved histidine plays a potential role in proton transfer." Q#3863 - CGI_10014170 superfamily 241749 391 531 1.46E-26 105.16 cl00280 globin_like superfamily - - superfamily containing globins and truncated hemoglobins Q#3864 - CGI_10014171 superfamily 245864 48 469 4.07E-114 348.114 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. 
They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#3865 - CGI_10014172 superfamily 245864 91 531 3.88E-109 336.558 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. 
Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#3865 - CGI_10014172 superfamily 245864 18 50 0.000903944 40.3394 cl12078 p450 superfamily NC - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#3866 - CGI_10014173 superfamily 245864 37 501 1.15E-109 336.943 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site.
Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#3867 - CGI_10014174 superfamily 245864 5 434 3.02E-113 343.877 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes.
Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#3868 - CGI_10014175 superfamily 245864 42 478 2.42E-116 353.507 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures."
Q#3869 - CGI_10014176 superfamily 247792 115 150 4.55E-05 40.1216 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-(N/C/H)-X2-C-X(4-48)-C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#3869 - CGI_10014176 superfamily 190233 198 253 0.000524953 37.4338 cl08341 zf-TRAF superfamily - - TRAF-type zinc finger; TRAF-type zinc finger. Q#3870 - CGI_10014177 superfamily 248024 156 323 9.71E-16 75.0121 cl17470 SBF superfamily - - "Sodium Bile acid symporter family; This family consists of Na+/bile acid co-transporters. These transmembrane proteins function in the liver in the uptake of bile acids from portal blood plasma, a process mediated by the co-transport of Na+. Also in the family is ARC3 from S. cerevisiae, a putative transmembrane protein involved in resistance to arsenic compounds." Q#3871 - CGI_10014178 superfamily 241600 22 157 1.90E-53 170.499 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent.
C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#3872 - CGI_10014179 superfamily 241583 215 349 1.35E-14 71.6138 cl00064 ZnMc superfamily C - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#3873 - CGI_10014180 superfamily 217473 572 782 6.96E-15 75.0941 cl03978 Mab-21 superfamily N - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#3873 - CGI_10014180 superfamily 216981 314 431 2.55E-06 47.1422 cl17087 OTU superfamily - - "OTU-like cysteine protease; This family is comprised of a group of predicted cysteine proteases, homologous to the Ovarian Tumour (OTU) gene in Drosophila. Members include proteins from eukaryotes, viruses and pathogenic bacterium. The conserved cysteine and histidine, and possibly the aspartate, represent the catalytic residues in this putative group of proteases." Q#3874 - CGI_10014181 superfamily 246925 47 231 0.000341944 41.187 cl15309 LRR_RI superfamily C - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. 
This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#3875 - CGI_10014182 superfamily 242900 9 151 5.93E-33 116.216 cl02137 PRA1 superfamily - - PRA1 family protein; This family includes the PRA1 (Prenylated rab acceptor) protein which is a Rab guanine dissociation inhibitor (GDI) displacement factor. This family also includes the glutamate transporter EAAC1 interacting protein GTRAP3-18. Q#3876 - CGI_10014183 superfamily 247809 138 302 3.97E-28 108.132 cl17255 ATP-grasp_4 superfamily - - ATP-grasp domain; This family includes a diverse set of enzymes that possess ATP-dependent carboxylate-amine ligase activity. Q#3878 - CGI_10014185 superfamily 248312 21 191 2.17E-07 47.7333 cl17758 PMP22_Claudin superfamily - - PMP-22/EMP/MP20/Claudin family; PMP-22/EMP/MP20/Claudin family. Q#3884 - CGI_10014191 superfamily 220695 74 309 0.000587881 39.8695 cl18571 7TM_GPCR_Srx superfamily - - Serpentine type 7TM GPCR chemoreceptor Srx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srx is part of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#3892 - CGI_10002286 superfamily 216572 4 82 3.11E-05 40.3359 cl03265 Pep_M12B_propep superfamily N - Reprolysin family propeptide; This region is the propeptide for members of peptidase family M12B. The propeptide contains a sequence motif similar to the "cysteine switch" of the matrixins. This motif is found at the C terminus of the alignment but is not well aligned. 
Q#3893 - CGI_10002287 superfamily 247684 200 409 5.32E-30 120.46 cl17037 NBD_sugar-kinase_HSP70_actin superfamily N - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#3893 - CGI_10002287 superfamily 247684 73 163 2.52E-14 73.8507 cl17037 NBD_sugar-kinase_HSP70_actin superfamily C - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#3894 - CGI_10007199 superfamily 248097 7 130 1.65E-20 81.5426 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#3895 - CGI_10007200 superfamily 243119 519 565 1.17E-09 55.1425 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. 
Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#3895 - CGI_10007200 superfamily 243119 581 633 5.00E-05 41.6505 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#3895 - CGI_10007200 superfamily 243119 241 286 0.00146647 37.0382 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#3895 - CGI_10007200 superfamily 245814 428 498 0.00409841 36.1603 cl11960 Ig superfamily N - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#3896 - CGI_10007201 superfamily 243060 193 258 3.71E-09 53.922 cl02507 SEA superfamily C - "SEA domain; Domain found in Sea urchin sperm protein, Enterokinase, Agrin (SEA). Proposed function of regulating or binding carbohydrate side chains. Recently a proteolytic activity has been shown for a SEA domain." Q#3898 - CGI_10007203 superfamily 219849 336 427 5.15E-19 82.2323 cl09597 RIH_assoc superfamily - - "RyR and IP3R Homology associated; This eukaryotic domain is found in ryanodine receptors (RyR) and inositol 1,4,5-trisphosphate receptors (IP3R) which together form a superfamily of homotetrameric ligand-gated intracellular Ca2+ channels. There seems to be no known function for this domain. Also see the IP3-binding domain pfam01365 and pfam02815." Q#3901 - CGI_10005299 superfamily 248264 5 125 6.11E-06 41.4538 cl17710 DDE_4 superfamily N - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#3903 - CGI_10005301 superfamily 243072 558 683 9.88E-38 138.671 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. 
The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#3903 - CGI_10005301 superfamily 243072 855 980 2.42E-37 137.515 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#3903 - CGI_10005301 superfamily 243072 624 749 2.94E-34 129.041 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#3903 - CGI_10005301 superfamily 243072 791 914 7.55E-33 124.803 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#3903 - CGI_10005301 superfamily 243072 479 617 3.43E-13 67.7938 cl02529 ANK superfamily N - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#3903 - CGI_10005301 superfamily 243072 761 793 0.000555795 39.0744 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#3903 - CGI_10005301 superfamily 247743 12 47 0.00258871 38.3348 cl17189 AAA superfamily C - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." 
Q#3904 - CGI_10005337 superfamily 243090 1092 1214 1.60E-27 110.174 cl02565 RGS superfamily - - "Regulator of G protein signaling (RGS) domain superfamily; The RGS domain is an essential part of the Regulator of G-protein Signaling (RGS) protein family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play critical regulatory roles as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. While inactive, G-alpha-subunits bind GDP, which is released and replaced by GTP upon agonist activation. GTP binding leads to dissociation of the alpha-subunit and the beta-gamma-dimer, allowing them to interact with effector molecules and propagate signaling cascades associated with cellular growth, survival, migration, and invasion. Deactivation of the G-protein signaling controlled by the RGS domain accelerates GTPase activity of the alpha subunit by hydrolysis of GTP to GDP, which results in the reassociation of the alpha-subunit with the beta-gamma-dimer and thereby inhibition of downstream activity. As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. RGS proteins are also involved in apoptosis and cell proliferation, as well as modulation of cardiac development. Several RGS proteins can fine-tune immune responses, while others play important roles in neuronal signal modulation. Some RGS proteins are principal elements needed for proper vision."
Q#3904 - CGI_10005337 superfamily 243090 392 504 3.51E-16 78.1561 cl02565 RGS superfamily C - "Regulator of G protein signaling (RGS) domain superfamily; The RGS domain is an essential part of the Regulator of G-protein Signaling (RGS) protein family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play critical regulatory roles as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. While inactive, G-alpha-subunits bind GDP, which is released and replaced by GTP upon agonist activation. GTP binding leads to dissociation of the alpha-subunit and the beta-gamma-dimer, allowing them to interact with effector molecules and propagate signaling cascades associated with cellular growth, survival, migration, and invasion. Deactivation of the G-protein signaling controlled by the RGS domain accelerates GTPase activity of the alpha subunit by hydrolysis of GTP to GDP, which results in the reassociation of the alpha-subunit with the beta-gamma-dimer and thereby inhibition of downstream activity. As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. RGS proteins are also involved in apoptosis and cell proliferation, as well as modulation of cardiac development. Several RGS proteins can fine-tune immune responses, while others play important roles in neuronal signal modulation. Some RGS proteins are principal elements needed for proper vision."
Q#3904 - CGI_10005337 superfamily 243090 605 703 0.000146963 41.8273 cl02565 RGS superfamily - - "Regulator of G protein signaling (RGS) domain superfamily; The RGS domain is an essential part of the Regulator of G-protein Signaling (RGS) protein family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play critical regulatory roles as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. While inactive, G-alpha-subunits bind GDP, which is released and replaced by GTP upon agonist activation. GTP binding leads to dissociation of the alpha-subunit and the beta-gamma-dimer, allowing them to interact with effector molecules and propagate signaling cascades associated with cellular growth, survival, migration, and invasion. Deactivation of the G-protein signaling controlled by the RGS domain accelerates GTPase activity of the alpha subunit by hydrolysis of GTP to GDP, which results in the reassociation of the alpha-subunit with the beta-gamma-dimer and thereby inhibition of downstream activity. As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. RGS proteins are also involved in apoptosis and cell proliferation, as well as modulation of cardiac development. Several RGS proteins can fine-tune immune responses, while others play important roles in neuronal signal modulation. Some RGS proteins are principal elements needed for proper vision."
Q#3904 - CGI_10005337 superfamily 243090 848 949 0.000625659 39.6818 cl02565 RGS superfamily - - "Regulator of G protein signaling (RGS) domain superfamily; The RGS domain is an essential part of the Regulator of G-protein Signaling (RGS) protein family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play critical regulatory roles as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. While inactive, G-alpha-subunits bind GDP, which is released and replaced by GTP upon agonist activation. GTP binding leads to dissociation of the alpha-subunit and the beta-gamma-dimer, allowing them to interact with effector molecules and propagate signaling cascades associated with cellular growth, survival, migration, and invasion. Deactivation of the G-protein signaling controlled by the RGS domain accelerates GTPase activity of the alpha subunit by hydrolysis of GTP to GDP, which results in the reassociation of the alpha-subunit with the beta-gamma-dimer and thereby inhibition of downstream activity. As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. RGS proteins are also involved in apoptosis and cell proliferation, as well as modulation of cardiac development. Several RGS proteins can fine-tune immune responses, while others play important roles in neuronal signal modulation. Some RGS proteins are principal elements needed for proper vision."
Q#3905 - CGI_10005338 superfamily 247743 1857 1942 0.00499124 39.4367 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#3905 - CGI_10005338 superfamily 193256 3056 3320 4.41E-62 217.508 cl18189 AAA_8 superfamily - - "P-loop containing dynein motor region D4; The 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This particular family is the D4 ATP-binding region of the motor." Q#3905 - CGI_10005338 superfamily 193257 3686 3862 2.36E-48 176.329 cl15086 AAA_9 superfamily C - "ATP-binding dynein motor region D5; The 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This particular family is the D5 ATP-binding region of the motor, but has lost its P-loop." 
Q#3905 - CGI_10005338 superfamily 193253 3333 3664 3.59E-40 155.965 cl15084 MT superfamily - - "Microtubule-binding stalk of dynein motor; the 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This family is the region between D4 and D5 and is the two predicted alpha-helical coiled coil segments that form the stalk supporting the ATP-sensitive microtubule binding component." Q#3905 - CGI_10005338 superfamily 193251 2757 2965 3.63E-36 142.382 cl18188 AAA_7 superfamily N - "P-loop containing dynein motor region D3; the 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This particular family is the D3 and is an ATP binding site." Q#3906 - CGI_10005339 superfamily 190637 102 178 1.08E-32 121.739 cl04081 HELP superfamily - - "HELP motif; The founding member of the EMAP protein family is the 75 kDa Echinoderm Microtubule-Associated Protein, so-named for its abundance in sea urchin, sand dollar and starfish eggs. The Hydrophobic EMAP-Like Protein (HELP) motif was identified initially in the human EMAP-Like Protein 2 (EML2) and subsequently in the entire EMAP Protein family. The HELP motif is approximately 60-70 amino acids in length and is conserved amongst metazoans. Although the HELP motif is hydrophobic, there is no evidence that EMAP-Like Proteins are membrane-associated. All members of the EMAP-Like Protein family, identified to-date, are constructed with an amino terminal HELP motif followed by a WD domain. In C. 
elegans, EMAP-Like Protein-1 (ELP-1) is required for touch sensation indicating that ELP-1 may play a role in mechanosensation. The localization of ELP-1 to microtubules and adhesion sites implies that ELP-1 may transmit forces between the body surface and the touch receptor neurons." Q#3906 - CGI_10005339 superfamily 243092 286 619 3.87E-25 105.88 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#3906 - CGI_10005339 superfamily 243092 172 362 8.28E-14 71.212 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#3906 - CGI_10005339 superfamily 243092 589 733 6.78E-09 56.5744 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#3907 - CGI_10005340 superfamily 241597 490 546 1.73E-13 66.1031 cl00082 HMG-box superfamily - - "High Mobility Group (HMG)-box is found in a variety of eukaryotic chromosomal proteins and transcription factors. HMGs bind to the minor groove of DNA and have been classified by DNA binding preferences. Two phylogenically distinct groups of Class I proteins bind DNA in a sequence specific fashion and contain a single HMG box. One group (SOX-TCF) includes transcription factors, TCF-1, -3, -4; and also SRY and LEF-1, which bind four-way DNA junctions and duplex DNA targets. The second group (MATA) includes fungal mating type gene products MC, MATA1 and Ste11. 
Class II and III proteins (HMGB-UBF) bind DNA in a non-sequence specific fashion and contain two or more tandem HMG boxes. Class II members include non-histone chromosomal proteins, HMG1 and HMG2, which bind to bent or distorted DNA such as four-way DNA junctions, synthetic DNA cruciforms, kinked cisplatin-modified DNA, DNA bulges, cross-overs in supercoiled DNA, and can cause looping of linear DNA. Class III members include nucleolar and mitochondrial transcription factors, UBF and mtTF1, which bind four-way DNA junctions." Q#3910 - CGI_10026706 superfamily 215847 158 622 4.10E-64 224.633 cl09510 Lipoxygenase superfamily - - Lipoxygenase; Lipoxygenase. Q#3910 - CGI_10026706 superfamily 241546 2 106 1.42E-22 93.7736 cl00011 PLAT superfamily - - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats. The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#3911 - CGI_10026707 superfamily 215847 213 654 6.47E-74 252.367 cl09510 Lipoxygenase superfamily N - Lipoxygenase; Lipoxygenase. Q#3911 - CGI_10026707 superfamily 241546 4 106 5.47E-15 72.2024 cl00011 PLAT superfamily - - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats. The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." 
Q#3912 - CGI_10026708 superfamily 241546 4 111 2.10E-24 98.9504 cl00011 PLAT superfamily - - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats. The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#3912 - CGI_10026708 superfamily 215847 214 654 1.95E-86 286.265 cl09510 Lipoxygenase superfamily N - Lipoxygenase; Lipoxygenase. Q#3913 - CGI_10026709 superfamily 245213 587 621 0.000935784 38.0014 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#3913 - CGI_10026709 superfamily 246918 743 794 1.54E-13 66.8415 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#3913 - CGI_10026709 superfamily 246918 686 737 1.08E-10 58.7523 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#3913 - CGI_10026709 superfamily 246918 636 656 0.00490886 36.0255 cl15278 TSP_1 superfamily C - Thrombospondin type 1 domain; Thrombospondin type 1 domain. 
Q#3915 - CGI_10026711 superfamily 241599 89 138 9.08E-18 74.5872 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#3916 - CGI_10026712 superfamily 245206 26 262 4.21E-56 188.641 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#3916 - CGI_10026712 superfamily 245206 268 365 1.03E-17 81.5553 cl09931 NADB_Rossmann superfamily C - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. 
Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#3916 - CGI_10026712 superfamily 245206 376 428 1.27E-05 45.3466 cl09931 NADB_Rossmann superfamily N - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#3917 - CGI_10026713 superfamily 206084 573 594 1.54E-06 45.6696 cl16472 zf-C2HC_2 superfamily - - zinc-finger of a C2HC-type; This family contains a number of divergent C2H2 type zinc fingers. Q#3917 - CGI_10026713 superfamily 206084 466 488 9.39E-06 43.3584 cl16472 zf-C2HC_2 superfamily - - zinc-finger of a C2HC-type; This family contains a number of divergent C2H2 type zinc fingers. 
Q#3918 - CGI_10026714 superfamily 241593 30 82 9.09E-05 42.6338 cl00075 HATPase_c superfamily C - "Histidine kinase-like ATPases; This family includes several ATP-binding proteins for example: histidine kinase, DNA gyrase B, topoisomerases, heat shock protein HSP90, phytochrome-like ATPases and DNA mismatch repair proteins" Q#3918 - CGI_10026714 superfamily 243181 212 351 6.62E-33 126.662 cl02783 TopoII_MutL_Trans superfamily - - "MutL_Trans: transducer domain, having a ribosomal S5 domain 2-like fold, conserved in the C-terminal domain of type II DNA topoisomerases (Topo II) and DNA mismatch repair (MutL/MLH1/PMS2) proteins. This transducer domain is homologous to the second domain of the DNA gyrase B subunit, which is known to be important in nucleotide hydrolysis and the transduction of structural signals from ATP-binding site to the DNA breakage/reunion regions of the enzymes. The GyrB dimerizes in response to ATP binding, and is homologous to the N-terminal half of eukaryotic Topo II and the ATPase fragment of MutL. Type II DNA topoisomerases catalyze the ATP-dependent transport of one DNA duplex through another, in the process generating transient double strand breaks via covalent attachments to both DNA strands at the 5' positions. Included in this group are proteins similar to human MLH1 and PMS2. MLH1 forms a heterodimer with PMS2 which functions in meiosis and in DNA mismatch repair (MMR). Cells lacking either hMLH1 or hPMS2 have a strong mutator phenotype and display microsatellite instability (MSI). Mutation in hMLH1 accounts for a large fraction of Lynch syndrome (HNPCC) families." Q#3918 - CGI_10026714 superfamily 244692 1522 1658 5.11E-21 92.0322 cl07336 MutL_C superfamily - - "MutL C terminal dimerisation domain; MutL and MutS are key components of the DNA repair machinery that corrects replication errors. MutS recognises mispaired or unpaired bases in a DNA duplex and in the presence of ATP, recruits MutL to form a DNA signaling complex for repair. 
The N terminal region of MutL contains the ATPase domain and the C terminal is involved in dimerisation." Q#3920 - CGI_10026716 superfamily 248097 90 194 1.75E-22 89.2466 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#3925 - CGI_10026721 superfamily 245814 39 123 3.69E-11 59.247 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#3925 - CGI_10026721 superfamily 245814 240 315 2.67E-15 70.612 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#3925 - CGI_10026721 superfamily 204025 325 358 9.72E-07 45.3201 cl07344 PLAC superfamily - - PLAC (protease and lacunin) domain; The PLAC (protease and lacunin) domain is a short six-cysteine region that is usually found at the C terminal of proteins. 
It is found in a range of proteins including PACE4 (paired basic amino acid cleaving enzyme 4) and the extracellular matrix protein lacunin. Q#3925 - CGI_10026721 superfamily 245814 182 223 4.08E-05 41.0679 cl11960 Ig superfamily N - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#3927 - CGI_10026723 superfamily 222070 326 470 9.65E-14 68.4732 cl18634 DDE_3 superfamily - - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#3927 - CGI_10026723 superfamily 238076 230 283 4.91E-10 57.0438 cl18938 PAX superfamily N - Paired Box domain Q#3927 - CGI_10026723 superfamily 245814 12 71 5.54E-08 50.0979 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#3927 - CGI_10026723 superfamily 248005 180 218 2.63E-05 41.8319 cl17451 HTH_23 superfamily C - Homeodomain-like domain; Homeodomain-like domain. Q#3927 - CGI_10026723 superfamily 245814 89 159 0.00443614 35.4298 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#3930 - CGI_10026726 superfamily 247725 487 619 1.00E-58 196.017 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." 
Q#3930 - CGI_10026726 superfamily 247725 254 376 6.05E-48 165.221 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#3930 - CGI_10026726 superfamily 245835 19 234 4.45E-80 256.759 cl12013 BAR superfamily - - "The Bin/Amphiphysin/Rvs (BAR) domain, a dimerization module that binds membranes and detects membrane curvature; BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions including organelle biogenesis, membrane trafficking or remodeling, and cell division and migration. Mutations in BAR containing proteins have been linked to diseases and their inactivation in cells leads to altered membrane dynamics. A BAR domain with an additional N-terminal amphipathic helix (an N-BAR) can drive membrane curvature. These N-BAR domains are found in amphiphysins and endophilins, among others. BAR domains are also frequently found alongside domains that determine lipid specificity, such as the Pleckstrin Homology (PH) and Phox Homology (PX) domains which are present in beta centaurins (ACAPs and ASAPs) and sorting nexins, respectively. A FES-CIP4 Homology (FCH) domain together with a coiled coil region is called the F-BAR domain and is present in Pombe/Cdc15 homology (PCH) family proteins, which include Fes/Fes tyrosine kinases, PACSIN or syndapin, CIP4-like proteins, and srGAPs, among others. 
The Inverse (I)-BAR or IRSp53/MIM homology Domain (IMD) is found in multi-domain proteins, such as IRSp53 and MIM, that act as scaffolding proteins and transducers of a variety of signaling pathways that link membrane dynamics and the underlying actin cytoskeleton. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The I-BAR domain induces membrane protrusions in the opposite direction compared to classical BAR and F-BAR domains, which produce membrane invaginations. BAR domains that also serve as protein interaction domains include those of arfaptin and OPHN1-like proteins, among others, which bind to Rac and Rho GAP domains, respectively." Q#3936 - CGI_10026732 superfamily 193607 2332 2462 1.90E-72 240.552 cl15237 Deltex_C superfamily - - "Domain found at the C-terminus of deltex-like; The deltex family of proteins is involved in the regulation of Notch signaling, and therefore may play roles in cell-to-cell communications that regulate mechanisms determining cell fate. They have a central RING-type zinc finger domain and contain a C-terminal domain, described here, that is also found in other domain architectures. Deltex-1 (DTX1) contains a RING finger and two WWE domains, indicating that it may be an E3 ubiquitin ligase. Human deltex 3-like, which contains an additional N-terminal domain (presumably with ubiquitin ligase activity) is also described as E3 ubiquitin-protein ligase DTX3L, B-lymphoma- and BAL-associated protein (BBAP), or rhysin-2. DTX3L mediates monoubiquitination of K91 of histone H4 in response to DNA damage." 
Q#3936 - CGI_10026732 superfamily 247792 2283 2323 3.37E-08 52.8332 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#3936 - CGI_10026732 superfamily 241554 1800 1969 1.02E-53 188.633 cl00019 Macro superfamily - - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#3936 - CGI_10026732 superfamily 241554 1206 1346 4.36E-34 130.458 cl00019 Macro superfamily - - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. 
Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#3941 - CGI_10026738 superfamily 241563 79 119 3.17E-06 44.7776 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#3941 - CGI_10026738 superfamily 241563 25 70 0.00223319 36.3032 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#3942 - CGI_10026739 superfamily 144673 56 104 1.49E-19 76.6594 cl18014 TSC22 superfamily - - TSC-22/dip/bun family; TSC-22/dip/bun family. Q#3952 - CGI_10026750 superfamily 248022 37 284 2.55E-06 48.4279 cl17468 Aa_trans superfamily C - "Transmembrane amino acid transporter protein; This transmembrane region is found in many amino acid transporters including UNC-47 and MTR. UNC-47 encodes a vesicular amino butyric acid (GABA) transporter, (VGAT). UNC-47 is predicted to have 10 transmembrane domains. MTR is a N system amino acid transporter system protein involved in methyltryptophan resistance. Other members of this family include proline transporters and amino acid permeases." Q#3953 - CGI_10026751 superfamily 242534 543 740 2.73E-42 156.24 cl01495 Glyco_hydro_10 superfamily - - Glycosyl hydrolase family 10; Glycosyl hydrolase family 10. 
Q#3953 - CGI_10026751 superfamily 216848 288 423 2.97E-11 62.0688 cl03406 CBM_4_9 superfamily - - Carbohydrate binding domain; This family includes diverse carbohydrate binding domains. Q#3955 - CGI_10006847 superfamily 245610 2 262 1.09E-164 460.356 cl11424 nitrilase superfamily - - "Nitrilase superfamily, including nitrile- or amide-hydrolyzing enzymes and amide-condensing enzymes; This superfamily (also known as the C-N hydrolase superfamily) contains hydrolases that break carbon-nitrogen bonds; it includes nitrilases, cyanide dihydratases, aliphatic amidases, N-terminal amidases, beta-ureidopropionases, biotinidases, pantotheinase, N-carbamyl-D-amino acid amidohydrolases, the glutaminase domain of glutamine-dependent NAD+ synthetase, apolipoprotein N-acyltransferases, and N-carbamoylputrescine amidohydrolases, among others. These enzymes depend on a Glu-Lys-Cys catalytic triad, and work through a thiol acylenzyme intermediate. Members of this superfamily generally form homomeric complexes, the basic building block of which is a homodimer. These oligomers include dimers, tetramers, hexamers, octamers, tetradecamers, octadecamers, as well as variable length helical arrangements and homo-oligomeric spirals. These proteins have roles in vitamin and co-enzyme metabolism, in detoxifying small molecules, in the synthesis of signaling molecules, and in the post-translational modification of proteins. They are used industrially, as biocatalysts in the fine chemical and pharmaceutical industry, in cyanide remediation, and in the treatment of toxic effluent. This superfamily has been classified previously in the literature, based on global and structure-based sequence analysis, into thirteen different enzyme classes (referred to as 1-13). This hierarchy includes those thirteen classes and a few additional subfamilies. A putative distant relative, the plasmid-borne TraB family, has not been included in the hierarchy." 
Q#3956 - CGI_10006848 superfamily 241692 121 373 9.75E-94 295.219 cl00214 Aldolase_II superfamily - - "Class II Aldolase and Adducin head (N-terminal) domain. Aldolases are ubiquitous enzymes catalyzing central steps of carbohydrate metabolism. Based on enzymatic mechanisms, this superfamily has been divided into two distinct classes (Class I and II). Class II enzymes are further divided into two sub-classes A and B. This family includes class II A aldolases and adducins which has not been ascribed any enzymatic function. Members of this class are primarily bacterial and eukaryotic in origin and include L-fuculose-1-phosphate, L-rhamnulose-1-phosphate aldolases and L-ribulose-5-phosphate 4-epimerases. They all share the ability to promote carbon-carbon bond cleavage and stabilize enolate intermediates using divalent cations." Q#3957 - CGI_10006849 superfamily 247739 241 333 4.25E-36 136.934 cl17185 LPLAT superfamily C - "Lysophospholipid acyltransferases (LPLATs) of glycerophospholipid biosynthesis; Lysophospholipid acyltransferase (LPLAT) superfamily members are acyltransferases of de novo and remodeling pathways of glycerophospholipid biosynthesis. These proteins catalyze the incorporation of an acyl group from either acylCoAs or acyl-acyl carrier proteins (acylACPs) into acceptors such as glycerol 3-phosphate, dihydroxyacetone phosphate or lyso-phosphatidic acid. 
Included in this superfamily are LPLATs such as glycerol-3-phosphate 1-acyltransferase (GPAT, PlsB), 1-acyl-sn-glycerol-3-phosphate acyltransferase (AGPAT, PlsC), lysophosphatidylcholine acyltransferase 1 (LPCAT-1), lysophosphatidylethanolamine acyltransferase (LPEAT, also known as, MBOAT2, membrane-bound O-acyltransferase domain-containing protein 2), lipid A biosynthesis lauroyl/myristoyl acyltransferase, 2-acylglycerol O-acyltransferase (MGAT), dihydroxyacetone phosphate acyltransferase (DHAPAT, also known as 1 glycerol-3-phosphate O-acyltransferase 1) and Tafazzin (the protein product of the Barth syndrome (TAZ) gene)." Q#3957 - CGI_10006849 superfamily 247739 665 956 6.86E-08 55.2373 cl17185 LPLAT superfamily N - "Lysophospholipid acyltransferases (LPLATs) of glycerophospholipid biosynthesis; Lysophospholipid acyltransferase (LPLAT) superfamily members are acyltransferases of de novo and remodeling pathways of glycerophospholipid biosynthesis. These proteins catalyze the incorporation of an acyl group from either acylCoAs or acyl-acyl carrier proteins (acylACPs) into acceptors such as glycerol 3-phosphate, dihydroxyacetone phosphate or lyso-phosphatidic acid. Included in this superfamily are LPLATs such as glycerol-3-phosphate 1-acyltransferase (GPAT, PlsB), 1-acyl-sn-glycerol-3-phosphate acyltransferase (AGPAT, PlsC), lysophosphatidylcholine acyltransferase 1 (LPCAT-1), lysophosphatidylethanolamine acyltransferase (LPEAT, also known as, MBOAT2, membrane-bound O-acyltransferase domain-containing protein 2), lipid A biosynthesis lauroyl/myristoyl acyltransferase, 2-acylglycerol O-acyltransferase (MGAT), dihydroxyacetone phosphate acyltransferase (DHAPAT, also known as 1 glycerol-3-phosphate O-acyltransferase 1) and Tafazzin (the protein product of the Barth syndrome (TAZ) gene)." 
Q#3957 - CGI_10006849 superfamily 247739 335 375 1.29E-07 51.8052 cl17185 LPLAT superfamily N - "Lysophospholipid acyltransferases (LPLATs) of glycerophospholipid biosynthesis; Lysophospholipid acyltransferase (LPLAT) superfamily members are acyltransferases of de novo and remodeling pathways of glycerophospholipid biosynthesis. These proteins catalyze the incorporation of an acyl group from either acylCoAs or acyl-acyl carrier proteins (acylACPs) into acceptors such as glycerol 3-phosphate, dihydroxyacetone phosphate or lyso-phosphatidic acid. Included in this superfamily are LPLATs such as glycerol-3-phosphate 1-acyltransferase (GPAT, PlsB), 1-acyl-sn-glycerol-3-phosphate acyltransferase (AGPAT, PlsC), lysophosphatidylcholine acyltransferase 1 (LPCAT-1), lysophosphatidylethanolamine acyltransferase (LPEAT, also known as, MBOAT2, membrane-bound O-acyltransferase domain-containing protein 2), lipid A biosynthesis lauroyl/myristoyl acyltransferase, 2-acylglycerol O-acyltransferase (MGAT), dihydroxyacetone phosphate acyltransferase (DHAPAT, also known as 1 glycerol-3-phosphate O-acyltransferase 1) and Tafazzin (the protein product of the Barth syndrome (TAZ) gene)." Q#3958 - CGI_10006850 superfamily 243038 1290 1365 0.000890649 39.6313 cl02442 DEP superfamily - - "DEP domain, named after Dishevelled, Egl-10, and Pleckstrin, where this domain was first discovered. The function of this domain is still not clear, but it is believed to be important for the membrane association of the signaling proteins in which it is present. New studies show that the DEP domain of Sst2, a yeast RGS protein is necessary and sufficient for receptor interaction." Q#3959 - CGI_10006851 superfamily 198898 1307 1337 0.000781802 39.6586 cl07406 c-SKI_SMAD_bind superfamily C - c-SKI Smad4 binding domain; c-SKI is an oncoprotein that inhibits TGF-beta signaling through interaction with Smad proteins. 
This domain binds to Smad4 Q#3962 - CGI_10006854 superfamily 248054 6 62 7.63E-10 55.1708 cl17500 NAD_binding_8 superfamily C - NAD(P)-binding Rossmann-like domain; NAD(P)-binding Rossmann-like domain. Q#3963 - CGI_10006855 superfamily 216686 39 228 2.04E-41 142.847 cl18377 Galactosyl_T superfamily - - "Galactosyltransferase; This family includes the galactosyltransferases UDP-galactose:2-acetamido-2-deoxy-D-glucose3beta-galactosyltransferase and UDP-Gal:beta-GlcNAc beta 1,3-galactosyltransferase. Specific galactosyltransferases transfer galactose to GlcNAc terminal chains in the synthesis of the lacto-series oligosaccharides types 1 and 2." Q#3964 - CGI_10006856 superfamily 241646 132 167 2.11E-05 40.1259 cl00156 WAP superfamily N - "whey acidic protein-type four-disulfide core domains. Members of the family include whey acidic protein, elafin (elastase-specific inhibitor), caltrin-like protein (a calcium transport inhibitor) and other extracellular proteinase inhibitors. A group of proteins containing 8 characteristically-spaced cysteine residues forming disulphide bonds, have been termed '4-disulphide core' proteins. Protease inhibition occurs by insertion of the inhibitory loop into the active site pocket and interference with the catalytic residues of the protease." Q#3965 - CGI_10004003 superfamily 243179 153 268 3.04E-08 50.6091 cl02781 tetraspanin_LEL superfamily - - "Tetraspanin, extracellular domain or large extracellular loop (LEL). Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. The tetraspanin family contains CD9, CD63, CD37, CD53, CD82, CD151, and CD81, amongst others. Tetraspanins are involved in diverse processes such as cell activation and proliferation, adhesion and motility, differentiation, cancer, and others. 
Their various functions may relate to their ability to act as molecular facilitators, grouping specific cell-surface proteins and affecting formation and stability of signaling complexes. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web", which may also include integrins." Q#3966 - CGI_10004004 superfamily 243179 94 200 2.30E-05 41.3643 cl02781 tetraspanin_LEL superfamily - - "Tetraspanin, extracellular domain or large extracellular loop (LEL). Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. The tetraspanin family contains CD9, CD63, CD37, CD53, CD82, CD151, and CD81, amongst others. Tetraspanins are involved in diverse processes such as cell activation and proliferation, adhesion and motility, differentiation, cancer, and others. Their various functions may relate to their ability to act as molecular facilitators, grouping specific cell-surface proteins and affecting formation and stability of signaling complexes. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web", which may also include integrins." Q#3968 - CGI_10004006 superfamily 245201 161 481 8.24E-55 190.388 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. 
These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#3970 - CGI_10004008 superfamily 243119 413 462 0.00363861 37.0281 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#3971 - CGI_10016932 superfamily 241563 38 75 1.58E-06 45.548 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#3971 - CGI_10016932 superfamily 110440 463 489 0.00232346 36.2317 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#3972 - CGI_10016933 superfamily 243267 20 384 3.76E-110 329.96 cl03000 Innexin superfamily - - "Innexin; This family includes the drosophila proteins Ogre and shaking-B, and the C. elegans proteins Unc-7 and Unc-9. 
Members of this family are integral membrane proteins which are involved in the formation of gap junctions. This family has been named the Innexins." Q#3973 - CGI_10016934 superfamily 243267 25 389 1.38E-136 398.14 cl03000 Innexin superfamily - - "Innexin; This family includes the drosophila proteins Ogre and shaking-B, and the C. elegans proteins Unc-7 and Unc-9. Members of this family are integral membrane proteins which are involved in the formation of gap junctions. This family has been named the Innexins." Q#3975 - CGI_10016936 superfamily 243267 36 401 8.10E-123 362.702 cl03000 Innexin superfamily - - "Innexin; This family includes the drosophila proteins Ogre and shaking-B, and the C. elegans proteins Unc-7 and Unc-9. Members of this family are integral membrane proteins which are involved in the formation of gap junctions. This family has been named the Innexins." Q#3976 - CGI_10016937 superfamily 243267 48 417 1.11E-97 298.373 cl03000 Innexin superfamily - - "Innexin; This family includes the drosophila proteins Ogre and shaking-B, and the C. elegans proteins Unc-7 and Unc-9. Members of this family are integral membrane proteins which are involved in the formation of gap junctions. This family has been named the Innexins." Q#3978 - CGI_10016939 superfamily 241624 449 598 2.20E-26 108.953 cl00120 PP2Cc superfamily N - "Serine/threonine phosphatases, family 2C, catalytic domain; The protein architecture and deduced catalytic mechanism of PP2C phosphatases are similar to the PP1, PP2A, PP2B family of protein Ser/Thr phosphatases, with which PP2C shares no sequence similarity." Q#3978 - CGI_10016939 superfamily 241624 199 261 2.97E-05 45.0099 cl00120 PP2Cc superfamily C - "Serine/threonine phosphatases, family 2C, catalytic domain; The protein architecture and deduced catalytic mechanism of PP2C phosphatases are similar to the PP1, PP2A, PP2B family of protein Ser/Thr phosphatases, with which PP2C shares no sequence similarity." 
Q#3979 - CGI_10016940 superfamily 247856 333 394 0.0032917 35.9865 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#3980 - CGI_10016941 superfamily 247724 20 182 4.20E-115 327.972 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." 
Q#3981 - CGI_10016942 superfamily 247725 88 206 1.01E-53 182.92 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting proteins to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and other proteins." Q#3981 - CGI_10016942 superfamily 216381 575 902 1.84E-154 460.519 cl03136 Oxysterol_BP superfamily - - Oxysterol-binding protein; Oxysterol-binding protein. Q#3982 - CGI_10016943 superfamily 241647 147 174 5.87E-07 45.9818 cl00157 WW superfamily - - Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs. 
Q#3985 - CGI_10016946 superfamily 243054 1790 1995 1.27E-21 96.7459 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#3985 - CGI_10016946 superfamily 243054 2544 2743 5.11E-16 79.7971 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#3985 - CGI_10016946 superfamily 243054 1895 2110 6.22E-14 73.2487 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine 
(L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#3985 - CGI_10016946 superfamily 243054 2114 2324 3.05E-12 67.8559 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#3985 - CGI_10016946 superfamily 243054 2224 2430 2.06E-11 65.1596 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#3985 - CGI_10016946 superfamily 243054 2360 2537 1.48E-09 59.3816 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an 
antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#3985 - CGI_10016946 superfamily 243054 1233 1426 1.27E-08 56.3 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#3985 - CGI_10016946 superfamily 243054 1342 1582 2.27E-07 52.8332 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#3985 - CGI_10016946 superfamily 207632 535 563 0.000464104 40.5433 cl02531 Plectin superfamily - - "Plectin repeat; This family includes repeats from plectin, desmoplakin, envoplakin and bullous pemphigoid antigen." 
Q#3985 - CGI_10016946 superfamily 207632 611 647 0.00135195 39.3561 cl02531 Plectin superfamily C - "Plectin repeat; This family includes repeats from plectin, desmoplakin, envoplakin and bullous pemphigoid antigen." Q#3985 - CGI_10016946 superfamily 243054 1584 1781 0.00189609 40.892 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#3986 - CGI_10016947 superfamily 243054 18 218 2.83E-28 113.309 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#3986 - CGI_10016947 superfamily 247856 368 430 2.23E-05 42.9201 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. 
Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#3986 - CGI_10016947 superfamily 141488 450 522 1.64E-29 112.542 cl02524 GAS2 superfamily - - Growth-Arrest-Specific Protein 2 Domain; Growth-Arrest-Specific Protein 2 Domain. Q#3986 - CGI_10016947 superfamily 243054 228 334 1.51E-06 46.9393 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#3987 - CGI_10016948 superfamily 247038 1041 1121 1.23E-12 67.8649 cl15674 IPT superfamily - - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." Q#3987 - CGI_10016948 superfamily 247038 259 318 1.10E-11 64.7833 cl15674 IPT superfamily C - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). 
IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." Q#3987 - CGI_10016948 superfamily 247038 1125 1203 5.58E-11 62.8573 cl15674 IPT superfamily - - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." Q#3987 - CGI_10016948 superfamily 247038 1802 1852 1.94E-08 55.1533 cl15674 IPT superfamily C - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. 
In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." Q#3987 - CGI_10016948 superfamily 247038 1622 1689 3.63E-07 51.3013 cl15674 IPT superfamily C - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." Q#3987 - CGI_10016948 superfamily 247038 1898 1964 9.12E-06 47.0641 cl15674 IPT superfamily - - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." Q#3987 - CGI_10016948 superfamily 247038 1734 1772 1.21E-05 46.6789 cl15674 IPT superfamily C - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. 
They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." Q#3987 - CGI_10016948 superfamily 247038 24 114 0.000193994 43.2188 cl15674 IPT superfamily - - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." Q#3987 - CGI_10016948 superfamily 220608 2147 2261 1.44E-30 120.874 cl10859 G8 superfamily - - G8 domain; This domain is found in disease proteins PKHD1 and KIAA1199 and is named G8 after its 8 conserved glycines. It is predicted to contain 10 beta strands and an alpha helix. Q#3987 - CGI_10016948 superfamily 220608 3021 3141 1.33E-24 103.54 cl10859 G8 superfamily - - G8 domain; This domain is found in disease proteins PKHD1 and KIAA1199 and is named G8 after its 8 conserved glycines. It is predicted to contain 10 beta strands and an alpha helix. Q#3987 - CGI_10016948 superfamily 247038 1298 1376 0.000757344 41.274 cl15674 IPT superfamily - - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. 
They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." Q#3987 - CGI_10016948 superfamily 247038 1982 2048 0.000829415 40.8888 cl15674 IPT superfamily - - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." Q#3987 - CGI_10016948 superfamily 244965 357 467 0.000845209 41.2335 cl08459 PA14 superfamily - - "PA14 domain; This domain forms an insert in bacterial beta-glucosidases and is found in other glycosidases, glycosyltransferases, proteases, amidases, yeast adhesins, and bacterial toxins, including anthrax protective antigen (PA). The domain also occurs in a Dictyostelium prespore-cell-inducing factor Psi and in fibrocystin, the mammalian protein whose mutation leads to polycystic kidney and hepatic disease. The crystal structure of PA shows that this domain (named PA14 after its location in the PA20 pro-peptide) has a beta-barrel structure. The PA14 domain sequence suggests a binding function, rather than a catalytic role. 
The PA14 domain distribution is compatible with carbohydrate binding." Q#3987 - CGI_10016948 superfamily 247038 1531 1616 0.00306907 39.3602 cl15674 IPT superfamily - - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." Q#3988 - CGI_10016949 superfamily 243072 771 921 7.03E-23 96.2986 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#3988 - CGI_10016949 superfamily 243072 734 797 3.71E-08 52.771 cl02529 ANK superfamily N - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#3989 - CGI_10016950 superfamily 247856 69 124 2.12E-17 72.1953 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#3989 - CGI_10016950 superfamily 247856 1 56 0.000341473 35.6013 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#3990 - CGI_10005653 superfamily 246902 142 262 6.97E-41 140.832 cl15239 PLDc_SF superfamily - - "Catalytic domain of phospholipase D superfamily proteins; Catalytic domain of phospholipase D (PLD) superfamily proteins. The PLD superfamily is composed of a large and diverse group of proteins including plant, mammalian and bacterial PLDs, bacterial cardiolipin (CL) synthases, bacterial phosphatidylserine synthases (PSS), eukaryotic phosphatidylglycerophosphate (PGP) synthase, eukaryotic tyrosyl-DNA phosphodiesterase 1 (Tdp1), and some bacterial endonucleases (Nuc and BfiI), among others. PLD enzymes hydrolyze phospholipid phosphodiester bonds to yield phosphatidic acid and a free polar head group. They can also catalyze the transphosphatidylation of phospholipids to acceptor alcohols. The majority of members in this superfamily contain a short conserved sequence motif (H-x-K-x(4)-D, where x represents any amino acid residue), called the HKD signature motif. There are varying expanded forms of this motif in different family members. 
Some members contain variant HKD motifs. Most PLD enzymes are monomeric proteins with two HKD motif-containing domains. Two HKD motifs from two domains form a single active site. Some PLD enzymes have only one copy of the HKD motif per subunit but form a functionally active dimer, which has a single active site at the dimer interface containing the two HKD motifs from both subunits. Different PLD enzymes may have evolved through domain fusion of a common catalytic core with separate substrate recognition domains. Despite their various catalytic functions and a very broad range of substrate specificities, the diverse group of PLD enzymes can bind to a phosphodiester moiety. Most of them are active as bi-lobed monomers or dimers, and may possess similar core structures for catalytic activity. They are generally thought to utilize a common two-step ping-pong catalytic mechanism, involving an enzyme-substrate intermediate, to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group." Q#3990 - CGI_10005653 superfamily 246902 19 119 5.52E-20 83.8827 cl15239 PLDc_SF superfamily - - "Catalytic domain of phospholipase D superfamily proteins; Catalytic domain of phospholipase D (PLD) superfamily proteins. The PLD superfamily is composed of a large and diverse group of proteins including plant, mammalian and bacterial PLDs, bacterial cardiolipin (CL) synthases, bacterial phosphatidylserine synthases (PSS), eukaryotic phosphatidylglycerophosphate (PGP) synthase, eukaryotic tyrosyl-DNA phosphodiesterase 1 (Tdp1), and some bacterial endonucleases (Nuc and BfiI), among others. 
PLD enzymes hydrolyze phospholipid phosphodiester bonds to yield phosphatidic acid and a free polar head group. They can also catalyze the transphosphatidylation of phospholipids to acceptor alcohols. The majority of members in this superfamily contain a short conserved sequence motif (H-x-K-x(4)-D, where x represents any amino acid residue), called the HKD signature motif. There are varying expanded forms of this motif in different family members. Some members contain variant HKD motifs. Most PLD enzymes are monomeric proteins with two HKD motif-containing domains. Two HKD motifs from two domains form a single active site. Some PLD enzymes have only one copy of the HKD motif per subunit but form a functionally active dimer, which has a single active site at the dimer interface containing the two HKD motifs from both subunits. Different PLD enzymes may have evolved through domain fusion of a common catalytic core with separate substrate recognition domains. Despite their various catalytic functions and a very broad range of substrate specificities, the diverse group of PLD enzymes can bind to a phosphodiester moiety. Most of them are active as bi-lobed monomers or dimers, and may possess similar core structures for catalytic activity. They are generally thought to utilize a common two-step ping-pong catalytic mechanism, involving an enzyme-substrate intermediate, to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group." Q#3994 - CGI_10014616 superfamily 128937 4 69 4.13E-12 57.6576 cl02743 DM9 superfamily - - Repeats found in Drosophila proteins; Repeats found in Drosophila proteins. 
Q#3994 - CGI_10014616 superfamily 128937 79 142 3.11E-08 46.872 cl02743 DM9 superfamily - - Repeats found in Drosophila proteins; Repeats found in Drosophila proteins. Q#3997 - CGI_10014619 superfamily 128937 30 91 6.37E-11 53.4204 cl02743 DM9 superfamily - - Repeats found in Drosophila proteins; Repeats found in Drosophila proteins. Q#3998 - CGI_10014620 superfamily 128937 3 69 1.20E-16 69.984 cl02743 DM9 superfamily - - Repeats found in Drosophila proteins; Repeats found in Drosophila proteins. Q#3998 - CGI_10014620 superfamily 128937 79 140 4.75E-12 57.6576 cl02743 DM9 superfamily - - Repeats found in Drosophila proteins; Repeats found in Drosophila proteins. Q#3999 - CGI_10014621 superfamily 128937 4 69 3.07E-09 49.5684 cl02743 DM9 superfamily - - Repeats found in Drosophila proteins; Repeats found in Drosophila proteins. Q#3999 - CGI_10014621 superfamily 128937 79 142 8.73E-09 48.4128 cl02743 DM9 superfamily - - Repeats found in Drosophila proteins; Repeats found in Drosophila proteins. Q#4000 - CGI_10014622 superfamily 241805 42 90 1.33E-08 46.9218 cl00349 S15_NS1_EPRS_RNA-bind superfamily - - "S15/NS1/EPRS_RNA-binding domain. This short domain consists of a helix-turn-helix structure, which can bind to several types of RNA. It is found in the ribosomal protein S15, the influenza A viral nonstructural protein (NSA) and in several eukaryotic aminoacyl tRNA synthetases (aaRSs), where it occurs as a single or a repeated unit. It is involved in both protein-RNA interactions by binding tRNA and protein-protein interactions in the formation of tRNA-synthetases into multienzyme complexes. While this domain lacks significant sequence similarity between the subgroups in which it is found, they share similar electrostatic surface potentials and thus are likely to bind to RNA via the same mechanism." 
Q#4001 - CGI_10014623 superfamily 245839 2 113 1.43E-29 110.272 cl12020 Anticodon_Ia_like superfamily - - "Anticodon-binding domain of class Ia aminoacyl tRNA synthetases and similar domains; This domain is found in a variety of class Ia aminoacyl tRNA synthetases, C-terminal to the catalytic core domain. It recognizes and specifically binds to the anticodon of the tRNA. Aminoacyl tRNA synthetases catalyze the transfer of cognate amino acids to the 3'-end of their tRNAs by specifically recognizing cognate from non-cognate amino acids. Members include valyl-, leucyl-, isoleucyl-, cysteinyl-, arginyl-, and methionyl-tRNA synthethases. This superfamily also includes a domain from MshC, an enzyme in the mycothiol biosynthetic pathway." Q#4001 - CGI_10014623 superfamily 241805 267 310 1.68E-08 50.1634 cl00349 S15_NS1_EPRS_RNA-bind superfamily - - "S15/NS1/EPRS_RNA-binding domain. This short domain consists of a helix-turn-helix structure, which can bind to several types of RNA. It is found in the ribosomal protein S15, the influenza A viral nonstructural protein (NSA) and in several eukaryotic aminoacyl tRNA synthetases (aaRSs), where it occurs as a single or a repeated unit. It is involved in both protein-RNA interactions by binding tRNA and protein-protein interactions in the formation of tRNA-synthetases into multienzyme complexes. While this domain lacks significant sequence similarity between the subgroups in which it is found, they share similar electrostatic surface potentials and thus are likely to bind to RNA via the same mechanism." Q#4002 - CGI_10014624 superfamily 241550 283 645 1.58E-150 441.971 cl00015 nt_trans superfamily - - "nucleotidyl transferase superfamily; nt_trans (nucleotidyl transferase) This superfamily includes the class I amino-acyl tRNA synthetases, pantothenate synthetase (PanC), ATP sulfurylase, and the cytidylyltransferases, all of which have a conserved dinucleotide-binding domain." 
Q#4002 - CGI_10014624 superfamily 243175 75 177 1.50E-13 67.5238 cl02776 GST_C_family superfamily - - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." 
Q#4003 - CGI_10014625 superfamily 248226 624 719 0.00228178 41.17 cl17672 PRK13629 superfamily N - threonine/serine transporter TdcC; Provisional Q#4003 - CGI_10014625 superfamily 218825 1804 1900 0.00717482 39.3383 cl05490 APC_basic superfamily C - APC basic domain; This region of the APC family of proteins is known as the basic domain. It contains a high proportion of positively charged amino acids and interacts with microtubules. Q#4004 - CGI_10014626 superfamily 241571 905 1016 3.61E-40 147.944 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#4004 - CGI_10014626 superfamily 241571 1140 1249 1.27E-38 143.707 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#4004 - CGI_10014626 superfamily 241571 2549 2660 1.22E-37 140.625 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#4004 - CGI_10014626 superfamily 241571 1021 1133 4.67E-37 139.085 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#4004 - CGI_10014626 superfamily 241571 1256 1366 1.50E-36 137.544 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#4004 - CGI_10014626 superfamily 241571 673 785 4.97E-35 133.307 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." 
Q#4004 - CGI_10014626 superfamily 241571 3130 3241 5.03E-35 133.307 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#4004 - CGI_10014626 superfamily 241571 1370 1479 5.52E-35 132.921 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#4004 - CGI_10014626 superfamily 241571 1714 1826 6.20E-35 132.921 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#4004 - CGI_10014626 superfamily 241571 1599 1712 1.90E-34 131.381 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#4004 - CGI_10014626 superfamily 241571 792 903 1.05E-32 126.373 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#4004 - CGI_10014626 superfamily 241571 2784 2895 1.07E-32 126.373 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#4004 - CGI_10014626 superfamily 241571 1482 1594 1.10E-32 126.373 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#4004 - CGI_10014626 superfamily 241571 1944 2054 1.25E-32 126.373 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." 
Q#4004 - CGI_10014626 superfamily 241571 2314 2426 3.51E-32 124.832 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#4004 - CGI_10014626 superfamily 241571 1832 1941 9.59E-32 123.677 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#4004 - CGI_10014626 superfamily 241571 3483 3594 1.34E-31 123.291 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#4004 - CGI_10014626 superfamily 241571 2087 2188 1.51E-30 120.21 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#4004 - CGI_10014626 superfamily 241571 3249 3360 1.82E-30 119.825 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#4004 - CGI_10014626 superfamily 241571 3015 3127 2.13E-30 119.825 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#4004 - CGI_10014626 superfamily 241571 3601 3712 4.00E-30 119.054 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#4004 - CGI_10014626 superfamily 241571 556 668 9.61E-30 117.899 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." 
Q#4004 - CGI_10014626 superfamily 241571 2191 2308 2.66E-27 110.965 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#4004 - CGI_10014626 superfamily 241571 2433 2544 8.98E-26 106.343 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#4004 - CGI_10014626 superfamily 241571 2666 2779 1.26E-25 105.957 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#4004 - CGI_10014626 superfamily 241571 2900 3013 3.25E-23 99.0238 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#4004 - CGI_10014626 superfamily 241571 3366 3480 2.59E-20 90.5494 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#4004 - CGI_10014626 superfamily 245213 514 550 2.13E-09 56.8762 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#4004 - CGI_10014626 superfamily 245213 244 284 1.00E-08 54.9502 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#4004 - CGI_10014626 superfamily 245213 208 241 1.60E-07 51.0982 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#4004 - CGI_10014626 superfamily 245213 480 512 3.46E-06 47.2462 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#4004 - CGI_10014626 superfamily 245213 338 375 0.000484637 41.083 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#4004 - CGI_10014626 superfamily 205157 386 424 3.14E-07 50.2287 cl18264 EGF_3 superfamily - - EGF domain; This family includes a variety of EGF-like domain homologues. This family includes the C-terminal domain of the malaria parasite MSP1 protein. Q#4004 - CGI_10014626 superfamily 205157 430 464 4.13E-06 47.1471 cl18264 EGF_3 superfamily - - EGF domain; This family includes a variety of EGF-like domain homologues. This family includes the C-terminal domain of the malaria parasite MSP1 protein. Q#4004 - CGI_10014626 superfamily 216053 122 214 0.00337287 39.199 cl02922 Flagellin_N superfamily C - Bacterial flagellin N-terminal helical region; Flagellins polymerise to form bacterial flagella. This family includes flagellins and hook associated protein 3. Structurally this family forms an extended helix that interacts with pfam00700. Q#4005 - CGI_10014627 superfamily 243093 62 146 1.02E-06 47.0797 cl02568 WSC superfamily - - WSC domain; This domain may be involved in carbohydrate binding. 
Q#4006 - CGI_10014628 superfamily 247727 79 181 4.87E-10 55.899 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#4007 - CGI_10014629 superfamily 241596 3 68 0.0049692 35.2675 cl00081 HLH superfamily - - "Helix-loop-helix domain, found in specific DNA- binding proteins that act as transcription factors; 60-100 amino acids long. A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE 5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved in both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." 
Q#4008 - CGI_10014630 superfamily 241570 902 1032 2.53E-17 80.4478 cl00047 CAP_ED superfamily - - "effector domain of the CAP family of transcription factors; members include CAP (or cAMP receptor protein (CRP)), which binds cAMP, FNR (fumarate and nitrate reduction), which uses an iron-sulfur cluster to sense oxygen) and CooA, a heme containing CO sensor. In all cases binding of the effector leads to conformational changes and the ability to activate transcription. Cyclic nucleotide-binding domain similar to CAP are also present in cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) and vertebrate cyclic nucleotide-gated ion-channels. Cyclic nucleotide-monophosphate binding domain; proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues; the best studied is the prokaryotic catabolite gene activator, CAP, where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure; three conserved glycine residues are thought to be essential for maintenance of the structural integrity of the beta-barrel; CooA is a homodimeric transcription factor that belongs to CAP family; cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain; cAPK's are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain; cGPK's are single chain enzymes that include the two copies of the domain in their N-terminal section; also found in vertebrate cyclic nucleotide-gated ion-channels" Q#4008 - CGI_10014630 superfamily 245312 111 363 3.07E-07 53.0123 cl10482 KefB superfamily C - "Kef-type K+ transport systems, membrane components [Inorganic ion transport and metabolism]" Q#4009 - CGI_10014631 superfamily 216300 140 351 5.36E-26 106.341 cl18362 Bac_surface_Ag superfamily - - "Surface antigen; This entry includes the following surface antigens; D15 antigen 
from H.influenzae, OMA87 from P.multocida, OMP85 from N.meningitidis and N.gonorrhoeae. The family also includes a number of eukaryotic proteins that are members of the UPF0140 family. There also appears to be a relationship to pfam03865 (personal obs: C Yeats). In eukaryotes, it appears that these proteins are not surface antigens; S. cerevisiae YNL026W (SAM50) is an essential component of the Sorting and Assembly Machinery (SAM) of the mitochondrial outer membrane. The protein was localised to the mitochondria." Q#4009 - CGI_10014631 superfamily 219346 13 92 0.00645354 34.6071 cl18507 Surf_Ag_VNR superfamily - - "Surface antigen variable number repeat; This family is found primarily in bacterial surface antigens, normally as variable number repeats at the N-terminus. The C-terminus of these proteins is normally represented by pfam01103. The alignment centres on a -GY- or -GF- motif. Some members of this family are found in the mitochondria. It is predicted to have a mixed alpha/beta secondary structure." Q#4011 - CGI_10014633 superfamily 241580 208 287 3.67E-39 136.53 cl00061 FH superfamily - - "Forkhead (FH), also known as a "winged helix". FH is named for the Drosophila fork head protein, a transcription factor which promotes terminal rather than segmental development. This family of transcription factor domains, which bind to B-DNA as monomers, are also found in the Hepatocyte nuclear factor (HNF) proteins, which provide tissue-specific gene regulation. The structure contains 2 flexible loops or "wings" in the C-terminal region, hence the term winged helix." Q#4012 - CGI_10014634 superfamily 247683 226 276 7.89E-18 76.1009 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). 
SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#4013 - CGI_10001670 superfamily 241563 85 121 2.55E-06 43.622 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#4015 - CGI_10001672 superfamily 241563 68 108 1.68E-05 42.8516 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#4016 - CGI_10001673 superfamily 241563 130 170 3.62E-06 44.7776 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#4017 - CGI_10004701 superfamily 150884 206 317 1.13E-46 159.222 cl10958 Med19 superfamily C - Mediator of RNA pol II transcription subunit 19; Med19 represents a family of conserved proteins which are members of the multi-protein co-activator Mediator complex. 
Mediator is required for activation of RNA polymerase II transcription by DNA binding transactivators. Q#4019 - CGI_10004703 superfamily 245201 11 282 3.60E-155 444.752 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#4020 - CGI_10015543 superfamily 149701 209 253 3.75E-23 89.1977 cl07373 Integrin_b_cyt superfamily - - "Integrin beta cytoplasmic domain; Integrins are a group of transmembrane proteins which function as extracellular matrix receptors and in cell adhesion. Integrins are ubiquitously expressed and are heterodimeric, each composed of an alpha and beta subunit. Several variations of the alpha and beta subunits exist, and association of different alpha and beta subunits can have a different binding specificity. This domain corresponds to the cytoplasmic domain of the beta subunit." Q#4020 - CGI_10015543 superfamily 219669 106 185 6.00E-17 73.1959 cl06832 Integrin_B_tail superfamily - - Integrin beta tail domain; This is the beta tail domain of the Integrin protein. Integrins are receptors which are involved in cell-cell and cell-extracellular matrix interactions. Q#4021 - CGI_10015544 superfamily 248097 74 197 2.79E-21 85.3946 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. 
Q#4022 - CGI_10015545 superfamily 215647 57 149 7.31E-05 42.2105 cl18338 7tm_2 superfamily C - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GPCRs). They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97, calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors, amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#4023 - CGI_10015546 superfamily 241599 289 347 2.42E-23 95.388 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#4023 - CGI_10015546 superfamily 245876 19 117 8.60E-12 63.289 cl12113 HSF_DNA-bind superfamily - - HSF-type DNA-binding; HSF-type DNA-binding. Q#4025 - CGI_10015548 superfamily 241599 160 217 3.39E-22 87.2988 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." 
Q#4026 - CGI_10015549 superfamily 247986 298 419 1.66E-05 45.4418 cl17432 PBPb superfamily C - "Bacterial periplasmic transport systems use membrane-bound complexes and substrate-bound, membrane-associated, periplasmic binding proteins (PBPs) to transport a wide variety of substrates, such as amino acids, peptides, sugars, vitamins and inorganic ions. PBPs have two cell-membrane translocation functions: bind substrate, and interact with the membrane bound complex. A diverse group of periplasmic transport receptors for lysine/arginine/ornithine (LAO), glutamine, histidine, sulfate, phosphate, molybdate, and methanol are included in the PBPb CD." Q#4026 - CGI_10015549 superfamily 197504 523 654 6.81E-37 135.495 cl18192 PBPe superfamily - - Eukaryotic homologues of bacterial periplasmic substrate binding proteins; Prokaryotic homologues are represented by a separate alignment: PBPb Q#4026 - CGI_10015549 superfamily 245225 1 205 3.07E-22 97.7348 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily - - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein coupled receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine-binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. 
The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#4027 - CGI_10015550 superfamily 245225 55 332 3.71E-10 60.9853 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily - - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein coupled receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine-binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. 
The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#4027 - CGI_10015550 superfamily 247986 377 466 5.13E-10 58.9238 cl17432 PBPb superfamily C - "Bacterial periplasmic transport systems use membrane-bound complexes and substrate-bound, membrane-associated, periplasmic binding proteins (PBPs) to transport a wide variety of substrates, such as, amino acids, peptides, sugars, vitamins and inorganic ions. PBPs have two cell-membrane translocation functions: bind substrate, and interact with the membrane bound complex. A diverse group of periplasmic transport receptors for lysine/arginine/ornithine (LAO), glutamine, histidine, sulfate, phosphate, molybdate, and methanol are included in the PBPb CD." 
Q#4027 - CGI_10015550 superfamily 197504 578 712 1.68E-43 153.984 cl18192 PBPe superfamily - - Eukaryotic homologues of bacterial periplasmic substrate binding proteins; Prokaryotic homologues are represented by a separate alignment: PBPb Q#4028 - CGI_10015551 superfamily 247986 369 470 4.74E-09 55.457 cl17432 PBPb superfamily C - "Bacterial periplasmic transport systems use membrane-bound complexes and substrate-bound, membrane-associated, periplasmic binding proteins (PBPs) to transport a wide variety of substrates, such as amino acids, peptides, sugars, vitamins and inorganic ions. PBPs have two cell-membrane translocation functions: bind substrate, and interact with the membrane bound complex. A diverse group of periplasmic transport receptors for lysine/arginine/ornithine (LAO), glutamine, histidine, sulfate, phosphate, molybdate, and methanol are included in the PBPb CD." Q#4028 - CGI_10015551 superfamily 245225 67 332 6.15E-07 49.97 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily - - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein coupled receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine-binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. 
The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#4029 - CGI_10015552 superfamily 243058 106 221 1.69E-12 65.4135 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#4029 - CGI_10015552 superfamily 243058 443 556 1.17E-09 56.9391 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. 
An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#4029 - CGI_10015552 superfamily 243058 30 139 4.95E-09 55.0131 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#4029 - CGI_10015552 superfamily 243058 370 478 5.62E-08 51.9315 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. 
ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#4029 - CGI_10015552 superfamily 243058 186 290 1.51E-07 50.3907 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#4029 - CGI_10015552 superfamily 222722 747 885 2.50E-21 93.427 cl18686 EDR1 superfamily N - "Ethylene-responsive protein kinase Le-CTR1; EDR1 regulates disease resistance and ethylene-induced senescence, and is also involved in stress response signalling and cell death regulation." Q#4031 - CGI_10015554 superfamily 246681 1046 1241 3.65E-97 309.624 cl14643 SRPBCC superfamily - - "START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC (SRPBCC) ligand-binding domain superfamily; SRPBCC domains have a deep hydrophobic ligand-binding pocket; they bind diverse ligands. 
Included in this superfamily are the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of mammalian STARD1-STARD15, and the C-terminal catalytic domains of the alpha oxygenase subunit of Rieske-type non-heme iron aromatic ring-hydroxylating oxygenases (RHOs_alpha_C), as well as the SRPBCC domains of phosphatidylinositol transfer proteins (PITPs), Bet v 1 (the major pollen allergen of white birch, Betula verrucosa), CoxG, CalC, and related proteins. Other members of this superfamily include PYR/PYL/RCAR plant proteins, the aromatase/cyclase (ARO/CYC) domains of proteins such as Streptomyces glaucescens tetracenomycin, and the SRPBCC domains of Streptococcus mutans Smu.440 and related proteins." Q#4031 - CGI_10015554 superfamily 247057 154 213 1.01E-31 119.83 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#4031 - CGI_10015554 superfamily 243095 793 1011 7.32E-105 332.075 cl02570 RhoGAP superfamily - - "RhoGAP: GTPase-activator protein (GAP) for Rho-like GTPases; GAPs towards Rho/Rac/Cdc42-like small GTPases. Small GTPases (G proteins) cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when bound to GDP. 
The Rho family of small G proteins, which includes Cdc42Hs, activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. G proteins generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. The RhoGAPs are one of the major classes of regulators of Rho G proteins." Q#4033 - CGI_10015556 superfamily 241578 105 278 5.95E-47 161.402 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#4033 - CGI_10015556 superfamily 247097 290 324 0.00821797 34.349 cl15839 ShK superfamily - - ShK domain-like; This domain is found in several C. elegans proteins. The domain is 30 amino acids long and rich in cysteine residues. There are 6 conserved cysteine positions in the domain that form three disulphide bridges. The domain is found in the potassium channel inhibitor ShK in sea anemone. 
Q#4034 - CGI_10015557 superfamily 243064 268 304 4.94E-05 42.4026 cl02512 NTR_like superfamily NC - "NTR_like domain; a beta barrel with an oligosaccharide/oligonucleotide-binding fold found in netrins, complement proteins, tissue inhibitors of metalloproteases (TIMP), and procollagen C-proteinase enhancers (PCOLCE), amongst others. In netrins, the domain plays a role in controlling axon branching in neural development, while the common function of these modules in TIMPs appears to be binding to metzincins. A subset of this family is also known as the C345C domain because it occurs as a C-terminal domain in complement C3, C4 and C5. In C5, the domain interacts with various partners during the formation of the membrane attack complex." Q#4039 - CGI_10015562 superfamily 217294 180 433 3.05E-122 361.381 cl08381 GatB_N superfamily - - GatB/GatE catalytic domain; This domain is found in the GatB and GatE proteins. Q#4040 - CGI_10015563 superfamily 220656 33 138 5.48E-22 85.7669 cl10939 Erf4 superfamily - - Golgin subfamily A member 7/ERF4 family; This family of proteins includes Golgin subfamily A member 7 proteins as well as Ras modification protein ERF4. Q#4041 - CGI_10015564 superfamily 241682 63 120 5.97E-08 47.4159 cl00203 Ribosomal_L30_like superfamily - - "Ribosomal protein L30, which is found in eukaryotes and prokaryotes but not in archaea, is one of the smallest ribosomal proteins with a molecular mass of about 7 kDa. L30 binds the 23S rRNA as well as the 5S rRNA and is one of five ribosomal proteins that mediate the interactions 5S rRNA makes with the ribosome. The eukaryotic L30 members have N- and/or C-terminal extensions not found in their prokaryotic orthologs. L30 is closely related to the ribosomal L7 protein found in eukaryotes and archaea." Q#4042 - CGI_10015565 superfamily 248012 436 572 1.37E-23 97.394 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. 
Q#4043 - CGI_10015566 superfamily 243263 3 388 2.29E-72 236.921 cl02990 ASC superfamily - - Amiloride-sensitive sodium channel; Amiloride-sensitive sodium channel. Q#4044 - CGI_10015567 superfamily 241610 48 99 6.53E-20 77.2902 cl00101 KU superfamily - - BPTI/Kunitz family of serine protease inhibitors; Structure is a disulfide rich alpha+beta fold. BPTI (bovine pancreatic trypsin inhibitor) is an extensively studied model structure. Q#4050 - CGI_10014670 superfamily 219542 38 147 1.43E-38 137.758 cl18517 Cu-oxidase_3 superfamily - - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. Q#4050 - CGI_10014670 superfamily 219541 420 561 1.64E-22 94.0723 cl18516 Cu-oxidase_2 superfamily N - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. Q#4050 - CGI_10014670 superfamily 215896 154 314 3.78E-12 64.2384 cl18351 Cu-oxidase superfamily - - Multicopper oxidase; Many of the proteins in this family contain multiple similar copies of this plastocyanin-like domain. Q#4051 - CGI_10014671 superfamily 219542 44 148 1.51E-41 146.233 cl18517 Cu-oxidase_3 superfamily - - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. Q#4051 - CGI_10014671 superfamily 219541 419 565 9.17E-25 100.235 cl18516 Cu-oxidase_2 superfamily N - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. Q#4051 - CGI_10014671 superfamily 215896 177 311 2.06E-18 82.3428 cl18351 Cu-oxidase superfamily - - Multicopper oxidase; Many of the proteins in this family contain multiple similar copies of this plastocyanin-like domain. 
Q#4052 - CGI_10014672 superfamily 248097 3 121 5.87E-14 63.4382 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#4053 - CGI_10014673 superfamily 248097 5 117 6.76E-11 54.9638 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#4054 - CGI_10014674 superfamily 246925 97 181 0.00255732 40.0314 cl15309 LRR_RI superfamily NC - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#4056 - CGI_10014677 superfamily 217380 711 946 1.81E-72 244.542 cl18406 TTL superfamily - - "Tubulin-tyrosine ligase family; Tubulins and microtubules are subjected to several post-translational modifications of which the reversible detyrosination/tyrosination of the carboxy-terminal end of most alpha-tubulins has been extensively analysed. This modification cycle involves a specific carboxypeptidase and the activity of the tubulin-tyrosine ligase (TTL). The true physiological function of TTL has so far not been established. Tubulin-tyrosine ligase (TTL) catalyzes the ATP-dependent post-translational addition of a tyrosine to the carboxy terminal end of detyrosinated alpha-tubulin. In normally cycling cells, the tyrosinated form of tubulin predominates. However, in breast cancer cells, the detyrosinated form frequently predominates, with a correlation to tumour aggressiveness. 
On the other hand, 3-nitrotyrosine has been shown to be incorporated, by TTL, into the carboxy terminal end of detyrosinated alpha-tubulin. This reaction is not reversible by the carboxypeptidase enzyme. Cells cultured in 3-nitrotyrosine rich medium showed evidence of altered microtubule structure and function, including altered cell morphology, epithelial barrier dysfunction, and apoptosis. Bacterial homologs of TTL are predicted to form peptide tags. Some of these are fused to a 2-oxoglutarate Fe(II)-dependent dioxygenase domain." Q#4057 - CGI_10014678 superfamily 246680 367 443 3.85E-16 73.3899 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#4058 - CGI_10014679 superfamily 246680 12 94 1.08E-16 69.3262 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. 
They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#4059 - CGI_10014680 superfamily 214773 137 375 3.05E-79 247.335 cl18315 CAP10 superfamily - - Putative lipopolysaccharide-modifying enzyme; Putative lipopolysaccharide-modifying enzyme. Q#4060 - CGI_10014681 superfamily 204791 7 286 1.53E-89 275.919 cl13393 WASH_WAHD superfamily - - "WAHD domain of WASH complex; This domain forms part of the WASH-complex of domains and proteins that activates the Arp2/3 complex, see pfam04062. The Arp2/3 complex regulates endocytosis, sorting, and trafficking within the cell. The WAHD domain attaches to the FAM21 proteins via its N-terminal residues and to the microtubules via its C-terminal residues." Q#4061 - CGI_10014682 superfamily 241758 7 150 2.92E-14 65.469 cl00292 AANH_like superfamily - - "Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases, ATP sulphurylases, Universal Stress Response protein and electron transfer flavoprotein (ETF). The domain forms an alpha/beta/alpha fold which binds to adenosine nucleotide." Q#4062 - CGI_10014683 superfamily 241758 7 147 1.13E-19 80.1066 cl00292 AANH_like superfamily - - "Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases, ATP sulphurylases, Universal Stress Response protein and electron transfer flavoprotein (ETF). The domain forms an alpha/beta/alpha fold which binds to adenosine nucleotide." Q#4063 - CGI_10014684 superfamily 247856 100 173 8.21E-11 54.4761 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. 
Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#4063 - CGI_10014684 superfamily 247856 65 126 2.42E-09 50.2389 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#4063 - CGI_10014684 superfamily 244899 40 85 0.00457968 33.6174 cl08302 S-100 superfamily N - "S-100: S-100 domain, which represents the largest family within the superfamily of proteins carrying the Ca-binding EF-hand motif. Note that this S-100 hierarchy contains only S-100 EF-hand domains, other EF-hands have been modeled separately. S100 proteins are expressed exclusively in vertebrates, and are implicated in intracellular and extracellular regulatory activities. Intracellularly, S100 proteins act as Ca-signaling or Ca-buffering proteins. The most unusual characteristic of certain S100 proteins is their occurrence in extracellular space, where they act in a cytokine-like manner through RAGE, the receptor for advanced glycation products. Structural data suggest that many S100 members exist within cells as homo- or heterodimers and even oligomers; oligomerization contributes to their functional diversification. Upon binding calcium, most S100 proteins change conformation to a more open structure exposing a hydrophobic cleft. This hydrophobic surface represents the interaction site of S100 proteins with their target proteins. There is experimental evidence showing that many S100 proteins have multiple binding partners with diverse mode of interaction with different targets. 
In addition to S100 proteins (such as S100A1,-3,-4,-6,-7,-10,-11,and -13), this group includes the ''fused'' gene family, a group of calcium binding S100-related proteins. The ''fused'' gene family includes multifunctional epidermal differentiation proteins - profilaggrin, trichohyalin, repetin, hornerin, and cornulin; functionally these proteins are associated with keratin intermediate filaments and partially crosslinked to the cell envelope. These ''fused'' gene proteins contain N-terminal sequence with two Ca-binding EF-hands motif, which may be associated with calcium signaling in epidermal cells and autoprocessing in a calcium-dependent manner. In contrast to S100 proteins, "fused" gene family proteins contain an extraordinary high number of almost perfect peptide repeats with regular array of polar and charged residues similar to many known cell envelope proteins." Q#4064 - CGI_10014685 superfamily 241758 16 96 7.71E-19 76.2546 cl00292 AANH_like superfamily N - "Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases, ATP sulphurylases Universal Stress Response protein and electron transfer flavoprotein (ETF). The domain forms a apha/beta/apha fold which binds to Adenosine nucleotide." Q#4065 - CGI_10014686 superfamily 247724 2 29 0.00318347 36.2316 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#4067 - CGI_10014688 superfamily 221676 167 242 1.43E-05 45.8658 cl14995 Vezatin superfamily NC - Myosin-binding motif of peroxisomes; Vezatin is a peroxisome transmembrane receptor that is involved in membrane-membrane and cell-cell adhesions. In the movement of peroxisomes it binds to class V and class VIIa myosins to guide the organelle through the microtubules and allow pathogens to internalise themselves into host cells. Vezatin is crucial for spermatozoan production. In mouse cells it interacts with the cadherin-catenin complex bridging it to the C-terminal FERM domain of myosin VIIA. Q#4068 - CGI_10014689 superfamily 241567 66 234 9.43E-15 71.4775 cl00042 CASc superfamily C - "Caspase, interleukin-1 beta converting enzyme (ICE) homologues; Cysteine-dependent aspartate-directed proteases that mediate programmed cell death (apoptosis). Caspases are synthesized as inactive zymogens and activated by proteolysis of the peptide backbone adjacent to an aspartate. The resulting two subunits associate to form an (alpha)2(beta)2-tetramer which is the active enzyme. Activation of caspases can be mediated by other caspase homologs." 
Q#4069 - CGI_10014690 superfamily 241567 44 215 4.22E-19 81.901 cl00042 CASc superfamily C - "Caspase, interleukin-1 beta converting enzyme (ICE) homologues; Cysteine-dependent aspartate-directed proteases that mediate programmed cell death (apoptosis). Caspases are synthesized as inactive zymogens and activated by proteolysis of the peptide backbone adjacent to an aspartate. The resulting two subunits associate to form an (alpha)2(beta)2-tetramer which is the active enzyme. Activation of caspases can be mediated by other caspase homologs." Q#4074 - CGI_10009148 superfamily 241571 346 452 2.76E-23 99.409 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#4074 - CGI_10009148 superfamily 241571 232 339 1.62E-14 73.9858 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#4074 - CGI_10009148 superfamily 241568 552 598 4.97E-09 56.3172 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." 
Q#4074 - CGI_10009148 superfamily 241613 193 228 6.73E-09 55.2906 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#4074 - CGI_10009148 superfamily 241568 660 714 7.54E-09 55.932 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#4074 - CGI_10009148 superfamily 241568 1002 1056 1.07E-08 55.1616 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. 
The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#4074 - CGI_10009148 superfamily 245213 3396 3430 5.16E-08 53.0242 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#4074 - CGI_10009148 superfamily 245213 3282 3318 5.55E-08 53.0242 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#4074 - CGI_10009148 superfamily 245213 3170 3207 1.28E-07 51.8686 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#4074 - CGI_10009148 superfamily 245213 3433 3469 2.10E-07 51.0982 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#4074 - CGI_10009148 superfamily 241568 1060 1102 1.82E-06 48.6132 cl00043 CCP superfamily C - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." 
Q#4074 - CGI_10009148 superfamily 241568 936 988 3.75E-06 47.8428 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#4074 - CGI_10009148 superfamily 241571 454 541 4.44E-06 48.1775 cl00049 CUB superfamily C - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#4074 - CGI_10009148 superfamily 241568 602 656 5.64E-06 47.4576 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#4074 - CGI_10009148 superfamily 245213 3245 3279 5.64E-06 46.861 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#4074 - CGI_10009148 superfamily 241568 719 772 1.72E-05 45.9168 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#4074 - CGI_10009148 superfamily 245213 3514 3545 4.34E-05 44.5498 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#4074 - CGI_10009148 superfamily 245213 815 846 4.87E-05 44.1646 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#4074 - CGI_10009148 superfamily 245213 773 804 0.000324286 41.8534 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#4074 - CGI_10009148 superfamily 245213 3106 3134 0.00146673 39.9274 cl09941 EGF_CA superfamily N - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#4074 - CGI_10009148 superfamily 245213 3057 3096 0.00195847 39.5422 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#4074 - CGI_10009148 superfamily 245213 3475 3507 0.00417034 38.3866 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#4074 - CGI_10009148 superfamily 245213 4301 4342 0.00870653 37.6162 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#4074 - CGI_10009148 superfamily 111397 3792 3870 1.74E-11 63.8994 cl03620 HYR superfamily - - "HYR domain; This domain is known as the HYR (Hyalin Repeat) domain, after the protein hyalin that is composed exclusively of this repeat. This domain probably corresponds to a new superfamily in the immunoglobulin fold. The function of this domain is uncertain it may be involved in cell adhesion." Q#4074 - CGI_10009148 superfamily 219525 4090 4137 2.98E-10 59.7401 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#4074 - CGI_10009148 superfamily 219525 4252 4297 3.86E-08 53.5769 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#4074 - CGI_10009148 superfamily 241611 3593 3733 2.17E-07 52.776 cl00102 PTX superfamily - - "Pentraxins are plasma proteins characterized by their pentameric discoid assembly and their Ca2+ dependent ligand binding, such as Serum amyloid P component (SAP) and C-reactive Protein (CRP), which are cytokine-inducible acute-phase proteins implicated in innate immunity. CRP binds to ligands containing phosphocholine, SAP binds to amyloid fibrils, DNA, chromatin, fibronectin, C4-binding proteins and glycosaminoglycans. "Long" pentraxins have N-terminal extensions to the common pentraxin domain; one group, the neuronal pentraxins, may be involved in synapse formation and remodeling, and they may also be able to form heteromultimers." Q#4074 - CGI_10009148 superfamily 219525 4144 4191 3.09E-07 50.8805 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#4074 - CGI_10009148 superfamily 219525 4204 4244 7.55E-06 46.6433 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#4074 - CGI_10009148 superfamily 243035 104 190 1.31E-05 46.9142 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. 
This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. 
In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#4074 - CGI_10009148 superfamily 219525 2953 3000 1.96E-05 45.4877 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#4074 - CGI_10009148 superfamily 219525 3007 3054 3.56E-05 44.7174 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#4075 - CGI_10009149 superfamily 241591 21 85 1.02E-12 60.7127 cl00073 H15 superfamily C - "linker histone 1 and histone 5 domains; the basic subunit of chromatin is the nucleosome, consisting of an octamer of core histones, two full turns of DNA, a linker histone (H1 or H5) and a variable length of linker DNA; H1/H5 are chromatin-associated proteins that bind to the exterior of nucleosomes and dramatically stabilize the highly condensed states of chromatin fibers; stabilization of higher order folding occurs through electrostatic neutralization of the linker DNA segments, through a highly positively charged carboxy- terminal domain known as the AKP helix (Ala, Lys, Pro); thought to be involved in specific protein-protein and protein-DNA interactions and play a role in suppressing core histone tail domain acetylation in the chromatin fiber" Q#4077 - CGI_10009151 superfamily 246680 134 203 1.39E-09 54.5152 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. 
Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#4078 - CGI_10009152 superfamily 246680 134 203 2.82E-09 53.3596 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#4080 - CGI_10009154 superfamily 150458 114 158 2.10E-11 55.7807 cl10765 Oxidored-like superfamily - - "Oxidoreductase-like protein, N-terminal; Members of this family are found in the N terminal region of various oxidoreductase like proteins. Their exact function is, as yet, unknown." Q#4081 - CGI_10009156 superfamily 246925 126 275 1.46E-12 66.9953 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. 
LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#4085 - CGI_10007449 superfamily 241754 52 385 0 636.737 cl00286 Motor_domain superfamily - - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. Q#4086 - CGI_10007450 superfamily 192483 78 132 4.29E-13 63.0394 cl10893 Pet191_N superfamily - - Cytochrome c oxidase assembly protein PET191; Pet191_N is the conserved N-terminal of a family of conserved proteins found from nematodes to humans. It carries six highly conserved cysteine residues. Pet191 is required for the assembly of active cytochrome c oxidase but does not form part of the final assembled complex. Q#4089 - CGI_10007453 superfamily 218223 51 550 7.64E-135 404.058 cl04698 Radial_spoke superfamily - - "Radial spokehead-like protein; This family includes the radial spoke head proteins RSP4 and RSP6 from Chlamydomonas reinhardtii, and several eukaryotic homologues, including mammalian RSHL1, the protein product of a familial ciliary dyskinesia candidate gene." Q#4091 - CGI_10007455 superfamily 241822 186 234 0.00123658 35.9133 cl00373 Ribosomal_S18 superfamily - - Ribosomal protein S18; Ribosomal protein S18. 
Q#4092 - CGI_10007456 superfamily 241637 83 148 4.55E-06 45.377 cl00146 TFIIS_I superfamily - - N-terminal domain (domain I) of transcription elongation factor S-II (TFIIS); similar to a domain found in elongin A and CRSP70; likely to be involved in transcription; domain I from TFIIS interacts with RNA polymerase II holoenzyme Q#4092 - CGI_10007456 superfamily 245716 849 870 1.04E-05 43.7721 cl11592 zf-CCCH superfamily - - Zinc finger C-x8-C-x5-C-x3-H type (and similar); Zinc finger C-x8-C-x5-C-x3-H type (and similar). Q#4094 - CGI_10007458 superfamily 241547 68 316 4.08E-71 239.623 cl00012 alpha_CA superfamily - - "Carbonic anhydrase alpha (vertebrate-like) group. Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionary distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidine residues and a fourth conserved histidine plays a potential role in proton transfer." Q#4095 - CGI_10007459 superfamily 218601 6 148 4.72E-39 138.514 cl05181 SURF2 superfamily C - "Surfeit locus protein 2 (SURF2); Surfeit locus protein 2 is part of a group of at least six sequence unrelated genes (Surf-1 to Surf-6). The six Surfeit genes have been classified as housekeeping genes, being expressed in all tissue types tested and not containing a TATA box in their promoter region. The exact function of SURF2 is unknown." 
Q#4096 - CGI_10007460 superfamily 217293 51 263 1.71E-47 163.573 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#4096 - CGI_10007460 superfamily 202474 272 363 3.08E-29 113.518 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#4096 - CGI_10007460 superfamily 202474 396 424 0.00044246 40.3297 cl08379 Neur_chan_memb superfamily N - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#4097 - CGI_10007461 superfamily 217293 80 297 8.64E-37 135.068 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#4097 - CGI_10007461 superfamily 202474 305 386 1.46E-29 114.673 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#4097 - CGI_10007461 superfamily 202474 419 448 0.00936713 36.0925 cl08379 Neur_chan_memb superfamily N - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#4098 - CGI_10007462 superfamily 202474 6 87 8.28E-21 84.6277 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. 
Q#4098 - CGI_10007462 superfamily 202474 103 140 1.04E-05 42.2557 cl08379 Neur_chan_memb superfamily N - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#4099 - CGI_10007463 superfamily 241563 93 132 0.00035173 38.9996 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#4101 - CGI_10000714 superfamily 245225 16 136 7.01E-17 75.4256 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily NC - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein couples receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine- binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. 
The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#4102 - CGI_10027552 superfamily 245205 96 175 3.84E-10 55.3217 cl09930 RPA_2b-aaRSs_OBF_like superfamily - - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." 
Q#4105 - CGI_10027555 superfamily 243035 249 379 4.17E-25 100.387 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#4107 - CGI_10027557 superfamily 243035 268 395 4.80E-25 98.4609 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. 
CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#4107 - CGI_10027557 superfamily 243035 60 186 5.83E-25 98.4609 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. 
Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. 
A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#4111 - CGI_10027561 superfamily 243035 107 219 2.20E-21 86.5197 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. 
C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#4111 - CGI_10027561 superfamily 245814 52 86 0.00609633 33.882 cl11960 Ig superfamily N - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#4112 - CGI_10027562 superfamily 245814 282 348 1.80E-09 54.9136 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. 
The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#4112 - CGI_10027562 superfamily 221377 19 132 3.41E-09 55.1674 cl13449 DUF3504 superfamily C - Domain of unknown function (DUF3504); This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 156 to 173 amino acids in length. Q#4112 - CGI_10027562 superfamily 245814 121 181 9.24E-05 40.8532 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#4114 - CGI_10027564 superfamily 245596 252 470 3.24E-62 206.775 cl11394 Glyco_tranf_GTA_type superfamily - - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#4116 - CGI_10027566 superfamily 247724 10 175 4.18E-128 360.876 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#4117 - CGI_10027567 superfamily 246722 256 390 8.97E-106 311.095 cl14812 PIN_SF superfamily - - "PIN (PilT N terminus) domain: Superfamily; PIN_SF The PIN (PilT N terminus) domain belongs to a large nuclease superfamily with representatives from eukaryota, eubacteria, and archaea. PIN domains were originally named for their sequence similarity to the N-terminal domain of an annotated pili biogenesis protein, PilT, a domain fusion between a PIN-domain and a PilT ATPase domain. The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues) is geometrically similar in the active center of structure-specific 5' nucleases (also known as Flap endonuclease-1-like), PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Seen here, are two major divisions in the PIN domain superfamily. The first major division, the structure-specific 5' nuclease family, is represented by FEN1, the 5'-3' exonuclease of DNA polymerase I, and T4 RNase H nuclease PIN domains. These 5' nucleases are involved in DNA replication, repair, and recombination. 
They are capable of both 5'-3' exonucleolytic activity and cleaving bifurcated DNA, in an endonucleolytic, structure-specific manner. Unique to FEN1-like nucleases, the PIN domain has a helical arch/clamp region (I domain) of variable length (approximately 16 to 800 residues) and, inserted within the C-terminal region of the PIN domain, a H3TH (helix-3-turn-helix) domain, an atypical helix-hairpin-helix-2-like region. Both the H3TH domain (not included here) and the helical arch/clamp region are involved in DNA binding. With the exception of Mkt1, these nucleases have a carboxylate rich active site that is involved in binding essential divalent metal ion cofactors (Mg2+, Mn2+, Zn2+, or Co2+). The second major division of the PIN domain superfamily, the VapC-Smg6 family, includes such eukaryotic ribonucleases as, Smg6, an essential factor in nonsense-mediated mRNA decay; Rrp44, the catalytic subunit of the exosome; and Nob1, a ribosome assembly factor critical in pre-rRNA processing. A large percentage of members in this family are bacterial ribonuclease toxins of TA operons such as Mycobacterium tuberculosis VapC and Neisseria gonorrhoeae FitB, as well as, archaeal homologs, Pyrobaculum aerophilum Pea0151 and P. aerophilum Pae2754. Also included are the eukaryotic Fcf1/ Utp24 (FAF1-copurifying factor 1/U three-associated protein 24) and Utp23-like proteins. Components of the small subunit processome, Fcf1/Utp24 and Utp23 are essential proteins involved in pre-rRNA processing and 40S ribosomal subunit assembly." Q#4117 - CGI_10027567 superfamily 207684 84 118 7.04E-07 45.8327 cl02640 SAP superfamily - - "SAP domain; The SAP (after SAF-A/B, Acinus and PIAS) motif is a putative DNA/RNA binding domain found in diverse nuclear and cytoplasmic proteins." Q#4118 - CGI_10027568 superfamily 241607 451 499 7.62E-20 84.6633 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. 
Kazal inhibitors inhibit serine proteases, such as, trypsin, chymotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters), which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#4118 - CGI_10027568 superfamily 248458 45 274 1.47E-10 61.9461 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. 
Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#4119 - CGI_10027569 superfamily 241752 647 989 1.74E-164 489.089 cl00283 ADP_ribosyl superfamily - - "ADP_ribosylating enzymes catalyze the transfer of ADP_ribose from NAD+ to substrates. Bacterial toxins are cytoplasmic and catalyze the transfer of a single ADP_ribose unit to eukaryotic elongation factor 2, halting protein synthesis and killing the cell. Poly(ADP-ribose) polymerases (PARPS 1-3, VPARP, tankyrase) catalyze the addition of up to 100 ADP_ribose units from NAD+. PARPs 1 and 2 are localized in the nucleus, bind DNA, and are activated by DNA damage. VPARP is part of the vault ribonucleoprotein complex. Tankyrases regulate telomere length in part through poly(ADP_ribosylation) of telomere repeat binding factor 1 (TRF1). 
Poly(ADP-ribose) polymerase catalyses the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active." Q#4119 - CGI_10027569 superfamily 242589 520 623 1.36E-49 171.624 cl01581 WGR superfamily - - "WGR domain; The WGR domain is found in a variety of eukaryotic poly(ADP-ribose) polymerases (PARPs) as well as the putative Escherichia coli molybdate metabolism regulator and related bacterial proteins, a small family of bacterial DNA ligases, and various other bacterial proteins of unknown function. It has been called WGR after the most conserved central motif of the domain. The domain occurs in single-domain proteins and in a variety of domain architectures, and is between 70 and 80 residues in length. It has been proposed to function as a nucleic acid binding domain." Q#4119 - CGI_10027569 superfamily 241565 374 447 0.00415293 36.5307 cl00038 BRCT superfamily - - "Breast Cancer Suppressor Protein (BRCA1), carboxy-terminal domain. The BRCT domain is found within many DNA damage repair and cell cycle checkpoint proteins. The unique diversity of this domain superfamily allows BRCT modules to interact forming homo/hetero BRCT multimers, BRCT-non-BRCT interactions, and interactions within DNA strand breaks." Q#4119 - CGI_10027569 superfamily 189650 12 86 4.94E-28 109.679 cl02913 zf-PARP superfamily - - Poly(ADP-ribose) polymerase and DNA-Ligase Zn-finger region; Poly(ADP-ribose) polymerase is an important regulatory component of the cellular response to DNA damage. The amino-terminal region of Poly(ADP-ribose) polymerase consists of two PARP-type zinc fingers. This region acts as a DNA nick sensor. 
Q#4119 - CGI_10027569 superfamily 191934 270 324 6.41E-21 88.5062 cl06892 PADR1 superfamily - - PADR1 (NUC008) domain; This domain is found in poly(ADP-ribose)-synthetases. The function of this domain is unknown. Q#4119 - CGI_10027569 superfamily 189650 112 192 3.86E-20 86.9522 cl02913 zf-PARP superfamily - - Poly(ADP-ribose) polymerase and DNA-Ligase Zn-finger region; Poly(ADP-ribose) polymerase is an important regulatory component of the cellular response to DNA damage. The amino-terminal region of Poly(ADP-ribose) polymerase consists of two PARP-type zinc fingers. This region acts as a DNA nick sensor. Q#4120 - CGI_10027570 superfamily 241599 130 184 2.04E-13 63.0312 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#4121 - CGI_10027571 superfamily 187395 46 80 0.0050963 36.2778 cl14622 flgI superfamily NC - flagellar basal body P-ring protein; Reviewed Q#4125 - CGI_10027576 superfamily 219677 69 95 0.0018631 37.4172 cl18521 EGF_2 superfamily - - EGF-like domain; This family contains EGF domains found in a variety of extracellular proteins. Q#4127 - CGI_10027579 superfamily 241619 889 931 1.93E-06 50.1609 cl00112 PAN_APPLE superfamily C - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." 
Q#4127 - CGI_10027579 superfamily 243065 3078 3238 2.28E-12 69.3529 cl02516 VWD superfamily - - von Willebrand factor type D domain; Luciferin-2-monooxygenase from Vargula hilgendorfii contains a vwd domain. Its function is unrelated but the similarity is very strong by several methods. Q#4127 - CGI_10027579 superfamily 243065 5081 5244 1.76E-11 66.6565 cl02516 VWD superfamily - - von Willebrand factor type D domain; Luciferin-2-monooxygenase from Vargula hilgendorfii contains a vwd domain. Its function is unrelated but the similarity is very strong by several methods. Q#4127 - CGI_10027579 superfamily 243065 8300 8453 4.25E-11 65.5409 cl02516 VWD superfamily - - von Willebrand factor type D domain; Luciferin-2-monooxygenase from Vargula hilgendorfii contains a vwd domain. Its function is unrelated but the similarity is very strong by several methods. Q#4127 - CGI_10027579 superfamily 243065 6093 6253 5.53E-11 65.1157 cl02516 VWD superfamily - - von Willebrand factor type D domain; Luciferin-2-monooxygenase from Vargula hilgendorfii contains a vwd domain. Its function is unrelated but the similarity is very strong by several methods. Q#4127 - CGI_10027579 superfamily 243065 1309 1472 6.28E-10 61.6889 cl02516 VWD superfamily - - von Willebrand factor type D domain; Luciferin-2-monooxygenase from Vargula hilgendorfii contains a vwd domain. Its function is unrelated but the similarity is very strong by several methods. Q#4127 - CGI_10027579 superfamily 243065 2282 2442 2.02E-09 60.1081 cl02516 VWD superfamily - - von Willebrand factor type D domain; Luciferin-2-monooxygenase from Vargula hilgendorfii contains a vwd domain. Its function is unrelated but the similarity is very strong by several methods. Q#4127 - CGI_10027579 superfamily 243065 308 465 5.47E-09 58.9925 cl02516 VWD superfamily - - von Willebrand factor type D domain; Luciferin-2-monooxygenase from Vargula hilgendorfii contains a vwd domain. 
Its function is unrelated but the similarity is very strong by several methods. Q#4127 - CGI_10027579 superfamily 243065 7101 7261 1.07E-08 57.7969 cl02516 VWD superfamily - - von Willebrand factor type D domain; Luciferin-2-monooxygenase from Vargula hilgendorfii contains a vwd domain. Its function is unrelated but the similarity is very strong by several methods. Q#4127 - CGI_10027579 superfamily 241611 5682 5828 1.82E-05 47.7684 cl00102 PTX superfamily - - "Pentraxins are plasma proteins characterized by their pentameric discoid assembly and their Ca2+ dependent ligand binding, such as Serum amyloid P component (SAP) and C-reactive Protein (CRP), which are cytokine-inducible acute-phase proteins implicated in innate immunity. CRP binds to ligands containing phosphocholine, SAP binds to amyloid fibrils, DNA, chromatin, fibronectin, C4-binding proteins and glycosaminoglycans. "Long" pentraxins have N-terminal extensions to the common pentraxin domain; one group, the neuronal pentraxins, may be involved in synapse formation and remodeling, and they may also be able to form heteromultimers." Q#4127 - CGI_10027579 superfamily 241619 1953 2028 0.000289089 43.3397 cl00112 PAN_APPLE superfamily - - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#4127 - CGI_10027579 superfamily 243065 4085 4226 0.0057881 40.0777 cl02516 VWD superfamily - - von Willebrand factor type D domain; Luciferin-2-monooxygenase from Vargula hilgendorfii contains a vwd domain. 
Its function is unrelated but the similarity is very strong by several methods. Q#4128 - CGI_10027580 superfamily 247799 64 122 3.29E-11 58.2921 cl17245 KH-I superfamily - - "K homology RNA-binding domain, type I. KH binds single-stranded RNA or DNA. It is found in a wide variety of proteins including ribosomal proteins, transcription factors and post-transcriptional modifiers of mRNA. There are two different KH domains that belong to different protein folds, but they share a single KH motif. The KH motif is folded into a beta alpha alpha beta unit. In addition to the core, type II KH domains (e.g. ribosomal protein S3) include N-terminal extension and type I KH domains (e.g. hnRNP K) contain C-terminal extension." Q#4128 - CGI_10027580 superfamily 177724 131 357 1.56E-28 111.699 cl15624 PLN00108 superfamily - - unknown protein; Provisional Q#4129 - CGI_10027581 superfamily 219849 1829 1920 6.42E-17 79.9211 cl09597 RIH_assoc superfamily - - "RyR and IP3R Homology associated; This eukaryotic domain is found in ryanodine receptors (RyR) and inositol 1,4,5-trisphosphate receptors (IP3R) which together form a superfamily of homotetrameric ligand-gated intracellular Ca2+ channels. There seems to be no known function for this domain. Also see the IP3-binding domain pfam01365 and pfam02815." Q#4129 - CGI_10027581 superfamily 216456 520 672 1.46E-16 81.2158 cl03182 RYDR_ITPR superfamily - - "RIH domain; The RIH (RyR and IP3R Homology) domain is an extracellular domain from two types of calcium channels. This region is found in the ryanodine receptor and the inositol-1,4,5- trisphosphate receptor. This domain may form a binding site for IP3." Q#4129 - CGI_10027581 superfamily 216456 1182 1347 2.62E-06 49.2442 cl03182 RYDR_ITPR superfamily - - "RIH domain; The RIH (RyR and IP3R Homology) domain is an extracellular domain from two types of calcium channels. This region is found in the ryanodine receptor and the inositol-1,4,5- trisphosphate receptor. 
This domain may form a binding site for IP3." Q#4132 - CGI_10027584 superfamily 247724 50 243 2.43E-05 42.8288 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#4134 - CGI_10027586 superfamily 247683 1378 1439 1.65E-37 137.038 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. 
SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#4134 - CGI_10027586 superfamily 247683 584 645 7.34E-36 132.477 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#4134 - CGI_10027586 superfamily 247683 1495 1555 3.82E-24 98.6055 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. 
They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#4134 - CGI_10027586 superfamily 151039 243 330 0.000170549 42.0927 cl11115 Cenp-F_leu_zip superfamily C - "Leucine-rich repeats of kinetochore protein Cenp-F/LEK1; Cenp-F, a centromeric kinetochore, microtubule-binding protein consisting of two 1,600-amino acid-long coils, is essential for the full functioning of the mitotic checkpoint pathway. There are several leucine-rich repeats along the sequence of LEK1 that are considered to be zippers, though they do not appear to be binding DNA directly in this instance." Q#4134 - CGI_10027586 superfamily 241584 757 832 0.00574345 36.8237 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#4138 - CGI_10027590 superfamily 247856 365 419 1.87E-12 64.9335 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. 
Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#4139 - CGI_10027591 superfamily 241737 11 167 9.53E-91 264.406 cl00264 Ferritin_like superfamily - - "Ferritin-like superfamily of diiron-containing four-helix-bundle proteins; Ferritin-like, diiron-carboxylate proteins participate in a range of functions including iron regulation, mono-oxygenation, and reactive radical production. These proteins are characterized by the fact that they catalyze dioxygen-dependent oxidation-hydroxylation reactions within diiron centers; one exception is manganese catalase, which catalyzes peroxide-dependent oxidation-reduction within a dimanganese center. Diiron-carboxylate proteins are further characterized by the presence of duplicate metal ligands, glutamates and histidines (ExxH) and two additional glutamates within a four-helix bundle. Outside of these conserved residues there is little obvious homology. Members include bacterioferritin, ferritin, rubrerythrin, aromatic and alkene monooxygenase hydroxylases (AAMH), ribonucleotide reductase R2 (RNRR2), acyl-ACP-desaturases (Acyl_ACP_Desat), manganese (Mn) catalases, demethoxyubiquinone hydroxylases (DMQH), DNA protecting proteins (DPS), and ubiquinol oxidases (AOX), and the aerobic cyclase system, Fe-containing subunit (ACSF)." Q#4140 - CGI_10027592 superfamily 219502 306 358 5.93E-11 60.1507 cl06625 Nucleos_tra2_C superfamily C - Na+ dependent nucleoside transporter C-terminus; This family consists of nucleoside transport proteins. Rat CNT 2 is a purine-specific Na+-nucleoside cotransporter localised to the bile canalicular membrane. CNT 1 is a Na+-dependent nucleoside transporter selective for pyrimidine nucleosides and adenosine; it also transports the anti-viral nucleoside analogues AZT and ddC. This alignment covers the C-terminus of this family of transporters. 
Q#4140 - CGI_10027592 superfamily 201962 124 194 1.11E-10 57.0004 cl03347 Nucleos_tra2_N superfamily - - Na+ dependent nucleoside transporter N-terminus; This family consists of nucleoside transport proteins. Rat CNT 2 is a purine-specific Na+-nucleoside cotransporter localised to the bile canalicular membrane. Rat CNT 1 is a Na+-dependent nucleoside transporter selective for pyrimidine nucleosides and adenosine; it also transports the anti-viral nucleoside analogues AZT and ddC. This alignment covers the N terminus of this family. Q#4140 - CGI_10027592 superfamily 219507 203 287 2.39E-09 53.7823 cl18514 Gate superfamily - - "Nucleoside recognition; This region in the nucleoside transporter proteins is responsible for determining nucleoside specificity in the human CNT1 and CNT2 proteins. In the FeoB proteins, which are believed to be Fe2+ transporters, it includes the membrane pore region, so the function of this region is likely to be more general than just nucleoside specificity. This family may represent the pore and gate, with a wide potential range of specificity. Hence its name 'Gate'." Q#4141 - CGI_10027593 superfamily 219502 1 104 4.62E-42 139.887 cl06625 Nucleos_tra2_C superfamily N - Na+ dependent nucleoside transporter C-terminus; This family consists of nucleoside transport proteins. Rat CNT 2 is a purine-specific Na+-nucleoside cotransporter localised to the bile canalicular membrane. CNT 1 is a Na+-dependent nucleoside transporter selective for pyrimidine nucleosides and adenosine; it also transports the anti-viral nucleoside analogues AZT and ddC. This alignment covers the C-terminus of this family of transporters. Q#4142 - CGI_10027594 superfamily 245227 22 188 4.13E-25 109.013 cl10013 Glycosyltransferase_GTB_type superfamily N - "Glycosyltransferases catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. 
The acceptor molecule can be a lipid, a protein, a heterocyclic compound, or another carbohydrate residue. The structures of the formed glycoconjugates are extremely diverse, reflecting a wide range of biological functions. The members of this family share a common GTB topology, one of the two protein topologies observed for nucleotide-sugar-dependent glycosyltransferases. GTB proteins have distinct N- and C- terminal domains each containing a typical Rossmann fold. The two domains have high structural homology despite minimal sequence homology. The large cleft that separates the two domains includes the catalytic center and permits a high degree of flexibility." Q#4142 - CGI_10027594 superfamily 241559 245 375 1.58E-15 75.4251 cl00030 CH superfamily - - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." Q#4142 - CGI_10027594 superfamily 141488 412 480 3.37E-35 131.104 cl02524 GAS2 superfamily - - Growth-Arrest-Specific Protein 2 Domain; Growth-Arrest-Specific Protein 2 Domain. Q#4143 - CGI_10027595 superfamily 243081 242 479 9.50E-75 237.893 cl02549 OLF superfamily - - Olfactomedin-like domain; Olfactomedin-like domain. Q#4144 - CGI_10027596 superfamily 241758 41 162 7.97E-18 75.8694 cl00292 AANH_like superfamily - - "Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases, ATP sulphurylases, Universal Stress Response protein and electron transfer flavoprotein (ETF). The domain forms an alpha/beta/alpha fold which binds to Adenosine nucleotide." Q#4146 - CGI_10027598 superfamily 241547 369 455 1.48E-23 99.6647 cl00012 alpha_CA superfamily C - "Carbonic anhydrase alpha (vertebrate-like) group. 
Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionary distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidine residues and a fourth conserved histidine plays a potential role in proton transfer." Q#4146 - CGI_10027598 superfamily 241547 545 647 5.97E-17 80.0195 cl00012 alpha_CA superfamily N - "Carbonic anhydrase alpha (vertebrate-like) group. Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionary distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidine residues and a fourth conserved histidine plays a potential role in proton transfer." Q#4147 - CGI_10027599 superfamily 241547 75 192 7.82E-29 112.376 cl00012 alpha_CA superfamily C - "Carbonic anhydrase alpha (vertebrate-like) group. 
Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionary distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidine residues and a fourth conserved histidine plays a potential role in proton transfer." Q#4147 - CGI_10027599 superfamily 241547 311 381 5.09E-16 75.8133 cl00012 alpha_CA superfamily N - "Carbonic anhydrase alpha (vertebrate-like) group. Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionary distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidine residues and a fourth conserved histidine plays a potential role in proton transfer." 
Q#4153 - CGI_10027605 superfamily 247856 50 111 2.21E-11 55.6317 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#4156 - CGI_10027608 superfamily 243066 41 84 0.000313599 36.0561 cl02518 BTB superfamily N - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#4157 - CGI_10027609 superfamily 243066 82 195 6.35E-09 51.0789 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. 
The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#4158 - CGI_10027610 superfamily 243091 548 649 3.10E-10 58.8851 cl02566 SET superfamily - - "SET domain; SET domains are protein lysine methyltransferase enzymes. SET domains appear to be protein-protein interaction domains. It has been demonstrated that SET domains mediate interactions with a family of proteins that display similarity with dual-specificity phosphatases (dsPTPases). A subset of SET domains have been called PR domains. These domains are divergent in sequence from other SET domains, but also appear to mediate protein-protein interaction. The SET domain consists of two regions known as SET-N and SET-C. SET-C forms an unusual and conserved knot-like structure of probably functional importance. Additionally to SET-N and SET-C, an insert region (SET-I) and flanking regions of high structural variability form part of the overall structure." Q#4158 - CGI_10027610 superfamily 222150 731 756 0.000249464 39.6825 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#4158 - CGI_10027610 superfamily 222150 816 841 0.00236118 36.9861 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#4158 - CGI_10027610 superfamily 246975 803 824 0.00808643 35.4005 cl15478 zf-C2H2 superfamily - - "Zinc finger, C2H2 type; The C2H2 zinc finger is the classical zinc finger domain. The two conserved cysteines and histidines co-ordinate a zinc ion. 
The following pattern describes the zinc finger. #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C] Where X can be any amino acid, and numbers in brackets indicate the number of residues. The positions marked # are those that are important for the stable fold of the zinc finger. The final position can be either his or cys. The C2H2 zinc finger is composed of two short beta strands followed by an alpha helix. The amino terminal part of the helix binds the major groove in DNA binding zinc fingers. The accepted consensus binding sequence for Sp1 is usually defined by the asymmetric hexanucleotide core GGGCGG but this sequence does not include, among others, the GAG (=CTC) repeat that constitutes a high-affinity site for Sp1 binding to the wt1 promoter." Q#4159 - CGI_10027611 superfamily 241770 48 150 2.40E-11 57.0204 cl00309 PRTases_typeI superfamily - - "Phosphoribosyl transferase (PRT)-type I domain; Phosphoribosyl transferase (PRT) domain. The type I PRTases are identified by a conserved PRPP binding motif which features two adjacent acidic residues surrounded by one or more hydrophobic residues. PRTases catalyze the displacement of the alpha-1'-pyrophosphate of 5-phosphoribosyl-alpha1-pyrophosphate (PRPP) by a nitrogen-containing nucleophile. The reaction products are an alpha-1 substituted ribose-5'-phosphate and a free pyrophosphate (PP). PRPP, an activated form of ribose-5-phosphate, is a key metabolite connecting nucleotide synthesis and salvage pathways. 
The type I PRTase family includes a range of diverse phosphoribosyl transferase enzymes and regulatory proteins of the nucleotide synthesis and salvage pathways, including adenine phosphoribosyltransferase EC:2.4.2.7., hypoxanthine-guanine-xanthine phosphoribosyltransferase, hypoxanthine phosphoribosyltransferase EC:2.4.2.8., ribose-phosphate pyrophosphokinase EC:2.7.6.1., amidophosphoribosyltransferase EC:2.4.2.14., orotate phosphoribosyltransferase EC:2.4.2.10., uracil phosphoribosyltransferase EC:2.4.2.9., and xanthine-guanine phosphoribosyltransferase EC:2.4.2.22." Q#4160 - CGI_10027612 superfamily 247755 414 653 5.19E-111 336.895 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#4160 - CGI_10027612 superfamily 216049 89 366 9.62E-15 73.8594 cl18356 ABC_membrane superfamily - - ABC transporter transmembrane region; This family represents a unit of six transmembrane helices. Many members of the ABC transporter family (pfam00005) have two such regions. Q#4161 - CGI_10027613 superfamily 247755 169 403 9.74E-117 343.058 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. 
ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#4161 - CGI_10027613 superfamily 216049 5 121 7.57E-09 54.9846 cl18356 ABC_membrane superfamily N - ABC transporter transmembrane region; This family represents a unit of six transmembrane helices. Many members of the ABC transporter family (pfam00005) have two such regions. Q#4162 - CGI_10027614 superfamily 220695 87 225 1.38E-05 44.4919 cl18571 7TM_GPCR_Srx superfamily NC - Serpentine type 7TM GPCR chemoreceptor Srx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srx is part of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#4163 - CGI_10027615 superfamily 243064 33 83 4.68E-05 39.6494 cl02512 NTR_like superfamily C - "NTR_like domain; a beta barrel with an oligosaccharide/oligonucleotide-binding fold found in netrins, complement proteins, tissue inhibitors of metalloproteases (TIMP), and procollagen C-proteinase enhancers (PCOLCE), amongst others. In netrins, the domain plays a role in controlling axon branching in neural development, while the common function of these modules in TIMPs appears to be binding to metzincins. A subset of this family is also known as the C345C domain because it occurs as a C-terminal domain in complement C3, C4 and C5. In C5, the domain interacts with various partners during the formation of the membrane attack complex." 
Q#4164 - CGI_10027616 superfamily 243064 21 118 1.13E-10 55.0574 cl02512 NTR_like superfamily - - "NTR_like domain; a beta barrel with an oligosaccharide/oligonucleotide-binding fold found in netrins, complement proteins, tissue inhibitors of metalloproteases (TIMP), and procollagen C-proteinase enhancers (PCOLCE), amongst others. In netrins, the domain plays a role in controlling axon branching in neural development, while the common function of these modules in TIMPs appears to be binding to metzincins. A subset of this family is also known as the C345C domain because it occurs as a C-terminal domain in complement C3, C4 and C5. In C5, the domain interacts with various partners during the formation of the membrane attack complex." Q#4166 - CGI_10027618 superfamily 145834 13 44 4.98E-07 42.4996 cl03761 Glycos_trans_3N superfamily N - "Glycosyl transferase family, helical bundle domain; This family includes anthranilate phosphoribosyltransferase (TrpD), thymidine phosphorylase. All these proteins can transfer a phosphorylated ribose substrate." Q#4166 - CGI_10027618 superfamily 216013 57 85 1.03E-05 40.7037 cl02901 Glycos_transf_3 superfamily C - "Glycosyl transferase family, a/b domain; This family includes anthranilate phosphoribosyltransferase (TrpD), thymidine phosphorylase. All these proteins can transfer a phosphorylated ribose substrate." Q#4167 - CGI_10027619 superfamily 216013 84 327 2.26E-42 151.256 cl02901 Glycos_transf_3 superfamily - - "Glycosyl transferase family, a/b domain; This family includes anthranilate phosphoribosyltransferase (TrpD), thymidine phosphorylase. All these proteins can transfer a phosphorylated ribose substrate." Q#4167 - CGI_10027619 superfamily 145834 7 71 8.25E-16 72.16 cl03761 Glycos_trans_3N superfamily - - "Glycosyl transferase family, helical bundle domain; This family includes anthranilate phosphoribosyltransferase (TrpD), thymidine phosphorylase. All these proteins can transfer a phosphorylated ribose substrate." 
Q#4167 - CGI_10027619 superfamily 244483 359 432 6.71E-13 64.0955 cl06734 PYNP_C superfamily - - "Pyrimidine nucleoside phosphorylase C-terminal domain; This domain is found at the C-terminal end of the large alpha/beta domain making up various pyrimidine nucleoside phosphorylases. It has slightly different conformations in different members of this family. For example, in pyrimidine nucleoside phosphorylase (PYNP) there is an added three-stranded anti-parallel beta sheet as compared to other members of the family, such as E. coli thymidine phosphorylase (TP). The domain contains an alpha/beta hammerhead fold and residues in this domain seem to be important in formation of the homodimer." Q#4168 - CGI_10027620 superfamily 241578 274 471 1.08E-08 55.3291 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." 
Q#4170 - CGI_10027622 superfamily 246908 87 171 6.57E-16 69.7918 cl15255 SH2 superfamily - - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." Q#4173 - CGI_10027625 superfamily 243035 63 159 6.72E-15 66.8745 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. 
CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#4174 - CGI_10027626 superfamily 247941 144 275 3.16E-10 58.5013 cl17387 Methyltransf_21 superfamily - - "Methyltransferase FkbM domain; This family has members from bacteria to human, and appears to be a methyltransferase." Q#4174 - CGI_10027626 superfamily 247941 407 541 4.89E-10 57.7309 cl17387 Methyltransf_21 superfamily - - "Methyltransferase FkbM domain; This family has members from bacteria to human, and appears to be a methyltransferase." 
Q#4175 - CGI_10027627 superfamily 128937 3 69 2.10E-14 64.206 cl02743 DM9 superfamily - - Repeats found in Drosophila proteins; Repeats found in Drosophila proteins. Q#4175 - CGI_10027627 superfamily 128937 78 143 2.76E-11 55.3464 cl02743 DM9 superfamily - - Repeats found in Drosophila proteins; Repeats found in Drosophila proteins. Q#4176 - CGI_10027628 superfamily 128937 3 69 2.10E-14 64.206 cl02743 DM9 superfamily - - Repeats found in Drosophila proteins; Repeats found in Drosophila proteins. Q#4176 - CGI_10027628 superfamily 128937 78 143 2.76E-11 55.3464 cl02743 DM9 superfamily - - Repeats found in Drosophila proteins; Repeats found in Drosophila proteins. Q#4177 - CGI_10002387 superfamily 247684 73 483 5.24E-81 263.754 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. 
The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#4179 - CGI_10026933 superfamily 246664 224 595 4.75E-157 459.348 cl14561 An_peroxidase_like superfamily - - "Animal heme peroxidases and related proteins; A diverse family of enzymes, which includes prostaglandin G/H synthase, thyroid peroxidase, myeloperoxidase, linoleate diol synthase, lactoperoxidase, peroxinectin, peroxidasin, and others. Despite its name, this family is not restricted to metazoans: members are found in fungi, plants, and bacteria as well." Q#4179 - CGI_10026933 superfamily 246664 75 165 2.63E-06 48.82 cl14561 An_peroxidase_like superfamily C - "Animal heme peroxidases and related proteins; A diverse family of enzymes, which includes prostaglandin G/H synthase, thyroid peroxidase, myeloperoxidase, linoleate diol synthase, lactoperoxidase, peroxinectin, peroxidasin, and others. Despite its name, this family is not restricted to metazoans: members are found in fungi, plants, and bacteria as well." Q#4180 - CGI_10026934 superfamily 246664 124 523 4.16E-146 428.917 cl14561 An_peroxidase_like superfamily - - "Animal heme peroxidases and related proteins; A diverse family of enzymes, which includes prostaglandin G/H synthase, thyroid peroxidase, myeloperoxidase, linoleate diol synthase, lactoperoxidase, peroxinectin, peroxidasin, and others. Despite its name, this family is not restricted to metazoans: members are found in fungi, plants, and bacteria as well." Q#4180 - CGI_10026934 superfamily 246664 22 65 7.32E-08 53.4682 cl14561 An_peroxidase_like superfamily C - "Animal heme peroxidases and related proteins; A diverse family of enzymes, which includes prostaglandin G/H synthase, thyroid peroxidase, myeloperoxidase, linoleate diol synthase, lactoperoxidase, peroxinectin, peroxidasin, and others. 
Despite its name, this family is not restricted to metazoans: members are found in fungi, plants, and bacteria as well." Q#4181 - CGI_10026935 superfamily 247057 961 1026 1.01E-33 125.466 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#4181 - CGI_10026935 superfamily 248259 538 635 1.77E-35 131.218 cl17705 MBT superfamily - - "mbt repeat; The function of this repeat is unknown, but is found in a number of nuclear proteins such as drosophila sex comb on midleg protein. The repeat is found in up to four copies. The repeat contains a completely conserved glutamate at its amino terminus that may be important for function." Q#4181 - CGI_10026935 superfamily 248259 646 733 2.37E-32 122.358 cl17705 MBT superfamily - - "mbt repeat; The function of this repeat is unknown, but is found in a number of nuclear proteins such as drosophila sex comb on midleg protein. The repeat is found in up to four copies. The repeat contains a completely conserved glutamate at its amino terminus that may be important for function." 
Q#4181 - CGI_10026935 superfamily 248259 449 527 4.44E-32 121.588 cl17705 MBT superfamily - - "mbt repeat; The function of this repeat is unknown, but is found in a number of nuclear proteins such as drosophila sex comb on midleg protein. The repeat is found in up to four copies. The repeat contains a completely conserved glutamate at its amino terminus that may be important for function." Q#4181 - CGI_10026935 superfamily 201844 828 857 5.23E-12 62.313 cl03250 zf-C2HC superfamily - - "Zinc finger, C2HC type; This is a DNA binding zinc finger domain." Q#4181 - CGI_10026935 superfamily 201844 750 779 7.14E-09 53.4534 cl03250 zf-C2HC superfamily - - "Zinc finger, C2HC type; This is a DNA binding zinc finger domain." Q#4183 - CGI_10026937 superfamily 243062 289 386 5.30E-35 125.466 cl02510 TGF_beta superfamily - - Transforming growth factor beta like domain; Transforming growth factor beta like domain. Q#4183 - CGI_10026937 superfamily 216062 36 217 4.06E-16 75.9374 cl02928 TGFb_propeptide superfamily - - TGF-beta propeptide; This propeptide is known as latency associated peptide (LAP) in TGF-beta. LAP is a homodimer which is disulfide linked to TGF-beta binding protein. Q#4184 - CGI_10026938 superfamily 241705 33 114 8.66E-26 97.071 cl00228 HIT_like superfamily N - "HIT family: HIT (Histidine triad) proteins, named for a motif related to the sequence HxHxH/Qxx (x, a hydrophobic amino acid), are a superfamily of nucleotide hydrolases and transferases, which act on the alpha-phosphate of ribonucleotides. On the basis of sequence, substrate specificity, structure, evolution and mechanism, HIT proteins are classified in the literature into three major branches: the Hint branch, which consists of adenosine 5'-monophosphoramide hydrolases, the Fhit branch, that consists of diadenosine polyphosphate hydrolases, and the GalT branch consisting of specific nucleoside monophosphate transferases. 
Further sequence analysis reveals several new closely related, yet uncharacterized subgroups." Q#4185 - CGI_10026939 superfamily 241889 95 241 1.71E-63 199.008 cl00474 PAP2_like superfamily - - "PAP2_like proteins, a super-family of histidine phosphatases and vanadium haloperoxidases, includes type 2 phosphatidic acid phosphatase or lipid phosphate phosphatase (LPP), Glucose-6-phosphatase, Phosphatidylglycerophosphatase B and bacterial acid phosphatase, vanadium chloroperoxidases, vanadium bromoperoxidases, and several other mostly uncharacterized subfamilies. Several members of this superfamily have been predicted to be transmembrane proteins." Q#4188 - CGI_10026942 superfamily 241581 3 83 1.34E-14 67.7966 cl00062 FHA superfamily - - "Forkhead associated domain (FHA); found in eukaryotic and prokaryotic proteins. Putative nuclear signalling domain. FHA domains may bind phosphothreonine, phosphoserine and sometimes phosphotyrosine. In eukaryotes, many FHA domain-containing proteins localize to the nucleus, where they participate in establishing or maintaining cell cycle checkpoints, DNA repair, or transcriptional regulation. Members of the FHA family include: Dun1, Rad53, Cds1, Mek1, KAPP(kinase-associated protein phosphatase),and Ki-67 (a human nuclear protein related to cell proliferation)." Q#4189 - CGI_10026943 superfamily 241596 38 87 1.28E-10 53.7571 cl00081 HLH superfamily - - "Helix-loop-helix domain, found in specific DNA- binding proteins that act as transcription factors; 60-100 amino acids long. 
A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE 5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved in both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#4190 - CGI_10026944 superfamily 241596 42 85 1.28E-06 42.2011 cl00081 HLH superfamily N - "Helix-loop-helix domain, found in specific DNA- binding proteins that act as transcription factors; 60-100 amino acids long. 
A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE 5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved in both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#4192 - CGI_10026946 superfamily 220756 1262 1463 2.43E-14 73.4858 cl11092 Urb2 superfamily - - Urb2/Npa2 family; This family includes the Urb2 protein from yeast that are involved in ribosome biogenesis. Q#4198 - CGI_10026952 superfamily 243263 97 586 7.89E-72 240.773 cl02990 ASC superfamily - - Amiloride-sensitive sodium channel; Amiloride-sensitive sodium channel. Q#4199 - CGI_10026953 superfamily 243263 128 346 3.00E-33 129.078 cl02990 ASC superfamily N - Amiloride-sensitive sodium channel; Amiloride-sensitive sodium channel. Q#4201 - CGI_10026955 superfamily 247725 300 413 6.03E-57 185.517 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. 
They are generally involved in targeting proteins to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and other proteins." Q#4202 - CGI_10026956 superfamily 247723 65 146 3.78E-44 148.225 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#4202 - CGI_10026956 superfamily 247723 149 230 3.78E-44 148.225 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. 
The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#4202 - CGI_10026956 superfamily 247723 12 57 1.31E-18 78.9174 cl17169 RRM_SF superfamily N - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#4205 - CGI_10026959 superfamily 199908 148 208 0.000107429 39.5463 cl16908 DnaJ_zf superfamily - - "Zinc finger domain of DnaJ and HSP40; Central/middle or CxxCxGxG-motif containing domain of DnaJ/Hsp40 (heat shock protein 40). DnaJ proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperonin family. Hsp40 proteins are characterized by the presence of an N-terminal J domain, which mediates the interaction with Hsp70. 
This central domain contains four repeats of a CxxCxGxG motif and binds to two Zinc ions. It has been implicated in substrate binding." Q#4206 - CGI_10026960 superfamily 199908 417 468 1.04E-06 46.4799 cl16908 DnaJ_zf superfamily - - "Zinc finger domain of DnaJ and HSP40; Central/middle or CxxCxGxG-motif containing domain of DnaJ/Hsp40 (heat shock protein 40). DnaJ proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperonin family. Hsp40 proteins are characterized by the presence of an N-terminal J domain, which mediates the interaction with Hsp70. This central domain contains four repeats of a CxxCxGxG motif and binds to two Zinc ions. It has been implicated in substrate binding." Q#4207 - CGI_10026961 superfamily 247739 1 217 2.51E-34 123.552 cl17185 LPLAT superfamily - - "Lysophospholipid acyltransferases (LPLATs) of glycerophospholipid biosynthesis; Lysophospholipid acyltransferase (LPLAT) superfamily members are acyltransferases of de novo and remodeling pathways of glycerophospholipid biosynthesis. These proteins catalyze the incorporation of an acyl group from either acylCoAs or acyl-acyl carrier proteins (acylACPs) into acceptors such as glycerol 3-phosphate, dihydroxyacetone phosphate or lyso-phosphatidic acid. 
Included in this superfamily are LPLATs such as glycerol-3-phosphate 1-acyltransferase (GPAT, PlsB), 1-acyl-sn-glycerol-3-phosphate acyltransferase (AGPAT, PlsC), lysophosphatidylcholine acyltransferase 1 (LPCAT-1), lysophosphatidylethanolamine acyltransferase (LPEAT, also known as, MBOAT2, membrane-bound O-acyltransferase domain-containing protein 2), lipid A biosynthesis lauroyl/myristoyl acyltransferase, 2-acylglycerol O-acyltransferase (MGAT), dihydroxyacetone phosphate acyltransferase (DHAPAT, also known as 1 glycerol-3-phosphate O-acyltransferase 1) and Tafazzin (the protein product of the Barth syndrome (TAZ) gene)." Q#4210 - CGI_10026964 superfamily 247725 560 733 1.48E-77 251.036 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#4210 - CGI_10026964 superfamily 243096 382 568 8.70E-45 160.156 cl02571 RhoGEF superfamily - - Guanine nucleotide exchange factor for Rho/Rac/Cdc42-like GTPases; Also called Dbl-homologous (DH) domain. It appears that PH domains invariably occur C-terminal to RhoGEF/DH domains. Q#4210 - CGI_10026964 superfamily 241565 116 178 7.49E-11 59.6426 cl00038 BRCT superfamily - - "Breast Cancer Suppressor Protein (BRCA1), carboxy-terminal domain. The BRCT domain is found within many DNA damage repair and cell cycle checkpoint proteins. The unique diversity of this domain superfamily allows BRCT modules to interact forming homo/hetero BRCT multimers, BRCT-non-BRCT interactions, and interactions within DNA strand breaks." 
Q#4210 - CGI_10026964 superfamily 241565 209 274 4.85E-05 42.3087 cl00038 BRCT superfamily - - "Breast Cancer Suppressor Protein (BRCA1), carboxy-terminal domain. The BRCT domain is found within many DNA damage repair and cell cycle checkpoint proteins. The unique diversity of this domain superfamily allows BRCT modules to interact forming homo/hetero BRCT multimers, BRCT-non-BRCT interactions, and interactions within DNA strand breaks." Q#4212 - CGI_10026966 superfamily 241559 27 131 1.24E-21 92.3739 cl00030 CH superfamily - - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." Q#4212 - CGI_10026966 superfamily 241559 140 239 1.90E-10 59.6319 cl00030 CH superfamily - - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." Q#4212 - CGI_10026966 superfamily 241559 249 335 1.38E-07 50.7723 cl00030 CH superfamily - - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." 
Q#4212 - CGI_10026966 superfamily 216033 1003 1093 4.39E-19 84.3076 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#4212 - CGI_10026966 superfamily 216033 913 997 8.76E-16 75.0628 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#4212 - CGI_10026966 superfamily 216033 616 702 9.88E-14 68.8996 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#4212 - CGI_10026966 superfamily 216033 815 903 2.33E-11 61.966 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#4212 - CGI_10026966 superfamily 216033 526 611 1.12E-09 56.9584 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#4212 - CGI_10026966 superfamily 216033 363 431 3.34E-05 43.4764 cl16959 Filamin superfamily N - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#4212 - CGI_10026966 superfamily 216033 490 521 0.00313315 37.3132 cl16959 Filamin superfamily N - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#4213 - CGI_10026967 superfamily 215847 217 355 9.21E-28 113.695 cl09510 Lipoxygenase superfamily NC - Lipoxygenase; Lipoxygenase. Q#4213 - CGI_10026967 superfamily 241546 2 106 5.03E-24 94.9292 cl00011 PLAT superfamily - - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats.The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." 
Q#4215 - CGI_10026970 superfamily 247905 834 942 2.04E-26 107.709 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#4215 - CGI_10026970 superfamily 247805 536 681 2.69E-23 98.9487 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#4215 - CGI_10026970 superfamily 248013 420 474 7.97E-08 51.4959 cl17459 CHROMO superfamily - - "Chromatin organization modifier (chromo) domain is a conserved region of around 50 amino acids found in a variety of chromosomal proteins, which appear to play a role in the functional organization of the eukaryotic nucleus. Experimental evidence implicates the chromo domain in the binding activity of these proteins to methylated histone tails and maybe RNA. May occur as single instance, in a tandem arrangement or followd by a related "chromo shadow" domain." Q#4215 - CGI_10026970 superfamily 248013 342 385 5.86E-06 46.1031 cl17459 CHROMO superfamily N - "Chromatin organization modifier (chromo) domain is a conserved region of around 50 amino acids found in a variety of chromosomal proteins, which appear to play a role in the functional organization of the eukaryotic nucleus. 
Experimental evidence implicates the chromo domain in the binding activity of these proteins to methylated histone tails and maybe RNA. May occur as single instance, in a tandem arrangement or followed by a related "chromo shadow" domain." Q#4215 - CGI_10026970 superfamily 206078 1422 1514 2.66E-34 129.308 cl16467 DUF4208 superfamily - - Domain of unknown function (DUF4208); This domain is found at the C-terminus of chromodomain-helicase-DNA-binding proteins. The exact function of the domain is undetermined. Q#4215 - CGI_10026970 superfamily 247804 1231 1265 0.000222532 41.0135 cl17250 SANT superfamily - - "'SWI3, ADA2, N-CoR and TFIIIB' DNA-binding domains. Tandem copies of the domain bind telomeric DNA tandem repeats as part of the capping complex. Binding is sequence dependent for repeats which contain the G/C rich motif [C2-3 A (CA)1-6]. The domain is also found in regulatory transcriptional repressor complexes where it also binds DNA." Q#4216 - CGI_10026971 superfamily 241739 43 287 1.07E-96 305.438 cl00268 class_II_aaRS-like_core superfamily - - "Class II tRNA amino-acyl synthetase-like catalytic core domain. Class II amino acyl-tRNA synthetases (aaRS) share a common fold and generally attach an amino acid to the 3' OH of ribose of the appropriate tRNA. PheRS is an exception in that it attaches the amino acid at the 2'-OH group, like class I aaRSs. These enzymes are usually homodimers. This domain is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. The substrate specificity of this reaction is further determined by additional domains. Interestingly, this domain is also found in asparagine synthase A (AsnA), in the accessory subunit of mitochondrial polymerase gamma and in the bacterial ATP phosphoribosyltransferase regulatory subunit HisZ." 
Q#4216 - CGI_10026971 superfamily 244970 684 724 3.91E-07 48.1474 cl08469 tRNA_SAD superfamily - - "Threonyl and Alanyl tRNA synthetase second additional domain; The catalytically active form of threonyl/alanyl tRNA synthetase is a dimer. Within the tRNA synthetase class II dimer, the bound tRNA interacts with both monomers making specific interactions with the catalytic domain, the C-terminal domain, and this domain (the second additional domain). The second additional domain is comprised of a pair of perpendicularly orientated antiparallel beta sheets, of four and three strands, respectively, that surround a central alpha helix that forms the core of the domain." Q#4218 - CGI_10026973 superfamily 243066 1 61 2.92E-06 43.3105 cl02518 BTB superfamily N - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#4220 - CGI_10026975 superfamily 241645 10 72 6.80E-10 55.351 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. 
Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#4220 - CGI_10026975 superfamily 241578 185 350 2.13E-09 55.2646 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#4222 - CGI_10026977 superfamily 243035 115 236 5.19E-12 60.0109 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. 
This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. 
In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#4223 - CGI_10026978 superfamily 247792 359 401 9.74E-08 49.7516 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#4223 - CGI_10026978 superfamily 241644 194 291 1.33E-13 68.7681 cl00154 UBCc superfamily N - "Ubiquitin-conjugating enzyme E2, catalytic (UBCc) domain. This is part of the ubiquitin-mediated protein degradation pathway in which a thiol-ester linkage forms between a conserved cysteine and the C-terminus of ubiquitin and complexes with ubiquitin protein ligase enzymes, E3. This pathway regulates many fundamental cellular processes. There are also other E2s which form thiol-ester linkages without the use of E3s as well as several UBC homologs (TSG101, Mms2, Croc-1 and similar proteins) which lack the active site cysteine essential for ubiquitination and appear to function in DNA repair pathways which were omitted from the scope of this CD." 
Q#4223 - CGI_10026978 superfamily 241563 689 716 0.006056 35.3907 cl00034 BBOX superfamily N - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#4225 - CGI_10026980 superfamily 247038 197 285 2.16E-13 65.5537 cl15674 IPT superfamily - - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated T cells (NFAT) transcription factors which are mainly monomers." Q#4225 - CGI_10026980 superfamily 246669 18 89 4.70E-07 47.4467 cl14603 C2 superfamily N - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. 
C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#4226 - CGI_10026981 superfamily 241758 1 114 5.93E-21 82.4178 cl00292 AANH_like superfamily - - "Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases, ATP sulphurylases, Universal Stress Response protein and electron transfer flavoprotein (ETF). The domain forms an alpha/beta/alpha fold which binds to Adenosine nucleotide." Q#4227 - CGI_10026982 superfamily 241758 8 108 4.49E-14 63.543 cl00292 AANH_like superfamily N - "Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases, ATP sulphurylases, Universal Stress Response protein and electron transfer flavoprotein (ETF). The domain forms an alpha/beta/alpha fold which binds to Adenosine nucleotide." Q#4230 - CGI_10026985 superfamily 243061 11 113 1.33E-35 134.007 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#4230 - CGI_10026985 superfamily 243061 2738 2841 3.71E-33 127.073 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#4230 - CGI_10026985 superfamily 243061 979 1081 2.64E-32 124.377 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#4230 - CGI_10026985 superfamily 243061 1296 1412 8.86E-24 99.7238 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. 
These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#4230 - CGI_10026985 superfamily 243061 2296 2397 2.54E-22 95.4866 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#4230 - CGI_10026985 superfamily 243061 342 442 6.88E-21 91.6346 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#4230 - CGI_10026985 superfamily 243061 1953 2056 4.48E-20 88.9382 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#4230 - CGI_10026985 superfamily 243061 1191 1289 2.07E-19 87.0122 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#4230 - CGI_10026985 superfamily 243061 1850 1945 9.55E-19 85.0862 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#4230 - CGI_10026985 superfamily 243061 660 760 1.20E-18 85.0862 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. 
Q#4230 - CGI_10026985 superfamily 243061 2523 2612 2.40E-17 80.9738 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#4230 - CGI_10026985 superfamily 243061 1415 1515 6.35E-17 80.0786 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#4230 - CGI_10026985 superfamily 243061 763 864 9.75E-17 79.3082 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#4230 - CGI_10026985 superfamily 243061 2401 2498 1.02E-16 79.3082 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#4230 - CGI_10026985 superfamily 243061 553 653 2.95E-16 78.1526 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#4230 - CGI_10026985 superfamily 243061 1646 1733 2.24E-15 75.4562 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. 
Q#4230 - CGI_10026985 superfamily 243061 128 219 1.77E-14 72.8846 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#4230 - CGI_10026985 superfamily 243061 894 972 2.22E-14 72.3746 cl02509 SRCR superfamily N - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#4230 - CGI_10026985 superfamily 243061 1106 1181 3.13E-14 72.1142 cl02509 SRCR superfamily N - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#4230 - CGI_10026985 superfamily 243061 1737 1834 3.50E-14 71.9894 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#4230 - CGI_10026985 superfamily 243061 249 334 3.61E-13 68.9078 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#4230 - CGI_10026985 superfamily 243061 1533 1630 1.42E-12 67.367 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. 
Q#4230 - CGI_10026985 superfamily 243061 2621 2728 3.81E-12 65.8262 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#4230 - CGI_10026985 superfamily 243061 445 543 5.40E-12 65.441 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#4230 - CGI_10026985 superfamily 243061 2072 2157 3.20E-10 60.0482 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#4230 - CGI_10026985 superfamily 243061 2206 2285 3.07E-07 51.3134 cl02509 SRCR superfamily N - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#4231 - CGI_10026986 superfamily 247692 269 765 4.06E-168 498.844 cl17068 AFD_class_I superfamily - - "Adenylate forming domain, Class I; This family includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain." 
Q#4231 - CGI_10026986 superfamily 241567 1 256 1.61E-12 66.8551 cl00042 CASc superfamily - - "Caspase, interleukin-1 beta converting enzyme (ICE) homologues; Cysteine-dependent aspartate-directed proteases that mediate programmed cell death (apoptosis). Caspases are synthesized as inactive zymogens and activated by proteolysis of the peptide backbone adjacent to an aspartate. The resulting two subunits associate to form an (alpha)2(beta)2-tetramer which is the active enzyme. Activation of caspases can be mediated by other caspase homologs." Q#4232 - CGI_10026987 superfamily 204502 26 77 1.31E-26 101.422 cl11146 GalKase_gal_bdg superfamily - - "Galactokinase galactose-binding signature; This is the highly conserved galactokinase signature sequence which appears to be present in all galactokinases irrespective of how many other ATP binding sites, etc that they carry. The function of this domain appears to be to bind galactose, and the domain is normally at the N-terminus of the enzymes, EC:2.7.1.6. This domain is associated with the families GHMP_kinases_C, pfam08544 and GHMP_kinases_N, pfam00288." Q#4232 - CGI_10026987 superfamily 219894 355 435 1.51E-08 51.7316 cl08484 GHMP_kinases_C superfamily - - "GHMP kinases C terminal; This family includes homoserine kinases, galactokinases and mevalonate kinases." Q#4233 - CGI_10026988 superfamily 219309 1 496 5.50E-36 138.392 cl06259 PDCD9 superfamily - - Mitochondrial 28S ribosomal protein S30 (PDCD9); This family consists of several eukaryotic mitochondrial 28S ribosomal protein S30 (or programmed cell death protein 9 PDCD9) sequences. The exact function of this family is unknown although it is known to be a component of the mitochondrial ribosome and a component in cellular apoptotic signaling pathways. 
Q#4236 - CGI_10026991 superfamily 221137 228 284 4.39E-14 66.2404 cl13084 Med27 superfamily C - "Mediator complex subunit 27; Mediator is a large complex of up to 33 proteins that is conserved from plants to fungi to humans - the number and representation of individual subunits varying with species {1-2]. It is arranged into four different sections, a core, a head, a tail and a kinase-activity part, and the number of subunits within each of these is what varies with species. Overall, Mediator regulates the transcriptional activity of RNA polymerase II but it would appear that each of the four different sections has a slightly different function. Mediator exists in two major forms in human cells: a smaller form that interacts strongly with pol II and activates transcription, and a large form that does not interact strongly with pol II and does not directly activate transcription. The ubiquitous expression of Med27 mRNA suggests a universal requirement for Med27 in transcriptional initiation. Loss of Crsp34/Med27 decreases amacrine cell number, but increases the number of rod photoreceptor cells." Q#4238 - CGI_10026993 superfamily 247755 1 57 1.54E-27 99.5362 cl17201 ABC_ATPase superfamily N - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#4239 - CGI_10026994 superfamily 243077 132 159 2.27E-08 47.1549 cl02542 DnaJ superfamily N - "DnaJ domain or J-domain. 
DnaJ/Hsp40 (heat shock protein 40) proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperonine family. Hsp40 proteins are characterized by the presence of a J domain, which mediates the interaction with Hsp70. They may contain other domains as well, and the architectures provide a means of classification." Q#4239 - CGI_10026994 superfamily 243077 10 37 5.80E-07 43.3029 cl02542 DnaJ superfamily N - "DnaJ domain or J-domain. DnaJ/Hsp40 (heat shock protein 40) proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperonine family. Hsp40 proteins are characterized by the presence of a J domain, which mediates the interaction with Hsp70. They may contain other domains as well, and the architectures provide a means of classification." Q#4240 - CGI_10005755 superfamily 248458 2 327 2.49E-06 47.3085 cl17904 MFS superfamily - - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. 
The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#4241 - CGI_10005756 superfamily 248458 44 220 5.37E-13 66.1833 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. 
Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#4242 - CGI_10005757 superfamily 241600 7 187 1.05E-89 264.488 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#4243 - CGI_10005758 superfamily 216033 334 369 4.85E-05 41.5504 cl16959 Filamin superfamily N - Filamin/ABP280 repeat; Filamin/ABP280 repeat. 
Q#4244 - CGI_10005759 superfamily 241613 76 106 3.16E-09 50.283 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#4244 - CGI_10005759 superfamily 241571 17 70 7.16E-07 45.0959 cl00049 CUB superfamily C - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#4245 - CGI_10005760 superfamily 245175 5 31 1.54E-08 48.0992 cl09847 Sen15 superfamily N - "Sen15 protein; The Sen15 subunit of the tRNA intron-splicing endonuclease is one of the two structural subunits of this heterotetrameric enzyme. Residues 36-157 of this subunit possess a novel homodimeric fold. Each monomer consists of three alpha-helices and a mixed antiparallel/parallel beta-sheet. Two monomers of Sen15 fold with two monomers of Sen34, one of the two catalytic subunits, to form an alpha2-beta2 tetramer as part of the functional endonuclease assembly." Q#4245 - CGI_10005760 superfamily 245175 59 84 1.20E-07 45.4028 cl09847 Sen15 superfamily N - "Sen15 protein; The Sen15 subunit of the tRNA intron-splicing endonuclease is one of the two structural subunits of this heterotetrameric enzyme. Residues 36-157 of this subunit possess a novel homodimeric fold. 
Each monomer consists of three alpha-helices and a mixed antiparallel/parallel beta-sheet. Two monomers of Sen15 fold with two monomers of Sen34, one of the two catalytic subunits, to form an alpha2-beta2 tetramer as part of the functional endonuclease assembly." Q#4246 - CGI_10005761 superfamily 247792 367 407 2.85E-09 52.8332 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#4248 - CGI_10005763 superfamily 241732 896 1010 1.48E-37 139.285 cl00258 RIBOc superfamily - - "RIBOc. Ribonuclease III C terminal domain. This group consists of eukaryotic, bacterial and archeal ribonuclease III (RNAse III) proteins. RNAse III is a double stranded RNA-specific endonuclease. Prokaryotic RNAse III is important in post-transcriptional control of mRNA stability and translational efficiency. It is involved in the processing of ribosomal RNA precursors. Prokaryotic RNAse III also plays a role in the maturation of tRNA precursors and in the processing of phage and plasmid transcripts. Eukaryotic RNase III's participate (through direct cleavage) in rRNA processing, in processing of small nucleolar RNAs (snoRNAs) and snRNA's (components of the spliceosome). In eukaryotes RNase III or RNaseIII like enzymes such as Dicer are involved in RNAi (RNA interference) and miRNA (micro-RNA) gene silencing." Q#4248 - CGI_10005763 superfamily 241732 1057 1189 8.94E-34 128.499 cl00258 RIBOc superfamily - - "RIBOc. 
Ribonuclease III C terminal domain. This group consists of eukaryotic, bacterial and archeal ribonuclease III (RNAse III) proteins. RNAse III is a double stranded RNA-specific endonuclease. Prokaryotic RNAse III is important in post-transcriptional control of mRNA stability and translational efficiency. It is involved in the processing of ribosomal RNA precursors. Prokaryotic RNAse III also plays a role in the maturation of tRNA precursors and in the processing of phage and plasmid transcripts. Eukaryotic RNase III's participate (through direct cleavage) in rRNA processing, in processing of small nucleolar RNAs (snoRNAs) and snRNA's (components of the spliceosome). In eukaryotes RNase III or RNaseIII like enzymes such as Dicer are involved in RNAi (RNA interference) and miRNA (micro-RNA) gene silencing." Q#4248 - CGI_10005763 superfamily 241575 1195 1267 5.80E-13 66.1419 cl00054 DSRM superfamily - - "Double-stranded RNA binding motif. Binding is not sequence specific but is highly specific for double stranded RNA. Found in a variety of proteins including dsRNA dependent protein kinase PKR, RNA helicases, Drosophila staufen protein, E. coli RNase III, RNases H1, and dsRNA dependent adenosine deaminases." Q#4249 - CGI_10005764 superfamily 247639 42 384 0 539.125 cl16914 O-FucT_like superfamily - - "GDP-fucose protein O-fucosyltransferase and related proteins; O-fucosyltransferase-like proteins are GDP-fucose dependent enzymes with similarities to the family 1 glycosyltransferases (GT1). They are soluble ER proteins that may be proteolytically cleaved from a membrane-associated preprotein, and are involved in the O-fucosylation of protein substrates, the core fucosylation of growth factor receptors, and other processes." 
Q#4250 - CGI_10005765 superfamily 247856 80 138 3.42E-14 64.1061 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#4250 - CGI_10005765 superfamily 247856 113 185 2.36E-13 61.7949 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#4250 - CGI_10005765 superfamily 247856 53 103 0.00389266 33.6753 cl17302 EFh superfamily N - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#4251 - CGI_10005766 superfamily 247792 285 332 2.21E-06 43.9736 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#4252 - CGI_10005767 superfamily 247792 141 180 6.48E-05 38.5808 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#4253 - CGI_10005768 superfamily 247905 416 536 4.47E-25 101.546 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a 
distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#4253 - CGI_10005768 superfamily 247805 146 356 1.69E-63 211.574 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#4253 - CGI_10005768 superfamily 222474 595 655 1.09E-20 87.0933 cl16500 DUF4217 superfamily - - Domain of unknown function (DUF4217); This short domain is found at the C-terminus of many helicase proteins. Q#4254 - CGI_10005769 superfamily 241600 77 291 1.79E-104 306.089 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#4255 - CGI_10005770 superfamily 241782 86 470 1.81E-90 281.923 cl00321 AAT_I superfamily - - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. 
PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phosphorylase family (fold type V)." 
Q#4256 - CGI_10005771 superfamily 247792 419 458 4.63E-06 43.9736 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#4256 - CGI_10005771 superfamily 247057 348 391 2.72E-12 62.3054 cl15755 SAM_superfamily superfamily C - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#4258 - CGI_10015523 superfamily 245815 25 499 0 896.181 cl11961 ALDH-SF superfamily - - "NAD(P)+-dependent aldehyde dehydrogenase superfamily; The aldehyde dehydrogenase superfamily (ALDH-SF) of NAD(P)+-dependent enzymes, in general, oxidize a wide range of endogenous and exogenous aliphatic and aromatic aldehydes to their corresponding carboxylic acids and play an important role in detoxification. 
Besides aldehyde detoxification, many ALDH isozymes possess multiple additional catalytic and non-catalytic functions such as participating in metabolic pathways, or as binding proteins, or osmoregulants, to mention a few. The enzyme has three domains, a NAD(P)+ cofactor-binding domain, a catalytic domain, and a bridging domain; and the active enzyme is generally either homodimeric or homotetrameric. The catalytic mechanism is proposed to involve cofactor binding, resulting in a conformational change and activation of an invariant catalytic cysteine nucleophile. The cysteine and aldehyde substrate form an oxyanion thiohemiacetal intermediate resulting in hydride transfer to the cofactor and formation of a thioacylenzyme intermediate. Hydrolysis of the thioacylenzyme and release of the carboxylic acid product occurs, and in most cases, the reduced cofactor dissociates from the enzyme. The evolutionary phylogenetic tree of ALDHs appears to have an initial bifurcation between what has been characterized as the classical aldehyde dehydrogenases, the ALDH family (ALDH) and extended family members or aldehyde dehydrogenase-like (ALDH-L) proteins. The ALDH proteins are represented by enzymes which share a number of highly conserved residues necessary for catalysis and cofactor binding and they include such proteins as retinal dehydrogenase, 10-formyltetrahydrofolate dehydrogenase, non-phosphorylating glyceraldehyde 3-phosphate dehydrogenase, delta(1)-pyrroline-5-carboxylate dehydrogenases, alpha-ketoglutaric semialdehyde dehydrogenase, alpha-aminoadipic semialdehyde dehydrogenase, coniferyl aldehyde dehydrogenase and succinate-semialdehyde dehydrogenase. Included in this larger group are all human, Arabidopsis, Tortula, fungal, protozoan, and Drosophila ALDHs identified in families ALDH1 through ALDH22 with the exception of families ALDH18, ALDH19, and ALDH20 which are present in the ALDH-like group. 
The ALDH-like group is represented by such proteins as gamma-glutamyl phosphate reductase, LuxC-like acyl-CoA reductase, and coenzyme A acylating aldehyde dehydrogenase. All of these proteins have a conserved cysteine that aligns with the catalytic cysteine of the ALDH group." Q#4259 - CGI_10015524 superfamily 245814 13 106 1.27E-11 61.2491 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#4259 - CGI_10015524 superfamily 245814 313 391 1.62E-10 57.4542 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#4259 - CGI_10015524 superfamily 245814 187 282 2.51E-10 57.1301 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#4259 - CGI_10015524 superfamily 245814 112 176 8.04E-07 46.9967 cl11960 Ig superfamily N - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#4262 - CGI_10015527 superfamily 245205 2019 2129 1.86E-39 144.677 cl09930 RPA_2b-aaRSs_OBF_like superfamily - - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. 
The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." Q#4262 - CGI_10015527 superfamily 245205 2139 2386 1.52E-60 211.154 cl09930 RPA_2b-aaRSs_OBF_like superfamily - - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." 
Q#4262 - CGI_10015527 superfamily 245205 2397 2503 1.49E-07 52.5363 cl09930 RPA_2b-aaRSs_OBF_like superfamily C - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." Q#4262 - CGI_10015527 superfamily 201361 1162 1184 0.00422306 37.8427 cl02912 BRCA2 superfamily - - BRCA2 repeat; The alignment covers only the most conserved region of the repeat. Q#4262 - CGI_10015527 superfamily 201361 1284 1311 0.00460424 37.4575 cl02912 BRCA2 superfamily - - BRCA2 repeat; The alignment covers only the most conserved region of the repeat. Q#4262 - CGI_10015527 superfamily 201361 731 761 0.00601951 37.0723 cl02912 BRCA2 superfamily - - BRCA2 repeat; The alignment covers only the most conserved region of the repeat. 
Q#4263 - CGI_10015528 superfamily 247792 19 62 2.28E-07 48.2108 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#4263 - CGI_10015528 superfamily 241563 150 181 0.000413612 38.6144 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#4265 - CGI_10015530 superfamily 244906 9 75 3.96E-28 99.9071 cl08315 CAP_GLY superfamily - - "CAP-Gly domain; Cytoskeleton-associated proteins (CAPs) are involved in the organisation of microtubules and transportation of vesicles and organelles along the cytoskeletal network. A conserved motif, CAP-Gly, has been identified in a number of CAPs, including CLIP-170 and dynactins. The crystal structure of Caenorhabditis elegans F53F4.3 protein CAP-Gly domain was recently solved. The domain contains three beta-strands. The most conserved sequence, GKNDG, is located in two consecutive sharp turns on the surface, forming the entrance to a groove." Q#4268 - CGI_10015534 superfamily 247724 18 132 1.57E-46 155.427 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. 
The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#4268 - CGI_10015534 superfamily 243035 142 256 2.29E-26 100.772 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#4269 - CGI_10015535 superfamily 241596 65 121 5.54E-14 68.0095 cl00081 HLH superfamily - - "Helix-loop-helix domain, found in specific DNA- binding proteins that act as transcription factors; 60-100 amino acids long. A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE 5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved in both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#4269 - CGI_10015535 superfamily 241563 234 270 5.24E-06 44.5855 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#4269 - CGI_10015535 superfamily 110440 661 687 0.000585923 38.5429 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. 
The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is me diated by the NHL repeats." Q#4270 - CGI_10015536 superfamily 247792 18 69 2.35E-08 51.2924 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#4271 - CGI_10015537 superfamily 247792 18 69 2.43E-07 48.2108 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#4271 - CGI_10015537 superfamily 241563 101 147 0.00659347 35.0055 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL 
motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#4272 - CGI_10015538 superfamily 247792 35 86 1.48E-07 48.596 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#4273 - CGI_10015539 superfamily 220386 43 223 4.09E-83 261.099 cl10743 KOG2701 superfamily - - "Coiled-coil domain-containing protein (DUF2037); This entry represents the conserved N-terminal 200 residues of a family of proteins conserved from plants to vertebrates. In Drosophila it comes from the Fidipidine gene, and is of unknown function." Q#4274 - CGI_10015540 superfamily 148723 3 191 4.82E-62 195.77 cl06350 INSIG superfamily - - "Insulin-induced protein (INSIG); This family contains a number of eukaryotic Insulin-induced proteins (INSIG-1 and INSIG-2) approximately 200 residues long. INSIG-1 and INSIG-2 are found in the endoplasmic reticulum and bind the sterol-sensing domain of SREBP cleavage-activating protein (SCAP), preventing it from escorting SREBPs to the Golgi. Their combined action permits feedback regulation of cholesterol synthesis over a wide range of sterol concentrations." Q#4275 - CGI_10015541 superfamily 241600 23 193 4.69E-63 196.693 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. 
Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#4276 - CGI_10015542 superfamily 241600 70 114 2.93E-08 49.1611 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#4279 - CGI_10001766 superfamily 242146 1 63 2.45E-11 56.052 cl00859 Cytochrome_b_N superfamily N - "Cytochrome b (N-terminus)/b6/petB: Cytochrome b is a subunit of cytochrome bc1, an 11-subunit mitochondrial respiratory enzyme. Cytochrome b spans the mitochondrial membrane with 8 transmembrane helices (A-H) in eukaryotes. In plants and cyanobacteria, cytochrome b6 is analogous to eukaryote cytochrome b, containing two chains: helices A-D are encoded by the petB gene and helices E-H are encoded by the petD gene in these organisms. 
Cytochrome b/b6 contains two bound hemes and two ubiquinol/ubiquinone binding sites. The C-terminal portion of cytochrome b is described in a separate CD." Q#4279 - CGI_10001766 superfamily 241673 42 106 1.94E-06 42.2477 cl00193 cytochrome_b_C superfamily C - "Cytochrome b(C-terminus)/b6/petD: Cytochrome b is a subunit of cytochrome bc1, an 11-subunit mitochondrial respiratory enzyme. Cytochrome b spans the mitochondrial membrane with 8 transmembrane helices (A-H) in eukaryotes. In plants and cyanobacteria, cytochrome b6 is analogous to eukaryote cytochrome b, containing two chains: helices A-D are encoded by the petB gene and helices E-H are encoded by the petD gene in these organisms. Cytochrome b/b6 contains two bound hemes and two ubiquinol/ubiquinone binding sites. The C-terminal domain is involved in forming the ubiquinol/ubiquinone binding sites, but not the heme binding sites. The N-terminal portion of cytochrome b, which contains both heme binding sites, is described in a separate CD." Q#4281 - CGI_10007884 superfamily 241563 65 100 0.00875052 31.874 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#4283 - CGI_10007886 superfamily 241563 72 112 0.000117322 40.1552 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#4289 - CGI_10007892 superfamily 243092 444 716 1.60E-15 76.6048 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#4290 - CGI_10007893 superfamily 212559 282 326 5.96E-24 94.9886 cl18297 SANT_MTA3_like superfamily - - "Myb-Like Dna-Binding Domain of MTA3 and related proteins; Members in this SANT/myb family include domains found in mouse metastasis-associated protein 3 (MTA3) proteins and arginine-glutamic dipeptide (RERE) repeats proteins. SANT (SWI3, ADA2, N-CoR and TFIIIB) DNA-binding domains are a diverse set of proteins that share a common 3 alpha-helix bundle. MTA3 has been shown to interact with nucleosome remodeling and deacetylase (NuRD) proteins CHD4 and HDAC1, and the core cohesin complex protein RAD21 in the ovary, and regulate G2/M progression in proliferating granulosa cells. 
RERE belongs to the atrophin family and has been identified as a nuclear receptor corepressor; altered expression levels of RERE are associated with cancer in humans while mutations of Rere in mice cause failure in closing the anterior neural tube and fusion of the telencephalic and optic vesicles during embryogenesis." Q#4290 - CGI_10007893 superfamily 216509 175 226 2.75E-07 48.0038 cl03218 ELM2 superfamily - - "ELM2 domain; The ELM2 (Egl-27 and MTA1 homology 2) domain is a small domain of unknown function. It is found in the MTA1 protein that is part of the NuRD complex. The domain is usually found to the N terminus of a myb-like DNA binding domain pfam00249. ELM2 is also found associated with an ARID DNA binding domain pfam01388 in a member from Arabidopsis thaliana. This suggests that ELM2 may also be involved in DNA binding, or perhaps is a protein-protein interaction domain." Q#4291 - CGI_10007894 superfamily 245815 39 98 0.000524406 39.107 cl11961 ALDH-SF superfamily NC - "NAD(P)+-dependent aldehyde dehydrogenase superfamily; The aldehyde dehydrogenase superfamily (ALDH-SF) of NAD(P)+-dependent enzymes, in general, oxidize a wide range of endogenous and exogenous aliphatic and aromatic aldehydes to their corresponding carboxylic acids and play an important role in detoxification. Besides aldehyde detoxification, many ALDH isozymes possess multiple additional catalytic and non-catalytic functions such as participating in metabolic pathways, or as binding proteins, or osmoregulants, to mention a few. The enzyme has three domains, a NAD(P)+ cofactor-binding domain, a catalytic domain, and a bridging domain; and the active enzyme is generally either homodimeric or homotetrameric. The catalytic mechanism is proposed to involve cofactor binding, resulting in a conformational change and activation of an invariant catalytic cysteine nucleophile. 
The cysteine and aldehyde substrate form an oxyanion thiohemiacetal intermediate resulting in hydride transfer to the cofactor and formation of a thioacylenzyme intermediate. Hydrolysis of the thioacylenzyme and release of the carboxylic acid product occurs, and in most cases, the reduced cofactor dissociates from the enzyme. The evolutionary phylogenetic tree of ALDHs appears to have an initial bifurcation between what has been characterized as the classical aldehyde dehydrogenases, the ALDH family (ALDH) and extended family members or aldehyde dehydrogenase-like (ALDH-L) proteins. The ALDH proteins are represented by enzymes which share a number of highly conserved residues necessary for catalysis and cofactor binding and they include such proteins as retinal dehydrogenase, 10-formyltetrahydrofolate dehydrogenase, non-phosphorylating glyceraldehyde 3-phosphate dehydrogenase, delta(1)-pyrroline-5-carboxylate dehydrogenases, alpha-ketoglutaric semialdehyde dehydrogenase, alpha-aminoadipic semialdehyde dehydrogenase, coniferyl aldehyde dehydrogenase and succinate-semialdehyde dehydrogenase. Included in this larger group are all human, Arabidopsis, Tortula, fungal, protozoan, and Drosophila ALDHs identified in families ALDH1 through ALDH22 with the exception of families ALDH18, ALDH19, and ALDH20 which are present in the ALDH-like group. The ALDH-like group is represented by such proteins as gamma-glutamyl phosphate reductase, LuxC-like acyl-CoA reductase, and coenzyme A acylating aldehyde dehydrogenase. All of these proteins have a conserved cysteine that aligns with the catalytic cysteine of the ALDH group." Q#4292 - CGI_10013049 superfamily 243035 25 91 8.65E-12 56.6156 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. 
This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. 
In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#4293 - CGI_10013050 superfamily 245201 12 299 0 514.497 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#4294 - CGI_10013051 superfamily 187408 362 700 2.09E-179 524.919 cl14654 V_Alix_like superfamily - - "Protein-interacting V-domain of mammalian Alix and related domains; This superfamily contains the V-shaped (V) domain of mammalian Alix (apoptosis-linked gene-2 interacting protein X), His-Domain type N23 protein tyrosine phosphatase (HD-PTP, also known as PTPN23), Bro1 and Rim20 (also known as PalA) from Saccharomyces cerevisiae, and related domains. Alix, HD-PTP, Bro1, and Rim20 all interact with the ESCRT (Endosomal Sorting Complexes Required for Transport) system. Alix, also known as apoptosis-linked gene-2 interacting protein 1 (AIP1), participates in membrane remodeling processes during the budding of enveloped viruses, vesicle budding inside late endosomal multivesicular bodies (MVBs), and the abscission reactions of mammalian cell division. It also functions in apoptosis. 
HD-PTP functions in cell migration and endosomal trafficking, Bro1 in endosomal trafficking, and Rim20 in the response to the external pH via the Rim101 pathway. The Alix V-domain contains a binding site, partially conserved in this superfamily, for the retroviral late assembly (L) domain YPXnL motif. The Alix V-domain is also a dimerization domain. Members of this superfamily have an N-terminal Bro1-like domain, which binds components of the ESCRT-III complex. The Bro1-like domains of Alix and HD-PTP can also bind human immunodeficiency virus type 1 (HIV-1) nucleocapsid. Many members, including Alix, HD-PTP, and Bro1, also have a proline-rich region (PRR), which binds multiple partners in Alix, including Tsg101 (tumor susceptibility gene 101, a component of ESCRT-1) and the apoptotic protein ALG-2. The C-terminal portion (V-domain and PRR) of Bro1 interacts with Doa4, a ubiquitin thiolesterase needed to remove ubiquitin from MVB cargoes; it interacts with a YPxL motif in Doa4's catalytic domain to stimulate its deubiquitination activity. Rim20 may bind the ESCRT-III subunit Snf7, bringing the protease Rim13 (a YPxL-containing transcription factor) into proximity with Rim101, and promoting the proteolytic activation of Rim101. HD-PTP is encoded by the PTPN23 gene, a tumor suppressor gene candidate often absent in human kidney, breast, lung, and cervical tumors. HD-PTP has a C-terminal catalytically inactive tyrosine phosphatase domain." Q#4294 - CGI_10013051 superfamily 187403 2 346 2.80E-177 519.545 cl14649 BRO1_Alix_like superfamily - - "Protein-interacting Bro1-like domain of mammalian Alix and related domains; This superfamily includes the Bro1-like domains of mammalian Alix (apoptosis-linked gene-2 interacting protein X), His-Domain type N23 protein tyrosine phosphatase (HD-PTP, also known as PTPN23), RhoA-binding proteins Rhophilin-1 and Rhophilin-2, Brox, Bro1 and Rim20 (also known as PalA) from Saccharomyces cerevisiae, and related domains. 
Alix, HD-PTP, Brox, Bro1 and Rim20 interact with the ESCRT (Endosomal Sorting Complexes Required for Transport) system. Alix, also known as apoptosis-linked gene-2 interacting protein 1 (AIP1), participates in membrane remodeling processes during the budding of enveloped viruses, vesicle budding inside late endosomal multivesicular bodies (MVBs), and the abscission reactions of mammalian cell division. It also functions in apoptosis. HD-PTP functions in cell migration and endosomal trafficking, Bro1 in endosomal trafficking, and Rim20 in the response to the external pH via the Rim101 pathway. Bro1-like domains are boomerang-shaped, and part of the domain is a tetratricopeptide repeat (TPR)-like structure. Bro1-like domains bind components of the ESCRT-III complex: CHMP4 (in the case of Alix, HD-PTP, and Brox) and Snf7 (in the case of yeast Bro1, and Rim20). The single domain protein human Brox, and the isolated Bro1-like domains of Alix, HD-PTP and Rhophilin can bind human immunodeficiency virus type 1 (HIV-1) nucleocapsid. Alix, HD-PTP, Bro1, and Rim20 also have a V-shaped (V) domain, which in the case of Alix, has been shown to be a dimerization domain and to contain a binding site for the retroviral late assembly (L) domain YPXnL motif, which is partially conserved in this superfamily. Alix, HD-PTP and Bro1 also have a proline-rich region (PRR); the Alix PRR binds multiple partners. Rhophilin-1, and -2, in addition to this Bro1-like domain, have an N-terminal Rho-binding domain and a C-terminal PDZ (PSD-95, Disc-large, ZO-1) domain. HD-PTP is encoded by the PTPN23 gene, a tumor suppressor gene candidate frequently absent in human kidney, breast, lung, and cervical tumors. This protein has a C-terminal, catalytically inactive tyrosine phosphatase domain." 
Q#4296 - CGI_10013053 superfamily 245213 28 65 8.11E-05 39.9274 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#4299 - CGI_10013056 superfamily 241599 77 135 5.80E-24 92.6916 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#4301 - CGI_10013058 superfamily 241599 66 124 5.60E-22 86.9136 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#4302 - CGI_10013059 superfamily 149667 81 171 3.48E-08 50.0615 cl07343 GON superfamily C - GON domain; The GON domain is found in the ADAMTS (a disintegrin and metalloproteinase domain with thrombospondin type-1 modules) family of proteins. It contains several conserved cysteine residues. Q#4303 - CGI_10013060 superfamily 149667 1 111 5.30E-07 47.7503 cl07343 GON superfamily N - GON domain; The GON domain is found in the ADAMTS (a disintegrin and metalloproteinase domain with thrombospondin type-1 modules) family of proteins. It contains several conserved cysteine residues. 
Q#4308 - CGI_10013065 superfamily 243051 94 248 5.03E-26 100.53 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal cord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#4308 - CGI_10013065 superfamily 245213 53 90 2.64E-07 46.0906 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#4308 - CGI_10013065 superfamily 245213 14 48 1.25E-06 44.1646 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#4310 - CGI_10012186 superfamily 241584 2622 2713 1.56E-10 60.9731 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#4310 - CGI_10012186 superfamily 241584 2552 2616 1.98E-06 48.6467 cl00065 FN3 superfamily N - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." 
Q#4310 - CGI_10012186 superfamily 243092 2085 2412 4.29E-25 108.576 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#4310 - CGI_10012186 superfamily 243092 1816 2160 2.99E-23 103.184 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#4310 - CGI_10012186 superfamily 204895 149 220 1.48E-21 92.4033 cl13764 DUF3651 superfamily - - "Protein of unknown function (DUF3651); This domain family is found in eukaryotes, and is approximately 70 amino acids in length. This family is frequently annotated as a membrane protein but there is little associated literature to back this up." Q#4311 - CGI_10012187 superfamily 218231 96 163 3.84E-07 46.5026 cl04708 ELMO_CED12 superfamily N - "ELMO/CED-12 family; This family represents a conserved domain which is found in a number of eukaryotic proteins including CED-12, ELMO I and ELMO II. ELMO1 is a component of signalling pathways that regulate phagocytosis and cell migration and is the mammalian orthologue of the C. elegans gene, ced-12. 
CED-12 is required for the engulfment of dying cells and cell migration. In mammalian cells, ELMO1 interacts with Dock180 as part of the CrkII/Dock180/Rac pathway responsible for phagocytosis and cell migration. ELMO1 is ubiquitously expressed, although its expression is highest in the spleen, an organ rich in immune cells. ELMO1 has a PH domain and a polyproline sequence motif at its C terminus which are not present in this alignment." Q#4312 - CGI_10012188 superfamily 241645 59 145 5.87E-07 44.0717 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#4313 - CGI_10012189 superfamily 241622 41 83 4.42E-08 52.5691 cl00117 PDZ superfamily N - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. 
PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either an N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95 (post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#4313 - CGI_10012189 superfamily 241647 420 438 0.00642267 36.4247 cl00157 WW superfamily C - Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs. Q#4318 - CGI_10012194 superfamily 219000 31 178 2.11E-25 101.184 cl05717 Drf_FH3 superfamily - - Diaphanous FH3 Domain; This region is found in the Formin-like and diaphanous proteins. Q#4319 - CGI_10012195 superfamily 219001 1 45 5.64E-05 37.6735 cl05720 Drf_GBD superfamily N - "Diaphanous GTPase-binding Domain; This domain is bound to by GTP-attached Rho proteins, leading to activation of the Drf protein." Q#4320 - CGI_10012196 superfamily 242914 21 60 1.93E-17 69.1796 cl02163 zf-CSL superfamily N - CSL zinc finger; This is a zinc binding motif which contains four cysteine residues which chelate zinc. 
This domain is often found associated with a pfam00226 domain. This domain is named after the conserved motif of the final cysteine. Q#4322 - CGI_10012198 superfamily 247792 80 125 0.000950668 38.1956 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#4322 - CGI_10012198 superfamily 128778 263 383 1.48E-10 59.9711 cl17972 BBC superfamily - - B-Box C-terminal domain; Coiled coil region C-terminal to (some) B-Box domains Q#4322 - CGI_10012198 superfamily 110440 683 710 4.55E-06 45.0913 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#4322 - CGI_10012198 superfamily 241563 213 256 9.12E-06 44.3924 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#4323 - CGI_10012199 superfamily 220421 11 482 6.00E-47 172.305 cl10789 DUF2352 superfamily - - Uncharacterized conserved protein (DUF2352); Members of this family of uncharacterized proteins have no known function. Q#4324 - CGI_10012200 superfamily 243306 33 291 1.85E-126 364.186 cl03114 RNase_PH superfamily - - "RNase PH-like 3'-5' exoribonucleases; RNase PH-like 3'-5' exoribonucleases are enzymes that catalyze the 3' to 5' processing and decay of RNA substrates. Evolutionarily related members can be found in prokaryotes, archaea, and eukaryotes. Bacterial ribonuclease PH contains a single copy of this domain, and removes nucleotide residues following the -CCA terminus of tRNA. Polyribonucleotide nucleotidyltransferase (PNPase) contains two tandem copies of the domain and is involved in mRNA degradation in a 3'-5' direction. Archaeal exosomes contain two individually encoded RNase PH-like 3'-5' exoribonucleases and are required for 3' processing of the 5.8S rRNA. The eukaryotic exosome core is composed of six individually encoded RNase PH-like subunits, but it is not a phosphorolytic enzyme per se; it directly associates with Rrp44 and Rrp6, which are hydrolytic exoribonucleases related to bacterial RNase II/R and RNase D. All members of the RNase PH-like family form ring structures by oligomerization of six domains or subunits, except for a total of 3 subunits with tandem repeats in the case of PNPase, with a central channel through which the RNA substrate must pass to gain access to the phosphorolytic active sites." Q#4325 - CGI_10012201 superfamily 247724 2333 2495 4.50E-47 169.439 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. 
The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#4325 - CGI_10012201 superfamily 243109 1443 1572 4.99E-32 126.255 cl02614 SPRY superfamily - - "SPRY domain; SPRY domains, first identified in the SP1A kinase of Dictyostelium and rabbit Ryanodine receptor (hence the name), are homologous to B30.2. SPRY domains have been identified in at least 11 protein families, covering a wide range of functions, including regulation of cytokine signaling (SOCS), RNA metabolism (DDX1 and hnRNP), immunity to retroviruses (TRIM5alpha), intracellular calcium release (ryanodine receptors or RyR) and regulatory and developmental processes (HERC1 and Ash2L). B30.2 also contains residues in the N-terminus that form a distinct PRY domain structure; i.e. B30.2 domain consists of PRY and SPRY subdomains. B30.2 domains comprise the C-terminus of three protein families: BTNs (receptor glycoproteins of immunoglobulin superfamily); several TRIM proteins (composed of RING/B-box/coiled-coil or RBCC core); Stonutoxin (secreted poisonous protein of the stonefish Synanceia horrida). 
While SPRY domains are evolutionarily ancient, B30.2 domains are a more recent adaptation where the SPRY/PRY combination is a possible component of immune defense. Mutations found in the SPRY-containing proteins have been shown to cause Mediterranean fever and Opitz syndrome." Q#4325 - CGI_10012201 superfamily 248012 2989 3096 1.32E-14 73.3808 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#4325 - CGI_10012201 superfamily 243109 553 693 1.42E-08 56.1483 cl02614 SPRY superfamily - - "SPRY domain; SPRY domains, first identified in the SP1A kinase of Dictyostelium and rabbit Ryanodine receptor (hence the name), are homologous to B30.2. SPRY domains have been identified in at least 11 protein families, covering a wide range of functions, including regulation of cytokine signaling (SOCS), RNA metabolism (DDX1 and hnRNP), immunity to retroviruses (TRIM5alpha), intracellular calcium release (ryanodine receptors or RyR) and regulatory and developmental processes (HERC1 and Ash2L). B30.2 also contains residues in the N-terminus that form a distinct PRY domain structure; i.e. B30.2 domain consists of PRY and SPRY subdomains. B30.2 domains comprise the C-terminus of three protein families: BTNs (receptor glycoproteins of immunoglobulin superfamily); several TRIM proteins (composed of RING/B-box/coiled-coil or RBCC core); Stonutoxin (secreted poisonous protein of the stonefish Synanceia horrida). While SPRY domains are evolutionarily ancient, B30.2 domains are a more recent adaptation where the SPRY/PRY combination is a possible component of immune defense. Mutations found in the SPRY-containing proteins have been shown to cause Mediterranean fever and Opitz syndrome." Q#4325 - CGI_10012201 superfamily 246925 2020 2131 0.00309471 41.187 cl15309 LRR_RI superfamily NC - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. 
LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#4325 - CGI_10012201 superfamily 199166 60 109 0.00703288 39.618 cl15308 AMN1 superfamily NC - "Antagonist of mitotic exit network protein 1; Amn1 has been functionally characterized in Saccharomyces cerevisiae as a component of the Antagonist of MEN pathway (AMEN). The AMEN network is activated by MEN (mitotic exit network) via an active Cdc14, and in turn switches off MEN. Amn1 constitutes one of the alternative mechanisms by which MEN may be disrupted. Specifically, Amn1 binds Tem1 (Termination of M-phase, a GTPase that belongs to the RAS superfamily), and disrupts its association with Cdc15, the primary downstream target. Amn1 is a leucine-rich repeat (LRR) protein, with 12 repeats in the S. cerevisiae ortholog. As a negative regulator of the signal transduction pathway MEN, overexpression of AMN1 slows the growth of wild type cells. The function of the vertebrate members of this family has not been determined experimentally, they have fewer LRRs that determine the extent of this model." Q#4327 - CGI_10003583 superfamily 218118 64 101 3.79E-09 51.0757 cl04552 CD225 superfamily C - "Interferon-induced transmembrane protein; This family includes the human leukocyte antigen CD225, which is an interferon inducible transmembrane protein, and is associated with interferon induced cell growth suppression." 
Q#4328 - CGI_10003584 superfamily 218118 38 103 1.12E-06 42.2161 cl04552 CD225 superfamily C - "Interferon-induced transmembrane protein; This family includes the human leukocyte antigen CD225, which is an interferon inducible transmembrane protein, and is associated with interferon induced cell growth suppression." Q#4329 - CGI_10003585 superfamily 218118 122 187 1.77E-15 68.4096 cl04552 CD225 superfamily - - "Interferon-induced transmembrane protein; This family includes the human leukocyte antigen CD225, which is an interferon inducible transmembrane protein, and is associated with interferon induced cell growth suppression." Q#4331 - CGI_10003587 superfamily 199156 236 251 0.00156408 35.8929 cl15298 zf-CCHC superfamily - - "Zinc knuckle; The zinc knuckle is a zinc binding motif composed of the following CX2CX4HX4C where X can be any amino acid. The motifs are mostly from retroviral gag proteins (nucleocapsid). Prototype structure is from HIV. Also contains members involved in eukaryotic gene regulation, such as C. elegans GLH-1. Structure is an 18-residue zinc finger." Q#4332 - CGI_10004785 superfamily 247743 1773 1916 9.32E-08 52.6828 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperones, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." 
Q#4333 - CGI_10004786 superfamily 193256 551 814 1.33E-69 237.923 cl18189 AAA_8 superfamily - - "P-loop containing dynein motor region D4; The 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This particular family is the D4 ATP-binding region of the motor." Q#4333 - CGI_10004786 superfamily 193251 180 453 2.64E-51 185.139 cl18188 AAA_7 superfamily - - "P-loop containing dynein motor region D3; the 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This particular family is the D3 and is an ATP binding site." Q#4333 - CGI_10004786 superfamily 193257 1192 1403 3.58E-40 151.291 cl15086 AAA_9 superfamily - - "ATP-binding dynein motor region D5; The 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This particular family is the D5 ATP-binding region of the motor, but has lost its P-loop." Q#4333 - CGI_10004786 superfamily 193253 831 1172 8.11E-39 150.958 cl15084 MT superfamily - - "Microtubule-binding stalk of dynein motor; the 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. 
The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This family is the region between D4 and D5 and is the two predicted alpha-helical coiled coil segments that form the stalk supporting the ATP-sensitive microtubule binding component." Q#4334 - CGI_10004787 superfamily 247792 746 787 0.000387706 39.3512 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#4335 - CGI_10004788 superfamily 243109 67 194 4.30E-82 249.523 cl02614 SPRY superfamily - - "SPRY domain; SPRY domains, first identified in the SP1A kinase of Dictyostelium and rabbit Ryanodine receptor (hence the name), are homologous to B30.2. SPRY domains have been identified in at least 11 protein families, covering a wide range of functions, including regulation of cytokine signaling (SOCS), RNA metabolism (DDX1 and hnRNP), immunity to retroviruses (TRIM5alpha), intracellular calcium release (ryanodine receptors or RyR) and regulatory and developmental processes (HERC1 and Ash2L). B30.2 also contains residues in the N-terminus that form a distinct PRY domain structure; i.e. B30.2 domain consists of PRY and SPRY subdomains. 
B30.2 domains comprise the C-terminus of three protein families: BTNs (receptor glycoproteins of immunoglobulin superfamily); several TRIM proteins (composed of RING/B-box/coiled-coil or RBCC core); Stonutoxin (secreted poisonous protein of the stonefish Synanceia horrida). While SPRY domains are evolutionarily ancient, B30.2 domains are a more recent adaptation where the SPRY/PRY combination is a possible component of immune defense. Mutations found in the SPRY-containing proteins have shown to cause Mediterranean fever and Opitz syndrome." Q#4339 - CGI_10014720 superfamily 217293 40 173 3.68E-25 99.6295 cl03788 Neur_chan_LBD superfamily C - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#4341 - CGI_10014722 superfamily 217293 177 383 1.27E-24 101.555 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#4341 - CGI_10014722 superfamily 217293 29 168 4.76E-24 100.015 cl03788 Neur_chan_LBD superfamily C - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#4341 - CGI_10014722 superfamily 202474 391 457 3.98E-06 46.8781 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. 
Q#4342 - CGI_10014723 superfamily 243092 50 342 2.49E-77 241.47 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#4345 - CGI_10014726 superfamily 241814 25 226 6.42E-26 104.292 cl00360 COG0212 superfamily - - 5-formyltetrahydrofolate cyclo-ligase [Coenzyme metabolism] Q#4345 - CGI_10014726 superfamily 247723 430 498 2.08E-15 71.6018 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. 
RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#4346 - CGI_10014727 superfamily 208802 10 283 0 510.012 cl07974 DRE_TIM_metallolyase superfamily - - "DRE-TIM metallolyase superfamily; The DRE-TIM metallolyase superfamily includes 2-isopropylmalate synthase (IPMS), alpha-isopropylmalate synthase (LeuA), 3-hydroxy-3-methylglutaryl-CoA lyase, homocitrate synthase, citramalate synthase, 4-hydroxy-2-oxovalerate aldolase, re-citrate synthase, transcarboxylase 5S, pyruvate carboxylase, AksA, and FrbC. These members all share a conserved triose-phosphate isomerase (TIM) barrel domain consisting of a core beta(8)-alpha(8) motif with the eight parallel beta strands forming an enclosed barrel surrounded by eight alpha helices. The domain has a catalytic center containing a divalent cation-binding site formed by a cluster of invariant residues that cap the core of the barrel. In addition, the catalytic site includes three invariant residues - an aspartate (D), an arginine (R), and a glutamate (E) - which is the basis for the domain name "DRE-TIM"." Q#4348 - CGI_10014729 superfamily 241583 118 205 3.59E-09 56.0403 cl00064 ZnMc superfamily C - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." 
Q#4348 - CGI_10014729 superfamily 241571 321 417 3.78E-07 48.9256 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#4348 - CGI_10014729 superfamily 241583 219 251 0.000389179 40.6323 cl00064 ZnMc superfamily N - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#4352 - CGI_10014734 superfamily 226447 746 842 0.00103297 38.6074 cl18758 COG3937 superfamily - - Uncharacterized conserved protein [Function unknown] Q#4356 - CGI_10009615 superfamily 241640 1 225 8.66E-43 145.882 cl00149 Tryp_SPc superfamily - - Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues. Q#4357 - CGI_10009616 superfamily 241802 384 869 3.77E-176 520.643 cl00342 Trp-synth-beta_II superfamily - - "Tryptophan synthase beta superfamily (fold type II); this family of pyridoxal phosphate (PLP)-dependent enzymes catalyzes beta-replacement and beta-elimination reactions. This CD corresponds to aminocyclopropane-1-carboxylate deaminase (ACCD), tryptophan synthase beta chain (Trp-synth_B), cystathionine beta-synthase (CBS), O-acetylserine sulfhydrylase (CS), serine dehydratase (Ser-dehyd), threonine dehydratase (Thr-dehyd), diaminopropionate ammonia lyase (DAL), and threonine synthase (Thr-synth). ACCD catalyzes the conversion of 1-aminocyclopropane-1-carboxylate to alpha-ketobutyrate and ammonia. 
Tryptophan synthase folds into a tetramer, where the beta chain is the catalytic PLP-binding subunit and catalyzes the formation of L-tryptophan from indole and L-serine. CBS is a tetrameric hemeprotein that catalyzes condensation of serine and homocysteine to cystathionine. CS is a homodimer that catalyzes the formation of L-cysteine from O-acetyl-L-serine. Ser-dehyd catalyzes the conversion of L- or D-serine to pyruvate and ammonia. Thr-dehyd is active as a homodimer and catalyzes the conversion of L-threonine to 2-oxobutanoate and ammonia. DAL is also a homodimer and catalyzes the alpha, beta-elimination reaction of both L- and D-alpha, beta-diaminopropionate to form pyruvate and ammonia. Thr-synth catalyzes the formation of threonine and inorganic phosphate from O-phosphohomoserine." Q#4357 - CGI_10009616 superfamily 247744 212 365 3.51E-29 114.96 cl17190 NK superfamily - - "Nucleoside/nucleotide kinase (NK) is a protein superfamily consisting of multiple families of enzymes that share structural similarity and are functionally related to the catalysis of the reversible phosphate group transfer from nucleoside triphosphates to nucleosides/nucleotides, nucleoside monophosphates, or sugars. Members of this family play a wide variety of essential roles in nucleotide metabolism, the biosynthesis of coenzymes and aromatic compounds, as well as the metabolism of sugar and sulfate." Q#4358 - CGI_10009617 superfamily 220651 178 358 3.18E-46 158.46 cl10932 Mlf1IP superfamily - - "Myelodysplasia-myeloid leukemia factor 1-interacting protein; This entry is the conserved central region of a group of proteins that are putative transcriptional repressors. The structure contains a putative 14-3-3 binding motif involved in the subcellular localisation of various regulatory molecules, and it may be that interaction with the transcription factor DREF could be regulated through this motif. DREF regulates proliferation-related genes in Drosophila. 
Mlf1IP is expressed in both the nuclei and the cytoplasm and thus may have multi-functions." Q#4358 - CGI_10009617 superfamily 216981 24 42 0.0087079 34.8158 cl17087 OTU superfamily C - "OTU-like cysteine protease; This family is comprised of a group of predicted cysteine proteases, homologous to the Ovarian Tumour (OTU) gene in Drosophila. Members include proteins from eukaryotes, viruses and pathogenic bacterium. The conserved cysteine and histidine, and possibly the aspartate, represent the catalytic residues in this putative group of proteases." Q#4360 - CGI_10003702 superfamily 216112 47 162 1.34E-38 135.117 cl02964 RNB superfamily C - RNB domain; This domain is the catalytic domain of ribonuclease II. Q#4361 - CGI_10003703 superfamily 241563 68 109 9.05E-07 46.3184 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#4361 - CGI_10003703 superfamily 241563 28 59 0.00338096 35.918 cl00034 BBOX superfamily N - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#4364 - CGI_10006073 superfamily 241691 403 523 0.000332083 40.5732 cl00213 DNA_BRE_C superfamily N - "DNA breaking-rejoining enzymes, C-terminal catalytic domain. The DNA breaking-rejoining enzyme superfamily includes type IB topoisomerases and tyrosine recombinases that share the same fold in their catalytic domain containing six conserved active site residues. The best-studied members of this diverse superfamily include human topoisomerase I, the bacteriophage lambda integrase, the bacteriophage P1 Cre recombinase, the yeast Flp recombinase and the bacterial XerD/C recombinases. 
Their overall reaction mechanism is essentially identical and involves cleavage of a single strand of a DNA duplex by nucleophilic attack of a conserved tyrosine to give a 3' phosphotyrosyl protein-DNA adduct. In the second rejoining step, a terminal 5' hydroxyl attacks the covalent adduct to release the enzyme and generate duplex DNA. The enzymes differ in that topoisomerases cleave and then rejoin the same 5' and 3' termini, whereas a site-specific recombinase transfers a 5' hydroxyl generated by recombinase cleavage to a new 3' phosphate partner located in a different duplex region. Many DNA breaking-rejoining enzymes also have N-terminal domains, which show little sequence or structure similarity." Q#4366 - CGI_10001021 superfamily 247916 126 216 3.88E-20 85.127 cl17362 Transglut_core superfamily - - "Transglutaminase-like superfamily; This family includes animal transglutaminases and other bacterial proteins of unknown function. Sequence conservation in this superfamily primarily involves three motifs that centre around conserved cysteine, histidine, and aspartate residues that form the catalytic triad in the structurally characterized transglutaminase, the human blood clotting factor XIIIa'. On the basis of the experimentally demonstrated activity of the Methanobacterium phage pseudomurein endoisopeptidase, it is proposed that many, if not all, microbial homologues of the transglutaminases are proteases and that the eukaryotic transglutaminases have evolved from an ancestral protease." Q#4367 - CGI_10001022 superfamily 242274 3 71 6.13E-21 84.0034 cl01053 SGNH_hydrolase superfamily N - "SGNH_hydrolase, or GDSL_hydrolase, is a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the typical Ser-His-Asp(Glu) triad from other serine hydrolases, but may lack the carboxlic acid." 
Q#4370 - CGI_10005773 superfamily 245022 2 35 0.00278323 33.1859 cl09154 MrpF_PhaF superfamily C - "Multiple resistance and pH regulation protein F (MrpF / PhaF); Members of the PhaF / MrpF family are predicted to be an integral membrane proteins with three transmembrane regions, involved in regulation of pH. PhaF is part of a potassium efflux system involved in pH regulation. It is also involved in symbiosis in Rhizobium meliloti. MrpF is part of a Na+/H+ antiporter complex, also involved in pH homeostasis. MrpF is thought to be an efflux system for Na+ and cholate. The Mrp system in Bacilli may also have primary energisation capacities." Q#4371 - CGI_10005774 superfamily 241600 220 429 1.90E-88 271.806 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#4373 - CGI_10005776 superfamily 246925 270 423 5.53E-12 67.3805 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. 
This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#4373 - CGI_10005776 superfamily 246925 467 708 1.86E-07 53.5134 cl15309 LRR_RI superfamily C - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#4374 - CGI_10005777 superfamily 214507 209 247 5.50E-06 42.8024 cl15307 LRRCT superfamily C - Leucine rich repeat C-terminal domain; Leucine rich repeat C-terminal domain. Q#4374 - CGI_10005777 superfamily 246925 23 114 0.000562174 39.261 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#4380 - CGI_10017381 superfamily 241574 801 978 2.25E-66 223.617 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. 
This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#4380 - CGI_10017381 superfamily 241574 1000 1072 4.09E-08 53.7437 cl00053 PTPc superfamily C - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#4381 - CGI_10017382 superfamily 248264 299 458 3.75E-39 139.68 cl17710 DDE_4 superfamily - - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#4381 - CGI_10017382 superfamily 243161 5 90 9.52E-15 69.7305 cl02739 THAP superfamily - - "THAP domain; The THAP domain is a putative DNA-binding domain (DBD) and probably also binds a zinc ion. It features the conserved C2CH architecture (consensus sequence: Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His). 
Other universal features include the location of the domain at the N-termini of proteins, its size of about 90 residues, a C-terminal AVPTIF box and several other conserved residues. Orthologues of the human THAP domain have been identified in other vertebrates and probably worms and flies, but not in other eukaryotes or any prokaryotes." Q#4381 - CGI_10017382 superfamily 222263 218 310 2.56E-08 51.1645 cl16321 DDE_4_2 superfamily - - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#4387 - CGI_10017389 superfamily 247905 726 870 1.40E-26 106.553 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#4387 - CGI_10017389 superfamily 247805 546 670 4.18E-08 52.3396 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 
Q#4388 - CGI_10017390 superfamily 241596 30 90 7.92E-10 54.1423 cl00081 HLH superfamily - - "Helix-loop-helix domain, found in specific DNA- binding proteins that act as transcription factors; 60-100 amino acids long. A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE 5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved in both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#4389 - CGI_10017391 superfamily 241596 4 64 9.91E-11 56.0683 cl00081 HLH superfamily - - "Helix-loop-helix domain, found in specific DNA- binding proteins that act as transcription factors; 60-100 amino acids long. 
A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE 5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved in both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#4390 - CGI_10017392 superfamily 241624 152 408 3.81E-66 222.587 cl00120 PP2Cc superfamily - - "Serine/threonine phosphatases, family 2C, catalytic domain; The protein architecture and deduced catalytic mechanism of PP2C phosphatases are similar to the PP1, PP2A, PP2B family of protein Ser/Thr phosphatases, with which PP2C shares no sequence similarity." Q#4392 - CGI_10017394 superfamily 243039 294 423 3.00E-70 228.453 cl02446 MATH superfamily - - "MATH (meprin and TRAF-C homology) domain; an independent folding unit with an eight-stranded beta-sandwich structure found in meprins, TRAFs and other proteins. Meprins comprise a class of extracellular metalloproteases which are anchored to the membrane and are capable of cleaving growth factors, extracellular matrix proteins, and biologically active peptides. 
TRAF molecules serve as adapter proteins that link cell surface receptors of the Tumor Necrosis Factor and Interleukin-1/Toll-like families to downstream kinase cascades, which results in the activation of transcription factors and the regulation of cell survival, proliferation and stress responses in the immune and inflammatory systems. Other members include the ubiquitin ligases, TRIM37 and SPOP, and the ubiquitin-specific proteases, HAUSP and Ubp21p. A large number of uncharacterized members mostly from lineage-specific expansions in C. elegans and rice contain MATH and BTB domains, similar to SPOP. The MATH domain has been shown to bind peptide/protein substrates in TRAFs and HAUSP. It is possible that the MATH domain in other members of this superfamily also interacts with various protein substrates. The TRAF domain may also be involved in the trimerization of TRAFs. Based on homology, it is postulated that the MATH domain in meprins may be involved in its tetramer assembly and that the MATH domain, in general, may take part in diverse modular arrangements defined by adjacent multimerization domains." 
Q#4392 - CGI_10017394 superfamily 247792 26 69 5.46E-09 53.6036 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#4392 - CGI_10017394 superfamily 241563 110 144 1.33E-06 46.5115 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#4392 - CGI_10017394 superfamily 128778 150 275 4.11E-10 58.4303 cl17972 BBC superfamily - - B-Box C-terminal domain; Coiled coil region C-terminal to (some) B-Box domains Q#4393 - CGI_10017395 superfamily 241802 9 291 3.86E-96 304.41 cl00342 Trp-synth-beta_II superfamily - - "Tryptophan synthase beta superfamily (fold type II); this family of pyridoxal phosphate (PLP)-dependent enzymes catalyzes beta-replacement and beta-elimination reactions. This CD corresponds to aminocyclopropane-1-carboxylate deaminase (ACCD), tryptophan synthase beta chain (Trp-synth_B), cystathionine beta-synthase (CBS), O-acetylserine sulfhydrylase (CS), serine dehydratase (Ser-dehyd), threonine dehydratase (Thr-dehyd), diaminopropionate ammonia lyase (DAL), and threonine synthase (Thr-synth). ACCD catalyzes the conversion of 1-aminocyclopropane-1-carboxylate to alpha-ketobutyrate and ammonia. 
Tryptophan synthase folds into a tetramer, where the beta chain is the catalytic PLP-binding subunit and catalyzes the formation of L-tryptophan from indole and L-serine. CBS is a tetrameric hemeprotein that catalyzes condensation of serine and homocysteine to cystathionine. CS is a homodimer that catalyzes the formation of L-cysteine from O-acetyl-L-serine. Ser-dehyd catalyzes the conversion of L- or D-serine to pyruvate and ammonia. Thr-dehyd is active as a homodimer and catalyzes the conversion of L-threonine to 2-oxobutanoate and ammonia. DAL is also a homodimer and catalyzes the alpha, beta-elimination reaction of both L- and D-alpha, beta-diaminopropionate to form pyruvate and ammonia. Thr-synth catalyzes the formation of threonine and inorganic phosphate from O-phosphohomoserine." Q#4393 - CGI_10017395 superfamily 247775 267 785 2.77E-61 215.283 cl17221 ArsB_NhaD_permease superfamily - - "Anion permease ArsB/NhaD. These permeases have been shown to translocate sodium, arsenate, antimonite, sulfate and organic anions across biological membranes in all three kingdoms of life. A typical anion permease contains 8-13 transmembrane helices and can function either independently as a chemiosmotic transporter or as a channel-forming subunit of an ATP-driven anion pump." Q#4394 - CGI_10017396 superfamily 241802 9 319 2.22E-106 315.966 cl00342 Trp-synth-beta_II superfamily - - "Tryptophan synthase beta superfamily (fold type II); this family of pyridoxal phosphate (PLP)-dependent enzymes catalyzes beta-replacement and beta-elimination reactions. This CD corresponds to aminocyclopropane-1-carboxylate deaminase (ACCD), tryptophan synthase beta chain (Trp-synth_B), cystathionine beta-synthase (CBS), O-acetylserine sulfhydrylase (CS), serine dehydratase (Ser-dehyd), threonine dehydratase (Thr-dehyd), diaminopropionate ammonia lyase (DAL), and threonine synthase (Thr-synth). 
ACCD catalyzes the conversion of 1-aminocyclopropane-1-carboxylate to alpha-ketobutyrate and ammonia. Tryptophan synthase folds into a tetramer, where the beta chain is the catalytic PLP-binding subunit and catalyzes the formation of L-tryptophan from indole and L-serine. CBS is a tetrameric hemeprotein that catalyzes condensation of serine and homocysteine to cystathionine. CS is a homodimer that catalyzes the formation of L-cysteine from O-acetyl-L-serine. Ser-dehyd catalyzes the conversion of L- or D-serine to pyruvate and ammonia. Thr-dehyd is active as a homodimer and catalyzes the conversion of L-threonine to 2-oxobutanoate and ammonia. DAL is also a homodimer and catalyzes the alpha, beta-elimination reaction of both L- and D-alpha, beta-diaminopropionate to form pyruvate and ammonia. Thr-synth catalyzes the formation of threonine and inorganic phosphate from O-phosphohomoserine." Q#4395 - CGI_10017397 superfamily 247775 53 91 9.73E-07 44.113 cl17221 ArsB_NhaD_permease superfamily NC - "Anion permease ArsB/NhaD. These permeases have been shown to translocate sodium, arsenate, antimonite, sulfate and organic anions across biological membranes in all three kingdoms of life. A typical anion permease contains 8-13 transmembrane helices and can function either independently as a chemiosmotic transporter or as a channel-forming subunit of an ATP-driven anion pump." Q#4399 - CGI_10017401 superfamily 247723 118 191 3.63E-45 147.244 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. 
RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#4399 - CGI_10017401 superfamily 247723 8 78 1.70E-35 122.196 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." 
Q#4400 - CGI_10017402 superfamily 243034 398 503 0.000954993 37.7448 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#4401 - CGI_10017403 superfamily 247723 43 118 1.97E-45 149.787 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. 
RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#4402 - CGI_10017404 superfamily 218333 422 790 1.02E-16 84.1545 cl09347 DNA_pol_phi superfamily N - DNA polymerase phi; This family includes the fifth essential DNA polymerase in yeast EC:2.7.7.7. Pol5p is localised exclusively to the nucleolus and binds near or at the enhancer region of rRNA-encoding DNA repeating units. Q#4402 - CGI_10017404 superfamily 218333 63 243 1.54E-07 54.4942 cl09347 DNA_pol_phi superfamily C - DNA polymerase phi; This family includes the fifth essential DNA polymerase in yeast EC:2.7.7.7. Pol5p is localised exclusively to the nucleolus and binds near or at the enhancer region of rRNA-encoding DNA repeating units. Q#4403 - CGI_10017405 superfamily 202638 148 370 2.77E-38 140.036 cl18230 Anp1 superfamily - - "Anp1; The members of this family (Anp1, Van1 and Mnn9) are membrane proteins required for proper Golgi function. These proteins co-localise within the cis Golgi, and that they are physically associated in two distinct complexes." Q#4404 - CGI_10017406 superfamily 243094 1198 1530 8.58E-178 550.387 cl02569 RasGAP superfamily - - "Ras GTPase Activating Domain; RasGAP functions as an enhancer of the hydrolysis of GTP that is bound to Ras-GTPases. Proteins having a RasGAP domain include p120GAP, IQGAP, Rab5-activating protein 6, and Neurofibromin, among others. Although the Rho (Ras homolog) GTPases are most closely related to members of the Ras family, RhoGAP and RasGAP exhibit no similarity at their amino acid sequence level. RasGTPases function as molecular switches in a large number of signaling pathways. 
They are in the on state when bound to GTP, and in the off state when bound to GDP. The RasGAP domain speeds up the hydrolysis of GTP in Ras-like proteins acting as a negative regulator." Q#4404 - CGI_10017406 superfamily 247725 1713 1822 2.13E-59 202.699 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#4404 - CGI_10017406 superfamily 247069 1572 1712 1.07E-07 53.1578 cl15787 SEC14 superfamily - - "Sec14p-like lipid-binding domain. Found in secretory proteins, such as S. cerevisiae phosphatidylinositol transfer protein (Sec14p), and in lipid regulated proteins such as RhoGAPs, RhoGEFs and neurofibromin (NF1). SEC14 domain of Dbl is known to associate with G protein beta/gamma subunits." Q#4406 - CGI_10001352 superfamily 241600 53 270 5.22E-70 217.493 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. 
C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#4407 - CGI_10003187 superfamily 241600 121 325 3.64E-33 122.771 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#4408 - CGI_10008525 superfamily 241574 923 1117 4.07E-84 275.233 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#4408 - CGI_10008525 superfamily 216363 22 78 2.57E-12 65.5694 cl08312 UPF0029 superfamily N - Uncharacterized protein family UPF0029; Uncharacterized protein family UPF0029. 
Q#4408 - CGI_10008525 superfamily 241574 1288 1374 1.15E-06 48.5091 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#4410 - CGI_10008527 superfamily 219821 166 240 4.27E-10 59.307 cl07136 VWA_N superfamily N - "VWA N-terminal; This domain is found at the N-terminus of proteins containing von Willebrand factor type A (VWA, pfam00092) and Cache (pfam02743) domains. It has been found in vertebrates, Drosophila and C. elegans but has not yet been identified in other eukaryotes. It is probably involved in the function of some voltage-dependent calcium channel subunits." Q#4410 - CGI_10008527 superfamily 217211 770 851 8.03E-07 48.4346 cl03691 Cache_1 superfamily - - Cache domain; Cache domain. Q#4410 - CGI_10008527 superfamily 217211 496 556 0.000197317 41.501 cl03691 Cache_1 superfamily C - Cache domain; Cache domain. Q#4410 - CGI_10008527 superfamily 241578 268 450 0.000616739 40.627 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#4411 - CGI_10008528 superfamily 241649 13 103 7.50E-10 51.2416 cl00159 fer2 superfamily - - "2Fe-2S iron-sulfur cluster binding domain. Iron-sulfur proteins play an important role in electron transfer processes and in various enzymatic reactions. The family includes plant and algal ferredoxins, which act as electron carriers in photosynthesis and ferredoxins, which participate in redox chains (from bacteria to mammals). Fold is ismilar to thioredoxin." Q#4412 - CGI_10008529 superfamily 241563 65 101 6.67E-05 40.9256 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#4413 - CGI_10008530 superfamily 241644 8 149 4.38E-41 141.956 cl00154 UBCc superfamily - - "Ubiquitin-conjugating enzyme E2, catalytic (UBCc) domain. This is part of the ubiquitin-mediated protein degradation pathway in which a thiol-ester linkage forms between a conserved cysteine and the C-terminus of ubiquitin and complexes with ubiquitin protein ligase enzymes, E3. This pathway regulates many fundamental cellular processes. 
There are also other E2s which form thiol-ester linkages without the use of E3s as well as several UBC homologs (TSG101, Mms2, Croc-1 and similar proteins) which lack the active site cysteine essential for ubiquitination and appear to function in DNA repair pathways which were omitted from the scope of this CD." Q#4414 - CGI_10008531 superfamily 241782 28 388 0 530.622 cl00321 AAT_I superfamily - - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phophorylase family (fold type V)." Q#4419 - CGI_10000487 superfamily 242885 10 172 1.24E-79 236.725 cl02106 IF4E superfamily - - Eukaryotic initiation factor 4E; Eukaryotic initiation factor 4E. 
Q#4422 - CGI_10020102 superfamily 217473 162 327 4.51E-25 107.066 cl03978 Mab-21 superfamily N - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#4422 - CGI_10020102 superfamily 241563 724 759 0.00107707 38.6144 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#4423 - CGI_10020103 superfamily 241691 31 147 0.00208988 35.9508 cl00213 DNA_BRE_C superfamily N - "DNA breaking-rejoining enzymes, C-terminal catalytic domain. The DNA breaking-rejoining enzyme superfamily includes type IB topoisomerases and tyrosine recombinases that share the same fold in their catalytic domain containing six conserved active site residues. The best-studied members of this diverse superfamily include human topoisomerase I, the bacteriophage lambda integrase, the bacteriophage P1 Cre recombinase, the yeast Flp recombinase and the bacterial XerD/C recombinases. Their overall reaction mechanism is essentially identical and involves cleavage of a single strand of a DNA duplex by nucleophilic attack of a conserved tyrosine to give a 3' phosphotyrosyl protein-DNA adduct. In the second rejoining step, a terminal 5' hydroxyl attacks the covalent adduct to release the enzyme and generate duplex DNA. The enzymes differ in that topoisomerases cleave and then rejoin the same 5' and 3' termini, whereas a site-specific recombinase transfers a 5' hydroxyl generated by recombinase cleavage to a new 3' phosphate partner located in a different duplex region. Many DNA breaking-rejoining enzymes also have N-terminal domains, which show little sequence or structure similarity." 
Q#4425 - CGI_10020105 superfamily 247724 8 173 2.74E-118 336.196 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#4426 - CGI_10020106 superfamily 207684 30 59 0.00111721 38.514 cl02640 SAP superfamily - - "SAP domain; The SAP (after SAF-A/B, Acinus and PIAS) motif is a putative DNA/RNA binding domain found in diverse nuclear and cytoplasmic proteins." Q#4428 - CGI_10020108 superfamily 248264 121 164 0.000217872 39.913 cl17710 DDE_4 superfamily N - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. 
This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#4432 - CGI_10020112 superfamily 241610 13 65 5.95E-15 63.0378 cl00101 KU superfamily - - BPTI/Kunitz family of serine protease inhibitors; Structure is a disulfide rich alpha+beta fold. BPTI (bovine pancreatic trypsin inhibitor) is an extensively studied model structure. Q#4433 - CGI_10020113 superfamily 247639 15 274 1.89E-43 150.689 cl16914 O-FucT_like superfamily - - "GDP-fucose protein O-fucosyltransferase and related proteins; O-fucosyltransferase-like proteins are GDP-fucose dependent enzymes with similarities to the family 1 glycosyltransferases (GT1). They are soluble ER proteins that may be proteolytically cleaved from a membrane-associated preprotein, and are involved in the O-fucosylation of protein substrates, the core fucosylation of growth factor receptors, and other processes." Q#4434 - CGI_10020114 superfamily 245205 94 174 2.30E-14 65.7221 cl09930 RPA_2b-aaRSs_OBF_like superfamily - - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. 
The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." Q#4437 - CGI_10020117 superfamily 246975 139 160 0.0093438 33.4745 cl15478 zf-C2H2 superfamily - - "Zinc finger, C2H2 type; The C2H2 zinc finger is the classical zinc finger domain. The two conserved cysteines and histidines co-ordinate a zinc ion. The following pattern describes the zinc finger. #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C] Where X can be any amino acid, and numbers in brackets indicate the number of residues. The positions marked # are those that are important for the stable fold of the zinc finger. The final position can be either his or cys. The C2H2 zinc finger is composed of two short beta strands followed by an alpha helix. The amino terminal part of the helix binds the major groove in DNA binding zinc fingers. The accepted consensus binding sequence for Sp1 is usually defined by the asymmetric hexanucleotide core GGGCGG but this sequence does not include, among others, the GAG (=CTC) repeat that constitutes a high-affinity site for Sp1 binding to the wt1 promoter." Q#4439 - CGI_10020119 superfamily 245201 407 677 6.65E-28 114.191 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. 
These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#4440 - CGI_10020120 superfamily 247068 582 663 1.82E-05 43.8414 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#4440 - CGI_10020120 superfamily 247068 672 759 0.000240905 40.3746 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#4440 - CGI_10020120 superfamily 247068 467 528 0.00951269 35.367 cl15786 CA_like superfamily C - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#4442 - CGI_10020122 superfamily 248054 59 125 5.32E-13 60.9488 cl17500 NAD_binding_8 superfamily - - NAD(P)-binding Rossmann-like domain; NAD(P)-binding Rossmann-like domain. Q#4444 - CGI_10000626 superfamily 245879 26 73 6.19E-09 47.8523 cl12116 DUSP superfamily C - DUSP domain; The DUSP (domain present in ubiquitin-specific protease) domain is found at the N-terminus of Ubiquitin-specific proteases. The structure of this domain has been solved. Its tripod-like structure consists of a 3-fold alpha-helical bundle supporting a triple-stranded anti-parallel beta-sheet. Q#4446 - CGI_10017929 superfamily 243039 47 148 3.83E-05 41.5959 cl02446 MATH superfamily - - "MATH (meprin and TRAF-C homology) domain; an independent folding unit with an eight-stranded beta-sandwich structure found in meprins, TRAFs and other proteins. Meprins comprise a class of extracellular metalloproteases which are anchored to the membrane and are capable of cleaving growth factors, extracellular matrix proteins, and biologically active peptides. TRAF molecules serve as adapter proteins that link cell surface receptors of the Tumor Necrosis Factor and 1nterleukin-1/Toll-like families to downstream kinase cascades, which results in the activation of transcription factors and the regulation of cell survival, proliferation and stress responses in the immune and inflammatory systems. 
Other members include the ubiquitin ligases, TRIM37 and SPOP, and the ubiquitin-specific proteases, HAUSP and Ubp21p. A large number of uncharacterized members mostly from lineage-specific expansions in C. elegans and rice contain MATH and BTB domains, similar to SPOP. The MATH domain has been shown to bind peptide/protein substrates in TRAFs and HAUSP. It is possible that the MATH domain in other members of this superfamily also interacts with various protein substrates. The TRAF domain may also be involved in the trimerization of TRAFs. Based on homology, it is postulated that the MATH domain in meprins may be involved in its tetramer assembly and that the MATH domain, in general, may take part in diverse modular arrangements defined by adjacent multimerization domains." Q#4450 - CGI_10017933 superfamily 241636 83 261 2.85E-65 206.285 cl00145 TBOX superfamily - - "T-box DNA binding domain of the T-box family of transcriptional regulators. The T-box family is an ancient group that appears to play a critical role in development in all animal species. These genes were uncovered on the basis of similarity to the DNA binding domain of murine Brachyury (T) gene product, the defining feature of the family. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development and conserved expression patterns, most of the known genes in all species being expressed in mesoderm or mesoderm precursors." Q#4451 - CGI_10017934 superfamily 241636 95 273 2.68E-62 202.047 cl00145 TBOX superfamily - - "T-box DNA binding domain of the T-box family of transcriptional regulators. The T-box family is an ancient group that appears to play a critical role in development in all animal species. These genes were uncovered on the basis of similarity to the DNA binding domain of murine Brachyury (T) gene product, the defining feature of the family. 
Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development and conserved expression patterns, most of the known genes in all species being expressed in mesoderm or mesoderm precursors." Q#4452 - CGI_10017935 superfamily 241636 1 150 9.15E-51 166.609 cl00145 TBOX superfamily - - "T-box DNA binding domain of the T-box family of transcriptional regulators. The T-box family is an ancient group that appears to play a critical role in development in all animal species. These genes were uncovered on the basis of similarity to the DNA binding domain of murine Brachyury (T) gene product, the defining feature of the family. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development and conserved expression patterns, most of the known genes in all species being expressed in mesoderm or mesoderm precursors." Q#4454 - CGI_10017937 superfamily 247741 26 114 1.52E-17 75.809 cl17187 Aldolase_Class_I superfamily NC - "Class I aldolases; Class I aldolases. The class I aldolases use an active-site lysine which stabilizes a reaction intermediates via Schiff base formation, and have TIM beta/alpha barrel fold. The members of this family include 2-keto-3-deoxy-6-phosphogluconate (KDPG) and 2-keto-4-hydroxyglutarate (KHG) aldolases, transaldolase, dihydrodipicolinate synthase sub-family, Type I 3-dehydroquinate dehydratase, DeoC and DhnA proteins, and metal-independent fructose-1,6-bisphosphate aldolase. Although structurally similar, the class II aldolases use a different mechanism and are believed to have an independent evolutionary origin." Q#4456 - CGI_10017939 superfamily 247741 8 62 0.000360827 39.5992 cl17187 Aldolase_Class_I superfamily N - "Class I aldolases; Class I aldolases. The class I aldolases use an active-site lysine which stabilizes a reaction intermediates via Schiff base formation, and have TIM beta/alpha barrel fold. 
The members of this family include 2-keto-3-deoxy-6-phosphogluconate (KDPG) and 2-keto-4-hydroxyglutarate (KHG) aldolases, transaldolase, dihydrodipicolinate synthase sub-family, Type I 3-dehydroquinate dehydratase, DeoC and DhnA proteins, and metal-independent fructose-1,6-bisphosphate aldolase. Although structurally similar, the class II aldolases use a different mechanism and are believed to have an independent evolutionary origin." Q#4458 - CGI_10017941 superfamily 222150 290 315 5.36E-05 40.4529 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#4458 - CGI_10017941 superfamily 222150 234 259 0.00307011 35.0601 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#4458 - CGI_10017941 superfamily 222150 318 341 0.00807449 33.9045 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#4459 - CGI_10017942 superfamily 241578 180 340 5.30E-46 161.305 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." 
Q#4459 - CGI_10017942 superfamily 243119 679 729 3.73E-06 45.1273 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#4460 - CGI_10017943 superfamily 241776 162 337 3.10E-60 194.338 cl00315 RPS2 superfamily - - "Ribosomal protein S2 (RPS2), involved in formation of the translation initiation complex, where it might contact the messenger RNA and several components of the ribosome. It has been shown that in Escherichia coli RPS2 is essential for the binding of ribosomal protein S1 to the 30s ribosomal subunit. In humans, most likely in all vertebrates, and perhaps in all metazoans, the protein also functions as the 67 kDa laminin receptor (LAMR1 or 67LR), which is formed from a 37 kDa precursor, and is overexpressed in many tumors. 67LR is a cell surface receptor which interacts with a variety of ligands, laminin-1 and others. It is assumed that the ligand interactions are mediated via the conserved C-terminus, which becomes extracellular as the protein undergoes conformational changes which are not well understood. Specifically, a conserved palindromic motif, LMWWML, may participate in the interactions. 67LR plays essential roles in the adhesion of cells to the basement membrane and subsequent signalling events, and has been linked to several diseases. Some evidence also suggests that the precursor of 67LR, 37LRP is also present in the nucleus in animals, where it appears associated with histones." 
Q#4461 - CGI_10017944 superfamily 241578 181 340 1.24E-28 109.689 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#4463 - CGI_10017946 superfamily 217293 39 223 4.16E-35 127.364 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. 
Q#4464 - CGI_10017947 superfamily 247684 12 443 3.32E-103 320.764 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#4465 - CGI_10017948 superfamily 247724 202 296 2.55E-22 93.2163 cl17170 Ras_like_GTPase superfamily N - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. 
This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#4465 - CGI_10017948 superfamily 247724 105 190 2.88E-07 49.6887 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. 
The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#4467 - CGI_10017950 superfamily 220695 48 183 0.00112311 39.0991 cl18571 7TM_GPCR_Srx superfamily C - Serpentine type 7TM GPCR chemoreceptor Srx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srx is part of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#4471 - CGI_10017954 superfamily 217853 21 174 2.47E-37 135.094 cl04371 Las1 superfamily - - Las1-like; Las1 is an essential nuclear protein involved in cell morphogenesis and cell surface growth. Q#4475 - CGI_10017958 superfamily 242173 6 149 3.59E-63 192.861 cl00891 Cu-Zn_Superoxide_Dismutase superfamily - - "Copper/zinc superoxide dismutase (SOD). Superoxide dismutases catalyse the conversion of superoxide radicals to molecular oxygen. Three evolutionarily distinct families of SODs are known, of which the copper/zinc-binding family is one. Defects in the human SOD1 gene cause familial amyotrophic lateral sclerosis (Lou Gehrig's disease). Cytoplasmic and periplasmic SODs exist as dimers, whereas chloroplastic and extracellular enzymes exist as tetramers. Structure supports independent functional evolution in prokaryotes (P-class) and eukaryotes (E-class) [PMID:8176730]." Q#4477 - CGI_10017960 superfamily 247789 18 81 6.10E-07 44.557 cl17235 ABC2_membrane superfamily NC - ABC-2 type transporter; ABC-2 type transporter. 
Q#4478 - CGI_10017961 superfamily 243310 1 133 6.20E-43 143.533 cl03120 ELO superfamily N - "GNS1/SUR4 family; Members of this family are involved in long chain fatty acid elongation systems that produce the 26-carbon precursors for ceramide and sphingolipid synthesis. Predicted to be integral membrane proteins, in eukaryotes they are probably located on the endoplasmic reticulum. Yeast ELO3 affects plasma membrane H+-ATPase activity, and may act on a glucose-signaling pathway that controls the expression of several genes that are transcriptionally regulated by glucose such as PMA1." Q#4479 - CGI_10001015 superfamily 247677 576 707 7.24E-39 140.883 cl17013 W2 superfamily - - "C-terminal domain of eIF4-gamma/eIF5/eIF2b-epsilon; This domain is found at the C-terminus of several translation initiation factors, including the epsilon chain of eIF2b, where it has been found to catalyze the conversion of eIF2.GDP to its active eIF2.GTP form. The structure of the domain resembles that of a set of concatenated HEAT repeats." Q#4479 - CGI_10001015 superfamily 243128 89 265 1.30E-22 96.6298 cl02652 MIF4G superfamily C - "MIF4G domain; MIF4G is named after Middle domain of eukaryotic initiation factor 4G (eIF4G). Also occurs in NMD2p and CBP80. The domain is rich in alpha-helices and may contain multiple alpha-helical repeats. In eIF4G, this domain binds eIF4A, eIF3, RNA and DNA." Q#4479 - CGI_10001015 superfamily 243129 390 491 1.13E-16 77.2937 cl02653 MA3 superfamily - - "MA3 domain; Domain in DAP-5, eIF4G, MA-3 and other proteins. Highly alpha-helical. May contain repeats and/or regions similar to MIF4G domains." 
Q#4482 - CGI_10013509 superfamily 247684 58 463 1.65E-83 269.917 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#4485 - CGI_10013512 superfamily 215866 7 148 1.75E-32 120.893 cl18349 Arrestin_N superfamily - - "Arrestin (or S-antigen), N-terminal domain; Ig-like beta-sandwich fold. Scop reports duplication with C-terminal domain." Q#4485 - CGI_10013512 superfamily 243212 173 308 1.06E-16 76.6137 cl02844 Arrestin_C superfamily - - "Arrestin (or S-antigen), C-terminal domain; Ig-like beta-sandwich fold. Scop reports duplication with N-terminal domain." 
Q#4488 - CGI_10013515 superfamily 248318 997 1050 4.65E-22 92.4989 cl17764 FYVE superfamily - - "FYVE domain; Zinc-binding domain; targets proteins to membrane lipids via interaction with phosphatidylinositol-3-phosphate, PI3P; present in Fab1, YOTB, Vac1, and EEA1;" Q#4488 - CGI_10013515 superfamily 221354 1306 1657 0 559.298 cl13422 DUF3480 superfamily - - Domain of unknown function (DUF3480); This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 350 to 362 amino acids in length. This domain is found associated with pfam01363. Q#4489 - CGI_10013516 superfamily 245226 11 190 2.86E-65 201.295 cl10012 DnaQ_like_exo superfamily - - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. 
Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#4490 - CGI_10013517 superfamily 243263 78 329 9.16E-32 123.672 cl02990 ASC superfamily N - Amiloride-sensitive sodium channel; Amiloride-sensitive sodium channel. Q#4491 - CGI_10013518 superfamily 247725 176 281 1.06E-52 177.906 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting proteins to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and other proteins." Q#4491 - CGI_10013518 superfamily 247683 658 711 6.10E-27 104.683 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. 
Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#4491 - CGI_10013518 superfamily 243095 274 471 3.02E-105 322.035 cl02570 RhoGAP superfamily - - "RhoGAP: GTPase-activator protein (GAP) for Rho-like GTPases; GAPs towards Rho/Rac/Cdc42-like small GTPases. Small GTPases (G proteins) cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when bound to GDP. The Rho family of small G proteins, which includes Cdc42Hs, activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. G proteins generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. The RhoGAPs are one of the major classes of regulators of Rho G proteins." Q#4491 - CGI_10013518 superfamily 245835 3 128 9.53E-56 190.219 cl12013 BAR superfamily N - "The Bin/Amphiphysin/Rvs (BAR) domain, a dimerization module that binds membranes and detects membrane curvature; BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions including organelle biogenesis, membrane trafficking or remodeling, and cell division and migration. Mutations in BAR containing proteins have been linked to diseases and their inactivation in cells leads to altered membrane dynamics. A BAR domain with an additional N-terminal amphipathic helix (an N-BAR) can drive membrane curvature. These N-BAR domains are found in amphiphysins and endophilins, among others. 
BAR domains are also frequently found alongside domains that determine lipid specificity, such as the Pleckstrin Homology (PH) and Phox Homology (PX) domains which are present in beta centaurins (ACAPs and ASAPs) and sorting nexins, respectively. A FES-CIP4 Homology (FCH) domain together with a coiled coil region is called the F-BAR domain and is present in Pombe/Cdc15 homology (PCH) family proteins, which include Fes/Fer tyrosine kinases, PACSIN or syndapin, CIP4-like proteins, and srGAPs, among others. The Inverse (I)-BAR or IRSp53/MIM homology Domain (IMD) is found in multi-domain proteins, such as IRSp53 and MIM, that act as scaffolding proteins and transducers of a variety of signaling pathways that link membrane dynamics and the underlying actin cytoskeleton. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The I-BAR domain induces membrane protrusions in the opposite direction compared to classical BAR and F-BAR domains, which produce membrane invaginations. BAR domains that also serve as protein interaction domains include those of arfaptin and OPHN1-like proteins, among others, which bind to Rac and Rho GAP domains, respectively." Q#4492 - CGI_10013519 superfamily 191362 77 129 2.55E-23 87.7114 cl05351 zf-nanos superfamily - - "Nanos RNA binding domain; This family consists of several conserved novel zinc finger domains found in the eukaryotic proteins Nanos and Xcat-2. In Drosophila melanogaster, Nanos functions as a localised determinant of posterior pattern. Nanos RNA is localised to the posterior pole of the maturing egg cell and encodes a protein that emanates from this localised source. Nanos acts as a translational repressor and thereby establishes a gradient of the morphogen Hunchback. Xcat-2 is found in the vegetal cortical region and is inherited by the vegetal blastomeres during development, and is degraded very early in development. 
The localised and maternally restricted expression of Xcat-2 RNA suggests a role for its protein in setting up regional differences in gene expression that occur early in development." Q#4493 - CGI_10013520 superfamily 243141 79 209 1.41E-22 92.7646 cl02687 RWD superfamily - - "RWD domain; This domain was identified in WD40 repeat proteins and Ring finger domain proteins. The function of this domain is unknown. GCN2 is the alpha-subunit of the only translation initiation factor (eIF2 alpha) kinase that appears in all eukaryotes. Its function requires an interaction with GCN1 via the domain at its N-terminus, which is termed the RWD domain after three major RWD-containing proteins: RING finger-containing proteins, WD-repeat-containing proteins, and yeast DEAD (DEXD)-like helicases. The structure forms an alpha + beta sandwich fold consisting of two layers: a four-stranded antiparallel beta-sheet, and three side-by-side alpha-helices." Q#4494 - CGI_10013521 superfamily 218505 29 255 5.12E-79 240.611 cl04994 UNC-50 superfamily - - UNC-50 family; Gmh1p from S. cerevisiae is located in the Golgi membrane and interacts with ARF exchange factors. Q#4496 - CGI_10013523 superfamily 198867 203 273 5.07E-06 45.0249 cl06652 BACK superfamily C - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation)." Q#4496 - CGI_10013523 superfamily 243066 119 190 5.07E-06 44.9892 cl02518 BTB superfamily N - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. 
The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#4497 - CGI_10013524 superfamily 241554 65 199 4.81E-31 118.517 cl00019 Macro superfamily - - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#4497 - CGI_10013524 superfamily 241752 526 654 2.00E-15 73.5077 cl00283 ADP_ribosyl superfamily - - "ADP_ribosylating enzymes catalyze the transfer of ADP_ribose from NAD+ to substrates. Bacterial toxins are cytoplasmic and catalyze the transfer of a single ADP_ribose unit to eukaryotic elongation factor 2, halting protein synthesis and killing the cell. Poly(ADP-ribose) polymerases (PARPS 1-3, VPARP, tankyrase) catalyze the addition of up to 100 ADP_ribose units from NAD+. PARPs 1 and 2 are localized in the nucleus, bind DNA, and are activated by DNA damage. VPARP is part of the vault ribonucleoprotein complex. 
Tankyrases regulate telomere length in part through poly(ADP_ribosylation) of telomere repeat binding factor 1 (TRF1). Poly(ADP-ribose) polymerase catalyses the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active." Q#4497 - CGI_10013524 superfamily 241554 204 309 1.65E-09 56.1147 cl00019 Macro superfamily - - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." 
Q#4498 - CGI_10013525 superfamily 247684 249 488 6.09E-37 140.875 cl17037 NBD_sugar-kinase_HSP70_actin superfamily N - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#4498 - CGI_10013525 superfamily 246680 2 54 0.00163813 37.1812 cl14633 DD_superfamily superfamily N - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. 
Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#4498 - CGI_10013525 superfamily 247684 213 238 0.00483889 38.0272 cl17037 NBD_sugar-kinase_HSP70_actin superfamily C - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#4499 - CGI_10013526 superfamily 246680 120 186 0.00153846 36.2497 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#4500 - CGI_10013527 superfamily 247684 93 503 9.98E-86 276.466 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. 
The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#4501 - CGI_10013528 superfamily 247684 150 560 2.72E-85 276.851 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#4502 - CGI_10013529 superfamily 246680 13 89 0.000523317 34.4848 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#4503 - CGI_10013530 superfamily 247684 127 544 1.56E-71 239.487 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. 
The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#4505 - CGI_10013532 superfamily 247684 49 471 4.21E-85 274.155 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#4507 - CGI_10001470 superfamily 247948 55 108 9.39E-16 68.4746 cl17394 RINGv superfamily - - RING-variant domain; RING-variant domain. Q#4508 - CGI_10001471 superfamily 247755 18 60 3.64E-17 73.7272 cl17201 ABC_ATPase superfamily C - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#4509 - CGI_10001642 superfamily 243035 140 165 0.000791954 36.4238 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#4510 - CGI_10011281 superfamily 246918 194 246 3.33E-11 59.1375 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. 
Q#4510 - CGI_10011281 superfamily 246918 23 75 8.20E-11 57.9819 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#4510 - CGI_10011281 superfamily 246918 422 474 2.11E-10 56.8263 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#4510 - CGI_10011281 superfamily 246918 365 416 2.50E-10 56.8263 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#4510 - CGI_10011281 superfamily 246918 479 531 2.58E-10 56.8263 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#4510 - CGI_10011281 superfamily 246918 308 360 1.74E-09 54.1299 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#4510 - CGI_10011281 superfamily 246918 251 303 8.56E-09 52.2039 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#4510 - CGI_10011281 superfamily 246918 137 189 6.81E-08 49.5075 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#4510 - CGI_10011281 superfamily 246918 80 132 8.83E-08 49.5075 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#4511 - CGI_10011282 superfamily 191608 692 837 1.86E-75 246.39 cl06029 DUF1227 superfamily - - Protein of unknown function (DUF1227); This family represents a conserved region within a number of eukaryotic DNA repair helicases (EC:3.6.1.-). 
Q#4511 - CGI_10011282 superfamily 248014 966 1110 1.41E-58 199.042 cl17460 Csf4_U superfamily - - CRISPR/Cas system-associated DinG family helicase Csf4; CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; DinG family DNA helicase Q#4511 - CGI_10011282 superfamily 219153 452 616 4.14E-43 155.976 cl15854 DEAD_2 superfamily - - "DEAD_2; This represents a conserved region within a number of RAD3-like DNA-binding helicases that are seemingly ubiquitous - members include proteins of eukaryotic, bacterial and archaeal origin. RAD3 is involved in nucleotide excision repair, and forms part of the transcription factor TFIIH in yeast." Q#4511 - CGI_10011282 superfamily 219199 20 63 6.46E-07 48.1452 cl06070 zf-GRF superfamily - - GRF zinc finger; This presumed zinc binding domain is found in a variety of DNA-binding proteins. It seems likely that this domain is involved in nucleic acid binding. It is named GRF after three conserved residues in the centre of the alignment of the domain. This zinc finger may be related to pfam01396. Q#4512 - CGI_10011283 superfamily 241599 397 453 5.91E-09 52.6309 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#4514 - CGI_10011285 superfamily 247725 8 101 2.40E-38 134.189 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. 
They are generally involved in targeting proteins to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#4517 - CGI_10011288 superfamily 215754 31 107 6.25E-10 54.952 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#4517 - CGI_10011288 superfamily 215754 224 312 3.90E-05 41.0848 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#4517 - CGI_10011288 superfamily 215754 112 222 7.80E-05 39.9292 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#4518 - CGI_10011289 superfamily 247724 2 174 2.10E-104 309.129 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. 
Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#4518 - CGI_10011289 superfamily 243184 273 363 7.28E-30 110.896 cl02786 Translation_factor_III superfamily - - "Domain III of Elongation factor (EF) Tu (EF-TU) and EF-G. Elongation factors (EF) EF-Tu and EF-G participate in the elongation phase during protein biosynthesis on the ribosome. Their functional cycles depend on GTP binding and its hydrolysis. The EF-Tu complexed with GTP and aminoacyl-tRNA delivers tRNA to the ribosome, whereas EF-G stimulates translocation, a process in which tRNA and mRNA movements occur in the ribosome. Experimental data showed that: (1) intrinsic GTPase activity of EF-G is influenced by excision of its domain III; (2) that EF-G lacking domain III has a 1,000-fold decreased GTPase activity on the ribosome and, a slightly decreased affinity for GTP; and (3) EF-G lacking domain III does not stimulate translocation, despite the physical presence of domain IV which is also very important for translocation. These findings indicate an essential contribution of domain III to activation of GTP hydrolysis. Domains III and V of EF-G have the same fold (although they are not completely superimposable), the double split beta-alpha-beta fold. This fold is observed in a large number of ribonucleotide binding proteins and is also referred to as the ribonucleoprotein (RNP) or RNA recognition (RRM) motif. This domain III is found in several elongation factors, as well as in peptide chain release factors and in GT-1 family of GTPase (GTPBP1)." Q#4518 - CGI_10011289 superfamily 243185 182 268 4.48E-27 102.971 cl02787 Translation_Factor_II_like superfamily - - "Translation_Factor_II_like: Elongation factor Tu (EF-Tu) domain II-like proteins. Elongation factor Tu consists of three structural domains, this family represents the second domain. 
Domain II adopts a beta barrel structure and is involved in binding to charged tRNA. Domain II is found in other proteins such as elongation factor G and translation initiation factor IF-2. This group also includes the C2 subdomain of domain IV of IF-2 that has the same fold as domain II of (EF-Tu). Like IF-2 from certain prokaryotes such as Thermus thermophilus, mitochondrial IF-2 lacks domain II, which is thought to be involved in binding of E.coli IF-2 to 30S subunits." Q#4519 - CGI_10011290 superfamily 226801 647 708 0.000464655 39.4107 cl15486 COG4357 superfamily C - Zinc finger domain containing protein (CHY type) [Function unknown] Q#4519 - CGI_10011290 superfamily 245716 142 165 0.000535654 38.3793 cl11592 zf-CCCH superfamily - - Zinc finger C-x8-C-x5-C-x3-H type (and similar); Zinc finger C-x8-C-x5-C-x3-H type (and similar). Q#4519 - CGI_10011290 superfamily 245716 83 103 0.00727967 35.2267 cl11592 zf-CCCH superfamily - - Zinc finger C-x8-C-x5-C-x3-H type (and similar); Zinc finger C-x8-C-x5-C-x3-H type (and similar). Q#4522 - CGI_10011293 superfamily 247856 154 216 1.21E-19 82.9809 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#4522 - CGI_10011293 superfamily 247856 311 373 4.17E-19 81.8253 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#4522 - CGI_10011293 superfamily 247856 384 441 1.50E-18 80.2845 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#4522 - CGI_10011293 superfamily 247856 227 282 4.99E-18 78.7437 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#4522 - CGI_10011293 superfamily 247856 64 119 3.33E-15 70.6545 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#4522 - CGI_10011293 superfamily 247856 2 53 2.13E-13 65.6469 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#4523 - CGI_10011294 superfamily 247856 25 87 4.68E-18 75.2769 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#4523 - CGI_10011294 superfamily 247856 98 159 5.11E-09 50.2389 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#4524 - CGI_10011295 superfamily 247856 19 81 6.88E-20 78.3585 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#4524 - CGI_10011295 superfamily 247856 55 109 7.53E-07 42.9201 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#4524 - CGI_10011295 superfamily 247856 92 133 0.00690107 32.1345 cl17302 EFh superfamily C - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#4525 - CGI_10011296 superfamily 247856 140 210 4.30E-12 58.3281 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#4526 - CGI_10011297 superfamily 247856 15 77 1.26E-18 79.8993 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#4526 - CGI_10011297 superfamily 247856 319 381 8.05E-18 77.9733 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#4526 - CGI_10011297 superfamily 247856 393 447 7.84E-16 72.1953 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#4526 - CGI_10011297 superfamily 247856 88 150 6.01E-15 69.8841 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#4526 - CGI_10011297 superfamily 247856 124 206 5.16E-10 55.6317 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#4526 - CGI_10011297 superfamily 247856 218 273 5.90E-07 46.7721 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#4527 - CGI_10011298 superfamily 247856 85 139 3.93E-12 57.9429 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#4527 - CGI_10011298 superfamily 247856 11 73 8.33E-12 57.1725 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#4530 - CGI_10011301 superfamily 247856 89 151 8.73E-22 83.7513 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#4530 - CGI_10011301 superfamily 247856 16 78 4.36E-18 74.1213 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#4531 - CGI_10011302 superfamily 247856 29 76 1.96E-09 48.6981 cl17302 EFh superfamily C - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#4533 - CGI_10001830 superfamily 243085 17 62 8.05E-26 98.1331 cl02557 DM superfamily - - "DM DNA binding domain; The DM domain is named after dsx and mab-3. dsx contains a single amino-terminal DM domain, whereas mab-3 contains two amino-terminal domains. The DM domain has a pattern of conserved zinc chelating residues C2H2C4. The dsx DM domain has been shown to dimerise and bind palindromic DNA." Q#4533 - CGI_10001830 superfamily 112299 185 221 2.13E-09 52.7966 cl04098 DMA superfamily - - DMRTA motif; This region is found to the C-terminus of the pfam00751. DM-domain proteins with this motif are known as DMRTA proteins. The function of this region is unknown. Q#4534 - CGI_10002110 superfamily 242153 133 306 1.16E-06 46.5232 cl00866 NTPase_I-T superfamily - - Protein of unknown function DUF84; The function of this prokaryotic protein family is unknown. Q#4536 - CGI_10001795 superfamily 241547 146 273 2.07E-34 128.94 cl00012 alpha_CA superfamily C - "Carbonic anhydrase alpha (vertebrate-like) group. Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. 
There are three evolutionary distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidine residues and a fourth conserved histidine plays a potential role in proton transfer." Q#4536 - CGI_10001795 superfamily 241547 363 465 2.79E-17 80.0195 cl00012 alpha_CA superfamily N - "Carbonic anhydrase alpha (vertebrate-like) group. Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionary distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidine residues and a fourth conserved histidine plays a potential role in proton transfer." Q#4537 - CGI_10003382 superfamily 245213 103 135 1.98E-05 43.009 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#4537 - CGI_10003382 superfamily 214531 551 590 1.33E-11 60.6932 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#4537 - CGI_10003382 superfamily 214531 591 635 1.58E-11 60.6932 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#4537 - CGI_10003382 superfamily 214531 636 678 6.66E-06 44.1297 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#4537 - CGI_10003382 superfamily 205157 236 266 2.73E-05 42.5247 cl18264 EGF_3 superfamily - - EGF domain; This family includes a variety of EGF-like domain homologues. This family includes the C-terminal domain of the malaria parasite MSP1 protein. Q#4537 - CGI_10003382 superfamily 205157 441 476 3.12E-05 42.1395 cl18264 EGF_3 superfamily - - EGF domain; This family includes a variety of EGF-like domain homologues. This family includes the C-terminal domain of the malaria parasite MSP1 protein. Q#4537 - CGI_10003382 superfamily 241668 2 59 5.84E-05 43.835 cl00186 nidG2 superfamily N - "Nidogen, G2 domain; Nidogen is an important component of the basement membrane, an extracellular sheet-like matrix. Nidogen is a multifunctional protein that interacts with many other basement membrane proteins, like collagen, perlecan, lamin, and has a potential role in the assembly and connection of networks. 
Nidogen consists of 3 globular domains (G1-G3), G3 is the lamin-binding domain, while G2 binds collagen IV and perlecan. Also found in hemicentin, a protein which functions at various cell-cell and cell-matrix junctions and might assist in refining broad regions of cell contact into oriented, line-shaped junctions. Nidogen G2 consists of an N-terminal EGF-like domain (excluded from this alignment model) and an 11-stranded beta-barrel with a central helix, a topology that exhibits high structural similarity to the green fluorescent proteins of Cnidaria." Q#4537 - CGI_10003382 superfamily 205157 66 101 0.000141699 40.2135 cl18264 EGF_3 superfamily - - EGF domain; This family includes a variety of EGF-like domain homologues. This family includes the C-terminal domain of the malaria parasite MSP1 protein. Q#4537 - CGI_10003382 superfamily 205157 403 434 0.00390957 35.9763 cl18264 EGF_3 superfamily - - EGF domain; This family includes a variety of EGF-like domain homologues. This family includes the C-terminal domain of the malaria parasite MSP1 protein. Q#4537 - CGI_10003382 superfamily 205157 354 391 0.00534743 35.5911 cl18264 EGF_3 superfamily - - EGF domain; This family includes a variety of EGF-like domain homologues. This family includes the C-terminal domain of the malaria parasite MSP1 protein. Q#4537 - CGI_10003382 superfamily 205157 152 182 0.00780586 35.2059 cl18264 EGF_3 superfamily - - EGF domain; This family includes a variety of EGF-like domain homologues. This family includes the C-terminal domain of the malaria parasite MSP1 protein. Q#4539 - CGI_10003384 superfamily 241570 573 684 3.24E-20 88.9221 cl00047 CAP_ED superfamily - - "effector domain of the CAP family of transcription factors; members include CAP (or cAMP receptor protein (CRP)), which binds cAMP, FNR (fumarate and nitrate reduction), which uses an iron-sulfur cluster to sense oxygen) and CooA, a heme containing CO sensor. 
In all cases binding of the effector leads to conformational changes and the ability to activate transcription. Cyclic nucleotide-binding domain similar to CAP are also present in cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) and vertebrate cyclic nucleotide-gated ion-channels. Cyclic nucleotide-monophosphate binding domain; proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues; the best studied is the prokaryotic catabolite gene activator, CAP, where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure; three conserved glycine residues are thought to be essential for maintenance of the structural integrity of the beta-barrel; CooA is a homodimeric transcription factor that belongs to CAP family; cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain; cAPK's are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain; cGPK's are single chain enzymes that include the two copies of the domain in their N-terminal section; also found in vertebrate cyclic nucleotide-gated ion-channels" Q#4539 - CGI_10003384 superfamily 243045 65 156 1.04E-10 60.7247 cl02459 PAS superfamily - - "PAS domain; PAS motifs appear in archaea, eubacteria and eukarya. Probably the most surprising identification of a PAS domain was that in EAG-like K+-channels. PAS domains have been found to bind ligands, and to act as sensors for light and oxygen in signal transduction." Q#4539 - CGI_10003384 superfamily 219619 437 491 1.59E-08 53.7507 cl18518 Ion_trans_2 superfamily N - Ion channel; This family includes the two membrane helix type ion channels found in bacteria. 
Q#4540 - CGI_10003385 superfamily 241574 479 675 6.63E-73 241.336 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#4540 - CGI_10003385 superfamily 241574 699 931 2.68E-36 138.102 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#4542 - CGI_10001399 superfamily 204716 74 244 0.00227156 36.8347 cl18257 Git3 superfamily - - "G protein-coupled glucose receptor regulating Gpa2; Git3 is one of six proteins required for glucose-triggered adenylate cyclase activation, and is a G protein-coupled receptor responsible for the activation of adenylate cyclase through Gpa2 - heterotrimeric G protein alpha subunit, part of the glucose-detection pathway. Git3 contains seven predicted transmembrane domains, a third cytoplasmic loop and a cytoplasmic tail. 
This is the conserved N-terminus of these proteins, and the C-terminal conserved region is now in family Git3_C." Q#4543 - CGI_10002689 superfamily 247044 39 151 7.54E-59 187.045 cl15697 ADF_gelsolin superfamily - - Actin depolymerization factor/cofilin- and gelsolin-like domains; Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. Q#4543 - CGI_10002689 superfamily 247044 165 243 6.00E-22 88.0632 cl15697 ADF_gelsolin superfamily - - Actin depolymerization factor/cofilin- and gelsolin-like domains; Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. Q#4543 - CGI_10002689 superfamily 247044 269 323 6.14E-06 43.392 cl15697 ADF_gelsolin superfamily - - Actin depolymerization factor/cofilin- and gelsolin-like domains; Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. Q#4544 - CGI_10016336 superfamily 247068 458 554 9.72E-24 97.7693 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. 
Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#4544 - CGI_10016336 superfamily 247068 564 660 6.04E-20 86.5985 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#4544 - CGI_10016336 superfamily 247068 242 341 9.51E-19 83.1317 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#4544 - CGI_10016336 superfamily 247068 673 763 2.04E-16 76.5833 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#4544 - CGI_10016336 superfamily 247068 356 450 2.40E-16 76.1981 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#4544 - CGI_10016336 superfamily 247068 134 234 4.20E-16 75.4277 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#4544 - CGI_10016336 superfamily 247068 22 124 6.19E-08 51.5454 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#4553 - CGI_10016345 superfamily 245814 102 157 7.54E-05 40.468 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#4556 - CGI_10016348 superfamily 248097 59 185 3.67E-25 95.4098 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#4557 - CGI_10016349 superfamily 248097 281 390 4.88E-26 101.958 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#4557 - CGI_10016349 superfamily 248097 397 463 8.17E-12 62.2826 cl17543 C1q superfamily C - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#4558 - CGI_10016350 superfamily 248097 289 399 8.37E-26 100.417 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#4558 - CGI_10016350 superfamily 242079 192 240 0.00172876 38.5365 cl00770 PSP1 superfamily NC - PSP1 C-terminal conserved region; This region is present in both eukaryotes and eubacteria. The yeast PSP1 protein is involved in suppressing mutations in the DNA polymerase alpha subunit in yeast. Q#4559 - CGI_10016351 superfamily 241564 81 130 5.10E-14 63.8423 cl00035 BIR superfamily N - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." Q#4560 - CGI_10016352 superfamily 248097 636 739 1.23E-21 91.943 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. 
Q#4561 - CGI_10016353 superfamily 248097 156 284 4.63E-23 91.943 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#4562 - CGI_10016354 superfamily 241564 38 100 2.22E-14 65.7682 cl00035 BIR superfamily - - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." Q#4562 - CGI_10016354 superfamily 241564 141 205 4.12E-11 56.9087 cl00035 BIR superfamily - - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." 
Q#4562 - CGI_10016354 superfamily 247792 220 262 3.29E-09 51.6116 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#4563 - CGI_10016355 superfamily 241564 285 350 1.30E-19 81.9283 cl00035 BIR superfamily - - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." Q#4563 - CGI_10016355 superfamily 241564 6 73 5.47E-15 69.2167 cl00035 BIR superfamily - - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." 
Q#4564 - CGI_10016356 superfamily 241564 1 52 2.01E-16 69.2167 cl00035 BIR superfamily N - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." Q#4568 - CGI_10016361 superfamily 248097 208 334 8.72E-29 108.507 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#4569 - CGI_10016362 superfamily 248097 244 369 2.64E-22 90.4022 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#4570 - CGI_10016363 superfamily 241564 41 107 1.47E-23 91.5583 cl00035 BIR superfamily - - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." 
Q#4570 - CGI_10016363 superfamily 247792 250 289 1.38E-05 41.6624 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#4571 - CGI_10016364 superfamily 241564 99 167 9.79E-22 86.1655 cl00035 BIR superfamily - - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." Q#4571 - CGI_10016364 superfamily 241564 15 80 2.50E-21 85.0099 cl00035 BIR superfamily - - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." Q#4573 - CGI_10016366 superfamily 241693 227 375 8.63E-103 303.621 cl00215 Aconitase_swivel superfamily - - "Aconitase swivel domain. 
Aconitase (aconitate hydratase) catalyzes the reversible isomerization of citrate and isocitrate as part of the TCA cycle. This is the aconitase swivel domain, which undergoes swivelling conformational change in the enzyme mechanism. The aconitase family contains the following proteins: - Iron-responsive element binding protein (IRE-BP). IRE-BP is a cytosolic protein that binds to iron-responsive elements (IREs). IREs are stem-loop structures found in the 5'UTR of ferritin, and delta aminolevulinic acid synthase mRNAs, and in the 3'UTR of transferrin receptor mRNA. IRE-BP also express aconitase activity. - 3-isopropylmalate dehydratase (isopropylmalate isomerase), the enzyme that catalyzes the second step in the biosynthesis of leucine. - Homoaconitase (homoaconitate hydratase), an enzyme that participates in the alpha-aminoadipate pathway of lysine biosynthesis and that converts cis-homoaconitate into homoisocitric acid." Q#4573 - CGI_10016366 superfamily 241753 2 145 5.02E-94 290.881 cl00285 Aconitase superfamily N - "Aconitase catalytic domain; Aconitase catalyzes the reversible isomerization of citrate and isocitrate as part of the TCA cycle; Aconitase catalytic domain. Aconitase (aconitate hydratase) catalyzes the reversible isomerization of citrate and isocitrate as part of the TCA cycle. Cis-aconitate is formed as an intermediate product during the course of the reaction. In eukaryotes two isozymes of aconitase are known to exist: one found in the mitochondrial matrix and the other found in the cytoplasm. Aconitase, in its active form, contains a 4Fe-4S iron-sulfur cluster; three cysteine residues have been shown to be ligands of the 4Fe-4S cluster. This is the Aconitase core domain, including structural domains 1, 2 and 3, which binds the Fe-S cluster. The aconitase family also contains the following proteins: - Iron-responsive element binding protein (IRE-BP), a cytosolic protein that binds to iron-responsive elements (IREs). 
IREs are stem-loop structures found in the 5'UTR of ferritin, and delta aminolevulinic acid synthase mRNAs, and in the 3'UTR of transferrin receptor mRNA. IRE-BP also express aconitase activity. - 3-isopropylmalate dehydratase (isopropylmalate isomerase), the enzyme that catalyzes the second step in the biosynthesis of leucine. - Homoaconitase (homoaconitate hydratase), an enzyme that participates in the alpha-aminoadipate pathway of lysine biosynthesis and that converts cis-homoaconitate into homoisocitric acid." Q#4574 - CGI_10012136 superfamily 110440 211 237 0.00234461 34.6909 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#4581 - CGI_10012143 superfamily 248264 2 67 4.41E-06 44.1502 cl17710 DDE_4 superfamily N - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." 
Q#4582 - CGI_10012144 superfamily 248318 758 810 9.33E-22 90.5729 cl17764 FYVE superfamily - - "FYVE domain; Zinc-binding domain; targets proteins to membrane lipids via interaction with phosphatidylinositol-3-phosphate, PI3P; present in Fab1, YOTB, Vac1, and EEA1;" Q#4584 - CGI_10004798 superfamily 243110 4 181 3.07E-18 79.7809 cl02616 MACPF superfamily - - "MAC/Perforin domain; The membrane-attack complex (MAC) of the complement system forms transmembrane channels. These channels disrupt the phospholipid bilayer of target cells, leading to cell lysis and death. A number of proteins participate in the assembly of the MAC. Freshly activated C5b binds to C6 to form a C5b-6 complex, then to C7 forming the C5b-7 complex. The C5b-7 complex binds to C8, which is composed of three chains (alpha, beta, and gamma), thus forming the C5b-8 complex. C5b-8 subsequently binds to C9 and acts as a catalyst in the polymerisation of C9. Active MAC has a subunit composition of C5b-C6-C7-C8-C9{n}. Perforin is a protein found in cytolytic T-cell and killer cells. In the presence of calcium, perforin polymerises into transmembrane tubules and is capable of lysing, non-specifically, a variety of target cells. There are a number of regions of similarity in the sequences of complement components C6, C7, C8-alpha, C8-beta, C9 and perforin. The X-ray crystal structure of a MACPF domain reveals that it shares a common fold with bacterial cholesterol dependent cytolysins (pfam01289) such as perfringolysin O. Three key pieces of evidence suggests that MACPF domains and CDCs are homologous: Functional similarity (pore formation), conservation of three glycine residues at a hinge in both families and conservation of a complex core fold." Q#4586 - CGI_10004800 superfamily 248458 12 381 1.78E-08 54.2421 cl17904 MFS superfamily - - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. 
MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#4587 - CGI_10004801 superfamily 248458 35 413 7.35E-26 107.4 cl17904 MFS superfamily - - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. 
Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#4588 - CGI_10004802 superfamily 219290 15 156 3.43E-20 82.5489 cl06216 Spot_14 superfamily - - "Thyroid hormone-inducible hepatic protein Spot 14; This family consists of several thyroid hormone-inducible hepatic protein (Spot 14 or S14) sequences. Mainly expressed in tissues that synthesise triglycerides, the mRNA coding for Spot 14 has been shown to be increased in rat liver by insulin, dietary carbohydrates, glucose in hepatocyte culture medium, as well as thyroid hormone. In contrast, dietary fats and polyunsaturated fatty acids, have been shown to decrease the amount of Spot 14 mRNA, while an elevated level of cAMP acts as a dominant negative factor. In addition, liver-specific factors or chromatin organisation of the gene have been shown to contribute to the regulation of its expression. 
Spot 14 protein is thought to be required for induction of hepatic lipogenesis." Q#4589 - CGI_10004803 superfamily 243050 103 161 1.45E-27 103.566 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#4589 - CGI_10004803 superfamily 243050 284 337 6.21E-27 101.663 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. 
The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#4589 - CGI_10004803 superfamily 243050 164 215 2.83E-26 100.103 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#4589 - CGI_10004803 superfamily 243050 228 278 1.41E-21 87.4285 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. 
The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#4589 - CGI_10004803 superfamily 243050 345 398 2.78E-20 83.9369 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#4591 - CGI_10004805 superfamily 241563 62 97 1.90E-05 42.4664 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#4595 - CGI_10006585 superfamily 220168 15 118 1.79E-46 156.959 cl07802 DUF1969 superfamily - - "Domain of unknown function (DUF1969); The N-terminal domain of fumarylacetoacetate hydrolase is functionally uncharacterized, and adopts a structure consisting of an SH3-like barrel." 
Q#4595 - CGI_10006585 superfamily 245608 150 359 1.99E-27 107.784 cl11421 FAA_hydrolase superfamily - - "Fumarylacetoacetate (FAA) hydrolase family; This family consists of fumarylacetoacetate (FAA) hydrolase, or fumarylacetoacetate hydrolase (FAH) and it also includes HHDD isomerase/OPET decarboxylase from E. coli strain W. FAA is the last enzyme in the tyrosine catabolic pathway, it hydrolyses fumarylacetoacetate into fumarate and acetoacetate which then join the citric acid cycle. Mutations in FAA cause type I tyrosinemia in humans this is an inherited disorder mainly affecting the liver leading to liver cirrhosis, hepatocellular carcinoma, renal tubular damages and neurologic crises amongst other symptoms. The enzymatic defect causes the toxic accumulation of phenylalanine/tyrosine catabolites. The E. coli W enzyme HHDD isomerase/OPET decarboxylase contains two copies of this domain and functions in fourth and fifth steps of the homoprotocatechuate pathway; here it decarboxylates OPET to HHDD and isomerises this to OHED. The final products of this pathway are pyruvic acid and succinic semialdehyde. This family also includes various hydratases and 4-oxalocrotonate decarboxylases which are involved in the bacterial meta-cleavage pathways for degradation of aromatic compounds. 2-hydroxypentadienoic acid hydratase, encoded by mhpD in E. coli, is involved in the phenylpropionic acid pathway of E. coli and catalyzes the conversion of 2-hydroxy pentadienoate to 4-hydroxy-2-keto-pentanoate and uses a Mn2+ co-factor. OHED hydratase encoded by hpcG in E. coli is involved in the homoprotocatechuic acid (HPC) catabolism. XylI in P. putida is a 4-Oxalocrotonate decarboxylase." 
Q#4596 - CGI_10006586 superfamily 247899 334 472 1.21E-07 52.2663 cl17345 AccA superfamily NC - Acetyl-CoA carboxylase alpha subunit [Lipid metabolism] Q#4596 - CGI_10006586 superfamily 247899 26 180 7.45E-06 46.7236 cl17345 AccA superfamily NC - Acetyl-CoA carboxylase alpha subunit [Lipid metabolism] Q#4597 - CGI_10006587 superfamily 243092 703 947 1.59E-09 58.8856 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#4597 - CGI_10006587 superfamily 243092 923 1118 1.58E-08 56.1892 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#4598 - CGI_10006588 superfamily 204080 211 395 3.68E-41 145.878 cl18252 BAAT_C superfamily - - BAAT / Acyl-CoA thioester hydrolase C terminal; This catalytic domain is found at the C terminal of acyl-CoA thioester hydrolases and bile acid-CoA:amino acid N-acetyltransferases (BAAT). Q#4598 - CGI_10006588 superfamily 218259 15 150 4.74E-35 126.615 cl04742 Bile_Hydr_Trans superfamily - - "Acyl-CoA thioester hydrolase/BAAT N-terminal region; This family consists of the amino termini of acyl-CoA thioester hydrolase and bile acid-CoA:amino acid N-acetyltransferase (BAAT). This region is not thought to contain the active site of either enzyme. 
Thioesterase isoforms have been identified in peroxisomes, cytoplasm and mitochondria, where they are thought to have distinct functions in lipid metabolism. For example, in peroxisomes, the hydrolase acts on bile-CoA esters." Q#4599 - CGI_10006589 superfamily 245819 380 556 1.05E-52 180.083 cl11967 Nucleotidyl_cyc_III superfamily - - "Class III nucleotidyl cyclases; Class III nucleotidyl cyclases are the largest, most diverse group of nucleotidyl cyclases (NC's) containing prokaryotic and eukaryotic proteins. They can be divided into two major groups; the mononucleotidyl cyclases (MNC's) and the diguanylate cyclases (DGC's). The MNC's, which include the adenylate cyclases (AC's) and the guanylate cyclases (GC's), have a conserved cyclase homology domain (CHD), while the DGC's have a conserved GGDEF domain, named after a conserved motif within this subgroup. Their products, cyclic guanylyl and adenylyl nucleotides, are second messengers that play important roles in eukaryotic signal transduction and prokaryotic sensory pathways." Q#4599 - CGI_10006589 superfamily 219812 43 287 6.14E-21 92.3692 cl07121 NIT superfamily - - "Nitrate and nitrite sensing; The nitrate- and nitrite sensing domain (NIT) is found in receptor components of signal transducing pathways in bacteria which control gene expression, cellular motility and enzyme activity in response to nitrate and nitrite concentrations. The NIT domain is predicted to be all alpha-helical in structure." Q#4599 - CGI_10006589 superfamily 219526 287 366 4.28E-07 49.9251 cl06648 HNOBA superfamily N - "Heme NO binding associated; The HNOBA domain is found associated with the HNOB domain and pfam00211 in soluble cyclases and signalling proteins. The HNOB domain is predicted to function as a heme-dependent sensor for gaseous ligands, and transduce diverse downstream signals, in both bacteria and animals." 
Q#4600 - CGI_10006590 superfamily 245819 405 557 6.01E-44 155.815 cl11967 Nucleotidyl_cyc_III superfamily - - "Class III nucleotidyl cyclases; Class III nucleotidyl cyclases are the largest, most diverse group of nucleotidyl cyclases (NC's) containing prokaryotic and eukaryotic proteins. They can be divided into two major groups; the mononucleotidyl cyclases (MNC's) and the diguanylate cyclases (DGC's). The MNC's, which include the adenylate cyclases (AC's) and the guanylate cyclases (GC's), have a conserved cyclase homology domain (CHD), while the DGC's have a conserved GGDEF domain, named after a conserved motif within this subgroup. Their products, cyclic guanylyl and adenylyl nucleotides, are second messengers that play important roles in eukaryotic signal transduction and prokaryotic sensory pathways." Q#4600 - CGI_10006590 superfamily 219812 68 313 8.84E-21 91.984 cl07121 NIT superfamily - - "Nitrate and nitrite sensing; The nitrate- and nitrite sensing domain (NIT) is found in receptor components of signal transducing pathways in bacteria which control gene expression, cellular motility and enzyme activity in response to nitrate and nitrite concentrations. The NIT domain is predicted to be all alpha-helical in structure." Q#4600 - CGI_10006590 superfamily 219526 365 390 0.00051014 40.6803 cl06648 HNOBA superfamily N - "Heme NO binding associated; The HNOBA domain is found associated with the HNOB domain and pfam00211 in soluble cyclases and signalling proteins. The HNOB domain is predicted to function as a heme-dependent sensor for gaseous ligands, and transduce diverse downstream signals, in both bacteria and animals." Q#4601 - CGI_10006591 superfamily 246925 116 387 1.17E-23 100.508 cl15309 LRR_RI superfamily - - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. 
LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#4603 - CGI_10006593 superfamily 247907 10 45 0.000188162 36.2421 cl17353 LamG superfamily NC - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." Q#4604 - CGI_10006594 superfamily 247907 1 93 3.17E-14 64.3617 cl17353 LamG superfamily N - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." Q#4605 - CGI_10011169 superfamily 241563 65 106 0.00146909 38.2292 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#4606 - CGI_10011170 superfamily 110440 111 137 0.00325451 33.5353 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. 
The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#4607 - CGI_10011171 superfamily 110440 561 588 0.00609569 35.0761 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#4609 - CGI_10011173 superfamily 245819 832 1008 2.44E-65 218.217 cl11967 Nucleotidyl_cyc_III superfamily - - "Class III nucleotidyl cyclases; Class III nucleotidyl cyclases are the largest, most diverse group of nucleotidyl cyclases (NC's) containing prokaryotic and eukaryotic proteins. They can be divided into two major groups; the mononucleotidyl cyclases (MNC's) and the diguanylate cyclases (DGC's). The MNC's, which include the adenylate cyclases (AC's) and the guanylate cyclases (GC's), have a conserved cyclase homology domain (CHD), while the DGC's have a conserved GGDEF domain, named after a conserved motif within this subgroup. Their products, cyclic guanylyl and adenylyl nucleotides, are second messengers that play important roles in eukaryotic signal transduction and prokaryotic sensory pathways." 
Q#4609 - CGI_10011173 superfamily 245225 45 410 6.22E-50 182.511 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily - - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein couples receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine- binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." 
Q#4609 - CGI_10011173 superfamily 245201 531 759 1.51E-30 122.26 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#4609 - CGI_10011173 superfamily 219526 777 818 2.88E-05 44.9175 cl06648 HNOBA superfamily N - "Heme NO binding associated; The HNOBA domain is found associated with the HNOB domain and pfam00211 in soluble cyclases and signalling proteins. The HNOB domain is predicted to function as a heme-dependent sensor for gaseous ligands, and transduce diverse downstream signals, in both bacteria and animals." Q#4610 - CGI_10011174 superfamily 245819 9 159 2.66E-54 171.993 cl11967 Nucleotidyl_cyc_III superfamily - - "Class III nucleotidyl cyclases; Class III nucleotidyl cyclases are the largest, most diverse group of nucleotidyl cyclases (NC's) containing prokaryotic and eukaryotic proteins. They can be divided into two major groups; the mononucleotidyl cyclases (MNC's) and the diguanylate cyclases (DGC's). The MNC's, which include the adenylate cyclases (AC's) and the guanylate cyclases (GC's), have a conserved cyclase homology domain (CHD), while the DGC's have a conserved GGDEF domain, named after a conserved motif within this subgroup. Their products, cyclic guanylyl and adenylyl nucleotides, are second messengers that play important roles in eukaryotic signal transduction and prokaryotic sensory pathways." 
Q#4611 - CGI_10011175 superfamily 241578 25 187 4.07E-31 122.015 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#4612 - CGI_10011176 superfamily 247723 15 88 1.94E-51 168.541 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. 
The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#4612 - CGI_10011176 superfamily 247723 99 181 1.31E-43 148.212 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#4615 - CGI_10011179 superfamily 247805 28 146 3.22E-11 56.5768 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#4617 - CGI_10011181 superfamily 241599 111 169 3.05E-24 95.0028 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#4617 - CGI_10011181 superfamily 146451 291 309 0.00493514 34.6423 cl08404 OAR superfamily - - OAR domain; OAR domain. 
Q#4619 - CGI_10006904 superfamily 190261 122 181 2.62E-22 93.7674 cl03504 RFX_DNA_binding superfamily - - RFX DNA-binding domain; RFX is a regulatory factor which binds to the X box of MHC class II genes and is essential for their expression. The DNA-binding domain of RFX is the central domain of the protein and binds ssDNA as either a monomer or homodimer. Q#4621 - CGI_10006906 superfamily 243072 100 222 9.28E-40 140.982 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#4622 - CGI_10006907 superfamily 216101 167 767 2.48E-178 527.246 cl08288 Carn_acyltransf superfamily - - Choline/Carnitine o-acyltransferase; Choline/Carnitine o-acyltransferase. Q#4630 - CGI_10022409 superfamily 245213 45 78 0.000137082 38.0014 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#4630 - CGI_10022409 superfamily 245847 84 227 7.93E-17 74.1301 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#4634 - CGI_10022414 superfamily 241958 1 392 5.40E-78 249.74 cl00573 SDF superfamily - - Sodium:dicarboxylate symporter family; Sodium:dicarboxylate symporter family. Q#4636 - CGI_10022416 superfamily 220806 150 208 0.000527998 38.1492 cl12387 MULE superfamily N - MULE transposase domain; This domain was identified by Babu and colleagues. Q#4637 - CGI_10022417 superfamily 243161 102 163 0.000357635 36.6442 cl02739 THAP superfamily C - "THAP domain; The THAP domain is a putative DNA-binding domain (DBD) and probably also binds a zinc ion. It features the conserved C2CH architecture (consensus sequence: Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His). Other universal features include the location of the domain at the N-termini of proteins, its size of about 90 residues, a C-terminal AVPTIF box and several other conserved residues. Orthologues of the human THAP domain have been identified in other vertebrates and probably worms and flies, but not in other eukaryotes or any prokaryotes." Q#4640 - CGI_10022420 superfamily 241563 62 97 0.000821051 37.4588 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#4641 - CGI_10022421 superfamily 247896 1 290 8.85E-170 484.51 cl17342 Pyruvate_Kinase superfamily N - "Pyruvate kinase (PK): Large allosteric enzyme that regulates glycolysis through binding of the substrate, phosphoenolpyruvate, and one or more allosteric effectors. 
Like other allosteric enzymes, PK has a high substrate affinity R state and a low affinity T state. PK exists as several different isozymes, depending on organism and tissue type. In mammals, there are four PK isozymes: R, found in red blood cells, L, found in liver, M1, found in skeletal muscle, and M2, found in kidney, adipose tissue, and lung. PK forms a homotetramer, with each subunit containing three domains. The T state to R state transition of PK is more complex than in most allosteric enzymes, involving a concerted rotation of all 3 domains of each monomer in the homotetramer." Q#4643 - CGI_10022423 superfamily 222150 61 85 7.67E-05 36.6009 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#4643 - CGI_10022423 superfamily 222150 88 113 0.00304351 32.3638 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#4644 - CGI_10022424 superfamily 241574 21 136 5.48E-53 175.467 cl00053 PTPc superfamily C - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#4644 - CGI_10022424 superfamily 241574 137 193 1.38E-12 62.7615 cl00053 PTPc superfamily N - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. 
This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#4644 - CGI_10022424 superfamily 241574 231 303 1.53E-05 44.1138 cl00053 PTPc superfamily C - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#4646 - CGI_10022426 superfamily 246680 149 228 8.56E-05 41.6119 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." 
Q#4647 - CGI_10022427 superfamily 241574 1 182 9.63E-69 221.306 cl00053 PTPc superfamily N - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#4647 - CGI_10022427 superfamily 241574 245 432 3.21E-19 85.7153 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#4650 - CGI_10017669 superfamily 241578 23 66 3.32E-07 44.9701 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#4651 - CGI_10017670 superfamily 248012 413 533 1.61E-14 70.7653 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#4651 - CGI_10017670 superfamily 214507 291 328 0.000279819 38.9504 cl15307 LRRCT superfamily C - Leucine rich repeat C-terminal domain; Leucine rich repeat C-terminal domain. Q#4652 - CGI_10017672 superfamily 241584 10804 10888 3.92E-05 46.7207 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#4652 - CGI_10017672 superfamily 241584 10939 11008 0.00680466 39.7871 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. 
Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#4652 - CGI_10017672 superfamily 216566 469 567 1.93E-05 47.5673 cl18370 Peptidase_M23 superfamily - - "Peptidase family M23; Members of this family are zinc metallopeptidases with a range of specificities. The peptidase family M23 is included in this family, these are Gly-Gly endopeptidases. Peptidase family M23 are also endopeptidases. This family also includes some bacterial lipoproteins such as Escherichia coli murein hydrolase activator NlpD, for which no proteolytic activity has been demonstrated. This family also includes leukocyte cell-derived chemotaxin 2 (LECT2) proteins. LECT2 is a liver-specific protein which is thought to be linked to hepatocyte growth although the exact function of this protein is unknown." Q#4652 - CGI_10017672 superfamily 241584 10328 10405 0.000760293 42.7963 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#4652 - CGI_10017672 superfamily 245213 12767 12801 0.00268244 40.308 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#4654 - CGI_10017675 superfamily 241607 37 63 0.00723219 29.9306 cl00097 KAZAL_FS superfamily C - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chyomotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." 
Q#4655 - CGI_10017676 superfamily 245819 371 546 1.81E-65 215.906 cl11967 Nucleotidyl_cyc_III superfamily - - "Class III nucleotidyl cyclases; Class III nucleotidyl cyclases are the largest, most diverse group of nucleotidyl cyclases (NC's) containing prokaryotic and eukaryotic proteins. They can be divided into two major groups; the mononucleotidyl cyclases (MNC's) and the diguanylate cyclases (DGC's). The MNC's, which include the adenylate cyclases (AC's) and the guanylate cyclases (GC's), have a conserved cyclase homology domain (CHD), while the DGC's have a conserved GGDEF domain, named after a conserved motif within this subgroup. Their products, cyclic guanylyl and adenylyl nucleotides, are second messengers that play important roles in eukaryotic signal transduction and prokaryotic sensory pathways." Q#4655 - CGI_10017676 superfamily 245201 56 297 2.01E-31 122.345 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#4655 - CGI_10017676 superfamily 219526 310 356 0.00125589 39.5247 cl06648 HNOBA superfamily N - "Heme NO binding associated; The HNOBA domain is found associated with the HNOB domain and pfam00211 in soluble cyclases and signalling proteins. The HNOB domain is predicted to function as a heme-dependent sensor for gaseous ligands, and transduce diverse downstream signals, in both bacteria and animals." 
Q#4656 - CGI_10017677 superfamily 247068 654 717 0.00167653 38.0995 cl15786 CA_like superfamily N - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#4656 - CGI_10017677 superfamily 247068 411 481 0.00299725 37.3291 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#4657 - CGI_10017678 superfamily 247805 33 151 8.25E-12 59.6584 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 
Q#4667 - CGI_10017688 superfamily 247068 103 169 0.00566862 35.4031 cl15786 CA_like superfamily C - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#4668 - CGI_10017690 superfamily 243035 191 281 2.67E-11 58.7853 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#4670 - CGI_10017692 superfamily 219933 1 204 1.07E-50 165.186 cl07290 Med20 superfamily - - TATA-binding related factor (TRF) of subunit 20 of Mediator complex; This family of proteins is related to TATA-binding protein (TBP). TBP is a highly conserved RNA polymerase II general transcription factor that binds to the core promoter and initiates assembly of the preinitiation complex. Human TRF has been shown to associate with an RNA polymerase II-SRB complex. This Med20 subunit of Mediator is found in the non-essential part of the head. Q#4671 - CGI_10017693 superfamily 241733 2 70 6.79E-48 148.514 cl00259 Sm_like superfamily - - "Sm and related proteins; The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes." Q#4672 - CGI_10023008 superfamily 191362 29 81 1.29E-20 78.4666 cl05351 zf-nanos superfamily - - "Nanos RNA binding domain; This family consists of several conserved novel zinc finger domains found in the eukaryotic proteins Nanos and Xcat-2. In Drosophila melanogaster, Nanos functions as a localised determinant of posterior pattern. Nanos RNA is localised to the posterior pole of the maturing egg cell and encodes a protein that emanates from this localised source. Nanos acts as a translational repressor and thereby establishes a gradient of the morphogen Hunchback. Xcat-2 is found in the vegetal cortical region and is inherited by the vegetal blastomeres during development, and is degraded very early in development. 
The localised and maternally restricted expression of Xcat-2 RNA suggests a role for its protein in setting up regional differences in gene expression that occur early in development." Q#4673 - CGI_10023009 superfamily 243092 144 385 1.21E-24 101.258 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#4673 - CGI_10023009 superfamily 243074 49 92 1.70E-06 44.8049 cl02535 F-box-like superfamily - - F-box-like; This is an F-box-like family. Q#4676 - CGI_10023012 superfamily 248097 148 223 2.90E-15 69.6014 cl17543 C1q superfamily C - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#4677 - CGI_10023013 superfamily 248097 73 151 7.24E-13 61.8974 cl17543 C1q superfamily C - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. 
Q#4678 - CGI_10023014 superfamily 248097 154 282 6.40E-13 63.4382 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#4678 - CGI_10023014 superfamily 248097 51 134 7.35E-09 51.8822 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#4679 - CGI_10023015 superfamily 248097 234 360 2.00E-17 76.9202 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#4679 - CGI_10023015 superfamily 248097 46 172 1.49E-10 57.6602 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#4680 - CGI_10023016 superfamily 246669 17 151 8.00E-20 85.3098 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." 
Q#4680 - CGI_10023016 superfamily 241578 167 384 1.65E-41 150.215 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#4682 - CGI_10023018 superfamily 241644 12 119 6.23E-40 141.571 cl00154 UBCc superfamily C - "Ubiquitin-conjugating enzyme E2, catalytic (UBCc) domain. This is part of the ubiquitin-mediated protein degradation pathway in which a thiol-ester linkage forms between a conserved cysteine and the C-terminus of ubiquitin and complexes with ubiquitin protein ligase enzymes, E3. This pathway regulates many fundamental cellular processes. There are also other E2s which form thiol-ester linkages without the use of E3s as well as several UBC homologs (TSG101, Mms2, Croc-1 and similar proteins) which lack the active site cysteine essential for ubiquitination and appear to function in DNA repair pathways which were omitted from the scope of this CD." 
Q#4683 - CGI_10023019 superfamily 188051 142 305 5.25E-30 116.027 cl18155 nop2p superfamily C - "NOL1/NOP2/sun family putative RNA methylase; [Protein synthesis, tRNA and rRNA base modification]." Q#4685 - CGI_10023022 superfamily 241578 347 460 4.36E-11 61.4278 cl00057 vWFA superfamily C - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#4685 - CGI_10023022 superfamily 241578 156 215 1.24E-06 47.5606 cl00057 vWFA superfamily C - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#4685 - CGI_10023022 superfamily 115363 20 71 1.50E-14 69.3229 cl05972 MIB_HERC2 superfamily - - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). Q#4685 - CGI_10023022 superfamily 115363 94 140 2.28E-12 63.1597 cl05972 MIB_HERC2 superfamily - - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). Q#4685 - CGI_10023022 superfamily 115363 571 617 1.15E-10 58.1522 cl05972 MIB_HERC2 superfamily - - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). Q#4686 - CGI_10023023 superfamily 247068 358 448 6.48E-12 63.1013 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#4686 - CGI_10023023 superfamily 247068 233 277 1.20E-05 44.6118 cl15786 CA_like superfamily NC - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#4686 - CGI_10023023 superfamily 247068 142 201 1.50E-05 44.2266 cl15786 CA_like superfamily N - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#4686 - CGI_10023023 superfamily 115363 12 73 2.33E-13 66.6265 cl05972 MIB_HERC2 superfamily - - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). 
Q#4686 - CGI_10023023 superfamily 115363 100 130 0.00110938 38.1218 cl05972 MIB_HERC2 superfamily C - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). Q#4687 - CGI_10023024 superfamily 219533 673 955 1.45E-136 413.357 cl06658 Coatamer_beta_C superfamily - - Coatamer beta C-terminal region; This family is found at the N-terminus of the coatamer beta subunit proteins (Beta-coat proteins). This C-terminal domain probably adapts the function of the N-terminal pfam01602 domain. Q#4688 - CGI_10023025 superfamily 245208 45 418 0 669.535 cl09933 ACAD superfamily - - "Acyl-CoA dehydrogenase; Both mitochondrial acyl-CoA dehydrogenases (ACAD) and peroxisomal acyl-CoA oxidases (AXO) catalyze the alpha,beta dehydrogenation of the corresponding trans-enoyl-CoA by FAD, which becomes reduced. The reduced form of ACAD is reoxidized in the oxidative half-reaction by electron-transferring flavoprotein (ETF), from which the electrons are transferred to the mitochondrial respiratory chain coupled with ATP synthesis. In contrast, AXO catalyzes a different oxidative half-reaction, in which the reduced FAD is reoxidized by molecular oxygen. The ACAD family includes the eukaryotic beta-oxidation enzymes, short (SCAD), medium (MCAD), long (LCAD) and very-long (VLCAD) chain acyl-CoA dehydrogenases. These enzymes all share high sequence similarity, but differ in their substrate specificities. The ACAD family also includes amino acid catabolism enzymes such as Isovaleryl-CoA dehydrogenase (IVD), short/branched chain acyl-CoA dehydrogenases(SBCAD), Isobutyryl-CoA dehydrogenase (IBDH), glutaryl-CoA deydrogenase (GCD) and Crotonobetainyl-CoA dehydrogenase. The mitochondrial ACAD's are generally homotetramers, except for VLCAD, which is a homodimer. 
Related enzymes include the SOS adaptive response protein aidB, Naphthocyclinone hydroxylase (NcnH), and Dibenzothiophene (DBT) desulfurization enzyme C (DszC)" Q#4689 - CGI_10023026 superfamily 216101 202 814 0 653.592 cl08288 Carn_acyltransf superfamily - - Choline/Carnitine o-acyltransferase; Choline/Carnitine o-acyltransferase. Q#4690 - CGI_10023027 superfamily 245227 30 827 0 1437.25 cl10013 Glycosyltransferase_GTB_type superfamily - - "Glycosyltransferases catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. The acceptor molecule can be a lipid, a protein, a heterocyclic compound, or another carbohydrate residue. The structures of the formed glycoconjugates are extremely diverse, reflecting a wide range of biological functions. The members of this family share a common GTB topology, one of the two protein topologies observed for nucleotide-sugar-dependent glycosyltransferases. GTB proteins have distinct N- and C- terminal domains each containing a typical Rossmann fold. The two domains have high structural homology despite minimal sequence homology. The large cleft that separates the two domains includes the catalytic center and permits a high degree of flexibility." Q#4692 - CGI_10023029 superfamily 243077 63 116 3.35E-15 67.5705 cl02542 DnaJ superfamily - - "DnaJ domain or J-domain. DnaJ/Hsp40 (heat shock protein 40) proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperone family. Hsp40 proteins are characterized by the presence of a J domain, which mediates the interaction with Hsp70. They may contain other domains as well, and the architectures provide a means of classification." 
Q#4693 - CGI_10023030 superfamily 241677 4 169 3.17E-95 287.7 cl00197 cyclophilin superfamily - - "cyclophilin: cyclophilin-type peptidylprolyl cis- trans isomerases. This family contains eukaryotic, bacterial and archeal proteins which exhibit a peptidylprolyl cis- trans isomerases activity (PPIase, Rotamase) and in addition bind the immunosuppressive drug cyclosporin (CsA). Immunosuppression in vertebrates is believed to be the result of the cyclophilin A-cyclosporin protein drug complex binding to and inhibiting the protein-phosphatase calcineurin. PPIase is an enzyme which accelerates protein folding by catalyzing the cis-trans isomerization of the peptide bonds preceding proline residues. Cyclophilins are a diverse family in terms of function and have been implicated in protein folding processes which depend on catalytic /chaperone-like activities. This group contains human cyclophilin 40, a co-chaperone of the hsp90 chaperone system; human cyclophilin A, a chaperone in the HIV-1 infectious process and; human cyclophilin H, a component of the U4/U6 snRNP, whose isomerization or chaperoning activities may play a role in RNA splicing." Q#4693 - CGI_10023030 superfamily 247723 237 319 6.42E-58 187.856 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. 
The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#4697 - CGI_10023034 superfamily 243061 31 129 7.82E-40 130.925 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#4698 - CGI_10003202 superfamily 247684 5 200 1.70E-28 110.629 cl17037 NBD_sugar-kinase_HSP70_actin superfamily C - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#4702 - CGI_10018905 superfamily 242274 2 144 0.000124562 40.4734 cl01053 SGNH_hydrolase superfamily - - "SGNH_hydrolase, or GDSL_hydrolase, is a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the typical Ser-His-Asp(Glu) triad from other serine hydrolases, but may lack the carboxylic acid." Q#4704 - CGI_10018907 superfamily 247856 207 236 0.00408814 34.4928 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#4706 - CGI_10018909 superfamily 247743 129 258 0.000242836 41.8016 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#4707 - CGI_10018910 superfamily 247743 129 258 0.000151771 42.1868 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. 
The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#4708 - CGI_10018911 superfamily 241719 1 140 9.49E-72 214.333 cl00242 MoaC superfamily - - "MoaC family. Members of this family are involved in molybdenum cofactor (Moco) biosynthesis, an essential cofactor of a diverse group of redox enzymes. MoaC, a small hexameric protein, converts, together with MoaA, a guanosine derivative to the precursor Z by inserting the carbon-8 of the purine between the 2' and 3' ribose carbon atoms, which is the first of three phases of Moco biosynthesis." Q#4710 - CGI_10018913 superfamily 245599 182 368 1.18E-68 228.53 cl11397 NR_LBD superfamily - - "The ligand binding domain of nuclear receptors, a family of ligand-activated transcription regulators; Ligand-binding domain (LBD) of nuclear receptor (NR): Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions in metazoans, from development, reproduction, to homeostasis and metabolism. The superfamily contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. The members of the family include receptors of steroids, thyroid hormone, retinoids, cholesterol by-products, lipids and heme. With few exceptions, NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD)." 
Q#4710 - CGI_10018913 superfamily 245599 91 181 8.76E-26 106.807 cl11397 NR_LBD superfamily C - "The ligand binding domain of nuclear receptors, a family of ligand-activated transcription regulators; Ligand-binding domain (LBD) of nuclear receptor (NR): Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions in metazoans, from development, reproduction, to homeostasis and metabolism. The superfamily contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. The members of the family include receptors of steroids, thyroid hormone, retinoids, cholesterol by-products, lipids and heme. With few exceptions, NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD)." Q#4713 - CGI_10018916 superfamily 243084 845 952 4.01E-71 235.415 cl02556 Bromodomain superfamily - - Bromodomain. Bromodomains are found in many chromatin-associated proteins and in nuclear histone acetyltransferases. They interact specifically with acetylated lysine. Q#4713 - CGI_10018916 superfamily 241760 1478 1518 1.32E-21 91.4707 cl00295 ZZ superfamily - - "Zinc finger, ZZ type. Zinc finger present in dystrophin, CBP/p300 and many other proteins. The ZZ motif coordinates one or two zinc ions and most likely participates in ligand binding or molecular scaffolding. Many proteins containing ZZ motifs have other zinc-binding motifs as well, and the majority serve as scaffolds in pathways involving acetyltransferase, protein kinase, or ubiqitin-related activity. ZZ proteins can be grouped into the following functional classes: chromatin modifying, cytoskeletal scaffolding, ubiquitin binding or conjugating, and membrane receptor or ion-channel modifying proteins." 
Q#4713 - CGI_10018916 superfamily 242892 1105 1422 8.35E-97 318.593 cl02120 KAT11 superfamily - - "Histone acetylation protein; Histone acetylation is required in many cellular processes including transcription, DNA repair, and chromatin assembly. This family contains the fungal KAT11 protein (previously known as RTT109) which is required for H3K56 acetylation. Loss of KAT11 results in the loss of H3K56 acetylation, both on bulk histone and on chromatin. KAT11 and H3K56 acetylation appear to correlate with actively transcribed genes and associate with the elongating form of Pol II in yeast. This family also incorporates the p300/CBP histone acetyltransferase domain which has different catalytic properties and cofactor regulation to KAT11." Q#4713 - CGI_10018916 superfamily 111102 565 644 5.49E-41 148.364 cl03478 KIX superfamily - - KIX domain; CBP and P300 bind to the CREB via a domain known as KIX. The KIX domain of CBP also binds to transactivation domains of other nuclear factors including Myb and Jun. Q#4713 - CGI_10018916 superfamily 243131 359 437 8.98E-29 112.834 cl02660 zf-TAZ superfamily - - "TAZ zinc finger; The TAZ2 domain of CBP binds to other transcription factors such as the p53 tumour suppressor protein, E1A oncoprotein, MyoD, and GATA-1. The zinc coordinating motif that is necessary for binding to target DNA sequences consists of HCCC." Q#4713 - CGI_10018916 superfamily 243131 1541 1614 2.29E-22 94.3447 cl02660 zf-TAZ superfamily - - "TAZ zinc finger; The TAZ2 domain of CBP binds to other transcription factors such as the p53 tumour suppressor protein, E1A oncoprotein, MyoD, and GATA-1. The zinc coordinating motif that is necessary for binding to target DNA sequences consists of HCCC." 
Q#4713 - CGI_10018916 superfamily 191424 950 990 2.25E-20 87.841 cl05510 DUF902 superfamily - - "Domain of Unknown Function (DUF902); This domain of unknown function is found in several transcriptional co-activators including the CREB-binding protein, which is an acetyltransferase that acetylates histones, giving a specific tag for transcriptional activation. This short domain is found to the C-terminus of bromodomains. The 40 residue domain contains four conserved cysteines suggesting that it may be stabilised by a zinc ion. In CREB this domain is to the N-terminus of another zinc binding PHD domain." Q#4713 - CGI_10018916 superfamily 220093 1793 1843 2.66E-11 63.1626 cl07590 Creb_binding superfamily N - "Creb binding; The Creb binding domain assumes a structure comprising of three alpha-helices which pack in a bundle, exposing a hydrophobic groove between alpha-1 and alpha-3 within which complimentary domains found in the protein 'activator for thyroid hormone and retinoid receptors' (ACTR) can dock. Docking of these domains is required for the recruitment of RNA polymerase II and the basal transcription machinery." Q#4715 - CGI_10018918 superfamily 243176 310 679 0 700.192 cl02777 chaperonin_like superfamily N - "chaperonin_like superfamily. Chaperonins are involved in productive folding of proteins. They share a common general morphology, a double toroid of 2 stacked rings, each composed of 7-9 subunits. There are 2 main chaperonin groups. The symmetry of type I is seven-fold and they are found in eubacteria (GroEL) and in organelles of eubacterial descent (hsp60 and RBP). The symmetry of type II is eight- or nine-fold and they are found in archea (thermosome), thermophilic bacteria (TF55) and in the eukaryotic cytosol (CTT). Their common function is to sequester nonnative proteins inside their central cavity and promote folding by using energy derived from ATP hydrolysis. 
This superfamily also contains related domains from Fab1-like phosphatidylinositol 3-phosphate (PtdIns3P) 5-kinases that only contain the intermediate and apical domains." Q#4715 - CGI_10018918 superfamily 243176 12 308 0 549.964 cl02777 chaperonin_like superfamily C - "chaperonin_like superfamily. Chaperonins are involved in productive folding of proteins. They share a common general morphology, a double toroid of 2 stacked rings, each composed of 7-9 subunits. There are 2 main chaperonin groups. The symmetry of type I is seven-fold and they are found in eubacteria (GroEL) and in organelles of eubacterial descent (hsp60 and RBP). The symmetry of type II is eight- or nine-fold and they are found in archea (thermosome), thermophilic bacteria (TF55) and in the eukaryotic cytosol (CTT). Their common function is to sequester nonnative proteins inside their central cavity and promote folding by using energy derived from ATP hydrolysis. This superfamily also contains related domains from Fab1-like phosphatidylinositol 3-phosphate (PtdIns3P) 5-kinases that only contain the intermediate and apical domains." Q#4718 - CGI_10018921 superfamily 241647 12 42 1.56E-09 55.6118 cl00157 WW superfamily - - Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs. 
Q#4718 - CGI_10018921 superfamily 246669 694 811 1.14E-37 138.903 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#4718 - CGI_10018921 superfamily 241647 58 88 0.000474406 39.5063 cl00157 WW superfamily - - Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs. Q#4719 - CGI_10018922 superfamily 247804 446 488 2.65E-10 58.3558 cl17250 SANT superfamily - - "'SWI3, ADA2, N-CoR and TFIIIB' DNA-binding domains. 
Tandem copies of the domain bind telomeric DNA tandem repeats as part of the capping complex. Binding is sequence dependent for repeats which contain the G/C rich motif [C2-3 A (CA)1-6]. The domain is also found in regulatory transcriptional repressor complexes where it also binds DNA." Q#4719 - CGI_10018922 superfamily 247804 339 381 5.04E-05 42.5626 cl17250 SANT superfamily - - "'SWI3, ADA2, N-CoR and TFIIIB' DNA-binding domains. Tandem copies of the domain bind telomeric DNA tandem repeats as part of the capping complex. Binding is sequence dependent for repeats which contain the G/C rich motif [C2-3 A (CA)1-6]. The domain is also found in regulatory transcriptional repressor complexes where it also binds DNA." Q#4721 - CGI_10018924 superfamily 218376 19 132 5.82E-47 149.706 cl04884 Ocnus superfamily - - "Janus/Ocnus family (Ocnus); This family is comprised of the Ocnus, Janus-A and Janus-B proteins. These proteins have been found to be testes specific in Drosophila melanogaster." Q#4723 - CGI_10018926 superfamily 245814 129 191 1.00E-05 41.3207 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#4723 - CGI_10018926 superfamily 245814 48 109 3.09E-05 40.1651 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#4724 - CGI_10009063 superfamily 245213 534 569 0.00116737 37.6162 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#4724 - CGI_10009063 superfamily 243061 691 773 4.07E-13 66.5966 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#4724 - CGI_10009063 superfamily 246918 578 632 1.19E-09 55.6707 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#4724 - CGI_10009063 superfamily 246918 634 687 1.97E-09 54.9003 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. 
Q#4725 - CGI_10009064 superfamily 241642 243 299 3.14E-12 60.587 cl00152 t_SNARE superfamily - - "Soluble NSF (N-ethylmaleimide-sensitive fusion protein)-Attachment protein (SNAP) REceptor domain; these alpha-helical motifs form twisted and parallel heterotetrameric helix bundles; the core complex contains one helix from a protein that is anchored in the vesicle membrane (synaptobrevin), one helix from a protein of the target membrane (syntaxin), and two helices from another protein anchored in the target membrane (SNAP-25); their interaction forms a core which is composed of a polar zero layer, a flanking leucine-zipper layer acts as a water tight shield to isolate ionic interactions in the zero layer from the surrounding solvent" Q#4725 - CGI_10009064 superfamily 241634 76 175 1.79E-11 59.6759 cl00143 SynN superfamily - - "Syntaxin N-terminus domain; syntaxins are nervous system-specific proteins implicated in the docking of synaptic vesicles with the presynaptic plasma membrane; they are a family of receptors for intracellular transport vesicles; each target membrane may be identified by a specific member of the syntaxin family; syntaxins contain a moderately well conserved amino-terminal domain, called Habc, whose structure is an antiparallel three-helix bundle; a linker of about 30 amino acids connects this to the carboxy-terminal region, designated H3 (t_SNARE), of the syntaxin cytoplasmic domain; the highly conserved H3 region forms a single, long alpha-helix when it is part of the core SNARE complex and anchors the protein on the cytoplasmic surface of cellular membranes; H3 is not included in defining this domain" Q#4729 - CGI_10009068 superfamily 248458 297 436 0.00543847 37.6785 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. 
MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#4730 - CGI_10009069 superfamily 100115 1 147 1.16E-27 106.985 cl18930 StaR_like superfamily N - "StaR_like; a well-conserved protein found in bacteria, plants, and animals. A family member from Streptomyces toyocaensis, StaR is part of a gene cluster involved in the biosynthesis of glycopeptide antibiotics (GPAs), specifically A47934. It has been speculated that StaR could be a flavoprotein hydroxylating a tyrosine sidechain. 
Some family members have been annotated as proteins containing tetratricopeptide (TPR) repeats, which may at least indicate mostly alpha-helical secondary structure." Q#4731 - CGI_10009070 superfamily 100115 58 174 1.87E-18 80.0207 cl18930 StaR_like superfamily C - "StaR_like; a well-conserved protein found in bacteria, plants, and animals. A family member from Streptomyces toyocaensis, StaR is part of a gene cluster involved in the biosynthesis of glycopeptide antibiotics (GPAs), specifically A47934. It has been speculated that StaR could be a flavoprotein hydroxylating a tyrosine sidechain. Some family members have been annotated as proteins containing tetratricopeptide (TPR) repeats, which may at least indicate mostly alpha-helical secondary structure." Q#4732 - CGI_10009071 superfamily 243072 39 136 7.95E-23 88.2094 cl02529 ANK superfamily C - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#4733 - CGI_10009072 superfamily 243072 12 160 2.83E-19 80.5054 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#4735 - CGI_10009074 superfamily 215821 245 309 2.62E-06 44.5387 cl18346 FKBP_C superfamily C - FKBP-type peptidyl-prolyl cis-trans isomerase; FKBP-type peptidyl-prolyl cis-trans isomerase. Q#4736 - CGI_10003734 superfamily 247068 238 334 5.86E-28 110.481 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#4736 - CGI_10003734 superfamily 247068 953 1049 1.79E-25 103.547 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." 
Q#4736 - CGI_10003734 superfamily 247068 744 841 1.49E-23 98.1545 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#4736 - CGI_10003734 superfamily 247068 853 945 2.82E-23 96.9989 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#4736 - CGI_10003734 superfamily 247068 1164 1266 1.65E-22 95.0729 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#4736 - CGI_10003734 superfamily 247068 1059 1156 6.92E-22 93.1469 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#4736 - CGI_10003734 superfamily 247068 647 735 8.50E-20 86.9837 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#4736 - CGI_10003734 superfamily 247068 442 539 8.86E-16 75.4277 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#4736 - CGI_10003734 superfamily 247068 134 229 8.12E-12 63.8717 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#4736 - CGI_10003734 superfamily 247068 1288 1361 2.25E-09 56.553 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#4736 - CGI_10003734 superfamily 247068 568 637 3.20E-07 50.0046 cl15786 CA_like superfamily N - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#4736 - CGI_10003734 superfamily 247068 37 120 4.35E-07 49.6194 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#4736 - CGI_10003734 superfamily 247068 343 415 0.000714303 39.5872 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#4738 - CGI_10003736 superfamily 217078 110 287 6.27E-65 208.555 cl15643 CoA_transf_3 superfamily - - "CoA-transferase family III; CoA-transferases are found in organisms from all lines of descent. Most of these enzymes belong to two well-known enzyme families, but recent work on unusual biochemical pathways of anaerobic bacteria has revealed the existence of a third family of CoA-transferases. The members of this enzyme family differ in sequence and reaction mechanism from CoA-transferases of the other families. 
Currently known enzymes of the new family are a formyl-CoA: oxalate CoA-transferase, a succinyl-CoA: (R)-benzylsuccinate CoA-transferase, an (E)-cinnamoyl-CoA: (R)-phenyllactate CoA-transferase, and a butyrobetainyl-CoA: (R)-carnitine CoA-transferase. In addition, a large number of proteins of unknown or differently annotated function from Bacteria, Archaea and Eukarya apparently belong to this enzyme family. Properties and reaction mechanisms of the CoA-transferases of family III are described and compared to those of the previously known CoA-transferases." Q#4739 - CGI_10003737 superfamily 247692 192 541 5.13E-67 222.782 cl17068 AFD_class_I superfamily - - "Adenylate forming domain, Class I; This family includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain." Q#4739 - CGI_10003737 superfamily 247692 51 215 1.89E-12 68.056 cl17068 AFD_class_I superfamily C - "Adenylate forming domain, Class I; This family includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain." 
Q#4740 - CGI_10003738 superfamily 247692 184 533 4.45E-63 211.996 cl17068 AFD_class_I superfamily - - "Adenylate forming domain, Class I; This family includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain." Q#4740 - CGI_10003738 superfamily 247692 17 211 9.19E-16 78.8669 cl17068 AFD_class_I superfamily C - "Adenylate forming domain, Class I; This family includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain." Q#4741 - CGI_10000334 superfamily 203591 86 219 5.29E-37 136.347 cl06275 DUF1399 superfamily - - Protein of unknown function (DUF1399); This family represents a conserved region approximately 150 residues long within a number of hypothetical plant proteins of unknown function. 
Q#4741 - CGI_10000334 superfamily 226728 184 256 0.000244991 42.2241 cl18775 COG4278 superfamily NC - Uncharacterized conserved protein [Function unknown] Q#4741 - CGI_10000334 superfamily 203591 12 89 0.00135406 38.5064 cl06275 DUF1399 superfamily N - Protein of unknown function (DUF1399); This family represents a conserved region approximately 150 residues long within a number of hypothetical plant proteins of unknown function. Q#4743 - CGI_10007438 superfamily 241677 9 176 1.84E-89 273.364 cl00197 cyclophilin superfamily - - "cyclophilin: cyclophilin-type peptidylprolyl cis- trans isomerases. This family contains eukaryotic, bacterial and archeal proteins which exhibit a peptidylprolyl cis- trans isomerases activity (PPIase, Rotamase) and in addition bind the immunosuppressive drug cyclosporin (CsA). Immunosuppression in vertebrates is believed to be the result of the cyclophilin A-cyclosporin protein drug complex binding to and inhibiting the protein-phosphatase calcineurin. PPIase is an enzyme which accelerates protein folding by catalyzing the cis-trans isomerization of the peptide bonds preceding proline residues. Cyclophilins are a diverse family in terms of function and have been implicated in protein folding processes which depend on catalytic /chaperone-like activities. This group contains human cyclophilin 40, a co-chaperone of the hsp90 chaperone system; human cyclophilin A, a chaperone in the HIV-1 infectious process and; human cyclophilin H, a component of the U4/U6 snRNP, whose isomerization or chaperoning activities may play a role in RNA splicing." Q#4745 - CGI_10007440 superfamily 215754 143 230 3.92E-21 86.1532 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#4745 - CGI_10007440 superfamily 215754 234 329 8.46E-21 85.3828 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. 
Q#4745 - CGI_10007440 superfamily 215754 7 132 8.61E-18 76.9084 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#4747 - CGI_10007442 superfamily 248458 124 259 1.67E-13 68.1093 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." 
Q#4750 - CGI_10007445 superfamily 243035 248 352 5.41E-10 56.4741 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#4750 - CGI_10007445 superfamily 243093 109 177 1.36E-08 51.7622 cl02568 WSC superfamily - - WSC domain; This domain may be involved in carbohydrate binding. Q#4750 - CGI_10007445 superfamily 243035 30 92 0.00286066 36.0586 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#4752 - CGI_10007447 superfamily 241578 205 358 1.41E-35 128.949 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#4752 - CGI_10007447 superfamily 241578 34 204 1.62E-30 115.081 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. 
Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#4753 - CGI_10007448 superfamily 241720 8 102 6.18E-49 156.22 cl00245 MGS-like superfamily C - "MGS-like domain. This domain composes the whole protein of methylglyoxal synthetase, which catalyzes the enolization of dihydroxyacetone phosphate (DHAP) to produce methylglyoxal. The family also includes the C-terminal domain in carbamoyl phosphate synthetase (CPS) where it catalyzes the last phosphorylation of a coaboxyphosphate intermediate to form the product carbamoyl phosphate and may also play a regulatory role. This family also includes inosine monophosphate cyclohydrolase. The known structures in this family show a common phosphate binding site." Q#4754 - CGI_10009966 superfamily 245206 53 313 1.03E-95 289.563 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." 
Q#4755 - CGI_10009967 superfamily 245602 205 569 0 537.137 cl11402 GH31 superfamily - - "The enzymes of glycosyl hydrolase family 31 (GH31) occur in prokaryotes, eukaryotes, and archaea with a wide range of hydrolytic activities, including alpha-glucosidase (glucoamylase and sucrase-isomaltase), alpha-xylosidase, 6-alpha-glucosyltransferase, 3-alpha-isomaltosyltransferase and alpha-1,4-glucan lyase. All GH31 enzymes cleave a terminal carbohydrate moiety from a substrate that varies considerably in size, depending on the enzyme, and may be either a starch or a glycoprotein. In most cases, the pyranose moiety recognized in subsite -1 of the substrate binding site is an alpha-D-glucose, though some GH31 family members show a preference for alpha-D-xylose. Several GH31 enzymes can accommodate both glucose and xylose and different levels of discrimination between the two have been observed. Most characterized GH31 enzymes are alpha-glucosidases. In mammals, GH31 members with alpha-glucosidase activity are implicated in at least three distinct biological processes. The lysosomal acid alpha-glucosidase (GAA) is essential for glycogen degradation and a deficiency or malfunction of this enzyme causes glycogen storage disease II, also known as pompe disease. In the endoplasmic reticulum, alpha-glucosidase II catalyzes the second step in the N-linked oligosaccharide processing pathway that constitutes part of the quality control system for glycoprotein folding and maturation. The intestinal enzymes sucrase-isomaltase (SI) and maltase-glucoamylase (MGAM) play key roles in the final stage of carbohydrate digestion, making alpha-glucosidase inhibitors useful in the treatment of type 2 diabetes. GH31 alpha-glycosidases are retaining enzymes that cleave their substrates via an acid/base-catalyzed, double-displacement mechanism involving a covalent glycosyl-enzyme intermediate. 
Two aspartic acid residues have been identified as the catalytic nucleophile and the acid/base, respectively." Q#4755 - CGI_10009967 superfamily 222390 99 161 2.08E-09 54.8238 cl16409 Gal_mutarotas_2 superfamily - - "Galactose mutarotase-like; This family is found N-terminal to glycosyl-hydrolase domains, and appears to be similar to the galactose mutarotase superfamily." Q#4756 - CGI_10009968 superfamily 247639 35 361 0 532.962 cl16914 O-FucT_like superfamily - - "GDP-fucose protein O-fucosyltransferase and related proteins; O-fucosyltransferase-like proteins are GDP-fucose dependent enzymes with similarities to the family 1 glycosyltransferases (GT1). They are soluble ER proteins that may be proteolytically cleaved from a membrane-associated preprotein, and are involved in the O-fucosylation of protein substrates, the core fucosylation of growth factor receptors, and other processes." Q#4757 - CGI_10009969 superfamily 247856 88 158 7.15E-12 57.1725 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#4757 - CGI_10009969 superfamily 247856 53 114 1.24E-10 53.7057 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#4758 - CGI_10009970 superfamily 247792 20 72 7.23E-07 47.0552 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#4759 - CGI_10009971 superfamily 247856 65 126 1.30E-11 56.7873 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#4759 - CGI_10009971 superfamily 247856 100 170 4.91E-11 55.2465 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#4760 - CGI_10009972 superfamily 222150 516 541 2.62E-05 42.3789 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#4760 - CGI_10009972 superfamily 222150 461 484 0.0002058 39.6825 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. 
Q#4760 - CGI_10009972 superfamily 222150 347 370 0.000302495 39.2973 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#4760 - CGI_10009972 superfamily 222150 544 568 0.000320428 38.9121 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#4760 - CGI_10009972 superfamily 222150 601 625 0.00154004 36.9861 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#4760 - CGI_10009972 superfamily 222150 374 398 0.00168598 36.9861 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#4760 - CGI_10009972 superfamily 222150 488 512 0.00254084 36.6009 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#4761 - CGI_10009973 superfamily 245201 53 298 8.35E-69 220.448 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." 
Q#4762 - CGI_10009974 superfamily 243034 33 127 2.84E-25 96.6803 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#4764 - CGI_10009976 superfamily 247856 102 173 1.16E-13 65.2617 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#4764 - CGI_10009976 superfamily 247856 69 127 2.28E-13 64.1061 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#4764 - CGI_10009976 superfamily 247856 42 88 0.00486752 34.4457 cl17302 EFh superfamily N - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#4765 - CGI_10009977 superfamily 243072 294 419 3.17E-19 85.1278 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#4765 - CGI_10009977 superfamily 247856 65 126 3.92E-17 77.2029 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#4765 - CGI_10009977 superfamily 247856 100 174 7.16E-15 70.6545 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#4765 - CGI_10009977 superfamily 244906 595 660 4.00E-28 108.767 cl08315 CAP_GLY superfamily - - "CAP-Gly domain; Cytoskeleton-associated proteins (CAPs) are involved in the organisation of microtubules and transportation of vesicles and organelles along the cytoskeletal network. A conserved motif, CAP-Gly, has been identified in a number of CAPs, including CLIP-170 and dynactins. The crystal structure of Caenorhabditis elegans F53F4.3 protein CAP-Gly domain was recently solved. The domain contains three beta-strands. The most conserved sequence, GKNDG, is located in two consecutive sharp turns on the surface, forming the entrance to a groove." Q#4765 - CGI_10009977 superfamily 244906 476 540 3.74E-25 100.292 cl08315 CAP_GLY superfamily - - "CAP-Gly domain; Cytoskeleton-associated proteins (CAPs) are involved in the organisation of microtubules and transportation of vesicles and organelles along the cytoskeletal network. A conserved motif, CAP-Gly, has been identified in a number of CAPs, including CLIP-170 and dynactins. The crystal structure of Caenorhabditis elegans F53F4.3 protein CAP-Gly domain was recently solved. The domain contains three beta-strands. The most conserved sequence, GKNDG, is located in two consecutive sharp turns on the surface, forming the entrance to a groove." 
Q#4765 - CGI_10009977 superfamily 244906 737 801 7.38E-23 93.7439 cl08315 CAP_GLY superfamily - - "CAP-Gly domain; Cytoskeleton-associated proteins (CAPs) are involved in the organisation of microtubules and transportation of vesicles and organelles along the cytoskeletal network. A conserved motif, CAP-Gly, has been identified in a number of CAPs, including CLIP-170 and dynactins. The crystal structure of Caenorhabditis elegans F53F4.3 protein CAP-Gly domain was recently solved. The domain contains three beta-strands. The most conserved sequence, GKNDG, is located in two consecutive sharp turns on the surface, forming the entrance to a groove." Q#4765 - CGI_10009977 superfamily 244899 36 86 0.0022496 37.0842 cl08302 S-100 superfamily N - "S-100: S-100 domain, which represents the largest family within the superfamily of proteins carrying the Ca-binding EF-hand motif. Note that this S-100 hierarchy contains only S-100 EF-hand domains, other EF-hands have been modeled separately. S100 proteins are expressed exclusively in vertebrates, and are implicated in intracellular and extracellular regulatory activities. Intracellularly, S100 proteins act as Ca-signaling or Ca-buffering proteins. The most unusual characteristic of certain S100 proteins is their occurrence in extracellular space, where they act in a cytokine-like manner through RAGE, the receptor for advanced glycation products. Structural data suggest that many S100 members exist within cells as homo- or heterodimers and even oligomers; oligomerization contributes to their functional diversification. Upon binding calcium, most S100 proteins change conformation to a more open structure exposing a hydrophobic cleft. This hydrophobic surface represents the interaction site of S100 proteins with their target proteins. There is experimental evidence showing that many S100 proteins have multiple binding partners with diverse mode of interaction with different targets. 
In addition to S100 proteins (such as S100A1,-3,-4,-6,-7,-10,-11,and -13), this group includes the ''fused'' gene family, a group of calcium binding S100-related proteins. The ''fused'' gene family includes multifunctional epidermal differentiation proteins - profilaggrin, trichohyalin, repetin, hornerin, and cornulin; functionally these proteins are associated with keratin intermediate filaments and partially crosslinked to the cell envelope. These ''fused'' gene proteins contain N-terminal sequence with two Ca-binding EF-hands motif, which may be associated with calcium signaling in epidermal cells and autoprocessing in a calcium-dependent manner. In contrast to S100 proteins, "fused" gene family proteins contain an extraordinary high number of almost perfect peptide repeats with regular array of polar and charged residues similar to many known cell envelope proteins." Q#4766 - CGI_10009978 superfamily 247068 428 521 1.72E-26 107.014 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#4766 - CGI_10009978 superfamily 247068 976 1078 9.88E-26 104.703 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#4766 - CGI_10009978 superfamily 247068 1210 1305 2.82E-21 91.6061 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#4766 - CGI_10009978 superfamily 247068 88 185 5.52E-21 90.8357 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#4766 - CGI_10009978 superfamily 247068 639 735 1.33E-20 89.6801 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#4766 - CGI_10009978 superfamily 247068 194 300 1.47E-16 78.1241 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#4766 - CGI_10009978 superfamily 247068 530 631 2.16E-15 74.6573 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#4766 - CGI_10009978 superfamily 247068 870 968 4.65E-11 61.9457 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#4766 - CGI_10009978 superfamily 247068 744 841 8.93E-11 61.1753 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#4766 - CGI_10009978 superfamily 247068 1091 1195 1.13E-10 60.7901 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#4766 - CGI_10009978 superfamily 247068 309 420 3.59E-06 47.3082 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#4766 - CGI_10009978 superfamily 247068 1338 1451 0.000257356 41.5302 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#4767 - CGI_10009979 superfamily 218267 217 429 2.88E-35 136.41 cl04754 LMBR1 superfamily N - "LMBR1-like membrane protein; Members of this family are integral membrane proteins that are around 500 residues in length. LMBR1 is not involved in preaxial polydactyly, as originally thought. Vertebrate members of this family may play a role in limb development. A member of this family has been shown to be a lipocalin membrane receptor" Q#4768 - CGI_10009980 superfamily 248264 47 107 3.59E-06 44.5354 cl17710 DDE_4 superfamily N - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. 
This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#4770 - CGI_10022475 superfamily 247723 12 70 1.35E-29 101.158 cl17169 RRM_SF superfamily C - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#4771 - CGI_10022476 superfamily 247723 35 105 1.03E-25 97.3058 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. 
RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#4771 - CGI_10022476 superfamily 199156 122 138 0.00185652 35.1225 cl15298 zf-CCHC superfamily - - "Zinc knuckle; The zinc knuckle is a zinc binding motif composed of the following CX2CX4HX4C where X can be any amino acid. The motifs are mostly from retroviral gag proteins (nucleocapsid). Prototype structure is from HIV. Also contains members involved in eukaryotic gene regulation, such as C. elegans GLH-1. Structure is an 18-residue zinc finger." Q#4773 - CGI_10022478 superfamily 243142 391 560 4.23E-16 75.3555 cl02689 RUN superfamily - - "RUN domain; This domain is present in several proteins that are linked to the functions of GTPases in the Rap and Rab families. They could hence play important roles in multiple Ras-like GTPase signalling pathways. The domain comprises six conserved regions, which in some proteins have considerable insertions between them. The domain core is thought to take up a predominantly alpha fold, with basic amino acids in regions A and D possibly playing a functional role in interactions with Ras GTPases." Q#4774 - CGI_10022479 superfamily 245598 1 246 4.54E-130 384.389 cl11396 Patatin_and_cPLA2 superfamily - - "Patatins and Phospholipases; Patatin-like phospholipase. This family consists of various patatin glycoproteins from plants. The patatin protein accounts for up to 40% of the total soluble protein in potato tubers. Patatin is a storage protein, but it also has the enzymatic activity of a lipid acyl hydrolase, catalyzing the cleavage of fatty acids from membrane lipids. 
Members of this family have also been found in vertebrates. This family also includes the catalytic domain of cytosolic phospholipase A2 (PLA2; EC 3.1.1.4) hydrolyzes the sn-2-acyl ester bond of phospholipids to release arachidonic acid. At the active site, cPLA2 contains a serine nucleophile through which the catalytic mechanism is initiated. The active site is partially covered by a solvent-accessible flexible lid. cPLA2 displays interfacial activation as it exists in both "closed lid" and "open lid" forms." Q#4775 - CGI_10022480 superfamily 241629 16 125 1.73E-35 120.909 cl00133 SCP superfamily - - "SCP: SCP-like extracellular protein domain, found in eukaryotes and prokaryotes. This family includes plant pathogenesis-related protein 1 (PR-1), which accumulates after infections with pathogens, and may act as an anti-fungal agent or be involved in cell wall loosening. This family also includes CRISPs, mammalian cysteine-rich secretory proteins, which combine SCP with a C-terminal cysteine rich domain, and allergen 5 from vespid venom. Roles for CRISP, in response to pathogens, fertilization, and sperm maturation have been proposed. One member, Tex31 from the venom duct of Conus textile, has been shown to possess proteolytic activity sensitive to serine protease inhibitors. The human GAPR-1 protein has been reported to dimerize, and such a dimer may form an active site containing a catalytic triad. SCP has also been proposed to be a Ca++ chelating serine protease. The Ca++-chelating function would fit with various signaling processes that members of this family, such as the CRISPs, are involved in, and is supported by sequence and structural evidence of a conserved pocket containing two histidines and a glutamate. It also may explain how helothermine, a toxic peptide secreted by the beaded lizard, blocks Ca++ transporting ryanodine receptors. Little is known about the biological roles of the bacterial and archaeal SCP domains." 
Q#4776 - CGI_10022481 superfamily 241629 41 84 7.87E-08 45.7016 cl00133 SCP superfamily N - "SCP: SCP-like extracellular protein domain, found in eukaryotes and prokaryotes. This family includes plant pathogenesis-related protein 1 (PR-1), which accumulates after infections with pathogens, and may act as an anti-fungal agent or be involved in cell wall loosening. This family also includes CRISPs, mammalian cysteine-rich secretory proteins, which combine SCP with a C-terminal cysteine rich domain, and allergen 5 from vespid venom. Roles for CRISP, in response to pathogens, fertilization, and sperm maturation have been proposed. One member, Tex31 from the venom duct of Conus textile, has been shown to possess proteolytic activity sensitive to serine protease inhibitors. The human GAPR-1 protein has been reported to dimerize, and such a dimer may form an active site containing a catalytic triad. SCP has also been proposed to be a Ca++ chelating serine protease. The Ca++-chelating function would fit with various signaling processes that members of this family, such as the CRISPs, are involved in, and is supported by sequence and structural evidence of a conserved pocket containing two histidines and a glutamate. It also may explain how helothermine, a toxic peptide secreted by the beaded lizard, blocks Ca++ transporting ryanodine receptors. Little is known about the biological roles of the bacterial and archaeal SCP domains." Q#4777 - CGI_10022482 superfamily 241629 40 127 2.27E-15 68.7008 cl00133 SCP superfamily - - "SCP: SCP-like extracellular protein domain, found in eukaryotes and prokaryotes. This family includes plant pathogenesis-related protein 1 (PR-1), which accumulates after infections with pathogens, and may act as an anti-fungal agent or be involved in cell wall loosening. This family also includes CRISPs, mammalian cysteine-rich secretory proteins, which combine SCP with a C-terminal cysteine rich domain, and allergen 5 from vespid venom. 
Roles for CRISP, in response to pathogens, fertilization, and sperm maturation have been proposed. One member, Tex31 from the venom duct of Conus textile, has been shown to possess proteolytic activity sensitive to serine protease inhibitors. The human GAPR-1 protein has been reported to dimerize, and such a dimer may form an active site containing a catalytic triad. SCP has also been proposed to be a Ca++ chelating serine protease. The Ca++-chelating function would fit with various signaling processes that members of this family, such as the CRISPs, are involved in, and is supported by sequence and structural evidence of a conserved pocket containing two histidines and a glutamate. It also may explain how helothermine, a toxic peptide secreted by the beaded lizard, blocks Ca++ transporting ryanodine receptors. Little is known about the biological roles of the bacterial and archaeal SCP domains." Q#4778 - CGI_10022483 superfamily 216368 121 442 7.99E-77 249.357 cl03130 CAP_N superfamily - - Adenylate cyclase associated (CAP) N terminal; Adenylate cyclase associated (CAP) N terminal. Q#4778 - CGI_10022483 superfamily 197827 538 575 1.77E-09 54.0609 cl02725 CARP superfamily - - Domain in CAPs (cyclase-associated proteins) and X-linked retinitis pigmentosa 2 gene product; Domain in CAPs (cyclase-associated proteins) and X-linked retinitis pigmentosa 2 gene product. Q#4779 - CGI_10022484 superfamily 241574 378 558 3.93E-17 79.9373 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. 
Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#4781 - CGI_10022486 superfamily 221377 15 55 0.000318394 39.3743 cl13449 DUF3504 superfamily C - Domain of unknown function (DUF3504); This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 156 to 173 amino acids in length. Q#4781 - CGI_10022486 superfamily 192535 95 194 0.00533635 36.4198 cl18179 7TM_GPCR_Srsx superfamily C - Serpentine type 7TM GPCR chemoreceptor Srsx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srsx is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#4782 - CGI_10022487 superfamily 245303 43 411 0 516.727 cl10447 GH18_chitinase-like superfamily - - "The GH18 (glycosyl hydrolase, family 18) type II chitinases hydrolyze chitin, an abundant polymer of beta-1,4-linked N-acetylglucosamine (GlcNAc) which is a major component of the cell wall of fungi and the exoskeleton of arthropods. Chitinases have been identified in viruses, bacteria, fungi, protozoan parasites, insects, and plants. The structure of the GH18 domain is an eight-stranded beta/alpha barrel with a pronounced active-site cleft at the C-terminal end of the beta-barrel. The GH18 family includes chitotriosidase, chitobiase, hevamine, zymocin-alpha, narbonin, SI-CLP (stabilin-1 interacting chitinase-like protein), IDGF (imaginal disc growth factor), CFLE (cortical fragment-lytic enzyme) spore hydrolase, the type III and type V plant chitinases, the endo-beta-N-acetylglucosaminidases, and the chitolectins. 
The GH85 (glycosyl hydrolase, family 85) ENGases (endo-beta-N-acetylglucosaminidases) are closely related to the GH18 chitinases and are included in this alignment model." Q#4782 - CGI_10022487 superfamily 243119 440 488 2.84E-14 67.844 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#4784 - CGI_10022489 superfamily 241867 75 289 3.84E-20 88.3794 cl00446 Lactamase_B superfamily - - Metallo-beta-lactamase superfamily; Metallo-beta-lactamase superfamily. Q#4786 - CGI_10022491 superfamily 247856 19 79 4.96E-18 72.1953 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#4788 - CGI_10022493 superfamily 243540 622 848 2.73E-29 116.964 cl03831 HlyIII superfamily - - "Haemolysin-III related; Members of this family are integral membrane proteins. This family includes a protein with hemolytic activity from Bacillus cereus. It has been proposed that YOL002c encodes a Saccharomyces cerevisiae protein that plays a key role in metabolic pathways that regulate lipid and phosphate metabolism. 
In eukaryotes, members are seven-transmembrane pass molecules found to encode functional receptors with a broad range of apparent ligand specificities, including progestin and adipoQ receptors, and hence have been named PAQR proteins. The mammalian members include progesterone binding proteins. Unlike the case with GPCR receptor proteins, the evolutionary ancestry of the members of this family can be traced back to the Archaea." Q#4788 - CGI_10022493 superfamily 128778 98 212 6.64E-05 42.2519 cl17972 BBC superfamily - - B-Box C-terminal domain; Coiled coil region C-terminal to (some) B-Box domains Q#4788 - CGI_10022493 superfamily 241563 60 95 0.000165398 40.5404 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#4790 - CGI_10022495 superfamily 110440 356 382 9.38E-05 39.6985 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#4792 - CGI_10022497 superfamily 241563 38 73 9.12E-06 43.2368 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#4794 - CGI_10022500 superfamily 241727 130 184 0.00109484 36.1709 cl00252 NifX_NifB superfamily C - "This CD represents a family of iron-molybdenum cluster-binding proteins that includes NifB, NifX, and NifY, all of which are involved in the synthesis of an iron-molybdenum cofactor (FeMo-co) that binds the active site of the dinitrogenase enzyme. This domain is a predicted small-molecule-binding domain (SMBD) with an alpha/beta fold that is present either as a stand-alone domain (e.g. NifX and NifY) or fused to another conserved domain (e.g. NifB) however, its function is still undetermined. The SCOP database suggests that this domain is most similar to structures within the ribonuclease H superfamily. This conserved domain is represented in two of the three major divisions of life (bacteria and archaea)." Q#4795 - CGI_10022501 superfamily 247854 145 201 9.83E-08 47.3963 cl17300 Herpes_UL52 superfamily - - "Herpesviridae UL52/UL70 DNA primase; Herpes simplex virus type 1 DNA replication in host cells is known to be mediated by seven viral-encoded proteins, three of which form a heterotrimeric DNA helicase-primase complex. This complex consists of UL5, UL8, and UL52 subunits. Heterodimers consisting of UL5 and UL52 have been shown to retain both helicase and primase activities. Nevertheless, UL8 is still essential for replication: though it lacks any DNA binding or catalytic activities, it is involved in the transport of UL5-UL52 and it also interacts with other replication proteins. The molecular mechanisms of the UL5-UL52 catalytic activities are not known. While UL5 is associated with DNA helicase activity and UL52 with DNA primase activity, the helicase activity requires the interaction of UL5 and UL52. It is not known if the primase activity can be maintained by UL52 alone. The region encompassed by residues 610-636 of HSV1 UL52 is thought to contain a divalent metal cation binding motif. 
Indeed, this region contains several aspartate and glutamate residues that might be involved in divalent cation binding. The biological significance of UL52-UL8 interaction is not known. Yeast two-hybrid analysis together with immunoprecipitation experiments have shown that the HSV1 UL52 region between residues 366-914 is essential for this interaction, while the first 349 N-terminal residues are dispensable. This family also includes protein UL70 from cytomegalovirus (CMV, a subgroup of the Herpesviridae) strains, which, by analogy with UL52, is thought to have DNA primase activity. Indeed, CMV strains also possess a DNA helicase-primase complex, the other subunits being protein UL105 (with known similarity to HSV1 UL5) and protein UL102." Q#4796 - CGI_10022502 superfamily 247725 1130 1251 3.30E-50 175.532 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#4796 - CGI_10022502 superfamily 243096 927 1111 4.07E-27 110.85 cl02571 RhoGEF superfamily - - Guanine nucleotide exchange factor for Rho/Rac/Cdc42-like GTPases; Also called Dbl-homologous (DH) domain. It appears that PH domains invariably occur C-terminal to RhoGEF/DH domains. 
Q#4796 - CGI_10022502 superfamily 248318 1278 1332 7.85E-21 89.0321 cl17764 FYVE superfamily - - "FYVE domain; Zinc-binding domain; targets proteins to membrane lipids via interaction with phosphatidylinositol-3-phosphate, PI3P; present in Fab1, YOTB, Vac1, and EEA1;" Q#4796 - CGI_10022502 superfamily 247725 1424 1489 9.69E-19 83.9639 cl17171 PH-like superfamily N - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#4798 - CGI_10022504 superfamily 113585 171 281 1.50E-21 88.1499 cl04775 DUF716 superfamily - - "Family of unknown function (DUF716); This family is equally distributed in both metazoa and plants. Annotation associated with a member from Nicotiana tabacum suggest that it may be involved in response to viral attack in plants. However, no clear function has been assigned to this family." Q#4799 - CGI_10022505 superfamily 243066 1 79 4.37E-08 47.6856 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. 
The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#4801 - CGI_10022507 superfamily 246680 116 160 8.76E-08 46.1746 cl14633 DD_superfamily superfamily C - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#4802 - CGI_10022508 superfamily 221535 136 233 1.12E-15 73.3991 cl13730 Ipi1_N superfamily - - "Rix1 complex component involved in 60S ribosome maturation; This domain family is found in eukaryotes, and is typically between 91 and 105 amino acids in length. This family is the N terminal of Ipi1, a component of the Rix1 complex which works in conjunction with Rea1 to mature the 60S ribosome." 
Q#4803 - CGI_10022509 superfamily 202529 6 42 2.76E-06 39.4668 cl18228 MtN3_slv superfamily NC - "Sugar efflux transporter for intercellular exchange; This family includes proteins such as drosophila saliva, MtN3 involved in root nodule development and a protein involved in activation and expression of recombination activation genes (RAGs). Although the molecular function of these proteins is unknown, they are almost certainly transmembrane proteins. This family contains a region of two transmembrane helices that is found in two copies in most members of the family. This family also contains specific sugar efflux transporters that are essential for the maintenance of animal blood glucose levels, plant nectar production, and plant seed and pollen development. In many organisms it mediates glucose transport; in Arabidopsis it is necessary for pollen viability; and two of the rice homologues are specifically exploited by bacterial pathogens for virulence by means of direct binding of a bacterial effector to the SWEET promoter." Q#4805 - CGI_10003526 superfamily 242406 4 105 1.62E-08 51.0529 cl01271 DUF1768 superfamily N - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. Q#4805 - CGI_10003526 superfamily 248264 150 198 0.00567579 35.2906 cl17710 DDE_4 superfamily N - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." 
Q#4811 - CGI_10010493 superfamily 241563 108 141 0.000276856 38.9996 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#4814 - CGI_10010496 superfamily 110440 222 246 0.00439341 34.3057 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#4815 - CGI_10010497 superfamily 241733 29 97 6.11E-38 124.56 cl00259 Sm_like superfamily N - "Sm and related proteins; The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes." Q#4816 - CGI_10010498 superfamily 194822 166 277 2.91E-40 141.125 cl04283 Rad10 superfamily - - "Binding domain of DNA repair protein Ercc1 (rad10/Swi10); Ercc1 and XPF (xeroderma pigmentosum group F-complementing protein) are two structure-specific endonucleases of a class of seven containing an ERCC4 domain. 
Together they form an obligate complex that functions primarily in nucleotide excision repair (NER), a versatile pathway able to detect and remove a variety of DNA lesions induced by UV light and environmental carcinogens, and secondarily in DNA interstrand cross-link repair and telomere maintenance. This domain in fact binds simultaneously to both XPF and single-stranded DNA; this ternary complex explains the important role of Ercc1 in targeting its catalytic XPF partner to the NER pre-incision complex." Q#4816 - CGI_10010498 superfamily 109247 358 437 1.87E-09 54.1201 cl02816 Ribosomal_L2 superfamily - - "Ribosomal Proteins L2, RNA binding domain; Ribosomal Proteins L2, RNA binding domain. " Q#4817 - CGI_10010499 superfamily 241563 68 109 3.64E-07 47.474 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#4817 - CGI_10010499 superfamily 241563 28 59 0.00195103 36.3032 cl00034 BBOX superfamily N - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#4818 - CGI_10010500 superfamily 202823 35 100 8.04E-12 57.9341 cl08408 Ribosomal_L2_C superfamily NC - "Ribosomal Proteins L2, C-terminal domain; Ribosomal Proteins L2, C-terminal domain. " Q#4821 - CGI_10010503 superfamily 241570 292 398 8.52E-07 48.091 cl00047 CAP_ED superfamily - - "effector domain of the CAP family of transcription factors; members include CAP (or cAMP receptor protein (CRP)), which binds cAMP, FNR (fumarate and nitrate reduction), which uses an iron-sulfur cluster to sense oxygen) and CooA, a heme containing CO sensor. 
In all cases binding of the effector leads to conformational changes and the ability to activate transcription. Cyclic nucleotide-binding domain similar to CAP are also present in cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) and vertebrate cyclic nucleotide-gated ion-channels. Cyclic nucleotide-monophosphate binding domain; proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues; the best studied is the prokaryotic catabolite gene activator, CAP, where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure; three conserved glycine residues are thought to be essential for maintenance of the structural integrity of the beta-barrel; CooA is a homodimeric transcription factor that belongs to CAP family; cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain; cAPK's are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain; cGPK's are single chain enzymes that include the two copies of the domain in their N-terminal section; also found in vertebrate cyclic nucleotide-gated ion-channels" Q#4823 - CGI_10010505 superfamily 243035 343 469 6.43E-31 115.41 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. 
Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. 
Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#4823 - CGI_10010505 superfamily 243051 189 334 2.18E-16 76.2352 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal cord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#4824 - CGI_10010506 superfamily 248097 145 222 8.11E-12 59.9714 cl17543 C1q superfamily C - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#4824 - CGI_10010506 superfamily 222429 8 53 2.24E-09 52.2428 cl18676 Myb_DNA-bind_5 superfamily C - Myb/SANT-like DNA-binding domain; This presumed domain appears to be related to other Myb/SANT like DNA binding domains. This family is greatly expanded in arthropods and higher eukaryotes. 
Q#4826 - CGI_10010508 superfamily 241613 104 137 4.93E-06 41.0382 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#4826 - CGI_10010508 superfamily 246918 41 92 1.00E-11 56.8263 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#4827 - CGI_10010509 superfamily 218273 69 165 1.02E-37 126.874 cl04760 ETC_C1_NDUFA4 superfamily - - ETC complex I subunit conserved region; Family of pankaryotic NADH-ubiquinone oxidoreductase subunits (EC:1.6.5.3) (EC:1.6.99.3) from complex I of the electron transport chain initially identified in Neurospora crassa as a 21 kDa protein. Q#4829 - CGI_10010511 superfamily 245595 681 948 2.71E-141 425.749 cl11393 Peptidase_M14_like superfamily - - "M14 family of metallocarboxypeptidases and related proteins; The M14 family of metallocarboxypeptidases (MCPs), also known as funnelins, are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Two major subfamilies of the M14 family, defined based on sequence and structural homology, are the A/B and N/E subfamilies. 
Enzymes belonging to the A/B subfamily are normally synthesized as inactive precursors containing preceding signal peptide, followed by an N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases. The A/B enzymes can be further divided based on their substrate specificity; Carboxypeptidase A-like (CPA-like) enzymes favor hydrophobic residues while carboxypeptidase B-like (CPB-like) enzymes only cleave the basic residues lysine or arginine. The A forms have slightly different specificities, with Carboxypeptidase A1 (CPA1) preferring aliphatic and small aromatic residues, and CPA2 preferring the bulky aromatic side chains. Enzymes belonging to the N/E subfamily enzymes are not produced as inactive precursors and instead rely on their substrate specificity and subcellular compartmentalization to prevent inappropriate cleavage. They contain an extra C-terminal transthyretin-like domain, thought to be involved in folding or formation of oligomers. MCPs can also be classified based on their involvement in specific physiological processes; the pancreatic MCPs participate only in alimentary digestion and include carboxypeptidase A and B (A/B subfamily), while others, namely regulatory MCPs or the N/E subfamily, are involved in more selective reactions, mainly in non-digestive tissues and fluids, acting on blood coagulation/fibrinolysis, inflammation and local anaphylaxis, pro-hormone and neuropeptide processing, cellular response and others. Another MCP subfamily, is that of succinylglutamate desuccinylase /aspartoacylase, which hydrolyzes N-acetyl-L-aspartate (NAA), and deficiency in which is the established cause of Canavan disease. Another subfamily (referred to as subfamily C) includes an exceptional type of activity in the MCP family, that of dipeptidyl-peptidase activity of gamma-glutamyl-(L)-meso-diaminopimelate peptidase I which is involved in bacterial cell wall metabolism." 
Q#4831 - CGI_10010513 superfamily 247799 70 131 1.37E-09 54.8735 cl17245 KH-I superfamily - - "K homology RNA-binding domain, type I. KH binds single-stranded RNA or DNA. It is found in a wide variety of proteins including ribosomal proteins, transcription factors and post-transcriptional modifiers of mRNA. There are two different KH domains that belong to different protein folds, but they share a single KH motif. The KH motif is folded into a beta alpha alpha beta unit. In addition to the core, type II KH domains (e.g. ribosomal protein S3) include N-terminal extension and type I KH domains (e.g. hnRNP K) contain C-terminal extension." Q#4831 - CGI_10010513 superfamily 247799 190 233 1.34E-06 46.0139 cl17245 KH-I superfamily N - "K homology RNA-binding domain, type I. KH binds single-stranded RNA or DNA. It is found in a wide variety of proteins including ribosomal proteins, transcription factors and post-transcriptional modifiers of mRNA. There are two different KH domains that belong to different protein folds, but they share a single KH motif. The KH motif is folded into a beta alpha alpha beta unit. In addition to the core, type II KH domains (e.g. ribosomal protein S3) include N-terminal extension and type I KH domains (e.g. hnRNP K) contain C-terminal extension." 
Q#4831 - CGI_10010513 superfamily 247792 483 529 0.000446643 38.1956 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#4834 - CGI_10028588 superfamily 241574 299 450 1.04E-20 88.0492 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#4835 - CGI_10028589 superfamily 245309 51 134 0.0067617 34.0474 cl10471 LU superfamily - - "Ly-6 antigen / uPA receptor -like domain; occurs singly in GPI-linked cell-surface glycoproteins (Ly-6 family,CD59, thymocyte B cell antigen, Sgp-2) or as three-fold repeated domain in urokinase-type plasminogen activator receptor. Topology of these domains is similar to that of snake venom neurotoxins." 
Q#4837 - CGI_10028591 superfamily 243066 23 121 1.47E-13 68.0277 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#4837 - CGI_10028591 superfamily 222412 826 859 0.00326774 36.5797 cl16432 Tnp_zf-ribbon_2 superfamily - - DDE_Tnp_1-like zinc-ribbon; This zinc-ribbon domain is frequently found at the C-terminal of proteins derived from transposable elements. Q#4844 - CGI_10028598 superfamily 222150 590 615 3.93E-05 41.6085 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#4844 - CGI_10028598 superfamily 222150 320 345 0.00258782 36.2157 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. 
Q#4845 - CGI_10028599 superfamily 247684 13 186 3.29E-19 83.7923 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#4846 - CGI_10028600 superfamily 247684 13 186 1.03E-19 84.9479 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#4847 - CGI_10028601 superfamily 247684 13 186 1.16E-19 84.9479 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#4848 - CGI_10028602 superfamily 247684 13 176 2.90E-18 78.0143 cl17037 NBD_sugar-kinase_HSP70_actin superfamily C - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#4849 - CGI_10028603 superfamily 247684 13 186 5.59E-19 83.0219 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#4849 - CGI_10028603 superfamily 247684 290 336 0.00916239 36.2527 cl17037 NBD_sugar-kinase_HSP70_actin superfamily N - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#4850 - CGI_10028604 superfamily 247684 13 178 6.53E-18 78.3995 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#4855 - CGI_10028609 superfamily 202685 453 498 0.00280353 36.9628 cl04160 DUF1759 superfamily NC - Protein of unknown function (DUF1759); This is a family of proteins of unknown function. Most of the members are gag-polyproteins. Q#4856 - CGI_10028610 superfamily 248097 7 43 1.77E-08 46.4894 cl17543 C1q superfamily C - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. 
Q#4857 - CGI_10028611 superfamily 243181 213 339 4.96E-28 110.709 cl02783 TopoII_MutL_Trans superfamily - - "MutL_Trans: transducer domain, having a ribosomal S5 domain 2-like fold, conserved in the C-terminal domain of type II DNA topoisomerases (Topo II) and DNA mismatch repair (MutL/MLH1/PMS2) proteins. This transducer domain is homologous to the second domain of the DNA gyrase B subunit, which is known to be important in nucleotide hydrolysis and the transduction of structural signals from ATP-binding site to the DNA breakage/reunion regions of the enzymes. The GyrB dimerizes in response to ATP binding, and is homologous to the N-terminal half of eukaryotic Topo II and the ATPase fragment of MutL. Type II DNA topoisomerases catalyze the ATP-dependent transport of one DNA duplex through another, in the process generating transient double strand breaks via covalent attachments to both DNA strands at the 5' positions. Included in this group are proteins similar to human MLH1 and PMS2. MLH1 forms a heterodimer with PMS2 which functions in meiosis and in DNA mismatch repair (MMR). Cells lacking either hMLH1 or hPMS2 have a strong mutator phenotype and display microsatellite instability (MSI). Mutation in hMLH1 accounts for a large fraction of Lynch syndrome (HNPCC) families." Q#4857 - CGI_10028611 superfamily 241593 30 151 7.21E-06 45.3302 cl00075 HATPase_c superfamily - - "Histidine kinase-like ATPases; This family includes several ATP-binding proteins for example: histidine kinase, DNA gyrase B, topoisomerases, heat shock protein HSP90, phytochrome-like ATPases and DNA mismatch repair proteins" Q#4857 - CGI_10028611 superfamily 241597 611 659 0.000174845 40.68 cl00082 HMG-box superfamily - - "High Mobility Group (HMG)-box is found in a variety of eukaryotic chromosomal proteins and transcription factors. HMGs bind to the minor groove of DNA and have been classified by DNA binding preferences. 
Two phylogenetically distinct groups of Class I proteins bind DNA in a sequence specific fashion and contain a single HMG box. One group (SOX-TCF) includes transcription factors, TCF-1, -3, -4; and also SRY and LEF-1, which bind four-way DNA junctions and duplex DNA targets. The second group (MATA) includes fungal mating type gene products MC, MATA1 and Ste11. Class II and III proteins (HMGB-UBF) bind DNA in a non-sequence specific fashion and contain two or more tandem HMG boxes. Class II members include non-histone chromosomal proteins, HMG1 and HMG2, which bind to bent or distorted DNA such as four-way DNA junctions, synthetic DNA cruciforms, kinked cisplatin-modified DNA, DNA bulges, cross-overs in supercoiled DNA, and can cause looping of linear DNA. Class III members include nucleolar and mitochondrial transcription factors, UBF and mtTF1, which bind four-way DNA junctions." Q#4858 - CGI_10028612 superfamily 248458 75 222 3.63E-06 47.6937 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. 
Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#4858 - CGI_10028612 superfamily 248458 324 499 1.63E-05 45.7677 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 
Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#4861 - CGI_10028615 superfamily 248458 74 440 2.12E-05 44.9973 cl17904 MFS superfamily - - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. 
Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#4862 - CGI_10028616 superfamily 222150 213 240 0.000192583 38.9121 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#4863 - CGI_10028617 superfamily 220666 33 253 1.78E-55 182.56 cl10951 Tmemb_185A superfamily - - "Transmembrane Fragile-X-F protein; This is a family of conserved transmembrane proteins that appear in humans to be expressed from a region upstream of the FragileXF site and to be intimately linked with the Fragile-X syndrome. Absence of TMEM185A does not necessarily lead to developmental delay, but might in combination with other, yet unknown, factors. Otherwise, the lack of the TMEM185A protein is either disposable (redundant) or its function can be complemented by the highly similar chromosome 2 retro-pseudogene product, TMEM185B." Q#4864 - CGI_10028618 superfamily 243072 18 119 4.87E-25 94.7578 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#4865 - CGI_10028619 superfamily 222150 139 166 0.000895353 36.9861 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#4868 - CGI_10028622 superfamily 150854 68 250 1.30E-32 120.212 cl10929 MRP-S22 superfamily C - "Mitochondrial 28S ribosomal protein S22; This is the conserved N-terminus and central portion of the mitochondrial small subunit 28S ribosomal protein S22. 
Mammalian mitochondria carry out the synthesis of 13 polypeptides that are essential for oxidative phosphorylation and, hence, for the synthesis of the majority of the ATP used by eukaryotic organisms. The number of proteins produced by prokaryotes is smaller, reflected in the lower number of ribosomal proteins present in them." Q#4869 - CGI_10028623 superfamily 220658 244 360 9.82E-06 46.5769 cl12382 Rogdi_lz superfamily NC - Rogdi leucine zipper containing protein; This is a family of conserved proteins which have been suggested as containing leucine-zipper domains. A leucine zipper domain is a region of 30 amino acids with leucines repeating every seven or eight residues; these proteins do have many such leucines. The protein in Drosophila comes from the gene ROGDI. Q#4873 - CGI_10028627 superfamily 241559 405 455 2.24E-13 66.5655 cl00030 CH superfamily C - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." Q#4873 - CGI_10028627 superfamily 220715 11 164 2.71E-21 90.4671 cl11025 NT-C2 superfamily - - "N-terminal C2 in EEIG1 and EHBP1 proteins; This version of the C2 domain was initially identified in the vertebrate estrogen early-induced gene 1 (EEIG1), and its Drosophila ortholog required for uptake of dsRNA via the endocytotic machinery to induce RNAi silencing. 
It is also in C.elegans ortholog Sym-3 (SYnthetic lethal with Mec-3) and the mammalian protein EHBP1 (EH domain Binding Protein-1) that regulates endocytotic recycling and two plant proteins, RPG that regulates Rhizobium-directed polar growth and PMI1 (Plastid Movement Impaired 1) that is essential for intracellular movement of chloroplasts in response to blue light." Q#4874 - CGI_10028628 superfamily 241754 7 316 2.00E-74 245.223 cl00286 Motor_domain superfamily - - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. Q#4875 - CGI_10028629 superfamily 222150 566 591 0.000431184 38.9121 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#4875 - CGI_10028629 superfamily 246975 553 574 0.000899292 38.0969 cl15478 zf-C2H2 superfamily - - "Zinc finger, C2H2 type; The C2H2 zinc finger is the classical zinc finger domain. The two conserved cysteines and histidines co-ordinate a zinc ion. The following pattern describes the zinc finger. #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C] Where X can be any amino acid, and numbers in brackets indicate the number of residues. The positions marked # are those that are important for the stable fold of the zinc finger. The final position can be either his or cys. The C2H2 zinc finger is composed of two short beta strands followed by an alpha helix. The amino terminal part of the helix binds the major groove in DNA binding zinc fingers. The accepted consensus binding sequence for Sp1 is usually defined by the asymmetric hexanucleotide core GGGCGG but this sequence does not include, among others, the GAG (=CTC) repeat that constitutes a high-affinity site for Sp1 binding to the wt1 promoter." 
Q#4877 - CGI_10028631 superfamily 247724 1 151 6.77E-48 155.394 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#4878 - CGI_10028632 superfamily 247724 6 165 3.09E-50 161.943 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#4879 - CGI_10028633 superfamily 247724 11 158 2.00E-41 139.216 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. 
Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#4880 - CGI_10028634 superfamily 247724 9 165 2.29E-41 138.831 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#4882 - CGI_10028636 superfamily 247724 9 165 4.26E-45 148.461 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. 
This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#4883 - CGI_10028637 superfamily 247724 19 108 3.76E-31 110.326 cl17170 Ras_like_GTPase superfamily N - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. 
The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#4884 - CGI_10028638 superfamily 247724 10 170 1.72E-43 144.609 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#4886 - CGI_10028640 superfamily 247724 1 147 8.13E-31 111.097 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. 
The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#4887 - CGI_10028641 superfamily 247724 11 173 1.25E-59 186.21 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#4890 - CGI_10028644 superfamily 247724 10 166 1.87E-40 137.29 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. 
Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#4891 - CGI_10028645 superfamily 217648 28 393 5.44E-75 243.414 cl15557 Glyco_hydro_65m superfamily - - "Glycosyl hydrolase family 65 central catalytic domain; This family of glycosyl hydrolases contains vacuolar acid trehalase and maltose phosphorylase. Maltose phosphorylase (MP) is a dimeric enzyme that catalyzes the conversion of maltose and inorganic phosphate into beta-D-glucose-1-phosphate and glucose. The central domain is the catalytic domain, which binds a phosphate ion that is proximal to the highly conserved Glu. The arrangement of the phosphate and the glutamate is thought to cause nucleophilic attack on the anomeric carbon atom. The catalytic domain also forms the majority of the dimerisation interface." Q#4892 - CGI_10028646 superfamily 241941 7 96 3.48E-16 72.2417 cl00551 Acylphosphatase superfamily - - Acylphosphatase; Acylphosphatase. Q#4893 - CGI_10028647 superfamily 152354 1706 1795 3.35E-38 140.067 cl13371 DUF3437 superfamily - - Domain of unknown function (DUF3437); This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 142 to 163 amino acids in length. Q#4894 - CGI_10028648 superfamily 245602 375 713 0 636.923 cl11402 GH31 superfamily - - "The enzymes of glycosyl hydrolase family 31 (GH31) occur in prokaryotes, eukaryotes, and archaea with a wide range of hydrolytic activities, including alpha-glucosidase (glucoamylase and sucrase-isomaltase), alpha-xylosidase, 6-alpha-glucosyltransferase, 3-alpha-isomaltosyltransferase and alpha-1,4-glucan lyase. All GH31 enzymes cleave a terminal carbohydrate moiety from a substrate that varies considerably in size, depending on the enzyme, and may be either a starch or a glycoprotein. 
In most cases, the pyranose moiety recognized in subsite -1 of the substrate binding site is an alpha-D-glucose, though some GH31 family members show a preference for alpha-D-xylose. Several GH31 enzymes can accommodate both glucose and xylose and different levels of discrimination between the two have been observed. Most characterized GH31 enzymes are alpha-glucosidases. In mammals, GH31 members with alpha-glucosidase activity are implicated in at least three distinct biological processes. The lysosomal acid alpha-glucosidase (GAA) is essential for glycogen degradation and a deficiency or malfunction of this enzyme causes glycogen storage disease II, also known as Pompe disease. In the endoplasmic reticulum, alpha-glucosidase II catalyzes the second step in the N-linked oligosaccharide processing pathway that constitutes part of the quality control system for glycoprotein folding and maturation. The intestinal enzymes sucrase-isomaltase (SI) and maltase-glucoamylase (MGAM) play key roles in the final stage of carbohydrate digestion, making alpha-glucosidase inhibitors useful in the treatment of type 2 diabetes. GH31 alpha-glycosidases are retaining enzymes that cleave their substrates via an acid/base-catalyzed, double-displacement mechanism involving a covalent glycosyl-enzyme intermediate. Two aspartic acid residues have been identified as the catalytic nucleophile and the acid/base, respectively." Q#4894 - CGI_10028648 superfamily 222390 245 315 9.63E-20 85.2546 cl16409 Gal_mutarotas_2 superfamily - - "Galactose mutarotase-like; This family is found N-terminal to glycosyl-hydrolase domains, and appears to be similar to the galactose mutarotase superfamily." 
Q#4895 - CGI_10028649 superfamily 245818 47 184 7.63E-43 149.28 cl11966 Rel-Spo_like superfamily - - "RelA- and SpoT-like ppGpp Synthetases and Hydrolases, catalytic domain; The Rel-Spo superfamily includes the catalytic domains of Escherichia coli ppGpp synthetase (RelA), ppGpp synthetase/hydrolase (SpoT), and related proteins. RelA synthesizes (p)ppGpp in response to amino-acid starvation and in association with ribosomes. (p)ppGpp triggers the bacterial stringent response. SpoT catalyzes (p)ppGpp synthesis under carbon limitation in a ribosome-independent manner. It also catalyzes (p)ppGpp degradation. Gram-negative bacteria have two enzymes involved in (p)ppGpp metabolism while most Gram-positive organisms have a single Rel-Spo enzyme (Rel), which both synthesizes and degrades (p)ppGpp. The Arabidopsis thaliana Rel-Spo proteins, At-RSH1,-2, and-3 appear to regulate a rapid (p)ppGpp-mediated response to pathogens and other stresses. This catalytic domain is found in association with an N-terminal HD domain and a C-terminal metal dependent phosphohydrolase domain (TGS). Some Rel-Spo proteins also have a C-terminal regulatory ACT domain." Q#4895 - CGI_10028649 superfamily 221674 222 281 1.59E-09 54.3568 cl14990 PolyA_pol_RNAbd superfamily - - Probable RNA and SrmB- binding site of polymerase A; This region encompasses much of the RNA and SrmB binding motifs on polymerase A. Q#4897 - CGI_10028651 superfamily 241580 120 196 1.74E-50 166.19 cl00061 FH superfamily - - "Forkhead (FH), also known as a "winged helix". FH is named for the Drosophila fork head protein, a transcription factor which promotes terminal rather than segmental development. This family of transcription factor domains, which bind to B-DNA as monomers, are also found in the Hepatocyte nuclear factor (HNF) proteins, which provide tissue-specific gene regulation. The structure contains 2 flexible loops or "wings" in the C-terminal region, hence the term winged helix." 
Q#4899 - CGI_10028653 superfamily 243306 319 531 8.30E-116 348.767 cl03114 RNase_PH superfamily - - "RNase PH-like 3'-5' exoribonucleases; RNase PH-like 3'-5' exoribonucleases are enzymes that catalyze the 3' to 5' processing and decay of RNA substrates. Evolutionarily related members can be found in prokaryotes, archaea, and eukaryotes. Bacterial ribonuclease PH contains a single copy of this domain, and removes nucleotide residues following the -CCA terminus of tRNA. Polyribonucleotide nucleotidyltransferase (PNPase) contains two tandem copies of the domain and is involved in mRNA degradation in a 3'-5' direction. Archaeal exosomes contain two individually encoded RNase PH-like 3'-5' exoribonucleases and are required for 3' processing of the 5.8S rRNA. The eukaryotic exosome core is composed of six individually encoded RNase PH-like subunits, but it is not a phosphorolytic enzyme per se; it directly associates with Rrp44 and Rrp6, which are hydrolytic exoribonucleases related to bacterial RNase II/R and RNase D. All members of the RNase PH-like family form ring structures by oligomerization of six domains or subunits, except for a total of 3 subunits with tandem repeats in the case of PNPase, with a central channel through which the RNA substrate must pass to gain access to the phosphorolytic active sites." Q#4899 - CGI_10028653 superfamily 243306 10 226 6.59E-95 294.81 cl03114 RNase_PH superfamily - - "RNase PH-like 3'-5' exoribonucleases; RNase PH-like 3'-5' exoribonucleases are enzymes that catalyze the 3' to 5' processing and decay of RNA substrates. Evolutionarily related members can be found in prokaryotes, archaea, and eukaryotes. Bacterial ribonuclease PH contains a single copy of this domain, and removes nucleotide residues following the -CCA terminus of tRNA. Polyribonucleotide nucleotidyltransferase (PNPase) contains two tandem copies of the domain and is involved in mRNA degradation in a 3'-5' direction. 
Archaeal exosomes contain two individually encoded RNase PH-like 3'-5' exoribonucleases and are required for 3' processing of the 5.8S rRNA. The eukaryotic exosome core is composed of six individually encoded RNase PH-like subunits, but it is not a phosphorolytic enzyme per se; it directly associates with Rrp44 and Rrp6, which are hydrolytic exoribonucleases related to bacterial RNase II/R and RNase D. All members of the RNase PH-like family form ring structures by oligomerization of six domains or subunits, except for a total of 3 subunits with tandem repeats in the case of PNPase, with a central channel through which the RNA substrate must pass to gain access to the phosphorolytic active sites." Q#4899 - CGI_10028653 superfamily 247799 542 599 0.000205222 39.8274 cl17245 KH-I superfamily - - "K homology RNA-binding domain, type I. KH binds single-stranded RNA or DNA. It is found in a wide variety of proteins including ribosomal proteins, transcription factors and post-transcriptional modifiers of mRNA. There are two different KH domains that belong to different protein folds, but they share a single KH motif. The KH motif is folded into a beta alpha alpha beta unit. In addition to the core, type II KH domains (e.g. ribosomal protein S3) include N-terminal extension and type I KH domains (e.g. hnRNP K) contain C-terminal extension." Q#4903 - CGI_10028657 superfamily 247755 88 296 4.95E-76 244.488 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. 
ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#4903 - CGI_10028657 superfamily 247789 401 574 1.00E-20 90.7809 cl17235 ABC2_membrane superfamily - - ABC-2 type transporter; ABC-2 type transporter. Q#4904 - CGI_10028658 superfamily 247755 52 277 5.05E-70 228.695 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#4904 - CGI_10028658 superfamily 247789 370 582 2.72E-13 68.4394 cl17235 ABC2_membrane superfamily - - ABC-2 type transporter; ABC-2 type transporter. Q#4905 - CGI_10028659 superfamily 245201 38 284 3.41E-66 209.4 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." 
Q#4906 - CGI_10028660 superfamily 245201 71 317 3.00E-55 182.051 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#4907 - CGI_10028661 superfamily 243092 349 561 8.22E-24 103.184 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#4907 - CGI_10028661 superfamily 243092 875 1006 1.47E-06 50.026 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#4907 - CGI_10028661 superfamily 243092 140 240 0.000521729 41.9368 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#4907 - CGI_10028661 superfamily 243092 667 714 0.00347009 39.6256 cl02567 WD40 superfamily NC - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#4909 - CGI_10028663 superfamily 241599 106 154 1.04E-19 80.3652 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#4909 - CGI_10028663 superfamily 146451 250 266 2.10E-06 43.5019 cl08404 OAR superfamily - - OAR domain; OAR domain. Q#4911 - CGI_10028665 superfamily 222347 234 328 0.00107564 39.1644 cl16365 TraF_2 superfamily NC - "F plasmid transfer operon, TraF, protein; F plasmid transfer operon, TraF, protein. 
" Q#4913 - CGI_10028667 superfamily 147194 201 243 0.00347237 35.3499 cl04832 RAMP superfamily N - Receptor activity modifying family; The calcitonin-receptor-like receptor can function as either a calcitonin-gene-related peptide or an adrenomedullin receptor. The receptors function is modified by receptor-activity-modifying protein or RAMP. RAMPs are single-transmembrane-domain proteins. Q#4915 - CGI_10028669 superfamily 245202 58 149 6.99E-51 161.26 cl09927 S1_like superfamily - - "S1_like: Ribosomal protein S1-like RNA-binding domain. Found in a wide variety of RNA-associated proteins. Originally identified in S1 ribosomal protein. This superfamily also contains the Cold Shock Domain (CSD), which is a homolog of the S1 domain. Both domains are members of the Oligonucleotide/oligosaccharide Binding (OB) fold." Q#4915 - CGI_10028669 superfamily 206550 2 40 6.97E-08 46.2344 cl16842 ECR1_N superfamily - - Exosome complex exonuclease RRP4 N-terminal region; ECR1_N is an N-terminal region of the exosome complex exonuclease RRP proteins. It is a G-rich domain which structurally is a rudimentary single hybrid fold with a permuted topology. Q#4917 - CGI_10028671 superfamily 247723 130 178 1.91E-22 87.7615 cl17169 RRM_SF superfamily N - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. 
The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#4919 - CGI_10028673 superfamily 219225 7 61 0.00803958 31.2795 cl06114 FAIM1 superfamily N - "Fas apoptotic inhibitory molecule (FAIM1); This family consists of several fas apoptotic inhibitory molecule (FAIM1) proteins. FAIM expression is upregulated in B cells by anti-Ig treatment that induces Fas-resistance, and overexpression of FAIM diminishes sensitivity to Fas-mediated apoptosis of B and non-B cell lines. FAIM1 is highly evolutionarily conserved and is widely expressed in murine tissues, suggesting that FAIM plays an important role in cellular physiology." Q#4920 - CGI_10028674 superfamily 219225 50 131 0.00166903 35.5167 cl06114 FAIM1 superfamily N - "Fas apoptotic inhibitory molecule (FAIM1); This family consists of several fas apoptotic inhibitory molecule (FAIM1) proteins. FAIM expression is upregulated in B cells by anti-Ig treatment that induces Fas-resistance, and overexpression of FAIM diminishes sensitivity to Fas-mediated apoptosis of B and non-B cell lines. FAIM1 is highly evolutionarily conserved and is widely expressed in murine tissues, suggesting that FAIM plays an important role in cellular physiology."
Q#4921 - CGI_10028675 superfamily 243092 181 411 1.08E-12 67.36 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#4926 - CGI_10028680 superfamily 242886 58 161 2.82E-45 148.868 cl02107 Evr1_Alr superfamily N - Erv1 / Alr family; Biogenesis of Fe/S clusters involves a number of essential mitochondrial proteins. Erv1p of Saccharomyces cerevisiae mitochondria is required for the maturation of Fe/S proteins in the cytosol. The ALR (augmenter of liver regeneration) represents a mammalian orthologue of yeast Erv1p. Both Erv1p and full-length ALR are located in the mitochondrial intermembrane space and are thought to operate downstream of the mitochondrial ABC transporter.
Q#4928 - CGI_10028682 superfamily 148425 14 180 7.96E-09 52.4506 cl06049 NPDC1 superfamily N - Neural proliferation differentiation control-1 protein (NPDC1); This family consists of several neural proliferation differentiation control-1 (NPDC1) proteins. NPDC1 plays a role in the control of neural cell proliferation and differentiation. It has been suggested that NPDC1 may be involved in the development of several secretion glands. This family also contains the C-terminal region of the C. elegans protein CAB-1 which is known to interact with AEX-3. Q#4929 - CGI_10028683 superfamily 241559 24 125 4.83E-19 84.2847 cl00030 CH superfamily - - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." Q#4929 - CGI_10028683 superfamily 221433 805 936 3.32E-35 132.037 cl13553 DUF3585 superfamily - - Protein of unknown function (DUF3585); This domain is found in eukaryotes. This domain is typically between 135 and 149 amino acids in length and is found associated with pfam00307. Q#4929 - CGI_10028683 superfamily 243050 177 210 7.59E-09 53.5682 cl02475 LIM superfamily C - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. 
LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#4932 - CGI_10028686 superfamily 111745 554 602 1.24E-23 95.8325 cl17930 zf-MIZ superfamily - - MIZ/SP-RING zinc finger; This domain has SUMO (small ubiquitin-like modifier) ligase activity and is involved in DNA repair and chromosome organisation. Q#4933 - CGI_10028688 superfamily 201526 126 192 1.04E-28 104.928 cl09522 Synaptobrevin superfamily C - Synaptobrevin; Synaptobrevin. Q#4933 - CGI_10028688 superfamily 222370 28 89 1.08E-14 66.3925 cl16386 Longin superfamily C - "Regulated-SNARE-like domain; Longin is one of the approximately 26 components required for transporting proteins from the ER to the plasma membrane, via the Golgi apparatus. It is necessary for the steps of the transfer from the ER to the Golgi complex. Longins are the only R-SNAREs that are common to all eukaryotes, and they are characterized by a conserved N-terminal domain with a profilin-like fold called a longin domain." Q#4934 - CGI_10028689 superfamily 245201 34 302 4.05E-58 202.728 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." 
Q#4934 - CGI_10028689 superfamily 220736 563 701 6.66E-23 96.995 cl11068 PTEN_C2 superfamily - - "C2 domain of PTEN tumour-suppressor protein; This is the C2 domain-like domain, in greek key form, of the PTEN protein, phosphatidyl-inositol triphosphate phosphatase, and it is the C-terminus. This domain may well include a CBR3 loop which means it plays a central role in membrane binding. This domain associates across an extensive interface with the N-terminal phosphatase domain DSPc (pfam00782) suggesting that the C2 domain productively positions the catalytic part of the protein onto the membrane." Q#4934 - CGI_10028689 superfamily 241574 466 549 0.00702567 36.5679 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#4935 - CGI_10028690 superfamily 243555 233 402 0.000901499 40.8374 cl03871 Chitin_bind_3 superfamily N - "Chitin binding domain; This domain is found associated with a wide variety of cellulose binding domain. This domain however is a chitin binding domain. This domain is found in isolation in baculoviral spheroidins and spindolins, protein of unknown function." Q#4937 - CGI_10028692 superfamily 216807 3 189 1.94E-48 159.739 cl18379 DUF106 superfamily - - Integral membrane protein DUF106; This archaebacterial protein family has no known function. Members are predicted to be integral membrane proteins. 
Q#4938 - CGI_10028693 superfamily 245010 11 142 5.59E-21 83.4331 cl09111 Prefoldin superfamily - - "Prefoldin is a hexameric molecular chaperone complex, found in both eukaryotes and archaea, that binds and stabilizes newly synthesized polypeptides allowing them to fold correctly. The complex contains two alpha and four beta subunits, the two subunits being evolutionarily related. In archaea, there is usually only one gene for each subunit while in eukaryotes there are two or more paralogous genes encoding each subunit, adding heterogeneity to the structure of the hexamer. The structure of the complex consists of a double beta barrel assembly with six protruding coiled-coils." Q#4939 - CGI_10028694 superfamily 242611 89 334 7.40E-127 379.541 cl01629 TPP_enzymes superfamily - - "Thiamine pyrophosphate (TPP) enzyme family, TPP-binding module; found in many key metabolic enzymes which use TPP (also known as thiamine diphosphate) as a cofactor. These enzymes include, among others, the E1 components of the pyruvate, the acetoin and the branched chain alpha-keto acid dehydrogenase complexes." Q#4939 - CGI_10028694 superfamily 245606 392 547 7.28E-65 213.072 cl11410 TPP_enzyme_PYR superfamily - - "Pyrimidine (PYR) binding domain of thiamine pyrophosphate (TPP)-dependent enzymes; Thiamine pyrophosphate (TPP) family, pyrimidine (PYR) binding domain; found in many key metabolic enzymes which use TPP (also known as thiamine diphosphate) as a cofactor. TPP binds in the cleft formed by a PYR domain and a PP domain. The PYR domain binds the aminopyrimidine ring of TPP, the PP domain binds the diphosphate residue. A polar interaction between the conserved glutamate of the PYR domain and the N1' of the TPP aminopyrimidine ring is shared by most TPP-dependent enzymes, and participates in the activation of TPP. The PYR and PP domains have a common fold, but do not share strong sequence conservation. The PP domain is not included in this group.
Most TPP-dependent enzymes have the PYR and PP domains on the same subunit although these domains can be alternatively arranged in the primary structure. In the case of 2-oxoisovalerate dehydrogenase (2OXO), sulfopyruvate decarboxylase (ComDE), and the E1 component of human pyruvate dehydrogenase complex (E1- PDHc) the PYR and PP domains appear on different subunits. TPP-dependent enzymes are multisubunit proteins, the smallest catalytic unit being a dimer-of-active sites. For many of these enzymes the active sites lie between PP and PYR domains on different subunits. However, for the homodimeric enzymes 1-deoxy-D-xylulose 5-phosphate synthase (DXS) and Desulfovibrio africanus pyruvate:ferredoxin oxidoreductase (PFOR), each active site lies at the interface of the PYR and PP domains from the same subunit." Q#4939 - CGI_10028694 superfamily 217227 563 685 3.32E-25 101.904 cl08363 Transketolase_C superfamily - - "Transketolase, C-terminal domain; The C-terminal domain of transketolase has been proposed as a regulatory molecule binding site." Q#4941 - CGI_10028696 superfamily 243015 5 135 5.75E-50 160.775 cl02381 Tim17 superfamily - - "Tim17/Tim22/Tim23/Pmp24 family; The pre-protein translocase of the mitochondrial outer membrane (Tom) allows the import of pre-proteins from the cytoplasm. Tom forms a complex with a number of proteins, including Tim17. Tim17 and Tim23 are thought to form the translocation channel of the inner membrane. This family includes Tim17, Tim22 and Tim23. This family also includes Pmp24 a peroxisomal protein. The involvement of this domain in the targeting of PMP24 remains to be proved. PMP24 was known as Pmp27 in." Q#4942 - CGI_10028697 superfamily 241896 244 477 8.53E-100 306.908 cl00483 UDG_like superfamily - - "Uracil-DNA glycosylases (UDG) and related enzymes; Uracil-DNA glycosylases (UDG) catalyzes the removal of uracil from DNA, which initiates the DNA base excision repair pathway. 
Uracil in DNA can arise as a result of mis-incorporation of dUMP residues by DNA polymerase or via deamination of cytosine. Uracil in DNA mispaired with guanine is one of the major pro-mutagenic events, causing G:C->A:T mutations. Thus, UDG is an essential enzyme for maintaining the integrity of genetic information. At least five UDG families have been characterized so far; these families share similar overall folds and common active site motifs. They demonstrate different substrate specificities, but often the function of one enzyme can be complemented by the other. Family 1 enzymes are active against uracil in both ssDNA and dsDNA, and recognize uracil explicitly in an extrahelical conformation via a combination of protein and bound-water interactions. Family 2 enzymes are mismatch specific and explicitly recognize the widowed guanine on the complementary strand, rather than the extrahelical scissile pyrimidine. This allows a broader specificity so that some Family 2 enzymes can excise uracil as well as 3,N(4)-ethenocytosine from mismatches with guanine. A Family 3 UDG from human was first characterized to remove uracil from ssDNA, hence the name hSMUG (single-strand-selective monofunctional uracil-DNA glycosylase). However, subsequent research has shown that hSMUG1 and its rat ortholog can remove uracil and its oxidized pyrimidine derivatives from both ssDNA and dsDNA. Enzymes in Families 4 and 5 are both thermostable. Family 4 enzymes specifically recognize uracil in a manner similar to human UDG (Family 1), rather than guanine in the complementary strand DNA, as does E. coli MUG (Family 2). These results suggest that the mechanism by which Family 4 UDGs remove uracils from DNA is similar to that of Family 1 enzymes. Although Family 5 enzymes are close relatives of Family 4, they show different substrate specificities."
Q#4943 - CGI_10028698 superfamily 247739 46 238 1.66E-25 100.038 cl17185 LPLAT superfamily - - "Lysophospholipid acyltransferases (LPLATs) of glycerophospholipid biosynthesis; Lysophospholipid acyltransferase (LPLAT) superfamily members are acyltransferases of de novo and remodeling pathways of glycerophospholipid biosynthesis. These proteins catalyze the incorporation of an acyl group from either acylCoAs or acyl-acyl carrier proteins (acylACPs) into acceptors such as glycerol 3-phosphate, dihydroxyacetone phosphate or lyso-phosphatidic acid. Included in this superfamily are LPLATs such as glycerol-3-phosphate 1-acyltransferase (GPAT, PlsB), 1-acyl-sn-glycerol-3-phosphate acyltransferase (AGPAT, PlsC), lysophosphatidylcholine acyltransferase 1 (LPCAT-1), lysophosphatidylethanolamine acyltransferase (LPEAT, also known as, MBOAT2, membrane-bound O-acyltransferase domain-containing protein 2), lipid A biosynthesis lauroyl/myristoyl acyltransferase, 2-acylglycerol O-acyltransferase (MGAT), dihydroxyacetone phosphate acyltransferase (DHAPAT, also known as 1 glycerol-3-phosphate O-acyltransferase 1) and Tafazzin (the protein product of the Barth syndrome (TAZ) gene)." Q#4944 - CGI_10028699 superfamily 201804 42 223 3.85E-56 179.689 cl03220 MAGE superfamily - - "MAGE family; The MAGE (melanoma antigen-encoding gene) family are expressed in a wide variety of tumours but not in normal cells, with the exception of the male germ cells, placenta, and, possibly, cells of the developing embryo. The cellular function of this family is unknown. This family also contains the yeast protein, Nse3. The Nse3 protein is part of the Smc5-6 complex. Nse3 has been demonstrated to be important for meiosis." 
Q#4946 - CGI_10028701 superfamily 248232 1 440 0 684.411 cl17678 MRS6 superfamily - - "RAB proteins geranylgeranyltransferase component A (RAB escort protein) [Posttranslational modification, protein turnover, chaperones]" Q#4947 - CGI_10028702 superfamily 243109 99 187 3.46E-51 168.246 cl02614 SPRY superfamily N - "SPRY domain; SPRY domains, first identified in the SP1A kinase of Dictyostelium and rabbit Ryanodine receptor (hence the name), are homologous to B30.2. SPRY domains have been identified in at least 11 protein families, covering a wide range of functions, including regulation of cytokine signaling (SOCS), RNA metabolism (DDX1 and hnRNP), immunity to retroviruses (TRIM5alpha), intracellular calcium release (ryanodine receptors or RyR) and regulatory and developmental processes (HERC1 and Ash2L). B30.2 also contains residues in the N-terminus that form a distinct PRY domain structure; i.e. B30.2 domain consists of PRY and SPRY subdomains. B30.2 domains comprise the C-terminus of three protein families: BTNs (receptor glycoproteins of immunoglobulin superfamily); several TRIM proteins (composed of RING/B-box/coiled-coil or RBCC core); Stonutoxin (secreted poisonous protein of the stonefish Synanceia horrida). While SPRY domains are evolutionarily ancient, B30.2 domains are a more recent adaptation where the SPRY/PRY combination is a possible component of immune defense. Mutations found in the SPRY-containing proteins have shown to cause Mediterranean fever and Opitz syndrome." Q#4948 - CGI_10028703 superfamily 248012 3 57 2.84E-06 42.1797 cl17458 TIR_2 superfamily N - TIR domain; This is a family of bacterial Toll-like receptors. Q#4949 - CGI_10028704 superfamily 241555 8 114 1.93E-12 61.0331 cl00020 GAT_1 superfamily C - "Type 1 glutamine amidotransferase (GATase1)-like domain; Type 1 glutamine amidotransferase (GATase1)-like domain. 
This group contains proteins similar to Class I glutamine amidotransferases, the intracellular PH1704 from Pyrococcus horikoshii, the C-terminal of the large catalase: Escherichia coli HP-II, Sinorhizobium meliloti Rm1021 ThuA, the A4 beta-galactosidase middle domain and peptidase E. The majority of proteins in this group have a reactive Cys found in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow. For Class I glutamine amidotransferases, proteins which transfer ammonia from the amide side chain of glutamine to an acceptor substrate, this Cys forms a Cys-His-Glu catalytic triad in the active site. Glutamine amidotransferase activity can be found in a range of biosynthetic enzymes included in this cd: glutamine amidotransferase, formylglycinamide ribonucleotide, GMP synthetase, anthranilate synthase component II, glutamine-dependent carbamoyl phosphate synthase (CPSase), cytidine triphosphate synthetase, gamma-glutamyl hydrolase, imidazole glycerol phosphate synthase, and cobyric acid synthase. For Pyrococcus horikoshii PH1704, the Cys of the nucleophile elbow together with a different His and a Glu from an adjacent monomer form a catalytic triad different from the typical GATase1 triad. Peptidase E is believed to be a serine peptidase having a Ser-His-Glu catalytic triad which differs from the Cys-His-Glu catalytic triad of typical GATase1 domains by having a Ser in place of the reactive Cys at the nucleophile elbow. The E. coli HP-II C-terminal domain, S. meliloti Rm1021 ThuA and the A4 beta-galactosidase middle domain lack the catalytic triad typical of GATase1 domains. GATase1-like domains can occur either as single polypeptides, as in Class I glutamine amidotransferases, or as domains in a much larger multifunctional synthase protein, such as CPSase. Peptidase E has a circular permutation in the common core of a typical GATase1 domain."
Q#4950 - CGI_10028705 superfamily 243091 122 251 1.78E-29 109.346 cl02566 SET superfamily - - "SET domain; SET domains are protein lysine methyltransferase enzymes. SET domains appear to be protein-protein interaction domains. It has been demonstrated that SET domains mediate interactions with a family of proteins that display similarity with dual-specificity phosphatases (dsPTPases). A subset of SET domains have been called PR domains. These domains are divergent in sequence from other SET domains, but also appear to mediate protein-protein interaction. The SET domain consists of two regions known as SET-N and SET-C. SET-C forms an unusual and conserved knot-like structure of probably functional importance. Additionally to SET-N and SET-C, an insert region (SET-I) and flanking regions of high structural variability form part of the overall structure." Q#4950 - CGI_10028705 superfamily 243114 17 114 4.25E-12 60.8869 cl02622 Pre-SET superfamily - - Pre-SET motif; This protein motif is a zinc binding motif. It contains 9 conserved cysteines that coordinate three zinc ions. It is thought that this region plays a structural role in stabilising SET domains. Q#4951 - CGI_10028706 superfamily 243072 27 138 1.53E-27 100.536 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#4952 - CGI_10028707 superfamily 243072 30 150 4.69E-37 125.959 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. 
The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#4953 - CGI_10028708 superfamily 248012 4 145 1.41E-16 71.5856 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#4954 - CGI_10004706 superfamily 241714 8 198 1.33E-59 187.475 cl00237 Peptidase_C15 superfamily - - "Pyroglutamyl peptidase (PGP) type I, also known as pyrrolidone carboxyl peptidase (pcp) type I: Enzymes responsible for cleaving pyroglutamate (pGlu) from the N-terminal end of specialized proteins. The N-terminal pGlu protects these proteins from proteolysis by other proteases until the pGlu is removed by a PGP. PGPs are cysteine proteases with a Cys-His-Glu/Asp catalytic triad. Type I PGPs are found in a wide variety of prokaryotes and eukaryotes. It is not clear whether the functional form is a monomer, a homodimer, or a homotetramer." Q#4955 - CGI_10004707 superfamily 243106 10 90 2.38E-27 107.162 cl02608 BAH superfamily N - "BAH, or Bromo Adjacent Homology domain (also called ELM1 and BAM for Bromo Adjacent Motif). BAH domains were first described as domains found in the polybromo protein and Yeast Rsc1/Rsc2 (Remodeling of the Structure of Chromatin). They also occur in mammalian DNA methyltransferases and the MTA1 subunits of histone deacetylase complexes. A BAH domain is also found in Yeast Sir3p and in the origin recognition complex protein 1 (Orc1p), where it was found to interact with the N-terminal lobe of the silent information regulator 1 protein (Sir1p), confirming the initial hypothesis that BAH plays a role in protein-protein interactions."
Q#4955 - CGI_10004707 superfamily 226572 159 203 0.000588065 39.4644 cl18761 COG4087 superfamily N - Soluble P-type ATPase [General function prediction only] Q#4962 - CGI_10004714 superfamily 115139 21 147 1.26E-51 163.165 cl05790 Mob_synth_C superfamily - - Molybdenum Cofactor Synthesis C; This region contains two iron-sulphur (3Fe-4S) binding sites. Mutations in this region of human MOCS1 cause MOCOD (Molybdenum Co-Factor Deficiency) type A. Q#4964 - CGI_10011863 superfamily 245818 75 182 5.49E-22 89.1535 cl11966 Rel-Spo_like superfamily - - "RelA- and SpoT-like ppGpp Synthetases and Hydrolases, catalytic domain; The Rel-Spo superfamily includes the catalytic domains of Escherichia coli ppGpp synthetase (RelA), ppGpp synthetase/hydrolase (SpoT), and related proteins. RelA synthesizes (p)ppGpp in response to amino-acid starvation and in association with ribosomes. (p)ppGpp triggers the bacterial stringent response. SpoT catalyzes (p)ppGpp synthesis under carbon limitation in a ribosome-independent manner. It also catalyzes (p)ppGpp degradation. Gram-negative bacteria have two enzymes involved in (p)ppGpp metabolism while most Gram-positive organisms have a single Rel-Spo enzyme (Rel), which both synthesizes and degrades (p)ppGpp. The Arabidopsis thaliana Rel-Spo proteins, At-RSH1,-2, and-3 appear to regulate a rapid (p)ppGpp-mediated response to pathogens and other stresses. This catalytic domain is found in association with an N-terminal HD domain and a C-terminal metal dependent phosphohydrolase domain (TGS). Some Rel-Spo proteins also have a C-terminal regulatory ACT domain." Q#4964 - CGI_10011863 superfamily 217750 269 328 4.61E-11 57.9742 cl04280 PAP_assoc superfamily - - Cid1 family poly A polymerase; This domain is found in poly(A) polymerases and has been shown to have polynucleotide adenylyltransferase activity. Proteins in this family have been located to both the nucleus and the cytoplasm. 
Q#4965 - CGI_10011864 superfamily 192286 104 207 3.04E-68 208.433 cl18178 UPF1_Zn_bind superfamily C - RNA helicase (UPF2 interacting domain); UPF1 is an essential RNA helicase that detects mRNAs containing premature stop codons and triggers their degradation. This domain contains 3 zinc binding motifs and forms interactions with another protein (UPF2) that is also involved in nonsense-mediated mRNA decay (NMD). Q#4966 - CGI_10011865 superfamily 246680 628 675 6.35E-08 50.797 cl14633 DD_superfamily superfamily C - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#4967 - CGI_10011866 superfamily 217783 59 278 8.21E-97 288.594 cl04314 TRAP_alpha superfamily N - "Translocon-associated protein (TRAP), alpha subunit; The alpha-subunit of the TRAP complex (TRAP alpha) is a single-spanning membrane protein of the endoplasmic reticulum (ER) which is found in proximity of nascent polypeptide chains translocating across the membrane." Q#4968 - CGI_10011867 superfamily 207687 352 466 3.69E-54 180.241 cl02647 AXH superfamily - - Ataxin-1 and HBP1 module (AXH); AXH is a protein-protein and RNA binding motif found in Ataxin-1 (ATX1). ATX1 is responsible for the autosomal-dominant neurodegenerative disorder Spinocerebellar ataxia type-1 (SCA1) in humans.
The AXH module has also been identified in the apparently unrelated transcription factor HBP1 which is thought to be involved in the architectural regulation of chromatin and in specific gene expression. Q#4969 - CGI_10011868 superfamily 243109 11 185 2.56E-58 195.414 cl02614 SPRY superfamily - - "SPRY domain; SPRY domains, first identified in the SP1A kinase of Dictyostelium and rabbit Ryanodine receptor (hence the name), are homologous to B30.2. SPRY domains have been identified in at least 11 protein families, covering a wide range of functions, including regulation of cytokine signaling (SOCS), RNA metabolism (DDX1 and hnRNP), immunity to retroviruses (TRIM5alpha), intracellular calcium release (ryanodine receptors or RyR) and regulatory and developmental processes (HERC1 and Ash2L). B30.2 also contains residues in the N-terminus that form a distinct PRY domain structure; i.e. B30.2 domain consists of PRY and SPRY subdomains. B30.2 domains comprise the C-terminus of three protein families: BTNs (receptor glycoproteins of immunoglobulin superfamily); several TRIM proteins (composed of RING/B-box/coiled-coil or RBCC core); Stonutoxin (secreted poisonous protein of the stonefish Synanceia horrida). While SPRY domains are evolutionarily ancient, B30.2 domains are a more recent adaptation where the SPRY/PRY combination is a possible component of immune defense. Mutations found in the SPRY-containing proteins have shown to cause Mediterranean fever and Opitz syndrome." Q#4969 - CGI_10011868 superfamily 220676 456 555 2.63E-28 109.994 cl18566 DUF2392 superfamily - - Protein of unknown function (DUF2392); This is a family of proteins conserved from plants to humans. The function is not known. It carries a characteristic GRG sequence motif. 
Q#4969 - CGI_10011868 superfamily 241758 294 515 0.000458925 40.3084 cl00292 AANH_like superfamily - - "Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases, ATP sulphurylases Universal Stress Response protein and electron transfer flavoprotein (ETF). The domain forms an alpha/beta/alpha fold which binds to Adenosine nucleotide." Q#4970 - CGI_10011869 superfamily 243092 11 305 3.50E-20 87.3904 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#4971 - CGI_10011870 superfamily 247724 150 352 1.07E-43 151.788 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity.
The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#4972 - CGI_10011871 superfamily 222150 270 293 0.000184171 38.5269 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#4973 - CGI_10011872 superfamily 146263 4 112 8.84E-24 94.6786 cl04138 SK_channel superfamily - - Calcium-activated SK potassium channel; Calcium-activated SK potassium channel. Q#4974 - CGI_10011873 superfamily 146263 73 165 3.33E-22 91.597 cl04138 SK_channel superfamily - - Calcium-activated SK potassium channel; Calcium-activated SK potassium channel. Q#4974 - CGI_10011873 superfamily 219619 259 323 3.04E-11 59.5287 cl18518 Ion_trans_2 superfamily - - Ion channel; This family includes the two membrane helix type ion channels found in bacteria. Q#4975 - CGI_10011874 superfamily 146263 84 193 4.41E-19 83.1227 cl04138 SK_channel superfamily - - Calcium-activated SK potassium channel; Calcium-activated SK potassium channel. 
Q#4975 - CGI_10011874 superfamily 219619 304 359 9.60E-12 61.0695 cl18518 Ion_trans_2 superfamily N - Ion channel; This family includes the two membrane helix type ion channels found in bacteria. Q#4975 - CGI_10011874 superfamily 198825 378 449 3.45E-05 42.0157 cl03763 CaMBD superfamily - - "Calmodulin binding domain; Small-conductance Ca2+-activated K+ channels (SK channels) are independent of voltage and gated solely by intracellular Ca2+. These membrane channels are heteromeric complexes that comprise pore-forming alpha-subunits and the Ca2+-binding protein calmodulin (CaM). CaM binds to the SK channel through this the CaM-binding domain (CaMBD), which is located in an intracellular region of the alpha-subunit immediately carboxy-terminal to the pore. Channel opening is triggered when Ca2+ binds the EF hands in the N-lobe of CaM. The structure of this domain complexed with CaM is known. This domain forms an elongated dimer with a CaM molecule bound at each end; each CaM wraps around three alpha-helices, two from one CaMBD subunit and one from the other." Q#4976 - CGI_10011875 superfamily 247058 123 313 3.19E-57 186.997 cl15762 crotonase-like superfamily - - "Crotonase/Enoyl-Coenzyme A (CoA) hydratase superfamily. This superfamily contains a diverse set of enzymes including enoyl-CoA hydratase, napthoate synthase, methylmalonyl-CoA decarboxylase, 3-hydoxybutyryl-CoA dehydratase, and dienoyl-CoA isomerase. Many of these play important roles in fatty acid metabolism. In addition to a conserved structural core and the formation of trimers (or dimers of trimers), a common feature in this superfamily is the stabilization of an enolate anion intermediate derived from an acyl-CoA substrate. This is accomplished by two conserved backbone NH groups in active sites that form an oxyanion hole." 
Q#4976 - CGI_10011875 superfamily 241699 24 100 2.52E-33 120.123 cl00221 ACBP superfamily - - Acyl CoA binding protein (ACBP) binds thiol esters of long fatty acids and coenzyme A in a one-to-one binding mode with high specificity and affinity. Acyl-CoAs are important intermediates in fatty lipid synthesis and fatty acid degradation and play a role in regulation of intermediary metabolism and gene regulation. The suggested role of ACBP is to act as a intracellular acyl-CoA transporter and pool former. ACBPs are present in a large group of eukaryotic species and several tissue-specific isoforms have been detected. Q#4977 - CGI_10011876 superfamily 247058 433 620 3.66E-52 179.293 cl15762 crotonase-like superfamily - - "Crotonase/Enoyl-Coenzyme A (CoA) hydratase superfamily. This superfamily contains a diverse set of enzymes including enoyl-CoA hydratase, napthoate synthase, methylmalonyl-CoA decarboxylase, 3-hydoxybutyryl-CoA dehydratase, and dienoyl-CoA isomerase. Many of these play important roles in fatty acid metabolism. In addition to a conserved structural core and the formation of trimers (or dimers of trimers), a common feature in this superfamily is the stabilization of an enolate anion intermediate derived from an acyl-CoA substrate. This is accomplished by two conserved backbone NH groups in active sites that form an oxyanion hole." 
Q#4978 - CGI_10011877 superfamily 243092 306 577 6.06E-39 144.4 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#4978 - CGI_10011877 superfamily 247792 25 58 2.65E-05 42.0476 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#4978 - CGI_10011877 superfamily 190233 115 172 0.00362349 35.893 cl08341 zf-TRAF superfamily - - TRAF-type zinc finger; TRAF-type zinc finger. Q#4979 - CGI_10011878 superfamily 247792 46 90 9.49E-15 67.4708 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#4980 - CGI_10011879 superfamily 245847 55 113 4.11E-07 44.4156 cl12042 FA58C superfamily C - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." 
Q#4982 - CGI_10002435 superfamily 241832 8 133 7.13E-60 184.641 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#4987 - CGI_10025633 superfamily 241868 38 224 1.01E-78 237.077 cl00447 Nudix_Hydrolase superfamily - - "Nudix hydrolase is a superfamily of enzymes found in all three kingdoms of life, and it catalyzes the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+ for their activity. Members of this family are recognized by a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which forms a structural motif that functions as a metal binding and catalytic site. Substrates of nudix hydrolase include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. 
In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance and "house-cleaning" enzymes. Substrate specificity is used to define child families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. This superfamily consists of at least nine families: IPP (isopentenyl diphosphate) isomerase, ADP ribose pyrophosphatase, mutT pyrophosphohydrolase, coenzyme-A pyrophosphatase, MTH1-7,8-dihydro-8-oxoguanine-triphosphatase, diadenosine tetraphosphate hydrolase, NADH pyrophosphatase, GDP-mannose hydrolase and the c-terminal portion of the mutY adenine glycosylase." Q#4988 - CGI_10025634 superfamily 241574 645 876 3.14E-113 348.421 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#4988 - CGI_10025634 superfamily 241622 480 567 2.03E-22 93.015 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. 
Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#4990 - CGI_10025636 superfamily 247725 211 333 2.92E-75 233.71 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting the protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#4990 - CGI_10025636 superfamily 215882 131 236 3.77E-35 127.012 cl09511 FERM_M superfamily - - FERM central domain; This domain is the central structural domain of the FERM domain. Q#4990 - CGI_10025636 superfamily 220215 33 125 2.13E-24 96.1402 cl09630 FERM_N superfamily - - FERM N-terminal domain; This domain is the N-terminal ubiquitin-like structural domain of the FERM domain.
Q#4990 - CGI_10025636 superfamily 192138 332 377 1.05E-11 59.9399 cl07378 FA superfamily - - "FERM adjacent (FA); This region is found adjacent to Band 4.1 / FERM domains (pfam00373) in a subset of FERM containing protein. The region has been hypothesised to play a role in regulatory adaptation, based on similarity to other protein kinase substrates." Q#4992 - CGI_10025638 superfamily 241593 31 118 2.74E-07 45.3302 cl00075 HATPase_c superfamily - - "Histidine kinase-like ATPases; This family includes several ATP-binding proteins for example: histidine kinase, DNA gyrase B, topoisomerases, heat shock protein HSP90, phytochrome-like ATPases and DNA mismatch repair proteins" Q#4993 - CGI_10025639 superfamily 128937 84 144 5.36E-14 63.0504 cl02743 DM9 superfamily - - Repeats found in Drosophila proteins; Repeats found in Drosophila proteins. Q#4993 - CGI_10025639 superfamily 128937 9 74 2.27E-11 55.7316 cl02743 DM9 superfamily - - Repeats found in Drosophila proteins; Repeats found in Drosophila proteins. Q#4995 - CGI_10025641 superfamily 242406 27 144 8.44E-15 67.2313 cl01271 DUF1768 superfamily - - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. 
Q#4997 - CGI_10025643 superfamily 241613 2 30 6.42E-09 52.209 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#4997 - CGI_10025643 superfamily 247744 269 312 0.00449978 36.6872 cl17190 NK superfamily C - "Nucleoside/nucleotide kinase (NK) is a protein superfamily consisting of multiple families of enzymes that share structural similarity and are functionally related to the catalysis of the reversible phosphate group transfer from nucleoside triphosphates to nucleosides/nucleotides, nucleoside monophosphates, or sugars. Members of this family play a wide variety of essential roles in nucleotide metabolism, the biosynthesis of coenzymes and aromatic compounds, as well as the metabolism of sugar and sulfate." Q#4999 - CGI_10025645 superfamily 218295 2 570 0 757.872 cl15562 TH1 superfamily - - TH1 protein; TH1 is a highly conserved but uncharacterized metazoan protein. No homologue has been identified in Caenorhabditis elegans. TH1 binds specifically to A-Raf kinase. 
Q#5000 - CGI_10025646 superfamily 243967 293 542 3.44E-65 212.96 cl05005 TAF4 superfamily - - "TATA Binding Protein (TBP) Associated Factor 4 (TAF4) is one of several TAFs that bind TBP and is involved in forming Transcription Factor IID (TFIID) complex; The TATA Binding Protein (TBP) Associated Factor 4 (TAF4) is one of several TAFs that bind TBP and are involved in forming the Transcription Factor IID (TFIID) complex. TFIID is one of seven General Transcription Factors (GTF) (TFIIA, TFIIB, TFIID, TFIIE, TFIIF, and TFIID) that are involved in accurate initiation of transcription by RNA polymerase II in eukaryote. TFIID plays an important role in the recognition of promoter DNA and assembly of the pre-initiation complex. TFIID complex is composed of the TBP and at least 13 TAFs. TAFs from various species were originally named by their predicted molecular weight or their electrophoretic mobility in polyacrylamide gels. A new, unified nomenclature for the pol II TAFs has been suggested to show the relationship between TAF orthologs and paralogs. Several hypotheses are proposed for TAFs functions such as serving as activator-binding sites, core-promoter recognition or a role in essential catalytic activity. Each TAF, with the help of a specific activator, is required only for the expression of subset of genes and is not universally involved for transcription as are GTFs. In yeast and human cells, TAFs have been found as components of other complexes besides TFIID. Several TAFs interact via histone-fold (HFD) motifs; HFD is the interaction motif involved in heterodimerization of the core histones and their assembly into nucleosome octamers. The minimal HFD contains three alpha-helices linked by two loops and is found in core histones, TAFS and many other transcription factors. TFIID has a histone octamer-like substructure. TAF4 domain interacts with TAF12 and makes a novel histone-like heterodimer that binds DNA and has a core promoter function of a subset of genes." 
Q#5000 - CGI_10025646 superfamily 198757 133 224 6.65E-33 120.892 cl02658 TAFH superfamily - - "NHR1 homology to TAF; This corresponds to the region NHR1 that is conserved between the product of the nervy gene in Drosophila and the human mtg8b protein, which is hypothesised to be a transcription factor." Q#5001 - CGI_10025647 superfamily 241733 4 77 4.58E-40 141.186 cl00259 Sm_like superfamily - - "Sm and related proteins; The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes." Q#5001 - CGI_10025647 superfamily 220282 334 439 1.64E-19 84.843 cl09757 FDF superfamily - - "FDF domain; The FDF domain, so called because of the conserved FDF at its N termini, is an entirely alpha-helical domain with multiple exposed hydrophilic loops. It is found at the C terminus of Scd6p-like SM domains. It is also found with other divergent Sm domains and in proteins such as Dcp3p and FLJ21128, where it is found N terminal to the YjeF-N domain, a novel Rossmann fold domain." Q#5002 - CGI_10025648 superfamily 247856 102 161 3.34E-05 40.9941 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#5003 - CGI_10025649 superfamily 243068 5 230 8.55E-19 82.5932 cl02523 Zona_pellucida superfamily - - Zona pellucida-like domain; Zona pellucida-like domain. Q#5005 - CGI_10025651 superfamily 220692 58 157 0.00107911 39.1097 cl18570 7TM_GPCR_Srw superfamily C - Serpentine type 7TM GPCR chemoreceptor Srw; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srw is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. The genes encoding Srw do not appear to be under as strong an adaptive evolutionary pressure as those of Srz. Q#5008 - CGI_10025654 superfamily 189870 13 103 1.04E-42 142.269 cl09526 SBDS superfamily - - "Shwachman-Bodian-Diamond syndrome (SBDS) protein; This family is highly conserved in species ranging from archaea to vertebrates and plants. The family contains several Shwachman-Bodian-Diamond syndrome (SBDS) proteins from both mouse and humans. Shwachman-Diamond syndrome is an autosomal recessive disorder with clinical features that include pancreatic exocrine insufficiency, haematological dysfunction and skeletal abnormalities. It is characterized by bone marrow failure and leukemia predisposition. Members of this family play a role in RNA metabolism. In yeast these proteins have been shown to be critical for the release and recycling of the nucleolar shuttling factor Tif6 from pre-60S ribosomes, a key step in 60S maturation and translational activation of ribosomes. This data links defective late 60S subunit maturation to an inherited bone marrow failure syndrome associated with leukemia predisposition." 
Q#5009 - CGI_10025655 superfamily 247856 452 530 2.35E-11 59.8689 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#5009 - CGI_10025655 superfamily 246925 120 415 4.28E-30 119.382 cl15309 LRR_RI superfamily - - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#5013 - CGI_10025659 superfamily 241550 299 428 8.11E-13 65.0048 cl00015 nt_trans superfamily - - "nucleotidyl transferase superfamily; nt_trans (nucleotidyl transferase) This superfamily includes the class I amino-acyl tRNA synthetases, pantothenate synthetase (PanC), ATP sulfurylase, and the cytidylyltransferases, all of which have a conserved dinucleotide-binding domain." Q#5013 - CGI_10025659 superfamily 241872 5 104 7.01E-06 45.0557 cl00453 CDP-OH_P_transf superfamily C - CDP-alcohol phosphatidyltransferase; All of these members have the ability to catalyze the displacement of CMP from a CDP-alcohol by a second alcohol with formation of a phosphodiester bond and concomitant breaking of a phosphoride anhydride bond. 
Q#5014 - CGI_10025660 superfamily 247907 4281 4440 1.42E-29 119.06 cl17353 LamG superfamily - - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." Q#5014 - CGI_10025660 superfamily 247907 3740 3893 4.17E-25 105.963 cl17353 LamG superfamily - - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." Q#5014 - CGI_10025660 superfamily 247907 3997 4175 2.28E-24 103.652 cl17353 LamG superfamily - - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." 
Q#5014 - CGI_10025660 superfamily 238012 2502 2551 1.70E-10 60.4458 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#5014 - CGI_10025660 superfamily 238012 2081 2130 1.98E-10 60.0606 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#5014 - CGI_10025660 superfamily 245814 3009 3069 8.31E-10 59.0399 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#5014 - CGI_10025660 superfamily 241613 445 480 1.19E-09 57.6018 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#5014 - CGI_10025660 superfamily 241613 1052 1086 4.46E-09 56.061 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#5014 - CGI_10025660 superfamily 241613 1012 1047 8.36E-09 55.2906 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports 
it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#5014 - CGI_10025660 superfamily 241613 574 605 1.42E-08 54.5202 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#5014 - CGI_10025660 superfamily 238012 1681 1730 2.45E-08 54.2826 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly 
variable ranging from 3 up to 22 copies" Q#5014 - CGI_10025660 superfamily 241613 367 401 3.76E-08 53.3646 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#5014 - CGI_10025660 superfamily 241613 938 973 5.96E-08 52.5942 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#5014 - CGI_10025660 superfamily 241613 1195 1227 7.51E-08 52.5942 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the 
receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#5014 - CGI_10025660 superfamily 241613 743 777 1.29E-07 51.8238 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#5014 - CGI_10025660 superfamily 241613 406 441 1.50E-07 51.4386 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very 
low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#5014 - CGI_10025660 superfamily 238012 1796 1841 1.94E-07 51.5862 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#5014 - CGI_10025660 superfamily 245814 2635 2700 2.45E-07 51.7211 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#5014 - CGI_10025660 superfamily 241613 1280 1315 3.83E-07 50.283 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#5014 - CGI_10025660 superfamily 241613 783 814 6.04E-07 49.8978 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#5014 - CGI_10025660 superfamily 245213 4236 4272 6.83E-07 49.5574 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#5014 - CGI_10025660 superfamily 238012 2195 2244 9.38E-07 49.6602 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#5014 - CGI_10025660 superfamily 241613 975 1010 1.67E-06 48.357 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#5014 - CGI_10025660 superfamily 241613 821 857 4.38E-06 47.2014 cl00104 LDLa superfamily - - "Low Density 
Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#5014 - CGI_10025660 superfamily 241613 616 647 6.03E-06 46.8162 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#5014 - CGI_10025660 superfamily 241613 902 936 1.12E-05 46.0458 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the 
N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#5014 - CGI_10025660 superfamily 241613 527 560 1.58E-05 45.6606 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#5014 - CGI_10025660 superfamily 241613 487 520 4.97E-05 44.1198 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as 
the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#5014 - CGI_10025660 superfamily 245213 3913 3949 0.00578951 38.0014 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#5014 - CGI_10025660 superfamily 245213 4203 4233 0.00821617 37.6162 cl09941 EGF_CA superfamily N - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#5014 - CGI_10025660 superfamily 243080 1503 1630 3.45E-29 116.977 cl02548 Laminin_B superfamily - - Laminin B (Domain IV); Laminin B (Domain IV). Q#5014 - CGI_10025660 superfamily 245814 1333 1410 1.97E-27 110.272 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#5014 - CGI_10025660 superfamily 243080 1911 2035 2.22E-24 103.11 cl02548 Laminin_B superfamily - - Laminin B (Domain IV); Laminin B (Domain IV). Q#5014 - CGI_10025660 superfamily 243080 2314 2455 6.04E-24 101.954 cl02548 Laminin_B superfamily - - Laminin B (Domain IV); Laminin B (Domain IV). Q#5014 - CGI_10025660 superfamily 245814 2710 2794 2.46E-19 87.2184 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#5014 - CGI_10025660 superfamily 245814 3434 3510 2.71E-16 78.316 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. 
A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#5014 - CGI_10025660 superfamily 245814 3610 3686 4.26E-15 74.8492 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#5014 - CGI_10025660 superfamily 245814 3099 3166 8.65E-14 70.5133 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#5014 - CGI_10025660 superfamily 245814 3524 3599 1.47E-13 70.2268 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#5014 - CGI_10025660 superfamily 245814 3179 3262 1.79E-12 67.1452 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#5014 - CGI_10025660 superfamily 245814 3345 3419 2.15E-10 60.982 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#5014 - CGI_10025660 superfamily 243060 82 179 1.88E-09 58.5444 cl02507 SEA superfamily - - "SEA domain; Domain found in Sea urchin sperm protein, Enterokinase, Agrin (SEA). 
Proposed function of regulating or binding carbohydrate side chains. Recently a proteolytic activity has been shown for a SEA domain." Q#5014 - CGI_10025660 superfamily 245814 2809 2890 7.38E-09 56.3597 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#5014 - CGI_10025660 superfamily 245814 3284 3335 6.70E-08 53.1795 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#5014 - CGI_10025660 superfamily 245814 1101 1179 8.59E-06 47.1576 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#5014 - CGI_10025660 superfamily 245814 2927 2964 3.15E-05 45.6168 cl11960 Ig superfamily N - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#5014 - CGI_10025660 superfamily 245814 655 737 5.85E-05 44.4612 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#5014 - CGI_10025660 superfamily 238012 1731 1788 0.00625636 38.1042 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#5014 - CGI_10025660 superfamily 241613 861 884 0.00887947 37.6129 cl00104 LDLa superfamily C - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#5015 - CGI_10025661 superfamily 219251 205 290 2.42E-23 92.8857 cl18503 MRP-L47 superfamily - - "Mitochondrial 39-S ribosomal protein L47 (MRP-L47); This family represents the N-terminal region (approximately 8 residues) of the eukaryotic mitochondrial 39-S ribosomal protein L47 (MRP-L47). Mitochondrial ribosomal proteins (MRPs) are the counterparts of the cytoplasmic ribosomal proteins, in that they fulfil similar functions in protein biosynthesis. However, they are distinct in number, features and primary structure." 
Q#5017 - CGI_10025663 superfamily 243069 37 232 5.63E-108 312.608 cl02525 Band_7 superfamily - - "The band 7 domain of flotillin (reggie) like proteins. This group contains proteins similar to stomatin, prohibitin, flotillin, HflK/C and podocin. Many of these band 7 domain-containing proteins are lipid raft-associated. Individual proteins of this band 7 domain family may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. Microdomains formed from flotillin proteins may in addition be dynamic units with their own regulatory functions. Flotillins have been implicated in signal transduction, vesicle trafficking, cytoskeleton rearrangement and are known to interact with a variety of proteins. Stomatin interacts with and regulates members of the degenerin/epithelial Na+ channel family in mechanosensory cells of Caenorhabditis elegans and vertebrate neurons and participates in trafficking of Glut1 glucose transporters. Prohibitin may act as a chaperone for the stabilization of mitochondrial proteins. Prokaryotic HflK/C plays a role in the decision between lysogenic and lytic cycle growth during lambda phage infection. Flotillins have been implicated in the progression of prion disease, in the pathogenesis of neurodegenerative diseases such as Parkinson's and Alzheimer's disease, and in cancer invasion and metastasis. 
Mutations in the podocin gene give rise to autosomal recessive steroid-resistant nephrotic syndrome" Q#5018 - CGI_10025664 superfamily 241634 1 139 8.70E-46 152.053 cl00143 SynN superfamily - - "Syntaxin N-terminus domain; syntaxins are nervous system-specific proteins implicated in the docking of synaptic vesicles with the presynaptic plasma membrane; they are a family of receptors for intracellular transport vesicles; each target membrane may be identified by a specific member of the syntaxin family; syntaxins contain a moderately well conserved amino-terminal domain, called Habc, whose structure is an antiparallel three-helix bundle; a linker of about 30 amino acids connects this to the carboxy-terminal region, designated H3 (t_SNARE), of the syntaxin cytoplasmic domain; the highly conserved H3 region forms a single, long alpha-helix when it is part of the core SNARE complex and anchors the protein on the cytoplasmic surface of cellular membranes; H3 is not included in defining this domain" Q#5018 - CGI_10025664 superfamily 241642 153 212 1.24E-12 60.587 cl00152 t_SNARE superfamily - - "Soluble NSF (N-ethylmaleimide-sensitive fusion protein)-Attachment protein (SNAP) REceptor domain; these alpha-helical motifs form twisted and parallel heterotetrameric helix bundles; the core complex contains one helix from a protein that is anchored in the vesicle membrane (synaptobrevin), one helix from a protein of the target membrane (syntaxin), and two helices from another protein anchored in the target membrane (SNAP-25); their interaction forms a core which is composed of a polar zero layer, a flanking leucine-zipper layer acts as a water tight shield to isolate ionic interactions in the zero layer from the surrounding solvent" Q#5020 - CGI_10025666 superfamily 241913 190 263 0.00228093 35.6597 cl00509 hot_dog superfamily C - "The hotdog fold was initially identified in the E. 
coli FabA (beta-hydroxydecanoyl-acyl carrier protein (ACP)-dehydratase) structure and subsequently in 4HBT (4-hydroxybenzoyl-CoA thioesterase) from Pseudomonas. A number of other seemingly unrelated proteins also share the hotdog fold. These proteins have related, but distinct, catalytic activities that include metabolic roles such as thioester hydrolysis in fatty acid metabolism, and degradation of phenylacetic acid and the environmental pollutant 4-chlorobenzoate. This superfamily also includes the PaaI-like protein FapR, a non-catalytic bacterial homolog involved in transcriptional regulation of fatty acid biosynthesis." Q#5021 - CGI_10025667 superfamily 241884 5 213 3.00E-145 407.905 cl00467 Ntn_hydrolase superfamily - - "The Ntn hydrolases (N-terminal nucleophile) are a diverse superfamily of enzymes that are activated autocatalytically via an N-terminally located nucleophilic amino acid. N-terminal nucleophile (NTN-) hydrolase superfamily, which contains a four-layered alpha, beta, beta, alpha core structure. This family of hydrolases includes penicillin acylase, the 20S proteasome alpha and beta subunits, and glutamate synthase. The mechanism of activation of these proteins is conserved, although they differ in their substrate specificities. All known members catalyze the hydrolysis of amide bonds in either proteins or small molecules, and each one of them is synthesized as a preprotein. For each, an autocatalytic endoproteolytic process generates a new N-terminal residue. This mature N-terminal residue is central to catalysis and acts as both a polarizing base and a nucleophile during the reaction. The N-terminal amino group acts as the proton acceptor and activates either the nucleophilic hydroxyl in a Ser or Thr residue or the nucleophilic thiol in a Cys residue. The position of the N-terminal nucleophile in the active site and the mechanism of catalysis are conserved in this family, despite considerable variation in the protein sequences." 
Q#5022 - CGI_10025668 superfamily 241574 51 285 1.01E-91 279.856 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#5023 - CGI_10025669 superfamily 241599 188 240 1.20E-12 61.1052 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#5023 - CGI_10025669 superfamily 243050 113 167 4.38E-26 97.8791 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." 
Q#5023 - CGI_10025669 superfamily 243050 52 105 8.40E-18 75.1053 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#5025 - CGI_10025671 superfamily 209898 36 56 0.00893719 33.0827 cl14787 MORN superfamily - - MORN repeat; The MORN (Membrane Occupation and Recognition Nexus) repeat is found in multiple copies in several proteins including junctophilins (See Takeshima et al. Mol. Cell 2000;6:11-22). A MORN-repeat protein has been identified in the parasite Toxoplasma gondii as a dynamic component of the cell division apparatus. It has been hypothesised to function as a linker protein between certain membrane regions and the parasite's cytoskeleton. Q#5026 - CGI_10025672 superfamily 219609 1 140 7.30E-52 165.566 cl06752 Orai-1 superfamily - - "Mediator of CRAC channel activity; ORAI-1 is a protein homologue of Drosophila Orai and human Orai1, Orai2 and Orai3. ORAI-1 GFP reporters are co-expressed with STIM-1 (ER Ca(2+) sensors) in the gonad and intestine. The protein has four predicted transmembrane domains with a highly conserved region between TM2 and TM3. 
This conserved domain is thought to function in channel regulation. ORAI1- related proteins are required for the production of the calcium channel, CRAC, along with STIM1-related proteins." Q#5027 - CGI_10025673 superfamily 202085 713 741 2.34E-10 58.1406 cl03401 zf-CXXC superfamily N - "CXXC zinc finger domain; This domain contains eight conserved cysteine residues that bind to two zinc ions. The CXXC domain is found in a variety of chromatin-associated proteins. This domain binds to nonmethyl-CpG dinucleotides. The domain is characterized by two CGXCXXC repeats. The RecQ helicase has a single repeat that also binds to zinc, but this has not been included in this family. The DNA binding interface has been identified by NMR." Q#5027 - CGI_10025673 superfamily 202224 298 398 1.17E-08 54.6092 cl18224 JmjC superfamily - - "JmjC domain, hydroxylase; The JmjC domain belongs to the Cupin superfamily. JmjC-domain proteins may be protein hydroxylases that catalyze a novel histone modification. This is confirmed to be a hydroxylase: the human JmjC protein named Tyw5p unexpectedly acts in the biosynthesis of a hypermodified nucleoside, hydroxy-wybutosine, in tRNA-Phe by catalyzing hydroxylation." Q#5027 - CGI_10025673 superfamily 199166 1278 1394 1.99E-05 46.1664 cl15308 AMN1 superfamily - - "Antagonist of mitotic exit network protein 1; Amn1 has been functionally characterized in Saccharomyces cerevisiae as a component of the Antagonist of MEN pathway (AMEN). The AMEN network is activated by MEN (mitotic exit network) via an active Cdc14, and in turn switches off MEN. Amn1 constitutes one of the alternative mechanisms by which MEN may be disrupted. Specifically, Amn1 binds Tem1 (Termination of M-phase, a GTPase that belongs to the RAS superfamily), and disrupts its association with Cdc15, the primary downstream target. Amn1 is a leucine-rich repeat (LRR) protein, with 12 repeats in the S. cerevisiae ortholog. 
As a negative regulator of the signal transduction pathway MEN, overexpression of AMN1 slows the growth of wild type cells. The function of the vertebrate members of this family has not been determined experimentally, they have fewer LRRs that determine the extent of this model." Q#5027 - CGI_10025673 superfamily 243074 1148 1191 2.62E-05 43.6493 cl02535 F-box-like superfamily N - F-box-like; This is an F-box-like family. Q#5028 - CGI_10025674 superfamily 241593 264 403 2.30E-12 63.0494 cl00075 HATPase_c superfamily - - "Histidine kinase-like ATPases; This family includes several ATP-binding proteins for example: histidine kinase, DNA gyrase B, topoisomerases, heat shock protein HSP90, phytochrome-like ATPases and DNA mismatch repair proteins" Q#5028 - CGI_10025674 superfamily 220754 64 217 1.65E-51 172.022 cl11087 BCDHK_Adom3 superfamily - - "Mitochondrial branched-chain alpha-ketoacid dehydrogenase kinase; Catabolism and synthesis of leucine, isoleucine and valine are finely balanced, allowing the body to make the most of dietary input but removing excesses to prevent toxic build-up of their corresponding keto-acids. This is the butyryl-CoA dehydrogenase, subunit A domain 3, a largely alpha-helical bundle of the enzyme BCDHK. This enzyme is the regulator of the dehydrogenase complex that breaks branched-chain amino-acids down, by phosphorylating and thereby inactivating it when synthesis is required. The domain is associated with family HATPase_c pfam02518 which is towards the C-terminal." Q#5030 - CGI_10025676 superfamily 248458 46 138 0.000109947 42.6861 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. 
They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#5031 - CGI_10025677 superfamily 203238 57 218 1.04E-85 253.375 cl12307 GMP_PDE_delta superfamily - - "GMP-PDE, delta subunit; GMP-PDE delta subunit was originally identified as a fourth subunit of rod-specific cGMP phosphodiesterase (PDE)(EC:3.1.4.35). The precise function of PDE delta subunit in the rod specific GMP-PDE complex is unclear. In addition, PDE delta subunit is not confined to photoreceptor cells but is widely distributed in different tissues. PDE delta subunit is thought to be a specific soluble transport factor for certain prenylated proteins and Arl2-GTP a regulator of PDE-mediated transport." 
Q#5032 - CGI_10025678 superfamily 243072 6 116 9.77E-09 53.9266 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#5032 - CGI_10025678 superfamily 243072 434 499 4.22E-05 42.3707 cl02529 ANK superfamily N - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#5033 - CGI_10025679 superfamily 248312 30 199 0.0020896 36.9477 cl17758 PMP22_Claudin superfamily - - PMP-22/EMP/MP20/Claudin family; PMP-22/EMP/MP20/Claudin family. Q#5034 - CGI_10025680 superfamily 241750 25 394 2.83E-94 289.999 cl00281 metallo-dependent_hydrolases superfamily - - "Superfamily of metallo-dependent hydrolases (also called amidohydrolase superfamily) is a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. 
The family includes urease alpha, adenosine deaminase, phosphotriesterase dihydroorotases, allantoinases, hydantoinases, AMP-, adenine and cytosine deaminases, imidazolonepropionase, aryldialkylphosphatase, chlorohydrolases, formylmethanofuran dehydrogenases and others." Q#5035 - CGI_10025681 superfamily 247723 540 616 5.59E-32 123.151 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#5035 - CGI_10025681 superfamily 247723 367 440 6.13E-31 120.106 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. 
The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#5035 - CGI_10025681 superfamily 247723 466 539 3.73E-29 114.806 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#5035 - CGI_10025681 superfamily 247723 113 171 4.07E-09 56.6883 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. 
RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#5036 - CGI_10025682 superfamily 219549 1212 1336 1.13E-20 90.0474 cl06671 SPOC superfamily - - SPOC domain; The SPOC (Spen paralogue and orthologue C-terminal) domain is involved in developmental signalling. Q#5037 - CGI_10025683 superfamily 241777 181 459 1.54E-27 110.083 cl00316 Cation_efflux superfamily - - "Cation efflux family; Members of this family are integral membrane proteins, that are found to increase tolerance to divalent metal ions such as cadmium, zinc, and cobalt. These proteins are thought to be efflux pumps that remove these ions from cells." Q#5038 - CGI_10025684 superfamily 248313 96 159 0.00679985 34.8994 cl17759 EamA superfamily N - EamA-like transporter family; This family includes many hypothetical membrane proteins of unknown function. Many of the proteins contain two copies of the aligned region. The family used to be known as DUF6. Q#5039 - CGI_10025685 superfamily 242059 1 422 9.46E-28 112.84 cl00738 MBOAT superfamily - - "MBOAT, membrane-bound O-acyltransferase family; The MBOAT (membrane bound O-acyl transferase) family of membrane proteins contains a variety of acyltransferase enzymes. A conserved histidine has been suggested to be the active site residue." Q#5040 - CGI_10025686 superfamily 201526 32 98 1.09E-28 102.231 cl09522 Synaptobrevin superfamily C - Synaptobrevin; Synaptobrevin. Q#5041 - CGI_10025687 superfamily 220393 3 289 1.11E-99 310.076 cl10751 Tmem26 superfamily - - "Transmembrane protein 26; The function of this family of transmembrane proteins has not, as yet, been determined." 
Q#5045 - CGI_10025691 superfamily 243035 8 127 7.44E-20 82.2825 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#5046 - CGI_10025692 superfamily 245201 338 562 9.81E-55 187.744 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#5047 - CGI_10025693 superfamily 190614 40 122 1.76E-33 114.58 cl15647 YEATS superfamily - - "YEATS family; We have named this family the YEATS family, after `YNK7', `ENL', `AF-9', and `TFIIF small subunit'. This family also contains the GAS41 protein. All these proteins are thought to have a transcription stimulatory activity" Q#5048 - CGI_10025694 superfamily 244870 175 389 1.22E-51 179.408 cl08238 PA superfamily - - "PA: Protease-associated (PA) domain. The PA domain is an insert domain in a diverse fraction of proteases. The significance of the PA domain to many of the proteins in which it is inserted is undetermined. It may be a protein-protein interaction domain. 
At peptidase active sites, the PA domain may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate. Proteins into which the PA domain is inserted include the following: i) various signal peptide peptidases including, hSPPL2a and 2b which catalyze the intramembrane proteolysis of tumor necrosis factor alpha, ii) various proteins containing a C3H2C3 RING finger including, Arabidopsis ReMembR-H2 protein and various E3 ubiquitin ligases such as human GRAIL (gene related to anergy in lymphocytes), iii) EDEM3 (ER-degradation-enhancing mannosidase-like 3 protein), iv) various plant vacuolar sorting receptors such as Pisum sativum BP-80, v) glutamate carboxypeptidase II (GCPII), vi) yeast aminopeptidase Y, vii) Vibrio metschnikovii VapT, a sodium dodecyl sulfate (SDS) resistant extracellular alkaline serine protease, viii) lactocepin (a cell envelope-associated protease from Lactobacillus paracasei subsp. paracasei NCDO 151), ix) various subtilisin-like proteases such as melon Cucumisin, and x) human TfR (transferrin receptor) 1 and 2." Q#5048 - CGI_10025694 superfamily 246748 397 669 2.80E-47 169.696 cl14876 Zinc_peptidase_like superfamily - - "Zinc peptidases M18, M20, M28, and M42; Zinc peptidases play vital roles in metabolic and signaling pathways throughout all kingdoms of life. This family corresponds to several clans in the MEROPS database, including the MH clan, which contains 4 families (M18, M20, M28, M42). The peptidase M20 family includes carboxypeptidases such as the glutamate carboxypeptidase from Pseudomonas, the thermostable carboxypeptidase Ss1 of broad specificity from archaea and yeast Gly-X carboxypeptidase. The dipeptidases include bacterial dipeptidase, peptidase V (PepV), a eukaryotic, non-specific dipeptidase, and two Xaa-His dipeptidases (carnosinases). 
There is also the bacterial aminopeptidase, peptidase T (PepT) that acts only on tripeptide substrates and has therefore been termed a tripeptidase. Peptidase family M28 contains aminopeptidases and carboxypeptidases, and has co-catalytic zinc ions. However, several enzymes in this family utilize other first row transition metal ions such as cobalt and manganese. Each zinc ion is tetrahedrally co-ordinated, with three amino acid ligands plus activated water; one aspartate residue binds both metal ions. The aminopeptidases in this family are also called bacterial leucyl aminopeptidases, but are able to release a variety of N-terminal amino acids. IAP aminopeptidase and aminopeptidase Y preferentially release basic amino acids while glutamate carboxypeptidase II preferentially releases C-terminal glutamates. Glutamate carboxypeptidase II and plasma glutamate carboxypeptidase hydrolyze dipeptides. Peptidase families M18 and M42 contain metalloaminopeptidases. M18 is widely distributed in bacteria and eukaryotes. However, only yeast aminopeptidase I and mammalian aspartyl aminopeptidase have been characterized in detail. Some M42 (also known as glutamyl aminopeptidase) enzymes exhibit aminopeptidase specificity while others also have acylaminoacylpeptidase activity (i.e. hydrolysis of acylated N-terminal residues)." Q#5048 - CGI_10025694 superfamily 202944 667 785 5.46E-13 66.9104 cl07854 TFR_dimer superfamily - - Transferrin receptor-like dimerisation domain; This domain is involved in dimerisation of the transferrin receptor as shown in its crystal structure. Q#5049 - CGI_10025695 superfamily 243859 62 147 1.69E-10 55.799 cl04722 PLAC8 superfamily - - PLAC8 family; This family includes the Placenta-specific gene 8 protein. Q#5050 - CGI_10012376 superfamily 207642 206 295 5.50E-28 104.881 cl02558 GED superfamily - - Dynamin GTPase effector domain; Dynamin GTPase effector domain. 
Q#5050 - CGI_10012376 superfamily 247725 36 75 1.48E-22 90.4455 cl17171 PH-like superfamily C - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#5050 - CGI_10012376 superfamily 247725 151 185 8.40E-20 82.7415 cl17171 PH-like superfamily N - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#5051 - CGI_10012377 superfamily 248469 110 208 8.11E-09 51.2167 cl17915 HAD_like superfamily - - "Haloacid dehalogenase-like hydrolases. The haloacid dehalogenase-like (HAD) superfamily includes L-2-haloacid dehalogenase, epoxide hydrolase, phosphoserine phosphatase, phosphomannomutase, phosphoglycolate phosphatase, P-type ATPase, and many others, all of which use a nucleophilic aspartate in their phosphoryl transfer reaction. All members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. Members of this superfamily are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases." 
Q#5053 - CGI_10012379 superfamily 241578 409 560 6.03E-38 139.734 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#5053 - CGI_10012379 superfamily 243051 134 272 2.71E-35 132.116 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. 
In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal cord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#5053 - CGI_10012379 superfamily 243061 299 401 7.89E-34 126.303 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#5053 - CGI_10012379 superfamily 243061 24 126 2.94E-30 115.902 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#5054 - CGI_10012380 superfamily 247755 580 815 4.54E-136 407.001 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#5054 - CGI_10012380 superfamily 216049 244 534 1.05E-11 64.9998 cl18356 ABC_membrane superfamily - - ABC transporter transmembrane region; This family represents a unit of six transmembrane helices. Many members of the ABC transporter family (pfam00005) have two such regions. 
Q#5055 - CGI_10012381 superfamily 241640 70 303 1.87E-69 218.3 cl00149 Tryp_SPc superfamily - - Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. The alignment also contains inactive enzymes that have substitutions of the catalytic triad residues. Q#5057 - CGI_10012383 superfamily 247792 18 55 7.45E-08 47.8256 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#5057 - CGI_10012383 superfamily 149875 129 271 2.24E-80 242.787 cl07515 USP8_interact superfamily N - USP8 interacting; This domain interacts with the UBP deubiquitinating enzyme USP8. 
Q#5058 - CGI_10012384 superfamily 246675 87 127 0.00802021 36.8788 cl14615 PI-PLCc_GDPD_SF superfamily NC - "Catalytic domain of phosphoinositide-specific phospholipase C-like phosphodiesterases superfamily; The PI-PLC-like phosphodiesterases superfamily represents the catalytic domains of bacterial phosphatidylinositol-specific phospholipase C (PI-PLC, EC 4.6.1.13), eukaryotic phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11), glycerophosphodiester phosphodiesterases (GP-GDE, EC 3.1.4.46), sphingomyelinases D (SMases D) (sphingomyelin phosphodiesterase D, EC 3.1.4.41) from spider venom, SMases D-like proteins, and phospholipase D (PLD) from several pathogenic bacteria, as well as their uncharacterized homologs found in organisms ranging from bacteria and archaea to metazoans, plants, and fungi. PI-PLCs are ubiquitous enzymes hydrolyzing the membrane lipid phosphoinositides to yield two important second messengers, inositol phosphates and diacylglycerol (DAG). GP-GDEs play essential roles in glycerol metabolism and catalyze the hydrolysis of glycerophosphodiesters to sn-glycerol-3-phosphate (G3P) and the corresponding alcohols that are major sources of carbon and phosphate. Both, PI-PLCs and GP-GDEs, can hydrolyze the 3'-5' phosphodiester bonds in different substrates, and utilize a similar mechanism of general base and acid catalysis with conserved histidine residues, which consists of two steps, a phosphotransfer and a phosphodiesterase reaction. This superfamily also includes Neurospora crassa ankyrin repeat protein NUC-2 and its Saccharomyces cerevisiae counterpart, Phosphate system positive regulatory protein PHO81, glycerophosphodiester phosphodiesterase (GP-GDE)-like protein SHV3 and SHV3-like proteins (SVLs). 
The residues essential for enzyme activities and metal binding are not conserved in these sequence homologs, which might suggest that the function of catalytic domains in these proteins might be distinct from those in typical PLC-like phosphodiesterases." Q#5059 - CGI_10012385 superfamily 243029 40 108 9.75E-17 71.9981 cl02422 HRM superfamily - - Hormone receptor domain; This extracellular domain contains four conserved cysteines that probably form disulphide bridges. The domain is found in a variety of hormone receptors. It may be a ligand binding domain. Q#5059 - CGI_10012385 superfamily 215647 141 215 2.46E-09 54.9221 cl18338 7tm_2 superfamily N - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GPCRs). They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#5060 - CGI_10012386 superfamily 219619 329 385 2.64E-10 56.4471 cl18518 Ion_trans_2 superfamily - - Ion channel; This family includes the two membrane helix type ion channels found in bacteria. 
Q#5060 - CGI_10012386 superfamily 243066 6 56 1.91E-09 54.0961 cl02518 BTB superfamily N - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#5061 - CGI_10012387 superfamily 217740 22 222 8.63E-71 218 cl18427 Scramblase superfamily - - Scramblase; Scramblase is palmitoylated and contains a potential protein kinase C phosphorylation site. Scramblase exhibits Ca2+-activated phospholipid scrambling activity in vitro. There are also possible SH3 and WW binding motifs. Scramblase is involved in the redistribution of phospholipids after cell activation or injury. Q#5062 - CGI_10012388 superfamily 219863 493 892 1.20E-159 485.277 cl07203 DUF1744 superfamily - - Domain of unknown function (DUF1744); This domain is found on the epsilon catalytic subunit of DNA polymerase. It is found C terminal to pfam03104 and pfam00136. Q#5062 - CGI_10012388 superfamily 245235 23 116 1.34E-50 190.195 cl10023 POLBc superfamily N - "DNA polymerase type-B family catalytic domain. DNA-directed DNA polymerases elongate DNA by adding nucleotide triphosphate (dNTP) residues to the 3'-end of the growing chain of DNA. 
DNA-directed DNA polymerases are multifunctional with both synthetic (polymerase) and degradative modes (exonucleases) and play roles in the processes of DNA replication, repair, and recombination. DNA-dependent DNA polymerases can be classified in six main groups based upon their phylogenetic relationships with E. coli polymerase I (class A), E. coli polymerase II (class B), E. coli polymerase III (class C), euryarchaeota polymerase II (class D), human polymerase beta (class x), E. coli UmuC/DinB, and eukaryotic RAP 30/Xeroderma pigmentosum variant (class Y). Family B DNA polymerases include E. coli DNA polymerase II, some eubacterial phage DNA polymerases, nuclear replicative DNA polymerases (alpha, delta, epsilon, and zeta), and eukaryotic viral and plasmid-borne enzymes. DNA polymerase is made up of distinct domains and sub-domains. The polymerase domain of DNA polymerase type B (Pol domain) is responsible for the template-directed polymerization of dNTPs onto the growing primer strand of duplex DNA that is usually magnesium dependent. In general, the architecture of the Pol domain has been likened to a right hand with fingers, thumb, and palm sub-domains with a deep groove to accommodate the nucleic acid substrate. There are a few conserved motifs in the Pol domain of family B DNA polymerases. The conserved aspartic acid residues in the DTDS motifs of the palm sub-domain is crucial for binding to divalent metal ion and is suggested to be important for polymerase catalysis." Q#5063 - CGI_10012389 superfamily 245235 461 992 0 1051.12 cl10023 POLBc superfamily - - "DNA polymerase type-B family catalytic domain. DNA-directed DNA polymerases elongate DNA by adding nucleotide triphosphate (dNTP) residues to the 3'-end of the growing chain of DNA. DNA-directed DNA polymerases are multifunctional with both synthetic (polymerase) and degradative modes (exonucleases) and play roles in the processes of DNA replication, repair, and recombination. 
DNA-dependent DNA polymerases can be classified in six main groups based upon their phylogenetic relationships with E. coli polymerase I (class A), E. coli polymerase II (class B), E. coli polymerase III (class C), euryarchaeota polymerase II (class D), human polymerase beta (class x), E. coli UmuC/DinB, and eukaryotic RAP 30/Xeroderma pigmentosum variant (class Y). Family B DNA polymerases include E. coli DNA polymerase II, some eubacterial phage DNA polymerases, nuclear replicative DNA polymerases (alpha, delta, epsilon, and zeta), and eukaryotic viral and plasmid-borne enzymes. DNA polymerase is made up of distinct domains and sub-domains. The polymerase domain of DNA polymerase type B (Pol domain) is responsible for the template-directed polymerization of dNTPs onto the growing primer strand of duplex DNA that is usually magnesium dependent. In general, the architecture of the Pol domain has been likened to a right hand with fingers, thumb, and palm sub-domains with a deep groove to accommodate the nucleic acid substrate. There are a few conserved motifs in the Pol domain of family B DNA polymerases. The conserved aspartic acid residues in the DTDS motifs of the palm sub-domain is crucial for binding to divalent metal ion and is suggested to be important for polymerase catalysis." Q#5063 - CGI_10012389 superfamily 245226 198 401 4.62E-146 435.921 cl10012 DnaQ_like_exo superfamily - - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). 
The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#5064 - CGI_10012390 superfamily 220417 2 142 1.76E-12 60.8706 cl10785 MRP-L28 superfamily - - Mitochondrial ribosomal protein L28; Members of this family are components of the mitochondrial large ribosomal subunit. Mature mitochondrial ribosomes consist of a small (37S) and a large (54S) subunit. The 37S subunit contains at least 33 different proteins and 1 molecule of RNA (15S). The 54S subunit contains at least 45 different proteins and 1 molecule of RNA (21S). Q#5065 - CGI_10012391 superfamily 149675 4 164 1.70E-115 327.09 cl07351 UFC1 superfamily - - "Ubiquitin-fold modifier-conjugating enzyme 1; Ubiquitin-like (UBL) post-translational modifiers are covalently linked to most, if not all, target protein(s) through an enzymatic cascade analogous to ubiquitylation, consisting of E1 (activating), E2 (conjugating), and E3 (ligating) enzymes. Ubiquitin-fold modifier 1 (Ufm1) a ubiquitin-like protein is activated by a novel E1-like enzyme, Uba5, by forming a high-energy thioester bond. Activated Ufm1 is then transferred to its cognate E2-like enzyme, Ufc1, in a similar thioester linkage. This family represents the E2-like enzyme." 
Q#5067 - CGI_10012393 superfamily 246669 6 117 1.71E-32 119.666 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#5067 - CGI_10012393 superfamily 245205 255 334 1.43E-08 51.4697 cl09930 RPA_2b-aaRSs_OBF_like superfamily - - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. 
RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." Q#5069 - CGI_10012395 superfamily 245814 54 120 3.27E-11 59.4251 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#5069 - CGI_10012395 superfamily 245814 164 243 3.67E-27 104.53 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#5069 - CGI_10012395 superfamily 245814 266 359 1.31E-14 69.753 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. 
The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#5072 - CGI_10012398 superfamily 247741 25 91 5.06E-12 59.8691 cl17187 Aldolase_Class_I superfamily N - "Class I aldolases; Class I aldolases. The class I aldolases use an active-site lysine which stabilizes reaction intermediates via Schiff base formation, and have a TIM beta/alpha barrel fold. The members of this family include 2-keto-3-deoxy-6-phosphogluconate (KDPG) and 2-keto-4-hydroxyglutarate (KHG) aldolases, transaldolase, dihydrodipicolinate synthase sub-family, Type I 3-dehydroquinate dehydratase, DeoC and DhnA proteins, and metal-independent fructose-1,6-bisphosphate aldolase. Although structurally similar, the class II aldolases use a different mechanism and are believed to have an independent evolutionary origin." Q#5073 - CGI_10005379 superfamily 246902 304 490 2.52E-79 257.505 cl15239 PLDc_SF superfamily - - "Catalytic domain of phospholipase D superfamily proteins; Catalytic domain of phospholipase D (PLD) superfamily proteins. The PLD superfamily is composed of a large and diverse group of proteins including plant, mammalian and bacterial PLDs, bacterial cardiolipin (CL) synthases, bacterial phosphatidylserine synthases (PSS), eukaryotic phosphatidylglycerophosphate (PGP) synthase, eukaryotic tyrosyl-DNA phosphodiesterase 1 (Tdp1), and some bacterial endonucleases (Nuc and BfiI), among others. PLD enzymes hydrolyze phospholipid phosphodiester bonds to yield phosphatidic acid and a free polar head group. 
They can also catalyze the transphosphatidylation of phospholipids to acceptor alcohols. The majority of members in this superfamily contain a short conserved sequence motif (H-x-K-x(4)-D, where x represents any amino acid residue), called the HKD signature motif. There are varying expanded forms of this motif in different family members. Some members contain variant HKD motifs. Most PLD enzymes are monomeric proteins with two HKD motif-containing domains. Two HKD motifs from two domains form a single active site. Some PLD enzymes have only one copy of the HKD motif per subunit but form a functionally active dimer, which has a single active site at the dimer interface containing the two HKD motifs from both subunits. Different PLD enzymes may have evolved through domain fusion of a common catalytic core with separate substrate recognition domains. Despite their various catalytic functions and a very broad range of substrate specificities, the diverse group of PLD enzymes can bind to a phosphodiester moiety. Most of them are active as bi-lobed monomers or dimers, and may possess similar core structures for catalytic activity. They are generally thought to utilize a common two-step ping-pong catalytic mechanism, involving an enzyme-substrate intermediate, to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group." Q#5073 - CGI_10005379 superfamily 246902 69 241 3.35E-79 256.321 cl15239 PLDc_SF superfamily - - "Catalytic domain of phospholipase D superfamily proteins; Catalytic domain of phospholipase D (PLD) superfamily proteins. 
The PLD superfamily is composed of a large and diverse group of proteins including plant, mammalian and bacterial PLDs, bacterial cardiolipin (CL) synthases, bacterial phosphatidylserine synthases (PSS), eukaryotic phosphatidylglycerophosphate (PGP) synthase, eukaryotic tyrosyl-DNA phosphodiesterase 1 (Tdp1), and some bacterial endonucleases (Nuc and BfiI), among others. PLD enzymes hydrolyze phospholipid phosphodiester bonds to yield phosphatidic acid and a free polar head group. They can also catalyze the transphosphatidylation of phospholipids to acceptor alcohols. The majority of members in this superfamily contain a short conserved sequence motif (H-x-K-x(4)-D, where x represents any amino acid residue), called the HKD signature motif. There are varying expanded forms of this motif in different family members. Some members contain variant HKD motifs. Most PLD enzymes are monomeric proteins with two HKD motif-containing domains. Two HKD motifs from two domains form a single active site. Some PLD enzymes have only one copy of the HKD motif per subunit but form a functionally active dimer, which has a single active site at the dimer interface containing the two HKD motifs from both subunits. Different PLD enzymes may have evolved through domain fusion of a common catalytic core with separate substrate recognition domains. Despite their various catalytic functions and a very broad range of substrate specificities, the diverse group of PLD enzymes can bind to a phosphodiester moiety. Most of them are active as bi-lobed monomers or dimers, and may possess similar core structures for catalytic activity. They are generally thought to utilize a common two-step ping-pong catalytic mechanism, involving an enzyme-substrate intermediate, to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. 
Upon substrate binding, a histidine from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group." Q#5073 - CGI_10005379 superfamily 199166 755 882 2.11E-05 45.396 cl15308 AMN1 superfamily - - "Antagonist of mitotic exit network protein 1; Amn1 has been functionally characterized in Saccharomyces cerevisiae as a component of the Antagonist of MEN pathway (AMEN). The AMEN network is activated by MEN (mitotic exit network) via an active Cdc14, and in turn switches off MEN. Amn1 constitutes one of the alternative mechanisms by which MEN may be disrupted. Specifically, Amn1 binds Tem1 (Termination of M-phase, a GTPase that belongs to the RAS superfamily), and disrupts its association with Cdc15, the primary downstream target. Amn1 is a leucine-rich repeat (LRR) protein, with 12 repeats in the S. cerevisiae ortholog. As a negative regulator of the signal transduction pathway MEN, overexpression of AMN1 slows the growth of wild type cells. The function of the vertebrate members of this family has not been determined experimentally, they have fewer LRRs that determine the extent of this model." Q#5075 - CGI_10005381 superfamily 193251 2416 2686 3.94E-155 485.21 cl18188 AAA_7 superfamily - - "P-loop containing dynein motor region D3; the 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This particular family is the D3 and is an ATP binding site." 
Q#5075 - CGI_10005381 superfamily 193253 3043 3386 8.56E-155 487.622 cl15084 MT superfamily - - "Microtubule-binding stalk of dynein motor; the 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This family is the region between D4 and D5 and is the two predicted alpha-helical coiled coil segments that form the stalk supporting the ATP-sensitive microtubule binding component." Q#5075 - CGI_10005381 superfamily 193256 2763 3030 1.14E-152 477.903 cl18189 AAA_8 superfamily - - "P-loop containing dynein motor region D4; The 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This particular family is the D4 ATP-binding region of the motor." Q#5075 - CGI_10005381 superfamily 193257 3404 3631 2.24E-135 426.709 cl15086 AAA_9 superfamily - - "ATP-binding dynein motor region D5; The 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This particular family is the D5 ATP-binding region of the motor, but has lost its P-loop." 
Q#5075 - CGI_10005381 superfamily 247743 2123 2244 3.97E-08 54.994 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperones, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#5077 - CGI_10005383 superfamily 241622 137 216 2.80E-11 58.7322 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either an N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." 
Q#5077 - CGI_10005383 superfamily 221480 22 63 0.007449 35.9323 cl13647 MSA-2c superfamily N - Merozoite surface antigen 2c; This family of proteins is found in eukaryotes. Proteins in this family are typically between 263 and 318 amino acids in length. There is a conserved SFT sequence motif. MSA-2 is a plasma membrane glycoprotein which can be found in Babesia bovis species. Q#5078 - CGI_10005384 superfamily 222370 27 97 1.12E-16 71.4001 cl16386 Longin superfamily - - "Regulated-SNARE-like domain; Longin is one of the approximately 26 components required for transporting proteins from the ER to the plasma membrane, via the Golgi apparatus. It is necessary for the steps of the transfer from the ER to the Golgi complex. Longins are the only R-SNAREs that are common to all eukaryotes, and they are characterized by a conserved N-terminal domain with a profilin-like fold called a longin domain." Q#5078 - CGI_10005384 superfamily 201526 117 172 8.47E-16 69.1041 cl09522 Synaptobrevin superfamily C - Synaptobrevin; Synaptobrevin. Q#5079 - CGI_10005385 superfamily 241739 223 481 1.46E-108 324.499 cl00268 class_II_aaRS-like_core superfamily - - "Class II tRNA amino-acyl synthetase-like catalytic core domain. Class II amino acyl-tRNA synthetases (aaRS) share a common fold and generally attach an amino acid to the 3' OH of ribose of the appropriate tRNA. PheRS is an exception in that it attaches the amino acid at the 2'-OH group, like class I aaRSs. These enzymes are usually homodimers. This domain is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. The substrate specificity of this reaction is further determined by additional domains. Intererestingly, this domain is also found is asparagine synthase A (AsnA), in the accessory subunit of mitochondrial polymerase gamma and in the bacterial ATP phosphoribosyltransferase regulatory subunit HisZ." 
Q#5080 - CGI_10005386 superfamily 218209 85 299 1.58E-48 162.534 cl12298 DUF607 superfamily - - "Protein of unknown function, DUF607; This family represents a conserved region found in several uncharacterized eukaryotic proteins." Q#5081 - CGI_10005387 superfamily 242908 6 148 1.76E-64 197.396 cl02155 ER_lumen_recept superfamily - - ER lumen protein retaining receptor; ER lumen protein retaining receptor. Q#5086 - CGI_10007651 superfamily 220325 399 579 7.06E-32 123.282 cl09853 Med18 superfamily - - "Med18 protein; Med18 is one subunit of Mediator, a head-module multiprotein complex, that stimulates basal RNA polymerase II (Pol II) transcription. Med18 consists of an eight-stranded beta-barrel with a central pore and three flanking helices. It complexes with Med8 and Med20 proteins by forming a heterodimer of two-fold symmetry with Med20 and binding the C-terminal alpha-helix region of Med8 across the top of its barrel. This complex creates a multipartite TBP-binding site that can be modulated by transcriptional activators." Q#5086 - CGI_10007651 superfamily 241583 19 61 4.80E-06 46.2593 cl00064 ZnMc superfamily NC - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#5087 - CGI_10007652 superfamily 247057 346 397 1.60E-05 42.1967 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. 
They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#5087 - CGI_10007652 superfamily 221744 46 91 0.00261634 37.8007 cl18614 CABIT superfamily C - "Cell-cycle sustaining, positive selection,; The 'CABIT' domain (for 'cysteine-containing, all- in Themis') is found in a newly identified gene family that has three mammalian homologues (Themis, Icb1 and 9130404H23Rik) that encode proteins with two CABIT domains and a highly conserved proline-rich region. In contrast, Fam59A, Fam59B and related proteins from mammals to cnidarians, including the insect Serrano proteins, have a single copy of the CABIT domain, a proline-rich region and often a C-terminal SAM (sterile-motif) domain. Multiple-sequence alignment has predicted that the CABIT domain adopts an all-strand structure with at least 12 strands, ie a dyad of six-stranded beta-barrel units. The CABIT domain contains a nearly absolutely conserved cysteine residue which is likely to be central to its function. CABIT domain proteins function downstream of tyrosine kinase signalling and interact with GRB2." Q#5088 - CGI_10007653 superfamily 220226 106 363 1.46E-97 292.98 cl09658 XendoU superfamily - - Endoribonuclease XendoU; This is a family of endoribonucleases involved in RNA biosynthesis which has been named XendoU in Xenopus laevis. XendoU is a U-specific metal dependent enzyme that produces products with a 2'-3' cyclic phosphate termini. Q#5088 - CGI_10007653 superfamily 207618 18 58 9.66E-11 56.6746 cl02508 Somatomedin_B superfamily - - Somatomedin B domain; Somatomedin B domain. Q#5088 - CGI_10007653 superfamily 207618 58 96 1.83E-10 55.9042 cl02508 Somatomedin_B superfamily - - Somatomedin B domain; Somatomedin B domain. 
Q#5090 - CGI_10007655 superfamily 243098 165 209 1.60E-06 46.0519 cl02573 TUDOR superfamily - - "Tudor domains are found in many eukaryotic organisms and have been implicated in protein-protein interactions in which methylated protein substrates bind to these domains. For example, the Tudor domain of Survival of Motor Neuron (SMN) binds to symmetrically dimethylated arginines of arginine-glycine (RG) rich sequences found in the C-terminal tails of Sm proteins. The SMN protein is linked to spinal muscular atrophy. Another example is the tandem tudor domains of 53BP1, which bind to histone H4 specifically dimethylated at Lys20 (H4-K20me2). 53BP1 is a key transducer of the DNA damage checkpoint signal." Q#5090 - CGI_10007655 superfamily 247999 677 717 7.12E-08 50.1816 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#5090 - CGI_10007655 superfamily 248259 82 146 2.62E-07 49.1702 cl17705 MBT superfamily - - "mbt repeat; The function of this repeat is unknown, but is found in a number of nuclear proteins such as drosophila sex comb on midleg protein. The repeat is found in up to four copies. The repeat contains a completely conserved glutamate at its amino terminus that may be important for function." Q#5093 - CGI_10007658 superfamily 245213 64 98 9.26E-08 49.5574 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. 
Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#5093 - CGI_10007658 superfamily 245213 148 191 2.74E-06 45.3202 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#5093 - CGI_10007658 superfamily 245213 193 228 8.81E-05 40.6978 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#5093 - CGI_10007658 superfamily 245213 105 141 0.000217144 39.5422 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. 
Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#5093 - CGI_10007658 superfamily 245213 20 57 0.00391578 36.0754 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#5093 - CGI_10007658 superfamily 205157 238 275 0.00524395 35.5911 cl18264 EGF_3 superfamily - - EGF domain; This family includes a variety of EGF-like domain homologues. This family includes the C-terminal domain of the malaria parasite MSP1 protein. Q#5094 - CGI_10007659 superfamily 245213 10 52 1.25E-08 51.4834 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#5098 - CGI_10007082 superfamily 241563 67 108 2.39E-06 45.1628 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#5098 - CGI_10007082 superfamily 241563 27 58 0.00116743 37.0736 cl00034 BBOX superfamily N - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#5099 - CGI_10007083 superfamily 242536 147 228 2.73E-29 106.027 cl01497 ATase superfamily - - "The DNA repair protein O6-alkylguanine-DNA alkyltransferase (ATase; also known as AGT, AGAT and MGMT) reverses O6-alkylation DNA damage by transferring O6-alkyl adducts to an active site cysteine irreversibly, without inducing DNA strand breaks. ATases are specific for repair of guanines with O6-alkyl adducts, however human ATase is not limited to O6-methylguanine, repairing many other adducts at the O6-position of guanine as well. ATase is widely distributed among species. Most ATases have N- and C-terminal domains. The C-terminal domain contains the conserved active-site cysteine motif (PCHR), the O6-alkylguanine binding channel, and the helix-turn-helix (HTH) DNA-binding motif. The active site is located near the recognition helix of the HTH motif. While the C-terminal domain of ATase contains residues that are necessary for DNA binding and alkyl transfer, the function of the N-terminal domain is still unknown. Removal of the N-terminal domain abolishes the activity of the C-terminal domain, suggesting an important structural role for the N-terminal domain in orienting the C-terminal domain for proper catalysis. 
Some ATase C-terminal domain homologs are either single-domain proteins that lack an N-terminal domain, or have a tryptophan substituted in place of the acceptor cysteine (i.e. the motif PCHR is replaced by PWHR). ATase null mutant mice are viable, fertile, and have a normal lifespan." Q#5103 - CGI_10007087 superfamily 243362 209 250 8.34E-05 41.2567 cl03262 DnaJ_C superfamily N - C-terminal substrate binding domain of DnaJ and HSP40; The C-terminal region of the DnaJ/Hsp40 protein mediates oligomerization and binding to denatured polypeptide substrate. DnaJ/Hsp40 is a widely conserved heat-shock protein. It prevents the aggregation of unfolded substrate and forms a ternary complex with both substrate and DnaK/Hsp70; the N-terminal J-domain of DnaJ/Hsp40 stimulates the ATPase activity of DnaK/Hsp70. Q#5103 - CGI_10007087 superfamily 110440 335 361 0.000255661 38.5429 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#5109 - CGI_10003693 superfamily 247744 250 369 4.38E-08 51.5004 cl17190 NK superfamily N - "Nucleoside/nucleotide kinase (NK) is a protein superfamily consisting of multiple families of enzymes that share structural similarity and are functionally related to the catalysis of the reversible phosphate group transfer from nucleoside triphosphates to nucleosides/nucleotides, nucleoside monophosphates, or sugars. 
Members of this family play a wide variety of essential roles in nucleotide metabolism, the biosynthesis of coenzymes and aromatic compounds, as well as the metabolism of sugar and sulfate." Q#5109 - CGI_10003693 superfamily 241782 70 141 0.00501046 36.2063 cl00321 AAT_I superfamily N - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phosphorylase family (fold type V)." Q#5111 - CGI_10003696 superfamily 219843 5 199 2.13E-81 249.078 cl18528 ATP-grasp_2 superfamily - - ATP-grasp domain; ATP-grasp domain. 
Q#5111 - CGI_10003696 superfamily 215988 258 309 1.07E-11 61.1184 cl18355 Ligase_CoA superfamily C - "CoA-ligase; This family includes the CoA ligases Succinyl-CoA synthetase alpha and beta chains, malate CoA ligase and ATP-citrate lyase. Some members of the family utilise ATP others use GTP." Q#5112 - CGI_10003697 superfamily 247724 23 302 2.99E-130 388.912 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#5112 - CGI_10003697 superfamily 216255 226 516 8.98E-121 364.943 cl03076 Dynamin_M superfamily - - "Dynamin central region; This region lies between the GTPase domain, see pfam00350, and the pleckstrin homology (PH) domain, see pfam00169." 
Q#5112 - CGI_10003697 superfamily 207642 593 682 1.79E-25 101.414 cl02558 GED superfamily - - Dynamin GTPase effector domain; Dynamin GTPase effector domain. Q#5113 - CGI_10000428 superfamily 247098 7 92 7.38E-20 78.82 cl15841 COG0229 superfamily N - "Conserved domain frequently associated with peptide methionine sulfoxide reductase [Posttranslational modification, protein turnover, chaperones]" Q#5115 - CGI_10002622 superfamily 222150 89 114 7.54E-05 37.7565 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#5115 - CGI_10002622 superfamily 222150 118 141 0.000168036 36.6009 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#5118 - CGI_10012067 superfamily 192566 29 151 1.05E-29 115.094 cl18180 COG5 superfamily - - "Golgi transport complex subunit 5; The COG complex, the peripheral membrane oligomeric protein complex involved in intra-Golgi protein trafficking, consists of eight subunits arranged in two lobes bridged by Cog1. Cog5 is in the smaller, B lobe, bound in with Cog6-8, and is itself bound to Cog1 as well as, strongly, to Cog7." Q#5119 - CGI_10012068 superfamily 247740 22 251 6.50E-96 285.928 cl17186 TIM_phosphate_binding superfamily - - "TIM barrel proteins share a structurally conserved phosphate binding motif and in general share an eight beta/alpha closed barrel structure. Specific for this family is the conserved phosphate binding site at the edges of strands 7 and 8. The phosphate comes either from the substrate, as in the case of inosine monophosphate dehydrogenase (IMPDH), or from ribulose-5-phosphate 3-epimerase (RPE) or from cofactors, like FMN." Q#5120 - CGI_10012069 superfamily 245201 167 416 4.74E-61 210.941 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. 
It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#5120 - CGI_10012069 superfamily 221460 443 480 1.88E-12 64.7475 cl12053 OSR1_C superfamily - - "Oxidative-stress-responsive kinase 1 C terminal; This domain family is found in eukaryotes, and is approximately 40 amino acids in length. The family is found in association with pfam00069. There is a single completely conserved residue F that may be functionally important. OSR1 is involved in the signalling cascade which activates Na/K/2Cl cotransporter during osmotic stress. This domain is the C terminal domain of OSR1 which recognises a motif (Arg-Phe-Xaa-Val) on the OSR1-activating protein WNK1." Q#5123 - CGI_10012072 superfamily 247743 488 666 0.000135167 41.7479 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#5123 - CGI_10012072 superfamily 221668 83 184 1.56E-05 44.626 cl13991 MCM2_N superfamily N - "Mini-chromosome maintenance protein 2; This domain family is found in eukaryotes, and is typically between 138 and 153 amino acids in length. The family is found in association with pfam00493. 
Mini-chromosome maintenance (MCM) proteins are essential for DNA replication. These proteins use ATPase activity to perform this function." Q#5125 - CGI_10012074 superfamily 246918 30 89 4.49E-16 69.5379 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#5126 - CGI_10012075 superfamily 246918 192 252 2.01E-09 54.9003 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#5126 - CGI_10012075 superfamily 246918 255 317 1.37E-07 49.5075 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#5126 - CGI_10012075 superfamily 246918 142 187 0.000110543 41.0331 cl15278 TSP_1 superfamily N - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#5127 - CGI_10012076 superfamily 245201 811 1048 4.31E-20 90.3737 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#5127 - CGI_10012076 superfamily 247724 149 197 6.75E-14 71.0385 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#5127 - CGI_10012076 superfamily 247724 248 398 5.06E-06 46.771 cl17170 Ras_like_GTPase superfamily N - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. 
Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#5128 - CGI_10012077 superfamily 245201 827 986 2.75E-13 69.5729 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#5128 - CGI_10012077 superfamily 247724 162 226 1.35E-09 57.5566 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. 
Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#5129 - CGI_10012078 superfamily 245201 721 994 2.87E-46 166.258 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#5129 - CGI_10012078 superfamily 247724 232 371 1.19E-08 54.475 cl17170 Ras_like_GTPase superfamily N - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. 
Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#5129 - CGI_10012078 superfamily 247724 129 167 1.19E-05 45.6154 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#5130 - CGI_10012079 superfamily 245201 81 133 0.000111852 39.099 cl09925 PKc_like superfamily C - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. 
These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#5131 - CGI_10012080 superfamily 245201 13 96 2.62E-10 54.4648 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#5132 - CGI_10012081 superfamily 247724 124 162 3.66E-07 49.8526 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. 
Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#5132 - CGI_10012081 superfamily 247724 227 342 8.70E-05 42.919 cl17170 Ras_like_GTPase superfamily N - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#5132 - CGI_10012081 superfamily 245814 534 631 0.00207353 37.4849 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#5133 - CGI_10012082 superfamily 203031 31 90 1.68E-07 43.856 cl04548 FLYWCH superfamily - - "FLYWCH zinc finger domain; Mutations in the mod(mdg4) gene have effects on variegation (PEV), the properties of insulator sequences, correct path-finding of growing nerve cells, meiotic pairing of chromosomes, and apoptosis. The occurrence of FLYWCH motifs in mod(mdg4) gene product and other proteins is discussed in." Q#5135 - CGI_10012084 superfamily 241659 96 174 3.23E-27 99.9018 cl00175 alpha-crystallin-Hsps_p23-like superfamily - - "alpha-crystallin domain (ACD) found in alpha-crystallin-type small heat shock proteins, and a similar domain found in p23 (a cochaperone for Hsp90) and in other p23-like proteins.; The alpha-crystallin-Hsps_p23-like superfamily includes the alpha-crystallin domain (ACD) of alpha-crystallin-type small heat shock proteins (sHsps) and a similar domain found in p23-like proteins. sHsps are small stress induced proteins with monomeric masses between 12-43 kDa, whose common feature is this ACD. sHsps are generally active as large oligomers consisting of multiple subunits, and are believed to be ATP-independent chaperones that prevent aggregation and are important in refolding in combination with other Hsps. p23 is a cochaperone of the Hsp90 chaperoning pathway. It binds Hsp90 and participates in the folding of a number of Hsp90 clients including the progesterone receptor. p23 also has a passive chaperoning activity. p23 in addition may act as the cytosolic prostaglandin E2 synthase. 
Included in this superfamily is the p23-like C-terminal CHORD-SGT1 (CS) domain of suppressor of G2 allele of Skp1 (Sgt1) and the p23-like domains of human butyrate-induced transcript 1 (hB-ind1), NUD (nuclear distribution) C, Melusin, and NAD(P)H cytochrome b5 (NCB5) oxidoreductase (OR)." Q#5136 - CGI_10012085 superfamily 245602 1439 1610 1.89E-96 318.344 cl11402 GH31 superfamily N - "The enzymes of glycosyl hydrolase family 31 (GH31) occur in prokaryotes, eukaryotes, and archaea with a wide range of hydrolytic activities, including alpha-glucosidase (glucoamylase and sucrase-isomaltase), alpha-xylosidase, 6-alpha-glucosyltransferase, 3-alpha-isomaltosyltransferase and alpha-1,4-glucan lyase. All GH31 enzymes cleave a terminal carbohydrate moiety from a substrate that varies considerably in size, depending on the enzyme, and may be either a starch or a glycoprotein. In most cases, the pyranose moiety recognized in subsite -1 of the substrate binding site is an alpha-D-glucose, though some GH31 family members show a preference for alpha-D-xylose. Several GH31 enzymes can accommodate both glucose and xylose and different levels of discrimination between the two have been observed. Most characterized GH31 enzymes are alpha-glucosidases. In mammals, GH31 members with alpha-glucosidase activity are implicated in at least three distinct biological processes. The lysosomal acid alpha-glucosidase (GAA) is essential for glycogen degradation and a deficiency or malfunction of this enzyme causes glycogen storage disease II, also known as pompe disease. In the endoplasmic reticulum, alpha-glucosidase II catalyzes the second step in the N-linked oligosaccharide processing pathway that constitutes part of the quality control system for glycoprotein folding and maturation. 
The intestinal enzymes sucrase-isomaltase (SI) and maltase-glucoamylase (MGAM) play key roles in the final stage of carbohydrate digestion, making alpha-glucosidase inhibitors useful in the treatment of type 2 diabetes. GH31 alpha-glycosidases are retaining enzymes that cleave their substrates via an acid/base-catalyzed, double-displacement mechanism involving a covalent glycosyl-enzyme intermediate. Two aspartic acid residues have been identified as the catalytic nucleophile and the acid/base, respectively." Q#5136 - CGI_10012085 superfamily 245602 2079 2571 4.95E-169 529.506 cl11402 GH31 superfamily - - "The enzymes of glycosyl hydrolase family 31 (GH31) occur in prokaryotes, eukaryotes, and archaea with a wide range of hydrolytic activities, including alpha-glucosidase (glucoamylase and sucrase-isomaltase), alpha-xylosidase, 6-alpha-glucosyltransferase, 3-alpha-isomaltosyltransferase and alpha-1,4-glucan lyase. All GH31 enzymes cleave a terminal carbohydrate moiety from a substrate that varies considerably in size, depending on the enzyme, and may be either a starch or a glycoprotein. In most cases, the pyranose moiety recognized in subsite -1 of the substrate binding site is an alpha-D-glucose, though some GH31 family members show a preference for alpha-D-xylose. Several GH31 enzymes can accommodate both glucose and xylose and different levels of discrimination between the two have been observed. Most characterized GH31 enzymes are alpha-glucosidases. In mammals, GH31 members with alpha-glucosidase activity are implicated in at least three distinct biological processes. The lysosomal acid alpha-glucosidase (GAA) is essential for glycogen degradation and a deficiency or malfunction of this enzyme causes glycogen storage disease II, also known as pompe disease. 
In the endoplasmic reticulum, alpha-glucosidase II catalyzes the second step in the N-linked oligosaccharide processing pathway that constitutes part of the quality control system for glycoprotein folding and maturation. The intestinal enzymes sucrase-isomaltase (SI) and maltase-glucoamylase (MGAM) play key roles in the final stage of carbohydrate digestion, making alpha-glucosidase inhibitors useful in the treatment of type 2 diabetes. GH31 alpha-glycosidases are retaining enzymes that cleave their substrates via an acid/base-catalyzed, double-displacement mechanism involving a covalent glycosyl-enzyme intermediate. Two aspartic acid residues have been identified as the catalytic nucleophile and the acid/base, respectively." Q#5136 - CGI_10012085 superfamily 245602 289 783 4.19E-142 453.237 cl11402 GH31 superfamily - - "The enzymes of glycosyl hydrolase family 31 (GH31) occur in prokaryotes, eukaryotes, and archaea with a wide range of hydrolytic activities, including alpha-glucosidase (glucoamylase and sucrase-isomaltase), alpha-xylosidase, 6-alpha-glucosyltransferase, 3-alpha-isomaltosyltransferase and alpha-1,4-glucan lyase. All GH31 enzymes cleave a terminal carbohydrate moiety from a substrate that varies considerably in size, depending on the enzyme, and may be either a starch or a glycoprotein. In most cases, the pyranose moiety recognized in subsite -1 of the substrate binding site is an alpha-D-glucose, though some GH31 family members show a preference for alpha-D-xylose. Several GH31 enzymes can accommodate both glucose and xylose and different levels of discrimination between the two have been observed. Most characterized GH31 enzymes are alpha-glucosidases. In mammals, GH31 members with alpha-glucosidase activity are implicated in at least three distinct biological processes. 
The lysosomal acid alpha-glucosidase (GAA) is essential for glycogen degradation and a deficiency or malfunction of this enzyme causes glycogen storage disease II, also known as pompe disease. In the endoplasmic reticulum, alpha-glucosidase II catalyzes the second step in the N-linked oligosaccharide processing pathway that constitutes part of the quality control system for glycoprotein folding and maturation. The intestinal enzymes sucrase-isomaltase (SI) and maltase-glucoamylase (MGAM) play key roles in the final stage of carbohydrate digestion, making alpha-glucosidase inhibitors useful in the treatment of type 2 diabetes. GH31 alpha-glycosidases are retaining enzymes that cleave their substrates via an acid/base-catalyzed, double-displacement mechanism involving a covalent glycosyl-enzyme intermediate. Two aspartic acid residues have been identified as the catalytic nucleophile and the acid/base, respectively." Q#5136 - CGI_10012085 superfamily 245602 1196 1371 3.15E-62 219.348 cl11402 GH31 superfamily C - "The enzymes of glycosyl hydrolase family 31 (GH31) occur in prokaryotes, eukaryotes, and archaea with a wide range of hydrolytic activities, including alpha-glucosidase (glucoamylase and sucrase-isomaltase), alpha-xylosidase, 6-alpha-glucosyltransferase, 3-alpha-isomaltosyltransferase and alpha-1,4-glucan lyase. All GH31 enzymes cleave a terminal carbohydrate moiety from a substrate that varies considerably in size, depending on the enzyme, and may be either a starch or a glycoprotein. In most cases, the pyranose moiety recognized in subsite -1 of the substrate binding site is an alpha-D-glucose, though some GH31 family members show a preference for alpha-D-xylose. Several GH31 enzymes can accommodate both glucose and xylose and different levels of discrimination between the two have been observed. Most characterized GH31 enzymes are alpha-glucosidases. 
In mammals, GH31 members with alpha-glucosidase activity are implicated in at least three distinct biological processes. The lysosomal acid alpha-glucosidase (GAA) is essential for glycogen degradation and a deficiency or malfunction of this enzyme causes glycogen storage disease II, also known as pompe disease. In the endoplasmic reticulum, alpha-glucosidase II catalyzes the second step in the N-linked oligosaccharide processing pathway that constitutes part of the quality control system for glycoprotein folding and maturation. The intestinal enzymes sucrase-isomaltase (SI) and maltase-glucoamylase (MGAM) play key roles in the final stage of carbohydrate digestion, making alpha-glucosidase inhibitors useful in the treatment of type 2 diabetes. GH31 alpha-glycosidases are retaining enzymes that cleave their substrates via an acid/base-catalyzed, double-displacement mechanism involving a covalent glycosyl-enzyme intermediate. Two aspartic acid residues have been identified as the catalytic nucleophile and the acid/base, respectively." Q#5136 - CGI_10012085 superfamily 241612 917 961 7.59E-06 46.183 cl00103 Trefoil superfamily - - "P or trefoil or TFF domain; Trefoil factor family domain peptides are mucin-associated molecules, largely found in epithelia of gastrointestinal tissues. Function is not known but it was originally identified from mucosal tissues, where it may have a regulatory or structural role and has also been implicated as a growth factor in other tissues. The domain is found in 1 to 6 copies where it occurs." Q#5136 - CGI_10012085 superfamily 241612 1815 1862 0.00128259 39.291 cl00103 Trefoil superfamily - - "P or trefoil or TFF domain; Trefoil factor family domain peptides are mucin-associated molecules, largely found in epithelia of gastrointestinal tissues. 
Function is not known but it was originally identified from mucosal tissues, where it may have a regulatory or structural role and has also been implicated as a growth factor in other tissues. The domain is found in 1 to 6 copies where it occurs." Q#5140 - CGI_10012090 superfamily 241563 75 115 8.21E-06 43.622 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#5141 - CGI_10018432 superfamily 245201 7 212 2.49E-82 250.878 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#5142 - CGI_10018433 superfamily 245033 162 306 7.31E-26 101.246 cl09208 Tim44 superfamily - - Tim44-like domain; Tim44 is an essential component of the machinery that mediates the translocation of nuclear-encoded proteins across the mitochondrial inner membrane. Tim44 is thought to bind phospholipids of the mitochondrial inner membrane both by electrostatic interactions and by penetrating the polar head group region. This family includes the C-terminal region of Tim44 that has been shown to form a stable proteolytic fragment in yeast. This region is also found in a set of smaller bacterial proteins. The molecular function of the bacterial members of this family is unknown but transport seems likely. 
The crystal structure of the C terminal of Tim44 has revealed a large hydrophobic pocket which might play an important role in interacting with the acyl chains of lipid molecules in the mitochondrial membrane. Q#5143 - CGI_10018434 superfamily 241782 35 370 1.51E-58 206.038 cl00321 AAT_I superfamily - - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phosphorylase family (fold type V)." Q#5143 - CGI_10018434 superfamily 247856 540 601 0.000515278 39.4533 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. 
Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#5143 - CGI_10018434 superfamily 215754 779 870 1.57E-26 105.798 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#5143 - CGI_10018434 superfamily 215754 965 1054 2.00E-26 105.798 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#5143 - CGI_10018434 superfamily 215754 871 962 1.08E-20 89.2348 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#5143 - CGI_10018434 superfamily 244899 465 523 0.000324407 40.1658 cl08302 S-100 superfamily - - "S-100: S-100 domain, which represents the largest family within the superfamily of proteins carrying the Ca-binding EF-hand motif. Note that this S-100 hierarchy contains only S-100 EF-hand domains, other EF-hands have been modeled separately. S100 proteins are expressed exclusively in vertebrates, and are implicated in intracellular and extracellular regulatory activities. Intracellularly, S100 proteins act as Ca-signaling or Ca-buffering proteins. The most unusual characteristic of certain S100 proteins is their occurrence in extracellular space, where they act in a cytokine-like manner through RAGE, the receptor for advanced glycation products. Structural data suggest that many S100 members exist within cells as homo- or heterodimers and even oligomers; oligomerization contributes to their functional diversification. Upon binding calcium, most S100 proteins change conformation to a more open structure exposing a hydrophobic cleft. This hydrophobic surface represents the interaction site of S100 proteins with their target proteins. There is experimental evidence showing that many S100 proteins have multiple binding partners with diverse mode of interaction with different targets. 
In addition to S100 proteins (such as S100A1,-3,-4,-6,-7,-10,-11,and -13), this group includes the ''fused'' gene family, a group of calcium binding S100-related proteins. The ''fused'' gene family includes multifunctional epidermal differentiation proteins - profilaggrin, trichohyalin, repetin, hornerin, and cornulin; functionally these proteins are associated with keratin intermediate filaments and partially crosslinked to the cell envelope. These ''fused'' gene proteins contain N-terminal sequence with two Ca-binding EF-hands motif, which may be associated with calcium signaling in epidermal cells and autoprocessing in a calcium-dependent manner. In contrast to S100 proteins, "fused" gene family proteins contain an extraordinary high number of almost perfect peptide repeats with regular array of polar and charged residues similar to many known cell envelope proteins." Q#5144 - CGI_10018435 superfamily 220729 3 151 8.94E-43 148.181 cl11056 Hat1_N superfamily - - "Histone acetyl transferase HAT1 N-terminus; This domain is the N-terminal half of the structure of histone acetyl transferase HAT1. It is often found in association with the C-terminal part of the GNAT Acetyltransf_1 (pfam00583) domain. It seems to be motifs C and D of the structure. Histone acetyltransferases (HATs) catalyze the transfer of an acetyl group from acetyl-CoA to the lysine E-amino groups on the N-terminal tails of histones. HATs are involved in transcription since histones tend to be hyper-acetylated in actively transcribed regions of chromatin, whereas in transcriptionally silent regions histones are hypo-acetylated." Q#5145 - CGI_10018436 superfamily 246749 4 88 2.62E-32 122.221 cl14879 LabA_like_C superfamily - - "C-terminal domain of LabA_like proteins; This C-terminal domain is found in a well conserved group of mainly bacterial proteins with no defined function, which contain an N-terminal LabA-like domain. 
LabA from Synechococcus elongatus PCC 7942, (which does not contain this C-terminal domain) has been shown to play a role in cyanobacterial circadian timing. LabA-like C-terminal domains described here may be related to the LOTUS domain family (which also co-occurs with LabA-like N-terminal domains)." Q#5145 - CGI_10018436 superfamily 243098 461 508 0.000100767 41.4295 cl02573 TUDOR superfamily - - "Tudor domains are found in many eukaryotic organisms and have been implicated in protein-protein interactions in which methylated protein substrates bind to these domains. For example, the Tudor domain of Survival of Motor Neuron (SMN) binds to symmetrically dimethylated arginines of arginine-glycine (RG) rich sequences found in the C-terminal tails of Sm proteins. The SMN protein is linked to spinal muscular atrophy. Another example is the tandem tudor domains of 53BP1, which bind to histone H4 specifically dimethylated at Lys20 (H4-K20me2). 53BP1 is a key transducer of the DNA damage checkpoint signal." Q#5145 - CGI_10018436 superfamily 246749 319 383 1.14E-07 50.5067 cl14879 LabA_like_C superfamily - - "C-terminal domain of LabA_like proteins; This C-terminal domain is found in a well conserved group of mainly bacterial proteins with no defined function, which contain an N-terminal LabA-like domain. LabA from Synechococcus elongatus PCC 7942, (which does not contain this C-terminal domain) has been shown to play a role in cyanobacterial circadian timing. LabA-like C-terminal domains described here may be related to the LOTUS domain family (which also co-occurs with LabA-like N-terminal domains)." Q#5146 - CGI_10018437 superfamily 243066 7 112 1.71E-15 69.5685 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. 
The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#5148 - CGI_10018439 superfamily 245864 1 93 2.08E-23 93.1118 cl12078 p450 superfamily NC - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." 
Q#5149 - CGI_10018440 superfamily 245864 190 413 9.41E-53 184.404 cl12078 p450 superfamily N - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#5149 - CGI_10018440 superfamily 245864 39 121 1.57E-15 76.9334 cl12078 p450 superfamily C - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. 
These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#5150 - CGI_10018441 superfamily 245864 1 419 7.61E-72 236.021 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#5151 - CGI_10018442 superfamily 217473 64 334 3.37E-25 105.91 cl03978 Mab-21 superfamily - - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. 
Q#5152 - CGI_10018443 superfamily 241600 91 300 3.03E-99 293.763 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#5156 - CGI_10018447 superfamily 243072 323 501 3.80E-09 55.0822 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#5156 - CGI_10018447 superfamily 243072 455 551 7.39E-09 54.3118 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#5156 - CGI_10018447 superfamily 243072 28 119 3.24E-08 52.3858 cl02529 ANK superfamily N - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#5157 - CGI_10018448 superfamily 145533 60 154 5.14E-40 140.962 cl03592 Ski_Sno superfamily - - "SKI/SNO/DAC family; This family contains a presumed domain that is about 100 amino acids long. All members of this family contain a conserved CLPQ motif. The c-ski proto-oncogene has been shown to influence proliferation, morphological transformation and myogenic differentiation. Sno, a Ski proto-oncogene homologue, is expressed in two isoforms and plays a role in the response to proliferation stimuli. Dachshund also contains this domain. It is involved in various aspects of development." Q#5157 - CGI_10018448 superfamily 198898 165 257 2.69E-35 127.484 cl07406 c-SKI_SMAD_bind superfamily - - c-SKI Smad4 binding domain; c-SKI is an oncoprotein that inhibits TGF-beta signaling through interaction with Smad proteins. This domain binds to Smad4 Q#5158 - CGI_10018449 superfamily 241600 301 515 1.50E-97 296.459 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. 
C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#5159 - CGI_10018450 superfamily 247637 211 387 2.49E-91 280.683 cl16912 MDR superfamily N - "Medium chain reductase/dehydrogenase (MDR)/zinc-dependent alcohol dehydrogenase-like family; The medium chain reductase/dehydrogenases (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH. MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P) binding-Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES. The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH) , quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones. ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines. 
Other MDR members have only a catalytic zinc, and some contain no coordinated zinc." Q#5160 - CGI_10018451 superfamily 216686 68 257 1.47E-54 177.9 cl18377 Galactosyl_T superfamily - - "Galactosyltransferase; This family includes the galactosyltransferases UDP-galactose:2-acetamido-2-deoxy-D-glucose3beta-galactosyltransferase and UDP-Gal:beta-GlcNAc beta 1,3-galactosyltranferase. Specific galactosyltransferases transfer galactose to GlcNAc terminal chains in the synthesis of the lacto-series oligosaccharides types 1 and 2." Q#5163 - CGI_10018454 superfamily 243062 305 406 1.02E-56 183.631 cl02510 TGF_beta superfamily - - Transforming growth factor beta like domain; Transforming growth factor beta like domain. Q#5163 - CGI_10018454 superfamily 216062 15 253 1.40E-49 169.156 cl02928 TGFb_propeptide superfamily - - TGF-beta propeptide; This propeptide is known as latency associated peptide (LAP) in TGF-beta. LAP is a homodimer which is disulfide linked to TGF-beta binding protein. Q#5164 - CGI_10018455 superfamily 241645 183 261 5.86E-37 127.08 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." 
Q#5164 - CGI_10018455 superfamily 243133 108 147 2.64E-11 57.5708 cl02662 SEP superfamily N - "SEP domain; The SEP domain is named after Saccharomyces cerevisiae Shp1, Drosophila melanogaster eyes closed gene (eyc), and vertebrate p47. In p47, the SEP domain has been shown to bind to and inhibit the cysteine protease cathepsin L. Most SEP domains are succeeded closely by a UBX domain." Q#5165 - CGI_10018456 superfamily 247858 395 475 0.000140341 40.4617 cl17304 2OG-FeII_Oxy_3 superfamily - - 2OG-Fe(II) oxygenase superfamily; This family contains members of the 2-oxoglutarate (2OG) and Fe(II)-dependent oxygenase superfamily. Q#5166 - CGI_10018457 superfamily 245819 267 425 4.38E-20 89.1754 cl11967 Nucleotidyl_cyc_III superfamily - - "Class III nucleotidyl cyclases; Class III nucleotidyl cyclases are the largest, most diverse group of nucleotidyl cyclases (NC's) containing prokaryotic and eukaryotic proteins. They can be divided into two major groups; the mononucleotidyl cyclases (MNC's) and the diguanylate cyclases (DGC's). The MNC's, which include the adenylate cyclases (AC's) and the guanylate cyclases (GC's), have a conserved cyclase homology domain (CHD), while the DGC's have a conserved GGDEF domain, named after a conserved motif within this subgroup. Their products, cyclic guanylyl and adenylyl nucleotides, are second messengers that play important roles in eukaryotic signal transduction and prokaryotic sensory pathways." Q#5166 - CGI_10018457 superfamily 245819 734 811 5.99E-08 52.1963 cl11967 Nucleotidyl_cyc_III superfamily N - "Class III nucleotidyl cyclases; Class III nucleotidyl cyclases are the largest, most diverse group of nucleotidyl cyclases (NC's) containing prokaryotic and eukaryotic proteins. They can be divided into two major groups; the mononucleotidyl cyclases (MNC's) and the diguanylate cyclases (DGC's). 
The MNC's, which include the adenylate cyclases (AC's) and the guanylate cyclases (GC's), have a conserved cyclase homology domain (CHD), while the DGC's have a conserved GGDEF domain, named after a conserved motif within this subgroup. Their products, cyclic guanylyl and adenylyl nucleotides, are second messengers that play important roles in eukaryotic signal transduction and prokaryotic sensory pathways." Q#5167 - CGI_10018458 superfamily 241584 576 667 1.21E-07 50.5727 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#5167 - CGI_10018458 superfamily 241584 393 437 0.00012122 41.3279 cl00065 FN3 superfamily N - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#5167 - CGI_10018458 superfamily 245814 160 231 9.80E-16 73.6326 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#5167 - CGI_10018458 superfamily 245814 257 333 1.14E-15 73.6824 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#5167 - CGI_10018458 superfamily 245814 64 139 9.63E-13 65.1134 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#5167 - CGI_10018458 superfamily 245814 2 34 2.82E-06 45.8232 cl11960 Ig superfamily N - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. 
The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#5169 - CGI_10018460 superfamily 243051 596 749 2.63E-37 139.435 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#5169 - CGI_10018460 superfamily 241609 1037 1115 4.94E-28 110.16 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. 
Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#5169 - CGI_10018460 superfamily 241609 845 912 5.23E-27 107.464 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#5169 - CGI_10018460 superfamily 241609 763 835 5.66E-26 104.382 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#5169 - CGI_10018460 superfamily 241609 917 993 9.88E-23 95.1375 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#5169 - CGI_10018460 superfamily 241609 1126 1196 1.85E-21 91.2855 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. 
Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#5169 - CGI_10018460 superfamily 241571 429 548 1.46E-10 60.5038 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#5169 - CGI_10018460 superfamily 245213 988 1029 1.80E-07 49.5574 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#5169 - CGI_10018460 superfamily 241583 130 277 4.58E-38 142.325 cl00064 ZnMc superfamily C - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#5169 - CGI_10018460 superfamily 241583 280 381 2.03E-22 96.8714 cl00064 ZnMc superfamily N - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." 
Q#5172 - CGI_10018463 superfamily 247905 573 728 1.50E-13 69.1888 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#5172 - CGI_10018463 superfamily 247805 361 501 2.29E-07 50.4136 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#5176 - CGI_10022378 superfamily 216363 262 356 3.47E-12 62.1026 cl08312 UPF0029 superfamily - - Uncharacterized protein family UPF0029; Uncharacterized protein family UPF0029. Q#5178 - CGI_10022380 superfamily 241758 4 99 0.00540215 34.2678 cl00292 AANH_like superfamily - - "Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases, ATP sulphurylases Universal Stress Response protein and electron transfer flavoprotein (ETF). The domain forms an alpha/beta/alpha fold which binds to Adenosine nucleotide." Q#5180 - CGI_10022382 superfamily 241758 3 149 2.00E-22 87.4254 cl00292 AANH_like superfamily - - "Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases, ATP sulphurylases Universal Stress Response protein and electron transfer flavoprotein (ETF). The domain forms an alpha/beta/alpha fold which binds to Adenosine nucleotide." 
Q#5181 - CGI_10022383 superfamily 241758 3 130 6.62E-16 69.321 cl00292 AANH_like superfamily - - "Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases, ATP sulphurylases Universal Stress Response protein and electron transfer flavoprotein (ETF). The domain forms an alpha/beta/alpha fold which binds to Adenosine nucleotide." Q#5182 - CGI_10022384 superfamily 241758 3 140 5.55E-26 96.285 cl00292 AANH_like superfamily - - "Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases, ATP sulphurylases Universal Stress Response protein and electron transfer flavoprotein (ETF). The domain forms an alpha/beta/alpha fold which binds to Adenosine nucleotide." Q#5183 - CGI_10022385 superfamily 183292 2 195 2.69E-05 44.4272 cl18135 PRK11728 superfamily C - hydroxyglutarate oxidase; Provisional Q#5184 - CGI_10022386 superfamily 183782 19 55 3.69E-05 44.1187 cl18137 PRK12834 superfamily C - putative FAD-binding dehydrogenase; Reviewed Q#5185 - CGI_10022387 superfamily 241758 2 88 6.14E-17 70.8618 cl00292 AANH_like superfamily N - "Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases, ATP sulphurylases Universal Stress Response protein and electron transfer flavoprotein (ETF). The domain forms an alpha/beta/alpha fold which binds to Adenosine nucleotide." Q#5186 - CGI_10022388 superfamily 241758 3 140 2.18E-25 94.7442 cl00292 AANH_like superfamily - - "Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases, ATP sulphurylases Universal Stress Response protein and electron transfer flavoprotein (ETF). The domain forms an alpha/beta/alpha fold which binds to Adenosine nucleotide." 
Q#5192 - CGI_10022394 superfamily 241563 491 529 0.0023078 37.4588 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#5193 - CGI_10022395 superfamily 192604 340 422 1.74E-21 88.125 cl11135 PACT_coil_coil superfamily - - "Pericentrin-AKAP-450 domain of centrosomal targeting protein; This domain is a coiled-coil region close to the C-terminus of centrosomal proteins that is directly responsible for recruiting AKAP-450 and pericentrin to the centrosome. Hence the suggested name for this region is a PACT domain (pericentrin-AKAP-450 centrosomal targeting). This domain is also present at the C-terminus of coiled-coil proteins from Drosophila and S. pombe, and that from the Drosophila protein is sufficient for targeting to the centrosome in mammalian cells. The function of these proteins is unknown but they seem good candidates for having a centrosomal or spindle pole body location. The final 22 residues of this domain in AKAP-450 appear specifically to be a calmodulin-binding domain indicating that this member at least is likely to contribute to centrosome assembly." Q#5194 - CGI_10022396 superfamily 241852 6 433 0 867.066 cl00416 CS_ACL-C_CCL superfamily - - "Citrate synthase (CS), citryl-CoA lyase (CCL), the C-terminal portion of the single-subunit type ATP-citrate lyase (ACL) and the C-terminal portion of the large subunit of the two-subunit type ACL. CS catalyzes the condensation of acetyl coenzyme A (AcCoA) and oxalacetate (OAA) from citrate and coenzyme A (CoA), the first step in the oxidative citric acid cycle (TCA or Krebs cycle). Peroxisomal CS is involved in the glyoxylate cycle. Some CS proteins function as a 2-methylcitrate synthase (2MCS). 
2MCS catalyzes the condensation of propionyl-CoA (PrCoA) and OAA to form 2-methylcitrate and CoA during propionate metabolism. CCL cleaves citryl-CoA (CiCoA) to AcCoA and OAA. ACLs catalyze an ATP- and a CoA- dependant cleavage of citrate to form AcCoA and OAA; they do this in a multistep reaction, the final step of which is likely to involve the cleavage of CiCoA to generate AcCoA and OAA. The overall CS reaction is thought to proceed through three partial reactions and involves both closed and open conformational forms of the enzyme: a) the carbanion or equivalent is generated from AcCoA by base abstraction of a proton, b) the nucleophilic attack of this carbanion on OAA to generate CiCoA, and c) the hydrolysis of CiCoA to produce citrate and CoA. This group contains proteins which functions exclusively as either a CS or a 2MCS, as well as those with relaxed specificity which have dual functions as both a CS and a 2MCS. There are two types of CSs: type I CS and type II CSs. Type I CSs are found in eukarya, gram-positive bacteria, archaea, and in some gram-negative bacteria and are homodimers with both subunits participating in the active site. Type II CSs are unique to gram-negative bacteria and are homohexamers of identical subunits (approximated as a trimer of dimers). Some type II CSs are strongly and specifically inhibited by NADH through an allosteric mechanism. In fungi, yeast, plants, and animals ACL is cytosolic and generates AcCoA for lipogenesis. In several groups of autotrophic prokaryotes and archaea, ACL carries out the citrate-cleavage reaction of the reductive tricarboxylic acid (rTCA) cycle. In the family Aquificaceae this latter reaction in the rTCA cycle is carried out via a two enzyme system the second enzyme of which is CCL." 
Q#5195 - CGI_10022397 superfamily 241852 1 47 5.71E-26 96.6664 cl00416 CS_ACL-C_CCL superfamily N - "Citrate synthase (CS), citryl-CoA lyase (CCL), the C-terminal portion of the single-subunit type ATP-citrate lyase (ACL) and the C-terminal portion of the large subunit of the two-subunit type ACL. CS catalyzes the condensation of acetyl coenzyme A (AcCoA) and oxalacetate (OAA) from citrate and coenzyme A (CoA), the first step in the oxidative citric acid cycle (TCA or Krebs cycle). Peroxisomal CS is involved in the glyoxylate cycle. Some CS proteins function as a 2-methylcitrate synthase (2MCS). 2MCS catalyzes the condensation of propionyl-CoA (PrCoA) and OAA to form 2-methylcitrate and CoA during propionate metabolism. CCL cleaves citryl-CoA (CiCoA) to AcCoA and OAA. ACLs catalyze an ATP- and a CoA- dependant cleavage of citrate to form AcCoA and OAA; they do this in a multistep reaction, the final step of which is likely to involve the cleavage of CiCoA to generate AcCoA and OAA. The overall CS reaction is thought to proceed through three partial reactions and involves both closed and open conformational forms of the enzyme: a) the carbanion or equivalent is generated from AcCoA by base abstraction of a proton, b) the nucleophilic attack of this carbanion on OAA to generate CiCoA, and c) the hydrolysis of CiCoA to produce citrate and CoA. This group contains proteins which functions exclusively as either a CS or a 2MCS, as well as those with relaxed specificity which have dual functions as both a CS and a 2MCS. There are two types of CSs: type I CS and type II CSs. Type I CSs are found in eukarya, gram-positive bacteria, archaea, and in some gram-negative bacteria and are homodimers with both subunits participating in the active site. Type II CSs are unique to gram-negative bacteria and are homohexamers of identical subunits (approximated as a trimer of dimers). Some type II CSs are strongly and specifically inhibited by NADH through an allosteric mechanism. 
In fungi, yeast, plants, and animals ACL is cytosolic and generates AcCoA for lipogenesis. In several groups of autotrophic prokaryotes and archaea, ACL carries out the citrate-cleavage reaction of the reductive tricarboxylic acid (rTCA) cycle. In the family Aquificaceae this latter reaction in the rTCA cycle is carried out via a two enzyme system the second enzyme of which is CCL." Q#5196 - CGI_10022398 superfamily 203881 107 138 3.25E-05 38.6725 cl07024 Spindle_Spc25 superfamily N - "Chromosome segregation protein Spc25; This is a family of chromosome segregation proteins. It contains Spc25, which is a conserved eukaryotic kinetochore protein involved in cell division. In fungi the Spc25 protein is a subunit of the Nuf2-Ndc80 complex, and in vertebrates it forms part of the Ndc80 complex." Q#5197 - CGI_10022399 superfamily 247724 10 170 8.28E-108 309.241 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. 
Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#5199 - CGI_10022401 superfamily 241583 124 277 5.17E-35 133.51 cl00064 ZnMc superfamily - - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#5199 - CGI_10022401 superfamily 245321 297 374 9.43E-32 120.038 cl10507 Disintegrin superfamily - - Disintegrin; Disintegrin. Q#5199 - CGI_10022401 superfamily 246968 375 472 2.65E-28 112.453 cl15456 ADAM_CR superfamily C - ADAM cysteine-rich; ADAMs are membrane-anchored proteases that proteolytically modify cell surface and extracellular matrix (ECM) in order to alter cell behaviour. It has been shown that the cysteine-rich domain of ADAM13 regulates the protein's metalloprotease activity. Q#5199 - CGI_10022401 superfamily 216572 74 102 7.69E-06 45.3435 cl03265 Pep_M12B_propep superfamily NC - Reprolysin family propeptide; This region is the propeptide for members of peptidase family M12B. The propeptide contains a sequence motif similar to the "cysteine switch" of the matrixins. This motif is found at the C terminus of the alignment but is not well aligned. Q#5201 - CGI_10022405 superfamily 248097 65 170 2.67E-18 76.9202 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#5202 - CGI_10022406 superfamily 243035 15 124 2.92E-23 89.9865 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. 
This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. 
In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#5202 - CGI_10022406 superfamily 243035 141 179 1.74E-05 41.0662 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. 
Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#5203 - CGI_10022407 superfamily 248097 1 86 5.00E-10 51.497 cl17543 C1q superfamily N - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#5204 - CGI_10026995 superfamily 214531 604 644 5.66E-09 53.7597 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. 
Q#5204 - CGI_10026995 superfamily 214531 122 165 1.81E-08 52.2189 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#5204 - CGI_10026995 superfamily 215683 580 619 3.16E-06 45.6239 cl18339 Ldl_recept_b superfamily - - Low-density lipoprotein receptor repeat class B; This domain is also known as the YWTD motif after the most conserved region of the repeat. The YWTD repeat is found in multiple tandem repeats and has been predicted to form a beta-propeller structure. Q#5204 - CGI_10026995 superfamily 214531 167 208 5.98E-06 44.9001 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#5204 - CGI_10026995 superfamily 215683 93 139 0.00500391 35.9939 cl18339 Ldl_recept_b superfamily - - Low-density lipoprotein receptor repeat class B; This domain is also known as the YWTD motif after the most conserved region of the repeat. The YWTD repeat is found in multiple tandem repeats and has been predicted to form a beta-propeller structure. Q#5205 - CGI_10026996 superfamily 241874 11 542 0 778.779 cl00456 SLC5-6-like_sbd superfamily - - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. 
SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#5206 - CGI_10026997 superfamily 241570 412 528 2.48E-20 87.3813 cl00047 CAP_ED superfamily - - "effector domain of the CAP family of transcription factors; members include CAP (or cAMP receptor protein (CRP)), which binds cAMP, FNR (fumarate and nitrate reduction), which uses an iron-sulfur cluster to sense oxygen) and CooA, a heme containing CO sensor. In all cases binding of the effector leads to conformational changes and the ability to activate transcription. Cyclic nucleotide-binding domain similar to CAP are also present in cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) and vertebrate cyclic nucleotide-gated ion-channels. 
Cyclic nucleotide-monophosphate binding domain; proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues; the best studied is the prokaryotic catabolite gene activator, CAP, where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure; three conserved glycine residues are thought to be essential for maintenance of the structural integrity of the beta-barrel; CooA is a homodimeric transcription factor that belongs to CAP family; cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain; cAPK's are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain; cGPK's are single chain enzymes that include the two copies of the domain in their N-terminal section; also found in vertebrate cyclic nucleotide-gated ion-channels" Q#5207 - CGI_10026998 superfamily 243072 100 224 1.23E-35 133.663 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#5207 - CGI_10026998 superfamily 243072 198 353 5.70E-35 131.737 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#5207 - CGI_10026998 superfamily 243072 34 158 5.82E-33 125.959 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#5207 - CGI_10026998 superfamily 243072 294 414 4.60E-31 120.566 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#5209 - CGI_10027000 superfamily 241578 185 342 3.26E-45 160.92 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#5209 - CGI_10027000 superfamily 245213 671 706 3.79E-09 54.1798 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#5209 - CGI_10027000 superfamily 245213 632 668 3.79E-09 54.1798 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#5209 - CGI_10027000 superfamily 245213 594 629 2.27E-08 51.8686 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#5209 - CGI_10027000 superfamily 245213 556 592 4.00E-07 48.0166 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#5209 - CGI_10027000 superfamily 245213 482 516 1.75E-05 43.3942 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#5209 - CGI_10027000 superfamily 243119 822 869 3.01E-08 51.6757 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#5209 - CGI_10027000 superfamily 245213 518 540 0.0051631 36.0708 cl09941 EGF_CA superfamily C - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#5209 - CGI_10027000 superfamily 243119 770 817 0.0097878 35.1122 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. 
Q#5211 - CGI_10027002 superfamily 242891 26 161 3.30E-57 177.812 cl02117 ORMDL superfamily - - "ORMDL family; Evidence suggests that ORMDLs are involved in protein folding in the ER. Orm proteins have been identified as negative regulators of sphingolipid synthesis that form a conserved complex with serine palmitoyltransferase, the first and rate-limiting enzyme in sphingolipid production. This novel and conserved protein complex has been termed the SPOTS complex (serine palmitoyltransferase, Orm1/2, Tsc3, and Sac1)." Q#5212 - CGI_10027003 superfamily 241884 1 157 3.34E-92 268.727 cl00467 Ntn_hydrolase superfamily - - "The Ntn hydrolases (N-terminal nucleophile) are a diverse superfamily of enzymes that are activated autocatalytically via an N-terminally located nucleophilic amino acid. N-terminal nucleophile (NTN-) hydrolase superfamily, which contains a four-layered alpha, beta, beta, alpha core structure. This family of hydrolases includes penicillin acylase, the 20S proteasome alpha and beta subunits, and glutamate synthase. The mechanism of activation of these proteins is conserved, although they differ in their substrate specificities. All known members catalyze the hydrolysis of amide bonds in either proteins or small molecules, and each one of them is synthesized as a preprotein. For each, an autocatalytic endoproteolytic process generates a new N-terminal residue. This mature N-terminal residue is central to catalysis and acts as both a polarizing base and a nucleophile during the reaction. The N-terminal amino group acts as the proton acceptor and activates either the nucleophilic hydroxyl in a Ser or Thr residue or the nucleophilic thiol in a Cys residue. The position of the N-terminal nucleophile in the active site and the mechanism of catalysis are conserved in this family, despite considerable variation in the protein sequences." 
Q#5213 - CGI_10027004 superfamily 241567 162 428 6.43E-44 155.066 cl00042 CASc superfamily - - "Caspase, interleukin-1 beta converting enzyme (ICE) homologues; Cysteine-dependent aspartate-directed proteases that mediate programmed cell death (apoptosis). Caspases are synthesized as inactive zymogens and activated by proteolysis of the peptide backbone adjacent to an aspartate. The resulting two subunits associate to form an (alpha)2(beta)2-tetramer which is the active enzyme. Activation of caspases can be mediated by other caspase homologs." Q#5214 - CGI_10027005 superfamily 247941 117 243 1.12E-11 60.8124 cl17387 Methyltransf_21 superfamily - - "Methyltransferase FkbM domain; This family has members from bacteria to human, and appears to be a methyltransferase." Q#5215 - CGI_10027006 superfamily 241570 688 799 2.31E-24 100.478 cl00047 CAP_ED superfamily - - "effector domain of the CAP family of transcription factors; members include CAP (or cAMP receptor protein (CRP)), which binds cAMP, FNR (fumarate and nitrate reduction), which uses an iron-sulfur cluster to sense oxygen) and CooA, a heme containing CO sensor. In all cases binding of the effector leads to conformational changes and the ability to activate transcription. Cyclic nucleotide-binding domain similar to CAP are also present in cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) and vertebrate cyclic nucleotide-gated ion-channels. 
Cyclic nucleotide-monophosphate binding domain; proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues; the best studied is the prokaryotic catabolite gene activator, CAP, where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure; three conserved glycine residues are thought to be essential for maintenance of the structural integrity of the beta-barrel; CooA is a homodimeric transcription factor that belongs to CAP family; cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain; cAPK's are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain; cGPK's are single chain enzymes that include the two copies of the domain in their N-terminal section; also found in vertebrate cyclic nucleotide-gated ion-channels" Q#5215 - CGI_10027006 superfamily 219619 562 611 6.97E-11 60.2991 cl18518 Ion_trans_2 superfamily N - Ion channel; This family includes the two membrane helix type ion channels found in bacteria. Q#5215 - CGI_10027006 superfamily 197509 34 75 0.00580622 36.3921 cl09965 PAC superfamily - - Motif C-terminal to PAS motifs (likely to contribute to PAS structural domain); PAC motif occurs C-terminal to a subset of all known PAS motifs. It is proposed to contribute to the PAS domain fold. Q#5218 - CGI_10027009 superfamily 248264 154 318 2.16E-49 163.947 cl17710 DDE_4 superfamily - - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. 
The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#5218 - CGI_10027009 superfamily 222263 63 165 4.70E-10 55.4017 cl16321 DDE_4_2 superfamily - - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#5220 - CGI_10027011 superfamily 217956 242 394 6.92E-34 124.445 cl04443 PDCD2_C superfamily - - "Programmed cell death protein 2, C-terminal putative domain; Programmed cell death protein 2, C-terminal putative domain. " Q#5222 - CGI_10027013 superfamily 245225 474 688 8.62E-08 54.0517 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily C - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein couples receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine- binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. 
The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#5222 - CGI_10027013 superfamily 247986 186 288 3.79E-06 47.753 cl17432 PBPb superfamily C - "Bacterial periplasmic transport systems use membrane-bound complexes and substrate-bound, membrane-associated, periplasmic binding proteins (PBPs) to transport a wide variety of substrates, such as, amino acids, peptides, sugars, vitamins and inorganic ions. PBPs have two cell-membrane translocation functions: bind substrate, and interact with the membrane bound complex. A diverse group of periplasmic transport receptors for lysine/arginine/ornithine (LAO), glutamine, histidine, sulfate, phosphate, molybdate, and methanol are included in the PBPb CD." Q#5222 - CGI_10027013 superfamily 247986 718 797 9.99E-05 43.5158 cl17432 PBPb superfamily C - "Bacterial periplasmic transport systems use membrane-bound complexes and substrate-bound, membrane-associated, periplasmic binding proteins (PBPs) to transport a wide variety of substrates, such as, amino acids, peptides, sugars, vitamins and inorganic ions. PBPs have two cell-membrane translocation functions: bind substrate, and interact with the membrane bound complex. 
A diverse group of periplasmic transport receptors for lysine/arginine/ornithine (LAO), glutamine, histidine, sulfate, phosphate, molybdate, and methanol are included in the PBPb CD." Q#5222 - CGI_10027013 superfamily 197504 885 995 0.000910305 39.1949 cl18192 PBPe superfamily - - Eukaryotic homologues of bacterial periplasmic substrate binding proteins; Prokaryotic homologues are represented by a separate alignment: PBPb Q#5223 - CGI_10027014 superfamily 241802 33 334 6.06E-97 292.469 cl00342 Trp-synth-beta_II superfamily - - "Tryptophan synthase beta superfamily (fold type II); this family of pyridoxal phosphate (PLP)-dependent enzymes catalyzes beta-replacement and beta-elimination reactions. This CD corresponds to aminocyclopropane-1-carboxylate deaminase (ACCD), tryptophan synthase beta chain (Trp-synth_B), cystathionine beta-synthase (CBS), O-acetylserine sulfhydrylase (CS), serine dehydratase (Ser-dehyd), threonine dehydratase (Thr-dehyd), diaminopropionate ammonia lyase (DAL), and threonine synthase (Thr-synth). ACCD catalyzes the conversion of 1-aminocyclopropane-1-carboxylate to alpha-ketobutyrate and ammonia. Tryptophan synthase folds into a tetramer, where the beta chain is the catalytic PLP-binding subunit and catalyzes the formation of L-tryptophan from indole and L-serine. CBS is a tetrameric hemeprotein that catalyzes condensation of serine and homocysteine to cystathionine. CS is a homodimer that catalyzes the formation of L-cysteine from O-acetyl-L-serine. Ser-dehyd catalyzes the conversion of L- or D-serine to pyruvate and ammonia. Thr-dehyd is active as a homodimer and catalyzes the conversion of L-threonine to 2-oxobutanoate and ammonia. DAL is also a homodimer and catalyzes the alpha, beta-elimination reaction of both L- and D-alpha, beta-diaminopropionate to form pyruvate and ammonia. Thr-synth catalyzes the formation of threonine and inorganic phosphate from O-phosphohomoserine." 
Q#5224 - CGI_10027015 superfamily 246598 232 446 4.85E-102 308.134 cl13996 MPN superfamily C - "Mpr1p, Pad1p N-terminal (MPN) domains; MPN (also known as Mov34, PAD-1, JAMM, JAB, MPN+) domains are found in the N-terminal termini of proteins with a variety of functions; they are components of the proteasome regulatory subunits, the signalosome (CSN), eukaryotic translation initiation factor 3 (eIF3) complexes, and regulators of transcription factors. These domains are isopeptidases that release ubiquitin from ubiquitinated proteins (thus having deubiquitinating (DUB) activity) that are tagged for degradation. Catalytically active MPN domains contain a metalloprotease signature known as the JAB1/MPN/Mov34 metalloenzyme (JAMM) motif. For example, Rpn11 (also known as POH1 or PSMD14), a subunit of the 19S proteasome lid is involved in the ATP-dependent degradation of ubiquitinated proteins, contains the conserved JAMM motif involved in zinc ion coordination. Poh1 is a regulator of c-Jun, an important regulator of cell proliferation, differentiation, survival and death. JAB1 is a component of the COP9 signalosome (CSN), a regulatory particle of the ubiquitin (Ub)/26S proteasome system occurring in all eukaryotic cells; it cleaves the ubiquitin-like protein NEDD8 from the cullin subunit of the SCF (Skp1, Cullins, F-box proteins) family of E3 ubiquitin ligases. AMSH (associated molecule with the SH3 domain of STAM, also known as STAMBP), a member of JAMM/MPN+ deubiquitinases (DUBs), specifically cleaves Lys 63-linked polyubiquitin (poly-Ub) chains, thus facilitating the recycling and subsequent trafficking of receptors to the cell surface. Similarly, BRCC36, part of the nuclear complex that includes BRCA1 protein and is targeted to DNA damage foci after irradiation, specifically disassembles K63-linked polyUb. BRCC36 is aberrantly expressed in sporadic breast tumors, indicative of a potential role in the pathogenesis of the disease. 
Some variants of the JAB1/MPN domains lack key residues in their JAMM motif and are unable to coordinate a metal ion. Comparisons of key catalytic and metal binding residues explain why the MPN-containing proteins Mov34/PSMD7, Rpn8, CSN6, Prp8p, and the translation initiation factor 3 subunits f (p47) and h (p40) do not show catalytic isopeptidase activity. It has been proposed that the MPN domain in these proteins has a primarily structural function." Q#5224 - CGI_10027015 superfamily 241645 5 76 0.00735452 34.4234 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#5224 - CGI_10027015 superfamily 147282 125 264 6.17E-68 215.648 cl04889 zf-NPL4 superfamily - - "NPL4 family, putative zinc binding region; The HRD4 gene was identical to NPL4, a gene previously implicated in nuclear transport. Using a diverse set of substrates and direct ubiquitination assays, analysis revealed that HRD4/NPL4 is required for a poorly characterized step in ER-associated degradation after ubiquitination of target proteins but before their recognition by the 26S proteasome. This region of the protein contains possibly two zinc binding motifs (Bateman A pers. obs.). 
Npl4p physically associates with Cdc48p via Ufd1p to form a Cdc48p-Ufd1p-Npl4p complex. The Cdc48-Ufd1-Npl4 complex functions in the recognition of several polyubiquitin-tagged proteins and facilitates their presentation to the 26S proteasome for processive degradation or even more specific processing." Q#5227 - CGI_10027018 superfamily 243061 17 118 1.74E-38 127.458 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#5228 - CGI_10027019 superfamily 248097 15 53 3.02E-08 45.3338 cl17543 C1q superfamily N - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#5231 - CGI_10027022 superfamily 241763 90 336 1.39E-139 398.567 cl00298 Peptidase_C1 superfamily - - "C1 Peptidase family (MEROPS database nomenclature), also referred to as the papain family; composed of two subfamilies of cysteine peptidases (CPs), C1A (papain) and C1B (bleomycin hydrolase). Papain-like enzymes are mostly endopeptidases with some exceptions like cathepsins B, C, H and X, which are exopeptidases. Papain-like CPs have different functions in various organisms. Plant CPs are used to mobilize storage proteins in seeds while mammalian CPs are primarily lysosomal enzymes responsible for protein degradation in the lysosome. Papain-like CPs are synthesized as inactive proenzymes with N-terminal propeptide regions, which are removed upon activation. Bleomycin hydrolase (BH) is a CP that detoxifies bleomycin by hydrolysis of an amide group. It acts as a carboxypeptidase on its C-terminus to convert itself into an aminopeptidase and peptide ligase. BH is found in all tissues in mammals as well as in many other eukaryotes. It forms a hexameric ring barrel structure with the active sites imbedded in the central channel. 
Some members of the C1 family are proteins classified as non-peptidase homologs which lack peptidase activity or have missing active site residues." Q#5231 - CGI_10027022 superfamily 203856 29 73 2.01E-12 61.0676 cl06937 Propeptide_C1 superfamily - - Peptidase family C1 propeptide; This motif is found at the N terminal of some members of the Peptidase_C1 family (pfam00112) and is involved in activation of this peptidase. Q#5233 - CGI_10027024 superfamily 247802 242 437 7.18E-97 295.958 cl17248 RIO superfamily - - "RIO kinase family, catalytic domain. The RIO kinase catalytic domain family is part of a larger superfamily, that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase (PI3K). RIO kinases are atypical protein serine kinases present in archaea, bacteria and eukaryotes. Serine kinases catalyze the transfer of the gamma-phosphoryl group from ATP to serine residues in protein substrates. RIO kinases contain a kinase catalytic signature, but otherwise show very little sequence similarity to typical PKs. The RIO catalytic domain is truncated compared to the catalytic domains of typical PKs, with deletions of the loops responsible for substrate binding. Most organisms contain at least two RIO kinases, RIO1 and RIO2. A third protein, RIO3, is present in multicellular eukaryotes. In yeast, RIO1 and RIO2 are essential for survival. They function as non-ribosomal factors necessary for late 18S rRNA processing. RIO1 is also required for proper cell cycle progression and chromosome maintenance. The biological substrates for RIO kinases are still unknown." 
Q#5237 - CGI_10027028 superfamily 247905 462 617 8.85E-13 66.8776 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#5237 - CGI_10027028 superfamily 247805 250 406 8.40E-11 61.1992 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#5237 - CGI_10027028 superfamily 214946 728 932 4.00E-32 128.245 cl15345 Sec63 superfamily C - "Sec63 Brl domain; This domain was named after the yeast Sec63 (or NPL1) (also known as the Brl domain) protein in which it was found. This protein is required for assembly of functional endoplasmic reticulum translocons. Other yeast proteins containing this domain include pre-mRNA splicing helicase BRR2, HFM1 protein and putative helicases." Q#5239 - CGI_10027030 superfamily 247949 16 379 7.49E-173 498.773 cl17395 TAF6 superfamily - - "TATA Binding Protein (TBP) Associated Factor 6 (TAF6) is one of several TAFs that bind TBP and is involved in forming Transcription Factor IID (TFIID) complex; The TATA Binding Protein (TBP) Associated Factor 6 (TAF6) is one of several TAFs that bind TBP and are involved in forming Transcription Factor IID (TFIID) complex. 
TFIID is one of seven General Transcription Factors (GTFs) (TFIIA, TFIIB, TFIID, TFIIE, TFIIF, and TFIID) that are involved in accurate initiation of transcription by RNA polymerase II in eukaryotes. TFIID plays an important role in the recognition of promoter DNA and assembly of the pre-initiation complex. TFIID complex is composed of the TBP and at least 13 TAFs. TAFs are named after their electrophoretic mobility in polyacrylamide gels in different species. A new, unified nomenclature has been suggested for the pol II TAFs to show the relationship between TAF orthologs and paralogs. Several hypotheses are proposed for TAFs functions such as serving as activator-binding sites, core-promoter recognition or a role in essential catalytic activity. These TAFs, with the help of specific activators, are required only for expression of a subset of genes and are not universally involved for transcription as are GTFs. In yeast and human cells, TAFs have been found as components of other complexes besides TFIID. Several TAFs interact via histone-fold (HFD) motifs; the HFD is the interaction motif involved in heterodimerization of the core histones and their assembly into nucleosome octamers. The minimal HFD contains three alpha-helices linked by two loops and is found in core histones, TAFs and many other transcription factors. TFIID has a histone octamer-like substructure. TAF6 is a shared subunit of histone acetyltransferase complex SAGA and TFIID complexes. TAF6 domain interacts with TAF9 and makes a novel histone-like heterodimer that is structurally related to histones H4 and H3. TAF6 may also interact with the downstream core promoter element (DPE)." 
Q#5240 - CGI_10027031 superfamily 245599 258 434 5.02E-36 131.576 cl11397 NR_LBD superfamily - - "The ligand binding domain of nuclear receptors, a family of ligand-activated transcription regulators; Ligand-binding domain (LBD) of nuclear receptor (NR): Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions in metazoans, from development, reproduction, to homeostasis and metabolism. The superfamily contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. The members of the family include receptors of steroids, thyroid hormone, retinoids, cholesterol by-products, lipids and heme. With few exceptions, NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD)." Q#5240 - CGI_10027031 superfamily 207662 110 203 6.59E-35 125.636 cl02596 NR_DBD_like superfamily - - "DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers; DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. It interacts with a specific DNA site upstream of the target gene and modulates the rate of transcriptional initiation. Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions, from development, reproduction, to homeostasis and metabolism in animals (metazoans). The family contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). 
Most nuclear receptors bind as homodimers or heterodimers to their target sites, which consist of two hexameric half-sites. Specificity is determined by the half-site sequence, the relative orientation of the half-sites and the number of spacer nucleotides between the half-sites. However, a growing number of nuclear receptors have been reported to bind to DNA as monomers." Q#5242 - CGI_10027033 superfamily 247824 49 309 2.33E-10 59.5455 cl17270 APH_ChoK_like superfamily - - "Aminoglycoside 3'-phosphotransferase (APH) and Choline Kinase (ChoK) family. The APH/ChoK family is part of a larger superfamily that includes the catalytic domains of other kinases, such as the typical serine/threonine/tyrosine protein kinases (PKs), RIO kinases, actin-fragmin kinase (AFK), and phosphoinositide 3-kinase (PI3K). The family is composed of APH, ChoK, ethanolamine kinase (ETNK), macrolide 2'-phosphotransferase (MPH2'), an unusual homoserine kinase, and uncharacterized proteins with similarity to the N-terminal domain of acyl-CoA dehydrogenase 10 (ACAD10). The members of this family catalyze the transfer of the gamma-phosphoryl group from ATP (or CTP) to small molecule substrates such as aminoglycosides, macrolides, choline, ethanolamine, and homoserine. Phosphorylation of the antibiotics, aminoglycosides and macrolides, leads to their inactivation and to bacterial antibiotic resistance. Phosphorylation of choline, ethanolamine, and homoserine serves as precursors to the synthesis of important biological compounds, such as the major phospholipids, phosphatidylcholine and phosphatidylethanolamine and the amino acids, threonine, methionine, and isoleucine." 
Q#5243 - CGI_10027034 superfamily 247792 60 110 3.36E-06 45.1292 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#5243 - CGI_10027034 superfamily 241563 206 237 0.000204361 39.77 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#5243 - CGI_10027034 superfamily 221533 261 337 0.00196906 36.906 cl13726 TMF_DNA_bd superfamily - - "TATA element modulatory factor 1 DNA binding; This is the middle region of a family of TATA element modulatory factor 1 proteins conserved in eukaryotes that contains at its N-terminal section a number of leucine zippers that could potentially form coiled coil structures. The whole proteins bind to the TATA element of some RNA polymerase II promoters and repress their activity by competing with the binding of TATA binding protein. TMFs are evolutionarily conserved golgins that bind Rab6, a ubiquitous ras-like GTP-binding Golgi protein, and contribute to Golgi organisation in animal and plant cells." 
Q#5244 - CGI_10027035 superfamily 241599 195 247 2.81E-17 74.202 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#5245 - CGI_10027036 superfamily 243058 1135 1237 3.17E-09 56.5539 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#5245 - CGI_10027036 superfamily 245201 3 254 7.97E-80 264.36 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#5246 - CGI_10027037 superfamily 247724 33 270 1.17E-06 46.6808 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. 
The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#5248 - CGI_10027039 superfamily 248247 25 450 1.00E-124 380.797 cl17693 Integrin_beta superfamily - - "Integrin, beta chain; Integrins have been found in animals and their homologues have also been found in cyanobacteria, probably due to horizontal gene transfer. The sequences repeats have been trimmed due to an overlap with EGF." Q#5248 - CGI_10027039 superfamily 219669 616 679 0.00593744 35.8316 cl06832 Integrin_B_tail superfamily C - Integrin beta tail domain; This is the beta tail domain of the Integrin protein. Integrins are receptors which are involved in cell-cell and cell-extracellular matrix interactions. 
Q#5249 - CGI_10027040 superfamily 241750 2 182 3.73E-33 119.986 cl00281 metallo-dependent_hydrolases superfamily N - "Superfamily of metallo-dependent hydrolases (also called amidohydrolase superfamily) is a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. The family includes urease alpha, adenosine deaminase, phosphotriesterase dihydroorotases, allantoinases, hydantoinases, AMP-, adenine and cytosine deaminases, imidazolonepropionase, aryldialkylphosphatase, chlorohydrolases, formylmethanofuran dehydrogenases and others." Q#5250 - CGI_10004940 superfamily 247692 81 708 0 944.351 cl17068 AFD_class_I superfamily - - "Adenylate forming domain, Class I; This family includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain." Q#5252 - CGI_10004942 superfamily 246723 55 145 4.85E-32 123.953 cl14813 GluZincin superfamily N - "Peptidase Gluzincin family (thermolysin-like proteinases, TLPs) includes peptidases M1, M2, M3, M4, M13, M32 and M36 (fungalysins); Gluzincin family (thermolysin-like peptidases or TLPs) includes several zinc-dependent metallopeptidases such as the M1, M2, M3, M4, M13, M32, M36 peptidases (MEROPS classification), and contain HEXXH and EXXXD motifs as part of their active site. 
All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. M1 family includes aminopeptidase N (APN) and leukotriene A4 hydrolase (LTA4H). APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types. LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity such that the two activities occupy different, but overlapping sites. The peptidase M3 or neurolysin-like family, includes M3, M2 and M32 metallopeptidases. The M3 peptidases have two subfamilies: M3A, includes thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (3.4.24.16), and the mitochondrial intermediate peptidase; M3B contains oligopeptidase F. M2 peptidase angiotensin converting enzyme (ACE, EC 3.4.15.1) catalyzes the conversion of decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II. ACE is a key part of the renin-angiotensin system that regulates blood pressure, thus ACE inhibitors are important for the treatment of hypertension. M32 family includes two eukaryotic enzymes from protozoa Trypanosoma cruzi, a causative agent of Chagas' disease, and Leishmania major, a parasite that causes leishmaniasis, making them attractive targets for drug development. The M4 family includes secreted protease thermolysin (EC 3.4.24.27), pseudolysin, aureolysin, neutral protease as well as fungalysin and bacillolysin (EC 3.4.24.28) that degrade extracellular proteins and peptides for bacterial nutrition, especially prior to sporulation. Thermolysin is widely used as a nonspecific protease to obtain fragments for peptide sequencing as well as in production of the artificial sweetener aspartame. 
M13 family includes neprilysin (EC 3.4.24.11) and endothelin-converting enzyme I (ECE-1, EC 3.4.24.71), which fulfill a broad range of physiological roles due to the greater variation in the S2' subsite allowing substrate specificity and are prime therapeutic targets for selective inhibition. Peptidase M36 (fungalysin) family includes endopeptidases from pathogenic fungi. Fungalysin hydrolyzes extracellular matrix proteins such as elastin and keratin. Aspergillus fumigatus causes the pulmonary disease aspergillosis by invading the lungs of immuno-compromised animals and secreting fungalysin that possibly breaks down proteinaceous structural barriers." Q#5254 - CGI_10004944 superfamily 245601 91 262 6.59E-30 111.26 cl11399 HP superfamily - - "Histidine phosphatase domain found in a functionally diverse set of proteins, mostly phosphatases; contains a His residue which is phosphorylated during the reaction; Catalytic domain of a functionally diverse set of proteins, most of which are phosphatases. The conserved catalytic core of this domain contains a His residue which is phosphorylated in the reaction. This set of proteins includes cofactor-dependent and cofactor-independent phosphoglycerate mutases (dPGM and BPGM, respectively), fructose-2,6-bisphosphatase (F26BPase), Sts-1, SixA, histidine acid phosphatases, phytases, and related proteins. Functions include roles in metabolism, signaling, or regulation, for example F26BPase affects glycolysis and gluconeogenesis through controlling the concentration of F26BP; BPGM controls the concentration of 2,3-BPG (the main allosteric effector of hemoglobin in human blood cells); human Sts-1 is a T-cell regulator; Escherichia coli SixA participates in the ArcB-dependent His-to-Asp phosphorelay signaling system; phytases scavenge phosphate from extracellular sources. 
Deficiency and mutation in many of the human members result in disease, for example erythrocyte BPGM deficiency is a disease associated with a decrease in the concentration of 2,3-BPG. Clinical applications include the use of prostatic acid phosphatase (PAP) as a serum marker for prostate cancer. Agricultural applications include the addition of phytases to animal feed." Q#5256 - CGI_10004946 superfamily 241613 41 75 2.58E-09 50.283 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#5256 - CGI_10004946 superfamily 241613 1 35 1.39E-07 45.6606 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native 
disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#5257 - CGI_10004947 superfamily 214565 34 95 0.00400452 32.5345 cl18312 VWC_out superfamily - - von Willebrand factor (vWF) type C domain; von Willebrand factor (vWF) type C domain. Q#5258 - CGI_10004948 superfamily 241636 5 174 1.25E-61 198.966 cl00145 TBOX superfamily - - "T-box DNA binding domain of the T-box family of transcriptional regulators. The T-box family is an ancient group that appears to play a critical role in development in all animal species. These genes were uncovered on the basis of similarity to the DNA binding domain of murine Brachyury (T) gene product, the defining feature of the family. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development and conserved expression patterns, most of the known genes in all species being expressed in mesoderm or mesoderm precursors." Q#5260 - CGI_10017553 superfamily 241574 165 302 1.63E-66 207.846 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#5260 - CGI_10017553 superfamily 241626 5 121 1.98E-28 107.368 cl00125 RHOD superfamily - - "Rhodanese Homology Domain (RHOD); an alpha beta fold domain found duplicated in the rhodanese protein. 
The cysteine containing enzymatically active version of the domain is also found in the Cdc25 class of protein phosphatases and a variety of proteins such as sulfide dehydrogenases and certain stress proteins such as senescence specific protein 1 in plants, PspE and GlpE in bacteria and cyanide and arsenate resistance proteins. Inactive versions (no active site cysteine) are also seen in dual specificity phosphatases, ubiquitin hydrolases from yeast and in sulfuryltransferases, where they are believed to play a regulatory role in multidomain proteins." Q#5261 - CGI_10017554 superfamily 243072 36 129 2.05E-24 92.0614 cl02529 ANK superfamily N - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#5262 - CGI_10017555 superfamily 241622 606 689 7.07E-18 81.459 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. 
The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#5262 - CGI_10017555 superfamily 241622 748 835 6.17E-12 64.125 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#5262 - CGI_10017555 superfamily 241622 392 466 0.000286091 41.0131 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. 
Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#5262 - CGI_10017555 superfamily 152488 125 282 2.39E-25 104.998 cl13485 DUF3534 superfamily - - Domain of unknown function (DUF3534); This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is about 150 amino acids in length. This domain is found associated with pfam00595. This domain has a conserved GILD sequence motif. Q#5264 - CGI_10017557 superfamily 247724 18 176 6.11E-126 354.021 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#5266 - CGI_10017559 superfamily 215647 209 477 1.04E-63 211.313 cl18338 7tm_2 superfamily - - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GPCRs). They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#5266 - CGI_10017559 superfamily 243029 49 112 1.18E-21 89.3321 cl02422 HRM superfamily - - Hormone receptor domain; This extracellular domain contains four conserved cysteines that probably form disulphide bridges. 
The domain is found in a variety of hormone receptors. It may be a ligand binding domain. Q#5266 - CGI_10017559 superfamily 215647 131 191 2.83E-16 77.6488 cl18338 7tm_2 superfamily C - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GPCRs). They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#5268 - CGI_10017563 superfamily 242095 343 603 3.41E-67 220.455 cl00793 DUF92 superfamily - - Integral membrane protein DUF92; Members of this family have several predicted transmembrane helices. The function of these prokaryotic proteins is unknown. Q#5270 - CGI_10017565 superfamily 246921 237 293 6.14E-09 53.1481 cl15299 FG-GAP superfamily - - "FG-GAP repeat; This family contains the extracellular repeat that is found in up to seven copies in alpha integrins. This repeat has been predicted to fold into a beta propeller structure. The repeat is called the FG-GAP repeat after two conserved motifs in the repeat. The FG-GAP repeats are found in the N terminus of integrin alpha chains, a region that has been shown to be important for ligand binding. A putative Ca2+ binding motif is found in some of the repeats." 
Q#5270 - CGI_10017565 superfamily 246921 169 230 7.09E-06 44.2885 cl15299 FG-GAP superfamily - - "FG-GAP repeat; This family contains the extracellular repeat that is found in up to seven copies in alpha integrins. This repeat has been predicted to fold into a beta propeller structure. The repeat is called the FG-GAP repeat after two conserved motifs in the repeat. The FG-GAP repeats are found in the N terminus of integrin alpha chains, a region that has been shown to be important for ligand binding. A putative Ca2+ binding motif is found in some of the repeats." Q#5272 - CGI_10017567 superfamily 217007 48 352 4.30E-65 210.919 cl11995 Syja_N superfamily C - SacI homology domain; This Pfam family represents a protein domain which shows homology to the yeast protein SacI. The SacI homology domain is most notably found at the amino terminal of the inositol 5'-phosphatase synaptojanin. Q#5274 - CGI_10017570 superfamily 241547 15 260 2.41E-80 256.056 cl00012 alpha_CA superfamily - - "Carbonic anhydrase alpha (vertebrate-like) group. Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionary distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidine residues and a fourth conserved histidine plays a potential role in proton transfer." 
Q#5274 - CGI_10017570 superfamily 241547 284 499 4.20E-79 252.589 cl00012 alpha_CA superfamily - - "Carbonic anhydrase alpha (vertebrate-like) group. Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionary distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidine residues and a fourth conserved histidine plays a potential role in proton transfer." Q#5274 - CGI_10017570 superfamily 241547 576 592 0.00445609 37.6476 cl00012 alpha_CA superfamily N - "Carbonic anhydrase alpha (vertebrate-like) group. Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionary distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidine residues and a fourth conserved histidine plays a potential role in proton transfer." 
Q#5277 - CGI_10001261 superfamily 245864 54 288 3.37E-51 176.315 cl12078 p450 superfamily N - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#5278 - CGI_10018493 superfamily 241564 152 220 2.73E-29 108.122 cl00035 BIR superfamily - - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger."
Q#5278 - CGI_10018493 superfamily 241564 21 88 1.57E-20 83.8543 cl00035 BIR superfamily - - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." Q#5278 - CGI_10018493 superfamily 247792 303 341 0.000153156 38.966 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-(N/C/H)-X2-C-X(4-48)-C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#5279 - CGI_10018494 superfamily 241564 142 210 8.20E-25 95.0251 cl00035 BIR superfamily - - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger."
Q#5279 - CGI_10018494 superfamily 241564 21 88 2.88E-20 82.3135 cl00035 BIR superfamily - - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." Q#5279 - CGI_10018494 superfamily 247792 243 281 0.00799904 33.5732 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-(N/C/H)-X2-C-X(4-48)-C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#5280 - CGI_10018495 superfamily 220656 39 153 1.12E-39 131.991 cl10939 Erf4 superfamily - - Golgin subfamily A member 7/ERF4 family; This family of proteins includes Golgin subfamily A member 7 proteins as well as Ras modification protein ERF4. Q#5284 - CGI_10018499 superfamily 247787 112 396 0 586.553 cl17233 RecA-like_NTPases superfamily - - "RecA-like NTPases. This family includes the NTP binding domain of F1 and V1 H+ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. This group also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion."
Q#5284 - CGI_10018499 superfamily 215848 412 503 2.74E-19 83.4833 cl08258 ATP-synt_ab_C superfamily - - "ATP synthase alpha/beta chain, C terminal domain; ATP synthase alpha/beta chain, C terminal domain. " Q#5284 - CGI_10018499 superfamily 217261 44 110 3.65E-09 53.6832 cl18399 ATP-synt_ab_N superfamily - - "ATP synthase alpha/beta family, beta-barrel domain; This family includes the ATP synthase alpha and beta subunits and the ATP synthase associated with flagella." Q#5286 - CGI_10018501 superfamily 219738 120 221 6.46E-22 91.3322 cl06980 Anillin superfamily - - "Cell division protein anillin; Anillin is a protein involved in septin organisation during cell division. It is an actin binding protein that is localised to the cleavage furrow, and it maintains the localisation of active myosin, which ensures the spatial control of concerted contraction during cytokinesis." Q#5286 - CGI_10018501 superfamily 247725 309 414 3.49E-12 63.1533 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting proteins to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and other proteins." Q#5286 - CGI_10018501 superfamily 241602 50 103 0.000232239 39.0318 cl00087 HR1 superfamily - - "Protein kinase C-related kinase homology region 1 (HR1) domain that binds Rho family small GTPases; The HR1 domain, also called the ACC (anti-parallel coiled-coil) finger domain or Rho-binding domain binds small GTPases from the Rho family.
It is found in Rho effector proteins including PKC-related kinases such as vertebrate PRK1 (or PKN) and yeast PKC1 protein kinases C, as well as in rhophilins and Rho-associated kinase (ROCK). Rho family members function as molecular switches, cycling between inactive and active forms, controlling a variety of cellular processes. HR1 domains may occur in repeat arrangements (PKN contains three HR1 domains), separated by a short linker region." Q#5290 - CGI_10018505 superfamily 243074 959 996 2.90E-09 54.4349 cl02535 F-box-like superfamily - - F-box-like; This is an F-box-like family. Q#5292 - CGI_10018507 superfamily 216056 8 146 1.67E-43 155.931 cl08279 Peptidase_M16 superfamily - - Insulinase (Peptidase family M16); Insulinase (Peptidase family M16). Q#5292 - CGI_10018507 superfamily 218490 171 350 3.09E-22 96.0063 cl08432 Peptidase_M16_C superfamily - - "Peptidase M16 inactive domain; Peptidase M16 consists of two structurally related domains. One is the active peptidase, whereas the other is inactive. The two domains hold the substrate like a clamp." Q#5292 - CGI_10018507 superfamily 218490 705 887 3.14E-15 75.2055 cl08432 Peptidase_M16_C superfamily - - "Peptidase M16 inactive domain; Peptidase M16 consists of two structurally related domains. One is the active peptidase, whereas the other is inactive. The two domains hold the substrate like a clamp." Q#5293 - CGI_10018508 superfamily 241555 682 902 2.78E-118 361.169 cl00020 GAT_1 superfamily - - "Type 1 glutamine amidotransferase (GATase1)-like domain; Type 1 glutamine amidotransferase (GATase1)-like domain. This group contains proteins similar to Class I glutamine amidotransferases, the intracellular PH1704 from Pyrococcus horikoshii, the C-terminal of the large catalase: Escherichia coli HP-II, Sinorhizobium meliloti Rm1021 ThuA, the A4 beta-galactosidase middle domain and peptidase E. 
The majority of proteins in this group have a reactive Cys found in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow. For Class I glutamine amidotransferases proteins which transfer ammonia from the amide side chain of glutamine to an acceptor substrate, this Cys forms a Cys-His-Glu catalytic triad in the active site. Glutamine amidotransferase activity can be found in a range of biosynthetic enzymes included in this cd: glutamine amidotransferase, formylglycinamide ribonucleotide, GMP synthetase, anthranilate synthase component II, glutamine-dependent carbamoyl phosphate synthase (CPSase), cytidine triphosphate synthetase, gamma-glutamyl hydrolase, imidazole glycerol phosphate synthase and cobyric acid synthase. For Pyrococcus horikoshii PH1704, the Cys of the nucleophile elbow together with a different His and a Glu from an adjacent monomer form a catalytic triad different from the typical GATase1 triad. Peptidase E is believed to be a serine peptidase having a Ser-His-Glu catalytic triad which differs from the Cys-His-Glu catalytic triad of typical GATase1 domains, by having a Ser in place of the reactive Cys at the nucleophile elbow. The E. coli HP-II C-terminal domain, S. meliloti Rm1021 ThuA and the A4 beta-galactosidase middle domain lack the catalytic triad typical of GATase1 domains. GATase1-like domains can occur either as single polypeptides, as in Class I glutamine amidotransferases, or as domains in a much larger multifunctional synthase protein, such as CPSase. Peptidase E has a circular permutation in the common core of a typical GATase1 domain." Q#5293 - CGI_10018508 superfamily 241547 390 622 1.07E-44 161.682 cl00012 alpha_CA superfamily - - "Carbonic anhydrase alpha (vertebrate-like) group.
Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionary distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidine residues and a fourth conserved histidine plays a potential role in proton transfer." Q#5293 - CGI_10018508 superfamily 241547 62 187 2.07E-23 100.05 cl00012 alpha_CA superfamily C - "Carbonic anhydrase alpha (vertebrate-like) group. Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionary distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidine residues and a fourth conserved histidine plays a potential role in proton transfer." Q#5293 - CGI_10018508 superfamily 241547 232 272 1.09E-06 49.1923 cl00012 alpha_CA superfamily C - "Carbonic anhydrase alpha (vertebrate-like) group. 
Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionary distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidine residues and a fourth conserved histidine plays a potential role in proton transfer." Q#5293 - CGI_10018508 superfamily 241547 248 336 0.00973974 37.2624 cl00012 alpha_CA superfamily C - "Carbonic anhydrase alpha (vertebrate-like) group. Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionary distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidine residues and a fourth conserved histidine plays a potential role in proton transfer." 
Q#5294 - CGI_10018509 superfamily 245208 65 434 0 575.371 cl09933 ACAD superfamily - - "Acyl-CoA dehydrogenase; Both mitochondrial acyl-CoA dehydrogenases (ACAD) and peroxisomal acyl-CoA oxidases (AXO) catalyze the alpha,beta dehydrogenation of the corresponding trans-enoyl-CoA by FAD, which becomes reduced. The reduced form of ACAD is reoxidized in the oxidative half-reaction by electron-transferring flavoprotein (ETF), from which the electrons are transferred to the mitochondrial respiratory chain coupled with ATP synthesis. In contrast, AXO catalyzes a different oxidative half-reaction, in which the reduced FAD is reoxidized by molecular oxygen. The ACAD family includes the eukaryotic beta-oxidation enzymes, short (SCAD), medium (MCAD), long (LCAD) and very-long (VLCAD) chain acyl-CoA dehydrogenases. These enzymes all share high sequence similarity, but differ in their substrate specificities. The ACAD family also includes amino acid catabolism enzymes such as Isovaleryl-CoA dehydrogenase (IVD), short/branched chain acyl-CoA dehydrogenases (SBCAD), Isobutyryl-CoA dehydrogenase (IBDH), glutaryl-CoA dehydrogenase (GCD) and Crotonobetainyl-CoA dehydrogenase. The mitochondrial ACADs are generally homotetramers, except for VLCAD, which is a homodimer. Related enzymes include the SOS adaptive response protein aidB, Naphthocyclinone hydroxylase (NcnH), and Dibenzothiophene (DBT) desulfurization enzyme C (DszC)" Q#5297 - CGI_10018512 superfamily 214507 93 146 0.00057949 36.6392 cl15307 LRRCT superfamily - - Leucine rich repeat C-terminal domain; Leucine rich repeat C-terminal domain. Q#5299 - CGI_10018514 superfamily 243035 20 136 5.85E-18 75.3489 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins.
This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. 
In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#5300 - CGI_10018515 superfamily 243035 2 105 1.08E-22 86.1345 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. 
Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#5301 - CGI_10018516 superfamily 243035 451 569 2.07E-23 96.5349 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. 
Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. 
Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#5301 - CGI_10018516 superfamily 243035 603 715 3.08E-18 81.8973 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. 
Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#5301 - CGI_10018516 superfamily 243035 333 433 2.17E-16 76.5045 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#5301 - CGI_10018516 superfamily 243035 117 209 6.29E-10 57.2445 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations.
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#5301 - CGI_10018516 superfamily 243035 15 141 6.43E-19 83.8933 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration.
CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#5302 - CGI_10018517 superfamily 225324 686 845 0.00763769 38.5004 cl11972 COG2604 superfamily N - Uncharacterized protein conserved in bacteria [Function unknown] Q#5304 - CGI_10018519 superfamily 241766 23 265 5.60E-94 303.22 cl00303 PNP_UDP_1 superfamily - - Phosphorylase superfamily; Members of this family include: purine nucleoside phosphorylase (PNP) Uridine phosphorylase (UdRPase) 5'-methylthioadenosine phosphorylase (MTA phosphorylase) Q#5305 - CGI_10018520 superfamily 241592 62 88 0.00446582 33.0703 cl00074 H2A superfamily N - "Histone 2A; H2A is a subunit of the nucleosome. The nucleosome is an octamer containing two H2A, H2B, H3, and H4 subunits. The H2A subunit performs essential roles in maintaining structural integrity of the nucleosome, chromatin condensation, and binding of specific chromatin-associated proteins." Q#5306 - CGI_10018521 superfamily 203029 72 206 9.51E-81 247.938 cl18233 GRASP55_65 superfamily - - "GRASP55/65 PDZ-like domain; GRASP55 (Golgi re-assembly stacking protein of 55 kDa) and GRASP65 (a 65 kDa) protein are highly homologous. GRASP55 is a component of the Golgi stacking machinery. GRASP65, an N-ethylmaleimide- sensitive membrane protein required for the stacking of Golgi cisternae in a cell-free system. This region appears to be related to the PDZ domain." Q#5306 - CGI_10018521 superfamily 203029 15 101 1.68E-22 92.3176 cl18233 GRASP55_65 superfamily N - "GRASP55/65 PDZ-like domain; GRASP55 (Golgi re-assembly stacking protein of 55 kDa) and GRASP65 (a 65 kDa) protein are highly homologous. GRASP55 is a component of the Golgi stacking machinery. GRASP65, an N-ethylmaleimide- sensitive membrane protein required for the stacking of Golgi cisternae in a cell-free system. This region appears to be related to the PDZ domain." Q#5307 - CGI_10018522 superfamily 215859 513 717 2.31E-56 192.046 cl18347 Peptidase_S9 superfamily - - Prolyl oligopeptidase family; Prolyl oligopeptidase family. 
Q#5309 - CGI_10018524 superfamily 244859 26 153 6.52E-06 44.4597 cl08171 HtrL_YibB superfamily N - "Bacterial protein of unknown function (HtrL_YibB); The protein from this rare, uncharacterized protein family is designated HtrL or YibB in E. coli, where its gene is found in a region of LPS core biosynthesis genes. Homologues are found in Shigella flexneri, Campylobacter jejuni, and Caenorhabditis elegans only. The htrL gene may represent an insertion to the LPS core biosynthesis region, rather than an LPS biosynthetic protein." Q#5312 - CGI_10007812 superfamily 243061 135 238 1.04E-35 130.54 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#5312 - CGI_10007812 superfamily 214531 356 399 1.45E-08 51.8337 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#5312 - CGI_10007812 superfamily 214531 537 578 6.29E-08 49.9077 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#5312 - CGI_10007812 superfamily 214531 579 621 7.31E-06 44.1297 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. 
Q#5312 - CGI_10007812 superfamily 214531 450 490 0.00048674 38.7369 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#5312 - CGI_10007812 superfamily 214531 497 534 0.00074875 37.9665 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#5312 - CGI_10007812 superfamily 214531 274 311 0.00417239 35.6553 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#5312 - CGI_10007812 superfamily 214531 328 354 0.0062425 35.2701 cl18310 LY superfamily N - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#5314 - CGI_10007814 superfamily 243134 28 152 1.64E-29 109.661 cl02663 Fasciclin superfamily - - "Fasciclin domain; This extracellular domain is found repeated four times in grasshopper fasciclin I as well as in proteins from mammals, sea urchins, plants, yeast and bacteria." Q#5314 - CGI_10007814 superfamily 243134 171 294 9.21E-24 93.8679 cl02663 Fasciclin superfamily - - "Fasciclin domain; This extracellular domain is found repeated four times in grasshopper fasciclin I as well as in proteins from mammals, sea urchins, plants, yeast and bacteria." 
Q#5316 - CGI_10007816 superfamily 243134 36 159 4.40E-33 119.291 cl02663 Fasciclin superfamily - - "Fasciclin domain; This extracellular domain is found repeated four times in grasshopper fasciclin I as well as in proteins from mammals, sea urchins, plants, yeast and bacteria." Q#5316 - CGI_10007816 superfamily 243134 191 293 1.73E-25 98.8755 cl02663 Fasciclin superfamily - - "Fasciclin domain; This extracellular domain is found repeated four times in grasshopper fasciclin I as well as in proteins from mammals, sea urchins, plants, yeast and bacteria." Q#5317 - CGI_10007817 superfamily 243134 31 154 1.87E-33 120.062 cl02663 Fasciclin superfamily - - "Fasciclin domain; This extracellular domain is found repeated four times in grasshopper fasciclin I as well as in proteins from mammals, sea urchins, plants, yeast and bacteria." Q#5317 - CGI_10007817 superfamily 243134 169 287 4.10E-29 108.506 cl02663 Fasciclin superfamily - - "Fasciclin domain; This extracellular domain is found repeated four times in grasshopper fasciclin I as well as in proteins from mammals, sea urchins, plants, yeast and bacteria." 
Q#5318 - CGI_10007818 superfamily 247684 45 250 5.95E-36 135.868 cl17037 NBD_sugar-kinase_HSP70_actin superfamily N - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#5318 - CGI_10007818 superfamily 247724 12 45 4.91E-06 44.7542 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. 
This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#5319 - CGI_10007819 superfamily 245323 310 540 3.56E-80 267.572 cl10511 Beach superfamily - - "BEACH (Beige and Chediak-Higashi) domains, implicated in membrane trafficking, are present in a family of proteins conserved throughout eukaryotes. This group contains human lysosomal trafficking regulator (LYST), LPS-responsive and beige-like anchor (LRBA) and neurobeachin. Disruption of LYST leads to Chediak-Higashi syndrome, characterized by severe immunodeficiency, albinism, poor blood coagulation and neurologic problems. Neurobeachin is a candidate gene linked to autism. LRBA seems to be upregulated in several cancer types. It has been shown that the BEACH domain itself is important for the function of these proteins."
Q#5319 - CGI_10007819 superfamily 243092 1533 1782 1.42E-33 133.229 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#5320 - CGI_10007820 superfamily 241622 157 225 2.71E-10 58.347 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes.
This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95 (post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#5321 - CGI_10007821 superfamily 247792 206 249 0.000167588 38.1956 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#5321 - CGI_10007821 superfamily 202367 2 164 9.20E-32 116.485 cl18226 3HCDH_N superfamily - - "3-hydroxyacyl-CoA dehydrogenase, NAD binding domain; This family also includes lambda crystallin." Q#5322 - CGI_10007822 superfamily 220679 135 274 8.22E-09 53.0997 cl18567 Methyltransf_16 superfamily - - Putative methyltransferase; Putative methyltransferase. Q#5323 - CGI_10007823 superfamily 243555 27 215 0.000981595 38.9114 cl03871 Chitin_bind_3 superfamily - - "Chitin binding domain; This domain is found associated with a wide variety of cellulose binding domain. This domain however is a chitin binding domain. This domain is found in isolation in baculoviral spheroidins and spindolins, protein of unknown function."
Q#5325 - CGI_10003572 superfamily 241578 302 479 5.04E-46 163.155 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#5325 - CGI_10003572 superfamily 207701 57 174 2.32E-31 120.092 cl02699 VIT superfamily - - Vault protein inter-alpha-trypsin domain; Inter-alpha-trypsin inhibitors (ITIs) consist of one light chain and a variable set of heavy chains. ITIs play a role in extracellular matrix (ECM) stabilisation and tumour metastasis as well as in plasma protease inhibition. The vault protein inter-alpha-trypsin (VIT) domain described here is found to the N-terminus of a von Willebrand factor type A domain (pfam00092) in ITI heavy chains (ITIHs) and their precursors. Q#5325 - CGI_10003572 superfamily 148333 679 860 8.15E-11 61.143 cl05947 ITI_HC_C superfamily - - "Inter-alpha-trypsin inhibitor heavy chain C-terminus; This family represents the C-terminal region of inter-alpha-trypsin inhibitor heavy chains. 
Inter-alpha-trypsin inhibitors are glycoproteins with a high inhibitory activity against trypsin, built up from different combinations of four polypeptides: bikunin and the three heavy chains that belong to this family (HC1, HC2, HC3). The heavy chains do not have any protease inhibitory properties but have the capacity to interact in vitro and in vivo with hyaluronic acid, which promotes the stability of the extra-cellular matrix. All family members contain the pfam00092 domain." Q#5326 - CGI_10003573 superfamily 241868 153 241 1.91E-18 78.5627 cl00447 Nudix_Hydrolase superfamily C - "Nudix hydrolase is a superfamily of enzymes found in all three kingdoms of life, and it catalyzes the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+ for their activity. Members of this family are recognized by a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which forms a structural motif that functions as a metal binding and catalytic site. Substrates of nudix hydrolase include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance and "house-cleaning" enzymes. Substrate specificity is used to define child families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. 
This superfamily consists of at least nine families: IPP (isopentenyl diphosphate) isomerase, ADP ribose pyrophosphatase, mutT pyrophosphohydrolase, coenzyme-A pyrophosphatase, MTH1-7,8-dihydro-8-oxoguanine-triphosphatase, diadenosine tetraphosphate hydrolase, NADH pyrophosphatase, GDP-mannose hydrolase and the c-terminal portion of the mutY adenine glycosylase." Q#5327 - CGI_10003574 superfamily 152704 309 543 6.51E-116 346.384 cl13675 zf-CpG_bind_C superfamily - - "CpG binding protein zinc finger C terminal domain; This domain family is found in eukaryotes, and is approximately 240 amino acids in length. This domain is the zinc finger domain of a CpG binding DNA methyltransferase protein. It contains a CxxC motif which forms the zinc finger and binds to DNA." Q#5327 - CGI_10003574 superfamily 202085 123 162 1.65E-13 65.8446 cl03401 zf-CXXC superfamily - - "CXXC zinc finger domain; This domain contains eight conserved cysteine residues that bind to two zinc ions. The CXXC domain is found in a variety of chromatin-associated proteins. This domain binds to nonmethyl-CpG dinucleotides. The domain is characterized by two CGXCXXC repeats. The RecQ helicase has a single repeat that also binds to zinc, but this has not been included in this family. The DNA binding interface has been identified by NMR." Q#5327 - CGI_10003574 superfamily 247999 30 74 1.42E-07 48.7474 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#5328 - CGI_10003575 superfamily 248012 62 191 1.84E-10 54.9721 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#5329 - CGI_10003576 superfamily 248012 103 216 5.22E-11 57.2833 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. 
Q#5330 - CGI_10003577 superfamily 248012 13 142 1.23E-10 54.9721 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#5337 - CGI_10023702 superfamily 241832 25 147 3.03E-31 113.404 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#5341 - CGI_10023706 superfamily 245864 1 402 2.34E-93 313.061 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. 
These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#5341 - CGI_10023706 superfamily 244881 1404 1731 9.67E-53 189.711 cl08267 ISOPREN_C2_like superfamily - - "This group contains class II terpene cyclases, protein prenyltransferases beta subunit, two broadly specific proteinase inhibitors alpha2-macroglobulin (alpha (2)-M) and pregnancy zone protein (PZP) and, the C3 C4 and C5 components of vertebrate complement. Class II terpene cyclases include squalene cyclase (SQCY) and 2,3-oxidosqualene cyclase (OSQCY), these integral membrane proteins catalyze a cationic cyclization cascade converting linear triterpenes to fused ring compounds. The protein prenyltransferases include protein farnesyltransferase (FTase) and geranylgeranyltransferase types I and II (GGTase-I and GGTase-II) which catalyze the carboxyl-terminal lipidation of Ras, Rab, and several other cellular signal transduction proteins, facilitating membrane associations and specific protein-protein interactions. Alpha (2)-M is a major carrier protein in serum and involved in the immobilization and entrapment of proteases. PZP is a pregnancy associated protein. Alpha (2)-M and PZP are known to bind to and, may modulate, the activity of placental protein-14 in T-cell growth and cytokine production thereby protecting the allogeneic fetus from attack by the maternal immune system." Q#5341 - CGI_10023706 superfamily 203720 1851 1943 5.74E-19 84.9073 cl08457 A2M_recep superfamily - - A-macroglobulin receptor; This family includes the receptor domain region of the alpha-2-macroglobulin family.
Q#5341 - CGI_10023706 superfamily 216731 539 631 1.80E-17 80.7659 cl12258 A2M_N superfamily - - MG2 domain; This is the MG2 (macroglobulin) domain of alpha-2-macroglobulin. Q#5341 - CGI_10023706 superfamily 215788 1160 1253 6.98E-15 72.9823 cl08251 A2M superfamily - - Alpha-2-macroglobulin family; This family includes the C-terminal region of the alpha-2-macroglobulin family. Q#5342 - CGI_10023707 superfamily 246680 1034 1082 0.00017526 41.167 cl14633 DD_superfamily superfamily C - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#5342 - CGI_10023707 superfamily 219405 175 342 0.00443099 39.1154 cl06450 PCEMA1 superfamily N - Acidic phosphoprotein precursor PCEMA1; This family consists of several acidic phosphoprotein precursor PCEMA1 sequences which appear to be found exclusively in Plasmodium chabaudi. PCEMA1 is an antigen that is associated with the membrane of the infected erythrocyte throughout the entire intraerythrocytic cycle. The exact function of this family is unclear. 
Q#5342 - CGI_10023707 superfamily 246680 609 681 0.00788726 35.8645 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#5343 - CGI_10023708 superfamily 149414 1 54 1.33E-18 81.1638 cl07091 TRP_2 superfamily - - Transient receptor ion channel II; This domain is found in the transient receptor ion channel (Trp) family of proteins. There is strong evidence that Trp proteins are structural elements of calcium-ion entry channels activated by G protein-coupled receptors. This domain does not tend to appear with the TRP domain (pfam06011) but is often found to the C-terminus of Ankyrin repeats (pfam00023). Q#5344 - CGI_10023709 superfamily 216897 8 86 1.13E-28 103.53 cl03463 Gal_Lectin superfamily - - Galactose binding lectin domain; Galactose binding lectin domain. Q#5344 - CGI_10023709 superfamily 216897 105 183 1.02E-27 100.834 cl03463 Gal_Lectin superfamily - - Galactose binding lectin domain; Galactose binding lectin domain. Q#5345 - CGI_10023710 superfamily 216897 30 101 7.79E-21 81.9589 cl03463 Gal_Lectin superfamily - - Galactose binding lectin domain; Galactose binding lectin domain. 
Q#5346 - CGI_10023711 superfamily 222150 243 267 0.00192984 35.4453 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#5346 - CGI_10023711 superfamily 246975 231 251 0.00827773 33.4745 cl15478 zf-C2H2 superfamily - - "Zinc finger, C2H2 type; The C2H2 zinc finger is the classical zinc finger domain. The two conserved cysteines and histidines co-ordinate a zinc ion. The following pattern describes the zinc finger. #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C] Where X can be any amino acid, and numbers in brackets indicate the number of residues. The positions marked # are those that are important for the stable fold of the zinc finger. The final position can be either his or cys. The C2H2 zinc finger is composed of two short beta strands followed by an alpha helix. The amino terminal part of the helix binds the major groove in DNA binding zinc fingers. The accepted consensus binding sequence for Sp1 is usually defined by the asymmetric hexanucleotide core GGGCGG but this sequence does not include, among others, the GAG (=CTC) repeat that constitutes a high-affinity site for Sp1 binding to the wt1 promoter." Q#5347 - CGI_10023712 superfamily 206040 36 220 4.13E-115 329.177 cl16444 NUDIX_2 superfamily - - "Nucleotide hydrolase; Nudix hydrolases are found in all classes of organism and hydrolyse a wide range of organic pyrophosphates, including nucleoside di- and triphosphates, di-nucleoside and diphospho-inositol polyphosphates, nucleotide sugars and RNA caps, with varying degrees of substrate specificity." Q#5348 - CGI_10023713 superfamily 242793 364 523 2.01E-70 225.78 cl01947 MT-A70 superfamily - - "MT-A70; MT-A70 is the S-adenosylmethionine-binding subunit of human mRNA:m6A methyl-transferase (MTase), an enzyme that sequence-specifically methylates adenines in pre-mRNAs." Q#5349 - CGI_10023714 superfamily 243077 3 56 3.56E-16 69.8817 cl02542 DnaJ superfamily - - "DnaJ domain or J-domain. 
DnaJ/Hsp40 (heat shock protein 40) proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperonine family. Hsp40 proteins are characterized by the presence of a J domain, which mediates the interaction with Hsp70. They may contain other domains as well, and the architectures provide a means of classification." Q#5351 - CGI_10023716 superfamily 241578 223 387 1.41E-58 203.228 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#5351 - CGI_10023716 superfamily 241578 25 181 9.69E-34 131.26 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. 
The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#5351 - CGI_10023716 superfamily 247097 471 507 0.000260657 42.053 cl15839 ShK superfamily - - ShK domain-like; This domain of is found in several C. elegans proteins. The domain is 30 amino acids long and rich in cysteine residues. There are 6 conserved cysteine positions in the domain that form three disulphide bridges. The domain is found in the potassium channel inhibitor ShK in sea anemone. Q#5352 - CGI_10023717 superfamily 241578 992 1152 4.29E-55 192.121 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. 
There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#5352 - CGI_10023717 superfamily 241578 592 752 5.61E-55 191.736 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#5352 - CGI_10023717 superfamily 241578 326 490 9.03E-48 170.872 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#5352 - CGI_10023717 superfamily 241578 792 958 2.80E-50 178.249 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." 
Q#5352 - CGI_10023717 superfamily 240521 62 281 1.52E-33 137.824 cl18940 Syo1_like superfamily N - "Fungal symportin 1 (syo1) and similar proteins; This family of eukaryotic proteins includes Saccharomyces cerevisiae Ydl063c and Chaetomium thermophilum Syo1, which mediate the co-import of two ribosomal proteins, Rpl5 and Rpl11 (which both interact with 5S rRNA) into the nucleus. Import precedes their association with rRNA and subsequent ribosome assembly in the nucleolus. The primary structure of syo1 is a mixture of Armadillo- (ARM, N-terminal part of syo1) and HEAT-repeats (C-terminal part of syo1)." Q#5352 - CGI_10023717 superfamily 246918 511 562 3.24E-16 76.0863 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#5352 - CGI_10023717 superfamily 247097 1466 1501 0.000200647 41.6678 cl15839 ShK superfamily - - ShK domain-like; This domain of is found in several C. elegans proteins. The domain is 30 amino acids long and rich in cysteine residues. There are 6 conserved cysteine positions in the domain that form three disulphide bridges. The domain is found in the potassium channel inhibitor ShK in sea anemone. Q#5352 - CGI_10023717 superfamily 247097 2118 2152 0.00619077 36.9737 cl15839 ShK superfamily - - ShK domain-like; This domain of is found in several C. elegans proteins. The domain is 30 amino acids long and rich in cysteine residues. There are 6 conserved cysteine positions in the domain that form three disulphide bridges. The domain is found in the potassium channel inhibitor ShK in sea anemone. Q#5352 - CGI_10023717 superfamily 248289 1739 1797 0.00811917 36.7252 cl17735 VWC superfamily - - von Willebrand factor type C domain; The high cutoff was used to prevent overlap with pfam00094. Q#5352 - CGI_10023717 superfamily 247097 2006 2040 0.00819115 36.6602 cl15839 ShK superfamily - - ShK domain-like; This domain of is found in several C. elegans proteins. 
The domain is 30 amino acids long and rich in cysteine residues. There are 6 conserved cysteine positions in the domain that form three disulphide bridges. The domain is found in the potassium channel inhibitor ShK in sea anemone. Q#5353 - CGI_10023718 superfamily 215647 216 458 3.49E-39 143.133 cl18338 7tm_2 superfamily - - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GPCRs). They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97, calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#5353 - CGI_10023718 superfamily 243029 147 203 1.57E-11 60.4421 cl02422 HRM superfamily - - Hormone receptor domain; This extracellular domain contains four conserved cysteines that probably form disulphide bridges. The domain is found in a variety of hormone receptors. It may be a ligand binding domain. Q#5354 - CGI_10023719 superfamily 145416 106 254 2.78E-79 238.322 cl03502 PA28_beta superfamily - - Proteasome activator pa28 beta subunit; PA28 activator complex (also known as 11s regulator of 20S proteasome) is a ring shaped hexameric structure of alternating alpha and beta subunits. This family represents the beta subunit. 
The activator complex binds to the 20S proteasome and stimulates peptidase activity in an ATP-independent manner. Q#5354 - CGI_10023719 superfamily 145415 17 71 1.31E-11 58.0596 cl03501 PA28_alpha superfamily - - Proteasome activator pa28 alpha subunit; PA28 activator complex (also known as 11s regulator of 20S proteasome) is a ring shaped hexameric structure of alternating alpha and beta subunits. This family represents the alpha subunit. The activator complex binds to the 20S proteasome and stimulates peptidase activity in an ATP-independent manner. Q#5356 - CGI_10023721 superfamily 216731 137 232 1.10E-23 98.0999 cl12258 A2M_N superfamily - - MG2 domain; This is the MG2 (macroglobulin) domain of alpha-2-macroglobulin. Q#5356 - CGI_10023721 superfamily 248289 953 1002 0.0049445 36.7252 cl17735 VWC superfamily - - von Willebrand factor type C domain; The high cutoff was used to prevent overlap with pfam00094. Q#5359 - CGI_10023724 superfamily 241597 98 161 2.20E-15 67.6529 cl00082 HMG-box superfamily - - "High Mobility Group (HMG)-box is found in a variety of eukaryotic chromosomal proteins and transcription factors. HMGs bind to the minor groove of DNA and have been classified by DNA binding preferences. Two phylogenically distinct groups of Class I proteins bind DNA in a sequence specific fashion and contain a single HMG box. One group (SOX-TCF) includes transcription factors, TCF-1, -3, -4; and also SRY and LEF-1, which bind four-way DNA junctions and duplex DNA targets. The second group (MATA) includes fungal mating type gene products MC, MATA1 and Ste11. Class II and III proteins (HMGB-UBF) bind DNA in a non-sequence specific fashion and contain two or more tandem HMG boxes. Class II members include non-histone chromosomal proteins, HMG1 and HMG2, which bind to bent or distorted DNA such as four-way DNA junctions, synthetic DNA cruciforms, kinked cisplatin-modified DNA, DNA bulges, cross-overs in supercoiled DNA, and can cause looping of linear DNA. 
Class III members include nucleolar and mitochondrial transcription factors, UBF and mtTF1, which bind four-way DNA junctions." Q#5359 - CGI_10023724 superfamily 241597 8 75 9.40E-15 65.7269 cl00082 HMG-box superfamily - - "High Mobility Group (HMG)-box is found in a variety of eukaryotic chromosomal proteins and transcription factors. HMGs bind to the minor groove of DNA and have been classified by DNA binding preferences. Two phylogenically distinct groups of Class I proteins bind DNA in a sequence specific fashion and contain a single HMG box. One group (SOX-TCF) includes transcription factors, TCF-1, -3, -4; and also SRY and LEF-1, which bind four-way DNA junctions and duplex DNA targets. The second group (MATA) includes fungal mating type gene products MC, MATA1 and Ste11. Class II and III proteins (HMGB-UBF) bind DNA in a non-sequence specific fashion and contain two or more tandem HMG boxes. Class II members include non-histone chromosomal proteins, HMG1 and HMG2, which bind to bent or distorted DNA such as four-way DNA junctions, synthetic DNA cruciforms, kinked cisplatin-modified DNA, DNA bulges, cross-overs in supercoiled DNA, and can cause looping of linear DNA. Class III members include nucleolar and mitochondrial transcription factors, UBF and mtTF1, which bind four-way DNA junctions." Q#5360 - CGI_10023725 superfamily 206037 123 189 1.95E-23 91.1512 cl16442 zf-SAP30 superfamily - - "SAP30 zinc-finger; SAP30 is a subunit of the histone deacetylase complex, and this domain is a zinc-finger. Solution of the structure shows a novel fold comprising two beta-strands and two alpha-helices with the zinc organising centre showing remote resemblance to the treble clef motif. In silico analysis of the structure revealed a highly conserved surface dominated by basic residues. NMR-based analysis of potential ligands for the SAP30 zn-finger motif indicated a strong preference for nucleic acid substrates. 
The zinc-finger of SAP30 probably functions as a double-stranded DNA-binding motif, thereby expanding the known functions of both SAP30 and the mammalian Sin3 co-repressor complex." Q#5360 - CGI_10023725 superfamily 206038 212 262 5.90E-23 89.2364 cl16443 SAP30_Sin3_bdg superfamily - - Sin3 binding region of histone deacetylase complex subunit SAP30; This C-terminal domain of the SAP30 proteins appears to be the binding region for Sin3. Q#5361 - CGI_10023726 superfamily 246598 7 271 3.81E-143 405.821 cl13996 MPN superfamily - - "Mpr1p, Pad1p N-terminal (MPN) domains; MPN (also known as Mov34, PAD-1, JAMM, JAB, MPN+) domains are found at the N-termini of proteins with a variety of functions; they are components of the proteasome regulatory subunits, the signalosome (CSN), eukaryotic translation initiation factor 3 (eIF3) complexes, and regulators of transcription factors. These domains are isopeptidases that release ubiquitin from ubiquitinated proteins (thus having deubiquitinating (DUB) activity) that are tagged for degradation. Catalytically active MPN domains contain a metalloprotease signature known as the JAB1/MPN/Mov34 metalloenzyme (JAMM) motif. For example, Rpn11 (also known as POH1 or PSMD14), a subunit of the 19S proteasome lid that is involved in the ATP-dependent degradation of ubiquitinated proteins, contains the conserved JAMM motif involved in zinc ion coordination. Poh1 is a regulator of c-Jun, an important regulator of cell proliferation, differentiation, survival and death. JAB1 is a component of the COP9 signalosome (CSN), a regulatory particle of the ubiquitin (Ub)/26S proteasome system occurring in all eukaryotic cells; it cleaves the ubiquitin-like protein NEDD8 from the cullin subunit of the SCF (Skp1, Cullins, F-box proteins) family of E3 ubiquitin ligases. 
AMSH (associated molecule with the SH3 domain of STAM, also known as STAMBP), a member of JAMM/MPN+ deubiquitinases (DUBs), specifically cleaves Lys 63-linked polyubiquitin (poly-Ub) chains, thus facilitating the recycling and subsequent trafficking of receptors to the cell surface. Similarly, BRCC36, part of the nuclear complex that includes BRCA1 protein and is targeted to DNA damage foci after irradiation, specifically disassembles K63-linked polyUb. BRCC36 is aberrantly expressed in sporadic breast tumors, indicative of a potential role in the pathogenesis of the disease. Some variants of the JAB1/MPN domains lack key residues in their JAMM motif and are unable to coordinate a metal ion. Comparisons of key catalytic and metal binding residues explain why the MPN-containing proteins Mov34/PSMD7, Rpn8, CSN6, Prp8p, and the translation initiation factor 3 subunits f (p47) and h (p40) do not show catalytic isopeptidase activity. It has been proposed that the MPN domain in these proteins has a primarily structural function." Q#5362 - CGI_10023727 superfamily 242881 8 184 6.78E-117 337.623 cl02099 CK_II_beta superfamily - - Casein kinase II regulatory subunit; Casein kinase II regulatory subunit. Q#5363 - CGI_10023728 superfamily 206630 433 534 3.94E-31 119.762 cl16900 NFRKB_winged superfamily - - NFRKB Winged Helix-like; This domain covers regions 370-495 of human nuclear factor related to kappaB binding (NFRKB) protein. Q#5364 - CGI_10023729 superfamily 241874 46 624 0 583.372 cl00456 SLC5-6-like_sbd superfamily - - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. 
SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#5376 - CGI_10001571 superfamily 243175 88 210 1.34E-68 209.725 cl02776 GST_C_family superfamily - - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. 
GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." Q#5376 - CGI_10001571 superfamily 241832 4 78 3.23E-32 114.273 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#5378 - CGI_10024023 superfamily 243072 823 948 1.76E-36 135.589 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). 
ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#5378 - CGI_10024023 superfamily 243072 988 1113 7.59E-36 133.663 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#5378 - CGI_10024023 superfamily 243072 745 882 6.55E-31 119.411 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#5378 - CGI_10024023 superfamily 243072 960 988 0.000247807 40.23 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#5380 - CGI_10024025 superfamily 247999 178 222 8.96E-10 53.755 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#5382 - CGI_10024027 superfamily 243066 98 189 7.38E-29 108.794 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#5382 - CGI_10024027 superfamily 219619 389 459 3.72E-11 59.1435 cl18518 Ion_trans_2 superfamily - - Ion channel; This family includes the two membrane helix type ion channels found in bacteria. Q#5385 - CGI_10024030 superfamily 246680 2 60 6.75E-06 43.7296 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. 
Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#5388 - CGI_10024033 superfamily 246616 73 332 2.07E-28 112.4 cl14105 MetH superfamily - - "Methionine synthase I (cobalamin-dependent), methyltransferase domain [Amino acid transport and metabolism]" Q#5389 - CGI_10024034 superfamily 248097 5 127 3.75E-23 88.4762 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#5391 - CGI_10024036 superfamily 247057 323 371 0.00611668 34.4657 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#5392 - CGI_10024037 superfamily 203324 57 174 5.06E-57 181.295 cl18240 UEV superfamily - - "UEV domain; This family includes the eukaryotic tumour susceptibility gene 101 protein (TSG101). 
Altered transcripts of this gene have been detected in sporadic breast cancers and many other human malignancies. However, the involvement of this gene in neoplastic transformation and tumorigenesis is still elusive. TSG101 is required for normal cell function of embryonic and adult tissues but that this gene is not a tumour suppressor for sporadic forms of breast cancer. This family is related to the ubiquitin conjugating enzymes." Q#5392 - CGI_10024037 superfamily 247057 240 282 0.00197856 35.3489 cl15755 SAM_superfamily superfamily N - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#5393 - CGI_10024038 superfamily 152683 72 170 5.07E-07 46.8973 cl13656 Methyltransf_FA superfamily - - "Farnesoic acid 0-methyl transferase; This domain family is found in bacteria and eukaryotes, and is approximately 110 amino acids in length.Farnesoic acid O-methyl transferase (FAMeT) is the enzyme that catalyzes the formation of methyl farnesoate (MF) from farnesoic acid (FA) in the biosynthetic pathway of juvenile hormone (JH)." 
Q#5394 - CGI_10024039 superfamily 215647 786 935 3.13E-11 63.3964 cl18338 7tm_2 superfamily N - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GPCRs). They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#5394 - CGI_10024039 superfamily 243029 196 241 9.65E-08 50.4269 cl02422 HRM superfamily - - Hormone receptor domain; This extracellular domain contains four conserved cysteines that probably form disulphide bridges. The domain is found in a variety of hormone receptors. It may be a ligand binding domain. Q#5394 - CGI_10024039 superfamily 245814 402 490 0.000361267 40.1813 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. 
A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#5394 - CGI_10024039 superfamily 245814 108 189 0.00154451 38.2553 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#5396 - CGI_10024041 superfamily 152683 111 212 2.11E-11 61.1497 cl13656 Methyltransf_FA superfamily - - "Farnesoic acid O-methyl transferase; This domain family is found in bacteria and eukaryotes, and is approximately 110 amino acids in length. Farnesoic acid O-methyl transferase (FAMeT) is the enzyme that catalyzes the formation of methyl farnesoate (MF) from farnesoic acid (FA) in the biosynthetic pathway of juvenile hormone (JH)." Q#5397 - CGI_10024042 superfamily 152683 2 95 2.69E-08 46.8973 cl13656 Methyltransf_FA superfamily - - "Farnesoic acid O-methyl transferase; This domain family is found in bacteria and eukaryotes, and is approximately 110 amino acids in length. Farnesoic acid O-methyl transferase (FAMeT) is the enzyme that catalyzes the formation of methyl farnesoate (MF) from farnesoic acid (FA) in the biosynthetic pathway of juvenile hormone (JH)." Q#5398 - CGI_10024043 superfamily 248458 329 508 3.41E-11 63.4869 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. 
MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#5398 - CGI_10024043 superfamily 248458 158 221 0.0080562 37.2933 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. 
Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#5399 - CGI_10024044 superfamily 203446 112 177 1.47E-14 70.0611 cl05764 Sec2p superfamily - - "GDP/GTP exchange factor Sec2p; In Saccharomyces cerevisiae, Sec2p is a GDP/GTP exchange factor for Sec4p, which is required for vesicular transport at the post-Golgi stage of yeast secretion." 
Q#5399 - CGI_10024044 superfamily 243092 396 593 7.52E-06 46.9444 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#5400 - CGI_10024045 superfamily 183292 40 72 0.00762391 35.5676 cl18135 PRK11728 superfamily C - hydroxyglutarate oxidase; Provisional Q#5401 - CGI_10024046 superfamily 218493 218 361 9.12E-35 125.933 cl08434 GMC_oxred_C superfamily - - GMC oxidoreductase; This domain found associated with pfam00732. Q#5405 - CGI_10024051 superfamily 204914 79 228 1.15E-43 156.285 cl13809 Condensin2nSMC superfamily - - "Condensin II non structural maintenance of chromosomes subunit; This domain family is found in eukaryotes, and is approximately 150 amino acids in length. This family is part of a non-SMC subunit of condensin II which is involved in maintenance of the structural integrity of chromosomes. 
Condensin II is made up of SMC (structural maintenance of chromosomes) and non-SMC subunits. The non-SMC subunits bind to the catalytic ends of the SMC subunit dimer. The condensin holocomplex is able to introduce superhelical tension into DNA in an ATP hydrolysis-dependent manner, resulting in the formation of positive supercoils in the presence of topoisomerase I and of positive knots in the presence of topoisomerase II." Q#5409 - CGI_10024055 superfamily 241596 46 86 1.95E-05 41.0455 cl00081 HLH superfamily N - "Helix-loop-helix domain, found in specific DNA-binding proteins that act as transcription factors; 60-100 amino acids long. A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE (5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins."
Q#5410 - CGI_10024056 superfamily 243205 97 388 2.63E-174 493.37 cl02823 phosphagen_kinases superfamily - - "Phosphagen (guanidino) kinases; Phosphagen (guanidino) kinases are enzymes that transphosphorylate a high energy phosphoguanidino compound, like phosphocreatine (PCr) in the case of creatine kinase (CK) or phosphoarginine in the case of arginine kinase, which is used as an energy-storage and -transport metabolite, to ADP, thereby creating ATP. The substrate binding site is located in the cleft between the N and C-terminal domains, but most of the catalytic residues are found in the larger C-terminal domain. In higher eukaryotes, CK exists in tissue-specific (muscle, brain), as well as compartment-specific (mitochondrial and cytosolic) isoforms. They are either coupled to glycolysis (cytosolic form) or oxidative phosphorylation (mitochondrial form). Besides CK and AK, the most studied members of this family are also other phosphagen kinases with different substrate specificities, like glycocyamine kinase (GK), lombricine kinase (LK), taurocyamine kinase (TK) and hypotaurocyamine kinase (HTK). The majority of bacterial phosphagen kinases appear to lack the N-terminal domain and have not been functionally characterized." Q#5412 - CGI_10024058 superfamily 110440 310 336 7.02E-05 40.4689 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats."
Q#5414 - CGI_10001888 superfamily 217311 1 351 3.26E-112 342.009 cl18402 DUF229 superfamily N - Protein of unknown function (DUF229); Members of this family are uncharacterized. They are 500-1200 amino acids in length and share a long region conservation that probably corresponds to several domains. The Go annotation for the protein indicates that it is involved in nematode larval development and has a positive regulation on growth rate. Q#5417 - CGI_10001510 superfamily 243061 41 141 5.22E-36 122.451 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#5417 - CGI_10001510 superfamily 243061 3 37 2.41E-10 53.885 cl02509 SRCR superfamily N - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#5418 - CGI_10001511 superfamily 243061 150 191 3.42E-18 76.2266 cl02509 SRCR superfamily C - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#5420 - CGI_10002284 superfamily 247805 31 129 1.84E-11 57.3472 cl17251 DEXDc superfamily C - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#5422 - CGI_10018012 superfamily 247856 18 74 8.76E-16 65.6469 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. 
EF-hands tend to occur in pairs or higher copy numbers." Q#5423 - CGI_10018013 superfamily 110440 188 215 0.00334021 33.9205 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#5424 - CGI_10018014 superfamily 242523 80 230 4.36E-10 58.0589 cl01472 DUF2236 superfamily C - "Uncharacterized protein conserved in bacteria (DUF2236); This domain, found in various hypothetical bacterial proteins, has no known function. This family contains a highly conserved arginine and histidine that may be active site residues for an as yet unknown catalytic activity." Q#5427 - CGI_10018017 superfamily 247724 5 152 1.78E-79 236.457 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#5429 - CGI_10018019 superfamily 217206 216 377 5.68E-72 229.879 cl03682 SKIP_SNW superfamily - - SKIP/SNW domain; This domain is found in chromatin proteins. Q#5430 - CGI_10018020 superfamily 248067 138 254 8.91E-43 152.362 cl17513 ABC1 superfamily - - "ABC1 family; This family includes ABC1 from yeast and AarF from E. coli. These proteins have a nuclear or mitochondrial subcellular location in eukaryotes. The exact molecular functions of these proteins is not clear, however yeast ABC1 suppresses a cytochrome b mRNA translation defect and is essential for the electron transfer in the bc 1 complex and E. coli AarF is required for ubiquinone production. It has been suggested that members of the ABC1 family are novel chaperonins. These proteins are unrelated to the ABC transporter proteins." Q#5430 - CGI_10018020 superfamily 248067 528 644 3.34E-42 150.821 cl17513 ABC1 superfamily - - "ABC1 family; This family includes ABC1 from yeast and AarF from E. coli. These proteins have a nuclear or mitochondrial subcellular location in eukaryotes. The exact molecular functions of these proteins is not clear, however yeast ABC1 suppresses a cytochrome b mRNA translation defect and is essential for the electron transfer in the bc 1 complex and E. 
coli AarF is required for ubiquinone production. It has been suggested that members of the ABC1 family are novel chaperonins. These proteins are unrelated to the ABC transporter proteins." Q#5431 - CGI_10018021 superfamily 222439 1 192 1.77E-61 195.921 cl16461 Glyco_transf_49 superfamily N - "Glycosyl-transferase for dystroglycan; This glycosyl-transferase brings about the glycosylation of the alpha-dystroglycan subunit. Dystroglycan is an integral member of the skeletal muscular dystrophin glycoprotein complex, which links dystrophin to proteins in the extracellular matrix." Q#5432 - CGI_10018022 superfamily 243034 832 929 9.01E-14 68.9459 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#5432 - CGI_10018022 superfamily 243034 762 861 3.11E-13 
67.4052 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#5432 - CGI_10018022 superfamily 241568 169 233 1.99E-11 61.3248 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. 
Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#5432 - CGI_10018022 superfamily 243034 702 787 2.13E-08 53.1528 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#5432 - CGI_10018022 superfamily 243034 904 978 8.15E-05 41.982 cl02429 TPR superfamily C - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in 
the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#5432 - CGI_10018022 superfamily 241568 42 95 0.000197334 40.524 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#5432 - CGI_10018022 superfamily 241568 243 300 0.00138864 38.2128 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. 
The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#5433 - CGI_10018023 superfamily 243179 124 249 4.95E-27 104.438 cl02781 tetraspanin_LEL superfamily - - "Tetraspanin, extracellular domain or large extracellular loop (LEL). Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. The tetraspanin family contains CD9, CD63, CD37, CD53, CD82, CD151, and CD81, amongst others. Tetraspanins are involved in diverse processes such as cell activation and proliferation, adhesion and motility, differentiation, cancer, and others. Their various functions may relate to their ability to act as molecular facilitators, grouping specific cell-surface proteins and affecting formation and stability of signaling complexes. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web", which may also include integrins." Q#5435 - CGI_10018025 superfamily 222090 105 309 7.19E-13 66.1422 cl18636 Methyltransf_22 superfamily N - Methyltransferase domain; This family appears to be a methyltransferase domain. Q#5436 - CGI_10018026 superfamily 241600 77 290 1.20E-78 240.22 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. 
The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#5437 - CGI_10018027 superfamily 241600 66 279 3.23E-93 277.199 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#5438 - CGI_10018028 superfamily 243119 59 99 0.000221554 35.8826 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. 
Q#5439 - CGI_10018029 superfamily 246908 18 104 1.51E-31 115.317 cl15255 SH2 superfamily - - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." Q#5439 - CGI_10018029 superfamily 245201 133 381 9.48E-153 434.198 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#5440 - CGI_10018030 superfamily 241584 61 151 2.87E-14 65.9807 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." 
Q#5440 - CGI_10018030 superfamily 245814 176 236 0.00241415 35.1575 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#5440 - CGI_10018030 superfamily 216647 7 53 2.86E-07 46.7761 cl03309 DB superfamily N - "DB module; This domain has no known function. It is found in several C. elegans proteins. The domain contains 12 conserved cysteines that probably form six disulphide bridges. This domain is found associated with ig pfam00047 and fn3 pfam00041 domains, as well as in some lipases pfam00657." Q#5443 - CGI_10018033 superfamily 154924 105 211 3.42E-38 129.817 cl02467 C4 superfamily - - C-terminal tandem repeated domain in type 4 procollagen; Duplicated domain in C-terminus of type 4 collagens. Mutations in alpha-5 collagen IV are associated with X-linked Alport syndrome. Q#5443 - CGI_10018033 superfamily 154924 32 91 1.28E-21 86.1964 cl02467 C4 superfamily N - C-terminal tandem repeated domain in type 4 procollagen; Duplicated domain in C-terminus of type 4 collagens. Mutations in alpha-5 collagen IV are associated with X-linked Alport syndrome. Q#5447 - CGI_10018037 superfamily 154924 124 204 3.48E-28 104.686 cl02467 C4 superfamily C - C-terminal tandem repeated domain in type 4 procollagen; Duplicated domain in C-terminus of type 4 collagens. Mutations in alpha-5 collagen IV are associated with X-linked Alport syndrome. 
Q#5449 - CGI_10018039 superfamily 154924 1375 1477 8.68E-49 170.94 cl02467 C4 superfamily - - C-terminal tandem repeated domain in type 4 procollagen; Duplicated domain in C-terminus of type 4 collagens. Mutations in alpha-5 collagen IV are associated with X-linked Alport syndrome. Q#5449 - CGI_10018039 superfamily 154924 1491 1597 7.59E-38 139.833 cl02467 C4 superfamily - - C-terminal tandem repeated domain in type 4 procollagen; Duplicated domain in C-terminus of type 4 collagens. Mutations in alpha-5 collagen IV are associated with X-linked Alport syndrome. Q#5452 - CGI_10018042 superfamily 154924 1405 1511 2.21E-56 192.897 cl02467 C4 superfamily - - C-terminal tandem repeated domain in type 4 procollagen; Duplicated domain in C-terminus of type 4 collagens. Mutations in alpha-5 collagen IV are associated with X-linked Alport syndrome. Q#5452 - CGI_10018042 superfamily 154924 1512 1627 1.26E-42 153.7 cl02467 C4 superfamily - - C-terminal tandem repeated domain in type 4 procollagen; Duplicated domain in C-terminus of type 4 collagens. Mutations in alpha-5 collagen IV are associated with X-linked Alport syndrome. Q#5453 - CGI_10018043 superfamily 248458 348 524 6.40E-09 56.5533 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. 
The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#5453 - CGI_10018043 superfamily 248458 137 238 4.06E-07 50.7753 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. 
Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#5455 - CGI_10018045 superfamily 248289 30 86 0.000113784 40.0999 cl17735 VWC superfamily - - von Willebrand factor type C domain; The high cutoff was used to prevent overlap with pfam00094. Q#5456 - CGI_10018046 superfamily 245201 268 456 7.63E-52 177.429 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#5456 - CGI_10018046 superfamily 246680 9 126 8.44E-23 94.0021 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. 
Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#5457 - CGI_10018047 superfamily 248305 196 573 7.62E-93 296.957 cl17751 Glyco_transf_22 superfamily - - Alg9-like mannosyltransferase family; Members of this family are mannosyltransferase enzymes. At least some members are localised in endoplasmic reticulum and involved in GPI anchor biosynthesis. Q#5457 - CGI_10018047 superfamily 248305 42 189 1.44E-43 161.752 cl17751 Glyco_transf_22 superfamily C - Alg9-like mannosyltransferase family; Members of this family are mannosyltransferase enzymes. At least some members are localised in endoplasmic reticulum and involved in GPI anchor biosynthesis. Q#5458 - CGI_10018048 superfamily 243161 4 93 6.84E-15 65.5342 cl02739 THAP superfamily - - "THAP domain; The THAP domain is a putative DNA-binding domain (DBD) and probably also binds a zinc ion. It features the conserved C2CH architecture (consensus sequence: Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His). Other universal features include the location of the domain at the N-termini of proteins, its size of about 90 residues, a C-terminal AVPTIF box and several other conserved residues. Orthologues of the human THAP domain have been identified in other vertebrates and probably worms and flies, but not in other eukaryotes or any prokaryotes." 
Q#5459 - CGI_10018049 superfamily 248236 72 242 1.22E-16 78.8804 cl17682 PRK07189 superfamily C - malonate decarboxylase subunit beta; Reviewed Q#5459 - CGI_10018049 superfamily 247899 344 513 8.55E-11 62.1315 cl17345 AccA superfamily N - Acetyl-CoA carboxylase alpha subunit [Lipid metabolism] Q#5460 - CGI_10018050 superfamily 245596 210 427 7.12E-36 133.761 cl11394 Glyco_tranf_GTA_type superfamily - - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#5463 - CGI_10018053 superfamily 242281 79 350 1.33E-45 159.462 cl01067 Dyp_perox superfamily N - Dyp-type peroxidase family; This family of dye-decolourising peroxidases lack a typical heme-binding region. 
Q#5464 - CGI_10003895 superfamily 243035 792 918 1.43E-14 71.4969 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#5464 - CGI_10003895 superfamily 216276 681 772 0.000150261 40.9943 cl15639 Activin_recp superfamily - - "Activin types I and II receptor domain; This Pfam entry consists of both TGF-beta receptor types. This is an alignment of the hydrophilic cysteine-rich ligand-binding domains. Both receptor types (type I and II) possess a 9 amino acid cysteine box, with the consensus CCX{4-5}CN. The type I receptors also possess 7 extracellular residues preceding the cysteine box." Q#5465 - CGI_10003896 superfamily 217598 437 547 2.63E-50 173.424 cl04130 KCNQ_channel superfamily N - KCNQ voltage-gated potassium channel; This family matches to the C-terminal tail of KCNQ type potassium channels. Q#5465 - CGI_10003896 superfamily 219619 218 294 3.50E-12 62.6103 cl18518 Ion_trans_2 superfamily - - Ion channel; This family includes the two membrane helix type ion channels found in bacteria. Q#5466 - CGI_10003897 superfamily 245201 333 582 9.75E-59 197.844 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. 
It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#5466 - CGI_10003897 superfamily 245201 1 290 0 569.052 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#5467 - CGI_10014894 superfamily 243309 1 134 1.39E-28 109.426 cl03119 FpgNei_N superfamily - - "N-terminal domain of Fpg (formamidopyrimidine-DNA glycosylase, MutM)_Nei (endonuclease VIII) base-excision repair DNA glycosylases; DNA glycosylases maintain genome integrity by recognizing base lesions created by ionizing radiation, alkylating or oxidizing agents, and endogenous reactive oxygen species. These enzymes initiate the base-excision repair process, which is completed with the help of enzymes such as phosphodiesterases, AP endonucleases, DNA polymerases and DNA ligases. DNA glycosylases cleave the N-glycosyl bond between the sugar and the damaged base, creating an AP (apurinic/apyrimidinic) site. The FpgNei DNA glycosylases represent one of the two structural superfamilies of DNA glycosylases that recognize oxidized bases (the other is the HTH-GPD superfamily exemplified by Escherichia coli Nth). Most FpgNei DNA glycosylases use their N-terminal proline residue as the key catalytic nucleophile, and the reaction proceeds via a Schiff base intermediate. 
One exception is mouse Nei-like glycosylase 3 (Neil3) which forms a Schiff base intermediate via its N-terminal valine. In addition to this FpgNei_N domain, FpgNei proteins have a helix-two-turn-helix (H2TH) domain and a zinc (or zincless)-finger motif which also contribute residues to the active site. FpgNei DNA glycosylases have a broad substrate specificity. They are bifunctional, in addition to the glycosylase (recognition) activity, they have a lyase (cleaving) activity on the phosphodiester backbone of the DNA at the AP site. This superfamily includes eukaryotic, bacterial, and viral proteins." Q#5467 - CGI_10014894 superfamily 219199 408 451 1.31E-14 68.1756 cl06070 zf-GRF superfamily - - GRF zinc finger; This presumed zinc binding domain is found in a variety of DNA-binding proteins. It seems likely that this domain is involved in nucleic acid binding. It is named GRF after three conserved residues in the centre of the alignment of the domain. This zinc finger may be related to pfam01396. Q#5467 - CGI_10014894 superfamily 115485 157 200 6.61E-09 53.0966 cl06065 H2TH superfamily NC - Formamidopyrimidine-DNA glycosylase H2TH domain; Formamidopyrimidine-DNA glycosylase (Fpg) is a DNA repair enzyme that excises oxidized purines from damaged DNA. This family is the central domain containing the DNA-binding helix-two turn-helix domain. Q#5467 - CGI_10014894 superfamily 219199 360 406 6.62E-07 46.2192 cl06070 zf-GRF superfamily - - GRF zinc finger; This presumed zinc binding domain is found in a variety of DNA-binding proteins. It seems likely that this domain is involved in nucleic acid binding. It is named GRF after three conserved residues in the centre of the alignment of the domain. This zinc finger may be related to pfam01396. Q#5473 - CGI_10014900 superfamily 242406 230 370 3.95E-22 91.8841 cl01271 DUF1768 superfamily - - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. 
The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. Q#5480 - CGI_10009463 superfamily 241571 283 396 3.14E-22 92.4754 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#5480 - CGI_10009463 superfamily 241571 151 258 5.67E-08 50.8739 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#5483 - CGI_10009466 superfamily 243109 53 101 1.27E-28 102.252 cl02614 SPRY superfamily C - "SPRY domain; SPRY domains, first identified in the SP1A kinase of Dictyostelium and rabbit Ryanodine receptor (hence the name), are homologous to B30.2. SPRY domains have been identified in at least 11 protein families, covering a wide range of functions, including regulation of cytokine signaling (SOCS), RNA metabolism (DDX1 and hnRNP), immunity to retroviruses (TRIM5alpha), intracellular calcium release (ryanodine receptors or RyR) and regulatory and developmental processes (HERC1 and Ash2L). B30.2 also contains residues in the N-terminus that form a distinct PRY domain structure; i.e. B30.2 domain consists of PRY and SPRY subdomains. B30.2 domains comprise the C-terminus of three protein families: BTNs (receptor glycoproteins of immunoglobulin superfamily); several TRIM proteins (composed of RING/B-box/coiled-coil or RBCC core); Stonutoxin (secreted poisonous protein of the stonefish Synanceia horrida). While SPRY domains are evolutionarily ancient, B30.2 domains are a more recent adaptation where the SPRY/PRY combination is a possible component of immune defense. Mutations found in the SPRY-containing proteins have been shown to cause Mediterranean fever and Opitz syndrome." 
Q#5485 - CGI_10009468 superfamily 215827 68 241 2.47E-38 138.37 cl02830 Tyrosinase superfamily - - Common central domain of tyrosinase; This family also contains polyphenol oxidases and some hemocyanins. Binds two copper ions via two sets of three histidines. This family is related to pfam00372. Q#5487 - CGI_10009470 superfamily 245304 160 414 7.44E-179 520.859 cl10459 Peptidases_S8_S53 superfamily - - "Peptidase domain in the S8 and S53 families; Members of the peptidases S8 (subtilisin and kexin) and S53 (sedolisin) family include endopeptidases and exopeptidases. The S8 family has an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure and are not homologous to trypsin. Serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The S53 family contains a catalytic triad Glu/Asp/Ser with an additional acidic residue Asp in the oxyanion hole, similar to that of subtilisin. The serine residue here is the nucleophilic equivalent of the serine residue in the S8 family, while glutamic acid has the same role here as the histidine base. However, the aspartic acid residue that acts as an electrophile is quite different. In S53, it follows glutamic acid, while in S8 it precedes histidine. The stability of these enzymes may be enhanced by calcium; some members have been shown to bind up to 4 ions via binding sites with different affinity. There is a great diversity in the characteristics of their members: some contain disulfide bonds, some are intracellular while others are extracellular, some function at extreme temperatures, and others at high or low pH values." Q#5489 - CGI_10009472 superfamily 241680 13 229 4.36E-49 164.007 cl00200 MIP superfamily - - "Major intrinsic protein (MIP) superfamily. Members of the MIP superfamily function as membrane channels that selectively transport water, small neutral molecules, and ions out of and between cells. 
The channel proteins share a common fold: the N-terminal cytosolic portion followed by six transmembrane helices, which might have arisen through gene duplication. On the basis of sequence similarity and functional characteristics, the superfamily can be subdivided into two major groups: water-selective channels called aquaporins (AQPs) and glycerol uptake facilitators (GlpFs). AQPs are found in all three kingdoms of life, while GlpFs have been characterized only within microorganisms." Q#5490 - CGI_10009473 superfamily 242046 437 599 2.80E-62 204.796 cl00718 TOPRIM superfamily - - "Topoisomerase-primase domain. This is a nucleotidyl transferase/hydrolase domain found in type IA, type IIA and type IIB topoisomerases, bacterial DnaG-type primases, small primase-like proteins from bacteria and archaea, OLD family nucleases from bacteria and archaea, and bacterial DNA repair proteins of the RecR/M family. This domain has two conserved motifs, one of which centers at a conserved glutamate and the other one at two conserved aspartates (DxD). This glutamate and two aspartates cluster together to form a highly acidic surface patch. The conserved glutamate may act as a general base in nucleotide polymerization by primases and in strand joining in topoisomerases and, as a general acid in strand cleavage by topoisomerases and nucleases. The DXD motif may co-ordinate Mg2+, a cofactor required for full catalytic function." Q#5490 - CGI_10009473 superfamily 190973 323 390 1.23E-17 78.3311 cl04502 TP6A_N superfamily - - "Type IIB DNA topoisomerase; Type II DNA topoisomerases are ubiquitous enzymes that catalyze the ATP-dependent transport of one DNA duplex through a second DNA segment via a transient double-strand break. Type II DNA topoisomerases are now subdivided into two sub-families, type IIA and IIB DNA topoisomerases. TP6A_N is present in type IIB topoisomerase and is thought to be involved in DNA binding owing to its sequence similarity to E. 
coli catabolite activator protein (CAP)." Q#5491 - CGI_10009474 superfamily 241749 15 156 1.43E-29 106.701 cl00280 globin_like superfamily - - superfamily containing globins and truncated hemoglobins Q#5492 - CGI_10002616 superfamily 245213 34 68 5.52E-06 43.009 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#5492 - CGI_10002616 superfamily 245213 71 105 5.52E-06 43.009 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#5492 - CGI_10002616 superfamily 245213 182 216 3.79E-05 40.6978 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. 
Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#5492 - CGI_10002616 superfamily 245213 108 142 8.31E-05 39.9274 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#5492 - CGI_10002616 superfamily 245213 293 327 0.000136017 39.157 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#5492 - CGI_10002616 superfamily 245213 145 180 0.000856256 36.8458 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. 
Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#5492 - CGI_10002616 superfamily 245213 219 254 0.0021895 35.6902 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#5492 - CGI_10002616 superfamily 245213 256 291 0.00223631 35.6902 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#5494 - CGI_10002749 superfamily 247792 215 260 4.58E-11 56.6852 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#5494 - CGI_10002749 superfamily 222714 15 43 3.11E-06 43.3152 cl16832 zf-RING_3 superfamily - - zinc-finger; zinc-finger. Q#5495 - CGI_10002750 superfamily 220796 243 401 8.15E-08 52.3582 cl15661 DUF2454 superfamily N - Protein of unknown function (DUF2454); A Schizosaccharomyces pombe member of this family is known to interact with Tel2. Tel2 is a component of the TOR complexes. Q#5496 - CGI_10002751 superfamily 241802 32 339 8.01E-106 315.195 cl00342 Trp-synth-beta_II superfamily - - "Tryptophan synthase beta superfamily (fold type II); this family of pyridoxal phosphate (PLP)-dependent enzymes catalyzes beta-replacement and beta-elimination reactions. This CD corresponds to aminocyclopropane-1-carboxylate deaminase (ACCD), tryptophan synthase beta chain (Trp-synth_B), cystathionine beta-synthase (CBS), O-acetylserine sulfhydrylase (CS), serine dehydratase (Ser-dehyd), threonine dehydratase (Thr-dehyd), diaminopropionate ammonia lyase (DAL), and threonine synthase (Thr-synth). ACCD catalyzes the conversion of 1-aminocyclopropane-1-carboxylate to alpha-ketobutyrate and ammonia. Tryptophan synthase folds into a tetramer, where the beta chain is the catalytic PLP-binding subunit and catalyzes the formation of L-tryptophan from indole and L-serine. 
CBS is a tetrameric hemeprotein that catalyzes condensation of serine and homocysteine to cystathionine. CS is a homodimer that catalyzes the formation of L-cysteine from O-acetyl-L-serine. Ser-dehyd catalyzes the conversion of L- or D-serine to pyruvate and ammonia. Thr-dehyd is active as a homodimer and catalyzes the conversion of L-threonine to 2-oxobutanoate and ammonia. DAL is also a homodimer and catalyzes the alpha, beta-elimination reaction of both L- and D-alpha, beta-diaminopropionate to form pyruvate and ammonia. Thr-synth catalyzes the formation of threonine and inorganic phosphate from O-phosphohomoserine." Q#5498 - CGI_10001934 superfamily 218797 10 236 3.34E-38 136.322 cl05454 Ebp2 superfamily - - Eukaryotic rRNA processing protein EBP2; This family consists of several Eukaryotic rRNA processing protein EBP2 sequences. Ebp2p is required for the maturation of 25S rRNA and 60S subunit assembly. Ebp2p may be one of the target proteins of Rrs1p for executing the signal to regulate ribosome biogenesis. This family also plays a role in chromosome segregation. Q#5500 - CGI_10001936 superfamily 245206 41 346 0 554.999 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. 
Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#5504 - CGI_10002653 superfamily 241832 4 76 1.05E-18 77.2268 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#5504 - CGI_10002653 superfamily 243175 89 164 1.54E-13 63.6223 cl02776 GST_C_family superfamily N - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. 
This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." Q#5505 - CGI_10002654 superfamily 245009 133 261 0.000137742 39.5733 cl09109 NTF2_like superfamily - - "Nuclear transport factor 2 (NTF2-like) superfamily. This family includes members of the NTF2 family, Delta-5-3-ketosteroid isomerases, Scytalone Dehydratases, and the beta subunit of Ring hydroxylating dioxygenases. This family is a classic example of divergent evolution wherein the proteins have many common structural details but diverge greatly in their function. For example, nuclear transport factor 2 (NTF2) mediates the nuclear import of RanGDP and binds to both RanGDP and FxFG repeat-containing nucleoporins while Ketosteroid isomerases catalyze the isomerization of delta-5-3-ketosteroid to delta-4-3-ketosteroid, by intramolecular transfer of the C4-beta proton to the C6-beta position. 
While the function of the beta sub-unit of the Ring hydroxylating dioxygenases is not known, Scytalone Dehydratases catalyzes two reactions in the biosynthetic pathway that produces fungal melanin. Members of the NTF2-like superfamily are widely distributed among bacteria, archaea and eukaryotes." Q#5506 - CGI_10002655 superfamily 243088 19 115 2.13E-10 53.9007 cl02563 PX_domain superfamily - - "The Phox Homology domain, a phosphoinositide binding module; The PX domain is a phosphoinositide (PI) binding module involved in targeting proteins to membranes. Proteins containing PX domains interact with PIs and have been implicated in highly diverse functions such as cell signaling, vesicular trafficking, protein sorting, lipid modification, cell polarity and division, activation of T and B cells, and cell survival. Many members of this superfamily bind phosphatidylinositol-3-phosphate (PI3P) but in some cases, other PIs such as PI4P or PI(3,4)P2, among others, are the preferred substrates. In addition to protein-lipid interaction, the PX domain may also be involved in protein-protein interaction, as in the cases of p40phox, p47phox, and some sorting nexins (SNXs). The PX domain is conserved from yeast to humans and is found in more than 100 proteins. The majority of PX domain-containing proteins are SNXs, which play important roles in endosomal sorting." Q#5507 - CGI_10002656 superfamily 248012 430 571 8.59E-26 103.557 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#5507 - CGI_10002656 superfamily 214507 312 371 0.000231547 39.3356 cl15307 LRRCT superfamily - - Leucine rich repeat C-terminal domain; Leucine rich repeat C-terminal domain. 
Q#5508 - CGI_10003344 superfamily 245847 133 174 0.00473519 34.7856 cl12042 FA58C superfamily C - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#5509 - CGI_10003345 superfamily 246751 31 330 5.61E-120 351.546 cl14883 Lipase superfamily - - "Lipase. Lipases are esterases that can hydrolyze long-chain acyl-triglycerides into di- and monoglycerides, glycerol, and free fatty acids at a water/lipid interface. A typical feature of lipases is "interfacial activation", the process of becoming active at the lipid/water interface, although several examples of lipases have been identified that do not undergo interfacial activation. The active site of a lipase contains a catalytic triad consisting of Ser - His - Asp/Glu, but unlike most serine proteases, the active site is buried inside the structure. A "lid" or "flap" covers the active site, making it inaccessible to solvent and substrates. The lid opens during the process of interfacial activation, allowing the lipid substrate access to the active site." Q#5511 - CGI_10003347 superfamily 246918 93 145 4.89E-10 57.5967 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#5511 - CGI_10003347 superfamily 246918 260 310 5.06E-09 54.9003 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#5511 - CGI_10003347 superfamily 246918 1138 1188 6.37E-09 54.5151 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#5511 - CGI_10003347 superfamily 246918 918 968 7.98E-09 54.1299 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#5511 - CGI_10003347 superfamily 246918 1193 1243 1.05E-08 53.7447 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. 
Q#5511 - CGI_10003347 superfamily 246918 480 530 1.32E-08 53.7447 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#5511 - CGI_10003347 superfamily 246918 536 585 1.79E-08 53.3595 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#5511 - CGI_10003347 superfamily 246918 1249 1298 2.24E-08 52.9743 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#5511 - CGI_10003347 superfamily 246918 150 200 2.52E-08 52.5891 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#5511 - CGI_10003347 superfamily 246918 315 365 3.04E-08 52.5891 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#5511 - CGI_10003347 superfamily 246918 590 640 6.39E-08 51.4335 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#5511 - CGI_10003347 superfamily 246918 1084 1133 7.36E-08 51.4335 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#5511 - CGI_10003347 superfamily 246918 812 860 8.44E-08 51.0483 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#5511 - CGI_10003347 superfamily 246918 1028 1078 1.15E-07 50.6631 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#5511 - CGI_10003347 superfamily 246918 370 420 1.52E-07 50.6631 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#5511 - CGI_10003347 superfamily 246918 755 805 2.44E-07 49.8927 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#5511 - CGI_10003347 superfamily 246918 645 695 4.19E-07 49.1223 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. 
Q#5511 - CGI_10003347 superfamily 246918 425 474 7.45E-07 48.3519 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#5511 - CGI_10003347 superfamily 246918 1358 1409 9.56E-07 47.9667 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#5511 - CGI_10003347 superfamily 246918 205 255 9.79E-07 47.9667 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#5511 - CGI_10003347 superfamily 246918 701 750 1.20E-05 44.8851 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#5511 - CGI_10003347 superfamily 246918 867 912 1.47E-05 44.4999 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#5511 - CGI_10003347 superfamily 246918 1303 1353 0.00278421 37.9515 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#5511 - CGI_10003347 superfamily 246918 973 1000 0.00533752 36.7959 cl15278 TSP_1 superfamily C - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#5512 - CGI_10003348 superfamily 193253 303 546 1.87E-20 94.3333 cl15084 MT superfamily N - "Microtubule-binding stalk of dynein motor; the 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This family is the region between D4 and D5 and is the two predicted alpha-helical coiled coil segments that form the stalk supporting the ATP-sensitive microtubule binding component." Q#5512 - CGI_10003348 superfamily 193256 7 180 5.85E-12 66.8948 cl18189 AAA_8 superfamily N - "P-loop containing dynein motor region D4; The 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. 
The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This particular family is the D4 ATP-binding region of the motor." Q#5512 - CGI_10003348 superfamily 193257 856 1010 8.05E-05 44.9763 cl15086 AAA_9 superfamily N - "ATP-binding dynein motor region D5; The 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This particular family is the D5 ATP-binding region of the motor, but has lost its P-loop." Q#5514 - CGI_10007120 superfamily 247724 9 215 3.43E-22 90.7471 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. 
The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#5515 - CGI_10007121 superfamily 243066 27 125 1.11E-21 89.9841 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#5520 - CGI_10004025 superfamily 243061 506 606 4.32E-32 121.68 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#5520 - CGI_10004025 superfamily 243061 403 504 2.72E-31 119.369 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. 
Q#5520 - CGI_10004025 superfamily 243061 147 247 1.30E-30 117.443 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#5520 - CGI_10004025 superfamily 243061 302 400 1.36E-29 114.361 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#5520 - CGI_10004025 superfamily 243061 715 815 5.41E-25 101.265 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#5520 - CGI_10004025 superfamily 243061 45 142 2.65E-23 96.257 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#5520 - CGI_10004025 superfamily 243061 609 712 2.11E-17 79.3082 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#5520 - CGI_10004025 superfamily 243061 818 911 8.38E-17 77.7674 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. 
Q#5521 - CGI_10004026 superfamily 242565 68 167 5.99E-05 39.4846 cl01535 Repair_PSII superfamily - - "Repair protein; In plants, this domain plays a role in the photosystem II (PSII) repair cycle. It may be involved in the regulation of synthesis/degradation of the D1 protein of the PSII core and in the assembly of PSII monomers into dimers in the grana stacks. Its function in other organisms is unknown." Q#5522 - CGI_10004027 superfamily 243130 254 286 0.00651757 33.5927 cl02655 CUE superfamily - - "CUE domain; CUE domains have been shown to bind ubiquitin. It has been suggested that CUE domains are related to pfam00627 and this has been confirmed by the structure of the domain. CUE domains also occur in two proteins of the IL-1 signal transduction pathway, tollip and TAB2." Q#5525 - CGI_10015726 superfamily 245814 149 207 6.44E-09 52.1063 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#5525 - CGI_10015726 superfamily 245814 232 303 6.72E-10 54.7578 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#5525 - CGI_10015726 superfamily 245814 43 127 7.82E-08 49.3079 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#5526 - CGI_10015727 superfamily 243054 78 274 2.34E-13 70.5523 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#5526 - CGI_10015727 superfamily 243054 182 390 1.00E-07 53.2184 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#5526 - CGI_10015727 superfamily 243054 404 603 0.00208863 40.1216 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) 
at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#5526 - CGI_10015727 superfamily 243054 1364 1506 0.0025735 39.7364 cl02488 SPEC superfamily C - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#5527 - CGI_10015728 superfamily 243054 107 250 4.92E-06 46.2848 cl02488 SPEC superfamily N - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#5528 - CGI_10015729 superfamily 243054 5401 5610 4.02E-16 80.9527 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel 
manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#5528 - CGI_10015729 superfamily 241559 154 257 6.17E-16 78.1215 cl00030 CH superfamily - - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." Q#5528 - CGI_10015729 superfamily 243054 1494 1703 2.28E-13 72.4783 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#5528 - CGI_10015729 superfamily 241559 282 386 2.13E-12 67.7211 cl00030 CH superfamily - - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." 
Q#5528 - CGI_10015729 superfamily 243054 2773 2981 1.57E-11 66.7004 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#5528 - CGI_10015729 superfamily 243054 4079 4261 1.77E-11 66.7004 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#5528 - CGI_10015729 superfamily 243054 4757 4970 1.33E-10 63.6188 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine 
(L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#5528 - CGI_10015729 superfamily 243054 5186 5399 2.28E-10 62.8484 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#5528 - CGI_10015729 superfamily 243054 2883 3092 1.35E-09 60.5372 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#5528 - CGI_10015729 superfamily 243054 3414 3625 2.25E-09 59.7668 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an 
antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#5528 - CGI_10015729 superfamily 243054 2456 2669 2.54E-09 59.7668 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#5528 - CGI_10015729 superfamily 243054 4972 5184 3.33E-09 59.3816 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#5528 - CGI_10015729 superfamily 243054 5827 6034 4.08E-09 58.9964 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some 
sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#5528 - CGI_10015729 superfamily 243054 2671 2865 9.53E-09 57.8408 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#5528 - CGI_10015729 superfamily 243054 1280 1492 1.21E-08 57.4556 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#5528 - CGI_10015729 superfamily 243054 4188 4401 4.28E-08 55.9148 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, 
alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#5528 - CGI_10015729 superfamily 243054 4542 4755 1.95E-07 53.9888 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#5528 - CGI_10015729 superfamily 243054 3061 3198 2.50E-07 53.6036 cl02488 SPEC superfamily N - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#5528 - CGI_10015729 superfamily 243054 5613 5822 4.96E-07 52.8332 cl02488 SPEC superfamily 
- - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#5528 - CGI_10015729 superfamily 243054 3532 3736 7.41E-07 52.0628 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#5528 - CGI_10015729 superfamily 243054 1953 2132 2.66E-06 50.522 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the 
repeat are present here" Q#5528 - CGI_10015729 superfamily 243054 3864 4075 7.18E-06 49.3664 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#5528 - CGI_10015729 superfamily 243054 3322 3403 0.00365939 39.6094 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#5528 - CGI_10015729 superfamily 243054 1816 2025 0.00896203 40.1216 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) 
residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#5530 - CGI_10015731 superfamily 243179 2 48 4.69E-11 53.9772 cl02781 tetraspanin_LEL superfamily NC - "Tetraspanin, extracellular domain or large extracellular loop (LEL). Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. The tetraspanin family contains CD9, CD63, CD37, CD53, CD82, CD151, and CD81, amongst others. Tetraspanins are involved in diverse processes such as cell activation and proliferation, adhesion and motility, differentiation, cancer, and others. Their various functions may relate to their ability to act as molecular facilitators, grouping specific cell-surface proteins and affecting formation and stability of signaling complexes. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web", which may also include integrins." Q#5533 - CGI_10015734 superfamily 207662 169 244 5.57E-23 94.9488 cl02596 NR_DBD_like superfamily - - "DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers; DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. It interacts with a specific DNA site upstream of the target gene and modulates the rate of transcriptional initiation. Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions, from development, reproduction, to homeostasis and metabolism in animals (metazoans). 
The family contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). Most nuclear receptors bind as homodimers or heterodimers to their target sites, which consist of two hexameric half-sites. Specificity is determined by the half-site sequence, the relative orientation of the half-sites and the number of spacer nucleotides between the half-sites. However, a growing number of nuclear receptors have been reported to bind to DNA as monomers." Q#5533 - CGI_10015734 superfamily 245599 816 986 1.56E-14 73.0259 cl11397 NR_LBD superfamily - - "The ligand binding domain of nuclear receptors, a family of ligand-activated transcription regulators; Ligand-binding domain (LBD) of nuclear receptor (NR): Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions in metazoans, from development, reproduction, to homeostasis and metabolism. The superfamily contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. The members of the family include receptors of steroids, thyroid hormone, retinoids, cholesterol by-products, lipids and heme. With few exceptions, NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD)." Q#5533 - CGI_10015734 superfamily 207662 85 162 3.06E-13 66.8292 cl02596 NR_DBD_like superfamily - - "DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers; DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers. 
Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. It interacts with a specific DNA site upstream of the target gene and modulates the rate of transcriptional initiation. Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions, from development, reproduction, to homeostasis and metabolism in animals (metazoans). The family contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). Most nuclear receptors bind as homodimers or heterodimers to their target sites, which consist of two hexameric half-sites. Specificity is determined by the half-site sequence, the relative orientation of the half-sites and the number of spacer nucleotides between the half-sites. However, a growing number of nuclear receptors have been reported to bind to DNA as monomers." Q#5534 - CGI_10015735 superfamily 241563 7 27 0.000740692 37.4588 cl00034 BBOX superfamily N - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#5536 - CGI_10003949 superfamily 243056 38 255 3.98E-45 158.291 cl02495 RabGAP-TBC superfamily - - "Rab-GTPase-TBC domain; Identification of a TBC domain in GYP6_YEAST and GYP7_YEAST, which are GTPase activator proteins of yeast Ypt6 and Ypt7, implies that these domains are GTPase activator proteins of Rab-like small GTPases." 
Q#5537 - CGI_10003950 superfamily 241983 42 349 3.19E-38 141.341 cl00614 ADP_ribosyl_GH superfamily - - "ADP-ribosylglycohydrolase; This family includes enzymes that ADP-ribosylations, for example ADP-ribosylarginine hydrolase EC:3.2.2.19 cleaves ADP-ribose-L-arginine. The family also includes dinitrogenase reductase activating glycohydrolase. Most surprisingly the family also includes jellyfish crystallins, these proteins appear to have lost the presumed active site residues." Q#5538 - CGI_10003951 superfamily 241563 26 55 0.00151068 36.6884 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#5539 - CGI_10003109 superfamily 245847 27 142 5.30E-11 56.7419 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#5541 - CGI_10003501 superfamily 248264 198 250 1.08E-15 71.1141 cl17710 DDE_4 superfamily N - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#5541 - CGI_10003501 superfamily 243161 4 68 2.69E-08 49.3558 cl02739 THAP superfamily C - "THAP domain; The THAP domain is a putative DNA-binding domain (DBD) and probably also binds a zinc ion. 
It features the conserved C2CH architecture (consensus sequence: Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His). Other universal features include the location of the domain at the N-termini of proteins, its size of about 90 residues, a C-terminal AVPTIF box and several other conserved residues. Orthologues of the human THAP domain have been identified in other vertebrates and probably worms and flies, but not in other eukaryotes or any prokaryotes." Q#5542 - CGI_10003706 superfamily 243092 123 206 0.00902331 36.1588 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#5543 - CGI_10003707 superfamily 241609 131 219 1.50E-26 99.3746 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. 
They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#5543 - CGI_10003707 superfamily 241609 51 133 9.78E-22 86.2779 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#5546 - CGI_10003903 superfamily 220695 31 147 0.00211045 37.9435 cl18571 7TM_GPCR_Srx superfamily C - Serpentine type 7TM GPCR chemoreceptor Srx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srx is part of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#5547 - CGI_10003904 superfamily 204985 3 89 6.88E-11 61.4187 cl14987 Chorein_N superfamily C - "N-terminal region of Chorein, a TM vesicle-mediated sorter; Although mutations in the full-length vacuolar protein sorting 13A (VPS13A) protein in vertebrates lead to the disease of chorea-acanthocytosis, the exact function of any of the regions within the protein is not yet known. This region is the proposed leucine zipper at the N-terminus. The full-length protein is a transmembrane protein with a presumed role in vesicle-mediated sorting and intracellular protein transport." 
Q#5550 - CGI_10003907 superfamily 247941 164 302 6.09E-09 53.1085 cl17387 Methyltransf_21 superfamily - - "Methyltransferase FkbM domain; This family has members from bacteria to human, and appears to be a methyltransferase." Q#5551 - CGI_10003908 superfamily 177822 102 297 1.61E-10 59.9337 cl18088 PLN02164 superfamily N - sulfotransferase Q#5552 - CGI_10003909 superfamily 177822 7 215 5.12E-05 42.2145 cl18088 PLN02164 superfamily - - sulfotransferase Q#5553 - CGI_10003910 superfamily 221997 10 85 5.30E-08 46.2762 cl18631 Complex1_LYR_2 superfamily - - "Complex1_LYR-like; This is a family of proteins carrying the LYR motif of family Complex1_LYR, pfam05347, likely to be involved in Fe-S cluster biogenesis in mitochondria." Q#5554 - CGI_10004560 superfamily 242059 655 866 3.33E-15 76.3478 cl00738 MBOAT superfamily N - "MBOAT, membrane-bound O-acyltransferase family; The MBOAT (membrane bound O-acyl transferase) family of membrane proteins contains a variety of acyltransferase enzymes. A conserved histidine has been suggested to be the active site residue." Q#5554 - CGI_10004560 superfamily 245206 8 108 2.51E-05 45.7229 cl09931 NADB_Rossmann superfamily C - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. 
Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#5555 - CGI_10004561 superfamily 243156 17 70 2.71E-15 65.484 cl02717 RNA_POL_M_15KD superfamily - - RNA polymerases M/15 Kd subunit; RNA polymerases M/15 Kd subunit. Q#5555 - CGI_10004561 superfamily 207668 86 127 4.64E-10 51.1759 cl02609 TFIIS_C superfamily - - Transcription factor S-II (TFIIS); Transcription factor S-II (TFIIS). Q#5558 - CGI_10004564 superfamily 222003 194 262 7.71E-07 45.7606 cl17871 Hydrolase_like superfamily - - HAD-hydrolase-like; HAD-hydrolase-like. Q#5559 - CGI_10004565 superfamily 147298 1769 1964 1.70E-103 333.333 cl04904 Pecanex_C superfamily - - Pecanex protein (C-terminus); This family consists of C terminal region of the pecanex protein homologues. The pecanex protein is a maternal-effect neurogenic gene found in Drosophila. Q#5563 - CGI_10004569 superfamily 217473 143 319 1.33E-28 115.925 cl03978 Mab-21 superfamily N - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#5564 - CGI_10002454 superfamily 245230 1 359 0 740.645 cl10017 Tubulin_FtsZ superfamily - - "Tubulin/FtsZ: Family includes tubulin alpha-, beta-, gamma-, delta-, and epsilon-tubulins as well as FtsZ, all of which are involved in polymer formation. Tubulin is the major component of microtubules, but also exists as a heterodimer and as a curved oligomer. Microtubules exist in all eukaryotic cells and are responsible for many functions, including cellular transport, cell motility, and mitosis. FtsZ forms a ring-shaped septum at the site of bacterial cell division, which is required for constriction of cell membrane and cell envelope to yield two daughter cells. 
FtsZ can polymerize into tubes, sheets, and rings in vitro and is ubiquitous in eubacteria, archaea, and chloroplasts." Q#5565 - CGI_10002455 superfamily 245230 52 408 0 784.558 cl10017 Tubulin_FtsZ superfamily - - "Tubulin/FtsZ: Family includes tubulin alpha-, beta-, gamma-, delta-, and epsilon-tubulins as well as FtsZ, all of which are involved in polymer formation. Tubulin is the major component of microtubules, but also exists as a heterodimer and as a curved oligomer. Microtubules exist in all eukaryotic cells and are responsible for many functions, including cellular transport, cell motility, and mitosis. FtsZ forms a ring-shaped septum at the site of bacterial cell division, which is required for constriction of cell membrane and cell envelope to yield two daughter cells. FtsZ can polymerize into tubes, sheets, and rings in vitro and is ubiquitous in eubacteria, archaea, and chloroplasts." Q#5566 - CGI_10002456 superfamily 245230 142 275 1.02E-110 330.022 cl10017 Tubulin_FtsZ superfamily N - "Tubulin/FtsZ: Family includes tubulin alpha-, beta-, gamma-, delta-, and epsilon-tubulins as well as FtsZ, all of which are involved in polymer formation. Tubulin is the major component of microtubules, but also exists as a heterodimer and as a curved oligomer. Microtubules exist in all eukaryotic cells and are responsible for many functions, including cellular transport, cell motility, and mitosis. FtsZ forms a ring-shaped septum at the site of bacterial cell division, which is required for constriction of cell membrane and cell envelope to yield two daughter cells. FtsZ can polymerize into tubes, sheets, and rings in vitro and is ubiquitous in eubacteria, archaea, and chloroplasts." Q#5566 - CGI_10002456 superfamily 245230 2 141 4.76E-105 315.77 cl10017 Tubulin_FtsZ superfamily C - "Tubulin/FtsZ: Family includes tubulin alpha-, beta-, gamma-, delta-, and epsilon-tubulins as well as FtsZ, all of which are involved in polymer formation. 
Tubulin is the major component of microtubules, but also exists as a heterodimer and as a curved oligomer. Microtubules exist in all eukaryotic cells and are responsible for many functions, including cellular transport, cell motility, and mitosis. FtsZ forms a ring-shaped septum at the site of bacterial cell division, which is required for constriction of cell membrane and cell envelope to yield two daughter cells. FtsZ can polymerize into tubes, sheets, and rings in vitro and is ubiquitous in eubacteria, archaea, and chloroplasts." Q#5567 - CGI_10002457 superfamily 247755 405 644 5.54E-157 454.69 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#5567 - CGI_10002457 superfamily 216049 87 358 3.50E-49 173.626 cl18356 ABC_membrane superfamily - - ABC transporter transmembrane region; This family represents a unit of six transmembrane helices. Many members of the ABC transporter family (pfam00005) have two such regions. 
Q#5571 - CGI_10013554 superfamily 247905 85 202 2.81E-17 77.278 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#5572 - CGI_10013555 superfamily 247805 34 143 5.46E-07 44.6356 cl17251 DEXDc superfamily C - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#5575 - CGI_10013558 superfamily 245213 147 182 5.26E-07 48.0166 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#5575 - CGI_10013558 superfamily 245213 185 221 1.13E-06 46.861 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#5575 - CGI_10013558 superfamily 245213 109 145 1.28E-06 46.861 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#5575 - CGI_10013558 superfamily 245213 603 639 2.21E-06 46.0906 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#5575 - CGI_10013558 superfamily 245213 375 410 3.39E-06 45.7054 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#5575 - CGI_10013558 superfamily 245213 337 372 3.96E-06 45.3202 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#5575 - CGI_10013558 superfamily 245213 451 486 3.99E-06 45.3202 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#5575 - CGI_10013558 superfamily 245213 413 448 4.88E-06 44.935 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#5575 - CGI_10013558 superfamily 245213 299 335 5.14E-06 44.935 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#5575 - CGI_10013558 superfamily 245213 223 258 5.75E-06 44.935 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#5575 - CGI_10013558 superfamily 245213 489 525 9.30E-06 44.1646 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#5575 - CGI_10013558 superfamily 245213 565 601 1.21E-05 43.7794 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#5575 - CGI_10013558 superfamily 245213 261 296 2.33E-05 43.009 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#5575 - CGI_10013558 superfamily 245213 35 69 3.05E-05 42.6238 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#5575 - CGI_10013558 superfamily 245213 72 106 9.98E-05 41.083 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#5575 - CGI_10013558 superfamily 245213 527 562 0.000100968 41.083 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#5575 - CGI_10013558 superfamily 221370 770 930 1.59E-06 48.9069 cl13441 DUF3497 superfamily - - "Domain of unknown function (DUF3497); This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. 
This domain is typically between 213 and 257 amino acids in length. This domain is found associated with pfam02793, pfam00002, pfam01825. This domain has a single completely conserved residue W that may be functionally important." Q#5575 - CGI_10013558 superfamily 215647 956 1015 0.00038564 41.8253 cl18338 7tm_2 superfamily NC - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GPCRs). They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#5576 - CGI_10013559 superfamily 243029 35 87 1.01E-05 43.1081 cl02422 HRM superfamily - - Hormone receptor domain; This extracellular domain contains four conserved cysteines that probably form disulphide bridges. The domain is found in a variety of hormone receptors. It may be a ligand binding domain. Q#5576 - CGI_10013559 superfamily 221370 107 267 8.50E-05 42.3585 cl13441 DUF3497 superfamily - - "Domain of unknown function (DUF3497); This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 213 and 257 amino acids in length. This domain is found associated with pfam02793, pfam00002, pfam01825. 
This domain has a single completely conserved residue W that may be functionally important." Q#5577 - CGI_10013560 superfamily 243124 93 188 1.59E-25 99.8088 cl02648 NIDO superfamily C - Nidogen-like; This is a nidogen-like (NIDO) domain and is an extracellular domain found in nidogen and hypothetical proteins of unknown function. Q#5578 - CGI_10013561 superfamily 247903 215 416 1.20E-43 152.451 cl17349 Peptidase_M54 superfamily - - "Peptidase family M54, also called archaemetzincins or archaelysins; Peptidase M54 (archaemetzincin or archaelysin) is a zinc-dependent aminopeptidase that contains the consensus zinc-binding sequence HEXXHXXGXXH/D and a conserved Met residue at the active site, and is thus classified as a metzincin. Archaemetzincins, first identified in archaea, are also found in bacteria and eukaryotes, including two human members, archaemetzincin-1 and -2 (AMZ1 and AMZ2). AMZ1 is mainly found in the liver and heart while AMZ2 is primarily expressed in testis and heart; both have been reported to degrade synthetic substrates and peptides. The Peptidase M54 family contains an extended metzincin consensus sequence of HEXXHXXGX3CX4CXMX17CXXC such that a second zinc ion is bound to four cysteines, thus resembling a zinc finger. Phylogenetic analysis of this family reveals a complex evolutionary process involving a series of lateral gene transfer, gene loss and genetic duplication events." Q#5579 - CGI_10013562 superfamily 202894 126 193 2.94E-20 81.113 cl04406 Mpv17_PMP22 superfamily - - "Mpv17 / PMP22 family; The 22-kDa peroxisomal membrane protein (PMP22) is a major component of peroxisomal membranes. PMP22 seems to be involved in pore forming activity and may contribute to the unspecific permeability of the organelle membrane. PMP22 is synthesised on free cytosolic ribosomes and then directed to the peroxisome membrane by specific targeting information. Mpv17 is a closely related peroxisomal protein. 
In mouse, the Mpv17 protein is involved in the development of early-onset glomerulosclerosis. More recently a homolog of Mpv17 in S. cerevisiae has been found to be an integral membrane protein of the inner mitochondrial membrane where it has been proposed to have a role in ethanol metabolism and tolerance during heat-shock. Defects in MPV17 are associated with mitochondrial DNA depletion syndrome (MDDS) and Navajo neurohepatopathy (NNH). MDDS is a clinically heterogeneous group of disorders characterized by a reduction in mitochondrial DNA (mtDNA) copy number. Primary mtDNA depletion is inherited as an autosomal recessive trait and may affect single organs, typically muscle or liver, or multiple tissues. Individuals with the hepatocerebral form of mitochondrial DNA depletion syndrome have early progressive liver failure and neurologic abnormalities, hypoglycemia, and increased lactate in body fluids. NNH is an autosomal recessive disease that is prevalent among Navajo children in the South Western states of America. The major clinical features are hepatopathy, peripheral neuropathy, corneal anesthesia and scarring, acral mutilation, cerebral leukoencephalopathy, failure to thrive, and recurrent metabolic acidosis with intercurrent infections. Infantile, childhood, and classic forms of NNH have been described. Mitochondrial DNA depletion was detected in the livers of patients, suggesting a primary defect in mtDNA maintenance." Q#5582 - CGI_10004013 superfamily 245010 10 115 3.05E-18 74.9618 cl09111 Prefoldin superfamily - - "Prefoldin is a hexameric molecular chaperone complex, found in both eukaryotes and archaea, that binds and stabilizes newly synthesized polypeptides allowing them to fold correctly. The complex contains two alpha and four beta subunits, the two subunits being evolutionarily related. 
In archaea, there is usually only one gene for each subunit while in eukaryotes there are two or more paralogous genes encoding each subunit adding heterogeneity to the structure of the hexamer. The structure of the complex consists of a double beta barrel assembly with six protruding coiled-coils." Q#5583 - CGI_10004014 superfamily 241599 122 179 8.68E-19 81.906 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#5584 - CGI_10004015 superfamily 241874 3 513 0 751.815 cl00456 SLC5-6-like_sbd superfamily - - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." 
Q#5585 - CGI_10004016 superfamily 241874 1 178 7.76E-81 252.212 cl00456 SLC5-6-like_sbd superfamily N - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#5586 - CGI_10004017 superfamily 241874 5 184 7.32E-103 309.606 cl00456 SLC5-6-like_sbd superfamily NC - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. 
NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#5587 - CGI_10006103 superfamily 241739 95 309 2.56E-79 246.688 cl00268 class_II_aaRS-like_core superfamily - - "Class II tRNA amino-acyl synthetase-like catalytic core domain. Class II amino acyl-tRNA synthetases (aaRS) share a common fold and generally attach an amino acid to the 3' OH of ribose of the appropriate tRNA. PheRS is an exception in that it attaches the amino acid at the 2'-OH group, like class I aaRSs. These enzymes are usually homodimers. This domain is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. The substrate specificity of this reaction is further determined by additional domains. Interestingly, this domain is also found in asparagine synthase A (AsnA), in the accessory subunit of mitochondrial polymerase gamma and in the bacterial ATP phosphoribosyltransferase regulatory subunit HisZ." Q#5587 - CGI_10006103 superfamily 244928 329 420 2.75E-28 106.741 cl08386 FDX-ACB superfamily - - Ferredoxin-fold anticodon binding domain; This is the anticodon binding domain found in some phenylalanyl tRNA synthetases. The domain has a ferredoxin fold. Q#5588 - CGI_10006104 superfamily 243072 412 529 1.47E-30 117.87 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#5588 - CGI_10006104 superfamily 243072 470 596 2.60E-22 94.3726 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#5588 - CGI_10006104 superfamily 243072 580 683 3.53E-11 61.6306 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#5588 - CGI_10006104 superfamily 241752 784 895 8.25E-28 110.102 cl00283 ADP_ribosyl superfamily - - "ADP_ribosylating enzymes catalyze the transfer of ADP_ribose from NAD+ to substrates. Bacterial toxins are cytoplasmic and catalyze the transfer of a single ADP_ribose unit to eukaryotic elongation factor 2, halting protein synthesis and killing the cell. Poly(ADP-ribose) polymerases (PARPs 1-3, VPARP, tankyrase) catalyze the addition of up to 100 ADP_ribose units from NAD+. PARPs 1 and 2 are localized in the nucleus, bind DNA, and are activated by DNA damage. VPARP is part of the vault ribonucleoprotein complex. 
Tankyrases regulate telomere length in part through poly(ADP_ribosylation) of telomere repeat binding factor 1 (TRF1). Poly(ADP-ribose) polymerase catalyses the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active." Q#5588 - CGI_10006104 superfamily 241760 71 114 5.46E-10 56.6979 cl00295 ZZ superfamily - - "Zinc finger, ZZ type. Zinc finger present in dystrophin, CBP/p300 and many other proteins. The ZZ motif coordinates one or two zinc ions and most likely participates in ligand binding or molecular scaffolding. Many proteins containing ZZ motifs have other zinc-binding motifs as well, and the majority serve as scaffolds in pathways involving acetyltransferase, protein kinase, or ubiquitin-related activity. ZZ proteins can be grouped into the following functional classes: chromatin modifying, cytoskeletal scaffolding, ubiquitin binding or conjugating, and membrane receptor or ion-channel modifying proteins." Q#5588 - CGI_10006104 superfamily 115363 142 214 7.54E-10 56.6114 cl05972 MIB_HERC2 superfamily - - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). Q#5588 - CGI_10006104 superfamily 115363 6 62 8.32E-05 41.5886 cl05972 MIB_HERC2 superfamily - - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). Q#5591 - CGI_10006107 superfamily 241578 31 192 1.90E-46 166.698 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). 
Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#5591 - CGI_10006107 superfamily 241578 232 389 4.36E-46 165.542 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." 
Q#5593 - CGI_10006109 superfamily 243039 489 591 2.13E-09 55.463 cl02446 MATH superfamily - - "MATH (meprin and TRAF-C homology) domain; an independent folding unit with an eight-stranded beta-sandwich structure found in meprins, TRAFs and other proteins. Meprins comprise a class of extracellular metalloproteases which are anchored to the membrane and are capable of cleaving growth factors, extracellular matrix proteins, and biologically active peptides. TRAF molecules serve as adapter proteins that link cell surface receptors of the Tumor Necrosis Factor and Interleukin-1/Toll-like families to downstream kinase cascades, which results in the activation of transcription factors and the regulation of cell survival, proliferation and stress responses in the immune and inflammatory systems. Other members include the ubiquitin ligases, TRIM37 and SPOP, and the ubiquitin-specific proteases, HAUSP and Ubp21p. A large number of uncharacterized members mostly from lineage-specific expansions in C. elegans and rice contain MATH and BTB domains, similar to SPOP. The MATH domain has been shown to bind peptide/protein substrates in TRAFs and HAUSP. It is possible that the MATH domain in other members of this superfamily also interacts with various protein substrates. The TRAF domain may also be involved in the trimerization of TRAFs. Based on homology, it is postulated that the MATH domain in meprins may be involved in its tetramer assembly and that the MATH domain, in general, may take part in diverse modular arrangements defined by adjacent multimerization domains." 
Q#5593 - CGI_10006109 superfamily 247792 92 139 1.06E-06 46.2848 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#5593 - CGI_10006109 superfamily 241563 241 270 0.00668268 35.1476 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#5595 - CGI_10016233 superfamily 245814 302 375 3.50E-05 41.7059 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#5596 - CGI_10016234 superfamily 242104 208 277 0.00685346 37.054 cl00803 Cas7_I superfamily NC - CRISPR/Cas system-associated RAMP superfamily protein Cas7; CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas7 is a RAMP superfamily protein; Subunit of the Cascade complex; also known as MJ0381 family Q#5597 - CGI_10016235 superfamily 220692 49 314 1.34E-22 94.5785 cl18570 7TM_GPCR_Srw superfamily - - Serpentine type 7TM GPCR chemoreceptor Srw; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srw is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. The genes encoding Srw do not appear to be under as strong an adaptive evolutionary pressure as those of Srz. Q#5599 - CGI_10016237 superfamily 147218 22 245 6.50E-69 214.621 cl04854 SIP1 superfamily - - Survival motor neuron (SMN) interacting protein 1 (SIP1); Survival motor neuron (SMN) interacting protein 1 (SIP1) interacts with SMN protein and plays a crucial role in the biogenesis of spliceosomes. There is evidence that the protein is linked to spinal muscular atrophy (SMA) and amyotrophic lateral sclerosis(ALS) in humans. Q#5600 - CGI_10016238 superfamily 215859 373 575 7.40E-30 116.547 cl18347 Peptidase_S9 superfamily - - Prolyl oligopeptidase family; Prolyl oligopeptidase family. Q#5604 - CGI_10016242 superfamily 241600 1091 1309 3.18E-65 221.345 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. 
The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#5605 - CGI_10016243 superfamily 215733 113 339 4.97E-62 209.729 cl02811 E1-E2_ATPase superfamily - - E1-E2 ATPase; E1-E2 ATPase. Q#5605 - CGI_10016243 superfamily 216063 725 897 1.03E-54 187.826 cl02929 Cation_ATPase_C superfamily - - "Cation transporting ATPase, C-terminus; Members of this families are involved in Na+/K+, H+/K+, Ca++ and Mg++ transport. This family represents 5 transmembrane helices." Q#5605 - CGI_10016243 superfamily 222006 408 492 2.56E-19 84.5814 cl16182 Hydrolase_like2 superfamily - - Putative hydrolase of sodium-potassium ATPase alpha subunit; This is a putative hydrolase of the sodium-potassium ATPase alpha subunit. Q#5605 - CGI_10016243 superfamily 243244 23 92 7.27E-14 68.3782 cl02930 Cation_ATPase_N superfamily - - "Cation transporter/ATPase, N-terminus; Members of this families are involved in Na+/K+, H+/K+, Ca++ and Mg++ transport." Q#5605 - CGI_10016243 superfamily 226572 615 687 0.00015277 41.7756 cl18761 COG4087 superfamily N - Soluble P-type ATPase [General function prediction only] Q#5606 - CGI_10016244 superfamily 247740 10 355 1.31E-124 364.917 cl17186 TIM_phosphate_binding superfamily - - "TIM barrel proteins share a structurally conserved phosphate binding motif and in general share an eight beta/alpha closed barrel structure. Specific for this family is the conserved phosphate binding site at the edges of strands 7 and 8. 
The phosphate comes either from the substrate, as in the case of inosine monophosphate dehydrogenase (IMPDH), or from ribulose-5-phosphate 3-epimerase (RPE) or from cofactors, like FMN." Q#5607 - CGI_10016245 superfamily 187403 109 490 2.19E-173 501.487 cl14649 BRO1_Alix_like superfamily - - "Protein-interacting Bro1-like domain of mammalian Alix and related domains; This superfamily includes the Bro1-like domains of mammalian Alix (apoptosis-linked gene-2 interacting protein X), His-Domain type N23 protein tyrosine phosphatase (HD-PTP, also known as PTPN23), RhoA-binding proteins Rhophilin-1 and Rhophilin-2, Brox, Bro1 and Rim20 (also known as PalA) from Saccharomyces cerevisiae, and related domains. Alix, HD-PTP, Brox, Bro1 and Rim20 interact with the ESCRT (Endosomal Sorting Complexes Required for Transport) system. Alix, also known as apoptosis-linked gene-2 interacting protein 1 (AIP1), participates in membrane remodeling processes during the budding of enveloped viruses, vesicle budding inside late endosomal multivesicular bodies (MVBs), and the abscission reactions of mammalian cell division. It also functions in apoptosis. HD-PTP functions in cell migration and endosomal trafficking, Bro1 in endosomal trafficking, and Rim20 in the response to the external pH via the Rim101 pathway. Bro1-like domains are boomerang-shaped, and part of the domain is a tetratricopeptide repeat (TPR)-like structure. Bro1-like domains bind components of the ESCRT-III complex: CHMP4 (in the case of Alix, HD-PTP, and Brox) and Snf7 (in the case of yeast Bro1, and Rim20). The single domain protein human Brox, and the isolated Bro1-like domains of Alix, HD-PTP and Rhophilin can bind human immunodeficiency virus type 1 (HIV-1) nucleocapsid. 
Alix, HD-PTP, Bro1, and Rim20 also have a V-shaped (V) domain, which in the case of Alix, has been shown to be a dimerization domain and to contain a binding site for the retroviral late assembly (L) domain YPXnL motif, which is partially conserved in this superfamily. Alix, HD-PTP and Bro1 also have a proline-rich region (PRR); the Alix PRR binds multiple partners. Rhophilin-1, and -2, in addition to this Bro1-like domain, have an N-terminal Rho-binding domain and a C-terminal PDZ (PSD-95, Disc-large, ZO-1) domain. HD-PTP is encoded by the PTPN23 gene, a tumor suppressor gene candidate frequently absent in human kidney, breast, lung, and cervical tumors. This protein has a C-terminal, catalytically inactive tyrosine phosphatase domain." Q#5607 - CGI_10016245 superfamily 241602 21 103 1.96E-36 131.88 cl00087 HR1 superfamily - - "Protein kinase C-related kinase homology region 1 (HR1) domain that binds Rho family small GTPases; The HR1 domain, also called the ACC (anti-parallel coiled-coil) finger domain or Rho-binding domain binds small GTPases from the Rho family. It is found in Rho effector proteins including PKC-related kinases such as vertebrate PRK1 (or PKN) and yeast PKC1 protein kinases C, as well as in rhophilins and Rho-associated kinase (ROCK). Rho family members function as molecular switches, cycling between inactive and active forms, controlling a variety of cellular processes. HR1 domains may occur in repeat arrangements (PKN contains three HR1 domains), separated by a short linker region." Q#5607 - CGI_10016245 superfamily 241622 508 585 7.47E-21 88.0074 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. 
Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archebacteria, and eurkayotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#5608 - CGI_10016246 superfamily 247038 7 85 3.90E-09 55.1533 cl15674 IPT superfamily - - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." Q#5608 - CGI_10016246 superfamily 246936 103 194 0.002982 37.0848 cl15354 CBS_pair superfamily C - "The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. 
They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase)." Q#5614 - CGI_10016252 superfamily 222150 1495 1519 1.67E-06 47.0013 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#5614 - CGI_10016252 superfamily 222150 51 76 0.000231244 40.8381 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#5614 - CGI_10016252 superfamily 222150 679 703 0.000402703 40.0677 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#5614 - CGI_10016252 superfamily 222150 1160 1185 0.00133379 38.5269 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#5616 - CGI_10016254 superfamily 216970 1 96 6.11E-42 137.182 cl03529 CBF_beta superfamily C - "Core binding factor beta subunit; Core binding factor (CBF) is a heterodimeric transcription factor essential for genetic regulation of hematopoiesis and osteogenesis. 
The beta subunit enhances DNA-binding ability of the alpha subunit in vitro, and has been shown to have a structure related to the OB fold." Q#5618 - CGI_10012241 superfamily 247856 94 144 0.000707837 37.9125 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#5620 - CGI_10012243 superfamily 216212 465 966 0 655.131 cl03037 HCO3_cotransp superfamily - - HCO3- transporter family; This family contains Band 3 anion exchange proteins that exchange CL-/HCO3-. This family also includes cotransporters of Na+/HCO3-. Q#5621 - CGI_10012244 superfamily 216212 478 979 0 672.465 cl03037 HCO3_cotransp superfamily - - HCO3- transporter family; This family contains Band 3 anion exchange proteins that exchange CL-/HCO3-. This family also includes cotransporters of Na+/HCO3-. Q#5625 - CGI_10012248 superfamily 245213 575 611 6.13E-10 56.1058 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#5625 - CGI_10012248 superfamily 245213 491 533 3.43E-09 53.7946 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#5625 - CGI_10012248 superfamily 245213 452 489 9.23E-09 52.639 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#5625 - CGI_10012248 superfamily 245213 535 572 1.01E-05 43.7794 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#5625 - CGI_10012248 superfamily 110440 363 390 0.000233122 39.6985 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. 
The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase); proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#5627 - CGI_10012250 superfamily 241600 27 50 0.00123287 35.6791 cl00085 FReD superfamily C - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#5629 - CGI_10020305 superfamily 194545 49 137 5.10E-64 192.33 cl03131 Dynein_light superfamily - - Dynein light chain type 1; Dynein light chain type 1. Q#5629 - CGI_10020305 superfamily 247724 1 58 2.34E-05 40.6078 cl17170 Ras_like_GTPase superfamily NC - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#5630 - CGI_10020306 superfamily 247727 114 192 2.03E-09 53.2027 cl17173 AdoMet_MTases superfamily N - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#5631 - CGI_10020307 superfamily 147539 22 172 8.42E-69 208.791 cl05128 TRAP-delta superfamily - - "Translocon-associated protein, delta subunit precursor (TRAP-delta); This family consists of several eukaryotic translocon-associated protein, delta subunit precursors (TRAP-delta or SSR-delta). The exact function of this protein is unknown." 
Q#5632 - CGI_10020308 superfamily 215648 1242 1431 1.15E-19 90.3475 cl02802 7tm_3 superfamily - - "7 transmembrane sweet-taste receptor of 3 GCPR; This is a domain of seven transmembrane regions that forms the C-terminus of some subclass 3 G-coupled-protein receptors. It is often associated with a downstream cysteine-rich linker domain, NCD3G pfam07562, which is the human sweet-taste receptor, and the N-terminal domain, ANF_receptor pfam01094. The seven TM regions assemble in such a way as to produce a docking pocket into which such molecules as cyclamate and lactisole have been found to bind and consequently confer the taste of sweetness." Q#5632 - CGI_10020308 superfamily 245225 719 1082 4.41E-13 71.8903 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily - - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein couples receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine- binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. 
The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#5632 - CGI_10020308 superfamily 245225 395 668 6.89E-09 58.4083 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily N - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein couples receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine- binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. 
The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#5633 - CGI_10020309 superfamily 222150 281 306 7.27E-05 39.6825 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#5634 - CGI_10020310 superfamily 243082 1176 1507 8.09E-122 389.087 cl02553 Peptidase_C19 superfamily - - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." Q#5634 - CGI_10020310 superfamily 241645 2062 2165 4.61E-34 129.431 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. 
Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#5635 - CGI_10020311 superfamily 245839 357 458 8.55E-16 73.767 cl12020 Anticodon_Ia_like superfamily C - "Anticodon-binding domain of class Ia aminoacyl tRNA synthetases and similar domains; This domain is found in a variety of class Ia aminoacyl tRNA synthetases, C-terminal to the catalytic core domain. It recognizes and specifically binds to the anticodon of the tRNA. Aminoacyl tRNA synthetases catalyze the transfer of cognate amino acids to the 3'-end of their tRNAs by specifically recognizing cognate from non-cognate amino acids. Members include valyl-, leucyl-, isoleucyl-, cysteinyl-, arginyl-, and methionyl-tRNA synthetases. This superfamily also includes a domain from MshC, an enzyme in the mycothiol biosynthetic pathway." Q#5636 - CGI_10020312 superfamily 245201 3 289 0 541.108 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." 
Q#5637 - CGI_10020313 superfamily 243146 835 880 5.47E-07 48.0414 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#5637 - CGI_10020313 superfamily 198867 645 720 0.000142807 41.5581 cl06652 BACK superfamily C - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation)." Q#5637 - CGI_10020313 superfamily 243146 886 929 0.00584378 36.2722 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#5638 - CGI_10020314 superfamily 248097 86 209 2.75E-19 80.387 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#5639 - CGI_10020315 superfamily 248012 1049 1157 5.50E-29 113.442 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#5639 - CGI_10020315 superfamily 141623 3 120 5.18E-25 102.774 cl02685 Neuralized superfamily - - Neuralized; This family contains a conserved region approximately 60 residues long within eukaryotic neuralized and neuralized-like proteins. Neuralized belongs to a group of ubiquitin ligases and is required in a subset of Notch pathway-mediated cell fate decisions during development of the Drosophila nervous system. Some family members contain multiple copies of this region. Q#5639 - CGI_10020315 superfamily 247724 405 524 1.15E-18 85.4655 cl17170 Ras_like_GTPase superfamily N - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. 
The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#5639 - CGI_10020315 superfamily 246680 964 1036 1.16E-12 65.462 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." 
Q#5639 - CGI_10020315 superfamily 247724 186 225 7.11E-06 46.1752 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#5640 - CGI_10020316 superfamily 247875 70 131 5.39E-08 48.5786 cl17321 2OG-FeII_Oxy_2 superfamily N - 2OG-Fe(II) oxygenase superfamily; 2OG-Fe(II) oxygenase superfamily. 
Q#5642 - CGI_10020318 superfamily 243092 364 644 3.48E-35 137.081 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#5642 - CGI_10020318 superfamily 243092 22 165 0.000291873 42.7072 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#5643 - CGI_10020319 superfamily 248024 9 114 5.81E-08 50.7446 cl17470 SBF superfamily NC - "Sodium Bile acid symporter family; This family consists of Na+/bile acid co-transporters. These transmembrane proteins function in the liver in the uptake of bile acids from portal blood plasma a process mediated by the co-transport of Na+. Also in the family is ARC3 from S. cerevisiae, this is a putative transmembrane protein involved in resistance to arsenic compounds." Q#5645 - CGI_10020321 superfamily 247856 100 171 2.63E-12 58.7133 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. 
Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#5645 - CGI_10020321 superfamily 247856 65 126 7.55E-11 54.4761 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#5645 - CGI_10020321 superfamily 244899 37 90 0.000255515 37.0842 cl08302 S-100 superfamily N - "S-100: S-100 domain, which represents the largest family within the superfamily of proteins carrying the Ca-binding EF-hand motif. Note that this S-100 hierarchy contains only S-100 EF-hand domains, other EF-hands have been modeled separately. S100 proteins are expressed exclusively in vertebrates, and are implicated in intracellular and extracellular regulatory activities. Intracellularly, S100 proteins act as Ca-signaling or Ca-buffering proteins. The most unusual characteristic of certain S100 proteins is their occurrence in extracellular space, where they act in a cytokine-like manner through RAGE, the receptor for advanced glycation products. Structural data suggest that many S100 members exist within cells as homo- or heterodimers and even oligomers; oligomerization contributes to their functional diversification. Upon binding calcium, most S100 proteins change conformation to a more open structure exposing a hydrophobic cleft. This hydrophobic surface represents the interaction site of S100 proteins with their target proteins. There is experimental evidence showing that many S100 proteins have multiple binding partners with diverse mode of interaction with different targets. 
In addition to S100 proteins (such as S100A1,-3,-4,-6,-7,-10,-11,and -13), this group includes the ''fused'' gene family, a group of calcium binding S100-related proteins. The ''fused'' gene family includes multifunctional epidermal differentiation proteins - profilaggrin, trichohyalin, repetin, hornerin, and cornulin; functionally these proteins are associated with keratin intermediate filaments and partially crosslinked to the cell envelope. These ''fused'' gene proteins contain N-terminal sequence with two Ca-binding EF-hands motif, which may be associated with calcium signaling in epidermal cells and autoprocessing in a calcium-dependent manner. In contrast to S100 proteins, "fused" gene family proteins contain an extraordinary high number of almost perfect peptide repeats with regular array of polar and charged residues similar to many known cell envelope proteins." Q#5646 - CGI_10020322 superfamily 220414 3 242 7.58E-45 153.055 cl10780 DUF2348 superfamily - - Uncharacterized conserved protein (DUF2348); Members of this family of putative uncharacterized proteins have no known function. Q#5647 - CGI_10020323 superfamily 217737 3 145 6.32E-24 96.5217 cl04266 Nuf2 superfamily - - "Nuf2 family; Members of this family are components of the mitotic spindle. It has been shown that Nuf2 from yeast is part of a complex called the Ndc80p complex. This complex is thought to bind to the microtubules of the spindle. An arabidopsis protein has been included in this family that has previously not been identified as a member of this family, Arabidopsis thaliana T7P1.14. The match is not strong, but in common with other members of this family contains coiled-coil to the C terminus of this region." 
Q#5648 - CGI_10020324 superfamily 246713 65 235 3.04E-46 158.559 cl14786 ENDO3c superfamily - - "endonuclease III; includes endonuclease III (DNA-(apurinic or apyrimidinic site) lyase), alkylbase DNA glycosidases (Alka-family) and other DNA glycosidases" Q#5648 - CGI_10020324 superfamily 241868 325 430 1.09E-16 75.8363 cl00447 Nudix_Hydrolase superfamily N - "Nudix hydrolase is a superfamily of enzymes found in all three kingdoms of life, and it catalyzes the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+ for their activity. Members of this family are recognized by a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which forms a structural motif that functions as a metal binding and catalytic site. Substrates of nudix hydrolase include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance and "house-cleaning" enzymes. Substrate specificity is used to define child families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. 
This superfamily consists of at least nine families: IPP (isopentenyl diphosphate) isomerase, ADP ribose pyrophosphatase, mutT pyrophosphohydrolase, coenzyme-A pyrophosphatase, MTH1-7,8-dihydro-8-oxoguanine-triphosphatase, diadenosine tetraphosphate hydrolase, NADH pyrophosphatase, GDP-mannose hydrolase and the c-terminal portion of the mutY adenine glycosylase." Q#5648 - CGI_10020324 superfamily 210074 239 255 0.00371552 35.0969 cl15304 EndIII_4Fe-2S superfamily - - "Iron-sulfur binding domain of endonuclease III; Escherichia coli endonuclease III (EC 4.2.99.18) is a DNA repair enzyme that acts both as a DNA N-glycosylase, removing oxidized pyrimidines from DNA, and as an apurinic/apyrimidinic (AP) endonuclease, introducing a single-strand nick at the site from which the damaged base was removed. Endonuclease III is an iron-sulfur protein that binds a single 4Fe-4S cluster. The 4Fe-4S cluster does not seem to be important for catalytic activity, but is probably involved in the proper positioning of the enzyme along the DNA strand. The 4Fe-4S cluster is bound by four cysteines which are all located in a 17 amino acid region at the C-terminal end of endonuclease III. A similar region is also present in the central section of mutY and in the C-terminus of ORF-10 and of the Micro-coccus UV endonuclease." Q#5649 - CGI_10020325 superfamily 241574 1391 1620 3.25E-116 368.452 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. 
Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#5649 - CGI_10020325 superfamily 241574 1672 1900 1.07E-109 349.962 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#5649 - CGI_10020325 superfamily 241584 408 499 1.91E-16 77.5367 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#5649 - CGI_10020325 superfamily 241584 311 403 2.24E-15 74.4551 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. 
Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#5649 - CGI_10020325 superfamily 241584 806 896 2.75E-13 68.2919 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#5649 - CGI_10020325 superfamily 241584 606 705 6.17E-13 67.5215 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#5649 - CGI_10020325 superfamily 241584 710 799 2.22E-11 62.8991 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. 
Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#5649 - CGI_10020325 superfamily 241584 518 601 5.49E-11 61.7435 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#5649 - CGI_10020325 superfamily 241584 234 303 8.83E-10 58.2767 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#5649 - CGI_10020325 superfamily 241584 1000 1100 1.96E-07 50.9579 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. 
Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#5649 - CGI_10020325 superfamily 241584 901 980 3.56E-06 47.1059 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#5649 - CGI_10020325 superfamily 245814 167 230 2.98E-20 88.0216 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#5649 - CGI_10020325 superfamily 245814 72 146 1.07E-15 75.0804 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#5651 - CGI_10020327 superfamily 248458 68 249 9.87E-27 110.481 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. 
Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#5651 - CGI_10020327 superfamily 248458 303 517 2.98E-07 51.1605 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." 
Q#5652 - CGI_10020328 superfamily 241750 18 342 4.36E-104 310.35 cl00281 metallo-dependent_hydrolases superfamily - - "Superfamily of metallo-dependent hydrolases (also called amidohydrolase superfamily) is a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. The family includes urease alpha, adenosine deaminase, phosphotriesterase dihydroorotases, allantoinases, hydantoinases, AMP-, adenine and cytosine deaminases, imidazolonepropionase, aryldialkylphosphatase, chlorohydrolases, formylmethanofuran dehydrogenases and others." Q#5653 - CGI_10020329 superfamily 243130 463 503 1.06E-05 43.6079 cl02655 CUE superfamily - - "CUE domain; CUE domains have been shown to bind ubiquitin. It has been suggested that CUE domains are related to pfam00627 and this has been confirmed by the structure of the domain. CUE domains also occur in two protein of the IL-1 signal transduction pathway, tollip and TAB2." Q#5654 - CGI_10020330 superfamily 245936 138 365 1.76E-46 158.482 cl12283 IPK superfamily - - Inositol polyphosphate kinase; ArgRIII has has been demonstrated to be an inositol polyphosphate kinase. Q#5655 - CGI_10020331 superfamily 241782 695 1048 1.16E-104 338.923 cl00321 AAT_I superfamily - - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. 
PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which, depending on the reaction, is the substrate in four kinds of reactions: (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phosphorylase family (fold type V)." 
Q#5655 - CGI_10020331 superfamily 243092 408 683 3.24E-57 202.18 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#5655 - CGI_10020331 superfamily 243072 1482 1577 1.17E-26 108.24 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#5655 - CGI_10020331 superfamily 247792 149 183 5.44E-05 42.818 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#5655 - CGI_10020331 superfamily 190233 240 296 6.02E-07 48.9898 cl08341 zf-TRAF superfamily - - TRAF-type zinc finger; TRAF-type zinc finger. Q#5655 - CGI_10020331 superfamily 213458 299 340 0.00762037 36.4829 cl17044 DD_cGKI superfamily - - "Dimerization/Docking domain of Cyclic GMP-dependent Protein Kinase I; Cyclic GMP-dependent Protein Kinase I (PKG1 or cGKI) is a Serine/Threonine Kinase (STK), catalyzing the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. cGKI exists as two splice variants, cGKI-alpha and cGKI-beta. They contain an N-terminal regulatory domain containing a dimerization/docking region and an autoinhibitory pseudosubstrate region, two cGMP-binding domains, and a C-terminal catalytic domain. Binding of cGMP to both binding sites releases the inhibition of the catalytic center by the pseudosubstrate region, allowing autophosphorylation and activation of the kinase. cGKI is a soluble protein expressed in all smooth muscles, platelets, cerebellum, and kidney. It is also expressed at lower concentrations in other tissues. It is involved in the regulation of smooth muscle tone, smooth cell proliferation, and platelet activation. 
The dimerization/docking (D/D) domain is a leucine/isoleucine zipper that mediates both homodimerization and interaction with isotype-specific G-kinase-anchoring proteins (GKAPs). The D/D domain of the two variants (alpha and beta) differ, allowing their targeting to different subcellular compartments and intracellular substrates." Q#5656 - CGI_10020332 superfamily 241596 112 155 2.13E-08 48.7495 cl00081 HLH superfamily N - "Helix-loop-helix domain, found in specific DNA- binding proteins that act as transcription factors; 60-100 amino acids long. A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE 5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved in both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#5657 - CGI_10020333 superfamily 242686 6 379 8.08E-113 339.653 cl01750 PhoPQ_related superfamily - - PhoPQ-activated pathogenicity-related protein; Members of this family of bacterial proteins are involved in the virulence of some pathogenic proteobacteria. 
Q#5658 - CGI_10020334 superfamily 243034 987 1058 4.40E-05 43.5228 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#5658 - CGI_10020334 superfamily 243092 64 355 8.29E-09 57.3448 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several 
blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#5659 - CGI_10020335 superfamily 246598 22 287 5.93E-153 432.447 cl13996 MPN superfamily - - "Mpr1p, Pad1p N-terminal (MPN) domains; MPN (also known as Mov34, PAD-1, JAMM, JAB, MPN+) domains are found in the N-terminal termini of proteins with a variety of functions; they are components of the proteasome regulatory subunits, the signalosome (CSN), eukaryotic translation initiation factor 3 (eIF3) complexes, and regulators of transcription factors. These domains are isopeptidases that release ubiquitin from ubiquitinated proteins (thus having deubiquitinating (DUB) activity) that are tagged for degradation. Catalytically active MPN domains contain a metalloprotease signature known as the JAB1/MPN/Mov34 metalloenzyme (JAMM) motif. For example, Rpn11 (also known as POH1 or PSMD14), a subunit of the 19S proteasome lid is involved in the ATP-dependent degradation of ubiquitinated proteins, contains the conserved JAMM motif involved in zinc ion coordination. Poh1 is a regulator of c-Jun, an important regulator of cell proliferation, differentiation, survival and death. JAB1 is a component of the COP9 signalosome (CSN), a regulatory particle of the ubiquitin (Ub)/26S proteasome system occurring in all eukaryotic cells; it cleaves the ubiquitin-like protein NEDD8 from the cullin subunit of the SCF (Skp1, Cullins, F-box proteins) family of E3 ubiquitin ligases. 
AMSH (associated molecule with the SH3 domain of STAM, also known as STAMBP), a member of JAMM/MPN+ deubiquitinases (DUBs), specifically cleaves Lys 63-linked polyubiquitin (poly-Ub) chains, thus facilitating the recycling and subsequent trafficking of receptors to the cell surface. Similarly, BRCC36, part of the nuclear complex that includes BRCA1 protein and is targeted to DNA damage foci after irradiation, specifically disassembles K63-linked polyUb. BRCC36 is aberrantly expressed in sporadic breast tumors, indicative of a potential role in the pathogenesis of the disease. Some variants of the JAB1/MPN domains lack key residues in their JAMM motif and are unable to coordinate a metal ion. Comparisons of key catalytic and metal binding residues explain why the MPN-containing proteins Mov34/PSMD7, Rpn8, CSN6, Prp8p, and the translation initiation factor 3 subunits f (p47) and h (p40) do not show catalytic isopeptidase activity. It has been proposed that the MPN domain in these proteins has a primarily structural function." Q#5660 - CGI_10020336 superfamily 222269 149 348 5.88E-08 51.9406 cl18657 Cupin_8 superfamily - - Cupin-like domain; This cupin like domain shares similarity to the JmjC domain. Q#5663 - CGI_10020339 superfamily 246908 35 136 2.75E-44 144.72 cl15255 SH2 superfamily - - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." 
Q#5663 - CGI_10020339 superfamily 247683 139 193 2.46E-18 75.6524 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#5663 - CGI_10020339 superfamily 247683 6 32 5.42E-12 57.7517 cl17036 SH3 superfamily N - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. 
Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#5664 - CGI_10020340 superfamily 215754 217 308 2.27E-24 95.0128 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#5664 - CGI_10020340 superfamily 215754 115 213 5.22E-21 85.3828 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#5664 - CGI_10020340 superfamily 215754 16 111 5.70E-18 76.9084 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#5665 - CGI_10020341 superfamily 247723 237 305 3.01E-30 111.205 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#5666 - CGI_10020342 superfamily 243072 183 294 1.16E-21 91.6762 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). 
ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#5666 - CGI_10020342 superfamily 243072 574 627 0.00104338 38.1335 cl02529 ANK superfamily NC - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#5667 - CGI_10020343 superfamily 247856 86 148 8.20E-21 81.0549 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#5667 - CGI_10020343 superfamily 247856 13 75 2.80E-19 77.2029 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#5668 - CGI_10020344 superfamily 247856 155 217 3.87E-17 72.5805 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. 
Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#5668 - CGI_10020344 superfamily 247856 87 139 9.34E-05 38.6829 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#5669 - CGI_10020345 superfamily 241583 85 280 1.07E-129 391.419 cl00064 ZnMc superfamily - - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#5669 - CGI_10020345 superfamily 241571 392 503 9.71E-38 137.929 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#5669 - CGI_10020345 superfamily 241571 547 658 8.76E-34 126.758 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#5669 - CGI_10020345 superfamily 241571 707 815 1.31E-31 120.595 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#5669 - CGI_10020345 superfamily 241571 817 932 3.18E-27 108.269 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." 
Q#5669 - CGI_10020345 superfamily 241571 282 390 1.68E-20 88.6234 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#5669 - CGI_10020345 superfamily 241578 483 541 3.94E-08 53.5427 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#5669 - CGI_10020345 superfamily 241578 652 695 3.26E-07 50.8464 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#5670 - CGI_10020346 superfamily 210118 193 211 0.000109381 38.4607 cl15479 IQ superfamily - - IQ calmodulin-binding motif; Calmodulin-binding motif. Q#5673 - CGI_10009197 superfamily 219425 383 501 9.21E-14 67.5667 cl06494 Hydrolase_2 superfamily - - "Cell Wall Hydrolase; These enzymes have been implicated in cell wall hydrolysis, most extensively in Bacillus subtilis. For instance B. subtilis sleB is expressed during sporulation as an inactive form and then deposited on the cell outer cortex. During germination the enzyme is activated and hydrolyses the cortex. A similar role is carried out by the partially redundant B. subtilis cwlJ. It is not clear whether these enzymes are amidases or peptidases." Q#5677 - CGI_10009201 superfamily 219425 334 452 6.29E-14 67.9519 cl06494 Hydrolase_2 superfamily - - "Cell Wall Hydrolase; These enzymes have been implicated in cell wall hydrolysis, most extensively in Bacillus subtilis. For instance B. subtilis sleB is expressed during sporulation as an inactive form and then deposited on the cell outer cortex. During germination the enzyme is activated and hydrolyses the cortex. A similar role is carried out by the partially redundant B. subtilis cwlJ. It is not clear whether these enzymes are amidases or peptidases." 
Q#5682 - CGI_10009207 superfamily 241563 59 97 2.73E-05 42.0812 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#5684 - CGI_10005294 superfamily 243133 106 177 3.07E-15 69.8972 cl02662 SEP superfamily - - "SEP domain; The SEP domain is named after Saccharomyces cerevisiae Shp1, Drosophila melanogaster eyes closed gene (eyc), and vertebrate p47. In p47, the SEP domain has been shown to bind to and inhibit the cysteine protease cathepsin L. Most SEP domains are succeeded closely by a UBX domain." Q#5684 - CGI_10005294 superfamily 241645 291 366 5.35E-09 52.2735 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#5685 - CGI_10005295 superfamily 217380 809 1081 5.13E-81 269.965 cl18406 TTL superfamily - - "Tubulin-tyrosine ligase family; Tubulins and microtubules are subjected to several post-translational modifications of which the reversible detyrosination/tyrosination of the carboxy-terminal end of most alpha-tubulins has been extensively analysed. 
This modification cycle involves a specific carboxypeptidase and the activity of the tubulin-tyrosine ligase (TTL). The true physiological function of TTL has so far not been established. Tubulin-tyrosine ligase (TTL) catalyzes the ATP-dependent post-translational addition of a tyrosine to the carboxy terminal end of detyrosinated alpha-tubulin. In normally cycling cells, the tyrosinated form of tubulin predominates. However, in breast cancer cells, the detyrosinated form frequently predominates, with a correlation to tumour aggressiveness. On the other hand, 3-nitrotyrosine has been shown to be incorporated, by TTL, into the carboxy terminal end of detyrosinated alpha-tubulin. This reaction is not reversible by the carboxypeptidase enzyme. Cells cultured in 3-nitrotyrosine rich medium showed evidence of altered microtubule structure and function, including altered cell morphology, epithelial barrier dysfunction, and apoptosis. Bacterial homologs of TTL are predicted to form peptide tags. Some of these are fused to a 2-oxoglutarate Fe(II)-dependent dioxygenase domain." Q#5686 - CGI_10005296 superfamily 248139 591 1052 0 759.027 cl17585 RNA_pol_B_RPB2 superfamily N - "RNA polymerase beta subunit. RNA polymerases catalyse the DNA dependent polymerization of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). Each RNA polymerase complex contains two related members of this family, in each case they are the two largest subunits. The clamp is a mobile structure that grips DNA during elongation." Q#5686 - CGI_10005296 superfamily 248139 38 468 3.93E-110 364.968 cl17585 RNA_pol_B_RPB2 superfamily C - "RNA polymerase beta subunit. RNA polymerases catalyse the DNA dependent polymerization of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). 
Each RNA polymerase complex contains two related members of this family, in each case they are the two largest subunits. The clamp is a mobile structure that grips DNA during elongation." Q#5686 - CGI_10005296 superfamily 191029 562 602 1.39E-12 64.1405 cl04595 RNA_pol_Rpb2_5 superfamily - - "RNA polymerase Rpb2, domain 5; RNA polymerases catalyze the DNA dependent polymerisation of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). Domain 5 is also known as the external 2 domain." Q#5686 - CGI_10005296 superfamily 113341 504 535 2.12E-07 49.599 cl04594 RNA_pol_Rpb2_4 superfamily C - "RNA polymerase Rpb2, domain 4; RNA polymerases catalyze the DNA dependent polymerisation of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). Domain 4 is also known as the external 2 domain." Q#5687 - CGI_10005297 superfamily 245201 17 112 6.96E-13 62.0324 cl09925 PKc_like superfamily C - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#5688 - CGI_10020485 superfamily 241566 1255 1303 5.41E-10 57.8872 cl00040 C1 superfamily - - "Protein kinase C conserved region 1 (C1). Cysteine-rich zinc binding domain. Some members of this domain family bind phorbol esters and diacylglycerol, some are reported to bind RasGTP. May occur in tandem arrangement. Diacylglycerol (DAG) is a second messenger, released by activation of Phospholipase D. 
Phorbol Esters (PE) can act as analogues of DAG and mimic its downstream effects in, for example, tumor promotion. Protein Kinases C are activated by DAG/PE, this activation is mediated by their N-terminal conserved region (C1). DAG/PE binding may be phospholipid dependent. C1 domains may also mediate DAG/PE signals in chimaerins (a family of Rac GTPase activating proteins), RasGRPs (exchange factors for Ras/Rap1), and Munc13 isoforms (scaffolding proteins involved in exocytosis)." Q#5688 - CGI_10020485 superfamily 243036 1483 1736 4.96E-29 119.65 cl02434 CNH superfamily - - "CNH domain; Domain found in NIK1-like kinase, mouse citron and yeast ROM1, ROM2. Unpublished observations." Q#5688 - CGI_10020485 superfamily 247725 1335 1448 3.87E-07 50.5433 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#5689 - CGI_10020486 superfamily 245819 44 190 8.52E-64 197.031 cl11967 Nucleotidyl_cyc_III superfamily - - "Class III nucleotidyl cyclases; Class III nucleotidyl cyclases are the largest, most diverse group of nucleotidyl cyclases (NC's) containing prokaryotic and eukaryotic proteins. They can be divided into two major groups; the mononucleotidyl cyclases (MNC's) and the diguanylate cyclases (DGC's). The MNC's, which include the adenylate cyclases (AC's) and the guanylate cyclases (GC's), have a conserved cyclase homology domain (CHD), while the DGC's have a conserved GGDEF domain, named after a conserved motif within this subgroup. 
Their products, cyclic guanylyl and adenylyl nucleotides, are second messengers that play important roles in eukaryotic signal transduction and prokaryotic sensory pathways." Q#5689 - CGI_10020486 superfamily 219526 1 31 0.000153018 39.9099 cl06648 HNOBA superfamily N - "Heme NO binding associated; The HNOBA domain is found associated with the HNOB domain and pfam00211 in soluble cyclases and signalling proteins. The HNOB domain is predicted to function as a heme-dependent sensor for gaseous ligands, and transduce diverse downstream signals, in both bacteria and animals." Q#5691 - CGI_10020488 superfamily 202184 16 106 2.30E-19 77.3194 cl03509 UCR_14kD superfamily - - "Ubiquinol-cytochrome C reductase complex 14kD subunit; The ubiquinol-cytochrome C reductase complex (cytochrome bc1 complex) is a respiratory multienzyme complex. This Pfam family represents the 14kD (or VI) subunit of the complex which is not directly involved in electron transfer, but has a role in assembly of the complex." Q#5693 - CGI_10020490 superfamily 241574 235 371 1.72E-61 196.676 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#5693 - CGI_10020490 superfamily 241626 78 206 1.64E-29 111.22 cl00125 RHOD superfamily - - "Rhodanese Homology Domain (RHOD); an alpha beta fold domain found duplicated in the rhodanese protein. 
The cysteine containing enzymatically active version of the domain is also found in the Cdc25 class of protein phosphatases and a variety of proteins such as sulfide dehydrogenases and certain stress proteins such as senescence specific protein 1 in plants, PspE and GlpE in bacteria and cyanide and arsenate resistance proteins. Inactive versions (no active site cysteine) are also seen in dual specificity phosphatases, ubiquitin hydrolases from yeast and in sulfuryltransferases, where they are believed to play a regulatory role in multidomain proteins." Q#5694 - CGI_10020491 superfamily 243146 127 174 4.64E-09 49.4791 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#5694 - CGI_10020491 superfamily 243146 79 126 9.09E-08 46.0123 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#5694 - CGI_10020491 superfamily 243146 160 194 0.000110023 37.5915 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#5695 - CGI_10020492 superfamily 243146 77 116 0.00119248 34.5595 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#5696 - CGI_10020493 superfamily 243072 40 151 1.07E-06 46.993 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#5697 - CGI_10020494 superfamily 199899 206 251 1.75E-30 110.692 cl18199 TFIIA_alpha_beta_like superfamily N - "Precursor of TFIIA alpha and beta subunits and similar proteins; Transcription factor II A (TFIIA) is one of the general transcription factors for RNA polymerase II. TFIIA increases the affinity of TATA-binding protein (TBP) for DNA in order to assemble the initiation complex. TFIIA also functions as an activator during development and differentiation, and is involved in transcription from TATA-less promoters. TFIIA is composed of more than one subunit in various organisms. Mammalian TFIIA large subunits (TFIIA alpha and beta) and the smaller subunit (TFIIA gamma) form a heterotrimer. TFIIA alpha and beta are encoded by a single gene (TFIIA_alpha_beta), its protein product is post-translationally processed and cleaved. TOA1 and TOA2 are the two subunits of Yeast TFIIA which correspond to Mammalian TFIIA_alpha_beta and TFIIA gamma, respectively. TOA1 and TOA2 form a heterodimeric protein complex. TFIIA_alpha_beta alone is sufficient for transcription in early embryogenesis, but the cleaved forms, TFIIA alpha and TFIIA beta, represent the vast majority of TFIIA in most differentiated cells. The exact functional differences between cleaved and uncleaved forms are not yet clear. This model also contains paralogs of the canonical TFIIA_alpha_beta, such as the human ALF, which may be involved in gametogenesis and early embryogenesis (and is also subject to proteolytic cleavage)." Q#5697 - CGI_10020494 superfamily 199899 7 61 1.45E-24 94.5135 cl18199 TFIIA_alpha_beta_like superfamily C - "Precursor of TFIIA alpha and beta subunits and similar proteins; Transcription factor II A (TFIIA) is one of the general transcription factors for RNA polymerase II. TFIIA increases the affinity of TATA-binding protein (TBP) for DNA in order to assemble the initiation complex. 
TFIIA also functions as an activator during development and differentiation, and is involved in transcription from TATA-less promoters. TFIIA is composed of more than one subunit in various organisms. Mammalian TFIIA large subunits (TFIIA alpha and beta) and the smaller subunit (TFIIA gamma) form a heterotrimer. TFIIA alpha and beta are encoded by a single gene (TFIIA_alpha_beta), its protein product is post-translationally processed and cleaved. TOA1 and TOA2 are the two subunits of Yeast TFIIA which correspond to Mammalian TFIIA_alpha_beta and TFIIA gamma, respectively. TOA1 and TOA2 form a heterodimeric protein complex. TFIIA_alpha_beta alone is sufficient for transcription in early embryogenesis, but the cleaved forms, TFIIA alpha and TFIIA beta, represent the vast majority of TFIIA in most differentiated cells. The exact functional differences between cleaved and uncleaved forms are not yet clear. This model also contains paralogs of the canonical TFIIA_alpha_beta, such as the human ALF, which may be involved in gametogenesis and early embryogenesis (and is also subject to proteolytic cleavage)." Q#5699 - CGI_10020496 superfamily 246925 97 179 0.00525342 35.409 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#5700 - CGI_10020497 superfamily 241775 78 163 4.19E-11 56.756 cl00314 Ribosomal_S10 superfamily - - Ribosomal protein S10p/S20e; This family includes small ribosomal subunit S10 from prokaryotes and S20 from eukaryotes. 
Q#5703 - CGI_10020500 superfamily 212559 1450 1494 6.77E-10 57.2391 cl18297 SANT_MTA3_like superfamily - - "Myb-Like Dna-Binding Domain of MTA3 and related proteins; Members in this SANT/myb family include domains found in mouse metastasis-associated protein 3 (MTA3) proteins and arginine-glutamic dipeptide (RERE) repeats proteins. SANT (SWI3, ADA2, N-CoR and TFIIIB) DNA-binding domains are a diverse set of proteins that share a common 3 alpha-helix bundle. MTA3 has been shown to interact with nucleosome remodeling and deacetylase (NuRD) proteins CHD4 and HDAC1, and the core cohesin complex protein RAD21 in the ovary, and regulate G2/M progression in proliferating granulosa cells. RERE belongs to the atrophin family and has been identified as a nuclear receptor corepressor; altered expression levels of RERE are associated with cancer in humans while mutations of Rere in mice cause failure in closing the anterior neural tube and fusion of the telencephalic and optic vesicles during embryogenesis." Q#5703 - CGI_10020500 superfamily 216509 1353 1408 4.38E-06 46.0778 cl03218 ELM2 superfamily - - "ELM2 domain; The ELM2 (Egl-27 and MTA1 homology 2) domain is a small domain of unknown function. It is found in the MTA1 protein that is part of the NuRD complex. The domain is usually found to the N terminus of a myb-like DNA binding domain pfam00249. ELM2 is also found associated with an ARID DNA binding domain pfam01388 in a member from Arabidopsis thaliana. This suggests that ELM2 may also be involved in DNA binding, or perhaps is a protein-protein interaction domain." Q#5703 - CGI_10020500 superfamily 197676 259 281 0.0098124 35.9045 cl18194 ZnF_C2H2 superfamily - - zinc finger; zinc finger. Q#5704 - CGI_10020501 superfamily 205451 83 176 1.27E-06 44.8767 cl16203 DUF4062 superfamily - - "Domain of unknown function (DUF4062); This presumed domain is functionally uncharacterized. 
This domain family is found in bacteria, archaea and eukaryotes, and is approximately 80 amino acids in length. There is a conserved SST sequence motif." Q#5705 - CGI_10020502 superfamily 208568 171 222 7.80E-27 99.8888 cl06890 NOSIC superfamily - - NOSIC (NUC001) domain; This is the central domain in Nop56/SIK1-like proteins. Q#5705 - CGI_10020502 superfamily 219731 5 71 2.90E-21 84.9259 cl06964 NOP5NT superfamily - - NOP5NT (NUC127) domain; This N terminal domain is found in RNA-binding proteins of the NOP5 family. Q#5706 - CGI_10020503 superfamily 247792 65 108 6.57E-09 50.1368 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#5707 - CGI_10020504 superfamily 248458 108 460 7.15E-19 86.5989 cl17904 MFS superfamily - - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. 
MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#5708 - CGI_10020505 superfamily 248458 113 496 7.52E-27 110.481 cl17904 MFS superfamily - - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. 
Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#5709 - CGI_10020506 superfamily 241752 941 1148 5.67E-130 398.504 cl00283 ADP_ribosyl superfamily - - "ADP_ribosylating enzymes catalyze the transfer of ADP_ribose from NAD+ to substrates. Bacterial toxins are cytoplasmic and catalyze the transfer of a single ADP_ribose unit to eukaryotic elongation factor 2, halting protein synthesis and killing the cell. Poly(ADP-ribose) polymerases (PARPS 1-3, VPARP, tankyrase) catalyze the addition of up to 100 ADP_ribose units from NAD+. PARPs 1 and 2 are localized in the nucleus, bind DNA, and are activated by DNA damage. VPARP is part of the vault ribonucleoprotein complex. Tankyrases regulate telomere length in part through poly(ADP_ribosylation) of telomere repeat binding factor 1 (TRF1). Poly(ADP-ribose) polymerase catalyses the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active." 
Q#5709 - CGI_10020506 superfamily 243072 515 640 6.58E-37 136.745 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#5709 - CGI_10020506 superfamily 243072 668 782 3.73E-34 129.041 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#5709 - CGI_10020506 superfamily 243072 47 161 1.51E-32 124.418 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#5709 - CGI_10020506 superfamily 243072 115 294 3.44E-29 114.788 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#5709 - CGI_10020506 superfamily 247057 869 929 1.42E-27 108.184 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#5709 - CGI_10020506 superfamily 243072 357 474 7.51E-25 102.077 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#5711 - CGI_10020508 superfamily 243035 8 76 0.00047513 37.3556 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. 
This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. 
In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#5712 - CGI_10020509 superfamily 220692 76 187 0.00205725 36.7986 cl18570 7TM_GPCR_Srw superfamily N - Serpentine type 7TM GPCR chemoreceptor Srw; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srw is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. The genes encoding Srw do not appear to be under as strong an adaptive evolutionary pressure as those of Srz. Q#5714 - CGI_10020511 superfamily 241992 823 1057 1.01E-149 456.69 cl00628 Piwi-like superfamily N - "Piwi-like: PIWI domain. Domain found in proteins involved in RNA silencing. RNA silencing refers to a group of related gene-silencing mechanisms mediated by short RNA molecules, including siRNAs, miRNAs, and heterochromatin-related guide RNAs. The central component of the RNA-induced silencing complex (RISC) and related complexes is Argonaute. The PIWI domain is the C-terminal portion of Argonaute and consists of two subdomains, one of which provides the 5' anchoring of the guide RNA and the other, the catalytic site for slicing. This domain is also found in closely related proteins, including the Piwi subfamily, where it is believed to perform a crucial role in germline cells, via a similar mechanism." 
Q#5714 - CGI_10020511 superfamily 241765 338 458 7.62E-40 144.767 cl00301 PAZ superfamily - - "PAZ domain, named PAZ after the proteins Piwi Argonaut and Zwille. PAZ is found in two families of proteins that are essential components of RNA-mediated gene-silencing pathways, including RNA interference, the piwi and Dicer families. PAZ functions as a nucleic-acid binding domain, with a strong preference for single-stranded nucleic acids (RNA or DNA) or RNA duplexes with single-stranded 3' overhangs. It has been suggested that the PAZ domain provides a unique mode for the recognition of the two 3'-terminal nucleotides in single-stranded nucleic acids and buries the 3' OH group, and that it might recognize characteristic 3' overhangs in siRNAs within RISC (RNA-induced silencing) and other complexes. This parent model also contains structures of an archaeal PAZ domain." Q#5714 - CGI_10020511 superfamily 241992 544 765 7.23E-76 257.156 cl00628 Piwi-like superfamily C - "Piwi-like: PIWI domain. Domain found in proteins involved in RNA silencing. RNA silencing refers to a group of related gene-silencing mechanisms mediated by short RNA molecules, including siRNAs, miRNAs, and heterochromatin-related guide RNAs. The central component of the RNA-induced silencing complex (RISC) and related complexes is Argonaute. The PIWI domain is the C-terminal portion of Argonaute and consists of two subdomains, one of which provides the 5' anchoring of the guide RNA and the other, the catalytic site for slicing. This domain is also found in closely related proteins, including the Piwi subfamily, where it is believed to perform a crucial role in germline cells, via a similar mechanism." Q#5714 - CGI_10020511 superfamily 219976 286 338 8.54E-21 87.9985 cl07356 DUF1785 superfamily - - Domain of unknown function (DUF1785); This region is found in argonaute proteins and often co-occurs with pfam02179 and pfam02171. 
Q#5717 - CGI_10020514 superfamily 192997 298 423 7.57E-20 87.6371 cl18184 Sterol-sensing superfamily - - "Sterol-sensing domain of SREBP cleavage-activation; Sterol regulatory element-binding proteins (SREBPs) are membrane-bound transcription factors that promote lipid synthesis in animal cells. They are embedded in the membranes of the endoplasmic reticulum (ER) in a helical hairpin orientation and are released from the ER by a two-step proteolytic process. Proteolysis begins when the SREBPs are cleaved at Site-1, which is located at a leucine residue in the middle of the hydrophobic loop in the lumen of the ER. Upon proteolytic processing SREBP can activate the expression of genes involved in cholesterol biosynthesis and uptake. SCAP stimulates cleavage of SREBPs via fusion of their two C-termini. This domain is the transmembrane region that traverses the membrane eight times and is the sterol-sensing domain of the cleavage protein. WD40 domains are found towards the C-terminus." Q#5718 - CGI_10020515 superfamily 243072 41 150 1.48E-36 131.352 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#5723 - CGI_10020520 superfamily 241777 91 410 2.76E-75 237.888 cl00316 Cation_efflux superfamily - - "Cation efflux family; Members of this family are integral membrane proteins, that are found to increase tolerance to divalent metal ions such as cadmium, zinc, and cobalt. These proteins are thought to be efflux pumps that remove these ions from cells." 
Q#5723 - CGI_10020520 superfamily 241884 50 120 0.005871 36.5396 cl00467 Ntn_hydrolase superfamily C - "The Ntn hydrolases (N-terminal nucleophile) are a diverse superfamily of enzymes that are activated autocatalytically via an N-terminally located nucleophilic amino acid. N-terminal nucleophile (NTN-) hydrolase superfamily, which contains a four-layered alpha, beta, beta, alpha core structure. This family of hydrolases includes penicillin acylase, the 20S proteasome alpha and beta subunits, and glutamate synthase. The mechanism of activation of these proteins is conserved, although they differ in their substrate specificities. All known members catalyze the hydrolysis of amide bonds in either proteins or small molecules, and each one of them is synthesized as a preprotein. For each, an autocatalytic endoproteolytic process generates a new N-terminal residue. This mature N-terminal residue is central to catalysis and acts as both a polarizing base and a nucleophile during the reaction. The N-terminal amino group acts as the proton acceptor and activates either the nucleophilic hydroxyl in a Ser or Thr residue or the nucleophilic thiol in a Cys residue. The position of the N-terminal nucleophile in the active site and the mechanism of catalysis are conserved in this family, despite considerable variation in the protein sequences." Q#5724 - CGI_10020521 superfamily 241777 182 329 2.24E-42 149.293 cl00316 Cation_efflux superfamily N - "Cation efflux family; Members of this family are integral membrane proteins, that are found to increase tolerance to divalent metal ions such as cadmium, zinc, and cobalt. These proteins are thought to be efflux pumps that remove these ions from cells." Q#5724 - CGI_10020521 superfamily 241777 1 114 1.32E-30 117.321 cl00316 Cation_efflux superfamily C - "Cation efflux family; Members of this family are integral membrane proteins, that are found to increase tolerance to divalent metal ions such as cadmium, zinc, and cobalt. 
These proteins are thought to be efflux pumps that remove these ions from cells." Q#5727 - CGI_10020524 superfamily 245213 1423 1457 7.86E-05 42.2386 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#5727 - CGI_10020524 superfamily 242173 438 593 5.91E-18 83.0702 cl00891 Cu-Zn_Superoxide_Dismutase superfamily - - "Copper/zinc superoxide dismutase (SOD). superoxide dismutases catalyse the conversion of superoxide radicals to molecular oxygen. Three evolutionarily distinct families of SODs are known, of which the copper/zinc-binding family is one. Defects in the human SOD1 gene causes familial amyotrophic lateral sclerosis (Lou Gehrig's disease). Cytoplasmic and periplasmic SODs exist as dimers, whereas chloroplastic and extracellular enzymes exist as tetramers. Structure supports independent functional evolution in prokaryotes (P-class) and eukaryotes (E-class) [PMID:8176730]." Q#5727 - CGI_10020524 superfamily 242173 292 429 1.28E-12 67.2771 cl00891 Cu-Zn_Superoxide_Dismutase superfamily - - "Copper/zinc superoxide dismutase (SOD). superoxide dismutases catalyse the conversion of superoxide radicals to molecular oxygen. Three evolutionarily distinct families of SODs are known, of which the copper/zinc-binding family is one. Defects in the human SOD1 gene causes familial amyotrophic lateral sclerosis (Lou Gehrig's disease). 
Cytoplasmic and periplasmic SODs exist as dimers, whereas chloroplastic and extracellular enzymes exist as tetramers. Structure supports independent functional evolution in prokaryotes (P-class) and eukaryotes (E-class) [PMID:8176730]." Q#5727 - CGI_10020524 superfamily 242173 611 760 2.79E-10 59.9583 cl00891 Cu-Zn_Superoxide_Dismutase superfamily - - "Copper/zinc superoxide dismutase (SOD). superoxide dismutases catalyse the conversion of superoxide radicals to molecular oxygen. Three evolutionarily distinct families of SODs are known, of which the copper/zinc-binding family is one. Defects in the human SOD1 gene causes familial amyotrophic lateral sclerosis (Lou Gehrig's disease). Cytoplasmic and periplasmic SODs exist as dimers, whereas chloroplastic and extracellular enzymes exist as tetramers. Structure supports independent functional evolution in prokaryotes (P-class) and eukaryotes (E-class) [PMID:8176730]." Q#5729 - CGI_10020526 superfamily 245814 2 87 2.96E-10 56.3597 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#5730 - CGI_10020527 superfamily 247727 48 126 7.86E-10 55.5138 cl17173 AdoMet_MTases superfamily C - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). 
There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#5732 - CGI_10001776 superfamily 243072 17 131 2.67E-10 55.8526 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#5732 - CGI_10001776 superfamily 243073 211 250 0.00272897 34.3935 cl02533 SOCS superfamily - - "SOCS (suppressors of cytokine signaling) box. The SOCS box is found in the C-terminal region of CIS/SOCS family proteins (in combination with a SH2 domain), ASBs (ankyrin repeat-containing proteins with a SOCS box), SSBs (SPRY domain-containing proteins with a SOCS box), and WSBs (WD40 repeat-containing proteins with a SOCS box), as well as, other miscellaneous proteins. The function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions." Q#5733 - CGI_10001777 superfamily 243072 36 166 1.56E-22 88.5946 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). 
ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#5736 - CGI_10008500 superfamily 243074 4 48 7.46E-11 57.1313 cl02535 F-box-like superfamily - - F-box-like; This is an F-box-like family. Q#5736 - CGI_10008500 superfamily 192915 227 274 2.18E-07 48.5185 cl13451 DUF3506 superfamily C - Domain of unknown function (DUF3506); This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 131 to 148 amino acids in length. This domain has a conserved KLTGD sequence motif. Q#5737 - CGI_10008501 superfamily 247724 163 327 9.08E-75 230.389 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. 
Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#5737 - CGI_10008501 superfamily 110047 9 161 2.25E-14 68.8896 cl03072 GTP1_OBG superfamily - - "GTP1/OBG; The N-terminal domain of B. subtilis GTPase obgE has the OBG fold, which is formed by three glycine-rich regions inserted into a small 8-stranded beta-sandwich; these regions form six left-handed collagen-like helices packed and H-bonded together." Q#5738 - CGI_10008502 superfamily 220795 94 272 3.55E-100 293.487 cl11156 Kua-UEV1_localn superfamily - - "Kua-ubiquitin conjugating enzyme hybrid localisation domain; This domain is part of the transcript of the fusion of two genes, the UEV1, an enzymatically inactive variant of the E2 ubiquitin-conjugating enzymes that regulate non-canonical elongation of ubiquitin chains, and Kua, an otherwise unknown gene. UEV1A is a nuclear protein, whereas both Kua and Kua-UEV localise to cytoplasmic structures, indicating that the addition of a Kua domain to UEV confers new biological properties. UEV1-Kua carries the B domain with its characteristic double histidine motif, and it is probably this domain which determines the cytoplasmic localisation. It is postulated that this hybrid transcript could preferentially direct the variant polyubiquitination of substrates closely associated with the cytoplasmic face of the endoplasmic reticulum, possibly, although not necessarily, in conjunction with membrane-bound ubiquitin-conjugating enzymes." Q#5739 - CGI_10008503 superfamily 222150 1117 1142 3.43E-05 42.7641 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#5741 - CGI_10008505 superfamily 222150 1064 1089 5.31E-05 42.7641 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. 
Q#5741 - CGI_10008505 superfamily 222150 1735 1759 0.000210966 41.2233 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#5741 - CGI_10008505 superfamily 222150 1246 1270 0.000222736 40.8381 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#5741 - CGI_10008505 superfamily 222150 712 737 0.000257426 40.8381 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#5741 - CGI_10008505 superfamily 222150 1762 1787 0.000470304 40.0677 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#5741 - CGI_10008505 superfamily 222150 768 793 0.000866332 39.2973 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#5741 - CGI_10008505 superfamily 222150 298 322 0.000890099 39.2973 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#5741 - CGI_10008505 superfamily 222150 1303 1328 0.00093678 39.2973 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#5741 - CGI_10008505 superfamily 222150 595 620 0.0013929 38.5269 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#5741 - CGI_10008505 superfamily 222150 325 350 0.00301204 37.7565 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#5742 - CGI_10008506 superfamily 243072 228 358 1.72E-24 99.3802 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#5742 - CGI_10008506 superfamily 243072 305 425 8.17E-18 80.1202 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#5742 - CGI_10008506 superfamily 243072 134 255 7.22E-12 62.7862 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#5742 - CGI_10008506 superfamily 243073 463 501 4.89E-06 43.9981 cl02533 SOCS superfamily - - "SOCS (suppressors of cytokine signaling) box. The SOCS box is found in the C-terminal region of CIS/SOCS family proteins (in combination with a SH2 domain), ASBs (ankyrin repeat-containing proteins with a SOCS box), SSBs (SPRY domain-containing proteins with a SOCS box), and WSBs (WD40 repeat-containing proteins with a SOCS box), as well as, other miscellaneous proteins. The function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions." 
Q#5743 - CGI_10008507 superfamily 241657 79 556 0 535.799 cl00170 eu-GS superfamily - - Eukaryotic Glutathione Synthetase (eu-GS); catalyses the production of glutathione from gamma-glutamylcysteine and glycine in an ATP-dependent manner. Belongs to the ATP-grasp superfamily. Q#5744 - CGI_10008508 superfamily 241900 11 158 4.91E-05 40.7883 cl00490 EEP superfamily N - "Exonuclease-Endonuclease-Phosphatase (EEP) domain superfamily; This large superfamily includes the catalytic domain (exonuclease/endonuclease/phosphatase or EEP domain) of a diverse set of proteins including the ExoIII family of apurinic/apyrimidinic (AP) endonucleases, inositol polyphosphate 5-phosphatases (INPP5), neutral sphingomyelinases (nSMases), deadenylases (such as the vertebrate circadian-clock regulated nocturnin), bacterial cytolethal distending toxin B (CdtB), deoxyribonuclease 1 (DNase1), the endonuclease domain of the non-LTR retrotransposon LINE-1, and related domains. These diverse enzymes share a common catalytic mechanism of cleaving phosphodiester bonds; their substrates range from nucleic acids to phospholipids and perhaps proteins." Q#5745 - CGI_10008509 superfamily 211463 792 917 7.93E-50 172.417 cl13463 FAT-like_CAS_C superfamily - - "C-terminal FAT-like Four helix bundle domain, also called DUF3513, of CAS (Crk-Associated Substrate) scaffolding proteins; a protein interaction module; CAS proteins function as molecular scaffolds to regulate protein complexes that are involved in many cellular processes including migration, chemotaxis, apoptosis, differentiation, and progenitor cell function. They mediate the signaling of integrins at focal adhesions where they localize, and thus, regulate cell invasion and survival. Over-expression of these proteins is implicated in poor prognosis, increased metastasis, and resistance to chemotherapeutics in many cancers such as breast, lung, melanoma, and glioblastoma. 
CAS proteins have also been linked to the pathogenesis of inflammatory disorders, Alzheimer's, Parkinson's, and developmental defects. They share a common domain structure containing protein interaction modules that enable their scaffolding function, including an N-terminal SH3 domain, an unstructured substrate domain that contains many YxxP motifs, a serine-rich four-helix bundle, and a FAT-like C-terminal domain. Vertebrates contain four CAS proteins: BCAR1 (or p130Cas), NEDD9 (or HEF1), EFS (or SIN), and CASS4 (or HEPL). The FAT-like C-terminal domain of CAS proteins binds to the C-terminal domain of NSPs (novel SH2-containing proteins) to form multidomain signaling modules that mediate cell migration and invasion." Q#5745 - CGI_10008509 superfamily 247683 9 64 6.35E-32 119.758 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." 
Q#5745 - CGI_10008509 superfamily 211451 433 579 3.84E-43 155.061 cl07433 Serine_rich_CAS superfamily - - "Serine rich Four helix bundle domain of CAS (Crk-Associated Substrate) scaffolding proteins; a protein interaction module; CAS proteins function as molecular scaffolds to regulate protein complexes that are involved in many cellular processes including migration, chemotaxis, apoptosis, differentiation, and progenitor cell function. They mediate the signaling of integrins at focal adhesions where they localize, and thus, regulate cell invasion and survival. Over-expression of these proteins is implicated in poor prognosis, increased metastasis, and resistance to chemotherapeutics in many cancers such as breast, lung, melanoma, and glioblastoma. CAS proteins have also been linked to the pathogenesis of inflammatory disorders, Alzheimer's, Parkinson's, and developmental defects. They share a common domain structure containing protein interaction modules that enable their scaffolding function, including an N-terminal SH3 domain, an unstructured substrate domain that contains many YxxP motifs, a serine-rich four-helix bundle, and a FAT-like C-terminal domain. Vertebrates contain four CAS proteins: BCAR1 (or p130Cas), NEDD9 (or HEF1), EFS (or SIN), and CASS4 (or HEPL). CAS proteins associate with the 14-3-3 family; this interaction is regulated by integrin-mediated cell adhesion. The serine rich four helix bundle domain of BCAR1 has been shown to bind 14-3-3 in a phosphorylation-dependent manner. This domain is structurally similar to other helical bundles found in cell adhesion components such as alpha-catenin, vinculin, and FAK, and may bind other proteins in addition to the 14-3-3 family." 
Q#5747 - CGI_10008511 superfamily 247723 168 240 5.05E-19 79.2251 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#5748 - CGI_10008512 superfamily 243058 635 737 5.06E-08 51.9315 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#5748 - CGI_10008512 superfamily 243058 202 318 6.11E-08 51.5463 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. 
An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#5748 - CGI_10008512 superfamily 243058 291 407 1.70E-05 44.2276 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#5748 - CGI_10008512 superfamily 243072 16 111 8.36E-05 41.9855 cl02529 ANK superfamily N - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#5748 - CGI_10008512 superfamily 243058 124 237 0.000236863 40.3756 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#5748 - CGI_10008512 superfamily 243058 381 491 0.000359328 39.9904 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#5749 - CGI_10008513 superfamily 227462 39 154 3.23E-44 145.524 cl18814 COG5133 superfamily N - Uncharacterized conserved protein [Function unknown] Q#5750 - CGI_10008514 superfamily 241572 258 347 1.67E-17 77.664 cl00050 CYCLIN superfamily - - "Cyclin box fold. Protein binding domain functioning in cell-cycle and transcription control. 
Present in cyclins, TFIIB and Retinoblastoma (RB).The cyclins consist of 8 classes of cell cycle regulators that regulate cyclin dependent kinases (CDKs). TFIIB is a transcription factor that binds the TATA box. Cyclins, TFIIB and RB contain 2 copies of the domain." Q#5750 - CGI_10008514 superfamily 241572 372 444 5.77E-11 59.1744 cl00050 CYCLIN superfamily - - "Cyclin box fold. Protein binding domain functioning in cell-cycle and transcription control. Present in cyclins, TFIIB and Retinoblastoma (RB).The cyclins consist of 8 classes of cell cycle regulators that regulate cyclin dependent kinases (CDKs). TFIIB is a transcription factor that binds the TATA box. Cyclins, TFIIB and RB contain 2 copies of the domain." Q#5752 - CGI_10004958 superfamily 219075 237 413 7.11E-43 151.468 cl05845 RGM_C superfamily - - "Repulsive guidance molecule (RGM) C-terminus; This family consists of several mammalian and one bird sequence from Gallus gallus (Chicken). This family represents the C-terminal region of several sequences but in others it represents the full protein. All of the mammalian proteins are hypothetical and have no known function but RGMA from the chicken is annotated as being a repulsive guidance molecule (RGM). RGM is a GPI-linked axon guidance molecule of the retinotectal system. RGM is repulsive for a subset of axons, those from the temporal half of the retina. Temporal retinal axons invade the anterior optic tectum in a superficial layer, and encounter RGM expressed in a gradient with increasing concentration along the anterior-posterior axis. Temporal axons are able to receive posterior-dependent information by sensing gradients or concentrations of guidance cues. Thus, RGM is likely to provide positional information for temporal axons invading the optic tectum in the stratum opticum." 
Q#5752 - CGI_10004958 superfamily 219076 41 234 2.39E-35 130.709 cl05846 RGM_N superfamily - - "Repulsive guidance molecule (RGM) N-terminus; This family consists of the N-terminal region of several mammalian and one bird sequence from Gallus gallus (Chicken). All of the mammalian proteins are hypothetical and have no known function but RGMA from the chicken is annotated as being a repulsive guidance molecule (RGM). RGM is a GPI-linked axon guidance molecule of the retinotectal system. RGM is repulsive for a subset of axons, those from the temporal half of the retina. Temporal retinal axons invade the anterior optic tectum in a superficial layer, and encounter RGM expressed in a gradient with increasing concentration along the anterior-posterior axis. Temporal axons are able to receive posterior-dependent information by sensing gradients or concentrations of guidance cues. Thus, RGM is likely to provide positional information for temporal axons invading the optic tectum in the stratum opticum." Q#5754 - CGI_10004960 superfamily 247724 22 162 2.08E-24 97.2955 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. 
The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#5757 - CGI_10019151 superfamily 241572 82 166 0.00109135 37.1867 cl00050 CYCLIN superfamily - - "Cyclin box fold. Protein binding domain functioning in cell-cycle and transcription control. Present in cyclins, TFIIB and Retinoblastoma (RB).The cyclins consist of 8 classes of cell cycle regulators that regulate cyclin dependent kinases (CDKs). TFIIB is a transcription factor that binds the TATA box. Cyclins, TFIIB and RB contain 2 copies of the domain." Q#5759 - CGI_10019153 superfamily 245201 162 246 4.68E-09 54.3912 cl09925 PKc_like superfamily NC - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#5761 - CGI_10019155 superfamily 242122 1988 2131 4.02E-07 50.7085 cl00824 HEPN superfamily - - HEPN domain; HEPN domain. Q#5765 - CGI_10019159 superfamily 192987 21 98 5.55E-20 78.7683 cl13724 TMF_TATA_bd superfamily N - "TATA element modulatory factor 1 TATA binding; This is the C-terminal conserved coiled coil region of a family of TATA element modulatory factor 1 proteins conserved in eukaryotes. The proteins bind to the TATA element of some RNA polymerase II promoters and repress their activity by competing with the binding of TATA binding protein. 
TMF1_TATA_bd is the most conserved part of the TMFs. TMFs are evolutionarily conserved golgins that bind Rab6, a ubiquitous ras-like GTP-binding Golgi protein, and contribute to Golgi organisation in animal and plant cells. The Rab6-binding domain appears to be the same region as this C-terminal family." Q#5767 - CGI_10019161 superfamily 243072 43 168 1.12E-33 121.337 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#5767 - CGI_10019161 superfamily 243072 109 241 1.45E-31 115.559 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#5768 - CGI_10019162 superfamily 247057 445 515 9.61E-34 125.87 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. 
They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#5768 - CGI_10019162 superfamily 247057 379 444 3.23E-30 115.434 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#5768 - CGI_10019162 superfamily 247683 22 74 1.13E-06 47.6876 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. 
SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#5768 - CGI_10019162 superfamily 242211 205 293 0.00197241 39.2137 cl00944 KdpC superfamily C - "K+-transporting ATPase, c chain; This family consists of K+-transporting ATPase, c chain, KdpC. KdpC forms strong interactions with the KdpA subunit, serving to assemble and stabilise the Kdp complex. It has been suggested that KdpC could be one of the connecting links between the energy providing subunit KdpB and the K+-transporting subunit KdpA. The K+ transport system actively transports K+ ions via ATP hydrolysis." Q#5769 - CGI_10019163 superfamily 246726 28 99 0.000553341 36.734 cl14821 DUF3792 superfamily C - Protein of unknown function (DUF3792); This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 130 amino acids in length. These proteins are integral membrane proteins. Q#5772 - CGI_10019166 superfamily 247725 28 133 1.25E-46 153.053 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." 
Q#5772 - CGI_10019166 superfamily 220708 99 196 1.43E-09 53.5611 cl11017 WWbp superfamily - - "WW-domain ligand protein; The WWbp domain is characterized by several short PY and PT-like motifs of the PPPPY form. These appear to bind directly to the WW domains of WWP1 and WWP2 and other such diverse proteins as dystrophin and YAP (Yes-associated protein). This is the WW-domain binding protein WWbp via PY and PY_like motifs. The presence of a phosphotyrosine residue in the pWBP-1 peptide abolishes WW domain binding which suggests a potential regulatory role for tyrosine phosphorylation in modulating WW domain-ligand interactions. Given the likelihood that WWP1 and WWP2 function as E3 ubiquitin-protein ligases, it is possible that initial substrate-specific recognition occurs via WW domain-substrate protein interaction followed by ubiquitin transfer and subsequent proteolysis. This domain lies just downstream of the GRAM (pfam02893) in many members." Q#5773 - CGI_10019167 superfamily 241550 59 343 1.53E-101 307.225 cl00015 nt_trans superfamily - - "nucleotidyl transferase superfamily; nt_trans (nucleotidyl transferase) This superfamily includes the class I amino-acyl tRNA synthetases, pantothenate synthetase (PanC), ATP sulfurylase, and the cytidylyltransferases, all of which have a conserved dinucleotide-binding domain." Q#5774 - CGI_10019168 superfamily 218454 250 366 1.93E-26 102.352 cl04950 RNA_pol_Rpc4 superfamily - - "RNA polymerase III RPC4; Specific subunit for Pol III, the tRNA specific polymerase." Q#5775 - CGI_10019169 superfamily 245201 15 206 1.53E-35 131.59 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. 
These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#5776 - CGI_10019170 superfamily 241983 429 751 9.07E-39 145.578 cl00614 ADP_ribosyl_GH superfamily - - "ADP-ribosylglycohydrolase; This family includes enzymes that ADP-ribosylations, for example ADP-ribosylarginine hydrolase EC:3.2.2.19 cleaves ADP-ribose-L-arginine. The family also includes dinitrogenase reductase activating glycohydrolase. Most surprisingly the family also includes jellyfish crystallins, these proteins appear to have lost the presumed active site residues." Q#5776 - CGI_10019170 superfamily 241983 10 334 5.96E-34 132.096 cl00614 ADP_ribosyl_GH superfamily - - "ADP-ribosylglycohydrolase; This family includes enzymes that ADP-ribosylations, for example ADP-ribosylarginine hydrolase EC:3.2.2.19 cleaves ADP-ribose-L-arginine. The family also includes dinitrogenase reductase activating glycohydrolase. Most surprisingly the family also includes jellyfish crystallins, these proteins appear to have lost the presumed active site residues." Q#5777 - CGI_10019171 superfamily 241983 10 127 2.43E-12 61.9902 cl00614 ADP_ribosyl_GH superfamily C - "ADP-ribosylglycohydrolase; This family includes enzymes that ADP-ribosylations, for example ADP-ribosylarginine hydrolase EC:3.2.2.19 cleaves ADP-ribose-L-arginine. The family also includes dinitrogenase reductase activating glycohydrolase. Most surprisingly the family also includes jellyfish crystallins, these proteins appear to have lost the presumed active site residues." Q#5779 - CGI_10019173 superfamily 150870 122 199 2.27E-27 107.455 cl10946 Stork_head superfamily - - Winged helix Storkhead-box1 domain; This is the conserved N-terminal winged helix domain of Storkhead-box1 protein which is likely to be a DNA binding domain. 
In humans the full-length protein controls polyploidization of extravillous trophoblast and is implicated in pre-eclampsia. Q#5780 - CGI_10019174 superfamily 247723 12 82 3.57E-29 105.78 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#5781 - CGI_10019175 superfamily 216881 1 203 3.40E-83 248.016 cl03454 Rho_GDI superfamily - - RHO protein GDP dissociation inhibitor; RHO protein GDP dissociation inhibitor. Q#5783 - CGI_10019177 superfamily 203750 989 1123 1.69E-45 161.709 cl18248 Sad1_UNC superfamily - - "Sad1 / UNC-like C-terminal; The C. elegans UNC-84 protein is a nuclear envelope protein that is involved in nuclear anchoring and migration during development. The S. pombe Sad1 protein localises at the spindle pole body. UNC-84 and Sad1 share a common C-terminal region, that is often termed the SUN (Sad1 and UNC) domain. In mammals, the SUN domain is present in two proteins, Sun1 and Sun2. The SUN domain of Sun2 has been demonstrated to be in the periplasm." 
Q#5785 - CGI_10001654 superfamily 219610 25 89 1.14E-19 81.3108 cl06753 DUF1632 superfamily N - CEO family (DUF1632); These sequences are found in hypothetical eukaryotic proteins of unknown function. The region concerned is approximately 280 residues long. This family has been termed the CEO family for C. elegans ORF. Q#5787 - CGI_10018928 superfamily 141815 13 109 4.64E-36 127.865 cl04275 Mtc superfamily C - Tricarboxylate carrier; Tricarboxylate carrier. Q#5788 - CGI_10018929 superfamily 247724 72 241 2.15E-98 287.756 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." 
Q#5789 - CGI_10018930 superfamily 245230 8 436 0 949.038 cl10017 Tubulin_FtsZ superfamily - - "Tubulin/FtsZ: Family includes tubulin alpha-, beta-, gamma-, delta-, and epsilon-tubulins as well as FtsZ, all of which are involved in polymer formation. Tubulin is the major component of microtubules, but also exists as a heterodimer and as a curved oligomer. Microtubules exist in all eukaryotic cells and are responsible for many functions, including cellular transport, cell motility, and mitosis. FtsZ forms a ring-shaped septum at the site of bacterial cell division, which is required for constriction of cell membrane and cell envelope to yield two daughter cells. FtsZ can polymerize into tubes, sheets, and rings in vitro and is ubiquitous in eubacteria, archaea, and chloroplasts." Q#5790 - CGI_10018931 superfamily 247725 51 182 4.27E-70 229.063 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#5790 - CGI_10018931 superfamily 243056 464 672 2.02E-64 216.404 cl02495 RabGAP-TBC superfamily - - "Rab-GTPase-TBC domain; Identification of a TBC domain in GYP6_YEAST and GYP7_YEAST, which are GTPase activator proteins of yeast Ypt6 and Ypt7, implies that these domains are GTPase activator proteins of Rab-like small GTPases." Q#5790 - CGI_10018931 superfamily 221593 213 343 1.77E-30 118.631 cl13857 DUF3694 superfamily - - "Kinesin protein; This domain family is found in eukaryotes, and is typically between 131 and 151 amino acids in length. 
The family is found in association with pfam00225, pfam00498. There is a single completely conserved residue W that may be functionally important." Q#5792 - CGI_10018933 superfamily 247723 191 268 4.93E-35 125.975 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#5792 - CGI_10018933 superfamily 247723 427 501 2.52E-31 115.496 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. 
The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#5792 - CGI_10018933 superfamily 247723 277 355 1.76E-21 88.5111 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#5793 - CGI_10018934 superfamily 247723 6 83 8.34E-38 132.138 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. 
The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#5793 - CGI_10018934 superfamily 247723 250 313 6.98E-29 107.792 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#5793 - CGI_10018934 superfamily 247723 94 170 2.70E-18 78.4959 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. 
RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#5794 - CGI_10018935 superfamily 241889 24 185 9.03E-58 182.859 cl00474 PAP2_like superfamily - - "PAP2_like proteins, a super-family of histidine phosphatases and vanadium haloperoxidases, includes type 2 phosphatidic acid phosphatase or lipid phosphate phosphatase (LPP), Glucose-6-phosphatase, Phosphatidylglycerophosphatase B and bacterial acid phosphatase, vanadium chloroperoxidases, vanadium bromoperoxidases, and several other mostly uncharacterized subfamilies. Several members of this superfamily have been predicted to be transmembrane proteins." Q#5795 - CGI_10018936 superfamily 241846 7 377 1.14E-179 507.719 cl00409 tgt superfamily - - queuine tRNA-ribosyltransferase; Provisional Q#5796 - CGI_10018937 superfamily 243072 118 241 7.14E-18 78.5794 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#5796 - CGI_10018937 superfamily 243072 52 170 2.27E-07 48.1486 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). 
ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#5796 - CGI_10018937 superfamily 243073 321 361 9.12E-06 42.4573 cl02533 SOCS superfamily - - "SOCS (suppressors of cytokine signaling) box. The SOCS box is found in the C-terminal region of CIS/SOCS family proteins (in combination with a SH2 domain), ASBs (ankyrin repeat-containing proteins with a SOCS box), SSBs (SPRY domain-containing proteins with a SOCS box), and WSBs (WD40 repeat-containing proteins with a SOCS box), as well as, other miscellaneous proteins. The function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions." Q#5797 - CGI_10018938 superfamily 247724 5 106 1.13E-29 108.767 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#5797 - CGI_10018938 superfamily 247724 102 163 3.47E-16 72.1731 cl17170 Ras_like_GTPase superfamily NC - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. 
Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#5798 - CGI_10018939 superfamily 247724 46 164 4.16E-31 113.004 cl17170 Ras_like_GTPase superfamily N - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#5799 - CGI_10018940 superfamily 243306 6 257 6.93E-140 397.352 cl03114 RNase_PH superfamily - - "RNase PH-like 3'-5' exoribonucleases; RNase PH-like 3'-5' exoribonucleases are enzymes that catalyze the 3' to 5' processing and decay of RNA substrates. Evolutionarily related members can be found in prokaryotes, archaea, and eukaryotes. Bacterial ribonuclease PH contains a single copy of this domain, and removes nucleotide residues following the -CCA terminus of tRNA. 
Polyribonucleotide nucleotidyltransferase (PNPase) contains two tandem copies of the domain and is involved in mRNA degradation in a 3'-5' direction. Archaeal exosomes contain two individually encoded RNase PH-like 3'-5' exoribonucleases and are required for 3' processing of the 5.8S rRNA. The eukaryotic exosome core is composed of six individually encoded RNase PH-like subunits, but it is not a phosphorolytic enzyme per se; it directly associates with Rrp44 and Rrp6, which are hydrolytic exoribonucleases related to bacterial RNase II/R and RNase D. All members of the RNase PH-like family form ring structures by oligomerization of six domains or subunits, except for a total of 3 subunits with tandem repeats in the case of PNPase, with a central channel through which the RNA substrate must pass to gain access to the phosphorolytic active sites." Q#5800 - CGI_10018941 superfamily 216554 144 305 3.41E-36 129.907 cl15977 zf-DHHC superfamily - - DHHC palmitoyltransferase; This family includes the well known DHHC zinc binding domain as well as three of the four conserved transmembrane regions found in this family of palmitoyltransferase enzymes. Q#5801 - CGI_10018942 superfamily 215648 316 560 4.63E-29 116.156 cl02802 7tm_3 superfamily - - "7 transmembrane sweet-taste receptor of 3 GCPR; This is a domain of seven transmembrane regions that forms the C-terminus of some subclass 3 G-coupled-protein receptors. It is often associated with a downstream cysteine-rich linker domain, NCD3G pfam07562, which is the human sweet-taste receptor, and the N-terminal domain, ANF_receptor pfam01094. The seven TM regions assemble in such a way as to produce a docking pocket into which such molecules as cyclamate and lactisole have been found to bind and consequently confer the taste of sweetness." 
Q#5802 - CGI_10018943 superfamily 243072 503 619 3.02E-34 128.27 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#5802 - CGI_10018943 superfamily 243072 236 353 7.24E-31 118.64 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#5803 - CGI_10018944 superfamily 244951 79 340 9.27E-78 241.927 cl08428 Mannosyl_trans superfamily - - Mannosyltransferase (PIG-M); PIG-M has a DXD motif. The DXD motif is found in many glycosyltransferases that utilise nucleotide sugars. It is thought that the motif is involved in the binding of a manganese ion that is required for association of the enzymes with nucleotide sugar substrates. Q#5804 - CGI_10018945 superfamily 243130 77 114 7.58E-05 38.9855 cl02655 CUE superfamily - - "CUE domain; CUE domains have been shown to bind ubiquitin. It has been suggested that CUE domains are related to pfam00627 and this has been confirmed by the structure of the domain. CUE domains also occur in two protein of the IL-1 signal transduction pathway, tollip and TAB2." 
Q#5806 - CGI_10018947 superfamily 247792 304 326 0.0010556 36.602 cl17238 RING superfamily C - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#5807 - CGI_10018948 superfamily 245206 179 473 2.80E-109 328.419 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#5807 - CGI_10018948 superfamily 243072 57 173 2.07E-27 105.929 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). 
ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#5808 - CGI_10018949 superfamily 220226 1 256 1.16E-94 281.424 cl09658 XendoU superfamily - - Endoribonuclease XendoU; This is a family of endoribonucleases involved in RNA biosynthesis which has been named XendoU in Xenopus laevis. XendoU is a U-specific metal dependent enzyme that produces products with a 2'-3' cyclic phosphate termini. Q#5809 - CGI_10018950 superfamily 243072 453 572 6.67E-32 121.722 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#5809 - CGI_10018950 superfamily 243072 514 676 2.46E-27 108.625 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#5809 - CGI_10018950 superfamily 243072 620 748 5.43E-24 98.995 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). 
ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#5809 - CGI_10018950 superfamily 115363 4 66 9.94E-19 82.0345 cl05972 MIB_HERC2 superfamily - - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). Q#5809 - CGI_10018950 superfamily 241760 77 119 9.81E-09 52.8459 cl00295 ZZ superfamily - - "Zinc finger, ZZ type. Zinc finger present in dystrophin, CBP/p300 and many other proteins. The ZZ motif coordinates one or two zinc ions and most likely participates in ligand binding or molecular scaffolding. Many proteins containing ZZ motifs have other zinc-binding motifs as well, and the majority serve as scaffolds in pathways involving acetyltransferase, protein kinase, or ubiquitin-related activity. ZZ proteins can be grouped into the following functional classes: chromatin modifying, cytoskeletal scaffolding, ubiquitin binding or conjugating, and membrane receptor or ion-channel modifying proteins." 
Q#5809 - CGI_10018950 superfamily 247792 794 835 1.92E-07 48.9152 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#5809 - CGI_10018950 superfamily 115363 182 209 0.00795652 35.4254 cl05972 MIB_HERC2 superfamily N - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). Q#5810 - CGI_10018951 superfamily 243521 749 857 4.64E-29 112.722 cl03759 Alpha_adaptinC2 superfamily - - "Adaptin C-terminal domain; Alpha adaptin is a heterotetramer which regulates clathrin-bud formation. The carboxyl-terminal appendage of the alpha subunit regulates translocation of endocytic accessory proteins to the bud site. This ig-fold domain is found in alpha, beta and gamma adaptins." 
Q#5812 - CGI_10018953 superfamily 247792 447 488 5.70E-09 52.448 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#5812 - CGI_10018953 superfamily 222330 1 416 5.08E-115 352.841 cl16356 TRC8_N superfamily - - TRC8 N-terminal domain; This region is found at the N-terminus of the TRC8 protein. TRC8 is an E3 ubiquitin-protein ligase also known as RNF139. This region contains 12 transmembrane domains. This region has been suggested to contain a sterol sensing domain. It has been found that TRC8 protein levels are sterol responsive and that it binds and stimulates ubiquitylation of the endoplasmic reticulum anchor protein INSIG. 
Q#5814 - CGI_10018955 superfamily 241613 103 134 5.78E-05 37.5714 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#5814 - CGI_10018955 superfamily 241571 57 91 1.20E-05 40.8587 cl00049 CUB superfamily N - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." 
Q#5814 - CGI_10018955 superfamily 241613 9 39 0.00150811 34.1046 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#5815 - CGI_10018956 superfamily 243161 31 56 0.000839092 33.907 cl02739 THAP superfamily N - "THAP domain; The THAP domain is a putative DNA-binding domain (DBD) and probably also binds a zinc ion. It features the conserved C2CH architecture (consensus sequence: Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His). Other universal features include the location of the domain at the N-termini of proteins, its size of about 90 residues, a C-terminal AVPTIF box and several other conserved residues. Orthologues of the human THAP domain have been identified in other vertebrates and probably worms and flies, but not in other eukaryotes or any prokaryotes." Q#5817 - CGI_10018958 superfamily 241571 52 147 1.13E-16 73.9858 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." 
Q#5817 - CGI_10018958 superfamily 241613 153 187 7.97E-11 56.4462 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#5818 - CGI_10018959 superfamily 243107 422 468 1.87E-10 56.7468 cl02611 G-patch superfamily - - "G-patch domain; This domain is found in a number of RNA binding proteins, and is also found in proteins that contain RNA binding domains. This suggests that this domain may have an RNA binding function. This domain has seven highly conserved glycines." Q#5819 - CGI_10018960 superfamily 241646 27 75 0.000199009 35.5042 cl00156 WAP superfamily - - "whey acidic protein-type four-disulfide core domains. Members of the family include whey acidic protein, elafin (elastase-specific inhibitor), caltrin-like protein (a calcium transport inhibitor) and other extracellular proteinase inhibitors. A group of proteins containing 8 characteristically-spaced cysteine residues forming disulphide bonds, have been termed '4-disulphide core' proteins. Protease inhibition occurs by insertion of the inhibitory loop into the active site pocket and interference with the catalytic residues of the protease." 
Q#5821 - CGI_10012401 superfamily 199939 2 50 1.66E-24 89.1619 cl03508 TFIIA_gamma_N superfamily - - "Gamma subunit of transcription initiation factor IIA, N-terminal helical domain; Transcription factor II A (TFIIA) is one of the general transcription factors for RNA polymerase II. TFIIA increases the affinity of the TATA-binding protein (TBP) for DNA, in order to assemble the initiation complex. TFIIA also functions as an activator during development and differentiation, and is involved in transcription from TATA-less promoters. TFIIA is composed of more than one subunit in various organisms. Mammalian TFIIA large subunits (TFIIA alpha and beta), and the smaller subunit (TFIIA gamma) form a heterotrimer. TFIIA alpha and beta are encoded by a single TFIIA_alpha_beta gene and post-translationally processed and cleaved. TOA1 and TOA2 are the two subunits of Yeast TFIIA which correspond to Mammalian TFIIA_alpha_beta and TFIIA gamma, respectively. TOA1 and TOA2 form a heterodimeric protein complex. The TFIIA gamma subunit is highly conserved between humans, Drosophila and yeast and it is required for TFIIA function. The N-terminal domain of the gamma subunit forms a 4-helix bundle together with the alpha subunit." Q#5821 - CGI_10012401 superfamily 199942 56 101 2.71E-22 83.0659 cl08356 TFIIA_gamma_C superfamily - - "Gamma subunit of transcription initiation factor IIA, C-terminal domain; Transcription factor II A (TFIIA) is one of the general transcription factors for RNA polymerase II. TFIIA increases the affinity of the TATA-binding protein (TBP) for DNA, in order to assemble the initiation complex. TFIIA also functions as an activator during development and differentiation, and is involved in transcription from TATA-less promoters. TFIIA is composed of more than one subunit in various organisms. Mammalian TFIIA large subunits (TFIIA alpha and beta), and the smaller subunit (TFIIA gamma) form a heterotrimer. 
TFIIA alpha and beta are encoded by a single TFIIA_alpha_beta gene and post-translationally processed and cleaved. TOA1 and TOA2 are the two subunits of Yeast TFIIA which correspond to Mammalian TFIIA_alpha_beta and TFIIA gamma, respectively. TOA1 and TOA2 form a heterodimeric protein complex. The TFIIA gamma subunit is highly conserved between humans, Drosophila and yeast and it is required for TFIIA function. The C-terminal domain of the gamma (TFIIA_gamma_C) subunit forms a beta-barrel structure together with TFIIA beta." Q#5822 - CGI_10012402 superfamily 241615 39 81 3.15E-07 45.5541 cl00107 LysM superfamily - - "Lysine Motif is a small domain involved in binding peptidoglycan; LysM, a small globular domain with approximately 40 amino acids, is a widespread protein module involved in binding peptidoglycan in bacteria and chitin in eukaryotes. The domain was originally identified in enzymes that degrade bacterial cell walls, but proteins involved in many other biological functions also contain this domain. It has been reported that the LysM domain functions as a signal for specific plant-bacteria recognition in bacterial pathogenesis. Many of these enzymes are modular and are composed of catalytic units linked to one or several repeats of LysM domains. LysM domains are found in bacteria and eukaryotes." Q#5823 - CGI_10012403 superfamily 243182 1 158 4.56E-56 182.77 cl02784 Chelatase_Class_II superfamily - - "Class II Chelatase: a family of ATP-independent monomeric or homodimeric enzymes that catalyze the insertion of metal into protoporphyrin rings. This family includes protoporphyrin IX ferrochelatase (HemH), sirohydrochlorin ferrochelatase (SirB) and the cobaltochelatases, CbiK and CbiX. HemH and SirB are involved in heme and siroheme biosynthesis, respectively, while the cobaltochelatases are associated with cobalamin biosynthesis. 
Excluded from this family are the ATP-dependent heterotrimeric chelatases (class I) and the multifunctional homodimeric enzymes with dehydrogenase and chelatase activities (class III)." Q#5823 - CGI_10012403 superfamily 243182 163 299 4.37E-47 158.462 cl02784 Chelatase_Class_II superfamily - - "Class II Chelatase: a family of ATP-independent monomeric or homodimeric enzymes that catalyze the insertion of metal into protoporphyrin rings. This family includes protoporphyrin IX ferrochelatase (HemH), sirohydrochlorin ferrochelatase (SirB) and the cobaltochelatases, CbiK and CbiX. HemH and SirB are involved in heme and siroheme biosynthesis, respectively, while the cobaltochelatases are associated with cobalamin biosynthesis. Excluded from this family are the ATP-dependent heterotrimeric chelatases (class I) and the multifunctional homodimeric enzymes with dehydrogenase and chelatase activities (class III)." Q#5825 - CGI_10012405 superfamily 247724 145 273 6.79E-23 97.9118 cl17170 Ras_like_GTPase superfamily N - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. 
The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#5825 - CGI_10012405 superfamily 241972 1110 1196 4.08E-08 52.9667 cl00600 Ribosomal_L7Ae superfamily - - "Ribosomal protein L7Ae/L30e/S12e/Gadd45 family; This family includes: Ribosomal L7A from metazoa, Ribosomal L8-A and L8-B from fungi, 30S ribosomal protein HS6 from archaebacteria, 40S ribosomal protein S12 from eukaryotes, Ribosomal protein L30 from eukaryotes and archaebacteria. Gadd45 and MyD118." Q#5826 - CGI_10012406 superfamily 247724 6 172 5.20E-121 343.096 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. 
Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#5827 - CGI_10012407 superfamily 247724 89 241 2.91E-75 229.746 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#5827 - CGI_10012407 superfamily 247724 2 89 3.06E-42 144.521 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. 
This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#5828 - CGI_10012408 superfamily 241705 16 118 1.63E-56 173.132 cl00228 HIT_like superfamily - - "HIT family: HIT (Histidine triad) proteins, named for a motif related to the sequence HxHxH/Qxx (x, a hydrophobic amino acid), are a superfamily of nucleotide hydrolases and transferases, which act on the alpha-phosphate of ribonucleotides. On the basis of sequence, substrate specificity, structure, evolution and mechanism, HIT proteins are classified in the literature into three major branches: the Hint branch, which consists of adenosine 5'-monophosphoramide hydrolases, the Fhit branch, that consists of diadenosine polyphosphate hydrolases, and the GalT branch consisting of specific nucleoside monophosphate transferases. Further sequence analysis reveals several new closely related, yet uncharacterized subgroups." 
Q#5829 - CGI_10012409 superfamily 241563 61 99 7.08E-06 43.622 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#5832 - CGI_10009690 superfamily 247041 50 278 5.75E-39 139.009 cl15692 CE4_SF superfamily - - "Catalytic NodB homology domain of the carbohydrate esterase 4 superfamily; The carbohydrate esterase 4 (CE4) superfamily mainly includes chitin deacetylases (EC 3.5.1.41), bacterial peptidoglycan N-acetylglucosamine deacetylases (EC 3.5.1.-), and acetylxylan esterases (EC 3.1.1.72), which catalyze the N- or O-deacetylation of substrates such as acetylated chitin, peptidoglycan, and acetylated xylan, respectively. Members in this superfamily contain a NodB homology domain that adopts a deformed (beta/alpha)8 barrel fold, which encompasses a mononuclear metalloenzyme employing a conserved His-His-Asp zinc-binding triad, closely associated with the conserved catalytic base (aspartic acid) and acid (histidine) to carry out acid/base catalysis. The NodB homology domain of CE4 superfamily is remotely related to the 7-stranded beta/alpha barrel catalytic domain of the superfamily consisting of family 38 glycoside hydrolases (GH38), family 57 heat stable retaining glycoside hydrolases (GH57), lactam utilization protein LamB/YcsF family proteins, and YdjC-family proteins." Q#5833 - CGI_10009691 superfamily 243035 75 188 2.63E-13 63.0225 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. 
Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer.
A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#5834 - CGI_10009692 superfamily 243035 75 188 1.69E-13 63.4077 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors.
C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#5835 - CGI_10009693 superfamily 243034 108 194 8.28E-11 59.316 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#5835 - CGI_10009693 superfamily 243034 434 515 3.64E-09 54.3084 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of
TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#5835 - CGI_10009693 superfamily 243034 337 480 3.92E-06 45.4488 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110
subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#5835 - CGI_10009693 superfamily 243034 484 551 0.000699942 38.5152 cl02429 TPR superfamily C - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#5835 - CGI_10009693 superfamily 243034 296 368 0.00902121 35.0484 cl02429 TPR superfamily N - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in
chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#5836 - CGI_10009694 superfamily 248458 21 169 4.68E-06 46.9233 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event.
Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#5837 - CGI_10009695 superfamily 248458 21 166 1.17E-06 48.4641 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 
Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#5838 - CGI_10009696 superfamily 245201 377 627 2.25E-70 231.64 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#5838 - CGI_10009696 superfamily 241645 190 266 1.14E-23 96.1044 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." 
Q#5838 - CGI_10009696 superfamily 241645 67 148 2.07E-23 95.334 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#5841 - CGI_10009699 superfamily 247684 43 264 1.52E-61 209.056 cl17037 NBD_sugar-kinase_HSP70_actin superfamily C - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. 
The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#5841 - CGI_10009699 superfamily 247684 246 368 3.44E-22 97.3479 cl17037 NBD_sugar-kinase_HSP70_actin superfamily N - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#5842 - CGI_10009700 superfamily 159569 269 307 3.90E-05 40.3941 cl11598 Thymosin superfamily - - Thymosin beta-4 family; Thymosin beta-4 family. Q#5842 - CGI_10009700 superfamily 159569 231 271 0.00281321 35.0013 cl11598 Thymosin superfamily - - Thymosin beta-4 family; Thymosin beta-4 family. Q#5842 - CGI_10009700 superfamily 159569 201 233 0.00306137 35.0013 cl11598 Thymosin superfamily - - Thymosin beta-4 family; Thymosin beta-4 family. Q#5843 - CGI_10009701 superfamily 217293 20 216 4.82E-25 103.481 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#5843 - CGI_10009701 superfamily 217293 290 470 3.99E-24 100.785 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. 
Q#5847 - CGI_10009705 superfamily 243034 212 311 1.42E-22 93.5987 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#5847 - CGI_10009705 superfamily 243034 280 379 1.84E-22 93.5987 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of
TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#5847 - CGI_10009705 superfamily 243034 76 175 6.55E-21 88.9763 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit
of O-GlcNAc transferase; three copies of the repeat are present here" Q#5847 - CGI_10009705 superfamily 243034 352 447 5.10E-20 86.6651 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#5847 - CGI_10009705 superfamily 243034 177 210 0.000208 39.7084 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone,
cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#5849 - CGI_10009707 superfamily 248022 95 364 5.41E-51 176.699 cl17468 Aa_trans superfamily C - "Transmembrane amino acid transporter protein; This transmembrane region is found in many amino acid transporters including UNC-47 and MTR. UNC-47 encodes a vesicular amino butyric acid (GABA) transporter, (VGAT). UNC-47 is predicted to have 10 transmembrane domains. MTR is a N system amino acid transporter system protein involved in methyltryptophan resistance. Other members of this family include proline transporters and amino acid permeases." Q#5850 - CGI_10009708 superfamily 100115 77 440 2.51E-106 323.467 cl18930 StaR_like superfamily - - "StaR_like; a well-conserved protein found in bacteria, plants, and animals. A family member from Streptomyces toyocaensis, StaR is part of a gene cluster involved in the biosynthesis of glycopeptide antibiotics (GPAs), specifically A47934. It has been speculated that StaR could be a flavoprotein hydroxylating a tyrosine sidechain. Some family members have been annotated as proteins containing tetratricopeptide (TPR) repeats, which may at least indicate mostly alpha-helical secondary structure."
Q#5851 - CGI_10009709 superfamily 241832 30 117 1.36E-42 142.492 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#5851 - CGI_10009709 superfamily 243175 131 255 1.02E-29 109.718 cl02776 GST_C_family superfamily - - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). 
Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." Q#5853 - CGI_10009712 superfamily 243124 61 219 4.20E-39 138.714 cl02648 NIDO superfamily - - Nidogen-like; This is a nidogen-like domain (NIDO) domain and is an extracellular domain found in nidogen and hypothetical proteins of unknown function. Q#5853 - CGI_10009712 superfamily 247038 232 305 0.000510048 38.2112 cl15674 IPT superfamily - - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. 
In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." Q#5867 - CGI_10025093 superfamily 241572 64 148 8.92E-09 52.2409 cl00050 CYCLIN superfamily - - "Cyclin box fold. Protein binding domain functioning in cell-cycle and transcription control. Present in cyclins, TFIIB and Retinoblastoma (RB).The cyclins consist of 8 classes of cell cycle regulators that regulate cyclin dependent kinases (CDKs). TFIIB is a transcription factor that binds the TATA box. Cyclins, TFIIB and RB contain 2 copies of the domain." Q#5870 - CGI_10025096 superfamily 145792 5 25 0.00741293 29.9215 cl10597 Antistasin superfamily - - Antistasin family; Members of this family are inhibitors of trypsin family proteases. This domain is highly disulphide bonded. The domain is also found in some large extracellular proteins in multiple copies. Q#5871 - CGI_10025097 superfamily 247724 81 126 1.14E-25 102.578 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. 
The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#5873 - CGI_10025101 superfamily 217410 25 78 1.41E-08 50.0452 cl18409 DDE_1 superfamily NC - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction. Interestingly this family also includes the CENP-B protein. This domain in that protein appears to have lost the metal binding residues and is unlikely to have endonuclease activity. Centromere Protein B (CENP-B) is a DNA-binding protein localised to the centromere." Q#5877 - CGI_10025105 superfamily 207684 128 157 0.0049646 33.2288 cl02640 SAP superfamily - - "SAP domain; The SAP (after SAF-A/B, Acinus and PIAS) motif is a putative DNA/RNA binding domain found in diverse nuclear and cytoplasmic proteins." Q#5878 - CGI_10025106 superfamily 241832 13 155 3.95E-31 111.838 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. 
The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#5879 - CGI_10025107 superfamily 241578 796 1038 2.09E-109 346.568 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#5879 - CGI_10025107 superfamily 219707 1039 1123 2.91E-33 125.321 cl06871 Sec23_BS superfamily - - Sec23/Sec24 beta-sandwich domain; Sec23/Sec24 beta-sandwich domain. Q#5879 - CGI_10025107 superfamily 218277 1135 1236 1.29E-22 95.2902 cl04773 Sec23_helical superfamily - - "Sec23/Sec24 helical domain; COPII-coated vesicles carry proteins from the endoplasmic reticulum to the Golgi complex. 
This vesicular transport can be reconstituted by using three cytosolic components containing five proteins: the small GTPase Sar1p, the Sec23p/24p complex, and the Sec13p/Sec31p complex. This domain is composed of five alpha helices." Q#5879 - CGI_10025107 superfamily 203092 723 760 2.17E-14 69.8967 cl04769 zf-Sec23_Sec24 superfamily - - "Sec23/Sec24 zinc finger; COPII-coated vesicles carry proteins from the endoplasmic reticulum to the Golgi complex. This vesicular transport can be reconstituted by using three cytosolic components containing five proteins: the small GTPase Sar1p, the Sec23p/24p complex, and the Sec13p/Sec31p complex. This domain is found to be zinc binding domain." Q#5879 - CGI_10025107 superfamily 247044 1257 1326 0.000179589 41.5164 cl15697 ADF_gelsolin superfamily - - Actin depolymerization factor/cofilin- and gelsolin-like domains; Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. Q#5881 - CGI_10025109 superfamily 241754 83 852 0 1279.45 cl00286 Motor_domain superfamily - - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. Q#5881 - CGI_10025109 superfamily 111612 35 77 4.41E-14 68.6618 cl03686 Myosin_N superfamily - - Myosin N-terminal SH3-like domain; This domain has an SH3-like fold. It is found at the N-terminus of many but not all myosins. The function of this domain is unknown. Q#5881 - CGI_10025109 superfamily 151039 1016 1153 0.000893146 39.3963 cl11115 Cenp-F_leu_zip superfamily - - "Leucine-rich repeats of kinetochore protein Cenp-F/LEK1; Cenp-F, a centromeric kinetochore, microtubule-binding protein consisting of two 1,600-amino acid-long coils, is essential for the full functioning of the mitotic checkpoint pathway. 
There are several leucine-rich repeats along the sequence of LEK1 that are considered to be zippers, though they do not appear to be binding DNA directly in this instance." Q#5881 - CGI_10025109 superfamily 210118 855 877 0.0016863 37.6903 cl15479 IQ superfamily - - IQ calmodulin-binding motif; Calmodulin-binding motif. Q#5884 - CGI_10025112 superfamily 241578 1 77 1.62E-09 50.753 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." 
Q#5886 - CGI_10025114 superfamily 243092 249 319 2.39E-15 77.3752 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#5886 - CGI_10025114 superfamily 243092 501 690 2.35E-09 58.8856 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#5886 - CGI_10025114 superfamily 243092 286 387 3.11E-05 46.174 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#5887 - CGI_10025115 superfamily 219125 393 584 8.65E-82 257.256 cl05941 C5-epim_C superfamily - - D-glucuronyl C5-epimerase C-terminus; This family represents the C-terminus of D-glucuronyl C5-epimerase (EC:5.1.3.-). Glucuronyl C5-epimerases catalyze the conversion of D-glucuronic acid (GlcUA) to L-iduronic acid (IdceA) units during the biosynthesis of glycosaminoglycans. Q#5889 - CGI_10025117 superfamily 247725 445 595 3.20E-63 210.595 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. 
All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#5889 - CGI_10025117 superfamily 243096 244 433 9.82E-35 131.651 cl02571 RhoGEF superfamily - - Guanine nucleotide exchange factor for Rho/Rac/Cdc42-like GTPases; Also called Dbl-homologous (DH) domain. It appears that PH domains invariably occur C-terminal to RhoGEF/DH domains. Q#5890 - CGI_10025118 superfamily 243555 74 259 3.17E-21 88.6022 cl03871 Chitin_bind_3 superfamily - - "Chitin binding domain; This domain is found associated with a wide variety of cellulose binding domain. This domain however is a chitin binding domain. This domain is found in isolation in baculoviral spheroidins and spindolins, protein of unknown function." Q#5891 - CGI_10025119 superfamily 243555 16 201 1.81E-22 91.6838 cl03871 Chitin_bind_3 superfamily - - "Chitin binding domain; This domain is found associated with a wide variety of cellulose binding domain. This domain however is a chitin binding domain. This domain is found in isolation in baculoviral spheroidins and spindolins, protein of unknown function." Q#5892 - CGI_10025120 superfamily 248097 7 124 4.31E-13 61.127 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#5894 - CGI_10025122 superfamily 110440 331 352 0.00410926 34.6909 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. 
The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#5896 - CGI_10025124 superfamily 247743 478 623 2.57E-06 47.9111 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." 
Q#5896 - CGI_10025124 superfamily 243092 1381 1607 8.60E-13 69.6712 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#5896 - CGI_10025124 superfamily 243092 1001 1309 4.95E-09 58.1152 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#5898 - CGI_10025126 superfamily 220080 265 395 6.11E-61 199.312 cl07526 DUF1900 superfamily - - "Domain of unknown function (DUF1900); This domain is predominantly found in the structural protein coronin, and is duplicated in some sequences. It has no known function." Q#5898 - CGI_10025126 superfamily 149883 10 73 6.07E-35 125.814 cl07525 DUF1899 superfamily - - Domain of unknown function (DUF1899); This set of domains is found in various eukaryotic proteins. Function is unknown. 
Q#5898 - CGI_10025126 superfamily 243092 81 224 1.08E-17 82.3828 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#5900 - CGI_10025128 superfamily 241758 194 518 6.44E-108 335.645 cl00292 AANH_like superfamily - - "Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases, ATP sulphurylases Universal Stress Response protein and electron transfer flavoprotein (ETF). The domain forms a apha/beta/apha fold which binds to Adenosine nucleotide." Q#5900 - CGI_10025128 superfamily 241555 20 163 5.83E-66 218.559 cl00020 GAT_1 superfamily - - "Type 1 glutamine amidotransferase (GATase1)-like domain; Type 1 glutamine amidotransferase (GATase1)-like domain. 
This group contains proteins similar to Class I glutamine amidotransferases, the intracellular PH1704 from Pyrococcus horikoshii, the C-terminal of the large catalase: Escherichia coli HP-II, Sinorhizobium meliloti Rm1021 ThuA, the A4 beta-galactosidase middle domain and peptidase E. The majority of proteins in this group have a reactive Cys found in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow. For Class I glutamine amidotransferases proteins which transfer ammonia from the amide side chain of glutamine to an acceptor substrate, this Cys forms a Cys-His-Glu catalytic triad in the active site. Glutamine amidotransferases activity can be found in a range of biosynthetic enzymes included in this cd: glutamine amidotransferase, formylglycinamide ribonucleotide, GMP synthetase, anthranilate synthase component II, glutamine-dependent carbamoyl phosphate synthase (CPSase), cytidine triphosphate synthetase, gamma-glutamyl hydrolase, imidazole glycerol phosphate synthase and, cobyric acid synthase. For Pyrococcus horikoshii PH1704, the Cys of the nucleophile elbow together with a different His and, a Glu from an adjacent monomer form a catalytic triad different from the typical GATase1 triad. Peptidase E is believed to be a serine peptidase having a Ser-His-Glu catalytic triad which differs from the Cys-His-Glu catalytic triad of typical GATase1 domains, by having a Ser in place of the reactive Cys at the nucleophile elbow. The E. coli HP-II C-terminal domain, S. meliloti Rm1021 ThuA and the A4 beta-galactosidase middle domain lack the catalytic triad typical GATaseI domains. GATase1-like domains can occur either as single polypeptides, as in Class I glutamine amidotransferases, or as domains in a much larger multifunctional synthase protein, such as CPSase. Peptidase E has a circular permutation in the common core of a typical GTAse1 domain." 
Q#5900 - CGI_10025128 superfamily 241758 541 613 0.00418141 38.656 cl00292 AANH_like superfamily N - "Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases, ATP sulphurylases Universal Stress Response protein and electron transfer flavoprotein (ETF). The domain forms a apha/beta/apha fold which binds to Adenosine nucleotide." Q#5901 - CGI_10025129 superfamily 243092 259 545 1.55E-21 97.0204 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#5901 - CGI_10025129 superfamily 243092 707 806 0.000178197 44.248 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#5902 - CGI_10025130 superfamily 218284 20 187 1.63E-57 180.913 cl04786 SOUL superfamily - - SOUL heme-binding protein; This family represents a group of putative heme-binding proteins. Our family includes archaeal and bacterial homologues. Q#5904 - CGI_10025132 superfamily 247724 50 255 1.98E-70 217.574 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. 
This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#5907 - CGI_10025135 superfamily 241758 13 159 2.19E-26 97.8258 cl00292 AANH_like superfamily - - "Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases, ATP sulphurylases Universal Stress Response protein and electron transfer flavoprotein (ETF). The domain forms a apha/beta/apha fold which binds to Adenosine nucleotide." Q#5911 - CGI_10027701 superfamily 242889 329 427 1.22E-21 89.5845 cl02111 PCI superfamily - - "PCI domain; This domain has also been called the PINT motif (Proteasome, Int-6, Nip-1 and TRIP-15)." Q#5911 - CGI_10027701 superfamily 149439 434 464 2.50E-09 53.8682 cl07120 Rpn3_C superfamily C - Proteasome regulatory subunit C-terminal; This eukaryotic domain is found at the C-terminus of 26S proteasome regulatory subunits such as the non-ATPase Rpn3 subunit which is essential for proteasomal function. It occurs together with the PCI/PINT domain (pfam01399). 
Q#5911 - CGI_10027701 superfamily 243034 255 288 0.00679977 34.4123 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#5915 - CGI_10027705 superfamily 207627 1288 1382 5.21E-13 67.6599 cl02522 Calx-beta superfamily - - Calx-beta domain; Calx-beta domain. Q#5915 - CGI_10027705 superfamily 207627 925 1023 3.05E-12 65.3535 cl02522 Calx-beta superfamily - - Calx-beta domain; Calx-beta domain. Q#5915 - CGI_10027705 superfamily 207627 1055 1143 2.46E-11 63.0375 cl02522 Calx-beta superfamily - - Calx-beta domain; Calx-beta domain. Q#5915 - CGI_10027705 superfamily 207627 805 898 3.67E-10 59.5755 cl02522 Calx-beta superfamily - - Calx-beta domain; Calx-beta domain. 
Q#5915 - CGI_10027705 superfamily 207627 1169 1260 3.85E-08 53.4123 cl02522 Calx-beta superfamily - - Calx-beta domain; Calx-beta domain. Q#5916 - CGI_10027706 superfamily 247792 11 55 0.00819407 35.114 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#5916 - CGI_10027706 superfamily 241563 155 187 0.000994962 37.844 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#5917 - CGI_10027707 superfamily 247723 91 168 2.98E-25 99.7417 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. 
The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#5917 - CGI_10027707 superfamily 247723 462 537 2.88E-13 65.7618 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#5917 - CGI_10027707 superfamily 207717 25 77 2.71E-07 48.3836 cl02755 LAM superfamily N - "LA motif RNA-binding domain; This domain is found at the N-terminus of La RNA-binding proteins as well as in other related proteins. Typically, the domain co-occurs with an RNA-recognition motif (RRM), and together these domains function to bind primary transcripts of RNA polymerase III in the La autoantigen (Lupus La protein, LARP3, or Sjoegren syndrome type B antigen, SS-B). A variety of La-related proteins (LARPs or La ribonucleoproteins), with differing domain architecture, appear to function as RNA-binding proteins in eukaryotic cellular processes." 
Q#5918 - CGI_10027708 superfamily 243072 60 184 4.18E-29 112.092 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#5918 - CGI_10027708 superfamily 243072 126 255 2.26E-21 90.1354 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#5918 - CGI_10027708 superfamily 216554 368 530 1.64E-24 100.247 cl15977 zf-DHHC superfamily - - DHHC palmitoyltransferase; This family includes the well known DHHC zinc binding domain as well as three of the four conserved transmembrane regions found in this family of palmitoyltransferase enzymes. Q#5919 - CGI_10027709 superfamily 243053 863 1068 9.75E-66 224.054 cl02485 RasGEF superfamily - - "Guanine nucleotide exchange factor for Ras-like small GTPases. Small GTP-binding proteins of the Ras superfamily function as molecular switches in fundamental events such as signal transduction, cytoskeleton dynamics and intracellular trafficking. Guanine-nucleotide-exchange factors (GEFs) positively regulate these GTP-binding proteins in response to a variety of signals. GEFs catalyze the dissociation of GDP from the inactive GTP-binding proteins. 
GTP can then bind and induce structural changes that allow interaction with effectors." Q#5919 - CGI_10027709 superfamily 241622 538 619 1.56E-19 85.6962 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archebacteria, and eurkayotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#5919 - CGI_10027709 superfamily 243067 430 535 2.27E-15 74.7563 cl02520 REM superfamily - - "Guanine nucleotide exchange factor for Ras-like GTPases; N-terminal domain (RasGef_N), also called REM domain (Ras exchanger motif). This domain is common in nucleotide exchange factors for Ras-like small GTPases and is typically found immediately N-terminal to the RasGef (Cdc25-like) domain. REM contacts the GTPase and is assumed to participate in the catalytic activity of the exchange factor. 
Proteins with the REM domain include Sos1 and Sos2, which relay signals from tyrosine-kinase mediated signalling to Ras, RasGRP1-4, RasGRF1,2, CNrasGEF, and RAP-specific nucleotide exchange factors, to name a few." Q#5919 - CGI_10027709 superfamily 241570 291 393 1.12E-12 66.5806 cl00047 CAP_ED superfamily - - "effector domain of the CAP family of transcription factors; members include CAP (or cAMP receptor protein (CRP)), which binds cAMP, FNR (fumarate and nitrate reduction), which uses an iron-sulfur cluster to sense oxygen) and CooA, a heme containing CO sensor. In all cases binding of the effector leads to conformational changes and the ability to activate transcription. Cyclic nucleotide-binding domain similar to CAP are also present in cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) and vertebrate cyclic nucleotide-gated ion-channels. Cyclic nucleotide-monophosphate binding domain; proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues; the best studied is the prokaryotic catabolite gene activator, CAP, where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure; three conserved glycine residues are thought to be essential for maintenance of the structural integrity of the beta-barrel; CooA is a homodimeric transcription factor that belongs to CAP family; cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain; cAPK's are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain; cGPK's are single chain enzymes that include the two copies of the domain in their N-terminal section; also found in vertebrate cyclic nucleotide-gated ion-channels" Q#5919 - CGI_10027709 superfamily 241645 756 840 5.04E-37 136.147 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and 
ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#5921 - CGI_10027711 superfamily 245819 1067 1243 1.05E-61 210.128 cl11967 Nucleotidyl_cyc_III superfamily - - "Class III nucleotidyl cyclases; Class III nucleotidyl cyclases are the largest, most diverse group of nucleotidyl cyclases (NC's) containing prokaryotic and eukaryotic proteins. They can be divided into two major groups; the mononucleotidyl cyclases (MNC's) and the diguanylate cyclases (DGC's). The MNC's, which include the adenylate cyclases (AC's) and the guanylate cyclases (GC's), have a conserved cyclase homology domain (CHD), while the DGC's have a conserved GGDEF domain, named after a conserved motif within this subgroup. Their products, cyclic guanylyl and adenylyl nucleotides, are second messengers that play important roles in eukaryotic signal transduction and prokaryotic sensory pathways." Q#5921 - CGI_10027711 superfamily 245201 751 925 7.66E-30 119.649 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. 
These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#5921 - CGI_10027711 superfamily 245225 166 542 1.57E-75 257.644 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily - - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein couples receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine- binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. 
Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#5921 - CGI_10027711 superfamily 219526 1012 1053 1.12E-05 46.4583 cl06648 HNOBA superfamily N - "Heme NO binding associated; The HNOBA domain is found associated with the HNOB domain and pfam00211 in soluble cyclases and signalling proteins. The HNOB domain is predicted to function as a heme-dependent sensor for gaseous ligands, and transduce diverse downstream signals, in both bacteria and animals." Q#5923 - CGI_10027713 superfamily 245225 27 117 3.64E-06 43.7707 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily C - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein couples receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine- binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. 
The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#5924 - CGI_10027714 superfamily 201526 18 61 1.20E-08 47.1477 cl09522 Synaptobrevin superfamily C - Synaptobrevin; Synaptobrevin. Q#5925 - CGI_10027715 superfamily 201526 231 291 1.81E-13 65.2521 cl09522 Synaptobrevin superfamily C - Synaptobrevin; Synaptobrevin. Q#5925 - CGI_10027715 superfamily 201526 7 51 1.46E-11 59.8593 cl09522 Synaptobrevin superfamily C - Synaptobrevin; Synaptobrevin. Q#5926 - CGI_10027716 superfamily 201526 424 492 9.74E-14 67.1781 cl09522 Synaptobrevin superfamily - - Synaptobrevin; Synaptobrevin. Q#5928 - CGI_10027719 superfamily 192535 59 277 3.36E-07 49.9018 cl18179 7TM_GPCR_Srsx superfamily C - Serpentine type 7TM GPCR chemoreceptor Srsx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srsx is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#5930 - CGI_10027721 superfamily 247683 52 106 1.69E-21 87.638 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. 
Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#5930 - CGI_10027721 superfamily 245201 230 436 9.71E-145 417.107 cl09925 PKc_like superfamily C - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#5930 - CGI_10027721 superfamily 246908 114 215 4.08E-47 159.285 cl15255 SH2 superfamily - - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 
They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." Q#5931 - CGI_10027722 superfamily 241695 174 356 6.21E-68 215.033 cl00217 pyrophosphatase superfamily - - Inorganic pyrophosphatase. These enzymes hydrolyze inorganic pyrophosphate (PPi) to two molecules of orthophosphates (Pi). The reaction requires bivalent cations. The enzymes in general exist as homooligomers. Q#5931 - CGI_10027722 superfamily 241695 2 151 1.68E-55 186.499 cl00217 pyrophosphatase superfamily C - Inorganic pyrophosphatase. These enzymes hydrolyze inorganic pyrophosphate (PPi) to two molecules of orthophosphates (Pi). The reaction requires bivalent cations. The enzymes in general exist as homooligomers. Q#5932 - CGI_10027723 superfamily 241597 179 250 3.27E-36 128.185 cl00082 HMG-box superfamily - - "High Mobility Group (HMG)-box is found in a variety of eukaryotic chromosomal proteins and transcription factors. HMGs bind to the minor groove of DNA and have been classified by DNA binding preferences. Two phylogenically distinct groups of Class I proteins bind DNA in a sequence specific fashion and contain a single HMG box. One group (SOX-TCF) includes transcription factors, TCF-1, -3, -4; and also SRY and LEF-1, which bind four-way DNA junctions and duplex DNA targets. The second group (MATA) includes fungal mating type gene products MC, MATA1 and Ste11. Class II and III proteins (HMGB-UBF) bind DNA in a non-sequence specific fashion and contain two or more tandem HMG boxes. 
Class II members include non-histone chromosomal proteins, HMG1 and HMG2, which bind to bent or distorted DNA such as four-way DNA junctions, synthetic DNA cruciforms, kinked cisplatin-modified DNA, DNA bulges, cross-overs in supercoiled DNA, and can cause looping of linear DNA. Class III members include nucleolar and mitochondrial transcription factors, UBF and mtTF1, which bind four-way DNA junctions." Q#5932 - CGI_10027723 superfamily 219805 24 75 4.54E-05 42.9668 cl07093 CTNNB1_binding superfamily N - "N-terminal CTNNB1 binding; This region tends to appear at the N-terminus of proteins also containing DNA-binding HMG (high mobility group) boxes (pfam00505) and appears to bind the armadillo repeat of CTNNB1 (beta-catenin), forming a stable complex. Signaling by Wnt through TCF/LCF is involved in developmental patterning, induction of neural tissues, cell fate decisions and stem cell differentiation. Isoforms of HMG T-cell factors lacking the N-terminal CTNNB1-binding domain cannot fulfill their role as transcriptional activators in T-cell differentiation." Q#5935 - CGI_10027726 superfamily 219805 1 47 6.28E-11 54.5228 cl07093 CTNNB1_binding superfamily C - "N-terminal CTNNB1 binding; This region tends to appear at the N-terminus of proteins also containing DNA-binding HMG (high mobility group) boxes (pfam00505) and appears to bind the armadillo repeat of CTNNB1 (beta-catenin), forming a stable complex. Signaling by Wnt through TCF/LCF is involved in developmental patterning, induction of neural tissues, cell fate decisions and stem cell differentiation. Isoforms of HMG T-cell factors lacking the N-terminal CTNNB1-binding domain cannot fulfill their role as transcriptional activators in T-cell differentiation." 
Q#5936 - CGI_10027727 superfamily 243092 136 247 1.24E-13 67.36 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#5937 - CGI_10027728 superfamily 243058 32 150 2.44E-09 55.3983 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. 
ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#5937 - CGI_10027728 superfamily 243058 435 557 4.03E-08 51.9315 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#5937 - CGI_10027728 superfamily 243058 369 469 8.30E-05 41.5312 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#5937 - CGI_10027728 superfamily 243058 544 633 0.000195873 40.3756 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. 
An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#5938 - CGI_10027729 superfamily 243859 5 97 6.72E-17 70.4366 cl04722 PLAC8 superfamily - - PLAC8 family; This family includes the Placenta-specific gene 8 protein. Q#5941 - CGI_10027732 superfamily 243092 68 401 6.90E-49 169.438 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of 
the repeat are present in this alignment." Q#5942 - CGI_10027733 superfamily 247683 67 129 7.87E-33 120.075 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#5942 - CGI_10027733 superfamily 247744 238 409 5.09E-22 93.5115 cl17190 NK superfamily - - "Nucleoside/nucleotide kinase (NK) is a protein superfamily consisting of multiple families of enzymes that share structural similarity and are functionally related to the catalysis of the reversible phosphate group transfer from nucleoside triphosphates to nucleosides/nucleotides, nucleoside monophosphates, or sugars. Members of this family play a wide variety of essential roles in nucleotide metabolism, the biosynthesis of coenzymes and aromatic compounds, as well as the metabolism of sugar and sulfate." Q#5943 - CGI_10027734 superfamily 246710 4 105 1.19E-18 82.9355 cl14783 DOMON_like superfamily - - "Domon-like ligand-binding domains; DOMON-like domains can be found in all three kindgoms of life and are a diverse group of ligand binding domains that have been shown to interact with sugars and hemes. 
DOMON domains were initially thought to confer protein-protein interactions. They were subsequently found as a heme-binding motif in cellobiose dehydrogenase, an extracellular fungal oxidoreductase that degrades both lignin and cellulose, and in ethylbenzene dehydrogenase, an enzyme that aids in the anaerobic degradation of hydrocarbons. The domain interacts with sugars in the type 9 carbohydrate binding modules (CBM9), which are present in a variety of glycosyl hydrolases, and it can also be found at the N-terminus of sensor histidine kinases." Q#5943 - CGI_10027734 superfamily 217685 292 450 2.49E-50 171.748 cl04225 Cu2_monoox_C superfamily - - "Copper type II ascorbate-dependent monooxygenase, C-terminal domain; The N and C-terminal domains of members of this family adopt the same PNGase F-like fold." Q#5943 - CGI_10027734 superfamily 216290 146 277 1.45E-40 143.583 cl03089 Cu2_monooxygen superfamily - - "Copper type II ascorbate-dependent monooxygenase, N-terminal domain; The N and C-terminal domains of members of this family adopt the same PNGase F-like fold." Q#5946 - CGI_10027737 superfamily 217414 439 840 3.40E-93 301.556 cl03927 Otopetrin superfamily - - "Protein of unknown function, DUF270; Protein of unknown function, DUF270. " Q#5946 - CGI_10027737 superfamily 247944 186 244 0.00137181 40.819 cl17390 Herpes_UL37_2 superfamily C - Betaherpesvirus immediate-early glycoprotein UL37; This family consists of several Betaherpesvirus immediate-early glycoprotein UL37 sequences. The human cytomegalovirus (HCMV) UL37 immediate-early regulatory protein is a type I integral membrane N-glycoprotein which traffics through the ER and the Golgi network. Q#5948 - CGI_10027739 superfamily 243035 22 134 1.91E-13 64.1781 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. 
This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. 
In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#5951 - CGI_10027742 superfamily 241547 33 260 9.03E-99 291.109 cl00012 alpha_CA superfamily - - "Carbonic anhydrase alpha (vertebrate-like) group. Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionary distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidine residues and a fourth conserved histidine plays a potential role in proton transfer." Q#5952 - CGI_10027743 superfamily 247743 1032 1124 0.00475791 37.66 cl17189 AAA superfamily N - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. 
The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#5953 - CGI_10027744 superfamily 203031 72 132 0.00109785 35.3816 cl04548 FLYWCH superfamily - - "FLYWCH zinc finger domain; Mutations in the mod(mdg4) gene have effects on variegation (PEV), the properties of insulator sequences, correct path-finding of growing nerve cells, meiotic pairing of chromosomes, and apoptosis. The occurrence of FLYWCH motifs in mod(mdg4) gene product and other proteins is discussed in." Q#5954 - CGI_10027745 superfamily 193256 465 736 3.84E-71 242.161 cl18189 AAA_8 superfamily - - "P-loop containing dynein motor region D4; The 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This particular family is the D4 ATP-binding region of the motor." Q#5954 - CGI_10027745 superfamily 193257 1249 1479 5.10E-60 208.686 cl15086 AAA_9 superfamily - - "ATP-binding dynein motor region D5; The 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This particular family is the D5 ATP-binding region of the motor, but has lost its P-loop." Q#5954 - CGI_10027745 superfamily 193251 113 376 9.09E-42 157.405 cl18188 AAA_7 superfamily - - "P-loop containing dynein motor region D3; the 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. 
The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This particular family is the D3 and is an ATP binding site." Q#5954 - CGI_10027745 superfamily 193253 885 1226 4.68E-39 151.343 cl15084 MT superfamily - - "Microtubule-binding stalk of dynein motor; the 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This family is the region between D4 and D5 and is the two predicted alpha-helical coiled coil segments that form the stalk supporting the ATP-sensitive microtubule binding component." Q#5954 - CGI_10027745 superfamily 193256 797 871 7.94E-07 51.872 cl18189 AAA_8 superfamily N - "P-loop containing dynein motor region D4; The 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This particular family is the D4 ATP-binding region of the motor." Q#5954 - CGI_10027745 superfamily 193253 750 796 0.00610745 40.0201 cl15084 MT superfamily C - "Microtubule-binding stalk of dynein motor; the 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. 
This family is the region between D4 and D5 and is the two predicted alpha-helical coiled coil segments that form the stalk supporting the ATP-sensitive microtubule binding component." Q#5955 - CGI_10027746 superfamily 248020 28 441 1.02E-43 162.531 cl17466 Sulfatase superfamily C - Sulfatase; Sulfatase. Q#5956 - CGI_10027747 superfamily 241832 60 172 1.16E-63 195.485 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#5957 - CGI_10027748 superfamily 247781 5 192 1.36E-91 278.866 cl17227 Arginosuc_synth superfamily C - Arginosuccinate synthase; This family contains a PP-loop motif. Q#5957 - CGI_10027748 superfamily 241758 189 259 9.74E-29 112.242 cl00292 AANH_like superfamily N - "Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases, ATP sulphurylases, Universal Stress Response protein and electron transfer flavoprotein (ETF). The domain forms an alpha/beta/alpha fold which binds to adenosine nucleotide." 
Q#5959 - CGI_10027750 superfamily 222370 131 215 6.32E-31 112.231 cl16386 Longin superfamily - - "Regulated-SNARE-like domain; Longin is one of the approximately 26 components required for transporting proteins from the ER to the plasma membrane, via the Golgi apparatus. It is necessary for the steps of the transfer from the ER to the Golgi complex. Longins are the only R-SNAREs that are common to all eukaryotes, and they are characterized by a conserved N-terminal domain with a profilin-like fold called a longin domain." Q#5959 - CGI_10027750 superfamily 201526 230 285 4.51E-13 63.3261 cl09522 Synaptobrevin superfamily C - Synaptobrevin; Synaptobrevin. Q#5960 - CGI_10027751 superfamily 246721 1081 1403 7.90E-103 331.145 cl14807 ACE1-Sec16-like superfamily - - "Ancestral coatomer element 1 (ACE1) of COPII coat complex assembly protein Sec16; COPII coat complex plays an important role in vesicular traffic of newly synthesized proteins from the endoplasmic reticulum (ER) to the Golgi apparatus by mediating the formation of transport vesicles. COPII consists of an outer coat, made up of the scaffold proteins Sec31 and Sec13, and the cargo adaptor complex, Sec23 and Sec24, which are recruited by the small GTPase Sar1. Sec16 is involved in the early steps of the assembly process. Sec16 forms elongated heterotetramers with Sec13, Sec13-(Sec16)2-Sec13. It interacts with Sec13 by insertion of a single beta-blade to close the six-bladed beta propeller of Sec13. In the same way Sec13 interacts with Sec31 and Nup145C, a nuclear pore protein; all of these contain a structurally related ancestral coatomer element 1 (ACE1). Sec16 is believed to be a key component in maintaining the integrity of the ER exit site." Q#5965 - CGI_10027756 superfamily 241770 147 274 1.17E-23 93.9996 cl00309 PRTases_typeI superfamily - - "Phosphoribosyl transferase (PRT)-type I domain; Phosphoribosyl transferase (PRT) domain. 
The type I PRTases are identified by a conserved PRPP binding motif which features two adjacent acidic residues surrounded by one or more hydrophobic residues. PRTases catalyze the displacement of the alpha-1'-pyrophosphate of 5-phosphoribosyl-alpha1-pyrophosphate (PRPP) by a nitrogen-containing nucleophile. The reaction products are an alpha-1 substituted ribose-5'-phosphate and a free pyrophosphate (PP). PRPP, an activated form of ribose-5-phosphate, is a key metabolite connecting nucleotide synthesis and salvage pathways. The type I PRTase family includes a range of diverse phosphoribosyl transferase enzymes and regulatory proteins of the nucleotide synthesis and salvage pathways, including adenine phosphoribosyltransferase EC:2.4.2.7., hypoxanthine-guanine-xanthine phosphoribosyltransferase, hypoxanthine phosphoribosyltransferase EC:2.4.2.8., ribose-phosphate pyrophosphokinase EC:2.7.6.1., amidophosphoribosyltransferase EC:2.4.2.14., orotate phosphoribosyltransferase EC:2.4.2.10., uracil phosphoribosyltransferase EC:2.4.2.9., and xanthine-guanine phosphoribosyltransferase EC:2.4.2.22." Q#5965 - CGI_10027756 superfamily 222383 4 120 2.44E-69 213.812 cl16402 Pribosyltran_N superfamily - - "N-terminal domain of ribose phosphate pyrophosphokinase; This family is frequently found N-terminal to the Pribosyltran, pfam00156." Q#5966 - CGI_10027757 superfamily 241644 8 149 2.74E-54 171.616 cl00154 UBCc superfamily - - "Ubiquitin-conjugating enzyme E2, catalytic (UBCc) domain. This is part of the ubiquitin-mediated protein degradation pathway in which a thiol-ester linkage forms between a conserved cysteine and the C-terminus of ubiquitin and complexes with ubiquitin protein ligase enzymes, E3. This pathway regulates many fundamental cellular processes. 
There are also other E2s which form thiol-ester linkages without the use of E3s as well as several UBC homologs (TSG101, Mms2, Croc-1 and similar proteins) which lack the active site cysteine essential for ubiquitination and appear to function in DNA repair pathways which were omitted from the scope of this CD." Q#5966 - CGI_10027757 superfamily 241643 163 199 1.41E-10 53.6171 cl00153 UBA superfamily - - "Ubiquitin Associated domain. The UBA domain is a commonly occurring sequence motif in some members of the ubiquitination pathway, UV excision repair proteins, and certain protein kinases. Although its specific role is so far unknown, it has been suggested that UBA domains are involved in conferring protein target specificity. The domain, a compact three helix bundle, has a conserved GFP-loop and the proline is thought to be critical for binding. The UBA domain is distinct from the conserved three helical domain seen in the N-terminus of EF-TS and eukaryotic NAC proteins." Q#5967 - CGI_10027758 superfamily 241622 661 745 2.84E-13 68.3622 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. 
The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#5967 - CGI_10027758 superfamily 241559 37 115 8.01E-10 58.4763 cl00030 CH superfamily - - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." Q#5967 - CGI_10027758 superfamily 243050 1961 2019 1.22E-09 56.9391 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." 
Q#5968 - CGI_10027759 superfamily 243092 347 621 5.10E-86 274.212 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#5968 - CGI_10027759 superfamily 243074 252 296 1.68E-12 63.2945 cl02535 F-box-like superfamily - - F-box-like; This is an F-box-like family. Q#5970 - CGI_10027761 superfamily 218440 882 1158 1.85E-18 89.9809 cl14936 AF-4 superfamily N - "AF-4 proto-oncoprotein; This family consists of AF4 (Proto-oncogene AF4) and FMR2 (Fragile X E mental retardation syndrome) nuclear proteins. These proteins have been linked to human diseases such as acute lymphoblastic leukaemia and mental retardation. The family also contains a Drosophila AF4 protein homologue Lilliputian which contains an AT-hook domain. Lilliputian represents a novel pair-rule gene that acts in cytoskeleton regulation, segmentation and morphogenesis in Drosophila." 
Q#5971 - CGI_10027762 superfamily 243083 5 90 4.83E-29 110.237 cl02554 PWWP superfamily - - "The PWWP domain, named for a conserved Pro-Trp-Trp-Pro motif, is a small domain consisting of 100-150 amino acids. The PWWP domain is found in numerous proteins that are involved in cell division, growth and differentiation. Most PWWP-domain proteins seem to be nuclear, often DNA-binding, proteins that function as transcription factors regulating a variety of developmental processes. The function of the PWWP domain is still not known precisely; however, based on the fact that other regions of PWWP-domain proteins are responsible for nuclear localization and DNA-binding, is likely that the PWWP domain acts as a site for protein-protein binding interactions, influencing chromatin remodeling and thereby regulating transcriptional processes. Some PWWP-domain proteins have been linked to cancer or other diseases; some are known to function as growth factors." Q#5971 - CGI_10027762 superfamily 236582 258 313 5.17E-09 56.2955 cl18895 PRK09599 superfamily C - 6-phosphogluconate dehydrogenase-like protein; Reviewed Q#5972 - CGI_10027763 superfamily 245814 210 243 0.00124729 37.3745 cl11960 Ig superfamily N - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#5973 - CGI_10027764 superfamily 246597 59 354 0 518.479 cl13995 MPP_superfamily superfamily - - "metallophosphatase superfamily, metallophosphatase domain; Metallophosphatases (MPPs), also known as metallophosphoesterases, phosphodiesterases (PDEs), binuclear metallophosphoesterases, and dimetal-containing phosphoesterases (DMPs), represent a diverse superfamily of enzymes with a conserved domain containing an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. This superfamily includes: the phosphoprotein phosphatases (PPPs), Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination." Q#5973 - CGI_10027764 superfamily 247856 474 539 3.70E-11 59.4837 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#5974 - CGI_10027765 superfamily 243045 266 361 1.06E-16 75.7475 cl02459 PAS superfamily - - "PAS domain; PAS motifs appear in archaea, eubacteria and eukarya. Probably the most surprising identification of a PAS domain was that in EAG-like K+-channels. PAS domains have been found to bind ligands, and to act as sensors for light and oxygen in signal transduction." 
Q#5974 - CGI_10027765 superfamily 241596 30 80 1.57E-08 51.4459 cl00081 HLH superfamily - - "Helix-loop-helix domain, found in specific DNA- binding proteins that act as transcription factors; 60-100 amino acids long. A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE 5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved in both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#5974 - CGI_10027765 superfamily 243045 120 170 3.49E-06 44.9315 cl02459 PAS superfamily C - "PAS domain; PAS motifs appear in archaea, eubacteria and eukarya. Probably the most surprising identification of a PAS domain was that in EAG-like K+-channels. PAS domains have been found to bind ligands, and to act as sensors for light and oxygen in signal transduction." Q#5976 - CGI_10027767 superfamily 243992 8 64 1.81E-06 41.82 cl05087 Complex1_LYR_1 superfamily - - "Complex1_LYR-like; This is a family of proteins carrying the LYR motif of family Complex1_LYR, pfam05347, likely to be involved in Fe-S cluster biogenesis in mitochondria." 
Q#5977 - CGI_10027768 superfamily 220695 72 194 0.00181333 38.7139 cl18571 7TM_GPCR_Srx superfamily C - Serpentine type 7TM GPCR chemoreceptor Srx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srx is part of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#5978 - CGI_10027769 superfamily 222526 332 551 4.48E-51 179.853 cl16591 UBN_AB superfamily - - "Ubinuclein conserved middle domain; Ubinuclein 1 and 2 (UBN1, UBN2) are members of a histone chaperone complex involved in the formation of a certain type of facultative heterochromatin, called senescence-associated heterochromatin foci (SAHF). The domain described here is conserved in many eukaryotes such as human, rat, drosophila, and zebra-fish and has been targeted for protein structure determination by the Joint Center for Structural Genomics." Q#5978 - CGI_10027769 superfamily 219992 117 161 2.40E-09 55.0184 cl16007 HRD superfamily - - "Hpc2-related domain; HPC2 (Histone promoter control 2) is required for cell-cycle regulation of histone transcription. It regulates transcription of the histone genes during the S-phase of the cell cycle by repressing transcription at other cell cycle stages. HPC2 mutants display synthetic interactions with FACT complex which allows RNA Pol II to elongate through nucleosomes. This short domain is referred to as the HRD or Hpc2-related domain and is found in both human, yeast and Sch. pombe sequences. Hpc2 is one of the proteins of one of the multi-subunit complexes that mediate replication-independent nucleosome assembly, along with histone chaperone proteins. the Hip4 sequence from SCH. pombe is an integral component of this complex that is required for transcriptional silencing at multiple loci." 
Q#5979 - CGI_10027770 superfamily 242350 3 160 6.98E-35 122.819 cl01182 SprT superfamily - - "SprT homologues; Predicted to have roles in transcription elongation. Contains a conserved HExxH motif, indicating a metalloprotease function." Q#5979 - CGI_10027770 superfamily 241597 175 214 0.00274472 34.5987 cl00082 HMG-box superfamily C - "High Mobility Group (HMG)-box is found in a variety of eukaryotic chromosomal proteins and transcription factors. HMGs bind to the minor groove of DNA and have been classified by DNA binding preferences. Two phylogenically distinct groups of Class I proteins bind DNA in a sequence specific fashion and contain a single HMG box. One group (SOX-TCF) includes transcription factors, TCF-1, -3, -4; and also SRY and LEF-1, which bind four-way DNA junctions and duplex DNA targets. The second group (MATA) includes fungal mating type gene products MC, MATA1 and Ste11. Class II and III proteins (HMGB-UBF) bind DNA in a non-sequence specific fashion and contain two or more tandem HMG boxes. Class II members include non-histone chromosomal proteins, HMG1 and HMG2, which bind to bent or distorted DNA such as four-way DNA junctions, synthetic DNA cruciforms, kinked cisplatin-modified DNA, DNA bulges, cross-overs in supercoiled DNA, and can cause looping of linear DNA. Class III members include nucleolar and mitochondrial transcription factors, UBF and mtTF1, which bind four-way DNA junctions." Q#5980 - CGI_10027771 superfamily 247725 347 483 1.24E-89 279.941 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. 
This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#5980 - CGI_10027771 superfamily 247725 529 661 2.76E-47 163.926 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#5980 - CGI_10027771 superfamily 241647 236 264 0.00101803 37.9332 cl00157 WW superfamily - - Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs. Q#5981 - CGI_10027772 superfamily 188051 318 553 0.00908705 37.0611 cl18155 nop2p superfamily - - "NOL1/NOP2/sun family putative RNA methylase; [Protein synthesis, tRNA and rRNA base modification]." Q#5984 - CGI_10027775 superfamily 241867 1 241 8.80E-109 316.739 cl00446 Lactamase_B superfamily - - Metallo-beta-lactamase superfamily; Metallo-beta-lactamase superfamily. 
Q#5985 - CGI_10027776 superfamily 147799 1 74 1.50E-08 46.4219 cl05422 Apc13p superfamily - - Apc13p protein; The anaphase-promoting complex (APC) is a conserved multi-subunit ubiquitin ligase required for the degradation of key cell cycle regulators. Members of this family are components of the anaphase-promoting complex homologous to Apc13p. Q#5986 - CGI_10027777 superfamily 241739 45 173 8.87E-52 170.054 cl00268 class_II_aaRS-like_core superfamily N - "Class II tRNA amino-acyl synthetase-like catalytic core domain. Class II amino acyl-tRNA synthetases (aaRS) share a common fold and generally attach an amino acid to the 3' OH of ribose of the appropriate tRNA. PheRS is an exception in that it attaches the amino acid at the 2'-OH group, like class I aaRSs. These enzymes are usually homodimers. This domain is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. The substrate specificity of this reaction is further determined by additional domains. Interestingly, this domain is also found in asparagine synthase A (AsnA), in the accessory subunit of mitochondrial polymerase gamma and in the bacterial ATP phosphoribosyltransferase regulatory subunit HisZ." Q#5987 - CGI_10027778 superfamily 243035 1 112 1.01E-12 59.9409 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. 
For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#5988 - CGI_10027779 superfamily 241607 47 79 0.00220342 33.7826 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chyomotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#5989 - CGI_10027780 superfamily 222150 7 31 3.02E-05 38.9121 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#5990 - CGI_10027781 superfamily 215648 20 280 8.20E-19 85.3399 cl02802 7tm_3 superfamily - - "7 transmembrane sweet-taste receptor of 3 GCPR; This is a domain of seven transmembrane regions that forms the C-terminus of some subclass 3 G-coupled-protein receptors. 
It is often associated with a downstream cysteine-rich linker domain, NCD3G pfam07562, which is the human sweet-taste receptor, and the N-terminal domain, ANF_receptor pfam01094. The seven TM regions assemble in such a way as to produce a docking pocket into which such molecules as cyclamate and lactisole have been found to bind and consequently confer the taste of sweetness." Q#5993 - CGI_10027785 superfamily 246918 51 95 1.13E-07 44.4999 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#5993 - CGI_10027785 superfamily 246918 3 48 4.70E-07 42.9591 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#5994 - CGI_10027786 superfamily 247724 10 44 2.48E-14 63.1167 cl17170 Ras_like_GTPase superfamily N - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. 
Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#5995 - CGI_10004272 superfamily 247941 222 387 3.35E-08 51.7445 cl17387 Methyltransf_21 superfamily - - "Methyltransferase FkbM domain; This family has members from bacteria to human, and appears to be a methyltransferase." Q#5995 - CGI_10004272 superfamily 247941 27 124 0.00083597 39.0329 cl17387 Methyltransf_21 superfamily N - "Methyltransferase FkbM domain; This family has members from bacteria to human, and appears to be a methyltransferase." Q#5996 - CGI_10004273 superfamily 241575 1451 1512 2.37E-11 61.9047 cl00054 DSRM superfamily - - "Double-stranded RNA binding motif. Binding is not sequence specific but is highly specific for double stranded RNA. Found in a variety of proteins including dsRNA dependent protein kinase PKR, RNA helicases, Drosophila staufen protein, E. coli RNase III, RNases H1, and dsRNA dependent adenosine deaminases." Q#5996 - CGI_10004273 superfamily 241575 1373 1405 0.000781322 39.5631 cl00054 DSRM superfamily N - "Double-stranded RNA binding motif. Binding is not sequence specific but is highly specific for double stranded RNA. Found in a variety of proteins including dsRNA dependent protein kinase PKR, RNA helicases, Drosophila staufen protein, E. coli RNase III, RNases H1, and dsRNA dependent adenosine deaminases." Q#5998 - CGI_10004275 superfamily 241874 36 143 8.98E-17 77.5213 cl00456 SLC5-6-like_sbd superfamily N - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. 
SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#5999 - CGI_10004276 superfamily 241874 1 343 1.76E-62 210.489 cl00456 SLC5-6-like_sbd superfamily C - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." 
Q#6000 - CGI_10004277 superfamily 243035 228 342 4.21E-15 71.8821 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#6000 - CGI_10004277 superfamily 243035 447 504 8.95E-06 44.1478 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. 
CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#6001 - CGI_10009317 superfamily 245819 194 368 1.12E-70 222.84 cl11967 Nucleotidyl_cyc_III superfamily - - "Class III nucleotidyl cyclases; Class III nucleotidyl cyclases are the largest, most diverse group of nucleotidyl cyclases (NC's) containing prokaryotic and eukaryotic proteins. They can be divided into two major groups; the mononucleotidyl cyclases (MNC's) and the diguanylate cyclases (DGC's). 
The MNC's, which include the adenylate cyclases (AC's) and the guanylate cyclases (GC's), have a conserved cyclase homology domain (CHD), while the DGC's have a conserved GGDEF domain, named after a conserved motif within this subgroup. Their products, cyclic guanylyl and adenylyl nucleotides, are second messengers that play important roles in eukaryotic signal transduction and prokaryotic sensory pathways." Q#6001 - CGI_10009317 superfamily 219526 96 181 1.14E-18 83.0522 cl06648 HNOBA superfamily N - "Heme NO binding associated; The HNOBA domain is found associated with the HNOB domain and pfam00211 in soluble cyclases and signalling proteins. The HNOB domain is predicted to function as a heme-dependent sensor for gaseous ligands, and transduce diverse downstream signals, in both bacteria and animals." Q#6001 - CGI_10009317 superfamily 203730 16 98 2.06E-16 76.154 cl18246 HNOB superfamily N - "Heme NO binding; The HNOB (Heme NO Binding) domain, is a predominantly alpha-helical domain and binds heme via a covalent linkage to histidine. The HNOB domain is predicted to function as a heme-dependent sensor for gaseous ligands, and transduce diverse downstream signals, in both bacteria and animals." Q#6002 - CGI_10009318 superfamily 215827 187 365 2.10E-30 120.266 cl02830 Tyrosinase superfamily - - Common central domain of tyrosinase; This family also contains polyphenol oxidases and some hemocyanins. Binds two copper ions via two sets of three histidines. This family is related to pfam00372. Q#6003 - CGI_10009319 superfamily 215827 136 314 4.97E-41 149.926 cl02830 Tyrosinase superfamily - - Common central domain of tyrosinase; This family also contains polyphenol oxidases and some hemocyanins. Binds two copper ions via two sets of three histidines. This family is related to pfam00372. 
Q#6004 - CGI_10009320 superfamily 220170 18 172 1.84E-20 85.8764 cl08542 XLF superfamily - - "XLF (XRCC4-like factor); XLF (also called Cernunnos) interacts with the XRCC4-DNA ligase IV complex to promote DNA non-homologous end-joining. It directly interacts with the XRCC4-Ligase IV complex and siRNA-mediated downregulation of XLF in human cell lines leads to radio-sensitivity and impaired DNA non-homologous end-joining. This family contains Nej1 (non-homologous end-joining factor), and Lif1." Q#6005 - CGI_10009321 superfamily 243072 37 99 1.22E-08 51.6154 cl02529 ANK superfamily N - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#6007 - CGI_10009323 superfamily 243077 10 47 8.35E-13 63.3333 cl02542 DnaJ superfamily N - "DnaJ domain or J-domain. DnaJ/Hsp40 (heat shock protein 40) proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperonine family. Hsp40 proteins are characterized by the presence of a J domain, which mediates the interaction with Hsp70. They may contain other domains as well, and the architectures provide a means of classification." Q#6009 - CGI_10009325 superfamily 202894 106 171 1.80E-26 98.447 cl04406 Mpv17_PMP22 superfamily - - "Mpv17 / PMP22 family; The 22-kDa peroxisomal membrane protein (PMP22) is a major component of peroxisomal membranes. PMP22 seems to be involved in pore forming activity and may contribute to the unspecific permeability of the organelle membrane. 
PMP22 is synthesised on free cytosolic ribosomes and then directed to the peroxisome membrane by specific targeting information. Mpv17 is a closely related peroxisomal protein. In mouse, the Mpv17 protein is involved in the development of early-onset glomerulosclerosis. More recently a homolog of Mpv17 in S. cerevisiae has been been found to be an integral membrane protein of the inner mitochondrial membrane where it has been proposed to have a role in ethanol metabolism and tolerance during heat-shock. Defects in MPV17 is associated with mitochondrial DNA depletion syndrome (MDDS) and Navajo neurohepatopathy (NNH). MDDS is a clinically heterogeneous group of disorders characterized by a reduction in mitochondrial DNA (mtDNA) copy number. Primary mtDNA depletion is inherited as an autosomal recessive trait and may affect single organs, typically muscle or liver, or multiple tissues. Individuals with the hepatocerebral form of mitochondrial DNA depletion syndrome have early progressive liver failure and neurologic abnormalities, hypoglycemia, and increased lactate in body fluids. NNH is an autosomal recessive disease that is prevalent among Navajo children in the South Western states of America. The major clinical features are hepatopathy, peripheral neuropathy, corneal anesthesia and scarring, acral mutilation, cerebral leukoencephalopathy, failure to thrive, and recurrent metabolic acidosis with intercurrent infections. Infantile, childhood, and classic forms of NNH have been described. Mitochondrial DNA depletion was detected in the livers of patients, suggesting a primary defect in mtDNA maintenance." Q#6010 - CGI_10009326 superfamily 248472 6 65 4.23E-28 101.244 cl17918 Ribosomal_P1_P2_L12p superfamily C - "Ribosomal protein P1, P2, and L12p. Ribosomal proteins P1 and P2 are the eukaryotic proteins that are functionally equivalent to bacterial L7/L12. L12p is the archaeal homolog. 
Unlike other ribosomal proteins, the archaeal L12p and eukaryotic P1 and P2 do not share sequence similarity with their bacterial counterparts. They are part of the ribosomal stalk (called the L7/L12 stalk in bacteria), along with 28S rRNA and the proteins L11 and P0 in eukaryotes (23S rRNA, L11, and L10e in archaea). In bacterial ribosomes, L7/L12 homodimers bind the extended C-terminal helix of L10 to anchor the L7/L12 molecules to the ribosome. Eukaryotic P1/P2 heterodimers and archaeal L12p homodimers are believed to bind the L10 equivalent proteins, eukaryotic P0 and archaeal L10e, in a similar fashion. P1 and P2 (L12p, L7/L12) are the only proteins in the ribosome to occur as multimers, always appearing as sets of dimers. Recent data indicate that most archaeal species contain six copies of L12p (three homodimers), while eukaryotes have two copies each of P1 and P2 (two heterodimers). Bacteria may have four or six copies (two or three homodimers), depending on the species. As in bacteria, the stalk is crucial for binding of initiation, elongation, and release factors in eukaryotes and archaea." Q#6011 - CGI_10009327 superfamily 241832 147 247 8.10E-52 170.158 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). 
Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#6011 - CGI_10009327 superfamily 241832 285 382 4.06E-46 155.52 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#6011 - CGI_10009327 superfamily 241832 25 128 1.56E-39 137.801 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. 
The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#6012 - CGI_10009328 superfamily 243092 20 213 2.23E-07 52.7224 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#6012 - CGI_10009328 superfamily 220045 619 702 0.00102807 38.8524 cl07455 DUF1829 superfamily - - Domain of unknown function DUF1829; This short domain is usually associated with pfam08861. 
Q#6016 - CGI_10011604 superfamily 245864 16 486 4.54E-94 295.727 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#6017 - CGI_10011606 superfamily 191430 14 189 2.88E-07 49.0114 cl05523 Gly_acyl_tr_N superfamily - - Aralkyl acyl-CoA:amino acid N-acyltransferase; This family consists of several mammalian specific aralkyl acyl-CoA:amino acid N-acyltransferase (glycine N-acyltransferase) proteins EC:2.3.1.13. Q#6017 - CGI_10011606 superfamily 247736 197 265 1.87E-05 41.9314 cl17182 NAT_SF superfamily - - "N-Acyltransferase superfamily: Various enzymes that characteristically catalyze the transfer of an acyl group to a substrate; NAT (N-Acyltransferase) is a large superfamily of enzymes that mostly catalyze the transfer of an acyl group to a substrate and are implicated in a variety of functions, ranging from bacterial antibiotic resistance to circadian rhythms in mammals. 
Members include GCN5-related N-Acetyltransferases (GNAT) such as Aminoglycoside N-acetyltransferases, Histone N-acetyltransferase (HAT) enzymes, and Serotonin N-acetyltransferase, which catalyze the transfer of an acetyl group to a substrate. The kinetic mechanism of most GNATs involves the ordered formation of a ternary complex: the reaction begins with Acetyl Coenzyme A (AcCoA) binding, followed by binding of substrate, then direct transfer of the acetyl group from AcCoA to the substrate, followed by product and subsequent CoA release. Other family members include Arginine/ornithine N-succinyltransferase, Myristoyl-CoA: protein N-myristoyltransferase, and Acyl-homoserinelactone synthase which have a similar catalytic mechanism but differ in types of acyl groups transferred. Leucyl/phenylalanyl-tRNA-protein transferase and FemXAB nonribosomal peptidyltransferases which catalyze similar peptidyltransferase reactions are also included." Q#6019 - CGI_10011608 superfamily 238191 34 133 0.000967545 38.082 cl18907 Esterase_lipase superfamily C - "Esterases and lipases (includes fungal lipases, cholinesterases, etc.) These enzymes act on carboxylic esters (EC: 3.1.1.-). The catalytic apparatus involves three residues (catalytic triad): a serine, a glutamate or aspartate and a histidine.These catalytic residues are responsible for the nucleophilic attack on the carbonyl carbon atom of the ester bond. In contrast with other alpha/beta hydrolase fold family members, p-nitrobenzyl esterase and acetylcholine esterase have a Glu instead of Asp at the active site carboxylate." Q#6020 - CGI_10011609 superfamily 238191 30 133 3.21E-10 56.1864 cl18907 Esterase_lipase superfamily C - "Esterases and lipases (includes fungal lipases, cholinesterases, etc.) These enzymes act on carboxylic esters (EC: 3.1.1.-). 
The catalytic apparatus involves three residues (catalytic triad): a serine, a glutamate or aspartate and a histidine.These catalytic residues are responsible for the nucleophilic attack on the carbonyl carbon atom of the ester bond. In contrast with other alpha/beta hydrolase fold family members, p-nitrobenzyl esterase and acetylcholine esterase have a Glu instead of Asp at the active site carboxylate." Q#6021 - CGI_10011610 superfamily 238191 35 141 1.76E-06 45.0156 cl18907 Esterase_lipase superfamily C - "Esterases and lipases (includes fungal lipases, cholinesterases, etc.) These enzymes act on carboxylic esters (EC: 3.1.1.-). The catalytic apparatus involves three residues (catalytic triad): a serine, a glutamate or aspartate and a histidine.These catalytic residues are responsible for the nucleophilic attack on the carbonyl carbon atom of the ester bond. In contrast with other alpha/beta hydrolase fold family members, p-nitrobenzyl esterase and acetylcholine esterase have a Glu instead of Asp at the active site carboxylate." Q#6022 - CGI_10011611 superfamily 238191 1 90 2.04E-06 43.86 cl18907 Esterase_lipase superfamily C - "Esterases and lipases (includes fungal lipases, cholinesterases, etc.) These enzymes act on carboxylic esters (EC: 3.1.1.-). The catalytic apparatus involves three residues (catalytic triad): a serine, a glutamate or aspartate and a histidine.These catalytic residues are responsible for the nucleophilic attack on the carbonyl carbon atom of the ester bond. In contrast with other alpha/beta hydrolase fold family members, p-nitrobenzyl esterase and acetylcholine esterase have a Glu instead of Asp at the active site carboxylate." Q#6023 - CGI_10011612 superfamily 238191 4 540 2.26E-87 288.462 cl18907 Esterase_lipase superfamily - - "Esterases and lipases (includes fungal lipases, cholinesterases, etc.) These enzymes act on carboxylic esters (EC: 3.1.1.-). 
The catalytic apparatus involves three residues (catalytic triad): a serine, a glutamate or aspartate and a histidine.These catalytic residues are responsible for the nucleophilic attack on the carbonyl carbon atom of the ester bond. In contrast with other alpha/beta hydrolase fold family members, p-nitrobenzyl esterase and acetylcholine esterase have a Glu instead of Asp at the active site carboxylate." Q#6024 - CGI_10007145 superfamily 215754 13 107 3.90E-24 94.2424 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#6024 - CGI_10007145 superfamily 215754 112 194 2.98E-18 77.6788 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#6024 - CGI_10007145 superfamily 215754 221 295 1.85E-14 67.6636 cl02813 Mito_carr superfamily N - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#6025 - CGI_10007146 superfamily 201844 204 234 8.15E-12 60.7722 cl03250 zf-C2HC superfamily - - "Zinc finger, C2HC type; This is a DNA binding zinc finger domain." Q#6027 - CGI_10007148 superfamily 245864 35 528 4.48E-78 254.896 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. 
These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#6028 - CGI_10007149 superfamily 245864 51 507 4.03E-102 316.913 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#6029 - CGI_10007150 superfamily 245864 36 486 3.27E-91 288.023 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. 
They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#6030 - CGI_10007151 superfamily 241563 18 51 0.00205118 33.0795 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#6031 - CGI_10007152 superfamily 243072 43 164 4.80E-26 100.921 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#6033 - CGI_10007154 superfamily 238191 38 157 9.48E-10 55.416 cl18907 Esterase_lipase superfamily C - "Esterases and lipases (includes fungal lipases, cholinesterases, etc.) These enzymes act on carboxylic esters (EC: 3.1.1.-). The catalytic apparatus involves three residues (catalytic triad): a serine, a glutamate or aspartate and a histidine. These catalytic residues are responsible for the nucleophilic attack on the carbonyl carbon atom of the ester bond. In contrast with other alpha/beta hydrolase fold family members, p-nitrobenzyl esterase and acetylcholine esterase have a Glu instead of Asp at the active site carboxylate." Q#6034 - CGI_10007155 superfamily 238191 30 145 2.57E-05 41.1636 cl18907 Esterase_lipase superfamily C - "Esterases and lipases (includes fungal lipases, cholinesterases, etc.) These enzymes act on carboxylic esters (EC: 3.1.1.-). The catalytic apparatus involves three residues (catalytic triad): a serine, a glutamate or aspartate and a histidine. These catalytic residues are responsible for the nucleophilic attack on the carbonyl carbon atom of the ester bond. In contrast with other alpha/beta hydrolase fold family members, p-nitrobenzyl esterase and acetylcholine esterase have a Glu instead of Asp at the active site carboxylate." Q#6035 - CGI_10007156 superfamily 238191 31 566 1.63E-94 307.722 cl18907 Esterase_lipase superfamily - - "Esterases and lipases (includes fungal lipases, cholinesterases, etc.) These enzymes act on carboxylic esters (EC: 3.1.1.-). The catalytic apparatus involves three residues (catalytic triad): a serine, a glutamate or aspartate and a histidine. These catalytic residues are responsible for the nucleophilic attack on the carbonyl carbon atom of the ester bond. In contrast with other alpha/beta hydrolase fold family members, p-nitrobenzyl esterase and acetylcholine esterase have a Glu instead of Asp at the active site carboxylate." 
Q#6036 - CGI_10006169 superfamily 241563 70 109 5.02E-06 44.0072 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#6039 - CGI_10006172 superfamily 110440 744 770 0.000913662 38.1577 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#6041 - CGI_10006174 superfamily 245213 531 566 3.84E-07 48.4018 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#6041 - CGI_10006174 superfamily 245213 876 907 2.87E-06 46.0906 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#6041 - CGI_10006174 superfamily 245213 916 959 6.89E-06 44.935 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#6041 - CGI_10006174 superfamily 245213 834 875 3.82E-05 42.6238 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#6041 - CGI_10006174 superfamily 245213 788 824 5.83E-05 42.2386 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#6041 - CGI_10006174 superfamily 245213 615 647 0.000393674 39.5422 cl09941 EGF_CA superfamily C - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#6041 - CGI_10006174 superfamily 245213 725 770 0.00141553 38.0014 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#6041 - CGI_10006174 superfamily 242788 45 203 4.36E-78 253.962 cl01936 Rad52_Rad22 superfamily - - "Rad52/22 family double-strand break repair protein; The DNA single-strand annealing proteins (SSAPs), such as RecT, Red-beta, ERF and Rad52, function in RecA-dependent and RecA-independent DNA recombination pathways. This family includes proteins related to Rad52. These proteins contain two helix-hairpin-helix motifs." Q#6041 - CGI_10006174 superfamily 221695 597 618 0.000706471 38.9754 cl18612 cEGF superfamily - - "Complement Clr-like EGF-like; cEGF, or complement Clr-like EGF, domains have six conserved cysteine residues disulfide-bonded into the characteristic pattern 'ababcc'. They are found in blood coagulation proteins such as fibrillin, Clr and Cls, thrombomodulin, and the LDL receptor. The core fold of the EGF domain consists of two small beta-hairpins packed against each other. Two major structural variants have been identified based on the structural context of the C-terminal cysteine residue of disulfide 'c' in the C-terminal hairpin: hEGFs and cEGFs. In cEGFs the C-terminal thiol resides on the C-terminal beta-sheet, resulting in long loop-lengths between the cysteine residues of disulfide 'c', typically C[10+]XC. These longer loop-lengths may have arisen by selective cysteine loss from a four-disulfide EGF template such as laminin or integrin. Tandem cEGF domains have five linking residues between terminal cysteines of adjacent domains. cEGF domains may or may not bind calcium in the linker region. cEGF domains with the consensus motif CXN4X[F,Y]XCXC are hydroxylated exclusively on the asparagine residue." Q#6041 - CGI_10006174 superfamily 245213 960 996 0.0020741 37.6116 cl09941 EGF_CA superfamily C - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#6042 - CGI_10006175 superfamily 220647 26 189 8.20E-46 153.252 cl18565 L_HGMIC_fpl superfamily - - "Lipoma HMGIC fusion partner-like protein; This is a group of proteins expressed from a series of genes referred to as Lipoma HGMIC fusion partner-like. The proteins carry four highly conserved transmembrane domains in this entry. In certain instances, eg in LHFPL5, mutations cause deafness in humans and hypospadias, and LHFPL1 is transcribed in six liver tumour cell lines." Q#6044 - CGI_10006177 superfamily 243119 26 62 0.00871017 33.1761 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#6045 - CGI_10006178 superfamily 248469 90 197 5.20E-09 51.9871 cl17915 HAD_like superfamily - - "Haloacid dehalogenase-like hydrolases. The haloacid dehalogenase-like (HAD) superfamily includes L-2-haloacid dehalogenase, epoxide hydrolase, phosphoserine phosphatase, phosphomannomutase, phosphoglycolate phosphatase, P-type ATPase, and many others, all of which use a nucleophilic aspartate in their phosphoryl transfer reaction. 
All members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. Members of this superfamily are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases." Q#6046 - CGI_10006179 superfamily 241677 17 163 7.52E-83 243.596 cl00197 cyclophilin superfamily - - "cyclophilin: cyclophilin-type peptidylprolyl cis-trans isomerases. This family contains eukaryotic, bacterial and archaeal proteins which exhibit peptidylprolyl cis-trans isomerase activity (PPIase, Rotamase) and in addition bind the immunosuppressive drug cyclosporin (CsA). Immunosuppression in vertebrates is believed to be the result of the cyclophilin A-cyclosporin protein drug complex binding to and inhibiting the protein-phosphatase calcineurin. PPIase is an enzyme which accelerates protein folding by catalyzing the cis-trans isomerization of the peptide bonds preceding proline residues. Cyclophilins are a diverse family in terms of function and have been implicated in protein folding processes which depend on catalytic/chaperone-like activities. This group contains human cyclophilin 40, a co-chaperone of the hsp90 chaperone system; human cyclophilin A, a chaperone in the HIV-1 infectious process; and human cyclophilin H, a component of the U4/U6 snRNP, whose isomerization or chaperoning activities may play a role in RNA splicing." Q#6051 - CGI_10007502 superfamily 241563 97 129 5.11E-05 41.1188 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#6051 - CGI_10007502 superfamily 245010 136 230 0.00069483 38.3679 cl09111 Prefoldin superfamily - - "Prefoldin is a hexameric molecular chaperone complex, found in both eukaryotes and archaea, that binds and stabilizes newly synthesized polypeptides allowing them to fold correctly. The complex contains two alpha and four beta subunits, the two subunits being evolutionarily related. In archaea, there is usually only one gene for each subunit while in eukaryotes there are two or more paralogous genes encoding each subunit, adding heterogeneity to the structure of the hexamer. The structure of the complex consists of a double beta barrel assembly with six protruding coiled-coils." Q#6052 - CGI_10007503 superfamily 247727 103 207 0.000108446 39.3355 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." 
Q#6053 - CGI_10007504 superfamily 243092 24 112 0.00449546 38.0848 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#6054 - CGI_10007505 superfamily 110440 396 423 0.00205297 36.2317 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." 
Q#6056 - CGI_10007507 superfamily 243092 27 186 1.02E-05 47.3296 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#6057 - CGI_10007508 superfamily 241889 144 297 1.94E-61 205.146 cl00474 PAP2_like superfamily - - "PAP2_like proteins, a super-family of histidine phosphatases and vanadium haloperoxidases, includes type 2 phosphatidic acid phosphatase or lipid phosphate phosphatase (LPP), Glucose-6-phosphatase, Phosphatidylglycerophosphatase B and bacterial acid phosphatase, vanadium chloroperoxidases, vanadium bromoperoxidases, and several other mostly uncharacterized subfamilies. Several members of this superfamily have been predicted to be transmembrane proteins." Q#6057 - CGI_10007508 superfamily 222150 599 624 6.48E-06 43.9197 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. 
Q#6057 - CGI_10007508 superfamily 222150 628 649 0.000214433 39.6825 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#6058 - CGI_10007509 superfamily 215647 97 166 3.22E-05 42.9809 cl18338 7tm_2 superfamily NC - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GPCRs). They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#6059 - CGI_10007510 superfamily 241874 33 535 0 740.148 cl00456 SLC5-6-like_sbd superfamily - - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. 
NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#6060 - CGI_10007511 superfamily 215647 74 138 0.00013675 39.1289 cl18338 7tm_2 superfamily NC - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GPCRs). They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#6062 - CGI_10007513 superfamily 152897 378 685 6.08E-94 297.493 cl13848 DUF3689 superfamily - - Protein of unknown function (DUF3689); This family of proteins is found in eukaryotes. Proteins in this family are typically between 399 and 797 amino acids in length. 
Q#6063 - CGI_10007514 superfamily 215686 193 260 0.0015905 36.6265 cl18340 Lipocalin superfamily N - "Lipocalin / cytosolic fatty-acid binding protein family; Lipocalins are transporters for small hydrophobic molecules, such as lipids, steroid hormones, bilins, and retinoids. The family also encompasses the enzyme prostaglandin D synthase (EC:5.3.99.2). Alignment subsumes both the lipocalin and fatty acid binding protein signatures from PROSITE. This is supported on structural and functional grounds. The structure is an eight-stranded beta barrel." Q#6065 - CGI_10001203 superfamily 241686 431 493 1.02E-16 77.2609 cl00207 HMA superfamily - - "Heavy-metal-associated domain (HMA) is a conserved domain of approximately 30 amino acid residues found in a number of proteins that transport or detoxify heavy metals, for example, the CPx-type heavy metal ATPases and copper chaperones. HMA domain contains two cysteine residues that are important in binding and transfer of metal ions, such as copper, cadmium, cobalt and zinc. In the case of copper, stoichiometry of binding is one Cu+ ion per binding domain. Repeats of the HMA domain in copper chaperone has been associated with Menkes/Wilson disease due to binding of multiple copper ions." Q#6065 - CGI_10001203 superfamily 241686 231 294 3.65E-16 75.7201 cl00207 HMA superfamily - - "Heavy-metal-associated domain (HMA) is a conserved domain of approximately 30 amino acid residues found in a number of proteins that transport or detoxify heavy metals, for example, the CPx-type heavy metal ATPases and copper chaperones. HMA domain contains two cysteine residues that are important in binding and transfer of metal ions, such as copper, cadmium, cobalt and zinc. In the case of copper, stoichiometry of binding is one Cu+ ion per binding domain. Repeats of the HMA domain in copper chaperone has been associated with Menkes/Wilson disease due to binding of multiple copper ions." 
Q#6065 - CGI_10001203 superfamily 241686 62 124 1.80E-15 73.4089 cl00207 HMA superfamily - - "Heavy-metal-associated domain (HMA) is a conserved domain of approximately 30 amino acid residues found in a number of proteins that transport or detoxify heavy metals, for example, the CPx-type heavy metal ATPases and copper chaperones. HMA domain contains two cysteine residues that are important in binding and transfer of metal ions, such as copper, cadmium, cobalt and zinc. In the case of copper, stoichiometry of binding is one Cu+ ion per binding domain. Repeats of the HMA domain in copper chaperone has been associated with Menkes/Wilson disease due to binding of multiple copper ions." Q#6065 - CGI_10001203 superfamily 241686 140 200 5.78E-15 72.2533 cl00207 HMA superfamily - - "Heavy-metal-associated domain (HMA) is a conserved domain of approximately 30 amino acid residues found in a number of proteins that transport or detoxify heavy metals, for example, the CPx-type heavy metal ATPases and copper chaperones. HMA domain contains two cysteine residues that are important in binding and transfer of metal ions, such as copper, cadmium, cobalt and zinc. In the case of copper, stoichiometry of binding is one Cu+ ion per binding domain. Repeats of the HMA domain in copper chaperone has been associated with Menkes/Wilson disease due to binding of multiple copper ions." Q#6065 - CGI_10001203 superfamily 241686 311 371 6.40E-13 66.0901 cl00207 HMA superfamily - - "Heavy-metal-associated domain (HMA) is a conserved domain of approximately 30 amino acid residues found in a number of proteins that transport or detoxify heavy metals, for example, the CPx-type heavy metal ATPases and copper chaperones. HMA domain contains two cysteine residues that are important in binding and transfer of metal ions, such as copper, cadmium, cobalt and zinc. In the case of copper, stoichiometry of binding is one Cu+ ion per binding domain. 
Repeats of the HMA domain in copper chaperone has been associated with Menkes/Wilson disease due to binding of multiple copper ions." Q#6065 - CGI_10001203 superfamily 241686 506 569 7.87E-13 65.7049 cl00207 HMA superfamily - - "Heavy-metal-associated domain (HMA) is a conserved domain of approximately 30 amino acid residues found in a number of proteins that transport or detoxify heavy metals, for example, the CPx-type heavy metal ATPases and copper chaperones. HMA domain contains two cysteine residues that are important in binding and transfer of metal ions, such as copper, cadmium, cobalt and zinc. In the case of copper, stoichiometry of binding is one Cu+ ion per binding domain. Repeats of the HMA domain in copper chaperone has been associated with Menkes/Wilson disease due to binding of multiple copper ions." Q#6065 - CGI_10001203 superfamily 248469 1138 1264 0.00255027 38.5051 cl17915 HAD_like superfamily - - "Haloacid dehalogenase-like hydrolases. The haloacid dehalogenase-like (HAD) superfamily includes L-2-haloacid dehalogenase, epoxide hydrolase, phosphoserine phosphatase, phosphomannomutase, phosphoglycolate phosphatase, P-type ATPase, and many others, all of which use a nucleophilic aspartate in their phosphoryl transfer reaction. All members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. Members of this superfamily are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases." Q#6065 - CGI_10001203 superfamily 215733 730 976 2.24E-60 208.188 cl02811 E1-E2_ATPase superfamily - - E1-E2 ATPase; E1-E2 ATPase. Q#6066 - CGI_10000493 superfamily 214531 7 49 6.06E-07 42.5889 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. 
Q#6067 - CGI_10002974 superfamily 245847 41 187 5.56E-18 76.8265 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#6068 - CGI_10012519 superfamily 152683 352 447 2.69E-07 48.4381 cl13656 Methyltransf_FA superfamily - - "Farnesoic acid O-methyl transferase; This domain family is found in bacteria and eukaryotes, and is approximately 110 amino acids in length. Farnesoic acid O-methyl transferase (FAMeT) is the enzyme that catalyzes the formation of methyl farnesoate (MF) from farnesoic acid (FA) in the biosynthetic pathway of juvenile hormone (JH)." Q#6068 - CGI_10012519 superfamily 219525 18 64 4.53E-05 41.2506 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#6069 - CGI_10012520 superfamily 246925 394 594 0.000116945 44.2686 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#6069 - CGI_10012520 superfamily 210118 801 820 0.00178798 37.6903 cl15479 IQ superfamily - - IQ calmodulin-binding motif; Calmodulin-binding motif. Q#6070 - CGI_10012521 superfamily 241563 40 76 0.000208835 39.3848 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#6070 - CGI_10012521 superfamily 110440 462 488 0.000333046 38.5429 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#6071 - CGI_10012522 superfamily 241623 1825 2144 1.60E-179 550.343 cl00119 PI3Kc_like superfamily - - "Phosphoinositide 3-kinase (PI3K)-like family, catalytic domain; The PI3K-like catalytic domain family is part of a larger superfamily that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and RIO kinases. Members of the family include PI3K, phosphoinositide 4-kinase (PI4K), PI3K-related protein kinases (PIKKs), and TRansformation/tRanscription domain-Associated Protein (TRRAP). PI3Ks catalyze the transfer of the gamma-phosphoryl group from ATP to the 3-hydroxyl of the inositol ring of D-myo-phosphatidylinositol (PtdIns) or its derivatives, while PI4K catalyze the phosphorylation of the 4-hydroxyl of PtdIns. PIKKs are protein kinases that catalyze the phosphorylation of serine/threonine residues, especially those that are followed by a glutamine. PI3Ks play an important role in a variety of fundamental cellular processes, including cell motility, the Ras pathway, vesicle trafficking and secretion, immune cell activation and apoptosis. PI4Ks produce PtdIns(4)P, the major precursor to important signaling phosphoinositides. 
PIKKs have diverse functions including cell-cycle checkpoints, genome surveillance, mRNA surveillance, and translation control." Q#6071 - CGI_10012522 superfamily 241742 1597 1772 2.51E-75 250.351 cl00271 PI3Ka superfamily - - "Phosphoinositide 3-kinase family, accessory domain (PIK domain); PIK domain is conserved in PI3 and PI4-kinases. Its role is unclear, but it has been suggested to be involved in substrate presentation. Phosphoinositide 3-kinases play an important role in a variety of fundamental cellular processes and can be divided into three main classes, defined by their substrate specificity and domain architecture." Q#6074 - CGI_10012525 superfamily 241754 899 1046 3.30E-60 210.17 cl00286 Motor_domain superfamily N - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. Q#6074 - CGI_10012525 superfamily 241754 196 375 9.34E-44 163.175 cl00286 Motor_domain superfamily C - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. Q#6078 - CGI_10012529 superfamily 245101 111 196 0.00625482 36.5221 cl09608 Cas7_I-E superfamily C - CRISPR/Cas system-associated RAMP superfamily protein Cas7; CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas7 is a RAMP superfamily protein; Subunit of the Cascade complex; also known as Cse4/CasC family Q#6080 - CGI_10012531 superfamily 219595 563 811 1.34E-66 223.341 cl06723 GLE1 superfamily - - GLE1-like protein; The members of this family are sequences that are similar to the human protein GLE1. This protein is localised at the nuclear pore complexes and functions in poly(A)+ RNA export to the cytoplasm. 
Q#6081 - CGI_10012532 superfamily 248098 127 159 0.00398577 36.0589 cl17544 U-box superfamily N - U-box domain; This domain is related to the Ring finger pfam00097 but lacks the zinc binding residues. Q#6083 - CGI_10012534 superfamily 220691 66 229 4.95E-06 46.457 cl18569 7TM_GPCR_Srv superfamily C - Serpentine type 7TM GPCR chemoreceptor Srv; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srv is a member of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#6084 - CGI_10013092 superfamily 243072 159 295 1.69E-12 63.5566 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#6084 - CGI_10013092 superfamily 243072 13 87 0.0040714 35.8223 cl02529 ANK superfamily N - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#6085 - CGI_10013093 superfamily 241574 30 165 1.94E-46 151.992 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#6088 - CGI_10013096 superfamily 217915 255 624 5.68E-79 265.909 cl14957 Spc97_Spc98 superfamily N - Spc97 / Spc98 family; The spindle pole body (SPB) functions as the microtubule-organising centre in yeast. Members of this family are spindle pole body (SPB) components such as Spc97 and Spc98 that form a complex with gamma-tubulin. This family of proteins includes the grip motif 1 and grip motif 2. Q#6091 - CGI_10013099 superfamily 247675 131 467 0 531.529 cl17011 Arginase_HDAC superfamily - - "Arginase-like and histone-like hydrolases; Arginase-like/histone-like hydrolase superfamily includes metal-dependent enzymes that belong to Arginase-like amidino hydrolase family and histone/histone-like deacetylase class I, II, IV family, respectively. These enzymes catalyze hydrolysis of amide bond. Arginases are known to be involved in control of cellular levels of arginine and ornithine, in histidine and arginine degradation and in clavulanic acid biosynthesis. Deacetylases play a role in signal transduction through histone and/or other protein modification and can repress/activate transcription of a number of different genes. 
They participate in different cellular processes including cell cycle regulation, DNA damage response, embryonic development, cytokine signaling important for immune response and post-translational control of the acetyl coenzyme A synthetase. Mammalian histone deacetylases are known to be involved in progression of different tumors. Specific inhibitors of mammalian histone deacetylases are an emerging class of promising novel anticancer drugs." Q#6093 - CGI_10013101 superfamily 247725 141 241 7.94E-07 46.4334 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#6094 - CGI_10013102 superfamily 243091 146 256 3.58E-14 69.6707 cl02566 SET superfamily - - "SET domain; SET domains are protein lysine methyltransferase enzymes. SET domains appear to be protein-protein interaction domains. It has been demonstrated that SET domains mediate interactions with a family of proteins that display similarity with dual-specificity phosphatases (dsPTPases). A subset of SET domains have been called PR domains. These domains are divergent in sequence from other SET domains, but also appear to mediate protein-protein interaction. The SET domain consists of two regions known as SET-N and SET-C. SET-C forms an unusual and conserved knot-like structure of probably functional importance. Additionally to SET-N and SET-C, an insert region (SET-I) and flanking regions of high structural variability form part of the overall structure." 
Q#6094 - CGI_10013102 superfamily 222150 442 467 3.08E-06 44.6901 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#6094 - CGI_10013102 superfamily 222150 358 383 3.49E-06 44.6901 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#6094 - CGI_10013102 superfamily 222150 498 523 3.54E-06 44.6901 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#6094 - CGI_10013102 superfamily 222150 386 411 5.42E-06 43.9197 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#6094 - CGI_10013102 superfamily 222150 470 495 5.53E-06 43.9197 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#6094 - CGI_10013102 superfamily 222150 414 438 8.11E-06 43.5345 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#6094 - CGI_10013102 superfamily 222150 526 551 8.52E-06 43.5345 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#6094 - CGI_10013102 superfamily 222150 554 578 3.74E-05 41.6085 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#6095 - CGI_10013103 superfamily 243116 2292 2669 5.28E-157 492.897 cl02626 DNA_pol_A superfamily - - "Family A polymerase primarily fills DNA gaps that arise during DNA repair, recombination and replication; DNA polymerase family A, 5'-3' polymerase domain. Family A polymerase functions primarily to fill DNA gaps that arise during DNA repair, recombination and replication. DNA-dependent DNA polymerases can be classified into six main groups based upon phylogenetic relationships with E. coli polymerase I (classA), E. coli polymerase II (class B), E.coli polymerase III (class C), euryarchaeota polymerase II (class D), human polymerase beta (class X), E. coli UmuC/DinB and eukaryotic RAP 30/Xeroderma pigmentosum variant (class Y). 
Family A polymerases are found primarily in organisms related to prokaryotes and include prokaryotic DNA polymerase I, mitochondrial polymerase gamma, and several bacteriophage polymerases including those from odd-numbered phage (T3, T5, and T7). Prokaryotic polymerase I (pol I) has two functional domains located on the same polypeptide; a 5'-3' polymerase and a 5'-3' exonuclease. Pol I uses its 5' nuclease activity to remove the ribonucleotide portion of newly synthesized Okazaki fragments and the DNA polymerase activity to fill in the resulting gap. The structure of these polymerases resembles in overall morphology a cupped human right hand, with fingers (which bind an incoming nucleotide and interact with the single-stranded template), palm (which harbors the catalytic amino acid residues and also binds an incoming dNTP) and thumb (which binds double-stranded DNA) subdomains." Q#6095 - CGI_10013103 superfamily 247905 528 612 8.63E-10 59.1737 cl17351 HELICc superfamily N - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#6095 - CGI_10013103 superfamily 247805 189 343 3.60E-08 54.2656 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 
Q#6096 - CGI_10013104 superfamily 247741 6 269 1.75E-78 242.6 cl17187 Aldolase_Class_I superfamily - - "Class I aldolases; Class I aldolases. The class I aldolases use an active-site lysine which stabilizes a reaction intermediates via Schiff base formation, and have TIM beta/alpha barrel fold. The members of this family include 2-keto-3-deoxy-6-phosphogluconate (KDPG) and 2-keto-4-hydroxyglutarate (KHG) aldolases, transaldolase, dihydrodipicolinate synthase sub-family, Type I 3-dehydroquinate dehydratase, DeoC and DhnA proteins, and metal-independent fructose-1,6-bisphosphate aldolase. Although structurally similar, the class II aldolases use a different mechanism and are believed to have an independent evolutionary origin." Q#6097 - CGI_10013105 superfamily 247792 74 111 0.00156089 33.8925 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#6098 - CGI_10013106 superfamily 247792 37 74 0.00378974 33.5073 cl17238 RING superfamily C - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type 
(RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#6100 - CGI_10013108 superfamily 244913 179 626 0 584.764 cl08327 Glyco_hydro_47 superfamily - - "Glycosyl hydrolase family 47; Members of this family are alpha-mannosidases that catalyze the hydrolysis of the terminal 1,2-linked alpha-D-mannose residues in the oligo-mannose oligosaccharide Man(9)(GlcNAc)(2)." Q#6101 - CGI_10013109 superfamily 218547 14 116 1.34E-34 117.429 cl05054 DUF727 superfamily - - Protein of unknown function (DUF727); This family consists of several uncharacterized eukaryotic proteins of unknown function. Q#6103 - CGI_10013111 superfamily 243072 48 160 5.80E-30 116.329 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#6103 - CGI_10013111 superfamily 243072 201 276 4.37E-21 90.9058 cl02529 ANK superfamily N - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#6104 - CGI_10013112 superfamily 241599 333 390 7.65E-11 57.6385 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#6107 - CGI_10005596 superfamily 242122 1413 1541 5.93E-16 77.0427 cl00824 HEPN superfamily - - HEPN domain; HEPN domain. Q#6108 - CGI_10005597 superfamily 241913 195 266 3.06E-06 44.1341 cl00509 hot_dog superfamily C - "The hotdog fold was initially identified in the E. coli FabA (beta-hydroxydecanoyl-acyl carrier protein (ACP)-dehydratase) structure and subsequently in 4HBT (4-hydroxybenzoyl-CoA thioesterase) from Pseudomonas. A number of other seemingly unrelated proteins also share the hotdog fold. These proteins have related, but distinct, catalytic activities that include metabolic roles such as thioester hydrolysis in fatty acid metabolism, and degradation of phenylacetic acid and the environmental pollutant 4-chlorobenzoate. This superfamily also includes the PaaI-like protein FapR, a non-catalytic bacterial homolog involved in transcriptional regulation of fatty acid biosynthesis." Q#6109 - CGI_10005598 superfamily 241886 13 287 3.46E-86 262.88 cl00470 Aldo_ket_red superfamily - - "Aldo-keto reductases (AKRs) are a superfamily of soluble NAD(P)(H) oxidoreductases whose chief purpose is to reduce aldehydes and ketones to primary and secondary alcohols. AKRs are present in all phyla and are of importance to both health and industrial applications. Members have very distinct functions and include the prokaryotic 2,5-diketo-D-gluconic acid reductases and beta-keto ester reductases, the eukaryotic aldose reductases, aldehyde reductases, hydroxysteroid dehydrogenases, steroid 5beta-reductases, potassium channel beta-subunits and aflatoxin aldehyde reductases, among others." 
Q#6110 - CGI_10005599 superfamily 243267 26 131 1.58E-33 124.648 cl03000 Innexin superfamily C - "Innexin; This family includes the drosophila proteins Ogre and shaking-B, and the C. elegans proteins Unc-7 and Unc-9. Members of this family are integral membrane proteins which are involved in the formation of gap junctions. This family has been named the Innexins." Q#6110 - CGI_10005599 superfamily 243267 108 209 1.24E-08 53.3864 cl03000 Innexin superfamily N - "Innexin; This family includes the drosophila proteins Ogre and shaking-B, and the C. elegans proteins Unc-7 and Unc-9. Members of this family are integral membrane proteins which are involved in the formation of gap junctions. This family has been named the Innexins." Q#6112 - CGI_10021884 superfamily 247725 400 504 6.26E-41 144.007 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#6113 - CGI_10021885 superfamily 241782 194 545 3.33E-173 498.624 cl00321 AAT_I superfamily - - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. 
Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phosphorylase family (fold type V)." Q#6113 - CGI_10021885 superfamily 149929 10 101 8.68E-06 44.4225 cl07589 Preseq_ALAS superfamily - - "5-aminolevulinate synthase presequence; The N terminal presequence domain found in 5-aminolevulinate synthase exists as an amphipathic helix, with a positively charged surface provided by lysine residues and no stable helix at the N-terminus. The domain is essential for the import process by which ALAS is transported into the mitochondria: translocase of the outer membrane (Tom) and translocase of the inner membrane protein complexes appear responsible for recognition and import through the mitochondrial membrane. The protein Tom20 is anchored to the mitochondrial outer membrane, and its interaction with presequences is thought to be the recognition step which allows subsequent import." Q#6114 - CGI_10021886 superfamily 241563 68 109 1.71E-06 45.548 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#6114 - CGI_10021886 superfamily 241563 28 59 0.00119127 37.0736 cl00034 BBOX superfamily N - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#6114 - CGI_10021886 superfamily 206779 136 203 0.00898634 36.446 cl16910 MCP_signal superfamily N - "Methyl-accepting chemotaxis protein (MCP), signaling domain; Methyl-accepting chemotaxis proteins (MCPs or chemotaxis receptors) are an integral part of the transmembrane protein complex that controls bacterial chemotaxis, together with the histidine kinase CheA, the receptor-coupling protein CheW, receptor-modification enzymes, and localized phosphatases. MCPs contain a four helix trans membrane region, an N-terminal periplasmic ligand binding domain, and a C-terminal HAMP domain followed by a cytoplasmic signaling domain. This C-terminal signaling domain dimerizes into a four-helix bundle and interacts with CheA through the adaptor protein CheW." Q#6115 - CGI_10021887 superfamily 241563 227 268 9.57E-06 43.622 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#6115 - CGI_10021887 superfamily 242274 28 110 3.44E-05 43.7935 cl01053 SGNH_hydrolase superfamily NC - "SGNH_hydrolase, or GDSL_hydrolase, is a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the typical Ser-His-Asp(Glu) triad from other serine hydrolases, but may lack the carboxylic acid." 
Q#6115 - CGI_10021887 superfamily 241563 187 218 0.00205471 37.0736 cl00034 BBOX superfamily N - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#6117 - CGI_10021889 superfamily 247724 169 344 5.33E-17 79.8074 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#6119 - CGI_10021891 superfamily 115363 194 255 5.62E-14 65.4709 cl05972 MIB_HERC2 superfamily - - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). 
Q#6119 - CGI_10021891 superfamily 241578 1 121 0.00186043 37.1602 cl00057 vWFA superfamily C - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#6121 - CGI_10021893 superfamily 115363 406 467 2.62E-10 57.3818 cl05972 MIB_HERC2 superfamily - - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). Q#6121 - CGI_10021893 superfamily 115363 605 666 6.14E-10 56.2262 cl05972 MIB_HERC2 superfamily - - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). Q#6121 - CGI_10021893 superfamily 115363 680 715 0.005646 35.8106 cl05972 MIB_HERC2 superfamily C - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). 
Q#6123 - CGI_10021895 superfamily 247905 54 104 5.13E-15 67.2628 cl17351 HELICc superfamily N - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#6123 - CGI_10021895 superfamily 247805 1 76 6.77E-25 95.6293 cl17251 DEXDc superfamily C - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#6124 - CGI_10021896 superfamily 247756 140 237 0.00101948 38.1099 cl17202 HAD superfamily N - haloacid dehalogenase-like hydrolase; haloacid dehalogenase-like hydrolase. Q#6125 - CGI_10021897 superfamily 217473 99 322 3.10E-25 105.91 cl03978 Mab-21 superfamily - - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#6131 - CGI_10021903 superfamily 247769 187 358 1.99E-13 67.7497 cl17215 HDc superfamily - - Metal dependent phosphohydrolases with conserved 'HD' motif Q#6131 - CGI_10021903 superfamily 203961 42 102 8.50E-13 63.8032 cl07209 PDEase_I_N superfamily - - 3'5'-cyclic nucleotide phosphodiesterase N-terminal; This domain is found to the N-terminus of the calcium/calmodulin-dependent 3'5'-cyclic nucleotide phosphodiesterase domain (pfam00233). 
Q#6133 - CGI_10021905 superfamily 247856 50 98 0.000747338 33.6753 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#6135 - CGI_10021907 superfamily 245874 120 174 0.000178698 38.5614 cl12111 TNFR superfamily N - "Tumor necrosis factor receptor (TNFR) domain; superfamily of TNF-like receptor domains. When bound to TNF-like cytokines, TNFRs trigger multiple signal transduction pathways, they are involved in inflammation response, apoptosis, autoimmunity and organogenesis. TNFRs domains are elongated with generally three tandem repeats of cysteine-rich domains (CRDs). They fit in the grooves between protomers within the ligand trimer. Some TNFRs, such as NGFR and HveA, bind ligands with no structural similarity to TNF and do not bind ligand trimers." Q#6136 - CGI_10021908 superfamily 247057 25 92 4.61E-15 71.5118 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." 
Q#6136 - CGI_10021908 superfamily 247057 114 181 4.61E-15 71.5118 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#6137 - CGI_10021909 superfamily 243175 198 347 8.86E-65 204.763 cl02776 GST_C_family superfamily - - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. 
The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." Q#6137 - CGI_10021909 superfamily 241832 69 146 1.11E-33 120.591 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." 
Q#6138 - CGI_10021910 superfamily 216554 49 211 5.05E-25 97.9353 cl15977 zf-DHHC superfamily - - DHHC palmitoyltransferase; This family includes the well known DHHC zinc binding domain as well as three of the four conserved transmembrane regions found in this family of palmitoyltransferase enzymes. Q#6139 - CGI_10021911 superfamily 216554 80 233 2.43E-20 85.2237 cl15977 zf-DHHC superfamily - - DHHC palmitoyltransferase; This family includes the well known DHHC zinc binding domain as well as three of the four conserved transmembrane regions found in this family of palmitoyltransferase enzymes. Q#6140 - CGI_10021912 superfamily 217473 98 289 1.31E-29 118.622 cl03978 Mab-21 superfamily N - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#6141 - CGI_10021913 superfamily 243072 831 956 1.24E-24 101.306 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#6141 - CGI_10021913 superfamily 243072 900 1023 3.40E-19 85.513 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#6143 - CGI_10021915 superfamily 199006 43 136 8.45E-29 104.265 cl10918 Cg6151-P superfamily - - "Uncharacterized conserved protein CG6151-P; This is a family of small, less than 200 residue long, proteins which are named as CG6151-P proteins that are conserved from fungi to humans. The function is unknown. The fungal members have a characteristic ICP sequence motif. Some members are annotated as putative clathrin-coated vesicle protein but this could not be defined." Q#6144 - CGI_10021916 superfamily 217473 711 934 4.83E-27 113.229 cl03978 Mab-21 superfamily - - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#6144 - CGI_10021916 superfamily 217473 96 319 5.25E-26 110.147 cl03978 Mab-21 superfamily - - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#6148 - CGI_10021920 superfamily 243106 332 457 2.95E-47 165.713 cl02608 BAH superfamily - - "BAH, or Bromo Adjacent Homology domain (also called ELM1 and BAM for Bromo Adjacent Motif). BAH domains have first been described as domains found in the polybromo protein and Yeast Rsc1/Rsc2 (Remodeling of the Structure of Chromatin). They also occur in mammalian DNA methyltransferases and the MTA1 subunits of histone deacetylase complexes. A BAH domain is also found in Yeast Sir3p and in the origin receptor complex protein 1 (Orc1p), where it was found to interact with the N-terminal lobe of the silence information regulator 1 protein (Sir1p), confirming the initial hypothesis that BAH plays a role in protein-protein interactions." Q#6148 - CGI_10021920 superfamily 243106 550 684 3.97E-56 190.786 cl02608 BAH superfamily - - "BAH, or Bromo Adjacent Homology domain (also called ELM1 and BAM for Bromo Adjacent Motif). 
BAH domains have first been described as domains found in the polybromo protein and Yeast Rsc1/Rsc2 (Remodeling of the Structure of Chromatin). They also occur in mammalian DNA methyltransferases and the MTA1 subunits of histone deacetylase complexes. A BAH domain is also found in Yeast Sir3p and in the origin receptor complex protein 1 (Orc1p), where it was found to interact with the N-terminal lobe of the silence information regulator 1 protein (Sir1p), confirming the initial hypothesis that BAH plays a role in protein-protein interactions." Q#6148 - CGI_10021920 superfamily 221394 1 131 1.54E-40 147.487 cl13480 DNMT1-RFD superfamily - - Cytosine specific DNA methyltransferase replication foci domain; This domain is part of a cytosine specific DNA methyltransferase enzyme. It functions non-catalytically to target the protein towards replication foci. This allows the DNMT1 protein to methylate the correct residues. This domain targets DMAP1 and HDAC2 to the replication foci during the S phase of mitosis. They are thought to have some importance in conversion of critical histone lysine moieties. Q#6148 - CGI_10021920 superfamily 238192 724 953 2.30E-34 133.9 cl18939 Cyt_C5_DNA_methylase superfamily C - "Cytosine-C5 specific DNA methylases; Methyl transfer reactions play an important role in many aspects of biology. Cytosine-specific DNA methylases are found both in prokaryotes and eukaryotes. DNA methylation, or the covalent addition of a methyl group to cytosine within the context of the CpG dinucleotide, has profound effects on the mammalian genome. These effects include transcriptional repression via inhibition of transcription factor binding or the recruitment of methyl-binding proteins and their associated chromatin remodeling factors, X chromosome inactivation, imprinting and the suppression of parasitic DNA sequences. DNA methylation is also essential for proper embryonic development and is an important player in both DNA repair and genome stability." 
Q#6148 - CGI_10021920 superfamily 202085 237 281 1.44E-18 81.2526 cl03401 zf-CXXC superfamily - - "CXXC zinc finger domain; This domain contains eight conserved cysteine residues that bind to two zinc ions. The CXXC domain is found in a variety of chromatin-associated proteins. This domain binds to nonmethyl-CpG dinucleotides. The domain is characterized by two CGXCXXC repeats. The RecQ helicase has a single repeat that also binds to zinc, but this has not been included in this family. The DNA binding interface has been identified by NMR." Q#6150 - CGI_10021922 superfamily 247805 3 41 0.000504685 34.2352 cl17251 DEXDc superfamily C - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#6153 - CGI_10021925 superfamily 216152 1 198 3.10E-51 171.343 cl02988 Glyco_transf_10 superfamily N - "Glycosyltransferase family 10 (fucosyltransferase); This family of Fucosyltransferases are the enzymes transferring fucose from GDP-Fucose to GlcNAc in an alpha1,3 linkage. This family is known as glycosyltransferase family 10." Q#6154 - CGI_10001660 superfamily 191362 149 202 6.75E-31 109.668 cl05351 zf-nanos superfamily - - "Nanos RNA binding domain; This family consists of several conserved novel zinc finger domains found in the eukaryotic proteins Nanos and Xcat-2. In Drosophila melanogaster, Nanos functions as a localised determinant of posterior pattern. Nanos RNA is localised to the posterior pole of the maturing egg cell and encodes a protein that emanates from this localised source. Nanos acts as a translational repressor and thereby establishes a gradient of the morphogen Hunchback. Xcat-2 is found in the vegetal cortical region and is inherited by the vegetal blastomeres during development, and is degraded very early in development. 
The localised and maternally restricted expression of Xcat-2 RNA suggests a role for its protein in setting up regional differences in gene expression that occur early in development." Q#6155 - CGI_10001661 superfamily 241563 68 109 6.13E-05 38.6144 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#6158 - CGI_10007253 superfamily 199166 657 806 5.46E-17 80.8344 cl15308 AMN1 superfamily C - "Antagonist of mitotic exit network protein 1; Amn1 has been functionally characterized in Saccharomyces cerevisiae as a component of the Antagonist of MEN pathway (AMEN). The AMEN network is activated by MEN (mitotic exit network) via an active Cdc14, and in turn switches off MEN. Amn1 constitutes one of the alternative mechanisms by which MEN may be disrupted. Specifically, Amn1 binds Tem1 (Termination of M-phase, a GTPase that belongs to the RAS superfamily), and disrupts its association with Cdc15, the primary downstream target. Amn1 is a leucine-rich repeat (LRR) protein, with 12 repeats in the S. cerevisiae ortholog. As a negative regulator of the signal transduction pathway MEN, overexpression of AMN1 slows the growth of wild type cells. The function of the vertebrate members of this family has not been determined experimentally, they have fewer LRRs that determine the extent of this model." Q#6158 - CGI_10007253 superfamily 199166 458 645 9.18E-16 76.9824 cl15308 AMN1 superfamily - - "Antagonist of mitotic exit network protein 1; Amn1 has been functionally characterized in Saccharomyces cerevisiae as a component of the Antagonist of MEN pathway (AMEN). The AMEN network is activated by MEN (mitotic exit network) via an active Cdc14, and in turn switches off MEN. Amn1 constitutes one of the alternative mechanisms by which MEN may be disrupted. 
Specifically, Amn1 binds Tem1 (Termination of M-phase, a GTPase that belongs to the RAS superfamily), and disrupts its association with Cdc15, the primary downstream target. Amn1 is a leucine-rich repeat (LRR) protein, with 12 repeats in the S. cerevisiae ortholog. As a negative regulator of the signal transduction pathway MEN, overexpression of AMN1 slows the growth of wild type cells. The function of the vertebrate members of this family has not been determined experimentally, they have fewer LRRs that determine the extent of this model." Q#6158 - CGI_10007253 superfamily 199166 337 530 5.92E-14 71.2044 cl15308 AMN1 superfamily - - "Antagonist of mitotic exit network protein 1; Amn1 has been functionally characterized in Saccharomyces cerevisiae as a component of the Antagonist of MEN pathway (AMEN). The AMEN network is activated by MEN (mitotic exit network) via an active Cdc14, and in turn switches off MEN. Amn1 constitutes one of the alternative mechanisms by which MEN may be disrupted. Specifically, Amn1 binds Tem1 (Termination of M-phase, a GTPase that belongs to the RAS superfamily), and disrupts its association with Cdc15, the primary downstream target. Amn1 is a leucine-rich repeat (LRR) protein, with 12 repeats in the S. cerevisiae ortholog. As a negative regulator of the signal transduction pathway MEN, overexpression of AMN1 slows the growth of wild type cells. The function of the vertebrate members of this family has not been determined experimentally, they have fewer LRRs that determine the extent of this model." Q#6158 - CGI_10007253 superfamily 243074 292 332 6.52E-11 59.0573 cl02535 F-box-like superfamily - - F-box-like; This is an F-box-like family. Q#6159 - CGI_10007254 superfamily 222150 495 520 1.72E-06 45.8457 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. 
Q#6159 - CGI_10007254 superfamily 222150 439 464 1.08E-05 43.5345 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#6159 - CGI_10007254 superfamily 222150 467 491 0.00451484 35.8305 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#6159 - CGI_10007254 superfamily 222150 581 606 0.00770489 35.0601 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#6163 - CGI_10007258 superfamily 248458 109 404 3.84E-05 44.2269 cl17904 MFS superfamily - - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. 
Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#6164 - CGI_10007259 superfamily 241758 8 136 5.27E-24 97.0554 cl00292 AANH_like superfamily - - "Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases, ATP sulphurylases Universal Stress Response protein and electron transfer flavoprotein (ETF). The domain forms an alpha/beta/alpha fold which binds to Adenosine nucleotide." Q#6166 - CGI_10007261 superfamily 245847 41 159 1.54E-11 57.9517 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#6169 - CGI_10009877 superfamily 242043 35 73 1.06E-09 50.8036 cl00713 Auto_anti-p27 superfamily - - Sjogren's syndrome/scleroderma autoantigen 1 (Autoantigen p27); This family consists of several Sjogren's syndrome/scleroderma autoantigen 1 (Autoantigen p27) sequences. It is thought that the potential association of anti-p27 with anti-centromere antibodies suggests that autoantigen p27 might play a role in mitosis. Q#6170 - CGI_10009878 superfamily 248054 24 218 2.33E-05 44.2155 cl17500 NAD_binding_8 superfamily - - NAD(P)-binding Rossmann-like domain; NAD(P)-binding Rossmann-like domain. Q#6171 - CGI_10009879 superfamily 248054 6 220 1.88E-14 71.5647 cl17500 NAD_binding_8 superfamily - - NAD(P)-binding Rossmann-like domain; NAD(P)-binding Rossmann-like domain. Q#6172 - CGI_10009880 superfamily 245864 42 305 2.78E-44 159.751 cl12078 p450 superfamily C - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. 
They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#6172 - CGI_10009880 superfamily 245864 313 361 0.000275927 41.1098 cl12078 p450 superfamily N - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. 
Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#6174 - CGI_10009882 superfamily 218208 632 750 3.17E-36 133.654 cl18447 CwfJ_C_1 superfamily - - Protein similar to CwfJ C-terminus 1; This region is found in the N terminus of Schizosaccharomyces pombe protein CwfJ. CwfJ is part of the Cdc5p complex involved in mRNA splicing. Q#6174 - CGI_10009882 superfamily 218207 759 853 2.37E-29 113.524 cl04666 CwfJ_C_2 superfamily - - Protein similar to CwfJ C-terminus 2; This region is found in the N terminus of Schizosaccharomyces pombe protein CwfJ. CwfJ is part of the Cdc5p complex involved in mRNA splicing. Q#6176 - CGI_10009884 superfamily 241814 47 226 1.77E-27 105.499 cl00360 COG0212 superfamily - - 5-formyltetrahydrofolate cyclo-ligase [Coenzyme metabolism] Q#6178 - CGI_10009886 superfamily 246681 6 206 2.04E-101 295.169 cl14643 SRPBCC superfamily - - "START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC (SRPBCC) ligand-binding domain superfamily; SRPBCC domains have a deep hydrophobic ligand-binding pocket; they bind diverse ligands. Included in this superfamily are the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of mammalian STARD1-STARD15, and the C-terminal catalytic domains of the alpha oxygenase subunit of Rieske-type non-heme iron aromatic ring-hydroxylating oxygenases (RHOs_alpha_C), as well as the SRPBCC domains of phosphatidylinositol transfer proteins (PITPs), Bet v 1 (the major pollen allergen of white birch, Betula verrucosa), CoxG, CalC, and related proteins. Other members of this superfamily include PYR/PYL/RCAR plant proteins, the aromatase/cyclase (ARO/CYC) domains of proteins such as Streptomyces glaucescens tetracenomycin, and the SRPBCC domains of Streptococcus mutans Smu.440 and related proteins." 
Q#6179 - CGI_10009887 superfamily 241974 597 728 1.07E-19 85.7562 cl00604 STAS superfamily - - "Sulphate Transporter and Anti-Sigma factor antagonist domain found in the C-terminal region of sulphate transporters as well as in bacterial and archaeal proteins involved in the regulation of sigma factors; The STAS (Sulphate Transporter and Anti-Sigma factor antagonist) domain is found in the C-terminal region of sulphate transporters as well as in bacterial and archaeal proteins involved in the regulation of sigma factors, like anti-anti-sigma factors and "stressosome" components. The sigma factor regulators are involved in protein-protein interaction which is regulated by phosphorylation." Q#6179 - CGI_10009887 superfamily 216188 269 537 1.39E-43 158.535 cl18360 Sulfate_transp superfamily - - Sulfate transporter family; Mutations in human SLC26A2 lead to several human diseases. Q#6179 - CGI_10009887 superfamily 205965 126 209 1.21E-25 101.72 cl18285 Sulfate_tra_GLY superfamily - - "Sulfate transporter N-terminal domain with GLY motif; This domain is found usually at the N-terminus of sulfate-transporter proteins. It carries a highly conserved GLY sequence motif, but the function of the domain is not known." 
Q#6180 - CGI_10009888 superfamily 247792 410 434 7.27E-05 40.5068 cl17238 RING superfamily C - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#6180 - CGI_10009888 superfamily 245814 241 286 0.000546821 37.8539 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#6180 - CGI_10009888 superfamily 245814 325 391 2.73E-09 53.5647 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. 
A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#6180 - CGI_10009888 superfamily 245814 147 208 0.00334556 35.5589 cl11960 Ig superfamily C - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#6181 - CGI_10009889 superfamily 245814 218 285 2.49E-10 55.4907 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#6181 - CGI_10009889 superfamily 245814 112 182 7.60E-09 51.7373 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#6181 - CGI_10009889 superfamily 245814 53 84 0.00115648 36.2308 cl11960 Ig superfamily N - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#6182 - CGI_10009890 superfamily 245814 82 128 0.00101357 36.6983 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#6182 - CGI_10009890 superfamily 245814 262 329 1.24E-11 59.3427 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. 
The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#6182 - CGI_10009890 superfamily 245814 156 227 6.85E-07 46.3445 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#6183 - CGI_10009891 superfamily 243034 394 492 3.51E-11 61.6272 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#6183 - CGI_10009891 superfamily 243034 786 891 1.10E-08 54.3084 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of 
TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#6183 - CGI_10009891 superfamily 243034 20 84 1.32E-05 44.6784 cl02429 TPR superfamily C - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit 
of O-GlcNAc transferase; three copies of the repeat are present here" Q#6183 - CGI_10009891 superfamily 243072 259 366 0.00133302 38.9039 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#6184 - CGI_10009892 superfamily 243072 296 410 7.66E-06 46.993 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#6184 - CGI_10009892 superfamily 243072 5 81 0.00847421 37.3631 cl02529 ANK superfamily NC - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#6186 - CGI_10000669 superfamily 244895 38 514 4.51E-110 337.982 cl08294 Peptidase_M17 superfamily - - "Cytosol aminopeptidase family, N-terminal and catalytic domains. Family M17 contains zinc- and manganese-dependent exopeptidases ( EC 3.4.11.1), including leucine aminopeptidase. 
They catalyze removal of amino acids from the N-terminus of a protein and play a key role in protein degradation and in the metabolism of biologically active peptides. They do not contain HEXXH motif (which is used as one of the signature patterns to group the peptidase families) in the metal-binding site. The two associated zinc ions and the active site are entirely enclosed within the C-terminal catalytic domain in leucine aminopeptidase. The enzyme is a hexamer, with the catalytic domains clustered around the three-fold axis, and the two trimers related to one another by a two-fold rotation. The N-terminal domain is structurally similar to the ADP-ribose binding Macro domain. This family includes proteins from bacteria, archaea, animals and plants." Q#6187 - CGI_10002040 superfamily 247727 84 194 3.55E-07 47.0395 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#6188 - CGI_10002041 superfamily 188051 276 550 4.09E-112 340.213 cl18155 nop2p superfamily - - "NOL1/NOP2/sun family putative RNA methylase; [Protein synthesis, tRNA and rRNA base modification]." Q#6190 - CGI_10002043 superfamily 242406 25 67 0.000738169 35.432 cl01271 DUF1768 superfamily N - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. 
Q#6192 - CGI_10005659 superfamily 242406 1 45 0.00457227 33.7189 cl01271 DUF1768 superfamily N - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. Q#6194 - CGI_10005661 superfamily 243072 70 179 1.27E-29 108.24 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#6196 - CGI_10005663 superfamily 241585 216 250 1.51E-06 44.8172 cl00066 FU superfamily C - Furin-like repeats. Cysteine rich region. Exact function of the domain is not known. Furin is a serine-kinase dependent proprotein processor. Other members of this family include endoproteases and cell surface receptors. Q#6196 - CGI_10005663 superfamily 216254 29 144 1.44E-26 101.941 cl08303 Recep_L_domain superfamily - - Receptor L domain; The L domains from these receptors make up the bilobal ligand binding site. Each L domain consists of a single-stranded right hand beta-helix. This Pfam entry is missing the first 50 amino acid residues of the domain. Q#6197 - CGI_10000878 superfamily 241802 53 168 6.89E-24 94.8612 cl00342 Trp-synth-beta_II superfamily C - "Tryptophan synthase beta superfamily (fold type II); this family of pyridoxal phosphate (PLP)-dependent enzymes catalyzes beta-replacement and beta-elimination reactions. 
This CD corresponds to aminocyclopropane-1-carboxylate deaminase (ACCD), tryptophan synthase beta chain (Trp-synth_B), cystathionine beta-synthase (CBS), O-acetylserine sulfhydrylase (CS), serine dehydratase (Ser-dehyd), threonine dehydratase (Thr-dehyd), diaminopropionate ammonia lyase (DAL), and threonine synthase (Thr-synth). ACCD catalyzes the conversion of 1-aminocyclopropane-1-carboxylate to alpha-ketobutyrate and ammonia. Tryptophan synthase folds into a tetramer, where the beta chain is the catalytic PLP-binding subunit and catalyzes the formation of L-tryptophan from indole and L-serine. CBS is a tetrameric hemeprotein that catalyzes condensation of serine and homocysteine to cystathionine. CS is a homodimer that catalyzes the formation of L-cysteine from O-acetyl-L-serine. Ser-dehyd catalyzes the conversion of L- or D-serine to pyruvate and ammonia. Thr-dehyd is active as a homodimer and catalyzes the conversion of L-threonine to 2-oxobutanoate and ammonia. DAL is also a homodimer and catalyzes the alpha, beta-elimination reaction of both L- and D-alpha, beta-diaminopropionate to form pyruvate and ammonia. Thr-synth catalyzes the formation of threonine and inorganic phosphate from O-phosphohomoserine." Q#6198 - CGI_10000879 superfamily 221442 19 85 1.61E-23 91.4565 cl18607 Hydrolase_4 superfamily - - "Putative lysophospholipase; This domain is found in bacteria and eukaryotes and is approximately 110 amino acids in length. It is found in association with pfam00561. Many members are annotated as being lysophospholipases, and others as alpha-beta hydrolase fold-containing proteins." 
Q#6198 - CGI_10000879 superfamily 225375 106 153 0.000262771 40.0821 cl18715 COG2819 superfamily NC - Predicted hydrolase of the alpha/beta superfamily [General function prediction only] Q#6204 - CGI_10001100 superfamily 241563 62 97 0.00034947 38.6144 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#6204 - CGI_10001100 superfamily 241563 8 53 0.008482 34.6203 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#6207 - CGI_10006678 superfamily 219619 352 426 1.15E-10 57.6027 cl18518 Ion_trans_2 superfamily - - Ion channel; This family includes the two membrane helix type ion channels found in bacteria. Q#6207 - CGI_10006678 superfamily 243066 10 90 1.02E-07 49.4737 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. 
POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#6208 - CGI_10006679 superfamily 219619 281 356 3.35E-11 58.7583 cl18518 Ion_trans_2 superfamily - - Ion channel; This family includes the two membrane helix type ion channels found in bacteria. Q#6208 - CGI_10006679 superfamily 243066 2 43 0.00508183 34.8361 cl02518 BTB superfamily N - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#6209 - CGI_10006680 superfamily 214531 821 863 4.34E-09 54.1449 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. 
Q#6209 - CGI_10006680 superfamily 214531 164 207 2.77E-05 42.9741 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#6209 - CGI_10006680 superfamily 241578 320 353 0.000212199 42.7572 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#6209 - CGI_10006680 superfamily 214531 210 249 0.000569119 39.1221 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#6209 - CGI_10006680 superfamily 214531 131 159 0.00469864 36.4257 cl18310 LY superfamily N - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. 
Also present in a variety of molecules similar to gp300/megalin. Q#6210 - CGI_10006681 superfamily 241578 1 130 6.60E-22 89.6582 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#6210 - CGI_10006681 superfamily 111397 152 232 1.93E-08 50.0323 cl03620 HYR superfamily - - "HYR domain; This domain is known as the HYR (Hyalin Repeat) domain, after the protein hyalin that is composed exclusively of this repeat. This domain probably corresponds to a new superfamily in the immunoglobulin fold. The function of this domain is uncertain it may be involved in cell adhesion." Q#6212 - CGI_10006683 superfamily 111397 17 78 2.80E-06 41.1727 cl03620 HYR superfamily N - "HYR domain; This domain is known as the HYR (Hyalin Repeat) domain, after the protein hyalin that is composed exclusively of this repeat. This domain probably corresponds to a new superfamily in the immunoglobulin fold. The function of this domain is uncertain it may be involved in cell adhesion." 
Q#6213 - CGI_10006684 superfamily 241786 9 264 2.26E-85 262.08 cl00325 Ribosomal_L4 superfamily - - Ribosomal protein L4/L1 family; This family includes Ribosomal L4/L1 from eukaryotes and archaebacteria and L4 from eubacteria. L4 from yeast has been shown to bind rRNA. Q#6213 - CGI_10006684 superfamily 222716 274 353 1.02E-27 104.224 cl16836 Ribos_L4_asso_C superfamily - - 60S ribosomal protein L4 C-terminal domain; This family is found at the very C-terminal of 60 ribosomal L4 proteins. Q#6214 - CGI_10006685 superfamily 217915 4 562 1.29E-97 311.748 cl14957 Spc97_Spc98 superfamily - - Spc97 / Spc98 family; The spindle pole body (SPB) functions as the microtubule-organising centre in yeast. Members of this family are spindle pole body (SBP) components such as Spc97 and Spc98 that form a complex with gamma-tubulin. This family of proteins includes the grip motif 1 and grip moti 2. Q#6215 - CGI_10006686 superfamily 245814 27 95 4.32E-06 41.7059 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#6217 - CGI_10010123 superfamily 247677 901 1033 4.18E-40 146.661 cl17013 W2 superfamily - - "C-terminal domain of eIF4-gamma/eIF5/eIF2b-epsilon; This domain is found at the C-terminus of several translation initiation factors, including the epsilon chain of eIF2b, where it has been found to catalyze the conversion of eIF2.GDP to its active eIF2.GTP form. 
The structure of the domain resembles that of a set of concatenated HEAT repeats." Q#6217 - CGI_10010123 superfamily 247677 1154 1286 1.42E-38 142.038 cl17013 W2 superfamily - - "C-terminal domain of eIF4-gamma/eIF5/eIF2b-epsilon; This domain is found at the C-terminus of several translation initiation factors, including the epsilon chain of eIF2b, where it has been found to catalyze the conversion of eIF2.GDP to its active eIF2.GTP form. The structure of the domain resembles that of a set of concatenated HEAT repeats." Q#6217 - CGI_10010123 superfamily 243128 65 274 1.95E-22 97.4002 cl02652 MIF4G superfamily N - "MIF4G domain; MIF4G is named after Middle domain of eukaryotic initiation factor 4G (eIF4G). Also occurs in NMD2p and CBP80. The domain is rich in alpha-helices and may contain multiple alpha-helical repeats. In eIF4G, this domain binds eIF4A, eIF3, RNA and DNA." Q#6217 - CGI_10010123 superfamily 243129 711 812 1.56E-17 80.7605 cl02653 MA3 superfamily - - "MA3 domain; Domain in DAP-5, eIF4G, MA-3 and other proteins. Highly alpha-helical. May contain repeats and/or regions similar to MIF4G domains." Q#6217 - CGI_10010123 superfamily 243129 483 594 7.53E-15 73.0565 cl02653 MA3 superfamily - - "MA3 domain; Domain in DAP-5, eIF4G, MA-3 and other proteins. Highly alpha-helical. May contain repeats and/or regions similar to MIF4G domains." Q#6218 - CGI_10010124 superfamily 247068 570 661 1.48E-07 50.0046 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. 
Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6218 - CGI_10010124 superfamily 247068 483 560 3.65E-07 48.849 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6219 - CGI_10010125 superfamily 247743 153 299 6.43E-13 65.6303 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#6220 - CGI_10010126 superfamily 216554 69 248 4.73E-26 104.869 cl15977 zf-DHHC superfamily - - DHHC palmitoyltransferase; This family includes the well known DHHC zinc binding domain as well as three of the four conserved transmembrane regions found in this family of palmitoyltransferase enzymes. 
Q#6221 - CGI_10010127 superfamily 241913 511 621 8.27E-30 114.588 cl00509 hot_dog superfamily - - "The hotdog fold was initially identified in the E. coli FabA (beta-hydroxydecanoyl-acyl carrier protein (ACP)-dehydratase) structure and subsequently in 4HBT (4-hydroxybenzoyl-CoA thioesterase) from Pseudomonas. A number of other seemingly unrelated proteins also share the hotdog fold. These proteins have related, but distinct, catalytic activities that include metabolic roles such as thioester hydrolysis in fatty acid metabolism, and degradation of phenylacetic acid and the environmental pollutant 4-chlorobenzoate. This superfamily also includes the PaaI-like protein FapR, a non-catalytic bacterial homolog involved in transcriptional regulation of fatty acid biosynthesis." Q#6221 - CGI_10010127 superfamily 241913 321 440 1.36E-14 71.0609 cl00509 hot_dog superfamily - - "The hotdog fold was initially identified in the E. coli FabA (beta-hydroxydecanoyl-acyl carrier protein (ACP)-dehydratase) structure and subsequently in 4HBT (4-hydroxybenzoyl-CoA thioesterase) from Pseudomonas. A number of other seemingly unrelated proteins also share the hotdog fold. These proteins have related, but distinct, catalytic activities that include metabolic roles such as thioester hydrolysis in fatty acid metabolism, and degradation of phenylacetic acid and the environmental pollutant 4-chlorobenzoate. This superfamily also includes the PaaI-like protein FapR, a non-catalytic bacterial homolog involved in transcriptional regulation of fatty acid biosynthesis." Q#6222 - CGI_10010128 superfamily 241644 9 145 5.45E-55 173.157 cl00154 UBCc superfamily - - "Ubiquitin-conjugating enzyme E2, catalytic (UBCc) domain. This is part of the ubiquitin-mediated protein degradation pathway in which a thiol-ester linkage forms between a conserved cysteine and the C-terminus of ubiquitin and complexes with ubiquitin protein ligase enzymes, E3. This pathway regulates many fundamental cellular processes. 
There are also other E2s which form thiol-ester linkages without the use of E3s as well as several UBC homologs (TSG101, Mms2, Croc-1 and similar proteins) which lack the active site cysteine essential for ubiquitination and appear to function in DNA repair pathways which were omitted from the scope of this CD." Q#6223 - CGI_10010129 superfamily 241644 9 160 8.77E-63 192.417 cl00154 UBCc superfamily - - "Ubiquitin-conjugating enzyme E2, catalytic (UBCc) domain. This is part of the ubiquitin-mediated protein degradation pathway in which a thiol-ester linkage forms between a conserved cysteine and the C-terminus of ubiquitin and complexes with ubiquitin protein ligase enzymes, E3. This pathway regulates many fundamental cellular processes. There are also other E2s which form thiol-ester linkages without the use of E3s as well as several UBC homologs (TSG101, Mms2, Croc-1 and similar proteins) which lack the active site cysteine essential for ubiquitination and appear to function in DNA repair pathways which were omitted from the scope of this CD." Q#6225 - CGI_10010131 superfamily 247095 50 496 8.60E-155 449.799 cl15837 alkPPc superfamily - - "Alkaline phosphatase homologues; alkaline phosphatases are non-specific phosphomonoesterases that catalyze the hydrolysis reaction via a phosphoseryl intermediate to produce inorganic phosphate and the corresponding alcohol, optimally at high pH. Alkaline phosphatase exists as a dimer, each monomer binding 2 zinc atoms and one magnesium atom, which are essential for enzymatic activity." Q#6226 - CGI_10010132 superfamily 247095 85 531 3.71E-171 493.712 cl15837 alkPPc superfamily - - "Alkaline phosphatase homologues; alkaline phosphatases are non-specific phosphomonoesterases that catalyze the hydrolysis reaction via a phosphoseryl intermediate to produce inorganic phosphate and the corresponding alcohol, optimally at high pH. 
Alkaline phosphatase exists as a dimer, each monomer binding 2 zinc atoms and one magnesium atom, which are essential for enzymatic activity." Q#6227 - CGI_10010133 superfamily 241750 8 447 0 579.245 cl00281 metallo-dependent_hydrolases superfamily - - "Superfamily of metallo-dependent hydrolases (also called amidohydrolase superfamily) is a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. The family includes urease alpha, adenosine deaminase, phosphotriesterase dihydroorotases, allantoinases, hydantoinases, AMP-, adenine and cytosine deaminases, imidazolonepropionase, aryldialkylphosphatase, chlorohydrolases, formylmethanofuran dehydrogenases and others." Q#6233 - CGI_10010139 superfamily 221049 881 1139 2.95E-73 247.235 cl12807 Tho2 superfamily - - "Transcription factor/nuclear export subunit protein 2; THO and TREX form a eukaryotic complex which functions in messenger ribonucleoprotein metabolism and plays a role in preventing the transcription-associated genetic instability. Tho2, along with four other subunits forms THO" Q#6233 - CGI_10010139 superfamily 152168 593 668 3.07E-33 125.066 cl13220 Thoc2 superfamily - - "Transcription- and export-related complex subunit; The THO/TREX complex is the transcription- and export-related complex associated with spliceosomes that preferentially deal with spliced mRNAs as opposed to unspliced mRNAs. Thoc2 plays a role in RNA polymerase II (RNA pol II)-dependent transcription and is required for the stability of DNA repeats. 
In humans, the TRE complex is comprised of the exon-junction-associated proteins Aly/REF and UAP56 together with the THO proteins THOC1 (hHpr1/p84), Thoc2 (hRlr1), THOC3 (hTex1), THOC5 (fSAP79), THOC6 (fSAP35), and THOC7 (fSAP24). Although much evidence indicates that the function of the TREX complex as an adaptor between the mRNA and components of the export machinery is conserved among eukaryotes, in Drosophila the majority of mRNAs can be exported from the nucleus independently of the THO complex." Q#6234 - CGI_10010140 superfamily 243175 339 400 5.59E-25 98.1586 cl02776 GST_C_family superfamily N - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. 
Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." Q#6234 - CGI_10010140 superfamily 242920 42 91 1.11E-06 46.2679 cl02174 TAF13 superfamily N - "The TATA Binding Protein (TBP) Associated Factor 13 (TAF13) is one of several TAFs that bind TBP and is involved in forming Transcription Factor IID (TFIID) complex; The TATA Binding Protein (TBP) Associated Factor 13 (TAF13) is one of several TAFs that bind TBP and is involved in forming the Transcription Factor IID (TFIID) complex. TFIID is one of seven General Transcription Factors (GTF) (TFIIA, TFIIB, TFIID, TFIIE, TFIIF, and TFIID) that are involved in accurate initiation of transcription by RNA polymerase II in eukaryotes. TFIID plays an important role in the recognition of promoter DNA and assembly of the pre-initiation complex. TFIID complex is composed of the TBP and at least 13 TAFs. TAFs from various species were originally named by their predicted molecular weight or their electrophoretic mobility in polyacrylamide gels. A new, unified nomenclature for the pol II TAFs has been suggested to show the relationship between TAFs orthologs and paralogs. Several hypotheses are proposed for TAFs functions such as serving as activator-binding sites, core-promoter recognition or a role in essential catalytic activity. Each TAF, with the help of a specific activator, is required only for expression of subset of genes and is not universally involved for transcription as are GTFs. 
In yeast and human cells, TAFs have been found as components of other complexes besides TFIID. Several TAFs interact via histone-fold (HFD) motifs; the HFD is the interaction motif involved in heterodimerization of the core histones and their assembly into nucleosome octamers. The minimal HFD contains three alpha-helices linked by two loops and are found in core histones, TAFs and many other transcription factors. TFIID has a histone octamer-like substructure. TAF13 interacts with TAF11 and makes a histone-like heterodimer similar to H3/H4-like proteins. The dimer may be structurally and functionally similar to the spt3 protein within the SAGA histone acetyltransferase complex." Q#6235 - CGI_10010141 superfamily 217474 71 332 9.53E-32 122.461 cl03979 PAE superfamily - - Pectinacetylesterase; Pectinacetylesterase. Q#6236 - CGI_10001227 superfamily 222225 8 124 1.11E-05 44.0504 cl18652 DoxX_2 superfamily - - DoxX-like family; This family of uncharacterized proteins are related to DoxX pfam07681. Q#6237 - CGI_10001228 superfamily 243948 8 328 2.49E-120 353.133 cl04955 LanC_like superfamily - - "LanC-like proteins. LanC is the cyclase enzyme of the lanthionine synthetase. Lanthionine is a lantibiotic, a unique class of peptide antibiotics. They are ribosomally synthesized as a precursor peptide and then post-translationally modified to contain thioether cross-links called lanthionines (Lans) or methyllanthionines (MeLans), in addition to 2,3-didehydroalanine (Dha) and (Z)-2,3-didehydrobutyrine (Dhb). These unusual amino acids are introduced by the dehydration of serine and threonine residues, followed by thioether formation via addition of cysteine thiols, catalysed by LanB and LanC or LanM. LanC, the cyclase component, is a zinc metalloprotein, whose bound metal has been proposed to activate the thiol substrate for nucleophilic addition. A related domain is also present in LanM and other pro- and eukaryotic proteins of unknown function." 
Q#6239 - CGI_10004240 superfamily 247746 251 365 0.00149674 38.0082 cl17192 ATP-synt_B superfamily - - "ATP synthase B/B' CF(0); Part of the CF(0) (base unit) of the ATP synthase. The base unit is thought to translocate protons through membrane (inner membrane in mitochondria, thylakoid membrane in plants, cytoplasmic membrane in bacteria). The B subunits are thought to interact with the stalk of the CF(1) subunits. This domain should not be confused with the ab CF(1) proteins (in the head of the ATP synthase) which are found in pfam00006" Q#6240 - CGI_10004241 superfamily 247683 255 328 0.0086401 34.4785 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#6245 - CGI_10001192 superfamily 247724 40 87 0.00157367 36.8165 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. 
The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#6248 - CGI_10001391 superfamily 248458 50 449 1.04E-17 83.5173 cl17904 MFS superfamily - - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. 
Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#6249 - CGI_10014087 superfamily 243555 10 202 2.05E-12 62.0234 cl03871 Chitin_bind_3 superfamily - - "Chitin binding domain; This domain is found associated with a wide variety of cellulose binding domain. This domain however is a chitin binding domain. This domain is found in isolation in baculoviral spheroidins and spindolins, protein of unknown function." Q#6252 - CGI_10014090 superfamily 241766 20 294 5.79E-124 371.06 cl00303 PNP_UDP_1 superfamily - - Phosphorylase superfamily; Members of this family include: purine nucleoside phosphorylase (PNP) Uridine phosphorylase (UdRPase) 5'-methylthioadenosine phosphorylase (MTA phosphorylase) Q#6254 - CGI_10014092 superfamily 241782 32 400 1.82E-51 177.533 cl00321 AAT_I superfamily - - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. 
Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into five major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), D-amino acid superfamily (fold type IV), and glycogen phosphorylase family (fold type V)." Q#6255 - CGI_10014093 superfamily 241596 55 110 1.17E-09 51.4459 cl00081 HLH superfamily - - "Helix-loop-helix domain, found in specific DNA-binding proteins that act as transcription factors; 60-100 amino acids long.
A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE 5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved in both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#6256 - CGI_10014094 superfamily 248458 39 204 1.48E-11 64.6425 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. 
The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#6256 - CGI_10014094 superfamily 248458 352 534 1.53E-07 52.3161 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. 
Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#6257 - CGI_10014095 superfamily 176599 20 153 1.89E-29 106.4 cl03381 pVHL superfamily - - "von Hippel-Lindau (pVHL) tumor suppressor protein; von Hippel-Lindau (pVHL) protein, the gene product of VHL, is a critical regulator of the ubiquitous oxygen-sensing pathway. It is conserved throughout evolution, as its homologs are found in organisms ranging from mammals to the insects Drosophila melanogaster and Anopheles gambiae and the nematode Caenorhabditis elegans. pVHL acts as the substrate recognition component of an E3 ubiquitin ligase complex. Several proteins have been identified as pVHL-binding proteins that are subject to ubiquitin-mediated proteolysis; the best characterized putative substrates are the alpha subunits of the hypoxia-inducible factor (HIF1alpha, HIF2alpha, and HIF3alpha). In addition to HIF degradation, pVHL has been implicated in HIF-independent cellular processes. Germline VHL mutations cause renal cell carcinomas, hemangioblastomas and pheochromocytomas in humans. pVHL can bind to and direct the proper deposition of fibronectin and collagen IV within the extracellular matrix. It works to stabilize microtubules and foster the maintenance of the primary cilium.
It also has been reported to promote the stabilization and activation of p53 in a HIF-independent manner and, in neuronal cells, promote apoptosis by down-regulation of Jun-B." Q#6260 - CGI_10014098 superfamily 245213 182 215 1.75E-05 41.6934 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#6260 - CGI_10014098 superfamily 245847 225 354 1.40E-16 75.0813 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#6261 - CGI_10014099 superfamily 245213 177 210 2.00E-07 47.2462 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#6261 - CGI_10014099 superfamily 247068 106 162 0.000112515 39.9894 cl15786 CA_like superfamily C - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6261 - CGI_10014099 superfamily 245847 228 355 1.06E-08 52.3545 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#6262 - CGI_10014100 superfamily 245847 177 317 3.62E-28 107.053 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#6262 - CGI_10014100 superfamily 218564 122 160 0.000803867 36.81 cl18463 He_PIG superfamily - - Putative Ig domain; This alignment represents the conserved core region of ~90 residue repeat found in several haemagglutinins and other cell surface proteins. Sequence similarities to (pfam02494) and (pfam00801) suggest an Ig-like fold (personal obs:C. Yeats). So this family may be similar in function to the (pfam02639) and (pfam02638) domains. This domain is also found in the WisP family of proteins of Tropheryma whipplei. Q#6263 - CGI_10014101 superfamily 245213 328 362 2.76E-08 50.3278 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#6263 - CGI_10014101 superfamily 219635 21 186 1.14E-45 159.734 cl06790 Peptidase_C78 superfamily - - Peptidase family C78; This family formerly known as DUF1671 has been shown to be a cysteine peptidase called (Ufm1)-specific protease. Q#6263 - CGI_10014101 superfamily 245847 370 515 4.96E-11 60.4437 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#6264 - CGI_10014102 superfamily 245847 209 352 1.06E-24 97.4228 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#6264 - CGI_10014102 superfamily 245213 169 203 2.21E-08 49.9426 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#6264 - CGI_10014102 superfamily 247068 93 169 1.61E-05 42.3006 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6266 - CGI_10014104 superfamily 241563 69 98 0.00228662 36.3032 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#6266 - CGI_10014104 superfamily 204194 111 208 0.00907434 34.9353 cl07806 Cortex-I_coil superfamily - - "Cortexillin I, coiled coil; Members of this family are predominantly found in the actin-bundling protein Cortexillin I from Dictyostelium discoideum. They adopt a structure consisting of an 18-heptad-repeat alpha-helical coiled-coil, and are a prerequisite for the assembly of Cortexillin I." Q#6267 - CGI_10014105 superfamily 246669 1115 1267 2.08E-67 224.186 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands.
Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#6267 - CGI_10014105 superfamily 243096 768 951 4.17E-47 168.245 cl02571 RhoGEF superfamily - - Guanine nucleotide exchange factor for Rho/Rac/Cdc42-like GTPases; Also called Dbl-homologous (DH) domain. It appears that PH domains invariably occur C-terminal to RhoGEF/DH domains. Q#6267 - CGI_10014105 superfamily 247683 582 639 2.46E-31 118.98 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others.
Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#6267 - CGI_10014105 superfamily 247683 689 740 6.42E-26 103.264 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#6267 - CGI_10014105 superfamily 247683 376 425 1.21E-25 102.493 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. 
SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#6267 - CGI_10014105 superfamily 247683 283 334 1.92E-25 101.674 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#6267 - CGI_10014105 superfamily 247725 978 1104 2.08E-44 158.775 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. 
This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#6267 - CGI_10014105 superfamily 247683 176 231 9.27E-22 91.2654 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#6270 - CGI_10001538 superfamily 245201 131 305 6.12E-37 134.184 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." 
Q#6275 - CGI_10018561 superfamily 243992 9 72 7.10E-09 47.1834 cl05087 Complex1_LYR_1 superfamily - - "Complex1_LYR-like; This is a family of proteins carrying the LYR motif of family Complex1_LYR, pfam05347, likely to be involved in Fe-S cluster biogenesis in mitochondria." Q#6276 - CGI_10018562 superfamily 215648 1 199 2.74E-15 71.858 cl02802 7tm_3 superfamily - - "7 transmembrane sweet-taste receptor of 3 GCPR; This is a domain of seven transmembrane regions that forms the C-terminus of some subclass 3 G-coupled-protein receptors. It is often associated with a downstream cysteine-rich linker domain, NCD3G pfam07562, which is the human sweet-taste receptor, and the N-terminal domain, ANF_receptor pfam01094. The seven TM regions assemble in such a way as to produce a docking pocket into which such molecules as cyclamate and lactisole have been found to bind and consequently confer the taste of sweetness." Q#6277 - CGI_10018563 superfamily 245225 28 344 2.65E-27 109.659 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily - - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein couples receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine- binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. 
The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#6278 - CGI_10018564 superfamily 247723 46 119 1.66E-39 136.55 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." 
Q#6279 - CGI_10018565 superfamily 247749 11 466 0 546.205 cl17195 LDH_MDH_like superfamily - - "NAD-dependent, lactate dehydrogenase-like, 2-hydroxycarboxylate dehydrogenase family; Members of this family include ubiquitous enzymes like L-lactate dehydrogenases (LDH), L-2-hydroxyisocaproate dehydrogenases, and some malate dehydrogenases (MDH). LDH catalyzes the last step of glycolysis in which pyruvate is converted to L-lactate. MDH is one of the key enzymes in the citric acid cycle, facilitating both the conversion of malate to oxaloacetate and replenishing levels of oxalacetate by reductive carboxylation of pyruvate. The LDH/MDH-like proteins are part of the NAD(P)-binding Rossmann fold superfamily, which includes a wide variety of protein families including the NAD(P)-binding domains of alcohol dehydrogenases, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate dehydrogenases, formate/glycerate dehydrogenases, siroheme synthases, 6-phosphogluconate dehydrogenases, aminoacid dehydrogenases, repressor rex, and NAD-binding potassium channel domains, among others." Q#6281 - CGI_10018567 superfamily 241555 4 169 5.09E-65 199.317 cl00020 GAT_1 superfamily - - "Type 1 glutamine amidotransferase (GATase1)-like domain; Type 1 glutamine amidotransferase (GATase1)-like domain. This group contains proteins similar to Class I glutamine amidotransferases, the intracellular PH1704 from Pyrococcus horikoshii, the C-terminal of the large catalase: Escherichia coli HP-II, Sinorhizobium meliloti Rm1021 ThuA, the A4 beta-galactosidase middle domain and peptidase E. The majority of proteins in this group have a reactive Cys found in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow. For Class I glutamine amidotransferases proteins which transfer ammonia from the amide side chain of glutamine to an acceptor substrate, this Cys forms a Cys-His-Glu catalytic triad in the active site. 
Glutamine amidotransferases activity can be found in a range of biosynthetic enzymes included in this cd: glutamine amidotransferase, formylglycinamide ribonucleotide, GMP synthetase, anthranilate synthase component II, glutamine-dependent carbamoyl phosphate synthase (CPSase), cytidine triphosphate synthetase, gamma-glutamyl hydrolase, imidazole glycerol phosphate synthase and, cobyric acid synthase. For Pyrococcus horikoshii PH1704, the Cys of the nucleophile elbow together with a different His and, a Glu from an adjacent monomer form a catalytic triad different from the typical GATase1 triad. Peptidase E is believed to be a serine peptidase having a Ser-His-Glu catalytic triad which differs from the Cys-His-Glu catalytic triad of typical GATase1 domains, by having a Ser in place of the reactive Cys at the nucleophile elbow. The E. coli HP-II C-terminal domain, S. meliloti Rm1021 ThuA and the A4 beta-galactosidase middle domain lack the catalytic triad typical GATaseI domains. GATase1-like domains can occur either as single polypeptides, as in Class I glutamine amidotransferases, or as domains in a much larger multifunctional synthase protein, such as CPSase. Peptidase E has a circular permutation in the common core of a typical GTAse1 domain." Q#6282 - CGI_10018568 superfamily 241866 1 407 0 853.128 cl00445 Iso_dh superfamily - - Isocitrate/isopropylmalate dehydrogenase; Isocitrate/isopropylmalate dehydrogenase. Q#6283 - CGI_10018569 superfamily 221051 1 220 2.93E-73 238.217 cl12809 Med25_VWA superfamily - - "Mediator complex subunit 25 von Willebrand factor type A; The overall function of the full-length Med25 is efficiently to coordinate the transcriptional activation of RAR/RXR (retinoic acid receptor/retinoic X receptor) in higher eukaryotic cells. 
Human Med25 consists of several domains with different binding properties, the N-terminal, VWA domain which is this one, an SD1 domain from residues 229-381, a PTOV(B) or ACID domain from 395-545, an SD2 domain from residues 564-645 and a C-terminal NR box-containing domain (646-650) from 646-747. This VWA or von Willebrand factor type A domain when bound to RAR and the histone acetyltransferase CBP is responsible for recruiting Med1 to the rest of the Mediator complex." Q#6283 - CGI_10018569 superfamily 192726 412 561 5.63E-53 179.956 cl12779 Med25 superfamily - - "Mediator complex subunit 25 PTOV activation and synapsin 2; Mediator is a large complex of up to 33 proteins that is conserved from plants to fungi to humans - the number and representation of individual subunits varying with species. It is arranged into four different sections, a core, a head, a tail and a kinase-active part, and the number of subunits within each of these is what varies with species. Overall, Mediator regulates the transcriptional activity of RNA polymerase II but it would appear that each of the four different sections has a slightly different function. The overall function of the full-length Med25 is efficiently to coordinate the transcriptional activation of RAR/RXR (retinoic acid receptor/retinoic X receptor) in higher eukaryotic cells. Human Med25 consists of several domains with different binding properties, the N-terminal, VWA domain, an SD1 - synapsin 1 - domain from residues 229-381, a PTOV(B) or ACID domain from 395-545, an SD2 domain from residues 564-645 and a C-terminal NR box-containing domain (646-650) from 646-747. This family is the combined PTOV and SD2 domains, the PTOV domain being the domain through which Med25 co-operates with the histone acetyltransferase CBP, but the function of the SD2 domain is unclear." Q#6284 - CGI_10018570 superfamily 247065 4 112 8.63E-20 79.6962 cl15777 GGCT_like superfamily - - "GGCT-like domains, also called AIG2-like family. 
Gamma-glutamyl cyclotransferase (GGCT) catalyzes the formation of pyroglutamic acid (5-oxoproline) from dipeptides containing gamma-glutamyl, and is a dimeric protein. In Homo sapiens, the protein is encoded by the gene C7orf24, and the enzyme participates in the gamma-glutamyl cycle. Hereditary defects in the gamma-glutamyl cycle have been described for some of the genes involved, but not for C7orf24. The synthesis and metabolism of glutathione (L-gamma-glutamyl-L-cysteinylglycine) ties the gamma-glutamyl cycle to numerous cellular processes; glutathione acts as a ubiquitous reducing agent in reductive mechanisms involved in protein and DNA synthesis, transport processes, enzyme activity, and metabolism. AIG2 (avrRpt2-induced gene) is an Arabidopsis protein that exhibits RPS2- and avrRpt2-dependent induction early after infection with Pseudomonas syringae pv maculicola strain ES4326 carrying avrRpt2. avrRpt2 is an avirulence gene that can convert virulent strains of P. syringae to avirulence on Arabidopsis thaliana, soybean, and bean. The family also includes bacterial tellurite-resistance proteins (trgB); tellurium (Te) compounds are used in industrial processes and had been used as antimicrobial agents in the past. Some members have been described proteins involved in cation transport (chaC)." Q#6285 - CGI_10018571 superfamily 241581 438 535 8.26E-08 51.233 cl00062 FHA superfamily - - "Forkhead associated domain (FHA); found in eukaryotic and prokaryotic proteins. Putative nuclear signalling domain. FHA domains may bind phosphothreonine, phosphoserine and sometimes phosphotyrosine. In eukaryotes, many FHA domain-containing proteins localize to the nucleus, where they participate in establishing or maintaining cell cycle checkpoints, DNA repair, or transcriptional regulation. Members of the FHA family include: Dun1, Rad53, Cds1, Mek1, KAPP(kinase-associated protein phosphatase),and Ki-67 (a human nuclear protein related to cell proliferation)." 
Q#6285 - CGI_10018571 superfamily 241754 1 338 1.13E-161 483.352 cl00286 Motor_domain superfamily - - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. Q#6287 - CGI_10018573 superfamily 243072 248 357 5.68E-19 81.661 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#6287 - CGI_10018573 superfamily 243072 185 299 7.49E-12 61.6306 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#6288 - CGI_10018574 superfamily 241870 7 164 2.42E-57 189.617 cl00451 MoCF_BD superfamily - - "MoCF_BD: molybdenum cofactor (MoCF) binding domain (BD). This domain is found a variety of proteins involved in biosynthesis of molybdopterin cofactor, like MoaB, MogA, and MoeA. The domain is presumed to bind molybdopterin." Q#6288 - CGI_10018574 superfamily 241758 300 450 2.30E-36 132.529 cl00292 AANH_like superfamily - - "Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases, ATP sulphurylases Universal Stress Response protein and electron transfer flavoprotein (ETF). 
The domain forms an alpha/beta/alpha fold which binds to Adenosine nucleotide." Q#6289 - CGI_10018575 superfamily 245815 5 160 1.07E-80 248.29 cl11961 ALDH-SF superfamily C - "NAD(P)+-dependent aldehyde dehydrogenase superfamily; The aldehyde dehydrogenase superfamily (ALDH-SF) of NAD(P)+-dependent enzymes, in general, oxidize a wide range of endogenous and exogenous aliphatic and aromatic aldehydes to their corresponding carboxylic acids and play an important role in detoxification. Besides aldehyde detoxification, many ALDH isozymes possess multiple additional catalytic and non-catalytic functions such as participating in metabolic pathways, or as binding proteins, or osmoregulants, to mention a few. The enzyme has three domains, a NAD(P)+ cofactor-binding domain, a catalytic domain, and a bridging domain; and the active enzyme is generally either homodimeric or homotetrameric. The catalytic mechanism is proposed to involve cofactor binding, resulting in a conformational change and activation of an invariant catalytic cysteine nucleophile. The cysteine and aldehyde substrate form an oxyanion thiohemiacetal intermediate resulting in hydride transfer to the cofactor and formation of a thioacylenzyme intermediate. Hydrolysis of the thioacylenzyme and release of the carboxylic acid product occurs, and in most cases, the reduced cofactor dissociates from the enzyme. The evolutionary phylogenetic tree of ALDHs appears to have an initial bifurcation between what has been characterized as the classical aldehyde dehydrogenases, the ALDH family (ALDH) and extended family members or aldehyde dehydrogenase-like (ALDH-L) proteins. 
The ALDH proteins are represented by enzymes which share a number of highly conserved residues necessary for catalysis and cofactor binding and they include such proteins as retinal dehydrogenase, 10-formyltetrahydrofolate dehydrogenase, non-phosphorylating glyceraldehyde 3-phosphate dehydrogenase, delta(1)-pyrroline-5-carboxylate dehydrogenases, alpha-ketoglutaric semialdehyde dehydrogenase, alpha-aminoadipic semialdehyde dehydrogenase, coniferyl aldehyde dehydrogenase and succinate-semialdehyde dehydrogenase. Included in this larger group are all human, Arabidopsis, Tortula, fungal, protozoan, and Drosophila ALDHs identified in families ALDH1 through ALDH22 with the exception of families ALDH18, ALDH19, and ALDH20 which are present in the ALDH-like group. The ALDH-like group is represented by such proteins as gamma-glutamyl phosphate reductase, LuxC-like acyl-CoA reductase, and coenzyme A acylating aldehyde dehydrogenase. All of these proteins have a conserved cysteine that aligns with the catalytic cysteine of the ALDH group." Q#6290 - CGI_10018576 superfamily 245815 1 251 2.63E-141 407.763 cl11961 ALDH-SF superfamily N - "NAD(P)+-dependent aldehyde dehydrogenase superfamily; The aldehyde dehydrogenase superfamily (ALDH-SF) of NAD(P)+-dependent enzymes, in general, oxidize a wide range of endogenous and exogenous aliphatic and aromatic aldehydes to their corresponding carboxylic acids and play an important role in detoxification. Besides aldehyde detoxification, many ALDH isozymes possess multiple additional catalytic and non-catalytic functions such as participating in metabolic pathways, or as binding proteins, or osmoregulants, to mention a few. The enzyme has three domains, a NAD(P)+ cofactor-binding domain, a catalytic domain, and a bridging domain; and the active enzyme is generally either homodimeric or homotetrameric. 
The catalytic mechanism is proposed to involve cofactor binding, resulting in a conformational change and activation of an invariant catalytic cysteine nucleophile. The cysteine and aldehyde substrate form an oxyanion thiohemiacetal intermediate resulting in hydride transfer to the cofactor and formation of a thioacylenzyme intermediate. Hydrolysis of the thioacylenzyme and release of the carboxylic acid product occurs, and in most cases, the reduced cofactor dissociates from the enzyme. The evolutionary phylogenetic tree of ALDHs appears to have an initial bifurcation between what has been characterized as the classical aldehyde dehydrogenases, the ALDH family (ALDH) and extended family members or aldehyde dehydrogenase-like (ALDH-L) proteins. The ALDH proteins are represented by enzymes which share a number of highly conserved residues necessary for catalysis and cofactor binding and they include such proteins as retinal dehydrogenase, 10-formyltetrahydrofolate dehydrogenase, non-phosphorylating glyceraldehyde 3-phosphate dehydrogenase, delta(1)-pyrroline-5-carboxylate dehydrogenases, alpha-ketoglutaric semialdehyde dehydrogenase, alpha-aminoadipic semialdehyde dehydrogenase, coniferyl aldehyde dehydrogenase and succinate-semialdehyde dehydrogenase. Included in this larger group are all human, Arabidopsis, Tortula, fungal, protozoan, and Drosophila ALDHs identified in families ALDH1 through ALDH22 with the exception of families ALDH18, ALDH19, and ALDH20 which are present in the ALDH-like group. The ALDH-like group is represented by such proteins as gamma-glutamyl phosphate reductase, LuxC-like acyl-CoA reductase, and coenzyme A acylating aldehyde dehydrogenase. All of these proteins have a conserved cysteine that aligns with the catalytic cysteine of the ALDH group." Q#6291 - CGI_10018577 superfamily 128937 4 69 2.93E-11 55.3464 cl02743 DM9 superfamily - - Repeats found in Drosophila proteins; Repeats found in Drosophila proteins. 
Q#6291 - CGI_10018577 superfamily 128937 79 139 8.95E-11 54.1908 cl02743 DM9 superfamily - - Repeats found in Drosophila proteins; Repeats found in Drosophila proteins. Q#6292 - CGI_10018579 superfamily 241563 61 97 0.00251107 37.0736 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#6294 - CGI_10018581 superfamily 242457 48 133 1.07E-05 43.4667 cl01368 GyrI-like superfamily C - "GyrI-like small molecule binding domain; This family contains the small molecule binding domain of a number of different bacterial transcription activators. This family also contains DNA gyrase inhibitors. The GyrI superfamily contains a diad of the SHS2 module, adapted for small-molecule binding. The GyrI superfamily includes a family of secreted forms that is found only in animals and the bacterial pathogen Leptospira." Q#6295 - CGI_10018582 superfamily 243050 336 393 2.83E-16 73.2226 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. 
The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#6296 - CGI_10018583 superfamily 245814 281 345 1.32E-11 59.8103 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#6296 - CGI_10018583 superfamily 245814 186 257 2.71E-08 50.1803 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#6297 - CGI_10018584 superfamily 242043 27 92 0.00602096 31.9134 cl00713 Auto_anti-p27 superfamily N - Sjogren's syndrome/scleroderma autoantigen 1 (Autoantigen p27); This family consists of several Sjogren's syndrome/scleroderma autoantigen 1 (Autoantigen p27) sequences. It is thought that the potential association of anti-p27 with anti-centromere antibodies suggests that autoantigen p27 might play a role in mitosis. 
Q#6303 - CGI_10001818 superfamily 241766 23 309 6.55E-127 366.778 cl00303 PNP_UDP_1 superfamily - - Phosphorylase superfamily; Members of this family include: purine nucleoside phosphorylase (PNP) Uridine phosphorylase (UdRPase) 5'-methylthioadenosine phosphorylase (MTA phosphorylase) Q#6305 - CGI_10002008 superfamily 245660 2 146 0.0024752 37.5866 cl11493 PQQ_DH_like superfamily C - "PQQ-dependent dehydrogenases and related proteins; This family is composed of dehydrogenases with pyrroloquinoline quinone (PQQ) as a cofactor, such as ethanol, methanol, and membrane-bound glucose dehydrogenases. The alignment model contains an 8-bladed beta-propeller, and the family also includes distantly related proteins which are not enzymatically active and do not bind PQQ." Q#6308 - CGI_10005698 superfamily 243091 79 175 2.15E-08 53.1071 cl02566 SET superfamily - - "SET domain; SET domains are protein lysine methyltransferase enzymes. SET domains appear to be protein-protein interaction domains. It has been demonstrated that SET domains mediate interactions with a family of proteins that display similarity with dual-specificity phosphatases (dsPTPases). A subset of SET domains have been called PR domains. These domains are divergent in sequence from other SET domains, but also appear to mediate protein-protein interaction. The SET domain consists of two regions known as SET-N and SET-C. SET-C forms an unusual and conserved knot-like structure of probably functional importance. Additionally to SET-N and SET-C, an insert region (SET-I) and flanking regions of high structural variability form part of the overall structure." Q#6308 - CGI_10005698 superfamily 222150 772 796 1.43E-05 43.5345 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#6308 - CGI_10005698 superfamily 246975 760 780 0.0017452 37.3265 cl15478 zf-C2H2 superfamily - - "Zinc finger, C2H2 type; The C2H2 zinc finger is the classical zinc finger domain. 
The two conserved cysteines and histidines co-ordinate a zinc ion. The following pattern describes the zinc finger. #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C] Where X can be any amino acid, and numbers in brackets indicate the number of residues. The positions marked # are those that are important for the stable fold of the zinc finger. The final position can be either his or cys. The C2H2 zinc finger is composed of two short beta strands followed by an alpha helix. The amino terminal part of the helix binds the major groove in DNA binding zinc fingers. The accepted consensus binding sequence for Sp1 is usually defined by the asymmetric hexanucleotide core GGGCGG but this sequence does not include, among others, the GAG (=CTC) repeat that constitutes a high-affinity site for Sp1 binding to the wt1 promoter." Q#6308 - CGI_10005698 superfamily 222150 800 824 0.00793209 35.4453 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#6309 - CGI_10005699 superfamily 241590 497 550 8.07E-19 81.9696 cl00072 GYF superfamily - - GYF domain: contains conserved Gly-Tyr-Phe residues; Proline-binding domain in CD2-binding and other proteins. Involved in signaling lymphocyte activity. Also present in other unrelated proteins (mainly unknown) derived from diverse eukaryotic species. Q#6311 - CGI_10005701 superfamily 243555 36 219 6.09E-05 43.1486 cl03871 Chitin_bind_3 superfamily - - "Chitin binding domain; This domain is found associated with a wide variety of cellulose binding domain. This domain however is a chitin binding domain. This domain is found in isolation in baculoviral spheroidins and spindolins, protein of unknown function." 
Q#6312 - CGI_10005702 superfamily 247095 41 476 2.54E-173 496.793 cl15837 alkPPc superfamily - - "Alkaline phosphatase homologues; alkaline phosphatases are non-specific phosphomonoesterases that catalyze the hydrolysis reaction via a phosphoseryl intermediate to produce inorganic phosphate and the corresponding alcohol, optimally at high pH. Alkaline phosphatase exists as a dimer, each monomer binding 2 zinc atoms and one magnesium atom, which are essential for enzymatic activity." Q#6313 - CGI_10005703 superfamily 243250 40 349 1.48E-76 245.251 cl02959 Glyco_hydro_9 superfamily N - Glycosyl hydrolase family 9; Glycosyl hydrolase family 9. Q#6314 - CGI_10001863 superfamily 245201 641 711 0.00204894 39.2441 cl09925 PKc_like superfamily C - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#6314 - CGI_10001863 superfamily 219677 21 44 0.00864486 35.106 cl18521 EGF_2 superfamily - - EGF-like domain; This family contains EGF domains found in a variety of extracellular proteins. Q#6315 - CGI_10001434 superfamily 243035 23 147 4.11E-12 61.8669 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. 
Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. 
A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#6316 - CGI_10002660 superfamily 241600 3 117 5.86E-33 115.801 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#6317 - CGI_10002661 superfamily 247941 434 489 0.000478447 39.6265 cl17387 Methyltransf_21 superfamily C - "Methyltransferase FkbM domain; This family has members from bacteria to human, and appears to be a methyltransferase." 
Q#6318 - CGI_10002662 superfamily 247792 6 51 0.000150813 38.1956 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#6319 - CGI_10002663 superfamily 219619 90 146 7.93E-15 68.0031 cl18518 Ion_trans_2 superfamily N - Ion channel; This family includes the two membrane helix type ion channels found in bacteria. Q#6319 - CGI_10002663 superfamily 219619 185 230 6.14E-09 51.8247 cl18518 Ion_trans_2 superfamily C - Ion channel; This family includes the two membrane helix type ion channels found in bacteria. Q#6321 - CGI_10002665 superfamily 241600 1 117 1.28E-32 115.03 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." 
Q#6322 - CGI_10002285 superfamily 193256 46 114 9.96E-07 46.4792 cl18189 AAA_8 superfamily N - "P-loop containing dynein motor region D4; The 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This particular family is the D4 ATP-binding region of the motor." Q#6324 - CGI_10002860 superfamily 247068 225 319 8.15E-13 65.0273 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6324 - CGI_10002860 superfamily 247068 427 507 2.81E-05 42.6858 cl15786 CA_like superfamily C - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. 
Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6325 - CGI_10003103 superfamily 241568 86 143 6.06E-07 45.9168 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#6325 - CGI_10003103 superfamily 241568 214 273 0.00801408 34.0393 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#6325 - CGI_10003103 superfamily 205157 23 50 0.00951484 33.6651 cl18264 EGF_3 superfamily - - EGF domain; This family includes a variety of EGF-like domain homologues. This family includes the C-terminal domain of the malaria parasite MSP1 protein. Q#6327 - CGI_10002502 superfamily 243073 358 389 0.000158318 39.117 cl02533 SOCS superfamily - - "SOCS (suppressors of cytokine signaling) box. 
The SOCS box is found in the C-terminal region of CIS/SOCS family proteins (in combination with a SH2 domain), ASBs (ankyrin repeat-containing proteins with a SOCS box), SSBs (SPRY domain-containing proteins with a SOCS box), and WSBs (WD40 repeat-containing proteins with a SOCS box), as well as, other miscellaneous proteins. The function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions." Q#6332 - CGI_10003149 superfamily 243072 58 115 4.52E-13 64.327 cl02529 ANK superfamily NC - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#6333 - CGI_10003150 superfamily 217293 28 221 4.00E-50 172.047 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#6333 - CGI_10003150 superfamily 202474 229 483 5.26E-21 91.176 cl08379 Neur_chan_memb superfamily - - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. 
Q#6337 - CGI_10003137 superfamily 241563 109 147 1.27E-06 45.9332 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#6337 - CGI_10003137 superfamily 241563 53 97 0.00107232 37.4588 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#6340 - CGI_10005469 superfamily 241832 60 184 1.57E-43 144.292 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#6341 - CGI_10005470 superfamily 241832 116 215 1.11E-53 171.93 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. 
Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#6342 - CGI_10005471 superfamily 242167 137 210 5.79E-41 135.707 cl00883 RNA_pol_Rpb5_C superfamily - - "RNA polymerase Rpb5, C-terminal domain; The assembly domain of Rpb5. The archaeal equivalent to this domain is subunit H. Subunit H lacks the N-terminal domain." Q#6342 - CGI_10005471 superfamily 217772 1 95 1.62E-39 132.769 cl04305 RNA_pol_Rpb5_N superfamily - - "RNA polymerase Rpb5, N-terminal domain; Rpb5 has a bipartite structure which includes a eukaryote-specific N-terminal domain and a C-terminal domain resembling the archaeal RNAP subunit H. The N-terminal domain is involved in DNA binding and is part of the jaw module in the RNA pol II structure. This module is important for positioning the downstream DNA." Q#6345 - CGI_10000536 superfamily 220647 26 103 4.06E-18 76.2123 cl18565 L_HGMIC_fpl superfamily C - "Lipoma HMGIC fusion partner-like protein; This is a group of proteins expressed from a series of genes referred to as Lipoma HGMIC fusion partner-like. The proteins carry four highly conserved transmembrane domains in this entry. In certain instances, eg in LHFPL5, mutations cause deafness in humans and hypospadias, and LHFPL1 is transcribed in six liver tumour cell lines." 
Q#6355 - CGI_10014372 superfamily 247907 1421 1598 1.36E-25 106.348 cl17353 LamG superfamily - - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." Q#6355 - CGI_10014372 superfamily 247068 76 180 2.09E-24 101.236 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6355 - CGI_10014372 superfamily 247068 953 1058 8.27E-23 96.6137 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. 
Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6355 - CGI_10014372 superfamily 247068 749 845 2.69E-21 92.3765 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6355 - CGI_10014372 superfamily 247068 189 278 1.52E-19 86.9837 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6355 - CGI_10014372 superfamily 247068 416 520 5.96E-19 85.4429 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6355 - CGI_10014372 superfamily 247068 529 622 2.38E-18 83.5169 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6355 - CGI_10014372 superfamily 247068 638 737 1.66E-17 81.2057 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6355 - CGI_10014372 superfamily 247907 1686 1801 3.84E-13 69.7544 cl17353 LamG superfamily - - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." Q#6355 - CGI_10014372 superfamily 247068 853 946 4.89E-11 62.3309 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6355 - CGI_10014372 superfamily 247068 294 399 0.000105062 43.071 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6355 - CGI_10014372 superfamily 247068 5 64 0.000302346 41.5302 cl15786 CA_like superfamily N - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6355 - CGI_10014372 superfamily 245213 1899 1937 0.000303649 41.083 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. 
Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#6355 - CGI_10014372 superfamily 245213 1634 1663 0.00376467 37.6162 cl09941 EGF_CA superfamily N - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#6355 - CGI_10014372 superfamily 245213 1383 1415 0.00949536 36.6858 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#6355 - CGI_10014372 superfamily 216265 1971 2044 3.00E-15 75.802 cl03079 Cadherin_C superfamily N - Cadherin cytoplasmic region; Cadherins are vital in cell-cell adhesion during tissue differentiation. Cadherins are linked to the cytoskeleton by catenins. Catenins bind to the cytoplasmic tail of the cadherin. Cadherins cluster to form foci of homophilic binding units. A key determinant to the strength of the binding that it is mediated by cadherins is the juxtamembrane region of the cadherin. 
This region induces clustering and also binds to the protein p120ctn. Q#6355 - CGI_10014372 superfamily 243179 2245 2318 0.000707178 40.5939 cl02781 tetraspanin_LEL superfamily C - "Tetraspanin, extracellular domain or large extracellular loop (LEL). Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. The tetraspanin family contains CD9, CD63, CD37, CD53, CD82, CD151, and CD81, amongst others. Tetraspanins are involved in diverse processes such as cell activation and proliferation, adhesion and motility, differentiation, cancer, and others. Their various functions may relate to their ability to act as molecular facilitators, grouping specific cell-surface proteins and affecting formation and stability of signaling complexes. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web", which may also include integrins." Q#6356 - CGI_10014373 superfamily 155088 161 215 7.45E-07 47.2054 cl02758 AMOP superfamily N - AMOP domain; This domain may have a role in cell adhesion. It is called the AMOP domain after Adhesion associated domain in MUC4 and Other Proteins. This domain is extracellular and contains a number of cysteines that probably form disulphide bridges. Q#6357 - CGI_10014374 superfamily 243124 105 191 4.31E-18 75.9167 cl02648 NIDO superfamily - - Nidogen-like; This is a nidogen-like domain (NIDO) domain and is an extracellular domain found in nidogen and hypothetical proteins of unknown function. 
Q#6358 - CGI_10014375 superfamily 241984 89 325 1.64E-11 61.5035 cl00615 Membrane-FADS-like superfamily - - "The membrane fatty acid desaturase (Membrane_FADS)-like CD includes membrane FADSs, alkane hydroxylases, beta carotene ketolases (CrtW-like), hydroxylases (CrtR-like), and other related proteins. They are present in all groups of organisms with the exception of archaea. Membrane FADSs are non-heme, iron-containing, oxygen-dependent enzymes involved in regioselective introduction of double bonds in fatty acyl aliphatic chains. They play an important role in the maintenance of the proper structure and functioning of biological membranes. Alkane hydroxylases are bacterial, integral-membrane di-iron enzymes that share a requirement for iron and oxygen for activity similar to that of membrane FADSs, and are involved in the initial oxidation of inactivated alkanes. Beta-carotene ketolase and beta-carotene hydroxylase are carotenoid biosynthetic enzymes for astaxanthin and zeaxanthin, respectively. This superfamily domain has extensive hydrophobic regions that would be capable of spanning the membrane bilayer at least twice. Comparison of these sequences also reveals three regions of conserved histidine cluster motifs that contain eight histidine residues: HXXX(X)H, HXX(X)HH, and HXXHH (an additional conserved histidine residue is seen between clusters 2 and 3). Spectroscopic and genetic evidence point to a nitrogen-rich coordination environment located in the cytoplasm with as many as eight histidines coordinating the two iron ions and a carboxylate residue bridging the two metals in the Pseudomonas oleovorans alkane hydroxylase (AlkB). In addition, the eight histidine residues are reported to be catalytically essential and proposed to be the ligands for the iron atoms contained within the rat stearoyl CoA delta-9 desaturase." 
Q#6359 - CGI_10014376 superfamily 247792 324 376 1.33E-15 70.8279 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#6359 - CGI_10014376 superfamily 214806 213 299 2.06E-13 65.7797 cl15966 CRA superfamily - - "CT11-RanBPM; protein-protein interaction domain present in crown eukaryotes (plants, animals, fungi)" Q#6359 - CGI_10014376 superfamily 128914 154 206 2.30E-08 50.2622 cl15352 CTLH superfamily - - C-terminal to LisH motif; Alpha-helical motif of unknown function. 
Q#6360 - CGI_10014377 superfamily 243092 145 321 2.66E-15 73.5232 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#6361 - CGI_10014378 superfamily 247725 139 238 6.36E-19 82.626 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." 
Q#6362 - CGI_10014379 superfamily 247725 514 611 7.30E-18 79.9296 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#6362 - CGI_10014379 superfamily 243092 270 404 7.07E-05 43.8628 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#6364 - CGI_10014381 superfamily 247792 335 377 2.15E-08 50.1368 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#6365 - CGI_10014382 superfamily 243175 94 218 5.25E-40 135.419 cl02776 GST_C_family superfamily - - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. 
GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." Q#6365 - CGI_10014382 superfamily 241832 3 70 1.70E-31 111.951 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#6366 - CGI_10014383 superfamily 241629 17 95 4.32E-21 83.2712 cl00133 SCP superfamily N - "SCP: SCP-like extracellular protein domain, found in eukaryotes and prokaryotes. 
This family includes plant pathogenesis-related protein 1 (PR-1), which accumulates after infections with pathogens, and may act as an anti-fungal agent or be involved in cell wall loosening. This family also includes CRISPs, mammalian cysteine-rich secretory proteins, which combine SCP with a C-terminal cysteine rich domain, and allergen 5 from vespid venom. Roles for CRISP, in response to pathogens, fertilization, and sperm maturation have been proposed. One member, Tex31 from the venom duct of Conus textile, has been shown to possess proteolytic activity sensitive to serine protease inhibitors. The human GAPR-1 protein has been reported to dimerize, and such a dimer may form an active site containing a catalytic triad. SCP has also been proposed to be a Ca++ chelating serine protease. The Ca++-chelating function would fit with various signaling processes that members of this family, such as the CRISPs, are involved in, and is supported by sequence and structural evidence of a conserved pocket containing two histidines and a glutamate. It also may explain how helothermine, a toxic peptide secreted by the beaded lizard, blocks Ca++ transporting ryanodine receptors. Little is known about the biological roles of the bacterial and archaeal SCP domains." Q#6367 - CGI_10014384 superfamily 241610 197 249 1.75E-20 82.2978 cl00101 KU superfamily - - BPTI/Kunitz family of serine protease inhibitors; Structure is a disulfide rich alpha+beta fold. BPTI (bovine pancreatic trypsin inhibitor) is an extensively studied model structure. Q#6367 - CGI_10014384 superfamily 241610 134 181 1.15E-16 71.5122 cl00101 KU superfamily - - BPTI/Kunitz family of serine protease inhibitors; Structure is a disulfide rich alpha+beta fold. BPTI (bovine pancreatic trypsin inhibitor) is an extensively studied model structure. Q#6369 - CGI_10014386 superfamily 241600 10 84 9.66E-33 122.734 cl00085 FReD superfamily C - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. 
Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#6370 - CGI_10014387 superfamily 220635 30 98 0.00215433 35.5878 cl12380 DUF2151 superfamily C - "Cell cycle and development regulator; This is a set of proteins conserved from worms to humans. The proteins are a PAN GU kinase substrate, Mat89Bb, essential for S-M cycles of early Drosophila embryogenesis, Xenopus embryonic cell cycles and morphogenesis, and cell division in cultured mammalian cells." Q#6372 - CGI_10014389 superfamily 243161 3 60 9.30E-05 38.1442 cl02739 THAP superfamily C - "THAP domain; The THAP domain is a putative DNA-binding domain (DBD) and probably also binds a zinc ion. It features the conserved C2CH architecture (consensus sequence: Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His). Other universal features include the location of the domain at the N-termini of proteins, its size of about 90 residues, a C-terminal AVPTIF box and several other conserved residues. Orthologues of the human THAP domain have been identified in other vertebrates and probably worms and flies, but not in other eukaryotes or any prokaryotes." 
Q#6373 - CGI_10012103 superfamily 241563 123 163 8.57E-06 44.7776 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#6373 - CGI_10012103 superfamily 110440 387 414 0.00165894 37.7725 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#6373 - CGI_10012103 superfamily 241563 70 115 0.00202948 37.844 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#6376 - CGI_10012106 superfamily 217490 71 194 0.00268778 36.6241 cl09303 ETX_MTX2 superfamily C - Clostridium epsilon toxin ETX/Bacillus mosquitocidal toxin MTX2; This family appears to be distantly related to pfam01117. Q#6377 - CGI_10012107 superfamily 247057 116 183 1.34E-12 66.2053 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria.
Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#6378 - CGI_10012108 superfamily 246908 5 108 1.87E-43 144.924 cl15255 SH2 superfamily - - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." Q#6378 - CGI_10012108 superfamily 247683 107 161 1.08E-23 91.2698 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. 
SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#6378 - CGI_10012108 superfamily 247683 195 251 1.14E-23 91.0093 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#6380 - CGI_10012110 superfamily 215647 223 465 2.77E-28 111.932 cl18338 7tm_2 superfamily - - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GPCRs). They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways.
Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#6380 - CGI_10012110 superfamily 243029 146 209 3.54E-17 76.2353 cl02422 HRM superfamily - - Hormone receptor domain; This extracellular domain contains four conserved cysteines that probably form disulphide bridges. The domain is found in a variety of hormone receptors. It may be a ligand binding domain. Q#6381 - CGI_10012111 superfamily 243066 21 109 8.33E-10 57.2421 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN."
Q#6382 - CGI_10012112 superfamily 243066 249 351 2.24E-19 85.7469 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#6382 - CGI_10012112 superfamily 222150 735 759 1.55E-05 43.9197 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#6383 - CGI_10012113 superfamily 241573 415 713 1.29E-106 338.152 cl00051 CysPc superfamily - - "Calpains, domains IIa, IIb; calcium-dependent cytoplasmic cysteine proteinases, papain-like. Functions in cytoskeletal remodeling processes, cell differentiation, apoptosis and signal transduction." Q#6383 - CGI_10012113 superfamily 245201 3 222 1.08E-97 311.74 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. 
These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#6383 - CGI_10012113 superfamily 241653 739 867 4.45E-24 100.518 cl00165 Calpain_III superfamily - - "Calpain, subdomain III. Calpains are calcium-activated cytoplasmic cysteine proteinases, participate in cytoskeletal remodeling processes, cell differentiation, apoptosis and signal transduction. Catalytic domain and the two calmodulin-like domains are separated by C2-like domain III. Domain III plays an important role in calcium-induced activation of calpain involving electrostatic interactions with subdomain II. Proposed to mediate calpain's interaction with phospholipids and translocation to cytoplasmic/nuclear membranes. CD includes subdomain III of typical and atypical calpains." Q#6384 - CGI_10012114 superfamily 243179 115 160 5.01E-05 39.3855 cl02781 tetraspanin_LEL superfamily N - "Tetraspanin, extracellular domain or large extracellular loop (LEL). Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. The tetraspanin family contains CD9, CD63, CD37, CD53, CD82, CD151, and CD81, amongst others. Tetraspanins are involved in diverse processes such as cell activation and proliferation, adhesion and motility, differentiation, cancer, and others. Their various functions may relate to their ability to act as molecular facilitators, grouping specific cell-surface proteins and affecting formation and stability of signaling complexes. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web", which may also include integrins."
Q#6385 - CGI_10012115 superfamily 220672 17 244 3.70E-36 130.059 cl10957 Frag1 superfamily - - "Frag1/DRAM/Sfk1 family; This family includes Frag1, DRAM and Sfk1 proteins. Frag1 (FGF receptor activating protein 1) is a protein that is conserved from fungi to humans. There are four potential iso-prenylation sites throughout the peptide, viz CILW, CIIW and CIGL. Frag1 is a membrane-spanning protein that is ubiquitously expressed in adult tissues suggesting an important cellular function. Dram is a family of proteins conserved from nematodes to humans with six hydrophobic transmembrane regions and an Endoplasmic Reticulum signal peptide. It is a lysosomal protein that induces macro-autophagy as an effector of p53-mediated death, where p53 is the tumour-suppressor gene that is frequently mutated in cancer. Expression of Dram is stress-induced. This region is also part of a family of small plasma membrane proteins, referred to as Sfk1, that may act together with or upstream of Stt4p to generate normal levels of the essential phospholipid PI4P, thus allowing proper localisation of Stt4p to the actin cytoskeleton." 
Q#6386 - CGI_10012116 superfamily 243092 11 292 7.10E-31 118.977 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#6387 - CGI_10012117 superfamily 247684 13 434 6.25E-121 366.217 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#6388 - CGI_10012118 superfamily 241622 231 311 2.50E-08 52.5691 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. 
Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95 (post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#6389 - CGI_10012119 superfamily 243263 15 380 8.08E-66 225.365 cl02990 ASC superfamily - - Amiloride-sensitive sodium channel; Amiloride-sensitive sodium channel. Q#6389 - CGI_10012119 superfamily 243263 618 665 0.00105891 40.469 cl02990 ASC superfamily N - Amiloride-sensitive sodium channel; Amiloride-sensitive sodium channel. Q#6390 - CGI_10012120 superfamily 243263 8 149 3.42E-22 98.6341 cl02990 ASC superfamily N - Amiloride-sensitive sodium channel; Amiloride-sensitive sodium channel. Q#6390 - CGI_10012120 superfamily 243263 600 657 7.70E-17 82.0706 cl02990 ASC superfamily N - Amiloride-sensitive sodium channel; Amiloride-sensitive sodium channel. Q#6391 - CGI_10003420 superfamily 109874 31 116 8.97E-07 43.437 cl02980 Stathmin superfamily N - Stathmin family; The Stathmin family of proteins play an important role in the regulation of the microtubule cytoskeleton. They regulate microtubule dynamics by promoting depolymerization of microtubules and/or preventing polymerisation of tubulin heterodimers. Q#6393 - CGI_10003422 superfamily 193253 99 171 0.00705669 35.3977 cl15084 MT superfamily NC - "Microtubule-binding stalk of dynein motor; the 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases.
The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This family is the region between D4 and D5 and is the two predicted alpha-helical coiled coil segments that form the stalk supporting the ATP-sensitive microtubule binding component." Q#6394 - CGI_10003423 superfamily 247916 63 172 1.42E-06 47.7806 cl17362 Transglut_core superfamily - - "Transglutaminase-like superfamily; This family includes animal transglutaminases and other bacterial proteins of unknown function. Sequence conservation in this superfamily primarily involves three motifs that centre around conserved cysteine, histidine, and aspartate residues that form the catalytic triad in the structurally characterized transglutaminase, the human blood clotting factor XIIIa'. On the basis of the experimentally demonstrated activity of the Methanobacterium phage pseudomurein endoisopeptidase, it is proposed that many, if not all, microbial homologues of the transglutaminases are proteases and that the eukaryotic transglutaminases have evolved from an ancestral protease." Q#6396 - CGI_10003425 superfamily 202894 112 179 6.07E-24 96.1358 cl04406 Mpv17_PMP22 superfamily - - "Mpv17 / PMP22 family; The 22-kDa peroxisomal membrane protein (PMP22) is a major component of peroxisomal membranes. PMP22 seems to be involved in pore forming activity and may contribute to the unspecific permeability of the organelle membrane. PMP22 is synthesised on free cytosolic ribosomes and then directed to the peroxisome membrane by specific targeting information. Mpv17 is a closely related peroxisomal protein. In mouse, the Mpv17 protein is involved in the development of early-onset glomerulosclerosis. More recently a homolog of Mpv17 in S. 
cerevisiae has been found to be an integral membrane protein of the inner mitochondrial membrane where it has been proposed to have a role in ethanol metabolism and tolerance during heat-shock. Defects in MPV17 are associated with mitochondrial DNA depletion syndrome (MDDS) and Navajo neurohepatopathy (NNH). MDDS is a clinically heterogeneous group of disorders characterized by a reduction in mitochondrial DNA (mtDNA) copy number. Primary mtDNA depletion is inherited as an autosomal recessive trait and may affect single organs, typically muscle or liver, or multiple tissues. Individuals with the hepatocerebral form of mitochondrial DNA depletion syndrome have early progressive liver failure and neurologic abnormalities, hypoglycemia, and increased lactate in body fluids. NNH is an autosomal recessive disease that is prevalent among Navajo children in the South Western states of America. The major clinical features are hepatopathy, peripheral neuropathy, corneal anesthesia and scarring, acral mutilation, cerebral leukoencephalopathy, failure to thrive, and recurrent metabolic acidosis with intercurrent infections. Infantile, childhood, and classic forms of NNH have been described. Mitochondrial DNA depletion was detected in the livers of patients, suggesting a primary defect in mtDNA maintenance." Q#6397 - CGI_10003426 superfamily 247744 7 196 1.54E-77 234.053 cl17190 NK superfamily - - "Nucleoside/nucleotide kinase (NK) is a protein superfamily consisting of multiple families of enzymes that share structural similarity and are functionally related to the catalysis of the reversible phosphate group transfer from nucleoside triphosphates to nucleosides/nucleotides, nucleoside monophosphates, or sugars. Members of this family play a wide variety of essential roles in nucleotide metabolism, the biosynthesis of coenzymes and aromatic compounds, as well as the metabolism of sugar and sulfate."
Q#6399 - CGI_10018793 superfamily 246918 65 110 2.08E-09 50.6631 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#6399 - CGI_10018793 superfamily 241613 127 161 0.00033636 36.4158 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#6401 - CGI_10018795 superfamily 241613 313 347 3.84E-07 47.2014 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#6401 - CGI_10018795 superfamily 241613 85 119 6.23E-07 46.8162 cl00104 LDLa superfamily - - "Low Density Lipoprotein 
Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#6401 - CGI_10018795 superfamily 241613 352 386 2.69E-05 41.8086 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#6401 - CGI_10018795 superfamily 241613 7 41 2.96E-05 41.8086 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this 
multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#6401 - CGI_10018795 superfamily 241613 46 80 9.19E-05 40.2678 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#6401 - CGI_10018795 superfamily 241613 273 302 0.00326248 36.0306 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of 
complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#6404 - CGI_10018798 superfamily 241584 134 227 2.55E-14 67.9067 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#6407 - CGI_10018801 superfamily 242683 12 161 1.10E-10 56.1097 cl01747 SMI1_KNR4 superfamily - - "SMI1 / KNR4 family (SUKH-1); Proteins in this family are involved in the regulation of 1,3-beta-glucan synthase activity and cell-wall formation. Genome contextual information showed that SMI1 are primary immunity proteins in bacterial toxin systems." Q#6408 - CGI_10018802 superfamily 248204 31 96 3.89E-16 68.0263 cl17650 DUF836 superfamily - - Glutaredoxin-like domain (DUF836); These proteins are related to the pfam00462 family. Q#6411 - CGI_10018805 superfamily 241563 158 197 4.19E-07 46.5115 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#6411 - CGI_10018805 superfamily 217020 204 303 0.00195391 36.4186 cl03574 Seryl_tRNA_N superfamily - - Seryl-tRNA synthetase N-terminal domain; This domain is found associated with the Pfam tRNA synthetase class II domain (pfam00587) and represents the N-terminal domain of seryl-tRNA synthetase. 
Q#6412 - CGI_10018806 superfamily 247792 84 147 8.47E-06 42.818 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#6412 - CGI_10018806 superfamily 241563 182 232 4.68E-05 40.7336 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#6418 - CGI_10018812 superfamily 241763 48 126 0.000330992 39.1759 cl00298 Peptidase_C1 superfamily C - "C1 Peptidase family (MEROPS database nomenclature), also referred to as the papain family; composed of two subfamilies of cysteine peptidases (CPs), C1A (papain) and C1B (bleomycin hydrolase). Papain-like enzymes are mostly endopeptidases with some exceptions like cathepsins B, C, H and X, which are exopeptidases. Papain-like CPs have different functions in various organisms. Plant CPs are used to mobilize storage proteins in seeds while mammalian CPs are primarily lysosomal enzymes responsible for protein degradation in the lysosome. Papain-like CPs are synthesized as inactive proenzymes with N-terminal propeptide regions, which are removed upon activation. Bleomycin hydrolase (BH) is a CP that detoxifies bleomycin by hydrolysis of an amide group.
It acts as a carboxypeptidase on its C-terminus to convert itself into an aminopeptidase and peptide ligase. BH is found in all tissues in mammals as well as in many other eukaryotes. It forms a hexameric ring barrel structure with the active sites imbedded in the central channel. Some members of the C1 family are proteins classified as non-peptidase homologs which lack peptidase activity or have missing active site residues." Q#6419 - CGI_10018813 superfamily 243035 97 158 9.61E-19 77.3363 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose.
Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#6420 - CGI_10010757 superfamily 242443 64 317 2.48E-73 230.692 cl01342 Peptidase_A22B superfamily - - "Signal peptide peptidase; The members of this family are membrane proteins. In some proteins this region is found associated with pfam02225. This family corresponds with Merops subfamily A22B, the type example of which is signal peptide peptidase. There is a sequence-similarity relationship with pfam01080." 
Q#6421 - CGI_10010758 superfamily 241570 163 272 6.05E-32 117.427 cl00047 CAP_ED superfamily - - "effector domain of the CAP family of transcription factors; members include CAP (or cAMP receptor protein (CRP)), which binds cAMP, FNR (fumarate and nitrate reduction), which uses an iron-sulfur cluster to sense oxygen) and CooA, a heme containing CO sensor. In all cases binding of the effector leads to conformational changes and the ability to activate transcription. Cyclic nucleotide-binding domain similar to CAP are also present in cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) and vertebrate cyclic nucleotide-gated ion-channels. Cyclic nucleotide-monophosphate binding domain; proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues; the best studied is the prokaryotic catabolite gene activator, CAP, where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure; three conserved glycine residues are thought to be essential for maintenance of the structural integrity of the beta-barrel; CooA is a homodimeric transcription factor that belongs to CAP family; cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain; cAPK's are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain; cGPK's are single chain enzymes that include the two copies of the domain in their N-terminal section; also found in vertebrate cyclic nucleotide-gated ion-channels" Q#6421 - CGI_10010758 superfamily 241570 281 396 2.73E-30 112.805 cl00047 CAP_ED superfamily - - "effector domain of the CAP family of transcription factors; members include CAP (or cAMP receptor protein (CRP)), which binds cAMP, FNR (fumarate and nitrate reduction), which uses an iron-sulfur cluster to sense oxygen) and CooA, a heme containing CO sensor. 
In all cases binding of the effector leads to conformational changes and the ability to activate transcription. Cyclic nucleotide-binding domain similar to CAP are also present in cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) and vertebrate cyclic nucleotide-gated ion-channels. Cyclic nucleotide-monophosphate binding domain; proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues; the best studied is the prokaryotic catabolite gene activator, CAP, where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure; three conserved glycine residues are thought to be essential for maintenance of the structural integrity of the beta-barrel; CooA is a homodimeric transcription factor that belongs to CAP family; cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain; cAPK's are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain; cGPK's are single chain enzymes that include the two copies of the domain in their N-terminal section; also found in vertebrate cyclic nucleotide-gated ion-channels" Q#6422 - CGI_10010759 superfamily 247724 3253 3286 0.000311827 42.7123 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#6422 - CGI_10010759 superfamily 247724 3420 3507 0.00482324 39.8374 cl17170 Ras_like_GTPase superfamily NC - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. 
Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#6423 - CGI_10010760 superfamily 243091 534 581 1.30E-08 53.2648 cl02566 SET superfamily N - "SET domain; SET domains are protein lysine methyltransferase enzymes. SET domains appear to be protein-protein interaction domains. It has been demonstrated that SET domains mediate interactions with a family of proteins that display similarity with dual-specificity phosphatases (dsPTPases). A subset of SET domains have been called PR domains. These domains are divergent in sequence from other SET domains, but also appear to mediate protein-protein interaction. The SET domain consists of two regions known as SET-N and SET-C. SET-C forms an unusual and conserved knot-like structure of probably functional importance. Additionally to SET-N and SET-C, an insert region (SET-I) and flanking regions of high structural variability form part of the overall structure." 
Q#6424 - CGI_10010761 superfamily 243034 34 121 3.28E-12 63.9384 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#6424 - CGI_10010761 superfamily 243034 210 300 0.000135863 40.8264 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number 
of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#6424 - CGI_10010761 superfamily 243091 623 662 6.89E-07 48.2572 cl02566 SET superfamily N - "SET domain; SET domains are protein lysine methyltransferase enzymes. SET domains appear to be protein-protein interaction domains. It has been demonstrated that SET domains mediate interactions with a family of proteins that display similarity with dual-specificity phosphatases (dsPTPases). A subset of SET domains have been called PR domains. These domains are divergent in sequence from other SET domains, but also appear to mediate protein-protein interaction. The SET domain consists of two regions known as SET-N and SET-C. SET-C forms an unusual and conserved knot-like structure of probably functional importance. Additionally to SET-N and SET-C, an insert region (SET-I) and flanking regions of high structural variability form part of the overall structure." Q#6425 - CGI_10010762 superfamily 244723 355 479 1.55E-34 130.508 cl07443 Cdt1_m superfamily C - "The middle winged helix fold of replication licensing factor Cdt1 binds geminin to inhibit binding of the MCM complex to origins of replication and DNA; Cdt1 is a replication licensing factor in eukaryotes that recruits the Minichromosome Maintenance Complex (MCM2-7) to the origin recognition complex (ORC). 
The Cdt1 protein is divided into three regions based on sequence comparison and biochemical analyses: the N-terminal region (Cdt1_n) binds DNA in a sequence-, strand-, and conformation-independent manner; the middle winged helix fold (Cdt1_m) binds geminin to inhibit both binding of the MCM complex to origins of replication and DNA; and the C-terminal region (Cdt1_c) is essential for Cdt1 activity and directly interacts with the MCM2-7 helicase. Precise duplication of chromosomal DNA is required for genomic stability during replication. Assembly of replication factors to start DNA replication in eukaryotes must occur only once per cell cycle. To form a pre-replicative complex on replication origins in the G phase, ORC first binds origin DNA and triggers the binding of Cdc6 and Cdt1. These two factors recruit a putative replicative helicase and the MCM2-7. The MCM2-7 complex promotes the unwinding of DNA origins, and the binding of additional factors to initiate the DNA replication in S-phase. Cdt1 is present during G1 and early S phase of the cell cycle and degraded during the late S, G2, and M phases. The winged helix fold structure of Cdt1_m is similar to the structures of Cdt1_c and other archaeal homologues of the eukaryotic replication initiator, without apparent sequence similarity." Q#6425 - CGI_10010762 superfamily 176572 465 575 6.65E-30 115.457 cl14631 Cdt1_c superfamily - - "The C-terminal fold of replication licensing factor Cdt1 is essential for Cdt1 activity and directly interacts with MCM2-7 helicase; Cdt1 is a replication licensing factor in eukaryotes that recruits the Minichromosome Maintenance Complex (MCM2-7) to the Origin Recognition Complex (ORC). 
The Cdt1 protein is divided into three regions based on sequence comparison and biochemical analyses: the N-terminal region (Cdt1_n) binds DNA in a sequence-, strand-, and conformation-independent manner; the middle winged helix fold (Cdt1_m) binds geminin to inhibit both binding of the MCM complex to origins of replication and DNA; and the C-terminal region (Cdt1_c) is essential for Cdt1 activity and directly interacts with the MCM2-7 helicase. Precise duplication of chromosomal DNA is required for genomic stability during replication. Assembly of replication factors to start DNA replication in eukaryotes must occur only once per cell cycle. To form a pre-replicative complex on replication origins in the G phase, ORC first binds origin DNA and triggers the binding of Cdc6 and Cdt1. These two factors recruit a putative replicative helicase and the MCM2-7. The MCM2-7 complex promotes the unwinding of DNA origins, and the binding of additional factors to initiate the DNA replication in S-phase. Cdt1 is present during G1 and early S phase of the cell cycle and is degraded during the late S, G2, and M phases. The winged helix fold structure of Cdt1_m is similar to the structures of Cdt1_c and archaeal homologues of the eukaryotic replication initiator, without apparent sequence similarity." Q#6428 - CGI_10010765 superfamily 246680 163 225 1.59E-10 55.0342 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 
They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#6428 - CGI_10010765 superfamily 205042 83 102 0.00382093 34.2352 cl15053 RHIM superfamily N - "RIP homotypic interaction motif; RIP proteins are receptor-interacting serine/threonine-protein kinases or cell death proteins. This interacting domain is involved in virus recognition. The RHIM domain is necessary for the recruitment of RIP and RIP3 by the IFN-inducible protein DNA-dependent activator of IRFs (DAI), also known as DLM-1 or Z-DNA binding protein (ZBP1). Both the RIP kinases contribute to DAI-induced NF-kappaB activation. RIP3 undergoes auto phosphorylation on binding to DAI." Q#6429 - CGI_10010766 superfamily 220692 37 335 2.21E-23 97.6601 cl18570 7TM_GPCR_Srw superfamily - - Serpentine type 7TM GPCR chemoreceptor Srw; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srw is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. The genes encoding Srw do not appear to be under as strong an adaptive evolutionary pressure as those of Srz. Q#6430 - CGI_10010767 superfamily 220692 35 333 9.95E-25 101.512 cl18570 7TM_GPCR_Srw superfamily - - Serpentine type 7TM GPCR chemoreceptor Srw; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srw is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. 
The genes encoding Srw do not appear to be under as strong an adaptive evolutionary pressure as those of Srz. Q#6431 - CGI_10010768 superfamily 243035 19 81 0.000313628 36.0586 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. 
Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#6433 - CGI_10010770 superfamily 248264 58 80 0.00318163 32.5942 cl17710 DDE_4 superfamily C - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#6434 - CGI_10010771 superfamily 243072 10 67 1.38E-05 41.6003 cl02529 ANK superfamily NC - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#6434 - CGI_10010771 superfamily 222209 76 104 0.000989802 36.2249 cl18648 UvrD_C_2 superfamily N - Family description; This domain is found at the C-terminus of a wide variety of helicase enzymes. This domain has a AAA-like structural fold. Q#6435 - CGI_10001665 superfamily 245225 1519 1969 2.77E-75 261.409 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily - - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein coupled receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine- binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. 
The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#6435 - CGI_10001665 superfamily 245225 2016 2462 1.07E-68 242.149 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily - - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein coupled receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine- binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. 
The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#6435 - CGI_10001665 superfamily 245225 55 506 2.01E-54 200.162 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily - - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein coupled receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine- binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. 
The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#6435 - CGI_10001665 superfamily 245225 1041 1416 8.73E-35 140.841 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily - - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein coupled receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine- binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. 
The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#6435 - CGI_10001665 superfamily 245225 560 1002 3.51E-33 135.833 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily - - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein coupled receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine- binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. 
The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#6435 - CGI_10001665 superfamily 215648 2554 2685 1.16E-19 91.1179 cl02802 7tm_3 superfamily C - "7 transmembrane sweet-taste receptor of 3 GCPR; This is a domain of seven transmembrane regions that forms the C-terminus of some subclass 3 G-coupled-protein receptors. It is often associated with a downstream cysteine-rich linker domain, NCD3G pfam07562, which is the human sweet-taste receptor, and the N-terminal domain, ANF_receptor pfam01094. The seven TM regions assemble in such a way as to produce a docking pocket into which such molecules as cyclamate and lactisole have been found to bind and consequently confer the taste of sweetness." Q#6436 - CGI_10001666 superfamily 246597 15 153 2.22E-100 293.362 cl13995 MPP_superfamily superfamily N - "metallophosphatase superfamily, metallophosphatase domain; Metallophosphatases (MPPs), also known as metallophosphoesterases, phosphodiesterases (PDEs), binuclear metallophosphoesterases, and dimetal-containing phosphoesterases (DMPs), represent a diverse superfamily of enzymes with a conserved domain containing an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. 
This superfamily includes: the phosphoprotein phosphatases (PPPs), Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination." Q#6436 - CGI_10001666 superfamily 246597 1 23 1.63E-10 56.8498 cl13995 MPP_superfamily superfamily NC - "metallophosphatase superfamily, metallophosphatase domain; Metallophosphatases (MPPs), also known as metallophosphoesterases, phosphodiesterases (PDEs), binuclear metallophosphoesterases, and dimetal-containing phosphoesterases (DMPs), represent a diverse superfamily of enzymes with a conserved domain containing an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. This superfamily includes: the phosphoprotein phosphatases (PPPs), Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination." Q#6437 - CGI_10003243 superfamily 248213 60 107 0.000388136 36.7841 cl17659 DivIC superfamily C - Septum formation initiator; DivIC from B. subtilis is necessary for both vegetative and sporulation septum formation. These proteins are mainly composed of an amino terminal coiled-coil. 
Q#6444 - CGI_10002890 superfamily 238012 215 263 3.27E-05 40.8006 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#6444 - CGI_10002890 superfamily 243198 29 213 1.49E-61 198.738 cl02806 Laminin_N superfamily - - Laminin N-terminal (Domain VI); Laminin N-terminal (Domain VI). Q#6444 - CGI_10002890 superfamily 238012 274 319 0.00312478 35.0226 cl11390 EGF_Lam superfamily C - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#6446 - CGI_10002892 superfamily 248458 59 437 1.41E-22 97.3844 cl17904 MFS superfamily - - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. 
Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." 
Q#6447 - CGI_10002893 superfamily 243034 35 134 9.89E-17 78.5759 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#6447 - CGI_10002893 superfamily 243034 421 533 7.95E-07 48.9156 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of 
TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#6447 - CGI_10002893 superfamily 243034 246 371 1.13E-05 45.4488 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 
subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#6447 - CGI_10002893 superfamily 243034 168 274 0.0001609 41.982 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#6447 - CGI_10002893 superfamily 248422 912 1161 6.56E-25 107.022 cl17868 CHAT superfamily - - CHAT domain; These proteins appear to be related to peptidases in peptidase clan CD that includes the caspases. This domain has been termed the CHAT domain for Caspase HetF Associated with Tprs. This family has been identified as a sister group to the separins. Q#6447 - CGI_10002893 superfamily 248422 723 788 2.69E-09 58.8717 cl17868 CHAT superfamily C - CHAT domain; These proteins appear to be related to peptidases in peptidase clan CD that includes the caspases. 
This domain has been termed the CHAT domain for Caspase HetF Associated with Tprs. This family has been identified as a sister group to the separins. Q#6452 - CGI_10006042 superfamily 220609 11 74 2.87E-08 46.126 cl10860 EnY2 superfamily C - Transcription factor e(y)2; EnY2 is a small transcription factor which is combined in a complex with the TAFII40 protein. The protein is conserved from paramecium to humans. Q#6455 - CGI_10006045 superfamily 241644 33 169 6.69E-61 187.795 cl00154 UBCc superfamily - - "Ubiquitin-conjugating enzyme E2, catalytic (UBCc) domain. This is part of the ubiquitin-mediated protein degradation pathway in which a thiol-ester linkage forms between a conserved cysteine and the C-terminus of ubiquitin and complexes with ubiquitin protein ligase enzymes, E3. This pathway regulates many fundamental cellular processes. There are also other E2s which form thiol-ester linkages without the use of E3s as well as several UBC homologs (TSG101, Mms2, Croc-1 and similar proteins) which lack the active site cysteine essential for ubiquitination and appear to function in DNA repair pathways which were omitted from the scope of this CD." Q#6456 - CGI_10006046 superfamily 216363 183 325 2.82E-33 126.046 cl08312 UPF0029 superfamily - - Uncharacterized protein family UPF0029; Uncharacterized protein family UPF0029. Q#6456 - CGI_10006046 superfamily 243141 35 125 1.28E-15 75.0863 cl02687 RWD superfamily - - "RWD domain; This domain was identified in WD40 repeat proteins and Ring finger domain proteins. The function of this domain is unknown. GCN2 is the alpha-subunit of the only translation initiation factor (eIF2 alpha) kinase that appears in all eukaryotes. Its function requires an interaction with GCN1 via the domain at its N-terminus, which is termed the RWD domain after three major RWD-containing proteins: RING finger-containing proteins, WD-repeat-containing proteins, and yeast DEAD (DEXD)-like helicases. 
The structure forms an alpha + beta sandwich fold consisting of two layers: a four-stranded antiparallel beta-sheet, and three side-by-side alpha-helices." Q#6456 - CGI_10006046 superfamily 215647 1008 1185 1.58E-06 49.5293 cl18338 7tm_2 superfamily N - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GPCRs). They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#6458 - CGI_10009626 superfamily 241575 134 200 1.42E-08 49.5783 cl00054 DSRM superfamily - - "Double-stranded RNA binding motif. Binding is not sequence specific but is highly specific for double stranded RNA. Found in a variety of proteins including dsRNA dependent protein kinase PKR, RNA helicases, Drosophila staufen protein, E. coli RNase III, RNases H1, and dsRNA dependent adenosine deaminases." Q#6458 - CGI_10009626 superfamily 241575 46 111 3.09E-07 45.7263 cl00054 DSRM superfamily - - "Double-stranded RNA binding motif. Binding is not sequence specific but is highly specific for double stranded RNA. Found in a variety of proteins including dsRNA dependent protein kinase PKR, RNA helicases, Drosophila staufen protein, E. 
coli RNase III, RNases H1, and dsRNA dependent adenosine deaminases." Q#6459 - CGI_10009627 superfamily 243132 57 353 1.56E-99 304.301 cl02661 A_deamin superfamily C - "Adenosine-deaminase (editase) domain; Adenosine deaminases acting on RNA (ADARs) can deaminate adenosine to form inosine. In long double-stranded RNA, this process is non-specific; it occurs site-specifically in RNA transcripts. The former is important in defence against viruses, whereas the latter may affect splicing or untranslated regions. They are primarily nuclear proteins, but a longer isoform of ADAR1 is found predominantly in the cytoplasm. ADARs are derived from the Tad1-like tRNA deaminases that are present across eukaryotes. These in turn belong to the nucleotide/nucleic acid deaminase superfamily and are characterized by a distinct insert between the two conserved cysteines that are involved in binding zinc." Q#6460 - CGI_10009628 superfamily 217293 1 168 2.07E-67 215.575 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#6460 - CGI_10009628 superfamily 202474 175 402 2.70E-46 160.512 cl08379 Neur_chan_memb superfamily - - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#6461 - CGI_10009629 superfamily 217293 23 62 3.80E-10 51.8647 cl03788 Neur_chan_LBD superfamily C - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#6462 - CGI_10009630 superfamily 217293 54 271 3.84E-71 228.671 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. 
This domain forms a pentameric arrangement in the known structure. Q#6462 - CGI_10009630 superfamily 202474 278 510 3.56E-26 106.199 cl08379 Neur_chan_memb superfamily - - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#6464 - CGI_10009632 superfamily 218200 32 260 1.47E-69 223.783 cl04660 Glyco_transf_54 superfamily - - "N-Acetylglucosaminyltransferase-IV (GnT-IV) conserved region; The complex-type of oligosaccharides are synthesised through elongation by glycosyltransferases after trimming of the precursor oligosaccharides transferred to proteins in the endoplasmic reticulum. N-Acetylglucosaminyltransferases (GnTs) take part in the formation of branches in the biosynthesis of complex-type sugar chains. In vertebrates, six GnTs, designated as GnT-I to -VI, which catalyze the transfer of GlcNAc to the core mannose residues of Asn-linked sugar chains, have been identified. GnT-IV (EC:2.4.1.145) catalyzes the transfer of GlcNAc from UDP-GlcNAc to the GlcNAc1-2Man1-3 arm of core oligosaccharide [Gn2(22)core oligosaccharide] and forms GlcNAc1-4(GlcNAc1-2)Man1-3 structure on the core oligosaccharide (Gn3(2,4,2)core oligosaccharide). In some members the conserved region occupies all but the very far N-terminus, where there is a signal sequence on all members. For other members the conserved region does not occupy the entire protein but is still to the N-terminus of the protein." Q#6467 - CGI_10009635 superfamily 241563 61 97 0.00279577 38.2292 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#6468 - CGI_10003812 superfamily 243034 191 310 6.71E-05 41.2116 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#6468 - CGI_10003812 superfamily 243034 321 399 0.00017962 39.6708 cl02429 TPR superfamily C - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number 
of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#6469 - CGI_10003813 superfamily 248099 145 220 2.17E-10 56.9491 cl17545 Bromo_TP superfamily - - Bromodomain associated; This domain is predicted to bind DNA and is often found associated with pfam00439 and in transcription factors. It has a histone-like fold. Q#6472 - CGI_10003816 superfamily 193257 3243 3478 1.03E-38 148.21 cl15086 AAA_9 superfamily - - "ATP-binding dynein motor region D5; The 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This particular family is the D5 ATP-binding region of the motor, but has lost its P-loop." Q#6472 - CGI_10003816 superfamily 193253 2896 3228 1.39E-32 133.238 cl15084 MT superfamily - - "Microtubule-binding stalk of dynein motor; the 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. 
The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This family is the region between D4 and D5 and is the two predicted alpha-helical coiled coil segments that form the stalk supporting the ATP-sensitive microtubule binding component." Q#6472 - CGI_10003816 superfamily 193256 2615 2834 7.11E-25 108.111 cl18189 AAA_8 superfamily - - "P-loop containing dynein motor region D4; The 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This particular family is the D4 ATP-binding region of the motor." Q#6472 - CGI_10003816 superfamily 247743 1981 2106 1.31E-19 89.2768 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#6472 - CGI_10003816 superfamily 193251 2258 2497 5.79E-12 68.0388 cl18188 AAA_7 superfamily - - "P-loop containing dynein motor region D3; the 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. 
The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This particular family is the D3 and is an ATP binding site." Q#6473 - CGI_10003817 superfamily 217613 83 197 4.84E-46 150.018 cl04154 Cullin_binding superfamily - - "Cullin binding; This domain binds to cullins and to Rbx-1, components of an E3 ubiquitin ligase complex for neddylation. Neddylation is the process by which the C-terminal glycine of the ubiquitin-like protein Nedd8 is covalently linked to lysine residues in a protein through an isopeptide bond. The structure of this domain is composed entirely of alpha helices." Q#6474 - CGI_10004931 superfamily 247809 122 328 9.12E-93 282.268 cl17255 ATP-grasp_4 superfamily - - ATP-grasp domain; This family includes a diverse set of enzymes that possess ATP-dependent carboxylate-amine ligase activity. Q#6474 - CGI_10004931 superfamily 244920 343 450 7.23E-47 158.731 cl08365 Biotin_carb_C superfamily - - "Biotin carboxylase C-terminal domain; Biotin carboxylase is a component of the acetyl-CoA carboxylase multi-component enzyme which catalyzes the first committed step in fatty acid synthesis in animals, plants and bacteria. Most of the active site residues reported in reference are in this C-terminal domain." Q#6474 - CGI_10004931 superfamily 201133 27 116 2.81E-36 129.91 cl02837 CPSase_L_chain superfamily - - "Carbamoyl-phosphate synthase L chain, N-terminal domain; Carbamoyl-phosphate synthase catalyzes the ATP-dependent synthesis of carbamyl-phosphate from glutamine or ammonia and bicarbonate. This important enzyme initiates both the urea cycle and the biosynthesis of arginine and/or pyrimidines. The carbamoyl-phosphate synthase (CPS) enzyme in prokaryotes is a heterodimer of a small and large chain. 
The small chain promotes the hydrolysis of glutamine to ammonia, which is used by the large chain to synthesise carbamoyl phosphate. See pfam00988. The small chain has a GATase domain in the carboxyl terminus. See pfam00117." Q#6475 - CGI_10004932 superfamily 243082 1666 1685 1.31E-07 54.1855 cl02553 Peptidase_C19 superfamily C - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." Q#6476 - CGI_10004933 superfamily 243035 814 930 3.56E-30 116.565 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#6477 - CGI_10004934 superfamily 221177 158 409 1.82E-79 252.352 cl13203 Slu7 superfamily - - "Pre-mRNA splicing Prp18-interacting factor; The spliceosome, an assembly of snRNAs (U1, U2, U4/U6, and U5) and proteins, catalyzes the excision of introns from pre-mRNAs in two successive trans-esterification reactions. Step 2 depends upon integral spliceosome constituents such as U5 snRNA and Prp8 and non-spliceosomal proteins Prp16, Slu7, Prp18, and Prp22. ATP hydrolysis by the DEAH-box enzyme Prp16 promotes a conformational change in the spliceosome that leads to protection of the 3'ss from targeted RNase H cleavage. This change, which probably reflects binding of the 3'ss PyAG in the catalytic centre of the spliceosome, requires the ordered recruitment of Slu7, Prp18, and Prp22 to the spliceosome. There is a close functional relationship between Prp8, Prp18, and Slu7, and Prp18 interacts with Slu7, so that together they recruit Prp22 to the spliceosome. Most members of the family carry a zinc-finger of the CCHC-type upstream of this domain." Q#6483 - CGI_10015660 superfamily 245847 3 137 0.000534623 36.3264 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#6485 - CGI_10015662 superfamily 241563 61 97 0.0043574 36.3032 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#6486 - CGI_10015663 superfamily 241568 14 68 6.70E-05 35.9016 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#6487 - CGI_10015664 superfamily 247068 167 239 0.000398701 39.9894 cl15786 CA_like superfamily N - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6487 - CGI_10015664 superfamily 245201 721 996 9.64E-120 372.759 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. 
These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#6488 - CGI_10015665 superfamily 215754 119 206 1.27E-21 87.3088 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#6488 - CGI_10015665 superfamily 215754 221 313 3.50E-19 80.3752 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#6488 - CGI_10015665 superfamily 215754 11 99 2.44E-18 78.064 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#6489 - CGI_10015666 superfamily 247068 739 839 8.16E-21 92.7617 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 3406 3500 3.37E-19 88.1393 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 1392 1484 2.52E-16 79.6649 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 4709 4805 3.24E-16 79.2797 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 1933 2026 1.62E-15 77.3537 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 4060 4154 2.57E-15 76.9685 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 3513 3612 4.52E-15 76.1981 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 7170 7263 8.77E-15 75.4277 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 1499 1590 9.49E-15 75.0425 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 1279 1379 1.42E-14 74.6573 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 849 946 2.42E-14 73.8869 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 6454 6544 2.95E-14 73.8869 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 158 249 3.72E-14 73.5017 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 1068 1159 4.92E-14 73.1165 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 3624 3719 7.03E-14 72.7313 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 8900 9001 8.01E-14 72.3461 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 9754 9855 8.01E-14 72.3461 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 2688 2782 1.52E-13 71.5757 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 8434 8526 2.05E-13 71.1905 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 9288 9380 2.05E-13 71.1905 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 5628 5722 2.65E-13 70.8053 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 8325 8413 3.93E-13 70.4201 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 5024 5114 4.18E-13 70.4201 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 5421 5511 7.03E-13 69.6497 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 4491 4583 1.63E-12 68.4941 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 7275 7365 2.03E-12 68.4941 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 2475 2568 2.12E-12 68.1089 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 2149 2242 2.18E-12 68.1089 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 2255 2356 2.43E-12 68.1089 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 3844 3937 4.51E-12 67.3385 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 5523 5615 5.11E-12 67.3385 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 6950 7053 6.44E-12 66.9533 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 2041 2137 6.68E-12 66.9533 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 1603 1700 7.40E-12 66.5681 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 9394 9493 7.69E-12 66.5681 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 8540 8639 1.65E-11 65.7977 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 6847 6942 1.93E-11 65.4125 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 7798 7889 2.17E-11 65.4125 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 1715 1812 3.67E-11 64.6421 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 8131 8209 4.41E-11 64.2569 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 4822 4912 6.35E-11 63.8717 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 3300 3394 6.75E-11 63.8717 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 4167 4265 7.31E-11 63.8717 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 4277 4370 8.94E-11 63.4865 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 7587 7680 1.07E-10 63.1013 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 2368 2449 1.12E-10 63.1013 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 2908 2994 1.42E-10 62.7161 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 3952 4045 1.51E-10 62.7161 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 6246 6339 2.41E-10 62.3309 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 2601 2675 2.52E-10 61.9457 cl15786 CA_like superfamily N - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 6145 6234 2.74E-10 61.9457 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 4384 4479 4.92E-10 61.1753 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 3117 3211 5.86E-10 61.1753 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 958 1055 1.79E-09 59.6345 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 7377 7473 1.25E-08 56.9382 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 402 507 1.60E-08 56.9382 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 3732 3832 1.66E-08 56.553 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 6662 6757 1.71E-08 56.553 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 7065 7157 2.67E-08 56.1678 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 8220 8311 3.19E-08 55.7826 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 6037 6132 4.11E-08 55.3974 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 2797 2894 4.39E-08 55.3974 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 5932 6025 4.74E-08 55.3974 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 6353 6441 6.39E-08 55.0122 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 1172 1265 8.83E-08 54.627 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 1828 1921 1.20E-07 54.2418 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 6557 6649 2.22E-07 53.4714 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 5316 5408 3.54E-07 52.701 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 7483 7574 4.07E-07 52.701 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 7692 7785 4.70E-07 52.3158 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 8009 8101 5.00E-07 52.3158 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 7901 7983 5.16E-07 52.3158 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 632 729 7.74E-07 51.5454 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 5740 5830 8.09E-07 51.5454 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 9202 9267 1.07E-06 51.1602 cl15786 CA_like superfamily N - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 4596 4700 2.48E-06 50.0046 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 6767 6836 3.53E-06 49.6194 cl15786 CA_like superfamily N - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 8768 8881 6.59E-06 48.849 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 9622 9735 6.94E-06 48.849 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 4924 5016 1.58E-05 47.6934 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 3013 3109 0.000112537 44.997 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 8652 8752 0.000122884 44.997 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 9506 9606 0.000122884 44.997 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 5838 5895 0.00050618 43.071 cl15786 CA_like superfamily C - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6489 - CGI_10015666 superfamily 247068 5243 5304 0.00347088 40.3746 cl15786 CA_like superfamily N - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6490 - CGI_10015667 superfamily 247723 10 81 1.30E-14 71.1821 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. 
RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#6490 - CGI_10015667 superfamily 243098 190 237 8.37E-13 65.3119 cl02573 TUDOR superfamily - - "Tudor domains are found in many eukaryotic organisms and have been implicated in protein-protein interactions in which methylated protein substrates bind to these domains. For example, the Tudor domain of Survival of Motor Neuron (SMN) binds to symmetrically dimethylated arginines of arginine-glycine (RG) rich sequences found in the C-terminal tails of Sm proteins. The SMN protein is linked to spinal muscular atrophy. Another example is the tandem tudor domains of 53BP1, which bind to histone H4 specifically dimethylated at Lys20 (H4-K20me2). 53BP1 is a key transducer of the DNA damage checkpoint signal." Q#6490 - CGI_10015667 superfamily 245201 898 1017 1.09E-08 55.7057 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." 
Q#6490 - CGI_10015667 superfamily 245039 412 515 0.0022921 40.2584 cl09232 YqaJ superfamily N - "YqaJ-like viral recombinase domain; This protein family is found in many different bacterial species but is of viral origin. The protein forms an oligomer and functions as a processive alkaline exonuclease that digests linear double-stranded DNA in a Mg(2+)-dependent reaction. It has a preference for 5'-phosphorylated DNA ends. It thus forms part of the two-component SynExo viral recombinase functional unit." Q#6491 - CGI_10015668 superfamily 241782 222 585 1.97E-156 457.793 cl00321 AAT_I superfamily - - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phosphorylase family (fold type V)." 
Q#6495 - CGI_10015672 superfamily 110440 314 339 0.000870792 36.6169 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#6496 - CGI_10015673 superfamily 241563 68 109 5.14E-06 44.0072 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#6496 - CGI_10015673 superfamily 241563 28 59 0.00616994 35.1476 cl00034 BBOX superfamily N - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#6497 - CGI_10015674 superfamily 241563 68 109 2.04E-06 45.1628 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#6497 - CGI_10015674 superfamily 241563 28 59 0.00183013 36.6884 cl00034 BBOX superfamily N - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#6498 - CGI_10015675 superfamily 241563 68 109 2.41E-06 45.1628 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#6498 - CGI_10015675 superfamily 241563 21 59 0.000685314 37.844 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#6499 - CGI_10006466 superfamily 222150 292 317 0.000868479 38.1417 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#6500 - CGI_10006467 superfamily 201217 440 486 7.89E-07 46.3648 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#6500 - CGI_10006467 superfamily 201217 277 326 2.07E-05 42.1276 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#6500 - CGI_10006467 superfamily 201217 179 238 8.47E-05 40.2016 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#6500 - CGI_10006467 superfamily 205718 263 289 0.0006446 37.4698 cl16296 RCC1_2 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. 
Q#6500 - CGI_10006467 superfamily 205718 422 451 0.000738965 37.4698 cl16296 RCC1_2 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#6500 - CGI_10006467 superfamily 201217 330 377 0.00160472 36.7348 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#6502 - CGI_10006469 superfamily 247904 1 74 8.57E-19 76.4403 cl17350 HD_3 superfamily N - HD domain; HD domains are metal dependent phosphohydrolases. Q#6503 - CGI_10006470 superfamily 247639 66 323 2.59E-45 156.853 cl16914 O-FucT_like superfamily - - "GDP-fucose protein O-fucosyltransferase and related proteins; O-fucosyltransferase-like proteins are GDP-fucose dependent enzymes with similarities to the family 1 glycosyltransferases (GT1). They are soluble ER proteins that may be proteolytically cleaved from a membrane-associated preprotein, and are involved in the O-fucosylation of protein substrates, the core fucosylation of growth factor receptors, and other processes." Q#6504 - CGI_10006471 superfamily 150167 13 451 3.71E-112 340.449 cl09652 DUF2003 superfamily - - Eukaryotic protein of unknown function (DUF2003); This is a family of proteins of unknown function which adopt an alpha helical and beta sheet structure. Q#6505 - CGI_10006472 superfamily 241874 237 678 2.15E-65 225.005 cl00456 SLC5-6-like_sbd superfamily - - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. 
SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#6506 - CGI_10012031 superfamily 247805 32 142 2.55E-15 69.6736 cl17251 DEXDc superfamily C - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#6507 - CGI_10012033 superfamily 248264 207 365 2.14E-44 152.391 cl17710 DDE_4 superfamily - - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#6507 - CGI_10012033 superfamily 222263 123 212 0.00671664 34.6009 cl16321 DDE_4_2 superfamily - - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. 
The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#6511 - CGI_10012037 superfamily 241607 64 89 4.47E-06 40.7162 cl00097 KAZAL_FS superfamily C - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chyomotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#6511 - CGI_10012037 superfamily 241607 24 58 0.000304672 35.7086 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chyomotrypsin, avian ovomucoids, and elastases. 
The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#6512 - CGI_10012038 superfamily 222324 129 239 3.50E-22 87.8326 cl16352 zf-3CxxC superfamily - - Zinc-binding domain; This is a family with several pairs of CxxC motifs possibly representing a multiple zinc-binding region. Only one pair of cysteines is associated with a highly conserved histidine residue. 
Q#6515 - CGI_10012041 superfamily 247792 16 66 2.26E-08 50.9072 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#6515 - CGI_10012041 superfamily 241563 101 130 0.00491239 35.1476 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#6517 - CGI_10012043 superfamily 246669 1 110 4.15E-59 191.509 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. 
C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#6518 - CGI_10012044 superfamily 246669 1256 1375 4.24E-80 260.021 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#6518 - CGI_10012044 superfamily 246669 357 522 2.85E-77 252.104 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. 
However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#6518 - CGI_10012044 superfamily 241566 235 284 2.85E-16 75.6063 cl00040 C1 superfamily - - "Protein kinase C conserved region 1 (C1) . Cysteine-rich zinc binding domain. Some members of this domain family bind phorbol esters and diacylglycerol, some are reported to bind RasGTP. May occur in tandem arrangement. Diacylglycerol (DAG) is a second messenger, released by activation of Phospholipase D. Phorbol Esters (PE) can act as analogues of DAG and mimic its downstream effects in, for example, tumor promotion. Protein Kinases C are activated by DAG/PE, this activation is mediated by their N-terminal conserved region (C1). DAG/PE binding may be phospholipid dependent. C1 domains may also mediate DAG/PE signals in chimaerins (a family of Rac GTPase activating proteins), RasGRPs (exchange factors for Ras/Rap1), and Munc13 isoforms (scaffolding proteins involved in exocytosis)." Q#6518 - CGI_10012044 superfamily 220800 1064 1221 1.38E-39 144.755 cl11172 Membr_traf_MHD superfamily - - "Munc13 (mammalian uncoordinated) homology domain; Munc13 proteins constitute a family of three highly homologous molecules (Munc13-1, Munc13-2 and Munc13-3) with homology to Caenorhabditis elegans unc-13p. Munc13 proteins contain a phorbol ester-binding C1 domain and two C2 domains, which are Ca2+/phospholipid binding domains. Sequence analyses have uncovered two regions called Munc13 homology domains 1 (MHD1) and 2 (MHD2) that are arranged between two flanking C2 domains. MHD1 and MHD2 domains are present in a wide variety of proteins from Arabidopsis thaliana, C. 
elegans, Drosophila melanogaster, mouse, rat and human, some of which may function in a Munc13-like manner to regulate membrane trafficking. The MHD1 and MHD2 domains are predicted to be alpha-helical." Q#6518 - CGI_10012044 superfamily 218976 714 820 7.64E-33 124.868 cl05671 DUF1041 superfamily - - "Domain of Unknown Function (DUF1041); This family consists of several eukaryotic domains of unknown function. Members of this family are often found in tandem repeats and co-occur with pfam00168, pfam00130 and pfam00169 domains." Q#6521 - CGI_10005658 superfamily 248264 1 138 7.36E-10 55.7062 cl17710 DDE_4 superfamily N - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#6522 - CGI_10004116 superfamily 246918 42 93 4.68E-05 36.6146 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#6531 - CGI_10005582 superfamily 110440 148 172 0.00187157 34.6909 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." 
Q#6532 - CGI_10005583 superfamily 241563 63 96 0.00655467 34.9556 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#6535 - CGI_10005586 superfamily 246925 416 661 2.66E-28 115.916 cl15309 LRR_RI superfamily - - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#6537 - CGI_10005588 superfamily 241568 3 43 0.000195538 35.9016 cl00043 CCP superfamily C - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." 
Q#6537 - CGI_10005588 superfamily 245226 42 114 4.86E-05 39.2055 cl10012 DnaQ_like_exo superfamily C - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#6539 - CGI_10010062 superfamily 241574 1 58 6.87E-18 79.5521 cl00053 PTPc superfamily N - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. 
Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#6539 - CGI_10010062 superfamily 241574 195 299 9.07E-12 62.2181 cl00053 PTPc superfamily N - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#6541 - CGI_10010064 superfamily 245234 205 256 2.40E-05 41.1238 cl10022 ABM superfamily C - Antibiotic biosynthesis monooxygenase; This domain is found in monooxygenases involved in the biosynthesis of several antibiotics by Streptomyces species. Its occurrence as a repeat in Streptomyces coelicolor SCO1909 is suggestive that the other proteins function as multimers. There is also a conserved histidine which is likely to be an active site residue. Q#6541 - CGI_10010064 superfamily 247856 10 31 0.0054368 33.7908 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#6543 - CGI_10010066 superfamily 241741 1 146 5.80E-76 238.619 cl00270 PEPCK_HprK superfamily N - "Phosphoenolpyruvate carboxykinase (PEPCK), a critical gluconeogenic enzyme, catalyzes the first committed step in the diversion of tricarboxylic acid cycle intermediates toward gluconeogenesis. It catalyzes the reversible decarboxylation and phosphorylation of oxaloacetate to yield phosphoenolpyruvate and carbon dioxide, using a nucleotide molecule (ATP or GTP) for the phosphoryl transfer, and has a strict requirement for divalent metal ions for activity. PEPCK's separate into two phylogenetic groups based on their nucleotide substrate specificity (the ATP-, and GTP-dependent groups).HprK/P, the bifunctional histidine-containing protein kinase/phosphatase, controls the phosphorylation state of the phosphocarrier protein HPr and regulates the utilization of carbon sources by gram-positive bacteria. It catalyzes both the ATP-dependent phosphorylation of HPr and its dephosphorylation by phosphorolysis. PEPCK and the C-terminal catalytic domain of HprK/P are structurally similar with conserved active site residues suggesting that these two phosphotransferases have related functions." Q#6544 - CGI_10010068 superfamily 245208 10 241 8.54E-68 230.679 cl09933 ACAD superfamily C - "Acyl-CoA dehydrogenase; Both mitochondrial acyl-CoA dehydrogenases (ACAD) and peroxisomal acyl-CoA oxidases (AXO) catalyze the alpha,beta dehydrogenation of the corresponding trans-enoyl-CoA by FAD, which becomes reduced. The reduced form of ACAD is reoxidized in the oxidative half-reaction by electron-transferring flavoprotein (ETF), from which the electrons are transferred to the mitochondrial respiratory chain coupled with ATP synthesis. In contrast, AXO catalyzes a different oxidative half-reaction, in which the reduced FAD is reoxidized by molecular oxygen. 
The ACAD family includes the eukaryotic beta-oxidation enzymes, short (SCAD), medium (MCAD), long (LCAD) and very-long (VLCAD) chain acyl-CoA dehydrogenases. These enzymes all share high sequence similarity, but differ in their substrate specificities. The ACAD family also includes amino acid catabolism enzymes such as Isovaleryl-CoA dehydrogenase (IVD), short/branched chain acyl-CoA dehydrogenases (SBCAD), Isobutyryl-CoA dehydrogenase (IBDH), glutaryl-CoA dehydrogenase (GCD) and Crotonobetainyl-CoA dehydrogenase. The mitochondrial ACADs are generally homotetramers, except for VLCAD, which is a homodimer. Related enzymes include the SOS adaptive response protein aidB, Naphthocyclinone hydroxylase (NcnH), and Dibenzothiophene (DBT) desulfurization enzyme C (DszC)" Q#6544 - CGI_10010068 superfamily 245208 233 480 9.85E-66 224.901 cl09933 ACAD superfamily N - "Acyl-CoA dehydrogenase; Both mitochondrial acyl-CoA dehydrogenases (ACAD) and peroxisomal acyl-CoA oxidases (AXO) catalyze the alpha,beta dehydrogenation of the corresponding trans-enoyl-CoA by FAD, which becomes reduced. The reduced form of ACAD is reoxidized in the oxidative half-reaction by electron-transferring flavoprotein (ETF), from which the electrons are transferred to the mitochondrial respiratory chain coupled with ATP synthesis. In contrast, AXO catalyzes a different oxidative half-reaction, in which the reduced FAD is reoxidized by molecular oxygen. The ACAD family includes the eukaryotic beta-oxidation enzymes, short (SCAD), medium (MCAD), long (LCAD) and very-long (VLCAD) chain acyl-CoA dehydrogenases. These enzymes all share high sequence similarity, but differ in their substrate specificities. The ACAD family also includes amino acid catabolism enzymes such as Isovaleryl-CoA dehydrogenase (IVD), short/branched chain acyl-CoA dehydrogenases (SBCAD), Isobutyryl-CoA dehydrogenase (IBDH), glutaryl-CoA dehydrogenase (GCD) and Crotonobetainyl-CoA dehydrogenase.
The mitochondrial ACADs are generally homotetramers, except for VLCAD, which is a homodimer. Related enzymes include the SOS adaptive response protein aidB, Naphthocyclinone hydroxylase (NcnH), and Dibenzothiophene (DBT) desulfurization enzyme C (DszC)" Q#6547 - CGI_10010071 superfamily 243306 203 411 1.71E-99 298.325 cl03114 RNase_PH superfamily - - "RNase PH-like 3'-5' exoribonucleases; RNase PH-like 3'-5' exoribonucleases are enzymes that catalyze the 3' to 5' processing and decay of RNA substrates. Evolutionarily related members can be found in prokaryotes, archaea, and eukaryotes. Bacterial ribonuclease PH contains a single copy of this domain, and removes nucleotide residues following the -CCA terminus of tRNA. Polyribonucleotide nucleotidyltransferase (PNPase) contains two tandem copies of the domain and is involved in mRNA degradation in a 3'-5' direction. Archaeal exosomes contain two individually encoded RNase PH-like 3'-5' exoribonucleases and are required for 3' processing of the 5.8S rRNA. The eukaryotic exosome core is composed of six individually encoded RNase PH-like subunits, but it is not a phosphorolytic enzyme per se; it directly associates with Rrp44 and Rrp6, which are hydrolytic exoribonucleases related to bacterial RNase II/R and RNase D. All members of the RNase PH-like family form ring structures by oligomerization of six domains or subunits, except for a total of 3 subunits with tandem repeats in the case of PNPase, with a central channel through which the RNA substrate must pass to gain access to the phosphorolytic active sites." Q#6548 - CGI_10010072 superfamily 245206 40 282 2.26E-128 368.879 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes.
NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#6550 - CGI_10023077 superfamily 217293 33 234 1.07E-41 146.239 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#6551 - CGI_10023078 superfamily 220393 235 504 1.10E-68 230.725 cl10751 Tmem26 superfamily - - "Transmembrane protein 26; The function of this family of transmembrane proteins has not, as yet, been determined." Q#6551 - CGI_10023078 superfamily 217293 511 640 7.79E-25 104.252 cl03788 Neur_chan_LBD superfamily C - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#6551 - CGI_10023078 superfamily 217293 640 767 1.08E-24 103.867 cl03788 Neur_chan_LBD superfamily N - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#6551 - CGI_10023078 superfamily 202474 791 862 4.98E-09 56.5081 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. 
Q#6552 - CGI_10023079 superfamily 245608 2 200 2.09E-64 200.232 cl11421 FAA_hydrolase superfamily - - "Fumarylacetoacetate (FAA) hydrolase family; This family consists of fumarylacetoacetate (FAA) hydrolase, or fumarylacetoacetate hydrolase (FAH) and it also includes HHDD isomerase/OPET decarboxylase from E. coli strain W. FAA is the last enzyme in the tyrosine catabolic pathway, it hydrolyses fumarylacetoacetate into fumarate and acetoacetate which then join the citric acid cycle. Mutations in FAA cause type I tyrosinemia in humans this is an inherited disorder mainly affecting the liver leading to liver cirrhosis, hepatocellular carcinoma, renal tubular damages and neurologic crises amongst other symptoms. The enzymatic defect causes the toxic accumulation of phenylalanine/tyrosine catabolites. The E. coli W enzyme HHDD isomerase/OPET decarboxylase contains two copies of this domain and functions in fourth and fifth steps of the homoprotocatechuate pathway; here it decarboxylates OPET to HHDD and isomerises this to OHED. The final products of this pathway are pyruvic acid and succinic semialdehyde. This family also includes various hydratases and 4-oxalocrotonate decarboxylases which are involved in the bacterial meta-cleavage pathways for degradation of aromatic compounds. 2-hydroxypentadienoic acid hydratase, encoded by mhpD in E. coli, is involved in the phenylpropionic acid pathway of E. coli and catalyzes the conversion of 2-hydroxy pentadienoate to 4-hydroxy-2-keto-pentanoate and uses a Mn2+ co-factor. OHED hydratase encoded by hpcG in E. coli is involved in the homoprotocatechuic acid (HPC) catabolism. XylI in P. putida is a 4-Oxalocrotonate decarboxylase." Q#6556 - CGI_10023083 superfamily 244859 62 289 6.45E-16 75.3005 cl08171 HtrL_YibB superfamily - - "Bacterial protein of unknown function (HtrL_YibB); The protein from this rare, uncharacterized protein family is designated HtrL or YibB in E. 
coli, where its gene is found in a region of LPS core biosynthesis genes. Homologues are found in Shigella flexneri, Campylobacter jejuni, and Caenorhabditis elegans only. The htrL gene may represent an insertion to the LPS core biosynthesis region, rather than an LPS biosynthetic protein." Q#6558 - CGI_10023085 superfamily 245226 249 439 2.40E-112 343.04 cl10012 DnaQ_like_exo superfamily - - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#6558 - CGI_10023085 superfamily 219714 31 122 5.15E-20 86.5253 cl06895 PMC2NT superfamily - - "PMC2NT (NUC016) domain; This domain is found at the N-terminus of 3'-5' exonucleases with HRDC domains, and also in putative exosome components." 
Q#6558 - CGI_10023085 superfamily 207658 468 548 1.85E-15 73.1002 cl02578 HRDC superfamily - - HRDC domain; The HRDC (Helicase and RNase D C-terminal) domain has a putative role in nucleic acid binding. Mutations in the HRDC domain cause human disease. It is interesting to note that the RecQ helicase in Deinococcus radiodurans has three tandem HRDC domains. Q#6561 - CGI_10023088 superfamily 220691 151 349 0.00316287 37.5974 cl18569 7TM_GPCR_Srv superfamily N - Serpentine type 7TM GPCR chemoreceptor Srv; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srv is a member of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#6564 - CGI_10023091 superfamily 245201 948 1141 6.07E-19 87.2921 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#6564 - CGI_10023091 superfamily 244600 43 163 8.23E-48 168.171 cl07066 Mad3_BUB1_I superfamily - - Mad3/BUB1 homology region 1; Proteins containing this domain are checkpoint proteins involved in cell division. This region has been shown to be essential for the binding of BUB1 and MAD3 to CDC20p. Q#6568 - CGI_10023095 superfamily 243060 182 224 0.000131711 40.8252 cl02507 SEA superfamily C - "SEA domain; Domain found in Sea urchin sperm protein, Enterokinase, Agrin (SEA).
Proposed function of regulating or binding carbohydrate side chains. Recently a proteolytic activity has been shown for a SEA domain." Q#6571 - CGI_10023098 superfamily 203472 19 83 1.26E-11 55.3584 cl05835 B12D superfamily - - "NADH-ubiquinone reductase complex 1 MLRQ subunit; The MLRQ subunit of mitochondrial NADH-ubiquinone reductase complex I is nuclear and is found in plants, insects, fungi and higher metazoans. It appears to act within the membrane and, in mammals, is highly expressed in muscle and neural tissue, indicative of a role in ATP generation." Q#6572 - CGI_10023099 superfamily 241972 1 86 2.15E-56 171.799 cl00600 Ribosomal_L7Ae superfamily - - "Ribosomal protein L7Ae/L30e/S12e/Gadd45 family; This family includes: Ribosomal L7A from metazoa, Ribosomal L8-A and L8-B from fungi, 30S ribosomal protein HS6 from archaebacteria, 40S ribosomal protein S12 from eukaryotes, Ribosomal protein L30 from eukaryotes and archaebacteria. Gadd45 and MyD118." Q#6574 - CGI_10023101 superfamily 243146 44 90 1.34E-10 53.049 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#6574 - CGI_10023101 superfamily 243146 4 55 1.22E-08 47.5531 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#6575 - CGI_10023102 superfamily 241622 141 206 4.41E-17 72.9846 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes.
This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#6576 - CGI_10023103 superfamily 243096 823 1008 1.61E-37 141.281 cl02571 RhoGEF superfamily - - Guanine nucleotide exchange factor for Rho/Rac/Cdc42-like GTPases; Also called Dbl-homologous (DH) domain. It appears that PH domains invariably occur C-terminal to RhoGEF/DH domains. Q#6576 - CGI_10023103 superfamily 241566 564 613 1.53E-12 64.8207 cl00040 C1 superfamily - - "Protein kinase C conserved region 1 (C1) . Cysteine-rich zinc binding domain. Some members of this domain family bind phorbol esters and diacylglycerol, some are reported to bind RasGTP. May occur in tandem arrangement. Diacylglycerol (DAG) is a second messenger, released by activation of Phospholipase D. Phorbol Esters (PE) can act as analogues of DAG and mimic its downstream effects in, for example, tumor promotion. Protein Kinases C are activated by DAG/PE, this activation is mediated by their N-terminal conserved region (C1). DAG/PE binding may be phospholipid dependent. C1 domains may also mediate DAG/PE signals in chimaerins (a family of Rac GTPase activating proteins), RasGRPs (exchange factors for Ras/Rap1), and Munc13 isoforms (scaffolding proteins involved in exocytosis)." 
Q#6576 - CGI_10023103 superfamily 243090 353 475 2.43E-33 127.119 cl02565 RGS superfamily - - "Regulator of G protein signaling (RGS) domain superfamily; The RGS domain is an essential part of the Regulator of G-protein Signaling (RGS) protein family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play critical regulatory roles as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. While inactive, G-alpha-subunits bind GDP, which is released and replaced by GTP upon agonist activation. GTP binding leads to dissociation of the alpha-subunit and the beta-gamma-dimer, allowing them to interact with effectors molecules and propagate signaling cascades associated with cellular growth, survival, migration, and invasion. Deactivation of the G-protein signaling controlled by the RGS domain accelerates GTPase activity of the alpha subunit by hydrolysis of GTP to GDP, which results in the reassociation of the alpha-subunit with the beta-gamma-dimer and thereby inhibition of downstream activity. As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. RGS proteins are also involved in apoptosis and cell proliferation, as well as modulation of cardiac development. Several RGS proteins can fine-tune immune responses, while others play important roles in neuronal signals modulation. Some RGS proteins are principal elements needed for proper vision." 
Q#6576 - CGI_10023103 superfamily 247725 1024 1160 2.25E-31 122.393 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#6577 - CGI_10023104 superfamily 222150 295 319 4.76E-07 47.0013 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#6577 - CGI_10023104 superfamily 222150 322 346 2.89E-05 41.9937 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#6577 - CGI_10023104 superfamily 246975 282 302 0.000565549 38.0969 cl15478 zf-C2H2 superfamily - - "Zinc finger, C2H2 type; The C2H2 zinc finger is the classical zinc finger domain. The two conserved cysteines and histidines co-ordinate a zinc ion. The following pattern describes the zinc finger. #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C] Where X can be any amino acid, and numbers in brackets indicate the number of residues. The positions marked # are those that are important for the stable fold of the zinc finger. The final position can be either his or cys. The C2H2 zinc finger is composed of two short beta strands followed by an alpha helix. The amino terminal part of the helix binds the major groove in DNA binding zinc fingers. The accepted consensus binding sequence for Sp1 is usually defined by the asymmetric hexanucleotide core GGGCGG but this sequence does not include, among others, the GAG (=CTC) repeat that constitutes a high-affinity site for Sp1 binding to the wt1 promoter." 
Q#6577 - CGI_10023104 superfamily 222150 350 373 0.00141104 36.9861 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#6577 - CGI_10023104 superfamily 243091 30 135 0.001413 37.6991 cl02566 SET superfamily - - "SET domain; SET domains are protein lysine methyltransferase enzymes. SET domains appear to be protein-protein interaction domains. It has been demonstrated that SET domains mediate interactions with a family of proteins that display similarity with dual-specificity phosphatases (dsPTPases). A subset of SET domains have been called PR domains. These domains are divergent in sequence from other SET domains, but also appear to mediate protein-protein interaction. The SET domain consists of two regions known as SET-N and SET-C. SET-C forms an unusual and conserved knot-like structure of probably functional importance. Additionally to SET-N and SET-C, an insert region (SET-I) and flanking regions of high structural variability form part of the overall structure." Q#6577 - CGI_10023104 superfamily 222150 378 402 0.00572542 35.0601 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#6578 - CGI_10023105 superfamily 243077 4 56 1.14E-20 86.4453 cl02542 DnaJ superfamily - - "DnaJ domain or J-domain. DnaJ/Hsp40 (heat shock protein 40) proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperonine family. Hsp40 proteins are characterized by the presence of a J domain, which mediates the interaction with Hsp70. They may contain other domains as well, and the architectures provide a means of classification." Q#6579 - CGI_10023106 superfamily 248012 478 598 2.79E-23 96.5736 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. 
Q#6580 - CGI_10023107 superfamily 248012 457 595 0.00093716 38.4584 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#6581 - CGI_10023108 superfamily 247907 1 100 2.58E-13 69.3692 cl17353 LamG superfamily N - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." Q#6581 - CGI_10023108 superfamily 241611 1000 1151 2.21E-11 63.5616 cl00102 PTX superfamily - - "Pentraxins are plasma proteins characterized by their pentameric discoid assembly and their Ca2+ dependent ligand binding, such as Serum amyloid P component (SAP) and C-reactive Protein (CRP), which are cytokine-inducible acute-phase proteins implicated in innate immunity. CRP binds to ligands containing phosphocholine, SAP binds to amyloid fibrils, DNA, chromatin, fibronectin, C4-binding proteins and glycosaminoglycans. "Long" pentraxins have N-terminal extensions to the common pentraxin domain; one group, the neuronal pentraxins, may be involved in synapse formation and remodeling, and they may also be able to form heteromultimers." Q#6581 - CGI_10023108 superfamily 241611 180 316 6.61E-09 55.8576 cl00102 PTX superfamily - - "Pentraxins are plasma proteins characterized by their pentameric discoid assembly and their Ca2+ dependent ligand binding, such as Serum amyloid P component (SAP) and C-reactive Protein (CRP), which are cytokine-inducible acute-phase proteins implicated in innate immunity. CRP binds to ligands containing phosphocholine, SAP binds to amyloid fibrils, DNA, chromatin, fibronectin, C4-binding proteins and glycosaminoglycans. 
"Long" pentraxins have N-terminal extensions to the common pentraxin domain; one group, the neuronal pentraxins, may be involved in synapse formation and remodeling, and they may also be able to form heteromultimers." Q#6581 - CGI_10023108 superfamily 241611 790 929 1.18E-07 52.0056 cl00102 PTX superfamily - - "Pentraxins are plasma proteins characterized by their pentameric discoid assembly and their Ca2+ dependent ligand binding, such as Serum amyloid P component (SAP) and C-reactive Protein (CRP), which are cytokine-inducible acute-phase proteins implicated in innate immunity. CRP binds to ligands containing phosphocholine, SAP binds to amyloid fibrils, DNA, chromatin, fibronectin, C4-binding proteins and glycosaminoglycans. "Long" pentraxins have N-terminal extensions to the common pentraxin domain; one group, the neuronal pentraxins, may be involved in synapse formation and remodeling, and they may also be able to form heteromultimers." Q#6581 - CGI_10023108 superfamily 241611 367 515 5.24E-06 46.998 cl00102 PTX superfamily - - "Pentraxins are plasma proteins characterized by their pentameric discoid assembly and their Ca2+ dependent ligand binding, such as Serum amyloid P component (SAP) and C-reactive Protein (CRP), which are cytokine-inducible acute-phase proteins implicated in innate immunity. CRP binds to ligands containing phosphocholine, SAP binds to amyloid fibrils, DNA, chromatin, fibronectin, C4-binding proteins and glycosaminoglycans. "Long" pentraxins have N-terminal extensions to the common pentraxin domain; one group, the neuronal pentraxins, may be involved in synapse formation and remodeling, and they may also be able to form heteromultimers." Q#6582 - CGI_10023109 superfamily 247907 491 639 2.94E-21 93.6368 cl17353 LamG superfamily - - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. 
Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." Q#6582 - CGI_10023109 superfamily 247907 1110 1265 1.18E-20 92.096 cl17353 LamG superfamily - - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." Q#6582 - CGI_10023109 superfamily 247907 280 423 3.09E-20 90.9404 cl17353 LamG superfamily - - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." Q#6582 - CGI_10023109 superfamily 247907 1345 1482 5.19E-19 87.0884 cl17353 LamG superfamily - - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." 
Q#6582 - CGI_10023109 superfamily 247907 929 1074 1.26E-18 85.9328 cl17353 LamG superfamily - - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." Q#6582 - CGI_10023109 superfamily 247907 46 207 2.61E-16 79.3844 cl17353 LamG superfamily - - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." Q#6582 - CGI_10023109 superfamily 247907 714 859 2.57E-09 57.8133 cl17353 LamG superfamily - - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." 
Q#6582 - CGI_10023109 superfamily 243092 2100 2405 2.70E-30 123.984 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#6582 - CGI_10023109 superfamily 245213 1291 1314 0.00952456 36.6436 cl09941 EGF_CA superfamily C - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#6584 - CGI_10023111 superfamily 243091 673 789 4.38E-43 152.874 cl02566 SET superfamily - - "SET domain; SET domains are protein lysine methyltransferase enzymes. SET domains appear to be protein-protein interaction domains. It has been demonstrated that SET domains mediate interactions with a family of proteins that display similarity with dual-specificity phosphatases (dsPTPases). A subset of SET domains have been called PR domains. These domains are divergent in sequence from other SET domains, but also appear to mediate protein-protein interaction. The SET domain consists of two regions known as SET-N and SET-C. SET-C forms an unusual and conserved knot-like structure of probably functional importance. Additionally to SET-N and SET-C, an insert region (SET-I) and flanking regions of high structural variability form part of the overall structure." Q#6586 - CGI_10023113 superfamily 243034 21 120 4.20E-10 57.7752 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the
Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#6586 - CGI_10023113 superfamily 243034 326 430 3.90E-07 48.5304 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#6586 - CGI_10023113 superfamily 243034 404 478 0.00951496 35.0484 cl02429 TPR superfamily C - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved 
in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#6587 - CGI_10023114 superfamily 203913 4 137 3.18E-11 60.6937 cl07084 P4Ha_N superfamily - - "Prolyl 4-Hydroxylase alpha-subunit, N-terminal region; The members of this family are eukaryotic proteins, and include all three isoforms of the prolyl 4-hydroxylase alpha subunit. This enzyme (EC:1.14.11.2) is important in the post-translational modification of collagen, as it catalyzes the formation of 4-hydroxyproline. In vertebrates, the complete enzyme is an alpha2-beta2 tetramer; the beta-subunit is identical to protein disulphide isomerase. The function of the N-terminal region featured in this family does not seem to be known." Q#6588 - CGI_10023115 superfamily 241578 8 172 1.95E-22 91.199 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. 
The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#6589 - CGI_10023116 superfamily 241578 23 188 1.49E-31 115.467 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#6592 - CGI_10023119 superfamily 243146 319 365 6.84E-10 54.5898 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. 
" Q#6592 - CGI_10023119 superfamily 243146 281 330 1.30E-06 45.2419 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#6594 - CGI_10023121 superfamily 198867 285 384 1.95E-19 84.7004 cl06652 BACK superfamily - - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation)." Q#6594 - CGI_10023121 superfamily 243066 169 276 4.81E-16 74.9613 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#6594 - CGI_10023121 superfamily 243146 616 662 9.17E-10 55.3602 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#6594 - CGI_10023121 superfamily 243146 578 627 3.95E-06 44.8567 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. 
" Q#6599 - CGI_10023126 superfamily 242209 278 352 6.18E-46 153.152 cl00942 PCD_DCoH superfamily - - "PCD_DCoH: The bifunctional protein pterin-4alpha-carbinolamine dehydratase (PCD), also known as DCoH (dimerization cofactor of hepatocyte nuclear factor-1), is both a transcription activator and a metabolic enzyme. DCoH stimulates gene expression by associating with specific DNA binding proteins such as HNF-1alpha (hepatocyte nuclear factor-1) and Xenopus enhancer of rudimentary homologue (XERH). DCoH also catalyzes the dehydration of 4alpha- hydroxy- tetrahydrobiopterin (4alpha-OH-BH4) to quinoiddihydrobiopterin, a percursor of the phenylalanine hydroxylase cofactor BH4 (tetrahydrobiopterin). The DCoH homodimer has a saddle-shaped structure similar to that of TBP (TATA binding protein). Two DCoH proteins have been identifed in humans: DCoH1 and DCoH2. Mutations in human DCoH1 cause hyperphenylalaninemia. Loss of enzymic activity of DCoH in humans is associated with the depigmentation disorder vitiligo. DCoH1 has been reported to be overexpessed in colon cancer carcinomas and in malignant melanomas." Q#6599 - CGI_10023126 superfamily 243035 49 171 4.03E-27 103.854 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#6601 - CGI_10010566 superfamily 243058 272 307 0.00122199 37.294 cl02500 ARM superfamily NC - "Armadillo/beta-catenin-like repeats. 
An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#6602 - CGI_10010567 superfamily 243058 106 227 1.10E-32 122.038 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#6602 - CGI_10010567 superfamily 243058 340 451 1.11E-30 116.645 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. 
ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#6602 - CGI_10010567 superfamily 243058 191 325 1.07E-18 82.7475 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#6602 - CGI_10010567 superfamily 243058 470 580 4.00E-16 75.4287 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." 
Q#6602 - CGI_10010567 superfamily 201951 5 92 5.17E-14 68.5614 cl03339 IBB superfamily - - "Importin beta binding domain; This family consists of the importin alpha (karyopherin alpha), importin beta (karyopherin beta) binding domain. The domain mediates formation of the importin alpha beta complex; required for classical NLS import of proteins into the nucleus, through the nuclear pore complex and across the nuclear envelope. Also in the alignment is the NLS of importin alpha which overlaps with the IBB domain." Q#6603 - CGI_10010568 superfamily 247856 144 209 1.35E-07 48.6981 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#6603 - CGI_10010568 superfamily 150162 48 123 8.43E-22 89.1248 cl09646 FOP_dimer superfamily - - FOP N terminal dimerisation domain; Fibroblast growth factor receptor 1 (FGFR1) oncogene partner (FOP) is a centrosomal protein that is involved in anchoring microtubules to subcellular structures. This domain includes a Lis-homology motif. It forms an alpha helical bundle and is involved in dimerisation. Q#6604 - CGI_10010569 superfamily 243128 315 513 8.73E-13 67.3855 cl02652 MIF4G superfamily - - "MIF4G domain; MIF4G is named after Middle domain of eukaryotic initiation factor 4G (eIF4G). Also occurs in NMD2p and CBP80. The domain is rich in alpha-helices and may contain multiple alpha-helical repeats. In eIF4G, this domain binds eIF4A, eIF3, RNA and DNA." Q#6604 - CGI_10010569 superfamily 243129 609 694 3.93E-12 64.197 cl02653 MA3 superfamily C - "MA3 domain; Domain in DAP-5, eIF4G, MA-3 and other proteins. Highly alpha-helical. May contain repeats and/or regions similar to MIF4G domains." 
Q#6607 - CGI_10010572 superfamily 220187 12 92 1.42E-29 102.326 cl07829 NuA4 superfamily - - "Histone acetyltransferase subunit NuA4; The NuA4 histone acetyltransferase (HAT) multisubunit complex is responsible for acetylation of histone H4 and H2A N-terminal tails in yeast. NuA4 complexes are highly conserved in eukaryotes and play primary roles in transcription, cellular response to DNA damage, and cell cycle control." Q#6611 - CGI_10003687 superfamily 243066 52 105 3.12E-12 58.3333 cl02518 BTB superfamily C - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#6612 - CGI_10003688 superfamily 218440 104 251 8.07E-05 44.1421 cl14936 AF-4 superfamily NC - "AF-4 proto-oncoprotein; This family consists of AF4 (Proto-oncogene AF4) and FMR2 (Fragile X E mental retardation syndrome) nuclear proteins. These proteins have been linked to human diseases such as acute lymphoblastic leukaemia and mental retardation. The family also contains a Drosophila AF4 protein homologue Lilliputian which contains an AT-hook domain. 
Lilliputian represents a novel pair-rule gene that acts in cytoskeleton regulation, segmentation and morphogenesis in Drosophila." Q#6614 - CGI_10003690 superfamily 242274 43 140 0.00253483 35.5649 cl01053 SGNH_hydrolase superfamily C - "SGNH_hydrolase, or GDSL_hydrolase, is a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the typical Ser-His-Asp(Glu) triad from other serine hydrolases, but may lack the carboxlic acid." Q#6617 - CGI_10003511 superfamily 243091 75 170 4.13E-08 52.1092 cl02566 SET superfamily - - "SET domain; SET domains are protein lysine methyltransferase enzymes. SET domains appear to be protein-protein interaction domains. It has been demonstrated that SET domains mediate interactions with a family of proteins that display similarity with dual-specificity phosphatases (dsPTPases). A subset of SET domains have been called PR domains. These domains are divergent in sequence from other SET domains, but also appear to mediate protein-protein interaction. The SET domain consists of two regions known as SET-N and SET-C. SET-C forms an unusual and conserved knot-like structure of probably functional importance. Additionally to SET-N and SET-C, an insert region (SET-I) and flanking regions of high structural variability form part of the overall structure." Q#6618 - CGI_10003512 superfamily 220722 74 196 1.07E-20 89.0157 cl11040 EST1 superfamily - - Telomerase activating protein Est1; Est1 is a protein which recruits or activates telomerase at the site of polymerisation. Q#6620 - CGI_10012045 superfamily 248097 78 195 7.98E-22 86.9354 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. 
Q#6621 - CGI_10012046 superfamily 248097 46 114 5.32E-16 71.1422 cl17543 C1q superfamily C - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#6622 - CGI_10012047 superfamily 248097 15 78 0.00913704 31.0814 cl17543 C1q superfamily N - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#6623 - CGI_10012048 superfamily 248458 52 240 2.72E-13 70.0353 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. 
Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#6623 - CGI_10012048 superfamily 248458 306 515 0.000498387 41.1453 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." 
Q#6625 - CGI_10012050 superfamily 247792 33 76 5.16E-10 56.6852 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#6626 - CGI_10012051 superfamily 213389 192 375 6.46E-40 142.043 cl17092 STING_C superfamily - - "C-terminal domain of STING; STING (stimulator of interferon genes, also known as MITA, ERIS, MPYS and TMEM173) is a master regulator that mediates cytokine production in response to microbial invasion by directly sensing bacterial secondary messengers such as the cyclic dinucleotide bis-(3'-5')-cyclic dimeric GMP (c-di-GMP) and leading to the activation of IFN regulatory factor 3 (IRF3) through TANK-binding kinase 1 (TBK1) stimulation. STING is also a signaling adaptor in the IFN response to cytosolic DNA. This detection of foreign materials is the first step to a successful immune responses. STING is localized in the ER and comprised of an predicted N-terminal transmembrane region and a C-terminal c-di-GMP binding domain." Q#6626 - CGI_10012051 superfamily 248012 50 159 2.26E-15 71.4548 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. 
Q#6628 - CGI_10012053 superfamily 242311 217 260 0.00207611 35.9364 cl01115 BMFP superfamily N - "Membrane fusogenic activity; BMFP consists of two structural domains, a coiled-coil C-terminal domain via which the protein self-associates as a trimer, and an N-terminal domain disordered at neutral pH but adopting an amphipathic alpha-helical structure in the presence of phospholipid vesicles, high ionic strength, acidic pH or SDS. BMFP interacts with phospholipid vesicles though the predicted amphipathic alpha-helix induced in the N-terminal half of the protein and promotes aggregation and fusion of vesicles in vitro." Q#6629 - CGI_10012054 superfamily 222150 139 162 0.00213089 35.4453 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#6629 - CGI_10012054 superfamily 222150 54 76 0.00278755 35.4453 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#6629 - CGI_10012054 superfamily 222150 224 248 0.00754503 33.9045 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#6630 - CGI_10012055 superfamily 244819 6 60 2.98E-07 44.0432 cl07874 zf-AD superfamily C - "Zinc-finger associated domain (zf-AD); The zf-AD domain, also known as ZAD, forms an atypical treble-cleft-like zinc co-ordinating fold. The zf-AD domain is thought to be involved in mediating dimer formation, but does not bind to DNA." Q#6634 - CGI_10012059 superfamily 217473 65 314 1.08E-26 106.68 cl03978 Mab-21 superfamily - - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#6636 - CGI_10012061 superfamily 247805 86 226 9.10E-29 112.816 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 
Q#6636 - CGI_10012061 superfamily 243778 469 559 2.24E-39 140.822 cl04503 HA2 superfamily - - "Helicase associated domain (HA2); This presumed domain is about 90 amino acid residues in length. It is found is a diverse set of RNA helicases. Its function is unknown, however it seems likely to be involved in nucleic acid binding." Q#6636 - CGI_10012061 superfamily 219532 593 696 2.97E-34 127.045 cl06657 OB_NTP_bind superfamily - - "Oligonucleotide/oligosaccharide-binding (OB)-fold; This family is found towards the C-terminus of the DEAD-box helicases (pfam00270). In these helicases it is apparently always found in association with pfam04408. There do seem to be a couple of instances where it occurs by itself - . The structure PDB:3i4u adopts an OB-fold. helicases (pfam00270). In these helicases it is apparently always found in association with pfam04408. This C-terminal domain of the yeast helicase contains an oligonucleotide/oligosaccharide-binding (OB)-fold which seems to be placed at the entrance of the putative nucleic acid cavity. It also constitutes the binding site for the G-patch-containing domain of Pfa1p. When found on DEAH/RHA helicases, this domain is central to the regulation of the helicase activity through its binding of both RNA and G-patch domain proteins." 
Q#6636 - CGI_10012061 superfamily 247905 313 372 2.25E-06 46.0509 cl17351 HELICc superfamily C - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#6637 - CGI_10012062 superfamily 247805 100 288 2.69E-47 164.195 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#6637 - CGI_10012062 superfamily 247805 8 105 8.06E-34 126.445 cl17251 DEXDc superfamily C - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 
Q#6637 - CGI_10012062 superfamily 247905 299 366 0.00554633 36.0617 cl17351 HELICc superfamily C - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#6639 - CGI_10012064 superfamily 192535 75 240 1.71E-06 48.361 cl18179 7TM_GPCR_Srsx superfamily C - Serpentine type 7TM GPCR chemoreceptor Srsx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srsx is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#6640 - CGI_10012065 superfamily 192535 53 185 0.000285137 42.1978 cl18179 7TM_GPCR_Srsx superfamily C - Serpentine type 7TM GPCR chemoreceptor Srsx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srsx is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#6641 - CGI_10012066 superfamily 216739 450 483 0.00204611 36.6418 cl03383 PC_rep superfamily - - Proteasome/cyclosome repeat; Proteasome/cyclosome repeat. 
Q#6641 - CGI_10012066 superfamily 216739 503 536 0.00232773 36.6418 cl03383 PC_rep superfamily - - Proteasome/cyclosome repeat; Proteasome/cyclosome repeat. Q#6642 - CGI_10001682 superfamily 247739 6 167 1.21E-72 219.397 cl17185 LPLAT superfamily - - "Lysophospholipid acyltransferases (LPLATs) of glycerophospholipid biosynthesis; Lysophospholipid acyltransferase (LPLAT) superfamily members are acyltransferases of de novo and remodeling pathways of glycerophospholipid biosynthesis. These proteins catalyze the incorporation of an acyl group from either acylCoAs or acyl-acyl carrier proteins (acylACPs) into acceptors such as glycerol 3-phosphate, dihydroxyacetone phosphate or lyso-phosphatidic acid. Included in this superfamily are LPLATs such as glycerol-3-phosphate 1-acyltransferase (GPAT, PlsB), 1-acyl-sn-glycerol-3-phosphate acyltransferase (AGPAT, PlsC), lysophosphatidylcholine acyltransferase 1 (LPCAT-1), lysophosphatidylethanolamine acyltransferase (LPEAT, also known as, MBOAT2, membrane-bound O-acyltransferase domain-containing protein 2), lipid A biosynthesis lauroyl/myristoyl acyltransferase, 2-acylglycerol O-acyltransferase (MGAT), dihydroxyacetone phosphate acyltransferase (DHAPAT, also known as 1 glycerol-3-phosphate O-acyltransferase 1) and Tafazzin (the protein product of the Barth syndrome (TAZ) gene)." Q#6643 - CGI_10001685 superfamily 219318 110 280 5.29E-62 204.341 cl06271 PhaC_N superfamily - - "Poly-beta-hydroxybutyrate polymerase (PhaC) N-terminus; This family represents the N-terminal region of the bacterial poly-beta-hydroxybutyrate polymerase (PhaC). Polyhydroxyalkanoic acids (PHAs) are carbon and energy reserve polymers produced in some bacteria when carbon sources are plentiful and another nutrient, such as nitrogen, phosphate, oxygen, or sulfur, becomes limiting. PHAs composed of monomeric units ranging from 3 to 14 carbons exist in nature. When the carbon source is exhausted, PHA is utilised by the bacterium. 
PhaC links D-(-)-3-hydroxybutyryl-CoA to an existing PHA molecule by the formation of an ester bond. This family appears to be a partial segment of an alpha/beta hydrolase domain." Q#6644 - CGI_10001688 superfamily 245815 7 472 0 754.724 cl11961 ALDH-SF superfamily - - "NAD(P)+-dependent aldehyde dehydrogenase superfamily; The aldehyde dehydrogenase superfamily (ALDH-SF) of NAD(P)+-dependent enzymes, in general, oxidize a wide range of endogenous and exogenous aliphatic and aromatic aldehydes to their corresponding carboxylic acids and play an important role in detoxification. Besides aldehyde detoxification, many ALDH isozymes possess multiple additional catalytic and non-catalytic functions such as participating in metabolic pathways, or as binding proteins, or osmoregulants, to mention a few. The enzyme has three domains, a NAD(P)+ cofactor-binding domain, a catalytic domain, and a bridging domain; and the active enzyme is generally either homodimeric or homotetrameric. The catalytic mechanism is proposed to involve cofactor binding, resulting in a conformational change and activation of an invariant catalytic cysteine nucleophile. The cysteine and aldehyde substrate form an oxyanion thiohemiacetal intermediate resulting in hydride transfer to the cofactor and formation of a thioacylenzyme intermediate. Hydrolysis of the thioacylenzyme and release of the carboxylic acid product occurs, and in most cases, the reduced cofactor dissociates from the enzyme. The evolutionary phylogenetic tree of ALDHs appears to have an initial bifurcation between what has been characterized as the classical aldehyde dehydrogenases, the ALDH family (ALDH) and extended family members or aldehyde dehydrogenase-like (ALDH-L) proteins. 
The ALDH proteins are represented by enzymes which share a number of highly conserved residues necessary for catalysis and cofactor binding and they include such proteins as retinal dehydrogenase, 10-formyltetrahydrofolate dehydrogenase, non-phosphorylating glyceraldehyde 3-phosphate dehydrogenase, delta(1)-pyrroline-5-carboxylate dehydrogenases, alpha-ketoglutaric semialdehyde dehydrogenase, alpha-aminoadipic semialdehyde dehydrogenase, coniferyl aldehyde dehydrogenase and succinate-semialdehyde dehydrogenase. Included in this larger group are all human, Arabidopsis, Tortula, fungal, protozoan, and Drosophila ALDHs identified in families ALDH1 through ALDH22 with the exception of families ALDH18, ALDH19, and ALDH20 which are present in the ALDH-like group. The ALDH-like group is represented by such proteins as gamma-glutamyl phosphate reductase, LuxC-like acyl-CoA reductase, and coenzyme A acylating aldehyde dehydrogenase. All of these proteins have a conserved cysteine that aligns with the catalytic cysteine of the ALDH group." Q#6645 - CGI_10001690 superfamily 208802 1 229 1.14E-113 331.771 cl07974 DRE_TIM_metallolyase superfamily - - "DRE-TIM metallolyase superfamily; The DRE-TIM metallolyase superfamily includes 2-isopropylmalate synthase (IPMS), alpha-isopropylmalate synthase (LeuA), 3-hydroxy-3-methylglutaryl-CoA lyase, homocitrate synthase, citramalate synthase, 4-hydroxy-2-oxovalerate aldolase, re-citrate synthase, transcarboxylase 5S, pyruvate carboxylase, AksA, and FrbC. These members all share a conserved triose-phosphate isomerase (TIM) barrel domain consisting of a core beta(8)-alpha(8) motif with the eight parallel beta strands forming an enclosed barrel surrounded by eight alpha helices. The domain has a catalytic center containing a divalent cation-binding site formed by a cluster of invariant residues that cap the core of the barrel. 
In addition, the catalytic site includes three invariant residues - an aspartate (D), an arginine (R), and a glutamate (E) - which is the basis for the domain name "DRE-TIM"." Q#6645 - CGI_10001690 superfamily 149094 232 296 8.89E-28 103.41 cl06739 DmpG_comm superfamily - - "DmpG-like communication domain; This domain is found towards the C-terminal region of various aldolase enzymes. It consists of five alpha-helices, four of which form an antiparallel helical bundle that plugs the C-terminus of the N-terminal TIM barrel domain. The communication domain is thought to play an important role in the heterodimerisation of the enzyme." Q#6646 - CGI_10001691 superfamily 220165 124 268 4.32E-76 231.279 cl07796 AcetDehyd-dimer superfamily - - "Prokaryotic acetaldehyde dehydrogenase, dimerisation; Members of this family are found in prokaryotic acetaldehyde dehydrogenase (acylating), and adopt a structure consisting of an alpha-beta-alpha-beta(3) core. They mediate dimerisation of the protein." Q#6646 - CGI_10001691 superfamily 214863 3 116 9.44E-13 63.3363 cl18317 Semialdhyde_dh superfamily - - "Semialdehyde dehydrogenase, NAD binding domain; The semialdehyde dehydrogenase family is found in N-acetyl-glutamine semialdehyde dehydrogenase (AgrC), which is involved in arginine biosynthesis, and aspartate-semialdehyde dehydrogenase, an enzyme involved in the biosynthesis of various amino acids from aspartate. This family is also found in yeast and fungal Arg5,6 protein, which is cleaved into the enzymes N-acety-gamma-glutamyl-phosphate reductase and acetylglutamate kinase. These are also involved in arginine biosynthesis. All proteins in this entry contain a NAD binding region of semialdehyde dehydrogenase." 
Q#6647 - CGI_10001692 superfamily 245608 3 257 2.89E-101 298.66 cl11421 FAA_hydrolase superfamily - - "Fumarylacetoacetate (FAA) hydrolase family; This family consists of fumarylacetoacetate (FAA) hydrolase, or fumarylacetoacetate hydrolase (FAH) and it also includes HHDD isomerase/OPET decarboxylase from E. coli strain W. FAA is the last enzyme in the tyrosine catabolic pathway, it hydrolyses fumarylacetoacetate into fumarate and acetoacetate which then join the citric acid cycle. Mutations in FAA cause type I tyrosinemia in humans this is an inherited disorder mainly affecting the liver leading to liver cirrhosis, hepatocellular carcinoma, renal tubular damages and neurologic crises amongst other symptoms. The enzymatic defect causes the toxic accumulation of phenylalanine/tyrosine catabolites. The E. coli W enzyme HHDD isomerase/OPET decarboxylase contains two copies of this domain and functions in fourth and fifth steps of the homoprotocatechuate pathway; here it decarboxylates OPET to HHDD and isomerises this to OHED. The final products of this pathway are pyruvic acid and succinic semialdehyde. This family also includes various hydratases and 4-oxalocrotonate decarboxylases which are involved in the bacterial meta-cleavage pathways for degradation of aromatic compounds. 2-hydroxypentadienoic acid hydratase, encoded by mhpD in E. coli, is involved in the phenylpropionic acid pathway of E. coli and catalyzes the conversion of 2-hydroxy pentadienoate to 4-hydroxy-2-keto-pentanoate and uses a Mn2+ co-factor. OHED hydratase encoded by hpcG in E. coli is involved in the homoprotocatechuic acid (HPC) catabolism. XylI in P. putida is a 4-Oxalocrotonate decarboxylase." Q#6648 - CGI_10001696 superfamily 216434 19 347 1.84E-116 344.061 cl08318 PPDK_N superfamily - - "Pyruvate phosphate dikinase, PEP/pyruvate binding domain; This enzyme catalyzes the reversible conversion of ATP to AMP, pyrophosphate and phosphoenolpyruvate (PEP)." 
Q#6649 - CGI_10003564 superfamily 241763 115 328 3.04E-117 339.985 cl00298 Peptidase_C1 superfamily - - "C1 Peptidase family (MEROPS database nomenclature), also referred to as the papain family; composed of two subfamilies of cysteine peptidases (CPs), C1A (papain) and C1B (bleomycin hydrolase). Papain-like enzymes are mostly endopeptidases with some exceptions like cathepsins B, C, H and X, which are exopeptidases. Papain-like CPs have different functions in various organisms. Plant CPs are used to mobilize storage proteins in seeds while mammalian CPs are primarily lysosomal enzymes responsible for protein degradation in the lysosome. Papain-like CPs are synthesized as inactive proenzymes with N-terminal propeptide regions, which are removed upon activation. Bleomycin hydrolase (BH) is a CP that detoxifies bleomycin by hydrolysis of an amide group. It acts as a carboxypeptidase on its C-terminus to convert itself into an aminopeptidase and peptide ligase. BH is found in all tissues in mammals as well as in many other eukaryotes. It forms a hexameric ring barrel structure with the active sites imbedded in the central channel. Some members of the C1 family are proteins classified as non-peptidase homologs which lack peptidase activity or have missing active site residues." Q#6649 - CGI_10003564 superfamily 244586 27 85 2.63E-17 74.5874 cl07031 Inhibitor_I29 superfamily - - Cathepsin propeptide inhibitor domain (I29); This domain is found at the N-terminus of some C1 peptidases such as Cathepsin L where it acts as a propeptide. There are also a number of proteins that are composed solely of multiple copies of this domain such as the peptidase inhibitor salarin. This family is classified as I29 by MEROPS. 
Q#6650 - CGI_10003565 superfamily 241763 75 287 4.28E-99 292.22 cl00298 Peptidase_C1 superfamily - - "C1 Peptidase family (MEROPS database nomenclature), also referred to as the papain family; composed of two subfamilies of cysteine peptidases (CPs), C1A (papain) and C1B (bleomycin hydrolase). Papain-like enzymes are mostly endopeptidases with some exceptions like cathepsins B, C, H and X, which are exopeptidases. Papain-like CPs have different functions in various organisms. Plant CPs are used to mobilize storage proteins in seeds while mammalian CPs are primarily lysosomal enzymes responsible for protein degradation in the lysosome. Papain-like CPs are synthesized as inactive proenzymes with N-terminal propeptide regions, which are removed upon activation. Bleomycin hydrolase (BH) is a CP that detoxifies bleomycin by hydrolysis of an amide group. It acts as a carboxypeptidase on its C-terminus to convert itself into an aminopeptidase and peptide ligase. BH is found in all tissues in mammals as well as in many other eukaryotes. It forms a hexameric ring barrel structure with the active sites imbedded in the central channel. Some members of the C1 family are proteins classified as non-peptidase homologs which lack peptidase activity or have missing active site residues." Q#6650 - CGI_10003565 superfamily 244586 3 45 1.84E-13 63.4167 cl07031 Inhibitor_I29 superfamily N - Cathepsin propeptide inhibitor domain (I29); This domain is found at the N-terminus of some C1 peptidases such as Cathepsin L where it acts as a propeptide. There are also a number of proteins that are composed solely of multiple copies of this domain such as the peptidase inhibitor salarin. This family is classified as I29 by MEROPS. 
Q#6651 - CGI_10003566 superfamily 241763 25 238 3.36E-115 331.126 cl00298 Peptidase_C1 superfamily - - "C1 Peptidase family (MEROPS database nomenclature), also referred to as the papain family; composed of two subfamilies of cysteine peptidases (CPs), C1A (papain) and C1B (bleomycin hydrolase). Papain-like enzymes are mostly endopeptidases with some exceptions like cathepsins B, C, H and X, which are exopeptidases. Papain-like CPs have different functions in various organisms. Plant CPs are used to mobilize storage proteins in seeds while mammalian CPs are primarily lysosomal enzymes responsible for protein degradation in the lysosome. Papain-like CPs are synthesized as inactive proenzymes with N-terminal propeptide regions, which are removed upon activation. Bleomycin hydrolase (BH) is a CP that detoxifies bleomycin by hydrolysis of an amide group. It acts as a carboxypeptidase on its C-terminus to convert itself into an aminopeptidase and peptide ligase. BH is found in all tissues in mammals as well as in many other eukaryotes. It forms a hexameric ring barrel structure with the active sites imbedded in the central channel. Some members of the C1 family are proteins classified as non-peptidase homologs which lack peptidase activity or have missing active site residues." Q#6652 - CGI_10003567 superfamily 248012 68 165 2.94E-21 86.1732 cl17458 TIR_2 superfamily C - TIR domain; This is a family of bacterial Toll-like receptors. Q#6653 - CGI_10003568 superfamily 241763 11 103 6.09E-15 66.4934 cl00298 Peptidase_C1 superfamily N - "C1 Peptidase family (MEROPS database nomenclature), also referred to as the papain family; composed of two subfamilies of cysteine peptidases (CPs), C1A (papain) and C1B (bleomycin hydrolase). Papain-like enzymes are mostly endopeptidases with some exceptions like cathepsins B, C, H and X, which are exopeptidases. Papain-like CPs have different functions in various organisms. 
Plant CPs are used to mobilize storage proteins in seeds while mammalian CPs are primarily lysosomal enzymes responsible for protein degradation in the lysosome. Papain-like CPs are synthesized as inactive proenzymes with N-terminal propeptide regions, which are removed upon activation. Bleomycin hydrolase (BH) is a CP that detoxifies bleomycin by hydrolysis of an amide group. It acts as a carboxypeptidase on its C-terminus to convert itself into an aminopeptidase and peptide ligase. BH is found in all tissues in mammals as well as in many other eukaryotes. It forms a hexameric ring barrel structure with the active sites imbedded in the central channel. Some members of the C1 family are proteins classified as non-peptidase homologs which lack peptidase activity or have missing active site residues." Q#6654 - CGI_10003569 superfamily 248012 26 169 1.31E-21 86.2232 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#6657 - CGI_10001572 superfamily 245213 374 413 2.38E-08 50.713 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#6657 - CGI_10001572 superfamily 245213 290 325 8.99E-06 43.3942 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#6657 - CGI_10001572 superfamily 245213 423 458 1.19E-05 43.009 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#6657 - CGI_10001572 superfamily 245213 237 289 0.00107063 37.231 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#6657 - CGI_10001572 superfamily 205157 199 233 1.02E-06 45.9915 cl18264 EGF_3 superfamily - - EGF domain; This family includes a variety of EGF-like domain homologues. This family includes the C-terminal domain of the malaria parasite MSP1 protein. 
Q#6657 - CGI_10001572 superfamily 245213 332 370 7.24E-05 40.7952 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#6658 - CGI_10001573 superfamily 243100 301 362 2.75E-07 47.1736 cl02576 B_zip1 superfamily - - "basic leucine zipper DNA-binding and multimerization region of GCN4 and related proteins; Basic leucine zipper (bZIP) transcription factors act in networks of homo- and hetero-dimers in the regulation in a diverse set of cellular pathways. Classical leucine zippers have alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization creates a pair of basic regions that bind DNA and undergo conformational change. GCN4 was identified in Saccharomyces cerevisiae from mutations in a deficiency in activation with the general amino acid control pathway. GCN4 encodes a trans-activator of amino acid biosynthetic genes containing 2 acidic activation domains and a C-terminal bZIP domain, comprised of a basic alpha-helical DNA-binding region and a coiled-coil dimerization region." Q#6660 - CGI_10004791 superfamily 243035 5 70 8.86E-18 71.9435 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. 
This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. 
In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#6662 - CGI_10008563 superfamily 110440 234 260 0.007974 34.6909 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#6662 - CGI_10008563 superfamily 110440 542 568 0.007974 34.6909 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#6666 - CGI_10008567 superfamily 245230 1 194 4.05E-139 398.958 cl10017 Tubulin_FtsZ superfamily N - "Tubulin/FtsZ: Family includes tubulin alpha-, beta-, gamma-, delta-, and epsilon-tubulins as well as FtsZ, all of which are involved in polymer formation. 
Tubulin is the major component of microtubules, but also exists as a heterodimer and as a curved oligomer. Microtubules exist in all eukaryotic cells and are responsible for many functions, including cellular transport, cell motility, and mitosis. FtsZ forms a ring-shaped septum at the site of bacterial cell division, which is required for constriction of cell membrane and cell envelope to yield two daughter cells. FtsZ can polymerize into tubes, sheets, and rings in vitro and is ubiquitous in eubacteria, archaea, and chloroplasts." Q#6669 - CGI_10008570 superfamily 245716 51 72 6.01E-05 42.1603 cl11592 zf-CCCH superfamily - - Zinc finger C-x8-C-x5-C-x3-H type (and similar); Zinc finger C-x8-C-x5-C-x3-H type (and similar). Q#6670 - CGI_10008571 superfamily 247856 110 156 0.000182264 38.6829 cl17302 EFh superfamily N - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#6671 - CGI_10008572 superfamily 242370 1 116 1.47E-30 110.475 cl01218 CutC superfamily N - "CutC family; Copper transport in Escherichia coli is mediated by the products of at least six genes, cutA, cutB, cutC, cutD, cutE, and cutF. A mutation in one or more of these genes results in an increased copper sensitivity. Members of this family are between 200 and 300 amino acids in length are found in both eukaryotes and bacteria." Q#6674 - CGI_10016469 superfamily 245206 5 245 3.18E-112 328.353 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. 
NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#6676 - CGI_10016471 superfamily 243034 4 103 5.72E-23 91.6727 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#6676 - CGI_10016471 
superfamily 218373 286 367 1.86E-28 106.336 cl04882 SGS superfamily - - "SGS domain; This domain was thought to be unique to the SGT1-like proteins, but is also found in calcyclin binding proteins." Q#6676 - CGI_10016471 superfamily 241659 173 256 3.15E-26 100.145 cl00175 alpha-crystallin-Hsps_p23-like superfamily - - "alpha-crystallin domain (ACD) found in alpha-crystallin-type small heat shock proteins, and a similar domain found in p23 (a cochaperone for Hsp90) and in other p23-like proteins.; The alpha-crystallin-Hsps_p23-like superfamily includes the alpha-crystallin domain (ACD) of alpha-crystallin-type small heat shock proteins (sHsps) and a similar domain found in p23-like proteins. sHsps are small stress induced proteins with monomeric masses between 12-43 kDa, whose common feature is this ACD. sHsps are generally active as large oligomers consisting of multiple subunits, and are believed to be ATP-independent chaperones that prevent aggregation and are important in refolding in combination with other Hsps. p23 is a cochaperone of the Hsp90 chaperoning pathway. It binds Hsp90 and participates in the folding of a number of Hsp90 clients including the progesterone receptor. p23 also has a passive chaperoning activity. p23 in addition may act as the cytosolic prostaglandin E2 synthase. Included in this superfamily is the p23-like C-terminal CHORD-SGT1 (CS) domain of suppressor of G2 allele of Skp1 (Sgt1) and the p23-like domains of human butyrate-induced transcript 1 (hB-ind1), NUD (nuclear distribution) C, Melusin, and NAD(P)H cytochrome b5 (NCB5) oxidoreductase (OR)." Q#6678 - CGI_10016473 superfamily 243212 166 293 6.05E-19 82.0065 cl02844 Arrestin_C superfamily - - "Arrestin (or S-antigen), C-terminal domain; Ig-like beta-sandwich fold. Scop reports duplication with N-terminal domain." Q#6678 - CGI_10016473 superfamily 215866 22 144 1.64E-16 75.4395 cl18349 Arrestin_N superfamily - - "Arrestin (or S-antigen), N-terminal domain; Ig-like beta-sandwich fold. 
Scop reports duplication with C-terminal domain." Q#6681 - CGI_10016476 superfamily 241984 168 404 1.05E-53 179.76 cl00615 Membrane-FADS-like superfamily - - "The membrane fatty acid desaturase (Membrane_FADS)-like CD includes membrane FADSs, alkane hydroxylases, beta carotene ketolases (CrtW-like), hydroxylases (CrtR-like), and other related proteins. They are present in all groups of organisms with the exception of archaea. Membrane FADSs are non-heme, iron-containing, oxygen-dependent enzymes involved in regioselective introduction of double bonds in fatty acyl aliphatic chains. They play an important role in the maintenance of the proper structure and functioning of biological membranes. Alkane hydroxylases are bacterial, integral-membrane di-iron enzymes that share a requirement for iron and oxygen for activity similar to that of membrane FADSs, and are involved in the initial oxidation of inactivated alkanes. Beta-carotene ketolase and beta-carotene hydroxylase are carotenoid biosynthetic enzymes for astaxanthin and zeaxanthin, respectively. This superfamily domain has extensive hydrophobic regions that would be capable of spanning the membrane bilayer at least twice. Comparison of these sequences also reveals three regions of conserved histidine cluster motifs that contain eight histidine residues: HXXX(X)H, HXX(X)HH, and HXXHH (an additional conserved histidine residue is seen between clusters 2 and 3). Spectroscopic and genetic evidence point to a nitrogen-rich coordination environment located in the cytoplasm with as many as eight histidines coordinating the two iron ions and a carboxylate residue bridging the two metals in the Pseudomonas oleovorans alkane hydroxylase (AlkB). In addition, the eight histidine residues are reported to be catalytically essential and proposed to be the ligands for the iron atoms contained within the rat stearoyl CoA delta-9 desaturase." 
Q#6681 - CGI_10016476 superfamily 242849 18 91 1.71E-24 96.1188 cl02041 Cyt-b5 superfamily - - Cytochrome b5-like Heme/Steroid binding domain; This family includes heme binding domains from a diverse range of proteins. This family also includes proteins that bind to steroids. The family includes progesterone receptors. Many members of this subfamily are membrane anchored by an N-terminal transmembrane alpha helix. This family also includes a domain in some chitin synthases. There is no known ligand for this domain in the chitin synthases. Q#6682 - CGI_10016478 superfamily 219525 137 193 3.19E-06 43.947 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#6682 - CGI_10016478 superfamily 219525 200 255 0.000443256 37.7838 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#6683 - CGI_10016480 superfamily 217293 20 219 4.22E-30 115.423 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#6683 - CGI_10016480 superfamily 202474 239 309 3.73E-13 67.2937 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#6687 - CGI_10016484 superfamily 241592 23 42 1.87E-08 46.1726 cl00074 H2A superfamily C - "Histone 2A; H2A is a subunit of the nucleosome. The nucleosome is an octamer containing two H2A, H2B, H3, and H4 subunits. The H2A subunit performs essential roles in maintaining structural integrity of the nucleosome, chromatin condensation, and binding of specific chromatin-associated proteins." Q#6688 - CGI_10016485 superfamily 241782 92 482 4.54E-53 184.082 cl00321 AAT_I superfamily - - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. 
PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phosphorylase family (fold type V)." Q#6689 - CGI_10016486 superfamily 241644 95 170 5.67E-11 57.9972 cl00154 UBCc superfamily C - "Ubiquitin-conjugating enzyme E2, catalytic (UBCc) domain. This is part of the ubiquitin-mediated protein degradation pathway in which a thiol-ester linkage forms between a conserved cysteine and the C-terminus of ubiquitin and complexes with ubiquitin protein ligase enzymes, E3. This pathway regulates many fundamental cellular processes. There are also other E2s which form thiol-ester linkages without the use of E3s as well as several UBC homologs (TSG101, Mms2, Croc-1 and similar proteins) which lack the active site cysteine essential for ubiquitination and appear to function in DNA repair pathways which were omitted from the scope of this CD."
Q#6690 - CGI_10016487 superfamily 208843 424 583 7.64E-75 245.885 cl08275 RHD-n superfamily - - "N-terminal sub-domain of the Rel homology domain (RHD); Proteins containing the Rel homology domain (RHD) are metazoan transcription factors. The RHD is composed of two structural sub-domains; this model characterizes the N-terminal sub-domain, which may be distantly related to the DNA-binding domain found in P53. The C-terminal sub-domain has an immunoglobulin-like fold and serves as a dimerization module that also binds DNA (see cd00102). The RHD is found in NF-kappa B, nuclear factor of activated T-cells (NFAT), the tonicity-responsive enhancer binding protein (TonEBP), and the arthropod proteins Dorsal and Relish (Rel)." Q#6690 - CGI_10016487 superfamily 247038 588 688 8.62E-29 112.963 cl15674 IPT superfamily - - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated T-cells (NFAT) transcription factors which are mainly monomers." Q#6691 - CGI_10016488 superfamily 241868 42 192 5.79E-50 162.696 cl00447 Nudix_Hydrolase superfamily - - "Nudix hydrolase is a superfamily of enzymes found in all three kingdoms of life, and it catalyzes the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+ for their activity. 
Members of this family are recognized by a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which forms a structural motif that functions as a metal binding and catalytic site. Substrates of nudix hydrolase include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance and "house-cleaning" enzymes. Substrate specificity is used to define child families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. This superfamily consists of at least nine families: IPP (isopentenyl diphosphate) isomerase, ADP ribose pyrophosphatase, mutT pyrophosphohydrolase, coenzyme-A pyrophosphatase, MTH1-7,8-dihydro-8-oxoguanine-triphosphatase, diadenosine tetraphosphate hydrolase, NADH pyrophosphatase, GDP-mannose hydrolase and the c-terminal portion of the mutY adenine glycosylase." Q#6692 - CGI_10016489 superfamily 245055 47 144 2.11E-18 85.3169 cl09326 MATE_like superfamily N - "Multidrug and toxic compound extrusion family and similar proteins; The integral membrane proteins from the MATE family are involved in exporting metabolites across the cell membrane and are responsible for multidrug resistance (MDR) in many bacteria and animals. MATE has also been identified as a large multigene family in plants, where the proteins are linked to disease resistance. 
A number of family members are involved in the synthesis of peptidoglycan components in bacteria." Q#6703 - CGI_10008051 superfamily 241572 55 144 2.33E-14 66.4932 cl00050 CYCLIN superfamily - - "Cyclin box fold. Protein binding domain functioning in cell-cycle and transcription control. Present in cyclins, TFIIB and Retinoblastoma (RB). The cyclins consist of 8 classes of cell cycle regulators that regulate cyclin dependent kinases (CDKs). TFIIB is a transcription factor that binds the TATA box. Cyclins, TFIIB and RB contain 2 copies of the domain." Q#6703 - CGI_10008051 superfamily 241572 154 250 1.94E-13 64.5714 cl00050 CYCLIN superfamily C - "Cyclin box fold. Protein binding domain functioning in cell-cycle and transcription control. Present in cyclins, TFIIB and Retinoblastoma (RB). The cyclins consist of 8 classes of cell cycle regulators that regulate cyclin dependent kinases (CDKs). TFIIB is a transcription factor that binds the TATA box. Cyclins, TFIIB and RB contain 2 copies of the domain." Q#6705 - CGI_10008053 superfamily 222150 873 895 6.48E-06 44.6901 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#6705 - CGI_10008053 superfamily 246975 860 881 0.000635261 38.8673 cl15478 zf-C2H2 superfamily - - "Zinc finger, C2H2 type; The C2H2 zinc finger is the classical zinc finger domain. The two conserved cysteines and histidines co-ordinate a zinc ion. The following pattern describes the zinc finger. #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C] Where X can be any amino acid, and numbers in brackets indicate the number of residues. The positions marked # are those that are important for the stable fold of the zinc finger. The final position can be either his or cys. The C2H2 zinc finger is composed of two short beta strands followed by an alpha helix. The amino terminal part of the helix binds the major groove in DNA binding zinc fingers. 
The accepted consensus binding sequence for Sp1 is usually defined by the asymmetric hexanucleotide core GGGCGG but this sequence does not include, among others, the GAG (=CTC) repeat that constitutes a high-affinity site for Sp1 binding to the wt1 promoter." Q#6706 - CGI_10008054 superfamily 217293 16 218 1.87E-26 105.022 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#6706 - CGI_10008054 superfamily 202474 225 310 1.92E-07 49.9597 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#6709 - CGI_10010990 superfamily 247725 14 103 2.82E-44 149.241 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting proteins to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and other proteins." Q#6712 - CGI_10010993 superfamily 245226 52 202 1.19E-23 94.6748 cl10012 DnaQ_like_exo superfamily - - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. 
The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#6713 - CGI_10010994 superfamily 193256 2883 3150 2.86E-63 220.975 cl18189 AAA_8 superfamily - - "P-loop containing dynein motor region D4; The 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This particular family is the D4 ATP-binding region of the motor." Q#6713 - CGI_10010994 superfamily 193251 2530 2793 6.02E-47 173.584 cl18188 AAA_7 superfamily - - "P-loop containing dynein motor region D3; the 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. 
This particular family is the D3 and is an ATP binding site." Q#6713 - CGI_10010994 superfamily 193253 3163 3508 4.67E-35 140.557 cl15084 MT superfamily - - "Microtubule-binding stalk of dynein motor; the 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This family is the region between D4 and D5 and is the two predicted alpha-helical coiled coil segments that form the stalk supporting the ATP-sensitive microtubule binding component." Q#6713 - CGI_10010994 superfamily 193257 3522 3641 5.57E-26 110.845 cl15086 AAA_9 superfamily N - "ATP-binding dynein motor region D5; The 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This particular family is the D5 ATP-binding region of the motor, but has lost its P-loop." Q#6713 - CGI_10010994 superfamily 247743 2227 2362 7.69E-07 50.7568 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperones, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases."
Q#6715 - CGI_10010996 superfamily 248097 221 347 4.23E-23 92.3282 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#6715 - CGI_10010996 superfamily 248097 61 183 5.99E-19 80.7722 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#6716 - CGI_10010998 superfamily 248097 23 126 8.40E-22 84.6242 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#6717 - CGI_10010999 superfamily 248097 151 275 1.49E-28 106.581 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#6719 - CGI_10011001 superfamily 128937 4 69 2.93E-11 55.3464 cl02743 DM9 superfamily - - Repeats found in Drosophila proteins; Repeats found in Drosophila proteins. Q#6719 - CGI_10011001 superfamily 128937 79 139 8.95E-11 54.1908 cl02743 DM9 superfamily - - Repeats found in Drosophila proteins; Repeats found in Drosophila proteins. Q#6721 - CGI_10001136 superfamily 246748 180 419 4.02E-119 352.665 cl14876 Zinc_peptidase_like superfamily N - "Zinc peptidases M18, M20, M28, and M42; Zinc peptidases play vital roles in metabolic and signaling pathways throughout all kingdoms of life. This family corresponds to several clans in the MEROPS database, including the MH clan, which contains 4 families (M18, M20, M28, M42). The peptidase M20 family includes carboxypeptidases such as the glutamate carboxypeptidase from Pseudomonas, the thermostable carboxypeptidase Ss1 of broad specificity from archaea and yeast Gly-X carboxypeptidase. The dipeptidases include bacterial dipeptidase, peptidase V (PepV), a eukaryotic, non-specific dipeptidase, and two Xaa-His dipeptidases (carnosinases). 
There is also the bacterial aminopeptidase, peptidase T (PepT) that acts only on tripeptide substrates and has therefore been termed a tripeptidase. Peptidase family M28 contains aminopeptidases and carboxypeptidases, and has co-catalytic zinc ions. However, several enzymes in this family utilize other first row transition metal ions such as cobalt and manganese. Each zinc ion is tetrahedrally co-ordinated, with three amino acid ligands plus activated water; one aspartate residue binds both metal ions. The aminopeptidases in this family are also called bacterial leucyl aminopeptidases, but are able to release a variety of N-terminal amino acids. IAP aminopeptidase and aminopeptidase Y preferentially release basic amino acids while glutamate carboxypeptidase II preferentially releases C-terminal glutamates. Glutamate carboxypeptidase II and plasma glutamate carboxypeptidase hydrolyze dipeptides. Peptidase families M18 and M42 contain metalloaminopeptidases. M18 is widely distributed in bacteria and eukaryotes. However, only yeast aminopeptidase I and mammalian aspartyl aminopeptidase have been characterized in detail. Some of M42 (also known as glutamyl aminopeptidase) enzymes exhibit aminopeptidase specificity while others also have acylaminoacylpeptidase activity (i.e. hydrolysis of acylated N-terminal residues)." Q#6721 - CGI_10001136 superfamily 244870 2 163 2.05E-61 201.364 cl08238 PA superfamily N - "PA: Protease-associated (PA) domain. The PA domain is an insert domain in a diverse fraction of proteases. The significance of the PA domain to many of the proteins in which it is inserted is undetermined. It may be a protein-protein interaction domain. At peptidase active sites, the PA domain may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate. 
Proteins into which the PA domain is inserted include the following: i) various signal peptide peptidases including, hSPPL2a and 2b which catalyze the intramembrane proteolysis of tumor necrosis factor alpha, ii) various proteins containing a C3H2C3 RING finger including, Arabidopsis ReMembR-H2 protein and various E3 ubiquitin ligases such as human GRAIL (gene related to anergy in lymphocytes), iii) EDEM3 (ER-degradation-enhancing mannosidase-like 3 protein), iv) various plant vacuolar sorting receptors such as Pisum sativum BP-80, v) glutamate carboxypeptidase II (GCPII), vi) yeast aminopeptidase Y, vii) Vibrio metschnikovii VapT, a sodium dodecyl sulfate (SDS) resistant extracellular alkaline serine protease, viii) lactocepin (a cell envelope-associated protease from Lactobacillus paracasei subsp. paracasei NCDO 151), ix) various subtilisin-like proteases such as melon Cucumisin, and x) human TfR (transferrin receptor) 1 and 2." Q#6723 - CGI_10005202 superfamily 247068 2022 2118 3.35E-15 75.0425 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6723 - CGI_10005202 superfamily 247068 1757 1842 1.23E-13 70.4201 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6723 - CGI_10005202 superfamily 247068 949 1036 1.81E-13 70.0349 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6723 - CGI_10005202 superfamily 247068 2874 2959 3.59E-12 66.1829 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6723 - CGI_10005202 superfamily 247068 2129 2214 4.91E-12 65.7977 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6723 - CGI_10005202 superfamily 247068 1646 1745 3.92E-10 60.0197 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6723 - CGI_10005202 superfamily 245213 3311 3348 4.28E-10 58.8022 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#6723 - CGI_10005202 superfamily 247068 2765 2862 2.40E-09 57.7086 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6723 - CGI_10005202 superfamily 247068 1327 1412 5.57E-09 56.553 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6723 - CGI_10005202 superfamily 247068 520 609 2.31E-08 55.0122 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6723 - CGI_10005202 superfamily 247068 1546 1634 3.14E-08 54.627 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6723 - CGI_10005202 superfamily 247068 2653 2754 6.65E-08 53.4714 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6723 - CGI_10005202 superfamily 247068 2351 2431 7.01E-07 50.3898 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6723 - CGI_10005202 superfamily 247068 1277 1315 8.37E-07 50.0046 cl15786 CA_like superfamily N - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6723 - CGI_10005202 superfamily 247068 1159 1249 9.13E-07 50.0046 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6723 - CGI_10005202 superfamily 247068 770 833 2.55E-06 48.849 cl15786 CA_like superfamily N - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6723 - CGI_10005202 superfamily 247068 1045 1145 3.48E-06 48.0786 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6723 - CGI_10005202 superfamily 241568 255 293 6.07E-05 43.9908 cl00043 CCP superfamily N - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." 
Q#6723 - CGI_10005202 superfamily 245213 3357 3392 0.000146411 42.6238 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#6723 - CGI_10005202 superfamily 247068 632 729 0.00017152 43.071 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6723 - CGI_10005202 superfamily 247068 1865 1959 0.000587867 41.145 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6723 - CGI_10005202 superfamily 247068 845 938 0.000875244 40.7598 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6723 - CGI_10005202 superfamily 247068 405 509 0.00129157 40.3746 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6723 - CGI_10005202 superfamily 245213 3396 3430 0.00138406 39.5422 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#6723 - CGI_10005202 superfamily 246918 76 122 4.42E-07 50.2779 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#6723 - CGI_10005202 superfamily 205157 3271 3308 2.27E-05 44.8359 cl18264 EGF_3 superfamily - - EGF domain; This family includes a variety of EGF-like domain homologues. This family includes the C-terminal domain of the malaria parasite MSP1 protein. Q#6723 - CGI_10005202 superfamily 246918 132 173 2.47E-05 44.8851 cl15278 TSP_1 superfamily C - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#6723 - CGI_10005202 superfamily 247068 1970 2027 0.000359169 41.9515 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. 
Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6726 - CGI_10005205 superfamily 216981 364 468 1.49E-15 74.4913 cl17087 OTU superfamily - - "OTU-like cysteine protease; This family is comprised of a group of predicted cysteine proteases, homologous to the Ovarian Tumour (OTU) gene in Drosophila. Members include proteins from eukaryotes, viruses and pathogenic bacterium. The conserved cysteine and histidine, and possibly the aspartate, represent the catalytic residues in this putative group of proteases." Q#6727 - CGI_10005206 superfamily 241946 71 174 9.66E-11 55.641 cl00558 Abi superfamily - - "CAAX protease self-immunity; Members of this family are probably proteases (after a isoprenyl group is attached to the Cys residue in the C-terminal CAAX motif of a protein to attach it to the membrane, the AAX tripeptide being removed by one of the CAAX prenyl proteases). The family contains the CAAX prenyl protease. The proteins contain a highly conserved Glu-Glu motif at the amino end of the alignment. The alignment also contains two histidine residues that may be involved in zinc binding. While they are involved in membrane anchoring of proteins in eukaryotes, little is known about their function in prokaryotes. In some known bacteriocin loci, Abi genes have been found downstream of bacteriocin structural genes where they are probably involved in self-immunity. Investigation of the bacteriocin-like loci in the Gram positive bacteria locus from Lactobacillus sakei 23K confirmed that the bacteriocin-like genes (sak23Kalphabeta) exhibited antimicrobial activity when expressed in a heterologous host and that the associated Abi gene (sak23Ki) conferred immunity against the cognate bacteriocin. 
Interestingly, the immunity genes from three similar systems conferred a high degree of cross-immunity against each other's bacteriocins, suggesting the recognition of a common receptor. Site-directed mutagenesis demonstrated that the conserved motifs constituting the putative proteolytic active site of the Abi proteins are essential for the immunity function of Sak23Ki - thus a new concept in self-immunity." Q#6729 - CGI_10005208 superfamily 219409 6 164 7.82E-14 65.2484 cl06456 Dynactin_p22 superfamily - - "Dynactin subunit p22; This family contains p22, the smallest subunit of dynactin, a complex that binds to cytoplasmic dynein and is a required activator for cytoplasmic dynein-mediated vesicular transport. Dynactin localises to the cleavage furrow and to the midbodies of dividing cells, suggesting that it may function in cytokinesis. Family members are approximately 170 residues long." Q#6730 - CGI_10005209 superfamily 204415 218 532 5.28E-88 284.111 cl16016 TTKRSYEDQ superfamily C - Predicted coiled-coil domain-containing protein; This is the C-terminal 500 amino acids of a family of proteins with a predicted coiled-coil domain conserved from nematodes to humans. It carries a characteristic TTKRSYEDQ sequence-motif. The function is not known. Q#6730 - CGI_10005209 superfamily 150821 8 88 6.61E-25 99.4966 cl10895 KLRAQ superfamily C - Predicted coiled-coil domain-containing protein; This is the N-terminal 100 amino acid domain of a family of proteins conserved from nematodes to humans. It carries a characteristic KLRAQ sequence-motif. The function is not known. Q#6730 - CGI_10005209 superfamily 243591 72 164 0.000114005 41.2489 cl03951 CDC37_N superfamily N - Cdc37 N terminal kinase binding; Cdc37 is a molecular chaperone required for the activity of numerous eukaryotic protein kinases. 
This domain corresponds to the N terminal domain which binds predominantly to protein kinases and is found N terminal to the Hsp (Heat shocked protein) 90-binding domain pfam08565. Expression of a construct consisting of only the N-terminal domain of Saccharomyces pombe Cdc37 results in cellular viability. This indicates that interactions with the cochaperone Hsp90 may not be essential for Cdc37 function. Q#6733 - CGI_10001489 superfamily 241680 208 259 0.00017553 40.6998 cl00200 MIP superfamily N - "Major intrinsic protein (MIP) superfamily. Members of the MIP superfamily function as membrane channels that selectively transport water, small neutral molecules, and ions out of and between cells. The channel proteins share a common fold: the N-terminal cytosolic portion followed by six transmembrane helices, which might have arisen through gene duplication. On the basis of sequence similarity and functional characteristics, the superfamily can be subdivided into two major groups: water-selective channels called aquaporins (AQPs) and glycerol uptake facilitators (GlpFs). AQPs are found in all three kingdoms of life, while GlpFs have been characterized only within microorganisms." Q#6734 - CGI_10001490 superfamily 245815 50 263 5.69E-120 353.918 cl11961 ALDH-SF superfamily C - "NAD(P)+-dependent aldehyde dehydrogenase superfamily; The aldehyde dehydrogenase superfamily (ALDH-SF) of NAD(P)+-dependent enzymes, in general, oxidize a wide range of endogenous and exogenous aliphatic and aromatic aldehydes to their corresponding carboxylic acids and play an important role in detoxification. Besides aldehyde detoxification, many ALDH isozymes possess multiple additional catalytic and non-catalytic functions such as participating in metabolic pathways, or as binding proteins, or osmoregulants, to mention a few. 
The enzyme has three domains, a NAD(P)+ cofactor-binding domain, a catalytic domain, and a bridging domain; and the active enzyme is generally either homodimeric or homotetrameric. The catalytic mechanism is proposed to involve cofactor binding, resulting in a conformational change and activation of an invariant catalytic cysteine nucleophile. The cysteine and aldehyde substrate form an oxyanion thiohemiacetal intermediate resulting in hydride transfer to the cofactor and formation of a thioacylenzyme intermediate. Hydrolysis of the thioacylenzyme and release of the carboxylic acid product occurs, and in most cases, the reduced cofactor dissociates from the enzyme. The evolutionary phylogenetic tree of ALDHs appears to have an initial bifurcation between what has been characterized as the classical aldehyde dehydrogenases, the ALDH family (ALDH) and extended family members or aldehyde dehydrogenase-like (ALDH-L) proteins. The ALDH proteins are represented by enzymes which share a number of highly conserved residues necessary for catalysis and cofactor binding and they include such proteins as retinal dehydrogenase, 10-formyltetrahydrofolate dehydrogenase, non-phosphorylating glyceraldehyde 3-phosphate dehydrogenase, delta(1)-pyrroline-5-carboxylate dehydrogenases, alpha-ketoglutaric semialdehyde dehydrogenase, alpha-aminoadipic semialdehyde dehydrogenase, coniferyl aldehyde dehydrogenase and succinate-semialdehyde dehydrogenase. Included in this larger group are all human, Arabidopsis, Tortula, fungal, protozoan, and Drosophila ALDHs identified in families ALDH1 through ALDH22 with the exception of families ALDH18, ALDH19, and ALDH20 which are present in the ALDH-like group. The ALDH-like group is represented by such proteins as gamma-glutamyl phosphate reductase, LuxC-like acyl-CoA reductase, and coenzyme A acylating aldehyde dehydrogenase. All of these proteins have a conserved cysteine that aligns with the catalytic cysteine of the ALDH group." 
Q#6735 - CGI_10002143 superfamily 216254 11 91 1.89E-16 75.3622 cl08303 Recep_L_domain superfamily C - Receptor L domain; The L domains from these receptors make up the bilobal ligand binding site. Each L domain consists of a single-stranded right hand beta-helix. This Pfam entry is missing the first 50 amino acid residues of the domain. Q#6736 - CGI_10002778 superfamily 243072 367 492 7.72E-34 124.418 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#6736 - CGI_10002778 superfamily 243072 268 393 6.23E-31 116.714 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#6736 - CGI_10002778 superfamily 243072 192 327 4.57E-19 83.587 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#6739 - CGI_10002427 superfamily 241600 2 118 3.04E-35 122.734 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#6740 - CGI_10002428 superfamily 241600 2 48 6.57E-09 49.1611 cl00085 FReD superfamily C - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#6742 - CGI_10003589 superfamily 201217 553 600 5.56E-13 65.2396 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#6742 - CGI_10003589 superfamily 201217 603 653 1.67E-10 57.9208 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. 
Q#6742 - CGI_10003589 superfamily 201217 709 757 3.25E-09 54.0688 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#6742 - CGI_10003589 superfamily 201217 656 704 1.52E-08 52.1428 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#6742 - CGI_10003589 superfamily 201217 446 497 5.19E-07 47.9056 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#6742 - CGI_10003589 superfamily 201217 501 550 1.41E-05 43.6684 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#6742 - CGI_10003589 superfamily 201217 397 443 0.00616894 35.5792 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#6743 - CGI_10003590 superfamily 219542 52 162 8.92E-39 138.914 cl18517 Cu-oxidase_3 superfamily - - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. Q#6743 - CGI_10003590 superfamily 219541 490 619 1.36E-29 114.488 cl18516 Cu-oxidase_2 superfamily N - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. Q#6743 - CGI_10003590 superfamily 215896 173 359 2.00E-15 73.8684 cl18351 Cu-oxidase superfamily - - Multicopper oxidase; Many of the proteins in this family contain multiple similar copies of this plastocyanin-like domain. Q#6744 - CGI_10003591 superfamily 243109 1111 1267 6.79E-75 246.469 cl02614 SPRY superfamily - - "SPRY domain; SPRY domains, first identified in the SP1A kinase of Dictyostelium and rabbit Ryanodine receptor (hence the name), are homologous to B30.2. 
SPRY domains have been identified in at least 11 protein families, covering a wide range of functions, including regulation of cytokine signaling (SOCS), RNA metabolism (DDX1 and hnRNP), immunity to retroviruses (TRIM5alpha), intracellular calcium release (ryanodine receptors or RyR) and regulatory and developmental processes (HERC1 and Ash2L). B30.2 also contains residues in the N-terminus that form a distinct PRY domain structure; i.e. B30.2 domain consists of PRY and SPRY subdomains. B30.2 domains comprise the C-terminus of three protein families: BTNs (receptor glycoproteins of immunoglobulin superfamily); several TRIM proteins (composed of RING/B-box/coiled-coil or RBCC core); Stonutoxin (secreted poisonous protein of the stonefish Synanceia horrida). While SPRY domains are evolutionarily ancient, B30.2 domains are a more recent adaptation where the SPRY/PRY combination is a possible component of immune defense. Mutations found in the SPRY-containing proteins have shown to cause Mediterranean fever and Opitz syndrome." 
Q#6745 - CGI_10003592 superfamily 243092 1234 1583 1.18E-21 97.7908 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#6745 - CGI_10003592 superfamily 201217 1904 1953 9.95E-14 68.7064 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#6745 - CGI_10003592 superfamily 201217 1956 2005 4.93E-13 66.7804 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#6745 - CGI_10003592 superfamily 201217 2008 2058 3.33E-11 61.3876 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. 
Q#6745 - CGI_10003592 superfamily 205718 2097 2126 1.58E-08 53.263 cl16296 RCC1_2 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#6745 - CGI_10003592 superfamily 201217 2113 2154 1.14E-07 50.9872 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#6745 - CGI_10003592 superfamily 201217 2062 2110 4.31E-05 43.6684 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#6745 - CGI_10003592 superfamily 201217 1849 1901 0.000118824 42.1276 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#6745 - CGI_10003592 superfamily 201217 1802 1846 0.00334182 37.8904 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#6746 - CGI_10003053 superfamily 241563 60 95 2.94E-05 41.696 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#6748 - CGI_10003055 superfamily 110440 160 186 0.00240433 34.3057 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is me diated by the NHL repeats." 
Q#6749 - CGI_10003056 superfamily 241563 60 95 0.000196094 39.3848 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#6749 - CGI_10003056 superfamily 110440 482 508 0.000860983 37.3873 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is me diated by the NHL repeats." Q#6750 - CGI_10003057 superfamily 247856 292 355 2.07E-06 44.8461 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#6751 - CGI_10004229 superfamily 243064 19 111 0.000182887 39.6494 cl02512 NTR_like superfamily C - "NTR_like domain; a beta barrel with an oligosaccharide/oligonucleotide-binding fold found in netrins, complement proteins, tissue inhibitors of metalloproteases (TIMP), and procollagen C-proteinase enhancers (PCOLCE), amongst others. In netrins, the domain plays a role in controlling axon branching in neural development, while the common function of these modules in TIMPs appears to be binding to metzincins. 
A subset of this family is also known as the C345C domain because it occurs as a C-terminal domain in complement C3, C4 and C5. In C5, the domain interacts with various partners during the formation of the membrane attack complex." Q#6752 - CGI_10004230 superfamily 243119 251 295 2.27E-07 46.658 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#6752 - CGI_10004230 superfamily 243100 92 142 0.00250561 35.2324 cl02576 B_zip1 superfamily - - "basic leucine zipper DNA-binding and multimerization region of GCN4 and related proteins; Basic leucine zipper (bZIP) transcription factors act in networks of homo- and hetero-dimers in the regulation in a diverse set of cellular pathways. Classical leucine zippers have alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization creates a pair of basic regions that bind DNA and undergo conformational change. GCN4 was identified in Saccharomyces cerevisiae from mutations in a deficiency in activation with the general amino acid control pathway. GCN4 encodes a trans-activator of amino acid biosynthetic genes containing 2 acidic activation domains and a C-terminal bZIP domain, comprised of a basic alpha-helical DNA-binding region and a coiled-coil dimerization region." 
Q#6753 - CGI_10004231 superfamily 243064 19 98 1.71E-06 43.8866 cl02512 NTR_like superfamily C - "NTR_like domain; a beta barrel with an oligosaccharide/oligonucleotide-binding fold found in netrins, complement proteins, tissue inhibitors of metalloproteases (TIMP), and procollagen C-proteinase enhancers (PCOLCE), amongst others. In netrins, the domain plays a role in controlling axon branching in neural development, while the common function of these modules in TIMPs appears to be binding to metzincins. A subset of this family is also known as the C345C domain because it occurs as a C-terminal domain in complement C3, C4 and C5. In C5, the domain interacts with various partners during the formation of the membrane attack complex." Q#6755 - CGI_10002810 superfamily 243072 33 89 0.00436531 33.1259 cl02529 ANK superfamily C - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#6756 - CGI_10002811 superfamily 221797 1552 1737 1.48E-42 156.323 cl15115 Nipped-B_C superfamily - - Sister chromatid cohesion C-terminus; This domain lies towards the C-terminus of nipped-B or sister chromatid cohesion proteins. Q#6756 - CGI_10002811 superfamily 205062 1070 1111 1.31E-11 62.4882 cl15079 Cohesin_HEAT superfamily - - "HEAT repeat associated with sister chromatid cohesion; This HEAT repeat is found most frequently in sister chromatid cohesion proteins such as Nipped-B. HEAT repeats are found tandemly repeated in many proteins, and they appear to serve as flexible scaffolding on which other components can assemble." 
Q#6760 - CGI_10018191 superfamily 241642 127 186 2.53E-10 56.3498 cl00152 t_SNARE superfamily - - "Soluble NSF (N-ethylmaleimide-sensitive fusion protein)-Attachment protein (SNAP) REceptor domain; these alpha-helical motifs form twisted and parallel heterotetrameric helix bundles; the core complex contains one helix from a protein that is anchored in the vesicle membrane (synaptobrevin), one helix from a protein of the target membrane (syntaxin), and two helices from another protein anchored in the target membrane (SNAP-25); their interaction forms a core which is composed of a polar zero layer, a flanking leucine-zipper layer acts as a water tight shield to isolate ionic interactions in the zero layer from the surrounding solvent" Q#6760 - CGI_10018191 superfamily 220133 5 67 6.45E-18 78.8295 cl07702 Syntaxin-6_N superfamily C - "Syntaxin 6, N-terminal; Members of this family, which are found in the amino terminus of various SNARE proteins, adopt a structure consisting of an antiparallel three-helix bundle. Their exact function has not been determined, though it is known that they regulate the SNARE motif, as well as mediate various protein-protein interactions involved in membrane-transport." 
Q#6761 - CGI_10018192 superfamily 241862 134 270 7.27E-20 85.4856 cl00437 COG0428 superfamily N - Predicted divalent heavy-metal cations transporter [Inorganic ion transport and metabolism] Q#6762 - CGI_10018193 superfamily 243034 100 208 3.80E-11 57.0048 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#6763 - CGI_10018194 superfamily 246749 8 94 4.37E-21 89.0936 cl14879 LabA_like_C superfamily - - "C-terminal domain of LabA_like proteins; This C-terminal domain is found in a well conserved group of mainly bacterial proteins with no defined function, which contain an N-terminal LabA-like domain. 
LabA from Synechococcus elongatus PCC 7942, (which does not contain this C-terminal domain) has been shown to play a role in cyanobacterial circadian timing. LabA-like C-terminal domains described here may be related to the LOTUS domain family (which also co-occurs with LabA-like N-terminal domains)." Q#6763 - CGI_10018194 superfamily 243098 603 649 1.01E-14 69.9343 cl02573 TUDOR superfamily - - "Tudor domains are found in many eukaryotic organisms and have been implicated in protein-protein interactions in which methylated protein substrates bind to these domains. For example, the Tudor domain of Survival of Motor Neuron (SMN) binds to symmetrically dimethylated arginines of arginine-glycine (RG) rich sequences found in the C-terminal tails of Sm proteins. The SMN protein is linked to spinal muscular atrophy. Another example is the tandem tudor domains of 53BP1, which bind to histone H4 specifically dimethylated at Lys20 (H4-K20me2). 53BP1 is a key transducer of the DNA damage checkpoint signal." Q#6763 - CGI_10018194 superfamily 246749 133 196 7.38E-09 53.3917 cl14879 LabA_like_C superfamily - - "C-terminal domain of LabA_like proteins; This C-terminal domain is found in a well conserved group of mainly bacterial proteins with no defined function, which contain an N-terminal LabA-like domain. LabA from Synechococcus elongatus PCC 7942, (which does not contain this C-terminal domain) has been shown to play a role in cyanobacterial circadian timing. LabA-like C-terminal domains described here may be related to the LOTUS domain family (which also co-occurs with LabA-like N-terminal domains)." Q#6763 - CGI_10018194 superfamily 246749 313 384 4.80E-16 74.3784 cl14879 LabA_like_C superfamily - - "C-terminal domain of LabA_like proteins; This C-terminal domain is found in a well conserved group of mainly bacterial proteins with no defined function, which contain an N-terminal LabA-like domain. 
LabA from Synechococcus elongatus PCC 7942, (which does not contain this C-terminal domain) has been shown to play a role in cyanobacterial circadian timing. LabA-like C-terminal domains described here may be related to the LOTUS domain family (which also co-occurs with LabA-like N-terminal domains)." Q#6763 - CGI_10018194 superfamily 246749 417 490 9.68E-06 44.4104 cl14879 LabA_like_C superfamily - - "C-terminal domain of LabA_like proteins; This C-terminal domain is found in a well conserved group of mainly bacterial proteins with no defined function, which contain an N-terminal LabA-like domain. LabA from Synechococcus elongatus PCC 7942, (which does not contain this C-terminal domain) has been shown to play a role in cyanobacterial circadian timing. LabA-like C-terminal domains described here may be related to the LOTUS domain family (which also co-occurs with LabA-like N-terminal domains)." Q#6764 - CGI_10018195 superfamily 247792 37 85 6.48E-07 47.4404 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#6764 - CGI_10018195 superfamily 128778 236 356 0.0023839 37.2443 cl17972 BBC superfamily - - B-Box C-terminal domain; Coiled coil region C-terminal to (some) B-Box domains Q#6766 - CGI_10018197 superfamily 247918 562 806 4.26E-55 193.313 cl17364 PMT_2 superfamily - - Dolichyl-phosphate-mannose-protein mannosyltransferase; This family contains members that are not captured by 
pfam02366. Q#6766 - CGI_10018197 superfamily 241739 256 481 1.50E-13 71.4367 cl00268 class_II_aaRS-like_core superfamily C - "Class II tRNA amino-acyl synthetase-like catalytic core domain. Class II amino acyl-tRNA synthetases (aaRS) share a common fold and generally attach an amino acid to the 3' OH of ribose of the appropriate tRNA. PheRS is an exception in that it attaches the amino acid at the 2'-OH group, like class I aaRSs. These enzymes are usually homodimers. This domain is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. The substrate specificity of this reaction is further determined by additional domains. Interestingly, this domain is also found in asparagine synthase A (AsnA), in the accessory subunit of mitochondrial polymerase gamma and in the bacterial ATP phosphoribosyltransferase regulatory subunit HisZ." Q#6766 - CGI_10018197 superfamily 197746 838 897 3.93E-08 51.9583 cl02624 MIR superfamily - - Domain in ryanodine and inositol trisphosphate receptors and protein O-mannosyltransferases; Domain in ryanodine and inositol trisphosphate receptors and protein O-mannosyltransferases. Q#6766 - CGI_10018197 superfamily 197746 980 1030 1.46E-06 47.3359 cl02624 MIR superfamily - - Domain in ryanodine and inositol trisphosphate receptors and protein O-mannosyltransferases; Domain in ryanodine and inositol trisphosphate receptors and protein O-mannosyltransferases. Q#6766 - CGI_10018197 superfamily 197746 911 968 1.60E-06 47.3359 cl02624 MIR superfamily - - Domain in ryanodine and inositol trisphosphate receptors and protein O-mannosyltransferases; Domain in ryanodine and inositol trisphosphate receptors and protein O-mannosyltransferases. Q#6768 - CGI_10018199 superfamily 243072 1 81 1.60E-18 77.0386 cl02529 ANK superfamily C - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. 
The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#6770 - CGI_10018202 superfamily 245847 20 159 1.58E-17 74.9005 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#6771 - CGI_10018203 superfamily 245847 39 160 1.92E-21 86.4565 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#6774 - CGI_10018207 superfamily 241600 82 135 3.83E-19 80.3623 cl00085 FReD superfamily NC - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." 
Q#6776 - CGI_10018209 superfamily 243035 62 123 0.000206484 37.5794 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#6777 - CGI_10018210 superfamily 245847 18 141 8.91E-18 74.9005 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#6779 - CGI_10018212 superfamily 221601 18 134 9.03E-07 47.5504 cl13870 ARA70 superfamily - - "Nuclear coactivator; This domain family is found in eukaryotes, and is typically between 127 and 138 amino acids in length. This family is ARA70, a nuclear coactivator which interacts with peroxisome proliferator-activated receptor gamma (PPARgamma) to regulate transcription and the addition of the PPARgamma ligand (prostaglandin J2) enhances this interaction." Q#6780 - CGI_10018213 superfamily 246680 212 288 4.66E-15 68.131 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. 
Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#6781 - CGI_10018214 superfamily 248097 201 332 8.21E-15 69.6385 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#6781 - CGI_10018214 superfamily 248097 84 181 0.000280726 38.7854 cl17543 C1q superfamily C - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#6783 - CGI_10018216 superfamily 241733 2 90 9.32E-62 184.714 cl00259 Sm_like superfamily - - "Sm and related proteins; The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes." Q#6784 - CGI_10018217 superfamily 242877 40 260 1.58E-121 348.835 cl02093 Coq4 superfamily - - "Coenzyme Q (ubiquinone) biosynthesis protein Coq4; Coq4p was shown to peripherally associate with the matrix face of the mitochondrial inner membrane. The putative mitochondrial- targeting sequence present at the amino-terminus of the polypeptide efficiently imported it to mitochondria. 
The function of Coq4p is unknown, although its presence is required to maintain a steady-state level of Coq7p, another component of the Q biosynthetic pathway. The overall structure of Coq4 is alpha helical and shows resemblance to haemoglobin/myoglobin (information from TOPSAN)." Q#6785 - CGI_10018218 superfamily 244843 47 578 4.63E-148 440.9 cl08040 Ggt superfamily - - Gamma-glutamyltransferase [Amino acid transport and metabolism] Q#6786 - CGI_10018219 superfamily 243155 2 105 4.71E-51 160.258 cl02716 RNA_pol_Rpb8 superfamily C - "RNA polymerase Rpb8; Rpb8 is a subunit common to the three yeast RNA polymerases, pol I, II and III. Rpb8 interacts with the largest subunit Rpb1, and with Rpb3 and Rpb11, two smaller subunits." Q#6787 - CGI_10018220 superfamily 243155 46 80 1.92E-15 66.343 cl02716 RNA_pol_Rpb8 superfamily N - "RNA polymerase Rpb8; Rpb8 is a subunit common to the three yeast RNA polymerases, pol I, II and III. Rpb8 interacts with the largest subunit Rpb1, and with Rpb3 and Rpb11, two smaller subunits." 
Q#6792 - CGI_10002472 superfamily 247792 17 67 6.31E-06 43.9736 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#6792 - CGI_10002472 superfamily 241563 164 193 0.00364409 35.918 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#6793 - CGI_10002473 superfamily 242324 112 196 0.000978724 38.2385 cl01133 Na_H_antiport_1 superfamily NC - Na+/H+ antiporter 1; This family contains a number of bacterial Na+/H+ antiporter 1 proteins. These are integral membrane proteins that catalyze the exchange of H+ for Na+ in a manner that is highly dependent on the pH. Q#6795 - CGI_10002475 superfamily 241986 25 96 2.54E-21 81.8264 cl00617 SRP19 superfamily C - SRP19 protein; The signal recognition particle (SRP) binds to the signal peptide of proteins as they are being translated. The binding of the SRP halts translation and the complex is then transported to the endoplasmic reticulum's cytoplasmic surface. The SRP then aids translocation of the protein through the ER membrane. The SRP is a ribonucleoprotein that is composed of a small RNA and several proteins. One of these proteins is the SRP19 protein (Sec65 in yeast). 
Q#6797 - CGI_10008304 superfamily 207662 37 109 6.72E-29 109.188 cl02596 NR_DBD_like superfamily - - "DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers; DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. It interacts with a specific DNA site upstream of the target gene and modulates the rate of transcriptional initiation. Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions, from development, reproduction, to homeostasis and metabolism in animals (metazoans). The family contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). Most nuclear receptors bind as homodimers or heterodimers to their target sites, which consist of two hexameric half-sites. Specificity is determined by the half-site sequence, the relative orientation of the half-sites and the number of spacer nucleotides between the half-sites. However, a growing number of nuclear receptors have been reported to bind to DNA as monomers." Q#6797 - CGI_10008304 superfamily 245599 353 526 3.12E-20 88.0486 cl11397 NR_LBD superfamily - - "The ligand binding domain of nuclear receptors, a family of ligand-activated transcription regulators; Ligand-binding domain (LBD) of nuclear receptor (NR): Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions in metazoans, from development, reproduction, to homeostasis and metabolism. 
The superfamily contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. The members of the family include receptors of steroids, thyroid hormone, retinoids, cholesterol by-products, lipids and heme. With few exceptions, NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD)." Q#6798 - CGI_10008305 superfamily 248013 15 31 0.000351955 36.3972 cl17459 CHROMO superfamily NC - "Chromatin organization modifier (chromo) domain is a conserved region of around 50 amino acids found in a variety of chromosomal proteins, which appear to play a role in the functional organization of the eukaryotic nucleus. Experimental evidence implicates the chromo domain in the binding activity of these proteins to methylated histone tails and maybe RNA. May occur as single instance, in a tandem arrangement or followed by a related "chromo shadow" domain." Q#6799 - CGI_10008306 superfamily 243058 14 90 8.69E-10 55.3983 cl02500 ARM superfamily N - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#6799 - CGI_10008306 superfamily 248012 110 209 1.08E-19 82.2404 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. 
Q#6800 - CGI_10008307 superfamily 218847 17 186 1.75E-49 166.521 cl18479 CDO_I superfamily - - Cysteine dioxygenase type I; Cysteine dioxygenase type I (EC:1.13.11.20) converts cysteine to cysteinesulphinic acid and is the rate-limiting step in sulphate production. Q#6802 - CGI_10008310 superfamily 243035 519 586 1.51E-07 49.9258 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. 
C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#6802 - CGI_10008310 superfamily 248281 380 421 2.40E-07 48.4303 cl17727 GT1 superfamily C - "GT1, myb-like, SANT family; GT-1, a myb-like protein, is one of the GT trihelix transcription factors. GT-1 binds the GT cis-element of rbcS-3A, a light-induced gene, as a dimer. Arabidopsis GT-1 is a trans-activator and acts in the stabilization of components of the transcription pre-initiation complex comprised of TFIIA-TBP-TATA. The isolated GT-1 DNA-binding domain is sufficient to bind DNA. This region closely resembles the myb domain, but with longer helices. It has been proposed that GT-1 may respond to light signals via calcium-dependent phosphorylation to create a light-modulated molecular switch. These proteins are members of the SANT/myb group. SANT is named after 'SWI3, ADA2, N-CoR and TFIIIB', several factors that share this domain. 
The SANT domain resembles the 3 alpha-helix bundle of the DNA-binding Myb domains and is found in a diverse set of proteins." Q#6803 - CGI_10017832 superfamily 246676 60 272 7.86E-87 260.726 cl14616 Cyt_b561 superfamily - - "Eukaryotic cytochrome b(561); Cytochrome b(561) is a family of endosomal or secretory vesicle-specific electron transport proteins. They are integral membrane proteins that bind two heme groups non-covalently, and may have six alpha-helical trans-membrane segments. This is an exclusively eukaryotic family. Members of the prokaryotic cytochrome b561 family are not deemed homologous." Q#6804 - CGI_10017833 superfamily 241886 96 157 1.18E-16 74.5989 cl00470 Aldo_ket_red superfamily C - "Aldo-keto reductases (AKRs) are a superfamily of soluble NAD(P)(H) oxidoreductases whose chief purpose is to reduce aldehydes and ketones to primary and secondary alcohols. AKRs are present in all phyla and are of importance to both health and industrial applications. Members have very distinct functions and include the prokaryotic 2,5-diketo-D-gluconic acid reductases and beta-keto ester reductases, the eukaryotic aldose reductases, aldehyde reductases, hydroxysteroid dehydrogenases, steroid 5beta-reductases, potassium channel beta-subunits and aflatoxin aldehyde reductases, among others." Q#6810 - CGI_10017839 superfamily 241872 181 462 3.73E-23 99.4667 cl00453 CDP-OH_P_transf superfamily N - CDP-alcohol phosphatidyltransferase; All of these members have the ability to catalyze the displacement of CMP from a CDP-alcohol by a second alcohol with formation of a phosphodiester bond and concomitant breaking of a phosphoride anhydride bond. 
Q#6811 - CGI_10017840 superfamily 201678 99 128 9.49E-06 43.2576 cl03138 PPTA superfamily - - "Protein prenyltransferase alpha subunit repeat; Both farnesyltransferase (FT) and geranylgeranyltransferase 1 (GGT1) recognise a CaaX motif on their substrates where 'a' stands for preferably aliphatic residues, whereas GGT2 recognises a completely different motif. Important substrates for FT include, amongst others, many members of the Ras superfamily. GGT1 substrates include some of the other small GTPases and GGT2 substrates include the Rab family." Q#6811 - CGI_10017840 superfamily 201678 170 196 2.69E-05 41.7168 cl03138 PPTA superfamily - - "Protein prenyltransferase alpha subunit repeat; Both farnesyltransferase (FT) and geranylgeranyltransferase 1 (GGT1) recognise a CaaX motif on their substrates where 'a' stands for preferably aliphatic residues, whereas GGT2 recognises a completely different motif. Important substrates for FT include, amongst others, many members of the Ras superfamily. GGT1 substrates include some of the other small GTPases and GGT2 substrates include the Rab family." Q#6811 - CGI_10017840 superfamily 201678 135 161 0.000461205 38.25 cl03138 PPTA superfamily - - "Protein prenyltransferase alpha subunit repeat; Both farnesyltransferase (FT) and geranylgeranyltransferase 1 (GGT1) recognise a CaaX motif on their substrates where 'a' stands for preferably aliphatic residues, whereas GGT2 recognises a completely different motif. Important substrates for FT include, amongst others, many members of the Ras superfamily. GGT1 substrates include some of the other small GTPases and GGT2 substrates include the Rab family." 
Q#6811 - CGI_10017840 superfamily 201678 57 81 0.00140871 36.7092 cl03138 PPTA superfamily - - "Protein prenyltransferase alpha subunit repeat; Both farnesyltransferase (FT) and geranylgeranyltransferase 1 (GGT1) recognise a CaaX motif on their substrates where 'a' stands for preferably aliphatic residues, whereas GGT2 recognises a completely different motif. Important substrates for FT include, amongst others, many members of the Ras superfamily. GGT1 substrates include some of the other small GTPases and GGT2 substrates include the Rab family." Q#6811 - CGI_10017840 superfamily 201678 221 244 0.00154698 36.7092 cl03138 PPTA superfamily - - "Protein prenyltransferase alpha subunit repeat; Both farnesyltransferase (FT) and geranylgeranyltransferase 1 (GGT1) recognise a CaaX motif on their substrates where 'a' stands for preferably aliphatic residues, whereas GGT2 recognises a completely different motif. Important substrates for FT include, amongst others, many members of the Ras superfamily. GGT1 substrates include some of the other small GTPases and GGT2 substrates include the Rab family." Q#6811 - CGI_10017840 superfamily 149008 249 306 0.00496284 35.6037 cl06654 RabGGT_insert superfamily C - "Rab geranylgeranyl transferase alpha-subunit, insert domain; Rab geranylgeranyl transferase (RabGGT) catalyzes the addition of two geranylgeranyl groups to the C-terminal cysteine residues of Rab proteins, which is crucial for membrane association and function of these proteins in intracellular vesicular trafficking. This domain is inserted between pfam01239 repeats. This domain adopts an Ig-like fold and is thought to be involved in protein-protein interactions and might be involved in the recognition and binding of REP." Q#6812 - CGI_10017841 superfamily 241958 46 472 8.81E-101 310.987 cl00573 SDF superfamily - - Sodium:dicarboxylate symporter family; Sodium:dicarboxylate symporter family. 
Q#6813 - CGI_10017842 superfamily 247057 20 86 5.24E-30 105.095 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#6814 - CGI_10017844 superfamily 245596 1 147 2.85E-48 157.05 cl11394 Glyco_tranf_GTA_type superfamily C - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. 
But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#6816 - CGI_10017847 superfamily 241583 4 97 1.73E-32 117.475 cl00064 ZnMc superfamily N - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#6818 - CGI_10017849 superfamily 245847 47 112 0.000193113 37.151 cl12042 FA58C superfamily N - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#6819 - CGI_10012741 superfamily 243072 186 327 8.41E-07 47.7634 cl02529 ANK superfamily N - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#6819 - CGI_10012741 superfamily 243072 66 211 4.52E-05 42.3707 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#6819 - CGI_10012741 superfamily 243072 10 87 0.000112953 41.2151 cl02529 ANK superfamily N - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#6819 - CGI_10012741 superfamily 243072 287 424 0.00131892 38.1335 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#6819 - CGI_10012741 superfamily 220672 444 643 1.67E-23 99.2434 cl10957 Frag1 superfamily - - "Frag1/DRAM/Sfk1 family; This family includes Frag1, DRAM and Sfk1 proteins. Frag1 (FGF receptor activating protein 1) is a protein that is conserved from fungi to humans. There are four potential iso-prenylation sites throughout the peptide, viz CILW, CIIW and CIGL. Frag1 is a membrane-spanning protein that is ubiquitously expressed in adult tissues suggesting an important cellular function. Dram is a family of proteins conserved from nematodes to humans with six hydrophobic transmembrane regions and an Endoplasmic Reticulum signal peptide. It is a lysosomal protein that induces macro-autophagy as an effector of p53-mediated death, where p53 is the tumour-suppressor gene that is frequently mutated in cancer. Expression of Dram is stress-induced. 
This region is also part of a family of small plasma membrane proteins, referred to as Sfk1, that may act together with or upstream of Stt4p to generate normal levels of the essential phospholipid PI4P, thus allowing proper localisation of Stt4p to the actin cytoskeleton." Q#6820 - CGI_10012742 superfamily 213107 21 61 3.94E-06 42.6423 cl02594 DD_R_PKA superfamily - - "Dimerization/Docking domain of the Regulatory subunit of cAMP-dependent protein kinase and similar domains; cAMP-dependent protein kinase (PKA) is a serine/threonine kinase (STK), catalyzing the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The inactive PKA holoenzyme is a heterotetramer composed of two phosphorylated and active catalytic subunits with a dimer of regulatory (R) subunits. Activation is achieved through the binding of the important second messenger cAMP to the R subunits, which leads to the dissociation of PKA into the R dimer and two active subunits. There are two classes of R subunits, RI and RII; each exists as two isoforms (alpha and beta) from distinct genes. These functionally non-redundant R isoforms allow for specificity in PKA signaling. The R subunit contains an N-terminal dimerization/docking (D/D) domain, a linker with an inhibitory sequence (IS), and two c-AMP binding domains. RI and RII subunits are distinguished by their IS; RII subunits contain a phosphorylation site and are both substrates and inhibitors while RI subunits are pseudo-substrates. RI subunits require ATP and Mg ions to form a stable holoenzyme while RII subunits do not. The D/D domain dimerizes to form a four-helix bundle that serves as a docking site for A-kinase-anchoring proteins (AKAPs), which facilitates the localization of PKA to specific sites in the cell. PKA is present ubiquitously in cells and interacts with many different downstream targets. 
It plays a role in the regulation of diverse processes such as growth, development, memory, metabolism, gene expression, immunity, and lipolysis." Q#6821 - CGI_10012743 superfamily 215827 179 357 2.53E-23 99.8503 cl02830 Tyrosinase superfamily - - Common central domain of tyrosinase; This family also contains polyphenol oxidases and some hemocyanins. Binds two copper ions via two sets of three histidines. This family is related to pfam00372. Q#6822 - CGI_10012744 superfamily 243072 86 234 7.35E-27 109.01 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#6822 - CGI_10012744 superfamily 241568 1449 1481 4.01E-05 43.6056 cl00043 CCP superfamily N - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#6822 - CGI_10012744 superfamily 243179 1706 1784 9.22E-09 55.2315 cl02781 tetraspanin_LEL superfamily C - "Tetraspanin, extracellular domain or large extracellular loop (LEL). Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. 
Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. The tetraspanin family contains CD9, CD63, CD37, CD53, CD82, CD151, and CD81, amongst others. Tetraspanins are involved in diverse processes such as cell activation and proliferation, adhesion and motility, differentiation, cancer, and others. Their various functions may relate to their ability to act as molecular facilitators, grouping specific cell-surface proteins and affecting formation and stability of signaling complexes. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web", which may also include integrins." Q#6822 - CGI_10012744 superfamily 219521 320 472 0.000204209 43.3807 cl09440 7TMR-DISM_7TM superfamily C - 7TM diverse intracellular signalling; This entry represents the transmembrane region of the 7TM-DISM (7TM Receptors with Diverse Intracellular Signalling Modules). Q#6823 - CGI_10012745 superfamily 243179 294 369 8.00E-07 47.1423 cl02781 tetraspanin_LEL superfamily N - "Tetraspanin, extracellular domain or large extracellular loop (LEL). Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. The tetraspanin family contains CD9, CD63, CD37, CD53, CD82, CD151, and CD81, amongst others. Tetraspanins are involved in diverse processes such as cell activation and proliferation, adhesion and motility, differentiation, cancer, and others. Their various functions may relate to their ability to act as molecular facilitators, grouping specific cell-surface proteins and affecting formation and stability of signaling complexes. 
Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web", which may also include integrins." Q#6824 - CGI_10012746 superfamily 241683 147 530 1.58E-149 450.814 cl00204 PFK superfamily N - "Phosphofructokinase, a key regulatory enzyme in glycolysis, catalyzes the phosphorylation of fructose-6-phosphate to fructose-1,6-bisphosphate. The members belong to PFK family that includes ATP- and pyrophosphate (PPi)- dependent phosphofructokinases. Some members evolved by gene duplication and thus have a large C-terminal/N-terminal extension comprising a second PFK domain. Generally, ATP-PFKs are allosteric homotetramers, and PPi-PFKs are dimeric and nonallosteric except for plant PPi-PFKs which are allosteric heterotetramers." Q#6824 - CGI_10012746 superfamily 245206 78 140 2.69E-05 45.0336 cl09931 NADB_Rossmann superfamily C - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#6825 - CGI_10012747 superfamily 247775 1 199 1.20E-31 122.308 cl17221 ArsB_NhaD_permease superfamily C - "Anion permease ArsB/NhaD. 
These permeases have been shown to translocate sodium, arsenate, antimonite, sulfate and organic anions across biological membranes in all three kingdoms of life. A typical anion permease contains 8-13 transmembrane helices and can function either independently as a chemiosmotic transporter or as a channel-forming subunit of an ATP-driven anion pump." Q#6826 - CGI_10012748 superfamily 247875 114 231 5.30E-06 44.9612 cl17321 2OG-FeII_Oxy_2 superfamily N - 2OG-Fe(II) oxygenase superfamily; 2OG-Fe(II) oxygenase superfamily. Q#6827 - CGI_10012749 superfamily 241574 86 289 3.93E-48 171.23 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#6827 - CGI_10012749 superfamily 217309 346 816 5.51E-173 514.169 cl09289 EMP70 superfamily - - Endomembrane protein 70; Endomembrane protein 70. Q#6829 - CGI_10012751 superfamily 241574 295 501 3.76E-80 252.122 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. 
Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#6831 - CGI_10005126 superfamily 241754 59 755 0 1201.76 cl00286 Motor_domain superfamily - - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. Q#6831 - CGI_10005126 superfamily 216736 1634 1723 9.48E-31 119.212 cl03379 DIL superfamily - - DIL domain; The DIL domain has no known function. Q#6831 - CGI_10005126 superfamily 210118 859 876 0.00258434 37.7008 cl15479 IQ superfamily - - IQ calmodulin-binding motif; Calmodulin-binding motif. Q#6831 - CGI_10005126 superfamily 210118 810 827 0.00582265 36.9199 cl15479 IQ superfamily - - IQ calmodulin-binding motif; Calmodulin-binding motif. Q#6831 - CGI_10005126 superfamily 210118 758 780 0.00592841 36.5347 cl15479 IQ superfamily - - IQ calmodulin-binding motif; Calmodulin-binding motif. Q#6832 - CGI_10005127 superfamily 222453 991 1049 4.79E-20 86.5916 cl16474 DUF4210 superfamily - - "Domain of unknown function (DUF4210); This short domain is found in fungi, plants and animals, and the proteins appear to be necessary for chromosome segregation during meiosis." Q#6832 - CGI_10005127 superfamily 206060 1121 1178 9.25E-20 85.3972 cl16457 Chromosome_seg superfamily - - "Chromosome segregation during meiosis; The proteins come from eukaryotes, plants and animals, and are necessary for chromosome segregation during meiosis." Q#6833 - CGI_10005128 superfamily 243072 874 995 3.05E-26 107.084 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#6833 - CGI_10005128 superfamily 243072 1331 1446 6.09E-24 100.151 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#6833 - CGI_10005128 superfamily 243072 1135 1255 1.05E-23 99.3802 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#6833 - CGI_10005128 superfamily 243072 1427 1542 5.39E-23 97.4542 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#6833 - CGI_10005128 superfamily 243072 1523 1638 1.29E-22 96.2986 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#6833 - CGI_10005128 superfamily 243072 1236 1382 1.95E-21 92.8318 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#6833 - CGI_10005128 superfamily 243072 944 1052 3.58E-20 89.365 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#6833 - CGI_10005128 superfamily 247744 229 263 0.00259393 39.5592 cl17190 NK superfamily C - "Nucleoside/nucleotide kinase (NK) is a protein superfamily consisting of multiple families of enzymes that share structural similarity and are functionally related to the catalysis of the reversible phosphate group transfer from nucleoside triphosphates to nucleosides/nucleotides, nucleoside monophosphates, or sugars. Members of this family play a wide variety of essential roles in nucleotide metabolism, the biosynthesis of coenzymes and aromatic compounds, as well as the metabolism of sugar and sulfate." Q#6838 - CGI_10011307 superfamily 114912 40 60 4.41E-05 38.5713 cl17946 zf-U1 superfamily N - "U1 zinc finger; This family consists of several U1 small nuclear ribonucleoprotein C (U1-C) proteins. The U1 small nuclear ribonucleoprotein (U1 snRNP) binds to the pre-mRNA 5' splice site (ss) at early stages of spliceosome assembly. Recruitment of U1 to a class of weak 5' ss is promoted by binding of the protein TIA-1 to uridine-rich sequences immediately downstream from the 5' ss. Binding of TIA-1 in the vicinity of a 5' ss helps to stabilise U1 snRNP recruitment, at least in part, via a direct interaction with U1-C, thus providing one molecular mechanism for the function of this splicing regulator. This domain is probably a zinc-binding. It is found in multiple copies in some members of the family." Q#6839 - CGI_10011308 superfamily 247916 716 786 1.34E-13 67.7931 cl17362 Transglut_core superfamily - - "Transglutaminase-like superfamily; This family includes animal transglutaminases and other bacterial proteins of unknown function. Sequence conservation in this superfamily primarily involves three motifs that centre around conserved cysteine, histidine, and aspartate residues that form the catalytic triad in the structurally characterized transglutaminase, the human blood clotting factor XIIIa'. 
On the basis of the experimentally demonstrated activity of the Methanobacterium phage pseudomurein endoisopeptidase, it is proposed that many, if not all, microbial homologues of the transglutaminases are proteases and that the eukaryotic transglutaminases have evolved from an ancestral protease." Q#6839 - CGI_10011308 superfamily 247916 263 313 1.07E-07 50.8443 cl17362 Transglut_core superfamily N - "Transglutaminase-like superfamily; This family includes animal transglutaminases and other bacterial proteins of unknown function. Sequence conservation in this superfamily primarily involves three motifs that centre around conserved cysteine, histidine, and aspartate residues that form the catalytic triad in the structurally characterized transglutaminase, the human blood clotting factor XIIIa'. On the basis of the experimentally demonstrated activity of the Methanobacterium phage pseudomurein endoisopeptidase, it is proposed that many, if not all, microbial homologues of the transglutaminases are proteases and that the eukaryotic transglutaminases have evolved from an ancestral protease." Q#6841 - CGI_10011310 superfamily 241600 83 259 7.88E-86 257.554 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." 
Q#6846 - CGI_10011315 superfamily 241754 1 191 9.86E-104 306.16 cl00286 Motor_domain superfamily C - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. Q#6849 - CGI_10011318 superfamily 241754 24 147 1.73E-77 244.914 cl00286 Motor_domain superfamily N - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. Q#6852 - CGI_10011321 superfamily 243072 141 261 1.16E-35 134.819 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#6852 - CGI_10011321 superfamily 243072 47 193 6.73E-26 106.699 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#6852 - CGI_10011321 superfamily 243072 233 324 3.79E-12 66.253 cl02529 ANK superfamily C - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#6859 - CGI_10021027 superfamily 192060 154 229 4.88E-08 50.8207 cl09599 HbrB superfamily N - HbrB-like; HbrB is involved hyphal growth and polarity. Q#6860 - CGI_10021028 superfamily 246723 1 345 1.12E-166 491.308 cl14813 GluZincin superfamily - - "Peptidase Gluzincin family (thermolysin-like proteinases, TLPs) includes peptidases M1, M2, M3, M4, M13, M32 and M36 (fungalysins); Gluzincin family (thermolysin-like peptidases or TLPs) includes several zinc-dependent metallopeptidases such as the M1, M2, M3, M4, M13, M32, M36 peptidases (MEROPS classification), and contain HEXXH and EXXXD motifs as part of their active site. All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. M1 family includes aminopeptidase N (APN) and leukotriene A4 hydrolase (LTA4H). APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types. LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity such that the two activities occupy different, but overlapping sites. The peptidase M3 or neurolysin-like family, includes M3, M2 and M32 metallopeptidases. The M3 peptidases have two subfamilies: M3A, includes thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (3.4.24.16), and the mitochondrial intermediate peptidase; M3B contains oligopeptidase F. M2 peptidase angiotensin converting enzyme (ACE, EC 3.4.15.1) catalyzes the conversion of decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II. 
ACE is a key part of the renin-angiotensin system that regulates blood pressure, thus ACE inhibitors are important for the treatment of hypertension. M32 family includes two eukaryotic enzymes from protozoa Trypanosoma cruzi, a causative agent of Chagas' disease, and Leishmania major, a parasite that causes leishmaniasis, making them attractive targets for drug development. The M4 family includes secreted protease thermolysin (EC 3.4.24.27), pseudolysin, aureolysin, neutral protease as well as fungalysin and bacillolysin (EC 3.4.24.28) that degrade extracellular proteins and peptides for bacterial nutrition, especially prior to sporulation. Thermolysin is widely used as a nonspecific protease to obtain fragments for peptide sequencing as well as in production of the artificial sweetener aspartame. M13 family includes neprilysin (EC 3.4.24.11) and endothelin-converting enzyme I (ECE-1, EC 3.4.24.71), which fulfill a broad range of physiological roles due to the greater variation in the S2' subsite allowing substrate specificity and are prime therapeutic targets for selective inhibition. Peptidase M36 (fungamysin) family includes endopeptidases from pathogenic fungi. Fungalysin hydrolyzes extracellular matrix proteins such as elastin and keratin. Aspergillus fumigatus causes the pulmonary disease aspergillosis by invading the lungs of immuno-compromised animals and secreting fungalysin that possibly breaks down proteinaceous structural barriers." Q#6862 - CGI_10021030 superfamily 245201 240 490 3.55E-123 386.469 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. 
These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#6862 - CGI_10021030 superfamily 201217 570 616 2.17E-12 64.4692 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#6862 - CGI_10021030 superfamily 201217 620 669 4.22E-12 63.6988 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#6862 - CGI_10021030 superfamily 201217 672 721 6.43E-11 60.232 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#6862 - CGI_10021030 superfamily 201217 793 838 3.86E-08 52.1428 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#6862 - CGI_10021030 superfamily 205718 552 581 0.00030031 40.1662 cl16296 RCC1_2 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#6862 - CGI_10021030 superfamily 201217 725 789 0.000672167 39.4312 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#6863 - CGI_10021031 superfamily 238211 26 259 9.64E-117 338.096 cl18908 TS_Pyrimidine_HMase superfamily - - "Thymidylate synthase and pyrimidine hydroxymethylase: Thymidylate synthase (TS) and deoxycytidylate hydroxymethylase (dCMP-HMase) are homologs that catalyze analogous alkylation of C5 of pyrimidine nucleotides. Both enzymes are involved in the biosynthesis of DNA precursors and are active as homodimers. However, they exhibit distinct pyrimidine base specificities and differ in the details of their catalyzed reactions. 
TS is biologically ubiquitous and catalyzes the conversion of dUMP and methylene-tetrahydrofolate (CH2THF) to dTMP and dihydrofolate (DHF). It also acts as a regulator of its own expression by binding and inactivating its own RNA. Due to its key role in the de novo pathway for thymidylate synthesis and, hence, DNA synthesis, it is one of the most conserved enzymes across species and phyla. TS is a well-recognized target for anticancer chemotherapy, as well as a valuable new target against infectious diseases. Interestingly, in several protozoa, a single polypeptide chain codes for both, dihydrofolate reductase (DHFR) and thymidylate synthase (TS), forming a bifunctional enzyme (DHFR-TS), possibly through gene fusion at a single evolutionary point. DHFR-TS is also active as a dimer. Virus encoded dCMP-HMase catalyzes the reversible conversion of dCMP and CH2THF to hydroxymethyl-dCMP and THF. This family also includes dUMP hydroxymethylase, which is encoded by several bacteriophages that infect Bacillus subtilis, for their own protection against the host restriction system, and contain hydroxymethyl-dUMP instead of dTMP in their DNA." Q#6865 - CGI_10021033 superfamily 243069 104 207 8.37E-20 86.8174 cl02525 Band_7 superfamily C - "The band 7 domain of flotillin (reggie) like proteins. This group contains proteins similar to stomatin, prohibitin, flotillin, HflK/C and podocin. Many of these band 7 domain-containing proteins are lipid raft-associated. Individual proteins of this band 7 domain family may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. Microdomains formed from flotillin proteins may in addition be dynamic units with their own regulatory functions. Flotillins have been implicated in signal transduction, vesicle trafficking, cytoskeleton rearrangement and are known to interact with a variety of proteins. 
Stomatin interacts with and regulates members of the degenerin/epithelia Na+ channel family in mechanosensory cells of Caenorhabditis elegans and vertebrate neurons and participates in trafficking of Glut1 glucose transporters. Prohibitin may act as a chaperone for the stabilization of mitochondrial proteins. Prokaryotic HflK/C plays a role in the decision between lysogenic and lytic cycle growth during lambda phage infection. Flotillins have been implicated in the progression of prion disease, in the pathogenesis of neurodegenerative diseases such as Parkinson's and Alzheimer's disease and, in cancer invasion and metastasis. Mutations in the podicin gene give rise to autosomal recessive steroid resistant nephritic syndrome" Q#6865 - CGI_10021033 superfamily 242376 307 390 9.29E-13 64.1686 cl01225 SCP2 superfamily - - "SCP-2 sterol transfer family; This domain is involved in binding sterols. It is found in the SCP2 protein, as well as the C terminus of the enzyme estradiol 17 beta-dehydrogenase EC:1.1.1.62. The UNC-24 protein contains an SPFH domain pfam01145." Q#6868 - CGI_10021036 superfamily 243096 53 111 7.64E-08 46.522 cl02571 RhoGEF superfamily C - Guanine nucleotide exchange factor for Rho/Rac/Cdc42-like GTPases; Also called Dbl-homologous (DH) domain. It appears that PH domains invariably occur C-terminal to RhoGEF/DH domains. Q#6869 - CGI_10021037 superfamily 243096 16 90 1.64E-06 43.0552 cl02571 RhoGEF superfamily NC - Guanine nucleotide exchange factor for Rho/Rac/Cdc42-like GTPases; Also called Dbl-homologous (DH) domain. It appears that PH domains invariably occur C-terminal to RhoGEF/DH domains. Q#6871 - CGI_10021039 superfamily 243095 68 261 1.03E-60 206.381 cl02570 RhoGAP superfamily - - "RhoGAP: GTPase-activator protein (GAP) for Rho-like GTPases; GAPs towards Rho/Rac/Cdc42-like small GTPases. Small GTPases (G proteins) cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when bound to GDP. 
The Rho family of small G proteins, which includes Cdc42Hs, activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. G proteins generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. The RhoGAPs are one of the major classes of regulators of Rho G proteins." Q#6872 - CGI_10021040 superfamily 215647 63 264 9.70E-11 59.5445 cl18338 7tm_2 superfamily - - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GPCRs). They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#6873 - CGI_10021041 superfamily 243199 5 87 6.52E-09 54.2194 cl02808 RT_like superfamily N - "RT_like: Reverse transcriptase (RT, RNA-dependent DNA polymerase)_like family. An RT gene is usually indicative of a mobile element such as a retrotransposon or retrovirus. RTs occur in a variety of mobile elements, including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, and caulimoviruses. 
These elements can be divided into two major groups. One group contains retroviruses and DNA viruses whose propagation involves an RNA intermediate. They are grouped together with transposable elements containing long terminal repeats (LTRs). The other group, also called poly(A)-type retrotransposons, contain fungal mitochondrial introns and transposable elements that lack LTRs." Q#6874 - CGI_10021042 superfamily 245201 32 278 3.45E-67 211.61 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#6875 - CGI_10021043 superfamily 217473 110 371 2.04E-38 144.43 cl03978 Mab-21 superfamily - - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. 
Q#6875 - CGI_10021043 superfamily 243034 665 692 0.00367429 36.2416 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#6876 - CGI_10021044 superfamily 247745 261 512 5.05E-173 509.241 cl17191 GH38-57_N_LamB_YdjC_SF superfamily - - "Catalytic domain of glycoside hydrolase (GH) families 38 and 57, lactam utilization protein LamB/YcsF family proteins, YdjC-family proteins, and similar proteins; The superfamily possesses strong sequence similarities across a wide range of all three kingdoms of life. It mainly includes four families, glycoside hydrolases family 38 (GH38), heat stable retaining glycoside hydrolases family 57 (GH57), lactam utilization protein LamB/YcsF family, and YdjC-family. 
The GH38 family corresponds to class II alpha-mannosidases (alphaMII, EC 3.2.1.24), which contain intermediate Golgi alpha-mannosidases II, acidic lysosomal alpha-mannosidases, animal sperm and epididymal alpha -mannosidases, neutral ER/cytosolic alpha-mannosidases, and some putative prokaryotic alpha-mannosidases. AlphaMII possess a-1,3, a-1,6, and a-1,2 hydrolytic activity, and catalyzes the degradation of N-linked oligosaccharides by employing a two-step mechanism involving the formation of a covalent glycosyl enzyme complex. GH57 is a purely prokaryotic family with the majority of thermostable enzymes from extremophiles (many of them are archaeal hyperthermophiles), which exhibit the enzyme specificities of alpha-amylase (EC 3.2.1.1), 4-alpha-glucanotransferase (EC 2.4.1.25), amylopullulanase (EC 3.2.1.1/41), and alpha-galactosidase (EC 3.2.1.22). This family also includes many hypothetical proteins with uncharacterized activity and specificity. GH57 cleaves alpha-glycosidic bond by employing a retaining mechanism, which involves a glycosyl-enzyme intermediate, allowing transglycosylation. Although the exact molecular function of LamB/YcsF family and YdjC-family remains unclear, they show high sequence and structure homology to the members of GH38 and GH57. Their catalytic domains adopt a similar parallel 7-stranded beta/alpha barrel, which is remotely related to catalytic NodB homology domain of the carbohydrate esterase 4 superfamily." Q#6876 - CGI_10021044 superfamily 245003 527 610 4.77E-21 89.5634 cl08536 Alpha-mann_mid superfamily - - "Alpha mannosidase, middle domain; Members of this family adopt a structure consisting of three alpha helices, in an immunoglobulin/albumin-binding domain-like fold. They are predominantly found in the enzyme alpha-mannosidase." Q#6877 - CGI_10021045 superfamily 217473 113 354 2.56E-32 126.711 cl03978 Mab-21 superfamily - - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. 
elegans these proteins are required for several aspects of embryonic development. Q#6879 - CGI_10021047 superfamily 248097 2 103 1.27E-15 67.6754 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#6880 - CGI_10021048 superfamily 248097 177 303 9.42E-20 82.6982 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#6882 - CGI_10021050 superfamily 248097 243 348 7.24E-14 68.4458 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#6883 - CGI_10021051 superfamily 248097 10 84 1.12E-12 59.9714 cl17543 C1q superfamily C - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#6884 - CGI_10021052 superfamily 248097 26 114 4.81E-14 64.5938 cl17543 C1q superfamily C - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#6885 - CGI_10021053 superfamily 246918 258 304 1.95E-10 57.2115 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#6885 - CGI_10021053 superfamily 246918 308 368 7.65E-09 52.5891 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#6888 - CGI_10021056 superfamily 246911 159 262 9.75E-32 116.964 cl15262 PUB superfamily - - "PNGase/UBA or UBX (PUB) domain of p97 adaptor proteins; The PUB domain is found in p97 adaptor proteins such as PNGase, UBXD1 (UBX domain-containing protein 1), and RNF31 (RING finger protein 31). It functions as a p97 (also known as valosin-containing protein or VCP) adaptor by interacting with the D1 and/or D2 ATPase domains. 
The p97, a type II AAA+ ATPase, is involved in a variety of cellular processes such as the deglycosylation of ERAD substrates, membrane fusion, transcription factor activation and cell cycle regulation through differential binding to specific adaptor proteins. The PUB domain in UBX-domain protein 1 (UBXD1), which is widely expressed in higher eukaryotes (except for fungi) and which is involved in substrate recruitment to p97, interacts strongly with the C-terminus of p97. Peptide:N-glycanase (PNGase), a deglycosylating enzyme that functions in proteasome-dependent degradation of misfolded glycoproteins which are translocated from the endoplasmic reticulum (ER) to the cytosol during ERAD, associates with the ubiquitin-proteasome system proteins mediated by the N-terminal PUB domain. PNGase is present in all eukaryotic organisms; however, the yeast PNGase ortholog does not contain the PUB domain. The RNF31 protein, also known as HOIP or Zibra, contains an N-terminal PUB domain similar to those in PNGase and UBXD1, suggesting its association with p97." Q#6888 - CGI_10021056 superfamily 241645 336 407 0.00648506 34.5544 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." 
Q#6892 - CGI_10007180 superfamily 245029 8 126 6.09E-11 55.7316 cl09190 MAPEG superfamily - - "MAPEG family; This family has been called MAPEG (Membrane Associated Proteins in Eicosanoid and Glutathione metabolism). It includes proteins such as Prostaglandin E synthase. This enzyme catalyzes the synthesis of PGE2 from PGH2 (produced by cyclooxygenase from arachidonic acid). Because of structural similarities in the active sites of FLAP, LTC4 synthase and PGE synthase, substrates for each enzyme can compete with one another and modulate synthetic activity." Q#6893 - CGI_10007181 superfamily 218140 335 819 6.68E-132 406.984 cl04579 Anoctamin superfamily - - "Calcium-activated chloride channel; The family carries eight putative transmembrane domains, and, although it has no similarity to other known channel proteins, it is clearly a calcium-activated ionic channel. It is expressed in various secretory epithelia, the retina and sensory neurons, and mediates receptor-activated chloride currents in diverse physiological processes." Q#6894 - CGI_10007182 superfamily 218140 304 781 4.06E-135 416.614 cl04579 Anoctamin superfamily - - "Calcium-activated chloride channel; The family carries eight putative transmembrane domains, and, although it has no similarity to other known channel proteins, it is clearly a calcium-activated ionic channel. It is expressed in various secretory epithelia, the retina and sensory neurons, and mediates receptor-activated chloride currents in diverse physiological processes." Q#6895 - CGI_10007183 superfamily 241798 9 327 0 563.141 cl00338 ALAD_PBGS superfamily - - "Porphobilinogen synthase (PBGS), which is also called delta-aminolevulinic acid dehydratase (ALAD), catalyzes the condensation of two 5-aminolevulinic acid (ALA) molecules to form the pyrrole porphobilinogen (PBG), which is the second step in the biosynthesis of tetrapyrroles, such as heme, vitamin B12 and chlorophyll. 
This reaction involves the formation of a Schiff base link between the substrate and the enzyme. PBGSs are metalloenzymes, some of which have a second, allosteric metal binding site, beside the metal ion binding site in their active site. Although PBGS is a family of homologous enzymes, its metal ion utilization at catalytic site varies between zinc and magnesium and/or potassium. PBGS can be classified into two groups based on differences in their active site metal binding site. They either contain a cysteine-rich zinc binding site (consensus DXCXCX(Y/F)X3G(H/Q)CG) or an aspartate-rich magnesium binding site (consensus DXALDX(Y/F)X3G(H/Q)DG). The cysteine-rich zinc binding site appears more common. Most members represented by this model also have a second allosteric magnesium binding site (consensus RX~164DX~65EXXXD, missing in a eukaryotic subfamily with cysteine-rich zinc binding site)." Q#6896 - CGI_10007184 superfamily 247068 473 564 1.57E-07 51.1602 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6896 - CGI_10007184 superfamily 247068 1349 1440 3.10E-07 50.3898 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6896 - CGI_10007184 superfamily 247068 668 727 4.04E-06 46.923 cl15786 CA_like superfamily C - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6896 - CGI_10007184 superfamily 247068 1546 1605 1.41E-05 45.3822 cl15786 CA_like superfamily C - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6896 - CGI_10007184 superfamily 247068 382 461 0.00152808 38.8338 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6896 - CGI_10007184 superfamily 247068 1250 1337 0.00894543 36.5226 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#6898 - CGI_10007186 superfamily 241629 233 367 1.27E-59 191.269 cl00133 SCP superfamily - - "SCP: SCP-like extracellular protein domain, found in eukaryotes and prokaryotes. This family includes plant pathogenesis-related protein 1 (PR-1), which accumulates after infections with pathogens, and may act as an anti-fungal agent or be involved in cell wall loosening. This family also includes CRISPs, mammalian cysteine-rich secretory proteins, which combine SCP with a C-terminal cysteine rich domain, and allergen 5 from vespid venom. Roles for CRISP, in response to pathogens, fertilization, and sperm maturation have been proposed. One member, Tex31 from the venom duct of Conus textile, has been shown to possess proteolytic activity sensitive to serine protease inhibitors. The human GAPR-1 protein has been reported to dimerize, and such a dimer may form an active site containing a catalytic triad. SCP has also been proposed to be a Ca++ chelating serine protease. The Ca++-chelating function would fit with various signaling processes that members of this family, such as the CRISPs, are involved in, and is supported by sequence and structural evidence of a conserved pocket containing two histidines and a glutamate. It also may explain how helothermine, a toxic peptide secreted by the beaded lizard, blocks Ca++ transporting ryanodine receptors. Little is known about the biological roles of the bacterial and archaeal SCP domains." Q#6898 - CGI_10007186 superfamily 241629 1 76 3.65E-30 113.074 cl00133 SCP superfamily N - "SCP: SCP-like extracellular protein domain, found in eukaryotes and prokaryotes. This family includes plant pathogenesis-related protein 1 (PR-1), which accumulates after infections with pathogens, and may act as an anti-fungal agent or be involved in cell wall loosening. 
This family also includes CRISPs, mammalian cysteine-rich secretory proteins, which combine SCP with a C-terminal cysteine rich domain, and allergen 5 from vespid venom. Roles for CRISP, in response to pathogens, fertilization, and sperm maturation have been proposed. One member, Tex31 from the venom duct of Conus textile, has been shown to possess proteolytic activity sensitive to serine protease inhibitors. The human GAPR-1 protein has been reported to dimerize, and such a dimer may form an active site containing a catalytic triad. SCP has also been proposed to be a Ca++ chelating serine protease. The Ca++-chelating function would fit with various signaling processes that members of this family, such as the CRISPs, are involved in, and is supported by sequence and structural evidence of a conserved pocket containing two histidines and a glutamate. It also may explain how helothermine, a toxic peptide secreted by the beaded lizard, blocks Ca++ transporting ryanodine receptors. Little is known about the biological roles of the bacterial and archaeal SCP domains." Q#6899 - CGI_10007187 superfamily 245206 4 259 3.67E-95 283.148 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. 
Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#6900 - CGI_10007188 superfamily 245206 5 139 9.39E-37 136.387 cl09931 NADB_Rossmann superfamily C - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#6901 - CGI_10002908 superfamily 241600 24 93 7.01E-19 77.2807 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." 
Q#6902 - CGI_10002909 superfamily 218912 18 150 1.04E-46 162.429 cl18485 COG2 superfamily - - "COG (conserved oligomeric Golgi) complex component, COG2; The COG complex comprises eight proteins COG1-8. The COG complex plays critical roles in Golgi structure and function. The proposed function of the complex is to mediate the initial physical contact between transport vesicles and their membrane targets. A comparable role in tethering vesicles has been suggested for at least six additional large multisubunit complexes, including the exocyst, a complex that mediates trafficking to the plasma membrane. COG2 structure reveals a six-helix bundle with few conserved surface features but a general resemblance to recently determined crystal structures of four different exocyst subunits. These bundles in COG2 may act as platforms for interaction with other trafficking proteins including SNAREs (soluble N-ethylmaleimide factor attachment protein receptors) and Rabs." Q#6902 - CGI_10002909 superfamily 221383 574 698 1.84E-31 119.69 cl13459 DUF3510 superfamily - - Domain of unknown function (DUF3510); This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is about 130 amino acids in length. This domain is found associated with pfam06148. Q#6903 - CGI_10002910 superfamily 246680 712 791 0.000153356 40.8364 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 
They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#6905 - CGI_10024310 superfamily 241574 61 105 3.57E-20 83.0189 cl00053 PTPc superfamily C - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#6906 - CGI_10024311 superfamily 241574 118 182 0.00127467 37.9506 cl00053 PTPc superfamily N - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#6907 - CGI_10024312 superfamily 241574 2 97 2.52E-27 107.672 cl00053 PTPc superfamily N - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. 
This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#6907 - CGI_10024312 superfamily 241574 188 387 4.65E-24 98.4269 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#6910 - CGI_10024315 superfamily 247905 93 232 3.32E-18 78.8188 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#6910 - CGI_10024315 superfamily 247805 6 62 0.000124287 40.7055 cl17251 DEXDc superfamily N - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. 
This domain contains the ATP-binding region. Q#6911 - CGI_10024316 superfamily 247999 496 546 6.84E-07 47.4852 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#6912 - CGI_10024317 superfamily 241574 10 177 6.93E-58 191.645 cl00053 PTPc superfamily N - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#6912 - CGI_10024317 superfamily 241574 207 427 3.40E-26 104.975 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." 
Q#6913 - CGI_10024318 superfamily 245847 11 93 1.11E-06 46.7268 cl12042 FA58C superfamily C - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#6914 - CGI_10024319 superfamily 246597 7 287 0 623.864 cl13995 MPP_superfamily superfamily - - "metallophosphatase superfamily, metallophosphatase domain; Metallophosphatases (MPPs), also known as metallophosphoesterases, phosphodiesterases (PDEs), binuclear metallophosphoesterases, and dimetal-containing phosphoesterases (DMPs), represent a diverse superfamily of enzymes with a conserved domain containing an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. This superfamily includes: the phosphoprotein phosphatases (PPPs), Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination." Q#6915 - CGI_10024320 superfamily 243069 72 199 1.33E-53 177.363 cl02525 Band_7 superfamily - - "The band 7 domain of flotillin (reggie) like proteins. This group contains proteins similar to stomatin, prohibitin, flotillin, HlfK/C and podicin. Many of these band 7 domain-containing proteins are lipid raft-associated. Individual proteins of this band 7 domain family may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. Microdomains formed from flotillin proteins may in addition be dynamic units with their own regulatory functions. 
Flotillins have been implicated in signal transduction, vesicle trafficking, cytoskeleton rearrangement and are known to interact with a variety of proteins. Stomatin interacts with and regulates members of the degenerin/epithelia Na+ channel family in mechanosensory cells of Caenorhabditis elegans and vertebrate neurons and participates in trafficking of Glut1 glucose transporters. Prohibitin may act as a chaperone for the stabilization of mitochondrial proteins. Prokaryotic HflK/C plays a role in the decision between lysogenic and lytic cycle growth during lambda phage infection. Flotillins have been implicated in the progression of prion disease, in the pathogenesis of neurodegenerative diseases such as Parkinson's and Alzheimer's disease and, in cancer invasion and metastasis. Mutations in the podicin gene give rise to autosomal recessive steroid resistant nephritic syndrome" Q#6916 - CGI_10024321 superfamily 217867 8 176 1.02E-107 308.405 cl04383 P21-Arc superfamily - - ARP2/3 complex ARPC3 (21 kDa) subunit; The seven component ARP2/3 actin-organising complex is involved in actin assembly and function. Q#6917 - CGI_10024322 superfamily 247757 7 140 1.82E-06 46.4705 cl17203 Fer4_NifH superfamily C - "The Fer4_NifH superfamily contains a variety of proteins which share a common ATP-binding domain. Functionally, proteins in this superfamily use the energy from hydrolysis of NTP to transfer electron or ion." 
Q#6918 - CGI_10024323 superfamily 243034 207 302 3.39E-19 81.2723 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#6918 - CGI_10024323 superfamily 243034 61 156 3.40E-10 56.2344 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of 
TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#6919 - CGI_10024324 superfamily 243034 472 567 1.97E-19 84.3539 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 
subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#6919 - CGI_10024324 superfamily 243034 326 421 1.30E-10 58.9308 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#6920 - CGI_10024325 superfamily 242113 18 251 1.53E-54 177.904 cl00814 Cyclase superfamily - - Putative cyclase; Proteins in this family are thought to be cyclase enzymes. They are found in proteins involved in antibiotic synthesis. However they are also found in organisms that do not make antibiotics pointing to a wider role for these proteins. The proteins contain a conserved motif HXGTHXDXPXH that is likely to form part of the active site. 
Q#6922 - CGI_10024327 superfamily 243035 20 89 2.03E-09 49.8401 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#6924 - CGI_10024329 superfamily 220692 48 367 5.97E-22 93.8081 cl18570 7TM_GPCR_Srw superfamily - - Serpentine type 7TM GPCR chemoreceptor Srw; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srw is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. The genes encoding Srw do not appear to be under as strong an adaptive evolutionary pressure as those of Srz. Q#6925 - CGI_10024330 superfamily 192535 88 184 8.09E-05 42.9682 cl18179 7TM_GPCR_Srsx superfamily C - Serpentine type 7TM GPCR chemoreceptor Srsx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srsx is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. 
Q#6929 - CGI_10024334 superfamily 241672 5 294 1.16E-61 199.555 cl00192 ribokinase_pfkB_like superfamily - - "ribokinase/pfkB superfamily: Kinases that accept a wide variety of substrates, including carbohydrates and aromatic small molecules, all are phosphorylated at a hydroxyl group. The superfamily includes ribokinase, fructokinase, ketohexokinase, 2-dehydro-3-deoxygluconokinase, 1-phosphofructokinase, the minor 6-phosphofructokinase (PfkB), inosine-guanosine kinase, and adenosine kinase. Even though there is a high degree of structural conservation within this superfamily, their multimerization level varies widely, monomeric (e.g. adenosine kinase), dimeric (e.g. ribokinase), and trimeric (e.g THZ kinase)." Q#6930 - CGI_10024335 superfamily 219565 481 549 5.53E-07 50.9521 cl06690 DUF1619 superfamily N - Protein of unknown function (DUF1619); This is a family of sequences derived from hypothetical eukaryotic proteins. The region in question is approximately 330 residues long and has a cysteine rich amino-terminus. Q#6930 - CGI_10024335 superfamily 219565 216 387 0.000629668 41.3221 cl06690 DUF1619 superfamily C - Protein of unknown function (DUF1619); This is a family of sequences derived from hypothetical eukaryotic proteins. The region in question is approximately 330 residues long and has a cysteine rich amino-terminus. Q#6931 - CGI_10024336 superfamily 248145 7 99 7.08E-10 57.2595 cl17591 CAF1 superfamily C - CAF1 family ribonuclease; The major pathways of mRNA turnover in eukaryotes initiate with shortening of the polyA tail. CAF1 encodes a critical component of the major cytoplasmic deadenylase in yeast. Both Caf1p is required for normal mRNA deadenylation in vivo and localises to the cytoplasm. Caf1p copurifies with a Ccr4p-dependent polyA-specific exonuclease activity. Some members of this family include and inserted RNA binding domain pfam01424. This family of proteins is related to other exonucleases pfam00929 (Bateman A pers. obs.). 
The crystal structure of Saccharomyces cerevisiae Pop2 has been resolved at 2.3 Angstrom resolution. Q#6931 - CGI_10024336 superfamily 248145 171 227 0.00188687 37.6144 cl17591 CAF1 superfamily NC - CAF1 family ribonuclease; The major pathways of mRNA turnover in eukaryotes initiate with shortening of the polyA tail. CAF1 encodes a critical component of the major cytoplasmic deadenylase in yeast. Both Caf1p is required for normal mRNA deadenylation in vivo and localises to the cytoplasm. Caf1p copurifies with a Ccr4p-dependent polyA-specific exonuclease activity. Some members of this family include and inserted RNA binding domain pfam01424. This family of proteins is related to other exonucleases pfam00929 (Bateman A pers. obs.). The crystal structure of Saccharomyces cerevisiae Pop2 has been resolved at 2.3 Angstrom resolution. Q#6936 - CGI_10024341 superfamily 218247 25 257 2.33E-34 126.337 cl04727 Pex2_Pex12 superfamily - - "Pex2 / Pex12 amino terminal region; This region is found at the N terminal of a number of known and predicted peroxins including Pex2, Pex10 and Pex12. This conserved region is usually associated with a C terminal ring finger (pfam00097) domain."
Q#6936 - CGI_10024341 superfamily 247792 288 325 4.92E-05 40.1849 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#6937 - CGI_10024342 superfamily 243058 121 231 1.56E-08 53.8575 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#6937 - CGI_10024342 superfamily 244760 820 929 3.22E-36 133.588 cl07618 B2-adapt-app_C superfamily - - "Beta2-adaptin appendage, C-terminal sub-domain; Members of this family adopt a structure consisting of a 5 stranded beta-sheet, flanked by one alpha helix on the outer side, and by two alpha helices on the inner side. This domain is required for binding to clathrin, and its subsequent polymerisation. 
Furthermore, a hydrophobic patch present in the domain also binds to a subset of D-phi-F/W motif-containing proteins that are bound by the alpha-adaptin appendage domain (epsin, AP180, eps15)." Q#6937 - CGI_10024342 superfamily 243521 711 811 1.39E-15 74.2018 cl03759 Alpha_adaptinC2 superfamily - - "Adaptin C-terminal domain; Alpha adaptin is a heterotetramer which regulates clathrin-bud formation. The carboxyl-terminal appendage of the alpha subunit regulates translocation of endocytic accessory proteins to the bud site. This ig-fold domain is found in alpha, beta and gamma adaptins." Q#6938 - CGI_10024343 superfamily 243306 11 237 5.91E-140 394.988 cl03114 RNase_PH superfamily - - "RNase PH-like 3'-5' exoribonucleases; RNase PH-like 3'-5' exoribonucleases are enzymes that catalyze the 3' to 5' processing and decay of RNA substrates. Evolutionarily related members can be found in prokaryotes, archaea, and eukaryotes. Bacterial ribonuclease PH contains a single copy of this domain, and removes nucleotide residues following the -CCA terminus of tRNA. Polyribonucleotide nucleotidyltransferase (PNPase) contains two tandem copies of the domain and is involved in mRNA degradation in a 3'-5' direction. Archaeal exosomes contain two individually encoded RNase PH-like 3'-5' exoribonucleases and are required for 3' processing of the 5.8S rRNA. The eukaryotic exosome core is composed of six individually encoded RNase PH-like subunits, but it is not a phosphorolytic enzyme per se; it directly associates with Rrp44 and Rrp6, which are hydrolytic exoribonucleases related to bacterial RNase II/R and RNase D. All members of the RNase PH-like family form ring structures by oligomerization of six domains or subunits, except for a total of 3 subunits with tandem repeats in the case of PNPase, with a central channel through which the RNA substrate must pass to gain access to the phosphorolytic active sites."
Q#6939 - CGI_10024344 superfamily 217247 6 66 8.68E-06 42.3802 cl18397 Glyco_hydro_2_C superfamily NC - "Glycosyl hydrolases family 2, TIM barrel domain; This family contains beta-galactosidase, beta-mannosidase and beta-glucuronidase activities." Q#6942 - CGI_10024347 superfamily 149667 34 219 8.35E-19 81.2627 cl07343 GON superfamily - - GON domain; The GON domain is found in the ADAMTS (a disintegrin and metalloproteinase domain with thrombospondin type-1 modules) family of proteins. It contains several conserved cysteine residues. Q#6943 - CGI_10024348 superfamily 241600 1 187 5.36E-63 196.307 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#6944 - CGI_10004049 superfamily 115363 201 249 4.38E-08 48.9074 cl05972 MIB_HERC2 superfamily - - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). Q#6945 - CGI_10004050 superfamily 115363 178 228 5.62E-09 52.3742 cl05972 MIB_HERC2 superfamily - - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). Q#6945 - CGI_10004050 superfamily 115363 243 279 8.87E-06 43.1294 cl05972 MIB_HERC2 superfamily C - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). 
Q#6945 - CGI_10004050 superfamily 207713 364 393 0.00185095 36.5285 cl02729 WWE superfamily C - WWE domain; The WWE domain is named after three of its conserved residues and is predicted to mediate specific protein- protein interactions in ubiquitin and ADP ribose conjugation systems. Q#6946 - CGI_10004051 superfamily 207713 4 33 0.000229597 34.2173 cl02729 WWE superfamily C - WWE domain; The WWE domain is named after three of its conserved residues and is predicted to mediate specific protein- protein interactions in ubiquitin and ADP ribose conjugation systems. Q#6948 - CGI_10004053 superfamily 247725 29 199 2.08E-99 299.516 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#6948 - CGI_10004053 superfamily 246908 414 507 1.88E-44 152.885 cl15255 SH2 superfamily - - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." 
Q#6949 - CGI_10004054 superfamily 247805 472 525 6.57E-06 45.7912 cl17251 DEXDc superfamily C - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#6949 - CGI_10004054 superfamily 192286 104 255 4.52E-97 306.274 cl18178 UPF1_Zn_bind superfamily - - RNA helicase (UPF2 interacting domain); UPF1 is an essential RNA helicase that detects mRNAs containing premature stop codons and triggers their degradation. This domain contains 3 zinc binding motifs and forms interactions with another protein (UPF2) that is also involved nonsense-mediated mRNA decay (NMD). Q#6949 - CGI_10004054 superfamily 221913 663 860 5.65E-78 254.772 cl18626 AAA_12 superfamily - - AAA domain; This family of domains contain a P-loop motif that is characteristic of the AAA superfamily. Many of the proteins in this family are conjugative transfer proteins. Q#6952 - CGI_10004528 superfamily 242372 32 72 2.81E-09 49.6775 cl01221 DTW superfamily C - DTW domain; This presumed domain is found in bacterial and eukaryotic proteins. Its function is unknown. The domain contains multiple conserved motifs including a DTXW motif that this domain has been named after. Q#6953 - CGI_10004529 superfamily 217293 32 230 7.64E-29 115.808 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#6953 - CGI_10004529 superfamily 220608 424 536 5.44E-25 101.999 cl10859 G8 superfamily - - G8 domain; This domain is found in disease proteins PKHD1 and KIAA1199 and is named G8 after its 8 conserved glycines. It is predicted to contain 10 beta strands and an alpha helix. 
Q#6953 - CGI_10004529 superfamily 202474 237 315 1.20E-10 61.1305 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#6954 - CGI_10004530 superfamily 241573 25 244 3.13E-72 232.222 cl00051 CysPc superfamily N - "Calpains, domains IIa, IIb; calcium-dependent cytoplasmic cysteine proteinases, papain-like. Functions in cytoskeletal remodeling processes, cell differentiation, apoptosis and signal transduction." Q#6954 - CGI_10004530 superfamily 241653 260 408 2.77E-30 113.955 cl00165 Calpain_III superfamily - - "Calpain, subdomain III. Calpains are calcium-activated cytoplasmic cysteine proteinases, participate in cytoskeletal remodeling processes, cell differentiation, apoptosis and signal transduction. Catalytic domain and the two calmodulin-like domains are separated by C2-like domain III. Domain III plays an important role in calcium-induced activation of calpain involving electrostatic interactions with subdomain II. Proposed to mediate calpain's interaction with phospholipids and translocation to cytoplasmic/nuclear membranes. CD includes subdomain III of typical and atypical calpains." Q#6955 - CGI_10004531 superfamily 245213 74 109 6.59E-06 41.083 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#6955 - CGI_10004531 superfamily 245213 37 72 0.000246727 36.8458 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#6955 - CGI_10004531 superfamily 245213 111 146 0.000392372 36.4606 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#6956 - CGI_10000862 superfamily 246669 56 188 1.68E-74 227.932 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. 
Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#6957 - CGI_10002928 superfamily 246722 1 163 1.55E-15 74.4549 cl14812 PIN_SF superfamily - - "PIN (PilT N terminus) domain: Superfamily; PIN_SF The PIN (PilT N terminus) domain belongs to a large nuclease superfamily with representatives from eukaryota, eubacteria, and archaea. PIN domains were originally named for their sequence similarity to the N-terminal domain of an annotated pili biogenesis protein, PilT, a domain fusion between a PIN-domain and a PilT ATPase domain. The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues) is geometrically similar in the active center of structure-specific 5' nucleases (also known as Flap endonuclease-1-like), PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Seen here, are two major divisions in the PIN domain superfamily. The first major division, the structure-specific 5' nuclease family, is represented by FEN1, the 5'-3' exonuclease of DNA polymerase I, and T4 RNase H nuclease PIN domains. These 5' nucleases are involved in DNA replication, repair, and recombination. They are capable of both 5'-3' exonucleolytic activity and cleaving bifurcated DNA, in an endonucleolytic, structure-specific manner. 
Unique to FEN1-like nucleases, the PIN domain has a helical arch/clamp region (I domain) of variable length (approximately 16 to 800 residues) and, inserted within the C-terminal region of the PIN domain, a H3TH (helix-3-turn-helix) domain, an atypical helix-hairpin-helix-2-like region. Both the H3TH domain (not included here) and the helical arch/clamp region are involved in DNA binding. With the exception of Mkt1, these nucleases have a carboxylate rich active site that is involved in binding essential divalent metal ion cofactors (Mg2+, Mn2+, Zn2+, or Co2+). The second major division of the PIN domain superfamily, the VapC-Smg6 family, includes such eukaryotic ribonucleases as, Smg6, an essential factor in nonsense-mediated mRNA decay; Rrp44, the catalytic subunit of the exosome; and Nob1, a ribosome assembly factor critical in pre-rRNA processing. A large percentage of members in this family are bacterial ribonuclease toxins of TA operons such as Mycobacterium tuberculosis VapC and Neisseria gonorrhoeae FitB, as well as, archaeal homologs, Pyrobaculum aerophilum Pea0151 and P. aerophilum Pae2754. Also included are the eukaryotic Fcf1/ Utp24 (FAF1-copurifying factor 1/U three-associated protein 24) and Utp23-like proteins. Components of the small subunit processome, Fcf1/Utp24 and Utp23 are essential proteins involved in pre-rRNA processing and 40S ribosomal subunit assembly." Q#6958 - CGI_10002929 superfamily 241867 106 229 9.59E-08 49.818 cl00446 Lactamase_B superfamily N - Metallo-beta-lactamase superfamily; Metallo-beta-lactamase superfamily. Q#6958 - CGI_10002929 superfamily 241867 75 103 0.000206567 39.4422 cl00446 Lactamase_B superfamily C - Metallo-beta-lactamase superfamily; Metallo-beta-lactamase superfamily. 
Q#6962 - CGI_10007093 superfamily 246613 83 314 6.98E-141 403.313 cl14058 lectin_L-type superfamily - - "legume lectins; The L-type (legume-type) lectins are a highly diverse family of carbohydrate binding proteins that generally display no enzymatic activity toward the sugars they bind. This family includes arcelin, concanavalinA, the lectin-like receptor kinases, the ERGIC-53/VIP36/EMP46 type1 transmembrane proteins, and an alpha-amylase inhibitor. L-type lectins have a dome-shaped beta-barrel carbohydrate recognition domain with a curved seven-stranded beta-sheet referred to as the "front face" and a flat six-stranded beta-sheet referred to as the "back face". This domain homodimerizes so that adjacent back sheets form a contiguous 12-stranded sheet and homotetramers occur by a back-to-back association of these homodimers. Though L-type lectins exhibit both sequence and structural similarity to one another, their carbohydrate binding specificities differ widely." Q#6964 - CGI_10007095 superfamily 248318 289 349 1.36E-10 56.6753 cl17764 FYVE superfamily - - "FYVE domain; Zinc-binding domain; targets proteins to membrane lipids via interaction with phosphatidylinositol-3-phosphate, PI3P; present in Fab1, YOTB, Vac1, and EEA1;" Q#6965 - CGI_10007096 superfamily 246669 2400 2515 2.58E-42 154.236 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. 
However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#6965 - CGI_10007096 superfamily 246708 2223 2377 1.37E-41 151.9 cl14781 PI-PLC-Y superfamily - - "Phosphatidylinositol-specific phospholipase C, Y domain; This associates with pfam00388 to form a single structural unit." Q#6965 - CGI_10007096 superfamily 246675 2055 2131 1.13E-35 139.214 cl14615 PI-PLCc_GDPD_SF superfamily NC - "Catalytic domain of phosphoinositide-specific phospholipase C-like phosphodiesterases superfamily; The PI-PLC-like phosphodiesterases superfamily represents the catalytic domains of bacterial phosphatidylinositol-specific phospholipase C (PI-PLC, EC 4.6.1.13), eukaryotic phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11), glycerophosphodiester phosphodiesterases (GP-GDE, EC 3.1.4.46), sphingomyelinases D (SMases D) (sphingomyelin phosphodiesterase D, EC 3.1.4.41) from spider venom, SMases D-like proteins, and phospholipase D (PLD) from several pathogenic bacteria, as well as their uncharacterized homologs found in organisms ranging from bacteria and archaea to metazoans, plants, and fungi. PI-PLCs are ubiquitous enzymes hydrolyzing the membrane lipid phosphoinositides to yield two important second messengers, inositol phosphates and diacylglycerol (DAG). GP-GDEs play essential roles in glycerol metabolism and catalyze the hydrolysis of glycerophosphodiesters to sn-glycerol-3-phosphate (G3P) and the corresponding alcohols that are major sources of carbon and phosphate. 
Both, PI-PLCs and GP-GDEs, can hydrolyze the 3'-5' phosphodiester bonds in different substrates, and utilize a similar mechanism of general base and acid catalysis with conserved histidine residues, which consists of two steps, a phosphotransfer and a phosphodiesterase reaction. This superfamily also includes Neurospora crassa ankyrin repeat protein NUC-2 and its Saccharomyces cerevisiae counterpart, Phosphate system positive regulatory protein PHO81, glycerophosphodiester phosphodiesterase (GP-GDE)-like protein SHV3 and SHV3-like proteins (SVLs). The residues essential for enzyme activities and metal binding are not conserved in these sequence homologs, which might suggest that the function of catalytic domains in these proteins might be distinct from those in typical PLC-like phosphodiesterases." Q#6965 - CGI_10007096 superfamily 243053 1364 1592 2.43E-27 114.265 cl02485 RasGEF superfamily - - "Guanine nucleotide exchange factor for Ras-like small GTPases. Small GTP-binding proteins of the Ras superfamily function as molecular switches in fundamental events such as signal transduction, cytoskeleton dynamics and intracellular trafficking. Guanine-nucleotide-exchange factors (GEFs) positively regulate these GTP-binding proteins in response to a variety of signals. GEFs catalyze the dissociation of GDP from the inactive GTP-binding proteins. GTP can then bind and induce structural changes that allow interaction with effectors." Q#6965 - CGI_10007096 superfamily 241645 2701 2797 5.60E-24 100.348 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. 
The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#6965 - CGI_10007096 superfamily 241645 2570 2661 3.82E-11 62.5984 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#6968 - CGI_10003488 superfamily 247805 1 82 7.46E-09 48.4876 cl17251 DEXDc superfamily C - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#6971 - CGI_10010929 superfamily 245201 1022 1262 8.87E-57 197.459 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. 
These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#6971 - CGI_10010929 superfamily 243045 352 395 0.00046076 40.3091 cl02459 PAS superfamily C - "PAS domain; PAS motifs appear in archaea, eubacteria and eukarya. Probably the most surprising identification of a PAS domain was that in EAG-like K+-channels. PAS domains have been found to bind ligands, and to act as sensors for light and oxygen in signal transduction." Q#6972 - CGI_10010930 superfamily 247058 11 205 1.05E-59 190.464 cl15762 crotonase-like superfamily - - "Crotonase/Enoyl-Coenzyme A (CoA) hydratase superfamily. This superfamily contains a diverse set of enzymes including enoyl-CoA hydratase, naphthoate synthase, methylmalonyl-CoA decarboxylase, 3-hydroxybutyryl-CoA dehydratase, and dienoyl-CoA isomerase. Many of these play important roles in fatty acid metabolism. In addition to a conserved structural core and the formation of trimers (or dimers of trimers), a common feature in this superfamily is the stabilization of an enolate anion intermediate derived from an acyl-CoA substrate. This is accomplished by two conserved backbone NH groups in active sites that form an oxyanion hole." Q#6973 - CGI_10010931 superfamily 244539 1133 1312 2.81E-42 155.926 cl06868 FNR_like superfamily - - "Ferredoxin reductase (FNR), an FAD and NAD(P) binding protein, was initially identified as a chloroplast reductase activity, catalyzing the electron transfer from reduced iron-sulfur protein ferredoxin to NADP+ as the final step in the electron transport mechanism of photosystem I. FNR transfers electrons from reduced ferredoxin to FAD (forming FADH2 via a semiquinone intermediate) and then transfers a hydride ion to convert NADP+ to NADPH. 
FNR has since been shown to utilize a variety of electron acceptors and donors and has a variety of physiological functions including nitrogen assimilation, dinitrogen fixation, steroid hydroxylation, fatty acid metabolism, oxygenase activity, and methane assimilation in many organisms. FNR has an NAD(P)-binding sub-domain of the alpha/beta class and a discrete (usually N-terminal) flavin sub-domain which vary in orientation with respect to the NAD(P) binding domain. The N-terminal moiety may contain a flavin prosthetic group (as in flavoenzymes) or use flavin as a substrate. Because flavins such as FAD can exist in oxidized, semiquinone (one-electron reduced), or fully reduced hydroquinone forms, FNR can interact with one- and two-electron carriers. FNR has a strong preference for NADP(H) vs NAD(H)." Q#6973 - CGI_10010931 superfamily 247856 693 754 1.89E-06 47.1573 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#6973 - CGI_10010931 superfamily 246664 9 487 8.33E-166 513.386 cl14561 An_peroxidase_like superfamily - - "Animal heme peroxidases and related proteins; A diverse family of enzymes, which includes prostaglandin G/H synthase, thyroid peroxidase, myeloperoxidase, linoleate diol synthase, lactoperoxidase, peroxinectin, peroxidasin, and others. Despite its name, this family is not restricted to metazoans: members are found in fungi, plants, and bacteria as well." Q#6973 - CGI_10010931 superfamily 203841 1371 1473 1.17E-13 70.4444 cl17716 NAD_binding_6 superfamily N - Ferric reductase NAD binding domain; Ferric reductase NAD binding domain. 
Q#6973 - CGI_10010931 superfamily 242267 948 1092 1.48E-09 57.6852 cl01043 Ferric_reduct superfamily - - "Ferric reductase like transmembrane component; This family includes a common region in the transmembrane proteins mammalian cytochrome B-245 heavy chain (gp91-phox), ferric reductase transmembrane component in yeast and respiratory burst oxidase from mouse-ear cress. This may be a family of flavocytochromes capable of moving electrons across the plasma membrane. The Frp1 protein from S. pombe is a ferric reductase component and is required for cell surface ferric reductase activity, mutants in frp1 are deficient in ferric iron uptake. Cytochrome B-245 heavy chain is a FAD-dependent dehydrogenase it is also has electron transferase activity which reduces molecular oxygen to superoxide anion, a precursor in the production of microbicidal oxidants. Mutations in the sequence of cytochrome B-245 heavy chain (gp91-phox) lead to the X-linked chronic granulomatous disease. The bacteriocidal ability of phagocytic cells is reduced and is characterized by the absence of a functional plasma membrane associated NADPH oxidase. The chronic granulomatous disease gene codes for the beta chain of cytochrome B-245 and cytochrome B-245 is missing from patients with the disease." Q#6974 - CGI_10010932 superfamily 150820 15 280 6.85E-51 175.407 cl10894 DuoxA superfamily - - "Dual oxidase maturation factor; DuoxA (Dual oxidase maturation factor) is the essential protein necessary for the final release of DUOX2 (an NADPH:O2 oxidoreductase flavoprotein) from the endoplasmic reticulum. Dual oxidases (DUOX1 and DUOX2) constitute the catalytic core of the hydrogen peroxide generator, which generates H2O2 at the apical membrane of thyroid follicular cells, essential for iodination of thyroglobulin by thyroid peroxidases. DuoxA carries five membrane-integral regions including a reverse signal-anchor with external N-terminus (type III) and two N-glycosylation sites. 
It is conserved from nematodes to humans." Q#6977 - CGI_10010935 superfamily 245814 298 371 6.04E-11 59.056 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#6977 - CGI_10010935 superfamily 245814 393 477 8.75E-11 58.2856 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#6977 - CGI_10010935 superfamily 245814 182 267 0.000235973 39.6018 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. 
A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#6977 - CGI_10010935 superfamily 245814 79 154 0.000685329 37.7713 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#6977 - CGI_10010935 superfamily 245814 12 45 0.00709076 35.0287 cl11960 Ig superfamily N - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#6978 - CGI_10010936 superfamily 241578 203 343 4.67E-38 139.344 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. 
The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#6978 - CGI_10010936 superfamily 219345 479 567 0.000937407 40.9739 cl06326 Phlebovirus_G1 superfamily C - Phlebovirus glycoprotein G1; This family consists of several Phlebovirus glycoprotein G1 sequences. Members of the Bunyaviridae family acquire an envelope by budding through the lipid bilayer of the Golgi complex. The budding compartment is thought to be determined by the accumulation of the two heterodimeric membrane glycoproteins G1 and G2 in the Golgi. Q#6978 - CGI_10010936 superfamily 243119 583 614 0.0013632 37.4234 cl02629 CBM_14 superfamily N - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. 
Q#6982 - CGI_10010940 superfamily 241615 58 97 1.96E-06 45.9393 cl00107 LysM superfamily - - "Lysine Motif is a small domain involved in binding peptidoglycan; LysM, a small globular domain with approximately 40 amino acids, is a widespread protein module involved in binding peptidoglycan in bacteria and chitin in eukaryotes. The domain was originally identified in enzymes that degrade bacterial cell walls, but proteins involved in many other biological functions also contain this domain. It has been reported that the LysM domain functions as a signal for specific plant-bacteria recognition in bacterial pathogenesis. Many of these enzymes are modular and are composed of catalytic units linked to one or several repeats of LysM domains. LysM domains are found in bacteria and eukaryotes." Q#6982 - CGI_10010940 superfamily 242902 775 881 2.45E-36 135.527 cl02144 TLD superfamily C - TLD; This domain is predicted to be an enzyme and is often found associated with pfam01476. Q#6983 - CGI_10010941 superfamily 217293 69 214 3.09E-19 85.7623 cl03788 Neur_chan_LBD superfamily C - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#6985 - CGI_10010943 superfamily 248312 12 181 0.00167194 36.5625 cl17758 PMP22_Claudin superfamily - - PMP-22/EMP/MP20/Claudin family; PMP-22/EMP/MP20/Claudin family. Q#6986 - CGI_10010944 superfamily 248312 21 190 2.34E-05 41.9553 cl17758 PMP22_Claudin superfamily - - PMP-22/EMP/MP20/Claudin family; PMP-22/EMP/MP20/Claudin family. Q#6987 - CGI_10010945 superfamily 248312 16 191 2.18E-07 47.7333 cl17758 PMP22_Claudin superfamily - - PMP-22/EMP/MP20/Claudin family; PMP-22/EMP/MP20/Claudin family. Q#6988 - CGI_10010946 superfamily 248312 16 191 2.18E-07 47.7333 cl17758 PMP22_Claudin superfamily - - PMP-22/EMP/MP20/Claudin family; PMP-22/EMP/MP20/Claudin family. 
Q#6991 - CGI_10010949 superfamily 247786 1 265 6.53E-110 322.676 cl17232 F420_oxidored superfamily - - NADP oxidoreductase coenzyme F420-dependent; NADP oxidoreductase coenzyme F420-dependent. Q#6992 - CGI_10008168 superfamily 247986 3 107 7.53E-11 60.0794 cl17432 PBPb superfamily C - "Bacterial periplasmic transport systems use membrane-bound complexes and substrate-bound, membrane-associated, periplasmic binding proteins (PBPs) to transport a wide variety of substrates, such as, amino acids, peptides, sugars, vitamins and inorganic ions. PBPs have two cell-membrane translocation functions: bind substrate, and interact with the membrane bound complex. A diverse group of periplasmic transport receptors for lysine/arginine/ornithine (LAO), glutamine, histidine, sulfate, phosphate, molybdate, and methanol are included in the PBPb CD." Q#6992 - CGI_10008168 superfamily 247986 222 357 2.79E-06 46.5974 cl17432 PBPb superfamily N - "Bacterial periplasmic transport systems use membrane-bound complexes and substrate-bound, membrane-associated, periplasmic binding proteins (PBPs) to transport a wide variety of substrates, such as, amino acids, peptides, sugars, vitamins and inorganic ions. PBPs have two cell-membrane translocation functions: bind substrate, and interact with the membrane bound complex. A diverse group of periplasmic transport receptors for lysine/arginine/ornithine (LAO), glutamine, histidine, sulfate, phosphate, molybdate, and methanol are included in the PBPb CD." Q#6993 - CGI_10008169 superfamily 247986 389 479 6.49E-15 74.3318 cl17432 PBPb superfamily C - "Bacterial periplasmic transport systems use membrane-bound complexes and substrate-bound, membrane-associated, periplasmic binding proteins (PBPs) to transport a wide variety of substrates, such as, amino acids, peptides, sugars, vitamins and inorganic ions. PBPs have two cell-membrane translocation functions: bind substrate, and interact with the membrane bound complex. 
A diverse group of periplasmic transport receptors for lysine/arginine/ornithine (LAO), glutamine, histidine, sulfate, phosphate, molybdate, and methanol are included in the PBPb CD." Q#6993 - CGI_10008169 superfamily 247986 634 738 1.41E-06 48.5234 cl17432 PBPb superfamily N - "Bacterial periplasmic transport systems use membrane-bound complexes and substrate-bound, membrane-associated, periplasmic binding proteins (PBPs) to transport a wide variety of substrates, such as, amino acids, peptides, sugars, vitamins and inorganic ions. PBPs have two cell-membrane translocation functions: bind substrate, and interact with the membrane bound complex. A diverse group of periplasmic transport receptors for lysine/arginine/ornithine (LAO), glutamine, histidine, sulfate, phosphate, molybdate, and methanol are included in the PBPb CD." Q#6993 - CGI_10008169 superfamily 245225 42 368 6.19E-76 253.387 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily - - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein couples receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine- binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. 
In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#6996 - CGI_10008172 superfamily 155088 505 658 1.18E-36 136.957 cl02758 AMOP superfamily - - AMOP domain; This domain may have a role in cell adhesion. It is called the AMOP domain after Adhesion associated domain in MUC4 and Other Proteins. This domain is extracellular and contains a number of cysteines that probably form disulphide bridges. Q#6996 - CGI_10008172 superfamily 243124 139 265 2.26E-15 75.1561 cl02648 NIDO superfamily - - Nidogen-like; This is a nidogen-like domain (NIDO) domain and is an extracellular domain found in nidogen and hypothetical proteins of unknown function. Q#6997 - CGI_10008173 superfamily 155088 506 667 3.91E-31 121.164 cl02758 AMOP superfamily - - AMOP domain; This domain may have a role in cell adhesion. It is called the AMOP domain after Adhesion associated domain in MUC4 and Other Proteins. This domain is extracellular and contains a number of cysteines that probably form disulphide bridges. Q#6997 - CGI_10008173 superfamily 243124 123 270 1.36E-17 81.7044 cl02648 NIDO superfamily - - Nidogen-like; This is a nidogen-like domain (NIDO) domain and is an extracellular domain found in nidogen and hypothetical proteins of unknown function. 
Q#6997 - CGI_10008173 superfamily 243065 662 843 2.06E-06 47.7817 cl02516 VWD superfamily - - von Willebrand factor type D domain; Luciferin-2-monooxygenase from Vargula hilgendorfii contains a vwd domain. Its function is unrelated but the similarity is very strong by several methods. Q#6998 - CGI_10008174 superfamily 155088 474 617 2.37E-31 120.779 cl02758 AMOP superfamily - - AMOP domain; This domain may have a role in cell adhesion. It is called the AMOP domain after Adhesion associated domain in MUC4 and Other Proteins. This domain is extracellular and contains a number of cysteines that probably form disulphide bridges. Q#7000 - CGI_10008176 superfamily 217293 1 181 8.07E-34 125.438 cl03788 Neur_chan_LBD superfamily N - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#7000 - CGI_10008176 superfamily 202474 190 269 1.51E-29 113.903 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#7000 - CGI_10008176 superfamily 202474 330 356 0.000593472 39.5593 cl08379 Neur_chan_memb superfamily N - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#7001 - CGI_10008177 superfamily 217293 35 246 8.05E-37 134.683 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#7001 - CGI_10008177 superfamily 202474 253 437 7.07E-33 123.918 cl08379 Neur_chan_memb superfamily - - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. 
Q#7002 - CGI_10008178 superfamily 247068 464 559 1.80E-28 111.251 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#7002 - CGI_10008178 superfamily 247068 362 456 7.80E-22 92.3765 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#7002 - CGI_10008178 superfamily 247068 567 664 8.67E-22 92.3765 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#7002 - CGI_10008178 superfamily 247068 242 341 1.53E-21 91.6061 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#7002 - CGI_10008178 superfamily 247068 132 234 8.66E-16 75.0425 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#7002 - CGI_10008178 superfamily 247068 686 772 3.05E-13 67.3385 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#7002 - CGI_10008178 superfamily 247068 23 101 3.24E-05 43.2503 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#7003 - CGI_10008179 superfamily 247068 470 565 2.17E-28 111.251 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#7003 - CGI_10008179 superfamily 247068 573 668 1.96E-23 96.9989 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#7003 - CGI_10008179 superfamily 247068 249 347 1.73E-21 91.2209 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#7003 - CGI_10008179 superfamily 247068 139 241 5.43E-18 81.2057 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#7003 - CGI_10008179 superfamily 247068 368 461 5.14E-17 78.5093 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#7003 - CGI_10008179 superfamily 247068 688 775 7.58E-12 63.4865 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#7003 - CGI_10008179 superfamily 247068 23 130 4.59E-08 51.9306 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#7005 - CGI_10008181 superfamily 247068 417 512 6.05E-25 100.851 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#7005 - CGI_10008181 superfamily 247068 521 621 9.08E-20 86.2133 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#7005 - CGI_10008181 superfamily 247068 199 301 2.91E-19 84.6725 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#7005 - CGI_10008181 superfamily 247068 325 408 1.83E-16 76.5833 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#7005 - CGI_10008181 superfamily 247068 93 191 5.19E-16 75.0425 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#7005 - CGI_10008181 superfamily 247068 634 717 3.69E-13 66.9533 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#7005 - CGI_10008181 superfamily 247068 9 81 5.68E-05 42.3006 cl15786 CA_like superfamily N - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#7007 - CGI_10008183 superfamily 214781 195 255 4.43E-09 54.6556 cl02747 NRF superfamily C - N-terminal domain in C. elegans NRF-6 (Nose Resistant to Fluoxetine-4) and NDG-4 (resistant to nordihydroguaiaretic acid-4); Also present in several other worm and fly proteins. 
Q#7008 - CGI_10005849 superfamily 245814 3 79 2.24E-06 42.4763 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#7010 - CGI_10005851 superfamily 241596 36 101 2.28E-12 58.7647 cl00081 HLH superfamily - - "Helix-loop-helix domain, found in specific DNA-binding proteins that act as transcription factors; 60-100 amino acids long. A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE (5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins."
Q#7011 - CGI_10005852 superfamily 244881 99 397 1.19E-174 494.447 cl08267 ISOPREN_C2_like superfamily - - "This group contains class II terpene cyclases, protein prenyltransferases beta subunit, two broadly specific proteinase inhibitors alpha2-macroglobulin (alpha (2)-M) and pregnancy zone protein (PZP) and, the C3 C4 and C5 components of vertebrate complement. Class II terpene cyclases include squalene cyclase (SQCY) and 2,3-oxidosqualene cyclase (OSQCY), these integral membrane proteins catalyze a cationic cyclization cascade converting linear triterpenes to fused ring compounds. The protein prenyltransferases include protein farnesyltransferase (FTase) and geranylgeranyltransferase types I and II (GGTase-I and GGTase-II) which catalyze the carboxyl-terminal lipidation of Ras, Rab, and several other cellular signal transduction proteins, facilitating membrane associations and specific protein-protein interactions. Alpha (2)-M is a major carrier protein in serum and involved in the immobilization and entrapment of proteases. PZP is a pregnancy associated protein. Alpha (2)-M and PZP are known to bind to and, may modulate, the activity of placental protein-14 in T-cell growth and cytokine production thereby protecting the allogeneic fetus from attack by the maternal immune system." Q#7012 - CGI_10005853 superfamily 218015 123 206 9.16E-13 63.3035 cl08416 FBA superfamily N - "F-box associated region; Members of this family are associated with F-box domains, hence the name FBA. This domain is probably involved in binding other proteins that will be targeted for ubiquitination. Human FBXO2 is involved in binding to N-glycosylated proteins." Q#7012 - CGI_10005853 superfamily 243074 8 52 8.25E-11 55.2053 cl02535 F-box-like superfamily - - F-box-like; This is an F-box-like family. 
Q#7013 - CGI_10005854 superfamily 243092 1822 1979 2.00E-15 78.1456 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#7014 - CGI_10005855 superfamily 245040 24 70 3.36E-05 37.4077 cl09238 CY superfamily C - "Cystatin-like domain; Cystatins are a family of cysteine protease inhibitors that occur mainly as single domain proteins. However some extracellular proteins such as kininogen, His-rich glycoprotein and fetuin also contain these domains." 
Q#7018 - CGI_10015706 superfamily 247684 21 238 9.80E-61 199.426 cl17037 NBD_sugar-kinase_HSP70_actin superfamily C - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#7019 - CGI_10015707 superfamily 247684 3 418 1.15E-89 284.555 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#7021 - CGI_10015709 superfamily 247684 3 418 1.09E-74 245.265 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#7022 - CGI_10015710 superfamily 247684 1 412 5.14E-82 264.91 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#7023 - CGI_10015711 superfamily 202484 23 70 1.09E-13 60.3204 cl03798 zf-Tim10_DDP superfamily N - Tim10/DDP family zinc finger; Putative zinc binding domain with four conserved cysteine residues. This domain is found in the human disease protein TIMM8A. Members of this family such as Tim9 and Tim10 are involved in mitochondrial protein import. Members of this family seem to be localised to the mitochondrial intermembrane space. 
Q#7024 - CGI_10015712 superfamily 245835 710 910 9.76E-123 373.197 cl12013 BAR superfamily - - "The Bin/Amphiphysin/Rvs (BAR) domain, a dimerization module that binds membranes and detects membrane curvature; BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions including organelle biogenesis, membrane trafficking or remodeling, and cell division and migration. Mutations in BAR containing proteins have been linked to diseases and their inactivation in cells leads to altered membrane dynamics. A BAR domain with an additional N-terminal amphipathic helix (an N-BAR) can drive membrane curvature. These N-BAR domains are found in amphiphysins and endophilins, among others. BAR domains are also frequently found alongside domains that determine lipid specificity, such as the Pleckstrin Homology (PH) and Phox Homology (PX) domains which are present in beta centaurins (ACAPs and ASAPs) and sorting nexins, respectively. A FES-CIP4 Homology (FCH) domain together with a coiled coil region is called the F-BAR domain and is present in Pombe/Cdc15 homology (PCH) family proteins, which include Fes/Fes tyrosine kinases, PACSIN or syndapin, CIP4-like proteins, and srGAPs, among others. The Inverse (I)-BAR or IRSp53/MIM homology Domain (IMD) is found in multi-domain proteins, such as IRSp53 and MIM, that act as scaffolding proteins and transducers of a variety of signaling pathways that link membrane dynamics and the underlying actin cytoskeleton. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The I-BAR domain induces membrane protrusions in the opposite direction compared to classical BAR and F-BAR domains, which produce membrane invaginations. 
BAR domains that also serve as protein interaction domains include those of arfaptin and OPHN1-like proteins, among others, which bind to Rac and Rho GAP domains, respectively." Q#7024 - CGI_10015712 superfamily 243047 13 129 4.43E-34 127.736 cl02464 ArfGap superfamily - - "Putative GTPase activating protein for Arf; Putative zinc fingers with GTPase activating proteins (GAPs) towards the small GTPase, Arf. The GAP of ARD1 stimulates GTPase hydrolysis for ARD1 but not ARFs." Q#7024 - CGI_10015712 superfamily 220441 444 545 0.000780191 40.8066 cl18557 DUF2076 superfamily NC - "Uncharacterized protein conserved in bacteria (DUF2076); This domain, found in various hypothetical prokaryotic proteins, has no known function. The domain, however, is found in various periplasmic ligand-binding sensor proteins." Q#7025 - CGI_10015713 superfamily 247725 1406 1577 1.84E-48 173.197 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#7025 - CGI_10015713 superfamily 243052 584 689 2.78E-31 120.901 cl02480 MyTH4 superfamily - - "MyTH4 domain; Domain in myosin and kinesin tails, present twice in myosin-VIIa, and also present in 3 other myosins." Q#7025 - CGI_10015713 superfamily 247725 880 1006 2.90E-30 119.233 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. 
All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#7025 - CGI_10015713 superfamily 243052 1142 1253 6.83E-25 102.797 cl02480 MyTH4 superfamily - - "MyTH4 domain; Domain in myosin and kinesin tails, present twice in myosin-VIIa, and also present in 3 other myosins." Q#7025 - CGI_10015713 superfamily 243052 225 333 1.24E-22 97.4311 cl02480 MyTH4 superfamily C - "MyTH4 domain; Domain in myosin and kinesin tails, present twice in myosin-VIIa, and also present in 3 other myosins." Q#7025 - CGI_10015713 superfamily 242889 1856 1969 3.75E-11 62.2354 cl02111 PCI superfamily - - "PCI domain; This domain has also been called the PINT motif (Proteasome, Int-6, Nip-1 and TRIP-15)." Q#7025 - CGI_10015713 superfamily 243052 489 520 0.00605472 38.1104 cl02480 MyTH4 superfamily C - "MyTH4 domain; Domain in myosin and kinesin tails, present twice in myosin-VIIa, and also present in 3 other myosins." Q#7025 - CGI_10015713 superfamily 243052 172 203 0.00605472 38.1104 cl02480 MyTH4 superfamily C - "MyTH4 domain; Domain in myosin and kinesin tails, present twice in myosin-VIIa, and also present in 3 other myosins." Q#7027 - CGI_10015715 superfamily 241790 1 71 6.86E-43 144.091 cl00330 Ribosomal_S8 superfamily C - Ribosomal protein S8; Ribosomal protein S8. Q#7031 - CGI_10015719 superfamily 243062 126 228 2.03E-53 169.379 cl02510 TGF_beta superfamily - - Transforming growth factor beta like domain; Transforming growth factor beta like domain. Q#7032 - CGI_10015720 superfamily 216062 48 245 5.90E-33 121.776 cl02928 TGFb_propeptide superfamily - - TGF-beta propeptide; This propeptide is known as latency associated peptide (LAP) in TGF-beta. LAP is a homodimer which is disulfide linked to TGF-beta binding protein. 
Q#7033 - CGI_10015721 superfamily 215647 364 575 1.62E-11 63.3964 cl18338 7tm_2 superfamily - - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GPCRs). They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#7035 - CGI_10015723 superfamily 241600 4 74 2.35E-29 104.63 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." 
Q#7038 - CGI_10007550 superfamily 243072 141 275 1.40E-32 124.418 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#7038 - CGI_10007550 superfamily 243072 210 343 2.35E-32 123.648 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#7038 - CGI_10007550 superfamily 243072 419 545 2.02E-31 120.951 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#7038 - CGI_10007550 superfamily 243072 486 607 8.79E-31 119.025 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#7038 - CGI_10007550 superfamily 243072 284 409 4.59E-30 117.099 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#7038 - CGI_10007550 superfamily 243072 17 167 7.96E-19 84.7426 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#7039 - CGI_10007551 superfamily 246918 148 196 1.69E-12 61.8339 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#7040 - CGI_10007552 superfamily 241571 257 321 5.98E-05 40.8587 cl00049 CUB superfamily C - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#7040 - CGI_10007552 superfamily 241571 104 173 0.0014385 36.5182 cl00049 CUB superfamily C - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." 
Q#7041 - CGI_10007704 superfamily 207662 57 128 9.23E-35 124.982 cl02596 NR_DBD_like superfamily - - "DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers; DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. It interacts with a specific DNA site upstream of the target gene and modulates the rate of transcriptional initiation. Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions, from development, reproduction, to homeostasis and metabolism in animals (metazoans). The family contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). Most nuclear receptors bind as homodimers or heterodimers to their target sites, which consist of two hexameric half-sites. Specificity is determined by the half-site sequence, the relative orientation of the half-sites and the number of spacer nucleotides between the half-sites. However, a growing number of nuclear receptors have been reported to bind to DNA as monomers." Q#7041 - CGI_10007704 superfamily 245599 353 516 4.53E-21 90.3598 cl11397 NR_LBD superfamily - - "The ligand binding domain of nuclear receptors, a family of ligand-activated transcription regulators; Ligand-binding domain (LBD) of nuclear receptor (NR): Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions in metazoans, from development, reproduction, to homeostasis and metabolism. 
The superfamily contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. The members of the family include receptors of steroids, thyroid hormone, retinoids, cholesterol by-products, lipids and heme. With few exceptions, NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD)." Q#7042 - CGI_10007705 superfamily 243051 268 411 2.31E-37 139.435 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal cord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#7042 - CGI_10007705 superfamily 243051 469 639 2.22E-30 119.405 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. 
Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal cord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#7042 - CGI_10007705 superfamily 243051 691 814 1.48E-15 75.8773 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal cord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#7042 - CGI_10007705 superfamily 241571 74 176 1.25E-13 69.3634 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." 
Q#7042 - CGI_10007705 superfamily 245213 645 674 0.000277322 40.3126 cl09941 EGF_CA superfamily N - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#7042 - CGI_10007705 superfamily 245213 217 244 0.000765144 38.7718 cl09941 EGF_CA superfamily N - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#7042 - CGI_10007705 superfamily 248012 829 928 9.09E-21 89.5592 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#7043 - CGI_10007706 superfamily 207662 57 129 1.56E-31 116.122 cl02596 NR_DBD_like superfamily - - "DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers; DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. It interacts with a specific DNA site upstream of the target gene and modulates the rate of transcriptional initiation. 
Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions, from development, reproduction, to homeostasis and metabolism in animals (metazoans). The family contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). Most nuclear receptors bind as homodimers or heterodimers to their target sites, which consist of two hexameric half-sites. Specificity is determined by the half-site sequence, the relative orientation of the half-sites and the number of spacer nucleotides between the half-sites. However, a growing number of nuclear receptors have been reported to bind to DNA as monomers." Q#7043 - CGI_10007706 superfamily 245599 343 507 1.58E-20 88.819 cl11397 NR_LBD superfamily - - "The ligand binding domain of nuclear receptors, a family of ligand-activated transcription regulators; Ligand-binding domain (LBD) of nuclear receptor (NR): Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions in metazoans, from development, reproduction, to homeostasis and metabolism. The superfamily contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. The members of the family include receptors of steroids, thyroid hormone, retinoids, cholesterol by-products, lipids and heme. With few exceptions, NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD)." 
Q#7044 - CGI_10007707 superfamily 207662 61 133 5.78E-31 114.966 cl02596 NR_DBD_like superfamily - - "DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers; DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. It interacts with a specific DNA site upstream of the target gene and modulates the rate of transcriptional initiation. Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions, from development, reproduction, to homeostasis and metabolism in animals (metazoans). The family contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). Most nuclear receptors bind as homodimers or heterodimers to their target sites, which consist of two hexameric half-sites. Specificity is determined by the half-site sequence, the relative orientation of the half-sites and the number of spacer nucleotides between the half-sites. However, a growing number of nuclear receptors have been reported to bind to DNA as monomers." Q#7044 - CGI_10007707 superfamily 245599 379 537 2.37E-20 88.4338 cl11397 NR_LBD superfamily - - "The ligand binding domain of nuclear receptors, a family of ligand-activated transcription regulators; Ligand-binding domain (LBD) of nuclear receptor (NR): Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions in metazoans, from development, reproduction, to homeostasis and metabolism. 
The superfamily contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. The members of the family include receptors of steroids, thyroid hormone, retinoids, cholesterol by-products, lipids and heme. With few exceptions, NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD)." Q#7045 - CGI_10007708 superfamily 207662 38 109 1.76E-30 113.426 cl02596 NR_DBD_like superfamily - - "DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers; DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. It interacts with a specific DNA site upstream of the target gene and modulates the rate of transcriptional initiation. Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions, from development, reproduction, to homeostasis and metabolism in animals (metazoans). The family contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). Most nuclear receptors bind as homodimers or heterodimers to their target sites, which consist of two hexameric half-sites. Specificity is determined by the half-site sequence, the relative orientation of the half-sites and the number of spacer nucleotides between the half-sites. However, a growing number of nuclear receptors have been reported to bind to DNA as monomers." 
Q#7045 - CGI_10007708 superfamily 245599 340 495 5.43E-27 106.923 cl11397 NR_LBD superfamily - - "The ligand binding domain of nuclear receptors, a family of ligand-activated transcription regulators; Ligand-binding domain (LBD) of nuclear receptor (NR): Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions in metazoans, from development, reproduction, to homeostasis and metabolism. The superfamily contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. The members of the family include receptors of steroids, thyroid hormone, retinoids, cholesterol by-products, lipids and heme. With few exceptions, NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD)." Q#7046 - CGI_10007709 superfamily 207662 100 171 1.52E-31 118.048 cl02596 NR_DBD_like superfamily - - "DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers; DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. It interacts with a specific DNA site upstream of the target gene and modulates the rate of transcriptional initiation. Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions, from development, reproduction, to homeostasis and metabolism in animals (metazoans). The family contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). 
Most nuclear receptors bind as homodimers or heterodimers to their target sites, which consist of two hexameric half-sites. Specificity is determined by the half-site sequence, the relative orientation of the half-sites and the number of spacer nucleotides between the half-sites. However, a growing number of nuclear receptors have been reported to bind to DNA as monomers." Q#7046 - CGI_10007709 superfamily 245599 497 671 7.82E-26 105.383 cl11397 NR_LBD superfamily - - "The ligand binding domain of nuclear receptors, a family of ligand-activated transcription regulators; Ligand-binding domain (LBD) of nuclear receptor (NR): Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions in metazoans, from development, reproduction, to homeostasis and metabolism. The superfamily contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. The members of the family include receptors of steroids, thyroid hormone, retinoids, cholesterol by-products, lipids and heme. With few exceptions, NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD)." Q#7047 - CGI_10007710 superfamily 242922 102 184 1.36E-41 137.313 cl02176 TAF11 superfamily - - "TATA Binding Protein (TBP) Associated Factor 11 (TAF11) is one of several TAFs that bind TBP and is involved in forming Transcription Factor IID (TFIID) complex; The TATA Binding Protein (TBP) Associated Factor 11 (TAF11) is one of several TAFs that bind TBP and are involved in forming the Transcription Factor IID (TFIID) complex. TFIID is one of seven General Transcription Factors (GTF) (TFIIA, TFIIB, TFIID, TFIIE, TFIIF, and TFIID) that are involved in accurate initiation of transcription by RNA polymerase II in eukaryotes. 
TFIID plays an important role in the recognition of promoter DNA and assembly of the pre-initiation complex. TFIID complex is composed of the TBP and at least 13 TAFs. TAFs from various species were originally named by their predicted molecular weight or their electrophoretic mobility in polyacrylamide gels. A new, unified nomenclature for the pol II TAFs has been suggested to show the relationship between TAF orthologs and paralogs. Several hypotheses are proposed for TAFs functions such as serving as activator-binding sites, core-promoter recognition or a role in essential catalytic activity. TAF11 interacts with the ligand binding domains of the nuclear receptors for vitamin D3 and thyroid hormone. TAF11 also directly interacts with TFIIA, acting as a bridging factor that stabilizes the TFIIA-TBP-DNA complex. Each TAF, with the help of a specific activator, is required only for the expression of subset of genes and is not universally involved for transcription as are GTFs. In yeast and human cells, TAFs have been found as components of other complexes besides TFIID. Several TAFs interact via histone-fold (HFD) motifs; HFD is the interaction motif involved in heterodimerization of the core histones and their assembly into nucleosome octamers. The minimal HFD contains three alpha-helices linked by two loops and is found in core histones, TAFS and many other transcription factors. TFIID has a histone octamer-like substructure. The TAF11 domain is structurally analogous to histone H3 and interacts with TAF13, making a novel histone-like heterodimer. The dimer may be structurally and functionally similar to the spt3 protein within the SAGA histone acetyltransferase complex." Q#7048 - CGI_10007711 superfamily 245201 99 469 0 710.289 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. 
It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#7049 - CGI_10007712 superfamily 218671 18 193 1.31E-68 214.586 cl05290 NKAIN superfamily - - "Na,K-Atpase Interacting protein; NKAIN (Na,K-Atpase INteracting) proteins are a family of evolutionary conserved transmembrane proteins that localise to neurons, that are critical for neuronal function, and that interact with the beta subunits, beta1 in vertebrates and beta in Drosophila, of Na,K-ATPase. NKAINs have highly conserved trans-membrane domains but otherwise no other characterized domains. NKAINs may function as subunits of pore or channel structures in neurons or they may affect the function of other membrane proteins. They are likely to function within the membrane bilayer." Q#7053 - CGI_10004688 superfamily 247755 1317 1537 3.55E-113 357.189 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." 
Q#7053 - CGI_10004688 superfamily 247755 671 871 2.75E-110 348.305 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#7053 - CGI_10004688 superfamily 216049 366 627 1.18E-34 135.877 cl18356 ABC_membrane superfamily - - ABC transporter transmembrane region; This family represents a unit of six transmembrane helices. Many members of the ABC transporter family (pfam00005) have two such regions. Q#7053 - CGI_10004688 superfamily 216049 999 1270 3.04E-32 128.558 cl18356 ABC_membrane superfamily - - ABC transporter transmembrane region; This family represents a unit of six transmembrane helices. Many members of the ABC transporter family (pfam00005) have two such regions. Q#7054 - CGI_10004689 superfamily 245598 3 196 3.29E-50 170.923 cl11396 Patatin_and_cPLA2 superfamily - - "Patatins and Phospholipases; Patatin-like phospholipase. This family consists of various patatin glycoproteins from plants. The patatin protein accounts for up to 40% of the total soluble protein in potato tubers. Patatin is a storage protein, but it also has the enzymatic activity of a lipid acyl hydrolase, catalyzing the cleavage of fatty acids from membrane lipids. Members of this family have also been found in vertebrates. This family also includes the catalytic domain of cytosolic phospholipase A2 (PLA2; EC 3.1.1.4) hydrolyzes the sn-2-acyl ester bond of phospholipids to release arachidonic acid. 
At the active site, cPLA2 contains a serine nucleophile through which the catalytic mechanism is initiated. The active site is partially covered by a solvent-accessible flexible lid. cPLA2 displays interfacial activation as it exists in both "closed lid" and "open lid" forms." Q#7054 - CGI_10004689 superfamily 247856 300 360 6.75E-05 40.6089 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#7055 - CGI_10004690 superfamily 245598 98 263 6.31E-38 134.714 cl11396 Patatin_and_cPLA2 superfamily - - "Patatins and Phospholipases; Patatin-like phospholipase. This family consists of various patatin glycoproteins from plants. The patatin protein accounts for up to 40% of the total soluble protein in potato tubers. Patatin is a storage protein, but it also has the enzymatic activity of a lipid acyl hydrolase, catalyzing the cleavage of fatty acids from membrane lipids. Members of this family have also been found in vertebrates. This family also includes the catalytic domain of cytosolic phospholipase A2 (PLA2; EC 3.1.1.4) hydrolyzes the sn-2-acyl ester bond of phospholipids to release arachidonic acid. At the active site, cPLA2 contains a serine nucleophile through which the catalytic mechanism is initiated. The active site is partially covered by a solvent-accessible flexible lid. cPLA2 displays interfacial activation as it exists in both "closed lid" and "open lid" forms." 
Q#7057 - CGI_10007747 superfamily 245847 41 188 3.03E-17 74.9005 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#7061 - CGI_10007751 superfamily 245596 93 412 0 530.256 cl11394 Glyco_tranf_GTA_type superfamily - - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#7062 - CGI_10007752 superfamily 247804 345 391 1.42E-13 64.8959 cl17250 SANT superfamily - - "'SWI3, ADA2, N-CoR and TFIIIB' DNA-binding domains. Tandem copies of the domain bind telomeric DNA tandem repeats as part of the capping complex. Binding is sequence dependent for repeats which contain the G/C rich motif [C2-3 A (CA)1-6]. The domain is also found in regulatory transcriptional repressor complexes where it also binds DNA." 
Q#7062 - CGI_10007752 superfamily 243177 59 203 8.46E-15 71.7088 cl02779 TRFH superfamily N - "Telomeric Repeat binding Factor or TTAGGG Repeat binding Factor, central (dimerization) domain Homology; TRFH. Telomeres are protein/DNA complexes that make up the physical ends of eukaryotic linear chromosomes and are essential for chromosome stability, protecting the chromosome ends from degradation and end-to-end fusion. Proteins TRF1, TRF2 and Taz1 bind telomeric DNA and are also involved in recruiting interacting proteins, TIN2, and Rap1, to the telomeres. It has also been demonstrated that PARP1 associates with TRF2 and is capable of poly(ADP-ribosyl)ation of TRF2, which affects binding of TRF2 to telomeric DNA. TRF1, TRF2 and Taz1 proteins contain three functional domains: an N-terminal acidic domain, a central TRF-specific/dimerization domain, and a C-terminal DNA binding domain with a single Myb-like repeat. Homodimerization, a prerequisite to DNA binding, results in the juxtaposition of two Myb DNA binding domains." Q#7063 - CGI_10007753 superfamily 215827 174 343 3.06E-19 86.7535 cl02830 Tyrosinase superfamily - - Common central domain of tyrosinase; This family also contains polyphenol oxidases and some hemocyanins. Binds two copper ions via two sets of three histidines. This family is related to pfam00372. Q#7066 - CGI_10002719 superfamily 241563 61 99 1.53E-07 48.4375 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#7066 - CGI_10002719 superfamily 110440 486 510 0.000764974 37.3873 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. 
The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#7066 - CGI_10002719 superfamily 243092 306 439 0.00206062 38.8552 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#7067 - CGI_10006759 superfamily 207662 125 197 9.33E-35 124.982 cl02596 NR_DBD_like superfamily - - "DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers; DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. 
It interacts with a specific DNA site upstream of the target gene and modulates the rate of transcriptional initiation. Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions, from development, reproduction, to homeostasis and metabolism in animals (metazoans). The family contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). Most nuclear receptors bind as homodimers or heterodimers to their target sites, which consist of two hexameric half-sites. Specificity is determined by the half-site sequence, the relative orientation of the half-sites and the number of spacer nucleotides between the half-sites. However, a growing number of nuclear receptors have been reported to bind to DNA as monomers." Q#7067 - CGI_10006759 superfamily 245599 317 482 2.07E-28 110.775 cl11397 NR_LBD superfamily - - "The ligand binding domain of nuclear receptors, a family of ligand-activated transcription regulators; Ligand-binding domain (LBD) of nuclear receptor (NR): Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions in metazoans, from development, reproduction, to homeostasis and metabolism. The superfamily contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. The members of the family include receptors of steroids, thyroid hormone, retinoids, cholesterol by-products, lipids and heme. 
With few exceptions, NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD)." Q#7068 - CGI_10006760 superfamily 190261 76 151 5.99E-28 108.405 cl03504 RFX_DNA_binding superfamily - - RFX DNA-binding domain; RFX is a regulatory factor which binds to the X box of MHC class II genes and is essential for their expression. The DNA-binding domain of RFX is the central domain of the protein and binds ssDNA as either a monomer or homodimer. Q#7069 - CGI_10006761 superfamily 207662 317 393 4.82E-30 114.196 cl02596 NR_DBD_like superfamily - - "DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers; DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. It interacts with a specific DNA site upstream of the target gene and modulates the rate of transcriptional initiation. Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions, from development, reproduction, to homeostasis and metabolism in animals (metazoans). The family contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). Most nuclear receptors bind as homodimers or heterodimers to their target sites, which consist of two hexameric half-sites. Specificity is determined by the half-site sequence, the relative orientation of the half-sites and the number of spacer nucleotides between the half-sites. However, a growing number of nuclear receptors have been reported to bind to DNA as monomers." 
Q#7069 - CGI_10006761 superfamily 245599 501 732 5.45E-14 70.8413 cl11397 NR_LBD superfamily - - "The ligand binding domain of nuclear receptors, a family of ligand-activated transcription regulators; Ligand-binding domain (LBD) of nuclear receptor (NR): Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions in metazoans, from development, reproduction, to homeostasis and metabolism. The superfamily contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. The members of the family include receptors of steroids, thyroid hormone, retinoids, cholesterol by-products, lipids and heme. With few exceptions, NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD)." Q#7070 - CGI_10006762 superfamily 242577 12 109 4.79E-19 77.8651 cl01553 GFA superfamily C - Glutathione-dependent formaldehyde-activating enzyme; Glutathione-dependent formaldehyde-activating enzyme. Q#7071 - CGI_10006763 superfamily 247743 316 468 3.40E-09 55.2299 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." 
Q#7072 - CGI_10006764 superfamily 220695 20 139 0.000971035 39.0991 cl18571 7TM_GPCR_Srx superfamily C - Serpentine type 7TM GPCR chemoreceptor Srx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srx is part of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#7073 - CGI_10006765 superfamily 241563 9 40 2.14E-05 42.0812 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#7074 - CGI_10006766 superfamily 241584 8 97 0.000320435 37.4759 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#7077 - CGI_10004170 superfamily 241563 68 109 9.48E-07 46.3184 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#7078 - CGI_10009893 superfamily 241563 60 97 1.85E-05 43.0447 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#7079 - CGI_10009894 superfamily 242406 140 177 0.00765803 35.0468 cl01271 DUF1768 superfamily N - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. Q#7082 - CGI_10009897 superfamily 241563 187 223 1.35E-06 46.3184 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#7084 - CGI_10009899 superfamily 241691 294 349 1.94E-05 43.2696 cl00213 DNA_BRE_C superfamily N - "DNA breaking-rejoining enzymes, C-terminal catalytic domain. The DNA breaking-rejoining enzyme superfamily includes type IB topoisomerases and tyrosine recombinases that share the same fold in their catalytic domain containing six conserved active site residues. The best-studied members of this diverse superfamily include human topoisomerase I, the bacteriophage lambda integrase, the bacteriophage P1 Cre recombinase, the yeast Flp recombinase and the bacterial XerD/C recombinases. Their overall reaction mechanism is essentially identical and involves cleavage of a single strand of a DNA duplex by nucleophilic attack of a conserved tyrosine to give a 3' phosphotyrosyl protein-DNA adduct. In the second rejoining step, a terminal 5' hydroxyl attacks the covalent adduct to release the enzyme and generate duplex DNA. 
The enzymes differ in that topoisomerases cleave and then rejoin the same 5' and 3' termini, whereas a site-specific recombinase transfers a 5' hydroxyl generated by recombinase cleavage to a new 3' phosphate partner located in a different duplex region. Many DNA breaking-rejoining enzymes also have N-terminal domains, which show little sequence or structure similarity." Q#7084 - CGI_10009899 superfamily 221377 191 257 0.00295043 37.0631 cl13449 DUF3504 superfamily C - Domain of unknown function (DUF3504); This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 156 to 173 amino acids in length. Q#7085 - CGI_10009900 superfamily 205062 628 660 0.000746713 38.6058 cl15079 Cohesin_HEAT superfamily N - "HEAT repeat associated with sister chromatid cohesion; This HEAT repeat is found most frequently in sister chromatid cohesion proteins such as Nipped-B. HEAT repeats are found tandemly repeated in many proteins, and they appear to serve as flexible scaffolding on which other components can assemble." Q#7086 - CGI_10009901 superfamily 245819 654 830 9.65E-62 206.661 cl11967 Nucleotidyl_cyc_III superfamily - - "Class III nucleotidyl cyclases; Class III nucleotidyl cyclases are the largest, most diverse group of nucleotidyl cyclases (NC's) containing prokaryotic and eukaryotic proteins. They can be divided into two major groups; the mononucleotidyl cyclases (MNC's) and the diguanylate cyclases (DGC's). The MNC's, which include the adenylate cyclases (AC's) and the guanylate cyclases (GC's), have a conserved cyclase homology domain (CHD), while the DGC's have a conserved GGDEF domain, named after a conserved motif within this subgroup. Their products, cyclic guanylyl and adenylyl nucleotides, are second messengers that play important roles in eukaryotic signal transduction and prokaryotic sensory pathways." 
Q#7086 - CGI_10009901 superfamily 245201 359 581 1.82E-32 127.268 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#7086 - CGI_10009901 superfamily 219526 593 640 2.22E-08 53.7771 cl06648 HNOBA superfamily N - "Heme NO binding associated; The HNOBA domain is found associated with the HNOB domain and pfam00211 in soluble cyclases and signalling proteins. The HNOB domain is predicted to function as a heme-dependent sensor for gaseous ligands, and transduce diverse downstream signals, in both bacteria and animals." Q#7088 - CGI_10009903 superfamily 241699 6 93 1.34E-35 128.078 cl00221 ACBP superfamily - - Acyl CoA binding protein (ACBP) binds thiol esters of long fatty acids and coenzyme A in a one-to-one binding mode with high specificity and affinity. Acyl-CoAs are important intermediates in fatty lipid synthesis and fatty acid degradation and play a role in regulation of intermediary metabolism and gene regulation. The suggested role of ACBP is to act as a intracellular acyl-CoA transporter and pool former. ACBPs are present in a large group of eukaryotic species and several tissue-specific isoforms have been detected. Q#7089 - CGI_10009904 superfamily 248458 285 477 3.28E-07 50.7753 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. 
MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#7090 - CGI_10009905 superfamily 243152 106 233 1.23E-29 112.382 cl02712 PGRP superfamily - - "Peptidoglycan recognition proteins (PGRPs) are pattern recognition receptors that bind, and in certain cases, hydrolyze peptidoglycans (PGNs) of bacterial cell walls. PGRPs have been divided into three classes: short PGRPs (PGRP-S), that are small (20 kDa) extracellular proteins; intermediate PGRPs (PGRP-I) that are 40-45 kDa and are predicted to be transmembrane proteins; and long PGRPs (PGRP-L), up to 90 kDa, which may be either intracellular or transmembrane. 
Several structures of PGRPs are known in insects and mammals, some bound with substrates like Muramyl Tripeptide (MTP) or Tracheal Cytotoxin (TCT). The substrate binding site is conserved in PGRP-LCx, PGRP-LE, and PGRP-Ialpha proteins. This family includes Zn-dependent N-Acetylmuramoyl-L-alanine Amidase, EC:3.5.1.28. This enzyme cleaves the amide bond between N-acetylmuramoyl and L-amino acids, preferentially D-lactyl-L-Ala, in bacterial cell walls. The structure for the bacteriophage T7 lysozyme shows that two of the conserved histidines and a cysteine are zinc binding residues. Site-directed mutagenesis of T7 lysozyme indicates that two conserved residues, a Tyr and a Lys, are important for amidase activity." Q#7090 - CGI_10009905 superfamily 241700 319 483 7.94E-39 139.447 cl00222 lysozyme_like superfamily - - "lysozyme_like domain. This contains several members including Soluble Lytic Transglycosylases (SLT), Goose Egg-White Lysozymes (GEWL), Hen Egg-White Lysozymes (HEWL), chitinases, bacteriophage lambda lysozymes, endolysins, autolysins, and chitosanases. All the members are involved in the hydrolysis of beta-1,4- linked polysaccharides." Q#7091 - CGI_10009906 superfamily 245596 57 348 1.45E-150 429.962 cl11394 Glyco_tranf_GTA_type superfamily - - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. 
This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#7092 - CGI_10009907 superfamily 217293 1 192 1.96E-33 124.667 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#7092 - CGI_10009907 superfamily 202474 199 275 6.72E-09 54.5821 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#7093 - CGI_10009908 superfamily 246918 57 104 3.12E-08 47.5815 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#7093 - CGI_10009908 superfamily 246918 1 50 6.51E-07 44.1147 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#7094 - CGI_10009909 superfamily 246918 18 66 6.53E-08 44.4999 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#7097 - CGI_10009912 superfamily 244201 399 524 1.80E-06 46.8403 cl05797 SMC_hinge superfamily - - SMC proteins Flexible Hinge Domain; This family represents the hinge region of the SMC (Structural Maintenance of Chromosomes) family of proteins. The hinge region is responsible for formation of the DNA interacting dimer. It is also possible that the precise structure of it is an essential determinant of the specificity of the DNA-protein interaction. 
Q#7099 - CGI_10009915 superfamily 243175 6 126 1.45E-57 176.671 cl02776 GST_C_family superfamily - - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." 
Q#7100 - CGI_10002529 superfamily 243035 22 89 1.37E-13 62.3936 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#7102 - CGI_10002531 superfamily 245841 300 780 1.85E-171 517.637 cl12025 PolY superfamily - - "Y-family of DNA polymerases; Y-family DNA polymerases are a specialized subset of polymerases that facilitate translesion synthesis (TLS), a process that allows the bypass of a variety of DNA lesions. Unlike replicative polymerases, TLS polymerases lack proofreading activity and have low fidelity and low processivity. They use damaged DNA as templates and insert nucleotides opposite the lesions. The active sites of TLS polymerases are large and flexible to allow the accommodation of distorted bases. Most TLS polymerases are members of the Y-family, including Pol eta, Pol kappa/IV, Pol iota, Rev1, and Pol V, which is found exclusively in bacteria. In eukaryotes, the B-family polymerase Pol zeta also functions as a TLS polymerase. Expression of Y-family polymerases is often induced by DNA damage and is believed to be highly regulated. TLS is likely induced by the monoubiquitination of the replication clamp PCNA, which provides a scaffold for TLS polymerases to bind in order to access the lesion. Because of their high error rates, TLS polymerases are potential targets for cancer treatment and prevention." 
Q#7102 - CGI_10002531 superfamily 213388 1201 1294 4.28E-22 93.4908 cl17091 Rev1_C superfamily - - "C-terminal domain of the Y-family polymerase Rev1; Rev1 is a eukaryotic translesion synthesis (TLS) polymerase; TLS is a process that allows the bypass of a variety of DNA lesions. TLS polymerases lack proofreading activity and have low fidelity and low processivity. They use damaged DNA as templates and insert nucleotides opposite the lesions. Rev1 has both structural and enzymatic roles. Structurally, it is believed to interact with other nonclassical polymerases and replication machinery to act as a scaffold. The C-terminal domain modeled here is essential for TLS and has been shown to mediate interactions with the Rev7 subunit of the B-family TLS polymerase Pol zeta (Rev3/Rev7), as well as with the RIRs (Rev1-interacting regions) of polymerases kappa, iota, and eta. Rev1 is known to actively promote the introduction of mutations, potentially making it a significant target for cancer treatment." Q#7102 - CGI_10002531 superfamily 241565 52 120 3.93E-07 49.2423 cl00038 BRCT superfamily - - "Breast Cancer Suppressor Protein (BRCA1), carboxy-terminal domain. The BRCT domain is found within many DNA damage repair and cell cycle checkpoint proteins. The unique diversity of this domain superfamily allows BRCT modules to interact forming homo/hetero BRCT multimers, BRCT-non-BRCT interactions, and interactions within DNA strand breaks." Q#7103 - CGI_10002759 superfamily 247724 15 182 1.23E-37 131.114 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. 
This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#7105 - CGI_10002761 superfamily 247724 14 177 7.86E-39 133.035 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. 
The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#7106 - CGI_10002762 superfamily 247911 27 366 3.49E-133 400.946 cl17357 Fumble superfamily - - "Fumble; Fumble is required for cell division in Drosophila. Mutants lacking fumble exhibit abnormalities in bipolar spindle organisation, chromosome segregation, and contractile ring formation. Analyses have demonstrated that encodes three protein isoforms, all of which contain a domain with high similarity to the pantothenate kinases of A. nidulans and mouse. A role of fumble in membrane synthesis has been proposed." Q#7106 - CGI_10002762 superfamily 246946 459 756 5.72E-53 186.344 cl15397 DUF89 superfamily - - Protein of unknown function DUF89; This family has no known function. Q#7108 - CGI_10000684 superfamily 242889 16 88 3.28E-15 66.8578 cl02111 PCI superfamily N - "PCI domain; This domain has also been called the PINT motif (Proteasome, Int-6, Nip-1 and TRIP-15)." Q#7110 - CGI_10004818 superfamily 241578 553 706 1.53E-18 85.421 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#7110 - CGI_10004818 superfamily 241578 755 913 1.31E-12 67.7018 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#7110 - CGI_10004818 superfamily 241578 365 504 1.14E-11 64.6202 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. 
The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#7110 - CGI_10004818 superfamily 247068 1124 1213 1.84E-05 44.997 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#7110 - CGI_10004818 superfamily 152683 98 200 6.79E-09 55.3717 cl13656 Methyltransf_FA superfamily - - "Farnesoic acid O-methyl transferase; This domain family is found in bacteria and eukaryotes, and is approximately 110 amino acids in length. Farnesoic acid O-methyl transferase (FAMeT) is the enzyme that catalyzes the formation of methyl farnesoate (MF) from farnesoic acid (FA) in the biosynthetic pathway of juvenile hormone (JH)." 
Q#7111 - CGI_10004819 superfamily 247684 27 95 8.72E-15 67.6875 cl17037 NBD_sugar-kinase_HSP70_actin superfamily NC - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#7112 - CGI_10004820 superfamily 247684 3 307 8.90E-76 245.265 cl17037 NBD_sugar-kinase_HSP70_actin superfamily N - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#7113 - CGI_10004821 superfamily 247684 45 149 6.14E-22 90.4143 cl17037 NBD_sugar-kinase_HSP70_actin superfamily C - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#7114 - CGI_10004822 superfamily 247684 36 464 1.09E-105 327.697 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#7115 - CGI_10004823 superfamily 247684 4 331 3.09E-81 259.902 cl17037 NBD_sugar-kinase_HSP70_actin superfamily N - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#7116 - CGI_10004824 superfamily 247684 4 254 1.72E-54 187.099 cl17037 NBD_sugar-kinase_HSP70_actin superfamily N - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#7118 - CGI_10019849 superfamily 246975 39 60 0.00488209 33.4745 cl15478 zf-C2H2 superfamily - - "Zinc finger, C2H2 type; The C2H2 zinc finger is the classical zinc finger domain. The two conserved cysteines and histidines co-ordinate a zinc ion. The following pattern describes the zinc finger. #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C] Where X can be any amino acid, and numbers in brackets indicate the number of residues. The positions marked # are those that are important for the stable fold of the zinc finger. 
The final position can be either his or cys. The C2H2 zinc finger is composed of two short beta strands followed by an alpha helix. The amino terminal part of the helix binds the major groove in DNA binding zinc fingers. The accepted consensus binding sequence for Sp1 is usually defined by the asymmetric hexanucleotide core GGGCGG but this sequence does not include, among others, the GAG (=CTC) repeat that constitutes a high-affinity site for Sp1 binding to the wt1 promoter." Q#7123 - CGI_10019854 superfamily 245201 38 347 0 627.482 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#7124 - CGI_10019855 superfamily 241675 53 232 4.65E-68 211.68 cl00195 SIR2 superfamily N - "SIR2 superfamily of proteins includes silent information regulator 2 (Sir2) enzymes which catalyze NAD+-dependent protein/histone deacetylation, where the acetyl group from the lysine epsilon-amino group is transferred to the ADP-ribose moiety of NAD+, producing nicotinamide and the novel metabolite O-acetyl-ADP-ribose. Sir2 proteins, also known as sirtuins, are found in all eukaryotes and many archaea and prokaryotes and have been shown to regulate gene silencing, DNA repair, metabolic enzymes, and life span. The most-studied function, gene silencing, involves the inactivation of chromosome domains containing key regulatory genes by packaging them into a specialized chromatin structure that is inaccessible to DNA-binding proteins. 
The oligomerization state of Sir2 appears to be organism-dependent, sometimes occurring as a monomer and sometimes as a multimer. Also included in this superfamily is a group of uncharacterized Sir2-like proteins which lack certain key catalytic residues and conserved zinc binding cysteines." Q#7125 - CGI_10019856 superfamily 243072 737 775 7.16E-08 51.6154 cl02529 ANK superfamily C - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#7126 - CGI_10019857 superfamily 242728 84 112 1.78E-08 46.8906 cl01821 zf-CHCC superfamily N - Zinc-finger domain; This is a short zinc-finger domain conserved from fungi to humans. It is Cx8Hx14Cx2C. Q#7128 - CGI_10019859 superfamily 247068 43 149 4.95E-06 44.2266 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." 
Q#7128 - CGI_10019859 superfamily 247068 166 255 0.000250208 39.219 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#7131 - CGI_10019862 superfamily 220736 188 340 1.79E-29 111.633 cl11068 PTEN_C2 superfamily - - "C2 domain of PTEN tumour-suppressor protein; This is the C2 domain-like domain, in greek key form, of the PTEN protein, phosphatidyl-inositol triphosphate phosphatase, and it is the C-terminus. This domain may well include a CBR3 loop which means it plays a central role in membrane binding. This domain associates across an extensive interface with the N-terminal phosphatase domain DSPc (pfam00782) suggesting that the C2 domain productively positions the catalytic part of the protein onto the membrane." Q#7131 - CGI_10019862 superfamily 241574 88 187 1.51E-11 62.0837 cl00053 PTPc superfamily N - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. 
Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#7132 - CGI_10019863 superfamily 219632 1 72 4.03E-21 82.706 cl06786 Eaf7 superfamily - - "Chromatin modification-related protein EAF7; The S. cerevisiae member of this family is part of NuA4, the only essential histone acetyltransferase complex in Saccharomyces cerevisiae involved in global histone acetylation." Q#7133 - CGI_10019864 superfamily 247799 82 128 1.89E-10 54.8735 cl17245 KH-I superfamily N - "K homology RNA-binding domain, type I. KH binds single-stranded RNA or DNA. It is found in a wide variety of proteins including ribosomal proteins, transcription factors and post-transcriptional modifiers of mRNA. There are two different KH domains that belong to different protein folds, but they share a single KH motif. The KH motif is folded into a beta alpha alpha beta unit. In addition to the core, type II KH domains (e.g. ribosomal protein S3) include N-terminal extension and type I KH domains (e.g. hnRNP K) contain C-terminal extension." Q#7135 - CGI_10019866 superfamily 247727 16 47 6.19E-08 47.6905 cl17173 AdoMet_MTases superfamily C - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." 
Q#7136 - CGI_10019867 superfamily 247727 36 136 1.97E-08 50.5063 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#7137 - CGI_10019868 superfamily 243034 82 172 4.92E-07 47.3748 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; 
three copies of the repeat are present here" Q#7138 - CGI_10019869 superfamily 241564 287 352 2.74E-29 110.433 cl00035 BIR superfamily - - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." Q#7138 - CGI_10019869 superfamily 241564 47 111 1.29E-13 66.5203 cl00035 BIR superfamily - - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." 
Q#7138 - CGI_10019869 superfamily 247792 512 551 0.000116998 40.1216 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#7142 - CGI_10019873 superfamily 241574 89 170 4.90E-11 59.989 cl00053 PTPc superfamily N - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#7144 - CGI_10019875 superfamily 218406 1 122 5.57E-45 151.043 cl04914 MGAT2 superfamily N - "N-acetylglucosaminyltransferase II (MGAT2); UDP-N-acetyl-D-glucosamine:alpha-6-D-mannoside beta-1,2-N- acetylglucosaminyltransferase II (EC 2.4.1.143) (GnT II/MGAT2) is a Golgi resident enzyme that catalyzes an essential step in the biosynthetic pathway leading from high mannose to complex N-linked oligosaccharides. Mutations in the MGAT2 gene lead to congenital disorder of glycosylation (CDG IIa). CDG IIa patients have an increased bleeding tendency, unrelated to coagulation factors." 
Q#7145 - CGI_10019876 superfamily 218406 74 225 4.76E-68 216.142 cl04914 MGAT2 superfamily C - "N-acetylglucosaminyltransferase II (MGAT2); UDP-N-acetyl-D-glucosamine:alpha-6-D-mannoside beta-1,2-N- acetylglucosaminyltransferase II (EC 2.4.1.143) (GnT II/MGAT2) is a Golgi resident enzyme that catalyzes an essential step in the biosynthetic pathway leading from high mannose to complex N-linked oligosaccharides. Mutations in the MGAT2 gene lead to congenital disorder of glycosylation (CDG IIa). CDG IIa patients have an increased bleeding tendency, unrelated to coagulation factors." Q#7146 - CGI_10019877 superfamily 216981 508 651 1.31E-14 71.4098 cl17087 OTU superfamily - - "OTU-like cysteine protease; This family is comprised of a group of predicted cysteine proteases, homologous to the Ovarian Tumour (OTU) gene in Drosophila. Members include proteins from eukaryotes, viruses and pathogenic bacterium. The conserved cysteine and histidine, and possibly the aspartate, represent the catalytic residues in this putative group of proteases." Q#7146 - CGI_10019877 superfamily 207690 223 246 1.09E-05 43.4605 cl02656 zf-RanBP superfamily - - Zn-finger in Ran binding protein and others; Zn-finger in Ran binding protein and others. Q#7146 - CGI_10019877 superfamily 207690 131 152 0.000193159 39.9937 cl02656 zf-RanBP superfamily - - Zn-finger in Ran binding protein and others; Zn-finger in Ran binding protein and others. Q#7148 - CGI_10019879 superfamily 243084 134 240 1.14E-31 114.007 cl02556 Bromodomain superfamily - - Bromodomain. Bromodomains are found in many chromatin-associated proteins and in nuclear histone acetyltransferases. They interact specifically with acetylated lysine. Q#7149 - CGI_10019880 superfamily 112128 1106 1310 4.18E-92 296.319 cl03992 TF_AP-2 superfamily - - Transcription factor AP-2; Transcription factor AP-2. 
Q#7149 - CGI_10019880 superfamily 146396 10 82 4.52E-23 95.7762 cl04239 ENT superfamily - - ENT domain; This presumed domain is named after Emsy N Terminus (ENT). Emsy is a protein that is amplified in breast cancer and interacts with BRCA2. The N terminus of this protein is found to be similar to other vertebrate and plant proteins of unknown function. This domain has a completely conserved histidine residue that may be functionally important. Q#7151 - CGI_10019882 superfamily 243179 40 118 2.66E-21 84.6625 cl02781 tetraspanin_LEL superfamily N - "Tetraspanin, extracellular domain or large extracellular loop (LEL). Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. The tetraspanin family contains CD9, CD63, CD37, CD53, CD82, CD151, and CD81, amongst others. Tetraspanins are involved in diverse processes such as cell activation and proliferation, adhesion and motility, differentiation, cancer, and others. Their various functions may relate to their ability to act as molecular facilitators, grouping specific cell-surface proteins and affecting formation and stability of signaling complexes. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the 'tetraspanin web', which may also include integrins." Q#7152 - CGI_10010868 superfamily 247866 9 128 6.63E-26 100.22 cl17312 PhyH superfamily N - "Phytanoyl-CoA dioxygenase (PhyH); This family is made up of several eukaryotic phytanoyl-CoA dioxygenase (PhyH) proteins, ectoine hydroxylases and a number of bacterial deoxygenases. PhyH is a peroxisomal enzyme catalyzing the first step of phytanic acid alpha-oxidation. 
PhyH deficiency causes Refsum's disease (RD) which is an inherited neurological syndrome biochemically characterized by the accumulation of phytanic acid in plasma and tissues." Q#7153 - CGI_10010869 superfamily 241592 24 75 4.51E-08 46.8362 cl00074 H2A superfamily - - "Histone 2A; H2A is a subunit of the nucleosome. The nucleosome is an octamer containing two H2A, H2B, H3, and H4 subunits. The H2A subunit performs essential roles in maintaining structural integrity of the nucleosome, chromatin condensation, and binding of specific chromatin-associated proteins." Q#7155 - CGI_10010871 superfamily 217944 40 114 2.13E-24 97.4039 cl04436 RPAP2_Rtr1 superfamily - - Rtr1/RPAP2 family; This family includes the human RPAP2 (RNAP II associated polypeptide) protein and the yeast Rtr1 protein. It has been suggested that this family of proteins are regulators of core RNA polymerase II function. Q#7156 - CGI_10010872 superfamily 220370 25 318 1.61E-61 204.168 cl10719 Tau95 superfamily - - "RNA polymerase III transcription factor (TF)IIIC subunit; TFIIIC1 is a multisubunit DNA binding factor that serves as a dynamic platform for assembly of pre-initiation complexes on class III genes. This entry represents the tau 95 subunit which holds a key position in TFIIIC, exerting both upstream and downstream influence on the TFIIIC-DNA complex by rendering the complex more stable. Once bound to tDNA-intragenic promoter elements, TFIIIC directs the assembly of TFIIIB on the DNA, which in turn recruits the RNA polymerase III (pol III) and activates multiple rounds of transcription." Q#7160 - CGI_10010876 superfamily 247725 3 182 2.20E-99 288.398 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. 
They are generally involved in targeting the protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#7163 - CGI_10010879 superfamily 243092 360 651 1.49E-51 184.846 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#7163 - CGI_10010879 superfamily 243092 49 392 8.41E-14 71.9824 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#7164 - CGI_10010880 superfamily 207717 392 465 1.66E-34 128.313 cl02755 LAM superfamily - - "LA motif RNA-binding domain; This domain is found at the N-terminus of La RNA-binding proteins as well as in other related proteins. Typically, the domain co-occurs with an RNA-recognition motif (RRM), and together these domains function to bind primary transcripts of RNA polymerase III in the La autoantigen (Lupus La protein, LARP3, or Sjoegren syndrome type B antigen, SS-B). A variety of La-related proteins (LARPs or La ribonucleoproteins), with differing domain architecture, appear to function as RNA-binding proteins in eukaryotic cellular processes." 
Q#7164 - CGI_10010880 superfamily 128927 903 941 5.98E-13 65.4703 cl02733 DM15 superfamily - - "Tandem repeat in fly CG14066 (La related protein), human KIAA0731 and worm R144.7. Unknown function; Tandem repeat in fly CG14066 (La related protein), human KIAA0731 and worm R144.7. Unknown function. " Q#7164 - CGI_10010880 superfamily 128927 862 901 2.22E-08 52.3735 cl02733 DM15 superfamily - - "Tandem repeat in fly CG14066 (La related protein), human KIAA0731 and worm R144.7. Unknown function; Tandem repeat in fly CG14066 (La related protein), human KIAA0731 and worm R144.7. Unknown function. " Q#7165 - CGI_10010881 superfamily 245201 1 235 3.07E-133 402.842 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#7167 - CGI_10010883 superfamily 203768 156 258 6.61E-42 141.485 cl06716 RED_C superfamily - - "RED-like protein C-terminal region; This family contains sequences that are similar to the C-terminal region of Red protein. This and related proteins are thought to be localised to the nucleus, and contain a RED repeat which consists of a number of RE and RD sequence elements. The region in question has several conserved NLS sequences. The function of Red protein is unknown, but efficient sequestration to nuclear bodies suggests that its expression may be tightly regulated or that the protein self-aggregates extremely efficiently." 
Q#7168 - CGI_10010884 superfamily 219589 76 236 1.23E-38 136.133 cl06717 RED_N superfamily - - "RED-like protein N-terminal region; This family contains sequences that are similar to the N-terminal region of Red protein. This and related proteins contain a RED repeat which consists of a number of RE and RD sequence elements. The region in question has several conserved NLS sequences and a putative trimeric coiled-coil region, suggesting that these proteins are expressed in the nucleus. The function of Red protein is unknown, but efficient sequestration to nuclear bodies suggests that its expression may be tightly regulated or that the protein self-aggregates extremely efficiently." Q#7169 - CGI_10010885 superfamily 241740 1 127 4.22E-46 149.347 cl00269 cytidine_deaminase-like superfamily - - "Cytidine and deoxycytidylate deaminase zinc-binding region. The family contains cytidine deaminases, nucleoside deaminases, deoxycytidylate deaminases and riboflavin deaminases. Also included are the apoBec family of mRNA editing enzymes. All members are Zn dependent. The zinc ion in the active site plays a central role in the proposed catalytic mechanism, activating a water molecule to form a hydroxide ion that performs a nucleophilic attack on the substrate." 
Q#7170 - CGI_10010886 superfamily 247684 12 426 2.84E-100 312.674 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#7171 - CGI_10018464 superfamily 241577 258 479 6.04E-142 409.937 cl00056 MH2 superfamily - - "C-terminal Mad Homology 2 (MH2) domain; The MH2 domain is found in the SMAD (small mothers against decapentaplegic) family of proteins and is responsible for type I receptor interactions, phosphorylation-triggered homo- and hetero-oligomerization, and transactivation. It is negatively regulated by the N-terminal MH1 domain which prevents it from forming a complex with SMAD4. 
The MH2 domain is multifunctional and provides SMADs with their specificity and selectivity, as well as transcriptional activity. Several transcriptional co-activators and repressors have also been reported to regulate SMAD signaling by interacting with the MH2 domain. Mutations in the MH2 domains of SMAD2 and especially SMAD4 have been detected in colorectal and other human cancers." Q#7171 - CGI_10018464 superfamily 241576 1 57 9.49E-39 137.969 cl00055 MH1 superfamily N - "N-terminal Mad Homology 1 (MH1) domain; The MH1 is a small DNA-binding domain present in SMAD (small mothers against decapentaplegic) family of proteins, which are signal transducers and transcriptional modulators that mediate multiple signaling pathways. MH1 binds to the DNA major groove in an unusual manner via a beta hairpin structure. It negatively regulates the functions of the MH2 domain, the C-terminal domain of SMAD. Receptor-regulated SMAD proteins (R-SMADs, including SMAD1, SMAD2, SMAD3, SMAD5, and SMAD9) are activated by phosphorylation by transforming growth factor (TGF)-beta type I receptors. The active R-SMAD associates with a common mediator SMAD (Co-SMAD or SMAD4) and other cofactors, which together translocate to the nucleus to regulate gene expression. The inhibitory or antagonistic SMADs (I-SMADs, including SMAD6 and SMAD7) negatively regulate TGF-beta signaling by competing with R-SMADs for type I receptor or Co-SMADs. MH1 domains of R-SMAD and SMAD4 contain a nuclear localization signal as well as DNA-binding activity. The activated R-SMAD/SMAD4 complex then binds with very low affinity to a DNA sequence CAGAC called SMAD-binding element (SBE) via the MH1 domain." 
Q#7172 - CGI_10018465 superfamily 246723 156 601 0 601.09 cl14813 GluZincin superfamily - - "Peptidase Gluzincin family (thermolysin-like proteinases, TLPs) includes peptidases M1, M2, M3, M4, M13, M32 and M36 (fungalysins); Gluzincin family (thermolysin-like peptidases or TLPs) includes several zinc-dependent metallopeptidases such as the M1, M2, M3, M4, M13, M32, M36 peptidases (MEROPS classification), and contain HEXXH and EXXXD motifs as part of their active site. All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. M1 family includes aminopeptidase N (APN) and leukotriene A4 hydrolase (LTA4H). APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types. LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity such that the two activities occupy different, but overlapping sites. The peptidase M3 or neurolysin-like family, includes M3, M2 and M32 metallopeptidases. The M3 peptidases have two subfamilies: M3A, includes thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (3.4.24.16), and the mitochondrial intermediate peptidase; M3B contains oligopeptidase F. M2 peptidase angiotensin converting enzyme (ACE, EC 3.4.15.1) catalyzes the conversion of decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II. ACE is a key part of the renin-angiotensin system that regulates blood pressure, thus ACE inhibitors are important for the treatment of hypertension. M32 family includes two eukaryotic enzymes from protozoa Trypanosoma cruzi, a causative agent of Chagas' disease, and Leishmania major, a parasite that causes leishmaniasis, making them attractive targets for drug development. 
The M4 family includes secreted protease thermolysin (EC 3.4.24.27), pseudolysin, aureolysin, neutral protease as well as fungalysin and bacillolysin (EC 3.4.24.28) that degrade extracellular proteins and peptides for bacterial nutrition, especially prior to sporulation. Thermolysin is widely used as a nonspecific protease to obtain fragments for peptide sequencing as well as in production of the artificial sweetener aspartame. M13 family includes neprilysin (EC 3.4.24.11) and endothelin-converting enzyme I (ECE-1, EC 3.4.24.71), which fulfill a broad range of physiological roles due to the greater variation in the S2' subsite allowing substrate specificity and are prime therapeutic targets for selective inhibition. Peptidase M36 (fungamysin) family includes endopeptidases from pathogenic fungi. Fungalysin hydrolyzes extracellular matrix proteins such as elastin and keratin. Aspergillus fumigatus causes the pulmonary disease aspergillosis by invading the lungs of immuno-compromised animals and secreting fungalysin that possibly breaks down proteinaceous structural barriers." Q#7178 - CGI_10018471 superfamily 241691 502 618 0.00294818 37.8768 cl00213 DNA_BRE_C superfamily N - "DNA breaking-rejoining enzymes, C-terminal catalytic domain. The DNA breaking-rejoining enzyme superfamily includes type IB topoisomerases and tyrosine recombinases that share the same fold in their catalytic domain containing six conserved active site residues. The best-studied members of this diverse superfamily include human topoisomerase I, the bacteriophage lambda integrase, the bacteriophage P1 Cre recombinase, the yeast Flp recombinase and the bacterial XerD/C recombinases. Their overall reaction mechanism is essentially identical and involves cleavage of a single strand of a DNA duplex by nucleophilic attack of a conserved tyrosine to give a 3' phosphotyrosyl protein-DNA adduct. 
In the second rejoining step, a terminal 5' hydroxyl attacks the covalent adduct to release the enzyme and generate duplex DNA. The enzymes differ in that topoisomerases cleave and then rejoin the same 5' and 3' termini, whereas a site-specific recombinase transfers a 5' hydroxyl generated by recombinase cleavage to a new 3' phosphate partner located in a different duplex region. Many DNA breaking-rejoining enzymes also have N-terminal domains, which show little sequence or structure similarity." Q#7180 - CGI_10018473 superfamily 246680 1008 1071 0.000772889 39.241 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#7181 - CGI_10018474 superfamily 243146 125 166 5.50E-06 41.493 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#7183 - CGI_10018476 superfamily 246680 114 194 4.59E-28 104.014 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. 
Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#7183 - CGI_10018476 superfamily 241640 2 87 6.23E-20 85.0369 cl00149 Tryp_SPc superfamily N - Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues. Q#7184 - CGI_10018477 superfamily 241640 37 168 1.90E-42 143.186 cl00149 Tryp_SPc superfamily C - Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues. Q#7185 - CGI_10018478 superfamily 241578 88 246 8.77E-44 153.986 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. 
There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#7185 - CGI_10018478 superfamily 243119 527 568 0.0047028 35.4873 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#7185 - CGI_10018478 superfamily 243119 573 624 0.00493785 35.4974 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#7186 - CGI_10018479 superfamily 241578 186 344 3.54E-48 167.083 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. 
The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#7186 - CGI_10018479 superfamily 243119 688 721 0.000133518 40.505 cl02629 CBM_14 superfamily N - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#7187 - CGI_10018480 superfamily 241578 124 282 2.68E-46 161.69 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#7187 - CGI_10018480 superfamily 243119 618 669 0.000142108 40.1198 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#7187 - CGI_10018480 superfamily 243119 571 609 0.00205124 36.653 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. 
Q#7189 - CGI_10018482 superfamily 243034 159 259 6.54E-15 69.3311 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#7190 - CGI_10018483 superfamily 245010 29 134 8.24E-07 43.3755 cl09111 Prefoldin superfamily - - "Prefoldin is a hexameric molecular chaperone complex, found in both eukaryotes and archaea, that binds and stabilizes newly synthesized polypeptides allowing them to fold correctly. The complex contains two alpha and four beta subunits, the two subunits being evolutionarily related. In archaea, there is usually only one gene for each subunit while in eukaryotes there are two or more paralogous genes encoding each subunit adding heterogeneity to the structure of the hexamer. 
The structure of the complex consists of a double beta barrel assembly with six protruding coiled-coils." Q#7191 - CGI_10018484 superfamily 243074 63 107 2.33E-14 67.9169 cl02535 F-box-like superfamily - - F-box-like; This is an F-box-like family. Q#7192 - CGI_10018485 superfamily 242059 43 465 3.16E-94 294.011 cl00738 MBOAT superfamily - - "MBOAT, membrane-bound O-acyltransferase family; The MBOAT (membrane bound O-acyl transferase) family of membrane proteins contains a variety of acyltransferase enzymes. A conserved histidine has been suggested to be the active site residue." Q#7193 - CGI_10018486 superfamily 243056 33 236 3.51E-30 114.764 cl02495 RabGAP-TBC superfamily - - "Rab-GTPase-TBC domain; Identification of a TBC domain in GYP6_YEAST and GYP7_YEAST, which are GTPase activator proteins of yeast Ypt6 and Ypt7, implies that these domains are GTPase activator proteins of Rab-like small GTPases." Q#7194 - CGI_10018487 superfamily 247750 20 201 1.18E-116 352.837 cl17196 E1_enzyme_family superfamily C - "Superfamily of activating enzymes (E1) of the ubiquitin-like proteins. This family includes classical ubiquitin-activating enzymes E1, ubiquitin-like (ubl) activating enzymes and other mechanistic homologes, like MoeB, Thif1 and others. The common reaction mechanism catalyzed by MoeB, ThiF and the E1 enzymes begins with a nucleophilic attack of the C-terminal carboxylate of MoaD, ThiS and ubiquitin, respectively, on the alpha-phosphate of an ATP molecule bound at the active site of the activating enzymes, leading to the formation of a high-energy acyladenylate intermediate and subsequently to the formation of a thiocarboxylate at the C termini of MoaD and ThiS." Q#7194 - CGI_10018487 superfamily 247750 256 438 4.55E-48 171.408 cl17196 E1_enzyme_family superfamily N - "Superfamily of activating enzymes (E1) of the ubiquitin-like proteins. 
This family includes classical ubiquitin-activating enzymes E1, ubiquitin-like (ubl) activating enzymes and other mechanistic homologes, like MoeB, Thif1 and others. The common reaction mechanism catalyzed by MoeB, ThiF and the E1 enzymes begins with a nucleophilic attack of the C-terminal carboxylate of MoaD, ThiS and ubiquitin, respectively, on the alpha-phosphate of an ATP molecule bound at the active site of the activating enzymes, leading to the formation of a high-energy acyladenylate intermediate and subsequently to the formation of a thiocarboxylate at the C termini of MoaD and ThiS." Q#7195 - CGI_10018488 superfamily 245876 17 82 8.09E-11 54.8146 cl12113 HSF_DNA-bind superfamily C - HSF-type DNA-binding; HSF-type DNA-binding. Q#7198 - CGI_10018491 superfamily 116798 37 186 6.39E-18 76.1762 cl17955 Lipocalin_2 superfamily - - "Lipocalin-like domain; Lipocalins are transporters for small hydrophobic molecules, such as lipids, steroid hormones, bilins, and retinoids. The structure is an eight-stranded beta barrel." Q#7199 - CGI_10018492 superfamily 216686 134 309 1.87E-35 128.98 cl18377 Galactosyl_T superfamily - - "Galactosyltransferase; This family includes the galactosyltransferases UDP-galactose:2-acetamido-2-deoxy-D-glucose3beta-galactosyltransferase and UDP-Gal:beta-GlcNAc beta 1,3-galactosyltranferase. Specific galactosyltransferases transfer galactose to GlcNAc terminal chains in the synthesis of the lacto-series oligosaccharides types 1 and 2." Q#7200 - CGI_10003249 superfamily 247725 255 290 1.93E-10 56.4407 cl17171 PH-like superfamily C - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. 
They are generally involved in targeting the protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#7201 - CGI_10001599 superfamily 245819 631 806 1.41E-71 234.396 cl11967 Nucleotidyl_cyc_III superfamily - - "Class III nucleotidyl cyclases; Class III nucleotidyl cyclases are the largest, most diverse group of nucleotidyl cyclases (NC's) containing prokaryotic and eukaryotic proteins. They can be divided into two major groups; the mononucleotidyl cyclases (MNC's) and the diguanylate cyclases (DGC's). The MNC's, which include the adenylate cyclases (AC's) and the guanylate cyclases (GC's), have a conserved cyclase homology domain (CHD), while the DGC's have a conserved GGDEF domain, named after a conserved motif within this subgroup. Their products, cyclic guanylyl and adenylyl nucleotides, are second messengers that play important roles in eukaryotic signal transduction and prokaryotic sensory pathways." Q#7201 - CGI_10001599 superfamily 245201 322 559 4.09E-40 149.609 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#7201 - CGI_10001599 superfamily 219526 577 618 3.28E-08 53.3919 cl06648 HNOBA superfamily N - "Heme NO binding associated; The HNOBA domain is found associated with the HNOB domain and pfam00211 in soluble cyclases and signalling proteins. 
The HNOB domain is predicted to function as a heme-dependent sensor for gaseous ligands, and transduce diverse downstream signals, in both bacteria and animals." Q#7205 - CGI_10014807 superfamily 247724 2 62 4.90E-21 87.5907 cl17170 Ras_like_GTPase superfamily N - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#7206 - CGI_10014808 superfamily 243744 38 57 0.00959794 30.7946 cl04410 DFP superfamily N - "DNA / pantothenate metabolism flavoprotein; The DNA/pantothenate metabolism flavoprotein (EC:4.1.1.36) affects synthesis of DNA, and pantothenate metabolism." 
Q#7208 - CGI_10014812 superfamily 243035 43 85 1.34E-07 45.3034 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#7212 - CGI_10014816 superfamily 243091 135 235 6.99E-09 52.8796 cl02566 SET superfamily - - "SET domain; SET domains are protein lysine methyltransferase enzymes. SET domains appear to be protein-protein interaction domains. It has been demonstrated that SET domains mediate interactions with a family of proteins that display similarity with dual-specificity phosphatases (dsPTPases). A subset of SET domains have been called PR domains. These domains are divergent in sequence from other SET domains, but also appear to mediate protein-protein interaction. The SET domain consists of two regions known as SET-N and SET-C. SET-C forms an unusual and conserved knot-like structure of probably functional importance. Additionally to SET-N and SET-C, an insert region (SET-I) and flanking regions of high structural variability form part of the overall structure." Q#7213 - CGI_10014817 superfamily 243035 18 51 0.000138526 37.1942 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. 
This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. 
In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#7216 - CGI_10014820 superfamily 241563 103 138 0.000547638 38.2292 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#7218 - CGI_10014635 superfamily 245206 4 302 0 524.569 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." 
Q#7221 - CGI_10014638 superfamily 241802 69 372 6.93E-121 358.338 cl00342 Trp-synth-beta_II superfamily - - "Tryptophan synthase beta superfamily (fold type II); this family of pyridoxal phosphate (PLP)-dependent enzymes catalyzes beta-replacement and beta-elimination reactions. This CD corresponds to aminocyclopropane-1-carboxylate deaminase (ACCD), tryptophan synthase beta chain (Trp-synth_B), cystathionine beta-synthase (CBS), O-acetylserine sulfhydrylase (CS), serine dehydratase (Ser-dehyd), threonine dehydratase (Thr-dehyd), diaminopropionate ammonia lyase (DAL), and threonine synthase (Thr-synth). ACCD catalyzes the conversion of 1-aminocyclopropane-1-carboxylate to alpha-ketobutyrate and ammonia. Tryptophan synthase folds into a tetramer, where the beta chain is the catalytic PLP-binding subunit and catalyzes the formation of L-tryptophan from indole and L-serine. CBS is a tetrameric hemeprotein that catalyzes condensation of serine and homocysteine to cystathionine. CS is a homodimer that catalyzes the formation of L-cysteine from O-acetyl-L-serine. Ser-dehyd catalyzes the conversion of L- or D-serine to pyruvate and ammonia. Thr-dehyd is active as a homodimer and catalyzes the conversion of L-threonine to 2-oxobutanoate and ammonia. DAL is also a homodimer and catalyzes the alpha, beta-elimination reaction of both L- and D-alpha, beta-diaminopropionate to form pyruvate and ammonia. Thr-synth catalyzes the formation of threonine and inorganic phosphate from O-phosphohomoserine." Q#7221 - CGI_10014638 superfamily 245020 393 463 5.27E-19 81.4354 cl09141 ACT superfamily - - "ACT domains are commonly involved in specifically binding an amino acid or other small ligand leading to regulation of the enzyme; Members of this CD belong to the superfamily of ACT regulatory domains. Pairs of ACT domains are commonly involved in specifically binding an amino acid or other small ligand leading to regulation of the enzyme. 
The ACT domain has been detected in a number of diverse proteins; some of these proteins are involved in amino acid and purine biosynthesis, phenylalanine hydroxylation, regulation of bacterial metabolism and transcription, and many remain to be characterized. ACT domain-containing enzymes involved in amino acid and purine synthesis are in many cases allosteric enzymes with complex regulation enforced by the binding of ligands. The ACT domain is commonly involved in the binding of a small regulatory molecule, such as the amino acids L-Ser and L-Phe in the case of D-3-phosphoglycerate dehydrogenase and the bifunctional chorismate mutase-prephenate dehydratase enzyme (P-protein), respectively. Aspartokinases typically consist of two C-terminal ACT domains in a tandem repeat, but the second ACT domain is inserted within the first, resulting in, what is normally the terminal beta strand of ACT2, formed from a region N-terminal of ACT1. ACT domain repeats have been shown to have nonequivalent ligand-binding sites with complex regulatory patterns such as those seen in the bifunctional enzyme, aspartokinase-homoserine dehydrogenase (ThrA). In other enzymes, such as phenylalanine hydroxylases, the ACT domain appears to function as a flexible small module providing allosteric regulation via transmission of conformational changes, these conformational changes are not necessarily initiated by regulatory ligand binding at the ACT domain itself. ACT domains are present either singularly, N- or C-terminal, or in pairs present C-terminal or between two catalytic domains. Unique to cyanobacteria are four ACT domains C-terminal to an aspartokinase domain. A few proteins are composed almost entirely of ACT domain repeats as seen in the four ACT domain protein, the ACR protein, found in higher plants; and the two ACT domain protein, the glycine cleavage system transcriptional repressor (GcvR) protein, found in some bacteria. 
Also seen are single ACT domain proteins similar to the Streptococcus pneumoniae ACT domain protein (uncharacterized pdb structure 1ZPV) found in both bacteria and archaea. Purportedly, the ACT domain is an evolutionarily mobile ligand binding regulatory module that has been fused to different enzymes at various times." Q#7222 - CGI_10014639 superfamily 243095 713 904 1.70E-63 217.29 cl02570 RhoGAP superfamily - - "RhoGAP: GTPase-activator protein (GAP) for Rho-like GTPases; GAPs towards Rho/Rac/Cdc42-like small GTPases. Small GTPases (G proteins) cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when bound to GDP. The Rho family of small G proteins, which includes Cdc42Hs, activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. G proteins generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. The RhoGAPs are one of the major classes of regulators of Rho G proteins." Q#7222 - CGI_10014639 superfamily 245835 314 541 7.91E-13 69.2985 cl12013 BAR superfamily - - "The Bin/Amphiphysin/Rvs (BAR) domain, a dimerization module that binds membranes and detects membrane curvature; BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions including organelle biogenesis, membrane trafficking or remodeling, and cell division and migration. Mutations in BAR containing proteins have been linked to diseases and their inactivation in cells leads to altered membrane dynamics. A BAR domain with an additional N-terminal amphipathic helix (an N-BAR) can drive membrane curvature. These N-BAR domains are found in amphiphysins and endophilins, among others. 
BAR domains are also frequently found alongside domains that determine lipid specificity, such as the Pleckstrin Homology (PH) and Phox Homology (PX) domains which are present in beta centaurins (ACAPs and ASAPs) and sorting nexins, respectively. A FES-CIP4 Homology (FCH) domain together with a coiled coil region is called the F-BAR domain and is present in Pombe/Cdc15 homology (PCH) family proteins, which include Fes/Fes tyrosine kinases, PACSIN or syndapin, CIP4-like proteins, and srGAPs, among others. The Inverse (I)-BAR or IRSp53/MIM homology Domain (IMD) is found in multi-domain proteins, such as IRSp53 and MIM, that act as scaffolding proteins and transducers of a variety of signaling pathways that link membrane dynamics and the underlying actin cytoskeleton. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The I-BAR domain induces membrane protrusions in the opposite direction compared to classical BAR and F-BAR domains, which produce membrane invaginations. BAR domains that also serve as protein interaction domains include those of arfaptin and OPHN1-like proteins, among others, which bind to Rac and Rho GAP domains, respectively." Q#7223 - CGI_10014640 superfamily 241599 73 133 2.28E-14 66.498 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#7224 - CGI_10014641 superfamily 241760 274 322 9.31E-23 92.0337 cl00295 ZZ superfamily - - "Zinc finger, ZZ type. Zinc finger present in dystrophin, CBP/p300 and many other proteins. The ZZ motif coordinates one or two zinc ions and most likely participates in ligand binding or molecular scaffolding. 
Many proteins containing ZZ motifs have other zinc-binding motifs as well, and the majority serve as scaffolds in pathways involving acetyltransferase, protein kinase, or ubiquitin-related activity. ZZ proteins can be grouped into the following functional classes: chromatin modifying, cytoskeletal scaffolding, ubiquitin binding or conjugating, and membrane receptor or ion-channel modifying proteins." Q#7224 - CGI_10014641 superfamily 149946 175 265 5.01E-14 68.4499 cl07621 efhand_2 superfamily - - "EF-hand; Members of this family adopt a helix-loop-helix motif, as per other EF hand domains. However, since they do not contain the canonical pattern of calcium binding residues found in many EF hand domains, they do not bind calcium ions. The main function of this domain is the provision of specificity in beta-dystroglycan recognition, though in dystrophin it serves an additional role: stabilisation of the WW domain (pfam00397), enhancing dystroglycan binding." Q#7224 - CGI_10014641 superfamily 165695 431 487 0.00666033 35.8998 cl14550 PLN00126 superfamily NC - "succinate dehydrogenase, cytochrome b subunit family; Provisional" Q#7225 - CGI_10014642 superfamily 243146 178 221 4.07E-16 69.5095 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#7225 - CGI_10014642 superfamily 243146 84 130 8.85E-15 65.6575 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#7225 - CGI_10014642 superfamily 243146 38 82 8.95E-13 60.2647 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#7225 - CGI_10014642 superfamily 243146 132 177 6.92E-12 57.9535 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#7225 - CGI_10014642 superfamily 243146 1 36 0.00980709 32.5304 cl02701 Kelch_3 superfamily N - "Galactose oxidase, central domain; Galactose oxidase, central domain. 
" Q#7226 - CGI_10014643 superfamily 248458 380 531 5.93E-05 44.6121 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." 
Q#7226 - CGI_10014643 superfamily 199528 694 827 3.21E-05 45.8572 cl15392 PRK10429 superfamily N - melibiose:sodium symporter; Provisional Q#7227 - CGI_10014644 superfamily 245201 1137 1404 9.22E-144 441.609 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#7229 - CGI_10014646 superfamily 241842 26 128 6.12E-44 141.179 cl00400 Fe-S_biosyn superfamily - - Iron-sulphur cluster biosynthesis; This family is involved in iron-sulphur cluster biosynthesis. Its members include proteins that are involved in nitrogen fixation such as the HesB and HesB-like proteins. Q#7230 - CGI_10014647 superfamily 245202 64 149 3.09E-47 153.184 cl09927 S1_like superfamily - - "S1_like: Ribosomal protein S1-like RNA-binding domain. Found in a wide variety of RNA-associated proteins. Originally identified in S1 ribosomal protein. This superfamily also contains the Cold Shock Domain (CSD), which is a homolog of the S1 domain. Both domains are members of the Oligonucleotide/oligosaccharide Binding (OB) fold." Q#7231 - CGI_10014648 superfamily 247727 156 254 7.12E-14 68.2254 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. 
Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#7231 - CGI_10014648 superfamily 247725 52 107 0.000595978 38.5359 cl17171 PH-like superfamily N - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting the protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and other proteins." Q#7233 - CGI_10014650 superfamily 247905 1058 1185 3.35E-25 103.857 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#7233 - CGI_10014650 superfamily 247805 749 888 2.25E-24 102.03 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#7233 - CGI_10014650 superfamily 243084 1402 1487 5.79E-39 142.567 cl02556 Bromodomain superfamily C - Bromodomain. 
Bromodomains are found in many chromatin-associated proteins and in nuclear histone acetyltransferases. They interact specifically with acetylated lysine. Q#7233 - CGI_10014650 superfamily 207699 681 719 1.03E-09 56.5335 cl02688 BRK superfamily - - BRK domain; The function of this domain is unknown. It is often found associated with helicases and transcription factors. Q#7233 - CGI_10014650 superfamily 244735 311 344 6.40E-09 53.9756 cl07469 QLQ superfamily - - "QLQ; The QLQ domain is named after the conserved Gln, Leu, Gln motif. The QLQ domain is found at the N-terminus of SWI2/SNF2 protein, which has been shown to be involved in protein-protein interactions. This domain has thus been postulated to be involved in mediating protein interactions." Q#7233 - CGI_10014650 superfamily 243139 553 594 2.25E-06 47.3922 cl02676 HSA superfamily C - HSA; This domain is predicted to bind DNA and is often found associated with helicases. Q#7234 - CGI_10014651 superfamily 242203 33 292 9.33E-89 269.181 cl00935 Brix superfamily - - Brix domain; Brix domain. Q#7235 - CGI_10014652 superfamily 243128 96 293 7.74E-14 68.8954 cl02652 MIF4G superfamily - - "MIF4G domain; MIF4G is named after Middle domain of eukaryotic initiation factor 4G (eIF4G). Also occurs in NMD2p and CBP80. The domain is rich in alpha-helices and may contain multiple alpha-helical repeats. In eIF4G, this domain binds eIF4A, eIF3, RNA and DNA." Q#7236 - CGI_10014653 superfamily 214022 14 225 1.18E-108 316.523 cl17166 MMACHC-like superfamily - - "Methylmalonic aciduria and homocystinuria type C protein and similar proteins; MMACHC, also called CblC, is involved in the intracellular processing of vitamin B12 by catalyzing two reactions: the reductive decyanation of cyanocobalamin in the presence of a flavoprotein oxidoreductase and the dealkylation of alkylcobalamins through the nucleophilic displacement of the alkyl group by glutathione. 
Mutations in MMACHC cause combined methylmalonic acidemia/aciduria and homocystinuria (CblC type), the most common inherited disorder of cobalamin metabolism. The structure of MMACHC reveals it to be the most divergent member of the NADPH-dependent flavin reductase family that can use FMN or FAD to catalyze reductive decyanation; it is also the first enzyme with glutathione transferase (GST) activity that is unrelated to the GST superfamily in structure and sequence." Q#7240 - CGI_10014657 superfamily 151147 1 63 7.75E-16 68.2689 cl11240 DUF2475 superfamily - - Protein of unknown function (DUF2475); This family of proteins has no known function. Q#7241 - CGI_10014658 superfamily 217833 38 219 3.11E-87 258.564 cl04364 VPS28 superfamily - - VPS28 protein; VPS28 protein. Q#7243 - CGI_10014660 superfamily 151147 18 84 7.12E-18 73.6617 cl11240 DUF2475 superfamily - - Protein of unknown function (DUF2475); This family of proteins has no known function. Q#7244 - CGI_10014661 superfamily 217833 36 217 1.80E-87 258.949 cl04364 VPS28 superfamily - - VPS28 protein; VPS28 protein. Q#7246 - CGI_10014663 superfamily 246751 93 386 8.22E-119 353.472 cl14883 Lipase superfamily - - "Lipase. Lipases are esterases that can hydrolyze long-chain acyl-triglycerides into di- and monoglycerides, glycerol, and free fatty acids at a water/lipid interface. A typical feature of lipases is "interfacial activation", the process of becoming active at the lipid/water interface, although several examples of lipases have been identified that do not undergo interfacial activation . The active site of a lipase contains a catalytic triad consisting of Ser - His - Asp/Glu, but unlike most serine proteases, the active site is buried inside the structure. A "lid" or "flap" covers the active site, making it inaccessible to solvent and substrates. The lid opens during the process of interfacial activation, allowing the lipid substrate access to the active site." 
Q#7246 - CGI_10014663 superfamily 241546 405 504 0.00104124 37.7482 cl00011 PLAT superfamily - - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats. The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#7247 - CGI_10014664 superfamily 246751 422 713 7.56E-96 304.551 cl14883 Lipase superfamily - - "Lipase. Lipases are esterases that can hydrolyze long-chain acyl-triglycerides into di- and monoglycerides, glycerol, and free fatty acids at a water/lipid interface. A typical feature of lipases is "interfacial activation", the process of becoming active at the lipid/water interface, although several examples of lipases have been identified that do not undergo interfacial activation . The active site of a lipase contains a catalytic triad consisting of Ser - His - Asp/Glu, but unlike most serine proteases, the active site is buried inside the structure. A "lid" or "flap" covers the active site, making it inaccessible to solvent and substrates. The lid opens during the process of interfacial activation, allowing the lipid substrate access to the active site." Q#7247 - CGI_10014664 superfamily 246751 62 353 2.94E-92 295.306 cl14883 Lipase superfamily - - "Lipase. Lipases are esterases that can hydrolyze long-chain acyl-triglycerides into di- and monoglycerides, glycerol, and free fatty acids at a water/lipid interface. A typical feature of lipases is "interfacial activation", the process of becoming active at the lipid/water interface, although several examples of lipases have been identified that do not undergo interfacial activation . 
The active site of a lipase contains a catalytic triad consisting of Ser - His - Asp/Glu, but unlike most serine proteases, the active site is buried inside the structure. A "lid" or "flap" covers the active site, making it inaccessible to solvent and substrates. The lid opens during the process of interfacial activation, allowing the lipid substrate access to the active site." Q#7247 - CGI_10014664 superfamily 246751 724 782 2.47E-13 69.9647 cl14883 Lipase superfamily N - "Lipase. Lipases are esterases that can hydrolyze long-chain acyl-triglycerides into di- and monoglycerides, glycerol, and free fatty acids at a water/lipid interface. A typical feature of lipases is "interfacial activation", the process of becoming active at the lipid/water interface, although several examples of lipases have been identified that do not undergo interfacial activation . The active site of a lipase contains a catalytic triad consisting of Ser - His - Asp/Glu, but unlike most serine proteases, the active site is buried inside the structure. A "lid" or "flap" covers the active site, making it inaccessible to solvent and substrates. The lid opens during the process of interfacial activation, allowing the lipid substrate access to the active site." Q#7248 - CGI_10014665 superfamily 241619 41 99 0.000494029 35.6357 cl00112 PAN_APPLE superfamily - - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." 
Q#7249 - CGI_10002273 superfamily 202351 1 129 2.82E-31 120.68 cl03662 Na_Pi_cotrans superfamily - - Na+/Pi-cotransporter; This is a family of mainly mammalian type II renal Na+/Pi-cotransporters with other related sequences from lower eukaryotes and bacteria some of which are also Na+/Pi-cotransporters. In the kidney the type II renal Na+/Pi-cotransporters protein allows re-absorption of filtered Pi in the proximal tubule. Q#7249 - CGI_10002273 superfamily 219963 872 968 1.19E-12 65.707 cl08487 GCV_T_C superfamily - - "Glycine cleavage T-protein C-terminal barrel domain; This is a family of glycine cleavage T-proteins, part of the glycine cleavage multienzyme complex (GCV) found in bacteria and the mitochondria of eukaryotes. GCV catalyzes the catabolism of glycine in eukaryotes. The T-protein is an aminomethyl transferase." Q#7249 - CGI_10002273 superfamily 202351 119 207 1.66E-07 50.5732 cl03662 Na_Pi_cotrans superfamily C - Na+/Pi-cotransporter; This is a family of mainly mammalian type II renal Na+/Pi-cotransporters with other related sequences from lower eukaryotes and bacteria some of which are also Na+/Pi-cotransporters. In the kidney the type II renal Na+/Pi-cotransporters protein allows re-absorption of filtered Pi in the proximal tubule. Q#7249 - CGI_10002273 superfamily 183292 258 375 6.37E-06 48.2792 cl18135 PRK11728 superfamily NC - hydroxyglutarate oxidase; Provisional Q#7249 - CGI_10002273 superfamily 242769 740 825 2.41E-05 44.6949 cl01893 SoxG superfamily N - "Sarcosine oxidase, gamma subunit family; Sarcosine oxidase is a hetero-tetrameric enzyme that contains both covalently bound FMN and non-covalently bound FAD and NAD(+). This enzyme catalyzes the oxidative demethylation of sarcosine to yield glycine, H2O2, and 5,10-CH2-tetrahydrofolate (H4folate) in a reaction requiring H4folate and O2." 
Q#7250 - CGI_10002275 superfamily 245815 15 468 0 746.876 cl11961 ALDH-SF superfamily - - "NAD(P)+-dependent aldehyde dehydrogenase superfamily; The aldehyde dehydrogenase superfamily (ALDH-SF) of NAD(P)+-dependent enzymes, in general, oxidize a wide range of endogenous and exogenous aliphatic and aromatic aldehydes to their corresponding carboxylic acids and play an important role in detoxification. Besides aldehyde detoxification, many ALDH isozymes possess multiple additional catalytic and non-catalytic functions such as participating in metabolic pathways, or as binding proteins, or osmoregulants, to mention a few. The enzyme has three domains, a NAD(P)+ cofactor-binding domain, a catalytic domain, and a bridging domain; and the active enzyme is generally either homodimeric or homotetrameric. The catalytic mechanism is proposed to involve cofactor binding, resulting in a conformational change and activation of an invariant catalytic cysteine nucleophile. The cysteine and aldehyde substrate form an oxyanion thiohemiacetal intermediate resulting in hydride transfer to the cofactor and formation of a thioacylenzyme intermediate. Hydrolysis of the thioacylenzyme and release of the carboxylic acid product occurs, and in most cases, the reduced cofactor dissociates from the enzyme. The evolutionary phylogenetic tree of ALDHs appears to have an initial bifurcation between what has been characterized as the classical aldehyde dehydrogenases, the ALDH family (ALDH) and extended family members or aldehyde dehydrogenase-like (ALDH-L) proteins. 
The ALDH proteins are represented by enzymes which share a number of highly conserved residues necessary for catalysis and cofactor binding and they include such proteins as retinal dehydrogenase, 10-formyltetrahydrofolate dehydrogenase, non-phosphorylating glyceraldehyde 3-phosphate dehydrogenase, delta(1)-pyrroline-5-carboxylate dehydrogenases, alpha-ketoglutaric semialdehyde dehydrogenase, alpha-aminoadipic semialdehyde dehydrogenase, coniferyl aldehyde dehydrogenase and succinate-semialdehyde dehydrogenase. Included in this larger group are all human, Arabidopsis, Tortula, fungal, protozoan, and Drosophila ALDHs identified in families ALDH1 through ALDH22 with the exception of families ALDH18, ALDH19, and ALDH20 which are present in the ALDH-like group. The ALDH-like group is represented by such proteins as gamma-glutamyl phosphate reductase, LuxC-like acyl-CoA reductase, and coenzyme A acylating aldehyde dehydrogenase. All of these proteins have a conserved cysteine that aligns with the catalytic cysteine of the ALDH group." Q#7251 - CGI_10002277 superfamily 241782 11 377 0 625.785 cl00321 AAT_I superfamily - - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. 
The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phosphorylase family (fold type V)." Q#7252 - CGI_10004890 superfamily 245814 281 351 2.51E-08 50.5655 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#7252 - CGI_10004890 superfamily 241578 20 188 3.00E-32 119.867 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#7252 - CGI_10004890 superfamily 245213 196 231 4.93E-07 46.0906 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#7253 - CGI_10004891 superfamily 245814 147 223 3.77E-05 40.9355 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#7253 - CGI_10004891 superfamily 245814 56 125 2.78E-05 41.525 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#7254 - CGI_10004892 superfamily 215577 26 161 1.36E-05 42.7331 cl14728 PLN03103 superfamily N - GDP-L-galactose-hexose-1-phosphate guanyltransferase; Provisional Q#7255 - CGI_10004893 superfamily 147983 14 166 1.04E-40 144.471 cl05576 BRE superfamily C - "Brain and reproductive organ-expressed protein (BRE); This family consists of several eukaryotic brain and reproductive organ-expressed (BRE) proteins. BRE is a putative stress-modulating gene, found able to down-regulate TNF-alpha-induced-NF-kappaB activation upon over expression. A total of six isoforms are produced by alternative splicing predominantly at either end of the gene. Compared to normal cells, immortalised human cell lines uniformly express higher levels of BRE. Peripheral blood monocytes respond to LPS by down-regulating the expression of all the BRE isoforms. It is thought that the function of BRE and its isoforms is to regulate peroxisomal activities." Q#7255 - CGI_10004893 superfamily 147983 259 305 3.04E-06 46.6303 cl05576 BRE superfamily N - "Brain and reproductive organ-expressed protein (BRE); This family consists of several eukaryotic brain and reproductive organ-expressed (BRE) proteins. 
BRE is a putative stress-modulating gene, found able to down-regulate TNF-alpha-induced-NF-kappaB activation upon over expression. A total of six isoforms are produced by alternative splicing predominantly at either end of the gene. Compared to normal cells, immortalised human cell lines uniformly express higher levels of BRE. Peripheral blood monocytes respond to LPS by down-regulating the expression of all the BRE isoforms. It is thought that the function of BRE and its isoforms is to regulate peroxisomal activities." Q#7256 - CGI_10004894 superfamily 247805 161 242 2.78E-10 56.7241 cl17251 DEXDc superfamily N - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#7257 - CGI_10004895 superfamily 247905 2 124 3.80E-24 91.9156 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#7258 - CGI_10004896 superfamily 246975 289 310 0.00554021 34.6301 cl15478 zf-C2H2 superfamily - - "Zinc finger, C2H2 type; The C2H2 zinc finger is the classical zinc finger domain. The two conserved cysteines and histidines co-ordinate a zinc ion. The following pattern describes the zinc finger. #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C] Where X can be any amino acid, and numbers in brackets indicate the number of residues. 
The positions marked # are those that are important for the stable fold of the zinc finger. The final position can be either his or cys. The C2H2 zinc finger is composed of two short beta strands followed by an alpha helix. The amino terminal part of the helix binds the major groove in DNA binding zinc fingers. The accepted consensus binding sequence for Sp1 is usually defined by the asymmetric hexanucleotide core GGGCGG but this sequence does not include, among others, the GAG (=CTC) repeat that constitutes a high-affinity site for Sp1 binding to the wt1 promoter." Q#7258 - CGI_10004896 superfamily 243147 365 388 0.00898994 34.1408 cl02703 zf-BED superfamily N - BED zinc finger; BED zinc finger. Q#7259 - CGI_10004897 superfamily 246752 8 63 0.00398592 32.7697 cl14886 UPF0227 superfamily C - "Uncharacterized protein family (UPF0227); Despite being classed as uncharacterized proteins, the members of this family are almost certainly enzymes that are distantly related to the pfam00561." Q#7260 - CGI_10000602 superfamily 247684 1 115 6.78E-77 233.824 cl17037 NBD_sugar-kinase_HSP70_actin superfamily N - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein 
EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#7261 - CGI_10017571 superfamily 245814 224 284 8.65E-05 41.7059 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#7262 - CGI_10017572 superfamily 243087 46 119 6.68E-28 108.836 cl02562 PWI superfamily - - PWI domain; PWI domain. Q#7264 - CGI_10017574 superfamily 247725 104 229 2.74E-68 226.145 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." 
Q#7264 - CGI_10017574 superfamily 243096 690 880 2.44E-47 169.015 cl02571 RhoGEF superfamily - - Guanine nucleotide exchange factor for Rho/Rac/Cdc42-like GTPases; Also called Dbl-homologous (DH) domain. It appears that PH domains invariably occur C-terminal to RhoGEF/DH domains. Q#7264 - CGI_10017574 superfamily 241622 492 579 1.18E-10 59.8878 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#7264 - CGI_10017574 superfamily 247725 883 1055 1.93E-55 191.826 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. 
They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#7264 - CGI_10017574 superfamily 241645 418 490 2.45E-09 55.9799 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#7266 - CGI_10017576 superfamily 247725 38 113 1.38E-12 64.6845 cl17171 PH-like superfamily N - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#7269 - CGI_10017579 superfamily 110440 387 414 0.000339512 38.1577 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. 
It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#7272 - CGI_10017582 superfamily 241659 153 233 3.68E-14 65.2339 cl00175 alpha-crystallin-Hsps_p23-like superfamily - - "alpha-crystallin domain (ACD) found in alpha-crystallin-type small heat shock proteins, and a similar domain found in p23 (a cochaperone for Hsp90) and in other p23-like proteins.; The alpha-crystallin-Hsps_p23-like superfamily includes the alpha-crystallin domain (ACD) of alpha-crystallin-type small heat shock proteins (sHsps) and a similar domain found in p23-like proteins. sHsps are small stress induced proteins with monomeric masses between 12-43 kDa, whose common feature is this ACD. sHsps are generally active as large oligomers consisting of multiple subunits, and are believed to be ATP-independent chaperones that prevent aggregation and are important in refolding in combination with other Hsps. p23 is a cochaperone of the Hsp90 chaperoning pathway. It binds Hsp90 and participates in the folding of a number of Hsp90 clients including the progesterone receptor. p23 also has a passive chaperoning activity. p23 in addition may act as the cytosolic prostaglandin E2 synthase. Included in this superfamily is the p23-like C-terminal CHORD-SGT1 (CS) domain of suppressor of G2 allele of Skp1 (Sgt1) and the p23-like domains of human butyrate-induced transcript 1 (hB-ind1), NUD (nuclear distribution) C, Melusin, and NAD(P)H cytochrome b5 (NCB5) oxidoreductase (OR)." 
Q#7272 - CGI_10017582 superfamily 241659 47 124 8.06E-14 64.0783 cl00175 alpha-crystallin-Hsps_p23-like superfamily - - "alpha-crystallin domain (ACD) found in alpha-crystallin-type small heat shock proteins, and a similar domain found in p23 (a cochaperone for Hsp90) and in other p23-like proteins.; The alpha-crystallin-Hsps_p23-like superfamily includes the alpha-crystallin domain (ACD) of alpha-crystallin-type small heat shock proteins (sHsps) and a similar domain found in p23-like proteins. sHsps are small stress induced proteins with monomeric masses between 12-43 kDa, whose common feature is this ACD. sHsps are generally active as large oligomers consisting of multiple subunits, and are believed to be ATP-independent chaperones that prevent aggregation and are important in refolding in combination with other Hsps. p23 is a cochaperone of the Hsp90 chaperoning pathway. It binds Hsp90 and participates in the folding of a number of Hsp90 clients including the progesterone receptor. p23 also has a passive chaperoning activity. p23 in addition may act as the cytosolic prostaglandin E2 synthase. Included in this superfamily is the p23-like C-terminal CHORD-SGT1 (CS) domain of suppressor of G2 allele of Skp1 (Sgt1) and the p23-like domains of human butyrate-induced transcript 1 (hB-ind1), NUD (nuclear distribution) C, Melusin, and NAD(P)H cytochrome b5 (NCB5) oxidoreductase (OR)." Q#7273 - CGI_10017583 superfamily 248458 157 472 3.44E-18 84.6729 cl17904 MFS superfamily - - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. 
Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#7274 - CGI_10017584 superfamily 248458 160 422 5.07E-29 116.644 cl17904 MFS superfamily - - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. 
The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#7275 - CGI_10017585 superfamily 248458 135 369 1.72E-10 61.1757 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. 
Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#7276 - CGI_10017586 superfamily 248458 150 282 4.05E-15 75.4281 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 
Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#7276 - CGI_10017586 superfamily 248458 339 485 9.04E-09 55.7829 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. 
Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#7278 - CGI_10017588 superfamily 216939 708 793 2.64E-10 58.0581 cl03492 PC4 superfamily - - Transcriptional Coactivator p15 (PC4); p15 has a bipartite structure composed of an amino-terminal regulatory domain and a carboxy-terminal cryptic DNA-binding domain. The DNA-binding activity of the carboxy-terminal is disguised by the amino-terminal p15 domain. Activity is controlled by protein kinases that target the regulatory domain. Q#7278 - CGI_10017588 superfamily 241563 60 102 3.08E-06 45.548 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#7278 - CGI_10017588 superfamily 216939 634 697 3.62E-06 45.7317 cl03492 PC4 superfamily N - Transcriptional Coactivator p15 (PC4); p15 has a bipartite structure composed of an amino-terminal regulatory domain and a carboxy-terminal cryptic DNA-binding domain. The DNA-binding activity of the carboxy-terminal is disguised by the amino-terminal p15 domain. Activity is controlled by protein kinases that target the regulatory domain. Q#7279 - CGI_10017589 superfamily 241563 568 610 0.000110181 41.3108 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#7279 - CGI_10017589 superfamily 241563 19 61 0.00017219 40.5404 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#7279 - CGI_10017589 superfamily 191851 97 172 0.000286447 41.4615 cl06708 DUF1640 superfamily N - Protein of unknown function (DUF1640); This family consists of sequences derived from hypothetical eukaryotic proteins. A region approximately 100 residues in length is featured. Q#7279 - CGI_10017589 superfamily 128778 621 723 0.00220974 38.0147 cl17972 BBC superfamily - - B-Box C-terminal domain; Coiled coil region C-terminal to (some) B-Box domains Q#7281 - CGI_10017591 superfamily 110440 237 263 1.12E-05 41.6245 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#7282 - CGI_10017592 superfamily 241564 372 437 1.18E-25 101.188 cl00035 BIR superfamily - - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." 
Q#7282 - CGI_10017592 superfamily 241564 252 320 5.47E-24 96.1807 cl00035 BIR superfamily - - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." Q#7282 - CGI_10017592 superfamily 241564 30 95 4.00E-17 76.9207 cl00035 BIR superfamily - - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." Q#7282 - CGI_10017592 superfamily 247792 587 626 2.89E-05 42.0476 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#7284 - CGI_10000905 superfamily 241674 113 146 2.40E-09 50.2876 cl00194 EF1B superfamily C - "Elongation factor 1 beta (EF1B) guanine nucleotide exchange domain.
EF1B catalyzes the exchange of GDP bound to the G-protein, EF1A, for GTP, an important step in the elongation cycle of the protein biosynthesis. EF1A binds to and delivers the aminoacyl tRNA to the ribosome. The guanine nucleotide exchange domain of EF1B, which is the alpha subunit in yeast, is responsible for the catalysis of this exchange reaction." Q#7284 - CGI_10000905 superfamily 204519 80 106 1.08E-06 42.2343 cl11209 EF-1_beta_acid superfamily - - Eukaryotic elongation factor 1 beta central acidic region; Eukaryotic elongation factor 1 beta central acidic region. Q#7285 - CGI_10004949 superfamily 247684 119 321 6.55E-30 117.378 cl17037 NBD_sugar-kinase_HSP70_actin superfamily N - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. 
The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#7286 - CGI_10004950 superfamily 247725 96 197 4.87E-41 137.336 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#7286 - CGI_10004950 superfamily 247725 10 88 4.79E-38 130.435 cl17171 PH-like superfamily N - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." 
Q#7287 - CGI_10004951 superfamily 247684 11 398 1.96E-96 297.652 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#7289 - CGI_10004953 superfamily 245206 4 290 8.98E-93 279.126 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. 
NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#7291 - CGI_10004955 superfamily 242903 23 158 2.44E-86 252.864 cl02148 APC10-like superfamily - - "APC10-like DOC1 domains in E3 ubiquitin ligases that mediate substrate ubiquitination; This family contains the single domain protein, APC10, a subunit of the anaphase-promoting complex (APC), as well as the DOC1 domain of multi-domain proteins present in E3 ubiquitin ligases. E3 ubiquitin ligases mediate substrate ubiquitination (or ubiquitylation), a component of the ubiquitin-26S proteasome pathway for selective proteolytic degradation. The APC, a multi-protein complex (or cyclosome), is a cell cycle-regulated, E3 ubiquitin ligase that controls important transitions in mitosis and the G1 phase by ubiquitinating regulatory proteins, thereby targeting them for degradation. APC10-like DOC1 domains such as those present in HECT (Homologous to the E6-AP Carboxyl Terminus) and Cullin-RING (Really Interesting New Gene) E3 ubiquitin ligase proteins, HECTD3, and CUL7, respectively, are also included in this hierarchy. CUL7 is a member of the Cullin-RING ligase family and functions as a molecular scaffold assembling a SCF-ROC1-like E3 ubiquitin ligase complex consisting of Skp1, CUL7, Fbx29 F-box protein, and ROC1 (RING-box protein 1) and promotes ubiquitination. 
CUL7 is a multi-domain protein with a C-terminal cullin domain that binds ROC1 and a centrally positioned APC10/DOC1 domain. HECTD3 contains a C-terminal HECT domain which contains the active site for ubiquitin transfer onto substrates, and an N-terminal APC10 domain which is responsible for substrate recognition and binding. An APC10/DOC1 domain homolog is also present in HERC2 (HECT domain and RLD2), a large multi-domain protein with three RCC1-like domains (RLDs), additional internal domains including zinc finger ZZ-type and Cyt-b5 (Cytochrome b5-like Heme/Steroid binding) domains, and a C-terminal HECT domain. Recent studies have shown that the protein complex HERC2-RNF8 coordinates ubiquitin-dependent assembly of DNA repair factors on damaged chromosomes. Also included in this hierarchy is an uncharacterized APC10/DOC1-like domain found in a multi-domain protein, which also contains CUB, zinc finger ZZ-type, and EF-hand domains. The APC10/DOC1 domain forms a beta-sandwich structure that is related in architecture to the galactose-binding domain-like fold; their sequences are quite dissimilar, however, and are not included here." Q#7292 - CGI_10004956 superfamily 247692 94 434 1.34E-69 243.89 cl17068 AFD_class_I superfamily N - "Adenylate forming domain, Class I; This family includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain." Q#7292 - CGI_10004956 superfamily 219000 1266 1459 9.92E-62 212.121 cl05717 Drf_FH3 superfamily - - Diaphanous FH3 Domain; This region is found in the Formin-like and diaphanous proteins.
Q#7292 - CGI_10004956 superfamily 219001 1089 1262 2.00E-29 118.565 cl05720 Drf_GBD superfamily - - "Diaphanous GTPase-binding Domain; This domain is bound to by GTP-attached Rho proteins, leading to activation of the Drf protein." Q#7292 - CGI_10004956 superfamily 245660 677 949 4.72E-21 96.2469 cl11493 PQQ_DH_like superfamily - - "PQQ-dependent dehydrogenases and related proteins; This family is composed of dehydrogenases with pyrroloquinoline quinone (PQQ) as a cofactor, such as ethanol, methanol, and membrane-bound glucose dehydrogenases. The alignment model contains an 8-bladed beta-propeller, and the family also includes distantly related proteins which are not enzymatically active and do not bind PQQ." Q#7292 - CGI_10004956 superfamily 245209 489 552 0.00364517 37.9194 cl09936 PP-binding superfamily - - Phosphopantetheine attachment site; A 4'-phosphopantetheine prosthetic group is attached through a serine. This prosthetic group acts as a 'swinging arm' for the attachment of activated fatty acid and amino-acid groups. This domain forms a four helix bundle. This family includes members not included in Prosite. The inclusion of these members is supported by sequence analysis and functional evidence. The related domain of Vibrio anguillarum angR has the attachment serine replaced by an alanine. Q#7293 - CGI_10004957 superfamily 243098 656 704 6.83E-14 67.6231 cl02573 TUDOR superfamily - - "Tudor domains are found in many eukaryotic organisms and have been implicated in protein-protein interactions in which methylated protein substrates bind to these domains. For example, the Tudor domain of Survival of Motor Neuron (SMN) binds to symmetrically dimethylated arginines of arginine-glycine (RG) rich sequences found in the C-terminal tails of Sm proteins. The SMN protein is linked to spinal muscular atrophy. Another example is the tandem tudor domains of 53BP1, which bind to histone H4 specifically dimethylated at Lys20 (H4-K20me2).
53BP1 is a key transducer of the DNA damage checkpoint signal." Q#7293 - CGI_10004957 superfamily 219918 6 86 9.39E-17 76.9588 cl07265 DUF1767 superfamily - - Domain of unknown function (DUF1767); Eukaryotic domain of unknown function. This domain is found to the N-terminus of the nucleic acid binding domain. Q#7295 - CGI_10006736 superfamily 241675 88 318 5.83E-116 351.548 cl00195 SIR2 superfamily - - "SIR2 superfamily of proteins includes silent information regulator 2 (Sir2) enzymes which catalyze NAD+-dependent protein/histone deacetylation, where the acetyl group from the lysine epsilon-amino group is transferred to the ADP-ribose moiety of NAD+, producing nicotinamide and the novel metabolite O-acetyl-ADP-ribose. Sir2 proteins, also known as sirtuins, are found in all eukaryotes and many archaea and prokaryotes and have been shown to regulate gene silencing, DNA repair, metabolic enzymes, and life span. The most-studied function, gene silencing, involves the inactivation of chromosome domains containing key regulatory genes by packaging them into a specialized chromatin structure that is inaccessible to DNA-binding proteins. The oligomerization state of Sir2 appears to be organism-dependent, sometimes occurring as a monomer and sometimes as a multimer. Also included in this superfamily is a group of uncharacterized Sir2-like proteins which lack certain key catalytic residues and conserved zinc binding cysteines." Q#7296 - CGI_10006737 superfamily 242206 61 141 1.70E-23 89.2018 cl00938 Rieske superfamily N - "Rieske domain; a [2Fe-2S] cluster binding domain commonly found in Rieske non-heme iron oxygenase (RO) systems such as naphthalene and biphenyl dioxygenases, as well as in plant/cyanobacterial chloroplast b6f and mitochondrial cytochrome bc(1) complexes. The Rieske domain can be divided into two subdomains, with an incomplete six-stranded, antiparallel beta-barrel at one end, and an iron-sulfur cluster binding subdomain at the other. 
The Rieske iron-sulfur center contains a [2Fe-2S] cluster, which is involved in electron transfer, and is liganded to two histidine and two cysteine residues present in conserved sequences called Rieske motifs. In RO systems, the N-terminal Rieske domain of the alpha subunit acts as an electron shuttle that accepts electrons from a reductase or ferredoxin component and transfers them to the mononuclear iron in the alpha subunit C-terminal domain to be used for catalysis." Q#7298 - CGI_10006739 superfamily 217380 29 332 5.25E-112 331.982 cl18406 TTL superfamily - - "Tubulin-tyrosine ligase family; Tubulins and microtubules are subjected to several post-translational modifications of which the reversible detyrosination/tyrosination of the carboxy-terminal end of most alpha-tubulins has been extensively analysed. This modification cycle involves a specific carboxypeptidase and the activity of the tubulin-tyrosine ligase (TTL). The true physiological function of TTL has so far not been established. Tubulin-tyrosine ligase (TTL) catalyzes the ATP-dependent post-translational addition of a tyrosine to the carboxy terminal end of detyrosinated alpha-tubulin. In normally cycling cells, the tyrosinated form of tubulin predominates. However, in breast cancer cells, the detyrosinated form frequently predominates, with a correlation to tumour aggressiveness. On the other hand, 3-nitrotyrosine has been shown to be incorporated, by TTL, into the carboxy terminal end of detyrosinated alpha-tubulin. This reaction is not reversible by the carboxypeptidase enzyme. Cells cultured in 3-nitrotyrosine rich medium showed evidence of altered microtubule structure and function, including altered cell morphology, epithelial barrier dysfunction, and apoptosis. Bacterial homologs of TTL are predicted to form peptide tags. Some of these are fused to a 2-oxoglutarate Fe(II)-dependent dioxygenase domain." 
Q#7299 - CGI_10006740 superfamily 246940 77 251 8.91E-07 47.7134 cl15377 Radical_SAM superfamily C - "Radical SAM superfamily. Enzymes of this family generate radicals by combining a 4Fe-4S cluster and S-adenosylmethionine (SAM) in close proximity. They are characterized by a conserved CxxxCxxC motif, which coordinates the conserved iron-sulfur cluster. Mechanistically, they share the transfer of a single electron from the iron-sulfur cluster to SAM, which leads to its reductive cleavage to methionine and a 5'-deoxyadenosyl radical, which, in turn, abstracts a hydrogen from the appropriately positioned carbon atom. Depending on the enzyme, SAM is consumed during this process or it is restored and reused. Radical SAM enzymes catalyze steps in metabolism, DNA repair, the biosynthesis of vitamins and coenzymes, and the biosynthesis of many antibiotics. Examples are biotin synthase (BioB), lipoyl synthase (LipA), pyruvate formate-lyase (PFL), coproporphyrinogen oxidase (HemN), lysine 2,3-aminomutase (LAM), anaerobic ribonucleotide reductase (ARR), and MoaA, an enzyme of the biosynthesis of molybdopterin." Q#7299 - CGI_10006740 superfamily 203999 259 323 1.36E-22 89.9645 cl15482 Wyosine_form superfamily - - Wyosine base formation; Some proteins in this family appear to be important in wyosine base formation in a subset of phenylalanine specific tRNAs. It has been proposed that they participates in converting tRNA(Phe)-m(1)G(37) to tRNA(Phe)-yW. Q#7300 - CGI_10006741 superfamily 241858 96 215 2.35E-14 66.9252 cl00429 SNARE_assoc superfamily - - SNARE associated Golgi protein; This is a family of SNARE associated Golgi proteins. The yeast member of this family localises with the t-SNARE Tlg2. Q#7302 - CGI_10006743 superfamily 245864 75 215 9.46E-17 78.4742 cl12078 p450 superfamily C - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. 
They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#7305 - CGI_10002252 superfamily 220653 133 185 7.67E-16 70.4783 cl10936 PP28 superfamily C - "Casein kinase substrate phosphoprotein PP28; This domain is a region of 70 residues conserved in proteins from plants to humans and contains a serine/arginine rich motif. In rats the full protein is a casein kinase substrate, and this region contains phosphorylation sites for both cAMP-dependent protein kinase and casein kinase II." Q#7305 - CGI_10002252 superfamily 222070 229 283 3.80E-09 53.0652 cl18634 DDE_3 superfamily N - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis.
The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#7306 - CGI_10002253 superfamily 242901 1 144 2.17E-86 251.434 cl02138 G10 superfamily - - G10 protein; G10 protein. Q#7307 - CGI_10024807 superfamily 220692 27 151 0.000375663 40.2653 cl18570 7TM_GPCR_Srw superfamily C - Serpentine type 7TM GPCR chemoreceptor Srw; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srw is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. The genes encoding Srw do not appear to be under as strong an adaptive evolutionary pressure as those of Srz. Q#7309 - CGI_10024809 superfamily 245206 35 196 4.83E-68 211.753 cl09931 NADB_Rossmann superfamily C - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." 
Q#7311 - CGI_10024811 superfamily 243092 121 339 4.36E-25 107.036 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#7311 - CGI_10024811 superfamily 246721 587 781 6.68E-07 51.4897 cl14807 ACE1-Sec16-like superfamily N - "Ancestral coatomer element 1 (ACE1) of COPII coat complex assembly protein Sec16; COPII coat complex plays an important role in vesicular traffic of newly synthesized proteins from the endoplasmic reticulum (ER) to the Golgi apparatus by mediating the formation of transport vesicles. COPII consists of an outer coat, made up of the scaffold proteins Sec31 and Sec13, and the cargo adaptor complex, Sec23 and Sec24, which are recruited by the small GTPase Sar1. Sec16 is involved in the early steps of the assembly process. Sec16 forms elongated heterotetramers with Sec13, Sec13-(Sec16)2-Sec13.
It interacts with Sec13 by insertion of a single beta-blade to close the six-bladed beta propeller of Sec13. In the same way Sec13 interacts with Sec31 and Nup145C, a nuclear pore protein; all of these contain a structurally related ancestral coatomer element 1 (ACE1). Sec16 is believed to be a key component in maintaining the integrity of the ER exit site." Q#7311 - CGI_10024811 superfamily 217249 1134 1178 0.000684051 39.9119 cl03742 Prp18 superfamily NC - "Prp18 domain; The splicing factor Prp18 is required for the second step of pre-mRNA splicing. The structure of a large fragment of the Saccharomyces cerevisiae Prp18 is known. This fragment is fully active in yeast splicing in vitro and includes the sequences of Prp18 that have been evolutionarily conserved. The core structure consists of five alpha-helices that adopt a novel fold. The most highly conserved region of Prp18, a nearly invariant stretch of 19 aa, forms part of a loop between two alpha-helices and may interact with the U5 small nuclear ribonucleoprotein particles." Q#7311 - CGI_10024811 superfamily 221501 863 972 0.00573053 37.0746 cl13679 RCR superfamily N - "Chitin synthesis regulation, resistance to Congo red; RCR proteins are ER membrane proteins that regulate chitin deposition in fungal cell walls. Although chitin, a linear polymer of beta-1,4-linked N-acetylglucosamine, constitutes only 2% of the cell wall it plays a vital role in the overall protection of the cell wall against stress, noxious chemicals and osmotic pressure changes. Congo red is a cell wall-disrupting benzidine-type dye extensively used in many cell wall mutant studies that specifically targets chitin in yeast cells and inhibits growth. RCR proteins render the yeasts resistant to Congo red by diminishing the content of chitin in the cell wall.
RCR proteins, which probably regulate chitin synthase III, interact directly with ubiquitin ligase Rsp5; the VPEY motif is necessary for this, via interaction with the WW domains of Rsp5." Q#7313 - CGI_10024813 superfamily 243109 759 937 5.14E-77 254.43 cl02614 SPRY superfamily - - "SPRY domain; SPRY domains, first identified in the SP1A kinase of Dictyostelium and rabbit Ryanodine receptor (hence the name), are homologous to B30.2. SPRY domains have been identified in at least 11 protein families, covering a wide range of functions, including regulation of cytokine signaling (SOCS), RNA metabolism (DDX1 and hnRNP), immunity to retroviruses (TRIM5alpha), intracellular calcium release (ryanodine receptors or RyR) and regulatory and developmental processes (HERC1 and Ash2L). B30.2 also contains residues in the N-terminus that form a distinct PRY domain structure; i.e. B30.2 domain consists of PRY and SPRY subdomains. B30.2 domains comprise the C-terminus of three protein families: BTNs (receptor glycoproteins of immunoglobulin superfamily); several TRIM proteins (composed of RING/B-box/coiled-coil or RBCC core); Stonutoxin (secreted poisonous protein of the stonefish Synanceia horrida). While SPRY domains are evolutionarily ancient, B30.2 domains are a more recent adaptation where the SPRY/PRY combination is a possible component of immune defense. Mutations found in the SPRY-containing proteins have been shown to cause Mediterranean fever and Opitz syndrome." Q#7313 - CGI_10024813 superfamily 218425 4 517 0 721.406 cl04931 eIF-3_zeta superfamily - - "Eukaryotic translation initiation factor 3 subunit 7 (eIF-3); This family is made up of eukaryotic translation initiation factor 3 subunit 7 (eIF-3 zeta/eIF3 p66/eIF3d). Eukaryotic initiation factor 3 is a multi-subunit complex that is required for binding of mRNA to 40 S ribosomal subunits, stabilisation of ternary complex binding to 40 S subunits, and dissociation of 40 and 60 S subunits.
These functions and the complex nature of eIF3 suggest multiple interactions with many components of the translational machinery. The gene coding for the protein has been implicated in cancer in mammals." Q#7313 - CGI_10024813 superfamily 247807 972 1097 0.000516934 40.3562 cl17253 AAA_17 superfamily - - AAA domain; AAA domain. Q#7314 - CGI_10024814 superfamily 243051 446 600 4.18E-40 145.213 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#7314 - CGI_10024814 superfamily 243051 269 429 9.89E-39 141.361 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). 
In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#7314 - CGI_10024814 superfamily 243051 606 736 2.71E-37 137.509 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#7314 - CGI_10024814 superfamily 241568 200 258 1.05E-07 49.7688 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. 
Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#7314 - CGI_10024814 superfamily 241583 82 194 4.48E-44 156.963 cl00064 ZnMc superfamily C - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#7315 - CGI_10024815 superfamily 244704 3 90 6.41E-44 143.776 cl07364 Nfu_N superfamily - - "Scaffold protein Nfu/NifU N terminal; This domain is found at the N terminus of NifU and NifU related proteins, and in the human Nfu protein. Both of these proteins are thought to be involved in the assembly of iron-sulphur clusters." Q#7315 - CGI_10024815 superfamily 241897 106 186 9.02E-27 99.294 cl00484 NifU superfamily - - NifU-like domain; This is an alignment of the carboxy-terminal domain. This is the only common region between the NifU protein from nitrogen-fixing bacteria and rhodobacterial species. The biochemical function of NifU is unknown. Q#7316 - CGI_10024816 superfamily 244704 3 90 6.41E-44 143.776 cl07364 Nfu_N superfamily - - "Scaffold protein Nfu/NifU N terminal; This domain is found at the N terminus of NifU and NifU related proteins, and in the human Nfu protein. Both of these proteins are thought to be involved in the assembly of iron-sulphur clusters." Q#7316 - CGI_10024816 superfamily 241897 106 186 9.02E-27 99.294 cl00484 NifU superfamily - - NifU-like domain; This is an alignment of the carboxy-terminal domain. This is the only common region between the NifU protein from nitrogen-fixing bacteria and rhodobacterial species. The biochemical function of NifU is unknown. 
Q#7317 - CGI_10024817 superfamily 248054 27 238 2.00E-27 109.699 cl17500 NAD_binding_8 superfamily - - NAD(P)-binding Rossmann-like domain; NAD(P)-binding Rossmann-like domain. Q#7318 - CGI_10024818 superfamily 244509 369 500 1.82E-16 76.7971 cl06793 PRKCSH superfamily - - "Glucosidase II beta subunit-like protein; The sequences found in this family are similar to a region found in the beta-subunit of glucosidase II, which is also known as protein kinase C substrate 80K-H (PRKCSH). The enzyme catalyzes the sequential removal of two alpha-1,3-linked glucose residues in the second step of N-linked oligosaccharide processing. The beta subunit is required for the solubility and stability of the heterodimeric enzyme, and is involved in retaining the enzyme within the endoplasmic reticulum. Mutations in the gene coding for PRKCSH have been found to be involved in the development of autosomal dominant polycystic liver disease (ADPLD), but the precise role the protein has in the pathogenesis of this disease is unknown. This family also includes an ER sensor for misfolded glycoproteins and is therefore likely to be a generic sugar binding domain." Q#7318 - CGI_10024818 superfamily 247856 211 266 0.0029515 35.9865 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#7324 - CGI_10024825 superfamily 216971 200 369 9.01E-47 158.938 cl03532 Octopine_DH superfamily - - "NAD/NADP octopine/nopaline dehydrogenase, alpha-helical domain; This group of enzymes act on the CH-NH substrate bond using NAD(+) or NADP(+) as an acceptor. The Pfam family consists mainly of octopine and nopaline dehydrogenases from Ti plasmids." 
Q#7324 - CGI_10024825 superfamily 201664 8 126 1.24E-08 52.616 cl18216 NAD_Gly3P_dh_N superfamily C - NAD-dependent glycerol-3-phosphate dehydrogenase N-terminus; NAD-dependent glycerol-3-phosphate dehydrogenase (GPDH) catalyzes the interconversion of dihydroxyacetone phosphate and L-glycerol-3-phosphate. This family represents the N-terminal NAD-binding domain. Q#7325 - CGI_10024826 superfamily 216971 216 384 1.45E-45 156.241 cl03532 Octopine_DH superfamily - - "NAD/NADP octopine/nopaline dehydrogenase, alpha-helical domain; This group of enzymes act on the CH-NH substrate bond using NAD(+) or NADP(+) as an acceptor. The Pfam family consists mainly of octopine and nopaline dehydrogenases from Ti plasmids." Q#7325 - CGI_10024826 superfamily 201664 20 139 6.07E-09 53.7716 cl18216 NAD_Gly3P_dh_N superfamily C - NAD-dependent glycerol-3-phosphate dehydrogenase N-terminus; NAD-dependent glycerol-3-phosphate dehydrogenase (GPDH) catalyzes the interconversion of dihydroxyacetone phosphate and L-glycerol-3-phosphate. This family represents the N-terminal NAD-binding domain. Q#7326 - CGI_10024827 superfamily 216971 197 368 3.77E-42 146.611 cl03532 Octopine_DH superfamily - - "NAD/NADP octopine/nopaline dehydrogenase, alpha-helical domain; This group of enzymes act on the CH-NH substrate bond using NAD(+) or NADP(+) as an acceptor. The Pfam family consists mainly of octopine and nopaline dehydrogenases from Ti plasmids." Q#7326 - CGI_10024827 superfamily 201664 4 123 2.96E-06 45.6824 cl18216 NAD_Gly3P_dh_N superfamily C - NAD-dependent glycerol-3-phosphate dehydrogenase N-terminus; NAD-dependent glycerol-3-phosphate dehydrogenase (GPDH) catalyzes the interconversion of dihydroxyacetone phosphate and L-glycerol-3-phosphate. This family represents the N-terminal NAD-binding domain. 
Q#7328 - CGI_10024829 superfamily 216971 91 261 5.63E-42 143.53 cl03532 Octopine_DH superfamily - - "NAD/NADP octopine/nopaline dehydrogenase, alpha-helical domain; This group of enzymes act on the CH-NH substrate bond using NAD(+) or NADP(+) as an acceptor. The Pfam family consists mainly of octopine and nopaline dehydrogenases from Ti plasmids." Q#7329 - CGI_10024830 superfamily 216971 197 344 4.79E-34 123.885 cl03532 Octopine_DH superfamily - - "NAD/NADP octopine/nopaline dehydrogenase, alpha-helical domain; This group of enzymes act on the CH-NH substrate bond using NAD(+) or NADP(+) as an acceptor. The Pfam family consists mainly of octopine and nopaline dehydrogenases from Ti plasmids." Q#7329 - CGI_10024830 superfamily 201664 4 123 2.04E-08 51.8456 cl18216 NAD_Gly3P_dh_N superfamily C - NAD-dependent glycerol-3-phosphate dehydrogenase N-terminus; NAD-dependent glycerol-3-phosphate dehydrogenase (GPDH) catalyzes the interconversion of dihydroxyacetone phosphate and L-glycerol-3-phosphate. This family represents the N-terminal NAD-binding domain. Q#7330 - CGI_10024831 superfamily 206051 616 685 6.34E-34 126.879 cl16450 Acetyltransf_13 superfamily - - ESCO1/2 acetyl-transferase; ESCO1/2 acetyl-transferase. Q#7330 - CGI_10024831 superfamily 206051 1279 1348 8.18E-34 126.494 cl16450 Acetyltransf_13 superfamily - - ESCO1/2 acetyl-transferase; ESCO1/2 acetyl-transferase. Q#7330 - CGI_10024831 superfamily 206049 465 504 9.82E-15 70.8172 cl16448 zf-C2H2_3 superfamily - - zinc-finger of acetyl-transferase ESCO; zinc-finger of acetyl-transferase ESCO. Q#7330 - CGI_10024831 superfamily 206049 1128 1167 9.82E-15 70.8172 cl16448 zf-C2H2_3 superfamily - - zinc-finger of acetyl-transferase ESCO; zinc-finger of acetyl-transferase ESCO. 
Q#7331 - CGI_10024832 superfamily 246671 113 235 8.69E-08 48.9585 cl14606 Reeler_cohesin_like superfamily - - "Domains similar to the eukaryotic reeler domain and bacterial cohesins; This diverse family summarizes a set of distantly related domains, as revealed by structural similarity." Q#7333 - CGI_10024834 superfamily 243068 811 901 5.42E-07 50.8482 cl02523 Zona_pellucida superfamily N - Zona pellucida-like domain; Zona pellucida-like domain. Q#7333 - CGI_10024834 superfamily 246671 50 172 1.38E-05 44.6323 cl14606 Reeler_cohesin_like superfamily - - "Domains similar to the eukaryotic reeler domain and bacterial cohesins; This diverse family summarizes a set of distantly related domains, as revealed by structural similarity." Q#7335 - CGI_10024836 superfamily 243091 350 398 0.000528413 39.6251 cl02566 SET superfamily N - "SET domain; SET domains are protein lysine methyltransferase enzymes. SET domains appear to be protein-protein interaction domains. It has been demonstrated that SET domains mediate interactions with a family of proteins that display similarity with dual-specificity phosphatases (dsPTPases). A subset of SET domains have been called PR domains. These domains are divergent in sequence from other SET domains, but also appear to mediate protein-protein interaction. The SET domain consists of two regions known as SET-N and SET-C. SET-C forms an unusual and conserved knot-like structure of probably functional importance. Additionally to SET-N and SET-C, an insert region (SET-I) and flanking regions of high structural variability form part of the overall structure." Q#7335 - CGI_10024836 superfamily 246975 656 677 0.000780196 38.4821 cl15478 zf-C2H2 superfamily - - "Zinc finger, C2H2 type; The C2H2 zinc finger is the classical zinc finger domain. The two conserved cysteines and histidines co-ordinate a zinc ion. The following pattern describes the zinc finger. 
#-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C] Where X can be any amino acid, and numbers in brackets indicate the number of residues. The positions marked # are those that are important for the stable fold of the zinc finger. The final position can be either his or cys. The C2H2 zinc finger is composed of two short beta strands followed by an alpha helix. The amino terminal part of the helix binds the major groove in DNA binding zinc fingers. The accepted consensus binding sequence for Sp1 is usually defined by the asymmetric hexanucleotide core GGGCGG but this sequence does not include, among others, the GAG (=CTC) repeat that constitutes a high-affinity site for Sp1 binding to the wt1 promoter." Q#7335 - CGI_10024836 superfamily 222150 642 666 0.00280144 36.9861 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#7335 - CGI_10024836 superfamily 222150 764 789 0.00577305 35.8305 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#7336 - CGI_10024837 superfamily 245206 1 184 2.89E-81 244.344 cl09931 NADB_Rossmann superfamily C - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." 
Q#7337 - CGI_10024838 superfamily 245201 137 460 0 619.113 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#7337 - CGI_10024838 superfamily 243088 33 92 1.58E-24 97.4785 cl02563 PX_domain superfamily N - "The Phox Homology domain, a phosphoinositide binding module; The PX domain is a phosphoinositide (PI) binding module involved in targeting proteins to membranes. Proteins containing PX domains interact with PIs and have been implicated in highly diverse functions such as cell signaling, vesicular trafficking, protein sorting, lipid modification, cell polarity and division, activation of T and B cells, and cell survival. Many members of this superfamily bind phosphatidylinositol-3-phosphate (PI3P) but in some cases, other PIs such as PI4P or PI(3,4)P2, among others, are the preferred substrates. In addition to protein-lipid interaction, the PX domain may also be involved in protein-protein interaction, as in the cases of p40phox, p47phox, and some sorting nexins (SNXs). The PX domain is conserved from yeast to humans and is found in more than 100 proteins. The majority of PX domain-containing proteins are SNXs, which play important roles in endosomal sorting." Q#7339 - CGI_10024840 superfamily 246676 339 501 3.96E-36 133.624 cl14616 Cyt_b561 superfamily - - "Eukaryotic cytochrome b(561); Cytochrome b(561) is a family of endosomal or secretory vesicle-specific electron transport proteins. 
They are integral membrane proteins that bind two heme groups non-covalently, and may have six alpha-helical trans-membrane segments. This is an exclusively eukaryotic family. Members of the prokaryotic cytochrome b561 family are not deemed homologous." Q#7339 - CGI_10024840 superfamily 246710 171 338 2.03E-33 125.233 cl14783 DOMON_like superfamily - - "Domon-like ligand-binding domains; DOMON-like domains can be found in all three kingdoms of life and are a diverse group of ligand binding domains that have been shown to interact with sugars and hemes. DOMON domains were initially thought to confer protein-protein interactions. They were subsequently found as a heme-binding motif in cellobiose dehydrogenase, an extracellular fungal oxidoreductase that degrades both lignin and cellulose, and in ethylbenzene dehydrogenase, an enzyme that aids in the anaerobic degradation of hydrocarbons. The domain interacts with sugars in the type 9 carbohydrate binding modules (CBM9), which are present in a variety of glycosyl hydrolases, and it can also be found at the N-terminus of sensor histidine kinases." Q#7339 - CGI_10024840 superfamily 246671 1 128 3.06E-25 101.731 cl14606 Reeler_cohesin_like superfamily - - "Domains similar to the eukaryotic reeler domain and bacterial cohesins; This diverse family summarizes a set of distantly related domains, as revealed by structural similarity." Q#7341 - CGI_10024842 superfamily 245835 145 380 3.25E-75 235.737 cl12013 BAR superfamily - - "The Bin/Amphiphysin/Rvs (BAR) domain, a dimerization module that binds membranes and detects membrane curvature; BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions including organelle biogenesis, membrane trafficking or remodeling, and cell division and migration. Mutations in BAR containing proteins have been linked to diseases and their inactivation in cells leads to altered membrane dynamics. 
A BAR domain with an additional N-terminal amphipathic helix (an N-BAR) can drive membrane curvature. These N-BAR domains are found in amphiphysins and endophilins, among others. BAR domains are also frequently found alongside domains that determine lipid specificity, such as the Pleckstrin Homology (PH) and Phox Homology (PX) domains which are present in beta centaurins (ACAPs and ASAPs) and sorting nexins, respectively. A FES-CIP4 Homology (FCH) domain together with a coiled coil region is called the F-BAR domain and is present in Pombe/Cdc15 homology (PCH) family proteins, which include Fes/Fes tyrosine kinases, PACSIN or syndapin, CIP4-like proteins, and srGAPs, among others. The Inverse (I)-BAR or IRSp53/MIM homology Domain (IMD) is found in multi-domain proteins, such as IRSp53 and MIM, that act as scaffolding proteins and transducers of a variety of signaling pathways that link membrane dynamics and the underlying actin cytoskeleton. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The I-BAR domain induces membrane protrusions in the opposite direction compared to classical BAR and F-BAR domains, which produce membrane invaginations. BAR domains that also serve as protein interaction domains include those of arfaptin and OPHN1-like proteins, among others, which bind to Rac and Rho GAP domains, respectively." Q#7341 - CGI_10024842 superfamily 243088 8 122 5.24E-66 209.115 cl02563 PX_domain superfamily - - "The Phox Homology domain, a phosphoinositide binding module; The PX domain is a phosphoinositide (PI) binding module involved in targeting proteins to membranes. Proteins containing PX domains interact with PIs and have been implicated in highly diverse functions such as cell signaling, vesicular trafficking, protein sorting, lipid modification, cell polarity and division, activation of T and B cells, and cell survival. 
Many members of this superfamily bind phosphatidylinositol-3-phosphate (PI3P) but in some cases, other PIs such as PI4P or PI(3,4)P2, among others, are the preferred substrates. In addition to protein-lipid interaction, the PX domain may also be involved in protein-protein interaction, as in the cases of p40phox, p47phox, and some sorting nexins (SNXs). The PX domain is conserved from yeast to humans and is found in more than 100 proteins. The majority of PX domain-containing proteins are SNXs, which play important roles in endosomal sorting." Q#7342 - CGI_10024843 superfamily 202715 72 172 5.11E-40 132.702 cl04194 Tctex-1 superfamily - - Tctex-1 family; Tctex-1 is a dynein light chain. It has been shown that Tctex-1 can bind to the cytoplasmic tail of rhodopsin. C-terminal rhodopsin mutations responsible for retinitis pigmentosa inhibit this interaction. Q#7343 - CGI_10024844 superfamily 219284 708 964 2.89E-89 290.41 cl06206 RIC1 superfamily - - RIC1; RIC1 has been identified in yeast as a Golgi protein involved in retrograde transport to the cis-Golgi network. It forms a heterodimer with Rgp1 and functions as a guanyl-nucleotide exchange factor. Q#7344 - CGI_10024845 superfamily 245201 54 314 0 535.748 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." 
Q#7344 - CGI_10024845 superfamily 246908 1 39 1.63E-12 62.5998 cl15255 SH2 superfamily N - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." Q#7345 - CGI_10024846 superfamily 241622 114 193 5.15E-15 67.977 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either an N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95 (post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." 
Q#7345 - CGI_10024846 superfamily 241622 198 271 1.90E-10 55.6507 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either an N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95 (post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#7346 - CGI_10024847 superfamily 210068 467 492 1.22E-05 42.8491 cl15286 RPEL superfamily - - RPEL repeat; The RPEL repeat is named after four conserved amino acids it contains. The function of the RPEL repeat is unknown however it might be a DNA binding repeat based on the observation that the Drosophila myocardin-related transcription factor contains a pfam02037 domain that is also implicated in DNA binding. Q#7346 - CGI_10024847 superfamily 210068 505 530 1.86E-05 42.4639 cl15286 RPEL superfamily - - RPEL repeat; The RPEL repeat is named after four conserved amino acids it contains. 
The function of the RPEL repeat is unknown however it might be a DNA binding repeat based on the observation that the Drosophila myocardin-related transcription factor contains a pfam02037 domain that is also implicated in DNA binding. Q#7346 - CGI_10024847 superfamily 210068 429 454 0.00219564 36.3007 cl15286 RPEL superfamily - - RPEL repeat; The RPEL repeat is named after four conserved amino acids it contains. The function of the RPEL repeat is unknown however it might be a DNA binding repeat based on the observation that the Drosophila myocardin-related transcription factor contains a pfam02037 domain that is also implicated in DNA binding. Q#7347 - CGI_10024848 superfamily 248279 133 246 1.58E-39 138.624 cl17725 zf-HC5HC2H superfamily - - "PHD-like zinc-binding domain; The members of this family are annotated as containing PHD domain, but the zinc-binding region here is not typical of PHD domains. The conformation here is a well-conserved cysteine-histidine rich region spanning 90 residues, where the Cys and His are arranged as HxxC(31)CxxC(6)CxxCxxxxCxxxxHxxC (21)CxxH." Q#7347 - CGI_10024848 superfamily 247999 94 126 4.56E-07 46.4847 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#7347 - CGI_10024848 superfamily 243098 319 372 8.26E-05 40.3349 cl02573 TUDOR superfamily - - "Tudor domains are found in many eukaryotic organisms and have been implicated in protein-protein interactions in which methylated protein substrates bind to these domains. For example, the Tudor domain of Survival of Motor Neuron (SMN) binds to symmetrically dimethylated arginines of arginine-glycine (RG) rich sequences found in the C-terminal tails of Sm proteins. The SMN protein is linked to spinal muscular atrophy. 
Another example is the tandem tudor domains of 53BP1, which bind to histone H4 specifically dimethylated at Lys20 (H4-K20me2). 53BP1 is a key transducer of the DNA damage checkpoint signal." Q#7347 - CGI_10024848 superfamily 243098 261 310 0.000310442 38.4089 cl02573 TUDOR superfamily - - "Tudor domains are found in many eukaryotic organisms and have been implicated in protein-protein interactions in which methylated protein substrates bind to these domains. For example, the Tudor domain of Survival of Motor Neuron (SMN) binds to symmetrically dimethylated arginines of arginine-glycine (RG) rich sequences found in the C-terminal tails of Sm proteins. The SMN protein is linked to spinal muscular atrophy. Another example is the tandem tudor domains of 53BP1, which bind to histone H4 specifically dimethylated at Lys20 (H4-K20me2). 53BP1 is a key transducer of the DNA damage checkpoint signal." Q#7349 - CGI_10024850 superfamily 245201 1012 1268 2.36E-70 237.435 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#7351 - CGI_10024852 superfamily 246925 120 323 6.35E-09 56.595 cl15309 LRR_RI superfamily - - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. 
This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#7352 - CGI_10024853 superfamily 241591 30 101 2.01E-19 79.2023 cl00073 H15 superfamily - - "linker histone 1 and histone 5 domains; the basic subunit of chromatin is the nucleosome, consisting of an octamer of core histones, two full turns of DNA, a linker histone (H1 or H5) and a variable length of linker DNA; H1/H5 are chromatin-associated proteins that bind to the exterior of nucleosomes and dramatically stabilize the highly condensed states of chromatin fibers; stabilization of higher order folding occurs through electrostatic neutralization of the linker DNA segments, through a highly positively charged carboxy-terminal domain known as the AKP helix (Ala, Lys, Pro); thought to be involved in specific protein-protein and protein-DNA interactions and play a role in suppressing core histone tail domain acetylation in the chromatin fiber" Q#7353 - CGI_10024854 superfamily 241584 1227 1311 8.06E-17 78.3071 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#7353 - CGI_10024854 superfamily 241584 438 527 9.05E-16 75.2255 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. 
Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#7353 - CGI_10024854 superfamily 241584 726 816 2.61E-13 67.9067 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#7353 - CGI_10024854 superfamily 241584 1132 1214 3.18E-12 64.8251 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#7353 - CGI_10024854 superfamily 241584 229 316 1.01E-05 45.5651 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. 
Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#7353 - CGI_10024854 superfamily 245814 841 901 0.00929044 36.3131 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#7353 - CGI_10024854 superfamily 216647 338 431 8.84E-14 69.5029 cl03309 DB superfamily - - "DB module; This domain has no known function. It is found in several C. elegans proteins. The domain contains 12 conserved cysteines that probably form six disulphide bridges. This domain is found associated with ig pfam00047 and fn3 pfam00041 domains, as well as in some lipases pfam00657." Q#7353 - CGI_10024854 superfamily 216647 634 718 3.02E-13 67.9621 cl03309 DB superfamily - - "DB module; This domain has no known function. It is found in several C. elegans proteins. The domain contains 12 conserved cysteines that probably form six disulphide bridges. This domain is found associated with ig pfam00047 and fn3 pfam00041 domains, as well as in some lipases pfam00657." Q#7353 - CGI_10024854 superfamily 216647 131 222 2.25E-10 59.4877 cl03309 DB superfamily - - "DB module; This domain has no known function. It is found in several C. elegans proteins. 
The domain contains 12 conserved cysteines that probably form six disulphide bridges. This domain is found associated with ig pfam00047 and fn3 pfam00041 domains, as well as in some lipases pfam00657." Q#7353 - CGI_10024854 superfamily 216647 1029 1123 5.40E-07 49.4725 cl03309 DB superfamily - - "DB module; This domain has no known function. It is found in several C. elegans proteins. The domain contains 12 conserved cysteines that probably form six disulphide bridges. This domain is found associated with ig pfam00047 and fn3 pfam00041 domains, as well as in some lipases pfam00657." Q#7353 - CGI_10024854 superfamily 216647 1 47 3.61E-05 43.6945 cl03309 DB superfamily N - "DB module; This domain has no known function. It is found in several C. elegans proteins. The domain contains 12 conserved cysteines that probably form six disulphide bridges. This domain is found associated with ig pfam00047 and fn3 pfam00041 domains, as well as in some lipases pfam00657." Q#7354 - CGI_10024855 superfamily 247068 239 334 1.20E-05 43.4562 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#7354 - CGI_10024855 superfamily 247068 156 232 0.000543453 38.4847 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#7356 - CGI_10024857 superfamily 241578 37 189 1.36E-37 130.971 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." 
Q#7357 - CGI_10024858 superfamily 192535 30 209 7.57E-07 48.7462 cl18179 7TM_GPCR_Srsx superfamily C - Serpentine type 7TM GPCR chemoreceptor Srsx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srsx is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#7360 - CGI_10024861 superfamily 190534 22 114 1.81E-27 99.4047 cl18165 bZIP_Maf superfamily - - "bZIP Maf transcription factor; Maf transcription factors contain a conserved basic region leucine zipper (bZIP) domain, which mediates their dimerisation and DNA binding property. Thus, this family is probably related to pfam00170." Q#7361 - CGI_10024862 superfamily 243072 78 209 4.68E-17 78.5794 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#7361 - CGI_10024862 superfamily 243072 519 611 1.38E-10 59.3194 cl02529 ANK superfamily C - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#7364 - CGI_10024865 superfamily 248264 91 250 7.41E-59 186.289 cl17710 DDE_4 superfamily - - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#7364 - CGI_10024865 superfamily 222263 13 102 8.56E-09 51.1645 cl16321 DDE_4_2 superfamily - - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#7365 - CGI_10024866 superfamily 220393 27 318 9.48E-95 287.735 cl10751 Tmem26 superfamily - - "Transmembrane protein 26; The function of this family of transmembrane proteins has not, as yet, been determined." Q#7366 - CGI_10024867 superfamily 245303 137 503 0 552.55 cl10447 GH18_chitinase-like superfamily - - "The GH18 (glycosyl hydrolase, family 18) type II chitinases hydrolyze chitin, an abundant polymer of beta-1,4-linked N-acetylglucosamine (GlcNAc) which is a major component of the cell wall of fungi and the exoskeleton of arthropods. Chitinases have been identified in viruses, bacteria, fungi, protozoan parasites, insects, and plants. The structure of the GH18 domain is an eight-stranded beta/alpha barrel with a pronounced active-site cleft at the C-terminal end of the beta-barrel. 
The GH18 family includes chitotriosidase, chitobiase, hevamine, zymocin-alpha, narbonin, SI-CLP (stabilin-1 interacting chitinase-like protein), IDGF (imaginal disc growth factor), CFLE (cortical fragment-lytic enzyme) spore hydrolase, the type III and type V plant chitinases, the endo-beta-N-acetylglucosaminidases, and the chitolectins. The GH85 (glycosyl hydrolase, family 85) ENGases (endo-beta-N-acetylglucosaminidases) are closely related to the GH18 chitinases and are included in this alignment model." Q#7366 - CGI_10024867 superfamily 241613 94 127 6.95E-15 70.3133 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#7367 - CGI_10024868 superfamily 245303 24 160 1.99E-74 228.983 cl10447 GH18_chitinase-like superfamily C - "The GH18 (glycosyl hydrolase, family 18) type II chitinases hydrolyze chitin, an abundant polymer of beta-1,4-linked N-acetylglucosamine (GlcNAc) which is a major component of the cell wall of fungi and the exoskeleton of arthropods. Chitinases have been identified in viruses, bacteria, fungi, protozoan parasites, insects, and plants. The structure of the GH18 domain is an eight-stranded beta/alpha barrel with a pronounced active-site cleft at the C-terminal end of the beta-barrel. 
The GH18 family includes chitotriosidase, chitobiase, hevamine, zymocin-alpha, narbonin, SI-CLP (stabilin-1 interacting chitinase-like protein), IDGF (imaginal disc growth factor), CFLE (cortical fragment-lytic enzyme) spore hydrolase, the type III and type V plant chitinases, the endo-beta-N-acetylglucosaminidases, and the chitolectins. The GH85 (glycosyl hydrolase, family 85) ENGases (endo-beta-N-acetylglucosaminidases) are closely related to the GH18 chitinases and are included in this alignment model." Q#7368 - CGI_10024869 superfamily 245303 24 386 0 561.795 cl10447 GH18_chitinase-like superfamily - - "The GH18 (glycosyl hydrolase, family 18) type II chitinases hydrolyze chitin, an abundant polymer of beta-1,4-linked N-acetylglucosamine (GlcNAc) which is a major component of the cell wall of fungi and the exoskeleton of arthropods. Chitinases have been identified in viruses, bacteria, fungi, protozoan parasites, insects, and plants. The structure of the GH18 domain is an eight-stranded beta/alpha barrel with a pronounced active-site cleft at the C-terminal end of the beta-barrel. The GH18 family includes chitotriosidase, chitobiase, hevamine, zymocin-alpha, narbonin, SI-CLP (stabilin-1 interacting chitinase-like protein), IDGF (imaginal disc growth factor), CFLE (cortical fragment-lytic enzyme) spore hydrolase, the type III and type V plant chitinases, the endo-beta-N-acetylglucosaminidases, and the chitolectins. The GH85 (glycosyl hydrolase, family 85) ENGases (endo-beta-N-acetylglucosaminidases) are closely related to the GH18 chitinases and are included in this alignment model." Q#7368 - CGI_10024869 superfamily 243119 445 496 2.95E-13 64.7624 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. 
Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#7369 - CGI_10024870 superfamily 245303 26 394 0 540.224 cl10447 GH18_chitinase-like superfamily - - "The GH18 (glycosyl hydrolase, family 18) type II chitinases hydrolyze chitin, an abundant polymer of beta-1,4-linked N-acetylglucosamine (GlcNAc) which is a major component of the cell wall of fungi and the exoskeleton of arthropods. Chitinases have been identified in viruses, bacteria, fungi, protozoan parasites, insects, and plants. The structure of the GH18 domain is an eight-stranded beta/alpha barrel with a pronounced active-site cleft at the C-terminal end of the beta-barrel. The GH18 family includes chitotriosidase, chitobiase, hevamine, zymocin-alpha, narbonin, SI-CLP (stabilin-1 interacting chitinase-like protein), IDGF (imaginal disc growth factor), CFLE (cortical fragment-lytic enzyme) spore hydrolase, the type III and type V plant chitinases, the endo-beta-N-acetylglucosaminidases, and the chitolectins. The GH85 (glycosyl hydrolase, family 85) ENGases (endo-beta-N-acetylglucosaminidases) are closely related to the GH18 chitinases and are included in this alignment model." Q#7369 - CGI_10024870 superfamily 243119 442 491 1.90E-11 60.5353 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. 
Q#7369 - CGI_10024870 superfamily 243119 591 633 2.87E-07 48.2089 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#7369 - CGI_10024870 superfamily 243119 636 677 4.65E-06 44.732 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#7369 - CGI_10024870 superfamily 243119 563 597 0.00069953 38.1938 cl02629 CBM_14 superfamily C - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. 
Q#7370 - CGI_10024871 superfamily 241884 47 116 2.37E-32 113.819 cl00467 Ntn_hydrolase superfamily C - "The Ntn hydrolases (N-terminal nucleophile) are a diverse superfamily of enzymes that are activated autocatalytically via an N-terminally located nucleophilic amino acid. N-terminal nucleophile (NTN-) hydrolase superfamily, which contains a four-layered alpha, beta, beta, alpha core structure. This family of hydrolases includes penicillin acylase, the 20S proteasome alpha and beta subunits, and glutamate synthase. The mechanism of activation of these proteins is conserved, although they differ in their substrate specificities. All known members catalyze the hydrolysis of amide bonds in either proteins or small molecules, and each one of them is synthesized as a preprotein. For each, an autocatalytic endoproteolytic process generates a new N-terminal residue. This mature N-terminal residue is central to catalysis and acts as both a polarizing base and a nucleophile during the reaction. The N-terminal amino group acts as the proton acceptor and activates either the nucleophilic hydroxyl in a Ser or Thr residue or the nucleophilic thiol in a Cys residue. The position of the N-terminal nucleophile in the active site and the mechanism of catalysis are conserved in this family, despite considerable variation in the protein sequences." Q#7373 - CGI_10024874 superfamily 241640 2 267 1.54E-78 239.871 cl00149 Tryp_SPc superfamily - - Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues. 
Q#7375 - CGI_10024876 superfamily 243035 15 83 5.63E-07 47.2294 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#7377 - CGI_10010415 superfamily 241782 68 467 0 734.441 cl00321 AAT_I superfamily - - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. 
Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phosphorylase family (fold type V)." Q#7378 - CGI_10010416 superfamily 241624 103 414 1.34E-42 152.866 cl00120 PP2Cc superfamily - - "Serine/threonine phosphatases, family 2C, catalytic domain; The protein architecture and deduced catalytic mechanism of PP2C phosphatases are similar to the PP1, PP2A, PP2B family of protein Ser/Thr phosphatases, with which PP2C shares no sequence similarity." Q#7379 - CGI_10010417 superfamily 247723 262 334 2.69E-44 152.027 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." 
Q#7379 - CGI_10010417 superfamily 247723 431 515 1.22E-42 147.691 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#7379 - CGI_10010417 superfamily 247723 164 236 5.16E-42 145.441 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. 
The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#7383 - CGI_10010421 superfamily 247676 56 173 3.06E-53 167.787 cl17012 GINS_A superfamily - - "Alpha-helical domain of GINS complex proteins; Sld5, Psf1, Psf2 and Psf3; The GINS complex is involved in both initiation and elongation stages of eukaryotic chromosome replication, with GINS being the component that most likely serves as the replicative helicase that unwinds duplex DNA ahead of the moving replication fork. In eukaryotes, GINS is a tetrameric arrangement of four subunits Sld5, Psf1, Psf2 and Psf3. The GINS complex has been found in eukaryotes and archaea, but not in bacteria. The four subunits of the complex are homologous and consist of two domains each, termed the alpha-helical (A) and beta-strand (B) domains. The A and B domains of Sld5/Psf1 are permuted with respect to Psf1/Psf3." Q#7384 - CGI_10010422 superfamily 221645 217 325 1.19E-10 59.2854 cl13951 DUF3754 superfamily - - "Protein of unknown function (DUF3754); This domain family is found in bacteria, archaea and eukaryotes, and is typically between 135 and 166 amino acids in length. There is a single completely conserved residue P that may be functionally important." Q#7385 - CGI_10010423 superfamily 247678 196 318 5.87E-61 193.308 cl17014 eIF-5_eIF-2B superfamily - - "Domain found in IF2B/IF5; This family includes the N terminus of eIF-5, and the C terminus of eIF-2 beta. This region corresponds to the whole of the archaebacterial eIF-2 beta homologue. The region contains a putative zinc binding C4 finger." Q#7387 - CGI_10010425 superfamily 247916 339 404 4.41E-07 48.9183 cl17362 Transglut_core superfamily - - "Transglutaminase-like superfamily; This family includes animal transglutaminases and other bacterial proteins of unknown function. 
Sequence conservation in this superfamily primarily involves three motifs that centre around conserved cysteine, histidine, and aspartate residues that form the catalytic triad in the structurally characterized transglutaminase, the human blood clotting factor XIIIa'. On the basis of the experimentally demonstrated activity of the Methanobacterium phage pseudomurein endoisopeptidase, it is proposed that many, if not all, microbial homologues of the transglutaminases are proteases and that the eukaryotic transglutaminases have evolved from an ancestral protease." Q#7387 - CGI_10010425 superfamily 247916 289 370 0.00260378 37.7654 cl17362 Transglut_core superfamily C - "Transglutaminase-like superfamily; This family includes animal transglutaminases and other bacterial proteins of unknown function. Sequence conservation in this superfamily primarily involves three motifs that centre around conserved cysteine, histidine, and aspartate residues that form the catalytic triad in the structurally characterized transglutaminase, the human blood clotting factor XIIIa'. On the basis of the experimentally demonstrated activity of the Methanobacterium phage pseudomurein endoisopeptidase, it is proposed that many, if not all, microbial homologues of the transglutaminases are proteases and that the eukaryotic transglutaminases have evolved from an ancestral protease." Q#7388 - CGI_10010426 superfamily 247778 16 175 4.30E-20 84.2309 cl17224 MenA superfamily N - "1,4-dihydroxy-2-naphthoate octaprenyltransferase [Coenzyme metabolism]" Q#7389 - CGI_10010427 superfamily 241623 2001 2279 0 561.381 cl00119 PI3Kc_like superfamily - - "Phosphoinositide 3-kinase (PI3K)-like family, catalytic domain; The PI3K-like catalytic domain family is part of a larger superfamily that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and RIO kinases. 
Members of the family include PI3K, phosphoinositide 4-kinase (PI4K), PI3K-related protein kinases (PIKKs), and TRansformation/tRanscription domain-Associated Protein (TRRAP). PI3Ks catalyze the transfer of the gamma-phosphoryl group from ATP to the 3-hydroxyl of the inositol ring of D-myo-phosphatidylinositol (PtdIns) or its derivatives, while PI4K catalyze the phosphorylation of the 4-hydroxyl of PtdIns. PIKKs are protein kinases that catalyze the phosphorylation of serine/threonine residues, especially those that are followed by a glutamine. PI3Ks play an important role in a variety of fundamental cellular processes, including cell motility, the Ras pathway, vesicle trafficking and secretion, immune cell activation and apoptosis. PI4Ks produce PtdIns(4)P, the major precursor to important signaling phosphoinositides. PIKKs have diverse functions including cell-cycle checkpoints, genome surveillance, mRNA surveillance, and translation control." Q#7389 - CGI_10010427 superfamily 149738 1863 1962 6.46E-41 148.907 cl07398 Rapamycin_bind superfamily - - Rapamycin binding domain; This domain forms an alpha helical structure and binds to rapamycin. Q#7389 - CGI_10010427 superfamily 202180 2360 2392 1.10E-13 68.6444 cl03505 FATC superfamily - - "FATC domain; The FATC domain is named after FRAP, ATM, TRRAP C-terminal. The solution structure of the FATC domain suggests it plays a role in redox-dependent structural and cellular stability." Q#7392 - CGI_10010430 superfamily 242917 177 292 1.29E-45 158.035 cl02170 Sec62 superfamily N - Translocation protein Sec62; Translocation protein Sec62. Q#7394 - CGI_10010432 superfamily 243077 18 64 1.71E-10 57.1701 cl02542 DnaJ superfamily - - "DnaJ domain or J-domain. DnaJ/Hsp40 (heat shock protein 40) proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperonine family. 
Hsp40 proteins are characterized by the presence of a J domain, which mediates the interaction with Hsp70. They may contain other domains as well, and the architectures provide a means of classification." Q#7394 - CGI_10010432 superfamily 221283 402 541 1.82E-42 149.409 cl13336 DUF3395 superfamily - - Domain of unknown function (DUF3395); This domain is functionally uncharacterized. This domain is found in eukaryotes. This presumed domain is typically between 147 to 176 amino acids in length. This domain is found associated with pfam00226. Q#7397 - CGI_10010435 superfamily 241764 286 348 1.40E-31 118.522 cl00299 MIT superfamily - - "MIT: domain contained within Microtubule Interacting and Trafficking molecules. The MIT domain is found in sorting nexins, the nuclear thiol protease PalBH, the AAA protein spastin and archaebacterial proteins with similar domain architecture, vacuolar sorting proteins and others. The molecular function of the MIT domain is unclear." Q#7397 - CGI_10010435 superfamily 247743 416 579 1.19E-20 89.5127 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#7397 - CGI_10010435 superfamily 204202 659 720 8.97E-30 112.734 cl07827 Vps4_C superfamily - - Vps4 C terminal oligomerisation domain; This domain is found at the C terminal of ATPase proteins involved in vacuolar sorting. It forms an alpha helix structure and is required for oligomerisation. 
Q#7398 - CGI_10010436 superfamily 247743 31 131 2.17E-11 56.7707 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#7399 - CGI_10010437 superfamily 248097 279 409 2.32E-15 71.9126 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#7400 - CGI_10010438 superfamily 191973 279 362 0.00676977 35.0315 cl07022 Striatin superfamily C - "Striatin family; Striatin is an intracellular protein which has a caveolin-binding motif, a coiled-coil structure, a calmodulin-binding site, and a WD (pfam00400) repeat domain. It acts as a scaffold protein and is involved in signalling pathways." Q#7401 - CGI_10005743 superfamily 202203 33 98 2.65E-31 113.817 cl03534 E2F_TDP superfamily - - "E2F/DP family winged-helix DNA-binding domain; This family contains the transcription factor E2F and its dimerisation partners TDP1 and TDP2, which stimulate E2F-dependent transcription. E2F binds to DNA as a homodimer or as a heterodimer in association with TDP1/2, the heterodimer having increased binding efficiency. The crystal structure of an E2F4-DP2-DNA complex shows that the DNA-binding domains of the E2F and DP proteins both have a fold related to the winged-helix DNA-binding motif. 
Recognition of the central c/gGCGCg/c sequence of the consensus DNA-binding site is symmetric, and amino acids that contact these bases are conserved among all known E2F and DP proteins." Q#7402 - CGI_10005744 superfamily 242748 311 362 0.00230741 35.99 cl01853 COG4467 superfamily C - "Regulator of replication initiation timing [Replication, recombination, and repair]" Q#7407 - CGI_10012864 superfamily 246669 348 540 2.79E-58 199.529 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#7408 - CGI_10012865 superfamily 244307 77 318 3.90E-73 240.278 cl06123 DHR2_DOCK superfamily N - "Dock Homology Region 2, a GEF domain, of Dedicator of Cytokinesis proteins; DOCK proteins comprise a family of atypical guanine nucleotide exchange factors (GEFs) that lack the conventional Dbl homology (DH) domain. As GEFs, they activate the small GTPases Rac and Cdc42 by exchanging bound GDP for free GTP. They are also called the CZH (CED-5, Dock180, and MBC-zizimin homology) family, after the first family members identified. 
Dock180 was first isolated as a binding partner for the adaptor protein Crk. The Caenorhabditis elegans protein, Ced-5, is essential for cell migration and phagocytosis, while the Drosophila ortholog, Myoblast city (MBC), is necessary for myoblast fusion and dorsal closure. DOCKs are divided into four classes (A-D) based on sequence similarity and domain architecture: class A includes Dock1 (or Dock180), 2 and 5; class B includes Dock3 and 4; class C includes Dock6, 7, and 8; and class D includes Dock9, 10 and 11. All DOCKs contain two homology domains: the DHR-1 (Dock homology region-1), also called CZH1, and DHR-2 (also called CZH2 or Docker). This alignment model represents the DHR-2 domain of DOCK proteins, which contains the catalytic GEF activity for Rac and/or Cdc42." Q#7409 - CGI_10012866 superfamily 220184 565 869 1.64E-44 164.684 cl18549 Mcm10 superfamily - - Mcm10 replication factor; Mcm10 is a eukaryotic DNA replication factor that regulates the stability and chromatin association of DNA polymerase alpha. Q#7409 - CGI_10012866 superfamily 192253 428 473 1.91E-12 63.4641 cl07823 zf-primase superfamily - - Primase zinc finger; This zinc finger is found in yeast Mcm10 proteins and DnaG-type primases. Q#7410 - CGI_10012867 superfamily 241640 114 212 4.80E-08 51.6711 cl00149 Tryp_SPc superfamily C - Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues. Q#7411 - CGI_10012868 superfamily 247948 62 111 9.24E-12 58.0742 cl17394 RINGv superfamily - - RING-variant domain; RING-variant domain. Q#7412 - CGI_10012869 superfamily 241594 25 309 2.23E-09 56.1595 cl00077 HECTc superfamily - - "HECT domain; C-terminal catalytic domain of a subclass of Ubiquitin-protein ligase (E3). 
It binds specific ubiquitin-conjugating enzymes (E2), accepts ubiquitin from E2, transfers ubiquitin to substrate lysine side chains, and transfers additional ubiquitin molecules to the end of growing ubiquitin chains." Q#7414 - CGI_10012871 superfamily 215754 202 291 2.22E-17 77.6788 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#7414 - CGI_10012871 superfamily 215754 306 389 2.22E-17 77.6788 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#7414 - CGI_10012871 superfamily 215754 406 482 1.47E-10 58.0336 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#7414 - CGI_10012871 superfamily 245814 33 124 1.20E-09 55.5113 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#7414 - CGI_10012871 superfamily 245814 130 150 0.00635822 34.9139 cl11960 Ig superfamily C - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. 
A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#7415 - CGI_10012872 superfamily 247792 250 297 0.000163182 40.1216 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#7415 - CGI_10012872 superfamily 128778 471 561 0.000186795 40.7111 cl17972 BBC superfamily N - B-Box C-terminal domain; Coiled coil region C-terminal to (some) B-Box domains Q#7416 - CGI_10012873 superfamily 147626 77 116 1.64E-06 43.7831 cl05227 DUF1519 superfamily N - Protein of unknown function (DUF1519); This family consists of several putative homing endonuclease proteins of around 245 residues in length which appear to be found exclusively in Naegleria species. The function of this family is unclear. Q#7421 - CGI_10012878 superfamily 222296 125 375 3.66E-84 265.849 cl16339 DUF4147 superfamily - - "Domain of unknown function (DUF4147); This domain is frequently found at the N-terminus of proteins carrying the glycerate kinase-like domain MOFRL, pfam05161." Q#7421 - CGI_10012878 superfamily 203185 527 624 1.50E-26 104.657 cl15995 MOFRL superfamily - - "MOFRL family; MOFRL(multi-organism fragment with rich Leucine) family exists in bacteria and eukaryotes. The function of this domain is not clear, although it exists in some putative enzymes such as reductases and kinases." 
Q#7422 - CGI_10012879 superfamily 247725 376 484 5.89E-15 72.5961 cl17171 PH-like superfamily N - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#7422 - CGI_10012879 superfamily 243096 230 360 8.03E-12 63.0034 cl02571 RhoGEF superfamily C - Guanine nucleotide exchange factor for Rho/Rac/Cdc42-like GTPases; Also called Dbl-homologous (DH) domain. It appears that PH domains invariably occur C-terminal to RhoGEF/DH domains. Q#7424 - CGI_10012881 superfamily 243072 174 296 1.53E-28 111.707 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#7424 - CGI_10012881 superfamily 243072 73 230 5.29E-28 110.166 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#7424 - CGI_10012881 superfamily 245847 417 527 9.23E-05 41.7192 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#7429 - CGI_10019821 superfamily 203913 432 574 2.40E-10 58.7677 cl07084 P4Ha_N superfamily - - "Prolyl 4-Hydroxylase alpha-subunit, N-terminal region; The members of this family are eukaryotic proteins, and include all three isoforms of the prolyl 4-hydroxylase alpha subunit. This enzyme (EC:1.14.11.2) is important in the post-translational modification of collagen, as it catalyzes the formation of 4-hydroxyproline. In vertebrates, the complete enzyme is an alpha2-beta2 tetramer; the beta-subunit is identical to protein disulphide isomerase. The function of the N-terminal region featured in this family does not seem to be known." Q#7430 - CGI_10019822 superfamily 245201 635 933 8.94E-173 507.27 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." 
Q#7430 - CGI_10019822 superfamily 245847 18 171 4.55E-25 102.816 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#7435 - CGI_10019827 superfamily 248192 17 336 1.58E-89 283.006 cl17638 PLN02808 superfamily - - alpha-galactosidase Q#7435 - CGI_10019827 superfamily 243030 406 431 0.000174131 39.5379 cl02423 LRRNT superfamily - - Leucine rich repeat N-terminal domain; Leucine Rich Repeats pfam00560 are short sequence motifs present in a number of proteins with diverse functions and cellular locations. Leucine Rich Repeats are often flanked by cysteine rich domains. This domain is often found at the N-terminus of tandem leucine rich repeats. Q#7437 - CGI_10019829 superfamily 241913 45 150 2.33E-14 65.7053 cl00509 hot_dog superfamily - - "The hotdog fold was initially identified in the E. coli FabA (beta-hydroxydecanoyl-acyl carrier protein (ACP)-dehydratase) structure and subsequently in 4HBT (4-hydroxybenzoyl-CoA thioesterase) from Pseudomonas. A number of other seemingly unrelated proteins also share the hotdog fold. These proteins have related, but distinct, catalytic activities that include metabolic roles such as thioester hydrolysis in fatty acid metabolism, and degradation of phenylacetic acid and the environmental pollutant 4-chlorobenzoate. This superfamily also includes the PaaI-like protein FapR, a non-catalytic bacterial homolog involved in transcriptional regulation of fatty acid biosynthesis." Q#7438 - CGI_10019830 superfamily 202484 1 42 5.71E-08 43.7569 cl03798 zf-Tim10_DDP superfamily N - Tim10/DDP family zinc finger; Putative zinc binding domain with four conserved cysteine residues. This domain is found in the human disease protein TIMM8A. Members of this family such as Tim9 and Tim10 are involved in mitochondrial protein import. 
Members of this family seem to be localised to the mitochondrial intermembrane space. Q#7439 - CGI_10019831 superfamily 241570 444 555 1.79E-17 79.6774 cl00047 CAP_ED superfamily - - "effector domain of the CAP family of transcription factors; members include CAP (or cAMP receptor protein (CRP)), which binds cAMP, FNR (fumarate and nitrate reduction), which uses an iron-sulfur cluster to sense oxygen) and CooA, a heme containing CO sensor. In all cases binding of the effector leads to conformational changes and the ability to activate transcription. Cyclic nucleotide-binding domain similar to CAP are also present in cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) and vertebrate cyclic nucleotide-gated ion-channels. Cyclic nucleotide-monophosphate binding domain; proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues; the best studied is the prokaryotic catabolite gene activator, CAP, where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure; three conserved glycine residues are thought to be essential for maintenance of the structural integrity of the beta-barrel; CooA is a homodimeric transcription factor that belongs to CAP family; cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain; cAPK's are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain; cGPK's are single chain enzymes that include the two copies of the domain in their N-terminal section; also found in vertebrate cyclic nucleotide-gated ion-channels" Q#7440 - CGI_10019832 superfamily 216966 405 571 2.54E-29 114.729 cl03523 HORMA superfamily - - "HORMA domain; The HORMA (for Hop1p, Rev7p and MAD2) domain has been suggested to recognise chromatin states that result from DNA adducts, double stranded breaks or non-attachment to the spindle 
and acts as an adaptor that recruits other proteins. MAD2 is a spindle checkpoint protein which prevents progression of the cell cycle upon detection of a defect in mitotic spindle integrity." Q#7440 - CGI_10019832 superfamily 241621 104 220 5.56E-07 48.4636 cl00116 PDGF superfamily - - "Platelet-derived and vascular endothelial growth factors (PDGF, VEGF) family domain; PDGF is a potent activator for cells of mesenchymal origin; PDGF-A and PDGF-B form AA and BB homodimers and an AB heterodimer; VEGF is a potent mitogen in embryonic and somatic angiogenesis with a unique specificity for vascular endothelial cells; VEGF forms homodimers and exists in 4 different isoforms; overall, the VEGF monomer resembles that of PDGF, but its N-terminal segment is helical rather than extended; the cysteine knot motif is a common feature of this domain" Q#7443 - CGI_10019835 superfamily 245230 2 426 0 945.942 cl10017 Tubulin_FtsZ superfamily - - "Tubulin/FtsZ: Family includes tubulin alpha-, beta-, gamma-, delta-, and epsilon-tubulins as well as FtsZ, all of which are involved in polymer formation. Tubulin is the major component of microtubules, but also exists as a heterodimer and as a curved oligomer. Microtubules exist in all eukaryotic cells and are responsible for many functions, including cellular transport, cell motility, and mitosis. FtsZ forms a ring-shaped septum at the site of bacterial cell division, which is required for constriction of cell membrane and cell envelope to yield two daughter cells. FtsZ can polymerize into tubes, sheets, and rings in vitro and is ubiquitous in eubacteria, archaea, and chloroplasts." 
Q#7445 - CGI_10019837 superfamily 241900 385 683 5.46E-119 361.741 cl00490 EEP superfamily - - "Exonuclease-Endonuclease-Phosphatase (EEP) domain superfamily; This large superfamily includes the catalytic domain (exonuclease/endonuclease/phosphatase or EEP domain) of a diverse set of proteins including the ExoIII family of apurinic/apyrimidinic (AP) endonucleases, inositol polyphosphate 5-phosphatases (INPP5), neutral sphingomyelinases (nSMases), deadenylases (such as the vertebrate circadian-clock regulated nocturnin), bacterial cytolethal distending toxin B (CdtB), deoxyribonuclease 1 (DNase1), the endonuclease domain of the non-LTR retrotransposon LINE-1, and related domains. These diverse enzymes share a common catalytic mechanism of cleaving phosphodiester bonds; their substrates range from nucleic acids to phospholipids and perhaps proteins." Q#7450 - CGI_10019842 superfamily 247792 44 82 0.00127606 37.4252 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#7450 - CGI_10019842 superfamily 245716 450 476 1.16E-06 46.3975 cl11592 zf-CCCH superfamily - - Zinc finger C-x8-C-x5-C-x3-H type (and similar); Zinc finger C-x8-C-x5-C-x3-H type (and similar). 
Q#7451 - CGI_10019843 superfamily 245847 654 793 1.18E-07 51.7344 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#7451 - CGI_10019843 superfamily 243035 971 1076 1.10E-06 48.365 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. 
C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#7451 - CGI_10019843 superfamily 243035 1194 1220 0.000870821 39.3301 cl02432 CLECT superfamily NC - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#7451 - CGI_10019843 superfamily 238012 902 931 0.00279069 37.3338 cl11390 EGF_Lam superfamily N - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#7454 - CGI_10019846 superfamily 192107 503 696 1.05E-68 232.108 cl07312 Med14 superfamily - - "Mediator complex subunit MED14; Saccharomyces cerevisiae RGR1 mediator complex subunit affects chromatin structure, transcriptional regulation of diverse genes and sporulation, required for glucose repression, HO repression, RME1 repression and sporulation. This subunit is also found in higher eukaryotes and Med14 is the agreed unified nomenclature for this subunit. Med14 is found in the tail region of Mediator." Q#7454 - CGI_10019846 superfamily 218108 1504 1644 0.0049268 39.1037 cl04540 CITED superfamily C - "CITED; CITED, CBP/p300-interacting transactivator with ED-rich tail, are characterized by a conserved 32-amino acid sequence at the C-terminus. CITED proteins do not bind DNA directly and are thought to function as transcriptional co-activators." Q#7455 - CGI_10019847 superfamily 205450 50 136 1.02E-26 97.0784 cl16202 DUF4061 superfamily - - "Domain of unknown function (DUF4061); This presumed domain is functionally uncharacterized. This domain family is found in eukaryotes, and is approximately 90 amino acids in length. There is a conserved AFG sequence motif." 
Q#7456 - CGI_10019848 superfamily 215647 7 128 4.78E-26 101.916 cl18338 7tm_2 superfamily N - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GPCRs). They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#7457 - CGI_10009475 superfamily 220669 20 362 3.34E-160 456.486 cl10954 Tmpp129 superfamily - - Putative transmembrane protein precursor; This is a family of proteins conserved from worms to humans. The proteins are purported to be transmembrane protein-precursors but the function is unknown. Q#7458 - CGI_10009476 superfamily 243091 258 343 6.24E-08 50.1832 cl02566 SET superfamily N - "SET domain; SET domains are protein lysine methyltransferase enzymes. SET domains appear to be protein-protein interaction domains. It has been demonstrated that SET domains mediate interactions with a family of proteins that display similarity with dual-specificity phosphatases (dsPTPases). A subset of SET domains have been called PR domains. These domains are divergent in sequence from other SET domains, but also appear to mediate protein-protein interaction. The SET domain consists of two regions known as SET-N and SET-C. 
SET-C forms an unusual and conserved knot-like structure of probably functional importance. In addition to SET-N and SET-C, an insert region (SET-I) and flanking regions of high structural variability form part of the overall structure." Q#7459 - CGI_10009477 superfamily 219225 32 112 0.000263363 37.0575 cl06114 FAIM1 superfamily N - "Fas apoptotic inhibitory molecule (FAIM1); This family consists of several fas apoptotic inhibitory molecule (FAIM1) proteins. FAIM expression is upregulated in B cells by anti-Ig treatment that induces Fas-resistance, and overexpression of FAIM diminishes sensitivity to Fas-mediated apoptosis of B and non-B cell lines. FAIM1 is highly evolutionarily conserved and is widely expressed in murine tissues, suggesting that FAIM plays an important role in cellular physiology." Q#7460 - CGI_10009478 superfamily 243179 114 216 1.03E-23 92.4245 cl02781 tetraspanin_LEL superfamily - - "Tetraspanin, extracellular domain or large extracellular loop (LEL). Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. The tetraspanin family contains CD9, CD63, CD37, CD53, CD82, CD151, and CD81, amongst others. Tetraspanins are involved in diverse processes such as cell activation and proliferation, adhesion and motility, differentiation, cancer, and others. Their various functions may relate to their ability to act as molecular facilitators, grouping specific cell-surface proteins and affecting formation and stability of signaling complexes. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web", which may also include integrins." 
Q#7461 - CGI_10009479 superfamily 192194 7 43 4.32E-08 44.8811 cl07556 DUF1903 superfamily C - "Domain of unknown function (DUF1903); Members of this family adopt a coiled coil structure, with two antiparallel alpha-helices that are tightly strapped together by two disulfide bridges at each end. The protein sequence shows a cysteine motif, required for the stabilisation of the coiled-coil-like structure. Additional inter-helix hydrophobic contacts impart stability to this scaffold. The precise function of this eukaryotic domain is, as yet, unknown." Q#7462 - CGI_10009480 superfamily 218789 5 169 2.22E-27 104.255 cl05447 Ceramidase superfamily C - "Ceramidase; This family consists of several ceramidases. Ceramidases are enzymes involved in regulating cellular levels of ceramides, sphingoid bases, and their phosphates, EC:3.5.1.23." Q#7463 - CGI_10009481 superfamily 241563 18 50 8.11E-05 40.3983 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#7464 - CGI_10009482 superfamily 247684 1 404 5.57E-80 258.747 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#7465 - CGI_10009483 superfamily 247684 1 268 2.62E-61 202.122 cl17037 NBD_sugar-kinase_HSP70_actin superfamily C - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#7466 - CGI_10009484 superfamily 244558 225 306 3.32E-26 104.175 cl06950 AARP2CN superfamily - - AARP2CN (NUC121) domain; This domain is the central domain of AARP2. It is weakly similar to the GTP-binding domain of elongation factor TU. Q#7467 - CGI_10009485 superfamily 243084 704 803 6.22E-52 176.206 cl02556 Bromodomain superfamily - - Bromodomain. Bromodomains are found in many chromatin-associated proteins and in nuclear histone acetyltransferases. They interact specifically with acetylated lysine. 
Q#7467 - CGI_10009485 superfamily 247736 522 577 0.000113731 41.1073 cl17182 NAT_SF superfamily - - "N-Acyltransferase superfamily: Various enzymes that characteristically catalyze the transfer of an acyl group to a substrate; NAT (N-Acyltransferase) is a large superfamily of enzymes that mostly catalyze the transfer of an acyl group to a substrate and are implicated in a variety of functions, ranging from bacterial antibiotic resistance to circadian rhythms in mammals. Members include GCN5-related N-Acetyltransferases (GNAT) such as Aminoglycoside N-acetyltransferases, Histone N-acetyltransferase (HAT) enzymes, and Serotonin N-acetyltransferase, which catalyze the transfer of an acetyl group to a substrate. The kinetic mechanism of most GNATs involves the ordered formation of a ternary complex: the reaction begins with Acetyl Coenzyme A (AcCoA) binding, followed by binding of substrate, then direct transfer of the acetyl group from AcCoA to the substrate, followed by product and subsequent CoA release. Other family members include Arginine/ornithine N-succinyltransferase, Myristoyl-CoA: protein N-myristoyltransferase, and Acyl-homoserinelactone synthase which have a similar catalytic mechanism but differ in types of acyl groups transferred. Leucyl/phenylalanyl-tRNA-protein transferase and FemXAB nonribosomal peptidyltransferases which catalyze similar peptidyltransferase reactions are also included." Q#7467 - CGI_10009485 superfamily 148209 37 288 1.31E-128 387.368 cl05793 PCAF_N superfamily - - PCAF (P300/CBP-associated factor) N-terminal domain; This region is spliced out of human histone acetyltransferase KAT2A isoform 2. It is predicted to be of a mixed alpha/beta fold - though predominantly helical.
Q#7469 - CGI_10009487 superfamily 241563 60 102 3.97E-06 44.3924 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#7469 - CGI_10009487 superfamily 110440 485 509 0.00238181 36.2317 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#7469 - CGI_10009487 superfamily 241563 8 53 0.00428729 35.3907 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#7469 - CGI_10009487 superfamily 128778 98 230 0.00651088 35.7035 cl17972 BBC superfamily - - B-Box C-terminal domain; Coiled coil region C-terminal to (some) B-Box domains Q#7471 - CGI_10009489 superfamily 241554 106 232 3.01E-29 109.282 cl00019 Macro superfamily C - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand.
Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#7472 - CGI_10009490 superfamily 241554 2 123 6.24E-26 98.1115 cl00019 Macro superfamily C - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#7473 - CGI_10009491 superfamily 241554 118 248 7.20E-08 49.5764 cl00019 Macro superfamily C - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." 
Q#7473 - CGI_10009491 superfamily 241554 9 61 5.16E-06 44.5688 cl00019 Macro superfamily N - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#7474 - CGI_10000279 superfamily 247684 1 78 5.68E-15 71.9948 cl17037 NBD_sugar-kinase_HSP70_actin superfamily N - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). 
Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#7475 - CGI_10014865 superfamily 248264 163 206 0.000317131 39.5278 cl17710 DDE_4 superfamily N - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#7476 - CGI_10014866 superfamily 217293 37 246 1.15E-41 148.165 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#7476 - CGI_10014866 superfamily 202474 253 306 1.85E-07 50.3449 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#7477 - CGI_10014867 superfamily 247692 42 385 1.41E-155 450.919 cl17068 AFD_class_I superfamily N - "Adenylate forming domain, Class I; This family includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. 
The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain." Q#7478 - CGI_10014868 superfamily 247723 207 283 3.12E-25 96.9558 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#7478 - CGI_10014868 superfamily 243077 51 102 8.58E-18 76.0449 cl02542 DnaJ superfamily - - "DnaJ domain or J-domain. DnaJ/Hsp40 (heat shock protein 40) proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperonine family. Hsp40 proteins are characterized by the presence of a J domain, which mediates the interaction with Hsp70. They may contain other domains as well, and the architectures provide a means of classification." Q#7479 - CGI_10014869 superfamily 243029 28 101 3.18E-13 64.6793 cl02422 HRM superfamily - - Hormone receptor domain; This extracellular domain contains four conserved cysteines that probably form disulphide bridges.
The domain is found in a variety of hormone receptors. It may be a ligand binding domain. Q#7479 - CGI_10014869 superfamily 215647 172 298 5.91E-05 42.9809 cl18338 7tm_2 superfamily C - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GPCRs). They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#7484 - CGI_10014874 superfamily 243107 453 493 0.000575656 37.9098 cl02611 G-patch superfamily - - "G-patch domain; This domain is found in a number of RNA binding proteins, and is also found in proteins that contain RNA binding domains. This suggests that this domain may have an RNA binding function. This domain has seven highly conserved glycines." Q#7487 - CGI_10014877 superfamily 243045 291 390 4.24E-15 72.6659 cl02459 PAS superfamily - - "PAS domain; PAS motifs appear in archaea, eubacteria and eukarya. Probably the most surprising identification of a PAS domain was that in EAG-like K+-channels. PAS domains have been found to bind ligands, and to act as sensors for light and oxygen in signal transduction."
Q#7487 - CGI_10014877 superfamily 243045 117 169 1.55E-09 56.4875 cl02459 PAS superfamily C - "PAS domain; PAS motifs appear in archaea, eubacteria and eukarya. Probably the most surprising identification of a PAS domain was that in EAG-like K+-channels. PAS domains have been found to bind ligands, and to act as sensors for light and oxygen in signal transduction." Q#7487 - CGI_10014877 superfamily 241596 13 65 5.48E-09 53.7571 cl00081 HLH superfamily - - "Helix-loop-helix domain, found in specific DNA- binding proteins that act as transcription factors; 60-100 amino acids long. A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE 5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved in both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#7490 - CGI_10014880 superfamily 246946 21 411 1.96E-92 284.57 cl15397 DUF89 superfamily - - Protein of unknown function DUF89; This family has no known function. Q#7491 - CGI_10014881 superfamily 247065 3 105 3.64E-12 58.8954 cl15777 GGCT_like superfamily - - "GGCT-like domains, also called AIG2-like family. 
Gamma-glutamyl cyclotransferase (GGCT) catalyzes the formation of pyroglutamic acid (5-oxoproline) from dipeptides containing gamma-glutamyl, and is a dimeric protein. In Homo sapiens, the protein is encoded by the gene C7orf24, and the enzyme participates in the gamma-glutamyl cycle. Hereditary defects in the gamma-glutamyl cycle have been described for some of the genes involved, but not for C7orf24. The synthesis and metabolism of glutathione (L-gamma-glutamyl-L-cysteinylglycine) ties the gamma-glutamyl cycle to numerous cellular processes; glutathione acts as a ubiquitous reducing agent in reductive mechanisms involved in protein and DNA synthesis, transport processes, enzyme activity, and metabolism. AIG2 (avrRpt2-induced gene) is an Arabidopsis protein that exhibits RPS2- and avrRpt2-dependent induction early after infection with Pseudomonas syringae pv maculicola strain ES4326 carrying avrRpt2. avrRpt2 is an avirulence gene that can convert virulent strains of P. syringae to avirulence on Arabidopsis thaliana, soybean, and bean. The family also includes bacterial tellurite-resistance proteins (trgB); tellurium (Te) compounds are used in industrial processes and had been used as antimicrobial agents in the past. Some members have been described as proteins involved in cation transport (chaC)." Q#7492 - CGI_10014882 superfamily 216554 107 248 5.09E-22 90.2313 cl15977 zf-DHHC superfamily N - DHHC palmitoyltransferase; This family includes the well known DHHC zinc binding domain as well as three of the four conserved transmembrane regions found in this family of palmitoyltransferase enzymes.
Q#7493 - CGI_10014883 superfamily 247723 13 87 1.10E-25 98.0762 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#7494 - CGI_10014884 superfamily 248281 136 204 3.38E-06 42.6427 cl17727 GT1 superfamily C - "GT1, myb-like, SANT family; GT-1, a myb-like protein, is one of the GT trihelix transcription factors. GT-1 binds the GT cis-element of rbcS-3A, a light-induced gene, as a dimer. Arabidopsis GT-1 is a trans-activator and acts in the stabilization of components of the transcription pre-initiation complex comprised of TFIIA-TBP-TATA. The isolated GT-1 DNA-binding domain is sufficient to bind DNA. This region closely resembles the myb domain, but with longer helices. It has been proposed that GT-1 may respond to light signals via calcium-dependent phosphorylation to create a light-modulated molecular switch. These proteins are members of the SANT/myb group. SANT is named after 'SWI3, ADA2, N-CoR and TFIIIB', several factors that share this domain. The SANT domain resembles the 3 alpha-helix bundle of the DNA-binding Myb domains and is found in a diverse set of proteins."
Q#7495 - CGI_10014885 superfamily 194336 183 278 1.02E-19 84.9901 cl02517 ZU5 superfamily - - ZU5 domain; Domain present in ZO-1 and Unc5-like netrin receptors Domain of unknown function. Q#7495 - CGI_10014885 superfamily 246680 532 599 2.73E-10 57.3289 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#7497 - CGI_10014887 superfamily 220692 86 371 8.26E-15 73.3925 cl18570 7TM_GPCR_Srw superfamily - - Serpentine type 7TM GPCR chemoreceptor Srw; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srw is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. The genes encoding Srw do not appear to be under as strong an adaptive evolutionary pressure as those of Srz. 
Q#7498 - CGI_10014888 superfamily 241599 148 206 9.25E-26 96.5436 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#7499 - CGI_10014890 superfamily 247740 12 182 1.69E-74 225.822 cl17186 TIM_phosphate_binding superfamily - - "TIM barrel proteins share a structurally conserved phosphate binding motif and in general share an eight beta/alpha closed barrel structure. Specific for this family is the conserved phosphate binding site at the edges of strands 7 and 8. The phosphate comes either from the substrate, as in the case of inosine monophosphate dehydrogenase (IMPDH), or from ribulose-5-phosphate 3-epimerase (RPE) or from cofactors, like FMN." Q#7501 - CGI_10014892 superfamily 247769 166 213 5.63E-06 43.4821 cl17215 HDc superfamily C - Metal dependent phosphohydrolases with conserved 'HD' motif Q#7501 - CGI_10014892 superfamily 247057 61 112 4.88E-05 39.6054 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." 
Q#7502 - CGI_10014893 superfamily 245814 29 76 0.000395909 37.7089 cl11960 Ig superfamily N - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#7506 - CGI_10016706 superfamily 241581 4 70 0.000900831 38.9066 cl00062 FHA superfamily C - "Forkhead associated domain (FHA); found in eukaryotic and prokaryotic proteins. Putative nuclear signalling domain. FHA domains may bind phosphothreonine, phosphoserine and sometimes phosphotyrosine. In eukaryotes, many FHA domain-containing proteins localize to the nucleus, where they participate in establishing or maintaining cell cycle checkpoints, DNA repair, or transcriptional regulation. Members of the FHA family include: Dun1, Rad53, Cds1, Mek1, KAPP (kinase-associated protein phosphatase), and Ki-67 (a human nuclear protein related to cell proliferation)." Q#7507 - CGI_10016707 superfamily 219936 251 349 5.07E-14 68.8165 cl18534 SPA superfamily - - Stabilisation of polarity axis; Yeast AFI1 (ARF3-interaction protein 1) has been shown to interact with the outer plaque of the spindle pole body. In Aspergillus nidulans the protein member is necessary for stabilisation of the polarity axes during septation. In S. cerevisiae it functions as a polarisation-specific docking factor.
Q#7507 - CGI_10016707 superfamily 219579 36 95 8.53E-13 64.127 cl16001 Afi1 superfamily - - "Docking domain of Afi1 for Arf3 in vesicle trafficking; This domain occurs at the N-terminal of Afi1, a protein necessary for vesicle trafficking in yeast. This domain is the interacting region of the protein which binds to Arf3. Afi1 is distributed asymmetrically at the plasma membrane and is required for polarized distribution of Arf3 but not of an Arf3 guanine nucleotide-exchange factor, Yel1p. However, Afi1 is not required for targeting of Arf3 or Yel1p to the plasma membrane. Afi1 functions as an Arf3 polarization-specific adapter and participates in development of polarity. Although Arf3 is the homologue of human Arf6 it does not function in the same way, not being necessary for endocytosis or for mating factor receptor internalisation. In the S phase, however, it is concentrated at the plasma membrane of the emerging bud. Because of its polarized localisation and its critical function in the normal budding pattern of yeast, Arf3 is probably a regulator of vesicle trafficking, which is important for polarized growth." Q#7508 - CGI_10016708 superfamily 245213 703 737 5.44E-07 48.0166 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#7508 - CGI_10016708 superfamily 245213 740 775 1.05E-06 46.861 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#7508 - CGI_10016708 superfamily 245213 625 661 2.51E-06 46.0906 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#7508 - CGI_10016708 superfamily 245213 473 509 4.90E-06 44.935 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#7508 - CGI_10016708 superfamily 245213 284 319 4.90E-06 44.935 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#7508 - CGI_10016708 superfamily 245213 511 547 5.52E-06 44.935 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#7508 - CGI_10016708 superfamily 245213 435 471 1.05E-05 44.1646 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#7508 - CGI_10016708 superfamily 245213 663 698 2.10E-05 43.3942 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#7508 - CGI_10016708 superfamily 245213 360 395 2.74E-05 43.009 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#7508 - CGI_10016708 superfamily 245213 209 244 5.06E-05 42.2386 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#7508 - CGI_10016708 superfamily 245213 246 281 6.34E-05 41.8534 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#7508 - CGI_10016708 superfamily 245213 549 585 6.49E-05 41.8534 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#7508 - CGI_10016708 superfamily 245213 815 850 0.000128982 41.083 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#7508 - CGI_10016708 superfamily 245213 587 622 0.000192379 40.3126 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#7508 - CGI_10016708 superfamily 245213 322 357 0.000665766 38.7718 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#7508 - CGI_10016708 superfamily 245213 778 812 0.00622585 36.0754 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#7508 - CGI_10016708 superfamily 245213 398 432 0.00805851 35.6902 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#7509 - CGI_10016709 superfamily 215754 177 270 1.18E-19 81.1456 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#7509 - CGI_10016709 superfamily 215754 91 173 6.41E-15 68.0488 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#7509 - CGI_10016709 superfamily 215754 16 78 4.45E-12 60.3448 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#7513 - CGI_10016713 superfamily 248097 464 586 6.00E-08 51.1118 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#7513 - CGI_10016713 superfamily 248097 206 259 0.000133243 40.7114 cl17543 C1q superfamily C - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#7514 - CGI_10016714 superfamily 243035 64 121 1.08E-16 71.8621 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. 
Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. 
A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#7516 - CGI_10016716 superfamily 241644 31 135 5.18E-41 136.948 cl00154 UBCc superfamily - - "Ubiquitin-conjugating enzyme E2, catalytic (UBCc) domain. This is part of the ubiquitin-mediated protein degradation pathway in which a thiol-ester linkage forms between a conserved cysteine and the C-terminus of ubiquitin and complexes with ubiquitin protein ligase enzymes, E3. This pathway regulates many fundamental cellular processes. There are also other E2s which form thiol-ester linkages without the use of E3s as well as several UBC homologs (TSG101, Mms2, Croc-1 and similar proteins) which lack the active site cysteine essential for ubiquitination and appear to function in DNA repair pathways which were omitted from the scope of this CD." Q#7517 - CGI_10016717 superfamily 217020 2 76 5.06E-18 75.3238 cl03574 Seryl_tRNA_N superfamily C - Seryl-tRNA synthetase N-terminal domain; This domain is found associated with the Pfam tRNA synthetase class II domain (pfam00587) and represents the N-terminal domain of seryl-tRNA synthetase. Q#7518 - CGI_10016718 superfamily 241763 164 258 1.94E-51 168.956 cl00298 Peptidase_C1 superfamily C - "C1 Peptidase family (MEROPS database nomenclature), also referred to as the papain family; composed of two subfamilies of cysteine peptidases (CPs), C1A (papain) and C1B (bleomycin hydrolase). Papain-like enzymes are mostly endopeptidases with some exceptions like cathepsins B, C, H and X, which are exopeptidases. Papain-like CPs have different functions in various organisms. 
Plant CPs are used to mobilize storage proteins in seeds while mammalian CPs are primarily lysosomal enzymes responsible for protein degradation in the lysosome. Papain-like CPs are synthesized as inactive proenzymes with N-terminal propeptide regions, which are removed upon activation. Bleomycin hydrolase (BH) is a CP that detoxifies bleomycin by hydrolysis of an amide group. It acts as a carboxypeptidase on its C-terminus to convert itself into an aminopeptidase and peptide ligase. BH is found in all tissues in mammals as well as in many other eukaryotes. It forms a hexameric ring barrel structure with the active sites imbedded in the central channel. Some members of the C1 family are proteins classified as non-peptidase homologs which lack peptidase activity or have missing active site residues." Q#7518 - CGI_10016718 superfamily 244586 78 135 2.22E-16 70.7354 cl07031 Inhibitor_I29 superfamily - - Cathepsin propeptide inhibitor domain (I29); This domain is found at the N-terminus of some C1 peptidases such as Cathepsin L where it acts as a propeptide. There are also a number of proteins that are composed solely of multiple copies of this domain such as the peptidase inhibitor salarin. This family is classified as I29 by MEROPS. Q#7519 - CGI_10016719 superfamily 241763 118 299 1.69E-81 247.922 cl00298 Peptidase_C1 superfamily - - "C1 Peptidase family (MEROPS database nomenclature), also referred to as the papain family; composed of two subfamilies of cysteine peptidases (CPs), C1A (papain) and C1B (bleomycin hydrolase). Papain-like enzymes are mostly endopeptidases with some exceptions like cathepsins B, C, H and X, which are exopeptidases. Papain-like CPs have different functions in various organisms. Plant CPs are used to mobilize storage proteins in seeds while mammalian CPs are primarily lysosomal enzymes responsible for protein degradation in the lysosome. 
Papain-like CPs are synthesized as inactive proenzymes with N-terminal propeptide regions, which are removed upon activation. Bleomycin hydrolase (BH) is a CP that detoxifies bleomycin by hydrolysis of an amide group. It acts as a carboxypeptidase on its C-terminus to convert itself into an aminopeptidase and peptide ligase. BH is found in all tissues in mammals as well as in many other eukaryotes. It forms a hexameric ring barrel structure with the active sites imbedded in the central channel. Some members of the C1 family are proteins classified as non-peptidase homologs which lack peptidase activity or have missing active site residues." Q#7519 - CGI_10016719 superfamily 244586 28 88 1.23E-20 83.0619 cl07031 Inhibitor_I29 superfamily - - Cathepsin propeptide inhibitor domain (I29); This domain is found at the N-terminus of some C1 peptidases such as Cathepsin L where it acts as a propeptide. There are also a number of proteins that are composed solely of multiple copies of this domain such as the peptidase inhibitor salarin. This family is classified as I29 by MEROPS. Q#7520 - CGI_10016720 superfamily 241763 115 328 4.27E-117 339.6 cl00298 Peptidase_C1 superfamily - - "C1 Peptidase family (MEROPS database nomenclature), also referred to as the papain family; composed of two subfamilies of cysteine peptidases (CPs), C1A (papain) and C1B (bleomycin hydrolase). Papain-like enzymes are mostly endopeptidases with some exceptions like cathepsins B, C, H and X, which are exopeptidases. Papain-like CPs have different functions in various organisms. Plant CPs are used to mobilize storage proteins in seeds while mammalian CPs are primarily lysosomal enzymes responsible for protein degradation in the lysosome. Papain-like CPs are synthesized as inactive proenzymes with N-terminal propeptide regions, which are removed upon activation. Bleomycin hydrolase (BH) is a CP that detoxifies bleomycin by hydrolysis of an amide group. 
It acts as a carboxypeptidase on its C-terminus to convert itself into an aminopeptidase and peptide ligase. BH is found in all tissues in mammals as well as in many other eukaryotes. It forms a hexameric ring barrel structure with the active sites imbedded in the central channel. Some members of the C1 family are proteins classified as non-peptidase homologs which lack peptidase activity or have missing active site residues." Q#7520 - CGI_10016720 superfamily 244586 27 85 1.89E-17 74.9726 cl07031 Inhibitor_I29 superfamily - - Cathepsin propeptide inhibitor domain (I29); This domain is found at the N-terminus of some C1 peptidases such as Cathepsin L where it acts as a propeptide. There are also a number of proteins that are composed solely of multiple copies of this domain such as the peptidase inhibitor salarin. This family is classified as I29 by MEROPS. Q#7521 - CGI_10016721 superfamily 241763 116 329 2.75E-117 340.37 cl00298 Peptidase_C1 superfamily - - "C1 Peptidase family (MEROPS database nomenclature), also referred to as the papain family; composed of two subfamilies of cysteine peptidases (CPs), C1A (papain) and C1B (bleomycin hydrolase). Papain-like enzymes are mostly endopeptidases with some exceptions like cathepsins B, C, H and X, which are exopeptidases. Papain-like CPs have different functions in various organisms. Plant CPs are used to mobilize storage proteins in seeds while mammalian CPs are primarily lysosomal enzymes responsible for protein degradation in the lysosome. Papain-like CPs are synthesized as inactive proenzymes with N-terminal propeptide regions, which are removed upon activation. Bleomycin hydrolase (BH) is a CP that detoxifies bleomycin by hydrolysis of an amide group. It acts as a carboxypeptidase on its C-terminus to convert itself into an aminopeptidase and peptide ligase. BH is found in all tissues in mammals as well as in many other eukaryotes. 
It forms a hexameric ring barrel structure with the active sites imbedded in the central channel. Some members of the C1 family are proteins classified as non-peptidase homologs which lack peptidase activity or have missing active site residues." Q#7521 - CGI_10016721 superfamily 244586 28 83 1.40E-14 67.2686 cl07031 Inhibitor_I29 superfamily - - Cathepsin propeptide inhibitor domain (I29); This domain is found at the N-terminus of some C1 peptidases such as Cathepsin L where it acts as a propeptide. There are also a number of proteins that are composed solely of multiple copies of this domain such as the peptidase inhibitor salarin. This family is classified as I29 by MEROPS. Q#7522 - CGI_10016722 superfamily 241763 56 269 8.39E-119 341.911 cl00298 Peptidase_C1 superfamily - - "C1 Peptidase family (MEROPS database nomenclature), also referred to as the papain family; composed of two subfamilies of cysteine peptidases (CPs), C1A (papain) and C1B (bleomycin hydrolase). Papain-like enzymes are mostly endopeptidases with some exceptions like cathepsins B, C, H and X, which are exopeptidases. Papain-like CPs have different functions in various organisms. Plant CPs are used to mobilize storage proteins in seeds while mammalian CPs are primarily lysosomal enzymes responsible for protein degradation in the lysosome. Papain-like CPs are synthesized as inactive proenzymes with N-terminal propeptide regions, which are removed upon activation. Bleomycin hydrolase (BH) is a CP that detoxifies bleomycin by hydrolysis of an amide group. It acts as a carboxypeptidase on its C-terminus to convert itself into an aminopeptidase and peptide ligase. BH is found in all tissues in mammals as well as in many other eukaryotes. It forms a hexameric ring barrel structure with the active sites imbedded in the central channel. Some members of the C1 family are proteins classified as non-peptidase homologs which lack peptidase activity or have missing active site residues." 
Q#7524 - CGI_10016724 superfamily 247908 224 379 4.61E-76 235.651 cl17354 NIF superfamily - - NLI interacting factor-like phosphatase; This family contains a number of NLI interacting factor isoforms and also an N-terminal regions of RNA polymerase II CTC phosphatase and FCP1 serine phosphatase. This region has been identified as the minimal phosphatase domain. Q#7525 - CGI_10016725 superfamily 116798 16 104 2.06E-16 70.7834 cl17955 Lipocalin_2 superfamily C - "Lipocalin-like domain; Lipocalins are transporters for small hydrophobic molecules, such as lipids, steroid hormones, bilins, and retinoids. The structure is an eight-stranded beta barrel." Q#7531 - CGI_10010007 superfamily 241584 530 622 0.000280103 39.7871 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#7531 - CGI_10010007 superfamily 245814 178 250 2.18E-07 49.0409 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#7531 - CGI_10010007 superfamily 245814 83 149 7.53E-06 44.32 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#7531 - CGI_10010007 superfamily 245814 271 328 4.01E-05 42.0088 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#7534 - CGI_10010010 superfamily 246925 2 293 9.40E-65 214.912 cl15309 LRR_RI superfamily - - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. 
This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#7534 - CGI_10010010 superfamily 203777 347 515 1.94E-32 121.601 cl06737 RanGAP1_C superfamily - - "RanGAP1 C-terminal domain; Ran-GTPase activating protein 1 (RanGAP1) is a GTPase activator for the nuclear Ras-related regulatory protein Ran, converting it to the putatively inactive GDP-bound state. Its C-terminal domain is required for RanGAP1 localisation at the vertebrate nuclear pore complex, and is sumoylated by the small ubiquitin-related modifier protein (SUMO-1). This domain is composed almost entirely of helical substructures that are organised into an alpha-alpha superhelix fold, with the exception of the peptide containing the lysine residue required for SUMO-1 conjugation." Q#7536 - CGI_10010012 superfamily 222604 127 636 2.84E-161 511.418 cl16723 MOR2-PAG1_N superfamily - - Cell morphogenesis N-terminal; This family is the conserved N-terminal region of proteins that are involved in cell morphogenesis. Q#7536 - CGI_10010012 superfamily 222607 1975 2228 2.39E-63 219.729 cl16725 MOR2-PAG1_C superfamily - - Cell morphogenesis C-terminal; This family is the conserved C-terminal region of proteins that are involved in cell morphogenesis. 
Q#7538 - CGI_10010014 superfamily 243092 3 295 9.66E-94 292.317 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#7538 - CGI_10010014 superfamily 243092 298 338 0.000379987 38.8326 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#7538 - CGI_10010014 superfamily 243092 341 383 0.00147001 36.9066 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#7539 - CGI_10010015 superfamily 244843 211 574 6.07E-109 339.592 cl08040 Ggt superfamily C - Gamma-glutamyltransferase [Amino acid transport and metabolism] Q#7539 - CGI_10010015 superfamily 216363 103 186 2.64E-19 84.059 cl08312 UPF0029 superfamily C - Uncharacterized protein family UPF0029; Uncharacterized protein family UPF0029. Q#7542 - CGI_10010018 superfamily 241600 79 262 2.28E-78 238.679 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. 
The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#7543 - CGI_10005454 superfamily 247724 23 146 1.26E-39 136.896 cl17170 Ras_like_GTPase superfamily N - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#7546 - CGI_10005457 superfamily 243058 238 351 7.86E-33 121.267 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. 
An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#7546 - CGI_10005457 superfamily 243058 104 225 5.43E-31 116.26 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#7546 - CGI_10005457 superfamily 243058 316 434 4.38E-20 86.2143 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. 
ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#7546 - CGI_10005457 superfamily 201951 9 90 2.78E-13 66.2502 cl03339 IBB superfamily - - "Importin beta binding domain; This family consists of the importin alpha (karyopherin alpha), importin beta (karyopherin beta) binding domain. The domain mediates formation of the importin alpha beta complex; required for classical NLS import of proteins into the nucleus, through the nuclear pore complex and across the nuclear envelope. Also in the alignment is the NLS of importin alpha which overlaps with the IBB domain." Q#7549 - CGI_10008610 superfamily 246597 106 422 1.54E-94 293.796 cl13995 MPP_superfamily superfamily - - "metallophosphatase superfamily, metallophosphatase domain; Metallophosphatases (MPPs), also known as metallophosphoesterases, phosphodiesterases (PDEs), binuclear metallophosphoesterases, and dimetal-containing phosphoesterases (DMPs), represent a diverse superfamily of enzymes with a conserved domain containing an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. This superfamily includes: the phosphoprotein phosphatases (PPPs), Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination." 
Q#7550 - CGI_10008611 superfamily 217293 34 227 9.10E-35 128.519 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#7550 - CGI_10008611 superfamily 202474 235 330 8.06E-15 72.3013 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#7551 - CGI_10008612 superfamily 217293 180 380 4.88E-40 145.083 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#7551 - CGI_10008612 superfamily 202474 388 484 1.00E-13 69.6049 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#7552 - CGI_10008613 superfamily 247044 120 254 3.20E-49 161.245 cl15697 ADF_gelsolin superfamily - - Actin depolymerization factor/cofilin- and gelsolin-like domains; Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. Q#7552 - CGI_10008613 superfamily 247044 2 85 3.75E-40 138.149 cl15697 ADF_gelsolin superfamily N - Actin depolymerization factor/cofilin- and gelsolin-like domains; Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. 
Q#7553 - CGI_10008614 superfamily 247905 196 310 2.41E-19 85.7524 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#7553 - CGI_10008614 superfamily 247805 16 178 1.89E-51 179.218 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#7553 - CGI_10008614 superfamily 243199 826 872 2.01E-12 66.5458 cl02808 RT_like superfamily C - "RT_like: Reverse transcriptase (RT, RNA-dependent DNA polymerase)_like family. An RT gene is usually indicative of a mobile element such as a retrotransposon or retrovirus. RTs occur in a variety of mobile elements, including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, and caulimoviruses. These elements can be divided into two major groups. One group contains retroviruses and DNA viruses whose propagation involves an RNA intermediate. They are grouped together with transposable elements containing long terminal repeats (LTRs). The other group, also called poly(A)-type retrotransposons, contain fungal mitochondrial introns and transposable elements that lack LTRs." 
Q#7554 - CGI_10008615 superfamily 207685 101 171 2.69E-36 123.053 cl02642 PABP superfamily - - "Poly-adenylate binding protein, unique domain; The region featured in this family is found towards the C-terminus of poly(A)-binding proteins (PABPs). These are eukaryotic proteins that, through their binding of the 3' poly(A) tail on mRNA, have very important roles in the pathways of gene expression. They seem to provide a scaffold on which other proteins can bind and mediate processes such as export, translation and turnover of the transcripts. Moreover, they may act as antagonists to the binding of factors that allow mRNA degradation, regulating mRNA longevity. PABPs are also involved in nuclear transport. PABPs interact with poly(A) tails via RNA-recognition motifs (pfam00076). Note that the PABP C-terminal region is also found in members of the hyperplastic discs protein (HYD) family of ubiquitin ligases that contain HECT domains - these are also included in this family." Q#7555 - CGI_10008616 superfamily 241571 142 228 3.48E-15 70.1338 cl00049 CUB superfamily N - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#7556 - CGI_10000847 superfamily 241559 29 58 7.97E-05 36.4884 cl00030 CH superfamily C - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." 
Q#7559 - CGI_10023171 superfamily 241743 229 343 3.35E-27 104.313 cl00274 ML superfamily - - "The ML (MD-2-related lipid-recognition) domain is present in MD-1, MD-2, GM2 activator protein, Niemann-Pick type C2 (Npc2) protein, phosphatidylinositol/phosphatidylglycerol transfer protein (PG/PI-TP), mite allergen Der p 2 and several proteins of unknown function in plants, animals and fungi. These single-domain proteins form two anti-parallel beta-pleated sheets stabilized by three disulfide bonds and with an accessible central hydrophobic cavity, and are predicted to mediate diverse biological functions through interaction with specific lipids." Q#7559 - CGI_10023171 superfamily 246918 49 101 1.98E-15 69.9231 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#7559 - CGI_10023171 superfamily 246918 106 158 1.40E-13 64.9155 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#7562 - CGI_10023174 superfamily 241554 164 209 1.37E-10 58.6724 cl00019 Macro superfamily N - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." 
Q#7562 - CGI_10023174 superfamily 245226 11 59 0.00059847 38.8209 cl10012 DnaQ_like_exo superfamily N - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#7565 - CGI_10023177 superfamily 247745 113 453 0 539.545 cl17191 GH38-57_N_LamB_YdjC_SF superfamily - - "Catalytic domain of glycoside hydrolase (GH) families 38 and 57, lactam utilization protein LamB/YcsF family proteins, YdjC-family proteins, and similar proteins; The superfamily possesses strong sequence similarities across a wide range of all three kingdoms of life. 
It mainly includes four families, glycoside hydrolases family 38 (GH38), heat stable retaining glycoside hydrolases family 57 (GH57), lactam utilization protein LamB/YcsF family, and YdjC-family. The GH38 family corresponds to class II alpha-mannosidases (alphaMII, EC 3.2.1.24), which contain intermediate Golgi alpha-mannosidases II, acidic lysosomal alpha-mannosidases, animal sperm and epididymal alpha-mannosidases, neutral ER/cytosolic alpha-mannosidases, and some putative prokaryotic alpha-mannosidases. AlphaMII possess a-1,3, a-1,6, and a-1,2 hydrolytic activity, and catalyzes the degradation of N-linked oligosaccharides by employing a two-step mechanism involving the formation of a covalent glycosyl enzyme complex. GH57 is a purely prokaryotic family with the majority of thermostable enzymes from extremophiles (many of them are archaeal hyperthermophiles), which exhibit the enzyme specificities of alpha-amylase (EC 3.2.1.1), 4-alpha-glucanotransferase (EC 2.4.1.25), amylopullulanase (EC 3.2.1.1/41), and alpha-galactosidase (EC 3.2.1.22). This family also includes many hypothetical proteins with uncharacterized activity and specificity. GH57 cleaves alpha-glycosidic bond by employing a retaining mechanism, which involves a glycosyl-enzyme intermediate, allowing transglycosylation. Although the exact molecular function of LamB/YcsF family and YdjC-family remains unclear, they show high sequence and structure homology to the members of GH38 and GH57. Their catalytic domains adopt a similar parallel 7-stranded beta/alpha barrel, which is remotely related to catalytic NodB homology domain of the carbohydrate esterase 4 superfamily." Q#7565 - CGI_10023177 superfamily 245003 447 533 1.34E-19 85.7114 cl08536 Alpha-mann_mid superfamily - - "Alpha mannosidase, middle domain; Members of this family adopt a structure consisting of three alpha helices, in an immunoglobulin/albumin-binding domain-like fold. They are predominantly found in the enzyme alpha-mannosidase." 
Q#7566 - CGI_10023178 superfamily 243072 69 195 1.08E-30 112.862 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#7567 - CGI_10023179 superfamily 241578 44 208 1.55E-49 169.331 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#7569 - CGI_10023181 superfamily 207662 13 84 5.19E-34 124.982 cl02596 NR_DBD_like superfamily - - "DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers; DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. 
It interacts with a specific DNA site upstream of the target gene and modulates the rate of transcriptional initiation. Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions, from development, reproduction, to homeostasis and metabolism in animals (metazoans). The family contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). Most nuclear receptors bind as homodimers or heterodimers to their target sites, which consist of two hexameric half-sites. Specificity is determined by the half-site sequence, the relative orientation of the half-sites and the number of spacer nucleotides between the half-sites. However, a growing number of nuclear receptors have been reported to bind to DNA as monomers." Q#7569 - CGI_10023181 superfamily 245599 540 656 1.92E-07 50.4257 cl11397 NR_LBD superfamily C - "The ligand binding domain of nuclear receptors, a family of ligand-activated transcription regulators; Ligand-binding domain (LBD) of nuclear receptor (NR): Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions in metazoans, from development, reproduction, to homeostasis and metabolism. The superfamily contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. The members of the family include receptors of steroids, thyroid hormone, retinoids, cholesterol by-products, lipids and heme. 
With few exceptions, NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD)." Q#7571 - CGI_10023183 superfamily 218219 15 199 1.58E-35 125.51 cl04693 PRELI superfamily - - "PRELI-like family; This family includes a conserved region found in the PRELI protein and yeast YLR168C gene MSF1 product. The function of this protein is unknown, though it is thought to be involved in intra-mitochondrial protein sorting. This region is also found in a number of other eukaryotic proteins." Q#7574 - CGI_10023186 superfamily 247792 441 479 0.0086164 34.7288 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#7574 - CGI_10023186 superfamily 243109 264 392 4.06E-53 177.509 cl02614 SPRY superfamily - - "SPRY domain; SPRY domains, first identified in the SP1A kinase of Dictyostelium and rabbit Ryanodine receptor (hence the name), are homologous to B30.2. SPRY domains have been identified in at least 11 protein families, covering a wide range of functions, including regulation of cytokine signaling (SOCS), RNA metabolism (DDX1 and hnRNP), immunity to retroviruses (TRIM5alpha), intracellular calcium release (ryanodine receptors or RyR) and regulatory and developmental processes (HERC1 and Ash2L). 
B30.2 also contains residues in the N-terminus that form a distinct PRY domain structure; i.e. B30.2 domain consists of PRY and SPRY subdomains. B30.2 domains comprise the C-terminus of three protein families: BTNs (receptor glycoproteins of immunoglobulin superfamily); several TRIM proteins (composed of RING/B-box/coiled-coil or RBCC core); Stonutoxin (secreted poisonous protein of the stonefish Synanceia horrida). While SPRY domains are evolutionarily ancient, B30.2 domains are a more recent adaptation where the SPRY/PRY combination is a possible component of immune defense. Mutations found in the SPRY-containing proteins have shown to cause Mediterranean fever and Opitz syndrome." Q#7577 - CGI_10023189 superfamily 216944 55 128 6.03E-12 57.5923 cl03496 Propep_M14 superfamily - - "Carboxypeptidase activation peptide; Carboxypeptidases are found in abundance in pancreatic secretions. The pro-segment moiety (activation peptide) accounts for up to a quarter of the total length of the peptidase, and is responsible for modulation of folding and activity of the pro-enzyme." Q#7578 - CGI_10023190 superfamily 245595 5 276 3.58E-143 407.677 cl11393 Peptidase_M14_like superfamily - - "M14 family of metallocarboxypeptidases and related proteins; The M14 family of metallocarboxypeptidases (MCPs), also known as funnelins, are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Two major subfamilies of the M14 family, defined based on sequence and structural homology, are the A/B and N/E subfamilies. Enzymes belonging to the A/B subfamily are normally synthesized as inactive precursors containing preceding signal peptide, followed by an N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases. 
The A/B enzymes can be further divided based on their substrate specificity; Carboxypeptidase A-like (CPA-like) enzymes favor hydrophobic residues while carboxypeptidase B-like (CPB-like) enzymes only cleave the basic residues lysine or arginine. The A forms have slightly different specificities, with Carboxypeptidase A1 (CPA1) preferring aliphatic and small aromatic residues, and CPA2 preferring the bulky aromatic side chains. Enzymes belonging to the N/E subfamily enzymes are not produced as inactive precursors and instead rely on their substrate specificity and subcellular compartmentalization to prevent inappropriate cleavage. They contain an extra C-terminal transthyretin-like domain, thought to be involved in folding or formation of oligomers. MCPs can also be classified based on their involvement in specific physiological processes; the pancreatic MCPs participate only in alimentary digestion and include carboxypeptidase A and B (A/B subfamily), while others, namely regulatory MCPs or the N/E subfamily, are involved in more selective reactions, mainly in non-digestive tissues and fluids, acting on blood coagulation/fibrinolysis, inflammation and local anaphylaxis, pro-hormone and neuropeptide processing, cellular response and others. Another MCP subfamily, is that of succinylglutamate desuccinylase /aspartoacylase, which hydrolyzes N-acetyl-L-aspartate (NAA), and deficiency in which is the established cause of Canavan disease. Another subfamily (referred to as subfamily C) includes an exceptional type of activity in the MCP family, that of dipeptidyl-peptidase activity of gamma-glutamyl-(L)-meso-diaminopimelate peptidase I which is involved in bacterial cell wall metabolism." Q#7579 - CGI_10023191 superfamily 110440 334 358 0.00301411 35.4613 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. 
The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#7580 - CGI_10023192 superfamily 243066 59 172 2.03E-14 70.3389 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#7580 - CGI_10023192 superfamily 222150 511 535 0.003698 35.8305 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#7581 - CGI_10023193 superfamily 243088 77 184 3.87E-38 132.837 cl02563 PX_domain superfamily - - "The Phox Homology domain, a phosphoinositide binding module; The PX domain is a phosphoinositide (PI) binding module involved in targeting proteins to membranes. 
Proteins containing PX domains interact with PIs and have been implicated in highly diverse functions such as cell signaling, vesicular trafficking, protein sorting, lipid modification, cell polarity and division, activation of T and B cells, and cell survival. Many members of this superfamily bind phosphatidylinositol-3-phosphate (PI3P) but in some cases, other PIs such as PI4P or PI(3,4)P2, among others, are the preferred substrates. In addition to protein-lipid interaction, the PX domain may also be involved in protein-protein interaction, as in the cases of p40phox, p47phox, and some sorting nexins (SNXs). The PX domain is conserved from yeast to humans and is found in more than 100 proteins. The majority of PX domain-containing proteins are SNXs, which play important roles in endosomal sorting." Q#7584 - CGI_10023196 superfamily 245213 284 317 0.000509306 38.3866 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#7584 - CGI_10023196 superfamily 243061 346 445 2.09E-34 125.532 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#7584 - CGI_10023196 superfamily 243061 64 163 2.54E-33 122.451 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. 
These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#7584 - CGI_10023196 superfamily 243061 171 269 1.51E-30 115.132 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#7585 - CGI_10023197 superfamily 248264 67 113 1.00E-05 42.2242 cl17710 DDE_4 superfamily C - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#7586 - CGI_10023199 superfamily 246664 173 653 2.00E-144 434.045 cl14561 An_peroxidase_like superfamily - - "Animal heme peroxidases and related proteins; A diverse family of enzymes, which includes prostaglandin G/H synthase, thyroid peroxidase, myeloperoxidase, linoleate diol synthase, lactoperoxidase, peroxinectin, peroxidasin, and others. Despite its name, this family is not restricted to metazoans: members are found in fungi, plants, and bacteria as well." Q#7587 - CGI_10023200 superfamily 246664 331 710 4.59E-141 424.68 cl14561 An_peroxidase_like superfamily - - "Animal heme peroxidases and related proteins; A diverse family of enzymes, which includes prostaglandin G/H synthase, thyroid peroxidase, myeloperoxidase, linoleate diol synthase, lactoperoxidase, peroxinectin, peroxidasin, and others. Despite its name, this family is not restricted to metazoans: members are found in fungi, plants, and bacteria as well." 
Q#7587 - CGI_10023200 superfamily 246664 222 269 2.17E-05 46.1494 cl14561 An_peroxidase_like superfamily C - "Animal heme peroxidases and related proteins; A diverse family of enzymes, which includes prostaglandin G/H synthase, thyroid peroxidase, myeloperoxidase, linoleate diol synthase, lactoperoxidase, peroxinectin, peroxidasin, and others. Despite its name, this family is not restricted to metazoans: members are found in fungi, plants, and bacteria as well." Q#7588 - CGI_10023201 superfamily 220672 18 111 3.70E-15 68.0422 cl10957 Frag1 superfamily N - "Frag1/DRAM/Sfk1 family; This family includes Frag1, DRAM and Sfk1 proteins. Frag1 (FGF receptor activating protein 1) is a protein that is conserved from fungi to humans. There are four potential iso-prenylation sites throughout the peptide, viz CILW, CIIW and CIGL. Frag1 is a membrane-spanning protein that is ubiquitously expressed in adult tissues suggesting an important cellular function. Dram is a family of proteins conserved from nematodes to humans with six hydrophobic transmembrane regions and an Endoplasmic Reticulum signal peptide. It is a lysosomal protein that induces macro-autophagy as an effector of p53-mediated death, where p53 is the tumour-suppressor gene that is frequently mutated in cancer. Expression of Dram is stress-induced. This region is also part of a family of small plasma membrane proteins, referred to as Sfk1, that may act together with or upstream of Stt4p to generate normal levels of the essential phospholipid PI4P, thus allowing proper localisation of Stt4p to the actin cytoskeleton." Q#7589 - CGI_10023202 superfamily 245213 1458 1490 7.16E-07 48.4018 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#7589 - CGI_10023202 superfamily 245213 1375 1408 3.37E-05 43.7794 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#7589 - CGI_10023202 superfamily 245213 1498 1531 6.40E-05 42.6238 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#7589 - CGI_10023202 superfamily 245213 1120 1152 0.000133628 41.8534 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#7589 - CGI_10023202 superfamily 245213 1576 1612 0.000135648 41.8534 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#7589 - CGI_10023202 superfamily 245213 993 1027 0.00159919 38.7718 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#7589 - CGI_10023202 superfamily 245213 1204 1244 0.00646413 36.8458 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#7589 - CGI_10023202 superfamily 248097 1921 2052 6.56E-26 106.195 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#7589 - CGI_10023202 superfamily 241578 1270 1323 1.84E-05 46.6092 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#7589 - CGI_10023202 superfamily 243124 565 642 9.94E-05 42.7896 cl02648 NIDO superfamily - - Nidogen-like; This is a nidogen-like domain (NIDO) domain and is an extracellular domain found in nidogen and hypothetical proteins of unknown function.
Q#7589 - CGI_10023202 superfamily 241578 1034 1074 0.000106186 44.6832 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#7589 - CGI_10023202 superfamily 241584 307 378 0.000750155 40.0999 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#7589 - CGI_10023202 superfamily 221695 1358 1378 0.00076541 39.3606 cl18612 cEGF superfamily - - "Complement Clr-like EGF-like; cEGF, or complement Clr-like EGF, domains have six conserved cysteine residues disulfide-bonded into the characteristic pattern 'ababcc'.
They are found in blood coagulation proteins such as fibrillin, Clr and Cls, thrombomodulin, and the LDL receptor. The core fold of the EGF domain consists of two small beta-hairpins packed against each other. Two major structural variants have been identified based on the structural context of the C-terminal cysteine residue of disulfide 'c' in the C-terminal hairpin: hEGFs and cEGFs. In cEGFs the C-terminal thiol resides on the C-terminal beta-sheet, resulting in long loop-lengths between the cysteine residues of disulfide 'c', typically C[10+]XC. These longer loop-lengths may have arisen by selective cysteine loss from a four-disulfide EGF template such as laminin or integrin. Tandem cEGF domains have five linking residues between terminal cysteines of adjacent domains. cEGF domains may or may not bind calcium in the linker region. cEGF domains with the consensus motif CXN4X[F,Y]XCXC are hydroxylated exclusively on the asparagine residue." Q#7589 - CGI_10023202 superfamily 245674 40 113 0.00103555 39.5654 cl11531 DUF904 superfamily - - Protein of unknown function (DUF904); This family consists of several bacterial and archaeal hypothetical proteins of unknown function. Q#7589 - CGI_10023202 superfamily 241578 1653 1701 0.00884891 38.52 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers.
There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#7590 - CGI_10023203 superfamily 246722 1 195 7.10E-74 243.818 cl14812 PIN_SF superfamily - - "PIN (PilT N terminus) domain: Superfamily; PIN_SF The PIN (PilT N terminus) domain belongs to a large nuclease superfamily with representatives from eukaryota, eubacteria, and archaea. PIN domains were originally named for their sequence similarity to the N-terminal domain of an annotated pili biogenesis protein, PilT, a domain fusion between a PIN-domain and a PilT ATPase domain. The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues) is geometrically similar in the active center of structure-specific 5' nucleases (also known as Flap endonuclease-1-like), PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Seen here, are two major divisions in the PIN domain superfamily. The first major division, the structure-specific 5' nuclease family, is represented by FEN1, the 5'-3' exonuclease of DNA polymerase I, and T4 RNase H nuclease PIN domains. These 5' nucleases are involved in DNA replication, repair, and recombination. They are capable of both 5'-3' exonucleolytic activity and cleaving bifurcated DNA, in an endonucleolytic, structure-specific manner. Unique to FEN1-like nucleases, the PIN domain has a helical arch/clamp region (I domain) of variable length (approximately 16 to 800 residues) and, inserted within the C-terminal region of the PIN domain, a H3TH (helix-3-turn-helix) domain, an atypical helix-hairpin-helix-2-like region. 
Both the H3TH domain (not included here) and the helical arch/clamp region are involved in DNA binding. With the exception of Mkt1, these nucleases have a carboxylate rich active site that is involved in binding essential divalent metal ion cofactors (Mg2+, Mn2+, Zn2+, or Co2+). The second major division of the PIN domain superfamily, the VapC-Smg6 family, includes such eukaryotic ribonucleases as, Smg6, an essential factor in nonsense-mediated mRNA decay; Rrp44, the catalytic subunit of the exosome; and Nob1, a ribosome assembly factor critical in pre-rRNA processing. A large percentage of members in this family are bacterial ribonuclease toxins of TA operons such as Mycobacterium tuberculosis VapC and Neisseria gonorrhoeae FitB, as well as, archaeal homologs, Pyrobaculum aerophilum Pea0151 and P. aerophilum Pae2754. Also included are the eukaryotic Fcf1/ Utp24 (FAF1-copurifying factor 1/U three-associated protein 24) and Utp23-like proteins. Components of the small subunit processome, Fcf1/Utp24 and Utp23 are essential proteins involved in pre-rRNA processing and 40S ribosomal subunit assembly." Q#7590 - CGI_10023203 superfamily 246724 196 242 1.43E-15 74.3113 cl14815 H3TH_StructSpec-5'-nucleases superfamily C - "H3TH domains of structure-specific 5' nucleases (or flap endonuclease-1-like) involved in DNA replication, repair, and recombination; The 5' nucleases of this superfamily are capable of both 5'-3' exonucleolytic activity and cleaving bifurcated or branched DNA, in an endonucleolytic, structure-specific manner, and are involved in DNA replication, repair, and recombination. The superfamily includes the H3TH (helix-3-turn-helix) domains of Flap Endonuclease-1 (FEN1), Exonuclease-1 (EXO1), Mkt1, Gap Endonuclease 1 (GEN1) and Xeroderma pigmentosum complementation group G (XPG) nuclease. 
Also included are the H3TH domains of the 5'-3' exonucleases of DNA polymerase I and single domain protein homologs, as well as, the bacteriophage T4 RNase H, T5-5'nuclease, and other homologs. These nucleases contain a PIN (PilT N terminus) domain with a helical arch/clamp region/I domain (not included here) and inserted within the C-terminal region of the PIN domain is an atypical helix-hairpin-helix-2 (HhH2)-like region. This atypical HhH2 region, the H3TH domain, has an extended loop with at least three turns between the first two helices, and only three of the four helices appear to be conserved. Both the H3TH domain and the helical arch/clamp region are involved in DNA binding. Studies suggest that a glycine-rich loop in the H3TH domain contacts the phosphate backbone of the template strand in the downstream DNA duplex. Typically, the nucleases within this superfamily have a carboxylate rich active site that is involved in binding essential divalent metal ion cofactors (i. e., Mg2+, Mn2+, Zn2+, or Co2+) required for nuclease activity. The first metal binding site is composed entirely of Asp/Glu residues from the PIN domain, whereas, the second metal binding site is composed generally of two Asp residues from the PIN domain and one or two Asp residues from the H3TH domain. Together with the helical arch and network of amino acids interacting with metal binding ions, the H3TH region defines a positively charged active-site DNA-binding groove in structure-specific 5' nucleases." Q#7590 - CGI_10023203 superfamily 246722 262 315 0.0089284 37.3512 cl14812 PIN_SF superfamily N - "PIN (PilT N terminus) domain: Superfamily; PIN_SF The PIN (PilT N terminus) domain belongs to a large nuclease superfamily with representatives from eukaryota, eubacteria, and archaea. PIN domains were originally named for their sequence similarity to the N-terminal domain of an annotated pili biogenesis protein, PilT, a domain fusion between a PIN-domain and a PilT ATPase domain. 
The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues) is geometrically similar in the active center of structure-specific 5' nucleases (also known as Flap endonuclease-1-like), PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Seen here, are two major divisions in the PIN domain superfamily. The first major division, the structure-specific 5' nuclease family, is represented by FEN1, the 5'-3' exonuclease of DNA polymerase I, and T4 RNase H nuclease PIN domains. These 5' nucleases are involved in DNA replication, repair, and recombination. They are capable of both 5'-3' exonucleolytic activity and cleaving bifurcated DNA, in an endonucleolytic, structure-specific manner. Unique to FEN1-like nucleases, the PIN domain has a helical arch/clamp region (I domain) of variable length (approximately 16 to 800 residues) and, inserted within the C-terminal region of the PIN domain, a H3TH (helix-3-turn-helix) domain, an atypical helix-hairpin-helix-2-like region. Both the H3TH domain (not included here) and the helical arch/clamp region are involved in DNA binding. With the exception of Mkt1, these nucleases have a carboxylate rich active site that is involved in binding essential divalent metal ion cofactors (Mg2+, Mn2+, Zn2+, or Co2+). The second major division of the PIN domain superfamily, the VapC-Smg6 family, includes such eukaryotic ribonucleases as, Smg6, an essential factor in nonsense-mediated mRNA decay; Rrp44, the catalytic subunit of the exosome; and Nob1, a ribosome assembly factor critical in pre-rRNA processing. A large percentage of members in this family are bacterial ribonuclease toxins of TA operons such as Mycobacterium tuberculosis VapC and Neisseria gonorrhoeae FitB, as well as, archaeal homologs, Pyrobaculum aerophilum Pea0151 and P. aerophilum Pae2754. 
Also included are the eukaryotic Fcf1/ Utp24 (FAF1-copurifying factor 1/U three-associated protein 24) and Utp23-like proteins. Components of the small subunit processome, Fcf1/Utp24 and Utp23 are essential proteins involved in pre-rRNA processing and 40S ribosomal subunit assembly." Q#7591 - CGI_10023204 superfamily 243084 1215 1298 5.56E-17 79.3614 cl02556 Bromodomain superfamily - - Bromodomain. Bromodomains are found in many chromatin-associated proteins and in nuclear histone acetyltransferases. They interact specifically with acetylated lysine. Q#7591 - CGI_10023204 superfamily 247999 1058 1100 3.09E-09 55.1892 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#7596 - CGI_10000785 superfamily 247085 84 191 5.92E-27 100.658 cl15820 RICIN superfamily - - "Ricin-type beta-trefoil; Carbohydrate-binding domain formed from presumed gene triplication. The domain is found in a variety of molecules serving diverse functions such as enzymatic activity, inhibitory toxicity and signal transduction. Highly specific ligand binding occurs on exposed surfaces of the compact domain structure." Q#7596 - CGI_10000785 superfamily 245596 1 36 6.01E-20 84.5633 cl11394 Glyco_tranf_GTA_type superfamily NC - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes.
To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#7596 - CGI_10000785 superfamily 245596 38 65 1.63E-09 54.903 cl11394 Glyco_tranf_GTA_type superfamily N - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#7598 - CGI_10009080 superfamily 241600 52 217 1.48E-29 117.341 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. 
Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#7598 - CGI_10009080 superfamily 241600 281 388 3.16E-18 83.8291 cl00085 FReD superfamily C - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#7598 - CGI_10009080 superfamily 241600 549 627 6.25E-06 46.4647 cl00085 FReD superfamily NC - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. 
C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#7600 - CGI_10009082 superfamily 241600 12 78 1.82E-17 78.8215 cl00085 FReD superfamily C - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#7600 - CGI_10009082 superfamily 241600 274 356 8.91E-06 44.9239 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." 
Q#7601 - CGI_10009083 superfamily 241600 125 310 9.78E-53 175.121 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#7605 - CGI_10009087 superfamily 243269 401 784 9.57E-52 186.32 cl03012 Ammonium_transp superfamily - - Ammonium Transporter Family; Ammonium Transporter Family. Q#7605 - CGI_10009087 superfamily 243269 49 337 8.98E-44 163.593 cl03012 Ammonium_transp superfamily - - Ammonium Transporter Family; Ammonium Transporter Family. Q#7606 - CGI_10009088 superfamily 243269 9 390 1.17E-61 214.824 cl03012 Ammonium_transp superfamily - - Ammonium Transporter Family; Ammonium Transporter Family. Q#7606 - CGI_10009088 superfamily 243269 555 843 1.34E-32 130.851 cl03012 Ammonium_transp superfamily N - Ammonium Transporter Family; Ammonium Transporter Family. Q#7607 - CGI_10009089 superfamily 243269 26 436 8.81E-107 327.303 cl03012 Ammonium_transp superfamily - - Ammonium Transporter Family; Ammonium Transporter Family. Q#7608 - CGI_10002458 superfamily 247805 204 262 6.33E-06 45.7912 cl17251 DEXDc superfamily C - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 
Q#7608 - CGI_10002458 superfamily 221913 414 603 2.32E-64 215.867 cl18626 AAA_12 superfamily - - AAA domain; This family of domains contain a P-loop motif that is characteristic of the AAA superfamily. Many of the proteins in this family are conjugative transfer proteins. Q#7608 - CGI_10002458 superfamily 241762 701 759 1.06E-16 76.2366 cl00297 R3H superfamily - - "R3H domain. The name of the R3H domain comes from the characteristic spacing of the most conserved arginine and histidine residues. R3H domains are found in proteins together with ATPase domains, SF1 helicase domains, SF2 DEAH helicase domains, Cys-rich repeats, ring-type zinc fingers, and KH domains. The function of the domain is predicted to bind ssDNA or ssRNA in a sequence-specific manner." Q#7608 - CGI_10002458 superfamily 207411 903 946 1.81E-09 55.1433 cl01438 zf-AN1 superfamily - - "AN1-like Zinc finger; Zinc finger at the C-terminus of An1, a ubiquitin-like protein in Xenopus laevis. The following pattern describes the zinc finger. C-X2-C-X(9-12)-C-X(1-2)-C-X4-C-X2-H-X5-H-X-C Where X can be any amino acid, and numbers in brackets indicate the number of residues." 
Q#7609 - CGI_10002459 superfamily 247684 174 578 2.31E-64 222.538 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#7609 - CGI_10002459 superfamily 247684 721 890 7.75E-26 110.515 cl17037 NBD_sugar-kinase_HSP70_actin superfamily C - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#7611 - CGI_10009747 superfamily 245201 17 291 3.11E-153 433.627 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. 
These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#7614 - CGI_10009750 superfamily 241609 55 138 6.81E-22 84.3519 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#7614 - CGI_10009750 superfamily 241609 1 52 1.03E-11 56.9306 cl00100 KR superfamily N - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#7615 - CGI_10009751 superfamily 241589 109 197 4.96E-18 77.3211 cl00071 GLECT superfamily C - "Galectin/galactose-binding lectin. This domain exclusively binds beta-galactosides, such as lactose, and does not require metal ions for activity. GLECT domains occur as homodimers or tandemly repeated domains. They are developmentally regulated and may be involved in differentiation, cell-cell interaction and cellular regulation." Q#7615 - CGI_10009751 superfamily 241589 2 93 2.86E-14 66.5036 cl00071 GLECT superfamily - - "Galectin/galactose-binding lectin. This domain exclusively binds beta-galactosides, such as lactose, and does not require metal ions for activity. GLECT domains occur as homodimers or tandemly repeated domains. 
They are developmentally regulated and may be involved in differentiation, cell-cell interaction and cellular regulation." Q#7617 - CGI_10009753 superfamily 248097 131 257 2.26E-27 103.114 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#7618 - CGI_10009754 superfamily 243072 43 166 3.48E-24 93.217 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#7621 - CGI_10002534 superfamily 248264 31 72 0.00505911 34.135 cl17710 DDE_4 superfamily C - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#7625 - CGI_10002538 superfamily 241619 34 80 4.12E-05 39.8729 cl00112 PAN_APPLE superfamily C - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. 
PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#7625 - CGI_10002538 superfamily 245847 136 189 0.00161609 36.3806 cl12042 FA58C superfamily C - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#7626 - CGI_10002539 superfamily 199156 101 116 0.000529104 35.4968 cl15298 zf-CCHC superfamily - - "Zinc knuckle; The zinc knuckle is a zinc binding motif composed of the following CX2CX4HX4C where X can be any amino acid. The motifs are mostly from retroviral gag proteins (nucleocapsid). Prototype structure is from HIV. Also contains members involved in eukaryotic gene regulation, such as C. elegans GLH-1. Structure is an 18-residue zinc finger." Q#7630 - CGI_10002615 superfamily 248097 13 133 3.42E-19 81.5426 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#7630 - CGI_10002615 superfamily 248097 215 334 1.94E-17 76.9202 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#7630 - CGI_10002615 superfamily 248097 135 202 7.03E-09 52.6526 cl17543 C1q superfamily N - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#7631 - CGI_10002388 superfamily 245213 100 136 1.49E-08 47.2462 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions.
Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#7633 - CGI_10002410 superfamily 248338 183 284 0.000348933 40.6625 cl17784 Peptidase_C48 superfamily N - "Ulp1 protease family, C-terminal catalytic domain; This domain contains the catalytic triad Cys-His-Asn." Q#7634 - CGI_10003031 superfamily 243082 32 243 5.36E-21 90.235 cl02553 Peptidase_C19 superfamily C - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." Q#7634 - CGI_10003031 superfamily 243082 263 386 3.90E-14 70.2046 cl02553 Peptidase_C19 superfamily N - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome."
Q#7635 - CGI_10003032 superfamily 245819 817 995 3.09E-56 193.565 cl11967 Nucleotidyl_cyc_III superfamily - - "Class III nucleotidyl cyclases; Class III nucleotidyl cyclases are the largest, most diverse group of nucleotidyl cyclases (NC's) containing prokaryotic and eukaryotic proteins. They can be divided into two major groups; the mononucleotidyl cyclases (MNC's) and the diguanylate cyclases (DGC's). The MNC's, which include the adenylate cyclases (AC's) and the guanylate cyclases (GC's), have a conserved cyclase homology domain (CHD), while the DGC's have a conserved GGDEF domain, named after a conserved motif within this subgroup. Their products, cyclic guanylyl and adenylyl nucleotides, are second messengers that play important roles in eukaryotic signal transduction and prokaryotic sensory pathways." Q#7635 - CGI_10003032 superfamily 245201 497 744 1.90E-37 143.061 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#7635 - CGI_10003032 superfamily 245225 25 373 3.07E-33 133.591 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily - - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. 
This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein couples receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine- binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#7635 - CGI_10003032 superfamily 219526 763 804 2.85E-07 51.0807 cl06648 HNOBA superfamily N - "Heme NO binding associated; The HNOBA domain is found associated with the HNOB domain and pfam00211 in soluble cyclases and signalling proteins. 
The HNOB domain is predicted to function as a heme-dependent sensor for gaseous ligands, and transduce diverse downstream signals, in both bacteria and animals." Q#7642 - CGI_10000979 superfamily 243091 39 150 2.45E-16 71.2115 cl02566 SET superfamily - - "SET domain; SET domains are protein lysine methyltransferase enzymes. SET domains appear to be protein-protein interaction domains. It has been demonstrated that SET domains mediate interactions with a family of proteins that display similarity with dual-specificity phosphatases (dsPTPases). A subset of SET domains have been called PR domains. These domains are divergent in sequence from other SET domains, but also appear to mediate protein-protein interaction. The SET domain consists of two regions known as SET-N and SET-C. SET-C forms an unusual and conserved knot-like structure of probably functional importance. Additionally to SET-N and SET-C, an insert region (SET-I) and flanking regions of high structural variability form part of the overall structure." Q#7643 - CGI_10027495 superfamily 241563 72 110 0.000130209 40.1552 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#7643 - CGI_10027495 superfamily 217579 105 172 0.0057649 35.8725 cl04094 XH superfamily C - XH domain; The XH (rice gene X Homology) domain is found in a family of plant proteins including gene X. The molecular function of these proteins is unknown. However these proteins usually contain an XS domain that is also found in the PTGS protein SGS3. This domain contains a conserved glutamate residue that may be functionally important. 
Q#7644 - CGI_10027496 superfamily 245027 106 217 0.00147681 37.758 cl09176 FlgN superfamily - - FlgN protein; This family includes the FlgN protein and export chaperone involved in flagellar synthesis. Q#7644 - CGI_10027496 superfamily 241563 72 110 0.00173119 36.6884 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#7646 - CGI_10027498 superfamily 246751 72 367 5.75E-119 348.849 cl14883 Lipase superfamily - - "Lipase. Lipases are esterases that can hydrolyze long-chain acyl-triglycerides into di- and monoglycerides, glycerol, and free fatty acids at a water/lipid interface. A typical feature of lipases is "interfacial activation", the process of becoming active at the lipid/water interface, although several examples of lipases have been identified that do not undergo interfacial activation . The active site of a lipase contains a catalytic triad consisting of Ser - His - Asp/Glu, but unlike most serine proteases, the active site is buried inside the structure. A "lid" or "flap" covers the active site, making it inaccessible to solvent and substrates. The lid opens during the process of interfacial activation, allowing the lipid substrate access to the active site." Q#7647 - CGI_10027499 superfamily 246751 57 352 1.11E-113 334.597 cl14883 Lipase superfamily - - "Lipase. Lipases are esterases that can hydrolyze long-chain acyl-triglycerides into di- and monoglycerides, glycerol, and free fatty acids at a water/lipid interface. A typical feature of lipases is "interfacial activation", the process of becoming active at the lipid/water interface, although several examples of lipases have been identified that do not undergo interfacial activation . 
The active site of a lipase contains a catalytic triad consisting of Ser - His - Asp/Glu, but unlike most serine proteases, the active site is buried inside the structure. A "lid" or "flap" covers the active site, making it inaccessible to solvent and substrates. The lid opens during the process of interfacial activation, allowing the lipid substrate access to the active site." Q#7648 - CGI_10027500 superfamily 243096 68 248 1.26E-32 126.258 cl02571 RhoGEF superfamily - - Guanine nucleotide exchange factor for Rho/Rac/Cdc42-like GTPases; Also called Dbl-homologous (DH) domain. It appears that PH domains invariably occur C-terminal to RhoGEF/DH domains. Q#7649 - CGI_10027501 superfamily 241782 21 382 5.21E-150 431.714 cl00321 AAT_I superfamily - - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. 
Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phophorylase family (fold type V)." Q#7652 - CGI_10027504 superfamily 247792 234 278 3.60E-06 43.2032 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#7652 - CGI_10027504 superfamily 218247 22 214 1.07E-33 123.641 cl04727 Pex2_Pex12 superfamily - - "Pex2 / Pex12 amino terminal region; This region is found at the N terminal of a number of known and predicted peroxins including Pex2, Pex10 and Pex12. This conserved region is usually associated with a C terminal ring finger (pfam00097) domain." Q#7658 - CGI_10027511 superfamily 216363 92 197 2.16E-26 98.6965 cl08312 UPF0029 superfamily - - Uncharacterized protein family UPF0029; Uncharacterized protein family UPF0029. Q#7659 - CGI_10027512 superfamily 248012 22 158 3.41E-10 53.8165 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#7660 - CGI_10027513 superfamily 248012 516 657 1.44E-07 50.3497 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. 
Q#7660 - CGI_10027513 superfamily 214507 399 450 3.24E-07 48.1952 cl15307 LRRCT superfamily - - Leucine rich repeat C-terminal domain; Leucine rich repeat C-terminal domain. Q#7661 - CGI_10027514 superfamily 245201 639 955 2.39E-49 176.75 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#7662 - CGI_10027515 superfamily 241645 5 116 1.22E-64 193.991 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#7663 - CGI_10027516 superfamily 243132 7 141 2.82E-47 170.637 cl02661 A_deamin superfamily C - "Adenosine-deaminase (editase) domain; Adenosine deaminases acting on RNA (ADARs) can deaminate adenosine to form inosine. 
In long double-stranded RNA, this process is non-specific; it occurs site-specifically in RNA transcripts. The former is important in defence against viruses, whereas the latter may affect splicing or untranslated regions. They are primarily nuclear proteins, but a longer isoform of ADAR1 is found predominantly in the cytoplasm. ADARs are derived from the Tad1-like tRNA deaminases that are present across eukaryotes. These in turn belong to the nucleotide/nucleic acid deaminase superfamily and are characterized by a distinct insert between the two conserved cysteines that are involved in binding zinc." Q#7663 - CGI_10027516 superfamily 243132 321 543 4.53E-45 163.284 cl02661 A_deamin superfamily N - "Adenosine-deaminase (editase) domain; Adenosine deaminases acting on RNA (ADARs) can deaminate adenosine to form inosine. In long double-stranded RNA, this process is non-specific; it occurs site-specifically in RNA transcripts. The former is important in defence against viruses, whereas the latter may affect splicing or untranslated regions. They are primarily nuclear proteins, but a longer isoform of ADAR1 is found predominantly in the cytoplasm. ADARs are derived from the Tad1-like tRNA deaminases that are present across eukaryotes. These in turn belong to the nucleotide/nucleic acid deaminase superfamily and are characterized by a distinct insert between the two conserved cysteines that are involved in binding zinc." Q#7666 - CGI_10027519 superfamily 216316 20 234 2.96E-72 230.594 cl10574 CD36 superfamily N - CD36 family; The CD36 family is thought to be a novel class of scavenger receptors. There is also evidence suggesting a possible role in signal transduction. CD36 is involved in cell adhesion. 
Q#7667 - CGI_10027520 superfamily 217380 39 316 4.60E-88 271.12 cl18406 TTL superfamily - - "Tubulin-tyrosine ligase family; Tubulins and microtubules are subjected to several post-translational modifications of which the reversible detyrosination/tyrosination of the carboxy-terminal end of most alpha-tubulins has been extensively analysed. This modification cycle involves a specific carboxypeptidase and the activity of the tubulin-tyrosine ligase (TTL). The true physiological function of TTL has so far not been established. Tubulin-tyrosine ligase (TTL) catalyzes the ATP-dependent post-translational addition of a tyrosine to the carboxy terminal end of detyrosinated alpha-tubulin. In normally cycling cells, the tyrosinated form of tubulin predominates. However, in breast cancer cells, the detyrosinated form frequently predominates, with a correlation to tumour aggressiveness. On the other hand, 3-nitrotyrosine has been shown to be incorporated, by TTL, into the carboxy terminal end of detyrosinated alpha-tubulin. This reaction is not reversible by the carboxypeptidase enzyme. Cells cultured in 3-nitrotyrosine rich medium showed evidence of altered microtubule structure and function, including altered cell morphology, epithelial barrier dysfunction, and apoptosis. Bacterial homologs of TTL are predicted to form peptide tags. Some of these are fused to a 2-oxoglutarate Fe(II)-dependent dioxygenase domain." Q#7668 - CGI_10027521 superfamily 216456 569 721 1.03E-15 78.5194 cl03182 RYDR_ITPR superfamily - - "RIH domain; The RIH (RyR and IP3R Homology) domain is an extracellular domain from two types of calcium channels. This region is found in the ryanodine receptor and the inositol-1,4,5- trisphosphate receptor. This domain may form a binding site for IP3." 
Q#7668 - CGI_10027521 superfamily 219849 1884 1975 6.43E-12 65.2835 cl09597 RIH_assoc superfamily - - "RyR and IP3R Homology associated; This eukaryotic domain is found in ryanodine receptors (RyR) and inositol 1,4,5-trisphosphate receptors (IP3R) which together form a superfamily of homotetrameric ligand-gated intracellular Ca2+ channels. There seems to be no known function for this domain. Also see the IP3-binding domain pfam01365 and pfam02815." Q#7675 - CGI_10027528 superfamily 247755 578 813 2.76E-150 443.595 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#7675 - CGI_10027528 superfamily 216049 247 420 9.45E-06 46.8954 cl18356 ABC_membrane superfamily C - ABC transporter transmembrane region; This family represents a unit of six transmembrane helices. Many members of the ABC transporter family (pfam00005) have two such regions. Q#7676 - CGI_10027529 superfamily 241867 26 283 7.50E-119 344.052 cl00446 Lactamase_B superfamily - - Metallo-beta-lactamase superfamily; Metallo-beta-lactamase superfamily. Q#7677 - CGI_10027530 superfamily 245601 2685 2982 1.82E-37 144.439 cl11399 HP superfamily - - "Histidine phosphatase domain found in a functionally diverse set of proteins, mostly phosphatases; contains a His residue which is phosphorylated during the reaction; Catalytic domain of a functionally diverse set of proteins, most of which are phosphatases. 
The conserved catalytic core of this domain contains a His residue which is phosphorylated in the reaction. This set of proteins includes cofactor-dependent and cofactor-independent phosphoglycerate mutases (dPGM, and BPGM respectively), fructose-2,6-bisphosphatase (F26BP)ase, Sts-1, SixA, histidine acid phosphatases, phytases, and related proteins. Functions include roles in metabolism, signaling, or regulation, for example F26BPase affects glycolysis and gluconeogenesis through controlling the concentration of F26BP; BPGM controls the concentration of 2,3-BPG (the main allosteric effector of hemoglobin in human blood cells); human Sts-1 is a T-cell regulator; Escherichia coli Six A participates in the ArcB-dependent His-to-Asp phosphorelay signaling system; phytases scavenge phosphate from extracellular sources. Deficiency and mutation in many of the human members result in disease, for example erythrocyte BPGM deficiency is a disease associated with a decrease in the concentration of 2,3-BPG. Clinical applications include the use of prostatic acid phosphatase (PAP) as a serum marker for prostate cancer. Agricultural applications include the addition of phytases to animal feed." Q#7677 - CGI_10027530 superfamily 245213 2250 2282 1.64E-07 51.0982 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#7677 - CGI_10027530 superfamily 245213 1266 1298 9.89E-07 48.787 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#7677 - CGI_10027530 superfamily 245213 2334 2377 1.10E-05 45.7054 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#7677 - CGI_10027530 superfamily 245213 2067 2101 8.04E-05 43.009 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#7677 - CGI_10027530 superfamily 245213 1095 1129 8.04E-05 43.009 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#7677 - CGI_10027530 superfamily 245213 2505 2545 9.83E-05 43.009 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#7677 - CGI_10027530 superfamily 245213 2291 2328 0.000311897 41.4682 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#7677 - CGI_10027530 superfamily 245213 1307 1343 0.000846663 39.9274 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#7677 - CGI_10027530 superfamily 245213 1390 1433 0.00390203 38.0014 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#7677 - CGI_10027530 superfamily 245213 1475 1515 0.00497783 37.6162 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#7677 - CGI_10027530 superfamily 245213 2195 2228 0.00661794 37.231 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#7677 - CGI_10027530 superfamily 241578 1511 1553 6.64E-08 54.6983 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#7677 - CGI_10027530 superfamily 246918 1646 1692 4.51E-07 49.8927 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. 
Q#7677 - CGI_10027530 superfamily 241578 2590 2630 1.10E-06 50.8464 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#7677 - CGI_10027530 superfamily 243124 295 437 1.33E-05 46.6513 cl02648 NIDO superfamily - - Nidogen-like; This is a nidogen-like domain (NIDO) domain and is an extracellular domain found in nidogen and hypothetical proteins of unknown function. Q#7677 - CGI_10027530 superfamily 245213 166 209 1.60E-05 45.0324 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#7677 - CGI_10027530 superfamily 241578 1130 1169 2.87E-05 46.6092 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#7677 - CGI_10027530 superfamily 241578 2102 2141 7.30E-05 45.4536 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. 
Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#7677 - CGI_10027530 superfamily 245213 2546 2590 0.000360051 41.1804 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#7677 - CGI_10027530 superfamily 245213 1223 1261 0.000476573 40.7952 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#7677 - CGI_10027530 superfamily 246918 1822 1871 0.000489004 41.0331 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#7677 - CGI_10027530 superfamily 246918 13 62 0.000626463 40.6479 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. 
Q#7677 - CGI_10027530 superfamily 221695 1370 1393 0.00185162 38.9754 cl18612 cEGF superfamily - - "Complement Clr-like EGF-like; cEGF, or complement Clr-like EGF, domains have six conserved cysteine residues disulfide-bonded into the characteristic pattern 'ababcc'. They are found in blood coagulation proteins such as fibrillin, Clr and Cls, thrombomodulin, and the LDL receptor. The core fold of the EGF domain consists of two small beta-hairpins packed against each other. Two major structural variants have been identified based on the structural context of the C-terminal cysteine residue of disulfide 'c' in the C-terminal hairpin: hEGFs and cEGFs. In cEGFs the C-terminal thiol resides on the C-terminal beta-sheet, resulting in long loop-lengths between the cysteine residues of disulfide 'c', typically C[10+]XC. These longer loop-lengths may have arisen by selective cysteine loss from a four-disulfide EGF template such as laminin or integrin. Tandem cEGF domains have five linking residues between terminal cysteines of adjacent domains. cEGF domains may or may not bind calcium in the linker region. cEGF domains with the consensus motif CXN4X[F,Y]XCXC are hydroxylated exclusively on the asparagine residue." Q#7677 - CGI_10027530 superfamily 245213 1052 1086 0.00307216 38.484 cl09941 EGF_CA superfamily C - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#7677 - CGI_10027530 superfamily 245213 1436 1470 0.0038134 38.0988 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#7677 - CGI_10027530 superfamily 245213 1350 1381 0.00424043 38.0014 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#7677 - CGI_10027530 superfamily 245213 2024 2057 0.0044608 38.0988 cl09941 EGF_CA superfamily C - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#7677 - CGI_10027530 superfamily 245213 1599 1635 0.00683887 37.2264 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#7677 - CGI_10027530 superfamily 245213 2464 2497 0.00849291 36.9432 cl09941 EGF_CA superfamily C - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#7678 - CGI_10027531 superfamily 217945 172 247 5.55E-14 69.5625 cl18435 B-block_TFIIIC superfamily - - B-block binding subunit of TFIIIC; Yeast transcription factor IIIC (TFIIIC) is a multi-subunit protein complex that interacts with two control elements of class III promoters called the A and B blocks. This family represents the subunit within TFIIIC involved in B-block binding. 
Q#7679 - CGI_10027532 superfamily 220692 61 355 3.78E-21 92.6525 cl18570 7TM_GPCR_Srw superfamily - - Serpentine type 7TM GPCR chemoreceptor Srw; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srw is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. The genes encoding Srw do not appear to be under as strong an adaptive evolutionary pressure as those of Srz. Q#7680 - CGI_10027533 superfamily 245870 503 623 2.40E-13 67.7291 cl12097 DUF1772 superfamily - - Domain of unknown function (DUF1772); This domain is of unknown function. Q#7681 - CGI_10027534 superfamily 245870 23 156 2.24E-12 60.0251 cl12097 DUF1772 superfamily - - Domain of unknown function (DUF1772); This domain is of unknown function. Q#7682 - CGI_10027535 superfamily 245870 23 156 1.79E-13 63.1067 cl12097 DUF1772 superfamily - - Domain of unknown function (DUF1772); This domain is of unknown function. Q#7687 - CGI_10027540 superfamily 193687 39 118 8.88E-35 117.639 cl00160 LbetaH superfamily - - "Left-handed parallel beta-Helix (LbetaH or LbH) domain: The alignment contains 5 turns, each containing three imperfect tandem repeats of a hexapeptide repeat motif (X-[STAV]-X-[LIV]-[GAED]-X). Proteins containing hexapeptide repeats are often enzymes showing acyltransferase activity, however, some subfamilies in this hierarchy also show activities related to ion transport or translation initiation. Many are trimeric in their active forms." 
Q#7687 - CGI_10027540 superfamily 245596 1 17 4.17E-06 43.3533 cl11394 Glyco_tranf_GTA_type superfamily N - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#7688 - CGI_10027541 superfamily 241782 64 482 3.70E-85 270.594 cl00321 AAT_I superfamily - - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. 
Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phophorylase family (fold type V)." Q#7689 - CGI_10027542 superfamily 245814 450 519 1.55E-12 65.2031 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#7689 - CGI_10027542 superfamily 245814 344 415 2.22E-11 62.1376 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#7689 - CGI_10027542 superfamily 245814 232 304 1.02E-06 48.2705 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#7689 - CGI_10027542 superfamily 245814 11 55 0.00304093 37.4849 cl11960 Ig superfamily N - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#7690 - CGI_10027543 superfamily 247723 274 357 1.76E-32 121.893 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#7690 - CGI_10027543 superfamily 247723 144 224 2.03E-28 110.161 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. 
The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#7690 - CGI_10027543 superfamily 243107 759 803 1.81E-15 72.1925 cl02611 G-patch superfamily - - "G-patch domain; This domain is found in a number of RNA binding proteins, and is also found in proteins that contain RNA binding domains. This suggests that this domain may have an RNA binding function. This domain has seven highly conserved glycines." Q#7691 - CGI_10027544 superfamily 220771 1 33 3.12E-07 42.8057 cl11113 APC_CDC26 superfamily C - "Anaphase-promoting complex APC subunit 1; The anaphase-promoting complex (APC) or cyclosome is a cell cycle-regulated ubiquitin-protein ligase that regulates important events in mitosis such as the initiation of anaphase and exit from telophase. The APC, in conjunction with other enzymes, assembles multi-ubiquitin chains on a variety of regulatory proteins thereby targeting them for proteolysis by the 26S proteasome. CDC26 is one of the nine or so subunits identified within APC but its exact function is not known. The APC/C becomes active at the metaphase/anaphase transition and remains active during G1 phase. One mechanism linked to activation of the APC/C is phosphorylation. The yeast APC/C is composed of at least 13 subunits, but the function of many of the subunits is unknown. Hcn1 is the smallest subunit of the S. pombe APC/C, and is found to be essential for cell viability, APC/C integrity, and proper APC/C regulation. In addition, Hcn1 phosphorylation indicates a specific role for the phosphorylation of this subunit late in the cell cycle." Q#7692 - CGI_10027545 superfamily 241600 307 384 1.41E-27 113.104 cl00085 FReD superfamily C - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. 
Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#7693 - CGI_10027546 superfamily 247684 38 418 7.31E-61 214.449 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. 
The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#7693 - CGI_10027546 superfamily 247684 677 964 1.21E-56 202.507 cl17037 NBD_sugar-kinase_HSP70_actin superfamily N - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#7694 - CGI_10027547 superfamily 243124 99 217 4.67E-29 109.054 cl02648 NIDO superfamily C - Nidogen-like; This is a nidogen-like domain (NIDO) domain and is an extracellular domain found in nidogen and hypothetical proteins of unknown function. 
Q#7700 - CGI_10004285 superfamily 241559 172 228 3.18E-05 41.1471 cl00030 CH superfamily N - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." Q#7701 - CGI_10004286 superfamily 198867 14 50 5.21E-06 41.9433 cl06652 BACK superfamily N - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation)." Q#7704 - CGI_10005988 superfamily 246671 162 259 4.55E-15 73.6112 cl14606 Reeler_cohesin_like superfamily N - "Domains similar to the eukaryotic reeler domain and bacterial cohesins; This diverse family summarizes a set of distantly related domains, as revealed by structural similarity." Q#7704 - CGI_10005988 superfamily 241563 498 532 0.000697306 38.6144 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#7704 - CGI_10005988 superfamily 242530 536 600 0.00187283 37.866 cl01483 DUF964 superfamily C - Protein of unknown function (DUF964); This family consists of several relatively short bacterial and archaeal hypothetical sequences. The function of this family is unknown. 
Q#7707 - CGI_10005991 superfamily 220691 86 263 0.00624391 36.4418 cl18569 7TM_GPCR_Srv superfamily N - Serpentine type 7TM GPCR chemoreceptor Srv; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srv is a member of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#7708 - CGI_10005993 superfamily 110440 523 550 0.00154667 36.6169 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#7711 - CGI_10005996 superfamily 247725 1132 1259 3.10E-56 192.122 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#7711 - CGI_10005996 superfamily 219738 965 1054 1.55E-06 48.1899 cl06980 Anillin superfamily C - "Cell division protein anillin; Anillin is a protein involved in septin organisation during cell division. 
It is an actin binding protein that is localised to the cleavage furrow, and it maintains the localisation of active myosin, which ensures the spatial control of concerted contraction during cytokinesis." Q#7716 - CGI_10024455 superfamily 218609 21 83 1.16E-22 85.1179 cl05189 Destabilase superfamily N - "Destabilase; Destabilase is an endo-epsilon(gamma-Glu)-Lys isopeptidase, which cleaves isopeptide bonds formed by transglutaminase (Factor XIIIa) between glutamine gamma-carboxamide and the epsilon-amino group of lysine." Q#7717 - CGI_10024456 superfamily 217293 33 232 1.08E-35 131.216 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#7717 - CGI_10024456 superfamily 202474 239 335 8.65E-14 69.2197 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#7718 - CGI_10024457 superfamily 218609 1 94 1.43E-36 121.327 cl05189 Destabilase superfamily - - "Destabilase; Destabilase is an endo-epsilon(gamma-Glu)-Lys isopeptidase, which cleaves isopeptide bonds formed by transglutaminase (Factor XIIIa) between glutamine gamma-carboxamide and the epsilon-amino group of lysine." Q#7719 - CGI_10024458 superfamily 243035 7 68 5.13E-20 77.7215 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. 
Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. 
Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#7722 - CGI_10024461 superfamily 245864 2 421 1.23E-105 323.846 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#7723 - CGI_10024462 superfamily 245864 11 403 1.13E-105 323.076 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. 
Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#7724 - CGI_10024463 superfamily 241816 1 93 4.83E-19 76.4253 cl00364 Ribosomal_L7_L12 superfamily N - "Ribosomal protein L7/L12. Ribosomal protein L7/L12 refers to the large ribosomal subunit proteins L7 and L12, which are identical except that L7 is acetylated at the N terminus. It is a component of the L7/L12 stalk, which is located at the surface of the ribosome. The stalk base consists of a portion of the 23S rRNA and ribosomal proteins L11 and L10. An extended C-terminal helix of L10 provides the binding site for L7/L12. L7/L12 consists of two domains joined by a flexible hinge, with the helical N-terminal domain (NTD) forming pairs of homodimers that bind to the extended helix of L10. It is the only multimeric ribosomal component, with either four or six copies per ribosome that occur as two or three dimers bound to the L10 helix. L7/L12 is the only ribosomal protein that does not interact directly with rRNA, but instead has indirect interactions through L10. The globular C-terminal domains of L7/L12 are highly mobile. They are exposed to the cytoplasm and contain binding sites for other molecules. 
Initiation factors, elongation factors, and release factors are known to interact with the L7/L12 stalk during their GTP-dependent cycles. The binding site for the factors EF-Tu and EF-G comprises L7/L12, L10, L11, the L11-binding region of 23S rRNA, and the sarcin-ricin loop of 23S rRNA. Removal of L7/L12 has minimal effect on factor binding and it has been proposed that L7/L12 induces the catalytically active conformation of EF-Tu and EF-G, thereby stimulating the GTPase activity of both factors. In eukaryotes, the proteins that perform the equivalent function to L7/L12 are called P1 and P2, which do not share sequence similarity with L7/L12. However, a bacterial L7/L12 homolog is found in some eukaryotes, in mitochondria and chloroplasts. In archaea, the protein equivalent to L7/L12 is called aL12 or L12p, but it is closer in sequence to P1 and P2 than to L7/L12." Q#7725 - CGI_10024464 superfamily 241565 329 402 6.26E-05 41.5383 cl00038 BRCT superfamily - - "Breast Cancer Suppressor Protein (BRCA1), carboxy-terminal domain. The BRCT domain is found within many DNA damage repair and cell cycle checkpoint proteins. The unique diversity of this domain superfamily allows BRCT modules to interact forming homo/hetero BRCT multimers, BRCT-non-BRCT interactions, and interactions within DNA strand breaks." Q#7725 - CGI_10024464 superfamily 115393 6 274 3.16E-135 398.288 cl05998 Pescadillo_N superfamily - - "Pescadillo N-terminus; This family represents the N-terminal region of Pescadillo. Pescadillo protein localises to distinct substructures of the interphase nucleus including nucleoli, the site of ribosome biogenesis. During mitosis pescadillo closely associates with the periphery of metaphase chromosomes and by late anaphase is associated with nucleolus-derived foci and prenucleolar bodies. Blastomeres in mouse embryos lacking pescadillo arrest at morula stages of development, the nucleoli fail to differentiate and accumulation of ribosomes is inhibited. 
It has been proposed that in mammalian cells pescadillo is essential for ribosome biogenesis and nucleologenesis and that disruption to its function results in cell cycle arrest. This family is often found in conjunction with a pfam00533 domain." Q#7726 - CGI_10024465 superfamily 177822 38 140 9.71E-09 52.6149 cl18088 PLN02164 superfamily C - sulfotransferase Q#7727 - CGI_10024466 superfamily 241578 27 193 4.40E-13 65.7758 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#7727 - CGI_10024466 superfamily 243119 303 353 5.60E-13 62.8364 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. 
Chitin binding has been demonstrated for a protein containing only two of these domains. Q#7727 - CGI_10024466 superfamily 243119 247 295 1.08E-06 45.1172 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#7729 - CGI_10024468 superfamily 191120 26 273 2.17E-113 336.169 cl04821 DUF647 superfamily - - "Protein of unknown function (DUF647); In plants, this domain plays a role in auxin-transport, plant growth and development." Q#7730 - CGI_10024469 superfamily 245201 492 752 6.52E-44 159.239 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#7730 - CGI_10024469 superfamily 205721 22 147 5.45E-33 124.386 cl16297 KSR1-SAM superfamily - - SAM like domain present in kinase suppressor RAS 1; SAM like domain present in kinase suppressor RAS 1. 
Q#7731 - CGI_10024470 superfamily 247743 31 172 7.30E-25 97.9871 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperones, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#7731 - CGI_10024470 superfamily 203973 236 325 1.88E-24 95.266 cl16006 Rep_fac_C superfamily - - "Replication factor C C-terminal domain; This is the C-terminal domain of RFC (replication factor-C) protein of the clamp loader complex which binds to the DNA sliding clamp (proliferating cell nuclear antigen, PCNA). The five modules of RFC assemble into a right-handed spiral, which results in only three of the five RFC subunits (RFC-A, RFC-B and RFC-C) making contact with PCNA, leaving a wedge-shaped gap between RFC-E and the PCNA clamp-loader complex. The C-terminal is vital for the correct orientation of RFC-E with respect to RFC-A." Q#7733 - CGI_10024472 superfamily 241626 11 86 3.36E-19 76.5041 cl00125 RHOD superfamily N - "Rhodanese Homology Domain (RHOD); an alpha beta fold domain found duplicated in the rhodanese protein. The cysteine containing enzymatically active version of the domain is also found in the Cdc25 class of protein phosphatases and a variety of proteins such as sulfide dehydrogenases and certain stress proteins such as senescence specific protein 1 in plants, PspE and GlpE in bacteria and cyanide and arsenate resistance proteins. 
Inactive versions (no active site cysteine) are also seen in dual specificity phosphatases, ubiquitin hydrolases from yeast and in sulfuryltransferases, where they are believed to play a regulatory role in multidomain proteins." Q#7734 - CGI_10024473 superfamily 241626 6 110 1.20E-28 105.009 cl00125 RHOD superfamily - - "Rhodanese Homology Domain (RHOD); an alpha beta fold domain found duplicated in the rhodanese protein. The cysteine containing enzymatically active version of the domain is also found in the Cdc25 class of protein phosphatases and a variety of proteins such as sulfide dehydrogenases and certain stress proteins such as senescence specific protein 1 in plants, PspE and GlpE in bacteria and cyanide and arsenate resistance proteins. Inactive versions (no active site cysteine) are also seen in dual specificity phosphatases, ubiquitin hydrolases from yeast and in sulfuryltransferases, where they are believed to play a regulatory role in multidomain proteins." Q#7734 - CGI_10024473 superfamily 241626 157 204 4.05E-10 54.1762 cl00125 RHOD superfamily N - "Rhodanese Homology Domain (RHOD); an alpha beta fold domain found duplicated in the rhodanese protein. The cysteine containing enzymatically active version of the domain is also found in the Cdc25 class of protein phosphatases and a variety of proteins such as sulfide dehydrogenases and certain stress proteins such as senescence specific protein 1 in plants, PspE and GlpE in bacteria and cyanide and arsenate resistance proteins. Inactive versions (no active site cysteine) are also seen in dual specificity phosphatases, ubiquitin hydrolases from yeast and in sulfuryltransferases, where they are believed to play a regulatory role in multidomain proteins." Q#7736 - CGI_10024475 superfamily 204801 517 635 3.48E-67 219.849 cl13421 MVP_shoulder superfamily - - Shoulder domain; This domain is found in the Major Vault Protein and has been called the shoulder domain. This family includes two bacterial proteins. 
This suggests that some bacteria may possess vault particles. Q#7736 - CGI_10024475 superfamily 201831 114 155 2.43E-15 71.8116 cl03238 Vault superfamily - - Major Vault Protein repeat; The vault is a ubiquitous and highly conserved ribonucleoprotein particle of approximately 13 MDa of unknown function. This family corresponds to a repeat found in the amino terminal half of the major vault protein. Q#7736 - CGI_10024475 superfamily 201831 167 209 5.91E-11 59.1 cl03238 Vault superfamily - - Major Vault Protein repeat; The vault is a ubiquitous and highly conserved ribonucleoprotein particle of approximately 13 MDa of unknown function. This family corresponds to a repeat found in the amino terminal half of the major vault protein. Q#7736 - CGI_10024475 superfamily 201831 220 264 2.17E-09 54.4776 cl03238 Vault superfamily - - Major Vault Protein repeat; The vault is a ubiquitous and highly conserved ribonucleoprotein particle of approximately 13 MDa of unknown function. This family corresponds to a repeat found in the amino terminal half of the major vault protein. Q#7736 - CGI_10024475 superfamily 201831 325 365 1.12E-08 52.5516 cl03238 Vault superfamily - - Major Vault Protein repeat; The vault is a ubiquitous and highly conserved ribonucleoprotein particle of approximately 13 MDa of unknown function. This family corresponds to a repeat found in the amino terminal half of the major vault protein. Q#7737 - CGI_10024476 superfamily 217539 131 341 4.88E-14 69.274 cl18414 Nucleotid_trans superfamily - - Nucleotide-diphospho-sugar transferase; Proteins in this family have been predicted to be nucleotide-diphospho-sugar transferases. 
Q#7738 - CGI_10024477 superfamily 221887 5 107 1.62E-38 131.483 cl15229 ING superfamily - - "Inhibitor of growth proteins N-terminal histone-binding; Histones undergo numerous post-translational modifications, including acetylation and methylation, at residues which are then probable docking sites for various chromatin remodelling complexes. Inhibitor of growth proteins (INGs) specifically bind to residues that have been thus modified. INGs carry a well-characterized C-terminal PHD-type zinc-finger domain, binding with lysine 4-tri-methylated histone H3 (H3K4me3), as well as this N-terminal domain that binds unmodified H3 tails. Although these two regions can bind histones independently, together they increase the apparent association of the ING for the H3 tail." Q#7738 - CGI_10024477 superfamily 247999 192 239 1.64E-09 52.1076 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#7739 - CGI_10024478 superfamily 243109 4 166 3.16E-82 266.291 cl02614 SPRY superfamily - - "SPRY domain; SPRY domains, first identified in the SP1A kinase of Dictyostelium and rabbit Ryanodine receptor (hence the name), are homologous to B30.2. SPRY domains have been identified in at least 11 protein families, covering a wide range of functions, including regulation of cytokine signaling (SOCS), RNA metabolism (DDX1 and hnRNP), immunity to retroviruses (TRIM5alpha), intracellular calcium release (ryanodine receptors or RyR) and regulatory and developmental processes (HERC1 and Ash2L). B30.2 also contains residues in the N-terminus that form a distinct PRY domain structure; i.e. B30.2 domain consists of PRY and SPRY subdomains. 
B30.2 domains comprise the C-terminus of three protein families: BTNs (receptor glycoproteins of immunoglobulin superfamily); several TRIM proteins (composed of RING/B-box/coiled-coil or RBCC core); Stonutoxin (secreted poisonous protein of the stonefish Synanceia horrida). While SPRY domains are evolutionarily ancient, B30.2 domains are a more recent adaptation where the SPRY/PRY combination is a possible component of immune defense. Mutations found in the SPRY-containing proteins have shown to cause Mediterranean fever and Opitz syndrome." Q#7739 - CGI_10024478 superfamily 243109 224 392 2.11E-71 236.245 cl02614 SPRY superfamily - - "SPRY domain; SPRY domains, first identified in the SP1A kinase of Dictyostelium and rabbit Ryanodine receptor (hence the name), are homologous to B30.2. SPRY domains have been identified in at least 11 protein families, covering a wide range of functions, including regulation of cytokine signaling (SOCS), RNA metabolism (DDX1 and hnRNP), immunity to retroviruses (TRIM5alpha), intracellular calcium release (ryanodine receptors or RyR) and regulatory and developmental processes (HERC1 and Ash2L). B30.2 also contains residues in the N-terminus that form a distinct PRY domain structure; i.e. B30.2 domain consists of PRY and SPRY subdomains. B30.2 domains comprise the C-terminus of three protein families: BTNs (receptor glycoproteins of immunoglobulin superfamily); several TRIM proteins (composed of RING/B-box/coiled-coil or RBCC core); Stonutoxin (secreted poisonous protein of the stonefish Synanceia horrida). While SPRY domains are evolutionarily ancient, B30.2 domains are a more recent adaptation where the SPRY/PRY combination is a possible component of immune defense. Mutations found in the SPRY-containing proteins have shown to cause Mediterranean fever and Opitz syndrome." 
Q#7739 - CGI_10024478 superfamily 243109 437 598 1.45E-69 231.237 cl02614 SPRY superfamily - - "SPRY domain; SPRY domains, first identified in the SP1A kinase of Dictyostelium and rabbit Ryanodine receptor (hence the name), are homologous to B30.2. SPRY domains have been identified in at least 11 protein families, covering a wide range of functions, including regulation of cytokine signaling (SOCS), RNA metabolism (DDX1 and hnRNP), immunity to retroviruses (TRIM5alpha), intracellular calcium release (ryanodine receptors or RyR) and regulatory and developmental processes (HERC1 and Ash2L). B30.2 also contains residues in the N-terminus that form a distinct PRY domain structure; i.e. B30.2 domain consists of PRY and SPRY subdomains. B30.2 domains comprise the C-terminus of three protein families: BTNs (receptor glycoproteins of immunoglobulin superfamily); several TRIM proteins (composed of RING/B-box/coiled-coil or RBCC core); Stonutoxin (secreted poisonous protein of the stonefish Synanceia horrida). While SPRY domains are evolutionarily ancient, B30.2 domains are a more recent adaptation where the SPRY/PRY combination is a possible component of immune defense. Mutations found in the SPRY-containing proteins have shown to cause Mediterranean fever and Opitz syndrome." Q#7739 - CGI_10024478 superfamily 243109 627 791 6.96E-63 212.363 cl02614 SPRY superfamily - - "SPRY domain; SPRY domains, first identified in the SP1A kinase of Dictyostelium and rabbit Ryanodine receptor (hence the name), are homologous to B30.2. SPRY domains have been identified in at least 11 protein families, covering a wide range of functions, including regulation of cytokine signaling (SOCS), RNA metabolism (DDX1 and hnRNP), immunity to retroviruses (TRIM5alpha), intracellular calcium release (ryanodine receptors or RyR) and regulatory and developmental processes (HERC1 and Ash2L). B30.2 also contains residues in the N-terminus that form a distinct PRY domain structure; i.e. 
B30.2 domain consists of PRY and SPRY subdomains. B30.2 domains comprise the C-terminus of three protein families: BTNs (receptor glycoproteins of immunoglobulin superfamily); several TRIM proteins (composed of RING/B-box/coiled-coil or RBCC core); Stonutoxin (secreted poisonous protein of the stonefish Synanceia horrida). While SPRY domains are evolutionarily ancient, B30.2 domains are a more recent adaptation where the SPRY/PRY combination is a possible component of immune defense. Mutations found in the SPRY-containing proteins have shown to cause Mediterranean fever and Opitz syndrome." Q#7739 - CGI_10024478 superfamily 243109 831 989 5.16E-58 198.495 cl02614 SPRY superfamily - - "SPRY domain; SPRY domains, first identified in the SP1A kinase of Dictyostelium and rabbit Ryanodine receptor (hence the name), are homologous to B30.2. SPRY domains have been identified in at least 11 protein families, covering a wide range of functions, including regulation of cytokine signaling (SOCS), RNA metabolism (DDX1 and hnRNP), immunity to retroviruses (TRIM5alpha), intracellular calcium release (ryanodine receptors or RyR) and regulatory and developmental processes (HERC1 and Ash2L). B30.2 also contains residues in the N-terminus that form a distinct PRY domain structure; i.e. B30.2 domain consists of PRY and SPRY subdomains. B30.2 domains comprise the C-terminus of three protein families: BTNs (receptor glycoproteins of immunoglobulin superfamily); several TRIM proteins (composed of RING/B-box/coiled-coil or RBCC core); Stonutoxin (secreted poisonous protein of the stonefish Synanceia horrida). While SPRY domains are evolutionarily ancient, B30.2 domains are a more recent adaptation where the SPRY/PRY combination is a possible component of immune defense. Mutations found in the SPRY-containing proteins have shown to cause Mediterranean fever and Opitz syndrome." 
Q#7740 - CGI_10024479 superfamily 241656 8 185 7.83E-43 142.975 cl00169 Mog1 superfamily - - "homolog to Ran-Binding Protein Mog1p; binds to the small GTPase Ran, which plays an important role in nuclear import. Binding is independent of Ran's nucleotide state (RanGTP/RanGDP)" Q#7741 - CGI_10024480 superfamily 241567 137 196 7.33E-21 87.6559 cl00042 CASc superfamily NC - "Caspase, interleukin-1 beta converting enzyme (ICE) homologues; Cysteine-dependent aspartate-directed proteases that mediate programmed cell death (apoptosis). Caspases are synthesized as inactive zymogens and activated by proteolysis of the peptide backbone adjacent to an aspartate. The resulting two subunits associate to form an (alpha)2(beta)2-tetramer which is the active enzyme. Activation of caspases can be mediated by other caspase homologs." Q#7741 - CGI_10024480 superfamily 246680 2 31 3.24E-06 43.3444 cl14633 DD_superfamily superfamily N - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#7747 - CGI_10024486 superfamily 243110 119 205 8.64E-05 43.1869 cl02616 MACPF superfamily C - "MAC/Perforin domain; The membrane-attack complex (MAC) of the complement system forms transmembrane channels. 
These channels disrupt the phospholipid bilayer of target cells, leading to cell lysis and death. A number of proteins participate in the assembly of the MAC. Freshly activated C5b binds to C6 to form a C5b-6 complex, then to C7 forming the C5b-7 complex. The C5b-7 complex binds to C8, which is composed of three chains (alpha, beta, and gamma), thus forming the C5b-8 complex. C5b-8 subsequently binds to C9 and acts as a catalyst in the polymerisation of C9. Active MAC has a subunit composition of C5b-C6-C7-C8-C9{n}. Perforin is a protein found in cytolytic T-cell and killer cells. In the presence of calcium, perforin polymerises into transmembrane tubules and is capable of lysing, non-specifically, a variety of target cells. There are a number of regions of similarity in the sequences of complement components C6, C7, C8-alpha, C8-beta, C9 and perforin. The X-ray crystal structure of a MACPF domain reveals that it shares a common fold with bacterial cholesterol dependent cytolysins (pfam01289) such as perfringolysin O. Three key pieces of evidence suggests that MACPF domains and CDCs are homologous: Functional similarity (pore formation), conservation of three glycine residues at a hinge in both families and conservation of a complex core fold." Q#7748 - CGI_10024487 superfamily 243110 18 66 1.76E-07 50.1205 cl02616 MACPF superfamily C - "MAC/Perforin domain; The membrane-attack complex (MAC) of the complement system forms transmembrane channels. These channels disrupt the phospholipid bilayer of target cells, leading to cell lysis and death. A number of proteins participate in the assembly of the MAC. Freshly activated C5b binds to C6 to form a C5b-6 complex, then to C7 forming the C5b-7 complex. The C5b-7 complex binds to C8, which is composed of three chains (alpha, beta, and gamma), thus forming the C5b-8 complex. C5b-8 subsequently binds to C9 and acts as a catalyst in the polymerisation of C9. Active MAC has a subunit composition of C5b-C6-C7-C8-C9{n}. 
Perforin is a protein found in cytolytic T-cell and killer cells. In the presence of calcium, perforin polymerises into transmembrane tubules and is capable of lysing, non-specifically, a variety of target cells. There are a number of regions of similarity in the sequences of complement components C6, C7, C8-alpha, C8-beta, C9 and perforin. The X-ray crystal structure of a MACPF domain reveals that it shares a common fold with bacterial cholesterol dependent cytolysins (pfam01289) such as perfringolysin O. Three key pieces of evidence suggests that MACPF domains and CDCs are homologous: Functional similarity (pore formation), conservation of three glycine residues at a hinge in both families and conservation of a complex core fold." Q#7751 - CGI_10024490 superfamily 199166 136 244 5.36E-07 48.4776 cl15308 AMN1 superfamily C - "Antagonist of mitotic exit network protein 1; Amn1 has been functionally characterized in Saccharomyces cerevisiae as a component of the Antagonist of MEN pathway (AMEN). The AMEN network is activated by MEN (mitotic exit network) via an active Cdc14, and in turn switches off MEN. Amn1 constitutes one of the alternative mechanisms by which MEN may be disrupted. Specifically, Amn1 binds Tem1 (Termination of M-phase, a GTPase that belongs to the RAS superfamily), and disrupts its association with Cdc15, the primary downstream target. Amn1 is a leucine-rich repeat (LRR) protein, with 12 repeats in the S. cerevisiae ortholog. As a negative regulator of the signal transduction pathway MEN, overexpression of AMN1 slows the growth of wild type cells. The function of the vertebrate members of this family has not been determined experimentally, they have fewer LRRs that determine the extent of this model." 
Q#7751 - CGI_10024490 superfamily 224772 6 50 0.00132467 38.0825 cl15312 KptA superfamily C - "RNA:NAD 2'-phosphotransferase [Translation, ribosomal structure and biogenesis]" Q#7752 - CGI_10024491 superfamily 242156 58 171 1.41E-52 165.389 cl00869 PTH2_family superfamily - - "Peptidyl-tRNA hydrolase, type 2 (PTH2)_like. Peptidyl-tRNA hydrolase activity releases tRNA from the premature translation termination product peptidyl-tRNA. Two structurally different enzymes have been reported to encode such activity, Pth present in bacteria and eukaryotes and Pth2 present in archaea and eukaryotes." Q#7753 - CGI_10024492 superfamily 241752 1866 1985 2.17E-46 165.185 cl00283 ADP_ribosyl superfamily - - "ADP_ribosylating enzymes catalyze the transfer of ADP_ribose from NAD+ to substrates. Bacterial toxins are cytoplasmic and catalyze the transfer of a single ADP_ribose unit to eukaryotic elongation factor 2, halting protein synthesis and killing the cell. Poly(ADP-ribose) polymerases (PARPS 1-3, VPARP, tankyrase) catalyze the addition of up to 100 ADP_ribose units from NAD+. PARPs 1 and 2 are localized in the nucleus, bind DNA, and are activated by DNA damage. VPARP is part of the vault ribonucleoprotein complex. Tankyrases regulate telomere length in part through poly(ADP_ribosylation) of telomere repeat binding factor 1 (TRF1). Poly(ADP-ribose) polymerase catalyses the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active." 
Q#7753 - CGI_10024492 superfamily 247723 351 426 3.21E-16 76.572 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#7753 - CGI_10024492 superfamily 247723 541 612 5.24E-15 73.1053 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. 
The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#7753 - CGI_10024492 superfamily 241554 1387 1496 1.06E-18 85.3899 cl00019 Macro superfamily - - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#7753 - CGI_10024492 superfamily 247723 441 519 3.67E-11 61.9345 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." 
Q#7753 - CGI_10024492 superfamily 247723 633 684 2.80E-05 44.1804 cl17169 RRM_SF superfamily C - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#7753 - CGI_10024492 superfamily 241554 1257 1278 0.000439287 41.7236 cl00019 Macro superfamily C - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." 
Q#7753 - CGI_10024492 superfamily 247723 699 747 0.000756333 39.9781 cl17169 RRM_SF superfamily C - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#7754 - CGI_10024493 superfamily 247794 68 386 4.08E-133 390.337 cl17240 FDH_GDH_like superfamily - - "Formate/glycerate dehydrogenases, D-specific 2-hydroxy acid dehydrogenases and related dehydrogenases; The formate/glycerate dehydrogenase like family contains a diverse group of enzymes such as formate dehydrogenase (FDH), glycerate dehydrogenase (GDH), D-lactate dehydrogenase, L-alanine dehydrogenase, and S-Adenosylhomocysteine hydrolase, that share a common 2-domain structure. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar domains of the alpha/beta Rossmann fold NAD+ binding form. The NAD(P) binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD(P) is bound, primarily to the C-terminal portion of the 2nd (internal) domain. 
While many members of this family are dimeric, alanine DH is hexameric and phosphoglycerate DH is tetrameric. 2-hydroxyacid dehydrogenases are enzymes that catalyze the conversion of a wide variety of D-2-hydroxy acids to their corresponding keto acids. The general mechanism is (R)-lactate + acceptor to pyruvate + reduced acceptor. Formate dehydrogenase (FDH) catalyzes the NAD+-dependent oxidation of formate ion to carbon dioxide with the concomitant reduction of NAD+ to NADH. FDHs of this family contain no metal ions or prosthetic groups. Catalysis occurs through direct transfer of a hydride ion to NAD+ without the stages of acid-base catalysis typically found in related dehydrogenases." Q#7755 - CGI_10024494 superfamily 247792 319 373 4.99E-13 63.5091 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#7755 - CGI_10024494 superfamily 214806 204 295 2.11E-09 54.2238 cl15966 CRA superfamily - - "CT11-RanBPM; protein-protein interaction domain present in crown eukaryotes (plants, animals, fungi)" Q#7755 - CGI_10024494 superfamily 199226 112 145 0.00170685 35.8732 cl11662 LisH superfamily - - "LisH; The LisH (lis homology) domain mediates protein dimerisation and tetramerisation. The LisH domain is found in Sif2, a component of the Set3 complex which is responsible for repressing meiotic genes. 
It has been shown that the LisH domain helps mediate interaction with components of the Set3 complex." Q#7755 - CGI_10024494 superfamily 128914 169 208 0.00179976 36.0098 cl15352 CTLH superfamily N - C-terminal to LisH motif; Alpha-helical motif of unknown function. Q#7760 - CGI_10024500 superfamily 192467 45 160 3.08E-40 134.877 cl10863 NEP superfamily - - Uncharacterized conserved protein; This is the N-terminal 80 residues of a family of proteins conserved from plants to humans. It contains a characteristic NEP sequence motif. The function is not known. Q#7761 - CGI_10024501 superfamily 247787 138 419 0 590.003 cl17233 RecA-like_NTPases superfamily - - "RecA-like NTPases. This family includes the NTP binding domain of F1 and V1 H+ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. This group also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion." Q#7761 - CGI_10024501 superfamily 215848 428 529 3.71E-24 97.3505 cl08258 ATP-synt_ab_C superfamily - - "ATP synthase alpha/beta chain, C terminal domain; ATP synthase alpha/beta chain, C terminal domain. " Q#7761 - CGI_10024501 superfamily 217261 69 136 2.17E-16 74.484 cl18399 ATP-synt_ab_N superfamily - - "ATP synthase alpha/beta family, beta-barrel domain; This family includes the ATP synthase alpha and beta subunits the ATP synthase associated with flagella." Q#7762 - CGI_10024502 superfamily 207794 1 332 1.69E-159 456.754 cl02948 GH20_hexosaminidase superfamily C - "Beta-N-acetylhexosaminidases of glycosyl hydrolase family 20 (GH20) catalyze the removal of beta-1,4-linked N-acetyl-D-hexosamine residues from the non-reducing ends of N-acetyl-beta-D-hexosaminides including N-acetylglucosides and N-acetylgalactosides. These enzymes are broadly distributed in microorganisms, plants and animals, and play roles in various key physiological and pathological processes. 
These processes include cell structural integrity, energy storage, cellular signaling, fertilization, pathogen defense, viral penetration, the development of carcinomas, inflammatory events and lysosomal storage disorders. The GH20 enzymes include the eukaryotic beta-N-acetylhexosaminidases A and B, the bacterial chitobiases, dispersin B, and lacto-N-biosidase. The GH20 hexosaminidases are thought to act via a catalytic mechanism in which the catalytic nucleophile is not provided by the solvent or the enzyme, but by the substrate itself." Q#7763 - CGI_10024504 superfamily 241563 72 109 4.68E-06 44.3924 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#7763 - CGI_10024504 superfamily 241563 28 59 0.00117454 37.0736 cl00034 BBOX superfamily N - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#7763 - CGI_10024504 superfamily 246954 122 207 0.00212056 39.3238 cl15415 Sec1 superfamily NC - Sec1 family; Sec1 family. Q#7764 - CGI_10010094 superfamily 247692 1 171 8.34E-24 95.4357 cl17068 AFD_class_I superfamily N - "Adenylate forming domain, Class I; This family includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. 
The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain." Q#7765 - CGI_10010095 superfamily 245201 48 291 3.81E-86 261.685 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#7767 - CGI_10010097 superfamily 241563 383 419 0.00338166 37.4588 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#7769 - CGI_10016037 superfamily 243134 41 137 5.20E-15 66.904 cl02663 Fasciclin superfamily C - "Fasciclin domain; This extracellular domain is found repeated four times in grasshopper fasciclin I as well as in proteins from mammals, sea urchins, plants, yeast and bacteria." Q#7770 - CGI_10016038 superfamily 243134 43 161 3.70E-18 75.7636 cl02663 Fasciclin superfamily - - "Fasciclin domain; This extracellular domain is found repeated four times in grasshopper fasciclin I as well as in proteins from mammals, sea urchins, plants, yeast and bacteria." Q#7770 - CGI_10016038 superfamily 243134 1 28 3.44E-05 39.94 cl02663 Fasciclin superfamily N - "Fasciclin domain; This extracellular domain is found repeated four times in grasshopper fasciclin I as well as in proteins from mammals, sea urchins, plants, yeast and bacteria." 
Q#7772 - CGI_10016040 superfamily 243082 1 232 5.53E-135 390.141 cl02553 Peptidase_C19 superfamily C - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." Q#7779 - CGI_10016047 superfamily 192535 59 253 0.00518642 36.805 cl18179 7TM_GPCR_Srsx superfamily C - Serpentine type 7TM GPCR chemoreceptor Srsx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srsx is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#7780 - CGI_10016048 superfamily 221340 169 360 6.82E-19 82.4464 cl13405 DUF3472 superfamily - - "Domain of unknown function (DUF3472); This presumed domain is functionally uncharacterized. This domain is found in bacteria, eukaryotes and viruses. This domain is typically between 174 to 190 amino acids in length. This domain has a single completely conserved residue G that may be functionally important." Q#7784 - CGI_10016052 superfamily 243035 154 229 0.000470121 37.9846 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. 
This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. 
In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#7786 - CGI_10016054 superfamily 207713 140 197 8.73E-05 39.6101 cl02729 WWE superfamily - - WWE domain; The WWE domain is named after three of its conserved residues and is predicted to mediate specific protein- protein interactions in ubiquitin and ADP ribose conjugation systems. Q#7787 - CGI_10016055 superfamily 241554 1 158 1.44E-37 135.861 cl00019 Macro superfamily - - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#7787 - CGI_10016055 superfamily 241554 210 348 1.34E-27 106.961 cl00019 Macro superfamily - - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. 
Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#7787 - CGI_10016055 superfamily 241554 415 464 3.28E-13 66.5151 cl00019 Macro superfamily C - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#7788 - CGI_10016056 superfamily 207713 284 341 1.97E-05 43.0769 cl02729 WWE superfamily - - WWE domain; The WWE domain is named after three of its conserved residues and is predicted to mediate specific protein- protein interactions in ubiquitin and ADP ribose conjugation systems. Q#7788 - CGI_10016056 superfamily 241554 56 86 0.000205766 40.3215 cl00019 Macro superfamily C - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. 
Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#7789 - CGI_10016057 superfamily 247723 7 78 1.87E-12 62.3197 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#7790 - CGI_10016058 superfamily 247723 82 156 1.31E-09 52.2696 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. 
RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#7791 - CGI_10016060 superfamily 241752 44 159 5.82E-37 125.124 cl00283 ADP_ribosyl superfamily - - "ADP_ribosylating enzymes catalyze the transfer of ADP_ribose from NAD+ to substrates. Bacterial toxins are cytoplasmic and catalyze the transfer of a single ADP_ribose unit to eukaryotic elongation factor 2, halting protein synthesis and killing the cell. Poly(ADP-ribose) polymerases (PARPS 1-3, VPARP, tankyrase) catalyze the addition of up to 100 ADP_ribose units from NAD+. PARPs 1 and 2 are localized in the nucleus, bind DNA, and are activated by DNA damage. VPARP is part of the vault ribonucleoprotein complex. Tankyrases regulate telomere length in part through poly(ADP_ribosylation) of telomere repeat binding factor 1 (TRF1). Poly(ADP-ribose) polymerase catalyses the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active." Q#7793 - CGI_10016518 superfamily 242220 28 234 7.09E-58 185.105 cl00957 Translin superfamily - - "Translin family; Members of this family include Translin that interacts with DNA and forms a ring around the DNA. 
This family also includes human translin-associated protein X, which was found to interact with translin with yeast two-hybrid screen." Q#7794 - CGI_10016519 superfamily 241754 1 334 3.23E-171 515.709 cl00286 Motor_domain superfamily - - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. Q#7794 - CGI_10016519 superfamily 243088 1180 1305 2.20E-48 169.869 cl02563 PX_domain superfamily - - "The Phox Homology domain, a phosphoinositide binding module; The PX domain is a phosphoinositide (PI) binding module involved in targeting proteins to membranes. Proteins containing PX domains interact with PIs and have been implicated in highly diverse functions such as cell signaling, vesicular trafficking, protein sorting, lipid modification, cell polarity and division, activation of T and B cells, and cell survival. Many members of this superfamily bind phosphatidylinositol-3-phosphate (PI3P) but in some cases, other PIs such as PI4P or PI(3,4)P2, among others, are the preferred substrates. In addition to protein-lipid interaction, the PX domain may also be involved in protein-protein interaction, as in the cases of p40phox, p47phox, and some sorting nexins (SNXs). The PX domain is conserved from yeast to humans and is found in more than 100 proteins. The majority of PX domain-containing proteins are SNXs, which play important roles in endosomal sorting." Q#7794 - CGI_10016519 superfamily 241581 442 508 7.93E-06 45.2799 cl00062 FHA superfamily - - "Forkhead associated domain (FHA); found in eukaryotic and prokaryotic proteins. Putative nuclear signalling domain. FHA domains may bind phosphothreonine, phosphoserine and sometimes phosphotyrosine. In eukaryotes, many FHA domain-containing proteins localize to the nucleus, where they participate in establishing or maintaining cell cycle checkpoints, DNA repair, or transcriptional regulation. 
Members of the FHA family include: Dun1, Rad53, Cds1, Mek1, KAPP(kinase-associated protein phosphatase),and Ki-67 (a human nuclear protein related to cell proliferation)." Q#7794 - CGI_10016519 superfamily 243082 932 1040 0.00443959 39.3969 cl02553 Peptidase_C19 superfamily N - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." Q#7795 - CGI_10016520 superfamily 219143 2 118 6.18E-37 126.284 cl05971 PIG-F superfamily N - GPI biosynthesis protein family Pig-F; PIG-F is involved in glycosylphosphatidylinositol (GPI) anchor biosynthesis. Q#7796 - CGI_10016521 superfamily 243074 12 57 1.74E-09 49.8125 cl02535 F-box-like superfamily - - F-box-like; This is an F-box-like family. Q#7797 - CGI_10016522 superfamily 245660 224 416 0.00437287 38.467 cl11493 PQQ_DH_like superfamily C - "PQQ-dependent dehydrogenases and related proteins; This family is composed of dehydrogenases with pyrroloquinoline quinone (PQQ) as a cofactor, such as ethanol, methanol, and membrane-bound glucose dehydrogenases. The alignment model contains an 8-bladed beta-propeller, and the family also includes distantly related proteins which are not enzymatically active and do not bind PQQ." 
Q#7798 - CGI_10016523 superfamily 243054 119 332 7.78E-10 58.9964 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#7799 - CGI_10016524 superfamily 219078 179 310 1.52E-05 43.2687 cl12333 DUF1113 superfamily - - Protein of unknown function (DUF1113); This family consists of several bacterial proteins of unknown function. Q#7800 - CGI_10016525 superfamily 241900 137 461 9.92E-29 114.708 cl00490 EEP superfamily - - "Exonuclease-Endonuclease-Phosphatase (EEP) domain superfamily; This large superfamily includes the catalytic domain (exonuclease/endonuclease/phosphatase or EEP domain) of a diverse set of proteins including the ExoIII family of apurinic/apyrimidinic (AP) endonucleases, inositol polyphosphate 5-phosphatases (INPP5), neutral sphingomyelinases (nSMases), deadenylases (such as the vertebrate circadian-clock regulated nocturnin), bacterial cytolethal distending toxin B (CdtB), deoxyribonuclease 1 (DNase1), the endonuclease domain of the non-LTR retrotransposon LINE-1, and related domains. These diverse enzymes share a common catalytic mechanism of cleaving phosphodiester bonds; their substrates range from nucleic acids to phospholipids and perhaps proteins." 
Q#7802 - CGI_10016527 superfamily 247755 805 1024 9.03E-114 352.952 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#7802 - CGI_10016527 superfamily 247755 182 406 8.07E-86 276.658 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#7802 - CGI_10016527 superfamily 216049 517 760 2.56E-28 116.231 cl18356 ABC_membrane superfamily - - ABC transporter transmembrane region; This family represents a unit of six transmembrane helices. Many members of the ABC transporter family (pfam00005) have two such regions. Q#7802 - CGI_10016527 superfamily 216049 19 135 2.83E-12 66.9258 cl18356 ABC_membrane superfamily N - ABC transporter transmembrane region; This family represents a unit of six transmembrane helices. Many members of the ABC transporter family (pfam00005) have two such regions. 
Q#7803 - CGI_10016528 superfamily 245201 36 324 2.15E-176 514.724 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#7803 - CGI_10016528 superfamily 245201 418 671 7.13E-57 196.202 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#7803 - CGI_10016528 superfamily 245597 306 366 3.79E-15 72.0079 cl11395 Pkinase_C superfamily - - Protein kinase C terminal domain; Protein kinase C terminal domain. Q#7804 - CGI_10016529 superfamily 241557 27 185 1.18E-64 198.511 cl00022 YbaK_like superfamily - - "YbaK-like. The YbaK family of deacylase domains includes the INS amino acid-editing domain of the bacterial class II prolyl tRNA synthetase (ProRS), and its trans-acting homologs, YbaK, ProX, and PrdX. The primary function of INS is to hydrolyze mischarged cysteinyl-tRNA(Pro)'s, thus helping ensure the fidelity of translation. Organisms whose ProRS lacks the INS domain express an INS homolog in trans (e.g. YbaK, ProX, or PrdX)." 
Q#7805 - CGI_10016530 superfamily 245227 31 875 0 1485.4 cl10013 Glycosyltransferase_GTB_type superfamily - - "Glycosyltransferases catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. The acceptor molecule can be a lipid, a protein, a heterocyclic compound, or another carbohydrate residue. The structures of the formed glycoconjugates are extremely diverse, reflecting a wide range of biological functions. The members of this family share a common GTB topology, one of the two protein topologies observed for nucleotide-sugar-dependent glycosyltransferases. GTB proteins have distinct N- and C- terminal domains each containing a typical Rossmann fold. The two domains have high structural homology despite minimal sequence homology. The large cleft that separates the two domains includes the catalytic center and permits a high degree of flexibility." Q#7808 - CGI_10016533 superfamily 247875 81 287 1.12E-40 143.572 cl17321 2OG-FeII_Oxy_2 superfamily - - 2OG-Fe(II) oxygenase superfamily; 2OG-Fe(II) oxygenase superfamily. Q#7809 - CGI_10016534 superfamily 244824 102 536 3.72E-121 369.766 cl07893 AmyAc_family superfamily - - "Alpha amylase catalytic domain family; The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; and C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. 
Other members of this family have lost this catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase." Q#7811 - CGI_10016536 superfamily 191210 48 294 1.58E-137 392.056 cl04959 DUF706 superfamily - - "Family of unknown function (DUF706); Family of uncharacterized eukaryotic function. Some members have a described putative function, but a common theme is not evident." Q#7812 - CGI_10016537 superfamily 247683 235 289 1.30E-26 102.659 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." 
Q#7812 - CGI_10016537 superfamily 241622 140 220 3.59E-13 65.6658 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either an N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95 (post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#7812 - CGI_10016537 superfamily 247744 349 527 1.51E-27 108.919 cl17190 NK superfamily - - "Nucleoside/nucleotide kinase (NK) is a protein superfamily consisting of multiple families of enzymes that share structural similarity and are functionally related to the catalysis of the reversible phosphate group transfer from nucleoside triphosphates to nucleosides/nucleotides, nucleoside monophosphates, or sugars. Members of this family play a wide variety of essential roles in nucleotide metabolism, the biosynthesis of coenzymes and aromatic compounds, as well as the metabolism of sugar and sulfate."
Q#7814 - CGI_10016539 superfamily 216709 254 402 7.70E-73 230.241 cl03357 Nop superfamily - - Putative snoRNA binding domain; This family consists of various Pre RNA processing ribonucleoproteins. The function of the aligned region is unknown however it may be a common RNA or snoRNA or Nop1p binding domain. Nop5p (Nop58p) from yeast is the protein component of a ribonucleoprotein protein required for pre-18s rRNA processing and is suggested to function with Nop1p in a snoRNA complex. Nop56p and Nop5p interact with Nop1p and are required for ribosome biogenesis. Prp31p is required for pre-mRNA splicing in S. cerevisiae. Q#7814 - CGI_10016539 superfamily 208568 163 213 4.56E-24 95.2664 cl06890 NOSIC superfamily - - NOSIC (NUC001) domain; This is the central domain in Nop56/SIK1-like proteins. Q#7814 - CGI_10016539 superfamily 219731 1 66 6.90E-10 55.6508 cl06964 NOP5NT superfamily - - NOP5NT (NUC127) domain; This N terminal domain is found in RNA-binding proteins of the NOP5 family. Q#7815 - CGI_10016540 superfamily 245819 423 557 2.49E-35 132.318 cl11967 Nucleotidyl_cyc_III superfamily C - "Class III nucleotidyl cyclases; Class III nucleotidyl cyclases are the largest, most diverse group of nucleotidyl cyclases (NC's) containing prokaryotic and eukaryotic proteins. They can be divided into two major groups; the mononucleotidyl cyclases (MNC's) and the diguanylate cyclases (DGC's). The MNC's, which include the adenylate cyclases (AC's) and the guanylate cyclases (GC's), have a conserved cyclase homology domain (CHD), while the DGC's have a conserved GGDEF domain, named after a conserved motif within this subgroup. Their products, cyclic guanylyl and adenylyl nucleotides, are second messengers that play important roles in eukaryotic signal transduction and prokaryotic sensory pathways." 
Q#7815 - CGI_10016540 superfamily 219812 87 304 9.68E-13 67.3312 cl07121 NIT superfamily - - "Nitrate and nitrite sensing; The nitrate- and nitrite sensing domain (NIT) is found in receptor components of signal transducing pathways in bacteria which control gene expression, cellular motility and enzyme activity in response to nitrate and nitrite concentrations. The NIT domain is predicted to be all alpha-helical in structure." Q#7815 - CGI_10016540 superfamily 219526 383 409 4.10E-07 49.9251 cl06648 HNOBA superfamily N - "Heme NO binding associated; The HNOBA domain is found associated with the HNOB domain and pfam00211 in soluble cyclases and signalling proteins. The HNOB domain is predicted to function as a heme-dependent sensor for gaseous ligands, and transduce diverse downstream signals, in both bacteria and animals." Q#7816 - CGI_10016541 superfamily 245819 407 583 1.92E-56 190.868 cl11967 Nucleotidyl_cyc_III superfamily - - "Class III nucleotidyl cyclases; Class III nucleotidyl cyclases are the largest, most diverse group of nucleotidyl cyclases (NC's) containing prokaryotic and eukaryotic proteins. They can be divided into two major groups; the mononucleotidyl cyclases (MNC's) and the diguanylate cyclases (DGC's). The MNC's, which include the adenylate cyclases (AC's) and the guanylate cyclases (GC's), have a conserved cyclase homology domain (CHD), while the DGC's have a conserved GGDEF domain, named after a conserved motif within this subgroup. Their products, cyclic guanylyl and adenylyl nucleotides, are second messengers that play important roles in eukaryotic signal transduction and prokaryotic sensory pathways." 
Q#7816 - CGI_10016541 superfamily 219812 69 223 5.32E-13 68.1016 cl07121 NIT superfamily C - "Nitrate and nitrite sensing; The nitrate- and nitrite sensing domain (NIT) is found in receptor components of signal transducing pathways in bacteria which control gene expression, cellular motility and enzyme activity in response to nitrate and nitrite concentrations. The NIT domain is predicted to be all alpha-helical in structure." Q#7816 - CGI_10016541 superfamily 219526 321 393 5.75E-06 46.4583 cl06648 HNOBA superfamily N - "Heme NO binding associated; The HNOBA domain is found associated with the HNOB domain and pfam00211 in soluble cyclases and signalling proteins. The HNOB domain is predicted to function as a heme-dependent sensor for gaseous ligands, and transduce diverse downstream signals, in both bacteria and animals." Q#7817 - CGI_10016542 superfamily 243303 5 74 8.01E-43 135.825 cl03104 CKS superfamily - - Cyclin-dependent kinase regulatory subunit; Cyclin-dependent kinase regulatory subunit. Q#7818 - CGI_10016543 superfamily 242166 61 177 3.50E-35 121.103 cl00881 SQR_QFR_TM superfamily - - "Succinate:quinone oxidoreductase (SQR) and Quinol:fumarate reductase (QFR) family, transmembrane subunits; SQR catalyzes the oxidation of succinate to fumarate coupled to the reduction of quinone to quinol, while QFR catalyzes the reverse reaction. SQR, also called succinate dehydrogenase or Complex II, is part of the citric acid cycle and the aerobic respiratory chain, while QFR is involved in anaerobic respiration with fumarate as the terminal electron acceptor. SQRs may reduce either high or low potential quinones while QFRs oxidize only low potential quinols. SQR and QFR share a common subunit arrangement, composed of a flavoprotein catalytic subunit, an iron-sulfur protein and one or two hydrophobic transmembrane subunits. 
The structural arrangement allows efficient electron transfer between the catalytic subunit, through iron-sulfur centers, and the transmembrane subunit(s) containing the electron donor/acceptor (quinol or quinone). The reversible reduction of quinone is an essential feature of respiration, allowing the transfer of electrons between respiratory complexes. SQRs and QFRs can be classified into five types (A-E) according to the number of their hydrophobic subunits and heme groups. This classification is consistent with the characteristics and phylogeny of the catalytic and iron-sulfur subunits. Type E proteins, e.g. non-classical archaeal SQRs, contain atypical transmembrane subunits and are not included in this hierarchy. The heme and quinone binding sites reside in the transmembrane subunits. Although succinate oxidation and fumarate reduction are carried out by separate enzymes in most organisms, some bifunctional enzymes that exhibit both SQR and QFR activities exist." Q#7819 - CGI_10016544 superfamily 222150 495 518 5.76E-06 44.6901 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#7819 - CGI_10016544 superfamily 222150 691 714 0.0055524 35.8305 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#7819 - CGI_10016544 superfamily 222150 775 798 0.00854976 35.4453 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#7820 - CGI_10016545 superfamily 246748 138 433 4.85E-86 274.67 cl14876 Zinc_peptidase_like superfamily - - "Zinc peptidases M18, M20, M28, and M42; Zinc peptidases play vital roles in metabolic and signaling pathways throughout all kingdoms of life. This family corresponds to several clans in the MEROPS database, including the MH clan, which contains 4 families (M18, M20, M28, M42).
The peptidase M20 family includes carboxypeptidases such as the glutamate carboxypeptidase from Pseudomonas, the thermostable carboxypeptidase Ss1 of broad specificity from archaea and yeast Gly-X carboxypeptidase. The dipeptidases include bacterial dipeptidase, peptidase V (PepV), a eukaryotic, non-specific dipeptidase, and two Xaa-His dipeptidases (carnosinases). There is also the bacterial aminopeptidase, peptidase T (PepT) that acts only on tripeptide substrates and has therefore been termed a tripeptidase. Peptidase family M28 contains aminopeptidases and carboxypeptidases, and has co-catalytic zinc ions. However, several enzymes in this family utilize other first row transition metal ions such as cobalt and manganese. Each zinc ion is tetrahedrally co-ordinated, with three amino acid ligands plus activated water; one aspartate residue binds both metal ions. The aminopeptidases in this family are also called bacterial leucyl aminopeptidases, but are able to release a variety of N-terminal amino acids. IAP aminopeptidase and aminopeptidase Y preferentially release basic amino acids while glutamate carboxypeptidase II preferentially releases C-terminal glutamates. Glutamate carboxypeptidase II and plasma glutamate carboxypeptidase hydrolyze dipeptides. Peptidase families M18 and M42 contain metalloaminopeptidases. M18 is widely distributed in bacteria and eukaryotes. However, only yeast aminopeptidase I and mammalian aspartyl aminopeptidase have been characterized in detail. Some of M42 (also known as glutamyl aminopeptidase) enzymes exhibit aminopeptidase specificity while others also have acylaminoacylpeptidase activity (i.e. hydrolysis of acylated N-terminal residues)."
Q#7821 - CGI_10016546 superfamily 241619 8 58 0.000210296 39.4844 cl00112 PAN_APPLE superfamily C - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#7821 - CGI_10016546 superfamily 219525 210 256 0.00032888 38.5542 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#7821 - CGI_10016546 superfamily 219525 352 391 0.000563738 37.7838 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#7821 - CGI_10016546 superfamily 219525 176 222 0.000575661 37.7838 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#7821 - CGI_10016546 superfamily 219525 318 364 0.00062967 37.7838 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#7822 - CGI_10016547 superfamily 219525 302 348 0.000677902 37.7838 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#7822 - CGI_10016547 superfamily 219525 251 285 0.0012864 37.0134 cl06646 GCC2_GCC3 superfamily C - GCC2 and GCC3; GCC2 and GCC3. Q#7822 - CGI_10016547 superfamily 219525 387 433 0.00413851 35.4726 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#7822 - CGI_10016547 superfamily 219525 200 239 0.00796748 34.7022 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#7822 - CGI_10016547 superfamily 219525 337 377 0.00806189 34.7022 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#7823 - CGI_10016548 superfamily 219525 128 167 0.00680749 32.7762 cl06646 GCC2_GCC3 superfamily N - GCC2 and GCC3; GCC2 and GCC3. 
Q#7824 - CGI_10003493 superfamily 244880 2 161 1.16E-18 78.7421 cl08263 TBP_TLF superfamily - - "TATA box binding protein (TBP): Present in archaea and eukaryotes, TBPs are transcription factors that recognize promoters and initiate transcription. TBP has been shown to be an essential component of three different transcription initiation complexes: SL1, TFIID and TFIIIB, directing transcription by RNA polymerases I, II and III, respectively. TBP binds directly to the TATA box promoter element, where it nucleates polymerase assembly, thus defining the transcription start site. TBP's binding in the minor groove induces a dramatic DNA bending while its own structure barely changes. The conserved core domain of TBP, which binds to the TATA box, has a bipartite structure, with intramolecular symmetry generating a saddle-shaped structure that sits astride the DNA. New members of the TBP family, called TBP-like proteins (TBLP, TLF, TLP) or TBP-related factors (TRF1, TRF2,TRP), are similar to the core domain of TBPs, with identical or chemically similar amino acids at many equivalent positions, suggesting similar structure. However, TLFs contain distinct, conserved amino acids at several positions that distinguish them from TBP." Q#7827 - CGI_10003497 superfamily 216939 9 55 5.92E-05 38.0277 cl03492 PC4 superfamily N - Transcriptional Coactivator p15 (PC4); p15 has a bipartite structure composed of an amino-terminal regulatory domain and a carboxy-terminal cryptic DNA-binding domain. The DNA-binding activity of the carboxy-terminal is disguised by the amino-terminal p15 domain. Activity is controlled by protein kinases that target the regulatory domain. Q#7828 - CGI_10003498 superfamily 241564 30 99 7.98E-20 77.3059 cl00035 BIR superfamily - - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. 
In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." Q#7829 - CGI_10009367 superfamily 247725 1 48 1.44E-16 70.7275 cl17171 PH-like superfamily N - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting proteins to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and other proteins." Q#7836 - CGI_10009374 superfamily 241645 132 194 0.000493991 38.7874 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate."
Q#7836 - CGI_10009374 superfamily 241645 524 603 1.36E-27 106.745 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#7836 - CGI_10009374 superfamily 241832 380 441 2.63E-07 49.4121 cl00388 Thioredoxin_like superfamily C - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." 
Q#7837 - CGI_10009375 superfamily 243034 709 801 0.00152554 38.5152 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#7837 - CGI_10009375 superfamily 247743 265 411 0.00114792 39.4904 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperones, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases.
The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#7838 - CGI_10009376 superfamily 247727 632 709 8.42E-05 41.6467 cl17173 AdoMet_MTases superfamily C - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#7839 - CGI_10009377 superfamily 243045 42 143 0.00238477 36.4572 cl02459 PAS superfamily - - "PAS domain; PAS motifs appear in archaea, eubacteria and eukarya. Probably the most surprising identification of a PAS domain was that in EAG-like K+-channels. PAS domains have been found to bind ligands, and to act as sensors for light and oxygen in signal transduction." Q#7840 - CGI_10009378 superfamily 248458 51 207 4.17E-08 53.8569 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. 
MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#7840 - CGI_10009378 superfamily 248458 429 525 6.57E-08 53.0865 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. 
Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#7841 - CGI_10009379 superfamily 244824 68 521 0 584.579 cl07893 AmyAc_family superfamily - - "Alpha amylase catalytic domain family; The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; and C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost this catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). 
The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase." Q#7842 - CGI_10009380 superfamily 246597 3 287 0 546.824 cl13995 MPP_superfamily superfamily - - "metallophosphatase superfamily, metallophosphatase domain; Metallophosphatases (MPPs), also known as metallophosphoesterases, phosphodiesterases (PDEs), binuclear metallophosphoesterases, and dimetal-containing phosphoesterases (DMPs), represent a diverse superfamily of enzymes with a conserved domain containing an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. This superfamily includes: the phosphoprotein phosphatases (PPPs), Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination." Q#7843 - CGI_10021654 superfamily 222150 45 70 0.000236644 38.5269 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#7843 - CGI_10021654 superfamily 222150 130 154 0.00094706 36.6009 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#7843 - CGI_10021654 superfamily 222150 17 40 0.00165046 36.2157 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. 
Q#7843 - CGI_10021654 superfamily 222150 101 126 0.00460294 34.6749 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#7843 - CGI_10021654 superfamily 222150 157 182 0.00956112 33.9045 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#7844 - CGI_10021655 superfamily 247639 98 343 2.17E-47 163.016 cl16914 O-FucT_like superfamily - - "GDP-fucose protein O-fucosyltransferase and related proteins; O-fucosyltransferase-like proteins are GDP-fucose dependent enzymes with similarities to the family 1 glycosyltransferases (GT1). They are soluble ER proteins that may be proteolytically cleaved from a membrane-associated preprotein, and are involved in the O-fucosylation of protein substrates, the core fucosylation of growth factor receptors, and other processes." Q#7846 - CGI_10021657 superfamily 241610 1 47 1.34E-12 58.0302 cl00101 KU superfamily - - BPTI/Kunitz family of serine protease inhibitors; Structure is a disulfide rich alpha+beta fold. BPTI (bovine pancreatic trypsin inhibitor) is an extensively studied model structure. Q#7847 - CGI_10021658 superfamily 243035 43 86 1.76E-12 58.4615 cl02432 CLECT superfamily NC - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model."
Q#7848 - CGI_10021659 superfamily 241584 44 113 8.94E-05 40.9427 cl00065 FN3 superfamily C - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#7848 - CGI_10021659 superfamily 241584 167 202 0.00487319 35.5499 cl00065 FN3 superfamily N - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#7849 - CGI_10021660 superfamily 241737 11 167 1.85E-91 266.332 cl00264 Ferritin_like superfamily - - "Ferritin-like superfamily of diiron-containing four-helix-bundle proteins; Ferritin-like, diiron-carboxylate proteins participate in a range of functions including iron regulation, mono-oxygenation, and reactive radical production. These proteins are characterized by the fact that they catalyze dioxygen-dependent oxidation-hydroxylation reactions within diiron centers; one exception is manganese catalase, which catalyzes peroxide-dependent oxidation-reduction within a dimanganese center. 
Diiron-carboxylate proteins are further characterized by the presence of duplicate metal ligands, glutamates and histidines (ExxH) and two additional glutamates within a four-helix bundle. Outside of these conserved residues there is little obvious homology. Members include bacterioferritin, ferritin, rubrerythrin, aromatic and alkene monooxygenase hydroxylases (AAMH), ribonucleotide reductase R2 (RNRR2), acyl-ACP-desaturases (Acyl_ACP_Desat), manganese (Mn) catalases, demethoxyubiquinone hydroxylases (DMQH), DNA protecting proteins (DPS), and ubiquinol oxidases (AOX), and the aerobic cyclase system, Fe-containing subunit (ACSF)." Q#7852 - CGI_10021663 superfamily 247757 19 233 3.82E-99 290.135 cl17203 Fer4_NifH superfamily - - "The Fer4_NifH superfamily contains a variety of proteins which share a common ATP-binding domain. Functionally, proteins in this superfamily use the energy from hydrolysis of NTP to transfer electron or ion." Q#7853 - CGI_10021664 superfamily 247755 514 735 4.67E-107 341.407 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#7853 - CGI_10021664 superfamily 247755 1392 1610 6.15E-103 329.851 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. 
The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#7854 - CGI_10021665 superfamily 243109 75 257 1.12E-98 291.813 cl02614 SPRY superfamily - - "SPRY domain; SPRY domains, first identified in the SP1A kinase of Dictyostelium and rabbit Ryanodine receptor (hence the name), are homologous to B30.2. SPRY domains have been identified in at least 11 protein families, covering a wide range of functions, including regulation of cytokine signaling (SOCS), RNA metabolism (DDX1 and hnRNP), immunity to retroviruses (TRIM5alpha), intracellular calcium release (ryanodine receptors or RyR) and regulatory and developmental processes (HERC1 and Ash2L). B30.2 also contains residues in the N-terminus that form a distinct PRY domain structure; i.e. B30.2 domain consists of PRY and SPRY subdomains. B30.2 domains comprise the C-terminus of three protein families: BTNs (receptor glycoproteins of immunoglobulin superfamily); several TRIM proteins (composed of RING/B-box/coiled-coil or RBCC core); Stonutoxin (secreted poisonous protein of the stonefish Synanceia horrida). While SPRY domains are evolutionarily ancient, B30.2 domains are a more recent adaptation where the SPRY/PRY combination is a possible component of immune defense. Mutations found in the SPRY-containing proteins have been shown to cause Mediterranean fever and Opitz syndrome." 
Q#7856 - CGI_10021668 superfamily 243066 28 137 2.42E-13 65.7165 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#7856 - CGI_10021668 superfamily 198867 170 253 2.65E-07 48.1064 cl06652 BACK superfamily - - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation)." Q#7858 - CGI_10021670 superfamily 241563 72 113 0.000207369 39.3848 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#7860 - CGI_10021672 superfamily 144065 88 154 2.56E-29 106.986 cl02842 Ribosomal_S5 superfamily - - "Ribosomal protein S5, N-terminal domain; Ribosomal protein S5, N-terminal domain. 
" Q#7860 - CGI_10021672 superfamily 190724 171 244 6.24E-23 89.7838 cl04231 Ribosomal_S5_C superfamily - - "Ribosomal protein S5, C-terminal domain; Ribosomal protein S5, C-terminal domain. " Q#7861 - CGI_10021673 superfamily 247684 10 183 2.41E-18 81.0959 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#7862 - CGI_10021674 superfamily 247684 8 181 6.73E-19 82.6367 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#7863 - CGI_10021675 superfamily 247723 81 146 4.46E-24 97.2608 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. 
RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#7864 - CGI_10021676 superfamily 247723 170 245 2.04E-46 155.643 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#7866 - CGI_10021678 superfamily 243058 298 407 1.62E-12 64.6431 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. 
An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#7866 - CGI_10021678 superfamily 243058 416 520 0.000901558 38.0644 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#7866 - CGI_10021678 superfamily 243058 101 208 0.00133556 37.6792 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. 
ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#7867 - CGI_10021679 superfamily 247683 2042 2096 5.10E-30 115.862 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#7867 - CGI_10021679 superfamily 247683 1830 1882 2.48E-26 105.125 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. 
SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#7867 - CGI_10021679 superfamily 247683 1761 1813 7.87E-24 98.1808 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#7867 - CGI_10021679 superfamily 241622 5 80 1.16E-15 75.2958 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. 
Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#7867 - CGI_10021679 superfamily 207673 1198 1244 2.72E-07 49.9273 cl02617 Sorb superfamily - - Sorbin homologous domain; Sorbin homologous domain. Q#7869 - CGI_10021681 superfamily 241832 17 99 6.88E-19 75.2688 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." 
Q#7870 - CGI_10021682 superfamily 241622 98 172 4.26E-16 74.9106 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#7874 - CGI_10021686 superfamily 247639 2 251 1.25E-39 139.904 cl16914 O-FucT_like superfamily - - "GDP-fucose protein O-fucosyltransferase and related proteins; O-fucosyltransferase-like proteins are GDP-fucose dependent enzymes with similarities to the family 1 glycosyltransferases (GT1). They are soluble ER proteins that may be proteolytically cleaved from a membrane-associated preprotein, and are involved in the O-fucosylation of protein substrates, the core fucosylation of growth factor receptors, and other processes." Q#7875 - CGI_10021687 superfamily 243058 197 294 2.37E-06 45.7684 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. 
An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#7875 - CGI_10021687 superfamily 248012 318 404 2.16E-19 83.396 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#7876 - CGI_10021688 superfamily 245815 3 198 1.40E-121 356.658 cl11961 ALDH-SF superfamily N - "NAD(P)+-dependent aldehyde dehydrogenase superfamily; The aldehyde dehydrogenase superfamily (ALDH-SF) of NAD(P)+-dependent enzymes, in general, oxidize a wide range of endogenous and exogenous aliphatic and aromatic aldehydes to their corresponding carboxylic acids and play an important role in detoxification. Besides aldehyde detoxification, many ALDH isozymes possess multiple additional catalytic and non-catalytic functions such as participating in metabolic pathways, or as binding proteins, or osmoregulants, to mention a few. The enzyme has three domains, a NAD(P)+ cofactor-binding domain, a catalytic domain, and a bridging domain; and the active enzyme is generally either homodimeric or homotetrameric. The catalytic mechanism is proposed to involve cofactor binding, resulting in a conformational change and activation of an invariant catalytic cysteine nucleophile. The cysteine and aldehyde substrate form an oxyanion thiohemiacetal intermediate resulting in hydride transfer to the cofactor and formation of a thioacylenzyme intermediate. 
Hydrolysis of the thioacylenzyme and release of the carboxylic acid product occurs, and in most cases, the reduced cofactor dissociates from the enzyme. The evolutionary phylogenetic tree of ALDHs appears to have an initial bifurcation between what has been characterized as the classical aldehyde dehydrogenases, the ALDH family (ALDH) and extended family members or aldehyde dehydrogenase-like (ALDH-L) proteins. The ALDH proteins are represented by enzymes which share a number of highly conserved residues necessary for catalysis and cofactor binding and they include such proteins as retinal dehydrogenase, 10-formyltetrahydrofolate dehydrogenase, non-phosphorylating glyceraldehyde 3-phosphate dehydrogenase, delta(1)-pyrroline-5-carboxylate dehydrogenases, alpha-ketoglutaric semialdehyde dehydrogenase, alpha-aminoadipic semialdehyde dehydrogenase, coniferyl aldehyde dehydrogenase and succinate-semialdehyde dehydrogenase. Included in this larger group are all human, Arabidopsis, Tortula, fungal, protozoan, and Drosophila ALDHs identified in families ALDH1 through ALDH22 with the exception of families ALDH18, ALDH19, and ALDH20 which are present in the ALDH-like group. The ALDH-like group is represented by such proteins as gamma-glutamyl phosphate reductase, LuxC-like acyl-CoA reductase, and coenzyme A acylating aldehyde dehydrogenase. All of these proteins have a conserved cysteine that aligns with the catalytic cysteine of the ALDH group." 
Q#7877 - CGI_10021689 superfamily 247684 11 381 3.67E-77 251.043 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#7878 - CGI_10021690 superfamily 247856 202 266 0.0053819 34.0605 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#7879 - CGI_10021691 superfamily 241832 1 69 5.04E-38 125.329 cl00388 Thioredoxin_like superfamily N - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#7880 - CGI_10021692 superfamily 241806 21 163 7.02E-85 248.4 cl00350 Ribosomal_S19 superfamily - - Ribosomal protein S19; Ribosomal protein S19. Q#7881 - CGI_10021693 superfamily 247748 104 283 9.64E-93 275.704 cl17194 Oxidored_q6 superfamily - - "NADH ubiquinone oxidoreductase, 20 Kd subunit; NADH ubiquinone oxidoreductase, 20 Kd subunit. " Q#7882 - CGI_10021694 superfamily 222681 117 273 3.83E-51 172.786 cl16800 PINIT superfamily - - PINIT domain; The PINIT domain is a protein domain that is found in PIAS proteins. The PINIT domain is about 180 amino acids in length. Q#7882 - CGI_10021694 superfamily 111745 318 367 1.99E-21 87.7433 cl17930 zf-MIZ superfamily - - MIZ/SP-RING zinc finger; This domain has SUMO (small ubiquitin-like modifier) ligase activity and is involved in DNA repair and chromosome organisation. 
Q#7882 - CGI_10021694 superfamily 207684 11 45 0.000109293 40.1624 cl02640 SAP superfamily - - "SAP domain; The SAP (after SAF-A/B, Acinus and PIAS) motif is a putative DNA/RNA binding domain found in diverse nuclear and cytoplasmic proteins." Q#7884 - CGI_10021696 superfamily 241616 2 78 1.13E-38 126.513 cl00109 MADS superfamily - - "MADS: MCM1, Agamous, Deficiens, and SRF (serum response factor) box family of eukaryotic transcriptional regulators. Binds DNA and exists as hetero and homo-dimers. Composed of 2 main subgroups: SRF-like/Type I and MEF2-like (myocyte enhancer factor 2)/ Type II. These subgroups differ mainly in position of the alpha 2 helix responsible for the dimerization interface; Important in homeotic regulation in plants and in immediate-early development in animals. Also found in fungi." Q#7886 - CGI_10021698 superfamily 241715 57 167 3.38E-37 125.841 cl00238 Frataxin superfamily - - "Frataxin is a nuclear-encoded mitochondrial protein implicated in Friedreich's ataxia (FRDA), a human autosomal recessive neurodegenerative disease; Frataxin is found in eukaryotes and in purple bacteria; lack of frataxin causes iron to accumulate in the mitochondrial matrix suggesting that frataxin is involved in mitochondrial iron homeostasis and possibly in iron transport; the domain has an alpha-beta fold consisting of two helices flanking an antiparallel beta sheet." Q#7888 - CGI_10021700 superfamily 220372 155 557 0 533.953 cl12369 Det1 superfamily - - "De-etiolated protein 1 Det1; This is the C-terminal conserved 400 residues of Det1 proteins of approximately 550 amino acids. Det1 (de-etiolated-1) is an essential negative regulator of plant light responses, and it is a component of the Arabidopsis CDD complex containing DDB1 and COP10 ubiquitin E2 variant. Mammalian Det1 forms stable DDD-E2 complexes, consisting of DDB1, DDA1 (DET1, DDB1 Associated 1), and a member of the UBE2E group of canonical ubiquitin conjugating enzymes and modulates Cul4A function." 
Q#7889 - CGI_10021701 superfamily 241616 2 74 5.23E-37 129.594 cl00109 MADS superfamily - - "MADS: MCM1, Agamous, Deficiens, and SRF (serum response factor) box family of eukaryotic transcriptional regulators. Binds DNA and exists as hetero and homo-dimers. Composed of 2 main subgroups: SRF-like/Type I and MEF2-like (myocyte enhancer factor 2)/ Type II. These subgroups differ mainly in position of the alpha 2 helix responsible for the dimerization interface; Important in homeotic regulation in plants and in immediate-early development in animals. Also found in fungi." Q#7890 - CGI_10021702 superfamily 217293 8 218 4.65E-67 216.73 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#7890 - CGI_10021702 superfamily 202474 226 329 1.45E-06 48.0337 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#7891 - CGI_10021703 superfamily 243179 98 202 1.00E-19 81.3163 cl02781 tetraspanin_LEL superfamily - - "Tetraspanin, extracellular domain or large extracellular loop (LEL). Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. The tetraspanin family contains CD9, CD63, CD37, CD53, CD82, CD151, and CD81, amongst others. Tetraspanins are involved in diverse processes such as cell activation and proliferation, adhesion and motility, differentiation, cancer, and others. Their various functions may relate to their ability to act as molecular facilitators, grouping specific cell-surface proteins and affecting formation and stability of signaling complexes. 
Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web", which may also include integrins." Q#7894 - CGI_10005252 superfamily 243092 477 798 1.49E-69 232.225 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#7894 - CGI_10005252 superfamily 191973 40 174 5.08E-56 188.726 cl07022 Striatin superfamily - - "Striatin family; Striatin is an intracellular protein which has a caveolin-binding motif, a coiled-coil structure, a calmodulin-binding site, and a WD (pfam00400) repeat domain. It acts as a scaffold protein and is involved in signalling pathways." 
Q#7895 - CGI_10005253 superfamily 192478 244 473 1.16E-123 378.654 cl10883 DUF2356 superfamily - - Conserved protein (DUF2356); This is a 200 amino acid region of a family of proteins conserved from plants to humans. Some members have been putatively annotated as being integrator complex subunit 3 but this could not be confirmed. The function is unknown. Q#7896 - CGI_10005254 superfamily 241739 233 552 1.30E-140 412.73 cl00268 class_II_aaRS-like_core superfamily - - "Class II tRNA amino-acyl synthetase-like catalytic core domain. Class II amino acyl-tRNA synthetases (aaRS) share a common fold and generally attach an amino acid to the 3' OH of ribose of the appropriate tRNA. PheRS is an exception in that it attaches the amino acid at the 2'-OH group, like class I aaRSs. These enzymes are usually homodimers. This domain is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. The substrate specificity of this reaction is further determined by additional domains. Intererestingly, this domain is also found is asparagine synthase A (AsnA), in the accessory subunit of mitochondrial polymerase gamma and in the bacterial ATP phosphoribosyltransferase regulatory subunit HisZ." Q#7896 - CGI_10005254 superfamily 245205 137 220 2.16E-37 133.128 cl09930 RPA_2b-aaRSs_OBF_like superfamily - - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. 
This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." Q#7897 - CGI_10005255 superfamily 243083 502 587 6.28E-06 45.8472 cl02554 PWWP superfamily - - "The PWWP domain, named for a conserved Pro-Trp-Trp-Pro motif, is a small domain consisting of 100-150 amino acids. The PWWP domain is found in numerous proteins that are involved in cell division, growth and differentiation. Most PWWP-domain proteins seem to be nuclear, often DNA-binding, proteins that function as transcription factors regulating a variety of developmental processes. The function of the PWWP domain is still not known precisely; however, based on the fact that other regions of PWWP-domain proteins are responsible for nuclear localization and DNA-binding, is likely that the PWWP domain acts as a site for protein-protein binding interactions, influencing chromatin remodeling and thereby regulating transcriptional processes. Some PWWP-domain proteins have been linked to cancer or other diseases; some are known to function as growth factors." Q#7898 - CGI_10005256 superfamily 219924 103 150 0.000116039 40.4123 cl07276 eIF3_subunit superfamily NC - Translation initiation factor eIF3 subunit; This is a family of proteins which are subunits of the eukaryotic translation initiation factor 3 (eIF3). In yeast it is called Hcr1. 
The Saccharomyces cerevisiae protein eIF3j (HCR1) has been shown to be required for processing of 20S pre-rRNA and binds to 18S rRNA and eIF3 subunits Rpg1p and Prt1p. Q#7899 - CGI_10000309 superfamily 248100 47 105 7.33E-07 42.1412 cl17546 PQ-loop superfamily - - "PQ loop repeat; Members of this family are all membrane bound proteins possessing a pair of repeats each spanning two transmembrane helices connected by a loop. The PQ motif found on loop 2 is critical for the localisation of cystinosin to lysosomes. However, the PQ motif appears not to be a general lysosome-targeting motif. It is thought likely to possess a more general function. Most probably this involves a glutamine residue." Q#7900 - CGI_10003551 superfamily 203101 95 246 6.90E-41 141.726 cl04785 Popeye superfamily - - Popeye protein conserved region; The function of Popeye proteins is not well understood. They are predominantly expressed in cardiac and skeletal muscle. This family represents a conserved region which includes three potential transmembrane domains. Q#7901 - CGI_10003553 superfamily 247097 196 232 0.00260149 35.8898 cl15839 ShK superfamily - - ShK domain-like; This domain is found in several C. elegans proteins. The domain is 30 amino acids long and rich in cysteine residues. There are 6 conserved cysteine positions in the domain that form three disulphide bridges. The domain is found in the potassium channel inhibitor ShK in sea anemone. Q#7902 - CGI_10003554 superfamily 247724 442 593 4.56E-57 191.251 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families.
This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#7905 - CGI_10013565 superfamily 245055 14 512 1.41E-83 280.182 cl09326 MATE_like superfamily - - "Multidrug and toxic compound extrusion family and similar proteins; The integral membrane proteins from the MATE family are involved in exporting metabolites across the cell membrane and are responsible for multidrug resistance (MDR) in many bacteria and animals. MATE has also been identified as a large multigene family in plants, where the proteins are linked to disease resistance. A number of family members are involved in the synthesis of peptidoglycan components in bacteria." Q#7906 - CGI_10013566 superfamily 192033 443 677 2.93E-91 286.213 cl07158 DUF1741 superfamily - - Domain of unknown function (DUF1741); This is a eukaryotic domain of unknown function. Q#7907 - CGI_10013567 superfamily 243176 5 400 0 737.56 cl02777 chaperonin_like superfamily C - "chaperonin_like superfamily. Chaperonins are involved in productive folding of proteins. They share a common general morphology, a double toroid of 2 stacked rings, each composed of 7-9 subunits. There are 2 main chaperonin groups. 
The symmetry of type I is seven-fold and they are found in eubacteria (GroEL) and in organelles of eubacterial descent (hsp60 and RBP). The symmetry of type II is eight- or nine-fold and they are found in archea (thermosome), thermophilic bacteria (TF55) and in the eukaryotic cytosol (CTT). Their common function is to sequester nonnative proteins inside their central cavity and promote folding by using energy derived from ATP hydrolysis. This superfamily also contains related domains from Fab1-like phosphatidylinositol 3-phosphate (PtdIns3P) 5-kinases that only contain the intermediate and apical domains." Q#7908 - CGI_10013568 superfamily 243176 1 218 6.17E-151 433.637 cl02777 chaperonin_like superfamily N - "chaperonin_like superfamily. Chaperonins are involved in productive folding of proteins. They share a common general morphology, a double toroid of 2 stacked rings, each composed of 7-9 subunits. There are 2 main chaperonin groups. The symmetry of type I is seven-fold and they are found in eubacteria (GroEL) and in organelles of eubacterial descent (hsp60 and RBP). The symmetry of type II is eight- or nine-fold and they are found in archea (thermosome), thermophilic bacteria (TF55) and in the eukaryotic cytosol (CTT). Their common function is to sequester nonnative proteins inside their central cavity and promote folding by using energy derived from ATP hydrolysis. This superfamily also contains related domains from Fab1-like phosphatidylinositol 3-phosphate (PtdIns3P) 5-kinases that only contain the intermediate and apical domains." Q#7909 - CGI_10013569 superfamily 220364 65 180 8.93E-65 201.452 cl10716 Fra10Ac1 superfamily - - "Folate-sensitive fragile site protein Fra10Ac1; This entry represents the full-length proteins in which, in higher eukaryotes, the nested domain EDSLL lies. Fra10Ac1 is a highly conserved protein, of unknown function that is nuclear and highly expressed in brain." 
Q#7911 - CGI_10013571 superfamily 241574 792 978 1.93E-68 230.936 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#7911 - CGI_10013571 superfamily 241574 312 475 2.84E-33 130.013 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#7911 - CGI_10013571 superfamily 243066 1095 1188 1.36E-17 80.7393 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. 
The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#7912 - CGI_10013572 superfamily 241574 96 170 1.28E-17 76.8557 cl00053 PTPc superfamily NC - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#7913 - CGI_10013573 superfamily 241574 54 270 3.99E-24 97.2713 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." 
Q#7914 - CGI_10013574 superfamily 150445 17 133 1.19E-15 68.945 cl10753 Cid2 superfamily - - Caffeine-induced death protein 2; Members of this family of proteins mediate the disruption of the DNA replication checkpoint (S-M checkpoint) mechanism caused by caffeine. Q#7915 - CGI_10013575 superfamily 243058 253 364 3.64E-31 117.03 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#7915 - CGI_10013575 superfamily 243058 116 238 1.38E-28 109.711 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#7915 - CGI_10013575 superfamily 243058 329 447 1.06E-22 93.5331 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. 
An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#7915 - CGI_10013575 superfamily 201951 4 102 3.32E-26 102.844 cl03339 IBB superfamily - - "Importin beta binding domain; This family consists of the importin alpha (karyopherin alpha), importin beta (karyopherin beta) binding domain. The domain mediates formation of the importin alpha beta complex; required for classical NLS import of proteins into the nucleus, through the nuclear pore complex and across the nuclear envelope. Also in the alignment is the NLS of importin alpha which overlaps with the IBB domain." Q#7916 - CGI_10013576 superfamily 217895 52 169 0.00650401 34.9263 cl04401 CD20 superfamily - - "CD20-like family; This family includes the CD20 protein and the beta subunit of the high affinity receptor for IgE Fc. The high affinity receptor for IgE is a tetrameric structure consisting of a single IgE-binding alpha subunit, a single beta subunit, and two disulfide-linked gamma subunits. The alpha subunit of Fc epsilon RI and most Fc receptors are homologous members of the Ig superfamily. By contrast, the beta and gamma subunits from Fc epsilon RI are not homologous to the Ig superfamily. Both molecules have four putative transmembrane segments and a probably topology where both amino- and carboxy termini protrude into the cytoplasm. This family also includes LR8 like proteins from humans, mice and rats. 
The function of the human LR8 protein is unknown although it is known to be strongly expressed in the lung fibroblasts. This family also includes sarcospan is a transmembrane component of dystrophin-associated glycoprotein. Loss of the sarcoglycan complex and sarcospan alone is sufficient to cause muscular dystrophy. The role of the sarcoglycan complex and sarcospan is thought to be to strengthen the dystrophin axis connecting the basement membrane with the cytoskeleton." Q#7918 - CGI_10013578 superfamily 245040 4 61 5.62E-06 39.7648 cl09238 CY superfamily C - "Cystatin-like domain; Cystatins are a family of cysteine protease inhibitors that occur mainly as single domain proteins. However some extracellular proteins such as kininogen, His-rich glycoprotein and fetuin also contain these domains." Q#7921 - CGI_10013581 superfamily 246680 749 827 0.000135758 41.2267 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#7921 - CGI_10013581 superfamily 248012 16 107 0.000628891 39.0981 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. 
Q#7921 - CGI_10013581 superfamily 246680 300 378 0.00125006 38.1451 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#7923 - CGI_10000750 superfamily 241563 59 99 8.57E-05 40.5404 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#7923 - CGI_10000750 superfamily 243092 306 406 0.000391374 41.1664 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#7924 - CGI_10005358 superfamily 247058 70 263 1.02E-52 178.523 cl15762 crotonase-like superfamily - - "Crotonase/Enoyl-Coenzyme A (CoA) hydratase superfamily. This superfamily contains a diverse set of enzymes including enoyl-CoA hydratase, napthoate synthase, methylmalonyl-CoA decarboxylase, 3-hydoxybutyryl-CoA dehydratase, and dienoyl-CoA isomerase. Many of these play important roles in fatty acid metabolism. In addition to a conserved structural core and the formation of trimers (or dimers of trimers), a common feature in this superfamily is the stabilization of an enolate anion intermediate derived from an acyl-CoA substrate. This is accomplished by two conserved backbone NH groups in active sites that form an oxyanion hole." 
Q#7924 - CGI_10005358 superfamily 205939 293 383 3.20E-43 149.933 cl18284 ECH_C superfamily C - 2-enoyl-CoA Hydratase C-terminal region; This is the C-terminal region of enoyl-CoA hydratase. Q#7925 - CGI_10005359 superfamily 245206 11 199 2.69E-78 238.256 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#7928 - CGI_10005362 superfamily 193607 8 139 4.90E-72 214.358 cl15237 Deltex_C superfamily - - "Domain found at the C-terminus of deltex-like; The deltex family of proteins is involved in the regulation of Notch signaling, and therefore may play roles in cell-to-cell communications that regulate mechanisms determining cell fate. They have a central RING-type zinc finger domain and contain a C-terminal domain, described here, that is also found in other domain architectures. Deltex-1 (DTX1) contains a RING finger and two WWE domains, indicating that it may be an E3 ubiquitin ligase. Human deltex 3-like, which contains an additional N-terminal domain (presumably with ubiquitin ligase activity) is also described as E3 ubiquitin-protein ligase DTX3L, B-lymphoma- and BAL-associated protein (BBAP), or rhysin-2. 
DTX3L mediates monoubiquitination of K91 of histone H4 in response to DNA damage." Q#7929 - CGI_10005363 superfamily 243161 3 60 1.08E-05 40.8406 cl02739 THAP superfamily C - "THAP domain; The THAP domain is a putative DNA-binding domain (DBD) and probably also binds a zinc ion. It features the conserved C2CH architecture (consensus sequence: Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His). Other universal features include the location of the domain at the N-termini of proteins, its size of about 90 residues, a C-terminal AVPTIF box and several other conserved residues. Orthologues of the human THAP domain have been identified in other vertebrates and probably worms and flies, but not in other eukaryotes or any prokaryotes." Q#7931 - CGI_10007727 superfamily 241818 103 238 4.83E-49 161.987 cl00366 PMSR superfamily - - Peptide methionine sulfoxide reductase; This enzyme repairs damaged proteins. Methionine sulfoxide in proteins is reduced to methionine. Q#7931 - CGI_10007727 superfamily 245009 4 117 3.04E-06 44.5809 cl09109 NTF2_like superfamily - - "Nuclear transport factor 2 (NTF2-like) superfamily. This family includes members of the NTF2 family, Delta-5-3-ketosteroid isomerases, Scytalone Dehydratases, and the beta subunit of Ring hydroxylating dioxygenases. This family is a classic example of divergent evolution wherein the proteins have many common structural details but diverge greatly in their function. For example, nuclear transport factor 2 (NTF2) mediates the nuclear import of RanGDP and binds to both RanGDP and FxFG repeat-containing nucleoporins while Ketosteroid isomerases catalyze the isomerization of delta-5-3-ketosteroid to delta-4-3-ketosteroid, by intramolecular transfer of the C4-beta proton to the C6-beta position. While the function of the beta sub-unit of the Ring hydroxylating dioxygenases is not known, Scytalone Dehydratases catalyzes two reactions in the biosynthetic pathway that produces fungal melanin. 
Members of the NTF2-like superfamily are widely distributed among bacteria, archaea and eukaryotes." Q#7932 - CGI_10007728 superfamily 218201 434 559 3.97E-40 143.014 cl07855 Tsg superfamily - - "Twisted gastrulation (Tsg) protein conserved region; Tsg was identified in Drosophila as being required to specify the dorsal-most structures in the embryo, for example amnioserosa. Biochemical experiments have revealed three key properties of Tsg: it can synergistically inhibit Dpp/BMP action in both Drosophila and vertebrates by forming a tripartite complete between itself, SOG/chordin and a BMP ligand; Tsg seems to enhance the Tld/BMP-1-mediated cleavage rate of SOG/chordin and may change the preference of site utilisation; Tsg can promote the dissociation of chordin cysteine-rich-containing fragments from the ligand to inhibit BMP signalling." Q#7932 - CGI_10007728 superfamily 201778 8 114 1.86E-13 67.619 cl18219 GFO_IDH_MocA superfamily - - "Oxidoreductase family, NAD-binding Rossmann fold; This family of enzymes utilise NADP or NAD. This family is called the GFO/IDH/MOCA family in swiss-prot." Q#7936 - CGI_10007732 superfamily 241833 130 273 4.88E-52 172.71 cl00389 SIS superfamily - - SIS domain. SIS (Sugar ISomerase) domains are found in many phosphosugar isomerases and phosphosugar binding proteins. SIS domains are also found in proteins that regulate the expression of genes involved in synthesis of phosphosugars. Q#7936 - CGI_10007732 superfamily 241833 300 383 4.84E-37 133.108 cl00389 SIS superfamily C - SIS domain. SIS (Sugar ISomerase) domains are found in many phosphosugar isomerases and phosphosugar binding proteins. SIS domains are also found in proteins that regulate the expression of genes involved in synthesis of phosphosugars. Q#7937 - CGI_10007733 superfamily 241833 338 529 5.92E-86 266.002 cl00389 SIS superfamily - - SIS domain. SIS (Sugar ISomerase) domains are found in many phosphosugar isomerases and phosphosugar binding proteins. 
SIS domains are also found in proteins that regulate the expression of genes involved in synthesis of phosphosugars. Q#7937 - CGI_10007733 superfamily 241833 130 291 9.38E-69 220.86 cl00389 SIS superfamily - - SIS domain. SIS (Sugar ISomerase) domains are found in many phosphosugar isomerases and phosphosugar binding proteins. SIS domains are also found in proteins that regulate the expression of genes involved in synthesis of phosphosugars. Q#7939 - CGI_10007735 superfamily 217905 125 657 8.56E-96 306.418 cl04405 Gaa1 superfamily - - "Gaa1-like, GPI transamidase component; GPI (glycosyl phosphatidyl inositol) transamidase is a multi-protein complex. Gpi16, Gpi8 and Gaa1 form a sub-complex of the GPI transamidase. GPI transamidase adds glycosylphosphatidylinositols (GPIs) to newly synthesised proteins." Q#7940 - CGI_10007736 superfamily 245836 222 395 3.98E-69 220.901 cl12015 Adenylation_DNA_ligase_like superfamily - - "Adenylation domain of proteins similar to ATP-dependent polynucleotide ligases; ATP-dependent polynucleotide ligases catalyze the phosphodiester bond formation of nicked nucleic acid substrates using ATP as a cofactor in a three step reaction mechanism. This family includes ATP-dependent DNA and RNA ligases. DNA ligases play a vital role in the diverse processes of DNA replication, recombination and repair. ATP-dependent DNA ligases have a highly modular architecture, consisting of a unique arrangement of two or more discrete domains, including a DNA-binding domain, an adenylation or nucleotidyltransferase (NTase) domain, and an oligonucleotide/oligosaccharide binding (OB)-fold domain. The adenylation domain binds ATP and contains many active site residues. Together with the C-terminal OB-fold domain, it comprises a catalytic core unit that is common to most members of the ATP-dependent DNA ligase family.
The catalytic core contains six conserved sequence motifs (I, III, IIIa, IV, V and VI) that define this family of related nucleotidyltransferases including eukaryotic GRP-dependent mRNA-capping enzymes. The catalytic core contains both the active site as well as many DNA-binding residues. The RNA circularization protein from archaea and bacteria contains the minimal catalytic unit, the adenylation domain, but does not contain an OB-fold domain. This family also includes the m3G-cap binding domain of snurportin, a nuclear import adaptor that binds m3G-capped spliceosomal U small nucleoproteins (snRNPs), but doesn't have enzymatic activity." Q#7940 - CGI_10007736 superfamily 244947 397 473 1.61E-36 129.946 cl08424 OBF_DNA_ligase_family superfamily - - "The Oligonucleotide/oligosaccharide binding (OB)-fold domain is a DNA-binding module that is part of the catalytic core unit of ATP dependent DNA ligases; ATP-dependent polynucleotide ligases catalyze phosphodiester bond formation using nicked nucleic acid substrates with the high energy nucleotide of ATP as a cofactor in a three step reaction mechanism. DNA ligases play a vital role in the diverse processes of DNA replication, recombination and repair. ATP dependent DNA ligases have a highly modular architecture consisting of a unique arrangement of two or more discrete domains including a DNA-binding domain, an adenylation (nucleotidyltransferase (NTase)) domain, and an oligonucleotide/oligosaccharide binding (OB)-fold domain. The adenylation and C-terminal OB-fold domains comprise a catalytic core unit that is common to most members of the ATP-dependent DNA ligase family. The catalytic core unit contains six conserved sequence motifs (I, III, IIIa, IV, V and VI) that define this family of related nucleotidyltransferases. The OB-fold domain contacts the nicked DNA substrate and is required for the ATP-dependent DNA ligase nucleotidylation step. 
The RxDK motif (motif VI), which is essential for ATP hydrolysis, is located in the OB-fold domain." Q#7940 - CGI_10007736 superfamily 204434 104 128 2.71E-06 44.4929 cl10963 zf-CCHH superfamily - - "Zinc-finger (CX5CX6HX5H) motif; This domain is a zinc-finger motif that in humans is part of the APLF, aprataxin- and PNK-like forkhead association domain-containing protein. The ZnF is highly conserved both in primary sequence and in the spacing between the putative zinc coordinating residues and is configured CX5CX6HX5H. Many of the proteins containing the APLF-like ZnF are involved in DNA strand break repair and/or contain domains implicated in DNA metabolism." Q#7940 - CGI_10007736 superfamily 204434 5 27 0.000101166 39.8705 cl10963 zf-CCHH superfamily - - "Zinc-finger (CX5CX6HX5H) motif; This domain is a zinc-finger motif that in humans is part of the APLF, aprataxin- and PNK-like forkhead association domain-containing protein. The ZnF is highly conserved both in primary sequence and in the spacing between the putative zinc coordinating residues and is configured CX5CX6HX5H. Many of the proteins containing the APLF-like ZnF are involved in DNA strand break repair and/or contain domains implicated in DNA metabolism." Q#7941 - CGI_10007737 superfamily 220624 25 114 3.04E-09 51.9965 cl10881 Nefa_Nip30_N superfamily - - "N-terminal domain of NEFA-interacting nuclear protein NIP30; This is the N-terminal 100 amino acids of a family of proteins conserved from plants to humans. The full-length protein has putatively been called NEFA-interacting nuclear protein NIP30, however no reference could be found to confirm this." 
Q#7942 - CGI_10007738 superfamily 241662 13 113 6.10E-30 104.95 cl00180 RabGEF superfamily - - "Nucleotide exchange factor for Rab-like small GTPases (RabGEF), Mss4 type; RabGEF positively regulates the function of Rab GTPase by promoting exchange of GDP for GTP; members of the Rab subfamily of Ras GTPases are important in vesicular transport;" Q#7944 - CGI_10007740 superfamily 241865 1 151 4.37E-33 119.358 cl00440 BtpA superfamily N - "BtpA family; The BtpA protein is tightly associated with the thylakoid membranes, where it stabilises the reaction centre proteins of photosystem I." Q#7945 - CGI_10007741 superfamily 241572 10 91 2.35E-16 71.886 cl00050 CYCLIN superfamily - - "Cyclin box fold. Protein binding domain functioning in cell-cycle and transcription control. Present in cyclins, TFIIB and Retinoblastoma (RB). The cyclins consist of 8 classes of cell cycle regulators that regulate cyclin dependent kinases (CDKs). TFIIB is a transcription factor that binds the TATA box. Cyclins, TFIIB and RB contain 2 copies of the domain." Q#7945 - CGI_10007741 superfamily 241572 142 195 6.26E-08 49.1634 cl00050 CYCLIN superfamily NC - "Cyclin box fold. Protein binding domain functioning in cell-cycle and transcription control. Present in cyclins, TFIIB and Retinoblastoma (RB). The cyclins consist of 8 classes of cell cycle regulators that regulate cyclin dependent kinases (CDKs). TFIIB is a transcription factor that binds the TATA box. Cyclins, TFIIB and RB contain 2 copies of the domain." Q#7947 - CGI_10007743 superfamily 248401 226 516 2.62E-39 144.946 cl17847 Rsm22 superfamily - - "Mitochondrial small ribosomal subunit Rsm22; Rsm22 has been identified as a mitochondrial small ribosomal subunit and is a methyltransferase. In Schizosaccharomyces pombe, Rsm22 is tandemly fused to Cox11 (a factor required for copper insertion into cytochrome oxidase) and the two proteins are proteolytically cleaved after import into the mitochondria." 
Q#7948 - CGI_10007744 superfamily 243072 64 194 2.96E-28 111.321 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#7948 - CGI_10007744 superfamily 243072 137 296 6.01E-25 102.077 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#7948 - CGI_10007744 superfamily 243072 248 364 5.30E-20 87.8242 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#7948 - CGI_10007744 superfamily 247057 700 759 7.11E-27 105.342 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. 
Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#7948 - CGI_10007744 superfamily 243072 39 67 0.000477809 39.0744 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#7949 - CGI_10007745 superfamily 242228 3 52 8.04E-18 69.9729 cl00977 Nop10p superfamily - - "Nucleolar RNA-binding protein, Nop10p family; Nop10p is a nucleolar protein that is specifically associated with H/ACA snoRNAs. It is essential for normal 18S rRNA production and rRNA pseudouridylation by the ribonucleoprotein particles containing H/ACA snoRNAs (H/ACA snoRNPs). Nop10p is probably necessary for the stability of these RNPs." Q#7951 - CGI_10005257 superfamily 221536 410 509 5.62E-06 46.6337 cl13732 SBF2 superfamily NC - "Myotubularin protein; This domain family is found in eukaryotes, and is approximately 220 amino acids in length. The family is found in association with pfam02141, pfam03456, pfam03455. This family is the middle region of SBF2, a member of the myotubularin family. 
Myotubularin-related proteins have been suggested to work in phosphoinositide-mediated signalling events that may also convey control of myelination. Mutations of SBF2 are implicated in Charcot-Marie-Tooth disease." Q#7952 - CGI_10005258 superfamily 204041 19 166 1.08E-40 137.328 cl07367 GLTP superfamily - - Glycolipid transfer protein (GLTP); GLTP is a cytosolic protein that catalyzes the intermembrane transfer of glycolipids. Q#7954 - CGI_10005260 superfamily 247725 11 130 1.21E-70 214.785 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#7955 - CGI_10005261 superfamily 245201 84 412 1.16E-150 460.823 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." 
Q#7955 - CGI_10005261 superfamily 227318 458 518 0.0091015 38.3596 cl09222 COG4985 superfamily N - "ABC-type phosphate transport system, auxiliary component [Inorganic ion transport and metabolism]" Q#7956 - CGI_10005262 superfamily 243100 56 104 3.50E-06 44.2179 cl02576 B_zip1 superfamily - - "basic leucine zipper DNA-binding and multimerization region of GCN4 and related proteins; Basic leucine zipper (bZIP) transcription factors act in networks of homo- and hetero-dimers in the regulation in a diverse set of cellular pathways. Classical leucine zippers have alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization creates a pair of basic regions that bind DNA and undergo conformational change. GCN4 was identified in Saccharomyces cerevisiae from mutations in a deficiency in activation with the general amino acid control pathway. GCN4 encodes a trans-activator of amino acid biosynthetic genes containing 2 acidic activation domains and a C-terminal bZIP domain, comprised of a basic alpha-helical DNA-binding region and a coiled-coil dimerization region." Q#7957 - CGI_10005263 superfamily 247725 24 123 0.00150491 34.8317 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." 
Q#7958 - CGI_10005264 superfamily 199156 141 156 0.000332125 36.6524 cl15298 zf-CCHC superfamily - - "Zinc knuckle; The zinc knuckle is a zinc binding motif composed of the following CX2CX4HX4C where X can be any amino acid. The motifs are mostly from retroviral gag proteins (nucleocapsid). Prototype structure is from HIV. Also contains members involved in eukaryotic gene regulation, such as C. elegans GLH-1. Structure is an 18-residue zinc finger." Q#7960 - CGI_10006840 superfamily 219603 532 737 3.59E-34 132.572 cl06744 GCFC superfamily - - "GC-rich sequence DNA-binding factor-like protein; Sequences found in this family are similar to a region of a human GC-rich sequence DNA-binding factor homolog. This is thought to be a protein involved in transcriptional regulation due to partial homologies to a transcription repressor and histone-interacting protein. This family also contains tuftelin interacting protein 11 which has been identified as both a nuclear and cytoplasmic protein, and has been implicated in the secretory pathway. Sip1, a septin interacting protein is also a member of this family." Q#7961 - CGI_10006841 superfamily 245201 22 329 2.71E-60 204.307 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#7963 - CGI_10006843 superfamily 245225 1 361 1.10E-78 259.896 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily - - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. 
This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein couples receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine- binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." 
Q#7963 - CGI_10006843 superfamily 247986 378 497 2.53E-12 66.2426 cl17432 PBPb superfamily C - "Bacterial periplasmic transport systems use membrane-bound complexes and substrate-bound, membrane-associated, periplasmic binding proteins (PBPs) to transport a wide variety of substrates, such as, amino acids, peptides, sugars, vitamins and inorganic ions. PBPs have two cell-membrane translocation functions: bind substrate, and interact with the membrane bound complex. A diverse group of periplasmic transport receptors for lysine/arginine/ornithine (LAO), glutamine, histidine, sulfate, phosphate, molybdate, and methanol are included in the PBPb CD." Q#7963 - CGI_10006843 superfamily 247986 613 746 7.81E-06 46.5974 cl17432 PBPb superfamily N - "Bacterial periplasmic transport systems use membrane-bound complexes and substrate-bound, membrane-associated, periplasmic binding proteins (PBPs) to transport a wide variety of substrates, such as, amino acids, peptides, sugars, vitamins and inorganic ions. PBPs have two cell-membrane translocation functions: bind substrate, and interact with the membrane bound complex. A diverse group of periplasmic transport receptors for lysine/arginine/ornithine (LAO), glutamine, histidine, sulfate, phosphate, molybdate, and methanol are included in the PBPb CD." Q#7964 - CGI_10006844 superfamily 245225 10 389 2.17E-68 235.244 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily - - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein couples receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine- binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). 
In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#7964 - CGI_10006844 superfamily 247986 1085 1189 3.58E-11 63.5462 cl17432 PBPb superfamily C - "Bacterial periplasmic transport systems use membrane-bound complexes and substrate-bound, membrane-associated, periplasmic binding proteins (PBPs) to transport a wide variety of substrates, such as, amino acids, peptides, sugars, vitamins and inorganic ions. PBPs have two cell-membrane translocation functions: bind substrate, and interact with the membrane bound complex. A diverse group of periplasmic transport receptors for lysine/arginine/ornithine (LAO), glutamine, histidine, sulfate, phosphate, molybdate, and methanol are included in the PBPb CD." 
Q#7964 - CGI_10006844 superfamily 247986 426 508 1.05E-10 62.0054 cl17432 PBPb superfamily C - "Bacterial periplasmic transport systems use membrane-bound complexes and substrate-bound, membrane-associated, periplasmic binding proteins (PBPs) to transport a wide variety of substrates, such as, amino acids, peptides, sugars, vitamins and inorganic ions. PBPs have two cell-membrane translocation functions: bind substrate, and interact with the membrane bound complex. A diverse group of periplasmic transport receptors for lysine/arginine/ornithine (LAO), glutamine, histidine, sulfate, phosphate, molybdate, and methanol are included in the PBPb CD." Q#7964 - CGI_10006844 superfamily 197504 631 765 2.96E-47 167.466 cl18192 PBPe superfamily - - Eukaryotic homologues of bacterial periplasmic substrate binding proteins; Prokaryotic homologues are represented by a separate alignment: PBPb Q#7964 - CGI_10006844 superfamily 197504 1312 1445 7.16E-46 163.614 cl18192 PBPe superfamily - - Eukaryotic homologues of bacterial periplasmic substrate binding proteins; Prokaryotic homologues are represented by a separate alignment: PBPb Q#7964 - CGI_10006844 superfamily 245225 938 1088 1.36E-11 66.9501 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily N - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein couples receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine- binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). 
In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#7965 - CGI_10006845 superfamily 245225 41 422 2.95E-77 256.43 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily - - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein couples receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine- binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). 
In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#7965 - CGI_10006845 superfamily 247986 452 554 2.34E-14 72.791 cl17432 PBPb superfamily C - "Bacterial periplasmic transport systems use membrane-bound complexes and substrate-bound, membrane-associated, periplasmic binding proteins (PBPs) to transport a wide variety of substrates, such as, amino acids, peptides, sugars, vitamins and inorganic ions. PBPs have two cell-membrane translocation functions: bind substrate, and interact with the membrane bound complex. A diverse group of periplasmic transport receptors for lysine/arginine/ornithine (LAO), glutamine, histidine, sulfate, phosphate, molybdate, and methanol are included in the PBPb CD." 
Q#7965 - CGI_10006845 superfamily 197504 669 801 3.93E-43 153.984 cl18192 PBPe superfamily - - Eukaryotic homologues of bacterial periplasmic substrate binding proteins; Prokaryotic homologues are represented by a separate alignment: PBPb Q#7966 - CGI_10006846 superfamily 197504 1 96 5.65E-34 119.702 cl18192 PBPe superfamily N - Eukaryotic homologues of bacterial periplasmic substrate binding proteins; Prokaryotic homologues are represented by a separate alignment: PBPb Q#7967 - CGI_10027629 superfamily 247792 23 75 1.24E-06 46.2848 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#7968 - CGI_10027630 superfamily 247792 20 72 7.14E-07 47.0552 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#7968 - CGI_10027630 superfamily 241563 169 197 0.00503192 35.5328 cl00034 BBOX superfamily C - "B-Box-type zinc 
finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#7969 - CGI_10027631 superfamily 246709 4 30 0.000369043 36.299 cl14782 RNase_H superfamily N - "RNase H is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a sequence non-specific manner; Ribonuclease H (RNase H) enzymes are divided into two major families, Type 1 and Type 2, based on amino acid sequence similarities and biochemical properties. RNase H is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a sequence non-specific manner in the presence of divalent cations. RNase H is widely present in various organisms, including bacteria, archaea and eukaryotes. Most prokaryotic and eukaryotic genomes contain multiple RNase H genes. Despite the lack of amino acid sequence homology, Type 1 and type 2 RNase H share a main-chain fold and steric configurations of the four acidic active-site residues and have the same catalytic mechanism and functions in cells. RNase H is involved in DNA replication, repair and transcription. One of the important functions of RNase H is to remove Okazaki fragments during DNA replication. RNase H inhibitors have been explored as an anti-HIV drug target because RNase H inactivation inhibits reverse transcription." Q#7972 - CGI_10027635 superfamily 247637 2 311 2.08E-139 404.338 cl16912 MDR superfamily - - "Medium chain reductase/dehydrogenase (MDR)/zinc-dependent alcohol dehydrogenase-like family; The medium chain reductase/dehydrogenases (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH. 
MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P) binding-Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES. The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH) , quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones. ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines. Other MDR members have only a catalytic zinc, and some contain no coordinated zinc." Q#7973 - CGI_10027636 superfamily 241600 132 203 2.50E-14 67.3022 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. 
C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#7974 - CGI_10027637 superfamily 241600 8 137 1.32E-31 113.104 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#7975 - CGI_10027638 superfamily 243035 107 183 1.87E-05 41.0662 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#7975 - CGI_10027638 superfamily 241776 93 116 0.00639491 34.859 cl00315 RPS2 superfamily NC - "Ribosomal protein S2 (RPS2), involved in formation of the translation initiation complex, where it might contact the messenger RNA and several components of the ribosome. It has been shown that in Escherichia coli RPS2 is essential for the binding of ribosomal protein S1 to the 30s ribosomal subunit. In humans, most likely in all vertebrates, and perhaps in all metazoans, the protein also functions as the 67 kDa laminin receptor (LAMR1 or 67LR), which is formed from a 37 kDa precursor, and is overexpressed in many tumors. 67LR is a cell surface receptor which interacts with a variety of ligands, laminin-1 and others. It is assumed that the ligand interactions are mediated via the conserved C-terminus, which becomes extracellular as the protein undergoes conformational changes which are not well understood. Specifically, a conserved palindromic motif, LMWWML, may participate in the interactions. 67LR plays essential roles in the adhesion of cells to the basement membrane and subsequent signalling events, and has been linked to several diseases. Some evidence also suggests that the precursor of 67LR, 37LRP is also present in the nucleus in animals, where it appears associated with histones." Q#7977 - CGI_10027640 superfamily 222225 19 130 6.41E-07 45.9764 cl18652 DoxX_2 superfamily - - DoxX-like family; This family of uncharacterized proteins are related to DoxX pfam07681. Q#7979 - CGI_10027642 superfamily 241563 123 162 5.31E-05 41.3108 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#7980 - CGI_10027643 superfamily 241563 83 121 0.000161536 39.6279 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#7980 - CGI_10027643 superfamily 242067 174 300 0.00353518 38.4879 cl00751 DUF155 superfamily NC - "Uncharacterized ACR, YagE family COG1723; Uncharacterized ACR, YagE family COG1723. " Q#7980 - CGI_10027643 superfamily 243092 344 455 0.00558764 37.6996 cl02567 WD40 superfamily NC - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#7981 - CGI_10027644 superfamily 241563 117 156 4.11E-06 44.3924 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#7982 - CGI_10027645 superfamily 216363 77 164 2.28E-16 70.9622 cl08312 UPF0029 superfamily C - Uncharacterized protein family UPF0029; Uncharacterized protein family UPF0029. Q#7984 - CGI_10027647 superfamily 241563 71 110 7.90E-07 45.1628 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#7985 - CGI_10027648 superfamily 242611 36 286 1.76E-119 350.642 cl01629 TPP_enzymes superfamily - - "Thiamine pyrophosphate (TPP) enzyme family, TPP-binding module; found in many key metabolic enzymes which use TPP (also known as thiamine diphosphate) as a cofactor. These enzymes include, among others, the E1 components of the pyruvate, the acetoin and the branched chain alpha-keto acid dehydrogenase complexes." Q#7986 - CGI_10027649 superfamily 220608 36 155 5.16E-34 128.193 cl10859 G8 superfamily - - G8 domain; This domain is found in disease proteins PKHD1 and KIAA1199 and is named G8 after its 8 conserved glycines. It is predicted to contain 10 beta strands and an alpha helix. Q#7987 - CGI_10027650 superfamily 220608 36 114 9.99E-19 82.739 cl10859 G8 superfamily C - G8 domain; This domain is found in disease proteins PKHD1 and KIAA1199 and is named G8 after its 8 conserved glycines. It is predicted to contain 10 beta strands and an alpha helix. 
Q#7990 - CGI_10027653 superfamily 243035 30 97 1.40E-11 59.1705 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#7990 - CGI_10027653 superfamily 243035 127 249 8.91E-11 56.8593 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. 
CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#7991 - CGI_10027654 superfamily 243035 51 160 3.41E-12 59.1705 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. 
Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. 
A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#7992 - CGI_10027655 superfamily 245596 53 354 5.47E-139 405.82 cl11394 Glyco_tranf_GTA_type superfamily - - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#7992 - CGI_10027655 superfamily 247085 368 442 3.21E-05 42.4927 cl15820 RICIN superfamily C - "Ricin-type beta-trefoil; Carbohydrate-binding domain formed from presumed gene triplication. The domain is found in a variety of molecules serving diverse functions such as enzymatic activity, inhibitory toxicity and signal transduction. Highly specific ligand binding occurs on exposed surfaces of the compact domain structure." 
Q#7993 - CGI_10027656 superfamily 245201 1 165 1.70E-41 142.376 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#7994 - CGI_10027657 superfamily 241584 175 267 2.61E-05 42.0983 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#7994 - CGI_10027657 superfamily 248016 117 174 0.00111516 36.9247 cl17462 T5orf172 superfamily C - T5orf172 domain; This domain was identified by Iyer and colleagues. 
Q#7999 - CGI_10027662 superfamily 243092 160 406 7.60E-17 78.916 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#8000 - CGI_10027663 superfamily 246925 17 213 3.64E-06 46.1946 cl15309 LRR_RI superfamily - - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." 
Q#8001 - CGI_10027664 superfamily 241611 62 205 0.00349796 37.368 cl00102 PTX superfamily - - "Pentraxins are plasma proteins characterized by their pentameric discoid assembly and their Ca2+ dependent ligand binding, such as Serum amyloid P component (SAP) and C-reactive Protein (CRP), which are cytokine-inducible acute-phase proteins implicated in innate immunity. CRP binds to ligands containing phosphocholine, SAP binds to amyloid fibrils, DNA, chromatin, fibronectin, C4-binding proteins and glycosaminoglycans. "Long" pentraxins have N-terminal extensions to the common pentraxin domain; one group, the neuronal pentraxins, may be involved in synapse formation and remodeling, and they may also be able to form heteromultimers." Q#8002 - CGI_10027665 superfamily 245814 37 98 1.96E-05 44.4023 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#8002 - CGI_10027665 superfamily 216347 1190 1612 3.73E-71 246.294 cl08309 Cu_amine_oxid superfamily - - "Copper amine oxidase, enzyme domain; Copper amine oxidases are a ubiquitous and novel group of quinoenzymes that catalyze the oxidative deamination of primary amines to the corresponding aldehydes, with concomitant reduction of molecular oxygen to hydrogen peroxide. 
The enzymes are dimers of identical 70-90 kDa subunits, each of which contains a single copper ion and a covalently bound cofactor formed by the post-translational modification of a tyrosine side chain to 2,4,5-trihydroxyphenylalanine quinone (TPQ). This family corresponds to the catalytic domain of the enzyme." Q#8002 - CGI_10027665 superfamily 245814 472 543 8.70E-08 51.7373 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#8002 - CGI_10027665 superfamily 245814 558 623 6.27E-05 43.1132 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#8002 - CGI_10027665 superfamily 245814 692 752 0.00383486 37.7204 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. 
The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#8003 - CGI_10027666 superfamily 245814 88 174 4.52E-07 44.3516 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#8003 - CGI_10027666 superfamily 245814 12 70 5.75E-07 43.9348 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#8004 - CGI_10027667 superfamily 245814 179 243 0.000574273 37.5604 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#8004 - CGI_10027667 superfamily 245814 80 146 0.00325511 35.4092 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#8008 - CGI_10027671 superfamily 245599 409 608 8.57E-84 263.772 cl11397 NR_LBD superfamily - - "The ligand binding domain of nuclear receptors, a family of ligand-activated transcription regulators; Ligand-binding domain (LBD) of nuclear receptor (NR): Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions in metazoans, from development, reproduction, to homeostasis and metabolism. 
The superfamily contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. The members of the family include receptors of steroids, thyroid hormone, retinoids, cholesterol by-products, lipids and heme. With few exceptions, NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD)." Q#8008 - CGI_10027671 superfamily 207662 143 220 4.11E-51 171.878 cl02596 NR_DBD_like superfamily - - "DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers; DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. It interacts with a specific DNA site upstream of the target gene and modulates the rate of transcriptional initiation. Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions, from development, reproduction, to homeostasis and metabolism in animals (metazoans). The family contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). Most nuclear receptors bind as homodimers or heterodimers to their target sites, which consist of two hexameric half-sites. Specificity is determined by the half-site sequence, the relative orientation of the half-sites and the number of spacer nucleotides between the half-sites. However, a growing number of nuclear receptors have been reported to bind to DNA as monomers." 
Q#8010 - CGI_10027673 superfamily 243094 81 442 3.83E-115 369.364 cl02569 RasGAP superfamily - - "Ras GTPase Activating Domain; RasGAP functions as an enhancer of the hydrolysis of GTP that is bound to Ras-GTPases. Proteins having a RasGAP domain include p120GAP, IQGAP, Rab5-activating protein 6, and Neurofibromin, among others. Although the Rho (Ras homolog) GTPases are most closely related to members of the Ras family, RhoGAP and RasGAP exhibit no similarity at their amino acid sequence level. RasGTPases function as molecular switches in a large number of signaling pathways. They are in the on state when bound to GTP, and in the off state when bound to GDP. The RasGAP domain speeds up the hydrolysis of GTP in Ras-like proteins acting as a negative regulator." Q#8010 - CGI_10027673 superfamily 128469 1595 1696 2.79E-13 69.0212 cl17971 VPS9 superfamily - - Domain present in VPS9; Domain present in yeast vacuolar sorting protein 9 and other proteins. Q#8011 - CGI_10027674 superfamily 243039 367 549 3.77E-80 251.791 cl02446 MATH superfamily - - "MATH (meprin and TRAF-C homology) domain; an independent folding unit with an eight-stranded beta-sandwich structure found in meprins, TRAFs and other proteins. Meprins comprise a class of extracellular metalloproteases which are anchored to the membrane and are capable of cleaving growth factors, extracellular matrix proteins, and biologically active peptides. TRAF molecules serve as adapter proteins that link cell surface receptors of the Tumor Necrosis Factor and Interleukin-1/Toll-like families to downstream kinase cascades, which results in the activation of transcription factors and the regulation of cell survival, proliferation and stress responses in the immune and inflammatory systems. Other members include the ubiquitin ligases, TRIM37 and SPOP, and the ubiquitin-specific proteases, HAUSP and Ubp21p. A large number of uncharacterized members mostly from lineage-specific expansions in C. 
elegans and rice contain MATH and BTB domains, similar to SPOP. The MATH domain has been shown to bind peptide/protein substrates in TRAFs and HAUSP. It is possible that the MATH domain in other members of this superfamily also interacts with various protein substrates. The TRAF domain may also be involved in the trimerization of TRAFs. Based on homology, it is postulated that the MATH domain in meprins may be involved in its tetramer assembly and that the MATH domain, in general, may take part in diverse modular arrangements defined by adjacent multimerization domains." Q#8011 - CGI_10027674 superfamily 190233 187 226 1.32E-05 43.2118 cl08341 zf-TRAF superfamily C - TRAF-type zinc finger; TRAF-type zinc finger. Q#8011 - CGI_10027674 superfamily 247792 43 82 0.00178233 36.7182 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#8012 - CGI_10027675 superfamily 243072 19 100 1.70E-14 64.7122 cl02529 ANK superfamily C - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#8013 - CGI_10027676 superfamily 241846 248 319 4.30E-20 87.851 cl00409 tgt superfamily N - queuine tRNA-ribosyltransferase; Provisional Q#8013 - CGI_10027676 superfamily 241846 86 143 0.00404403 37.0064 cl00409 tgt superfamily NC - queuine tRNA-ribosyltransferase; Provisional Q#8014 - CGI_10027677 superfamily 247723 147 221 4.41E-34 119.68 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." 
Q#8015 - CGI_10027678 superfamily 243092 823 1074 2.41E-44 163.275 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#8015 - CGI_10027678 superfamily 128914 44 95 8.05E-10 56.4254 cl15352 CTLH superfamily - - C-terminal to LisH motif; Alpha-helical motif of unknown function. Q#8016 - CGI_10027679 superfamily 243082 376 604 1.99E-43 158.22 cl02553 Peptidase_C19 superfamily - - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. 
The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." Q#8016 - CGI_10027679 superfamily 245879 735 810 3.31E-17 78.5541 cl12116 DUSP superfamily - - DUSP domain; The DUSP (domain present in ubiquitin-specific protease) domain is found at the N-terminus of Ubiquitin-specific proteases. The structure of this domain has been solved. Its tripod-like structure consists of a 3-fold alpha-helical bundle supporting a triple-stranded anti-parallel beta-sheet. Q#8016 - CGI_10027679 superfamily 245220 32 92 1.63E-16 75.9042 cl09957 zf-UBP superfamily - - Zn-finger in ubiquitin-hydrolases and other protein; Zn-finger in ubiquitin-hydrolases and other protein. Q#8016 - CGI_10027679 superfamily 245879 631 693 8.40E-11 59.7935 cl12116 DUSP superfamily - - DUSP domain; The DUSP (domain present in ubiquitin-specific protease) domain is found at the N-terminus of Ubiquitin-specific proteases. The structure of this domain has been solved. Its tripod-like structure consists of a 3-fold alpha-helical bundle supporting a triple-stranded anti-parallel beta-sheet. Q#8016 - CGI_10027679 superfamily 243082 174 209 7.48E-05 44.4986 cl02553 Peptidase_C19 superfamily NC - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." 
Q#8017 - CGI_10027680 superfamily 241574 482 707 1.43E-93 298.731 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#8017 - CGI_10027680 superfamily 241574 771 1001 1.55E-40 150.429 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." 
Q#8017 - CGI_10027680 superfamily 238012 167 205 0.000368608 39.645 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#8018 - CGI_10027681 superfamily 241568 2327 2384 1.30E-07 51.3096 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#8018 - CGI_10027681 superfamily 241568 2131 2183 1.94E-07 50.9244 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. 
Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#8018 - CGI_10027681 superfamily 241568 2277 2323 0.000828701 40.1388 cl00043 CCP superfamily C - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#8018 - CGI_10027681 superfamily 243061 1392 1493 1.52E-38 142.096 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#8018 - CGI_10027681 superfamily 243061 291 392 1.66E-38 142.096 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#8018 - CGI_10027681 superfamily 243061 513 613 1.95E-38 141.711 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#8018 - CGI_10027681 superfamily 243061 1282 1384 6.01E-38 140.555 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. 
These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#8018 - CGI_10027681 superfamily 243061 1063 1161 8.41E-38 140.17 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#8018 - CGI_10027681 superfamily 243061 621 722 1.97E-37 139.014 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#8018 - CGI_10027681 superfamily 243061 182 282 3.49E-37 138.244 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#8018 - CGI_10027681 superfamily 243061 1737 1838 3.83E-37 138.244 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#8018 - CGI_10027681 superfamily 243061 952 1052 4.72E-37 137.859 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#8018 - CGI_10027681 superfamily 243061 1499 1601 1.51E-33 127.843 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. 
Q#8018 - CGI_10027681 superfamily 243061 1171 1274 5.58E-33 126.303 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#8018 - CGI_10027681 superfamily 243061 1607 1706 5.38E-32 123.221 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#8018 - CGI_10027681 superfamily 243061 400 503 5.69E-32 123.221 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#8018 - CGI_10027681 superfamily 243061 839 943 2.91E-31 121.295 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#8018 - CGI_10027681 superfamily 243061 728 829 5.16E-31 120.525 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#8018 - CGI_10027681 superfamily 243061 73 174 6.91E-30 117.058 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. 
Q#8018 - CGI_10027681 superfamily 111397 2193 2272 2.26E-17 80.463 cl03620 HYR superfamily - - "HYR domain; This domain is known as the HYR (Hyalin Repeat) domain, after the protein hyalin that is composed exclusively of this repeat. This domain probably corresponds to a new superfamily in the immunoglobulin fold. The function of this domain is uncertain it may be involved in cell adhesion." Q#8018 - CGI_10027681 superfamily 243061 11 64 2.13E-14 72.3746 cl02509 SRCR superfamily N - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#8018 - CGI_10027681 superfamily 214531 1999 2041 1.86E-10 59.1524 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#8018 - CGI_10027681 superfamily 215683 1978 2015 8.11E-08 51.7871 cl18339 Ldl_recept_b superfamily - - Low-density lipoprotein receptor repeat class B; This domain is also known as the YWTD motif after the most conserved region of the repeat. The YWTD repeat is found in multiple tandem repeats and has been predicted to form a beta-propeller structure. Q#8018 - CGI_10027681 superfamily 214531 1868 1910 0.000108978 42.5889 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#8018 - CGI_10027681 superfamily 214531 1922 1951 0.00132555 39.1221 cl18310 LY superfamily N - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. 
Also present in a variety of molecules similar to gp300/megalin. Q#8022 - CGI_10027685 superfamily 247724 16 87 7.02E-15 70.6323 cl17170 Ras_like_GTPase superfamily NC - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#8024 - CGI_10027687 superfamily 241868 1178 1338 2.71E-52 183.476 cl00447 Nudix_Hydrolase superfamily - - "Nudix hydrolase is a superfamily of enzymes found in all three kingdoms of life, and it catalyzes the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+ for their activity. Members of this family are recognized by a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which forms a structural motif that functions as a metal binding and catalytic site. 
Substrates of nudix hydrolase include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance and "house-cleaning" enzymes. Substrate specificity is used to define child families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. This superfamily consists of at least nine families: IPP (isopentenyl diphosphate) isomerase, ADP ribose pyrophosphatase, mutT pyrophosphohydrolase, coenzyme-A pyrophosphatase, MTH1-7,8-dihydro-8-oxoguanine-triphosphatase, diadenosine tetraphosphate hydrolase, NADH pyrophosphatase, GDP-mannose hydrolase and the c-terminal portion of the mutY adenine glycosylase." Q#8025 - CGI_10027688 superfamily 241696 123 438 6.47E-159 455.166 cl00218 Glyco_hydrolase_16 superfamily - - "glycosyl hydrolase family 16; The O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A glycosyl hydrolase classification system based on sequence similarity has led to the definition of more than 95 different families including glycosyl hydrolase family 16. 
Family 16 includes lichenase, xyloglucan endotransglycosylase (XET), beta-agarase, kappa-carrageenase, endo-beta-1,3-glucanase, endo-beta-1,3-1,4-glucanase, and endo-beta-galactosidase, all of which have a conserved jelly roll fold with a deep active site channel harboring the catalytic residues." Q#8026 - CGI_10027689 superfamily 241696 96 416 4.08E-165 470.189 cl00218 Glyco_hydrolase_16 superfamily - - "glycosyl hydrolase family 16; The O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A glycosyl hydrolase classification system based on sequence similarity has led to the definition of more than 95 different families including glycosyl hydrolase family 16. Family 16 includes lichenase, xyloglucan endotransglycosylase (XET), beta-agarase, kappa-carrageenase, endo-beta-1,3-glucanase, endo-beta-1,3-1,4-glucanase, and endo-beta-galactosidase, all of which have a conserved jelly roll fold with a deep active site channel harboring the catalytic residues." Q#8029 - CGI_10027692 superfamily 149426 344 493 1.39E-09 55.8653 cl18038 SEFIR superfamily - - "SEFIR domain; This family comprises IL17 receptors (IL17Rs) and SEF proteins. The latter are feedback inhibitors of FGF signalling and are also thought to be receptors. Due to its similarity to the TIR domain (pfam01582), the SEFIR region is thought to be involved in homotypic interactions with other SEFIR/TIR-domain-containing proteins. Thus, SEFs and IL17Rs may be involved in TOLL/IL1R-like signalling pathways." Q#8030 - CGI_10027693 superfamily 247755 31 239 7.24E-63 209.435 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. 
The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#8030 - CGI_10027693 superfamily 247789 372 578 1.17E-31 122.367 cl17235 ABC2_membrane superfamily - - ABC-2 type transporter; ABC-2 type transporter. Q#8032 - CGI_10027695 superfamily 238076 33 160 5.85E-79 244.251 cl18938 PAX superfamily - - Paired Box domain Q#8032 - CGI_10027695 superfamily 241599 269 324 2.89E-22 89.9952 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#8034 - CGI_10027697 superfamily 247057 200 260 5.22E-25 97.3757 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." 
Q#8034 - CGI_10027697 superfamily 247057 129 189 2.61E-22 90.05 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#8034 - CGI_10027697 superfamily 247725 351 399 1.04E-18 82.2987 cl17171 PH-like superfamily C - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#8034 - CGI_10027697 superfamily 247725 393 418 2.58E-09 54.9495 cl17171 PH-like superfamily N - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. 
They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#8035 - CGI_10027698 superfamily 243072 54 177 9.82E-36 133.278 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#8035 - CGI_10027698 superfamily 243072 191 220 4.53E-05 42.156 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#8036 - CGI_10027699 superfamily 242542 51 235 8.20E-37 129.658 cl01505 YhhN superfamily - - "YhhN-like protein; The members of this family are similar to the hypothetical protein yhhN expressed by E. coli. Many of the members of this family are annotated as being possible transmembrane proteins, and in fact they all have a high proportion of hydrophobic residues." Q#8037 - CGI_10015249 superfamily 217293 1 173 7.08E-26 103.096 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. 
This domain forms a pentameric arrangement in the known structure. Q#8037 - CGI_10015249 superfamily 202474 180 214 0.00393738 36.8629 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#8038 - CGI_10015250 superfamily 217293 24 224 4.72E-36 132.371 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#8038 - CGI_10015250 superfamily 202474 249 327 1.56E-07 50.3449 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#8039 - CGI_10015251 superfamily 191444 73 144 2.97E-11 55.7933 cl05558 IL17 superfamily - - Interleukin-17; IL-17 is a potent proinflammatory cytokine produced by activated memory T cells. The IL-17 family is thought to represent a distinct signaling system that appears to have been highly conserved across vertebrate evolution. Q#8040 - CGI_10015252 superfamily 247787 1 186 1.74E-102 297.573 cl17233 RecA-like_NTPases superfamily N - "RecA-like NTPases. This family includes the NTP binding domain of F1 and V1 H+ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. This group also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion." Q#8041 - CGI_10015253 superfamily 247765 15 44 3.42E-19 76.9507 cl17211 RecA superfamily C - recA bacterial DNA recombination protein; RecA is a DNA-dependent ATPase and functions in DNA repair systems. RecA protein catalyzes an ATP-dependent DNA strand-exchange reaction that is the central step in the repair of dsDNA breaks by homologous recombination. 
Q#8042 - CGI_10015254 superfamily 246680 261 312 0.00561802 34.288 cl14633 DD_superfamily superfamily C - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#8044 - CGI_10015256 superfamily 245456 947 1262 1.07E-151 463.335 cl10970 AP_MHD_Cterm superfamily - - "C-terminal domain of adaptor protein (AP) complexes medium mu subunits and its homologs (MHD); This family corresponds to the C-terminal domain of heterotetrameric AP complexes medium mu subunits and its homologs existing in monomeric stonins, delta-subunit of the heteroheptameric coat protein I (delta-COPI), a protein encoded by a pro-death gene referred as MuD (also known as MUDENG, mu-2 related death-inducing gene), an endocytic adaptor syp1, the mammalian FCH domain only proteins (FCHo1/2), SH3-containing GRB2-like protein 3-interacting protein 1 (SGIP1), and related proteins. AP complexes participate in the formation of intracellular coated transport vesicles and select cargo molecules for incorporation into the coated vesicles in the late secretory and endocytic pathways. Stonins have been characterized as clathrin-dependent AP-2 mu chain related factors and may act as cargo-specific sorting adaptors in endocytosis. 
Coat protein complex I (COPI)-coated vesicles function in the early secretory pathway. They mediate the retrograde transport from the Golgi to the ER, and intra-Golgi transport. MuD is distantly related to the C-terminal domain of mu2 subunit of AP-2. It is able to induce cell death by itself and plays an important role in cell death in various tissues. Syp1 represents a novel type of endocytic adaptor protein that participates in endocytosis, promotes vesicle tubulation, and contributes to cell polarity and stress responses. It shares the same domain architecture with its two ubiquitously expressed mammalian counterparts, FCHo1/2, which represent key initial proteins ultimately controlling cellular nutrient uptake, receptor regulation, and synaptic vesicle retrieval. They bind specifically to the plasma membrane and recruit the scaffold proteins eps15 and intersectin, which subsequently engage the adaptor complex AP2 and clathrin, leading to coated vesicle formation. Another mammalian neuronal-specific protein SGIP1 does have a C-terminal MHD and has been classified into this family as well. It is an endophilin-interacting protein that plays an obligatory role in the regulation of energy homeostasis. It is also involved in clathrin-mediated endocytosis by interacting with phospholipids and eps15." Q#8045 - CGI_10015257 superfamily 245201 216 436 3.94E-14 72.2693 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." 
Q#8045 - CGI_10015257 superfamily 247792 110 151 0.00611552 35.9478 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#8045 - CGI_10015257 superfamily 247792 18 59 0.00611552 35.9478 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#8046 - CGI_10015258 superfamily 247805 566 709 1.91E-16 78.5332 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 
Q#8046 - CGI_10015258 superfamily 247905 821 945 7.06E-06 46.0769 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#8046 - CGI_10015258 superfamily 243778 1032 1122 1.64E-34 129.266 cl04503 HA2 superfamily - - "Helicase associated domain (HA2); This presumed domain is about 90 amino acid residues in length. It is found in a diverse set of RNA helicases. Its function is unknown, however it seems likely to be involved in nucleic acid binding." Q#8046 - CGI_10015258 superfamily 219532 1161 1308 1.96E-21 91.9922 cl06657 OB_NTP_bind superfamily - - "Oligonucleotide/oligosaccharide-binding (OB)-fold; This family is found towards the C-terminus of the DEAD-box helicases (pfam00270). In these helicases it is apparently always found in association with pfam04408. There do seem to be a couple of instances where it occurs by itself - . The structure PDB:3i4u adopts an OB-fold. This C-terminal domain of the yeast helicase contains an oligonucleotide/oligosaccharide-binding (OB)-fold which seems to be placed at the entrance of the putative nucleic acid cavity. It also constitutes the binding site for the G-patch-containing domain of Pfa1p. 
When found on DEAH/RHA helicases, this domain is central to the regulation of the helicase activity through its binding of both RNA and G-patch domain proteins." Q#8046 - CGI_10015258 superfamily 245716 304 324 0.000528028 39.5349 cl11592 zf-CCCH superfamily - - Zinc finger C-x8-C-x5-C-x3-H type (and similar); Zinc finger C-x8-C-x5-C-x3-H type (and similar). Q#8047 - CGI_10015259 superfamily 217380 62 342 1.91E-45 162.879 cl18406 TTL superfamily - - "Tubulin-tyrosine ligase family; Tubulins and microtubules are subjected to several post-translational modifications of which the reversible detyrosination/tyrosination of the carboxy-terminal end of most alpha-tubulins has been extensively analysed. This modification cycle involves a specific carboxypeptidase and the activity of the tubulin-tyrosine ligase (TTL). The true physiological function of TTL has so far not been established. Tubulin-tyrosine ligase (TTL) catalyzes the ATP-dependent post-translational addition of a tyrosine to the carboxy terminal end of detyrosinated alpha-tubulin. In normally cycling cells, the tyrosinated form of tubulin predominates. However, in breast cancer cells, the detyrosinated form frequently predominates, with a correlation to tumour aggressiveness. On the other hand, 3-nitrotyrosine has been shown to be incorporated, by TTL, into the carboxy terminal end of detyrosinated alpha-tubulin. This reaction is not reversible by the carboxypeptidase enzyme. Cells cultured in 3-nitrotyrosine rich medium showed evidence of altered microtubule structure and function, including altered cell morphology, epithelial barrier dysfunction, and apoptosis. Bacterial homologs of TTL are predicted to form peptide tags. Some of these are fused to a 2-oxoglutarate Fe(II)-dependent dioxygenase domain." Q#8047 - CGI_10015259 superfamily 248097 447 579 4.05E-19 83.8538 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. 
Q#8048 - CGI_10015260 superfamily 247986 14 103 7.52E-10 56.2274 cl17432 PBPb superfamily C - "Bacterial periplasmic transport systems use membrane-bound complexes and substrate-bound, membrane-associated, periplasmic binding proteins (PBPs) to transport a wide variety of substrates, such as, amino acids, peptides, sugars, vitamins and inorganic ions. PBPs have two cell-membrane translocation functions: bind substrate, and interact with the membrane bound complex. A diverse group of periplasmic transport receptors for lysine/arginine/ornithine (LAO), glutamine, histidine, sulfate, phosphate, molybdate, and methanol are included in the PBPb CD." Q#8049 - CGI_10015261 superfamily 241749 83 217 1.70E-18 78.5817 cl00280 globin_like superfamily - - superfamily containing globins and truncated hemoglobins Q#8060 - CGI_10015272 superfamily 241563 61 97 1.31E-05 44.0072 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#8062 - CGI_10015274 superfamily 245882 25 407 1.09E-180 516.844 cl12119 Alpha_L_fucos superfamily - - Alpha-L-fucosidase; Alpha-L-fucosidase. Q#8065 - CGI_10016677 superfamily 241563 60 99 9.45E-06 43.2368 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#8065 - CGI_10016677 superfamily 243092 301 452 0.00871822 36.9292 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#8068 - CGI_10016680 superfamily 241563 13 41 0.00115205 37.3167 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#8070 - CGI_10016682 superfamily 218231 248 367 4.44E-25 99.6602 cl04708 ELMO_CED12 superfamily C - "ELMO/CED-12 family; This family represents a conserved domain which is found in a number of eukaryotic proteins including CED-12, ELMO I and ELMO II. ELMO1 is a component of signalling pathways that regulate phagocytosis and cell migration and is the mammalian orthologue of the C. 
elegans gene, ced-12. CED-12 is required for the engulfment of dying cells and cell migration. In mammalian cells, ELMO1 interacts with Dock180 as part of the CrkII/Dock180/Rac pathway responsible for phagocytosis and cell migration. ELMO1 is ubiquitously expressed, although its expression is highest in the spleen, an organ rich in immune cells. ELMO1 has a PH domain and a polyproline sequence motif at its C terminus which are not present in this alignment." Q#8071 - CGI_10016683 superfamily 218028 177 311 6.73E-10 57.7075 cl04479 AAA_4 superfamily - - "Divergent AAA domain; This family is related to the pfam00004 family, and presumably has the same function (ATP-binding)." Q#8073 - CGI_10016685 superfamily 110440 216 242 0.00739465 33.5353 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#8074 - CGI_10016686 superfamily 245819 56 199 2.69E-49 160.052 cl11967 Nucleotidyl_cyc_III superfamily - - "Class III nucleotidyl cyclases; Class III nucleotidyl cyclases are the largest, most diverse group of nucleotidyl cyclases (NC's) containing prokaryotic and eukaryotic proteins. They can be divided into two major groups; the mononucleotidyl cyclases (MNC's) and the diguanylate cyclases (DGC's). The MNC's, which include the adenylate cyclases (AC's) and the guanylate cyclases (GC's), have a conserved cyclase homology domain (CHD), while the DGC's have a conserved GGDEF domain, named after a conserved motif within this subgroup. 
Their products, cyclic guanylyl and adenylyl nucleotides, are second messengers that play important roles in eukaryotic signal transduction and prokaryotic sensory pathways." Q#8074 - CGI_10016686 superfamily 219526 1 42 2.85E-05 41.8359 cl06648 HNOBA superfamily N - "Heme NO binding associated; The HNOBA domain is found associated with the HNOB domain and pfam00211 in soluble cyclases and signalling proteins. The HNOB domain is predicted to function as a heme-dependent sensor for gaseous ligands, and transduce diverse downstream signals, in both bacteria and animals." Q#8075 - CGI_10016687 superfamily 245864 31 490 4.32E-102 316.528 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." 
Q#8076 - CGI_10016688 superfamily 245864 31 459 2.18E-99 308.438 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#8078 - CGI_10016690 superfamily 243072 193 325 6.02E-25 99.3802 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#8078 - CGI_10016690 superfamily 243072 49 219 3.14E-23 94.7578 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. 
The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#8078 - CGI_10016690 superfamily 243073 440 479 7.78E-09 51.7021 cl02533 SOCS superfamily - - "SOCS (suppressors of cytokine signaling) box. The SOCS box is found in the C-terminal region of CIS/SOCS family proteins (in combination with a SH2 domain), ASBs (ankyrin repeat-containing proteins with a SOCS box), SSBs (SPRY domain-containing proteins with a SOCS box), and WSBs (WD40 repeat-containing proteins with a SOCS box), as well as, other miscellaneous proteins. The function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions." Q#8080 - CGI_10016692 superfamily 241583 188 311 2.27E-25 105.398 cl00064 ZnMc superfamily N - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#8080 - CGI_10016692 superfamily 243060 499 586 0.00123788 38.1288 cl02507 SEA superfamily - - "SEA domain; Domain found in Sea urchin sperm protein, Enterokinase, Agrin (SEA). Proposed function of regulating or binding carbohydrate side chains. Recently a proteolytic activity has been shown for a SEA domain." Q#8081 - CGI_10016693 superfamily 241583 551 681 2.35E-25 105.398 cl00064 ZnMc superfamily N - "Zinc-dependent metalloprotease. 
This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#8083 - CGI_10016695 superfamily 241583 180 411 1.45E-24 102.702 cl00064 ZnMc superfamily - - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#8083 - CGI_10016695 superfamily 243060 596 656 6.82E-05 41.9808 cl02507 SEA superfamily N - "SEA domain; Domain found in Sea urchin sperm protein, Enterokinase, Agrin (SEA). Proposed function of regulating or binding carbohydrate side chains. Recently a proteolytic activity has been shown for a SEA domain." Q#8083 - CGI_10016695 superfamily 216572 8 122 0.0043289 36.8691 cl03265 Pep_M12B_propep superfamily - - Reprolysin family propeptide; This region is the propeptide for members of peptidase family M12B. The propeptide contains a sequence motif similar to the "cysteine switch" of the matrixins. This motif is found at the C terminus of the alignment but is not well aligned. Q#8085 - CGI_10016697 superfamily 247724 9 171 5.20E-60 188.503 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#8087 - CGI_10016699 superfamily 207794 2 194 7.53E-59 193.962 cl02948 GH20_hexosaminidase superfamily N - "Beta-N-acetylhexosaminidases of glycosyl hydrolase family 20 (GH20) catalyze the removal of beta-1,4-linked N-acetyl-D-hexosamine residues from the non-reducing ends of N-acetyl-beta-D-hexosaminides including N-acetylglucosides and N-acetylgalactosides. These enzymes are broadly distributed in microorganisms, plants and animals, and play roles in various key physiological and pathological processes. These processes include cell structural integrity, energy storage, cellular signaling, fertilization, pathogen defense, viral penetration, the development of carcinomas, inflammatory events and lysosomal storage disorders. The GH20 enzymes include the eukaryotic beta-N-acetylhexosaminidases A and B, the bacterial chitobiases, dispersin B, and lacto-N-biosidase. The GH20 hexosaminidases are thought to act via a catalytic mechanism in which the catalytic nucleophile is not provided by the solvent or the enzyme, but by the substrate itself." Q#8087 - CGI_10016699 superfamily 191182 277 341 0.000523538 38.051 cl04917 Nsp1_C superfamily C - Nsp1-like C-terminal region; This family probably forms a coiled-coil. This important region of Nsp1 is involved in binding Nup82. 
Q#8089 - CGI_10016701 superfamily 247725 131 216 1.34E-42 141.66 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#8089 - CGI_10016701 superfamily 246908 4 94 1.08E-32 115.653 cl15255 SH2 superfamily - - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." Q#8090 - CGI_10016702 superfamily 246680 91 168 1.49E-14 65.0494 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. 
They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#8091 - CGI_10016703 superfamily 243035 31 151 1.39E-24 93.0681 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. 
C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#8094 - CGI_10016730 superfamily 241563 59 98 2.85E-06 44.7776 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#8094 - CGI_10016730 superfamily 191851 98 209 0.000546105 39.9207 cl06708 DUF1640 superfamily - - Protein of unknown function (DUF1640); This family consists of sequences derived from hypothetical eukaryotic proteins. A region approximately 100 residues in length is featured. Q#8094 - CGI_10016730 superfamily 110440 476 502 0.00175327 36.6169 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. 
The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#8094 - CGI_10016730 superfamily 110440 517 544 0.00859854 34.3057 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#8096 - CGI_10016732 superfamily 243100 176 229 8.92E-17 71.4909 cl02576 B_zip1 superfamily - - "basic leucine zipper DNA-binding and multimerization region of GCN4 and related proteins; Basic leucine zipper (bZIP) transcription factors act in networks of homo- and hetero-dimers in the regulation in a diverse set of cellular pathways. Classical leucine zippers have alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization creates a pair of basic regions that bind DNA and undergo conformational change. GCN4 was identified in Saccharomyces cerevisiae from mutations in a deficiency in activation with the general amino acid control pathway. GCN4 encodes a trans-activator of amino acid biosynthetic genes containing 2 acidic activation domains and a C-terminal bZIP domain, comprised of a basic alpha-helical DNA-binding region and a coiled-coil dimerization region." 
Q#8097 - CGI_10016733 superfamily 204198 35 118 2.32E-17 78.0154 cl07820 DUF1981 superfamily - - Domain of unknown function (DUF1981); Members of this family of functionally uncharacterized domains are found in various plant and yeast protein transport proteins. Q#8098 - CGI_10016734 superfamily 220692 41 354 1.75E-22 95.3489 cl18570 7TM_GPCR_Srw superfamily - - Serpentine type 7TM GPCR chemoreceptor Srw; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srw is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. The genes encoding Srw do not appear to be under as strong an adaptive evolutionary pressure as those of Srz. Q#8103 - CGI_10016739 superfamily 241563 73 110 7.63E-05 40.5404 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#8103 - CGI_10016739 superfamily 128778 117 208 0.000392441 39.1703 cl17972 BBC superfamily C - B-Box C-terminal domain; Coiled coil region C-terminal to (some) B-Box domains Q#8104 - CGI_10016740 superfamily 128778 63 172 2.70E-08 51.8819 cl17972 BBC superfamily C - B-Box C-terminal domain; Coiled coil region C-terminal to (some) B-Box domains Q#8104 - CGI_10016740 superfamily 241563 19 56 0.000183617 39.3848 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#8105 - CGI_10016741 superfamily 241810 157 211 1.88E-28 103.749 cl00354 KOW superfamily - - "KOW: an acronym for the authors' surnames (Kyrpides, Ouzounis and Woese); KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. The KOW motif contains an invariant glycine residue and comprises alternating blocks of hydrophilic and hydrophobic residues." Q#8105 - CGI_10016741 superfamily 189761 76 152 8.02E-36 123.376 cl03010 Ribosomal_S4e superfamily - - Ribosomal family S4e; Ribosomal family S4e. Q#8105 - CGI_10016741 superfamily 191938 1 22 4.42E-06 42.2606 cl06900 RS4NT superfamily N - RS4NT (NUC023) domain; This is the N-terminal domain of Ribosomal S4 / S4e proteins. This domain is associated with S4 and KOW domains. Q#8106 - CGI_10016742 superfamily 222150 213 239 1.06E-05 43.5345 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#8106 - CGI_10016742 superfamily 222150 243 270 0.000108897 40.4529 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#8106 - CGI_10016742 superfamily 222150 154 181 0.00734424 35.0601 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#8107 - CGI_10016743 superfamily 244875 99 194 3.04E-07 48.1668 cl08255 Na_K-ATPase superfamily NC - Sodium / potassium ATPase beta chain; Sodium / potassium ATPase beta chain. Q#8114 - CGI_10011342 superfamily 248097 24 159 4.66E-14 64.5938 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#8123 - CGI_10020879 superfamily 241564 74 141 4.46E-26 99.2623 cl00035 BIR superfamily - - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. 
In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." Q#8123 - CGI_10020879 superfamily 247792 312 351 0.000449614 37.4252 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#8124 - CGI_10020880 superfamily 241600 35 134 5.48E-41 137.372 cl00085 FReD superfamily C - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#8125 - CGI_10020881 superfamily 241600 370 566 2.35E-83 261.406 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. 
Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#8129 - CGI_10020885 superfamily 219542 48 157 2.57E-40 145.847 cl18517 Cu-oxidase_3 superfamily - - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. Q#8129 - CGI_10020885 superfamily 215896 167 349 3.47E-20 88.8912 cl18351 Cu-oxidase superfamily - - Multicopper oxidase; Many of the proteins in this family contain multiple similar copies of this plastocyanin-like domain. Q#8129 - CGI_10020885 superfamily 219541 820 872 2.40E-14 71.7307 cl18516 Cu-oxidase_2 superfamily N - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. Q#8129 - CGI_10020885 superfamily 215896 566 635 8.08E-07 48.8304 cl18351 Cu-oxidase superfamily N - Multicopper oxidase; Many of the proteins in this family contain multiple similar copies of this plastocyanin-like domain. Q#8129 - CGI_10020885 superfamily 219541 534 562 8.81E-05 42.0703 cl18516 Cu-oxidase_2 superfamily NC - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. 
Q#8130 - CGI_10020886 superfamily 245847 42 86 6.57E-06 41.003 cl12042 FA58C superfamily C - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#8132 - CGI_10020888 superfamily 241574 94 300 4.35E-79 248.27 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#8134 - CGI_10020890 superfamily 246598 1273 1523 1.17E-171 517.565 cl13996 MPN superfamily - - "Mpr1p, Pad1p N-terminal (MPN) domains; MPN (also known as Mov34, PAD-1, JAMM, JAB, MPN+) domains are found in the N-terminal termini of proteins with a variety of functions; they are components of the proteasome regulatory subunits, the signalosome (CSN), eukaryotic translation initiation factor 3 (eIF3) complexes, and regulators of transcription factors. These domains are isopeptidases that release ubiquitin from ubiquitinated proteins (thus having deubiquitinating (DUB) activity) that are tagged for degradation. Catalytically active MPN domains contain a metalloprotease signature known as the JAB1/MPN/Mov34 metalloenzyme (JAMM) motif. For example, Rpn11 (also known as POH1 or PSMD14), a subunit of the 19S proteasome lid is involved in the ATP-dependent degradation of ubiquitinated proteins, contains the conserved JAMM motif involved in zinc ion coordination. 
Poh1 is a regulator of c-Jun, an important regulator of cell proliferation, differentiation, survival and death. JAB1 is a component of the COP9 signalosome (CSN), a regulatory particle of the ubiquitin (Ub)/26S proteasome system occurring in all eukaryotic cells; it cleaves the ubiquitin-like protein NEDD8 from the cullin subunit of the SCF (Skp1, Cullins, F-box proteins) family of E3 ubiquitin ligases. AMSH (associated molecule with the SH3 domain of STAM, also known as STAMBP), a member of JAMM/MPN+ deubiquitinases (DUBs), specifically cleaves Lys 63-linked polyubiquitin (poly-Ub) chains, thus facilitating the recycling and subsequent trafficking of receptors to the cell surface. Similarly, BRCC36, part of the nuclear complex that includes BRCA1 protein and is targeted to DNA damage foci after irradiation, specifically disassembles K63-linked polyUb. BRCC36 is aberrantly expressed in sporadic breast tumors, indicative of a potential role in the pathogenesis of the disease. Some variants of the JAB1/MPN domains lack key residues in their JAMM motif and are unable to coordinate a metal ion. Comparisons of key catalytic and metal binding residues explain why the MPN-containing proteins Mov34/PSMD7, Rpn8, CSN6, Prp8p, and the translation initiation factor 3 subunits f (p47) and h (p40) do not show catalytic isopeptidase activity. It has been proposed that the MPN domain in these proteins has a primarily structural function." Q#8134 - CGI_10020890 superfamily 152569 966 1196 2.69E-157 478.424 cl13557 PRP8_domainIV superfamily - - "PRP8 domain IV core; This domain is found in eukaryotes, and is about 20 amino acids in length. It is found associated with pfam10597, pfam10596, pfam10598, pfam08083, pfam08082, pfam01398, pfam08084. There is a conserved LILR sequence motif. The domain is a selenomethionine domain in a subunit of the spliceosome. The function of PRP8 domain IV is believed to be interaction with the splicosomal core." 
Q#8134 - CGI_10020890 superfamily 151125 648 807 1.10E-107 339.066 cl18045 U6-snRNA_bdg superfamily - - "U6-snRNA interacting domain of PrP8; This domain incorporates the interacting site for the U6-snRNA as part of the U4/U6.U5 tri-snRNPs complex of the spliceosome, and is the prime candidate for the role of cofactor for the spliceosome's RNA core. The essential spliceosomal protein Prp8 interacts with U5 and U6 snRNAs and with specific pre-mRNA sequences that participate in catalysis. This close association with crucial RNA sequences, together with extensive genetic evidence, suggests that Prp8 could directly affect the function of the catalytic core, perhaps acting as a splicing cofactor." Q#8134 - CGI_10020890 superfamily 119117 414 549 3.05E-88 284.205 cl11217 U5_2-snRNA_bdg superfamily - - "U5-snRNA binding site 2 of PrP8; The essential spliceosomal protein Prp8 interacts with U5 and U6 snRNAs and with specific pre-mRNA sequences that participate in catalysis. This close association with crucial RNA sequences, together with extensive genetic evidence, suggests that Prp8 could directly affect the function of the catalytic core, perhaps acting as a splicing cofactor." Q#8134 - CGI_10020890 superfamily 192639 192 285 1.60E-40 146.619 cl11218 RRM_4 superfamily - - "RNA recognition motif of the spliceosomal PrP8; The large RNA-protein complex of the spliceosome catalyzes pre-mRNA splicing. One of the most conserved core proteins is PrP8 which occupies a central position in the catalytic core of the spliceosome, and has been implicated in several crucial molecular rearrangements that occur there, and has recently come under the spotlight for its role in the inherited human disease, Retinitis Pigmentosa. The RNA-recognition motif of PrP8 is highly conserved and provides a possible RNA binding centre for the 5-prime SS, BP, or 3-prime SS of pre-mRNA which are known to contact with Prp8. The most conserved regions of an RRM are defined as the RNP1 and RNP2 sequences. 
Recognition of RNA targets can also be modulated by a number of other factors, most notably the two loops beta1-alpha1, beta2-beta3 and the amino acid residues C-terminal to the RNP2 domain." Q#8135 - CGI_10020891 superfamily 243116 713 1078 0 645.143 cl02626 DNA_pol_A superfamily - - "Family A polymerase primarily fills DNA gaps that arise during DNA repair, recombination and replication; DNA polymerase family A, 5'-3' polymerase domain. Family A polymerase functions primarily to fill DNA gaps that arise during DNA repair, recombination and replication. DNA-dependent DNA polymerases can be classified into six main groups based upon phylogenetic relationships with E. coli polymerase I (classA), E. coli polymerase II (class B), E.coli polymerase III (class C), euryarchaeota polymerase II (class D), human polymerase beta (class X), E. coli UmuC/DinB and eukaryotic RAP 30/Xeroderma pigmentosum variant (class Y). Family A polymerases are found primarily in organisms related to prokaryotes and include prokaryotic DNA polymerase I, mitochondrial polymerase gamma, and several bacteriophage polymerases including those from odd-numbered phage (T3, T5, and T7). Prokaryotic polymerase I (pol I) has two functional domains located on the same polypeptide; a 5'-3' polymerase and a 5'-3' exonuclease. Pol I uses its 5' nuclease activity to remove the ribonucleotide portion of newly synthesized Okazaki fragments and the DNA polymerase activity to fill in the resulting gap. The structure of these polymerases resembles in overall morphology a cupped human right hand, with fingers (which bind an incoming nucleotide and interact with the single-stranded template), palm (which harbors the catalytic amino acid residues and also binds an incoming dNTP) and thumb (which binds double-stranded DNA) subdomains." 
Q#8135 - CGI_10020891 superfamily 243116 414 456 1.66E-11 66.1885 cl02626 DNA_pol_A superfamily C - "Family A polymerase primarily fills DNA gaps that arise during DNA repair, recombination and replication; DNA polymerase family A, 5'-3' polymerase domain. Family A polymerase functions primarily to fill DNA gaps that arise during DNA repair, recombination and replication. DNA-dependent DNA polymerases can be classified into six main groups based upon phylogenetic relationships with E. coli polymerase I (classA), E. coli polymerase II (class B), E.coli polymerase III (class C), euryarchaeota polymerase II (class D), human polymerase beta (class X), E. coli UmuC/DinB and eukaryotic RAP 30/Xeroderma pigmentosum variant (class Y). Family A polymerases are found primarily in organisms related to prokaryotes and include prokaryotic DNA polymerase I, mitochondrial polymerase gamma, and several bacteriophage polymerases including those from odd-numbered phage (T3, T5, and T7). Prokaryotic polymerase I (pol I) has two functional domains located on the same polypeptide; a 5'-3' polymerase and a 5'-3' exonuclease. Pol I uses its 5' nuclease activity to remove the ribonucleotide portion of newly synthesized Okazaki fragments and the DNA polymerase activity to fill in the resulting gap. The structure of these polymerases resembles in overall morphology a cupped human right hand, with fingers (which bind an incoming nucleotide and interact with the single-stranded template), palm (which harbors the catalytic amino acid residues and also binds an incoming dNTP) and thumb (which binds double-stranded DNA) subdomains." Q#8135 - CGI_10020891 superfamily 245226 252 285 0.000118206 42.2877 cl10012 DnaQ_like_exo superfamily NC - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. 
It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#8137 - CGI_10020893 superfamily 227462 20 135 2.76E-35 122.027 cl18814 COG5133 superfamily N - Uncharacterized conserved protein [Function unknown] Q#8140 - CGI_10000860 superfamily 243072 178 285 0.00580601 35.8223 cl02529 ANK superfamily C - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#8143 - CGI_10026580 superfamily 110440 484 508 0.00237925 36.2317 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. 
It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#8143 - CGI_10026580 superfamily 217316 111 191 0.00397012 35.68 cl03832 DUF234 superfamily - - Archaea bacterial proteins of unknown function; Archaea bacterial proteins of unknown function. Q#8146 - CGI_10026583 superfamily 238012 112 145 0.000753652 36.5634 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#8150 - CGI_10026587 superfamily 241622 1418 1496 1.60E-16 77.607 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archebacteria, and eurkayotes.
This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#8150 - CGI_10026587 superfamily 241622 781 853 1.07E-13 69.5178 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archebacteria, and eurkayotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#8150 - CGI_10026587 superfamily 241622 686 767 1.64E-12 66.051 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). 
Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archebacteria, and eurkayotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#8150 - CGI_10026587 superfamily 241622 1552 1630 3.54E-07 50.2579 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archebacteria, and eurkayotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. 
The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#8150 - CGI_10026587 superfamily 247683 1652 1712 5.25E-17 78.5354 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#8150 - CGI_10026587 superfamily 247744 1812 1970 7.15E-16 78.1035 cl17190 NK superfamily - - "Nucleoside/nucleotide kinase (NK) is a protein superfamily consisting of multiple families of enzymes that share structural similarity and are functionally related to the catalysis of the reversible phosphate group transfer from nucleoside triphosphates to nucleosides/nucleotides, nucleoside monophosphates, or sugars. 
Members of this family play a wide variety of essential roles in nucleotide metabolism, the biosynthesis of coenzymes and aromatic compounds, as well as the metabolism of sugar and sulfate." Q#8150 - CGI_10026587 superfamily 246680 6 89 6.48E-05 43.3288 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#8150 - CGI_10026587 superfamily 247746 468 576 0.00368576 39.7996 cl17192 ATP-synt_B superfamily C - "ATP synthase B/B' CF(0); Part of the CF(0) (base unit) of the ATP synthase. The base unit is thought to translocate protons through membrane (inner membrane in mitochondria, thylakoid membrane in plants, cytoplasmic membrane in bacteria). The B subunits are thought to interact with the stalk of the CF(1) subunits. This domain should not be confused with the ab CF(1) proteins (in the head of the ATP synthase) which are found in pfam00006" Q#8151 - CGI_10026588 superfamily 243172 54 412 1.63E-122 373.261 cl02773 VKG_Carbox superfamily N - "Vitamin K-dependent gamma-carboxylase; Using reduced vitamin K, oxygen, and carbon dioxide, gamma-glutamyl carboxylase post-translationally modifies certain glutamates by adding carbon dioxide to the gamma position of those amino acids. 
In vertebrates, the modification of glutamate residues of target proteins is facilitated by an interaction between a propeptide present on target proteins and the gamma-glutamyl carboxylase." Q#8151 - CGI_10026588 superfamily 247772 464 514 0.0076315 34.9132 cl17218 Cupin_2 superfamily N - Cupin domain; This family represents the conserved barrel domain of the 'cupin' superfamily ('cupa' is the Latin term for a small barrel). Q#8152 - CGI_10026589 superfamily 112833 94 221 3.26E-65 210.67 cl04372 DUF367 superfamily - - Domain of unknown function (DUF367); Domain of unknown function (DUF367). Q#8152 - CGI_10026589 superfamily 112833 341 468 3.26E-65 210.67 cl04372 DUF367 superfamily - - Domain of unknown function (DUF367); Domain of unknown function (DUF367). Q#8152 - CGI_10026589 superfamily 217870 57 90 2.53E-09 53.6549 cl04386 RLI superfamily - - "Possible Fer4-like domain in RNase L inhibitor, RLI; Possible metal-binding domain in endoribonuclease RNase L inhibitor. Found at the N-terminal end of RNase L inhibitor proteins, adjacent to the 4Fe-4S binding domain, fer4, pfam00037. Also often found adjacent to the DUF367 domain pfam04034 in uncharacterized proteins. The RNase L system plays a major role in the anti-viral and anti-proliferative activities of interferons, and could possibly play a more general role in the regulation of RNA stability in mammalian cells. Inhibitory activity requires concentration-dependent association of RLI with RNase L." Q#8152 - CGI_10026589 superfamily 217870 304 337 8.43E-09 52.1141 cl04386 RLI superfamily - - "Possible Fer4-like domain in RNase L inhibitor, RLI; Possible metal-binding domain in endoribonuclease RNase L inhibitor. Found at the N-terminal end of RNase L inhibitor proteins, adjacent to the 4Fe-4S binding domain, fer4, pfam00037. Also often found adjacent to the DUF367 domain pfam04034 in uncharacterized proteins. 
The RNase L system plays a major role in the anti-viral and anti-proliferative activities of interferons, and could possibly play a more general role in the regulation of RNA stability in mammalian cells. Inhibitory activity requires concentration-dependent association of RLI with RNase L." Q#8153 - CGI_10026590 superfamily 219817 448 629 6.40E-18 83.0512 cl07129 Xpo1 superfamily - - "Exportin 1-like protein; The sequences featured in this family are similar to a region close to the N-terminus of yeast exportin 1 (Xpo1, Crm1). This region is found just C-terminal to an importin-beta N-terminal domain (pfam03810) in many members of this family. Exportin 1 is a nuclear export receptor that interacts with leucine-rich nuclear export signal (NES) sequences, and Ran-GTP, and is involved in translocation of proteins out of the nucleus." Q#8153 - CGI_10026590 superfamily 244509 41 165 2.51E-10 60.2335 cl06793 PRKCSH superfamily - - "Glucosidase II beta subunit-like protein; The sequences found in this family are similar to a region found in the beta-subunit of glucosidase II, which is also known as protein kinase C substrate 80K-H (PRKCSH). The enzyme catalyzes the sequential removal of two alpha-1,3-linked glucose residues in the second step of N-linked oligosaccharide processing. The beta subunit is required for the solubility and stability of the heterodimeric enzyme, and is involved in retaining the enzyme within the endoplasmic reticulum. Mutations in the gene coding for PRKCSH have been found to be involved in the development of autosomal dominant polycystic liver disease (ADPLD), but the precise role the protein has in the pathogenesis of this disease is unknown. This family also includes an ER sensor for misfolded glycoproteins and is therefore likely to be a generic sugar binding domain." Q#8153 - CGI_10026590 superfamily 243689 376 442 9.18E-07 48.3937 cl04271 IBN_N superfamily - - Importin-beta N-terminal domain; Importin-beta N-terminal domain. 
Q#8154 - CGI_10026591 superfamily 245671 38 95 0.00363708 35.3016 cl11522 Tom22 superfamily NC - "Mitochondrial import receptor subunit Tom22; The mitochondrial protein translocase family, which is responsible for movement of nuclear encoded pre-proteins into mitochondria, is very complex with at least 19 components. These proteins include several chaperone proteins, four proteins of the outer membrane translocase (Tom) import receptor, five proteins of the Tom channel complex, five proteins of the inner membrane translocase (Tim) and three "motor" proteins. This family represents the Tom22 proteins. The N terminal region of Tom22 has been shown to have chaperone-like activity, and the C terminal region faces the intermembrane face." Q#8154 - CGI_10026591 superfamily 245671 118 175 0.00363708 35.3016 cl11522 Tom22 superfamily NC - "Mitochondrial import receptor subunit Tom22; The mitochondrial protein translocase family, which is responsible for movement of nuclear encoded pre-proteins into mitochondria, is very complex with at least 19 components. These proteins include several chaperone proteins, four proteins of the outer membrane translocase (Tom) import receptor, five proteins of the Tom channel complex, five proteins of the inner membrane translocase (Tim) and three "motor" proteins. This family represents the Tom22 proteins. The N terminal region of Tom22 has been shown to have chaperone-like activity, and the C terminal region faces the intermembrane face." Q#8155 - CGI_10026592 superfamily 191444 46 124 8.93E-07 42.6965 cl05558 IL17 superfamily - - Interleukin-17; IL-17 is a potent proinflammatory cytokine produced by activated memory T cells. The IL-17 family is thought to represent a distinct signaling system that appears to have been highly conserved across vertebrate evolution. 
Q#8156 - CGI_10026593 superfamily 216686 172 354 4.38E-47 160.566 cl18377 Galactosyl_T superfamily - - "Galactosyltransferase; This family includes the galactosyltransferases UDP-galactose:2-acetamido-2-deoxy-D-glucose3beta-galactosyltransferase and UDP-Gal:beta-GlcNAc beta 1,3-galactosyltranferase. Specific galactosyltransferases transfer galactose to GlcNAc terminal chains in the synthesis of the lacto-series oligosaccharides types 1 and 2." Q#8157 - CGI_10026594 superfamily 243092 201 477 2.58E-43 155.956 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#8158 - CGI_10026595 superfamily 242899 27 176 5.23E-54 170.819 cl02135 TRAPP superfamily - - "Transport protein particle (TRAPP) component; TRAPP plays a key role in the targeting and/or fusion of ER-to-Golgi transport vesicles with their acceptor compartment. 
TRAPP is a large multimeric protein that contains at least 10 subunits. This family contains many TRAPP family proteins. The Bet3 subunit is one of the better characterized TRAPP proteins and has a dimeric structure with hydrophobic channels. The channel entrances are located on a putative membrane-interacting surface that is distinctively flat, wide and decorated with positively charged residues. Bet3 is proposed to localise TRAPP to the Golgi." Q#8159 - CGI_10026596 superfamily 241583 236 447 8.39E-91 291.064 cl00064 ZnMc superfamily - - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#8159 - CGI_10026596 superfamily 216572 38 180 6.55E-30 116.605 cl03265 Pep_M12B_propep superfamily - - Reprolysin family propeptide; This region is the propeptide for members of peptidase family M12B. The propeptide contains a sequence motif similar to the "cysteine switch" of the matrixins. This motif is found at the C terminus of the alignment but is not well aligned. Q#8159 - CGI_10026596 superfamily 246918 544 595 4.23E-11 60.2931 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#8159 - CGI_10026596 superfamily 204025 1062 1096 1.17E-06 46.8609 cl07344 PLAC superfamily - - PLAC (protease and lacunin) domain; The PLAC (protease and lacunin) domain is a short six-cysteine region that is usually found at the C terminal of proteins. It is found in a range of proteins including PACE4 (paired basic amino acid cleaving enzyme 4) and the extracellular matrix protein lacunin. Q#8159 - CGI_10026596 superfamily 246918 830 881 2.16E-05 43.7295 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. 
Q#8159 - CGI_10026596 superfamily 246918 884 941 0.000625126 39.1071 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#8159 - CGI_10026596 superfamily 246918 948 999 0.00141204 37.9515 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#8160 - CGI_10026597 superfamily 243035 1995 2115 3.20E-25 104.239 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. 
C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#8160 - CGI_10026597 superfamily 207627 1700 1776 2.87E-07 50.7159 cl02522 Calx-beta superfamily - - Calx-beta domain; Calx-beta domain. Q#8161 - CGI_10026598 superfamily 243119 158 206 2.83E-12 62.4613 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. 
Q#8161 - CGI_10026598 superfamily 243119 43 89 3.24E-10 56.6732 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#8162 - CGI_10026599 superfamily 245303 28 372 2.20E-87 274.051 cl10447 GH18_chitinase-like superfamily - - "The GH18 (glycosyl hydrolase, family 18) type II chitinases hydrolyze chitin, an abundant polymer of beta-1,4-linked N-acetylglucosamine (GlcNAc) which is a major component of the cell wall of fungi and the exoskeleton of arthropods. Chitinases have been identified in viruses, bacteria, fungi, protozoan parasites, insects, and plants. The structure of the GH18 domain is an eight-stranded beta/alpha barrel with a pronounced active-site cleft at the C-terminal end of the beta-barrel. The GH18 family includes chitotriosidase, chitobiase, hevamine, zymocin-alpha, narbonin, SI-CLP (stabilin-1 interacting chitinase-like protein), IDGF (imaginal disc growth factor), CFLE (cortical fragment-lytic enzyme) spore hydrolase, the type III and type V plant chitinases, the endo-beta-N-acetylglucosaminidases, and the chitolectins. The GH85 (glycosyl hydrolase, family 85) ENGases (endo-beta-N-acetylglucosaminidases) are closely related to the GH18 chitinases and are included in this alignment model." 
Q#8163 - CGI_10026600 superfamily 245303 25 379 1.89E-106 323.742 cl10447 GH18_chitinase-like superfamily - - "The GH18 (glycosyl hydrolase, family 18) type II chitinases hydrolyze chitin, an abundant polymer of beta-1,4-linked N-acetylglucosamine (GlcNAc) which is a major component of the cell wall of fungi and the exoskeleton of arthropods. Chitinases have been identified in viruses, bacteria, fungi, protozoan parasites, insects, and plants. The structure of the GH18 domain is an eight-stranded beta/alpha barrel with a pronounced active-site cleft at the C-terminal end of the beta-barrel. The GH18 family includes chitotriosidase, chitobiase, hevamine, zymocin-alpha, narbonin, SI-CLP (stabilin-1 interacting chitinase-like protein), IDGF (imaginal disc growth factor), CFLE (cortical fragment-lytic enzyme) spore hydrolase, the type III and type V plant chitinases, the endo-beta-N-acetylglucosaminidases, and the chitolectins. The GH85 (glycosyl hydrolase, family 85) ENGases (endo-beta-N-acetylglucosaminidases) are closely related to the GH18 chitinases and are included in this alignment model." Q#8164 - CGI_10026601 superfamily 245303 27 360 6.06E-98 301.4 cl10447 GH18_chitinase-like superfamily - - "The GH18 (glycosyl hydrolase, family 18) type II chitinases hydrolyze chitin, an abundant polymer of beta-1,4-linked N-acetylglucosamine (GlcNAc) which is a major component of the cell wall of fungi and the exoskeleton of arthropods. Chitinases have been identified in viruses, bacteria, fungi, protozoan parasites, insects, and plants. The structure of the GH18 domain is an eight-stranded beta/alpha barrel with a pronounced active-site cleft at the C-terminal end of the beta-barrel. 
The GH18 family includes chitotriosidase, chitobiase, hevamine, zymocin-alpha, narbonin, SI-CLP (stabilin-1 interacting chitinase-like protein), IDGF (imaginal disc growth factor), CFLE (cortical fragment-lytic enzyme) spore hydrolase, the type III and type V plant chitinases, the endo-beta-N-acetylglucosaminidases, and the chitolectins. The GH85 (glycosyl hydrolase, family 85) ENGases (endo-beta-N-acetylglucosaminidases) are closely related to the GH18 chitinases and are included in this alignment model." Q#8165 - CGI_10026602 superfamily 245303 23 387 0 599.545 cl10447 GH18_chitinase-like superfamily - - "The GH18 (glycosyl hydrolase, family 18) type II chitinases hydrolyze chitin, an abundant polymer of beta-1,4-linked N-acetylglucosamine (GlcNAc) which is a major component of the cell wall of fungi and the exoskeleton of arthropods. Chitinases have been identified in viruses, bacteria, fungi, protozoan parasites, insects, and plants. The structure of the GH18 domain is an eight-stranded beta/alpha barrel with a pronounced active-site cleft at the C-terminal end of the beta-barrel. The GH18 family includes chitotriosidase, chitobiase, hevamine, zymocin-alpha, narbonin, SI-CLP (stabilin-1 interacting chitinase-like protein), IDGF (imaginal disc growth factor), CFLE (cortical fragment-lytic enzyme) spore hydrolase, the type III and type V plant chitinases, the endo-beta-N-acetylglucosaminidases, and the chitolectins. The GH85 (glycosyl hydrolase, family 85) ENGases (endo-beta-N-acetylglucosaminidases) are closely related to the GH18 chitinases and are included in this alignment model." Q#8165 - CGI_10026602 superfamily 243119 550 602 2.97E-13 65.1476 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. 
Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#8165 - CGI_10026602 superfamily 243119 442 492 1.08E-12 63.6068 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#8166 - CGI_10026603 superfamily 245303 49 412 0 620.346 cl10447 GH18_chitinase-like superfamily - - "The GH18 (glycosyl hydrolase, family 18) type II chitinases hydrolyze chitin, an abundant polymer of beta-1,4-linked N-acetylglucosamine (GlcNAc) which is a major component of the cell wall of fungi and the exoskeleton of arthropods. Chitinases have been identified in viruses, bacteria, fungi, protozoan parasites, insects, and plants. The structure of the GH18 domain is an eight-stranded beta/alpha barrel with a pronounced active-site cleft at the C-terminal end of the beta-barrel. The GH18 family includes chitotriosidase, chitobiase, hevamine, zymocin-alpha, narbonin, SI-CLP (stabilin-1 interacting chitinase-like protein), IDGF (imaginal disc growth factor), CFLE (cortical fragment-lytic enzyme) spore hydrolase, the type III and type V plant chitinases, the endo-beta-N-acetylglucosaminidases, and the chitolectins. The GH85 (glycosyl hydrolase, family 85) ENGases (endo-beta-N-acetylglucosaminidases) are closely related to the GH18 chitinases and are included in this alignment model." 
Q#8166 - CGI_10026603 superfamily 245303 626 989 0 617.264 cl10447 GH18_chitinase-like superfamily - - "The GH18 (glycosyl hydrolase, family 18) type II chitinases hydrolyze chitin, an abundant polymer of beta-1,4-linked N-acetylglucosamine (GlcNAc) which is a major component of the cell wall of fungi and the exoskeleton of arthropods. Chitinases have been identified in viruses, bacteria, fungi, protozoan parasites, insects, and plants. The structure of the GH18 domain is an eight-stranded beta/alpha barrel with a pronounced active-site cleft at the C-terminal end of the beta-barrel. The GH18 family includes chitotriosidase, chitobiase, hevamine, zymocin-alpha, narbonin, SI-CLP (stabilin-1 interacting chitinase-like protein), IDGF (imaginal disc growth factor), CFLE (cortical fragment-lytic enzyme) spore hydrolase, the type III and type V plant chitinases, the endo-beta-N-acetylglucosaminidases, and the chitolectins. The GH85 (glycosyl hydrolase, family 85) ENGases (endo-beta-N-acetylglucosaminidases) are closely related to the GH18 chitinases and are included in this alignment model." Q#8166 - CGI_10026603 superfamily 245303 998 1091 5.45E-49 179.677 cl10447 GH18_chitinase-like superfamily N - "The GH18 (glycosyl hydrolase, family 18) type II chitinases hydrolyze chitin, an abundant polymer of beta-1,4-linked N-acetylglucosamine (GlcNAc) which is a major component of the cell wall of fungi and the exoskeleton of arthropods. Chitinases have been identified in viruses, bacteria, fungi, protozoan parasites, insects, and plants. The structure of the GH18 domain is an eight-stranded beta/alpha barrel with a pronounced active-site cleft at the C-terminal end of the beta-barrel. 
The GH18 family includes chitotriosidase, chitobiase, hevamine, zymocin-alpha, narbonin, SI-CLP (stabilin-1 interacting chitinase-like protein), IDGF (imaginal disc growth factor), CFLE (cortical fragment-lytic enzyme) spore hydrolase, the type III and type V plant chitinases, the endo-beta-N-acetylglucosaminidases, and the chitolectins. The GH85 (glycosyl hydrolase, family 85) ENGases (endo-beta-N-acetylglucosaminidases) are closely related to the GH18 chitinases and are included in this alignment model." Q#8166 - CGI_10026603 superfamily 243119 1154 1204 3.55E-12 63.6068 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#8166 - CGI_10026603 superfamily 243119 475 525 3.55E-12 63.6068 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. 
Q#8166 - CGI_10026603 superfamily 243119 578 622 1.43E-08 53.2064 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#8166 - CGI_10026603 superfamily 243119 1257 1289 0.000715938 39.3393 cl02629 CBM_14 superfamily C - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#8167 - CGI_10026604 superfamily 245303 12 375 0 615.723 cl10447 GH18_chitinase-like superfamily - - "The GH18 (glycosyl hydrolase, family 18) type II chitinases hydrolyze chitin, an abundant polymer of beta-1,4-linked N-acetylglucosamine (GlcNAc) which is a major component of the cell wall of fungi and the exoskeleton of arthropods. Chitinases have been identified in viruses, bacteria, fungi, protozoan parasites, insects, and plants. The structure of the GH18 domain is an eight-stranded beta/alpha barrel with a pronounced active-site cleft at the C-terminal end of the beta-barrel. 
The GH18 family includes chitotriosidase, chitobiase, hevamine, zymocin-alpha, narbonin, SI-CLP (stabilin-1 interacting chitinase-like protein), IDGF (imaginal disc growth factor), CFLE (cortical fragment-lytic enzyme) spore hydrolase, the type III and type V plant chitinases, the endo-beta-N-acetylglucosaminidases, and the chitolectins. The GH85 (glycosyl hydrolase, family 85) ENGases (endo-beta-N-acetylglucosaminidases) are closely related to the GH18 chitinases and are included in this alignment model." Q#8167 - CGI_10026604 superfamily 245303 429 791 0 594.922 cl10447 GH18_chitinase-like superfamily - - "The GH18 (glycosyl hydrolase, family 18) type II chitinases hydrolyze chitin, an abundant polymer of beta-1,4-linked N-acetylglucosamine (GlcNAc) which is a major component of the cell wall of fungi and the exoskeleton of arthropods. Chitinases have been identified in viruses, bacteria, fungi, protozoan parasites, insects, and plants. The structure of the GH18 domain is an eight-stranded beta/alpha barrel with a pronounced active-site cleft at the C-terminal end of the beta-barrel. The GH18 family includes chitotriosidase, chitobiase, hevamine, zymocin-alpha, narbonin, SI-CLP (stabilin-1 interacting chitinase-like protein), IDGF (imaginal disc growth factor), CFLE (cortical fragment-lytic enzyme) spore hydrolase, the type III and type V plant chitinases, the endo-beta-N-acetylglucosaminidases, and the chitolectins. The GH85 (glycosyl hydrolase, family 85) ENGases (endo-beta-N-acetylglucosaminidases) are closely related to the GH18 chitinases and are included in this alignment model." Q#8167 - CGI_10026604 superfamily 243119 986 1038 1.43E-12 64.3772 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. 
Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#8167 - CGI_10026604 superfamily 243119 815 866 1.75E-12 63.992 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#8168 - CGI_10026605 superfamily 245303 38 408 0 535.217 cl10447 GH18_chitinase-like superfamily - - "The GH18 (glycosyl hydrolase, family 18) type II chitinases hydrolyze chitin, an abundant polymer of beta-1,4-linked N-acetylglucosamine (GlcNAc) which is a major component of the cell wall of fungi and the exoskeleton of arthropods. Chitinases have been identified in viruses, bacteria, fungi, protozoan parasites, insects, and plants. The structure of the GH18 domain is an eight-stranded beta/alpha barrel with a pronounced active-site cleft at the C-terminal end of the beta-barrel. The GH18 family includes chitotriosidase, chitobiase, hevamine, zymocin-alpha, narbonin, SI-CLP (stabilin-1 interacting chitinase-like protein), IDGF (imaginal disc growth factor), CFLE (cortical fragment-lytic enzyme) spore hydrolase, the type III and type V plant chitinases, the endo-beta-N-acetylglucosaminidases, and the chitolectins. The GH85 (glycosyl hydrolase, family 85) ENGases (endo-beta-N-acetylglucosaminidases) are closely related to the GH18 chitinases and are included in this alignment model." 
Q#8168 - CGI_10026605 superfamily 243119 642 694 2.66E-13 65.918 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#8168 - CGI_10026605 superfamily 243119 450 499 5.46E-05 41.6505 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#8169 - CGI_10026606 superfamily 248013 5 47 2.03E-11 58.8147 cl17459 CHROMO superfamily - - "Chromatin organization modifier (chromo) domain is a conserved region of around 50 amino acids found in a variety of chromosomal proteins, which appear to play a role in the functional organization of the eukaryotic nucleus. Experimental evidence implicates the chromo domain in the binding activity of these proteins to methylated histone tails and maybe RNA. May occur as single instance, in a tandem arrangement or followed by a related "chromo shadow" domain." Q#8170 - CGI_10026607 superfamily 241867 249 513 1.78E-32 129.842 cl00446 Lactamase_B superfamily - - Metallo-beta-lactamase superfamily; Metallo-beta-lactamase superfamily. 
Q#8171 - CGI_10026608 superfamily 217869 5 215 4.73E-95 295.35 cl12286 Not3 superfamily - - "Not1 N-terminal domain, CCR4-Not complex component; Not1 N-terminal domain, CCR4-Not complex component. " Q#8171 - CGI_10026608 superfamily 243016 533 663 9.22E-44 153.255 cl02384 NOT2_3_5 superfamily - - "NOT2 / NOT3 / NOT5 family; NOT1, NOT2, NOT3, NOT4 and NOT5 form a nuclear complex that negatively regulates the basal and activated transcription of many genes. This family includes NOT2, NOT3 and NOT5." Q#8172 - CGI_10026609 superfamily 245206 3 250 5.05E-86 262.599 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#8173 - CGI_10026610 superfamily 245201 1648 1884 1.64E-29 119.649 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. 
These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#8173 - CGI_10026610 superfamily 243072 173 320 4.95E-29 115.559 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#8173 - CGI_10026610 superfamily 243072 249 438 4.40E-18 83.587 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#8173 - CGI_10026610 superfamily 243072 42 199 1.36E-14 73.5718 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#8173 - CGI_10026610 superfamily 247724 1000 1158 3.20E-27 111.274 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#8174 - CGI_10026611 superfamily 247724 6 169 9.68E-39 132.649 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#8175 - CGI_10026612 superfamily 247724 1 94 2.41E-16 71.408 cl17170 Ras_like_GTPase superfamily NC - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. 
Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#8176 - CGI_10026614 superfamily 243035 90 144 7.50E-07 45.3034 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. 
Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#8179 - CGI_10026617 superfamily 216709 185 332 1.08E-46 160.52 cl03357 Nop superfamily - - Putative snoRNA binding domain; This family consists of various Pre RNA processing ribonucleoproteins. The function of the aligned region is unknown however it may be a common RNA or snoRNA or Nop1p binding domain. Nop5p (Nop58p) from yeast is the protein component of a ribonucleoprotein protein required for pre-18s rRNA processing and is suggested to function with Nop1p in a snoRNA complex. Nop56p and Nop5p interact with Nop1p and are required for ribosome biogenesis. Prp31p is required for pre-mRNA splicing in S. cerevisiae. Q#8179 - CGI_10026617 superfamily 220400 334 463 5.70E-38 135.54 cl10762 Prp31_C superfamily - - Prp31 C terminal domain; This is the C terminal domain of the pre-mRNA processing factor Prp31. Prp31 is required for U4/U6.U5 tri-snRNP formation. In humans this protein has been linked to autosomal dominant retinitis pigmentosa. 
Q#8179 - CGI_10026617 superfamily 208568 90 141 2.85E-21 87.1772 cl06890 NOSIC superfamily - - NOSIC (NUC001) domain; This is the central domain in Nop56/SIK1-like proteins. Q#8180 - CGI_10026618 superfamily 218122 309 536 4.61E-68 226.71 cl04558 Choline_transpo superfamily N - Plasma-membrane choline transporter; This family represents a high-affinity plasma-membrane choline transporter in C. elegans which is thought to be rate-limiting for ACh synthesis in cholinergic nerve terminals. Q#8182 - CGI_10026620 superfamily 241758 139 239 8.20E-19 82.4178 cl00292 AANH_like superfamily N - "Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases, ATP sulphurylases, Universal Stress Response protein and electron transfer flavoprotein (ETF). The domain forms an alpha/beta/alpha fold which binds to Adenosine nucleotide." Q#8182 - CGI_10026620 superfamily 247757 20 148 3.36E-45 155.877 cl17203 Fer4_NifH superfamily C - "The Fer4_NifH superfamily contains a variety of proteins which share a common ATP-binding domain. Functionally, proteins in this superfamily use the energy from hydrolysis of NTP to transfer electron or ion." Q#8182 - CGI_10026620 superfamily 241619 306 359 8.74E-05 40.64 cl00112 PAN_APPLE superfamily C - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." 
Q#8183 - CGI_10026621 superfamily 243040 1 95 7.42E-60 185.308 cl02447 CRD_FZ superfamily - - "CRD_domain cysteine-rich domain, also known as Fz (frizzled) domain; CRD_FZ is an essential component of a number of cell surface receptors, which are involved in multiple signal transduction pathways, particularly in modulating the activity of the Wnt proteins, which play a fundamental role in the early development of metazoans. CRD is also found in secreted frizzled related proteins (SFRPs), which lack the transmembrane segment found in the frizzled protein. The CRD domain is also present in the alpha-1 chain of mouse type XVIII collagen, in carboxypeptidase Z, several receptor tyrosine kinases, and the mosaic transmembrane serine protease corin. The CRD domain is well conserved in metazoans - 10 frizzled proteins have been identified in mammals, 4 in Drosophila and 3 in Caenorhabditis elegans. CRD domains have also been identified in multiple tandem copies in a Dictyostelium discoideum protein. Very little is known about the mechanism by which CRD domains interact with their ligands. The domain contains 10 conserved cysteines." Q#8184 - CGI_10026622 superfamily 241737 49 191 7.00E-18 82.544 cl00264 Ferritin_like superfamily - - "Ferritin-like superfamily of diiron-containing four-helix-bundle proteins; Ferritin-like, diiron-carboxylate proteins participate in a range of functions including iron regulation, mono-oxygenation, and reactive radical production. These proteins are characterized by the fact that they catalyze dioxygen-dependent oxidation-hydroxylation reactions within diiron centers; one exception is manganese catalase, which catalyzes peroxide-dependent oxidation-reduction within a dimanganese center. Diiron-carboxylate proteins are further characterized by the presence of duplicate metal ligands, glutamates and histidines (ExxH) and two additional glutamates within a four-helix bundle. Outside of these conserved residues there is little obvious homology. 
Members include bacterioferritin, ferritin, rubrerythrin, aromatic and alkene monooxygenase hydroxylases (AAMH), ribonucleotide reductase R2 (RNRR2), acyl-ACP-desaturases (Acyl_ACP_Desat), manganese (Mn) catalases, demethoxyubiquinone hydroxylases (DMQH), DNA protecting proteins (DPS), and ubiquinol oxidases (AOX), and the aerobic cyclase system, Fe-containing subunit (ACSF)." Q#8184 - CGI_10026622 superfamily 241617 968 1026 4.89E-16 75.1087 cl00110 MBD superfamily - - "MeCP2, MBD1, MBD2, MBD3, MBD4, CLLD8-like, and BAZ2A-like proteins constitute a family of proteins that share the methyl-CpG-binding domain (MBD). The MBD consists of about 70 residues and is defined as the minimal region required for binding to methylated DNA by a methyl-CpG-binding protein which binds specifically to methylated DNA. The MBD can recognize a single symmetrically methylated CpG either as naked DNA or within chromatin. MeCP2, MBD1 and MBD2 (and likely MBD3) form complexes with histone deacetylase and are involved in histone deacetylase-dependent repression of transcription. MBD4 is an endonuclease that forms a complex with the DNA mismatch-repair protein MLH1. The MBDs present in putative chromatin remodelling subunit, BAZ2A, and putative histone methyltransferase, CLLD8, represent two phylogenetically distinct groups within the MBD protein family." Q#8184 - CGI_10026622 superfamily 243114 1047 1159 3.62E-30 117.511 cl02622 Pre-SET superfamily - - Pre-SET motif; This protein motif is a zinc binding motif. It contains 9 conserved cysteines that coordinate three zinc ions. It is thought that this region plays a structural role in stabilising SET domains. Q#8184 - CGI_10026622 superfamily 243091 1447 1515 1.09E-17 81.9971 cl02566 SET superfamily N - "SET domain; SET domains are protein lysine methyltransferase enzymes. SET domains appear to be protein-protein interaction domains. 
It has been demonstrated that SET domains mediate interactions with a family of proteins that display similarity with dual-specificity phosphatases (dsPTPases). A subset of SET domains have been called PR domains. These domains are divergent in sequence from other SET domains, but also appear to mediate protein-protein interaction. The SET domain consists of two regions known as SET-N and SET-C. SET-C forms an unusual and conserved knot-like structure of probably functional importance. Additionally to SET-N and SET-C, an insert region (SET-I) and flanking regions of high structural variability form part of the overall structure." Q#8184 - CGI_10026622 superfamily 243091 1167 1233 1.66E-09 57.3443 cl02566 SET superfamily C - "SET domain; SET domains are protein lysine methyltransferase enzymes. SET domains appear to be protein-protein interaction domains. It has been demonstrated that SET domains mediate interactions with a family of proteins that display similarity with dual-specificity phosphatases (dsPTPases). A subset of SET domains have been called PR domains. These domains are divergent in sequence from other SET domains, but also appear to mediate protein-protein interaction. The SET domain consists of two regions known as SET-N and SET-C. SET-C forms an unusual and conserved knot-like structure of probably functional importance. Additionally to SET-N and SET-C, an insert region (SET-I) and flanking regions of high structural variability form part of the overall structure." Q#8185 - CGI_10026623 superfamily 247805 6 220 1.18E-68 222.745 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 
Q#8185 - CGI_10026623 superfamily 247905 239 368 4.12E-29 112.331 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#8185 - CGI_10026623 superfamily 222474 399 462 1.22E-22 92.1009 cl16500 DUF4217 superfamily - - Domain of unknown function (DUF4217); This short domain is found at the C-terminus of many helicase proteins. Q#8187 - CGI_10026625 superfamily 241599 1 39 1.60E-12 58.4089 cl00084 homeodomain superfamily N - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#8189 - CGI_10026627 superfamily 217390 245 341 5.34E-06 44.4729 cl18407 TPT superfamily N - Triose-phosphate Transporter family; This family includes transporters with a specificity for triose phosphate. 
Q#8191 - CGI_10026629 superfamily 241648 270 321 2.56E-22 89.7394 cl00158 ZnF_GATA superfamily - - Zinc finger DNA binding domain; binds specifically to DNA consensus sequence [AT]GATA[AG] promoter elements; a subset of family members may also bind protein; zinc-finger consensus topology is C-X(2)-C-X(17)-C-X(2)-C Q#8191 - CGI_10026629 superfamily 218566 122 214 4.61E-06 45.5794 cl18464 GATA-N superfamily C - "GATA-type transcription activator, N-terminal; GATA transcription factors mediate cell differentiation in a diverse range of tissues. Mutations are often associated with certain congenital human disorders. The six classical vertebrate GATA proteins, GATA-1 to GATA-6, are highly homologous and have two tandem zinc fingers. The classical GATA transcription factors function as transcription activators. In lower metazoans GATA proteins carry a single canonical zinc finger. This family represents the N-terminal domain of the family of GATA transcription activators." Q#8193 - CGI_10026631 superfamily 246680 493 576 1.75E-06 47.0352 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." 
Q#8194 - CGI_10026632 superfamily 245864 177 642 2.70E-76 253.355 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#8194 - CGI_10026632 superfamily 245864 6 156 3.36E-32 128.935 cl12078 p450 superfamily N - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. 
These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#8196 - CGI_10026634 superfamily 243050 245 297 7.00E-21 86.495 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#8196 - CGI_10026634 superfamily 243050 362 414 7.00E-21 86.495 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 
They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#8196 - CGI_10026634 superfamily 243050 475 527 1.15E-20 86.1098 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#8196 - CGI_10026634 superfamily 243050 23 75 1.71E-20 85.3394 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 
They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#8196 - CGI_10026634 superfamily 243050 133 185 1.02E-13 66.4646 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#8197 - CGI_10026635 superfamily 241567 5 179 1.70E-35 126.176 cl00042 CASc superfamily - - "Caspase, interleukin-1 beta converting enzyme (ICE) homologues; Cysteine-dependent aspartate-directed proteases that mediate programmed cell death (apoptosis). Caspases are synthesized as inactive zymogens and activated by proteolysis of the peptide backbone adjacent to an aspartate. The resulting two subunits associate to form an (alpha)2(beta)2-tetramer which is the active enzyme. 
Activation of caspases can be mediated by other caspase homologs." Q#8198 - CGI_10026636 superfamily 241587 5 61 1.06E-14 62.3066 cl00069 GGL superfamily - - "G protein gamma subunit-like motifs, the alpha-helical G-gamma chain dimerizes with the G-beta propeller subunit as part of the heterotrimeric G-protein complex; involved in signal transduction via G-protein-coupled receptors" Q#8199 - CGI_10026637 superfamily 247727 161 221 4.20E-05 41.2615 cl17173 AdoMet_MTases superfamily C - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#8201 - CGI_10026639 superfamily 245814 132 206 2.42E-11 61.7363 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#8201 - CGI_10026639 superfamily 245814 444 517 4.54E-10 57.8843 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. 
The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#8201 - CGI_10026639 superfamily 245814 339 415 2.19E-09 55.9583 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#8201 - CGI_10026639 superfamily 245814 26 100 8.50E-08 51.3359 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#8201 - CGI_10026639 superfamily 245814 1174 1227 7.33E-05 42.4763 cl11960 Ig superfamily C - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#8201 - CGI_10026639 superfamily 245814 544 619 0.000218218 40.9355 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#8201 - CGI_10026639 superfamily 245814 641 721 7.10E-11 60.5968 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. 
A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#8201 - CGI_10026639 superfamily 245814 227 311 1.10E-08 54.0485 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#8201 - CGI_10026639 superfamily 245814 939 1012 5.34E-06 46.1948 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#8202 - CGI_10026640 superfamily 241645 161 220 2.10E-09 54.9658 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. 
Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#8203 - CGI_10026641 superfamily 245201 2660 2906 1.59E-54 192.837 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#8203 - CGI_10026641 superfamily 241584 1425 1517 3.85E-23 97.9523 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#8203 - CGI_10026641 superfamily 241584 1722 1811 5.87E-23 97.1819 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. 
Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#8203 - CGI_10026641 superfamily 245814 854 925 5.56E-21 91.1032 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#8203 - CGI_10026641 superfamily 241584 929 1021 4.69E-20 88.7075 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#8203 - CGI_10026641 superfamily 241584 1030 1119 8.69E-19 85.2407 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. 
Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#8203 - CGI_10026641 superfamily 241584 1829 1909 1.39E-18 84.4703 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#8203 - CGI_10026641 superfamily 241584 1324 1414 2.45E-18 83.6999 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#8203 - CGI_10026641 superfamily 241584 632 725 3.37E-18 83.3147 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. 
Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#8203 - CGI_10026641 superfamily 241584 1621 1709 4.35E-18 83.3147 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#8203 - CGI_10026641 superfamily 241584 2504 2594 5.92E-18 82.9295 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#8203 - CGI_10026641 superfamily 241584 2213 2299 9.55E-18 82.1591 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. 
Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#8203 - CGI_10026641 superfamily 245814 2428 2500 1.10E-17 81.4732 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#8203 - CGI_10026641 superfamily 241584 2112 2205 1.41E-17 81.7739 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#8203 - CGI_10026641 superfamily 241584 1131 1222 2.30E-17 81.0035 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. 
Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#8203 - CGI_10026641 superfamily 241584 733 826 3.14E-17 80.6183 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#8203 - CGI_10026641 superfamily 241584 330 414 7.96E-15 73.6847 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#8203 - CGI_10026641 superfamily 241584 431 513 2.68E-12 66.3659 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. 
Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#8203 - CGI_10026641 superfamily 245814 160 218 3.39E-08 53.6471 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#8203 - CGI_10026641 superfamily 245814 1943 2010 4.40E-07 50.5655 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#8203 - CGI_10026641 superfamily 245814 72 140 0.000106851 43.2467 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#8203 - CGI_10026641 superfamily 245814 1247 1320 1.91E-17 80.7028 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#8203 - CGI_10026641 superfamily 245814 1545 1617 1.04E-15 75.6952 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#8203 - CGI_10026641 superfamily 245814 255 326 1.82E-14 71.8432 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. 
The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#8203 - CGI_10026641 superfamily 245814 549 628 5.85E-14 70.6876 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#8203 - CGI_10026641 superfamily 245814 2323 2405 9.18E-10 58.6708 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#8203 - CGI_10026641 superfamily 245814 2035 2108 1.52E-09 57.5908 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#8204 - CGI_10026642 superfamily 245814 157 226 3.77E-13 64.8179 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#8204 - CGI_10026642 superfamily 245814 258 325 1.35E-12 63.2771 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. 
A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#8204 - CGI_10026642 superfamily 245814 343 424 1.98E-08 51.3521 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#8204 - CGI_10026642 superfamily 245814 1 59 2.74E-07 48.2705 cl11960 Ig superfamily N - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#8205 - CGI_10026643 superfamily 241571 529 627 1.24E-17 80.149 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." 
Q#8205 - CGI_10026643 superfamily 241563 93 128 0.00288134 36.6884 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#8206 - CGI_10007825 superfamily 241564 53 121 4.13E-25 96.5659 cl00035 BIR superfamily - - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." Q#8206 - CGI_10007825 superfamily 241564 272 325 1.92E-14 67.6759 cl00035 BIR superfamily - - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." Q#8207 - CGI_10007826 superfamily 241563 68 108 6.22E-05 40.9256 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#8208 - CGI_10007827 superfamily 245201 221 421 1.86E-38 140.45 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#8208 - CGI_10007827 superfamily 216276 31 115 5.02E-16 73.7363 cl15639 Activin_recp superfamily - - "Activin types I and II receptor domain; This Pfam entry consists of both TGF-beta receptor types. This is an alignment of the hydrophilic cysteine-rich ligand-binding domains. Both receptor types (type I and II) possess a 9 amino acid cysteine box, with the consensus CCX{4-5}CN. The type I receptors also possess 7 extracellular residues preceding the cysteine box." Q#8208 - CGI_10007827 superfamily 243113 187 213 2.75E-07 47.4902 cl02621 TGF_beta_GS superfamily - - Transforming growth factor beta type I GS-motif; This motif is found in the transforming growth factor beta (TGF-beta) type I which regulates cell growth and differentiation. The name of the GS motif comes from its highly conserved GSGSGLP signature in the cytoplasmic juxtamembrane region immediately preceding the protein's kinase domain. Point mutations in the GS motif modify the signaling ability of the type I receptor. Q#8208 - CGI_10007827 superfamily 245201 341 508 1.10E-06 49.2284 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. 
These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#8209 - CGI_10007828 superfamily 247724 72 112 6.67E-12 60.2529 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#8212 - CGI_10001137 superfamily 247856 33 91 4.83E-13 59.4837 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#8213 - CGI_10001500 superfamily 130849 58 91 0.00140898 38.3072 cl17981 lycopene_cycl superfamily C - lycopene cyclase; This model represents a family of bacterial lycopene cyclases catalyzing the transformation of lycopene to carotene. These enzymes are found in a limited spectrum of alpha and gamma proteobacteria as well as Flavobacterium. Q#8214 - CGI_10001501 superfamily 115363 204 232 2.04E-06 43.5146 cl05972 MIB_HERC2 superfamily C - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). Q#8215 - CGI_10001502 superfamily 115363 82 115 4.99E-07 42.7442 cl05972 MIB_HERC2 superfamily C - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). Q#8215 - CGI_10001502 superfamily 115363 38 65 0.00568945 31.9586 cl05972 MIB_HERC2 superfamily N - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). Q#8216 - CGI_10008467 superfamily 219974 585 772 4.20E-51 181.301 cl18538 Dna2 superfamily - - "DNA replication factor Dna2; Dna2 is a DNA replication factor with single-stranded DNA-dependent ATPase, ATP-dependent nuclease (5'-flap endonuclease), and helicase activities. It is required for Okazaki fragment processing and is involved in DNA repair pathways." Q#8216 - CGI_10008467 superfamily 221913 1253 1427 1.99E-41 152.695 cl18626 AAA_12 superfamily - - AAA domain; This family of domains contain a P-loop motif that is characteristic of the AAA superfamily. Many of the proteins in this family are conjugative transfer proteins. Q#8216 - CGI_10008467 superfamily 222258 1075 1281 9.36E-11 61.8151 cl18656 AAA_30 superfamily - - AAA domain; This family of domains contain a P-loop motif that is characteristic of the AAA superfamily. Many of the proteins in this family are conjugative transfer proteins. There is a Walker A and Walker B. 
Q#8216 - CGI_10008467 superfamily 241999 750 886 8.83E-06 46.2537 cl00641 Cas4_I-A_I-B_I-C_I-D_II-B superfamily N - CRISPR/Cas system-associated protein Cas4; CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas4 is RecB-like nuclease with three-cysteine C-terminal cluster Q#8217 - CGI_10008468 superfamily 244906 51 116 6.26E-30 114.93 cl08315 CAP_GLY superfamily - - "CAP-Gly domain; Cytoskeleton-associated proteins (CAPs) are involved in the organisation of microtubules and transportation of vesicles and organelles along the cytoskeletal network. A conserved motif, CAP-Gly, has been identified in a number of CAPs, including CLIP-170 and dynactins. The crystal structure of Caenorhabditis elegans F53F4.3 protein CAP-Gly domain was recently solved. The domain contains three beta-strands. The most conserved sequence, GKNDG, is located in two consecutive sharp turns on the surface, forming the entrance to a groove." Q#8217 - CGI_10008468 superfamily 244906 192 257 7.68E-30 114.93 cl08315 CAP_GLY superfamily - - "CAP-Gly domain; Cytoskeleton-associated proteins (CAPs) are involved in the organisation of microtubules and transportation of vesicles and organelles along the cytoskeletal network. A conserved motif, CAP-Gly, has been identified in a number of CAPs, including CLIP-170 and dynactins. The crystal structure of Caenorhabditis elegans F53F4.3 protein CAP-Gly domain was recently solved. The domain contains three beta-strands. The most conserved sequence, GKNDG, is located in two consecutive sharp turns on the surface, forming the entrance to a groove." 
Q#8217 - CGI_10008468 superfamily 149105 930 1023 9.04E-05 44.7333 cl12353 TMPIT superfamily C - "TMPIT-like protein; A number of members of this family are annotated as being transmembrane proteins induced by tumour necrosis factor alpha, but no literature was found to support this." Q#8218 - CGI_10008469 superfamily 245596 51 279 1.30E-82 253.288 cl11394 Glyco_tranf_GTA_type superfamily - - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." 
Q#8219 - CGI_10008470 superfamily 245610 8 361 0 741.875 cl11424 nitrilase superfamily - - "Nitrilase superfamily, including nitrile- or amide-hydrolyzing enzymes and amide-condensing enzymes; This superfamily (also known as the C-N hydrolase superfamily) contains hydrolases that break carbon-nitrogen bonds; it includes nitrilases, cyanide dihydratases, aliphatic amidases, N-terminal amidases, beta-ureidopropionases, biotinidases, pantotheinase, N-carbamyl-D-amino acid amidohydrolases, the glutaminase domain of glutamine-dependent NAD+ synthetase, apolipoprotein N-acyltransferases, and N-carbamoylputrescine amidohydrolases, among others. These enzymes depend on a Glu-Lys-Cys catalytic triad, and work through a thiol acylenzyme intermediate. Members of this superfamily generally form homomeric complexes, the basic building block of which is a homodimer. These oligomers include dimers, tetramers, hexamers, octamers, tetradecamers, octadecamers, as well as variable length helical arrangements and homo-oligomeric spirals. These proteins have roles in vitamin and co-enzyme metabolism, in detoxifying small molecules, in the synthesis of signaling molecules, and in the post-translational modification of proteins. They are used industrially, as biocatalysts in the fine chemical and pharmaceutical industry, in cyanide remediation, and in the treatment of toxic effluent. This superfamily has been classified previously in the literature, based on global and structure-based sequence analysis, into thirteen different enzyme classes (referred to as 1-13). This hierarchy includes those thirteen classes and a few additional subfamilies. A putative distant relative, the plasmid-borne TraB family, has not been included in the hierarchy." 
Q#8220 - CGI_10008471 superfamily 227718 71 119 0.00241822 34.4839 cl02254 COG5431 superfamily NC - Uncharacterized metal-binding protein [Function unknown] Q#8221 - CGI_10008472 superfamily 241559 992 1093 3.96E-23 96.6111 cl00030 CH superfamily - - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." Q#8223 - CGI_10008474 superfamily 247755 158 184 0.00107921 38.5705 cl17201 ABC_ATPase superfamily C - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#8225 - CGI_10008476 superfamily 245206 9 353 1.56E-72 231.627 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. 
Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#8227 - CGI_10008478 superfamily 241624 1 309 1.61E-44 159.799 cl00120 PP2Cc superfamily - - "Serine/threonine phosphatases, family 2C, catalytic domain; The protein architecture and deduced catalytic mechanism of PP2C phosphatases are similar to the PP1, PP2A, PP2B family of protein Ser/Thr phosphatases, with which PP2C shares no sequence similarity." Q#8228 - CGI_10008479 superfamily 241862 118 276 8.33E-24 97.0416 cl00437 COG0428 superfamily N - Predicted divalent heavy-metal cations transporter [Inorganic ion transport and metabolism] Q#8230 - CGI_10008481 superfamily 243058 287 380 0.000298784 38.8348 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#8230 - CGI_10008481 superfamily 248012 4 78 4.20E-07 47.5725 cl17458 TIR_2 superfamily N - TIR domain; This is a family of bacterial Toll-like receptors. 
Q#8232 - CGI_10005784 superfamily 216056 40 189 5.82E-24 97.3803 cl08279 Peptidase_M16 superfamily - - Insulinase (Peptidase family M16); Insulinase (Peptidase family M16). Q#8232 - CGI_10005784 superfamily 218490 270 407 2.40E-05 43.6191 cl08432 Peptidase_M16_C superfamily N - "Peptidase M16 inactive domain; Peptidase M16 consists of two structurally related domains. One is the active peptidase, whereas the other is inactive. The two domains hold the substrate like a clamp." Q#8233 - CGI_10005785 superfamily 219525 50 89 4.09E-05 40.095 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#8234 - CGI_10005786 superfamily 245213 38 74 1.17E-08 51.8686 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#8234 - CGI_10005786 superfamily 245213 114 150 1.50E-08 51.4834 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#8234 - CGI_10005786 superfamily 245213 76 112 2.30E-07 48.0166 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#8234 - CGI_10005786 superfamily 245213 1 36 3.06E-07 47.6314 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#8234 - CGI_10005786 superfamily 221370 348 504 3.80E-09 55.8405 cl13441 DUF3497 superfamily - - "Domain of unknown function (DUF3497); This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 213 to 257 amino acids in length. This domain is found associated with pfam02793, pfam00002, pfam01825. This domain has a single completely conserved residue W that may be functionally important." Q#8234 - CGI_10005786 superfamily 243086 497 534 4.73E-08 50.0662 cl02559 GPS superfamily - - "Latrophilin/CL-1-like GPS domain; Domain present in latrophilin/CL-1, sea urchin REJ and polycystin." 
Q#8234 - CGI_10005786 superfamily 243029 259 299 0.00361832 35.9533 cl02422 HRM superfamily - - Hormone receptor domain; This extracellular domain contains four conserved cysteines that probably form disulphide bridges. The domain is found in a variety of hormone receptors. It may be a ligand binding domain. Q#8235 - CGI_10005787 superfamily 215647 78 116 1.76E-05 41.4401 cl18338 7tm_2 superfamily N - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GPCRs). They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#8236 - CGI_10005788 superfamily 241754 84 249 6.03E-116 349.962 cl00286 Motor_domain superfamily C - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. Q#8236 - CGI_10005788 superfamily 111612 36 78 1.33E-06 44.009 cl03686 Myosin_N superfamily - - Myosin N-terminal SH3-like domain; This domain has an SH3-like fold. It is found at the N-terminus of many but not all myosins. The function of this domain is unknown. 
Q#8237 - CGI_10009522 superfamily 248458 26 170 5.36E-15 75.0429 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#8237 - CGI_10009522 superfamily 248458 255 436 5.81E-14 71.9613 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. 
MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#8238 - CGI_10009523 superfamily 246975 241 262 0.00669731 34.2449 cl15478 zf-C2H2 superfamily - - "Zinc finger, C2H2 type; The C2H2 zinc finger is the classical zinc finger domain. The two conserved cysteines and histidines co-ordinate a zinc ion. The following pattern describes the zinc finger. #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C] Where X can be any amino acid, and numbers in brackets indicate the number of residues. The positions marked # are those that are important for the stable fold of the zinc finger. The final position can be either his or cys. 
The C2H2 zinc finger is composed of two short beta strands followed by an alpha helix. The amino terminal part of the helix binds the major groove in DNA binding zinc fingers. The accepted consensus binding sequence for Sp1 is usually defined by the asymmetric hexanucleotide core GGGCGG but this sequence does not include, among others, the GAG (=CTC) repeat that constitutes a high-affinity site for Sp1 binding to the wt1 promoter." Q#8239 - CGI_10009524 superfamily 247755 1145 1239 1.32E-70 236.038 cl17201 ABC_ATPase superfamily N - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#8239 - CGI_10009524 superfamily 247755 45 206 2.67E-64 218.319 cl17201 ABC_ATPase superfamily C - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." 
Q#8239 - CGI_10009524 superfamily 244201 578 693 1.66E-32 124.265 cl05797 SMC_hinge superfamily - - SMC proteins Flexible Hinge Domain; This family represents the hinge region of the SMC (Structural Maintenance of Chromosomes) family of proteins. The hinge region is responsible for formation of the DNA interacting dimer. It is also possible that the precise structure of it is an essential determinant of the specificity of the DNA-protein interaction. Q#8240 - CGI_10009525 superfamily 217740 9 255 4.84E-91 271.542 cl18427 Scramblase superfamily - - Scramblase; Scramblase is palmitoylated and contains a potential protein kinase C phosphorylation site. Scramblase exhibits Ca2+-activated phospholipid scrambling activity in vitro. There are also possible SH3 and WW binding motifs. Scramblase is involved in the redistribution of phospholipids after cell activation or injury. Q#8241 - CGI_10009526 superfamily 247639 53 304 4.13E-40 142.6 cl16914 O-FucT_like superfamily - - "GDP-fucose protein O-fucosyltransferase and related proteins; O-fucosyltransferase-like proteins are GDP-fucose dependent enzymes with similarities to the family 1 glycosyltransferases (GT1). They are soluble ER proteins that may be proteolytically cleaved from a membrane-associated preprotein, and are involved in the O-fucosylation of protein substrates, the core fucosylation of growth factor receptors, and other processes." Q#8243 - CGI_10009528 superfamily 241748 324 531 6.76E-125 369.968 cl00279 APP_MetAP superfamily - - "A family including aminopeptidase P, aminopeptidase M, and prolidase. Also known as metallopeptidase family M24. This family of enzymes is able to cleave amido-, imido- and amidino-containing bonds. Members exhibit relatively narrow substrate specificity compared to other metallo-aminopeptidases, suggesting they play roles in regulation of biological processes rather than general protein degradation." 
Q#8243 - CGI_10009528 superfamily 216431 12 138 8.85E-13 65.7673 cl08317 Creatinase_N superfamily - - Creatinase/Prolidase N-terminal domain; This family includes the N-terminal non-catalytic domains from creatinase and prolidase. The exact function of this domain is uncertain. Q#8243 - CGI_10009528 superfamily 216431 183 276 0.000145935 40.7293 cl08317 Creatinase_N superfamily C - Creatinase/Prolidase N-terminal domain; This family includes the N-terminal non-catalytic domains from creatinase and prolidase. The exact function of this domain is uncertain. Q#8244 - CGI_10009529 superfamily 241599 20 73 1.73E-05 37.6081 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#8245 - CGI_10009530 superfamily 243072 202 362 1.44E-33 125.959 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#8245 - CGI_10009530 superfamily 243072 103 260 4.20E-29 113.247 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#8246 - CGI_10009531 superfamily 219673 105 271 7.16E-40 140.126 cl06835 COPIIcoated_ERV superfamily N - "Endoplasmic reticulum vesicle transporter; This family is conserved from plants and fungi to humans. Erv46 works in close conjunction with Erv41 and together they form a complex which cycles between the endoplasmic reticulum and Golgi complex. Erv46-41 interacts strongly with the endoplasmic reticulum glucosidase II. Mammalian glucosidase II comprises a catalytic alpha-subunit and a 58 kDa beta subunit, which is required for ER localisation. All proteins identified biochemically as Erv41p-Erv46p interactors are localised to the early secretory pathway and are involved in protein maturation and processing in the ER and/or sorting into COPII vesicles for transport to the Golgi." Q#8246 - CGI_10009531 superfamily 206021 5 105 7.79E-13 62.8973 cl16436 ERGIC_N superfamily - - "Endoplasmic Reticulum-Golgi Intermediate Compartment (ERGIC); This family is the N-terminal of ERGIC proteins, ER-Golgi intermediate compartment clusters, otherwise known as Ervs, and is associated with family COPIIcoated_ERV, pfam07970." Q#8247 - CGI_10009532 superfamily 243540 41 262 2.74E-12 63.8061 cl03831 HlyIII superfamily - - "Haemolysin-III related; Members of this family are integral membrane proteins. This family includes a protein with hemolytic activity from Bacillus cereus. It has been proposed that YOL002c encodes a Saccharomyces cerevisiae protein that plays a key role in metabolic pathways that regulate lipid and phosphate metabolism. In eukaryotes, members are seven-transmembrane pass molecules found to encode functional receptors with a broad range of apparent ligand specificities, including progestin and adipoQ receptors, and hence have been named PAQR proteins. The mammalian members include progesterone binding proteins. Unlike the case with GPCR receptor proteins, the evolutionary ancestry of the members of this family can be traced back to the Archaea." 
Q#8248 - CGI_10009533 superfamily 243540 54 276 1.21E-17 78.8288 cl03831 HlyIII superfamily - - "Haemolysin-III related; Members of this family are integral membrane proteins. This family includes a protein with hemolytic activity from Bacillus cereus. It has been proposed that YOL002c encodes a Saccharomyces cerevisiae protein that plays a key role in metabolic pathways that regulate lipid and phosphate metabolism. In eukaryotes, members are seven-transmembrane pass molecules found to encode functional receptors with a broad range of apparent ligand specificities, including progestin and adipoQ receptors, and hence have been named PAQR proteins. The mammalian members include progesterone binding proteins. Unlike the case with GPCR receptor proteins, the evolutionary ancestry of the members of this family can be traced back to the Archaea." Q#8249 - CGI_10009534 superfamily 241754 437 1140 0 677.832 cl00286 Motor_domain superfamily - - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. Q#8249 - CGI_10009534 superfamily 243072 126 281 2.88E-28 113.247 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#8249 - CGI_10009534 superfamily 243072 69 168 4.25E-15 74.7274 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). 
ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#8252 - CGI_10009537 superfamily 241625 1 122 8.87E-16 68.8896 cl00123 PROF superfamily - - "Profilin binds actin monomers, membrane polyphosphoinositides such as PI(4,5)P2, and poly-L-proline. Profilin can inhibit actin polymerization into F-actin by binding to monomeric actin (G-actin) and terminal F-actin subunits, but - as a regulator of the cytoskeleton - it may also promote actin polymerization. It plays a role in the assembly of branched actin filament networks, by activating WASP via binding to WASP's proline rich domain. Profilin may link the cytoskeleton with major signalling pathways by interacting with components of the phosphatidylinositol cycle and Ras pathway." Q#8253 - CGI_10009538 superfamily 242894 2 181 2.50E-35 125.953 cl02122 TFIIF_beta superfamily - - "Transcription initiation factor IIF, beta subunit; Accurate transcription in vivo requires at least six general transcription initiation factors, in addition to RNA polymerase II. Transcription initiation factor IIF (TFIIF) is a tetramer of two beta subunits associated with two alpha subunits which interacts directly with RNA polymerase II. The beta subunit of TFIIF is required for recruitment of RNA polymerase II onto the promoter." Q#8254 - CGI_10009539 superfamily 247743 53 192 3.10E-23 93.7499 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. 
Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#8256 - CGI_10001634 superfamily 241563 61 99 0.000569982 38.9996 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#8259 - CGI_10004436 superfamily 243092 15 296 2.88E-31 119.362 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#8260 - CGI_10004437 superfamily 247684 8 447 0 702.797 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#8260 - CGI_10004437 superfamily 247743 780 854 7.58E-06 46.5196 cl17189 AAA superfamily N - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. 
The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#8261 - CGI_10004438 superfamily 241578 1617 1795 2.15E-75 253.812 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#8265 - CGI_10013143 superfamily 245226 50 143 5.40E-50 159.188 cl10012 DnaQ_like_exo superfamily C - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). 
The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#8270 - CGI_10013148 superfamily 242406 4 105 3.51E-13 62.6089 cl01271 DUF1768 superfamily N - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. Q#8273 - CGI_10013151 superfamily 241958 234 489 3.58E-82 263.222 cl00573 SDF superfamily N - Sodium:dicarboxylate symporter family; Sodium:dicarboxylate symporter family. Q#8273 - CGI_10013151 superfamily 241958 26 167 2.66E-06 48.2806 cl00573 SDF superfamily C - Sodium:dicarboxylate symporter family; Sodium:dicarboxylate symporter family. Q#8276 - CGI_10013154 superfamily 241571 40 125 1.67E-11 57.8074 cl00049 CUB superfamily C - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#8277 - CGI_10013155 superfamily 246921 319 371 1.01E-12 65.0893 cl15299 FG-GAP superfamily - - "FG-GAP repeat; This family contains the extracellular repeat that is found in up to seven copies in alpha integrins. This repeat has been predicted to fold into a beta propeller structure. 
The repeat is called the FG-GAP repeat after two conserved motifs in the repeat. The FG-GAP repeats are found in the N terminus of integrin alpha chains, a region that has been shown to be important for ligand binding. A putative Ca2+ binding motif is found in some of the repeats." Q#8277 - CGI_10013155 superfamily 246921 241 297 6.84E-06 45.0589 cl15299 FG-GAP superfamily - - "FG-GAP repeat; This family contains the extracellular repeat that is found in up to seven copies in alpha integrins. This repeat has been predicted to fold into a beta propeller structure. The repeat is called the FG-GAP repeat after two conserved motifs in the repeat. The FG-GAP repeats are found in the N terminus of integrin alpha chains, a region that has been shown to be important for ligand binding. A putative Ca2+ binding motif is found in some of the repeats." Q#8277 - CGI_10013155 superfamily 246921 383 435 0.000472787 39.6661 cl15299 FG-GAP superfamily - - "FG-GAP repeat; This family contains the extracellular repeat that is found in up to seven copies in alpha integrins. This repeat has been predicted to fold into a beta propeller structure. The repeat is called the FG-GAP repeat after two conserved motifs in the repeat. The FG-GAP repeats are found in the N terminus of integrin alpha chains, a region that has been shown to be important for ligand binding. A putative Ca2+ binding motif is found in some of the repeats." Q#8277 - CGI_10013155 superfamily 246921 185 236 0.0021374 37.3549 cl15299 FG-GAP superfamily - - "FG-GAP repeat; This family contains the extracellular repeat that is found in up to seven copies in alpha integrins. This repeat has been predicted to fold into a beta propeller structure. The repeat is called the FG-GAP repeat after two conserved motifs in the repeat. The FG-GAP repeats are found in the N terminus of integrin alpha chains, a region that has been shown to be important for ligand binding. A putative Ca2+ binding motif is found in some of the repeats." 
Q#8278 - CGI_10013156 superfamily 218987 12 86 9.49E-35 116.378 cl05685 GCN5L1 superfamily N - GCN5-like protein 1 (GCN5L1); This family consists of several eukaryotic GCN5-like protein 1 (GCN5L1) sequences. The function of this family is unknown. Q#8286 - CGI_10021284 superfamily 241600 1 72 1.36E-18 75.7399 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#8287 - CGI_10021285 superfamily 245531 42 123 0.000279694 36.1878 cl11158 BEN superfamily - - "BEN domain; The BEN domain is found in diverse animal proteins such as BANP/SMAR1, NAC1 and the Drosophila mod(mdg4) isoform C, in the chordopoxvirus virosomal protein E5R and in several proteins of polydnaviruses. Computational analysis suggests that the BEN domain mediates protein-DNA and protein-protein interactions during chromatin organisation and transcription." Q#8290 - CGI_10021288 superfamily 243384 26 151 3.28E-42 139.914 cl03312 Flavokinase superfamily - - Riboflavin kinase; This family represents the C-terminal region of the bifunctional riboflavin biosynthesis protein known as RibC in Bacillus subtilis. The RibC protein from Bacillus subtilis has both flavokinase and flavin adenine dinucleotide synthetase (FAD-synthetase) activities. RibC plays an essential role in the flavin metabolism. This domain is thought to have kinase activity. 
Q#8291 - CGI_10021289 superfamily 241610 45 98 9.89E-19 76.905 cl00101 KU superfamily - - BPTI/Kunitz family of serine protease inhibitors; Structure is a disulfide rich alpha+beta fold. BPTI (bovine pancreatic trypsin inhibitor) is an extensively studied model structure. Q#8291 - CGI_10021289 superfamily 241610 107 159 1.81E-18 76.1346 cl00101 KU superfamily - - BPTI/Kunitz family of serine protease inhibitors; Structure is a disulfide rich alpha+beta fold. BPTI (bovine pancreatic trypsin inhibitor) is an extensively studied model structure. Q#8291 - CGI_10021289 superfamily 241610 170 222 5.91E-17 71.8974 cl00101 KU superfamily - - BPTI/Kunitz family of serine protease inhibitors; Structure is a disulfide rich alpha+beta fold. BPTI (bovine pancreatic trypsin inhibitor) is an extensively studied model structure. Q#8291 - CGI_10021289 superfamily 241610 11 38 2.46E-07 45.7038 cl00101 KU superfamily N - BPTI/Kunitz family of serine protease inhibitors; Structure is a disulfide rich alpha+beta fold. BPTI (bovine pancreatic trypsin inhibitor) is an extensively studied model structure. Q#8292 - CGI_10021290 superfamily 248458 92 450 1.84E-07 51.5457 cl17904 MFS superfamily - - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. 
The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#8293 - CGI_10021291 superfamily 245814 223 284 0.000373351 38.2391 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#8294 - CGI_10021292 superfamily 245847 83 198 0.000600948 38.6376 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." 
Q#8294 - CGI_10021292 superfamily 245847 279 394 0.00262326 37.0968 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#8299 - CGI_10021297 superfamily 245206 5 104 4.16E-36 128.785 cl09931 NADB_Rossmann superfamily N - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#8299 - CGI_10021297 superfamily 242406 101 195 1.26E-16 73.3945 cl01271 DUF1768 superfamily C - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. Q#8300 - CGI_10021298 superfamily 245206 1 64 2.77E-19 77.9382 cl09931 NADB_Rossmann superfamily C - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. 
NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#8301 - CGI_10021299 superfamily 245206 1 278 2.35E-123 355.282 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." 
Q#8303 - CGI_10021301 superfamily 243092 24 110 1.43E-09 56.5744 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#8304 - CGI_10003509 superfamily 248247 53 300 3.10E-75 240.199 cl17693 Integrin_beta superfamily N - "Integrin, beta chain; Integrins have been found in animals and their homologues have also been found in cyanobacteria, probably due to horizontal gene transfer. The sequences repeats have been trimmed due to an overlap with EGF." Q#8304 - CGI_10003509 superfamily 219677 303 330 0.00792165 33.9504 cl18521 EGF_2 superfamily - - EGF-like domain; This family contains EGF domains found in a variety of extracellular proteins. 
Q#8305 - CGI_10003510 superfamily 149701 114 155 6.72E-10 53.3741 cl07373 Integrin_b_cyt superfamily - - "Integrin beta cytoplasmic domain; Integrins are a group of transmembrane proteins which function as extracellular matrix receptors and in cell adhesion. Integrins are ubiquitously expressed and are heterodimeric, each composed of an alpha and beta subunit. Several variations of the alpha and beta subunits exist, and association of different alpha and beta subunits can have a different binding specificity. This domain corresponds to the cytoplasmic domain of the beta subunit." Q#8305 - CGI_10003510 superfamily 219677 38 62 0.000129831 38.5728 cl18521 EGF_2 superfamily - - EGF-like domain; This family contains EGF domains found in a variety of extracellular proteins. Q#8309 - CGI_10017734 superfamily 243072 13 118 1.51E-23 90.5206 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#8311 - CGI_10017736 superfamily 245010 8 109 1.48E-16 70.1085 cl09111 Prefoldin superfamily - - "Prefoldin is a hexameric molecular chaperone complex, found in both eukaryotes and archaea, that binds and stabilizes newly synthesized polypeptides allowing them to fold correctly. The complex contains two alpha and four beta subunits, the two subunits being evolutionarily related. In archaea, there is usually only one gene for each subunit while in eukaryotes there are two or more paralogous genes encoding each subunit adding heterogeneity to the structure of the hexamer. 
The structure of the complex consists of a double beta barrel assembly with six protruding coiled-coils." Q#8312 - CGI_10017737 superfamily 242432 66 223 3.19E-28 107.815 cl01321 SURF1 superfamily C - "SURF1 superfamily. Surf1/Shy1 has been implicated in the posttranslational steps of the biogenesis of the mitochondrially-encoded Cox1 subunit of cytochrome c oxidase (complex IV). Cytochrome c oxidase (complex IV), the terminal electron-transferring complex of the respiratory chain, is an assemblage of nuclear and mitochondrially-encoded subunits. Its assembly is mediated by nuclear encoded assembly factors, one of which is Surf1/Shy1. Mutations in human Surf1 are a major cause of Leigh syndrome, a severe neurodegenerative disorder." Q#8313 - CGI_10017738 superfamily 247744 25 231 3.42E-104 303.322 cl17190 NK superfamily - - "Nucleoside/nucleotide kinase (NK) is a protein superfamily consisting of multiple families of enzymes that share structural similarity and are functionally related to the catalysis of the reversible phosphate group transfer from nucleoside triphosphates to nucleosides/nucleotides, nucleoside monophosphates, or sugars. Members of this family play a wide variety of essential roles in nucleotide metabolism, the biosynthesis of coenzymes and aromatic compounds, as well as the metabolism of sugar and sulfate." 
Q#8314 - CGI_10017739 superfamily 247792 37 82 5.61E-05 41.2772 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#8314 - CGI_10017739 superfamily 241563 177 211 0.00117608 37.4588 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#8315 - CGI_10017740 superfamily 247745 20 315 9.01E-47 166.338 cl17191 GH38-57_N_LamB_YdjC_SF superfamily - - "Catalytic domain of glycoside hydrolase (GH) families 38 and 57, lactam utilization protein LamB/YcsF family proteins, YdjC-family proteins, and similar proteins; The superfamily possesses strong sequence similarities across a wide range of all three kingdoms of life. It mainly includes four families, glycoside hydrolases family 38 (GH38), heat stable retaining glycoside hydrolases family 57 (GH57), lactam utilization protein LamB/YcsF family, and YdjC-family. The GH38 family corresponds to class II alpha-mannosidases (alphaMII, EC 3.2.1.24), which contain intermediate Golgi alpha-mannosidases II, acidic lysosomal alpha-mannosidases, animal sperm and epididymal alpha-mannosidases, neutral ER/cytosolic alpha-mannosidases, and some putative prokaryotic alpha-mannosidases. 
AlphaMII possess a-1,3, a-1,6, and a-1,2 hydrolytic activity, and catalyzes the degradation of N-linked oligosaccharides by employing a two-step mechanism involving the formation of a covalent glycosyl enzyme complex. GH57 is a purely prokaryotic family with the majority of thermostable enzymes from extremophiles (many of them are archaeal hyperthermophiles), which exhibit the enzyme specificities of alpha-amylase (EC 3.2.1.1), 4-alpha-glucanotransferase (EC 2.4.1.25), amylopullulanase (EC 3.2.1.1/41), and alpha-galactosidase (EC 3.2.1.22). This family also includes many hypothetical proteins with uncharacterized activity and specificity. GH57 cleaves alpha-glycosidic bond by employing a retaining mechanism, which involves a glycosyl-enzyme intermediate, allowing transglycosylation. Although the exact molecular function of LamB/YcsF family and YdjC-family remains unclear, they show high sequence and structure homology to the members of GH38 and GH57. Their catalytic domains adopt a similar parallel 7-stranded beta/alpha barrel, which is remotely related to catalytic NodB homology domain of the carbohydrate esterase 4 superfamily." Q#8316 - CGI_10017741 superfamily 247856 83 142 3.11E-08 50.6241 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#8316 - CGI_10017741 superfamily 215754 278 368 5.35E-31 114.658 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#8316 - CGI_10017741 superfamily 215754 187 275 1.89E-27 105.028 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. 
Q#8316 - CGI_10017741 superfamily 215754 373 462 5.71E-19 81.916 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#8316 - CGI_10017741 superfamily 247856 21 78 0.00343951 35.6013 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#8317 - CGI_10017742 superfamily 247044 195 305 3.09E-27 103.485 cl15697 ADF_gelsolin superfamily - - Actin depolymerization factor/cofilin- and gelsolin-like domains; Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. Q#8317 - CGI_10017742 superfamily 247044 12 126 8.97E-20 82.684 cl15697 ADF_gelsolin superfamily - - Actin depolymerization factor/cofilin- and gelsolin-like domains; Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. Q#8319 - CGI_10017744 superfamily 247856 45 99 0.00128058 34.4457 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#8321 - CGI_10017746 superfamily 190614 255 334 4.04E-30 116.121 cl15647 YEATS superfamily - - "YEATS family; We have named this family the YEATS family, after `YNK7', `ENL', `AF-9', and `TFIIF small subunit'. This family also contains the GAS41 protein. All these proteins are thought to have a transcription stimulatory activity" Q#8322 - CGI_10017747 superfamily 219448 911 1015 2.62E-33 125.505 cl06523 DRMBL superfamily - - DNA repair metallo-beta-lactamase; The metallo-beta-lactamase fold contains five sequence motifs. The first four motifs are found in pfam00753 and are common to all metallo-beta-lactamases. The fifth motif appears to be specific to function. This entry represents the fifth motif from metallo-beta-lactamases involved in DNA repair. Q#8322 - CGI_10017747 superfamily 241867 723 810 2.77E-08 53.7114 cl00446 Lactamase_B superfamily NC - Metallo-beta-lactamase superfamily; Metallo-beta-lactamase superfamily. Q#8323 - CGI_10017748 superfamily 216686 67 132 8.37E-05 39.999 cl18377 Galactosyl_T superfamily N - "Galactosyltransferase; This family includes the galactosyltransferases UDP-galactose:2-acetamido-2-deoxy-D-glucose3beta-galactosyltransferase and UDP-Gal:beta-GlcNAc beta 1,3-galactosyltransferase. Specific galactosyltransferases transfer galactose to GlcNAc terminal chains in the synthesis of the lacto-series oligosaccharides types 1 and 2." Q#8323 - CGI_10017748 superfamily 245230 29 60 0.000446009 38.3918 cl10017 Tubulin_FtsZ superfamily N - "Tubulin/FtsZ: Family includes tubulin alpha-, beta-, gamma-, delta-, and epsilon-tubulins as well as FtsZ, all of which are involved in polymer formation. Tubulin is the major component of microtubules, but also exists as a heterodimer and as a curved oligomer. Microtubules exist in all eukaryotic cells and are responsible for many functions, including cellular transport, cell motility, and mitosis. 
FtsZ forms a ring-shaped septum at the site of bacterial cell division, which is required for constriction of cell membrane and cell envelope to yield two daughter cells. FtsZ can polymerize into tubes, sheets, and rings in vitro and is ubiquitous in eubacteria, archaea, and chloroplasts." Q#8326 - CGI_10017751 superfamily 207716 73 141 9.08E-15 64.971 cl02754 zf-LITAF-like superfamily - - "LITAF-like zinc ribbon domain; Members of this family display a conserved zinc ribbon structure with the motif C-XX-C- separated from the more C-terminal HX-C(P)X-C-X4-G-R motif by a variable region of usually 25-30 (hydrophobic) residues. Although it belongs to one of the zinc finger's fold groups (zinc ribbon), this particular domain was first identified in LPS-induced tumour necrosis alpha factor (LITAF) which is produced in mammalian cells after being challenged with lipopolysaccharide (LPS). The hydrophobic region probably inserts into the membrane rather than traversing it. Such an insertion brings together the N- and C-terminal C-XX-C motifs to form a compact Zn2+-binding structure." Q#8327 - CGI_10017752 superfamily 245206 40 307 1.03E-60 197.447 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. 
Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#8328 - CGI_10017753 superfamily 247684 12 185 8.71E-19 82.6367 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#8329 - CGI_10017754 superfamily 245206 51 102 0.00290341 35.2779 cl09931 NADB_Rossmann superfamily N - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. 
The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#8330 - CGI_10017755 superfamily 245206 1 27 1.18E-06 42.9819 cl09931 NADB_Rossmann superfamily NC - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#8331 - CGI_10017756 superfamily 245206 40 126 3.41E-25 96.9098 cl09931 NADB_Rossmann superfamily C - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. 
The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#8332 - CGI_10017757 superfamily 247684 12 185 4.23E-19 83.4071 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). 
Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#8333 - CGI_10017758 superfamily 207716 22 89 3.93E-22 86.157 cl02754 zf-LITAF-like superfamily - - "LITAF-like zinc ribbon domain; Members of this family display a conserved zinc ribbon structure with the motif C-XX-C- separated from the more C-terminal HX-C(P)X-C-X4-G-R motif by a variable region of usually 25-30 (hydrophobic) residues. Although it belongs to one of the zinc finger's fold groups (zinc ribbon), this particular domain was first identified in LPS-induced tumour necrosis alpha factor (LITAF) which is produced in mammalian cells after being challenged with lipopolysaccharide (LPS). The hydrophobic region probably inserts into the membrane rather than traversing it. Such an insertion brings together the N- and C-terminal C-XX-C motifs to form a compact Zn2+-binding structure." Q#8334 - CGI_10017759 superfamily 207716 41 114 4.66E-20 78.453 cl02754 zf-LITAF-like superfamily - - "LITAF-like zinc ribbon domain; Members of this family display a conserved zinc ribbon structure with the motif C-XX-C- separated from the more C-terminal HX-C(P)X-C-X4-G-R motif by a variable region of usually 25-30 (hydrophobic) residues. Although it belongs to one of the zinc finger's fold groups (zinc ribbon), this particular domain was first identified in LPS-induced tumour necrosis alpha factor (LITAF) which is produced in mammalian cells after being challenged with lipopolysaccharide (LPS). The hydrophobic region probably inserts into the membrane rather than traversing it. Such an insertion brings together the N- and C-terminal C-XX-C motifs to form a compact Zn2+-binding structure." 
Q#8335 - CGI_10017760 superfamily 215866 83 240 1.08E-27 107.411 cl18349 Arrestin_N superfamily - - "Arrestin (or S-antigen), N-terminal domain; Ig-like beta-sandwich fold. Scop reports duplication with C-terminal domain." Q#8335 - CGI_10017760 superfamily 243212 258 413 6.40E-24 97.029 cl02844 Arrestin_C superfamily - - "Arrestin (or S-antigen), C-terminal domain; Ig-like beta-sandwich fold. Scop reports duplication with N-terminal domain." Q#8336 - CGI_10017761 superfamily 215866 23 178 1.31E-33 123.589 cl18349 Arrestin_N superfamily - - "Arrestin (or S-antigen), N-terminal domain; Ig-like beta-sandwich fold. Scop reports duplication with C-terminal domain." Q#8336 - CGI_10017761 superfamily 243212 198 354 1.76E-28 108.97 cl02844 Arrestin_C superfamily - - "Arrestin (or S-antigen), C-terminal domain; Ig-like beta-sandwich fold. Scop reports duplication with N-terminal domain." Q#8337 - CGI_10017762 superfamily 248136 3 135 5.41E-53 166.339 cl17582 Sybindin superfamily - - "Sybindin-like family; Sybindin is a physiological syndecan-2 ligand on dendritic spines, the small protrusions on the surface of dendrites that receive the vast majority of excitatory synapses." Q#8338 - CGI_10017763 superfamily 245835 746 937 0.00357634 39.2614 cl12013 BAR superfamily - - "The Bin/Amphiphysin/Rvs (BAR) domain, a dimerization module that binds membranes and detects membrane curvature; BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions including organelle biogenesis, membrane trafficking or remodeling, and cell division and migration. Mutations in BAR containing proteins have been linked to diseases and their inactivation in cells leads to altered membrane dynamics. A BAR domain with an additional N-terminal amphipathic helix (an N-BAR) can drive membrane curvature. These N-BAR domains are found in amphiphysins and endophilins, among others. 
BAR domains are also frequently found alongside domains that determine lipid specificity, such as the Pleckstrin Homology (PH) and Phox Homology (PX) domains which are present in beta centaurins (ACAPs and ASAPs) and sorting nexins, respectively. A FES-CIP4 Homology (FCH) domain together with a coiled coil region is called the F-BAR domain and is present in Pombe/Cdc15 homology (PCH) family proteins, which include Fes/Fes tyrosine kinases, PACSIN or syndapin, CIP4-like proteins, and srGAPs, among others. The Inverse (I)-BAR or IRSp53/MIM homology Domain (IMD) is found in multi-domain proteins, such as IRSp53 and MIM, that act as scaffolding proteins and transducers of a variety of signaling pathways that link membrane dynamics and the underlying actin cytoskeleton. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The I-BAR domain induces membrane protrusions in the opposite direction compared to classical BAR and F-BAR domains, which produce membrane invaginations. BAR domains that also serve as protein interaction domains include those of arfaptin and OPHN1-like proteins, among others, which bind to Rac and Rho GAP domains, respectively." Q#8338 - CGI_10017763 superfamily 213465 887 993 0.00390679 38.752 cl17074 PRK03963 superfamily C - V-type ATP synthase subunit E; Provisional Q#8339 - CGI_10017764 superfamily 246925 311 651 9.28E-11 62.7582 cl15309 LRR_RI superfamily - - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. 
This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#8339 - CGI_10017764 superfamily 246925 70 329 0.00803881 37.7202 cl15309 LRR_RI superfamily - - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#8340 - CGI_10017765 superfamily 245622 773 891 7.35E-22 93.8282 cl11446 Rhomboid superfamily - - "Rhomboid family; This family contains integral membrane proteins that are related to Drosophila rhomboid protein. Members of this family are found in bacteria and eukaryotes. Rhomboid promotes the cleavage of the membrane-anchored TGF-alpha-like growth factor Spitz, allowing it to activate the Drosophila EGF receptor. Analysis has shown that Rhomboid-1 is an intramembrane serine protease (EC:3.4.21.105). Parasite-encoded rhomboid enzymes are also important for invasion of host cells by Toxoplasma and the malaria parasite." Q#8341 - CGI_10017766 superfamily 245864 11 499 2.05E-74 245.008 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. 
Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#8342 - CGI_10003803 superfamily 241583 1 93 3.53E-25 96.4862 cl00064 ZnMc superfamily N - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#8343 - CGI_10003804 superfamily 241609 140 218 1.02E-23 92.4411 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#8343 - CGI_10003804 superfamily 245213 106 138 0.000869955 36.0754 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#8344 - CGI_10003805 superfamily 241609 43 111 3.11E-20 79.7295 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#8346 - CGI_10000368 superfamily 243161 3 60 7.73E-05 36.6034 cl02739 THAP superfamily C - "THAP domain; The THAP domain is a putative DNA-binding domain (DBD) and probably also binds a zinc ion. It features the conserved C2CH architecture (consensus sequence: Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His). Other universal features include the location of the domain at the N-termini of proteins, its size of about 90 residues, a C-terminal AVPTIF box and several other conserved residues. Orthologues of the human THAP domain have been identified in other vertebrates and probably worms and flies, but not in other eukaryotes or any prokaryotes." Q#8348 - CGI_10001062 superfamily 245847 48 170 4.30E-10 53.6604 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." 
Q#8349 - CGI_10001119 superfamily 241691 32 264 1.60E-32 121.737 cl00213 DNA_BRE_C superfamily - - "DNA breaking-rejoining enzymes, C-terminal catalytic domain. The DNA breaking-rejoining enzyme superfamily includes type IB topoisomerases and tyrosine recombinases that share the same fold in their catalytic domain containing six conserved active site residues. The best-studied members of this diverse superfamily include human topoisomerase I, the bacteriophage lambda integrase, the bacteriophage P1 Cre recombinase, the yeast Flp recombinase and the bacterial XerD/C recombinases. Their overall reaction mechanism is essentially identical and involves cleavage of a single strand of a DNA duplex by nucleophilic attack of a conserved tyrosine to give a 3' phosphotyrosyl protein-DNA adduct. In the second rejoining step, a terminal 5' hydroxyl attacks the covalent adduct to release the enzyme and generate duplex DNA. The enzymes differ in that topoisomerases cleave and then rejoin the same 5' and 3' termini, whereas a site-specific recombinase transfers a 5' hydroxyl generated by recombinase cleavage to a new 3' phosphate partner located in a different duplex region. Many DNA breaking-rejoining enzymes also have N-terminal domains, which show little sequence or structure similarity." Q#8351 - CGI_10001154 superfamily 222429 6 80 8.76E-16 67.6508 cl18676 Myb_DNA-bind_5 superfamily - - Myb/SANT-like DNA-binding domain; This presumed domain appears to be related to other Myb/SANT like DNA binding domains. This family is greatly expanded in arthropods and higher eukaryotes. Q#8354 - CGI_10015859 superfamily 248097 285 408 7.29E-27 103.499 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. 
Q#8354 - CGI_10015859 superfamily 248022 40 241 2.60E-17 81.9403 cl17468 Aa_trans superfamily N - "Transmembrane amino acid transporter protein; This transmembrane region is found in many amino acid transporters including UNC-47 and MTR. UNC-47 encodes a vesicular amino butyric acid (GABA) transporter, (VGAT). UNC-47 is predicted to have 10 transmembrane domains. MTR is a N system amino acid transporter system protein involved in methyltryptophan resistance. Other members of this family include proline transporters and amino acid permeases." Q#8355 - CGI_10015860 superfamily 245601 568 681 4.22E-16 78.9552 cl11399 HP superfamily C - "Histidine phosphatase domain found in a functionally diverse set of proteins, mostly phosphatases; contains a His residue which is phosphorylated during the reaction; Catalytic domain of a functionally diverse set of proteins, most of which are phosphatases. The conserved catalytic core of this domain contains a His residue which is phosphorylated in the reaction. This set of proteins includes cofactor-dependent and cofactor-independent phosphoglycerate mutases (dPGM, and BPGM respectively), fructose-2,6-bisphosphatase (F26BP)ase, Sts-1, SixA, histidine acid phosphatases, phytases, and related proteins. Functions include roles in metabolism, signaling, or regulation, for example F26BPase affects glycolysis and gluconeogenesis through controlling the concentration of F26BP; BPGM controls the concentration of 2,3-BPG (the main allosteric effector of hemoglobin in human blood cells); human Sts-1 is a T-cell regulator; Escherichia coli Six A participates in the ArcB-dependent His-to-Asp phosphorelay signaling system; phytases scavenge phosphate from extracellular sources. Deficiency and mutation in many of the human members result in disease, for example erythrocyte BPGM deficiency is a disease associated with a decrease in the concentration of 2,3-BPG. 
Clinical applications include the use of prostatic acid phosphatase (PAP) as a serum marker for prostate cancer. Agricultural applications include the addition of phytases to animal feed." Q#8355 - CGI_10015860 superfamily 245601 806 949 4.64E-09 56.9988 cl11399 HP superfamily N - "Histidine phosphatase domain found in a functionally diverse set of proteins, mostly phosphatases; contains a His residue which is phosphorylated during the reaction; Catalytic domain of a functionally diverse set of proteins, most of which are phosphatases. The conserved catalytic core of this domain contains a His residue which is phosphorylated in the reaction. This set of proteins includes cofactor-dependent and cofactor-independent phosphoglycerate mutases (dPGM, and BPGM respectively), fructose-2,6-bisphosphatase (F26BP)ase, Sts-1, SixA, histidine acid phosphatases, phytases, and related proteins. Functions include roles in metabolism, signaling, or regulation, for example F26BPase affects glycolysis and gluconeogenesis through controlling the concentration of F26BP; BPGM controls the concentration of 2,3-BPG (the main allosteric effector of hemoglobin in human blood cells); human Sts-1 is a T-cell regulator; Escherichia coli Six A participates in the ArcB-dependent His-to-Asp phosphorelay signaling system; phytases scavenge phosphate from extracellular sources. Deficiency and mutation in many of the human members result in disease, for example erythrocyte BPGM deficiency is a disease associated with a decrease in the concentration of 2,3-BPG. Clinical applications include the use of prostatic acid phosphatase (PAP) as a serum marker for prostate cancer. Agricultural applications include the addition of phytases to animal feed." 
Q#8355 - CGI_10015860 superfamily 245601 392 432 1.32E-05 47.442 cl11399 HP superfamily C - "Histidine phosphatase domain found in a functionally diverse set of proteins, mostly phosphatases; contains a His residue which is phosphorylated during the reaction; Catalytic domain of a functionally diverse set of proteins, most of which are phosphatases. The conserved catalytic core of this domain contains a His residue which is phosphorylated in the reaction. This set of proteins includes cofactor-dependent and cofactor-independent phosphoglycerate mutases (dPGM, and BPGM respectively), fructose-2,6-bisphosphatase (F26BP)ase, Sts-1, SixA, histidine acid phosphatases, phytases, and related proteins. Functions include roles in metabolism, signaling, or regulation, for example F26BPase affects glycolysis and gluconeogenesis through controlling the concentration of F26BP; BPGM controls the concentration of 2,3-BPG (the main allosteric effector of hemoglobin in human blood cells); human Sts-1 is a T-cell regulator; Escherichia coli Six A participates in the ArcB-dependent His-to-Asp phosphorelay signaling system; phytases scavenge phosphate from extracellular sources. Deficiency and mutation in many of the human members result in disease, for example erythrocyte BPGM deficiency is a disease associated with a decrease in the concentration of 2,3-BPG. Clinical applications include the use of prostatic acid phosphatase (PAP) as a serum marker for prostate cancer. Agricultural applications include the addition of phytases to animal feed." Q#8355 - CGI_10015860 superfamily 247809 248 347 0.00150066 39.7481 cl17255 ATP-grasp_4 superfamily N - ATP-grasp domain; This family includes a diverse set of enzymes that possess ATP-dependent carboxylate-amine ligase activity. 
Q#8356 - CGI_10015861 superfamily 247916 58 169 3.25E-17 77.8262 cl17362 Transglut_core superfamily - - "Transglutaminase-like superfamily; This family includes animal transglutaminases and other bacterial proteins of unknown function. Sequence conservation in this superfamily primarily involves three motifs that centre around conserved cysteine, histidine, and aspartate residues that form the catalytic triad in the structurally characterized transglutaminase, the human blood clotting factor XIIIa'. On the basis of the experimentally demonstrated activity of the Methanobacterium phage pseudomurein endoisopeptidase, it is proposed that many, if not all, microbial homologues of the transglutaminases are proteases and that the eukaryotic transglutaminases have evolved from an ancestral protease." Q#8357 - CGI_10015862 superfamily 247724 26 44 1.72E-05 39.5156 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. 
Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#8358 - CGI_10015863 superfamily 216942 23 71 0.00083894 34.9092 cl03494 CDI superfamily - - Cyclin-dependent kinase inhibitor; Cell cycle progression is negatively controlled by cyclin-dependent kinase inhibitors (CDIs). CDIs are involved in cell cycle arrest at the G1 phase. Q#8361 - CGI_10015866 superfamily 241760 577 613 2.77E-11 59.7555 cl00295 ZZ superfamily C - "Zinc finger, ZZ type. Zinc finger present in dystrophin, CBP/p300 and many other proteins. The ZZ motif coordinates one or two zinc ions and most likely participates in ligand binding or molecular scaffolding. Many proteins containing ZZ motifs have other zinc-binding motifs as well, and the majority serve as scaffolds in pathways involving acetyltransferase, protein kinase, or ubiquitin-related activity. ZZ proteins can be grouped into the following functional classes: chromatin modifying, cytoskeletal scaffolding, ubiquitin binding or conjugating, and membrane receptor or ion-channel modifying proteins." Q#8361 - CGI_10015866 superfamily 217473 171 365 1.45E-40 149.052 cl03978 Mab-21 superfamily N - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#8363 - CGI_10015868 superfamily 218186 173 610 0 577.433 cl18446 PA26 superfamily - - PA26 p53-induced protein (sestrin); PA26 is a p53-inducible protein. Its function is unknown. It has similarity to pfam04636 in its N-terminus. Q#8364 - CGI_10015869 superfamily 243096 812 995 6.20E-24 101.22 cl02571 RhoGEF superfamily - - Guanine nucleotide exchange factor for Rho/Rac/Cdc42-like GTPases; Also called Dbl-homologous (DH) domain. It appears that PH domains invariably occur C-terminal to RhoGEF/DH domains.
Q#8364 - CGI_10015869 superfamily 243074 117 160 5.68E-08 50.9681 cl02535 F-box-like superfamily - - F-box-like; This is an F-box-like family. Q#8364 - CGI_10015869 superfamily 247725 1060 1141 3.77E-07 49.9455 cl17171 PH-like superfamily N - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting the protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and other proteins." Q#8364 - CGI_10015869 superfamily 247725 987 1076 0.00184973 39.1767 cl17171 PH-like superfamily C - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting the protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and other proteins." Q#8364 - CGI_10015869 superfamily 222625 400 484 0.00554402 37.5107 cl16748 DUF4347 superfamily C - "Domain of unknown function (DUF4347); This domain family is found in bacteria and eukaryotes, and is approximately 160 amino acids in length. There are two completely conserved residues (C and G) that may be functionally important."
Q#8367 - CGI_10015872 superfamily 248458 143 273 2.16E-19 89.6805 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#8367 - CGI_10015872 superfamily 248458 332 478 3.22E-13 70.8057 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. 
MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#8367 - CGI_10015872 superfamily 245864 587 992 3.38E-57 204.434 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. 
Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#8368 - CGI_10015873 superfamily 245814 41 108 5.07E-07 45.5579 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#8369 - CGI_10015874 superfamily 245864 3 385 3.60E-36 137.025 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. 
Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#8377 - CGI_10001563 superfamily 241632 138 369 6.90E-80 256.796 cl00137 SERPIN superfamily C - "SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia." 
Q#8380 - CGI_10001646 superfamily 247684 17 425 2.01E-99 312.674 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#8381 - CGI_10001503 superfamily 245864 1 180 1.22E-63 204.82 cl12078 p450 superfamily N - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. 
Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#8384 - CGI_10001358 superfamily 241563 59 96 2.12E-06 45.1628 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#8384 - CGI_10001358 superfamily 128778 101 213 0.00839823 35.3183 cl17972 BBC superfamily - - B-Box C-terminal domain; Coiled coil region C-terminal to (some) B-Box domains Q#8388 - CGI_10002151 superfamily 219525 274 313 0.000233362 39.3246 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#8391 - CGI_10002630 superfamily 241750 137 465 1.88E-130 383.908 cl00281 metallo-dependent_hydrolases superfamily - - "Superfamily of metallo-dependent hydrolases (also called amidohydrolase superfamily) is a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. 
In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. The family includes urease alpha, adenosine deaminase, phosphotriesterase dihydroorotases, allantoinases, hydantoinases, AMP-, adenine and cytosine deaminases, imidazolonepropionase, aryldialkylphosphatase, chlorohydrolases, formylmethanofuran dehydrogenases and others." Q#8395 - CGI_10003128 superfamily 243091 196 263 1.23E-09 54.8056 cl02566 SET superfamily - - "SET domain; SET domains are protein lysine methyltransferase enzymes. SET domains appear to be protein-protein interaction domains. It has been demonstrated that SET domains mediate interactions with a family of proteins that display similarity with dual-specificity phosphatases (dsPTPases). A subset of SET domains have been called PR domains. These domains are divergent in sequence from other SET domains, but also appear to mediate protein-protein interaction. The SET domain consists of two regions known as SET-N and SET-C. SET-C forms an unusual and conserved knot-like structure of probably functional importance. Additionally to SET-N and SET-C, an insert region (SET-I) and flanking regions of high structural variability form part of the overall structure." Q#8397 - CGI_10009672 superfamily 247743 149 205 8.81E-05 40.5923 cl17189 AAA superfamily C - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." 
Q#8397 - CGI_10009672 superfamily 248372 7 125 2.00E-11 58.75 cl17818 Nuc_deoxyrib_tr superfamily - - Nucleoside 2-deoxyribosyltransferase; Nucleoside 2-deoxyribosyltransferase EC:2.4.2.6 catalyzes the cleavage of the glycosidic bonds of 2'-deoxyribonucleosides. Q#8398 - CGI_10009673 superfamily 110440 105 132 0.00270046 32.7649 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase); proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#8402 - CGI_10009677 superfamily 246598 24 289 1.10E-170 478.616 cl13996 MPN superfamily - - "Mpr1p, Pad1p N-terminal (MPN) domains; MPN (also known as Mov34, PAD-1, JAMM, JAB, MPN+) domains are found at the N-termini of proteins with a variety of functions; they are components of the proteasome regulatory subunits, the signalosome (CSN), eukaryotic translation initiation factor 3 (eIF3) complexes, and regulators of transcription factors. These domains are isopeptidases that release ubiquitin from ubiquitinated proteins (thus having deubiquitinating (DUB) activity) that are tagged for degradation. Catalytically active MPN domains contain a metalloprotease signature known as the JAB1/MPN/Mov34 metalloenzyme (JAMM) motif. For example, Rpn11 (also known as POH1 or PSMD14), a subunit of the 19S proteasome lid involved in the ATP-dependent degradation of ubiquitinated proteins, contains the conserved JAMM motif involved in zinc ion coordination. Poh1 is a regulator of c-Jun, an important regulator of cell proliferation, differentiation, survival and death. 
JAB1 is a component of the COP9 signalosome (CSN), a regulatory particle of the ubiquitin (Ub)/26S proteasome system occurring in all eukaryotic cells; it cleaves the ubiquitin-like protein NEDD8 from the cullin subunit of the SCF (Skp1, Cullins, F-box proteins) family of E3 ubiquitin ligases. AMSH (associated molecule with the SH3 domain of STAM, also known as STAMBP), a member of JAMM/MPN+ deubiquitinases (DUBs), specifically cleaves Lys 63-linked polyubiquitin (poly-Ub) chains, thus facilitating the recycling and subsequent trafficking of receptors to the cell surface. Similarly, BRCC36, part of the nuclear complex that includes BRCA1 protein and is targeted to DNA damage foci after irradiation, specifically disassembles K63-linked polyUb. BRCC36 is aberrantly expressed in sporadic breast tumors, indicative of a potential role in the pathogenesis of the disease. Some variants of the JAB1/MPN domains lack key residues in their JAMM motif and are unable to coordinate a metal ion. Comparisons of key catalytic and metal binding residues explain why the MPN-containing proteins Mov34/PSMD7, Rpn8, CSN6, Prp8p, and the translation initiation factor 3 subunits f (p47) and h (p40) do not show catalytic isopeptidase activity. It has been proposed that the MPN domain in these proteins has a primarily structural function." Q#8403 - CGI_10009678 superfamily 245147 154 359 0.00438272 37.2555 cl09743 RNA_lig_T4_1 superfamily - - "RNA ligase; Members of this family include T4 phage proteins with ATP-dependent RNA ligase activity. Host defence to phage may include cleavage and inactivation of specific tRNA molecules; members of this family act to reverse this RNA damage. The enzyme is adenylated, transiently, on a Lys residue in a motif KXDGSL. This family also includes fungal tRNA ligases that have adenylyltransferase activity. tRNA ligases are enzymes required for the splicing of precursor tRNA molecules containing introns." 
Q#8404 - CGI_10009679 superfamily 191163 1 190 1.23E-127 360.201 cl04888 DUF667 superfamily - - "Protein of unknown function (DUF667); This family of proteins are highly conserved in eukaryotes. Some proteins in the family are annotated as transcription factors. However, there is currently no support for this in the literature." Q#8405 - CGI_10009680 superfamily 245201 37 322 1.12E-55 186.151 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#8407 - CGI_10009682 superfamily 241874 31 601 0 788.392 cl00456 SLC5-6-like_sbd superfamily - - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. 
They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#8408 - CGI_10009683 superfamily 247727 125 219 0.000575769 37.4095 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#8410 - CGI_10009685 superfamily 243092 31 252 1.72E-21 94.324 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate 
interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#8410 - CGI_10009685 superfamily 243141 355 420 0.0098793 35.0255 cl02687 RWD superfamily N - "RWD domain; This domain was identified in WD40 repeat proteins and Ring finger domain proteins. The function of this domain is unknown. GCN2 is the alpha-subunit of the only translation initiation factor (eIF2 alpha) kinase that appears in all eukaryotes. Its function requires an interaction with GCN1 via the domain at its N-terminus, which is termed the RWD domain after three major RWD-containing proteins: RING finger-containing proteins, WD-repeat-containing proteins, and yeast DEAD (DEXD)-like helicases. The structure forms an alpha + beta sandwich fold consisting of two layers: a four-stranded antiparallel beta-sheet, and three side-by-side alpha-helices." Q#8413 - CGI_10009688 superfamily 243074 4 48 5.61E-08 46.7309 cl02535 F-box-like superfamily - - F-box-like; This is an F-box-like family. Q#8414 - CGI_10009689 superfamily 192915 102 233 6.10E-09 54.2965 cl13451 DUF3506 superfamily - - Domain of unknown function (DUF3506); This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 131 to 148 amino acids in length. This domain has a conserved KLTGD sequence motif. Q#8416 - CGI_10003357 superfamily 248293 110 179 0.00120145 36.1778 cl17739 MADF_DNA_bdg superfamily - - Alcohol dehydrogenase transcription factor Myb/SANT-like; The myb/SANT-like domain in Adf-1 (MADF) is an approximately 80-amino-acid module that directs sequence specific DNA binding to a site consisting of multiple tri-nucleotide repeats. The MADF domain is found in one or more copies in eukaryotic and viral proteins and is often associated with the BESS domain. It is likely that the MADF domain is more closely related to the myb/SANT domain than it is to other HTH domains. 
Q#8418 - CGI_10004066 superfamily 215647 340 541 2.67E-34 131.192 cl18338 7tm_2 superfamily - - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GPCRs). They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#8418 - CGI_10004066 superfamily 221370 118 310 2.31E-11 62.7741 cl13441 DUF3497 superfamily - - "Domain of unknown function (DUF3497); This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 213 to 257 amino acids in length. This domain is found associated with pfam02793, pfam00002, pfam01825. This domain has a single completely conserved residue W that may be functionally important." Q#8418 - CGI_10004066 superfamily 243029 35 87 2.79E-07 48.5009 cl02422 HRM superfamily - - Hormone receptor domain; This extracellular domain contains four conserved cysteines that probably form disulphide bridges. The domain is found in a variety of hormone receptors. It may be a ligand binding domain. 
Q#8418 - CGI_10004066 superfamily 243035 660 706 0.00397938 36.6337 cl02432 CLECT superfamily NC - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#8419 - CGI_10004067 superfamily 245818 37 144 6.88E-11 58.5653 cl11966 Rel-Spo_like superfamily C - "RelA- and SpoT-like ppGpp Synthetases and Hydrolases, catalytic domain; The Rel-Spo superfamily includes the catalytic domains of Escherichia coli ppGpp synthetase (RelA), ppGpp synthetase/hydrolase (SpoT), and related proteins. RelA synthesizes (p)ppGpp in response to amino-acid starvation and in association with ribosomes. (p)ppGpp triggers the bacterial stringent response. SpoT catalyzes (p)ppGpp synthesis under carbon limitation in a ribosome-independent manner. It also catalyzes (p)ppGpp degradation. Gram-negative bacteria have two enzymes involved in (p)ppGpp metabolism while most Gram-positive organisms have a single Rel-Spo enzyme (Rel), which both synthesizes and degrades (p)ppGpp. The Arabidopsis thaliana Rel-Spo proteins, At-RSH1,-2, and-3 appear to regulate a rapid (p)ppGpp-mediated response to pathogens and other stresses. This catalytic domain is found in association with an N-terminal HD domain and a C-terminal metal dependent phosphohydrolase domain (TGS). Some Rel-Spo proteins also have a C-terminal regulatory ACT domain." 
Q#8419 - CGI_10004067 superfamily 220744 150 281 5.84E-30 113.096 cl11075 OAS1_C superfamily C - "2'-5'-oligoadenylate synthetase 1, domain 2, C-terminus; This is the largely alpha-helical, C-terminal half of 2'-5'-oligoadenylate synthetase 1, being described as domain 2 of the enzyme and homologous to a tandem ubiquitin repeat. It carries the region of enzymic activity between 320 and 344 at the extreme C-terminal end. Oligoadenylate synthetases are antiviral enzymes that counteract viral attack by degrading viral RNA. The enzyme uses ATP in 2'-specific nucleotidyl transfer reactions to synthesise 2'-5'-oligoadenylates, which activate latent ribonuclease, resulting in degradation of viral RNA and inhibition of virus replication. This domain is often associated with NTP_transf_2 pfam01909." Q#8422 - CGI_10010613 superfamily 245201 249 518 4.15E-48 167.799 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#8423 - CGI_10010614 superfamily 243035 116 236 7.50E-21 84.9789 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. 
Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. 
Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#8424 - CGI_10010615 superfamily 243035 178 298 3.55E-20 83.8233 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. 
Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#8425 - CGI_10010616 superfamily 219755 1 47 1.78E-18 73.5255 cl07021 SYF2 superfamily N - SYF2 splicing factor; Proteins in this family are involved in cell cycle progression and pre-mRNA splicing. Q#8426 - CGI_10010617 superfamily 216152 1 214 1.48E-56 185.21 cl02988 Glyco_transf_10 superfamily N - "Glycosyltransferase family 10 (fucosyltransferase); This family of Fucosyltransferases are the enzymes transferring fucose from GDP-Fucose to GlcNAc in an alpha1,3 linkage. This family is known as glycosyltransferase family 10." Q#8428 - CGI_10010619 superfamily 242406 4 59 6.84E-05 36.8005 cl01271 DUF1768 superfamily NC - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. 
Q#8429 - CGI_10010620 superfamily 216152 1 214 4.20E-53 175.965 cl02988 Glyco_transf_10 superfamily N - "Glycosyltransferase family 10 (fucosyltransferase); This family of Fucosyltransferases are the enzymes transferring fucose from GDP-Fucose to GlcNAc in an alpha1,3 linkage. This family is known as glycosyltransferase family 10." Q#8430 - CGI_10010621 superfamily 248458 31 194 0.00407806 35.7525 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." 
Q#8431 - CGI_10010622 superfamily 246902 1318 1497 4.52E-97 311.416 cl15239 PLDc_SF superfamily - - "Catalytic domain of phospholipase D superfamily proteins; Catalytic domain of phospholipase D (PLD) superfamily proteins. The PLD superfamily is composed of a large and diverse group of proteins including plant, mammalian and bacterial PLDs, bacterial cardiolipin (CL) synthases, bacterial phosphatidylserine synthases (PSS), eukaryotic phosphatidylglycerophosphate (PGP) synthase, eukaryotic tyrosyl-DNA phosphodiesterase 1 (Tdp1), and some bacterial endonucleases (Nuc and BfiI), among others. PLD enzymes hydrolyze phospholipid phosphodiester bonds to yield phosphatidic acid and a free polar head group. They can also catalyze the transphosphatidylation of phospholipids to acceptor alcohols. The majority of members in this superfamily contain a short conserved sequence motif (H-x-K-x(4)-D, where x represents any amino acid residue), called the HKD signature motif. There are varying expanded forms of this motif in different family members. Some members contain variant HKD motifs. Most PLD enzymes are monomeric proteins with two HKD motif-containing domains. Two HKD motifs from two domains form a single active site. Some PLD enzymes have only one copy of the HKD motif per subunit but form a functionally active dimer, which has a single active site at the dimer interface containing the two HKD motifs from both subunits. Different PLD enzymes may have evolved through domain fusion of a common catalytic core with separate substrate recognition domains. Despite their various catalytic functions and a very broad range of substrate specificities, the diverse group of PLD enzymes can bind to a phosphodiester moiety. Most of them are active as bi-lobed monomers or dimers, and may possess similar core structures for catalytic activity. 
They are generally thought to utilize a common two-step ping-pong catalytic mechanism, involving an enzyme-substrate intermediate, to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group." Q#8431 - CGI_10010622 superfamily 246902 618 764 1.00E-86 280.602 cl15239 PLDc_SF superfamily - - "Catalytic domain of phospholipase D superfamily proteins; Catalytic domain of phospholipase D (PLD) superfamily proteins. The PLD superfamily is composed of a large and diverse group of proteins including plant, mammalian and bacterial PLDs, bacterial cardiolipin (CL) synthases, bacterial phosphatidylserine synthases (PSS), eukaryotic phosphatidylglycerophosphate (PGP) synthase, eukaryotic tyrosyl-DNA phosphodiesterase 1 (Tdp1), and some bacterial endonucleases (Nuc and BfiI), among others. PLD enzymes hydrolyze phospholipid phosphodiester bonds to yield phosphatidic acid and a free polar head group. They can also catalyze the transphosphatidylation of phospholipids to acceptor alcohols. The majority of members in this superfamily contain a short conserved sequence motif (H-x-K-x(4)-D, where x represents any amino acid residue), called the HKD signature motif. There are varying expanded forms of this motif in different family members. Some members contain variant HKD motifs. Most PLD enzymes are monomeric proteins with two HKD motif-containing domains. Two HKD motifs from two domains form a single active site. Some PLD enzymes have only one copy of the HKD motif per subunit but form a functionally active dimer, which has a single active site at the dimer interface containing the two HKD motifs from both subunits. 
Different PLD enzymes may have evolved through domain fusion of a common catalytic core with separate substrate recognition domains. Despite their various catalytic functions and a very broad range of substrate specificities, the diverse group of PLD enzymes can bind to a phosphodiester moiety. Most of them are active as bi-lobed monomers or dimers, and may possess similar core structures for catalytic activity. They are generally thought to utilize a common two-step ping-pong catalytic mechanism, involving an enzyme-substrate intermediate, to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group." Q#8431 - CGI_10010622 superfamily 247725 461 592 3.87E-52 181.689 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#8431 - CGI_10010622 superfamily 243088 342 473 2.56E-39 145.217 cl02563 PX_domain superfamily - - "The Phox Homology domain, a phosphoinositide binding module; The PX domain is a phosphoinositide (PI) binding module involved in targeting proteins to membranes. 
Proteins containing PX domains interact with PIs and have been implicated in highly diverse functions such as cell signaling, vesicular trafficking, protein sorting, lipid modification, cell polarity and division, activation of T and B cells, and cell survival. Many members of this superfamily bind phosphatidylinositol-3-phosphate (PI3P) but in some cases, other PIs such as PI4P or PI(3,4)P2, among others, are the preferred substrates. In addition to protein-lipid interaction, the PX domain may also be involved in protein-protein interaction, as in the cases of p40phox, p47phox, and some sorting nexins (SNXs). The PX domain is conserved from yeast to humans and is found in more than 100 proteins. The majority of PX domain-containing proteins are SNXs, which play important roles in endosomal sorting." Q#8431 - CGI_10010622 superfamily 195686 55 217 6.62E-54 187.987 cl08291 TCTP superfamily - - Translationally controlled tumour protein; Translationally controlled tumour protein. Q#8432 - CGI_10010623 superfamily 222150 756 781 0.00581993 36.6009 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#8432 - CGI_10010623 superfamily 222150 728 753 0.00994065 35.8305 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#8434 - CGI_10010625 superfamily 245367 6 198 9.84E-72 238.297 cl10727 DUF2042 superfamily C - Uncharacterized conserved protein (DUF2042); This entry is the conserved N-terminal 300 residues of a group of proteins found from protozoa to Humans. The function is unknown. Q#8434 - CGI_10010625 superfamily 245367 196 833 3.98E-60 218.538 cl10727 DUF2042 superfamily - - Uncharacterized conserved protein (DUF2042); This entry is the conserved N-terminal 300 residues of a group of proteins found from protozoa to Humans. The function is unknown. 
Q#8438 - CGI_10012816 superfamily 204985 68 124 2.65E-08 52.1739 cl14987 Chorein_N superfamily N - "N-terminal region of Chorein, a TM vesicle-mediated sorter; Although mutations in the full-length vacuolar protein sorting 13A (VPS13A) protein in vertebrates lead to the disease of chorea-acanthocytosis, the exact function of any of the regions within the protein is not yet known. This region is the proposed leucine zipper at the N-terminus. The full-length protein is a transmembrane protein with a presumed role in vesicle-mediated sorting and intracellular protein transport." Q#8439 - CGI_10012817 superfamily 221585 258 353 6.81E-22 92.1373 cl13842 hSac2 superfamily - - "Inositol phosphatase; This domain family is found in eukaryotes, and is approximately 120 amino acids in length. The family is found in association with pfam02383. hSac2 functions as an inositol polyphosphate 5-phosphatase." Q#8439 - CGI_10012817 superfamily 217007 3 74 2.33E-21 94.589 cl11995 Syja_N superfamily N - SacI homology domain; This Pfam family represents a protein domain which shows homology to the yeast protein SacI. The SacI homology domain is most notably found at the amino terminal of the inositol 5'-phosphatase synaptojanin. Q#8440 - CGI_10012818 superfamily 247907 789 946 2.26E-24 101.341 cl17353 LamG superfamily - - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." Q#8440 - CGI_10012818 superfamily 245213 621 657 1.38E-07 49.5574 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#8440 - CGI_10012818 superfamily 245213 699 735 3.44E-07 48.4018 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#8440 - CGI_10012818 superfamily 245213 667 697 1.57E-05 43.7794 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#8440 - CGI_10012818 superfamily 245213 546 581 0.00180031 37.6162 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#8440 - CGI_10012818 superfamily 245213 751 785 0.00854612 35.6902 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#8440 - CGI_10012818 superfamily 248288 1016 1076 8.31E-08 51.2501 cl17734 DAN superfamily - - "DAN domain; This domain contains 9 conserved cysteines and is extracellular. Therefore the cysteines may form disulphide bridges. This family of proteins has been termed the DAN family after the first member to be reported. This family includes DAN, Cerberus and Gremlin. The gremlin protein is an antagonist of bone morphogenetic protein signaling. It is postulated that all members of this family antagonise different TGF beta pfam00019 ligands. Recent work shows that the DAN protein is not an efficient antagonist of BMP-2/4 class signals, we found that DAN was able to interact with GDF-5 in a frog embryo assay, suggesting that DAN may regulate signaling by the GDF-5/6/7 class of BMPs in vivo." 
Q#8440 - CGI_10012818 superfamily 243030 114 145 4.87E-05 42.3047 cl02423 LRRNT superfamily - - Leucine rich repeat N-terminal domain; Leucine Rich Repeats pfam00560 are short sequence motifs present in a number of proteins with diverse functions and cellular locations. Leucine Rich Repeats are often flanked by cysteine rich domains. This domain is often found at the N-terminus of tandem leucine rich repeats. Q#8440 - CGI_10012818 superfamily 243030 353 384 6.79E-05 41.9195 cl02423 LRRNT superfamily - - Leucine rich repeat N-terminal domain; Leucine Rich Repeats pfam00560 are short sequence motifs present in a number of proteins with diverse functions and cellular locations. Leucine Rich Repeats are often flanked by cysteine rich domains. This domain is often found at the N-terminus of tandem leucine rich repeats. Q#8440 - CGI_10012818 superfamily 199168 402 425 7.05E-05 41.5684 cl15310 LRR_TYP superfamily - - "Leucine-rich repeats, typical (most populated) subfamily; Leucine-rich repeats, typical (most populated) subfamily. " Q#8440 - CGI_10012818 superfamily 214507 46 95 0.00021192 40.4912 cl15307 LRRCT superfamily - - Leucine rich repeat C-terminal domain; Leucine rich repeat C-terminal domain. Q#8440 - CGI_10012818 superfamily 199168 165 188 0.000258602 40.0276 cl15310 LRR_TYP superfamily - - "Leucine-rich repeats, typical (most populated) subfamily; Leucine-rich repeats, typical (most populated) subfamily. " Q#8440 - CGI_10012818 superfamily 199168 450 471 0.00318243 36.946 cl15310 LRR_TYP superfamily - - "Leucine-rich repeats, typical (most populated) subfamily; Leucine-rich repeats, typical (most populated) subfamily. 
" Q#8442 - CGI_10012820 superfamily 243092 38 332 2.34E-25 106.65 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#8443 - CGI_10012821 superfamily 245226 9 115 4.82E-13 63.0884 cl10012 DnaQ_like_exo superfamily N - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). 
The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#8444 - CGI_10012822 superfamily 241583 120 306 1.25E-64 218.209 cl00064 ZnMc superfamily - - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#8444 - CGI_10012822 superfamily 247097 428 464 0.000757376 38.8997 cl15839 ShK superfamily - - ShK domain-like; This domain is found in several C. elegans proteins. The domain is 30 amino acids long and rich in cysteine residues. There are 6 conserved cysteine positions in the domain that form three disulphide bridges. The domain is found in the potassium channel inhibitor ShK in sea anemone. Q#8444 - CGI_10012822 superfamily 247097 761 795 0.00712124 36.2033 cl15839 ShK superfamily - - ShK domain-like; This domain is found in several C. elegans proteins. The domain is 30 amino acids long and rich in cysteine residues. There are 6 conserved cysteine positions in the domain that form three disulphide bridges. The domain is found in the potassium channel inhibitor ShK in sea anemone. 
Q#8446 - CGI_10012824 superfamily 246612 579 671 1.03E-13 68.9853 cl14057 BPL_LplA_LipB superfamily N - "Biotin/lipoate A/B protein ligase family; This family includes biotin protein ligase, lipoate-protein ligase A and B. Biotin is covalently attached at the active site of certain enzymes that transfer carbon dioxide from bicarbonate to organic acids to form cellular metabolites. Biotin protein ligase (BPL) is the enzyme responsible for attaching biotin to a specific lysine at the active site of biotin enzymes. Each organism probably has only one BPL. Biotin attachment is a two step reaction that results in the formation of an amide linkage between the carboxyl group of biotin and the epsilon-amino group of the modified lysine. Lipoate-protein ligase A (LPLA) catalyzes the formation of an amide linkage between lipoic acid and a specific lysine residue in lipoate dependent enzymes. The unusual biosynthesis pathway of lipoic acid is mechanistically intertwined with attachment of the cofactor." Q#8446 - CGI_10012824 superfamily 241555 397 475 0.00110701 40.7866 cl00020 GAT_1 superfamily NC - "Type 1 glutamine amidotransferase (GATase1)-like domain; Type 1 glutamine amidotransferase (GATase1)-like domain. This group contains proteins similar to Class I glutamine amidotransferases, the intracellular PH1704 from Pyrococcus horikoshii, the C-terminal of the large catalase: Escherichia coli HP-II, Sinorhizobium meliloti Rm1021 ThuA, the A4 beta-galactosidase middle domain and peptidase E. The majority of proteins in this group have a reactive Cys found in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow. For Class I glutamine amidotransferases proteins which transfer ammonia from the amide side chain of glutamine to an acceptor substrate, this Cys forms a Cys-His-Glu catalytic triad in the active site. 
Glutamine amidotransferase activity can be found in a range of biosynthetic enzymes included in this cd: glutamine amidotransferase, formylglycinamide ribonucleotide, GMP synthetase, anthranilate synthase component II, glutamine-dependent carbamoyl phosphate synthase (CPSase), cytidine triphosphate synthetase, gamma-glutamyl hydrolase, imidazole glycerol phosphate synthase, and cobyric acid synthase. For Pyrococcus horikoshii PH1704, the Cys of the nucleophile elbow together with a different His and a Glu from an adjacent monomer form a catalytic triad different from the typical GATase1 triad. Peptidase E is believed to be a serine peptidase having a Ser-His-Glu catalytic triad which differs from the Cys-His-Glu catalytic triad of typical GATase1 domains, by having a Ser in place of the reactive Cys at the nucleophile elbow. The E. coli HP-II C-terminal domain, S. meliloti Rm1021 ThuA and the A4 beta-galactosidase middle domain lack the catalytic triad typical of GATase1 domains. GATase1-like domains can occur either as single polypeptides, as in Class I glutamine amidotransferases, or as domains in a much larger multifunctional synthase protein, such as CPSase. Peptidase E has a circular permutation in the common core of a typical GATase1 domain." Q#8447 - CGI_10012825 superfamily 241832 7 75 9.44E-20 80.3084 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). 
Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#8447 - CGI_10012825 superfamily 243175 89 208 9.03E-13 61.8109 cl02776 GST_C_family superfamily - - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. 
Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." Q#8448 - CGI_10012826 superfamily 245814 189 259 2.77E-10 55.1879 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#8448 - CGI_10012826 superfamily 245814 54 136 7.48E-07 45.5579 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#8450 - CGI_10012828 superfamily 241782 34 302 8.14E-98 297.558 cl00321 AAT_I superfamily C - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. 
PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phosphorylase family (fold type V)." Q#8453 - CGI_10001473 superfamily 243179 7 90 8.91E-21 81.5809 cl02781 tetraspanin_LEL superfamily N - "Tetraspanin, extracellular domain or large extracellular loop (LEL). Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. The tetraspanin family contains CD9, CD63, CD37, CD53, CD82, CD151, and CD81, amongst others. Tetraspanins are involved in diverse processes such as cell activation and proliferation, adhesion and motility, differentiation, cancer, and others. 
Their various functions may relate to their ability to act as molecular facilitators, grouping specific cell-surface proteins and affecting formation and stability of signaling complexes. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web", which may also include integrins." Q#8454 - CGI_10003828 superfamily 247727 75 171 0.00568455 33.8212 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#8455 - CGI_10003829 superfamily 207921 269 385 6.21E-40 139.225 cl03350 Ribosomal_L28e superfamily - - Ribosomal L28e protein family; Ribosomal L28e protein family. Q#8456 - CGI_10003830 superfamily 241550 56 124 2.61E-16 71.8685 cl00015 nt_trans superfamily C - "nucleotidyl transferase superfamily; nt_trans (nucleotidyl transferase) This superfamily includes the class I amino-acyl tRNA synthetases, pantothenate synthetase (PanC), ATP sulfurylase, and the cytidylyltransferases, all of which have a conserved dinucleotide-binding domain." Q#8458 - CGI_10003239 superfamily 248458 139 510 1.10E-20 92.3769 cl17904 MFS superfamily - - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. 
MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#8459 - CGI_10003240 superfamily 248458 139 476 6.23E-19 86.9841 cl17904 MFS superfamily - - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. 
Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#8460 - CGI_10003241 superfamily 248458 162 511 2.42E-19 88.1397 cl17904 MFS superfamily - - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. 
The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#8461 - CGI_10003242 superfamily 248458 166 477 7.49E-19 86.5989 cl17904 MFS superfamily - - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. 
Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#8462 - CGI_10005641 superfamily 248012 43 135 2.20E-13 63.0613 cl17458 TIR_2 superfamily C - TIR domain; This is a family of bacterial Toll-like receptors. Q#8464 - CGI_10005643 superfamily 241597 34 99 2.43E-32 115.088 cl00082 HMG-box superfamily - - "High Mobility Group (HMG)-box is found in a variety of eukaryotic chromosomal proteins and transcription factors. HMGs bind to the minor groove of DNA and have been classified by DNA binding preferences. Two phylogenically distinct groups of Class I proteins bind DNA in a sequence specific fashion and contain a single HMG box. One group (SOX-TCF) includes transcription factors, TCF-1, -3, -4; and also SRY and LEF-1, which bind four-way DNA junctions and duplex DNA targets. The second group (MATA) includes fungal mating type gene products MC, MATA1 and Ste11. Class II and III proteins (HMGB-UBF) bind DNA in a non-sequence specific fashion and contain two or more tandem HMG boxes. Class II members include non-histone chromosomal proteins, HMG1 and HMG2, which bind to bent or distorted DNA such as four-way DNA junctions, synthetic DNA cruciforms, kinked cisplatin-modified DNA, DNA bulges, cross-overs in supercoiled DNA, and can cause looping of linear DNA. 
Class III members include nucleolar and mitochondrial transcription factors, UBF and mtTF1, which bind four-way DNA junctions." Q#8466 - CGI_10005645 superfamily 247743 869 1000 5.10E-23 97.6019 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#8466 - CGI_10005645 superfamily 247743 586 731 3.67E-06 47.1407 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#8466 - CGI_10005645 superfamily 220158 60 135 1.08E-14 71.4639 cl07772 PEX-1N superfamily - - "Peroxisome biogenesis factor 1, N-terminal; Members of this family adopt a double psi beta-barrel fold, similar in structure to the Cdc48 N-terminal domain. It has been suggested that this domain may be involved in interactions with ubiquitin, ubiquitin-like protein modifiers, or ubiquitin-like domains, such as Ubx. 
Furthermore, the domain may possess a putative adaptor or substrate binding site, allowing for peroxisomal biogenesis, membrane fusion and protein translocation." Q#8467 - CGI_10005646 superfamily 241647 137 167 1.55E-08 50.6042 cl00157 WW superfamily - - Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs. Q#8467 - CGI_10005646 superfamily 241647 64 108 9.75E-06 42.515 cl00157 WW superfamily - - Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs. Q#8468 - CGI_10021060 superfamily 243035 27 70 5.72E-05 37.9846 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. 
This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. 
In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#8470 - CGI_10021062 superfamily 243092 9 167 6.25E-30 112.428 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#8471 - CGI_10021063 superfamily 241607 85 119 0.000139473 39.1754 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chyomotrypsin, avian ovomucoids, and elastases. 
The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#8472 - CGI_10021064 superfamily 241600 510 682 4.22E-62 207.863 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. 
C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#8472 - CGI_10021064 superfamily 241619 443 493 2.02E-05 43.3397 cl00112 PAN_APPLE superfamily C - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#8473 - CGI_10021065 superfamily 241600 96 267 7.61E-68 212.486 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." 
Q#8473 - CGI_10021065 superfamily 241619 31 79 4.13E-05 40.6433 cl00112 PAN_APPLE superfamily C - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#8477 - CGI_10021069 superfamily 242884 7 102 8.98E-36 119.253 cl02104 Ribosomal_L36e superfamily - - Ribosomal protein L36e; Ribosomal protein L36e. Q#8478 - CGI_10021070 superfamily 245814 329 381 0.000442871 38.6243 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#8480 - CGI_10021072 superfamily 149667 1 120 1.01E-18 83.9591 cl07343 GON superfamily C - GON domain; The GON domain is found in the ADAMTS (a disintegrin and metalloproteinase domain with thrombospondin type-1 modules) family of proteins. It contains several conserved cysteine residues. Q#8480 - CGI_10021072 superfamily 243093 142 221 6.27E-18 78.7262 cl02568 WSC superfamily - - WSC domain; This domain may be involved in carbohydrate binding. 
Q#8481 - CGI_10021073 superfamily 241611 89 224 5.99E-14 71.2656 cl00102 PTX superfamily - - "Pentraxins are plasma proteins characterized by their pentameric discoid assembly and their Ca2+ dependent ligand binding, such as Serum amyloid P component (SAP) and C-reactive Protein (CRP), which are cytokine-inducible acute-phase proteins implicated in innate immunity. CRP binds to ligands containing phosphocholine, SAP binds to amyloid fibrils, DNA, chromatin, fibronectin, C4-binding proteins and glycosaminoglycans. "Long" pentraxins have N-terminal extensions to the common pentraxin domain; one group, the neuronal pentraxins, may be involved in synapse formation and remodeling, and they may also be able to form heteromultimers." Q#8481 - CGI_10021073 superfamily 243093 587 663 1.85E-05 44.4434 cl02568 WSC superfamily - - WSC domain; This domain may be involved in carbohydrate binding. Q#8481 - CGI_10021073 superfamily 238012 1016 1044 0.00332072 37.3338 cl11390 EGF_Lam superfamily N - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#8483 - CGI_10021075 superfamily 215827 144 255 5.98E-11 59.7896 cl02830 Tyrosinase superfamily C - Common central domain of tyrosinase; This family also contains polyphenol oxidases and some hemocyanins. Binds two copper ions via two sets of three histidines. This family is related to pfam00372. Q#8484 - CGI_10021076 superfamily 215827 144 311 2.80E-24 101.391 cl02830 Tyrosinase superfamily - - Common central domain of tyrosinase; This family also contains polyphenol oxidases and some hemocyanins. 
Binds two copper ions via two sets of three histidines. This family is related to pfam00372. Q#8485 - CGI_10021077 superfamily 241599 76 130 5.49E-08 48.7789 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#8485 - CGI_10021077 superfamily 243166 132 282 1.01E-35 129.336 cl02759 TRAM_LAG1_CLN8 superfamily C - TLC domain; TLC domain. Q#8487 - CGI_10021079 superfamily 247856 79 133 1.12E-17 72.5805 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#8487 - CGI_10021079 superfamily 247856 7 63 4.77E-11 54.4761 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#8488 - CGI_10021080 superfamily 215754 24 118 5.24E-23 91.546 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#8488 - CGI_10021080 superfamily 215754 124 223 9.86E-23 90.7756 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#8488 - CGI_10021080 superfamily 215754 230 351 2.94E-15 70.36 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. 
Q#8489 - CGI_10021081 superfamily 245213 3332 3365 2.34E-05 44.935 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#8489 - CGI_10021081 superfamily 245213 496 536 0.000120683 43.009 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#8489 - CGI_10021081 superfamily 245213 756 788 0.000295771 41.8534 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#8489 - CGI_10021081 superfamily 245213 1133 1173 0.000729915 40.6978 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#8489 - CGI_10021081 superfamily 245213 1006 1039 0.000969351 40.3126 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#8489 - CGI_10021081 superfamily 245213 3081 3114 0.00120169 39.9274 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#8489 - CGI_10021081 superfamily 245213 3458 3493 0.00152092 39.5422 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#8489 - CGI_10021081 superfamily 245213 798 828 0.0022604 39.157 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#8489 - CGI_10021081 superfamily 245213 669 706 0.00256141 38.7718 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#8489 - CGI_10021081 superfamily 245213 2656 2698 0.00280058 38.7718 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#8489 - CGI_10021081 superfamily 243124 98 192 1.65E-14 74.3857 cl02648 NIDO superfamily C - Nidogen-like; This is a nidogen-like domain (NIDO) domain and is an extracellular domain found in nidogen and hypothetical proteins of unknown function. Q#8489 - CGI_10021081 superfamily 243065 2223 2392 1.21E-08 56.6813 cl02516 VWD superfamily - - von Willebrand factor type D domain; Luciferin-2-monooxygenase from Vargula hilgendorfii contains a vwd domain. Its function is unrelated but the similarity is very strong by several methods. Q#8489 - CGI_10021081 superfamily 241578 2697 2737 2.00E-07 53.5427 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. 
There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#8489 - CGI_10021081 superfamily 241578 328 368 6.25E-07 52.0019 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#8489 - CGI_10021081 superfamily 243060 3588 3655 5.68E-05 44.6772 cl02507 SEA superfamily N - "SEA domain; Domain found in Sea urchin sperm protein, Enterokinase, Agrin (SEA). Proposed function of regulating or binding carbohydrate side chains. Recently a proteolytic activity has been shown for a SEA domain." Q#8489 - CGI_10021081 superfamily 245213 2909 2948 0.00045505 41.1804 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#8489 - CGI_10021081 superfamily 241578 834 868 0.000649714 43.1424 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#8489 - CGI_10021081 superfamily 241578 404 450 0.000909931 42.372 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. 
The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#8489 - CGI_10021081 superfamily 245213 2740 2778 0.00120138 40.0248 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#8489 - CGI_10021081 superfamily 241578 361 411 0.00199896 41.6016 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#8489 - CGI_10021081 superfamily 241578 578 619 0.00475525 40.446 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#8490 - CGI_10008783 superfamily 218329 11 410 5.64E-80 255.125 cl04845 DIE2_ALG10 superfamily - - "DIE2/ALG10 family; The ALG10 protein from Saccharomyces cerevisiae encodes the alpha-1,2 glucosyltransferase of the endoplasmic reticulum. This protein has been characterized in rat as potassium channel regulator 1." 
Q#8491 - CGI_10008784 superfamily 244901 5 233 3.17E-108 326.505 cl08306 Peptidase_C12 superfamily - - "Cysteine peptidase C12 contains ubiquitin carboxyl-terminal hydrolase (UCH) families L1, L3, L5 and BAP1; The ubiquitin C-terminal hydrolase (UCH; ubiquitinyl hydrolase; ubiquitin thiolesterase) family of deubiquitinating enzymes (DUBs) consists of four members to date: UCH-L1, UCH-L3, UCH-L5 (UCH37) and BRCA1-associated protein-1 (BAP1), all containing a conserved catalytic domain with cysteine peptidase activity. UCH-L1 hydrolyzes carboxyl terminal esters and amides of ubiquitin (Ub). Dysfunction of this hydrolase activity can lead to an accumulation of alpha-synuclein, which is linked to Parkinson's disease (PD) and neurofibrillary tangles, linked to Alzheimer's disease (AD). UCH-L1, in its dimeric form, has additional enzymatic activity as a ubiquitin ligase. UCH-L3 hydrolyzes isopeptide bonds at the C-terminal glycine of either Ub or Nedd8, a ubiquitin-like protein. UCH-L3 can also interact with Lys48-linked Ub dimers to protect it from degradation while inhibiting its hydrolase activity at the same time. UCH-L1 and UCH-L3 are the most closely related of the UCH members. UCH-L5 (UCH37) is involved in the deubiquitinating activity in the 19S proteasome regulatory complex. It is also associated with the human Ino80 chromatin-remodeling complex (hINO80) in the nucleus. BAP1 binds to the wild-type BRCA1 RING finger domain, localized in the nucleus. It consists of the N-terminal UCH domain and two predicted nuclear localization signals (NLSs), only one of which is functional. The full-length human BRCA1 is a ubiquitin ligase. However, BAP1 does not appear to function in the deubiquitination of autoubiquitinated BRCA1. There is growing evidence that UCH enzymes and human malignancies are closely correlated. Studies show that UCH enzymes play a crucial role in some signaling pathways and in cell-cycle regulation." 
Q#8492 - CGI_10008785 superfamily 243039 423 513 8.93E-14 68.1746 cl02446 MATH superfamily N - "MATH (meprin and TRAF-C homology) domain; an independent folding unit with an eight-stranded beta-sandwich structure found in meprins, TRAFs and other proteins. Meprins comprise a class of extracellular metalloproteases which are anchored to the membrane and are capable of cleaving growth factors, extracellular matrix proteins, and biologically active peptides. TRAF molecules serve as adapter proteins that link cell surface receptors of the Tumor Necrosis Factor and Interleukin-1/Toll-like families to downstream kinase cascades, which results in the activation of transcription factors and the regulation of cell survival, proliferation and stress responses in the immune and inflammatory systems. Other members include the ubiquitin ligases, TRIM37 and SPOP, and the ubiquitin-specific proteases, HAUSP and Ubp21p. A large number of uncharacterized members mostly from lineage-specific expansions in C. elegans and rice contain MATH and BTB domains, similar to SPOP. The MATH domain has been shown to bind peptide/protein substrates in TRAFs and HAUSP. It is possible that the MATH domain in other members of this superfamily also interacts with various protein substrates. The TRAF domain may also be involved in the trimerization of TRAFs. Based on homology, it is postulated that the MATH domain in meprins may be involved in its tetramer assembly and that the MATH domain, in general, may take part in diverse modular arrangements defined by adjacent multimerization domains." 
Q#8492 - CGI_10008785 superfamily 247792 14 63 1.39E-06 45.5144 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#8492 - CGI_10008785 superfamily 241563 163 192 0.0039013 35.5328 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#8493 - CGI_10008786 superfamily 242889 256 356 5.65E-20 83.8065 cl02111 PCI superfamily - - "PCI domain; This domain has also been called the PINT motif (Proteasome, Int-6, Nip-1 and TRIP-15)." Q#8496 - CGI_10008789 superfamily 247724 1 200 1.18E-74 232.041 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#8500 - CGI_10008793 superfamily 243110 166 375 1.08E-18 85.5589 cl02616 MACPF superfamily - - "MAC/Perforin domain; The membrane-attack complex (MAC) of the complement system forms transmembrane channels. These channels disrupt the phospholipid bilayer of target cells, leading to cell lysis and death. A number of proteins participate in the assembly of the MAC. Freshly activated C5b binds to C6 to form a C5b-6 complex, then to C7 forming the C5b-7 complex. The C5b-7 complex binds to C8, which is composed of three chains (alpha, beta, and gamma), thus forming the C5b-8 complex. C5b-8 subsequently binds to C9 and acts as a catalyst in the polymerisation of C9. Active MAC has a subunit composition of C5b-C6-C7-C8-C9{n}. Perforin is a protein found in cytolytic T-cell and killer cells. In the presence of calcium, perforin polymerises into transmembrane tubules and is capable of lysing, non-specifically, a variety of target cells. There are a number of regions of similarity in the sequences of complement components C6, C7, C8-alpha, C8-beta, C9 and perforin. The X-ray crystal structure of a MACPF domain reveals that it shares a common fold with bacterial cholesterol dependent cytolysins (pfam01289) such as perfringolysin O. 
Three key pieces of evidence suggest that MACPF domains and CDCs are homologous: Functional similarity (pore formation), conservation of three glycine residues at a hinge in both families and conservation of a complex core fold." Q#8501 - CGI_10008794 superfamily 243110 270 333 5.56E-05 43.9573 cl02616 MACPF superfamily N - "MAC/Perforin domain; The membrane-attack complex (MAC) of the complement system forms transmembrane channels. These channels disrupt the phospholipid bilayer of target cells, leading to cell lysis and death. A number of proteins participate in the assembly of the MAC. Freshly activated C5b binds to C6 to form a C5b-6 complex, then to C7 forming the C5b-7 complex. The C5b-7 complex binds to C8, which is composed of three chains (alpha, beta, and gamma), thus forming the C5b-8 complex. C5b-8 subsequently binds to C9 and acts as a catalyst in the polymerisation of C9. Active MAC has a subunit composition of C5b-C6-C7-C8-C9{n}. Perforin is a protein found in cytolytic T-cell and killer cells. In the presence of calcium, perforin polymerises into transmembrane tubules and is capable of lysing, non-specifically, a variety of target cells. There are a number of regions of similarity in the sequences of complement components C6, C7, C8-alpha, C8-beta, C9 and perforin. The X-ray crystal structure of a MACPF domain reveals that it shares a common fold with bacterial cholesterol dependent cytolysins (pfam01289) such as perfringolysin O. Three key pieces of evidence suggest that MACPF domains and CDCs are homologous: Functional similarity (pore formation), conservation of three glycine residues at a hinge in both families and conservation of a complex core fold." 
Q#8503 - CGI_10007205 superfamily 241619 65 126 2.89E-05 39.4877 cl00112 PAN_APPLE superfamily - - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#8504 - CGI_10007206 superfamily 217293 2 160 9.64E-25 99.6295 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#8504 - CGI_10007206 superfamily 202474 167 249 1.89E-08 53.0413 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#8505 - CGI_10007207 superfamily 217293 2 165 4.07E-29 111.956 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#8505 - CGI_10007207 superfamily 202474 172 269 1.18E-11 62.2861 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#8506 - CGI_10007208 superfamily 192929 2 460 0 547.439 cl13492 HSNSD superfamily - - heparan sulfate-N-deacetylase; This family of proteins consists of heparan sulfate N-deacetylase enzymes. This protein is found in eukaryotes. 
This enzyme is often found associated with pfam00685. Q#8507 - CGI_10007209 superfamily 247692 210 788 0 836.045 cl17068 AFD_class_I superfamily - - "Adenylate forming domain, Class I; This family includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain." Q#8508 - CGI_10007210 superfamily 247723 300 367 3.78E-11 60.0113 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." 
Q#8509 - CGI_10007211 superfamily 202711 61 231 1.61E-98 287.326 cl04190 Mob1_phocein superfamily - - "Mob1/phocein family; Mob1 is an essential Saccharomyces cerevisiae protein, identified from a two-hybrid screen, that binds Mps1p, a protein kinase essential for spindle pole body duplication and mitotic checkpoint regulation. Mob1 contains no known structural motifs; however MOB1 is a member of a conserved gene family and shares sequence similarity with a nonessential yeast gene, MOB2. Mob1 is a phosphoprotein in vivo and a substrate for the Mps1p kinase in vitro. Conditional alleles of MOB1 cause a late nuclear division arrest at restrictive temperature. This family also includes phocein, a rat protein that by yeast two hybrid interacts with striatin." Q#8510 - CGI_10007212 superfamily 247723 389 463 3.45E-37 132.073 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." 
Q#8510 - CGI_10007212 superfamily 247723 43 120 2.86E-35 126.712 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#8510 - CGI_10007212 superfamily 247723 125 218 1.38E-47 161.679 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. 
The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#8512 - CGI_10006326 superfamily 241580 761 834 2.98E-29 113.418 cl00061 FH superfamily - - "Forkhead (FH), also known as a "winged helix". FH is named for the Drosophila fork head protein, a transcription factor which promotes terminal rather than segmental development. This family of transcription factor domains, which bind to B-DNA as monomers, are also found in the Hepatocyte nuclear factor (HNF) proteins, which provide tissue-specific gene regulation. The structure contains 2 flexible loops or "wings" in the C-terminal region, hence the term winged helix." Q#8512 - CGI_10006326 superfamily 241572 82 166 4.62E-10 58.0188 cl00050 CYCLIN superfamily - - "Cyclin box fold. Protein binding domain functioning in cell-cycle and transcription control. Present in cyclins, TFIIB and Retinoblastoma (RB).The cyclins consist of 8 classes of cell cycle regulators that regulate cyclin dependent kinases (CDKs). TFIIB is a transcription factor that binds the TATA box. Cyclins, TFIIB and RB contain 2 copies of the domain." Q#8514 - CGI_10006328 superfamily 247683 2181 2223 3.75E-05 43.6055 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. 
SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#8514 - CGI_10006328 superfamily 243142 1468 1600 3.78E-08 53.7843 cl02689 RUN superfamily - - "RUN domain; This domain is present in several proteins that are linked to the functions of GTPases in the Rap and Rab families. They could hence play important roles in multiple Ras-like GTPase signalling pathways. The domain is comprises six conserved regions, which in some proteins have considerable insertions between them. The domain core is thought to take up a predominantly alpha fold, with basic amino acids in regions A and D possibly playing a functional role in interactions with Ras GTPases." Q#8520 - CGI_10023311 superfamily 216686 92 269 5.33E-41 149.396 cl18377 Galactosyl_T superfamily - - "Galactosyltransferase; This family includes the galactosyltransferases UDP-galactose:2-acetamido-2-deoxy-D-glucose3beta-galactosyltransferase and UDP-Gal:beta-GlcNAc beta 1,3-galactosyltranferase. Specific galactosyltransferases transfer galactose to GlcNAc terminal chains in the synthesis of the lacto-series oligosaccharides types 1 and 2." Q#8520 - CGI_10023311 superfamily 245847 432 579 2.54E-15 74.1301 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." 
Q#8520 - CGI_10023311 superfamily 241619 304 353 0.00245101 37.1765 cl00112 PAN_APPLE superfamily C - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#8521 - CGI_10023312 superfamily 222150 259 282 0.000179006 38.5269 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#8521 - CGI_10023312 superfamily 222150 147 171 0.000427341 37.7565 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#8521 - CGI_10023312 superfamily 222150 288 312 0.000515344 37.3713 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#8521 - CGI_10023312 superfamily 222150 204 228 0.00197631 35.8305 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#8521 - CGI_10023312 superfamily 222150 92 114 0.00610447 34.2897 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#8522 - CGI_10023313 superfamily 245819 578 627 1.69E-13 68.7599 cl11967 Nucleotidyl_cyc_III superfamily C - "Class III nucleotidyl cyclases; Class III nucleotidyl cyclases are the largest, most diverse group of nucleotidyl cyclases (NC's) containing prokaryotic and eukaryotic proteins. They can be divided into two major groups; the mononucleotidyl cyclases (MNC's) and the diguanylate cyclases (DGC's). 
The MNC's, which include the adenylate cyclases (AC's) and the guanylate cyclases (GC's), have a conserved cyclase homology domain (CHD), while the DGC's have a conserved GGDEF domain, named after a conserved motif within this subgroup. Their products, cyclic guanylyl and adenylyl nucleotides, are second messengers that play important roles in eukaryotic signal transduction and prokaryotic sensory pathways." Q#8522 - CGI_10023313 superfamily 245225 2 304 4.07E-42 156.639 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily C - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein couples receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine- binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. 
The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#8522 - CGI_10023313 superfamily 245201 395 501 2.95E-09 56.776 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#8522 - CGI_10023313 superfamily 219526 517 564 9.14E-07 48.7695 cl06648 HNOBA superfamily N - "Heme NO binding associated; The HNOBA domain is found associated with the HNOB domain and pfam00211 in soluble cyclases and signalling proteins. The HNOB domain is predicted to function as a heme-dependent sensor for gaseous ligands, and transduce diverse downstream signals, in both bacteria and animals." Q#8523 - CGI_10023314 superfamily 241574 1 86 8.81E-35 129.628 cl00053 PTPc superfamily NC - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. 
Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#8523 - CGI_10023314 superfamily 241574 173 336 1.12E-09 57.2105 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#8528 - CGI_10023319 superfamily 243035 336 406 0.00037691 38.755 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#8528 - CGI_10023319 superfamily 241619 27 93 0.00105367 37.1765 cl00112 PAN_APPLE superfamily - - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#8528 - CGI_10023319 superfamily 243035 123 203 0.0035209 35.8633 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. 
CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#8528 - CGI_10023319 superfamily 241619 237 305 0.00474003 35.2505 cl00112 PAN_APPLE superfamily - - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. 
PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#8529 - CGI_10023320 superfamily 243058 177 264 0.000104112 42.6868 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#8529 - CGI_10023320 superfamily 245201 459 727 4.80E-146 453.498 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#8529 - CGI_10023320 superfamily 216347 1417 1843 4.08E-131 417.322 cl08309 Cu_amine_oxid superfamily - - "Copper amine oxidase, enzyme domain; Copper amine oxidases are a ubiquitous and novel group of quinoenzymes that catalyze the oxidative deamination of primary amines to the corresponding aldehydes, with concomitant reduction of molecular oxygen to hydrogen peroxide. 
The enzymes are dimers of identical 70-90 kDa subunits, each of which contains a single copper ion and a covalently bound cofactor formed by the post-translational modification of a tyrosine side chain to 2,4,5-trihydroxyphenylalanine quinone (TPQ). This family corresponds to the catalytic domain of the enzyme." Q#8530 - CGI_10023321 superfamily 241889 79 214 7.79E-48 161.244 cl00474 PAP2_like superfamily - - "PAP2_like proteins, a super-family of histidine phosphatases and vanadium haloperoxidases, includes type 2 phosphatidic acid phosphatase or lipid phosphate phosphatase (LPP), Glucose-6-phosphatase, Phosphatidylglycerophosphatase B and bacterial acid phosphatase, vanadium chloroperoxidases, vanadium bromoperoxidases, and several other mostly uncharacterized subfamilies. Several members of this superfamily have been predicted to be transmembrane proteins." Q#8531 - CGI_10023322 superfamily 245201 507 771 2.24E-53 184.362 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#8532 - CGI_10023323 superfamily 247755 60 277 7.93E-77 243.612 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. 
ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#8532 - CGI_10023323 superfamily 247789 350 509 7.19E-17 79.225 cl17235 ABC2_membrane superfamily - - ABC-2 type transporter; ABC-2 type transporter. Q#8533 - CGI_10023324 superfamily 247684 47 384 9.27E-63 212.908 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#8534 - CGI_10023325 superfamily 247725 436 556 3.55E-57 195.147 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#8534 - CGI_10023325 superfamily 241645 306 391 1.74E-20 88.7335 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#8535 - CGI_10023326 superfamily 247065 46 138 7.86E-18 75.459 cl15777 GGCT_like superfamily - - "GGCT-like domains, also called AIG2-like family. Gamma-glutamyl cyclotransferase (GGCT) catalyzes the formation of pyroglutamic acid (5-oxoproline) from dipeptides containing gamma-glutamyl, and is a dimeric protein. In Homo sapiens, the protein is encoded by the gene C7orf24, and the enzyme participates in the gamma-glutamyl cycle. 
Hereditary defects in the gamma-glutamyl cycle have been described for some of the genes involved, but not for C7orf24. The synthesis and metabolism of glutathione (L-gamma-glutamyl-L-cysteinylglycine) tie the gamma-glutamyl cycle to numerous cellular processes; glutathione acts as a ubiquitous reducing agent in reductive mechanisms involved in protein and DNA synthesis, transport processes, enzyme activity, and metabolism. AIG2 (avrRpt2-induced gene) is an Arabidopsis protein that exhibits RPS2- and avrRpt2-dependent induction early after infection with Pseudomonas syringae pv maculicola strain ES4326 carrying avrRpt2. avrRpt2 is an avirulence gene that can convert virulent strains of P. syringae to avirulence on Arabidopsis thaliana, soybean, and bean. The family also includes bacterial tellurite-resistance proteins (trgB); tellurium (Te) compounds are used in industrial processes and were used as antimicrobial agents in the past. Some members have been described as proteins involved in cation transport (chaC)." Q#8536 - CGI_10023327 superfamily 241574 272 501 1.60E-68 226.313 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins; only one copy may be active." 
Q#8537 - CGI_10023328 superfamily 187408 309 642 7.52E-126 394.735 cl14654 V_Alix_like superfamily - - "Protein-interacting V-domain of mammalian Alix and related domains; This superfamily contains the V-shaped (V) domain of mammalian Alix (apoptosis-linked gene-2 interacting protein X), His-Domain type N23 protein tyrosine phosphatase (HD-PTP, also known as PTPN23), Bro1 and Rim20 (also known as PalA) from Saccharomyces cerevisiae, and related domains. Alix, HD-PTP, Bro1, and Rim20 all interact with the ESCRT (Endosomal Sorting Complexes Required for Transport) system. Alix, also known as apoptosis-linked gene-2 interacting protein 1 (AIP1), participates in membrane remodeling processes during the budding of enveloped viruses, vesicle budding inside late endosomal multivesicular bodies (MVBs), and the abscission reactions of mammalian cell division. It also functions in apoptosis. HD-PTP functions in cell migration and endosomal trafficking, Bro1 in endosomal trafficking, and Rim20 in the response to the external pH via the Rim101 pathway. The Alix V-domain contains a binding site, partially conserved in this superfamily, for the retroviral late assembly (L) domain YPXnL motif. The Alix V-domain is also a dimerization domain. Members of this superfamily have an N-terminal Bro1-like domain, which binds components of the ESCRT-III complex. The Bro1-like domains of Alix and HD-PTP can also bind human immunodeficiency virus type 1 (HIV-1) nucleocapsid. Many members, including Alix, HD-PTP, and Bro1, also have a proline-rich region (PRR), which binds multiple partners in Alix, including Tsg101 (tumor susceptibility gene 101, a component of ESCRT-1) and the apoptotic protein ALG-2. The C-terminal portion (V-domain and PRR) of Bro1 interacts with Doa4, a ubiquitin thiolesterase needed to remove ubiquitin from MVB cargoes; it interacts with a YPxL motif in Doa4s catalytic domain to stimulate its deubiquitination activity. 
Rim20 may bind the ESCRT-III subunit Snf7, bringing the protease Rim13 into proximity with Rim101 (a YPxL-containing transcription factor), and promoting the proteolytic activation of Rim101. HD-PTP is encoded by the PTPN23 gene, a tumor suppressor gene candidate often absent in human kidney, breast, lung, and cervical tumors. HD-PTP has a C-terminal catalytically inactive tyrosine phosphatase domain." Q#8537 - CGI_10023328 superfamily 187403 3 304 5.30E-111 355.197 cl14649 BRO1_Alix_like superfamily N - "Protein-interacting Bro1-like domain of mammalian Alix and related domains; This superfamily includes the Bro1-like domains of mammalian Alix (apoptosis-linked gene-2 interacting protein X), His-Domain type N23 protein tyrosine phosphatase (HD-PTP, also known as PTPN23), RhoA-binding proteins Rhophilin-1 and Rhophilin-2, Brox, Bro1 and Rim20 (also known as PalA) from Saccharomyces cerevisiae, and related domains. Alix, HD-PTP, Brox, Bro1 and Rim20 interact with the ESCRT (Endosomal Sorting Complexes Required for Transport) system. Alix, also known as apoptosis-linked gene-2 interacting protein 1 (AIP1), participates in membrane remodeling processes during the budding of enveloped viruses, vesicle budding inside late endosomal multivesicular bodies (MVBs), and the abscission reactions of mammalian cell division. It also functions in apoptosis. HD-PTP functions in cell migration and endosomal trafficking, Bro1 in endosomal trafficking, and Rim20 in the response to the external pH via the Rim101 pathway. Bro1-like domains are boomerang-shaped, and part of the domain is a tetratricopeptide repeat (TPR)-like structure. Bro1-like domains bind components of the ESCRT-III complex: CHMP4 (in the case of Alix, HD-PTP, and Brox) and Snf7 (in the case of yeast Bro1, and Rim20). The single domain protein human Brox, and the isolated Bro1-like domains of Alix, HD-PTP and Rhophilin can bind human immunodeficiency virus type 1 (HIV-1) nucleocapsid. 
Alix, HD-PTP, Bro1, and Rim20 also have a V-shaped (V) domain, which in the case of Alix, has been shown to be a dimerization domain and to contain a binding site for the retroviral late assembly (L) domain YPXnL motif, which is partially conserved in this superfamily. Alix, HD-PTP and Bro1 also have a proline-rich region (PRR); the Alix PRR binds multiple partners. Rhophilin-1, and -2, in addition to this Bro1-like domain, have an N-terminal Rho-binding domain and a C-terminal PDZ (PSD-95, Disc-large, ZO-1) domain. HD-PTP is encoded by the PTPN23 gene, a tumor suppressor gene candidate frequently absent in human kidney, breast, lung, and cervical tumors. This protein has a C-terminal, catalytically inactive tyrosine phosphatase domain." Q#8538 - CGI_10023329 superfamily 241884 6 151 2.55E-92 271.143 cl00467 Ntn_hydrolase superfamily N - "The Ntn hydrolases (N-terminal nucleophile) are a diverse superfamily of enzymes that are activated autocatalytically via an N-terminally located nucleophilic amino acid. The N-terminal nucleophile (NTN-) hydrolase superfamily contains a four-layered alpha, beta, beta, alpha core structure. This family of hydrolases includes penicillin acylase, the 20S proteasome alpha and beta subunits, and glutamate synthase. The mechanism of activation of these proteins is conserved, although they differ in their substrate specificities. All known members catalyze the hydrolysis of amide bonds in either proteins or small molecules, and each one of them is synthesized as a preprotein. For each, an autocatalytic endoproteolytic process generates a new N-terminal residue. This mature N-terminal residue is central to catalysis and acts as both a polarizing base and a nucleophile during the reaction. The N-terminal amino group acts as the proton acceptor and activates either the nucleophilic hydroxyl in a Ser or Thr residue or the nucleophilic thiol in a Cys residue. 
The position of the N-terminal nucleophile in the active site and the mechanism of catalysis are conserved in this family, despite considerable variation in the protein sequences." Q#8542 - CGI_10023335 superfamily 116100 195 344 1.24E-60 193.568 cl08454 NAD_Gly3P_dh_C superfamily - - NAD-dependent glycerol-3-phosphate dehydrogenase C-terminus; NAD-dependent glycerol-3-phosphate dehydrogenase (GPDH) catalyzes the interconversion of dihydroxyacetone phosphate and L-glycerol-3-phosphate. This family represents the C-terminal substrate-binding domain. Q#8542 - CGI_10023335 superfamily 201664 6 170 8.29E-55 178.961 cl18216 NAD_Gly3P_dh_N superfamily - - NAD-dependent glycerol-3-phosphate dehydrogenase N-terminus; NAD-dependent glycerol-3-phosphate dehydrogenase (GPDH) catalyzes the interconversion of dihydroxyacetone phosphate and L-glycerol-3-phosphate. This family represents the N-terminal NAD-binding domain. Q#8543 - CGI_10023336 superfamily 221075 81 195 1.26E-21 86.8055 cl12855 Med30 superfamily - - "Mediator complex subunit 30; Med30 is a metazoan-specific subunit of Mediator, having no homologues in yeasts." Q#8545 - CGI_10023338 superfamily 146451 91 108 3.97E-05 36.9535 cl08404 OAR superfamily - - OAR domain; OAR domain. Q#8546 - CGI_10001125 superfamily 242274 27 174 9.35E-10 53.4872 cl01053 SGNH_hydrolase superfamily - - "SGNH_hydrolase, or GDSL_hydrolase, is a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the typical Ser-His-Asp(Glu) triad from other serine hydrolases, but may lack the carboxylic acid." Q#8550 - CGI_10005171 superfamily 247097 133 169 0.000442721 39.7418 cl15839 ShK superfamily - - ShK domain-like; This domain is found in several C. elegans proteins. The domain is 30 amino acids long and rich in cysteine residues. 
There are 6 conserved cysteine positions in the domain that form three disulphide bridges. The domain is found in the potassium channel inhibitor ShK in sea anemone. Q#8550 - CGI_10005171 superfamily 247097 877 913 0.000667185 38.9714 cl15839 ShK superfamily - - ShK domain-like; This domain is found in several C. elegans proteins. The domain is 30 amino acids long and rich in cysteine residues. There are 6 conserved cysteine positions in the domain that form three disulphide bridges. The domain is found in the potassium channel inhibitor ShK in sea anemone. Q#8553 - CGI_10005174 superfamily 203591 88 221 4.60E-34 129.028 cl06275 DUF1399 superfamily - - Protein of unknown function (DUF1399); This family represents a conserved region approximately 150 residues long within a number of hypothetical plant proteins of unknown function. Q#8553 - CGI_10005174 superfamily 226728 186 258 3.94E-06 48.3873 cl18775 COG4278 superfamily NC - Uncharacterized conserved protein [Function unknown] Q#8553 - CGI_10005174 superfamily 203591 14 91 0.00201623 38.5064 cl06275 DUF1399 superfamily N - Protein of unknown function (DUF1399); This family represents a conserved region approximately 150 residues long within a number of hypothetical plant proteins of unknown function. Q#8554 - CGI_10005175 superfamily 114894 1 217 1.61E-52 175.206 cl17945 GDE_C superfamily NC - "Amylo-alpha-1,6-glucosidase; This family includes human glycogen debranching enzyme. This enzyme contains a number of distinct catalytic activities. It has been shown for the yeast homologue that mutations in this region disrupt the enzyme's amylo-alpha-1,6-glucosidase activity (EC:3.2.1.33)." Q#8556 - CGI_10014412 superfamily 220766 70 250 5.09E-53 175.239 cl11103 MENTAL superfamily - - "Cholesterol-capturing domain; Human metastatic lymph node (MLN) 64 is a late endosomal membrane protein, and carries this MENTAL (MLN64 N-terminal) domain at its N-terminus. 
The domain is composed of four trans-membrane helices with three short intervening loops. The function of the domain is to capture cholesterol and pass it to the associated START domain pfam01852 for transfer to a cytosolic acceptor protein or membrane. In mammals, the MENTAL domain is involved in the localisation of MLN64 and MENTHO in late endosomes, and also in homo- and hetero-interactions of these two proteins." Q#8556 - CGI_10014412 superfamily 246681 282 367 1.67E-12 64.6834 cl14643 SRPBCC superfamily C - "START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC (SRPBCC) ligand-binding domain superfamily; SRPBCC domains have a deep hydrophobic ligand-binding pocket; they bind diverse ligands. Included in this superfamily are the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of mammalian STARD1-STARD15, and the C-terminal catalytic domains of the alpha oxygenase subunit of Rieske-type non-heme iron aromatic ring-hydroxylating oxygenases (RHOs_alpha_C), as well as the SRPBCC domains of phosphatidylinositol transfer proteins (PITPs), Bet v 1 (the major pollen allergen of white birch, Betula verrucosa), CoxG, CalC, and related proteins. Other members of this superfamily include PYR/PYL/RCAR plant proteins, the aromatase/cyclase (ARO/CYC) domains of proteins such as Streptomyces glaucescens tetracenomycin, and the SRPBCC domains of Streptococcus mutans Smu.440 and related proteins." Q#8559 - CGI_10014415 superfamily 245814 71 140 0.000703763 36.6983 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. 
A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#8559 - CGI_10014415 superfamily 245814 171 250 5.98E-06 42.6815 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#8561 - CGI_10014417 superfamily 201778 23 142 6.41E-17 76.8638 cl18219 GFO_IDH_MocA superfamily - - "Oxidoreductase family, NAD-binding Rossmann fold; This family of enzymes utilise NADP or NAD. This family is called the GFO/IDH/MOCA family in swiss-prot." Q#8561 - CGI_10014417 superfamily 217272 154 217 6.65E-06 44.4476 cl18400 GFO_IDH_MocA_C superfamily C - "Oxidoreductase family, C-terminal alpha/beta domain; This family of enzymes utilise NADP or NAD. This family is called the GFO/IDH/MOCA family in swiss-prot." Q#8561 - CGI_10014417 superfamily 217272 240 300 0.00937657 34.8176 cl18400 GFO_IDH_MocA_C superfamily C - "Oxidoreductase family, C-terminal alpha/beta domain; This family of enzymes utilise NADP or NAD. This family is called the GFO/IDH/MOCA family in swiss-prot." Q#8562 - CGI_10014418 superfamily 199528 7 114 0.000202016 38.1533 cl15392 PRK10429 superfamily C - melibiose:sodium symporter; Provisional Q#8563 - CGI_10014419 superfamily 201778 21 140 3.75E-21 88.805 cl18219 GFO_IDH_MocA superfamily - - "Oxidoreductase family, NAD-binding Rossmann fold; This family of enzymes utilise NADP or NAD. 
This family is called the GFO/IDH/MOCA family in swiss-prot." Q#8563 - CGI_10014419 superfamily 217272 152 215 3.54E-07 48.2996 cl18400 GFO_IDH_MocA_C superfamily C - "Oxidoreductase family, C-terminal alpha/beta domain; This family of enzymes utilise NADP or NAD. This family is called the GFO/IDH/MOCA family in swiss-prot." Q#8566 - CGI_10014422 superfamily 241563 1 40 5.80E-07 45.548 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#8567 - CGI_10014423 superfamily 245226 55 150 2.76E-09 51.5325 cl10012 DnaQ_like_exo superfamily C - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. 
Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#8569 - CGI_10014425 superfamily 245814 15 84 3.80E-05 40.5503 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#8569 - CGI_10014425 superfamily 245814 125 208 1.41E-08 50.1965 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#8572 - CGI_10014428 superfamily 247912 32 277 6.57E-20 89.0976 cl17358 Beta-lactamase superfamily N - Beta-lactamase; This family appears to be distantly related to pfam00905 and PF00768 D-alanyl-D-alanine carboxypeptidase. 
Q#8575 - CGI_10014431 superfamily 244539 20 436 8.30E-154 444.842 cl06868 FNR_like superfamily - - "Ferredoxin reductase (FNR), an FAD and NAD(P) binding protein, was initially identified as a chloroplast reductase activity, catalyzing the electron transfer from reduced iron-sulfur protein ferredoxin to NADP+ as the final step in the electron transport mechanism of photosystem I. FNR transfers electrons from reduced ferredoxin to FAD (forming FADH2 via a semiquinone intermediate) and then transfers a hydride ion to convert NADP+ to NADPH. FNR has since been shown to utilize a variety of electron acceptors and donors and has a variety of physiological functions including nitrogen assimilation, dinitrogen fixation, steroid hydroxylation, fatty acid metabolism, oxygenase activity, and methane assimilation in many organisms. FNR has an NAD(P)-binding sub-domain of the alpha/beta class and a discrete (usually N-terminal) flavin sub-domain which varies in orientation with respect to the NAD(P) binding domain. The N-terminal moiety may contain a flavin prosthetic group (as in flavoenzymes) or use flavin as a substrate. Because flavins such as FAD can exist in oxidized, semiquinone (one-electron reduced), or fully reduced hydroquinone forms, FNR can interact with one- and two-electron carriers. FNR has a strong preference for NADP(H) vs NAD(H)." Q#8576 - CGI_10014432 superfamily 242895 18 146 1.18E-56 179.351 cl02125 Med6 superfamily - - MED6 mediator sub complex component; Component of RNA polymerase II holoenzyme and mediator sub complex. Q#8579 - CGI_10014436 superfamily 247724 38 202 2.83E-70 214.715 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. 
The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#8580 - CGI_10014437 superfamily 247724 9 178 2.67E-56 177.736 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. 
The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#8581 - CGI_10014438 superfamily 248458 96 218 1.08E-05 45.7677 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. 
Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#8588 - CGI_10007301 superfamily 245847 22 105 0.00015375 39.7932 cl12042 FA58C superfamily C - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#8589 - CGI_10014985 superfamily 241581 25 94 3.46E-09 55.8554 cl00062 FHA superfamily N - "Forkhead associated domain (FHA); found in eukaryotic and prokaryotic proteins. Putative nuclear signalling domain. FHA domains may bind phosphothreonine, phosphoserine and sometimes phosphotyrosine. In eukaryotes, many FHA domain-containing proteins localize to the nucleus, where they participate in establishing or maintaining cell cycle checkpoints, DNA repair, or transcriptional regulation. Members of the FHA family include: Dun1, Rad53, Cds1, Mek1, KAPP(kinase-associated protein phosphatase),and Ki-67 (a human nuclear protein related to cell proliferation)." Q#8590 - CGI_10014986 superfamily 241874 46 566 0 727.163 cl00456 SLC5-6-like_sbd superfamily - - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. 
NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#8591 - CGI_10014987 superfamily 246671 113 256 1.17E-27 105.198 cl14606 Reeler_cohesin_like superfamily - - "Domains similar to the eukaryotic reeler domain and bacterial cohesins; This diverse family summarizes a set of distantly related domains, as revealed by structural similarity." Q#8593 - CGI_10014989 superfamily 243555 22 218 3.52E-07 49.3118 cl03871 Chitin_bind_3 superfamily - - "Chitin binding domain; This domain is found associated with a wide variety of cellulose binding domain. This domain however is a chitin binding domain. This domain is found in isolation in baculoviral spheroidins and spindolins, protein of unknown function." Q#8595 - CGI_10014991 superfamily 193473 1 342 3.61E-43 157.173 cl15668 Acatn superfamily N - Acetyl-coenzyme A transporter 1; The mouse Acatn is a 61 kDa hydrophobic protein with six to 10 transmembrane domains. It appears to promote 9-O-acetylation in gangliosides. Q#8599 - CGI_10014995 superfamily 246935 14 97 7.21E-12 58.4484 cl15347 CBM20 superfamily - - "The family 20 carbohydrate-binding module (CBM20), also known as the starch-binding domain, is found in a large number of starch degrading enzymes including alpha-amylase, beta-amylase, glucoamylase, and CGTase (cyclodextrin glucanotransferase). CBM20 is also present in proteins that have a regulatory role in starch metabolism in plants (e.g. alpha-amylase) or glycogen metabolism in mammals (e.g. laforin). CBM20 folds as an antiparallel beta-barrel structure with two starch binding sites. 
These two sites are thought to differ functionally with site 1 acting as the initial starch recognition site and site 2 involved in the specific recognition of appropriate regions of starch." Q#8600 - CGI_10014996 superfamily 202715 134 234 7.89E-20 81.4704 cl04194 Tctex-1 superfamily - - Tctex-1 family; Tctex-1 is a dynein light chain. It has been shown that Tctex-1 can bind to the cytoplasmic tail of rhodopsin. C-terminal rhodopsin mutations responsible for retinitis pigmentosa inhibit this interaction. Q#8601 - CGI_10014997 superfamily 241872 104 184 5.60E-16 78.1828 cl00453 CDP-OH_P_transf superfamily C - CDP-alcohol phosphatidyltransferase; All of these members have the ability to catalyze the displacement of CMP from a CDP-alcohol by a second alcohol with formation of a phosphodiester bond and concomitant breaking of a phosphoride anhydride bond. Q#8602 - CGI_10014998 superfamily 246918 31 74 1.45E-07 44.1147 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#8603 - CGI_10014999 superfamily 245595 477 766 2.70E-177 538.717 cl11393 Peptidase_M14_like superfamily - - "M14 family of metallocarboxypeptidases and related proteins; The M14 family of metallocarboxypeptidases (MCPs), also known as funnelins, are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Two major subfamilies of the M14 family, defined based on sequence and structural homology, are the A/B and N/E subfamilies. Enzymes belonging to the A/B subfamily are normally synthesized as inactive precursors containing preceding signal peptide, followed by an N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases. 
The A/B enzymes can be further divided based on their substrate specificity; Carboxypeptidase A-like (CPA-like) enzymes favor hydrophobic residues while carboxypeptidase B-like (CPB-like) enzymes only cleave the basic residues lysine or arginine. The A forms have slightly different specificities, with Carboxypeptidase A1 (CPA1) preferring aliphatic and small aromatic residues, and CPA2 preferring the bulky aromatic side chains. Enzymes belonging to the N/E subfamily enzymes are not produced as inactive precursors and instead rely on their substrate specificity and subcellular compartmentalization to prevent inappropriate cleavage. They contain an extra C-terminal transthyretin-like domain, thought to be involved in folding or formation of oligomers. MCPs can also be classified based on their involvement in specific physiological processes; the pancreatic MCPs participate only in alimentary digestion and include carboxypeptidase A and B (A/B subfamily), while others, namely regulatory MCPs or the N/E subfamily, are involved in more selective reactions, mainly in non-digestive tissues and fluids, acting on blood coagulation/fibrinolysis, inflammation and local anaphylaxis, pro-hormone and neuropeptide processing, cellular response and others. Another MCP subfamily, is that of succinylglutamate desuccinylase /aspartoacylase, which hydrolyzes N-acetyl-L-aspartate (NAA), and deficiency in which is the established cause of Canavan disease. Another subfamily (referred to as subfamily C) includes an exceptional type of activity in the MCP family, that of dipeptidyl-peptidase activity of gamma-glutamyl-(L)-meso-diaminopimelate peptidase I which is involved in bacterial cell wall metabolism." 
Q#8603 - CGI_10014999 superfamily 245595 31 317 1.41E-151 469.123 cl11393 Peptidase_M14_like superfamily - - "M14 family of metallocarboxypeptidases and related proteins; The M14 family of metallocarboxypeptidases (MCPs), also known as funnelins, are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Two major subfamilies of the M14 family, defined based on sequence and structural homology, are the A/B and N/E subfamilies. Enzymes belonging to the A/B subfamily are normally synthesized as inactive precursors containing preceding signal peptide, followed by an N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases. The A/B enzymes can be further divided based on their substrate specificity; Carboxypeptidase A-like (CPA-like) enzymes favor hydrophobic residues while carboxypeptidase B-like (CPB-like) enzymes only cleave the basic residues lysine or arginine. The A forms have slightly different specificities, with Carboxypeptidase A1 (CPA1) preferring aliphatic and small aromatic residues, and CPA2 preferring the bulky aromatic side chains. Enzymes belonging to the N/E subfamily enzymes are not produced as inactive precursors and instead rely on their substrate specificity and subcellular compartmentalization to prevent inappropriate cleavage. They contain an extra C-terminal transthyretin-like domain, thought to be involved in folding or formation of oligomers. 
MCPs can also be classified based on their involvement in specific physiological processes; the pancreatic MCPs participate only in alimentary digestion and include carboxypeptidase A and B (A/B subfamily), while others, namely regulatory MCPs or the N/E subfamily, are involved in more selective reactions, mainly in non-digestive tissues and fluids, acting on blood coagulation/fibrinolysis, inflammation and local anaphylaxis, pro-hormone and neuropeptide processing, cellular response and others. Another MCP subfamily, is that of succinylglutamate desuccinylase /aspartoacylase, which hydrolyzes N-acetyl-L-aspartate (NAA), and deficiency in which is the established cause of Canavan disease. Another subfamily (referred to as subfamily C) includes an exceptional type of activity in the MCP family, that of dipeptidyl-peptidase activity of gamma-glutamyl-(L)-meso-diaminopimelate peptidase I which is involved in bacterial cell wall metabolism." Q#8603 - CGI_10014999 superfamily 248053 770 845 9.82E-31 118.396 cl17499 Peptidase_M14NE-CP-C_like superfamily - - "Peptidase associated domain: C-terminal domain of M14 N/E carboxypeptidase; putative folding, regulation, or interaction domain; This domain is found C-terminal to the M14 carboxypeptidase (CP) N/E subfamily containing zinc-binding enzymes that hydrolyze single C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. The N/E subfamily includes enzymatically active members (carboxypeptidase N, E, M, D, and Z), as well as non-active members (carboxypeptidase-like protein 1, -2, aortic CP-like protein, and adipocyte enhancer binding protein-1) which lack the critical active site and substrate-binding residues considered necessary for activity. 
The active N/E enzymes fulfill a variety of cellular functions, including prohormone processing, regulation of peptide hormone activity, alteration of protein-protein or protein-cell interactions and transcriptional regulation. For M14 CPs, it has been suggested that this domain may assist in folding of the CP domain, regulate enzyme activity, or be involved in interactions with other proteins or with membranes; for carboxypeptidase M, it may interact with the bradykinin 1 receptor at the cell surface. This domain may also be found in other peptidase families." Q#8603 - CGI_10014999 superfamily 248053 321 396 5.15E-30 116.084 cl17499 Peptidase_M14NE-CP-C_like superfamily - - "Peptidase associated domain: C-terminal domain of M14 N/E carboxypeptidase; putative folding, regulation, or interaction domain; This domain is found C-terminal to the M14 carboxypeptidase (CP) N/E subfamily containing zinc-binding enzymes that hydrolyze single C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. The N/E subfamily includes enzymatically active members (carboxypeptidase N, E, M, D, and Z), as well as non-active members (carboxypeptidase-like protein 1, -2, aortic CP-like protein, and adipocyte enhancer binding protein-1) which lack the critical active site and substrate-binding residues considered necessary for activity. The active N/E enzymes fulfill a variety of cellular functions, including prohormone processing, regulation of peptide hormone activity, alteration of protein-protein or protein-cell interactions and transcriptional regulation. For M14 CPs, it has been suggested that this domain may assist in folding of the CP domain, regulate enzyme activity, or be involved in interactions with other proteins or with membranes; for carboxypeptidase M, it may interact with the bradykinin 1 receptor at the cell surface. 
This domain may also be found in other peptidase families." Q#8603 - CGI_10014999 superfamily 245595 1230 1495 2.76E-41 156.213 cl11393 Peptidase_M14_like superfamily - - "M14 family of metallocarboxypeptidases and related proteins; The M14 family of metallocarboxypeptidases (MCPs), also known as funnelins, are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Two major subfamilies of the M14 family, defined based on sequence and structural homology, are the A/B and N/E subfamilies. Enzymes belonging to the A/B subfamily are normally synthesized as inactive precursors containing preceding signal peptide, followed by an N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases. The A/B enzymes can be further divided based on their substrate specificity; Carboxypeptidase A-like (CPA-like) enzymes favor hydrophobic residues while carboxypeptidase B-like (CPB-like) enzymes only cleave the basic residues lysine or arginine. The A forms have slightly different specificities, with Carboxypeptidase A1 (CPA1) preferring aliphatic and small aromatic residues, and CPA2 preferring the bulky aromatic side chains. Enzymes belonging to the N/E subfamily enzymes are not produced as inactive precursors and instead rely on their substrate specificity and subcellular compartmentalization to prevent inappropriate cleavage. They contain an extra C-terminal transthyretin-like domain, thought to be involved in folding or formation of oligomers. 
MCPs can also be classified based on their involvement in specific physiological processes; the pancreatic MCPs participate only in alimentary digestion and include carboxypeptidase A and B (A/B subfamily), while others, namely regulatory MCPs or the N/E subfamily, are involved in more selective reactions, mainly in non-digestive tissues and fluids, acting on blood coagulation/fibrinolysis, inflammation and local anaphylaxis, pro-hormone and neuropeptide processing, cellular response and others. Another MCP subfamily, is that of succinylglutamate desuccinylase /aspartoacylase, which hydrolyzes N-acetyl-L-aspartate (NAA), and deficiency in which is the established cause of Canavan disease. Another subfamily (referred to as subfamily C) includes an exceptional type of activity in the MCP family, that of dipeptidyl-peptidase activity of gamma-glutamyl-(L)-meso-diaminopimelate peptidase I which is involved in bacterial cell wall metabolism." Q#8603 - CGI_10014999 superfamily 245595 897 1130 4.79E-19 89.1794 cl11393 Peptidase_M14_like superfamily - - "M14 family of metallocarboxypeptidases and related proteins; The M14 family of metallocarboxypeptidases (MCPs), also known as funnelins, are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Two major subfamilies of the M14 family, defined based on sequence and structural homology, are the A/B and N/E subfamilies. Enzymes belonging to the A/B subfamily are normally synthesized as inactive precursors containing preceding signal peptide, followed by an N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases. 
The A/B enzymes can be further divided based on their substrate specificity; Carboxypeptidase A-like (CPA-like) enzymes favor hydrophobic residues while carboxypeptidase B-like (CPB-like) enzymes only cleave the basic residues lysine or arginine. The A forms have slightly different specificities, with Carboxypeptidase A1 (CPA1) preferring aliphatic and small aromatic residues, and CPA2 preferring the bulky aromatic side chains. Enzymes belonging to the N/E subfamily enzymes are not produced as inactive precursors and instead rely on their substrate specificity and subcellular compartmentalization to prevent inappropriate cleavage. They contain an extra C-terminal transthyretin-like domain, thought to be involved in folding or formation of oligomers. MCPs can also be classified based on their involvement in specific physiological processes; the pancreatic MCPs participate only in alimentary digestion and include carboxypeptidase A and B (A/B subfamily), while others, namely regulatory MCPs or the N/E subfamily, are involved in more selective reactions, mainly in non-digestive tissues and fluids, acting on blood coagulation/fibrinolysis, inflammation and local anaphylaxis, pro-hormone and neuropeptide processing, cellular response and others. Another MCP subfamily, is that of succinylglutamate desuccinylase /aspartoacylase, which hydrolyzes N-acetyl-L-aspartate (NAA), and deficiency in which is the established cause of Canavan disease. Another subfamily (referred to as subfamily C) includes an exceptional type of activity in the MCP family, that of dipeptidyl-peptidase activity of gamma-glutamyl-(L)-meso-diaminopimelate peptidase I which is involved in bacterial cell wall metabolism." 
Q#8603 - CGI_10014999 superfamily 248053 1147 1221 1.26E-09 57.2917 cl17499 Peptidase_M14NE-CP-C_like superfamily - - "Peptidase associated domain: C-terminal domain of M14 N/E carboxypeptidase; putative folding, regulation, or interaction domain; This domain is found C-terminal to the M14 carboxypeptidase (CP) N/E subfamily containing zinc-binding enzymes that hydrolyze single C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. The N/E subfamily includes enzymatically active members (carboxypeptidase N, E, M, D, and Z), as well as non-active members (carboxypeptidase-like protein 1, -2, aortic CP-like protein, and adipocyte enhancer binding protein-1) which lack the critical active site and substrate-binding residues considered necessary for activity. The active N/E enzymes fulfill a variety of cellular functions, including prohormone processing, regulation of peptide hormone activity, alteration of protein-protein or protein-cell interactions and transcriptional regulation. For M14 CPs, it has been suggested that this domain may assist in folding of the CP domain, regulate enzyme activity, or be involved in interactions with other proteins or with membranes; for carboxypeptidase M, it may interact with the bradykinin 1 receptor at the cell surface. This domain may also be found in other peptidase families." Q#8603 - CGI_10014999 superfamily 248053 1625 1707 5.51E-05 43.4345 cl17499 Peptidase_M14NE-CP-C_like superfamily - - "Peptidase associated domain: C-terminal domain of M14 N/E carboxypeptidase; putative folding, regulation, or interaction domain; This domain is found C-terminal to the M14 carboxypeptidase (CP) N/E subfamily containing zinc-binding enzymes that hydrolyze single C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. 
The N/E subfamily includes enzymatically active members (carboxypeptidase N, E, M, D, and Z), as well as non-active members (carboxypeptidase-like protein 1, -2, aortic CP-like protein, and adipocyte enhancer binding protein-1) which lack the critical active site and substrate-binding residues considered necessary for activity. The active N/E enzymes fulfill a variety of cellular functions, including prohormone processing, regulation of peptide hormone activity, alteration of protein-protein or protein-cell interactions and transcriptional regulation. For M14 CPs, it has been suggested that this domain may assist in folding of the CP domain, regulate enzyme activity, or be involved in interactions with other proteins or with membranes; for carboxypeptidase M, it may interact with the bradykinin 1 receptor at the cell surface. This domain may also be found in other peptidase families." Q#8603 - CGI_10014999 superfamily 248053 1562 1618 0.00132696 39.1973 cl17499 Peptidase_M14NE-CP-C_like superfamily C - "Peptidase associated domain: C-terminal domain of M14 N/E carboxypeptidase; putative folding, regulation, or interaction domain; This domain is found C-terminal to the M14 carboxypeptidase (CP) N/E subfamily containing zinc-binding enzymes that hydrolyze single C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. The N/E subfamily includes enzymatically active members (carboxypeptidase N, E, M, D, and Z), as well as non-active members (carboxypeptidase-like protein 1, -2, aortic CP-like protein, and adipocyte enhancer binding protein-1) which lack the critical active site and substrate-binding residues considered necessary for activity. The active N/E enzymes fulfill a variety of cellular functions, including prohormone processing, regulation of peptide hormone activity, alteration of protein-protein or protein-cell interactions and transcriptional regulation. 
For M14 CPs, it has been suggested that this domain may assist in folding of the CP domain, regulate enzyme activity, or be involved in interactions with other proteins or with membranes; for carboxypeptidase M, it may interact with the bradykinin 1 receptor at the cell surface. This domain may also be found in other peptidase families." Q#8604 - CGI_10015000 superfamily 241596 511 571 3.64E-15 71.4763 cl00081 HLH superfamily - - "Helix-loop-helix domain, found in specific DNA-binding proteins that act as transcription factors; 60-100 amino acids long. A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE (5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id), which function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." 
Q#8605 - CGI_10015001 superfamily 243092 91 363 4.03E-18 82.768 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel beta-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#8605 - CGI_10015001 superfamily 245010 366 408 0.00259244 36.1737 cl09111 Prefoldin superfamily N - "Prefoldin is a hexameric molecular chaperone complex, found in both eukaryotes and archaea, that binds and stabilizes newly synthesized polypeptides allowing them to fold correctly. The complex contains two alpha and four beta subunits, the two subunits being evolutionarily related. In archaea, there is usually only one gene for each subunit while in eukaryotes there are two or more paralogous genes encoding each subunit adding heterogeneity to the structure of the hexamer. The structure of the complex consists of a double beta barrel assembly with six protruding coiled-coils." 
Q#8606 - CGI_10015002 superfamily 150464 46 152 1.83E-14 66.1186 cl10773 Transmemb_17 superfamily - - Predicted membrane protein; This is a 100 amino acid region of a family of proteins conserved from nematodes to humans. It is predicted to be a transmembrane region but its function is not known. Q#8607 - CGI_10015003 superfamily 241574 131 268 5.61E-30 111.161 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins; only one copy may be active." Q#8607 - CGI_10015003 superfamily 241626 9 106 3.49E-09 52.892 cl00125 RHOD superfamily - - "Rhodanese Homology Domain (RHOD); an alpha beta fold domain found duplicated in the rhodanese protein. The cysteine containing enzymatically active version of the domain is also found in the Cdc25 class of protein phosphatases and a variety of proteins such as sulfide dehydrogenases and certain stress proteins such as senescence specific protein 1 in plants, PspE and GlpE in bacteria and cyanide and arsenate resistance proteins. Inactive versions (no active site cysteine) are also seen in dual specificity phosphatases, ubiquitin hydrolases from yeast and in sulfuryltransferases, where they are believed to play a regulatory role in multidomain proteins." 
Q#8608 - CGI_10015004 superfamily 247749 2 278 3.90E-173 483.915 cl17195 LDH_MDH_like superfamily - - "NAD-dependent, lactate dehydrogenase-like, 2-hydroxycarboxylate dehydrogenase family; Members of this family include ubiquitous enzymes like L-lactate dehydrogenases (LDH), L-2-hydroxyisocaproate dehydrogenases, and some malate dehydrogenases (MDH). LDH catalyzes the last step of glycolysis in which pyruvate is converted to L-lactate. MDH is one of the key enzymes in the citric acid cycle, facilitating both the conversion of malate to oxaloacetate and replenishing levels of oxaloacetate by reductive carboxylation of pyruvate. The LDH/MDH-like proteins are part of the NAD(P)-binding Rossmann fold superfamily, which includes a wide variety of protein families including the NAD(P)-binding domains of alcohol dehydrogenases, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate dehydrogenases, formate/glycerate dehydrogenases, siroheme synthases, 6-phosphogluconate dehydrogenases, amino acid dehydrogenases, repressor rex, and NAD-binding potassium channel domains, among others." Q#8610 - CGI_10015006 superfamily 247724 3 74 3.74E-22 88.3611 cl17170 Ras_like_GTPase superfamily N - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#8611 - CGI_10015007 superfamily 243072 279 404 4.10E-31 118.255 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#8611 - CGI_10015007 superfamily 243072 494 614 2.38E-29 113.247 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#8611 - CGI_10015007 superfamily 243072 423 548 1.33E-27 108.24 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. 
The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#8611 - CGI_10015007 superfamily 243072 72 197 4.85E-27 106.699 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#8611 - CGI_10015007 superfamily 243072 138 272 5.89E-25 100.921 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#8612 - CGI_10015008 superfamily 243072 33 158 5.94E-31 116.329 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#8612 - CGI_10015008 superfamily 243072 231 356 1.61E-30 114.788 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#8612 - CGI_10015008 superfamily 243072 297 422 5.68E-28 107.855 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#8612 - CGI_10015008 superfamily 243072 363 498 1.80E-27 106.699 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#8612 - CGI_10015008 superfamily 243072 99 224 1.23E-24 98.995 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#8613 - CGI_10015009 superfamily 248013 63 113 7.97E-07 46.1031 cl17459 CHROMO superfamily - - "Chromatin organization modifier (chromo) domain is a conserved region of around 50 amino acids found in a variety of chromosomal proteins, which appear to play a role in the functional organization of the eukaryotic nucleus. Experimental evidence implicates the chromo domain in the binding activity of these proteins to methylated histone tails and maybe RNA. May occur as a single instance, in a tandem arrangement or followed by a related "chromo shadow" domain." Q#8614 - CGI_10015010 superfamily 243072 9 76 6.41E-16 71.2606 cl02529 ANK superfamily N - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#8615 - CGI_10025236 superfamily 244824 159 565 0 855.761 cl07893 AmyAc_family superfamily - - "Alpha amylase catalytic domain family; The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; and C is the C-terminal extension characterized by a Greek key. 
The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost this catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase." Q#8615 - CGI_10025236 superfamily 245008 62 157 1.41E-46 160.006 cl09101 E_set superfamily - - "Early set domain associated with the catalytic domain of sugar utilizing enzymes at either the N or C terminus; The E or "early" set domains of sugar utilizing enzymes are associated with different types of catalytic domains at either the N-terminal or C-terminal end. These domains may be related to the immunoglobulin and/or fibronectin type III superfamilies. Members of this family include alpha amylase, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitobiase, and chitinase. A subset of these members were recently identified as members of the CBM48 (Carbohydrate Binding Module 48) family. Members of the CBM48 family include pullulanase, maltooligosyl trehalose synthase, starch branching enzyme, glycogen branching enzyme, glycogen debranching enzyme, isoamylase, and the beta subunit of AMP-activated protein kinase." Q#8615 - CGI_10025236 superfamily 243149 583 677 2.61E-19 83.941 cl02706 Alpha-amylase_C superfamily - - "Alpha amylase, C-terminal all-beta domain; Alpha amylase is classified as family 13 of the glycosyl hydrolases. 
The structure is an 8 stranded alpha/beta barrel containing the active site, interrupted by a ~70 a.a. calcium-binding domain protruding between beta strand 3 and alpha helix 3, and a carboxyl-terminal Greek key beta-barrel domain." Q#8616 - CGI_10025237 superfamily 217584 226 348 3.63E-22 91.6498 cl04100 MOSC_N superfamily - - "MOSC N-terminal beta barrel domain; This domain is found to the N-terminus of pfam03473. The function of this domain is unknown, however it is predicted to adopt a beta barrel fold." Q#8616 - CGI_10025237 superfamily 217583 366 474 1.93E-13 67.0027 cl04097 MOSC superfamily C - "MOSC domain; The MOSC (MOCO sulfurase C-terminal) domain is a superfamily of beta-strand-rich domains identified in the molybdenum cofactor sulfurase and several other proteins from both prokaryotes and eukaryotes. These MOSC domains contain an absolutely conserved cysteine and occur either as stand-alone forms, or fused to other domains such as NifS-like catalytic domain in Molybdenum cofactor sulfurase. The MOSC domain is predicted to be a sulfur-carrier domain that receives sulfur abstracted by the pyridoxal phosphate-dependent NifS-like enzymes, on its conserved cysteine, and delivers it for the formation of diverse sulfur-metal clusters." Q#8617 - CGI_10025238 superfamily 241782 42 382 1.53E-14 73.8446 cl00321 AAT_I superfamily - - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. 
Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phosphorylase family (fold type V)." Q#8618 - CGI_10025239 superfamily 247799 13 74 8.26E-21 85.6059 cl17245 KH-I superfamily - - "K homology RNA-binding domain, type I. KH binds single-stranded RNA or DNA. It is found in a wide variety of proteins including ribosomal proteins, transcription factors and post-transcriptional modifiers of mRNA. There are two different KH domains that belong to different protein folds, but they share a single KH motif. The KH motif is folded into a beta alpha alpha beta unit. In addition to the core, type II KH domains (e.g. ribosomal protein S3) include N-terminal extension and type I KH domains (e.g. hnRNP K) contain C-terminal extension." Q#8618 - CGI_10025239 superfamily 247799 273 333 9.06E-18 77.2151 cl17245 KH-I superfamily - - "K homology RNA-binding domain, type I. KH binds single-stranded RNA or DNA. It is found in a wide variety of proteins including ribosomal proteins, transcription factors and post-transcriptional modifiers of mRNA. There are two different KH domains that belong to different protein folds, but they share a single KH motif. The KH motif is folded into a beta alpha alpha beta unit. 
In addition to the core, type II KH domains (e.g. ribosomal protein S3) include N-terminal extension and type I KH domains (e.g. hnRNP K) contain C-terminal extension." Q#8618 - CGI_10025239 superfamily 247799 93 157 2.58E-13 64.8051 cl17245 KH-I superfamily - - "K homology RNA-binding domain, type I. KH binds single-stranded RNA or DNA. It is found in a wide variety of proteins including ribosomal proteins, transcription factors and post-transcriptional modifiers of mRNA. There are two different KH domains that belong to different protein folds, but they share a single KH motif. The KH motif is folded into a beta alpha alpha beta unit. In addition to the core, type II KH domains (e.g. ribosomal protein S3) include N-terminal extension and type I KH domains (e.g. hnRNP K) contain C-terminal extension." Q#8619 - CGI_10025240 superfamily 243035 40 150 6.98E-22 93.4533 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#8619 - CGI_10025240 superfamily 245814 503 561 0.00923362 35.9279 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#8619 - CGI_10025240 superfamily 243086 799 842 9.27E-16 73.5633 cl02559 GPS superfamily - - "Latrophilin/CL-1-like GPS domain; Domain present in latrophilin/CL-1, sea urchin REJ and polycystin." Q#8619 - CGI_10025240 superfamily 215647 947 1019 2.52E-13 69.9448 cl18338 7tm_2 superfamily N - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GPCRs). They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." 
Q#8619 - CGI_10025240 superfamily 241571 174 269 5.75E-05 42.7847 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#8619 - CGI_10025240 superfamily 215647 868 926 0.000159216 43.3661 cl18338 7tm_2 superfamily C - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GPCRs). They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#8622 - CGI_10025243 superfamily 245835 306 465 0.000493439 40.8117 cl12013 BAR superfamily N - "The Bin/Amphiphysin/Rvs (BAR) domain, a dimerization module that binds membranes and detects membrane curvature; BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions including organelle biogenesis, membrane trafficking or remodeling, and cell division and migration. Mutations in BAR containing proteins have been linked to diseases and their inactivation in cells leads to altered membrane dynamics. A BAR domain with an additional N-terminal amphipathic helix (an N-BAR) can drive membrane curvature. 
These N-BAR domains are found in amphiphysins and endophilins, among others. BAR domains are also frequently found alongside domains that determine lipid specificity, such as the Pleckstrin Homology (PH) and Phox Homology (PX) domains which are present in beta centaurins (ACAPs and ASAPs) and sorting nexins, respectively. A FES-CIP4 Homology (FCH) domain together with a coiled coil region is called the F-BAR domain and is present in Pombe/Cdc15 homology (PCH) family proteins, which include Fes/Fes tyrosine kinases, PACSIN or syndapin, CIP4-like proteins, and srGAPs, among others. The Inverse (I)-BAR or IRSp53/MIM homology Domain (IMD) is found in multi-domain proteins, such as IRSp53 and MIM, that act as scaffolding proteins and transducers of a variety of signaling pathways that link membrane dynamics and the underlying actin cytoskeleton. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The I-BAR domain induces membrane protrusions in the opposite direction compared to classical BAR and F-BAR domains, which produce membrane invaginations. BAR domains that also serve as protein interaction domains include those of arfaptin and OPHN1-like proteins, among others, which bind to Rac and Rho GAP domains, respectively." Q#8623 - CGI_10025244 superfamily 241622 32 118 2.01E-16 74.5254 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. 
PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95 (post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#8624 - CGI_10025245 superfamily 245814 195 272 0.000462899 38.1759 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#8626 - CGI_10025247 superfamily 243179 142 240 6.70E-16 71.4099 cl02781 tetraspanin_LEL superfamily - - "Tetraspanin, extracellular domain or large extracellular loop (LEL). Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. The tetraspanin family contains CD9, CD63, CD37, CD53, CD82, CD151, and CD81, amongst others. Tetraspanins are involved in diverse processes such as cell activation and proliferation, adhesion and motility, differentiation, cancer, and others. 
Their various functions may relate to their ability to act as molecular facilitators, grouping specific cell-surface proteins and affecting formation and stability of signaling complexes. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web", which may also include integrins." Q#8629 - CGI_10025250 superfamily 220695 43 214 9.84E-05 42.1807 cl18571 7TM_GPCR_Srx superfamily C - Serpentine type 7TM GPCR chemoreceptor Srx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srx is part of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#8630 - CGI_10025251 superfamily 247724 66 240 3.07E-104 317.168 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. 
The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#8630 - CGI_10025251 superfamily 243185 251 336 1.90E-32 120.637 cl02787 Translation_Factor_II_like superfamily - - "Translation_Factor_II_like: Elongation factor Tu (EF-Tu) domain II-like proteins. Elongation factor Tu consists of three structural domains, this family represents the second domain. Domain II adopts a beta barrel structure and is involved in binding to charged tRNA. Domain II is found in other proteins such as elongation factor G and translation initiation factor IF-2. This group also includes the C2 subdomain of domain IV of IF-2 that has the same fold as domain II of (EF-Tu). Like IF-2 from certain prokaryotes such as Thermus thermophilus, mitochondrial IF-2 lacks domain II, which is thought to be involved in binding of E.coli IF-2 to 30S subunits." Q#8630 - CGI_10025251 superfamily 243183 468 546 7.46E-31 116.052 cl02785 Elongation_Factor_C superfamily - - "Elongation factor G C-terminus. This domain includes the carboxyl terminal regions of elongation factors (EFs) bacterial EF-G, eukaryotic and archeal EF-2 and eukaryotic mitochondrial mtEFG1s and mtEFG2s. This group also includes proteins similar to the ribosomal protection proteins Tet(M) and Tet(O), BipA, LepA and, spliceosomal proteins: human 116kD U5 small nuclear ribonucleoprotein (snRNP) protein (U5-116 kD) and yeast counterpart Snu114p. This domain adopts a ferredoxin-like fold consisting of an alpha-beta sandwich with anti-parallel beta-sheets, resembling the topology of domain III found in the elongation factors EF-G and eukaryotic EF-2, with which it forms the C-terminal block. 
The two domains however are not superimposable and domain III lacks some of the characteristics of this domain. EF-2/EF-G in complex with GTP, promotes the translocation step of translation. During translocation the peptidyl-tRNA is moved from the A site to the P site, the uncharged tRNA from the P site to the E-site and, the mRNA is shifted one codon relative to the ribosome. Tet(M) and Tet(O) mediate Tc resistance. Typical Tcs bind to the ribosome and inhibit the elongation phase of protein synthesis, by inhibiting the occupation of site A by aminoacyl-tRNA. Tet(M) and Tet(O) catalyze the release of tetracycline (Tc) from the ribosome in a GTP-dependent manner. BipA is a highly conserved protein with global regulatory properties in Escherichia coli. Yeast Snu114p is essential for cell viability and for splicing in vivo. Experiments suggest that GTP binding and probably GTP hydrolysis is important for the function of the U5-116 kD/Snu114p. The function of LepA proteins is unknown." Q#8630 - CGI_10025251 superfamily 203441 555 660 2.76E-36 131.948 cl05759 LepA_C superfamily - - GTP-binding protein LepA C-terminus; This family consists of the C-terminal region of several pro- and eukaryotic GTP-binding LepA proteins. Q#8633 - CGI_10025254 superfamily 241571 315 414 2.43E-27 107.498 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#8633 - CGI_10025254 superfamily 241571 79 183 1.78E-17 79.3786 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#8633 - CGI_10025254 superfamily 241571 457 542 1.64E-13 67.8226 cl00049 CUB superfamily N - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." 
Q#8633 - CGI_10025254 superfamily 241571 209 280 1.80E-09 55.8814 cl00049 CUB superfamily C - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#8633 - CGI_10025254 superfamily 241571 3 59 6.92E-05 41.6068 cl00049 CUB superfamily N - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#8636 - CGI_10025257 superfamily 245819 421 603 1.94E-54 184.32 cl11967 Nucleotidyl_cyc_III superfamily - - "Class III nucleotidyl cyclases; Class III nucleotidyl cyclases are the largest, most diverse group of nucleotidyl cyclases (NC's) containing prokaryotic and eukaryotic proteins. They can be divided into two major groups; the mononucleotidyl cyclases (MNC's) and the diguanylate cyclases (DGC's). The MNC's, which include the adenylate cyclases (AC's) and the guanylate cyclases (GC's), have a conserved cyclase homology domain (CHD), while the DGC's have a conserved GGDEF domain, named after a conserved motif within this subgroup. Their products, cyclic guanylyl and adenylyl nucleotides, are second messengers that play important roles in eukaryotic signal transduction and prokaryotic sensory pathways." Q#8636 - CGI_10025257 superfamily 219526 203 407 2.82E-100 306.468 cl06648 HNOBA superfamily - - "Heme NO binding associated; The HNOBA domain is found associated with the HNOB domain and pfam00211 in soluble cyclases and signalling proteins. The HNOB domain is predicted to function as a heme-dependent sensor for gaseous ligands, and transduce diverse downstream signals, in both bacteria and animals." Q#8636 - CGI_10025257 superfamily 203730 2 172 2.34E-69 224.071 cl18246 HNOB superfamily - - "Heme NO binding; The HNOB (Heme NO Binding) domain, is a predominantly alpha-helical domain and binds heme via a covalent linkage to histidine. 
The HNOB domain is predicted to function as a heme-dependent sensor for gaseous ligands, and transduce diverse downstream signals, in both bacteria and animals." Q#8637 - CGI_10025258 superfamily 245819 563 741 1.39E-53 183.164 cl11967 Nucleotidyl_cyc_III superfamily - - "Class III nucleotidyl cyclases; Class III nucleotidyl cyclases are the largest, most diverse group of nucleotidyl cyclases (NC's) containing prokaryotic and eukaryotic proteins. They can be divided into two major groups; the mononucleotidyl cyclases (MNC's) and the diguanylate cyclases (DGC's). The MNC's, which include the adenylate cyclases (AC's) and the guanylate cyclases (GC's), have a conserved cyclase homology domain (CHD), while the DGC's have a conserved GGDEF domain, named after a conserved motif within this subgroup. Their products, cyclic guanylyl and adenylyl nucleotides, are second messengers that play important roles in eukaryotic signal transduction and prokaryotic sensory pathways." Q#8637 - CGI_10025258 superfamily 219526 355 550 7.21E-66 219.028 cl06648 HNOBA superfamily - - "Heme NO binding associated; The HNOBA domain is found associated with the HNOB domain and pfam00211 in soluble cyclases and signalling proteins. The HNOB domain is predicted to function as a heme-dependent sensor for gaseous ligands, and transduce diverse downstream signals, in both bacteria and animals." Q#8637 - CGI_10025258 superfamily 203730 230 313 5.01E-11 61.5164 cl18246 HNOB superfamily N - "Heme NO binding; The HNOB (Heme NO Binding) domain, is a predominantly alpha-helical domain and binds heme via a covalent linkage to histidine. The HNOB domain is predicted to function as a heme-dependent sensor for gaseous ligands, and transduce diverse downstream signals, in both bacteria and animals." Q#8638 - CGI_10025259 superfamily 247068 450 550 2.04E-28 109.71 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#8638 - CGI_10025259 superfamily 247068 233 329 2.56E-26 103.932 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#8638 - CGI_10025259 superfamily 247068 346 442 4.10E-17 77.7389 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#8638 - CGI_10025259 superfamily 247068 119 224 2.90E-14 69.6497 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#8638 - CGI_10025259 superfamily 247068 27 109 6.35E-06 44.6118 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#8639 - CGI_10025260 superfamily 247068 1083 1180 4.39E-29 115.103 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#8639 - CGI_10025260 superfamily 247068 547 641 1.09E-26 108.17 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#8639 - CGI_10025260 superfamily 247068 1291 1388 1.50E-25 104.703 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#8639 - CGI_10025260 superfamily 247068 340 436 2.31E-25 104.318 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#8639 - CGI_10025260 superfamily 247068 1188 1283 2.46E-25 104.318 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#8639 - CGI_10025260 superfamily 247068 650 749 2.73E-25 103.932 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#8639 - CGI_10025260 superfamily 247068 973 1075 1.67E-24 101.621 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#8639 - CGI_10025260 superfamily 247068 1623 1705 2.58E-23 98.1545 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#8639 - CGI_10025260 superfamily 247068 444 538 3.18E-23 98.1545 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#8639 - CGI_10025260 superfamily 247068 235 332 3.49E-22 95.0729 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#8639 - CGI_10025260 superfamily 247068 871 964 4.63E-22 94.6877 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#8639 - CGI_10025260 superfamily 247068 2023 2125 3.45E-21 91.9913 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#8639 - CGI_10025260 superfamily 247068 1814 1909 6.80E-20 88.5245 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#8639 - CGI_10025260 superfamily 247068 1926 2013 8.52E-20 88.1393 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#8639 - CGI_10025260 superfamily 247068 1504 1599 4.24E-18 83.1317 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#8639 - CGI_10025260 superfamily 247068 43 122 2.09E-16 78.1241 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#8639 - CGI_10025260 superfamily 247068 760 843 5.99E-16 76.9685 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#8639 - CGI_10025260 superfamily 247068 131 227 6.62E-16 76.9685 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#8639 - CGI_10025260 superfamily 247068 1397 1489 3.33E-15 74.6573 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#8639 - CGI_10025260 superfamily 247068 1713 1806 1.38E-14 73.1165 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#8639 - CGI_10025260 superfamily 247068 2242 2344 1.05E-10 61.5605 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#8639 - CGI_10025260 superfamily 247068 2137 2232 2.46E-10 60.4049 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#8639 - CGI_10025260 superfamily 216265 2573 2610 0.00224382 39.5932 cl03079 Cadherin_C superfamily N - Cadherin cytoplasmic region; Cadherins are vital in cell-cell adhesion during tissue differentiation. Cadherins are linked to the cytoskeleton by catenins. Catenins bind to the cytoplasmic tail of the cadherin. Cadherins cluster to form foci of homophilic binding units. A key determinant to the strength of the binding that it is mediated by cadherins is the juxtamembrane region of the cadherin. This region induces clustering and also binds to the protein p120ctn. Q#8640 - CGI_10025261 superfamily 241874 137 508 6.90E-72 246.339 cl00456 SLC5-6-like_sbd superfamily C - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. 
SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#8640 - CGI_10025261 superfamily 241874 579 707 0.00724425 38.3309 cl00456 SLC5-6-like_sbd superfamily N - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." 
Q#8642 - CGI_10025264 superfamily 241763 503 712 7.40E-99 305.702 cl00298 Peptidase_C1 superfamily - - "C1 Peptidase family (MEROPS database nomenclature), also referred to as the papain family; composed of two subfamilies of cysteine peptidases (CPs), C1A (papain) and C1B (bleomycin hydrolase). Papain-like enzymes are mostly endopeptidases with some exceptions like cathepsins B, C, H and X, which are exopeptidases. Papain-like CPs have different functions in various organisms. Plant CPs are used to mobilize storage proteins in seeds while mammalian CPs are primarily lysosomal enzymes responsible for protein degradation in the lysosome. Papain-like CPs are synthesized as inactive proenzymes with N-terminal propeptide regions, which are removed upon activation. Bleomycin hydrolase (BH) is a CP that detoxifies bleomycin by hydrolysis of an amide group. It acts as a carboxypeptidase on its C-terminus to convert itself into an aminopeptidase and peptide ligase. BH is found in all tissues in mammals as well as in many other eukaryotes. It forms a hexameric ring barrel structure with the active sites imbedded in the central channel. Some members of the C1 family are proteins classified as non-peptidase homologs which lack peptidase activity or have missing active site residues." Q#8642 - CGI_10025264 superfamily 245040 17 90 2.76E-10 58.2544 cl09238 CY superfamily - - "Cystatin-like domain; Cystatins are a family of cysteine protease inhibitors that occur mainly as single domain proteins. However some extracellular proteins such as kininogen, His-rich glycoprotein and fetuin also contain these domains." Q#8642 - CGI_10025264 superfamily 245040 101 198 1.21E-06 47.0836 cl09238 CY superfamily - - "Cystatin-like domain; Cystatins are a family of cysteine protease inhibitors that occur mainly as single domain proteins. However some extracellular proteins such as kininogen, His-rich glycoprotein and fetuin also contain these domains." 
Q#8642 - CGI_10025264 superfamily 244586 418 474 4.47E-13 65.3426 cl07031 Inhibitor_I29 superfamily - - Cathepsin propeptide inhibitor domain (I29); This domain is found at the N-terminus of some C1 peptidases such as Cathepsin L where it acts as a propeptide. There are also a number of proteins that are composed solely of multiple copies of this domain such as the peptidase inhibitor salarin. This family is classified as I29 by MEROPS. Q#8644 - CGI_10025266 superfamily 245622 175 310 1.19E-21 89.591 cl11446 Rhomboid superfamily - - "Rhomboid family; This family contains integral membrane proteins that are related to Drosophila rhomboid protein. Members of this family are found in bacteria and eukaryotes. Rhomboid promotes the cleavage of the membrane-anchored TGF-alpha-like growth factor Spitz, allowing it to activate the Drosophila EGF receptor. Analysis has shown that Rhomboid-1 is an intramembrane serine protease (EC:3.4.21.105). Parasite-encoded rhomboid enzymes are also important for invasion of host cells by Toxoplasma and the malaria parasite." Q#8648 - CGI_10025270 superfamily 241862 364 514 6.60E-22 94.7304 cl00437 COG0428 superfamily N - Predicted divalent heavy-metal cations transporter [Inorganic ion transport and metabolism] Q#8649 - CGI_10025271 superfamily 191243 11 37 3.41E-08 49.3631 cl05016 zf-U11-48K superfamily - - U11-48K-like CHHC zinc finger; This zinc binding domain has four conserved zinc chelating residues in a CHHC pattern. This domain is predicted to have an RNA-binding function. Q#8649 - CGI_10025271 superfamily 191243 43 69 1.16E-06 45.1259 cl05016 zf-U11-48K superfamily - - U11-48K-like CHHC zinc finger; This zinc binding domain has four conserved zinc chelating residues in a CHHC pattern. This domain is predicted to have an RNA-binding function. 
Q#8650 - CGI_10025272 superfamily 191243 11 37 0.000133608 38.1923 cl05016 zf-U11-48K superfamily - - U11-48K-like CHHC zinc finger; This zinc binding domain has four conserved zinc chelating residues in a CHHC pattern. This domain is predicted to have an RNA-binding function. Q#8650 - CGI_10025272 superfamily 191243 43 69 0.0013096 35.4959 cl05016 zf-U11-48K superfamily - - U11-48K-like CHHC zinc finger; This zinc binding domain has four conserved zinc chelating residues in a CHHC pattern. This domain is predicted to have an RNA-binding function. Q#8651 - CGI_10025273 superfamily 191243 324 350 5.72E-05 40.8887 cl05016 zf-U11-48K superfamily - - U11-48K-like CHHC zinc finger; This zinc binding domain has four conserved zinc chelating residues in a CHHC pattern. This domain is predicted to have an RNA-binding function. Q#8651 - CGI_10025273 superfamily 191243 11 37 6.37E-05 40.8887 cl05016 zf-U11-48K superfamily - - U11-48K-like CHHC zinc finger; This zinc binding domain has four conserved zinc chelating residues in a CHHC pattern. This domain is predicted to have an RNA-binding function. Q#8651 - CGI_10025273 superfamily 191243 299 322 0.000338147 38.5775 cl05016 zf-U11-48K superfamily - - U11-48K-like CHHC zinc finger; This zinc binding domain has four conserved zinc chelating residues in a CHHC pattern. This domain is predicted to have an RNA-binding function. Q#8651 - CGI_10025273 superfamily 191243 43 69 0.0010783 37.0367 cl05016 zf-U11-48K superfamily - - U11-48K-like CHHC zinc finger; This zinc binding domain has four conserved zinc chelating residues in a CHHC pattern. This domain is predicted to have an RNA-binding function. Q#8652 - CGI_10025274 superfamily 245230 1 460 0 637.825 cl10017 Tubulin_FtsZ superfamily - - "Tubulin/FtsZ: Family includes tubulin alpha-, beta-, gamma-, delta-, and epsilon-tubulins as well as FtsZ, all of which are involved in polymer formation. 
Tubulin is the major component of microtubules, but also exists as a heterodimer and as a curved oligomer. Microtubules exist in all eukaryotic cells and are responsible for many functions, including cellular transport, cell motility, and mitosis. FtsZ forms a ring-shaped septum at the site of bacterial cell division, which is required for constriction of cell membrane and cell envelope to yield two daughter cells. FtsZ can polymerize into tubes, sheets, and rings in vitro and is ubiquitous in eubacteria, archaea, and chloroplasts." Q#8653 - CGI_10025275 superfamily 247792 25 74 5.06E-09 52.0628 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#8653 - CGI_10025275 superfamily 110440 299 326 0.00032846 38.1577 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." 
Q#8653 - CGI_10025275 superfamily 110440 339 366 0.00174356 35.8465 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#8654 - CGI_10025276 superfamily 192535 51 173 0.00623392 36.4198 cl18179 7TM_GPCR_Srsx superfamily C - Serpentine type 7TM GPCR chemoreceptor Srsx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srsx is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#8655 - CGI_10025277 superfamily 197509 17 55 0.000284784 37.5477 cl09965 PAC superfamily - - Motif C-terminal to PAS motifs (likely to contribute to PAS structural domain); PAC motif occurs C-terminal to a subset of all known PAS motifs. It is proposed to contribute to the PAS domain fold. Q#8656 - CGI_10025278 superfamily 241596 70 123 2.00E-13 62.2315 cl00081 HLH superfamily - - "Helix-loop-helix domain, found in specific DNA-binding proteins that act as transcription factors; 60-100 amino acids long. 
A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE (5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#8656 - CGI_10025278 superfamily 243045 155 218 1.92E-09 52.2503 cl02459 PAS superfamily C - "PAS domain; PAS motifs appear in archaea, eubacteria and eukarya. Probably the most surprising identification of a PAS domain was that in EAG-like K+-channels. PAS domains have been found to bind ligands, and to act as sensors for light and oxygen in signal transduction." Q#8657 - CGI_10025279 superfamily 241782 46 393 1.03E-139 405.406 cl00321 AAT_I superfamily - - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. 
Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phosphorylase family (fold type V)." Q#8658 - CGI_10025280 superfamily 245835 1227 1405 8.53E-40 149.39 cl12013 BAR superfamily - - "The Bin/Amphiphysin/Rvs (BAR) domain, a dimerization module that binds membranes and detects membrane curvature; BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions including organelle biogenesis, membrane trafficking or remodeling, and cell division and migration. Mutations in BAR containing proteins have been linked to diseases and their inactivation in cells leads to altered membrane dynamics. A BAR domain with an additional N-terminal amphipathic helix (an N-BAR) can drive membrane curvature. These N-BAR domains are found in amphiphysins and endophilins, among others. BAR domains are also frequently found alongside domains that determine lipid specificity, such as the Pleckstrin Homology (PH) and Phox Homology (PX) domains which are present in beta centaurins (ACAPs and ASAPs) and sorting nexins, respectively. 
A FES-CIP4 Homology (FCH) domain together with a coiled coil region is called the F-BAR domain and is present in Pombe/Cdc15 homology (PCH) family proteins, which include Fes/Fes tyrosine kinases, PACSIN or syndapin, CIP4-like proteins, and srGAPs, among others. The Inverse (I)-BAR or IRSp53/MIM homology Domain (IMD) is found in multi-domain proteins, such as IRSp53 and MIM, that act as scaffolding proteins and transducers of a variety of signaling pathways that link membrane dynamics and the underlying actin cytoskeleton. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The I-BAR domain induces membrane protrusions in the opposite direction compared to classical BAR and F-BAR domains, which produce membrane invaginations. BAR domains that also serve as protein interaction domains include those of arfaptin and OPHN1-like proteins, among others, which bind to Rac and Rho GAP domains, respectively." Q#8658 - CGI_10025280 superfamily 243096 1032 1176 1.31E-26 110.465 cl02571 RhoGEF superfamily N - Guanine nucleotide exchange factor for Rho/Rac/Cdc42-like GTPases; Also called Dbl-homologous (DH) domain. It appears that PH domains invariably occur C-terminal to RhoGEF/DH domains. Q#8658 - CGI_10025280 superfamily 247683 28 77 1.83E-13 68.2582 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. 
SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#8658 - CGI_10025280 superfamily 247683 435 486 9.74E-12 63.2507 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#8658 - CGI_10025280 superfamily 247683 1663 1719 3.76E-14 70.4783 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. 
They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#8658 - CGI_10025280 superfamily 247683 302 353 8.62E-13 66.5944 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#8658 - CGI_10025280 superfamily 247683 166 215 8.21E-12 63.5278 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). 
SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#8658 - CGI_10025280 superfamily 247683 1495 1550 1.33E-11 63.0832 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#8658 - CGI_10025280 superfamily 247683 95 142 2.15E-10 59.6828 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. 
Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#8660 - CGI_10004439 superfamily 243199 5 87 9.22E-09 53.8342 cl02808 RT_like superfamily N - "RT_like: Reverse transcriptase (RT, RNA-dependent DNA polymerase)_like family. An RT gene is usually indicative of a mobile element such as a retrotransposon or retrovirus. RTs occur in a variety of mobile elements, including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, and caulimoviruses. These elements can be divided into two major groups. One group contains retroviruses and DNA viruses whose propagation involves an RNA intermediate. They are grouped together with transposable elements containing long terminal repeats (LTRs). The other group, also called poly(A)-type retrotransposons, contain fungal mitochondrial introns and transposable elements that lack LTRs." Q#8661 - CGI_10004440 superfamily 215847 98 147 4.00E-10 58.6119 cl09510 Lipoxygenase superfamily NC - Lipoxygenase; Lipoxygenase. Q#8662 - CGI_10019460 superfamily 217064 157 302 6.54E-18 82.1572 cl03617 CLN3 superfamily N - CLN3 protein; This is a family of proteins from the CLN3 gene. 
A missense mutation of glutamic acid (E) to lysine (K) at position 295 in the human protein has been implicated in Juvenile neuronal ceroid lipofuscinosis (Batten disease). Q#8662 - CGI_10019460 superfamily 217064 7 119 3.12E-09 55.9636 cl03617 CLN3 superfamily C - CLN3 protein; This is a family of proteins from the CLN3 gene. A missense mutation of glutamic acid (E) to lysine (K) at position 295 in the human protein has been implicated in Juvenile neuronal ceroid lipofuscinosis (Batten disease). Q#8664 - CGI_10019462 superfamily 198738 366 453 4.06E-44 150.881 cl02599 Ets superfamily - - Ets-domain; Ets-domain. Q#8664 - CGI_10019462 superfamily 247057 122 190 1.03E-20 85.8505 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#8672 - CGI_10019470 superfamily 247787 83 127 6.11E-05 43.2957 cl17233 RecA-like_NTPases superfamily C - "RecA-like NTPases. This family includes the NTP binding domain of F1 and V1 H+ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. This group also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion." 
Q#8673 - CGI_10019471 superfamily 222432 534 618 1.54E-15 73.4962 cl16451 Bravo_FIGEY superfamily N - C-terminal domain of Fibronectin type III; This is the very C-terminal region of neural adhesion molecule L1 proteins that are also known as Bravo or NrCAM. It lies upstream of the IG and Fn3 domains and has the highly conserved motif FIGEY. The function is not known. Q#8673 - CGI_10019471 superfamily 245814 193 275 2.45E-11 60.5968 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#8673 - CGI_10019471 superfamily 245814 47 115 6.49E-08 50.4586 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#8673 - CGI_10019471 superfamily 245814 297 383 0.00147184 37.379 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. 
The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#8674 - CGI_10019472 superfamily 190271 871 979 1.54E-44 157.476 cl03521 Alpha_adaptin_C superfamily - - "Alpha adaptin AP2, C-terminal domain; Alpha adaptin is a hetero tetramer which regulates clathrin-bud formation. The carboxyl-terminal appendage of the alpha subunit regulates translocation of endocytic accessory proteins to the bud site." Q#8674 - CGI_10019472 superfamily 243521 767 865 2.08E-20 88.4542 cl03759 Alpha_adaptinC2 superfamily - - "Adaptin C-terminal domain; Alpha adaptin is a heterotetramer which regulates clathrin-bud formation. The carboxyl-terminal appendage of the alpha subunit regulates translocation of endocytic accessory proteins to the bud site. This ig-fold domain is found in alpha, beta and gamma adaptins." Q#8675 - CGI_10019473 superfamily 241636 186 354 3.05E-93 279.473 cl00145 TBOX superfamily - - "T-box DNA binding domain of the T-box family of transcriptional regulators. The T-box family is an ancient group that appears to play a critical role in development in all animal species. These genes were uncovered on the basis of similarity to the DNA binding domain of murine Brachyury (T) gene product, the defining feature of the family. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development and conserved expression patterns, most of the known genes in all species being expressed in mesoderm or mesoderm precursors." 
Q#8676 - CGI_10019474 superfamily 241636 186 211 2.12E-10 56.0568 cl00145 TBOX superfamily C - "T-box DNA binding domain of the T-box family of transcriptional regulators. The T-box family is an ancient group that appears to play a critical role in development in all animal species. These genes were uncovered on the basis of similarity to the DNA binding domain of murine Brachyury (T) gene product, the defining feature of the family. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development and conserved expression patterns, most of the known genes in all species being expressed in mesoderm or mesoderm precursors." Q#8678 - CGI_10019476 superfamily 243072 148 312 1.23E-17 79.3498 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#8678 - CGI_10019476 superfamily 243072 253 383 3.56E-17 77.809 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#8679 - CGI_10019477 superfamily 243072 227 389 2.69E-15 72.8014 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. 
The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#8679 - CGI_10019477 superfamily 243072 162 250 2.59E-12 63.9418 cl02529 ANK superfamily N - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#8679 - CGI_10019477 superfamily 243072 329 498 2.65E-09 55.0822 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#8680 - CGI_10019478 superfamily 218330 158 279 1.35E-48 166.654 cl04847 SHQ1 superfamily N - SHQ1 protein; S. cerevisiae SHQ1 protein is required for SnoRNAs of the box H/ACA Quantitative accumulation (unpublished). Q#8680 - CGI_10019478 superfamily 218330 310 413 6.55E-36 131.601 cl04847 SHQ1 superfamily N - SHQ1 protein; S. cerevisiae SHQ1 protein is required for SnoRNAs of the box H/ACA Quantitative accumulation (unpublished). 
Q#8681 - CGI_10019479 superfamily 245598 91 395 6.42E-149 428.983 cl11396 Patatin_and_cPLA2 superfamily - - "Patatins and Phospholipases; Patatin-like phospholipase. This family consists of various patatin glycoproteins from plants. The patatin protein accounts for up to 40% of the total soluble protein in potato tubers. Patatin is a storage protein, but it also has the enzymatic activity of a lipid acyl hydrolase, catalyzing the cleavage of fatty acids from membrane lipids. Members of this family have also been found in vertebrates. This family also includes the catalytic domain of cytosolic phospholipase A2 (PLA2; EC 3.1.1.4), which hydrolyzes the sn-2-acyl ester bond of phospholipids to release arachidonic acid. At the active site, cPLA2 contains a serine nucleophile through which the catalytic mechanism is initiated. The active site is partially covered by a solvent-accessible flexible lid. cPLA2 displays interfacial activation as it exists in both "closed lid" and "open lid" forms." Q#8682 - CGI_10019480 superfamily 219963 282 374 9.47E-23 91.1301 cl08487 GCV_T_C superfamily - - "Glycine cleavage T-protein C-terminal barrel domain; This is a family of glycine cleavage T-proteins, part of the glycine cleavage multienzyme complex (GCV) found in bacteria and the mitochondria of eukaryotes. GCV catalyzes the catabolism of glycine in eukaryotes. The T-protein is an aminomethyl transferase." Q#8683 - CGI_10019481 superfamily 221887 11 107 2.86E-24 95.659 cl15229 ING superfamily - - "Inhibitor of growth proteins N-terminal histone-binding; Histones undergo numerous post-translational modifications, including acetylation and methylation, at residues which are then probable docking sites for various chromatin remodelling complexes. Inhibitor of growth proteins (INGs) specifically bind to residues that have been thus modified.
INGs carry a well-characterized C-terminal PHD-type zinc-finger domain, binding with lysine 4-tri-methylated histone H3 (H3K4me3), as well as this N-terminal domain that binds unmodified H3 tails. Although these two regions can bind histones independently, together they increase the apparent association of the ING for the H3 tail." Q#8683 - CGI_10019481 superfamily 247999 343 390 7.23E-10 54.4188 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#8684 - CGI_10019482 superfamily 243083 1197 1308 1.29E-50 175.773 cl02554 PWWP superfamily - - "The PWWP domain, named for a conserved Pro-Trp-Trp-Pro motif, is a small domain consisting of 100-150 amino acids. The PWWP domain is found in numerous proteins that are involved in cell division, growth and differentiation. Most PWWP-domain proteins seem to be nuclear, often DNA-binding, proteins that function as transcription factors regulating a variety of developmental processes. The function of the PWWP domain is still not known precisely; however, based on the fact that other regions of PWWP-domain proteins are responsible for nuclear localization and DNA-binding, it is likely that the PWWP domain acts as a site for protein-protein binding interactions, influencing chromatin remodeling and thereby regulating transcriptional processes. Some PWWP-domain proteins have been linked to cancer or other diseases; some are known to function as growth factors." Q#8684 - CGI_10019482 superfamily 248279 322 441 5.64E-53 182.537 cl17725 zf-HC5HC2H superfamily - - "PHD-like zinc-binding domain; The members of this family are annotated as containing PHD domain, but the zinc-binding region here is not typical of PHD domains.
The conformation here is a well-conserved cysteine-histidine rich region spanning 90 residues, where the Cys and His are arranged as HxxC(31)CxxC(6)CxxCxxxxCxxxxHxxC (21)CxxH." Q#8684 - CGI_10019482 superfamily 243084 613 710 3.10E-49 171.428 cl02556 Bromodomain superfamily - - Bromodomain. Bromodomains are found in many chromatin-associated proteins and in nuclear histone acetyltransferases. They interact specifically with acetylated lysine. Q#8684 - CGI_10019482 superfamily 220792 107 248 4.88E-23 97.8618 cl11150 EPL1 superfamily - - Enhancer of polycomb-like; This is a family of EPL1 (Enhancer of polycomb-like) proteins. The EPL1 protein is a member of a histone acetyltransferase complex which is involved in transcriptional activation of selected genes. Q#8684 - CGI_10019482 superfamily 247999 282 314 1.24E-14 70.3671 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#8685 - CGI_10019483 superfamily 241583 14 163 1.77E-67 228.277 cl00064 ZnMc superfamily N - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#8685 - CGI_10019483 superfamily 149667 1350 1545 8.99E-64 217.238 cl07343 GON superfamily - - GON domain; The GON domain is found in the ADAMTS (a disintegrin and metalloproteinase domain with thrombospondin type-1 modules) family of proteins. It contains several conserved cysteine residues. Q#8685 - CGI_10019483 superfamily 246918 947 998 1.96E-05 44.1147 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. 
Q#8685 - CGI_10019483 superfamily 246918 611 664 0.000218452 41.0331 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#8685 - CGI_10019483 superfamily 246918 721 775 0.000684712 39.4923 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#8685 - CGI_10019483 superfamily 246918 1064 1113 0.00219653 37.9515 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#8685 - CGI_10019483 superfamily 246918 669 719 0.00671664 36.4107 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#8687 - CGI_10019485 superfamily 243072 52 172 2.05E-15 72.031 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#8687 - CGI_10019485 superfamily 243072 114 239 4.10E-14 68.5642 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#8687 - CGI_10019485 superfamily 243072 182 315 6.15E-13 65.0974 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. 
The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#8690 - CGI_10014037 superfamily 244881 956 1192 1.68E-49 180.162 cl08267 ISOPREN_C2_like superfamily - - "This group contains class II terpene cyclases, protein prenyltransferases beta subunit, two broadly specific proteinase inhibitors alpha2-macroglobulin (alpha (2)-M) and pregnancy zone protein (PZP) and, the C3 C4 and C5 components of vertebrate complement. Class II terpene cyclases include squalene cyclase (SQCY) and 2,3-oxidosqualene cyclase (OSQCY), these integral membrane proteins catalyze a cationic cyclization cascade converting linear triterpenes to fused ring compounds. The protein prenyltransferases include protein farnesyltransferase (FTase) and geranylgeranyltransferase types I and II (GGTase-I and GGTase-II) which catalyze the carboxyl-terminal lipidation of Ras, Rab, and several other cellular signal transduction proteins, facilitating membrane associations and specific protein-protein interactions. Alpha (2)-M is a major carrier protein in serum and involved in the immobilization and entrapment of proteases. PZP is a pregnancy associated protein. Alpha (2)-M and PZP are known to bind to and, may modulate, the activity of placental protein-14 in T-cell growth and cytokine production thereby protecting the allogeneic fetus from attack by the maternal immune system." Q#8690 - CGI_10014037 superfamily 215788 727 817 2.06E-25 103.413 cl08251 A2M superfamily - - Alpha-2-macroglobulin family; This family includes the C-terminal region of the alpha-2-macroglobulin family. 
Q#8690 - CGI_10014037 superfamily 203720 1365 1445 6.98E-15 72.9662 cl08457 A2M_recep superfamily - - A-macroglobulin receptor; This family includes the receptor domain region of the alpha-2-macroglobulin family. Q#8690 - CGI_10014037 superfamily 243064 1466 1558 1.37E-10 61.256 cl02512 NTR_like superfamily C - "NTR_like domain; a beta barrel with an oligosaccharide/oligonucleotide-binding fold found in netrins, complement proteins, tissue inhibitors of metalloproteases (TIMP), and procollagen C-proteinase enhancers (PCOLCE), amongst others. In netrins, the domain plays a role in controlling axon branching in neural development, while the common function of these modules in TIMPs appears to be binding to metzincins. A subset of this family is also known as the C345C domain because it occurs as a C-terminal domain in complement C3, C4 and C5. In C5, the domain interacts with various partners during the formation of the membrane attack complex." Q#8690 - CGI_10014037 superfamily 216731 116 192 3.14E-06 47.2535 cl12258 A2M_N superfamily - - MG2 domain; This is the MG2 (macroglobulin) domain of alpha-2-macroglobulin. Q#8690 - CGI_10014037 superfamily 221351 19 106 0.00774736 36.5187 cl13419 MG1 superfamily - - Alpha-2-macroglobulin MG1 domain; This is the N-terminal MG1 domain from alpha-2-macroglobulin. Q#8691 - CGI_10014038 superfamily 241596 25 75 3.55E-07 46.0531 cl00081 HLH superfamily - - "Helix-loop-helix domain, found in specific DNA- binding proteins that act as transcription factors; 60-100 amino acids long. 
A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE (5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#8691 - CGI_10014038 superfamily 243123 93 134 2.70E-07 46.3974 cl02638 Hairy_orange superfamily - - "Hairy Orange; The Orange domain is found in the Drosophila proteins Hesr-1, Hairy, and Enhancer of Split. The Orange domain is proposed to mediate specific protein-protein interaction between Hairy and Scute." Q#8692 - CGI_10014039 superfamily 241596 16 66 2.98E-07 46.4383 cl00081 HLH superfamily - - "Helix-loop-helix domain, found in specific DNA- binding proteins that act as transcription factors; 60-100 amino acids long.
A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE (5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#8692 - CGI_10014039 superfamily 243123 84 125 4.24E-08 48.3233 cl02638 Hairy_orange superfamily - - "Hairy Orange; The Orange domain is found in the Drosophila proteins Hesr-1, Hairy, and Enhancer of Split. The Orange domain is proposed to mediate specific protein-protein interaction between Hairy and Scute." Q#8693 - CGI_10014040 superfamily 243072 856 956 5.20E-18 82.4314 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats."
Q#8693 - CGI_10014040 superfamily 243072 915 1001 3.05E-07 50.0746 cl02529 ANK superfamily C - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#8693 - CGI_10014040 superfamily 218536 977 1257 5.50E-59 207.968 cl05038 AAR2 superfamily N - AAR2 protein; This family consists of several eukaryotic AAR2-like proteins. The yeast protein AAR2 is involved in splicing pre-mRNA of the a1 cistron and other genes that are important for cell growth. Q#8693 - CGI_10014040 superfamily 115363 357 424 1.04E-21 91.2793 cl05972 MIB_HERC2 superfamily - - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). Q#8693 - CGI_10014040 superfamily 241760 81 125 1.19E-16 76.3431 cl00295 ZZ superfamily - - "Zinc finger, ZZ type. Zinc finger present in dystrophin, CBP/p300 and many other proteins. The ZZ motif coordinates one or two zinc ions and most likely participates in ligand binding or molecular scaffolding. Many proteins containing ZZ motifs have other zinc-binding motifs as well, and the majority serve as scaffolds in pathways involving acetyltransferase, protein kinase, or ubiquitin-related activity. ZZ proteins can be grouped into the following functional classes: chromatin modifying, cytoskeletal scaffolding, ubiquitin binding or conjugating, and membrane receptor or ion-channel modifying proteins." Q#8693 - CGI_10014040 superfamily 241760 441 480 1.50E-11 61.7055 cl00295 ZZ superfamily - - "Zinc finger, ZZ type. Zinc finger present in dystrophin, CBP/p300 and many other proteins.
The ZZ motif coordinates one or two zinc ions and most likely participates in ligand binding or molecular scaffolding. Many proteins containing ZZ motifs have other zinc-binding motifs as well, and the majority serve as scaffolds in pathways involving acetyltransferase, protein kinase, or ubiquitin-related activity. ZZ proteins can be grouped into the following functional classes: chromatin modifying, cytoskeletal scaffolding, ubiquitin binding or conjugating, and membrane receptor or ion-channel modifying proteins." Q#8693 - CGI_10014040 superfamily 115363 8 70 2.73E-11 61.2337 cl05972 MIB_HERC2 superfamily - - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). Q#8693 - CGI_10014040 superfamily 115363 543 574 2.30E-05 43.8998 cl05972 MIB_HERC2 superfamily N - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). Q#8695 - CGI_10014042 superfamily 241754 12 688 0 1091.86 cl00286 Motor_domain superfamily - - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. Q#8695 - CGI_10014042 superfamily 218855 832 1025 2.49E-32 125.877 cl10652 Myosin_TH1 superfamily - - Myosin tail; Myosin tail. Q#8698 - CGI_10014045 superfamily 243109 1 45 2.74E-33 114.732 cl02614 SPRY superfamily N - "SPRY domain; SPRY domains, first identified in the SP1A kinase of Dictyostelium and rabbit Ryanodine receptor (hence the name), are homologous to B30.2. SPRY domains have been identified in at least 11 protein families, covering a wide range of functions, including regulation of cytokine signaling (SOCS), RNA metabolism (DDX1 and hnRNP), immunity to retroviruses (TRIM5alpha), intracellular calcium release (ryanodine receptors or RyR) and regulatory and developmental processes (HERC1 and Ash2L). 
B30.2 also contains residues in the N-terminus that form a distinct PRY domain structure; i.e. B30.2 domain consists of PRY and SPRY subdomains. B30.2 domains comprise the C-terminus of three protein families: BTNs (receptor glycoproteins of immunoglobulin superfamily); several TRIM proteins (composed of RING/B-box/coiled-coil or RBCC core); Stonutoxin (secreted poisonous protein of the stonefish Synanceia horrida). While SPRY domains are evolutionarily ancient, B30.2 domains are a more recent adaptation where the SPRY/PRY combination is a possible component of immune defense. Mutations found in the SPRY-containing proteins have shown to cause Mediterranean fever and Opitz syndrome." Q#8698 - CGI_10014045 superfamily 243073 50 91 5.12E-17 68.8678 cl02533 SOCS superfamily - - "SOCS (suppressors of cytokine signaling) box. The SOCS box is found in the C-terminal region of CIS/SOCS family proteins (in combination with a SH2 domain), ASBs (ankyrin repeat-containing proteins with a SOCS box), SSBs (SPRY domain-containing proteins with a SOCS box), and WSBs (WD40 repeat-containing proteins with a SOCS box), as well as, other miscellaneous proteins. The function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions." Q#8699 - CGI_10014046 superfamily 245836 332 514 1.82E-98 308.497 cl12015 Adenylation_DNA_ligase_like superfamily - - "Adenylation domain of proteins similar to ATP-dependent polynucleotide ligases; ATP-dependent polynucleotide ligases catalyze the phosphodiester bond formation of nicked nucleic acid substrates using ATP as a cofactor in a three step reaction mechanism. This family includes ATP-dependent DNA and RNA ligases. 
DNA ligases play a vital role in the diverse processes of DNA replication, recombination and repair. ATP-dependent DNA ligases have a highly modular architecture, consisting of a unique arrangement of two or more discrete domains, including a DNA-binding domain, an adenylation or nucleotidyltransferase (NTase) domain, and an oligonucleotide/oligosaccharide binding (OB)-fold domain. The adenylation domain binds ATP and contains many active site residues. Together with the C-terminal OB-fold domain, it comprises a catalytic core unit that is common to most members of the ATP-dependent DNA ligase family. The catalytic core contains six conserved sequence motifs (I, III, IIIa, IV, V and VI) that define this family of related nucleotidyltransferases including eukaryotic GRP-dependent mRNA-capping enzymes. The catalytic core contains both the active site as well as many DNA-binding residues. The RNA circularization protein from archaea and bacteria contains the minimal catalytic unit, the adenylation domain, but does not contain an OB-fold domain. This family also includes the m3G-cap binding domain of snurportin, a nuclear import adaptor that binds m3G-capped spliceosomal U small nucleoproteins (snRNPs), but doesn't have enzymatic activity." Q#8699 - CGI_10014046 superfamily 244947 520 658 8.75E-85 269.233 cl08424 OBF_DNA_ligase_family superfamily - - "The Oligonucleotide/oligosaccharide binding (OB)-fold domain is a DNA-binding module that is part of the catalytic core unit of ATP dependent DNA ligases; ATP-dependent polynucleotide ligases catalyze phosphodiester bond formation using nicked nucleic acid substrates with the high energy nucleotide of ATP as a cofactor in a three step reaction mechanism. DNA ligases play a vital role in the diverse processes of DNA replication, recombination and repair. 
ATP dependent DNA ligases have a highly modular architecture consisting of a unique arrangement of two or more discrete domains including a DNA-binding domain, an adenylation (nucleotidyltransferase (NTase)) domain, and an oligonucleotide/oligosaccharide binding (OB)-fold domain. The adenylation and C-terminal OB-fold domains comprise a catalytic core unit that is common to most members of the ATP-dependent DNA ligase family. The catalytic core unit contains six conserved sequence motifs (I, III, IIIa, IV, V and VI) that define this family of related nucleotidyltransferases. The OB-fold domain contacts the nicked DNA substrate and is required for the ATP-dependent DNA ligase nucleotidylation step. The RxDK motif (motif VI), which is essential for ATP hydrolysis, is located in the OB-fold domain." Q#8699 - CGI_10014046 superfamily 189650 9 95 2.46E-25 101.59 cl02913 zf-PARP superfamily - - Poly(ADP-ribose) polymerase and DNA-Ligase Zn-finger region; Poly(ADP-ribose) polymerase is an important regulatory component of the cellular response to DNA damage. The amino-terminal region of Poly(ADP-ribose) polymerase consists of two PARP-type zinc fingers. This region acts as a DNA nick sensor. Q#8699 - CGI_10014046 superfamily 204434 763 788 2.46E-08 51.4265 cl10963 zf-CCHH superfamily - - "Zinc-finger (CX5CX6HX5H) motif; This domain is a zinc-finger motif that in humans is part of the APLF, aprataxin- and PNK-like forkhead association domain-containing protein. The ZnF is highly conserved both in primary sequence and in the spacing between the putative zinc coordinating residues and is configured CX5CX6HX5H. Many of the proteins containing the APLF-like ZnF are involved in DNA strand break repair and/or contain domains implicated in DNA metabolism." 
Q#8700 - CGI_10014047 superfamily 247095 54 467 1.97E-140 412.435 cl15837 alkPPc superfamily - - "Alkaline phosphatase homologues; alkaline phosphatases are non-specific phosphomonoesterases that catalyze the hydrolysis reaction via a phosphoseryl intermediate to produce inorganic phosphate and the corresponding alcohol, optimally at high pH. Alkaline phosphatase exists as a dimer, each monomer binding 2 zinc atoms and one magnesium atom, which are essential for enzymatic activity." Q#8702 - CGI_10014049 superfamily 243045 346 435 1.26E-10 60.3395 cl02459 PAS superfamily - - "PAS domain; PAS motifs appear in archaea, eubacteria and eukarya. Probably the most surprising identification of a PAS domain was that in EAG-like K+-channels. PAS domains have been found to bind ligands, and to act as sensors for light and oxygen in signal transduction." Q#8702 - CGI_10014049 superfamily 221427 931 962 0.000686027 40.8908 cl13540 Period_C superfamily NC - Period protein 2/3C-terminal region; This domain is found in eukaryotes. This domain is typically between 164 to 200 amino acids in length. This domain is found associated with pfam08447. 
Q#8703 - CGI_10014050 superfamily 243034 180 296 2.14E-09 53.538 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#8703 - CGI_10014050 superfamily 215821 29 90 2.33E-05 41.8423 cl18346 FKBP_C superfamily C - FKBP-type peptidyl-prolyl cis-trans isomerase; FKBP-type peptidyl-prolyl cis-trans isomerase. Q#8704 - CGI_10014051 superfamily 221999 391 612 8.77E-117 365.776 cl16180 CLU superfamily - - "Clustered mitochondria; The CLU domain (CLUstered mitochondria) is a eukaryotic domain found in proteins from fungi, protozoa, plants to humans. It is required for correct functioning of the mitochondria and mitochondrial transport although the exact function of the domain is unknown. 
In Dictyostelium the full-length protein is required for a very late step in fission of the outer mitochondrial membrane suggesting that mitochondria are transported along microtubules, as in mammalian cells, rather than along actin filaments, as in budding yeast. Disruption of the protein impaired cytokinesis and caused mitochondria to cluster at the cell centre. It is likely that CLU functions in a novel pathway that positions mitochondria within the cell based on their physiological state. Disruption of the CLU pathway may enhance oxidative damage, alter gene expression, cause mitochondria to cluster at microtubule plus ends, and lead eventually to mitochondrial failure." Q#8704 - CGI_10014051 superfamily 221783 801 987 4.41E-58 199.528 cl15102 eIF3_p135 superfamily - - "Translation initiation factor eIF3 subunit 135; Translation initiation factor eIF3 is a multi-subunit protein complex required for initiation of protein biosynthesis in eukaryotic cells. The complex promotes ribosome dissociation, the binding of the initiator methionyl-tRNA to the 40 S ribosomal subunit, and mRNA recruitment to the ribosome. The protein product from TIF31 genes in yeast is p135 which associates with the eIF3 but does not seem to be necessary for protein translation initiation." Q#8704 - CGI_10014051 superfamily 248006 1058 1098 0.000255408 40.6323 cl17452 TPR_10 superfamily - - Tetratricopeptide repeat; Tetratricopeptide repeat. Q#8704 - CGI_10014051 superfamily 248006 1183 1222 0.000671873 39.0915 cl17452 TPR_10 superfamily - - Tetratricopeptide repeat; Tetratricopeptide repeat. Q#8705 - CGI_10014052 superfamily 242921 24 197 4.18E-82 244.437 cl02175 Rer1 superfamily - - Rer1 family; RER1 family proteins are involved in the retrieval of some endoplasmic reticulum membrane proteins from the early golgi compartment. The C terminus of yeast Rer1p interacts with a coatomer complex. 
Q#8706 - CGI_10014053 superfamily 247755 570 806 7.08E-138 411.163 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#8706 - CGI_10014053 superfamily 216049 249 525 5.16E-44 160.529 cl18356 ABC_membrane superfamily - - ABC transporter transmembrane region; This family represents a unit of six transmembrane helices. Many members of the ABC transporter family (pfam00005) have two such regions. Q#8707 - CGI_10014054 superfamily 243092 221 466 9.49E-60 206.417 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the 
closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#8708 - CGI_10014055 superfamily 242912 1 175 3.55E-101 292.203 cl02161 Ssu72 superfamily - - "Ssu72-like protein; The highly conserved and essential protein Ssu72 has intrinsic phosphatase activity and plays an essential role in the transcription cycle. Ssu72 was originally identified in a yeast genetic screen as enhancer of a defect caused by a mutation in the transcription initiation factor TFIIB. It binds to TFIIB and is also involved in mRNA elongation. Ssu72 is further involved in both poly(A) dependent and independent termination. It is a subunit of the yeast cleavage and polyadenylation factor (CPF), which is part of the machinery for mRNA 3'-end formation. Ssu72 is also essential for transcription termination of snRNAs." Q#8709 - CGI_10014056 superfamily 245225 1107 1161 3.39E-06 49.464 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily N - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein coupled receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine-binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. 
Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#8709 - CGI_10014056 superfamily 245225 949 1032 0.00543024 39.2764 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily NC - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein coupled receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine-binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. 
Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#8710 - CGI_10014057 superfamily 245819 669 845 3.07E-61 205.506 cl11967 Nucleotidyl_cyc_III superfamily - - "Class III nucleotidyl cyclases; Class III nucleotidyl cyclases are the largest, most diverse group of nucleotidyl cyclases (NC's) containing prokaryotic and eukaryotic proteins. They can be divided into two major groups; the mononucleotidyl cyclases (MNC's) and the diguanylate cyclases (DGC's). The MNC's, which include the adenylate cyclases (AC's) and the guanylate cyclases (GC's), have a conserved cyclase homology domain (CHD), while the DGC's have a conserved GGDEF domain, named after a conserved motif within this subgroup. Their products, cyclic guanylyl and adenylyl nucleotides, are second messengers that play important roles in eukaryotic signal transduction and prokaryotic sensory pathways." 
Q#8710 - CGI_10014057 superfamily 245201 371 596 1.13E-23 101.459 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#8710 - CGI_10014057 superfamily 219526 614 655 0.000497132 41.0655 cl06648 HNOBA superfamily N - "Heme NO binding associated; The HNOBA domain is found associated with the HNOB domain and pfam00211 in soluble cyclases and signalling proteins. The HNOB domain is predicted to function as a heme-dependent sensor for gaseous ligands, and transduce diverse downstream signals, in both bacteria and animals." Q#8711 - CGI_10014058 superfamily 248279 7 41 0.00680124 34.2643 cl17725 zf-HC5HC2H superfamily NC - "PHD-like zinc-binding domain; The members of this family are annotated as containing PHD domain, but the zinc-binding region here is not typical of PHD domains. The conformation here is a well-conserved cysteine-histidine rich region spanning 90 residues, where the Cys and His are arranged as HxxC(31)CxxC(6)CxxCxxxxCxxxxHxxC (21)CxxH." Q#8712 - CGI_10014059 superfamily 202484 7 66 5.62E-16 68.0244 cl03798 zf-Tim10_DDP superfamily - - Tim10/DDP family zinc finger; Putative zinc binding domain with four conserved cysteine residues. This domain is found in the human disease protein TIMM8A. Members of this family such as Tim9 and Tim10 are involved in mitochondrial protein import. Members of this family seem to be localised to the mitochondrial intermembrane space. 
Q#8713 - CGI_10014060 superfamily 247097 120 153 0.00948862 31.9661 cl15839 ShK superfamily - - ShK domain-like; This domain is found in several C. elegans proteins. The domain is 30 amino acids long and rich in cysteine residues. There are 6 conserved cysteine positions in the domain that form three disulphide bridges. The domain is found in the potassium channel inhibitor ShK in sea anemone. Q#8714 - CGI_10002449 superfamily 245226 261 472 1.22E-50 175.174 cl10012 DnaQ_like_exo superfamily - - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." 
Q#8714 - CGI_10002449 superfamily 224570 531 723 2.00E-10 59.3966 cl18705 COG1656 superfamily - - Uncharacterized conserved protein [Function unknown] Q#8716 - CGI_10002451 superfamily 247684 1 238 4.60E-121 352.324 cl17037 NBD_sugar-kinase_HSP70_actin superfamily N - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#8717 - CGI_10002452 superfamily 242232 111 160 1.47E-11 56.4124 cl00984 TM2 superfamily - - "TM2 domain; This family is composed of a pair of transmembrane alpha helices connected by a short linker. The function of this domain is unknown, however it occurs in a wide range of protein contexts." 
Q#8718 - CGI_10002453 superfamily 220783 2 92 1.33E-13 66.6863 cl11136 Syntaxin-18_N superfamily - - "SNARE-complex protein Syntaxin-18 N-terminus; This is the conserved N-terminal of Syntaxin-18. Syntaxin-18 is found in the SNARE complex of the endoplasmic reticulum and functions in the trafficking between the ER intermediate compartment and the cis-Golgi vesicle. In particular, the N-terminal region is important for the formation of ER aggregates. More specifically, syntaxin-18 is involved in endoplasmic reticulum-mediated phagocytosis, presumably by regulating the specific and direct fusion of the ER with the plasma or phagosomal membranes." Q#8727 - CGI_10004075 superfamily 245599 220 423 7.83E-144 412.452 cl11397 NR_LBD superfamily - - "The ligand binding domain of nuclear receptors, a family of ligand-activated transcription regulators; Ligand-binding domain (LBD) of nuclear receptor (NR): Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions in metazoans, from development, reproduction, to homeostasis and metabolism. The superfamily contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. The members of the family include receptors of steroids, thyroid hormone, retinoids, cholesterol by-products, lipids and heme. With few exceptions, NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD)." Q#8727 - CGI_10004075 superfamily 207662 124 200 1.42E-54 178.125 cl02596 NR_DBD_like superfamily - - "DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers; DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. 
It interacts with a specific DNA site upstream of the target gene and modulates the rate of transcriptional initiation. Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions, from development, reproduction, to homeostasis and metabolism in animals (metazoans). The family contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). Most nuclear receptors bind as homodimers or heterodimers to their target sites, which consist of two hexameric half-sites. Specificity is determined by the half-site sequence, the relative orientation of the half-sites and the number of spacer nucleotides between the half-sites. However, a growing number of nuclear receptors have been reported to bind to DNA as monomers." Q#8730 - CGI_10004078 superfamily 243072 697 831 3.58E-21 91.291 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#8730 - CGI_10004078 superfamily 243072 391 520 4.25E-19 85.513 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#8730 - CGI_10004078 superfamily 243072 548 655 1.24E-16 78.1942 cl02529 ANK superfamily N - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#8730 - CGI_10004078 superfamily 243072 759 922 9.33E-15 72.8014 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#8730 - CGI_10004078 superfamily 243072 293 452 0.00068426 39.6743 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#8731 - CGI_10004079 superfamily 243072 387 463 3.95E-16 75.4978 cl02529 ANK superfamily N - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#8736 - CGI_10014021 superfamily 241600 18 65 1.37E-15 67.2655 cl00085 FReD superfamily C - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#8737 - CGI_10014022 superfamily 241600 107 282 5.66E-63 200.159 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. 
C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#8737 - CGI_10014022 superfamily 241600 3 91 6.75E-32 118.882 cl00085 FReD superfamily NC - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#8739 - CGI_10014025 superfamily 241600 67 199 7.67E-57 180.899 cl00085 FReD superfamily C - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." 
Q#8745 - CGI_10014031 superfamily 219673 159 377 1.64E-112 331.185 cl06835 COPIIcoated_ERV superfamily - - "Endoplasmic reticulum vesicle transporter; This family is conserved from plants and fungi to humans. Erv46 works in close conjunction with Erv41 and together they form a complex which cycles between the endoplasmic reticulum and Golgi complex. Erv46-41 interacts strongly with the endoplasmic reticulum glucosidase II. Mammalian glucosidase II comprises a catalytic alpha-subunit and a 58 kDa beta subunit, which is required for ER localisation. All proteins identified biochemically as Erv41p-Erv46p interactors are localised to the early secretory pathway and are involved in protein maturation and processing in the ER and/or sorting into COPII vesicles for transport to the Golgi." Q#8745 - CGI_10014031 superfamily 206021 9 113 1.83E-56 182.694 cl16436 ERGIC_N superfamily - - "Endoplasmic Reticulum-Golgi Intermediate Compartment (ERGIC); This family is the N-terminal of ERGIC proteins, ER-Golgi intermediate compartment clusters, otherwise known as Ervs, and is associated with family COPIIcoated_ERV, pfam07970." Q#8746 - CGI_10014032 superfamily 220650 12 78 2.71E-17 69.6589 cl10931 Romo1 superfamily - - "Reactive mitochondrial oxygen species modulator 1; This is a family of small, approximately 100 amino acid, proteins found from yeasts to humans. The majority of endogenous reactive oxygen species (ROS) in cells are produced by the mitochondrial respiratory chain. An increase or imbalance in ROS alters the intracellular redox homeostasis, triggers DNA damage, and may contribute to cancer development and progression. Members of this family are mitochondrial reactive oxygen species modulator 1 (Romo1) proteins that are responsible for increasing the level of ROS in cells. Increased Romo1 expression can have a number of other effects including: inducing premature senescence of cultured human fibroblasts and increased resistance to 5-fluorouracil." 
Q#8748 - CGI_10014034 superfamily 247805 336 491 6.51E-18 82.3852 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#8748 - CGI_10014034 superfamily 247905 684 817 1.73E-13 68.8036 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#8748 - CGI_10014034 superfamily 246680 118 197 1.78E-08 53.0742 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." 
Q#8748 - CGI_10014034 superfamily 213148 548 678 9.29E-08 51.5438 cl17041 helicase_insert_domain superfamily - - "helical domain inserted in SF2-type helicase domain in Hef-, MDA5- and FancM-like proteins; This helical domain can be found inserted in a subset of SF2-type DEAD-box related helicases, like archaeal Hef helicase, MDA5-like helicases and FancM-like helicases. The exact function of this domain is unknown, but seems to play a role in interaction with nucleotides and/or the stabilization of the nucleotide complex." Q#8749 - CGI_10014035 superfamily 246680 120 211 4.75E-08 49.9926 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#8749 - CGI_10014035 superfamily 246680 30 112 0.000109585 39.9774 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. 
They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#8750 - CGI_10010799 superfamily 238012 305 353 3.23E-06 43.8822 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#8750 - CGI_10010799 superfamily 238012 219 255 4.69E-05 40.4154 cl11390 EGF_Lam superfamily N - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#8750 - CGI_10010799 superfamily 238012 50 90 8.35E-05 39.645 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies 
of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#8750 - CGI_10010799 superfamily 238012 258 299 0.000293401 38.1042 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#8750 - CGI_10010799 superfamily 238012 4 45 0.000361491 37.719 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#8750 - CGI_10010799 superfamily 238012 159 205 0.000940214 36.5634 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#8753 - CGI_10010802 superfamily 245205 35 91 0.000801621 36.0617 cl09930 RPA_2b-aaRSs_OBF_like superfamily N - 
"Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." Q#8754 - CGI_10010803 superfamily 243091 6 57 0.000574695 36.5435 cl02566 SET superfamily N - "SET domain; SET domains are protein lysine methyltransferase enzymes. SET domains appear to be protein-protein interaction domains. It has been demonstrated that SET domains mediate interactions with a family of proteins that display similarity with dual-specificity phosphatases (dsPTPases). A subset of SET domains have been called PR domains. These domains are divergent in sequence from other SET domains, but also appear to mediate protein-protein interaction. The SET domain consists of two regions known as SET-N and SET-C. SET-C forms an unusual and conserved knot-like structure of probably functional importance. 
Additionally to SET-N and SET-C, an insert region (SET-I) and flanking regions of high structural variability form part of the overall structure." Q#8755 - CGI_10010804 superfamily 241600 176 248 1.35E-18 80.399 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#8756 - CGI_10010805 superfamily 247637 2 331 3.73E-156 443.628 cl16912 MDR superfamily - - "Medium chain reductase/dehydrogenase (MDR)/zinc-dependent alcohol dehydrogenase-like family; The medium chain reductase/dehydrogenases (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH. MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P) binding-Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES. The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH) , quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. 
The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones. ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines. Other MDR members have only a catalytic zinc, and some contain no coordinated zinc." Q#8757 - CGI_10010806 superfamily 241600 75 162 2.77E-43 145.461 cl00085 FReD superfamily NC - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#8757 - CGI_10010806 superfamily 241619 10 54 2.00E-05 40.2581 cl00112 PAN_APPLE superfamily C - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. 
PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#8759 - CGI_10010808 superfamily 242296 235 273 0.00899913 33.6535 cl01090 SlyX superfamily C - SlyX; The SlyX protein has no known function. It is short less than 80 amino acids and is found close to the slyD gene. The SlyX protein has a conserved PPH(Y/W) motif at its C-terminus. The protein may be a coiled-coil structure. Q#8760 - CGI_10010809 superfamily 219097 7 203 7.73E-91 283.632 cl18491 Muskelin_N superfamily - - "Muskelin N-terminus; This family represents the N-terminal region of muskelin and is found in conjunction with several pfam01344 repeats. Muskelin is an intracellular, kelch repeat protein that is needed in cell-spreading responses to the matrix adhesion molecule, thrombospondin-1." Q#8760 - CGI_10010809 superfamily 243146 278 330 1.03E-06 46.5135 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#8761 - CGI_10010810 superfamily 216101 55 653 1.38E-176 537.647 cl08288 Carn_acyltransf superfamily - - Choline/Carnitine o-acyltransferase; Choline/Carnitine o-acyltransferase. Q#8762 - CGI_10010811 superfamily 222557 26 201 2.01E-76 231.716 cl16634 DUF4291 superfamily - - Domain of unknown function (DUF4291); This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 190 and 214 amino acids in length. There are two conserved sequence motifs: VYQAY and RMTW. Q#8763 - CGI_10019317 superfamily 243035 34 106 9.70E-07 47.9998 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. 
This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. 
In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#8763 - CGI_10019317 superfamily 243035 678 751 1.06E-06 47.9998 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. 
Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#8763 - CGI_10019317 superfamily 243035 142 258 6.12E-06 45.3034 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. 
Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. 
Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#8763 - CGI_10019317 superfamily 241568 606 660 0.0013419 37.8276 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#8767 - CGI_10019321 superfamily 245213 1518 1552 3.66E-08 51.8686 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#8767 - CGI_10019321 superfamily 245213 1558 1584 0.000789137 39.3822 cl09941 EGF_CA superfamily C - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#8770 - CGI_10019324 superfamily 243109 874 1006 3.99E-70 234.117 cl02614 SPRY superfamily - - "SPRY domain; SPRY domains, first identified in the SP1A kinase of Dictyostelium and rabbit Ryanodine receptor (hence the name), are homologous to B30.2. SPRY domains have been identified in at least 11 protein families, covering a wide range of functions, including regulation of cytokine signaling (SOCS), RNA metabolism (DDX1 and hnRNP), immunity to retroviruses (TRIM5alpha), intracellular calcium release (ryanodine receptors or RyR) and regulatory and developmental processes (HERC1 and Ash2L). B30.2 also contains residues in the N-terminus that form a distinct PRY domain structure; i.e. B30.2 domain consists of PRY and SPRY subdomains. B30.2 domains comprise the C-terminus of three protein families: BTNs (receptor glycoproteins of immunoglobulin superfamily); several TRIM proteins (composed of RING/B-box/coiled-coil or RBCC core); Stonutoxin (secreted poisonous protein of the stonefish Synanceia horrida). While SPRY domains are evolutionarily ancient, B30.2 domains are a more recent adaptation where the SPRY/PRY combination is a possible component of immune defense. Mutations found in the SPRY-containing proteins have shown to cause Mediterranean fever and Opitz syndrome." Q#8770 - CGI_10019324 superfamily 243109 439 588 1.17E-68 230.662 cl02614 SPRY superfamily - - "SPRY domain; SPRY domains, first identified in the SP1A kinase of Dictyostelium and rabbit Ryanodine receptor (hence the name), are homologous to B30.2. 
SPRY domains have been identified in at least 11 protein families, covering a wide range of functions, including regulation of cytokine signaling (SOCS), RNA metabolism (DDX1 and hnRNP), immunity to retroviruses (TRIM5alpha), intracellular calcium release (ryanodine receptors or RyR) and regulatory and developmental processes (HERC1 and Ash2L). B30.2 also contains residues in the N-terminus that form a distinct PRY domain structure; i.e. B30.2 domain consists of PRY and SPRY subdomains. B30.2 domains comprise the C-terminus of three protein families: BTNs (receptor glycoproteins of immunoglobulin superfamily); several TRIM proteins (composed of RING/B-box/coiled-coil or RBCC core); Stonutoxin (secreted poisonous protein of the stonefish Synanceia horrida). While SPRY domains are evolutionarily ancient, B30.2 domains are a more recent adaptation where the SPRY/PRY combination is a possible component of immune defense. Mutations found in the SPRY-containing proteins have shown to cause Mediterranean fever and Opitz syndrome." Q#8770 - CGI_10019324 superfamily 243109 1422 1567 5.00E-63 214.472 cl02614 SPRY superfamily - - "SPRY domain; SPRY domains, first identified in the SP1A kinase of Dictyostelium and rabbit Ryanodine receptor (hence the name), are homologous to B30.2. SPRY domains have been identified in at least 11 protein families, covering a wide range of functions, including regulation of cytokine signaling (SOCS), RNA metabolism (DDX1 and hnRNP), immunity to retroviruses (TRIM5alpha), intracellular calcium release (ryanodine receptors or RyR) and regulatory and developmental processes (HERC1 and Ash2L). B30.2 also contains residues in the N-terminus that form a distinct PRY domain structure; i.e. B30.2 domain consists of PRY and SPRY subdomains. 
B30.2 domains comprise the C-terminus of three protein families: BTNs (receptor glycoproteins of immunoglobulin superfamily); several TRIM proteins (composed of RING/B-box/coiled-coil or RBCC core); Stonutoxin (secreted poisonous protein of the stonefish Synanceia horrida). While SPRY domains are evolutionarily ancient, B30.2 domains are a more recent adaptation where the SPRY/PRY combination is a possible component of immune defense. Mutations found in the SPRY-containing proteins have shown to cause Mediterranean fever and Opitz syndrome." Q#8770 - CGI_10019324 superfamily 216456 234 440 3.52E-72 243 cl03182 RYDR_ITPR superfamily - - "RIH domain; The RIH (RyR and IP3R Homology) domain is an extracellular domain from two types of calcium channels. This region is found in the ryanodine receptor and the inositol-1,4,5-trisphosphate receptor. This domain may form a binding site for IP3." Q#8770 - CGI_10019324 superfamily 216456 2119 2363 2.81E-54 191.383 cl03182 RYDR_ITPR superfamily - - "RIH domain; The RIH (RyR and IP3R Homology) domain is an extracellular domain from two types of calcium channels. This region is found in the ryanodine receptor and the inositol-1,4,5-trisphosphate receptor. This domain may form a binding site for IP3." Q#8770 - CGI_10019324 superfamily 202095 758 851 4.03E-37 137.75 cl03409 RyR superfamily - - RyR domain; This domain is called RyR for Ryanodine receptor. The domain is found in four copies in the ryanodine receptor. The function of this domain is unknown. Q#8770 - CGI_10019324 superfamily 202095 643 735 5.78E-32 123.113 cl03409 RyR superfamily - - RyR domain; This domain is called RyR for Ryanodine receptor. The domain is found in four copies in the ryanodine receptor. The function of this domain is unknown.
Q#8770 - CGI_10019324 superfamily 197746 7 60 0.00040581 40.7875 cl02624 MIR superfamily - - Domain in ryanodine and inositol trisphosphate receptors and protein O-mannosyltransferases; Domain in ryanodine and inositol trisphosphate receptors and protein O-mannosyltransferases. Q#8770 - CGI_10019324 superfamily 197746 66 92 0.000745481 40.0171 cl02624 MIR superfamily C - Domain in ryanodine and inositol trisphosphate receptors and protein O-mannosyltransferases; Domain in ryanodine and inositol trisphosphate receptors and protein O-mannosyltransferases. Q#8771 - CGI_10019325 superfamily 247856 757 812 1.42E-12 65.2617 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#8771 - CGI_10019325 superfamily 219849 553 675 5.43E-36 134.234 cl09597 RIH_assoc superfamily - - "RyR and IP3R Homology associated; This eukaryotic domain is found in ryanodine receptors (RyR) and inositol 1,4,5-trisphosphate receptors (IP3R) which together form a superfamily of homotetrameric ligand-gated intracellular Ca2+ channels. There seems to be no known function for this domain. Also see the IP3-binding domain pfam01365 and pfam02815." Q#8771 - CGI_10019325 superfamily 219038 1113 1244 2.11E-35 136.371 cl05786 RR_TM4-6 superfamily N - Ryanodine Receptor TM 4-6; This region covers TM regions 4-6 of the ryanodine receptor 1 family. Q#8771 - CGI_10019325 superfamily 247739 471 516 0.00183332 38.2676 cl17185 LPLAT superfamily N - "Lysophospholipid acyltransferases (LPLATs) of glycerophospholipid biosynthesis; Lysophospholipid acyltransferase (LPLAT) superfamily members are acyltransferases of de novo and remodeling pathways of glycerophospholipid biosynthesis. 
These proteins catalyze the incorporation of an acyl group from either acylCoAs or acyl-acyl carrier proteins (acylACPs) into acceptors such as glycerol 3-phosphate, dihydroxyacetone phosphate or lyso-phosphatidic acid. Included in this superfamily are LPLATs such as glycerol-3-phosphate 1-acyltransferase (GPAT, PlsB), 1-acyl-sn-glycerol-3-phosphate acyltransferase (AGPAT, PlsC), lysophosphatidylcholine acyltransferase 1 (LPCAT-1), lysophosphatidylethanolamine acyltransferase (LPEAT, also known as, MBOAT2, membrane-bound O-acyltransferase domain-containing protein 2), lipid A biosynthesis lauroyl/myristoyl acyltransferase, 2-acylglycerol O-acyltransferase (MGAT), dihydroxyacetone phosphate acyltransferase (DHAPAT, also known as 1 glycerol-3-phosphate O-acyltransferase 1) and Tafazzin (the protein product of the Barth syndrome (TAZ) gene)." Q#8775 - CGI_10019329 superfamily 245201 55 288 1.27E-81 264.382 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#8775 - CGI_10019329 superfamily 247694 749 846 7.50E-71 229.256 cl17070 AMPKA_C_like superfamily - - "C-terminal regulatory domain of 5'-AMP-activated protein kinase (AMPK) alpha subunit and similar domains; This family is composed of AMPKs, microtubule-associated protein/microtubule affinity regulating kinases (MARKs), yeast Kcc4p-like proteins, plant calcineurin B-Like (CBL)-interacting protein kinases (CIPKs), and similar proteins. 
They are serine/threonine protein kinases (STKs) that catalyze the transfer of the gamma-phosphoryl group from ATP to S/T residues on protein substrates. AMPKs act as sensors for the energy status of the cell and are activated by cellular stresses that lead to ATP depletion such as hypoxia, heat shock, and glucose deprivation, among others. MARKs phosphorylate the tau protein and related microtubule-associated proteins (MAPs) on tubulin binding sites to induce detachment from microtubules, and are involved in the regulation of cell shape and polarity, cell cycle control, transport, and the cytoskeleton. Kcc4p and related proteins are septin-associated proteins that are involved in septin organization and in the yeast morphogenesis checkpoint coordinating the cell cycle with bud formation. CIPKs interact with the calcineurin B-like (CBL) calcium sensors to form a signaling network that decode specific calcium signals triggered by a variety of environmental stimuli including salinity, drought, cold, light, and mechanical perturbation, among others. All members of this family contain an N-terminal catalytic kinase domain and a C-terminal regulatory domain which is also called kinase associated domain 1 (KA1) in some cases. The C-terminal regulatory domain serves as a protein interaction domain in AMPKs and CIPKs. In MARKs and Kcc4p-like proteins, this domain binds phospholipids and may be involved in membrane localization." Q#8775 - CGI_10019329 superfamily 241643 362 398 0.00137688 37.466 cl00153 UBA superfamily - - "Ubiquitin Associated domain. The UBA domain is a commonly occurring sequence motif in some members of the ubiquitination pathway, UV excision repair proteins, and certain protein kinases. Although its specific role is so far unknown, it has been suggested that UBA domains are involved in conferring protein target specificity. The domain, a compact three helix bundle, has a conserved GFP-loop and the proline is thought to be critical for binding. 
The UBA domain is distinct from the conserved three helical domain seen in the N-terminus of EF-TS and eukaryotic NAC proteins." Q#8776 - CGI_10019330 superfamily 150144 14 113 1.01E-17 77.1282 cl09624 Tex_N superfamily C - Tex-like protein N-terminal domain; This presumed domain is found at the N-terminus of Bordetella pertussis tex. This protein defines a novel family of prokaryotic transcriptional accessory factors. Q#8777 - CGI_10019331 superfamily 245202 91 150 5.38E-22 84.592 cl09927 S1_like superfamily - - "S1_like: Ribosomal protein S1-like RNA-binding domain. Found in a wide variety of RNA-associated proteins. Originally identified in S1 ribosomal protein. This superfamily also contains the Cold Shock Domain (CSD), which is a homolog of the S1 domain. Both domains are members of the Oligonucleotide/oligosaccharide Binding (OB) fold." Q#8778 - CGI_10019332 superfamily 247787 63 306 1.97E-63 202.428 cl17233 RecA-like_NTPases superfamily - - "RecA-like NTPases. This family includes the NTP binding domain of F1 and V1 H+ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. This group also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion." Q#8780 - CGI_10019334 superfamily 241550 382 627 7.51E-70 233.663 cl00015 nt_trans superfamily N - "nucleotidyl transferase superfamily; nt_trans (nucleotidyl transferase) This superfamily includes the class I amino-acyl tRNA synthetases, pantothenate synthetase (PanC), ATP sulfurylase, and the cytidylyltransferases, all of which have a conserved dinucleotide-binding domain." 
Q#8780 - CGI_10019334 superfamily 241550 64 178 1.74E-42 157.009 cl00015 nt_trans superfamily C - "nucleotidyl transferase superfamily; nt_trans (nucleotidyl transferase) This superfamily includes the class I amino-acyl tRNA synthetases, pantothenate synthetase (PanC), ATP sulfurylase, and the cytidylyltransferases, all of which have a conserved dinucleotide-binding domain." Q#8780 - CGI_10019334 superfamily 245839 631 724 2.59E-06 47.1344 cl12020 Anticodon_Ia_like superfamily C - "Anticodon-binding domain of class Ia aminoacyl tRNA synthetases and similar domains; This domain is found in a variety of class Ia aminoacyl tRNA synthetases, C-terminal to the catalytic core domain. It recognizes and specifically binds to the anticodon of the tRNA. Aminoacyl tRNA synthetases catalyze the transfer of cognate amino acids to the 3'-end of their tRNAs by specifically recognizing cognate from non-cognate amino acids. Members include valyl-, leucyl-, isoleucyl-, cysteinyl-, arginyl-, and methionyl-tRNA synthetases. This superfamily also includes a domain from MshC, an enzyme in the mycothiol biosynthetic pathway." Q#8782 - CGI_10019336 superfamily 206088 10 31 0.00166821 34.6191 cl16476 zf-CCHC_3 superfamily C - "Zinc knuckle; The zinc knuckle is a zinc binding motif composed of the following CX2CX4HX4C where X can be any amino acid. The motifs are mostly from retroviral gag proteins (nucleocapsid). Prototype structure is from HIV. Also contains members involved in eukaryotic gene regulation, such as C. elegans GLH-1. Structure is an 18-residue zinc finger." Q#8783 - CGI_10011123 superfamily 244824 103 561 4.68E-119 365.143 cl07893 AmyAc_family superfamily - - "Alpha amylase catalytic domain family; The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides.
These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; and C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost this catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase." 
Q#8785 - CGI_10011125 superfamily 243034 388 484 1.60E-22 92.8283 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#8785 - CGI_10011125 superfamily 243034 10 109 1.51E-20 87.0503 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of
TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#8785 - CGI_10011125 superfamily 243034 253 352 1.71E-14 69.7163 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110
subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#8785 - CGI_10011125 superfamily 128966 513 552 2.32E-05 42.2546 cl17974 STI1 superfamily - - Heat shock chaperonin-binding motif; Heat shock chaperonin-binding motif. Q#8786 - CGI_10011126 superfamily 241580 132 210 1.32E-42 146.93 cl00061 FH superfamily - - "Forkhead (FH), also known as a "winged helix". FH is named for the Drosophila fork head protein, a transcription factor which promotes terminal rather than segmental development. This family of transcription factor domains, which bind to B-DNA as monomers, are also found in the Hepatocyte nuclear factor (HNF) proteins, which provide tissue-specific gene regulation. The structure contains 2 flexible loops or "wings" in the C-terminal region, hence the term winged helix." Q#8787 - CGI_10011127 superfamily 241758 1 194 7.19E-83 261.002 cl00292 AANH_like superfamily - - "Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases, ATP sulphurylases, Universal Stress Response protein and electron transfer flavoprotein (ETF). The domain forms an alpha/beta/alpha fold which binds to adenosine nucleotide." Q#8787 - CGI_10011127 superfamily 245229 415 535 5.59E-30 114.734 cl10015 YjgF_YER057c_UK114_family superfamily - - "YjgF, YER057c, and UK114 belong to a large family of proteins present in bacteria, archaea, and eukaryotes with no definitive function. The conserved domain is similar in structure to chorismate mutase but there is no sequence similarity and no functional connection. Members of this family have been implicated in isoleucine (Yeo7, Ibm1, aldR) and purine (YjgF) biosynthesis, as well as threonine anaerobic degradation (tdcF) and mitochondrial DNA maintenance (Ibm1). This domain homotrimerizes forming a distinct intersubunit cavity that may serve as a small molecule binding site."
Q#8787 - CGI_10011127 superfamily 245229 308 387 7.48E-17 76.9121 cl10015 YjgF_YER057c_UK114_family superfamily - - "YjgF, YER057c, and UK114 belong to a large family of proteins present in bacteria, archaea, and eukaryotes with no definitive function. The conserved domain is similar in structure to chorismate mutase but there is no sequence similarity and no functional connection. Members of this family have been implicated in isoleucine (Yeo7, Ibm1, aldR) and purine (YjgF) biosynthesis, as well as threonine anaerobic degradation (tdcF) and mitochondrial DNA maintenance (Ibm1). This domain homotrimerizes forming a distinct intersubunit cavity that may serve as a small molecule binding site." Q#8790 - CGI_10011130 superfamily 245201 571 830 1.05E-108 342.979 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#8791 - CGI_10011131 superfamily 241563 13 46 0.00307776 35.918 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#8793 - CGI_10011133 superfamily 216062 79 308 1.60E-55 186.104 cl02928 TGFb_propeptide superfamily - - TGF-beta propeptide; This propeptide is known as latency associated peptide (LAP) in TGF-beta. LAP is a homodimer which is disulfide linked to TGF-beta binding protein. 
Q#8793 - CGI_10011133 superfamily 243062 360 461 2.87E-54 178.238 cl02510 TGF_beta superfamily - - Transforming growth factor beta like domain; Transforming growth factor beta like domain. Q#8794 - CGI_10011134 superfamily 243092 843 1117 2.78E-64 220.284 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#8794 - CGI_10011134 superfamily 243074 467 507 8.16E-10 56.3609 cl02535 F-box-like superfamily - - F-box-like; This is an F-box-like family. Q#8795 - CGI_10011135 superfamily 203311 30 180 1.73E-34 122.438 cl18239 CDKN3 superfamily - - Cyclin-dependent kinase inhibitor 3 (CDKN3); This family consists of cyclin-dependent kinase inhibitor 3 or kinase associated phosphatase proteins from several mammalian species. 
The cyclin-dependent kinase (Cdk)-associated protein phosphatase (KAP) is a human dual specificity protein phosphatase that dephosphorylates Cdk2 on threonine 160 in a cyclin-dependent manner. Q#8796 - CGI_10028418 superfamily 241578 5 180 1.42E-12 63.739 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#8797 - CGI_10028419 superfamily 248119 17 123 3.37E-12 59.232 cl17565 TerD_like superfamily N - "Uncharacterized proteins involved in stress response, similar to tellurium resistance terD; Tellurium resistance terD like proteins. This family is composed of uncharacterized proteins involved in stress response, such as the tellurium resistance proteins, chemical-damaging agent resistance proteins, and general stress proteins from a variety of organisms. The tellurium resistance proteins are homologous terA,-D,-E,-F,-Z,-X gene products, which confer tellurium resistance mediated by plasmids. Currently, the biochemical mechanism of tellurium resistance remains unknown.
The family also contains several ter gene homologues, YceC, YceD, YceE, for which there is no clear evidence for any involvement in the tellurium resistance. A putative cAMP-binding protein CABP1 shows a significant similarity to the terD protein and is also included in this family." Q#8798 - CGI_10028420 superfamily 243092 29 148 0.00042299 42.7072 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment."
Q#8799 - CGI_10028421 superfamily 243092 687 952 1.12E-65 222.981 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#8799 - CGI_10028421 superfamily 248020 28 387 4.16E-48 175.345 cl17466 Sulfatase superfamily - - Sulfatase; Sulfatase. Q#8799 - CGI_10028421 superfamily 208922 538 607 5.67E-13 67.2183 cl08418 TAF5_NTD2 superfamily C - "TAF5_NTD2 is the second conserved N-terminal region of TATA Binding Protein (TBP) Associated Factor 5 (TAF5), involved in forming Transcription Factor IID (TFIID); The TATA Binding Protein (TBP) Associated Factor 5 (TAF5) is one of several TAFs that bind TBP and are involved in forming Transcription Factor IID (TFIID) complex. TAF5 contains three domains, two conserved sequence motifs at the N-terminal and one at the C-terminal region. 
TFIID is one of seven General Transcription Factors (GTF) (TFIIA, TFIIB, TFIID, TFIIE, TFIIF, and TFIIH) involved in accurate initiation of transcription by RNA polymerase II in eukaryotes. TFIID plays an important role in the recognition of promoter DNA and assembly of the preinitiation complex. TFIID complex is composed of the TBP and at least 13 TAFs. In yeast and human cells, TAFs have been found as components of other complexes besides TFIID. TAF5 may play a major role in forming TFIID and its related complexes. TAFs from various species were originally named by their predicted molecular weight or their electrophoretic mobility in polyacrylamide gels. A new, unified nomenclature for the pol II TAFs has been suggested to show the relationship between TAF orthologs and paralogs. TAF5 has a paralog gene (TAF5L) which has a redundant function. Several hypotheses are proposed for TAFs functions such as serving as activator-binding sites, core-promoter recognition or a role in essential catalytic activity. C-terminus of TAF5 contains six WD40 repeats that likely form a closed beta propeller structure and may be involved in protein-protein interaction. The first part of the TAF5 N-terminal (TAF5_NTD1) homodimerizes in the absence of other TAFs. The second conserved N-terminal part of TAF5 (TAF5_NTD2) has an alpha-helical domain. One study has shown that TAF5_NTD2 homodimerizes only at high concentration of calcium but not any other metals. No dimerization was observed in other structural studies of TAF_NTD2. Several TAFs interact via histone-fold (HFD) motifs; HFD is the interaction motif involved in heterodimerization of the core histones and their assembly into nucleosome octamer. However, TAF5 does not have a HFD motif."
Q#8802 - CGI_10028424 superfamily 216981 35 73 0.000453279 38.2826 cl17087 OTU superfamily C - "OTU-like cysteine protease; This family is comprised of a group of predicted cysteine proteases, homologous to the Ovarian Tumour (OTU) gene in Drosophila. Members include proteins from eukaryotes, viruses and pathogenic bacteria. The conserved cysteine and histidine, and possibly the aspartate, represent the catalytic residues in this putative group of proteases." Q#8802 - CGI_10028424 superfamily 243176 131 154 0.00449485 37.2783 cl02777 chaperonin_like superfamily NC - "chaperonin_like superfamily. Chaperonins are involved in productive folding of proteins. They share a common general morphology, a double toroid of 2 stacked rings, each composed of 7-9 subunits. There are 2 main chaperonin groups. The symmetry of type I is seven-fold and they are found in eubacteria (GroEL) and in organelles of eubacterial descent (hsp60 and RBP). The symmetry of type II is eight- or nine-fold and they are found in archaea (thermosome), thermophilic bacteria (TF55) and in the eukaryotic cytosol (CCT). Their common function is to sequester nonnative proteins inside their central cavity and promote folding by using energy derived from ATP hydrolysis. This superfamily also contains related domains from Fab1-like phosphatidylinositol 3-phosphate (PtdIns3P) 5-kinases that only contain the intermediate and apical domains." Q#8806 - CGI_10028428 superfamily 245226 80 230 2.58E-18 80.0372 cl10012 DnaQ_like_exo superfamily - - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members.
The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#8806 - CGI_10028428 superfamily 244936 262 312 0.00813789 35.3212 cl08398 MltA superfamily C - MltA specific insert domain; This beta barrel domain is found inserted in the MltA a murein degrading transglycosylase enzyme. This domain may be involved in peptidoglycan binding. Q#8808 - CGI_10028430 superfamily 243051 26 135 2.65E-10 55.0493 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. 
In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal cord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich regions." Q#8809 - CGI_10028431 superfamily 246664 303 679 2.81E-148 449.718 cl14561 An_peroxidase_like superfamily - - "Animal heme peroxidases and related proteins; A diverse family of enzymes, which includes prostaglandin G/H synthase, thyroid peroxidase, myeloperoxidase, linoleate diol synthase, lactoperoxidase, peroxinectin, peroxidasin, and others. Despite its name, this family is not restricted to metazoans: members are found in fungi, plants, and bacteria as well." Q#8809 - CGI_10028431 superfamily 246664 893 995 1.47E-30 126.27 cl14561 An_peroxidase_like superfamily C - "Animal heme peroxidases and related proteins; A diverse family of enzymes, which includes prostaglandin G/H synthase, thyroid peroxidase, myeloperoxidase, linoleate diol synthase, lactoperoxidase, peroxinectin, peroxidasin, and others. Despite its name, this family is not restricted to metazoans: members are found in fungi, plants, and bacteria as well." Q#8809 - CGI_10028431 superfamily 246664 154 240 1.83E-05 46.894 cl14561 An_peroxidase_like superfamily C - "Animal heme peroxidases and related proteins; A diverse family of enzymes, which includes prostaglandin G/H synthase, thyroid peroxidase, myeloperoxidase, linoleate diol synthase, lactoperoxidase, peroxinectin, peroxidasin, and others. Despite its name, this family is not restricted to metazoans: members are found in fungi, plants, and bacteria as well." 
Q#8810 - CGI_10028432 superfamily 243027 8 84 6.47E-22 83.4811 cl02418 Hormone_5 superfamily - - "Neurohypophysial hormones, C-terminal Domain; N-terminal Domain is in hormone5" Q#8813 - CGI_10028435 superfamily 241780 17 299 2.78E-127 373.719 cl00319 Gn_AT_II superfamily - - "Glutamine amidotransferases class-II (GATase). The glutaminase domain catalyzes an amide nitrogen transfer from glutamine to the appropriate substrate. In this process, glutamine is hydrolyzed to glutamic acid and ammonia. This domain is related to members of the Ntn (N-terminal nucleophile) hydrolase superfamily and is found at the N-terminus of enzymes such as glucosamine-fructose 6-phosphate synthase (GLMS or GFAT), glutamine phosphoribosylpyrophosphate (Prpp) amidotransferase (GPATase), asparagine synthetase B (AsnB), beta lactam synthetase (beta-LS) and glutamate synthase (GltS). GLMS catalyzes the formation of glucosamine 6-phosphate from fructose 6-phosphate and glutamine in amino sugar synthesis. GPATase catalyzes the first step in purine biosynthesis, an amide transfer from glutamine to PRPP, resulting in phosphoribosylamine, pyrophosphate and glutamate. Asparagine synthetase B synthesizes asparagine from aspartate and glutamine. Beta-LS catalyzes the formation of the beta-lactam ring in the beta-lactamase inhibitor clavulanic acid. GltS synthesizes L-glutamate from 2-oxoglutarate and L-glutamine. These enzymes are generally dimers, but GPATase also exists as a homotetramer." Q#8813 - CGI_10028435 superfamily 241770 310 445 2.08E-15 72.8136 cl00309 PRTases_typeI superfamily - - "Phosphoribosyl transferase (PRT)-type I domain; Phosphoribosyl transferase (PRT) domain. The type I PRTases are identified by a conserved PRPP binding motif which features two adjacent acidic residues surrounded by one or more hydrophobic residues. PRTases catalyze the displacement of the alpha-1'-pyrophosphate of 5-phosphoribosyl-alpha1-pyrophosphate (PRPP) by a nitrogen-containing nucleophile. 
The reaction products are an alpha-1 substituted ribose-5'-phosphate and a free pyrophosphate (PP). PRPP, an activated form of ribose-5-phosphate, is a key metabolite connecting nucleotide synthesis and salvage pathways. The type I PRTase family includes a range of diverse phosphoribosyl transferase enzymes and regulatory proteins of the nucleotide synthesis and salvage pathways, including adenine phosphoribosyltransferase EC:2.4.2.7., hypoxanthine-guanine-xanthine phosphoribosyltransferase, hypoxanthine phosphoribosyltransferase EC:2.4.2.8., ribose-phosphate pyrophosphokinase EC:2.7.6.1., amidophosphoribosyltransferase EC:2.4.2.14., orotate phosphoribosyltransferase EC:2.4.2.10., uracil phosphoribosyltransferase EC:2.4.2.9., and xanthine-guanine phosphoribosyltransferase EC:2.4.2.22." Q#8814 - CGI_10028436 superfamily 241708 7 257 4.42E-164 464.458 cl00231 SAICAR_synt superfamily - - "5-aminoimidazole-4-(N-succinylcarboxamide) ribonucleotide (SAICAR) synthase; SAICAR synthetase (the PurC gene product) catalyzes the seventh step of the de novo biosynthesis of purine nucleotides (also reported as eighth step). It converts 5-aminoimidazole-4-carboxyribonucleotide (CAIR), ATP, and L-aspartate into 5-aminoimidazole-4-(N-succinylcarboxamide) ribonucleotide (SAICAR), ADP, and phosphate." Q#8814 - CGI_10028436 superfamily 241771 267 417 3.22E-48 163.555 cl00310 AIRC superfamily - - AIR carboxylase; Members of this family catalyze the decarboxylation of 1-(5-phosphoribosyl)-5-amino-4-imidazole-carboxylate (AIR). This family catalyze the sixth step of de novo purine biosynthesis. Some members of this family contain two copies of this domain. Q#8817 - CGI_10028439 superfamily 245201 965 1217 3.65E-83 273.643 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. 
It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#8817 - CGI_10028439 superfamily 247856 1256 1307 8.20E-05 42.1497 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#8817 - CGI_10028439 superfamily 245201 582 847 1.63E-28 116.795 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#8817 - CGI_10028439 superfamily 247725 188 319 2.40E-16 78.1464 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. 
This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#8817 - CGI_10028439 superfamily 246908 430 502 8.40E-11 60.7613 cl15255 SH2 superfamily - - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." Q#8817 - CGI_10028439 superfamily 246908 305 377 9.58E-10 57.6797 cl15255 SH2 superfamily - - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." Q#8818 - CGI_10028440 superfamily 247057 499 569 2.39E-32 119.063 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. 
Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#8818 - CGI_10028440 superfamily 248259 129 227 7.16E-39 137.766 cl17705 MBT superfamily - - "mbt repeat; The function of this repeat is unknown, but is found in a number of nuclear proteins such as drosophila sex comb on midleg protein. The repeat is found in up to four copies. The repeat contains a completely conserved glutamate at its amino terminus that may be important for function." Q#8818 - CGI_10028440 superfamily 248259 23 118 1.25E-36 131.603 cl17705 MBT superfamily - - "mbt repeat; The function of this repeat is unknown, but is found in a number of nuclear proteins such as drosophila sex comb on midleg protein. The repeat is found in up to four copies. The repeat contains a completely conserved glutamate at its amino terminus that may be important for function." Q#8818 - CGI_10028440 superfamily 221437 305 415 2.58E-35 128.627 cl13561 DUF3588 superfamily - - "Protein of unknown function (DUF3588); This family of proteins is found in eukaryotes. Proteins in this family are typically between 129 and 866 amino acids in length, and the family is found in association with pfam02820. The exact function of this family is not known." Q#8819 - CGI_10028441 superfamily 248458 1 274 3.74E-11 61.9461 cl17904 MFS superfamily - - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. 
MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#8820 - CGI_10028442 superfamily 248458 73 435 1.36E-17 82.7469 cl17904 MFS superfamily - - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. 
Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#8821 - CGI_10028443 superfamily 215847 72 111 5.65E-06 43.2039 cl09510 Lipoxygenase superfamily N - Lipoxygenase; Lipoxygenase. Q#8822 - CGI_10028445 superfamily 215847 109 232 1.54E-19 86.7314 cl09510 Lipoxygenase superfamily N - Lipoxygenase; Lipoxygenase. Q#8823 - CGI_10028446 superfamily 243199 37 78 4.43E-13 62.2277 cl02808 RT_like superfamily C - "RT_like: Reverse transcriptase (RT, RNA-dependent DNA polymerase)_like family. An RT gene is usually indicative of a mobile element such as a retrotransposon or retrovirus. RTs occur in a variety of mobile elements, including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, and caulimoviruses. These elements can be divided into two major groups. One group contains retroviruses and DNA viruses whose propagation involves an RNA intermediate. 
They are grouped together with transposable elements containing long terminal repeats (LTRs). The other group, also called poly(A)-type retrotransposons, contain fungal mitochondrial introns and transposable elements that lack LTRs." Q#8824 - CGI_10028447 superfamily 206130 85 114 0.000342324 38.3387 cl16501 DUF4218 superfamily NC - Domain of unknown function (DUF4218); Domain of unknown function (DUF4218). Q#8829 - CGI_10028452 superfamily 246675 19 292 4.39E-101 300.312 cl14615 PI-PLCc_GDPD_SF superfamily - - "Catalytic domain of phosphoinositide-specific phospholipase C-like phosphodiesterases superfamily; The PI-PLC-like phosphodiesterases superfamily represents the catalytic domains of bacterial phosphatidylinositol-specific phospholipase C (PI-PLC, EC 4.6.1.13), eukaryotic phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11), glycerophosphodiester phosphodiesterases (GP-GDE, EC 3.1.4.46), sphingomyelinases D (SMases D) (sphingomyelin phosphodiesterase D, EC 3.1.4.41) from spider venom, SMases D-like proteins, and phospholipase D (PLD) from several pathogenic bacteria, as well as their uncharacterized homologs found in organisms ranging from bacteria and archaea to metazoans, plants, and fungi. PI-PLCs are ubiquitous enzymes hydrolyzing the membrane lipid phosphoinositides to yield two important second messengers, inositol phosphates and diacylglycerol (DAG). GP-GDEs play essential roles in glycerol metabolism and catalyze the hydrolysis of glycerophosphodiesters to sn-glycerol-3-phosphate (G3P) and the corresponding alcohols that are major sources of carbon and phosphate. Both, PI-PLCs and GP-GDEs, can hydrolyze the 3'-5' phosphodiester bonds in different substrates, and utilize a similar mechanism of general base and acid catalysis with conserved histidine residues, which consists of two steps, a phosphotransfer and a phosphodiesterase reaction. 
This superfamily also includes Neurospora crassa ankyrin repeat protein NUC-2 and its Saccharomyces cerevisiae counterpart, Phosphate system positive regulatory protein PHO81, glycerophosphodiester phosphodiesterase (GP-GDE)-like protein SHV3 and SHV3-like proteins (SVLs). The residues essential for enzyme activities and metal binding are not conserved in these sequence homologs, which might suggest that the function of catalytic domains in these proteins might be distinct from those in typical PLC-like phosphodiesterases." Q#8831 - CGI_10028454 superfamily 245201 7 119 3.40E-19 81.5141 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#8832 - CGI_10028455 superfamily 243072 129 250 2.17E-32 123.648 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#8832 - CGI_10028455 superfamily 246680 799 873 5.20E-07 48.5955 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. 
DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#8834 - CGI_10028457 superfamily 243082 561 800 1.77E-34 136.561 cl02553 Peptidase_C19 superfamily N - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." Q#8834 - CGI_10028457 superfamily 243082 79 227 1.56E-22 99.9674 cl02553 Peptidase_C19 superfamily C - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. 
The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." Q#8834 - CGI_10028457 superfamily 177476 942 1046 0.0058396 38.9543 cl14386 PHA02694 superfamily C - hypothetical protein; Provisional Q#8835 - CGI_10028458 superfamily 245201 489 739 2.27E-67 224.706 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#8835 - CGI_10028458 superfamily 241645 97 174 1.64E-11 61.4365 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." 
Q#8836 - CGI_10028459 superfamily 241645 24 105 3.71E-22 84.1632 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#8838 - CGI_10028461 superfamily 246908 61 147 3.26E-42 146.355 cl15255 SH2 superfamily - - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." Q#8838 - CGI_10028461 superfamily 241566 228 277 3.94E-15 70.2135 cl00040 C1 superfamily - - "Protein kinase C conserved region 1 (C1) . Cysteine-rich zinc binding domain. Some members of this domain family bind phorbol esters and diacylglycerol, some are reported to bind RasGTP. 
May occur in tandem arrangement. Diacylglycerol (DAG) is a second messenger, released by activation of Phospholipase D. Phorbol Esters (PE) can act as analogues of DAG and mimic its downstream effects in, for example, tumor promotion. Protein Kinases C are activated by DAG/PE, this activation is mediated by their N-terminal conserved region (C1). DAG/PE binding may be phospholipid dependent. C1 domains may also mediate DAG/PE signals in chimaerins (a family of Rac GTPase activating proteins), RasGRPs (exchange factors for Ras/Rap1), and Munc13 isoforms (scaffolding proteins involved in exocytosis)." Q#8838 - CGI_10028461 superfamily 243095 288 483 1.75E-72 230.867 cl02570 RhoGAP superfamily - - "RhoGAP: GTPase-activator protein (GAP) for Rho-like GTPases; GAPs towards Rho/Rac/Cdc42-like small GTPases. Small GTPases (G proteins) cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when bound to GDP. The Rho family of small G proteins, which includes Cdc42Hs, activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. G proteins generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. The RhoGAPs are one of the major classes of regulators of Rho G proteins." 
Q#8845 - CGI_10028468 superfamily 241766 298 406 0.000137146 42.4155 cl00303 PNP_UDP_1 superfamily C - Phosphorylase superfamily; Members of this family include: purine nucleoside phosphorylase (PNP) Uridine phosphorylase (UdRPase) 5'-methylthioadenosine phosphorylase (MTA phosphorylase) Q#8846 - CGI_10028469 superfamily 247723 43 120 1.09E-42 144.821 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#8846 - CGI_10028469 superfamily 247723 134 206 3.93E-27 102.336 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. 
The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#8847 - CGI_10028470 superfamily 241569 128 181 2.40E-20 80.834 cl00044 ChSh superfamily - - "Chromo Shadow Domain, found in association with N-terminal chromo (CHRomatin Organization MOdifier) domain; Chromo domains mediate the interaction of the heterochromatin with other heterochromatin proteins, thereby affecting chromatin structure (e.g. Drosophila and human heterochromatin protein (HP1) and mammalian modifier 1 and modifier 2)" Q#8847 - CGI_10028470 superfamily 248013 32 75 3.81E-13 61.1259 cl17459 CHROMO superfamily - - "Chromatin organization modifier (chromo) domain is a conserved region of around 50 amino acids found in a variety of chromosomal proteins, which appear to play a role in the functional organization of the eukaryotic nucleus. Experimental evidence implicates the chromo domain in the binding activity of these proteins to methylated histone tails and maybe RNA. May occur as single instance, in a tandem arrangement or followed by a related 'chromo shadow' domain." Q#8848 - CGI_10028471 superfamily 245208 90 461 0 633.769 cl09933 ACAD superfamily - - "Acyl-CoA dehydrogenase; Both mitochondrial acyl-CoA dehydrogenases (ACAD) and peroxisomal acyl-CoA oxidases (AXO) catalyze the alpha,beta dehydrogenation of the corresponding trans-enoyl-CoA by FAD, which becomes reduced. The reduced form of ACAD is reoxidized in the oxidative half-reaction by electron-transferring flavoprotein (ETF), from which the electrons are transferred to the mitochondrial respiratory chain coupled with ATP synthesis. In contrast, AXO catalyzes a different oxidative half-reaction, in which the reduced FAD is reoxidized by molecular oxygen. 
The ACAD family includes the eukaryotic beta-oxidation enzymes, short (SCAD), medium (MCAD), long (LCAD) and very-long (VLCAD) chain acyl-CoA dehydrogenases. These enzymes all share high sequence similarity, but differ in their substrate specificities. The ACAD family also includes amino acid catabolism enzymes such as Isovaleryl-CoA dehydrogenase (IVD), short/branched chain acyl-CoA dehydrogenases (SBCAD), Isobutyryl-CoA dehydrogenase (IBDH), glutaryl-CoA dehydrogenase (GCD) and Crotonobetainyl-CoA dehydrogenase. The mitochondrial ACADs are generally homotetramers, except for VLCAD, which is a homodimer. Related enzymes include the SOS adaptive response protein aidB, Naphthocyclinone hydroxylase (NcnH), and Dibenzothiophene (DBT) desulfurization enzyme C (DszC)" Q#8850 - CGI_10028473 superfamily 247792 99 146 0.000339004 37.04 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#8855 - CGI_10028478 superfamily 247792 247 293 3.61E-07 47.8256 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two 
variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#8855 - CGI_10028478 superfamily 243092 392 587 1.88E-09 58.1152 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#8856 - CGI_10028479 superfamily 242307 44 135 1.49E-20 81.1556 cl01110 Sdh5 superfamily - - "Flavinator of succinate dehydrogenase; This family of uncharacterized proteins. Based on personal observation it was previously annotated in Pfam as being a divergent TPR repeat but structural evidence has indicated this is not true.This family is now found to be a highly conserved mitochondrial protein, Sdh5. 
Both yeast and human Sdh5 interact with the catalytic subunit of the succinate dehydrogenase (SDH) complex, a component of both the electron transport chain and the tricarboxylic acid cycle. Sdh5 is required for SDH-dependent respiration and for Sdh1 flavination (incorporation of the flavin adenine dinucleotide cofactor). Mutational inactivation of Sdh5 confers tumor susceptibility in humans." Q#8857 - CGI_10028480 superfamily 243056 53 249 2.76E-08 51.9761 cl02495 RabGAP-TBC superfamily - - "Rab-GTPase-TBC domain; Identification of a TBC domain in GYP6_YEAST and GYP7_YEAST, which are GTPase activator proteins of yeast Ypt6 and Ypt7, implies that these domains are GTPase activator proteins of Rab-like small GTPases." Q#8858 - CGI_10028481 superfamily 247057 388 443 4.29E-06 44.1525 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#8858 - CGI_10028481 superfamily 221744 30 140 0.00212626 38.1859 cl18614 CABIT superfamily N - "Cell-cycle sustaining, positive selection,; The 'CABIT' domain (for 'cysteine-containing, all- in Themis') is found in a newly identified gene family that has three mammalian homologues (Themis, Icb1 and 9130404H23Rik) that encode proteins with two CABIT domains and a highly conserved proline-rich region. 
In contrast, Fam59A, Fam59B and related proteins from mammals to cnidarians, including the insect Serrano proteins, have a single copy of the CABIT domain, a proline-rich region and often a C-terminal SAM (sterile-motif) domain. Multiple-sequence alignment has predicted that the CABIT domain adopts an all-strand structure with at least 12 strands, ie a dyad of six-stranded beta-barrel units. The CABIT domain contains a nearly absolutely conserved cysteine residue which is likely to be central to its function. CABIT domain proteins function downstream of tyrosine kinase signalling and interact with GRB2." Q#8862 - CGI_10028485 superfamily 207668 68 107 9.82E-16 66.061 cl02609 TFIIS_C superfamily - - Transcription factor S-II (TFIIS); Transcription factor S-II (TFIIS). Q#8862 - CGI_10028485 superfamily 243156 4 53 3.77E-11 53.9281 cl02717 RNA_POL_M_15KD superfamily - - RNA polymerases M/15 Kd subunit; RNA polymerases M/15 Kd subunit. Q#8867 - CGI_10028490 superfamily 243072 51 158 4.04E-19 79.3498 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#8868 - CGI_10028491 superfamily 243035 41 81 5.70E-05 41.0662 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. 
Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. 
A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#8868 - CGI_10028491 superfamily 243091 119 231 1.85E-08 51.9515 cl02566 SET superfamily - - "SET domain; SET domains are protein lysine methyltransferase enzymes. SET domains appear to be protein-protein interaction domains. It has been demonstrated that SET domains mediate interactions with a family of proteins that display similarity with dual-specificity phosphatases (dsPTPases). A subset of SET domains have been called PR domains. These domains are divergent in sequence from other SET domains, but also appear to mediate protein-protein interaction. The SET domain consists of two regions known as SET-N and SET-C. SET-C forms an unusual and conserved knot-like structure of probably functional importance. Additionally to SET-N and SET-C, an insert region (SET-I) and flanking regions of high structural variability form part of the overall structure." Q#8870 - CGI_10028493 superfamily 239753 143 269 4.34E-80 239.981 cl15890 Sina superfamily - - "Seven in absentia (Sina) protein family, C-terminal substrate binding domain; composed of the Drosophila Sina protein, the mammalian Sina homolog (Siah), the plant protein SINAT5, and similar proteins. Sina, Siah and SINAT5 are RING-containing proteins that function as E3 ubiquitin ligases, acting either as single proteins or as a part of multiprotein complexes. Sina is expressed in many cells in the developing eye but is essential specifically for R7 photoreceptor cell development. 
Sina cooperates with Phyllopod (Phyl), Ebi and the E2 ubiquitin-conjugating enzyme Ubcd1 to catalyze the ubiquitination and subsequent degradation of Tramtrack (Ttk88); Ttk88 is a transcriptional repressor that blocks photoreceptor differentiation. Similarly, the mammalian homologue Siah1 cooperates with SIP (Siah-interacting protein), Ebi and the adaptor protein Skp1, to target beta-catenin for ubiquitination and degradation via a p53-dependent mechanism. SINAT5 targets NAC1 for ubiquitin-mediated degradation resulting in the downregulation of auxin, a hormone that controls many aspects of plant development. Other targets of Sina family proteins include c-Myb, synaptophysin, group 1 glutamate receptors, promyelocytic leukemia protein, alpha-synuclein, synphilin-1 and alpha-ketoglutarate dehydrogenase, among others. Sina proteins also bind proteins that are not targets for ubiquitination such as Phyl, adenomatous polyposis coli, VAV, BAG-1 and Dab-1. Siah binds to a consensus motif, PXAXVXP, which is present in Siah-binding proteins. Siah is a dimeric protein consisting of an N-terminal RING domain, two zinc finger motifs and a C-terminal substrate-binding domain (SBD); this SBD contains an eight-stranded antiparallel beta-sandwich fold similar to the MATH (meprin and TRAF-C homology) domain." 
Q#8870 - CGI_10028493 superfamily 247792 28 67 0.00721554 33.5732 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#8872 - CGI_10028495 superfamily 241547 71 239 1.77E-32 120.08 cl00012 alpha_CA superfamily - - "Carbonic anhydrase alpha (vertebrate-like) group. Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionary distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidine residues and a fourth conserved histidine plays a potential role in proton transfer." Q#8872 - CGI_10028495 superfamily 241547 44 96 1.85E-05 43.8417 cl00012 alpha_CA superfamily C - "Carbonic anhydrase alpha (vertebrate-like) group. 
Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionary distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidine residues and a fourth conserved histidine plays a potential role in proton transfer." Q#8873 - CGI_10028496 superfamily 241867 75 286 1.07E-17 81.0606 cl00446 Lactamase_B superfamily - - Metallo-beta-lactamase superfamily; Metallo-beta-lactamase superfamily. Q#8874 - CGI_10028497 superfamily 241609 31 117 2.02E-19 79.3443 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#8877 - CGI_10028500 superfamily 216574 74 214 4.60E-30 114.612 cl14794 FAD_binding_4 superfamily - - "FAD binding domain; This family consists of various enzymes that use FAD as a co-factor, most of the enzymes are similar to oxygen oxidoreductase. One of the enzymes Vanillyl-alcohol oxidase (VAO) has a solved structure, the alignment includes the FAD binding site, called the PP-loop, between residues 99-110. 
The FAD molecule is covalently bound in the known structure, however the residue that links to the FAD is not in the alignment. VAO catalyzes the oxidation of a wide variety of substrates, ranging from aromatic amines to 4-alkylphenols. Other members of this family include D-lactate dehydrogenase, this enzyme catalyzes the conversion of D-lactate to pyruvate using FAD as a co-factor; mitomycin radical oxidase, this enzyme oxidises the reduced form of mitomycins and is involved in mitomycin resistance. This family includes MurB an UDP-N-acetylenolpyruvoylglucosamine reductase enzyme EC:1.1.1.158. This enzyme is involved in the biosynthesis of peptidoglycan." Q#8877 - CGI_10028500 superfamily 219706 478 514 3.82E-06 44.4492 cl06869 BBE superfamily - - Berberine and berberine like; This domain is found in the berberine bridge and berberine bridge- like enzymes which are involved in the biosynthesis of numerous isoquinoline alkaloids. They catalyze the transformation of the N-methyl group of (S)-reticuline into the C-8 berberine bridge carbon of (S)-scoulerine. Q#8879 - CGI_10028502 superfamily 245201 99 338 1.96E-12 64.5653 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." 
Q#8882 - CGI_10028505 superfamily 243034 292 373 1.13E-07 51.2268 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#8883 - CGI_10028506 superfamily 241566 264 313 3.84E-14 67.5171 cl00040 C1 superfamily - - "Protein kinase C conserved region 1 (C1) . Cysteine-rich zinc binding domain. Some members of this domain family bind phorbol esters and diacylglycerol, some are reported to bind RasGTP. May occur in tandem arrangement. Diacylglycerol (DAG) is a second messenger, released by activation of Phospholipase D. Phorbol Esters (PE) can act as analogues of DAG and mimic its downstream effects in, for example, tumor promotion. Protein Kinases C are activated by DAG/PE, this activation is mediated by their N-terminal conserved region (C1). 
DAG/PE binding may be phospholipid dependent. C1 domains may also mediate DAG/PE signals in chimaerins (a family of Rac GTPase activating proteins), RasGRPs (exchange factors for Ras/Rap1), and Munc13 isoforms (scaffolding proteins involved in exocytosis)." Q#8883 - CGI_10028506 superfamily 241566 337 387 3.58E-10 55.9612 cl00040 C1 superfamily - - "Protein kinase C conserved region 1 (C1) . Cysteine-rich zinc binding domain. Some members of this domain family bind phorbol esters and diacylglycerol, some are reported to bind RasGTP. May occur in tandem arrangement. Diacylglycerol (DAG) is a second messenger, released by activation of Phospholipase D. Phorbol Esters (PE) can act as analogues of DAG and mimic its downstream effects in, for example, tumor promotion. Protein Kinases C are activated by DAG/PE, this activation is mediated by their N-terminal conserved region (C1). DAG/PE binding may be phospholipid dependent. C1 domains may also mediate DAG/PE signals in chimaerins (a family of Rac GTPase activating proteins), RasGRPs (exchange factors for Ras/Rap1), and Munc13 isoforms (scaffolding proteins involved in exocytosis)." Q#8883 - CGI_10028506 superfamily 247725 75 169 2.44E-33 121.683 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." 
Q#8883 - CGI_10028506 superfamily 248019 423 498 6.22E-28 107.768 cl17465 DAGK_cat superfamily C - "Diacylglycerol kinase catalytic domain; Diacylglycerol (DAG) is a second messenger that acts as a protein kinase C activator. The catalytic domain is assumed from the finding of bacterial homologues. YegS is the Escherichia coli protein in this family whose crystal structure reveals an active site in the inter-domain cleft formed by four conserved sequence motifs, revealing a novel metal-binding site. The residues of this site are conserved across the family." Q#8885 - CGI_10028508 superfamily 244875 18 309 5.11E-60 195.698 cl08255 Na_K-ATPase superfamily - - Sodium / potassium ATPase beta chain; Sodium / potassium ATPase beta chain. Q#8886 - CGI_10028509 superfamily 243092 142 448 1.21E-83 263.812 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#8886 - CGI_10028509 superfamily 199226 6 32 1.13E-07 48.4679 cl11662 LisH superfamily - - "LisH; The LisH (lis homology) domain mediates protein dimerisation and tetramerisation. The LisH domain is found in Sif2, a component of the Set3 complex which is responsible for repressing meiotic genes. It has been shown that the LisH domain helps mediate interaction with components of the Set3 complex." Q#8887 - CGI_10028510 superfamily 247725 16 135 5.57E-18 77.1987 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#8888 - CGI_10028511 superfamily 245602 353 746 4.81E-154 457.401 cl11402 GH31 superfamily - - "The enzymes of glycosyl hydrolase family 31 (GH31) occur in prokaryotes, eukaryotes, and archaea with a wide range of hydrolytic activities, including alpha-glucosidase (glucoamylase and sucrase-isomaltase), alpha-xylosidase, 6-alpha-glucosyltransferase, 3-alpha-isomaltosyltransferase and alpha-1,4-glucan lyase. All GH31 enzymes cleave a terminal carbohydrate moiety from a substrate that varies considerably in size, depending on the enzyme, and may be either a starch or a glycoprotein. In most cases, the pyranose moiety recognized in subsite -1 of the substrate binding site is an alpha-D-glucose, though some GH31 family members show a preference for alpha-D-xylose. Several GH31 enzymes can accommodate both glucose and xylose and different levels of discrimination between the two have been observed. 
Most characterized GH31 enzymes are alpha-glucosidases. In mammals, GH31 members with alpha-glucosidase activity are implicated in at least three distinct biological processes. The lysosomal acid alpha-glucosidase (GAA) is essential for glycogen degradation and a deficiency or malfunction of this enzyme causes glycogen storage disease II, also known as pompe disease. In the endoplasmic reticulum, alpha-glucosidase II catalyzes the second step in the N-linked oligosaccharide processing pathway that constitutes part of the quality control system for glycoprotein folding and maturation. The intestinal enzymes sucrase-isomaltase (SI) and maltase-glucoamylase (MGAM) play key roles in the final stage of carbohydrate digestion, making alpha-glucosidase inhibitors useful in the treatment of type 2 diabetes. GH31 alpha-glycosidases are retaining enzymes that cleave their substrates via an acid/base-catalyzed, double-displacement mechanism involving a covalent glycosyl-enzyme intermediate. Two aspartic acid residues have been identified as the catalytic nucleophile and the acid/base, respectively." Q#8888 - CGI_10028511 superfamily 241612 81 122 0.009986 35.0122 cl00103 Trefoil superfamily N - "P or trefoil or TFF domain; Trefoil factor family domain peptides are mucin-associated molecules, largely found in epithelia of gastrointestinal tissues. Function is not known but it was originally identified from mucosal tissues, where it may have a regulatory or structural role and has also been implicated as a growth factor in other tissues. The domain is found in 1 to 6 copies where it occurs." 
Q#8889 - CGI_10028512 superfamily 245602 369 726 1.54E-126 388.45 cl11402 GH31 superfamily - - "The enzymes of glycosyl hydrolase family 31 (GH31) occur in prokaryotes, eukaryotes, and archaea with a wide range of hydrolytic activities, including alpha-glucosidase (glucoamylase and sucrase-isomaltase), alpha-xylosidase, 6-alpha-glucosyltransferase, 3-alpha-isomaltosyltransferase and alpha-1,4-glucan lyase. All GH31 enzymes cleave a terminal carbohydrate moiety from a substrate that varies considerably in size, depending on the enzyme, and may be either a starch or a glycoprotein. In most cases, the pyranose moiety recognized in subsite -1 of the substrate binding site is an alpha-D-glucose, though some GH31 family members show a preference for alpha-D-xylose. Several GH31 enzymes can accommodate both glucose and xylose and different levels of discrimination between the two have been observed. Most characterized GH31 enzymes are alpha-glucosidases. In mammals, GH31 members with alpha-glucosidase activity are implicated in at least three distinct biological processes. The lysosomal acid alpha-glucosidase (GAA) is essential for glycogen degradation and a deficiency or malfunction of this enzyme causes glycogen storage disease II, also known as pompe disease. In the endoplasmic reticulum, alpha-glucosidase II catalyzes the second step in the N-linked oligosaccharide processing pathway that constitutes part of the quality control system for glycoprotein folding and maturation. The intestinal enzymes sucrase-isomaltase (SI) and maltase-glucoamylase (MGAM) play key roles in the final stage of carbohydrate digestion, making alpha-glucosidase inhibitors useful in the treatment of type 2 diabetes. GH31 alpha-glycosidases are retaining enzymes that cleave their substrates via an acid/base-catalyzed, double-displacement mechanism involving a covalent glycosyl-enzyme intermediate. 
Two aspartic acid residues have been identified as the catalytic nucleophile and the acid/base, respectively." Q#8890 - CGI_10028513 superfamily 245602 332 727 5.79E-179 526.737 cl11402 GH31 superfamily - - "The enzymes of glycosyl hydrolase family 31 (GH31) occur in prokaryotes, eukaryotes, and archaea with a wide range of hydrolytic activities, including alpha-glucosidase (glucoamylase and sucrase-isomaltase), alpha-xylosidase, 6-alpha-glucosyltransferase, 3-alpha-isomaltosyltransferase and alpha-1,4-glucan lyase. All GH31 enzymes cleave a terminal carbohydrate moiety from a substrate that varies considerably in size, depending on the enzyme, and may be either a starch or a glycoprotein. In most cases, the pyranose moiety recognized in subsite -1 of the substrate binding site is an alpha-D-glucose, though some GH31 family members show a preference for alpha-D-xylose. Several GH31 enzymes can accommodate both glucose and xylose and different levels of discrimination between the two have been observed. Most characterized GH31 enzymes are alpha-glucosidases. In mammals, GH31 members with alpha-glucosidase activity are implicated in at least three distinct biological processes. The lysosomal acid alpha-glucosidase (GAA) is essential for glycogen degradation and a deficiency or malfunction of this enzyme causes glycogen storage disease II, also known as pompe disease. In the endoplasmic reticulum, alpha-glucosidase II catalyzes the second step in the N-linked oligosaccharide processing pathway that constitutes part of the quality control system for glycoprotein folding and maturation. The intestinal enzymes sucrase-isomaltase (SI) and maltase-glucoamylase (MGAM) play key roles in the final stage of carbohydrate digestion, making alpha-glucosidase inhibitors useful in the treatment of type 2 diabetes. 
GH31 alpha-glycosidases are retaining enzymes that cleave their substrates via an acid/base-catalyzed, double-displacement mechanism involving a covalent glycosyl-enzyme intermediate. Two aspartic acid residues have been identified as the catalytic nucleophile and the acid/base, respectively." Q#8891 - CGI_10028514 superfamily 241583 130 282 1.84E-76 241.339 cl00064 ZnMc superfamily - - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#8891 - CGI_10028514 superfamily 243048 312 501 5.69E-48 166.718 cl02471 HX superfamily - - Hemopexin-like repeats.; Hemopexin is a heme-binding protein that transports heme to the liver. Hemopexin-like repeats occur in vitronectin and some matrix metalloproteinases family (matrixins). The HX repeats of some matrixins bind tissue inhibitor of metalloproteinases (TIMPs). This CD contains 4 instances of the repeat. Q#8891 - CGI_10028514 superfamily 216518 52 102 1.60E-11 60.2365 cl18368 PG_binding_1 superfamily - - Putative peptidoglycan binding domain; This domain is composed of three alpha helices. This domain is found at the N or C terminus of a variety of enzymes involved in bacterial cell wall degradation. This domain may have a general peptidoglycan binding function. This family is found N-terminal to the catalytic domain of matrixins. The domain is found to bind peptidoglycan experimentally. 
Q#8892 - CGI_10028515 superfamily 247098 32 158 7.80E-72 215.082 cl15841 COG0229 superfamily - - "Conserved domain frequently associated with peptide methionine sulfoxide reductase [Posttranslational modification, protein turnover, chaperones]" Q#8896 - CGI_10009337 superfamily 218493 48 195 1.77E-43 146.348 cl08434 GMC_oxred_C superfamily - - GMC oxidoreductase; This domain found associated with pfam00732. Q#8897 - CGI_10009338 superfamily 218493 441 588 5.63E-47 162.141 cl08434 GMC_oxred_C superfamily - - GMC oxidoreductase; This domain found associated with pfam00732. Q#8898 - CGI_10009339 superfamily 246722 4 115 4.48E-30 115.932 cl14812 PIN_SF superfamily N - "PIN (PilT N terminus) domain: Superfamily; PIN_SF The PIN (PilT N terminus) domain belongs to a large nuclease superfamily with representatives from eukaryota, eubacteria, and archaea. PIN domains were originally named for their sequence similarity to the N-terminal domain of an annotated pili biogenesis protein, PilT, a domain fusion between a PIN-domain and a PilT ATPase domain. The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues) is geometrically similar in the active center of structure-specific 5' nucleases (also known as Flap endonuclease-1-like), PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Seen here, are two major divisions in the PIN domain superfamily. The first major division, the structure-specific 5' nuclease family, is represented by FEN1, the 5'-3' exonuclease of DNA polymerase I, and T4 RNase H nuclease PIN domains. These 5' nucleases are involved in DNA replication, repair, and recombination. They are capable of both 5'-3' exonucleolytic activity and cleaving bifurcated DNA, in an endonucleolytic, structure-specific manner. 
Unique to FEN1-like nucleases, the PIN domain has a helical arch/clamp region (I domain) of variable length (approximately 16 to 800 residues) and, inserted within the C-terminal region of the PIN domain, a H3TH (helix-3-turn-helix) domain, an atypical helix-hairpin-helix-2-like region. Both the H3TH domain (not included here) and the helical arch/clamp region are involved in DNA binding. With the exception of Mkt1, these nucleases have a carboxylate rich active site that is involved in binding essential divalent metal ion cofactors (Mg2+, Mn2+, Zn2+, or Co2+). The second major division of the PIN domain superfamily, the VapC-Smg6 family, includes such eukaryotic ribonucleases as, Smg6, an essential factor in nonsense-mediated mRNA decay; Rrp44, the catalytic subunit of the exosome; and Nob1, a ribosome assembly factor critical in pre-rRNA processing. A large percentage of members in this family are bacterial ribonuclease toxins of TA operons such as Mycobacterium tuberculosis VapC and Neisseria gonorrhoeae FitB, as well as, archaeal homologs, Pyrobaculum aerophilum Pea0151 and P. aerophilum Pae2754. Also included are the eukaryotic Fcf1/ Utp24 (FAF1-copurifying factor 1/U three-associated protein 24) and Utp23-like proteins. Components of the small subunit processome, Fcf1/Utp24 and Utp23 are essential proteins involved in pre-rRNA processing and 40S ribosomal subunit assembly." Q#8898 - CGI_10009339 superfamily 246724 117 256 1.95E-20 85.8673 cl14815 H3TH_StructSpec-5'-nucleases superfamily - - "H3TH domains of structure-specific 5' nucleases (or flap endonuclease-1-like) involved in DNA replication, repair, and recombination; The 5' nucleases of this superfamily are capable of both 5'-3' exonucleolytic activity and cleaving bifurcated or branched DNA, in an endonucleolytic, structure-specific manner, and are involved in DNA replication, repair, and recombination. 
The superfamily includes the H3TH (helix-3-turn-helix) domains of Flap Endonuclease-1 (FEN1), Exonuclease-1 (EXO1), Mkt1, Gap Endonuclease 1 (GEN1) and Xeroderma pigmentosum complementation group G (XPG) nuclease. Also included are the H3TH domains of the 5'-3' exonucleases of DNA polymerase I and single domain protein homologs, as well as, the bacteriophage T4 RNase H, T5-5'nuclease, and other homologs. These nucleases contain a PIN (PilT N terminus) domain with a helical arch/clamp region/I domain (not included here) and inserted within the C-terminal region of the PIN domain is an atypical helix-hairpin-helix-2 (HhH2)-like region. This atypical HhH2 region, the H3TH domain, has an extended loop with at least three turns between the first two helices, and only three of the four helices appear to be conserved. Both the H3TH domain and the helical arch/clamp region are involved in DNA binding. Studies suggest that a glycine-rich loop in the H3TH domain contacts the phosphate backbone of the template strand in the downstream DNA duplex. Typically, the nucleases within this superfamily have a carboxylate rich active site that is involved in binding essential divalent metal ion cofactors (i. e., Mg2+, Mn2+, Zn2+, or Co2+) required for nuclease activity. The first metal binding site is composed entirely of Asp/Glu residues from the PIN domain, whereas, the second metal binding site is composed generally of two Asp residues from the PIN domain and one or two Asp residues from the H3TH domain. Together with the helical arch and network of amino acids interacting with metal binding ions, the H3TH region defines a positively charged active-site DNA-binding groove in structure-specific 5' nucleases." Q#8899 - CGI_10009340 superfamily 243609 1 136 7.10E-34 118.091 cl04000 Cornichon superfamily - - Cornichon protein; Cornichon protein. 
Q#8900 - CGI_10009341 superfamily 192997 310 399 5.65E-09 55.6655 cl18184 Sterol-sensing superfamily N - "Sterol-sensing domain of SREBP cleavage-activation; Sterol regulatory element-binding proteins (SREBPs) are membrane-bound transcription factors that promote lipid synthesis in animal cells. They are embedded in the membranes of the endoplasmic reticulum (ER) in a helical hairpin orientation and are released from the ER by a two-step proteolytic process. Proteolysis begins when the SREBPs are cleaved at Site-1, which is located at a leucine residue in the middle of the hydrophobic loop in the lumen of the ER. Upon proteolytic processing SREBP can activate the expression of genes involved in cholesterol biosynthesis and uptake. SCAP stimulates cleavage of SREBPs via fusion of their two C-termini. This domain is the transmembrane region that traverses the membrane eight times and is the sterol-sensing domain of the cleavage protein. WD40 domains are found towards the C-terminus." Q#8900 - CGI_10009341 superfamily 241888 719 860 4.60E-05 44.4817 cl00473 BI-1-like superfamily C - "BAX inhibitor (BI)-1/YccA-like protein family; Mammalian members of the BAX inhibitor (BI)-1 like family of small transmembrane proteins have been shown to have an antiapoptotic effect either by stimulating the antiapoptotic function of Bcl-2, a well-characterized oncogene, or by inhibiting the proapoptotic effect of Bax, another member of the Bcl-2 family. Their broad tissue distribution and high degree of conservation suggests an important regulatory role. This superfamily also contains the lifeguard(LFG)-like proteins and other subfamilies which appear to be related by common descent and also function as inhibitors of apoptosis. In plants, BI-1 like proteins play a role in pathogen resistance. 
A prokaryotic member, Escherichia coli YccA, has been shown to interact with ATP-dependent protease FtsH, which degrades abnormal membrane proteins as part of a quality control mechanism to keep the integrity of biological membranes." Q#8907 - CGI_10009348 superfamily 243074 7 53 9.37E-10 54.4349 cl02535 F-box-like superfamily - - F-box-like; This is an F-box-like family. Q#8908 - CGI_10009349 superfamily 115363 583 640 1.42E-08 52.3742 cl05972 MIB_HERC2 superfamily - - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). Q#8909 - CGI_10009350 superfamily 115363 5 64 1.53E-07 44.6702 cl05972 MIB_HERC2 superfamily - - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). Q#8910 - CGI_10009351 superfamily 115363 58 107 0.000199191 38.1218 cl05972 MIB_HERC2 superfamily - - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). Q#8911 - CGI_10009352 superfamily 115363 267 321 1.34E-05 42.7442 cl05972 MIB_HERC2 superfamily - - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). Q#8913 - CGI_10001195 superfamily 247057 256 311 1.50E-05 42.2265 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. 
They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#8916 - CGI_10001198 superfamily 245040 51 79 0.00927424 31.2445 cl09238 CY superfamily NC - "Cystatin-like domain; Cystatins are a family of cysteine protease inhibitors that occur mainly as single domain proteins. However some extracellular proteins such as kininogen, His-rich glycoprotein and fetuin also contain these domains." Q#8917 - CGI_10001316 superfamily 206088 28 49 0.00413798 33.4635 cl16476 zf-CCHC_3 superfamily C - "Zinc knuckle; The zinc knuckle is a zinc binding motif composed of the following CX2CX4HX4C where X can be any amino acid. The motifs are mostly from retroviral gag proteins (nucleocapsid). Prototype structure is from HIV. Also contains members involved in eukaryotic gene regulation, such as C. elegans GLH-1. Structure is an 18-residue zinc finger." Q#8923 - CGI_10008365 superfamily 244859 4 118 0.00545723 33.6741 cl08171 HtrL_YibB superfamily C - "Bacterial protein of unknown function (HtrL_YibB); The protein from this rare, uncharacterized protein family is designated HtrL or YibB in E. coli, where its gene is found in a region of LPS core biosynthesis genes. Homologues are found in Shigella flexneri, Campylobacter jejuni, and Caenorhabditis elegans only. The htrL gene may represent an insertion to the LPS core biosynthesis region, rather than an LPS biosynthetic protein." Q#8924 - CGI_10008366 superfamily 243072 430 555 2.14E-31 118.64 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). 
ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#8924 - CGI_10008366 superfamily 243072 4 125 7.97E-31 117.099 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#8924 - CGI_10008366 superfamily 243072 165 290 8.56E-31 116.714 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#8924 - CGI_10008366 superfamily 243072 368 489 2.22E-28 110.166 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#8924 - CGI_10008366 superfamily 243072 264 390 3.05E-28 109.781 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#8926 - CGI_10008368 superfamily 245206 22 94 5.94E-16 69.3981 cl09931 NADB_Rossmann superfamily N - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#8927 - CGI_10008369 superfamily 193256 596 643 2.18E-05 45.3236 cl18189 AAA_8 superfamily N - "P-loop containing dynein motor region D4; The 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. 
This particular family is the D4 ATP-binding region of the motor." Q#8927 - CGI_10008369 superfamily 110440 471 498 0.000256337 39.3133 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#8927 - CGI_10008369 superfamily 241563 60 96 0.00151979 37.0736 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#8929 - CGI_10004220 superfamily 241874 28 605 0 681.983 cl00456 SLC5-6-like_sbd superfamily - - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. 
Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#8933 - CGI_10001889 superfamily 217293 81 283 8.05E-30 115.037 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#8933 - CGI_10001889 superfamily 202474 290 375 1.13E-07 51.1153 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#8934 - CGI_10002066 superfamily 247858 1 114 1.25E-18 78.5838 cl17304 2OG-FeII_Oxy_3 superfamily N - 2OG-Fe(II) oxygenase superfamily; This family contains members of the 2-oxoglutarate (2OG) and Fe(II)-dependent oxygenase superfamily. Q#8935 - CGI_10002067 superfamily 219677 41 69 0.00947167 35.106 cl18521 EGF_2 superfamily - - EGF-like domain; This family contains EGF domains found in a variety of extracellular proteins. 
Q#8938 - CGI_10010354 superfamily 247905 186 325 3.67E-21 88.4488 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#8938 - CGI_10010354 superfamily 247805 5 156 9.29E-14 67.7476 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 
Q#8939 - CGI_10010355 superfamily 243092 117 299 3.16E-16 79.3012 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#8939 - CGI_10010355 superfamily 243092 224 492 4.06E-11 63.8932 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#8941 - CGI_10010357 superfamily 243072 19 112 6.49E-15 68.5642 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#8941 - CGI_10010357 superfamily 243073 206 246 1.87E-06 43.6129 cl02533 SOCS superfamily - - "SOCS (suppressors of cytokine signaling) box. 
The SOCS box is found in the C-terminal region of CIS/SOCS family proteins (in combination with a SH2 domain), ASBs (ankyrin repeat-containing proteins with a SOCS box), SSBs (SPRY domain-containing proteins with a SOCS box), and WSBs (WD40 repeat-containing proteins with a SOCS box), as well as, other miscellaneous proteins. The function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions." Q#8944 - CGI_10010360 superfamily 247725 46 180 7.41E-16 76.0273 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#8944 - CGI_10010360 superfamily 206020 369 419 1.22E-12 64.0662 cl18286 Y_phosphatase_m superfamily - - "Myotubularin Y_phosphatase-like; This short region is highly conserved and seems to be common to many myotubularin proteins with protein tyrosine pyrophosphate activity. As the family has a number of highly conserved residues such as histidine, cysteine, glutamine and aspartate, it is possible that this represents a catalytic core of the active enzymatic part of the proteins." 
Q#8944 - CGI_10010360 superfamily 221647 582 626 1.05E-08 53.2103 cl13953 3-PAP superfamily NC - "Myotubularin-associated protein; This domain family is found in eukaryotes, and is typically between 115 and 138 amino acids in length. Myotubularin is a dual-specific phosphatase that dephosphorylates phosphatidylinositol 3-phosphate and phosphatidylinositol (3,5)-bisphosphate. 3-PAP is a catalytically inactive member of the myotubularin gene family, which coprecipitates lipid phosphatidylinositol 3-phosphate-3-phosphatase activity from lysates of human platelets." Q#8944 - CGI_10010360 superfamily 219103 218 269 3.40E-06 45.8252 cl05893 Myotub-related superfamily C - "Myotubularin-related; This family represents a region within eukaryotic myotubularin-related proteins that is sometimes found with pfam02893. Myotubularin is a dual-specific lipid phosphatase that dephosphorylates phosphatidylinositol 3-phosphate and phosphatidylinositol (3,5)-bi-phosphate. Mutations in gene encoding myotubularin-related proteins have been associated with disease." Q#8950 - CGI_10010514 superfamily 241624 925 1167 8.37E-29 118.198 cl00120 PP2Cc superfamily - - "Serine/threonine phosphatases, family 2C, catalytic domain; The protein architecture and deduced catalytic mechanism of PP2C phosphatases are similar to the PP1, PP2A, PP2B family of protein Ser/Thr phosphatases, with which PP2C shares no sequence similarity." Q#8950 - CGI_10010514 superfamily 247725 182 363 6.84E-39 145.765 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. 
This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#8950 - CGI_10010514 superfamily 246925 680 898 1.62E-07 53.8986 cl15309 LRR_RI superfamily - - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#8950 - CGI_10010514 superfamily 246925 573 738 0.000181821 44.2686 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#8951 - CGI_10010516 superfamily 215647 32 254 1.19E-18 81.886 cl18338 7tm_2 superfamily - - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GPCRs). They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. 
Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#8952 - CGI_10010517 superfamily 241600 106 184 2.36E-29 110.023 cl00085 FReD superfamily C - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#8953 - CGI_10010518 superfamily 247908 118 305 7.81E-65 204.064 cl17354 NIF superfamily - - NLI interacting factor-like phosphatase; This family contains a number of NLI interacting factor isoforms and also an N-terminal regions of RNA polymerase II CTC phosphatase and FCP1 serine phosphatase. This region has been identified as the minimal phosphatase domain. Q#8954 - CGI_10010519 superfamily 248259 471 563 1.25E-39 142.389 cl17705 MBT superfamily - - "mbt repeat; The function of this repeat is unknown, but is found in a number of nuclear proteins such as drosophila sex comb on midleg protein. The repeat is found in up to four copies. 
The repeat contains a completely conserved glutamate at its amino terminus that may be important for function." Q#8954 - CGI_10010519 superfamily 248259 371 460 4.85E-27 106.565 cl17705 MBT superfamily - - "mbt repeat; The function of this repeat is unknown, but is found in a number of nuclear proteins such as drosophila sex comb on midleg protein. The repeat is found in up to four copies. The repeat contains a completely conserved glutamate at its amino terminus that may be important for function." Q#8954 - CGI_10010519 superfamily 248259 141 241 2.15E-26 105.024 cl17705 MBT superfamily - - "mbt repeat; The function of this repeat is unknown, but is found in a number of nuclear proteins such as drosophila sex comb on midleg protein. The repeat is found in up to four copies. The repeat contains a completely conserved glutamate at its amino terminus that may be important for function." Q#8954 - CGI_10010519 superfamily 247057 770 836 2.34E-26 103.989 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#8954 - CGI_10010519 superfamily 248259 252 347 5.79E-24 98.0906 cl17705 MBT superfamily - - "mbt repeat; The function of this repeat is unknown, but is found in a number of nuclear proteins such as drosophila sex comb on midleg protein. 
The repeat is found in up to four copies. The repeat contains a completely conserved glutamate at its amino terminus that may be important for function." Q#8955 - CGI_10010520 superfamily 114912 7 41 3.46E-11 57.0609 cl17946 zf-U1 superfamily - - "U1 zinc finger; This family consists of several U1 small nuclear ribonucleoprotein C (U1-C) proteins. The U1 small nuclear ribonucleoprotein (U1 snRNP) binds to the pre-mRNA 5' splice site (ss) at early stages of spliceosome assembly. Recruitment of U1 to a class of weak 5' ss is promoted by binding of the protein TIA-1 to uridine-rich sequences immediately downstream from the 5' ss. Binding of TIA-1 in the vicinity of a 5' ss helps to stabilise U1 snRNP recruitment, at least in part, via a direct interaction with U1-C, thus providing one molecular mechanism for the function of this splicing regulator. This domain is probably zinc-binding. It is found in multiple copies in some members of the family." Q#8955 - CGI_10010520 superfamily 241647 130 155 0.00108225 35.9666 cl00157 WW superfamily - - Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs. Q#8956 - CGI_10010521 superfamily 247775 302 551 2.28E-69 231.372 cl17221 ArsB_NhaD_permease superfamily N - "Anion permease ArsB/NhaD. These permeases have been shown to translocate sodium, arsenate, antimonite, sulfate and organic anions across biological membranes in all three kingdoms of life. 
A typical anion permease contains 8-13 transmembrane helices and can function either independently as a chemiosmotic transporter or as a channel-forming subunit of an ATP-driven anion pump." Q#8956 - CGI_10010521 superfamily 247775 91 266 2.62E-54 190.541 cl17221 ArsB_NhaD_permease superfamily C - "Anion permease ArsB/NhaD. These permeases have been shown to translocate sodium, arsenate, antimonite, sulfate and organic anions across biological membranes in all three kingdoms of life. A typical anion permease contains 8-13 transmembrane helices and can function either independently as a chemiosmotic transporter or as a channel-forming subunit of an ATP-driven anion pump." Q#8958 - CGI_10010523 superfamily 243072 63 173 7.66E-09 54.697 cl02529 ANK superfamily C - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#8958 - CGI_10010523 superfamily 149414 180 242 3.50E-29 111.98 cl07091 TRP_2 superfamily - - Transient receptor ion channel II; This domain is found in the transient receptor ion channel (Trp) family of proteins. There is strong evidence that Trp proteins are structural elements of calcium-ion entry channels activated by G protein-coupled receptors. This domain does not tend to appear with the TRP domain (pfam06011) but is often found to the C-terminus of Ankyrin repeats (pfam00023). Q#8962 - CGI_10010528 superfamily 241571 47 83 0.00392895 34.3103 cl00049 CUB superfamily NC - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." 
Q#8963 - CGI_10010529 superfamily 241785 85 483 0 630.198 cl00324 Ribosomal_L3 superfamily - - Ribosomal protein L3; Ribosomal protein L3. Q#8964 - CGI_10010530 superfamily 118773 39 164 3.40E-34 118.724 cl10933 NDUFB10 superfamily - - "NADH-ubiquinone oxidoreductase subunit 10; NDUFB10 is a family of conserved proteins of up to 180 residues. It is one of the 41 protein subunits within the hydrophobic fraction of the NADH:ubiquinone oxidoreductase (complex I), a multiprotein complex located in the inner mitochondrial membrane whose main function is the transport of electrons from NADH to ubiquinone, which is accompanied by translocation of protons from the mitochondrial matrix to the intermembrane space. NDUFB10 is encoded in the nucleus." Q#8965 - CGI_10010531 superfamily 243092 424 647 4.02E-47 171.749 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#8966 - CGI_10010532 superfamily 241782 26 438 2.50E-142 417.355 cl00321 AAT_I superfamily - - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phosphorylase family (fold type V)." Q#8969 - CGI_10010887 superfamily 241607 94 117 3.21E-05 38.0198 cl00097 KAZAL_FS superfamily C - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chymotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. 
These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters), which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#8970 - CGI_10010888 superfamily 241607 66 89 9.40E-07 41.8718 cl00097 KAZAL_FS superfamily C - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chymotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. 
Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters), which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#8970 - CGI_10010888 superfamily 241607 28 62 3.77E-06 40.331 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chymotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. 
The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#8971 - CGI_10010889 superfamily 241547 61 207 3.11E-49 163.594 cl00012 alpha_CA superfamily N - "Carbonic anhydrase alpha (vertebrate-like) group. Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionary distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidine residues and a fourth conserved histidine plays a potential role in proton transfer." Q#8972 - CGI_10010890 superfamily 205516 307 467 4.67E-79 245.392 cl18271 AcetylCoA_hyd_C superfamily - - "Acetyl-CoA hydrolase/transferase C-terminal domain; This family contains several enzymes which take part in pathways involving acetyl-CoA. Acetyl-CoA hydrolase EC:3.1.2.1 catalyzes the formation of acetate from acetyl-CoA, CoA transferase (CAT1) EC:2.8.3.- produces succinyl-CoA, and acetate-CoA transferase EC:2.8.3.8 utilises acyl-CoA and acetate to form acetyl-CoA." Q#8972 - CGI_10010890 superfamily 217098 44 222 3.13E-22 93.7549 cl15896 AcetylCoA_hydro superfamily - - "Acetyl-CoA hydrolase/transferase N-terminal domain; This family contains several enzymes which take part in pathways involving acetyl-CoA. 
Acetyl-CoA hydrolase EC:3.1.2.1 catalyzes the formation of acetate from acetyl-CoA, CoA transferase (CAT1) EC:2.8.3.- produces succinyl-CoA, and acetate-CoA transferase EC:2.8.3.8 utilises acyl-CoA and acetate to form acetyl-CoA." Q#8973 - CGI_10010891 superfamily 203238 1 131 4.06E-58 179.417 cl12307 GMP_PDE_delta superfamily - - "GMP-PDE, delta subunit; GMP-PDE delta subunit was originally identified as a fourth subunit of rod-specific cGMP phosphodiesterase (PDE)(EC:3.1.4.35). The precise function of PDE delta subunit in the rod specific GMP-PDE complex is unclear. In addition, PDE delta subunit is not confined to photoreceptor cells but is widely distributed in different tissues. PDE delta subunit is thought to be a specific soluble transport factor for certain prenylated proteins and Arl2-GTP a regulator of PDE-mediated transport." Q#8977 - CGI_10010895 superfamily 241596 121 171 3.84E-15 66.8539 cl00081 HLH superfamily - - "Helix-loop-helix domain, found in specific DNA- binding proteins that act as transcription factors; 60-100 amino acids long. 
A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE 5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved in both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#8978 - CGI_10010896 superfamily 241596 140 180 6.87E-11 55.2979 cl00081 HLH superfamily N - "Helix-loop-helix domain, found in specific DNA- binding proteins that act as transcription factors; 60-100 amino acids long. 
A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE 5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved in both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#8980 - CGI_10010898 superfamily 247755 58 285 1.70E-75 232.056 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#8981 - CGI_10010899 superfamily 247789 90 270 4.00E-20 85.3882 cl17235 ABC2_membrane superfamily - - ABC-2 type transporter; ABC-2 type transporter. 
Q#8982 - CGI_10010900 superfamily 247755 47 210 4.76E-56 180.825 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#8983 - CGI_10010901 superfamily 247789 16 229 1.46E-23 96.9441 cl17235 ABC2_membrane superfamily - - ABC-2 type transporter; ABC-2 type transporter. Q#8983 - CGI_10010901 superfamily 247755 377 424 1.11E-06 47.5458 cl17201 ABC_ATPase superfamily C - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#8984 - CGI_10010902 superfamily 218284 43 96 1.12E-15 70.3611 cl04786 SOUL superfamily C - SOUL heme-binding protein; This family represents a group of putative heme-binding proteins. Our family includes archaeal and bacterial homologues. Q#8984 - CGI_10010902 superfamily 247750 124 156 2.53E-06 44.9509 cl17196 E1_enzyme_family superfamily N - "Superfamily of activating enzymes (E1) of the ubiquitin-like proteins. 
This family includes classical ubiquitin-activating enzymes E1, ubiquitin-like (ubl) activating enzymes and other mechanistic homologues, like MoeB, Thif1 and others. The common reaction mechanism catalyzed by MoeB, ThiF and the E1 enzymes begins with a nucleophilic attack of the C-terminal carboxylate of MoaD, ThiS and ubiquitin, respectively, on the alpha-phosphate of an ATP molecule bound at the active site of the activating enzymes, leading to the formation of a high-energy acyladenylate intermediate and subsequently to the formation of a thiocarboxylate at the C termini of MoaD and ThiS." Q#8992 - CGI_10008844 superfamily 245819 426 601 2.51E-60 201.269 cl11967 Nucleotidyl_cyc_III superfamily - - "Class III nucleotidyl cyclases; Class III nucleotidyl cyclases are the largest, most diverse group of nucleotidyl cyclases (NC's) containing prokaryotic and eukaryotic proteins. They can be divided into two major groups; the mononucleotidyl cyclases (MNC's) and the diguanylate cyclases (DGC's). The MNC's, which include the adenylate cyclases (AC's) and the guanylate cyclases (GC's), have a conserved cyclase homology domain (CHD), while the DGC's have a conserved GGDEF domain, named after a conserved motif within this subgroup. Their products, cyclic guanylyl and adenylyl nucleotides, are second messengers that play important roles in eukaryotic signal transduction and prokaryotic sensory pathways." Q#8992 - CGI_10008844 superfamily 219526 208 411 3.30E-61 204.775 cl06648 HNOBA superfamily - - "Heme NO binding associated; The HNOBA domain is found associated with the HNOB domain and pfam00211 in soluble cyclases and signalling proteins. The HNOB domain is predicted to function as a heme-dependent sensor for gaseous ligands, and transduce diverse downstream signals, in both bacteria and animals." 
Q#8992 - CGI_10008844 superfamily 203730 1 158 5.54E-45 158.202 cl18246 HNOB superfamily - - "Heme NO binding; The HNOB (Heme NO Binding) domain, is a predominantly alpha-helical domain and binds heme via a covalent linkage to histidine. The HNOB domain is predicted to function as a heme-dependent sensor for gaseous ligands, and transduce diverse downstream signals, in both bacteria and animals." Q#8993 - CGI_10008845 superfamily 241600 136 354 4.73E-102 303.778 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#8997 - CGI_10008849 superfamily 243179 112 220 9.43E-11 56.3587 cl02781 tetraspanin_LEL superfamily - - "Tetraspanin, extracellular domain or large extracellular loop (LEL). Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. The tetraspanin family contains CD9, CD63, CD37, CD53, CD82, CD151, and CD81, amongst others. Tetraspanins are involved in diverse processes such as cell activation and proliferation, adhesion and motility, differentiation, cancer, and others. 
Their various functions may relate to their ability to act as molecular facilitators, grouping specific cell-surface proteins and affecting formation and stability of signaling complexes. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web", which may also include integrins." Q#8998 - CGI_10008850 superfamily 247856 325 384 2.17E-05 41.7645 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#8998 - CGI_10008850 superfamily 247856 174 227 6.84E-05 40.6089 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#9000 - CGI_10008852 superfamily 243034 125 219 1.07E-06 45.4488 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#9001 - CGI_10008853 superfamily 247727 171 306 3.66E-13 66.9365 cl17173 AdoMet_MTases superfamily C - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) 
and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#9002 - CGI_10008854 superfamily 221135 7 149 9.45E-32 115.932 cl13079 PI31_Prot_N superfamily - - PI31 proteasome regulator N-terminal; PI31 is a regulatory subunit of the immuno-proteasome which is an inhibitor of the 20 S proteasome in vitro. PI31 is also an F-box protein Fbxo7.Skp1 binding partner which requires an N terminal FP domain in both proteins for the interaction to occur via the FP beta sheets. The structure of PI31 FP domain contains a novel alpha/beta-fold and two intermolecular contact surfaces. This is the N-terminal domain of the members. Q#9002 - CGI_10008854 superfamily 219914 198 250 1.42E-05 41.5874 cl07260 PI31_Prot_C superfamily - - PI31 proteasome regulator; PI31 is a cellular regulator of proteasome formation and of proteasome-mediated antigen processing. Q#9004 - CGI_10008856 superfamily 243072 452 531 5.16E-14 68.9494 cl02529 ANK superfamily N - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#9004 - CGI_10008856 superfamily 242183 134 421 2.65E-144 420.114 cl00907 Glutaminase superfamily - - Glutaminase; This family of enzymes deaminates glutamine to glutamate EC:3.5.1.2. Q#9006 - CGI_10008858 superfamily 248458 167 328 1.26E-06 48.4641 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. 
MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#9006 - CGI_10008858 superfamily 248458 24 108 0.000247003 41.1453 cl17904 MFS superfamily NC - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. 
Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#9007 - CGI_10008860 superfamily 218267 25 113 8.01E-24 100.972 cl04754 LMBR1 superfamily C - "LMBR1-like membrane protein; Members of this family are integral membrane proteins that are around 500 residues in length. LMBR1 is not involved in preaxial polydactyly, as originally thought. Vertebrate members of this family may play a role in limb development. A member of this family has been shown to be a lipocalin membrane receptor" Q#9009 - CGI_10008862 superfamily 241563 68 109 8.16E-07 46.7036 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#9009 - CGI_10008862 superfamily 241563 28 59 0.00158551 37.0736 cl00034 BBOX superfamily N - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#9010 - CGI_10002688 superfamily 247916 501 568 0.000209174 40.4618 cl17362 Transglut_core superfamily C - "Transglutaminase-like superfamily; This family includes animal transglutaminases and other bacterial proteins of unknown function. Sequence conservation in this superfamily primarily involves three motifs that centre around conserved cysteine, histidine, and aspartate residues that form the catalytic triad in the structurally characterized transglutaminase, the human blood clotting factor XIIIa'. On the basis of the experimentally demonstrated activity of the Methanobacterium phage pseudomurein endoisopeptidase, it is proposed that many, if not all, microbial homologues of the transglutaminases are proteases and that the eukaryotic transglutaminases have evolved from an ancestral protease." Q#9013 - CGI_10021342 superfamily 222429 7 85 3.61E-16 68.8064 cl18676 Myb_DNA-bind_5 superfamily - - Myb/SANT-like DNA-binding domain; This presumed domain appears to be related to other Myb/SANT like DNA binding domains. This family is greatly expanded in arthropods and higher eukaryotes. Q#9015 - CGI_10021344 superfamily 151671 10 475 1.20E-43 162.131 cl12777 DUF3028 superfamily - - Protein of unknown function (DUF3028); This eukaryotic family of proteins has no known function. Q#9016 - CGI_10021345 superfamily 247724 14 94 6.31E-53 165.756 cl17170 Ras_like_GTPase superfamily N - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. 
The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#9017 - CGI_10021346 superfamily 243050 1 38 0.000114632 37.0139 cl02475 LIM superfamily N - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." 
Q#9018 - CGI_10021347 superfamily 245201 501 714 2.71E-47 167.414 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#9019 - CGI_10021348 superfamily 238191 61 527 8.82E-102 319.278 cl18907 Esterase_lipase superfamily - - "Esterases and lipases (includes fungal lipases, cholinesterases, etc.) These enzymes act on carboxylic esters (EC: 3.1.1.-). The catalytic apparatus involves three residues (catalytic triad): a serine, a glutamate or aspartate and a histidine. These catalytic residues are responsible for the nucleophilic attack on the carbonyl carbon atom of the ester bond. In contrast with other alpha/beta hydrolase fold family members, p-nitrobenzyl esterase and acetylcholine esterase have a Glu instead of Asp at the active site carboxylate." Q#9019 - CGI_10021348 superfamily 238191 2 30 0.000817394 40.7784 cl18907 Esterase_lipase superfamily C - "Esterases and lipases (includes fungal lipases, cholinesterases, etc.) These enzymes act on carboxylic esters (EC: 3.1.1.-). The catalytic apparatus involves three residues (catalytic triad): a serine, a glutamate or aspartate and a histidine. These catalytic residues are responsible for the nucleophilic attack on the carbonyl carbon atom of the ester bond. In contrast with other alpha/beta hydrolase fold family members, p-nitrobenzyl esterase and acetylcholine esterase have a Glu instead of Asp at the active site carboxylate." 
Q#9020 - CGI_10021350 superfamily 241838 73 115 1.93E-18 84.4222 cl00395 FMT_core superfamily C - "Formyltransferase, catalytic core domain; Formyltransferase, catalytic core domain. The proteins of this superfamily contain a formyltransferase domain that hydrolyzes the removal of a formyl group from its substrate as part of a multistep transfer mechanism, and this alignment model represents the catalytic core of the formyltransferase domain. This family includes the following known members; Glycinamide Ribonucleotide Transformylase (GART), Formyl-FH4 Hydrolase, Methionyl-tRNA Formyltransferase, ArnA, and 10-Formyltetrahydrofolate Dehydrogenase (FDH). Glycinamide Ribonucleotide Transformylase (GART) catalyzes the third step in de novo purine biosynthesis, the transfer of a formyl group to 5'-phosphoribosylglycinamide. Formyl-FH4 Hydrolase catalyzes the hydrolysis of 10-formyltetrahydrofolate (formyl-FH4) to FH4 and formate. Methionyl-tRNA Formyltransferase transfers a formyl group onto the amino terminus of the acyl moiety of the methionyl aminoacyl-tRNA, which plays important role in translation initiation. ArnA is required for the modification of lipid A with 4-amino-4-deoxy-l-arabinose (Ara4N) that leads to resistance to cationic antimicrobial peptides (CAMPs) and clinical antimicrobials such as polymyxin. 10-formyltetrahydrofolate dehydrogenase (FDH) catalyzes the conversion of 10-formyltetrahydrofolate, a precursor for nucleotide biosynthesis, to tetrahydrofolate. Members of this family are multidomain proteins. The formyltransferase domain is located at the N-terminus of FDH, Methionyl-tRNA Formyltransferase and ArnA, and at the C-terminus of Formyl-FH4 Hydrolase. Prokaryotic Glycinamide Ribonucleotide Transformylase (GART) is a single domain protein while eukaryotic GART is a trifunctional protein that catalyzes the second, third and fifth steps in de novo purine biosynthesis." 
Q#9020 - CGI_10021350 superfamily 245201 620 691 6.68E-15 73.8101 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#9020 - CGI_10021350 superfamily 247724 293 413 1.14E-07 51.0082 cl17170 Ras_like_GTPase superfamily N - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." 
Q#9020 - CGI_10021350 superfamily 247724 187 225 2.36E-07 50.2378 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#9021 - CGI_10021351 superfamily 238191 29 204 1.39E-57 191.777 cl18907 Esterase_lipase superfamily C - "Esterases and lipases (includes fungal lipases, cholinesterases, etc.) These enzymes act on carboxylic esters (EC: 3.1.1.-). The catalytic apparatus involves three residues (catalytic triad): a serine, a glutamate or aspartate and a histidine.These catalytic residues are responsible for the nucleophilic attack on the carbonyl carbon atom of the ester bond. In contrast with other alpha/beta hydrolase fold family members, p-nitrobenzyl esterase and acetylcholine esterase have a Glu instead of Asp at the active site carboxylate." 
Q#9022 - CGI_10021352 superfamily 241971 22 269 1.06E-101 300.263 cl00599 Extradiol_Dioxygenase_3B_like superfamily - - "Subunit B of Class III Extradiol ring-cleavage dioxygenases; Dioxygenases catalyze the incorporation of both atoms of molecular oxygen into substrates using a variety of reaction mechanisms, resulting in the cleavage of aromatic rings. Two major groups of dioxygenases have been identified according to the cleavage site of the aromatic ring. Intradiol enzymes cleave the aromatic ring between two hydroxyl groups, whereas extradiol enzymes cleave the aromatic ring between a hydroxylated carbon and an adjacent non-hydroxylated carbon. Extradiol dioxygenases can be further divided into three classes. Class I and II enzymes are evolutionary related and show sequence similarity, with the two-domain class II enzymes evolving from the class I enzyme through gene duplication. Class III enzymes are different in sequence and structure and usually have two subunits, designated A and B. This model represents the catalytic subunit B of extradiol dioxygenase class III enzymes. Enzymes belonging to this family include Protocatechuate 4,5-dioxygenase (LigAB), 2'-aminobiphenyl-2,3-diol 1,2-dioxygenase (CarB), 4,5-DOPA Dioxygenase, 2,3-dihydroxyphenylpropionate 1,2-dioxygenase, and 3,4-dihydroxyphenylacetate (homoprotocatechuate) 2,3-dioxygenase (HPCD). There are also some family members that do not show the typical dioxygenase activity." Q#9024 - CGI_10021354 superfamily 245847 22 143 2.44E-07 48.2676 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." 
Q#9025 - CGI_10021355 superfamily 241574 189 389 3.74E-67 221.306 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#9025 - CGI_10021355 superfamily 241574 542 629 1.20E-11 63.7589 cl00053 PTPc superfamily N - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#9027 - CGI_10021357 superfamily 243064 17 95 1.18E-07 47.3534 cl02512 NTR_like superfamily C - "NTR_like domain; a beta barrel with an oligosaccharide/oligonucleotide-binding fold found in netrins, complement proteins, tissue inhibitors of metalloproteases (TIMP), and procollagen C-proteinase enhancers (PCOLCE), amongst others. In netrins, the domain plays a role in controlling axon branching in neural development, while the common function of these modules in TIMPs appears to be binding to metzincins. 
A subset of this family is also known as the C345C domain because it occurs as a C-terminal domain in complement C3, C4 and C5. In C5, the domain interacts with various partners during the formation of the membrane attack complex." Q#9028 - CGI_10021358 superfamily 241563 61 102 2.49E-05 42.0812 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#9029 - CGI_10021359 superfamily 248458 9 104 1.29E-07 50.3901 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 
Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#9030 - CGI_10021360 superfamily 242173 188 338 3.58E-32 118.902 cl00891 Cu-Zn_Superoxide_Dismutase superfamily - - "Copper/zinc superoxide dismutase (SOD). Superoxide dismutases catalyse the conversion of superoxide radicals to molecular oxygen. Three evolutionarily distinct families of SODs are known, of which the copper/zinc-binding family is one. Defects in the human SOD1 gene cause familial amyotrophic lateral sclerosis (Lou Gehrig's disease). Cytoplasmic and periplasmic SODs exist as dimers, whereas chloroplastic and extracellular enzymes exist as tetramers. Structure supports independent functional evolution in prokaryotes (P-class) and eukaryotes (E-class) [PMID: 8176730]." Q#9031 - CGI_10021361 superfamily 241782 136 497 3.55E-149 457.849 cl00321 AAT_I superfamily - - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. 
The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phophorylase family (fold type V)." Q#9031 - CGI_10021361 superfamily 241782 535 896 6.19E-179 539.574 cl00321 AAT_I superfamily C - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. 
Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phophorylase family (fold type V)." Q#9031 - CGI_10021361 superfamily 241782 898 1190 1.17E-136 428.252 cl00321 AAT_I superfamily N - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phophorylase family (fold type V)." 
Q#9032 - CGI_10021362 superfamily 242173 33 184 5.09E-37 128.918 cl00891 Cu-Zn_Superoxide_Dismutase superfamily - - "Copper/zinc superoxide dismutase (SOD). Superoxide dismutases catalyse the conversion of superoxide radicals to molecular oxygen. Three evolutionarily distinct families of SODs are known, of which the copper/zinc-binding family is one. Defects in the human SOD1 gene cause familial amyotrophic lateral sclerosis (Lou Gehrig's disease). Cytoplasmic and periplasmic SODs exist as dimers, whereas chloroplastic and extracellular enzymes exist as tetramers. Structure supports independent functional evolution in prokaryotes (P-class) and eukaryotes (E-class) [PMID: 8176730]." Q#9033 - CGI_10021363 superfamily 247675 8 388 0 793.138 cl17011 Arginase_HDAC superfamily - - "Arginase-like and histone-like hydrolases; Arginase-like/histone-like hydrolase superfamily includes metal-dependent enzymes that belong to Arginase-like amidino hydrolase family and histone/histone-like deacetylase class I, II, IV family, respectively. These enzymes catalyze hydrolysis of amide bond. Arginases are known to be involved in control of cellular levels of arginine and ornithine, in histidine and arginine degradation and in clavulanic acid biosynthesis. Deacetylases play a role in signal transduction through histone and/or other protein modification and can repress/activate transcription of a number of different genes. They participate in different cellular processes including cell cycle regulation, DNA damage response, embryonic development, cytokine signaling important for immune response and post-translational control of the acetyl coenzyme A synthetase. Mammalian histone deacetylases are known to be involved in progression of different tumors. Specific inhibitors of mammalian histone deacetylases are an emerging class of promising novel anticancer drugs." 
Q#9036 - CGI_10021366 superfamily 241644 79 213 5.67E-64 197.425 cl00154 UBCc superfamily - - "Ubiquitin-conjugating enzyme E2, catalytic (UBCc) domain. This is part of the ubiquitin-mediated protein degradation pathway in which a thiol-ester linkage forms between a conserved cysteine and the C-terminus of ubiquitin and complexes with ubiquitin protein ligase enzymes, E3. This pathway regulates many fundamental cellular processes. There are also other E2s which form thiol-ester linkages without the use of E3s as well as several UBC homologs (TSG101, Mms2, Croc-1 and similar proteins) which lack the active site cysteine essential for ubiquitination and appear to function in DNA repair pathways which were omitted from the scope of this CD." Q#9038 - CGI_10023035 superfamily 243035 2 54 2.59E-09 48.385 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. 
CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#9039 - CGI_10023037 superfamily 247046 93 245 4.09E-09 53.9781 cl15705 DUF563 superfamily N - Protein of unknown function (DUF563); Family of uncharacterized proteins. Q#9043 - CGI_10023041 superfamily 243051 599 757 1.12E-22 95.9077 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. 
Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal cord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#9043 - CGI_10023041 superfamily 241571 331 446 2.65E-12 64.3558 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#9043 - CGI_10023041 superfamily 241583 103 284 3.59E-33 126.532 cl00064 ZnMc superfamily - - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#9053 - CGI_10023051 superfamily 247915 16 55 0.00493939 31.9745 cl17361 Glucosaminidase superfamily NC - "Mannosyl-glycoprotein endo-beta-N-acetylglucosaminidase; This family includes Mannosyl-glycoprotein endo-beta-N-acetylglucosaminidase EC:3.2.1.96, as well as the flagellar protein J, which has been shown to hydrolyse peptidoglycan." Q#9056 - CGI_10023054 superfamily 241554 223 355 1.93E-22 93.8643 cl00019 Macro superfamily - - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). 
Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#9056 - CGI_10023054 superfamily 241554 321 395 1.20E-06 47.9213 cl00019 Macro superfamily N - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#9058 - CGI_10023056 superfamily 203013 436 461 0.00444759 34.909 cl04519 zf-HIT superfamily - - HIT zinc finger; This presumed zinc finger contains up to 6 cysteine residues that could coordinate zinc. The domain is named after the HIT protein. This domain is also found in the Thyroid receptor interacting protein 3 (TRIP-3) that specifically interacts with the ligand binding domain of the thyroid receptor. Q#9063 - CGI_10023062 superfamily 222090 148 293 2.05E-15 73.0758 cl18636 Methyltransf_22 superfamily N - Methyltransferase domain; This family appears to be a methyltransferase domain. 
Q#9064 - CGI_10023063 superfamily 247866 131 374 4.51E-48 164.548 cl17312 PhyH superfamily - - "Phytanoyl-CoA dioxygenase (PhyH); This family is made up of several eukaryotic phytanoyl-CoA dioxygenase (PhyH) proteins, ectoine hydroxylases and a number of bacterial deoxygenases. PhyH is a peroxisomal enzyme catalyzing the first step of phytanic acid alpha-oxidation. PhyH deficiency causes Refsum's disease (RD) which is an inherited neurological syndrome biochemically characterized by the accumulation of phytanic acid in plasma and tissues." Q#9064 - CGI_10023063 superfamily 247866 11 114 4.29E-07 48.988 cl17312 PhyH superfamily C - "Phytanoyl-CoA dioxygenase (PhyH); This family is made up of several eukaryotic phytanoyl-CoA dioxygenase (PhyH) proteins, ectoine hydroxylases and a number of bacterial deoxygenases. PhyH is a peroxisomal enzyme catalyzing the first step of phytanic acid alpha-oxidation. PhyH deficiency causes Refsum's disease (RD) which is an inherited neurological syndrome biochemically characterized by the accumulation of phytanic acid in plasma and tissues." Q#9065 - CGI_10023064 superfamily 216101 18 599 0 675.548 cl08288 Carn_acyltransf superfamily - - Choline/Carnitine o-acyltransferase; Choline/Carnitine o-acyltransferase. Q#9067 - CGI_10023066 superfamily 243094 509 828 1.15E-171 517.522 cl02569 RasGAP superfamily - - "Ras GTPase Activating Domain; RasGAP functions as an enhancer of the hydrolysis of GTP that is bound to Ras-GTPases. Proteins having a RasGAP domain include p120GAP, IQGAP, Rab5-activating protein 6, and Neurofibromin, among others. Although the Rho (Ras homolog) GTPases are most closely related to members of the Ras family, RhoGAP and RasGAP exhibit no similarity at their amino acid sequence level. RasGTPases function as molecular switches in a large number of signaling pathways. They are in the on state when bound to GTP, and in the off state when bound to GDP. 
The RasGAP domain speeds up the hydrolysis of GTP in Ras-like proteins acting as a negative regulator." Q#9067 - CGI_10023066 superfamily 246669 374 517 1.04E-63 214.478 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#9067 - CGI_10023066 superfamily 247725 315 422 7.77E-35 133.666 cl17171 PH-like superfamily N - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting proteins to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and other proteins." Q#9070 - CGI_10023069 superfamily 218940 104 568 0 609.496 cl05625 COBRA1 superfamily - - "Cofactor of BRCA1 (COBRA1); This family consists of several cofactor of BRCA1 (COBRA1) like proteins. 
It is thought that COBRA1 along with BRCA1 is involved in chromatin unfolding. COBRA1 is recruited to the chromosome site by the first BRCT repeat of BRCA1, and is itself sufficient to induce chromatin unfolding. BRCA1 mutations that enhance chromatin unfolding also increase its affinity for, and recruitment of, COBRA1. It is thought that reorganisation of higher levels of chromatin structure is an important regulated step in BRCA1-mediated nuclear functions." Q#9071 - CGI_10023071 superfamily 247057 141 191 1.58E-06 46.8489 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#9071 - CGI_10023071 superfamily 246908 253 373 2.45E-37 137.851 cl15255 SH2 superfamily - - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 
They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." Q#9071 - CGI_10023071 superfamily 243053 748 820 1.24E-13 70.7444 cl02485 RasGEF superfamily C - "Guanine nucleotide exchange factor for Ras-like small GTPases. Small GTP-binding proteins of the Ras superfamily function as molecular switches in fundamental events such as signal transduction, cytoskeleton dynamics and intracellular trafficking. Guanine-nucleotide-exchange factors (GEFs) positively regulate these GTP-binding proteins in response to a variety of signals. GEFs catalyze the dissociation of GDP from the inactive GTP-binding proteins. GTP can then bind and induce structural changes that allow interaction with effectors." Q#9072 - CGI_10023072 superfamily 243069 41 169 1.09E-48 165.807 cl02525 Band_7 superfamily - - "The band 7 domain of flotillin (reggie) like proteins. This group contains proteins similar to stomatin, prohibitin, flotillin, HflK/C and podocin. Many of these band 7 domain-containing proteins are lipid raft-associated. Individual proteins of this band 7 domain family may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. Microdomains formed from flotillin proteins may in addition be dynamic units with their own regulatory functions. Flotillins have been implicated in signal transduction, vesicle trafficking, cytoskeleton rearrangement and are known to interact with a variety of proteins. 
Stomatin interacts with and regulates members of the degenerin/epithelial Na+ channel family in mechanosensory cells of Caenorhabditis elegans and vertebrate neurons and participates in trafficking of Glut1 glucose transporters. Prohibitin may act as a chaperone for the stabilization of mitochondrial proteins. Prokaryotic HflK/C plays a role in the decision between lysogenic and lytic cycle growth during lambda phage infection. Flotillins have been implicated in the progression of prion disease, in the pathogenesis of neurodegenerative diseases such as Parkinson's and Alzheimer's disease, and in cancer invasion and metastasis. Mutations in the podocin gene give rise to autosomal recessive steroid resistant nephrotic syndrome." Q#9073 - CGI_10023074 superfamily 247639 74 251 1.64E-07 50.0607 cl16914 O-FucT_like superfamily N - "GDP-fucose protein O-fucosyltransferase and related proteins; O-fucosyltransferase-like proteins are GDP-fucose dependent enzymes with similarities to the family 1 glycosyltransferases (GT1). They are soluble ER proteins that may be proteolytically cleaved from a membrane-associated preprotein, and are involved in the O-fucosylation of protein substrates, the core fucosylation of growth factor receptors, and other processes." Q#9075 - CGI_10009660 superfamily 247986 456 543 1.06E-08 55.0718 cl17432 PBPb superfamily C - "Bacterial periplasmic transport systems use membrane-bound complexes and substrate-bound, membrane-associated, periplasmic binding proteins (PBPs) to transport a wide variety of substrates, such as, amino acids, peptides, sugars, vitamins and inorganic ions. PBPs have two cell-membrane translocation functions: bind substrate, and interact with the membrane bound complex. A diverse group of periplasmic transport receptors for lysine/arginine/ornithine (LAO), glutamine, histidine, sulfate, phosphate, molybdate, and methanol are included in the PBPb CD." 
Q#9075 - CGI_10009660 superfamily 245225 8 385 1.30E-126 391.05 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily - - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein-coupled receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine-binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." 
Q#9075 - CGI_10009660 superfamily 197504 663 792 1.58E-38 141.273 cl18192 PBPe superfamily - - Eukaryotic homologues of bacterial periplasmic substrate binding proteins; Prokaryotic homologues are represented by a separate alignment: PBPb Q#9075 - CGI_10009660 superfamily 119082 839 865 2.83E-09 54.4926 cl11191 CaM_bdg_C0 superfamily - - "Calmodulin-binding domain C0 of NMDA receptor NR1 subunit; This is a very short highly conserved domain that is C-terminal to the cytosolic transmembrane region IV of the NMDA-receptor 1. It has been shown to bind Calmodulin-Calcium with high affinity. The ionotropic N-methyl-D-aspartate receptor (NMDAR) is a major source of calcium flux into neurons in the brain and plays a critical role in learning, memory, neural development, and synaptic plasticity. Calmodulin (CaM) regulates NMDARs by binding tightly to the C0 and C1 regions of their NR1 subunit. The conserved tryptophan is considered to be the anchor residue." Q#9076 - CGI_10009661 superfamily 247723 332 422 4.24E-46 156.999 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." 
Q#9076 - CGI_10009661 superfamily 247723 200 290 5.29E-46 156.54 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#9077 - CGI_10009662 superfamily 116164 71 97 9.76E-07 44.102 cl06534 Vg_Tdu superfamily C - "Vestigial/Tondu family; The mammalian TEF and the Drosophila scalloped genes belong to a conserved family of transcriptional factors that possesses a TEA/ATTS DNA-binding domain. Transcriptional activation by these proteins likely requires interactions with specific coactivators. In Drosophila, Scalloped (Sd) interacts with Vestigial (Vg) to form a complex, which binds DNA through the Sd TEA/ATTS domain. The Sd-Vg heterodimer is a key regulator of wing development, which directly controls several target genes and is able to induce wing outgrowth when ectopically expressed. This short conserved region is needed for interaction with Sd." 
Q#9086 - CGI_10003144 superfamily 243035 20 96 1.28E-11 57.6297 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#9087 - CGI_10003145 superfamily 243092 359 554 1.86E-14 74.6788 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#9087 - CGI_10003145 superfamily 243092 688 774 0.000817055 41.9368 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#9087 - CGI_10003145 superfamily 243092 101 170 0.00954782 38.47 cl02567 WD40 superfamily NC - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#9091 - CGI_10012656 superfamily 245596 530 794 6.93E-76 252.614 cl11394 Glyco_tranf_GTA_type superfamily - - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. 
This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#9091 - CGI_10012656 superfamily 245596 440 496 4.15E-09 57.3176 cl11394 Glyco_tranf_GTA_type superfamily C - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#9091 - CGI_10012656 superfamily 245166 881 945 0.00611376 37.955 cl09823 Trep_Strep superfamily C - "Hypothetical bacterial integral membrane protein (Trep_Strep); This family consists of strongly hydrophobic proteins about 190 amino acids in length with a strongly basic motif near the C-terminus. 
It is found in rather few species, but in paralogous families of 12 members in the oral pathogenic spirochaete Treponema denticola and 2 in Streptococcus pneumoniae R6." Q#9091 - CGI_10012656 superfamily 216554 1138 1216 0.00828521 37.459 cl15977 zf-DHHC superfamily N - DHHC palmitoyltransferase; This family includes the well known DHHC zinc binding domain as well as three of the four conserved transmembrane regions found in this family of palmitoyltransferase enzymes. Q#9095 - CGI_10028231 superfamily 241832 438 532 2.51E-14 70.3321 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#9095 - CGI_10028231 superfamily 241832 570 653 0.00110302 38.4726 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. 
They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#9097 - CGI_10028233 superfamily 247978 414 447 0.00853678 36.394 cl17424 CsbD superfamily NC - "CsbD-like; CsbD is a bacterial general stress response protein. Its expression is mediated by sigma-B, an alternative sigma factor. The role of CsbD in stress response is unclear." Q#9098 - CGI_10028234 superfamily 220695 49 206 1.20E-05 44.8771 cl18571 7TM_GPCR_Srx superfamily C - Serpentine type 7TM GPCR chemoreceptor Srx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srx is part of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#9099 - CGI_10028235 superfamily 218140 210 487 9.58E-89 289.113 cl04579 Anoctamin superfamily C - "Calcium-activated chloride channel; The family carries eight putative transmembrane domains, and, although it has no similarity to other known channel proteins, it is clearly a calcium-activated ionic channel. It is expressed in various secretory epithelia, the retina and sensory neurons, and mediates receptor-activated chloride currents in diverse physiological processes." 
Q#9099 - CGI_10028235 superfamily 218140 544 711 3.61E-48 176.25 cl04579 Anoctamin superfamily N - "Calcium-activated chloride channel; The family carries eight putative transmembrane domains, and, although it has no similarity to other known channel proteins, it is clearly a calcium-activated ionic channel. It is expressed in various secretory epithelia, the retina and sensory neurons, and mediates receptor-activated chloride currents in diverse physiological processes." Q#9100 - CGI_10028236 superfamily 248264 13 50 4.13E-09 49.543 cl17710 DDE_4 superfamily N - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#9101 - CGI_10028237 superfamily 245596 75 266 1.01E-99 303.734 cl11394 Glyco_tranf_GTA_type superfamily - - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. 
This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#9104 - CGI_10028240 superfamily 241563 38 77 8.40E-07 46.1763 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#9104 - CGI_10028240 superfamily 245027 76 189 0.00779474 35.832 cl09176 FlgN superfamily C - FlgN protein; This family includes the FlgN protein and export chaperone involved in flagellar synthesis. Q#9107 - CGI_10028243 superfamily 110440 354 381 0.00582978 34.3057 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase); proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." 
Q#9107 - CGI_10028243 superfamily 243092 134 269 0.0063365 36.544 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#9108 - CGI_10028244 superfamily 241563 60 99 1.17E-05 43.8651 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#9108 - CGI_10028244 superfamily 243362 355 396 0.00916357 36.6343 cl03262 DnaJ_C superfamily N - C-terminal substrate binding domain of DnaJ and HSP40; The C-terminal region of the DnaJ/Hsp40 protein mediates oligomerization and binding to denatured polypeptide substrate. DnaJ/Hsp40 is a widely conserved heat-shock protein. 
It prevents the aggregation of unfolded substrate and forms a ternary complex with both substrate and DnaK/Hsp70; the N-terminal J-domain of DnaJ/Hsp40 stimulates the ATPase activity of DnaK/Hsp70. Q#9110 - CGI_10028246 superfamily 243072 852 978 1.15E-25 104.388 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#9110 - CGI_10028246 superfamily 243072 569 694 1.16E-23 98.6098 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#9110 - CGI_10028246 superfamily 243072 474 628 1.28E-23 98.6098 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#9110 - CGI_10028246 superfamily 243072 754 912 5.03E-22 93.9874 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#9110 - CGI_10028246 superfamily 243072 361 500 8.43E-21 90.5206 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#9110 - CGI_10028246 superfamily 243072 250 387 6.23E-20 87.8242 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#9110 - CGI_10028246 superfamily 248318 1092 1147 1.20E-19 85.1801 cl17764 FYVE superfamily - - "FYVE domain; Zinc-binding domain; targets proteins to membrane lipids via interaction with phosphatidylinositol-3-phosphate, PI3P; present in Fab1, YOTB, Vac1, and EEA1;" Q#9110 - CGI_10028246 superfamily 243072 919 1050 9.34E-18 81.661 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#9110 - CGI_10028246 superfamily 243066 59 156 2.25E-13 68.0277 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." 
Q#9111 - CGI_10028247 superfamily 245210 31 406 8.15E-172 490.513 cl09938 cond_enzymes superfamily - - "Condensing enzymes; Family of enzymes that catalyze a (decarboxylating or non-decarboxylating) Claisen-like condensation reaction. Members share strong structural similarity, and are involved in the synthesis and degradation of fatty acids, and the production of polyketides, a diverse group of natural products." Q#9113 - CGI_10028249 superfamily 245226 122 284 3.81E-22 90.8228 cl10012 DnaQ_like_exo superfamily - - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer."
Q#9114 - CGI_10028250 superfamily 217865 60 338 1.18E-159 455.548 cl12285 Not1 superfamily N - "CCR4-Not complex component, Not1; The Ccr4-Not complex is a global regulator of transcription that affects genes positively and negatively and is thought to regulate transcription factor TFIID." Q#9115 - CGI_10028251 superfamily 207662 73 109 8.94E-17 73.6336 cl02596 NR_DBD_like superfamily C - "DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers; DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. It interacts with a specific DNA site upstream of the target gene and modulates the rate of transcriptional initiation. Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions, from development, reproduction, to homeostasis and metabolism in animals (metazoans). The family contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). Most nuclear receptors bind as homodimers or heterodimers to their target sites, which consist of two hexameric half-sites. Specificity is determined by the half-site sequence, the relative orientation of the half-sites and the number of spacer nucleotides between the half-sites. However, a growing number of nuclear receptors have been reported to bind to DNA as monomers." Q#9115 - CGI_10028251 superfamily 207662 251 287 2.23E-11 58.996 cl02596 NR_DBD_like superfamily NC - "DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers; DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers. 
Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. It interacts with a specific DNA site upstream of the target gene and modulates the rate of transcriptional initiation. Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions, from development, reproduction, to homeostasis and metabolism in animals (metazoans). The family contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). Most nuclear receptors bind as homodimers or heterodimers to their target sites, which consist of two hexameric half-sites. Specificity is determined by the half-site sequence, the relative orientation of the half-sites and the number of spacer nucleotides between the half-sites. However, a growing number of nuclear receptors have been reported to bind to DNA as monomers." Q#9117 - CGI_10028254 superfamily 221482 479 680 3.59E-29 115.752 cl13649 Angiomotin_C superfamily - - "Angiomotin C terminal; This domain family is found in eukaryotes, and is typically between 197 and 211 amino acids in length. This family is the C terminal region of angiomotin. Angiomotin regulates the action of angiogenesis inhibitor angiostatin. The C terminal region of angiomotin appears to be involved in directing the protein chemotactically." Q#9118 - CGI_10028255 superfamily 187751 116 260 2.09E-35 130.117 cl18153 RNase_H2-B superfamily C - "Ribonuclease H2-B is a subunit of the eukaryotic RNase H complex which cleaves RNA-DNA hybrids; Ribonuclease H2B is one of the three proteins of eukaryotic RNase H2 complex that is required for nucleic acid binding and hydrolysis. 
RNase H is classified into two families, type I (prokaryotic RNase HI, eukaryotic RNase H1 and viral RNase H) and type II (prokaryotic RNase HII and HIII, and eukaryotic RNase H2/HII). RNase H endonucleolytically hydrolyzes an RNA strand when it is annealed to a complementary DNA strand in the presence of divalent cations, in DNA replication and repair. The enzyme can be found in bacteria, archaea, and eukaryotes. Most prokaryotic and eukaryotic genomes contain multiple RNase H genes. Despite a lack of evidence for homology from sequence comparisons, type I and type II RNase H share a common fold and similar steric configurations of the four acidic active-site residues, suggesting identical or very similar catalytic mechanisms. Eukaryotic RNase HII is active during replication and is believed to play a role in removal of Okazaki fragment primers and single ribonucleotides in DNA-DNA duplexes. Eukaryotic RNase HII is functional when it forms a complex with RNase H2B and RNase H2C proteins. It is speculated that the two accessory subunits are required for correct folding of the catalytic subunit of RNase HII. Mutations in the three subunits of human RNase HII cause neurological disorder." Q#9118 - CGI_10028255 superfamily 187751 344 384 9.90E-05 41.906 cl18153 RNase_H2-B superfamily N - "Ribonuclease H2-B is a subunit of the eukaryotic RNase H complex which cleaves RNA-DNA hybrids; Ribonuclease H2B is one of the three proteins of eukaryotic RNase H2 complex that is required for nucleic acid binding and hydrolysis. RNase H is classified into two families, type I (prokaryotic RNase HI, eukaryotic RNase H1 and viral RNase H) and type II (prokaryotic RNase HII and HIII, and eukaryotic RNase H2/HII). RNase H endonucleolytically hydrolyzes an RNA strand when it is annealed to a complementary DNA strand in the presence of divalent cations, in DNA replication and repair. The enzyme can be found in bacteria, archaea, and eukaryotes. 
Most prokaryotic and eukaryotic genomes contain multiple RNase H genes. Despite a lack of evidence for homology from sequence comparisons, type I and type II RNase H share a common fold and similar steric configurations of the four acidic active-site residues, suggesting identical or very similar catalytic mechanisms. Eukaryotic RNase HII is active during replication and is believed to play a role in removal of Okazaki fragment primers and single ribonucleotides in DNA-DNA duplexes. Eukaryotic RNase HII is functional when it forms a complex with RNase H2B and RNase H2C proteins. It is speculated that the two accessory subunits are required for correct folding of the catalytic subunit of RNase HII. Mutations in the three subunits of human RNase HII cause neurological disorder." Q#9120 - CGI_10028257 superfamily 220695 327 455 2.31E-08 54.5071 cl18571 7TM_GPCR_Srx superfamily C - Serpentine type 7TM GPCR chemoreceptor Srx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srx is part of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#9123 - CGI_10028260 superfamily 241750 312 514 4.38E-32 123.067 cl00281 metallo-dependent_hydrolases superfamily N - "Superfamily of metallo-dependent hydrolases (also called amidohydrolase superfamily) is a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. 
The family includes urease alpha, adenosine deaminase, phosphotriesterase dihydroorotases, allantoinases, hydantoinases, AMP-, adenine and cytosine deaminases, imidazolonepropionase, aryldialkylphosphatase, chlorohydrolases, formylmethanofuran dehydrogenases and others." Q#9124 - CGI_10028261 superfamily 222413 90 149 5.94E-11 59.9762 cl16433 DDE_Tnp_1_7 superfamily C - Transposase IS4; Transposase IS4. Q#9124 - CGI_10028261 superfamily 222412 204 233 1.79E-06 43.5133 cl16432 Tnp_zf-ribbon_2 superfamily - - DDE_Tnp_1-like zinc-ribbon; This zinc-ribbon domain is frequently found at the C-terminal of proteins derived from transposable elements. Q#9127 - CGI_10028264 superfamily 242274 134 250 2.82E-05 42.7846 cl01053 SGNH_hydrolase superfamily N - "SGNH_hydrolase, or GDSL_hydrolase, is a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the typical Ser-His-Asp(Glu) triad from other serine hydrolases, but may lack the carboxylic acid." Q#9130 - CGI_10028267 superfamily 243077 106 157 3.81E-15 71.0373 cl02542 DnaJ superfamily - - "DnaJ domain or J-domain. DnaJ/Hsp40 (heat shock protein 40) proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperonine family. Hsp40 proteins are characterized by the presence of a J domain, which mediates the interaction with Hsp70. They may contain other domains as well, and the architectures provide a means of classification." Q#9130 - CGI_10028267 superfamily 214946 220 459 2.56E-18 85.4879 cl15345 Sec63 superfamily C - "Sec63 Brl domain; This domain was named after the yeast Sec63 (or NPL1) (also known as the Brl domain) protein in which it was found. 
This protein is required for assembly of functional endoplasmic reticulum translocons. Other yeast proteins containing this domain include pre-mRNA splicing helicase BRR2, HFM1 protein and putative helicases." Q#9130 - CGI_10028267 superfamily 214946 612 692 0.000193512 42.7307 cl15345 Sec63 superfamily N - "Sec63 Brl domain; This domain was named after the yeast Sec63 (or NPL1) (also known as the Brl domain) protein in which it was found. This protein is required for assembly of functional endoplasmic reticulum translocons. Other yeast proteins containing this domain include pre-mRNA splicing helicase BRR2, HFM1 protein and putative helicases." Q#9131 - CGI_10028268 superfamily 247916 105 150 6.54E-05 41.2143 cl17362 Transglut_core superfamily N - "Transglutaminase-like superfamily; This family includes animal transglutaminases and other bacterial proteins of unknown function. Sequence conservation in this superfamily primarily involves three motifs that centre around conserved cysteine, histidine, and aspartate residues that form the catalytic triad in the structurally characterized transglutaminase, the human blood clotting factor XIIIa'. On the basis of the experimentally demonstrated activity of the Methanobacterium phage pseudomurein endoisopeptidase, it is proposed that many, if not all, microbial homologues of the transglutaminases are proteases and that the eukaryotic transglutaminases have evolved from an ancestral protease." Q#9135 - CGI_10028272 superfamily 246664 250 745 1.09E-149 450.223 cl14561 An_peroxidase_like superfamily - - "Animal heme peroxidases and related proteins; A diverse family of enzymes, which includes prostaglandin G/H synthase, thyroid peroxidase, myeloperoxidase, linoleate diol synthase, lactoperoxidase, peroxinectin, peroxidasin, and others. Despite its name, this family is not restricted to metazoans: members are found in fungi, plants, and bacteria as well." 
Q#9136 - CGI_10028273 superfamily 247799 75 197 4.93E-59 191.688 cl17245 KH-I superfamily - - "K homology RNA-binding domain, type I. KH binds single-stranded RNA or DNA. It is found in a wide variety of proteins including ribosomal proteins, transcription factors and post-transcriptional modifiers of mRNA. There are two different KH domains that belong to different protein folds, but they share a single KH motif. The KH motif is folded into a beta alpha alpha beta unit. In addition to the core, type II KH domains (e.g. ribosomal protein S3) include N-terminal extension and type I KH domains (e.g. hnRNP K) contain C-terminal extension." Q#9137 - CGI_10028274 superfamily 241568 30 85 3.28E-07 45.9168 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#9138 - CGI_10028275 superfamily 241563 68 109 8.96E-07 46.3184 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#9138 - CGI_10028275 superfamily 241563 28 59 0.00337337 35.918 cl00034 BBOX superfamily N - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#9139 - CGI_10028276 superfamily 247044 141 206 2.33E-28 104.612 cl15697 ADF_gelsolin superfamily C - Actin depolymerization factor/cofilin- and gelsolin-like domains; Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. Q#9141 - CGI_10028278 superfamily 248097 69 186 4.12E-16 74.2238 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#9141 - CGI_10028278 superfamily 248097 354 413 0.00854139 34.9334 cl17543 C1q superfamily NC - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#9144 - CGI_10028281 superfamily 245603 94 117 0.000190592 36.1604 cl11403 pepsin_retropepsin_like superfamily C - "Cellular and retroviral pepsin-like aspartate proteases; This family includes both cellular and retroviral pepsin-like aspartate proteases. The cellular pepsin and pepsin-like enzymes are twice as long as their retroviral counterparts. The cellular pepsin-like aspartic proteases are found in mammals, plants, fungi and bacteria. These well known and extensively characterized enzymes include pepsins, chymosin, rennin, cathepsins, and fungal aspartic proteases. Several have long been known to be medically (rennin, cathepsin D and E, pepsin) or commercially (chymosin) important. The eukaryotic pepsin-like proteases contain two domains possessing similar topological features. 
The N- and C-terminal domains, although structurally related by a 2-fold axis, have only limited sequence homology except in the vicinity of the active site. This suggests that the enzymes evolved by an ancient duplication event. The eukaryotic pepsin-like proteases have two active site ASP residues with each N- and C-terminal lobe contributing one residue. While the fungal and mammalian pepsins are bilobal proteins, retropepsins function as dimers and the monomer resembles structure of the N- or C-terminal domains of eukaryotic enzyme. The active site motif (Asp-Thr/Ser-Gly-Ser) is conserved between the retroviral and eukaryotic proteases and between the N-and C-terminal of eukaryotic pepsin-like proteases. The retropepsin-like family includes pepsin-like aspartate proteases from retroviruses, retrotransposons and retroelements; as well as eukaryotic DNA-damage-inducible proteins (DDIs), and bacterial aspartate peptidases. Retropepsin is synthesized as part of the POL polyprotein that contains an aspartyl-protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. This family of aspartate proteases is classified by MEROPS as the peptidase family A1 (pepsin A) and A2 (retropepsin family)." Q#9145 - CGI_10028283 superfamily 241574 627 815 3.51E-71 235.558 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." 
Q#9145 - CGI_10028283 superfamily 243072 98 189 5.70E-15 72.8014 cl02529 ANK superfamily C - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#9148 - CGI_10028286 superfamily 218570 166 194 0.00107341 35.4561 cl05109 Pacifastin_I superfamily - - "Pacifastin inhibitor (LCMII); Structures of members of this family show that they are comprised of a triple-stranded antiparallel beta-sheet connected by three disulfide bridges, which defines this as a novel family of serine protease inhibitors." Q#9149 - CGI_10028287 superfamily 215754 61 133 4.21E-16 70.7452 cl02813 Mito_carr superfamily N - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#9149 - CGI_10028287 superfamily 215754 136 199 2.68E-07 46.0924 cl02813 Mito_carr superfamily C - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#9150 - CGI_10028288 superfamily 245814 200 276 0.000498845 40.1651 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#9150 - CGI_10028288 superfamily 215647 751 928 1.47E-15 77.6488 cl18338 7tm_2 superfamily C - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GCPRs).They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97 ; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#9150 - CGI_10028288 superfamily 243086 684 729 4.80E-07 48.9106 cl02559 GPS superfamily - - "Latrophilin/CL-1-like GPS domain; Domain present in latrophilin/CL-1, sea urchin REJ and polycystin." Q#9150 - CGI_10028288 superfamily 243029 301 341 0.00243192 38.1005 cl02422 HRM superfamily N - Hormone receptor domain; This extracellular domain contains four conserved cysteines that probably for disulphide bridges. The domain is found in a variety of hormone receptors. It may be a ligand binding domain. Q#9150 - CGI_10028288 superfamily 199168 89 112 0.00309512 37.3312 cl15310 LRR_TYP superfamily - - "Leucine-rich repeats, typical (most populated) subfamily; Leucine-rich repeats, typical (most populated) subfamily. " Q#9150 - CGI_10028288 superfamily 246925 24 102 0.007391 38.8758 cl15309 LRR_RI superfamily NC - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. 
LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#9151 - CGI_10028289 superfamily 243072 611 763 5.83E-14 70.4902 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#9151 - CGI_10028289 superfamily 243072 335 402 8.23E-11 60.8602 cl02529 ANK superfamily C - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#9151 - CGI_10028289 superfamily 243072 538 652 1.67E-07 50.845 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#9151 - CGI_10028289 superfamily 209898 100 122 0.000167821 40.4646 cl14787 MORN superfamily - - MORN repeat; The MORN (Membrane Occupation and Recognition Nexus) repeat is found in multiple copies in several proteins including junctophilins (See Takeshima et al. Mol. Cell 2000;6:11-22). A MORN-repeat protein has been identified in the parasite Toxoplasma gondiis a dynamic component of cell division apparatus in Toxoplasma gondii. It has been hypothesised to functions as a linker protein between certain membrane regions and the parasite's cytoskeleton. Q#9151 - CGI_10028289 superfamily 209898 77 97 0.00122943 38.1534 cl14787 MORN superfamily - - MORN repeat; The MORN (Membrane Occupation and Recognition Nexus) repeat is found in multiple copies in several proteins including junctophilins (See Takeshima et al. Mol. Cell 2000;6:11-22). A MORN-repeat protein has been identified in the parasite Toxoplasma gondiis a dynamic component of cell division apparatus in Toxoplasma gondii. It has been hypothesised to functions as a linker protein between certain membrane regions and the parasite's cytoskeleton. Q#9152 - CGI_10028290 superfamily 218520 22 187 4.68E-59 186.715 cl05007 EBP superfamily - - "Emopamil binding protein; Emopamil binding protein (EBP) is as a gene that encodes a non-glycosylated type I integral membrane protein of endoplasmic reticulum and shows high level expression in epithelial tissues. The EBP protein has emopamil binding domains, including the sterol acceptor site and the catalytic centre, which show Delta8-Delta7 sterol isomerase activity. Human sterol isomerase, a homologue of mouse EBP, is suggested not only to play a role in cholesterol biosynthesis, but also to affect lipoprotein internalisation. 
In humans, mutations of EBP are known to cause the genetic disorder of X-linked dominant chondrodysplasia punctata (CDPX2). This syndrome of humans is lethal in most males, and affected females display asymmetric hyperkeratotic skin and skeletal abnormalities." Q#9153 - CGI_10028291 superfamily 247723 127 202 5.29E-43 146.762 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#9153 - CGI_10028291 superfamily 247723 212 282 4.14E-40 138.94 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. 
The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#9155 - CGI_10028293 superfamily 241832 47 173 1.50E-59 197.528 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#9155 - CGI_10028293 superfamily 110440 462 488 0.00320249 36.2317 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is me diated by the NHL repeats." 
Q#9160 - CGI_10028298 superfamily 243179 97 165 1.26E-10 56.9725 cl02781 tetraspanin_LEL superfamily C - "Tetraspanin, extracellular domain or large extracellular loop (LEL). Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. The tetraspanin family contains CD9, CD63, CD37, CD53, CD82, CD151, and CD81, amongst others. Tetraspanins are involved in diverse processes such as cell activation and proliferation, adhesion and motility, differentiation, cancer, and others. Their various functions may relate to their ability to act as molecular facilitators, grouping specific cell-surface proteins and affecting formation and stability of signaling complexes. Tetraspanins associate laterally with one another and cluster dynamically with numerous parnter domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web", which may also include integrins." Q#9161 - CGI_10028299 superfamily 245206 1 111 8.65E-20 81.1332 cl09931 NADB_Rossmann superfamily N - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. 
Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#9162 - CGI_10028300 superfamily 243077 9 54 1.65E-07 42.9177 cl02542 DnaJ superfamily - - "DnaJ domain or J-domain. DnaJ/Hsp40 (heat shock protein 40) proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperonine family. Hsp40 proteins are characterized by the presence of a J domain, which mediates the interaction with Hsp70. They may contain other domains as well, and the architectures provide a means of classification." Q#9163 - CGI_10028301 superfamily 243092 286 388 0.0025535 38.8552 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat 
are present in this alignment." Q#9164 - CGI_10028302 superfamily 219110 1 63 2.59E-37 120.485 cl05913 RAMP4 superfamily - - "Ribosome associated membrane protein RAMP4; This family consists of several ribosome associated membrane protein RAMP4 (or SERP1) sequences. Stabilisation of membrane proteins in response to stress involves the concerted action of a rescue unit in the ER membrane comprised of SERP1/RAMP4, other components of the translocon, and molecular chaperones in the ER." Q#9165 - CGI_10028303 superfamily 245864 55 147 1.25E-19 83.4818 cl12078 p450 superfamily N - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#9166 - CGI_10028304 superfamily 245864 4 411 2.18E-106 324.617 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. 
They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#9167 - CGI_10028305 superfamily 245864 4 389 2.63E-92 288.408 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. 
While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#9168 - CGI_10028306 superfamily 245864 26 477 6.47E-117 354.277 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#9169 - CGI_10028307 superfamily 245864 41 241 8.44E-61 199.812 cl12078 p450 superfamily N - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. 
Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#9170 - CGI_10028308 superfamily 245864 26 96 1.18E-12 63.0662 cl12078 p450 superfamily C - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. 
their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#9171 - CGI_10028309 superfamily 245864 26 478 5.53E-101 313.061 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#9172 - CGI_10028310 superfamily 245864 1 349 1.54E-72 234.48 cl12078 p450 superfamily N - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. 
Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#9173 - CGI_10028311 superfamily 245864 26 382 1.80E-78 252.199 cl12078 p450 superfamily C - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. 
their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#9174 - CGI_10028312 superfamily 241691 772 884 0.00132635 39.4176 cl00213 DNA_BRE_C superfamily N - "DNA breaking-rejoining enzymes, C-terminal catalytic domain. The DNA breaking-rejoining enzyme superfamily includes type IB topoisomerases and tyrosine recombinases that share the same fold in their catalytic domain containing six conserved active site residues. The best-studied members of this diverse superfamily include human topoisomerase I, the bacteriophage lambda integrase, the bacteriophage P1 Cre recombinase, the yeast Flp recombinase and the bacterial XerD/C recombinases. Their overall reaction mechanism is essentially identical and involves cleavage of a single strand of a DNA duplex by nucleophilic attack of a conserved tyrosine to give a 3' phosphotyrosyl protein-DNA adduct. In the second rejoining step, a terminal 5' hydroxyl attacks the covalent adduct to release the enzyme and generate duplex DNA. The enzymes differ in that topoisomerases cleave and then rejoin the same 5' and 3' termini, whereas a site-specific recombinase transfers a 5' hydroxyl generated by recombinase cleavage to a new 3' phosphate partner located in a different duplex region. Many DNA breaking-rejoining enzymes also have N-terminal domains, which show little sequence or structure similarity." Q#9174 - CGI_10028312 superfamily 241565 59 128 0.00920671 35.429 cl00038 BRCT superfamily - - "Breast Cancer Suppressor Protein (BRCA1), carboxy-terminal domain. The BRCT domain is found within many DNA damage repair and cell cycle checkpoint proteins. The unique diversity of this domain superfamily allows BRCT modules to interact forming homo/hetero BRCT multimers, BRCT-non-BRCT interactions, and interactions within DNA strand breaks." 
Q#9175 - CGI_10028313 superfamily 245864 26 474 1.98E-112 342.721 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#9176 - CGI_10028314 superfamily 219570 5 454 1.51E-73 247.09 cl06694 CENP-I superfamily - - "Mis6; Mis6 is an essential centromere connector protein acting during G1-S phase of the cell cycle. Mis6 is thought to be required for recruiting CENP-A, the centromere- specific histone H3 variant, an important event for centromere function and chromosome segregation during mitosis." Q#9177 - CGI_10028315 superfamily 245201 3 234 4.80E-05 41.8386 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. 
It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#9178 - CGI_10028316 superfamily 245201 26 266 2.95E-07 48.7721 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#9179 - CGI_10004654 superfamily 245201 23 211 2.64E-37 135.827 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#9179 - CGI_10004654 superfamily 240618 295 383 1.07E-17 77.6744 cl18927 UBL_TBK1_like superfamily - - "Ubiquitin-Like Domain Of Human Tbk1 and similar proteins; This family contains ubiquitin-like domain (UBL) found in TANK-binding kinase 1 (TBK1) and similar proteins. TBK1 regulates factors such as IRF3 and IRF7, promoting antiviral activity in the interferon signaling pathways. In addition to the central UBL, these proteins have an N-terminal kinase domain and a C-terminal elongated helical domain. 
The ubiquitin-like domain acts as a protein-protein interaction domain, and has been implicated in regulating kinase activity, which modulates interactions in the IFN pathway." Q#9181 - CGI_10004656 superfamily 245201 1171 1309 9.17E-23 99.2333 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#9181 - CGI_10004656 superfamily 248099 52 128 2.90E-23 96.6246 cl17545 Bromo_TP superfamily - - Bromodomain associated; This domain is predicted to bind DNA and is often found associated with pfam00439 and in transcription factors. It has a histone-like fold. Q#9181 - CGI_10004656 superfamily 247999 1032 1077 1.90E-13 67.6221 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#9182 - CGI_10004657 superfamily 247724 19 180 3.33E-100 289.329 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#9186 - CGI_10004661 superfamily 241574 358 556 1.01E-68 224.002 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#9187 - CGI_10002369 superfamily 241600 1 169 2.36E-54 173.581 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. 
C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#9188 - CGI_10002370 superfamily 241600 47 230 2.82E-61 195.152 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#9193 - CGI_10014193 superfamily 241733 1 65 5.42E-42 133.48 cl00259 Sm_like superfamily N - "Sm and related proteins; The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes." 
Q#9195 - CGI_10014195 superfamily 243034 1085 1178 8.42E-09 54.6936 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#9195 - CGI_10014195 superfamily 243034 1202 1284 5.07E-06 46.2192 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the 
number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#9195 - CGI_10014195 superfamily 243034 836 945 0.00150912 38.5152 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the 
p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#9196 - CGI_10014196 superfamily 248458 84 448 4.96E-08 53.4717 cl17904 MFS superfamily - - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#9197 - CGI_10014198 superfamily 216809 161 252 2.13E-16 71.835 cl03397 DUF108 superfamily - - Domain of unknown function DUF108; This family has no known function. 
It is found to compose the complete protein in archaebacteria and a single domain in a large C. elegans protein. Q#9197 - CGI_10014198 superfamily 217564 8 113 6.85E-12 60.0414 cl18420 NAD_binding_3 superfamily - - "Homoserine dehydrogenase, NAD binding domain; This domain adopts a Rossmann NAD binding fold. The C-terminal domain of homoserine dehydrogenase contributes a single helix to this structural domain, which is not included in the Pfam model." Q#9198 - CGI_10014199 superfamily 221749 1002 1056 5.73E-08 51.5849 cl15061 Gryzun-like superfamily - - "Gryzun, putative Golgi trafficking; Members of this family are involved in Golgi trafficking." Q#9199 - CGI_10014200 superfamily 243141 101 206 7.59E-22 88.1422 cl02687 RWD superfamily - - "RWD domain; This domain was identified in WD40 repeat proteins and Ring finger domain proteins. The function of this domain is unknown. GCN2 is the alpha-subunit of the only translation initiation factor (eIF2 alpha) kinase that appears in all eukaryotes. Its function requires an interaction with GCN1 via the domain at its N-terminus, which is termed the RWD domain after three major RWD-containing proteins: RING finger-containing proteins, WD-repeat-containing proteins, and yeast DEAD (DEXD)-like helicases. The structure forms an alpha + beta sandwich fold consisting of two layers: a four-stranded antiparallel beta-sheet, and three side-by-side alpha-helices." Q#9200 - CGI_10014201 superfamily 241547 84 313 2.66E-88 277.242 cl00012 alpha_CA superfamily - - "Carbonic anhydrase alpha (vertebrate-like) group. Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. 
They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionary distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidine residues and a fourth conserved histidine plays a potential role in proton transfer." Q#9200 - CGI_10014201 superfamily 241547 375 626 9.58E-99 305.392 cl00012 alpha_CA superfamily - - "Carbonic anhydrase alpha (vertebrate-like) group. Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionary distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidine residues and a fourth conserved histidine plays a potential role in proton transfer." Q#9201 - CGI_10014202 superfamily 245864 43 375 4.78E-62 209.442 cl12078 p450 superfamily C - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. 
Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#9201 - CGI_10014202 superfamily 245864 375 427 0.000409123 41.1098 cl12078 p450 superfamily N - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. 
Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#9202 - CGI_10014203 superfamily 245864 8 74 1.22E-22 90.4154 cl12078 p450 superfamily N - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#9203 - CGI_10014204 superfamily 217311 1 328 2.32E-94 295.014 cl18402 DUF229 superfamily N - Protein of unknown function (DUF229); Members of this family are uncharacterized. They are 500-1200 amino acids in length and share a long region conservation that probably corresponds to several domains. The Go annotation for the protein indicates that it is involved in nematode larval development and has a positive regulation on growth rate. 
Q#9205 - CGI_10014206 superfamily 241564 1 47 3.22E-14 62.2831 cl00035 BIR superfamily N - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." Q#9206 - CGI_10014207 superfamily 245231 1 241 9.85E-71 221.626 cl10019 PurM-like superfamily N - "AIR (aminoimidazole ribonucleotide) synthase related protein. This family includes Hydrogen expression/formation protein HypE, AIR synthases, FGAM (formylglycinamidine ribonucleotide) synthase and Selenophosphate synthetase (SelD). The N-terminal domain of AIR synthase forms the dimer interface of the protein, and is suggested as a putative ATP binding domain." Q#9207 - CGI_10014208 superfamily 241782 69 453 1.16E-51 179.459 cl00321 AAT_I superfamily - - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. 
The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phosphorylase family (fold type V)." Q#9208 - CGI_10014209 superfamily 243066 29 123 1.08E-20 86.9025 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#9208 - CGI_10014209 superfamily 198867 133 240 3.17E-17 76.9964 cl06652 BACK superfamily - - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. 
This family appears to be closely related to the BTB domain (Finn RD, personal observation)." Q#9209 - CGI_10014210 superfamily 221288 374 771 3.38E-138 417.502 cl14982 DUF3402 superfamily - - Domain of unknown function (DUF3402); This domain is functionally uncharacterized. This domain is found in eukaryotes. This presumed domain is typically between 350 to 473 amino acids in length. This domain is found associated with pfam07923. Q#9209 - CGI_10014210 superfamily 219645 22 311 3.55E-66 222.958 cl06798 N1221 superfamily - - N1221-like protein; The sequences featured in this family are similar to a hypothetical protein product of ORF N1221 in the CPT1-SPC98 intergenic region of the yeast genome. This encodes an acidic polypeptide with several possible transmembrane regions. Q#9210 - CGI_10014211 superfamily 243175 80 197 2.02E-58 182.063 cl02776 GST_C_family superfamily - - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. 
GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." Q#9210 - CGI_10014211 superfamily 241832 1 72 4.09E-34 118.258 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." 
Q#9211 - CGI_10014212 superfamily 243175 58 178 3.93E-59 188.997 cl02776 GST_C_family superfamily - - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." 
Q#9211 - CGI_10014212 superfamily 241832 2 52 1.54E-20 84.36 cl00388 Thioredoxin_like superfamily C - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#9213 - CGI_10014214 superfamily 245201 533 673 7.41E-06 47.325 cl09925 PKc_like superfamily C - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#9213 - CGI_10014214 superfamily 245201 397 467 9.38E-06 47.1546 cl09925 PKc_like superfamily C - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. 
These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#9213 - CGI_10014214 superfamily 245716 10 33 1.87E-05 42.9307 cl11592 zf-CCCH superfamily - - Zinc finger C-x8-C-x5-C-x3-H type (and similar); Zinc finger C-x8-C-x5-C-x3-H type (and similar). Q#9213 - CGI_10014214 superfamily 245201 623 749 5.68E-05 44.3598 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#9214 - CGI_10004811 superfamily 241566 170 219 1.96E-15 71.3691 cl00040 C1 superfamily - - "Protein kinase C conserved region 1 (C1) . Cysteine-rich zinc binding domain. Some members of this domain family bind phorbol esters and diacylglycerol, some are reported to bind RasGTP. May occur in tandem arrangement. Diacylglycerol (DAG) is a second messenger, released by activation of Phospholipase D. Phorbol Esters (PE) can act as analogues of DAG and mimic its downstream effects in, for example, tumor promotion. Protein Kinases C are activated by DAG/PE, this activation is mediated by their N-terminal conserved region (C1). DAG/PE binding may be phospholipid dependent. C1 domains may also mediate DAG/PE signals in chimaerins (a family of Rac GTPase activating proteins), RasGRPs (exchange factors for Ras/Rap1), and Munc13 isoforms (scaffolding proteins involved in exocytosis)." Q#9214 - CGI_10004811 superfamily 241566 98 147 7.70E-14 67.1319 cl00040 C1 superfamily - - "Protein kinase C conserved region 1 (C1) . 
Cysteine-rich zinc binding domain. Some members of this domain family bind phorbol esters and diacylglycerol, some are reported to bind RasGTP. May occur in tandem arrangement. Diacylglycerol (DAG) is a second messenger, released by activation of Phospholipase D. Phorbol Esters (PE) can act as analogues of DAG and mimic its downstream effects in, for example, tumor promotion. Protein Kinases C are activated by DAG/PE, this activation is mediated by their N-terminal conserved region (C1). DAG/PE binding may be phospholipid dependent. C1 domains may also mediate DAG/PE signals in chimaerins (a family of Rac GTPase activating proteins), RasGRPs (exchange factors for Ras/Rap1), and Munc13 isoforms (scaffolding proteins involved in exocytosis)." Q#9214 - CGI_10004811 superfamily 245201 298 616 0 523.972 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#9217 - CGI_10004814 superfamily 218676 12 371 1.36E-27 112.041 cl14911 Peptidase_M13_N superfamily - - "Peptidase family M13; M13 peptidases are well-studied proteases found in a wide range of organisms including mammals and bacteria. In mammals they participate in processes such as cardiovascular development, blood-pressure regulation, nervous control of respiration, and regulation of the function of neuropeptides in the central nervous system. In bacteria they may be used for digestion of milk." 
Q#9219 - CGI_10004816 superfamily 218676 290 627 2.55E-22 98.9447 cl14911 Peptidase_M13_N superfamily - - "Peptidase family M13; M13 peptidases are well-studied proteases found in a wide range of organisms including mammals and bacteria. In mammals they participate in processes such as cardiovascular development, blood-pressure regulation, nervous control of respiration, and regulation of the function of neuropeptides in the central nervous system. In bacteria they may be used for digestion of milk." Q#9219 - CGI_10004816 superfamily 246723 788 914 6.62E-09 58.0835 cl14813 GluZincin superfamily N - "Peptidase Gluzincin family (thermolysin-like proteinases, TLPs) includes peptidases M1, M2, M3, M4, M13, M32 and M36 (fungalysins); Gluzincin family (thermolysin-like peptidases or TLPs) includes several zinc-dependent metallopeptidases such as the M1, M2, M3, M4, M13, M32, M36 peptidases (MEROPS classification), and contain HEXXH and EXXXD motifs as part of their active site. All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. M1 family includes aminopeptidase N (APN) and leukotriene A4 hydrolase (LTA4H). APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types. LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity such that the two activities occupy different, but overlapping sites. The peptidase M3 or neurolysin-like family, includes M3, M2 and M32 metallopeptidases. The M3 peptidases have two subfamilies: M3A, includes thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (3.4.24.16), and the mitochondrial intermediate peptidase; M3B contains oligopeptidase F. 
M2 peptidase angiotensin converting enzyme (ACE, EC 3.4.15.1) catalyzes the conversion of decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II. ACE is a key part of the renin-angiotensin system that regulates blood pressure, thus ACE inhibitors are important for the treatment of hypertension. M32 family includes two eukaryotic enzymes from protozoa Trypanosoma cruzi, a causative agent of Chagas' disease, and Leishmania major, a parasite that causes leishmaniasis, making them attractive targets for drug development. The M4 family includes secreted protease thermolysin (EC 3.4.24.27), pseudolysin, aureolysin, neutral protease as well as fungalysin and bacillolysin (EC 3.4.24.28) that degrade extracellular proteins and peptides for bacterial nutrition, especially prior to sporulation. Thermolysin is widely used as a nonspecific protease to obtain fragments for peptide sequencing as well as in production of the artificial sweetener aspartame. M13 family includes neprilysin (EC 3.4.24.11) and endothelin-converting enzyme I (ECE-1, EC 3.4.24.71), which fulfill a broad range of physiological roles due to the greater variation in the S2' subsite allowing substrate specificity and are prime therapeutic targets for selective inhibition. Peptidase M36 (fungamysin) family includes endopeptidases from pathogenic fungi. Fungalysin hydrolyzes extracellular matrix proteins such as elastin and keratin. Aspergillus fumigatus causes the pulmonary disease aspergillosis by invading the lungs of immuno-compromised animals and secreting fungalysin that possibly breaks down proteinaceous structural barriers." 
Q#9219 - CGI_10004816 superfamily 246723 13 46 0.00775565 38.4383 cl14813 GluZincin superfamily N - "Peptidase Gluzincin family (thermolysin-like proteinases, TLPs) includes peptidases M1, M2, M3, M4, M13, M32 and M36 (fungalysins); Gluzincin family (thermolysin-like peptidases or TLPs) includes several zinc-dependent metallopeptidases such as the M1, M2, M3, M4, M13, M32, M36 peptidases (MEROPS classification), and contain HEXXH and EXXXD motifs as part of their active site. All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. M1 family includes aminopeptidase N (APN) and leukotriene A4 hydrolase (LTA4H). APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types. LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity such that the two activities occupy different, but overlapping sites. The peptidase M3 or neurolysin-like family, includes M3, M2 and M32 metallopeptidases. The M3 peptidases have two subfamilies: M3A, includes thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (3.4.24.16), and the mitochondrial intermediate peptidase; M3B contains oligopeptidase F. M2 peptidase angiotensin converting enzyme (ACE, EC 3.4.15.1) catalyzes the conversion of decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II. ACE is a key part of the renin-angiotensin system that regulates blood pressure, thus ACE inhibitors are important for the treatment of hypertension. M32 family includes two eukaryotic enzymes from protozoa Trypanosoma cruzi, a causative agent of Chagas' disease, and Leishmania major, a parasite that causes leishmaniasis, making them attractive targets for drug development. 
The M4 family includes secreted protease thermolysin (EC 3.4.24.27), pseudolysin, aureolysin, neutral protease as well as fungalysin and bacillolysin (EC 3.4.24.28) that degrade extracellular proteins and peptides for bacterial nutrition, especially prior to sporulation. Thermolysin is widely used as a nonspecific protease to obtain fragments for peptide sequencing as well as in production of the artificial sweetener aspartame. M13 family includes neprilysin (EC 3.4.24.11) and endothelin-converting enzyme I (ECE-1, EC 3.4.24.71), which fulfill a broad range of physiological roles due to the greater variation in the S2' subsite allowing substrate specificity and are prime therapeutic targets for selective inhibition. Peptidase M36 (fungamysin) family includes endopeptidases from pathogenic fungi. Fungalysin hydrolyzes extracellular matrix proteins such as elastin and keratin. Aspergillus fumigatus causes the pulmonary disease aspergillosis by invading the lungs of immuno-compromised animals and secreting fungalysin that possibly breaks down proteinaceous structural barriers." Q#9221 - CGI_10004207 superfamily 245225 487 780 3.40E-31 127.102 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily - - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein couples receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine- binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. 
Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#9221 - CGI_10004207 superfamily 247905 321 437 6.58E-10 58.7885 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#9221 - CGI_10004207 superfamily 245225 944 1395 1.56E-67 237.141 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily - - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. 
This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein couples receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine- binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#9221 - CGI_10004207 superfamily 242406 52 163 3.16E-11 62.9941 cl01271 DUF1768 superfamily - - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. 
The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. Q#9222 - CGI_10004208 superfamily 245819 760 936 5.61E-64 215.136 cl11967 Nucleotidyl_cyc_III superfamily - - "Class III nucleotidyl cyclases; Class III nucleotidyl cyclases are the largest, most diverse group of nucleotidyl cyclases (NC's) containing prokaryotic and eukaryotic proteins. They can be divided into two major groups; the mononucleotidyl cyclases (MNC's) and the diguanylate cyclases (DGC's). The MNC's, which include the adenylate cyclases (AC's) and the guanylate cyclases (GC's), have a conserved cyclase homology domain (CHD), while the DGC's have a conserved GGDEF domain, named after a conserved motif within this subgroup. Their products, cyclic guanylyl and adenylyl nucleotides, are second messengers that play important roles in eukaryotic signal transduction and prokaryotic sensory pathways." Q#9222 - CGI_10004208 superfamily 245225 11 306 1.62E-79 266.889 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily N - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein couples receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine- binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. 
Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#9222 - CGI_10004208 superfamily 245201 450 684 6.90E-34 132.275 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#9222 - CGI_10004208 superfamily 219526 692 746 7.68E-05 43.7619 cl06648 HNOBA superfamily N - "Heme NO binding associated; The HNOBA domain is found associated with the HNOB domain and pfam00211 in soluble cyclases and signalling proteins. 
The HNOB domain is predicted to function as a heme-dependent sensor for gaseous ligands, and transduce diverse downstream signals, in both bacteria and animals." Q#9223 - CGI_10004209 superfamily 241574 197 323 3.15E-18 82.6337 cl00053 PTPc superfamily N - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#9223 - CGI_10004209 superfamily 241574 39 165 6.09E-18 81.8633 cl00053 PTPc superfamily N - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#9224 - CGI_10004210 superfamily 241574 114 286 6.06E-72 226.313 cl00053 PTPc superfamily N - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. 
This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#9226 - CGI_10009863 superfamily 247684 17 322 7.51E-72 232.938 cl17037 NBD_sugar-kinase_HSP70_actin superfamily N - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#9226 - CGI_10009863 superfamily 241547 344 396 7.54E-16 75.0119 cl00012 alpha_CA superfamily N - "Carbonic anhydrase alpha (vertebrate-like) group. 
Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionary distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidine residues and a fourth conserved histidine plays a potential role in proton transfer." Q#9227 - CGI_10009864 superfamily 241547 153 201 3.67E-10 58.5788 cl00012 alpha_CA superfamily C - "Carbonic anhydrase alpha (vertebrate-like) group. Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionary distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidine residues and a fourth conserved histidine plays a potential role in proton transfer." 
Q#9227 - CGI_10009864 superfamily 241832 269 358 1.40E-07 48.8438 cl00388 Thioredoxin_like superfamily C - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#9227 - CGI_10009864 superfamily 241547 189 236 4.82E-07 49.334 cl00012 alpha_CA superfamily N - "Carbonic anhydrase alpha (vertebrate-like) group. Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionary distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidine residues and a fourth conserved histidine plays a potential role in proton transfer." 
Q#9232 - CGI_10006159 superfamily 241580 76 151 6.21E-37 127.285 cl00061 FH superfamily - - "Forkhead (FH), also known as a "winged helix". FH is named for the Drosophila fork head protein, a transcription factor which promotes terminal rather than segmental development. This family of transcription factor domains, which bind to B-DNA as monomers, are also found in the Hepatocyte nuclear factor (HNF) proteins, which provide tissue-specific gene regulation. The structure contains 2 flexible loops or "wings" in the C-terminal region, hence the term winged helix." Q#9238 - CGI_10006165 superfamily 246723 21 117 8.47E-31 120.361 cl14813 GluZincin superfamily NC - "Peptidase Gluzincin family (thermolysin-like proteinases, TLPs) includes peptidases M1, M2, M3, M4, M13, M32 and M36 (fungalysins); Gluzincin family (thermolysin-like peptidases or TLPs) includes several zinc-dependent metallopeptidases such as the M1, M2, M3, M4, M13, M32, M36 peptidases (MEROPS classification), and contain HEXXH and EXXXD motifs as part of their active site. All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. M1 family includes aminopeptidase N (APN) and leukotriene A4 hydrolase (LTA4H). APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types. LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity such that the two activities occupy different, but overlapping sites. The peptidase M3 or neurolysin-like family, includes M3, M2 and M32 metallopeptidases. The M3 peptidases have two subfamilies: M3A, includes thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (3.4.24.16), and the mitochondrial intermediate peptidase; M3B contains oligopeptidase F. 
M2 peptidase angiotensin converting enzyme (ACE, EC 3.4.15.1) catalyzes the conversion of decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II. ACE is a key part of the renin-angiotensin system that regulates blood pressure, thus ACE inhibitors are important for the treatment of hypertension. M32 family includes two eukaryotic enzymes from protozoa Trypanosoma cruzi, a causative agent of Chagas' disease, and Leishmania major, a parasite that causes leishmaniasis, making them attractive targets for drug development. The M4 family includes secreted protease thermolysin (EC 3.4.24.27), pseudolysin, aureolysin, neutral protease as well as fungalysin and bacillolysin (EC 3.4.24.28) that degrade extracellular proteins and peptides for bacterial nutrition, especially prior to sporulation. Thermolysin is widely used as a nonspecific protease to obtain fragments for peptide sequencing as well as in production of the artificial sweetener aspartame. M13 family includes neprilysin (EC 3.4.24.11) and endothelin-converting enzyme I (ECE-1, EC 3.4.24.71), which fulfill a broad range of physiological roles due to the greater variation in the S2' subsite allowing substrate specificity and are prime therapeutic targets for selective inhibition. Peptidase M36 (fungamysin) family includes endopeptidases from pathogenic fungi. Fungalysin hydrolyzes extracellular matrix proteins such as elastin and keratin. Aspergillus fumigatus causes the pulmonary disease aspergillosis by invading the lungs of immuno-compromised animals and secreting fungalysin that possibly breaks down proteinaceous structural barriers." 
Q#9238 - CGI_10006165 superfamily 246723 118 183 5.07E-22 94.9379 cl14813 GluZincin superfamily N - "Peptidase Gluzincin family (thermolysin-like proteinases, TLPs) includes peptidases M1, M2, M3, M4, M13, M32 and M36 (fungalysins); Gluzincin family (thermolysin-like peptidases or TLPs) includes several zinc-dependent metallopeptidases such as the M1, M2, M3, M4, M13, M32, M36 peptidases (MEROPS classification), and contain HEXXH and EXXXD motifs as part of their active site. All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. M1 family includes aminopeptidase N (APN) and leukotriene A4 hydrolase (LTA4H). APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types. LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity such that the two activities occupy different, but overlapping sites. The peptidase M3 or neurolysin-like family, includes M3, M2 and M32 metallopeptidases. The M3 peptidases have two subfamilies: M3A, includes thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (3.4.24.16), and the mitochondrial intermediate peptidase; M3B contains oligopeptidase F. M2 peptidase angiotensin converting enzyme (ACE, EC 3.4.15.1) catalyzes the conversion of decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II. ACE is a key part of the renin-angiotensin system that regulates blood pressure, thus ACE inhibitors are important for the treatment of hypertension. M32 family includes two eukaryotic enzymes from protozoa Trypanosoma cruzi, a causative agent of Chagas' disease, and Leishmania major, a parasite that causes leishmaniasis, making them attractive targets for drug development. 
The M4 family includes secreted protease thermolysin (EC 3.4.24.27), pseudolysin, aureolysin, neutral protease as well as fungalysin and bacillolysin (EC 3.4.24.28) that degrade extracellular proteins and peptides for bacterial nutrition, especially prior to sporulation. Thermolysin is widely used as a nonspecific protease to obtain fragments for peptide sequencing as well as in production of the artificial sweetener aspartame. M13 family includes neprilysin (EC 3.4.24.11) and endothelin-converting enzyme I (ECE-1, EC 3.4.24.71), which fulfill a broad range of physiological roles due to the greater variation in the S2' subsite allowing substrate specificity and are prime therapeutic targets for selective inhibition. Peptidase M36 (fungamysin) family includes endopeptidases from pathogenic fungi. Fungalysin hydrolyzes extracellular matrix proteins such as elastin and keratin. Aspergillus fumigatus causes the pulmonary disease aspergillosis by invading the lungs of immuno-compromised animals and secreting fungalysin that possibly breaks down proteinaceous structural barriers." Q#9239 - CGI_10006166 superfamily 217740 197 427 2.95E-72 229.17 cl18427 Scramblase superfamily - - Scramblase; Scramblase is palmitoylated and contains a potential protein kinase C phosphorylation site. Scramblase exhibits Ca2+-activated phospholipid scrambling activity in vitro. There are also possible SH3 and WW binding motifs. Scramblase is involved in the redistribution of phospholipids after cell activation or injury. Q#9239 - CGI_10006166 superfamily 222429 6 85 1.81E-22 90.3776 cl18676 Myb_DNA-bind_5 superfamily - - Myb/SANT-like DNA-binding domain; This presumed domain appears to be related to other Myb/SANT like DNA binding domains. This family is greatly expanded in arthropods and higher eukaryotes. 
Q#9240 - CGI_10006167 superfamily 217740 46 276 3.90E-82 249.201 cl18427 Scramblase superfamily - - Scramblase; Scramblase is palmitoylated and contains a potential protein kinase C phosphorylation site. Scramblase exhibits Ca2+-activated phospholipid scrambling activity in vitro. There are also possible SH3 and WW binding motifs. Scramblase is involved in the redistribution of phospholipids after cell activation or injury. Q#9241 - CGI_10006168 superfamily 246908 167 245 4.52E-15 72.8734 cl15255 SH2 superfamily - - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." Q#9241 - CGI_10006168 superfamily 128469 1145 1238 5.40E-14 70.562 cl17971 VPS9 superfamily - - Domain present in VPS9; Domain present in yeast vacuolar sorting protein 9 and other proteins. Q#9241 - CGI_10006168 superfamily 241645 1264 1332 6.28E-11 60.74 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. 
The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#9242 - CGI_10017479 superfamily 245213 14 49 9.64E-11 55.3354 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#9242 - CGI_10017479 superfamily 243061 136 230 1.07E-41 140.17 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#9242 - CGI_10017479 superfamily 245847 55 119 1.49E-06 45.6254 cl12042 FA58C superfamily C - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#9243 - CGI_10017480 superfamily 243112 14 37 1.45E-08 49.9419 cl02620 YDG_SRA superfamily N - "YDG/SRA domain; The function of this domain is unknown, it contains a conserved motif YDG after which it has been named." 
Q#9245 - CGI_10017482 superfamily 242443 45 361 1.89E-67 216.825 cl01342 Peptidase_A22B superfamily - - "Signal peptide peptidase; The members of this family are membrane proteins. In some proteins this region is found associated with pfam02225. This family corresponds with Merops subfamily A22B, the type example of which is signal peptide peptidase. There is a sequence-similarity relationship with pfam01080." Q#9248 - CGI_10017485 superfamily 248360 12 204 1.19E-68 212.135 cl17806 DER1 superfamily - - "Der1-like family; The endoplasmic reticulum (ER) of the yeast Saccharomyces cerevisiae contains a proteolytic system able to selectively degrade misfolded lumenal secretory proteins. For examination of the components involved in this degradation process, mutants were isolated. They could be divided into four complementation groups. The mutations led to stabilisation of two different substrates for this process. The mutant classes were called 'der' for 'degradation in the ER'. DER1 was cloned by complementation of the der1-2 mutation. The DER1 gene codes for a novel hydrophobic protein that is localised to the ER. Deletion of DER1 abolished degradation of the substrate proteins. The function of the Der1 protein seems to be specifically required for the degradation process associated with the ER. Interestingly, this family seems distantly related to the Rhomboid family of membrane peptidases, suggesting that this family may also mediate degradation of misfolded proteins (Bateman A pers. obs.)." Q#9249 - CGI_10017486 superfamily 241644 4 170 4.47E-49 157.749 cl00154 UBCc superfamily - - "Ubiquitin-conjugating enzyme E2, catalytic (UBCc) domain. This is part of the ubiquitin-mediated protein degradation pathway in which a thiol-ester linkage forms between a conserved cysteine and the C-terminus of ubiquitin and complexes with ubiquitin protein ligase enzymes, E3. This pathway regulates many fundamental cellular processes. 
There are also other E2s which form thiol-ester linkages without the use of E3s as well as several UBC homologs (TSG101, Mms2, Croc-1 and similar proteins) which lack the active site cysteine essential for ubiquitination and appear to function in DNA repair pathways which were omitted from the scope of this CD." Q#9250 - CGI_10017487 superfamily 243066 23 122 1.52E-15 71.8797 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#9250 - CGI_10017487 superfamily 198867 131 235 9.97E-13 63.8996 cl06652 BACK superfamily - - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation)." 
Q#9251 - CGI_10017488 superfamily 247792 16 67 8.79E-08 49.7516 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#9251 - CGI_10017488 superfamily 241563 165 196 0.000582272 38.6144 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#9253 - CGI_10017490 superfamily 247723 44 136 1.86E-38 140.511 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. 
The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#9253 - CGI_10017490 superfamily 243091 1264 1385 2.41E-37 138.621 cl02566 SET superfamily - - "SET domain; SET domains are protein lysine methyltransferase enzymes. SET domains appear to be protein-protein interaction domains. It has been demonstrated that SET domains mediate interactions with a family of proteins that display similarity with dual-specificity phosphatases (dsPTPases). A subset of SET domains have been called PR domains. These domains are divergent in sequence from other SET domains, but also appear to mediate protein-protein interaction. The SET domain consists of two regions known as SET-N and SET-C. SET-C forms an unusual and conserved knot-like structure of probably functional importance. In addition to SET-N and SET-C, an insert region (SET-I) and flanking regions of high structural variability form part of the overall structure." Q#9253 - CGI_10017490 superfamily 214703 1385 1401 9.16E-06 44.7036 cl02636 PostSET superfamily - - Cysteine-rich motif following a subset of SET domains; Cysteine-rich motif following a subset of SET domains. Q#9254 - CGI_10017491 superfamily 247856 233 294 2.61E-06 46.7721 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#9254 - CGI_10017491 superfamily 247856 270 346 0.000134006 41.7645 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#9254 - CGI_10017491 superfamily 243082 763 941 9.31E-21 94.2647 cl02553 Peptidase_C19 superfamily C - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." Q#9254 - CGI_10017491 superfamily 243082 1425 1488 8.59E-16 78.0982 cl02553 Peptidase_C19 superfamily N - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." 
Q#9254 - CGI_10017491 superfamily 245879 536 601 8.44E-11 60.5639 cl12116 DUSP superfamily N - DUSP domain; The DUSP (domain present in ubiquitin-specific protease) domain is found at the N-terminus of Ubiquitin-specific proteases. The structure of this domain has been solved. Its tripod-like structure consists of a 3-fold alpha-helical bundle supporting a triple-stranded anti-parallel beta-sheet. Q#9254 - CGI_10017491 superfamily 243082 1276 1361 5.56E-08 53.8306 cl02553 Peptidase_C19 superfamily NC - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." Q#9254 - CGI_10017491 superfamily 245879 395 442 2.25E-05 44.2714 cl12116 DUSP superfamily C - DUSP domain; The DUSP (domain present in ubiquitin-specific protease) domain is found at the N-terminus of Ubiquitin-specific proteases. The structure of this domain has been solved. Its tripod-like structure consists of a 3-fold alpha-helical bundle supporting a triple-stranded anti-parallel beta-sheet. Q#9256 - CGI_10017493 superfamily 245213 39 74 4.18E-05 36.8458 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#9256 - CGI_10017493 superfamily 245213 76 111 6.28E-05 36.4606 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#9257 - CGI_10017494 superfamily 247725 24 134 3.81E-68 207.911 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." 
Q#9258 - CGI_10017495 superfamily 245226 526 693 2.00E-21 92.3636 cl10012 DnaQ_like_exo superfamily - - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#9259 - CGI_10017496 superfamily 243058 534 601 0.00024382 40.3756 cl02500 ARM superfamily N - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. 
ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#9259 - CGI_10017496 superfamily 248054 104 156 2.96E-09 54.4004 cl17500 NAD_binding_8 superfamily N - NAD(P)-binding Rossmann-like domain; NAD(P)-binding Rossmann-like domain. Q#9261 - CGI_10017498 superfamily 242418 351 532 1.82E-31 118.858 cl01298 Glyco_transf_25 superfamily - - "Glycosyltransferase family 25 [lipooligosaccharide (LOS) biosynthesis protein] is a family of glycosyltransferases involved in LOS biosynthesis. The members include the beta(1,4) galactosyltransferases: Lgt2 of Moraxella catarrhalis, LgtB and LgtE of Neisseria gonorrhoeae and Lic2A of Haemophilus influenzae. M. catarrhalis Lgt2 catalyzes the addition of galactose (Gal) to the growing chain of LOS on the cell surface. N. gonorrhoeae LgtB and LgtE link Gal-beta(1,4) to GlcNAc (N-acetylglucosamine) and Glc (glucose), respectively. The genes encoding LgtB and LgtE are two genes of a five gene locus involved in the synthesis of gonococcal LOS. LgtE is believed to perform the first step in LOS biosynthesis." Q#9261 - CGI_10017498 superfamily 245596 47 166 0.000147855 42.2757 cl11394 Glyco_tranf_GTA_type superfamily C - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. 
To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#9263 - CGI_10017501 superfamily 247725 84 175 7.57E-06 42.3464 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#9265 - CGI_10017503 superfamily 243050 54 106 4.10E-21 80.7757 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. 
The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#9266 - CGI_10017504 superfamily 247792 43 88 3.75E-08 50.522 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#9268 - CGI_10017506 superfamily 243092 101 437 3.41E-83 261.501 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat 
to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#9268 - CGI_10017506 superfamily 219730 10 71 1.19E-17 77.5607 cl06962 NLE superfamily - - NLE (NUC135) domain; This domain is located N terminal to WD40 repeats. It is found in the microtubule-associated yeast protein YTM1. Q#9270 - CGI_10017508 superfamily 241825 52 150 5.75E-20 81.0471 cl00379 Ribosomal_L18_L5e superfamily - - "Ribosomal L18/L5e: L18 (L5e) is a ribosomal protein found in the central protuberance (CP) of the large subunit. L18 binds 5S rRNA and induces a conformational change that stimulates the binding of L5 to 5S rRNA. Association of 5S rRNA with 23S rRNA depends on the binding of L18 and L5 to 5S rRNA. L18/L5e is generally described as L18 in prokaryotes and archaea, and as L5e (or L5) in eukaryotes. In bacteria, the CP proteins L5, L18, and L25 are required for the ribosome to incorporate 5S rRNA into the large subunit, one of the last steps in ribosome assembly. In archaea, both L18 and L5 bind 5S rRNA; in eukaryotes, only the L18 homolog (L5e) binds 5S rRNA but a homolog to L5 is also identified." Q#9271 - CGI_10017509 superfamily 241547 231 508 2.65E-71 234.87 cl00012 alpha_CA superfamily - - "Carbonic anhydrase alpha (vertebrate-like) group. Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. 
There are three evolutionary distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidine residues and a fourth conserved histidine plays a potential role in proton transfer." Q#9272 - CGI_10017510 superfamily 202894 126 193 1.65E-20 81.8834 cl04406 Mpv17_PMP22 superfamily - - "Mpv17 / PMP22 family; The 22-kDa peroxisomal membrane protein (PMP22) is a major component of peroxisomal membranes. PMP22 seems to be involved in pore forming activity and may contribute to the unspecific permeability of the organelle membrane. PMP22 is synthesised on free cytosolic ribosomes and then directed to the peroxisome membrane by specific targeting information. Mpv17 is a closely related peroxisomal protein. In mouse, the Mpv17 protein is involved in the development of early-onset glomerulosclerosis. More recently a homolog of Mpv17 in S. cerevisiae has been been found to be an integral membrane protein of the inner mitochondrial membrane where it has been proposed to have a role in ethanol metabolism and tolerance during heat-shock. Defects in MPV17 is associated with mitochondrial DNA depletion syndrome (MDDS) and Navajo neurohepatopathy (NNH). MDDS is a clinically heterogeneous group of disorders characterized by a reduction in mitochondrial DNA (mtDNA) copy number. Primary mtDNA depletion is inherited as an autosomal recessive trait and may affect single organs, typically muscle or liver, or multiple tissues. Individuals with the hepatocerebral form of mitochondrial DNA depletion syndrome have early progressive liver failure and neurologic abnormalities, hypoglycemia, and increased lactate in body fluids. NNH is an autosomal recessive disease that is prevalent among Navajo children in the South Western states of America. 
The major clinical features are hepatopathy, peripheral neuropathy, corneal anesthesia and scarring, acral mutilation, cerebral leukoencephalopathy, failure to thrive, and recurrent metabolic acidosis with intercurrent infections. Infantile, childhood, and classic forms of NNH have been described. Mitochondrial DNA depletion was detected in the livers of patients, suggesting a primary defect in mtDNA maintenance." Q#9273 - CGI_10017511 superfamily 248289 22 82 7.00E-07 47.5108 cl17735 VWC superfamily - - von Willebrand factor type C domain; The high cutoff was used to prevent overlap with pfam00094. Q#9276 - CGI_10017514 superfamily 151060 262 310 1.47E-24 94.926 cl11140 SR-25 superfamily C - "Nuclear RNA-splicing-associated protein; SR-25, otherwise known as ADP-ribosylation factor-like factor 6-interacting protein 4, is expressed in virtually all tissues. At the N-terminus there is a repeat of serine-arginine (SR repeat), and towards the middle of the protein there are clusters of both serines and of basic amino acids. The presence of many nuclear localisation signals strongly implies that this is a nuclear protein that may contribute to RNA splicing. SR-25 is also implicated, along with heat-shock-protein-27, as a mediator in the Rac1 (GTPase ras-related C3 botulinum toxin substrate 1) signalling pathway." Q#9276 - CGI_10017514 superfamily 243703 7 84 1.55E-09 53.7726 cl04309 RNAP_Rpb7_N_like superfamily - - "RNAP_Rpb7_N_like: This conserved domain represents the N-terminal ribonucleoprotein (RNP) domain of the Rpb7 subunit of eukaryotic RNA polymerase (RNAP) II and its homologs, Rpa43 of eukaryotic RNAP I, Rpc25 of eukaryotic RNAP III, and RpoE (subunit E) of archaeal RNAP. These proteins have, in addition to their N-terminal RNP domain, a C-terminal oligonucleotide-binding (OB) domain. Each of these subunits heterodimerizes with another RNAP subunit (Rpb7 to Rpb4, Rpc25 to Rpc17, RpoE to RpoF, and Rpa43 to Rpa14). 
The heterodimer is thought to tether the RNAP to a given promoter via its interactions with a promoter-bound transcription factor.The heterodimer is also thought to bind and position nascent RNA as it exits the polymerase complex." Q#9282 - CGI_10005549 superfamily 193607 90 221 1.66E-75 227.07 cl15237 Deltex_C superfamily - - "Domain found at the C-terminus of deltex-like; The deltex family of proteins is involved in the regulation of Notch signaling, and therefore may play roles in cell-to-cell communications that regulate mechanisms determining cell fate. They have a central RING-type zinc finger domain and contain a C-terminal domain, described here, that is also found in other domain architectures. Deltex-1 (DTX1) contains a RING finger and two WWE domains, indicating that it may be an E3 ubiquitin ligase. Human deltex 3-like, which contains an additional N-terminal domain (presumably with ubiquitin ligase activity) is also described as E3 ubiquitin-protein ligase DTX3L, B-lymphoma- and BAL-associated protein (BBAP), or rhysin-2. DTX3L mediates monoubiquitination of K91 of histone H4 in response to DNA damage." Q#9282 - CGI_10005549 superfamily 247792 43 81 3.60E-08 48.2108 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#9283 - CGI_10005550 superfamily 241589 427 549 1.13E-34 126.98 cl00071 GLECT superfamily - - "Galectin/galactose-binding lectin. 
This domain exclusively binds beta-galactosides, such as lactose, and does not require metal ions for activity. GLECT domains occur as homodimers or tandemly repeated domains. They are developmentally regulated and may be involved in differentiation, cell-cell interaction and cellular regulation." Q#9286 - CGI_10003017 superfamily 192389 14 199 1.23E-46 160.954 cl10781 SNAPc_SNAP43 superfamily - - "Small nuclear RNA activating complex (SNAPc), subunit SNAP43; Members of this family are part of the SNAPc complex required for the transcription of both RNA polymerase II and III small-nuclear RNA genes. They bind to the proximal sequence element (PSE), a non-TATA-box basal promoter element common to these 2 types of genes. Furthermore, they also recruit TBP and BRF2 to the U6 snRNA TATA box." Q#9287 - CGI_10003019 superfamily 244859 46 270 1.53E-15 74.12 cl08171 HtrL_YibB superfamily - - "Bacterial protein of unknown function (HtrL_YibB); The protein from this rare, uncharacterized protein family is designated HtrL or YibB in E. coli, where its gene is found in a region of LPS core biosynthesis genes. Homologues are found in Shigella flexneri, Campylobacter jejuni, and Caenorhabditis elegans only. The htrL gene may represent an insertion to the LPS core biosynthesis region, rather than an LPS biosynthetic protein." 
Q#9289 - CGI_10003083 superfamily 247905 203 336 4.85E-15 71.8852 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#9289 - CGI_10003083 superfamily 221155 406 510 1.61E-06 46.592 cl13152 RIG-I_C-RD superfamily - - "C-terminal domain of RIG-I; This family of proteins represents the regulatory domain RD of RIG-I, a protein which initiates a signalling cascade that provides essential antiviral protection for the host. The RD domain binds viral RNA, activating the RIG-I ATPase by RNA-dependant dimerisation. The structure of RD contains a zinc-binding domain and is thought to confer ligand specificity." Q#9290 - CGI_10003085 superfamily 243040 22 144 4.94E-58 184.869 cl02447 CRD_FZ superfamily - - "CRD_domain cysteine-rich domain, also known as Fz (frizzled) domain; CRD_FZ is an essential component of a number of cell surface receptors, which are involved in multiple signal transduction pathways, particularly in modulating the activity of the Wnt proteins, which play a fundamental role in the early development of metazoans. CRD is also found in secreted frizzled related proteins (SFRPs), which lack the transmembrane segment found in the frizzled protein. 
The CRD domain is also present in the alpha-1 chain of mouse type XVIII collagen, in carboxypeptidase Z, several receptor tyrosine kinases, and the mosaic transmembrane serine protease corin. The CRD domain is well conserved in metazoans - 10 frizzled proteins have been identified in mammals, 4 in Drosophila and 3 in Caenorhabditis elegans. CRD domains have also been identified in multiple tandem copies in a Dictyostelium discoideum protein. Very little is known about the mechanism by which CRD domains interact with their ligands. The domain contains 10 conserved cysteines." Q#9291 - CGI_10003086 superfamily 244897 91 385 1.29E-25 105.261 cl08298 PTZ00007 superfamily - - (NAP-L) nucleosome assembly protein -L; Provisional Q#9294 - CGI_10006717 superfamily 219542 18 124 2.09E-40 143.151 cl18517 Cu-oxidase_3 superfamily - - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. Q#9294 - CGI_10006717 superfamily 215896 141 322 5.83E-18 81.1872 cl18351 Cu-oxidase superfamily - - Multicopper oxidase; Many of the proteins in this family contain multiple similar copies of this plastocyanin-like domain. Q#9295 - CGI_10006718 superfamily 241832 7 79 4.22E-16 70.2704 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). 
Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#9295 - CGI_10006718 superfamily 243175 90 214 2.76E-13 63.0254 cl02776 GST_C_family superfamily - - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. 
Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." Q#9296 - CGI_10006719 superfamily 247724 131 333 8.08E-07 48.2216 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#9297 - CGI_10006720 superfamily 245201 18 189 4.48E-17 75.6508 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. 
These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#9298 - CGI_10006721 superfamily 245226 1134 1307 5.46E-96 305.695 cl10012 DnaQ_like_exo superfamily - - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#9298 - CGI_10006721 superfamily 243082 543 627 1.28E-27 114.532 cl02553 Peptidase_C19 superfamily C - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquinated peptides by cleavage of isopeptide bonds. 
They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." Q#9298 - CGI_10006721 superfamily 243082 1005 1063 1.13E-05 47.5072 cl02553 Peptidase_C19 superfamily N - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." Q#9298 - CGI_10006721 superfamily 243082 810 888 3.64E-05 45.5812 cl02553 Peptidase_C19 superfamily NC - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." 
Q#9298 - CGI_10006721 superfamily 243092 242 347 0.000217639 43.4776 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#9300 - CGI_10006723 superfamily 219502 364 610 6.63E-50 173.399 cl06625 Nucleos_tra2_C superfamily - - Na+ dependent nucleoside transporter C-terminus; This family consists of nucleoside transport proteins. Rat CNT 2 is a purine-specific Na+-nucleoside cotransporter localised to the bile canalicular membrane. CNT 1 is a Na+-dependent nucleoside transporter selective for pyrimidine nucleosides and adenosine; it also transports the anti-viral nucleoside analogues AZT and ddC. This alignment covers the C-terminus of this family of transporters. 
Q#9300 - CGI_10006723 superfamily 201962 182 253 3.06E-16 74.7196 cl03347 Nucleos_tra2_N superfamily - - Na+ dependent nucleoside transporter N-terminus; This family consists of nucleoside transport proteins. Rat CNT 2 is a purine-specific Na+-nucleoside cotransporter localised to the bile canalicular membrane. Rat CNT 1 is a Na+-dependent nucleoside transporter selective for pyrimidine nucleosides and adenosine; it also transports the anti-viral nucleoside analogues AZT and ddC. This alignment covers the N terminus of this family. Q#9300 - CGI_10006723 superfamily 219507 264 359 3.41E-13 66.4939 cl18514 Gate superfamily - - "Nucleoside recognition; This region in the nucleoside transporter proteins is responsible for determining nucleoside specificity in the human CNT1 and CNT2 proteins. In the FeoB proteins, which are believed to be Fe2+ transporters, it includes the membrane pore region, so the function of this region is likely to be more general than just nucleoside specificity. This family may represent the pore and gate, with a wide potential range of specificity. Hence its name 'Gate'." Q#9302 - CGI_10006725 superfamily 248097 130 240 7.39E-14 67.6754 cl17543 C1q superfamily C - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#9302 - CGI_10006725 superfamily 248097 268 378 1.15E-10 58.4306 cl17543 C1q superfamily C - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#9303 - CGI_10006726 superfamily 248097 99 234 1.31E-19 81.5426 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system.
Q#9304 - CGI_10006727 superfamily 241599 204 255 1.04E-10 58.4089 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#9304 - CGI_10006727 superfamily 202226 100 188 2.25E-11 60.7698 cl08348 CUT superfamily - - "CUT domain; The CUT domain is a DNA-binding motif which can bind independently or in cooperation with the homeodomain, often found downstream of the CUT domain. Multiple copies of the CUT domain can exist in one protein ." Q#9305 - CGI_10006728 superfamily 205121 463 487 2.38E-06 45.958 cl18263 zf-met superfamily - - "Zinc-finger of C2H2 type; This is a zinc-finger domain with the CxxCx(12)Hx(6)H motif, found in multiple copies in a wide range of proteins from plants to metazoans. Some member proteins, particularly those from plants, are annotated as being RNA-binding." Q#9307 - CGI_10019525 superfamily 243092 37 78 2.15E-07 46.9444 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to 
create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#9308 - CGI_10019526 superfamily 243092 268 584 1.91E-18 85.4644 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#9308 - CGI_10019526 superfamily 243092 1 341 2.33E-08 55.0336 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#9310 - CGI_10019528 superfamily 243073 225 266 1.16E-08 49.7761 cl02533 SOCS superfamily - - "SOCS (suppressors of cytokine signaling) box. The SOCS box is found in the C-terminal region of CIS/SOCS family proteins (in combination with a SH2 domain), ASBs (ankyrin repeat-containing proteins with a SOCS box), SSBs (SPRY domain-containing proteins with a SOCS box), and WSBs (WD40 repeat-containing proteins with a SOCS box), as well as, other miscellaneous proteins. The function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. 
Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions." Q#9310 - CGI_10019528 superfamily 246908 105 188 1.65E-16 71.8475 cl15255 SH2 superfamily - - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." Q#9311 - CGI_10019529 superfamily 243072 34 193 1.00E-31 116.714 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#9311 - CGI_10019529 superfamily 243073 320 355 6.06E-08 48.6205 cl02533 SOCS superfamily - - "SOCS (suppressors of cytokine signaling) box. The SOCS box is found in the C-terminal region of CIS/SOCS family proteins (in combination with a SH2 domain), ASBs (ankyrin repeat-containing proteins with a SOCS box), SSBs (SPRY domain-containing proteins with a SOCS box), and WSBs (WD40 repeat-containing proteins with a SOCS box), as well as, other miscellaneous proteins. 
The function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions." Q#9312 - CGI_10019530 superfamily 243072 294 419 1.20E-42 155.234 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#9312 - CGI_10019530 superfamily 243072 492 617 3.81E-41 150.612 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#9312 - CGI_10019530 superfamily 243072 562 683 5.30E-41 150.227 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#9312 - CGI_10019530 superfamily 243072 657 782 9.14E-41 149.841 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#9312 - CGI_10019530 superfamily 243072 66 191 2.47E-40 148.301 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#9312 - CGI_10019530 superfamily 243072 393 518 3.36E-40 147.915 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#9312 - CGI_10019530 superfamily 243072 228 353 4.23E-40 147.915 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#9312 - CGI_10019530 superfamily 194336 1221 1325 1.67E-26 107.822 cl02517 ZU5 superfamily - - ZU5 domain; Domain present in ZO-1 and Unc5-like netrin receptors Domain of unknown function. Q#9312 - CGI_10019530 superfamily 246680 1714 1795 2.40E-19 86.1856 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#9314 - CGI_10019532 superfamily 247919 58 204 0.00385706 40.7366 cl17365 TrkH superfamily C - Cation transport protein; This family consists of various cation transport proteins (Trk) and V-type sodium ATP synthase subunit J or translocating ATPase J EC:3.6.1.34. These proteins are involved in active sodium up-take utilising ATP in the process. TrkH a member of the family from E. coli is a hydrophobic membrane protein and determines the specificity and kinetics of cation transport by the TrK system in E. coli. 
Q#9316 - CGI_10019534 superfamily 241568 107 144 2.64E-06 44.7612 cl00043 CCP superfamily C - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#9316 - CGI_10019534 superfamily 241568 170 218 8.76E-06 43.2204 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#9316 - CGI_10019534 superfamily 241568 244 283 0.000658836 37.4424 cl00043 CCP superfamily N - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. 
The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#9316 - CGI_10019534 superfamily 241568 42 82 0.00660425 34.3608 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#9323 - CGI_10019541 superfamily 241628 18 170 7.45E-23 97.7932 cl00130 PseudoU_synth superfamily C - "Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi); Pseudouridine synthases contains the RsuA/RluD, TruA, TruB and TruD families. This group consists of eukaryotic, bacterial and archeal pseudouridine synthases. Some psi sites such as psi55,13,38 and 39 in tRNA are highly conserved, being in the same position in eubacteria, archeabacteria and eukaryotes. Other psi sites occur in a more restricted fashion, for example psi2604in 23S RNA made by E.coli RluF has only been detected in E.coli. Human dyskerin with the help of guide RNAs makes the hundreds of psueudouridnes present in rRNA and small nuclear RNAs (snRNAs). Mutations in human dyskerin cause X-linked dyskeratosis congenitas. 
Missense mutation in human PUS1 causes mitochondrial myopathy and sideroblastic anemia (MLASA)." Q#9323 - CGI_10019541 superfamily 243090 292 405 3.11E-26 105.62 cl02565 RGS superfamily - - "Regulator of G protein signaling (RGS) domain superfamily; The RGS domain is an essential part of the Regulator of G-protein Signaling (RGS) protein family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play critical regulatory roles as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. While inactive, G-alpha-subunits bind GDP, which is released and replaced by GTP upon agonist activation. GTP binding leads to dissociation of the alpha-subunit and the beta-gamma-dimer, allowing them to interact with effectors molecules and propagate signaling cascades associated with cellular growth, survival, migration, and invasion. Deactivation of the G-protein signaling controlled by the RGS domain accelerates GTPase activity of the alpha subunit by hydrolysis of GTP to GDP, which results in the reassociation of the alpha-subunit with the beta-gamma-dimer and thereby inhibition of downstream activity. As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. RGS proteins are also involved in apoptosis and cell proliferation, as well as modulation of cardiac development. Several RGS proteins can fine-tune immune responses, while others play important roles in neuronal signals modulation. Some RGS proteins are principal elements needed for proper vision." Q#9323 - CGI_10019541 superfamily 198670 978 1057 6.02E-20 86.5121 cl02426 DIX superfamily - - DIX domain; The DIX domain is present in Dishevelled and axin. 
This domain is involved in homo- and hetero-oligomerisation. It is involved in the homo-oligomerisation of mouse axin. The axin DIX domain also interacts with the dishevelled DIX domain. The DIX domain has also been called the DAX domain. Q#9323 - CGI_10019541 superfamily 211424 171 271 2.00E-05 43.9616 cl16934 Axin_TNKS_binding superfamily - - "Tankyrase binding N-terminal segment of axin; This N-terminal region of axin mediates interactions with the ankyrin-repeat clusters 2 and 3 of tankyrase, which controls the turnover of axin via poly-ADP-ribosylation. Axin functions as a negative regulator of the WNT signaling pathway." Q#9323 - CGI_10019541 superfamily 220035 632 675 0.00249652 37.0907 cl07441 Axin_b-cat_bind superfamily - - Axin beta-catenin binding domain; This domain is found on the scaffolding protein Axin which is a component of the beta-catenin destruction complex. It competes with the tumour suppressor adenomatous polyposis coli protein (APC) for binding to beta-catenin. Q#9324 - CGI_10019542 superfamily 247856 35 90 5.37E-08 46.3869 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#9324 - CGI_10019542 superfamily 247856 66 124 6.43E-08 46.0017 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers."
Q#9324 - CGI_10019542 superfamily 247856 1 60 6.93E-08 46.0017 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#9325 - CGI_10019543 superfamily 242611 204 467 1.56E-161 477.406 cl01629 TPP_enzymes superfamily - - "Thiamine pyrophosphate (TPP) enzyme family, TPP-binding module; found in many key metabolic enzymes which use TPP (also known as thiamine diphosphate) as a cofactor. These enzymes include, among others, the E1 components of the pyruvate, the acetoin and the branched chain alpha-keto acid dehydrogenase complexes." Q#9325 - CGI_10019543 superfamily 245606 585 806 2.66E-37 138.808 cl11410 TPP_enzyme_PYR superfamily - - "Pyrimidine (PYR) binding domain of thiamine pyrophosphate (TPP)-dependent enzymes; Thiamine pyrophosphate (TPP) family, pyrimidine (PYR) binding domain; found in many key metabolic enzymes which use TPP (also known as thiamine diphosphate) as a cofactor. TPP binds in the cleft formed by a PYR domain and a PP domain. The PYR domain, binds the aminopyrimidine ring of TPP, the PP domain binds the diphosphate residue. A polar interaction between the conserved glutamate of the PYR domain and the N1' of the TPP aminopyrimidine ring is shared by most TPP-dependent enzymes, and participates in the activation of TPP. The PYR and PP domains have a common fold, but do not share strong sequence conservation. The PP domain is not included in this group. Most TPP-dependent enzymes have the PYR and PP domains on the same subunit although these domains can be alternatively arranged in the primary structure. 
In the case of 2-oxoisovalerate dehydrogenase (2OXO), sulfopyruvate decarboxylase (ComDE), and the E1 component of human pyruvate dehydrogenase complex (E1- PDHc) the PYR and PP domains appear on different subunits. TPP-dependent enzymes are multisubunit proteins, the smallest catalytic unit being a dimer-of-active sites. For many of these enzymes the active sites lie between PP and PYR domains on different subunits. However, for the homodimeric enzymes 1-deoxy-D-xylulose 5-phosphate synthase (DXS) and Desulfovibrio africanus pyruvate:ferredoxin oxidoreductase (PFOR), each active site lies at the interface of the PYR and PP domains from the same subunit." Q#9326 - CGI_10019544 superfamily 247683 330 385 8.05E-20 85.4516 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#9326 - CGI_10019544 superfamily 248022 642 1020 7.49E-68 233.324 cl17468 Aa_trans superfamily - - "Transmembrane amino acid transporter protein; This transmembrane region is found in many amino acid transporters including UNC-47 and MTR. 
UNC-47 encodes a vesicular amino butyric acid (GABA) transporter, (VGAT). UNC-47 is predicted to have 10 transmembrane domains. MTR is a N system amino acid transporter system protein involved in methyltryptophan resistance. Other members of this family include proline transporters and amino acid permeases." Q#9326 - CGI_10019544 superfamily 245835 5 242 4.38E-51 180.254 cl12013 BAR superfamily - - "The Bin/Amphiphysin/Rvs (BAR) domain, a dimerization module that binds membranes and detects membrane curvature; BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions including organelle biogenesis, membrane trafficking or remodeling, and cell division and migration. Mutations in BAR containing proteins have been linked to diseases and their inactivation in cells leads to altered membrane dynamics. A BAR domain with an additional N-terminal amphipathic helix (an N-BAR) can drive membrane curvature. These N-BAR domains are found in amphiphysins and endophilins, among others. BAR domains are also frequently found alongside domains that determine lipid specificity, such as the Pleckstrin Homology (PH) and Phox Homology (PX) domains which are present in beta centaurins (ACAPs and ASAPs) and sorting nexins, respectively. A FES-CIP4 Homology (FCH) domain together with a coiled coil region is called the F-BAR domain and is present in Pombe/Cdc15 homology (PCH) family proteins, which include Fes/Fes tyrosine kinases, PACSIN or syndapin, CIP4-like proteins, and srGAPs, among others. The Inverse (I)-BAR or IRSp53/MIM homology Domain (IMD) is found in multi-domain proteins, such as IRSp53 and MIM, that act as scaffolding proteins and transducers of a variety of signaling pathways that link membrane dynamics and the underlying actin cytoskeleton. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. 
The I-BAR domain induces membrane protrusions in the opposite direction compared to classical BAR and F-BAR domains, which produce membrane invaginations. BAR domains that also serve as protein interaction domains include those of arfaptin and OPHN1-like proteins, among others, which bind to Rac and Rho GAP domains, respectively." Q#9327 - CGI_10019545 superfamily 216686 143 334 3.94E-48 163.263 cl18377 Galactosyl_T superfamily - - "Galactosyltransferase; This family includes the galactosyltransferases UDP-galactose:2-acetamido-2-deoxy-D-glucose3beta-galactosyltransferase and UDP-Gal:beta-GlcNAc beta 1,3-galactosyltranferase. Specific galactosyltransferases transfer galactose to GlcNAc terminal chains in the synthesis of the lacto-series oligosaccharides types 1 and 2." Q#9328 - CGI_10019546 superfamily 213147 405 523 1.77E-31 120.433 cl17040 ADDz superfamily - - "ADDz for ATRX, Dnmt3 and Dnmt3l PHD-like zinc finger domain; The ADDz zinc finger domain is present in the chromatin-associated proteins cytosine-5-methyltransferase 3 (Dnmt3) and ATRX, a SNF2 type transcription factor protein. The Dnmt3 family includes two active DNA methyltransferases, Dnmt3a and -3b, and one regulatory factor Dnmt3l. DNA methylation is an important epigenetic mechanism involved in diverse biological processes such as embryonic development, gene expression, and genomic imprinting. The ADDz domain is a PHD-like zinc finger motif that contains two parts, a C2-C2 and a PHD-like zinc finger. PHD zinc finger domains have been identified in more than 40 proteins that are mainly involved in chromatin mediated transcriptional control; the classical PHD zinc finger has a C4-H-C3 motif that spans about 50-80 amino acids. In ADDz, the conserved histidine residue of the PHD finger is replaced by a cysteine, and an additional zinc finger C2-C2 like motif is located about twenty residues upstream of the C4-C-C3 motif." 
Q#9328 - CGI_10019546 superfamily 238192 549 814 3.48E-15 75.3494 cl18939 Cyt_C5_DNA_methylase superfamily - - "Cytosine-C5 specific DNA methylases; Methyl transfer reactions play an important role in many aspects of biology. Cytosine-specific DNA methylases are found both in prokaryotes and eukaryotes. DNA methylation, or the covalent addition of a methyl group to cytosine within the context of the CpG dinucleotide, has profound effects on the mammalian genome. These effects include transcriptional repression via inhibition of transcription factor binding or the recruitment of methyl-binding proteins and their associated chromatin remodeling factors, X chromosome inactivation, imprinting and the suppression of parasitic DNA sequences. DNA methylation is also essential for proper embryonic development and is an important player in both DNA repair and genome stability." Q#9328 - CGI_10019546 superfamily 243083 236 323 1.18E-07 50.4452 cl02554 PWWP superfamily - - "The PWWP domain, named for a conserved Pro-Trp-Trp-Pro motif, is a small domain consisting of 100-150 amino acids. The PWWP domain is found in numerous proteins that are involved in cell division, growth and differentiation. Most PWWP-domain proteins seem to be nuclear, often DNA-binding, proteins that function as transcription factors regulating a variety of developmental processes. The function of the PWWP domain is still not known precisely; however, based on the fact that other regions of PWWP-domain proteins are responsible for nuclear localization and DNA-binding, is likely that the PWWP domain acts as a site for protein-protein binding interactions, influencing chromatin remodeling and thereby regulating transcriptional processes. Some PWWP-domain proteins have been linked to cancer or other diseases; some are known to function as growth factors." 
Q#9331 - CGI_10019549 superfamily 241564 17 74 2.01E-18 72.2983 cl00035 BIR superfamily C - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." Q#9332 - CGI_10019550 superfamily 222150 138 163 0.000197814 40.4529 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#9332 - CGI_10019550 superfamily 222150 342 366 0.000293849 40.0677 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#9332 - CGI_10019550 superfamily 222150 111 135 0.00166708 37.7565 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#9332 - CGI_10019550 superfamily 222150 313 337 0.00236958 37.3713 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#9334 - CGI_10019552 superfamily 245210 4 403 7.52E-102 337.223 cl09938 cond_enzymes superfamily - - "Condensing enzymes; Family of enzymes that catalyze a (decarboxylating or non-decarboxylating) Claisen-like condensation reaction. Members are share strong structural similarity, and are involved in the synthesis and degradation of fatty acids, and the production of polyketides, a diverse group of natural products." Q#9334 - CGI_10019552 superfamily 243109 2505 2692 2.70E-81 268.701 cl02614 SPRY superfamily - - "SPRY domain; SPRY domains, first identified in the SP1A kinase of Dictyostelium and rabbit Ryanodine receptor (hence the name), are homologous to B30.2. 
SPRY domains have been identified in at least 11 protein families, covering a wide range of functions, including regulation of cytokine signaling (SOCS), RNA metabolism (DDX1 and hnRNP), immunity to retroviruses (TRIM5alpha), intracellular calcium release (ryanodine receptors or RyR) and regulatory and developmental processes (HERC1 and Ash2L). B30.2 also contains residues in the N-terminus that form a distinct PRY domain structure; i.e. B30.2 domain consists of PRY and SPRY subdomains. B30.2 domains comprise the C-terminus of three protein families: BTNs (receptor glycoproteins of immunoglobulin superfamily); several TRIM proteins (composed of RING/B-box/coiled-coil or RBCC core); Stonustoxin (secreted poisonous protein of the stonefish Synanceia horrida). While SPRY domains are evolutionarily ancient, B30.2 domains are a more recent adaptation where the SPRY/PRY combination is a possible component of immune defense. Mutations found in the SPRY-containing proteins have been shown to cause Mediterranean fever and Opitz syndrome." Q#9334 - CGI_10019552 superfamily 245206 1868 2086 4.51E-75 260.844 cl09931 NADB_Rossmann superfamily N - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. 
Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#9334 - CGI_10019552 superfamily 247637 1569 1841 2.99E-34 136.16 cl16912 MDR superfamily - - "Medium chain reductase/dehydrogenase (MDR)/zinc-dependent alcohol dehydrogenase-like family; The medium chain reductase/dehydrogenases (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH. MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P) binding-Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES. The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH), quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones. ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines. Other MDR members have only a catalytic zinc, and some contain no coordinated zinc." 
Q#9334 - CGI_10019552 superfamily 244888 490 798 1.53E-77 263.118 cl08282 Acyl_transf_1 superfamily - - Acyl transferase domain; Acyl transferase domain. Q#9334 - CGI_10019552 superfamily 245206 1327 1535 2.40E-15 80.1856 cl09931 NADB_Rossmann superfamily C - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#9334 - CGI_10019552 superfamily 214837 857 1009 2.53E-08 54.925 cl11739 PKS_DH superfamily - - Dehydratase domain in polyketide synthase (PKS) enzymes; Dehydratase domain in polyketide synthase (PKS) enzymes. Q#9334 - CGI_10019552 superfamily 245209 2102 2171 1.41E-07 51.8673 cl09936 PP-binding superfamily - - Phosphopantetheine attachment site; A 4'-phosphopantetheine prosthetic group is attached through a serine. This prosthetic group acts as a 'swinging arm' for the attachment of activated fatty acid and amino-acid groups. This domain forms a four helix bundle. This family includes members not included in Prosite. The inclusion of these members is supported by sequence analysis and functional evidence. The related domain of Vibrio anguillarum angR has the attachment serine replaced by an alanine. 
Q#9335 - CGI_10019553 superfamily 245835 169 383 2.44E-144 413.259 cl12013 BAR superfamily - - "The Bin/Amphiphysin/Rvs (BAR) domain, a dimerization module that binds membranes and detects membrane curvature; BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions including organelle biogenesis, membrane trafficking or remodeling, and cell division and migration. Mutations in BAR containing proteins have been linked to diseases and their inactivation in cells leads to altered membrane dynamics. A BAR domain with an additional N-terminal amphipathic helix (an N-BAR) can drive membrane curvature. These N-BAR domains are found in amphiphysins and endophilins, among others. BAR domains are also frequently found alongside domains that determine lipid specificity, such as the Pleckstrin Homology (PH) and Phox Homology (PX) domains which are present in beta centaurins (ACAPs and ASAPs) and sorting nexins, respectively. A FES-CIP4 Homology (FCH) domain together with a coiled coil region is called the F-BAR domain and is present in Pombe/Cdc15 homology (PCH) family proteins, which include Fes/Fes tyrosine kinases, PACSIN or syndapin, CIP4-like proteins, and srGAPs, among others. The Inverse (I)-BAR or IRSp53/MIM homology Domain (IMD) is found in multi-domain proteins, such as IRSp53 and MIM, that act as scaffolding proteins and transducers of a variety of signaling pathways that link membrane dynamics and the underlying actin cytoskeleton. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The I-BAR domain induces membrane protrusions in the opposite direction compared to classical BAR and F-BAR domains, which produce membrane invaginations. 
BAR domains that also serve as protein interaction domains include those of arfaptin and OPHN1-like proteins, among others, which bind to Rac and Rho GAP domains, respectively." Q#9335 - CGI_10019553 superfamily 241622 46 122 4.44E-15 70.2882 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95 (post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#9338 - CGI_10019557 superfamily 243033 417 535 6.53E-05 41.9202 cl02428 Ependymin superfamily - - Ependymin; Ependymin. Q#9339 - CGI_10019558 superfamily 238191 25 501 4.40E-117 356.257 cl18907 Esterase_lipase superfamily - - "Esterases and lipases (includes fungal lipases, cholinesterases, etc.) These enzymes act on carboxylic esters (EC: 3.1.1.-). The catalytic apparatus involves three residues (catalytic triad): a serine, a glutamate or aspartate and a histidine. These catalytic residues are responsible for the nucleophilic attack on the carbonyl carbon atom of the ester bond. 
In contrast with other alpha/beta hydrolase fold family members, p-nitrobenzyl esterase and acetylcholine esterase have a Glu instead of Asp at the active site carboxylate." Q#9340 - CGI_10019559 superfamily 238191 543 1019 2.72E-104 337.382 cl18907 Esterase_lipase superfamily - - "Esterases and lipases (includes fungal lipases, cholinesterases, etc.) These enzymes act on carboxylic esters (EC: 3.1.1.-). The catalytic apparatus involves three residues (catalytic triad): a serine, a glutamate or aspartate and a histidine. These catalytic residues are responsible for the nucleophilic attack on the carbonyl carbon atom of the ester bond. In contrast with other alpha/beta hydrolase fold family members, p-nitrobenzyl esterase and acetylcholine esterase have a Glu instead of Asp at the active site carboxylate." Q#9340 - CGI_10019559 superfamily 238191 45 512 6.41E-101 328.137 cl18907 Esterase_lipase superfamily - - "Esterases and lipases (includes fungal lipases, cholinesterases, etc.) These enzymes act on carboxylic esters (EC: 3.1.1.-). The catalytic apparatus involves three residues (catalytic triad): a serine, a glutamate or aspartate and a histidine. These catalytic residues are responsible for the nucleophilic attack on the carbonyl carbon atom of the ester bond. In contrast with other alpha/beta hydrolase fold family members, p-nitrobenzyl esterase and acetylcholine esterase have a Glu instead of Asp at the active site carboxylate." Q#9341 - CGI_10019560 superfamily 238191 1 110 5.30E-10 55.8012 cl18907 Esterase_lipase superfamily N - "Esterases and lipases (includes fungal lipases, cholinesterases, etc.) These enzymes act on carboxylic esters (EC: 3.1.1.-). The catalytic apparatus involves three residues (catalytic triad): a serine, a glutamate or aspartate and a histidine. These catalytic residues are responsible for the nucleophilic attack on the carbonyl carbon atom of the ester bond. 
In contrast with other alpha/beta hydrolase fold family members, p-nitrobenzyl esterase and acetylcholine esterase have a Glu instead of Asp at the active site carboxylate." Q#9342 - CGI_10019561 superfamily 247057 776 837 3.39E-16 74.8232 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#9344 - CGI_10009108 superfamily 110440 113 140 1.10E-05 42.3949 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase); proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#9347 - CGI_10009111 superfamily 246669 157 281 9.33E-86 261.816 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. 
Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#9347 - CGI_10009111 superfamily 246669 385 483 7.77E-56 184.527 cl14603 C2 superfamily N - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#9347 - CGI_10009111 superfamily 246669 292 372 6.52E-41 144.081 cl14603 C2 superfamily C - "C2 domain; The C2 domain was first identified in PKC. 
C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#9348 - CGI_10009112 superfamily 241594 508 837 5.57E-119 368.431 cl00077 HECTc superfamily - - "HECT domain; C-terminal catalytic domain of a subclass of Ubiquitin-protein ligase (E3). It binds specific ubiquitin-conjugating enzymes (E2), accepts ubiquitin from E2, transfers ubiquitin to substrate lysine side chains, and transfers additional ubiquitin molecules to the end of growing ubiquitin chains." Q#9348 - CGI_10009112 superfamily 243072 56 181 3.75E-38 139.441 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#9349 - CGI_10009113 superfamily 243040 29 150 2.74E-84 260.472 cl02447 CRD_FZ superfamily - - "CRD_domain cysteine-rich domain, also known as Fz (frizzled) domain; CRD_FZ is an essential component of a number of cell surface receptors, which are involved in multiple signal transduction pathways, particularly in modulating the activity of the Wnt proteins, which play a fundamental role in the early development of metazoans. CRD is also found in secreted frizzled related proteins (SFRPs), which lack the transmembrane segment found in the frizzled protein. The CRD domain is also present in the alpha-1 chain of mouse type XVIII collagen, in carboxypeptidase Z, several receptor tyrosine kinases, and the mosaic transmembrane serine protease corin. The CRD domain is well conserved in metazoans - 10 frizzled proteins have been identified in mammals, 4 in Drosophila and 3 in Caenorhabditis elegans. CRD domains have also been identified in multiple tandem copies in a Dictyostelium discoideum protein. Very little is known about the mechanism by which CRD domains interact with their ligands. The domain contains 10 conserved cysteines." Q#9356 - CGI_10009137 superfamily 245814 58 119 4.13E-06 44.0171 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#9356 - CGI_10009137 superfamily 245814 257 309 1.33E-05 42.4763 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#9362 - CGI_10009143 superfamily 217473 96 319 1.46E-28 115.925 cl03978 Mab-21 superfamily - - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#9364 - CGI_10004288 superfamily 191867 3 179 1.99E-102 309.114 cl06740 FTCD_N superfamily - - "Formiminotransferase domain, N-terminal subdomain; The formiminotransferase (FT) domain of formiminotransferase- cyclodeaminase (FTCD) forms a homodimer, and each protomer comprises two subdomains. The N-terminal subdomain is made up of a six-stranded mixed beta-pleated sheet and five alpha helices, which are arranged on the external surface of the beta sheet. This, in turn, faces the beta-sheet of the C-terminal subdomain to form a double beta-sheet layer. The two subdomains are separated by a short linker sequence, which is not thought to be any more flexible than the remainder of the molecule. The substrate is predicted to form a number of contacts with residues found in both the N-terminal and C-terminal subdomains." Q#9364 - CGI_10004288 superfamily 202490 181 335 6.40E-70 223.657 cl03808 FTCD superfamily - - Formiminotransferase domain; Formiminotransferase domain. 
Q#9364 - CGI_10004288 superfamily 242449 387 523 4.60E-45 157.724 cl01350 FTCD_C superfamily N - Formiminotransferase-cyclodeaminase; Members of this family are thought to be Formiminotransferase- cyclodeaminase enzymes EC:4.3.1.4. This domain is found in the C-terminus of the bifunctional animal members of the family. Q#9365 - CGI_10004289 superfamily 243083 270 351 2.40E-33 122.897 cl02554 PWWP superfamily - - "The PWWP domain, named for a conserved Pro-Trp-Trp-Pro motif, is a small domain consisting of 100-150 amino acids. The PWWP domain is found in numerous proteins that are involved in cell division, growth and differentiation. Most PWWP-domain proteins seem to be nuclear, often DNA-binding, proteins that function as transcription factors regulating a variety of developmental processes. The function of the PWWP domain is still not known precisely; however, based on the fact that other regions of PWWP-domain proteins are responsible for nuclear localization and DNA-binding, it is likely that the PWWP domain acts as a site for protein-protein binding interactions, influencing chromatin remodeling and thereby regulating transcriptional processes. Some PWWP-domain proteins have been linked to cancer or other diseases; some are known to function as growth factors." Q#9365 - CGI_10004289 superfamily 243084 154 256 5.48E-14 68.9463 cl02556 Bromodomain superfamily - - Bromodomain. Bromodomains are found in many chromatin-associated proteins and in nuclear histone acetyltransferases. They interact specifically with acetylated lysine. Q#9366 - CGI_10004290 superfamily 248458 369 540 6.35E-09 56.9385 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. 
MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#9366 - CGI_10004290 superfamily 248458 94 240 9.84E-06 46.5381 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. 
Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#9368 - CGI_10007524 superfamily 243540 47 271 1.65E-22 92.696 cl03831 HlyIII superfamily - - "Haemolysin-III related; Members of this family are integral membrane proteins. This family includes a protein with hemolytic activity from Bacillus cereus. It has been proposed that YOL002c encodes a Saccharomyces cerevisiae protein that plays a key role in metabolic pathways that regulate lipid and phosphate metabolism. In eukaryotes, members are seven-transmembrane pass molecules found to encode functional receptors with a broad range of apparent ligand specificities, including progestin and adipoQ receptors, and hence have been named PAQR proteins. The mammalian members include progesterone binding proteins. Unlike the case with GPCR receptor proteins, the evolutionary ancestry of the members of this family can be traced back to the Archaea." 
Q#9370 - CGI_10005864 superfamily 247755 2053 2277 3.97E-110 352.192 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#9370 - CGI_10005864 superfamily 247755 1026 1245 6.40E-105 337.17 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." 
Q#9370 - CGI_10005864 superfamily 243092 56 292 1.30E-28 118.977 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#9373 - CGI_10005867 superfamily 245820 27 451 0 555.72 cl11970 PriL superfamily - - "Archaeal/eukaryotic core primase: Large subunit, PriL; Primases synthesize the RNA primers required for DNA replication. Primases are grouped into two classes, bacteria/bacteriophage and archaeal/eukaryotic. The proteins in the two classes differ in structure and the replication apparatus components. The DNA replication machinery of archaeal organisms contains only the core primase, a simpler arrangement compared to eukaryotes. Archaeal/eukaryotic core primase is a heterodimeric enzyme consisting of a small catalytic subunit (PriS) and a large subunit (PriL). 
Although the catalytic activity resides within PriS, the PriL subunit is essential for primase function as disruption of the PriL gene in yeast is lethal. PriL is composed of two structural domains. Several functions have been proposed for PriL, such as the stabilization of PriS, involvement in the initiation of synthesis, the improvement of primase processivity, and the determination of product size." Q#9375 - CGI_10005869 superfamily 245814 24 101 2.15E-05 40.9355 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#9375 - CGI_10005869 superfamily 245814 120 204 0.000366049 37.5772 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#9376 - CGI_10005870 superfamily 243034 448 541 5.48E-09 54.3084 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#9376 - CGI_10005870 superfamily 243034 165 264 2.32E-08 52.7676 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of 
TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#9377 - CGI_10005871 superfamily 241645 3 84 3.14E-06 40.1941 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#9378 - CGI_10005872 superfamily 243082 132 217 9.29E-18 79.3528 cl02553 Peptidase_C19 superfamily N - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquinated peptides by cleavage of isopeptide bonds. 
They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." Q#9378 - CGI_10005872 superfamily 243082 84 100 0.00108047 38.2331 cl02553 Peptidase_C19 superfamily C - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." Q#9379 - CGI_10005873 superfamily 243082 23 121 2.37E-15 73.861 cl02553 Peptidase_C19 superfamily N - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." 
Q#9382 - CGI_10008035 superfamily 216239 37 371 9.25E-103 310.777 cl18361 IRK superfamily - - Inward rectifier potassium channel; Inward rectifier potassium channel. Q#9383 - CGI_10008036 superfamily 216239 23 322 3.61E-90 279.576 cl18361 IRK superfamily - - Inward rectifier potassium channel; Inward rectifier potassium channel. Q#9384 - CGI_10008037 superfamily 216239 1 295 2.20E-103 310.007 cl18361 IRK superfamily - - Inward rectifier potassium channel; Inward rectifier potassium channel. Q#9385 - CGI_10008038 superfamily 241571 670 783 5.51E-23 96.7126 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#9385 - CGI_10008038 superfamily 241609 791 864 1.60E-16 77.0331 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#9385 - CGI_10008038 superfamily 241568 1124 1181 5.69E-06 45.5316 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." 
Q#9385 - CGI_10008038 superfamily 243035 985 1093 1.09E-05 45.3034 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#9385 - CGI_10008038 superfamily 241568 919 972 0.000274005 40.524 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#9385 - CGI_10008038 superfamily 241583 240 398 3.31E-28 113.82 cl00064 ZnMc superfamily - - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." 
Q#9388 - CGI_10008041 superfamily 241629 45 158 6.86E-09 54.0632 cl00133 SCP superfamily - - "SCP: SCP-like extracellular protein domain, found in eukaryotes and prokaryotes. This family includes plant pathogenesis-related protein 1 (PR-1), which accumulates after infections with pathogens, and may act as an anti-fungal agent or be involved in cell wall loosening. This family also includes CRISPs, mammalian cysteine-rich secretory proteins, which combine SCP with a C-terminal cysteine rich domain, and allergen 5 from vespid venom. Roles for CRISP, in response to pathogens, fertilization, and sperm maturation have been proposed. One member, Tex31 from the venom duct of Conus textile, has been shown to possess proteolytic activity sensitive to serine protease inhibitors. The human GAPR-1 protein has been reported to dimerize, and such a dimer may form an active site containing a catalytic triad. SCP has also been proposed to be a Ca++ chelating serine protease. The Ca++-chelating function would fit with various signaling processes that members of this family, such as the CRISPs, are involved in, and is supported by sequence and structural evidence of a conserved pocket containing two histidines and a glutamate. It also may explain how helothermine, a toxic peptide secreted by the beaded lizard, blocks Ca++ transporting ryanodine receptors. Little is known about the biological roles of the bacterial and archaeal SCP domains." Q#9388 - CGI_10008041 superfamily 241629 426 539 5.56E-05 41.8202 cl00133 SCP superfamily - - "SCP: SCP-like extracellular protein domain, found in eukaryotes and prokaryotes. This family includes plant pathogenesis-related protein 1 (PR-1), which accumulates after infections with pathogens, and may act as an anti-fungal agent or be involved in cell wall loosening. This family also includes CRISPs, mammalian cysteine-rich secretory proteins, which combine SCP with a C-terminal cysteine rich domain, and allergen 5 from vespid venom. 
Roles for CRISP, in response to pathogens, fertilization, and sperm maturation have been proposed. One member, Tex31 from the venom duct of Conus textile, has been shown to possess proteolytic activity sensitive to serine protease inhibitors. The human GAPR-1 protein has been reported to dimerize, and such a dimer may form an active site containing a catalytic triad. SCP has also been proposed to be a Ca++ chelating serine protease. The Ca++-chelating function would fit with various signaling processes that members of this family, such as the CRISPs, are involved in, and is supported by sequence and structural evidence of a conserved pocket containing two histidines and a glutamate. It also may explain how helothermine, a toxic peptide secreted by the beaded lizard, blocks Ca++ transporting ryanodine receptors. Little is known about the biological roles of the bacterial and archaeal SCP domains." Q#9389 - CGI_10008042 superfamily 247755 88 322 1.39E-79 254.013 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#9389 - CGI_10008042 superfamily 247789 426 638 3.31E-36 135.464 cl17235 ABC2_membrane superfamily - - ABC-2 type transporter; ABC-2 type transporter. Q#9390 - CGI_10008043 superfamily 248312 8 119 1.48E-06 44.2665 cl17758 PMP22_Claudin superfamily N - PMP-22/EMP/MP20/Claudin family; PMP-22/EMP/MP20/Claudin family. 
Q#9392 - CGI_10008045 superfamily 247905 22 69 8.76E-11 59.9441 cl17351 HELICc superfamily N - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#9393 - CGI_10000389 superfamily 243064 21 135 3.57E-23 89.3402 cl02512 NTR_like superfamily - - "NTR_like domain; a beta barrel with an oligosaccharide/oligonucleotide-binding fold found in netrins, complement proteins, tissue inhibitors of metalloproteases (TIMP), and procollagen C-proteinase enhancers (PCOLCE), amongst others. In netrins, the domain plays a role in controlling axon branching in neural development, while the common function of these modules in TIMPs appears to be binding to metzincins. A subset of this family is also known as the C345C domain because it occurs as a C-terminal domain in complement C3, C4 and C5. In C5, the domain interacts with various partners during the formation of the membrane attack complex." Q#9395 - CGI_10002469 superfamily 241574 213 420 6.50E-84 266.374 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. 
Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#9395 - CGI_10002469 superfamily 241574 458 629 4.63E-16 77.2409 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#9396 - CGI_10002470 superfamily 220611 16 64 0.00746028 31.3611 cl10864 Laps superfamily NC - Learning-associated protein; This is a family of 121-amino acid secretory proteins. Laps functions in the regulation of neuronal cell adhesion and/or movement and synapse attachment. Laps binds to the ApC/EBP (Aplysia CCAAT/enhancer binding protein) promoter and activates the transcription of ApC/EBP mRNA. Q#9398 - CGI_10006809 superfamily 245213 46 81 3.78E-09 51.0982 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#9398 - CGI_10006809 superfamily 245213 8 43 4.79E-09 50.713 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#9398 - CGI_10006809 superfamily 245213 84 119 8.89E-09 49.9426 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#9398 - CGI_10006809 superfamily 245847 125 269 1.86E-18 79.1377 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#9399 - CGI_10006810 superfamily 225950 4 55 0.00155182 34.4781 cl18731 COG3416 superfamily N - Uncharacterized protein conserved in bacteria [Function unknown] Q#9401 - CGI_10006812 superfamily 246918 216 262 4.13E-11 59.1375 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. 
Q#9401 - CGI_10006812 superfamily 246918 104 150 1.42E-10 57.5967 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#9401 - CGI_10006812 superfamily 246918 160 211 3.44E-09 53.3595 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#9401 - CGI_10006812 superfamily 246918 272 323 1.23E-08 51.8187 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#9401 - CGI_10006812 superfamily 246918 380 429 1.07E-07 49.1223 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#9401 - CGI_10006812 superfamily 243119 437 481 0.000174335 39.7245 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#9401 - CGI_10006812 superfamily 246918 328 369 0.00060684 38.3367 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#9405 - CGI_10006816 superfamily 227412 1 136 1.47E-21 86.3685 cl18811 YIP1 superfamily N - "Rab GTPase interacting factor, Golgi membrane protein [Intracellular trafficking and secretion]" Q#9408 - CGI_10006819 superfamily 241750 2 156 1.53E-31 114.593 cl00281 metallo-dependent_hydrolases superfamily N - "Superfamily of metallo-dependent hydrolases (also called amidohydrolase superfamily) is a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. 
The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. The family includes urease alpha, adenosine deaminase, phosphotriesterase dihydroorotases, allantoinases, hydantoinases, AMP-, adenine and cytosine deaminases, imidazolonepropionase, aryldialkylphosphatase, chlorohydrolases, formylmethanofuran dehydrogenases and others." Q#9412 - CGI_10004267 superfamily 247684 5 432 8.12E-88 276.081 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#9416 - CGI_10004271 superfamily 241563 61 96 0.00107099 37.4588 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#9418 - CGI_10004914 superfamily 241550 180 436 3.14E-68 218.202 cl00015 nt_trans superfamily - - "nucleotidyl transferase superfamily; nt_trans (nucleotidyl transferase) This superfamily includes the class I amino-acyl tRNA synthetases, pantothenate synthetase (PanC), ATP sulfurylase, and the cytidylyltransferases, all of which have a conserved dinucleotide-binding domain." Q#9418 - CGI_10004914 superfamily 243647 67 153 1.45E-23 93.85 cl04104 Arg_tRNA_synt_N superfamily - - "Arginyl tRNA synthetase N terminal domain; This domain is found at the amino terminus of Arginyl tRNA synthetase, also called additional domain 1 (Add-1). It is about 140 residues long and it has been suggested that this domain will be involved in tRNA recognition." Q#9419 - CGI_10004915 superfamily 245839 59 223 1.52E-31 114.234 cl12020 Anticodon_Ia_like superfamily - - "Anticodon-binding domain of class Ia aminoacyl tRNA synthetases and similar domains; This domain is found in a variety of class Ia aminoacyl tRNA synthetases, C-terminal to the catalytic core domain. It recognizes and specifically binds to the anticodon of the tRNA. Aminoacyl tRNA synthetases catalyze the transfer of cognate amino acids to the 3'-end of their tRNAs by specifically recognizing cognate from non-cognate amino acids. Members include valyl-, leucyl-, isoleucyl-, cysteinyl-, arginyl-, and methionyl-tRNA synthethases. This superfamily also includes a domain from MshC, an enzyme in the mycothiol biosynthetic pathway." 
Q#9420 - CGI_10004916 superfamily 220622 31 174 8.54E-68 207.76 cl10879 Mesd superfamily - - "Chaperone for wingless signalling and trafficking of LDL receptor; Mesd is a family of highly conserved proteins found from nematodes to humans. The final C-terminal residues, KEDL, are the endoplasmic reticulum retention sequence as it is an ER protein specifically required for the intracellular trafficking of members of the low-density lipoprotein family of receptors (LDLRs). The N- and C-terminal sequences are predicted to adopt a random coil conformation, with the exception of an isolated predicted helix within the N-terminal region, The central folded domain flanked by natively unstructured regions is the necessary structure for facilitating maturation of LRP6 (Low-Density Lipoprotein Receptor-Related Protein 6 Maturation)." Q#9421 - CGI_10004917 superfamily 247723 232 347 1.17E-49 163.987 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." 
Q#9426 - CGI_10010703 superfamily 222150 400 425 0.00225775 36.6009 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#9427 - CGI_10010704 superfamily 222150 380 405 0.000906793 37.3713 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#9427 - CGI_10010704 superfamily 222150 353 376 0.00122031 36.9861 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#9428 - CGI_10010705 superfamily 246680 11 89 5.67E-16 70.3084 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#9429 - CGI_10010706 superfamily 247683 427 488 1.35E-26 105.835 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. 
SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#9429 - CGI_10010706 superfamily 241622 328 407 2.15E-17 79.9182 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archebacteria, and eurkayotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#9429 - CGI_10010706 superfamily 241622 1 75 3.14E-16 76.4514 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. 
Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archebacteria, and eurkayotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#9429 - CGI_10010706 superfamily 241622 90 163 3.35E-06 46.7911 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archebacteria, and eurkayotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. 
PDZ domains have been named after PSD95 (post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#9429 - CGI_10010706 superfamily 194336 1391 1494 1.15E-47 167.423 cl02517 ZU5 superfamily - - ZU5 domain; Domain present in ZO-1 and Unc5-like netrin receptors Domain of unknown function. Q#9429 - CGI_10010706 superfamily 247744 537 705 1.94E-26 108.919 cl17190 NK superfamily - - "Nucleoside/nucleotide kinase (NK) is a protein superfamily consisting of multiple families of enzymes that share structural similarity and are functionally related to the catalysis of the reversible phosphate group transfer from nucleoside triphosphates to nucleosides/nucleotides, nucleoside monophosphates, or sugars. Members of this family play a wide variety of essential roles in nucleotide metabolism, the biosynthesis of coenzymes and aromatic compounds, as well as the metabolism of sugar and sulfate." Q#9430 - CGI_10010707 superfamily 243051 649 803 1.98E-25 103.997 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal cord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." 
Q#9430 - CGI_10010707 superfamily 245213 608 645 4.37E-05 41.8534 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#9430 - CGI_10010707 superfamily 245213 569 603 0.000147353 40.3126 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#9430 - CGI_10010707 superfamily 245213 489 524 0.000152315 40.3126 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#9430 - CGI_10010707 superfamily 245213 244 281 0.00447582 36.0754 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#9431 - CGI_10010708 superfamily 243051 675 782 3.53E-20 88.5889 cl02479 MAM superfamily N - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal cord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#9431 - CGI_10010708 superfamily 245213 593 630 6.88E-07 47.2462 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#9431 - CGI_10010708 superfamily 245213 467 505 4.13E-05 41.8534 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#9431 - CGI_10010708 superfamily 245213 508 546 0.000265466 39.5422 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#9431 - CGI_10010708 superfamily 245213 549 587 0.00027688 39.5422 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#9431 - CGI_10010708 superfamily 241583 1 161 1.39E-42 153.111 cl00064 ZnMc superfamily - - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#9431 - CGI_10010708 superfamily 241571 209 324 0.000432195 39.7031 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#9432 - CGI_10010709 superfamily 243107 6 32 0.000366359 37.5246 cl02611 G-patch superfamily N - "G-patch domain; This domain is found in a number of RNA binding proteins, and is also found in proteins that contain RNA binding domains. This suggests that this domain may have an RNA binding function. This domain has seven highly conserved glycines." Q#9433 - CGI_10010710 superfamily 243097 76 306 1.52E-93 284.955 cl02572 PIPKc superfamily C - "Phosphatidylinositol phosphate kinases (PIPK) catalyze the phosphorylation of phosphatidylinositol phosphate on the fourth or fifth hydroxyl of the inositol ring, to form phosphatidylinositol bisphosphate. CD alignment includes type II phosphatidylinositol phosphate kinases (PIPKII-beta), type I and II PIPK (-alpha, -beta, and -gamma) kinases and related yeast Fab1p and Mss4p kinases. 
Signaling by phosphorylated species of phosphatidylinositol regulates secretion, vesicular trafficking, membrane translocation, cell adhesion, chemotaxis, DNA synthesis, and cell cycling. The catalytic core domains of PIPKs are structurally similar to PI3K, PI4K, and cAMP-dependent protein kinases (PKA), the dimerization region is a unique feature of the PIPKs." Q#9438 - CGI_10009776 superfamily 215647 49 289 1.43E-12 64.9372 cl18338 7tm_2 superfamily - - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GPCRs). They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97 ; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#9442 - CGI_10009780 superfamily 215647 187 427 9.10E-09 54.5369 cl18338 7tm_2 superfamily - - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GPCRs). They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. 
Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97 ; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#9443 - CGI_10009781 superfamily 247724 5 117 3.35E-25 96.6831 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." 
Q#9445 - CGI_10006602 superfamily 219619 5 83 1.88E-10 54.1359 cl18518 Ion_trans_2 superfamily - - Ion channel; This family includes the two membrane helix type ion channels found in bacteria. Q#9446 - CGI_10006603 superfamily 219619 73 129 1.40E-12 62.2251 cl18518 Ion_trans_2 superfamily N - Ion channel; This family includes the two membrane helix type ion channels found in bacteria. Q#9446 - CGI_10006603 superfamily 219619 180 206 4.10E-07 46.8172 cl18518 Ion_trans_2 superfamily NC - Ion channel; This family includes the two membrane helix type ion channels found in bacteria. Q#9447 - CGI_10006604 superfamily 219619 76 129 1.15E-13 65.3067 cl18518 Ion_trans_2 superfamily N - Ion channel; This family includes the two membrane helix type ion channels found in bacteria. Q#9447 - CGI_10006604 superfamily 219619 179 240 2.20E-09 53.3655 cl18518 Ion_trans_2 superfamily N - Ion channel; This family includes the two membrane helix type ion channels found in bacteria. Q#9448 - CGI_10006605 superfamily 241567 37 196 4.68E-11 60.7918 cl00042 CASc superfamily N - "Caspase, interleukin-1 beta converting enzyme (ICE) homologues; Cysteine-dependent aspartate-directed proteases that mediate programmed cell death (apoptosis). Caspases are synthesized as inactive zymogens and activated by proteolysis of the peptide backbone adjacent to an aspartate. The resulting two subunits associate to form an (alpha)2(beta)2-tetramer which is the active enzyme. Activation of caspases can be mediated by other caspase homologs." Q#9448 - CGI_10006605 superfamily 241567 274 365 1.77E-05 44.1284 cl00042 CASc superfamily N - "Caspase, interleukin-1 beta converting enzyme (ICE) homologues; Cysteine-dependent aspartate-directed proteases that mediate programmed cell death (apoptosis). Caspases are synthesized as inactive zymogens and activated by proteolysis of the peptide backbone adjacent to an aspartate. 
The resulting two subunits associate to form an (alpha)2(beta)2-tetramer which is the active enzyme. Activation of caspases can be mediated by other caspase homologs." Q#9449 - CGI_10006606 superfamily 243097 1 349 2.50E-94 287.356 cl02572 PIPKc superfamily - - "Phosphatidylinositol phosphate kinases (PIPK) catalyze the phosphorylation of phosphatidylinositol phosphate on the fourth or fifth hydroxyl of the inositol ring, to form phosphatidylinositol bisphosphate. CD alignment includes type II phosphatidylinositol phosphate kinases (PIPKII-beta), type I and II PIPK (-alpha, -beta, and -gamma) kinases and related yeast Fab1p and Mss4p kinases. Signaling by phosphorylated species of phosphatidylinositol regulates secretion, vesicular trafficking, membrane translocation, cell adhesion, chemotaxis, DNA synthesis, and cell cycling. The catalytic core domains of PIPKs are structurally similar to PI3K, PI4K, and cAMP-dependent protein kinases (PKA), the dimerization region is a unique feature of the PIPKs." Q#9450 - CGI_10007272 superfamily 241607 436 486 1.63E-18 80.4261 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chymotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. 
Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#9452 - CGI_10007274 superfamily 207662 261 333 1.69E-42 144.125 cl02596 NR_DBD_like superfamily - - "DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers; DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. It interacts with a specific DNA site upstream of the target gene and modulates the rate of transcriptional initiation. Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions, from development, reproduction, to homeostasis and metabolism in animals (metazoans). The family contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). Most nuclear receptors bind as homodimers or heterodimers to their target sites, which consist of two hexameric half-sites. Specificity is determined by the half-site sequence, the relative orientation of the half-sites and the number of spacer nucleotides between the half-sites. 
However, a growing number of nuclear receptors have been reported to bind to DNA as monomers." Q#9454 - CGI_10007276 superfamily 243134 207 333 7.59E-19 83.8528 cl02663 Fasciclin superfamily - - "Fasciclin domain; This extracellular domain is found repeated four times in grasshopper fasciclin I as well as in proteins from mammals, sea urchins, plants, yeast and bacteria." Q#9454 - CGI_10007276 superfamily 243134 371 450 1.39E-09 56.5036 cl02663 Fasciclin superfamily - - "Fasciclin domain; This extracellular domain is found repeated four times in grasshopper fasciclin I as well as in proteins from mammals, sea urchins, plants, yeast and bacteria." Q#9454 - CGI_10007276 superfamily 243134 13 147 5.32E-09 54.9628 cl02663 Fasciclin superfamily - - "Fasciclin domain; This extracellular domain is found repeated four times in grasshopper fasciclin I as well as in proteins from mammals, sea urchins, plants, yeast and bacteria." Q#9454 - CGI_10007276 superfamily 243134 655 734 1.86E-05 43.792 cl02663 Fasciclin superfamily N - "Fasciclin domain; This extracellular domain is found repeated four times in grasshopper fasciclin I as well as in proteins from mammals, sea urchins, plants, yeast and bacteria." Q#9462 - CGI_10017262 superfamily 241984 1 197 1.65E-41 142.01 cl00615 Membrane-FADS-like superfamily N - "The membrane fatty acid desaturase (Membrane_FADS)-like CD includes membrane FADSs, alkane hydroxylases, beta carotene ketolases (CrtW-like), hydroxylases (CrtR-like), and other related proteins. They are present in all groups of organisms with the exception of archaea. Membrane FADSs are non-heme, iron-containing, oxygen-dependent enzymes involved in regioselective introduction of double bonds in fatty acyl aliphatic chains. They play an important role in the maintenance of the proper structure and functioning of biological membranes. 
Alkane hydroxylases are bacterial, integral-membrane di-iron enzymes that share a requirement for iron and oxygen for activity similar to that of membrane FADSs, and are involved in the initial oxidation of inactivated alkanes. Beta-carotene ketolase and beta-carotene hydroxylase are carotenoid biosynthetic enzymes for astaxanthin and zeaxanthin, respectively. This superfamily domain has extensive hydrophobic regions that would be capable of spanning the membrane bilayer at least twice. Comparison of these sequences also reveals three regions of conserved histidine cluster motifs that contain eight histidine residues: HXXX(X)H, HXX(X)HH, and HXXHH (an additional conserved histidine residue is seen between clusters 2 and 3). Spectroscopic and genetic evidence point to a nitrogen-rich coordination environment located in the cytoplasm with as many as eight histidines coordinating the two iron ions and a carboxylate residue bridging the two metals in the Pseudomonas oleovorans alkane hydroxylase (AlkB). In addition, the eight histidine residues are reported to be catalytically essential and proposed to be the ligands for the iron atoms contained within the rat stearoyl CoA delta-9 desaturase." Q#9463 - CGI_10017263 superfamily 243176 1 494 0 884.32 cl02777 chaperonin_like superfamily - - "chaperonin_like superfamily. Chaperonins are involved in productive folding of proteins. They share a common general morphology, a double toroid of 2 stacked rings, each composed of 7-9 subunits. There are 2 main chaperonin groups. The symmetry of type I is seven-fold and they are found in eubacteria (GroEL) and in organelles of eubacterial descent (hsp60 and RBP). The symmetry of type II is eight- or nine-fold and they are found in archaea (thermosome), thermophilic bacteria (TF55) and in the eukaryotic cytosol (CCT). Their common function is to sequester nonnative proteins inside their central cavity and promote folding by using energy derived from ATP hydrolysis. 
This superfamily also contains related domains from Fab1-like phosphatidylinositol 3-phosphate (PtdIns3P) 5-kinases that only contain the intermediate and apical domains." Q#9464 - CGI_10017264 superfamily 241889 122 267 3.48E-56 184.741 cl00474 PAP2_like superfamily - - "PAP2_like proteins, a super-family of histidine phosphatases and vanadium haloperoxidases, includes type 2 phosphatidic acid phosphatase or lipid phosphate phosphatase (LPP), Glucose-6-phosphatase, Phosphatidylglycerophosphatase B and bacterial acid phosphatase, vanadium chloroperoxidases, vanadium bromoperoxidases, and several other mostly uncharacterized subfamilies. Several members of this superfamily have been predicted to be transmembrane proteins." Q#9465 - CGI_10017265 superfamily 243555 20 207 1.68E-13 68.1866 cl03871 Chitin_bind_3 superfamily - - "Chitin binding domain; This domain is found associated with a wide variety of cellulose binding domain. This domain however is a chitin binding domain. This domain is found in isolation in baculoviral spheroidins and spindolins, protein of unknown function." Q#9468 - CGI_10017268 superfamily 248012 102 201 3.42E-24 94.952 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#9469 - CGI_10017269 superfamily 248012 2 84 2.90E-07 43.8013 cl17458 TIR_2 superfamily N - TIR domain; This is a family of bacterial Toll-like receptors. Q#9471 - CGI_10017271 superfamily 245814 3 65 3.92E-08 45.5579 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. 
A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#9472 - CGI_10017272 superfamily 245814 80 123 0.00554724 33.6167 cl11960 Ig superfamily N - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#9473 - CGI_10017273 superfamily 241607 35 75 5.89E-11 54.1982 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chymotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. 
Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#9474 - CGI_10017274 superfamily 241624 335 470 7.56E-36 133.606 cl00120 PP2Cc superfamily N - "Serine/threonine phosphatases, family 2C, catalytic domain; The protein architecture and deduced catalytic mechanism of PP2C phosphatases are similar to the PP1, PP2A, PP2B family of protein Ser/Thr phosphatases, with which PP2C shares no sequence similarity." Q#9479 - CGI_10017279 superfamily 207724 175 234 6.65E-12 59.9367 cl02772 BSD superfamily - - BSD domain; This domain contains a distinctive -FW- motif. It is found in a family of eukaryotic transcription factors as well as a set of proteins of unknown function. Q#9483 - CGI_10017283 superfamily 245815 25 503 0 834.738 cl11961 ALDH-SF superfamily - - "NAD(P)+-dependent aldehyde dehydrogenase superfamily; The aldehyde dehydrogenase superfamily (ALDH-SF) of NAD(P)+-dependent enzymes, in general, oxidize a wide range of endogenous and exogenous aliphatic and aromatic aldehydes to their corresponding carboxylic acids and play an important role in detoxification. Besides aldehyde detoxification, many ALDH isozymes possess multiple additional catalytic and non-catalytic functions such as participating in metabolic pathways, or as binding proteins, or osmoregulants, to mention a few. The enzyme has three domains, a NAD(P)+ cofactor-binding domain, a catalytic domain, and a bridging domain; and the active enzyme is generally either homodimeric or homotetrameric. 
The catalytic mechanism is proposed to involve cofactor binding, resulting in a conformational change and activation of an invariant catalytic cysteine nucleophile. The cysteine and aldehyde substrate form an oxyanion thiohemiacetal intermediate resulting in hydride transfer to the cofactor and formation of a thioacylenzyme intermediate. Hydrolysis of the thioacylenzyme and release of the carboxylic acid product occurs, and in most cases, the reduced cofactor dissociates from the enzyme. The evolutionary phylogenetic tree of ALDHs appears to have an initial bifurcation between what has been characterized as the classical aldehyde dehydrogenases, the ALDH family (ALDH) and extended family members or aldehyde dehydrogenase-like (ALDH-L) proteins. The ALDH proteins are represented by enzymes which share a number of highly conserved residues necessary for catalysis and cofactor binding and they include such proteins as retinal dehydrogenase, 10-formyltetrahydrofolate dehydrogenase, non-phosphorylating glyceraldehyde 3-phosphate dehydrogenase, delta(1)-pyrroline-5-carboxylate dehydrogenases, alpha-ketoglutaric semialdehyde dehydrogenase, alpha-aminoadipic semialdehyde dehydrogenase, coniferyl aldehyde dehydrogenase and succinate-semialdehyde dehydrogenase. Included in this larger group are all human, Arabidopsis, Tortula, fungal, protozoan, and Drosophila ALDHs identified in families ALDH1 through ALDH22 with the exception of families ALDH18, ALDH19, and ALDH20 which are present in the ALDH-like group. The ALDH-like group is represented by such proteins as gamma-glutamyl phosphate reductase, LuxC-like acyl-CoA reductase, and coenzyme A acylating aldehyde dehydrogenase. All of these proteins have a conserved cysteine that aligns with the catalytic cysteine of the ALDH group." 
Q#9483 - CGI_10017283 superfamily 245815 565 731 4.78E-24 105.086 cl11961 ALDH-SF superfamily C - "NAD(P)+-dependent aldehyde dehydrogenase superfamily; The aldehyde dehydrogenase superfamily (ALDH-SF) of NAD(P)+-dependent enzymes, in general, oxidize a wide range of endogenous and exogenous aliphatic and aromatic aldehydes to their corresponding carboxylic acids and play an important role in detoxification. Besides aldehyde detoxification, many ALDH isozymes possess multiple additional catalytic and non-catalytic functions such as participating in metabolic pathways, or as binding proteins, or osmoregulants, to mention a few. The enzyme has three domains, a NAD(P)+ cofactor-binding domain, a catalytic domain, and a bridging domain; and the active enzyme is generally either homodimeric or homotetrameric. The catalytic mechanism is proposed to involve cofactor binding, resulting in a conformational change and activation of an invariant catalytic cysteine nucleophile. The cysteine and aldehyde substrate form an oxyanion thiohemiacetal intermediate resulting in hydride transfer to the cofactor and formation of a thioacylenzyme intermediate. Hydrolysis of the thioacylenzyme and release of the carboxylic acid product occurs, and in most cases, the reduced cofactor dissociates from the enzyme. The evolutionary phylogenetic tree of ALDHs appears to have an initial bifurcation between what has been characterized as the classical aldehyde dehydrogenases, the ALDH family (ALDH) and extended family members or aldehyde dehydrogenase-like (ALDH-L) proteins. 
The ALDH proteins are represented by enzymes which share a number of highly conserved residues necessary for catalysis and cofactor binding and they include such proteins as retinal dehydrogenase, 10-formyltetrahydrofolate dehydrogenase, non-phosphorylating glyceraldehyde 3-phosphate dehydrogenase, delta(1)-pyrroline-5-carboxylate dehydrogenases, alpha-ketoglutaric semialdehyde dehydrogenase, alpha-aminoadipic semialdehyde dehydrogenase, coniferyl aldehyde dehydrogenase and succinate-semialdehyde dehydrogenase. Included in this larger group are all human, Arabidopsis, Tortula, fungal, protozoan, and Drosophila ALDHs identified in families ALDH1 through ALDH22 with the exception of families ALDH18, ALDH19, and ALDH20 which are present in the ALDH-like group. The ALDH-like group is represented by such proteins as gamma-glutamyl phosphate reductase, LuxC-like acyl-CoA reductase, and coenzyme A acylating aldehyde dehydrogenase. All of these proteins have a conserved cysteine that aligns with the catalytic cysteine of the ALDH group." Q#9484 - CGI_10017284 superfamily 241616 10 43 0.000180166 38.7804 cl00109 MADS superfamily C - "MADS: MCM1, Agamous, Deficiens, and SRF (serum response factor) box family of eukaryotic transcriptional regulators. Binds DNA and exists as hetero and homo-dimers. Composed of 2 main subgroups: SRF-like/Type I and MEF2-like (myocyte enhancer factor 2)/ Type II. These subgroups differ mainly in position of the alpha 2 helix responsible for the dimerization interface; Important in homeotic regulation in plants and in immediate-early development in animals. Also found in fungi." Q#9484 - CGI_10017284 superfamily 247999 265 319 0.00229944 35.544 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. 
Q#9489 - CGI_10005130 superfamily 247742 5 413 0 741.081 cl17188 enolase_like superfamily - - "Enolase-superfamily, characterized by the presence of an enolate anion intermediate which is generated by abstraction of the alpha-proton of the carboxylate substrate by an active site residue and is stabilized by coordination to the essential Mg2+ ion. Enolase superfamily contains different enzymes, like enolases, glutarate-, fucanate- and galactonate dehydratases, o-succinylbenzoate synthase, N-acylamino acid racemase, L-alanine-DL-glutamate epimerase, mandelate racemase, muconate lactonizing enzyme and 3-methylaspartase." Q#9490 - CGI_10000748 superfamily 248019 332 444 6.30E-15 72.3169 cl17465 DAGK_cat superfamily C - "Diacylglycerol kinase catalytic domain; Diacylglycerol (DAG) is a second messenger that acts as a protein kinase C activator. The catalytic domain is assumed from the finding of bacterial homologues. YegS is the Escherichia coli protein in this family whose crystal structure reveals an active site in the inter-domain cleft formed by four conserved sequence motifs, revealing a novel metal-binding site. The residues of this site are conserved across the family." Q#9491 - CGI_10003415 superfamily 247042 32 428 1.28E-74 242.622 cl15693 Sema superfamily - - "The Sema domain, a protein interacting module, of semaphorins and plexins; Both semaphorins and plexins have a Sema domain on their N-termini. Plexins function as receptors for the semaphorins. Evolutionarily, plexins may be the ancestor of semaphorins. Semaphorins are regulatory molecules in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems, and cancer. Semaphorins can be divided into 7 classes. Vertebrates have members in classes 3-7, whereas classes 1 and 2 are known only in invertebrates. 
Class 2 and 3 semaphorins are secreted; classes 1 and 4 through 6 are transmembrane proteins; and class 7 is membrane associated via glycosylphosphatidylinositol (GPI) linkage. Plexins are a large family of transmembrane proteins, which are divided into four types (A-D) according to sequence similarity. In vertebrates, type A plexins serve as co-receptors for neuropilins to mediate the signalling of class 3 semaphorins. Plexins serve as direct receptors for several other members of the semaphorin family: class 6 semaphorins signal through type A plexins and class 4 semaphorins through type B plexins. This family also includes the MET and RON receptor tyrosine kinases. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves to recognize and bind receptors." Q#9491 - CGI_10003415 superfamily 247042 426 492 0.00134794 39.6215 cl15693 Sema superfamily N - "The Sema domain, a protein interacting module, of semaphorins and plexins; Both semaphorins and plexins have a Sema domain on their N-termini. Plexins function as receptors for the semaphorins. Evolutionarily, plexins may be the ancestor of semaphorins. Semaphorins are regulatory molecules in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems, and cancer. Semaphorins can be divided into 7 classes. Vertebrates have members in classes 3-7, whereas classes 1 and 2 are known only in invertebrates. Class 2 and 3 semaphorins are secreted; classes 1 and 4 through 6 are transmembrane proteins; and class 7 is membrane associated via glycosylphosphatidylinositol (GPI) linkage. Plexins are a large family of transmembrane proteins, which are divided into four types (A-D) according to sequence similarity. In vertebrates, type A plexins serve as co-receptors for neuropilins to mediate the signalling of class 3 semaphorins. 
Plexins serve as direct receptors for several other members of the semaphorin family: class 6 semaphorins signal through type A plexins and class 4 semaphorins through type B plexins. This family also includes the MET and RON receptor tyrosine kinases. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves to recognize and bind receptors." Q#9492 - CGI_10003416 superfamily 247746 130 213 0.0062628 35.697 cl17192 ATP-synt_B superfamily N - "ATP synthase B/B' CF(0); Part of the CF(0) (base unit) of the ATP synthase. The base unit is thought to translocate protons through membrane (inner membrane in mitochondria, thylakoid membrane in plants, cytoplasmic membrane in bacteria). The B subunits are thought to interact with the stalk of the CF(1) subunits. This domain should not be confused with the ab CF(1) proteins (in the head of the ATP synthase) which are found in pfam00006" Q#9493 - CGI_10003417 superfamily 247684 7 383 0 766.119 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. 
The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#9494 - CGI_10015220 superfamily 241599 101 170 6.43E-05 40.3045 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#9494 - CGI_10015220 superfamily 243092 248 384 0.00305355 37.6996 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this 
alignment." Q#9495 - CGI_10015221 superfamily 211426 65 166 3.53E-37 135.536 cl16936 SATB1_N superfamily - - "N-terminal domain of SATB1 and similar proteins; SATB1, the special AT-rich sequence-binding protein 1, is involved in organizing chromosomal loci into distinct loops, creating a "loopscape" that has a direct bearing on gene expression. This N-terminal domain, which may be involved in various interactions with chromatin proteins, resembles a ubiquitin domain and has been shown to form tetramers, a function critical to SATB1-DNA interactions. The related Drosophila homeobox gene defective proventriculus (dve) plays a key role in the functional specification during endoderm development." Q#9495 - CGI_10015221 superfamily 241599 708 779 1.35E-07 49.5493 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#9495 - CGI_10015221 superfamily 241599 276 342 2.42E-07 48.7789 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#9496 - CGI_10015222 superfamily 247743 213 380 4.94E-29 111.084 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. 
The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#9497 - CGI_10015223 superfamily 248030 244 736 5.08E-93 302.37 cl17476 Glyco_transf_7C superfamily - - "N-terminal domain of galactosyltransferase; This is the N-terminal domain of a family of galactosyltransferases from a wide range of Metazoa with three related galactosyltransferases activities, all three of which are possessed by one sequence in some cases. EC:2.4.1.90, N-acetyllactosamine synthase; EC:2.4.1.38, Beta-N-acetylglucosaminyl-glycopeptide beta-1,4- galactosyltransferase; and EC:2.4.1.22 Lactose synthase. Note that N-acetyllactosamine synthase is a component of Lactose synthase along with alpha-lactalbumin, in the absence of alpha-lactalbumin EC:2.4.1.90 is the catalyzed reaction." Q#9498 - CGI_10015224 superfamily 247792 30 95 4.76E-06 43.9736 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#9498 - CGI_10015224 superfamily 241563 128 168 3.48E-05 41.504 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#9498 - CGI_10015224 superfamily 247797 220 287 0.000445032 40.4861 cl17243 PRK13975 superfamily NC - thymidylate kinase; Provisional Q#9499 - CGI_10015225 superfamily 247723 1 20 0.00188498 36.2264 cl17169 RRM_SF superfamily N - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#9500 - CGI_10015226 superfamily 241584 709 792 9.67E-16 74.8403 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#9500 - CGI_10015226 superfamily 241584 816 888 1.00E-12 65.9807 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. 
Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#9500 - CGI_10015226 superfamily 241584 426 513 9.04E-10 57.1211 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#9500 - CGI_10015226 superfamily 241584 892 979 2.82E-09 55.9655 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#9500 - CGI_10015226 superfamily 241584 346 421 3.43E-09 55.5803 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. 
Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#9500 - CGI_10015226 superfamily 241584 534 608 5.65E-09 54.8099 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#9500 - CGI_10015226 superfamily 241584 627 703 3.05E-07 49.8023 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#9500 - CGI_10015226 superfamily 241584 993 1077 2.23E-05 44.0243 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. 
Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#9500 - CGI_10015226 superfamily 241584 235 327 0.000127163 41.7131 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#9503 - CGI_10015229 superfamily 246722 1 121 2.43E-54 190.131 cl14812 PIN_SF superfamily C - "PIN (PilT N terminus) domain: Superfamily; PIN_SF The PIN (PilT N terminus) domain belongs to a large nuclease superfamily with representatives from eukaryota, eubacteria, and archaea. PIN domains were originally named for their sequence similarity to the N-terminal domain of an annotated pili biogenesis protein, PilT, a domain fusion between a PIN-domain and a PilT ATPase domain. The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues) is geometrically similar in the active center of structure-specific 5' nucleases (also known as Flap endonuclease-1-like), PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Seen here, are two major divisions in the PIN domain superfamily. 
The first major division, the structure-specific 5' nuclease family, is represented by FEN1, the 5'-3' exonuclease of DNA polymerase I, and T4 RNase H nuclease PIN domains. These 5' nucleases are involved in DNA replication, repair, and recombination. They are capable of both 5'-3' exonucleolytic activity and cleaving bifurcated DNA, in an endonucleolytic, structure-specific manner. Unique to FEN1-like nucleases, the PIN domain has a helical arch/clamp region (I domain) of variable length (approximately 16 to 800 residues) and, inserted within the C-terminal region of the PIN domain, a H3TH (helix-3-turn-helix) domain, an atypical helix-hairpin-helix-2-like region. Both the H3TH domain (not included here) and the helical arch/clamp region are involved in DNA binding. With the exception of Mkt1, these nucleases have a carboxylate rich active site that is involved in binding essential divalent metal ion cofactors (Mg2+, Mn2+, Zn2+, or Co2+). The second major division of the PIN domain superfamily, the VapC-Smg6 family, includes such eukaryotic ribonucleases as, Smg6, an essential factor in nonsense-mediated mRNA decay; Rrp44, the catalytic subunit of the exosome; and Nob1, a ribosome assembly factor critical in pre-rRNA processing. A large percentage of members in this family are bacterial ribonuclease toxins of TA operons such as Mycobacterium tuberculosis VapC and Neisseria gonorrhoeae FitB, as well as, archaeal homologs, Pyrobaculum aerophilum Pea0151 and P. aerophilum Pae2754. Also included are the eukaryotic Fcf1/ Utp24 (FAF1-copurifying factor 1/U three-associated protein 24) and Utp23-like proteins. Components of the small subunit processome, Fcf1/Utp24 and Utp23 are essential proteins involved in pre-rRNA processing and 40S ribosomal subunit assembly." 
Q#9503 - CGI_10015229 superfamily 246722 595 713 1.53E-42 156.619 cl14812 PIN_SF superfamily N - "PIN (PilT N terminus) domain: Superfamily; PIN_SF The PIN (PilT N terminus) domain belongs to a large nuclease superfamily with representatives from eukaryota, eubacteria, and archaea. PIN domains were originally named for their sequence similarity to the N-terminal domain of an annotated pili biogenesis protein, PilT, a domain fusion between a PIN-domain and a PilT ATPase domain. The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues) is geometrically similar in the active center of structure-specific 5' nucleases (also known as Flap endonuclease-1-like), PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Seen here, are two major divisions in the PIN domain superfamily. The first major division, the structure-specific 5' nuclease family, is represented by FEN1, the 5'-3' exonuclease of DNA polymerase I, and T4 RNase H nuclease PIN domains. These 5' nucleases are involved in DNA replication, repair, and recombination. They are capable of both 5'-3' exonucleolytic activity and cleaving bifurcated DNA, in an endonucleolytic, structure-specific manner. Unique to FEN1-like nucleases, the PIN domain has a helical arch/clamp region (I domain) of variable length (approximately 16 to 800 residues) and, inserted within the C-terminal region of the PIN domain, a H3TH (helix-3-turn-helix) domain, an atypical helix-hairpin-helix-2-like region. Both the H3TH domain (not included here) and the helical arch/clamp region are involved in DNA binding. With the exception of Mkt1, these nucleases have a carboxylate rich active site that is involved in binding essential divalent metal ion cofactors (Mg2+, Mn2+, Zn2+, or Co2+). 
The second major division of the PIN domain superfamily, the VapC-Smg6 family, includes such eukaryotic ribonucleases as, Smg6, an essential factor in nonsense-mediated mRNA decay; Rrp44, the catalytic subunit of the exosome; and Nob1, a ribosome assembly factor critical in pre-rRNA processing. A large percentage of members in this family are bacterial ribonuclease toxins of TA operons such as Mycobacterium tuberculosis VapC and Neisseria gonorrhoeae FitB, as well as, archaeal homologs, Pyrobaculum aerophilum Pea0151 and P. aerophilum Pae2754. Also included are the eukaryotic Fcf1/ Utp24 (FAF1-copurifying factor 1/U three-associated protein 24) and Utp23-like proteins. Components of the small subunit processome, Fcf1/Utp24 and Utp23 are essential proteins involved in pre-rRNA processing and 40S ribosomal subunit assembly." Q#9503 - CGI_10015229 superfamily 246724 720 809 5.54E-31 118.507 cl14815 H3TH_StructSpec-5'-nucleases superfamily - - "H3TH domains of structure-specific 5' nucleases (or flap endonuclease-1-like) involved in DNA replication, repair, and recombination; The 5' nucleases of this superfamily are capable of both 5'-3' exonucleolytic activity and cleaving bifurcated or branched DNA, in an endonucleolytic, structure-specific manner, and are involved in DNA replication, repair, and recombination. The superfamily includes the H3TH (helix-3-turn-helix) domains of Flap Endonuclease-1 (FEN1), Exonuclease-1 (EXO1), Mkt1, Gap Endonuclease 1 (GEN1) and Xeroderma pigmentosum complementation group G (XPG) nuclease. Also included are the H3TH domains of the 5'-3' exonucleases of DNA polymerase I and single domain protein homologs, as well as, the bacteriophage T4 RNase H, T5-5'nuclease, and other homologs. These nucleases contain a PIN (PilT N terminus) domain with a helical arch/clamp region/I domain (not included here) and inserted within the C-terminal region of the PIN domain is an atypical helix-hairpin-helix-2 (HhH2)-like region. 
This atypical HhH2 region, the H3TH domain, has an extended loop with at least three turns between the first two helices, and only three of the four helices appear to be conserved. Both the H3TH domain and the helical arch/clamp region are involved in DNA binding. Studies suggest that a glycine-rich loop in the H3TH domain contacts the phosphate backbone of the template strand in the downstream DNA duplex. Typically, the nucleases within this superfamily have a carboxylate rich active site that is involved in binding essential divalent metal ion cofactors (i. e., Mg2+, Mn2+, Zn2+, or Co2+) required for nuclease activity. The first metal binding site is composed entirely of Asp/Glu residues from the PIN domain, whereas, the second metal binding site is composed generally of two Asp residues from the PIN domain and one or two Asp residues from the H3TH domain. Together with the helical arch and network of amino acids interacting with metal binding ions, the H3TH region defines a positively charged active-site DNA-binding groove in structure-specific 5' nucleases." Q#9503 - CGI_10015229 superfamily 246722 793 848 2.12E-08 54.9262 cl14812 PIN_SF superfamily N - "PIN (PilT N terminus) domain: Superfamily; PIN_SF The PIN (PilT N terminus) domain belongs to a large nuclease superfamily with representatives from eukaryota, eubacteria, and archaea. PIN domains were originally named for their sequence similarity to the N-terminal domain of an annotated pili biogenesis protein, PilT, a domain fusion between a PIN-domain and a PilT ATPase domain. The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues) is geometrically similar in the active center of structure-specific 5' nucleases (also known as Flap endonuclease-1-like), PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. 
Seen here, are two major divisions in the PIN domain superfamily. The first major division, the structure-specific 5' nuclease family, is represented by FEN1, the 5'-3' exonuclease of DNA polymerase I, and T4 RNase H nuclease PIN domains. These 5' nucleases are involved in DNA replication, repair, and recombination. They are capable of both 5'-3' exonucleolytic activity and cleaving bifurcated DNA, in an endonucleolytic, structure-specific manner. Unique to FEN1-like nucleases, the PIN domain has a helical arch/clamp region (I domain) of variable length (approximately 16 to 800 residues) and, inserted within the C-terminal region of the PIN domain, a H3TH (helix-3-turn-helix) domain, an atypical helix-hairpin-helix-2-like region. Both the H3TH domain (not included here) and the helical arch/clamp region are involved in DNA binding. With the exception of Mkt1, these nucleases have a carboxylate rich active site that is involved in binding essential divalent metal ion cofactors (Mg2+, Mn2+, Zn2+, or Co2+). The second major division of the PIN domain superfamily, the VapC-Smg6 family, includes such eukaryotic ribonucleases as, Smg6, an essential factor in nonsense-mediated mRNA decay; Rrp44, the catalytic subunit of the exosome; and Nob1, a ribosome assembly factor critical in pre-rRNA processing. A large percentage of members in this family are bacterial ribonuclease toxins of TA operons such as Mycobacterium tuberculosis VapC and Neisseria gonorrhoeae FitB, as well as, archaeal homologs, Pyrobaculum aerophilum Pea0151 and P. aerophilum Pae2754. Also included are the eukaryotic Fcf1/ Utp24 (FAF1-copurifying factor 1/U three-associated protein 24) and Utp23-like proteins. Components of the small subunit processome, Fcf1/Utp24 and Utp23 are essential proteins involved in pre-rRNA processing and 40S ribosomal subunit assembly." 
Q#9504 - CGI_10015230 superfamily 241628 79 261 2.10E-121 356.186 cl00130 PseudoU_synth superfamily - - "Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi); Pseudouridine synthases contains the RsuA/RluD, TruA, TruB and TruD families. This group consists of eukaryotic, bacterial and archaeal pseudouridine synthases. Some psi sites such as psi55,13,38 and 39 in tRNA are highly conserved, being in the same position in eubacteria, archaebacteria and eukaryotes. Other psi sites occur in a more restricted fashion, for example psi2604 in 23S RNA made by E.coli RluF has only been detected in E.coli. Human dyskerin with the help of guide RNAs makes the hundreds of pseudouridines present in rRNA and small nuclear RNAs (snRNAs). Mutations in human dyskerin cause X-linked dyskeratosis congenita. Missense mutation in human PUS1 causes mitochondrial myopathy and sideroblastic anemia (MLASA)." Q#9504 - CGI_10015230 superfamily 203846 39 97 8.82E-33 118.998 cl06897 DKCLD superfamily - - DKCLD (NUC011) domain; This is a TruB_N/PUA domain associated N-terminal domain of Dyskerin-like proteins. Q#9504 - CGI_10015230 superfamily 241977 288 361 1.82E-22 91.3862 cl00607 PUA superfamily - - "PUA domain; The PUA domain named after Pseudouridine synthase and Archaeosine transglycosylase, was detected in archaeal and eukaryotic pseudouridine synthases, archaeal archaeosine synthases, a family of predicted ATPases that may be involved in RNA modification, a family of predicted archaeal and bacterial rRNA methylases. Additionally, the PUA domain was detected in a family of eukaryotic proteins that also contain a domain homologous to the translation initiation factor eIF1/SUI1; these proteins may comprise a novel type of translation factors. 
Unexpectedly, the PUA domain was detected also in bacterial and yeast glutamate kinases; this is compatible with the demonstrated role of these enzymes in the regulation of the expression of other genes. It is predicted that the PUA domain is an RNA binding domain." Q#9505 - CGI_10015231 superfamily 241572 47 100 2.47E-08 52.2409 cl00050 CYCLIN superfamily C - "Cyclin box fold. Protein binding domain functioning in cell-cycle and transcription control. Present in cyclins, TFIIB and Retinoblastoma (RB).The cyclins consist of 8 classes of cell cycle regulators that regulate cyclin dependent kinases (CDKs). TFIIB is a transcription factor that binds the TATA box. Cyclins, TFIIB and RB contain 2 copies of the domain." Q#9505 - CGI_10015231 superfamily 241572 155 207 0.00278342 36.8329 cl00050 CYCLIN superfamily C - "Cyclin box fold. Protein binding domain functioning in cell-cycle and transcription control. Present in cyclins, TFIIB and Retinoblastoma (RB).The cyclins consist of 8 classes of cell cycle regulators that regulate cyclin dependent kinases (CDKs). TFIIB is a transcription factor that binds the TATA box. Cyclins, TFIIB and RB contain 2 copies of the domain." Q#9507 - CGI_10015233 superfamily 241563 60 97 1.82E-06 45.548 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#9507 - CGI_10015233 superfamily 128778 102 214 2.42E-05 43.0223 cl17972 BBC superfamily - - B-Box C-terminal domain; Coiled coil region C-terminal to (some) B-Box domains Q#9507 - CGI_10015233 superfamily 110440 485 511 0.00461564 35.4613 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. 
The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#9508 - CGI_10015234 superfamily 220617 23 134 4.95E-06 45.1325 cl10871 DUF2371 superfamily - - Uncharacterized conserved protein (DUF2371); This is a family of proteins conserved from nematodes to humans. The function is not known. Q#9511 - CGI_10015237 superfamily 243179 107 202 5.02E-08 50.5807 cl02781 tetraspanin_LEL superfamily - - "Tetraspanin, extracellular domain or large extracellular loop (LEL). Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. The tetraspanin family contains CD9, CD63, CD37, CD53, CD82, CD151, and CD81, amongst others. Tetraspanins are involved in diverse processes such as cell activation and proliferation, adhesion and motility, differentiation, cancer, and others. Their various functions may relate to their ability to act as molecular facilitators, grouping specific cell-surface proteins and affecting formation and stability of signaling complexes. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web", which may also include integrins." 
Q#9513 - CGI_10015239 superfamily 247725 354 465 6.26E-65 217.139 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#9513 - CGI_10015239 superfamily 215882 272 379 1.36E-28 113.145 cl09511 FERM_M superfamily - - FERM central domain; This domain is the central structural domain of the FERM domain. Q#9513 - CGI_10015239 superfamily 220215 190 266 1.13E-21 91.903 cl09630 FERM_N superfamily - - FERM N-terminal domain; This domain is the N-terminal ubiquitin-like structural domain of the FERM domain. Q#9513 - CGI_10015239 superfamily 220215 42 118 1.13E-21 91.903 cl09630 FERM_N superfamily - - FERM N-terminal domain; This domain is the N-terminal ubiquitin-like structural domain of the FERM domain. Q#9513 - CGI_10015239 superfamily 147837 1426 1537 1.47E-09 57.6121 cl05461 4_1_CTD superfamily - - "4.1 protein C-terminal domain (CTD); At the C-terminus of all known 4.1 proteins is a sequence domain unique to these proteins, known as the C-terminal domain (CTD). Mammalian CTDs are associated with a growing number of protein-protein interactions, although such activities have yet to be associated with invertebrate CTDs. Mammalian CTDs are generally defined by sequence alignment as encoded by exons 18-21. Comparison of known vertebrate 4.1 proteins with invertebrate 4.1 proteins indicates that mammalian 4.1 exon 19 represents a vertebrate adaptation that extends the sequence of the CTD with a Ser/Thr-rich sequence. 
The CTD was first described as a 22/24-kDa domain by chymotryptic digestion of erythrocyte 4.1 (4.1R). CTD is thought to represent an independent folding structure which has gained function since the divergence of vertebrates from invertebrates." Q#9513 - CGI_10015239 superfamily 192138 478 512 1.02E-06 47.9988 cl07378 FA superfamily C - "FERM adjacent (FA); This region is found adjacent to Band 4.1 / FERM domains (pfam00373) in a subset of FERM containing protein. The region has been hypothesised to play a role in regulatory adaptation, based on similarity to other protein kinase substrates." Q#9515 - CGI_10015241 superfamily 247801 47 170 1.37E-58 193.231 cl17247 CoA_trans superfamily C - Coenzyme A transferase; Coenzyme A transferase. Q#9515 - CGI_10015241 superfamily 241634 236 329 0.00277444 36.8786 cl00143 SynN superfamily C - "Syntaxin N-terminus domain; syntaxins are nervous system-specific proteins implicated in the docking of synaptic vesicles with the presynaptic plasma membrane; they are a family of receptors for intracellular transport vesicles; each target membrane may be identified by a specific member of the syntaxin family; syntaxins contain a moderately well conserved amino-terminal domain, called Habc, whose structure is an antiparallel three-helix bundle; a linker of about 30 amino acids connects this to the carboxy-terminal region, designated H3 (t_SNARE), of the syntaxin cytoplasmic domain; the highly conserved H3 region forms a single, long alpha-helix when it is part of the core SNARE complex and anchors the protein on the cytoplasmic surface of cellular membranes; H3 is not included in defining this domain" Q#9516 - CGI_10015242 superfamily 241642 214 272 2.35E-09 52.4978 cl00152 t_SNARE superfamily - - "Soluble NSF (N-ethylmaleimide-sensitive fusion protein)-Attachment protein (SNAP) REceptor domain; these alpha-helical motifs form twisted and parallel heterotetrameric helix bundles; the core complex contains one helix from a protein that 
is anchored in the vesicle membrane (synaptobrevin), one helix from a protein of the target membrane (syntaxin), and two helices from another protein anchored in the target membrane (SNAP-25); their interaction forms a core which is composed of a polar zero layer, a flanking leucine-zipper layer acts as a water tight shield to isolate ionic interactions in the zero layer from the surrounding solvent" Q#9516 - CGI_10015242 superfamily 241634 40 171 1.79E-10 57.2942 cl00143 SynN superfamily - - "Syntaxin N-terminus domain; syntaxins are nervous system-specific proteins implicated in the docking of synaptic vesicles with the presynaptic plasma membrane; they are a family of receptors for intracellular transport vesicles; each target membrane may be identified by a specific member of the syntaxin family; syntaxins contain a moderately well conserved amino-terminal domain, called Habc, whose structure is an antiparallel three-helix bundle; a linker of about 30 amino acids connects this to the carboxy-terminal region, designated H3 (t_SNARE), of the syntaxin cytoplasmic domain; the highly conserved H3 region forms a single, long alpha-helix when it is part of the core SNARE complex and anchors the protein on the cytoplasmic surface of cellular membranes; H3 is not included in defining this domain" Q#9517 - CGI_10015243 superfamily 247801 136 342 3.76E-97 289.573 cl17247 CoA_trans superfamily - - Coenzyme A transferase; Coenzyme A transferase. Q#9517 - CGI_10015243 superfamily 247801 23 107 3.48E-25 101.168 cl17247 CoA_trans superfamily N - Coenzyme A transferase; Coenzyme A transferase. Q#9518 - CGI_10015244 superfamily 215859 124 266 0.000391382 39.8923 cl18347 Peptidase_S9 superfamily C - Prolyl oligopeptidase family; Prolyl oligopeptidase family. Q#9521 - CGI_10015247 superfamily 243072 71 204 4.54E-31 115.559 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. 
The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#9522 - CGI_10015248 superfamily 244964 172 250 3.22E-13 63.0218 cl08458 CobW_C superfamily - - "Cobalamin synthesis protein cobW C-terminal domain; This is a large and diverse family of putative metal chaperones that can be separated into up to 15 subgroups. In addition to known roles in cobalamin biosynthesis and the activation of the Fe-type nitrile hydratase, this family is also known to be involved in the response to zinc limitation. The CobW subgroup involved in cobalamin synthesis represents only a small sub-fraction of the family." Q#9522 - CGI_10015248 superfamily 247757 85 116 0.00186388 36.8297 cl17203 Fer4_NifH superfamily N - "The Fer4_NifH superfamily contains a variety of proteins which share a common ATP-binding domain. Functionally, proteins in this superfamily use the energy from hydrolysis of NTP to transfer electron or ion." Q#9523 - CGI_10002335 superfamily 242934 1 142 1.58E-42 143.086 cl02219 Bap31 superfamily - - "B-cell receptor-associated protein 31-like; Bap31 is a polytopic integral protein of the endoplasmic reticulum membrane and a substrate of caspase-8. Bap31 is cleaved within its cytosolic domain, generating pro-apoptotic p20 Bap31." Q#9524 - CGI_10002336 superfamily 247916 18 68 1.19E-07 48.9183 cl17362 Transglut_core superfamily N - "Transglutaminase-like superfamily; This family includes animal transglutaminases and other bacterial proteins of unknown function. 
Sequence conservation in this superfamily primarily involves three motifs that centre around conserved cysteine, histidine, and aspartate residues that form the catalytic triad in the structurally characterized transglutaminase, the human blood clotting factor XIIIa'. On the basis of the experimentally demonstrated activity of the Methanobacterium phage pseudomurein endoisopeptidase, it is proposed that many, if not all, microbial homologues of the transglutaminases are proteases and that the eukaryotic transglutaminases have evolved from an ancestral protease." Q#9525 - CGI_10012175 superfamily 216686 3 152 1.46E-33 120.12 cl18377 Galactosyl_T superfamily - - "Galactosyltransferase; This family includes the galactosyltransferases UDP-galactose:2-acetamido-2-deoxy-D-glucose3beta-galactosyltransferase and UDP-Gal:beta-GlcNAc beta 1,3-galactosyltransferase. Specific galactosyltransferases transfer galactose to GlcNAc terminal chains in the synthesis of the lacto-series oligosaccharides types 1 and 2." Q#9526 - CGI_10012176 superfamily 245235 3613 3688 2.45E-44 170.474 cl10023 POLBc superfamily C - "DNA polymerase type-B family catalytic domain. DNA-directed DNA polymerases elongate DNA by adding nucleotide triphosphate (dNTP) residues to the 3'-end of the growing chain of DNA. DNA-directed DNA polymerases are multifunctional with both synthetic (polymerase) and degradative modes (exonucleases) and play roles in the processes of DNA replication, repair, and recombination. DNA-dependent DNA polymerases can be classified in six main groups based upon their phylogenetic relationships with E. coli polymerase I (class A), E. coli polymerase II (class B), E. coli polymerase III (class C), euryarchaeota polymerase II (class D), human polymerase beta (class X), E. coli UmuC/DinB, and eukaryotic RAP 30/Xeroderma pigmentosum variant (class Y). Family B DNA polymerases include E. 
coli DNA polymerase II, some eubacterial phage DNA polymerases, nuclear replicative DNA polymerases (alpha, delta, epsilon, and zeta), and eukaryotic viral and plasmid-borne enzymes. DNA polymerase is made up of distinct domains and sub-domains. The polymerase domain of DNA polymerase type B (Pol domain) is responsible for the template-directed polymerization of dNTPs onto the growing primer strand of duplex DNA that is usually magnesium dependent. In general, the architecture of the Pol domain has been likened to a right hand with fingers, thumb, and palm sub-domains with a deep groove to accommodate the nucleic acid substrate. There are a few conserved motifs in the Pol domain of family B DNA polymerases. The conserved aspartic acid residues in the DTDS motifs of the palm sub-domain is crucial for binding to divalent metal ion and is suggested to be important for polymerase catalysis." Q#9526 - CGI_10012176 superfamily 245226 3476 3583 8.16E-43 159.708 cl10012 DnaQ_like_exo superfamily NC - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. 
DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#9526 - CGI_10012176 superfamily 245226 3332 3386 4.22E-17 83.8233 cl10012 DnaQ_like_exo superfamily C - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." 
Q#9528 - CGI_10012178 superfamily 149701 287 329 4.01E-12 60.3077 cl07373 Integrin_b_cyt superfamily - - "Integrin beta cytoplasmic domain; Integrins are a group of transmembrane proteins which function as extracellular matrix receptors and in cell adhesion. Integrins are ubiquitously expressed and are heterodimeric, each composed of an alpha and beta subunit. Several variations of the alpha and beta subunits exist, and association of different alpha and beta subunits can have a different binding specificity. This domain corresponds to the cytoplasmic domain of the beta subunit." Q#9528 - CGI_10012178 superfamily 219669 202 262 0.00188703 35.8316 cl06832 Integrin_B_tail superfamily - - Integrin beta tail domain; This is the beta tail domain of the Integrin protein. Integrins are receptors which are involved in cell-cell and cell-extracellular matrix interactions. Q#9529 - CGI_10012179 superfamily 248247 1 277 5.04E-72 244.436 cl17693 Integrin_beta superfamily N - "Integrin, beta chain; Integrins have been found in animals and their homologues have also been found in cyanobacteria, probably due to horizontal gene transfer. The sequence repeats have been trimmed due to an overlap with EGF." Q#9529 - CGI_10012179 superfamily 248247 636 871 3.55E-58 205.146 cl17693 Integrin_beta superfamily C - "Integrin, beta chain; Integrins have been found in animals and their homologues have also been found in cyanobacteria, probably due to horizontal gene transfer. The sequence repeats have been trimmed due to an overlap with EGF." Q#9529 - CGI_10012179 superfamily 149701 569 605 1.13E-08 52.6037 cl07373 Integrin_b_cyt superfamily - - "Integrin beta cytoplasmic domain; Integrins are a group of transmembrane proteins which function as extracellular matrix receptors and in cell adhesion. Integrins are ubiquitously expressed and are heterodimeric, each composed of an alpha and beta subunit. 
Several variations of the alpha and beta subunits exist, and association of different alpha and beta subunits can have a different binding specificity. This domain corresponds to the cytoplasmic domain of the beta subunit." Q#9530 - CGI_10012180 superfamily 248247 9 298 8.37E-74 242.895 cl17693 Integrin_beta superfamily N - "Integrin, beta chain; Integrins have been found in animals and their homologues have also been found in cyanobacteria, probably due to horizontal gene transfer. The sequence repeats have been trimmed due to an overlap with EGF." Q#9531 - CGI_10012181 superfamily 248247 34 154 5.18E-28 106.92 cl17693 Integrin_beta superfamily C - "Integrin, beta chain; Integrins have been found in animals and their homologues have also been found in cyanobacteria, probably due to horizontal gene transfer. The sequence repeats have been trimmed due to an overlap with EGF." Q#9533 - CGI_10012183 superfamily 247856 130 157 0.00342012 34.878 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#9535 - CGI_10003639 superfamily 241554 44 246 1.90E-15 71.7154 cl00019 Macro superfamily - - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. 
Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#9537 - CGI_10004868 superfamily 241874 31 593 0 576.053 cl00456 SLC5-6-like_sbd superfamily - - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#9538 - CGI_10004869 superfamily 241594 162 321 3.76E-06 46.404 cl00077 HECTc superfamily C - "HECT domain; C-terminal catalytic domain of a subclass of Ubiquitin-protein ligase (E3). It binds specific ubiquitin-conjugating enzymes (E2), accepts ubiquitin from E2, transfers ubiquitin to substrate lysine side chains, and transfers additional ubiquitin molecules to the end of growing ubiquitin chains." 
Q#9540 - CGI_10004871 superfamily 217414 513 636 1.13E-05 46.554 cl03927 Otopetrin superfamily N - "Protein of unknown function, DUF270; Protein of unknown function, DUF270. " Q#9542 - CGI_10004873 superfamily 246918 232 263 6.56E-05 40.2627 cl15278 TSP_1 superfamily C - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#9544 - CGI_10003481 superfamily 247724 5 166 2.51E-130 366.5 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." 
Q#9546 - CGI_10003483 superfamily 241563 97 134 0.000201953 39.1928 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#9547 - CGI_10003484 superfamily 241563 70 110 0.00116731 37.0736 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#9547 - CGI_10003484 superfamily 110440 487 514 0.00137741 37.0021 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#9547 - CGI_10003484 superfamily 217316 118 198 0.00163264 37.2208 cl03832 DUF234 superfamily - - Archaea bacterial proteins of unknown function; Archaea bacterial proteins of unknown function. Q#9547 - CGI_10003484 superfamily 110440 529 556 0.00935534 34.3057 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. 
Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#9549 - CGI_10007903 superfamily 247757 20 153 6.43E-58 181.685 cl17203 Fer4_NifH superfamily - - "The Fer4_NifH superfamily contains a variety of proteins which share a common ATP-binding domain. Functionally, proteins in this superfamily use the energy from hydrolysis of NTP to transfer electron or ion." Q#9551 - CGI_10007905 superfamily 241584 71 131 2.39E-09 50.1875 cl00065 FN3 superfamily C - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#9553 - CGI_10007907 superfamily 245599 254 432 2.66E-39 140.436 cl11397 NR_LBD superfamily - - "The ligand binding domain of nuclear receptors, a family of ligand-activated transcription regulators; Ligand-binding domain (LBD) of nuclear receptor (NR): Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions in metazoans, from development, reproduction, to homeostasis and metabolism. The superfamily contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. The members of the family include receptors of steroids, thyroid hormone, retinoids, cholesterol by-products, lipids and heme. 
With few exceptions, NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD)." Q#9553 - CGI_10007907 superfamily 207662 70 163 1.25E-43 149.518 cl02596 NR_DBD_like superfamily - - "DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers; DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. It interacts with a specific DNA site upstream of the target gene and modulates the rate of transcriptional initiation. Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions, from development, reproduction, to homeostasis and metabolism in animals (metazoans). The family contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). Most nuclear receptors bind as homodimers or heterodimers to their target sites, which consist of two hexameric half-sites. Specificity is determined by the half-site sequence, the relative orientation of the half-sites and the number of spacer nucleotides between the half-sites. However, a growing number of nuclear receptors have been reported to bind to DNA as monomers." Q#9554 - CGI_10007908 superfamily 243290 133 273 1.70E-45 151.512 cl03075 GrpE superfamily - - "GrpE is the adenine nucleotide exchange factor of DnaK (Hsp70)-type ATPases. The GrpE dimer binds to the ATPase domain of Hsp70 catalyzing the dissociation of ADP, which enables rebinding of ATP, one step in the Hsp70 reaction cycle in protein folding. 
In eukaryotes, only the mitochondrial Hsp70, not the cytosolic form, is GrpE dependent." Q#9559 - CGI_10007913 superfamily 243175 79 150 7.75E-21 83.4936 cl02776 GST_C_family superfamily C - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." 
Q#9560 - CGI_10007914 superfamily 247792 209 251 5.36E-11 56.3 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-(N/C/H)-X2-C-X(4-48)-C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#9560 - CGI_10007914 superfamily 218247 37 173 5.27E-21 87.8171 cl04727 Pex2_Pex12 superfamily N - "Pex2 / Pex12 amino terminal region; This region is found at the N terminal of a number of known and predicted peroxins including Pex2, Pex10 and Pex12. This conserved region is usually associated with a C terminal ring finger (pfam00097) domain." Q#9566 - CGI_10007940 superfamily 189857 5 124 2.50E-38 128.522 cl07832 Caveolin superfamily - - "Caveolin; All three known Caveolin forms have the FEDVIAEP caveolin 'signature motif' within their hydrophilic N-terminal domain. Caveolin 2 (Cav-2) is co-localised and co-expressed with Cav-1/VIP21, forms heterodimers with it and needs Cav-1 for proper membrane localisation. Cav-3 has greater protein sequence similarity to Cav-1 than to Cav-2. Cellular processes caveolins are involved in include vesicular transport, cholesterol homeostasis, signal transduction, and tumour suppression." Q#9567 - CGI_10007941 superfamily 189857 5 124 2.84E-38 128.137 cl07832 Caveolin superfamily - - "Caveolin; All three known Caveolin forms have the FEDVIAEP caveolin 'signature motif' within their hydrophilic N-terminal domain. 
Caveolin 2 (Cav-2) is co-localised and co-expressed with Cav-1/VIP21, forms heterodimers with it and needs Cav-1 for proper membrane localisation. Cav-3 has greater protein sequence similarity to Cav-1 than to Cav-2. Cellular processes caveolins are involved in include vesicular transport, cholesterol homeostasis, signal transduction, and tumour suppression." Q#9568 - CGI_10007942 superfamily 189857 5 124 1.16E-38 129.292 cl07832 Caveolin superfamily - - "Caveolin; All three known Caveolin forms have the FEDVIAEP caveolin 'signature motif' within their hydrophilic N-terminal domain. Caveolin 2 (Cav-2) is co-localised and co-expressed with Cav-1/VIP21, forms heterodimers with it and needs Cav-1 for proper membrane localisation. Cav-3 has greater protein sequence similarity to Cav-1 than to Cav-2. Cellular processes caveolins are involved in include vesicular transport, cholesterol homeostasis, signal transduction, and tumour suppression." Q#9570 - CGI_10007944 superfamily 219502 399 642 1.74E-73 237.728 cl06625 Nucleos_tra2_C superfamily - - Na+ dependent nucleoside transporter C-terminus; This family consists of nucleoside transport proteins. Rat CNT 2 is a purine-specific Na+-nucleoside cotransporter localised to the bile canalicular membrane. CNT 1 is a Na+-dependent nucleoside transporter selective for pyrimidine nucleosides and adenosine; it also transports the anti-viral nucleoside analogues AZT and ddC. This alignment covers the C-terminus of this family of transporters. Q#9570 - CGI_10007944 superfamily 201962 215 287 6.41E-22 90.898 cl03347 Nucleos_tra2_N superfamily - - Na+ dependent nucleoside transporter N-terminus; This family consists of nucleoside transport proteins. Rat CNT 2 is a purine-specific Na+-nucleoside cotransporter localised to the bile canalicular membrane. Rat CNT 1 is a Na+-dependent nucleoside transporter selective for pyrimidine nucleosides and adenosine; it also transports the anti-viral nucleoside analogues AZT and ddC. 
This alignment covers the N terminus of this family. Q#9570 - CGI_10007944 superfamily 219507 296 394 2.80E-13 66.8791 cl18514 Gate superfamily - - "Nucleoside recognition; This region in the nucleoside transporter proteins is responsible for determining nucleoside specificity in the human CNT1 and CNT2 proteins. In the FeoB proteins, which are believed to be Fe2+ transporters, it includes the membrane pore region, so the function of this region is likely to be more general than just nucleoside specificity. This family may represent the pore and gate, with a wide potential range of specificity. Hence its name 'Gate'." Q#9571 - CGI_10007945 superfamily 110440 462 489 0.00524988 35.0761 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#9573 - CGI_10007947 superfamily 241563 63 103 0.0024083 36.3032 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#9575 - CGI_10007949 superfamily 216686 227 392 2.76E-26 104.712 cl18377 Galactosyl_T superfamily - - "Galactosyltransferase; This family includes the galactosyltransferases UDP-galactose:2-acetamido-2-deoxy-D-glucose3beta-galactosyltransferase and UDP-Gal:beta-GlcNAc beta 1,3-galactosyltransferase. 
Specific galactosyltransferases transfer galactose to GlcNAc terminal chains in the synthesis of the lacto-series oligosaccharides types 1 and 2." Q#9576 - CGI_10007950 superfamily 247724 311 517 4.60E-92 281.891 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." 
Q#9576 - CGI_10007950 superfamily 243092 15 306 2.68E-34 130.533 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#9578 - CGI_10023516 superfamily 241563 81 120 2.85E-05 42.0812 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#9578 - CGI_10023516 superfamily 110440 505 531 8.10E-05 40.4689 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. 
The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#9579 - CGI_10023517 superfamily 110440 381 405 0.000161946 39.3133 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#9580 - CGI_10023518 superfamily 241584 118 194 1.12E-12 61.7435 cl00065 FN3 superfamily C - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#9580 - CGI_10023518 superfamily 241584 7 113 2.00E-05 41.3279 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. 
Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#9581 - CGI_10023519 superfamily 217915 272 777 1.89E-133 413.44 cl14957 Spc97_Spc98 superfamily - - Spc97 / Spc98 family; The spindle pole body (SPB) functions as the microtubule-organising centre in yeast. Members of this family are spindle pole body (SPB) components such as Spc97 and Spc98 that form a complex with gamma-tubulin. This family of proteins includes the grip motif 1 and grip motif 2. Q#9583 - CGI_10023521 superfamily 245596 8 257 2.00E-105 318.436 cl11394 Glyco_tranf_GTA_type superfamily - - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." 
Q#9584 - CGI_10023522 superfamily 241832 164 253 3.10E-53 178.456 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#9584 - CGI_10023522 superfamily 241832 266 355 2.47E-51 173.448 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). 
Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#9584 - CGI_10023522 superfamily 241832 42 138 5.01E-41 145.107 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#9584 - CGI_10023522 superfamily 243082 558 694 4.47E-21 92.9314 cl02553 Peptidase_C19 superfamily N - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. 
The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." Q#9584 - CGI_10023522 superfamily 243046 361 540 3.01E-30 121.641 cl02460 Transferrin superfamily C - Transferrin; Transferrin. Q#9585 - CGI_10023523 superfamily 243352 58 349 1.94E-114 336.489 cl03224 Porin3 superfamily - - "Eukaryotic porin family that forms channels in the mitochondrial outer membrane; The porin family 3 contains two sub-families that play vital roles in the mitochondrial outer membrane, a translocase for unfolded pre-proteins (Tom40) and the voltage-dependent anion channel (VDAC) that regulates the flux of mostly anionic metabolites through the outer mitochondrial membrane." Q#9586 - CGI_10023524 superfamily 245456 581 812 7.90E-110 337.666 cl10970 AP_MHD_Cterm superfamily - - "C-terminal domain of adaptor protein (AP) complexes medium mu subunits and its homologs (MHD); This family corresponds to the C-terminal domain of heterotetrameric AP complexes medium mu subunits and its homologs existing in monomeric stonins, delta-subunit of the heteroheptameric coat protein I (delta-COPI), a protein encoded by a pro-death gene referred as MuD (also known as MUDENG, mu-2 related death-inducing gene), an endocytic adaptor syp1, the mammalian FCH domain only proteins (FCHo1/2), SH3-containing GRB2-like protein 3-interacting protein 1 (SGIP1), and related proteins. AP complexes participate in the formation of intracellular coated transport vesicles and select cargo molecules for incorporation into the coated vesicles in the late secretory and endocytic pathways. Stonins have been characterized as clathrin-dependent AP-2 mu chain related factors and may act as cargo-specific sorting adaptors in endocytosis. Coat protein complex I (COPI)-coated vesicles function in the early secretory pathway. 
They mediate the retrograde transport from the Golgi to the ER, and intra-Golgi transport. MuD is distantly related to the C-terminal domain of mu2 subunit of AP-2. It is able to induce cell death by itself and plays an important role in cell death in various tissues. Syp1 represents a novel type of endocytic adaptor protein that participates in endocytosis, promotes vesicle tubulation, and contributes to cell polarity and stress responses. It shares the same domain architecture with its two ubiquitously expressed mammalian counterparts, FCHo1/2, which represent key initial proteins ultimately controlling cellular nutrient uptake, receptor regulation, and synaptic vesicle retrieval. They bind specifically to the plasma membrane and recruit the scaffold proteins eps15 and intersectin, which subsequently engage the adaptor complex AP2 and clathrin, leading to coated vesicle formation. Another mammalian neuronal-specific protein SGIP1 does have a C-terminal MHD and has been classified into this family as well. It is an endophilin-interacting protein that plays an obligatory role in the regulation of energy homeostasis. It is also involved in clathrin-mediated endocytosis by interacting with phospholipids and eps15." Q#9586 - CGI_10023524 superfamily 242876 72 193 0.00154635 38.4893 cl02092 Clat_adaptor_s superfamily - - Clathrin adaptor complex small chain; Clathrin adaptor complex small chain. Q#9588 - CGI_10023526 superfamily 243095 301 489 7.93E-100 321.76 cl02570 RhoGAP superfamily - - "RhoGAP: GTPase-activator protein (GAP) for Rho-like GTPases; GAPs towards Rho/Rac/Cdc42-like small GTPases. Small GTPases (G proteins) cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when bound to GDP. 
The Rho family of small G proteins, which includes Cdc42Hs, activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. G proteins generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. The RhoGAPs are one of the major classes of regulators of Rho G proteins." Q#9588 - CGI_10023526 superfamily 247683 194 247 1.18E-29 115.239 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#9588 - CGI_10023526 superfamily 243088 59 172 5.75E-31 120.675 cl02563 PX_domain superfamily - - "The Phox Homology domain, a phosphoinositide binding module; The PX domain is a phosphoinositide (PI) binding module involved in targeting proteins to membranes. 
Proteins containing PX domains interact with PIs and have been implicated in highly diverse functions such as cell signaling, vesicular trafficking, protein sorting, lipid modification, cell polarity and division, activation of T and B cells, and cell survival. Many members of this superfamily bind phosphatidylinositol-3-phosphate (PI3P) but in some cases, other PIs such as PI4P or PI(3,4)P2, among others, are the preferred substrates. In addition to protein-lipid interaction, the PX domain may also be involved in protein-protein interaction, as in the cases of p40phox, p47phox, and some sorting nexins (SNXs). The PX domain is conserved from yeast to humans and is found in more than 100 proteins. The majority of PX domain-containing proteins are SNXs, which play important roles in endosomal sorting." Q#9589 - CGI_10023527 superfamily 247866 51 268 6.01E-32 119.094 cl17312 PhyH superfamily - - "Phytanoyl-CoA dioxygenase (PhyH); This family is made up of several eukaryotic phytanoyl-CoA dioxygenase (PhyH) proteins, ectoine hydroxylases and a number of bacterial deoxygenases. PhyH is a peroxisomal enzyme catalyzing the first step of phytanic acid alpha-oxidation. PhyH deficiency causes Refsum's disease (RD) which is an inherited neurological syndrome biochemically characterized by the accumulation of phytanic acid in plasma and tissues." Q#9592 - CGI_10023530 superfamily 247903 12 199 6.91E-54 173.252 cl17349 Peptidase_M54 superfamily - - "Peptidase family M54, also called archaemetzincins or archaelysins; Peptidase M54 (archaemetzincin or archaelysin) is a zinc-dependent aminopeptidase that contains the consensus zinc-binding sequence HEXXHXXGXXH/D and a conserved Met residue at the active site, and is thus classified as a metzincin. Archaemetzincins, first identified in archaea, are also found in bacteria and eukaryotes, including two human members, archaemetzincin-1 and -2 (AMZ1 and AMZ2). 
AMZ1 is mainly found in the liver and heart while AMZ2 is primarily expressed in testis and heart; both have been reported to degrade synthetic substrates and peptides. The Peptidase M54 family contains an extended metzincin consensus sequence of HEXXHXXGX3CX4CXMX17CXXC such that a second zinc ion is bound to four cysteines, thus resembling a zinc finger. Phylogenetic analysis of this family reveals a complex evolutionary process involving a series of lateral gene transfer, gene loss and genetic duplication events." Q#9593 - CGI_10023531 superfamily 241644 1 86 5.22E-32 110.37 cl00154 UBCc superfamily C - "Ubiquitin-conjugating enzyme E2, catalytic (UBCc) domain. This is part of the ubiquitin-mediated protein degradation pathway in which a thiol-ester linkage forms between a conserved cysteine and the C-terminus of ubiquitin and complexes with ubiquitin protein ligase enzymes, E3. This pathway regulates many fundamental cellular processes. There are also other E2s which form thiol-ester linkages without the use of E3s as well as several UBC homologs (TSG101, Mms2, Croc-1 and similar proteins) which lack the active site cysteine essential for ubiquitination and appear to function in DNA repair pathways which were omitted from the scope of this CD." Q#9594 - CGI_10023532 superfamily 248458 36 214 5.71E-20 90.8361 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. 
MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#9594 - CGI_10023532 superfamily 248458 467 657 4.06E-11 63.8721 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. 
Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#9595 - CGI_10023533 superfamily 247755 8 175 2.85E-32 122.506 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#9595 - CGI_10023533 superfamily 247789 428 565 5.30E-11 61.1206 cl17235 ABC2_membrane superfamily - - ABC-2 type transporter; ABC-2 type transporter. Q#9596 - CGI_10023534 superfamily 207662 76 158 3.25E-47 157.963 cl02596 NR_DBD_like superfamily - - "DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers; DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. 
It interacts with a specific DNA site upstream of the target gene and modulates the rate of transcriptional initiation. Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions, from development, reproduction, to homeostasis and metabolism in animals (metazoans). The family contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). Most nuclear receptors bind as homodimers or heterodimers to their target sites, which consist of two hexameric half-sites. Specificity is determined by the half-site sequence, the relative orientation of the half-sites and the number of spacer nucleotides between the half-sites. However, a growing number of nuclear receptors have been reported to bind to DNA as monomers." Q#9596 - CGI_10023534 superfamily 245599 232 413 4.52E-38 138.124 cl11397 NR_LBD superfamily - - "The ligand binding domain of nuclear receptors, a family of ligand-activated transcription regulators; Ligand-binding domain (LBD) of nuclear receptor (NR): Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions in metazoans, from development, reproduction, to homeostasis and metabolism. The superfamily contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. The members of the family include receptors of steroids, thyroid hormone, retinoids, cholesterol by-products, lipids and heme. 
With few exceptions, NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD)." Q#9597 - CGI_10023535 superfamily 241799 8 225 1.17E-100 294.38 cl00339 SugarP_isomerase superfamily - - "SugarP_isomerase: Sugar Phosphate Isomerase family; includes type A ribose 5-phosphate isomerase (RPI_A), glucosamine-6-phosphate (GlcN6P) deaminase, and 6-phosphogluconolactonase (6PGL). RPI catalyzes the reversible conversion of ribose-5-phosphate to ribulose 5-phosphate, the first step of the non-oxidative branch of the pentose phosphate pathway. GlcN6P deaminase catalyzes the reversible conversion of GlcN6P to D-fructose-6-phosphate (Fru6P) and ammonium, the last step of the metabolic pathway of N-acetyl-D-glucosamine-6-phosphate. 6PGL converts 6-phosphoglucono-1,5-lactone to 6-phosphogluconate, the second step of the oxidative phase of the pentose phosphate pathway." Q#9598 - CGI_10023536 superfamily 222150 595 619 6.45E-05 41.2233 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#9599 - CGI_10023537 superfamily 243072 395 544 5.56E-16 77.4238 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#9599 - CGI_10023537 superfamily 243072 521 612 6.65E-06 46.6078 cl02529 ANK superfamily C - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). 
ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#9602 - CGI_10023540 superfamily 217293 30 109 7.37E-13 60.7243 cl03788 Neur_chan_LBD superfamily C - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#9603 - CGI_10023541 superfamily 219425 38 156 5.14E-17 72.5742 cl06494 Hydrolase_2 superfamily - - "Cell Wall Hydrolase; These enzymes have been implicated in cell wall hydrolysis, most extensively in Bacillus subtilis. For instance B. subtilis sleB is expressed during sporulation as an inactive form and then deposited on the cell outer cortex. During germination the enzyme is activated and hydrolyses the cortex. A similar role is carried out by the partially redundant B. subtilis cwlJ. It is not clear whether these enzymes are amidases or peptidases." Q#9604 - CGI_10023542 superfamily 219425 24 129 1.47E-09 51.3883 cl06494 Hydrolase_2 superfamily - - "Cell Wall Hydrolase; These enzymes have been implicated in cell wall hydrolysis, most extensively in Bacillus subtilis. For instance B. subtilis sleB is expressed during sporulation as an inactive form and then deposited on the cell outer cortex. During germination the enzyme is activated and hydrolyses the cortex. A similar role is carried out by the partially redundant B. subtilis cwlJ. It is not clear whether these enzymes are amidases or peptidases." Q#9607 - CGI_10023546 superfamily 243035 10 75 1.58E-10 53.0773 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. 
This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. 
In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#9609 - CGI_10023548 superfamily 202203 223 292 8.44E-24 96.8685 cl03534 E2F_TDP superfamily - - "E2F/DP family winged-helix DNA-binding domain; This family contains the transcription factor E2F and its dimerisation partners TDP1 and TDP2, which stimulate E2F-dependent transcription. E2F binds to DNA as a homodimer or as a heterodimer in association with TDP1/2, the heterodimer having increased binding efficiency. The crystal structure of an E2F4-DP2-DNA complex shows that the DNA-binding domains of the E2F and DP proteins both have a fold related to the winged-helix DNA-binding motif. Recognition of the central c/gGCGCg/c sequence of the consensus DNA-binding site is symmetric, and amino acids that contact these bases are conserved among all known E2F and DP proteins." Q#9609 - CGI_10023548 superfamily 202203 355 421 4.00E-12 63.3561 cl03534 E2F_TDP superfamily C - "E2F/DP family winged-helix DNA-binding domain; This family contains the transcription factor E2F and its dimerisation partners TDP1 and TDP2, which stimulate E2F-dependent transcription. E2F binds to DNA as a homodimer or as a heterodimer in association with TDP1/2, the heterodimer having increased binding efficiency. The crystal structure of an E2F4-DP2-DNA complex shows that the DNA-binding domains of the E2F and DP proteins both have a fold related to the winged-helix DNA-binding motif. 
Recognition of the central c/gGCGCg/c sequence of the consensus DNA-binding site is symmetric, and amino acids that contact these bases are conserved among all known E2F and DP proteins." Q#9611 - CGI_10023550 superfamily 241559 3 101 3.56E-16 71.1879 cl00030 CH superfamily - - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." Q#9613 - CGI_10023552 superfamily 247743 502 581 1.60E-05 45.5999 cl17189 AAA superfamily C - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#9613 - CGI_10023552 superfamily 192053 1965 2041 5.82E-34 128.045 cl18175 COQ9 superfamily - - COQ9; COQ9 is an enzyme that is required for the biosynthesis of coenzyme Q. It may either catalyze a reaction in the coenzyme Q biosynthetic pathway or have a regulatory role. Q#9613 - CGI_10023552 superfamily 219908 1539 1706 1.61E-09 61.7241 cl07251 Kinetochor_Ybp2 superfamily N - "Uncharacterized protein family, YAP/Alf4/glomulin; This entry contains a number of protein families with apparently unrelated functions. These include the YAP binding proteins of yeasts. 
These are stress response and redox homeostasis proteins, induced by hydrogen peroxide or induced in response to alkylating agent methyl methanesulphonate (MMS). The family includes Aberrant root formation protein 4 (Alf4) of Arabidopsis thaliana (Mouse-ear cress), which is required for the initiation of lateral roots independent from auxin signalling. It may also function in maintaining the pericycle in the mitotically competent state needed for lateral root formation. The family includes glomulin (FAP68), which is essential for normal development of the vasculature and may represent a naturally occurring ligand of the immunophilins FKBP59 and FKBP12." Q#9616 - CGI_10001435 superfamily 248097 7 73 3.21E-06 40.3262 cl17543 C1q superfamily N - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#9619 - CGI_10005170 superfamily 241600 46 173 1.48E-58 184.366 cl00085 FReD superfamily C - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#9623 - CGI_10011782 superfamily 245201 21 300 0 569.621 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. 
It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#9626 - CGI_10011785 superfamily 245212 77 146 0.00186683 35.7723 cl09940 S4 superfamily NC - "S4/Hsp/ tRNA synthetase RNA-binding domain; The domain surface is populated by conserved, charged residues that define a likely RNA-binding site; Found in stress proteins, ribosomal proteins and tRNA synthetases; This may imply a hitherto unrecognized functional similarity between these three protein classes." Q#9628 - CGI_10011787 superfamily 191378 22 115 0.000425029 36.7172 cl05416 PSP94 superfamily - - "Beta-microseminoprotein (PSP-94); This family consists of the mammalian specific protein beta-microseminoprotein. Prostatic secretory protein of 94 amino acids (PSP94), also called beta-microseminoprotein, is a small, nonglycosylated protein, rich in cysteine residues. It was first isolated as a major protein from human seminal plasma. The exact function of this protein is unknown." Q#9629 - CGI_10011788 superfamily 215648 304 554 1.10E-48 171.625 cl02802 7tm_3 superfamily - - "7 transmembrane sweet-taste receptor of 3 GCPR; This is a domain of seven transmembrane regions that forms the C-terminus of some subclass 3 G-coupled-protein receptors. It is often associated with a downstream cysteine-rich linker domain, NCD3G pfam07562, which is the human sweet-taste receptor, and the N-terminal domain, ANF_receptor pfam01094. The seven TM regions assemble in such a way as to produce a docking pocket into which such molecules as cyclamate and lactisole have been found to bind and consequently confer the taste of sweetness." 
Q#9629 - CGI_10011788 superfamily 245225 1 206 2.86E-28 117.729 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily N - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein couples receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine- binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." 
Q#9629 - CGI_10011788 superfamily 219467 224 273 1.18E-12 64.2767 cl08456 NCD3G superfamily - - "Nine Cysteines Domain of family 3 GPCR; This conserved sequence contains several highly-conserved Cys residues that are predicted to form disulphide bridges. It is predicted to lie outside the cell membrane, tethered to the pfam00003 in several receptor proteins." Q#9634 - CGI_10011794 superfamily 241568 914 967 8.40E-07 47.8428 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#9634 - CGI_10011794 superfamily 243032 257 428 2.63E-06 49.1271 cl02427 Pumilio superfamily C - "Pumilio-family RNA binding domain; Puf repeats (also labelled PUM-HD or Pumilio homology domain) mediate sequence specific RNA binding in fly Pumilio, worm FBF-1 and FBF-2, and many other proteins such as vertebrate Pumilio. These proteins function as translational repressors in early embryonic development by binding to sequences in the 3' UTR of target mRNAs, such as the nanos response element (NRE) in fly Hunchback mRNA, or the point mutation element (PME) in worm fem-3 mRNA. Other proteins that contain Puf domains are also plausible RNA binding proteins. Yeast PUF1 (JSN1), for instance, appears to contain a single RNA-recognition motif (RRM) domain. 
Puf repeat proteins have been observed to function asymmetrically and may be responsible for creating protein gradients involved in the specification of cell fate and differentiation. Puf domains usually occur as a tandem repeat of 8 domains. This model encompasses all 8 tandem repeats. Some proteins may have fewer (canonical) repeats." Q#9634 - CGI_10011794 superfamily 243032 361 555 8.72E-06 47.5863 cl02427 Pumilio superfamily C - "Pumilio-family RNA binding domain; Puf repeats (also labelled PUM-HD or Pumilio homology domain) mediate sequence specific RNA binding in fly Pumilio, worm FBF-1 and FBF-2, and many other proteins such as vertebrate Pumilio. These proteins function as translational repressors in early embryonic development by binding to sequences in the 3' UTR of target mRNAs, such as the nanos response element (NRE) in fly Hunchback mRNA, or the point mutation element (PME) in worm fem-3 mRNA. Other proteins that contain Puf domains are also plausible RNA binding proteins. Yeast PUF1 (JSN1), for instance, appears to contain a single RNA-recognition motif (RRM) domain. Puf repeat proteins have been observed to function asymmetrically and may be responsible for creating protein gradients involved in the specification of cell fate and differentiation. Puf domains usually occur as a tandem repeat of 8 domains. This model encompasses all 8 tandem repeats. Some proteins may have fewer (canonical) repeats." Q#9637 - CGI_10022134 superfamily 241583 137 274 3.54E-10 57.581 cl00064 ZnMc superfamily N - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." 
Q#9637 - CGI_10022134 superfamily 241571 323 390 0.000174647 39.7031 cl00049 CUB superfamily C - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#9638 - CGI_10022135 superfamily 245206 5 254 6.16E-67 211.537 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#9639 - CGI_10022136 superfamily 241571 85 182 1.92E-10 56.6518 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#9640 - CGI_10022137 superfamily 241571 78 194 6.88E-09 52.7998 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." 
Q#9642 - CGI_10022139 superfamily 247725 670 791 1.42E-22 94.3615 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#9643 - CGI_10022140 superfamily 243072 168 294 2.09E-37 136.745 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#9643 - CGI_10022140 superfamily 243072 30 160 1.61E-21 91.6762 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#9645 - CGI_10022142 superfamily 242395 105 378 1.19E-77 244.147 cl01255 nadF superfamily - - NAD kinase [Coenzyme metabolism] Q#9646 - CGI_10022143 superfamily 201673 159 677 1.13E-101 324.915 cl18217 Glyco_hydro_39 superfamily - - Glycosyl hydrolases family 39; Glycosyl hydrolases family 39. Q#9649 - CGI_10022146 superfamily 220396 24 141 3.56E-27 103.012 cl14975 DUF2349 superfamily - - Uncharacterized conserved protein (DUF2349); Members of this family of uncharacterized novel proteins have no known function. Q#9650 - CGI_10022147 superfamily 241740 24 124 1.78E-41 135.932 cl00269 cytidine_deaminase-like superfamily - - "Cytidine and deoxycytidylate deaminase zinc-binding region. The family contains cytidine deaminases, nucleoside deaminases, deoxycytidylate deaminases and riboflavin deaminases. Also included are the apoBec family of mRNA editing enzymes. All members are Zn dependent. The zinc ion in the active site plays a central role in the proposed catalytic mechanism, activating a water molecule to form a hydroxide ion that performs a nucleophilic attack on the substrate." Q#9651 - CGI_10022148 superfamily 227891 197 531 6.58E-35 136.173 cl15889 COG5604 superfamily C - Uncharacterized conserved protein [Function unknown] Q#9653 - CGI_10022150 superfamily 227891 4 143 1.45E-17 81.8596 cl15889 COG5604 superfamily N - Uncharacterized conserved protein [Function unknown] Q#9654 - CGI_10022151 superfamily 241563 99 135 0.000262316 39.3848 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#9655 - CGI_10022152 superfamily 248313 33 129 0.00618626 33.7438 cl17759 EamA superfamily N - EamA-like transporter family; This family includes many hypothetical membrane proteins of unknown function. 
Many of the proteins contain two copies of the aligned region. The family used to be known as DUF6. Q#9656 - CGI_10022153 superfamily 220735 402 1023 0 631.607 cl15660 Ufd2P_core superfamily - - "Ubiquitin elongating factor core; This is the most conserved part of the core region of Ufd2P ubiquitin elongating factor or E4, running from helix alpha-11 to alpha-38. It consists of 31 helices of variable length connected by loops of variable size forming a compact unit; the helical packing pattern of the compact unit consists of five structural repeats that resemble tandem Armadillo (ARM) repeats. This domain is involved in ubiquitination as it binds Cdc48p and escorts ubiquitinated proteins from Cdc48p to the proteasome for degradation. The core is structurally similar to the nuclear transporter protein importin-alpha. The core is associated with the U-box at the C-terminus, pfam04564, which has ligase activity." Q#9656 - CGI_10022153 superfamily 248098 1038 1107 1.59E-27 108.15 cl17544 U-box superfamily - - U-box domain; This domain is related to the Ring finger pfam00097 but lacks the zinc binding residues. Q#9657 - CGI_10022154 superfamily 247742 62 414 0 606.787 cl17188 enolase_like superfamily - - "Enolase-superfamily, characterized by the presence of an enolate anion intermediate which is generated by abstraction of the alpha-proton of the carboxylate substrate by an active site residue and is stabilized by coordination to the essential Mg2+ ion. Enolase superfamily contains different enzymes, like enolases, glutarate-, fucanate- and galactonate dehydratases, o-succinylbenzoate synthase, N-acylamino acid racemase, L-alanine-DL-glutamate epimerase, mandelate racemase, muconate lactonizing enzyme and 3-methylaspartase." 
Q#9657 - CGI_10022154 superfamily 149077 881 989 3.87E-31 119.652 cl06719 TMC superfamily - - "TMC domain; These sequences are similar to a region conserved amongst various protein products of the transmembrane channel-like (TMC) gene family, such as Transmembrane channel-like protein 3 and EVIN2 - this region is termed the TMC domain. Mutations in these genes are implicated in a number of human conditions, such as deafness and epidermodysplasia verruciformis. TMC proteins are thought to have important cellular roles, and may be modifiers of ion channels or transporters." Q#9658 - CGI_10022155 superfamily 245835 18 217 6.65E-91 270.713 cl12013 BAR superfamily - - "The Bin/Amphiphysin/Rvs (BAR) domain, a dimerization module that binds membranes and detects membrane curvature; BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions including organelle biogenesis, membrane trafficking or remodeling, and cell division and migration. Mutations in BAR containing proteins have been linked to diseases and their inactivation in cells leads to altered membrane dynamics. A BAR domain with an additional N-terminal amphipathic helix (an N-BAR) can drive membrane curvature. These N-BAR domains are found in amphiphysins and endophilins, among others. BAR domains are also frequently found alongside domains that determine lipid specificity, such as the Pleckstrin Homology (PH) and Phox Homology (PX) domains which are present in beta centaurins (ACAPs and ASAPs) and sorting nexins, respectively. A FES-CIP4 Homology (FCH) domain together with a coiled coil region is called the F-BAR domain and is present in Pombe/Cdc15 homology (PCH) family proteins, which include Fes/Fes tyrosine kinases, PACSIN or syndapin, CIP4-like proteins, and srGAPs, among others. 
The Inverse (I)-BAR or IRSp53/MIM homology Domain (IMD) is found in multi-domain proteins, such as IRSp53 and MIM, that act as scaffolding proteins and transducers of a variety of signaling pathways that link membrane dynamics and the underlying actin cytoskeleton. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The I-BAR domain induces membrane protrusions in the opposite direction compared to classical BAR and F-BAR domains, which produce membrane invaginations. BAR domains that also serve as protein interaction domains include those of arfaptin and OPHN1-like proteins, among others, which bind to Rac and Rho GAP domains, respectively." Q#9659 - CGI_10022156 superfamily 214507 221 260 0.000283309 38.5652 cl15307 LRRCT superfamily C - Leucine rich repeat C-terminal domain; Leucine rich repeat C-terminal domain. Q#9659 - CGI_10022156 superfamily 246925 36 142 0.00278006 38.1054 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#9660 - CGI_10022157 superfamily 247725 1 96 4.71E-49 165.861 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. 
They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#9660 - CGI_10022157 superfamily 243072 390 507 1.25E-24 99.3802 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#9660 - CGI_10022157 superfamily 243047 144 257 7.66E-53 177.042 cl02464 ArfGap superfamily - - "Putative GTPase activating protein for Arf; Putative zinc fingers with GTPase activating proteins (GAPs) towards the small GTPase, Arf. The GAP of ARD1 stimulates GTPase hydrolysis for ARD1 but not ARFs." Q#9662 - CGI_10022159 superfamily 206077 230 269 9.47E-14 64.5664 cl18287 AA_permease_C superfamily C - C-terminus of AA_permease; This is the C-terminus of AA-permease enzymes that is not captured by the models pfam00324 and pfam13520. Q#9664 - CGI_10022161 superfamily 238076 187 292 1.17E-53 184.545 cl18938 PAX superfamily - - Paired Box domain Q#9665 - CGI_10022162 superfamily 245201 273 601 0 674.998 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. 
These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#9665 - CGI_10022162 superfamily 243157 59 141 6.16E-42 145.952 cl02720 PB1 superfamily - - "The PB1 domain is a modular domain mediating specific protein-protein interactions which play a role in many critical cell processes, such as osteoclastogenesis, angiogenesis, early cardiovascular development, and cell polarity. A canonical PB1-PB1 interaction, which involves heterodimerization of two PB1 domain, is required for the formation of macromolecular signaling complexes ensuring specificity and fidelity during cellular signaling. The interaction between two PB1 domain depends on the type of PB1. There are three types of PB1 domains: type I which contains an OPCA motif, acidic aminoacid cluster, type II which contains a basic cluster, and type I/II which contains both an OPCA motif and a basic cluster. Interactions of PB1 domains with other protein domains have been described as a noncanonical PB1-interactions. The PB1 domain module is conserved in amoebas, fungi, animals, and plants." Q#9665 - CGI_10022162 superfamily 241566 174 223 8.85E-15 69.4431 cl00040 C1 superfamily - - "Protein kinase C conserved region 1 (C1) . Cysteine-rich zinc binding domain. Some members of this domain family bind phorbol esters and diacylglycerol, some are reported to bind RasGTP. May occur in tandem arrangement. Diacylglycerol (DAG) is a second messenger, released by activation of Phospholipase D. Phorbol Esters (PE) can act as analogues of DAG and mimic its downstream effects in, for example, tumor promotion. Protein Kinases C are activated by DAG/PE, this activation is mediated by their N-terminal conserved region (C1). DAG/PE binding may be phospholipid dependent. 
C1 domains may also mediate DAG/PE signals in chimaerins (a family of Rac GTPase activating proteins), RasGRPs (exchange factors for Ras/Rap1), and Munc13 isoforms (scaffolding proteins involved in exocytosis)." Q#9666 - CGI_10022163 superfamily 144634 32 267 1.20E-153 432.422 cl03106 F_actin_cap_B superfamily - - "F-actin capping protein, beta subunit; F-actin capping protein, beta subunit. " Q#9667 - CGI_10022164 superfamily 246680 17 83 5.72E-08 49.8928 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#9669 - CGI_10022166 superfamily 248458 47 260 7.70E-18 84.6729 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. 
MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#9669 - CGI_10022166 superfamily 248458 312 470 1.55E-10 62.3313 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. 
Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#9669 - CGI_10022166 superfamily 241583 628 701 4.62E-19 86.471 cl00064 ZnMc superfamily C - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#9669 - CGI_10022166 superfamily 241583 703 741 8.38E-09 54.8847 cl00064 ZnMc superfamily N - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#9669 - CGI_10022166 superfamily 247097 849 884 1.55E-06 46.2901 cl15839 ShK superfamily - - ShK domain-like; This domain of is found in several C. elegans proteins. The domain is 30 amino acids long and rich in cysteine residues. There are 6 conserved cysteine positions in the domain that form three disulphide bridges. The domain is found in the potassium channel inhibitor ShK in sea anemone. 
Q#9669 - CGI_10022166 superfamily 247097 808 842 1.95E-05 43.1369 cl15839 ShK superfamily - - ShK domain-like; This domain of is found in several C. elegans proteins. The domain is 30 amino acids long and rich in cysteine residues. There are 6 conserved cysteine positions in the domain that form three disulphide bridges. The domain is found in the potassium channel inhibitor ShK in sea anemone. Q#9672 - CGI_10022169 superfamily 241832 4 43 3.79E-15 66.8036 cl00388 Thioredoxin_like superfamily C - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#9672 - CGI_10022169 superfamily 243175 54 140 5.09E-07 44.5358 cl02776 GST_C_family superfamily C - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. 
In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." Q#9672 - CGI_10022169 superfamily 243175 115 175 0.00120194 36.1415 cl02776 GST_C_family superfamily N - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. 
This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." Q#9673 - CGI_10022170 superfamily 238191 106 179 0.000983298 39.6228 cl18907 Esterase_lipase superfamily NC - "Esterases and lipases (includes fungal lipases, cholinesterases, etc.) These enzymes act on carboxylic esters (EC: 3.1.1.-). The catalytic apparatus involves three residues (catalytic triad): a serine, a glutamate or aspartate and a histidine.These catalytic residues are responsible for the nucleophilic attack on the carbonyl carbon atom of the ester bond. In contrast with other alpha/beta hydrolase fold family members, p-nitrobenzyl esterase and acetylcholine esterase have a Glu instead of Asp at the active site carboxylate." Q#9677 - CGI_10022174 superfamily 245612 197 674 0 622.795 cl11426 Amidase superfamily - - Amidase; Amidase. 
Q#9679 - CGI_10002112 superfamily 203401 19 129 7.88E-28 100.453 cl05607 Med22 superfamily - - "Surfeit locus protein 5 subunit 22 of Mediator complex; This family consists of several eukaryotic Surfeit locus protein 5 (SURF5) sequences. The human Surfeit locus has been mapped on chromosome 9q34.1. The locus includes six tightly clustered housekeeping genes (Surf1-6), and the gene organisation is similar in human, mouse and chicken Surfeit locus. The Med22 subunit of Mediator complex is part of the essential core head region." Q#9680 - CGI_10002257 superfamily 241573 1582 1835 2.05E-74 253.408 cl00051 CysPc superfamily N - "Calpains, domains IIa, IIb; calcium-dependent cytoplasmic cysteine proteinases, papain-like. Functions in cytoskeletal remodeling processes, cell differentiation, apoptosis and signal transduction." Q#9680 - CGI_10002257 superfamily 241653 1986 2108 4.98E-28 113.229 cl00165 Calpain_III superfamily - - "Calpain, subdomain III. Calpains are calcium-activated cytoplasmic cysteine proteinases, participate in cytoskeletal remodeling processes, cell differentiation, apoptosis and signal transduction. Catalytic domain and the two calmodulin-like domains are separated by C2-like domain III. Domain III plays an important role in calcium-induced activation of calpain involving electrostatic interactions with subdomain II. Proposed to mediate calpain's interaction with phospholipids and translocation to cytoplasmic/nuclear membranes. CD includes subdomain III of typical and atypical calpains." Q#9680 - CGI_10002257 superfamily 241573 1453 1581 1.53E-23 103.945 cl00051 CysPc superfamily C - "Calpains, domains IIa, IIb; calcium-dependent cytoplasmic cysteine proteinases, papain-like. Functions in cytoskeletal remodeling processes, cell differentiation, apoptosis and signal transduction." 
Q#9680 - CGI_10002257 superfamily 241764 1301 1375 3.85E-22 93.9103 cl00299 MIT superfamily - - "MIT: domain contained within Microtubule Interacting and Trafficking molecules. The MIT domain is found in sorting nexins, the nuclear thiol protease PalBH, the AAA protein spastin and archaebacterial proteins with similar domain architecture, vacuolar sorting proteins and others. The molecular function of the MIT domain is unclear." Q#9680 - CGI_10002257 superfamily 241764 1218 1286 2.08E-08 53.7516 cl00299 MIT superfamily - - "MIT: domain contained within Microtubule Interacting and Trafficking molecules. The MIT domain is found in sorting nexins, the nuclear thiol protease PalBH, the AAA protein spastin and archaebacterial proteins with similar domain architecture, vacuolar sorting proteins and others. The molecular function of the MIT domain is unclear." Q#9680 - CGI_10002257 superfamily 241653 1866 1974 7.85E-06 46.5896 cl00165 Calpain_III superfamily - - "Calpain, subdomain III. Calpains are calcium-activated cytoplasmic cysteine proteinases, participate in cytoskeletal remodeling processes, cell differentiation, apoptosis and signal transduction. Catalytic domain and the two calmodulin-like domains are separated by C2-like domain III. Domain III plays an important role in calcium-induced activation of calpain involving electrostatic interactions with subdomain II. Proposed to mediate calpain's interaction with phospholipids and translocation to cytoplasmic/nuclear membranes. CD includes subdomain III of typical and atypical calpains." Q#9680 - CGI_10002257 superfamily 241563 60 102 0.000731963 39.77 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#9683 - CGI_10002592 superfamily 247856 643 685 0.00628957 35.6013 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#9685 - CGI_10002594 superfamily 247684 7 382 0 769.586 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#9689 - CGI_10002652 superfamily 245864 2 380 2.98E-89 279.934 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#9690 - CGI_10002325 superfamily 243035 150 263 1.71E-06 44.9182 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. 
For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#9690 - CGI_10002325 superfamily 241568 20 76 4.28E-06 42.8352 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#9690 - CGI_10002325 superfamily 241568 78 132 0.000303858 37.4424 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#9691 - CGI_10002326 superfamily 217293 33 239 3.05E-45 158.18 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#9691 - CGI_10002326 superfamily 202474 246 329 1.61E-16 77.6941 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. 
Q#9693 - CGI_10002921 superfamily 247792 192 240 5.75E-10 55.9148 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#9693 - CGI_10002921 superfamily 245201 430 686 3.82E-112 341.608 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#9693 - CGI_10002921 superfamily 246908 305 404 1.93E-41 146.188 cl15255 SH2 superfamily - - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 
They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." Q#9694 - CGI_10002922 superfamily 241594 1528 1940 1.25E-103 338 cl00077 HECTc superfamily - - "HECT domain; C-terminal catalytic domain of a subclass of Ubiquitin-protein ligase (E3). It binds specific ubiquitin-conjugating enzymes (E2), accepts ubiquitin from E2, transfers ubiquitin to substrate lysine side chains, and transfers additional ubiquitin molecules to the end of growing ubiquitin chains." Q#9694 - CGI_10002922 superfamily 207713 675 737 3.20E-19 85.0637 cl02729 WWE superfamily - - WWE domain; The WWE domain is named after three of its conserved residues and is predicted to mediate specific protein- protein interactions in ubiquitin and ADP ribose conjugation systems. Q#9697 - CGI_10026644 superfamily 245213 32 66 8.52E-06 41.4682 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#9697 - CGI_10026644 superfamily 245814 101 162 0.00324271 34.3871 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. 
The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#9698 - CGI_10026645 superfamily 247804 56 108 1.50E-19 83.8981 cl17250 SANT superfamily - - "'SWI3, ADA2, N-CoR and TFIIIB' DNA-binding domains. Tandem copies of the domain bind telomeric DNA tandem repeatsas part of the capping complex. Binding is sequence dependent for repeats which contain the G/C rich motif [C2-3 A (CA)1-6]. The domain is also found in regulatory transcriptional repressor complexes where it also binds DNA." Q#9698 - CGI_10026645 superfamily 247804 11 54 1.89E-10 57.5854 cl17250 SANT superfamily - - "'SWI3, ADA2, N-CoR and TFIIIB' DNA-binding domains. Tandem copies of the domain bind telomeric DNA tandem repeatsas part of the capping complex. Binding is sequence dependent for repeats which contain the G/C rich motif [C2-3 A (CA)1-6]. The domain is also found in regulatory transcriptional repressor complexes where it also binds DNA." Q#9699 - CGI_10026646 superfamily 243072 238 346 4.52E-28 110.551 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#9699 - CGI_10026646 superfamily 243072 27 145 2.04E-22 94.3726 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#9699 - CGI_10026646 superfamily 243072 91 231 6.70E-18 81.2758 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#9699 - CGI_10026646 superfamily 217473 605 744 0.00201954 39.6558 cl03978 Mab-21 superfamily NC - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#9700 - CGI_10026647 superfamily 243072 1221 1348 3.26E-27 109.395 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#9700 - CGI_10026647 superfamily 243072 644 759 1.37E-26 107.469 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#9700 - CGI_10026647 superfamily 243072 767 882 2.74E-25 104.003 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#9700 - CGI_10026647 superfamily 243072 1154 1283 2.40E-24 100.921 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#9700 - CGI_10026647 superfamily 243072 1091 1214 4.96E-22 94.3726 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#9700 - CGI_10026647 superfamily 243072 1018 1147 2.98E-21 92.0614 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#9700 - CGI_10026647 superfamily 243072 832 978 8.78E-17 78.9646 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#9700 - CGI_10026647 superfamily 243072 993 1021 2.48E-05 43.3116 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#9702 - CGI_10026649 superfamily 243054 319 511 5.21E-20 91.3531 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#9702 - CGI_10026649 superfamily 241559 15 119 6.33E-19 85.4403 cl00030 CH superfamily - - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." Q#9702 - CGI_10026649 superfamily 241559 137 239 1.18E-18 84.6699 cl00030 CH superfamily - - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." 
Q#9702 - CGI_10026649 superfamily 243054 888 1101 9.18E-17 81.7231 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#9702 - CGI_10026649 superfamily 243054 778 992 4.39E-12 66.7004 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#9702 - CGI_10026649 superfamily 243054 1008 1199 1.48E-11 65.1596 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) 
at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#9702 - CGI_10026649 superfamily 243054 1887 2091 1.00E-08 56.3 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#9702 - CGI_10026649 superfamily 243054 513 621 2.51E-06 49.3664 cl02488 SPEC superfamily C - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#9702 - CGI_10026649 superfamily 243054 1640 1733 0.00173685 39.2353 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel 
manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#9702 - CGI_10026649 superfamily 243054 1317 1511 0.00625813 38.966 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#9703 - CGI_10026650 superfamily 241760 1475 1523 1.70E-31 119.768 cl00295 ZZ superfamily - - "Zinc finger, ZZ type. Zinc finger present in dystrophin, CBP/p300 and many other proteins. The ZZ motif coordinates one or two zinc ions and most likely participates in ligand binding or molecular scaffolding. Many proteins containing ZZ motifs have other zinc-binding motifs as well, and the majority serve as scaffolds in pathways involving acetyltransferase, protein kinase, or ubiqitin-related activity. ZZ proteins can be grouped into the following functional classes: chromatin modifying, cytoskeletal scaffolding, ubiquitin binding or conjugating, and membrane receptor or ion-channel modifying proteins." 
Q#9703 - CGI_10026650 superfamily 243054 539 716 2.41E-16 80.1823 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#9703 - CGI_10026650 superfamily 243054 125 310 3.04E-12 67.0855 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#9703 - CGI_10026650 superfamily 243054 27 199 6.78E-12 65.93 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at 
the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#9703 - CGI_10026650 superfamily 241647 1223 1249 3.81E-07 49.0634 cl00157 WW superfamily - - Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs. Q#9703 - CGI_10026650 superfamily 149946 1375 1466 9.27E-45 158.972 cl07621 efhand_2 superfamily - - "EF-hand; Members of this family adopt a helix-loop-helix motif, as per other EF hand domains. However, since they do not contain the canonical pattern of calcium binding residues found in many EF hand domains, they do not bind calcium ions. The main function of this domain is the provision of specificity in beta-dystroglycan recognition, though in dystrophin it serves an additional role: stabilisation of the WW domain (pfam00397), enhancing dystroglycan binding." 
Q#9703 - CGI_10026650 superfamily 243054 917 1160 2.90E-05 45.8996 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#9703 - CGI_10026650 superfamily 243054 334 610 0.000125158 43.9736 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#9704 - CGI_10026651 superfamily 245814 77 143 9.14E-06 41.1491 cl11960 Ig superfamily N - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. 
A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#9705 - CGI_10026652 superfamily 247723 102 177 1.33E-53 169.596 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." 
Q#9707 - CGI_10026654 superfamily 247905 5 128 2.64E-33 118.88 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#9710 - CGI_10026657 superfamily 247941 21 165 7.00E-09 51.1825 cl17387 Methyltransf_21 superfamily - - "Methyltransferase FkbM domain; This family has members from bacteria to human, and appears to be a methyltransferase." Q#9711 - CGI_10026658 superfamily 220608 42 157 2.12E-21 90.8282 cl10859 G8 superfamily - - G8 domain; This domain is found in disease proteins PKHD1 and KIAA1199 and is named G8 after its 8 conserved glycines. It is predicted to contain 10 beta strands and an alpha helix. Q#9713 - CGI_10026660 superfamily 247805 107 199 8.26E-05 40.3203 cl17251 DEXDc superfamily N - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 
Q#9714 - CGI_10026662 superfamily 247684 10 124 2.95E-41 144.411 cl17037 NBD_sugar-kinase_HSP70_actin superfamily N - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#9716 - CGI_10026664 superfamily 247684 59 554 0 878.986 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#9717 - CGI_10026665 superfamily 198738 426 509 6.42E-51 170.194 cl02599 Ets superfamily - - Ets-domain; Ets-domain. Q#9717 - CGI_10026665 superfamily 247057 145 213 3.17E-29 110.151 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. 
Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#9720 - CGI_10026668 superfamily 216554 246 374 3.00E-22 92.1573 cl15977 zf-DHHC superfamily N - DHHC palmitoyltransferase; This family includes the well known DHHC zinc binding domain as well as three of the four conserved transmembrane regions found in this family of palmitoyltransferase enzymes. Q#9721 - CGI_10026669 superfamily 217062 213 457 1.05E-40 147.416 cl12266 Branch superfamily - - "Core-2/I-Branching enzyme; This is a family of two different beta-1,6-N-acetylglucosaminyltransferase enzymes, I-branching enzyme and core-2 branching enzyme . I-branching enzyme is responsible for the production of the blood group I-antigen during embryonic development. Core-2 branching enzyme forms crucial side-chain branches in O-glycans." Q#9723 - CGI_10026671 superfamily 241679 163 511 8.09E-152 472.646 cl00199 SO_family_Moco superfamily - - "Sulfite oxidase (SO) family, molybdopterin binding domain. This molybdopterin cofactor (Moco) binding domain is found in a variety of oxidoreductases, main members of this family are nitrate reductase (NR) and sulfite oxidase (SO). SO catalyzes the terminal reaction in the oxidative degradation of the sulfur-containing amino acids cysteine and methionine. Assimilatory NRs catalyze the reduction of nitrate to nitrite which is subsequently converted to NH4+ by nitrite reductase. 
Common features of all known members of this family are that they contain one single pterin cofactor and part of the coordination of the metal (Mo) is a cysteine ligand of the protein and that they catalyze the transfer of an oxygen to or from a lone pair of electrons on the substrate." Q#9723 - CGI_10026671 superfamily 241578 834 1011 1.92E-48 172.785 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#9723 - CGI_10026671 superfamily 217293 1382 1575 1.33E-33 131.216 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#9723 - CGI_10026671 superfamily 242849 65 139 1.52E-17 80.3256 cl02041 Cyt-b5 superfamily - - Cytochrome b5-like Heme/Steroid binding domain; This family includes heme binding domains from a diverse range of proteins. This family also includes proteins that bind to steroids. The family includes progesterone receptors. 
Many members of this subfamily are membrane anchored by an N-terminal transmembrane alpha helix. This family also includes a domain in some chitin synthases. There is no known ligand for this domain in the chitin synthases. Q#9723 - CGI_10026671 superfamily 202474 1582 1635 7.84E-07 50.7301 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#9726 - CGI_10026674 superfamily 217685 290 431 3.22E-35 129.762 cl04225 Cu2_monoox_C superfamily - - "Copper type II ascorbate-dependent monooxygenase, C-terminal domain; The N and C-terminal domains of members of this family adopt the same PNGase F-like fold." Q#9726 - CGI_10026674 superfamily 216290 152 273 4.09E-20 86.5733 cl03089 Cu2_monooxygen superfamily - - "Copper type II ascorbate-dependent monooxygenase, N-terminal domain; The N and C-terminal domains of members of this family adopt the same PNGase F-like fold." Q#9727 - CGI_10026675 superfamily 221329 23 167 2.43E-30 110.951 cl13389 DUF3456 superfamily - - "TLR4 regulator and MIR-interacting MSAP; This family of proteins, found from plants to humans, is PRAT4 (A and B), a Protein Associated with Toll-like receptor 4. The Toll family of receptors - TLRs - plays an essential role in innate recognition of microbial products, the first line of defence against bacterial infection. PRAT4A influences the subcellular distribution and the strength of TLR responses and alters the relative activity of each TLR. PRAT4B regulates TLR4 trafficking to the cell surface and the extent of its expression there. TLR4 recognizes lipopolysaccharide (LPS), one of the most immuno-stimulatory glycolipids constituting the outer membrane of the Gram-negative bacteria. This family has also been described as a SAP-like MIR-interacting protein family." 
Q#9728 - CGI_10026676 superfamily 241748 2 203 3.45E-101 295.556 cl00279 APP_MetAP superfamily - - "A family including aminopeptidase P, aminopeptidase M, and prolidase. Also known as metallopeptidase family M24. This family of enzymes is able to cleave amido-, imido- and amidino-containing bonds. Members exibit relatively narrow substrate specificity compared to other metallo-aminopeptidases, suggesting they play roles in regulation of biological processes rather than general protein degradation." Q#9729 - CGI_10026677 superfamily 243051 20 172 1.13E-38 131.346 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#9730 - CGI_10026678 superfamily 243051 311 463 9.55E-37 132.887 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. 
Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#9730 - CGI_10026678 superfamily 243051 151 301 2.41E-27 106.693 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#9732 - CGI_10026680 superfamily 245230 139 565 1.79E-117 359.75 cl10017 Tubulin_FtsZ superfamily - - "Tubulin/FtsZ: Family includes tubulin alpha-, beta-, gamma-, delta-, and epsilon-tubulins as well as FtsZ, all of which are involved in polymer formation. 
Tubulin is the major component of microtubules, but also exists as a heterodimer and as a curved oligomer. Microtubules exist in all eukaryotic cells and are responsible for many functions, including cellular transport, cell motility, and mitosis. FtsZ forms a ring-shaped septum at the site of bacterial cell division, which is required for constriction of cell membrane and cell envelope to yield two daughter cells. FtsZ can polymerize into tubes, sheets, and rings in vitro and is ubiquitous in eubacteria, archaea, and chloroplasts." Q#9732 - CGI_10026680 superfamily 220832 1 103 1.13E-33 123.634 cl10534 Misat_Myo_SegII superfamily - - "Misato Segment II myosin-like domain; The misato protein contains three distinct, conserved domains, segments I, II and III. Segments I and III are common to Tubulins pfam00091, but segment II aligns with myosin heavy chain sequences from D. melanogaster (PIR C35815), rabbit (SP P04460), and human (PIR S12458). Segment II of misato is a major contributor to its greater length compared with the various tubulins. The most significant sequence similarities to this 54-amino acid region are from a motif found in the heavy chains of myosins from different organisms. A comparison of segment II with the vertebrate myosin heavy chains reveals that it is homologous to a myosin peptide in the hinge region linking the S2 and LMM domains. Segment II also contains heptad repeats which are characteristic of the myosin tail alpha-helical coiled-coils." Q#9733 - CGI_10026681 superfamily 241645 70 145 5.18E-46 149.25 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. 
Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#9733 - CGI_10026681 superfamily 241645 146 221 5.18E-46 149.25 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#9733 - CGI_10026681 superfamily 241645 1 69 5.36E-40 133.842 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. 
The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#9734 - CGI_10026682 superfamily 217305 8 76 9.58E-15 64.566 cl03807 TBCA superfamily C - Tubulin binding cofactor A; Tubulin binding cofactor A. Q#9735 - CGI_10026683 superfamily 244539 1289 1470 9.49E-47 169.022 cl06868 FNR_like superfamily - - "Ferredoxin reductase (FNR), an FAD and NAD(P) binding protein, was initially identified as a chloroplast reductase activity, catalyzing the electron transfer from reduced iron-sulfur protein ferredoxin to NADP+ as the final step in the electron transport mechanism of photosystem I. FNR transfers electrons from reduced ferredoxin to FAD (forming FADH2 via a semiquinone intermediate) and then transfers a hydride ion to convert NADP+ to NADPH. FNR has since been shown to utilize a variety of electron acceptors and donors and has a variety of physiological functions including nitrogen assimilation, dinitrogen fixation, steroid hydroxylation, fatty acid metabolism, oxygenase activity, and methane assimilation in many organisms. FNR has an NAD(P)-binding sub-domain of the alpha/beta class and a discrete (usually N-terminal) flavin sub-domain which vary in orientation with respect to the NAD(P) binding domain. The N-terminal moiety may contain a flavin prosthetic group (as in flavoenzymes) or use flavin as a substrate. Because flavins such as FAD can exist in oxidized, semiquinone (one-electron reduced), or fully reduced hydroquinone forms, FNR can interact with one- and two-electron carriers. FNR has a strong preference for NADP(H) vs NAD(H)."
Q#9735 - CGI_10026683 superfamily 247856 871 932 8.18E-09 54.4761 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#9735 - CGI_10026683 superfamily 246664 40 641 0 741.424 cl14561 An_peroxidase_like superfamily - - "Animal heme peroxidases and related proteins; A diverse family of enzymes, which includes prostaglandin G/H synthase, thyroid peroxidase, myeloperoxidase, linoleate diol synthase, lactoperoxidase, peroxinectin, peroxidasin, and others. Despite its name, this family is not restricted to metazoans: members are found in fungi, plants, and bacteria as well." Q#9735 - CGI_10026683 superfamily 242267 1101 1248 2.30E-06 47.67 cl01043 Ferric_reduct superfamily - - "Ferric reductase like transmembrane component; This family includes a common region in the transmembrane proteins mammalian cytochrome B-245 heavy chain (gp91-phox), ferric reductase transmembrane component in yeast and respiratory burst oxidase from mouse-ear cress. This may be a family of flavocytochromes capable of moving electrons across the plasma membrane. The Frp1 protein from S. pombe is a ferric reductase component and is required for cell surface ferric reductase activity; mutants in frp1 are deficient in ferric iron uptake. Cytochrome B-245 heavy chain is a FAD-dependent dehydrogenase; it also has electron transferase activity, which reduces molecular oxygen to superoxide anion, a precursor in the production of microbicidal oxidants. Mutations in the sequence of cytochrome B-245 heavy chain (gp91-phox) lead to the X-linked chronic granulomatous disease.
The bacteriocidal ability of phagocytic cells is reduced and is characterized by the absence of a functional plasma membrane associated NADPH oxidase. The chronic granulomatous disease gene codes for the beta chain of cytochrome B-245 and cytochrome B-245 is missing from patients with the disease." Q#9741 - CGI_10026689 superfamily 245814 75 141 4.91E-06 45.1727 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#9741 - CGI_10026689 superfamily 245201 384 662 8.99E-167 482.329 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#9741 - CGI_10026689 superfamily 245814 163 243 9.84E-21 87.8331 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#9743 - CGI_10026691 superfamily 216987 22 101 0.00328193 34.4469 cl18386 GDNF superfamily - - "GDNF/GAS1 domain; This cysteine rich domain is found in multiple copies in GDNF and GAS1 proteins. GDNF and neurturin (NTN) receptors are potent survival factors for sympathetic, sensory and central nervous system neurons. GDNF and neurturin promote neuronal survival by signaling through similar multicomponent receptors that consist of a common receptor tyrosine kinase and a member of a GPI-linked family of receptors that determines ligand specificity." Q#9744 - CGI_10026692 superfamily 193358 4 79 8.97E-15 68.3222 cl15143 TORC_M superfamily N - "Transducer of regulated CREB activity middle domain; This family includes the region between the N and C terminus of TORC proteins. TORC (Transducer of regulated CREB activity) is a protein family of coactivators that enhances the activity of CRE-dependent transcription via a phosphorylation-independent interaction with the bZIP DNA binding/dimerisation domain of CREB (cAMP Response Element-Binding). Although the C- and N- terminal domains of these proteins have been well characterized, no functional role has been assigned to the central region, yet." Q#9745 - CGI_10026693 superfamily 221830 3 66 6.13E-20 80.5355 cl15142 TORC_N superfamily - - "Transducer of regulated CREB activity, N terminus; This family includes the N terminal region of TORC proteins. 
TORC (Transducer of regulated CREB activity) is a protein family of coactivators that enhances the activity of CRE-dependent transcription via a phosphorylation-independent interaction with the bZIP DNA binding/dimerisation domain of CREB (cAMP Response Element-Binding). The proteins display a highly conserved predicted N-terminal coiled-coil domain and an invariant sequence matching a protein kinase A (PKA) phosphorylation consensus sequence (RKXS). The coiled-coil structure interacts with the bZIP domain of CREB. This interaction may occur via ionic bonds because it is disrupted under high-salt conditions. In addition to CREB-binding, the N-terminal region plays a role in the tetramer formation of TORCs, but the physiological function of the multimeric complex has not been clarified yet." Q#9745 - CGI_10026693 superfamily 193358 158 215 0.00637774 34.8099 cl15143 TORC_M superfamily C - "Transducer of regulated CREB activity middle domain; This family includes the region between the N and C terminus of TORC proteins. TORC (Transducer of regulated CREB activity) is a protein family of coactivators that enhances the activity of CRE-dependent transcription via a phosphorylation-independent interaction with the bZIP DNA binding/dimerisation domain of CREB (cAMP Response Element-Binding). Although the C- and N- terminal domains of these proteins have been well characterized, no functional role has been assigned to the central region, yet." Q#9746 - CGI_10026694 superfamily 247723 121 177 1.84E-23 93.5937 cl17169 RRM_SF superfamily C - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. 
RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#9746 - CGI_10026694 superfamily 247723 388 465 5.67E-21 86.9366 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#9746 - CGI_10026694 superfamily 247723 16 47 3.06E-09 53.5791 cl17169 RRM_SF superfamily N - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. 
This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#9748 - CGI_10026696 superfamily 243034 177 259 1.00E-05 42.7524 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#9749 - CGI_10026697 
superfamily 110440 353 379 0.00177121 35.8465 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#9749 - CGI_10026697 superfamily 110440 312 338 0.00219954 35.4613 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#9750 - CGI_10026698 superfamily 241611 112 291 2.05E-14 72.6863 cl00102 PTX superfamily - - "Pentraxins are plasma proteins characterized by their pentameric discoid assembly and their Ca2+ dependent ligand binding, such as Serum amyloid P component (SAP) and C-reactive Protein (CRP), which are cytokine-inducible acute-phase proteins implicated in innate immunity. CRP binds to ligands containing phosphocholine, SAP binds to amyloid fibrils, DNA, chromatin, fibronectin, C4-binding proteins and glycosaminoglycans. "Long" pentraxins have N-terminal extensions to the common pentraxin domain; one group, the neuronal pentraxins, may be involved in synapse formation and remodeling, and they may also be able to form heteromultimers."
Q#9750 - CGI_10026698 superfamily 215647 659 815 1.72E-12 66.8632 cl18338 7tm_2 superfamily - - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GPCRs). They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#9751 - CGI_10026699 superfamily 242847 427 489 3.14E-30 112.384 cl02038 Elf1 superfamily - - Transcription elongation factor Elf1 like; This family of short proteins contains a putative zinc binding domain with four conserved cysteines. ELF1 has been identified as a transcription elongation factor in Saccharomyces cerevisiae. Q#9751 - CGI_10026699 superfamily 243091 62 104 0.00237508 36.7013 cl02566 SET superfamily N - "SET domain; SET domains are protein lysine methyltransferase enzymes. SET domains appear to be protein-protein interaction domains. It has been demonstrated that SET domains mediate interactions with a family of proteins that display similarity with dual-specificity phosphatases (dsPTPases). A subset of SET domains have been called PR domains. These domains are divergent in sequence from other SET domains, but also appear to mediate protein-protein interaction.
The SET domain consists of two regions known as SET-N and SET-C. SET-C forms an unusual and conserved knot-like structure of probably functional importance. Additionally to SET-N and SET-C, an insert region (SET-I) and flanking regions of high structural variability form part of the overall structure." Q#9752 - CGI_10026700 superfamily 243072 21 135 4.75E-23 94.3726 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#9753 - CGI_10026701 superfamily 241563 104 141 5.76E-06 44.0072 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#9753 - CGI_10026701 superfamily 243092 353 485 0.000971205 40.0108 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#9755 - CGI_10026703 superfamily 241647 20 50 1.39E-08 50.9894 cl00157 WW superfamily - - Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs. 
Q#9755 - CGI_10026703 superfamily 243076 331 405 1.93E-19 82.2648 cl02539 BAG superfamily - - BAG domain; Domain present in Hsp70 regulators. Q#9768 - CGI_10021742 superfamily 248097 15 138 6.57E-25 93.4838 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#9771 - CGI_10021745 superfamily 247755 31 164 4.18E-40 148.127 cl17201 ABC_ATPase superfamily C - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#9774 - CGI_10021748 superfamily 241563 117 150 2.94E-06 45.1628 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#9774 - CGI_10021748 superfamily 189332 346 414 0.000391405 41.5337 cl14874 Luminal_IRE1_like superfamily NC - "The Luminal domain, a dimerization domain, of Inositol-requiring protein 1-like proteins; The Luminal domain is a dimerization domain present in Inositol-requiring protein 1 (IRE1), eukaryotic translation Initiation Factor 2-Alpha Kinase 3 (EIF2AK3), and similar proteins. IRE1 and EIF2AK3 are serine/threonine protein kinases (STKs) and are type I transmembrane proteins that are localized in the endoplasmic reticulum (ER). 
They are kinase receptors that are activated through the release of BiP, a chaperone bound to their luminal domains under unstressed conditions. This results in dimerization through their luminal domains, allowing trans-autophosphorylation of their kinase domains and activation. They play roles in the signaling of the unfolded protein response (UPR), which is activated when protein misfolding is detected in the ER in order to decrease the synthesis of new proteins and increase the capacity of the ER to cope with the stress. IRE1, also called Endoplasmic reticulum (ER)-to-nucleus signaling protein (or ERN), contains an endoribonuclease domain in its cytoplasmic side and acts as an ER stress sensor. It is the oldest and most conserved component of the UPR in eukaryotes. Its activation results in the cleavage of its mRNA substrate, HAC1 in yeast and Xbp1 in metazoans, promoting a splicing event that enables translation into a transcription factor which activates the UPR. EIF2AK3, also called PKR-like Endoplasmic Reticulum Kinase (PERK), phosphorylates the alpha subunit of eIF-2, resulting in the downregulation of protein synthesis. It functions as the central regulator of translational control during the UPR pathway. In addition to the eIF-2 alpha subunit, EIF2AK3 also phosphorylates Nrf2, a leucine zipper transcription factor which regulates cellular redox status and promotes cell survival during the UPR." Q#9777 - CGI_10021751 superfamily 241599 65 114 3.66E-22 87.2988 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#9777 - CGI_10021751 superfamily 146451 260 279 0.00197939 35.0275 cl08404 OAR superfamily - - OAR domain; OAR domain. 
Q#9778 - CGI_10021752 superfamily 215859 71 195 8.32E-06 45.2851 cl18347 Peptidase_S9 superfamily - - Prolyl oligopeptidase family; Prolyl oligopeptidase family. Q#9779 - CGI_10021753 superfamily 192485 28 69 2.11E-11 56.5994 cl10906 DUF2039 superfamily N - Uncharacterized conserved protein (DUF2039); This entry is a region of approximately 100 residues containing three pairs of cysteine residues. The region is conserved from plants to humans but its function is unknown. Q#9780 - CGI_10021754 superfamily 245202 67 131 4.13E-15 66.063 cl09927 S1_like superfamily - - "S1_like: Ribosomal protein S1-like RNA-binding domain. Found in a wide variety of RNA-associated proteins. Originally identified in S1 ribosomal protein. This superfamily also contains the Cold Shock Domain (CSD), which is a homolog of the S1 domain. Both domains are members of the Oligonucleotide/oligosaccharide Binding (OB) fold." Q#9781 - CGI_10021755 superfamily 241611 51 202 0.000139956 40.8348 cl00102 PTX superfamily - - "Pentraxins are plasma proteins characterized by their pentameric discoid assembly and their Ca2+ dependent ligand binding, such as Serum amyloid P component (SAP) and C-reactive Protein (CRP), which are cytokine-inducible acute-phase proteins implicated in innate immunity. CRP binds to ligands containing phosphocholine, SAP binds to amyloid fibrils, DNA, chromatin, fibronectin, C4-binding proteins and glycosaminoglycans. "Long" pentraxins have N-terminal extensions to the common pentraxin domain; one group, the neuronal pentraxins, may be involved in synapse formation and remodeling, and they may also be able to form heteromultimers." Q#9782 - CGI_10021756 superfamily 241546 742 863 5.33E-43 154.741 cl00011 PLAT superfamily - - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. 
The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats. The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#9782 - CGI_10021756 superfamily 243086 632 674 5.84E-12 63.1629 cl02559 GPS superfamily - - "Latrophilin/CL-1-like GPS domain; Domain present in latrophilin/CL-1, sea urchin REJ and polycystin." Q#9782 - CGI_10021756 superfamily 242057 1450 1562 0.00138094 41.0713 cl00734 Bac_export_1 superfamily NC - "Bacterial export proteins, family 1; This family includes the following members: FliR, MopE, SsaT, YopT, Hrp, HrcT and SpaR. All of these members export proteins that do not possess signal peptides through the membrane. Although the proteins that these exporters move may be different, the exporters are thought to function in similar ways." Q#9783 - CGI_10021757 superfamily 241546 1096 1218 2.00E-43 156.667 cl00011 PLAT superfamily - - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats. The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#9783 - CGI_10021757 superfamily 243086 984 1027 2.02E-11 62.0073 cl02559 GPS superfamily - - "Latrophilin/CL-1-like GPS domain; Domain present in latrophilin/CL-1, sea urchin REJ and polycystin." Q#9784 - CGI_10021758 superfamily 245213 526 566 5.90E-08 50.3278 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins.
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#9784 - CGI_10021758 superfamily 245213 408 446 4.65E-07 48.0166 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#9784 - CGI_10021758 superfamily 245213 447 478 1.34E-06 46.4758 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#9784 - CGI_10021758 superfamily 245213 633 665 1.81E-06 46.0906 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#9784 - CGI_10021758 superfamily 245213 487 525 9.16E-06 44.1646 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#9784 - CGI_10021758 superfamily 245213 567 601 2.49E-05 42.6238 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#9784 - CGI_10021758 superfamily 245213 788 821 0.000860574 38.3866 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#9784 - CGI_10021758 superfamily 243060 684 745 2.48E-05 43.5216 cl02507 SEA superfamily C - "SEA domain; Domain found in Sea urchin sperm protein, Enterokinase, Agrin (SEA). Proposed function of regulating or binding carbohydrate side chains. Recently a proteolytic activity has been shown for a SEA domain." Q#9784 - CGI_10021758 superfamily 193419 372 407 0.00637436 35.8309 cl15186 EGF_MSP1_1 superfamily - - MSP1 EGF domain 1; This EGF-like domain is found at the C-terminus of the malaria parasite MSP1 protein. MSP1 is the merozoite surface protein 1. This domain is part of the C-terminal fragment that is proteolytically processed from the rest of the protein and is left attached to the surface of the invading parasite. Q#9785 - CGI_10021759 superfamily 243060 13 116 3.12E-15 73.182 cl02507 SEA superfamily - - "SEA domain; Domain found in Sea urchin sperm protein, Enterokinase, Agrin (SEA). Proposed function of regulating or binding carbohydrate side chains. Recently a proteolytic activity has been shown for a SEA domain." Q#9786 - CGI_10021760 superfamily 243082 98 399 1.78E-156 467.138 cl02553 Peptidase_C19 superfamily - - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds.
They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." Q#9787 - CGI_10021761 superfamily 241631 65 245 4.32E-88 267.166 cl00136 Sec7 superfamily - - Sec7 domain; Domain named after the S. cerevisiae SEC7 gene product. The Sec7 domain is the central domain of the guanine-nucleotide-exchange factors (GEFs) of the ADP-ribosylation factor family of small GTPases (ARFs) . It carries the exchange factor activity. Q#9787 - CGI_10021761 superfamily 247725 260 379 3.61E-60 192.907 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#9788 - CGI_10021762 superfamily 245201 104 339 5.29E-67 225.536 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. 
These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#9789 - CGI_10021763 superfamily 243092 160 483 5.17E-20 88.9312 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#9790 - CGI_10021764 superfamily 241563 73 113 1.46E-06 45.9332 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#9797 - CGI_10021771 superfamily 241750 527 577 0.000383131 41.8743 cl00281 metallo-dependent_hydrolases superfamily C - "Superfamily of metallo-dependent hydrolases (also called amidohydrolase superfamily) is a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. The family includes urease alpha, adenosine deaminase, phosphotriesterase dihydroorotases, allantoinases, hydantoinases, AMP-, adenine and cytosine deaminases, imidazolonepropionase, aryldialkylphosphatase, chlorohydrolases, formylmethanofuran dehydrogenases and others." Q#9799 - CGI_10021773 superfamily 243190 23 96 4.11E-28 98.5342 cl02794 Cyt_c_Oxidase_VIb superfamily - - "Cytochrome c oxidase subunit VIb. Cytochrome c oxidase (CcO), the terminal oxidase in the respiratory chains of eukaryotes and most bacteria, is a multi-chain transmembrane protein located in the inner membrane of mitochondria and the cell membrane of prokaryotes. It catalyzes the reduction of O2 and simultaneously pumps protons across the membrane. The number of subunits varies from three to five in bacteria and up to 13 in mammalian mitochondria. Subunits I, II, and III of mammalian CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. Found only in eukaryotes, subunit VIb is one of three mammalian subunits that lacks a transmembrane region. It is located on the cytosolic side of the membrane and helps form the dimer interface with the corresponding subunit on the other monomer complex." 
Q#9802 - CGI_10003643 superfamily 243035 185 303 2.52E-28 106.55 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#9802 - CGI_10003643 superfamily 243035 44 140 3.19E-21 86.5197 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. 
CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#9806 - CGI_10010578 superfamily 217598 457 590 2.40E-43 154.164 cl04130 KCNQ_channel superfamily N - KCNQ voltage-gated potassium channel; This family matches to the C-terminal tail of KCNQ type potassium channels. Q#9806 - CGI_10010578 superfamily 219619 250 324 6.35E-16 73.3959 cl18518 Ion_trans_2 superfamily - - Ion channel; This family includes the two membrane helix type ion channels found in bacteria. 
Q#9807 - CGI_10010579 superfamily 248054 238 304 0.00100305 38.4375 cl17500 NAD_binding_8 superfamily NC - NAD(P)-binding Rossmann-like domain; NAD(P)-binding Rossmann-like domain. Q#9808 - CGI_10010580 superfamily 218493 344 477 2.12E-39 140.185 cl08434 GMC_oxred_C superfamily - - GMC oxidoreductase; This domain found associated with pfam00732. Q#9808 - CGI_10010580 superfamily 248054 161 227 0.00336471 37.6671 cl17500 NAD_binding_8 superfamily NC - NAD(P)-binding Rossmann-like domain; NAD(P)-binding Rossmann-like domain. Q#9810 - CGI_10010582 superfamily 248458 38 225 1.41E-17 82.7469 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. 
Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#9810 - CGI_10010582 superfamily 248458 303 502 2.40E-12 66.9537 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." 
Q#9812 - CGI_10010584 superfamily 247683 187 237 8.46E-12 60.9395 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#9812 - CGI_10010584 superfamily 247683 54 103 4.08E-08 50.5391 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. 
Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#9812 - CGI_10010584 superfamily 247683 532 583 0.000208645 39.3683 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#9815 - CGI_10010587 superfamily 241763 144 358 9.00E-114 332.666 cl00298 Peptidase_C1 superfamily - - "C1 Peptidase family (MEROPS database nomenclature), also referred to as the papain family; composed of two subfamilies of cysteine peptidases (CPs), C1A (papain) and C1B (bleomycin hydrolase). Papain-like enzymes are mostly endopeptidases with some exceptions like cathepsins B, C, H and X, which are exopeptidases. Papain-like CPs have different functions in various organisms. Plant CPs are used to mobilize storage proteins in seeds while mammalian CPs are primarily lysosomal enzymes responsible for protein degradation in the lysosome. Papain-like CPs are synthesized as inactive proenzymes with N-terminal propeptide regions, which are removed upon activation. 
Bleomycin hydrolase (BH) is a CP that detoxifies bleomycin by hydrolysis of an amide group. It acts as a carboxypeptidase on its C-terminus to convert itself into an aminopeptidase and peptide ligase. BH is found in all tissues in mammals as well as in many other eukaryotes. It forms a hexameric ring barrel structure with the active sites imbedded in the central channel. Some members of the C1 family are proteins classified as non-peptidase homologs which lack peptidase activity or have missing active site residues." Q#9815 - CGI_10010587 superfamily 244586 56 116 2.93E-12 61.1055 cl07031 Inhibitor_I29 superfamily - - Cathepsin propeptide inhibitor domain (I29); This domain is found at the N-terminus of some C1 peptidases such as Cathepsin L where it acts as a propeptide. There are also a number of proteins that are composed solely of multiple copies of this domain such as the peptidase inhibitor salarin. This family is classified as I29 by MEROPS. Q#9816 - CGI_10010588 superfamily 241763 167 381 5.04E-103 306.088 cl00298 Peptidase_C1 superfamily - - "C1 Peptidase family (MEROPS database nomenclature), also referred to as the papain family; composed of two subfamilies of cysteine peptidases (CPs), C1A (papain) and C1B (bleomycin hydrolase). Papain-like enzymes are mostly endopeptidases with some exceptions like cathepsins B, C, H and X, which are exopeptidases. Papain-like CPs have different functions in various organisms. Plant CPs are used to mobilize storage proteins in seeds while mammalian CPs are primarily lysosomal enzymes responsible for protein degradation in the lysosome. Papain-like CPs are synthesized as inactive proenzymes with N-terminal propeptide regions, which are removed upon activation. Bleomycin hydrolase (BH) is a CP that detoxifies bleomycin by hydrolysis of an amide group. It acts as a carboxypeptidase on its C-terminus to convert itself into an aminopeptidase and peptide ligase. 
BH is found in all tissues in mammals as well as in many other eukaryotes. It forms a hexameric ring barrel structure with the active sites imbedded in the central channel. Some members of the C1 family are proteins classified as non-peptidase homologs which lack peptidase activity or have missing active site residues." Q#9816 - CGI_10010588 superfamily 244586 79 139 2.28E-09 53.0163 cl07031 Inhibitor_I29 superfamily - - Cathepsin propeptide inhibitor domain (I29); This domain is found at the N-terminus of some C1 peptidases such as Cathepsin L where it acts as a propeptide. There are also a number of proteins that are composed solely of multiple copies of this domain such as the peptidase inhibitor salarin. This family is classified as I29 by MEROPS. Q#9817 - CGI_10010589 superfamily 245814 368 435 0.00351 37.0997 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#9818 - CGI_10002783 superfamily 243072 154 283 1.62E-19 84.7426 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#9818 - CGI_10002783 superfamily 243072 293 420 3.09E-16 75.4978 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#9818 - CGI_10002783 superfamily 243072 362 496 3.14E-15 72.4162 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#9822 - CGI_10011654 superfamily 243092 489 791 9.78E-25 105.495 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#9823 - CGI_10011655 superfamily 245864 13 202 2.01E-50 171.692 cl12078 p450 superfamily N - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. 
The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#9824 - CGI_10011656 superfamily 247684 7 399 6.89E-95 297.652 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. 
The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#9826 - CGI_10011658 superfamily 217293 39 203 1.77E-36 133.912 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#9826 - CGI_10011658 superfamily 202474 240 437 3.49E-33 125.074 cl08379 Neur_chan_memb superfamily - - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#9827 - CGI_10011659 superfamily 202474 111 194 2.85E-30 113.903 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#9827 - CGI_10011659 superfamily 202474 192 252 0.00907646 35.3221 cl08379 Neur_chan_memb superfamily N - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#9828 - CGI_10008217 superfamily 241733 24 109 2.39E-58 177.008 cl00259 Sm_like superfamily - - "Sm and related proteins; The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes." 
Q#9831 - CGI_10008220 superfamily 248097 93 221 9.13E-18 76.1498 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#9832 - CGI_10008221 superfamily 247792 24 75 5.12E-06 44.3588 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#9832 - CGI_10008221 superfamily 241563 177 215 0.00214975 36.6884 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#9833 - CGI_10008222 superfamily 216033 65 156 3.70E-12 60.4252 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#9834 - CGI_10008223 superfamily 246925 311 463 4.06E-06 48.891 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. 
This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#9834 - CGI_10008223 superfamily 246925 808 928 0.000387858 42.7278 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#9834 - CGI_10008223 superfamily 246925 595 689 0.00199227 40.4166 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#9834 - CGI_10008223 superfamily 246925 137 321 0.0049669 39.261 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. 
This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#9835 - CGI_10008224 superfamily 248458 313 487 5.63E-11 62.7165 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." 
Q#9835 - CGI_10008224 superfamily 248458 71 220 3.31E-07 50.7753 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#9836 - CGI_10008225 superfamily 244363 74 164 8.64E-36 124.079 cl06336 Commd superfamily N - "COMM_Domain, a family of domains found at the C-terminus of HCarG, the copper metabolism gene MURR1 product, and related proteins. 
Presumably all COMM_Domain containing proteins are located in the nucleus and the COMM domain plays a role in protein-protein interactions. Several family members have been shown to bind and inhibit NF-kappaB. Murr1/Commd1 is a protein involved in copper homeostasis, which has also been identified as a regulator of the human delta epithelial sodium channel. HCaRG, a nuclear protein that might be involved in cell proliferation, is negatively regulated by extracellular calcium concentration, and its basal mRNA levels are higher in hypertensive animals." Q#9837 - CGI_10008226 superfamily 241610 1 53 5.49E-12 54.9486 cl00101 KU superfamily - - BPTI/Kunitz family of serine protease inhibitors; Structure is a disulfide rich alpha+beta fold. BPTI (bovine pancreatic trypsin inhibitor) is an extensively studied model structure. Q#9838 - CGI_10008227 superfamily 241610 2 53 8.21E-15 61.8822 cl00101 KU superfamily - - BPTI/Kunitz family of serine protease inhibitors; Structure is a disulfide rich alpha+beta fold. BPTI (bovine pancreatic trypsin inhibitor) is an extensively studied model structure. Q#9841 - CGI_10008230 superfamily 241563 75 115 5.70E-07 47.0888 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#9843 - CGI_10008233 superfamily 177822 28 114 4.61E-05 39.9033 cl18088 PLN02164 superfamily N - sulfotransferase Q#9845 - CGI_10008235 superfamily 177822 43 277 1.54E-18 82.6605 cl18088 PLN02164 superfamily N - sulfotransferase Q#9847 - CGI_10003846 superfamily 222429 6 58 2.95E-07 42.6128 cl18676 Myb_DNA-bind_5 superfamily C - Myb/SANT-like DNA-binding domain; This presumed domain appears to be related to other Myb/SANT like DNA binding domains. This family is greatly expanded in arthropods and higher eukaryotes. 
Q#9851 - CGI_10003850 superfamily 241867 120 336 1.37E-24 99.1236 cl00446 Lactamase_B superfamily - - Metallo-beta-lactamase superfamily; Metallo-beta-lactamase superfamily. Q#9852 - CGI_10003302 superfamily 241600 3 207 5.84E-69 212.101 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#9853 - CGI_10003303 superfamily 241600 15 232 1.98E-58 186.292 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#9854 - CGI_10003304 superfamily 241600 9 177 2.85E-49 160.484 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. 
Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#9855 - CGI_10015615 superfamily 245201 19 282 1.97E-130 387.624 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#9856 - CGI_10015616 superfamily 241574 48 186 5.51E-45 147.755 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." 
Q#9857 - CGI_10015617 superfamily 219152 12 364 1.61E-104 316.491 cl18495 PIG-U superfamily - - "GPI transamidase subunit PIG-U; Many eukaryotic proteins are anchored to the cell surface via glycosylphosphatidylinositol (GPI), which is posttranslationally attached to the carboxyl-terminus by GPI transamidase. The mammalian GPI transamidase is a complex of at least four subunits, GPI8, GAA1, PIG-S, and PIG-T. PIG-U is thought to represent a fifth subunit in this complex and may be involved in the recognition of either the GPI attachment signal or the lipid portion of GPI." Q#9858 - CGI_10015618 superfamily 247740 11 257 5.30E-87 269.749 cl17186 TIM_phosphate_binding superfamily - - "TIM barrel proteins share a structurally conserved phosphate binding motif and in general share an eight beta/alpha closed barrel structure. Specific for this family is the conserved phosphate binding site at the edges of strands 7 and 8. The phosphate comes either from the substrate, as in the case of inosine monophosphate dehydrogenase (IMPDH), or from ribulose-5-phosphate 3-epimerase (RPE) or from cofactors, like FMN." Q#9858 - CGI_10015618 superfamily 241575 374 434 0.00130389 37.2519 cl00054 DSRM superfamily - - "Double-stranded RNA binding motif. Binding is not sequence specific but is highly specific for double stranded RNA. Found in a variety of proteins including dsRNA dependent protein kinase PKR, RNA helicases, Drosophila staufen protein, E. coli RNase III, RNases H1, and dsRNA dependent adenosine deaminases." 
Q#9859 - CGI_10015619 superfamily 241637 705 742 0.00257525 36.9027 cl00146 TFIIS_I superfamily N - N-terminal domain (domain I) of transcription elongation factor S-II (TFIIS); similar to a domain found in elongin A and CRSP70; likely to be involved in transcription; domain I from TFIIS interacts with RNA polymerase II holoenzyme Q#9860 - CGI_10015620 superfamily 243072 165 284 1.20E-38 141.752 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#9862 - CGI_10015622 superfamily 241596 78 117 5.05E-07 42.5863 cl00081 HLH superfamily N - "Helix-loop-helix domain, found in specific DNA- binding proteins that act as transcription factors; 60-100 amino acids long. 
A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE 5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved in both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#9865 - CGI_10015625 superfamily 248097 1 98 3.25E-12 58.4306 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#9866 - CGI_10015626 superfamily 241574 78 270 1.43E-74 238.254 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." 
Q#9866 - CGI_10015626 superfamily 241574 334 520 6.00E-15 73.3889 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#9867 - CGI_10015627 superfamily 243035 33 148 3.14E-24 91.9125 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. 
CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#9868 - CGI_10006479 superfamily 247856 159 221 1.00E-22 87.9885 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#9868 - CGI_10006479 superfamily 247856 86 148 3.29E-21 83.7513 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#9869 - CGI_10006480 superfamily 247856 17 79 1.07E-08 48.3129 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#9869 - CGI_10006480 superfamily 247856 63 119 8.50E-08 45.6165 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#9870 - CGI_10006481 superfamily 247856 384 445 8.90E-18 77.5881 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#9870 - CGI_10006481 superfamily 247856 255 314 1.28E-17 77.2029 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#9870 - CGI_10006481 superfamily 247856 182 243 6.07E-17 75.2769 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#9870 - CGI_10006481 superfamily 247856 37 98 2.25E-10 56.4021 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#9870 - CGI_10006481 superfamily 247856 311 373 2.75E-07 47.5425 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#9870 - CGI_10006481 superfamily 247856 109 153 0.000218933 39.0681 cl17302 EFh superfamily C - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#9871 - CGI_10006482 superfamily 247856 298 360 5.86E-19 82.2105 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#9871 - CGI_10006482 superfamily 247856 225 287 1.49E-18 81.0549 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#9871 - CGI_10006482 superfamily 247856 579 641 1.49E-18 81.0549 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#9871 - CGI_10006482 superfamily 247856 427 489 2.10E-18 80.6697 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#9871 - CGI_10006482 superfamily 247856 354 416 2.30E-18 80.6697 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#9871 - CGI_10006482 superfamily 247856 652 708 3.38E-16 74.5065 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#9871 - CGI_10006482 superfamily 247856 483 538 5.32E-11 59.4837 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#9871 - CGI_10006482 superfamily 247856 73 134 4.27E-09 53.7057 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#9871 - CGI_10006482 superfamily 247856 150 212 1.49E-06 46.3869 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#9871 - CGI_10006482 superfamily 247856 517 567 0.00574827 35.6013 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#9872 - CGI_10006483 superfamily 247856 128 183 1.25E-18 77.2029 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#9872 - CGI_10006483 superfamily 247856 53 153 1.09E-07 47.1573 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#9873 - CGI_10006484 superfamily 241623 538 885 1.73E-178 522.557 cl00119 PI3Kc_like superfamily - - "Phosphoinositide 3-kinase (PI3K)-like family, catalytic domain; The PI3K-like catalytic domain family is part of a larger superfamily that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and RIO kinases. Members of the family include PI3K, phosphoinositide 4-kinase (PI4K), PI3K-related protein kinases (PIKKs), and TRansformation/tRanscription domain-Associated Protein (TRRAP). PI3Ks catalyze the transfer of the gamma-phosphoryl group from ATP to the 3-hydroxyl of the inositol ring of D-myo-phosphatidylinositol (PtdIns) or its derivatives, while PI4K catalyze the phosphorylation of the 4-hydroxyl of PtdIns. PIKKs are protein kinases that catalyze the phosphorylation of serine/threonine residues, especially those that are followed by a glutamine. PI3Ks play an important role in a variety of fundamental cellular processes, including cell motility, the Ras pathway, vesicle trafficking and secretion, immune cell activation and apoptosis. PI4Ks produce PtdIns(4)P, the major precursor to important signaling phosphoinositides. PIKKs have diverse functions including cell-cycle checkpoints, genome surveillance, mRNA surveillance, and translation control." Q#9873 - CGI_10006484 superfamily 246669 27 185 1.53E-73 239.458 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. 
C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#9873 - CGI_10006484 superfamily 241742 282 501 6.85E-57 193.316 cl00271 PI3Ka superfamily - - "Phosphoinositide 3-kinase family, accessory domain (PIK domain); PIK domain is conserved in PI3 and PI4-kinases. Its role is unclear, but it has been suggested to be involved in substrate presentation. Phosphoinositide 3-kinases play an important role in a variety of fundamental cellular processes and can be divided into three main classes, defined by their substrate specificity and domain architecture." Q#9874 - CGI_10006485 superfamily 245841 44 426 4.09E-129 391.063 cl12025 PolY superfamily - - "Y-family of DNA polymerases; Y-family DNA polymerases are a specialized subset of polymerases that facilitate translesion synthesis (TLS), a process that allows the bypass of a variety of DNA lesions. Unlike replicative polymerases, TLS polymerases lack proofreading activity and have low fidelity and low processivity. They use damaged DNA as templates and insert nucleotides opposite the lesions. 
The active sites of TLS polymerases are large and flexible to allow the accommodation of distorted bases. Most TLS polymerases are members of the Y-family, including Pol eta, Pol kappa/IV, Pol iota, Rev1, and Pol V, which is found exclusively in bacteria. In eukaryotes, the B-family polymerase Pol zeta also functions as a TLS polymerase. Expression of Y-family polymerases is often induced by DNA damage and is believed to be highly regulated. TLS is likely induced by the monoubiquitination of the replication clamp PCNA, which provides a scaffold for TLS polymerases to bind in order to access the lesion. Because of their high error rates, TLS polymerases are potential targets for cancer treatment and prevention." Q#9876 - CGI_10006487 superfamily 241645 12 90 2.81E-33 119.366 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#9876 - CGI_10006487 superfamily 244906 119 166 7.39E-23 89.8919 cl08315 CAP_GLY superfamily N - "CAP-Gly domain; Cytoskeleton-associated proteins (CAPs) are involved in the organisation of microtubules and transportation of vesicles and organelles along the cytoskeletal network. 
A conserved motif, CAP-Gly, has been identified in a number of CAPs, including CLIP-170 and dynactins. The crystal structure of Caenorhabditis elegans F53F4.3 protein CAP-Gly domain was recently solved. The domain contains three beta-strands. The most conserved sequence, GKNDG, is located in two consecutive sharp turns on the surface, forming the entrance to a groove." Q#9877 - CGI_10006488 superfamily 241644 4620 4780 1.69E-31 124.237 cl00154 UBCc superfamily - - "Ubiquitin-conjugating enzyme E2, catalytic (UBCc) domain. This is part of the ubiquitin-mediated protein degradation pathway in which a thiol-ester linkage forms between a conserved cysteine and the C-terminus of ubiquitin and complexes with ubiquitin protein ligase enzymes, E3. This pathway regulates many fundamental cellular processes. There are also other E2s which form thiol-ester linkages without the use of E3s as well as several UBC homologs (TSG101, Mms2, Croc-1 and similar proteins) which lack the active site cysteine essential for ubiquitination and appear to function in DNA repair pathways which were omitted from the scope of this CD." Q#9877 - CGI_10006488 superfamily 241564 235 307 6.43E-25 102.729 cl00035 BIR superfamily - - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." Q#9877 - CGI_10006488 superfamily 221546 3503 3690 1.06E-37 143.209 cl13752 DUF3643 superfamily - - Protein of unknown function (DUF3643); This family of proteins is found in eukaryotes. Proteins in this family are typically between 217 and 4852 amino acids in length. There is a conserved TLA sequence motif. 
Q#9878 - CGI_10011071 superfamily 199168 52 75 0.000361447 34.2496 cl15310 LRR_TYP superfamily - - "Leucine-rich repeats, typical (most populated) subfamily; Leucine-rich repeats, typical (most populated) subfamily. " Q#9881 - CGI_10011074 superfamily 243035 639 750 2.06E-09 56.0889 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. 
Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#9881 - CGI_10011074 superfamily 243035 371 482 2.94E-07 49.5406 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#9881 - CGI_10011074 superfamily 243061 504 587 0.000262451 40.1427 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#9882 - CGI_10011075 superfamily 221370 1 95 2.30E-05 40.4325 cl13441 DUF3497 superfamily N - "Domain of unknown function (DUF3497); This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 213 to 257 amino acids in length. This domain is found associated with pfam02793, pfam00002, pfam01825. This domain has a single completely conserved residue W that may be functionally important." Q#9883 - CGI_10011076 superfamily 247804 98 141 1.34E-08 51.8074 cl17250 SANT superfamily - - "'SWI3, ADA2, N-CoR and TFIIIB' DNA-binding domains. Tandem copies of the domain bind telomeric DNA tandem repeats as part of the capping complex. Binding is sequence dependent for repeats which contain the G/C rich motif [C2-3 A (CA)1-6]. The domain is also found in regulatory transcriptional repressor complexes where it also binds DNA." Q#9883 - CGI_10011076 superfamily 203011 461 538 3.79E-16 74.1657 cl04515 SWIRM superfamily - - SWIRM domain; This SWIRM domain is a small alpha-helical domain of about 85 amino acid residues found in chromosomal proteins. It contains a helix-turn-helix motif and binds to DNA. Q#9884 - CGI_10011077 superfamily 244859 7 234 1.69E-14 70.2681 cl08171 HtrL_YibB superfamily - - "Bacterial protein of unknown function (HtrL_YibB); The protein from this rare, uncharacterized protein family is designated HtrL or YibB in E. coli, where its gene is found in a region of LPS core biosynthesis genes. Homologues are found in Shigella flexneri, Campylobacter jejuni, and Caenorhabditis elegans only. 
The htrL gene may represent an insertion to the LPS core biosynthesis region, rather than an LPS biosynthetic protein." Q#9886 - CGI_10005027 superfamily 248097 138 265 3.84E-20 83.4686 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#9888 - CGI_10005029 superfamily 248097 194 321 8.20E-21 85.7798 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#9889 - CGI_10005030 superfamily 248097 43 92 2.34E-12 58.4306 cl17543 C1q superfamily C - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#9890 - CGI_10005031 superfamily 248097 197 324 2.98E-19 81.5426 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#9891 - CGI_10005032 superfamily 248097 35 149 8.03E-17 72.683 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#9892 - CGI_10005033 superfamily 243082 160 526 2.09E-151 439.5 cl02553 Peptidase_C19 superfamily - - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." 
Q#9892 - CGI_10005033 superfamily 245220 46 107 1.98E-10 57.0294 cl09957 zf-UBP superfamily - - Zn-finger in ubiquitin-hydrolases and other protein; Zn-finger in ubiquitin-hydrolases and other protein. Q#9893 - CGI_10005034 superfamily 245206 97 229 9.60E-41 141.955 cl09931 NADB_Rossmann superfamily N - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#9893 - CGI_10005034 superfamily 245206 32 81 2.13E-15 71.8483 cl09931 NADB_Rossmann superfamily C - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. 
Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#9895 - CGI_10005036 superfamily 199166 244 416 3.05E-17 79.6788 cl15308 AMN1 superfamily - - "Antagonist of mitotic exit network protein 1; Amn1 has been functionally characterized in Saccharomyces cerevisiae as a component of the Antagonist of MEN pathway (AMEN). The AMEN network is activated by MEN (mitotic exit network) via an active Cdc14, and in turn switches off MEN. Amn1 constitutes one of the alternative mechanisms by which MEN may be disrupted. Specifically, Amn1 binds Tem1 (Termination of M-phase, a GTPase that belongs to the RAS superfamily), and disrupts its association with Cdc15, the primary downstream target. Amn1 is a leucine-rich repeat (LRR) protein, with 12 repeats in the S. cerevisiae ortholog. As a negative regulator of the signal transduction pathway MEN, overexpression of AMN1 slows the growth of wild type cells. The function of the vertebrate members of this family has not been determined experimentally, they have fewer LRRs that determine the extent of this model." Q#9895 - CGI_10005036 superfamily 199166 162 316 1.24E-13 68.8932 cl15308 AMN1 superfamily - - "Antagonist of mitotic exit network protein 1; Amn1 has been functionally characterized in Saccharomyces cerevisiae as a component of the Antagonist of MEN pathway (AMEN). The AMEN network is activated by MEN (mitotic exit network) via an active Cdc14, and in turn switches off MEN. Amn1 constitutes one of the alternative mechanisms by which MEN may be disrupted. Specifically, Amn1 binds Tem1 (Termination of M-phase, a GTPase that belongs to the RAS superfamily), and disrupts its association with Cdc15, the primary downstream target. Amn1 is a leucine-rich repeat (LRR) protein, with 12 repeats in the S. cerevisiae ortholog. 
As a negative regulator of the signal transduction pathway MEN, overexpression of AMN1 slows the growth of wild type cells. The function of the vertebrate members of this family has not been determined experimentally, they have fewer LRRs that determine the extent of this model." Q#9896 - CGI_10005037 superfamily 219782 41 105 4.62E-17 73.1378 cl07050 RNA_pol_Rbc25 superfamily N - "RNA polymerase III subunit Rpc25; Rpc25 is a strongly conserved subunit of RNA polymerase III and has homology to Rpa43 in RNA polymerase I, Rpb7 in RNA polymerase II and the archaeal RpoE subunit. Rpc25 is required for transcription initiation and is not essential for the elongating properties of RNA polymerase III." Q#9898 - CGI_10018604 superfamily 217897 95 282 4.60E-73 224.824 cl04402 APG5 superfamily - - Autophagy protein Apg5; Apg5 is directly required for the import of aminopeptidase I via the cytoplasm-to-vacuole targeting pathway. Q#9900 - CGI_10018606 superfamily 243078 4 123 6.02E-34 128.099 cl02544 VHS_ENTH_ANTH superfamily - - "VHS, ENTH and ANTH domain superfamily; composed of proteins containing a VHS, ENTH or ANTH domain. The VHS domain is present in Vps27 (Vacuolar Protein Sorting), Hrs (Hepatocyte growth factor-regulated tyrosine kinase substrate) and STAM (Signal Transducing Adaptor Molecule). It is located at the N-termini of proteins involved in intracellular membrane trafficking. The epsin N-terminal homology (ENTH) domain is an evolutionarily conserved protein module found primarily in proteins that participate in clathrin-mediated endocytosis. A set of proteins previously designated as harboring an ENTH domain in fact contains a highly similar, yet unique module referred to as an AP180 N-terminal homology (ANTH) domain. VHS, ENTH and ANTH domains are structurally similar and are composed of a superhelix of eight alpha helices. 
ENTH and ANTH (E/ANTH) domains bind both inositol phospholipids and proteins and contribute to the nucleation and formation of clathrin coats on membranes. ENTH domains also function in the development of membrane curvature through lipid remodeling during the formation of clathrin-coated vesicles. E/ANTH domain-bearing proteins have recently been shown to function with adaptor protein-1 and GGA adaptors at the trans-Golgi network, which suggests that E/ANTH domains are universal components of the machinery for clathrin-mediated membrane budding." Q#9900 - CGI_10018606 superfamily 247723 433 509 2.45E-29 113.589 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#9901 - CGI_10018607 superfamily 242559 30 371 0 538.054 cl01529 GH99_GH71_like superfamily - - "Glycoside hydrolase families 71, 99, and related domains; This superfamily of glycoside hydrolases contains families GH71 and GH99 (following the CAZY nomenclature), as well as other members with undefined function and specificity." 
Q#9903 - CGI_10018609 superfamily 243035 109 157 1.28E-12 61.0766 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#9905 - CGI_10018611 superfamily 247805 28 73 5.06E-07 45.3279 cl17251 DEXDc superfamily N - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#9907 - CGI_10018613 superfamily 116971 118 172 0.00710713 35.0114 cl07131 Ly49 superfamily NC - "Ly49-like protein, N-terminal region; The sequences making up this family are annotated as, or are similar to, Ly49 receptors. These are type II transmembrane receptors expressed by mouse natural killer (NK) cells. They are classified as being activating (e.g.Ly49D and H) or inhibitory (e.g. Ly49A and G), depending on their effect on NK cell function. They are members of the C-type lectin receptor superfamily, and in fact in many family members this region is found immediately N-terminal to a lectin C-type domain (pfam00059)." 
Q#9908 - CGI_10018614 superfamily 243034 5 97 2.35E-19 80.5019 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#9908 - CGI_10018614 superfamily 248098 202 274 2.85E-36 125.484 cl17544 U-box superfamily - - U-box domain; This domain is related to the Ring finger pfam00097 but lacks the zinc binding residues. Q#9908 - CGI_10018614 superfamily 244881 76 124 0.00668451 36.0434 cl08267 ISOPREN_C2_like superfamily C - "This group contains class II terpene cyclases, protein prenyltransferases beta subunit, two broadly specific proteinase inhibitors alpha2-macroglobulin (alpha (2)-M) and pregnancy zone protein (PZP) and, the C3 C4 and C5 components of vertebrate complement. 
Class II terpene cyclases include squalene cyclase (SQCY) and 2,3-oxidosqualene cyclase (OSQCY), these integral membrane proteins catalyze a cationic cyclization cascade converting linear triterpenes to fused ring compounds. The protein prenyltransferases include protein farnesyltransferase (FTase) and geranylgeranyltransferase types I and II (GGTase-I and GGTase-II) which catalyze the carboxyl-terminal lipidation of Ras, Rab, and several other cellular signal transduction proteins, facilitating membrane associations and specific protein-protein interactions. Alpha (2)-M is a major carrier protein in serum and involved in the immobilization and entrapment of proteases. PZP is a pregnancy associated protein. Alpha (2)-M and PZP are known to bind to and, may modulate, the activity of placental protein-14 in T-cell growth and cytokine production thereby protecting the allogeneic fetus from attack by the maternal immune system." Q#9909 - CGI_10018615 superfamily 222269 1 178 7.97E-39 142.462 cl18657 Cupin_8 superfamily N - Cupin-like domain; This cupin like domain shares similarity to the JmjC domain. Q#9912 - CGI_10018618 superfamily 247725 396 494 1.56E-44 159.413 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." 
Q#9912 - CGI_10018618 superfamily 247725 12 146 8.46E-58 198.796 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#9912 - CGI_10018618 superfamily 243591 821 933 0.0070539 38.6154 cl03951 CDC37_N superfamily N - Cdc37 N terminal kinase binding; Cdc37 is a molecular chaperone required for the activity of numerous eukaryotic protein kinases. This domain corresponds to the N terminal domain which binds predominantly to protein kinases and is found N terminal to the Hsp (Heat shocked protein) 90-binding domain pfam08565. Expression of a construct consisting of only the N-terminal domain of Saccharomyces pombe Cdc37 results in cellular viability. This indicates that interactions with the cochaperone Hsp90 may not be essential for Cdc37 function. Q#9913 - CGI_10018619 superfamily 190261 426 499 4.51E-27 106.479 cl03504 RFX_DNA_binding superfamily - - RFX DNA-binding domain; RFX is a regulatory factor which binds to the X box of MHC class II genes and is essential for their expression. The DNA-binding domain of RFX is the central domain of the protein and binds ssDNA as either a monomer or homodimer. Q#9913 - CGI_10018619 superfamily 218683 7 97 3.25E-26 104.596 cl05307 DUF814 superfamily - - "Domain of unknown function (DUF814); This domain occurs in proteins that have been annotated as Fibronectin/fibrinogen binding protein by similarity. This annotation comes from B. subtilis yloA, where the N-terminal region is involved in this activity. 
Hence the activity of this C-terminal domain is unknown. This domain contains a conserved motif D/E-X-W/Y-X-H that may be functionally important." Q#9913 - CGI_10018619 superfamily 218161 308 385 0.00723495 36.8889 cl04612 RFX1_trans_act superfamily NC - "RFX1 transcription activation region; The RFX family is a family of winged-helix DNA binding proteins. RFX1 is a regulatory factor essential for expression of MHC class II genes. This region is found N terminal to the RFX DNA binding region (pfam02257) in some mammalian RFX proteins, and is thought to activate transcription when associated with DNA. Deletion analysis has identified the region 233-351 in human RFX1 as being required for maximal activation." Q#9913 - CGI_10018619 superfamily 206361 157 193 0.0090491 35.7627 cl16696 DUF4315 superfamily C - Domain of unknown function (DUF4315); This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are approximately 90 amino acids in length. Q#9914 - CGI_10018620 superfamily 241622 292 360 4.66E-15 72.5994 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. 
The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#9914 - CGI_10018620 superfamily 241622 14 78 1.79E-08 53.0781 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#9914 - CGI_10018620 superfamily 241622 81 135 7.58E-07 48.3319 cl00117 PDZ superfamily C - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. 
Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archebacteria, and eurkayotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#9914 - CGI_10018620 superfamily 143751 398 473 4.75E-10 57.8286 cl11968 harmonin_N_like superfamily - - "N-terminal protein-binding module of harmonin and similar domains; This domain is found in harmonin, and similar proteins such as delphilin, and whirlin. These are postsynaptic density-95/discs-large/ZO-1 (PDZ) domain-containing scaffold proteins. Harmonin and whirlin are organizers of the Usher protein network of the inner ear and the retina, delphilin is found at the cerebellar parallel fiber-Purkinje cell synapses. This harmonin_N_like domain is found in either one or two copies. Harmonin contains a single copy, which is found at its N-terminus and binds specifically to a short internal peptide fragment of the cadherin 23 cytoplasmic domain; cadherin 23 is a component of the Usher protein network. Whirlin contains two copies of the harmonin_N_like domain; the first of these has been assayed for interaction with the cytoplasmic domain of cadherin 23 and no interaction could be detected." 
Q#9914 - CGI_10018620 superfamily 143751 183 258 2.00E-07 50.0076 cl11968 harmonin_N_like superfamily - - "N-terminal protein-binding module of harmonin and similar domains; This domain is found in harmonin, and similar proteins such as delphilin, and whirlin. These are postsynaptic density-95/discs-large/ZO-1 (PDZ) domain-containing scaffold proteins. Harmonin and whirlin are organizers of the Usher protein network of the inner ear and the retina, delphilin is found at the cerebellar parallel fiber-Purkinje cell synapses. This harmonin_N_like domain is found in either one or two copies. Harmonin contains a single copy, which is found at its N-terminus and binds specifically to a short internal peptide fragment of the cadherin 23 cytoplasmic domain; cadherin 23 is a component of the Usher protein network. Whirlin contains two copies of the harmonin_N_like domain; the first of these has been assayed for interaction with the cytoplasmic domain of cadherin 23 and no interaction could be detected." 
Q#9915 - CGI_10018621 superfamily 247792 14 62 3.33E-07 48.2108 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#9915 - CGI_10018621 superfamily 241563 159 195 2.82E-05 42.2744 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#9915 - CGI_10018621 superfamily 241563 261 300 7.61E-07 47.0888 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#9915 - CGI_10018621 superfamily 128778 311 432 1.50E-06 47.2595 cl17972 BBC superfamily - - B-Box C-terminal domain; Coiled coil region C-terminal to (some) B-Box domains Q#9916 - CGI_10018622 superfamily 247907 1094 1239 1.89E-12 67.058 cl17353 LamG superfamily - - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. 
Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." Q#9916 - CGI_10018622 superfamily 247907 918 1056 2.71E-10 60.1245 cl17353 LamG superfamily - - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." Q#9916 - CGI_10018622 superfamily 246918 652 702 6.72E-15 71.8491 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#9916 - CGI_10018622 superfamily 246918 604 647 3.79E-10 57.9819 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#9916 - CGI_10018622 superfamily 246918 283 333 8.50E-10 56.8263 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#9916 - CGI_10018622 superfamily 246918 710 752 1.57E-05 44.4999 cl15278 TSP_1 superfamily C - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#9916 - CGI_10018622 superfamily 246918 447 501 2.43E-05 43.7295 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#9916 - CGI_10018622 superfamily 246918 504 556 0.000520002 39.8775 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#9919 - CGI_10018625 superfamily 241680 3 211 3.90E-31 115.086 cl00200 MIP superfamily - - "Major intrinsic protein (MIP) superfamily. Members of the MIP superfamily function as membrane channels that selectively transport water, small neutral molecules, and ions out of and between cells. 
The channel proteins share a common fold: the N-terminal cytosolic portion followed by six transmembrane helices, which might have arisen through gene duplication. On the basis of sequence similarity and functional characteristics, the superfamily can be subdivided into two major groups: water-selective channels called aquaporins (AQPs) and glycerol uptake facilitators (GlpFs). AQPs are found in all three kingdoms of life, while GlpFs have been characterized only within microorganisms." Q#9920 - CGI_10018626 superfamily 241680 23 191 2.38E-31 114.658 cl00200 MIP superfamily N - "Major intrinsic protein (MIP) superfamily. Members of the MIP superfamily function as membrane channels that selectively transport water, small neutral molecules, and ions out of and between cells. The channel proteins share a common fold: the N-terminal cytosolic portion followed by six transmembrane helices, which might have arisen through gene duplication. On the basis of sequence similarity and functional characteristics, the superfamily can be subdivided into two major groups: water-selective channels called aquaporins (AQPs) and glycerol uptake facilitators (GlpFs). AQPs are found in all three kingdoms of life, while GlpFs have been characterized only within microorganisms." Q#9921 - CGI_10018627 superfamily 241680 166 319 1.32E-26 108.11 cl00200 MIP superfamily N - "Major intrinsic protein (MIP) superfamily. Members of the MIP superfamily function as membrane channels that selectively transport water, small neutral molecules, and ions out of and between cells. The channel proteins share a common fold: the N-terminal cytosolic portion followed by six transmembrane helices, which might have arisen through gene duplication. On the basis of sequence similarity and functional characteristics, the superfamily can be subdivided into two major groups: water-selective channels called aquaporins (AQPs) and glycerol uptake facilitators (GlpFs). 
AQPs are found in all three kingdoms of life, while GlpFs have been characterized only within microorganisms." Q#9921 - CGI_10018627 superfamily 241680 24 169 3.46E-20 89.6202 cl00200 MIP superfamily C - "Major intrinsic protein (MIP) superfamily. Members of the MIP superfamily function as membrane channels that selectively transport water, small neutral molecules, and ions out of and between cells. The channel proteins share a common fold: the N-terminal cytosolic portion followed by six transmembrane helices, which might have arisen through gene duplication. On the basis of sequence similarity and functional characteristics, the superfamily can be subdivided into two major groups: water-selective channels called aquaporins (AQPs) and glycerol uptake facilitators (GlpFs). AQPs are found in all three kingdoms of life, while GlpFs have been characterized only within microorganisms." Q#9921 - CGI_10018627 superfamily 241680 434 595 6.70E-15 73.4848 cl00200 MIP superfamily C - "Major intrinsic protein (MIP) superfamily. Members of the MIP superfamily function as membrane channels that selectively transport water, small neutral molecules, and ions out of and between cells. The channel proteins share a common fold: the N-terminal cytosolic portion followed by six transmembrane helices, which might have arisen through gene duplication. On the basis of sequence similarity and functional characteristics, the superfamily can be subdivided into two major groups: water-selective channels called aquaporins (AQPs) and glycerol uptake facilitators (GlpFs). AQPs are found in all three kingdoms of life, while GlpFs have been characterized only within microorganisms." Q#9922 - CGI_10018628 superfamily 241680 17 231 2.21E-29 110.421 cl00200 MIP superfamily - - "Major intrinsic protein (MIP) superfamily. Members of the MIP superfamily function as membrane channels that selectively transport water, small neutral molecules, and ions out of and between cells. 
The channel proteins share a common fold: the N-terminal cytosolic portion followed by six transmembrane helices, which might have arisen through gene duplication. On the basis of sequence similarity and functional characteristics, the superfamily can be subdivided into two major groups: water-selective channels called aquaporins (AQPs) and glycerol uptake facilitators (GlpFs). AQPs are found in all three kingdoms of life, while GlpFs have been characterized only within microorganisms." Q#9923 - CGI_10018629 superfamily 247755 500 726 4.37E-98 317.139 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#9923 - CGI_10018629 superfamily 247755 1278 1497 6.71E-89 290.946 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." 
Q#9923 - CGI_10018629 superfamily 246918 1679 1710 3.51E-10 58.7523 cl15278 TSP_1 superfamily C - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#9924 - CGI_10018630 superfamily 245226 12 36 0.00303335 36.1083 cl10012 DnaQ_like_exo superfamily N - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#9925 - CGI_10018631 superfamily 245226 13 86 2.56E-06 46.1235 cl10012 DnaQ_like_exo superfamily N - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. 
It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#9927 - CGI_10018633 superfamily 245213 110 143 0.000267142 37.6162 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#9928 - CGI_10018634 superfamily 248264 1 50 0.00407526 32.209 cl17710 DDE_4 superfamily N - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. 
Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#9929 - CGI_10005388 superfamily 243050 4 57 5.05E-24 86.856 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#9930 - CGI_10005389 superfamily 220695 55 184 0.000140764 41.7955 cl18571 7TM_GPCR_Srx superfamily C - Serpentine type 7TM GPCR chemoreceptor Srx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srx is part of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. 
Q#9931 - CGI_10005390 superfamily 241637 444 512 1.47E-10 58.0886 cl00146 TFIIS_I superfamily - - N-terminal domain (domain I) of transcription elongation factor S-II (TFIIS); similar to a domain found in elongin A and CRSP70; likely to be involved in transcription; domain I from TFIIS interacts with RNA polymerase II holoenzyme Q#9931 - CGI_10005390 superfamily 243263 4 137 1.39E-16 81.3002 cl02990 ASC superfamily N - Amiloride-sensitive sodium channel; Amiloride-sensitive sodium channel. Q#9932 - CGI_10005391 superfamily 243859 34 119 8.99E-11 54.2582 cl04722 PLAC8 superfamily - - PLAC8 family; This family includes the Placenta-specific gene 8 protein. Q#9933 - CGI_10005392 superfamily 241564 1 66 2.67E-32 115.826 cl00035 BIR superfamily - - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." 
Q#9933 - CGI_10005392 superfamily 247792 278 317 1.86E-05 41.2772 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#9934 - CGI_10005393 superfamily 241564 305 370 2.03E-32 119.678 cl00035 BIR superfamily - - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." Q#9934 - CGI_10005393 superfamily 241564 74 141 1.23E-16 75.3799 cl00035 BIR superfamily - - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." 
Q#9934 - CGI_10005393 superfamily 247792 582 621 4.53E-06 44.3588 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#9937 - CGI_10004088 superfamily 216399 47 276 1.69E-102 302.337 cl03142 Cyto_heme_lyase superfamily - - Cytochrome c/c1 heme lyase; Cytochrome c/c1 heme lyase. Q#9938 - CGI_10004089 superfamily 243074 125 170 1.09E-11 57.5165 cl02535 F-box-like superfamily - - F-box-like; This is an F-box-like family. Q#9939 - CGI_10004090 superfamily 241832 48 98 3.06E-07 47.6665 cl00388 Thioredoxin_like superfamily C - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). 
Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#9940 - CGI_10004091 superfamily 247723 393 487 8.78E-59 194.952 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#9940 - CGI_10004091 superfamily 247723 576 645 2.04E-41 145.865 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. 
The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#9940 - CGI_10004091 superfamily 247723 689 763 4.46E-39 139.719 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#9940 - CGI_10004091 superfamily 247723 281 361 6.77E-35 128.268 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. 
RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#9944 - CGI_10008131 superfamily 243035 5 80 1.01E-09 56.0889 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. 
C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#9945 - CGI_10008132 superfamily 241563 60 102 0.000765905 37.844 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#9946 - CGI_10008133 superfamily 247684 2 152 1.04E-58 191.75 cl17037 NBD_sugar-kinase_HSP70_actin superfamily N - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#9947 - CGI_10006013 superfamily 141488 126 188 5.22E-16 69.8576 cl02524 GAS2 superfamily - - Growth-Arrest-Specific Protein 2 Domain; Growth-Arrest-Specific Protein 2 Domain. Q#9947 - CGI_10006013 superfamily 241559 15 77 3.52E-06 43.4535 cl00030 CH superfamily N - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). 
The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." Q#9948 - CGI_10006014 superfamily 203593 122 235 0.00896518 36.8946 cl18243 Mod_r superfamily C - "Modifier of rudimentary (Mod(r)) protein; This family represents a conserved region approximately 150 residues long within a number of eukaryotic proteins that show homology with Drosophila melanogaster Modifier of rudimentary (Mod(r)) proteins. The N-terminal half of Mod(r) proteins is acidic, whereas the C-terminal half is basic, and both of these regions are represented in this family. Members of this family include the Vps37 subunit of the endosomal sorting complex ESCRT-I, a complex involved in recruiting transport machinery for protein sorting at the multivesicular body (MVB). The yeast ESCRT-I complex consists of three proteins (Vps23, Vps28 and Vps37). The mammalian homologue of Vps37 interacts with Tsg101 (Pfam: PF05743) through its mod(r) domain and its function is essential for lysosomal sorting of EGF receptors." Q#9949 - CGI_10006015 superfamily 241559 8 113 2.38E-19 78.8919 cl00030 CH superfamily - - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." Q#9950 - CGI_10006016 superfamily 241559 46 134 1.37E-20 82.7439 cl00030 CH superfamily - - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). 
The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." Q#9950 - CGI_10006016 superfamily 109460 162 186 0.00036683 36.2474 cl02859 Calponin superfamily - - Calponin family repeat; Calponin family repeat. Q#9956 - CGI_10008835 superfamily 204544 16 117 3.63E-31 117.031 cl11308 NAD-GH superfamily C - "NAD-specific glutamate dehydrogenase; The members of this are annotated as being NAD-specific glutamate dehydrogenase encoded in antisense gene pair with DnaK-J. However, this could not be confirmed." Q#9957 - CGI_10008836 superfamily 246487 135 245 1.83E-49 172.854 cl13749 eIF3G superfamily - - "eIF3G domain found in eukaryotic translation initiation factor 3 subunit G (eIF-3G) and similar proteins; eIF-3G, also termed eIF-3 subunit 4, or eIF-3-delta, or eIF3-p42, or eIF3-p44, is the RNA-binding subunit of eIF3. eIF3 is a large multi-subunit complex that plays a central role in the initiation of translation by binding to the 40 S ribosomal subunit and promoting the binding of methionyl-tRNAi and mRNA. eIF-3G binds 18 S rRNA and beta-globin mRNA, and therefore appears to be a nonspecific RNA-binding protein. Besides, eIF-3G is one of the cytosolic targets; it interacts with mature apoptosis-inducing factor (AIF). This family also includes yeast eIF3-p33, a homolog of vertebrate eIF-3G; it plays an important role in the initiation phase of protein synthesis in yeast. It binds both mRNA and rRNA fragments due to an RNA recognition motif near its C-terminus." 
Q#9957 - CGI_10008836 superfamily 247723 312 388 1.69E-47 165.791 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#9957 - CGI_10008836 superfamily 246722 879 978 1.43E-19 87.7009 cl14812 PIN_SF superfamily C - "PIN (PilT N terminus) domain: Superfamily; PIN_SF The PIN (PilT N terminus) domain belongs to a large nuclease superfamily with representatives from eukaryota, eubacteria, and archaea. PIN domains were originally named for their sequence similarity to the N-terminal domain of an annotated pili biogenesis protein, PilT, a domain fusion between a PIN-domain and a PilT ATPase domain. The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues), is geometrically similar in the active center of structure-specific 5' nucleases (also known as Flap endonuclease-1-like), PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Seen here are two major divisions in the PIN domain superfamily. 
The first major division, the structure-specific 5' nuclease family, is represented by FEN1, the 5'-3' exonuclease of DNA polymerase I, and T4 RNase H nuclease PIN domains. These 5' nucleases are involved in DNA replication, repair, and recombination. They are capable of both 5'-3' exonucleolytic activity and cleaving bifurcated DNA, in an endonucleolytic, structure-specific manner. Unique to FEN1-like nucleases, the PIN domain has a helical arch/clamp region (I domain) of variable length (approximately 16 to 800 residues) and, inserted within the C-terminal region of the PIN domain, a H3TH (helix-3-turn-helix) domain, an atypical helix-hairpin-helix-2-like region. Both the H3TH domain (not included here) and the helical arch/clamp region are involved in DNA binding. With the exception of Mkt1, these nucleases have a carboxylate rich active site that is involved in binding essential divalent metal ion cofactors (Mg2+, Mn2+, Zn2+, or Co2+). The second major division of the PIN domain superfamily, the VapC-Smg6 family, includes such eukaryotic ribonucleases as, Smg6, an essential factor in nonsense-mediated mRNA decay; Rrp44, the catalytic subunit of the exosome; and Nob1, a ribosome assembly factor critical in pre-rRNA processing. A large percentage of members in this family are bacterial ribonuclease toxins of TA operons such as Mycobacterium tuberculosis VapC and Neisseria gonorrhoeae FitB, as well as, archaeal homologs, Pyrobaculum aerophilum Pea0151 and P. aerophilum Pae2754. Also included are the eukaryotic Fcf1/ Utp24 (FAF1-copurifying factor 1/U three-associated protein 24) and Utp23-like proteins. Components of the small subunit processome, Fcf1/Utp24 and Utp23 are essential proteins involved in pre-rRNA processing and 40S ribosomal subunit assembly." 
Q#9957 - CGI_10008836 superfamily 241647 408 438 0.000143814 40.9742 cl00157 WW superfamily - - Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs; 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs. Q#9959 - CGI_10008838 superfamily 244913 222 659 0 633.47 cl08327 Glyco_hydro_47 superfamily - - "Glycosyl hydrolase family 47; Members of this family are alpha-mannosidases that catalyze the hydrolysis of the terminal 1,2-linked alpha-D-mannose residues in the oligo-mannose oligosaccharide Man(9)(GlcNAc)(2)." Q#9968 - CGI_10007478 superfamily 192997 3 123 6.71E-27 104.201 cl18184 Sterol-sensing superfamily - - "Sterol-sensing domain of SREBP cleavage-activation; Sterol regulatory element-binding proteins (SREBPs) are membrane-bound transcription factors that promote lipid synthesis in animal cells. They are embedded in the membranes of the endoplasmic reticulum (ER) in a helical hairpin orientation and are released from the ER by a two-step proteolytic process. Proteolysis begins when the SREBPs are cleaved at Site-1, which is located at a leucine residue in the middle of the hydrophobic loop in the lumen of the ER. Upon proteolytic processing SREBP can activate the expression of genes involved in cholesterol biosynthesis and uptake. SCAP stimulates cleavage of SREBPs via fusion of their two C-termini. 
This domain is the transmembrane region that traverses the membrane eight times and is the sterol-sensing domain of the cleavage protein. WD40 domains are found towards the C-terminus." Q#9969 - CGI_10007479 superfamily 241754 607 947 0 602.346 cl00286 Motor_domain superfamily - - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. Q#9969 - CGI_10007479 superfamily 245213 319 355 4.12E-11 60.7282 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#9969 - CGI_10007479 superfamily 245213 209 242 2.95E-08 52.2538 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#9969 - CGI_10007479 superfamily 222460 1503 1644 5.98E-08 52.901 cl16484 Microtub_bind superfamily - - Kinesin-associated microtubule-binding; This domain binds to microtubules. 
Q#9973 - CGI_10007483 superfamily 216401 12 285 3.20E-124 358.51 cl03143 F-actin_cap_A superfamily - - F-actin capping protein alpha subunit; F-actin capping protein alpha subunit. Q#9976 - CGI_10007486 superfamily 245226 104 270 4.63E-19 82.3484 cl10012 DnaQ_like_exo superfamily - - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#9977 - CGI_10007487 superfamily 129885 93 318 6.10E-56 183.71 cl17977 nst superfamily - - "UDP-galactose transporter; The 10-12 TMS Nucleotide Sugar Transporters (TC 2.A.7.10). Nucleotide-sugar transporters (NSTs) are found in the Golgi apparatus and the endoplasmic reticulum of eukaryotic cells. 
Members of the family have been sequenced from yeast, protozoans and animals. Animals such as C. elegans possess many of these transporters. Humans have at least two closely related isoforms of the UDP-galactose:UMP exchange transporter. NSTs generally appear to function by antiport mechanisms, exchanging a nucleotide-sugar for a nucleotide. Thus, CMP-sialic acid is exchanged for CMP; GDP-mannose is preferentially exchanged for GMP, and UDP-galactose and UDP-N-acetylglucosamine are exchanged for UMP (or possibly UDP). Other nucleotide sugars (e.g., GDP-fucose, UDP-xylose, UDP-glucose, UDP-N-acetylgalactosamine, etc.) may also be transported in exchange for various nucleotides, but their transporters have not been molecularly characterized. Each compound appears to be translocated by its own transport protein. Transport allows the compound, synthesized in the cytoplasm, to be exported to the lumen of the Golgi apparatus or the endoplasmic reticulum where it is used for the synthesis of glycoproteins and glycolipids." Q#9978 - CGI_10007488 superfamily 129885 137 206 1.57E-21 88.9512 cl17977 nst superfamily C - "UDP-galactose transporter; The 10-12 TMS Nucleotide Sugar Transporters (TC 2.A.7.10). Nucleotide-sugar transporters (NSTs) are found in the Golgi apparatus and the endoplasmic reticulum of eukaryotic cells. Members of the family have been sequenced from yeast, protozoans and animals. Animals such as C. elegans possess many of these transporters. Humans have at least two closely related isoforms of the UDP-galactose:UMP exchange transporter. NSTs generally appear to function by antiport mechanisms, exchanging a nucleotide-sugar for a nucleotide. Thus, CMP-sialic acid is exchanged for CMP; GDP-mannose is preferentially exchanged for GMP, and UDP-galactose and UDP-N-acetylglucosamine are exchanged for UMP (or possibly UDP). Other nucleotide sugars (e.g., GDP-fucose, UDP-xylose, UDP-glucose, UDP-N-acetylgalactosamine, etc.) 
may also be transported in exchange for various nucleotides, but their transporters have not been molecularly characterized. Each compound appears to be translocated by its own transport protein. Transport allows the compound, synthesized in the cytoplasm, to be exported to the lumen of the Golgi apparatus or the endoplasmic reticulum where it is used for the synthesis of glycoproteins and glycolipids." Q#9979 - CGI_10000852 superfamily 247792 68 113 1.88E-06 40.8261 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#9981 - CGI_10005162 superfamily 246723 108 403 1.19E-145 425.054 cl14813 GluZincin superfamily C - "Peptidase Gluzincin family (thermolysin-like proteinases, TLPs) includes peptidases M1, M2, M3, M4, M13, M32 and M36 (fungalysins); Gluzincin family (thermolysin-like peptidases or TLPs) includes several zinc-dependent metallopeptidases such as the M1, M2, M3, M4, M13, M32, M36 peptidases (MEROPS classification), and contain HEXXH and EXXXD motifs as part of their active site. All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. M1 family includes aminopeptidase N (APN) and leukotriene A4 hydrolase (LTA4H). 
APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types. LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity such that the two activities occupy different, but overlapping sites. The peptidase M3 or neurolysin-like family, includes M3, M2 and M32 metallopeptidases. The M3 peptidases have two subfamilies: M3A, includes thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (3.4.24.16), and the mitochondrial intermediate peptidase; M3B contains oligopeptidase F. M2 peptidase angiotensin converting enzyme (ACE, EC 3.4.15.1) catalyzes the conversion of decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II. ACE is a key part of the renin-angiotensin system that regulates blood pressure, thus ACE inhibitors are important for the treatment of hypertension. M32 family includes two eukaryotic enzymes from protozoa Trypanosoma cruzi, a causative agent of Chagas' disease, and Leishmania major, a parasite that causes leishmaniasis, making them attractive targets for drug development. The M4 family includes secreted protease thermolysin (EC 3.4.24.27), pseudolysin, aureolysin, neutral protease as well as fungalysin and bacillolysin (EC 3.4.24.28) that degrade extracellular proteins and peptides for bacterial nutrition, especially prior to sporulation. Thermolysin is widely used as a nonspecific protease to obtain fragments for peptide sequencing as well as in production of the artificial sweetener aspartame. M13 family includes neprilysin (EC 3.4.24.11) and endothelin-converting enzyme I (ECE-1, EC 3.4.24.71), which fulfill a broad range of physiological roles due to the greater variation in the S2' subsite allowing substrate specificity and are prime therapeutic targets for selective inhibition. Peptidase M36 (fungalysin) family includes endopeptidases from pathogenic fungi. 
Fungalysin hydrolyzes extracellular matrix proteins such as elastin and keratin. Aspergillus fumigatus causes the pulmonary disease aspergillosis by invading the lungs of immuno-compromised animals and secreting fungalysin that possibly breaks down proteinaceous structural barriers." Q#9983 - CGI_10005164 superfamily 222478 387 422 0.000144844 39.8901 cl16506 zf-RVT superfamily N - zinc-binding in reverse transcriptase; This domain would appear to be a zinc-binding region of a putative reverse transcriptase. Q#9987 - CGI_10026865 superfamily 220692 64 377 1.59E-09 57.5993 cl18570 7TM_GPCR_Srw superfamily - - Serpentine type 7TM GPCR chemoreceptor Srw; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srw is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. The genes encoding Srw do not appear to be under as strong an adaptive evolutionary pressure as those of Srz. Q#9989 - CGI_10026867 superfamily 245815 10 488 0 880.529 cl11961 ALDH-SF superfamily - - "NAD(P)+-dependent aldehyde dehydrogenase superfamily; The aldehyde dehydrogenase superfamily (ALDH-SF) of NAD(P)+-dependent enzymes, in general, oxidize a wide range of endogenous and exogenous aliphatic and aromatic aldehydes to their corresponding carboxylic acids and play an important role in detoxification. Besides aldehyde detoxification, many ALDH isozymes possess multiple additional catalytic and non-catalytic functions such as participating in metabolic pathways, or as binding proteins, or osmoregulants, to mention a few. The enzyme has three domains, a NAD(P)+ cofactor-binding domain, a catalytic domain, and a bridging domain; and the active enzyme is generally either homodimeric or homotetrameric. 
The catalytic mechanism is proposed to involve cofactor binding, resulting in a conformational change and activation of an invariant catalytic cysteine nucleophile. The cysteine and aldehyde substrate form an oxyanion thiohemiacetal intermediate resulting in hydride transfer to the cofactor and formation of a thioacylenzyme intermediate. Hydrolysis of the thioacylenzyme and release of the carboxylic acid product occurs, and in most cases, the reduced cofactor dissociates from the enzyme. The evolutionary phylogenetic tree of ALDHs appears to have an initial bifurcation between what has been characterized as the classical aldehyde dehydrogenases, the ALDH family (ALDH) and extended family members or aldehyde dehydrogenase-like (ALDH-L) proteins. The ALDH proteins are represented by enzymes which share a number of highly conserved residues necessary for catalysis and cofactor binding and they include such proteins as retinal dehydrogenase, 10-formyltetrahydrofolate dehydrogenase, non-phosphorylating glyceraldehyde 3-phosphate dehydrogenase, delta(1)-pyrroline-5-carboxylate dehydrogenases, alpha-ketoglutaric semialdehyde dehydrogenase, alpha-aminoadipic semialdehyde dehydrogenase, coniferyl aldehyde dehydrogenase and succinate-semialdehyde dehydrogenase. Included in this larger group are all human, Arabidopsis, Tortula, fungal, protozoan, and Drosophila ALDHs identified in families ALDH1 through ALDH22 with the exception of families ALDH18, ALDH19, and ALDH20 which are present in the ALDH-like group. The ALDH-like group is represented by such proteins as gamma-glutamyl phosphate reductase, LuxC-like acyl-CoA reductase, and coenzyme A acylating aldehyde dehydrogenase. All of these proteins have a conserved cysteine that aligns with the catalytic cysteine of the ALDH group." 
Q#9990 - CGI_10026868 superfamily 245815 11 485 0 885.152 cl11961 ALDH-SF superfamily - - "NAD(P)+-dependent aldehyde dehydrogenase superfamily; The aldehyde dehydrogenase superfamily (ALDH-SF) of NAD(P)+-dependent enzymes, in general, oxidize a wide range of endogenous and exogenous aliphatic and aromatic aldehydes to their corresponding carboxylic acids and play an important role in detoxification. Besides aldehyde detoxification, many ALDH isozymes possess multiple additional catalytic and non-catalytic functions such as participating in metabolic pathways, or as binding proteins, or osmoregulants, to mention a few. The enzyme has three domains, a NAD(P)+ cofactor-binding domain, a catalytic domain, and a bridging domain; and the active enzyme is generally either homodimeric or homotetrameric. The catalytic mechanism is proposed to involve cofactor binding, resulting in a conformational change and activation of an invariant catalytic cysteine nucleophile. The cysteine and aldehyde substrate form an oxyanion thiohemiacetal intermediate resulting in hydride transfer to the cofactor and formation of a thioacylenzyme intermediate. Hydrolysis of the thioacylenzyme and release of the carboxylic acid product occurs, and in most cases, the reduced cofactor dissociates from the enzyme. The evolutionary phylogenetic tree of ALDHs appears to have an initial bifurcation between what has been characterized as the classical aldehyde dehydrogenases, the ALDH family (ALDH) and extended family members or aldehyde dehydrogenase-like (ALDH-L) proteins. 
The ALDH proteins are represented by enzymes which share a number of highly conserved residues necessary for catalysis and cofactor binding and they include such proteins as retinal dehydrogenase, 10-formyltetrahydrofolate dehydrogenase, non-phosphorylating glyceraldehyde 3-phosphate dehydrogenase, delta(1)-pyrroline-5-carboxylate dehydrogenases, alpha-ketoglutaric semialdehyde dehydrogenase, alpha-aminoadipic semialdehyde dehydrogenase, coniferyl aldehyde dehydrogenase and succinate-semialdehyde dehydrogenase. Included in this larger group are all human, Arabidopsis, Tortula, fungal, protozoan, and Drosophila ALDHs identified in families ALDH1 through ALDH22 with the exception of families ALDH18, ALDH19, and ALDH20 which are present in the ALDH-like group. The ALDH-like group is represented by such proteins as gamma-glutamyl phosphate reductase, LuxC-like acyl-CoA reductase, and coenzyme A acylating aldehyde dehydrogenase. All of these proteins have a conserved cysteine that aligns with the catalytic cysteine of the ALDH group." Q#9995 - CGI_10026873 superfamily 248097 14 121 9.40E-18 73.8386 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#9996 - CGI_10026874 superfamily 247724 38 315 3.86E-141 406.46 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#9996 - CGI_10026874 superfamily 247063 317 399 6.92E-54 175.456 cl15768 TGS superfamily - - "The TGS domain, named after the ThrRS, GTPase, and SpoT/RelA proteins where it occurs, is structurally similar to ubiquitin. TGS is a small domain of about 50 amino acid residues with a predominantly beta-sheet structure. There is no direct information on the function of the TGS domain, but its presence in two types of regulatory proteins (the GTPases and guanosine polyphosphate phosphohydrolases/synthetases) suggests a ligand (most likely nucleotide)-binding, regulatory role." Q#9996 - CGI_10026874 superfamily 247803 3 62 0.000178119 40.5896 cl17249 YlqF_related_GTPase superfamily N - "Circularly permuted YlqF-related GTPases; These proteins are found in bacteria, eukaryotes, and archaea. They all exhibit a circular permutation of the GTPase signature motifs so that the order of the conserved G box motifs is G4-G5-G1-G2-G3, with G4 and G5 being permuted from the C-terminal region of proteins in the Ras superfamily to the N-terminus of YlqF-related GTPases." Q#9999 - CGI_10026877 superfamily 241600 1 202 6.93E-101 293.378 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. 
Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#10000 - CGI_10026878 superfamily 241600 1 164 1.24E-76 230.205 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#10001 - CGI_10026879 superfamily 241600 348 554 1.09E-97 298.385 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. 
C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#10001 - CGI_10026879 superfamily 241600 163 303 3.96E-55 186.677 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#10001 - CGI_10026879 superfamily 241600 34 129 1.58E-26 107.326 cl00085 FReD superfamily C - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." 
Q#10002 - CGI_10026880 superfamily 241600 1 169 2.67E-83 247.539 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#10003 - CGI_10026881 superfamily 241600 1 173 5.00E-78 234.057 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#10004 - CGI_10026882 superfamily 241600 7 199 4.11E-90 266.028 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. 
The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#10005 - CGI_10026883 superfamily 241600 1 206 1.03E-96 282.977 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#10006 - CGI_10026884 superfamily 241600 1 159 4.88E-68 207.863 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." 
Q#10007 - CGI_10026885 superfamily 241600 1 113 1.67E-41 137.757 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#10008 - CGI_10026886 superfamily 241600 34 212 6.57E-79 237.909 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#10010 - CGI_10026888 superfamily 241845 150 318 3.80E-27 105.496 cl00407 tRNA_m1G_MT superfamily - - "tRNA (Guanine-1)-methyltransferase; This is a family of tRNA (Guanine-1)-methyltransferases EC:2.1.1.31. In E.coli K12 this enzyme catalyzes the conversion of a guanosine residue to N1-methylguanine in position 37, next to the anticodon, in tRNA." 
Q#10011 - CGI_10026889 superfamily 247757 35 224 3.78E-76 238.518 cl17203 Fer4_NifH superfamily - - "The Fer4_NifH superfamily contains a variety of proteins which share a common ATP-binding domain. Functionally, proteins in this superfamily use the energy from hydrolysis of NTP to transfer electron or ion." Q#10011 - CGI_10026889 superfamily 247757 291 394 4.56E-54 180.738 cl17203 Fer4_NifH superfamily N - "The Fer4_NifH superfamily contains a variety of proteins which share a common ATP-binding domain. Functionally, proteins in this superfamily use the energy from hydrolysis of NTP to transfer electron or ion." Q#10012 - CGI_10026890 superfamily 246713 142 279 5.47E-25 98.853 cl14786 ENDO3c superfamily C - "endonuclease III; includes endonuclease III (DNA-(apurinic or apyrimidinic site) lyase), alkylbase DNA glycosidases (Alka-family) and other DNA glycosidases" Q#10012 - CGI_10026890 superfamily 219650 11 144 1.50E-21 88.0891 cl06806 OGG_N superfamily - - "8-oxoguanine DNA glycosylase, N-terminal domain; The presence of 8-oxoguanine residues in DNA can give rise to G-C to T-A transversion mutations. This enzyme is found in archaeal, bacterial and eukaryotic species, and is specifically responsible for the process which leads to the removal of 8-oxoguanine residues. It has DNA glycosylase activity (EC:3.2.2.23) and DNA lyase activity (EC:4.2.99.18). The region featured in this family is the N-terminal domain, which is organised into a single copy of a TBP-like fold. The domain contributes residues to the 8-oxoguanine binding pocket." Q#10014 - CGI_10026892 superfamily 241866 266 597 7.23E-174 499.6 cl00445 Iso_dh superfamily - - Isocitrate/isopropylmalate dehydrogenase; Isocitrate/isopropylmalate dehydrogenase. Q#10014 - CGI_10026892 superfamily 242910 7 235 1.13E-93 295.345 cl02159 Peptidase_C13 superfamily C - "Peptidase C13 family; Members of this family are asparaginyl peptidases. 
The blood fluke parasite Schistosoma mansoni has at least five Clan CA cysteine peptidases in its digestive tract including cathepsins B (2 isoforms), C, F and L. All have been recombinantly expressed as active enzymes, albeit in various stages of activation. In addition, a Clan CD peptidase, termed asparaginyl endopeptidase or 'legumain' has been identified. This has formerly been characterized as a 'haemoglobinase', but this term is probably incorrect. Two cDNAs have been described for Schistosoma mansoni legumain; one encodes an active enzyme whereas the active site cysteine residue encoded by the second cDNA is substituted by an asparagine residue. Both forms have been recombinantly expressed." Q#10015 - CGI_10026893 superfamily 247744 673 845 4.15E-55 189.37 cl17190 NK superfamily - - "Nucleoside/nucleotide kinase (NK) is a protein superfamily consisting of multiple families of enzymes that share structural similarity and are functionally related to the catalysis of the reversible phosphate group transfer from nucleoside triphosphates to nucleosides/nucleotides, nucleoside monophosphates, or sugars. Members of this family play a wide variety of essential roles in nucleotide metabolism, the biosynthesis of coenzymes and aromatic compounds, as well as the metabolism of sugar and sulfate." Q#10015 - CGI_10026893 superfamily 247744 405 580 8.19E-41 149.309 cl17190 NK superfamily - - "Nucleoside/nucleotide kinase (NK) is a protein superfamily consisting of multiple families of enzymes that share structural similarity and are functionally related to the catalysis of the reversible phosphate group transfer from nucleoside triphosphates to nucleosides/nucleotides, nucleoside monophosphates, or sugars. Members of this family play a wide variety of essential roles in nucleotide metabolism, the biosynthesis of coenzymes and aromatic compounds, as well as the metabolism of sugar and sulfate." 
Q#10015 - CGI_10026893 superfamily 247744 146 301 1.08E-36 137.753 cl17190 NK superfamily - - "Nucleoside/nucleotide kinase (NK) is a protein superfamily consisting of multiple families of enzymes that share structural similarity and are functionally related to the catalysis of the reversible phosphate group transfer from nucleoside triphosphates to nucleosides/nucleotides, nucleoside monophosphates, or sugars. Members of this family play a wide variety of essential roles in nucleotide metabolism, the biosynthesis of coenzymes and aromatic compounds, as well as the metabolism of sugar and sulfate." Q#10016 - CGI_10026894 superfamily 247744 24 196 7.18E-66 203.622 cl17190 NK superfamily - - "Nucleoside/nucleotide kinase (NK) is a protein superfamily consisting of multiple families of enzymes that share structural similarity and are functionally related to the catalysis of the reversible phosphate group transfer from nucleoside triphosphates to nucleosides/nucleotides, nucleoside monophosphates, or sugars. Members of this family play a wide variety of essential roles in nucleotide metabolism, the biosynthesis of coenzymes and aromatic compounds, as well as the metabolism of sugar and sulfate." Q#10017 - CGI_10026895 superfamily 241584 534 605 9.24E-11 60.2027 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." 
Q#10017 - CGI_10026895 superfamily 241584 444 524 7.28E-10 57.5063 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#10017 - CGI_10026895 superfamily 241584 355 439 2.76E-08 52.8839 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#10017 - CGI_10026895 superfamily 243058 234 351 7.80E-05 42.6868 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. 
ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#10018 - CGI_10026896 superfamily 247999 698 750 2.17E-09 55.1892 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#10018 - CGI_10026896 superfamily 243098 272 321 2.33E-06 46.1129 cl02573 TUDOR superfamily - - "Tudor domains are found in many eukaryotic organisms and have been implicated in protein-protein interactions in which methylated protein substrates bind to these domains. For example, the Tudor domain of Survival of Motor Neuron (SMN) binds to symmetrically dimethylated arginines of arginine-glycine (RG) rich sequences found in the C-terminal tails of Sm proteins. The SMN protein is linked to spinal muscular atrophy. Another example is the tandem tudor domains of 53BP1, which bind to histone H4 specifically dimethylated at Lys20 (H4-K20me2). 53BP1 is a key transducer of the DNA damage checkpoint signal." Q#10018 - CGI_10026896 superfamily 243098 638 687 2.33E-06 46.1129 cl02573 TUDOR superfamily - - "Tudor domains are found in many eukaryotic organisms and have been implicated in protein-protein interactions in which methylated protein substrates bind to these domains. For example, the Tudor domain of Survival of Motor Neuron (SMN) binds to symmetrically dimethylated arginines of arginine-glycine (RG) rich sequences found in the C-terminal tails of Sm proteins. The SMN protein is linked to spinal muscular atrophy. Another example is the tandem tudor domains of 53BP1, which bind to histone H4 specifically dimethylated at Lys20 (H4-K20me2). 
53BP1 is a key transducer of the DNA damage checkpoint signal." Q#10018 - CGI_10026896 superfamily 247999 796 845 0.000164965 40.6582 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#10018 - CGI_10026896 superfamily 247999 332 381 0.000865078 38.6256 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#10019 - CGI_10026897 superfamily 243072 110 236 8.77E-35 129.426 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#10019 - CGI_10026897 superfamily 243072 177 305 2.72E-34 127.885 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#10019 - CGI_10026897 superfamily 245201 485 615 4.70E-32 124.656 cl09925 PKc_like superfamily C - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#10019 - CGI_10026897 superfamily 243072 246 375 2.10E-31 119.796 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#10019 - CGI_10026897 superfamily 243072 349 417 3.57E-11 61.2454 cl02529 ANK superfamily C - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#10019 - CGI_10026897 superfamily 245201 637 674 0.00121547 40.2124 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. 
It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#10022 - CGI_10026900 superfamily 247856 13 93 5.88E-06 39.8385 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#10023 - CGI_10026901 superfamily 220778 32 981 1.81E-60 225.146 cl11127 Nup188 superfamily - - "Nucleoporin subcomplex protein binding to Pom34; This is one of the many peptides that make up the nucleoporin complex (NPC), and is found across eukaryotes. The Nup188 subcomplex (Nic96p-Nup188p-Nup192p-Pom152p) is one of at least six that make up the NPC, and as such is symmetrically localised on both faces of the NPC at the nuclear end, being integrally bound to the C-terminus of Pom34p." Q#10024 - CGI_10026902 superfamily 241578 131 299 7.09E-35 128.66 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#10025 - CGI_10026903 superfamily 243400 113 171 0.000769729 38.9613 cl03362 AICARFT_IMPCHas superfamily NC - "AICARFT/IMPCHase bienzyme; This is a family of bifunctional enzymes catalyzing the last two steps in de novo purine biosynthesis. The bifunctional enzyme is found in both prokaryotes and eukaryotes. The second last step is catalyzed by 5-aminoimidazole-4-carboxamide ribonucleotide formyltransferase EC:2.1.2.3 (AICARFT), this enzyme catalyzes the formylation of AICAR with 10-formyl-tetrahydrofolate to yield FAICAR and tetrahydrofolate. This is catalyzed by a pair of C-terminal deaminase fold domains in the protein, where the active site is formed by the dimeric interface of two monomeric units. The last step is catalyzed by the N-terminal IMP (Inosine monophosphate) cyclohydrolase domain EC:3.5.4.10 (IMPCHase), cyclizing FAICAR (5-formylaminoimidazole-4-carboxamide ribonucleotide) to IMP." Q#10026 - CGI_10026904 superfamily 248097 14 117 7.11E-14 63.053 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#10027 - CGI_10026905 superfamily 248097 51 170 8.01E-17 72.683 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. 
Q#10028 - CGI_10026906 superfamily 217738 105 234 1.40E-35 131.64 cl04267 Ndc80_HEC superfamily - - HEC/Ndc80p family; Members of this family are components of the mitotic spindle. It has been shown that Ndc80/HEC from yeast is part of a complex called the Ndc80p complex. This complex is thought to bind to the microtubules of the spindle. Q#10028 - CGI_10026906 superfamily 245313 281 414 0.00483048 36.7653 cl10488 FlaC_arch superfamily - - "Flagella accessory protein C (FlaC); Although archaeal flagella appear superficially similar to those of bacteria, they are quite distinct. In several archaea, the flagellin genes are followed immediately by the flagellar accessory genes flaCDEFGHIJ. The gene products may have a role in translocation, secretion, or assembly of the flagellum. FlaC is a protein whose exact role is unknown but it has been shown to be membrane-associated (by immuno-blotting fractionated cells)." Q#10029 - CGI_10026907 superfamily 248097 68 171 9.52E-15 66.905 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#10031 - CGI_10026909 superfamily 241596 44 98 2.16E-11 58.7647 cl00081 HLH superfamily - - "Helix-loop-helix domain, found in specific DNA- binding proteins that act as transcription factors; 60-100 amino acids long. 
A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE 5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved in both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#10032 - CGI_10026910 superfamily 222150 289 313 0.000448219 38.1417 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#10034 - CGI_10026912 superfamily 243035 32 150 8.56E-29 103.468 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. 
For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#10035 - CGI_10026913 superfamily 243176 8 521 0 892.452 cl02777 chaperonin_like superfamily - - "chaperonin_like superfamily. Chaperonins are involved in productive folding of proteins. They share a common general morphology, a double toroid of 2 stacked rings, each composed of 7-9 subunits. There are 2 main chaperonin groups. The symmetry of type I is seven-fold and they are found in eubacteria (GroEL) and in organelles of eubacterial descent (hsp60 and RBP). The symmetry of type II is eight- or nine-fold and they are found in archaea (thermosome), thermophilic bacteria (TF55) and in the eukaryotic cytosol (CCT). Their common function is to sequester nonnative proteins inside their central cavity and promote folding by using energy derived from ATP hydrolysis. This superfamily also contains related domains from Fab1-like phosphatidylinositol 3-phosphate (PtdIns3P) 5-kinases that only contain the intermediate and apical domains." Q#10038 - CGI_10026916 superfamily 246669 219 354 2.90E-71 220.753 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." 
Q#10038 - CGI_10026916 superfamily 246669 87 211 1.23E-59 190.234 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#10039 - CGI_10026917 superfamily 245814 110 166 1.12E-05 42.9396 cl11960 Ig superfamily N - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#10041 - CGI_10026919 superfamily 241546 1672 1793 7.68E-40 146.652 cl00011 PLAT superfamily - - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. 
The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats.The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#10041 - CGI_10026919 superfamily 245847 34 143 9.92E-06 46.5765 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#10041 - CGI_10026919 superfamily 243086 1561 1595 0.00217494 38.5102 cl02559 GPS superfamily C - "Latrophilin/CL-1-like GPS domain; Domain present in latrophilin/CL-1, sea urchin REJ and polycystin." Q#10042 - CGI_10026920 superfamily 243051 795 966 2.69E-36 135.583 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#10042 - CGI_10026920 superfamily 243051 378 529 1.08E-33 128.264 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. 
MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#10042 - CGI_10026920 superfamily 243051 623 783 2.91E-33 127.109 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." 
Q#10042 - CGI_10026920 superfamily 245814 553 609 3.75E-07 49.0247 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#10042 - CGI_10026920 superfamily 245814 301 368 6.65E-10 56.994 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#10043 - CGI_10026921 superfamily 243035 10 35 2.15E-05 37.6694 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. 
Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. 
Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#10044 - CGI_10026922 superfamily 243035 24 120 3.07E-12 59.1506 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. 
Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#10046 - CGI_10026924 superfamily 246597 38 220 5.82E-56 181.006 cl13995 MPP_superfamily superfamily - - "metallophosphatase superfamily, metallophosphatase domain; Metallophosphatases (MPPs), also known as metallophosphoesterases, phosphodiesterases (PDEs), binuclear metallophosphoesterases, and dimetal-containing phosphoesterases (DMPs), represent a diverse superfamily of enzymes with a conserved domain containing an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. This superfamily includes: the phosphoprotein phosphatases (PPPs), Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. 
This domain is thought to allow for productive metal coordination." Q#10047 - CGI_10026925 superfamily 248458 30 391 4.17E-19 86.2137 cl17904 MFS superfamily - - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." 
Q#10049 - CGI_10026927 superfamily 221174 99 254 2.52E-48 166.312 cl13199 Folliculin superfamily - - "Vesicle coat protein involved in Golgi to plasma membrane transport; In yeast cells this family functions in the regulated delivery of Gap1p (a general amino acid permease) to the cell surface, perhaps as a component of a post-Golgi secretory-vesicle coat complex. Birt-Hogg-Dube (BHD)4 syndrome is an autosomal dominant disorder characterized by hamartomas of skin follicles, lung cysts, spontaneous pneumothorax, and renal cell carcinoma. Folliculin is the protein from the BHD4 gene and is found to have no significant homology to any other human proteins. It is expressed in most tissues. These same symptoms also occur in TSC or tuberous sclerosis complex, suggesting that the same pathway is involved, and it is likely that the target is the down-stream Tor2 - an essential gene. Folliculin appears to bind Tor2, and down-regulation of Tor2 activity leads to up-regulation of nitrogen responsive genes including membrane transporters and amino acid permeases." Q#10050 - CGI_10026928 superfamily 246723 91 541 0 616.113 cl14813 GluZincin superfamily - - "Peptidase Gluzincin family (thermolysin-like proteinases, TLPs) includes peptidases M1, M2, M3, M4, M13, M32 and M36 (fungalysins); Gluzincin family (thermolysin-like peptidases or TLPs) includes several zinc-dependent metallopeptidases such as the M1, M2, M3, M4, M13, M32, M36 peptidases (MEROPS classification), and contain HEXXH and EXXXD motifs as part of their active site. All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. M1 family includes aminopeptidase N (APN) and leukotriene A4 hydrolase (LTA4H). APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types. 
LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity such that the two activities occupy different, but overlapping sites. The peptidase M3 or neurolysin-like family, includes M3, M2 and M32 metallopeptidases. The M3 peptidases have two subfamilies: M3A, includes thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (3.4.24.16), and the mitochondrial intermediate peptidase; M3B contains oligopeptidase F. M2 peptidase angiotensin converting enzyme (ACE, EC 3.4.15.1) catalyzes the conversion of decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II. ACE is a key part of the renin-angiotensin system that regulates blood pressure, thus ACE inhibitors are important for the treatment of hypertension. M32 family includes two eukaryotic enzymes from protozoa Trypanosoma cruzi, a causative agent of Chagas' disease, and Leishmania major, a parasite that causes leishmaniasis, making them attractive targets for drug development. The M4 family includes secreted protease thermolysin (EC 3.4.24.27), pseudolysin, aureolysin, neutral protease as well as fungalysin and bacillolysin (EC 3.4.24.28) that degrade extracellular proteins and peptides for bacterial nutrition, especially prior to sporulation. Thermolysin is widely used as a nonspecific protease to obtain fragments for peptide sequencing as well as in production of the artificial sweetener aspartame. M13 family includes neprilysin (EC 3.4.24.11) and endothelin-converting enzyme I (ECE-1, EC 3.4.24.71), which fulfill a broad range of physiological roles due to the greater variation in the S2' subsite allowing substrate specificity and are prime therapeutic targets for selective inhibition. Peptidase M36 (fungalysin) family includes endopeptidases from pathogenic fungi. Fungalysin hydrolyzes extracellular matrix proteins such as elastin and keratin. 
Aspergillus fumigatus causes the pulmonary disease aspergillosis by invading the lungs of immuno-compromised animals and secreting fungalysin that possibly breaks down proteinaceous structural barriers." Q#10054 - CGI_10009291 superfamily 177822 1 145 3.45E-07 48.7629 cl18088 PLN02164 superfamily NC - sulfotransferase Q#10055 - CGI_10009292 superfamily 177822 54 234 5.94E-12 63.4005 cl18088 PLN02164 superfamily N - sulfotransferase Q#10060 - CGI_10009297 superfamily 247068 473 569 2.48E-25 102.392 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10060 - CGI_10009297 superfamily 247068 257 353 6.19E-20 86.9837 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10060 - CGI_10009297 superfamily 247068 578 676 1.13E-18 83.5169 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10060 - CGI_10009297 superfamily 247068 372 465 2.42E-16 76.5833 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10060 - CGI_10009297 superfamily 247068 167 249 8.80E-15 71.9609 cl15786 CA_like superfamily N - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10060 - CGI_10009297 superfamily 247068 695 778 2.72E-12 64.6421 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10060 - CGI_10009297 superfamily 247068 63 139 0.000211863 41.145 cl15786 CA_like superfamily N - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10061 - CGI_10009298 superfamily 247068 626 725 2.40E-22 94.3025 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10061 - CGI_10009298 superfamily 247068 522 618 3.34E-21 90.8357 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10061 - CGI_10009298 superfamily 247068 306 395 1.94E-19 85.8281 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10061 - CGI_10009298 superfamily 247068 198 298 4.76E-16 75.8129 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10061 - CGI_10009298 superfamily 247068 737 825 2.63E-15 73.8869 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10061 - CGI_10009298 superfamily 247068 418 514 9.27E-13 66.1829 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10061 - CGI_10009298 superfamily 247068 76 188 1.65E-07 50.775 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10062 - CGI_10009300 superfamily 247068 571 666 1.48E-25 103.162 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10062 - CGI_10009300 superfamily 247068 674 769 1.01E-23 98.1545 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10062 - CGI_10009300 superfamily 247068 355 455 1.37E-21 91.9913 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10062 - CGI_10009300 superfamily 247068 470 563 5.12E-20 87.3689 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10062 - CGI_10009300 superfamily 247068 247 347 2.60E-16 76.5833 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10062 - CGI_10009300 superfamily 247068 789 872 3.53E-12 64.6421 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10062 - CGI_10009300 superfamily 247068 134 237 1.46E-09 56.9382 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10063 - CGI_10009301 superfamily 247068 462 557 1.47E-23 97.3841 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10063 - CGI_10009301 superfamily 247068 568 662 3.16E-20 87.7541 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10063 - CGI_10009301 superfamily 247068 243 341 6.69E-19 83.9021 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10063 - CGI_10009301 superfamily 247068 356 454 4.72E-17 78.5093 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10063 - CGI_10009301 superfamily 247068 131 233 4.88E-17 78.5093 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10063 - CGI_10009301 superfamily 247068 675 765 4.93E-14 69.6497 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10063 - CGI_10009301 superfamily 247068 17 91 3.24E-05 43.4562 cl15786 CA_like superfamily C - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10064 - CGI_10002017 superfamily 247069 6 134 1.32E-30 109.012 cl15787 SEC14 superfamily - - "Sec14p-like lipid-binding domain. Found in secretory proteins, such as S. cerevisiae phosphatidylinositol transfer protein (Sec14p), and in lipid regulated proteins such as RhoGAPs, RhoGEFs and neurofibromin (NF1). SEC14 domain of Dbl is known to associate with G protein beta/gamma subunits." Q#10065 - CGI_10002018 superfamily 247069 107 252 2.58E-33 121.338 cl15787 SEC14 superfamily - - "Sec14p-like lipid-binding domain. Found in secretory proteins, such as S. cerevisiae phosphatidylinositol transfer protein (Sec14p), and in lipid regulated proteins such as RhoGAPs, RhoGEFs and neurofibromin (NF1). SEC14 domain of Dbl is known to associate with G protein beta/gamma subunits." Q#10065 - CGI_10002018 superfamily 247643 29 75 2.06E-10 55.2483 cl16919 CRAL_TRIO_N superfamily - - "CRAL/TRIO, N-terminal domain; This all-alpha domain is found to the N-terminus of pfam00650." Q#10067 - CGI_10010038 superfamily 241568 44 94 3.67E-06 44.7612 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. 
The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#10067 - CGI_10010038 superfamily 243035 114 175 5.86E-06 44.533 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. 
C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#10067 - CGI_10010038 superfamily 192535 185 480 1.67E-06 48.361 cl18179 7TM_GPCR_Srsx superfamily - - Serpentine type 7TM GPCR chemoreceptor Srsx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srsx is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#10068 - CGI_10010039 superfamily 248458 93 235 4.88E-14 71.5761 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. 
MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#10069 - CGI_10010040 superfamily 247743 25 173 3.67E-17 78.3419 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. 
The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#10069 - CGI_10010040 superfamily 216502 233 461 7.38E-68 218.23 cl03209 Peptidase_M41 superfamily - - Peptidase family M41; Peptidase family M41. Q#10070 - CGI_10010041 superfamily 243072 88 197 3.82E-14 68.5642 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#10071 - CGI_10010042 superfamily 201540 13 81 0.00569454 35.6021 cl16960 Troponin superfamily N - "Troponin; Troponin (Tn) contains three subunits, Ca2+ binding (TnC), inhibitory (TnI), and tropomyosin binding (TnT). this Pfam contains members of the TnT subunit. Troponin is a complex of three proteins, Ca2+ binding (TnC), inhibitory (TnI), and tropomyosin binding (TnT). The troponin complex regulates Ca++ induced muscle contraction. This family includes troponin T and troponin I. Troponin I binds to actin and troponin T binds to tropomyosin." Q#10072 - CGI_10010043 superfamily 241563 18 50 0.000144706 39.2427 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#10073 - CGI_10010044 superfamily 241563 18 50 0.00335444 35.5328 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#10074 - CGI_10010045 superfamily 247725 57 141 4.48E-09 49.6006 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#10075 - CGI_10003219 superfamily 243034 23 103 9.09E-07 48.5304 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to 
multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#10075 - CGI_10003219 superfamily 243072 230 337 0.000625779 40.0595 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#10075 - CGI_10003219 superfamily 243072 683 806 0.00126801 38.9039 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#10078 - CGI_10006532 superfamily 246908 226 308 5.59E-07 46.6378 cl15255 SH2 superfamily - - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 
They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." Q#10078 - CGI_10006532 superfamily 247725 10 128 1.16E-06 45.8904 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#10079 - CGI_10006533 superfamily 245835 287 509 2.09E-113 337.709 cl12013 BAR superfamily - - "The Bin/Amphiphysin/Rvs (BAR) domain, a dimerization module that binds membranes and detects membrane curvature; BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions including organelle biogenesis, membrane trafficking or remodeling, and cell division and migration. Mutations in BAR containing proteins have been linked to diseases and their inactivation in cells leads to altered membrane dynamics. A BAR domain with an additional N-terminal amphipathic helix (an N-BAR) can drive membrane curvature. These N-BAR domains are found in amphiphysins and endophilins, among others. 
BAR domains are also frequently found alongside domains that determine lipid specificity, such as the Pleckstrin Homology (PH) and Phox Homology (PX) domains which are present in beta centaurins (ACAPs and ASAPs) and sorting nexins, respectively. A FES-CIP4 Homology (FCH) domain together with a coiled coil region is called the F-BAR domain and is present in Pombe/Cdc15 homology (PCH) family proteins, which include Fes/Fes tyrosine kinases, PACSIN or syndapin, CIP4-like proteins, and srGAPs, among others. The Inverse (I)-BAR or IRSp53/MIM homology Domain (IMD) is found in multi-domain proteins, such as IRSp53 and MIM, that act as scaffolding proteins and transducers of a variety of signaling pathways that link membrane dynamics and the underlying actin cytoskeleton. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The I-BAR domain induces membrane protrusions in the opposite direction compared to classical BAR and F-BAR domains, which produce membrane invaginations. BAR domains that also serve as protein interaction domains include those of arfaptin and OPHN1-like proteins, among others, which bind to Rac and Rho GAP domains, respectively." Q#10079 - CGI_10006533 superfamily 243088 134 257 1.03E-57 188.942 cl02563 PX_domain superfamily - - "The Phox Homology domain, a phosphoinositide binding module; The PX domain is a phosphoinositide (PI) binding module involved in targeting proteins to membranes. Proteins containing PX domains interact with PIs and have been implicated in highly diverse functions such as cell signaling, vesicular trafficking, protein sorting, lipid modification, cell polarity and division, activation of T and B cells, and cell survival. Many members of this superfamily bind phosphatidylinositol-3-phosphate (PI3P) but in some cases, other PIs such as PI4P or PI(3,4)P2, among others, are the preferred substrates. 
In addition to protein-lipid interaction, the PX domain may also be involved in protein-protein interaction, as in the cases of p40phox, p47phox, and some sorting nexins (SNXs). The PX domain is conserved from yeast to humans and is found in more than 100 proteins. The majority of PX domain-containing proteins are SNXs, which play important roles in endosomal sorting." Q#10080 - CGI_10006534 superfamily 147027 53 96 5.42E-06 40.526 cl08423 Endosulfine superfamily NC - cAMP-regulated phosphoprotein/endosulfine conserved region; Conserved region found in both cAMP-regulated phosphoprotein 19 (ARPP-19) and Alpha/Beta endosulfine. No function has yet been assigned to ARPP-19. Endosulfine is the endogenous ligand for the ATP-dependent potassium (K ATP) channels which occupy a key position in the control of insulin release from the pancreatic beta cell by coupling cell polarity to metabolism. In both cases the region occupies the majority of the protein. Q#10081 - CGI_10006535 superfamily 241645 9 79 3.67E-38 134.903 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." 
Q#10081 - CGI_10006535 superfamily 241643 536 573 1.32E-05 42.8315 cl00153 UBA superfamily - - "Ubiquitin Associated domain. The UBA domain is a commonly occurring sequence motif in some members of the ubiquitination pathway, UV excision repair proteins, and certain protein kinases. Although its specific role is so far unknown, it has been suggested that UBA domains are involved in conferring protein target specificity. The domain, a compact three helix bundle, has a conserved GFP-loop and the proline is thought to be critical for binding. The UBA domain is distinct from the conserved three helical domain seen in the N-terminus of EF-TS and eukaryotic NAC proteins." Q#10082 - CGI_10006536 superfamily 243092 163 410 3.51E-08 53.4928 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#10084 - CGI_10006539 superfamily 245206 29 304 9.92E-134 383.74 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#10085 - CGI_10006540 superfamily 243099 90 195 2.71E-40 136.694 cl02575 Bcl-2_like superfamily N - "Apoptosis regulator proteins of the Bcl-2 family, named after B-cell lymphoma 2. This alignment model spans what have been described as Bcl-2 homology regions BH1, BH2, BH3, and BH4. Many members of this family have an additional C-terminal transmembrane segment. Some homologous proteins, which are not included in this model, may miss either the BH4 (Bax, Bak) or the BH2 (Bcl-X(S)) region, and some appear to only share the BH3 region (Bik, Bim, Bad, Bid, Egl-1). This family is involved in the regulation of the outer mitochondrial membrane's permeability and in promoting or preventing the release of apoptogenic factors, which in turn may trigger apoptosis by activating caspases. Bcl-2 and the closely related Bcl-X(L) are anti-apoptotic key regulators of programmed cell death. 
They are assumed to function via heterodimeric protein-protein interactions, binding pro-apoptotic proteins such as Bad (BCL2-antagonist of cell death), Bid, and Bim, by specifically interacting with their BH3 regions. Interfering with this heterodimeric interaction via small-molecule inhibitors may prove effective in targeting various cancers. This family also includes the Caenorhabditis elegans Bcl-2 homolog CED-9, which binds to CED-4, the C. elegans homolog of mammalian Apaf-1. Apaf-1, however, does not seem to be inhibited by Bcl-2 directly." Q#10087 - CGI_10006542 superfamily 243099 90 192 1.20E-37 129.375 cl02575 Bcl-2_like superfamily N - "Apoptosis regulator proteins of the Bcl-2 family, named after B-cell lymphoma 2. This alignment model spans what have been described as Bcl-2 homology regions BH1, BH2, BH3, and BH4. Many members of this family have an additional C-terminal transmembrane segment. Some homologous proteins, which are not included in this model, may miss either the BH4 (Bax, Bak) or the BH2 (Bcl-X(S)) region, and some appear to only share the BH3 region (Bik, Bim, Bad, Bid, Egl-1). This family is involved in the regulation of the outer mitochondrial membrane's permeability and in promoting or preventing the release of apoptogenic factors, which in turn may trigger apoptosis by activating caspases. Bcl-2 and the closely related Bcl-X(L) are anti-apoptotic key regulators of programmed cell death. They are assumed to function via heterodimeric protein-protein interactions, binding pro-apoptotic proteins such as Bad (BCL2-antagonist of cell death), Bid, and Bim, by specifically interacting with their BH3 regions. Interfering with this heterodimeric interaction via small-molecule inhibitors may prove effective in targeting various cancers. This family also includes the Caenorhabditis elegans Bcl-2 homolog CED-9, which binds to CED-4, the C. elegans homolog of mammalian Apaf-1. Apaf-1, however, does not seem to be inhibited by Bcl-2 directly." 
Q#10090 - CGI_10004183 superfamily 241766 53 328 1.74E-127 368.363 cl00303 PNP_UDP_1 superfamily - - Phosphorylase superfamily; Members of this family include: purine nucleoside phosphorylase (PNP) Uridine phosphorylase (UdRPase) 5'-methylthioadenosine phosphorylase (MTA phosphorylase) Q#10093 - CGI_10004186 superfamily 216981 259 411 1.80E-12 64.4762 cl17087 OTU superfamily - - "OTU-like cysteine protease; This family is comprised of a group of predicted cysteine proteases, homologous to the Ovarian Tumour (OTU) gene in Drosophila. Members include proteins from eukaryotes, viruses and pathogenic bacterium. The conserved cysteine and histidine, and possibly the aspartate, represent the catalytic residues in this putative group of proteases." Q#10096 - CGI_10014292 superfamily 241574 650 869 6.72E-109 336.865 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#10096 - CGI_10014292 superfamily 241584 191 225 0.00577196 35.9351 cl00065 FN3 superfamily N - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. 
Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#10096 - CGI_10014292 superfamily 197431 453 513 0.000103913 42.7964 cl06408 UP_III_II superfamily NC - "Uroplakin IIIb, IIIa and II; Uroplakins (UPs) are a family of proteins that associate with each other to form plaques on the apical surface of the urothelium, the pseudo-stratified epithelium lining the urinary tract from renal pelvis to the bladder outlet. UPs are classified into 3 types: UPIa and UPIb, UPII, and UPIIIa and IIIb. UPIs are tetraspanins that have four transmembrane domains separating one large and one small extracellular domain while UPII and UPIIIs are single-pass transmembrane proteins. UPIa and UPIb form specific heterodimers with UPII and UPIII, respectively, which allows them to exit the endoplasmic reticulum. UPII/UPIa and UPIIIs/UPIb form heterotetramers; six of these tetramers form the 16nm particle, seen in the hexagonal array of the asymmetric unit membrane, which is believed to form a urinary tract barrier. Uroplakins are also believed to play a role during urinary tract morphogenesis." Q#10100 - CGI_10014296 superfamily 245201 30 230 1.84E-49 165.771 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." 
Q#10101 - CGI_10014297 superfamily 247068 521 619 1.75E-27 108.94 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10101 - CGI_10014297 superfamily 247068 730 826 6.10E-27 107.399 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10101 - CGI_10014297 superfamily 247068 835 934 2.63E-23 96.9989 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10101 - CGI_10014297 superfamily 247068 631 723 9.88E-22 92.3765 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10101 - CGI_10014297 superfamily 247068 943 1044 1.78E-20 88.5245 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10101 - CGI_10014297 superfamily 247068 152 233 4.92E-16 75.8129 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10101 - CGI_10014297 superfamily 247068 437 521 1.26E-15 74.6573 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10101 - CGI_10014297 superfamily 247068 350 424 6.75E-10 57.7086 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10101 - CGI_10014297 superfamily 247068 255 343 2.73E-07 50.0046 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10101 - CGI_10014297 superfamily 247068 42 143 0.000441524 39.9894 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10106 - CGI_10016183 superfamily 241574 179 284 7.19E-14 70.1729 cl00053 PTPc superfamily N - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#10109 - CGI_10016186 superfamily 177822 48 283 1.70E-20 88.4385 cl18088 PLN02164 superfamily N - sulfotransferase Q#10114 - CGI_10016191 superfamily 218802 201 228 0.00263243 35.8002 cl05462 DUF862 superfamily N - "PPPDE putative peptidase domain; The PPPDE superfamily (after Permuted Papain fold Peptidases of DsRNA viruses and Eukaryotes), consists of predicted thiol peptidases with a circularly permuted papain-like fold. The inference of the likely DUB function of the PPPDE superfamily proteins is based on the fusions of the catalytic domain to Ub-binding PUG (PUB)/UBA domains and a novel alpha-helical Ub-associated domain (the PUL domain, after PLAP, Ufd3p and Lub1p)." 
Q#10115 - CGI_10016192 superfamily 203591 112 250 3.24E-37 135.962 cl06275 DUF1399 superfamily - - Protein of unknown function (DUF1399); This family represents a conserved region approximately 150 residues long within a number of hypothetical plant proteins of unknown function. Q#10115 - CGI_10016192 superfamily 203591 42 115 0.000504022 39.2768 cl06275 DUF1399 superfamily N - Protein of unknown function (DUF1399); This family represents a conserved region approximately 150 residues long within a number of hypothetical plant proteins of unknown function. Q#10117 - CGI_10016194 superfamily 247805 26 90 1.23E-05 47.6391 cl17251 DEXDc superfamily C - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#10118 - CGI_10016195 superfamily 246669 17 90 1.02E-35 119.982 cl14603 C2 superfamily C - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions."
Q#10119 - CGI_10016196 superfamily 241578 165 418 1.34E-111 331.644 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#10119 - CGI_10016196 superfamily 246669 65 147 1.94E-47 160.039 cl14603 C2 superfamily C - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.
C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#10119 - CGI_10016196 superfamily 246669 17 45 1.52E-06 46.0234 cl14603 C2 superfamily N - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#10120 - CGI_10016197 superfamily 245814 38 128 5.25E-05 40.4587 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond."
Q#10121 - CGI_10016198 superfamily 110440 484 510 0.0068378 34.6909 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#10122 - CGI_10016199 superfamily 110440 393 419 0.00300846 35.4613 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#10123 - CGI_10016200 superfamily 241566 30 79 8.53E-17 69.4431 cl00040 C1 superfamily - - "Protein kinase C conserved region 1 (C1). Cysteine-rich zinc binding domain. Some members of this domain family bind phorbol esters and diacylglycerol, some are reported to bind RasGTP. May occur in tandem arrangement. Diacylglycerol (DAG) is a second messenger, released by activation of Phospholipase D. Phorbol Esters (PE) can act as analogues of DAG and mimic its downstream effects in, for example, tumor promotion. Protein Kinases C are activated by DAG/PE, this activation is mediated by their N-terminal conserved region (C1). DAG/PE binding may be phospholipid dependent.
C1 domains may also mediate DAG/PE signals in chimaerins (a family of Rac GTPase activating proteins), RasGRPs (exchange factors for Ras/Rap1), and Munc13 isoforms (scaffolding proteins involved in exocytosis)." Q#10124 - CGI_10016201 superfamily 246669 1 104 8.38E-59 189.395 cl14603 C2 superfamily N - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#10124 - CGI_10016201 superfamily 245201 145 342 4.14E-153 438.442 cl09925 PKc_like superfamily C - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins."
Q#10124 - CGI_10016201 superfamily 245201 343 385 1.97E-15 75.1992 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#10125 - CGI_10016202 superfamily 241877 325 446 0.0050065 37.9942 cl00459 MIT_CorA-like superfamily NC - "metal ion transporter CorA-like divalent cation transporter superfamily; This superfamily of essential membrane proteins is involved in transporting divalent cations (uptake or efflux) across membranes. They are found in most bacteria and archaea, and in some eukaryotes. It is a functionally diverse group which includes the Mg2+ transporters of Escherichia coli and Salmonella typhimurium CorAs (which can also transport Co2+, and Ni2+ ), the CorA Co2+ transporter from the hyperthermophilic Thermotoga maritima, and the Zn2+ transporter Salmonella typhimurium ZntB, which mediates the efflux of Zn2+ (and Cd2+). It includes five Saccharomyces cerevisiae members: i) two plasma membrane proteins, the Mg2+ transporter Alr1p/Swc3p and the putative Mg2+ transporter, Alr2p, ii) two mitochondrial inner membrane Mg2+ transporters: Mfm1p/Lpe10p, and Mrs2p, and iii) and the vacuole membrane protein Mnr2p, a putative Mg2+ transporter. It also includes a family of Arabidopsis thaliana members (AtMGTs), some of which are localized to distinct tissues, and not all of which can transport Mg2+. 
Thermotoga maritima CorA and Vibrio parahaemolyticus and Salmonella typhimurium ZntB form funnel-shaped homopentamers, the tip of the funnel is formed from two C-terminal transmembrane (TM) helices from each monomer, and the large opening of the funnel from the N-terminal cytoplasmic domains. The GMN signature motif of the MIT superfamily occurs just after TM1, mutation within this motif is known to abolish Mg2+ transport through Salmonella typhimurium CorA, Mrs2p, and Alr1p. Natural variants such as GVN and GIN, as in some ZntB family proteins, may be associated with the transport of different divalent cations, such as zinc and cadmium. The functional diversity of MIT transporters may also be due to minor structural differences regulating gating, substrate selection, and transport." Q#10126 - CGI_10016203 superfamily 246680 48 123 2.63E-08 49.9364 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#10127 - CGI_10016204 superfamily 190308 66 178 7.55E-06 45.0023 cl18163 Fringe superfamily C - "Fringe-like; The drosophila protein fringe (FNG) is a glucosaminyltransferase that controls the response of the Notch receptor to specific ligands. FNG is localised to the Golgi apparatus (not secreted as previously thought). 
Modification of Notch occurs through glycosylation by FNG. The xenopus homologue, lunatic fringe, has been implicated in a variety of functions." Q#10128 - CGI_10016205 superfamily 243072 784 916 5.05E-13 68.179 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#10128 - CGI_10016205 superfamily 243072 470 613 8.46E-13 67.4086 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#10128 - CGI_10016205 superfamily 243072 557 717 1.15E-09 58.1638 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#10129 - CGI_10016206 superfamily 247724 141 341 3.40E-05 42.4436 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. 
The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#10130 - CGI_10025401 superfamily 215754 110 208 8.04E-22 87.694 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#10130 - CGI_10025401 superfamily 215754 4 100 3.17E-17 74.9824 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#10130 - CGI_10025401 superfamily 215754 207 278 9.93E-15 68.0488 cl02813 Mito_carr superfamily C - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#10131 - CGI_10025402 superfamily 246712 54 244 1.53E-100 293.661 cl14785 FMT_C_like superfamily - - "Carboxy-terminal domain of Formyltransferase and similar domains; This family represents the C-terminal domain of formyltransferase and similar proteins. This domain is found in a variety of enzymes with formyl transferase and alkyladenine DNA glycosylase activities. 
The proteins with formyltransferase function include methionyl-tRNA formyltransferase, ArnA, 10-formyltetrahydrofolate dehydrogenase and HypX proteins. Although most proteins with formyl transferase activity contain this C-terminal domain, prokaryotic glycinamide ribonucleotide transformylase (GART), a single domain protein, only contains the core catalytic domain. Thus, the C-terminal domain is not required for formyl transferase catalytic activity and may be involved in substrate binding. Some members of this family have shown nucleic acid binding capacity. The C-terminal domain of methionyl-tRNA formyltransferase is involved in tRNA binding. Alkyladenine DNA glycosylase is a distant member of this family with very low sequence similarity to other members. It catalyzes the first step in base excision repair (BER) by cleaving damaged DNA bases within double-stranded DNA to produce an abasic site and shows ability to bind to DNA." Q#10132 - CGI_10025403 superfamily 247724 63 295 1.98E-138 413.48 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. 
The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#10132 - CGI_10025403 superfamily 247063 289 361 7.33E-38 136.841 cl15768 TGS superfamily - - "The TGS domain, named after the ThrRS, GTPase, and SpoT/RelA proteins where it occurs, is structurally similar to ubiquitin. TGS is a small domain of about 50 amino acid residues with a predominantly beta-sheet structure. There is no direct information on the function of the TGS domain, but its presence in two types of regulatory proteins (the GTPases and guanosine polyphosphate phosphohydrolases/synthetases) suggests a ligand (most likely nucleotide)-binding, regulatory role." Q#10132 - CGI_10025403 superfamily 217667 419 703 2.14E-49 180.262 cl12282 NPR3 superfamily - - "Nitrogen Permease regulator of amino acid transport activity 3; This family, also known in yeasts as Rmd11, complexes with NPR2, pfam06218. This complex heterodimer is responsible for inactivating TORC1. an evolutionarily conserved protein complex that controls cell size via nutritional input signals, specifically, in response to amino acid starvation." Q#10133 - CGI_10025404 superfamily 242383 8 172 8.48E-78 232.064 cl01240 CtaG_Cox11 superfamily - - "Cytochrome c oxidase assembly protein CtaG/Cox11; Cytochrome c oxidase assembly protein is essential for the assembly of functional cytochrome oxidase protein. In eukaryotes it is an integral protein of the mitochondrial inner membrane. Cox11 is essential for the insertion of Cu(I) ions to form the CuB site. This is essential for the stability of other structures in subunit I, for example haems a and a3, and the magnesium/manganese centre. 
Cox11 is probably only required in sub-stoichiometric amounts relative to the structural units. The C terminal region of the protein is known to form a dimer. Each monomer coordinates one Cu(I) ion via three conserved cysteine residues (111, 208 and 210) in Saccharomyces cerevisiae. Met 224 is also thought to play a role in copper transfer or stabilising the copper site." Q#10139 - CGI_10025410 superfamily 241669 17 137 3.93E-23 94.264 cl00187 Fascin superfamily - - "Fascin-like domain; members include actin-bundling/crosslinking proteins fascin, histoactophilin and singed; identified in sea urchin, Drosophila, Xenopus, rodents, and humans; The fascin-like domain adopts a beta-trefoil topology and contains an internal threefold repeat; the fascin subgroup contains four copies of the domain; Structurally similar to fibroblast growth factor (FGF)" Q#10139 - CGI_10025410 superfamily 241669 387 499 5.25E-22 91.1824 cl00187 Fascin superfamily - - "Fascin-like domain; members include actin-bundling/crosslinking proteins fascin, histoactophilin and singed; identified in sea urchin, Drosophila, Xenopus, rodents, and humans; The fascin-like domain adopts a beta-trefoil topology and contains an internal threefold repeat; the fascin subgroup contains four copies of the domain; Structurally similar to fibroblast growth factor (FGF)" Q#10139 - CGI_10025410 superfamily 241669 262 380 8.76E-22 90.7972 cl00187 Fascin superfamily - - "Fascin-like domain; members include actin-bundling/crosslinking proteins fascin, histoactophilin and singed; identified in sea urchin, Drosophila, Xenopus, rodents, and humans; The fascin-like domain adopts a beta-trefoil topology and contains an internal threefold repeat; the fascin subgroup contains four copies of the domain; Structurally similar to fibroblast growth factor (FGF)" Q#10139 - CGI_10025410 superfamily 241669 142 258 2.40E-16 75.3892 cl00187 Fascin superfamily - - "Fascin-like domain; members include actin-bundling/crosslinking proteins
fascin, histoactophilin and singed; identified in sea urchin, Drosophila, Xenopus, rodents, and humans; The fascin-like domain adopts a beta-trefoil topology and contains an internal threefold repeat; the fascin subgroup contains four copies of the domain; Structurally similar to fibroblast growth factor (FGF)" Q#10140 - CGI_10025411 superfamily 222617 90 220 3.23E-35 126.616 cl16738 YHYH superfamily - - "YHYH protein; This domain family is found in bacteria, eukaryotes and viruses, and is typically between 141 and 198 amino acids in length. There is a conserved YHYH sequence motif." Q#10141 - CGI_10025412 superfamily 222617 46 176 3.91E-36 128.157 cl16738 YHYH superfamily - - "YHYH protein; This domain family is found in bacteria, eukaryotes and viruses, and is typically between 141 and 198 amino acids in length. There is a conserved YHYH sequence motif." Q#10142 - CGI_10025413 superfamily 243072 3 92 1.72E-20 86.6686 cl02529 ANK superfamily C - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#10142 - CGI_10025413 superfamily 221304 152 424 7.90E-87 268.9 cl13359 GPCR_chapero_1 superfamily - - "GPCR-chaperone; This domain, and the associated ANK family repeat pfam00023 domain, together act as a chaperone for biogenesis and folding of the DP receptor for prostaglandin D2." Q#10143 - CGI_10025414 superfamily 221304 50 184 1.94E-43 147.947 cl13359 GPCR_chapero_1 superfamily N - "GPCR-chaperone; This domain, and the associated ANK family repeat pfam00023 domain, together act as a chaperone for biogenesis and folding of the DP receptor for prostaglandin D2."
Q#10144 - CGI_10025415 superfamily 216152 1 177 1.06E-49 165.565 cl02988 Glyco_transf_10 superfamily N - "Glycosyltransferase family 10 (fucosyltransferase); This family of Fucosyltransferases are the enzymes transferring fucose from GDP-Fucose to GlcNAc in an alpha1,3 linkage. This family is known as glycosyltransferase family 10." Q#10145 - CGI_10025416 superfamily 247736 233 280 0.000179897 40.3369 cl17182 NAT_SF superfamily C - "N-Acyltransferase superfamily: Various enzymes that characteristically catalyze the transfer of an acyl group to a substrate; NAT (N-Acyltransferase) is a large superfamily of enzymes that mostly catalyze the transfer of an acyl group to a substrate and are implicated in a variety of functions, ranging from bacterial antibiotic resistance to circadian rhythms in mammals. Members include GCN5-related N-Acetyltransferases (GNAT) such as Aminoglycoside N-acetyltransferases, Histone N-acetyltransferase (HAT) enzymes, and Serotonin N-acetyltransferase, which catalyze the transfer of an acetyl group to a substrate. The kinetic mechanism of most GNATs involves the ordered formation of a ternary complex: the reaction begins with Acetyl Coenzyme A (AcCoA) binding, followed by binding of substrate, then direct transfer of the acetyl group from AcCoA to the substrate, followed by product and subsequent CoA release. Other family members include Arginine/ornithine N-succinyltransferase, Myristoyl-CoA: protein N-myristoyltransferase, and Acyl-homoserinelactone synthase which have a similar catalytic mechanism but differ in types of acyl groups transferred. Leucyl/phenylalanyl-tRNA-protein transferase and FemXAB nonribosomal peptidyltransferases which catalyze similar peptidyltransferase reactions are also included." Q#10145 - CGI_10025416 superfamily 241752 586 691 1.84E-07 50.7919 cl00283 ADP_ribosyl superfamily N - "ADP_ribosylating enzymes catalyze the transfer of ADP_ribose from NAD+ to substrates. 
Bacterial toxins are cytoplasmic and catalyze the transfer of a single ADP_ribose unit to eukaryotic elongation factor 2, halting protein synthesis and killing the cell. Poly(ADP-ribose) polymerases (PARPS 1-3, VPARP, tankyrase) catalyze the addition of up to 100 ADP_ribose units from NAD+. PARPs 1 and 2 are localized in the nucleus, bind DNA, and are activated by DNA damage. VPARP is part of the vault ribonucleoprotein complex. Tankyrases regulate telomere length in part through poly(ADP_ribosylation) of telomere repeat binding factor 1 (TRF1). Poly(ADP-ribose) polymerase catalyses the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active." Q#10146 - CGI_10025417 superfamily 247856 68 129 3.87E-13 61.0245 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#10146 - CGI_10025417 superfamily 247856 103 170 1.38E-11 56.7873 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#10149 - CGI_10025420 superfamily 243062 278 362 2.35E-12 62.7478 cl02510 TGF_beta superfamily - - Transforming growth factor beta like domain; Transforming growth factor beta like domain. Q#10149 - CGI_10025420 superfamily 216062 55 227 1.13E-11 62.4554 cl02928 TGFb_propeptide superfamily - - TGF-beta propeptide; This propeptide is known as latency associated peptide (LAP) in TGF-beta. LAP is a homodimer which is disulfide linked to TGF-beta binding protein. Q#10150 - CGI_10025421 superfamily 243066 4 108 2.13E-27 106.162 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#10150 - CGI_10025421 superfamily 198867 117 214 1.51E-23 95.486 cl06652 BACK superfamily - - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation)." 
Q#10150 - CGI_10025421 superfamily 243146 451 497 1.03E-11 60.753 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#10150 - CGI_10025421 superfamily 243146 353 398 2.44E-11 59.5974 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#10150 - CGI_10025421 superfamily 243146 412 462 3.22E-11 59.1091 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#10150 - CGI_10025421 superfamily 243146 318 364 1.06E-09 54.8719 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#10150 - CGI_10025421 superfamily 243146 264 317 1.65E-08 51.4051 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#10150 - CGI_10025421 superfamily 243146 511 564 2.99E-08 50.6347 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#10151 - CGI_10025422 superfamily 242181 2 332 4.74E-101 304.17 cl00900 Ldh_2 superfamily - - "Malate/L-lactate dehydrogenase; This family consists of bacterial and archaeal Malate/L-lactate dehydrogenase. L-lactate dehydrogenase, EC:1.1.1.27, catalyzes the reaction (S)-lactate + NAD(+) <=> pyruvate + NADH. Malate dehydrogenase, EC:1.1.1.37 and EC:1.1.1.82, catalyzes the reactions: (S)-malate + NAD(+) <=> oxaloacetate + NADH, and (S)-malate + NADP(+) <=> oxaloacetate + NADPH respectively." Q#10153 - CGI_10025424 superfamily 247723 160 244 7.78E-38 133.572 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. 
This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#10153 - CGI_10025424 superfamily 247723 9 85 7.12E-34 121.998 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#10154 - CGI_10025425 superfamily 110440 354 379 0.00953086 33.5353 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. 
The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#10155 - CGI_10025426 superfamily 241613 164 198 4.07E-06 44.8902 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#10155 - CGI_10025426 superfamily 243051 16 87 1.42E-05 44.6489 cl02479 MAM superfamily C - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. 
In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal cord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#10155 - CGI_10025426 superfamily 246925 222 391 4.86E-05 44.6538 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#10155 - CGI_10025426 superfamily 220695 600 735 0.00655565 37.9435 cl18571 7TM_GPCR_Srx superfamily C - Serpentine type 7TM GPCR chemoreceptor Srx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srx is part of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#10156 - CGI_10025427 superfamily 222090 51 239 1.52E-21 90.795 cl18636 Methyltransf_22 superfamily - - Methyltransferase domain; This family appears to be a methyltransferase domain. Q#10157 - CGI_10025428 superfamily 218405 981 1173 1.61E-45 164.977 cl18455 DUF676 superfamily - - Putative serine esterase (DUF676); This family of proteins are probably serine esterase type enzymes with an alpha/beta hydrolase fold. 
Q#10157 - CGI_10025428 superfamily 221557 179 242 6.50E-16 74.6206 cl13784 DUF3657 superfamily - - "Protein of unknown function (DUF3657); This domain family is found in eukaryotes, and is approximately 60 amino acids in length. The family is found in association with pfam05057." Q#10158 - CGI_10025429 superfamily 241563 16 47 0.00317877 33.6068 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#10160 - CGI_10025431 superfamily 247723 336 413 1.11E-51 169.809 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." 
Q#10160 - CGI_10025431 superfamily 247723 186 264 1.75E-49 164.01 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#10160 - CGI_10025431 superfamily 247723 85 176 1.95E-44 150.652 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. 
The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#10161 - CGI_10025432 superfamily 245595 176 436 5.54E-163 466.127 cl11393 Peptidase_M14_like superfamily - - "M14 family of metallocarboxypeptidases and related proteins; The M14 family of metallocarboxypeptidases (MCPs), also known as funnelins, are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Two major subfamilies of the M14 family, defined based on sequence and structural homology, are the A/B and N/E subfamilies. Enzymes belonging to the A/B subfamily are normally synthesized as inactive precursors containing preceding signal peptide, followed by an N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases. The A/B enzymes can be further divided based on their substrate specificity; Carboxypeptidase A-like (CPA-like) enzymes favor hydrophobic residues while carboxypeptidase B-like (CPB-like) enzymes only cleave the basic residues lysine or arginine. The A forms have slightly different specificities, with Carboxypeptidase A1 (CPA1) preferring aliphatic and small aromatic residues, and CPA2 preferring the bulky aromatic side chains. Enzymes belonging to the N/E subfamily enzymes are not produced as inactive precursors and instead rely on their substrate specificity and subcellular compartmentalization to prevent inappropriate cleavage. They contain an extra C-terminal transthyretin-like domain, thought to be involved in folding or formation of oligomers. 
MCPs can also be classified based on their involvement in specific physiological processes; the pancreatic MCPs participate only in alimentary digestion and include carboxypeptidase A and B (A/B subfamily), while others, namely regulatory MCPs or the N/E subfamily, are involved in more selective reactions, mainly in non-digestive tissues and fluids, acting on blood coagulation/fibrinolysis, inflammation and local anaphylaxis, pro-hormone and neuropeptide processing, cellular response and others. Another MCP subfamily, is that of succinylglutamate desuccinylase /aspartoacylase, which hydrolyzes N-acetyl-L-aspartate (NAA), and deficiency in which is the established cause of Canavan disease. Another subfamily (referred to as subfamily C) includes an exceptional type of activity in the MCP family, that of dipeptidyl-peptidase activity of gamma-glutamyl-(L)-meso-diaminopimelate peptidase I which is involved in bacterial cell wall metabolism." Q#10162 - CGI_10025433 superfamily 199226 17 43 0.00184311 35.8732 cl11662 LisH superfamily - - "LisH; The LisH (lis homology) domain mediates protein dimerisation and tetramerisation. The LisH domain is found in Sif2, a component of the Set3 complex which is responsible for repressing meiotic genes. It has been shown that the LisH domain helps mediate interaction with components of the Set3 complex." Q#10163 - CGI_10025434 superfamily 202841 95 162 4.00E-20 80.7276 cl08411 Autophagy_act_C superfamily - - "Autophagocytosis associated protein, active-site domain; Autophagocytosis is a starvation-induced process responsible for transport of cytoplasmic proteins to the vacuole. The cysteine residue within the HPC motif is the putative active-site residue for recognition of the Apg5 subunit of the autophagosome complex." Q#10164 - CGI_10025435 superfamily 241625 22 105 9.32E-09 49.1261 cl00123 PROF superfamily C - "Profilin binds actin monomers, membrane polyphosphoinositides such as PI(4,5)P2, and poly-L-proline. 
Profilin can inhibit actin polymerization into F-actin by binding to monomeric actin (G-actin) and terminal F-actin subunits, but - as a regulator of the cytoskeleton - it may also promote actin polymerization. It plays a role in the assembly of branched actin filament networks, by activating WASP via binding to WASP's proline rich domain. Profilin may link the cytoskeleton with major signalling pathways by interacting with components of the phosphatidylinositol cycle and Ras pathway." Q#10165 - CGI_10025436 superfamily 219525 621 666 1.65E-05 43.1766 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#10165 - CGI_10025436 superfamily 219525 497 553 0.00674119 35.4726 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#10167 - CGI_10025438 superfamily 219525 422 470 4.80E-07 47.0285 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#10167 - CGI_10025438 superfamily 219525 300 356 6.58E-07 46.6433 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#10168 - CGI_10025439 superfamily 243061 2 83 8.86E-30 103.191 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#10170 - CGI_10025441 superfamily 217293 635 839 2.13E-60 205.945 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#10170 - CGI_10025441 superfamily 217293 225 427 3.21E-48 171.662 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. 
Q#10170 - CGI_10025441 superfamily 202474 434 626 1.09E-19 89.2501 cl08379 Neur_chan_memb superfamily - - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#10170 - CGI_10025441 superfamily 202474 846 935 1.69E-14 73.4569 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#10171 - CGI_10025442 superfamily 219824 1 33 2.71E-05 43.4985 cl07139 AA_permease_N superfamily C - Amino acid permease N-terminal; This domain is found to the N-terminus of the amino acid permease domain (pfam00324) in metazoan Na-K-Cl cotransporters. Q#10172 - CGI_10025443 superfamily 243092 568 785 1.06E-27 113.969 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#10172 - CGI_10025443 superfamily 243092 4 223 3.23E-23 100.487 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#10172 - CGI_10025443 superfamily 243092 542 585 0.00550101 35.7908 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#10173 - CGI_10025444 superfamily 241832 13 147 3.37E-50 165.151 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). 
Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#10173 - CGI_10025444 superfamily 241832 196 305 1.30E-19 82.0212 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#10173 - CGI_10025444 superfamily 241832 158 187 2.16E-06 44.9689 cl00388 Thioredoxin_like superfamily C - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. 
The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#10174 - CGI_10025445 superfamily 241787 1 66 5.55E-25 89.5311 cl00326 Ribosomal_L23 superfamily - - Ribosomal protein L23; Ribosomal protein L23. Q#10175 - CGI_10025446 superfamily 243992 5 59 1.03E-10 51.8058 cl05087 Complex1_LYR_1 superfamily - - "Complex1_LYR-like; This is a family of proteins carrying the LYR motif of family Complex1_LYR, pfam05347, likely to be involved in Fe-S cluster biogenesis in mitochondria." Q#10176 - CGI_10025447 superfamily 247755 1031 1251 1.67E-117 367.974 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#10176 - CGI_10025447 superfamily 247755 378 580 4.10E-106 335.593 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. 
The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#10176 - CGI_10025447 superfamily 216049 62 334 2.44E-34 134.721 cl18356 ABC_membrane superfamily - - ABC transporter transmembrane region; This family represents a unit of six transmembrane helices. Many members of the ABC transporter family (pfam00005) have two such regions. Q#10176 - CGI_10025447 superfamily 216049 722 988 8.58E-34 132.795 cl18356 ABC_membrane superfamily - - ABC transporter transmembrane region; This family represents a unit of six transmembrane helices. Many members of the ABC transporter family (pfam00005) have two such regions. Q#10177 - CGI_10025448 superfamily 246681 5 131 2.56E-66 204.596 cl14643 SRPBCC superfamily C - "START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC (SRPBCC) ligand-binding domain superfamily; SRPBCC domains have a deep hydrophobic ligand-binding pocket; they bind diverse ligands. Included in this superfamily are the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of mammalian STARD1-STARD15, and the C-terminal catalytic domains of the alpha oxygenase subunit of Rieske-type non-heme iron aromatic ring-hydroxylating oxygenases (RHOs_alpha_C), as well as the SRPBCC domains of phosphatidylinositol transfer proteins (PITPs), Bet v 1 (the major pollen allergen of white birch, Betula verrucosa), CoxG, CalC, and related proteins. Other members of this superfamily include PYR/PYL/RCAR plant proteins, the aromatase/cyclase (ARO/CYC) domains of proteins such as Streptomyces glaucescens tetracenomycin, and the SRPBCC domains of Streptococcus mutans Smu.440 and related proteins." 
Q#10178 - CGI_10025449 superfamily 248054 57 81 1.79E-05 42.8444 cl17500 NAD_binding_8 superfamily C - NAD(P)-binding Rossmann-like domain; NAD(P)-binding Rossmann-like domain. Q#10182 - CGI_10000919 superfamily 248012 1 57 5.11E-18 73.4617 cl17458 TIR_2 superfamily C - TIR domain; This is a family of bacterial Toll-like receptors. Q#10183 - CGI_10003490 superfamily 247684 9 182 2.94E-20 86.4887 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#10184 - CGI_10003491 superfamily 247684 9 182 2.86E-20 86.4887 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#10185 - CGI_10003492 superfamily 247684 9 182 1.36E-20 87.2591 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#10186 - CGI_10011675 superfamily 245847 133 268 6.18E-05 42.4896 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." 
Q#10186 - CGI_10011675 superfamily 245847 560 620 0.00170388 38.2524 cl12042 FA58C superfamily C - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#10187 - CGI_10011676 superfamily 241574 579 724 6.70E-52 183.171 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#10187 - CGI_10011676 superfamily 241574 784 1006 1.56E-14 73.7741 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#10187 - CGI_10011676 superfamily 245847 33 161 2.03E-05 44.4156 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." 
Q#10188 - CGI_10011677 superfamily 221337 705 739 2.78E-05 43.0743 cl13401 DUF3471 superfamily C - "Domain of unknown function (DUF3471); This presumed domain is functionally uncharacterized. This domain is found in bacteria, archaea and eukaryotes. This domain is typically between 98 and 114 amino acids in length. This domain is found associated with pfam00144." Q#10188 - CGI_10011677 superfamily 221198 148 182 0.00718531 36.1193 cl13226 KfrA_N superfamily N - "Plasmid replication region DNA-binding N-term; The broad host-range plasmid RK2 is able to replicate in and be inherited in a stable manner in diverse Gram-negative bacterial species. It encodes a number of co-ordinately regulated operons including a central control korF1 operon that represses the kfrA operon. The KfrA polypeptide is a site-specific DNA-binding protein whose operator overlaps the kfrA promoter. The N-terminus, containing a helix-turn-helix motif, is essential for function. Downstream from this family is an extended coiled-coil domain containing a heptad repeat segment which is probably responsible for formation of multimers, and may provide an example of a bridge to host structures required for plasmid partitioning." Q#10189 - CGI_10011678 superfamily 247912 32 232 2.77E-30 115.676 cl17358 Beta-lactamase superfamily C - Beta-lactamase; This family appears to be distantly related to pfam00905 and PF00768 D-alanyl-D-alanine carboxypeptidase. Q#10190 - CGI_10011679 superfamily 243096 61 150 2.76E-20 90.0496 cl02571 RhoGEF superfamily N - Guanine nucleotide exchange factor for Rho/Rac/Cdc42-like GTPases; Also called Dbl-homologous (DH) domain. It appears that PH domains invariably occur C-terminal to RhoGEF/DH domains. 
Q#10192 - CGI_10011681 superfamily 241874 12 424 0 577.705 cl00456 SLC5-6-like_sbd superfamily N - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#10193 - CGI_10011682 superfamily 241874 10 177 1.00E-73 233.722 cl00456 SLC5-6-like_sbd superfamily C - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. 
NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#10194 - CGI_10011683 superfamily 241874 1 370 0 536.874 cl00456 SLC5-6-like_sbd superfamily N - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." 
Q#10195 - CGI_10011684 superfamily 241874 26 581 0 651.664 cl00456 SLC5-6-like_sbd superfamily - - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#10197 - CGI_10011686 superfamily 245874 68 117 3.37E-07 46.6506 cl12111 TNFR superfamily NC - "Tumor necrosis factor receptor (TNFR) domain; superfamily of TNF-like receptor domains. When bound to TNF-like cytokines, TNFRs trigger multiple signal transduction pathways, they are involved in inflammation response, apoptosis, autoimmunity and organogenesis. TNFRs domains are elongated with generally three tandem repeats of cysteine-rich domains (CRDs). They fit in the grooves between protomers within the ligand trimer. Some TNFRs, such as NGFR and HveA, bind ligands with no structural similarity to TNF and do not bind ligand trimers." 
Q#10200 - CGI_10010073 superfamily 244560 1697 1820 1.62E-26 108.924 cl06954 BP28CT superfamily N - BP28CT (NUC211) domain; This C terminal domain is found in BAP28-like nucleolar proteins. Q#10200 - CGI_10010073 superfamily 204903 165 282 6.60E-21 91.4627 cl13787 U3snoRNP10 superfamily - - "U3 small nucleolar RNA-associated protein 10; This domain family is found in eukaryotes, and is approximately 120 amino acids in length. The family is found in association with pfam08146. This family is the protein associated with U3 snoRNA which is involved in the processing of pre-rRNA." Q#10201 - CGI_10010074 superfamily 241622 269 338 9.53E-09 51.7987 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either an N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95 (post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#10201 - CGI_10010074 superfamily 243038 24 93 5.56E-18 77.2425 cl02442 DEP superfamily - - "DEP domain, named after Dishevelled, Egl-10, and Pleckstrin, where this domain was first discovered. 
The function of this domain is still not clear, but it is believed to be important for the membrane association of the signaling proteins in which it is present. New studies show that the DEP domain of Sst2, a yeast RGS protein is necessary and sufficient for receptor interaction." Q#10201 - CGI_10010074 superfamily 243038 115 194 1.11E-16 74.0051 cl02442 DEP superfamily - - "DEP domain, named after Dishevelled, Egl-10, and Pleckstrin, where this domain was first discovered. The function of this domain is still not clear, but it is believed to be important for the membrane association of the signaling proteins in which it is present. New studies show that the DEP domain of Sst2, a yeast RGS protein is necessary and sufficient for receptor interaction." Q#10202 - CGI_10010075 superfamily 247805 333 546 1.44E-78 253.176 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 
Q#10202 - CGI_10010075 superfamily 247905 556 685 5.77E-41 146.999 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#10202 - CGI_10010075 superfamily 199156 261 275 7.34E-05 41.2857 cl15298 zf-CCHC superfamily - - "Zinc knuckle; The zinc knuckle is a zinc binding motif composed of the following CX2CX4HX4C where X can be any amino acid. The motifs are mostly from retroviral gag proteins (nucleocapsid). Prototype structure is from HIV. Also contains members involved in eukaryotic gene regulation, such as C. elegans GLH-1. Structure is an 18-residue zinc finger." Q#10204 - CGI_10010077 superfamily 218284 43 99 3.65E-15 69.5907 cl04786 SOUL superfamily C - SOUL heme-binding protein; This family represents a group of putative heme-binding proteins. Our family includes archaeal and bacterial homologues. Q#10205 - CGI_10010078 superfamily 245814 265 334 0.000125138 40.1144 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#10206 - CGI_10010079 superfamily 245814 198 276 9.99E-06 43.196 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#10207 - CGI_10010080 superfamily 241584 464 557 0.000244222 39.7871 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#10207 - CGI_10010080 superfamily 245814 283 361 6.39E-06 44.3516 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. 
The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#10211 - CGI_10002309 superfamily 248097 124 199 4.29E-15 68.4458 cl17543 C1q superfamily C - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#10211 - CGI_10002309 superfamily 248097 2 69 0.000179907 38.7854 cl17543 C1q superfamily N - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#10212 - CGI_10002310 superfamily 245612 16 494 1.91E-176 507.733 cl11426 Amidase superfamily - - Amidase; Amidase. Q#10213 - CGI_10002311 superfamily 247741 12 325 4.31E-172 492.904 cl17187 Aldolase_Class_I superfamily - - "Class I aldolases; Class I aldolases. The class I aldolases use an active-site lysine which stabilizes a reaction intermediate via Schiff base formation, and have TIM beta/alpha barrel fold. The members of this family include 2-keto-3-deoxy-6-phosphogluconate (KDPG) and 2-keto-4-hydroxyglutarate (KHG) aldolases, transaldolase, dihydrodipicolinate synthase sub-family, Type I 3-dehydroquinate dehydratase, DeoC and DhnA proteins, and metal-independent fructose-1,6-bisphosphate aldolase. Although structurally similar, the class II aldolases use a different mechanism and are believed to have an independent evolutionary origin." Q#10213 - CGI_10002311 superfamily 247741 328 554 1.98E-106 324.571 cl17187 Aldolase_Class_I superfamily N - "Class I aldolases; Class I aldolases. 
The class I aldolases use an active-site lysine which stabilizes a reaction intermediate via Schiff base formation, and have TIM beta/alpha barrel fold. The members of this family include 2-keto-3-deoxy-6-phosphogluconate (KDPG) and 2-keto-4-hydroxyglutarate (KHG) aldolases, transaldolase, dihydrodipicolinate synthase sub-family, Type I 3-dehydroquinate dehydratase, DeoC and DhnA proteins, and metal-independent fructose-1,6-bisphosphate aldolase. Although structurally similar, the class II aldolases use a different mechanism and are believed to have an independent evolutionary origin." Q#10214 - CGI_10004505 superfamily 241578 5 93 0.0016174 38.0414 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#10219 - CGI_10004510 superfamily 217062 41 307 2.71E-43 151.268 cl12266 Branch superfamily - - "Core-2/I-Branching enzyme; This is a family of two different beta-1,6-N-acetylglucosaminyltransferase enzymes, I-branching enzyme and core-2 branching enzyme. 
I-branching enzyme is responsible for the production of the blood group I-antigen during embryonic development. Core-2 branching enzyme forms crucial side-chain branches in O-glycans." Q#10220 - CGI_10011704 superfamily 114037 98 186 0.00124914 37.6219 cl05042 BLYB superfamily - - "Borrelia hemolysin accessory protein; This family consists of several borrelia hemolysin accessory proteins (BLYB). BLYB was thought to be an accessory protein, which was proposed to comprise a hemolysis system but it is now thought that BlyA and BlyB function instead as a prophage-encoded holin or holin-like system." Q#10220 - CGI_10011704 superfamily 241563 8 52 0.00472852 35.5328 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#10220 - CGI_10011704 superfamily 241563 62 99 0.00858537 34.7624 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#10223 - CGI_10011707 superfamily 242876 1 148 1.58E-53 168.358 cl02092 Clat_adaptor_s superfamily - - Clathrin adaptor complex small chain; Clathrin adaptor complex small chain. Q#10224 - CGI_10011708 superfamily 247724 34 347 0 554.059 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. 
This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#10226 - CGI_10011710 superfamily 243092 32 319 9.51E-67 213.351 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate 
interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#10229 - CGI_10011713 superfamily 207662 61 153 1.72E-59 196.216 cl02596 NR_DBD_like superfamily - - "DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers; DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. It interacts with a specific DNA site upstream of the target gene and modulates the rate of transcriptional initiation. Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions, from development, reproduction, to homeostasis and metabolism in animals (metazoans). The family contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). Most nuclear receptors bind as homodimers or heterodimers to their target sites, which consist of two hexameric half-sites. Specificity is determined by the half-site sequence, the relative orientation of the half-sites and the number of spacer nucleotides between the half-sites. However, a growing number of nuclear receptors have been reported to bind to DNA as monomers." Q#10229 - CGI_10011713 superfamily 207662 257 349 1.72E-59 196.216 cl02596 NR_DBD_like superfamily - - "DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers; DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. It interacts with a specific DNA site upstream of the target gene and modulates the rate of transcriptional initiation. 
Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions, from development, reproduction, to homeostasis and metabolism in animals (metazoans). The family contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). Most nuclear receptors bind as homodimers or heterodimers to their target sites, which consist of two hexameric half-sites. Specificity is determined by the half-site sequence, the relative orientation of the half-sites and the number of spacer nucleotides between the half-sites. However, a growing number of nuclear receptors have been reported to bind to DNA as monomers." Q#10229 - CGI_10011713 superfamily 245599 443 677 5.21E-78 251.131 cl11397 NR_LBD superfamily - - "The ligand binding domain of nuclear receptors, a family of ligand-activated transcription regulators; Ligand-binding domain (LBD) of nuclear receptor (NR): Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions in metazoans, from development, reproduction, to homeostasis and metabolism. The superfamily contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. The members of the family include receptors of steroids, thyroid hormone, retinoids, cholesterol by-products, lipids and heme. With few exceptions, NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD)." 
Q#10230 - CGI_10011714 superfamily 220647 47 165 3.76E-07 47.3224 cl18565 L_HGMIC_fpl superfamily C - "Lipoma HMGIC fusion partner-like protein; This is a group of proteins expressed from a series of genes referred to as Lipoma HGMIC fusion partner-like. The proteins carry four highly conserved transmembrane domains in this entry. In certain instances, eg in LHFPL5, mutations cause deafness in humans and hypospadias, and LHFPL1 is transcribed in six liver tumour cell lines." Q#10231 - CGI_10011715 superfamily 243092 32 191 0.000359158 40.0108 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#10232 - CGI_10011716 superfamily 241563 61 99 0.000961345 35.918 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#10234 - CGI_10011718 superfamily 241609 32 106 8.07E-23 92.4411 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#10234 - CGI_10011718 superfamily 241609 191 274 1.16E-19 83.5815 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." 
Q#10234 - CGI_10011718 superfamily 241613 410 440 2.69E-05 41.8086 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#10234 - CGI_10011718 superfamily 241609 112 187 5.46E-16 73.109 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#10234 - CGI_10011718 superfamily 243051 280 418 0.00659013 36.2018 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). 
In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#10236 - CGI_10011720 superfamily 243128 49 222 3.17E-10 56.9851 cl02652 MIF4G superfamily - - "MIF4G domain; MIF4G is named after Middle domain of eukaryotic initiation factor 4G (eIF4G). Also occurs in NMD2p and CBP80. The domain is rich in alpha-helices and may contain multiple alpha-helical repeats. In eIF4G, this domain binds eIF4A, eIF3, RNA and DNA." Q#10237 - CGI_10011721 superfamily 222150 218 243 7.68E-06 44.6901 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#10237 - CGI_10011721 superfamily 222150 593 617 3.50E-05 42.7641 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#10237 - CGI_10011721 superfamily 222150 857 881 7.33E-05 41.9937 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#10237 - CGI_10011721 superfamily 222150 1232 1257 0.000116076 41.2233 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#10237 - CGI_10011721 superfamily 222150 1205 1228 0.000264768 40.4529 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#10239 - CGI_10011723 superfamily 244594 508 608 6.58E-52 179.014 cl07053 Sin3_corepress superfamily - - Sin3 family co-repressor; This domain is found on transcriptional regulators. It forms interactions with histone deacetylases. Q#10239 - CGI_10011723 superfamily 202341 88 134 1.51E-16 75.9738 cl07842 PAH superfamily - - "Paired amphipathic helix repeat; This family contains the paired amphipathic helix repeat. 
The family contains the yeast SIN3 gene (also known as SDI1) that is a negative regulator of the yeast HO gene. This repeat may be distantly related to the helix-loop-helix motif, which mediate protein-protein interactions." Q#10239 - CGI_10011723 superfamily 202341 429 475 1.71E-10 58.6398 cl07842 PAH superfamily - - "Paired amphipathic helix repeat; This family contains the paired amphipathic helix repeat. The family contains the yeast SIN3 gene (also known as SDI1) that is a negative regulator of the yeast HO gene. This repeat may be distantly related to the helix-loop-helix motif, which mediate protein-protein interactions." Q#10239 - CGI_10011723 superfamily 202341 260 320 2.30E-10 58.2546 cl07842 PAH superfamily - - "Paired amphipathic helix repeat; This family contains the paired amphipathic helix repeat. The family contains the yeast SIN3 gene (also known as SDI1) that is a negative regulator of the yeast HO gene. This repeat may be distantly related to the helix-loop-helix motif, which mediate protein-protein interactions." Q#10240 - CGI_10011724 superfamily 147390 127 179 7.81E-16 69.2744 cl04970 XPA_C superfamily - - XPA protein C-terminus; XPA protein C-terminus. Q#10240 - CGI_10011724 superfamily 189926 94 125 4.92E-09 50.8019 cl03148 XPA_N superfamily - - XPA protein N-terminal; XPA protein N-terminal. Q#10241 - CGI_10001915 superfamily 247743 222 359 7.51E-23 95.6759 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. 
The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#10241 - CGI_10001915 superfamily 216502 422 595 8.27E-66 216.689 cl03209 Peptidase_M41 superfamily - - Peptidase family M41; Peptidase family M41. Q#10242 - CGI_10001918 superfamily 246940 64 243 3.32E-05 42.7058 cl15377 Radical_SAM superfamily - - "Radical SAM superfamily. Enzymes of this family generate radicals by combining a 4Fe-4S cluster and S-adenosylmethionine (SAM) in close proximity. They are characterized by a conserved CxxxCxxC motif, which coordinates the conserved iron-sulfur cluster. Mechanistically, they share the transfer of a single electron from the iron-sulfur cluster to SAM, which leads to its reductive cleavage to methionine and a 5'-deoxyadenosyl radical, which, in turn, abstracts a hydrogen from the appropriately positioned carbon atom. Depending on the enzyme, SAM is consumed during this process or it is restored and reused. Radical SAM enzymes catalyze steps in metabolism, DNA repair, the biosynthesis of vitamins and coenzymes, and the biosynthesis of many antibiotics. Examples are biotin synthase (BioB), lipoyl synthase (LipA), pyruvate formate-lyase (PFL), coproporphyrinogen oxidase (HemN), lysine 2,3-aminomutase (LAM), anaerobic ribonucleotide reductase (ARR), and MoaA, an enzyme of the biosynthesis of molybdopterin." Q#10243 - CGI_10001919 superfamily 247749 45 364 5.26E-164 472.842 cl17195 LDH_MDH_like superfamily - - "NAD-dependent, lactate dehydrogenase-like, 2-hydroxycarboxylate dehydrogenase family; Members of this family include ubiquitous enzymes like L-lactate dehydrogenases (LDH), L-2-hydroxyisocaproate dehydrogenases, and some malate dehydrogenases (MDH). LDH catalyzes the last step of glycolysis in which pyruvate is converted to L-lactate. 
MDH is one of the key enzymes in the citric acid cycle, facilitating both the conversion of malate to oxaloacetate and replenishing levels of oxalacetate by reductive carboxylation of pyruvate. The LDH/MDH-like proteins are part of the NAD(P)-binding Rossmann fold superfamily, which includes a wide variety of protein families including the NAD(P)-binding domains of alcohol dehydrogenases, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate dehydrogenases, formate/glycerate dehydrogenases, siroheme synthases, 6-phosphogluconate dehydrogenases, aminoacid dehydrogenases, repressor rex, and NAD-binding potassium channel domains, among others." Q#10243 - CGI_10001919 superfamily 247827 394 561 5.49E-33 124 cl17273 Thioredoxin_4 superfamily - - Thioredoxin; Thioredoxin. Q#10244 - CGI_10001922 superfamily 241852 3 383 1.78E-119 356.368 cl00416 CS_ACL-C_CCL superfamily - - "Citrate synthase (CS), citryl-CoA lyase (CCL), the C-terminal portion of the single-subunit type ATP-citrate lyase (ACL) and the C-terminal portion of the large subunit of the two-subunit type ACL. CS catalyzes the condensation of acetyl coenzyme A (AcCoA) and oxalacetate (OAA) from citrate and coenzyme A (CoA), the first step in the oxidative citric acid cycle (TCA or Krebs cycle). Peroxisomal CS is involved in the glyoxylate cycle. Some CS proteins function as a 2-methylcitrate synthase (2MCS). 2MCS catalyzes the condensation of propionyl-CoA (PrCoA) and OAA to form 2-methylcitrate and CoA during propionate metabolism. CCL cleaves citryl-CoA (CiCoA) to AcCoA and OAA. ACLs catalyze an ATP- and a CoA- dependant cleavage of citrate to form AcCoA and OAA; they do this in a multistep reaction, the final step of which is likely to involve the cleavage of CiCoA to generate AcCoA and OAA. 
The overall CS reaction is thought to proceed through three partial reactions and involves both closed and open conformational forms of the enzyme: a) the carbanion or equivalent is generated from AcCoA by base abstraction of a proton, b) the nucleophilic attack of this carbanion on OAA to generate CiCoA, and c) the hydrolysis of CiCoA to produce citrate and CoA. This group contains proteins which functions exclusively as either a CS or a 2MCS, as well as those with relaxed specificity which have dual functions as both a CS and a 2MCS. There are two types of CSs: type I CS and type II CSs. Type I CSs are found in eukarya, gram-positive bacteria, archaea, and in some gram-negative bacteria and are homodimers with both subunits participating in the active site. Type II CSs are unique to gram-negative bacteria and are homohexamers of identical subunits (approximated as a trimer of dimers). Some type II CSs are strongly and specifically inhibited by NADH through an allosteric mechanism. In fungi, yeast, plants, and animals ACL is cytosolic and generates AcCoA for lipogenesis. In several groups of autotrophic prokaryotes and archaea, ACL carries out the citrate-cleavage reaction of the reductive tricarboxylic acid (rTCA) cycle. In the family Aquificaceae this latter reaction in the rTCA cycle is carried out via a two enzyme system the second enzyme of which is CCL." Q#10247 - CGI_10014855 superfamily 241573 26 288 5.47E-38 142.85 cl00051 CysPc superfamily - - "Calpains, domains IIa, IIb; calcium-dependent cytoplasmic cysteine proteinases, papain-like. Functions in cytoskeletal remodeling processes, cell differentiation, apoptosis and signal transduction." Q#10247 - CGI_10014855 superfamily 241653 307 429 3.15E-11 61.5676 cl00165 Calpain_III superfamily - - "Calpain, subdomain III. Calpains are calcium-activated cytoplasmic cysteine proteinases, participate in cytoskeletal remodeling processes, cell differentiation, apoptosis and signal transduction. 
Catalytic domain and the two calmodulin-like domains are separated by C2-like domain III. Domain III plays an important role in calcium-induced activation of calpain involving electrostatic interactions with subdomain II. Proposed to mediate calpain's interaction with phospholipids and translocation to cytoplasmic/nuclear membranes. CD includes subdomain III of typical and atypical calpains." Q#10251 - CGI_10014859 superfamily 242406 4 104 5.45E-16 70.6981 cl01271 DUF1768 superfamily N - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. Q#10253 - CGI_10014861 superfamily 241644 5 144 1.45E-58 181.632 cl00154 UBCc superfamily - - "Ubiquitin-conjugating enzyme E2, catalytic (UBCc) domain. This is part of the ubiquitin-mediated protein degradation pathway in which a thiol-ester linkage forms between a conserved cysteine and the C-terminus of ubiquitin and complexes with ubiquitin protein ligase enzymes, E3. This pathway regulates many fundamental cellular processes. There are also other E2s which form thiol-ester linkages without the use of E3s as well as several UBC homologs (TSG101, Mms2, Croc-1 and similar proteins) which lack the active site cysteine essential for ubiquitination and appear to function in DNA repair pathways which were omitted from the scope of this CD." Q#10254 - CGI_10014862 superfamily 245201 177 449 1.04E-114 345.29 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. 
These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#10254 - CGI_10014862 superfamily 219525 26 70 2.01E-08 51.2657 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#10256 - CGI_10014864 superfamily 243034 130 221 1.59E-15 73.9535 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#10256 - CGI_10014864 superfamily 243034 688 789 1.28E-06 47.76 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, 
yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#10260 - CGI_10009872 superfamily 248097 39 162 8.23E-27 99.647 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#10261 - CGI_10009873 superfamily 218200 2 278 1.55E-140 407.138 cl04660 Glyco_transf_54 superfamily - - "N-Acetylglucosaminyltransferase-IV (GnT-IV) conserved region; The complex-type of oligosaccharides are synthesised through elongation by glycosyltransferases after trimming of the precursor oligosaccharides transferred to proteins in the endoplasmic reticulum. N-Acetylglucosaminyltransferases (GnTs) take part in the formation of branches in the biosynthesis of complex-type sugar chains. In vertebrates, six GnTs, designated as GnT-I to -VI, which catalyze the transfer of GlcNAc to the core mannose residues of Asn-linked sugar chains, have been identified. 
GnT-IV (EC:2.4.1.145) catalyzes the transfer of GlcNAc from UDP-GlcNAc to the GlcNAc1-2Man1-3 arm of core oligosaccharide [Gn2(22)core oligosaccharide] and forms GlcNAc1-4(GlcNAc1-2)Man1-3 structure on the core oligosaccharide (Gn3(2,4,2)core oligosaccharide). In some members the conserved region occupies all but the very for N-terminal, where there is a signal sequence on all members. For other members the conserved region does not occupy the entire protein but is still to the N-terminus of the protein." Q#10262 - CGI_10009874 superfamily 247905 251 361 3.01E-24 100.39 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#10262 - CGI_10009874 superfamily 247805 35 171 8.17E-14 70.444 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#10263 - CGI_10009875 superfamily 242274 41 276 1.18E-68 219.68 cl01053 SGNH_hydrolase superfamily - - "SGNH_hydrolase, or GDSL_hydrolase, is a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the typical Ser-His-Asp(Glu) triad from other serine hydrolases, but may lack the carboxlic acid." 
Q#10264 - CGI_10009876 superfamily 242225 3 142 2.87E-78 230.853 cl00969 Ribosomal_S19e superfamily - - Ribosomal protein S19e; Ribosomal protein S19e. Q#10265 - CGI_10026801 superfamily 243050 94 149 5.01E-18 78.1603 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#10265 - CGI_10026801 superfamily 243050 33 86 1.65E-11 59.7915 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. 
The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#10266 - CGI_10026802 superfamily 241571 37 142 5.07E-18 80.149 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#10266 - CGI_10026802 superfamily 243146 295 346 0.000314534 38.8095 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#10266 - CGI_10026802 superfamily 243146 248 294 0.00142233 36.8835 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#10266 - CGI_10026802 superfamily 243146 335 367 0.00734405 34.8951 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#10272 - CGI_10026808 superfamily 245213 569 602 0.00042364 39.157 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#10272 - CGI_10026808 superfamily 205157 699 734 1.88E-06 45.9915 cl18264 EGF_3 superfamily - - EGF domain; This family includes a variety of EGF-like domain homologues. This family includes the C-terminal domain of the malaria parasite MSP1 protein. 
Q#10272 - CGI_10026808 superfamily 241578 144 184 0.00119369 40.0608 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#10272 - CGI_10026808 superfamily 241578 356 391 0.00842648 37.3644 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. 
Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#10273 - CGI_10026809 superfamily 245815 1 467 0 816.249 cl11961 ALDH-SF superfamily - - "NAD(P)+-dependent aldehyde dehydrogenase superfamily; The aldehyde dehydrogenase superfamily (ALDH-SF) of NAD(P)+-dependent enzymes, in general, oxidize a wide range of endogenous and exogenous aliphatic and aromatic aldehydes to their corresponding carboxylic acids and play an important role in detoxification. Besides aldehyde detoxification, many ALDH isozymes possess multiple additional catalytic and non-catalytic functions such as participating in metabolic pathways, or as binding proteins, or osmoregulants, to mention a few. The enzyme has three domains, a NAD(P)+ cofactor-binding domain, a catalytic domain, and a bridging domain; and the active enzyme is generally either homodimeric or homotetrameric. The catalytic mechanism is proposed to involve cofactor binding, resulting in a conformational change and activation of an invariant catalytic cysteine nucleophile. The cysteine and aldehyde substrate form an oxyanion thiohemiacetal intermediate resulting in hydride transfer to the cofactor and formation of a thioacylenzyme intermediate. Hydrolysis of the thioacylenzyme and release of the carboxylic acid product occurs, and in most cases, the reduced cofactor dissociates from the enzyme. The evolutionary phylogenetic tree of ALDHs appears to have an initial bifurcation between what has been characterized as the classical aldehyde dehydrogenases, the ALDH family (ALDH) and extended family members or aldehyde dehydrogenase-like (ALDH-L) proteins. 
The ALDH proteins are represented by enzymes which share a number of highly conserved residues necessary for catalysis and cofactor binding and they include such proteins as retinal dehydrogenase, 10-formyltetrahydrofolate dehydrogenase, non-phosphorylating glyceraldehyde 3-phosphate dehydrogenase, delta(1)-pyrroline-5-carboxylate dehydrogenases, alpha-ketoglutaric semialdehyde dehydrogenase, alpha-aminoadipic semialdehyde dehydrogenase, coniferyl aldehyde dehydrogenase and succinate-semialdehyde dehydrogenase. Included in this larger group are all human, Arabidopsis, Tortula, fungal, protozoan, and Drosophila ALDHs identified in families ALDH1 through ALDH22 with the exception of families ALDH18, ALDH19, and ALDH20 which are present in the ALDH-like group. The ALDH-like group is represented by such proteins as gamma-glutamyl phosphate reductase, LuxC-like acyl-CoA reductase, and coenzyme A acylating aldehyde dehydrogenase. All of these proteins have a conserved cysteine that aligns with the catalytic cysteine of the ALDH group." Q#10273 - CGI_10026809 superfamily 245815 529 756 1.31E-34 137.442 cl11961 ALDH-SF superfamily C - "NAD(P)+-dependent aldehyde dehydrogenase superfamily; The aldehyde dehydrogenase superfamily (ALDH-SF) of NAD(P)+-dependent enzymes, in general, oxidize a wide range of endogenous and exogenous aliphatic and aromatic aldehydes to their corresponding carboxylic acids and play an important role in detoxification. Besides aldehyde detoxification, many ALDH isozymes possess multiple additional catalytic and non-catalytic functions such as participating in metabolic pathways, or as binding proteins, or osmoregulants, to mention a few. The enzyme has three domains, a NAD(P)+ cofactor-binding domain, a catalytic domain, and a bridging domain; and the active enzyme is generally either homodimeric or homotetrameric. 
The catalytic mechanism is proposed to involve cofactor binding, resulting in a conformational change and activation of an invariant catalytic cysteine nucleophile. The cysteine and aldehyde substrate form an oxyanion thiohemiacetal intermediate resulting in hydride transfer to the cofactor and formation of a thioacylenzyme intermediate. Hydrolysis of the thioacylenzyme and release of the carboxylic acid product occurs, and in most cases, the reduced cofactor dissociates from the enzyme. The evolutionary phylogenetic tree of ALDHs appears to have an initial bifurcation between what has been characterized as the classical aldehyde dehydrogenases, the ALDH family (ALDH) and extended family members or aldehyde dehydrogenase-like (ALDH-L) proteins. The ALDH proteins are represented by enzymes which share a number of highly conserved residues necessary for catalysis and cofactor binding and they include such proteins as retinal dehydrogenase, 10-formyltetrahydrofolate dehydrogenase, non-phosphorylating glyceraldehyde 3-phosphate dehydrogenase, delta(1)-pyrroline-5-carboxylate dehydrogenases, alpha-ketoglutaric semialdehyde dehydrogenase, alpha-aminoadipic semialdehyde dehydrogenase, coniferyl aldehyde dehydrogenase and succinate-semialdehyde dehydrogenase. Included in this larger group are all human, Arabidopsis, Tortula, fungal, protozoan, and Drosophila ALDHs identified in families ALDH1 through ALDH22 with the exception of families ALDH18, ALDH19, and ALDH20 which are present in the ALDH-like group. The ALDH-like group is represented by such proteins as gamma-glutamyl phosphate reductase, LuxC-like acyl-CoA reductase, and coenzyme A acylating aldehyde dehydrogenase. All of these proteins have a conserved cysteine that aligns with the catalytic cysteine of the ALDH group." 
Q#10274 - CGI_10026810 superfamily 241555 348 593 4.64E-126 375.353 cl00020 GAT_1 superfamily - - "Type 1 glutamine amidotransferase (GATase1)-like domain; Type 1 glutamine amidotransferase (GATase1)-like domain. This group contains proteins similar to Class I glutamine amidotransferases, the intracellular PH1704 from Pyrococcus horikoshii, the C-terminal of the large catalase: Escherichia coli HP-II, Sinorhizobium meliloti Rm1021 ThuA, the A4 beta-galactosidase middle domain and peptidase E. The majority of proteins in this group have a reactive Cys found in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow. For Class I glutamine amidotransferases proteins which transfer ammonia from the amide side chain of glutamine to an acceptor substrate, this Cys forms a Cys-His-Glu catalytic triad in the active site. Glutamine amidotransferase activity can be found in a range of biosynthetic enzymes included in this cd: glutamine amidotransferase, formylglycinamide ribonucleotide, GMP synthetase, anthranilate synthase component II, glutamine-dependent carbamoyl phosphate synthase (CPSase), cytidine triphosphate synthetase, gamma-glutamyl hydrolase, imidazole glycerol phosphate synthase and cobyric acid synthase. For Pyrococcus horikoshii PH1704, the Cys of the nucleophile elbow together with a different His and a Glu from an adjacent monomer form a catalytic triad different from the typical GATase1 triad. Peptidase E is believed to be a serine peptidase having a Ser-His-Glu catalytic triad which differs from the Cys-His-Glu catalytic triad of typical GATase1 domains, by having a Ser in place of the reactive Cys at the nucleophile elbow. The E. coli HP-II C-terminal domain, S. meliloti Rm1021 ThuA and the A4 beta-galactosidase middle domain lack the catalytic triad typical of GATase1 domains. 
GATase1-like domains can occur either as single polypeptides, as in Class I glutamine amidotransferases, or as domains in a much larger multifunctional synthase protein, such as CPSase. Peptidase E has a circular permutation in the common core of a typical GATase1 domain." Q#10274 - CGI_10026810 superfamily 247757 70 310 2.40E-120 361.139 cl17203 Fer4_NifH superfamily - - "The Fer4_NifH superfamily contains a variety of proteins which share a common ATP-binding domain. Functionally, proteins in this superfamily use the energy from hydrolysis of NTP to transfer electrons or ions." Q#10275 - CGI_10026811 superfamily 245936 182 375 4.18E-69 217.803 cl12283 IPK superfamily - - Inositol polyphosphate kinase; ArgRIII has been demonstrated to be an inositol polyphosphate kinase. Q#10276 - CGI_10026812 superfamily 243072 615 742 5.30E-31 120.566 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#10276 - CGI_10026812 superfamily 243072 682 809 2.26E-30 119.025 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#10276 - CGI_10026812 superfamily 243072 353 483 3.58E-29 115.559 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#10276 - CGI_10026812 superfamily 243072 283 415 3.96E-29 115.173 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#10276 - CGI_10026812 superfamily 243072 221 339 1.44E-28 113.633 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#10276 - CGI_10026812 superfamily 243072 936 1093 5.55E-28 112.092 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#10276 - CGI_10026812 superfamily 243072 429 568 9.31E-25 102.462 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#10276 - CGI_10026812 superfamily 243072 749 911 5.55E-23 97.4542 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#10276 - CGI_10026812 superfamily 243072 115 275 1.63E-21 93.217 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#10277 - CGI_10026813 superfamily 241758 324 591 1.55E-34 131.773 cl00292 AANH_like superfamily - - "Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases, ATP sulphurylases, Universal Stress Response protein and electron transfer flavoprotein (ETF). The domain forms an alpha/beta/alpha fold which binds to adenosine nucleotide." Q#10277 - CGI_10026813 superfamily 241780 1 148 2.26E-44 156.293 cl00319 Gn_AT_II superfamily - - "Glutamine amidotransferases class-II (GATase). The glutaminase domain catalyzes an amide nitrogen transfer from glutamine to the appropriate substrate. In this process, glutamine is hydrolyzed to glutamic acid and ammonia. This domain is related to members of the Ntn (N-terminal nucleophile) hydrolase superfamily and is found at the N-terminus of enzymes such as glucosamine-fructose 6-phosphate synthase (GLMS or GFAT), glutamine phosphoribosylpyrophosphate (Prpp) amidotransferase (GPATase), asparagine synthetase B (AsnB), beta lactam synthetase (beta-LS) and glutamate synthase (GltS). GLMS catalyzes the formation of glucosamine 6-phosphate from fructose 6-phosphate and glutamine in amino sugar synthesis. GPATase catalyzes the first step in purine biosynthesis, an amide transfer from glutamine to PRPP, resulting in phosphoribosylamine, pyrophosphate and glutamate. Asparagine synthetase B synthesizes asparagine from aspartate and glutamine. Beta-LS catalyzes the formation of the beta-lactam ring in the beta-lactamase inhibitor clavulanic acid. GltS synthesizes L-glutamate from 2-oxoglutarate and L-glutamine. These enzymes are generally dimers, but GPATase also exists as a homotetramer." Q#10278 - CGI_10026814 superfamily 245201 67 318 1.34E-51 174.347 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. 
It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#10282 - CGI_10026818 superfamily 152471 610 771 4.43E-44 157.906 cl13471 DUF3522 superfamily - - Protein of unknown function (DUF3522); This family of proteins is functionally uncharacterized. This protein is found in eukaryotes. Proteins in this family are typically between 220 to 787 amino acids in length. Q#10283 - CGI_10026819 superfamily 218122 308 627 7.06E-90 284.875 cl04558 Choline_transpo superfamily - - Plasma-membrane choline transporter; This family represents a high-affinity plasma-membrane choline transporter in C.elegans which is thought to be rate-limiting for ACh synthesis in cholinergic nerve terminals. Q#10285 - CGI_10026821 superfamily 247907 2888 3039 7.40E-40 148.335 cl17353 LamG superfamily - - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." Q#10285 - CGI_10026821 superfamily 247907 3306 3448 9.21E-37 139.476 cl17353 LamG superfamily - - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." 
Q#10285 - CGI_10026821 superfamily 247907 3480 3629 3.07E-29 117.519 cl17353 LamG superfamily - - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." Q#10285 - CGI_10026821 superfamily 247907 2690 2856 2.41E-26 109.43 cl17353 LamG superfamily - - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." Q#10285 - CGI_10026821 superfamily 247907 3065 3217 4.68E-25 105.578 cl17353 LamG superfamily - - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." 
Q#10285 - CGI_10026821 superfamily 238012 546 593 2.77E-11 62.3718 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#10285 - CGI_10026821 superfamily 238012 1376 1418 5.01E-11 61.6014 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#10285 - CGI_10026821 superfamily 238012 2029 2074 3.02E-10 59.2902 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#10285 - CGI_10026821 superfamily 238012 1824 1867 3.09E-10 59.2902 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement 
membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#10285 - CGI_10026821 superfamily 238012 497 541 6.18E-10 58.5198 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#10285 - CGI_10026821 superfamily 238012 636 678 7.38E-10 58.1346 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#10285 - CGI_10026821 superfamily 238012 450 494 3.04E-09 56.5938 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first 
three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#10285 - CGI_10026821 superfamily 238012 1467 1510 5.47E-09 55.8234 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#10285 - CGI_10026821 superfamily 238012 795 838 8.57E-09 55.053 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#10285 - CGI_10026821 superfamily 238012 1514 1565 4.45E-08 53.127 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#10285 - CGI_10026821 superfamily 238012 2076 
2120 5.20E-08 52.7418 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#10285 - CGI_10026821 superfamily 238012 1981 2027 5.90E-08 52.7418 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#10285 - CGI_10026821 superfamily 238012 405 447 8.33E-08 52.3566 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#10285 - CGI_10026821 superfamily 238012 592 633 1.01E-07 51.9714 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth 
migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#10285 - CGI_10026821 superfamily 238012 742 793 2.43E-07 50.8158 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#10285 - CGI_10026821 superfamily 238012 1872 1927 1.94E-06 48.1194 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#10285 - CGI_10026821 superfamily 238012 276 317 1.41E-05 45.8082 cl11390 EGF_Lam superfamily C - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); 
the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#10285 - CGI_10026821 superfamily 238012 683 740 1.53E-05 45.8082 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#10285 - CGI_10026821 superfamily 238012 1422 1459 7.79E-05 43.497 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#10285 - CGI_10026821 superfamily 238012 335 392 0.000218782 42.3414 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#10285 - CGI_10026821 superfamily 238012 1928 1980 0.000267733 41.9562 cl11390 EGF_Lam 
superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#10285 - CGI_10026821 superfamily 243198 28 275 9.26E-83 275.393 cl02806 Laminin_N superfamily - - Laminin N-terminal (Domain VI); Laminin N-terminal (Domain VI). Q#10285 - CGI_10026821 superfamily 243080 1626 1750 2.47E-36 137.392 cl02548 Laminin_B superfamily - - Laminin B (Domain IV); Laminin B (Domain IV). Q#10285 - CGI_10026821 superfamily 203372 2586 2716 1.05E-16 80.5951 cl05515 Laminin_II superfamily - - "Laminin Domain II; It has been suggested that the domains I and II from laminin A, B1 and B2 may come together to form a triple helical coiled-coil structure." Q#10286 - CGI_10026822 superfamily 245814 336 395 6.11E-06 44.1088 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#10286 - CGI_10026822 superfamily 245814 217 276 0.000541149 38.1568 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. 
The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#10290 - CGI_10026827 superfamily 221219 1 29 0.000348865 39.1539 cl13259 RRN7 superfamily - - RNA polymerase I-specific transcription initiation factor Rrn7; Rrn7 is a transcription binding factor that associates strongly with both Rrn6 and Rrn11 to form a complex which itself binds the TATA-binding protein and is required for transcription by the core domain of the RNA PolI promoter. Q#10291 - CGI_10026828 superfamily 241983 12 350 6.18E-44 154.823 cl00614 ADP_ribosyl_GH superfamily - - "ADP-ribosylglycohydrolase; This family includes enzymes that ADP-ribosylations, for example ADP-ribosylarginine hydrolase EC:3.2.2.19 cleaves ADP-ribose-L-arginine. The family also includes dinitrogenase reductase activating glycohydrolase. Most surprisingly the family also includes jellyfish crystallins, these proteins appear to have lost the presumed active site residues." Q#10292 - CGI_10026829 superfamily 243072 330 438 2.14E-18 83.587 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#10292 - CGI_10026829 superfamily 243072 149 257 3.38E-18 83.2018 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#10292 - CGI_10026829 superfamily 243072 468 570 9.93E-13 67.0234 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#10292 - CGI_10026829 superfamily 243072 1 103 1.02E-10 60.8602 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#10292 - CGI_10026829 superfamily 219953 1243 1305 7.21E-15 74.1423 cl07317 DUF1777 superfamily N - Protein of unknown function (DUF1777); This is a family of eukaryotic proteins of unknown function. Some of the proteins in this family are putative nucleic acid binding proteins. 
Q#10292 - CGI_10026829 superfamily 217473 639 827 4.32E-07 51.9821 cl03978 Mab-21 superfamily C - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#10293 - CGI_10026830 superfamily 248264 60 122 2.93E-13 62.2546 cl17710 DDE_4 superfamily C - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#10295 - CGI_10026832 superfamily 246598 3 242 1.77E-138 392.867 cl13996 MPN superfamily - - "Mpr1p, Pad1p N-terminal (MPN) domains; MPN (also known as Mov34, PAD-1, JAMM, JAB, MPN+) domains are found in the N-terminal termini of proteins with a variety of functions; they are components of the proteasome regulatory subunits, the signalosome (CSN), eukaryotic translation initiation factor 3 (eIF3) complexes, and regulators of transcription factors. These domains are isopeptidases that release ubiquitin from ubiquitinated proteins (thus having deubiquitinating (DUB) activity) that are tagged for degradation. Catalytically active MPN domains contain a metalloprotease signature known as the JAB1/MPN/Mov34 metalloenzyme (JAMM) motif. For example, Rpn11 (also known as POH1 or PSMD14), a subunit of the 19S proteasome lid is involved in the ATP-dependent degradation of ubiquitinated proteins, contains the conserved JAMM motif involved in zinc ion coordination. Poh1 is a regulator of c-Jun, an important regulator of cell proliferation, differentiation, survival and death. 
JAB1 is a component of the COP9 signalosome (CSN), a regulatory particle of the ubiquitin (Ub)/26S proteasome system occurring in all eukaryotic cells; it cleaves the ubiquitin-like protein NEDD8 from the cullin subunit of the SCF (Skp1, Cullins, F-box proteins) family of E3 ubiquitin ligases. AMSH (associated molecule with the SH3 domain of STAM, also known as STAMBP), a member of JAMM/MPN+ deubiquitinases (DUBs), specifically cleaves Lys 63-linked polyubiquitin (poly-Ub) chains, thus facilitating the recycling and subsequent trafficking of receptors to the cell surface. Similarly, BRCC36, part of the nuclear complex that includes BRCA1 protein and is targeted to DNA damage foci after irradiation, specifically disassembles K63-linked polyUb. BRCC36 is aberrantly expressed in sporadic breast tumors, indicative of a potential role in the pathogenesis of the disease. Some variants of the JAB1/MPN domains lack key residues in their JAMM motif and are unable to coordinate a metal ion. Comparisons of key catalytic and metal binding residues explain why the MPN-containing proteins Mov34/PSMD7, Rpn8, CSN6, Prp8p, and the translation initiation factor 3 subunits f (p47) and h (p40) do not show catalytic isopeptidase activity. It has been proposed that the MPN domain in these proteins has a primarily structural function." Q#10296 - CGI_10026833 superfamily 241563 166 201 4.23E-05 41.504 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#10296 - CGI_10026833 superfamily 247792 36 82 0.00268375 36.2696 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#10296 - CGI_10026833 superfamily 241563 105 152 0.000790538 37.844 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#10297 - CGI_10026834 superfamily 245596 259 410 1.41E-10 62.9022 cl11394 Glyco_tranf_GTA_type superfamily NC - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. 
This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#10298 - CGI_10026835 superfamily 241568 804 858 5.69E-07 48.228 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#10298 - CGI_10026835 superfamily 214531 160 203 1.99E-05 43.3593 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#10298 - CGI_10026835 superfamily 214531 204 246 3.71E-05 42.5889 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. 
Q#10298 - CGI_10026835 superfamily 241578 495 533 0.0014306 40.0608 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#10300 - CGI_10026837 superfamily 248013 26 71 3.32E-05 40.7103 cl17459 CHROMO superfamily - - "Chromatin organization modifier (chromo) domain is a conserved region of around 50 amino acids found in a variety of chromosomal proteins, which appear to play a role in the functional organization of the eukaryotic nucleus. Experimental evidence implicates the chromo domain in the binding activity of these proteins to methylated histone tails and maybe RNA. May occur as single instance, in a tandem arrangement or followd by a related "chromo shadow" domain." Q#10300 - CGI_10026837 superfamily 218713 223 335 1.16E-37 134.392 cl05332 MRG superfamily C - "MRG; This family consists of three different eukaryotic proteins (mortality factor 4 (MORF4/MRG15), male-specific lethal 3(MSL-3) and ESA1-associated factor 3(EAF3)). 
It is thought that the MRG family is involved in transcriptional regulation via histone acetylation. It contains 2 chromo domains and a leucine zipper motif." Q#10301 - CGI_10026838 superfamily 247683 769 830 8.69E-36 131.381 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#10301 - CGI_10026838 superfamily 247744 895 1072 4.04E-34 129.189 cl17190 NK superfamily - - "Nucleoside/nucleotide kinase (NK) is a protein superfamily consisting of multiple families of enzymes that share structural similarity and are functionally related to the catalysis of the reversible phosphate group transfer from nucleoside triphosphates to nucleosides/nucleotides, nucleoside monophosphates, or sugars. Members of this family play a wide variety of essential roles in nucleotide metabolism, the biosynthesis of coenzymes and aromatic compounds, as well as the metabolism of sugar and sulfate." Q#10301 - CGI_10026838 superfamily 241622 672 750 8.84E-17 77.2218 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). 
Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archebacteria, and eurkayotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#10301 - CGI_10026838 superfamily 241622 148 221 4.17E-13 66.8214 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archebacteria, and eurkayotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. 
The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#10301 - CGI_10026838 superfamily 243136 541 595 0.00222134 37.4876 cl02672 L27 superfamily - - L27 domain; The L27 domain is found in receptor targeting proteins Lin-2 and Lin-7. Q#10302 - CGI_10026839 superfamily 241599 147 203 5.57E-20 81.906 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#10308 - CGI_10026845 superfamily 246918 119 173 6.30E-12 59.1375 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#10308 - CGI_10026845 superfamily 246918 64 116 2.38E-11 57.5967 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. 
Q#10310 - CGI_10026847 superfamily 247684 19 148 2.02E-12 60.6803 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#10311 - CGI_10026848 superfamily 247684 1 186 3.05E-121 350.539 cl17037 NBD_sugar-kinase_HSP70_actin superfamily N - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#10314 - CGI_10026851 superfamily 245596 210 427 7.12E-36 133.761 cl11394 Glyco_tranf_GTA_type superfamily - - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. 
Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#10315 - CGI_10026852 superfamily 192254 699 759 0.000510756 39.4955 cl07826 ATG_C superfamily C - ATG C terminal domain; ATG2 (also known as Apg2) is a peripheral membrane protein. It functions in both cytoplasm to vacuole targeting and autophagy. Q#10316 - CGI_10026853 superfamily 243062 237 337 4.98E-11 58.4413 cl02510 TGF_beta superfamily - - Transforming growth factor beta like domain; Transforming growth factor beta like domain. Q#10318 - CGI_10026855 superfamily 241564 159 221 2.77E-24 95.0251 cl00035 BIR superfamily - - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." 
Q#10318 - CGI_10026855 superfamily 247792 369 407 0.000168786 38.966 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#10320 - CGI_10026857 superfamily 241564 411 478 4.03E-28 107.737 cl00035 BIR superfamily - - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." 
Q#10320 - CGI_10026857 superfamily 247792 582 621 0.000758836 37.8104 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#10320 - CGI_10026857 superfamily 241564 216 277 1.03E-10 58.4495 cl00035 BIR superfamily - - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." Q#10321 - CGI_10026858 superfamily 241564 281 335 1.01E-21 88.4767 cl00035 BIR superfamily C - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." 
Q#10321 - CGI_10026858 superfamily 247792 389 427 3.40E-05 41.2772 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#10323 - CGI_10026860 superfamily 241564 17 84 1.02E-29 108.892 cl00035 BIR superfamily - - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." 
Q#10323 - CGI_10026860 superfamily 247792 263 301 3.79E-05 40.5068 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#10324 - CGI_10026861 superfamily 243051 386 540 3.34E-27 109.39 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#10324 - CGI_10026861 superfamily 241571 238 314 7.00E-09 54.7258 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." 
Q#10324 - CGI_10026861 superfamily 245213 687 718 0.000165782 40.3126 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#10324 - CGI_10026861 superfamily 241583 1 174 6.29E-33 126.532 cl00064 ZnMc superfamily - - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#10324 - CGI_10026861 superfamily 243051 598 679 2.15E-13 68.9165 cl02479 MAM superfamily NC - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. 
In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#10324 - CGI_10026861 superfamily 243051 727 839 1.03E-12 66.6325 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#10327 - CGI_10026864 superfamily 243066 95 190 2.23E-10 55.3896 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. 
POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#10332 - CGI_10008196 superfamily 245205 14 70 2.82E-06 41.8397 cl09930 RPA_2b-aaRSs_OBF_like superfamily N - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." Q#10333 - CGI_10008197 superfamily 243061 20 112 3.32E-08 49.2626 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. 
Q#10335 - CGI_10008199 superfamily 220389 2 160 6.67E-71 216.112 cl10747 DUF2053 superfamily - - Predicted membrane protein (DUF2053); This entry is of the conserved N-terminal 150 residues of proteins conserved from plants to humans. The function is unknown although some annotation suggests it to be a transmembrane protein. Q#10337 - CGI_10008201 superfamily 241571 162 272 2.00E-15 75.9118 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#10337 - CGI_10008201 superfamily 241571 439 544 1.15E-11 64.741 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#10337 - CGI_10008201 superfamily 241571 335 429 9.06E-10 58.963 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#10337 - CGI_10008201 superfamily 245213 1791 1827 6.96E-09 54.9502 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#10337 - CGI_10008201 superfamily 241568 875 936 5.04E-08 52.8504 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#10337 - CGI_10008201 superfamily 241568 611 665 4.31E-07 50.154 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#10337 - CGI_10008201 superfamily 247907 2175 2322 4.35E-07 51.2649 cl17353 LamG superfamily - - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." 
Q#10337 - CGI_10008201 superfamily 245213 1838 1873 9.63E-07 48.787 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#10337 - CGI_10008201 superfamily 245213 1707 1750 2.22E-06 47.6314 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#10337 - CGI_10008201 superfamily 243035 49 119 8.15E-06 46.8442 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. 
Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. 
Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#10337 - CGI_10008201 superfamily 241568 690 726 2.29E-05 44.7612 cl00043 CCP superfamily N - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#10337 - CGI_10008201 superfamily 245213 2108 2143 3.30E-05 44.1646 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#10337 - CGI_10008201 superfamily 245213 1955 1990 0.000310226 41.4682 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#10337 - CGI_10008201 superfamily 245213 1993 2028 0.000322538 41.083 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#10337 - CGI_10008201 superfamily 245213 1916 1951 0.000344494 41.083 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#10337 - CGI_10008201 superfamily 245213 823 852 0.000357091 41.083 cl09941 EGF_CA superfamily C - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#10337 - CGI_10008201 superfamily 241568 1275 1330 0.00378413 38.2128 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#10337 - CGI_10008201 superfamily 219525 2671 2721 1.35E-07 51.2657 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#10337 - CGI_10008201 superfamily 219525 2728 2775 5.37E-07 49.7249 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#10337 - CGI_10008201 superfamily 219525 1606 1649 1.59E-06 48.1841 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#10337 - CGI_10008201 superfamily 219525 1502 1547 1.08E-05 45.8729 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#10337 - CGI_10008201 superfamily 111397 1129 1208 6.15E-05 43.8691 cl03620 HYR superfamily - - "HYR domain; This domain is known as the HYR (Hyalin Repeat) domain, after the protein hyalin that is composed exclusively of this repeat. 
This domain probably corresponds to a new superfamily in the immunoglobulin fold. The function of this domain is uncertain it may be involved in cell adhesion." Q#10337 - CGI_10008201 superfamily 219525 1554 1599 0.000108487 42.7914 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#10339 - CGI_10007169 superfamily 245814 71 131 4.30E-07 46.7135 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#10340 - CGI_10007170 superfamily 245814 264 315 0.0085316 34.7723 cl11960 Ig superfamily C - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#10340 - CGI_10007170 superfamily 245814 326 411 8.32E-11 58.6708 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#10340 - CGI_10007170 superfamily 245814 89 134 0.000196285 39.5332 cl11960 Ig superfamily N - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#10341 - CGI_10007171 superfamily 241568 1089 1146 1.79E-09 56.7024 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." 
Q#10341 - CGI_10007171 superfamily 245213 1790 1825 1.55E-07 50.713 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#10341 - CGI_10007171 superfamily 241568 1150 1207 9.55E-05 42.8352 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#10341 - CGI_10007171 superfamily 245213 545 579 0.000191439 41.4682 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. 
EGF_CA can be found in tandem repeat arrangements." Q#10341 - CGI_10007171 superfamily 245213 1827 1863 0.000221725 41.4682 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#10341 - CGI_10007171 superfamily 245213 507 541 0.00814084 36.8458 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#10341 - CGI_10007171 superfamily 111397 1004 1084 2.87E-13 68.5218 cl03620 HYR superfamily - - "HYR domain; This domain is known as the HYR (Hyalin Repeat) domain, after the protein hyalin that is composed exclusively of this repeat. This domain probably corresponds to a new superfamily in the immunoglobulin fold. The function of this domain is uncertain it may be involved in cell adhesion." Q#10341 - CGI_10007171 superfamily 246918 1880 1923 4.51E-09 55.2855 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. 
Q#10341 - CGI_10007171 superfamily 246918 587 636 2.62E-08 53.3595 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#10341 - CGI_10007171 superfamily 219525 793 838 3.54E-08 52.8065 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#10341 - CGI_10007171 superfamily 219525 2146 2191 1.31E-07 50.8805 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#10341 - CGI_10007171 superfamily 111397 1208 1292 1.94E-07 51.1879 cl03620 HYR superfamily - - "HYR domain; This domain is known as the HYR (Hyalin Repeat) domain, after the protein hyalin that is composed exclusively of this repeat. This domain probably corresponds to a new superfamily in the immunoglobulin fold. The function of this domain is uncertain it may be involved in cell adhesion." Q#10341 - CGI_10007171 superfamily 219525 1560 1607 5.67E-07 49.3397 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#10341 - CGI_10007171 superfamily 219525 1522 1565 4.83E-06 46.2581 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#10341 - CGI_10007171 superfamily 219525 400 446 1.55E-05 45.1026 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#10341 - CGI_10007171 superfamily 219525 343 393 3.04E-05 43.947 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#10342 - CGI_10007172 superfamily 245213 160 195 1.11E-05 41.8534 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#10342 - CGI_10007172 superfamily 111397 197 277 2.55E-13 63.8994 cl03620 HYR superfamily - - "HYR domain; This domain is known as the HYR (Hyalin Repeat) domain, after the protein hyalin that is composed exclusively of this repeat. This domain probably corresponds to a new superfamily in the immunoglobulin fold. The function of this domain is uncertain it may be involved in cell adhesion." Q#10343 - CGI_10007173 superfamily 241568 172 233 6.11E-06 45.5316 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#10343 - CGI_10007173 superfamily 241568 246 295 1.83E-05 43.9908 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." 
Q#10343 - CGI_10007173 superfamily 245213 835 866 0.000409015 39.9274 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#10343 - CGI_10007173 superfamily 245213 799 828 0.00687813 36.0754 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#10343 - CGI_10007173 superfamily 246918 874 927 1.30E-11 61.8339 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#10343 - CGI_10007173 superfamily 111397 295 378 4.62E-09 55.0398 cl03620 HYR superfamily - - "HYR domain; This domain is known as the HYR (Hyalin Repeat) domain, after the protein hyalin that is composed exclusively of this repeat. This domain probably corresponds to a new superfamily in the immunoglobulin fold. The function of this domain is uncertain it may be involved in cell adhesion." Q#10343 - CGI_10007173 superfamily 219525 635 685 6.75E-08 50.8805 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. 
Q#10343 - CGI_10007173 superfamily 219525 1152 1197 3.16E-07 48.9545 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#10343 - CGI_10007173 superfamily 219525 692 738 0.00035767 40.095 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#10343 - CGI_10007173 superfamily 111397 87 167 0.00221351 37.7059 cl03620 HYR superfamily - - "HYR domain; This domain is known as the HYR (Hyalin Repeat) domain, after the protein hyalin that is composed exclusively of this repeat. This domain probably corresponds to a new superfamily in the immunoglobulin fold. The function of this domain is uncertain it may be involved in cell adhesion." Q#10344 - CGI_10007174 superfamily 245814 136 208 6.31E-05 40.9517 cl11960 Ig superfamily C - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#10346 - CGI_10007176 superfamily 245814 223 309 5.83E-05 40.587 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. 
A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#10347 - CGI_10007177 superfamily 219525 25 70 4.42E-09 47.7989 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#10357 - CGI_10016580 superfamily 110440 214 240 0.000760547 36.6169 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#10359 - CGI_10016582 superfamily 243035 4 74 1.21E-13 61.8669 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g.
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#10360 - CGI_10016583 superfamily 243035 39 148 1.73E-18 76.8897 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#10361 - CGI_10016584 superfamily 243035 32 161 1.42E-21 84.9789 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. 
CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#10362 - CGI_10016585 superfamily 247856 48 90 1.03E-09 55.2465 cl17302 EFh superfamily N - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#10363 - CGI_10016586 superfamily 241600 299 346 2.19E-10 58.4059 cl00085 FReD superfamily C - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#10364 - CGI_10016587 superfamily 241600 456 666 5.17E-95 294.533 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." 
Q#10365 - CGI_10016589 superfamily 243092 470 681 2.89E-10 60.4264 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#10366 - CGI_10016590 superfamily 222412 142 172 0.000335975 36.1945 cl16432 Tnp_zf-ribbon_2 superfamily - - DDE_Tnp_1-like zinc-ribbon; This zinc-ribbon domain is frequently found at the C-terminal of proteins derived from transposable elements. Q#10367 - CGI_10016592 superfamily 245847 33 99 6.15E-15 66.4261 cl12042 FA58C superfamily C - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." 
Q#10368 - CGI_10016593 superfamily 246664 17 266 4.55E-80 248.644 cl14561 An_peroxidase_like superfamily - - "Animal heme peroxidases and related proteins; A diverse family of enzymes, which includes prostaglandin G/H synthase, thyroid peroxidase, myeloperoxidase, linoleate diol synthase, lactoperoxidase, peroxinectin, peroxidasin, and others. Despite its name, this family is not restricted to metazoans: members are found in fungi, plants, and bacteria as well." Q#10375 - CGI_10002487 superfamily 246683 791 889 3.46E-51 183.09 cl14648 Aldose_epim superfamily C - "aldose 1-epimerase superfamily; Aldose 1-epimerases or mutarotases are key enzymes of carbohydrate metabolism; they catalyze the interconversion of the alpha- and beta-anomers of hexose sugars such as glucose and galactose. This interconversion is an important step that allows anomer specific metabolic conversion of sugars. Studies of the catalytic mechanism of the best known member of the family, galactose mutarotase, have shown a glutamate and a histidine residue to be critical for catalysis; the glutamate serves as the active site base to initiate the reaction by removing the proton from the C-1 hydroxyl group of the sugar substrate and the histidine as the active site acid to protonate the C-5 ring oxygen." Q#10375 - CGI_10002487 superfamily 243119 364 412 4.27E-05 42.4209 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. 
Q#10375 - CGI_10002487 superfamily 243119 428 463 0.000455181 39.3494 cl02629 CBM_14 superfamily N - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#10375 - CGI_10002487 superfamily 241611 644 733 0.00054097 40.0644 cl00102 PTX superfamily N - "Pentraxins are plasma proteins characterized by their pentameric discoid assembly and their Ca2+ dependent ligand binding, such as Serum amyloid P component (SAP) and C-reactive Protein (CRP), which are cytokine-inducible acute-phase proteins implicated in innate immunity. CRP binds to ligands containing phosphocholine, SAP binds to amyloid fibrils, DNA, chromatin, fibronectin, C4-binding proteins and glycosaminoglycans. "Long" pentraxins have N-terminal extensions to the common pentraxin domain; one group, the neuronal pentraxins, may be involved in synapse formation and remodeling, and they may also be able to form heteromultimers." Q#10379 - CGI_10009583 superfamily 220695 55 182 3.59E-05 43.3363 cl18571 7TM_GPCR_Srx superfamily C - Serpentine type 7TM GPCR chemoreceptor Srx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srx is part of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. 
Q#10380 - CGI_10009584 superfamily 245603 1 66 2.74E-11 56.0612 cl11403 pepsin_retropepsin_like superfamily - - "Cellular and retroviral pepsin-like aspartate proteases; This family includes both cellular and retroviral pepsin-like aspartate proteases. The cellular pepsin and pepsin-like enzymes are twice as long as their retroviral counterparts. The cellular pepsin-like aspartic proteases are found in mammals, plants, fungi and bacteria. These well known and extensively characterized enzymes include pepsins, chymosin, rennin, cathepsins, and fungal aspartic proteases. Several have long been known to be medically (rennin, cathepsin D and E, pepsin) or commercially (chymosin) important. The eukaryotic pepsin-like proteases contain two domains possessing similar topological features. The N- and C-terminal domains, although structurally related by a 2-fold axis, have only limited sequence homology except in the vicinity of the active site. This suggests that the enzymes evolved by an ancient duplication event. The eukaryotic pepsin-like proteases have two active site ASP residues with each N- and C-terminal lobe contributing one residue. While the fungal and mammalian pepsins are bilobal proteins, retropepsins function as dimers and the monomer resembles structure of the N- or C-terminal domains of eukaryotic enzyme. The active site motif (Asp-Thr/Ser-Gly-Ser) is conserved between the retroviral and eukaryotic proteases and between the N-and C-terminal of eukaryotic pepsin-like proteases. The retropepsin-like family includes pepsin-like aspartate proteases from retroviruses, retrotransposons and retroelements; as well as eukaryotic DNA-damage-inducible proteins (DDIs), and bacterial aspartate peptidases. Retropepsin is synthesized as part of the POL polyprotein that contains an aspartyl-protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. 
This family of aspartate proteases is classified by MEROPS as the peptidase family A1 (pepsin A) and A2 (retropepsin family)." Q#10381 - CGI_10002791 superfamily 247724 88 107 0.00172271 35.3802 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#10384 - CGI_10002877 superfamily 243082 13 157 6.09E-58 185.539 cl02553 Peptidase_C19 superfamily N - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquinated peptides by cleavage of isopeptide bonds. 
They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." Q#10387 - CGI_10002710 superfamily 245847 8 120 2.67E-13 62.1889 cl12042 FA58C superfamily N - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#10388 - CGI_10002711 superfamily 245847 191 237 0.00327445 35.6102 cl12042 FA58C superfamily N - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#10391 - CGI_10002714 superfamily 243091 101 156 0.000258691 37.6991 cl02566 SET superfamily N - "SET domain; SET domains are protein lysine methyltransferase enzymes. SET domains appear to be protein-protein interaction domains. It has been demonstrated that SET domains mediate interactions with a family of proteins that display similarity with dual-specificity phosphatases (dsPTPases). A subset of SET domains have been called PR domains. These domains are divergent in sequence from other SET domains, but also appear to mediate protein-protein interaction. The SET domain consists of two regions known as SET-N and SET-C. SET-C forms an unusual and conserved knot-like structure of probably functional importance. Additionally to SET-N and SET-C, an insert region (SET-I) and flanking regions of high structural variability form part of the overall structure." 
Q#10393 - CGI_10003448 superfamily 243034 383 485 5.06E-17 77.8055 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#10393 - CGI_10003448 superfamily 243092 21 197 2.06E-23 100.487 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several 
blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#10393 - CGI_10003448 superfamily 243092 524 624 4.53E-09 56.9596 cl02567 WD40 superfamily NC - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#10396 - CGI_10011542 superfamily 219094 77 317 6.39E-07 48.8013 cl05875 Neogenin_C superfamily N - "Neogenin C-terminus; This family represents the C-terminus of eukaryotic neogenin precursor proteins, which contains several potential phosphorylation sites. Neogenin is a member of the N-CAM family of cell adhesion molecules (and therefore contains multiple copies of pfam00047 and pfam00041) and is closely related to the DCC tumour suppressor gene product - these proteins may play an integral role in regulating differentiation programmes and/or cell migration events within many adult and embryonic tissues." Q#10397 - CGI_10011543 superfamily 241584 730 818 3.77E-16 76.3811 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#10397 - CGI_10011543 superfamily 241584 1136 1231 4.26E-16 75.9959 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." 
Q#10397 - CGI_10011543 superfamily 241584 825 918 4.26E-15 73.2995 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#10397 - CGI_10011543 superfamily 241584 929 1019 3.54E-12 64.8251 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#10397 - CGI_10011543 superfamily 241584 636 713 1.67E-08 53.6543 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." 
Q#10397 - CGI_10011543 superfamily 241584 1032 1128 2.80E-07 50.1875 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#10397 - CGI_10011543 superfamily 247684 13 230 2.80E-57 205.204 cl17037 NBD_sugar-kinase_HSP70_actin superfamily C - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. 
The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#10397 - CGI_10011543 superfamily 245814 244 325 6.35E-16 75.5923 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#10397 - CGI_10011543 superfamily 245814 546 615 1.31E-15 74.2267 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#10397 - CGI_10011543 superfamily 245814 337 418 5.38E-12 63.9533 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#10397 - CGI_10011543 superfamily 245814 439 526 7.96E-11 60.5968 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#10398 - CGI_10011544 superfamily 247684 17 178 7.68E-45 153.972 cl17037 NBD_sugar-kinase_HSP70_actin superfamily C - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#10399 - CGI_10011545 superfamily 243384 26 48 0.000126283 36.6474 cl03312 Flavokinase superfamily C - Riboflavin kinase; This family represents the C-terminal region of the bifunctional riboflavin biosynthesis protein known as RibC in Bacillus subtilis. The RibC protein from Bacillus subtilis has both flavokinase and flavin adenine dinucleotide synthetase (FAD-synthetase) activities. RibC plays an essential role in the flavin metabolism. This domain is thought to have kinase activity. 
Q#10401 - CGI_10011547 superfamily 241984 60 150 8.15E-41 141.924 cl00615 Membrane-FADS-like superfamily C - "The membrane fatty acid desaturase (Membrane_FADS)-like CD includes membrane FADSs, alkane hydroxylases, beta carotene ketolases (CrtW-like), hydroxylases (CrtR-like), and other related proteins. They are present in all groups of organisms with the exception of archaea. Membrane FADSs are non-heme, iron-containing, oxygen-dependent enzymes involved in regioselective introduction of double bonds in fatty acyl aliphatic chains. They play an important role in the maintenance of the proper structure and functioning of biological membranes. Alkane hydroxylases are bacterial, integral-membrane di-iron enzymes that share a requirement for iron and oxygen for activity similar to that of membrane FADSs, and are involved in the initial oxidation of inactivated alkanes. Beta-carotene ketolase and beta-carotene hydroxylase are carotenoid biosynthetic enzymes for astaxanthin and zeaxanthin, respectively. This superfamily domain has extensive hydrophobic regions that would be capable of spanning the membrane bilayer at least twice. Comparison of these sequences also reveals three regions of conserved histidine cluster motifs that contain eight histidine residues: HXXX(X)H, HXX(X)HH, and HXXHH (an additional conserved histidine residue is seen between clusters 2 and 3). Spectroscopic and genetic evidence point to a nitrogen-rich coordination environment located in the cytoplasm with as many as eight histidines coordinating the two iron ions and a carboxylate residue bridging the two metals in the Pseudomonas oleovorans alkane hydroxylase (AlkB). In addition, the eight histidine residues are reported to be catalytically essential and proposed to be the ligands for the iron atoms contained within the rat stearoyl CoA delta-9 desaturase." 
Q#10401 - CGI_10011547 superfamily 241984 204 300 1.12E-30 114.96 cl00615 Membrane-FADS-like superfamily N - "The membrane fatty acid desaturase (Membrane_FADS)-like CD includes membrane FADSs, alkane hydroxylases, beta carotene ketolases (CrtW-like), hydroxylases (CrtR-like), and other related proteins. They are present in all groups of organisms with the exception of archaea. Membrane FADSs are non-heme, iron-containing, oxygen-dependent enzymes involved in regioselective introduction of double bonds in fatty acyl aliphatic chains. They play an important role in the maintenance of the proper structure and functioning of biological membranes. Alkane hydroxylases are bacterial, integral-membrane di-iron enzymes that share a requirement for iron and oxygen for activity similar to that of membrane FADSs, and are involved in the initial oxidation of inactivated alkanes. Beta-carotene ketolase and beta-carotene hydroxylase are carotenoid biosynthetic enzymes for astaxanthin and zeaxanthin, respectively. This superfamily domain has extensive hydrophobic regions that would be capable of spanning the membrane bilayer at least twice. Comparison of these sequences also reveals three regions of conserved histidine cluster motifs that contain eight histidine residues: HXXX(X)H, HXX(X)HH, and HXXHH (an additional conserved histidine residue is seen between clusters 2 and 3). Spectroscopic and genetic evidence point to a nitrogen-rich coordination environment located in the cytoplasm with as many as eight histidines coordinating the two iron ions and a carboxylate residue bridging the two metals in the Pseudomonas oleovorans alkane hydroxylase (AlkB). In addition, the eight histidine residues are reported to be catalytically essential and proposed to be the ligands for the iron atoms contained within the rat stearoyl CoA delta-9 desaturase." 
Q#10402 - CGI_10011548 superfamily 216966 11 196 1.21E-49 161.723 cl03523 HORMA superfamily - - "HORMA domain; The HORMA (for Hop1p, Rev7p and MAD2) domain has been suggested to recognise chromatin states that result from DNA adducts, double stranded breaks or non-attachment to the spindle and acts as an adaptor that recruits other proteins. MAD2 is a spindle checkpoint protein which prevents progression of the cell cycle upon detection of a defect in mitotic spindle integrity." Q#10403 - CGI_10011549 superfamily 241596 104 161 3.88E-14 63.7723 cl00081 HLH superfamily - - "Helix-loop-helix domain, found in specific DNA- binding proteins that act as transcription factors; 60-100 amino acids long. A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE 5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved in both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#10404 - CGI_10011550 superfamily 241596 94 154 2.58E-15 68.7799 cl00081 HLH superfamily - - "Helix-loop-helix domain, found in specific DNA- binding proteins that act as transcription factors; 60-100 amino acids long. 
A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE 5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved in both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#10405 - CGI_10003838 superfamily 247724 21 192 5.00E-111 319.348 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. 
The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#10406 - CGI_10003839 superfamily 213148 324 445 1.45E-28 113.924 cl17041 helicase_insert_domain superfamily - - "helical domain inserted in SF2-type helicase domain in Hef-, MDA5- and FancM-like proteins; This helical domain can be found inserted in a subset of SF2-type DEAD-box related helicases, like archaeal Hef helicase, MDA5-like helicases and FancM-like helicases. The exact function of this domain is unknown, but seems to play a role in interaction with nucleotides and/or the stabilization of the nucleotide complex." Q#10406 - CGI_10003839 superfamily 247905 458 603 6.40E-24 100.775 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#10406 - CGI_10003839 superfamily 247805 131 274 4.06E-21 93.1708 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 
Q#10406 - CGI_10003839 superfamily 245342 2105 2185 1.56E-05 45.4175 cl10594 ERCC4 superfamily - - ERCC4 domain; This domain is a family of nucleases. The family includes EME1 which is an essential component of a Holliday junction resolvase. EME1 interacts with MUS81 to form a DNA structure-specific endonuclease. Q#10407 - CGI_10003840 superfamily 190724 1136 1201 4.23E-09 55.1158 cl04231 Ribosomal_S5_C superfamily - - "Ribosomal protein S5, C-terminal domain; Ribosomal protein S5, C-terminal domain. " Q#10407 - CGI_10003840 superfamily 144065 1066 1111 0.000373828 40.3465 cl02842 Ribosomal_S5 superfamily - - "Ribosomal protein S5, N-terminal domain; Ribosomal protein S5, N-terminal domain. " Q#10408 - CGI_10003841 superfamily 241672 2 297 5.54E-119 347.23 cl00192 ribokinase_pfkB_like superfamily - - "ribokinase/pfkB superfamily: Kinases that accept a wide variety of substrates, including carbohydrates and aromatic small molecules, all are phosphorylated at a hydroxyl group. The superfamily includes ribokinase, fructokinase, ketohexokinase, 2-dehydro-3-deoxygluconokinase, 1-phosphofructokinase, the minor 6-phosphofructokinase (PfkB), inosine-guanosine kinase, and adenosine kinase. Even though there is a high degree of structural conservation within this superfamily, their multimerization level varies widely, monomeric (e.g. adenosine kinase), dimeric (e.g. ribokinase), and trimeric (e.g THZ kinase)." Q#10410 - CGI_10003843 superfamily 215859 44 145 0.00226396 35.6551 cl18347 Peptidase_S9 superfamily N - Prolyl oligopeptidase family; Prolyl oligopeptidase family. Q#10411 - CGI_10003221 superfamily 243035 49 146 1.57E-09 51.4666 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. 
This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. 
In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#10412 - CGI_10003369 superfamily 243269 48 418 3.60E-67 222.552 cl03012 Ammonium_transp superfamily - - Ammonium Transporter Family; Ammonium Transporter Family. Q#10413 - CGI_10003370 superfamily 243072 3 75 4.23E-09 50.4598 cl02529 ANK superfamily NC - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#10414 - CGI_10004085 superfamily 149199 10 333 0 516.223 cl06843 DUF1693 superfamily - - Domain of unknown function (DUF1693); This family contains many hypothetical proteins. It also includes four nematode prion-like proteins. This domain has been identified as part of the nucleotidyltransferase superfamily. Q#10415 - CGI_10004086 superfamily 241578 850 951 0.000511307 41.123 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. 
The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#10416 - CGI_10003670 superfamily 247896 9 63 7.56E-14 66.1834 cl17342 Pyruvate_Kinase superfamily N - "Pyruvate kinase (PK): Large allosteric enzyme that regulates glycolysis through binding of the substrate, phosphoenolpyruvate, and one or more allosteric effectors. Like other allosteric enzymes, PK has a high substrate affinity R state and a low affinity T state. PK exists as several different isozymes, depending on organism and tissue type. In mammals, there are four PK isozymes: R, found in red blood cells, L, found in liver, M1, found in skeletal muscle, and M2, found in kidney, adipose tissue, and lung. PK forms a homotetramer, with each subunit containing three domains. The T state to R state transition of PK is more complex than in most allosteric enzymes, involving a concerted rotation of all 3 domains of each monomer in the homotetramer." Q#10417 - CGI_10003671 superfamily 247896 1 461 0 701.763 cl17342 Pyruvate_Kinase superfamily - - "Pyruvate kinase (PK): Large allosteric enzyme that regulates glycolysis through binding of the substrate, phosphoenolpyruvate, and one or more allosteric effectors. Like other allosteric enzymes, PK has a high substrate affinity R state and a low affinity T state. 
PK exists as several different isozymes, depending on organism and tissue type. In mammals, there are four PK isozymes: R, found in red blood cells, L, found in liver, M1, found in skeletal muscle, and M2, found in kidney, adipose tissue, and lung. PK forms a homotetramer, with each subunit containing three domains. The T state to R state transition of PK is more complex than in most allosteric enzymes, involving a concerted rotation of all 3 domains of each monomer in the homotetramer." Q#10421 - CGI_10006375 superfamily 247804 468 515 3.36E-08 50.2666 cl17250 SANT superfamily - - "'SWI3, ADA2, N-CoR and TFIIIB' DNA-binding domains. Tandem copies of the domain bind telomeric DNA tandem repeats as part of the capping complex. Binding is sequence dependent for repeats which contain the G/C rich motif [C2-3 A (CA)1-6]. The domain is also found in regulatory transcriptional repressor complexes where it also binds DNA." Q#10421 - CGI_10006375 superfamily 243077 23 67 4.65E-06 44.4436 cl02542 DnaJ superfamily N - "DnaJ domain or J-domain. DnaJ/Hsp40 (heat shock protein 40) proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperone family. Hsp40 proteins are characterized by the presence of a J domain, which mediates the interaction with Hsp70. They may contain other domains as well, and the architectures provide a means of classification." Q#10423 - CGI_10006377 superfamily 248097 3 124 1.55E-13 62.2826 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#10424 - CGI_10006378 superfamily 247057 24 105 1.16E-39 128.718 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. 
This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#10425 - CGI_10006379 superfamily 247792 18 70 6.81E-07 47.0552 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#10426 - CGI_10006380 superfamily 248097 1 71 0.000980196 34.163 cl17543 C1q superfamily N - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#10427 - CGI_10006381 superfamily 247057 1085 1166 2.28E-38 139.504 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. 
Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#10427 - CGI_10006381 superfamily 221437 819 932 3.40E-33 125.93 cl13561 DUF3588 superfamily - - "Protein of unknown function (DUF3588); This family of proteins is found in eukaryotes. Proteins in this family are typically between 129 and 866 amino acids in length, and the family is found in association with pfam02820. The exact function of this family is not known." Q#10427 - CGI_10006381 superfamily 248259 352 449 3.17E-32 122.358 cl17705 MBT superfamily - - "mbt repeat; The function of this repeat is unknown, but is found in a number of nuclear proteins such as drosophila sex comb on midleg protein. The repeat is found in up to four copies. The repeat contains a completely conserved glutamate at its amino terminus that may be important for function." Q#10427 - CGI_10006381 superfamily 248259 687 779 4.69E-23 96.1646 cl17705 MBT superfamily - - "mbt repeat; The function of this repeat is unknown, but is found in a number of nuclear proteins such as drosophila sex comb on midleg protein. The repeat is found in up to four copies. The repeat contains a completely conserved glutamate at its amino terminus that may be important for function." Q#10427 - CGI_10006381 superfamily 248259 580 676 3.84E-20 87.6902 cl17705 MBT superfamily - - "mbt repeat; The function of this repeat is unknown, but is found in a number of nuclear proteins such as drosophila sex comb on midleg protein. The repeat is found in up to four copies. 
The repeat contains a completely conserved glutamate at its amino terminus that may be important for function." Q#10427 - CGI_10006381 superfamily 248259 466 559 3.97E-20 87.6902 cl17705 MBT superfamily - - "mbt repeat; The function of this repeat is unknown, but is found in a number of nuclear proteins such as drosophila sex comb on midleg protein. The repeat is found in up to four copies. The repeat contains a completely conserved glutamate at its amino terminus that may be important for function." Q#10428 - CGI_10006382 superfamily 221559 103 225 5.57E-43 145.072 cl13789 DUF3661 superfamily - - "Vacuolar membrane protein; This domain family is found in eukaryotes, and is typically between 123 and 138 amino acids in length." Q#10432 - CGI_10006386 superfamily 246918 213 265 6.22E-12 61.0635 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#10432 - CGI_10006386 superfamily 246918 270 323 4.13E-11 58.7523 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#10432 - CGI_10006386 superfamily 246918 383 437 5.37E-07 46.8111 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#10432 - CGI_10006386 superfamily 246918 156 207 1.88E-06 45.2703 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#10435 - CGI_10005557 superfamily 245055 30 380 3.24E-101 316.437 cl09326 MATE_like superfamily - - "Multidrug and toxic compound extrusion family and similar proteins; The integral membrane proteins from the MATE family are involved in exporting metabolites across the cell membrane and are responsible for multidrug resistance (MDR) in many bacteria and animals. MATE has also been identified as a large multigene family in plants, where the proteins are linked to disease resistance. A number of family members are involved in the synthesis of peptidoglycan components in bacteria." 
Q#10435 - CGI_10005557 superfamily 245055 483 559 2.29E-05 45.6414 cl09326 MATE_like superfamily NC - "Multidrug and toxic compound extrusion family and similar proteins; The integral membrane proteins from the MATE family are involved in exporting metabolites across the cell membrane and are responsible for multidrug resistance (MDR) in many bacteria and animals. MATE has also been identified as a large multigene family in plants, where the proteins are linked to disease resistance. A number of family members are involved in the synthesis of peptidoglycan components in bacteria." Q#10436 - CGI_10005558 superfamily 245206 30 208 3.65E-39 139.353 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#10436 - CGI_10005558 superfamily 245206 206 329 5.30E-34 125.486 cl09931 NADB_Rossmann superfamily N - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. 
NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#10439 - CGI_10005561 superfamily 241740 2 104 3.96E-37 123.991 cl00269 cytidine_deaminase-like superfamily - - "Cytidine and deoxycytidylate deaminase zinc-binding region. The family contains cytidine deaminases, nucleoside deaminases, deoxycytidylate deaminases and riboflavin deaminases. Also included are the apoBec family of mRNA editing enzymes. All members are Zn dependent. The zinc ion in the active site plays a central role in the proposed catalytic mechanism, activating a water molecule to form a hydroxide ion that performs a nucleophilic attack on the substrate." Q#10442 - CGI_10005564 superfamily 247057 147 214 1.89E-40 134.452 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. 
Mutations in SAM domains have been linked to several diseases." Q#10444 - CGI_10012005 superfamily 242203 32 204 3.54E-33 123.907 cl00935 Brix superfamily - - Brix domain; Brix domain. Q#10446 - CGI_10012007 superfamily 241563 60 102 0.00020745 39.3848 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#10446 - CGI_10012007 superfamily 110440 490 516 0.00703075 34.6909 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#10448 - CGI_10012009 superfamily 245864 74 533 2.30E-103 330.78 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. 
These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#10448 - CGI_10012009 superfamily 245864 541 888 1.08E-52 190.567 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#10449 - CGI_10012010 superfamily 241640 206 441 9.12E-91 277.62 cl00149 Tryp_SPc superfamily - - Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. 
Alignment also contains inactive enzymes that have substitutions of the catalytic triad residues. Q#10451 - CGI_10012012 superfamily 219237 53 150 3.90E-30 114.09 cl06140 PLA2G12 superfamily N - Group XII secretory phospholipase A2 precursor (PLA2G12); This family consists of several group XII secretory phospholipase A2 precursor (PLA2G12) (EC:3.1.1.4) proteins. Group XII and group V PLA(2)s are thought to participate in helper T cell immune response through release of immediate second signals and generation of downstream eicosanoids. Q#10451 - CGI_10012012 superfamily 219237 181 263 3.27E-26 102.919 cl06140 PLA2G12 superfamily N - Group XII secretory phospholipase A2 precursor (PLA2G12); This family consists of several group XII secretory phospholipase A2 precursor (PLA2G12) (EC:3.1.1.4) proteins. Group XII and group V PLA(2)s are thought to participate in helper T cell immune response through release of immediate second signals and generation of downstream eicosanoids. Q#10452 - CGI_10012013 superfamily 241563 60 102 6.71E-06 43.622 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#10452 - CGI_10012013 superfamily 110440 527 554 0.00392359 35.4613 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." 
Q#10453 - CGI_10012014 superfamily 245864 25 445 5.12E-97 311.135 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#10453 - CGI_10012014 superfamily 243131 702 767 0.000301149 39.6463 cl02660 zf-TAZ superfamily - - "TAZ zinc finger; The TAZ2 domain of CBP binds to other transcription factors such as the p53 tumour suppressor protein, E1A oncoprotein, MyoD, and GATA-1. The zinc coordinating motif that is necessary for binding to target DNA sequences consists of HCCC." Q#10454 - CGI_10012015 superfamily 245201 78 244 1.45E-23 96.1517 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. 
These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#10455 - CGI_10012016 superfamily 244083 668 784 1.27E-28 111.573 cl05417 PLA2_like superfamily - - "PLA2_like: Phospholipase A2, a super-family of secretory and cytosolic enzymes; the latter are either Ca dependent or Ca independent. PLA2 cleaves the sn-2 position of the glycerol backbone of phospholipids (PC or phosphatidylethanolamine), usually in a metal-dependent reaction, to generate lysophospholipid (LysoPL) and a free fatty acid (FA). The resulting products are either dietary or used in synthetic pathways for leukotrienes and prostaglandins. Often, arachidonic acid is released as a free fatty acid and acts as second messenger in signaling networks. Secreted PLA2s have also been found to specifically bind to a variety of soluble and membrane proteins in mammals, including receptors. As a toxin, PLA2 is a potent presynaptic neurotoxin which blocks nerve terminals by binding to the nerve membrane and hydrolyzing stable membrane lipids. The products of the hydrolysis (LysoPL and FA) cannot form bilayers leading to a change in membrane conformation and ultimately to a block in the release of neurotransmitters. PLA2 may form dimers or oligomers." Q#10455 - CGI_10012016 superfamily 241958 275 537 3.36E-88 285.949 cl00573 SDF superfamily N - Sodium:dicarboxylate symporter family; Sodium:dicarboxylate symporter family. Q#10455 - CGI_10012016 superfamily 241958 58 197 1.46E-26 111.839 cl00573 SDF superfamily C - Sodium:dicarboxylate symporter family; Sodium:dicarboxylate symporter family. 
Q#10456 - CGI_10012017 superfamily 247769 499 639 1.96E-07 50.4157 cl17215 HDc superfamily - - Metal dependent phosphohydrolases with conserved 'HD' motif Q#10456 - CGI_10012017 superfamily 242093 768 816 0.00926424 35.4661 cl00788 MttA_Hcf106 superfamily N - mttA/Hcf106 family; Members of this protein family are involved in a sec independent translocation mechanism. This pathway has been called the DeltapH pathway in chloroplasts. Members of this family in E.coli are involved in export of redox proteins with a "twin arginine" leader motif. Q#10458 - CGI_10012019 superfamily 244539 17 142 6.74E-18 82.88 cl06868 FNR_like superfamily N - "Ferredoxin reductase (FNR), an FAD and NAD(P) binding protein, was initially identified as a chloroplast reductase activity, catalyzing the electron transfer from reduced iron-sulfur protein ferredoxin to NADP+ as the final step in the electron transport mechanism of photosystem I. FNR transfers electrons from reduced ferredoxin to FAD (forming FADH2 via a semiquinone intermediate) and then transfers a hydride ion to convert NADP+ to NADPH. FNR has since been shown to utilize a variety of electron acceptors and donors and has a variety of physiological functions including nitrogen assimilation, dinitrogen fixation, steroid hydroxylation, fatty acid metabolism, oxygenase activity, and methane assimilation in many organisms. FNR has an NAD(P)-binding sub-domain of the alpha/beta class and a discrete (usually N-terminal) flavin sub-domain which vary in orientation with respect to the NAD(P) binding domain. The N-terminal moiety may contain a flavin prosthetic group (as in flavoenzymes) or use flavin as a substrate. Because flavins such as FAD can exist in oxidized, semiquinone (one-electron reduced), or fully reduced hydroquinone forms, FNR can interact with one and 2 electron carriers. FNR has a strong preference for NADP(H) vs NAD(H)."
Q#10458 - CGI_10012019 superfamily 247683 625 677 1.91E-16 74.7614 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#10458 - CGI_10012019 superfamily 202162 305 341 3.30E-14 67.8208 cl03489 HS1_rep superfamily - - Repeat in HS1/Cortactin; The function of this repeat is unknown. Seven copies are found in cortactin and four copies are found in HS1. The repeats are always found amino terminal to an SH3 domain pfam00018. Q#10458 - CGI_10012019 superfamily 202162 268 304 2.75E-13 65.5096 cl03489 HS1_rep superfamily - - Repeat in HS1/Cortactin; The function of this repeat is unknown. Seven copies are found in cortactin and four copies are found in HS1. The repeats are always found amino terminal to an SH3 domain pfam00018. Q#10458 - CGI_10012019 superfamily 202162 342 378 4.84E-13 64.7392 cl03489 HS1_rep superfamily - - Repeat in HS1/Cortactin; The function of this repeat is unknown. Seven copies are found in cortactin and four copies are found in HS1. The repeats are always found amino terminal to an SH3 domain pfam00018. 
Q#10458 - CGI_10012019 superfamily 202162 379 415 7.82E-12 61.2724 cl03489 HS1_rep superfamily - - Repeat in HS1/Cortactin; The function of this repeat is unknown. Seven copies are found in cortactin and four copies are found in HS1. The repeats are always found amino terminal to an SH3 domain pfam00018. Q#10458 - CGI_10012019 superfamily 202162 231 267 1.06E-10 57.8056 cl03489 HS1_rep superfamily - - Repeat in HS1/Cortactin; The function of this repeat is unknown. Seven copies are found in cortactin and four copies are found in HS1. The repeats are always found amino terminal to an SH3 domain pfam00018. Q#10458 - CGI_10012019 superfamily 202162 416 452 4.60E-10 56.2648 cl03489 HS1_rep superfamily - - Repeat in HS1/Cortactin; The function of this repeat is unknown. Seven copies are found in cortactin and four copies are found in HS1. The repeats are always found amino terminal to an SH3 domain pfam00018. Q#10460 - CGI_10007158 superfamily 241754 230 552 8.16E-170 511.804 cl00286 Motor_domain superfamily - - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. Q#10460 - CGI_10007158 superfamily 247057 33 75 7.21E-05 42.2844 cl15755 SAM_superfamily superfamily C - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. 
Mutations in SAM domains have been linked to several diseases." Q#10461 - CGI_10007159 superfamily 241563 51 87 7.23E-07 46.7036 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#10461 - CGI_10007159 superfamily 110440 472 497 0.00387898 35.4613 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#10463 - CGI_10007161 superfamily 241563 388 428 7.69E-08 50.1704 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#10463 - CGI_10007161 superfamily 128778 7 117 0.00586916 36.4739 cl17972 BBC superfamily - - B-Box C-terminal domain; Coiled coil region C-terminal to (some) B-Box domains Q#10463 - CGI_10007161 superfamily 110440 808 833 0.00796654 35.4613 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400.
The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#10466 - CGI_10007165 superfamily 216981 157 334 2.66E-07 49.8386 cl17087 OTU superfamily - - "OTU-like cysteine protease; This family is comprised of a group of predicted cysteine proteases, homologous to the Ovarian Tumour (OTU) gene in Drosophila. Members include proteins from eukaryotes, viruses and pathogenic bacterium. The conserved cysteine and histidine, and possibly the aspartate, represent the catalytic residues in this putative group of proteases." Q#10466 - CGI_10007165 superfamily 209366 769 791 2.21E-05 42.5734 cl11604 zf-A20 superfamily - - A20-like zinc finger; The A20 Zn-finger of bovine/human Rabex5/rabGEF1 is a Ubiquitin Binding Domain. The zinc finger mediates self-association in A20. These fingers also mediate IL-1-induced NF-kappa B activation. Q#10468 - CGI_10007167 superfamily 243058 326 421 5.79E-06 44.6128 cl02500 ARM superfamily C - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model."
Q#10468 - CGI_10007167 superfamily 243058 160 276 3.32E-05 42.3016 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#10470 - CGI_10025805 superfamily 247068 563 658 2.23E-27 109.325 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10470 - CGI_10025805 superfamily 247068 762 856 2.45E-18 83.1317 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10470 - CGI_10025805 superfamily 247068 1106 1207 5.10E-16 76.5833 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10470 - CGI_10025805 superfamily 247068 1339 1433 1.03E-14 72.7313 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10470 - CGI_10025805 superfamily 247068 1215 1313 9.04E-14 70.0349 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10470 - CGI_10025805 superfamily 247068 1449 1571 1.26E-13 69.6497 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10470 - CGI_10025805 superfamily 247068 996 1097 6.81E-12 64.2569 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10470 - CGI_10025805 superfamily 247068 323 432 2.45E-11 62.7161 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10470 - CGI_10025805 superfamily 247068 441 555 1.88E-10 60.0197 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10470 - CGI_10025805 superfamily 247068 866 977 4.34E-09 56.1678 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10470 - CGI_10025805 superfamily 247068 194 314 2.83E-08 53.8566 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10470 - CGI_10025805 superfamily 247068 667 732 1.98E-06 48.0786 cl15786 CA_like superfamily C - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10470 - CGI_10025805 superfamily 247068 93 185 0.000156739 42.3006 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10470 - CGI_10025805 superfamily 247068 1592 1681 0.000771121 39.9724 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10471 - CGI_10025806 superfamily 247068 1049 1148 3.38E-21 91.6061 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10471 - CGI_10025806 superfamily 247068 721 818 5.58E-18 81.9761 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10471 - CGI_10025806 superfamily 247068 1268 1367 1.14E-17 81.2057 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10471 - CGI_10025806 superfamily 247068 949 1040 1.83E-17 80.8205 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10471 - CGI_10025806 superfamily 247068 1156 1250 3.13E-16 76.9685 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10471 - CGI_10025806 superfamily 247068 504 596 2.47E-14 71.5757 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10471 - CGI_10025806 superfamily 247068 388 495 3.55E-14 71.1905 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10471 - CGI_10025806 superfamily 247068 1387 1463 6.89E-14 70.0349 cl15786 CA_like superfamily C - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10471 - CGI_10025806 superfamily 247068 618 712 1.39E-13 69.2645 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10471 - CGI_10025806 superfamily 247068 58 145 5.31E-12 64.6421 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10471 - CGI_10025806 superfamily 247068 276 378 4.11E-09 56.1678 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10471 - CGI_10025806 superfamily 247068 828 939 1.03E-06 48.849 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10471 - CGI_10025806 superfamily 247068 155 262 4.47E-05 43.8414 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10472 - CGI_10025807 superfamily 247724 127 335 3.82E-136 399.358 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. 
The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#10472 - CGI_10025807 superfamily 243184 450 536 1.43E-46 159.33 cl02786 Translation_factor_III superfamily - - "Domain III of Elongation factor (EF) Tu (EF-TU) and EF-G. Elongation factors (EF) EF-Tu and EF-G participate in the elongation phase during protein biosynthesis on the ribosome. Their functional cycles depend on GTP binding and its hydrolysis. The EF-Tu complexed with GTP and aminoacyl-tRNA delivers tRNA to the ribosome, whereas EF-G stimulates translocation, a process in which tRNA and mRNA movements occur in the ribosome. 
Experimental data showed that: (1) intrinsic GTPase activity of EF-G is influenced by excision of its domain III; (2) that EF-G lacking domain III has a 1,000-fold decreased GTPase activity on the ribosome and, a slightly decreased affinity for GTP; and (3) EF-G lacking domain III does not stimulate translocation, despite the physical presence of domain IV which is also very important for translocation. These findings indicate an essential contribution of domain III to activation of GTP hydrolysis. Domains III and V of EF-G have the same fold (although they are not completely superimposable), the double split beta-alpha-beta fold. This fold is observed in a large number of ribonucleotide binding proteins and is also referred to as the ribonucleoprotein (RNP) or RNA recognition (RRM) motif. This domain III is found in several elongation factors, as well as in peptide chain release factors and in GT-1 family of GTPase (GTPBP1)." Q#10472 - CGI_10025807 superfamily 243185 358 444 1.11E-29 112.698 cl02787 Translation_Factor_II_like superfamily - - "Translation_Factor_II_like: Elongation factor Tu (EF-Tu) domain II-like proteins. Elongation factor Tu consists of three structural domains, this family represents the second domain. Domain II adopts a beta barrel structure and is involved in binding to charged tRNA. Domain II is found in other proteins such as elongation factor G and translation initiation factor IF-2. This group also includes the C2 subdomain of domain IV of IF-2 that has the same fold as domain II of (EF-Tu). Like IF-2 from certain prokaryotes such as Thermus thermophilus, mitochondrial IF-2 lacks domain II, which is thought to be involved in binding of E.coli IF-2 to 30S subunits." Q#10473 - CGI_10025808 superfamily 241782 15 438 7.10E-98 302.18 cl00321 AAT_I superfamily - - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. 
PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phosphorylase family (fold type V)." Q#10474 - CGI_10025809 superfamily 243050 452 504 8.56E-26 99.817 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. 
The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#10474 - CGI_10025809 superfamily 241622 10 88 7.30E-17 75.681 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#10474 - CGI_10025809 superfamily 243050 332 382 1.46E-22 90.9784 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. 
They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#10474 - CGI_10025809 superfamily 243050 393 444 3.43E-18 78.6736 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#10475 - CGI_10025810 superfamily 241622 13 93 2.48E-20 83.385 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. 
Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#10475 - CGI_10025810 superfamily 243050 255 306 6.66E-21 83.9611 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#10475 - CGI_10025810 superfamily 128974 172 196 4.89E-08 48.368 cl00302 ZM superfamily - - "ZASP-like motif; Short motif (26 amino acids) present in an alpha-actinin-binding protein, ZASP, and similar molecules." 
Q#10477 - CGI_10025812 superfamily 241640 629 856 2.04E-88 282.243 cl00149 Tryp_SPc superfamily - - Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues. Q#10477 - CGI_10025812 superfamily 241571 452 568 9.70E-30 115.202 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#10477 - CGI_10025812 superfamily 241571 324 438 1.36E-29 114.817 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#10477 - CGI_10025812 superfamily 241571 24 139 1.13E-26 106.343 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#10477 - CGI_10025812 superfamily 241571 190 305 3.24E-25 102.105 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." 
Q#10477 - CGI_10025812 superfamily 241613 575 609 5.96E-12 61.8389 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#10479 - CGI_10025814 superfamily 241599 51 89 3.89E-09 50.7049 cl00084 homeodomain superfamily C - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#10482 - CGI_10025817 superfamily 241547 69 273 1.05E-71 223.699 cl00012 alpha_CA superfamily - - "Carbonic anhydrase alpha (vertebrate-like) group. Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionary distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. 
Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidine residues and a fourth conserved histidine plays a potential role in proton transfer." Q#10483 - CGI_10025818 superfamily 245201 1 321 0 634.229 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#10484 - CGI_10025819 superfamily 245814 237 297 0.000345379 39.5799 cl11960 Ig superfamily C - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#10484 - CGI_10025819 superfamily 245814 128 202 0.00102672 38.1759 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. 
A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#10484 - CGI_10025819 superfamily 245814 22 93 0.00250203 36.7145 cl11960 Ig superfamily C - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#10485 - CGI_10025820 superfamily 216686 1 174 2.59E-39 135.914 cl18377 Galactosyl_T superfamily - - "Galactosyltransferase; This family includes the galactosyltransferases UDP-galactose:2-acetamido-2-deoxy-D-glucose3beta-galactosyltransferase and UDP-Gal:beta-GlcNAc beta 1,3-galactosyltransferase. Specific galactosyltransferases transfer galactose to GlcNAc terminal chains in the synthesis of the lacto-series oligosaccharides types 1 and 2." Q#10486 - CGI_10025821 superfamily 216686 121 307 7.86E-35 127.824 cl18377 Galactosyl_T superfamily - - "Galactosyltransferase; This family includes the galactosyltransferases UDP-galactose:2-acetamido-2-deoxy-D-glucose3beta-galactosyltransferase and UDP-Gal:beta-GlcNAc beta 1,3-galactosyltransferase. Specific galactosyltransferases transfer galactose to GlcNAc terminal chains in the synthesis of the lacto-series oligosaccharides types 1 and 2." 
Q#10487 - CGI_10025822 superfamily 220015 31 118 3.38E-19 77.3734 cl07407 RPA_C superfamily - - "Replication protein A C terminal; This domain corresponds to the C terminal of the single stranded DNA binding protein RPA (replication protein A). 
RPA is involved in many DNA metabolic pathways including DNA replication, DNA repair, recombination, cell cycle and DNA damage checkpoints." Q#10488 - CGI_10025823 superfamily 247792 13 56 7.53E-08 50.1368 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#10488 - CGI_10025823 superfamily 243035 713 779 3.45E-06 46.0738 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#10489 - CGI_10025824 superfamily 242906 63 143 2.62E-25 96.5397 cl02153 TFIIE_beta_winged_helix superfamily - - "TFIIE_beta_winged_helix domain, located at the central core region of TFIIE beta, with double-stranded DNA binding activity; Transcription Factor IIE (TFIIE) beta winged-helix (or forkhead) domain is located at the central core region of TFIIE beta. The winged-helix is a form of helix-turn-helix (HTH) domain which typically binds DNA with the 3rd helix. The winged-helix domain is distinguished by the presence of a C-terminal beta-strand hairpin unit (the wing) that packs against the cleft of the tri-helical core. Although most winged-helix domains are multi-member families, TFIIE beta winged-helix domain is typically found as a single orthologous group. TFIIE is one of the six eukaryotic general transcription factors (TFIIA, TFIIB, TFIID, TFIIE, TFIIF and TFIIH) that are required for transcription initiation of protein-coding genes. TFIIE is a heterotetramer consisting of two copies each of alpha and beta subunits. TFIIE beta contains several functional domains, an N-terminal serine-rich region, a central core domain exhibiting a winged-helix structure capable of binding double-stranded DNA, a leucine repeat, a sigma3 region, and a C-terminal domain containing two basic regions. The assembly of transcription preinitiation complex (PIC) includes the general transcription factors and RNA polymerase II (pol II) initiated by the binding of the TBP subunit of TFIID to the TATA box, followed by either the sequential assembly of other general transcription factors and pol II or a preassembled pol II holoenzyme pathway. TFIIE interacts directly with TFIIF, TFIIB, pol II, and promoter DNA. TFIIE recruits TFIIH and regulates its activities. TFIIE and TFIIH are also important for the transition from initiation to elongation." 
Q#10491 - CGI_10025826 superfamily 247724 122 338 4.14E-12 65.6457 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#10491 - CGI_10025826 superfamily 245201 719 913 0.00149818 39.9126 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." 
Q#10492 - CGI_10025827 superfamily 115057 1616 1952 7.36E-82 283.11 cl05724 BLVR superfamily N - "Bovine leukaemia virus receptor (BLVR); This family consists of several bovine specific leukaemia virus receptors which are thought to function as transmembrane proteins, although their exact function is unknown." Q#10492 - CGI_10025827 superfamily 115057 988 1128 2.98E-38 152.142 cl05724 BLVR superfamily N - "Bovine leukaemia virus receptor (BLVR); This family consists of several bovine specific leukaemia virus receptors which are thought to function as transmembrane proteins, although their exact function is unknown." Q#10492 - CGI_10025827 superfamily 115057 710 819 3.41E-18 89.3541 cl05724 BLVR superfamily C - "Bovine leukaemia virus receptor (BLVR); This family consists of several bovine specific leukaemia virus receptors which are thought to function as transmembrane proteins, although their exact function is unknown." Q#10492 - CGI_10025827 superfamily 115057 1423 1532 3.41E-18 89.3541 cl05724 BLVR superfamily C - "Bovine leukaemia virus receptor (BLVR); This family consists of several bovine specific leukaemia virus receptors which are thought to function as transmembrane proteins, although their exact function is unknown." Q#10494 - CGI_10025830 superfamily 241576 61 169 2.30E-22 93.2115 cl00055 MH1 superfamily - - "N-terminal Mad Homology 1 (MH1) domain; The MH1 is a small DNA-binding domain present in SMAD (small mothers against decapentaplegic) family of proteins, which are signal transducers and transcriptional modulators that mediate multiple signaling pathways. MH1 binds to the DNA major groove in an unusual manner via a beta hairpin structure. It negatively regulates the functions of the MH2 domain, the C-terminal domain of SMAD. Receptor-regulated SMAD proteins (R-SMADs, including SMAD1, SMAD2, SMAD3, SMAD5, and SMAD9) are activated by phosphorylation by transforming growth factor (TGF)-beta type I receptors. 
The active R-SMAD associates with a common mediator SMAD (Co-SMAD or SMAD4) and other cofactors, which together translocate to the nucleus to regulate gene expression. The inhibitory or antagonistic SMADs (I-SMADs, including SMAD6 and SMAD7) negatively regulate TGF-beta signaling by competing with R-SMADs for type I receptor or Co-SMADs. MH1 domains of R-SMAD and SMAD4 contain a nuclear localization signal as well as DNA-binding activity. The activated R-SMAD/SMAD4 complex then binds with very low affinity to a DNA sequence CAGAC called SMAD-binding element (SBE) via the MH1 domain." Q#10494 - CGI_10025830 superfamily 151076 2 40 8.15E-19 81.4988 cl11159 NfI_DNAbd_pre-N superfamily - - "Nuclear factor I protein pre-N-terminus; The Nuclear factor I (NFI) family of site-specific DNA-binding proteins (also known as CTF or CAAT box transcription factor) functions both in viral DNA replication and in the regulation of gene expression in higher organisms. The N-terminal 200 residues contains the DNA-binding and dimerisation domain, but also has an 8-47 residue highly conserved region 5' of this, whose function is not known. Deletion of the N-terminal 200 amino acids removes the DNA-binding activity, dimerisation-ability and the stimulation of adenovirus DNA replication." Q#10495 - CGI_10025831 superfamily 245201 528 717 2.12E-39 146.896 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." 
Q#10496 - CGI_10025832 superfamily 246925 168 283 1.60E-05 46.1946 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#10498 - CGI_10025834 superfamily 241563 60 100 0.00522369 35.918 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#10499 - CGI_10025835 superfamily 221566 8 339 1.10E-121 377.458 cl13804 DUF3668 superfamily - - Cep120 protein; This family includes the Cep120 protein which is associated with centriole structure and function. Q#10500 - CGI_10025836 superfamily 245201 46 239 2.43E-32 122.345 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#10501 - CGI_10025837 superfamily 245213 468 502 8.84E-05 41.4682 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#10501 - CGI_10025837 superfamily 147730 889 1089 1.35E-123 379.834 cl05347 TSP_C superfamily - - Thrombospondin C-terminal region; This region is found at the C-terminus of thrombospondin and related proteins. Q#10501 - CGI_10025837 superfamily 202235 642 676 1.58E-09 55.4691 cl15981 TSP_3 superfamily - - Thrombospondin type 3 repeat; The thrombospondin repeat is a short aspartate rich repeat which binds to calcium ions. The repeat was initially identified in thrombospondin proteins that contained 7 of these repeats. The repeat lacks defined secondary structure. Q#10501 - CGI_10025837 superfamily 202235 763 799 4.48E-05 42.3723 cl15981 TSP_3 superfamily - - Thrombospondin type 3 repeat; The thrombospondin repeat is a short aspartate rich repeat which binds to calcium ions. The repeat was initially identified in thrombospondin proteins that contained 7 of these repeats. The repeat lacks defined secondary structure. Q#10501 - CGI_10025837 superfamily 202235 837 871 0.000699869 38.9055 cl15981 TSP_3 superfamily - - Thrombospondin type 3 repeat; The thrombospondin repeat is a short aspartate rich repeat which binds to calcium ions. The repeat was initially identified in thrombospondin proteins that contained 7 of these repeats. The repeat lacks defined secondary structure. Q#10501 - CGI_10025837 superfamily 152034 203 248 0.00140671 38.1041 cl13107 COMP superfamily - - Cartilage oligomeric matrix protein; This family of proteins represents the five-stranded coiled-coil domain of cartilage oligomeric matrix protein (COMP). 
This region has a binding site between two internal rings formed by Leu37 and Thr40 Q#10501 - CGI_10025837 superfamily 205157 578 606 0.00271828 37.1319 cl18264 EGF_3 superfamily - - EGF domain; This family includes a variety of EGF-like domain homologues. This family includes the C-terminal domain of the malaria parasite MSP1 protein. Q#10502 - CGI_10025838 superfamily 241572 153 242 2.21E-14 68.034 cl00050 CYCLIN superfamily - - "Cyclin box fold. Protein binding domain functioning in cell-cycle and transcription control. Present in cyclins, TFIIB and Retinoblastoma (RB). The cyclins consist of 8 classes of cell cycle regulators that regulate cyclin dependent kinases (CDKs). TFIIB is a transcription factor that binds the TATA box. Cyclins, TFIIB and RB contain 2 copies of the domain." Q#10502 - CGI_10025838 superfamily 241572 252 316 2.95E-06 44.9262 cl00050 CYCLIN superfamily C - "Cyclin box fold. Protein binding domain functioning in cell-cycle and transcription control. Present in cyclins, TFIIB and Retinoblastoma (RB). The cyclins consist of 8 classes of cell cycle regulators that regulate cyclin dependent kinases (CDKs). TFIIB is a transcription factor that binds the TATA box. Cyclins, TFIIB and RB contain 2 copies of the domain." Q#10503 - CGI_10025839 superfamily 245847 22 154 3.79E-32 114.757 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#10506 - CGI_10025843 superfamily 199156 114 130 0.00830597 31.2705 cl15298 zf-CCHC superfamily - - "Zinc knuckle; The zinc knuckle is a zinc binding motif composed of the following CX2CX4HX4C where X can be any amino acid. The motifs are mostly from retroviral gag proteins (nucleocapsid). Prototype structure is from HIV. Also contains members involved in eukaryotic gene regulation, such as C. elegans GLH-1. 
Structure is an 18-residue zinc finger." Q#10508 - CGI_10025845 superfamily 247805 33 181 6.66E-10 54.2656 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#10510 - CGI_10025847 superfamily 243035 42 178 1.99E-26 105.394 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. 
C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#10510 - CGI_10025847 superfamily 243035 383 517 3.85E-26 104.239 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#10510 - CGI_10025847 superfamily 243035 202 335 1.29E-25 103.083 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#10511 - CGI_10025848 superfamily 243035 38 187 1.14E-24 94.9941 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. 
CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#10512 - CGI_10025849 superfamily 241645 5 116 3.98E-69 205.162 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. 
Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#10513 - CGI_10025850 superfamily 247858 564 728 6.28E-22 93.6066 cl17304 2OG-FeII_Oxy_3 superfamily - - 2OG-Fe(II) oxygenase superfamily; This family contains members of the 2-oxoglutarate (2OG) and Fe(II)-dependent oxygenase superfamily. Q#10515 - CGI_10025852 superfamily 245201 417 708 0 591.506 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#10518 - CGI_10025856 superfamily 247755 617 835 8.73E-143 425.637 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. 
ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#10518 - CGI_10025856 superfamily 248376 288 592 1.14E-52 186.075 cl17822 MutS_III superfamily - - "MutS domain III; This domain is found in proteins of the MutS family (DNA mismatch repair proteins) and is found associated with pfam00488, pfam05188, pfam01624 and pfam05190. The MutS family of proteins is named after the Salmonella typhimurium MutS protein involved in mismatch repair; other members of the family include the eukaryotic MSH 1, 2, 3, 4, 5 and 6 proteins. These have various roles in DNA repair and recombination. Human MSH has been implicated in non-polyposis colorectal carcinoma (HNPCC) and is a mismatch binding protein. The aligned region corresponds with domain III, which is central to the structure of Thermus aquaticus MutS as characterized in." Q#10518 - CGI_10025856 superfamily 216613 7 118 3.58E-20 87.6292 cl03286 MutS_I superfamily - - "MutS domain I; This domain is found in proteins of the MutS family (DNA mismatch repair proteins) and is found associated with pfam00488, pfam05188, pfam05192 and pfam05190. The MutS family of proteins is named after the Salmonella typhimurium MutS protein involved in mismatch repair; other members of the family include the eukaryotic MSH 1, 2, 3, 4, 5 and 6 proteins. These have various roles in DNA repair and recombination. Human MSH has been implicated in non-polyposis colorectal carcinoma (HNPCC) and is a mismatch binding protein. The aligned region corresponds with globular domain I, which is involved in DNA binding, in Thermus aquaticus MutS as characterized in." 
Q#10518 - CGI_10025856 superfamily 218486 141 267 9.78E-13 66.6157 cl04975 MutS_II superfamily - - "MutS domain II; This domain is found in proteins of the MutS family (DNA mismatch repair proteins) and is found associated with pfam00488, pfam01624, pfam05192 and pfam05190. The MutS family of proteins is named after the Salmonella typhimurium MutS protein involved in mismatch repair; other members of the family include the eukaryotic MSH 1, 2, 3, 4, 5 and 6 proteins. These have various roles in DNA repair and recombination. Human MSH has been implicated in non-polyposis colorectal carcinoma (HNPCC) and is a mismatch binding protein. This domain corresponds to domain II in Thermus aquaticus MutS as characterized in, and resembles RNAse-H-like domains (see pfam00075)." Q#10519 - CGI_10025857 superfamily 218603 386 500 6.51E-07 51.4508 cl08440 Serendipity_A superfamily NC - "Serendipity locus alpha protein (SRY-A); The Drosophila serendipity alpha (sry alpha) gene is specifically transcribed at the blastoderm stage, from nuclear cycle 11 to the onset of gastrulation, in all somatic nuclei. SRY-A is required for the cellularisation of the embryo and is involved in the localisation of the actin filaments just prior to and during plasma membrane invagination." Q#10521 - CGI_10025859 superfamily 243088 1 73 2.97E-11 58.9605 cl02563 PX_domain superfamily N - "The Phox Homology domain, a phosphoinositide binding module; The PX domain is a phosphoinositide (PI) binding module involved in targeting proteins to membranes. Proteins containing PX domains interact with PIs and have been implicated in highly diverse functions such as cell signaling, vesicular trafficking, protein sorting, lipid modification, cell polarity and division, activation of T and B cells, and cell survival. Many members of this superfamily bind phosphatidylinositol-3-phosphate (PI3P) but in some cases, other PIs such as PI4P or PI(3,4)P2, among others, are the preferred substrates. 
In addition to protein-lipid interaction, the PX domain may also be involved in protein-protein interaction, as in the cases of p40phox, p47phox, and some sorting nexins (SNXs). The PX domain is conserved from yeast to humans and is found in more than 100 proteins. The majority of PX domain-containing proteins are SNXs, which play important roles in endosomal sorting." Q#10524 - CGI_10016076 superfamily 247866 53 280 6.94E-20 85.9672 cl17312 PhyH superfamily - - "Phytanoyl-CoA dioxygenase (PhyH); This family is made up of several eukaryotic phytanoyl-CoA dioxygenase (PhyH) proteins, ectoine hydroxylases and a number of bacterial deoxygenases. PhyH is a peroxisomal enzyme catalyzing the first step of phytanic acid alpha-oxidation. PhyH deficiency causes Refsum's disease (RD) which is an inherited neurological syndrome biochemically characterized by the accumulation of phytanic acid in plasma and tissues." Q#10525 - CGI_10016077 superfamily 241643 315 352 6.53E-08 48.9947 cl00153 UBA superfamily - - "Ubiquitin Associated domain. The UBA domain is a commonly occurring sequence motif in some members of the ubiquitination pathway, UV excision repair proteins, and certain protein kinases. Although its specific role is so far unknown, it has been suggested that UBA domains are involved in conferring protein target specificity. The domain, a compact three helix bundle, has a conserved GFP-loop and the proline is thought to be critical for binding. The UBA domain is distinct from the conserved three helical domain seen in the N-terminus of EF-TS and eukaryotic NAC proteins." Q#10525 - CGI_10016077 superfamily 241643 222 258 6.32E-06 43.2167 cl00153 UBA superfamily - - "Ubiquitin Associated domain. The UBA domain is a commonly occurring sequence motif in some members of the ubiquitination pathway, UV excision repair proteins, and certain protein kinases. 
Although its specific role is so far unknown, it has been suggested that UBA domains are involved in conferring protein target specificity. The domain, a compact three helix bundle, has a conserved GFP-loop and the proline is thought to be critical for binding. The UBA domain is distinct from the conserved three helical domain seen in the N-terminus of EF-TS and eukaryotic NAC proteins." Q#10525 - CGI_10016077 superfamily 241645 17 93 0.00456736 34.9354 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#10526 - CGI_10016078 superfamily 241754 18 338 2.22E-170 511.842 cl00286 Motor_domain superfamily - - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. Q#10526 - CGI_10016078 superfamily 243114 1135 1169 0.0013178 38.5453 cl02622 Pre-SET superfamily N - Pre-SET motif; This protein motif is a zinc binding motif. It contains 9 conserved cysteines that coordinate three zinc ions. It is thought that this region plays a structural role in stabilising SET domains. 
Q#10527 - CGI_10016079 superfamily 217293 81 256 1.36E-47 163.187 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#10527 - CGI_10016079 superfamily 202474 295 378 2.47E-29 113.518 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#10530 - CGI_10016083 superfamily 241675 43 302 3.21E-136 389.349 cl00195 SIR2 superfamily - - "SIR2 superfamily of proteins includes silent information regulator 2 (Sir2) enzymes which catalyze NAD+-dependent protein/histone deacetylation, where the acetyl group from the lysine epsilon-amino group is transferred to the ADP-ribose moiety of NAD+, producing nicotinamide and the novel metabolite O-acetyl-ADP-ribose. Sir2 proteins, also known as sirtuins, are found in all eukaryotes and many archaea and prokaryotes and have been shown to regulate gene silencing, DNA repair, metabolic enzymes, and life span. The most-studied function, gene silencing, involves the inactivation of chromosome domains containing key regulatory genes by packaging them into a specialized chromatin structure that is inaccessible to DNA-binding proteins. The oligomerization state of Sir2 appears to be organism-dependent, sometimes occurring as a monomer and sometimes as a multimer. Also included in this superfamily is a group of uncharacterized Sir2-like proteins which lack certain key catalytic residues and conserved zinc binding cysteines." Q#10531 - CGI_10016084 superfamily 246938 6 183 1.01E-46 160.116 cl15371 NIF3 superfamily C - "NIF3 (NGG1p interacting factor 3); This family contains several NIF3 (NGG1p interacting factor 3) protein homologues. 
NIF3 interacts with the yeast transcriptional coactivator NGG1p which is part of the ADA complex, the exact function of this interaction is unknown." Q#10531 - CGI_10016084 superfamily 246938 193 320 2.19E-23 96.1725 cl15371 NIF3 superfamily N - "NIF3 (NGG1p interacting factor 3); This family contains several NIF3 (NGG1p interacting factor 3) protein homologues. NIF3 interacts with the yeast transcriptional coactivator NGG1p which is part of the ADA complex, the exact function of this interaction is unknown." Q#10532 - CGI_10016085 superfamily 243066 17 103 1.37E-24 94.9272 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#10533 - CGI_10016086 superfamily 243371 31 475 8.37E-99 317.028 cl03282 Pro_dh superfamily - - Proline dehydrogenase; Proline dehydrogenase. Q#10533 - CGI_10016086 superfamily 243371 565 810 3.37E-73 248.077 cl03282 Pro_dh superfamily N - Proline dehydrogenase; Proline dehydrogenase. Q#10538 - CGI_10016092 superfamily 248097 146 271 4.12E-15 69.2162 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. 
Q#10541 - CGI_10016095 superfamily 220808 19 130 2.48E-40 135.134 cl11187 MTP18 superfamily - - Mitochondrial 18 KDa protein (MTP18); This family of proteins are mitochondrial 18KDa proteins that are often misannotated as carbonic anhydrases. It was shown that knockdown of MTP18 protein results in a cytochrome c release from mitochondria and consequently leads to apoptosis. Overexpression studies suggest that MTP18 is required for mitochondrial fission. Q#10543 - CGI_10016097 superfamily 245205 455 621 1.33E-60 200.997 cl09930 RPA_2b-aaRSs_OBF_like superfamily - - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." 
Q#10543 - CGI_10016097 superfamily 245205 197 298 2.08E-51 173.587 cl09930 RPA_2b-aaRSs_OBF_like superfamily - - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." Q#10543 - CGI_10016097 superfamily 245205 326 426 1.43E-39 140.412 cl09930 RPA_2b-aaRSs_OBF_like superfamily - - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. 
RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." Q#10543 - CGI_10016097 superfamily 245205 13 87 1.15E-22 93.416 cl09930 RPA_2b-aaRSs_OBF_like superfamily N - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. 
The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." Q#10544 - CGI_10016098 superfamily 247725 210 306 6.21E-62 200.936 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#10544 - CGI_10016098 superfamily 215882 103 216 8.66E-28 107.752 cl09511 FERM_M superfamily - - FERM central domain; This domain is the central structural domain of the FERM domain. Q#10544 - CGI_10016098 superfamily 220215 18 96 5.72E-19 81.8878 cl09630 FERM_N superfamily - - FERM N-terminal domain; This domain is the N-terminal ubiquitin-like structural domain of the FERM domain. Q#10547 - CGI_10016101 superfamily 245201 219 490 4.35E-56 191.194 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." 
Q#10548 - CGI_10011881 superfamily 246902 115 247 0.00224838 36.3847 cl15239 PLDc_SF superfamily - - "Catalytic domain of phospholipase D superfamily proteins; Catalytic domain of phospholipase D (PLD) superfamily proteins. The PLD superfamily is composed of a large and diverse group of proteins including plant, mammalian and bacterial PLDs, bacterial cardiolipin (CL) synthases, bacterial phosphatidylserine synthases (PSS), eukaryotic phosphatidylglycerophosphate (PGP) synthase, eukaryotic tyrosyl-DNA phosphodiesterase 1 (Tdp1), and some bacterial endonucleases (Nuc and BfiI), among others. PLD enzymes hydrolyze phospholipid phosphodiester bonds to yield phosphatidic acid and a free polar head group. They can also catalyze the transphosphatidylation of phospholipids to acceptor alcohols. The majority of members in this superfamily contain a short conserved sequence motif (H-x-K-x(4)-D, where x represents any amino acid residue), called the HKD signature motif. There are varying expanded forms of this motif in different family members. Some members contain variant HKD motifs. Most PLD enzymes are monomeric proteins with two HKD motif-containing domains. Two HKD motifs from two domains form a single active site. Some PLD enzymes have only one copy of the HKD motif per subunit but form a functionally active dimer, which has a single active site at the dimer interface containing the two HKD motifs from both subunits. Different PLD enzymes may have evolved through domain fusion of a common catalytic core with separate substrate recognition domains. Despite their various catalytic functions and a very broad range of substrate specificities, the diverse group of PLD enzymes can bind to a phosphodiester moiety. Most of them are active as bi-lobed monomers or dimers, and may possess similar core structures for catalytic activity. 
They are generally thought to utilize a common two-step ping-pong catalytic mechanism, involving an enzyme-substrate intermediate, to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group." Q#10549 - CGI_10011882 superfamily 243092 18 216 7.50E-26 102.413 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#10550 - CGI_10011883 superfamily 241599 131 190 3.32E-12 61.4904 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#10552 - CGI_10011885 superfamily 245864 31 235 4.93E-62 204.049 cl12078 p450 superfamily N - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#10553 - CGI_10011886 superfamily 245864 125 229 8.33E-05 41.8802 cl12078 p450 superfamily C - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. 
Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#10554 - CGI_10011887 superfamily 241611 94 237 1.10E-06 46.6128 cl00102 PTX superfamily - - "Pentraxins are plasma proteins characterized by their pentameric discoid assembly and their Ca2+ dependent ligand binding, such as Serum amyloid P component (SAP) and C-reactive Protein (CRP), which are cytokine-inducible acute-phase proteins implicated in innate immunity. CRP binds to ligands containing phosphocholine, SAP binds to amyloid fibrils, DNA, chromatin, fibronectin, C4-binding proteins and glycosaminoglycans. "Long" pentraxins have N-terminal extensions to the common pentraxin domain; one group, the neuronal pentraxins, may be involved in synapse formation and remodeling, and they may also be able to form heteromultimers." Q#10554 - CGI_10011887 superfamily 241619 306 367 0.00092955 37.1732 cl00112 PAN_APPLE superfamily C - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. 
Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#10555 - CGI_10011888 superfamily 220097 25 129 0.00143326 35.0721 cl08518 Phospholip_A2_3 superfamily - - "Prokaryotic phospholipase A2; The prokaryotic phospholipase A2 domain is predominantly found in bacterial and fungal phospholipases, as well as various hypothetical and putative proteins. It enables the liberation of fatty acids and lysophospholipid by hydrolysing the 2-ester bond of 1,2-diacyl-3-sn-phosphoglycerides. The domain adopts an alpha-helical secondary structure, consisting of five alpha-helices and two helical segments." Q#10557 - CGI_10011891 superfamily 241568 885 940 0.00389603 36.672 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#10557 - CGI_10011891 superfamily 243124 95 251 4.01E-40 146.803 cl02648 NIDO superfamily - - Nidogen-like; This is a nidogen-like domain (NIDO) domain and is an extracellular domain found in nidogen and hypothetical proteins of unknown function. Q#10557 - CGI_10011891 superfamily 155088 434 562 1.09E-23 99.5926 cl02758 AMOP superfamily - - AMOP domain; This domain may have a role in cell adhesion. 
It is called the AMOP domain after Adhesion associated domain in MUC4 and Other Proteins. This domain is extracellular and contains a number of cysteines that probably form disulphide bridges. Q#10557 - CGI_10011891 superfamily 243065 605 776 0.00143117 39.3473 cl02516 VWD superfamily - - von Willebrand factor type D domain; Luciferin-2-monooxygenase from Vargula hilgendorfii contains a vwd domain. Its function is unrelated but the similarity is very strong by several methods. Q#10560 - CGI_10011894 superfamily 247058 3 160 9.51E-38 130.758 cl15762 crotonase-like superfamily - - "Crotonase/Enoyl-Coenzyme A (CoA) hydratase superfamily. This superfamily contains a diverse set of enzymes including enoyl-CoA hydratase, naphthoate synthase, methylmalonyl-CoA decarboxylase, 3-hydroxybutyryl-CoA dehydratase, and dienoyl-CoA isomerase. Many of these play important roles in fatty acid metabolism. In addition to a conserved structural core and the formation of trimers (or dimers of trimers), a common feature in this superfamily is the stabilization of an enolate anion intermediate derived from an acyl-CoA substrate. This is accomplished by two conserved backbone NH groups in active sites that form an oxyanion hole." Q#10561 - CGI_10011895 superfamily 247058 3 179 3.32E-44 148.478 cl15762 crotonase-like superfamily - - "Crotonase/Enoyl-Coenzyme A (CoA) hydratase superfamily. This superfamily contains a diverse set of enzymes including enoyl-CoA hydratase, naphthoate synthase, methylmalonyl-CoA decarboxylase, 3-hydroxybutyryl-CoA dehydratase, and dienoyl-CoA isomerase. Many of these play important roles in fatty acid metabolism. In addition to a conserved structural core and the formation of trimers (or dimers of trimers), a common feature in this superfamily is the stabilization of an enolate anion intermediate derived from an acyl-CoA substrate. This is accomplished by two conserved backbone NH groups in active sites that form an oxyanion hole." 
Q#10565 - CGI_10012957 superfamily 217062 122 331 6.04E-43 152.038 cl12266 Branch superfamily - - "Core-2/I-Branching enzyme; This is a family of two different beta-1,6-N-acetylglucosaminyltransferase enzymes, I-branching enzyme and core-2 branching enzyme. I-branching enzyme is responsible for the production of the blood group I-antigen during embryonic development. Core-2 branching enzyme forms crucial side-chain branches in O-glycans." Q#10568 - CGI_10012960 superfamily 216363 40 121 1.34E-13 62.873 cl08312 UPF0029 superfamily C - Uncharacterized protein family UPF0029; Uncharacterized protein family UPF0029. Q#10570 - CGI_10012962 superfamily 243083 87 142 1.41E-14 70.1078 cl02554 PWWP superfamily C - "The PWWP domain, named for a conserved Pro-Trp-Trp-Pro motif, is a small domain consisting of 100-150 amino acids. The PWWP domain is found in numerous proteins that are involved in cell division, growth and differentiation. Most PWWP-domain proteins seem to be nuclear, often DNA-binding, proteins that function as transcription factors regulating a variety of developmental processes. The function of the PWWP domain is still not known precisely; however, based on the fact that other regions of PWWP-domain proteins are responsible for nuclear localization and DNA-binding, it is likely that the PWWP domain acts as a site for protein-protein binding interactions, influencing chromatin remodeling and thereby regulating transcriptional processes. Some PWWP-domain proteins have been linked to cancer or other diseases; some are known to function as growth factors." Q#10570 - CGI_10012962 superfamily 219431 19 67 6.05E-13 63.9892 cl06504 zf-CW superfamily - - "CW-type Zinc Finger; This domain appears to be a zinc finger. The alignment shows four conserved cysteine residues and a conserved tryptophan. 
It was first identified by, and is predicted to be a "highly specialised mononuclear four-cysteine zinc finger...that plays a role in DNA binding and/or promoting protein-protein interactions in complicated eukaryotic processes including...chromatin methylation status and early embryonic development." Weak homology to pfam00628 further evidences these predictions (personal obs: C Yeats). Twelve different CW-domain-containing protein subfamilies are described, with different subfamilies being characteristic of vertebrates, higher plants and other animals in which these domain is found." Q#10570 - CGI_10012962 superfamily 243120 362 423 6.06E-08 50.3072 cl02633 ARID superfamily C - "ARID/BRIGHT DNA binding domain; This domain is know as ARID for AT-Rich Interaction Domain, and also known as the BRIGHT domain." Q#10573 - CGI_10012965 superfamily 246925 549 795 1.97E-15 77.0105 cl15309 LRR_RI superfamily - - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#10573 - CGI_10012965 superfamily 246925 682 911 4.09E-05 45.4242 cl15309 LRR_RI superfamily C - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. 
This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#10575 - CGI_10012967 superfamily 247044 371 456 1.79E-20 86.1492 cl15697 ADF_gelsolin superfamily - - Actin depolymerization factor/cofilin- and gelsolin-like domains; Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. Q#10575 - CGI_10012967 superfamily 247044 198 247 5.55E-18 79.5744 cl15697 ADF_gelsolin superfamily N - Actin depolymerization factor/cofilin- and gelsolin-like domains; Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. Q#10575 - CGI_10012967 superfamily 247044 261 335 1.81E-14 69.1884 cl15697 ADF_gelsolin superfamily - - Actin depolymerization factor/cofilin- and gelsolin-like domains; Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. Q#10576 - CGI_10012968 superfamily 241563 68 109 1.71E-06 45.548 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#10576 - CGI_10012968 superfamily 241563 28 59 0.000904387 37.4588 cl00034 BBOX superfamily N - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#10579 - CGI_10010711 superfamily 247725 1 89 1.34E-33 123.84 cl17171 PH-like superfamily N - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#10579 - CGI_10010711 superfamily 246675 259 333 8.62E-52 178.685 cl14615 PI-PLCc_GDPD_SF superfamily C - "Catalytic domain of phosphoinositide-specific phospholipase C-like phosphodiesterases superfamily; The PI-PLC-like phosphodiesterases superfamily represents the catalytic domains of bacterial phosphatidylinositol-specific phospholipase C (PI-PLC, EC 4.6.1.13), eukaryotic phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11), glycerophosphodiester phosphodiesterases (GP-GDE, EC 3.1.4.46), sphingomyelinases D (SMases D) (sphingomyelin phosphodiesterase D, EC 3.1.4.41) from spider venom, SMases D-like proteins, and phospholipase D (PLD) from several pathogenic bacteria, as well as their uncharacterized homologs found in organisms ranging from bacteria and archaea to metazoans, plants, and fungi. PI-PLCs are ubiquitous enzymes hydrolyzing the membrane lipid phosphoinositides to yield two important second messengers, inositol phosphates and diacylglycerol (DAG). 
GP-GDEs play essential roles in glycerol metabolism and catalyze the hydrolysis of glycerophosphodiesters to sn-glycerol-3-phosphate (G3P) and the corresponding alcohols that are major sources of carbon and phosphate. Both, PI-PLCs and GP-GDEs, can hydrolyze the 3'-5' phosphodiester bonds in different substrates, and utilize a similar mechanism of general base and acid catalysis with conserved histidine residues, which consists of two steps, a phosphotransfer and a phosphodiesterase reaction. This superfamily also includes Neurospora crassa ankyrin repeat protein NUC-2 and its Saccharomyces cerevisiae counterpart, Phosphate system positive regulatory protein PHO81, glycerophosphodiester phosphodiesterase (GP-GDE)-like protein SHV3 and SHV3-like proteins (SVLs). The residues essential for enzyme activities and metal binding are not conserved in these sequence homologs, which might suggest that the function of catalytic domains in these proteins might be distinct from those in typical PLC-like phosphodiesterases." Q#10579 - CGI_10010711 superfamily 150071 168 258 2.40E-22 91.481 cl08538 efhand_like superfamily - - "Phosphoinositide-specific phospholipase C, efhand-like; Members of this family are predominantly found in phosphoinositide-specific phospholipase C. They adopt a structure consisting of a core of four alpha helices, in an EF like fold, and are required for functioning of the enzyme." 
Q#10579 - CGI_10010711 superfamily 246675 475 524 1.48E-19 87.7782 cl14615 PI-PLCc_GDPD_SF superfamily NC - "Catalytic domain of phosphoinositide-specific phospholipase C-like phosphodiesterases superfamily; The PI-PLC-like phosphodiesterases superfamily represents the catalytic domains of bacterial phosphatidylinositol-specific phospholipase C (PI-PLC, EC 4.6.1.13), eukaryotic phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11), glycerophosphodiester phosphodiesterases (GP-GDE, EC 3.1.4.46), sphingomyelinases D (SMases D) (sphingomyelin phosphodiesterase D, EC 3.1.4.41) from spider venom, SMases D-like proteins, and phospholipase D (PLD) from several pathogenic bacteria, as well as their uncharacterized homologs found in organisms ranging from bacteria and archaea to metazoans, plants, and fungi. PI-PLCs are ubiquitous enzymes hydrolyzing the membrane lipid phosphoinositides to yield two important second messengers, inositol phosphates and diacylglycerol (DAG). GP-GDEs play essential roles in glycerol metabolism and catalyze the hydrolysis of glycerophosphodiesters to sn-glycerol-3-phosphate (G3P) and the corresponding alcohols that are major sources of carbon and phosphate. Both, PI-PLCs and GP-GDEs, can hydrolyze the 3'-5' phosphodiester bonds in different substrates, and utilize a similar mechanism of general base and acid catalysis with conserved histidine residues, which consists of two steps, a phosphotransfer and a phosphodiesterase reaction. This superfamily also includes Neurospora crassa ankyrin repeat protein NUC-2 and its Saccharomyces cerevisiae counterpart, Phosphate system positive regulatory protein PHO81, glycerophosphodiester phosphodiesterase (GP-GDE)-like protein SHV3 and SHV3-like proteins (SVLs). 
The residues essential for enzyme activities and metal binding are not conserved in these sequence homologs, which might suggest that the function of catalytic domains in these proteins might be distinct from those in typical PLC-like phosphodiesterases." Q#10580 - CGI_10010712 superfamily 247805 1 131 1.11E-10 60.814 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#10580 - CGI_10010712 superfamily 247905 291 397 2.98E-09 56.4773 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#10580 - CGI_10010712 superfamily 241598 1144 1182 1.30E-05 44.3848 cl00083 HNHc superfamily N - "HNH nucleases; HNH endonuclease signature which is found in viral, prokaryotic, and eukaryotic proteins. The alignment includes members of the large group of homing endonucleases, yeast intron 1 protein, MutS, as well as bacterial colicins, pyocins, and anaredoxins." Q#10580 - CGI_10010712 superfamily 207690 599 623 0.000446118 39.6085 cl02656 zf-RanBP superfamily - - Zn-finger in Ran binding protein and others; Zn-finger in Ran binding protein and others. 
Q#10581 - CGI_10010713 superfamily 243106 3 175 1.39E-88 277.35 cl02608 BAH superfamily - - "BAH, or Bromo Adjacent Homology domain (also called ELM1 and BAM for Bromo Adjacent Motif). BAH domains have first been described as domains found in the polybromo protein and Yeast Rsc1/Rsc2 (Remodeling of the Structure of Chromatin). They also occur in mammalian DNA methyltransferases and the MTA1 subunits of histone deacetylase complexes. A BAH domain is also found in Yeast Sir3p and in the origin receptor complex protein 1 (Orc1p), where it was found to interact with the N-terminal lobe of the silence information regulator 1 protein (Sir1p), confirming the initial hypothesis that BAH plays a role in protein-protein interactions." Q#10581 - CGI_10010713 superfamily 212559 275 320 9.84E-25 98.0702 cl18297 SANT_MTA3_like superfamily - - "Myb-Like Dna-Binding Domain of MTA3 and related proteins; Members in this SANT/myb family include domains found in mouse metastasis-associated protein 3 (MTA3) proteins and arginine-glutamic dipeptide (RERE) repeats proteins. SANT (SWI3, ADA2, N-CoR and TFIIIB) DNA-binding domains are a diverse set of proteins that share a common 3 alpha-helix bundle. MTA3 has been shown to interact with nucleosome remodeling and deacetylase (NuRD) proteins CHD4 and HDAC1, and the core cohesin complex protein RAD21 in the ovary, and regulate G2/M progression in proliferating granulosa cells. RERE belongs to the atrophin family and has been identified as a nuclear receptor corepressor; altered expression levels of RERE are associated with cancer in humans while mutations of Rere in mice cause failure in closing the anterior neural tube and fusion of the telencephalic and optic vesicles during embryogenesis." 
Q#10581 - CGI_10010713 superfamily 241648 370 422 2.81E-08 51.2194 cl00158 ZnF_GATA superfamily - - Zinc finger DNA binding domain; binds specifically to DNA consensus sequence [AT]GATA[AG] promoter elements; a subset of family members may also bind protein; zinc-finger consensus topology is C-X(2)-C-X(17)-C-X(2)-C Q#10581 - CGI_10010713 superfamily 216509 154 209 2.46E-12 63.0266 cl03218 ELM2 superfamily - - "ELM2 domain; The ELM2 (Egl-27 and MTA1 homology 2) domain is a small domain of unknown function. It is found in the MTA1 protein that is part of the NuRD complex. The domain is usually found to the N terminus of a myb-like DNA binding domain pfam00249. ELM2 is also found associated with an ARID DNA binding domain pfam01388 in a member from Arabidopsis thaliana. This suggests that ELM2 may also be involved in DNA binding, or perhaps is a protein-protein interaction domain." Q#10582 - CGI_10010714 superfamily 247805 27 227 2.98E-94 301.326 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 
Q#10582 - CGI_10010714 superfamily 247905 238 367 2.22E-42 152.777 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#10583 - CGI_10010715 superfamily 244363 8 181 1.41E-50 163.066 cl06336 Commd superfamily - - "COMM_Domain, a family of domains found at the C-terminus of HCarG, the copper metabolism gene MURR1 product, and related proteins. Presumably all COMM_Domain containing proteins are located in the nucleus and the COMM domain plays a role in protein-protein interactions. Several family members have been shown to bind and inhibit NF-kappaB. Murr1/Commd1 is a protein involved in copper homeostasis, which has also been identified as a regulator of the human delta epithelial sodium channel. HCaRG, a nuclear protein that might be involved in cell proliferation, is negatively regulated by extracellular calcium concentration, and its basal mRNA levels are higher in hypertensive animals." Q#10584 - CGI_10010716 superfamily 234278 249 445 2.46E-17 83.8121 cl15938 non_repeat_PQQ superfamily C - "dehydrogenase, PQQ-dependent, s-GDH family; PQQ, or pyrroloquinoline-quinone, serves as a cofactor for a number of sugar and alcohol dehydrogenases in a limited number of bacterial species. 
Most characterized PQQ-dependent enzymes have multiple repeats of a sequence region described by pfam01011 (PQQ enzyme repeat), but this protein family is unusual in lacking that repeat. Below the noise cutoff are related proteins mostly from species that lack PQQ biosynthesis." Q#10584 - CGI_10010716 superfamily 217324 26 126 1.89E-09 56.303 cl03844 Folate_rec superfamily C - Folate receptor family; This family includes the folate receptor which binds to folate and reduced folic acid derivatives and mediates delivery of 5-methyltetrahydrofolate to the interior of cells. These proteins are attached to the membrane by a GPI-anchor. The proteins contain 16 conserved cysteines that form eight disulphide bridges. Q#10585 - CGI_10010717 superfamily 241749 20 164 9.14E-25 94.7601 cl00280 globin_like superfamily - - superfamily containing globins and truncated hemoglobins Q#10586 - CGI_10010718 superfamily 243077 33 79 2.55E-11 56.0145 cl02542 DnaJ superfamily - - "DnaJ domain or J-domain. DnaJ/Hsp40 (heat shock protein 40) proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperonine family. Hsp40 proteins are characterized by the presence of a J domain, which mediates the interaction with Hsp70. They may contain other domains as well, and the architectures provide a means of classification." Q#10589 - CGI_10010721 superfamily 245323 2954 3243 1.02E-99 325.737 cl10511 Beach superfamily - - "BEACH (Beige and Chediak-Higashi) domains, implicated in membrane trafficking, are present in a family of proteins conserved throughout eukaryotes. This group contains human lysosomal trafficking regulator (LYST), LPS-responsive and beige-like anchor (LRBA) and neurobeachin. Disruption of LYST leads to Chediak-Higashi syndrome, characterized by severe immunodeficiency, albinism, poor blood coagulation and neurologic problems. 
Neurobeachin is a candidate gene linked to autism. LRBA seems to be upregulated in several cancer types. It has been shown that the BEACH domain itself is important for the function of these proteins." Q#10589 - CGI_10010721 superfamily 247725 2833 2943 5.73E-27 109.692 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting proteins to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#10589 - CGI_10010721 superfamily 243092 3349 3590 2.72E-22 100.487 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate 
interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#10590 - CGI_10010722 superfamily 241563 1 22 0.00873532 34.2351 cl00034 BBOX superfamily N - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#10592 - CGI_10010724 superfamily 217078 52 226 4.70E-58 188.909 cl15643 CoA_transf_3 superfamily - - "CoA-transferase family III; CoA-transferases are found in organisms from all lines of descent. Most of these enzymes belong to two well-known enzyme families, but recent work on unusual biochemical pathways of anaerobic bacteria has revealed the existence of a third family of CoA-transferases. The members of this enzyme family differ in sequence and reaction mechanism from CoA-transferases of the other families. Currently known enzymes of the new family are a formyl-CoA: oxalate CoA-transferase, a succinyl-CoA: (R)-benzylsuccinate CoA-transferase, an (E)-cinnamoyl-CoA: (R)-phenyllactate CoA-transferase, and a butyrobetainyl-CoA: (R)-carnitine CoA-transferase. In addition, a large number of proteins of unknown or differently annotated function from Bacteria, Archaea and Eukarya apparently belong to this enzyme family. Properties and reaction mechanisms of the CoA-transferases of family III are described and compared to those of the previously known CoA-transferases." Q#10593 - CGI_10010725 superfamily 221831 195 263 8.16E-12 59.0617 cl15144 TORC_C superfamily - - "Transducer of regulated CREB activity, C terminus; This family includes the C terminal region of TORC proteins. 
TORC (Transducer of regulated CREB activity) is a protein family of coactivators that enhances the activity of CRE-dependent transcription via a phosphorylation-independent interaction with the bZIP DNA binding/dimerisation domain of CREB (cAMP Response Element-Binding). The C terminus region is negatively charged, resembling the transcription activation domains. When this domain, from all three human TORC proteins, was expressed as fusion proteins with the DNA-binding domain of GAL4 (GAL4-BD), and tested for induction of a minimal promoter linked to GAL4-binding sites (UAS-GAL4), UAS-GAL4 was potently induced by GAL4-BD fusions containing the C-terminal portion of all three human TORCs." Q#10596 - CGI_10006020 superfamily 245835 18 225 4.56E-62 196.82 cl12013 BAR superfamily - - "The Bin/Amphiphysin/Rvs (BAR) domain, a dimerization module that binds membranes and detects membrane curvature; BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions including organelle biogenesis, membrane trafficking or remodeling, and cell division and migration. Mutations in BAR containing proteins have been linked to diseases and their inactivation in cells leads to altered membrane dynamics. A BAR domain with an additional N-terminal amphipathic helix (an N-BAR) can drive membrane curvature. These N-BAR domains are found in amphiphysins and endophilins, among others. BAR domains are also frequently found alongside domains that determine lipid specificity, such as the Pleckstrin Homology (PH) and Phox Homology (PX) domains which are present in beta centaurins (ACAPs and ASAPs) and sorting nexins, respectively. A FES-CIP4 Homology (FCH) domain together with a coiled coil region is called the F-BAR domain and is present in Pombe/Cdc15 homology (PCH) family proteins, which include Fes/Fes tyrosine kinases, PACSIN or syndapin, CIP4-like proteins, and srGAPs, among others. 
The Inverse (I)-BAR or IRSp53/MIM homology Domain (IMD) is found in multi-domain proteins, such as IRSp53 and MIM, that act as scaffolding proteins and transducers of a variety of signaling pathways that link membrane dynamics and the underlying actin cytoskeleton. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The I-BAR domain induces membrane protrusions in the opposite direction compared to classical BAR and F-BAR domains, which produce membrane invaginations. BAR domains that also serve as protein interaction domains include those of arfaptin and OPHN1-like proteins, among others, which bind to Rac and Rho GAP domains, respectively." Q#10597 - CGI_10006021 superfamily 241559 27 111 1.62E-14 64.6395 cl00030 CH superfamily C - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." Q#10598 - CGI_10006022 superfamily 150420 22 176 1.80E-60 206.125 cl18042 Jnk-SapK_ap_N superfamily - - JNK_SAPK-associated protein-1; This is the N-terminal 200 residues of a set of proteins conserved from yeasts to humans. Most of the proteins in this entry have an RhoGEF pfam00621 domain at their C-terminal end. Q#10600 - CGI_10006024 superfamily 245819 142 303 2.89E-51 178.927 cl11967 Nucleotidyl_cyc_III superfamily - - "Class III nucleotidyl cyclases; Class III nucleotidyl cyclases are the largest, most diverse group of nucleotidyl cyclases (NC's) containing prokaryotic and eukaryotic proteins. They can be divided into two major groups; the mononucleotidyl cyclases (MNC's) and the diguanylate cyclases (DGC's). 
The MNC's, which include the adenylate cyclases (AC's) and the guanylate cyclases (GC's), have a conserved cyclase homology domain (CHD), while the DGC's have a conserved GGDEF domain, named after a conserved motif within this subgroup. Their products, cyclic guanylyl and adenylyl nucleotides, are second messengers that play important roles in eukaryotic signal transduction and prokaryotic sensory pathways." Q#10600 - CGI_10006024 superfamily 245819 820 1005 9.74E-40 146.185 cl11967 Nucleotidyl_cyc_III superfamily - - "Class III nucleotidyl cyclases; Class III nucleotidyl cyclases are the largest, most diverse group of nucleotidyl cyclases (NC's) containing prokaryotic and eukaryotic proteins. They can be divided into two major groups; the mononucleotidyl cyclases (MNC's) and the diguanylate cyclases (DGC's). The MNC's, which include the adenylate cyclases (AC's) and the guanylate cyclases (GC's), have a conserved cyclase homology domain (CHD), while the DGC's have a conserved GGDEF domain, named after a conserved motif within this subgroup. Their products, cyclic guanylyl and adenylyl nucleotides, are second messengers that play important roles in eukaryotic signal transduction and prokaryotic sensory pathways." Q#10601 - CGI_10006025 superfamily 111102 319 398 1.56E-30 112.54 cl03478 KIX superfamily - - KIX domain; CBP and P300 bind to the CREB via a domain known as KIX. The KIX domain of CBP also binds to transactivation domains of other nuclear factors including Myb and Jun. Q#10601 - CGI_10006025 superfamily 243131 245 298 5.38E-15 69.6919 cl02660 zf-TAZ superfamily C - "TAZ zinc finger; The TAZ2 domain of CBP binds to other transcription factors such as the p53 tumour suppressor protein, E1A oncoprotein, MyoD, and GATA-1. The zinc coordinating motif that is necessary for binding to target DNA sequences consists of HCCC." 
Q#10602 - CGI_10006026 superfamily 243131 108 186 4.12E-27 102.049 cl02660 zf-TAZ superfamily - - "TAZ zinc finger; The TAZ2 domain of CBP binds to other transcription factors such as the p53 tumour suppressor protein, E1A oncoprotein, MyoD, and GATA-1. The zinc coordinating motif that is necessary for binding to target DNA sequences consists of HCCC." Q#10602 - CGI_10006026 superfamily 111102 223 285 2.43E-21 85.9612 cl03478 KIX superfamily - - KIX domain; CBP and P300 bind to the CREB via a domain known as KIX. The KIX domain of CBP also binds to transactivation domains of other nuclear factors including Myb and Jun. Q#10603 - CGI_10006027 superfamily 243131 105 183 7.75E-27 99.3522 cl02660 zf-TAZ superfamily - - "TAZ zinc finger; The TAZ2 domain of CBP binds to other transcription factors such as the p53 tumour suppressor protein, E1A oncoprotein, MyoD, and GATA-1. The zinc coordinating motif that is necessary for binding to target DNA sequences consists of HCCC." Q#10604 - CGI_10006028 superfamily 245205 59 113 0.000678235 36.4469 cl09930 RPA_2b-aaRSs_OBF_like superfamily C - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. 
RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." Q#10604 - CGI_10006028 superfamily 245205 156 208 0.00135703 35.2913 cl09930 RPA_2b-aaRSs_OBF_like superfamily C - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." Q#10605 - CGI_10006029 superfamily 111102 17 75 1.68E-24 89.428 cl03478 KIX superfamily N - KIX domain; CBP and P300 bind to the CREB via a domain known as KIX. 
The KIX domain of CBP also binds to transactivation domains of other nuclear factors including Myb and Jun. Q#10606 - CGI_10003614 superfamily 241583 27 202 1.44E-18 84.5975 cl00064 ZnMc superfamily - - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#10606 - CGI_10003614 superfamily 246918 285 339 1.69E-11 60.2931 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#10607 - CGI_10003615 superfamily 246918 522 576 3.59E-12 62.9895 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#10607 - CGI_10003615 superfamily 241583 334 427 1.91E-08 54.1439 cl00064 ZnMc superfamily N - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#10607 - CGI_10003615 superfamily 216572 125 234 4.47E-08 52.2771 cl03265 Pep_M12B_propep superfamily - - Reprolysin family propeptide; This region is the propeptide for members of peptidase family M12B. The propeptide contains a sequence motif similar to the "cysteine switch" of the matrixins. This motif is found at the C terminus of the alignment but is not well aligned. 
Q#10608 - CGI_10003616 superfamily 241637 584 819 1.85E-26 111.715 cl00146 TFIIS_I superfamily N - N-terminal domain (domain I) of transcription elongation factor S-II (TFIIS); similar to a domain found in elongin A and CRSP70; likely to be involved in transcription; domain I from TFIIS interacts with RNA polymerase II holoenzyme Q#10609 - CGI_10003617 superfamily 241734 17 104 9.32E-34 118.84 cl00261 PLPDE_III superfamily C - "Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzymes; The fold type III PLP-dependent enzyme family is predominantly composed of two-domain proteins with similarity to bacterial alanine racemases (AR) including eukaryotic ornithine decarboxylases (ODC), prokaryotic diaminopimelate decarboxylases (DapDC), biosynthetic arginine decarboxylases (ADC), carboxynorspermidine decarboxylases (CANSDC), and similar proteins. AR-like proteins contain an N-terminal PLP-binding TIM-barrel domain and a C-terminal beta-sandwich domain. They exist as homodimers with active sites that lie at the interface between the TIM barrel domain of one subunit and the beta-sandwich domain of the other subunit. These proteins play important roles in the biosynthesis of amino acids and polyamine. The family also includes the single-domain YBL036c-like proteins, which contain a single PLP-binding TIM-barrel domain without any N- or C-terminal extensions. Due to the lack of a second domain, these proteins may possess only limited D- to L-alanine racemase activity or non-specific racemase activity." Q#10611 - CGI_10004830 superfamily 222150 369 396 0.00111071 37.7565 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#10611 - CGI_10004830 superfamily 222150 805 828 0.00324397 36.6009 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#10611 - CGI_10004830 superfamily 222150 775 802 0.00347616 36.2157 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. 
Q#10612 - CGI_10004831 superfamily 243074 4 49 1.57E-13 64.0649 cl02535 F-box-like superfamily - - F-box-like; This is an F-box-like family. Q#10613 - CGI_10004832 superfamily 247858 187 402 4.65E-26 102.851 cl17304 2OG-FeII_Oxy_3 superfamily - - 2OG-Fe(II) oxygenase superfamily; This family contains members of the 2-oxoglutarate (2OG) and Fe(II)-dependent oxygenase superfamily. Q#10614 - CGI_10004833 superfamily 220830 283 416 2.63E-21 93.9296 cl11246 Ofd1_CTDD superfamily N - "Oxoglutarate and iron-dependent oxygenase degradation C-term; Ofd1 is a prolyl 4-hydroxylase-like 2-oxoglutarate-Fe(II) dioxygenase that accelerates the degradation of Sre1N in the presence of oxygen. The domain is conserved from yeasts to humans. Yeast Sre1 is the orthologue of mammalian sterol regulatory element binding protein (SREBP), and it responds to changes in oxygen-dependent sterol synthesis as an indirect measure of oxygen availability. However, unlike the prolyl 4-hydroxylases that regulate mammalian hypoxia-inducible factor, Ofd1 uses multiple domains to regulate Sre1N degradation by oxygen; the Ofd1 N-terminal dioxygenase domain is required for oxygen sensing and this Ofd1 C-terminal domain accelerates Sre1N degradation in yeasts." Q#10614 - CGI_10004833 superfamily 247858 149 247 7.46E-17 77.0557 cl17304 2OG-FeII_Oxy_3 superfamily - - 2OG-Fe(II) oxygenase superfamily; This family contains members of the 2-oxoglutarate (2OG) and Fe(II)-dependent oxygenase superfamily. Q#10614 - CGI_10004833 superfamily 243169 549 600 0.00127552 37.8408 cl02766 NGN superfamily C - "N-Utilization Substance G (NusG) N-terminal (NGN) domain Superfamily; The N-Utilization Substance G (NusG) and its eukaryotic homolog Spt5 are involved in transcription elongation and termination. NusG contains an NGN domain at its N-terminus and Kyrpides Ouzounis and Woese (KOW) repeats at its C-terminus in bacteria and archaea. 
The eukaryotic ortholog, Spt5, is a large protein composed of an acidic N-terminus, an NGN domain, and multiple KOW motifs at its C-terminus. Spt5 forms a Spt4-Spt5 complex that is an essential RNA Polymerase II elongation factor. NusG was originally discovered as an N-dependent antitermination enhancing activity in Escherichia coli and has a variety of functions, such as being involved in RNA polymerase elongation and Rho-termination in bacteria. Orthologs of the NusG gene exist in all bacteria, but its functions and requirements are different. The diverse activities suggest that, after diverging from a common ancestor, NusG proteins became specialized in different bacteria." Q#10615 - CGI_10004834 superfamily 245213 420 454 1.39E-06 46.861 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#10615 - CGI_10004834 superfamily 222669 86 152 1.82E-09 55.8628 cl17048 Fn3-like superfamily - - Fibronectin type III-like domain; This domain has a fibronectin type III-like structure. It is often found in association with pfam00933 and pfam01915. Its function is unknown. Q#10615 - CGI_10004834 superfamily 243028 465 498 3.51E-07 48.4442 cl02419 Notch superfamily - - LNR domain; The LNR (Lin-12/Notch repeat) domain is found in three tandem copies in Notch related proteins. The structure of the domain has been determined by NMR and was shown to contain three disulphide bonds and coordinate a calcium ion. 
Three repeats are also found in the PAPP-A peptidase. Q#10616 - CGI_10004835 superfamily 247803 210 366 8.73E-119 355.068 cl17249 YlqF_related_GTPase superfamily - - "Circularly permuted YlqF-related GTPases; These proteins are found in bacteria, eukaryotes, and archaea. They all exhibit a circular permutation of the GTPase signature motifs so that the order of the conserved G box motifs is G4-G5-G1-G2-G3, with G4 and G5 being permuted from the C-terminal region of proteins in the Ras superfamily to the N-terminus of YlqF-related GTPases." Q#10616 - CGI_10004835 superfamily 149291 45 175 3.85E-58 194.015 cl06961 NGP1NT superfamily - - NGP1NT (NUC091) domain; This N terminal domain is found in a subfamily of hypothetical nucleolar GTP-binding proteins similar to human NGP1. Q#10617 - CGI_10004836 superfamily 247856 24 80 2.09E-05 37.9125 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#10622 - CGI_10018870 superfamily 241574 968 1196 3.85E-90 293.338 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." 
Q#10622 - CGI_10018870 superfamily 241574 1260 1491 8.51E-64 218.609 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#10622 - CGI_10018870 superfamily 241584 595 683 4.25E-10 58.6619 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#10622 - CGI_10018870 superfamily 241584 500 590 8.64E-08 51.7283 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." 
Q#10622 - CGI_10018870 superfamily 241584 409 495 1.05E-07 51.7283 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#10622 - CGI_10018870 superfamily 241585 44 90 8.10E-07 48.284 cl00066 FU superfamily - - Furin-like repeats. Cysteine rich region. Exact function of the domain is not known. Furin is a serine-kinase dependent proprotein processor. Other members of this family include endoproteases and cell surface receptors. Q#10622 - CGI_10018870 superfamily 241585 11 41 0.000132448 41.7356 cl00066 FU superfamily C - Furin-like repeats. Cysteine rich region. Exact function of the domain is not known. Furin is a serine-kinase dependent proprotein processor. Other members of this family include endoproteases and cell surface receptors. Q#10623 - CGI_10018871 superfamily 243061 24 124 3.82E-44 152.881 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#10623 - CGI_10018871 superfamily 243068 285 531 1.85E-27 110.939 cl02523 Zona_pellucida superfamily - - Zona pellucida-like domain; Zona pellucida-like domain. Q#10623 - CGI_10018871 superfamily 243061 163 270 3.86E-20 86.2418 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. 
These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#10624 - CGI_10018872 superfamily 243061 20 122 9.38E-34 125.147 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#10624 - CGI_10018872 superfamily 243068 321 528 1.08E-27 112.254 cl02523 Zona_pellucida superfamily - - Zona pellucida-like domain; Zona pellucida-like domain. Q#10624 - CGI_10018872 superfamily 243061 160 267 5.78E-22 91.6346 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#10625 - CGI_10018873 superfamily 247856 100 172 1.83E-13 62.1801 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#10625 - CGI_10018873 superfamily 247856 67 125 2.00E-12 59.0985 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#10626 - CGI_10018874 superfamily 241762 572 631 2.40E-19 83.2149 cl00297 R3H superfamily - - "R3H domain. 
The name of the R3H domain comes from the characteristic spacing of the most conserved arginine and histidine residues. R3H domains are found in proteins together with ATPase domains, SF1 helicase domains, SF2 DEAH helicase domains, Cys-rich repeats, ring-type zinc fingers, and KH domains. The function of the domain is predicted to bind ssDNA or ssRNA in a sequence-specific manner." Q#10626 - CGI_10018874 superfamily 241575 279 335 0.003407 36.0963 cl00054 DSRM superfamily - - "Double-stranded RNA binding motif. Binding is not sequence specific but is highly specific for double stranded RNA. Found in a variety of proteins including dsRNA dependent protein kinase PKR, RNA helicases, Drosophila staufen protein, E. coli RNase III, RNases H1, and dsRNA dependent adenosine deaminases." Q#10626 - CGI_10018874 superfamily 152387 39 100 2.74E-13 66.6081 cl13399 DUF3469 superfamily C - Protein of unknown function (DUF3469); This family of proteins are functionally uncharacterized. This protein is found in eukaryotes. Proteins in this family are typically between 108 to 439 amino acids in length. 
Q#10628 - CGI_10018876 superfamily 247684 9 182 3.48E-20 86.4887 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#10630 - CGI_10018878 superfamily 246902 212 347 9.20E-58 185.9 cl15239 PLDc_SF superfamily - - "Catalytic domain of phospholipase D superfamily proteins; Catalytic domain of phospholipase D (PLD) superfamily proteins. 
The PLD superfamily is composed of a large and diverse group of proteins including plant, mammalian and bacterial PLDs, bacterial cardiolipin (CL) synthases, bacterial phosphatidylserine synthases (PSS), eukaryotic phosphatidylglycerophosphate (PGP) synthase, eukaryotic tyrosyl-DNA phosphodiesterase 1 (Tdp1), and some bacterial endonucleases (Nuc and BfiI), among others. PLD enzymes hydrolyze phospholipid phosphodiester bonds to yield phosphatidic acid and a free polar head group. They can also catalyze the transphosphatidylation of phospholipids to acceptor alcohols. The majority of members in this superfamily contain a short conserved sequence motif (H-x-K-x(4)-D, where x represents any amino acid residue), called the HKD signature motif. There are varying expanded forms of this motif in different family members. Some members contain variant HKD motifs. Most PLD enzymes are monomeric proteins with two HKD motif-containing domains. Two HKD motifs from two domains form a single active site. Some PLD enzymes have only one copy of the HKD motif per subunit but form a functionally active dimer, which has a single active site at the dimer interface containing the two HKD motifs from both subunits. Different PLD enzymes may have evolved through domain fusion of a common catalytic core with separate substrate recognition domains. Despite their various catalytic functions and a very broad range of substrate specificities, the diverse group of PLD enzymes can bind to a phosphodiester moiety. Most of them are active as bi-lobed monomers or dimers, and may possess similar core structures for catalytic activity. They are generally thought to utilize a common two-step ping-pong catalytic mechanism, involving an enzyme-substrate intermediate, to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. 
Upon substrate binding, a histidine from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group." Q#10630 - CGI_10018878 superfamily 246902 20 163 1.67E-52 172.479 cl15239 PLDc_SF superfamily - - "Catalytic domain of phospholipase D superfamily proteins; Catalytic domain of phospholipase D (PLD) superfamily proteins. The PLD superfamily is composed of a large and diverse group of proteins including plant, mammalian and bacterial PLDs, bacterial cardiolipin (CL) synthases, bacterial phosphatidylserine synthases (PSS), eukaryotic phosphatidylglycerophosphate (PGP) synthase, eukaryotic tyrosyl-DNA phosphodiesterase 1 (Tdp1), and some bacterial endonucleases (Nuc and BfiI), among others. PLD enzymes hydrolyze phospholipid phosphodiester bonds to yield phosphatidic acid and a free polar head group. They can also catalyze the transphosphatidylation of phospholipids to acceptor alcohols. The majority of members in this superfamily contain a short conserved sequence motif (H-x-K-x(4)-D, where x represents any amino acid residue), called the HKD signature motif. There are varying expanded forms of this motif in different family members. Some members contain variant HKD motifs. Most PLD enzymes are monomeric proteins with two HKD motif-containing domains. Two HKD motifs from two domains form a single active site. Some PLD enzymes have only one copy of the HKD motif per subunit but form a functionally active dimer, which has a single active site at the dimer interface containing the two HKD motifs from both subunits. Different PLD enzymes may have evolved through domain fusion of a common catalytic core with separate substrate recognition domains. 
Despite their various catalytic functions and a very broad range of substrate specificities, the diverse group of PLD enzymes can bind to a phosphodiester moiety. Most of them are active as bi-lobed monomers or dimers, and may possess similar core structures for catalytic activity. They are generally thought to utilize a common two-step ping-pong catalytic mechanism, involving an enzyme-substrate intermediate, to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group." Q#10631 - CGI_10018879 superfamily 246902 133 207 6.88E-18 76.5036 cl15239 PLDc_SF superfamily N - "Catalytic domain of phospholipase D superfamily proteins; Catalytic domain of phospholipase D (PLD) superfamily proteins. The PLD superfamily is composed of a large and diverse group of proteins including plant, mammalian and bacterial PLDs, bacterial cardiolipin (CL) synthases, bacterial phosphatidylserine synthases (PSS), eukaryotic phosphatidylglycerophosphate (PGP) synthase, eukaryotic tyrosyl-DNA phosphodiesterase 1 (Tdp1), and some bacterial endonucleases (Nuc and BfiI), among others. PLD enzymes hydrolyze phospholipid phosphodiester bonds to yield phosphatidic acid and a free polar head group. They can also catalyze the transphosphatidylation of phospholipids to acceptor alcohols. The majority of members in this superfamily contain a short conserved sequence motif (H-x-K-x(4)-D, where x represents any amino acid residue), called the HKD signature motif. There are varying expanded forms of this motif in different family members. Some members contain variant HKD motifs. Most PLD enzymes are monomeric proteins with two HKD motif-containing domains. 
Two HKD motifs from two domains form a single active site. Some PLD enzymes have only one copy of the HKD motif per subunit but form a functionally active dimer, which has a single active site at the dimer interface containing the two HKD motifs from both subunits. Different PLD enzymes may have evolved through domain fusion of a common catalytic core with separate substrate recognition domains. Despite their various catalytic functions and a very broad range of substrate specificities, the diverse group of PLD enzymes can bind to a phosphodiester moiety. Most of them are active as bi-lobed monomers or dimers, and may possess similar core structures for catalytic activity. They are generally thought to utilize a common two-step ping-pong catalytic mechanism, involving an enzyme-substrate intermediate, to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group." Q#10631 - CGI_10018879 superfamily 246902 46 117 2.13E-17 75.0231 cl15239 PLDc_SF superfamily N - "Catalytic domain of phospholipase D superfamily proteins; Catalytic domain of phospholipase D (PLD) superfamily proteins. The PLD superfamily is composed of a large and diverse group of proteins including plant, mammalian and bacterial PLDs, bacterial cardiolipin (CL) synthases, bacterial phosphatidylserine synthases (PSS), eukaryotic phosphatidylglycerophosphate (PGP) synthase, eukaryotic tyrosyl-DNA phosphodiesterase 1 (Tdp1), and some bacterial endonucleases (Nuc and BfiI), among others. PLD enzymes hydrolyze phospholipid phosphodiester bonds to yield phosphatidic acid and a free polar head group. 
They can also catalyze the transphosphatidylation of phospholipids to acceptor alcohols. The majority of members in this superfamily contain a short conserved sequence motif (H-x-K-x(4)-D, where x represents any amino acid residue), called the HKD signature motif. There are varying expanded forms of this motif in different family members. Some members contain variant HKD motifs. Most PLD enzymes are monomeric proteins with two HKD motif-containing domains. Two HKD motifs from two domains form a single active site. Some PLD enzymes have only one copy of the HKD motif per subunit but form a functionally active dimer, which has a single active site at the dimer interface containing the two HKD motifs from both subunits. Different PLD enzymes may have evolved through domain fusion of a common catalytic core with separate substrate recognition domains. Despite their various catalytic functions and a very broad range of substrate specificities, the diverse group of PLD enzymes can bind to a phosphodiester moiety. Most of them are active as bi-lobed monomers or dimers, and may possess similar core structures for catalytic activity. They are generally thought to utilize a common two-step ping-pong catalytic mechanism, involving an enzyme-substrate intermediate, to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group." Q#10632 - CGI_10018880 superfamily 246902 156 279 5.90E-49 170.107 cl15239 PLDc_SF superfamily - - "Catalytic domain of phospholipase D superfamily proteins; Catalytic domain of phospholipase D (PLD) superfamily proteins. 
The PLD superfamily is composed of a large and diverse group of proteins including plant, mammalian and bacterial PLDs, bacterial cardiolipin (CL) synthases, bacterial phosphatidylserine synthases (PSS), eukaryotic phosphatidylglycerophosphate (PGP) synthase, eukaryotic tyrosyl-DNA phosphodiesterase 1 (Tdp1), and some bacterial endonucleases (Nuc and BfiI), among others. PLD enzymes hydrolyze phospholipid phosphodiester bonds to yield phosphatidic acid and a free polar head group. They can also catalyze the transphosphatidylation of phospholipids to acceptor alcohols. The majority of members in this superfamily contain a short conserved sequence motif (H-x-K-x(4)-D, where x represents any amino acid residue), called the HKD signature motif. There are varying expanded forms of this motif in different family members. Some members contain variant HKD motifs. Most PLD enzymes are monomeric proteins with two HKD motif-containing domains. Two HKD motifs from two domains form a single active site. Some PLD enzymes have only one copy of the HKD motif per subunit but form a functionally active dimer, which has a single active site at the dimer interface containing the two HKD motifs from both subunits. Different PLD enzymes may have evolved through domain fusion of a common catalytic core with separate substrate recognition domains. Despite their various catalytic functions and a very broad range of substrate specificities, the diverse group of PLD enzymes can bind to a phosphodiester moiety. Most of them are active as bi-lobed monomers or dimers, and may possess similar core structures for catalytic activity. They are generally thought to utilize a common two-step ping-pong catalytic mechanism, involving an enzyme-substrate intermediate, to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. 
Upon substrate binding, a histidine from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group." Q#10632 - CGI_10018880 superfamily 246902 563 688 3.81E-41 148.151 cl15239 PLDc_SF superfamily - - "Catalytic domain of phospholipase D superfamily proteins; Catalytic domain of phospholipase D (PLD) superfamily proteins. The PLD superfamily is composed of a large and diverse group of proteins including plant, mammalian and bacterial PLDs, bacterial cardiolipin (CL) synthases, bacterial phosphatidylserine synthases (PSS), eukaryotic phosphatidylglycerophosphate (PGP) synthase, eukaryotic tyrosyl-DNA phosphodiesterase 1 (Tdp1), and some bacterial endonucleases (Nuc and BfiI), among others. PLD enzymes hydrolyze phospholipid phosphodiester bonds to yield phosphatidic acid and a free polar head group. They can also catalyze the transphosphatidylation of phospholipids to acceptor alcohols. The majority of members in this superfamily contain a short conserved sequence motif (H-x-K-x(4)-D, where x represents any amino acid residue), called the HKD signature motif. There are varying expanded forms of this motif in different family members. Some members contain variant HKD motifs. Most PLD enzymes are monomeric proteins with two HKD motif-containing domains. Two HKD motifs from two domains form a single active site. Some PLD enzymes have only one copy of the HKD motif per subunit but form a functionally active dimer, which has a single active site at the dimer interface containing the two HKD motifs from both subunits. Different PLD enzymes may have evolved through domain fusion of a common catalytic core with separate substrate recognition domains. 
Despite their various catalytic functions and a very broad range of substrate specificities, the diverse group of PLD enzymes can bind to a phosphodiester moiety. Most of them are active as bi-lobed monomers or dimers, and may possess similar core structures for catalytic activity. They are generally thought to utilize a common two-step ping-pong catalytic mechanism, involving an enzyme-substrate intermediate, to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group." Q#10632 - CGI_10018880 superfamily 246902 751 845 9.68E-33 124.268 cl15239 PLDc_SF superfamily C - "Catalytic domain of phospholipase D superfamily proteins; Catalytic domain of phospholipase D (PLD) superfamily proteins. The PLD superfamily is composed of a large and diverse group of proteins including plant, mammalian and bacterial PLDs, bacterial cardiolipin (CL) synthases, bacterial phosphatidylserine synthases (PSS), eukaryotic phosphatidylglycerophosphate (PGP) synthase, eukaryotic tyrosyl-DNA phosphodiesterase 1 (Tdp1), and some bacterial endonucleases (Nuc and BfiI), among others. PLD enzymes hydrolyze phospholipid phosphodiester bonds to yield phosphatidic acid and a free polar head group. They can also catalyze the transphosphatidylation of phospholipids to acceptor alcohols. The majority of members in this superfamily contain a short conserved sequence motif (H-x-K-x(4)-D, where x represents any amino acid residue), called the HKD signature motif. There are varying expanded forms of this motif in different family members. Some members contain variant HKD motifs. Most PLD enzymes are monomeric proteins with two HKD motif-containing domains. 
Two HKD motifs from two domains form a single active site. Some PLD enzymes have only one copy of the HKD motif per subunit but form a functionally active dimer, which has a single active site at the dimer interface containing the two HKD motifs from both subunits. Different PLD enzymes may have evolved through domain fusion of a common catalytic core with separate substrate recognition domains. Despite their various catalytic functions and a very broad range of substrate specificities, the diverse group of PLD enzymes can bind to a phosphodiester moiety. Most of them are active as bi-lobed monomers or dimers, and may possess similar core structures for catalytic activity. They are generally thought to utilize a common two-step ping-pong catalytic mechanism, involving an enzyme-substrate intermediate, to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group." Q#10632 - CGI_10018880 superfamily 246902 2 113 1.06E-21 92.7423 cl15239 PLDc_SF superfamily - - "Catalytic domain of phospholipase D superfamily proteins; Catalytic domain of phospholipase D (PLD) superfamily proteins. The PLD superfamily is composed of a large and diverse group of proteins including plant, mammalian and bacterial PLDs, bacterial cardiolipin (CL) synthases, bacterial phosphatidylserine synthases (PSS), eukaryotic phosphatidylglycerophosphate (PGP) synthase, eukaryotic tyrosyl-DNA phosphodiesterase 1 (Tdp1), and some bacterial endonucleases (Nuc and BfiI), among others. PLD enzymes hydrolyze phospholipid phosphodiester bonds to yield phosphatidic acid and a free polar head group. 
They can also catalyze the transphosphatidylation of phospholipids to acceptor alcohols. The majority of members in this superfamily contain a short conserved sequence motif (H-x-K-x(4)-D, where x represents any amino acid residue), called the HKD signature motif. There are varying expanded forms of this motif in different family members. Some members contain variant HKD motifs. Most PLD enzymes are monomeric proteins with two HKD motif-containing domains. Two HKD motifs from two domains form a single active site. Some PLD enzymes have only one copy of the HKD motif per subunit but form a functionally active dimer, which has a single active site at the dimer interface containing the two HKD motifs from both subunits. Different PLD enzymes may have evolved through domain fusion of a common catalytic core with separate substrate recognition domains. Despite their various catalytic functions and a very broad range of substrate specificities, the diverse group of PLD enzymes can bind to a phosphodiester moiety. Most of them are active as bi-lobed monomers or dimers, and may possess similar core structures for catalytic activity. They are generally thought to utilize a common two-step ping-pong catalytic mechanism, involving an enzyme-substrate intermediate, to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group." Q#10632 - CGI_10018880 superfamily 246902 452 526 1.55E-16 77.7195 cl15239 PLDc_SF superfamily N - "Catalytic domain of phospholipase D superfamily proteins; Catalytic domain of phospholipase D (PLD) superfamily proteins. 
The PLD superfamily is composed of a large and diverse group of proteins including plant, mammalian and bacterial PLDs, bacterial cardiolipin (CL) synthases, bacterial phosphatidylserine synthases (PSS), eukaryotic phosphatidylglycerophosphate (PGP) synthase, eukaryotic tyrosyl-DNA phosphodiesterase 1 (Tdp1), and some bacterial endonucleases (Nuc and BfiI), among others. PLD enzymes hydrolyze phospholipid phosphodiester bonds to yield phosphatidic acid and a free polar head group. They can also catalyze the transphosphatidylation of phospholipids to acceptor alcohols. The majority of members in this superfamily contain a short conserved sequence motif (H-x-K-x(4)-D, where x represents any amino acid residue), called the HKD signature motif. There are varying expanded forms of this motif in different family members. Some members contain variant HKD motifs. Most PLD enzymes are monomeric proteins with two HKD motif-containing domains. Two HKD motifs from two domains form a single active site. Some PLD enzymes have only one copy of the HKD motif per subunit but form a functionally active dimer, which has a single active site at the dimer interface containing the two HKD motifs from both subunits. Different PLD enzymes may have evolved through domain fusion of a common catalytic core with separate substrate recognition domains. Despite their various catalytic functions and a very broad range of substrate specificities, the diverse group of PLD enzymes can bind to a phosphodiester moiety. Most of them are active as bi-lobed monomers or dimers, and may possess similar core structures for catalytic activity. They are generally thought to utilize a common two-step ping-pong catalytic mechanism, involving an enzyme-substrate intermediate, to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. 
Upon substrate binding, a histidine from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group." Q#10632 - CGI_10018880 superfamily 246902 405 446 2.40E-11 62.2512 cl15239 PLDc_SF superfamily NC - "Catalytic domain of phospholipase D superfamily proteins; Catalytic domain of phospholipase D (PLD) superfamily proteins. The PLD superfamily is composed of a large and diverse group of proteins including plant, mammalian and bacterial PLDs, bacterial cardiolipin (CL) synthases, bacterial phosphatidylserine synthases (PSS), eukaryotic phosphatidylglycerophosphate (PGP) synthase, eukaryotic tyrosyl-DNA phosphodiesterase 1 (Tdp1), and some bacterial endonucleases (Nuc and BfiI), among others. PLD enzymes hydrolyze phospholipid phosphodiester bonds to yield phosphatidic acid and a free polar head group. They can also catalyze the transphosphatidylation of phospholipids to acceptor alcohols. The majority of members in this superfamily contain a short conserved sequence motif (H-x-K-x(4)-D, where x represents any amino acid residue), called the HKD signature motif. There are varying expanded forms of this motif in different family members. Some members contain variant HKD motifs. Most PLD enzymes are monomeric proteins with two HKD motif-containing domains. Two HKD motifs from two domains form a single active site. Some PLD enzymes have only one copy of the HKD motif per subunit but form a functionally active dimer, which has a single active site at the dimer interface containing the two HKD motifs from both subunits. Different PLD enzymes may have evolved through domain fusion of a common catalytic core with separate substrate recognition domains. 
Despite their various catalytic functions and a very broad range of substrate specificities, the diverse group of PLD enzymes can bind to a phosphodiester moiety. Most of them are active as bi-lobed monomers or dimers, and may possess similar core structures for catalytic activity. They are generally thought to utilize a common two-step ping-pong catalytic mechanism, involving an enzyme-substrate intermediate, to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group." Q#10633 - CGI_10018881 superfamily 215821 174 268 1.16E-32 123.505 cl18346 FKBP_C superfamily - - FKBP-type peptidyl-prolyl cis-trans isomerase; FKBP-type peptidyl-prolyl cis-trans isomerase. Q#10633 - CGI_10018881 superfamily 247725 59 149 4.98E-05 42.8219 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting proteins to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and other proteins." Q#10634 - CGI_10018882 superfamily 217925 63 183 8.37E-25 94.4953 cl04417 Ctr superfamily - - "Ctr copper transporter family; The redox active metal copper is an essential cofactor in critical biological processes such as respiration, iron transport, oxidative stress protection, hormone production, and pigmentation. 
A widely conserved family of high-affinity copper transport proteins (Ctr proteins) mediates copper uptake at the plasma membrane. A series of clustered methionine residues in the hydrophilic extracellular domain, and an MXXXM motif in the second transmembrane domain, are important for copper uptake. These methionines probably coordinate copper during the process of metal transport." Q#10635 - CGI_10018883 superfamily 243053 217 487 1.19E-68 225.98 cl02485 RasGEF superfamily - - "Guanine nucleotide exchange factor for Ras-like small GTPases. Small GTP-binding proteins of the Ras superfamily function as molecular switches in fundamental events such as signal transduction, cytoskeleton dynamics and intracellular trafficking. Guanine-nucleotide-exchange factors (GEFs) positively regulate these GTP-binding proteins in response to a variety of signals. GEFs catalyze the dissociation of GDP from the inactive GTP-binding proteins. GTP can then bind and induce structural changes that allow interaction with effectors." Q#10635 - CGI_10018883 superfamily 243067 62 184 1.67E-19 85.5419 cl02520 REM superfamily - - "Guanine nucleotide exchange factor for Ras-like GTPases; N-terminal domain (RasGef_N), also called REM domain (Ras exchanger motif). This domain is common in nucleotide exchange factors for Ras-like small GTPases and is typically found immediately N-terminal to the RasGef (Cdc25-like) domain. REM contacts the GTPase and is assumed to participate in the catalytic activity of the exchange factor. Proteins with the REM domain include Sos1 and Sos2, which relay signals from tyrosine-kinase mediated signalling to Ras, RasGRP1-4, RasGRF1,2, CNrasGEF, and RAP-specific nucleotide exchange factors, to name a few." Q#10635 - CGI_10018883 superfamily 241645 569 648 2.58E-15 72.4519 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. 
Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#10636 - CGI_10018884 superfamily 217519 5 193 2.37E-56 184.67 cl04030 PRP38 superfamily - - PRP38 family; Members of this family are related to the pre mRNA splicing factor PRP38 from yeast. Therefore all the members of this family could be involved in splicing. This conserved region could be involved in RNA binding. The putative domain is about 180 amino acids in length. PRP38 is a unique component of the U4/U6.U5 tri-small nuclear ribonucleoprotein (snRNP) particle and is necessary for an essential step late in spliceosome maturation. Q#10637 - CGI_10018885 superfamily 247912 97 488 1.70E-41 151.885 cl17358 Beta-lactamase superfamily - - Beta-lactamase; This family appears to be distantly related to pfam00905 and PF00768 D-alanyl-D-alanine carboxypeptidase. Q#10638 - CGI_10018886 superfamily 241622 1085 1168 1.19E-12 65.6658 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. 
Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either an N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95 (post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#10639 - CGI_10018887 superfamily 243570 270 619 6.36E-112 343.451 cl03905 EXS superfamily - - "EXS family; We have named this region the EXS family after (ERD1, XPR1, and SYG1). This family includes C-terminus portions from the SYG1 G-protein associated signal transduction protein from Saccharomyces cerevisiae, and sequences that are thought to be murine leukaemia virus (MLV) receptors (XPR1). N-terminus portions from these proteins are aligned in the SPX pfam03105 family. The previously noted similarity between SYG1 and MLV receptors over their whole sequences is thus borne out in pfam03105 and this family. While the N-termini aligned in pfam03105 are thought to be involved in signal transduction, the role of the C-terminus sequences aligned in this family is not known. This region of similarity contains several predicted transmembrane helices. This family also includes the ERD1 (ERD: ER retention defective) yeast proteins. ERD1 proteins are involved in the localisation of endogenous endoplasmic reticulum (ER) proteins. 
erd1 null mutants secrete such proteins even though they possess the C-terminal HDEL ER lumen localisation label sequence. In addition, null mutants also exhibit defects in the Golgi-dependent processing of several glycoproteins, which led to the suggestion that the sorting of luminal ER proteins actually occurs in the Golgi, with subsequent return of these proteins to the ER via 'salvage' vesicles." Q#10639 - CGI_10018887 superfamily 217372 1 182 4.33E-42 150.659 cl18405 SPX superfamily - - "SPX domain; We have named this region the SPX domain after (SYG1, Pho81 and XPR1). This 180 residue length domain is found at the amino terminus of a variety of proteins. In the yeast protein SYG1, the N-terminus directly binds to the G-protein beta subunit and inhibits transduction of the mating pheromone signal. This finding suggests that all the members of this family are involved in G-protein associated signal transduction. The N-termini of several proteins involved in the regulation of phosphate transport, including the putative phosphate level sensors PHO81 from Saccharomyces cerevisiae and NUC-2 from Neurospora crassa, are also members of this family. The SPX domain of S. cerevisiae low-affinity phosphate transporters Pho87 and Pho90 auto-regulates uptake and prevents efflux. This SPX dependent inhibition is mediated by the physical interaction with Spl2. NUC-2 contains several ankyrin repeats pfam00023. Several members of this family are annotated as XPR1 proteins: the xenotropic and polytropic retrovirus receptor confers susceptibility to infection with murine leukaemia viruses (MLV). The similarity between SYG1, phosphate regulators and XPR1 sequences has been previously noted, as has the additional similarity to several predicted proteins, of unknown function, from Drosophila melanogaster, Arabidopsis thaliana, Caenorhabditis elegans, Schizosaccharomyces pombe, and Saccharomyces cerevisiae. 
In addition, given the similarities between XPR1 and SYG1 and phosphate regulatory proteins, it has been proposed that XPR1 might be involved in G-protein associated signal transduction and may itself function as a phosphate sensor." Q#10640 - CGI_10018888 superfamily 190601 1 44 7.08E-06 44.3371 cl10605 GerA superfamily N - Bacillus/Clostridium GerA spore germination protein; Bacillus/Clostridium GerA spore germination protein. Q#10641 - CGI_10018889 superfamily 248458 91 262 4.15E-18 84.2877 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. 
Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#10641 - CGI_10018889 superfamily 248458 316 506 5.65E-09 56.5533 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." 
Q#10643 - CGI_10018891 superfamily 241629 1234 1387 2.63E-33 127.636 cl00133 SCP superfamily - - "SCP: SCP-like extracellular protein domain, found in eukaryotes and prokaryotes. This family includes plant pathogenesis-related protein 1 (PR-1), which accumulates after infections with pathogens, and may act as an anti-fungal agent or be involved in cell wall loosening. This family also includes CRISPs, mammalian cysteine-rich secretory proteins, which combine SCP with a C-terminal cysteine rich domain, and allergen 5 from vespid venom. Roles for CRISP, in response to pathogens, fertilization, and sperm maturation have been proposed. One member, Tex31 from the venom duct of Conus textile, has been shown to possess proteolytic activity sensitive to serine protease inhibitors. The human GAPR-1 protein has been reported to dimerize, and such a dimer may form an active site containing a catalytic triad. SCP has also been proposed to be a Ca++ chelating serine protease. The Ca++-chelating function would fit with various signaling processes that members of this family, such as the CRISPs, are involved in, and is supported by sequence and structural evidence of a conserved pocket containing two histidines and a glutamate. It also may explain how helothermine, a toxic peptide secreted by the beaded lizard, blocks Ca++ transporting ryanodine receptors. Little is known about the biological roles of the bacterial and archaeal SCP domains." 
Q#10643 - CGI_10018891 superfamily 243034 762 830 4.75E-09 55.464 cl02429 TPR superfamily C - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#10643 - CGI_10018891 superfamily 243034 1010 1070 0.000258214 41.2116 cl02429 TPR superfamily C - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the 
number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#10643 - CGI_10018891 superfamily 247743 257 411 5.40E-07 49.8908 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#10643 - CGI_10018891 superfamily 205451 13 127 5.63E-06 46.0323 cl16203 DUF4062 superfamily - - "Domain of unknown function (DUF4062); This presumed domain is functionally uncharacterized. This domain family is found in bacteria, archaea and eukaryotes, and is approximately 80 amino acids in length. There is a conserved SST sequence motif." Q#10644 - CGI_10018892 superfamily 247775 267 398 1.37E-24 103.434 cl17221 ArsB_NhaD_permease superfamily N - "Anion permease ArsB/NhaD. 
These permeases have been shown to translocate sodium, arsenate, antimonite, sulfate and organic anions across biological membranes in all three kingdoms of life. A typical anion permease contains 8-13 transmembrane helices and can function either independently as a chemiosmotic transporter or as a channel-forming subunit of an ATP-driven anion pump." Q#10644 - CGI_10018892 superfamily 247775 105 170 1.09E-09 58.8943 cl17221 ArsB_NhaD_permease superfamily NC - "Anion permease ArsB/NhaD. These permeases have been shown to translocate sodium, arsenate, antimonite, sulfate and organic anions across biological membranes in all three kingdoms of life. A typical anion permease contains 8-13 transmembrane helices and can function either independently as a chemiosmotic transporter or as a channel-forming subunit of an ATP-driven anion pump." Q#10644 - CGI_10018892 superfamily 247775 1 31 0.000171625 42.187 cl17221 ArsB_NhaD_permease superfamily NC - "Anion permease ArsB/NhaD. These permeases have been shown to translocate sodium, arsenate, antimonite, sulfate and organic anions across biological membranes in all three kingdoms of life. A typical anion permease contains 8-13 transmembrane helices and can function either independently as a chemiosmotic transporter or as a channel-forming subunit of an ATP-driven anion pump." Q#10646 - CGI_10018894 superfamily 245206 1 143 3.63E-34 123.103 cl09931 NADB_Rossmann superfamily N - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. 
Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#10651 - CGI_10018899 superfamily 241563 62 103 0.0034338 36.6884 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#10651 - CGI_10018899 superfamily 217316 111 168 0.00621647 36.4504 cl03832 DUF234 superfamily C - Archaea bacterial proteins of unknown function; Archaea bacterial proteins of unknown function. Q#10652 - CGI_10018900 superfamily 247046 241 393 3.31E-06 46.2741 cl15705 DUF563 superfamily C - Protein of unknown function (DUF563); Family of uncharacterized proteins. Q#10654 - CGI_10018902 superfamily 245599 428 649 1.99E-110 335.842 cl11397 NR_LBD superfamily - - "The ligand binding domain of nuclear receptors, a family of ligand-activated transcription regulators; Ligand-binding domain (LBD) of nuclear receptor (NR): Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions in metazoans, from development, reproduction, to homeostasis and metabolism. The superfamily contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. The members of the family include receptors of steroids, thyroid hormone, retinoids, cholesterol by-products, lipids and heme. 
With few exceptions, NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD)." Q#10654 - CGI_10018902 superfamily 207662 176 258 6.91E-56 186.122 cl02596 NR_DBD_like superfamily - - "DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers; DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. It interacts with a specific DNA site upstream of the target gene and modulates the rate of transcriptional initiation. Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions, from development, reproduction, to homeostasis and metabolism in animals (metazoans). The family contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). Most nuclear receptors bind as homodimers or heterodimers to their target sites, which consist of two hexameric half-sites. Specificity is determined by the half-site sequence, the relative orientation of the half-sites and the number of spacer nucleotides between the half-sites. However, a growing number of nuclear receptors have been reported to bind to DNA as monomers." Q#10655 - CGI_10018903 superfamily 245230 46 121 2.57E-40 142.045 cl10017 Tubulin_FtsZ superfamily C - "Tubulin/FtsZ: Family includes tubulin alpha-, beta-, gamma-, delta-, and epsilon-tubulins as well as FtsZ, all of which are involved in polymer formation. Tubulin is the major component of microtubules, but also exists as a heterodimer and as a curved oligomer. 
Microtubules exist in all eukaryotic cells and are responsible for many functions, including cellular transport, cell motility, and mitosis. FtsZ forms a ring-shaped septum at the site of bacterial cell division, which is required for constriction of cell membrane and cell envelope to yield two daughter cells. FtsZ can polymerize into tubes, sheets, and rings in vitro and is ubiquitous in eubacteria, archaea, and chloroplasts." Q#10656 - CGI_10013953 superfamily 248312 21 205 1.68E-07 48.1185 cl17758 PMP22_Claudin superfamily - - PMP-22/EMP/MP20/Claudin family; PMP-22/EMP/MP20/Claudin family. Q#10657 - CGI_10013954 superfamily 206609 421 528 2.97E-54 185.676 cl16886 DBC1 superfamily C - DBC1; DBC1 and its homologs from diverse eukaryotes are a catalytically inactive version of the Nudix hydrolase (MutT) domain. DBC1 is predicted to bind NAD metabolites and regulate the activity of SIRT1 or related deacetylases by sensing the soluble products or substrates of the NAD-dependent deacetylation reaction. Q#10657 - CGI_10013954 superfamily 206610 128 185 2.64E-32 121.306 cl16887 S1-like superfamily - - S1-like; S1-like RNA binding domain found in DBC1 Q#10657 - CGI_10013954 superfamily 207684 577 605 0.000657444 38.8992 cl02640 SAP superfamily - - "SAP domain; The SAP (after SAF-A/B, Acinus and PIAS) motif is a putative DNA/RNA binding domain found in diverse nuclear and cytoplasmic proteins." Q#10657 - CGI_10013954 superfamily 221533 972 1035 0.000826785 38.832 cl13726 TMF_DNA_bd superfamily - - "TATA element modulatory factor 1 DNA binding; This is the middle region of a family of TATA element modulatory factor 1 proteins conserved in eukaryotes that contains at its N-terminal section a number of leucine zippers that could potentially form coiled coil structures. The whole proteins bind to the TATA element of some RNA polymerase II promoters and repress their activity by competing with the binding of TATA binding protein. 
TMFs are evolutionarily conserved golgins that bind Rab6, a ubiquitous ras-like GTP-binding Golgi protein, and contribute to Golgi organisation in animal and plant cells." Q#10658 - CGI_10013955 superfamily 241626 162 277 5.07E-48 158.18 cl00125 RHOD superfamily - - "Rhodanese Homology Domain (RHOD); an alpha beta fold domain found duplicated in the rhodanese protein. The cysteine containing enzymatically active version of the domain is also found in the Cdc25 class of protein phosphatases and a variety of proteins such as sulfide dehydrogenases and certain stress proteins such as senescence specific protein 1 in plants, PspE and GlpE in bacteria and cyanide and arsenate resistance proteins. Inactive versions (no active site cysteine) are also seen in dual specificity phosphatases, ubiquitin hydrolases from yeast and in sulfuryltransferases, where they are believed to play a regulatory role in multidomain proteins." Q#10658 - CGI_10013955 superfamily 241626 7 137 2.70E-47 156.24 cl00125 RHOD superfamily - - "Rhodanese Homology Domain (RHOD); an alpha beta fold domain found duplicated in the rhodanese protein. The cysteine containing enzymatically active version of the domain is also found in the Cdc25 class of protein phosphatases and a variety of proteins such as sulfide dehydrogenases and certain stress proteins such as senescence specific protein 1 in plants, PspE and GlpE in bacteria and cyanide and arsenate resistance proteins. Inactive versions (no active site cysteine) are also seen in dual specificity phosphatases, ubiquitin hydrolases from yeast and in sulfuryltransferases, where they are believed to play a regulatory role in multidomain proteins." 
Q#10659 - CGI_10013956 superfamily 241675 117 331 4.58E-95 296.905 cl00195 SIR2 superfamily - - "SIR2 superfamily of proteins includes silent information regulator 2 (Sir2) enzymes which catalyze NAD+-dependent protein/histone deacetylation, where the acetyl group from the lysine epsilon-amino group is transferred to the ADP-ribose moiety of NAD+, producing nicotinamide and the novel metabolite O-acetyl-ADP-ribose. Sir2 proteins, also known as sirtuins, are found in all eukaryotes and many archaea and prokaryotes and have been shown to regulate gene silencing, DNA repair, metabolic enzymes, and life span. The most-studied function, gene silencing, involves the inactivation of chromosome domains containing key regulatory genes by packaging them into a specialized chromatin structure that is inaccessible to DNA-binding proteins. The oligomerization state of Sir2 appears to be organism-dependent, sometimes occurring as a monomer and sometimes as a multimer. Also included in this superfamily is a group of uncharacterized Sir2-like proteins which lack certain key catalytic residues and conserved zinc binding cysteines." Q#10662 - CGI_10013959 superfamily 241622 22 69 2.40E-10 58.7322 cl00117 PDZ superfamily N - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either an N- or C-terminal beta-strand forming the peptide-binding groove base. 
The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#10662 - CGI_10013959 superfamily 128974 398 423 1.71E-06 46.442 cl00302 ZM superfamily - - "ZASP-like motif; Short motif (26 amino acids) present in an alpha-actinin-binding protein, ZASP, and similar molecules." Q#10663 - CGI_10013960 superfamily 149077 1398 1510 4.88E-33 125.815 cl06719 TMC superfamily - - "TMC domain; These sequences are similar to a region conserved amongst various protein products of the transmembrane channel-like (TMC) gene family, such as Transmembrane channel-like protein 3 and EVIN2 - this region is termed the TMC domain. Mutations in these genes are implicated in a number of human conditions, such as deafness and epidermodysplasia verruciformis. TMC proteins are thought to have important cellular roles, and may be modifiers of ion channels or transporters." Q#10664 - CGI_10013961 superfamily 217293 59 270 2.75E-65 213.263 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#10664 - CGI_10013961 superfamily 202474 291 387 8.42E-11 60.7453 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#10665 - CGI_10013962 superfamily 217293 22 124 9.58E-32 120.045 cl03788 Neur_chan_LBD superfamily N - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. 
This domain forms a pentameric arrangement in the known structure. Q#10665 - CGI_10013962 superfamily 202474 132 202 5.97E-10 57.6637 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#10666 - CGI_10013963 superfamily 241762 132 194 7.13E-15 69.6607 cl00297 R3H superfamily - - "R3H domain. The name of the R3H domain comes from the characteristic spacing of the most conserved arginine and histidine residues. R3H domains are found in proteins together with ATPase domains, SF1 helicase domains, SF2 DEAH helicase domains, Cys-rich repeats, ring-type zinc fingers, and KH domains. The function of the domain is predicted to bind ssDNA or ssRNA in a sequence-specific manner." Q#10666 - CGI_10013963 superfamily 247723 365 394 1.14E-09 55.0829 cl17169 RRM_SF superfamily C - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." 
Q#10666 - CGI_10013963 superfamily 248145 232 316 4.43E-06 46.474 cl17591 CAF1 superfamily N - CAF1 family ribonuclease; The major pathways of mRNA turnover in eukaryotes initiate with shortening of the polyA tail. CAF1 encodes a critical component of the major cytoplasmic deadenylase in yeast. Caf1p is required for normal mRNA deadenylation in vivo and localises to the cytoplasm. Caf1p copurifies with a Ccr4p-dependent polyA-specific exonuclease activity. Some members of this family include an inserted RNA binding domain pfam01424. This family of proteins is related to other exonucleases pfam00929 (Bateman A pers. obs.). The crystal structure of Saccharomyces cerevisiae Pop2 has been resolved at 2.3 Angstrom resolution. Q#10667 - CGI_10013964 superfamily 246925 370 605 3.12E-10 60.447 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#10668 - CGI_10013965 superfamily 245303 44 263 3.64E-113 332.725 cl10447 GH18_chitinase-like superfamily C - "The GH18 (glycosyl hydrolase, family 18) type II chitinases hydrolyze chitin, an abundant polymer of beta-1,4-linked N-acetylglucosamine (GlcNAc) which is a major component of the cell wall of fungi and the exoskeleton of arthropods. Chitinases have been identified in viruses, bacteria, fungi, protozoan parasites, insects, and plants. The structure of the GH18 domain is an eight-stranded beta/alpha barrel with a pronounced active-site cleft at the C-terminal end of the beta-barrel. 
The GH18 family includes chitotriosidase, chitobiase, hevamine, zymocin-alpha, narbonin, SI-CLP (stabilin-1 interacting chitinase-like protein), IDGF (imaginal disc growth factor), CFLE (cortical fragment-lytic enzyme) spore hydrolase, the type III and type V plant chitinases, the endo-beta-N-acetylglucosaminidases, and the chitolectins. The GH85 (glycosyl hydrolase, family 85) ENGases (endo-beta-N-acetylglucosaminidases) are closely related to the GH18 chitinases and are included in this alignment model." Q#10670 - CGI_10021845 superfamily 241574 18 179 1.10E-05 43.3434 cl00053 PTPc superfamily N - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#10673 - CGI_10021848 superfamily 147609 12 176 1.24E-17 75.4707 cl05205 p25-alpha superfamily - - "p25-alpha; This family encodes a 25 kDa protein that is phosphorylated by a Ser/Thr-Pro kinase. It has been described as a brain specific protein, but it is found in Tetrahymena thermophila." Q#10674 - CGI_10021849 superfamily 151166 1 60 5.26E-17 70.7645 cl11265 MitoNEET_N superfamily - - "Iron-containing outer mitochondrial membrane protein N-terminus; MitoNEET_N is the N-terminal region of the MitoNEET and Miner-type proteins that carry a zf-CDGSH, pfam09360, redox-active 2Fe-2S cluster. The whole protein regulates oxidative capacity. The domain is an anchor sequence that tethers the protein to the outer membrane." 
Q#10674 - CGI_10021849 superfamily 243164 78 116 1.29E-15 66.2116 cl02748 zf-CDGSH superfamily - - "Iron-binding zinc finger CDGSH type; The CDGSH-type zinc finger domain binds iron rather than zinc as a redox-active pH-labile 2Fe-2S cluster. The conserved sequence C-X-C-X2-(S/T)-X3-P-X-C-D-G-(S/A/T)-H is a defining feature of this family. The domain is oriented towards the cytoplasm and is tethered to the mitochondrial membrane by a more N-terminal domain found in higher vertebrates, MitoNEET_N, pfam10660. The domain forms a uniquely folded homo-dimer and spans the outer mitochondrial membrane, orienting the iron-binding residues towards the cytoplasm." Q#10675 - CGI_10021850 superfamily 241644 4 144 2.31E-56 175.083 cl00154 UBCc superfamily - - "Ubiquitin-conjugating enzyme E2, catalytic (UBCc) domain. This is part of the ubiquitin-mediated protein degradation pathway in which a thiol-ester linkage forms between a conserved cysteine and the C-terminus of ubiquitin and complexes with ubiquitin protein ligase enzymes, E3. This pathway regulates many fundamental cellular processes. There are also other E2s which form thiol-ester linkages without the use of E3s as well as several UBC homologs (TSG101, Mms2, Croc-1 and similar proteins) which lack the active site cysteine essential for ubiquitination and appear to function in DNA repair pathways which were omitted from the scope of this CD." Q#10676 - CGI_10021851 superfamily 247723 225 301 8.87E-51 167.793 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. 
RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#10676 - CGI_10021851 superfamily 247723 114 195 8.16E-47 157.317 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#10676 - CGI_10021851 superfamily 247723 329 416 7.15E-28 105.331 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. 
This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#10677 - CGI_10021852 superfamily 241776 18 184 3.50E-64 202.812 cl00315 RPS2 superfamily - - "Ribosomal protein S2 (RPS2), involved in formation of the translation initiation complex, where it might contact the messenger RNA and several components of the ribosome. It has been shown that in Escherichia coli RPS2 is essential for the binding of ribosomal protein S1 to the 30s ribosomal subunit. In humans, most likely in all vertebrates, and perhaps in all metazoans, the protein also functions as the 67 kDa laminin receptor (LAMR1 or 67LR), which is formed from a 37 kDa precursor, and is overexpressed in many tumors. 67LR is a cell surface receptor which interacts with a variety of ligands, laminin-1 and others. It is assumed that the ligand interactions are mediated via the conserved C-terminus, which becomes extracellular as the protein undergoes conformational changes which are not well understood. Specifically, a conserved palindromic motif, LMWWML, may participate in the interactions. 67LR plays essential roles in the adhesion of cells to the basement membrane and subsequent signalling events, and has been linked to several diseases. Some evidence also suggests that the precursor of 67LR, 37LRP is also present in the nucleus in animals, where it appears associated with histones."
Q#10679 - CGI_10021854 superfamily 213458 1 47 1.90E-11 54.2021 cl17044 DD_cGKI superfamily - - "Dimerization/Docking domain of Cyclic GMP-dependent Protein Kinase I; Cyclic GMP-dependent Protein Kinase I (PKG1 or cGKI) is a Serine/Threonine Kinase (STK), catalyzing the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. cGKI exists as two splice variants, cGKI-alpha and cGKI-beta. They contain an N-terminal regulatory domain containing a dimerization/docking region and an autoinhibitory pseudosubstrate region, two cGMP-binding domains, and a C-terminal catalytic domain. Binding of cGMP to both binding sites releases the inhibition of the catalytic center by the pseudosubstrate region, allowing autophosphorylation and activation of the kinase. cGKI is a soluble protein expressed in all smooth muscles, platelets, cerebellum, and kidney. It is also expressed at lower concentrations in other tissues. It is involved in the regulation of smooth muscle tone, smooth cell proliferation, and platelet activation. The dimerization/docking (D/D) domain is a leucine/isoleucine zipper that mediates both homodimerization and interaction with isotype-specific G-kinase-anchoring proteins (GKAPs). The D/D domain of the two variants (alpha and beta) differ, allowing their targeting to different subcellular compartments and intracellular substrates." Q#10681 - CGI_10021856 superfamily 245201 384 644 3.14E-169 488.282 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." 
Q#10681 - CGI_10021856 superfamily 241570 242 355 1.91E-28 110.493 cl00047 CAP_ED superfamily - - "effector domain of the CAP family of transcription factors; members include CAP (or cAMP receptor protein (CRP)), which binds cAMP, FNR (fumarate and nitrate reduction), which uses an iron-sulfur cluster to sense oxygen) and CooA, a heme containing CO sensor. In all cases binding of the effector leads to conformational changes and the ability to activate transcription. Cyclic nucleotide-binding domain similar to CAP are also present in cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) and vertebrate cyclic nucleotide-gated ion-channels. Cyclic nucleotide-monophosphate binding domain; proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues; the best studied is the prokaryotic catabolite gene activator, CAP, where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure; three conserved glycine residues are thought to be essential for maintenance of the structural integrity of the beta-barrel; CooA is a homodimeric transcription factor that belongs to CAP family; cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain; cAPK's are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain; cGPK's are single chain enzymes that include the two copies of the domain in their N-terminal section; also found in vertebrate cyclic nucleotide-gated ion-channels" Q#10681 - CGI_10021856 superfamily 241570 123 230 7.13E-25 100.478 cl00047 CAP_ED superfamily - - "effector domain of the CAP family of transcription factors; members include CAP (or cAMP receptor protein (CRP)), which binds cAMP, FNR (fumarate and nitrate reduction), which uses an iron-sulfur cluster to sense oxygen) and CooA, a heme containing CO sensor. 
In all cases binding of the effector leads to conformational changes and the ability to activate transcription. Cyclic nucleotide-binding domain similar to CAP are also present in cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) and vertebrate cyclic nucleotide-gated ion-channels. Cyclic nucleotide-monophosphate binding domain; proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues; the best studied is the prokaryotic catabolite gene activator, CAP, where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure; three conserved glycine residues are thought to be essential for maintenance of the structural integrity of the beta-barrel; CooA is a homodimeric transcription factor that belongs to CAP family; cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain; cAPK's are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain; cGPK's are single chain enzymes that include the two copies of the domain in their N-terminal section; also found in vertebrate cyclic nucleotide-gated ion-channels" Q#10681 - CGI_10021856 superfamily 213458 10 61 1.82E-05 43.0997 cl17044 DD_cGKI superfamily - - "Dimerization/Docking domain of Cyclic GMP-dependent Protein Kinase I; Cyclic GMP-dependent Protein Kinase I (PKG1 or cGKI) is a Serine/Threonine Kinase (STK), catalyzing the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. cGKI exists as two splice variants, cGKI-alpha and cGKI-beta. They contain an N-terminal regulatory domain containing a dimerization/docking region and an autoinhibitory pseudosubstrate region, two cGMP-binding domains, and a C-terminal catalytic domain. 
Binding of cGMP to both binding sites releases the inhibition of the catalytic center by the pseudosubstrate region, allowing autophosphorylation and activation of the kinase. cGKI is a soluble protein expressed in all smooth muscles, platelets, cerebellum, and kidney. It is also expressed at lower concentrations in other tissues. It is involved in the regulation of smooth muscle tone, smooth cell proliferation, and platelet activation. The dimerization/docking (D/D) domain is a leucine/isoleucine zipper that mediates both homodimerization and interaction with isotype-specific G-kinase-anchoring proteins (GKAPs). The D/D domain of the two variants (alpha and beta) differ, allowing their targeting to different subcellular compartments and intracellular substrates." Q#10681 - CGI_10021856 superfamily 245597 639 667 3.14E-08 51.2072 cl11395 Pkinase_C superfamily C - Protein kinase C terminal domain; Protein kinase C terminal domain. Q#10682 - CGI_10021857 superfamily 147395 1 42 9.64E-11 53.7424 cl04973 Dpy-30 superfamily - - Dpy-30 motif; This motif is found in a wide variety of domain contexts. It is found in the Dpy-30 proteins hence the motif's name. It is about 40 residues long and is probably formed of two alpha-helices. It may be a dimerisation motif analogous to pfam02197 (Bateman A pers obs). Q#10685 - CGI_10021860 superfamily 220695 21 68 0.00461153 36.4027 cl18571 7TM_GPCR_Srx superfamily C - Serpentine type 7TM GPCR chemoreceptor Srx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srx is part of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'.
Q#10686 - CGI_10021861 superfamily 207654 244 309 5.41E-22 87.113 cl02574 Annexin superfamily - - Annexin; This family of annexins also includes giardin that has been shown to function as an annexin. Q#10686 - CGI_10021861 superfamily 207654 85 150 8.41E-22 86.7278 cl02574 Annexin superfamily - - Annexin; This family of annexins also includes giardin that has been shown to function as an annexin. Q#10686 - CGI_10021861 superfamily 207654 14 77 2.62E-20 82.4906 cl02574 Annexin superfamily - - Annexin; This family of annexins also includes giardin that has been shown to function as an annexin. Q#10686 - CGI_10021861 superfamily 207654 169 234 4.55E-17 73.631 cl02574 Annexin superfamily - - Annexin; This family of annexins also includes giardin that has been shown to function as an annexin. Q#10688 - CGI_10021863 superfamily 207654 97 162 1.67E-26 99.8246 cl02574 Annexin superfamily - - Annexin; This family of annexins also includes giardin that has been shown to function as an annexin. Q#10688 - CGI_10021863 superfamily 207654 256 321 7.11E-24 92.5058 cl02574 Annexin superfamily - - Annexin; This family of annexins also includes giardin that has been shown to function as an annexin. Q#10688 - CGI_10021863 superfamily 207654 26 90 4.96E-23 90.1946 cl02574 Annexin superfamily - - Annexin; This family of annexins also includes giardin that has been shown to function as an annexin. Q#10688 - CGI_10021863 superfamily 207654 180 246 1.50E-20 83.261 cl02574 Annexin superfamily - - Annexin; This family of annexins also includes giardin that has been shown to function as an annexin. Q#10689 - CGI_10021864 superfamily 207654 1 57 6.71E-13 57.4526 cl02574 Annexin superfamily - - Annexin; This family of annexins also includes giardin that has been shown to function as an annexin. 
Q#10691 - CGI_10021866 superfamily 245815 10 175 2.87E-97 292.33 cl11961 ALDH-SF superfamily C - "NAD(P)+-dependent aldehyde dehydrogenase superfamily; The aldehyde dehydrogenase superfamily (ALDH-SF) of NAD(P)+-dependent enzymes, in general, oxidize a wide range of endogenous and exogenous aliphatic and aromatic aldehydes to their corresponding carboxylic acids and play an important role in detoxification. Besides aldehyde detoxification, many ALDH isozymes possess multiple additional catalytic and non-catalytic functions such as participating in metabolic pathways, or as binding proteins, or osmoregulants, to mention a few. The enzyme has three domains, a NAD(P)+ cofactor-binding domain, a catalytic domain, and a bridging domain; and the active enzyme is generally either homodimeric or homotetrameric. The catalytic mechanism is proposed to involve cofactor binding, resulting in a conformational change and activation of an invariant catalytic cysteine nucleophile. The cysteine and aldehyde substrate form an oxyanion thiohemiacetal intermediate resulting in hydride transfer to the cofactor and formation of a thioacylenzyme intermediate. Hydrolysis of the thioacylenzyme and release of the carboxylic acid product occurs, and in most cases, the reduced cofactor dissociates from the enzyme. The evolutionary phylogenetic tree of ALDHs appears to have an initial bifurcation between what has been characterized as the classical aldehyde dehydrogenases, the ALDH family (ALDH) and extended family members or aldehyde dehydrogenase-like (ALDH-L) proteins. 
The ALDH proteins are represented by enzymes which share a number of highly conserved residues necessary for catalysis and cofactor binding and they include such proteins as retinal dehydrogenase, 10-formyltetrahydrofolate dehydrogenase, non-phosphorylating glyceraldehyde 3-phosphate dehydrogenase, delta(1)-pyrroline-5-carboxylate dehydrogenases, alpha-ketoglutaric semialdehyde dehydrogenase, alpha-aminoadipic semialdehyde dehydrogenase, coniferyl aldehyde dehydrogenase and succinate-semialdehyde dehydrogenase. Included in this larger group are all human, Arabidopsis, Tortula, fungal, protozoan, and Drosophila ALDHs identified in families ALDH1 through ALDH22 with the exception of families ALDH18, ALDH19, and ALDH20 which are present in the ALDH-like group. The ALDH-like group is represented by such proteins as gamma-glutamyl phosphate reductase, LuxC-like acyl-CoA reductase, and coenzyme A acylating aldehyde dehydrogenase. All of these proteins have a conserved cysteine that aligns with the catalytic cysteine of the ALDH group." Q#10692 - CGI_10021867 superfamily 241584 579 673 9.46E-08 52.4987 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#10692 - CGI_10021867 superfamily 241584 1598 1678 9.28E-05 43.2539 cl00065 FN3 superfamily C - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. 
Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#10692 - CGI_10021867 superfamily 241584 256 308 0.00076795 40.5575 cl00065 FN3 superfamily N - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#10692 - CGI_10021867 superfamily 241584 1527 1596 0.00209673 39.0167 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#10692 - CGI_10021867 superfamily 241584 1419 1522 0.00352976 38.2463 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. 
Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#10692 - CGI_10021867 superfamily 245201 1854 2121 1.50E-144 452.389 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#10694 - CGI_10021869 superfamily 248097 3 120 1.71E-16 70.3718 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#10695 - CGI_10021870 superfamily 217414 115 476 3.56E-06 47.7096 cl03927 Otopetrin superfamily - - "Protein of unknown function, DUF270; Protein of unknown function, DUF270. " Q#10696 - CGI_10021871 superfamily 241645 8 71 2.50E-12 61.129 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. 
The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#10696 - CGI_10021871 superfamily 245201 137 162 0.00512287 36.5189 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#10697 - CGI_10021872 superfamily 241926 26 140 3.30E-48 153.944 cl00528 IscU_like superfamily - - "Iron-sulfur cluster scaffold-like proteins; IscU_like and NifU_like proteins. IscU and NifU function as a scaffold for the assembly of [2Fe-2S] clusters before they are transferred to apo target proteins. They are highly conserved and play vital roles in the ISC and NIF systems of Fe-S protein maturation. NIF genes participate in nitrogen fixation in several isolated bacterial species. The NifU domain, however, is also found in bacteria that do not fix nitrogen, so it may have wider significance in the cell. Human IscU interacts with frataxin, the Friedreich ataxia gene product, and incorrectly spliced IscU has been shown to disrupt iron homeostasis in skeletal muscle and cause myopathy." Q#10698 - CGI_10021873 superfamily 243099 441 531 0.00707603 34.9275 cl02575 Bcl-2_like superfamily - - "Apoptosis regulator proteins of the Bcl-2 family, named after B-cell lymphoma 2. 
This alignment model spans what have been described as Bcl-2 homology regions BH1, BH2, BH3, and BH4. Many members of this family have an additional C-terminal transmembrane segment. Some homologous proteins, which are not included in this model, may miss either the BH4 (Bax, Bak) or the BH2 (Bcl-X(S)) region, and some appear to only share the BH3 region (Bik, Bim, Bad, Bid, Egl-1). This family is involved in the regulation of the outer mitochondrial membrane's permeability and in promoting or preventing the release of apoptogenic factors, which in turn may trigger apoptosis by activating caspases. Bcl-2 and the closely related Bcl-X(L) are anti-apoptotic key regulators of programmed cell death. They are assumed to function via heterodimeric protein-protein interactions, binding pro-apoptotic proteins such as Bad (BCL2-antagonist of cell death), Bid, and Bim, by specifically interacting with their BH3 regions. Interfering with this heterodimeric interaction via small-molecule inhibitors may prove effective in targeting various cancers. This family also includes the Caenorhabditis elegans Bcl-2 homolog CED-9, which binds to CED-4, the C. elegans homolog of mammalian Apaf-1. Apaf-1, however, does not seem to be inhibited by Bcl-2 directly." Q#10699 - CGI_10021874 superfamily 241793 119 219 1.19E-37 131.473 cl00333 Ribosomal_L13 superfamily - - "Ribosomal protein L13. Protein L13, a large ribosomal subunit protein, is one of five proteins required for an early folding intermediate of 23S rRNA in the assembly of the large subunit. L13 is situated on the bottom of the large subunit, near the polypeptide exit site. It interacts with proteins L3 and L6, and forms an extensive network of interactions with 23S rRNA. L13 has been identified as a homolog of the human breast basic conserved protein 1 (BBC1), a protein identified through its increased expression in breast cancer.
L13 expression is also upregulated in a variety of human gastrointestinal cancers, suggesting it may play a role in the etiology of a variety of human malignancies." Q#10700 - CGI_10021875 superfamily 202715 150 250 7.29E-29 106.123 cl04194 Tctex-1 superfamily - - Tctex-1 family; Tctex-1 is a dynein light chain. It has been shown that Tctex-1 can bind to the cytoplasmic tail of rhodopsin. C-terminal rhodopsin mutations responsible for retinitis pigmentosa inhibit this interaction. Q#10701 - CGI_10021876 superfamily 248012 334 429 3.99E-19 83.0108 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#10702 - CGI_10021877 superfamily 243058 86 183 5.84E-08 50.7759 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#10702 - CGI_10021877 superfamily 248012 207 302 1.06E-19 84.1664 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#10706 - CGI_10021881 superfamily 220692 65 246 1.05E-05 45.2729 cl18570 7TM_GPCR_Srw superfamily C - Serpentine type 7TM GPCR chemoreceptor Srw; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srw is a solo family amongst the superfamilies of chemoreceptors. 
Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. The genes encoding Srw do not appear to be under as strong an adaptive evolutionary pressure as those of Srz. Q#10707 - CGI_10021882 superfamily 243066 32 122 5.93E-27 104.557 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#10707 - CGI_10021882 superfamily 219619 345 421 6.23E-08 50.284 cl18518 Ion_trans_2 superfamily - - Ion channel; This family includes the two membrane helix type ion channels found in bacteria. Q#10708 - CGI_10021883 superfamily 245814 38 118 9.44E-05 41.3369 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. 
A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#10710 - CGI_10001483 superfamily 248097 47 181 6.12E-15 67.7125 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#10711 - CGI_10013599 superfamily 247723 34 123 6.27E-45 147.02 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)."
Q#10712 - CGI_10013600 superfamily 243034 69 150 3.76E-09 51.612 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#10712 - CGI_10013600 superfamily 215821 1 39 0.00161352 35.2939 cl18346 FKBP_C superfamily N - FKBP-type peptidyl-prolyl cis-trans isomerase; FKBP-type peptidyl-prolyl cis-trans isomerase. Q#10713 - CGI_10013601 superfamily 243082 121 463 1.16E-63 211.462 cl02553 Peptidase_C19 superfamily - - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds.
They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." Q#10713 - CGI_10013601 superfamily 245220 4 63 3.40E-17 75.9042 cl09957 zf-UBP superfamily - - Zn-finger in ubiquitin-hydrolases and other protein; Zn-finger in ubiquitin-hydrolases and other protein. Q#10716 - CGI_10013604 superfamily 241584 308 410 3.06E-08 52.4987 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#10718 - CGI_10013606 superfamily 241599 8 51 1.21E-08 46.0825 cl00084 homeodomain superfamily C - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#10719 - CGI_10013607 superfamily 247856 224 290 2.70E-16 71.4249 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands.
Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#10719 - CGI_10013607 superfamily 247856 158 213 3.33E-10 54.8613 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#10724 - CGI_10008493 superfamily 241645 1 76 2.05E-51 159.65 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#10724 - CGI_10008493 superfamily 242019 77 128 9.90E-24 87.7376 cl00671 Ribosomal_L40e superfamily - - Ribosomal L40e family; Bovine L40 has been identified as a secondary RNA binding protein. L40 is fused to a ubiquitin protein. 
Q#10725 - CGI_10008494 superfamily 242080 234 426 2.88E-43 151.506 cl00771 zpr1_rel superfamily - - "ZPR1-related zinc finger protein; This model describes a strictly archaeal family homologous to the domain duplicated in the eukaryotic zinc-binding protein ZPR1. ZPR1 was shown experimentally to bind approximately two moles of zinc; each copy of the domain contains a putative zinc finger of the form CXXCX(25)CXXC. ZPR1 binds the tyrosine kinase domain of epidermal growth factor receptor, but is displaced by receptor activation and autophosphorylation after which it redistributes in part to the nucleus. The proteins described by this model by analogy may be suggested to play a role in signal transduction. A model ZPR1_znf (TIGR00310) has been created to describe the domain shared by this protein and ZPR1 [Unknown function, General]." Q#10725 - CGI_10008494 superfamily 242080 32 188 1.05E-21 91.4144 cl00771 zpr1_rel superfamily - - "ZPR1-related zinc finger protein; This model describes a strictly archaeal family homologous to the domain duplicated in the eukaryotic zinc-binding protein ZPR1. ZPR1 was shown experimentally to bind approximately two moles of zinc; each copy of the domain contains a putative zinc finger of the form CXXCX(25)CXXC. ZPR1 binds the tyrosine kinase domain of epidermal growth factor receptor, but is displaced by receptor activation and autophosphorylation after which it redistributes in part to the nucleus. The proteins described by this model by analogy may be suggested to play a role in signal transduction. A model ZPR1_znf (TIGR00310) has been created to describe the domain shared by this protein and ZPR1 [Unknown function, General]." Q#10726 - CGI_10008495 superfamily 245814 51 135 7.56E-08 46.208 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets.
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#10727 - CGI_10008496 superfamily 241584 403 495 2.91E-13 67.9067 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#10727 - CGI_10008496 superfamily 245814 328 394 5.75E-10 57.8843 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#10727 - CGI_10008496 superfamily 241584 527 617 1.39E-06 47.8763 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. 
Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#10727 - CGI_10008496 superfamily 245814 43 106 1.23E-13 68.2023 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#10727 - CGI_10008496 superfamily 245814 126 210 1.13E-08 54.0485 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#10727 - CGI_10008496 superfamily 245814 244 299 1.40E-07 50.5206 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. 
The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#10728 - CGI_10008497 superfamily 245936 228 434 8.63E-39 139.607 cl12283 IPK superfamily - - Inositol polyphosphate kinase; ArgRIII has been demonstrated to be an inositol polyphosphate kinase. Q#10729 - CGI_10008498 superfamily 242415 25 220 5.93E-90 275.261 cl01287 AE_Prim_S_like superfamily - - "AE_Prim_S_like: primase domain similar to that found in the small subunit of archaeal and eukaryotic (A/E) DNA primases. The replication machineries of A/Es are distinct from that of bacteria. Primases are DNA-dependent RNA polymerases which synthesize the short RNA primers required for DNA replication. In eukaryotes, this small catalytically active primase subunit (p50) and a larger primase subunit (p60), referred to jointly as the core primase, associate with the B subunit and the DNA polymerase alpha subunit in a complex, called Pol alpha-pri. In addition to its catalytic role in replication, eukaryotic DNA primase may play a role in coupling replication to DNA damage repair and in checkpoint control during S phase. Pfu41 and Pfu46 comprise the primase complex of the archaeon Pyrococcus furiosus; these proteins have sequence identity to the eukaryotic p50 and p60 primase proteins respectively. Pfu41 preferentially uses dNTPs as substrate. Pfu46 regulates the primase activity of Pfu41.
Also found in this group is the primase-polymerase (primpol) domain of replicases from archaeal plasmids including the ORF904 protein of pRN1 from Sulfolobus islandicus (pRN1 primpol). The pRN1 primpol domain exhibits DNA polymerase and primase activities; a cluster of active site residues (three acidic residues, and a histidine) is required for both these activities. The pRN1 primpol primase activity prefers dNTPs to rNTPs; however incorporation of dNTPs requires rNTP as cofactor. This group also includes the Pol domain of bacterial LigD proteins such as Mycobacterium tuberculosis (Mt)LigD. MtLigD contains an N-terminal Pol domain, a central phosphoesterase module, and a C-terminal ligase domain. LigD Pol plays a role in non-homologous end joining (NHEJ)-mediated repair of DNA double-strand breaks (DSB) in vivo, perhaps by filling in short 5'-overhangs with ribonucleotides; the filled in termini would be sealed by the associated LigD ligase domain. The MtLigD Pol domain is stimulated by manganese, is error-prone, and prefers adding rNTPs to dNTPs in vitro." Q#10729 - CGI_10008498 superfamily 242415 303 343 4.80E-20 87.6691 cl01287 AE_Prim_S_like superfamily N - "AE_Prim_S_like: primase domain similar to that found in the small subunit of archaeal and eukaryotic (A/E) DNA primases. The replication machineries of A/Es are distinct from that of bacteria. Primases are DNA-dependent RNA polymerases which synthesize the short RNA primers required for DNA replication. In eukaryotes, this small catalytically active primase subunit (p50) and a larger primase subunit (p60), referred to jointly as the core primase, associate with the B subunit and the DNA polymerase alpha subunit in a complex, called Pol alpha-pri. In addition to its catalytic role in replication, eukaryotic DNA primase may play a role in coupling replication to DNA damage repair and in checkpoint control during S phase.
Pfu41 and Pfu46 comprise the primase complex of the archaeon Pyrococcus furiosus; these proteins have sequence identity to the eukaryotic p50 and p60 primase proteins respectively. Pfu41 preferentially uses dNTPs as substrate. Pfu46 regulates the primase activity of Pfu41. Also found in this group is the primase-polymerase (primpol) domain of replicases from archaeal plasmids including the ORF904 protein of pRN1 from Sulfolobus islandicus (pRN1 primpol). The pRN1 primpol domain exhibits DNA polymerase and primase activities; a cluster of active site residues (three acidic residues, and a histidine) is required for both these activities. The pRN1 primpol primase activity prefers dNTPs to rNTPs; however incorporation of dNTPs requires rNTP as cofactor. This group also includes the Pol domain of bacterial LigD proteins such as Mycobacterium tuberculosis (Mt)LigD. MtLigD contains an N-terminal Pol domain, a central phosphoesterase module, and a C-terminal ligase domain. LigD Pol plays a role in non-homologous end joining (NHEJ)-mediated repair of DNA double-strand breaks (DSB) in vivo, perhaps by filling in short 5'-overhangs with ribonucleotides; the filled in termini would be sealed by the associated LigD ligase domain. The MtLigD Pol domain is stimulated by manganese, is error-prone, and prefers adding rNTPs to dNTPs in vitro." Q#10730 - CGI_10008499 superfamily 245201 61 323 9.73E-154 435.006 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins."
Q#10731 - CGI_10002721 superfamily 248097 1 118 2.73E-20 80.387 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#10732 - CGI_10002722 superfamily 242232 116 173 3.14E-12 59.2126 cl00984 TM2 superfamily C - "TM2 domain; This family is composed of a pair of transmembrane alpha helices connected by a short linker. The function of this domain is unknown; however, it occurs in a wide range of protein contexts." Q#10733 - CGI_10001621 superfamily 241563 60 102 6.44E-06 43.622 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#10733 - CGI_10001621 superfamily 241563 8 53 0.00383141 35.7759 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#10736 - CGI_10002173 superfamily 248012 170 282 4.42E-17 78.0032 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#10736 - CGI_10002173 superfamily 248012 376 469 1.74E-08 52.9653 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#10737 - CGI_10025516 superfamily 177822 1 189 1.82E-16 75.3417 cl18088 PLN02164 superfamily N - sulfotransferase Q#10738 - CGI_10025517 superfamily 247725 123 247 1.12E-65 209.064 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM.
All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting proteins to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and other proteins." Q#10745 - CGI_10025524 superfamily 247999 1160 1204 1.58E-15 74.4491 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#10745 - CGI_10025524 superfamily 219408 402 484 0.00482452 39.7182 cl06454 DUF1510 superfamily C - Protein of unknown function (DUF1510); This family consists of several hypothetical bacterial proteins of around 200 residues in length. The function of this family is unknown. Q#10747 - CGI_10025526 superfamily 190706 73 278 1.19E-20 90.9232 cl04201 Glyco_hydro_79n superfamily N - "Glycosyl hydrolase family 79, N-terminal domain; Family of endo-beta-N-glucuronidase, or heparanase. Heparan sulfate proteoglycans (HSPGs) play a key role in the self-assembly, insolubility and barrier properties of basement membranes and extracellular matrices. Hence, cleavage of heparan sulfate (HS) affects the integrity and functional state of tissues and thereby fundamental normal and pathological phenomena involving cell migration and response to changes in the extracellular micro-environment. Heparanase degrades HS at specific intra-chain sites. The enzyme is synthesised as a latent approximately 65 kDa protein that is processed at the N-terminus into a highly active approximately 50 kDa form. Experimental evidence suggests that heparanase may facilitate both tumour cell invasion and neovascularization, both critical steps in cancer progression. The enzyme is also involved in cell migration associated with inflammation and autoimmunity."
Q#10748 - CGI_10025527 superfamily 241628 48 239 3.33E-61 200.147 cl00130 PseudoU_synth superfamily - - "Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi); Pseudouridine synthases contain the RsuA/RluD, TruA, TruB and TruD families. This group consists of eukaryotic, bacterial and archaeal pseudouridine synthases. Some psi sites such as psi55, 13, 38 and 39 in tRNA are highly conserved, being in the same position in eubacteria, archaebacteria and eukaryotes. Other psi sites occur in a more restricted fashion, for example psi2604 in 23S RNA made by E. coli RluF has only been detected in E. coli. Human dyskerin with the help of guide RNAs makes the hundreds of pseudouridines present in rRNA and small nuclear RNAs (snRNAs). Mutations in human dyskerin cause X-linked dyskeratosis congenita. Missense mutation in human PUS1 causes mitochondrial myopathy and sideroblastic anemia (MLASA)." Q#10749 - CGI_10025528 superfamily 241622 825 862 0.000319112 40.2427 cl00117 PDZ superfamily N - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either an N- or C-terminal beta-strand forming the peptide-binding groove base.
The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95 (post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#10749 - CGI_10025528 superfamily 247725 7 78 0.000266361 40.6097 cl17171 PH-like superfamily N - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting proteins to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and other proteins." Q#10753 - CGI_10025532 superfamily 246597 2 265 2.68E-77 242.977 cl13995 MPP_superfamily superfamily - - "metallophosphatase superfamily, metallophosphatase domain; Metallophosphatases (MPPs), also known as metallophosphoesterases, phosphodiesterases (PDEs), binuclear metallophosphoesterases, and dimetal-containing phosphoesterases (DMPs), represent a diverse superfamily of enzymes with a conserved domain containing an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. This superfamily includes: the phosphoprotein phosphatases (PPPs), Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases).
The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination." Q#10755 - CGI_10025534 superfamily 241608 21 91 1.27E-45 148.91 cl00098 KH-II superfamily - - "KH-II (K homology RNA-binding domain, type II). KH binds single-stranded RNA or DNA. It is found in a wide variety of proteins including ribosomal proteins (e.g. ribosomal protein S3), transcription factors (e.g. NusA_K), and post-transcriptional modifiers of mRNA (e.g. hnRNP K). There are two different KH domains that belong to different protein folds, but they share a single KH motif. The KH motif is a beta-alpha-alpha-beta-beta unit that folds into an alpha-beta structure with a three-stranded beta-sheet interrupted by two contiguous helices. In addition to their KH core domain, KH-II proteins have an N-terminal alpha helical extension while KH-I proteins have a C-terminal alpha helical extension." Q#10755 - CGI_10025534 superfamily 215779 102 186 9.70E-24 91.8035 cl02819 Ribosomal_S3_C superfamily - - "Ribosomal protein S3, C-terminal domain; This family contains a central domain pfam00013, hence the amino and carboxyl terminal domains are stored separately. This is a minimal carboxyl-terminal domain. Some are much longer." Q#10756 - CGI_10025535 superfamily 241763 28 170 1.43E-25 98.465 cl00298 Peptidase_C1 superfamily C - "C1 Peptidase family (MEROPS database nomenclature), also referred to as the papain family; composed of two subfamilies of cysteine peptidases (CPs), C1A (papain) and C1B (bleomycin hydrolase). Papain-like enzymes are mostly endopeptidases with some exceptions like cathepsins B, C, H and X, which are exopeptidases. Papain-like CPs have different functions in various organisms. Plant CPs are used to mobilize storage proteins in seeds while mammalian CPs are primarily lysosomal enzymes responsible for protein degradation in the lysosome.
Papain-like CPs are synthesized as inactive proenzymes with N-terminal propeptide regions, which are removed upon activation. Bleomycin hydrolase (BH) is a CP that detoxifies bleomycin by hydrolysis of an amide group. It acts as a carboxypeptidase on its C-terminus to convert itself into an aminopeptidase and peptide ligase. BH is found in all tissues in mammals as well as in many other eukaryotes. It forms a hexameric ring barrel structure with the active sites imbedded in the central channel. Some members of the C1 family are proteins classified as non-peptidase homologs which lack peptidase activity or have missing active site residues." Q#10757 - CGI_10025536 superfamily 247068 2198 2294 3.11E-31 121.652 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10757 - CGI_10025536 superfamily 247068 1156 1255 1.49E-27 110.866 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10757 - CGI_10025536 superfamily 247068 5 102 3.89E-27 109.71 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10757 - CGI_10025536 superfamily 247068 1996 2085 3.29E-26 107.014 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10757 - CGI_10025536 superfamily 247068 2093 2190 1.93E-24 102.007 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10757 - CGI_10025536 superfamily 247068 423 517 2.24E-24 101.621 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10757 - CGI_10025536 superfamily 247907 2718 2868 2.50E-24 103.267 cl17353 LamG superfamily - - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. 
Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." Q#10757 - CGI_10025536 superfamily 247068 315 413 5.28E-23 97.7693 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10757 - CGI_10025536 superfamily 247068 217 307 3.35E-22 95.4581 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." 
Q#10757 - CGI_10025536 superfamily 247068 1787 1878 2.45E-21 92.7617 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10757 - CGI_10025536 superfamily 247068 625 730 9.85E-21 91.2209 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10757 - CGI_10025536 superfamily 247068 1891 1983 1.22E-19 88.1393 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10757 - CGI_10025536 superfamily 247068 1044 1146 9.38E-19 85.4429 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10757 - CGI_10025536 superfamily 247068 527 616 1.12E-18 85.0577 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10757 - CGI_10025536 superfamily 247068 1365 1442 9.82E-18 82.3613 cl15786 CA_like superfamily C - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10757 - CGI_10025536 superfamily 247068 940 1032 3.11E-17 80.8205 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10757 - CGI_10025536 superfamily 247068 1468 1565 1.00E-16 79.6649 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10757 - CGI_10025536 superfamily 247068 2303 2395 5.52E-16 77.3537 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10757 - CGI_10025536 superfamily 247068 752 829 1.51E-13 70.0349 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10757 - CGI_10025536 superfamily 247068 126 202 2.01E-13 70.0349 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10757 - CGI_10025536 superfamily 247068 839 932 5.83E-13 68.4941 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10757 - CGI_10025536 superfamily 247068 1263 1356 1.85E-10 61.1753 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10757 - CGI_10025536 superfamily 245213 2990 3022 6.05E-10 58.417 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#10757 - CGI_10025536 superfamily 247068 1679 1754 5.90E-08 53.4714 cl15786 CA_like superfamily C - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10757 - CGI_10025536 superfamily 245213 2946 2984 2.27E-06 47.6314 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#10757 - CGI_10025536 superfamily 247068 2419 2491 0.000486948 41.5302 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." 
Q#10757 - CGI_10025536 superfamily 245213 2912 2944 0.00321012 38.3866 cl09941 EGF_CA superfamily N - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#10757 - CGI_10025536 superfamily 247068 1576 1667 0.00569888 38.0634 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#10760 - CGI_10025539 superfamily 241734 1 136 7.55E-66 204.355 cl00261 PLPDE_III superfamily N - "Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzymes; The fold type III PLP-dependent enzyme family is predominantly composed of two-domain proteins with similarity to bacterial alanine racemases (AR) including eukaryotic ornithine decarboxylases (ODC), prokaryotic diaminopimelate decarboxylases (DapDC), biosynthetic arginine decarboxylases (ADC), carboxynorspermidine decarboxylases (CANSDC), and similar proteins. 
AR-like proteins contain an N-terminal PLP-binding TIM-barrel domain and a C-terminal beta-sandwich domain. They exist as homodimers with active sites that lie at the interface between the TIM barrel domain of one subunit and the beta-sandwich domain of the other subunit. These proteins play important roles in the biosynthesis of amino acids and polyamine. The family also includes the single-domain YBL036c-like proteins, which contain a single PLP-binding TIM-barrel domain without any N- or C-terminal extensions. Due to the lack of a second domain, these proteins may possess only limited D- to L-alanine racemase activity or non-specific racemase activity." Q#10761 - CGI_10025540 superfamily 241547 71 199 9.63E-33 120.851 cl00012 alpha_CA superfamily C - "Carbonic anhydrase alpha (vertebrate-like) group. Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionary distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidine residues and a fourth conserved histidine plays a potential role in proton transfer." Q#10762 - CGI_10025541 superfamily 245202 116 217 1.14E-28 113.081 cl09927 S1_like superfamily - - "S1_like: Ribosomal protein S1-like RNA-binding domain. Found in a wide variety of RNA-associated proteins. Originally identified in S1 ribosomal protein. This superfamily also contains the Cold Shock Domain (CSD), which is a homolog of the S1 domain. 
Both domains are members of the Oligonucleotide/oligosaccharide Binding (OB) fold." Q#10762 - CGI_10025541 superfamily 243034 1387 1481 3.74E-05 43.908 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#10762 - CGI_10025541 superfamily 245202 588 657 2.89E-19 84.9699 cl09927 S1_like superfamily - - "S1_like: Ribosomal protein S1-like RNA-binding domain. Found in a wide variety of RNA-associated proteins. Originally identified in S1 ribosomal protein. This superfamily also contains the Cold Shock Domain (CSD), which is a homolog of the S1 domain. Both domains are members of the Oligonucleotide/oligosaccharide Binding (OB) fold." 
Q#10762 - CGI_10025541 superfamily 245202 500 566 2.92E-17 79.2045 cl09927 S1_like superfamily - - "S1_like: Ribosomal protein S1-like RNA-binding domain. Found in a wide variety of RNA-associated proteins. Originally identified in S1 ribosomal protein. This superfamily also contains the Cold Shock Domain (CSD), which is a homolog of the S1 domain. Both domains are members of the Oligonucleotide/oligosaccharide Binding (OB) fold." Q#10762 - CGI_10025541 superfamily 245202 731 800 1.96E-11 62.6053 cl09927 S1_like superfamily - - "S1_like: Ribosomal protein S1-like RNA-binding domain. Found in a wide variety of RNA-associated proteins. Originally identified in S1 ribosomal protein. This superfamily also contains the Cold Shock Domain (CSD), which is a homolog of the S1 domain. Both domains are members of the Oligonucleotide/oligosaccharide Binding (OB) fold." Q#10762 - CGI_10025541 superfamily 245202 412 480 6.20E-11 60.7391 cl09927 S1_like superfamily - - "S1_like: Ribosomal protein S1-like RNA-binding domain. Found in a wide variety of RNA-associated proteins. Originally identified in S1 ribosomal protein. This superfamily also contains the Cold Shock Domain (CSD), which is a homolog of the S1 domain. Both domains are members of the Oligonucleotide/oligosaccharide Binding (OB) fold." Q#10762 - CGI_10025541 superfamily 245202 1011 1083 1.33E-09 56.9765 cl09927 S1_like superfamily - - "S1_like: Ribosomal protein S1-like RNA-binding domain. Found in a wide variety of RNA-associated proteins. Originally identified in S1 ribosomal protein. This superfamily also contains the Cold Shock Domain (CSD), which is a homolog of the S1 domain. Both domains are members of the Oligonucleotide/oligosaccharide Binding (OB) fold." Q#10762 - CGI_10025541 superfamily 245202 228 311 5.07E-09 55.3337 cl09927 S1_like superfamily - - "S1_like: Ribosomal protein S1-like RNA-binding domain. Found in a wide variety of RNA-associated proteins. Originally identified in S1 ribosomal protein. 
This superfamily also contains the Cold Shock Domain (CSD), which is a homolog of the S1 domain. Both domains are members of the Oligonucleotide/oligosaccharide Binding (OB) fold." Q#10762 - CGI_10025541 superfamily 245202 329 389 2.09E-08 53.0497 cl09927 S1_like superfamily - - "S1_like: Ribosomal protein S1-like RNA-binding domain. Found in a wide variety of RNA-associated proteins. Originally identified in S1 ribosomal protein. This superfamily also contains the Cold Shock Domain (CSD), which is a homolog of the S1 domain. Both domains are members of the Oligonucleotide/oligosaccharide Binding (OB) fold." Q#10762 - CGI_10025541 superfamily 245202 678 730 4.46E-07 49.3933 cl09927 S1_like superfamily C - "S1_like: Ribosomal protein S1-like RNA-binding domain. Found in a wide variety of RNA-associated proteins. Originally identified in S1 ribosomal protein. This superfamily also contains the Cold Shock Domain (CSD), which is a homolog of the S1 domain. Both domains are members of the Oligonucleotide/oligosaccharide Binding (OB) fold." 
Q#10763 - CGI_10025542 superfamily 247792 901 942 5.44E-10 57.4556 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#10763 - CGI_10025542 superfamily 247684 1071 1319 2.50E-93 310.011 cl17037 NBD_sugar-kinase_HSP70_actin superfamily N - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. 
The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#10763 - CGI_10025542 superfamily 241554 491 660 2.02E-31 123.149 cl00019 Macro superfamily - - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#10763 - CGI_10025542 superfamily 193607 975 1085 1.18E-22 96.4869 cl15237 Deltex_C superfamily - - "Domain found at the C-terminus of deltex-like; The deltex family of proteins is involved in the regulation of Notch signaling, and therefore may play roles in cell-to-cell communications that regulate mechanisms determining cell fate. They have a central RING-type zinc finger domain and contain a C-terminal domain, described here, that is also found in other domain architectures. Deltex-1 (DTX1) contains a RING finger and two WWE domains, indicating that it may be an E3 ubiquitin ligase. Human deltex 3-like, which contains an additional N-terminal domain (presumably with ubiquitin ligase activity) is also described as E3 ubiquitin-protein ligase DTX3L, B-lymphoma- and BAL-associated protein (BBAP), or rhysin-2. DTX3L mediates monoubiquitination of K91 of histone H4 in response to DNA damage." 
Q#10763 - CGI_10025542 superfamily 247723 3 72 4.67E-05 43.0597 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#10764 - CGI_10025543 superfamily 248012 63 149 3.31E-17 77.2328 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#10765 - CGI_10025544 superfamily 216686 70 255 5.24E-52 171.352 cl18377 Galactosyl_T superfamily - - "Galactosyltransferase; This family includes the galactosyltransferases UDP-galactose:2-acetamido-2-deoxy-D-glucose3beta-galactosyltransferase and UDP-Gal:beta-GlcNAc beta 1,3-galactosyltranferase. Specific galactosyltransferases transfer galactose to GlcNAc terminal chains in the synthesis of the lacto-series oligosaccharides types 1 and 2." Q#10767 - CGI_10025546 superfamily 217062 142 400 1.22E-42 152.809 cl12266 Branch superfamily - - "Core-2/I-Branching enzyme; This is a family of two different beta-1,6-N-acetylglucosaminyltransferase enzymes, I-branching enzyme and core-2 branching enzyme . I-branching enzyme is responsible for the production of the blood group I-antigen during embryonic development. 
Core-2 branching enzyme forms crucial side-chain branches in O-glycans." Q#10768 - CGI_10025547 superfamily 241749 33 167 2.62E-22 88.2117 cl00280 globin_like superfamily - - superfamily containing globins and truncated hemoglobins Q#10770 - CGI_10025549 superfamily 241983 17 323 3.36E-45 157.905 cl00614 ADP_ribosyl_GH superfamily - - "ADP-ribosylglycohydrolase; This family includes enzymes that ADP-ribosylations, for example ADP-ribosylarginine hydrolase EC:3.2.2.19 cleaves ADP-ribose-L-arginine. The family also includes dinitrogenase reductase activating glycohydrolase. Most surprisingly the family also includes jellyfish crystallins, these proteins appear to have lost the presumed active site residues." Q#10776 - CGI_10025555 superfamily 247792 315 357 1.06E-05 42.4328 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#10776 - CGI_10025555 superfamily 221597 97 242 9.61E-14 67.3805 cl13864 GIDE superfamily - - "E3 Ubiquitin ligase; This domain family is found in bacteria, archaea and eukaryotes, and is typically between 150 and 163 amino acids in length. There is a single completely conserved residue E that may be functionally important. GIDE is an E3 ubiquitin ligase which is involved in inducing apoptosis." Q#10777 - CGI_10025556 superfamily 247741 13 343 0 636.214 cl17187 Aldolase_Class_I superfamily - - "Class I aldolases; Class I aldolases. 
The class I aldolases use an active-site lysine which stabilizes a reaction intermediates via Schiff base formation, and have TIM beta/alpha barrel fold. The members of this family include 2-keto-3-deoxy-6-phosphogluconate (KDPG) and 2-keto-4-hydroxyglutarate (KHG) aldolases, transaldolase, dihydrodipicolinate synthase sub-family, Type I 3-dehydroquinate dehydratase, DeoC and DhnA proteins, and metal-independent fructose-1,6-bisphosphate aldolase. Although structurally similar, the class II aldolases use a different mechanism and are believed to have an independent evolutionary origin." Q#10778 - CGI_10025557 superfamily 247805 596 744 6.58E-19 86.2372 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#10778 - CGI_10025557 superfamily 247905 942 1066 1.49E-18 84.982 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#10779 - CGI_10025558 superfamily 241832 4 75 2.97E-31 108.099 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. 
They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#10779 - CGI_10025558 superfamily 243175 89 114 3.15E-08 47.5931 cl02776 GST_C_family superfamily C - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. 
Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." Q#10782 - CGI_10025561 superfamily 241852 845 1100 6.22E-86 278.297 cl00416 CS_ACL-C_CCL superfamily - - "Citrate synthase (CS), citryl-CoA lyase (CCL), the C-terminal portion of the single-subunit type ATP-citrate lyase (ACL) and the C-terminal portion of the large subunit of the two-subunit type ACL. CS catalyzes the condensation of acetyl coenzyme A (AcCoA) and oxalacetate (OAA) from citrate and coenzyme A (CoA), the first step in the oxidative citric acid cycle (TCA or Krebs cycle). Peroxisomal CS is involved in the glyoxylate cycle. Some CS proteins function as a 2-methylcitrate synthase (2MCS). 2MCS catalyzes the condensation of propionyl-CoA (PrCoA) and OAA to form 2-methylcitrate and CoA during propionate metabolism. CCL cleaves citryl-CoA (CiCoA) to AcCoA and OAA. ACLs catalyze an ATP- and a CoA- dependant cleavage of citrate to form AcCoA and OAA; they do this in a multistep reaction, the final step of which is likely to involve the cleavage of CiCoA to generate AcCoA and OAA. The overall CS reaction is thought to proceed through three partial reactions and involves both closed and open conformational forms of the enzyme: a) the carbanion or equivalent is generated from AcCoA by base abstraction of a proton, b) the nucleophilic attack of this carbanion on OAA to generate CiCoA, and c) the hydrolysis of CiCoA to produce citrate and CoA. 
This group contains proteins which functions exclusively as either a CS or a 2MCS, as well as those with relaxed specificity which have dual functions as both a CS and a 2MCS. There are two types of CSs: type I CS and type II CSs. Type I CSs are found in eukarya, gram-positive bacteria, archaea, and in some gram-negative bacteria and are homodimers with both subunits participating in the active site. Type II CSs are unique to gram-negative bacteria and are homohexamers of identical subunits (approximated as a trimer of dimers). Some type II CSs are strongly and specifically inhibited by NADH through an allosteric mechanism. In fungi, yeast, plants, and animals ACL is cytosolic and generates AcCoA for lipogenesis. In several groups of autotrophic prokaryotes and archaea, ACL carries out the citrate-cleavage reaction of the reductive tricarboxylic acid (rTCA) cycle. In the family Aquificaceae this latter reaction in the rTCA cycle is carried out via a two enzyme system the second enzyme of which is CCL." Q#10782 - CGI_10025561 superfamily 215988 649 773 8.98E-23 96.1716 cl18355 Ligase_CoA superfamily - - "CoA-ligase; This family includes the CoA ligases Succinyl-CoA synthetase alpha and beta chains, malate CoA ligase and ATP-citrate lyase. Some members of the family utilise ATP others use GTP." Q#10782 - CGI_10025561 superfamily 247910 483 589 4.34E-07 49.0938 cl17356 CoA_binding superfamily - - "CoA binding domain; This domain has a Rossmann fold and is found in a number of proteins including succinyl CoA synthetases, malate and ATP-citrate ligases." Q#10782 - CGI_10025561 superfamily 219843 6 203 2.45E-05 45.3077 cl18528 ATP-grasp_2 superfamily - - ATP-grasp domain; ATP-grasp domain. 
Q#10784 - CGI_10025563 superfamily 220097 46 120 2.26E-05 39.6945 cl08518 Phospholip_A2_3 superfamily N - "Prokaryotic phospholipase A2; The prokaryotic phospholipase A2 domain is predominantly found in bacterial and fungal phospholipases, as well as various hypothetical and putative proteins. It enables the liberation of fatty acids and lysophospholipid by hydrolysing the 2-ester bond of 1,2-diacyl-3-sn-phosphoglycerides. The domain adopts an alpha-helical secondary structure, consisting of five alpha-helices and two helical segments." Q#10789 - CGI_10025569 superfamily 241578 199 364 2.05E-21 92.278 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#10789 - CGI_10025569 superfamily 217211 413 479 2.72E-07 48.8198 cl03691 Cache_1 superfamily - - Cache domain; Cache domain. Q#10790 - CGI_10025570 superfamily 241578 181 286 3.90E-15 74.3529 cl00057 vWFA superfamily C - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). 
Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#10790 - CGI_10025570 superfamily 217211 364 430 1.93E-06 46.5086 cl03691 Cache_1 superfamily - - Cache domain; Cache domain. Q#10791 - CGI_10025571 superfamily 241578 164 320 1.99E-19 85.5237 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. 
Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#10791 - CGI_10025571 superfamily 217211 368 434 1.38E-07 48.8198 cl03691 Cache_1 superfamily - - Cache domain; Cache domain. Q#10793 - CGI_10025574 superfamily 217293 9 47 0.00113666 34.9159 cl03788 Neur_chan_LBD superfamily C - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#10794 - CGI_10025575 superfamily 217293 26 224 1.45E-45 152.787 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#10796 - CGI_10025577 superfamily 241563 18 50 0.00638228 33.2216 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#10797 - CGI_10025578 superfamily 246680 12 83 3.13E-12 58.6478 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 
They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#10799 - CGI_10025580 superfamily 216339 779 853 2.39E-23 100.152 cl08308 Tub superfamily N - Tub family; Tub family. Q#10799 - CGI_10025580 superfamily 243073 167 198 0.00725627 35.5284 cl02533 SOCS superfamily - - "SOCS (suppressors of cytokine signaling) box. The SOCS box is found in the C-terminal region of CIS/SOCS family proteins (in combination with a SH2 domain), ASBs (ankyrin repeat-containing proteins with a SOCS box), SSBs (SPRY domain-containing proteins with a SOCS box), and WSBs (WD40 repeat-containing proteins with a SOCS box), as well as, other miscellaneous proteins. The function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions." Q#10800 - CGI_10002322 superfamily 245847 130 251 0.000103633 40.1784 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#10802 - CGI_10003360 superfamily 245596 99 320 1.26E-32 123.36 cl11394 Glyco_tranf_GTA_type superfamily - - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. 
Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#10803 - CGI_10003361 superfamily 243072 361 479 3.08E-37 133.663 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#10803 - CGI_10003361 superfamily 243072 288 413 1.80E-35 128.655 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#10803 - CGI_10003361 superfamily 243072 187 314 1.87E-31 117.485 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. 
The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#10804 - CGI_10003362 superfamily 241607 289 327 9.45E-06 43.4126 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chymotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters), which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." 
Q#10805 - CGI_10003363 superfamily 241649 113 194 5.72E-25 94.7864 cl00159 fer2 superfamily C - "2Fe-2S iron-sulfur cluster binding domain. Iron-sulfur proteins play an important role in electron transfer processes and in various enzymatic reactions. The family includes plant and algal ferredoxins, which act as electron carriers in photosynthesis and ferredoxins, which participate in redox chains (from bacteria to mammals). Fold is similar to thioredoxin." Q#10805 - CGI_10003363 superfamily 241649 57 91 7.54E-08 47.4069 cl00159 fer2 superfamily C - "2Fe-2S iron-sulfur cluster binding domain. Iron-sulfur proteins play an important role in electron transfer processes and in various enzymatic reactions. The family includes plant and algal ferredoxins, which act as electron carriers in photosynthesis and ferredoxins, which participate in redox chains (from bacteria to mammals). Fold is similar to thioredoxin." Q#10806 - CGI_10000576 superfamily 241696 41 141 8.83E-56 179.363 cl00218 Glyco_hydrolase_16 superfamily C - "glycosyl hydrolase family 16; The O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A glycosyl hydrolase classification system based on sequence similarity has led to the definition of more than 95 different families including glycosyl hydrolase family 16. Family 16 includes lichenase, xyloglucan endotransglycosylase (XET), beta-agarase, kappa-carrageenase, endo-beta-1,3-glucanase, endo-beta-1,3-1,4-glucanase, and endo-beta-galactosidase, all of which have a conserved jelly roll fold with a deep active site channel harboring the catalytic residues." 
Q#10807 - CGI_10001476 superfamily 246723 1 44 0.000650571 34.8487 cl14813 GluZincin superfamily C - "Peptidase Gluzincin family (thermolysin-like proteinases, TLPs) includes peptidases M1, M2, M3, M4, M13, M32 and M36 (fungalysins); Gluzincin family (thermolysin-like peptidases or TLPs) includes several zinc-dependent metallopeptidases such as the M1, M2, M3, M4, M13, M32, M36 peptidases (MEROPS classification), and contain HEXXH and EXXXD motifs as part of their active site. All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. M1 family includes aminopeptidase N (APN) and leukotriene A4 hydrolase (LTA4H). APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types. LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity such that the two activities occupy different, but overlapping sites. The peptidase M3 or neurolysin-like family, includes M3, M2 and M32 metallopeptidases. The M3 peptidases have two subfamilies: M3A, includes thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (3.4.24.16), and the mitochondrial intermediate peptidase; M3B contains oligopeptidase F. M2 peptidase angiotensin converting enzyme (ACE, EC 3.4.15.1) catalyzes the conversion of decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II. ACE is a key part of the renin-angiotensin system that regulates blood pressure, thus ACE inhibitors are important for the treatment of hypertension. M32 family includes two eukaryotic enzymes from protozoa Trypanosoma cruzi, a causative agent of Chagas' disease, and Leishmania major, a parasite that causes leishmaniasis, making them attractive targets for drug development. 
The M4 family includes secreted protease thermolysin (EC 3.4.24.27), pseudolysin, aureolysin, neutral protease as well as fungalysin and bacillolysin (EC 3.4.24.28) that degrade extracellular proteins and peptides for bacterial nutrition, especially prior to sporulation. Thermolysin is widely used as a nonspecific protease to obtain fragments for peptide sequencing as well as in production of the artificial sweetener aspartame. M13 family includes neprilysin (EC 3.4.24.11) and endothelin-converting enzyme I (ECE-1, EC 3.4.24.71), which fulfill a broad range of physiological roles due to the greater variation in the S2' subsite allowing substrate specificity and are prime therapeutic targets for selective inhibition. Peptidase M36 (fungamysin) family includes endopeptidases from pathogenic fungi. Fungalysin hydrolyzes extracellular matrix proteins such as elastin and keratin. Aspergillus fumigatus causes the pulmonary disease aspergillosis by invading the lungs of immuno-compromised animals and secreting fungalysin that possibly breaks down proteinaceous structural barriers." Q#10808 - CGI_10019283 superfamily 247723 30 121 7.02E-52 165.562 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. 
The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#10811 - CGI_10019286 superfamily 207684 8 42 0.000367264 36.588 cl02640 SAP superfamily - - "SAP domain; The SAP (after SAF-A/B, Acinus and PIAS) motif is a putative DNA/RNA binding domain found in diverse nuclear and cytoplasmic proteins." Q#10813 - CGI_10019288 superfamily 241618 48 81 2.26E-05 39.9451 cl00111 PAH superfamily - - "Pancreatic Hormone domain, a regulator of pancreatic and gastrointestinal functions; neuropeptide Y (NPY), peptide YY (PYY), and pancreatic polypeptide (PP) are closely related; propeptide is enzymatically cleaved to yield the mature active peptide with amidated C-terminal ends; receptor binding and activation functions may reside in the N- and C-termini respectively; occurs in neurons, intestinal endocrine cells, and pancreas; exist as monomers and dimers" Q#10814 - CGI_10019289 superfamily 145281 334 449 0.00508723 37.1877 cl03405 Glyco_hydro_45 superfamily C - Glycosyl hydrolase family 45; Glycosyl hydrolase family 45. Q#10816 - CGI_10019291 superfamily 247099 56 437 0 544.703 cl15845 MntH superfamily - - Mn2+ and Fe2+ transporters of the NRAMP family [Inorganic ion transport and metabolism] Q#10817 - CGI_10019292 superfamily 245201 22 255 1.68E-69 224.723 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." 
Q#10817 - CGI_10019292 superfamily 247057 333 391 3.90E-15 70.6124 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#10820 - CGI_10019295 superfamily 216276 18 93 0.00045377 37.1423 cl15639 Activin_recp superfamily - - "Activin types I and II receptor domain; This Pfam entry consists of both TGF-beta receptor types. This is an alignment of the hydrophilic cysteine-rich ligand-binding domains. Both receptor types (type I and II) possess a 9 amino acid cysteine box, with the consensus CCX{4-5}CN. The type I receptors also possess 7 extracellular residues preceding the cysteine box." Q#10821 - CGI_10019296 superfamily 241563 76 114 0.00042625 37.844 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#10823 - CGI_10019298 superfamily 243061 546 637 4.89E-20 86.627 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. 
Q#10823 - CGI_10019298 superfamily 243061 655 747 1.05E-19 85.5962 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#10825 - CGI_10019300 superfamily 243035 112 199 3.02E-06 44.1478 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. 
C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#10827 - CGI_10019302 superfamily 248061 216 301 3.98E-07 47.0082 cl17507 LbR-like superfamily N - "Left-handed beta-roll, including virulence factors and various other proteins; This family contains a variety of protein domains with a left-handed beta-roll structure including cell surface adhesion proteins, bacterial virulence factors, and ice-binding proteins, and other activities. UspA1 Head And Neck Domain and YadA of Yersinia are part of a class of pathogenicity factors that act as cell surface adhesion molecules, in which N-terminal head and neck domains extend from the bacterial outer membrane. The UspA1 head domain of Moraxella catarrhalis, is formed from trimeric beta-rolls of 14-16 amino acid repeats. 
The UspA1 head domain connects to a neck region of large extended, charged loops that may be ligand binding, which is in turn connected to an extended coiled coil domain that tethers the head and neck region to the cell surface via a transmembrane region. This family also includes the collagen-binding virulence factor YadA, an adhesion protein of several Yersinia species, and related cell surface proteins. The collagen-binding portion is found in the hydrophobic N-terminal region. YadA forms a matrix on the bacterial outer membrane, which mediates binding to collagen and epithelial cells. YadA inhibits the complement-activating pathway with the coating of the cell surface with factor H, which impedes C3b molecules. The ice-binding protein of the grass Lolium perenne (LpIBP) discourages the recrystallization of ice. Ice-binding proteins produced by organisms to prevent the growing of ice are termed anti-freeze proteins. LpIBP consists of an unusual left-handed beta roll. Ice-binding is mediated by a flat beta-sheet on one side of the helix. These domains form a left handed beta roll made up of a series of short repeated elements." Q#10828 - CGI_10019303 superfamily 243134 205 313 4.22E-22 89.2456 cl02663 Fasciclin superfamily - - "Fasciclin domain; This extracellular domain is found repeated four times in grasshopper fasciclin I as well as in proteins from mammals, sea urchins, plants, yeast and bacteria." Q#10828 - CGI_10019303 superfamily 243134 56 177 2.98E-21 86.9344 cl02663 Fasciclin superfamily - - "Fasciclin domain; This extracellular domain is found repeated four times in grasshopper fasciclin I as well as in proteins from mammals, sea urchins, plants, yeast and bacteria." Q#10829 - CGI_10019304 superfamily 241600 1 68 2.32E-28 101.933 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. 
The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#10832 - CGI_10019307 superfamily 243092 216 436 1.29E-07 51.5668 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#10834 - CGI_10019309 superfamily 243078 16 145 3.07E-71 229.56 cl02544 VHS_ENTH_ANTH superfamily - - "VHS, ENTH and ANTH domain superfamily; composed of proteins containing a VHS, ENTH or ANTH domain. 
The VHS domain is present in Vps27 (Vacuolar Protein Sorting), Hrs (Hepatocyte growth factor-regulated tyrosine kinase substrate) and STAM (Signal Transducing Adaptor Molecule). It is located at the N-termini of proteins involved in intracellular membrane trafficking. The epsin N-terminal homology (ENTH) domain is an evolutionarily conserved protein module found primarily in proteins that participate in clathrin-mediated endocytosis. A set of proteins previously designated as harboring an ENTH domain in fact contains a highly similar, yet unique module referred to as an AP180 N-terminal homology (ANTH) domain. VHS, ENTH and ANTH domains are structurally similar and are composed of a superhelix of eight alpha helices. ENTH and ANTH (E/ANTH) domains bind both inositol phospholipids and proteins and contribute to the nucleation and formation of clathrin coats on membranes. ENTH domains also function in the development of membrane curvature through lipid remodeling during the formation of clathrin-coated vesicles. E/ANTH domain-bearing proteins have recently been shown to function with adaptor protein-1 and GGA adaptors at the trans-Golgi network, which suggests that E/ANTH domains are universal components of the machinery for clathrin-mediated membrane budding." Q#10834 - CGI_10019309 superfamily 243521 567 682 1.32E-23 97.0378 cl03759 Alpha_adaptinC2 superfamily - - "Adaptin C-terminal domain; Alpha adaptin is a heterotetramer which regulates clathrin-bud formation. The carboxyl-terminal appendage of the alpha subunit regulates translocation of endocytic accessory proteins to the bud site. This ig-fold domain is found in alpha, beta and gamma adaptins." Q#10834 - CGI_10019309 superfamily 190532 206 296 2.44E-19 84.2275 cl03906 GAT superfamily - - "GAT domain; The GAT domain is responsible for binding of GGA proteins to several members of the ARF family including ARF1 and ARF3. 
The GAT domain stabilises membrane bound ARF1 in its GTP bound state, by interfering with GAP proteins." Q#10835 - CGI_10019310 superfamily 199156 60 74 0.00563552 33.9669 cl15298 zf-CCHC superfamily - - "Zinc knuckle; The zinc knuckle is a zinc binding motif composed of the following CX2CX4HX4C where X can be any amino acid. The motifs are mostly from retroviral gag proteins (nucleocapsid). Prototype structure is from HIV. Also contains members involved in eukaryotic gene regulation, such as C. elegans GLH-1. Structure is an 18-residue zinc finger." Q#10836 - CGI_10019311 superfamily 247856 73 132 8.32E-07 47.9277 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#10836 - CGI_10019311 superfamily 243092 349 692 8.71E-25 106.265 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#10836 - CGI_10019311 superfamily 243092 976 1011 7.04E-05 41.9142 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#10838 - CGI_10019313 superfamily 202712 587 628 1.50E-21 89.2418 cl04191 CXC superfamily - - Tesmin/TSO1-like CXC domain; This family includes proteins that have two copies of a cysteine rich motif as follows: C-X-C-X4-C-X3-YC-X-C-X6-C-X3-C-X-C-X2-C. The family includes Tesmin and TSO1. This family is called a CXC domain in. Q#10839 - CGI_10019314 superfamily 242889 268 363 2.64E-19 82.2657 cl02111 PCI superfamily - - "PCI domain; This domain has also been called the PINT motif (Proteasome, Int-6, Nip-1 and TRIP-15)." 
Q#10840 - CGI_10019315 superfamily 221442 44 118 1.38E-07 47.929 cl18607 Hydrolase_4 superfamily - - "Putative lysophospholipase; This domain is found in bacteria and eukaryotes and is approximately 110 amino acids in length. It is found in association with pfam00561. Many members are annotated as being lysophospholipases, and others as alpha-beta hydrolase fold-containing proteins." Q#10840 - CGI_10019315 superfamily 247857 126 161 0.000244831 40.69 cl17303 Esterase_713_like superfamily NC - Novel bacterial esterase that cleaves esters on halogenated cyclic compounds; This family contains proteins similar to a novel bacterial esterase (Alcaligenes esterase 713) with the alpha/beta hydrolase fold but does not contain the GXSXXG pentapeptide around the active site serine residue as commonly seen in other enzymes of this class. Esterase 713 shows negligible sequence homology to other esterase and lipase enzymes. It is active as a dimer and cleaves esters on halogenated cyclic compounds though its natural substrate is unknown. This enzyme is possibly exported from the cytosol to the periplasmic space. A large majority of sequences in this family have yet to be characterized. Q#10842 - CGI_10008536 superfamily 247692 198 556 5.59E-61 206.988 cl17068 AFD_class_I superfamily - - "Adenylate forming domain, Class I; This family includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain." 
Q#10842 - CGI_10008536 superfamily 247692 50 138 8.60E-10 59.4662 cl17068 AFD_class_I superfamily C - "Adenylate forming domain, Class I; This family includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain." Q#10843 - CGI_10008537 superfamily 217293 340 532 9.80E-34 131.216 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#10843 - CGI_10008537 superfamily 217293 1042 1209 1.90E-28 115.808 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#10843 - CGI_10008537 superfamily 217293 707 860 1.57E-24 104.252 cl03788 Neur_chan_LBD superfamily N - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#10843 - CGI_10008537 superfamily 217293 6 168 2.44E-24 103.481 cl03788 Neur_chan_LBD superfamily N - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. 
Q#10843 - CGI_10008537 superfamily 202474 1241 1346 2.20E-08 54.9673 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#10843 - CGI_10008537 superfamily 202474 867 1042 1.82E-06 49.1893 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#10843 - CGI_10008537 superfamily 202474 175 248 0.000943717 41.1001 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#10844 - CGI_10008538 superfamily 248458 50 226 6.80E-28 114.333 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. 
Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#10844 - CGI_10008538 superfamily 248458 291 382 4.91E-05 44.6121 cl17904 MFS superfamily NC - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 
Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#10847 - CGI_10008541 superfamily 247755 397 602 1.88E-83 262.475 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#10847 - CGI_10008541 superfamily 241940 45 284 4.07E-92 289.515 cl00549 ABC_membrane_2 superfamily - - ABC transporter transmembrane region 2; This domain covers the transmembrane of a small family of ABC transporters and shares sequence similarity with pfam00664. Mutations in this domain in human ABCD3 (PMP70) are believed responsible for Zellweger Syndrome-2; mutations in human ABCD1 (ALD) are responsible for recessive X-linked adrenoleukodystrophy. A Saccharomyces cerevisiae homolog is involved in the import of long-chain fatty acids. Q#10848 - CGI_10008542 superfamily 243072 598 715 6.58E-21 90.9058 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#10848 - CGI_10008542 superfamily 243072 861 1005 1.85E-10 60.0898 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#10849 - CGI_10008543 superfamily 241600 55 182 2.52E-39 134.712 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#10850 - CGI_10008544 superfamily 243098 63 111 1.03E-05 44.5111 cl02573 TUDOR superfamily - - "Tudor domains are found in many eukaryotic organisms and have been implicated in protein-protein interactions in which methylated protein substrates bind to these domains. For example, the Tudor domain of Survival of Motor Neuron (SMN) binds to symmetrically dimethylated arginines of arginine-glycine (RG) rich sequences found in the C-terminal tails of Sm proteins. 
The SMN protein is linked to spinal muscular atrophy. Another example is the tandem tudor domains of 53BP1, which bind to histone H4 specifically dimethylated at Lys20 (H4-K20me2). 53BP1 is a key transducer of the DNA damage checkpoint signal." Q#10851 - CGI_10008545 superfamily 247725 127 202 2.49E-08 49.4989 cl17171 PH-like superfamily C - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting proteins to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#10852 - CGI_10008546 superfamily 241958 27 469 2.39E-80 257.444 cl00573 SDF superfamily - - Sodium:dicarboxylate symporter family; Sodium:dicarboxylate symporter family. Q#10854 - CGI_10008548 superfamily 241958 25 346 1.30E-33 128.787 cl00573 SDF superfamily - - Sodium:dicarboxylate symporter family; Sodium:dicarboxylate symporter family. Q#10857 - CGI_10028824 superfamily 241900 3 365 4.85E-158 463.864 cl00490 EEP superfamily - - "Exonuclease-Endonuclease-Phosphatase (EEP) domain superfamily; This large superfamily includes the catalytic domain (exonuclease/endonuclease/phosphatase or EEP domain) of a diverse set of proteins including the ExoIII family of apurinic/apyrimidinic (AP) endonucleases, inositol polyphosphate 5-phosphatases (INPP5), neutral sphingomyelinases (nSMases), deadenylases (such as the vertebrate circadian-clock regulated nocturnin), bacterial cytolethal distending toxin B (CdtB), deoxyribonuclease 1 (DNase1), the endonuclease domain of the non-LTR retrotransposon LINE-1, and related domains. 
These diverse enzymes share a common catalytic mechanism of cleaving phosphodiester bonds; their substrates range from nucleic acids to phospholipids and perhaps proteins." Q#10858 - CGI_10028825 superfamily 241599 165 223 4.81E-19 78.4392 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#10860 - CGI_10028827 superfamily 243152 27 154 4.43E-35 121.242 cl02712 PGRP superfamily - - "Peptidoglycan recognition proteins (PGRPs) are pattern recognition receptors that bind, and in certain cases, hydrolyze peptidoglycans (PGNs) of bacterial cell walls. PGRPs have been divided into three classes: short PGRPs (PGRP-S), that are small (20 kDa) extracellular proteins; intermediate PGRPs (PGRP-I) that are 40-45 kDa and are predicted to be transmembrane proteins; and long PGRPs (PGRP-L), up to 90 kDa, which may be either intracellular or transmembrane. Several structures of PGRPs are known in insects and mammals, some bound with substrates like Muramyl Tripeptide (MTP) or Tracheal Cytotoxin (TCT). The substrate binding site is conserved in PGRP-LCx, PGRP-LE, and PGRP-Ialpha proteins. This family includes Zn-dependent N-Acetylmuramoyl-L-alanine Amidase, EC:3.5.1.28. This enzyme cleaves the amide bond between N-acetylmuramoyl and L-amino acids, preferentially D-lactyl-L-Ala, in bacterial cell walls. The structure for the bacteriophage T7 lysozyme shows that two of the conserved histidines and a cysteine are zinc binding residues. Site-directed mutagenesis of T7 lysozyme indicates that two conserved residues, a Tyr and a Lys, are important for amidase activity." Q#10861 - CGI_10028828 superfamily 243065 491 668 3.28E-05 43.9697 cl02516 VWD superfamily - - von Willebrand factor type D domain; Luciferin-2-monooxygenase from Vargula hilgendorfii contains a vwd domain. 
Its function is unrelated but the similarity is very strong by several methods. Q#10861 - CGI_10028828 superfamily 247724 700 811 0.000205825 42.69 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#10861 - CGI_10028828 superfamily 247724 58 260 0.00614592 37.436 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. 
This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#10863 - CGI_10028830 superfamily 212639 42 1261 0 985.653 cl17018 FANC superfamily - - "Fanconi anemia ID complex proteins FANCI and FANCD2; The Fanconi anemia ID complex consists of two subunits, Fanconi anemia I and Fanconi anemia D2 (FANCI-FANCD2) and plays a central role in the repair of DNA interstrand cross-links (ICLs). The complex is activated via DNA damage-induced phosphorylation by ATR (ataxia telangiectasia and Rad3-related) and monoubiquitination by the FA core complex ubiquitin ligase, and it binds to DNA at the ICL site, recognizing branched DNA structures. Defects in the complex cause Fanconi anemia, a cancer predisposition syndrome." Q#10864 - CGI_10028831 superfamily 191698 1630 1808 2.63E-92 298.271 cl06299 Tcp10_C superfamily - - T-complex protein 10 C-terminus; This family represents the C-terminus (approximately 180 residues) of eukaryotic T-complex protein 10. The T-complex is involved in spermatogenesis in mice. 
Q#10864 - CGI_10028831 superfamily 247739 78 189 5.28E-15 75.7204 cl17185 LPLAT superfamily N - "Lysophospholipid acyltransferases (LPLATs) of glycerophospholipid biosynthesis; Lysophospholipid acyltransferase (LPLAT) superfamily members are acyltransferases of de novo and remodeling pathways of glycerophospholipid biosynthesis. These proteins catalyze the incorporation of an acyl group from either acylCoAs or acyl-acyl carrier proteins (acylACPs) into acceptors such as glycerol 3-phosphate, dihydroxyacetone phosphate or lyso-phosphatidic acid. Included in this superfamily are LPLATs such as glycerol-3-phosphate 1-acyltransferase (GPAT, PlsB), 1-acyl-sn-glycerol-3-phosphate acyltransferase (AGPAT, PlsC), lysophosphatidylcholine acyltransferase 1 (LPCAT-1), lysophosphatidylethanolamine acyltransferase (LPEAT, also known as, MBOAT2, membrane-bound O-acyltransferase domain-containing protein 2), lipid A biosynthesis lauroyl/myristoyl acyltransferase, 2-acylglycerol O-acyltransferase (MGAT), dihydroxyacetone phosphate acyltransferase (DHAPAT, also known as 1 glycerol-3-phosphate O-acyltransferase 1) and Tafazzin (the protein product of the Barth syndrome (TAZ) gene)." Q#10864 - CGI_10028831 superfamily 243130 232 271 0.000635406 39.7559 cl02655 CUE superfamily - - "CUE domain; CUE domains have been shown to bind ubiquitin. It has been suggested that CUE domains are related to pfam00627 and this has been confirmed by the structure of the domain. CUE domains also occur in two protein of the IL-1 signal transduction pathway, tollip and TAB2." Q#10866 - CGI_10028833 superfamily 243061 54 98 6.01E-19 75.4562 cl02509 SRCR superfamily C - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. 
Q#10866 - CGI_10028833 superfamily 243061 1 46 6.14E-14 62.3594 cl02509 SRCR superfamily N - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#10867 - CGI_10028834 superfamily 243034 7 106 1.49E-18 77.8055 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#10868 - CGI_10028835 superfamily 203324 33 151 4.19E-55 180.91 cl18240 UEV superfamily - - "UEV domain; This family includes the eukaryotic tumour susceptibility gene 101 protein (TSG101). Altered transcripts of this gene have been detected in sporadic breast cancers and many other human malignancies. 
However, the involvement of this gene in neoplastic transformation and tumorigenesis is still elusive. TSG101 is required for normal cell function of embryonic and adult tissues, but this gene is not a tumour suppressor for sporadic forms of breast cancer. This family is related to the ubiquitin conjugating enzymes." Q#10868 - CGI_10028835 superfamily 117992 369 415 1.58E-14 68.4264 cl09692 Vps23_core superfamily C - Vps23 core domain; ESCRT complexes form the main machinery driving protein sorting from endosomes to lysosomes. The core domain of the Vps23 subunit of the heterotrimeric ESCRT-I complex is a helical hairpin sandwiched in a fan-like formation between two other helical hairpins from Vps28 (pfam03997) and Vps37. Vps23 gives ESCRT-I complex its stability. Q#10869 - CGI_10028836 superfamily 243035 106 218 3.43E-18 77.2749 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. 
CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#10869 - CGI_10028836 superfamily 111223 1 48 0.00704446 34.2349 cl15893 MgtC superfamily C - MgtC family; The MgtC protein is found in an operon with the Mg2+ transporter protein MgtB. The function of MgtC and its homologues is not known. Q#10872 - CGI_10028839 superfamily 241936 29 121 1.04E-05 43.2107 cl00542 RBFA superfamily - - Ribosome-binding factor A; Ribosome-binding factor A. 
Q#10875 - CGI_10028842 superfamily 247676 67 176 5.36E-41 136.627 cl17012 GINS_A superfamily - - "Alpha-helical domain of GINS complex proteins; Sld5, Psf1, Psf2 and Psf3; The GINS complex is involved in both initiation and elongation stages of eukaryotic chromosome replication, with GINS being the component that most likely serves as the replicative helicase that unwinds duplex DNA ahead of the moving replication fork. In eukaryotes, GINS is a tetrameric arrangement of four subunits Sld5, Psf1, Psf2 and Psf3. The GINS complex has been found in eukaryotes and archaea, but not in bacteria. The four subunits of the complex are homologous and consist of two domains each, termed the alpha-helical (A) and beta-strand (B) domains. The A and B domains of Sld5/Psf1 are permuted with respect to Psf1/Psf3." Q#10876 - CGI_10028843 superfamily 247912 85 356 5.69E-27 111.824 cl17358 Beta-lactamase superfamily C - Beta-lactamase; This family appears to be distantly related to pfam00905 and PF00768 D-alanyl-D-alanine carboxypeptidase. Q#10877 - CGI_10028844 superfamily 247639 99 353 1.57E-51 174.187 cl16914 O-FucT_like superfamily - - "GDP-fucose protein O-fucosyltransferase and related proteins; O-fucosyltransferase-like proteins are GDP-fucose dependent enzymes with similarities to the family 1 glycosyltransferases (GT1). They are soluble ER proteins that may be proteolytically cleaved from a membrane-associated preprotein, and are involved in the O-fucosylation of protein substrates, the core fucosylation of growth factor receptors, and other processes." 
Q#10879 - CGI_10028846 superfamily 247792 55 103 0.00740053 33.9584 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-(N/C/H)-X2-C-X(4-48)-C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#10882 - CGI_10028849 superfamily 221450 1193 1599 8.07E-86 288.859 cl13584 DUF3595 superfamily - - Protein of unknown function (DUF3595); This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 578 and 2525 amino acids in length. Q#10884 - CGI_10028851 superfamily 221450 7 92 7.89E-23 90.0961 cl13584 DUF3595 superfamily N - Protein of unknown function (DUF3595); This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 578 and 2525 amino acids in length. Q#10888 - CGI_10028856 superfamily 241739 8 254 7.39E-126 383.249 cl00268 class_II_aaRS-like_core superfamily - - "Class II tRNA amino-acyl synthetase-like catalytic core domain. Class II amino acyl-tRNA synthetases (aaRS) share a common fold and generally attach an amino acid to the 3' OH of ribose of the appropriate tRNA. PheRS is an exception in that it attaches the amino acid at the 2'-OH group, like class I aaRSs. These enzymes are usually homodimers. This domain is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate.
The substrate specificity of this reaction is further determined by additional domains. Interestingly, this domain is also found in asparagine synthase A (AsnA), in the accessory subunit of mitochondrial polymerase gamma and in the bacterial ATP phosphoribosyltransferase regulatory subunit HisZ." Q#10888 - CGI_10028856 superfamily 244970 691 749 3.66E-11 60.0886 cl08469 tRNA_SAD superfamily - - "Threonyl and Alanyl tRNA synthetase second additional domain; The catalytically active form of threonyl/alanyl tRNA synthetase is a dimer. Within the tRNA synthetase class II dimer, the bound tRNA interacts with both monomers making specific interactions with the catalytic domain, the C-terminal domain, and this domain (the second additional domain). The second additional domain is comprised of a pair of perpendicularly orientated antiparallel beta sheets, of four and three strands, respectively, that surround a central alpha helix that forms the core of the domain." Q#10888 - CGI_10028856 superfamily 243185 503 530 0.000113947 41.3173 cl02787 Translation_Factor_II_like superfamily N - "Translation_Factor_II_like: Elongation factor Tu (EF-Tu) domain II-like proteins. Elongation factor Tu consists of three structural domains, this family represents the second domain. Domain II adopts a beta barrel structure and is involved in binding to charged tRNA. Domain II is found in other proteins such as elongation factor G and translation initiation factor IF-2. This group also includes the C2 subdomain of domain IV of IF-2 that has the same fold as domain II of (EF-Tu). Like IF-2 from certain prokaryotes such as Thermus thermophilus, mitochondrial IF-2 lacks domain II, which is thought to be involved in binding of E. coli IF-2 to 30S subunits." Q#10888 - CGI_10028856 superfamily 216955 885 952 0.00495954 36.4343 cl03510 DHHA1 superfamily - - "DHHA1 domain; This domain is often found adjacent to the DHH domain pfam01368 and is called DHHA1 for DHH associated domain.
This domain is diagnostic of DHH subfamily 1 members. This domain is also found in alanyl tRNA synthetase, suggesting that this domain may have an RNA binding function. The domain is about 60 residues long and contains a conserved GG motif." Q#10889 - CGI_10028857 superfamily 245819 128 286 2.01E-50 175.075 cl11967 Nucleotidyl_cyc_III superfamily - - "Class III nucleotidyl cyclases; Class III nucleotidyl cyclases are the largest, most diverse group of nucleotidyl cyclases (NC's) containing prokaryotic and eukaryotic proteins. They can be divided into two major groups; the mononucleotidyl cyclases (MNC's) and the diguanylate cyclases (DGC's). The MNC's, which include the adenylate cyclases (AC's) and the guanylate cyclases (GC's), have a conserved cyclase homology domain (CHD), while the DGC's have a conserved GGDEF domain, named after a conserved motif within this subgroup. Their products, cyclic guanylyl and adenylyl nucleotides, are second messengers that play important roles in eukaryotic signal transduction and prokaryotic sensory pathways." Q#10889 - CGI_10028857 superfamily 245819 650 839 9.93E-43 153.889 cl11967 Nucleotidyl_cyc_III superfamily - - "Class III nucleotidyl cyclases; Class III nucleotidyl cyclases are the largest, most diverse group of nucleotidyl cyclases (NC's) containing prokaryotic and eukaryotic proteins. They can be divided into two major groups; the mononucleotidyl cyclases (MNC's) and the diguanylate cyclases (DGC's). The MNC's, which include the adenylate cyclases (AC's) and the guanylate cyclases (GC's), have a conserved cyclase homology domain (CHD), while the DGC's have a conserved GGDEF domain, named after a conserved motif within this subgroup. Their products, cyclic guanylyl and adenylyl nucleotides, are second messengers that play important roles in eukaryotic signal transduction and prokaryotic sensory pathways."
Q#10889 - CGI_10028857 superfamily 218992 334 416 6.50E-07 48.1797 cl05691 DUF1053 superfamily - - Domain of Unknown Function (DUF1053); This domain is found in Adenylate cyclases. Q#10890 - CGI_10028858 superfamily 220615 1 100 5.68E-11 55.6155 cl10869 MPP6 superfamily - - M-phase phosphoprotein 6; This is a family of M-phase phosphoprotein 6s which is necessary for generation of the 3' end of the 5.8S rRNA precursor. It preferentially binds to poly(C) and poly(U). Q#10891 - CGI_10028859 superfamily 243098 1304 1351 7.24E-11 60.3043 cl02573 TUDOR superfamily - - "Tudor domains are found in many eukaryotic organisms and have been implicated in protein-protein interactions in which methylated protein substrates bind to these domains. For example, the Tudor domain of Survival of Motor Neuron (SMN) binds to symmetrically dimethylated arginines of arginine-glycine (RG) rich sequences found in the C-terminal tails of Sm proteins. The SMN protein is linked to spinal muscular atrophy. Another example is the tandem tudor domains of 53BP1, which bind to histone H4 specifically dimethylated at Lys20 (H4-K20me2). 53BP1 is a key transducer of the DNA damage checkpoint signal." 
Q#10891 - CGI_10028859 superfamily 247905 817 903 4.43E-06 47.2325 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#10891 - CGI_10028859 superfamily 243098 1064 1116 0.00605706 36.8072 cl02573 TUDOR superfamily - - "Tudor domains are found in many eukaryotic organisms and have been implicated in protein-protein interactions in which methylated protein substrates bind to these domains. For example, the Tudor domain of Survival of Motor Neuron (SMN) binds to symmetrically dimethylated arginines of arginine-glycine (RG) rich sequences found in the C-terminal tails of Sm proteins. The SMN protein is linked to spinal muscular atrophy. Another example is the tandem tudor domains of 53BP1, which bind to histone H4 specifically dimethylated at Lys20 (H4-K20me2). 53BP1 is a key transducer of the DNA damage checkpoint signal." Q#10891 - CGI_10028859 superfamily 241659 1812 1911 6.90E-20 88.0352 cl00175 alpha-crystallin-Hsps_p23-like superfamily - - "alpha-crystallin domain (ACD) found in alpha-crystallin-type small heat shock proteins, and a similar domain found in p23 (a cochaperone for Hsp90) and in other p23-like proteins.; The alpha-crystallin-Hsps_p23-like superfamily includes the alpha-crystallin domain (ACD) of alpha-crystallin-type small heat shock proteins (sHsps) and a similar domain found in p23-like proteins. 
sHsps are small stress induced proteins with monomeric masses between 12-43 kDa, whose common feature is this ACD. sHsps are generally active as large oligomers consisting of multiple subunits, and are believed to be ATP-independent chaperones that prevent aggregation and are important in refolding in combination with other Hsps. p23 is a cochaperone of the Hsp90 chaperoning pathway. It binds Hsp90 and participates in the folding of a number of Hsp90 clients including the progesterone receptor. p23 also has a passive chaperoning activity. p23 in addition may act as the cytosolic prostaglandin E2 synthase. Included in this superfamily is the p23-like C-terminal CHORD-SGT1 (CS) domain of suppressor of G2 allele of Skp1 (Sgt1) and the p23-like domains of human butyrate-induced transcript 1 (hB-ind1), NUD (nuclear distribution) C, Melusin, and NAD(P)H cytochrome b5 (NCB5) oxidoreductase (OR)." Q#10891 - CGI_10028859 superfamily 247805 580 738 6.33E-19 87.9253 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#10894 - CGI_10028862 superfamily 243065 208 362 5.67E-06 45.5105 cl02516 VWD superfamily - - von Willebrand factor type D domain; Luciferin-2-monooxygenase from Vargula hilgendorfii contains a vwd domain. Its function is unrelated but the similarity is very strong by several methods. Q#10895 - CGI_10028863 superfamily 243074 13 58 5.79E-11 57.5165 cl02535 F-box-like superfamily - - F-box-like; This is an F-box-like family. 
Q#10897 - CGI_10028865 superfamily 241900 25 64 7.36E-05 42.3307 cl00490 EEP superfamily C - "Exonuclease-Endonuclease-Phosphatase (EEP) domain superfamily; This large superfamily includes the catalytic domain (exonuclease/endonuclease/phosphatase or EEP domain) of a diverse set of proteins including the ExoIII family of apurinic/apyrimidinic (AP) endonucleases, inositol polyphosphate 5-phosphatases (INPP5), neutral sphingomyelinases (nSMases), deadenylases (such as the vertebrate circadian-clock regulated nocturnin), bacterial cytolethal distending toxin B (CdtB), deoxyribonuclease 1 (DNase1), the endonuclease domain of the non-LTR retrotransposon LINE-1, and related domains. These diverse enzymes share a common catalytic mechanism of cleaving phosphodiester bonds; their substrates range from nucleic acids to phospholipids and perhaps proteins." Q#10899 - CGI_10028867 superfamily 219525 103 159 2.00E-07 47.0285 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#10899 - CGI_10028867 superfamily 219525 166 221 0.000869164 36.6282 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#10900 - CGI_10028868 superfamily 219525 294 332 8.94E-06 43.5618 cl06646 GCC2_GCC3 superfamily N - GCC2 and GCC3; GCC2 and GCC3. Q#10900 - CGI_10028868 superfamily 219525 339 395 9.58E-06 43.1766 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#10900 - CGI_10028868 superfamily 219525 402 458 7.98E-05 40.4802 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#10902 - CGI_10028870 superfamily 248012 79 188 4.88E-08 50.6541 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#10903 - CGI_10028871 superfamily 218258 168 268 5.55E-14 67.4372 cl04741 HABP4_PAI-RBP1 superfamily - - "Hyaluronan / mRNA binding family; This family includes the HABP4 family of hyaluronan-binding proteins, and the PAI-1 mRNA-binding protein, PAI-RBP1. 
HABP4 has been observed to bind hyaluronan (a glycosaminoglycan), but it is not known whether this is its primary role in vivo. It has also been observed to bind RNA, but with a lower affinity than that for hyaluronan. PAI-1 mRNA-binding protein specifically binds the mRNA of type-1 plasminogen activator inhibitor (PAI-1), and is thought to be involved in regulation of mRNA stability. However, in both cases, the sequence motifs predicted to be important for ligand binding are not conserved throughout the family, so it is not known whether members of this family share a common function." Q#10904 - CGI_10028872 superfamily 197676 56 77 0.000992172 36.6749 cl18194 ZnF_C2H2 superfamily - - zinc finger; zinc finger. Q#10904 - CGI_10028872 superfamily 222150 100 126 0.00125208 36.2157 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#10904 - CGI_10028872 superfamily 222150 69 94 0.00162867 35.8305 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#10905 - CGI_10028873 superfamily 241804 127 348 2.27E-22 94.0188 cl00348 COG0182 superfamily N - "Predicted translation initiation factor 2B subunit, eIF-2B alpha/beta/delta family [Translation, ribosomal structure and biogenesis]" Q#10907 - CGI_10028875 superfamily 238191 634 1123 3.12E-122 387.458 cl18907 Esterase_lipase superfamily - - "Esterases and lipases (includes fungal lipases, cholinesterases, etc.) These enzymes act on carboxylic esters (EC: 3.1.1.-). The catalytic apparatus involves three residues (catalytic triad): a serine, a glutamate or aspartate and a histidine. These catalytic residues are responsible for the nucleophilic attack on the carbonyl carbon atom of the ester bond. In contrast with other alpha/beta hydrolase fold family members, p-nitrobenzyl esterase and acetylcholine esterase have a Glu instead of Asp at the active site carboxylate."
Q#10907 - CGI_10028875 superfamily 238191 34 537 2.47E-107 347.397 cl18907 Esterase_lipase superfamily - - "Esterases and lipases (includes fungal lipases, cholinesterases, etc.) These enzymes act on carboxylic esters (EC: 3.1.1.-). The catalytic apparatus involves three residues (catalytic triad): a serine, a glutamate or aspartate and a histidine. These catalytic residues are responsible for the nucleophilic attack on the carbonyl carbon atom of the ester bond. In contrast with other alpha/beta hydrolase fold family members, p-nitrobenzyl esterase and acetylcholine esterase have a Glu instead of Asp at the active site carboxylate." Q#10909 - CGI_10028877 superfamily 238191 1 419 5.99E-96 299.633 cl18907 Esterase_lipase superfamily - - "Esterases and lipases (includes fungal lipases, cholinesterases, etc.) These enzymes act on carboxylic esters (EC: 3.1.1.-). The catalytic apparatus involves three residues (catalytic triad): a serine, a glutamate or aspartate and a histidine. These catalytic residues are responsible for the nucleophilic attack on the carbonyl carbon atom of the ester bond. In contrast with other alpha/beta hydrolase fold family members, p-nitrobenzyl esterase and acetylcholine esterase have a Glu instead of Asp at the active site carboxylate." Q#10910 - CGI_10028878 superfamily 238191 27 522 5.58E-113 348.168 cl18907 Esterase_lipase superfamily - - "Esterases and lipases (includes fungal lipases, cholinesterases, etc.) These enzymes act on carboxylic esters (EC: 3.1.1.-). The catalytic apparatus involves three residues (catalytic triad): a serine, a glutamate or aspartate and a histidine. These catalytic residues are responsible for the nucleophilic attack on the carbonyl carbon atom of the ester bond. In contrast with other alpha/beta hydrolase fold family members, p-nitrobenzyl esterase and acetylcholine esterase have a Glu instead of Asp at the active site carboxylate."
Q#10912 - CGI_10028880 superfamily 243072 159 285 4.92E-32 122.492 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#10913 - CGI_10028881 superfamily 245836 240 458 1.69E-101 317.983 cl12015 Adenylation_DNA_ligase_like superfamily - - "Adenylation domain of proteins similar to ATP-dependent polynucleotide ligases; ATP-dependent polynucleotide ligases catalyze the phosphodiester bond formation of nicked nucleic acid substrates using ATP as a cofactor in a three step reaction mechanism. This family includes ATP-dependent DNA and RNA ligases. DNA ligases play a vital role in the diverse processes of DNA replication, recombination and repair. ATP-dependent DNA ligases have a highly modular architecture, consisting of a unique arrangement of two or more discrete domains, including a DNA-binding domain, an adenylation or nucleotidyltransferase (NTase) domain, and an oligonucleotide/oligosaccharide binding (OB)-fold domain. The adenylation domain binds ATP and contains many active site residues. Together with the C-terminal OB-fold domain, it comprises a catalytic core unit that is common to most members of the ATP-dependent DNA ligase family. The catalytic core contains six conserved sequence motifs (I, III, IIIa, IV, V and VI) that define this family of related nucleotidyltransferases including eukaryotic GTP-dependent mRNA-capping enzymes. The catalytic core contains both the active site as well as many DNA-binding residues.
The RNA circularization protein from archaea and bacteria contains the minimal catalytic unit, the adenylation domain, but does not contain an OB-fold domain. This family also includes the m3G-cap binding domain of snurportin, a nuclear import adaptor that binds m3G-capped spliceosomal U small nucleoproteins (snRNPs), but doesn't have enzymatic activity." Q#10913 - CGI_10028881 superfamily 244947 462 602 1.60E-60 202.405 cl08424 OBF_DNA_ligase_family superfamily - - "The Oligonucleotide/oligosaccharide binding (OB)-fold domain is a DNA-binding module that is part of the catalytic core unit of ATP dependent DNA ligases; ATP-dependent polynucleotide ligases catalyze phosphodiester bond formation using nicked nucleic acid substrates with the high energy nucleotide of ATP as a cofactor in a three step reaction mechanism. DNA ligases play a vital role in the diverse processes of DNA replication, recombination and repair. ATP dependent DNA ligases have a highly modular architecture consisting of a unique arrangement of two or more discrete domains including a DNA-binding domain, an adenylation (nucleotidyltransferase (NTase)) domain, and an oligonucleotide/oligosaccharide binding (OB)-fold domain. The adenylation and C-terminal OB-fold domains comprise a catalytic core unit that is common to most members of the ATP-dependent DNA ligase family. The catalytic core unit contains six conserved sequence motifs (I, III, IIIa, IV, V and VI) that define this family of related nucleotidyltransferases. The OB-fold domain contacts the nicked DNA substrate and is required for the ATP-dependent DNA ligase nucleotidylation step. The RxDK motif (motif VI), which is essential for ATP hydrolysis, is located in the OB-fold domain." Q#10913 - CGI_10028881 superfamily 241565 818 902 1.22E-05 44.2347 cl00038 BRCT superfamily - - "Breast Cancer Suppressor Protein (BRCA1), carboxy-terminal domain. The BRCT domain is found within many DNA damage repair and cell cycle checkpoint proteins. 
The unique diversity of this domain superfamily allows BRCT modules to interact forming homo/hetero BRCT multimers, BRCT-non-BRCT interactions, and interactions within DNA strand breaks." Q#10913 - CGI_10028881 superfamily 241565 666 736 0.000234908 40.3827 cl00038 BRCT superfamily - - "Breast Cancer Suppressor Protein (BRCA1), carboxy-terminal domain. The BRCT domain is found within many DNA damage repair and cell cycle checkpoint proteins. The unique diversity of this domain superfamily allows BRCT modules to interact forming homo/hetero BRCT multimers, BRCT-non-BRCT interactions, and interactions within DNA strand breaks." Q#10913 - CGI_10028881 superfamily 151850 761 789 0.000186683 40.119 cl12940 DNA_ligase_IV superfamily - - "DNA ligase IV; DNA ligase IV along with Xrcc4 functions in DNA non-homologous end joining. This process is required to mend double-strand breaks. Upon ligase binding to an Xrcc4 dimer, the helical tails unwind leading to a flat interaction surface." Q#10914 - CGI_10028882 superfamily 245838 276 420 1.03E-22 95.9747 cl12018 Peptidase_M48 superfamily N - Peptidase family M48; Peptidase family M48. Q#10915 - CGI_10028883 superfamily 244947 214 304 3.01E-40 143.469 cl08424 OBF_DNA_ligase_family superfamily N - "The Oligonucleotide/oligosaccharide binding (OB)-fold domain is a DNA-binding module that is part of the catalytic core unit of ATP dependent DNA ligases; ATP-dependent polynucleotide ligases catalyze phosphodiester bond formation using nicked nucleic acid substrates with the high energy nucleotide of ATP as a cofactor in a three step reaction mechanism. DNA ligases play a vital role in the diverse processes of DNA replication, recombination and repair. ATP dependent DNA ligases have a highly modular architecture consisting of a unique arrangement of two or more discrete domains including a DNA-binding domain, an adenylation (nucleotidyltransferase (NTase)) domain, and an oligonucleotide/oligosaccharide binding (OB)-fold domain. 
The adenylation and C-terminal OB-fold domains comprise a catalytic core unit that is common to most members of the ATP-dependent DNA ligase family. The catalytic core unit contains six conserved sequence motifs (I, III, IIIa, IV, V and VI) that define this family of related nucleotidyltransferases. The OB-fold domain contacts the nicked DNA substrate and is required for the ATP-dependent DNA ligase nucleotidylation step. The RxDK motif (motif VI), which is essential for ATP hydrolysis, is located in the OB-fold domain." Q#10915 - CGI_10028883 superfamily 241565 522 606 3.88E-06 45.0051 cl00038 BRCT superfamily - - "Breast Cancer Suppressor Protein (BRCA1), carboxy-terminal domain. The BRCT domain is found within many DNA damage repair and cell cycle checkpoint proteins. The unique diversity of this domain superfamily allows BRCT modules to interact forming homo/hetero BRCT multimers, BRCT-non-BRCT interactions, and interactions within DNA strand breaks." Q#10915 - CGI_10028883 superfamily 241565 368 438 0.00012995 40.3827 cl00038 BRCT superfamily - - "Breast Cancer Suppressor Protein (BRCA1), carboxy-terminal domain. The BRCT domain is found within many DNA damage repair and cell cycle checkpoint proteins. The unique diversity of this domain superfamily allows BRCT modules to interact forming homo/hetero BRCT multimers, BRCT-non-BRCT interactions, and interactions within DNA strand breaks." Q#10915 - CGI_10028883 superfamily 241565 59 117 3.98E-05 41.8116 cl00038 BRCT superfamily - - "Breast Cancer Suppressor Protein (BRCA1), carboxy-terminal domain. The BRCT domain is found within many DNA damage repair and cell cycle checkpoint proteins. The unique diversity of this domain superfamily allows BRCT modules to interact forming homo/hetero BRCT multimers, BRCT-non-BRCT interactions, and interactions within DNA strand breaks." 
Q#10915 - CGI_10028883 superfamily 151850 463 493 0.00827217 34.7262 cl12940 DNA_ligase_IV superfamily - - "DNA ligase IV; DNA ligase IV along with Xrcc4 functions in DNA non-homologous end joining. This process is required to mend double-strand breaks. Upon ligase binding to an Xrcc4 dimer, the helical tails unwind leading to a flat interaction surface." Q#10916 - CGI_10028884 superfamily 204929 19 69 9.62E-12 61.0657 cl13846 MMS19_N superfamily N - "NER and RNAPII transcription protein n terminal; This domain family is found in eukaryotes, and is approximately 60 amino acids in length. MMS19 is required for both nucleotide excision repair (NER) and RNA polymerase II (RNAP II) transcription." Q#10916 - CGI_10028884 superfamily 245838 467 559 1.43E-09 57.0696 cl12018 Peptidase_M48 superfamily N - Peptidase family M48; Peptidase family M48. Q#10917 - CGI_10028885 superfamily 241645 150 219 1.16E-23 91.0668 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." 
Q#10919 - CGI_10028887 superfamily 220692 93 368 5.01E-11 62.2217 cl18570 7TM_GPCR_Srw superfamily - - Serpentine type 7TM GPCR chemoreceptor Srw; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srw is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. The genes encoding Srw do not appear to be under as strong an adaptive evolutionary pressure as those of Srz. Q#10921 - CGI_10028889 superfamily 241583 252 414 3.91E-43 152.358 cl00064 ZnMc superfamily - - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#10921 - CGI_10028889 superfamily 243048 439 630 9.76E-17 78.8924 cl02471 HX superfamily - - Hemopexin-like repeats.; Hemopexin is a heme-binding protein that transports heme to the liver. Hemopexin-like repeats occur in vitronectin and some matrix metalloproteinases family (matrixins). The HX repeats of some matrixins bind tissue inhibitor of metalloproteinases (TIMPs). This CD contains 4 instances of the repeat. Q#10921 - CGI_10028889 superfamily 216518 105 129 0.00251889 36.7393 cl18368 PG_binding_1 superfamily N - Putative peptidoglycan binding domain; This domain is composed of three alpha helices. This domain is found at the N or C terminus of a variety of enzymes involved in bacterial cell wall degradation. This domain may have a general peptidoglycan binding function. This family is found N-terminal to the catalytic domain of matrixins. The domain is found to bind peptidoglycan experimentally. 
Q#10924 - CGI_10028892 superfamily 243485 62 96 0.000261951 36.9488 cl03649 HemD superfamily N - "Uroporphyrinogen-III synthase (HemD) catalyzes the asymmetrical cyclization of tetrapyrrole (linear) to uroporphyrinogen-III, the fourth step in the biosynthesis of heme. This ubiquitous enzyme is present in eukaryotes, bacteria and archaea. Mutations in the human uroporphyrinogen-III synthase gene cause congenital erythropoietic porphyria, a recessive inborn error of metabolism also known as Gunther disease." Q#10925 - CGI_10028893 superfamily 243485 18 182 3.01E-30 112.015 cl03649 HemD superfamily C - "Uroporphyrinogen-III synthase (HemD) catalyzes the asymmetrical cyclization of tetrapyrrole (linear) to uroporphyrinogen-III, the fourth step in the biosynthesis of heme. This ubiquitous enzyme is present in eukaryotes, bacteria and archaea. Mutations in the human uroporphyrinogen-III synthase gene cause congenital erythropoietic porphyria, a recessive inborn error of metabolism also known as Gunther disease." Q#10927 - CGI_10028895 superfamily 243035 31 140 9.63E-10 51.8518 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model."
Q#10928 - CGI_10028896 superfamily 243100 187 236 9.80E-10 53.3367 cl02576 B_zip1 superfamily N - "basic leucine zipper DNA-binding and multimerization region of GCN4 and related proteins; Basic leucine zipper (bZIP) transcription factors act in networks of homo- and hetero-dimers in the regulation in a diverse set of cellular pathways. Classical leucine zippers have alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization creates a pair of basic regions that bind DNA and undergo conformational change. GCN4 was identified in Saccharomyces cerevisiae from mutations in a deficiency in activation with the general amino acid control pathway. GCN4 encodes a trans-activator of amino acid biosynthetic genes containing 2 acidic activation domains and a C-terminal bZIP domain, comprised of a basic alpha-helical DNA-binding region and a coiled-coil dimerization region." Q#10929 - CGI_10028897 superfamily 216347 342 760 1.49E-121 374.95 cl08309 Cu_amine_oxid superfamily - - "Copper amine oxidase, enzyme domain; Copper amine oxidases are a ubiquitous and novel group of quinoenzymes that catalyze the oxidative deamination of primary amines to the corresponding aldehydes, with concomitant reduction of molecular oxygen to hydrogen peroxide. The enzymes are dimers of identical 70-90 kDa subunits, each of which contains a single copper ion and a covalently bound cofactor formed by the post-translational modification of a tyrosine side chain to 2,4,5-trihydroxyphenylalanine quinone (TPQ). This family corresponds to the catalytic domain of the enzyme." Q#10929 - CGI_10028897 superfamily 145726 93 177 0.00111511 38.0954 cl08353 Cu_amine_oxidN2 superfamily - - "Copper amine oxidase, N2 domain; This domain is the first or second structural domain in copper amine oxidases, it is known as the N2 domain. Its function is uncertain. The catalytic domain can be found in pfam01179. 
Copper amine oxidases are a ubiquitous and novel group of quinoenzymes that catalyze the oxidative deamination of primary amines to the corresponding aldehydes, with concomitant reduction of molecular oxygen to hydrogen peroxide. The enzymes are dimers of identical 70-90 kDa subunits, each of which contains a single copper ion and a covalently bound cofactor formed by the post-translational modification of a tyrosine side chain to 2,4,5-trihydroxyphenylalanine quinone (TPQ)." Q#10930 - CGI_10028898 superfamily 241607 737 770 0.0065922 35.7086 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chyomotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. 
The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#10930 - CGI_10028898 superfamily 241607 617 637 0.000203939 40.3451 cl00097 KAZAL_FS superfamily C - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chyomotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." 
Q#10931 - CGI_10028900 superfamily 247792 799 844 0.00144988 37.4252 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#10931 - CGI_10028900 superfamily 243092 52 157 2.30E-05 45.7888 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#10931 - CGI_10028900 superfamily 150957 721 825 0.00245145 37.2105 cl11034 Vps39_2 superfamily - - "Vacuolar sorting protein 39 domain 2; This domain is found on the vacuolar sorting protein Vps39 which is a component of the C-Vps complex. Vps39 is thought to be required for the fusion of endosomes and other types of transport intermediates with the vacuole. In Saccharomyces cerevisiae, Vps39 has been shown to stimulate nucleotide exchange. This domain is involved in localisation and in mediating the interactions of Vps39 with Vps11." Q#10932 - CGI_10028901 superfamily 241599 414 471 1.82E-16 73.8168 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#10932 - CGI_10028901 superfamily 198730 318 392 7.78E-35 125.11 cl02582 Pou superfamily - - Pou domain - N-terminal to homeobox domain; Pou domain - N-terminal to homeobox domain. Q#10934 - CGI_10028903 superfamily 238191 15 505 2.33E-108 334.301 cl18907 Esterase_lipase superfamily - - "Esterases and lipases (includes fungal lipases, cholinesterases, etc.) These enzymes act on carboxylic esters (EC: 3.1.1.-). The catalytic apparatus involves three residues (catalytic triad): a serine, a glutamate or aspartate and a histidine.These catalytic residues are responsible for the nucleophilic attack on the carbonyl carbon atom of the ester bond. In contrast with other alpha/beta hydrolase fold family members, p-nitrobenzyl esterase and acetylcholine esterase have a Glu instead of Asp at the active site carboxylate." Q#10935 - CGI_10028904 superfamily 248054 22 220 1.44E-09 56.5419 cl17500 NAD_binding_8 superfamily - - NAD(P)-binding Rossmann-like domain; NAD(P)-binding Rossmann-like domain. 
Q#10936 - CGI_10028905 superfamily 248012 289 383 3.95E-10 57.2833 cl17458 TIR_2 superfamily C - TIR domain; This is a family of bacterial Toll-like receptors. Q#10937 - CGI_10028906 superfamily 241600 84 214 6.10E-62 208.634 cl00085 FReD superfamily C - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#10938 - CGI_10028907 superfamily 241600 309 516 4.20E-96 292.992 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#10938 - CGI_10028907 superfamily 241600 84 256 1.22E-82 258.324 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. 
Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#10941 - CGI_10028910 superfamily 241600 84 234 1.64E-81 245.613 cl00085 FReD superfamily C - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." 
Q#10942 - CGI_10028911 superfamily 248318 286 352 2.44E-18 80.5577 cl17764 FYVE superfamily - - "FYVE domain; Zinc-binding domain; targets proteins to membrane lipids via interaction with phosphatidylinositol-3-phosphate, PI3P; present in Fab1, YOTB, Vac1, and EEA1;" Q#10942 - CGI_10028911 superfamily 243092 23 279 1.46E-26 110.502 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#10942 - CGI_10028911 superfamily 216421 521 844 5.43E-23 100.19 cl03153 Lamp superfamily - - Lysosome-associated membrane glycoprotein (Lamp); Lysosome-associated membrane glycoprotein (Lamp). 
Q#10944 - CGI_10028913 superfamily 247723 541 619 1.75E-23 96.5472 cl17169 RRM_SF superfamily C - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#10944 - CGI_10028913 superfamily 245716 511 534 1.38E-05 43.3159 cl11592 zf-CCCH superfamily - - Zinc finger C-x8-C-x5-C-x3-H type (and similar); Zinc finger C-x8-C-x5-C-x3-H type (and similar). Q#10945 - CGI_10028914 superfamily 201362 63 148 4.04E-13 63.1531 cl08277 Motile_Sperm superfamily C - MSP (Major sperm protein) domain; Major sperm proteins are involved in sperm motility. These proteins oligomerise to form filaments. This family contains many other proteins. Q#10946 - CGI_10028915 superfamily 152807 1369 1411 6.79E-09 54.7557 cl13765 DUF3652 superfamily - - "Huntingtin protein region; This domain family is found in eukaryotes, and is approximately 40 amino acids in length. The family is found in association with pfam02985. This family is in the middle region of the Huntingtin protein associated with Huntington's disease. 
The protein is of unknown function, however it is known that a polyglutamine (CAG) repeat in the gene coding for it results in the development of Huntington's disease." Q#10948 - CGI_10028917 superfamily 245670 149 332 1.62E-58 196.239 cl11519 DENN superfamily - - DENN (AEX-3) domain; DENN (after differentially expressed in neoplastic vs normal cells) is a domain which occurs in several proteins involved in Rab- mediated processes or regulation of MAPK signalling pathways. Q#10948 - CGI_10028917 superfamily 243635 15 85 1.91E-18 81.2268 cl04085 uDENN superfamily N - uDENN domain; This region is always found associated with pfam02141. It is predicted to form an all beta domain. Q#10951 - CGI_10028920 superfamily 245599 159 364 8.63E-102 302.677 cl11397 NR_LBD superfamily - - "The ligand binding domain of nuclear receptors, a family of ligand-activated transcription regulators; Ligand-binding domain (LBD) of nuclear receptor (NR): Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions in metazoans, from development, reproduction, to homeostasis and metabolism. The superfamily contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. The members of the family include receptors of steroids, thyroid hormone, retinoids, cholesterol by-products, lipids and heme. With few exceptions, NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD)." Q#10951 - CGI_10028920 superfamily 207662 8 100 1.35E-66 208.119 cl02596 NR_DBD_like superfamily - - "DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers; DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers. 
Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. It interacts with a specific DNA site upstream of the target gene and modulates the rate of transcriptional initiation. Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions, from development, reproduction, to homeostasis and metabolism in animals (metazoans). The family contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). Most nuclear receptors bind as homodimers or heterodimers to their target sites, which consist of two hexameric half-sites. Specificity is determined by the half-site sequence, the relative orientation of the half-sites and the number of spacer nucleotides between the half-sites. However, a growing number of nuclear receptors have been reported to bind to DNA as monomers." Q#10952 - CGI_10028921 superfamily 245599 165 371 1.41E-79 245.667 cl11397 NR_LBD superfamily - - "The ligand binding domain of nuclear receptors, a family of ligand-activated transcription regulators; Ligand-binding domain (LBD) of nuclear receptor (NR): Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions in metazoans, from development, reproduction, to homeostasis and metabolism. The superfamily contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. The members of the family include receptors of steroids, thyroid hormone, retinoids, cholesterol by-products, lipids and heme. 
With few exceptions, NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD)." Q#10952 - CGI_10028921 superfamily 207662 9 100 5.71E-60 191.17 cl02596 NR_DBD_like superfamily - - "DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers; DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. It interacts with a specific DNA site upstream of the target gene and modulates the rate of transcriptional initiation. Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions, from development, reproduction, to homeostasis and metabolism in animals (metazoans). The family contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). Most nuclear receptors bind as homodimers or heterodimers to their target sites, which consist of two hexameric half-sites. Specificity is determined by the half-site sequence, the relative orientation of the half-sites and the number of spacer nucleotides between the half-sites. However, a growing number of nuclear receptors have been reported to bind to DNA as monomers." Q#10953 - CGI_10028922 superfamily 248012 2 51 0.000121818 36.4825 cl17458 TIR_2 superfamily C - TIR domain; This is a family of bacterial Toll-like receptors. Q#10954 - CGI_10028923 superfamily 248012 2 94 5.23E-11 55.7425 cl17458 TIR_2 superfamily C - TIR domain; This is a family of bacterial Toll-like receptors. 
Q#10958 - CGI_10028927 superfamily 246748 14 375 1.89E-155 445.765 cl14876 Zinc_peptidase_like superfamily - - "Zinc peptidases M18, M20, M28, and M42; Zinc peptidases play vital roles in metabolic and signaling pathways throughout all kingdoms of life. This family corresponds to several clans in the MEROPS database, including the MH clan, which contains 4 families (M18, M20, M28, M42). The peptidase M20 family includes carboxypeptidases such as the glutamate carboxypeptidase from Pseudomonas, the thermostable carboxypeptidase Ss1 of broad specificity from archaea and yeast Gly-X carboxypeptidase. The dipeptidases include bacterial dipeptidase, peptidase V (PepV), a eukaryotic, non-specific dipeptidase, and two Xaa-His dipeptidases (carnosinases). There is also the bacterial aminopeptidase, peptidase T (PepT) that acts only on tripeptide substrates and has therefore been termed a tripeptidase. Peptidase family M28 contains aminopeptidases and carboxypeptidases, and has co-catalytic zinc ions. However, several enzymes in this family utilize other first row transition metal ions such as cobalt and manganese. Each zinc ion is tetrahedrally co-ordinated, with three amino acid ligands plus activated water; one aspartate residue binds both metal ions. The aminopeptidases in this family are also called bacterial leucyl aminopeptidases, but are able to release a variety of N-terminal amino acids. IAP aminopeptidase and aminopeptidase Y preferentially release basic amino acids while glutamate carboxypeptidase II preferentially releases C-terminal glutamates. Glutamate carbxypeptidase II and plasma glutamate carboxypeptidase hydrolyze dipeptides. Peptidase families M18 and M42 contain metalloaminopeptidases. M18 is widely distributed in bacteria and eukaryotes. However, only yeast aminopeptidase I and mammalian aspartyl aminopeptidase have been characterized in detail. 
Some of M42 (also known as glutamyl aminopeptidase) enzymes exhibit aminopeptidase specificity while others also have acylaminoacylpeptidase activity (i.e. hydrolysis of acylated N-terminal residues)." Q#10959 - CGI_10028928 superfamily 246748 14 361 5.42E-147 426.505 cl14876 Zinc_peptidase_like superfamily - - "Zinc peptidases M18, M20, M28, and M42; Zinc peptidases play vital roles in metabolic and signaling pathways throughout all kingdoms of life. This family corresponds to several clans in the MEROPS database, including the MH clan, which contains 4 families (M18, M20, M28, M42). The peptidase M20 family includes carboxypeptidases such as the glutamate carboxypeptidase from Pseudomonas, the thermostable carboxypeptidase Ss1 of broad specificity from archaea and yeast Gly-X carboxypeptidase. The dipeptidases include bacterial dipeptidase, peptidase V (PepV), a eukaryotic, non-specific dipeptidase, and two Xaa-His dipeptidases (carnosinases). There is also the bacterial aminopeptidase, peptidase T (PepT) that acts only on tripeptide substrates and has therefore been termed a tripeptidase. Peptidase family M28 contains aminopeptidases and carboxypeptidases, and has co-catalytic zinc ions. However, several enzymes in this family utilize other first row transition metal ions such as cobalt and manganese. Each zinc ion is tetrahedrally co-ordinated, with three amino acid ligands plus activated water; one aspartate residue binds both metal ions. The aminopeptidases in this family are also called bacterial leucyl aminopeptidases, but are able to release a variety of N-terminal amino acids. IAP aminopeptidase and aminopeptidase Y preferentially release basic amino acids while glutamate carboxypeptidase II preferentially releases C-terminal glutamates. Glutamate carbxypeptidase II and plasma glutamate carboxypeptidase hydrolyze dipeptides. Peptidase families M18 and M42 contain metalloaminopeptidases. M18 is widely distributed in bacteria and eukaryotes. 
However, only yeast aminopeptidase I and mammalian aspartyl aminopeptidase have been characterized in detail. Some of M42 (also known as glutamyl aminopeptidase) enzymes exhibit aminopeptidase specificity while others also have acylaminoacylpeptidase activity (i.e. hydrolysis of acylated N-terminal residues)." Q#10961 - CGI_10028930 superfamily 247792 17 60 4.21E-08 50.522 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#10961 - CGI_10028930 superfamily 241563 154 189 7.92E-06 43.8151 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#10961 - CGI_10028930 superfamily 128778 200 325 0.000686807 38.7851 cl17972 BBC superfamily - - B-Box C-terminal domain; Coiled coil region C-terminal to (some) B-Box domains Q#10962 - CGI_10028931 superfamily 243092 797 1083 2.57E-44 163.275 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#10962 - CGI_10028931 superfamily 222477 120 746 0 544.678 cl16505 SID-1_RNA_chan superfamily - - dsRNA-gated channel SID-1; This is a family of proteins that are transmembrane dsRNA-gated channels. They passively transport dsRNA into cells and do not act as ATP-dependent pumps. They are required for systemic RNA interference. Q#10964 - CGI_10028933 superfamily 205570 66 105 7.40E-06 41.6498 cl16264 HNH_3 superfamily - - HNH endonuclease; HNH endonuclease. 
Q#10965 - CGI_10028934 superfamily 241583 162 361 5.32E-60 202.854 cl00064 ZnMc superfamily - - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#10965 - CGI_10028934 superfamily 216572 31 136 3.51E-19 85.019 cl03265 Pep_M12B_propep superfamily - - Reprolysin family propeptide; This region is the propeptide for members of peptidase family M12B. The propeptide contains a sequence motif similar to the "cysteine switch" of the matrixins. This motif is found at the C terminus of the alignment but is not well aligned. Q#10965 - CGI_10028934 superfamily 246918 749 799 0.000254422 39.8775 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#10965 - CGI_10028934 superfamily 246918 483 513 0.000280568 39.8775 cl15278 TSP_1 superfamily N - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#10968 - CGI_10028938 superfamily 217410 11 108 0.00148501 35.7928 cl18409 DDE_1 superfamily N - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction. Interestingly this family also includes the CENP-B protein. This domain in that protein appears to have lost the metal binding residues and is unlikely to have endonuclease activity. Centromere Protein B (CENP-B) is a DNA-binding protein localised to the centromere." 
Q#10971 - CGI_10013357 superfamily 241750 9 348 6.27E-101 303.741 cl00281 metallo-dependent_hydrolases superfamily - - "Superfamily of metallo-dependent hydrolases (also called amidohydrolase superfamily) is a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. The family includes urease alpha, adenosine deaminase, phosphotriesterase dihydroorotases, allantoinases, hydantoinases, AMP-, adenine and cytosine deaminases, imidazolonepropionase, aryldialkylphosphatase, chlorohydrolases, formylmethanofuran dehydrogenases and others." Q#10972 - CGI_10013358 superfamily 216423 208 330 9.07E-38 141.99 cl18367 Glyco_hydro_35 superfamily N - Glycosyl hydrolases family 35; Glycosyl hydrolases family 35. Q#10972 - CGI_10013358 superfamily 216423 65 118 7.44E-08 53.0091 cl18367 Glyco_hydro_35 superfamily NC - Glycosyl hydrolases family 35; Glycosyl hydrolases family 35. Q#10973 - CGI_10013359 superfamily 243029 23 88 2.16E-12 61.5977 cl02422 HRM superfamily - - Hormone receptor domain; This extracellular domain contains four conserved cysteines that probably for disulphide bridges. The domain is found in a variety of hormone receptors. It may be a ligand binding domain. Q#10975 - CGI_10013361 superfamily 248097 142 262 3.85E-27 102.729 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#10976 - CGI_10013362 superfamily 248097 11 132 2.93E-27 99.2618 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. 
Q#10978 - CGI_10013364 superfamily 241565 155 254 0.000832484 37.3558 cl00038 BRCT superfamily - - "Breast Cancer Suppressor Protein (BRCA1), carboxy-terminal domain. The BRCT domain is found within many DNA damage repair and cell cycle checkpoint proteins. The unique diversity of this domain superfamily allows BRCT modules to interact forming homo/hetero BRCT multimers, BRCT-non-BRCT interactions, and interactions within DNA strand breaks." Q#10979 - CGI_10013365 superfamily 241565 110 179 1.16E-10 59.6426 cl00038 BRCT superfamily - - "Breast Cancer Suppressor Protein (BRCA1), carboxy-terminal domain. The BRCT domain is found within many DNA damage repair and cell cycle checkpoint proteins. The unique diversity of this domain superfamily allows BRCT modules to interact forming homo/hetero BRCT multimers, BRCT-non-BRCT interactions, and interactions within DNA strand breaks." Q#10979 - CGI_10013365 superfamily 241565 203 273 2.38E-10 58.487 cl00038 BRCT superfamily - - "Breast Cancer Suppressor Protein (BRCA1), carboxy-terminal domain. The BRCT domain is found within many DNA damage repair and cell cycle checkpoint proteins. The unique diversity of this domain superfamily allows BRCT modules to interact forming homo/hetero BRCT multimers, BRCT-non-BRCT interactions, and interactions within DNA strand breaks." Q#10979 - CGI_10013365 superfamily 241565 366 437 2.87E-07 49.6275 cl00038 BRCT superfamily - - "Breast Cancer Suppressor Protein (BRCA1), carboxy-terminal domain. The BRCT domain is found within many DNA damage repair and cell cycle checkpoint proteins. The unique diversity of this domain superfamily allows BRCT modules to interact forming homo/hetero BRCT multimers, BRCT-non-BRCT interactions, and interactions within DNA strand breaks." Q#10979 - CGI_10013365 superfamily 241565 643 717 5.88E-07 48.4719 cl00038 BRCT superfamily - - "Breast Cancer Suppressor Protein (BRCA1), carboxy-terminal domain. 
The BRCT domain is found within many DNA damage repair and cell cycle checkpoint proteins. The unique diversity of this domain superfamily allows BRCT modules to interact forming homo/hetero BRCT multimers, BRCT-non-BRCT interactions, and interactions within DNA strand breaks." Q#10980 - CGI_10013366 superfamily 152088 123 203 4.77E-20 81.0951 cl13155 DUF3259 superfamily - - Protein of unknown function (DUF3259); This eukaryotic family of proteins has no known function. Q#10982 - CGI_10013368 superfamily 214806 200 293 8.42E-18 76.5653 cl15966 CRA superfamily - - "CT11-RanBPM; protein-protein interaction domain present in crown eukaryotes (plants, animals, fungi)" Q#10982 - CGI_10013368 superfamily 199226 108 141 7.81E-07 45.118 cl11662 LisH superfamily - - "LisH; The LisH (lis homology) domain mediates protein dimerisation and tetramerisation. The LisH domain is found in Sif2, a component of the Set3 complex which is responsible for repressing meiotic genes. It has been shown that the LisH domain helps mediate interaction with components of the Set3 complex." Q#10982 - CGI_10013368 superfamily 128914 148 186 0.000249311 38.321 cl15352 CTLH superfamily C - C-terminal to LisH motif; Alpha-helical motif of unknown function. Q#10984 - CGI_10013370 superfamily 241647 132 157 0.00110689 37.9332 cl00157 WW superfamily - - Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs. 
Q#10985 - CGI_10013371 superfamily 243095 137 322 1.56E-103 304.314 cl02570 RhoGAP superfamily - - "RhoGAP: GTPase-activator protein (GAP) for Rho-like GTPases; GAPs towards Rho/Rac/Cdc42-like small GTPases. Small GTPases (G proteins) cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when bound to GDP. The Rho family of small G proteins, which includes Cdc42Hs, activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. G proteins generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. The RhoGAPs are one of the major classes of regulators of Rho G proteins." Q#10985 - CGI_10013371 superfamily 243052 1 126 1.71E-20 85.8751 cl02480 MyTH4 superfamily N - "MyTH4 domain; Domain in myosin and kinesin tails, present twice in myosin-VIIa, and also present in 3 other myosins." Q#10986 - CGI_10013372 superfamily 247999 68 111 6.85E-05 37.0848 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#10987 - CGI_10013373 superfamily 222055 312 431 1.57E-24 99.3244 cl16245 AATF-Che1 superfamily - - Apoptosis antagonizing transcription factor; The N-terminal and leucine-zipper region of the apoptosis antagonizing transcription factor-Che1. Q#10987 - CGI_10013373 superfamily 219734 480 596 4.50E-17 76.8788 cl06972 TRAUB superfamily - - "Apoptosis-antagonizing transcription factor, C-terminal; This C terminal domain is found in traube proteins. 
This is the domain of the AATF proteins that interacts with BLOS2 or Ceap, that functions as an adaptor in processes such as protein and vesicle processing and transport, and perhaps transcription." Q#10988 - CGI_10013374 superfamily 243028 432 468 6.63E-12 62.3114 cl02419 Notch superfamily - - LNR domain; The LNR (Lin-12/Notch repeat) domain is found in three tandem copies in Notch related proteins. The structure of the domain has been determined by NMR and was shown to contain three disulphide bonds and coordinate a calcium ion. Three repeats are also found in the PAPP-A peptidase. Q#10988 - CGI_10013374 superfamily 221100 318 428 1.51E-08 57.2193 cl15596 DUF3184 superfamily C - Protein of unknown function (DUF3184); This eukaryotic family of proteins has no known function. Q#10988 - CGI_10013374 superfamily 243028 490 525 7.26E-07 47.6738 cl02419 Notch superfamily - - LNR domain; The LNR (Lin-12/Notch repeat) domain is found in three tandem copies in Notch related proteins. The structure of the domain has been determined by NMR and was shown to contain three disulphide bonds and coordinate a calcium ion. Three repeats are also found in the PAPP-A peptidase. Q#10996 - CGI_10013382 superfamily 243066 23 124 9.99E-12 62.2497 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. 
POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#10997 - CGI_10013383 superfamily 242685 83 115 3.55E-17 74.5359 cl01749 UPF0160 superfamily C - Uncharacterized protein family (UPF0160); This family of proteins contains a large number of metal binding residues. The patterns are suggestive of a phosphoesterase function. The conserved DHH motif may mean this family is related to pfam01368. Q#10998 - CGI_10013384 superfamily 242685 19 340 1.13E-177 498.641 cl01749 UPF0160 superfamily - - Uncharacterized protein family (UPF0160); This family of proteins contains a large number of metal binding residues. The patterns are suggestive of a phosphoesterase function. The conserved DHH motif may mean this family is related to pfam01368. Q#11006 - CGI_10003404 superfamily 242877 20 81 3.30E-26 96.9141 cl02093 Coq4 superfamily C - "Coenzyme Q (ubiquinone) biosynthesis protein Coq4; Coq4p was shown to peripherally associate with the matrix face of the mitochondrial inner membrane. The putative mitochondrial- targeting sequence present at the amino-terminus of the polypeptide efficiently imported it to mitochondria. The function of Coq4p is unknown, although its presence is required to maintain a steady-state level of Coq7p, another component of the Q biosynthetic pathway. The overall structure of Coq4 is alpha helical and shows resemblance to haemoglobin/myoglobin (information from TOPSAN)." Q#11007 - CGI_10022762 superfamily 245201 7 157 5.22E-110 322.689 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. 
It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#11008 - CGI_10022763 superfamily 245201 17 58 6.51E-24 91.2184 cl09925 PKc_like superfamily C - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#11010 - CGI_10022765 superfamily 247744 47 166 5.87E-08 48.804 cl17190 NK superfamily N - "Nucleoside/nucleotide kinase (NK) is a protein superfamily consisting of multiple families of enzymes that share structural similarity and are functionally related to the catalysis of the reversible phosphate group transfer from nucleoside triphosphates to nucleosides/nucleotides, nucleoside monophosphates, or sugars. Members of this family play a wide variety of essential roles in nucleotide metabolism, the biosynthesis of coenzymes and aromatic compounds, as well as the metabolism of sugar and sulfate." Q#11011 - CGI_10022766 superfamily 247805 351 580 7.14E-83 264.732 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 
Q#11011 - CGI_10022766 superfamily 247905 597 723 2.23E-36 133.902 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#11013 - CGI_10022768 superfamily 243127 6 38 4.56E-11 54.2346 cl02651 FYRC superfamily N - F/Y rich C-terminus; This region is normally found in the trithorax/ALL1 family proteins. It is similar to SMART:SM00542. Q#11015 - CGI_10022771 superfamily 247057 663 731 1.20E-26 104.273 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." 
Q#11015 - CGI_10022771 superfamily 241622 3 89 1.75E-14 69.903 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archebacteria, and eurkayotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#11017 - CGI_10022773 superfamily 247805 318 425 8.31E-13 67.7476 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 
Q#11017 - CGI_10022773 superfamily 247905 592 685 2.04E-11 63.0256 cl17351 HELICc superfamily N - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#11017 - CGI_10022773 superfamily 219729 1190 1366 1.21E-72 240.966 cl06956 DSHCT superfamily - - DSHCT (NUC185) domain; This C terminal domain is found in DOB1/SK12/helY-like DEAD box helicases. Q#11017 - CGI_10022773 superfamily 219729 1034 1144 5.54E-35 133.495 cl06956 DSHCT superfamily C - DSHCT (NUC185) domain; This C terminal domain is found in DOB1/SK12/helY-like DEAD box helicases. Q#11018 - CGI_10022774 superfamily 207662 65 152 3.75E-64 206.152 cl02596 NR_DBD_like superfamily - - "DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers; DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. It interacts with a specific DNA site upstream of the target gene and modulates the rate of transcriptional initiation. Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions, from development, reproduction, to homeostasis and metabolism in animals (metazoans). The family contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. 
NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). Most nuclear receptors bind as homodimers or heterodimers to their target sites, which consist of two hexameric half-sites. Specificity is determined by the half-site sequence, the relative orientation of the half-sites and the number of spacer nucleotides between the half-sites. However, a growing number of nuclear receptors have been reported to bind to DNA as monomers." Q#11018 - CGI_10022774 superfamily 245599 316 538 9.27E-95 291.107 cl11397 NR_LBD superfamily - - "The ligand binding domain of nuclear receptors, a family of ligand-activated transcription regulators; Ligand-binding domain (LBD) of nuclear receptor (NR): Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions in metazoans, from development, reproduction, to homeostasis and metabolism. The superfamily contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. The members of the family include receptors of steroids, thyroid hormone, retinoids, cholesterol by-products, lipids and heme. With few exceptions, NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD)." Q#11019 - CGI_10022775 superfamily 246918 568 621 2.60E-12 63.7599 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#11019 - CGI_10022775 superfamily 246918 338 390 9.90E-12 62.2191 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#11019 - CGI_10022775 superfamily 246918 626 678 1.32E-11 61.8339 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. 
Q#11019 - CGI_10022775 superfamily 246918 453 505 2.91E-11 60.6783 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#11019 - CGI_10022775 superfamily 246918 395 445 4.81E-09 54.1299 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#11019 - CGI_10022775 superfamily 243060 763 830 6.15E-06 45.8328 cl02507 SEA superfamily C - "SEA domain; Domain found in Sea urchin sperm protein, Enterokinase, Agrin (SEA). Proposed function of regulating or binding carbohydrate side chains. Recently a proteolytic activity has been shown for a SEA domain." Q#11019 - CGI_10022775 superfamily 246918 508 563 0.000212422 40.6479 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#11019 - CGI_10022775 superfamily 241672 830 881 0.00153836 40.3579 cl00192 ribokinase_pfkB_like superfamily NC - "ribokinase/pfkB superfamily: Kinases that accept a wide variety of substrates, including carbohydrates and aromatic small molecules, all are phosphorylated at a hydroxyl group. The superfamily includes ribokinase, fructokinase, ketohexokinase, 2-dehydro-3-deoxygluconokinase, 1-phosphofructokinase, the minor 6-phosphofructokinase (PfkB), inosine-guanosine kinase, and adenosine kinase. Even though there is a high degree of structural conservation within this superfamily, their multimerization level varies widely, monomeric (e.g. adenosine kinase), dimeric (e.g. ribokinase), and trimeric (e.g THZ kinase)." Q#11020 - CGI_10022776 superfamily 242730 162 217 0.00100415 38.3963 cl01825 Phage_Mu_Gam superfamily C - Bacteriophage Mu Gam like protein; This family consists of bacterial and phage Gam proteins. The gam gene of bacteriophage Mu encodes a protein which protects linear double stranded DNA from exonuclease degradation in vitro and in vivo. 
Q#11021 - CGI_10022777 superfamily 242730 162 203 0.00594752 36.0851 cl01825 Phage_Mu_Gam superfamily C - Bacteriophage Mu Gam like protein; This family consists of bacterial and phage Gam proteins. The gam gene of bacteriophage Mu encodes a protein which protects linear double stranded DNA from exonuclease degradation in vitro and in vivo. Q#11021 - CGI_10022777 superfamily 241563 64 101 0.00709639 34.9556 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#11022 - CGI_10022778 superfamily 246908 109 152 0.00649459 35.0194 cl15255 SH2 superfamily N - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." Q#11023 - CGI_10022779 superfamily 217293 1 159 3.27E-35 128.905 cl03788 Neur_chan_LBD superfamily N - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. 
Q#11023 - CGI_10022779 superfamily 202474 167 338 8.87E-25 100.036 cl08379 Neur_chan_memb superfamily - - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#11024 - CGI_10022780 superfamily 177822 32 180 6.59E-16 73.8009 cl18088 PLN02164 superfamily N - sulfotransferase Q#11025 - CGI_10022781 superfamily 245008 211 275 2.80E-12 60.2796 cl09101 E_set superfamily - - "Early set domain associated with the catalytic domain of sugar utilizing enzymes at either the N or C terminus; The E or "early" set domains of sugar utilizing enzymes are associated with different types of catalytic domains at either the N-terminal or C-terminal end. These domains may be related to the immunoglobulin and/or fibronectin type III superfamilies. Members of this family include alpha amylase, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitobiase, and chitinase. A subset of these members were recently identified as members of the CBM48 (Carbohydrate Binding Module 48) family. Members of the CBM48 family include pullulanase, maltooligosyl trehalose synthase, starch branching enzyme, glycogen branching enzyme, glycogen debranching enzyme, isoamylase, and the beta subunit of AMP-activated protein kinase." Q#11025 - CGI_10022781 superfamily 207794 1 193 2.43E-59 196.744 cl02948 GH20_hexosaminidase superfamily N - "Beta-N-acetylhexosaminidases of glycosyl hydrolase family 20 (GH20) catalyze the removal of beta-1,4-linked N-acetyl-D-hexosamine residues from the non-reducing ends of N-acetyl-beta-D-hexosaminides including N-acetylglucosides and N-acetylgalactosides. These enzymes are broadly distributed in microorganisms, plants and animals, and play roles in various key physiological and pathological processes. 
These processes include cell structural integrity, energy storage, cellular signaling, fertilization, pathogen defense, viral penetration, the development of carcinomas, inflammatory events and lysosomal storage disorders. The GH20 enzymes include the eukaryotic beta-N-acetylhexosaminidases A and B, the bacterial chitobiases, dispersin B, and lacto-N-biosidase. The GH20 hexosaminidases are thought to act via a catalytic mechanism in which the catalytic nucleophile is not provided by the solvent or the enzyme, but by the substrate itself." Q#11026 - CGI_10022783 superfamily 245213 69 107 4.19E-08 47.2462 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#11026 - CGI_10022783 superfamily 245213 31 66 3.95E-07 44.5498 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#11026 - CGI_10022783 superfamily 216897 123 202 4.05E-25 94.6704 cl03463 Gal_Lectin superfamily - - Galactose binding lectin domain; Galactose binding lectin domain. Q#11027 - CGI_10022784 superfamily 241832 7 79 3.07E-16 70.6556 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#11027 - CGI_10022784 superfamily 243175 90 205 1.65E-14 66.4922 cl02776 GST_C_family superfamily - - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). 
Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." Q#11028 - CGI_10022785 superfamily 245201 386 469 6.72E-13 67.1597 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#11028 - CGI_10022785 superfamily 247724 172 283 3.60E-05 42.919 cl17170 Ras_like_GTPase superfamily NC - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. 
The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#11030 - CGI_10022787 superfamily 245226 180 353 9.52E-101 299.532 cl10012 DnaQ_like_exo superfamily - - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. 
The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#11030 - CGI_10022787 superfamily 243082 2 109 3.61E-07 49.8184 cl02553 Peptidase_C19 superfamily N - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." Q#11032 - CGI_10022789 superfamily 245201 18 246 6.32E-63 200.802 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." 
Q#11033 - CGI_10022790 superfamily 245201 15 260 2.78E-106 325.222 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#11034 - CGI_10001202 superfamily 247692 1 68 2.94E-14 65.3382 cl17068 AFD_class_I superfamily N - "Adenylate forming domain, Class I; This family includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain." Q#11038 - CGI_10005349 superfamily 222429 19 98 4.07E-08 46.85 cl18676 Myb_DNA-bind_5 superfamily - - Myb/SANT-like DNA-binding domain; This presumed domain appears to be related to other Myb/SANT like DNA binding domains. This family is greatly expanded in arthropods and higher eukaryotes. Q#11041 - CGI_10016318 superfamily 248097 51 177 1.75E-18 77.3054 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#11045 - CGI_10016322 superfamily 242274 29 94 1.32E-06 42.993 cl01053 SGNH_hydrolase superfamily C - "SGNH_hydrolase, or GDSL_hydrolase, is a diverse family of lipases and esterases. 
The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the typical Ser-His-Asp(Glu) triad from other serine hydrolases, but may lack the carboxlic acid." Q#11046 - CGI_10016323 superfamily 221602 506 741 4.15E-25 105.571 cl13871 BCAS3 superfamily - - "Breast carcinoma amplified sequence 3; This domain family is found in eukaryotes, and is typically between 229 and 245 amino acids in length. The proteins in this family have been shown to be proto-oncogenes implicated in the development of breast cancer." Q#11047 - CGI_10016324 superfamily 241636 85 270 2.36E-121 361.135 cl00145 TBOX superfamily - - "T-box DNA binding domain of the T-box family of transcriptional regulators. The T-box family is an ancient group that appears to play a critical role in development in all animal species. These genes were uncovered on the basis of similarity to the DNA binding domain of murine Brachyury (T) gene product, the defining feature of the family. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development and conserved expression patterns, most of the known genes in all species being expressed in mesoderm or mesoderm precursors." Q#11048 - CGI_10016325 superfamily 241636 10 195 1.63E-124 354.972 cl00145 TBOX superfamily - - "T-box DNA binding domain of the T-box family of transcriptional regulators. The T-box family is an ancient group that appears to play a critical role in development in all animal species. These genes were uncovered on the basis of similarity to the DNA binding domain of murine Brachyury (T) gene product, the defining feature of the family. 
Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development and conserved expression patterns, most of the known genes in all species being expressed in mesoderm or mesoderm precursors." Q#11050 - CGI_10016327 superfamily 219153 196 360 3.63E-52 182.555 cl15854 DEAD_2 superfamily - - "DEAD_2; This represents a conserved region within a number of RAD3-like DNA-binding helicases that are seemingly ubiquitous - members include proteins of eukaryotic, bacterial and archaeal origin. RAD3 is involved in nucleotide excision repair, and forms part of the transcription factor TFIIH in yeast." Q#11050 - CGI_10016327 superfamily 248014 622 811 2.08E-47 168.899 cl17460 Csf4_U superfamily - - CRISPR/Cas system-associated DinG family helicase Csf4; CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; DinG family DNA helicase Q#11052 - CGI_10016329 superfamily 245208 662 1052 0 662.163 cl09933 ACAD superfamily - - "Acyl-CoA dehydrogenase; Both mitochondrial acyl-CoA dehydrogenases (ACAD) and peroxisomal acyl-CoA oxidases (AXO) catalyze the alpha,beta dehydrogenation of the corresponding trans-enoyl-CoA by FAD, which becomes reduced. The reduced form of ACAD is reoxidized in the oxidative half-reaction by electron-transferring flavoprotein (ETF), from which the electrons are transferred to the mitochondrial respiratory chain coupled with ATP synthesis. In contrast, AXO catalyzes a different oxidative half-reaction, in which the reduced FAD is reoxidized by molecular oxygen. The ACAD family includes the eukaryotic beta-oxidation enzymes, short (SCAD), medium (MCAD), long (LCAD) and very-long (VLCAD) chain acyl-CoA dehydrogenases. These enzymes all share high sequence similarity, but differ in their substrate specificities. 
The ACAD family also includes amino acid catabolism enzymes such as Isovaleryl-CoA dehydrogenase (IVD), short/branched chain acyl-CoA dehydrogenases(SBCAD), Isobutyryl-CoA dehydrogenase (IBDH), glutaryl-CoA deydrogenase (GCD) and Crotonobetainyl-CoA dehydrogenase. The mitochondrial ACAD's are generally homotetramers, except for VLCAD, which is a homodimer. Related enzymes include the SOS adaptive reponse proten aidB, Naphthocyclinone hydroxylase (NcnH), and and Dibenzothiophene (DBT) desulfurization enzyme C (DszC)" Q#11052 - CGI_10016329 superfamily 247824 258 506 1.48E-98 312.984 cl17270 APH_ChoK_like superfamily - - "Aminoglycoside 3'-phosphotransferase (APH) and Choline Kinase (ChoK) family. The APH/ChoK family is part of a larger superfamily that includes the catalytic domains of other kinases, such as the typical serine/threonine/tyrosine protein kinases (PKs), RIO kinases, actin-fragmin kinase (AFK), and phosphoinositide 3-kinase (PI3K). The family is composed of APH, ChoK, ethanolamine kinase (ETNK), macrolide 2'-phosphotransferase (MPH2'), an unusual homoserine kinase, and uncharacterized proteins with similarity to the N-terminal domain of acyl-CoA dehydrogenase 10 (ACAD10). The members of this family catalyze the transfer of the gamma-phosphoryl group from ATP (or CTP) to small molecule substrates such as aminoglycosides, macrolides, choline, ethanolamine, and homoserine. Phosphorylation of the antibiotics, aminoglycosides and macrolides, leads to their inactivation and to bacterial antibiotic resistance. Phosphorylation of choline, ethanolamine, and homoserine serves as precursors to the synthesis of important biological compounds, such as the major phospholipids, phosphatidylcholine and phosphatidylethanolamine and the amino acids, threonine, methionine, and isoleucine." Q#11052 - CGI_10016329 superfamily 248469 100 195 8.18E-05 42.3571 cl17915 HAD_like superfamily - - "Haloacid dehalogenase-like hydrolases. 
The haloacid dehalogenase-like (HAD) superfamily includes L-2-haloacid dehalogenase, epoxide hydrolase, phosphoserine phosphatase, phosphomannomutase, phosphoglycolate phosphatase, P-type ATPase, and many others, all of which use a nucleophilic aspartate in their phosphoryl transfer reaction. All members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. Members of this superfamily are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases." Q#11053 - CGI_10016330 superfamily 247941 134 264 6.95E-05 40.7821 cl17387 Methyltransf_21 superfamily - - "Methyltransferase FkbM domain; This family has members from bacteria to human, and appears to be a methyltransferase." Q#11054 - CGI_10016331 superfamily 247723 115 188 6.57E-49 158.752 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." 
Q#11054 - CGI_10016331 superfamily 247723 27 101 1.27E-44 147.578 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#11057 - CGI_10016334 superfamily 245206 6 351 6.78E-64 208.9 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. 
Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#11059 - CGI_10015011 superfamily 243035 107 234 6.57E-27 101.157 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. 
Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#11059 - CGI_10015011 superfamily 243051 12 87 0.00104654 37.3574 cl02479 MAM superfamily N - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." 
Q#11060 - CGI_10015012 superfamily 241610 521 574 5.56E-22 90.0018 cl00101 KU superfamily - - BPTI/Kunitz family of serine protease inhibitors; Structure is a disulfide rich alpha+beta fold. BPTI (bovine pancreatic trypsin inhibitor) is an extensively studied model structure. Q#11061 - CGI_10015013 superfamily 241645 48 110 3.88E-37 122.823 cl00155 UBQ superfamily N - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#11062 - CGI_10015014 superfamily 194922 1 174 7.50E-90 263.033 cl04601 SPC22 superfamily - - "Signal peptidase subunit; Translocation of polypeptide chains across the endoplasmic reticulum membrane is triggered by signal sequences. During translocation of the nascent chain through the membrane, the signal sequence of most secretory and membrane proteins is cleaved off. Cleavage occurs by the signal peptidase complex (SPC) which consists of four subunits in yeast and five in mammals. This family is common to yeast and mammals." 
Q#11064 - CGI_10015016 superfamily 192535 49 151 0.000353982 40.2718 cl18179 7TM_GPCR_Srsx superfamily C - Serpentine type 7TM GPCR chemoreceptor Srsx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srsx is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#11066 - CGI_10015018 superfamily 216188 236 497 5.34E-67 221.708 cl18360 Sulfate_transp superfamily - - Sulfate transporter family; Mutations in human SLC26A2 lead to several human diseases. Q#11066 - CGI_10015018 superfamily 205965 75 158 4.27E-35 127.529 cl18285 Sulfate_tra_GLY superfamily - - "Sulfate transporter N-terminal domain with GLY motif; This domain is found usually at the N-terminus of sulfate-transporter proteins. It carries a highly conserved GLY sequence motif, but the function of the domain is not known." Q#11067 - CGI_10015019 superfamily 217293 22 227 1.54E-44 155.869 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#11067 - CGI_10015019 superfamily 202474 234 420 7.90E-32 120.836 cl08379 Neur_chan_memb superfamily - - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. 
Q#11068 - CGI_10015020 superfamily 245596 390 597 7.92E-22 93.9229 cl11394 Glyco_tranf_GTA_type superfamily - - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#11069 - CGI_10015021 superfamily 241995 27 327 1.23E-129 374.246 cl00635 Ntn_Asparaginase_2_like superfamily - - "Ntn-hydrolase superfamily, L-Asparaginase type 2-like enzymes. This family includes Glycosylasparaginase, Taspase 1 and L-Asparaginase type 2 enzymes. Glycosylasparaginase catalyzes the hydrolysis of the glycosylamide bond of asparagine-linked glycoprotein. Taspase1 catalyzes the cleavage of the Mix Lineage Leukemia (MLL) nuclear protein and transcription factor TFIIA. L-Asparaginase type 2 hydrolyzes L-asparagine to L-aspartate and ammonia. The proenzymes of this family undergo autoproteolytic cleavage before a threonine to generate alpha and beta subunits. The threonine becomes the N-terminal residue of the beta subunit and is the catalytic residue." 
Q#11070 - CGI_10015022 superfamily 243056 479 679 1.23E-48 171.773 cl02495 RabGAP-TBC superfamily - - "Rab-GTPase-TBC domain; Identification of a TBC domain in GYP6_YEAST and GYP7_YEAST, which are GTPase activator proteins of yeast Ypt6 and Ypt7, implies that these domains are GTPase activator proteins of Rab-like small GTPases." Q#11070 - CGI_10015022 superfamily 245201 45 276 6.84E-27 111.11 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#11070 - CGI_10015022 superfamily 241626 801 889 1.25E-23 97.5249 cl00125 RHOD superfamily - - "Rhodanese Homology Domain (RHOD); an alpha beta fold domain found duplicated in the rhodanese protein. The cysteine containing enzymatically active version of the domain is also found in the Cdc25 class of protein phosphatases and a variety of proteins such as sulfide dehydrogenases and certain stress proteins such as senesence specific protein 1 in plants, PspE and GlpE in bacteria and cyanide and arsenate resistance proteins. Inactive versions (no active site cysteine) are also seen in dual specificity phosphatases, ubiquitin hydrolases from yeast and in sulfuryltransferases, where they are believed to play a regulatory role in multidomain proteins." Q#11071 - CGI_10015023 superfamily 222150 139 164 0.00403941 35.0601 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#11071 - CGI_10015023 superfamily 222150 167 190 0.00464249 35.0601 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. 
Q#11073 - CGI_10015025 superfamily 247919 368 577 0.00704434 38.8106 cl17365 TrkH superfamily NC - Cation transport protein; This family consists of various cation transport proteins (Trk) and V-type sodium ATP synthase subunit J or translocating ATPase J EC:3.6.1.34. These proteins are involved in active sodium up-take utilising ATP in the process. TrkH a member of the family from E. coli is a hydrophobic membrane protein and determines the specificity and kinetics of cation transport by the TrK system in E. coli. Q#11074 - CGI_10015026 superfamily 219000 432 582 3.46E-19 87.7019 cl05717 Drf_FH3 superfamily C - Diaphanous FH3 Domain; This region is found in the Formin-like and and diaphanous proteins. Q#11074 - CGI_10015026 superfamily 219001 376 428 6.81E-05 43.8367 cl05720 Drf_GBD superfamily N - "Diaphanous GTPase-binding Domain; This domain is bound to by GTP-attached Rho proteins, leading to activation of the Drf protein." Q#11075 - CGI_10015027 superfamily 216686 119 307 8.48E-37 133.217 cl18377 Galactosyl_T superfamily - - "Galactosyltransferase; This family includes the galactosyltransferases UDP-galactose:2-acetamido-2-deoxy-D-glucose3beta-galactosyltransferase and UDP-Gal:beta-GlcNAc beta 1,3-galactosyltranferase. Specific galactosyltransferases transfer galactose to GlcNAc terminal chains in the synthesis of the lacto-series oligosaccharides types 1 and 2." Q#11077 - CGI_10015029 superfamily 243670 7 366 4.50E-88 273.15 cl04217 UPF0075 superfamily - - Uncharacterized protein family (UPF0075); The proteins is this family are about 370 amino acids long and have no known function. Q#11078 - CGI_10015030 superfamily 241554 482 569 2.13E-13 69.4857 cl00019 Macro superfamily N - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. 
Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#11078 - CGI_10015030 superfamily 241752 873 1002 2.67E-12 65.0333 cl00283 ADP_ribosyl superfamily - - "ADP_ribosylating enzymes catalyze the transfer of ADP_ribose from NAD+ to substrates. Bacterial toxins are cytoplasmic and catalyze the transfer of a single ADP_ribose unit to eukaryotic elongation factor 2, halting protein synthesis and killing the cell. Poly(ADP-ribose) polymerases (PARPS 1-3, VPARP, tankyrase) catalyze the addition of up to 100 ADP_ribose units from NAD+. PARPs 1 and 2 are localized in the nucleaus, bind DNA, and are activated by DNA damage. VPARP is part of the vault ribonucleoprotein complex. Tankyrases regulates telomere length in part through poy(ADP_ribosylation) of telomere repeat binding factor 1 (TRF1). Poly(ADP-ribose) polymerase catalyses the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active." Q#11079 - CGI_10015031 superfamily 241554 149 279 2.69E-30 116.976 cl00019 Macro superfamily - - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. 
Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#11079 - CGI_10015031 superfamily 241752 596 728 4.03E-17 78.5153 cl00283 ADP_ribosyl superfamily - - "ADP_ribosylating enzymes catalyze the transfer of ADP_ribose from NAD+ to substrates. Bacterial toxins are cytoplasmic and catalyze the transfer of a single ADP_ribose unit to eukaryotic elongation factor 2, halting protein synthesis and killing the cell. Poly(ADP-ribose) polymerases (PARPS 1-3, VPARP, tankyrase) catalyze the addition of up to 100 ADP_ribose units from NAD+. PARPs 1 and 2 are localized in the nucleaus, bind DNA, and are activated by DNA damage. VPARP is part of the vault ribonucleoprotein complex. Tankyrases regulates telomere length in part through poy(ADP_ribosylation) of telomere repeat binding factor 1 (TRF1). Poly(ADP-ribose) polymerase catalyses the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active." Q#11080 - CGI_10015032 superfamily 247769 1379 1555 1.09E-13 70.8313 cl17215 HDc superfamily - - Metal dependent phosphohydrolases with conserved 'HD' motif Q#11080 - CGI_10015032 superfamily 248010 1103 1255 2.98E-27 110.549 cl17456 GAF superfamily - - "GAF domain; This domain is present in cGMP-specific phosphodiesterases, adenylyl and guanylyl cyclases, phytochromes, FhlA and NifA. 
Adenylyl and guanylyl cyclases catalyze ATP and GTP to the second messengers cAMP and cGMP, respectively, these products up-regulating catalytic activity by binding to the regulatory GAF domain(s). The opposite hydrolysis reaction is catalyzed by phosphodiesterase. cGMP-dependent 3',5'-cyclic phosphodiesterase catalyzes the conversion of guanosine 3',5'-cyclic phosphate to guanosine 5'-phosphate. Here too, cGMP regulates catalytic activity by GAF-domain binding. Phytochromes are regulatory photoreceptors in plants and bacteria which exist in two thermally-stable states that are reversibly inter-convertible by light: the Pr state absorbs maximally in the red region of the spectrum, while the Pfr state absorbs maximally in the far-red region. This domain is also found in FhlA (formate hydrogen lyase transcriptional activator) and NifA, a transcriptional activator which is required for activation of most Nif operons which are directly involved in nitrogen fixation. NifA interacts with sigma-54." Q#11080 - CGI_10015032 superfamily 248010 229 376 2.13E-08 54.3096 cl17456 GAF superfamily - - "GAF domain; This domain is present in cGMP-specific phosphodiesterases, adenylyl and guanylyl cyclases, phytochromes, FhlA and NifA. Adenylyl and guanylyl cyclases catalyze ATP and GTP to the second messengers cAMP and cGMP, respectively, these products up-regulating catalytic activity by binding to the regulatory GAF domain(s). The opposite hydrolysis reaction is catalyzed by phosphodiesterase. cGMP-dependent 3',5'-cyclic phosphodiesterase catalyzes the conversion of guanosine 3',5'-cyclic phosphate to guanosine 5'-phosphate. Here too, cGMP regulates catalytic activity by GAF-domain binding. Phytochromes are regulatory photoreceptors in plants and bacteria which exist in two thermally-stable states that are reversibly inter-convertible by light: the Pr state absorbs maximally in the red region of the spectrum, while the Pfr state absorbs maximally in the far-red region. 
This domain is also found in FhlA (formate hydrogen lyase transcriptional activator) and NifA, a transcriptional activator which is required for activation of most Nif operons which are directly involved in nitrogen fixation. NifA interacts with sigma-54." Q#11081 - CGI_10015033 superfamily 247986 276 369 8.80E-15 73.5614 cl17432 PBPb superfamily C - "Bacterial periplasmic transport systems use membrane-bound complexes and substrate-bound, membrane-associated, periplasmic binding proteins (PBPs) to transport a wide variety of substrates, such as, amino acids, peptides, sugars, vitamins and inorganic ions. PBPs have two cell-membrane translocation functions: bind substrate, and interact with the membrane bound complex. A diverse group of periplasmic transport receptors for lysine/arginine/ornithine (LAO), glutamine, histidine, sulfate, phosphate, molybdate, and methanol are included in the PBPb CD." Q#11081 - CGI_10015033 superfamily 247986 489 624 4.46E-05 43.901 cl17432 PBPb superfamily N - "Bacterial periplasmic transport systems use membrane-bound complexes and substrate-bound, membrane-associated, periplasmic binding proteins (PBPs) to transport a wide variety of substrates, such as, amino acids, peptides, sugars, vitamins and inorganic ions. PBPs have two cell-membrane translocation functions: bind substrate, and interact with the membrane bound complex. A diverse group of periplasmic transport receptors for lysine/arginine/ornithine (LAO), glutamine, histidine, sulfate, phosphate, molybdate, and methanol are included in the PBPb CD." Q#11081 - CGI_10015033 superfamily 245225 16 232 5.21E-38 145.146 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily N - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. 
This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein couples receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine- binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." 
Q#11082 - CGI_10015034 superfamily 247986 4 87 8.32E-14 69.3242 cl17432 PBPb superfamily C - "Bacterial periplasmic transport systems use membrane-bound complexes and substrate-bound, membrane-associated, periplasmic binding proteins (PBPs) to transport a wide variety of substrates, such as, amino acids, peptides, sugars, vitamins and inorganic ions. PBPs have two cell-membrane translocation functions: bind substrate, and interact with the membrane bound complex. A diverse group of periplasmic transport receptors for lysine/arginine/ornithine (LAO), glutamine, histidine, sulfate, phosphate, molybdate, and methanol are included in the PBPb CD." Q#11082 - CGI_10015034 superfamily 247986 207 342 1.26E-05 45.0566 cl17432 PBPb superfamily N - "Bacterial periplasmic transport systems use membrane-bound complexes and substrate-bound, membrane-associated, periplasmic binding proteins (PBPs) to transport a wide variety of substrates, such as, amino acids, peptides, sugars, vitamins and inorganic ions. PBPs have two cell-membrane translocation functions: bind substrate, and interact with the membrane bound complex. A diverse group of periplasmic transport receptors for lysine/arginine/ornithine (LAO), glutamine, histidine, sulfate, phosphate, molybdate, and methanol are included in the PBPb CD." 
Q#11083 - CGI_10015035 superfamily 247684 8 176 1.35E-48 164.373 cl17037 NBD_sugar-kinase_HSP70_actin superfamily C - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#11085 - CGI_10015037 superfamily 247684 1 189 2.25E-50 169.765 cl17037 NBD_sugar-kinase_HSP70_actin superfamily NC - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#11090 - CGI_10013910 superfamily 241867 21 96 3.01E-06 42.1554 cl00446 Lactamase_B superfamily N - Metallo-beta-lactamase superfamily; Metallo-beta-lactamase superfamily. Q#11091 - CGI_10013911 superfamily 245847 142 262 4.70E-20 83.7601 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." 
Q#11091 - CGI_10013911 superfamily 241619 27 71 0.00760107 33.7097 cl00112 PAN_APPLE superfamily C - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#11095 - CGI_10013915 superfamily 245040 26 100 1.25E-05 39.2071 cl09238 CY superfamily C - "Cystatin-like domain; Cystatins are a family of cysteine protease inhibitors that occur mainly as single domain proteins. However some extracellular proteins such as kininogen, His-rich glycoprotein and fetuin also contain these domains." Q#11096 - CGI_10013916 superfamily 245040 25 83 0.000439274 34.5847 cl09238 CY superfamily C - "Cystatin-like domain; Cystatins are a family of cysteine protease inhibitors that occur mainly as single domain proteins. However some extracellular proteins such as kininogen, His-rich glycoprotein and fetuin also contain these domains." Q#11097 - CGI_10013917 superfamily 245040 19 80 0.000186569 38.224 cl09238 CY superfamily C - "Cystatin-like domain; Cystatins are a family of cysteine protease inhibitors that occur mainly as single domain proteins. However some extracellular proteins such as kininogen, His-rich glycoprotein and fetuin also contain these domains." Q#11097 - CGI_10013917 superfamily 245040 139 174 0.00107976 35.7403 cl09238 CY superfamily NC - "Cystatin-like domain; Cystatins are a family of cysteine protease inhibitors that occur mainly as single domain proteins. However some extracellular proteins such as kininogen, His-rich glycoprotein and fetuin also contain these domains." 
Q#11098 - CGI_10013918 superfamily 248325 80 413 3.54E-88 272.639 cl17771 Methyltransf_5 superfamily - - MraW methylase family; Members of this family are probably SAM dependent methyltransferases based on Escherichia coli rsmH. This family appears to be related to pfam01596. Q#11100 - CGI_10013920 superfamily 246679 6 128 3.93E-67 201.265 cl14632 Glo_EDI_BRP_like superfamily - - "This domain superfamily is found in a variety of structurally related metalloproteins, including the type I extradiol dioxygenases, glyoxalase I and a group of antibiotic resistance proteins; This domain superfamily is found in a variety of structurally related metalloproteins, including the type I extradiol dioxygenases, glyoxalase I and a group of antibiotic resistance proteins. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). Type I extradiol dioxygenases catalyze the incorporation of both atoms of molecular oxygen into aromatic substrates, which results in the cleavage of aromatic rings. They are key enzymes in the degradation of aromatic compounds. Type I extradiol dioxygenases include class I and class II enzymes. Class I and II enzymes show sequence similarity; the two-domain class II enzymes evolved from a class I enzyme through gene duplication. Glyoxylase I catalyzes the glutathione-dependent inactivation of toxic methylglyoxal, requiring zinc or nickel ions for activity. The antibiotic resistance proteins in this family use a variety of mechanisms to block the function of antibiotics. Bleomycin resistance protein (BLMA) sequesters bleomycin's activity by directly binding to it. Whereas, three types of fosfomycin resistance proteins employ different mechanisms to render fosfomycin inactive by modifying the fosfomycin molecule. 
Although the proteins in this superfamily are functionally distinct, their structures are similar. The difference among the three dimensional structures of the three types of proteins in this superfamily is interesting from an evolutionary perspective. Both glyoxalase I and BLMA show domain swapping between subunits. However, there is no domain swapping for type 1 extradiol dioxygenases." Q#11101 - CGI_10013921 superfamily 241768 179 351 4.30E-42 146.87 cl00305 Sua5_yciO_yrdC superfamily - - Telomere recombination; This domain has been shown to bind preferentially to dsRNA. The domain is found in SUA5 as well as HypF and YrdC. It has also been shown to be required for telomere recombination in yeast. Q#11103 - CGI_10013923 superfamily 241768 43 229 6.15E-29 109.121 cl00305 Sua5_yciO_yrdC superfamily - - Telomere recombination; This domain has been shown to bind preferentially to dsRNA. The domain is found in SUA5 as well as HypF and YrdC. It has also been shown to be required for telomere recombination in yeast. Q#11104 - CGI_10013924 superfamily 247866 8 203 1.22E-28 109.079 cl17312 PhyH superfamily - - "Phytanoyl-CoA dioxygenase (PhyH); This family is made up of several eukaryotic phytanoyl-CoA dioxygenase (PhyH) proteins, ectoine hydroxylases and a number of bacterial deoxygenases. PhyH is a peroxisomal enzyme catalyzing the first step of phytanic acid alpha-oxidation. PhyH deficiency causes Refsum's disease (RD) which is an inherited neurological syndrome biochemically characterized by the accumulation of phytanic acid in plasma and tissues." Q#11105 - CGI_10013925 superfamily 242187 57 499 1.29E-36 140.15 cl00912 MmgE_PrpD superfamily - - MmgE/PrpD family; This family includes 2-methylcitrate dehydratase EC:4.2.1.79 (PrpD) that is required for propionate catabolism. It catalyzes the third step of the 2-methylcitric acid cycle.
Q#11108 - CGI_10013928 superfamily 241737 58 333 9.76E-137 394.3 cl00264 Ferritin_like superfamily - - "Ferritin-like superfamily of diiron-containing four-helix-bundle proteins; Ferritin-like, diiron-carboxylate proteins participate in a range of functions including iron regulation, mono-oxygenation, and reactive radical production. These proteins are characterized by the fact that they catalyze dioxygen-dependent oxidation-hydroxylation reactions within diiron centers; one exception is manganese catalase, which catalyzes peroxide-dependent oxidation-reduction within a dimanganese center. Diiron-carboxylate proteins are further characterized by the presence of duplicate metal ligands, glutamates and histidines (ExxH) and two additional glutamates within a four-helix bundle. Outside of these conserved residues there is little obvious homology. Members include bacterioferritin, ferritin, rubrerythrin, aromatic and alkene monooxygenase hydroxylases (AAMH), ribonucleotide reductase R2 (RNRR2), acyl-ACP-desaturases (Acyl_ACP_Desat), manganese (Mn) catalases, demethoxyubiquinone hydroxylases (DMQH), DNA protecting proteins (DPS), and ubiquinol oxidases (AOX), and the aerobic cyclase system, Fe-containing subunit (ACSF)." Q#11110 - CGI_10013930 superfamily 204056 354 406 2.43E-10 56.3373 cl07395 DEK_C superfamily - - DEK C terminal domain; DEK is a chromatin associated protein that is linked with cancers and autoimmune disease. This domain is found at the C terminal of DEK and is of clinical importance since it can reverse the characteristic abnormal DNA-mutagen sensitivity in fibroblasts from ataxia-telangiectasia (A-T) patients. The structure of this domain shows it to be homologous to the E2F/DP transcription factor family. This domain is also found in chitin synthase proteins and in protein phosphatases. 
Q#11114 - CGI_10016255 superfamily 248312 13 164 2.21E-07 46.9629 cl17758 PMP22_Claudin superfamily - - PMP-22/EMP/MP20/Claudin family; PMP-22/EMP/MP20/Claudin family. Q#11115 - CGI_10016256 superfamily 241563 68 109 7.45E-06 42.8516 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#11116 - CGI_10016257 superfamily 245084 29 349 0 661.472 cl09506 catalase_like superfamily - - "Catalase-like heme-binding proteins and protein domains; Catalase is a ubiquitous enzyme found in both prokaryotes and eukaryotes involved in the protection of cells from the toxic effects of peroxides. It catalyses the conversion of hydrogen peroxide to water and molecular oxygen. Several other related protein families share the catalase fold and bind to heme, but do not necessarily have catalase activity." Q#11116 - CGI_10016257 superfamily 245084 351 397 1.29E-12 67.557 cl09506 catalase_like superfamily N - "Catalase-like heme-binding proteins and protein domains; Catalase is a ubiquitous enzyme found in both prokaryotes and eukaryotes involved in the protection of cells from the toxic effects of peroxides. It catalyses the conversion of hydrogen peroxide to water and molecular oxygen. Several other related protein families share the catalase fold and bind to heme, but do not necessarily have catalase activity." Q#11117 - CGI_10016258 superfamily 242385 163 449 0 580.608 cl01244 arom_aa_hydroxylase superfamily - - "Biopterin-dependent aromatic amino acid hydroxylase; a family of non-heme, iron(II)-dependent enzymes that includes prokaryotic and eukaryotic phenylalanine-4-hydroxylase (PheOH), eukaryotic tyrosine hydroxylase (TyrOH) and eukaryotic tryptophan hydroxylase (TrpOH). 
PheOH converts L-phenylalanine to L-tyrosine, an important step in phenylalanine catabolism and neurotransmitter biosynthesis, and is linked to a severe variant of phenylketonuria in humans. TyrOH and TrpOH are involved in the biosynthesis of catecholamine and serotonin, respectively. The eukaryotic enzymes are all homotetramers." Q#11117 - CGI_10016258 superfamily 245020 66 138 1.55E-29 110.921 cl09141 ACT superfamily - - "ACT domains are commonly involved in specifically binding an amino acid or other small ligand leading to regulation of the enzyme; Members of this CD belong to the superfamily of ACT regulatory domains. Pairs of ACT domains are commonly involved in specifically binding an amino acid or other small ligand leading to regulation of the enzyme. The ACT domain has been detected in a number of diverse proteins; some of these proteins are involved in amino acid and purine biosynthesis, phenylalanine hydroxylation, regulation of bacterial metabolism and transcription, and many remain to be characterized. ACT domain-containing enzymes involved in amino acid and purine synthesis are in many cases allosteric enzymes with complex regulation enforced by the binding of ligands. The ACT domain is commonly involved in the binding of a small regulatory molecule, such as the amino acids L-Ser and L-Phe in the case of D-3-phosphoglycerate dehydrogenase and the bifunctional chorismate mutase-prephenate dehydratase enzyme (P-protein), respectively. Aspartokinases typically consist of two C-terminal ACT domains in a tandem repeat, but the second ACT domain is inserted within the first, resulting in, what is normally the terminal beta strand of ACT2, formed from a region N-terminal of ACT1. ACT domain repeats have been shown to have nonequivalent ligand-binding sites with complex regulatory patterns such as those seen in the bifunctional enzyme, aspartokinase-homoserine dehydrogenase (ThrA). 
In other enzymes, such as phenylalanine hydroxylases, the ACT domain appears to function as a flexible small module providing allosteric regulation via transmission of conformational changes, these conformational changes are not necessarily initiated by regulatory ligand binding at the ACT domain itself. ACT domains are present either singularly, N- or C-terminal, or in pairs present C-terminal or between two catalytic domains. Unique to cyanobacteria are four ACT domains C-terminal to an aspartokinase domain. A few proteins are composed almost entirely of ACT domain repeats as seen in the four ACT domain protein, the ACR protein, found in higher plants; and the two ACT domain protein, the glycine cleavage system transcriptional repressor (GcvR) protein, found in some bacteria. Also seen are single ACT domain proteins similar to the Streptococcus pneumoniae ACT domain protein (uncharacterized pdb structure 1ZPV) found in both bacteria and archaea. Purportedly, the ACT domain is an evolutionarily mobile ligand binding regulatory module that has been fused to different enzymes at various times." Q#11118 - CGI_10016259 superfamily 241787 1 49 3.89E-07 43.7374 cl00326 Ribosomal_L23 superfamily C - Ribosomal protein L23; Ribosomal protein L23. Q#11119 - CGI_10016260 superfamily 241567 178 308 6.84E-25 100.367 cl00042 CASc superfamily C - "Caspase, interleukin-1 beta converting enzyme (ICE) homologues; Cysteine-dependent aspartate-directed proteases that mediate programmed cell death (apoptosis). Caspases are synthesized as inactive zymogens and activated by proteolysis of the peptide backbone adjacent to an aspartate. The resulting two subunits associate to form an (alpha)2(beta)2-tetramer which is the active enzyme. Activation of caspases can be mediated by other caspase homologs." 
Q#11119 - CGI_10016260 superfamily 241567 42 155 8.13E-07 47.9803 cl00042 CASc superfamily N - "Caspase, interleukin-1 beta converting enzyme (ICE) homologues; Cysteine-dependent aspartate-directed proteases that mediate programmed cell death (apoptosis). Caspases are synthesized as inactive zymogens and activated by proteolysis of the peptide backbone adjacent to an aspartate. The resulting two subunits associate to form an (alpha)2(beta)2-tetramer which is the active enzyme. Activation of caspases can be mediated by other caspase homologs." Q#11120 - CGI_10016261 superfamily 245605 22 90 1.60E-36 130.43 cl11409 RNAP_RPB11_RPB3 superfamily - - "RPB11 and RPB3 subunits of RNA polymerase; The eukaryotic RPB11 and RPB3 subunits of RNA polymerase (RNAP), as well as their archaeal (L and D subunits) and bacterial (alpha subunit) counterparts, are involved in the assembly of RNAP, a large multi-subunit complex responsible for the synthesis of RNA. It is the principal enzyme of the transcription process, and is a final target in many regulatory pathways that control gene expression in all living cells. At least three distinct RNAP complexes are found in eukaryotic nuclei: RNAP I, RNAP II, and RNAP III, for the synthesis of ribosomal RNA precursor, mRNA precursor, and 5S and tRNA, respectively. A single distinct RNAP complex is found in prokaryotes and archaea, which may be responsible for the synthesis of all RNAs. The assembly of the two largest eukaryotic RNAP subunits that provide most of the enzyme's catalytic functions depends on the presence of RPB3/RPB11 heterodimer subunits. This is also true for the archaeal (D/L subunits) and bacterial (alpha subunit) counterparts." Q#11120 - CGI_10016261 superfamily 241567 101 239 1.03E-29 116.546 cl00042 CASc superfamily C - "Caspase, interleukin-1 beta converting enzyme (ICE) homologues; Cysteine-dependent aspartate-directed proteases that mediate programmed cell death (apoptosis). 
Caspases are synthesized as inactive zymogens and activated by proteolysis of the peptide backbone adjacent to an aspartate. The resulting two subunits associate to form an (alpha)2(beta)2-tetramer which is the active enzyme. Activation of caspases can be mediated by other caspase homologs." Q#11120 - CGI_10016261 superfamily 241567 440 530 0.000290325 41.432 cl00042 CASc superfamily N - "Caspase, interleukin-1 beta converting enzyme (ICE) homologues; Cysteine-dependent aspartate-directed proteases that mediate programmed cell death (apoptosis). Caspases are synthesized as inactive zymogens and activated by proteolysis of the peptide backbone adjacent to an aspartate. The resulting two subunits associate to form an (alpha)2(beta)2-tetramer which is the active enzyme. Activation of caspases can be mediated by other caspase homologs." Q#11121 - CGI_10016262 superfamily 245213 72 112 3.26E-11 54.565 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#11121 - CGI_10016262 superfamily 241571 14 68 6.10E-05 38.9327 cl00049 CUB superfamily N - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." 
Q#11122 - CGI_10016263 superfamily 243831 17 178 1.84E-50 177.846 cl04653 TAF7 superfamily - - "TATA Binding Protein (TBP) Associated Factor 7 (TAF7) is one of several TAFs that bind TBP and is involved in forming Transcription Factor IID (TFIID) complex; The TATA Binding Protein (TBP) Associated Factor 7 (TAF7) is one of several TAFs that bind TBP and are involved in forming the Transcription Factor IID (TFIID) complex. TFIID is one of seven General Transcription Factors (GTF) (TFIIA, TFIIB, TFIID, TFIIE, TFIIF, and TFIIH) that are involved in accurate initiation of transcription by RNA polymerase II in eukaryotes. TFIID plays an important role in the recognition of promoter DNA and assembly of the preinitiation complex. TFIID complex is composed of the TBP and at least 13 TAFs. TAFs are named after their electrophoretic mobility in polyacrylamide gels in different species. A new, unified nomenclature has been suggested for the pol II TAFs to show the relationship between TAF orthologs and paralogs. Several hypotheses are proposed for TAFs functions such as serving as activator-binding sites, core-promoter recognition or a role in essential catalytic activity. Each TAF, with the help of a specific activator, is required only for expression of subset of genes and is not universally involved for transcription as are GTFs. TAF7 is involved in the regulation of the transition from PIC assembly to initiation and elongation. In yeast and human cells, TAFs have been found as components of other complexes besides TFIID. Several TAFs interact via histone-fold (HFD) motifs; the HFD is the interaction motif involved in heterodimerization of the core histones and their assembly into nucleosome octamers."
Q#11122 - CGI_10016263 superfamily 247684 360 788 3.59E-72 249.117 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#11123 - CGI_10016264 superfamily 247684 11 428 2.03E-107 330.779 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#11129 - CGI_10016270 superfamily 220695 112 240 2.04E-06 47.9587 cl18571 7TM_GPCR_Srx superfamily NC - Serpentine type 7TM GPCR chemoreceptor Srx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srx is part of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. 
Q#11130 - CGI_10016271 superfamily 246751 54 348 1.25E-109 328.819 cl14883 Lipase superfamily - - "Lipase. Lipases are esterases that can hydrolyze long-chain acyl-triglycerides into di- and monoglycerides, glycerol, and free fatty acids at a water/lipid interface. A typical feature of lipases is "interfacial activation", the process of becoming active at the lipid/water interface, although several examples of lipases have been identified that do not undergo interfacial activation . The active site of a lipase contains a catalytic triad consisting of Ser - His - Asp/Glu, but unlike most serine proteases, the active site is buried inside the structure. A "lid" or "flap" covers the active site, making it inaccessible to solvent and substrates. The lid opens during the process of interfacial activation, allowing the lipid substrate access to the active site." Q#11131 - CGI_10016272 superfamily 246751 1 275 1.82E-115 336.138 cl14883 Lipase superfamily - - "Lipase. Lipases are esterases that can hydrolyze long-chain acyl-triglycerides into di- and monoglycerides, glycerol, and free fatty acids at a water/lipid interface. A typical feature of lipases is "interfacial activation", the process of becoming active at the lipid/water interface, although several examples of lipases have been identified that do not undergo interfacial activation . The active site of a lipase contains a catalytic triad consisting of Ser - His - Asp/Glu, but unlike most serine proteases, the active site is buried inside the structure. A "lid" or "flap" covers the active site, making it inaccessible to solvent and substrates. The lid opens during the process of interfacial activation, allowing the lipid substrate access to the active site." Q#11134 - CGI_10016275 superfamily 246751 1 178 6.27E-69 213.259 cl14883 Lipase superfamily N - "Lipase. 
Lipases are esterases that can hydrolyze long-chain acyl-triglycerides into di- and monoglycerides, glycerol, and free fatty acids at a water/lipid interface. A typical feature of lipases is "interfacial activation", the process of becoming active at the lipid/water interface, although several examples of lipases have been identified that do not undergo interfacial activation . The active site of a lipase contains a catalytic triad consisting of Ser - His - Asp/Glu, but unlike most serine proteases, the active site is buried inside the structure. A "lid" or "flap" covers the active site, making it inaccessible to solvent and substrates. The lid opens during the process of interfacial activation, allowing the lipid substrate access to the active site." Q#11135 - CGI_10016276 superfamily 246751 59 352 1.20E-107 319.189 cl14883 Lipase superfamily - - "Lipase. Lipases are esterases that can hydrolyze long-chain acyl-triglycerides into di- and monoglycerides, glycerol, and free fatty acids at a water/lipid interface. A typical feature of lipases is "interfacial activation", the process of becoming active at the lipid/water interface, although several examples of lipases have been identified that do not undergo interfacial activation . The active site of a lipase contains a catalytic triad consisting of Ser - His - Asp/Glu, but unlike most serine proteases, the active site is buried inside the structure. A "lid" or "flap" covers the active site, making it inaccessible to solvent and substrates. The lid opens during the process of interfacial activation, allowing the lipid substrate access to the active site." Q#11136 - CGI_10016277 superfamily 241563 98 138 2.31E-07 48.2444 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#11137 - CGI_10016011 superfamily 245531 9 87 2.93E-10 52.3935 cl11158 BEN superfamily - - "BEN domain; The BEN domain is found in diverse animal proteins such as BANP/SMAR1, NAC1 and the Drosophila mod(mdg4) isoform C, in the chordopoxvirus virosomal protein E5R and in several proteins of polydnaviruses. Computational analysis suggests that the BEN domain mediates protein-DNA and protein-protein interactions during chromatin organisation and transcription." Q#11141 - CGI_10016015 superfamily 241583 395 567 6.52E-25 103.857 cl00064 ZnMc superfamily N - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#11142 - CGI_10016016 superfamily 241578 1 152 8.77E-29 116.237 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains."
Q#11142 - CGI_10016016 superfamily 241568 2089 2145 3.60E-07 50.154 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#11142 - CGI_10016016 superfamily 241568 1684 1738 2.16E-06 47.8428 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#11142 - CGI_10016016 superfamily 241568 1971 2026 5.91E-06 46.6872 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. 
The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#11142 - CGI_10016016 superfamily 241568 1283 1337 1.17E-05 45.9168 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#11142 - CGI_10016016 superfamily 241568 1455 1509 1.28E-05 45.5316 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." 
Q#11142 - CGI_10016016 superfamily 241568 939 993 1.55E-05 45.5316 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#11142 - CGI_10016016 superfamily 241568 235 289 3.55E-05 44.376 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#11142 - CGI_10016016 superfamily 241568 2262 2316 0.00012806 42.8352 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. 
The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#11142 - CGI_10016016 superfamily 241568 178 231 0.000206618 42.0648 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#11142 - CGI_10016016 superfamily 241568 2148 2202 0.000344014 41.2944 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." 
Q#11142 - CGI_10016016 superfamily 241568 1569 1624 0.00069444 40.524 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#11142 - CGI_10016016 superfamily 241568 768 822 0.000706774 40.524 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#11142 - CGI_10016016 superfamily 241568 349 403 0.000825637 40.1388 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. 
The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#11142 - CGI_10016016 superfamily 241568 996 1050 0.00140676 39.3684 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#11142 - CGI_10016016 superfamily 241568 1167 1217 0.00150021 39.3684 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." 
Q#11142 - CGI_10016016 superfamily 241568 566 620 0.00315669 38.598 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#11142 - CGI_10016016 superfamily 241568 825 879 0.00508527 37.8276 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#11142 - CGI_10016016 superfamily 241568 1340 1395 0.00558025 37.8276 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. 
The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#11142 - CGI_10016016 superfamily 241568 1798 1853 0.00652644 37.4424 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#11142 - CGI_10016016 superfamily 241568 1627 1681 0.00750045 37.4424 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." 
Q#11142 - CGI_10016016 superfamily 241568 1398 1452 0.00792475 37.0572 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#11142 - CGI_10016016 superfamily 241568 292 346 0.00945493 37.0572 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#11142 - CGI_10016016 superfamily 243119 2344 2386 0.00266198 38.579 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. 
Chitin binding has been demonstrated for a protein containing only two of these domains. Q#11142 - CGI_10016016 superfamily 241568 1110 1163 0.00759221 37.1696 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#11142 - CGI_10016016 superfamily 241568 882 935 0.00859975 37.1696 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#11142 - CGI_10016016 superfamily 241568 1914 1967 0.00892065 37.1209 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. 
The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#11143 - CGI_10016017 superfamily 218263 532 664 2.51E-39 141.487 cl04748 DUF547 superfamily - - "Protein of unknown function, DUF547; Family of uncharacterized proteins from C. elegans and A. thaliana." Q#11143 - CGI_10016017 superfamily 243119 48 98 0.000197544 40.1097 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#11143 - CGI_10016017 superfamily 247824 416 480 0.00606666 36.6137 cl17270 APH_ChoK_like superfamily N - "Aminoglycoside 3'-phosphotransferase (APH) and Choline Kinase (ChoK) family. The APH/ChoK family is part of a larger superfamily that includes the catalytic domains of other kinases, such as the typical serine/threonine/tyrosine protein kinases (PKs), RIO kinases, actin-fragmin kinase (AFK), and phosphoinositide 3-kinase (PI3K). The family is composed of APH, ChoK, ethanolamine kinase (ETNK), macrolide 2'-phosphotransferase (MPH2'), an unusual homoserine kinase, and uncharacterized proteins with similarity to the N-terminal domain of acyl-CoA dehydrogenase 10 (ACAD10). 
The members of this family catalyze the transfer of the gamma-phosphoryl group from ATP (or CTP) to small molecule substrates such as aminoglycosides, macrolides, choline, ethanolamine, and homoserine. Phosphorylation of the antibiotics, aminoglycosides and macrolides, leads to their inactivation and to bacterial antibiotic resistance. Phosphorylation of choline, ethanolamine, and homoserine serves as precursors to the synthesis of important biological compounds, such as the major phospholipids, phosphatidylcholine and phosphatidylethanolamine and the amino acids, threonine, methionine, and isoleucine." Q#11144 - CGI_10016018 superfamily 245606 56 210 9.29E-60 198.525 cl11410 TPP_enzyme_PYR superfamily - - "Pyrimidine (PYR) binding domain of thiamine pyrophosphate (TPP)-dependent enzymes; Thiamine pyrophosphate (TPP) family, pyrimidine (PYR) binding domain; found in many key metabolic enzymes which use TPP (also known as thiamine diphosphate) as a cofactor. TPP binds in the cleft formed by a PYR domain and a PP domain. The PYR domain, binds the aminopyrimidine ring of TPP, the PP domain binds the diphosphate residue. A polar interaction between the conserved glutamate of the PYR domain and the N1' of the TPP aminopyrimidine ring is shared by most TPP-dependent enzymes, and participates in the activation of TPP. The PYR and PP domains have a common fold, but do not share strong sequence conservation. The PP domain is not included in this group. Most TPP-dependent enzymes have the PYR and PP domains on the same subunit although these domains can be alternatively arranged in the primary structure. In the case of 2-oxoisovalerate dehydrogenase (2OXO), sulfopyruvate decarboxylase (ComDE), and the E1 component of human pyruvate dehydrogenase complex (E1- PDHc) the PYR and PP domains appear on different subunits. TPP-dependent enzymes are multisubunit proteins, the smallest catalytic unit being a dimer-of-active sites. 
For many of these enzymes the active sites lie between PP and PYR domains on different subunits. However, for the homodimeric enzymes 1-deoxy-D-xylulose 5-phosphate synthase (DXS) and Desulfovibrio africanus pyruvate:ferredoxin oxidoreductase (PFOR), each active site lies at the interface of the PYR and PP domains from the same subunit." Q#11144 - CGI_10016018 superfamily 242611 453 626 1.95E-51 176.184 cl01629 TPP_enzymes superfamily - - "Thiamine pyrophosphate (TPP) enzyme family, TPP-binding module; found in many key metabolic enzymes which use TPP (also known as thiamine diphosphate) as a cofactor. These enzymes include, among others, the E1 components of the pyruvate, the acetoin and the branched chain alpha-keto acid dehydrogenase complexes." Q#11144 - CGI_10016018 superfamily 215786 274 411 1.04E-14 71.8133 cl18345 TPP_enzyme_M superfamily - - "Thiamine pyrophosphate enzyme, central domain; The central domain of TPP enzymes contains a 2-fold Rossmann fold." Q#11145 - CGI_10016019 superfamily 248458 463 623 4.91E-10 60.4053 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. 
Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#11145 - CGI_10016019 superfamily 248458 36 182 5.18E-07 50.7753 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 
Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#11146 - CGI_10016020 superfamily 241659 134 210 2.84E-25 97.5906 cl00175 alpha-crystallin-Hsps_p23-like superfamily - - "alpha-crystallin domain (ACD) found in alpha-crystallin-type small heat shock proteins, and a similar domain found in p23 (a cochaperone for Hsp90) and in other p23-like proteins.; The alpha-crystallin-Hsps_p23-like superfamily includes the alpha-crystallin domain (ACD) of alpha-crystallin-type small heat shock proteins (sHsps) and a similar domain found in p23-like proteins. sHsps are small stress induced proteins with monomeric masses between 12-43 kDa, whose common feature is this ACD. sHsps are generally active as large oligomers consisting of multiple subunits, and are believed to be ATP-independent chaperones that prevent aggregation and are important in refolding in combination with other Hsps. p23 is a cochaperone of the Hsp90 chaperoning pathway. It binds Hsp90 and participates in the folding of a number of Hsp90 clients including the progesterone receptor. p23 also has a passive chaperoning activity. p23 in addition may act as the cytosolic prostaglandin E2 synthase. Included in this superfamily is the p23-like C-terminal CHORD-SGT1 (CS) domain of suppressor of G2 allele of Skp1 (Sgt1) and the p23-like domains of human butyrate-induced transcript 1 (hB-ind1), NUD (nuclear distribution) C, Melusin, and NAD(P)H cytochrome b5 (NCB5) oxidoreductase (OR)." 
Q#11146 - CGI_10016020 superfamily 241659 270 348 1.52E-21 87.5754 cl00175 alpha-crystallin-Hsps_p23-like superfamily - - "alpha-crystallin domain (ACD) found in alpha-crystallin-type small heat shock proteins, and a similar domain found in p23 (a cochaperone for Hsp90) and in other p23-like proteins.; The alpha-crystallin-Hsps_p23-like superfamily includes the alpha-crystallin domain (ACD) of alpha-crystallin-type small heat shock proteins (sHsps) and a similar domain found in p23-like proteins. sHsps are small stress induced proteins with monomeric masses between 12-43 kDa, whose common feature is this ACD. sHsps are generally active as large oligomers consisting of multiple subunits, and are believed to be ATP-independent chaperones that prevent aggregation and are important in refolding in combination with other Hsps. p23 is a cochaperone of the Hsp90 chaperoning pathway. It binds Hsp90 and participates in the folding of a number of Hsp90 clients including the progesterone receptor. p23 also has a passive chaperoning activity. p23 in addition may act as the cytosolic prostaglandin E2 synthase. Included in this superfamily is the p23-like C-terminal CHORD-SGT1 (CS) domain of suppressor of G2 allele of Skp1 (Sgt1) and the p23-like domains of human butyrate-induced transcript 1 (hB-ind1), NUD (nuclear distribution) C, Melusin, and NAD(P)H cytochrome b5 (NCB5) oxidoreductase (OR)." Q#11147 - CGI_10016021 superfamily 248097 131 241 1.64E-24 95.0246 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#11147 - CGI_10016021 superfamily 247746 16 90 0.00297245 35.3118 cl17192 ATP-synt_B superfamily N - "ATP synthase B/B' CF(0); Part of the CF(0) (base unit) of the ATP synthase. The base unit is thought to translocate protons through membrane (inner membrane in mitochondria, thylakoid membrane in plants, cytoplasmic membrane in bacteria). 
The B subunits are thought to interact with the stalk of the CF(1) subunits. This domain should not be confused with the ab CF(1) proteins (in the head of the ATP synthase) which are found in pfam00006" Q#11148 - CGI_10016022 superfamily 248097 271 381 4.80E-22 90.017 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#11148 - CGI_10016022 superfamily 147120 71 174 0.00245117 36.5828 cl04763 Cor1 superfamily - - "Cor1/Xlr/Xmr conserved region; Cor1 is a component of the chromosome core in the meiotic prophase chromosomes. Xlr is a lymphoid cell specific protein. Xlm is abundantly transcribed in testis in a tissue-specific and developmentally regulated manner. The protein is located in the nuclei of spermatocytes, early in the prophase of the first meiotic division, and later becomes concentrated in the XY nuclear subregion where it is in particular associated with the axes of sex chromosomes." Q#11149 - CGI_10016023 superfamily 248097 310 417 2.98E-23 93.869 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#11151 - CGI_10016025 superfamily 243029 16 41 0.00173982 32.7078 cl02422 HRM superfamily C - Hormone receptor domain; This extracellular domain contains four conserved cysteines that probably form disulphide bridges. The domain is found in a variety of hormone receptors. It may be a ligand binding domain. Q#11154 - CGI_10016028 superfamily 247743 295 434 3.05E-23 99.1427 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. 
Members of the AAA+ ATPases function as molecular chaperones, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#11154 - CGI_10016028 superfamily 144608 243 262 0.000255461 42.8873 cl18013 Mg_chelatase superfamily C - "Magnesium chelatase, subunit ChlI; Magnesium-chelatase is a three-component enzyme that catalyzes the insertion of Mg2+ into protoporphyrin IX. This is the first unique step in the synthesis of (bacterio)chlorophyll. Due to this, it is thought that Mg-chelatase has an important role in channelling intermediates into the (bacterio)chlorophyll branch in response to conditions suitable for photosynthetic growth. ChlI and BchD have molecular weight between 38-42 kDa." Q#11154 - CGI_10016028 superfamily 204202 501 533 0.00750814 36.4645 cl07827 Vps4_C superfamily N - Vps4 C terminal oligomerisation domain; This domain is found at the C terminal of ATPase proteins involved in vacuolar sorting. It forms an alpha helix structure and is required for oligomerisation. Q#11163 - CGI_10000733 superfamily 242685 1 93 1.60E-38 133.086 cl01749 UPF0160 superfamily N - Uncharacterized protein family (UPF0160); This family of proteins contains a large number of metal binding residues. The patterns are suggestive of a phosphoesterase function. The conserved DHH motif may mean this family is related to pfam01368. Q#11164 - CGI_10016812 superfamily 247724 24 171 7.45E-87 256.339 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. 
The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#11165 - CGI_10016813 superfamily 241900 42 320 2.78E-112 329.769 cl00490 EEP superfamily - - "Exonuclease-Endonuclease-Phosphatase (EEP) domain superfamily; This large superfamily includes the catalytic domain (exonuclease/endonuclease/phosphatase or EEP domain) of a diverse set of proteins including the ExoIII family of apurinic/apyrimidinic (AP) endonucleases, inositol polyphosphate 5-phosphatases (INPP5), neutral sphingomyelinases (nSMases), deadenylases (such as the vertebrate circadian-clock regulated nocturnin), bacterial cytolethal distending toxin B (CdtB), deoxyribonuclease 1 (DNase1), the endonuclease domain of the non-LTR retrotransposon LINE-1, and related domains. These diverse enzymes share a common catalytic mechanism of cleaving phosphodiester bonds; their substrates range from nucleic acids to phospholipids and perhaps proteins." 
Q#11167 - CGI_10016815 superfamily 241672 67 294 2.17E-71 224.029 cl00192 ribokinase_pfkB_like superfamily - - "ribokinase/pfkB superfamily: Kinases that accept a wide variety of substrates, including carbohydrates and aromatic small molecules, all are phosphorylated at a hydroxyl group. The superfamily includes ribokinase, fructokinase, ketohexokinase, 2-dehydro-3-deoxygluconokinase, 1-phosphofructokinase, the minor 6-phosphofructokinase (PfkB), inosine-guanosine kinase, and adenosine kinase. Even though there is a high degree of structural conservation within this superfamily, their multimerization level varies widely, monomeric (e.g. adenosine kinase), dimeric (e.g. ribokinase), and trimeric (e.g. THZ kinase)." Q#11169 - CGI_10016817 superfamily 247068 455 551 1.33E-25 103.547 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#11169 - CGI_10016817 superfamily 247068 242 341 1.43E-25 103.162 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#11169 - CGI_10016817 superfamily 247068 559 654 7.42E-21 89.6801 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#11169 - CGI_10016817 superfamily 247068 356 447 4.06E-17 78.8945 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#11169 - CGI_10016817 superfamily 247068 135 233 1.31E-16 77.3537 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#11169 - CGI_10016817 superfamily 247068 674 757 1.13E-07 51.1602 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#11169 - CGI_10016817 superfamily 247068 9 94 2.10E-10 59.0435 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#11170 - CGI_10016818 superfamily 247068 220 320 3.16E-27 108.17 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#11170 - CGI_10016818 superfamily 247068 434 530 5.37E-23 95.8433 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#11170 - CGI_10016818 superfamily 247068 540 634 1.19E-20 89.2949 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#11170 - CGI_10016818 superfamily 247068 109 212 2.17E-20 88.5245 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#11170 - CGI_10016818 superfamily 247068 335 426 9.77E-16 75.0425 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#11170 - CGI_10016818 superfamily 247068 654 736 1.15E-15 74.6573 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#11170 - CGI_10016818 superfamily 247068 2 78 8.48E-08 51.3395 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#11172 - CGI_10016820 superfamily 248458 487 737 1.96E-14 74.2725 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 
Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#11172 - CGI_10016820 superfamily 248458 194 299 1.31E-07 53.0865 cl17904 MFS superfamily NC - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. 
Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#11173 - CGI_10016821 superfamily 220238 51 151 2.43E-26 99.2888 cl18552 DUF2012 superfamily - - Protein of unknown function (DUF2012); This is a eukaryotic family of uncharacterized proteins. Q#11174 - CGI_10016822 superfamily 248345 551 674 5.23E-38 138.926 cl17791 SAC3_GANP superfamily - - "SAC3/GANP/Nin1/mts3/eIF-3 p25 family; This large family includes diverse proteins involved in large complexes. The alignment contains one highly conserved negatively charged residue and one highly conserved positively charged residue that are probably important for the function of these proteins. The family includes the yeast nuclear export factor Sac3, and mammalian GANP/MCM3-associated proteins, which facilitate the nuclear localisation of MCM3, a protein that associates with chromatin in the G1 phase of the cell-cycle. The 26S protease (or 26S proteasome) is responsible for degrading ubiquitin conjugates. It consists of 19S regulatory complexes associated with the ends of 20S proteasomes. The 19S regulatory complex is composed of about 20 different polypeptides and confers ATP-dependence and substrate specificity to the 26S enzyme. The conserved region occurs at the C-terminal of the Nin1-like regulatory subunit. This family includes several eukaryotic translation initiation factor 3 subunit 11 (eIF-3 p25) proteins. Eukaryotic initiation factor 3 (eIF3) is a multisubunit complex that is required for binding of mRNA to 40 S ribosomal subunits, stabilisation of ternary complex binding to 40 S subunits, and dissociation of 40 and 60 S subunits." 
Q#11177 - CGI_10016825 superfamily 218721 14 366 1.60E-64 215.058 cl05344 TROVE superfamily - - "TROVE domain; This presumed domain is found in TEP1 and Ro60 proteins, that are RNA-binding components of Telomerase, Ro and Vault RNPs. This domain has been named TROVE, (after Telomerase, Ro and Vault). This domain is probably RNA-binding." Q#11178 - CGI_10016826 superfamily 243072 116 183 1.58E-13 67.7938 cl02529 ANK superfamily N - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#11179 - CGI_10016827 superfamily 248264 46 214 2.81E-18 79.9737 cl17710 DDE_4 superfamily - - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#11180 - CGI_10016828 superfamily 110440 97 123 0.000208047 35.0761 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. 
Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#11182 - CGI_10016830 superfamily 243092 11 32 0.00250558 31.1684 cl02567 WD40 superfamily NC - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#11183 - CGI_10016831 superfamily 243092 11 90 4.73E-09 50.7964 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#11185 - CGI_10016833 superfamily 222150 136 154 0.00196285 33.9045 cl16282 zf-H2C2_2 superfamily C - Zinc-finger double domain; Zinc-finger double domain. Q#11185 - CGI_10016833 superfamily 222150 107 132 0.00239574 33.5193 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. 
Q#11186 - CGI_10016834 superfamily 243092 15 255 2.63E-44 153.645 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#11187 - CGI_10016835 superfamily 241563 69 109 3.37E-05 41.696 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#11187 - CGI_10016835 superfamily 241563 15 60 0.00669859 34.7624 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#11188 - CGI_10016836 superfamily 117343 17 71 1.08E-14 66.3185 cl07399 CathepsinC_exc superfamily C - "Cathepsin C exclusion domain; Cathepsin C (dipeptidyl peptidase I) is the physiological activator of a group of serine proteases. This domain corresponds to the exclusion domain whose structure excludes the approach of a polypeptide apart from its termini. It forms an enclosed beta barrel structure composed from 8 anti-parallel beta strands. Based on a structural comparison and interaction data, it is suggested that the exclusion domain originates from a metallo-protease inhibitor." Q#11189 - CGI_10016837 superfamily 247724 22 180 1.58E-111 317.723 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." 
Q#11190 - CGI_10016838 superfamily 247724 106 308 1.70E-87 269.79 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#11190 - CGI_10016838 superfamily 207690 55 77 0.00642841 34.6329 cl02656 zf-RanBP superfamily - - Zn-finger in Ran binding protein and others; Zn-finger in Ran binding protein and others. Q#11197 - CGI_10000848 superfamily 245814 17 97 0.000994005 34.7885 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#11198 - CGI_10001011 superfamily 238012 229 287 1.18E-05 42.3414 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#11198 - CGI_10001011 superfamily 243198 2 227 2.21E-86 264.222 cl02806 Laminin_N superfamily - - Laminin N-terminal (Domain VI); Laminin N-terminal (Domain VI). Q#11198 - CGI_10001011 superfamily 238012 288 333 0.000728128 37.3338 cl11390 EGF_Lam superfamily C - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#11200 - CGI_10014349 superfamily 150784 106 485 6.01E-103 319.559 cl10848 DUF2359 superfamily - - "Uncharacterized conserved protein (DUF2359); This is a 450 amino acid region of a family of proteins conserved from insects to humans. 
The mouse protein, Q8BM55, is annotated as being a putative Vitamin K-dependent carboxylation gamma-carboxyglutamic (GLA) domain containing protein, but this could not be confirmed. The function is not known." Q#11201 - CGI_10014350 superfamily 222429 9 84 1.40E-05 39.5313 cl18676 Myb_DNA-bind_5 superfamily - - Myb/SANT-like DNA-binding domain; This presumed domain appears to be related to other Myb/SANT like DNA binding domains. This family is greatly expanded in arthropods and higher eukaryotes. Q#11202 - CGI_10014351 superfamily 217473 133 324 6.22E-23 98.9765 cl03978 Mab-21 superfamily N - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#11206 - CGI_10014355 superfamily 241600 91 290 4.98E-80 244.072 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#11207 - CGI_10014356 superfamily 241600 131 328 4.43E-82 252.161 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. 
The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#11208 - CGI_10014357 superfamily 241600 32 207 2.79E-73 223.271 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#11209 - CGI_10014358 superfamily 245213 166 206 6.33E-11 55.7206 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#11209 - CGI_10014358 superfamily 241571 71 162 1.09E-07 48.1775 cl00049 CUB superfamily N - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#11210 - CGI_10014359 superfamily 241600 137 353 1.59E-94 283.362 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#11210 - CGI_10014359 superfamily 241619 37 81 0.00028797 38.3321 cl00112 PAN_APPLE superfamily C - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#11211 - CGI_10014360 superfamily 241600 125 345 1.86E-93 280.281 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. 
The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#11211 - CGI_10014360 superfamily 241619 37 81 0.000221476 38.7173 cl00112 PAN_APPLE superfamily C - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#11212 - CGI_10014361 superfamily 248458 286 471 3.22E-10 60.4053 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. 
MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#11212 - CGI_10014361 superfamily 248458 73 224 0.00749326 37.2933 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. 
Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#11220 - CGI_10014369 superfamily 241546 564 683 8.29E-41 149.348 cl00011 PLAT superfamily - - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats. The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#11220 - CGI_10014369 superfamily 248011 1496 1566 0.000166302 42.0142 cl17457 PKD superfamily - - "polycystic kidney disease I (PKD) domain; similar to other cell-surface modules, with an IG-like fold; domain probably functions as a ligand binding site in protein-protein or protein-carbohydrate interactions; a single instance of the repeat is presented here. The domain is also found in microbial collagenases and chitinases." Q#11220 - CGI_10014369 superfamily 243086 454 505 0.00125704 39.293 cl02559 GPS superfamily - - "Latrophilin/CL-1-like GPS domain; Domain present in latrophilin/CL-1, sea urchin REJ and polycystin." 
Q#11220 - CGI_10014369 superfamily 248011 1603 1677 0.00234116 38.5862 cl17457 PKD superfamily N - "polycystic kidney disease I (PKD) domain; similar to other cell-surface modules, with an IG-like fold; domain probably functions as a ligand binding site in protein-protein or protein-carbohydrate interactions; a single instance of the repeat is presented here. The domain is also found in microbial collagenases and chitinases." Q#11220 - CGI_10014369 superfamily 248011 1706 1753 0.00879091 37.0066 cl17457 PKD superfamily C - "polycystic kidney disease I (PKD) domain; similar to other cell-surface modules, with an IG-like fold; domain probably functions as a ligand binding site in protein-protein or protein-carbohydrate interactions; a single instance of the repeat is presented here. The domain is also found in microbial collagenases and chitinases." Q#11223 - CGI_10001097 superfamily 245213 28 61 4.46E-06 43.009 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#11223 - CGI_10001097 superfamily 246918 75 126 4.80E-14 65.6859 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#11223 - CGI_10001097 superfamily 221695 10 31 3.27E-06 43.2126 cl18612 cEGF superfamily - - "Complement Clr-like EGF-like; cEGF, or complement Clr-like EGF, domains have six conserved cysteine residues disulfide-bonded into the characteristic pattern 'ababcc'. 
They are found in blood coagulation proteins such as fibrillin, Clr and Cls, thrombomodulin, and the LDL receptor. The core fold of the EGF domain consists of two small beta-hairpins packed against each other. Two major structural variants have been identified based on the structural context of the C-terminal cysteine residue of disulfide 'c' in the C-terminal hairpin: hEGFs and cEGFs. In cEGFs the C-terminal thiol resides on the C-terminal beta-sheet, resulting in long loop-lengths between the cysteine residues of disulfide 'c', typically C[10+]XC. These longer loop-lengths may have arisen by selective cysteine loss from a four-disulfide EGF template such as laminin or integrin. Tandem cEGF domains have five linking residues between terminal cysteines of adjacent domains. cEGF domains may or may not bind calcium in the linker region. cEGF domains with the consensus motif CXN4X[F,Y]XCXC are hydroxylated exclusively on the asparagine residue." Q#11224 - CGI_10001167 superfamily 241563 72 109 4.39E-06 44.3924 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#11224 - CGI_10001167 superfamily 241563 21 58 0.00700125 35.1476 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#11225 - CGI_10022791 superfamily 193256 241 451 2.08E-65 216.352 cl18189 AAA_8 superfamily C - "P-loop containing dynein motor region D4; The 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. 
The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This particular family is the D4 ATP-binding region of the motor." Q#11225 - CGI_10022791 superfamily 193251 6 204 1.11E-40 148.546 cl18188 AAA_7 superfamily N - "P-loop containing dynein motor region D3; the 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This particular family is the D3 and is an ATP binding site." Q#11225 - CGI_10022791 superfamily 193253 491 540 5.50E-07 50.4205 cl15084 MT superfamily NC - "Microtubule-binding stalk of dynein motor; the 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This family is the region between D4 and D5 and is the two predicted alpha-helical coiled coil segments that form the stalk supporting the ATP-sensitive microtubule binding component." Q#11226 - CGI_10022792 superfamily 193251 1466 1581 1.02E-16 81.5208 cl18188 AAA_7 superfamily C - "P-loop containing dynein motor region D3; the 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. 
This particular family is the D3 and is an ATP binding site." Q#11226 - CGI_10022792 superfamily 193256 240 292 5.80E-09 57.65 cl18189 AAA_8 superfamily NC - "P-loop containing dynein motor region D4; The 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This particular family is the D4 ATP-binding region of the motor." Q#11226 - CGI_10022792 superfamily 241619 39 102 0.00022386 41.4137 cl00112 PAN_APPLE superfamily - - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#11226 - CGI_10022792 superfamily 247743 894 1015 0.00281984 38.4304 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." 
Q#11227 - CGI_10022793 superfamily 245206 5 139 4.46E-42 141.284 cl09931 NADB_Rossmann superfamily C - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#11228 - CGI_10022794 superfamily 245206 5 233 4.58E-62 197.138 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." 
Q#11229 - CGI_10022795 superfamily 245206 5 233 1.01E-62 199.064 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#11230 - CGI_10022796 superfamily 245847 41 156 0.00137951 37.8672 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#11231 - CGI_10022797 superfamily 241563 61 96 0.00223025 37.0736 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#11235 - CGI_10022802 superfamily 245226 38 152 2.44E-14 66.5552 cl10012 DnaQ_like_exo superfamily - - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. 
It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#11237 - CGI_10022804 superfamily 246925 122 309 7.87E-09 57.3654 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#11237 - CGI_10022804 superfamily 246925 2 207 1.20E-05 47.3502 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. 
LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#11239 - CGI_10022806 superfamily 242087 462 589 5.42E-27 107.943 cl00781 DUF389 superfamily - - Domain of unknown function (DUF389); Family of hypothetical bacterial proteins with an undetermined function. Q#11240 - CGI_10022807 superfamily 220653 104 156 4.24E-12 58.5371 cl10936 PP28 superfamily C - "Casein kinase substrate phosphoprotein PP28; This domain is a region of 70 residues conserved in proteins from plants to humans and contains a serine/arginine rich motif. In rats the full protein is a casein kinase substrate, and this region contains phosphorylation sites for both cAMP-dependent protein kinase and casein kinase II." Q#11241 - CGI_10022808 superfamily 243100 278 339 1.58E-10 56.4183 cl02576 B_zip1 superfamily - - "basic leucine zipper DNA-binding and multimerization region of GCN4 and related proteins; Basic leucine zipper (bZIP) transcription factors act in networks of homo- and hetero-dimers in the regulation in a diverse set of cellular pathways. Classical leucine zippers have alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization creates a pair of basic regions that bind DNA and undergo conformational change. GCN4 was identified in Saccharomyces cerevisiae from mutations in a deficiency in activation with the general amino acid control pathway. 
GCN4 encodes a trans-activator of amino acid biosynthetic genes containing 2 acidic activation domains and a C-terminal bZIP domain, comprised of a basic alpha-helical DNA-binding region and a coiled-coil dimerization region." Q#11243 - CGI_10022810 superfamily 222005 55 136 3.75E-08 51.9692 cl18632 AAA_19 superfamily - - Part of AAA domain; Part of AAA domain. Q#11244 - CGI_10022811 superfamily 247637 9 371 0 710.537 cl16912 MDR superfamily - - "Medium chain reductase/dehydrogenase (MDR)/zinc-dependent alcohol dehydrogenase-like family; The medium chain reductase/dehydrogenases (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH. MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P) binding-Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES. The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH) , quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones. ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. 
The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines. Other MDR members have only a catalytic zinc, and some contain no coordinated zinc." Q#11246 - CGI_10022813 superfamily 247916 318 369 0.000316691 39.2883 cl17362 Transglut_core superfamily N - "Transglutaminase-like superfamily; This family includes animal transglutaminases and other bacterial proteins of unknown function. Sequence conservation in this superfamily primarily involves three motifs that centre around conserved cysteine, histidine, and aspartate residues that form the catalytic triad in the structurally characterized transglutaminase, the human blood clotting factor XIIIa'. On the basis of the experimentally demonstrated activity of the Methanobacterium phage pseudomurein endoisopeptidase, it is proposed that many, if not all, microbial homologues of the transglutaminases are proteases and that the eukaryotic transglutaminases have evolved from an ancestral protease." Q#11251 - CGI_10022818 superfamily 243088 544 660 2.90E-51 174.073 cl02563 PX_domain superfamily - - "The Phox Homology domain, a phosphoinositide binding module; The PX domain is a phosphoinositide (PI) binding module involved in targeting proteins to membranes. Proteins containing PX domains interact with PIs and have been implicated in highly diverse functions such as cell signaling, vesicular trafficking, protein sorting, lipid modification, cell polarity and division, activation of T and B cells, and cell survival. Many members of this superfamily bind phosphatidylinositol-3-phosphate (PI3P) but in some cases, other PIs such as PI4P or PI(3,4)P2, among others, are the preferred substrates. In addition to protein-lipid interaction, the PX domain may also be involved in protein-protein interaction, as in the cases of p40phox, p47phox, and some sorting nexins (SNXs). The PX domain is conserved from yeast to humans and is found in more than 100 proteins. 
The majority of PX domain-containing proteins are SNXs, which play important roles in endosomal sorting." Q#11251 - CGI_10022818 superfamily 243142 13 106 1.01E-21 91.9191 cl02689 RUN superfamily N - "RUN domain; This domain is present in several proteins that are linked to the functions of GTPases in the Rap and Rab families. They could hence play important roles in multiple Ras-like GTPase signalling pathways. The domain comprises six conserved regions, which in some proteins have considerable insertions between them. The domain core is thought to take up a predominantly alpha fold, with basic amino acids in regions A and D possibly playing a functional role in interactions with Ras GTPases." Q#11251 - CGI_10022818 superfamily 191369 338 516 0.00989919 36.3162 cl05372 DUF837 superfamily - - Protein of unknown function (DUF837); This family consists of several eukaryotic proteins of unknown function. One of the family members is a circulating cathodic antigen (CCA) found in Schistosoma mansoni (Blood fluke). Q#11252 - CGI_10022819 superfamily 243142 54 197 9.73E-27 104.245 cl02689 RUN superfamily - - "RUN domain; This domain is present in several proteins that are linked to the functions of GTPases in the Rap and Rab families. They could hence play important roles in multiple Ras-like GTPase signalling pathways. The domain comprises six conserved regions, which in some proteins have considerable insertions between them. The domain core is thought to take up a predominantly alpha fold, with basic amino acids in regions A and D possibly playing a functional role in interactions with Ras GTPases." Q#11253 - CGI_10022820 superfamily 241809 41 165 2.63E-40 135.312 cl00353 Ribosomal_L16_L10e superfamily - - "Ribosomal_L16_L10e: L16 is an essential protein in the large ribosomal subunit of bacteria, mitochondria, and chloroplasts. 
Large subunits that lack L16 are defective in peptidyl transferase activity, peptidyl-tRNA hydrolysis activity, association with the 30S subunit, binding of aminoacyl-tRNA and interaction with antibiotics. L16 is required for the function of elongation factor P (EF-P), a protein involved in peptide bond synthesis through the stimulation of peptidyl transferase activity by the ribosome. Mutations in L16 and the adjoining bases of 23S rRNA confer antibiotic resistance in bacteria, suggesting a role for L16 in the formation of the antibiotic binding site. The GTPase RbgA (YlqF) is essential for the assembly of the large subunit, and it is believed to regulate the incorporation of L16. L10e is the archaeal and eukaryotic cytosolic homolog of bacterial L16. L16 and L10e exhibit structural differences at the N-terminus." Q#11257 - CGI_10022824 superfamily 241550 241 604 3.32E-169 488.681 cl00015 nt_trans superfamily - - "nucleotidyl transferase superfamily; nt_trans (nucleotidyl transferase) This superfamily includes the class I amino-acyl tRNA synthetases, pantothenate synthetase (PanC), ATP sulfurylase, and the cytidylyltransferases, all of which have a conserved dinucleotide-binding domain." Q#11257 - CGI_10022824 superfamily 247744 41 190 1.31E-91 281.674 cl17190 NK superfamily - - "Nucleoside/nucleotide kinase (NK) is a protein superfamily consisting of multiple families of enzymes that share structural similarity and are functionally related to the catalysis of the reversible phosphate group transfer from nucleoside triphosphates to nucleosides/nucleotides, nucleoside monophosphates, or sugars. Members of this family play a wide variety of essential roles in nucleotide metabolism, the biosynthesis of coenzymes and aromatic compounds, as well as the metabolism of sugar and sulfate." 
Q#11258 - CGI_10022825 superfamily 245601 235 396 7.52E-23 95.9039 cl11399 HP superfamily N - "Histidine phosphatase domain found in a functionally diverse set of proteins, mostly phosphatases; contains a His residue which is phosphorylated during the reaction; Catalytic domain of a functionally diverse set of proteins, most of which are phosphatases. The conserved catalytic core of this domain contains a His residue which is phosphorylated in the reaction. This set of proteins includes cofactor-dependent and cofactor-independent phosphoglycerate mutases (dPGM, and BPGM respectively), fructose-2,6-bisphosphatase (F26BP)ase, Sts-1, SixA, histidine acid phosphatases, phytases, and related proteins. Functions include roles in metabolism, signaling, or regulation, for example F26BPase affects glycolysis and gluconeogenesis through controlling the concentration of F26BP; BPGM controls the concentration of 2,3-BPG (the main allosteric effector of hemoglobin in human blood cells); human Sts-1 is a T-cell regulator; Escherichia coli Six A participates in the ArcB-dependent His-to-Asp phosphorelay signaling system; phytases scavenge phosphate from extracellular sources. Deficiency and mutation in many of the human members result in disease, for example erythrocyte BPGM deficiency is a disease associated with a decrease in the concentration of 2,3-BPG. Clinical applications include the use of prostatic acid phosphatase (PAP) as a serum marker for prostate cancer. Agricultural applications include the addition of phytases to animal feed." Q#11260 - CGI_10022827 superfamily 243061 1 99 6.18E-29 101.65 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. 
Q#11261 - CGI_10022828 superfamily 243061 14 96 8.35E-11 56.9666 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#11262 - CGI_10001181 superfamily 243092 42 158 2.00E-10 56.1892 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#11263 - CGI_10001182 superfamily 241563 62 102 6.77E-06 44.0072 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#11264 - CGI_10001199 superfamily 246684 30 125 2.22E-42 138.078 cl14651 RNA_pol_Rpb6 superfamily N - "RNA polymerase Rpb6; Rpb6 is an essential subunit in the eukaryotic polymerases Pol I, II and III. This family also contains the bacterial equivalent to Rpb6, the omega subunit. Rpb6 and omega are structurally conserved and both function in polymerase assembly." Q#11265 - CGI_10001200 superfamily 214545 425 565 2.51E-54 184.06 cl10551 CULLIN superfamily - - Cullin; Cullin. Q#11265 - CGI_10001200 superfamily 245539 688 755 5.21E-22 91.0761 cl11186 Cullin_Nedd8 superfamily - - "Cullin protein neddylation domain; This is the neddylation site of cullin proteins which are a family of structurally related proteins containing an evolutionarily conserved cullin domain. With the exception of APC2, each member of the cullin family is modified by Nedd8 and several cullins function in Ubiquitin-dependent proteolysis, a process in which the 26S proteasome recognises and subsequently degrades a target protein tagged with K48-linked poly-ubiquitin chains. Cullins are molecular scaffolds responsible for assembling the ROC1/Rbx1 RING-based E3 ubiquitin ligases, of which several play a direct role in tumorigenesis. Nedd8/Rub1 is a small ubiquitin-like protein, which was originally found to be conjugated to Cdc53, a cullin component of the SCF (Skp1-Cdc53/CUL1-F-box protein) E3 Ub ligase complex in Saccharomyces cerevisiae, and Nedd8 modification has now emerged as a regulatory pathway of fundamental importance for cell cycle control and for embryogenesis in metazoans. The only identified Nedd8 substrates are cullins. Neddylation results in covalent conjugation of a Nedd8 moiety onto a conserved cullin lysine residue." 
Q#11267 - CGI_10022543 superfamily 221312 415 577 5.50E-78 247.873 cl13369 Vac14_Fig4_bd superfamily - - "Vacuolar protein 14 C-terminal Fig4p binding; Vac14 is a scaffold for the Fab1 kinase complex, a complex that allows for the dynamic interconversion of PI3P and PI(3,5)P2p (phosphoinositide phosphate (PIP) lipids, that are generated transiently on the cytoplasmic face of selected intracellular membranes). This interconversion is regulated by at least five proteins in yeast: the lipid kinase Fab1p, lipid phosphatase Fig4p, the Fab1p activator Vac7p, the Fab1p inhibitor Atg18p, and Vac14p, a protein required for the activity of both Fab1p and Fig4p. The C-terminal region of Vac14 binds to Fig4p. The full length Vac14 in yeasts is likely to be a protein carrying a succession of HEAT repeats, most of which have now degenerated. This regulatory system is crucial for the proper functioning of the mammalian nervous system." Q#11267 - CGI_10022543 superfamily 193231 66 162 1.97E-38 136.915 cl15071 Vac14_Fab1_bd superfamily - - "Vacuolar 14 Fab1-binding region; Vac14 is a scaffold for the Fab1 kinase complex, a complex that allows for the dynamic interconversion of PI3P and PI(3,5)P2p (phosphoinositide phosphate (PIP) lipids, that are generated transiently on the cytoplasmic face of selected intracellular membranes). This interconversion is regulated by at least five proteins in yeast: the lipid kinase Fab1p, lipid phosphatase Fig4p, the Fab1p activator Vac7p, the Fab1p inhibitor Atg18p, and Vac14p, a protein required for the activity of both Fab1p and Fig4p. This domain appears to be the one responsible for binding to Fab1. The full length Vac14 in yeasts is likely to be a protein carrying a succession of HEAT repeats, most of which have now degenerated. This regulatory system is crucial for the proper functioning of the mammalian nervous system." 
Q#11268 - CGI_10022544 superfamily 216411 110 170 0.000300087 38.4119 cl15974 MARVEL superfamily NC - "Membrane-associating domain; MARVEL domain-containing proteins are often found in lipid-associating proteins - such as Occludin and MAL family proteins. It may be part of the machinery of membrane apposition events, such as transport vesicle biogenesis." Q#11269 - CGI_10022545 superfamily 241680 39 206 2.08E-44 152.022 cl00200 MIP superfamily C - "Major intrinsic protein (MIP) superfamily. Members of the MIP superfamily function as membrane channels that selectively transport water, small neutral molecules, and ions out of and between cells. The channel proteins share a common fold: the N-terminal cytosolic portion followed by six transmembrane helices, which might have arisen through gene duplication. On the basis of sequence similarity and functional characteristics, the superfamily can be subdivided into two major groups: water-selective channels called aquaporins (AQPs) and glycerol uptake facilitators (GlpFs). AQPs are found in all three kingdoms of life, while GlpFs have been characterized only within microorganisms." Q#11271 - CGI_10022547 superfamily 241680 15 222 2.35E-53 175.134 cl00200 MIP superfamily - - "Major intrinsic protein (MIP) superfamily. Members of the MIP superfamily function as membrane channels that selectively transport water, small neutral molecules, and ions out of and between cells. The channel proteins share a common fold: the N-terminal cytosolic portion followed by six transmembrane helices, which might have arisen through gene duplication. On the basis of sequence similarity and functional characteristics, the superfamily can be subdivided into two major groups: water-selective channels called aquaporins (AQPs) and glycerol uptake facilitators (GlpFs). AQPs are found in all three kingdoms of life, while GlpFs have been characterized only within microorganisms." 
Q#11273 - CGI_10022549 superfamily 247856 21 81 2.45E-11 55.2465 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#11273 - CGI_10022549 superfamily 247856 56 130 2.31E-09 49.8537 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#11274 - CGI_10022550 superfamily 243061 769 869 6.91E-38 138.244 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#11274 - CGI_10022550 superfamily 243061 443 544 1.00E-34 129.384 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#11274 - CGI_10022550 superfamily 243061 662 763 1.36E-34 128.999 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. 
Q#11274 - CGI_10022550 superfamily 243061 117 218 7.01E-34 126.688 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#11274 - CGI_10022550 superfamily 243061 225 326 8.62E-33 123.606 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#11274 - CGI_10022550 superfamily 243061 553 653 1.42E-29 114.361 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#11274 - CGI_10022550 superfamily 243061 335 435 6.15E-29 112.821 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#11274 - CGI_10022550 superfamily 243061 9 109 1.55E-28 111.665 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#11275 - CGI_10022551 superfamily 242281 41 267 6.77E-45 155.225 cl01067 Dyp_perox superfamily N - Dyp-type peroxidase family; This family of dye-decolourising peroxidases lack a typical heme-binding region. Q#11276 - CGI_10022552 superfamily 207654 221 286 1.27E-22 88.6538 cl02574 Annexin superfamily - - Annexin; This family of annexins also includes giardin that has been shown to function as an annexin. 
Q#11276 - CGI_10022552 superfamily 207654 145 211 4.15E-17 73.2458 cl02574 Annexin superfamily - - Annexin; This family of annexins also includes giardin that has been shown to function as an annexin. Q#11276 - CGI_10022552 superfamily 207654 28 99 6.75E-16 70.1642 cl02574 Annexin superfamily - - Annexin; This family of annexins also includes giardin that has been shown to function as an annexin. Q#11276 - CGI_10022552 superfamily 207654 1 21 0.00910864 33.5702 cl02574 Annexin superfamily N - Annexin; This family of annexins also includes giardin that has been shown to function as an annexin. Q#11277 - CGI_10022553 superfamily 201431 615 727 2.32E-41 147.269 cl08290 THF_DHG_CYH superfamily - - "Tetrahydrofolate dehydrogenase/cyclohydrolase, catalytic domain; Tetrahydrofolate dehydrogenase/cyclohydrolase, catalytic domain. " Q#11278 - CGI_10022554 superfamily 245323 165 442 2.28E-163 478.276 cl10511 Beach superfamily - - "BEACH (Beige and Chediak-Higashi) domains, implicated in membrane trafficking, are present in a family of proteins conserved throughout eukaryotes. This group contains human lysosomal trafficking regulator (LYST), LPS-responsive and beige-like anchor (LRBA) and neurobeachin. Disruption of LYST leads to Chediak-Higashi syndrome, characterized by severe immunodeficiency, albinism, poor blood coagulation and neurologic problems. Neurobeachin is a candidate gene linked to autism. LRBA seems to be upregulated in several cancer types. It has been shown that the BEACH domain itself is important for the function of these proteins." Q#11278 - CGI_10022554 superfamily 247725 75 136 9.19E-16 74.6391 cl17171 PH-like superfamily N - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. 
They are generally involved in targeting the protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and other proteins." Q#11278 - CGI_10022554 superfamily 243092 557 811 1.73E-17 82.768 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#11279 - CGI_10022555 superfamily 217473 61 347 1.59E-100 301.591 cl03978 Mab-21 superfamily - - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#11281 - CGI_10022557 superfamily 219043 1886 2062 5.06E-83 272.094 cl05796 DUF1088 superfamily - - Domain of Unknown Function (DUF1088); This family is found in the neurobeachins. The function of this region is not known. 
Q#11281 - CGI_10022557 superfamily 241611 216 386 1.11E-12 68.184 cl00102 PTX superfamily - - "Pentraxins are plasma proteins characterized by their pentameric discoid assembly and their Ca2+ dependent ligand binding, such as Serum amyloid P component (SAP) and C-reactive Protein (CRP), which are cytokine-inducible acute-phase proteins implicated in innate immunity. CRP binds to ligands containing phosphocholine, SAP binds to amyloid fibrils, DNA, chromatin, fibronectin, C4-binding proteins and glycosaminoglycans. "Long" pentraxins have N-terminal extensions to the common pentraxin domain; one group, the neuronal pentraxins, may be involved in synapse formation and remodeling, and they may also be able to form heteromultimers." Q#11281 - CGI_10022557 superfamily 247725 2069 2102 4.96E-05 43.8232 cl17171 PH-like superfamily C - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting the protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and other proteins." Q#11282 - CGI_10022558 superfamily 247724 102 370 1.39E-147 425.421 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. 
This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#11285 - CGI_10022561 superfamily 243091 1788 1909 3.24E-37 138.621 cl02566 SET superfamily - - "SET domain; SET domains are protein lysine methyltransferase enzymes. SET domains appear to be protein-protein interaction domains. It has been demonstrated that SET domains mediate interactions with a family of proteins that display similarity with dual-specificity phosphatases (dsPTPases). A subset of SET domains have been called PR domains. These domains are divergent in sequence from other SET domains, but also appear to mediate protein-protein interaction. The SET domain consists of two regions known as SET-N and SET-C. SET-C forms an unusual and conserved knot-like structure of probably functional importance. Additionally to SET-N and SET-C, an insert region (SET-I) and flanking regions of high structural variability form part of the overall structure." Q#11285 - CGI_10022561 superfamily 248279 1443 1521 8.05E-26 104.756 cl17725 zf-HC5HC2H superfamily - - "PHD-like zinc-binding domain; The members of this family are annotated as containing PHD domain, but the zinc-binding region here is not typical of PHD domains. 
The conformation here is a well-conserved cysteine-histidine rich region spanning 90 residues, where the Cys and His are arranged as HxxC(31)CxxC(6)CxxCxxxxCxxxxHxxC (21)CxxH." Q#11285 - CGI_10022561 superfamily 243126 1565 1621 1.17E-21 91.4983 cl02650 FYRN superfamily - - F/Y-rich N-terminus; This region is normally found in the trithorax/ALL1 family proteins. It is similar to SMART:SM00541. Q#11285 - CGI_10022561 superfamily 243127 1623 1708 7.12E-20 87.3618 cl02651 FYRC superfamily - - F/Y rich C-terminus; This region is normally found in the trithorax/ALL1 family proteins. It is similar to SMART:SM00542. Q#11285 - CGI_10022561 superfamily 214703 1911 1927 0.00219995 38.1552 cl02636 PostSET superfamily - - Cysteine-rich motif following a subset of SET domains; Cysteine-rich motif following a subset of SET domains. Q#11286 - CGI_10022562 superfamily 148739 138 239 3.47E-18 80.7582 cl06366 SRA1 superfamily N - Steroid receptor RNA activator (SRA1); This family consists of several hypothetical mammalian steroid receptor RNA activator proteins. SRA-RNAs likely to encode stable proteins are widely expressed in breast cancer cell lines. SRA-RNA is a steroid receptor co-activator which acts as a functional RNA and is classified as belonging to the growing family of functional non-coding RNAs. Q#11286 - CGI_10022562 superfamily 148739 247 356 6.44E-16 74.2098 cl06366 SRA1 superfamily N - Steroid receptor RNA activator (SRA1); This family consists of several hypothetical mammalian steroid receptor RNA activator proteins. SRA-RNAs likely to encode stable proteins are widely expressed in breast cancer cell lines. SRA-RNA is a steroid receptor co-activator which acts as a functional RNA and is classified as belonging to the growing family of functional non-coding RNAs. 
Q#11287 - CGI_10022563 superfamily 212596 390 481 1.22E-29 113.494 cl17033 SOAR superfamily - - "STIM1 Orai1-activating region; STIM1 (stromal interaction module 1) is a metazoan transmembrane protein located in the endoplasmic reticulum (ER) membrane, which functions as a sensor for ER calcium ion levels and activates store-operated Ca2+ influx channels (SOCs), such as the Orai1 Ca2+ channel located in the plasma membrane. STIM1 has an N-terminal Ca-binding EF-hand domain, which is located in the ER lumen. Responding to the release of Ca2+ from the ER, STIM1 was found to aggregate near the plasma membrane and contact Orai1. This model describes a region near the C-terminus of STIM1, which has been shown to mediate the interaction with Orai1 and has been labeled SOAR (STIM1 Orai1-activating region). STIM1 has also been linked to sensing oxidative and temperature-variation stress and may play a rather general role in mediating calcium signaling in response to stress. Dimerization of STIM1 via the SOAR domain appears required for the activation of the Orai1 calcium channel. A model for STIM1 activation has been proposed, in which an inhibitory helix N-terminal to the SOAR domain prevents STIM1 clustering or aggregation, and in which conformational changes triggered by depletion of the calcium stores allow the clustering and activation of Orai1." Q#11287 - CGI_10022563 superfamily 247057 182 255 4.16E-29 111.271 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. 
They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#11288 - CGI_10022564 superfamily 243095 515 716 3.30E-81 263.532 cl02570 RhoGAP superfamily - - "RhoGAP: GTPase-activator protein (GAP) for Rho-like GTPases; GAPs towards Rho/Rac/Cdc42-like small GTPases. Small GTPases (G proteins) cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when bound to GDP. The Rho family of small G proteins, which includes Cdc42Hs, activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. G proteins generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. The RhoGAPs are one of the major classes of regulators of Rho G proteins." Q#11291 - CGI_10022567 superfamily 241900 411 583 1.45E-09 57.7956 cl00490 EEP superfamily C - "Exonuclease-Endonuclease-Phosphatase (EEP) domain superfamily; This large superfamily includes the catalytic domain (exonuclease/endonuclease/phosphatase or EEP domain) of a diverse set of proteins including the ExoIII family of apurinic/apyrimidinic (AP) endonucleases, inositol polyphosphate 5-phosphatases (INPP5), neutral sphingomyelinases (nSMases), deadenylases (such as the vertebrate circadian-clock regulated nocturnin), bacterial cytolethal distending toxin B (CdtB), deoxyribonuclease 1 (DNase1), the endonuclease domain of the non-LTR retrotransposon LINE-1, and related domains. 
These diverse enzymes share a common catalytic mechanism of cleaving phosphodiester bonds; their substrates range from nucleic acids to phospholipids and perhaps proteins." Q#11292 - CGI_10022568 superfamily 245596 52 272 1.98E-108 316.817 cl11394 Glyco_tranf_GTA_type superfamily - - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#11299 - CGI_10022575 superfamily 248458 9 166 2.63E-11 63.4869 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. 
Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#11299 - CGI_10022575 superfamily 248458 234 382 0.000146902 42.3009 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. 
The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#11301 - CGI_10022577 superfamily 247727 55 181 1.20E-06 44.7283 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#11302 - CGI_10022578 superfamily 245864 34 102 1.34E-13 68.0738 cl12078 p450 superfamily C - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. 
Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#11303 - CGI_10022579 superfamily 246723 223 466 1.68E-98 311.16 cl14813 GluZincin superfamily N - "Peptidase Gluzincin family (thermolysin-like proteinases, TLPs) includes peptidases M1, M2, M3, M4, M13, M32 and M36 (fungalysins); Gluzincin family (thermolysin-like peptidases or TLPs) includes several zinc-dependent metallopeptidases such as the M1, M2, M3, M4, M13, M32, M36 peptidases (MEROPS classification), and contain HEXXH and EXXXD motifs as part of their active site. All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. M1 family includes aminopeptidase N (APN) and leukotriene A4 hydrolase (LTA4H). APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types. LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity such that the two activities occupy different, but overlapping sites. The peptidase M3 or neurolysin-like family, includes M3, M2 and M32 metallopeptidases. 
The M3 peptidases have two subfamilies: M3A, includes thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (3.4.24.16), and the mitochondrial intermediate peptidase; M3B contains oligopeptidase F. M2 peptidase angiotensin converting enzyme (ACE, EC 3.4.15.1) catalyzes the conversion of decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II. ACE is a key part of the renin-angiotensin system that regulates blood pressure, thus ACE inhibitors are important for the treatment of hypertension. M32 family includes two eukaryotic enzymes from protozoa Trypanosoma cruzi, a causative agent of Chagas' disease, and Leishmania major, a parasite that causes leishmaniasis, making them attractive targets for drug development. The M4 family includes secreted protease thermolysin (EC 3.4.24.27), pseudolysin, aureolysin, neutral protease as well as fungalysin and bacillolysin (EC 3.4.24.28) that degrade extracellular proteins and peptides for bacterial nutrition, especially prior to sporulation. Thermolysin is widely used as a nonspecific protease to obtain fragments for peptide sequencing as well as in production of the artificial sweetener aspartame. M13 family includes neprilysin (EC 3.4.24.11) and endothelin-converting enzyme I (ECE-1, EC 3.4.24.71), which fulfill a broad range of physiological roles due to the greater variation in the S2' subsite allowing substrate specificity and are prime therapeutic targets for selective inhibition. Peptidase M36 (fungamysin) family includes endopeptidases from pathogenic fungi. Fungalysin hydrolyzes extracellular matrix proteins such as elastin and keratin. Aspergillus fumigatus causes the pulmonary disease aspergillosis by invading the lungs of immuno-compromised animals and secreting fungalysin that possibly breaks down proteinaceous structural barriers." 
Q#11303 - CGI_10022579 superfamily 248136 102 208 4.27E-51 171.732 cl17582 Sybindin superfamily - - "Sybindin-like family; Sybindin is a physiological syndecan-2 ligand on dendritic spines, the small protrusions on the surface of dendrites that receive the vast majority of excitatory synapses." Q#11304 - CGI_10022580 superfamily 243072 38 161 1.13E-34 125.959 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#11304 - CGI_10022580 superfamily 243072 169 290 7.67E-33 120.566 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#11304 - CGI_10022580 superfamily 243072 268 331 4.80E-07 47.7634 cl02529 ANK superfamily C - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#11305 - CGI_10022581 superfamily 245614 1 57 0.000106412 38.7856 cl11433 FtsL superfamily C - "Cell division protein FtsL; In Escherichia coli, nine gene products are known to be essential for assembly of the division septum. One of these, FtsL, is a bitopic membrane protein whose precise function is not understood. It has been proposed that FtsL interacts with the DivIC protein pfam04977, however this interaction may be indirect." Q#11307 - CGI_10022583 superfamily 245814 50 110 0.000290203 39.344 cl11960 Ig superfamily N - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#11308 - CGI_10022584 superfamily 245304 1 92 3.25E-42 152.329 cl10459 Peptidases_S8_S53 superfamily C - "Peptidase domain in the S8 and S53 families; Members of the peptidases S8 (subtilisin and kexin) and S53 (sedolisin) family include endopeptidases and exopeptidases. The S8 family has an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure and are not homologous to trypsin. Serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The S53 family contains a catalytic triad Glu/Asp/Ser with an additional acidic residue Asp in the oxyanion hole, similar to that of subtilisin. The serine residue here is the nucleophilic equivalent of the serine residue in the S8 family, while glutamic acid has the same role here as the histidine base. 
However, the aspartic acid residue that acts as an electrophile is quite different. In S53, it follows glutamic acid, while in S8 it precedes histidine. The stability of these enzymes may be enhanced by calcium; some members have been shown to bind up to 4 ions via binding sites with different affinity. There is a great diversity in the characteristics of their members: some contain disulfide bonds, some are intracellular while others are extracellular, some function at extreme temperatures, and others at high or low pH values." Q#11308 - CGI_10022584 superfamily 201820 220 300 1.31E-25 99.6222 cl08326 P_proprotein superfamily - - Proprotein convertase P-domain; A unique feature of the eukaryotic subtilisin-like proprotein convertases is the presence of an additional highly conserved sequence of approximately 150 residues (P domain) located immediately downstream of the catalytic domain. Q#11308 - CGI_10022584 superfamily 245304 92 179 2.01E-25 104.179 cl10459 Peptidases_S8_S53 superfamily N - "Peptidase domain in the S8 and S53 families; Members of the peptidases S8 (subtilisin and kexin) and S53 (sedolisin) family include endopeptidases and exopeptidases. The S8 family has an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure and are not homologous to trypsin. Serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The S53 family contains a catalytic triad Glu/Asp/Ser with an additional acidic residue Asp in the oxyanion hole, similar to that of subtilisin. The serine residue here is the nucleophilic equivalent of the serine residue in the S8 family, while glutamic acid has the same role here as the histidine base. However, the aspartic acid residue that acts as an electrophile is quite different. In S53, it follows glutamic acid, while in S8 it precedes histidine. 
The stability of these enzymes may be enhanced by calcium; some members have been shown to bind up to 4 ions via binding sites with different affinity. There is a great diversity in the characteristics of their members: some contain disulfide bonds, some are intracellular while others are extracellular, some function at extreme temperatures, and others at high or low pH values." Q#11309 - CGI_10001342 superfamily 215647 585 774 1.36E-33 130.036 cl18338 7tm_2 superfamily N - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GCPRs).They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97 ; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#11315 - CGI_10001479 superfamily 243161 4 86 3.97E-13 61.2561 cl02739 THAP superfamily - - "THAP domain; The THAP domain is a putative DNA-binding domain (DBD) and probably also binds a zinc ion. It features the conserved C2CH architecture (consensus sequence: Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His). Other universal features include the location of the domain at the N-termini of proteins, its size of about 90 residues, a C-terminal AVPTIF box and several other conserved residues. 
Orthologues of the human THAP domain have been identified in other vertebrates and probably worms and flies, but not in other eukaryotes or any prokaryotes." Q#11316 - CGI_10001480 superfamily 206130 178 269 0.000477244 39.1091 cl16501 DUF4218 superfamily C - Domain of unknown function (DUF4218); Domain of unknown function (DUF4218). Q#11317 - CGI_10001636 superfamily 245864 1 415 3.65E-62 209.442 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#11319 - CGI_10001056 superfamily 222150 29 54 2.28E-05 41.9937 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#11319 - CGI_10001056 superfamily 222150 57 80 0.000393989 38.5269 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. 
Q#11319 - CGI_10001056 superfamily 246975 16 37 0.00869124 34.6301 cl15478 zf-C2H2 superfamily - - "Zinc finger, C2H2 type; The C2H2 zinc finger is the classical zinc finger domain. The two conserved cysteines and histidines co-ordinate a zinc ion. The following pattern describes the zinc finger. #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C] Where X can be any amino acid, and numbers in brackets indicate the number of residues. The positions marked # are those that are important for the stable fold of the zinc finger. The final position can be either his or cys. The C2H2 zinc finger is composed of two short beta strands followed by an alpha helix. The amino terminal part of the helix binds the major groove in DNA binding zinc fingers. The accepted consensus binding sequence for Sp1 is usually defined by the asymmetric hexanucleotide core GGGCGG but this sequence does not include, among others, the GAG (=CTC) repeat that constitutes a high-affinity site for Sp1 binding to the wt1 promoter." Q#11320 - CGI_10002345 superfamily 248458 87 219 3.01E-06 46.5381 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. 
Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#11321 - CGI_10002346 superfamily 247743 8 148 3.45E-16 73.3793 cl17189 AAA superfamily N - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#11321 - CGI_10002346 superfamily 209247 156 243 4.41E-10 54.7571 cl11083 ClpB_D2-small superfamily - - "C-terminal, D2-small domain, of ClpB protein; This is the C-terminal domain of ClpB protein, referred to as the D2-small domain, and is a mixed alpha-beta structure. Compared with the D1-small domain (included in AAA, pfam00004) it lacks the long coiled-coil insertion, and instead of helix C4 contains a beta-strand (e3) that is part of a three stranded beta-pleated sheet. 
In Thermophilus the whole protein forms a hexamer with the D1-small and D2-small domains located on the outside of the hexamer, with the long coiled-coil being exposed on the surface. The D2-small domain is essential for oligomerisation, forming a tight interface with the D2-large domain of a neighboring subunit and thereby providing enough binding energy to stabilise the functional assembly. The domain is associated with two Clp_N, pfam02861, at the N-terminus as well as AAA, pfam00004 and AAA_2, pfam07724." Q#11322 - CGI_10002347 superfamily 219667 73 156 1.97E-33 121.176 cl06829 Swi3 superfamily - - "Replication Fork Protection Component Swi3; Replication fork pausing is required to initiate recombination events. More specifically, Swi1 is required for recombination near the mat1 locus. Swi3 has been found to co-purify with Swi1. Swi3, together with Swi1, defines a fork protection complex that coordinates leading- and lagging-strand synthesis and stabilises stalled replication forks. The Swi1-Swi3 complex is required for accurate replication, fork protection and replication checkpoint signalling." Q#11325 - CGI_10002419 superfamily 114049 71 187 5.27E-44 150.681 cl05052 Mec-17 superfamily - - Touch receptor neuron protein Mec-17; Mec-17 is the protein product of one of the 18 genes required for the development and function of the touch receptor neuron for gentle touch. Mec-17 is specifically required for maintaining the differentiation of the touch receptor. This family is conserved to higher eukaryotes. Q#11326 - CGI_10002420 superfamily 247803 159 243 4.17E-25 101.155 cl17249 YlqF_related_GTPase superfamily C - "Circularly permuted YlqF-related GTPases; These proteins are found in bacteria, eukaryotes, and archaea. 
They all exhibit a circular permutation of the GTPase signature motifs so that the order of the conserved G box motifs is G4-G5-G1-G2-G3, with G4 and G5 being permuted from the C-terminal region of proteins in the Ras superfamily to the N-terminus of YlqF-related GTPases." Q#11326 - CGI_10002420 superfamily 247803 328 384 1.01E-24 99.9991 cl17249 YlqF_related_GTPase superfamily N - "Circularly permuted YlqF-related GTPases; These proteins are found in bacteria, eukaryotes, and archaea. They all exhibit a circular permutation of the GTPase signature motifs so that the order of the conserved G box motifs is G4-G5-G1-G2-G3, with G4 and G5 being permuted from the C-terminal region of proteins in the Ras superfamily to the N-terminus of YlqF-related GTPases." Q#11327 - CGI_10002421 superfamily 245303 32 363 6.33E-143 422.738 cl10447 GH18_chitinase-like superfamily - - "The GH18 (glycosyl hydrolase, family 18) type II chitinases hydrolyze chitin, an abundant polymer of beta-1,4-linked N-acetylglucosamine (GlcNAc) which is a major component of the cell wall of fungi and the exoskeleton of arthropods. Chitinases have been identified in viruses, bacteria, fungi, protozoan parasites, insects, and plants. The structure of the GH18 domain is an eight-stranded beta/alpha barrel with a pronounced active-site cleft at the C-terminal end of the beta-barrel. The GH18 family includes chitotriosidase, chitobiase, hevamine, zymocin-alpha, narbonin, SI-CLP (stabilin-1 interacting chitinase-like protein), IDGF (imaginal disc growth factor), CFLE (cortical fragment-lytic enzyme) spore hydrolase, the type III and type V plant chitinases, the endo-beta-N-acetylglucosaminidases, and the chitolectins. The GH85 (glycosyl hydrolase, family 85) ENGases (endo-beta-N-acetylglucosaminidases) are closely related to the GH18 chitinases and are included in this alignment model." 
Q#11331 - CGI_10002937 superfamily 241868 52 159 1.15E-29 106.805 cl00447 Nudix_Hydrolase superfamily - - "Nudix hydrolase is a superfamily of enzymes found in all three kingdoms of life, and it catalyzes the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+ for their activity. Members of this family are recognized by a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which forms a structural motif that functions as a metal binding and catalytic site. Substrates of nudix hydrolase include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance and "house-cleaning" enzymes. Substrate specificity is used to define child families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. This superfamily consists of at least nine families: IPP (isopentenyl diphosphate) isomerase, ADP ribose pyrophosphatase, mutT pyrophosphohydrolase, coenzyme-A pyrophosphatase, MTH1-7,8-dihydro-8-oxoguanine-triphosphatase, diadenosine tetraphosphate hydrolase, NADH pyrophosphatase, GDP-mannose hydrolase and the c-terminal portion of the mutY adenine glycosylase." 
Q#11332 - CGI_10002938 superfamily 177822 36 247 1.74E-13 69.1785 cl18088 PLN02164 superfamily - - sulfotransferase Q#11332 - CGI_10002938 superfamily 238012 298 319 0.00671188 33.867 cl11390 EGF_Lam superfamily NC - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#11335 - CGI_10003954 superfamily 245205 150 231 6.20E-09 54.9365 cl09930 RPA_2b-aaRSs_OBF_like superfamily - - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. 
Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." Q#11335 - CGI_10003954 superfamily 245205 33 100 0.000291174 40.6841 cl09930 RPA_2b-aaRSs_OBF_like superfamily C - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." Q#11335 - CGI_10003954 superfamily 241578 505 679 8.75E-22 95.5389 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#11335 - CGI_10003954 superfamily 241578 1264 1360 2.98E-11 63.5673 cl00057 vWFA superfamily C - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#11335 - CGI_10003954 superfamily 217211 727 793 1.95E-08 53.4422 cl03691 Cache_1 superfamily - - Cache domain; Cache domain. 
Q#11338 - CGI_10001075 superfamily 247905 161 300 9.20E-20 84.2116 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#11338 - CGI_10001075 superfamily 247805 23 131 6.67E-07 47.332 cl17251 DEXDc superfamily N - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#11340 - CGI_10001077 superfamily 110440 213 239 0.00320276 34.3057 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." 
Q#11341 - CGI_10003999 superfamily 241550 39 325 2.34E-88 269.861 cl00015 nt_trans superfamily - - "nucleotidyl transferase superfamily; nt_trans (nucleotidyl transferase) This superfamily includes the class I amino-acyl tRNA synthetases, pantothenate synthetase (PanC), ATP sulfurylase, and the cytidylyltransferases, all of which have a conserved dinucleotide-binding domain." Q#11342 - CGI_10004000 superfamily 247739 63 232 8.67E-54 177.816 cl17185 LPLAT superfamily - - "Lysophospholipid acyltransferases (LPLATs) of glycerophospholipid biosynthesis; Lysophospholipid acyltransferase (LPLAT) superfamily members are acyltransferases of de novo and remodeling pathways of glycerophospholipid biosynthesis. These proteins catalyze the incorporation of an acyl group from either acylCoAs or acyl-acyl carrier proteins (acylACPs) into acceptors such as glycerol 3-phosphate, dihydroxyacetone phosphate or lyso-phosphatidic acid. Included in this superfamily are LPLATs such as glycerol-3-phosphate 1-acyltransferase (GPAT, PlsB), 1-acyl-sn-glycerol-3-phosphate acyltransferase (AGPAT, PlsC), lysophosphatidylcholine acyltransferase 1 (LPCAT-1), lysophosphatidylethanolamine acyltransferase (LPEAT, also known as, MBOAT2, membrane-bound O-acyltransferase domain-containing protein 2), lipid A biosynthesis lauroyl/myristoyl acyltransferase, 2-acylglycerol O-acyltransferase (MGAT), dihydroxyacetone phosphate acyltransferase (DHAPAT, also known as 1 glycerol-3-phosphate O-acyltransferase 1) and Tafazzin (the protein product of the Barth syndrome (TAZ) gene)." Q#11344 - CGI_10004002 superfamily 241568 74 133 1.38E-07 49.7688 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. 
The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#11344 - CGI_10004002 superfamily 245213 453 487 1.77E-05 43.009 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#11344 - CGI_10004002 superfamily 241568 148 195 0.000299548 39.7536 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#11344 - CGI_10004002 superfamily 245213 490 526 0.000640925 38.3866 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#11344 - CGI_10004002 superfamily 246918 534 585 1.00E-16 76.0863 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#11344 - CGI_10004002 superfamily 219525 710 755 2.27E-07 48.9545 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#11344 - CGI_10004002 superfamily 111397 195 277 7.67E-07 47.7211 cl03620 HYR superfamily - - "HYR domain; This domain is known as the HYR (Hyalin Repeat) domain, after the protein hyalin that is composed exclusively of this repeat. This domain probably corresponds to a new superfamily in the immunoglobulin fold. The function of this domain is uncertain; it may be involved in cell adhesion." Q#11346 - CGI_10014927 superfamily 246664 335 710 2.96E-155 458.963 cl14561 An_peroxidase_like superfamily - - "Animal heme peroxidases and related proteins; A diverse family of enzymes, which includes prostaglandin G/H synthase, thyroid peroxidase, myeloperoxidase, linoleate diol synthase, lactoperoxidase, peroxinectin, peroxidasin, and others. Despite its name, this family is not restricted to metazoans: members are found in fungi, plants, and bacteria as well." Q#11347 - CGI_10014928 superfamily 246664 191 318 1.34E-26 109.707 cl14561 An_peroxidase_like superfamily C - "Animal heme peroxidases and related proteins; A diverse family of enzymes, which includes prostaglandin G/H synthase, thyroid peroxidase, myeloperoxidase, linoleate diol synthase, lactoperoxidase, peroxinectin, peroxidasin, and others. 
Despite its name, this family is not restricted to metazoans: members are found in fungi, plants, and bacteria as well." Q#11348 - CGI_10014929 superfamily 241802 59 359 1.69E-136 403.818 cl00342 Trp-synth-beta_II superfamily - - "Tryptophan synthase beta superfamily (fold type II); this family of pyridoxal phosphate (PLP)-dependent enzymes catalyzes beta-replacement and beta-elimination reactions. This CD corresponds to aminocyclopropane-1-carboxylate deaminase (ACCD), tryptophan synthase beta chain (Trp-synth_B), cystathionine beta-synthase (CBS), O-acetylserine sulfhydrylase (CS), serine dehydratase (Ser-dehyd), threonine dehydratase (Thr-dehyd), diaminopropionate ammonia lyase (DAL), and threonine synthase (Thr-synth). ACCD catalyzes the conversion of 1-aminocyclopropane-1-carboxylate to alpha-ketobutyrate and ammonia. Tryptophan synthase folds into a tetramer, where the beta chain is the catalytic PLP-binding subunit and catalyzes the formation of L-tryptophan from indole and L-serine. CBS is a tetrameric hemeprotein that catalyzes condensation of serine and homocysteine to cystathionine. CS is a homodimer that catalyzes the formation of L-cysteine from O-acetyl-L-serine. Ser-dehyd catalyzes the conversion of L- or D-serine to pyruvate and ammonia. Thr-dehyd is active as a homodimer and catalyzes the conversion of L-threonine to 2-oxobutanoate and ammonia. DAL is also a homodimer and catalyzes the alpha, beta-elimination reaction of both L- and D-alpha, beta-diaminopropionate to form pyruvate and ammonia. Thr-synth catalyzes the formation of threonine and inorganic phosphate from O-phosphohomoserine." Q#11348 - CGI_10014929 superfamily 246936 513 637 2.47E-47 162.88 cl15354 CBS_pair superfamily - - "The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. 
They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase)." Q#11348 - CGI_10014929 superfamily 246936 402 500 3.81E-37 134.76 cl15354 CBS_pair superfamily C - "The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. 
Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase)." Q#11349 - CGI_10014930 superfamily 245201 417 688 5.42E-77 250.146 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#11349 - CGI_10014930 superfamily 245814 246 302 0.000250671 39.7799 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#11349 - CGI_10014930 superfamily 245814 146 208 0.00109279 37.8539 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. 
The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#11349 - CGI_10014930 superfamily 245814 73 127 0.00380164 36.3131 cl11960 Ig superfamily N - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#11350 - CGI_10014931 superfamily 247684 106 352 2.91E-51 180.936 cl17037 NBD_sugar-kinase_HSP70_actin superfamily N - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#11350 - CGI_10014931 superfamily 247684 21 113 3.69E-15 76.1619 cl17037 NBD_sugar-kinase_HSP70_actin superfamily C - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#11351 - CGI_10014932 superfamily 247684 95 327 1.90E-31 123.927 cl17037 NBD_sugar-kinase_HSP70_actin superfamily N - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#11351 - CGI_10014932 superfamily 247684 18 109 1.81E-09 58.0575 cl17037 NBD_sugar-kinase_HSP70_actin superfamily C - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#11352 - CGI_10014933 superfamily 247684 85 269 1.97E-30 120.075 cl17037 NBD_sugar-kinase_HSP70_actin superfamily N - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#11352 - CGI_10014933 superfamily 247684 1 81 7.85E-17 80.3991 cl17037 NBD_sugar-kinase_HSP70_actin superfamily NC - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#11353 - CGI_10014934 superfamily 247684 22 114 6.89E-14 70.3839 cl17037 NBD_sugar-kinase_HSP70_actin superfamily C - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#11353 - CGI_10014934 superfamily 247684 136 161 0.00545843 36.886 cl17037 NBD_sugar-kinase_HSP70_actin superfamily NC - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#11354 - CGI_10014935 superfamily 247684 28 317 2.98E-58 199.041 cl17037 NBD_sugar-kinase_HSP70_actin superfamily N - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#11356 - CGI_10014937 superfamily 245201 400 497 4.78E-18 83.4401 cl09925 PKc_like superfamily NC - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. 
These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#11356 - CGI_10014937 superfamily 245201 621 768 8.37E-21 92.9933 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#11357 - CGI_10014938 superfamily 243072 594 691 2.06E-30 117.099 cl02529 ANK superfamily C - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#11357 - CGI_10014938 superfamily 246675 306 549 6.80E-82 266.082 cl14615 PI-PLCc_GDPD_SF superfamily - - "Catalytic domain of phosphoinositide-specific phospholipase C-like phosphodiesterases superfamily; The PI-PLC-like phosphodiesterases superfamily represents the catalytic domains of bacterial phosphatidylinositol-specific phospholipase C (PI-PLC, EC 4.6.1.13), eukaryotic phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11), glycerophosphodiester phosphodiesterases (GP-GDE, EC 3.1.4.46), sphingomyelinases D (SMases D) (sphingomyelin phosphodiesterase D, EC 3.1.4.41) from spider venom, SMases D-like proteins, and phospholipase D (PLD) from several pathogenic bacteria, as well as their uncharacterized homologs found in organisms ranging from bacteria and archaea to metazoans, plants, and fungi. PI-PLCs are ubiquitous enzymes hydrolyzing the membrane lipid phosphoinositides to yield two important second messengers, inositol phosphates and diacylglycerol (DAG). GP-GDEs play essential roles in glycerol metabolism and catalyze the hydrolysis of glycerophosphodiesters to sn-glycerol-3-phosphate (G3P) and the corresponding alcohols that are major sources of carbon and phosphate. Both, PI-PLCs and GP-GDEs, can hydrolyze the 3'-5' phosphodiester bonds in different substrates, and utilize a similar mechanism of general base and acid catalysis with conserved histidine residues, which consists of two steps, a phosphotransfer and a phosphodiesterase reaction. This superfamily also includes Neurospora crassa ankyrin repeat protein NUC-2 and its Saccharomyces cerevisiae counterpart, Phosphate system positive regulatory protein PHO81, glycerophosphodiester phosphodiesterase (GP-GDE)-like protein SHV3 and SHV3-like proteins (SVLs). 
The residues essential for enzyme activities and metal binding are not conserved in these sequence homologs, which might suggest that the function of catalytic domains in these proteins might be distinct from those in typical PLC-like phosphodiesterases." Q#11357 - CGI_10014938 superfamily 246935 28 147 7.85E-19 83.9116 cl15347 CBM20 superfamily - - "The family 20 carbohydrate-binding module (CBM20), also known as the starch-binding domain, is found in a large number of starch degrading enzymes including alpha-amylase, beta-amylase, glucoamylase, and CGTase (cyclodextrin glucanotransferase). CBM20 is also present in proteins that have a regulatory role in starch metabolism in plants (e.g. alpha-amylase) or glycogen metabolism in mammals (e.g. laforin). CBM20 folds as an antiparallel beta-barrel structure with two starch binding sites. These two sites are thought to differ functionally with site 1 acting as the initial starch recognition site and site 2 involved in the specific recognition of appropriate regions of starch." Q#11357 - CGI_10014938 superfamily 243073 757 803 0.000168634 40.5453 cl02533 SOCS superfamily - - "SOCS (suppressors of cytokine signaling) box. The SOCS box is found in the C-terminal region of CIS/SOCS family proteins (in combination with a SH2 domain), ASBs (ankyrin repeat-containing proteins with a SOCS box), SSBs (SPRY domain-containing proteins with a SOCS box), and WSBs (WD40 repeat-containing proteins with a SOCS box), as well as, other miscellaneous proteins. The function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions." 
Q#11360 - CGI_10014941 superfamily 241886 81 226 2.08E-09 55.7242 cl00470 Aldo_ket_red superfamily NC - "Aldo-keto reductases (AKRs) are a superfamily of soluble NAD(P)(H) oxidoreductases whose chief purpose is to reduce aldehydes and ketones to primary and secondary alcohols. AKRs are present in all phyla and are of importance to both health and industrial applications. Members have very distinct functions and include the prokaryotic 2,5-diketo-D-gluconic acid reductases and beta-keto ester reductases, the eukaryotic aldose reductases, aldehyde reductases, hydroxysteroid dehydrogenases, steroid 5beta-reductases, potassium channel beta-subunits and aflatoxin aldehyde reductases, among others." Q#11362 - CGI_10014943 superfamily 243026 34 242 4.79E-05 42.4431 cl02417 Myelin_PLP superfamily - - Myelin proteolipid protein (PLP or lipophilin); Myelin proteolipid protein (PLP or lipophilin). Q#11363 - CGI_10014944 superfamily 248097 15 137 2.98E-22 86.165 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#11364 - CGI_10014945 superfamily 248097 151 250 2.89E-18 78.0758 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#11364 - CGI_10014945 superfamily 248097 1 96 2.49E-15 69.6014 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. 
Q#11365 - CGI_10003530 superfamily 243092 12 253 6.10E-38 136.311 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#11366 - CGI_10003531 superfamily 241596 125 183 2.82E-17 73.0171 cl00081 HLH superfamily - - "Helix-loop-helix domain, found in specific DNA- binding proteins that act as transcription factors; 60-100 amino acids long. 
A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE (5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." 
Q#11367 - CGI_10003532 superfamily 243092 212 529 6.15E-30 119.747 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#11368 - CGI_10003533 superfamily 245227 2 388 0 628.075 cl10013 Glycosyltransferase_GTB_type superfamily - - "Glycosyltransferases catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. The acceptor molecule can be a lipid, a protein, a heterocyclic compound, or another carbohydrate residue. The structures of the formed glycoconjugates are extremely diverse, reflecting a wide range of biological functions. The members of this family share a common GTB topology, one of the two protein topologies observed for nucleotide-sugar-dependent glycosyltransferases. GTB proteins have distinct N- and C- terminal domains each containing a typical Rossmann fold. 
The two domains have high structural homology despite minimal sequence homology. The large cleft that separates the two domains includes the catalytic center and permits a high degree of flexibility." Q#11370 - CGI_10003535 superfamily 245201 42 334 0 554.078 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#11374 - CGI_10015431 superfamily 241587 8 65 1.18E-14 62.3066 cl00069 GGL superfamily - - "G protein gamma subunit-like motifs, the alpha-helical G-gamma chain dimerizes with the G-beta propeller subunit as part of the heterotrimeric G-protein complex; involved in signal transduction via G-protein-coupled receptors" Q#11375 - CGI_10015432 superfamily 241587 139 193 1.29E-09 51.1358 cl00069 GGL superfamily - - "G protein gamma subunit-like motifs, the alpha-helical G-gamma chain dimerizes with the G-beta propeller subunit as part of the heterotrimeric G-protein complex; involved in signal transduction via G-protein-coupled receptors" Q#11376 - CGI_10015433 superfamily 243092 40 331 7.09E-27 107.421 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either 
stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#11381 - CGI_10015438 superfamily 190277 30 95 2.51E-12 57.3104 cl03535 UCR_hinge superfamily - - Ubiquinol-cytochrome C reductase hinge protein; The ubiquinol-cytochrome C reductase complex (cytochrome bc1 complex) is a respiratory multienzyme complex. This Pfam family represents the 'hinge' protein of the complex which is thought to mediate formation of the cytochrome c1 and cytochrome c complex. Q#11382 - CGI_10015439 superfamily 241567 13 96 3.00E-13 67.2403 cl00042 CASc superfamily NC - "Caspase, interleukin-1 beta converting enzyme (ICE) homologues; Cysteine-dependent aspartate-directed proteases that mediate programmed cell death (apoptosis). Caspases are synthesized as inactive zymogens and activated by proteolysis of the peptide backbone adjacent to an aspartate. The resulting two subunits associate to form an (alpha)2(beta)2-tetramer which is the active enzyme. Activation of caspases can be mediated by other caspase homologs." Q#11382 - CGI_10015439 superfamily 241567 253 356 4.92E-05 42.9728 cl00042 CASc superfamily N - "Caspase, interleukin-1 beta converting enzyme (ICE) homologues; Cysteine-dependent aspartate-directed proteases that mediate programmed cell death (apoptosis). 
Caspases are synthesized as inactive zymogens and activated by proteolysis of the peptide backbone adjacent to an aspartate. The resulting two subunits associate to form an (alpha)2(beta)2-tetramer which is the active enzyme. Activation of caspases can be mediated by other caspase homologs." Q#11383 - CGI_10015440 superfamily 114645 45 365 1.66E-23 104.455 cl05479 MCLC superfamily N - Mid-1-related chloride channel (MCLC); This family consists of several mid-1-related chloride channels. mid-1-related chloride channel (MCLC) proteins function as a chloride channel when incorporated in the planar lipid bilayer. Q#11383 - CGI_10015440 superfamily 248097 402 522 8.87E-18 81.1574 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#11386 - CGI_10015444 superfamily 245596 1 99 3.41E-35 129.735 cl11394 Glyco_tranf_GTA_type superfamily N - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. 
But it also includes families GT-43, GT-6, GT-8, GT-13 and GT-7, which are evolutionarily related to GT-2 and share structural similarities." Q#11388 - CGI_10015446 superfamily 242274 10 161 4.88E-07 47.8913 cl01053 SGNH_hydrolase superfamily - - "SGNH_hydrolase, or GDSL_hydrolase, is a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the typical Ser-His-Asp(Glu) triad from other serine hydrolases, but may lack the carboxylic acid." Q#11391 - CGI_10015449 superfamily 241589 48 169 6.74E-33 120.817 cl00071 GLECT superfamily - - "Galectin/galactose-binding lectin. This domain exclusively binds beta-galactosides, such as lactose, and does not require metal ions for activity. GLECT domains occur as homodimers or tandemly repeated domains. They are developmentally regulated and may be involved in differentiation, cell-cell interaction and cellular regulation." Q#11391 - CGI_10015449 superfamily 241589 187 315 3.66E-32 118.891 cl00071 GLECT superfamily - - "Galectin/galactose-binding lectin. This domain exclusively binds beta-galactosides, such as lactose, and does not require metal ions for activity. GLECT domains occur as homodimers or tandemly repeated domains. They are developmentally regulated and may be involved in differentiation, cell-cell interaction and cellular regulation." Q#11391 - CGI_10015449 superfamily 241589 324 394 1.96E-16 74.9779 cl00071 GLECT superfamily C - "Galectin/galactose-binding lectin. This domain exclusively binds beta-galactosides, such as lactose, and does not require metal ions for activity. GLECT domains occur as homodimers or tandemly repeated domains. They are developmentally regulated and may be involved in differentiation, cell-cell interaction and cellular regulation." 
Q#11392 - CGI_10015450 superfamily 241609 1 73 1.30E-22 90.1299 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#11392 - CGI_10015450 superfamily 241609 251 327 4.86E-22 88.2039 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#11392 - CGI_10015450 superfamily 241609 157 240 1.86E-20 83.9667 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#11392 - CGI_10015450 superfamily 241609 81 156 6.02E-20 82.8111 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." 
Q#11393 - CGI_10015451 superfamily 241589 375 492 6.77E-34 124.669 cl00071 GLECT superfamily - - "Galectin/galactose-binding lectin. This domain exclusively binds beta-galactosides, such as lactose, and does not require metal ions for activity. GLECT domains occur as homodimers or tandemly repeated domains. They are developmentally regulated and may be involved in differentiation, cell-cell interaction and cellular regulation." Q#11393 - CGI_10015451 superfamily 241589 41 149 1.72E-29 112.728 cl00071 GLECT superfamily - - "Galectin/galactose-binding lectin. This domain exclusively binds beta-galactosides, such as lactose, and does not require metal ions for activity. GLECT domains occur as homodimers or tandemly repeated domains. They are developmentally regulated and may be involved in differentiation, cell-cell interaction and cellular regulation." Q#11393 - CGI_10015451 superfamily 241589 280 359 1.54E-16 76.0903 cl00071 GLECT superfamily C - "Galectin/galactose-binding lectin. This domain exclusively binds beta-galactosides, such as lactose, and does not require metal ions for activity. GLECT domains occur as homodimers or tandemly repeated domains. They are developmentally regulated and may be involved in differentiation, cell-cell interaction and cellular regulation." Q#11393 - CGI_10015451 superfamily 241589 173 264 1.20E-13 68.0444 cl00071 GLECT superfamily - - "Galectin/galactose-binding lectin. This domain exclusively binds beta-galactosides, such as lactose, and does not require metal ions for activity. GLECT domains occur as homodimers or tandemly repeated domains. They are developmentally regulated and may be involved in differentiation, cell-cell interaction and cellular regulation." Q#11395 - CGI_10002561 superfamily 241580 211 286 1.51E-35 126.515 cl00061 FH superfamily - - "Forkhead (FH), also known as a "winged helix". 
FH is named for the Drosophila fork head protein, a transcription factor which promotes terminal rather than segmental development. This family of transcription factor domains, which bind to B-DNA as monomers, are also found in the Hepatocyte nuclear factor (HNF) proteins, which provide tissue-specific gene regulation. The structure contains 2 flexible loops or "wings" in the C-terminal region, hence the term winged helix." Q#11397 - CGI_10002563 superfamily 247684 170 275 0.000621351 39.8796 cl17037 NBD_sugar-kinase_HSP70_actin superfamily C - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#11397 - CGI_10002563 superfamily 247684 480 581 0.000378555 41.6786 cl17037 NBD_sugar-kinase_HSP70_actin superfamily N - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#11397 - CGI_10002563 superfamily 247684 224 374 0.00240582 38.9822 cl17037 NBD_sugar-kinase_HSP70_actin superfamily NC - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#11398 - CGI_10002564 superfamily 241628 87 186 0.00269006 36.6313 cl00130 PseudoU_synth superfamily C - "Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi); Pseudouridine synthases contains the RsuA/RluD, TruA, TruB and TruD families. This group consists of eukaryotic, bacterial and archeal pseudouridine synthases. 
Some psi sites such as psi55, 13, 38 and 39 in tRNA are highly conserved, being in the same position in eubacteria, archaebacteria and eukaryotes. Other psi sites occur in a more restricted fashion, for example psi2604 in 23S RNA made by E.coli RluF has only been detected in E.coli. Human dyskerin with the help of guide RNAs makes the hundreds of pseudouridines present in rRNA and small nuclear RNAs (snRNAs). Mutations in human dyskerin cause X-linked dyskeratosis congenita. Missense mutation in human PUS1 causes mitochondrial myopathy and sideroblastic anemia (MLASA)." Q#11399 - CGI_10002565 superfamily 220611 3 106 4.95E-09 49.4655 cl10864 Laps superfamily - - Learning-associated protein; This is a family of 121-amino acid secretory proteins. Laps functions in the regulation of neuronal cell adhesion and/or movement and synapse attachment. Laps binds to the ApC/EBP (Aplysia CCAAT/enhancer binding protein) promoter and activates the transcription of ApC/EBP mRNA. Q#11400 - CGI_10004248 superfamily 245847 1 142 1.01E-16 72.2041 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#11401 - CGI_10004249 superfamily 245847 18 150 8.75E-15 69.1225 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#11401 - CGI_10004249 superfamily 242406 159 259 4.43E-14 66.8461 cl01271 DUF1768 superfamily N - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. 
Q#11402 - CGI_10004536 superfamily 148810 518 598 1.86E-16 77.9897 cl06447 Geminin superfamily N - "Geminin; This family contains the eukaryotic protein geminin (approximately 200 residues long). Geminin inhibits DNA replication by preventing the incorporation of MCM complex into prereplication complex, and is degraded during the mitotic phase of the cell cycle. It has been proposed that geminin inhibits DNA replication during S, G2, and M phases and that geminin destruction at the metaphase-anaphase transition permits replication in the succeeding cell cycle." Q#11404 - CGI_10004538 superfamily 245840 59 191 3.42E-10 56.5619 cl12022 Ribosomal_L18e superfamily - - Ribosomal protein L18e/L15; This family includes eukaryotic L18 as well as prokaryotic L15. Q#11405 - CGI_10004539 superfamily 217545 258 553 2.62E-113 355.089 cl04056 Peptidase_C54 superfamily - - Peptidase family C54; Peptidase family C54. Q#11406 - CGI_10004540 superfamily 242564 42 114 3.73E-07 45.3285 cl01534 NDUFA12 superfamily C - "NADH ubiquinone oxidoreductase subunit NDUFA12; This family contains the 17.2 kD subunit of complex I (NDUFA12) and its homologues. The family also contains a second related eukaryotic protein of unknown function, ." Q#11407 - CGI_10004541 superfamily 241565 737 801 4.23E-07 48.4719 cl00038 BRCT superfamily - - "Breast Cancer Suppressor Protein (BRCA1), carboxy-terminal domain. The BRCT domain is found within many DNA damage repair and cell cycle checkpoint proteins. The unique diversity of this domain superfamily allows BRCT modules to interact forming homo/hetero BRCT multimers, BRCT-non-BRCT interactions, and interactions within DNA strand breaks." Q#11407 - CGI_10004541 superfamily 241565 66 127 0.000658047 38.8419 cl00038 BRCT superfamily - - "Breast Cancer Suppressor Protein (BRCA1), carboxy-terminal domain. The BRCT domain is found within many DNA damage repair and cell cycle checkpoint proteins. 
The unique diversity of this domain superfamily allows BRCT modules to interact forming homo/hetero BRCT multimers, BRCT-non-BRCT interactions, and interactions within DNA strand breaks." Q#11407 - CGI_10004541 superfamily 241565 844 900 0.00212005 37.3011 cl00038 BRCT superfamily - - "Breast Cancer Suppressor Protein (BRCA1), carboxy-terminal domain. The BRCT domain is found within many DNA damage repair and cell cycle checkpoint proteins. The unique diversity of this domain superfamily allows BRCT modules to interact forming homo/hetero BRCT multimers, BRCT-non-BRCT interactions, and interactions within DNA strand breaks." Q#11408 - CGI_10005881 superfamily 246669 131 185 6.12E-13 65.4662 cl14603 C2 superfamily N - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#11410 - CGI_10005883 superfamily 248012 4 89 2.00E-07 44.6216 cl17458 TIR_2 superfamily N - TIR domain; This is a family of bacterial Toll-like receptors. 
Q#11414 - CGI_10005887 superfamily 246680 445 533 0.00526861 35.4436 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#11415 - CGI_10005888 superfamily 241550 450 717 2.24E-98 317.651 cl00015 nt_trans superfamily N - "nucleotidyl transferase superfamily; nt_trans (nucleotidyl transferase) This superfamily includes the class I amino-acyl tRNA synthetases, pantothenate synthetase (PanC), ATP sulfurylase, and the cytidylyltransferases, all of which have a conserved dinucleotide-binding domain." Q#11415 - CGI_10005888 superfamily 245839 719 853 1.66E-42 152.711 cl12020 Anticodon_Ia_like superfamily - - "Anticodon-binding domain of class Ia aminoacyl tRNA synthetases and similar domains; This domain is found in a variety of class Ia aminoacyl tRNA synthetases, C-terminal to the catalytic core domain. It recognizes and specifically binds to the anticodon of the tRNA. Aminoacyl tRNA synthetases catalyze the transfer of cognate amino acids to the 3'-end of their tRNAs by specifically recognizing cognate from non-cognate amino acids. Members include valyl-, leucyl-, isoleucyl-, cysteinyl-, arginyl-, and methionyl-tRNA synthetases. 
This superfamily also includes a domain from MshC, an enzyme in the mycothiol biosynthetic pathway." Q#11415 - CGI_10005888 superfamily 241550 120 274 1.07E-81 271.813 cl00015 nt_trans superfamily C - "nucleotidyl transferase superfamily; nt_trans (nucleotidyl transferase) This superfamily includes the class I amino-acyl tRNA synthetases, pantothenate synthetase (PanC), ATP sulfurylase, and the cytidylyltransferases, all of which have a conserved dinucleotide-binding domain." Q#11415 - CGI_10005888 superfamily 222257 302 392 5.86E-10 58.6415 cl16317 tRNA-synt_1_2 superfamily C - "Leucyl-tRNA synthetase, Domain 2; This is a family of the conserved region of Leucine-tRNA ligase or Leucyl-tRNA synthetase, EC:6.1.1.4." Q#11415 - CGI_10005888 superfamily 241634 970 1037 0.00107211 38.4899 cl00143 SynN superfamily C - "Syntaxin N-terminus domain; syntaxins are nervous system-specific proteins implicated in the docking of synaptic vesicles with the presynaptic plasma membrane; they are a family of receptors for intracellular transport vesicles; each target membrane may be identified by a specific member of the syntaxin family; syntaxins contain a moderately well conserved amino-terminal domain, called Habc, whose structure is an antiparallel three-helix bundle; a linker of about 30 amino acids connects this to the carboxy-terminal region, designated H3 (t_SNARE), of the syntaxin cytoplasmic domain; the highly conserved H3 region forms a single, long alpha-helix when it is part of the core SNARE complex and anchors the protein on the cytoplasmic surface of cellular membranes; H3 is not included in defining this domain" Q#11416 - CGI_10005889 superfamily 243694 16 376 1.64E-127 377.325 cl04289 Tfb2 superfamily - - Transcription factor Tfb2; Transcription factor Tfb2. 
Q#11417 - CGI_10005890 superfamily 246721 140 500 6.45E-116 366.968 cl14807 ACE1-Sec16-like superfamily - - "Ancestral coatomer element 1 (ACE1) of COPII coat complex assembly protein Sec16; COPII coat complex plays an important role in vesicular traffic of newly synthesized proteins from the endoplasmic reticulum (ER) to the Golgi apparatus by mediating the formation of transport vesicles. COPII consists of an outer coat, made up of the scaffold proteins Sec31 and Sec13, and the cargo adaptor complex, Sec23 and Sec24, which are recruited by the small GTPase Sar1. Sec16 is involved in the early steps of the assembly process. Sec16 forms elongated heterotetramers with Sec13, Sec13-(Sec16)2-Sec13. It interacts with Sec13 by insertion of a single beta-blade to close the six-bladed beta propeller of Sec13. In the same way Sec13 interacts with Sec31 and Nup145C, a nuclear pore protein, all of these contain a structurally related ancestral coatomer element 1 (ACE1). Sec16 is believed to be a key component in maintaining the integrity of the ER exit site." Q#11418 - CGI_10005891 superfamily 241900 1 103 0.00229577 34.3609 cl00490 EEP superfamily N - "Exonuclease-Endonuclease-Phosphatase (EEP) domain superfamily; This large superfamily includes the catalytic domain (exonuclease/endonuclease/phosphatase or EEP domain) of a diverse set of proteins including the ExoIII family of apurinic/apyrimidinic (AP) endonucleases, inositol polyphosphate 5-phosphatases (INPP5), neutral sphingomyelinases (nSMases), deadenylases (such as the vertebrate circadian-clock regulated nocturnin), bacterial cytolethal distending toxin B (CdtB), deoxyribonuclease 1 (DNase1), the endonuclease domain of the non-LTR retrotransposon LINE-1, and related domains. These diverse enzymes share a common catalytic mechanism of cleaving phosphodiester bonds; their substrates range from nucleic acids to phospholipids and perhaps proteins." 
Q#11421 - CGI_10020153 superfamily 241972 2 188 3.84E-55 177.733 cl00600 Ribosomal_L7Ae superfamily N - "Ribosomal protein L7Ae/L30e/S12e/Gadd45 family; This family includes: Ribosomal L7A from metazoa, Ribosomal L8-A and L8-B from fungi, 30S ribosomal protein HS6 from archaebacteria, 40S ribosomal protein S12 from eukaryotes, Ribosomal protein L30 from eukaryotes and archaebacteria. Gadd45 and MyD118." Q#11422 - CGI_10020154 superfamily 247792 29 71 1.83E-07 47.0552 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#11422 - CGI_10020154 superfamily 207713 97 147 1.30E-13 65.0537 cl02729 WWE superfamily C - WWE domain; The WWE domain is named after three of its conserved residues and is predicted to mediate specific protein-protein interactions in ubiquitin and ADP ribose conjugation systems. Q#11425 - CGI_10020157 superfamily 217293 64 278 8.02E-82 256.791 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#11425 - CGI_10020157 superfamily 202474 285 535 8.12E-40 144.719 cl08379 Neur_chan_memb superfamily - - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. 
Q#11427 - CGI_10020159 superfamily 217473 557 629 5.58E-05 43.893 cl03978 Mab-21 superfamily NC - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#11429 - CGI_10020161 superfamily 241644 10 169 1.25E-57 193.573 cl00154 UBCc superfamily - - "Ubiquitin-conjugating enzyme E2, catalytic (UBCc) domain. This is part of the ubiquitin-mediated protein degradation pathway in which a thiol-ester linkage forms between a conserved cysteine and the C-terminus of ubiquitin and complexes with ubiquitin protein ligase enzymes, E3. This pathway regulates many fundamental cellular processes. There are also other E2s which form thiol-ester linkages without the use of E3s as well as several UBC homologs (TSG101, Mms2, Croc-1 and similar proteins) which lack the active site cysteine essential for ubiquitination and appear to function in DNA repair pathways which were omitted from the scope of this CD." Q#11429 - CGI_10020161 superfamily 246722 192 364 1.14E-05 45.9501 cl14812 PIN_SF superfamily - - "PIN (PilT N terminus) domain: Superfamily; PIN_SF The PIN (PilT N terminus) domain belongs to a large nuclease superfamily with representatives from eukaryota, eubacteria, and archaea. PIN domains were originally named for their sequence similarity to the N-terminal domain of an annotated pili biogenesis protein, PilT, a domain fusion between a PIN-domain and a PilT ATPase domain. The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues) is geometrically similar in the active center of structure-specific 5' nucleases (also known as Flap endonuclease-1-like), PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Seen here, are two major divisions in the PIN domain superfamily. 
The first major division, the structure-specific 5' nuclease family, is represented by FEN1, the 5'-3' exonuclease of DNA polymerase I, and T4 RNase H nuclease PIN domains. These 5' nucleases are involved in DNA replication, repair, and recombination. They are capable of both 5'-3' exonucleolytic activity and cleaving bifurcated DNA, in an endonucleolytic, structure-specific manner. Unique to FEN1-like nucleases, the PIN domain has a helical arch/clamp region (I domain) of variable length (approximately 16 to 800 residues) and, inserted within the C-terminal region of the PIN domain, a H3TH (helix-3-turn-helix) domain, an atypical helix-hairpin-helix-2-like region. Both the H3TH domain (not included here) and the helical arch/clamp region are involved in DNA binding. With the exception of Mkt1, these nucleases have a carboxylate rich active site that is involved in binding essential divalent metal ion cofactors (Mg2+, Mn2+, Zn2+, or Co2+). The second major division of the PIN domain superfamily, the VapC-Smg6 family, includes such eukaryotic ribonucleases as, Smg6, an essential factor in nonsense-mediated mRNA decay; Rrp44, the catalytic subunit of the exosome; and Nob1, a ribosome assembly factor critical in pre-rRNA processing. A large percentage of members in this family are bacterial ribonuclease toxins of TA operons such as Mycobacterium tuberculosis VapC and Neisseria gonorrhoeae FitB, as well as, archaeal homologs, Pyrobaculum aerophilum Pea0151 and P. aerophilum Pae2754. Also included are the eukaryotic Fcf1/ Utp24 (FAF1-copurifying factor 1/U three-associated protein 24) and Utp23-like proteins. Components of the small subunit processome, Fcf1/Utp24 and Utp23 are essential proteins involved in pre-rRNA processing and 40S ribosomal subunit assembly." 
Q#11430 - CGI_10020162 superfamily 241574 487 721 3.72E-102 315.679 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#11430 - CGI_10020162 superfamily 247069 126 280 1.87E-31 120.953 cl15787 SEC14 superfamily - - "Sec14p-like lipid-binding domain. Found in secretory proteins, such as S. cerevisiae phosphatidylinositol transfer protein (Sec14p), and in lipid regulated proteins such as RhoGAPs, RhoGEFs and neurofibromin (NF1). SEC14 domain of Dbl is known to associate with G protein beta/gamma subunits." Q#11433 - CGI_10020165 superfamily 243072 110 202 2.62E-30 109.781 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#11435 - CGI_10020167 superfamily 245213 407 435 0.00345938 35.6902 cl09941 EGF_CA superfamily N - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#11435 - CGI_10020167 superfamily 214531 201 235 4.43E-07 47.2113 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#11437 - CGI_10020169 superfamily 248012 13 105 1.69E-12 60.3649 cl17458 TIR_2 superfamily C - TIR domain; This is a family of bacterial Toll-like receptors. Q#11438 - CGI_10020170 superfamily 241563 63 100 0.00176002 35.726 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#11440 - CGI_10020172 superfamily 110440 768 794 0.000102963 40.8541 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." 
Q#11442 - CGI_10020174 superfamily 247724 18 152 9.76E-95 300.093 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." 
Q#11442 - CGI_10020174 superfamily 247792 186 226 6.41E-12 62.4632 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#11442 - CGI_10020174 superfamily 207684 416 450 0.00116114 38.1288 cl02640 SAP superfamily - - "SAP domain; The SAP (after SAF-A/B, Acinus and PIAS) motif is a putative DNA/RNA binding domain found in diverse nuclear and cytoplasmic proteins." Q#11442 - CGI_10020174 superfamily 128973 369 392 0.00137158 37.9722 cl02765 ZnF_Rad18 superfamily - - Rad18-like CCHC zinc finger; Yeast Rad18p functions with Rad5p in error-free post-replicative DNA repair. This zinc finger is likely to bind nucleic-acids. Q#11447 - CGI_10020179 superfamily 243066 42 131 7.66E-36 126.899 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. 
POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#11447 - CGI_10020179 superfamily 152037 2 27 6.40E-11 56.9092 cl13109 Shal-type superfamily - - Shal-type voltage-gated potassium channels; This family of proteins represents Shal-type voltage-gated potassium channels which interact with Kv channel-interacting proteins to modulate cell surface expression and function of Kv4 channels. The interaction of the Shal-type protein Kv4.2 and the Kv interacting protein KChiP1 forms a structure which is like the structure between calmodulin and its target peptides when they interact. Interactions of an N terminal alpha helix in Kv4.2 and a C terminal alpha helix in KChIP1 are essential for the modulation of Kv4.2 by KChIPs. Q#11447 - CGI_10020179 superfamily 219619 328 369 0.00918597 34.1056 cl18518 Ion_trans_2 superfamily C - Ion channel; This family includes the two membrane helix type ion channels found in bacteria. Q#11449 - CGI_10020181 superfamily 221286 75 164 3.28E-15 69.2048 cl13340 DUF3399 superfamily - - "Domain of unknown function (DUF3399); This domain is functionally uncharacterized. This domain is found in eukaryotes. This presumed domain is about 100 amino acids in length. This domain is found associated with pfam02214, pfam00520." Q#11450 - CGI_10020182 superfamily 241578 19 196 1.69E-44 156.606 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#11450 - CGI_10020182 superfamily 148333 431 558 1.12E-08 54.2094 cl05947 ITI_HC_C superfamily C - "Inter-alpha-trypsin inhibitor heavy chain C-terminus; This family represents the C-terminal region of inter-alpha-trypsin inhibitor heavy chains. Inter-alpha-trypsin inhibitors are glycoproteins with a high inhibitory activity against trypsin, built up from different combinations of four polypeptides: bikunin and the three heavy chains that belong to this family (HC1, HC2, HC3). The heavy chains do not have any protease inhibitory properties but have the capacity to interact in vitro and in vivo with hyaluronic acid, which promotes the stability of the extra-cellular matrix. All family members contain the pfam00092 domain." Q#11451 - CGI_10020183 superfamily 110440 89 110 0.00377798 32.7649 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." 
Q#11454 - CGI_10020186 superfamily 248264 54 139 1.35E-05 41.4538 cl17710 DDE_4 superfamily C - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#11459 - CGI_10020191 superfamily 241610 131 181 1.50E-12 59.571 cl00101 KU superfamily - - BPTI/Kunitz family of serine protease inhibitors; Structure is a disulfide rich alpha+beta fold. BPTI (bovine pancreatic trypsin inhibitor) is an extensively studied model structure. Q#11459 - CGI_10020191 superfamily 241610 12 62 8.63E-10 52.2522 cl00101 KU superfamily - - BPTI/Kunitz family of serine protease inhibitors; Structure is a disulfide rich alpha+beta fold. BPTI (bovine pancreatic trypsin inhibitor) is an extensively studied model structure. 
Q#11460 - CGI_10005131 superfamily 247792 57 99 6.11E-09 48.53 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#11462 - CGI_10005133 superfamily 241606 215 319 8.24E-29 113.54 cl00096 IRF superfamily - - Interferon Regulatory Factor (IRF); also known as tryptophan pentad repeat. The family of IRF transcription factors is important in the regulation of interferons in response to infection by virus and in the regulation of interferon-inducible genes. The IRF family is characterized by a unique 'tryptophan cluster' DNA-binding region. Viral IRFs bind to cellular IRFs; block type I and II interferons and host IRF-mediated transcriptional activation. Q#11463 - CGI_10005134 superfamily 217293 26 226 2.08E-33 125.053 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#11463 - CGI_10005134 superfamily 202474 234 372 3.87E-09 55.3525 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#11464 - CGI_10005135 superfamily 248097 89 201 1.10E-11 58.8158 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. 
Q#11465 - CGI_10005136 superfamily 241584 446 538 1.08E-13 67.9067 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#11465 - CGI_10005136 superfamily 241584 357 441 9.83E-11 59.0471 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#11465 - CGI_10005136 superfamily 241584 64 154 1.60E-09 55.5803 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." 
Q#11465 - CGI_10005136 superfamily 241584 146 227 9.43E-09 53.2691 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#11465 - CGI_10005136 superfamily 245814 12 56 7.15E-06 44.8414 cl11960 Ig superfamily C - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#11468 - CGI_10016126 superfamily 246680 478 560 0.00184464 36.9298 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 
They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#11470 - CGI_10016128 superfamily 218570 144 166 0.00595769 33.5301 cl05109 Pacifastin_I superfamily NC - "Pacifastin inhibitor (LCMII); Structures of members of this family show that they are comprised of a triple-stranded antiparallel beta-sheet connected by three disulfide bridges, which defines this as a novel family of serine protease inhibitors." Q#11470 - CGI_10016128 superfamily 218570 229 256 0.00645024 33.5301 cl05109 Pacifastin_I superfamily N - "Pacifastin inhibitor (LCMII); Structures of members of this family show that they are comprised of a triple-stranded antiparallel beta-sheet connected by three disulfide bridges, which defines this as a novel family of serine protease inhibitors." Q#11471 - CGI_10016129 superfamily 216212 466 990 0 682.095 cl03037 HCO3_cotransp superfamily - - HCO3- transporter family; This family contains Band 3 anion exchange proteins that exchange CL-/HCO3-. This family also includes cotransporters of Na+/HCO3-. Q#11472 - CGI_10016130 superfamily 241596 107 165 7.04E-18 76.8691 cl00081 HLH superfamily - - "Helix-loop-helix domain, found in specific DNA- binding proteins that act as transcription factors; 60-100 amino acids long. 
A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE 5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved in both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#11472 - CGI_10016130 superfamily 221625 167 231 1.02E-17 78.1703 cl13910 Neuro_bHLH superfamily C - "Neuronal helix-loop-helix transcription factor; This domain family is found in eukaryotes, and is approximately 80 amino acids in length. The family is found C-terminal to pfam00010. There is a single completely conserved residue W that may be functionally important. Neuronal basic helix-loop-helix (bHLH) transcription factors such as neuroD and neurogenin have been shown to play important roles in neuronal development." 
Q#11473 - CGI_10016131 superfamily 241669 11 113 6.04E-16 74.2336 cl00187 Fascin superfamily - - "Fascin-like domain; members include actin-bundling/crosslinking proteins facsin, histoactophilin and singed; identified in sea urchin, Drosophila, Xenopus, rodents, and humans; The fascin-like domain adopts a beta-trefoil topology and contains an internal threefold repeat; the fascin subgroup contains four copies of the domain; Structurally similar to fibroblast growth factor (FGF)" Q#11473 - CGI_10016131 superfamily 241669 382 499 1.60E-14 69.9964 cl00187 Fascin superfamily - - "Fascin-like domain; members include actin-bundling/crosslinking proteins facsin, histoactophilin and singed; identified in sea urchin, Drosophila, Xenopus, rodents, and humans; The fascin-like domain adopts a beta-trefoil topology and contains an internal threefold repeat; the fascin subgroup contains four copies of the domain; Structurally similar to fibroblast growth factor (FGF)" Q#11473 - CGI_10016131 superfamily 241669 259 364 1.06E-11 61.9072 cl00187 Fascin superfamily - - "Fascin-like domain; members include actin-bundling/crosslinking proteins facsin, histoactophilin and singed; identified in sea urchin, Drosophila, Xenopus, rodents, and humans; The fascin-like domain adopts a beta-trefoil topology and contains an internal threefold repeat; the fascin subgroup contains four copies of the domain; Structurally similar to fibroblast growth factor (FGF)" Q#11473 - CGI_10016131 superfamily 241669 136 253 3.46E-11 60.3664 cl00187 Fascin superfamily - - "Fascin-like domain; members include actin-bundling/crosslinking proteins facsin, histoactophilin and singed; identified in sea urchin, Drosophila, Xenopus, rodents, and humans; The fascin-like domain adopts a beta-trefoil topology and contains an internal threefold repeat; the fascin subgroup contains four copies of the domain; Structurally similar to fibroblast growth factor (FGF)" Q#11474 - CGI_10016132 superfamily 241669 9 112 3.77E-16 
74.6188 cl00187 Fascin superfamily - - "Fascin-like domain; members include actin-bundling/crosslinking proteins facsin, histoactophilin and singed; identified in sea urchin, Drosophila, Xenopus, rodents, and humans; The fascin-like domain adopts a beta-trefoil topology and contains an internal threefold repeat; the fascin subgroup contains four copies of the domain; Structurally similar to fibroblast growth factor (FGF)" Q#11474 - CGI_10016132 superfamily 241669 259 368 1.75E-09 55.3589 cl00187 Fascin superfamily - - "Fascin-like domain; members include actin-bundling/crosslinking proteins facsin, histoactophilin and singed; identified in sea urchin, Drosophila, Xenopus, rodents, and humans; The fascin-like domain adopts a beta-trefoil topology and contains an internal threefold repeat; the fascin subgroup contains four copies of the domain; Structurally similar to fibroblast growth factor (FGF)" Q#11474 - CGI_10016132 superfamily 241669 385 480 2.99E-06 45.3437 cl00187 Fascin superfamily C - "Fascin-like domain; members include actin-bundling/crosslinking proteins facsin, histoactophilin and singed; identified in sea urchin, Drosophila, Xenopus, rodents, and humans; The fascin-like domain adopts a beta-trefoil topology and contains an internal threefold repeat; the fascin subgroup contains four copies of the domain; Structurally similar to fibroblast growth factor (FGF)" Q#11474 - CGI_10016132 superfamily 241669 148 230 0.000835185 38.0371 cl00187 Fascin superfamily C - "Fascin-like domain; members include actin-bundling/crosslinking proteins facsin, histoactophilin and singed; identified in sea urchin, Drosophila, Xenopus, rodents, and humans; The fascin-like domain adopts a beta-trefoil topology and contains an internal threefold repeat; the fascin subgroup contains four copies of the domain; Structurally similar to fibroblast growth factor (FGF)" Q#11477 - CGI_10016135 superfamily 246683 20 276 5.69E-91 273.331 cl14648 Aldose_epim superfamily - - "aldose 
1-epimerase superfamily; Aldose 1-epimerases or mutarotases are key enzymes of carbohydrate metabolism; they catalyze the interconversion of the alpha- and beta-anomers of hexose sugars such as glucose and galactose. This interconversion is an important step that allows anomer specific metabolic conversion of sugars. Studies of the catalytic mechanism of the best known member of the family, galactose mutarotase, have shown a glutamate and a histidine residue to be critical for catalysis; the glutamate serves as the active site base to initiate the reaction by removing the proton from the C-1 hydroxyl group of the sugar substrate and the histidine as the active site acid to protonate the C-5 ring oxygen." Q#11478 - CGI_10016136 superfamily 242465 115 251 6.03E-16 73.63 cl01378 LicD superfamily - - "LicD family; The LICD family of proteins show high sequence similarity and are involved in phosphorylcholine metabolism. There is evidence to show that LicD2 mutants have a reduced ability to take up choline, have decreased ability to adhere to host cells and are less virulent. These proteins are part of the nucleotidyltransferase superfamily." Q#11480 - CGI_10016138 superfamily 217293 2 157 2.21E-26 108.874 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#11480 - CGI_10016138 superfamily 202474 190 240 2.85E-11 63.4417 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#11480 - CGI_10016138 superfamily 204434 935 958 2.20E-08 51.8117 cl10963 zf-CCHH superfamily - - "Zinc-finger (CX5CX6HX5H) motif; This domain is a zinc-finger motif that in humans is part of the APLF, aprataxin- and PNK-like forkead association domain-containing protein. 
The ZnF is highly conserved both in primary sequence and in the spacing between the putative zinc coordinating residues and is configured CX5CX6HX5H. Many of the proteins containing the APLF-like ZnF are involved in DNA strand break repair and/or contain domains implicated in DNA metabolism." Q#11480 - CGI_10016138 superfamily 241592 365 381 0.00827556 35.8702 cl00074 H2A superfamily NC - "Histone 2A; H2A is a subunit of the nucleosome. The nucleosome is an octamer containing two H2A, H2B, H3, and H4 subunits. The H2A subunit performs essential roles in maintaining structural integrity of the nucleosome, chromatin condensation, and binding of specific chromatin-associated proteins." Q#11480 - CGI_10016138 superfamily 241581 457 522 0.00829773 35.6499 cl00062 FHA superfamily - - "Forkhead associated domain (FHA); found in eukaryotic and prokaryotic proteins. Putative nuclear signalling domain. FHA domains may bind phosphothreonine, phosphoserine and sometimes phosphotyrosine. In eukaryotes, many FHA domain-containing proteins localize to the nucleus, where they participate in establishing or maintaining cell cycle checkpoints, DNA repair, or transcriptional regulation. Members of the FHA family include: Dun1, Rad53, Cds1, Mek1, KAPP(kinase-associated protein phosphatase),and Ki-67 (a human nuclear protein related to cell proliferation)." Q#11481 - CGI_10016139 superfamily 202837 71 120 0.000615189 36.4569 cl04354 Nnf1 superfamily N - Nnf1; NNF1 is an essential yeast gene that is necessary for chromosome segregation. It is associated with the spindle poles and forms part of a kinetochore subcomplex called MIND. 
Q#11482 - CGI_10016140 superfamily 245815 31 507 0 903.425 cl11961 ALDH-SF superfamily - - "NAD(P)+-dependent aldehyde dehydrogenase superfamily; The aldehyde dehydrogenase superfamily (ALDH-SF) of NAD(P)+-dependent enzymes, in general, oxidize a wide range of endogenous and exogenous aliphatic and aromatic aldehydes to their corresponding carboxylic acids and play an important role in detoxification. Besides aldehyde detoxification, many ALDH isozymes possess multiple additional catalytic and non-catalytic functions such as participating in metabolic pathways, or as binding proteins, or osmoregulants, to mention a few. The enzyme has three domains, a NAD(P)+ cofactor-binding domain, a catalytic domain, and a bridging domain; and the active enzyme is generally either homodimeric or homotetrameric. The catalytic mechanism is proposed to involve cofactor binding, resulting in a conformational change and activation of an invariant catalytic cysteine nucleophile. The cysteine and aldehyde substrate form an oxyanion thiohemiacetal intermediate resulting in hydride transfer to the cofactor and formation of a thioacylenzyme intermediate. Hydrolysis of the thioacylenzyme and release of the carboxylic acid product occurs, and in most cases, the reduced cofactor dissociates from the enzyme. The evolutionary phylogenetic tree of ALDHs appears to have an initial bifurcation between what has been characterized as the classical aldehyde dehydrogenases, the ALDH family (ALDH) and extended family members or aldehyde dehydrogenase-like (ALDH-L) proteins. 
The ALDH proteins are represented by enzymes which share a number of highly conserved residues necessary for catalysis and cofactor binding and they include such proteins as retinal dehydrogenase, 10-formyltetrahydrofolate dehydrogenase, non-phosphorylating glyceraldehyde 3-phosphate dehydrogenase, delta(1)-pyrroline-5-carboxylate dehydrogenases, alpha-ketoglutaric semialdehyde dehydrogenase, alpha-aminoadipic semialdehyde dehydrogenase, coniferyl aldehyde dehydrogenase and succinate-semialdehyde dehydrogenase. Included in this larger group are all human, Arabidopsis, Tortula, fungal, protozoan, and Drosophila ALDHs identified in families ALDH1 through ALDH22 with the exception of families ALDH18, ALDH19, and ALDH20 which are present in the ALDH-like group. The ALDH-like group is represented by such proteins as gamma-glutamyl phosphate reductase, LuxC-like acyl-CoA reductase, and coenzyme A acylating aldehyde dehydrogenase. All of these proteins have a conserved cysteine that aligns with the catalytic cysteine of the ALDH group." Q#11483 - CGI_10016141 superfamily 241659 289 356 0.00414332 36.7983 cl00175 alpha-crystallin-Hsps_p23-like superfamily - - "alpha-crystallin domain (ACD) found in alpha-crystallin-type small heat shock proteins, and a similar domain found in p23 (a cochaperone for Hsp90) and in other p23-like proteins.; The alpha-crystallin-Hsps_p23-like superfamily includes the alpha-crystallin domain (ACD) of alpha-crystallin-type small heat shock proteins (sHsps) and a similar domain found in p23-like proteins. sHsps are small stress induced proteins with monomeric masses between 12-43 kDa, whose common feature is this ACD. sHsps are generally active as large oligomers consisting of multiple subunits, and are believed to be ATP-independent chaperones that prevent aggregation and are important in refolding in combination with other Hsps. p23 is a cochaperone of the Hsp90 chaperoning pathway. 
It binds Hsp90 and participates in the folding of a number of Hsp90 clients including the progesterone receptor. p23 also has a passive chaperoning activity. p23 in addition may act as the cytosolic prostaglandin E2 synthase. Included in this superfamily is the p23-like C-terminal CHORD-SGT1 (CS) domain of suppressor of G2 allele of Skp1 (Sgt1) and the p23-like domains of human butyrate-induced transcript 1 (hB-ind1), NUD (nuclear distribution) C, Melusin, and NAD(P)H cytochrome b5 (NCB5) oxidoreductase (OR)." Q#11485 - CGI_10016143 superfamily 241982 20 215 2.29E-56 181.226 cl00613 ATP-synt_D superfamily - - ATP synthase subunit D; This is a family of subunit D from various ATP synthases including V-type H+ transporting and Na+ dependent. Subunit D is suggested to be an integral part of the catalytic sector of the V-ATPase. Q#11486 - CGI_10016144 superfamily 151110 15 188 1.36E-49 162.367 cl11201 UPF0552 superfamily N - Uncharacterized protein family UPF0552; This family of proteins has no known function. Q#11487 - CGI_10016145 superfamily 247683 39 109 2.59E-16 76.7504 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. 
Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#11488 - CGI_10016146 superfamily 247692 34 680 0 831.263 cl17068 AFD_class_I superfamily - - "Adenylate forming domain, Class I; This family includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain." Q#11489 - CGI_10016147 superfamily 243072 241 366 3.71E-35 127.5 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#11489 - CGI_10016147 superfamily 243072 307 432 2.37E-29 111.707 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#11489 - CGI_10016147 superfamily 243072 75 199 7.07E-17 77.0386 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#11490 - CGI_10016148 superfamily 245814 148 197 3.03E-06 43.6852 cl11960 Ig superfamily C - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#11490 - CGI_10016148 superfamily 245814 42 110 0.00528069 34.0482 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#11491 - CGI_10016149 superfamily 241622 228 308 1.95E-18 83.7702 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archebacteria, and eurkayotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#11491 - CGI_10016149 superfamily 241622 675 761 3.67E-13 68.3622 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archebacteria, and eurkayotes. 
This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#11491 - CGI_10016149 superfamily 241622 2704 2789 4.21E-12 65.2806 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archebacteria, and eurkayotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#11491 - CGI_10016149 superfamily 241622 816 892 9.07E-10 58.347 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). 
Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archebacteria, and eurkayotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#11491 - CGI_10016149 superfamily 241622 2428 2497 5.42E-07 49.9965 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archebacteria, and eurkayotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. 
The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#11491 - CGI_10016149 superfamily 241622 1584 1652 1.25E-06 48.8409 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archebacteria, and eurkayotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#11491 - CGI_10016149 superfamily 241622 391 432 0.00226767 39.0871 cl00117 PDZ superfamily C - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. 
Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archebacteria, and eurkayotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#11491 - CGI_10016149 superfamily 218881 1320 1539 0.00295199 41.8148 cl18482 Herpes_UL32 superfamily N - Herpesvirus large structural phosphoprotein UL32; The large phosphorylated protein (UL32-like) of herpes viruses is the polypeptide most frequently reactive in immuno-blotting analyses with antisera when compared with other viral proteins. Q#11493 - CGI_10006112 superfamily 216554 27 165 3.06E-23 92.1573 cl15977 zf-DHHC superfamily N - DHHC palmitoyltransferase; This family includes the well known DHHC zinc binding domain as well as three of the four conserved transmembrane regions found in this family of palmitoyltransferase enzymes. Q#11494 - CGI_10006113 superfamily 241645 250 317 1.32E-29 109.357 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. 
Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#11494 - CGI_10006113 superfamily 241643 6 39 2.70E-08 49.3799 cl00153 UBA superfamily - - "Ubiquitin Associated domain. The UBA domain is a commonly occurring sequence motif in some members of the ubiquitination pathway, UV excision repair proteins, and certain protein kinases. Although its specific role is so far unknown, it has been suggested that UBA domains are involved in conferring protein target specificity. The domain, a compact three helix bundle, has a conserved GFP-loop and the proline is thought to be critical for binding. The UBA domain is distinct from the conserved three helical domain seen in the N-terminus of EF-TS and eukaryotic NAC proteins." Q#11495 - CGI_10006114 superfamily 243176 28 524 1.32E-36 141.211 cl02777 chaperonin_like superfamily - - "chaperonin_like superfamily. Chaperonins are involved in productive folding of proteins. They share a common general morphology, a double toroid of 2 stacked rings, each composed of 7-9 subunits. There are 2 main chaperonin groups. The symmetry of type I is seven-fold and they are found in eubacteria (GroEL) and in organelles of eubacterial descent (hsp60 and RBP). The symmetry of type II is eight- or nine-fold and they are found in archea (thermosome), thermophilic bacteria (TF55) and in the eukaryotic cytosol (CTT). Their common function is to sequester nonnative proteins inside their central cavity and promote folding by using energy derived from ATP hydrolysis. 
This superfamily also contains related domains from Fab1-like phosphatidylinositol 3-phosphate (PtdIns3P) 5-kinases that only contain the intermediate and apical domains." Q#11497 - CGI_10006116 superfamily 241563 73 106 1.93E-05 42.2744 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#11500 - CGI_10006119 superfamily 247856 64 125 3.21E-16 68.3433 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#11500 - CGI_10006119 superfamily 247856 2 52 4.10E-11 54.4761 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#11502 - CGI_10006121 superfamily 243034 2 123 0.000131611 38.13 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#11503 - CGI_10006122 superfamily 248345 59 189 7.85E-34 119.687 cl17791 SAC3_GANP superfamily - - "SAC3/GANP/Nin1/mts3/eIF-3 p25 family; This large family includes diverse proteins involved in large complexes. The alignment contains one highly conserved negatively charged residue and one highly conserved positively charged residue that are probably important for the function of these proteins. The family includes the yeast nuclear export factor Sac3, and mammalian GANP/MCM3-associated proteins, which facilitate the nuclear localisation of MCM3, a protein that associates with chromatin in the G1 phase of the cell-cycle. 
The 26S protease (or 26S proteasome) is responsible for degrading ubiquitin conjugates. It consists of 19S regulatory complexes associated with the ends of 20S proteasomes. The 19S regulatory complex is composed of about 20 different polypeptides and confers ATP-dependence and substrate specificity to the 26S enzyme. The conserved region occurs at the C-terminal of the Nin1-like regulatory subunit. This family includes several eukaryotic translation initiation factor 3 subunit 11 (eIF-3 p25) proteins. Eukaryotic initiation factor 3 (eIF3) is a multisubunit complex that is required for binding of mRNA to 40 S ribosomal subunits, stabilisation of ternary complex binding to 40 S subunits, and dissociation of 40 and 60 S subunits." Q#11506 - CGI_10006125 superfamily 241599 77 135 8.88E-23 88.8396 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." 
Q#11507 - CGI_10014791 superfamily 243092 1161 1449 1.73E-92 303.102 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#11507 - CGI_10014791 superfamily 241578 327 497 1.81E-11 64.1242 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#11513 - CGI_10014799 superfamily 248264 215 376 1.47E-05 43.765 cl17710 DDE_4 superfamily - - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#11514 - CGI_10014800 superfamily 220672 44 227 1.03E-23 95.7766 cl10957 Frag1 superfamily - - "Frag1/DRAM/Sfk1 family; This family includes Frag1, DRAM and Sfk1 proteins. Frag1 (FGF receptor activating protein 1) is a protein that is conserved from fungi to humans. There are four potential iso-prenylation sites throughout the peptide, viz CILW, CIIW and CIGL. Frag1 is a membrane-spanning protein that is ubiquitously expressed in adult tissues suggesting an important cellular function. Dram is a family of proteins conserved from nematodes to humans with six hydrophobic transmembrane regions and an Endoplasmic Reticulum signal peptide. It is a lysosomal protein that induces macro-autophagy as an effector of p53-mediated death, where p53 is the tumour-suppressor gene that is frequently mutated in cancer. 
Expression of Dram is stress-induced. This region is also part of a family of small plasma membrane proteins, referred to as Sfk1, that may act together with or upstream of Stt4p to generate normal levels of the essential phospholipid PI4P, thus allowing proper localisation of Stt4p to the actin cytoskeleton." Q#11515 - CGI_10014801 superfamily 246748 32 425 0 603.117 cl14876 Zinc_peptidase_like superfamily - - "Zinc peptidases M18, M20, M28, and M42; Zinc peptidases play vital roles in metabolic and signaling pathways throughout all kingdoms of life. This family corresponds to several clans in the MEROPS database, including the MH clan, which contains 4 families (M18, M20, M28, M42). The peptidase M20 family includes carboxypeptidases such as the glutamate carboxypeptidase from Pseudomonas, the thermostable carboxypeptidase Ss1 of broad specificity from archaea and yeast Gly-X carboxypeptidase. The dipeptidases include bacterial dipeptidase, peptidase V (PepV), a eukaryotic, non-specific dipeptidase, and two Xaa-His dipeptidases (carnosinases). There is also the bacterial aminopeptidase, peptidase T (PepT) that acts only on tripeptide substrates and has therefore been termed a tripeptidase. Peptidase family M28 contains aminopeptidases and carboxypeptidases, and has co-catalytic zinc ions. However, several enzymes in this family utilize other first row transition metal ions such as cobalt and manganese. Each zinc ion is tetrahedrally co-ordinated, with three amino acid ligands plus activated water; one aspartate residue binds both metal ions. The aminopeptidases in this family are also called bacterial leucyl aminopeptidases, but are able to release a variety of N-terminal amino acids. IAP aminopeptidase and aminopeptidase Y preferentially release basic amino acids while glutamate carboxypeptidase II preferentially releases C-terminal glutamates. Glutamate carbxypeptidase II and plasma glutamate carboxypeptidase hydrolyze dipeptides. 
Peptidase families M18 and M42 contain metalloaminopeptidases. M18 is widely distributed in bacteria and eukaryotes. However, only yeast aminopeptidase I and mammalian aspartyl aminopeptidase have been characterized in detail. Some of M42 (also known as glutamyl aminopeptidase) enzymes exhibit aminopeptidase specificity while others also have acylaminoacylpeptidase activity (i.e. hydrolysis of acylated N-terminal residues)." Q#11516 - CGI_10014802 superfamily 245814 46 98 7.31E-10 51.4469 cl11960 Ig superfamily N - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#11518 - CGI_10015070 superfamily 241629 29 151 1.79E-37 129.948 cl00133 SCP superfamily - - "SCP: SCP-like extracellular protein domain, found in eukaryotes and prokaryotes. This family includes plant pathogenesis-related protein 1 (PR-1), which accumulates after infections with pathogens, and may act as an anti-fungal agent or be involved in cell wall loosening. This family also includes CRISPs, mammalian cysteine-rich secretory proteins, which combine SCP with a C-terminal cysteine rich domain, and allergen 5 from vespid venom. Roles for CRISP, in response to pathogens, fertilization, and sperm maturation have been proposed. One member, Tex31 from the venom duct of Conus textile, has been shown to possess proteolytic activity sensitive to serine protease inhibitors. 
The human GAPR-1 protein has been reported to dimerize, and such a dimer may form an active site containing a catalytic triad. SCP has also been proposed to be a Ca++ chelating serine protease. The Ca++-chelating function would fit with various signaling processes that members of this family, such as the CRISPs, are involved in, and is supported by sequence and structural evidence of a conserved pocket containing two histidines and a glutamate. It also may explain how helothermine, a toxic peptide secreted by the beaded lizard, blocks Ca++ transporting ryanodine receptors. Little is known about the biological roles of the bacterial and archaeal SCP domains." Q#11520 - CGI_10015072 superfamily 243066 118 218 3.26E-12 64.1757 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#11520 - CGI_10015072 superfamily 198867 387 473 6.73E-10 57.3512 cl06652 BACK superfamily - - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. 
This family appears to be closely related to the BTB domain (Finn RD, personal observation)." Q#11520 - CGI_10015072 superfamily 243066 220 377 1.13E-06 47.6121 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#11521 - CGI_10015073 superfamily 246669 610 710 2.87E-23 96.7522 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. 
However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#11521 - CGI_10015073 superfamily 246669 334 419 1.09E-17 80.5739 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#11523 - CGI_10015075 superfamily 241829 1 52 0.00225821 33.5761 cl00383 Ribosomal_L33 superfamily - - Ribosomal protein L33; Ribosomal protein L33. Q#11524 - CGI_10015076 superfamily 216301 1 179 3.93E-57 179.767 cl03099 EMP24_GP25L superfamily - - emp24/gp25L/p24 family/GOLD; Members of this family are implicated in bringing cargo forward from the ER and binding to coat proteins by their cytoplasmic domains. This domain corresponds closely to the beta-strand rich GOLD domain described in. The GOLD domain is always found combined with lipid- or membrane-association domains. 
Q#11527 - CGI_10015079 superfamily 222313 1 40 1.03E-10 56.4314 cl18662 Methyltransf_32 superfamily N - Methyltransferase domain; This family appears to be a methyltransferase domain. Q#11528 - CGI_10015080 superfamily 243092 9 305 1.52E-23 97.0204 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#11529 - CGI_10015081 superfamily 247725 680 775 2.15E-41 148.985 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. 
This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#11529 - CGI_10015081 superfamily 247725 790 881 0.000239237 41.1868 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#11529 - CGI_10015081 superfamily 247725 1322 1424 4.36E-34 128.317 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#11529 - CGI_10015081 superfamily 243052 925 1120 4.67E-27 109.757 cl02480 MyTH4 superfamily - - "MyTH4 domain; Domain in myosin and kinesin tails, present twice in myosin-VIIa, and also present in 3 other myosins." Q#11529 - CGI_10015081 superfamily 215882 1238 1326 2.48E-05 44.1939 cl09511 FERM_M superfamily - - FERM central domain; This domain is the central structural domain of the FERM domain. 
Q#11531 - CGI_10015083 superfamily 245716 152 178 4.21E-10 54.8719 cl11592 zf-CCCH superfamily - - Zinc finger C-x8-C-x5-C-x3-H type (and similar); Zinc finger C-x8-C-x5-C-x3-H type (and similar). Q#11531 - CGI_10015083 superfamily 245716 114 139 8.84E-10 54.1015 cl11592 zf-CCCH superfamily - - Zinc finger C-x8-C-x5-C-x3-H type (and similar); Zinc finger C-x8-C-x5-C-x3-H type (and similar). Q#11534 - CGI_10015086 superfamily 204716 42 192 4.97E-05 42.2275 cl18257 Git3 superfamily C - "G protein-coupled glucose receptor regulating Gpa2; Git3 is one of six proteins required for glucose-triggered adenylate cyclase activation, and is a G protein-coupled receptor responsible for the activation of adenylate cyclase through Gpa2 - heterotrimeric G protein alpha subunit, part of the glucose-detection pathway. Git3 contains seven predicted transmembrane domains, a third cytoplasmic loop and a cytoplasmic tail. This is the conserved N-terminus of these proteins, and the C-terminal conserved region is now in family Git3_C." Q#11537 - CGI_10015089 superfamily 247743 4 135 1.44E-05 40.9775 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#11540 - CGI_10015093 superfamily 241732 1521 1685 4.75E-37 138.515 cl00258 RIBOc superfamily - - "RIBOc. Ribonuclease III C terminal domain. This group consists of eukaryotic, bacterial and archeal ribonuclease III (RNAse III) proteins. 
RNAse III is a double stranded RNA-specific endonuclease. Prokaryotic RNAse III is important in post-transcriptional control of mRNA stability and translational efficiency. It is involved in the processing of ribosomal RNA precursors. Prokaryotic RNAse III also plays a role in the maturation of tRNA precursors and in the processing of phage and plasmid transcripts. Eukaryotic RNase III's participate (through direct cleavage) in rRNA processing, in processing of small nucleolar RNAs (snoRNAs) and snRNA's (components of the spliceosome). In eukaryotes RNase III or RNaseIII like enzymes such as Dicer are involved in RNAi (RNA interference) and miRNA (micro-RNA) gene silencing." Q#11540 - CGI_10015093 superfamily 247905 522 577 1.19E-09 58.4033 cl17351 HELICc superfamily N - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#11540 - CGI_10015093 superfamily 241765 911 1033 3.42E-46 164.155 cl00301 PAZ superfamily - - "PAZ domain, named PAZ after the proteins Piwi Argonaut and Zwille. PAZ is found in two families of proteins that are essential components of RNA-mediated gene-silencing pathways, including RNA interference, the piwi and Dicer families. PAZ functions as a nucleic-acid binding domain, with a strong preference for single-stranded nucleic acids (RNA or DNA) or RNA duplexes with single-stranded 3' overhangs. 
It has been suggested that the PAZ domain provides a unique mode for the recognition of the two 3'-terminal nucleotides in single-stranded nucleic acids and buries the 3' OH group, and that it might recognize characteristic 3' overhangs in siRNAs within RISC (RNA-induced silencing) and other complexes. This parent model also contains structures of an archaeal PAZ domain." Q#11540 - CGI_10015093 superfamily 190615 648 742 3.60E-28 111.545 cl04028 dsRNA_bind superfamily - - "Double stranded RNA binding domain; This domain is a divergent double stranded RNA-binding domain. It is found in members of the Dicer protein family which function in RNA interference, an evolutionarily conserved mechanism for gene silencing using double-stranded RNA (dsRNA) molecules." Q#11540 - CGI_10015093 superfamily 241732 1426 1467 3.00E-05 44.5259 cl00258 RIBOc superfamily N - "RIBOc. Ribonuclease III C terminal domain. This group consists of eukaryotic, bacterial and archeal ribonuclease III (RNAse III) proteins. RNAse III is a double stranded RNA-specific endonuclease. Prokaryotic RNAse III is important in post-transcriptional control of mRNA stability and translational efficiency. It is involved in the processing of ribosomal RNA precursors. Prokaryotic RNAse III also plays a role in the maturation of tRNA precursors and in the processing of phage and plasmid transcripts. Eukaryotic RNase III's participate (through direct cleavage) in rRNA processing, in processing of small nucleolar RNAs (snoRNAs) and snRNA's (components of the spliceosome). In eukaryotes RNase III or RNaseIII like enzymes such as Dicer are involved in RNAi (RNA interference) and miRNA (micro-RNA) gene silencing." Q#11540 - CGI_10015093 superfamily 247805 45 193 0.00368357 38.4724 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 
Q#11541 - CGI_10015094 superfamily 241782 10 162 1.14E-58 193.055 cl00321 AAT_I superfamily N - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phophorylase family (fold type V)." Q#11542 - CGI_10007754 superfamily 217293 240 368 5.81E-20 89.9995 cl03788 Neur_chan_LBD superfamily N - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. 
Q#11542 - CGI_10007754 superfamily 215647 721 843 2.48E-16 79.1896 cl18338 7tm_2 superfamily C - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GCPRs).They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97 ; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#11542 - CGI_10007754 superfamily 202474 375 425 1.06E-07 52.2709 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#11542 - CGI_10007754 superfamily 221370 530 694 0.000101157 43.5141 cl13441 DUF3497 superfamily - - "Domain of unknown function (DUF3497); This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 213 to 257 amino acids in length. This domain is found associated with pfam02793, pfam00002, pfam01825. This domain has a single completely conserved residue W that may be functionally important." Q#11543 - CGI_10007755 superfamily 245213 227 263 1.10E-09 56.1058 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#11543 - CGI_10007755 superfamily 245213 949 984 4.02E-08 51.4834 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#11543 - CGI_10007755 superfamily 245213 986 1022 4.90E-08 51.4834 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#11543 - CGI_10007755 superfamily 245213 151 187 9.05E-08 50.713 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#11543 - CGI_10007755 superfamily 245213 75 111 1.06E-07 50.3278 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#11543 - CGI_10007755 superfamily 245213 37 73 2.39E-07 49.1722 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#11543 - CGI_10007755 superfamily 245213 113 149 2.80E-07 49.1722 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#11543 - CGI_10007755 superfamily 245213 1 35 3.47E-07 48.787 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#11543 - CGI_10007755 superfamily 245213 189 224 4.49E-07 48.4018 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#11543 - CGI_10007755 superfamily 245213 1024 1060 4.95E-07 48.4018 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#11543 - CGI_10007755 superfamily 245213 1062 1098 1.20E-06 47.2462 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#11543 - CGI_10007755 superfamily 215647 562 660 5.99E-17 81.5008 cl18338 7tm_2 superfamily C - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GCPRs).They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97 ; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. 
Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#11543 - CGI_10007755 superfamily 216897 880 946 1.08E-11 63.0841 cl03463 Gal_Lectin superfamily - - Galactose binding lectin domain; Galactose binding lectin domain. Q#11543 - CGI_10007755 superfamily 221370 1261 1352 3.01E-06 48.5217 cl13441 DUF3497 superfamily C - "Domain of unknown function (DUF3497); This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 213 to 257 amino acids in length. This domain is found associated with pfam02793, pfam00002, pfam01825. This domain has a single completely conserved residue W that may be functionally important." Q#11543 - CGI_10007755 superfamily 243029 1178 1229 4.99E-05 42.7229 cl02422 HRM superfamily - - Hormone receptor domain; This extracellular domain contains four conserved cysteines that probably for disulphide bridges. The domain is found in a variety of hormone receptors. It may be a ligand binding domain. Q#11543 - CGI_10007755 superfamily 221370 472 551 8.32E-05 44.2845 cl13441 DUF3497 superfamily C - "Domain of unknown function (DUF3497); This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 213 to 257 amino acids in length. This domain is found associated with pfam02793, pfam00002, pfam01825. This domain has a single completely conserved residue W that may be functionally important." Q#11544 - CGI_10007756 superfamily 215647 10 105 5.00E-10 53.3813 cl18338 7tm_2 superfamily C - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GCPRs).They have been described in many animal species, but not in plants, fungi or prokaryotes. 
Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97 ; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#11547 - CGI_10007332 superfamily 241571 517 640 8.29E-14 68.9782 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#11547 - CGI_10007332 superfamily 241571 140 266 1.55E-11 62.0446 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#11547 - CGI_10007332 superfamily 245213 689 720 2.99E-07 48.0166 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#11547 - CGI_10007332 superfamily 245213 648 681 1.35E-06 46.0906 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#11547 - CGI_10007332 superfamily 245213 315 346 0.000192373 39.9274 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#11547 - CGI_10007332 superfamily 241583 410 512 2.71E-28 112.665 cl00064 ZnMc superfamily N - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#11547 - CGI_10007332 superfamily 241583 1 96 6.78E-23 96.8714 cl00064 ZnMc superfamily N - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. 
Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#11550 - CGI_10007335 superfamily 241802 11 125 4.19E-36 134.288 cl00342 Trp-synth-beta_II superfamily NC - "Tryptophan synthase beta superfamily (fold type II); this family of pyridoxal phosphate (PLP)-dependent enzymes catalyzes beta-replacement and beta-elimination reactions. This CD corresponds to aminocyclopropane-1-carboxylate deaminase (ACCD), tryptophan synthase beta chain (Trp-synth_B), cystathionine beta-synthase (CBS), O-acetylserine sulfhydrylase (CS), serine dehydratase (Ser-dehyd), threonine dehydratase (Thr-dehyd), diaminopropionate ammonia lyase (DAL), and threonine synthase (Thr-synth). ACCD catalyzes the conversion of 1-aminocyclopropane-1-carboxylate to alpha-ketobutyrate and ammonia. Tryptophan synthase folds into a tetramer, where the beta chain is the catalytic PLP-binding subunit and catalyzes the formation of L-tryptophan from indole and L-serine. CBS is a tetrameric hemeprotein that catalyzes condensation of serine and homocysteine to cystathionine. CS is a homodimer that catalyzes the formation of L-cysteine from O-acetyl-L-serine. Ser-dehyd catalyzes the conversion of L- or D-serine to pyruvate and ammonia. Thr-dehyd is active as a homodimer and catalyzes the conversion of L-threonine to 2-oxobutanoate and ammonia. DAL is also a homodimer and catalyzes the alpha, beta-elimination reaction of both L- and D-alpha, beta-diaminopropionate to form pyruvate and ammonia. Thr-synth catalyzes the formation of threonine and inorganic phosphate from O-phosphohomoserine." Q#11551 - CGI_10007336 superfamily 241802 2 462 0 529.118 cl00342 Trp-synth-beta_II superfamily - - "Tryptophan synthase beta superfamily (fold type II); this family of pyridoxal phosphate (PLP)-dependent enzymes catalyzes beta-replacement and beta-elimination reactions. 
This CD corresponds to aminocyclopropane-1-carboxylate deaminase (ACCD), tryptophan synthase beta chain (Trp-synth_B), cystathionine beta-synthase (CBS), O-acetylserine sulfhydrylase (CS), serine dehydratase (Ser-dehyd), threonine dehydratase (Thr-dehyd), diaminopropionate ammonia lyase (DAL), and threonine synthase (Thr-synth). ACCD catalyzes the conversion of 1-aminocyclopropane-1-carboxylate to alpha-ketobutyrate and ammonia. Tryptophan synthase folds into a tetramer, where the beta chain is the catalytic PLP-binding subunit and catalyzes the formation of L-tryptophan from indole and L-serine. CBS is a tetrameric hemeprotein that catalyzes condensation of serine and homocysteine to cystathionine. CS is a homodimer that catalyzes the formation of L-cysteine from O-acetyl-L-serine. Ser-dehyd catalyzes the conversion of L- or D-serine to pyruvate and ammonia. Thr-dehyd is active as a homodimer and catalyzes the conversion of L-threonine to 2-oxobutanoate and ammonia. DAL is also a homodimer and catalyzes the alpha, beta-elimination reaction of both L- and D-alpha, beta-diaminopropionate to form pyruvate and ammonia. Thr-synth catalyzes the formation of threonine and inorganic phosphate from O-phosphohomoserine." Q#11552 - CGI_10007337 superfamily 219732 472 633 6.35E-50 177.022 cl06969 NUC173 superfamily - - NUC173 domain; This is the central domain of of novel family of hypothetical nucleolar proteins. Q#11553 - CGI_10014061 superfamily 247676 1 93 2.31E-30 108.467 cl17012 GINS_A superfamily N - "Alpha-helical domain of GINS complex proteins; Sld5, Psf1, Psf2 and Psf3; The GINS complex is involved in both initiation and elongation stages of eukaryotic chromosome replication, with GINS being the component that most likely serves as the replicative helicase that unwinds duplex DNA ahead of the moving replication fork. In eukaryotes, GINS is a tetrameric arrangement of four subunits Sld5, Psf1, Psf2 and Psf3. 
The GINS complex has been found in eukaryotes and archaea, but not in bacteria. The four subunits of the complex are homologous and consist of two domains each, termed the alpha-helical (A) and beta-strand (B) domains. The A and B domains of Sld5/Psf1 are permuted with respect to Psf1/Psf3." Q#11555 - CGI_10014063 superfamily 243267 7 351 7.19E-49 170.872 cl03000 Innexin superfamily - - "Innexin; This family includes the Drosophila proteins Ogre and shaking-B, and the C. elegans proteins Unc-7 and Unc-9. Members of this family are integral membrane proteins which are involved in the formation of gap junctions. This family has been named the Innexins." Q#11556 - CGI_10014064 superfamily 243267 126 257 6.15E-18 81.8912 cl03000 Innexin superfamily N - "Innexin; This family includes the Drosophila proteins Ogre and shaking-B, and the C. elegans proteins Unc-7 and Unc-9. Members of this family are integral membrane proteins which are involved in the formation of gap junctions. This family has been named the Innexins." Q#11556 - CGI_10014064 superfamily 243267 5 111 3.11E-09 56.0828 cl03000 Innexin superfamily C - "Innexin; This family includes the Drosophila proteins Ogre and shaking-B, and the C. elegans proteins Unc-7 and Unc-9. Members of this family are integral membrane proteins which are involved in the formation of gap junctions. This family has been named the Innexins." Q#11562 - CGI_10014070 superfamily 247794 1 260 6.62E-129 371.342 cl17240 FDH_GDH_like superfamily - - "Formate/glycerate dehydrogenases, D-specific 2-hydroxy acid dehydrogenases and related dehydrogenases; The formate/glycerate dehydrogenase like family contains a diverse group of enzymes such as formate dehydrogenase (FDH), glycerate dehydrogenase (GDH), D-lactate dehydrogenase, L-alanine dehydrogenase, and S-Adenosylhomocysteine hydrolase, that share a common 2-domain structure. 
Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar domains of the alpha/beta Rossmann fold NAD+ binding form. The NAD(P) binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD(P) is bound, primarily to the C-terminal portion of the 2nd (internal) domain. While many members of this family are dimeric, alanine DH is hexameric and phosphoglycerate DH is tetrameric. 2-hydroxyacid dehydrogenases are enzymes that catalyze the conversion of a wide variety of D-2-hydroxy acids to their corresponding keto acids. The general mechanism is (R)-lactate + acceptor to pyruvate + reduced acceptor. Formate dehydrogenase (FDH) catalyzes the NAD+-dependent oxidation of formate ion to carbon dioxide with the concomitant reduction of NAD+ to NADH. FDHs of this family contain no metal ions or prosthetic groups. Catalysis occurs through direct transfer of a hydride ion to NAD+ without the stages of acid-base catalysis typically found in related dehydrogenases." Q#11563 - CGI_10014071 superfamily 241764 25 88 1.36E-05 41.8179 cl00299 MIT superfamily - - "MIT: domain contained within Microtubule Interacting and Trafficking molecules. The MIT domain is found in sorting nexins, the nuclear thiol protease PalBH, the AAA protein spastin and archaebacterial proteins with similar domain architecture, vacuolar sorting proteins and others. The molecular function of the MIT domain is unclear." Q#11564 - CGI_10014072 superfamily 217293 51 240 9.03E-35 128.905 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. 
Q#11565 - CGI_10014073 superfamily 247637 5 328 1.45E-151 432.072 cl16912 MDR superfamily - - "Medium chain reductase/dehydrogenase (MDR)/zinc-dependent alcohol dehydrogenase-like family; The medium chain reductase/dehydrogenases (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH. MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P) binding-Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES. The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH) , quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones. ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines. Other MDR members have only a catalytic zinc, and some contain no coordinated zinc." Q#11568 - CGI_10014076 superfamily 110440 408 435 0.00433506 35.0761 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. 
The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase); proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#11571 - CGI_10014079 superfamily 245936 207 417 1.35E-54 181.594 cl12283 IPK superfamily - - Inositol polyphosphate kinase; ArgRIII has been demonstrated to be an inositol polyphosphate kinase. Q#11572 - CGI_10014080 superfamily 217584 37 156 3.21E-27 103.591 cl04100 MOSC_N superfamily - - "MOSC N-terminal beta barrel domain; This domain is found to the N-terminus of pfam03473. The function of this domain is unknown, however it is predicted to adopt a beta barrel fold." Q#11572 - CGI_10014080 superfamily 217583 186 320 6.18E-11 58.5283 cl04097 MOSC superfamily - - "MOSC domain; The MOSC (MOCO sulfurase C-terminal) domain is a superfamily of beta-strand-rich domains identified in the molybdenum cofactor sulfurase and several other proteins from both prokaryotes and eukaryotes. These MOSC domains contain an absolutely conserved cysteine and occur either as stand-alone forms, or fused to other domains such as NifS-like catalytic domain in Molybdenum cofactor sulfurase. The MOSC domain is predicted to be a sulfur-carrier domain that receives sulfur abstracted by the pyridoxal phosphate-dependent NifS-like enzymes, on its conserved cysteine, and delivers it for the formation of diverse sulfur-metal clusters." Q#11573 - CGI_10014081 superfamily 245201 301 514 9.69E-118 355.658 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. 
It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#11573 - CGI_10014081 superfamily 247725 23 130 4.50E-18 80.877 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#11573 - CGI_10014081 superfamily 245201 266 303 6.86E-14 71.7657 cl09925 PKc_like superfamily C - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#11573 - CGI_10014081 superfamily 245201 521 584 1.73E-08 55.2022 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. 
These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#11573 - CGI_10014081 superfamily 247736 167 219 0.000227053 40.4585 cl17182 NAT_SF superfamily C - "N-Acyltransferase superfamily: Various enzymes that characteristically catalyze the transfer of an acyl group to a substrate; NAT (N-Acyltransferase) is a large superfamily of enzymes that mostly catalyze the transfer of an acyl group to a substrate and are implicated in a variety of functions, ranging from bacterial antibiotic resistance to circadian rhythms in mammals. Members include GCN5-related N-Acetyltransferases (GNAT) such as Aminoglycoside N-acetyltransferases, Histone N-acetyltransferase (HAT) enzymes, and Serotonin N-acetyltransferase, which catalyze the transfer of an acetyl group to a substrate. The kinetic mechanism of most GNATs involves the ordered formation of a ternary complex: the reaction begins with Acetyl Coenzyme A (AcCoA) binding, followed by binding of substrate, then direct transfer of the acetyl group from AcCoA to the substrate, followed by product and subsequent CoA release. Other family members include Arginine/ornithine N-succinyltransferase, Myristoyl-CoA: protein N-myristoyltransferase, and Acyl-homoserinelactone synthase which have a similar catalytic mechanism but differ in types of acyl groups transferred. Leucyl/phenylalanyl-tRNA-protein transferase and FemXAB nonribosomal peptidyltransferases which catalyze similar peptidyltransferase reactions are also included." Q#11574 - CGI_10014082 superfamily 241971 106 381 7.61E-110 325.3 cl00599 Extradiol_Dioxygenase_3B_like superfamily - - "Subunit B of Class III Extradiol ring-cleavage dioxygenases; Dioxygenases catalyze the incorporation of both atoms of molecular oxygen into substrates using a variety of reaction mechanisms, resulting in the cleavage of aromatic rings. 
Two major groups of dioxygenases have been identified according to the cleavage site of the aromatic ring. Intradiol enzymes cleave the aromatic ring between two hydroxyl groups, whereas extradiol enzymes cleave the aromatic ring between a hydroxylated carbon and an adjacent non-hydroxylated carbon. Extradiol dioxygenases can be further divided into three classes. Class I and II enzymes are evolutionary related and show sequence similarity, with the two-domain class II enzymes evolving from the class I enzyme through gene duplication. Class III enzymes are different in sequence and structure and usually have two subunits, designated A and B. This model represents the catalytic subunit B of extradiol dioxygenase class III enzymes. Enzymes belonging to this family include Protocatechuate 4,5-dioxygenase (LigAB), 2'-aminobiphenyl-2,3-diol 1,2-dioxygenase (CarB), 4,5-DOPA Dioxygenase, 2,3-dihydroxyphenylpropionate 1,2-dioxygenase, and 3,4-dihydroxyphenylacetate (homoprotocatechuate) 2,3-dioxygenase (HPCD). There are also some family members that do not show the typical dioxygenase activity." Q#11574 - CGI_10014082 superfamily 241971 6 109 9.45E-43 151.575 cl00599 Extradiol_Dioxygenase_3B_like superfamily C - "Subunit B of Class III Extradiol ring-cleavage dioxygenases; Dioxygenases catalyze the incorporation of both atoms of molecular oxygen into substrates using a variety of reaction mechanisms, resulting in the cleavage of aromatic rings. Two major groups of dioxygenases have been identified according to the cleavage site of the aromatic ring. Intradiol enzymes cleave the aromatic ring between two hydroxyl groups, whereas extradiol enzymes cleave the aromatic ring between a hydroxylated carbon and an adjacent non-hydroxylated carbon. Extradiol dioxygenases can be further divided into three classes. 
Class I and II enzymes are evolutionary related and show sequence similarity, with the two-domain class II enzymes evolving from the class I enzyme through gene duplication. Class III enzymes are different in sequence and structure and usually have two subunits, designated A and B. This model represents the catalytic subunit B of extradiol dioxygenase class III enzymes. Enzymes belonging to this family include Protocatechuate 4,5-dioxygenase (LigAB), 2'-aminobiphenyl-2,3-diol 1,2-dioxygenase (CarB), 4,5-DOPA Dioxygenase, 2,3-dihydroxyphenylpropionate 1,2-dioxygenase, and 3,4-dihydroxyphenylacetate (homoprotocatechuate) 2,3-dioxygenase (HPCD). There are also some family members that do not show the typical dioxygenase activity." Q#11576 - CGI_10014084 superfamily 243072 326 450 1.92E-28 110.166 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#11576 - CGI_10014084 superfamily 243072 256 384 4.14E-26 103.617 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#11576 - CGI_10014084 superfamily 243072 191 318 1.61E-22 93.6022 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#11576 - CGI_10014084 superfamily 243072 123 241 1.29E-20 88.2094 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#11577 - CGI_10014085 superfamily 241872 30 172 1.37E-20 86.4366 cl00453 CDP-OH_P_transf superfamily N - CDP-alcohol phosphatidyltransferase; All of these members have the ability to catalyze the displacement of CMP from a CDP-alcohol by a second alcohol with formation of a phosphodiester bond and concomitant breaking of a phosphoride anhydride bond. Q#11580 - CGI_10007098 superfamily 218493 466 613 1.43E-45 159.445 cl08434 GMC_oxred_C superfamily - - GMC oxidoreductase; This domain found associated with pfam00732. Q#11580 - CGI_10007098 superfamily 248097 628 750 1.49E-14 71.1422 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. 
Q#11580 - CGI_10007098 superfamily 180442 263 323 0.00249391 39.6677 cl18106 PRK06175 superfamily NC - L-aspartate oxidase; Provisional Q#11581 - CGI_10007099 superfamily 150464 26 134 7.86E-21 82.297 cl10773 Transmemb_17 superfamily - - Predicted membrane protein; This is a 100 amino acid region of a family of proteins conserved from nematodes to humans. It is predicted to be a transmembrane region but its function is not known. Q#11583 - CGI_10008805 superfamily 202367 4 183 1.37E-51 170.028 cl18226 3HCDH_N superfamily - - "3-hydroxyacyl-CoA dehydrogenase, NAD binding domain; This family also includes lambda crystallin." Q#11583 - CGI_10008805 superfamily 216084 188 256 1.10E-13 65.3057 cl08285 3HCDH superfamily C - "3-hydroxyacyl-CoA dehydrogenase, C-terminal domain; This family also includes lambda crystallin. Some proteins include two copies of this domain." Q#11584 - CGI_10008806 superfamily 243099 578 683 1.47E-06 47.3276 cl02575 Bcl-2_like superfamily N - "Apoptosis regulator proteins of the Bcl-2 family, named after B-cell lymphoma 2. This alignment model spans what have been described as Bcl-2 homology regions BH1, BH2, BH3, and BH4. Many members of this family have an additional C-terminal transmembrane segment. Some homologous proteins, which are not included in this model, may miss either the BH4 (Bax, Bak) or the BH2 (Bcl-X(S)) region, and some appear to only share the BH3 region (Bik, Bim, Bad, Bid, Egl-1). This family is involved in the regulation of the outer mitochondrial membrane's permeability and in promoting or preventing the release of apoptogenic factors, which in turn may trigger apoptosis by activating caspases. Bcl-2 and the closely related Bcl-X(L) are anti-apoptotic key regulators of programmed cell death. They are assumed to function via heterodimeric protein-protein interactions, binding pro-apoptotic proteins such as Bad (BCL2-antagonist of cell death), Bid, and Bim, by specifically interacting with their BH3 regions. 
Interfering with this heterodimeric interaction via small-molecule inhibitors may prove effective in targeting various cancers. This family also includes the Caenorhabditis elegans Bcl-2 homolog CED-9, which binds to CED-4, the C. elegans homolog of mammalian Apaf-1. Apaf-1, however, does not seem to be inhibited by Bcl-2 directly." Q#11586 - CGI_10008808 superfamily 243099 466 593 1.29E-33 125.523 cl02575 Bcl-2_like superfamily - - "Apoptosis regulator proteins of the Bcl-2 family, named after B-cell lymphoma 2. This alignment model spans what have been described as Bcl-2 homology regions BH1, BH2, BH3, and BH4. Many members of this family have an additional C-terminal transmembrane segment. Some homologous proteins, which are not included in this model, may miss either the BH4 (Bax, Bak) or the BH2 (Bcl-X(S)) region, and some appear to only share the BH3 region (Bik, Bim, Bad, Bid, Egl-1). This family is involved in the regulation of the outer mitochondrial membrane's permeability and in promoting or preventing the release of apoptogenic factors, which in turn may trigger apoptosis by activating caspases. Bcl-2 and the closely related Bcl-X(L) are anti-apoptotic key regulators of programmed cell death. They are assumed to function via heterodimeric protein-protein interactions, binding pro-apoptotic proteins such as Bad (BCL2-antagonist of cell death), Bid, and Bim, by specifically interacting with their BH3 regions. Interfering with this heterodimeric interaction via small-molecule inhibitors may prove effective in targeting various cancers. This family also includes the Caenorhabditis elegans Bcl-2 homolog CED-9, which binds to CED-4, the C. elegans homolog of mammalian Apaf-1. Apaf-1, however, does not seem to be inhibited by Bcl-2 directly." Q#11586 - CGI_10008808 superfamily 243099 288 389 2.76E-08 52.3352 cl02575 Bcl-2_like superfamily N - "Apoptosis regulator proteins of the Bcl-2 family, named after B-cell lymphoma 2. 
This alignment model spans what have been described as Bcl-2 homology regions BH1, BH2, BH3, and BH4. Many members of this family have an additional C-terminal transmembrane segment. Some homologous proteins, which are not included in this model, may miss either the BH4 (Bax, Bak) or the BH2 (Bcl-X(S)) region, and some appear to only share the BH3 region (Bik, Bim, Bad, Bid, Egl-1). This family is involved in the regulation of the outer mitochondrial membrane's permeability and in promoting or preventing the release of apoptogenic factors, which in turn may trigger apoptosis by activating caspases. Bcl-2 and the closely related Bcl-X(L) are anti-apoptotic key regulators of programmed cell death. They are assumed to function via heterodimeric protein-protein interactions, binding pro-apoptotic proteins such as Bad (BCL2-antagonist of cell death), Bid, and Bim, by specifically interacting with their BH3 regions. Interfering with this heterodimeric interaction via small-molecule inhibitors may prove effective in targeting various cancers. This family also includes the Caenorhabditis elegans Bcl-2 homolog CED-9, which binds to CED-4, the C. elegans homolog of mammalian Apaf-1. Apaf-1, however, does not seem to be inhibited by Bcl-2 directly." Q#11587 - CGI_10008809 superfamily 246671 61 154 4.09E-10 57.4328 cl14606 Reeler_cohesin_like superfamily N - "Domains similar to the eukaryotic reeler domain and bacterial cohesins; This diverse family summarizes a set of distantly related domains, as revealed by structural similarity." Q#11588 - CGI_10008810 superfamily 241884 7 206 5.31E-134 379.712 cl00467 Ntn_hydrolase superfamily - - "The Ntn hydrolases (N-terminal nucleophile) are a diverse superfamily of enzymes that are activated autocatalytically via an N-terminally located nucleophilic amino acid. N-terminal nucleophile (NTN-) hydrolase superfamily, which contains a four-layered alpha, beta, beta, alpha core structure. 
This family of hydrolases includes penicillin acylase, the 20S proteasome alpha and beta subunits, and glutamate synthase. The mechanism of activation of these proteins is conserved, although they differ in their substrate specificities. All known members catalyze the hydrolysis of amide bonds in either proteins or small molecules, and each one of them is synthesized as a preprotein. For each, an autocatalytic endoproteolytic process generates a new N-terminal residue. This mature N-terminal residue is central to catalysis and acts as both a polarizing base and a nucleophile during the reaction. The N-terminal amino group acts as the proton acceptor and activates either the nucleophilic hydroxyl in a Ser or Thr residue or the nucleophilic thiol in a Cys residue. The position of the N-terminal nucleophile in the active site and the mechanism of catalysis are conserved in this family, despite considerable variation in the protein sequences." Q#11589 - CGI_10008811 superfamily 221323 323 501 1.29E-86 266.821 cl13383 DUF3449 superfamily - - Domain of unknown function (DUF3449); This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 181 to 207 amino acids in length. This domain has two conserved sequence motifs: PIP and CEICG. The domain carries a zinc-finger domain of the C2H2-type. Q#11589 - CGI_10008811 superfamily 205477 244 303 3.51E-29 109.155 cl16217 Telomere_Sde2_2 superfamily - - "Telomere stability C-terminal; This short C-terminal domain is found in higher eukaryotes further downstream from the Sde2 family, pfam13019. It is found in all Sde2-related proteins except those from fission yeast, fly, and mosquito. Its exact function in telomere formation and maintenance has not yet been established." Q#11589 - CGI_10008811 superfamily 192937 74 100 4.45E-07 46.7375 cl13534 SF3a60_bindingd superfamily - - "Splicing factor SF3a60 binding domain; This domain is found in eukaryotes. 
This domain is about 30 amino acids in length. This domain has a single completely conserved residue Y that may be functionally important. SF3a60 makes up the SF3a complex with SF3a66 and SF3a120. This domain is the binding site of SF3a60 for SF3a120. The SF3a complex is part of the spliceosome, a protein complex involved in splicing mRNA after transcription." Q#11590 - CGI_10008812 superfamily 243072 492 603 1.30E-32 123.263 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#11591 - CGI_10008813 superfamily 243066 268 359 2.57E-21 86.8381 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." 
Q#11592 - CGI_10008814 superfamily 242898 40 135 4.43E-22 85.6409 cl02130 Got1 superfamily - - "Got1/Sft2-like family; Traffic through the yeast Golgi complex depends on a member of the syntaxin family of SNARE proteins, Sed5, present in early Golgi cisternae. Got1 is thought to facilitate Sed5-dependent fusion events. This is a family of sequences derived from eukaryotic proteins. They are similar to a region of a SNARE-like protein required for traffic through the Golgi complex, SFT2 protein. This is a conserved protein with four putative transmembrane helices, thought to be involved in vesicular transport in later Golgi compartments." Q#11593 - CGI_10008815 superfamily 247724 43 249 1.79E-135 391.631 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." 
Q#11593 - CGI_10008815 superfamily 243185 250 362 1.12E-63 203.902 cl02787 Translation_Factor_II_like superfamily - - "Translation_Factor_II_like: Elongation factor Tu (EF-Tu) domain II-like proteins. Elongation factor Tu consists of three structural domains, this family represents the second domain. Domain II adopts a beta barrel structure and is involved in binding to charged tRNA. Domain II is found in other proteins such as elongation factor G and translation initiation factor IF-2. This group also includes the C2 subdomain of domain IV of IF-2 that has the same fold as domain II of (EF-Tu). Like IF-2 from certain prokaryotes such as Thermus thermophilus, mitochondrial IF-2 lacks domain II, which is thought to be involved in binding of E.coli IF-2 to 30S subunits." Q#11593 - CGI_10008815 superfamily 150009 369 460 3.74E-45 153.888 cl08529 eIF2_C superfamily - - "Initiation factor eIF2 gamma, C terminal; Members of this family, which are found in the initiation factors eIF2 and EF-Tu, adopt a structure consisting of a beta barrel with Greek key topology. They are required for formation of the ternary complex with GTP and initiator tRNA." Q#11594 - CGI_10008816 superfamily 245232 7 197 4.25E-64 211.116 cl10020 S2P-M50 superfamily C - "Site-2 protease (S2P) class of zinc metalloproteases (MEROPS family M50) cleaves transmembrane domains of substrate proteins, regulating intramembrane proteolysis (RIP) of diverse signal transduction mechanisms. Members of this family use proteolytic activity within the membrane to transfer information across membranes to integrate gene expression with physiologic stresses occurring in another cellular compartment. The domain core structure appears to contain at least three transmembrane helices with a catalytic zinc atom coordinated by three conserved residues contained within the consensus sequence HExxH, together with a conserved aspartate residue. 
The S2P/M50 family of RIP proteases is widely distributed; in eukaryotic cells, they regulate such processes as sterol and lipid metabolism, and endoplasmic reticulum (ER) stress responses. In sterol-depleted mammalian cells, a two-step proteolytic process releases the N-terminal domains of sterol regulatory element-binding proteins (SREBPs) from membranes of the ER. These domains translocate into the nucleus, where they activate genes of cholesterol and fatty acid biosynthesis. It is the second proteolytic step that is carried out by the SREBP Site-2 protease (S2P) which is present in this CD superfamily. Prokaryotic S2P/M50 homologs have been shown to regulate stress responses, sporulation, cell division, and cell differentiation. In Escherichia coli, the S2P homolog RseP is involved in the sigmaE pathway of extracytoplasmic stress responses, and in Bacillus subtilis, the S2P homolog SpoIVFB is involved in the pro-sigmaK pathway of spore formation. Some of the subfamilies within this hierarchy contain one or two PDZ domain insertions, with putative regulatory roles, such as the inhibition of substrate cleavage as seen by the RseP PDZ domain." Q#11594 - CGI_10008816 superfamily 245232 409 477 2.25E-19 87.0818 cl10020 S2P-M50 superfamily N - "Site-2 protease (S2P) class of zinc metalloproteases (MEROPS family M50) cleaves transmembrane domains of substrate proteins, regulating intramembrane proteolysis (RIP) of diverse signal transduction mechanisms. Members of this family use proteolytic activity within the membrane to transfer information across membranes to integrate gene expression with physiologic stresses occurring in another cellular compartment. The domain core structure appears to contain at least three transmembrane helices with a catalytic zinc atom coordinated by three conserved residues contained within the consensus sequence HExxH, together with a conserved aspartate residue. 
The S2P/M50 family of RIP proteases is widely distributed; in eukaryotic cells, they regulate such processes as sterol and lipid metabolism, and endoplasmic reticulum (ER) stress responses. In sterol-depleted mammalian cells, a two-step proteolytic process releases the N-terminal domains of sterol regulatory element-binding proteins (SREBPs) from membranes of the ER. These domains translocate into the nucleus, where they activate genes of cholesterol and fatty acid biosynthesis. It is the second proteolytic step that is carried out by the SREBP Site-2 protease (S2P) which is present in this CD superfamily. Prokaryotic S2P/M50 homologs have been shown to regulate stress responses, sporulation, cell division, and cell differentiation. In Escherichia coli, the S2P homolog RseP is involved in the sigmaE pathway of extracytoplasmic stress responses, and in Bacillus subtilis, the S2P homolog SpoIVFB is involved in the pro-sigmaK pathway of spore formation. Some of the subfamilies within this hierarchy contain one or two PDZ domain insertions, with putative regulatory roles, such as the inhibition of substrate cleavage as seen by the RseP PDZ domain." Q#11597 - CGI_10008819 superfamily 241609 26 108 2.30E-18 73.9515 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#11597 - CGI_10008819 superfamily 241609 4 24 1.26E-07 44.6763 cl00100 KR superfamily N - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. 
They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#11599 - CGI_10011322 superfamily 245201 690 884 4.37E-26 109.569 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#11600 - CGI_10011323 superfamily 177822 62 173 0.000443865 38.7477 cl18088 PLN02164 superfamily N - sulfotransferase Q#11601 - CGI_10011324 superfamily 241547 46 244 2.01E-48 161.297 cl00012 alpha_CA superfamily - - "Carbonic anhydrase alpha (vertebrate-like) group. Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionary distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidine residues and a fourth conserved histidine plays a potential role in proton transfer." 
Q#11602 - CGI_10011325 superfamily 243109 9 65 0.00176935 36.4106 cl02614 SPRY superfamily C - "SPRY domain; SPRY domains, first identified in the SP1A kinase of Dictyostelium and rabbit Ryanodine receptor (hence the name), are homologous to B30.2. SPRY domains have been identified in at least 11 protein families, covering a wide range of functions, including regulation of cytokine signaling (SOCS), RNA metabolism (DDX1 and hnRNP), immunity to retroviruses (TRIM5alpha), intracellular calcium release (ryanodine receptors or RyR) and regulatory and developmental processes (HERC1 and Ash2L). B30.2 also contains residues in the N-terminus that form a distinct PRY domain structure; i.e. B30.2 domain consists of PRY and SPRY subdomains. B30.2 domains comprise the C-terminus of three protein families: BTNs (receptor glycoproteins of immunoglobulin superfamily); several TRIM proteins (composed of RING/B-box/coiled-coil or RBCC core); Stonutoxin (secreted poisonous protein of the stonefish Synanceia horrida). While SPRY domains are evolutionarily ancient, B30.2 domains are a more recent adaptation where the SPRY/PRY combination is a possible component of immune defense. Mutations found in the SPRY-containing proteins have shown to cause Mediterranean fever and Opitz syndrome." 
Q#11603 - CGI_10011326 superfamily 247792 26 70 1.91E-05 42.818 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#11603 - CGI_10011326 superfamily 128778 213 339 3.86E-06 45.7187 cl17972 BBC superfamily - - B-Box C-terminal domain; Coiled coil region C-terminal to (some) B-Box domains Q#11604 - CGI_10011327 superfamily 247792 61 104 3.44E-06 45.1292 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#11604 - CGI_10011327 superfamily 128778 251 373 1.79E-05 43.7927 cl17972 BBC superfamily - - B-Box C-terminal domain; Coiled coil region C-terminal to (some) B-Box domains Q#11604 - CGI_10011327 superfamily 241563 195 239 0.000227382 39.77 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil 
or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#11604 - CGI_10011327 superfamily 243109 557 648 0.000432866 40.3653 cl02614 SPRY superfamily C - "SPRY domain; SPRY domains, first identified in the SP1A kinase of Dictyostelium and rabbit Ryanodine receptor (hence the name), are homologous to B30.2. SPRY domains have been identified in at least 11 protein families, covering a wide range of functions, including regulation of cytokine signaling (SOCS), RNA metabolism (DDX1 and hnRNP), immunity to retroviruses (TRIM5alpha), intracellular calcium release (ryanodine receptors or RyR) and regulatory and developmental processes (HERC1 and Ash2L). B30.2 also contains residues in the N-terminus that form a distinct PRY domain structure; i.e. B30.2 domain consists of PRY and SPRY subdomains. B30.2 domains comprise the C-terminus of three protein families: BTNs (receptor glycoproteins of immunoglobulin superfamily); several TRIM proteins (composed of RING/B-box/coiled-coil or RBCC core); Stonutoxin (secreted poisonous protein of the stonefish Synanceia horrida). While SPRY domains are evolutionarily ancient, B30.2 domains are a more recent adaptation where the SPRY/PRY combination is a possible component of immune defense. Mutations found in the SPRY-containing proteins have shown to cause Mediterranean fever and Opitz syndrome." Q#11604 - CGI_10011327 superfamily 216033 410 501 0.000960857 38.0836 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#11605 - CGI_10011328 superfamily 219502 363 581 3.14E-79 251.98 cl06625 Nucleos_tra2_C superfamily - - Na+ dependent nucleoside transporter C-terminus; This family consists of nucleoside transport proteins. Rat CNT 2 is a purine-specific Na+-nucleoside cotransporter localised to the bile canalicular membrane. 
CNT 1 is a Na+-dependent nucleoside transporter selective for pyrimidine nucleosides and adenosine; it also transports the anti-viral nucleoside analogues AZT and ddC. This alignment covers the C-terminus of this family of transporters. Q#11605 - CGI_10011328 superfamily 201962 181 252 4.01E-21 88.2016 cl03347 Nucleos_tra2_N superfamily - - Na+ dependent nucleoside transporter N-terminus; This family consists of nucleoside transport proteins. Rat CNT 2 is a purine-specific Na+-nucleoside cotransporter localised to the bile canalicular membrane. Rat CNT 1 is a Na+-dependent nucleoside transporter selective for pyrimidine nucleosides and adenosine; it also transports the anti-viral nucleoside analogues AZT and ddC. This alignment covers the N terminus of this family Q#11605 - CGI_10011328 superfamily 219507 260 358 2.80E-12 63.7975 cl18514 Gate superfamily - - "Nucleoside recognition; This region in the nucleoside transporter proteins is responsible for determining nucleoside specificity in the human CNT1 and CNT2 proteins. In the FeoB proteins, which are believed to be Fe2+ transporters, it includes the membrane pore region, so the function of this region is likely to be more general than just nucleoside specificity. This family may represent the pore and gate, with a wide potential range of specificity. Hence its name 'Gate'." Q#11606 - CGI_10011329 superfamily 245226 296 465 4.90E-23 95.8304 cl10012 DnaQ_like_exo superfamily - - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. 
The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#11608 - CGI_10011331 superfamily 216686 278 454 3.88E-49 169.041 cl18377 Galactosyl_T superfamily - - "Galactosyltransferase; This family includes the galactosyltransferases UDP-galactose:2-acetamido-2-deoxy-D-glucose3beta-galactosyltransferase and UDP-Gal:beta-GlcNAc beta 1,3-galactosyltranferase. Specific galactosyltransferases transfer galactose to GlcNAc terminal chains in the synthesis of the lacto-series oligosaccharides types 1 and 2." Q#11609 - CGI_10005366 superfamily 109891 28 160 1.64E-72 229.685 cl02989 Runt superfamily - - Runt domain; Runt domain. Q#11611 - CGI_10005368 superfamily 241571 177 293 3.01E-08 50.1035 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#11611 - CGI_10005368 superfamily 241583 15 172 4.43E-31 115.746 cl00064 ZnMc superfamily - - "Zinc-dependent metalloprotease. 
This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#11612 - CGI_10013251 superfamily 243072 98 224 8.06E-20 83.2018 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#11612 - CGI_10013251 superfamily 243072 30 154 6.90E-18 77.809 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#11612 - CGI_10013251 superfamily 243072 165 300 1.12E-06 45.8374 cl02529 ANK superfamily C - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#11615 - CGI_10013254 superfamily 220003 608 873 1.41E-106 331.923 cl07388 Gamma-COP superfamily - - "Coatomer gamma subunit appendage domain; COPI-coated vesicles function in retrograde transport from the Golgi to the ER, and in intra-Golgi transport. This domain corresponds to the coatomer gamma subunit appendage domain. It contains a protein-protein interaction site and a second proposed binding site that interacts with the alpha, beta,epsilon COPI subcomplex." Q#11616 - CGI_10013255 superfamily 243119 75 116 0.000652323 36.6429 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#11617 - CGI_10013256 superfamily 243119 35 74 1.72E-06 45.1273 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. 
Q#11617 - CGI_10013256 superfamily 243119 270 318 0.000437874 38.1837 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#11617 - CGI_10013256 superfamily 243119 331 380 0.00388348 35.4974 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#11617 - CGI_10013256 superfamily 243119 91 140 0.00651294 34.727 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. 
Q#11619 - CGI_10013258 superfamily 243119 5 54 0.000131731 38.1837 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#11620 - CGI_10013259 superfamily 243119 32 73 6.32E-06 42.0357 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#11621 - CGI_10013260 superfamily 245847 31 112 6.01E-08 47.1662 cl12042 FA58C superfamily N - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#11622 - CGI_10013261 superfamily 245040 42 99 2.70E-06 41.5183 cl09238 CY superfamily C - "Cystatin-like domain; Cystatins are a family of cysteine protease inhibitors that occur mainly as single domain proteins. However some extracellular proteins such as kininogen, His-rich glycoprotein and fetuin also contain these domains." 
Q#11623 - CGI_10013262 superfamily 241574 388 613 1.70E-92 288.33 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#11624 - CGI_10013263 superfamily 241752 112 231 2.77E-35 123.198 cl00283 ADP_ribosyl superfamily - - "ADP_ribosylating enzymes catalyze the transfer of ADP_ribose from NAD+ to substrates. Bacterial toxins are cytoplasmic and catalyze the transfer of a single ADP_ribose unit to eukaryotic elongation factor 2, halting protein synthesis and killing the cell. Poly(ADP-ribose) polymerases (PARPS 1-3, VPARP, tankyrase) catalyze the addition of up to 100 ADP_ribose units from NAD+. PARPs 1 and 2 are localized in the nucleus, bind DNA, and are activated by DNA damage. VPARP is part of the vault ribonucleoprotein complex. Tankyrases regulate telomere length in part through poly(ADP_ribosylation) of telomere repeat binding factor 1 (TRF1). Poly(ADP-ribose) polymerase catalyses the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active." 
Q#11625 - CGI_10013264 superfamily 241563 62 100 2.21E-06 44.7776 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#11628 - CGI_10013269 superfamily 241563 60 96 4.90E-05 41.1188 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#11628 - CGI_10013269 superfamily 110440 521 548 0.00885849 34.3057 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#11629 - CGI_10013270 superfamily 207662 104 177 1.38E-30 113.811 cl02596 NR_DBD_like superfamily - - "DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers; DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. It interacts with a specific DNA site upstream of the target gene and modulates the rate of transcriptional initiation. 
Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions, from development, reproduction, to homeostasis and metabolism in animals (metazoans). The family contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). Most nuclear receptors bind as homodimers or heterodimers to their target sites, which consist of two hexameric half-sites. Specificity is determined by the half-site sequence, the relative orientation of the half-sites and the number of spacer nucleotides between the half-sites. However, a growing number of nuclear receptors have been reported to bind to DNA as monomers." Q#11629 - CGI_10013270 superfamily 245599 333 506 1.08E-22 94.9822 cl11397 NR_LBD superfamily - - "The ligand binding domain of nuclear receptors, a family of ligand-activated transcription regulators; Ligand-binding domain (LBD) of nuclear receptor (NR): Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions in metazoans, from development, reproduction, to homeostasis and metabolism. The superfamily contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. The members of the family include receptors of steroids, thyroid hormone, retinoids, cholesterol by-products, lipids and heme. With few exceptions, NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD)." 
Q#11630 - CGI_10013271 superfamily 241659 20 86 3.42E-23 89.576 cl00175 alpha-crystallin-Hsps_p23-like superfamily - - "alpha-crystallin domain (ACD) found in alpha-crystallin-type small heat shock proteins, and a similar domain found in p23 (a cochaperone for Hsp90) and in other p23-like proteins.; The alpha-crystallin-Hsps_p23-like superfamily includes the alpha-crystallin domain (ACD) of alpha-crystallin-type small heat shock proteins (sHsps) and a similar domain found in p23-like proteins. sHsps are small stress induced proteins with monomeric masses between 12-43 kDa, whose common feature is this ACD. sHsps are generally active as large oligomers consisting of multiple subunits, and are believed to be ATP-independent chaperones that prevent aggregation and are important in refolding in combination with other Hsps. p23 is a cochaperone of the Hsp90 chaperoning pathway. It binds Hsp90 and participates in the folding of a number of Hsp90 clients including the progesterone receptor. p23 also has a passive chaperoning activity. p23 in addition may act as the cytosolic prostaglandin E2 synthase. Included in this superfamily is the p23-like C-terminal CHORD-SGT1 (CS) domain of suppressor of G2 allele of Skp1 (Sgt1) and the p23-like domains of human butyrate-induced transcript 1 (hB-ind1), NUD (nuclear distribution) C, Melusin, and NAD(P)H cytochrome b5 (NCB5) oxidoreductase (OR)." Q#11631 - CGI_10013272 superfamily 190308 140 287 1.96E-09 56.5583 cl18163 Fringe superfamily C - "Fringe-like; The drosophila protein fringe (FNG) is a glucosaminyltransferase that controls the response of the Notch receptor to specific ligands. FNG is localised to the Golgi apparatus (not secreted as previously thought). Modification of Notch occurs through glycosylation by FNG. The xenopus homologue, lunatic fringe, has been implicated in a variety of functions." 
Q#11632 - CGI_10001796 superfamily 247057 304 365 1.30E-07 48.3012 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#11633 - CGI_10001797 superfamily 247725 25 113 4.46E-33 125.445 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#11634 - CGI_10001798 superfamily 215825 5 106 6.80E-33 120.485 cl02828 Calreticulin superfamily N - Calreticulin family; Calreticulin family. Q#11636 - CGI_10026148 superfamily 242173 47 186 9.63E-15 67.2771 cl00891 Cu-Zn_Superoxide_Dismutase superfamily - - "Copper/zinc superoxide dismutase (SOD). superoxide dismutases catalyse the conversion of superoxide radicals to molecular oxygen. Three evolutionarily distinct families of SODs are known, of which the copper/zinc-binding family is one. 
Defects in the human SOD1 gene cause familial amyotrophic lateral sclerosis (Lou Gehrig's disease). Cytoplasmic and periplasmic SODs exist as dimers, whereas chloroplastic and extracellular enzymes exist as tetramers. Structure supports independent functional evolution in prokaryotes (P-class) and eukaryotes (E-class) [PMID:8176730]." Q#11640 - CGI_10026152 superfamily 241564 95 163 4.41E-21 82.6987 cl00035 BIR superfamily - - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." Q#11641 - CGI_10026153 superfamily 245847 346 491 1.39E-11 62.1889 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#11641 - CGI_10026153 superfamily 216939 9 66 5.39E-11 58.8285 cl03492 PC4 superfamily N - Transcriptional Coactivator p15 (PC4); p15 has a bipartite structure composed of an amino-terminal regulatory domain and a carboxy-terminal cryptic DNA-binding domain. The DNA-binding activity of the carboxy-terminal is disguised by the amino-terminal p15 domain. Activity is controlled by protein kinases that target the regulatory domain. 
Q#11641 - CGI_10026153 superfamily 241619 117 163 0.00518663 35.2505 cl00112 PAN_APPLE superfamily C - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#11642 - CGI_10026155 superfamily 241613 234 268 1.63E-11 62.2241 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#11642 - CGI_10026155 superfamily 241613 274 308 1.63E-11 62.2241 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related 
receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#11642 - CGI_10026155 superfamily 241613 152 186 4.99E-10 57.987 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#11642 - CGI_10026155 superfamily 241613 313 347 8.64E-10 57.2166 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation 
of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#11642 - CGI_10026155 superfamily 241613 694 728 1.35E-09 56.4462 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#11642 - CGI_10026155 superfamily 241613 652 686 1.58E-09 56.4462 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#11642 - CGI_10026155 superfamily 241613 32 66 2.46E-09 55.6758 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that 
plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#11642 - CGI_10026155 superfamily 241613 193 228 1.25E-07 50.6682 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#11642 - CGI_10026155 superfamily 241613 360 389 2.40E-07 49.8978 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous 
domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#11642 - CGI_10026155 superfamily 245213 773 806 2.07E-06 47.2462 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#11642 - CGI_10026155 superfamily 214531 1191 1233 7.79E-13 66.086 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#11642 - CGI_10026155 superfamily 214531 1804 1846 2.07E-12 64.9304 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#11642 - CGI_10026155 superfamily 214531 970 1011 6.15E-11 60.6932 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. 
Also present in a variety of molecules similar to gp300/megalin. Q#11642 - CGI_10026155 superfamily 214531 1278 1320 1.12E-10 59.9228 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#11642 - CGI_10026155 superfamily 214531 543 585 7.90E-10 57.2264 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#11642 - CGI_10026155 superfamily 215683 1254 1295 1.89E-09 56.4095 cl18339 Ldl_recept_b superfamily - - Low-density lipoprotein receptor repeat class B; This domain is also known as the YWTD motif after the most conserved region of the repeat. The YWTD repeat is found in multiple tandem repeats and has been predicted to form a beta-propeller structure. Q#11642 - CGI_10026155 superfamily 214531 499 542 6.46E-09 54.5301 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#11642 - CGI_10026155 superfamily 214531 1500 1540 1.13E-08 54.1449 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#11642 - CGI_10026155 superfamily 214531 1891 1933 1.73E-08 53.3745 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. 
Also present in a variety of molecules similar to gp300/megalin. Q#11642 - CGI_10026155 superfamily 215683 1561 1601 2.17E-08 53.3279 cl18339 Ldl_recept_b superfamily - - Low-density lipoprotein receptor repeat class B; This domain is also known as the YWTD motif after the most conserved region of the repeat. The YWTD repeat is found in multiple tandem repeats and has been predicted to form a beta-propeller structure. Q#11642 - CGI_10026155 superfamily 214531 1585 1627 4.37E-08 52.2189 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#11642 - CGI_10026155 superfamily 214531 883 922 4.41E-08 52.2189 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#11642 - CGI_10026155 superfamily 214531 462 498 7.36E-08 51.4485 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#11642 - CGI_10026155 superfamily 214531 1847 1890 2.15E-07 50.2929 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#11642 - CGI_10026155 superfamily 214531 925 968 1.24E-06 47.9817 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. 
Also present in a variety of molecules similar to gp300/megalin. Q#11642 - CGI_10026155 superfamily 214531 1459 1497 1.27E-06 47.9817 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#11642 - CGI_10026155 superfamily 221695 753 776 1.09E-05 45.1386 cl18612 cEGF superfamily - - "Complement Clr-like EGF-like; cEGF, or complement Clr-like EGF, domains have six conserved cysteine residues disulfide-bonded into the characteristic pattern 'ababcc'. They are found in blood coagulation proteins such as fibrillin, Clr and Cls, thrombomodulin, and the LDL receptor. The core fold of the EGF domain consists of two small beta-hairpins packed against each other. Two major structural variants have been identified based on the structural context of the C-terminal cysteine residue of disulfide 'c' in the C-terminal hairpin: hEGFs and cEGFs. In cEGFs the C-terminal thiol resides on the C-terminal beta-sheet, resulting in long loop-lengths between the cysteine residues of disulfide 'c', typically C[10+]XC. These longer loop-lengths may have arisen by selective cysteine loss from a four-disulfide EGF template such as laminin or integrin. Tandem cEGF domains have five linking residues between terminal cysteines of adjacent domains. cEGF domains may or may not bind calcium in the linker region. cEGF domains with the consensus motif CXN4X[F,Y]XCXC are hydroxylated exclusively on the asparagine residue." Q#11642 - CGI_10026155 superfamily 214531 1149 1190 1.10E-05 45.2853 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. 
Q#11642 - CGI_10026155 superfamily 214531 1012 1053 5.07E-05 43.3593 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#11642 - CGI_10026155 superfamily 214531 1767 1802 7.55E-05 42.5889 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#11642 - CGI_10026155 superfamily 214531 841 878 0.000109729 42.2037 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#11642 - CGI_10026155 superfamily 214531 414 455 0.000180809 41.8185 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#11642 - CGI_10026155 superfamily 214531 1933 1974 0.000336455 40.6629 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#11646 - CGI_10026159 superfamily 241571 357 468 1.25E-08 53.185 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." 
Q#11646 - CGI_10026159 superfamily 241583 130 314 8.15E-48 166.284 cl00064 ZnMc superfamily - - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#11646 - CGI_10026159 superfamily 243051 484 637 6.43E-09 54.6913 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal cord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#11647 - CGI_10026160 superfamily 247044 11 146 5.49E-48 153.481 cl15697 ADF_gelsolin superfamily - - Actin depolymerization factor/cofilin- and gelsolin-like domains; Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. 
Q#11648 - CGI_10026162 superfamily 248458 60 227 4.38E-16 78.5097 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#11648 - CGI_10026162 superfamily 243161 495 556 1.56E-08 52.3966 cl02739 THAP superfamily - - "THAP domain; The THAP domain is a putative DNA-binding domain (DBD) and probably also binds a zinc ion. It features the conserved C2CH architecture (consensus sequence: Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His). 
Other universal features include the location of the domain at the N-termini of proteins, its size of about 90 residues, a C-terminal AVPTIF box and several other conserved residues. Orthologues of the human THAP domain have been identified in other vertebrates and probably worms and flies, but not in other eukaryotes or any prokaryotes." Q#11649 - CGI_10026163 superfamily 243072 123 272 2.20E-14 67.4086 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#11649 - CGI_10026163 superfamily 243072 20 173 3.28E-09 52.771 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#11654 - CGI_10026168 superfamily 248281 79 146 7.37E-09 52.6579 cl17727 GT1 superfamily N - "GT1, myb-like, SANT family; GT-1, a myb-like protein, is one of the GT trihelix transcription factors. GT-1 binds the GT cis-element of rbcS-3A, a light-induced gene, as a dimer. Arabidopsis GT-1 is a trans-activator and acts in the stabilization of components of the transcription pre-initiation complex comprised of TFIIA-TBP-TATA. The isolated GT-1 DNA-binding domain is sufficient to bind DNA. This region closely resembles the myb domain, but with longer helices. 
It has been proposed that GT-1 may respond to light signals via calcium-dependent phosphorylation to create a light-modulated molecular switch. These proteins are members of the SANT/myb group. SANT is named after 'SWI3, ADA2, N-CoR and TFIIIB', several factors that share this domain. The SANT domain resembles the 3 alpha-helix bundle of the DNA-binding Myb domains and is found in a diverse set of proteins." Q#11655 - CGI_10026169 superfamily 248458 43 151 2.08E-09 57.3237 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. 
Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#11655 - CGI_10026169 superfamily 248458 215 378 0.000298131 41.1453 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." 
Q#11656 - CGI_10026170 superfamily 247805 15 78 4.97E-09 49.9503 cl17251 DEXDc superfamily C - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#11658 - CGI_10026172 superfamily 248458 1 169 7.00E-08 50.3901 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." 
Q#11659 - CGI_10026173 superfamily 243303 5 73 1.80E-38 124.654 cl03104 CKS superfamily - - Cyclin-dependent kinase regulatory subunit; Cyclin-dependent kinase regulatory subunit. Q#11660 - CGI_10026174 superfamily 248012 955 1048 6.07E-24 99.1892 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#11660 - CGI_10026174 superfamily 248012 1148 1256 7.02E-23 96.1076 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#11660 - CGI_10026174 superfamily 246680 859 935 2.01E-08 53.548 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#11661 - CGI_10026175 superfamily 243035 2 92 5.50E-09 48.7702 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. 
Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. 
Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#11662 - CGI_10026176 superfamily 248012 690 783 5.31E-24 98.804 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#11662 - CGI_10026176 superfamily 248012 885 993 6.09E-23 95.7224 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#11662 - CGI_10026176 superfamily 246680 594 670 1.75E-08 53.1628 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#11663 - CGI_10026177 superfamily 241574 12 145 2.99E-52 170.482 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. 
Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#11664 - CGI_10026178 superfamily 247744 353 515 1.29E-47 165.864 cl17190 NK superfamily - - "Nucleoside/nucleotide kinase (NK) is a protein superfamily consisting of multiple families of enzymes that share structural similarity and are functionally related to the catalysis of the reversible phosphate group transfer from nucleoside triphosphates to nucleosides/nucleotides, nucleoside monophosphates, or sugars. Members of this family play a wide variety of essential roles in nucleotide metabolism, the biosynthesis of coenzymes and aromatic compounds, as well as the metabolism of sugar and sulfate." Q#11664 - CGI_10026178 superfamily 201217 218 267 6.22E-15 69.862 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#11664 - CGI_10026178 superfamily 201217 165 215 2.19E-12 62.5432 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#11664 - CGI_10026178 superfamily 201217 112 162 5.90E-09 52.9132 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#11664 - CGI_10026178 superfamily 201217 270 333 0.000673108 37.8904 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#11665 - CGI_10026179 superfamily 245206 7 259 8.15E-116 335.28 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. 
NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#11666 - CGI_10026180 superfamily 207637 308 378 5.07E-12 61.3999 cl02541 CIDE_N superfamily - - "CIDE_N domain, found at the N-terminus of the CIDE (cell death-inducing DFF45-like effector) proteins, as well as CAD nuclease (caspase-activated DNase/DNA fragmentation factor, DFF40) and its inhibitor, ICAD(DFF45). These proteins are associated with the chromatin condensation and DNA fragmentation events of apoptosis; the CIDE_N domain is thought to regulate the activity of ICAD/DFF45, and the CAD/DFF40 and CIDE nucleases during apoptosis. The CIDE-N domain is also found in the FSP27/CIDE-C protein." Q#11667 - CGI_10026181 superfamily 247856 114 176 3.82E-14 63.7209 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#11667 - CGI_10026181 superfamily 247856 41 103 1.42E-11 56.4021 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. 
Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#11668 - CGI_10026182 superfamily 245201 117 413 1.39E-36 140.926 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#11669 - CGI_10026183 superfamily 243092 57 353 5.91E-36 140.548 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are 
present in this alignment." Q#11669 - CGI_10026183 superfamily 243092 717 1020 3.14E-32 129.377 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#11669 - CGI_10026183 superfamily 243092 1594 1912 1.79E-21 97.0204 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#11669 - CGI_10026183 superfamily 243092 988 1261 5.27E-21 95.4796 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#11669 - CGI_10026183 superfamily 243092 285 592 1.02E-19 91.6276 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#11669 - CGI_10026183 superfamily 243092 1411 1718 1.55E-19 90.8572 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#11669 - CGI_10026183 superfamily 190637 663 710 2.43E-13 68.1962 cl04081 HELP superfamily N - "HELP motif; The founding member of the EMAP protein family is the 75 kDa Echinoderm Microtubule-Associated Protein, so-named for its abundance in sea urchin, sand dollar and starfish eggs. The Hydrophobic EMAP-Like Protein (HELP) motif was identified initially in the human EMAP-Like Protein 2 (EML2) and subsequently in the entire EMAP Protein family. The HELP motif is approximately 60-70 amino acids in length and is conserved amongst metazoans. Although the HELP motif is hydrophobic, there is no evidence that EMAP-Like Proteins are membrane-associated. 
All members of the EMAP-Like Protein family, identified to-date, are constructed with an amino terminal HELP motif followed by a WD domain. In C. elegans, EMAP-Like Protein-1 (ELP-1) is required for touch sensation indicating that ELP-1 may play a role in mechanosensation. The localization of ELP-1 to microtubules and adhesion sites implies that ELP-1 may transmit forces between the body surface and the touch receptor neurons." Q#11669 - CGI_10026183 superfamily 190637 2 48 3.50E-12 64.7294 cl04081 HELP superfamily N - "HELP motif; The founding member of the EMAP protein family is the 75 kDa Echinoderm Microtubule-Associated Protein, so-named for its abundance in sea urchin, sand dollar and starfish eggs. The Hydrophobic EMAP-Like Protein (HELP) motif was identified initially in the human EMAP-Like Protein 2 (EML2) and subsequently in the entire EMAP Protein family. The HELP motif is approximately 60-70 amino acids in length and is conserved amongst metazoans. Although the HELP motif is hydrophobic, there is no evidence that EMAP-Like Proteins are membrane-associated. All members of the EMAP-Like Protein family, identified to-date, are constructed with an amino terminal HELP motif followed by a WD domain. In C. elegans, EMAP-Like Protein-1 (ELP-1) is required for touch sensation indicating that ELP-1 may play a role in mechanosensation. The localization of ELP-1 to microtubules and adhesion sites implies that ELP-1 may transmit forces between the body surface and the touch receptor neurons." Q#11669 - CGI_10026183 superfamily 190637 1357 1399 1.22E-05 45.4694 cl04081 HELP superfamily N - "HELP motif; The founding member of the EMAP protein family is the 75 kDa Echinoderm Microtubule-Associated Protein, so-named for its abundance in sea urchin, sand dollar and starfish eggs. The Hydrophobic EMAP-Like Protein (HELP) motif was identified initially in the human EMAP-Like Protein 2 (EML2) and subsequently in the entire EMAP Protein family. 
The HELP motif is approximately 60-70 amino acids in length and is conserved amongst metazoans. Although the HELP motif is hydrophobic, there is no evidence that EMAP-Like Proteins are membrane-associated. All members of the EMAP-Like Protein family, identified to-date, are constructed with an amino terminal HELP motif followed by a WD domain. In C. elegans, EMAP-Like Protein-1 (ELP-1) is required for touch sensation indicating that ELP-1 may play a role in mechanosensation. The localization of ELP-1 to microtubules and adhesion sites implies that ELP-1 may transmit forces between the body surface and the touch receptor neurons." Q#11669 - CGI_10026183 superfamily 243092 1881 1957 0.000347901 43.4776 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#11671 - CGI_10026185 superfamily 220377 90 153 7.54E-14 65.134 cl10731 DUF2052 superfamily C - "Coiled-coil domain containing protein (DUF2052); This entry is of sequences of two conserved domains separated by a region of low complexity, spanning some 200 residues. The function is unknown." Q#11672 - CGI_10026186 superfamily 241573 189 497 1.89E-131 402.48 cl00051 CysPc superfamily - - "Calpains, domains IIa, IIb; calcium-dependent cytoplasmic cysteine proteinases, papain-like. Functions in cytoskeletal remodeling processes, cell differentiation, apoptosis and signal transduction." Q#11672 - CGI_10026186 superfamily 241653 512 651 1.68E-47 167.475 cl00165 Calpain_III superfamily - - "Calpain, subdomain III. Calpains are calcium-activated cytoplasmic cysteine proteinases, participate in cytoskeletal remodeling processes, cell differentiation, apoptosis and signal transduction. Catalytic domain and the two calmodulin-like domains are separated by C2-like domain III. Domain III plays an important role in calcium-induced activation of calpain involving electrostatic interactions with subdomain II. Proposed to mediate calpain's interaction with phospholipids and translocation to cytoplasmic/nuclear membranes. CD includes subdomain III of typical and atypical calpains." Q#11672 - CGI_10026186 superfamily 247856 922 977 0.00014212 40.9941 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#11672 - CGI_10026186 superfamily 241653 653 720 3.71E-21 91.6132 cl00165 Calpain_III superfamily N - "Calpain, subdomain III. 
Calpains are calcium-activated cytoplasmic cysteine proteinases, participate in cytoskeletal remodeling processes, cell differentiation, apoptosis and signal transduction. Catalytic domain and the two calmodulin-like domains are separated by C2-like domain III. Domain III plays an important role in calcium-induced activation of calpain involving electrostatic interactions with subdomain II. Proposed to mediate calpain's interaction with phospholipids and translocation to cytoplasmic/nuclear membranes. CD includes subdomain III of typical and atypical calpains." Q#11672 - CGI_10026186 superfamily 241573 100 142 1.22E-10 62.729 cl00051 CysPc superfamily C - "Calpains, domains IIa, IIb; calcium-dependent cytoplasmic cysteine proteinases, papain-like. Functions in cytoskeletal remodeling processes, cell differentiation, apoptosis and signal transduction." Q#11672 - CGI_10026186 superfamily 247856 891 946 0.00211737 37.5273 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#11678 - CGI_10026192 superfamily 243100 99 151 0.000128755 37.9288 cl02576 B_zip1 superfamily - - "basic leucine zipper DNA-binding and multimerization region of GCN4 and related proteins; Basic leucine zipper (bZIP) transcription factors act in networks of homo- and hetero-dimers in the regulation in a diverse set of cellular pathways. Classical leucine zippers have alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization creates a pair of basic regions that bind DNA and undergo conformational change. 
GCN4 was identified in Saccharomyces cerevisiae from mutations in a deficiency in activation with the general amino acid control pathway. GCN4 encodes a trans-activator of amino acid biosynthetic genes containing 2 acidic activation domains and a C-terminal bZIP domain, comprised of a basic alpha-helical DNA-binding region and a coiled-coil dimerization region." Q#11682 - CGI_10026196 superfamily 243062 297 395 2.42E-27 104.28 cl02510 TGF_beta superfamily - - Transforming growth factor beta like domain; Transforming growth factor beta like domain. Q#11682 - CGI_10026196 superfamily 216062 38 265 5.51E-18 81.3302 cl02928 TGFb_propeptide superfamily - - TGF-beta propeptide; This propeptide is known as latency associated peptide (LAP) in TGF-beta. LAP is a homodimer which is disulfide linked to TGF-beta binding protein. Q#11683 - CGI_10026197 superfamily 243062 253 351 7.64E-30 110.828 cl02510 TGF_beta superfamily - - Transforming growth factor beta like domain; Transforming growth factor beta like domain. Q#11683 - CGI_10026197 superfamily 216062 83 218 2.23E-17 79.019 cl02928 TGFb_propeptide superfamily N - TGF-beta propeptide; This propeptide is known as latency associated peptide (LAP) in TGF-beta. LAP is a homodimer which is disulfide linked to TGF-beta binding protein. Q#11685 - CGI_10026199 superfamily 248097 13 115 2.74E-12 59.201 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#11686 - CGI_10026200 superfamily 241600 115 314 3.58E-81 247.924 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. 
C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#11687 - CGI_10026201 superfamily 241600 1 172 1.74E-65 202.085 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#11688 - CGI_10026202 superfamily 241600 109 311 2.01E-79 243.302 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." 
Q#11688 - CGI_10026202 superfamily 241619 4 70 0.00950886 33.7097 cl00112 PAN_APPLE superfamily - - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#11689 - CGI_10026203 superfamily 199168 190 213 0.00802743 34.6348 cl15310 LRR_TYP superfamily - - "Leucine-rich repeats, typical (most populated) subfamily; Leucine-rich repeats, typical (most populated) subfamily. " Q#11692 - CGI_10026206 superfamily 243035 131 208 1.69E-09 53.5825 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#11692 - CGI_10026206 superfamily 241619 28 85 0.00573659 34.0916 cl00112 PAN_APPLE superfamily C - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#11693 - CGI_10026207 superfamily 248264 369 414 3.38E-08 51.469 cl17710 DDE_4 superfamily C - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#11693 - CGI_10026207 superfamily 243035 165 266 1.55E-07 49.1354 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. 
Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. 
Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#11693 - CGI_10026207 superfamily 243035 296 371 9.26E-07 46.6489 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. 
Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#11694 - CGI_10026208 superfamily 246925 46 181 0.000407454 41.187 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." 
Q#11698 - CGI_10016368 superfamily 247905 1572 1681 4.86E-13 68.8036 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#11698 - CGI_10016368 superfamily 241571 1737 1819 7.03E-10 58.963 cl00049 CUB superfamily N - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." 
Q#11698 - CGI_10016368 superfamily 241613 1839 1875 7.09E-09 54.135 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#11698 - CGI_10016368 superfamily 247792 1487 1534 1.54E-06 47.4404 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#11698 - CGI_10016368 superfamily 247805 780 912 0.00110439 40.0132 cl17251 DEXDc superfamily N - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#11699 - CGI_10016369 superfamily 241547 15 113 1.80E-18 83.4863 cl00012 alpha_CA superfamily C - "Carbonic anhydrase alpha (vertebrate-like) group. 
Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionary distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidine residues and a fourth conserved histidine plays a potential role in proton transfer." Q#11699 - CGI_10016369 superfamily 241547 255 300 6.82E-16 75.7823 cl00012 alpha_CA superfamily N - "Carbonic anhydrase alpha (vertebrate-like) group. Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionary distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidine residues and a fourth conserved histidine plays a potential role in proton transfer." 
Q#11700 - CGI_10016370 superfamily 245304 105 388 7.53E-98 302.171 cl10459 Peptidases_S8_S53 superfamily - - "Peptidase domain in the S8 and S53 families; Members of the peptidases S8 (subtilisin and kexin) and S53 (sedolisin) family include endopeptidases and exopeptidases. The S8 family has an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure and are not homologous to trypsin. Serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The S53 family contains a catalytic triad Glu/Asp/Ser with an additional acidic residue Asp in the oxyanion hole, similar to that of subtilisin. The serine residue here is the nucleophilic equivalent of the serine residue in the S8 family, while glutamic acid has the same role here as the histidine base. However, the aspartic acid residue that acts as an electrophile is quite different. In S53, it follows glutamic acid, while in S8 it precedes histidine. The stability of these enzymes may be enhanced by calcium; some members have been shown to bind up to 4 ions via binding sites with different affinity. There is a great diversity in the characteristics of their members: some contain disulfide bonds, some are intracellular while others are extracellular, some function at extreme temperatures, and others at high or low pH values." Q#11700 - CGI_10016370 superfamily 201820 473 545 4.65E-21 88.0662 cl08326 P_proprotein superfamily C - Proprotein convertase P-domain; A unique feature of the eukaryotic subtilisin-like proprotein convertases is the presence of an additional highly conserved sequence of approximately 150 residues (P domain) located immediately downstream of the catalytic domain. Q#11702 - CGI_10016372 superfamily 241578 76 180 2.54E-05 43.8007 cl00057 vWFA superfamily C - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). 
Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#11703 - CGI_10016373 superfamily 243092 172 306 0.00413878 37.3144 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate 
interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#11704 - CGI_10016374 superfamily 241563 40 80 5.60E-05 40.9256 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#11704 - CGI_10016374 superfamily 243092 286 422 0.00373949 38.0848 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#11705 - CGI_10016375 superfamily 247725 308 419 9.07E-52 176.396 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting proteins to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and other proteins." Q#11705 - CGI_10016375 superfamily 243072 640 793 5.76E-28 110.166 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#11705 - CGI_10016375 superfamily 245835 37 247 4.51E-78 253.104 cl12013 BAR superfamily - - "The Bin/Amphiphysin/Rvs (BAR) domain, a dimerization module that binds membranes and detects membrane curvature; BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions including organelle biogenesis, membrane trafficking or remodeling, and cell division and migration. Mutations in BAR containing proteins have been linked to diseases and their inactivation in cells leads to altered membrane dynamics. A BAR domain with an additional N-terminal amphipathic helix (an N-BAR) can drive membrane curvature. These N-BAR domains are found in amphiphysins and endophilins, among others. 
BAR domains are also frequently found alongside domains that determine lipid specificity, such as the Pleckstrin Homology (PH) and Phox Homology (PX) domains which are present in beta centaurins (ACAPs and ASAPs) and sorting nexins, respectively. A FES-CIP4 Homology (FCH) domain together with a coiled coil region is called the F-BAR domain and is present in Pombe/Cdc15 homology (PCH) family proteins, which include Fes/Fes tyrosine kinases, PACSIN or syndapin, CIP4-like proteins, and srGAPs, among others. The Inverse (I)-BAR or IRSp53/MIM homology Domain (IMD) is found in multi-domain proteins, such as IRSp53 and MIM, that act as scaffolding proteins and transducers of a variety of signaling pathways that link membrane dynamics and the underlying actin cytoskeleton. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The I-BAR domain induces membrane protrusions in the opposite direction compared to classical BAR and F-BAR domains, which produce membrane invaginations. BAR domains that also serve as protein interaction domains include those of arfaptin and OPHN1-like proteins, among others, which bind to Rac and Rho GAP domains, respectively." Q#11705 - CGI_10016375 superfamily 243047 438 560 2.47E-37 136.596 cl02464 ArfGap superfamily - - "Putative GTPase activating protein for Arf; Putative zinc fingers with GTPase activating proteins (GAPs) towards the small GTPase, Arf. The GAP of ARD1 stimulates GTPase hydrolysis for ARD1 but not ARFs." Q#11706 - CGI_10016376 superfamily 243091 25 154 4.99E-29 105.879 cl02566 SET superfamily - - "SET domain; SET domains are protein lysine methyltransferase enzymes. SET domains appear to be protein-protein interaction domains. It has been demonstrated that SET domains mediate interactions with a family of proteins that display similarity with dual-specificity phosphatases (dsPTPases). A subset of SET domains have been called PR domains. 
These domains are divergent in sequence from other SET domains, but also appear to mediate protein-protein interaction. The SET domain consists of two regions known as SET-N and SET-C. SET-C forms an unusual and conserved knot-like structure of probably functional importance. Additionally to SET-N and SET-C, an insert region (SET-I) and flanking regions of high structural variability form part of the overall structure." Q#11708 - CGI_10016378 superfamily 248012 19 102 5.25E-13 61.4397 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#11710 - CGI_10016380 superfamily 221377 208 346 2.35E-06 45.9227 cl13449 DUF3504 superfamily - - Domain of unknown function (DUF3504); This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 156 to 173 amino acids in length. Q#11711 - CGI_10016381 superfamily 243092 268 566 4.10E-65 216.047 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are 
proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#11717 - CGI_10016387 superfamily 247780 277 554 5.32E-125 371.881 cl17226 NAD_bind_amino_acid_DH superfamily - - "NAD(P) binding domain of amino acid dehydrogenase-like proteins; Amino acid dehydrogenase(DH)-like NAD(P)-binding domains are members of the Rossmann fold superfamily and are found in glutamate, leucine, and phenylalanine DHs (DHs), methylene tetrahydrofolate DH, methylene-tetrahydromethanopterin DH, methylene-tetrahydropholate DH/cyclohydrolase, Shikimate DH-like proteins, malate oxidoreductases, and glutamyl tRNA reductase. Amino acid DHs catalyze the deamination of amino acids to keto acids with NAD(P)+ as a cofactor. The NAD(P)-binding Rossmann fold superfamily includes a wide variety of protein families including NAD(P)- binding domains of alcohol DHs, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate DH, lactate/malate DHs, formate/glycerate DHs, siroheme synthases, 6-phosphogluconate DH, amino acid DHs, repressor rex, NAD-binding potassium channel domain, CoA-binding, and ornithine cyclodeaminase-like domains. These domains have an alpha-beta-alpha configuration. NAD binding involves numerous hydrogen and van der Waals contacts." Q#11717 - CGI_10016387 superfamily 215894 86 267 2.85E-104 314.585 cl02855 malic superfamily - - "Malic enzyme, N-terminal domain; Malic enzyme, N-terminal domain. " Q#11718 - CGI_10016388 superfamily 247724 8 214 2.49E-109 315.161 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. 
This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#11719 - CGI_10016389 superfamily 243072 220 346 3.34E-31 122.107 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#11719 - CGI_10016389 superfamily 241622 1519 1569 6.73E-16 76.4514 cl00117 PDZ superfamily N - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. 
PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95 (post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#11719 - CGI_10016389 superfamily 247683 734 780 5.58E-15 72.7514 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#11719 - CGI_10016389 superfamily 247057 2623 2686 1.22E-21 92.3817 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. 
Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain sites of phosphorylation and/or kinase docking sites, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#11722 - CGI_10016392 superfamily 217617 4 46 0.00022776 39.7069 cl15988 Sulfotransfer_2 superfamily NC - "Sulfotransferase family; This family includes a variety of sulfotransferase enzymes. Chondroitin 6-sulfotransferase catalyzes the transfer of sulfate to position 6 of the N-acetylgalactosamine residue of chondroitin. This family also includes Heparan sulfate 2-O-sulfotransferase (HS2ST) and Heparan sulfate 6-sulfotransferase (HS6ST). Heparan sulfate (HS) is a co-receptor for a number of growth factors, morphogens, and adhesion proteins. HS biosynthetic modifications may determine the strength and outcome of HS-ligand interactions. Mice that lack HS2ST undergo developmental failure only after midgestation, the most dramatic effect being the complete failure of kidney development. Heparan sulphate 6-O-sulfotransferase (HS6ST) catalyzes the transfer of sulphate from adenosine 3'-phosphate, 5'-phosphosulphate to the 6th position of the N-sulphoglucosamine residue in heparan sulphate." Q#11723 - CGI_10016393 superfamily 217617 98 280 9.58E-23 95.1756 cl15988 Sulfotransfer_2 superfamily C - "Sulfotransferase family; This family includes a variety of sulfotransferase enzymes. Chondroitin 6-sulfotransferase catalyzes the transfer of sulfate to position 6 of the N-acetylgalactosamine residue of chondroitin. This family also includes Heparan sulfate 2-O-sulfotransferase (HS2ST) and Heparan sulfate 6-sulfotransferase (HS6ST). 
Heparan sulfate (HS) is a co-receptor for a number of growth factors, morphogens, and adhesion proteins. HS biosynthetic modifications may determine the strength and outcome of HS-ligand interactions. Mice that lack HS2ST undergo developmental failure only after midgestation, the most dramatic effect being the complete failure of kidney development. Heparan sulphate 6-O-sulfotransferase (HS6ST) catalyzes the transfer of sulphate from adenosine 3'-phosphate, 5'-phosphosulphate to the 6th position of the N-sulphoglucosamine residue in heparan sulphate." Q#11725 - CGI_10016395 superfamily 245201 189 371 1.69E-31 120.034 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#11725 - CGI_10016395 superfamily 245201 58 122 1.57E-11 62.901 cl09925 PKc_like superfamily C - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." 
Q#11726 - CGI_10016396 superfamily 245201 1 157 2.70E-32 118.493 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#11727 - CGI_10000467 superfamily 243066 119 222 3.22E-17 76.1904 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." 
Q#11728 - CGI_10001206 superfamily 180442 76 405 3.86E-20 92.0549 cl18106 PRK06175 superfamily C - L-aspartate oxidase; Provisional Q#11728 - CGI_10001206 superfamily 248236 457 587 3.55E-13 68.8652 cl17682 PRK07189 superfamily C - malonate decarboxylase subunit beta; Reviewed Q#11729 - CGI_10022510 superfamily 247068 119 205 6.28E-13 61.5605 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." 
Q#11731 - CGI_10022512 superfamily 247684 99 369 5.76E-33 128.549 cl17037 NBD_sugar-kinase_HSP70_actin superfamily N - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#11731 - CGI_10022512 superfamily 247684 11 105 2.29E-18 85.7919 cl17037 NBD_sugar-kinase_HSP70_actin superfamily C - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#11732 - CGI_10022513 superfamily 247684 2 255 3.14E-42 152.817 cl17037 NBD_sugar-kinase_HSP70_actin superfamily N - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#11734 - CGI_10022516 superfamily 247684 11 178 9.05E-40 140.875 cl17037 NBD_sugar-kinase_HSP70_actin superfamily C - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#11739 - CGI_10022521 superfamily 241818 1 58 7.52E-31 107.649 cl00366 PMSR superfamily N - Peptide methionine sulfoxide reductase; This enzyme repairs damaged proteins. Methionine sulfoxide in proteins is reduced to methionine. 
Q#11740 - CGI_10022522 superfamily 180442 214 649 1.78E-72 244.594 cl18106 PRK06175 superfamily - - L-aspartate oxidase; Provisional Q#11740 - CGI_10022522 superfamily 217281 644 768 3.44E-43 152.753 cl08378 Succ_DH_flav_C superfamily - - "Fumarate reductase flavoprotein C-term; This family contains fumarate reductases, succinate dehydrogenases and L-aspartate oxidases." Q#11741 - CGI_10022523 superfamily 243051 3090 3246 7.24E-57 199.141 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#11741 - CGI_10022523 superfamily 243051 3594 3751 1.10E-53 189.896 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). 
In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#11741 - CGI_10022523 superfamily 243051 2391 2549 1.34E-52 186.815 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#11741 - CGI_10022523 superfamily 243051 5246 5402 7.59E-50 179.111 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. 
Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#11741 - CGI_10022523 superfamily 243051 7898 8056 1.48E-49 178.34 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#11741 - CGI_10022523 superfamily 243051 4168 4326 1.62E-49 177.955 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. 
MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#11741 - CGI_10022523 superfamily 243051 2036 2193 5.39E-49 176.414 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." 
Q#11741 - CGI_10022523 superfamily 243051 4566 4722 1.56E-48 175.259 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#11741 - CGI_10022523 superfamily 243051 1514 1673 3.36E-48 174.103 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. 
In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#11741 - CGI_10022523 superfamily 243051 6888 7048 3.72E-47 171.407 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#11741 - CGI_10022523 superfamily 243051 2199 2355 4.58E-47 171.022 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. 
In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#11741 - CGI_10022523 superfamily 243051 6724 6882 7.44E-47 170.251 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#11741 - CGI_10022523 superfamily 243051 1349 1506 8.94E-47 170.251 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). 
In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#11741 - CGI_10022523 superfamily 243051 1872 2028 1.61E-46 169.481 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#11741 - CGI_10022523 superfamily 243051 3252 3407 2.09E-46 169.096 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. 
Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#11741 - CGI_10022523 superfamily 243051 4728 4887 4.99E-46 167.94 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#11741 - CGI_10022523 superfamily 243051 1157 1314 1.16E-45 166.784 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. 
MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#11741 - CGI_10022523 superfamily 243051 6540 6699 1.61E-45 166.399 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." 
Q#11741 - CGI_10022523 superfamily 243051 474 640 1.62E-45 166.399 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#11741 - CGI_10022523 superfamily 243051 9413 9567 6.79E-45 164.858 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. 
In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#11741 - CGI_10022523 superfamily 243051 296 456 2.65E-44 162.932 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#11741 - CGI_10022523 superfamily 243051 2559 2722 4.32E-44 162.547 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. 
In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#11741 - CGI_10022523 superfamily 243051 6374 6534 6.25E-44 161.777 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#11741 - CGI_10022523 superfamily 243051 831 978 9.58E-44 161.392 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). 
In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#11741 - CGI_10022523 superfamily 243051 3757 3916 1.53E-43 160.621 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#11741 - CGI_10022523 superfamily 243051 2728 2887 2.03E-43 160.236 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. 
Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#11741 - CGI_10022523 superfamily 243051 6211 6368 4.93E-43 159.466 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#11741 - CGI_10022523 superfamily 243051 8322 8473 1.01E-42 158.31 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. 
MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#11741 - CGI_10022523 superfamily 243051 2912 3075 1.24E-42 158.31 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." 
Q#11741 - CGI_10022523 superfamily 243051 4893 5048 1.72E-42 157.54 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#11741 - CGI_10022523 superfamily 243051 994 1151 5.37E-42 156.384 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. 
In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#11741 - CGI_10022523 superfamily 243051 3432 3586 1.09E-40 152.532 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#11741 - CGI_10022523 superfamily 243051 7467 7626 4.10E-40 150.991 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. 
In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#11741 - CGI_10022523 superfamily 243051 640 797 6.47E-40 150.221 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#11741 - CGI_10022523 superfamily 243051 7054 7215 9.15E-39 146.754 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). 
In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#11741 - CGI_10022523 superfamily 243051 8746 8905 1.01E-38 146.754 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#11741 - CGI_10022523 superfamily 243051 5820 5977 3.94E-38 145.213 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. 
Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#11741 - CGI_10022523 superfamily 243051 5073 5240 1.00E-37 144.058 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#11741 - CGI_10022523 superfamily 243051 105 257 8.21E-36 138.28 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. 
MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#11741 - CGI_10022523 superfamily 243051 9620 9774 1.09E-35 137.894 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." 
Q#11741 - CGI_10022523 superfamily 243051 3960 4113 4.00E-35 136.354 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#11741 - CGI_10022523 superfamily 243051 5610 5765 2.79E-34 134.042 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. 
In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#11741 - CGI_10022523 superfamily 243051 4388 4549 5.66E-34 132.887 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#11741 - CGI_10022523 superfamily 243051 9193 9353 4.60E-33 130.576 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. 
In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#11741 - CGI_10022523 superfamily 243051 5408 5562 2.92E-32 128.264 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#11741 - CGI_10022523 superfamily 243051 1680 1837 5.92E-32 127.109 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). 
In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#11741 - CGI_10022523 superfamily 243051 7260 7414 1.01E-31 126.338 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#11741 - CGI_10022523 superfamily 243051 2 97 1.24E-26 111.701 cl02479 MAM superfamily N - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. 
Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#11741 - CGI_10022523 superfamily 243051 7685 7848 6.39E-24 103.612 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#11741 - CGI_10022523 superfamily 243051 6039 6176 1.70E-23 102.456 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. 
MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#11741 - CGI_10022523 superfamily 243051 8533 8692 1.31E-20 93.9817 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." 
Q#11741 - CGI_10022523 superfamily 243051 8979 9138 1.08E-13 73.5661 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." 
Q#11741 - CGI_10022523 superfamily 241613 9375 9409 2.35E-10 60.6833 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#11741 - CGI_10022523 superfamily 241613 8495 8529 2.99E-10 60.2981 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#11741 - CGI_10022523 superfamily 241613 9863 9899 6.86E-10 59.5278 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and 
transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#11741 - CGI_10022523 superfamily 241613 9578 9613 1.00E-09 58.7574 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#11741 - CGI_10022523 superfamily 241613 7857 7891 1.33E-09 58.7574 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor 
and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#11741 - CGI_10022523 superfamily 241613 6001 6036 1.42E-09 58.3722 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#11741 - CGI_10022523 superfamily 241613 5775 5808 6.23E-09 56.8314 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in 
establishing and maintaining the modular structure" Q#11741 - CGI_10022523 superfamily 241613 4351 4385 1.03E-08 56.061 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#11741 - CGI_10022523 superfamily 241613 8074 8108 1.18E-08 55.6758 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#11741 - CGI_10022523 superfamily 241613 8280 8313 2.43E-08 54.9054 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol 
metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#11741 - CGI_10022523 superfamily 241613 8704 8734 4.01E-08 54.135 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#11741 - CGI_10022523 superfamily 241613 9149 9180 1.07E-07 52.9794 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, 
including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#11741 - CGI_10022523 superfamily 241613 4123 4157 1.35E-07 52.5942 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#11741 - CGI_10022523 superfamily 241613 9788 9822 2.24E-07 52.209 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the 
native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#11741 - CGI_10022523 superfamily 241613 8938 8974 3.72E-07 51.4386 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#11741 - CGI_10022523 superfamily 241613 7430 7458 1.46E-06 49.8978 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#11741 - CGI_10022523 superfamily 241613 7651 7682 4.26E-05 45.2754 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that 
plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#11741 - CGI_10022523 superfamily 241613 7226 7256 0.000375763 42.579 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#11741 - CGI_10022523 superfamily 241613 5580 5607 0.00694404 38.727 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other 
homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#11741 - CGI_10022523 superfamily 243051 8112 8269 1.53E-12 70.0993 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." 
Q#11741 - CGI_10022523 superfamily 241613 9902 9938 0.00718309 38.8162 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#11742 - CGI_10022524 superfamily 243051 431 593 3.46E-46 163.703 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." 
Q#11742 - CGI_10022524 superfamily 243051 599 757 1.23E-43 156.769 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#11742 - CGI_10022524 superfamily 243051 114 273 1.22E-40 148.295 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. 
In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#11742 - CGI_10022524 superfamily 243051 792 927 9.07E-40 145.598 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#11742 - CGI_10022524 superfamily 243051 304 412 1.92E-33 127.494 cl02479 MAM superfamily N - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. 
In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#11742 - CGI_10022524 superfamily 243051 1 108 8.99E-29 114.012 cl02479 MAM superfamily N - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#11742 - CGI_10022524 superfamily 243051 936 982 0.0014754 38.8709 cl02479 MAM superfamily C - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). 
In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#11743 - CGI_10022525 superfamily 243051 540 696 1.23E-54 189.896 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#11743 - CGI_10022525 superfamily 243051 720 880 6.77E-49 173.333 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. 
Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#11743 - CGI_10022525 superfamily 243051 1615 1771 5.57E-48 171.022 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#11743 - CGI_10022525 superfamily 243051 885 1042 3.14E-47 168.71 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. 
MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#11743 - CGI_10022525 superfamily 243051 1073 1226 6.19E-46 164.858 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." 
Q#11743 - CGI_10022525 superfamily 243051 1246 1400 1.45E-45 164.088 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#11743 - CGI_10022525 superfamily 243051 1406 1563 1.63E-39 146.369 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. 
In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#11743 - CGI_10022525 superfamily 243051 21 183 1.09E-37 141.361 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#11743 - CGI_10022525 superfamily 243051 204 360 1.67E-36 137.894 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. 
In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#11743 - CGI_10022525 superfamily 243051 366 497 4.44E-32 125.183 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#11743 - CGI_10022525 superfamily 243051 1793 1837 2.18E-07 51.5825 cl02479 MAM superfamily C - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). 
In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#11744 - CGI_10022526 superfamily 241647 145 171 0.00730793 36.0072 cl00157 WW superfamily - - Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs. Q#11745 - CGI_10022527 superfamily 248458 210 376 1.24E-09 58.0941 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. 
The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#11746 - CGI_10022528 superfamily 241555 9 102 4.99E-21 83.8388 cl00020 GAT_1 superfamily C - "Type 1 glutamine amidotransferase (GATase1)-like domain; Type 1 glutamine amidotransferase (GATase1)-like domain. This group contains proteins similar to Class I glutamine amidotransferases, the intracellular PH1704 from Pyrococcus horikoshii, the C-terminal of the large catalase: Escherichia coli HP-II, Sinorhizobium meliloti Rm1021 ThuA, the A4 beta-galactosidase middle domain and peptidase E. The majority of proteins in this group have a reactive Cys found in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow. For Class I glutamine amidotransferases proteins which transfer ammonia from the amide side chain of glutamine to an acceptor substrate, this Cys forms a Cys-His-Glu catalytic triad in the active site. 
Glutamine amidotransferases activity can be found in a range of biosynthetic enzymes included in this cd: glutamine amidotransferase, formylglycinamide ribonucleotide, GMP synthetase, anthranilate synthase component II, glutamine-dependent carbamoyl phosphate synthase (CPSase), cytidine triphosphate synthetase, gamma-glutamyl hydrolase, imidazole glycerol phosphate synthase and, cobyric acid synthase. For Pyrococcus horikoshii PH1704, the Cys of the nucleophile elbow together with a different His and, a Glu from an adjacent monomer form a catalytic triad different from the typical GATase1 triad. Peptidase E is believed to be a serine peptidase having a Ser-His-Glu catalytic triad which differs from the Cys-His-Glu catalytic triad of typical GATase1 domains, by having a Ser in place of the reactive Cys at the nucleophile elbow. The E. coli HP-II C-terminal domain, S. meliloti Rm1021 ThuA and the A4 beta-galactosidase middle domain lack the catalytic triad typical GATaseI domains. GATase1-like domains can occur either as single polypeptides, as in Class I glutamine amidotransferases, or as domains in a much larger multifunctional synthase protein, such as CPSase. Peptidase E has a circular permutation in the common core of a typical GTAse1 domain." Q#11747 - CGI_10022529 superfamily 201526 15 83 1.44E-24 91.0604 cl09522 Synaptobrevin superfamily C - Synaptobrevin; Synaptobrevin. Q#11748 - CGI_10022530 superfamily 243689 22 102 2.32E-09 55.7125 cl04271 IBN_N superfamily - - Importin-beta N-terminal domain; Importin-beta N-terminal domain. Q#11748 - CGI_10022530 superfamily 219817 104 213 6.31E-05 42.9905 cl07129 Xpo1 superfamily C - "Exportin 1-like protein; The sequences featured in this family are similar to a region close to the N-terminus of yeast exportin 1 (Xpo1, Crm1). This region is found just C-terminal to an importin-beta N-terminal domain (pfam03810) in many members of this family. 
Exportin 1 is a nuclear export receptor that interacts with leucine-rich nuclear export signal (NES) sequences, and Ran-GTP, and is involved in translocation of proteins out of the nucleus." Q#11749 - CGI_10022531 superfamily 241832 57 169 1.16E-53 169.291 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#11750 - CGI_10022532 superfamily 242206 22 122 2.14E-17 77.529 cl00938 Rieske superfamily - - "Rieske domain; a [2Fe-2S] cluster binding domain commonly found in Rieske non-heme iron oxygenase (RO) systems such as naphthalene and biphenyl dioxygenases, as well as in plant/cyanobacterial chloroplast b6f and mitochondrial cytochrome bc(1) complexes. The Rieske domain can be divided into two subdomains, with an incomplete six-stranded, antiparallel beta-barrel at one end, and an iron-sulfur cluster binding subdomain at the other. The Rieske iron-sulfur center contains a [2Fe-2S] cluster, which is involved in electron transfer, and is liganded to two histidine and two cysteine residues present in conserved sequences called Rieske motifs. 
In RO systems, the N-terminal Rieske domain of the alpha subunit acts as an electron shuttle that accepts electrons from a reductase or ferredoxin component and transfers them to the mononuclear iron in the alpha subunit C-terminal domain to be used for catalysis." Q#11750 - CGI_10022532 superfamily 241737 235 362 1.36E-07 49.4168 cl00264 Ferritin_like superfamily - - "Ferritin-like superfamily of diiron-containing four-helix-bundle proteins; Ferritin-like, diiron-carboxylate proteins participate in a range of functions including iron regulation, mono-oxygenation, and reactive radical production. These proteins are characterized by the fact that they catalyze dioxygen-dependent oxidation-hydroxylation reactions within diiron centers; one exception is manganese catalase, which catalyzes peroxide-dependent oxidation-reduction within a dimanganese center. Diiron-carboxylate proteins are further characterized by the presence of duplicate metal ligands, glutamates and histidines (ExxH) and two additional glutamates within a four-helix bundle. Outside of these conserved residues there is little obvious homology. Members include bacterioferritin, ferritin, rubrerythrin, aromatic and alkene monooxygenase hydroxylases (AAMH), ribonucleotide reductase R2 (RNRR2), acyl-ACP-desaturases (Acyl_ACP_Desat), manganese (Mn) catalases, demethoxyubiquinone hydroxylases (DMQH), DNA protecting proteins (DPS), and ubiquinol oxidases (AOX), and the aerobic cyclase system, Fe-containing subunit (ACSF)." Q#11752 - CGI_10022534 superfamily 243091 247 302 3.37E-05 41.7088 cl02566 SET superfamily N - "SET domain; SET domains are protein lysine methyltransferase enzymes. SET domains appear to be protein-protein interaction domains. It has been demonstrated that SET domains mediate interactions with a family of proteins that display similarity with dual-specificity phosphatases (dsPTPases). A subset of SET domains have been called PR domains. 
These domains are divergent in sequence from other SET domains, but also appear to mediate protein-protein interaction. The SET domain consists of two regions known as SET-N and SET-C. SET-C forms an unusual and conserved knot-like structure of probably functional importance. Additionally to SET-N and SET-C, an insert region (SET-I) and flanking regions of high structural variability form part of the overall structure." Q#11753 - CGI_10022535 superfamily 243540 157 378 7.41E-57 187.455 cl03831 HlyIII superfamily - - "Haemolysin-III related; Members of this family are integral membrane proteins. This family includes a protein with hemolytic activity from Bacillus cereus. It has been proposed that YOL002c encodes a Saccharomyces cerevisiae protein that plays a key role in metabolic pathways that regulate lipid and phosphate metabolism. In eukaryotes, members are seven-transmembrane pass molecules found to encode functional receptors with a broad range of apparent ligand specificities, including progestin and adipoQ receptors, and hence have been named PAQR proteins. The mammalian members include progesterone binding proteins. Unlike the case with GPCR receptor proteins, the evolutionary ancestry of the members of this family can be traced back to the Archaea." Q#11756 - CGI_10022538 superfamily 243186 4 135 1.30E-43 143.774 cl02788 Ser_Recombinase superfamily - - "Serine Recombinase family, catalytic domain; a DNA binding domain may be present either N- or C-terminal to the catalytic domain. These enzymes perform site-specific recombination of DNA molecules by a concerted, four-strand cleavage and rejoining mechanism which involves a transient phosphoserine linkage between DNA and serine recombinase. Serine recombinases demonstrate functional versatility and include resolvases, invertases, integrases, and transposases. Resolvases and invertases (i.e. 
Tn3, gamma-delta, Tn5044 resolvases, Gin and Hin invertases) in this family contain a C-terminal DNA binding domain and comprise a major phylogenic group. Also included are phage- and bacterial-encoded recombinases such as phiC31 integrase, SpoIVCA excisionase, and Tn4451 TnpX transposase. These integrases and transposases have larger C-terminal domains compared to resolvases/invertases and are referred to as large serine recombinases. Also belonging to this family are proteins with N-terminal DNA binding domains similar to IS607- and IS1535-transposases from Helicobacter and Mycobacterium." Q#11756 - CGI_10022538 superfamily 247947 147 187 6.19E-10 51.6165 cl17393 HTH_Hin_like superfamily - - "Helix-turn-helix domain of Hin and related proteins, a family of DNA-binding domains unique to bacteria and represented by the Hin protein of Salmonella. The basic HTH domain is a simple fold comprised of three core helices that form a right-handed helical bundle. The principal DNA-protein interface is formed by the third helix, the recognition helix, inserting itself into the major groove of the DNA. A diverse array of HTH domains participate in a variety of functions that depend on their DNA-binding properties. HTH_Hin represents one of the simplest versions of the HTH domains; the characterization of homologous relationships between various sequence-diverse HTH domain families remains difficult. The Hin recombinase induces the site-specific inversion of a chromosomal DNA segment containing a promoter, which controls the alternate expression of two genes by reversibly switching orientation. The Hin recombinase consists of a single polypeptide chain containing a DNA-binding domain (HTH_Hin) and a catalytic domain." Q#11757 - CGI_10022539 superfamily 203644 62 109 8.92E-13 61.4242 cl10081 Collar superfamily - - "Phage Tail Collar Domain; This region is occasionally found in conjunction with pfam03335. 
Most of the family appear to be phage tail proteins; however some appear to be involved in other processes. For instance rhiB from Rhizobium leguminosarum may be involved in plant-microbe interactions. A related protein mrpB is involved in the pathogenicity of Microcystis aeruginosa. The finding of this family in a structural component of the phage tail fibre baseplate suggests that its function is structural rather than enzymatic. Structural studies show this region consists of a helix and a loop and three beta-strands. This alignment does not catch the third strand as it is separated from the rest of the structure by around 100 residues. This strand is conserved in homologues but the intervening sequence is not. Much of the function of Bacteriophage T4 short tail fiber protein appears to reside in this intervening region. In the tertiary structure of the phage baseplate this domain forms part of the 'collar'. The domain may bind SO4, however the residues accredited with this vary between the PDB file and the Swiss-Prot entry. The long unconserved region maybe due to domain swapping in and out of a loop or reflective of rapid evolution." Q#11759 - CGI_10022541 superfamily 241563 71 109 2.19E-05 40.9256 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#11760 - CGI_10002919 superfamily 242064 46 150 3.01E-64 194.671 cl00748 Ribosomal_L32_L32e superfamily - - "Ribosomal_L32_L32e: L32 is a protein from the large subunit that contains a surface-exposed globular domain and a finger-like projection that extends into the RNA core to stabilize the tertiary structure. 
L32 does not appear to play a role in forming the A (aminoacyl), P (peptidyl) or E (exit) sites of the ribosome, but does interact with 23S rRNA, which has a "kink-turn" secondary structure motif. L32 is overexpressed in human prostate cancer and has been identified as a stably expressed housekeeping gene in macrophages of human chronic obstructive pulmonary disease (COPD) patients. In Schizosaccharomyces pombe, L32 has also been suggested to play a role as a transcriptional regulator in the nucleus. Found in archaea and eukaryotes, this protein is known as L32 in eukaryotes and L32e in archaea." Q#11761 - CGI_10000675 superfamily 247999 614 659 3.88E-14 68.0073 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#11762 - CGI_10000676 superfamily 248099 3 79 2.09E-25 97.395 cl17545 Bromo_TP superfamily - - Bromodomain associated; This domain is predicted to bind DNA and is often found associated with pfam00439 and in transcription factors. It has a histone-like fold. Q#11763 - CGI_10000319 superfamily 192535 47 172 1.58E-05 44.509 cl18179 7TM_GPCR_Srsx superfamily C - Serpentine type 7TM GPCR chemoreceptor Srsx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srsx is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. 
Q#11764 - CGI_10015471 superfamily 247675 4 103 3.06E-48 162.928 cl17011 Arginase_HDAC superfamily N - "Arginase-like and histone-like hydrolases; Arginase-like/histone-like hydrolase superfamily includes metal-dependent enzymes that belong to Arginase-like amidino hydrolase family and histone/histone-like deacetylase class I, II, IV family, respectively. These enzymes catalyze hydrolysis of amide bond. Arginases are known to be involved in control of cellular levels of arginine and ornithine, in histidine and arginine degradation and in clavulanic acid biosynthesis. Deacetylases play a role in signal transduction through histone and/or other protein modification and can repress/activate transcription of a number of different genes. They participate in different cellular processes including cell cycle regulation, DNA damage response, embryonic development, cytokine signaling important for immune response and post-translational control of the acetyl coenzyme A synthetase. Mammalian histone deacetylases are known to be involved in progression of different tumors. Specific inhibitors of mammalian histone deacetylases are an emerging class of promising novel anticancer drugs." Q#11765 - CGI_10015472 superfamily 246664 828 1270 0 775.713 cl14561 An_peroxidase_like superfamily - - "Animal heme peroxidases and related proteins; A diverse family of enzymes, which includes prostaglandin G/H synthase, thyroid peroxidase, myeloperoxidase, linoleate diol synthase, lactoperoxidase, peroxinectin, peroxidasin, and others. Despite its name, this family is not restricted to metazoans: members are found in fungi, plants, and bacteria as well." Q#11765 - CGI_10015472 superfamily 245814 407 488 3.19E-18 82.168 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#11765 - CGI_10015472 superfamily 245814 316 396 9.65E-16 74.8492 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#11765 - CGI_10015472 superfamily 248289 1380 1435 7.46E-13 65.9083 cl17735 VWC superfamily - - von Willebrand factor type C domain; The high cutoff was used to prevent overlap with pfam00094. Q#11765 - CGI_10015472 superfamily 245814 222 289 7.80E-12 63.5681 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#11765 - CGI_10015472 superfamily 246664 757 801 1.75E-08 56.935 cl14561 An_peroxidase_like superfamily C - "Animal heme peroxidases and related proteins; A diverse family of enzymes, which includes prostaglandin G/H synthase, thyroid peroxidase, myeloperoxidase, linoleate diol synthase, lactoperoxidase, peroxinectin, peroxidasin, and others. Despite its name, this family is not restricted to metazoans: members are found in fungi, plants, and bacteria as well." Q#11765 - CGI_10015472 superfamily 245814 506 568 1.91E-08 53.1795 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#11766 - CGI_10015473 superfamily 247907 2 141 2.23E-14 72.4508 cl17353 LamG superfamily - - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." Q#11766 - CGI_10015473 superfamily 247907 649 815 2.45E-07 50.8797 cl17353 LamG superfamily - - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. 
Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." Q#11766 - CGI_10015473 superfamily 245213 164 200 0.000149443 41.083 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#11766 - CGI_10015473 superfamily 245213 609 642 0.00038779 39.9274 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#11766 - CGI_10015473 superfamily 247907 499 585 0.000518457 40.4006 cl17353 LamG superfamily N - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. 
Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." Q#11766 - CGI_10015473 superfamily 247907 884 962 0.00131615 38.9688 cl17353 LamG superfamily N - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." Q#11766 - CGI_10015473 superfamily 247907 287 337 0.00361296 37.8132 cl17353 LamG superfamily C - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." Q#11768 - CGI_10015475 superfamily 128837 88 117 2.49E-06 43.5584 cl17973 EZ_HEAT superfamily - - "E-Z type HEAT repeats; Present in subunits of cyanobacterial phycocyanin lyase, and other proteins. Probable scaffolding role." Q#11768 - CGI_10015475 superfamily 128837 241 270 3.22E-05 40.4768 cl17973 EZ_HEAT superfamily - - "E-Z type HEAT repeats; Present in subunits of cyanobacterial phycocyanin lyase, and other proteins. Probable scaffolding role." Q#11768 - CGI_10015475 superfamily 128837 177 206 0.00115455 35.8544 cl17973 EZ_HEAT superfamily - - "E-Z type HEAT repeats; Present in subunits of cyanobacterial phycocyanin lyase, and other proteins. Probable scaffolding role." 
Q#11768 - CGI_10015475 superfamily 128837 208 236 0.00269869 34.6988 cl17973 EZ_HEAT superfamily - - "E-Z type HEAT repeats; Present in subunits of cyanobacterial phycocyanin lyase, and other proteins. Probable scaffolding role." Q#11772 - CGI_10015479 superfamily 243061 87 173 2.55E-17 73.5302 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#11772 - CGI_10015479 superfamily 243061 2 71 5.59E-12 58.5074 cl02509 SRCR superfamily N - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#11774 - CGI_10015482 superfamily 115363 284 343 2.78E-14 68.5525 cl05972 MIB_HERC2 superfamily - - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). Q#11774 - CGI_10015482 superfamily 115363 213 260 2.19E-08 51.6038 cl05972 MIB_HERC2 superfamily C - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). Q#11774 - CGI_10015482 superfamily 241578 43 116 0.00217367 37.9649 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#11775 - CGI_10015483 superfamily 115363 269 324 7.47E-13 63.5449 cl05972 MIB_HERC2 superfamily - - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). Q#11775 - CGI_10015483 superfamily 115363 198 256 9.79E-07 45.8258 cl05972 MIB_HERC2 superfamily - - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). Q#11776 - CGI_10015485 superfamily 243035 3 117 2.17E-25 93.8385 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#11785 - CGI_10009729 superfamily 247727 87 200 1.17E-17 76.6998 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#11787 - CGI_10009731 superfamily 245539 418 483 4.51E-26 100.706 cl11186 Cullin_Nedd8 superfamily - - "Cullin protein neddylation domain; This is the neddylation site of cullin proteins which are a family of structurally related proteins containing an evolutionarily conserved cullin domain. With the exception of APC2, each member of the cullin family is modified by Nedd8 and several cullins function in Ubiquitin-dependent proteolysis, a process in which the 26S proteasome recognises and subsequently degrades a target protein tagged with K48-linked poly-ubiquitin chains. Cullins are molecular scaffolds responsible for assembling the ROC1/Rbx1 RING-based E3 ubiquitin ligases, of which several play a direct role in tumorigenesis. Nedd8/Rub1 is a small ubiquitin-like protein, which was originally found to be conjugated to Cdc53, a cullin component of the SCF (Skp1-Cdc53/CUL1-F-box protein) E3 Ub ligase complex in Saccharomyces cerevisiae, and Nedd8 modification has now emerged as a regulatory pathway of fundamental importance for cell cycle control and for embryogenesis in metazoans. The only identified Nedd8 substrates are cullins. Neddylation results in covalent conjugation of a Nedd8 moiety onto a conserved cullin lysine residue." 
Q#11788 - CGI_10009732 superfamily 246748 1045 1197 1.98E-25 108.449 cl14876 Zinc_peptidase_like superfamily C - "Zinc peptidases M18, M20, M28, and M42; Zinc peptidases play vital roles in metabolic and signaling pathways throughout all kingdoms of life. This family corresponds to several clans in the MEROPS database, including the MH clan, which contains 4 families (M18, M20, M28, M42). The peptidase M20 family includes carboxypeptidases such as the glutamate carboxypeptidase from Pseudomonas, the thermostable carboxypeptidase Ss1 of broad specificity from archaea and yeast Gly-X carboxypeptidase. The dipeptidases include bacterial dipeptidase, peptidase V (PepV), a eukaryotic, non-specific dipeptidase, and two Xaa-His dipeptidases (carnosinases). There is also the bacterial aminopeptidase, peptidase T (PepT) that acts only on tripeptide substrates and has therefore been termed a tripeptidase. Peptidase family M28 contains aminopeptidases and carboxypeptidases, and has co-catalytic zinc ions. However, several enzymes in this family utilize other first row transition metal ions such as cobalt and manganese. Each zinc ion is tetrahedrally co-ordinated, with three amino acid ligands plus activated water; one aspartate residue binds both metal ions. The aminopeptidases in this family are also called bacterial leucyl aminopeptidases, but are able to release a variety of N-terminal amino acids. IAP aminopeptidase and aminopeptidase Y preferentially release basic amino acids while glutamate carboxypeptidase II preferentially releases C-terminal glutamates. Glutamate carbxypeptidase II and plasma glutamate carboxypeptidase hydrolyze dipeptides. Peptidase families M18 and M42 contain metalloaminopeptidases. M18 is widely distributed in bacteria and eukaryotes. However, only yeast aminopeptidase I and mammalian aspartyl aminopeptidase have been characterized in detail. 
Some of M42 (also known as glutamyl aminopeptidase) enzymes exhibit aminopeptidase specificity while others also have acylaminoacylpeptidase activity (i.e. hydrolysis of acylated N-terminal residues)." Q#11788 - CGI_10009732 superfamily 216290 836 945 1.12E-21 93.5069 cl03089 Cu2_monooxygen superfamily - - "Copper type II ascorbate-dependent monooxygenase, N-terminal domain; The N and C-terminal domains of members of this family adopt the same PNGase F-like fold." Q#11788 - CGI_10009732 superfamily 202944 1300 1410 0.00396452 37.6353 cl07854 TFR_dimer superfamily - - Transferrin receptor-like dimerisation domain; This domain is involved in dimerisation of the transferrin receptor as shown in its crystal structure. Q#11792 - CGI_10020816 superfamily 216599 89 496 0 580.001 cl18372 B56 superfamily - - "Protein phosphatase 2A regulatory B subunit (B56 family); Protein phosphatase 2A (PP2A) is a major intracellular protein phosphatase that regulates multiple aspects of cell growth and metabolism. The ability of this widely distributed heterotrimeric enzyme to act on a diverse array of substrates is largely controlled by the nature of its regulatory B subunit. There are multiple families of B subunits (See also pfam01240), this family is called the B56 family." Q#11794 - CGI_10020818 superfamily 241610 53 107 1.58E-08 46.8594 cl00101 KU superfamily - - BPTI/Kunitz family of serine protease inhibitors; Structure is a disulfide rich alpha+beta fold. BPTI (bovine pancreatic trypsin inhibitor) is an extensively studied model structure. Q#11794 - CGI_10020818 superfamily 241610 15 52 0.00977297 30.681 cl00101 KU superfamily N - BPTI/Kunitz family of serine protease inhibitors; Structure is a disulfide rich alpha+beta fold. BPTI (bovine pancreatic trypsin inhibitor) is an extensively studied model structure. 
Q#11797 - CGI_10020821 superfamily 241563 64 96 3.68E-05 41.504 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#11797 - CGI_10020821 superfamily 245010 103 197 0.00930524 34.9011 cl09111 Prefoldin superfamily - - "Prefoldin is a hexameric molecular chaperone complex, found in both eukaryotes and archaea, that binds and stabilizes newly synthesized polypeptides allowing them to fold correctly. The complex contains two alpha and four beta subunits, the two subunits being evolutionarily related. In archaea, there is usually only one gene for each subunit while in eukaryotes there two or more paralogous genes encoding each subunit adding heterogeneity to the structure of the hexamer. The structure of the complex consists of a double beta barrel assembly with six protruding coiled-coils." Q#11799 - CGI_10020823 superfamily 241563 77 109 5.82E-05 41.1188 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#11799 - CGI_10020823 superfamily 245010 110 210 0.000337816 39.1383 cl09111 Prefoldin superfamily - - "Prefoldin is a hexameric molecular chaperone complex, found in both eukaryotes and archaea, that binds and stabilizes newly synthesized polypeptides allowing them to fold correctly. The complex contains two alpha and four beta subunits, the two subunits being evolutionarily related. In archaea, there is usually only one gene for each subunit while in eukaryotes there two or more paralogous genes encoding each subunit adding heterogeneity to the structure of the hexamer. 
The structure of the complex consists of a double beta barrel assembly with six protruding coiled-coils." Q#11800 - CGI_10020824 superfamily 241563 64 96 5.25E-05 41.1188 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#11800 - CGI_10020824 superfamily 245010 103 197 0.00478897 35.6715 cl09111 Prefoldin superfamily - - "Prefoldin is a hexameric molecular chaperone complex, found in both eukaryotes and archaea, that binds and stabilizes newly synthesized polypeptides allowing them to fold correctly. The complex contains two alpha and four beta subunits, the two subunits being evolutionarily related. In archaea, there is usually only one gene for each subunit while in eukaryotes there two or more paralogous genes encoding each subunit adding heterogeneity to the structure of the hexamer. The structure of the complex consists of a double beta barrel assembly with six protruding coiled-coils." Q#11804 - CGI_10020828 superfamily 241610 23 75 6.66E-22 86.9202 cl00101 KU superfamily - - BPTI/Kunitz family of serine protease inhibitors; Structure is a disulfide rich alpha+beta fold. BPTI (bovine pancreatic trypsin inhibitor) is an extensively studied model structure. Q#11804 - CGI_10020828 superfamily 241610 1 23 0.000120229 39.2041 cl00101 KU superfamily N - BPTI/Kunitz family of serine protease inhibitors; Structure is a disulfide rich alpha+beta fold. BPTI (bovine pancreatic trypsin inhibitor) is an extensively studied model structure. 
Q#11805 - CGI_10020829 superfamily 241563 64 96 4.27E-05 41.504 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#11805 - CGI_10020829 superfamily 245010 103 197 0.0035251 36.0567 cl09111 Prefoldin superfamily - - "Prefoldin is a hexameric molecular chaperone complex, found in both eukaryotes and archaea, that binds and stabilizes newly synthesized polypeptides allowing them to fold correctly. The complex contains two alpha and four beta subunits, the two subunits being evolutionarily related. In archaea, there is usually only one gene for each subunit while in eukaryotes there two or more paralogous genes encoding each subunit adding heterogeneity to the structure of the hexamer. The structure of the complex consists of a double beta barrel assembly with six protruding coiled-coils." Q#11807 - CGI_10020831 superfamily 248312 2 105 0.00178108 35.7921 cl17758 PMP22_Claudin superfamily N - PMP-22/EMP/MP20/Claudin family; PMP-22/EMP/MP20/Claudin family. Q#11812 - CGI_10020836 superfamily 201590 212 286 5.00E-07 49.5186 cl03090 HH_signal superfamily N - "Hedgehog amino-terminal signalling domain; For the carboxyl Hint module, see pfam01079. Hedgehog is a family of secreted signal molecules required for embryonic cell differentiation." Q#11814 - CGI_10020838 superfamily 247724 2594 2758 9.66E-91 296.306 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. 
This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#11814 - CGI_10020838 superfamily 243185 2770 2857 9.46E-31 120.667 cl02787 Translation_Factor_II_like superfamily - - "Translation_Factor_II_like: Elongation factor Tu (EF-Tu) domain II-like proteins. Elongation factor Tu consists of three structural domains, this family represents the second domain. Domain II adopts a beta barrel structure and is involved in binding to charged tRNA. Domain II is found in other proteins such as elongation factor G and translation initiation factor IF-2. This group also includes the C2 subdomain of domain IV of IF-2 that has the same fold as domain II of (EF-Tu). Like IF-2 from certain prokaryotes such as Thermus thermophilus, mitochondrial IF-2 lacks domain II, which is thought to be involved in binding of E.coli IF-2 to 30S subunits." Q#11814 - CGI_10020838 superfamily 243185 3030 3117 7.51E-26 106.031 cl02787 Translation_Factor_II_like superfamily - - "Translation_Factor_II_like: Elongation factor Tu (EF-Tu) domain II-like proteins. Elongation factor Tu consists of three structural domains, this family represents the second domain. 
Domain II adopts a beta barrel structure and is involved in binding to charged tRNA. Domain II is found in other proteins such as elongation factor G and translation initiation factor IF-2. This group also includes the C2 subdomain of domain IV of IF-2 that has the same fold as domain II of (EF-Tu). Like IF-2 from certain prokaryotes such as Thermus thermophilus, mitochondrial IF-2 lacks domain II, which is thought to be involved in binding of E.coli IF-2 to 30S subunits." Q#11814 - CGI_10020838 superfamily 247038 273 333 7.70E-05 44.3677 cl15674 IPT superfamily N - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." Q#11814 - CGI_10020838 superfamily 245201 32 272 3.44E-142 448.81 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#11814 - CGI_10020838 superfamily 220608 340 459 6.23E-35 133.585 cl10859 G8 superfamily - - G8 domain; This domain is found in disease proteins PKHD1 and KIAA1199 and is named G8 after its 8 conserved glycines. 
It is predicted to contain 10 beta strands and an alpha helix. Q#11814 - CGI_10020838 superfamily 221359 2923 3011 4.74E-26 107.532 cl13429 IF-2 superfamily - - "Translation-initiation factor 2; IF-2 is a translation initiator in each of the three main phylogenetic domains (Eukaryotes, Bacteria and Archaea). IF2 interacts with formylmethionine-tRNA, GTP, IF1, IF3 and both ribosomal subunits. Through these interactions, IF2 promotes the binding of the initiator tRNA to the A site in the smaller ribosomal subunit and catalyzes the hydrolysis of GTP following initiation-complex formation." Q#11814 - CGI_10020838 superfamily 220608 1162 1287 1.51E-22 97.7618 cl10859 G8 superfamily - - G8 domain; This domain is found in disease proteins PKHD1 and KIAA1199 and is named G8 after its 8 conserved glycines. It is predicted to contain 10 beta strands and an alpha helix. Q#11820 - CGI_10024411 superfamily 110440 85 109 0.000317062 35.4613 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#11820 - CGI_10024411 superfamily 110440 124 151 0.000332348 35.4613 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. 
Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#11821 - CGI_10024412 superfamily 245847 49 104 0.000975553 37.8672 cl12042 FA58C superfamily C - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#11822 - CGI_10024413 superfamily 243066 4 93 2.92E-24 96.0828 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#11822 - CGI_10024413 superfamily 219619 315 391 1.12E-11 60.6843 cl18518 Ion_trans_2 superfamily - - Ion channel; This family includes the two membrane helix type ion channels found in bacteria. Q#11824 - CGI_10024415 superfamily 243035 216 283 3.48E-16 72.2673 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. 
This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. 
In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#11827 - CGI_10024419 superfamily 241578 287 454 2.85E-43 153.91 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#11827 - CGI_10024419 superfamily 207701 34 147 4.39E-15 72.7759 cl02699 VIT superfamily - - Vault protein inter-alpha-trypsin domain; Inter-alpha-trypsin inhibitors (ITIs) consist of one light chain and a variable set of heavy chains. ITIs play a role in extracellular matrix (ECM) stabilisation and tumour metastasis as well as in plasma protease inhibition. 
The vault protein inter-alpha-trypsin (VIT) domain described here is found to the N-terminus of a von Willebrand factor type A domain (pfam00092) in ITI heavy chains (ITIHs) and their precursors. Q#11828 - CGI_10024420 superfamily 245814 119 179 1.14E-05 43.2629 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#11831 - CGI_10024423 superfamily 247792 679 724 6.20E-13 64.7744 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#11833 - CGI_10024425 superfamily 246681 17 135 9.61E-16 70.0762 cl14643 SRPBCC superfamily C - "START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC (SRPBCC) ligand-binding domain superfamily; SRPBCC domains have a deep hydrophobic ligand-binding pocket; they bind diverse ligands. 
Included in this superfamily are the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of mammalian STARD1-STARD15, and the C-terminal catalytic domains of the alpha oxygenase subunit of Rieske-type non-heme iron aromatic ring-hydroxylating oxygenases (RHOs_alpha_C), as well as the SRPBCC domains of phosphatidylinositol transfer proteins (PITPs), Bet v 1 (the major pollen allergen of white birch, Betula verrucosa), CoxG, CalC, and related proteins. Other members of this superfamily include PYR/PYL/RCAR plant proteins, the aromatase/cyclase (ARO/CYC) domains of proteins such as Streptomyces glaucescens tetracenomycin, and the SRPBCC domains of Streptococcus mutans Smu.440 and related proteins." Q#11834 - CGI_10024426 superfamily 247856 46 99 0.00187779 33.6753 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#11842 - CGI_10024434 superfamily 246664 2 477 0 667.816 cl14561 An_peroxidase_like superfamily - - "Animal heme peroxidases and related proteins; A diverse family of enzymes, which includes prostaglandin G/H synthase, thyroid peroxidase, myeloperoxidase, linoleate diol synthase, lactoperoxidase, peroxinectin, peroxidasin, and others. Despite its name, this family is not restricted to metazoans: members are found in fungi, plants, and bacteria as well." Q#11843 - CGI_10024435 superfamily 149105 606 678 0.00116859 40.4961 cl12353 TMPIT superfamily C - "TMPIT-like protein; A number of members of this family are annotated as being transmembrane proteins induced by tumour necrosis factor alpha, but no literature was found to support this." 
Q#11844 - CGI_10024436 superfamily 241600 45 206 3.36E-21 87.681 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#11845 - CGI_10024437 superfamily 243074 54 100 8.58E-14 67.1465 cl02535 F-box-like superfamily - - F-box-like; This is an F-box-like family. Q#11845 - CGI_10024437 superfamily 243092 674 737 1.57E-07 52.3372 cl02567 WD40 superfamily NC - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the 
top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#11845 - CGI_10024437 superfamily 243092 135 168 2.42E-07 48.4626 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#11846 - CGI_10024438 superfamily 243034 40 134 2.68E-08 51.9972 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#11846 - CGI_10024438 superfamily 243034 194 287 2.57E-07 48.9156 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number 
of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#11846 - CGI_10024438 superfamily 243034 259 330 0.00022748 39.6708 cl02429 TPR superfamily C - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 
subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#11846 - CGI_10024438 superfamily 245875 428 450 1.40E-05 42.6909 cl12112 GoLoco superfamily - - GoLoco motif; GoLoco motif. Q#11846 - CGI_10024438 superfamily 245875 519 539 2.34E-05 41.9205 cl12112 GoLoco superfamily - - GoLoco motif; GoLoco motif. Q#11846 - CGI_10024438 superfamily 245875 552 574 4.28E-05 41.1501 cl12112 GoLoco superfamily - - GoLoco motif; GoLoco motif. Q#11846 - CGI_10024438 superfamily 245875 473 494 0.000421362 38.4537 cl12112 GoLoco superfamily - - GoLoco motif; GoLoco motif. Q#11848 - CGI_10024440 superfamily 247727 263 359 5.20E-07 47.4247 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#11849 - CGI_10024441 superfamily 241640 470 647 1.24E-41 150.504 cl00149 Tryp_SPc superfamily N - Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues. Q#11849 - CGI_10024441 superfamily 241640 246 473 4.45E-34 129.704 cl00149 Tryp_SPc superfamily - - Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. 
Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues. Q#11849 - CGI_10024441 superfamily 243066 31 125 4.65E-14 68.7981 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#11851 - CGI_10024443 superfamily 219619 382 450 6.21E-19 81.8703 cl18518 Ion_trans_2 superfamily - - Ion channel; This family includes the two membrane helix type ion channels found in bacteria. Q#11851 - CGI_10024443 superfamily 219619 163 216 7.39E-08 50.284 cl18518 Ion_trans_2 superfamily N - Ion channel; This family includes the two membrane helix type ion channels found in bacteria. Q#11854 - CGI_10024446 superfamily 241680 37 241 5.92E-41 144.318 cl00200 MIP superfamily - - "Major intrinsic protein (MIP) superfamily. Members of the MIP superfamily function as membrane channels that selectively transport water, small neutral molecules, and ions out of and between cells. The channel proteins share a common fold: the N-terminal cytosolic portion followed by six transmembrane helices, which might have arisen through gene duplication. 
On the basis of sequence similarity and functional characteristics, the superfamily can be subdivided into two major groups: water-selective channels called aquaporins (AQPs) and glycerol uptake facilitators (GlpFs). AQPs are found in all three kingdoms of life, while GlpFs have been characterized only within microorganisms." Q#11855 - CGI_10024447 superfamily 241680 37 243 3.44E-42 146.244 cl00200 MIP superfamily - - "Major intrinsic protein (MIP) superfamily. Members of the MIP superfamily function as membrane channels that selectively transport water, small neutral molecules, and ions out of and between cells. The channel proteins share a common fold: the N-terminal cytosolic portion followed by six transmembrane helices, which might have arisen through gene duplication. On the basis of sequence similarity and functional characteristics, the superfamily can be subdivided into two major groups: water-selective channels called aquaporins (AQPs) and glycerol uptake facilitators (GlpFs). AQPs are found in all three kingdoms of life, while GlpFs have been characterized only within microorganisms." Q#11856 - CGI_10024448 superfamily 241550 217 361 1.22E-63 209.75 cl00015 nt_trans superfamily N - "nucleotidyl transferase superfamily; nt_trans (nucleotidyl transferase) This superfamily includes the class I amino-acyl tRNA synthetases, pantothenate synthetase (PanC), ATP sulfurylase, and the cytidylyltransferases, all of which have a conserved dinucleotide-binding domain." Q#11856 - CGI_10024448 superfamily 241550 32 130 4.53E-51 176.238 cl00015 nt_trans superfamily C - "nucleotidyl transferase superfamily; nt_trans (nucleotidyl transferase) This superfamily includes the class I amino-acyl tRNA synthetases, pantothenate synthetase (PanC), ATP sulfurylase, and the cytidylyltransferases, all of which have a conserved dinucleotide-binding domain." 
Q#11857 - CGI_10024449 superfamily 219999 2 125 2.21E-15 69.7385 cl18540 Nse4 superfamily C - Nse4; Nse4 is a component of the Smc5/6 DNA repair complex. It forms interactions with Smc5 and Nse1. Q#11858 - CGI_10024450 superfamily 219999 35 302 2.96E-52 181.446 cl18540 Nse4 superfamily - - Nse4; Nse4 is a component of the Smc5/6 DNA repair complex. It forms interactions with Smc5 and Nse1. Q#11859 - CGI_10024451 superfamily 245814 65 144 0.00685489 34.7723 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#11859 - CGI_10024451 superfamily 245814 165 248 0.000997832 37.0997 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#11860 - CGI_10024452 superfamily 243061 2 71 4.57E-15 64.6706 cl02509 SRCR superfamily N - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. 
These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#11863 - CGI_10004311 superfamily 217293 28 226 6.02E-30 115.037 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#11863 - CGI_10004311 superfamily 202474 233 311 1.10E-12 65.7529 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#11868 - CGI_10001561 superfamily 241578 118 268 4.98E-27 105.066 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#11868 - CGI_10001561 superfamily 245213 37 72 4.96E-10 54.565 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#11868 - CGI_10001561 superfamily 245213 1 34 4.90E-09 51.8686 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#11868 - CGI_10001561 superfamily 245213 74 110 3.98E-08 49.1722 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#11868 - CGI_10001561 superfamily 241578 304 387 5.13E-07 47.6714 cl00057 vWFA superfamily C - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). 
Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#11869 - CGI_10023652 superfamily 246723 22 591 0 812.952 cl14813 GluZincin superfamily - - "Peptidase Gluzincin family (thermolysin-like proteinases, TLPs) includes peptidases M1, M2, M3, M4, M13, M32 and M36 (fungalysins); Gluzincin family (thermolysin-like peptidases or TLPs) includes several zinc-dependent metallopeptidases such as the M1, M2, M3, M4, M13, M32, M36 peptidases (MEROPS classification), and contain HEXXH and EXXXD motifs as part of their active site. All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. M1 family includes aminopeptidase N (APN) and leukotriene A4 hydrolase (LTA4H). APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types. LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity such that the two activities occupy different, but overlapping sites. 
The peptidase M3 or neurolysin-like family, includes M3, M2 and M32 metallopeptidases. The M3 peptidases have two subfamilies: M3A, includes thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (3.4.24.16), and the mitochondrial intermediate peptidase; M3B contains oligopeptidase F. M2 peptidase angiotensin converting enzyme (ACE, EC 3.4.15.1) catalyzes the conversion of decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II. ACE is a key part of the renin-angiotensin system that regulates blood pressure, thus ACE inhibitors are important for the treatment of hypertension. M32 family includes two eukaryotic enzymes from protozoa Trypanosoma cruzi, a causative agent of Chagas' disease, and Leishmania major, a parasite that causes leishmaniasis, making them attractive targets for drug development. The M4 family includes secreted protease thermolysin (EC 3.4.24.27), pseudolysin, aureolysin, neutral protease as well as fungalysin and bacillolysin (EC 3.4.24.28) that degrade extracellular proteins and peptides for bacterial nutrition, especially prior to sporulation. Thermolysin is widely used as a nonspecific protease to obtain fragments for peptide sequencing as well as in production of the artificial sweetener aspartame. M13 family includes neprilysin (EC 3.4.24.11) and endothelin-converting enzyme I (ECE-1, EC 3.4.24.71), which fulfill a broad range of physiological roles due to the greater variation in the S2' subsite allowing substrate specificity and are prime therapeutic targets for selective inhibition. Peptidase M36 (fungalysin) family includes endopeptidases from pathogenic fungi. Fungalysin hydrolyzes extracellular matrix proteins such as elastin and keratin. Aspergillus fumigatus causes the pulmonary disease aspergillosis by invading the lungs of immuno-compromised animals and secreting fungalysin that possibly breaks down proteinaceous structural barriers." 
Q#11870 - CGI_10023653 superfamily 247775 56 496 2.64E-64 216.297 cl17221 ArsB_NhaD_permease superfamily - - "Anion permease ArsB/NhaD. These permeases have been shown to translocate sodium, arsenate, antimonite, sulfate and organic anions across biological membranes in all three kingdoms of life. A typical anion permease contains 8-13 transmembrane helices and can function either independently as a chemiosmotic transporter or as a channel-forming subunit of an ATP-driven anion pump." Q#11872 - CGI_10023655 superfamily 247775 39 565 9.39E-48 174.066 cl17221 ArsB_NhaD_permease superfamily - - "Anion permease ArsB/NhaD. These permeases have been shown to translocate sodium, arsenate, antimonite, sulfate and organic anions across biological membranes in all three kingdoms of life. A typical anion permease contains 8-13 transmembrane helices and can function either independently as a chemiosmotic transporter or as a channel-forming subunit of an ATP-driven anion pump." Q#11874 - CGI_10023657 superfamily 131388 34 252 2.86E-12 66.8469 cl17985 hydr_PhnA superfamily C - "phosphonoacetate hydrolase; This family consists of examples of phosphonoacetate hydrolase, an enzyme specific for the cleavage of the C-P bond in phosphonoacetate. Phosphonates are organic compounds with a direct C-P bond that is far less labile than the C-O-P bonds of phosphate attachment sites. Phosphonates may be degraded for phosphorus and energy by broad spectrum C-P lyase encoded by a large operon or by specific enzymes for some of the more common phosphonates in nature. This family represents an enzyme from the latter category. It may be found encoded near genes for phosphonate transport and for other specific phosphonatases." 
Q#11875 - CGI_10023658 superfamily 247866 46 274 5.65E-16 76.7224 cl17312 PhyH superfamily - - "Phytanoyl-CoA dioxygenase (PhyH); This family is made up of several eukaryotic phytanoyl-CoA dioxygenase (PhyH) proteins, ectoine hydroxylases and a number of bacterial deoxygenases. PhyH is a peroxisomal enzyme catalyzing the first step of phytanic acid alpha-oxidation. PhyH deficiency causes Refsum's disease (RD) which is an inherited neurological syndrome biochemically characterized by the accumulation of phytanic acid in plasma and tissues." Q#11875 - CGI_10023658 superfamily 243179 422 557 4.57E-11 60.2391 cl02781 tetraspanin_LEL superfamily - - "Tetraspanin, extracellular domain or large extracellular loop (LEL). Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. The tetraspanin family contains CD9, CD63, CD37, CD53, CD82, CD151, and CD81, amongst others. Tetraspanins are involved in diverse processes such as cell activation and proliferation, adhesion and motility, differentiation, cancer, and others. Their various functions may relate to their ability to act as molecular facilitators, grouping specific cell-surface proteins and affecting formation and stability of signaling complexes. Tetraspanins associate laterally with one another and cluster dynamically with numerous parnter domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web", which may also include integrins." Q#11878 - CGI_10023661 superfamily 241607 94 129 2.73E-08 47.6498 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chyomotrypsin, avian ovomucoids, and elastases. 
The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#11878 - CGI_10023661 superfamily 241607 135 169 1.19E-05 40.331 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chyomotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. 
Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#11878 - CGI_10023661 superfamily 241607 166 199 0.000512261 36.1282 cl00097 KAZAL_FS superfamily C - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chyomotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. 
Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#11884 - CGI_10023667 superfamily 247057 554 622 1.35E-20 87.7428 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#11884 - CGI_10023667 superfamily 247057 623 686 2.90E-22 92.3571 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. 
They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#11884 - CGI_10023667 superfamily 248012 715 826 3.51E-17 78.7736 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#11886 - CGI_10023669 superfamily 220395 26 200 6.34E-43 144.832 cl10757 Guanylate_cyc_2 superfamily - - "Guanylylate cyclase; Members of this family of proteins catalyze the conversion of guanosine triphosphate (GTP) to 3',5'-cyclic guanosine monophosphate (cGMP) and pyrophosphate." Q#11887 - CGI_10023670 superfamily 241555 5 97 8.15E-38 126.505 cl00020 GAT_1 superfamily C - "Type 1 glutamine amidotransferase (GATase1)-like domain; Type 1 glutamine amidotransferase (GATase1)-like domain. This group contains proteins similar to Class I glutamine amidotransferases, the intracellular PH1704 from Pyrococcus horikoshii, the C-terminal of the large catalase: Escherichia coli HP-II, Sinorhizobium meliloti Rm1021 ThuA, the A4 beta-galactosidase middle domain and peptidase E. The majority of proteins in this group have a reactive Cys found in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow. For Class I glutamine amidotransferases proteins which transfer ammonia from the amide side chain of glutamine to an acceptor substrate, this Cys forms a Cys-His-Glu catalytic triad in the active site. 
Glutamine amidotransferases activity can be found in a range of biosynthetic enzymes included in this cd: glutamine amidotransferase, formylglycinamide ribonucleotide, GMP synthetase, anthranilate synthase component II, glutamine-dependent carbamoyl phosphate synthase (CPSase), cytidine triphosphate synthetase, gamma-glutamyl hydrolase, imidazole glycerol phosphate synthase and, cobyric acid synthase. For Pyrococcus horikoshii PH1704, the Cys of the nucleophile elbow together with a different His and, a Glu from an adjacent monomer form a catalytic triad different from the typical GATase1 triad. Peptidase E is believed to be a serine peptidase having a Ser-His-Glu catalytic triad which differs from the Cys-His-Glu catalytic triad of typical GATase1 domains, by having a Ser in place of the reactive Cys at the nucleophile elbow. The E. coli HP-II C-terminal domain, S. meliloti Rm1021 ThuA and the A4 beta-galactosidase middle domain lack the catalytic triad typical of GATase1 domains. GATase1-like domains can occur either as single polypeptides, as in Class I glutamine amidotransferases, or as domains in a much larger multifunctional synthase protein, such as CPSase. Peptidase E has a circular permutation in the common core of a typical GATase1 domain." Q#11888 - CGI_10023671 superfamily 247724 25 181 3.07E-47 154.921 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#11889 - CGI_10023672 superfamily 247724 21 179 1.38E-49 160.699 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. 
Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#11890 - CGI_10023673 superfamily 248458 95 204 0.00017449 42.3009 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." 
Q#11893 - CGI_10023676 superfamily 245201 674 913 2.41E-39 146.228 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#11896 - CGI_10023679 superfamily 241832 52 255 5.42E-66 210.225 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#11896 - CGI_10023679 superfamily 241832 253 371 9.58E-49 165.157 cl00388 Thioredoxin_like superfamily N - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. 
They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#11898 - CGI_10023681 superfamily 243066 40 115 6.62E-09 49.1529 cl02518 BTB superfamily C - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#11901 - CGI_10023684 superfamily 216063 868 1046 1.53E-44 160.862 cl02929 Cation_ATPase_C superfamily - - "Cation transporting ATPase, C-terminus; Members of this families are involved in Na+/K+, H+/K+, Ca++ and Mg++ transport. This family represents 5 transmembrane helices." 
Q#11901 - CGI_10023684 superfamily 215733 141 274 9.40E-32 125.37 cl02811 E1-E2_ATPase superfamily C - E1-E2 ATPase; E1-E2 ATPase. Q#11901 - CGI_10023684 superfamily 222006 503 590 1.57E-20 88.8186 cl16182 Hydrolase_like2 superfamily - - Putative hydrolase of sodium-potassium ATPase alpha subunit; This is a putative hydrolase of the sodium-potassium ATPase alpha subunit. Q#11901 - CGI_10023684 superfamily 215733 357 469 2.22E-15 76.4498 cl02811 E1-E2_ATPase superfamily N - E1-E2 ATPase; E1-E2 ATPase. Q#11901 - CGI_10023684 superfamily 152858 1088 1121 2.61E-12 64.3976 cl13811 ATP_Ca_trans_C superfamily C - "Plasma membrane calcium transporter ATPase C terminal; This domain family is found in eukaryotes, and is approximately 60 amino acids in length. The family is found in association with pfam00689, pfam00122, pfam00702, pfam00690. There is a conserved QTQ sequence motif. This family is the C terminal of a calcium transporting ATPase located in the plasma membrane." Q#11901 - CGI_10023684 superfamily 243244 43 102 2.43E-08 52.923 cl02930 Cation_ATPase_N superfamily - - "Cation transporter/ATPase, N-terminus; Members of this families are involved in Na+/K+, H+/K+, Ca++ and Mg++ transport." Q#11902 - CGI_10023685 superfamily 245864 39 501 1.15E-102 318.068 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. 
These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#11903 - CGI_10023686 superfamily 241596 67 108 6.86E-10 52.9867 cl00081 HLH superfamily N - "Helix-loop-helix domain, found in specific DNA- binding proteins that act as transcription factors; 60-100 amino acids long. A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE (5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." 
Q#11906 - CGI_10023689 superfamily 241563 60 95 0.000953966 37.3167 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#11911 - CGI_10023694 superfamily 243035 29 151 2.35E-05 39.9106 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. 
C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#11912 - CGI_10023695 superfamily 243035 110 226 4.77E-08 49.5406 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#11913 - CGI_10023696 superfamily 222429 13 56 0.00456991 31.8273 cl18676 Myb_DNA-bind_5 superfamily C - Myb/SANT-like DNA-binding domain; This presumed domain appears to be related to other Myb/SANT like DNA binding domains. This family is greatly expanded in arthropods and higher eukaryotes. Q#11914 - CGI_10014965 superfamily 241823 21 183 8.71E-90 268.677 cl00376 Ribosomal_L10_P0 superfamily - - "Ribosomal protein L10 family; composed of the large subunit ribosomal protein called L10 in bacteria, P0 in eukaryotes, and L10e in archaea, as well as uncharacterized P0-like eukaryotic proteins. In all three kingdoms, L10 forms a tight complex with multiple copies of the small acidic protein L12(e). This complex forms a stalk structure on the large subunit of the ribosome. The N-terminal domain (NTD) of L10 interacts with L11 protein and forms the base of the L7/L12 stalk, while the extended C-terminal helix binds to two or three dimers of the NTD of L7/L12 (L7 and L12 are identical except for an acetylated N-terminus). The L7/L12 stalk is known to contain the binding site for elongation factors G and Tu (EF-G and EF-Tu, respectively); however, there is disagreement as to whether or not L10 is involved in forming the binding site. The stalk is believed to be associated with GTPase activities in protein synthesis. In a neuroblastoma cell line, L10 has been shown to interact with the SH3 domain of Src and to activate the binding of the Nck1 adaptor protein with skeletal proteins such as the Wiskott-Aldrich Syndrome Protein (WASP) and the WASP-interacting protein (WIP). Some eukaryotic P0 sequences have an additional C-terminal domain homologous with acidic proteins P1 and P2." 
Q#11915 - CGI_10014966 superfamily 242880 6 285 2.05E-103 304.499 cl02098 14-3-3 superfamily - - "14-3-3 domain; 14-3-3 domain is an essential part of 14-3-3 proteins, a ubiquitous class of regulatory, phosphoserine/threonine-binding proteins found in all eukaryotic cells, including yeast, protozoa and mammalian cells. 14-3-3 proteins play important roles in many biological processes that are regulated by phosphorylation, including cell cycle regulation, cell proliferation, protein trafficking, metabolic regulation and apoptosis. More than 300 binding partners of the 14-3-3 domain have been identified in all subcellular compartments and include transcription factors, signaling molecules, tumor suppressors, biosynthetic enzymes, cytoskeletal proteins and apoptosis factors. 14-3-3 binding can alter the conformation, localization, stability, phosphorylation state, activity as well as molecular interactions of a target protein. They function only as dimers, some preferring strictly homodimeric interaction, while others form heterodimers. Binding of the 14-3-3 domain to its target occurs in a phosphospecific manner where it binds to one of two consensus sequences of their target proteins; RSXpSXP (mode-1) and RXXXpSXP (mode-2). In some instances, 14-3-3 domain containing proteins are involved in regulation and signaling of a number of cellular processes in phosphorylation-independent manner. Many organisms express multiple isoforms: there are seven mammalian 14-3-3 family members (beta, gamma, eta, theta, epsilon, sigma, zeta), each encoded by a distinct gene, while plants contain up to 13 isoforms. The flexible C-terminal segment of 14-3-3 isoforms shows the highest sequence variability and may significantly contribute to individual isoform uniqueness by playing an important regulatory role by occupying the ligand binding groove and blocking the binding of inappropriate ligands in a distinct manner. 
Elevated amounts of 14-3-3 proteins are found in the cerebrospinal fluid of patients with Creutzfeldt-Jakob disease. In protozoa, like Plasmodium or Cryptosporidium parvum 14-3-3 proteins play an important role in key steps of parasite development." Q#11916 - CGI_10014967 superfamily 248458 541 685 1.81E-10 61.9461 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." 
Q#11916 - CGI_10014967 superfamily 248458 76 137 0.000731749 41.1453 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." 
Q#11917 - CGI_10014968 superfamily 245596 75 287 2.30E-59 195.82 cl11394 Glyco_tranf_GTA_type superfamily - - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." 
Q#11918 - CGI_10014969 superfamily 247792 234 274 7.55E-06 42.4328 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#11918 - CGI_10014969 superfamily 245716 167 196 0.00410321 34.4563 cl11592 zf-CCCH superfamily - - Zinc finger C-x8-C-x5-C-x3-H type (and similar); Zinc finger C-x8-C-x5-C-x3-H type (and similar). Q#11919 - CGI_10014970 superfamily 247724 57 425 1.59E-118 351.058 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. 
The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#11920 - CGI_10014971 superfamily 247724 57 424 1.19E-118 351.444 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#11922 - CGI_10014973 superfamily 247724 445 803 1.05E-115 356.066 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. 
The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#11922 - CGI_10014973 superfamily 247724 57 410 1.24E-102 321.783 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#11923 - CGI_10014974 superfamily 247724 82 336 2.54E-62 203.142 cl17170 Ras_like_GTPase superfamily N - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. 
Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#11923 - CGI_10014974 superfamily 247724 57 79 6.55E-08 52.1436 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#11925 - CGI_10014976 superfamily 243088 95 219 2.65E-46 158.309 cl02563 PX_domain superfamily - - "The Phox Homology domain, a phosphoinositide binding module; The PX domain is a phosphoinositide (PI) binding module involved in targeting proteins to membranes. 
Proteins containing PX domains interact with PIs and have been implicated in highly diverse functions such as cell signaling, vesicular trafficking, protein sorting, lipid modification, cell polarity and division, activation of T and B cells, and cell survival. Many members of this superfamily bind phosphatidylinositol-3-phosphate (PI3P) but in some cases, other PIs such as PI4P or PI(3,4)P2, among others, are the preferred substrates. In addition to protein-lipid interaction, the PX domain may also be involved in protein-protein interaction, as in the cases of p40phox, p47phox, and some sorting nexins (SNXs). The PX domain is conserved from yeast to humans and is found in more than 100 proteins. The majority of PX domain-containing proteins are SNXs, which play important roles in endosomal sorting." Q#11925 - CGI_10014976 superfamily 149621 356 459 6.36E-19 82.3324 cl07303 Nexin_C superfamily - - Sorting nexin C terminal; This region is found at the C terminal of proteins belonging to the sorting nexin family. It is found on proteins which also contain pfam00787. Q#11926 - CGI_10014977 superfamily 243089 66 238 1.10E-40 143.168 cl02564 PXA superfamily - - PXA domain; This domain is associated with PX domains pfam00787. Q#11926 - CGI_10014977 superfamily 243090 279 357 7.19E-28 106.271 cl02565 RGS superfamily C - "Regulator of G protein signaling (RGS) domain superfamily; The RGS domain is an essential part of the Regulator of G-protein Signaling (RGS) protein family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play critical regulatory roles as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. While inactive, G-alpha-subunits bind GDP, which is released and replaced by GTP upon agonist activation. 
GTP binding leads to dissociation of the alpha-subunit and the beta-gamma-dimer, allowing them to interact with effector molecules and propagate signaling cascades associated with cellular growth, survival, migration, and invasion. Deactivation of the G-protein signaling controlled by the RGS domain accelerates GTPase activity of the alpha subunit by hydrolysis of GTP to GDP, which results in the reassociation of the alpha-subunit with the beta-gamma-dimer and thereby inhibition of downstream activity. As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. RGS proteins are also involved in apoptosis and cell proliferation, as well as modulation of cardiac development. Several RGS proteins can fine-tune immune responses, while others play important roles in neuronal signals modulation. Some RGS proteins are principal elements needed for proper vision." 
Q#11928 - CGI_10014979 superfamily 243092 339 626 2.87E-87 276.138 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#11929 - CGI_10014980 superfamily 222150 282 307 6.73E-05 40.8381 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#11929 - CGI_10014980 superfamily 222150 198 222 0.00142145 36.6009 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#11929 - CGI_10014980 superfamily 222150 338 363 0.00260379 35.8305 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#11932 - CGI_10014983 superfamily 247068 60 154 4.04E-12 62.3309 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#11932 - CGI_10014983 superfamily 247068 262 340 1.08E-09 55.3974 cl15786 CA_like superfamily C - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#11932 - CGI_10014983 superfamily 247068 173 254 1.13E-05 43.4562 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#11935 - CGI_10018371 superfamily 241570 43 180 3.50E-13 66.9658 cl00047 CAP_ED superfamily - - "effector domain of the CAP family of transcription factors; members include CAP (or cAMP receptor protein (CRP)), which binds cAMP, FNR (fumarate and nitrate reduction), which uses an iron-sulfur cluster to sense oxygen) and CooA, a heme containing CO sensor. In all cases binding of the effector leads to conformational changes and the ability to activate transcription. Cyclic nucleotide-binding domain similar to CAP are also present in cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) and vertebrate cyclic nucleotide-gated ion-channels. 
Cyclic nucleotide-monophosphate binding domain; proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues; the best studied is the prokaryotic catabolite gene activator, CAP, where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure; three conserved glycine residues are thought to be essential for maintenance of the structural integrity of the beta-barrel; CooA is a homodimeric transcription factor that belongs to CAP family; cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain; cAPK's are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain; cGPK's are single chain enzymes that include the two copies of the domain in their N-terminal section; also found in vertebrate cyclic nucleotide-gated ion-channels" Q#11935 - CGI_10018371 superfamily 241570 198 248 0.00474797 36.1498 cl00047 CAP_ED superfamily C - "effector domain of the CAP family of transcription factors; members include CAP (or cAMP receptor protein (CRP)), which binds cAMP, FNR (fumarate and nitrate reduction), which uses an iron-sulfur cluster to sense oxygen) and CooA, a heme containing CO sensor. In all cases binding of the effector leads to conformational changes and the ability to activate transcription. Cyclic nucleotide-binding domain similar to CAP are also present in cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) and vertebrate cyclic nucleotide-gated ion-channels. 
Cyclic nucleotide-monophosphate binding domain; proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues; the best studied is the prokaryotic catabolite gene activator, CAP, where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure; three conserved glycine residues are thought to be essential for maintenance of the structural integrity of the beta-barrel; CooA is a homodimeric transcription factor that belongs to CAP family; cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain; cAPK's are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain; cGPK's are single chain enzymes that include the two copies of the domain in their N-terminal section; also found in vertebrate cyclic nucleotide-gated ion-channels" Q#11937 - CGI_10018373 superfamily 115027 123 349 2.16E-14 72.0957 cl17947 DUF1057 superfamily C - Alpha/beta hydrolase of unknown function (DUF1057); This family consists of several Caenorhabditis elegans specific proteins of unknown function. Members of this family have an alpha/beta hydrolase fold. 
Q#11938 - CGI_10018374 superfamily 247792 26 71 1.90E-05 42.818 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#11938 - CGI_10018374 superfamily 128778 217 342 4.03E-08 52.2671 cl17972 BBC superfamily - - B-Box C-terminal domain; Coiled coil region C-terminal to (some) B-Box domains Q#11938 - CGI_10018374 superfamily 216033 396 469 2.87E-05 43.0912 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#11938 - CGI_10018374 superfamily 241563 171 209 0.00406147 35.918 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#11938 - CGI_10018374 superfamily 241563 104 141 0.00795071 35.1476 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#11940 - CGI_10018376 superfamily 247941 131 267 2.53E-08 51.1825 cl17387 Methyltransf_21 superfamily - - "Methyltransferase FkbM domain; This family has members from bacteria to human, and appears to be a methyltransferase." 
Q#11941 - CGI_10018377 superfamily 243034 102 134 0.00160445 33.6419 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#11942 - CGI_10018378 superfamily 243074 12 63 1.61E-06 45.9605 cl02535 F-box-like superfamily - - F-box-like; This is an F-box-like family. Q#11946 - CGI_10018382 superfamily 113205 196 321 4.60E-51 172.582 cl04510 DUF544 superfamily - - Protein of unknown function (DUF544); Eukaryotic protein of unknown function. Q#11947 - CGI_10018383 superfamily 117815 342 491 4.60E-81 255.127 cl07780 BTD superfamily - - "Beta-trefoil DNA-binding domain; Members of this family of DNA binding domains adopt a beta-trefoil fold, that is, a capped beta-barrel with internal pseudo threefold symmetry. 
In the DNA-binding protein LAG-1, it also is the site of mutually exclusive interactions with NotchIC (and the viral protein EBNA2) and co-repressors (SMRT/N-Cor and CIR)." Q#11947 - CGI_10018383 superfamily 220160 210 341 1.08E-60 200.349 cl07781 LAG1-DNAbind superfamily - - "LAG1, DNA binding; Members of this family are found in various eukaryotic hypothetical proteins and in the DNA-binding protein LAG-1. They adopt a beta sandwich structure, with nine strands in two beta-sheets, in a Greek-key topology, and allow for DNA binding. This domain is also known as RHR-N (Rel-homology region) as it related to Rel domain proteins." Q#11947 - CGI_10018383 superfamily 247038 521 609 9.96E-43 149.145 cl15674 IPT superfamily - - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." 
Q#11948 - CGI_10018384 superfamily 247792 130 179 0.0016251 35.114 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#11949 - CGI_10018385 superfamily 218280 1 111 2.38E-52 176.641 cl04781 Rad21_Rec8_N superfamily - - "N terminus of Rad21 / Rec8 like protein; This family represents a conserved N-terminal region found in eukaryotic cohesins of the Rad21, Rec8 and Scc1 families. Members of this family mediate sister chromatid cohesion during mitosis and meiosis, as part of the cohesin complex. Cohesion is necessary for homologous recombination (including double-strand break repair) and correct chromatid segregation. These proteins may also be involved in chromosome condensation. Dissociation at the metaphase to anaphase transition causes loss of cohesion and chromatid segregation." Q#11949 - CGI_10018385 superfamily 113590 586 638 3.80E-11 59.2718 cl04780 Rad21_Rec8 superfamily - - "Conserved region of Rad21 / Rec8 like protein; This family represents a conserved region found in eukaryotic cohesins of the Rad21, Rec8 and Scc1 families. Members of this family mediate sister chromatid cohesion during mitosis and meiosis, as part of the cohesin complex. Cohesion is necessary for homologous recombination (including double-strand break repair) and correct chromatid segregation. These proteins may also be involved in chromosome condensation. 
Dissociation at the metaphase to anaphase transition causes loss of cohesion and chromatid segregation." Q#11950 - CGI_10018386 superfamily 248388 311 381 0.000422178 39.6193 cl17834 Ser_hydrolase superfamily N - "Serine hydrolase; Members of this family have serine hydrolase activity. They contain a conserved serine hydrolase motif, GXSXG/A, where the serine is a putative nucleophile. This family has an alpha-beta hydrolase fold. Eukaryotic members of this family have a conserved LXCXE motif, which binds to retinoblastomas. This motif is absent from prokaryotic members of this family." Q#11952 - CGI_10018388 superfamily 217293 15 149 5.73E-19 84.2215 cl03788 Neur_chan_LBD superfamily C - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#11954 - CGI_10018390 superfamily 216554 421 563 4.32E-26 105.254 cl15977 zf-DHHC superfamily N - DHHC palmitoyltransferase; This family includes the well known DHHC zinc binding domain as well as three of the four conserved transmembrane regions found in this family of palmitoyltransferase enzymes. Q#11954 - CGI_10018390 superfamily 216554 113 250 2.64E-24 100.247 cl15977 zf-DHHC superfamily N - DHHC palmitoyltransferase; This family includes the well known DHHC zinc binding domain as well as three of the four conserved transmembrane regions found in this family of palmitoyltransferase enzymes. Q#11955 - CGI_10018391 superfamily 217355 239 615 0 582.872 cl03873 GCS superfamily - - "Glutamate-cysteine ligase; This family represents the catalytic subunit of glutamate-cysteine ligase (E.C. 6.3.2.2), also known as gamma-glutamylcysteine synthetase (GCS). This enzyme catalyzes the rate limiting step in the biosynthesis of glutathione. The eukaryotic enzyme is a dimer of a heavy chain and a light chain with all the catalytic activity exhibited by the heavy chain (this family)." 
Q#11958 - CGI_10018394 superfamily 243061 96 197 2.20E-41 138.629 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#11958 - CGI_10018394 superfamily 243061 37 90 1.72E-17 74.6858 cl02509 SRCR superfamily N - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#11959 - CGI_10018395 superfamily 243061 67 168 2.04E-37 127.073 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#11959 - CGI_10018395 superfamily 243061 8 61 6.09E-14 64.2854 cl02509 SRCR superfamily N - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#11960 - CGI_10018396 superfamily 246940 64 258 1.21E-19 84.6925 cl15377 Radical_SAM superfamily - - "Radical SAM superfamily. Enzymes of this family generate radicals by combining a 4Fe-4S cluster and S-adenosylmethionine (SAM) in close proximity. They are characterized by a conserved CxxxCxxC motif, which coordinates the conserved iron-sulfur cluster. Mechanistically, they share the transfer of a single electron from the iron-sulfur cluster to SAM, which leads to its reductive cleavage to methionine and a 5'-deoxyadenosyl radical, which, in turn, abstracts a hydrogen from the appropriately positioned carbon atom. Depending on the enzyme, SAM is consumed during this process or it is restored and reused. 
Radical SAM enzymes catalyze steps in metabolism, DNA repair, the biosynthesis of vitamins and coenzymes, and the biosynthesis of many antibiotics. Examples are biotin synthase (BioB), lipoyl synthase (LipA), pyruvate formate-lyase (PFL), coproporphyrinogen oxidase (HemN), lysine 2,3-aminomutase (LAM), anaerobic ribonucleotide reductase (ARR), and MoaA, an enzyme of the biosynthesis of molybdopterin." Q#11961 - CGI_10018397 superfamily 247792 18 69 3.58E-07 47.8256 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#11961 - CGI_10018397 superfamily 241563 209 238 0.000382903 38.9996 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#11962 - CGI_10018398 superfamily 241599 97 151 1.40E-23 91.536 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#11962 - CGI_10018398 superfamily 146451 265 285 5.36E-07 45.4279 cl08404 OAR superfamily - - OAR domain; OAR domain. 
Q#11964 - CGI_10018400 superfamily 217293 27 226 1.89E-31 119.275 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#11964 - CGI_10018400 superfamily 202474 234 316 6.20E-17 78.0793 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#11966 - CGI_10018402 superfamily 247743 12 142 0.00032896 40.9775 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#11966 - CGI_10018402 superfamily 221913 211 410 3.93E-47 168.103 cl18626 AAA_12 superfamily - - AAA domain; This family of domains contain a P-loop motif that is characteristic of the AAA superfamily. Many of the proteins in this family are conjugative transfer proteins. Q#11967 - CGI_10018403 superfamily 245716 172 194 0.00434474 35.6119 cl11592 zf-CCCH superfamily - - Zinc finger C-x8-C-x5-C-x3-H type (and similar); Zinc finger C-x8-C-x5-C-x3-H type (and similar). Q#11968 - CGI_10018404 superfamily 217613 164 282 2.72E-42 143.084 cl04154 Cullin_binding superfamily - - "Cullin binding; This domain binds to cullins and to Rbx-1, components of an E3 ubiquitin ligase complex for neddylation. 
Neddylation is the process by which the C-terminal glycine of the ubiquitin-like protein Nedd8 is covalently linked to lysine residues in a protein through an isopeptide bond. The structure of this domain is composed entirely of alpha helices." Q#11970 - CGI_10003419 superfamily 246675 1 293 5.47E-88 274.878 cl14615 PI-PLCc_GDPD_SF superfamily - - "Catalytic domain of phosphoinositide-specific phospholipase C-like phosphodiesterases superfamily; The PI-PLC-like phosphodiesterases superfamily represents the catalytic domains of bacterial phosphatidylinositol-specific phospholipase C (PI-PLC, EC 4.6.1.13), eukaryotic phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11), glycerophosphodiester phosphodiesterases (GP-GDE, EC 3.1.4.46), sphingomyelinases D (SMases D) (sphingomyelin phosphodiesterase D, EC 3.1.4.41) from spider venom, SMases D-like proteins, and phospholipase D (PLD) from several pathogenic bacteria, as well as their uncharacterized homologs found in organisms ranging from bacteria and archaea to metazoans, plants, and fungi. PI-PLCs are ubiquitous enzymes hydrolyzing the membrane lipid phosphoinositides to yield two important second messengers, inositol phosphates and diacylglycerol (DAG). GP-GDEs play essential roles in glycerol metabolism and catalyze the hydrolysis of glycerophosphodiesters to sn-glycerol-3-phosphate (G3P) and the corresponding alcohols that are major sources of carbon and phosphate. Both, PI-PLCs and GP-GDEs, can hydrolyze the 3'-5' phosphodiester bonds in different substrates, and utilize a similar mechanism of general base and acid catalysis with conserved histidine residues, which consists of two steps, a phosphotransfer and a phosphodiesterase reaction. 
This superfamily also includes Neurospora crassa ankyrin repeat protein NUC-2 and its Saccharomyces cerevisiae counterpart, Phosphate system positive regulatory protein PHO81, glycerophosphodiester phosphodiesterase (GP-GDE)-like protein SHV3 and SHV3-like proteins (SVLs). The residues essential for enzyme activities and metal binding are not conserved in these sequence homologs, which might suggest that the function of catalytic domains in these proteins might be distinct from those in typical PLC-like phosphodiesterases." Q#11971 - CGI_10013115 superfamily 247794 1 264 1.52E-149 432.61 cl17240 FDH_GDH_like superfamily - - "Formate/glycerate dehydrogenases, D-specific 2-hydroxy acid dehydrogenases and related dehydrogenases; The formate/glycerate dehydrogenase like family contains a diverse group of enzymes such as formate dehydrogenase (FDH), glycerate dehydrogenase (GDH), D-lactate dehydrogenase, L-alanine dehydrogenase, and S-Adenosylhomocysteine hydrolase, that share a common 2-domain structure. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar domains of the alpha/beta Rossmann fold NAD+ binding form. The NAD(P) binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD(P) is bound, primarily to the C-terminal portion of the 2nd (internal) domain. While many members of this family are dimeric, alanine DH is hexameric and phosphoglycerate DH is tetrameric. 2-hydroxyacid dehydrogenases are enzymes that catalyze the conversion of a wide variety of D-2-hydroxy acids to their corresponding keto acids. The general mechanism is (R)-lactate + acceptor to pyruvate + reduced acceptor. Formate dehydrogenase (FDH) catalyzes the NAD+-dependent oxidation of formate ion to carbon dioxide with the concomitant reduction of NAD+ to NADH. 
FDHs of this family contain no metal ions or prosthetic groups. Catalysis occurs though direct transfer of a hydride ion to NAD+ without the stages of acid-base catalysis typically found in related dehydrogenases." Q#11973 - CGI_10013117 superfamily 245819 612 788 3.44E-67 221.684 cl11967 Nucleotidyl_cyc_III superfamily - - "Class III nucleotidyl cyclases; Class III nucleotidyl cyclases are the largest, most diverse group of nucleotidyl cyclases (NC's) containing prokaryotic and eukaryotic proteins. They can be divided into two major groups; the mononucleotidyl cyclases (MNC's) and the diguanylate cyclases (DGC's). The MNC's, which include the adenylate cyclases (AC's) and the guanylate cyclases (GC's), have a conserved cyclase homology domain (CHD), while the DGC's have a conserved GGDEF domain, named after a conserved motif within this subgroup. Their products, cyclic guanylyl and adenylyl nucleotides, are second messengers that play important roles in eukaryotic signal transduction and prokaryotic sensory pathways." Q#11973 - CGI_10013117 superfamily 245225 101 269 6.57E-32 128.198 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily N - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein couples receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine- binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. 
Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#11973 - CGI_10013117 superfamily 245201 320 536 1.07E-30 121.875 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#11973 - CGI_10013117 superfamily 219526 553 598 5.90E-07 49.5399 cl06648 HNOBA superfamily N - "Heme NO binding associated; The HNOBA domain is found associated with the HNOB domain and pfam00211 in soluble cyclases and signalling proteins. 
The HNOB domain is predicted to function as a heme-dependent sensor for gaseous ligands, and transduce diverse downstream signals, in both bacteria and animals." Q#11974 - CGI_10013119 superfamily 247772 157 207 1.15E-07 47.2396 cl17218 Cupin_2 superfamily C - Cupin domain; This family represents the conserved barrel domain of the 'cupin' superfamily ('cupa' is the Latin term for a small barrel). Q#11975 - CGI_10013120 superfamily 245596 3 150 3.09E-43 153.618 cl11394 Glyco_tranf_GTA_type superfamily N - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#11976 - CGI_10013121 superfamily 215647 91 330 1.52E-58 193.209 cl18338 7tm_2 superfamily - - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GCPRs).They have been described in many animal species, but not in plants, fungi or prokaryotes. 
Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97 ; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#11976 - CGI_10013121 superfamily 243029 16 71 4.31E-13 63.9089 cl02422 HRM superfamily - - Hormone receptor domain; This extracellular domain contains four conserved cysteines that probably for disulphide bridges. The domain is found in a variety of hormone receptors. It may be a ligand binding domain. Q#11978 - CGI_10013123 superfamily 219541 302 440 1.05E-26 106.013 cl18516 Cu-oxidase_2 superfamily N - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. Q#11978 - CGI_10013123 superfamily 215896 77 187 9.75E-15 71.9424 cl18351 Cu-oxidase superfamily N - Multicopper oxidase; Many of the proteins in this family contain multiple similar copies of this plastocyanin-like domain. Q#11979 - CGI_10013124 superfamily 241585 407 453 1.84E-06 47.5136 cl00066 FU superfamily - - Furin-like repeats. Cysteine rich region. Exact function of the domain is not known. Furin is a serine-kinase dependent proprotein processor. Other members of this family include endoproteases and cell surface receptors. Q#11979 - CGI_10013124 superfamily 241585 549 596 1.84E-05 44.8172 cl00066 FU superfamily - - Furin-like repeats. Cysteine rich region. Exact function of the domain is not known. 
Furin is a serine-kinase dependent proprotein processor. Other members of this family include endoproteases and cell surface receptors. Q#11979 - CGI_10013124 superfamily 241585 697 741 0.000219121 41.3504 cl00066 FU superfamily - - Furin-like repeats. Cysteine rich region. Exact function of the domain is not known. Furin is a serine-kinase dependent proprotein processor. Other members of this family include endoproteases and cell surface receptors. Q#11979 - CGI_10013124 superfamily 241585 454 494 0.000419621 40.58 cl00066 FU superfamily C - Furin-like repeats. Cysteine rich region. Exact function of the domain is not known. Furin is a serine-kinase dependent proprotein processor. Other members of this family include endoproteases and cell surface receptors. Q#11979 - CGI_10013124 superfamily 241585 502 545 0.000517832 40.1948 cl00066 FU superfamily - - Furin-like repeats. Cysteine rich region. Exact function of the domain is not known. Furin is a serine-kinase dependent proprotein processor. Other members of this family include endoproteases and cell surface receptors. Q#11979 - CGI_10013124 superfamily 207627 1746 1840 1.42E-15 75.3639 cl02522 Calx-beta superfamily - - Calx-beta domain; Calx-beta domain. Q#11979 - CGI_10013124 superfamily 248289 107 164 5.42E-11 60.9007 cl17735 VWC superfamily - - von Willebrand factor type C domain; The high cutoff was used to prevent overlap with pfam00094. Q#11979 - CGI_10013124 superfamily 248289 299 354 7.68E-10 57.8191 cl17735 VWC superfamily - - von Willebrand factor type C domain; The high cutoff was used to prevent overlap with pfam00094. Q#11979 - CGI_10013124 superfamily 207627 1641 1718 7.80E-08 52.2567 cl02522 Calx-beta superfamily - - Calx-beta domain; Calx-beta domain. Q#11979 - CGI_10013124 superfamily 248289 43 100 9.64E-08 51.6559 cl17735 VWC superfamily - - von Willebrand factor type C domain; The high cutoff was used to prevent overlap with pfam00094. 
Q#11979 - CGI_10013124 superfamily 248289 169 229 7.20E-07 49.0516 cl17735 VWC superfamily - - von Willebrand factor type C domain; The high cutoff was used to prevent overlap with pfam00094. Q#11979 - CGI_10013124 superfamily 248289 246 293 4.39E-06 46.6483 cl17735 VWC superfamily - - von Willebrand factor type C domain; The high cutoff was used to prevent overlap with pfam00094. Q#11979 - CGI_10013124 superfamily 219541 1910 1937 0.000191528 42.4555 cl18516 Cu-oxidase_2 superfamily N - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. Q#11979 - CGI_10013124 superfamily 241585 644 683 0.00240395 38.2598 cl00066 FU superfamily - - Furin-like repeats. Cysteine rich region. Exact function of the domain is not known. Furin is a serine-kinase dependent proprotein processor. Other members of this family include endoproteases and cell surface receptors. Q#11980 - CGI_10013125 superfamily 245213 92 122 5.95E-06 42.2386 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#11980 - CGI_10013125 superfamily 241611 135 242 4.62E-07 47.3832 cl00102 PTX superfamily C - "Pentraxins are plasma proteins characterized by their pentameric discoid assembly and their Ca2+ dependent ligand binding, such as Serum amyloid P component (SAP) and C-reactive Protein (CRP), which are cytokine-inducible acute-phase proteins implicated in innate immunity. 
CRP binds to ligands containing phosphocholine, SAP binds to amyloid fibrils, DNA, chromatin, fibronectin, C4-binding proteins and glycosaminoglycans. "Long" pentraxins have N-terminal extensions to the common pentraxin domain; one group, the neuronal pentraxins, may be involved in synapse formation and remodeling, and they may also be able to form heteromultimers." Q#11981 - CGI_10000658 superfamily 222429 6 82 1.86E-16 70.7324 cl18676 Myb_DNA-bind_5 superfamily - - Myb/SANT-like DNA-binding domain; This presumed domain appears to be related to other Myb/SANT like DNA binding domains. This family is greatly expanded in arthropods and higher eukaryotes. Q#11984 - CGI_10000694 superfamily 219542 61 89 8.38E-06 39.5324 cl18517 Cu-oxidase_3 superfamily C - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. Q#11986 - CGI_10000918 superfamily 241900 10 182 2.11E-23 93.8674 cl00490 EEP superfamily C - "Exonuclease-Endonuclease-Phosphatase (EEP) domain superfamily; This large superfamily includes the catalytic domain (exonuclease/endonuclease/phosphatase or EEP domain) of a diverse set of proteins including the ExoIII family of apurinic/apyrimidinic (AP) endonucleases, inositol polyphosphate 5-phosphatases (INPP5), neutral sphingomyelinases (nSMases), deadenylases (such as the vertebrate circadian-clock regulated nocturnin), bacterial cytolethal distending toxin B (CdtB), deoxyribonuclease 1 (DNase1), the endonuclease domain of the non-LTR retrotransposon LINE-1, and related domains. These diverse enzymes share a common catalytic mechanism of cleaving phosphodiester bonds; their substrates range from nucleic acids to phospholipids and perhaps proteins." 
Q#11987 - CGI_10001018 superfamily 245201 134 295 2.85E-21 94.1404 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#11987 - CGI_10001018 superfamily 245201 428 532 2.78E-15 75.6508 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#11991 - CGI_10001367 superfamily 215647 3 212 9.65E-06 44.5217 cl18338 7tm_2 superfamily - - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GCPRs).They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97 ; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. 
Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#11994 - CGI_10008824 superfamily 220231 3 165 1.24E-12 62.0675 cl09664 Nop16 superfamily - - Ribosome biogenesis protein Nop16; Nop16 is a protein involved in ribosome biogenesis. Q#11998 - CGI_10008828 superfamily 241573 17 344 2.82E-106 326.981 cl00051 CysPc superfamily - - "Calpains, domains IIa, IIb; calcium-dependent cytoplasmic cysteine proteinases, papain-like. Functions in cytoskeletal remodeling processes, cell differentiation, apoptosis and signal transduction." Q#11998 - CGI_10008828 superfamily 246669 518 640 5.26E-42 148.195 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#11998 - CGI_10008828 superfamily 241653 355 495 3.01E-37 135.889 cl00165 Calpain_III superfamily - - "Calpain, subdomain III. 
Calpains are calcium-activated cytoplasmic cysteine proteinases, participate in cytoskeletal remodeling processes, cell differentiation, apoptosis and signal transduction. Catalytic domain and the two calmodulin-like domains are separated by C2-like domain III. Domain III plays an important role in calcium-induced activation of calpain involving electrostatic interactions with subdomain II. Proposed to mediate calpain's interaction with phospholipids and translocation to cytoplasmic/nuclear membranes. CD includes subdomain III of typical and atypical calpains." Q#12000 - CGI_10008830 superfamily 243082 950 1162 7.94E-53 186.339 cl02553 Peptidase_C19 superfamily - - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." Q#12000 - CGI_10008830 superfamily 243082 526 564 5.60E-13 68.8534 cl02553 Peptidase_C19 superfamily C - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. 
The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." Q#12000 - CGI_10008830 superfamily 243082 48 205 1.36E-05 47.6985 cl02553 Peptidase_C19 superfamily NC - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." Q#12000 - CGI_10008830 superfamily 247792 876 924 0.000397184 39.6836 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#12003 - CGI_10001455 superfamily 248097 114 235 2.64E-16 72.2978 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. 
Q#12003 - CGI_10001455 superfamily 248097 54 123 7.09E-10 54.1934 cl17543 C1q superfamily C - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#12004 - CGI_10011054 superfamily 248012 344 443 2.75E-23 94.5668 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#12006 - CGI_10011056 superfamily 243058 405 499 1.73E-08 53.0871 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#12006 - CGI_10011056 superfamily 248012 534 619 6.25E-23 94.952 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#12007 - CGI_10011057 superfamily 241578 600 754 2.77E-14 73.0946 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#12007 - CGI_10011057 superfamily 241578 1739 1898 7.98E-11 62.6942 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#12007 - CGI_10011057 superfamily 241578 38 148 7.40E-17 81.1202 cl00057 vWFA superfamily C - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. 
The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#12007 - CGI_10011057 superfamily 243119 1951 2007 3.94E-05 43.5765 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#12007 - CGI_10011057 superfamily 241578 1523 1672 0.000231772 42.8268 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#12007 - CGI_10011057 superfamily 243119 260 316 0.000853288 39.7346 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#12010 - CGI_10011060 superfamily 243090 344 464 5.44E-65 211.051 cl02565 RGS superfamily - - "Regulator of G protein signaling (RGS) domain superfamily; The RGS domain is an essential part of the Regulator of G-protein Signaling (RGS) protein family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play critical regulatory roles as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. While inactive, G-alpha-subunits bind GDP, which is released and replaced by GTP upon agonist activation. 
GTP binding leads to dissociation of the alpha-subunit and the beta-gamma-dimer, allowing them to interact with effectors molecules and propagate signaling cascades associated with cellular growth, survival, migration, and invasion. Deactivation of the G-protein signaling controlled by the RGS domain accelerates GTPase activity of the alpha subunit by hydrolysis of GTP to GDP, which results in the reassociation of the alpha-subunit with the beta-gamma-dimer and thereby inhibition of downstream activity. As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. RGS proteins are also involved in apoptosis and cell proliferation, as well as modulation of cardiac development. Several RGS proteins can fine-tune immune responses, while others play important roles in neuronal signals modulation. Some RGS proteins are principal elements needed for proper vision." Q#12010 - CGI_10011060 superfamily 243090 185 328 2.92E-24 100.22 cl02565 RGS superfamily N - "Regulator of G protein signaling (RGS) domain superfamily; The RGS domain is an essential part of the Regulator of G-protein Signaling (RGS) protein family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play critical regulatory roles as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. While inactive, G-alpha-subunits bind GDP, which is released and replaced by GTP upon agonist activation. GTP binding leads to dissociation of the alpha-subunit and the beta-gamma-dimer, allowing them to interact with effectors molecules and propagate signaling cascades associated with cellular growth, survival, migration, and invasion. 
Deactivation of the G-protein signaling controlled by the RGS domain accelerates GTPase activity of the alpha subunit by hydrolysis of GTP to GDP, which results in the reassociation of the alpha-subunit with the beta-gamma-dimer and thereby inhibition of downstream activity. As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. RGS proteins are also involved in apoptosis and cell proliferation, as well as modulation of cardiac development. Several RGS proteins can fine-tune immune responses, while others play important roles in neuronal signals modulation. Some RGS proteins are principal elements needed for proper vision." Q#12010 - CGI_10011060 superfamily 243090 51 110 3.92E-15 73.6408 cl02565 RGS superfamily C - "Regulator of G protein signaling (RGS) domain superfamily; The RGS domain is an essential part of the Regulator of G-protein Signaling (RGS) protein family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play critical regulatory roles as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. While inactive, G-alpha-subunits bind GDP, which is released and replaced by GTP upon agonist activation. GTP binding leads to dissociation of the alpha-subunit and the beta-gamma-dimer, allowing them to interact with effectors molecules and propagate signaling cascades associated with cellular growth, survival, migration, and invasion. Deactivation of the G-protein signaling controlled by the RGS domain accelerates GTPase activity of the alpha subunit by hydrolysis of GTP to GDP, which results in the reassociation of the alpha-subunit with the beta-gamma-dimer and thereby inhibition of downstream activity. 
As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. RGS proteins are also involved in apoptosis and cell proliferation, as well as modulation of cardiac development. Several RGS proteins can fine-tune immune responses, while others play important roles in neuronal signals modulation. Some RGS proteins are principal elements needed for proper vision." Q#12015 - CGI_10011065 superfamily 245864 43 502 2.88E-84 270.304 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." 
Q#12018 - CGI_10011068 superfamily 222429 5 69 2.88E-06 44.5388 cl18676 Myb_DNA-bind_5 superfamily - - Myb/SANT-like DNA-binding domain; This presumed domain appears to be related to other Myb/SANT like DNA binding domains. This family is greatly expanded in arthropods and higher eukaryotes. Q#12019 - CGI_10011069 superfamily 247723 64 129 1.26E-30 106.538 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#12022 - CGI_10001944 superfamily 245596 6 286 4.54E-140 406.577 cl11394 Glyco_tranf_GTA_type superfamily - - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. 
To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#12022 - CGI_10001944 superfamily 247772 332 473 1.33E-88 269.915 cl17218 Cupin_2 superfamily - - Cupin domain; This family represents the conserved barrel domain of the 'cupin' superfamily ('cupa' is the Latin term for a small barrel). Q#12023 - CGI_10001945 superfamily 243519 6 454 0 574.846 cl03757 phosphohexomutase superfamily - - "The alpha-D-phosphohexomutase superfamily includes several related enzymes that catalyze a reversible intramolecular phosphoryl transfer on their sugar substrates. Members of this family include the phosphoglucomutases (PGM1 and PGM2), phosphoglucosamine mutase (PNGM), phosphoacetylglucosamine mutase (PAGM), the bacterial phosphomannomutase ManB, the bacterial phosphoglucosamine mutase GlmM, and the bifunctional phosphomannomutase/phosphoglucomutase (PMM/PGM). These enzymes play important and diverse roles in carbohydrate metabolism in organisms from bacteria to humans. Each of these enzymes has four domains with a centrally located active site formed by four loops, one from each domain. All four domains are included in this alignment model." Q#12024 - CGI_10001946 superfamily 242207 234 464 6.25E-103 309.167 cl00939 Bac_transf superfamily - - "Bacterial sugar transferase; This Pfam family represents a conserved region from a number of different bacterial sugar transferases, involved in diverse biosynthesis pathways." 
Q#12025 - CGI_10001948 superfamily 245023 1 426 0 854.211 cl09156 PS_pyruv_trans superfamily - - Polysaccharide pyruvyl transferase; Pyruvyl-transferases involved in peptidoglycan-associated polymer biosynthesis. CsaB in Bacillus anthracis is necessary for the non-covalent anchoring of proteins containing an SLH (S-layer homology) domain to peptidoglycan-associated pyruvylated polysaccharides. WcaK and AmsJ are involved in the biosynthesis of colanic acid in Escherichia coli and of amylovoran in Erwinia amylovora. Q#12026 - CGI_10001949 superfamily 245227 69 372 2.72E-109 327.204 cl10013 Glycosyltransferase_GTB_type superfamily - - "Glycosyltransferases catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. The acceptor molecule can be a lipid, a protein, a heterocyclic compound, or another carbohydrate residue. The structures of the formed glycoconjugates are extremely diverse, reflecting a wide range of biological functions. The members of this family share a common GTB topology, one of the two protein topologies observed for nucleotide-sugar-dependent glycosyltransferases. GTB proteins have distinct N- and C- terminal domains each containing a typical Rossmann fold. The two domains have high structural homology despite minimal sequence homology. The large cleft that separates the two domains includes the catalytic center and permits a high degree of flexibility." Q#12027 - CGI_10001952 superfamily 246713 114 273 1.35E-27 105.016 cl14786 ENDO3c superfamily - - "endonuclease III; includes endonuclease III (DNA-(apurinic or apyrimidinic site) lyase), alkylbase DNA glycosidases (Alka-family) and other DNA glycosidases" Q#12027 - CGI_10001952 superfamily 244116 6 92 1.57E-28 106.508 cl05528 AlkA_N superfamily C - AlkA N-terminal domain; AlkA N-terminal domain. 
Q#12028 - CGI_10001953 superfamily 245819 82 196 6.90E-29 110.724 cl11967 Nucleotidyl_cyc_III superfamily N - "Class III nucleotidyl cyclases; Class III nucleotidyl cyclases are the largest, most diverse group of nucleotidyl cyclases (NC's) containing prokaryotic and eukaryotic proteins. They can be divided into two major groups; the mononucleotidyl cyclases (MNC's) and the diguanylate cyclases (DGC's). The MNC's, which include the adenylate cyclases (AC's) and the guanylate cyclases (GC's), have a conserved cyclase homology domain (CHD), while the DGC's have a conserved GGDEF domain, named after a conserved motif within this subgroup. Their products, cyclic guanylyl and adenylyl nucleotides, are second messengers that play important roles in eukaryotic signal transduction and prokaryotic sensory pathways." Q#12028 - CGI_10001953 superfamily 241757 274 450 1.04E-13 69.1115 cl00290 EAL superfamily N - "EAL domain. This domain is found in diverse bacterial signaling proteins. It is called EAL after its conserved residues and is also known as domain of unknown function 2 (DUF2). The EAL domain has been shown to stimulate degradation of a second messenger, cyclic di-GMP, and is a good candidate for a diguanylate phosphodiesterase function. Together with the GGDEF domain, EAL might be involved in regulating cell surface adhesiveness in bacteria." Q#12028 - CGI_10001953 superfamily 247824 1 84 2.98E-36 134.304 cl17270 APH_ChoK_like superfamily N - "Aminoglycoside 3'-phosphotransferase (APH) and Choline Kinase (ChoK) family. The APH/ChoK family is part of a larger superfamily that includes the catalytic domains of other kinases, such as the typical serine/threonine/tyrosine protein kinases (PKs), RIO kinases, actin-fragmin kinase (AFK), and phosphoinositide 3-kinase (PI3K). 
The family is composed of APH, ChoK, ethanolamine kinase (ETNK), macrolide 2'-phosphotransferase (MPH2'), an unusual homoserine kinase, and uncharacterized proteins with similarity to the N-terminal domain of acyl-CoA dehydrogenase 10 (ACAD10). The members of this family catalyze the transfer of the gamma-phosphoryl group from ATP (or CTP) to small molecule substrates such as aminoglycosides, macrolides, choline, ethanolamine, and homoserine. Phosphorylation of the antibiotics, aminoglycosides and macrolides, leads to their inactivation and to bacterial antibiotic resistance. Phosphorylation of choline, ethanolamine, and homoserine serves as precursors to the synthesis of important biological compounds, such as the major phospholipids, phosphatidylcholine and phosphatidylethanolamine and the amino acids, threonine, methionine, and isoleucine." Q#12030 - CGI_10001955 superfamily 247744 10 208 1.88E-98 287.143 cl17190 NK superfamily - - "Nucleoside/nucleotide kinase (NK) is a protein superfamily consisting of multiple families of enzymes that share structural similarity and are functionally related to the catalysis of the reversible phosphate group transfer from nucleoside triphosphates to nucleosides/nucleotides, nucleoside monophosphates, or sugars. Members of this family play a wide variety of essential roles in nucleotide metabolism, the biosynthesis of coenzymes and aromatic compounds, as well as the metabolism of sugar and sulfate." Q#12031 - CGI_10001957 superfamily 222180 357 569 4.17E-17 80.1252 cl18644 AsmA_2 superfamily - - AsmA-like C-terminal region; This family is similar to the C-terminal of the AsmA protein of E. coli. Q#12032 - CGI_10001959 superfamily 202285 46 138 2.83E-26 99.9751 cl03645 Poly_export superfamily - - Polysaccharide biosynthesis/export protein; This is a family of periplasmic proteins involved in polysaccharide biosynthesis and/or export. 
Q#12032 - CGI_10001959 superfamily 220798 146 188 0.00245715 35.2756 cl14799 SLBB superfamily - - SLBB domain; SLBB domain. Q#12033 - CGI_10001960 superfamily 241614 3 140 3.94E-54 169.368 cl00105 LMWPc superfamily - - Low molecular weight phosphatase family; Q#12034 - CGI_10001961 superfamily 247074 16 111 1.30E-23 97.8065 cl15801 Wzz superfamily C - Chain length determinant protein; This family includes proteins involved in lipopolysaccharide (lps) biosynthesis. This family comprises the whole length of chain length determinant protein (or wzz protein) that confers a modal distribution of chain length on the O-antigen component of lps. This region is also found as part of bacterial tyrosine kinases. Q#12034 - CGI_10001961 superfamily 222392 367 417 2.17E-19 84.173 cl18670 GNVR superfamily C - "G-rich domain on putative tyrosine kinase; This domain is found between two families, Wzz, pfam02706 and CbiA pfam01656. There is a highly conserved GNVR sequence motif which characterizes this domain. The function is not known." Q#12034 - CGI_10001961 superfamily 247757 538 564 0.000200327 40.5962 cl17203 Fer4_NifH superfamily C - "The Fer4_NifH superfamily contains a variety of proteins which share a common ATP-binding domain. Functionally, proteins in this superfamily use the energy from hydrolysis of NTP to transfer electron or ion." Q#12034 - CGI_10001961 superfamily 247074 212 252 0.00034123 40.0265 cl15801 Wzz superfamily N - Chain length determinant protein; This family includes proteins involved in lipopolysaccharide (lps) biosynthesis. This family comprises the whole length of chain length determinant protein (or wzz protein) that confers a modal distribution of chain length on the O-antigen component of lps. This region is also found as part of bacterial tyrosine kinases. 
Q#12035 - CGI_10001962 superfamily 245596 8 171 2.32E-32 118.334 cl11394 Glyco_tranf_GTA_type superfamily - - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#12036 - CGI_10001964 superfamily 245227 36 274 1.49E-35 133.14 cl10013 Glycosyltransferase_GTB_type superfamily - - "Glycosyltransferases catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. The acceptor molecule can be a lipid, a protein, a heterocyclic compound, or another carbohydrate residue. The structures of the formed glycoconjugates are extremely diverse, reflecting a wide range of biological functions. The members of this family share a common GTB topology, one of the two protein topologies observed for nucleotide-sugar-dependent glycosyltransferases. GTB proteins have distinct N- and C- terminal domains each containing a typical Rossmann fold. 
The two domains have high structural homology despite minimal sequence homology. The large cleft that separates the two domains includes the catalytic center and permits a high degree of flexibility." Q#12036 - CGI_10001964 superfamily 245227 243 342 1.33E-06 47.9817 cl10013 Glycosyltransferase_GTB_type superfamily N - "Glycosyltransferases catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. The acceptor molecule can be a lipid, a protein, a heterocyclic compound, or another carbohydrate residue. The structures of the formed glycoconjugates are extremely diverse, reflecting a wide range of biological functions. The members of this family share a common GTB topology, one of the two protein topologies observed for nucleotide-sugar-dependent glycosyltransferases. GTB proteins have distinct N- and C- terminal domains each containing a typical Rossmann fold. The two domains have high structural homology despite minimal sequence homology. The large cleft that separates the two domains includes the catalytic center and permits a high degree of flexibility." Q#12037 - CGI_10001965 superfamily 189184 1 273 1.58E-172 495.593 cl08075 wcaD superfamily - - putative colanic acid biosynthesis protein; Provisional Q#12037 - CGI_10001965 superfamily 245596 272 505 9.04E-163 464.384 cl11394 Glyco_tranf_GTA_type superfamily - - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. 
To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#12038 - CGI_10001967 superfamily 245206 4 352 0 545.655 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#12039 - CGI_10001968 superfamily 245206 5 313 0 515.979 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. 
NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#12040 - CGI_10004123 superfamily 202886 370 510 3.84E-56 188.201 cl04399 Nucleoporin2 superfamily - - Nucleoporin autopeptidase; Nucleoporin autopeptidase. Q#12041 - CGI_10004124 superfamily 205121 822 846 2.54E-05 42.8764 cl18263 zf-met superfamily - - "Zinc-finger of C2H2 type; This is a zinc-finger domain with the CxxCx(12)Hx(6)H motif, found in multiple copies in a wide range of proteins from plants to metazoans. Some member proteins, particularly those from plants, are annotated as being RNA-binding." Q#12041 - CGI_10004124 superfamily 205121 393 415 0.00306025 36.7132 cl18263 zf-met superfamily - - "Zinc-finger of C2H2 type; This is a zinc-finger domain with the CxxCx(12)Hx(6)H motif, found in multiple copies in a wide range of proteins from plants to metazoans. Some member proteins, particularly those from plants, are annotated as being RNA-binding." Q#12041 - CGI_10004124 superfamily 205121 165 188 0.00857435 35.1724 cl18263 zf-met superfamily - - "Zinc-finger of C2H2 type; This is a zinc-finger domain with the CxxCx(12)Hx(6)H motif, found in multiple copies in a wide range of proteins from plants to metazoans. Some member proteins, particularly those from plants, are annotated as being RNA-binding." 
Q#12046 - CGI_10002178 superfamily 243263 25 536 9.62E-69 231.528 cl02990 ASC superfamily - - Amiloride-sensitive sodium channel; Amiloride-sensitive sodium channel. Q#12047 - CGI_10002179 superfamily 220635 5 191 2.55E-45 159.237 cl12380 DUF2151 superfamily C - "Cell cycle and development regulator; This is a set of proteins conserved from worms to humans. The proteins are a PAN GU kinase substrate, Mat89Bb, essential for S-M cycles of early Drosophila embryogenesis, Xenopus embryonic cell cycles and morphogenesis, and cell division in cultured mammalian cells." Q#12048 - CGI_10002180 superfamily 213107 16 57 0.00129469 33.7828 cl02594 DD_R_PKA superfamily - - "Dimerization/Docking domain of the Regulatory subunit of cAMP-dependent protein kinase and similar domains; cAMP-dependent protein kinase (PKA) is a serine/threonine kinase (STK), catalyzing the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The inactive PKA holoenzyme is a heterotetramer composed of two phosphorylated and active catalytic subunits with a dimer of regulatory (R) subunits. Activation is achieved through the binding of the important second messenger cAMP to the R subunits, which leads to the dissociation of PKA into the R dimer and two active subunits. There are two classes of R subunits, RI and RII; each exists as two isoforms (alpha and beta) from distinct genes. These functionally non-redundant R isoforms allow for specificity in PKA signaling. The R subunit contains an N-terminal dimerization/docking (D/D) domain, a linker with an inhibitory sequence (IS), and two c-AMP binding domains. RI and RII subunits are distinguished by their IS; RII subunits contain a phosphorylation site and are both substrates and inhibitors while RI subunits are pseudo-substrates. RI subunits require ATP and Mg ions to form a stable holoenzyme while RII subunits do not. 
The D/D domain dimerizes to form a four-helix bundle that serves as a docking site for A-kinase-anchoring proteins (AKAPs), which facilitates the localization of PKA to specific sites in the cell. PKA is present ubiquitously in cells and interacts with many different downstream targets. It plays a role in the regulation of diverse processes such as growth, development, memory, metabolism, gene expression, immunity, and lipolysis." Q#12049 - CGI_10016637 superfamily 218406 4 74 3.00E-16 72.077 cl04914 MGAT2 superfamily NC - "N-acetylglucosaminyltransferase II (MGAT2); UDP-N-acetyl-D-glucosamine:alpha-6-D-mannoside beta-1,2-N- acetylglucosaminyltransferase II (EC 2.4.1.143) (GnT II/MGAT2) is a Golgi resident enzyme that catalyzes an essential step in the biosynthetic pathway leading from high mannose to complex N-linked oligosaccharides. Mutations in the MGAT2 gene lead to congenital disorder of glycosylation (CDG IIa). CDG IIa patients have an increased bleeding tendency, unrelated to coagulation factors." Q#12051 - CGI_10016639 superfamily 245864 204 573 4.09E-50 184.019 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. 
While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#12051 - CGI_10016639 superfamily 245864 747 940 1.04E-48 179.782 cl12078 p450 superfamily N - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#12051 - CGI_10016639 superfamily 245864 585 653 0.00156501 40.5649 cl12078 p450 superfamily C - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. 
Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#12052 - CGI_10016640 superfamily 245864 3 195 8.71E-51 181.708 cl12078 p450 superfamily N - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. 
their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#12052 - CGI_10016640 superfamily 245864 238 431 4.39E-48 174.004 cl12078 p450 superfamily N - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#12052 - CGI_10016640 superfamily 245864 424 534 1.02E-20 93.8822 cl12078 p450 superfamily N - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. 
Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#12055 - CGI_10016643 superfamily 241624 147 283 4.23E-29 112.035 cl00120 PP2Cc superfamily N - "Serine/threonine phosphatases, family 2C, catalytic domain; The protein architecture and deduced catalytic mechanism of PP2C phosphatases are similar to the PP1, PP2A, PP2B family of protein Ser/Thr phosphatases, with which PP2C shares no sequence similarity." Q#12055 - CGI_10016643 superfamily 241624 81 179 7.80E-15 71.2035 cl00120 PP2Cc superfamily C - "Serine/threonine phosphatases, family 2C, catalytic domain; The protein architecture and deduced catalytic mechanism of PP2C phosphatases are similar to the PP1, PP2A, PP2B family of protein Ser/Thr phosphatases, with which PP2C shares no sequence similarity." Q#12059 - CGI_10016647 superfamily 246962 67 327 5.63E-45 157.2 cl15430 Nucleoside_tran superfamily - - "Nucleoside transporter; This is a family of nucleoside transporters. In mammalian cells nucleoside transporters transport nucleoside across the plasma membrane and are essential for nucleotide synthesis via the salvage pathways for cells that lack their own de novo synthesis pathways. 
Also in this family is mouse and human nucleolar protein HNP36, a protein of unknown function; although it has been hypothesised to be a plasma membrane nucleoside transporter." Q#12060 - CGI_10016648 superfamily 246961 68 461 6.86E-40 146.43 cl15429 ABD superfamily - - "Alpha-Mannosidase Binding Domain of Atg19/34; These proteins are related to the Alpha-mannosidase (Ams1) Binding Domain of Atg19/Atg34, a key component in the targeting pathway that directs alpha-mannosidase and aminopeptidase I to the vacuole, either through cytoplasm-to-vacuole trafficking or via autophagy in starvation conditions. Autophagy is a eukaryotic mechanism in which cytoplasm is enclosed in double-membraned autophagosomes which fuse with a vacuole for transport into the lumen. In Saccharomyces cerevisiae, alpha-mannosidase is selectively directed to the vacuole via the direct interaction with Atg19 (and paralog Atg34) in the Cvt pathway. Ams1 binding domains (ABD) Atg19/34 have an immunoglobulin fold with eight beta-strands. The ABD is responsible for Ams1 recognition, but its deletion does not affect the fusion of Atg19 with prApe1, and the transport of prApe1 to the vacuole. The Atg19 N-terminal region is a distinct coiled-coil domain." Q#12064 - CGI_10003236 superfamily 110440 85 111 0.00437395 32.7649 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." 
Q#12066 - CGI_10003238 superfamily 243091 171 276 1.35E-08 53.8775 cl02566 SET superfamily - - "SET domain; SET domains are protein lysine methyltransferase enzymes. SET domains appear to be protein-protein interaction domains. It has been demonstrated that SET domains mediate interactions with a family of proteins that display similarity with dual-specificity phosphatases (dsPTPases). A subset of SET domains have been called PR domains. These domains are divergent in sequence from other SET domains, but also appear to mediate protein-protein interaction. The SET domain consists of two regions known as SET-N and SET-C. SET-C forms an unusual and conserved knot-like structure of probably functional importance. Additionally to SET-N and SET-C, an insert region (SET-I) and flanking regions of high structural variability form part of the overall structure." Q#12066 - CGI_10003238 superfamily 222150 409 433 3.42E-05 42.3789 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#12066 - CGI_10003238 superfamily 222150 380 404 0.000162664 40.4529 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#12066 - CGI_10003238 superfamily 197676 366 388 0.00988818 35.1342 cl18194 ZnF_C2H2 superfamily - - zinc finger; zinc finger. Q#12067 - CGI_10002647 superfamily 245595 482 737 1.33E-178 537.719 cl11393 Peptidase_M14_like superfamily - - "M14 family of metallocarboxypeptidases and related proteins; The M14 family of metallocarboxypeptidases (MCPs), also known as funnelins, are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Two major subfamilies of the M14 family, defined based on sequence and structural homology, are the A/B and N/E subfamilies. 
Enzymes belonging to the A/B subfamily are normally synthesized as inactive precursors containing preceding signal peptide, followed by an N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases. The A/B enzymes can be further divided based on their substrate specificity; Carboxypeptidase A-like (CPA-like) enzymes favor hydrophobic residues while carboxypeptidase B-like (CPB-like) enzymes only cleave the basic residues lysine or arginine. The A forms have slightly different specificities, with Carboxypeptidase A1 (CPA1) preferring aliphatic and small aromatic residues, and CPA2 preferring the bulky aromatic side chains. Enzymes belonging to the N/E subfamily enzymes are not produced as inactive precursors and instead rely on their substrate specificity and subcellular compartmentalization to prevent inappropriate cleavage. They contain an extra C-terminal transthyretin-like domain, thought to be involved in folding or formation of oligomers. MCPs can also be classified based on their involvement in specific physiological processes; the pancreatic MCPs participate only in alimentary digestion and include carboxypeptidase A and B (A/B subfamily), while others, namely regulatory MCPs or the N/E subfamily, are involved in more selective reactions, mainly in non-digestive tissues and fluids, acting on blood coagulation/fibrinolysis, inflammation and local anaphylaxis, pro-hormone and neuropeptide processing, cellular response and others. Another MCP subfamily, is that of succinylglutamate desuccinylase /aspartoacylase, which hydrolyzes N-acetyl-L-aspartate (NAA), and deficiency in which is the established cause of Canavan disease. Another subfamily (referred to as subfamily C) includes an exceptional type of activity in the MCP family, that of dipeptidyl-peptidase activity of gamma-glutamyl-(L)-meso-diaminopimelate peptidase I which is involved in bacterial cell wall metabolism." 
Q#12070 - CGI_10001438 superfamily 242274 3 155 9.09E-11 56.2666 cl01053 SGNH_hydrolase superfamily - - "SGNH_hydrolase, or GDSL_hydrolase, is a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the typical Ser-His-Asp(Glu) triad from other serine hydrolases, but may lack the carboxylic acid." Q#12071 - CGI_10001439 superfamily 242274 15 171 0.000423242 38.5474 cl01053 SGNH_hydrolase superfamily - - "SGNH_hydrolase, or GDSL_hydrolase, is a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the typical Ser-His-Asp(Glu) triad from other serine hydrolases, but may lack the carboxylic acid." Q#12073 - CGI_10002381 superfamily 193607 572 702 3.05E-57 191.246 cl15237 Deltex_C superfamily - - "Domain found at the C-terminus of deltex-like; The deltex family of proteins is involved in the regulation of Notch signaling, and therefore may play roles in cell-to-cell communications that regulate mechanisms determining cell fate. They have a central RING-type zinc finger domain and contain a C-terminal domain, described here, that is also found in other domain architectures. Deltex-1 (DTX1) contains a RING finger and two WWE domains, indicating that it may be an E3 ubiquitin ligase. Human deltex 3-like, which contains an additional N-terminal domain (presumably with ubiquitin ligase activity) is also described as E3 ubiquitin-protein ligase DTX3L, B-lymphoma- and BAL-associated protein (BBAP), or rhysin-2. DTX3L mediates monoubiquitination of K91 of histone H4 in response to DNA damage." 
Q#12073 - CGI_10002381 superfamily 247792 525 566 4.33E-10 56.3 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#12073 - CGI_10002381 superfamily 241554 23 181 1.86E-23 98.4967 cl00019 Macro superfamily - - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#12074 - CGI_10002382 superfamily 193607 1126 1255 3.03E-70 231.692 cl15237 Deltex_C superfamily - - "Domain found at the C-terminus of deltex-like; The deltex family of proteins is involved in the regulation of Notch signaling, and therefore may play roles in cell-to-cell communications that regulate mechanisms determining cell fate. They have a central RING-type zinc finger domain and contain a C-terminal domain, described here, that is also found in other domain architectures. 
Deltex-1 (DTX1) contains a RING finger and two WWE domains, indicating that it may be an E3 ubiquitin ligase. Human deltex 3-like, which contains an additional N-terminal domain (presumably with ubiquitin ligase activity) is also described as E3 ubiquitin-protein ligase DTX3L, B-lymphoma- and BAL-associated protein (BBAP), or rhysin-2. DTX3L mediates monoubiquitination of K91 of histone H4 in response to DNA damage." Q#12074 - CGI_10002382 superfamily 247792 1077 1120 4.43E-11 60.152 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#12074 - CGI_10002382 superfamily 241554 723 897 2.37E-47 168.603 cl00019 Macro superfamily - - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." 
Q#12074 - CGI_10002382 superfamily 241554 7 168 1.39E-43 157.817 cl00019 Macro superfamily - - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#12075 - CGI_10009247 superfamily 216981 12 82 0.00542224 34.8158 cl17087 OTU superfamily N - "OTU-like cysteine protease; This family is comprised of a group of predicted cysteine proteases, homologous to the Ovarian Tumour (OTU) gene in Drosophila. Members include proteins from eukaryotes, viruses and pathogenic bacterium. The conserved cysteine and histidine, and possibly the aspartate, represent the catalytic residues in this putative group of proteases." Q#12077 - CGI_10009249 superfamily 245596 149 446 3.22E-148 433.169 cl11394 Glyco_tranf_GTA_type superfamily - - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. 
This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#12077 - CGI_10009249 superfamily 247085 488 612 4.32E-12 63.6786 cl15820 RICIN superfamily - - "Ricin-type beta-trefoil; Carbohydrate-binding domain formed from presumed gene triplication. The domain is found in a variety of molecules serving diverse functions such as enzymatic activity, inhibitory toxicity and signal transduction. Highly specific ligand binding occurs on exposed surfaces of the compact domain structure." Q#12078 - CGI_10009250 superfamily 152105 5 59 2.67E-05 37.8975 cl13169 WBP-1 superfamily C - "WW domain-binding protein 1; This family of proteins represents WBP-1, a ligand of the WW domain of Yes-associated protein. This protein has a proline-rich domain. WBP-1 does not bind to the SH3 domain." Q#12079 - CGI_10009251 superfamily 245596 312 554 4.96E-26 107.182 cl11394 Glyco_tranf_GTA_type superfamily - - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. 
This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#12082 - CGI_10009254 superfamily 247724 5 180 2.75E-77 233.094 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." 
Q#12083 - CGI_10009255 superfamily 243034 703 802 4.27E-19 84.7391 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#12083 - CGI_10009255 superfamily 243034 483 587 5.54E-12 63.9384 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number 
of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#12083 - CGI_10009255 superfamily 243034 151 229 1.94E-09 56.6196 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 
subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#12083 - CGI_10009255 superfamily 222431 1033 1133 3.74E-21 90.3593 cl16447 RPAP3_C superfamily - - Potential Monad-binding region of RPAP3; This domain is found at the C-terminus of RNA-polymerase II-associated proteins. These proteins bind to Monad and are involved in regulating apoptosis. They contain TPR-repeats towards the N_terminus. Q#12084 - CGI_10009256 superfamily 214545 462 604 1.84E-58 196.002 cl10551 CULLIN superfamily - - Cullin; Cullin. Q#12084 - CGI_10009256 superfamily 245539 708 774 1.48E-28 109.951 cl11186 Cullin_Nedd8 superfamily - - "Cullin protein neddylation domain; This is the neddylation site of cullin proteins which are a family of structurally related proteins containing an evolutionarily conserved cullin domain. With the exception of APC2, each member of the cullin family is modified by Nedd8 and several cullins function in Ubiquitin-dependent proteolysis, a process in which the 26S proteasome recognises and subsequently degrades a target protein tagged with K48-linked poly-ubiquitin chains. Cullins are molecular scaffolds responsible for assembling the ROC1/Rbx1 RING-based E3 ubiquitin ligases, of which several play a direct role in tumorigenesis. Nedd8/Rub1 is a small ubiquitin-like protein, which was originally found to be conjugated to Cdc53, a cullin component of the SCF (Skp1-Cdc53/CUL1-F-box protein) E3 Ub ligase complex in Saccharomyces cerevisiae, and Nedd8 modification has now emerged as a regulatory pathway of fundamental importance for cell cycle control and for embryogenesis in metazoans. The only identified Nedd8 substrates are cullins. Neddylation results in covalent conjugation of a Nedd8 moiety onto a conserved cullin lysine residue." 
Q#12089 - CGI_10021542 superfamily 205516 305 457 1.14E-82 254.252 cl18271 AcetylCoA_hyd_C superfamily - - "Acetyl-CoA hydrolase/transferase C-terminal domain; This family contains several enzymes which take part in pathways involving acetyl-CoA. Acetyl-CoA hydrolase EC:3.1.2.1 catalyzes the formation of acetate from acetyl-CoA, CoA transferase (CAT1) EC:2.8.3.- produces succinyl-CoA, and acetate-CoA transferase EC:2.8.3.8 utilises acyl-CoA and acetate to form acetyl-CoA." Q#12089 - CGI_10021542 superfamily 217098 42 213 3.12E-19 85.2806 cl15896 AcetylCoA_hydro superfamily - - "Acetyl-CoA hydrolase/transferase N-terminal domain; This family contains several enzymes which take part in pathways involving acetyl-CoA. Acetyl-CoA hydrolase EC:3.1.2.1 catalyzes the formation of acetate from acetyl-CoA, CoA transferase (CAT1) EC:2.8.3.- produces succinyl-CoA, and acetate-CoA transferase EC:2.8.3.8 utilises acyl-CoA and acetate to form acetyl-CoA." Q#12090 - CGI_10021543 superfamily 243175 6 134 7.41E-19 77.3707 cl02776 GST_C_family superfamily - - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. 
The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." Q#12091 - CGI_10021544 superfamily 241832 4 79 2.62E-22 87.2001 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." 
Q#12091 - CGI_10021544 superfamily 243175 84 215 2.85E-19 80.4523 cl02776 GST_C_family superfamily - - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." 
Q#12092 - CGI_10021545 superfamily 247724 32 87 1.45E-18 76.1616 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#12093 - CGI_10021546 superfamily 247724 32 192 6.29E-55 175.158 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#12094 - CGI_10021547 superfamily 247724 19 195 9.33E-65 200.966 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. 
Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#12095 - CGI_10021548 superfamily 245206 17 304 1.66E-78 243.286 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#12097 - CGI_10021550 superfamily 150446 20 141 1.39E-10 56.6952 cl10756 OSTMP1 superfamily N - "Osteopetrosis-associated transmembrane protein 1 precursor; Members of this family of proteins are required for osteoclast and melanocyte maturation and function. Mutations give rise to autosomal recessive osteopetrosis, also called autosomal recessive Albers-Schonberg disease." Q#12098 - CGI_10021551 superfamily 247755 398 495 5.17E-40 144.919 cl17201 ABC_ATPase superfamily NC - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. 
ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#12098 - CGI_10021551 superfamily 216049 65 324 8.56E-18 83.1042 cl18356 ABC_membrane superfamily - - ABC transporter transmembrane region; This family represents a unit of six transmembrane helices. Many members of the ABC transporter family (pfam00005) have two such regions. Q#12099 - CGI_10021552 superfamily 243053 909 1139 6.77E-67 225.98 cl02485 RasGEF superfamily - - "Guanine nucleotide exchange factor for Ras-like small GTPases. Small GTP-binding proteins of the Ras superfamily function as molecular switches in fundamental events such as signal transduction, cytoskeleton dynamics and intracellular trafficking. Guanine-nucleotide-exchange factors (GEFs) positively regulate these GTP-binding proteins in response to a variety of signals. GEFs catalyze the dissociation of GDP from the inactive GTP-binding proteins. GTP can then bind and induce structural changes that allow interaction with effectors." Q#12099 - CGI_10021552 superfamily 243067 770 877 5.63E-14 70.5191 cl02520 REM superfamily - - "Guanine nucleotide exchange factor for Ras-like GTPases; N-terminal domain (RasGef_N), also called REM domain (Ras exchanger motif). This domain is common in nucleotide exchange factors for Ras-like small GTPases and is typically found immediately N-terminal to the RasGef (Cdc25-like) domain. REM contacts the GTPase and is assumed to participate in the catalytic activity of the exchange factor. Proteins with the REM domain include Sos1 and Sos2, which relay signals from tyrosine-kinase mediated signalling to Ras, RasGRP1-4, RasGRF1,2, CNrasGEF, and RAP-specific nucleotide exchange factors, to name a few." 
Q#12102 - CGI_10021555 superfamily 241646 23 65 2.49E-08 45.9046 cl00156 WAP superfamily - - "whey acidic protein-type four-disulfide core domains. Members of the family include whey acidic protein, elafin (elastase-specific inhibitor), caltrin-like protein (a calcium transport inhibitor) and other extracellular proteinase inhibitors. A group of proteins containing 8 characteristically-spaced cysteine residues forming disulphide bonds, have been termed '4-disulphide core' proteins. Protease inhibition occurs by insertion of the inhibitory loop into the active site pocket and interference with the catalytic residues of the protease." Q#12103 - CGI_10021556 superfamily 222150 99 124 9.98E-05 36.6009 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#12104 - CGI_10021557 superfamily 247683 1435 1490 4.71E-27 107.098 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." 
Q#12104 - CGI_10021557 superfamily 248279 285 401 6.14E-42 151.335 cl17725 zf-HC5HC2H superfamily - - "PHD-like zinc-binding domain; The members of this family are annotated as containing PHD domain, but the zinc-binding region here is not typical of PHD domains. The conformation here is a well-conserved cysteine-histidine rich region spanning 90 residues, where the Cys and His are arranged as HxxC(31)CxxC(6)CxxCxxxxCxxxxHxxC (21)CxxH." Q#12104 - CGI_10021557 superfamily 245835 909 1068 1.44E-30 123.463 cl12013 BAR superfamily C - "The Bin/Amphiphysin/Rvs (BAR) domain, a dimerization module that binds membranes and detects membrane curvature; BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions including organelle biogenesis, membrane trafficking or remodeling, and cell division and migration. Mutations in BAR containing proteins have been linked to diseases and their inactivation in cells leads to altered membrane dynamics. A BAR domain with an additional N-terminal amphipathic helix (an N-BAR) can drive membrane curvature. These N-BAR domains are found in amphiphysins and endophilins, among others. BAR domains are also frequently found alongside domains that determine lipid specificity, such as the Pleckstrin Homology (PH) and Phox Homology (PX) domains which are present in beta centaurins (ACAPs and ASAPs) and sorting nexins, respectively. A FES-CIP4 Homology (FCH) domain together with a coiled coil region is called the F-BAR domain and is present in Pombe/Cdc15 homology (PCH) family proteins, which include Fes/Fes tyrosine kinases, PACSIN or syndapin, CIP4-like proteins, and srGAPs, among others. The Inverse (I)-BAR or IRSp53/MIM homology Domain (IMD) is found in multi-domain proteins, such as IRSp53 and MIM, that act as scaffolding proteins and transducers of a variety of signaling pathways that link membrane dynamics and the underlying actin cytoskeleton. 
BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The I-BAR domain induces membrane protrusions in the opposite direction compared to classical BAR and F-BAR domains, which produce membrane invaginations. BAR domains that also serve as protein interaction domains include those of arfaptin and OPHN1-like proteins, among others, which bind to Rac and Rho GAP domains, respectively." Q#12104 - CGI_10021557 superfamily 220792 59 208 1.19E-13 70.5127 cl11150 EPL1 superfamily - - Enhancer of polycomb-like; This is a family of EPL1 (Enhancer of polycomb-like) proteins. The EPL1 protein is a member of a histone acetyltransferase complex which is involved in transcriptional activation of selected genes. Q#12104 - CGI_10021557 superfamily 247999 234 279 1.03E-11 62.508 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#12104 - CGI_10021557 superfamily 247683 1324 1378 5.40E-10 57.7219 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. 
Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#12106 - CGI_10021559 superfamily 244843 1 71 1.49E-10 55.7 cl08040 Ggt superfamily N - Gamma-glutamyltransferase [Amino acid transport and metabolism] Q#12107 - CGI_10021560 superfamily 213438 6 42 2.04E-12 56.4974 cl04635 F1-ATPase_epsilon superfamily - - "eukaryotic mitochondrial ATP synthase epsilon subunit; The F-ATPase is found in bacterial plasma membranes, mitochondrial inner membranes, and in chloroplast thylakoid membranes. It uses a proton gradient to drive ATP synthesis and hydrolyzes ATP to build the proton gradient. The extrinsic membrane domain, F1, is composed of alpha, beta, gamma, delta, and epsilon subunits (only found in eukaryotes, lacking in bacteria) with a stoichiometry of 3:3:1:1:1. Alpha and beta subunit form the globular catalytic moiety, a hexameric ring of alternating subunits. Gamma, delta and epsilon subunits form a stalk, connecting F1 to F0, the integral membrane proton translocating domain. The epsilon subunit is thought to be involved in the regulation of ATP synthase, since a null mutation increased oligomycin sensitivity and decreased inhibition by inhibitor protein IF1." Q#12108 - CGI_10021561 superfamily 245835 1089 1198 0.00162779 39.9991 cl12013 BAR superfamily C - "The Bin/Amphiphysin/Rvs (BAR) domain, a dimerization module that binds membranes and detects membrane curvature; BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions including organelle biogenesis, membrane trafficking or remodeling, and cell division and migration. Mutations in BAR containing proteins have been linked to diseases and their inactivation in cells leads to altered membrane dynamics. A BAR domain with an additional N-terminal amphipathic helix (an N-BAR) can drive membrane curvature. 
These N-BAR domains are found in amphiphysins and endophilins, among others. BAR domains are also frequently found alongside domains that determine lipid specificity, such as the Pleckstrin Homology (PH) and Phox Homology (PX) domains which are present in beta centaurins (ACAPs and ASAPs) and sorting nexins, respectively. A FES-CIP4 Homology (FCH) domain together with a coiled coil region is called the F-BAR domain and is present in Pombe/Cdc15 homology (PCH) family proteins, which include Fes/Fes tyrosine kinases, PACSIN or syndapin, CIP4-like proteins, and srGAPs, among others. The Inverse (I)-BAR or IRSp53/MIM homology Domain (IMD) is found in multi-domain proteins, such as IRSp53 and MIM, that act as scaffolding proteins and transducers of a variety of signaling pathways that link membrane dynamics and the underlying actin cytoskeleton. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The I-BAR domain induces membrane protrusions in the opposite direction compared to classical BAR and F-BAR domains, which produce membrane invaginations. BAR domains that also serve as protein interaction domains include those of arfaptin and OPHN1-like proteins, among others, which bind to Rac and Rho GAP domains, respectively." Q#12109 - CGI_10021562 superfamily 247725 869 978 1.34E-10 61.0203 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." 
Q#12109 - CGI_10021562 superfamily 243056 1592 1805 1.46E-60 209.47 cl02495 RabGAP-TBC superfamily - - "Rab-GTPase-TBC domain; Identification of a TBC domain in GYP6_YEAST and GYP7_YEAST, which are GTPase activator proteins of yeast Ypt6 and Ypt7, implies that these domains are GTPase activator proteins of Rab-like small GTPases." Q#12109 - CGI_10021562 superfamily 247725 1147 1195 3.19E-11 63.4654 cl17171 PH-like superfamily NC - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#12109 - CGI_10021562 superfamily 152266 1462 1510 2.38E-07 50.3127 cl13297 DUF3350 superfamily - - Domain of unknown function (DUF3350); This domain is functionally uncharacterized. This domain is found in eukaryotes. This presumed domain is typically between 50 to 64 amino acids in length. Q#12110 - CGI_10021563 superfamily 243072 2211 2337 2.89E-30 119.025 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#12110 - CGI_10021563 superfamily 243077 1252 1305 1.06E-13 69.1113 cl02542 DnaJ superfamily - - "DnaJ domain or J-domain. 
DnaJ/Hsp40 (heat shock protein 40) proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperonine family. Hsp40 proteins are characterized by the presence of a J domain, which mediates the interaction with Hsp70. They may contain other domains as well, and the architectures provide a means of classification." Q#12110 - CGI_10021563 superfamily 206405 926 971 3.12E-12 64.5503 cl16735 DUF4339 superfamily - - "Domain of unknown function (DUF4339); This domain is found in bacteria, archaea and eukaryotes, and is approximately 50 amino acids in length. There are two completely conserved residues (G and W) that may be functionally important." Q#12111 - CGI_10021564 superfamily 246925 68 240 5.62E-05 44.2686 cl15309 LRR_RI superfamily C - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." 
Q#12114 - CGI_10004430 superfamily 247684 1 201 1.95E-45 161.676 cl17037 NBD_sugar-kinase_HSP70_actin superfamily NC - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#12116 - CGI_10004432 superfamily 245304 3 177 1.44E-55 185.456 cl10459 Peptidases_S8_S53 superfamily N - "Peptidase domain in the S8 and S53 families; Members of the peptidases S8 (subtilisin and kexin) and S53 (sedolisin) family include endopeptidases and exopeptidases. The S8 family has an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure and are not homologous to trypsin. 
Serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The S53 family contains a catalytic triad Glu/Asp/Ser with an additional acidic residue Asp in the oxyanion hole, similar to that of subtilisin. The serine residue here is the nucleophilic equivalent of the serine residue in the S8 family, while glutamic acid has the same role here as the histidine base. However, the aspartic acid residue that acts as an electrophile is quite different. In S53, it follows glutamic acid, while in S8 it precedes histidine. The stability of these enzymes may be enhanced by calcium; some members have been shown to bind up to 4 ions via binding sites with different affinity. There is a great diversity in the characteristics of their members: some contain disulfide bonds, some are intracellular while others are extracellular, some function at extreme temperatures, and others at high or low pH values." Q#12116 - CGI_10004432 superfamily 201820 262 345 3.72E-19 80.7474 cl08326 P_proprotein superfamily - - Proprotein convertase P-domain; A unique feature of the eukaryotic subtilisin-like proprotein convertases is the presence of an additional highly conserved sequence of approximately 150 residues (P domain) located immediately downstream of the catalytic domain. Q#12117 - CGI_10004433 superfamily 245010 65 154 0.000486063 39.5983 cl09111 Prefoldin superfamily - - "Prefoldin is a hexameric molecular chaperone complex, found in both eukaryotes and archaea, that binds and stabilizes newly synthesized polypeptides allowing them to fold correctly. The complex contains two alpha and four beta subunits, the two subunits being evolutionarily related. In archaea, there is usually only one gene for each subunit while in eukaryotes there are two or more paralogous genes encoding each subunit adding heterogeneity to the structure of the hexamer. The structure of the complex consists of a double beta barrel assembly with six protruding coiled-coils."
Q#12117 - CGI_10004433 superfamily 192987 128 227 0.000644895 39.0927 cl13724 TMF_TATA_bd superfamily - - "TATA element modulatory factor 1 TATA binding; This is the C-terminal conserved coiled coil region of a family of TATA element modulatory factor 1 proteins conserved in eukaryotes. The proteins bind to the TATA element of some RNA polymerase II promoters and repress their activity by competing with the binding of TATA binding protein. TMF1_TATA_bd is the most conserved part of the TMFs. TMFs are evolutionarily conserved golgins that bind Rab6, a ubiquitous ras-like GTP-binding Golgi protein, and contribute to Golgi organisation in animal and plant cells. The Rab6-binding domain appears to be the same region as this C-terminal family." Q#12117 - CGI_10004433 superfamily 247068 437 512 0.00308975 36.9078 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#12120 - CGI_10003886 superfamily 247724 201 468 4.17E-156 452.77 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity.
The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#12122 - CGI_10016549 superfamily 245364 107 186 4.00E-23 90.4904 cl10717 CactinC_cactus superfamily C - "Cactus-binding C-terminus of cactin protein; CactinC_cactus is the C-terminal 200 residues of the cactin protein which are necessary for the association of cactin with IkappaB-cactus as one of the intracellular members of the Rel complex. The Rel (NF-kappaB) pathway is conserved in invertebrates and vertebrates. In mammals, it controls the activities of the immune and inflammatory response genes as well as viral genes, and is critical for cell growth and survival. In Drosophila, the Rel pathway functions in the innate cellular and humoral immune response, in muscle development, and in the establishment of dorsal-ventral polarity in the early embryo. Most members of the family also have a Cactin_mid domain pfam10312 further upstream." 
Q#12122 - CGI_10016549 superfamily 197732 52 82 4.41E-07 44.5507 cl18195 ZnF_U1 superfamily - - "U1-like zinc finger; Family of C2H2-type zinc fingers, present in matrin, U1 small nuclear ribonucleoprotein C and other RNA-binding proteins." Q#12123 - CGI_10016550 superfamily 248264 2 67 0.000145409 39.1426 cl17710 DDE_4 superfamily N - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#12124 - CGI_10016551 superfamily 247905 4 79 2.18E-16 71.8852 cl17351 HELICc superfamily N - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#12128 - CGI_10016556 superfamily 241645 640 710 4.42E-10 57.3547 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. 
Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#12128 - CGI_10016556 superfamily 222150 29 54 1.02E-05 43.5345 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#12128 - CGI_10016556 superfamily 222150 57 80 0.000196297 40.0677 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#12128 - CGI_10016556 superfamily 246975 16 37 0.00447763 36.1709 cl15478 zf-C2H2 superfamily - - "Zinc finger, C2H2 type; The C2H2 zinc finger is the classical zinc finger domain. The two conserved cysteines and histidines co-ordinate a zinc ion. The following pattern describes the zinc finger. #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C] Where X can be any amino acid, and numbers in brackets indicate the number of residues. The positions marked # are those that are important for the stable fold of the zinc finger. The final position can be either his or cys. The C2H2 zinc finger is composed of two short beta strands followed by an alpha helix. The amino terminal part of the helix binds the major groove in DNA binding zinc fingers. The accepted consensus binding sequence for Sp1 is usually defined by the asymmetric hexanucleotide core GGGCGG but this sequence does not include, among others, the GAG (=CTC) repeat that constitutes a high-affinity site for Sp1 binding to the wt1 promoter." 
Q#12130 - CGI_10016559 superfamily 248097 101 228 5.94E-20 82.313 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#12131 - CGI_10016560 superfamily 248012 67 206 3.48E-24 93.9272 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#12132 - CGI_10016561 superfamily 248012 61 200 4.47E-24 93.542 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#12133 - CGI_10016562 superfamily 248012 19 158 1.15E-24 93.9272 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#12134 - CGI_10016563 superfamily 241593 240 374 2.13E-11 60.353 cl00075 HATPase_c superfamily - - "Histidine kinase-like ATPases; This family includes several ATP-binding proteins for example: histidine kinase, DNA gyrase B, topoisomerases, heat shock protein HSP90, phytochrome-like ATPases and DNA mismatch repair proteins" Q#12134 - CGI_10016563 superfamily 220754 30 191 1.01E-53 178.57 cl11087 BCDHK_Adom3 superfamily - - "Mitochondrial branched-chain alpha-ketoacid dehydrogenase kinase; Catabolism and synthesis of leucine, isoleucine and valine are finely balanced, allowing the body to make the most of dietary input but removing excesses to prevent toxic build-up of their corresponding keto-acids. This is the butyryl-CoA dehydrogenase, subunit A domain 3, a largely alpha-helical bundle of the enzyme BCDHK. This enzyme is the regulator of the dehydrogenase complex that breaks branched-chain amino-acids down, by phosphorylating and thereby inactivating it when synthesis is required. The domain is associated with family HATPase_c pfam02518 which is towards the C-terminal." Q#12137 - CGI_10016568 superfamily 243146 318 367 1.16E-06 45.7431 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. 
" Q#12137 - CGI_10016568 superfamily 243146 369 414 1.27E-05 42.6615 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#12137 - CGI_10016568 superfamily 243146 158 210 1.45E-05 42.6615 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#12137 - CGI_10016568 superfamily 243146 216 261 0.000605681 37.6539 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#12137 - CGI_10016568 superfamily 243146 253 296 0.00595203 34.7314 cl02701 Kelch_3 superfamily C - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#12138 - CGI_10016569 superfamily 245008 496 573 6.04E-26 101.907 cl09101 E_set superfamily - - "Early set domain associated with the catalytic domain of sugar utilizing enzymes at either the N or C terminus; The E or "early" set domains of sugar utilizing enzymes are associated with different types of catalytic domains at either the N-terminal or C-terminal end. These domains may be related to the immunoglobulin and/or fibronectin type III superfamilies. Members of this family include alpha amylase, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitobiase, and chitinase. A subset of these members were recently identified as members of the CBM48 (Carbohydrate Binding Module 48) family. Members of the CBM48 family include pullulanase, maltooligosyl trehalose synthase, starch branching enzyme, glycogen branching enzyme, glycogen debranching enzyme, isoamylase, and the beta subunit of AMP-activated protein kinase." Q#12143 - CGI_10005307 superfamily 215733 89 153 1.39E-07 47.5599 cl02811 E1-E2_ATPase superfamily C - E1-E2 ATPase; E1-E2 ATPase. 
Q#12143 - CGI_10005307 superfamily 243244 9 50 1.81E-06 42.1374 cl02930 Cation_ATPase_N superfamily N - "Cation transporter/ATPase, N-terminus; Members of this family are involved in Na+/K+, H+/K+, Ca++ and Mg++ transport." Q#12144 - CGI_10005308 superfamily 241571 27 136 1.68E-28 107.883 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#12144 - CGI_10005308 superfamily 241571 215 328 6.80E-27 103.261 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#12144 - CGI_10005308 superfamily 241571 156 231 5.92E-06 43.918 cl00049 CUB superfamily C - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#12145 - CGI_10004495 superfamily 243161 4 69 0.000545321 35.0626 cl02739 THAP superfamily C - "THAP domain; The THAP domain is a putative DNA-binding domain (DBD) and probably also binds a zinc ion. It features the conserved C2CH architecture (consensus sequence: Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His). Other universal features include the location of the domain at the N-termini of proteins, its size of about 90 residues, a C-terminal AVPTIF box and several other conserved residues. Orthologues of the human THAP domain have been identified in other vertebrates and probably worms and flies, but not in other eukaryotes or any prokaryotes." Q#12146 - CGI_10013672 superfamily 246680 24 106 3.10E-14 66.9636 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families.
DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#12146 - CGI_10013672 superfamily 248012 171 281 1.08E-09 55.4072 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#12148 - CGI_10013674 superfamily 248264 78 242 2.89E-51 166.258 cl17710 DDE_4 superfamily - - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#12148 - CGI_10013674 superfamily 222263 1 89 1.37E-05 41.9197 cl16321 DDE_4_2 superfamily - - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." 
Q#12149 - CGI_10013675 superfamily 241600 1 73 2.99E-23 88.4514 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#12150 - CGI_10013676 superfamily 243096 11 200 2.84E-36 133.192 cl02571 RhoGEF superfamily - - Guanine nucleotide exchange factor for Rho/Rac/Cdc42-like GTPases; Also called Dbl-homologous (DH) domain. It appears that PH domains invariably occur C-terminal to RhoGEF/DH domains. Q#12150 - CGI_10013676 superfamily 247725 302 391 3.89E-20 86.2344 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." 
Q#12150 - CGI_10013676 superfamily 247725 237 276 8.77E-13 65.0484 cl17171 PH-like superfamily C - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#12150 - CGI_10013676 superfamily 247725 400 537 2.30E-11 62.0134 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#12152 - CGI_10013678 superfamily 241754 56 766 0 1223.87 cl00286 Motor_domain superfamily - - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. Q#12156 - CGI_10006336 superfamily 221612 159 299 6.10E-20 91.3516 cl13889 DUF3715 superfamily - - "Protein of unknown function (DUF3715); This domain family is found in eukaryotes, and is approximately 170 amino acids in length." 
Q#12157 - CGI_10006337 superfamily 241628 5 164 9.74E-48 157.929 cl00130 PseudoU_synth superfamily N - "Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi); Pseudouridine synthases contain the RsuA/RluD, TruA, TruB and TruD families. This group consists of eukaryotic, bacterial and archaeal pseudouridine synthases. Some psi sites such as psi55, 13, 38 and 39 in tRNA are highly conserved, being in the same position in eubacteria, archaebacteria and eukaryotes. Other psi sites occur in a more restricted fashion, for example psi2604 in 23S RNA made by E. coli RluF has only been detected in E. coli. Human dyskerin with the help of guide RNAs makes the hundreds of pseudouridines present in rRNA and small nuclear RNAs (snRNAs). Mutations in human dyskerin cause X-linked dyskeratosis congenita. Missense mutation in human PUS1 causes mitochondrial myopathy and sideroblastic anemia (MLASA)." Q#12159 - CGI_10006339 superfamily 212639 121 831 0 721.018 cl17018 FANC superfamily C - "Fanconi anemia ID complex proteins FANCI and FANCD2; The Fanconi anemia ID complex consists of two subunits, Fanconi anemia I and Fanconi anemia D2 (FANCI-FANCD2) and plays a central role in the repair of DNA interstrand cross-links (ICLs). The complex is activated via DNA damage-induced phosphorylation by ATR (ataxia telangiectasia and Rad3-related) and monoubiquitination by the FA core complex ubiquitin ligase, and it binds to DNA at the ICL site, recognizing branched DNA structures. Defects in the complex cause Fanconi anemia, a cancer predisposition syndrome." Q#12159 - CGI_10006339 superfamily 212639 920 1406 1.24E-139 461.008 cl17018 FANC superfamily N - "Fanconi anemia ID complex proteins FANCI and FANCD2; The Fanconi anemia ID complex consists of two subunits, Fanconi anemia I and Fanconi anemia D2 (FANCI-FANCD2) and plays a central role in the repair of DNA interstrand cross-links (ICLs).
The complex is activated via DNA damage-induced phosphorylation by ATR (ataxia telangiectasia and Rad3-related) and monoubiquitination by the FA core complex ubiquitin ligase, and it binds to DNA at the ICL site, recognizing branched DNA structures. Defects in the complex cause Fanconi anemia, a cancer predisposition syndrome." Q#12160 - CGI_10006340 superfamily 241974 697 765 7.97E-13 66.111 cl00604 STAS superfamily N - "Sulphate Transporter and Anti-Sigma factor antagonist domain found in the C-terminal region of sulphate transporters as well as in bacterial and archaeal proteins involved in the regulation of sigma factors; The STAS (Sulphate Transporter and Anti-Sigma factor antagonist) domain is found in the C-terminal region of sulphate transporters as well as in bacterial and archaeal proteins involved in the regulation of sigma factors, like anti-anti-sigma factors and "stressosome" components. The sigma factor regulators are involved in protein-protein interaction which is regulated by phosphorylation." Q#12160 - CGI_10006340 superfamily 241974 577 612 1.09E-05 44.5399 cl00604 STAS superfamily C - "Sulphate Transporter and Anti-Sigma factor antagonist domain found in the C-terminal region of sulphate transporters as well as in bacterial and archaeal proteins involved in the regulation of sigma factors; The STAS (Sulphate Transporter and Anti-Sigma factor antagonist) domain is found in the C-terminal region of sulphate transporters as well as in bacterial and archaeal proteins involved in the regulation of sigma factors, like anti-anti-sigma factors and "stressosome" components. The sigma factor regulators are involved in protein-protein interaction which is regulated by phosphorylation." Q#12160 - CGI_10006340 superfamily 216188 238 519 2.03E-53 186.655 cl18360 Sulfate_transp superfamily - - Sulfate transporter family; Mutations in human SLC26A2 lead to several human diseases. 
Q#12160 - CGI_10006340 superfamily 205965 70 153 4.46E-34 125.988 cl18285 Sulfate_tra_GLY superfamily - - "Sulfate transporter N-terminal domain with GLY motif; This domain is found usually at the N-terminus of sulfate-transporter proteins. It carries a highly conserved GLY sequence motif, but the function of the domain is not known." Q#12167 - CGI_10004678 superfamily 247727 100 223 1.55E-09 54.3583 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#12168 - CGI_10004679 superfamily 247085 301 418 6.07E-15 70.9974 cl15820 RICIN superfamily - - "Ricin-type beta-trefoil; Carbohydrate-binding domain formed from presumed gene triplication. The domain is found in a variety of molecules serving diverse functions such as enzymatic activity, inhibitory toxicity and signal transduction. Highly specific ligand binding occurs on exposed surfaces of the compact domain structure." Q#12168 - CGI_10004679 superfamily 245596 18 288 1.67E-101 306.823 cl11394 Glyco_tranf_GTA_type superfamily - - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein.
Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#12169 - CGI_10006874 superfamily 244881 245 546 6.15E-133 403.112 cl08267 ISOPREN_C2_like superfamily - - "This group contains class II terpene cyclases, protein prenyltransferases beta subunit, two broadly specific proteinase inhibitors alpha2-macroglobulin (alpha (2)-M) and pregnancy zone protein (PZP) and, the C3 C4 and C5 components of vertebrate complement. Class II terpene cyclases include squalene cyclase (SQCY) and 2,3-oxidosqualene cyclase (OSQCY), these integral membrane proteins catalyze a cationic cyclization cascade converting linear triterpenes to fused ring compounds. The protein prenyltransferases include protein farnesyltransferase (FTase) and geranylgeranyltransferase types I and II (GGTase-I and GGTase-II) which catalyze the carboxyl-terminal lipidation of Ras, Rab, and several other cellular signal transduction proteins, facilitating membrane associations and specific protein-protein interactions. Alpha (2)-M is a major carrier protein in serum and involved in the immobilization and entrapment of proteases. PZP is a pregnancy associated protein. 
Alpha (2)-M and PZP are known to bind to and, may modulate, the activity of placental protein-14 in T-cell growth and cytokine production thereby protecting the allogeneic fetus from attack by the maternal immune system." Q#12169 - CGI_10006874 superfamily 215788 21 111 1.80E-27 108.035 cl08251 A2M superfamily - - Alpha-2-macroglobulin family; This family includes the C-terminal region of the alpha-2-macroglobulin family. Q#12169 - CGI_10006874 superfamily 203720 655 747 1.12E-25 103.012 cl08457 A2M_recep superfamily - - A-macroglobulin receptor; This family includes the receptor domain region of the alpha-2-macroglobulin family. Q#12169 - CGI_10006874 superfamily 147487 73 170 0.000858572 38.923 cl05075 SVA superfamily N - "Seminal vesicle autoantigen (SVA); This family consists of seminal vesicle autoantigen and prolactin-inducible (PIP) proteins. Seminal vesicle autoantigen (SVA) is specifically present in the seminal plasma of mice. This 19-kDa secretory glycoprotein suppresses the motility of spermatozoa by interacting with phospholipid. PIP, has several known functions. In saliva, this protein plays a role in host defence by binding to microorganisms such as Streptococcus. PIP is an aspartyl proteinase and it acts as a factor capable of suppressing T-cell apoptosis through its interaction with CD4." Q#12170 - CGI_10006875 superfamily 241754 80 114 6.77E-17 75.7003 cl00286 Motor_domain superfamily C - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. Q#12171 - CGI_10006876 superfamily 241754 74 108 2.28E-15 71.0779 cl00286 Motor_domain superfamily C - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. 
Q#12171 - CGI_10006876 superfamily 111612 26 68 2.28E-06 40.5422 cl03686 Myosin_N superfamily - - Myosin N-terminal SH3-like domain; This domain has an SH3-like fold. It is found at the N-terminus of many but not all myosins. The function of this domain is unknown. Q#12174 - CGI_10006879 superfamily 242828 74 337 1.08E-34 132.847 cl01996 Glyco_hydro_76 superfamily - - "Glycosyl hydrolase family 76; Family of alpha-1,6-mannanases." Q#12175 - CGI_10006880 superfamily 246918 292 348 4.14E-06 44.8851 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#12175 - CGI_10006880 superfamily 246918 24 76 0.000137437 40.6479 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#12175 - CGI_10006880 superfamily 245814 386 447 0.00137688 37.3864 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#12175 - CGI_10006880 superfamily 246918 513 533 0.00769173 35.2551 cl15278 TSP_1 superfamily C - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#12176 - CGI_10006881 superfamily 220757 318 362 2.07E-09 54.9847 cl11093 FIST_C superfamily N - "FIST C domain; The FIST C domain is a novel sensory domain, which is present in signal transduction proteins from Bacteria, Archaea and Eukarya. 
Chromosomal proximity of FIST-encoding genes to those coding for proteins involved in amino acid metabolism and transport suggest that FIST domains bind small ligands, such as amino acids." Q#12176 - CGI_10006881 superfamily 243074 24 55 9.68E-05 39.7274 cl02535 F-box-like superfamily - - F-box-like; This is an F-box-like family. Q#12176 - CGI_10006881 superfamily 245358 222 323 0.00348847 37.2593 cl10701 FIST superfamily C - "FIST N domain; The FIST N domain is a novel sensory domain, which is present in signal transduction proteins from Bacteria, Archaea and Eukarya. Chromosomal proximity of FIST-encoding genes to those coding for proteins involved in amino acid metabolism and transport suggest that FIST domains bind small ligands, such as amino acids." Q#12177 - CGI_10006882 superfamily 183292 82 250 7.60E-20 88.7251 cl18135 PRK11728 superfamily NC - hydroxyglutarate oxidase; Provisional Q#12181 - CGI_10006886 superfamily 247907 12 166 2.35E-10 59.3541 cl17353 LamG superfamily - - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." Q#12181 - CGI_10006886 superfamily 241645 466 559 0.0020239 37.5628 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. 
The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#12186 - CGI_10005399 superfamily 245206 32 269 1.01E-54 181.318 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#12188 - CGI_10005401 superfamily 241563 70 107 0.000217026 36.6884 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#12189 - CGI_10009981 superfamily 241874 68 629 0 682.753 cl00456 SLC5-6-like_sbd superfamily - - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#12190 - CGI_10009982 superfamily 241607 536 592 2.75E-19 83.8929 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chymotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. 
The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters), which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#12191 - CGI_10009983 superfamily 241646 20 64 0.000109813 35.119 cl00156 WAP superfamily - - "whey acidic protein-type four-disulfide core domains. Members of the family include whey acidic protein, elafin (elastase-specific inhibitor), caltrin-like protein (a calcium transport inhibitor) and other extracellular proteinase inhibitors. A group of proteins containing 8 characteristically-spaced cysteine residues forming disulphide bonds, have been termed '4-disulphide core' proteins. Protease inhibition occurs by insertion of the inhibitory loop into the active site pocket and interference with the catalytic residues of the protease." Q#12192 - CGI_10009984 superfamily 248097 129 228 3.84E-17 76.1498 cl17543 C1q superfamily C - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#12193 - CGI_10009985 superfamily 248097 6 131 2.07E-20 81.5426 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#12195 - CGI_10009987 superfamily 241607 209 265 4.39E-16 70.7961 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. 
Kazal inhibitors inhibit serine proteases, such as, trypsin, chymotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters), which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#12196 - CGI_10009988 superfamily 245819 255 391 3.68E-55 184.32 cl11967 Nucleotidyl_cyc_III superfamily C - "Class III nucleotidyl cyclases; Class III nucleotidyl cyclases are the largest, most diverse group of nucleotidyl cyclases (NC's) containing prokaryotic and eukaryotic proteins. They can be divided into two major groups; the mononucleotidyl cyclases (MNC's) and the diguanylate cyclases (DGC's). 
The MNC's, which include the adenylate cyclases (AC's) and the guanylate cyclases (GC's), have a conserved cyclase homology domain (CHD), while the DGC's have a conserved GGDEF domain, named after a conserved motif within this subgroup. Their products, cyclic guanylyl and adenylyl nucleotides, are second messengers that play important roles in eukaryotic signal transduction and prokaryotic sensory pathways." Q#12196 - CGI_10009988 superfamily 245201 22 178 1.50E-21 93.37 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#12196 - CGI_10009988 superfamily 219526 194 241 7.21E-06 45.6879 cl06648 HNOBA superfamily N - "Heme NO binding associated; The HNOBA domain is found associated with the HNOB domain and pfam00211 in soluble cyclases and signalling proteins. The HNOB domain is predicted to function as a heme-dependent sensor for gaseous ligands, and transduce diverse downstream signals, in both bacteria and animals." Q#12197 - CGI_10009989 superfamily 245225 1 264 1.69E-61 202.477 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily N - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. 
This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein coupled receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine-binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#12198 - CGI_10009990 superfamily 245225 24 103 7.97E-07 44.9949 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily C - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. 
This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein coupled receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine-binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#12199 - CGI_10009991 superfamily 245603 59 385 0 567.172 cl11403 pepsin_retropepsin_like superfamily - - "Cellular and retroviral pepsin-like aspartate proteases; This family includes both cellular and retroviral pepsin-like aspartate proteases. 
The cellular pepsin and pepsin-like enzymes are twice as long as their retroviral counterparts. The cellular pepsin-like aspartic proteases are found in mammals, plants, fungi and bacteria. These well known and extensively characterized enzymes include pepsins, chymosin, rennin, cathepsins, and fungal aspartic proteases. Several have long been known to be medically (rennin, cathepsin D and E, pepsin) or commercially (chymosin) important. The eukaryotic pepsin-like proteases contain two domains possessing similar topological features. The N- and C-terminal domains, although structurally related by a 2-fold axis, have only limited sequence homology except in the vicinity of the active site. This suggests that the enzymes evolved by an ancient duplication event. The eukaryotic pepsin-like proteases have two active site ASP residues with each N- and C-terminal lobe contributing one residue. While the fungal and mammalian pepsins are bilobal proteins, retropepsins function as dimers and the monomer resembles structure of the N- or C-terminal domains of eukaryotic enzyme. The active site motif (Asp-Thr/Ser-Gly-Ser) is conserved between the retroviral and eukaryotic proteases and between the N-and C-terminal of eukaryotic pepsin-like proteases. The retropepsin-like family includes pepsin-like aspartate proteases from retroviruses, retrotransposons and retroelements; as well as eukaryotic DNA-damage-inducible proteins (DDIs), and bacterial aspartate peptidases. Retropepsin is synthesized as part of the POL polyprotein that contains an aspartyl-protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. This family of aspartate proteases is classified by MEROPS as the peptidase family A1 (pepsin A) and A2 (retropepsin family)." 
Q#12199 - CGI_10009991 superfamily 245603 409 434 3.45E-07 50.1164 cl11403 pepsin_retropepsin_like superfamily N - "Cellular and retroviral pepsin-like aspartate proteases; This family includes both cellular and retroviral pepsin-like aspartate proteases. The cellular pepsin and pepsin-like enzymes are twice as long as their retroviral counterparts. The cellular pepsin-like aspartic proteases are found in mammals, plants, fungi and bacteria. These well known and extensively characterized enzymes include pepsins, chymosin, rennin, cathepsins, and fungal aspartic proteases. Several have long been known to be medically (rennin, cathepsin D and E, pepsin) or commercially (chymosin) important. The eukaryotic pepsin-like proteases contain two domains possessing similar topological features. The N- and C-terminal domains, although structurally related by a 2-fold axis, have only limited sequence homology except in the vicinity of the active site. This suggests that the enzymes evolved by an ancient duplication event. The eukaryotic pepsin-like proteases have two active site ASP residues with each N- and C-terminal lobe contributing one residue. While the fungal and mammalian pepsins are bilobal proteins, retropepsins function as dimers and the monomer resembles structure of the N- or C-terminal domains of eukaryotic enzyme. The active site motif (Asp-Thr/Ser-Gly-Ser) is conserved between the retroviral and eukaryotic proteases and between the N-and C-terminal of eukaryotic pepsin-like proteases. The retropepsin-like family includes pepsin-like aspartate proteases from retroviruses, retrotransposons and retroelements; as well as eukaryotic DNA-damage-inducible proteins (DDIs), and bacterial aspartate peptidases. Retropepsin is synthesized as part of the POL polyprotein that contains an aspartyl-protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. 
This family of aspartate proteases is classified by MEROPS as the peptidase family A1 (pepsin A) and A2 (retropepsin family)." Q#12199 - CGI_10009991 superfamily 116576 20 47 3.24E-05 41.186 cl06833 A1_Propeptide superfamily - - "A1 Propeptide; Most eukaryotic endopeptidases (Merops Family A1) are synthesised with signal and propeptides. The animal pepsin-like endopeptidase propeptides form a distinct family of propeptides, which contain a conserved motif approximately 30 residues long. In pepsinogen A, the first 11 residues of the mature pepsin sequence are displaced by residues of the propeptide. The propeptide contains two helices that block the active site cleft, in particular the conserved Asp11 residue, in pepsin, hydrogen bonds to a conserved Arg residues in the propeptide. This hydrogen bond stabilises the propeptide conformation and is probably responsible for triggering the conversion of pepsinogen to pepsin under acidic conditions." Q#12201 - CGI_10009993 superfamily 241565 407 479 2.99E-08 51.9387 cl00038 BRCT superfamily - - "Breast Cancer Suppressor Protein (BRCA1), carboxy-terminal domain. The BRCT domain is found within many DNA damage repair and cell cycle checkpoint proteins. The unique diversity of this domain superfamily allows BRCT modules to interact forming homo/hetero BRCT multimers, BRCT-non-BRCT interactions, and interactions within DNA strand breaks." Q#12201 - CGI_10009993 superfamily 241565 509 573 5.10E-08 51.1683 cl00038 BRCT superfamily - - "Breast Cancer Suppressor Protein (BRCA1), carboxy-terminal domain. The BRCT domain is found within many DNA damage repair and cell cycle checkpoint proteins. The unique diversity of this domain superfamily allows BRCT modules to interact forming homo/hetero BRCT multimers, BRCT-non-BRCT interactions, and interactions within DNA strand breaks." 
Q#12201 - CGI_10009993 superfamily 241565 620 686 2.92E-07 48.8571 cl00038 BRCT superfamily - - "Breast Cancer Suppressor Protein (BRCA1), carboxy-terminal domain. The BRCT domain is found within many DNA damage repair and cell cycle checkpoint proteins. The unique diversity of this domain superfamily allows BRCT modules to interact forming homo/hetero BRCT multimers, BRCT-non-BRCT interactions, and interactions within DNA strand breaks." Q#12201 - CGI_10009993 superfamily 241565 82 134 1.94E-06 46.5459 cl00038 BRCT superfamily N - "Breast Cancer Suppressor Protein (BRCA1), carboxy-terminal domain. The BRCT domain is found within many DNA damage repair and cell cycle checkpoint proteins. The unique diversity of this domain superfamily allows BRCT modules to interact forming homo/hetero BRCT multimers, BRCT-non-BRCT interactions, and interactions within DNA strand breaks." Q#12201 - CGI_10009993 superfamily 241565 9 72 0.000926379 38.4567 cl00038 BRCT superfamily - - "Breast Cancer Suppressor Protein (BRCA1), carboxy-terminal domain. The BRCT domain is found within many DNA damage repair and cell cycle checkpoint proteins. The unique diversity of this domain superfamily allows BRCT modules to interact forming homo/hetero BRCT multimers, BRCT-non-BRCT interactions, and interactions within DNA strand breaks." Q#12201 - CGI_10009993 superfamily 241565 723 805 0.00124241 38.1262 cl00038 BRCT superfamily - - "Breast Cancer Suppressor Protein (BRCA1), carboxy-terminal domain. The BRCT domain is found within many DNA damage repair and cell cycle checkpoint proteins. The unique diversity of this domain superfamily allows BRCT modules to interact forming homo/hetero BRCT multimers, BRCT-non-BRCT interactions, and interactions within DNA strand breaks." Q#12202 - CGI_10005290 superfamily 241555 34 282 2.16E-38 138.959 cl00020 GAT_1 superfamily - - "Type 1 glutamine amidotransferase (GATase1)-like domain; Type 1 glutamine amidotransferase (GATase1)-like domain. 
This group contains proteins similar to Class I glutamine amidotransferases, the intracellular PH1704 from Pyrococcus horikoshii, the C-terminal of the large catalase: Escherichia coli HP-II, Sinorhizobium meliloti Rm1021 ThuA, the A4 beta-galactosidase middle domain and peptidase E. The majority of proteins in this group have a reactive Cys found in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow. For Class I glutamine amidotransferases proteins which transfer ammonia from the amide side chain of glutamine to an acceptor substrate, this Cys forms a Cys-His-Glu catalytic triad in the active site. Glutamine amidotransferases activity can be found in a range of biosynthetic enzymes included in this cd: glutamine amidotransferase, formylglycinamide ribonucleotide, GMP synthetase, anthranilate synthase component II, glutamine-dependent carbamoyl phosphate synthase (CPSase), cytidine triphosphate synthetase, gamma-glutamyl hydrolase, imidazole glycerol phosphate synthase, and cobyric acid synthase. For Pyrococcus horikoshii PH1704, the Cys of the nucleophile elbow together with a different His and a Glu from an adjacent monomer form a catalytic triad different from the typical GATase1 triad. Peptidase E is believed to be a serine peptidase having a Ser-His-Glu catalytic triad which differs from the Cys-His-Glu catalytic triad of typical GATase1 domains, by having a Ser in place of the reactive Cys at the nucleophile elbow. The E. coli HP-II C-terminal domain, S. meliloti Rm1021 ThuA and the A4 beta-galactosidase middle domain lack the catalytic triad typical of GATase1 domains. GATase1-like domains can occur either as single polypeptides, as in Class I glutamine amidotransferases, or as domains in a much larger multifunctional synthase protein, such as CPSase. Peptidase E has a circular permutation in the common core of a typical GATase1 domain." 
Q#12203 - CGI_10005291 superfamily 247068 358 454 1.88E-10 58.4789 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#12203 - CGI_10005291 superfamily 247068 483 556 4.02E-06 45.4183 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#12203 - CGI_10005291 superfamily 247068 562 652 0.00262186 36.9078 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#12205 - CGI_10003063 superfamily 241600 50 258 4.33E-84 252.932 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#12206 - CGI_10003064 superfamily 245206 38 251 6.08E-51 169.713 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. 
Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#12207 - CGI_10007535 superfamily 241563 117 150 2.65E-06 44.9707 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#12208 - CGI_10007536 superfamily 241577 27 187 1.04E-74 225.963 cl00056 MH2 superfamily - - "C-terminal Mad Homology 2 (MH2) domain; The MH2 domain is found in the SMAD (small mothers against decapentaplegic) family of proteins and is responsible for type I receptor interactions, phosphorylation-triggered homo- and hetero-oligomerization, and transactivation. It is negatively regulated by the N-terminal MH1 domain which prevents it from forming a complex with SMAD4. The MH2 domain is multifunctional and provides SMADs with their specificity and selectivity, as well as transcriptional activity. Several transcriptional co-activators and repressors have also been reported to regulate SMAD signaling by interacting with the MH2 domain. Mutations in the MH2 domains of SMAD2 and especially SMAD4 have been detected in colorectal and other human cancers." Q#12209 - CGI_10007537 superfamily 219542 89 187 3.20E-37 130.439 cl18517 Cu-oxidase_3 superfamily - - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. 
Q#12209 - CGI_10007537 superfamily 215896 200 306 1.98E-15 71.172 cl18351 Cu-oxidase superfamily C - Multicopper oxidase; Many of the proteins in this family contain multiple similar copies of this plastocyanin-like domain. Q#12211 - CGI_10007539 superfamily 219541 191 339 1.46E-24 97.1538 cl18516 Cu-oxidase_2 superfamily - - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. Q#12211 - CGI_10007539 superfamily 215896 2 68 2.08E-08 51.912 cl18351 Cu-oxidase superfamily N - Multicopper oxidase; Many of the proteins in this family contain multiple similar copies of this plastocyanin-like domain. Q#12214 - CGI_10002815 superfamily 215754 185 273 6.15E-24 93.0868 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#12214 - CGI_10002815 superfamily 215754 82 182 7.03E-16 70.7452 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#12214 - CGI_10002815 superfamily 215754 1 74 7.05E-11 57.2632 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#12216 - CGI_10002817 superfamily 243072 101 220 9.10E-32 119.796 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#12216 - CGI_10002817 superfamily 243072 199 319 3.10E-30 115.559 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. 
The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#12216 - CGI_10002817 superfamily 243072 293 418 6.62E-30 114.403 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#12216 - CGI_10002817 superfamily 243072 363 484 9.43E-26 102.847 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#12216 - CGI_10002817 superfamily 243072 35 154 2.03E-25 102.077 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#12216 - CGI_10002817 superfamily 243073 559 601 4.37E-06 44.4087 cl02533 SOCS superfamily - - "SOCS (suppressors of cytokine signaling) box. The SOCS box is found in the C-terminal region of CIS/SOCS family proteins (in combination with a SH2 domain), ASBs (ankyrin repeat-containing proteins with a SOCS box), SSBs (SPRY domain-containing proteins with a SOCS box), and WSBs (WD40 repeat-containing proteins with a SOCS box), as well as, other miscellaneous proteins. The function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions." Q#12217 - CGI_10002818 superfamily 245205 31 106 3.25E-05 40.2989 cl09930 RPA_2b-aaRSs_OBF_like superfamily - - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). 
aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." Q#12219 - CGI_10002395 superfamily 241573 90 410 1.46E-115 354.716 cl00051 CysPc superfamily - - "Calpains, domains IIa, IIb; calcium-dependent cytoplasmic cysteine proteinases, papain-like. Functions in cytoskeletal remodeling processes, cell differentiation, apoptosis and signal transduction." Q#12219 - CGI_10002395 superfamily 241653 421 571 1.45E-43 154.764 cl00165 Calpain_III superfamily - - "Calpain, subdomain III. Calpains are calcium-activated cytoplasmic cysteine proteinases, participate in cytoskeletal remodeling processes, cell differentiation, apoptosis and signal transduction. Catalytic domain and the two calmodulin-like domains are separated by C2-like domain III. Domain III plays an important role in calcium-induced activation of calpain involving electrostatic interactions with subdomain II. Proposed to mediate calpain's interaction with phospholipids and translocation to cytoplasmic/nuclear membranes. CD includes subdomain III of typical and atypical calpains." Q#12221 - CGI_10008237 superfamily 245847 120 192 5.77E-05 40.6178 cl12042 FA58C superfamily C - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#12222 - CGI_10008238 superfamily 241574 406 632 3.99E-97 306.435 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. 
Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#12222 - CGI_10008238 superfamily 241574 706 886 6.33E-14 71.4629 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#12223 - CGI_10008239 superfamily 248097 122 247 1.35E-18 78.8462 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#12224 - CGI_10008240 superfamily 241750 52 329 1.76E-40 144.263 cl00281 metallo-dependent_hydrolases superfamily - - "Superfamily of metallo-dependent hydrolases (also called amidohydrolase superfamily) is a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. The family includes urease alpha, adenosine deaminase, phosphotriesterase dihydroorotases, allantoinases, hydantoinases, AMP-, adenine and cytosine deaminases, imidazolonepropionase, aryldialkylphosphatase, chlorohydrolases, formylmethanofuran dehydrogenases and others." 
Q#12225 - CGI_10008241 superfamily 243035 227 268 1.01E-05 42.9922 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#12226 - CGI_10008242 superfamily 245206 27 192 5.09E-42 143.583 cl09931 NADB_Rossmann superfamily N - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#12227 - CGI_10008243 superfamily 245206 44 105 2.72E-05 40.2109 cl09931 NADB_Rossmann superfamily C - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. 
The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#12229 - CGI_10008245 superfamily 215724 1 257 6.86E-140 398.531 cl14706 wnt superfamily - - "wnt family; Wnt genes have been identified in vertebrates and invertebrates but not in plants, unicellular eukaryotes or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathway) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families." Q#12230 - CGI_10005193 superfamily 248012 52 129 1.84E-18 78.134 cl17458 TIR_2 superfamily C - TIR domain; This is a family of bacterial Toll-like receptors. Q#12231 - CGI_10005194 superfamily 248012 615 758 3.71E-18 82.3712 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. 
Q#12233 - CGI_10005196 superfamily 247727 215 348 5.37E-12 62.0622 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#12236 - CGI_10005199 superfamily 243138 117 350 1.78E-109 324.141 cl02675 DZF superfamily - - DZF domain; The function of this domain is unknown. It is often found associated with pfam00098 or pfam00035. This domain has been predicted to belong to the nucleotidyltransferase superfamily. Q#12238 - CGI_10003461 superfamily 221540 318 452 8.37E-63 201.551 cl13742 DUF3641 superfamily - - "Protein of unknown function (DUF3641); This domain family is found in bacteria and eukaryotes, and is approximately 140 amino acids in length. The family is found in association with pfam04055. This family consists of proteins which are commonly annotated as Radical SAM domains but there is little annotation to back this up." Q#12239 - CGI_10003462 superfamily 243050 23 75 4.20E-23 88.0358 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. 
LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#12239 - CGI_10003462 superfamily 243050 127 179 6.36E-18 74.1686 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#12241 - CGI_10022231 superfamily 242730 162 240 0.00811369 35.6999 cl01825 Phage_Mu_Gam superfamily C - Bacteriophage Mu Gam like protein; This family consists of bacterial and phage Gam proteins. The gam gene of bacteriophage Mu encodes a protein which protects linear double stranded DNA from exonuclease degradation in vitro and in vivo. Q#12242 - CGI_10022232 superfamily 244083 41 147 2.87E-36 124.284 cl05417 PLA2_like superfamily - - "PLA2_like: Phospholipase A2, a super-family of secretory and cytosolic enzymes; the latter are either Ca dependent or Ca independent. 
PLA2 cleaves the sn-2 position of the glycerol backbone of phospholipids (PC or phosphatidylethanolamine), usually in a metal-dependent reaction, to generate lysophospholipid (LysoPL) and a free fatty acid (FA). The resulting products are either dietary or used in synthetic pathways for leukotrienes and prostaglandins. Often, arachidonic acid is released as a free fatty acid and acts as second messenger in signaling networks. Secreted PLA2s have also been found to specifically bind to a variety of soluble and membrane proteins in mammals, including receptors. As a toxin, PLA2 is a potent presynaptic neurotoxin which blocks nerve terminals by binding to the nerve membrane and hydrolyzing stable membrane lipids. The products of the hydrolysis (LysoPL and FA) cannot form bilayers leading to a change in membrane conformation and ultimately to a block in the release of neurotransmitters. PLA2 may form dimers or oligomers." Q#12243 - CGI_10022233 superfamily 246597 84 337 1.26E-123 370.017 cl13995 MPP_superfamily superfamily - - "metallophosphatase superfamily, metallophosphatase domain; Metallophosphatases (MPPs), also known as metallophosphoesterases, phosphodiesterases (PDEs), binuclear metallophosphoesterases, and dimetal-containing phosphoesterases (DMPs), represent a diverse superfamily of enzymes with a conserved domain containing an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. This superfamily includes: the phosphoprotein phosphatases (PPPs), Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. 
This domain is thought to allow for productive metal coordination." Q#12243 - CGI_10022233 superfamily 217260 364 512 7.48E-45 156.646 cl03752 5_nucleotid_C superfamily - - "5'-nucleotidase, C-terminal domain; 5'-nucleotidase, C-terminal domain. " Q#12244 - CGI_10022234 superfamily 242715 7 179 2.20E-81 242.808 cl01799 Ribosomal_L13e superfamily - - Ribosomal protein L13e; Ribosomal protein L13e. Q#12247 - CGI_10022237 superfamily 248469 183 273 2.12E-14 72.0175 cl17915 HAD_like superfamily - - "Haloacid dehalogenase-like hydrolases. The haloacid dehalogenase-like (HAD) superfamily includes L-2-haloacid dehalogenase, epoxide hydrolase, phosphoserine phosphatase, phosphomannomutase, phosphoglycolate phosphatase, P-type ATPase, and many others, all of which use a nucleophilic aspartate in their phosphoryl transfer reaction. All members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. Members of this superfamily are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases." Q#12247 - CGI_10022237 superfamily 248469 438 545 3.43E-14 71.2471 cl17915 HAD_like superfamily - - "Haloacid dehalogenase-like hydrolases. The haloacid dehalogenase-like (HAD) superfamily includes L-2-haloacid dehalogenase, epoxide hydrolase, phosphoserine phosphatase, phosphomannomutase, phosphoglycolate phosphatase, P-type ATPase, and many others, all of which use a nucleophilic aspartate in their phosphoryl transfer reaction. All members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. Members of this superfamily are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases." Q#12247 - CGI_10022237 superfamily 218493 935 1073 3.16E-44 157.904 cl08434 GMC_oxred_C superfamily - - GMC oxidoreductase; This domain found associated with pfam00732. 
Q#12248 - CGI_10022238 superfamily 221917 111 198 3.80E-17 74.1875 cl16089 CENP-L superfamily - - "Kinetochore complex Sim4 subunit Fta1; CENP-L is one of the components that assembles onto the CENP-A-nucleosome distal (CAD) centromere. The centromere, which is the basic element of chromosome inheritance, is epigenetically determined in mammals. CENP-A, the centromere-specific histone H3 variant, assembles an array of nucleosomes and it is this that seems to be the prime candidate for specifying centromere identity. CENP-A nucleosomes directly recruit a proximal CENP-A nucleosome associated complex (NAC) comprised of CENP-M, CENP-N and CENP-T, CENP-U(50), CENP-C and CENP-H. Assembly of the CENP-A NAC at centromeres is dependent on CENP-M, CENP-N and CENP-T. Additionally, there are seven other subunits which make up the CENP-A-nucleosome distal (CAD) centromere, CENP-K, CENP-L, CENP-O, CENP-P, CENP-Q, CENP-R and CENP-S, also assembling on the CENP-A NAC. Fta1 is the equivalent component of the fission yeast Sim4 complex. The centromere, which is the basic element of chromosome inheritance, is epigenetically determined in mammals." Q#12249 - CGI_10022239 superfamily 216144 221 277 3.23E-12 63.5769 cl02981 Cys_rich_FGFR superfamily - - Cysteine rich repeat; This cysteine rich repeat contains four cysteines. It is found in multiple copies in a protein that binds to fibroblast growth factors. The repeat is also found in MG160 and E-selectin ligand (ESL-1). Q#12249 - CGI_10022239 superfamily 216144 737 789 1.45E-08 53.1765 cl02981 Cys_rich_FGFR superfamily - - Cysteine rich repeat; This cysteine rich repeat contains four cysteines. It is found in multiple copies in a protein that binds to fibroblast growth factors. The repeat is also found in MG160 and E-selectin ligand (ESL-1). Q#12249 - CGI_10022239 superfamily 216144 999 1048 3.65E-08 52.0209 cl02981 Cys_rich_FGFR superfamily - - Cysteine rich repeat; This cysteine rich repeat contains four cysteines. 
It is found in multiple copies in a protein that binds to fibroblast growth factors. The repeat is also found in MG160 and E-selectin ligand (ESL-1). Q#12249 - CGI_10022239 superfamily 216144 852 915 4.66E-08 51.6357 cl02981 Cys_rich_FGFR superfamily - - Cysteine rich repeat; This cysteine rich repeat contains four cysteines. It is found in multiple copies in a protein that binds to fibroblast growth factors. The repeat is also found in MG160 and E-selectin ligand (ESL-1). Q#12249 - CGI_10022239 superfamily 216144 419 477 6.37E-08 51.2505 cl02981 Cys_rich_FGFR superfamily - - Cysteine rich repeat; This cysteine rich repeat contains four cysteines. It is found in multiple copies in a protein that binds to fibroblast growth factors. The repeat is also found in MG160 and E-selectin ligand (ESL-1). Q#12249 - CGI_10022239 superfamily 216144 96 154 4.75E-07 48.5541 cl02981 Cys_rich_FGFR superfamily - - Cysteine rich repeat; This cysteine rich repeat contains four cysteines. It is found in multiple copies in a protein that binds to fibroblast growth factors. The repeat is also found in MG160 and E-selectin ligand (ESL-1). Q#12249 - CGI_10022239 superfamily 216144 554 612 1.86E-05 43.9317 cl02981 Cys_rich_FGFR superfamily - - Cysteine rich repeat; This cysteine rich repeat contains four cysteines. It is found in multiple copies in a protein that binds to fibroblast growth factors. The repeat is also found in MG160 and E-selectin ligand (ESL-1). Q#12249 - CGI_10022239 superfamily 216144 677 716 0.000102437 41.6205 cl02981 Cys_rich_FGFR superfamily C - Cysteine rich repeat; This cysteine rich repeat contains four cysteines. It is found in multiple copies in a protein that binds to fibroblast growth factors. The repeat is also found in MG160 and E-selectin ligand (ESL-1). Q#12249 - CGI_10022239 superfamily 216144 349 399 0.000124507 41.2353 cl02981 Cys_rich_FGFR superfamily - - Cysteine rich repeat; This cysteine rich repeat contains four cysteines. 
It is found in multiple copies in a protein that binds to fibroblast growth factors. The repeat is also found in MG160 and E-selectin ligand (ESL-1). Q#12249 - CGI_10022239 superfamily 216144 481 547 0.000153557 41.2353 cl02981 Cys_rich_FGFR superfamily - - Cysteine rich repeat; This cysteine rich repeat contains four cysteines. It is found in multiple copies in a protein that binds to fibroblast growth factors. The repeat is also found in MG160 and E-selectin ligand (ESL-1). Q#12249 - CGI_10022239 superfamily 216144 795 848 0.00019617 40.8501 cl02981 Cys_rich_FGFR superfamily - - Cysteine rich repeat; This cysteine rich repeat contains four cysteines. It is found in multiple copies in a protein that binds to fibroblast growth factors. The repeat is also found in MG160 and E-selectin ligand (ESL-1). Q#12249 - CGI_10022239 superfamily 216144 614 674 0.000880486 38.9241 cl02981 Cys_rich_FGFR superfamily - - Cysteine rich repeat; This cysteine rich repeat contains four cysteines. It is found in multiple copies in a protein that binds to fibroblast growth factors. The repeat is also found in MG160 and E-selectin ligand (ESL-1). Q#12249 - CGI_10022239 superfamily 216144 282 345 0.00124697 38.5389 cl02981 Cys_rich_FGFR superfamily - - Cysteine rich repeat; This cysteine rich repeat contains four cysteines. It is found in multiple copies in a protein that binds to fibroblast growth factors. The repeat is also found in MG160 and E-selectin ligand (ESL-1). Q#12249 - CGI_10022239 superfamily 216144 918 991 0.00228726 37.3833 cl02981 Cys_rich_FGFR superfamily - - Cysteine rich repeat; This cysteine rich repeat contains four cysteines. It is found in multiple copies in a protein that binds to fibroblast growth factors. The repeat is also found in MG160 and E-selectin ligand (ESL-1). Q#12249 - CGI_10022239 superfamily 216144 157 219 0.00258465 37.3833 cl02981 Cys_rich_FGFR superfamily - - Cysteine rich repeat; This cysteine rich repeat contains four cysteines. 
It is found in multiple copies in a protein that binds to fibroblast growth factors. The repeat is also found in MG160 and E-selectin ligand (ESL-1). Q#12252 - CGI_10022242 superfamily 243263 263 502 9.13E-29 117.509 cl02990 ASC superfamily N - Amiloride-sensitive sodium channel; Amiloride-sensitive sodium channel. Q#12252 - CGI_10022242 superfamily 243263 2 203 1.58E-16 80.915 cl02990 ASC superfamily C - Amiloride-sensitive sodium channel; Amiloride-sensitive sodium channel. Q#12253 - CGI_10022243 superfamily 247727 54 150 6.46E-12 62.0622 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#12254 - CGI_10022244 superfamily 246940 101 299 2.93E-05 43.091 cl15377 Radical_SAM superfamily - - "Radical SAM superfamily. Enzymes of this family generate radicals by combining a 4Fe-4S cluster and S-adenosylmethionine (SAM) in close proximity. They are characterized by a conserved CxxxCxxC motif, which coordinates the conserved iron-sulfur cluster. Mechanistically, they share the transfer of a single electron from the iron-sulfur cluster to SAM, which leads to its reductive cleavage to methionine and a 5'-deoxyadenosyl radical, which, in turn, abstracts a hydrogen from the appropriately positioned carbon atom. Depending on the enzyme, SAM is consumed during this process or it is restored and reused. 
Radical SAM enzymes catalyze steps in metabolism, DNA repair, the biosynthesis of vitamins and coenzymes, and the biosynthesis of many antibiotics. Examples are biotin synthase (BioB), lipoyl synthase (LipA), pyruvate formate-lyase (PFL), coproporphyrinogen oxidase (HemN), lysine 2,3-aminomutase (LAM), anaerobic ribonucleotide reductase (ARR), and MoaA, an enzyme of the biosynthesis of molybdopterin." Q#12255 - CGI_10022245 superfamily 219969 722 1017 4.02E-33 130.225 cl07345 ASD2 superfamily - - Apx/Shroom domain ASD2; This region is found in the actin binding protein Shroom which mediates apical constriction in epithelial cells and is required for neural tube closure. Q#12256 - CGI_10022246 superfamily 241622 29 105 4.66E-13 59.8878 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95 (post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." 
Q#12257 - CGI_10022247 superfamily 241584 174 251 5.68E-10 55.1951 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#12257 - CGI_10022247 superfamily 241568 13 69 0.00100691 36.672 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#12259 - CGI_10022249 superfamily 241677 166 226 6.74E-28 105.417 cl00197 cyclophilin superfamily N - "cyclophilin: cyclophilin-type peptidylprolyl cis-trans isomerases. This family contains eukaryotic, bacterial and archaeal proteins which exhibit peptidylprolyl cis-trans isomerase activity (PPIase, Rotamase) and in addition bind the immunosuppressive drug cyclosporin (CsA). Immunosuppression in vertebrates is believed to be the result of the cyclophilin A-cyclosporin protein drug complex binding to and inhibiting the protein-phosphatase calcineurin. 
PPIase is an enzyme which accelerates protein folding by catalyzing the cis-trans isomerization of the peptide bonds preceding proline residues. Cyclophilins are a diverse family in terms of function and have been implicated in protein folding processes which depend on catalytic/chaperone-like activities. This group contains human cyclophilin 40, a co-chaperone of the hsp90 chaperone system; human cyclophilin A, a chaperone in the HIV-1 infectious process; and human cyclophilin H, a component of the U4/U6 snRNP, whose isomerization or chaperoning activities may play a role in RNA splicing." Q#12260 - CGI_10022250 superfamily 128469 570 665 8.57E-12 62.858 cl17971 VPS9 superfamily - - Domain present in VPS9; Domain present in yeast vacuolar sorting protein 9 and other proteins. Q#12261 - CGI_10022251 superfamily 247723 473 556 2.07E-41 147.75 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." 
Q#12261 - CGI_10022251 superfamily 243035 758 869 2.22E-06 47.2294 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#12261 - CGI_10022251 superfamily 207684 13 44 2.00E-05 43.1363 cl02640 SAP superfamily - - "SAP domain; The SAP (after SAF-A/B, Acinus and PIAS) motif is a putative DNA/RNA binding domain found in diverse nuclear and cytoplasmic proteins." Q#12262 - CGI_10022252 superfamily 241568 621 681 3.16E-08 52.8504 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." 
Q#12262 - CGI_10022252 superfamily 241568 821 883 1.07E-07 51.3096 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#12262 - CGI_10022252 superfamily 241568 562 616 2.18E-07 50.154 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#12262 - CGI_10022252 superfamily 241568 1707 1752 2.58E-07 50.154 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. 
The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#12262 - CGI_10022252 superfamily 241568 503 557 4.42E-07 49.3836 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#12262 - CGI_10022252 superfamily 241568 381 435 1.34E-06 47.8428 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." 
Q#12262 - CGI_10022252 superfamily 241568 745 816 2.35E-06 47.0724 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#12262 - CGI_10022252 superfamily 241568 686 740 2.84E-06 47.0724 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#12262 - CGI_10022252 superfamily 241568 888 943 5.36E-06 46.302 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. 
The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#12262 - CGI_10022252 superfamily 241568 1479 1533 5.46E-06 46.302 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#12262 - CGI_10022252 superfamily 241568 322 376 1.04E-05 45.5316 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." 
Q#12262 - CGI_10022252 superfamily 241568 460 498 2.36E-05 44.376 cl00043 CCP superfamily N - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#12262 - CGI_10022252 superfamily 241568 1757 1811 3.73E-05 43.6056 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#12263 - CGI_10022253 superfamily 241754 4 365 0 546.91 cl00286 Motor_domain superfamily - - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. Q#12263 - CGI_10022253 superfamily 241581 461 564 2.07E-10 59.3222 cl00062 FHA superfamily - - "Forkhead associated domain (FHA); found in eukaryotic and prokaryotic proteins. Putative nuclear signalling domain. FHA domains may bind phosphothreonine, phosphoserine and sometimes phosphotyrosine. 
In eukaryotes, many FHA domain-containing proteins localize to the nucleus, where they participate in establishing or maintaining cell cycle checkpoints, DNA repair, or transcriptional regulation. Members of the FHA family include: Dun1, Rad53, Cds1, Mek1, KAPP(kinase-associated protein phosphatase),and Ki-67 (a human nuclear protein related to cell proliferation)." Q#12263 - CGI_10022253 superfamily 246669 810 846 0.00488812 36.6611 cl14603 C2 superfamily NC - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#12263 - CGI_10022253 superfamily 221571 677 718 1.14E-07 50.1939 cl13810 KIF1B superfamily - - "Kinesin protein 1B; This domain family is found in eukaryotes, and is approximately 50 amino acids in length. The family is found in association with pfam00225, pfam00498. KIF1B is an anterograde motor for transport of mitochondria in axons of neuronal cells." Q#12264 - CGI_10022254 superfamily 241754 4 47 5.00E-13 61.9439 cl00286 Motor_domain superfamily C - Myosin and Kinesin motor domain. 
These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. Q#12265 - CGI_10022255 superfamily 241550 38 231 1.37E-72 241.383 cl00015 nt_trans superfamily C - "nucleotidyl transferase superfamily; nt_trans (nucleotidyl transferase) This superfamily includes the class I amino-acyl tRNA synthetases, pantothenate synthetase (PanC), ATP sulfurylase, and the cytidylyltransferases, all of which have a conserved dinucleotide-binding domain." Q#12265 - CGI_10022255 superfamily 241550 388 550 2.10E-48 173.588 cl00015 nt_trans superfamily N - "nucleotidyl transferase superfamily; nt_trans (nucleotidyl transferase) This superfamily includes the class I amino-acyl tRNA synthetases, pantothenate synthetase (PanC), ATP sulfurylase, and the cytidylyltransferases, all of which have a conserved dinucleotide-binding domain." Q#12265 - CGI_10022255 superfamily 245839 550 671 3.78E-16 75.7193 cl12020 Anticodon_Ia_like superfamily - - "Anticodon-binding domain of class Ia aminoacyl tRNA synthetases and similar domains; This domain is found in a variety of class Ia aminoacyl tRNA synthetases, C-terminal to the catalytic core domain. It recognizes and specifically binds to the anticodon of the tRNA. Aminoacyl tRNA synthetases catalyze the transfer of cognate amino acids to the 3'-end of their tRNAs by specifically recognizing cognate from non-cognate amino acids. Members include valyl-, leucyl-, isoleucyl-, cysteinyl-, arginyl-, and methionyl-tRNA synthethases. This superfamily also includes a domain from MshC, an enzyme in the mycothiol biosynthetic pathway." 
Q#12265 - CGI_10022255 superfamily 241550 317 332 0.00686039 37.7833 cl00015 nt_trans superfamily NC - "nucleotidyl transferase superfamily; nt_trans (nucleotidyl transferase) This superfamily includes the class I amino-acyl tRNA synthetases, pantothenate synthetase (PanC), ATP sulfurylase, and the cytidylyltransferases, all of which have a conserved dinucleotide-binding domain." Q#12266 - CGI_10022256 superfamily 241983 22 329 3.28E-54 182.172 cl00614 ADP_ribosyl_GH superfamily - - "ADP-ribosylglycohydrolase; This family includes enzymes that ADP-ribosylations, for example ADP-ribosylarginine hydrolase EC:3.2.2.19 cleaves ADP-ribose-L-arginine. The family also includes dinitrogenase reductase activating glycohydrolase. Most surprisingly the family also includes jellyfish crystallins, these proteins appear to have lost the presumed active site residues." Q#12267 - CGI_10022257 superfamily 245839 17 52 6.05E-05 38.3549 cl12020 Anticodon_Ia_like superfamily N - "Anticodon-binding domain of class Ia aminoacyl tRNA synthetases and similar domains; This domain is found in a variety of class Ia aminoacyl tRNA synthetases, C-terminal to the catalytic core domain. It recognizes and specifically binds to the anticodon of the tRNA. Aminoacyl tRNA synthetases catalyze the transfer of cognate amino acids to the 3'-end of their tRNAs by specifically recognizing cognate from non-cognate amino acids. Members include valyl-, leucyl-, isoleucyl-, cysteinyl-, arginyl-, and methionyl-tRNA synthethases. This superfamily also includes a domain from MshC, an enzyme in the mycothiol biosynthetic pathway." Q#12268 - CGI_10022258 superfamily 241983 1 246 2.93E-36 132.096 cl00614 ADP_ribosyl_GH superfamily - - "ADP-ribosylglycohydrolase; This family includes enzymes that ADP-ribosylations, for example ADP-ribosylarginine hydrolase EC:3.2.2.19 cleaves ADP-ribose-L-arginine. The family also includes dinitrogenase reductase activating glycohydrolase. 
Most surprisingly the family also includes jellyfish crystallins, these proteins appear to have lost the presumed active site residues." Q#12269 - CGI_10022259 superfamily 242043 28 66 2.46E-11 55.8112 cl00713 Auto_anti-p27 superfamily - - Sjogren's syndrome/scleroderma autoantigen 1 (Autoantigen p27); This family consists of several Sjogren's syndrome/scleroderma autoantigen 1 (Autoantigen p27) sequences. It is thought that the potential association of anti-p27 with anti-centromere antibodies suggests that autoantigen p27 might play a role in mitosis. Q#12270 - CGI_10022260 superfamily 247692 797 1291 1.52E-137 436.49 cl17068 AFD_class_I superfamily - - "Adenylate forming domain, Class I; This family includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain." Q#12270 - CGI_10022260 superfamily 245206 1447 1738 5.35E-70 239.09 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. 
Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#12270 - CGI_10022260 superfamily 241838 218 382 5.74E-60 205.57 cl00395 FMT_core superfamily - - "Formyltransferase, catalytic core domain; Formyltransferase, catalytic core domain. The proteins of this superfamily contain a formyltransferase domain that hydrolyzes the removal of a formyl group from its substrate as part of a multistep transfer mechanism, and this alignment model represents the catalytic core of the formyltransferase domain. This family includes the following known members; Glycinamide Ribonucleotide Transformylase (GART), Formyl-FH4 Hydrolase, Methionyl-tRNA Formyltransferase, ArnA, and 10-Formyltetrahydrofolate Dehydrogenase (FDH). Glycinamide Ribonucleotide Transformylase (GART) catalyzes the third step in de novo purine biosynthesis, the transfer of a formyl group to 5'-phosphoribosylglycinamide. Formyl-FH4 Hydrolase catalyzes the hydrolysis of 10-formyltetrahydrofolate (formyl-FH4) to FH4 and formate. Methionyl-tRNA Formyltransferase transfers a formyl group onto the amino terminus of the acyl moiety of the methionyl aminoacyl-tRNA, which plays important role in translation initiation. ArnA is required for the modification of lipid A with 4-amino-4-deoxy-l-arabinose (Ara4N) that leads to resistance to cationic antimicrobial peptides (CAMPs) and clinical antimicrobials such as polymyxin. 10-formyltetrahydrofolate dehydrogenase (FDH) catalyzes the conversion of 10-formyltetrahydrofolate, a precursor for nucleotide biosynthesis, to tetrahydrofolate. Members of this family are multidomain proteins. The formyltransferase domain is located at the N-terminus of FDH, Methionyl-tRNA Formyltransferase and ArnA, and at the C-terminus of Formyl-FH4 Hydrolase. 
Prokaryotic Glycinamide Ribonucleotide Transformylase (GART) is a single domain protein while eukaryotic GART is a trifunctional protein that catalyzes the second, third and fifth steps in de novo purine biosynthesis." Q#12270 - CGI_10022260 superfamily 246712 410 500 1.22E-09 57.6274 cl14785 FMT_C_like superfamily - - "Carboxy-terminal domain of Formyltransferase and similar domains; This family represents the C-terminal domain of formyltransferase and similar proteins. This domain is found in a variety of enzymes with formyl transferase and alkyladenine DNA glycosylase activities. The proteins with formyltransferase function include methionyl-tRNA formyltransferase, ArnA, 10-formyltetrahydrofolate dehydrogenase and HypX proteins. Although most proteins with formyl transferase activity contain this C-terminal domain, prokaryotic glycinamide ribonucleotide transformylase (GART), a single domain protein, only contains the core catalytic domain. Thus, the C-terminal domain is not required for formyl transferase catalytic activity and may be involved in substrate binding. Some members of this family have shown nucleic acid binding capacity. The C-terminal domain of methionyl-tRNA formyltransferase is involved in tRNA binding. Alkyladenine DNA glycosylase is a distant member of this family with very low sequence similarity to other members. It catalyzes the first step in base excision repair (BER) by cleaving damaged DNA bases within double-stranded DNA to produce an abasic site and shows ability to bind to DNA." Q#12270 - CGI_10022260 superfamily 245209 1315 1380 3.04E-08 52.9422 cl09936 PP-binding superfamily - - Phosphopantetheine attachment site; A 4'-phosphopantetheine prosthetic group is attached through a serine. This prosthetic group acts as a a 'swinging arm' for the attachment of activated fatty acid and amino-acid groups. This domain forms a four helix bundle. This family includes members not included in Prosite. 
The inclusion of these members is supported by sequence analysis and functional evidence. The related domain of Vibrio anguillarum angR has the attachment serine replaced by an alanine. Q#12271 - CGI_10022261 superfamily 241600 214 425 2.76E-94 285.288 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#12271 - CGI_10022261 superfamily 243092 4 156 1.31E-16 78.5308 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed 
ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#12272 - CGI_10022262 superfamily 242406 1 70 2.79E-07 46.0453 cl01271 DUF1768 superfamily N - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. Q#12274 - CGI_10022264 superfamily 245819 1155 1332 2.26E-65 220.914 cl11967 Nucleotidyl_cyc_III superfamily - - "Class III nucleotidyl cyclases; Class III nucleotidyl cyclases are the largest, most diverse group of nucleotidyl cyclases (NC's) containing prokaryotic and eukaryotic proteins. They can be divided into two major groups; the mononucleotidyl cyclases (MNC's) and the diguanylate cyclases (DGC's). The MNC's, which include the adenylate cyclases (AC's) and the guanylate cyclases (GC's), have a conserved cyclase homology domain (CHD), while the DGC's have a conserved GGDEF domain, named after a conserved motif within this subgroup. Their products, cyclic guanylyl and adenylyl nucleotides, are second messengers that play important roles in eukaryotic signal transduction and prokaryotic sensory pathways." Q#12274 - CGI_10022264 superfamily 245201 837 1082 5.02E-29 117.723 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." 
Q#12275 - CGI_10005778 superfamily 218200 81 309 2.08E-81 256.525 cl04660 Glyco_transf_54 superfamily - - "N-Acetylglucosaminyltransferase-IV (GnT-IV) conserved region; The complex-type of oligosaccharides are synthesised through elongation by glycosyltransferases after trimming of the precursor oligosaccharides transferred to proteins in the endoplasmic reticulum. N-Acetylglucosaminyltransferases (GnTs) take part in the formation of branches in the biosynthesis of complex-type sugar chains. In vertebrates, six GnTs, designated as GnT-I to -VI, which catalyze the transfer of GlcNAc to the core mannose residues of Asn-linked sugar chains, have been identified. GnT-IV (EC:2.4.1.145) catalyzes the transfer of GlcNAc from UDP-GlcNAc to the GlcNAc1-2Man1-3 arm of core oligosaccharide [Gn2(22)core oligosaccharide] and forms GlcNAc1-4(GlcNAc1-2)Man1-3 structure on the core oligosaccharide (Gn3(2,4,2)core oligosaccharide). In some members the conserved region occupies all but the very for N-terminal, where there is a signal sequence on all members. For other members the conserved region does not occupy the entire protein but is still to the N-terminus of the protein." Q#12276 - CGI_10005779 superfamily 202000 29 131 1.64E-09 55.5577 cl03375 XRCC1_N superfamily - - XRCC1 N terminal domain; XRCC1 N terminal domain. Q#12278 - CGI_10005781 superfamily 243035 1 67 6.86E-18 72.6525 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. 
Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. 
Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#12280 - CGI_10007543 superfamily 220691 34 157 0.000640171 39.1382 cl18569 7TM_GPCR_Srv superfamily NC - Serpentine type 7TM GPCR chemoreceptor Srv; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srv is a member of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#12282 - CGI_10007545 superfamily 245847 5 168 1.19E-05 42.1586 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#12283 - CGI_10007546 superfamily 118308 1 50 2.38E-10 50.9082 cl10755 Mitoc_L55 superfamily N - Mitochondrial ribosomal protein L55; Members of this family are involved in mitochondrial biogenesis and G2/M phase cell cycle progression. They form a component of the mitochondrial ribosome large subunit (39S) which comprises a 16S rRNA and about 50 distinct proteins. Q#12285 - CGI_10007548 superfamily 248100 14 70 2.79E-11 57.164 cl17546 PQ-loop superfamily - - "PQ loop repeat; Members of this family are all membrane bound proteins possessing a pair of repeats each spanning two transmembrane helices connected by a loop. The PQ motif found on loop 2 is critical for the localisation of cystinosin to lysosomes. However, the PQ motif appears not to be a general lysosome-targeting motif. It is thought likely to possess a more general function. Most probably this involves a glutamine residue." 
Q#12285 - CGI_10007548 superfamily 248100 155 212 5.62E-10 53.6972 cl17546 PQ-loop superfamily - - "PQ loop repeat; Members of this family are all membrane bound proteins possessing a pair of repeats each spanning two transmembrane helices connected by a loop. The PQ motif found on loop 2 is critical for the localisation of cystinosin to lysosomes. However, the PQ motif appears not to be a general lysosome-targeting motif. It is thought likely to possess a more general function. Most probably this involves a glutamine residue." Q#12286 - CGI_10007549 superfamily 241748 110 154 9.17E-09 51.8224 cl00279 APP_MetAP superfamily C - "A family including aminopeptidase P, aminopeptidase M, and prolidase. Also known as metallopeptidase family M24. This family of enzymes is able to cleave amido-, imido- and amidino-containing bonds. Members exhibit relatively narrow substrate specificity compared to other metallo-aminopeptidases, suggesting they play roles in regulation of biological processes rather than general protein degradation." Q#12287 - CGI_10021433 superfamily 248241 1 309 1.05E-93 287.617 cl17687 5_nucleotid superfamily N - "5' nucleotidase family; This family of eukaryotic proteins includes 5' nucleotidase enzymes, such as purine 5'-nucleotidase EC:3.1.3.5." Q#12288 - CGI_10021434 superfamily 241563 60 102 4.46E-06 44.3924 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#12289 - CGI_10021435 superfamily 241780 263 673 0 636.105 cl00319 Gn_AT_II superfamily - - "Glutamine amidotransferases class-II (GATase). The glutaminase domain catalyzes an amide nitrogen transfer from glutamine to the appropriate substrate. In this process, glutamine is hydrolyzed to glutamic acid and ammonia. 
This domain is related to members of the Ntn (N-terminal nucleophile) hydrolase superfamily and is found at the N-terminus of enzymes such as glucosamine-fructose 6-phosphate synthase (GLMS or GFAT), glutamine phosphoribosylpyrophosphate (Prpp) amidotransferase (GPATase), asparagine synthetase B (AsnB), beta lactam synthetase (beta-LS) and glutamate synthase (GltS). GLMS catalyzes the formation of glucosamine 6-phosphate from fructose 6-phosphate and glutamine in amino sugar synthesis. GPATase catalyzes the first step in purine biosynthesis, an amide transfer from glutamine to PRPP, resulting in phosphoribosylamine, pyrophosphate and glutamate. Asparagine synthetase B synthesizes asparagine from aspartate and glutamine. Beta-LS catalyzes the formation of the beta-lactam ring in the beta-lactamase inhibitor clavulanic acid. GltS synthesizes L-glutamate from 2-oxoglutarate and L-glutamine. These enzymes are generally dimers, but GPATase also exists as a homotetramer." Q#12289 - CGI_10021435 superfamily 247740 1067 1437 8.91E-156 487.819 cl17186 TIM_phosphate_binding superfamily - - "TIM barrel proteins share a structurally conserved phosphate binding motif and in general share an eight beta/alpha closed barrel structure. Specific for this family is the conserved phosphate binding site at the edges of strands 7 and 8. The phosphate comes either from the substrate, as in the case of inosine monophosphate dehydrogenase (IMPDH), or from ribulose-5-phosphate 3-epimerase (RPE) or from cofactors, like FMN." Q#12289 - CGI_10021435 superfamily 218318 717 1004 5.88E-150 466.637 cl04830 Glu_syn_central superfamily - - Glutamate synthase central domain; The central domain of glutamate synthase connects the amino terminal amidotransferase domain with the FMN-binding domain and has an alpha / beta overall topology. This domain appears to be a rudimentary form of the FMN-binding TIM barrel according to SCOP. 
Q#12289 - CGI_10021435 superfamily 241716 1494 1685 3.11E-93 304.835 cl00239 GXGXG superfamily - - "GXGXG domain. This domain of unknown function is found at the C-terminus of the large subunit (gltB) of glutamate synthase (GltS), in subunit C of tungsten formylmethanofuran dehydrogenase (FwdC) and in subunit C of molybdenum formylmethanofuran dehydrogenase (FmdC). It is also found in a primarily archaeal group of proteins predicted to encode part of the large subunit of GltS. It is characterized by a repeated GXXGXXXG motif. GltS is a complex iron-sulfur flavoprotein that catalyzes the synthesis of L-glutamate from L-glutamine and 2-oxoglutarate. It requires the transfer of ammonia and electrons among three distinct active centers that carry out L-Gln hydrolysis, conversion of 2-oxoglutarate into L-Glu, and electron uptake from a donor. These catalytic sites occur in other domains within the protein or are encoded by separate genes, and are not present in the domain in this CD. FwdC and FmdC are reversible ion pumps that catalyze the formylation and deformylation of methanofuran in hyperthermophiles and bacteria. They require the presence of either tungsten (FwdC) or molybdenum (FmdC). The specific function of this domain also remains unidentified in the formylmethanofuran dehydrogenases." Q#12289 - CGI_10021435 superfamily 248054 1860 1895 2.24E-09 56.3264 cl17500 NAD_binding_8 superfamily C - NAD(P)-binding Rossmann-like domain; NAD(P)-binding Rossmann-like domain. Q#12291 - CGI_10021437 superfamily 242385 179 476 0 537.406 cl01244 arom_aa_hydroxylase superfamily - - "Biopterin-dependent aromatic amino acid hydroxylase; a family of non-heme, iron(II)-dependent enzymes that includes prokaryotic and eukaryotic phenylalanine-4-hydroxylase (PheOH), eukaryotic tyrosine hydroxylase (TyrOH) and eukaryotic tryptophan hydroxylase (TrpOH). 
PheOH converts L-phenylalanine to L-tyrosine, an important step in phenylalanine catabolism and neurotransmitter biosynthesis, and is linked to a severe variant of phenylketonuria in humans. TyrOH and TrpOH are involved in the biosynthesis of catecholamine and serotonin, respectively. The eukaryotic enzymes are all homotetramers." Q#12291 - CGI_10021437 superfamily 245020 38 166 7.43E-13 65.1116 cl09141 ACT superfamily - - "ACT domains are commonly involved in specifically binding an amino acid or other small ligand leading to regulation of the enzyme; Members of this CD belong to the superfamily of ACT regulatory domains. Pairs of ACT domains are commonly involved in specifically binding an amino acid or other small ligand leading to regulation of the enzyme. The ACT domain has been detected in a number of diverse proteins; some of these proteins are involved in amino acid and purine biosynthesis, phenylalanine hydroxylation, regulation of bacterial metabolism and transcription, and many remain to be characterized. ACT domain-containing enzymes involved in amino acid and purine synthesis are in many cases allosteric enzymes with complex regulation enforced by the binding of ligands. The ACT domain is commonly involved in the binding of a small regulatory molecule, such as the amino acids L-Ser and L-Phe in the case of D-3-phosphoglycerate dehydrogenase and the bifunctional chorismate mutase-prephenate dehydratase enzyme (P-protein), respectively. Aspartokinases typically consist of two C-terminal ACT domains in a tandem repeat, but the second ACT domain is inserted within the first, resulting in, what is normally the terminal beta strand of ACT2, formed from a region N-terminal of ACT1. ACT domain repeats have been shown to have nonequivalent ligand-binding sites with complex regulatory patterns such as those seen in the bifunctional enzyme, aspartokinase-homoserine dehydrogenase (ThrA). 
In other enzymes, such as phenylalanine hydroxylases, the ACT domain appears to function as a flexible small module providing allosteric regulation via transmission of conformational changes, these conformational changes are not necessarily initiated by regulatory ligand binding at the ACT domain itself. ACT domains are present either singularly, N- or C-terminal, or in pairs present C-terminal or between two catalytic domains. Unique to cyanobacteria are four ACT domains C-terminal to an aspartokinase domain. A few proteins are composed almost entirely of ACT domain repeats as seen in the four ACT domain protein, the ACR protein, found in higher plants; and the two ACT domain protein, the glycine cleavage system transcriptional repressor (GcvR) protein, found in some bacteria. Also seen are single ACT domain proteins similar to the Streptococcus pneumoniae ACT domain protein (uncharacterized pdb structure 1ZPV) found in both bacteria and archaea. Purportedly, the ACT domain is an evolutionarily mobile ligand binding regulatory module that has been fused to different enzymes at various times." Q#12292 - CGI_10021438 superfamily 149674 153 190 0.000199077 40.3929 cl07350 SKG6 superfamily - - Transmembrane alpha-helix domain; SKG6/Axl2 are membrane proteins that show polarised intracellular localisation. SKG6_Tmem is the highly conserved transmembrane alpha-helical domain of SKG6 and Axl2 proteins. The full-length fungal protein has a negative regulatory function in cytokinesis. Q#12293 - CGI_10021439 superfamily 241564 410 477 5.87E-28 106.966 cl00035 BIR superfamily - - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. 
This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." Q#12293 - CGI_10021439 superfamily 241564 211 270 1.46E-15 72.2983 cl00035 BIR superfamily - - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." Q#12293 - CGI_10021439 superfamily 247792 524 562 5.07E-05 41.2772 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#12294 - CGI_10021440 superfamily 190261 169 220 0.00890361 34.4466 cl03504 RFX_DNA_binding superfamily C - RFX DNA-binding domain; RFX is a regulatory factor which binds to the X box of MHC class II genes and is essential for their expression. The DNA-binding domain of RFX is the central domain of the protein and binds ssDNA as either a monomer or homodimer. 
Q#12296 - CGI_10021442 superfamily 241563 71 109 4.31E-06 44.3924 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#12296 - CGI_10021442 superfamily 241563 28 59 0.00479427 35.5328 cl00034 BBOX superfamily N - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#12298 - CGI_10021444 superfamily 217064 7 119 2.14E-20 84.4684 cl03617 CLN3 superfamily N - CLN3 protein; This is a family of proteins from the CLN3 gene. A missense mutation of glutamic acid (E) to lysine (K) at position 295 in the human protein has been implicated in Juvenile neuronal ceroid lipofuscinosis (Batten disease). Q#12299 - CGI_10021445 superfamily 241563 68 109 4.71E-06 44.3924 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#12299 - CGI_10021445 superfamily 241563 28 59 0.000790062 37.844 cl00034 BBOX superfamily N - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#12299 - CGI_10021445 superfamily 246954 122 207 0.00290547 38.9386 cl15415 Sec1 superfamily NC - Sec1 family; Sec1 family. 
Q#12303 - CGI_10021449 superfamily 248097 147 274 2.37E-20 84.239 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#12304 - CGI_10021450 superfamily 245213 145 183 5.93E-08 49.1722 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12304 - CGI_10021450 superfamily 245213 109 143 8.53E-07 46.0906 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12304 - CGI_10021450 superfamily 222049 10 94 1.36E-06 46.1779 cl16239 Mucin2_WxxW superfamily - - "Mucin-2 protein WxxW repeating region; This family is repeating region found on mucins 2 and 5. The function is not known, but the repeat can be present in up to 32 copies, as in a member from Branchiostoma floridae. The region carries a highly conserved WxxW sequence motif and also has at least six well conserved cysteine residues." 
Q#12307 - CGI_10021453 superfamily 242274 6 70 1.69E-06 42.2226 cl01053 SGNH_hydrolase superfamily C - "SGNH_hydrolase, or GDSL_hydrolase, is a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the typical Ser-His-Asp(Glu) triad from other serine hydrolases, but may lack the carboxylic acid." Q#12309 - CGI_10021456 superfamily 241574 551 700 1.08E-56 193.571 cl00053 PTPc superfamily N - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#12309 - CGI_10021456 superfamily 243035 185 298 4.22E-06 45.6886 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. 
For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#12309 - CGI_10021456 superfamily 245226 112 148 0.00127752 38.8209 cl10012 DnaQ_like_exo superfamily C - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#12310 - CGI_10021457 superfamily 216363 32 97 8.24E-12 56.7098 cl08312 UPF0029 superfamily C - Uncharacterized protein family UPF0029; Uncharacterized protein family UPF0029. Q#12312 - CGI_10004303 superfamily 243072 76 200 4.45E-25 98.995 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#12312 - CGI_10004303 superfamily 243072 179 312 1.11E-18 81.661 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#12312 - CGI_10004303 superfamily 243072 8 129 4.37E-18 80.1202 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#12312 - CGI_10004303 superfamily 243073 395 428 0.00694432 34.3935 cl02533 SOCS superfamily - - "SOCS (suppressors of cytokine signaling) box. The SOCS box is found in the C-terminal region of CIS/SOCS family proteins (in combination with a SH2 domain), ASBs (ankyrin repeat-containing proteins with a SOCS box), SSBs (SPRY domain-containing proteins with a SOCS box), and WSBs (WD40 repeat-containing proteins with a SOCS box), as well as, other miscellaneous proteins. The function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. 
Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions." Q#12317 - CGI_10019563 superfamily 243074 620 660 3.60E-11 60.2129 cl02535 F-box-like superfamily - - F-box-like; This is an F-box-like family. Q#12318 - CGI_10019564 superfamily 241995 7 347 2.08E-111 328.081 cl00635 Ntn_Asparaginase_2_like superfamily - - "Ntn-hydrolase superfamily, L-Asparaginase type 2-like enzymes. This family includes Glycosylasparaginase, Taspase 1 and L-Asparaginase type 2 enzymes. Glycosylasparaginase catalyzes the hydrolysis of the glycosylamide bond of asparagine-linked glycoprotein. Taspase1 catalyzes the cleavage of the Mix Lineage Leukemia (MLL) nuclear protein and transcription factor TFIIA. L-Asparaginase type 2 hydrolyzes L-asparagine to L-aspartate and ammonia. The proenzymes of this family undergo autoproteolytic cleavage before a threonine to generate alpha and beta subunits. The threonine becomes the N-terminal residue of the beta subunit and is the catalytic residue." Q#12319 - CGI_10019565 superfamily 201806 53 285 5.17E-21 94.8581 cl18220 Peptidase_M8 superfamily NC - Leishmanolysin; Leishmanolysin. Q#12321 - CGI_10019567 superfamily 215754 175 266 7.83E-26 99.6352 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#12321 - CGI_10019567 superfamily 215754 270 356 9.91E-22 88.0792 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#12321 - CGI_10019567 superfamily 215754 69 162 2.21E-21 86.9236 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#12322 - CGI_10019568 superfamily 243085 158 209 1.65E-13 65.0758 cl02557 DM superfamily - - "DM DNA binding domain; The DM domain is named after dsx and mab-3. dsx contains a single amino-terminal DM domain, whereas mab-3 contains two amino-terminal domains. 
The DM domain has a pattern of conserved zinc chelating residues C2H2C4. The dsx DM domain has been shown to dimerise and bind palindromic DNA." Q#12324 - CGI_10019570 superfamily 247724 1 157 1.68E-80 238.983 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#12326 - CGI_10019572 superfamily 243072 93 241 3.49E-22 92.8318 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#12326 - CGI_10019572 superfamily 243072 19 143 3.40E-14 70.105 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#12326 - CGI_10019572 superfamily 243072 482 592 1.80E-06 46.6078 cl02529 ANK superfamily N - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#12327 - CGI_10019573 superfamily 215648 362 582 7.50E-21 91.8883 cl02802 7tm_3 superfamily - - "7 transmembrane sweet-taste receptor of 3 GCPR; This is a domain of seven transmembrane regions that forms the C-terminus of some subclass 3 G-coupled-protein receptors. It is often associated with a downstream cysteine-rich linker domain, NCD3G pfam07562, which is the human sweet-taste receptor, and the N-terminal domain, ANF_receptor pfam01094. The seven TM regions assemble in such a way as to produce a docking pocket into which such molecules as cyclamate and lactisole have been found to bind and consequently confer the taste of sweetness." 
Q#12327 - CGI_10019573 superfamily 217211 179 225 0.00608493 35.3378 cl03691 Cache_1 superfamily C - Cache domain; Cache domain. Q#12328 - CGI_10019574 superfamily 245213 329 365 5.21E-09 53.4094 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12328 - CGI_10019574 superfamily 245213 240 268 5.31E-05 41.4682 cl09941 EGF_CA superfamily N - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12329 - CGI_10019575 superfamily 222269 118 234 1.21E-06 48.4738 cl18657 Cupin_8 superfamily N - Cupin-like domain; This cupin like domain shares similarity to the JmjC domain. Q#12330 - CGI_10019576 superfamily 242406 69 219 9.51E-51 164.302 cl01271 DUF1768 superfamily - - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. 
Q#12331 - CGI_10019577 superfamily 243072 788 913 5.01E-36 133.663 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#12331 - CGI_10019577 superfamily 243072 710 847 1.41E-27 109.395 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#12331 - CGI_10019577 superfamily 247743 225 379 0.00061202 39.8756 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#12332 - CGI_10019578 superfamily 216901 1638 1812 1.59E-62 213.987 cl03466 Rap_GAP superfamily - - Rap/ran-GAP; Rap/ran-GAP. 
Q#12333 - CGI_10019579 superfamily 242443 78 480 5.55E-160 462.761 cl01342 Peptidase_A22B superfamily - - "Signal peptide peptidase; The members of this family are membrane proteins. In some proteins this region is found associated with pfam02225. This family corresponds with Merops subfamily A22B, the type example of which is signal peptide peptidase. There is a sequence-similarity relationship with pfam01080." Q#12334 - CGI_10019581 superfamily 241749 42 175 1.75E-10 55.0845 cl00280 globin_like superfamily - - superfamily containing globins and truncated hemoglobins Q#12335 - CGI_10019582 superfamily 247724 18 200 5.94E-63 196.226 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." 
Q#12336 - CGI_10019583 superfamily 206009 341 387 6.35E-17 76.4378 cl16430 Clathrin_H_link superfamily C - "Clathrin-H-link; This short domain is found on clathrins, and often appears on proteins directly downstream from the Clathrin-link domain pfam09268." Q#12337 - CGI_10019584 superfamily 247743 170 336 5.25E-19 82.9643 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#12338 - CGI_10019585 superfamily 241578 406 571 1.50E-40 149.686 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. 
Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#12338 - CGI_10019585 superfamily 241578 731 875 9.16E-29 115.467 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#12338 - CGI_10019585 superfamily 241578 594 721 7.25E-28 112.77 cl00057 vWFA superfamily C - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#12338 - CGI_10019585 superfamily 241578 27 190 3.15E-42 154.752 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#12338 - CGI_10019585 superfamily 241578 905 1078 7.05E-36 136.749 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. 
The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#12338 - CGI_10019585 superfamily 241578 218 387 1.08E-32 127.504 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#12338 - CGI_10019585 superfamily 248012 1428 1504 6.68E-11 61.4397 cl17458 TIR_2 superfamily C - TIR domain; This is a family of bacterial Toll-like receptors. 
Q#12338 - CGI_10019585 superfamily 248012 1271 1370 2.48E-09 57.3332 cl17458 TIR_2 superfamily C - TIR domain; This is a family of bacterial Toll-like receptors. Q#12339 - CGI_10019586 superfamily 245202 25 102 3.87E-43 139.872 cl09927 S1_like superfamily - - "S1_like: Ribosomal protein S1-like RNA-binding domain. Found in a wide variety of RNA-associated proteins. Originally identified in S1 ribosomal protein. This superfamily also contains the Cold Shock Domain (CSD), which is a homolog of the S1 domain. Both domains are members of the Oligonucleotide/oligosaccharide Binding (OB) fold." Q#12340 - CGI_10019587 superfamily 247684 19 188 0.000455431 39.8796 cl17037 NBD_sugar-kinase_HSP70_actin superfamily C - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. 
The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#12341 - CGI_10019588 superfamily 245201 1189 1403 1.36E-33 131.59 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#12341 - CGI_10019588 superfamily 241739 1473 1788 5.75E-27 113.081 cl00268 class_II_aaRS-like_core superfamily - - "Class II tRNA amino-acyl synthetase-like catalytic core domain. Class II amino acyl-tRNA synthetases (aaRS) share a common fold and generally attach an amino acid to the 3' OH of ribose of the appropriate tRNA. PheRS is an exception in that it attaches the amino acid at the 2'-OH group, like class I aaRSs. These enzymes are usually homodimers. This domain is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. The substrate specificity of this reaction is further determined by additional domains. Interestingly, this domain is also found in asparagine synthase A (AsnA), in the accessory subunit of mitochondrial polymerase gamma and in the bacterial ATP phosphoribosyltransferase regulatory subunit HisZ." Q#12341 - CGI_10019588 superfamily 216167 8 173 2.26E-34 131.94 cl02999 DNA_photolyase superfamily - - DNA photolyase; This domain binds a light harvesting cofactor. 
Q#12341 - CGI_10019588 superfamily 243141 521 639 1.92E-21 92.7646 cl02687 RWD superfamily - - "RWD domain; This domain was identified in WD40 repeat proteins and Ring finger domain proteins. The function of this domain is unknown. GCN2 is the alpha-subunit of the only translation initiation factor (eIF2 alpha) kinase that appears in all eukaryotes. Its function requires an interaction with GCN1 via the domain at its N-terminus, which is termed the RWD domain after three major RWD-containing proteins: RING finger-containing proteins, WD-repeat-containing proteins, and yeast DEAD (DEXD)-like helicases. The structure forms an alpha + beta sandwich fold consisting of two layers: a four-stranded antiparallel beta-sheet, and three side-by-side alpha-helices." Q#12341 - CGI_10019588 superfamily 245201 803 1016 6.46E-17 82.2197 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#12341 - CGI_10019588 superfamily 241738 1801 1895 0.000337746 40.986 cl00266 HGTP_anticodon superfamily - - "HGTP anticodon binding domain, as found at the C-terminus of histidyl, glycyl, threonyl and prolyl tRNA synthetases, which are classified as a group of class II aminoacyl-tRNA synthetases (aaRS). In aaRSs, the anticodon binding domain is responsible for specificity in tRNA-binding, so that the activated amino acid is transferred to a ribose 3' OH group of the appropriate tRNA only. This domain is also found in the accessory subunit of mitochondrial polymerase gamma (Pol gamma b)." 
Q#12342 - CGI_10019589 superfamily 241599 65 120 1.76E-15 69.1944 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#12345 - CGI_10019592 superfamily 241750 174 374 7.18E-39 140.401 cl00281 metallo-dependent_hydrolases superfamily N - "Superfamily of metallo-dependent hydrolases (also called amidohydrolase superfamily) is a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. The family includes urease alpha, adenosine deaminase, phosphotriesterase dihydroorotases, allantoinases, hydantoinases, AMP-, adenine and cytosine deaminases, imidazolonepropionase, aryldialkylphosphatase, chlorohydrolases, formylmethanofuran dehydrogenases and others." Q#12346 - CGI_10019593 superfamily 241588 28 131 6.10E-21 81.9449 cl00070 GHB_like superfamily - - "Glycoprotein hormone beta chain homologues; This family of cystine-knot hormones includes the beta chains of gonadotropins, thyrotropins, follitropins, choriogonadotropins and more. The members are reproductive hormones that consist of two glycosylated chains (alpha and beta), which form a tightly bound dimer." 
Q#12353 - CGI_10019600 superfamily 246910 38 85 0.00645085 31.567 cl15257 GIY-YIG_SF superfamily N - "GIY-YIG nuclease domain superfamily; The GIY-YIG nuclease domain superfamily includes a large and diverse group of proteins involved in many cellular processes, such as class I homing GIY-YIG family endonucleases, prokaryotic nucleotide excision repair proteins UvrC and Cho, type II restriction enzymes, the endonuclease/reverse transcriptase of eukaryotic retrotransposable elements, and a family of eukaryotic enzymes that repair stalled replication forks. All of these members contain a conserved GIY-YIG nuclease domain that may serve as a scaffold for the coordination of a divalent metal ion required for catalysis of the phosphodiester bond cleavage. By combining with different specificity, targeting, or other domains, the GIY-YIG nucleases may perform different functions." Q#12354 - CGI_10001746 superfamily 245814 173 240 4.07E-07 47.0987 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#12354 - CGI_10001746 superfamily 245814 98 152 0.000598067 37.5566 cl11960 Ig superfamily N - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#12356 - CGI_10004698 superfamily 241547 4 100 1.79E-33 117.649 cl00012 alpha_CA superfamily N - "Carbonic anhydrase alpha (vertebrate-like) group. Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionary distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidine residues and a fourth conserved histidine plays a potential role in proton transfer." Q#12358 - CGI_10004700 superfamily 241599 152 211 3.08E-13 64.9572 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#12365 - CGI_10004978 superfamily 216363 51 134 1.25E-12 59.7914 cl08312 UPF0029 superfamily C - Uncharacterized protein family UPF0029; Uncharacterized protein family UPF0029. 
Q#12369 - CGI_10004982 superfamily 110440 340 366 3.35E-05 41.2393 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#12372 - CGI_10002584 superfamily 241754 20 701 0 812.652 cl00286 Motor_domain superfamily - - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. Q#12372 - CGI_10002584 superfamily 247725 1504 1619 6.64E-41 149.472 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#12372 - CGI_10002584 superfamily 247725 2013 2123 2.36E-31 121.752 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. 
They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#12372 - CGI_10002584 superfamily 243052 1700 1854 6.39E-26 107.061 cl02480 MyTH4 superfamily - - "MyTH4 domain; Domain in myosin and kinesin tails, present twice in myosin-VIIa, and also present in 3 other myosins." Q#12372 - CGI_10002584 superfamily 243052 1157 1300 3.21E-20 90.4975 cl02480 MyTH4 superfamily - - "MyTH4 domain; Domain in myosin and kinesin tails, present twice in myosin-VIIa, and also present in 3 other myosins." Q#12372 - CGI_10002584 superfamily 215882 1420 1528 6.99E-14 70.7726 cl09511 FERM_M superfamily - - FERM central domain; This domain is the central structural domain of the FERM domain. Q#12372 - CGI_10002584 superfamily 241645 1304 1372 0.00170721 39.0428 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#12372 - CGI_10002584 superfamily 215882 1947 2036 0.00537274 37.6455 cl09511 FERM_M superfamily - - FERM central domain; This domain is the central structural domain of the FERM domain. 
Q#12373 - CGI_10013333 superfamily 215647 296 497 0.000188851 41.8253 cl18338 7tm_2 superfamily - - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GPCRs). They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#12374 - CGI_10013334 superfamily 241578 942 1083 2.89E-13 70.1157 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. 
Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#12374 - CGI_10013334 superfamily 241578 26 193 5.43E-09 56.6337 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#12374 - CGI_10013334 superfamily 219821 849 888 1.62E-06 48.5214 cl07136 VWA_N superfamily N - "VWA N-terminal; This domain is found at the N-terminus of proteins containing von Willebrand factor type A (VWA, pfam00092) and Cache (pfam02743) domains. It has been found in vertebrates, Drosophila and C. elegans but has not yet been identified in other eukaryotes. It is probably involved in the function of some voltage-dependent calcium channel subunits." Q#12374 - CGI_10013334 superfamily 217211 257 322 4.82E-06 46.5086 cl03691 Cache_1 superfamily - - Cache domain; Cache domain. Q#12374 - CGI_10013334 superfamily 217211 1136 1201 1.80E-05 44.9678 cl03691 Cache_1 superfamily - - Cache domain; Cache domain. 
Q#12376 - CGI_10013336 superfamily 241874 59 589 1.61E-121 376.921 cl00456 SLC5-6-like_sbd superfamily - - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#12377 - CGI_10013338 superfamily 220692 25 332 4.30E-32 123.468 cl18570 7TM_GPCR_Srw superfamily - - Serpentine type 7TM GPCR chemoreceptor Srw; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srw is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. The genes encoding Srw do not appear to be under as strong an adaptive evolutionary pressure as those of Srz. 
Q#12379 - CGI_10013340 superfamily 245213 461 497 1.06E-06 46.861 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12379 - CGI_10013340 superfamily 245213 537 573 2.48E-06 45.7054 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12379 - CGI_10013340 superfamily 245213 347 383 3.82E-06 45.3202 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#12379 - CGI_10013340 superfamily 245213 309 345 4.23E-06 44.935 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12379 - CGI_10013340 superfamily 245213 499 535 5.72E-06 44.5498 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12379 - CGI_10013340 superfamily 245213 271 307 6.97E-06 44.5498 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#12379 - CGI_10013340 superfamily 245213 423 459 1.34E-05 43.7794 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12379 - CGI_10013340 superfamily 245213 385 421 1.48E-05 43.3942 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12379 - CGI_10013340 superfamily 205157 234 269 4.09E-05 42.1395 cl18264 EGF_3 superfamily - - EGF domain; This family includes a variety of EGF-like domain homologues. This family includes the C-terminal domain of the malaria parasite MSP1 protein. Q#12380 - CGI_10013341 superfamily 245213 436 472 2.67E-07 48.0166 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12380 - CGI_10013341 superfamily 245213 170 206 4.23E-07 47.2462 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12380 - CGI_10013341 superfamily 245213 398 434 5.73E-07 46.861 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12380 - CGI_10013341 superfamily 245213 284 320 7.71E-07 46.4758 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12380 - CGI_10013341 superfamily 245213 208 244 1.88E-06 45.3202 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12380 - CGI_10013341 superfamily 245213 360 396 3.42E-06 44.5498 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12380 - CGI_10013341 superfamily 245213 132 168 6.25E-06 43.7794 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12380 - CGI_10013341 superfamily 245213 246 282 8.64E-06 43.3942 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12380 - CGI_10013341 superfamily 245213 322 358 2.73E-05 42.2386 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12380 - CGI_10013341 superfamily 245213 94 130 0.000169549 39.9274 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12380 - CGI_10013341 superfamily 243060 479 536 3.69E-06 45.4476 cl02507 SEA superfamily C - "SEA domain; Domain found in Sea urchin sperm protein, Enterokinase, Agrin (SEA). Proposed function of regulating or binding carbohydrate side chains. Recently a proteolytic activity has been shown for a SEA domain." Q#12381 - CGI_10013342 superfamily 245213 311 347 9.99E-08 49.1722 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12381 - CGI_10013342 superfamily 245213 425 461 9.99E-08 49.1722 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. 
EGF_CA can be found in tandem repeat arrangements." Q#12381 - CGI_10013342 superfamily 245213 235 271 1.85E-07 48.4018 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12381 - CGI_10013342 superfamily 245213 273 309 3.17E-07 48.0166 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12381 - CGI_10013342 superfamily 245213 387 423 3.17E-07 48.0166 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. 
EGF_CA can be found in tandem repeat arrangements." Q#12381 - CGI_10013342 superfamily 245213 349 384 7.49E-07 46.861 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12381 - CGI_10013342 superfamily 245213 463 498 7.49E-07 46.861 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12381 - CGI_10013342 superfamily 245213 501 536 6.97E-06 44.1646 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. 
EGF_CA can be found in tandem repeat arrangements." Q#12381 - CGI_10013342 superfamily 243035 204 231 0.00539695 35.8148 cl02432 CLECT superfamily NC - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. 
Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#12383 - CGI_10013344 superfamily 245201 64 232 1.17E-10 59.9429 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#12384 - CGI_10020759 superfamily 248264 270 438 3.45E-19 88.8333 cl17710 DDE_4 superfamily - - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. 
This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#12385 - CGI_10020760 superfamily 241583 116 197 3.97E-13 65.6702 cl00064 ZnMc superfamily C - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#12387 - CGI_10020762 superfamily 246680 235 293 6.34E-05 40.0114 cl14633 DD_superfamily superfamily C - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." 
Q#12389 - CGI_10020764 superfamily 246597 7 299 0 674.462 cl13995 MPP_superfamily superfamily - - "metallophosphatase superfamily, metallophosphatase domain; Metallophosphatases (MPPs), also known as metallophosphoesterases, phosphodiesterases (PDEs), binuclear metallophosphoesterases, and dimetal-containing phosphoesterases (DMPs), represent a diverse superfamily of enzymes with a conserved domain containing an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. This superfamily includes: the phosphoprotein phosphatases (PPPs), Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination." Q#12393 - CGI_10020769 superfamily 246613 23 246 9.79E-147 422.889 cl14058 lectin_L-type superfamily - - "legume lectins; The L-type (legume-type) lectins are a highly diverse family of carbohydrate binding proteins that generally display no enzymatic activity toward the sugars they bind. This family includes arcelin, concanavalinA, the lectin-like receptor kinases, the ERGIC-53/VIP36/EMP46 type1 transmembrane proteins, and an alpha-amylase inhibitor. L-type lectins have a dome-shaped beta-barrel carbohydrate recognition domain with a curved seven-stranded beta-sheet referred to as the "front face" and a flat six-stranded beta-sheet referred to as the "back face". This domain homodimerizes so that adjacent back sheets form a contiguous 12-stranded sheet and homotetramers occur by a back-to-back association of these homodimers. 
Though L-type lectins exhibit both sequence and structural similarity to one another, their carbohydrate binding specificities differ widely." Q#12394 - CGI_10020770 superfamily 247724 30 295 4.41E-141 412.409 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#12394 - CGI_10020770 superfamily 216255 217 511 6.26E-126 374.188 cl03076 Dynamin_M superfamily - - "Dynamin central region; This region lies between the GTPase domain, see pfam00350, and the pleckstrin homology (PH) domain, see pfam00169." Q#12395 - CGI_10020771 superfamily 243061 134 234 8.24E-42 139.785 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. 
These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#12400 - CGI_10020776 superfamily 248458 304 431 1.08E-05 46.5381 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#12401 - CGI_10020777 superfamily 207411 100 136 3.37E-10 54.3729 cl01438 zf-AN1 superfamily - - "AN1-like Zinc finger; Zinc finger at the C-terminus of An1, a ubiquitin-like protein in Xenopus laevis. 
The following pattern describes the zinc finger. C-X2-C-X(9-12)-C-X(1-2)-C-X4-C-X2-H-X5-H-X-C Where X can be any amino acid, and numbers in brackets indicate the number of residues." Q#12401 - CGI_10020777 superfamily 207411 10 52 5.24E-10 53.6025 cl01438 zf-AN1 superfamily - - "AN1-like Zinc finger; Zinc finger at the C-terminus of An1, a ubiquitin-like protein in Xenopus laevis. The following pattern describes the zinc finger. C-X2-C-X(9-12)-C-X(1-2)-C-X4-C-X2-H-X5-H-X-C Where X can be any amino acid, and numbers in brackets indicate the number of residues." Q#12401 - CGI_10020777 superfamily 145783 209 226 0.00145513 35.4856 cl03724 UIM superfamily - - Ubiquitin interaction motif; This motif is called the ubiquitin interaction motif. One of the proteins containing this motif is a receptor for poly-ubiquitination chains for the proteasome. This motif has a pattern of conservation characteristic of an alpha helix. Q#12402 - CGI_10020778 superfamily 203593 130 250 4.59E-21 86.5854 cl18243 Mod_r superfamily - - "Modifier of rudimentary (Mod(r)) protein; This family represents a conserved region approximately 150 residues long within a number of eukaryotic proteins that show homology with Drosophila melanogaster Modifier of rudimentary (Mod(r)) proteins. The N-terminal half of Mod(r) proteins is acidic, whereas the C-terminal half is basic, and both of these regions are represented in this family. Members of this family include the Vps37 subunit of the endosomal sorting complex ESCRT-I, a complex involved in recruiting transport machinery for protein sorting at the multivesicular body (MVB). The yeast ESCRT-I complex consists of three proteins (Vps23, Vps28 and Vps37). The mammalian homologue of Vps37 interacts with Tsg101 (Pfam: PF05743) through its mod(r) domain and its function is essential for lysosomal sorting of EGF receptors." 
Q#12403 - CGI_10020779 superfamily 248145 11 264 5.14E-118 344.211 cl17591 CAF1 superfamily - - CAF1 family ribonuclease; The major pathways of mRNA turnover in eukaryotes initiate with shortening of the polyA tail. CAF1 encodes a critical component of the major cytoplasmic deadenylase in yeast. Both Caf1p is required for normal mRNA deadenylation in vivo and localises to the cytoplasm. Caf1p copurifies with a Ccr4p-dependent polyA-specific exonuclease activity. Some members of this family include an inserted RNA binding domain pfam01424. This family of proteins is related to other exonucleases pfam00929 (Bateman A pers. obs.). The crystal structure of Saccharomyces cerevisiae Pop2 has been resolved at 2.3 Angstrom resolution. Q#12404 - CGI_10020780 superfamily 243092 36 287 1.72E-24 100.872 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#12404 - CGI_10020780 superfamily 192471 285 325 5.53E-07 46.9906 cl10872 DUF2372 superfamily C - Uncharacterized conserved protein (DUF2372); This family consists of proteins found from plants to humans. The function is not known. Q#12405 - CGI_10020781 superfamily 241609 566 647 1.36E-37 137.124 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#12405 - CGI_10020781 superfamily 245201 736 1016 1.10E-151 455.06 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." 
Q#12405 - CGI_10020781 superfamily 243092 1 197 6.34E-31 124.37 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#12405 - CGI_10020781 superfamily 243040 423 552 2.52E-22 95.1649 cl02447 CRD_FZ superfamily - - "CRD_domain cysteine-rich domain, also known as Fz (frizzled) domain; CRD_FZ is an essential component of a number of cell surface receptors, which are involved in multiple signal transduction pathways, particularly in modulating the activity of the Wnt proteins, which play a fundamental role in the early development of metazoans. CRD is also found in secreted frizzled related proteins (SFRPs), which lack the transmembrane segment found in the frizzled protein. 
The CRD domain is also present in the alpha-1 chain of mouse type XVIII collagen, in carboxypeptidase Z, several receptor tyrosine kinases, and the mosaic transmembrane serine protease corin. The CRD domain is well conserved in metazoans - 10 frizzled proteins have been identified in mammals, 4 in Drosophila and 3 in Caenorhabditis elegans. CRD domains have also been identified in multiple tandem copies in a Dictyostelium discoideum protein. Very little is known about the mechanism by which CRD domains interact with their ligands. The domain contains 10 conserved cysteines." Q#12405 - CGI_10020781 superfamily 245814 214 292 3.08E-05 43.2629 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#12406 - CGI_10020782 superfamily 199940 6 148 1.22E-100 288.006 cl03715 Mago_nashi superfamily - - "Mago nashi proteins, integral members of the exon junction complex; Members of this family, which was originally identified in Drosophila and called mago nashi, are integral members of the exon junction complex (EJC). The EJC is a multiprotein complex that is deposited on spliced mRNAs after intron removal at a conserved position upstream of the exon-exon junction, and transported to the cytoplasm where it has been shown to influence translation, surveillance, and localization of the spliced mRNA. 
It consists of four core proteins (eIF4AIII, Barentsz [Btz], Mago, and Y14), mRNA, and ATP and is supposed to be a binding platform for more peripherally and transiently associated factors along mRNA travel. Mago and Y14 form a stable heterodimer that stabilizes the complex by inhibiting eIF4AIII's ATPase activity. In humans, but not Drosophila, EJC is involved in nonsense-mediated mRNA decay (NMD) via binding to Upf3b, a central NMD effector. EJC is stripped off the mRNA during the first round of translation and then the complex components are transported back into the nucleus and recycled. The Mago-Y14 heterodimer has been shown to interact with the cytoplasmic protein PYM, an EJC disassembly factor, and specifically binds to the karyopherin nuclear receptor importin 13." Q#12409 - CGI_10020785 superfamily 218611 246 409 1.54E-54 182.368 cl05191 DMAP1 superfamily - - "DNA methyltransferase 1-associated protein 1 (DMAP1); DNA methylation can contribute to transcriptional silencing through several transcriptionally repressive complexes, which include methyl-CpG binding domain proteins (MBDs) and histone deacetylases (HDACs). The chief enzyme that maintains mammalian DNA methylation, DNMT1, can also establish a repressive transcription complex. The non-catalytic amino terminus of DNMT1 binds to HDAC2 and DMAP1 (for DNMT1 associated protein), and can mediate transcriptional repression. DMAP1 has intrinsic transcription repressive activity, and binds to the transcriptional co-repressor TSG101. DMAP1 is targeted to replication foci through interaction with the far N terminus of DNMT1 throughout S phase, whereas HDAC2 joins DNMT1 and DMAP1 only during late S phase, providing a platform for how histones may become deacetylated in heterochromatin following replication." 
Q#12409 - CGI_10020785 superfamily 212556 155 202 4.55E-14 67.0478 cl18296 SANT_DMAP1_like superfamily - - "SANT/myb-like domain of Human Dna Methyltransferase 1 Associated Protein 1-like; These proteins are members of the SANT/myb group. SANT is named after 'SWI3, ADA2, N-CoR and TFIIIB', several factors that share this domain. The SANT domain resembles the 3 alpha-helix bundle of the DNA-binding Myb domains and is found in a diverse set of proteins." Q#12410 - CGI_10020786 superfamily 247899 126 233 2.41E-05 45.3327 cl17345 AccA superfamily NC - Acetyl-CoA carboxylase alpha subunit [Lipid metabolism] Q#12413 - CGI_10020789 superfamily 248193 12 352 4.73E-74 239.85 cl17639 MiaA superfamily - - "tRNA delta(2)-isopentenylpyrophosphate transferase [Translation, ribosomal structure and biogenesis]" Q#12414 - CGI_10020790 superfamily 241596 343 402 1.07E-11 60.3055 cl00081 HLH superfamily - - "Helix-loop-helix domain, found in specific DNA- binding proteins that act as transcription factors; 60-100 amino acids long. 
A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE (5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#12414 - CGI_10020790 superfamily 216269 62 228 1.04E-20 91.1496 cl03082 Myc_N superfamily C - "Myc amino-terminal region; The myc family belongs to the basic helix-loop-helix leucine zipper class of transcription factors, see pfam00010. Myc forms a heterodimer with Max, and this complex regulates cell growth through direct activation of genes involved in cell replication. Mutations in the C-terminal 20 residues of this domain cause unique changes in the induction of apoptosis, transformation, and G2 arrest." Q#12415 - CGI_10020791 superfamily 247725 1 100 7.67E-12 60.3862 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. 
They are generally involved in targeting proteins to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#12416 - CGI_10020792 superfamily 241884 669 759 1.50E-59 200.549 cl00467 Ntn_hydrolase superfamily N - "The Ntn hydrolases (N-terminal nucleophile) are a diverse superfamily of enzymes that are activated autocatalytically via an N-terminally located nucleophilic amino acid. N-terminal nucleophile (NTN-) hydrolase superfamily, which contains a four-layered alpha, beta, beta, alpha core structure. This family of hydrolases includes penicillin acylase, the 20S proteasome alpha and beta subunits, and glutamate synthase. The mechanism of activation of these proteins is conserved, although they differ in their substrate specificities. All known members catalyze the hydrolysis of amide bonds in either proteins or small molecules, and each one of them is synthesized as a preprotein. For each, an autocatalytic endoproteolytic process generates a new N-terminal residue. This mature N-terminal residue is central to catalysis and acts as both a polarizing base and a nucleophile during the reaction. The N-terminal amino group acts as the proton acceptor and activates either the nucleophilic hydroxyl in a Ser or Thr residue or the nucleophilic thiol in a Cys residue. The position of the N-terminal nucleophile in the active site and the mechanism of catalysis are conserved in this family, despite considerable variation in the protein sequences." 
Q#12416 - CGI_10020792 superfamily 247792 16 59 6.76E-07 47.4404 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#12416 - CGI_10020792 superfamily 241563 129 163 3.63E-06 45.1628 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#12417 - CGI_10020793 superfamily 241592 303 373 5.69E-06 43.8559 cl00074 H2A superfamily - - "Histone 2A; H2A is a subunit of the nucleosome. The nucleosome is an octamer containing two H2A, H2B, H3, and H4 subunits. The H2A subunit performs essential roles in maintaining structural integrity of the nucleosome, chromatin condensation, and binding of specific chromatin-associated proteins." Q#12417 - CGI_10020793 superfamily 247999 12 57 0.000136625 39.5026 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. 
Q#12419 - CGI_10001710 superfamily 241563 89 124 0.000493591 38.6144 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#12420 - CGI_10017354 superfamily 243086 237 284 7.90E-16 72.0225 cl02559 GPS superfamily - - "Latrophilin/CL-1-like GPS domain; Domain present in latrophilin/CL-1, sea urchin REJ and polycystin." Q#12420 - CGI_10017354 superfamily 215647 331 438 7.17E-12 64.1668 cl18338 7tm_2 superfamily N - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GPCRs). They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#12421 - CGI_10017355 superfamily 245201 630 877 3.73E-113 350.584 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. 
It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#12421 - CGI_10017355 superfamily 243051 414 506 0.000207782 41.2094 cl02479 MAM superfamily N - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal cord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich regions." 
Q#12421 - CGI_10017355 superfamily 241613 311 329 0.00645382 35.7346 cl00104 LDLa superfamily C - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#12422 - CGI_10017356 superfamily 220744 170 307 8.84E-30 115.407 cl11075 OAS1_C superfamily - - "2'-5'-oligoadenylate synthetase 1, domain 2, C-terminus; This is the largely alpha-helical, C-terminal half of 2'-5'-oligoadenylate synthetase 1, being described as domain 2 of the enzyme and homologous to a tandem ubiquitin repeat. It carries the region of enzymic activity between 320 and 344 at the extreme C-terminal end. Oligoadenylate synthetases are antiviral enzymes that counteract viral attack by degrading viral RNA. The enzyme uses ATP in 2'-specific nucleotidyl transfer reactions to synthesise 2'-5'-oligoadenylates, which activate latent ribonuclease, resulting in degradation of viral RNA and inhibition of virus replication. This domain is often associated with NTP_transf_2 pfam01909." Q#12422 - CGI_10017356 superfamily 245818 105 156 5.54E-05 42.0017 cl11966 Rel-Spo_like superfamily C - "RelA- and SpoT-like ppGpp Synthetases and Hydrolases, catalytic domain; The Rel-Spo superfamily includes the catalytic domains of Escherichia coli ppGpp synthetase (RelA), ppGpp synthetase/hydrolase (SpoT), and related proteins. 
RelA synthesizes (p)ppGpp in response to amino-acid starvation and in association with ribosomes. (p)ppGpp triggers the bacterial stringent response. SpoT catalyzes (p)ppGpp synthesis under carbon limitation in a ribosome-independent manner. It also catalyzes (p)ppGpp degradation. Gram-negative bacteria have two enzymes involved in (p)ppGpp metabolism while most Gram-positive organisms have a single Rel-Spo enzyme (Rel), which both synthesizes and degrades (p)ppGpp. The Arabidopsis thaliana Rel-Spo proteins, At-RSH1,-2, and-3 appear to regulate a rapid (p)ppGpp-mediated response to pathogens and other stresses. This catalytic domain is found in association with an N-terminal HD domain and a C-terminal metal dependent phosphohydrolase domain (TGS). Some Rel-Spo proteins also have a C-terminal regulatory ACT domain." Q#12423 - CGI_10017357 superfamily 247744 2 45 2.17E-07 45.2965 cl17190 NK superfamily C - "Nucleoside/nucleotide kinase (NK) is a protein superfamily consisting of multiple families of enzymes that share structural similarity and are functionally related to the catalysis of the reversible phosphate group transfer from nucleoside triphosphates to nucleosides/nucleotides, nucleoside monophosphates, or sugars. Members of this family play a wide variety of essential roles in nucleotide metabolism, the biosynthesis of coenzymes and aromatic compounds, as well as the metabolism of sugar and sulfate." Q#12423 - CGI_10017357 superfamily 247744 42 85 5.31E-07 44.1409 cl17190 NK superfamily N - "Nucleoside/nucleotide kinase (NK) is a protein superfamily consisting of multiple families of enzymes that share structural similarity and are functionally related to the catalysis of the reversible phosphate group transfer from nucleoside triphosphates to nucleosides/nucleotides, nucleoside monophosphates, or sugars. 
Members of this family play a wide variety of essential roles in nucleotide metabolism, the biosynthesis of coenzymes and aromatic compounds, as well as the metabolism of sugar and sulfate." Q#12427 - CGI_10017361 superfamily 241624 131 478 8.56E-38 138.999 cl00120 PP2Cc superfamily - - "Serine/threonine phosphatases, family 2C, catalytic domain; The protein architecture and deduced catalytic mechanism of PP2C phosphatases are similar to the PP1, PP2A, PP2B family of protein Ser/Thr phosphatases, with which PP2C shares no sequence similarity." Q#12428 - CGI_10017362 superfamily 243146 80 134 8.77E-06 44.5875 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#12428 - CGI_10017362 superfamily 243146 206 252 9.71E-05 41.5059 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#12428 - CGI_10017362 superfamily 243146 135 197 0.000192922 40.7355 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#12428 - CGI_10017362 superfamily 243146 253 319 0.00038138 39.5799 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#12428 - CGI_10017362 superfamily 243146 21 58 0.000388436 39.567 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#12430 - CGI_10017364 superfamily 248097 48 171 6.85E-15 67.2902 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#12431 - CGI_10017365 superfamily 248097 48 149 1.27E-12 61.127 cl17543 C1q superfamily C - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. 
Q#12432 - CGI_10017366 superfamily 243066 46 151 5.53E-17 76.5021 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#12432 - CGI_10017366 superfamily 198867 189 267 0.000103347 40.4025 cl06652 BACK superfamily N - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation)." Q#12436 - CGI_10017370 superfamily 241577 213 403 1.88E-148 422.415 cl00056 MH2 superfamily - - "C-terminal Mad Homology 2 (MH2) domain; The MH2 domain is found in the SMAD (small mothers against decapentaplegic) family of proteins and is responsible for type I receptor interactions, phosphorylation-triggered homo- and hetero-oligomerization, and transactivation. It is negatively regulated by the N-terminal MH1 domain which prevents it from forming a complex with SMAD4. The MH2 domain is multifunctional and provides SMADs with their specificity and selectivity, as well as transcriptional activity. 
Several transcriptional co-activators and repressors have also been reported to regulate SMAD signaling by interacting with the MH2 domain. Mutations in the MH2 domains of SMAD2 and especially SMAD4 have been detected in colorectal and other human cancers." Q#12436 - CGI_10017370 superfamily 241576 8 130 1.35E-75 233.576 cl00055 MH1 superfamily - - "N-terminal Mad Homology 1 (MH1) domain; The MH1 is a small DNA-binding domain present in SMAD (small mothers against decapentaplegic) family of proteins, which are signal transducers and transcriptional modulators that mediate multiple signaling pathways. MH1 binds to the DNA major groove in an unusual manner via a beta hairpin structure. It negatively regulates the functions of the MH2 domain, the C-terminal domain of SMAD. Receptor-regulated SMAD proteins (R-SMADs, including SMAD1, SMAD2, SMAD3, SMAD5, and SMAD9) are activated by phosphorylation by transforming growth factor (TGF)-beta type I receptors. The active R-SMAD associates with a common mediator SMAD (Co-SMAD or SMAD4) and other cofactors, which together translocate to the nucleus to regulate gene expression. The inhibitory or antagonistic SMADs (I-SMADs, including SMAD6 and SMAD7) negatively regulate TGF-beta signaling by competing with R-SMADs for type I receptor or Co-SMADs. MH1 domains of R-SMAD and SMAD4 contain a nuclear localization signal as well as DNA-binding activity. The activated R-SMAD/SMAD4 complex then binds with very low affinity to a DNA sequence CAGAC called SMAD-binding element (SBE) via the MH1 domain." Q#12437 - CGI_10017371 superfamily 219542 39 139 1.08E-37 135.062 cl18517 Cu-oxidase_3 superfamily - - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. 
Q#12437 - CGI_10017371 superfamily 215896 153 299 1.06E-19 85.8096 cl18351 Cu-oxidase superfamily - - Multicopper oxidase; Many of the proteins in this family contain multiple similar copies of this plastocyanin-like domain. Q#12437 - CGI_10017371 superfamily 219541 381 521 8.62E-11 59.7895 cl18516 Cu-oxidase_2 superfamily C - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. Q#12438 - CGI_10017372 superfamily 219542 39 139 2.69E-38 127.743 cl18517 Cu-oxidase_3 superfamily - - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. Q#12440 - CGI_10017374 superfamily 241577 12 192 4.52E-79 236.892 cl00056 MH2 superfamily - - "C-terminal Mad Homology 2 (MH2) domain; The MH2 domain is found in the SMAD (small mothers against decapentaplegic) family of proteins and is responsible for type I receptor interactions, phosphorylation-triggered homo- and hetero-oligomerization, and transactivation. It is negatively regulated by the N-terminal MH1 domain which prevents it from forming a complex with SMAD4. The MH2 domain is multifunctional and provides SMADs with their specificity and selectivity, as well as transcriptional activity. Several transcriptional co-activators and repressors have also been reported to regulate SMAD signaling by interacting with the MH2 domain. Mutations in the MH2 domains of SMAD2 and especially SMAD4 have been detected in colorectal and other human cancers." Q#12441 - CGI_10017375 superfamily 219542 150 253 8.50E-36 131.595 cl18517 Cu-oxidase_3 superfamily - - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. 
Q#12441 - CGI_10017375 superfamily 219541 512 660 7.41E-26 104.087 cl18516 Cu-oxidase_2 superfamily - - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. Q#12441 - CGI_10017375 superfamily 215896 261 390 2.20E-12 65.394 cl18351 Cu-oxidase superfamily - - Multicopper oxidase; Many of the proteins in this family contain multiple similar copies of this plastocyanin-like domain. Q#12442 - CGI_10017376 superfamily 216363 69 153 4.60E-19 77.8958 cl08312 UPF0029 superfamily C - Uncharacterized protein family UPF0029; Uncharacterized protein family UPF0029. Q#12443 - CGI_10017377 superfamily 219542 87 187 1.44E-38 139.299 cl18517 Cu-oxidase_3 superfamily - - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. Q#12443 - CGI_10017377 superfamily 219541 436 606 1.63E-21 91.7611 cl18516 Cu-oxidase_2 superfamily - - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. Q#12443 - CGI_10017377 superfamily 215896 201 323 1.90E-15 74.2536 cl18351 Cu-oxidase superfamily - - Multicopper oxidase; Many of the proteins in this family contain multiple similar copies of this plastocyanin-like domain. Q#12444 - CGI_10004147 superfamily 241583 365 405 0.00278564 37.6031 cl00064 ZnMc superfamily NC - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." 
Q#12446 - CGI_10004149 superfamily 241568 16 59 4.06E-09 50.154 cl00043 CCP superfamily N - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#12446 - CGI_10004149 superfamily 241568 72 121 2.28E-07 45.5316 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#12449 - CGI_10005590 superfamily 245864 488 834 4.02E-61 214.45 cl12078 p450 superfamily N - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. 
Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#12449 - CGI_10005590 superfamily 241567 24 290 9.76E-20 89.2198 cl00042 CASc superfamily - - "Caspase, interleukin-1 beta converting enzyme (ICE) homologues; Cysteine-dependent aspartate-directed proteases that mediate programmed cell death (apoptosis). Caspases are synthesized as inactive zymogens and activated by proteolysis of the peptide backbone adjacent to an aspartate. The resulting two subunits associate to form an (alpha)2(beta)2-tetramer which is the active enzyme. Activation of caspases can be mediated by other caspase homologs." Q#12450 - CGI_10005591 superfamily 215733 179 410 3.56E-55 192.01 cl02811 E1-E2_ATPase superfamily - - E1-E2 ATPase; E1-E2 ATPase. Q#12450 - CGI_10005591 superfamily 216063 832 1035 5.72E-28 112.327 cl02929 Cation_ATPase_C superfamily - - "Cation transporting ATPase, C-terminus; Members of this families are involved in Na+/K+, H+/K+, Ca++ and Mg++ transport. This family represents 5 transmembrane helices." Q#12450 - CGI_10005591 superfamily 222006 470 547 1.28E-24 99.9894 cl16182 Hydrolase_like2 superfamily - - Putative hydrolase of sodium-potassium ATPase alpha subunit; This is a putative hydrolase of the sodium-potassium ATPase alpha subunit. 
Q#12450 - CGI_10005591 superfamily 243244 84 158 3.14E-22 92.6458 cl02930 Cation_ATPase_N superfamily - - "Cation transporter/ATPase, N-terminus; Members of this families are involved in Na+/K+, H+/K+, Ca++ and Mg++ transport." Q#12450 - CGI_10005591 superfamily 226572 658 779 0.000294953 41.0052 cl18761 COG4087 superfamily - - Soluble P-type ATPase [General function prediction only] Q#12451 - CGI_10005592 superfamily 243179 212 276 4.64E-06 44.2794 cl02781 tetraspanin_LEL superfamily N - "Tetraspanin, extracellular domain or large extracellular loop (LEL). Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. The tetraspanin family contains CD9, CD63, CD37, CD53, CD82, CD151, and CD81, amongst others. Tetraspanins are involved in diverse processes such as cell activation and proliferation, adhesion and motility, differentiation, cancer, and others. Their various functions may relate to their ability to act as molecular facilitators, grouping specific cell-surface proteins and affecting formation and stability of signaling complexes. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web", which may also include integrins." Q#12451 - CGI_10005592 superfamily 243179 310 374 4.64E-06 44.2794 cl02781 tetraspanin_LEL superfamily N - "Tetraspanin, extracellular domain or large extracellular loop (LEL). Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. The tetraspanin family contains CD9, CD63, CD37, CD53, CD82, CD151, and CD81, amongst others. 
Tetraspanins are involved in diverse processes such as cell activation and proliferation, adhesion and motility, differentiation, cancer, and others. Their various functions may relate to their ability to act as molecular facilitators, grouping specific cell-surface proteins and affecting formation and stability of signaling complexes. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web", which may also include integrins." Q#12452 - CGI_10005593 superfamily 243179 158 272 1.53E-11 59.8539 cl02781 tetraspanin_LEL superfamily - - "Tetraspanin, extracellular domain or large extracellular loop (LEL). Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. The tetraspanin family contains CD9, CD63, CD37, CD53, CD82, CD151, and CD81, amongst others. Tetraspanins are involved in diverse processes such as cell activation and proliferation, adhesion and motility, differentiation, cancer, and others. Their various functions may relate to their ability to act as molecular facilitators, grouping specific cell-surface proteins and affecting formation and stability of signaling complexes. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web", which may also include integrins."
Q#12455 - CGI_10011622 superfamily 216347 401 819 8.28E-130 397.677 cl08309 Cu_amine_oxid superfamily - - "Copper amine oxidase, enzyme domain; Copper amine oxidases are a ubiquitous and novel group of quinoenzymes that catalyze the oxidative deamination of primary amines to the corresponding aldehydes, with concomitant reduction of molecular oxygen to hydrogen peroxide. The enzymes are dimers of identical 70-90 kDa subunits, each of which contains a single copper ion and a covalently bound cofactor formed by the post-translational modification of a tyrosine side chain to 2,4,5-trihydroxyphenylalanine quinone (TPQ). This family corresponds to the catalytic domain of the enzyme." Q#12455 - CGI_10011622 superfamily 202361 253 353 1.93E-09 56.1697 cl03680 Cu_amine_oxidN3 superfamily - - "Copper amine oxidase, N3 domain; This domain is the second or third structural domain in copper amine oxidases, it is known as the N3 domain. Its function is uncertain. The catalytic domain can be found in pfam01179. Copper amine oxidases are a ubiquitous and novel group of quinoenzymes that catalyze the oxidative deamination of primary amines to the corresponding aldehydes, with concomitant reduction of molecular oxygen to hydrogen peroxide. The enzymes are dimers of identical 70-90 kDa subunits, each of which contains a single copper ion and a covalently bound cofactor formed by the post-translational modification of a tyrosine side chain to 2,4,5-trihydroxyphenylalanine quinone (TPQ)." Q#12458 - CGI_10011625 superfamily 241632 4 368 5.00E-134 390.46 cl00137 SERPIN superfamily - - "SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. 
Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia." Q#12459 - CGI_10011626 superfamily 241594 415 468 9.92E-10 58.7304 cl00077 HECTc superfamily C - "HECT domain; C-terminal catalytic domain of a subclass of Ubiquitin-protein ligase (E3). It binds specific ubiquitin-conjugating enzymes (E2), accepts ubiquitin from E2, transfers ubiquitin to substrate lysine side chains, and transfers additional ubiquitin molecules to the end of growing ubiquitin chains." Q#12460 - CGI_10011627 superfamily 241632 4 368 1.99E-132 386.223 cl00137 SERPIN superfamily - - "SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia." Q#12461 - CGI_10011628 superfamily 245814 85 142 1.34E-07 49.3275 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#12461 - CGI_10011628 superfamily 245814 164 230 5.66E-06 44.654 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#12461 - CGI_10011628 superfamily 245814 12 49 0.00562112 35.4604 cl11960 Ig superfamily N - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#12462 - CGI_10011629 superfamily 245814 240 308 5.92E-07 46.2154 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. 
A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#12463 - CGI_10011630 superfamily 245213 1483 1516 0.000205827 41.8534 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12463 - CGI_10011630 superfamily 245213 2114 2145 0.000216586 41.4682 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12463 - CGI_10011630 superfamily 245213 1711 1746 0.000608165 40.3126 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. 
Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12463 - CGI_10011630 superfamily 245213 813 855 0.000667189 40.3126 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12463 - CGI_10011630 superfamily 245213 2344 2386 0.00125994 39.5422 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12463 - CGI_10011630 superfamily 245213 664 697 0.00287456 38.3866 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. 
Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12463 - CGI_10011630 superfamily 245213 1213 1248 0.00320881 38.0014 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12463 - CGI_10011630 superfamily 245213 395 426 0.00390404 38.0014 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12463 - CGI_10011630 superfamily 245213 1626 1661 0.00635158 37.231 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. 
Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12463 - CGI_10011630 superfamily 245213 1836 1866 0.00734443 37.231 cl09941 EGF_CA superfamily C - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12463 - CGI_10011630 superfamily 245213 519 552 0.0093357 36.8458 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12463 - CGI_10011630 superfamily 216290 2598 2691 1.11E-27 111.997 cl03089 Cu2_monooxygen superfamily N - "Copper type II ascorbate-dependent monooxygenase, N-terminal domain; The N and C-terminal domains of members of this family adopt the same PNGase F-like fold." Q#12463 - CGI_10011630 superfamily 205157 1258 1293 4.58E-07 49.4583 cl18264 EGF_3 superfamily - - EGF domain; This family includes a variety of EGF-like domain homologues. 
This family includes the C-terminal domain of the malaria parasite MSP1 protein. Q#12463 - CGI_10011630 superfamily 241578 2262 2302 1.80E-06 50.076 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#12463 - CGI_10011630 superfamily 241578 2300 2341 1.97E-06 50.076 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers.
There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#12463 - CGI_10011630 superfamily 201391 869 910 2.50E-05 44.6298 cl02926 TB superfamily - - TB domain; This domain is also known as the 8 cysteine domain. This family includes the hybrid domains. This cysteine rich repeat is found in TGF binding protein and fibrillin. Q#12463 - CGI_10011630 superfamily 201391 111 153 3.60E-05 43.8594 cl02926 TB superfamily - - TB domain; This domain is also known as the 8 cysteine domain. This family includes the hybrid domains. This cysteine rich repeat is found in TGF binding protein and fibrillin. Q#12463 - CGI_10011630 superfamily 241578 352 392 0.000117855 44.6832 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains."
Q#12463 - CGI_10011630 superfamily 241578 2032 2070 0.000137264 44.6832 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#12463 - CGI_10011630 superfamily 241578 1084 1124 0.000159379 44.298 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with.
Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#12463 - CGI_10011630 superfamily 201391 761 796 0.000237859 41.5482 cl02926 TB superfamily - - TB domain; This domain is also known as the 8 cysteine domain. This family includes the hybrid domains. This cysteine rich repeat is found in TGF binding protein and fibrillin. Q#12463 - CGI_10011630 superfamily 201391 1546 1583 0.000282849 41.163 cl02926 TB superfamily - - TB domain; This domain is also known as the 8 cysteine domain. This family includes the hybrid domains. This cysteine rich repeat is found in TGF binding protein and fibrillin. Q#12463 - CGI_10011630 superfamily 245213 918 958 0.000429486 40.7952 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12463 - CGI_10011630 superfamily 241578 204 245 0.000744109 42.372 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#12463 - CGI_10011630 superfamily 201391 1891 1934 0.00105576 39.6222 cl02926 TB superfamily - - TB domain; This domain is also known as the 8 cysteine domain. This family includes the hybrid domains. This cysteine rich repeat is found in TGF binding protein and fibrillin. Q#12463 - CGI_10011630 superfamily 201391 2168 2209 0.001144 39.6222 cl02926 TB superfamily - - TB domain; This domain is also known as the 8 cysteine domain. This family includes the hybrid domains. This cysteine rich repeat is found in TGF binding protein and fibrillin. Q#12463 - CGI_10011630 superfamily 245213 1044 1084 0.00223751 38.484 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12463 - CGI_10011630 superfamily 245213 1951 1992 0.00238577 38.382 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins.
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12463 - CGI_10011630 superfamily 201391 572 616 0.00324116 38.0814 cl02926 TB superfamily - - TB domain; This domain is also known as the 8 cysteine domain. This family includes the hybrid domains. This cysteine rich repeat is found in TGF binding protein and fibrillin. Q#12463 - CGI_10011630 superfamily 245213 1752 1793 0.00347936 37.9968 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12463 - CGI_10011630 superfamily 241578 999 1039 0.00359476 40.0608 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#12463 - CGI_10011630 superfamily 221695 334 357 0.00448407 37.8198 cl18612 cEGF superfamily - - "Complement Clr-like EGF-like; cEGF, or complement Clr-like EGF, domains have six conserved cysteine residues disulfide-bonded into the characteristic pattern 'ababcc'. They are found in blood coagulation proteins such as fibrillin, Clr and Cls, thrombomodulin, and the LDL receptor. The core fold of the EGF domain consists of two small beta-hairpins packed against each other. Two major structural variants have been identified based on the structural context of the C-terminal cysteine residue of disulfide 'c' in the C-terminal hairpin: hEGFs and cEGFs. In cEGFs the C-terminal thiol resides on the C-terminal beta-sheet, resulting in long loop-lengths between the cysteine residues of disulfide 'c', typically C[10+]XC. These longer loop-lengths may have arisen by selective cysteine loss from a four-disulfide EGF template such as laminin or integrin. Tandem cEGF domains have five linking residues between terminal cysteines of adjacent domains. cEGF domains may or may not bind calcium in the linker region. cEGF domains with the consensus motif CXN4X[F,Y]XCXC are hydroxylated exclusively on the asparagine residue."
Q#12463 - CGI_10011630 superfamily 245213 706 734 0.00557619 37.6116 cl09941 EGF_CA superfamily C - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12463 - CGI_10011630 superfamily 245213 2222 2262 0.0061924 37.3284 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12463 - CGI_10011630 superfamily 245213 1794 1834 0.0082691 36.9432 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#12463 - CGI_10011630 superfamily 245213 623 662 0.00836722 36.9432 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12464 - CGI_10011631 superfamily 241580 78 155 1.70E-47 156.175 cl00061 FH superfamily - - "Forkhead (FH), also known as a "winged helix". FH is named for the Drosophila fork head protein, a transcription factor which promotes terminal rather than segmental development. This family of transcription factor domains, which bind to B-DNA as monomers, are also found in the Hepatocyte nuclear factor (HNF) proteins, which provide tissue-specific gene regulation. The structure contains 2 flexible loops or "wings" in the C-terminal region, hence the term winged helix." Q#12466 - CGI_10011633 superfamily 246669 507 640 1.03E-18 83.4016 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1.
However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#12466 - CGI_10011633 superfamily 246669 373 493 1.98E-18 82.3047 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#12467 - CGI_10011634 superfamily 241659 55 120 7.57E-11 55.6039 cl00175 alpha-crystallin-Hsps_p23-like superfamily - - "alpha-crystallin domain (ACD) found in alpha-crystallin-type small heat shock proteins, and a similar domain found in p23 (a cochaperone for Hsp90) and in other p23-like proteins.; The alpha-crystallin-Hsps_p23-like superfamily includes the alpha-crystallin domain (ACD) of alpha-crystallin-type small heat shock proteins (sHsps) and a similar domain found in p23-like proteins. sHsps are small stress induced proteins with monomeric masses between 12-43 kDa, whose common feature is this ACD. 
sHsps are generally active as large oligomers consisting of multiple subunits, and are believed to be ATP-independent chaperones that prevent aggregation and are important in refolding in combination with other Hsps. p23 is a cochaperone of the Hsp90 chaperoning pathway. It binds Hsp90 and participates in the folding of a number of Hsp90 clients including the progesterone receptor. p23 also has a passive chaperoning activity. p23 in addition may act as the cytosolic prostaglandin E2 synthase. Included in this superfamily is the p23-like C-terminal CHORD-SGT1 (CS) domain of suppressor of G2 allele of Skp1 (Sgt1) and the p23-like domains of human butyrate-induced transcript 1 (hB-ind1), NUD (nuclear distribution) C, Melusin, and NAD(P)H cytochrome b5 (NCB5) oxidoreductase (OR)." Q#12467 - CGI_10011634 superfamily 241659 153 221 2.94E-09 51.3667 cl00175 alpha-crystallin-Hsps_p23-like superfamily - - "alpha-crystallin domain (ACD) found in alpha-crystallin-type small heat shock proteins, and a similar domain found in p23 (a cochaperone for Hsp90) and in other p23-like proteins.; The alpha-crystallin-Hsps_p23-like superfamily includes the alpha-crystallin domain (ACD) of alpha-crystallin-type small heat shock proteins (sHsps) and a similar domain found in p23-like proteins. sHsps are small stress induced proteins with monomeric masses between 12-43 kDa, whose common feature is this ACD. sHsps are generally active as large oligomers consisting of multiple subunits, and are believed to be ATP-independent chaperones that prevent aggregation and are important in refolding in combination with other Hsps. p23 is a cochaperone of the Hsp90 chaperoning pathway. It binds Hsp90 and participates in the folding of a number of Hsp90 clients including the progesterone receptor. p23 also has a passive chaperoning activity. p23 in addition may act as the cytosolic prostaglandin E2 synthase. 
Included in this superfamily is the p23-like C-terminal CHORD-SGT1 (CS) domain of suppressor of G2 allele of Skp1 (Sgt1) and the p23-like domains of human butyrate-induced transcript 1 (hB-ind1), NUD (nuclear distribution) C, Melusin, and NAD(P)H cytochrome b5 (NCB5) oxidoreductase (OR)." Q#12469 - CGI_10011636 superfamily 245206 6 266 2.67E-99 294.879 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#12472 - CGI_10011639 superfamily 248097 207 329 3.55E-19 81.5426 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#12473 - CGI_10011640 superfamily 201383 116 215 1.15E-29 107.574 cl02923 Ribosomal_L5_C superfamily - - ribosomal L5P family C-terminus; This region is found associated with pfam00281. Q#12473 - CGI_10011640 superfamily 109342 59 112 5.34E-16 69.2676 cl08254 Ribosomal_L5 superfamily - - Ribosomal protein L5; Ribosomal protein L5. Q#12476 - CGI_10010327 superfamily 241575 72 137 4.57E-09 53.8155 cl00054 DSRM superfamily - - "Double-stranded RNA binding motif. 
Binding is not sequence specific but is highly specific for double stranded RNA. Found in a variety of proteins including dsRNA dependent protein kinase PKR, RNA helicases, Drosophila staufen protein, E. coli RNase III, RNases H1, and dsRNA dependent adenosine deaminases." Q#12476 - CGI_10010327 superfamily 241575 182 242 1.14E-07 49.5783 cl00054 DSRM superfamily - - "Double-stranded RNA binding motif. Binding is not sequence specific but is highly specific for double stranded RNA. Found in a variety of proteins including dsRNA dependent protein kinase PKR, RNA helicases, Drosophila staufen protein, E. coli RNase III, RNases H1, and dsRNA dependent adenosine deaminases." Q#12476 - CGI_10010327 superfamily 243132 306 645 6.08E-74 243.02 cl02661 A_deamin superfamily - - "Adenosine-deaminase (editase) domain; Adenosine deaminases acting on RNA (ADARs) can deaminate adenosine to form inosine. In long double-stranded RNA, this process is non-specific; it occurs site-specifically in RNA transcripts. The former is important in defence against viruses, whereas the latter may affect splicing or untranslated regions. They are primarily nuclear proteins, but a longer isoform of ADAR1 is found predominantly in the cytoplasm. ADARs are derived from the Tad1-like tRNA deaminases that are present across eukaryotes. These in turn belong to the nucleotide/nucleic acid deaminase superfamily and are characterized by a distinct insert between the two conserved cysteines that are involved in binding zinc." 
Q#12478 - CGI_10010329 superfamily 243092 312 427 9.54E-11 61.1968 cl02567 WD40 superfamily NC - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#12479 - CGI_10010330 superfamily 242406 153 300 5.69E-25 98.4324 cl01271 DUF1768 superfamily - - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. Q#12481 - CGI_10010332 superfamily 241563 40 80 7.05E-07 46.7036 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#12482 - CGI_10010333 superfamily 243092 19 259 7.72E-43 155.186 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#12482 - CGI_10010333 superfamily 222150 388 412 8.72E-05 40.4529 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#12482 - CGI_10010333 superfamily 222150 416 438 9.36E-05 40.0677 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#12482 - CGI_10010333 superfamily 222150 445 468 0.000104732 40.0677 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#12483 - CGI_10010334 superfamily 247916 115 172 1.84E-05 43.9107 cl17362 Transglut_core superfamily - - "Transglutaminase-like superfamily; This family includes animal transglutaminases and other bacterial proteins of unknown function. 
Sequence conservation in this superfamily primarily involves three motifs that centre around conserved cysteine, histidine, and aspartate residues that form the catalytic triad in the structurally characterized transglutaminase, the human blood clotting factor XIIIa'. On the basis of the experimentally demonstrated activity of the Methanobacterium phage pseudomurein endoisopeptidase, it is proposed that many, if not all, microbial homologues of the transglutaminases are proteases and that the eukaryotic transglutaminases have evolved from an ancestral protease." Q#12484 - CGI_10010335 superfamily 247916 127 185 1.10E-06 47.7627 cl17362 Transglut_core superfamily - - "Transglutaminase-like superfamily; This family includes animal transglutaminases and other bacterial proteins of unknown function. Sequence conservation in this superfamily primarily involves three motifs that centre around conserved cysteine, histidine, and aspartate residues that form the catalytic triad in the structurally characterized transglutaminase, the human blood clotting factor XIIIa'. On the basis of the experimentally demonstrated activity of the Methanobacterium phage pseudomurein endoisopeptidase, it is proposed that many, if not all, microbial homologues of the transglutaminases are proteases and that the eukaryotic transglutaminases have evolved from an ancestral protease." Q#12485 - CGI_10010336 superfamily 247916 131 193 1.03E-08 53.9259 cl17362 Transglut_core superfamily - - "Transglutaminase-like superfamily; This family includes animal transglutaminases and other bacterial proteins of unknown function. Sequence conservation in this superfamily primarily involves three motifs that centre around conserved cysteine, histidine, and aspartate residues that form the catalytic triad in the structurally characterized transglutaminase, the human blood clotting factor XIIIa'. 
On the basis of the experimentally demonstrated activity of the Methanobacterium phage pseudomurein endoisopeptidase, it is proposed that many, if not all, microbial homologues of the transglutaminases are proteases and that the eukaryotic transglutaminases have evolved from an ancestral protease." Q#12486 - CGI_10010337 superfamily 247916 129 187 1.08E-05 44.6811 cl17362 Transglut_core superfamily - - "Transglutaminase-like superfamily; This family includes animal transglutaminases and other bacterial proteins of unknown function. Sequence conservation in this superfamily primarily involves three motifs that centre around conserved cysteine, histidine, and aspartate residues that form the catalytic triad in the structurally characterized transglutaminase, the human blood clotting factor XIIIa'. On the basis of the experimentally demonstrated activity of the Methanobacterium phage pseudomurein endoisopeptidase, it is proposed that many, if not all, microbial homologues of the transglutaminases are proteases and that the eukaryotic transglutaminases have evolved from an ancestral protease." Q#12490 - CGI_10010342 superfamily 199166 64 205 9.42E-30 112.036 cl15308 AMN1 superfamily C - "Antagonist of mitotic exit network protein 1; Amn1 has been functionally characterized in Saccharomyces cerevisiae as a component of the Antagonist of MEN pathway (AMEN). The AMEN network is activated by MEN (mitotic exit network) via an active Cdc14, and in turn switches off MEN. Amn1 constitutes one of the alternative mechanisms by which MEN may be disrupted. Specifically, Amn1 binds Tem1 (Termination of M-phase, a GTPase that belongs to the RAS superfamily), and disrupts its association with Cdc15, the primary downstream target. Amn1 is a leucine-rich repeat (LRR) protein, with 12 repeats in the S. cerevisiae ortholog. As a negative regulator of the signal transduction pathway MEN, overexpression of AMN1 slows the growth of wild type cells. 
The function of the vertebrate members of this family has not been determined experimentally, they have fewer LRRs that determine the extent of this model." Q#12491 - CGI_10010343 superfamily 243047 14 118 7.01E-44 152.11 cl02464 ArfGap superfamily - - "Putative GTPase activating protein for Arf; Putative zinc fingers with GTPase activating proteins (GAPs) towards the small GTPase, Arf. The GAP of ARD1 stimulates GTPase hydrolysis for ARD1 but not ARFs." Q#12493 - CGI_10008109 superfamily 247866 37 253 6.20E-48 162.237 cl17312 PhyH superfamily - - "Phytanoyl-CoA dioxygenase (PhyH); This family is made up of several eukaryotic phytanoyl-CoA dioxygenase (PhyH) proteins, ectoine hydroxylases and a number of bacterial deoxygenases. PhyH is a peroxisomal enzyme catalyzing the first step of phytanic acid alpha-oxidation. PhyH deficiency causes Refsum's disease (RD) which is an inherited neurological syndrome biochemically characterized by the accumulation of phytanic acid in plasma and tissues." Q#12494 - CGI_10008110 superfamily 246669 110 220 1.35E-54 181.61 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. 
C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#12494 - CGI_10008110 superfamily 246669 1 98 1.27E-38 137.701 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#12494 - CGI_10008110 superfamily 241578 232 486 1.58E-102 312.769 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. 
There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#12496 - CGI_10001409 superfamily 241574 227 369 1.34E-10 59.5217 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#12496 - CGI_10001409 superfamily 238012 29 61 0.00387515 35.0226 cl11390 EGF_Lam superfamily N - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#12497 - CGI_10010732 superfamily 213107 27 61 8.62E-07 44.1881 cl02594 DD_R_PKA superfamily - - "Dimerization/Docking domain of the Regulatory subunit of cAMP-dependent protein kinase and similar domains; cAMP-dependent protein kinase (PKA) is a serine/threonine kinase (STK), catalyzing the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. 
The inactive PKA holoenzyme is a heterotetramer composed of two phosphorylated and active catalytic subunits with a dimer of regulatory (R) subunits. Activation is achieved through the binding of the important second messenger cAMP to the R subunits, which leads to the dissociation of PKA into the R dimer and two active subunits. There are two classes of R subunits, RI and RII; each exists as two isoforms (alpha and beta) from distinct genes. These functionally non-redundant R isoforms allow for specificity in PKA signaling. The R subunit contains an N-terminal dimerization/docking (D/D) domain, a linker with an inhibitory sequence (IS), and two c-AMP binding domains. RI and RII subunits are distinguished by their IS; RII subunits contain a phosphorylation site and are both substrates and inhibitors while RI subunits are pseudo-substrates. RI subunits require ATP and Mg ions to form a stable holoenzyme while RII subunits do not. The D/D domain dimerizes to form a four-helix bundle that serves as a docking site for A-kinase-anchoring proteins (AKAPs), which facilitates the localization of PKA to specific sites in the cell. PKA is present ubiquitously in cells and interacts with many different downstream targets. It plays a role in the regulation of diverse processes such as growth, development, memory, metabolism, gene expression, immunity, and lipolysis." Q#12498 - CGI_10010733 superfamily 216966 23 159 8.06E-24 93.9278 cl03523 HORMA superfamily N - "HORMA domain; The HORMA (for Hop1p, Rev7p and MAD2) domain has been suggested to recognise chromatin states that result from DNA adducts, double stranded breaks or non-attachment to the spindle and acts as an adaptor that recruits other proteins. MAD2 is a spindle checkpoint protein which prevents progression of the cell cycle upon detection of a defect in mitotic spindle integrity." 
Q#12503 - CGI_10010738 superfamily 245201 53 300 6.08E-56 183.592 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#12505 - CGI_10010740 superfamily 241805 70 145 1.54E-25 94.107 cl00349 S15_NS1_EPRS_RNA-bind superfamily - - "S15/NS1/EPRS_RNA-binding domain. This short domain consists of a helix-turn-helix structure, which can bind to several types of RNA. It is found in the ribosomal protein S15, the influenza A viral nonstructural protein (NSA) and in several eukaryotic aminoacyl tRNA synthetases (aaRSs), where it occurs as a single or a repeated unit. It is involved in both protein-RNA interactions by binding tRNA and protein-protein interactions in the formation of tRNA-synthetases into multienzyme complexes. While this domain lacks significant sequence similarity between the subgroups in which it is found, they share similar electrostatic surface potentials and thus are likely to bind to RNA via the same mechanism." Q#12505 - CGI_10010740 superfamily 191937 1 60 3.71E-33 113.057 cl06898 Ribosomal_S13_N superfamily - - Ribosomal S13/S15 N-terminal domain; This domain is found at the N-terminus of ribosomal S13 and S15 proteins. This domain is also identified as NUC021. Q#12506 - CGI_10010741 superfamily 245206 1 309 1.34E-138 405.143 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. 
The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#12506 - CGI_10010741 superfamily 176932 334 425 1.49E-34 124.975 cl03838 FAR_C superfamily - - "C-terminal domain of fatty acyl CoA reductases; C-terminal domain of fatty acyl CoA reductases, a family of SDR-like proteins. SDRs or short-chain dehydrogenases/reductases are Rossmann-fold NAD(P)H-binding proteins. Many proteins in this FAR_C family may function as fatty acyl-CoA reductases (FARs), acting on medium and long chain fatty acids, and have been reported to be involved in diverse processes such as the biosynthesis of insect pheromones, plant cuticular wax production, and mammalian wax biosynthesis. In Arabidopsis thaliana, proteins with this particular architecture have also been identified as the MALE STERILITY 2 (MS2) gene product, which is implicated in male gametogenesis. Mutations in MS2 inhibit the synthesis of exine (sporopollenin), rendering plants unable to reduce pollen wall fatty acids to corresponding alcohols. The function of this C-terminal domain is unclear." Q#12507 - CGI_10010742 superfamily 247824 36 271 1.16E-76 237.961 cl17270 APH_ChoK_like superfamily - - "Aminoglycoside 3'-phosphotransferase (APH) and Choline Kinase (ChoK) family. 
The APH/ChoK family is part of a larger superfamily that includes the catalytic domains of other kinases, such as the typical serine/threonine/tyrosine protein kinases (PKs), RIO kinases, actin-fragmin kinase (AFK), and phosphoinositide 3-kinase (PI3K). The family is composed of APH, ChoK, ethanolamine kinase (ETNK), macrolide 2'-phosphotransferase (MPH2'), an unusual homoserine kinase, and uncharacterized proteins with similarity to the N-terminal domain of acyl-CoA dehydrogenase 10 (ACAD10). The members of this family catalyze the transfer of the gamma-phosphoryl group from ATP (or CTP) to small molecule substrates such as aminoglycosides, macrolides, choline, ethanolamine, and homoserine. Phosphorylation of the antibiotics, aminoglycosides and macrolides, leads to their inactivation and to bacterial antibiotic resistance. Phosphorylation of choline, ethanolamine, and homoserine serves as precursors to the synthesis of important biological compounds, such as the major phospholipids, phosphatidylcholine and phosphatidylethanolamine and the amino acids, threonine, methionine, and isoleucine." Q#12508 - CGI_10010743 superfamily 247069 6 158 1.09E-14 72.7614 cl15787 SEC14 superfamily - - "Sec14p-like lipid-binding domain. Found in secretory proteins, such as S. cerevisiae phosphatidylinositol transfer protein (Sec14p), and in lipid regulated proteins such as RhoGAPs, RhoGEFs and neurofibromin (NF1). SEC14 domain of Dbl is known to associate with G protein beta/gamma subunits." Q#12513 - CGI_10001294 superfamily 219542 37 149 6.77E-36 130.825 cl18517 Cu-oxidase_3 superfamily - - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. Q#12513 - CGI_10001294 superfamily 219541 465 602 5.07E-22 92.5315 cl18516 Cu-oxidase_2 superfamily N - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. 
Q#12513 - CGI_10001294 superfamily 215896 157 340 3.44E-18 81.9576 cl18351 Cu-oxidase superfamily - - Multicopper oxidase; Many of the proteins in this family contain multiple similar copies of this plastocyanin-like domain. Q#12516 - CGI_10019783 superfamily 241622 935 1035 2.12E-19 85.311 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archebacteria, and eurkayotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#12516 - CGI_10019783 superfamily 241622 833 916 3.61E-17 78.7626 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. 
Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archebacteria, and eurkayotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#12516 - CGI_10019783 superfamily 241622 1127 1184 2.05E-16 76.4514 cl00117 PDZ superfamily N - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archebacteria, and eurkayotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." 
Q#12516 - CGI_10019783 superfamily 241622 410 487 8.15E-08 51.4135 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archebacteria, and eurkayotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#12516 - CGI_10019783 superfamily 241622 237 314 1.30E-06 47.9467 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archebacteria, and eurkayotes. 
This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#12516 - CGI_10019783 superfamily 241647 137 164 2.82E-05 42.9002 cl00157 WW superfamily - - Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs. Q#12517 - CGI_10019784 superfamily 241622 32 96 1.90E-06 44.4799 cl00117 PDZ superfamily C - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archebacteria, and eurkayotes. 
This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#12517 - CGI_10019784 superfamily 247744 133 207 5.75E-24 95.8227 cl17190 NK superfamily C - "Nucleoside/nucleotide kinase (NK) is a protein superfamily consisting of multiple families of enzymes that share structural similarity and are functionally related to the catalysis of the reversible phosphate group transfer from nucleoside triphosphates to nucleosides/nucleotides, nucleoside monophosphates, or sugars. Members of this family play a wide variety of essential roles in nucleotide metabolism, the biosynthesis of coenzymes and aromatic compounds, as well as the metabolism of sugar and sulfate." Q#12519 - CGI_10019786 superfamily 241563 68 104 5.44E-06 42.4664 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#12519 - CGI_10019786 superfamily 241563 10 47 0.000117813 38.8575 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#12523 - CGI_10019790 superfamily 247044 9 119 2.21E-42 138.153 cl15697 ADF_gelsolin superfamily - - Actin depolymerization factor/cofilin- and gelsolin-like domains; Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. Q#12524 - CGI_10019791 superfamily 246669 127 254 7.91E-11 60.495 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#12529 - CGI_10019796 superfamily 198898 2 59 2.36E-24 95.1274 cl07406 c-SKI_SMAD_bind superfamily N - c-SKI Smad4 binding domain; c-SKI is an oncoprotein that inhibits TGF-beta signaling through interaction with Smad proteins. This domain binds to Smad4 Q#12530 - CGI_10019797 superfamily 245213 544 583 2.84E-05 43.009 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12530 - CGI_10019797 superfamily 246918 397 452 1.52E-10 58.7523 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#12530 - CGI_10019797 superfamily 245814 464 541 3.09E-07 49.8113 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#12530 - CGI_10019797 superfamily 246918 331 374 3.22E-05 43.3443 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#12532 - CGI_10019799 superfamily 241706 97 178 2.43E-35 120.74 cl00229 eIF1_SUI1_like superfamily - - "Eukaryotic initiation factor 1 and related proteins; Members of the eIF1/SUI1 (eukaryotic initiation factor 1) family are found in eukaryotes, archaea, and some bacteria; eukaryotic members are understood to play an important role in accurate initiator codon recognition during translation initiation. eIF1 interacts with 18S rRNA in the 40S ribosomal subunit during eukaryotic translation initiation. 
Point mutations in the yeast eIF1 implicate the protein in maintaining accurate start-site selection but its mechanism of action is unknown. The function of non-eukaryotic family members is also unclear." Q#12534 - CGI_10019801 superfamily 247741 48 378 0 622.347 cl17187 Aldolase_Class_I superfamily - - "Class I aldolases; Class I aldolases. The class I aldolases use an active-site lysine which stabilizes a reaction intermediates via Schiff base formation, and have TIM beta/alpha barrel fold. The members of this family include 2-keto-3-deoxy-6-phosphogluconate (KDPG) and 2-keto-4-hydroxyglutarate (KHG) aldolases, transaldolase, dihydrodipicolinate synthase sub-family, Type I 3-dehydroquinate dehydratase, DeoC and DhnA proteins, and metal-independent fructose-1,6-bisphosphate aldolase. Although structurally similar, the class II aldolases use a different mechanism and are believed to have an independent evolutionary origin." Q#12535 - CGI_10019802 superfamily 242670 11 148 3.95E-60 185.499 cl01729 VKOR superfamily - - "Vitamin K epoxide reductase (VKOR) family; VKOR (also named VKORC1) is an integral membrane protein that catalyzes the reduction of vitamin K 2,3-epoxide and vitamin K to vitamin K hydroquinone, an essential co-factor subsequently used in the gamma-carboxylation of glutamic acid residues in blood coagulation enzymes. This family includes enzymes that are present in vertebrates, Drosophila, plants, bacteria, and archaea. All homologs of VKOR contain an active site CXXC motif, which is switched between reduced and disulfide-bonded states during the reaction cycle. In some plant and bacterial homologs, the VKOR domain is fused with domains of the thioredoxin family of oxidoreductases which may function as redox partners in initiating the reduction cascade. 
Warfarin, a widely used oral anticoagulant used in medicine as well as rodenticides, inhibits the activity of VKOR, resulting in decreased levels of reduced vitamin K, which is required for the function of several clotting factors. However, anticoagulation effect of warfarin is significantly associated with polymorphism of certain genes, including VKORC1. Interestingly, in rodents, an adaptive trait appears to have evolved convergently by selection on new or standing genetic polymorphisms in VKORC1 as well as by adaptive introgressive hybridization between species, likely brought about by human-mediated dispersal." Q#12536 - CGI_10019803 superfamily 243175 40 133 1.06E-24 93.0467 cl02776 GST_C_family superfamily N - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. 
Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." Q#12536 - CGI_10019803 superfamily 241832 4 38 3.79E-16 69.194 cl00388 Thioredoxin_like superfamily C - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." 
Q#12537 - CGI_10019804 superfamily 243175 93 219 8.38E-44 145.434 cl02776 GST_C_family superfamily - - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." 
Q#12537 - CGI_10019804 superfamily 241832 4 79 9.81E-34 117.729 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#12540 - CGI_10019807 superfamily 247724 66 268 3.86E-75 236.663 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. 
The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#12541 - CGI_10019808 superfamily 245814 803 876 9.92E-11 60.1955 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#12541 - CGI_10019808 superfamily 245814 619 672 2.84E-05 44.0171 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#12541 - CGI_10019808 superfamily 245814 900 979 4.74E-20 87.5495 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. 
The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#12541 - CGI_10019808 superfamily 245814 700 779 1.29E-19 86.1148 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#12541 - CGI_10019808 superfamily 245814 375 461 5.59E-09 55.1442 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#12542 - CGI_10019809 superfamily 241563 71 109 7.08E-06 43.622 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#12543 - CGI_10019810 superfamily 241563 68 101 0.000855702 37.844 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#12544 - CGI_10019811 superfamily 245847 43 184 4.32E-20 82.6045 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#12546 - CGI_10019813 superfamily 217473 61 291 1.01E-23 101.288 cl03978 Mab-21 superfamily - - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#12547 - CGI_10019814 superfamily 241832 126 215 1.40E-08 50.3846 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). 
Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#12551 - CGI_10004586 superfamily 241564 77 146 1.38E-23 88.4767 cl00035 BIR superfamily - - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." Q#12553 - CGI_10012268 superfamily 192535 49 235 7.03E-05 41.8126 cl18179 7TM_GPCR_Srsx superfamily C - Serpentine type 7TM GPCR chemoreceptor Srsx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srsx is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#12555 - CGI_10012270 superfamily 215866 38 153 1.11E-05 43.468 cl18349 Arrestin_N superfamily - - "Arrestin (or S-antigen), N-terminal domain; Ig-like beta-sandwich fold. Scop reports duplication with C-terminal domain." Q#12556 - CGI_10012271 superfamily 245716 64 86 0.000737638 36.4533 cl11592 zf-CCCH superfamily - - Zinc finger C-x8-C-x5-C-x3-H type (and similar); Zinc finger C-x8-C-x5-C-x3-H type (and similar). Q#12556 - CGI_10012271 superfamily 245716 120 144 0.00346063 34.5273 cl11592 zf-CCCH superfamily - - Zinc finger C-x8-C-x5-C-x3-H type (and similar); Zinc finger C-x8-C-x5-C-x3-H type (and similar). 
Q#12558 - CGI_10012273 superfamily 241640 65 292 1.58E-100 296.88 cl00149 Tryp_SPc superfamily - - Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues. Q#12559 - CGI_10012274 superfamily 241640 67 312 1.22E-73 230.626 cl00149 Tryp_SPc superfamily - - Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues. Q#12560 - CGI_10012275 superfamily 245864 146 378 9.04E-65 215.99 cl12078 p450 superfamily N - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." 
Q#12560 - CGI_10012275 superfamily 245864 43 195 0.00386589 37.643 cl12078 p450 superfamily C - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#12562 - CGI_10012277 superfamily 247724 71 261 1.21E-113 327.593 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#12563 - CGI_10012278 superfamily 247724 41 79 5.03E-17 72.5907 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. 
Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#12566 - CGI_10003862 superfamily 243098 1434 1479 2.02E-12 64.5415 cl02573 TUDOR superfamily - - "Tudor domains are found in many eukaryotic organisms and have been implicated in protein-protein interactions in which methylated protein substrates bind to these domains. For example, the Tudor domain of Survival of Motor Neuron (SMN) binds to symmetrically dimethylated arginines of arginine-glycine (RG) rich sequences found in the C-terminal tails of Sm proteins. The SMN protein is linked to spinal muscular atrophy. Another example is the tandem tudor domains of 53BP1, which bind to histone H4 specifically dimethylated at Lys20 (H4-K20me2). 53BP1 is a key transducer of the DNA damage checkpoint signal." Q#12566 - CGI_10003862 superfamily 243098 1105 1168 2.49E-08 52.6003 cl02573 TUDOR superfamily - - "Tudor domains are found in many eukaryotic organisms and have been implicated in protein-protein interactions in which methylated protein substrates bind to these domains. For example, the Tudor domain of Survival of Motor Neuron (SMN) binds to symmetrically dimethylated arginines of arginine-glycine (RG) rich sequences found in the C-terminal tails of Sm proteins. The SMN protein is linked to spinal muscular atrophy. Another example is the tandem tudor domains of 53BP1, which bind to histone H4 specifically dimethylated at Lys20 (H4-K20me2). 53BP1 is a key transducer of the DNA damage checkpoint signal." 
Q#12567 - CGI_10003863 superfamily 243092 3 321 2.26E-68 217.588 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#12568 - CGI_10003864 superfamily 246925 55 196 0.00052423 39.261 cl15309 LRR_RI superfamily NC - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." 
Q#12569 - CGI_10003865 superfamily 248300 265 322 1.79E-05 41.4497 cl17746 RAP superfamily - - "RAP domain; This domain is found in various eukaryotic species, where it is found in proteins that are important in various parasite-host cell interactions. It is thought to be an RNA-binding domain. The domain is involved in plant defence in response to bacterial infection." Q#12570 - CGI_10003866 superfamily 241862 379 617 2.37E-24 102.82 cl00437 COG0428 superfamily - - Predicted divalent heavy-metal cations transporter [Inorganic ion transport and metabolism] Q#12572 - CGI_10005750 superfamily 241568 907 962 1.68E-11 61.3248 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#12572 - CGI_10005750 superfamily 243124 98 256 5.30E-41 149.114 cl02648 NIDO superfamily - - Nidogen-like; This is a nidogen-like domain (NIDO) domain and is an extracellular domain found in nidogen and hypothetical proteins of unknown function. Q#12572 - CGI_10005750 superfamily 155088 477 618 5.32E-17 79.9474 cl02758 AMOP superfamily - - AMOP domain; This domain may have a role in cell adhesion. It is called the AMOP domain after Adhesion associated domain in MUC4 and Other Proteins. This domain is extracellular and contains a number of cysteines that probably form disulphide bridges. 
Q#12572 - CGI_10005750 superfamily 243065 636 783 0.000322497 41.2333 cl02516 VWD superfamily - - von Willebrand factor type D domain; Luciferin-2-monooxygenase from Vargula hilgendorfii contains a vwd domain. Its function is unrelated but the similarity is very strong by several methods. Q#12573 - CGI_10005751 superfamily 218493 443 590 9.40E-50 173.312 cl08434 GMC_oxred_C superfamily - - GMC oxidoreductase; This domain found associated with pfam00732. Q#12577 - CGI_10000986 superfamily 241600 19 229 9.79E-100 291.837 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#12578 - CGI_10000713 superfamily 247756 6 243 5.20E-80 242.663 cl17202 HAD superfamily - - haloacid dehalogenase-like hydrolase; haloacid dehalogenase-like hydrolase. Q#12580 - CGI_10024202 superfamily 241563 59 98 1.51E-05 43.622 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#12580 - CGI_10024202 superfamily 110440 483 509 0.00171223 37.3873 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. 
It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#12580 - CGI_10024202 superfamily 128778 109 215 0.00310615 37.6295 cl17972 BBC superfamily - - B-Box C-terminal domain; Coiled coil region C-terminal to (some) B-Box domains Q#12582 - CGI_10024204 superfamily 245201 13 255 1.91E-106 311.778 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#12583 - CGI_10024205 superfamily 245201 93 262 6.28E-74 228.501 cl09925 PKc_like superfamily C - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." 
Q#12583 - CGI_10024205 superfamily 246908 17 45 4.59E-09 51.7595 cl15255 SH2 superfamily C - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." Q#12585 - CGI_10024207 superfamily 241600 101 272 4.33E-71 220.19 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#12586 - CGI_10024208 superfamily 241782 23 251 5.98E-105 324.719 cl00321 AAT_I superfamily C - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. 
PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phophorylase family (fold type V)." Q#12586 - CGI_10024208 superfamily 241782 255 325 7.08E-29 117.482 cl00321 AAT_I superfamily N - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. 
Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phophorylase family (fold type V)." Q#12588 - CGI_10024210 superfamily 222150 907 932 8.43E-06 44.3049 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#12588 - CGI_10024210 superfamily 222150 879 902 8.07E-05 41.2233 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#12588 - CGI_10024210 superfamily 197676 865 887 0.00200998 37.4453 cl18194 ZnF_C2H2 superfamily - - zinc finger; zinc finger. Q#12588 - CGI_10024210 superfamily 222150 787 810 0.00977989 35.0601 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#12589 - CGI_10024211 superfamily 243035 754 870 1.82E-24 100.387 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. 
Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. 
A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#12589 - CGI_10024211 superfamily 216897 498 576 7.33E-21 88.8924 cl03463 Gal_Lectin superfamily - - Galactose binding lectin domain; Galactose binding lectin domain. Q#12589 - CGI_10024211 superfamily 245847 625 741 1.35E-13 69.3033 cl12042 FA58C superfamily C - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#12589 - CGI_10024211 superfamily 243119 931 979 0.000213554 40.505 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#12589 - CGI_10024211 superfamily 203209 19 169 0.000260232 42.0657 cl12305 STOP superfamily C - "STOP protein; Neurons contain abundant subsets of highly stable microtubules that resist de-polymerising conditions such as exposure to the cold. Stable microtubules are thought to be essential for neuronal development, maintenance, and function. STOP is a major factor responsible for the intriguing stability properties of neuronal microtubules and is important for synaptic plasticity. 
Additionally knowledge of STOPs function and properties may help in the treatment of neuroleptics in illnesses such as schizophrenia, currently thought to result from synaptic defects." Q#12590 - CGI_10024212 superfamily 215866 4 141 0.00236167 36.5344 cl18349 Arrestin_N superfamily - - "Arrestin (or S-antigen), N-terminal domain; Ig-like beta-sandwich fold. Scop reports duplication with C-terminal domain." Q#12591 - CGI_10024213 superfamily 248469 177 227 1.55E-08 51.2167 cl17915 HAD_like superfamily N - "Haloacid dehalogenase-like hydrolases. The haloacid dehalogenase-like (HAD) superfamily includes L-2-haloacid dehalogenase, epoxide hydrolase, phosphoserine phosphatase, phosphomannomutase, phosphoglycolate phosphatase, P-type ATPase, and many others, all of which use a nucleophilic aspartate in their phosphoryl transfer reaction. All members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. Members of this superfamily are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases." Q#12591 - CGI_10024213 superfamily 248469 8 120 0.000185851 39.2755 cl17915 HAD_like superfamily - - "Haloacid dehalogenase-like hydrolases. The haloacid dehalogenase-like (HAD) superfamily includes L-2-haloacid dehalogenase, epoxide hydrolase, phosphoserine phosphatase, phosphomannomutase, phosphoglycolate phosphatase, P-type ATPase, and many others, all of which use a nucleophilic aspartate in their phosphoryl transfer reaction. All members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. Members of this superfamily are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases." 
Q#12596 - CGI_10024218 superfamily 247736 364 445 3.83E-07 47.7094 cl17182 NAT_SF superfamily - - "N-Acyltransferase superfamily: Various enzymes that characteristically catalyze the transfer of an acyl group to a substrate; NAT (N-Acyltransferase) is a large superfamily of enzymes that mostly catalyze the transfer of an acyl group to a substrate and are implicated in a variety of functions, ranging from bacterial antibiotic resistance to circadian rhythms in mammals. Members include GCN5-related N-Acetyltransferases (GNAT) such as Aminoglycoside N-acetyltransferases, Histone N-acetyltransferase (HAT) enzymes, and Serotonin N-acetyltransferase, which catalyze the transfer of an acetyl group to a substrate. The kinetic mechanism of most GNATs involves the ordered formation of a ternary complex: the reaction begins with Acetyl Coenzyme A (AcCoA) binding, followed by binding of substrate, then direct transfer of the acetyl group from AcCoA to the substrate, followed by product and subsequent CoA release. Other family members include Arginine/ornithine N-succinyltransferase, Myristoyl-CoA: protein N-myristoyltransferase, and Acyl-homoserinelactone synthase which have a similar catalytic mechanism but differ in types of acyl groups transferred. Leucyl/phenylalanyl-tRNA-protein transferase and FemXAB nonribosomal peptidyltransferases which catalyze similar peptidyltransferase reactions are also included." Q#12597 - CGI_10024219 superfamily 245206 1315 1546 1.46E-59 212.53 cl09931 NADB_Rossmann superfamily N - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. 
Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#12597 - CGI_10024219 superfamily 244888 176 461 1.45E-64 224.204 cl08282 Acyl_transf_1 superfamily - - Acyl transferase domain; Acyl transferase domain. Q#12597 - CGI_10024219 superfamily 245210 1 62 2.02E-15 79.9094 cl09938 cond_enzymes superfamily N - "Condensing enzymes; Family of enzymes that catalyze a (decarboxylating or non-decarboxylating) Claisen-like condensation reaction. Members are share strong structural similarity, and are involved in the synthesis and degradation of fatty acids, and the production of polyketides, a diverse group of natural products." Q#12597 - CGI_10024219 superfamily 245209 1585 1662 0.000265039 41.4669 cl09936 PP-binding superfamily - - Phosphopantetheine attachment site; A 4'-phosphopantetheine prosthetic group is attached through a serine. This prosthetic group acts as a a 'swinging arm' for the attachment of activated fatty acid and amino-acid groups. This domain forms a four helix bundle. This family includes members not included in Prosite. The inclusion of these members is supported by sequence analysis and functional evidence. The related domain of Vibrio anguillarum angR has the attachment serine replaced by an alanine. Q#12598 - CGI_10024220 superfamily 245210 2 297 2.74E-120 357.638 cl09938 cond_enzymes superfamily C - "Condensing enzymes; Family of enzymes that catalyze a (decarboxylating or non-decarboxylating) Claisen-like condensation reaction. 
Members are share strong structural similarity, and are involved in the synthesis and degradation of fatty acids, and the production of polyketides, a diverse group of natural products." Q#12600 - CGI_10024222 superfamily 243061 44 88 4.65E-21 80.849 cl02509 SRCR superfamily C - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#12601 - CGI_10024223 superfamily 243091 68 177 2.34E-12 65.0483 cl02566 SET superfamily - - "SET domain; SET domains are protein lysine methyltransferase enzymes. SET domains appear to be protein-protein interaction domains. It has been demonstrated that SET domains mediate interactions with a family of proteins that display similarity with dual-specificity phosphatases (dsPTPases). A subset of SET domains have been called PR domains. These domains are divergent in sequence from other SET domains, but also appear to mediate protein-protein interaction. The SET domain consists of two regions known as SET-N and SET-C. SET-C forms an unusual and conserved knot-like structure of probably functional importance. Additionally to SET-N and SET-C, an insert region (SET-I) and flanking regions of high structural variability form part of the overall structure." Q#12601 - CGI_10024223 superfamily 222150 549 574 1.80E-05 43.1493 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#12601 - CGI_10024223 superfamily 222150 577 602 1.86E-05 42.7641 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#12601 - CGI_10024223 superfamily 222150 605 630 5.81E-05 41.6085 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#12601 - CGI_10024223 superfamily 222150 633 656 0.00172066 37.3713 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. 
Q#12603 - CGI_10024225 superfamily 243072 100 224 1.00E-31 117.099 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#12603 - CGI_10024225 superfamily 243072 35 149 1.80E-28 107.855 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#12603 - CGI_10024225 superfamily 243073 333 369 3.37E-05 40.9165 cl02533 SOCS superfamily - - "SOCS (suppressors of cytokine signaling) box. The SOCS box is found in the C-terminal region of CIS/SOCS family proteins (in combination with a SH2 domain), ASBs (ankyrin repeat-containing proteins with a SOCS box), SSBs (SPRY domain-containing proteins with a SOCS box), and WSBs (WD40 repeat-containing proteins with a SOCS box), as well as, other miscellaneous proteins. The function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions." 
Q#12604 - CGI_10024226 superfamily 248338 14 126 1.95E-26 111.154 cl17784 Peptidase_C48 superfamily N - "Ulp1 protease family, C-terminal catalytic domain; This domain contains the catalytic triad Cys-His-Asn." Q#12604 - CGI_10024226 superfamily 248338 366 447 3.08E-12 64.8274 cl17784 Peptidase_C48 superfamily N - "Ulp1 protease family, C-terminal catalytic domain; This domain contains the catalytic triad Cys-His-Asn." Q#12605 - CGI_10024227 superfamily 241832 57 228 2.06E-117 335.629 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#12608 - CGI_10024230 superfamily 248097 120 241 3.78E-13 63.4382 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#12609 - CGI_10024231 superfamily 248097 82 210 4.33E-11 57.3121 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. 
Q#12610 - CGI_10024232 superfamily 248097 133 261 1.28E-09 54.2305 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#12612 - CGI_10024234 superfamily 217473 133 324 5.00E-24 102.443 cl03978 Mab-21 superfamily N - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#12619 - CGI_10016651 superfamily 245213 310 346 5.64E-07 48.0166 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12619 - CGI_10016651 superfamily 245213 538 574 7.23E-07 47.6314 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12619 - CGI_10016651 superfamily 245213 462 498 8.60E-07 47.6314 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12619 - CGI_10016651 superfamily 245213 386 422 1.97E-06 46.4758 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12619 - CGI_10016651 superfamily 245213 277 307 1.52E-05 43.7794 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12619 - CGI_10016651 superfamily 245213 356 383 0.000197272 40.6978 cl09941 EGF_CA superfamily N - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12619 - CGI_10016651 superfamily 245213 432 459 0.000371403 39.9274 cl09941 EGF_CA superfamily N - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12619 - CGI_10016651 superfamily 245213 508 535 0.00577163 36.4606 cl09941 EGF_CA superfamily N - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12619 - CGI_10016651 superfamily 243061 688 789 7.58E-35 130.155 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. 
These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#12619 - CGI_10016651 superfamily 243061 1118 1219 3.53E-34 128.229 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#12619 - CGI_10016651 superfamily 243061 579 680 1.44E-33 126.688 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#12619 - CGI_10016651 superfamily 243061 902 1003 1.96E-33 126.303 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#12619 - CGI_10016651 superfamily 243061 1010 1111 5.02E-33 125.147 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#12619 - CGI_10016651 superfamily 243061 796 895 1.78E-31 120.525 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#12620 - CGI_10016652 superfamily 245213 1500 1539 6.64E-08 51.4834 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12620 - CGI_10016652 superfamily 245213 1581 1615 1.83E-05 44.5498 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12620 - CGI_10016652 superfamily 245213 1459 1494 2.20E-05 44.1646 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12620 - CGI_10016652 superfamily 245213 1332 1367 7.91E-05 42.6238 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12620 - CGI_10016652 superfamily 245213 1791 1825 0.000126707 41.8534 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12620 - CGI_10016652 superfamily 245213 1663 1703 0.000690042 39.9274 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12620 - CGI_10016652 superfamily 245213 891 924 0.00113937 39.157 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12620 - CGI_10016652 superfamily 245213 1624 1662 0.00114961 39.157 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12620 - CGI_10016652 superfamily 243124 90 184 2.52E-12 67.0669 cl02648 NIDO superfamily C - Nidogen-like; This is a nidogen-like domain (NIDO) domain and is an extracellular domain found in nidogen and hypothetical proteins of unknown function. Q#12620 - CGI_10016652 superfamily 205157 1544 1579 9.23E-08 50.9991 cl18264 EGF_3 superfamily - - EGF domain; This family includes a variety of EGF-like domain homologues. This family includes the C-terminal domain of the malaria parasite MSP1 protein. Q#12620 - CGI_10016652 superfamily 243065 276 403 5.98E-07 50.1329 cl02516 VWD superfamily - - von Willebrand factor type D domain; Luciferin-2-monooxygenase from Vargula hilgendorfii contains a vwd domain. Its function is unrelated but the similarity is very strong by several methods. 
Q#12620 - CGI_10016652 superfamily 241578 720 763 0.000152789 43.9128 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#12620 - CGI_10016652 superfamily 241578 1197 1239 0.000349743 42.7572 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. 
Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#12620 - CGI_10016652 superfamily 245213 685 724 0.00335919 37.6116 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12620 - CGI_10016652 superfamily 241578 974 1014 0.00345407 40.0608 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." 
Q#12620 - CGI_10016652 superfamily 221695 790 812 0.00554919 37.0494 cl18612 cEGF superfamily - - "Complement Clr-like EGF-like; cEGF, or complement Clr-like EGF, domains have six conserved cysteine residues disulfide-bonded into the characteristic pattern 'ababcc'. They are found in blood coagulation proteins such as fibrillin, Clr and Cls, thrombomodulin, and the LDL receptor. The core fold of the EGF domain consists of two small beta-hairpins packed against each other. Two major structural variants have been identified based on the structural context of the C-terminal cysteine residue of disulfide 'c' in the C-terminal hairpin: hEGFs and cEGFs. In cEGFs the C-terminal thiol resides on the C-terminal beta-sheet, resulting in long loop-lengths between the cysteine residues of disulfide 'c', typically C[10+]XC. These longer loop-lengths may have arisen by selective cysteine loss from a four-disulfide EGF template such as laminin or integrin. Tandem cEGF domains have five linking residues between terminal cysteines of adjacent domains. cEGF domains may or may not bind calcium in the linker region. cEGF domains with the consensus motif CXN4X[F,Y]XCXC are hydroxylated exclusively on the asparagine residue." Q#12620 - CGI_10016652 superfamily 245213 1064 1096 0.00743446 36.8412 cl09941 EGF_CA superfamily C - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#12621 - CGI_10016653 superfamily 241599 119 213 7.03E-08 48.7789 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#12623 - CGI_10016657 superfamily 219593 93 158 1.16E-22 94.0646 cl06721 Abi_HHR superfamily C - "Abl-interactor HHR; The region featured in this family is found towards the N-terminus of a number of adaptor proteins that interact with Abl-family tyrosine kinases. More specifically, it is termed the homeo-domain homologous region (HHR), as it is similar to the DNA-binding region of homeo-domain proteins. Other homeo-domain proteins have been implicated in specifying positional information during embryonic development, and in the regulation of the expression of cell-type specific genes. The Abl-interactor proteins are thought to coordinate the cytoplasmic and nuclear functions of the Abl-family kinases, and seem to be involved in cytoskeletal reorganisation, but their precise role remains unclear." Q#12623 - CGI_10016657 superfamily 247683 374 406 1.94E-13 66.576 cl17036 SH3 superfamily C - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. 
Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#12623 - CGI_10016657 superfamily 241624 504 683 1.09E-09 58.1006 cl00120 PP2Cc superfamily - - "Serine/threonine phosphatases, family 2C, catalytic domain; The protein architecture and deduced catalytic mechanism of PP2C phosphatases are similar to the PP1, PP2A, PP2B family of protein Ser/Thr phosphatases, with which PP2C shares no sequence similarity." Q#12624 - CGI_10016658 superfamily 191103 18 135 1.56E-36 140.477 cl04777 TIMELESS superfamily C - Timeless protein; The timeless gene in Drosophila melanogaster and its homologues in a number of other insects and mammals (including human) are involved in circadian rhythm control. This family includes related proteins from a number of fungal species. Q#12624 - CGI_10016658 superfamily 191103 142 197 7.53E-11 63.0523 cl04777 TIMELESS superfamily N - Timeless protein; The timeless gene in Drosophila melanogaster and its homologues in a number of other insects and mammals (including human) are involved in circadian rhythm control. This family includes related proteins from a number of fungal species. Q#12625 - CGI_10016659 superfamily 246925 97 252 6.67E-06 47.3502 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." 
Q#12625 - CGI_10016659 superfamily 214507 485 527 7.13E-05 41.2616 cl15307 LRRCT superfamily - - Leucine rich repeat C-terminal domain; Leucine rich repeat C-terminal domain. Q#12625 - CGI_10016659 superfamily 243030 22 48 0.000471221 38.3823 cl02423 LRRNT superfamily - - Leucine rich repeat N-terminal domain; Leucine Rich Repeats pfam00560 are short sequence motifs present in a number of proteins with diverse functions and cellular locations. Leucine Rich Repeats are often flanked by cysteine rich domains. This domain is often found at the N-terminus of tandem leucine rich repeats. Q#12627 - CGI_10016661 superfamily 245201 14 42 1.87E-07 45.1675 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#12628 - CGI_10016662 superfamily 245201 201 517 1.06E-42 159.03 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#12629 - CGI_10016663 superfamily 243176 368 628 4.60E-109 349.598 cl02777 chaperonin_like superfamily - - "chaperonin_like superfamily. Chaperonins are involved in productive folding of proteins. 
They share a common general morphology, a double toroid of 2 stacked rings, each composed of 7-9 subunits. There are 2 main chaperonin groups. The symmetry of type I is seven-fold and they are found in eubacteria (GroEL) and in organelles of eubacterial descent (hsp60 and RBP). The symmetry of type II is eight- or nine-fold and they are found in archaea (thermosome), thermophilic bacteria (TF55) and in the eukaryotic cytosol (CCT). Their common function is to sequester nonnative proteins inside their central cavity and promote folding by using energy derived from ATP hydrolysis. This superfamily also contains related domains from Fab1-like phosphatidylinositol 3-phosphate (PtdIns3P) 5-kinases that only contain the intermediate and apical domains." Q#12629 - CGI_10016663 superfamily 248318 108 161 1.22E-19 85.9505 cl17764 FYVE superfamily - - "FYVE domain; Zinc-binding domain; targets proteins to membrane lipids via interaction with phosphatidylinositol-3-phosphate, PI3P; present in Fab1, YOTB, Vac1, and EEA1;" Q#12629 - CGI_10016663 superfamily 243097 1678 1948 8.06E-97 316.927 cl02572 PIPKc superfamily - - "Phosphatidylinositol phosphate kinases (PIPK) catalyze the phosphorylation of phosphatidylinositol phosphate on the fourth or fifth hydroxyl of the inositol ring, to form phosphatidylinositol bisphosphate. CD alignment includes type II phosphatidylinositol phosphate kinases (PIPKII-beta), type I and II PIPK (-alpha, -beta, and -gamma) kinases and related yeast Fab1p and Mss4p kinases. Signaling by phosphorylated species of phosphatidylinositol regulates secretion, vesicular trafficking, membrane translocation, cell adhesion, chemotaxis, DNA synthesis, and cell cycling. The catalytic core domains of PIPKs are structurally similar to PI3K, PI4K, and cAMP-dependent protein kinases (PKA), the dimerization region is a unique feature of the PIPKs." 
Q#12629 - CGI_10016663 superfamily 243038 208 251 2.02E-08 53.9859 cl02442 DEP superfamily N - "DEP domain, named after Dishevelled, Egl-10, and Pleckstrin, where this domain was first discovered. The function of this domain is still not clear, but it is believed to be important for the membrane association of the signaling proteins in which it is present. New studies show that the DEP domain of Sst2, a yeast RGS protein is necessary and sufficient for receptor interaction." Q#12630 - CGI_10016664 superfamily 218241 8 222 1.57E-56 181.711 cl04723 Far-17a_AIG1 superfamily - - "FAR-17a/AIG1-like protein; This family includes the hamster androgen-induced FAR-17a protein, and its human homologue, the AIG1 protein. The function of these proteins is unknown. This family also includes homologous regions from a number of other metazoan proteins." Q#12636 - CGI_10016670 superfamily 241648 102 144 1.39E-17 75.487 cl00158 ZnF_GATA superfamily C - Zinc finger DNA binding domain; binds specifically to DNA consensus sequence [AT]GATA[AG] promoter elements; a subset of family members may also bind protein; zinc-finger consensus topology is C-X(2)-C-X(17)-C-X(2)-C Q#12636 - CGI_10016670 superfamily 241648 17 49 5.28E-11 57.3826 cl00158 ZnF_GATA superfamily C - Zinc finger DNA binding domain; binds specifically to DNA consensus sequence [AT]GATA[AG] promoter elements; a subset of family members may also bind protein; zinc-finger consensus topology is C-X(2)-C-X(17)-C-X(2)-C Q#12638 - CGI_10016672 superfamily 236582 104 280 4.10E-24 99.8231 cl18895 PRK09599 superfamily C - 6-phosphogluconate dehydrogenase-like protein; Reviewed Q#12639 - CGI_10016673 superfamily 246918 1189 1239 1.49E-06 47.9667 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#12639 - CGI_10016673 superfamily 246918 673 728 0.00376201 37.5663 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. 
Q#12642 - CGI_10016676 superfamily 245226 81 248 1.13E-17 78.4964 cl10012 DnaQ_like_exo superfamily - - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#12643 - CGI_10015924 superfamily 246925 111 378 5.73E-22 94.3445 cl15309 LRR_RI superfamily - - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. 
This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#12644 - CGI_10015925 superfamily 216939 55 120 6.59E-30 104.282 cl03492 PC4 superfamily - - Transcriptional Coactivator p15 (PC4); p15 has a bipartite structure composed of an amino-terminal regulatory domain and a carboxy-terminal cryptic DNA-binding domain. The DNA-binding activity of the carboxy-terminal is disguised by the amino-terminal p15 domain. Activity is controlled by protein kinases that target the regulatory domain. Q#12645 - CGI_10015926 superfamily 245836 303 480 1.91E-55 188.23 cl12015 Adenylation_DNA_ligase_like superfamily - - "Adenylation domain of proteins similar to ATP-dependent polynucleotide ligases; ATP-dependent polynucleotide ligases catalyze the phosphodiester bond formation of nicked nucleic acid substrates using ATP as a cofactor in a three step reaction mechanism. This family includes ATP-dependent DNA and RNA ligases. DNA ligases play a vital role in the diverse processes of DNA replication, recombination and repair. ATP-dependent DNA ligases have a highly modular architecture, consisting of a unique arrangement of two or more discrete domains, including a DNA-binding domain, an adenylation or nucleotidyltransferase (NTase) domain, and an oligonucleotide/oligosaccharide binding (OB)-fold domain. The adenylation domain binds ATP and contains many active site residues. Together with the C-terminal OB-fold domain, it comprises a catalytic core unit that is common to most members of the ATP-dependent DNA ligase family. The catalytic core contains six conserved sequence motifs (I, III, IIIa, IV, V and VI) that define this family of related nucleotidyltransferases including eukaryotic GTP-dependent mRNA-capping enzymes. The catalytic core contains both the active site as well as many DNA-binding residues. 
The RNA circularization protein from archaea and bacteria contains the minimal catalytic unit, the adenylation domain, but does not contain an OB-fold domain. This family also includes the m3G-cap binding domain of snurportin, a nuclear import adaptor that binds m3G-capped spliceosomal U small nucleoproteins (snRNPs), but doesn't have enzymatic activity." Q#12645 - CGI_10015926 superfamily 245836 58 194 2.58E-53 182.452 cl12015 Adenylation_DNA_ligase_like superfamily - - "Adenylation domain of proteins similar to ATP-dependent polynucleotide ligases; ATP-dependent polynucleotide ligases catalyze the phosphodiester bond formation of nicked nucleic acid substrates using ATP as a cofactor in a three step reaction mechanism. This family includes ATP-dependent DNA and RNA ligases. DNA ligases play a vital role in the diverse processes of DNA replication, recombination and repair. ATP-dependent DNA ligases have a highly modular architecture, consisting of a unique arrangement of two or more discrete domains, including a DNA-binding domain, an adenylation or nucleotidyltransferase (NTase) domain, and an oligonucleotide/oligosaccharide binding (OB)-fold domain. The adenylation domain binds ATP and contains many active site residues. Together with the C-terminal OB-fold domain, it comprises a catalytic core unit that is common to most members of the ATP-dependent DNA ligase family. The catalytic core contains six conserved sequence motifs (I, III, IIIa, IV, V and VI) that define this family of related nucleotidyltransferases including eukaryotic GTP-dependent mRNA-capping enzymes. The catalytic core contains both the active site as well as many DNA-binding residues. The RNA circularization protein from archaea and bacteria contains the minimal catalytic unit, the adenylation domain, but does not contain an OB-fold domain. 
This family also includes the m3G-cap binding domain of snurportin, a nuclear import adaptor that binds m3G-capped spliceosomal U small nucleoproteins (snRNPs), but doesn't have enzymatic activity." Q#12646 - CGI_10015927 superfamily 246918 171 222 0.000193423 38.3367 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#12646 - CGI_10015927 superfamily 204025 232 263 0.000782474 36.0753 cl07344 PLAC superfamily - - PLAC (protease and lacunin) domain; The PLAC (protease and lacunin) domain is a short six-cysteine region that is usually found at the C terminal of proteins. It is found in a range of proteins including PACE4 (paired basic amino acid cleaving enzyme 4) and the extracellular matrix protein lacunin. Q#12646 - CGI_10015927 superfamily 246918 112 139 0.00767071 33.533 cl15278 TSP_1 superfamily C - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#12649 - CGI_10015930 superfamily 243050 4 50 2.90E-26 99.7224 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." 
Q#12649 - CGI_10015930 superfamily 241645 221 307 1.51E-18 79.9347 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#12650 - CGI_10015931 superfamily 247724 8 181 8.87E-89 260.617 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. 
The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#12651 - CGI_10015932 superfamily 247724 23 178 1.45E-80 239.015 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#12652 - CGI_10015933 superfamily 247724 24 166 8.74E-60 185.473 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. 
The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#12653 - CGI_10015934 superfamily 247724 22 173 2.32E-84 248.645 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#12656 - CGI_10015937 superfamily 247941 7 146 6.01E-05 40.0117 cl17387 Methyltransf_21 superfamily - - "Methyltransferase FkbM domain; This family has members from bacteria to human, and appears to be a methyltransferase." Q#12662 - CGI_10015943 superfamily 248264 58 135 3.85E-13 62.2546 cl17710 DDE_4 superfamily N - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#12663 - CGI_10015944 superfamily 247805 24 137 6.26E-10 52.7248 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 
Q#12664 - CGI_10015945 superfamily 247905 1 43 5.24E-09 50.6993 cl17351 HELICc superfamily N - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#12669 - CGI_10015950 superfamily 247856 26 88 5.32E-06 41.7645 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#12669 - CGI_10015950 superfamily 247856 67 119 0.00015061 37.9125 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#12669 - CGI_10015950 superfamily 247856 99 167 0.000716299 35.9865 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. 
Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#12670 - CGI_10015951 superfamily 247856 30 87 4.72E-08 46.0017 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#12670 - CGI_10015951 superfamily 247856 62 121 0.00106976 34.0605 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#12671 - CGI_10015952 superfamily 243085 25 62 2.99E-19 80.0287 cl02557 DM superfamily - - "DM DNA binding domain; The DM domain is named after dsx and mab-3. dsx contains a single amino-terminal DM domain, whereas mab-3 contains two amino-terminal domains. The DM domain has a pattern of conserved zinc chelating residues C2H2C4. The dsx DM domain has been shown to dimerise and bind palindromic DNA." Q#12671 - CGI_10015952 superfamily 112299 198 235 3.11E-11 57.8042 cl04098 DMA superfamily - - DMRTA motif; This region is found to the C-terminus of the pfam00751. DM-domain proteins with this motif are known as DMRTA proteins. The function of this region is unknown. Q#12673 - CGI_10002933 superfamily 115634 1 226 2.03E-154 431.153 cl06169 Phage_lambda_P superfamily - - "Replication protein P; This family consists of several Bacteriophage lambda replication protein P like proteins. 
The bacteriophage lambda P protein promotes replication of the phage chromosome by recruiting a key component of the cellular replication machinery to the viral origin. Specifically, P protein delivers one or more molecules of Escherichia coli DnaB helicase to a nucleoprotein structure formed by the lambda O initiator at the lambda replication origin." Q#12675 - CGI_10004409 superfamily 241600 15 160 9.34E-39 132.749 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#12676 - CGI_10004410 superfamily 220659 2 46 5.15E-16 65.4236 cl10943 SAYSvFN superfamily N - Uncharacterized conserved domain (SAYSvFN); This domain of approximately 75 residues contains a highly conserved SATSv/iFN motif. The function is unknown but the domain is conserved from plants to humans. Q#12679 - CGI_10001597 superfamily 241574 190 350 1.10E-55 189.719 cl00053 PTPc superfamily N - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. 
Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#12679 - CGI_10001597 superfamily 241574 418 575 2.62E-13 68.7665 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#12681 - CGI_10001808 superfamily 247743 162 329 8.10E-25 98.7575 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#12683 - CGI_10008280 superfamily 217293 215 406 1.18E-39 144.313 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. 
Q#12683 - CGI_10008280 superfamily 217280 113 194 3.15E-19 87.2609 cl14953 Fe_hyd_lg_C superfamily C - "Iron only hydrogenase large subunit, C-terminal domain; Iron only hydrogenase large subunit, C-terminal domain. " Q#12684 - CGI_10008281 superfamily 243092 12 289 2.86E-48 164.815 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#12687 - CGI_10008284 superfamily 215754 196 286 2.73E-18 77.6788 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#12687 - CGI_10008284 superfamily 215754 14 94 1.66E-17 75.3676 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#12687 - CGI_10008284 superfamily 215754 95 190 3.10E-16 71.9008 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. 
Q#12688 - CGI_10008285 superfamily 248011 790 870 0.00132557 40.1726 cl17457 PKD superfamily - - "polycystic kidney disease I (PKD) domain; similar to other cell-surface modules, with an IG-like fold; domain probably functions as a ligand binding site in protein-protein or protein-carbohydrate interactions; a single instance of the repeat is presented here. The domain is also found in microbial collagenases and chitinases." Q#12688 - CGI_10008285 superfamily 241546 2671 2788 2.05E-38 143.185 cl00011 PLAT superfamily - - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats.The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#12688 - CGI_10008285 superfamily 243086 2563 2607 4.94E-08 53.1478 cl02559 GPS superfamily - - "Latrophilin/CL-1-like GPS domain; Domain present in latrophilin/CL-1, sea urchin REJ and polycystin." Q#12688 - CGI_10008285 superfamily 248011 1421 1490 1.54E-05 45.8662 cl17457 PKD superfamily - - "polycystic kidney disease I (PKD) domain; similar to other cell-surface modules, with an IG-like fold; domain probably functions as a ligand binding site in protein-protein or protein-carbohydrate interactions; a single instance of the repeat is presented here. The domain is also found in microbial collagenases and chitinases." Q#12688 - CGI_10008285 superfamily 248011 1505 1567 6.85E-05 43.9402 cl17457 PKD superfamily - - "polycystic kidney disease I (PKD) domain; similar to other cell-surface modules, with an IG-like fold; domain probably functions as a ligand binding site in protein-protein or protein-carbohydrate interactions; a single instance of the repeat is presented here. 
The domain is also found in microbial collagenases and chitinases." Q#12688 - CGI_10008285 superfamily 248011 891 951 0.000555783 41.2438 cl17457 PKD superfamily - - "polycystic kidney disease I (PKD) domain; similar to other cell-surface modules, with an IG-like fold; domain probably functions as a ligand binding site in protein-protein or protein-carbohydrate interactions; a single instance of the repeat is presented here. The domain is also found in microbial collagenases and chitinases." Q#12688 - CGI_10008285 superfamily 222537 3677 3804 0.00470176 40.2779 cl16609 DUF4271 superfamily N - Domain of unknown function (DUF4271); This family of integral membrane proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 221 and 326 amino acids in length. Q#12689 - CGI_10008286 superfamily 246976 3 645 0 728.394 cl15483 Dymeclin superfamily N - "Dyggve-Melchior-Clausen syndrome protein; Dymeclin (Dyggve-Melchior-Clausen syndrome protein) contains a large number of leucine and isoleucine residues and a total of 17 repeated dileucine motifs. It is characteristically about 700 residues long and present in plants and animals. Mutations in the gene coding for this protein in humans give rise to the disorder Dyggve-Melchior-Clausen syndrome (DMC, MIM 223800) which is an autosomal-recessive disorder characterized by the association of a spondylo-epi-metaphyseal dysplasia and mental retardation. DYM transcripts are widely expressed throughout human development and Dymeclin is not an integral membrane protein of the ER, but rather a peripheral membrane protein dynamically associated with the Golgi apparatus." Q#12690 - CGI_10001867 superfamily 245213 73 108 1.55E-10 54.1798 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12690 - CGI_10001867 superfamily 245213 111 146 8.61E-08 46.861 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12690 - CGI_10001867 superfamily 245847 152 218 1.58E-09 53.3294 cl12042 FA58C superfamily C - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#12692 - CGI_10002048 superfamily 241583 55 209 7.45E-42 146.365 cl00064 ZnMc superfamily - - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." 
Q#12693 - CGI_10002049 superfamily 241642 109 166 2.38E-05 40.5885 cl00152 t_SNARE superfamily - - "Soluble NSF (N-ethylmaleimide-sensitive fusion protein)-Attachment protein (SNAP) REceptor domain; these alpha-helical motifs form twisted and parallel heterotetrameric helix bundles; the core complex contains one helix from a protein that is anchored in the vesicle membrane (synaptobrevin), one helix from a protein of the target membrane (syntaxin), and two helices from another protein anchored in the target membrane (SNAP-25); their interaction forms a core which is composed of a polar zero layer, a flanking leucine-zipper layer acts as a water tight shield to isolate ionic interactions in the zero layer from the surrounding solvent" Q#12698 - CGI_10004512 superfamily 245213 849 881 0.000170522 41.8534 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12698 - CGI_10004512 superfamily 245213 138 173 0.000311411 41.083 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. 
Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12698 - CGI_10004512 superfamily 245213 345 377 0.00228084 38.3866 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12698 - CGI_10004512 superfamily 245213 606 637 0.00268633 38.3866 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12698 - CGI_10004512 superfamily 245213 262 294 0.00571227 37.231 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. 
Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12698 - CGI_10004512 superfamily 192997 2350 2443 3.02E-23 99.9635 cl18184 Sterol-sensing superfamily C - "Sterol-sensing domain of SREBP cleavage-activation; Sterol regulatory element-binding proteins (SREBPs) are membrane-bound transcription factors that promote lipid synthesis in animal cells. They are embedded in the membranes of the endoplasmic reticulum (ER) in a helical hairpin orientation and are released from the ER by a two-step proteolytic process. Proteolysis begins when the SREBPs are cleaved at Site-1, which is located at a leucine residue in the middle of the hydrophobic loop in the lumen of the ER. Upon proteolytic processing SREBP can activate the expression of genes involved in cholesterol biosynthesis and uptake. SCAP stimulates cleavage of SREBPs via fusion of their two C-termini. This domain is the transmembrane region that traverses the membrane eight times and is the sterol-sensing domain of the cleavage protein. WD40 domains are found towards the C-terminus." Q#12698 - CGI_10004512 superfamily 243060 1280 1366 7.94E-11 62.0112 cl02507 SEA superfamily - - "SEA domain; Domain found in Sea urchin sperm protein, Enterokinase, Agrin (SEA). Proposed function of regulating or binding carbohydrate side chains. Recently a proteolytic activity has been shown for a SEA domain." Q#12698 - CGI_10004512 superfamily 241578 640 684 4.50E-08 55.0835 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. 
The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#12698 - CGI_10004512 superfamily 241578 928 971 3.16E-07 52.3871 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." 
Q#12698 - CGI_10004512 superfamily 241578 160 216 1.05E-06 50.8464 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#12698 - CGI_10004512 superfamily 241578 1145 1188 6.83E-06 48.5352 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. 
Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#12698 - CGI_10004512 superfamily 245213 429 468 0.00030313 41.1804 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12698 - CGI_10004512 superfamily 241578 685 719 0.000336794 43.5276 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." 
Q#12698 - CGI_10004512 superfamily 241578 1022 1062 0.000613028 42.372 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#12698 - CGI_10004512 superfamily 241578 1186 1228 0.000988803 41.9868 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. 
Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#12698 - CGI_10004512 superfamily 241578 517 559 0.00165354 41.2164 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#12698 - CGI_10004512 superfamily 221695 242 265 0.00177986 38.9754 cl18612 cEGF superfamily - - "Complement Clr-like EGF-like; cEGF, or complement Clr-like EGF, domains have six conserved cysteine residues disulfide-bonded into the characteristic pattern 'ababcc'. They are found in blood coagulation proteins such as fibrillin, Clr and Cls, thrombomodulin, and the LDL receptor. The core fold of the EGF domain consists of two small beta-hairpins packed against each other. Two major structural variants have been identified based on the structural context of the C-terminal cysteine residue of disulfide 'c' in the C-terminal hairpin: hEGFs and cEGFs. 
In cEGFs the C-terminal thiol resides on the C-terminal beta-sheet, resulting in long loop-lengths between the cysteine residues of disulfide 'c', typically C[10+]XC. These longer loop-lengths may have arisen by selective cysteine loss from a four-disulfide EGF template such as laminin or integrin. Tandem cEGF domains have five linking residues between terminal cysteines of adjacent domains. cEGF domains may or may not bind calcium in the linker region. cEGF domains with the consensus motif CXN4X[F,Y]XCXC are hydroxylated exclusively on the asparagine residue." Q#12699 - CGI_10004513 superfamily 247856 103 156 1.74E-09 51.7797 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#12699 - CGI_10004513 superfamily 247856 70 127 1.45E-06 43.6905 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#12699 - CGI_10004513 superfamily 247856 134 191 0.00759315 33.2901 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#12700 - CGI_10002685 superfamily 247792 17 58 1.58E-07 43.5884 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#12701 - CGI_10002686 superfamily 247792 783 827 1.07E-11 61.3076 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#12704 - CGI_10013194 superfamily 241584 327 400 4.32E-05 41.7131 cl00065 FN3 superfamily N - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. 
FN3-like domains are also found in bacterial glycosyl hydrolases." Q#12704 - CGI_10013194 superfamily 214531 114 144 4.33E-05 41.0481 cl18310 LY superfamily N - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#12704 - CGI_10013194 superfamily 214531 146 188 0.000602635 37.5813 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#12705 - CGI_10013195 superfamily 247856 142 196 2.10E-16 75.2769 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#12705 - CGI_10013195 superfamily 247856 64 125 9.71E-14 67.5729 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#12705 - CGI_10013195 superfamily 241584 467 548 2.42E-09 55.5803 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. 
Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#12705 - CGI_10013195 superfamily 241584 767 843 0.000251461 40.1723 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#12705 - CGI_10013195 superfamily 214531 310 351 5.78E-07 47.5965 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#12705 - CGI_10013195 superfamily 214531 277 307 8.20E-06 44.1297 cl18310 LY superfamily N - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#12705 - CGI_10013195 superfamily 241584 669 750 0.000265761 40.2905 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. 
Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#12707 - CGI_10013197 superfamily 243066 9 99 1.74E-41 146.544 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#12707 - CGI_10013197 superfamily 219619 334 401 7.36E-10 56.4471 cl18518 Ion_trans_2 superfamily - - Ion channel; This family includes the two membrane helix type ion channels found in bacteria. 
Q#12708 - CGI_10013198 superfamily 247792 809 849 0.000114495 40.892 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#12708 - CGI_10013198 superfamily 204923 850 898 1.01E-17 78.7452 cl13837 VPS11_C superfamily - - "Vacuolar protein sorting protein 11 C terminal; This domain family is found in eukaryotes, and is approximately 50 amino acids in length. Vps 11 is one of the evolutionarily conserved class C vacuolar protein sorting genes (c-vps: vps11, vps16, vps18, and vps33), whose products physically associate to form the c-vps protein complex required for vesicle docking and fusion." Q#12711 - CGI_10013201 superfamily 247856 130 183 3.51E-10 53.7057 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#12711 - CGI_10013201 superfamily 247856 54 116 4.66E-10 53.3205 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. 
Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#12712 - CGI_10013202 superfamily 247856 22 84 1.99E-12 59.0985 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#12712 - CGI_10013202 superfamily 247856 100 151 1.53E-09 51.0093 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#12713 - CGI_10013203 superfamily 247856 60 122 4.80E-09 49.8537 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#12713 - CGI_10013203 superfamily 247856 96 160 5.77E-07 44.4609 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. 
EF-hands tend to occur in pairs or higher copy numbers." Q#12713 - CGI_10013203 superfamily 247856 134 189 2.04E-05 40.2237 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#12714 - CGI_10013204 superfamily 247856 49 111 1.31E-11 56.7873 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#12714 - CGI_10013204 superfamily 247856 85 149 1.97E-08 48.3129 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#12714 - CGI_10013204 superfamily 247856 126 177 2.75E-08 47.9277 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#12715 - CGI_10013205 superfamily 247856 27 89 4.17E-15 66.0321 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#12715 - CGI_10013205 superfamily 247856 103 156 1.41E-13 61.7949 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#12716 - CGI_10013206 superfamily 246664 1 64 1.55E-22 93.4082 cl14561 An_peroxidase_like superfamily N - "Animal heme peroxidases and related proteins; A diverse family of enzymes, which includes prostaglandin G/H synthase, thyroid peroxidase, myeloperoxidase, linoleate diol synthase, lactoperoxidase, peroxinectin, peroxidasin, and others. Despite its name, this family is not restricted to metazoans: members are found in fungi, plants, and bacteria as well." Q#12717 - CGI_10013207 superfamily 248264 1048 1215 4.67E-13 68.8029 cl17710 DDE_4 superfamily - - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." 
Q#12717 - CGI_10013207 superfamily 222263 955 1025 0.00152599 38.4529 cl16321 DDE_4_2 superfamily - - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#12720 - CGI_10003155 superfamily 245847 5 83 0.000439169 35.6102 cl12042 FA58C superfamily C - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#12721 - CGI_10007244 superfamily 245201 10 218 3.32E-125 360.928 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#12722 - CGI_10007245 superfamily 241571 33 144 8.01E-18 75.5266 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#12723 - CGI_10007246 superfamily 202748 141 190 1.33E-16 70.9831 cl04238 DUF307 superfamily - - Domain of unknown function (DUF307); Domain occurs as one or more copies in a small family of putative membrane proteins. 
Q#12724 - CGI_10007247 superfamily 243166 84 199 6.35E-07 50.3698 cl02759 TRAM_LAG1_CLN8 superfamily C - TLC domain; TLC domain. Q#12724 - CGI_10007247 superfamily 245874 837 928 0.000274129 41.2578 cl12111 TNFR superfamily - - "Tumor necrosis factor receptor (TNFR) domain; superfamily of TNF-like receptor domains. When bound to TNF-like cytokines, TNFRs trigger multiple signal transduction pathways, they are involved in inflammation response, apoptosis, autoimmunity and organogenesis. TNFRs domains are elongated with generally three tandem repeats of cysteine-rich domains (CRDs). They fit in the grooves between protomers within the ligand trimer. Some TNFRs, such as NGFR and HveA, bind ligands with no structural similarity to TNF and do not bind ligand trimers." Q#12724 - CGI_10007247 superfamily 245874 1205 1296 0.000583988 40.1022 cl12111 TNFR superfamily - - "Tumor necrosis factor receptor (TNFR) domain; superfamily of TNF-like receptor domains. When bound to TNF-like cytokines, TNFRs trigger multiple signal transduction pathways, they are involved in inflammation response, apoptosis, autoimmunity and organogenesis. TNFRs domains are elongated with generally three tandem repeats of cysteine-rich domains (CRDs). They fit in the grooves between protomers within the ligand trimer. Some TNFRs, such as NGFR and HveA, bind ligands with no structural similarity to TNF and do not bind ligand trimers." Q#12725 - CGI_10007248 superfamily 245835 7 237 4.75E-123 374.094 cl12013 BAR superfamily - - "The Bin/Amphiphysin/Rvs (BAR) domain, a dimerization module that binds membranes and detects membrane curvature; BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions including organelle biogenesis, membrane trafficking or remodeling, and cell division and migration. Mutations in BAR containing proteins have been linked to diseases and their inactivation in cells leads to altered membrane dynamics. 
A BAR domain with an additional N-terminal amphipathic helix (an N-BAR) can drive membrane curvature. These N-BAR domains are found in amphiphysins and endophilins, among others. BAR domains are also frequently found alongside domains that determine lipid specificity, such as the Pleckstrin Homology (PH) and Phox Homology (PX) domains which are present in beta centaurins (ACAPs and ASAPs) and sorting nexins, respectively. A FES-CIP4 Homology (FCH) domain together with a coiled coil region is called the F-BAR domain and is present in Pombe/Cdc15 homology (PCH) family proteins, which include Fes/Fes tyrosine kinases, PACSIN or syndapin, CIP4-like proteins, and srGAPs, among others. The Inverse (I)-BAR or IRSp53/MIM homology Domain (IMD) is found in multi-domain proteins, such as IRSp53 and MIM, that act as scaffolding proteins and transducers of a variety of signaling pathways that link membrane dynamics and the underlying actin cytoskeleton. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The I-BAR domain induces membrane protrusions in the opposite direction compared to classical BAR and F-BAR domains, which produce membrane invaginations. BAR domains that also serve as protein interaction domains include those of arfaptin and OPHN1-like proteins, among others, which bind to Rac and Rho GAP domains, respectively." Q#12726 - CGI_10007249 superfamily 247723 209 280 4.68E-14 67.5823 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. 
RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#12726 - CGI_10007249 superfamily 247723 75 149 7.03E-14 66.8119 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#12726 - CGI_10007249 superfamily 247723 412 483 1.72E-06 45.7589 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. 
This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#12727 - CGI_10007250 superfamily 220394 11 701 3.38E-146 450.373 cl10752 Meckelin superfamily - - "Meckelin (Transmembrane protein 67); Members of this family are thought to be related to the ciliary basal body. Defects result in Meckel syndrome type 3, an autosomal recessive disorder characterized by a combination of renal cysts and variably associated features including developmental anomalies of the central nervous system (typically encephalocele), hepatic ductal dysplasia and cysts, and polydactyly. Joubert syndrome type 6 is also a manifestation of certain mutations; it is an autosomal recessive congenital malformation of the cerebellar vermis and brainstem with abnormalities of axonal decussation (crossing in the brain) affecting the corticospinal tract and superior cerebellar peduncles. Individuals with Joubert syndrome have motor and behavioral abnormalities, including an inability to walk due to severe clumsiness and 'mirror' movements, and cognitive and behavioural disturbances." Q#12729 - CGI_10004622 superfamily 243072 4 81 3.69E-20 78.5794 cl02529 ANK superfamily C - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#12730 - CGI_10006612 superfamily 217574 4 119 8.19E-39 136.584 cl04089 eRF1_1 superfamily - - "eRF1 domain 1; The release factor eRF1 terminates protein biosynthesis by recognising stop codons at the A site of the ribosome and stimulating peptidyl-tRNA bond hydrolysis at the peptidyl transferase centre. The crystal structure of human eRF1 is known. The overall shape and dimensions of eRF1 resemble a tRNA molecule with domains 1, 2, and 3 of eRF1 corresponding to the anticodon loop, aminoacyl acceptor stem, and T stem of a tRNA molecule, respectively. The position of the essential GGQ motif at an exposed tip of domain 2 suggests that the Gln residue coordinates a water molecule to mediate the hydrolytic activity at the peptidyl transferase centre. A conserved groove on domain 1, 80 A from the GGQ motif, is proposed to form the codon recognition site. This family also includes other proteins for which the precise molecular function is unknown. Many of them are from Archaebacteria. These proteins may also be involved in translation termination but this awaits experimental verification." Q#12730 - CGI_10006612 superfamily 217575 124 256 1.33E-37 133.169 cl04090 eRF1_2 superfamily - - "eRF1 domain 2; The release factor eRF1 terminates protein biosynthesis by recognising stop codons at the A site of the ribosome and stimulating peptidyl-tRNA bond hydrolysis at the peptidyl transferase centre. The crystal structure of human eRF1 is known. The overall shape and dimensions of eRF1 resemble a tRNA molecule with domains 1, 2, and 3 of eRF1 corresponding to the anticodon loop, aminoacyl acceptor stem, and T stem of a tRNA molecule, respectively. 
The position of the essential GGQ motif at an exposed tip of domain 2 suggests that the Gln residue coordinates a water molecule to mediate the hydrolytic activity at the peptidyl transferase centre. A conserved groove on domain 1, 80 A from the GGQ motif, is proposed to form the codon recognition site. This family also includes other proteins for which the precise molecular function is unknown. Many of them are from Archaebacteria. These proteins may also be involved in translation termination but this awaits experimental verification." Q#12730 - CGI_10006612 superfamily 146221 259 358 1.07E-36 129.595 cl04091 eRF1_3 superfamily - - "eRF1 domain 3; The release factor eRF1 terminates protein biosynthesis by recognising stop codons at the A site of the ribosome and stimulating peptidyl-tRNA bond hydrolysis at the peptidyl transferase centre. The crystal structure of human eRF1 is known. The overall shape and dimensions of eRF1 resemble a tRNA molecule with domains 1, 2, and 3 of eRF1 corresponding to the anticodon loop, aminoacyl acceptor stem, and T stem of a tRNA molecule, respectively. The position of the essential GGQ motif at an exposed tip of domain 2 suggests that the Gln residue coordinates a water molecule to mediate the hydrolytic activity at the peptidyl transferase centre. A conserved groove on domain 1, 80 A from the GGQ motif, is proposed to form the codon recognition site. This family also includes other proteins for which the precise molecular function is unknown. Many of them are from Archaebacteria. These proteins may also be involved in translation termination but this awaits experimental verification." Q#12731 - CGI_10006613 superfamily 247999 53 96 9.31E-13 64.8192 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. 
Q#12731 - CGI_10006613 superfamily 247999 214 259 7.43E-10 56.0662 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#12731 - CGI_10006613 superfamily 241581 731 786 0.00105152 38.5214 cl00062 FHA superfamily NC - "Forkhead associated domain (FHA); found in eukaryotic and prokaryotic proteins. Putative nuclear signalling domain. FHA domains may bind phosphothreonine, phosphoserine and sometimes phosphotyrosine. In eukaryotes, many FHA domain-containing proteins localize to the nucleus, where they participate in establishing or maintaining cell cycle checkpoints, DNA repair, or transcriptional regulation. Members of the FHA family include: Dun1, Rad53, Cds1, Mek1, KAPP (kinase-associated protein phosphatase), and Ki-67 (a human nuclear protein related to cell proliferation)." Q#12733 - CGI_10006615 superfamily 241872 8 177 4.72E-14 66.2416 cl00453 CDP-OH_P_transf superfamily - - CDP-alcohol phosphatidyltransferase; All of these members have the ability to catalyze the displacement of CMP from a CDP-alcohol by a second alcohol with formation of a phosphodiester bond and concomitant breaking of a phosphoride anhydride bond. Q#12734 - CGI_10006616 superfamily 128469 207 303 3.06E-25 99.4519 cl17971 VPS9 superfamily - - Domain present in VPS9; Domain present in yeast vacuolar sorting protein 9 and other proteins. Q#12736 - CGI_10006618 superfamily 242534 67 146 1.34E-15 70.7255 cl01495 Glyco_hydro_10 superfamily C - Glycosyl hydrolase family 10; Glycosyl hydrolase family 10. Q#12737 - CGI_10006619 superfamily 242534 717 963 2.82E-46 168.566 cl01495 Glyco_hydro_10 superfamily - - Glycosyl hydrolase family 10; Glycosyl hydrolase family 10. 
Q#12737 - CGI_10006619 superfamily 242534 222 467 1.87E-45 166.255 cl01495 Glyco_hydro_10 superfamily - - Glycosyl hydrolase family 10; Glycosyl hydrolase family 10. Q#12737 - CGI_10006619 superfamily 221377 1158 1250 2.09E-08 54.0118 cl13449 DUF3504 superfamily C - Domain of unknown function (DUF3504); This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 156 to 173 amino acids in length. Q#12737 - CGI_10006619 superfamily 216848 591 675 3.10E-06 47.046 cl03406 CBM_4_9 superfamily C - Carbohydrate binding domain; This family includes diverse carbohydrate binding domains. Q#12741 - CGI_10003732 superfamily 243267 20 386 3.55E-148 427.03 cl03000 Innexin superfamily - - "Innexin; This family includes the drosophila proteins Ogre and shaking-B, and the C. elegans proteins Unc-7 and Unc-9. Members of this family are integral membrane proteins which are involved in the formation of gap junctions. This family has been named the Innexins." Q#12745 - CGI_10005968 superfamily 241599 268 325 2.85E-16 72.6612 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#12745 - CGI_10005968 superfamily 198730 183 249 3.28E-46 155.289 cl02582 Pou superfamily - - Pou domain - N-terminal to homeobox domain; Pou domain - N-terminal to homeobox domain. Q#12746 - CGI_10017007 superfamily 215529 1 122 4.18E-62 194.188 cl18336 PLN02978 superfamily C - pyridoxal kinase Q#12747 - CGI_10017008 superfamily 241672 4 127 8.71E-13 62.4056 cl00192 ribokinase_pfkB_like superfamily N - "ribokinase/pfkB superfamily: Kinases that accept a wide variety of substrates, including carbohydrates and aromatic small molecules, all are phosphorylated at a hydroxyl group. 
The superfamily includes ribokinase, fructokinase, ketohexokinase, 2-dehydro-3-deoxygluconokinase, 1-phosphofructokinase, the minor 6-phosphofructokinase (PfkB), inosine-guanosine kinase, and adenosine kinase. Even though there is a high degree of structural conservation within this superfamily, their multimerization level varies widely, monomeric (e.g. adenosine kinase), dimeric (e.g. ribokinase), and trimeric (e.g. THZ kinase)." Q#12748 - CGI_10017009 superfamily 246681 2 259 3.21E-118 342.553 cl14643 SRPBCC superfamily - - "START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC (SRPBCC) ligand-binding domain superfamily; SRPBCC domains have a deep hydrophobic ligand-binding pocket; they bind diverse ligands. Included in this superfamily are the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of mammalian STARD1-STARD15, and the C-terminal catalytic domains of the alpha oxygenase subunit of Rieske-type non-heme iron aromatic ring-hydroxylating oxygenases (RHOs_alpha_C), as well as the SRPBCC domains of phosphatidylinositol transfer proteins (PITPs), Bet v 1 (the major pollen allergen of white birch, Betula verrucosa), CoxG, CalC, and related proteins. Other members of this superfamily include PYR/PYL/RCAR plant proteins, the aromatase/cyclase (ARO/CYC) domains of proteins such as Streptomyces glaucescens tetracenomycin, and the SRPBCC domains of Streptococcus mutans Smu.440 and related proteins." Q#12750 - CGI_10017011 superfamily 242889 9 452 5.44E-117 352.711 cl02111 PCI superfamily - - "PCI domain; This domain has also been called the PINT motif (Proteasome, Int-6, Nip-1 and TRIP-15)." Q#12751 - CGI_10017012 superfamily 220078 44 168 6.19E-26 100.662 cl07513 DUF1917 superfamily N - Domain of unknown function (DUF1917); This domain is found in various hypothetical and basophilic leukaemia proteins. It has no known function. 
Q#12752 - CGI_10017013 superfamily 243179 266 362 0.000338873 38.6679 cl02781 tetraspanin_LEL superfamily N - "Tetraspanin, extracellular domain or large extracellular loop (LEL). Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. The tetraspanin family contains CD9, CD63, CD37, CD53, CD82, CD151, and CD81, amongst others. Tetraspanins are involved in diverse processes such as cell activation and proliferation, adhesion and motility, differentiation, cancer, and others. Their various functions may relate to their ability to act as molecular facilitators, grouping specific cell-surface proteins and affecting formation and stability of signaling complexes. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web", which may also include integrins." Q#12753 - CGI_10017014 superfamily 248458 112 233 1.82E-05 43.0245 cl17904 MFS superfamily - - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. 
The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#12754 - CGI_10017015 superfamily 243179 132 244 3.86E-10 55.2315 cl02781 tetraspanin_LEL superfamily - - "Tetraspanin, extracellular domain or large extracellular loop (LEL). Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. The tetraspanin family contains CD9, CD63, CD37, CD53, CD82, CD151, and CD81, amongst others. Tetraspanins are involved in diverse processes such as cell activation and proliferation, adhesion and motility, differentiation, cancer, and others. Their various functions may relate to their ability to act as molecular facilitators, grouping specific cell-surface proteins and affecting formation and stability of signaling complexes. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web", which may also include integrins." 
Q#12755 - CGI_10017016 superfamily 243179 140 246 1.19E-13 65.2467 cl02781 tetraspanin_LEL superfamily - - "Tetraspanin, extracellular domain or large extracellular loop (LEL). Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. The tetraspanin family contains CD9, CD63, CD37, CD53, CD82, CD151, and CD81, amongst others. Tetraspanins are involved in diverse processes such as cell activation and proliferation, adhesion and motility, differentiation, cancer, and others. Their various functions may relate to their ability to act as molecular facilitators, grouping specific cell-surface proteins and affecting formation and stability of signaling complexes. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web", which may also include integrins." Q#12756 - CGI_10017017 superfamily 247856 35 92 1.09E-07 47.1573 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#12756 - CGI_10017017 superfamily 247856 104 165 1.22E-06 44.0757 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. 
EF-hands tend to occur in pairs or higher copy numbers." Q#12756 - CGI_10017017 superfamily 247856 175 234 0.00026601 37.5273 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#12757 - CGI_10017018 superfamily 245716 93 119 3.65E-06 43.7721 cl11592 zf-CCCH superfamily - - Zinc finger C-x8-C-x5-C-x3-H type (and similar); Zinc finger C-x8-C-x5-C-x3-H type (and similar). Q#12758 - CGI_10017019 superfamily 245596 4 261 8.16E-164 464.034 cl11394 Glyco_tranf_GTA_type superfamily - - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." 
Q#12758 - CGI_10017019 superfamily 193687 289 355 5.57E-19 81.0447 cl00160 LbetaH superfamily - - "Left-handed parallel beta-Helix (LbetaH or LbH) domain: The alignment contains 5 turns, each containing three imperfect tandem repeats of a hexapeptide repeat motif (X-[STAV]-X-[LIV]-[GAED]-X). Proteins containing hexapeptide repeats are often enzymes showing acyltransferase activity, however, some subfamilies in this hierarchy also show activities related to ion transport or translation initiation. Many are trimeric in their active forms." Q#12759 - CGI_10017020 superfamily 241592 1 136 1.63E-82 240.962 cl00074 H2A superfamily - - "Histone 2A; H2A is a subunit of the nucleosome. The nucleosome is an octamer containing two H2A, H2B, H3, and H4 subunits. The H2A subunit performs essential roles in maintaining structural integrity of the nucleosome, chromatin condensation, and binding of specific chromatin-associated proteins." Q#12760 - CGI_10017021 superfamily 241704 69 210 1.00E-33 119.786 cl00227 PEBP superfamily - - "PhosphatidylEthanolamine-Binding Protein (PEBP) domain; PhosphatidylEthanolamine-Binding Proteins (PEBPs) are represented in all three major phylogenetic divisions (eukaryotes, bacteria, archaea). A number of biological roles for members of the PEBP family include serine protease inhibition, membrane biogenesis, regulation of flowering plant stem architecture, and Raf-1 kinase inhibition. Although their overall structures are similar, the members of the PEBP family bind very different substrates including phospholipids, opioids, and hydrophobic odorant molecules as well as having different oligomerization states (monomer/dimer/tetramer)." 
Q#12761 - CGI_10017022 superfamily 245835 80 283 3.11E-115 345.615 cl12013 BAR superfamily - - "The Bin/Amphiphysin/Rvs (BAR) domain, a dimerization module that binds membranes and detects membrane curvature; BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions including organelle biogenesis, membrane trafficking or remodeling, and cell division and migration. Mutations in BAR containing proteins have been linked to diseases and their inactivation in cells leads to altered membrane dynamics. A BAR domain with an additional N-terminal amphipathic helix (an N-BAR) can drive membrane curvature. These N-BAR domains are found in amphiphysins and endophilins, among others. BAR domains are also frequently found alongside domains that determine lipid specificity, such as the Pleckstrin Homology (PH) and Phox Homology (PX) domains which are present in beta centaurins (ACAPs and ASAPs) and sorting nexins, respectively. A FES-CIP4 Homology (FCH) domain together with a coiled coil region is called the F-BAR domain and is present in Pombe/Cdc15 homology (PCH) family proteins, which include Fes/Fes tyrosine kinases, PACSIN or syndapin, CIP4-like proteins, and srGAPs, among others. The Inverse (I)-BAR or IRSp53/MIM homology Domain (IMD) is found in multi-domain proteins, such as IRSp53 and MIM, that act as scaffolding proteins and transducers of a variety of signaling pathways that link membrane dynamics and the underlying actin cytoskeleton. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The I-BAR domain induces membrane protrusions in the opposite direction compared to classical BAR and F-BAR domains, which produce membrane invaginations. 
BAR domains that also serve as protein interaction domains include those of arfaptin and OPHN1-like proteins, among others, which bind to Rac and Rho GAP domains, respectively." Q#12761 - CGI_10017022 superfamily 147001 467 626 1.57E-25 105.483 cl04636 ICA69 superfamily N - "Islet cell autoantigen ICA69, C-terminal domain; This family includes a 69 kD protein which has been identified as an islet cell autoantigen in type I diabetes mellitus. Its precise function is unknown." Q#12762 - CGI_10017023 superfamily 241563 177 212 2.55E-05 42.2744 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#12762 - CGI_10017023 superfamily 247792 17 80 0.00357673 35.8844 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#12762 - CGI_10017023 superfamily 128778 228 347 0.00023094 40.3259 cl17972 BBC superfamily - - B-Box C-terminal domain; Coiled coil region C-terminal to (some) B-Box domains Q#12763 - CGI_10017024 superfamily 248022 67 387 3.84E-34 130.861 cl17468 Aa_trans superfamily - - "Transmembrane amino acid transporter protein; This transmembrane region is found in many amino acid transporters including UNC-47 and MTR. 
UNC-47 encodes a vesicular amino butyric acid (GABA) transporter, (VGAT). UNC-47 is predicted to have 10 transmembrane domains. MTR is a N system amino acid transporter system protein involved in methyltryptophan resistance. Other members of this family include proline transporters and amino acid permeases." Q#12764 - CGI_10017025 superfamily 247724 87 185 3.69E-17 74.8147 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#12766 - CGI_10017027 superfamily 245201 2 248 1.15E-74 231.271 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. 
It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#12768 - CGI_10017029 superfamily 247057 305 362 2.08E-11 60.3869 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#12769 - CGI_10017030 superfamily 241550 553 784 2.94E-69 235.605 cl00015 nt_trans superfamily N - "nucleotidyl transferase superfamily; nt_trans (nucleotidyl transferase) This superfamily includes the class I amino-acyl tRNA synthetases, pantothenate synthetase (PanC), ATP sulfurylase, and the cytidylyltransferases, all of which have a conserved dinucleotide-binding domain." Q#12769 - CGI_10017030 superfamily 245839 783 902 9.29E-40 144.656 cl12020 Anticodon_Ia_like superfamily - - "Anticodon-binding domain of class Ia aminoacyl tRNA synthetases and similar domains; This domain is found in a variety of class Ia aminoacyl tRNA synthetases, C-terminal to the catalytic core domain. It recognizes and specifically binds to the anticodon of the tRNA. 
Aminoacyl tRNA synthetases catalyze the transfer of cognate amino acids to the 3'-end of their tRNAs by specifically recognizing cognate from non-cognate amino acids. Members include valyl-, leucyl-, isoleucyl-, cysteinyl-, arginyl-, and methionyl-tRNA synthetases. This superfamily also includes a domain from MshC, an enzyme in the mycothiol biosynthetic pathway." Q#12769 - CGI_10017030 superfamily 241550 48 231 8.09E-21 94.2366 cl00015 nt_trans superfamily C - "nucleotidyl transferase superfamily; nt_trans (nucleotidyl transferase) This superfamily includes the class I amino-acyl tRNA synthetases, pantothenate synthetase (PanC), ATP sulfurylase, and the cytidylyltransferases, all of which have a conserved dinucleotide-binding domain." Q#12769 - CGI_10017030 superfamily 241550 195 270 1.39E-06 50.1097 cl00015 nt_trans superfamily NC - "nucleotidyl transferase superfamily; nt_trans (nucleotidyl transferase) This superfamily includes the class I amino-acyl tRNA synthetases, pantothenate synthetase (PanC), ATP sulfurylase, and the cytidylyltransferases, all of which have a conserved dinucleotide-binding domain." Q#12770 - CGI_10017031 superfamily 241659 11 103 3.14E-37 124.876 cl00175 alpha-crystallin-Hsps_p23-like superfamily - - "alpha-crystallin domain (ACD) found in alpha-crystallin-type small heat shock proteins, and a similar domain found in p23 (a cochaperone for Hsp90) and in other p23-like proteins.; The alpha-crystallin-Hsps_p23-like superfamily includes the alpha-crystallin domain (ACD) of alpha-crystallin-type small heat shock proteins (sHsps) and a similar domain found in p23-like proteins. sHsps are small stress induced proteins with monomeric masses between 12-43 kDa, whose common feature is this ACD. sHsps are generally active as large oligomers consisting of multiple subunits, and are believed to be ATP-independent chaperones that prevent aggregation and are important in refolding in combination with other Hsps. 
p23 is a cochaperone of the Hsp90 chaperoning pathway. It binds Hsp90 and participates in the folding of a number of Hsp90 clients including the progesterone receptor. p23 also has a passive chaperoning activity. p23 in addition may act as the cytosolic prostaglandin E2 synthase. Included in this superfamily is the p23-like C-terminal CHORD-SGT1 (CS) domain of suppressor of G2 allele of Skp1 (Sgt1) and the p23-like domains of human butyrate-induced transcript 1 (hB-ind1), NUD (nuclear distribution) C, Melusin, and NAD(P)H cytochrome b5 (NCB5) oxidoreductase (OR)." Q#12771 - CGI_10017032 superfamily 247804 66 109 4.99E-07 46.4146 cl17250 SANT superfamily - - "'SWI3, ADA2, N-CoR and TFIIIB' DNA-binding domains. Tandem copies of the domain bind telomeric DNA tandem repeats as part of the capping complex. Binding is sequence dependent for repeats which contain the G/C rich motif [C2-3 A (CA)1-6]. The domain is also found in regulatory transcriptional repressor complexes where it also binds DNA." Q#12771 - CGI_10017032 superfamily 203011 363 434 2.39E-12 62.6097 cl04515 SWIRM superfamily - - SWIRM domain; This SWIRM domain is a small alpha-helical domain of about 85 amino acid residues found in chromosomal proteins. It contains a helix-turn helix motif and binds to DNA. Q#12771 - CGI_10017032 superfamily 241760 8 56 6.14E-07 46.5168 cl00295 ZZ superfamily - - "Zinc finger, ZZ type. Zinc finger present in dystrophin, CBP/p300 and many other proteins. The ZZ motif coordinates one or two zinc ions and most likely participates in ligand binding or molecular scaffolding. Many proteins containing ZZ motifs have other zinc-binding motifs as well, and the majority serve as scaffolds in pathways involving acetyltransferase, protein kinase, or ubiquitin-related activity. ZZ proteins can be grouped into the following functional classes: chromatin modifying, cytoskeletal scaffolding, ubiquitin binding or conjugating, and membrane receptor or ion-channel modifying proteins." 
Q#12774 - CGI_10017037 superfamily 245847 25 147 0.00132411 37.0968 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#12775 - CGI_10017038 superfamily 247941 7 89 1.03E-05 41.7293 cl17387 Methyltransf_21 superfamily N - "Methyltransferase FkbM domain; This family has members from bacteria to human, and appears to be a methyltransferase." Q#12778 - CGI_10003016 superfamily 149104 7 101 9.89E-28 98.3154 cl06748 Renin_r superfamily - - "Renin receptor-like protein; The sequences featured in this family are similar to a region of the human renin receptor that bears a putative transmembrane spanning segment. The renin receptor is involved in intracellular signal transduction by the activation of the ERK1/ERK2 pathway, and it also serves to increase the efficiency of angiotensinogen cleavage by receptor-bound renin, therefore facilitating angiotensin II generation and action on a cell surface." Q#12779 - CGI_10007999 superfamily 247068 467 562 3.47E-25 102.007 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." 
Q#12779 - CGI_10007999 superfamily 247068 571 669 4.26E-25 102.007 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#12779 - CGI_10007999 superfamily 247068 249 344 6.23E-18 81.2057 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#12779 - CGI_10007999 superfamily 247068 130 238 1.25E-14 71.5757 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#12779 - CGI_10007999 superfamily 247068 365 458 2.17E-14 70.8053 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#12779 - CGI_10007999 superfamily 247068 682 779 3.99E-10 58.4789 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#12779 - CGI_10007999 superfamily 247068 14 121 1.47E-06 47.6934 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#12782 - CGI_10008002 superfamily 247724 169 340 2.44E-99 302.942 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. 
The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#12782 - CGI_10008002 superfamily 191952 395 447 1.58E-21 88.8798 cl06963 NOGCT superfamily - - NOGCT (NUC087) domain; This C terminal domain is found in the NOG subfamily of nucleolar GTP-binding proteins. Q#12783 - CGI_10008003 superfamily 246925 79 211 2.26E-06 48.1206 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#12783 - CGI_10008003 superfamily 245814 289 370 9.88E-06 43.6481 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#12784 - CGI_10008004 superfamily 245814 297 351 7.49E-05 40.9517 cl11960 Ig superfamily C - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#12784 - CGI_10008004 superfamily 243030 26 51 0.00639232 34.9155 cl02423 LRRNT superfamily - - Leucine rich repeat N-terminal domain; Leucine Rich Repeats pfam00560 are short sequence motifs present in a number of proteins with diverse functions and cellular locations. Leucine Rich Repeats are often flanked by cysteine rich domains. This domain is often found at the N-terminus of tandem leucine rich repeats. Q#12785 - CGI_10008005 superfamily 245814 300 370 0.000288057 39.0876 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#12786 - CGI_10008006 superfamily 245814 210 264 7.41E-05 40.5665 cl11960 Ig superfamily C - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. 
The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#12786 - CGI_10008006 superfamily 246925 18 101 0.00630834 36.5646 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#12786 - CGI_10008006 superfamily 214507 146 191 0.00969674 33.9428 cl15307 LRRCT superfamily - - Leucine rich repeat C-terminal domain; Leucine rich repeat C-terminal domain. Q#12787 - CGI_10008007 superfamily 247068 466 564 7.06E-26 103.932 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. 
Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#12787 - CGI_10008007 superfamily 247068 250 346 2.08E-24 99.6953 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#12787 - CGI_10008007 superfamily 247068 576 667 5.28E-21 90.0653 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#12787 - CGI_10008007 superfamily 247068 365 458 4.28E-20 87.3689 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#12787 - CGI_10008007 superfamily 247068 139 242 3.81E-16 75.8129 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#12787 - CGI_10008007 superfamily 247068 680 769 2.11E-14 70.8053 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#12787 - CGI_10008007 superfamily 247068 25 131 1.06E-09 56.9382 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#12788 - CGI_10008008 superfamily 247068 464 560 7.14E-23 95.4581 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#12788 - CGI_10008008 superfamily 247068 248 345 9.64E-22 91.9913 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#12788 - CGI_10008008 superfamily 247068 569 658 1.01E-19 86.2133 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#12788 - CGI_10008008 superfamily 247068 140 240 5.63E-18 81.2057 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#12788 - CGI_10008008 superfamily 247068 671 764 7.62E-18 80.8205 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#12788 - CGI_10008008 superfamily 247068 363 456 3.41E-14 70.0349 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#12788 - CGI_10008008 superfamily 247068 22 129 7.31E-07 48.4638 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#12789 - CGI_10008009 superfamily 247068 462 558 3.51E-27 107.784 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#12789 - CGI_10008009 superfamily 247068 566 661 4.33E-22 93.1469 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#12789 - CGI_10008009 superfamily 247068 247 342 9.04E-22 92.3765 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#12789 - CGI_10008009 superfamily 247068 674 767 4.92E-17 78.5093 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#12789 - CGI_10008009 superfamily 247068 360 453 2.81E-15 73.5017 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#12789 - CGI_10008009 superfamily 247068 138 238 1.30E-14 71.5757 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#12789 - CGI_10008009 superfamily 247068 23 127 2.36E-07 50.0046 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#12791 - CGI_10003262 superfamily 243069 24 173 1.54E-53 171.561 cl02525 Band_7 superfamily N - "The band 7 domain of flotillin (reggie) like proteins. This group contains proteins similar to stomatin, prohibitin, flotillin, HflK/C and podocin. Many of these band 7 domain-containing proteins are lipid raft-associated. Individual proteins of this band 7 domain family may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. Microdomains formed from flotillin proteins may in addition be dynamic units with their own regulatory functions. Flotillins have been implicated in signal transduction, vesicle trafficking, cytoskeleton rearrangement and are known to interact with a variety of proteins.
Stomatin interacts with and regulates members of the degenerin/epithelial Na+ channel family in mechanosensory cells of Caenorhabditis elegans and vertebrate neurons and participates in trafficking of Glut1 glucose transporters. Prohibitin may act as a chaperone for the stabilization of mitochondrial proteins. Prokaryotic HflK/C plays a role in the decision between lysogenic and lytic cycle growth during lambda phage infection. Flotillins have been implicated in the progression of prion disease, in the pathogenesis of neurodegenerative diseases such as Parkinson's and Alzheimer's disease, and in cancer invasion and metastasis. Mutations in the podocin gene give rise to autosomal recessive steroid-resistant nephrotic syndrome." Q#12792 - CGI_10003263 superfamily 243072 49 144 1.38E-17 75.4978 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#12793 - CGI_10003264 superfamily 241583 118 289 1.06E-49 172.558 cl00064 ZnMc superfamily - - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#12794 - CGI_10003265 superfamily 248458 342 513 6.03E-12 65.7981 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters.
MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#12794 - CGI_10003265 superfamily 248458 152 282 1.81E-09 58.0941 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. 
Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#12795 - CGI_10003266 superfamily 243035 277 358 9.54E-17 75.7141 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. 
For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#12796 - CGI_10003267 superfamily 248012 502 624 1.72E-25 103.122 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#12796 - CGI_10003267 superfamily 245814 395 461 7.47E-06 44.7368 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#12797 - CGI_10005664 superfamily 217473 96 216 1.58E-08 53.5229 cl03978 Mab-21 superfamily C - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#12798 - CGI_10005665 superfamily 217473 49 216 7.86E-09 54.2933 cl03978 Mab-21 superfamily C - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#12799 - CGI_10005666 superfamily 219877 175 281 6.19E-49 171.769 cl11719 STAG superfamily - - "STAG domain; STAG domain proteins are subunits of cohesin complex - a protein complex required for sister chromatid cohesion in eukaryotes. The STAG domain is present in Schizosaccharomyces pombe mitotic cohesin Psc3, and the meiosis specific cohesin Rec11. Many organisms express a meiosis-specific STAG protein, for example, mice and humans have a meiosis specific variant called STAG3, although budding yeast does not have a meiosis specific version." 
Q#12799 - CGI_10005666 superfamily 217473 1300 1523 8.06E-32 127.866 cl03978 Mab-21 superfamily - - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#12800 - CGI_10005667 superfamily 247856 86 147 1.23E-09 51.0093 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#12801 - CGI_10005668 superfamily 202715 30 127 2.45E-38 126.924 cl04194 Tctex-1 superfamily - - Tctex-1 family; Tctex-1 is a dynein light chain. It has been shown that Tctex-1 can bind to the cytoplasmic tail of rhodopsin. C-terminal rhodopsin mutations responsible for retinitis pigmentosa inhibit this interaction. Q#12802 - CGI_10005669 superfamily 247069 154 291 3.43E-20 87.0138 cl15787 SEC14 superfamily - - "Sec14p-like lipid-binding domain. Found in secretory proteins, such as S. cerevisiae phosphatidylinositol transfer protein (Sec14p), and in lipid regulated proteins such as RhoGAPs, RhoGEFs and neurofibromin (NF1). SEC14 domain of Dbl is known to associate with G protein beta/gamma subunits." Q#12803 - CGI_10005670 superfamily 243096 301 480 2.64E-45 160.926 cl02571 RhoGEF superfamily - - Guanine nucleotide exchange factor for Rho/Rac/Cdc42-like GTPases; Also called Dbl-homologous (DH) domain. It appears that PH domains invariably occur C-terminal to RhoGEF/DH domains. 
Q#12803 - CGI_10005670 superfamily 247683 720 773 3.99E-14 68.4319 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." 
Q#12803 - CGI_10005670 superfamily 243054 49 199 5.55E-09 55.5296 cl02488 SPEC superfamily N - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#12803 - CGI_10005670 superfamily 247725 487 612 2.05E-50 173.176 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting proteins to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and other proteins." Q#12805 - CGI_10000157 superfamily 220015 19 116 2.72E-19 77.3734 cl07407 RPA_C superfamily - - "Replication protein A C terminal; This domain corresponds to the C terminal of the single stranded DNA binding protein RPA (replication protein A). RPA is involved in many DNA metabolic pathways including DNA replication, DNA repair, recombination, cell cycle and DNA damage checkpoints." 
Q#12805 - CGI_10000157 superfamily 245205 3 22 3.06E-06 41.0457 cl09930 RPA_2b-aaRSs_OBF_like superfamily N - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." Q#12806 - CGI_10003163 superfamily 220249 67 128 5.60E-10 51.4521 cl09695 H_lectin superfamily - - "H-type lectin domain; The H-type lectin domain is a unit of six beta chains, combined into a homo-hexamer. It is involved in self/non-self recognition of cells, through binding with carbohydrates. It is sometimes found in association with the F5_F8_type_C domain pfam00754." Q#12808 - CGI_10003165 superfamily 243087 9 69 1.37E-06 44.508 cl02562 PWI superfamily - - PWI domain; PWI domain. 
Q#12808 - CGI_10003165 superfamily 219911 222 257 0.000716297 38.497 cl07255 PRP3 superfamily C - pre-mRNA processing factor 3 (PRP3); Pre-mRNA processing factor 3 (PRP3) is a U4/U6-associated splicing factor. The human PRP3 has been implicated in autosomal retinitis pigmentosa. Q#12809 - CGI_10001627 superfamily 243689 70 99 0.00167695 32.6005 cl04271 IBN_N superfamily C - Importin-beta N-terminal domain; Importin-beta N-terminal domain. Q#12811 - CGI_10017134 superfamily 245847 89 205 0.000632194 37.5362 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#12813 - CGI_10017136 superfamily 245596 81 256 3.51E-77 235.939 cl11394 Glyco_tranf_GTA_type superfamily - - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." 
Q#12814 - CGI_10017138 superfamily 241563 61 100 0.000239909 39.77 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#12815 - CGI_10017139 superfamily 241958 48 165 3.91E-18 81.0225 cl00573 SDF superfamily NC - Sodium:dicarboxylate symporter family; Sodium:dicarboxylate symporter family. Q#12816 - CGI_10017140 superfamily 241958 49 377 6.71E-43 155.751 cl00573 SDF superfamily - - Sodium:dicarboxylate symporter family; Sodium:dicarboxylate symporter family. Q#12817 - CGI_10017141 superfamily 245208 79 511 4.27E-142 422.935 cl09933 ACAD superfamily - - "Acyl-CoA dehydrogenase; Both mitochondrial acyl-CoA dehydrogenases (ACAD) and peroxisomal acyl-CoA oxidases (AXO) catalyze the alpha,beta dehydrogenation of the corresponding trans-enoyl-CoA by FAD, which becomes reduced. The reduced form of ACAD is reoxidized in the oxidative half-reaction by electron-transferring flavoprotein (ETF), from which the electrons are transferred to the mitochondrial respiratory chain coupled with ATP synthesis. In contrast, AXO catalyzes a different oxidative half-reaction, in which the reduced FAD is reoxidized by molecular oxygen. The ACAD family includes the eukaryotic beta-oxidation enzymes, short (SCAD), medium (MCAD), long (LCAD) and very-long (VLCAD) chain acyl-CoA dehydrogenases. These enzymes all share high sequence similarity, but differ in their substrate specificities. The ACAD family also includes amino acid catabolism enzymes such as Isovaleryl-CoA dehydrogenase (IVD), short/branched chain acyl-CoA dehydrogenases (SBCAD), Isobutyryl-CoA dehydrogenase (IBDH), glutaryl-CoA dehydrogenase (GCD) and Crotonobetainyl-CoA dehydrogenase. The mitochondrial ACADs are generally homotetramers, except for VLCAD, which is a homodimer. 
Related enzymes include the SOS adaptive response protein aidB, Naphthocyclinone hydroxylase (NcnH), and Dibenzothiophene (DBT) desulfurization enzyme C (DszC)" Q#12818 - CGI_10017142 superfamily 246908 1252 1336 1.65E-39 143.532 cl15255 SH2 superfamily - - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." Q#12818 - CGI_10017142 superfamily 246908 1345 1436 1.67E-24 100.762 cl15255 SH2 superfamily - - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." Q#12818 - CGI_10017142 superfamily 245202 1148 1201 1.23E-06 48.1455 cl09927 S1_like superfamily - - "S1_like: Ribosomal protein S1-like RNA-binding domain. Found in a wide variety of RNA-associated proteins. Originally identified in S1 ribosomal protein. 
This superfamily also contains the Cold Shock Domain (CSD), which is a homolog of the S1 domain. Both domains are members of the Oligonucleotide/oligosaccharide Binding (OB) fold." Q#12818 - CGI_10017142 superfamily 150144 309 555 6.48E-09 56.3275 cl09624 Tex_N superfamily - - Tex-like protein N-terminal domain; This presumed domain is found at the N-terminus of Bordetella pertussis tex. This protein defines a novel family of prokaryotic transcriptional accessory factors. Q#12818 - CGI_10017142 superfamily 247832 699 812 8.38E-09 55.2657 cl17278 UPF0081 superfamily - - Uncharacterized protein family (UPF0081); Uncharacterized protein family (UPF0081). Q#12821 - CGI_10017145 superfamily 245605 14 143 6.36E-47 149.665 cl11409 RNAP_RPB11_RPB3 superfamily - - "RPB11 and RPB3 subunits of RNA polymerase; The eukaryotic RPB11 and RPB3 subunits of RNA polymerase (RNAP), as well as their archaeal (L and D subunits) and bacterial (alpha subunit) counterparts, are involved in the assembly of RNAP, a large multi-subunit complex responsible for the synthesis of RNA. It is the principal enzyme of the transcription process, and is a final target in many regulatory pathways that control gene expression in all living cells. At least three distinct RNAP complexes are found in eukaryotic nuclei: RNAP I, RNAP II, and RNAP III, for the synthesis of ribosomal RNA precursor, mRNA precursor, and 5S and tRNA, respectively. A single distinct RNAP complex is found in prokaryotes and archaea, which may be responsible for the synthesis of all RNAs. The assembly of the two largest eukaryotic RNAP subunits that provide most of the enzyme's catalytic functions depends on the presence of RPB3/RPB11 heterodimer subunits. This is also true for the archaeal (D/L subunits) and bacterial (alpha subunit) counterparts." 
Q#12822 - CGI_10017146 superfamily 245201 90 210 2.55E-05 44.9202 cl09925 PKc_like superfamily C - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#12823 - CGI_10017147 superfamily 214507 395 453 0.000130488 40.106 cl15307 LRRCT superfamily - - Leucine rich repeat C-terminal domain; Leucine rich repeat C-terminal domain. Q#12824 - CGI_10017148 superfamily 218140 202 643 2.84E-155 458.986 cl04579 Anoctamin superfamily - - "Calcium-activated chloride channel; The family carries eight putative transmembrane domains, and, although it has no similarity to other known channel proteins, it is clearly a calcium-activated ionic channel. It is expressed in various secretory epithelia, the retina and sensory neurons, and mediates receptor-activated chloride currents in diverse physiological processes." Q#12825 - CGI_10017149 superfamily 243072 189 298 3.95E-21 90.1354 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#12825 - CGI_10017149 superfamily 243072 78 233 1.85E-20 88.2094 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. 
The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#12827 - CGI_10017151 superfamily 245226 90 250 3.10E-64 200.497 cl10012 DnaQ_like_exo superfamily - - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#12828 - CGI_10017152 superfamily 216082 75 263 7.88E-30 119.86 cl08284 Glyco_hydro_15 superfamily C - Glycosyl hydrolases family 15; In higher organisms this family is represented by phosphorylase kinase subunits. 
Q#12828 - CGI_10017152 superfamily 213147 296 323 0.00686604 35.6889 cl17040 ADDz superfamily NC - "ADDz for ATRX, Dnmt3 and Dnmt3l PHD-like zinc finger domain; The ADDz zinc finger domain is present in the chromatin-associated proteins cytosine-5-methyltransferase 3 (Dnmt3) and ATRX, a SNF2 type transcription factor protein. The Dnmt3 family includes two active DNA methyltransferases, Dnmt3a and -3b, and one regulatory factor Dnmt3l. DNA methylation is an important epigenetic mechanism involved in diverse biological processes such as embryonic development, gene expression, and genomic imprinting. The ADDz domain is a PHD-like zinc finger motif that contains two parts, a C2-C2 and a PHD-like zinc finger. PHD zinc finger domains have been identified in more than 40 proteins that are mainly involved in chromatin mediated transcriptional control; the classical PHD zinc finger has a C4-H-C3 motif that spans about 50-80 amino acids. In ADDz, the conserved histidine residue of the PHD finger is replaced by a cysteine, and an additional zinc finger C2-C2 like motif is located about twenty residues upstream of the C4-C-C3 motif." Q#12836 - CGI_10017160 superfamily 220695 108 232 0.00762629 36.4027 cl18571 7TM_GPCR_Srx superfamily NC - Serpentine type 7TM GPCR chemoreceptor Srx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srx is part of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#12837 - CGI_10000440 superfamily 247856 28 80 1.43E-07 44.4609 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. 
Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#12839 - CGI_10009804 superfamily 241718 61 181 3.07E-59 189.945 cl00241 IF6 superfamily N - "Ribosome anti-association factor IF6 binds the large ribosomal subunit and prevents the two subunits from associating during translation initiation. IF6 comprises a family of translation factors that includes both eukaryotic (eIF6) and archaeal (aIF6) members. All members of this family have a conserved pentameric fold referred to as a beta/alpha propeller. The eukaryotic IF6 members have a moderately conserved C-terminal extension which is not required for ribosomal binding, and may have an alternative function." Q#12842 - CGI_10009807 superfamily 245213 1603 1639 0.000237706 41.083 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12842 - CGI_10009807 superfamily 245213 1717 1753 0.00078648 39.5422 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. 
Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12842 - CGI_10009807 superfamily 245213 157 192 0.00163323 38.7718 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12842 - CGI_10009807 superfamily 245213 783 818 0.00192965 38.3866 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12842 - CGI_10009807 superfamily 245213 600 635 0.00307838 38.0014 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. 
Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12843 - CGI_10009809 superfamily 220692 109 254 6.81E-07 49.1249 cl18570 7TM_GPCR_Srw superfamily NC - Serpentine type 7TM GPCR chemoreceptor Srw; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srw is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. The genes encoding Srw do not appear to be under as strong an adaptive evolutionary pressure as those of Srz. Q#12844 - CGI_10009810 superfamily 244843 1 119 2.89E-20 84.9752 cl08040 Ggt superfamily N - Gamma-glutamyltransferase [Amino acid transport and metabolism] Q#12845 - CGI_10009811 superfamily 192535 165 251 2.75E-05 43.3534 cl18179 7TM_GPCR_Srsx superfamily N - Serpentine type 7TM GPCR chemoreceptor Srsx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srsx is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#12846 - CGI_10009812 superfamily 241832 9 128 1.06E-56 175.552 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. 
They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#12847 - CGI_10009813 superfamily 241832 29 170 2.51E-48 155.136 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#12848 - CGI_10009814 superfamily 202668 477 576 3.58E-30 116.995 cl04110 BK_channel_a superfamily - - Calcium-activated BK potassium channel alpha subunit; Calcium-activated BK potassium channel alpha subunit. 
Q#12848 - CGI_10009814 superfamily 219619 247 322 1.29E-11 62.6103 cl18518 Ion_trans_2 superfamily - - Ion channel; This family includes the two membrane helix type ion channels found in bacteria. Q#12849 - CGI_10009815 superfamily 247984 47 232 9.88E-51 165.476 cl17430 FtsJ superfamily - - "FtsJ-like methyltransferase; This family consists of FtsJ from various bacterial and archaeal sources FtsJ is a methyltransferase, but actually has no effect on cell division. FtsJ's substrate is the 23S rRNA. The 1.5 A crystal structure of FtsJ in complex with its cofactor S-adenosylmethionine revealed that FtsJ has a methyltransferase fold. This family also includes the N terminus of flaviviral NS5 protein. It has been hypothesised that the N-terminal domain of NS5 is a methyltransferase involved in viral RNA capping." Q#12850 - CGI_10009816 superfamily 207654 246 317 2.76E-22 88.2686 cl02574 Annexin superfamily - - Annexin; This family of annexins also includes giardin that has been shown to function as an annexin. Q#12850 - CGI_10009816 superfamily 207654 88 152 3.05E-20 82.4906 cl02574 Annexin superfamily - - Annexin; This family of annexins also includes giardin that has been shown to function as an annexin. Q#12850 - CGI_10009816 superfamily 207654 21 80 2.28E-18 77.483 cl02574 Annexin superfamily - - Annexin; This family of annexins also includes giardin that has been shown to function as an annexin. Q#12850 - CGI_10009816 superfamily 207654 172 236 1.42E-12 61.6898 cl02574 Annexin superfamily - - Annexin; This family of annexins also includes giardin that has been shown to function as an annexin. Q#12852 - CGI_10004212 superfamily 241577 24 188 1.19E-82 245.116 cl00056 MH2 superfamily - - "C-terminal Mad Homology 2 (MH2) domain; The MH2 domain is found in the SMAD (small mothers against decapentaplegic) family of proteins and is responsible for type I receptor interactions, phosphorylation-triggered homo- and hetero-oligomerization, and transactivation. 
It is negatively regulated by the N-terminal MH1 domain which prevents it from forming a complex with SMAD4. The MH2 domain is multifunctional and provides SMADs with their specificity and selectivity, as well as transcriptional activity. Several transcriptional co-activators and repressors have also been reported to regulate SMAD signaling by interacting with the MH2 domain. Mutations in the MH2 domains of SMAD2 and especially SMAD4 have been detected in colorectal and other human cancers." Q#12855 - CGI_10004215 superfamily 245867 9 94 0.00217274 35.2406 cl12090 DUF2156 superfamily NC - "Uncharacterized conserved protein (DUF2156); This domain, found in various hypothetical prokaryotic proteins, has no known function." Q#12856 - CGI_10011484 superfamily 245225 207 667 2.20E-49 182.828 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily - - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein couples receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine- binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. 
The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#12857 - CGI_10011485 superfamily 245225 58 147 0.00012627 40.6891 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily C - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein couples receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine- binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. 
The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#12858 - CGI_10011486 superfamily 219078 32 127 1.33E-05 40.9575 cl12333 DUF1113 superfamily - - Protein of unknown function (DUF1113); This family consists of several bacterial proteins of unknown function. Q#12859 - CGI_10011487 superfamily 243034 95 188 7.49E-19 81.6575 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p 
receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#12859 - CGI_10011487 superfamily 222431 327 416 6.08E-22 89.5889 cl16447 RPAP3_C superfamily - - Potential Monad-binding region of RPAP3; This domain is found at the C-terminus of RNA-polymerase II-associated proteins. These proteins bind to Monad and are involved in regulating apoptosis. They contain TPR-repeats towards the N_terminus. Q#12860 - CGI_10011488 superfamily 241563 68 104 2.64E-05 42.0812 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#12862 - CGI_10011490 superfamily 245864 149 261 9.72E-27 111.987 cl12078 p450 superfamily N - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. 
Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#12863 - CGI_10011491 superfamily 245864 88 360 8.95E-59 198.656 cl12078 p450 superfamily N - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#12866 - CGI_10011494 superfamily 243134 134 254 4.19E-39 133.929 cl02663 Fasciclin superfamily - - "Fasciclin domain; This extracellular domain is found repeated four times in grasshopper fasciclin I as well as in proteins from mammals, sea urchins, plants, yeast and bacteria." Q#12866 - CGI_10011494 superfamily 243134 2 120 9.16E-26 98.4903 cl02663 Fasciclin superfamily - - "Fasciclin domain; This extracellular domain is found repeated four times in grasshopper fasciclin I as well as in proteins from mammals, sea urchins, plants, yeast and bacteria." 
Q#12869 - CGI_10011497 superfamily 243064 24 198 2.59E-11 58.1958 cl02512 NTR_like superfamily - - "NTR_like domain; a beta barrel with an oligosaccharide/oligonucleotide-binding fold found in netrins, complement proteins, tissue inhibitors of metalloproteases (TIMP), and procollagen C-proteinase enhancers (PCOLCE), amongst others. In netrins, the domain plays a role in controlling axon branching in neural development, while the common function of these modules in TIMPs appears to be binding to metzincins. A subset of this family is also known as the C345C domain because it occurs as a C-terminal domain in complement C3, C4 and C5. In C5, the domain interacts with various partners during the formation of the membrane attack complex." Q#12870 - CGI_10011498 superfamily 243064 22 202 3.19E-16 72.4482 cl02512 NTR_like superfamily - - "NTR_like domain; a beta barrel with an oligosaccharide/oligonucleotide-binding fold found in netrins, complement proteins, tissue inhibitors of metalloproteases (TIMP), and procollagen C-proteinase enhancers (PCOLCE), amongst others. In netrins, the domain plays a role in controlling axon branching in neural development, while the common function of these modules in TIMPs appears to be binding to metzincins. A subset of this family is also known as the C345C domain because it occurs as a C-terminal domain in complement C3, C4 and C5. In C5, the domain interacts with various partners during the formation of the membrane attack complex." 
Q#12871 - CGI_10011499 superfamily 243092 2 235 3.52E-38 136.696 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#12872 - CGI_10011500 superfamily 110440 523 550 0.00682929 34.6909 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase); proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." 
Q#12873 - CGI_10011501 superfamily 222150 163 188 6.44E-05 39.6825 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#12873 - CGI_10011501 superfamily 222150 219 244 0.000793315 36.6009 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#12873 - CGI_10011501 superfamily 222150 191 215 0.00286382 34.6749 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#12876 - CGI_10002962 superfamily 214531 20 62 8.61E-09 47.5965 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#12876 - CGI_10002962 superfamily 214531 64 105 1.43E-06 41.4333 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#12877 - CGI_10004601 superfamily 243051 173 295 8.45E-22 90.1297 cl02479 MAM superfamily C - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. 
In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal cord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#12877 - CGI_10004601 superfamily 241571 7 123 8.70E-08 49.3331 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#12877 - CGI_10004601 superfamily 245213 131 155 0.000286409 38.2266 cl09941 EGF_CA superfamily C - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#12879 - CGI_10004603 superfamily 218200 28 259 1.12E-71 229.561 cl04660 Glyco_transf_54 superfamily - - "N-Acetylglucosaminyltransferase-IV (GnT-IV) conserved region; The complex-type of oligosaccharides are synthesised through elongation by glycosyltransferases after trimming of the precursor oligosaccharides transferred to proteins in the endoplasmic reticulum. N-Acetylglucosaminyltransferases (GnTs) take part in the formation of branches in the biosynthesis of complex-type sugar chains. In vertebrates, six GnTs, designated as GnT-I to -VI, which catalyze the transfer of GlcNAc to the core mannose residues of Asn-linked sugar chains, have been identified. 
GnT-IV (EC:2.4.1.145) catalyzes the transfer of GlcNAc from UDP-GlcNAc to the GlcNAc1-2Man1-3 arm of core oligosaccharide [Gn2(22)core oligosaccharide] and forms GlcNAc1-4(GlcNAc1-2)Man1-3 structure on the core oligosaccharide (Gn3(2,4,2)core oligosaccharide). In some members the conserved region occupies all but the very for N-terminal, where there is a signal sequence on all members. For other members the conserved region does not occupy the entire protein but is still to the N-terminus of the protein." Q#12881 - CGI_10004605 superfamily 222150 401 426 0.00561638 35.4453 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#12882 - CGI_10004606 superfamily 248458 30 409 9.53E-24 100.466 cl17904 MFS superfamily - - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 
Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#12886 - CGI_10010161 superfamily 248022 1 122 5.13E-09 51.8947 cl17468 Aa_trans superfamily N - "Transmembrane amino acid transporter protein; This transmembrane region is found in many amino acid transporters including UNC-47 and MTR. UNC-47 encodes a vesicular amino butyric acid (GABA) transporter, (VGAT). UNC-47 is predicted to have 10 transmembrane domains. MTR is a N system amino acid transporter system protein involved in methyltryptophan resistance. Other members of this family include proline transporters and amino acid permeases." Q#12887 - CGI_10010162 superfamily 199166 237 421 7.24E-10 57.7224 cl15308 AMN1 superfamily - - "Antagonist of mitotic exit network protein 1; Amn1 has been functionally characterized in Saccharomyces cerevisiae as a component of the Antagonist of MEN pathway (AMEN). The AMEN network is activated by MEN (mitotic exit network) via an active Cdc14, and in turn switches off MEN. Amn1 constitutes one of the alternative mechanisms by which MEN may be disrupted. Specifically, Amn1 binds Tem1 (Termination of M-phase, a GTPase that belongs to the RAS superfamily), and disrupts its association with Cdc15, the primary downstream target. Amn1 is a leucine-rich repeat (LRR) protein, with 12 repeats in the S. cerevisiae ortholog. As a negative regulator of the signal transduction pathway MEN, overexpression of AMN1 slows the growth of wild type cells. The function of the vertebrate members of this family has not been determined experimentally, they have fewer LRRs that determine the extent of this model." 
Q#12887 - CGI_10010162 superfamily 199166 144 315 4.76E-08 52.3296 cl15308 AMN1 superfamily N - "Antagonist of mitotic exit network protein 1; Amn1 has been functionally characterized in Saccharomyces cerevisiae as a component of the Antagonist of MEN pathway (AMEN). The AMEN network is activated by MEN (mitotic exit network) via an active Cdc14, and in turn switches off MEN. Amn1 constitutes one of the alternative mechanisms by which MEN may be disrupted. Specifically, Amn1 binds Tem1 (Termination of M-phase, a GTPase that belongs to the RAS superfamily), and disrupts its association with Cdc15, the primary downstream target. Amn1 is a leucine-rich repeat (LRR) protein, with 12 repeats in the S. cerevisiae ortholog. As a negative regulator of the signal transduction pathway MEN, overexpression of AMN1 slows the growth of wild type cells. The function of the vertebrate members of this family has not been determined experimentally, they have fewer LRRs that determine the extent of this model." Q#12887 - CGI_10010162 superfamily 243074 89 129 0.000409218 38.2565 cl02535 F-box-like superfamily - - F-box-like; This is an F-box-like family. Q#12888 - CGI_10010163 superfamily 245230 2 426 0 951.72 cl10017 Tubulin_FtsZ superfamily - - "Tubulin/FtsZ: Family includes tubulin alpha-, beta-, gamma-, delta-, and epsilon-tubulins as well as FtsZ, all of which are involved in polymer formation. Tubulin is the major component of microtubules, but also exists as a heterodimer and as a curved oligomer. Microtubules exist in all eukaryotic cells and are responsible for many functions, including cellular transport, cell motility, and mitosis. FtsZ forms a ring-shaped septum at the site of bacterial cell division, which is required for constriction of cell membrane and cell envelope to yield two daughter cells. FtsZ can polymerize into tubes, sheets, and rings in vitro and is ubiquitous in eubacteria, archaea, and chloroplasts." 
Q#12889 - CGI_10010164 superfamily 243082 438 834 6.84E-89 288.236 cl02553 Peptidase_C19 superfamily - - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." Q#12890 - CGI_10010165 superfamily 241799 30 267 1.49E-132 378.751 cl00339 SugarP_isomerase superfamily - - "SugarP_isomerase: Sugar Phosphate Isomerase family; includes type A ribose 5-phosphate isomerase (RPI_A), glucosamine-6-phosphate (GlcN6P) deaminase, and 6-phosphogluconolactonase (6PGL). RPI catalyzes the reversible conversion of ribose-5-phosphate to ribulose 5-phosphate, the first step of the non-oxidative branch of the pentose phosphate pathway. GlcN6P deaminase catalyzes the reversible conversion of GlcN6P to D-fructose-6-phosphate (Fru6P) and ammonium, the last step of the metabolic pathway of N-acetyl-D-glucosamine-6-phosphate. 6PGL converts 6-phosphoglucono-1,5-lactone to 6-phosphogluconate, the second step of the oxidative phase of the pentose phosphate pathway." Q#12891 - CGI_10010166 superfamily 247743 198 321 2.21E-12 64.4747 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. 
Members of the AAA+ ATPases function as molecular chaperones, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#12891 - CGI_10010166 superfamily 243072 71 178 3.91E-12 63.5566 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#12891 - CGI_10010166 superfamily 209247 393 472 3.43E-09 53.9397 cl11083 ClpB_D2-small superfamily - - "C-terminal, D2-small domain, of ClpB protein; This is the C-terminal domain of ClpB protein, referred to as the D2-small domain, and is a mixed alpha-beta structure. Compared with the D1-small domain (included in AAA, pfam00004) it lacks the long coiled-coil insertion, and instead of helix C4 contains a beta-strand (e3) that is part of a three stranded beta-pleated sheet. In Thermophilus the whole protein forms a hexamer with the D1-small and D2-small domains located on the outside of the hexamer, with the long coiled-coil being exposed on the surface. The D2-small domain is essential for oligomerisation, forming a tight interface with the D2-large domain of a neighboring subunit and thereby providing enough binding energy to stabilise the functional assembly. The domain is associated with two Clp_N, pfam02861, at the N-terminus as well as AAA, pfam00004 and AAA_2, pfam07724." 
Q#12892 - CGI_10010167 superfamily 248458 1196 1282 0.000683428 42.3009 cl17904 MFS superfamily NC - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#12892 - CGI_10010167 superfamily 222006 677 753 2.85E-08 52.995 cl16182 Hydrolase_like2 superfamily N - Putative hydrolase of sodium-potassium ATPase alpha subunit; This is a putative hydrolase of the sodium-potassium ATPase alpha subunit. 
Q#12892 - CGI_10010167 superfamily 215733 96 289 0.000271947 42.5523 cl02811 E1-E2_ATPase superfamily C - E1-E2 ATPase; E1-E2 ATPase. Q#12892 - CGI_10010167 superfamily 241762 814 881 0.00107796 38.8447 cl00297 R3H superfamily - - "R3H domain. The name of the R3H domain comes from the characteristic spacing of the most conserved arginine and histidine residues. R3H domains are found in proteins together with ATPase domains, SF1 helicase domains, SF2 DEAH helicase domains, Cys-rich repeats, ring-type zinc fingers, and KH domains. The function of the domain is predicted to bind ssDNA or ssRNA in a sequence-specific manner." Q#12892 - CGI_10010167 superfamily 226572 976 1024 0.00458332 37.9236 cl18761 COG4087 superfamily N - Soluble P-type ATPase [General function prediction only] Q#12893 - CGI_10010168 superfamily 244363 22 126 6.77E-30 107.499 cl06336 Commd superfamily N - "COMM_Domain, a family of domains found at the C-terminus of HCarG, the copper metabolism gene MURR1 product, and related proteins. Presumably all COMM_Domain containing proteins are located in the nucleus and the COMM domain plays a role in protein-protein interactions. Several family members have been shown to bind and inhibit NF-kappaB. Murr1/Commd1 is a protein involved in copper homeostasis, which has also been identified as a regulator of the human delta epithelial sodium channel. HCaRG, a nuclear protein that might be involved in cell proliferation, is negatively regulated by extracellular calcium concentration, and its basal mRNA levels are higher in hypertensive animals." 
Q#12896 - CGI_10010171 superfamily 241659 61 129 7.84E-10 55.2187 cl00175 alpha-crystallin-Hsps_p23-like superfamily - - "alpha-crystallin domain (ACD) found in alpha-crystallin-type small heat shock proteins, and a similar domain found in p23 (a cochaperone for Hsp90) and in other p23-like proteins.; The alpha-crystallin-Hsps_p23-like superfamily includes the alpha-crystallin domain (ACD) of alpha-crystallin-type small heat shock proteins (sHsps) and a similar domain found in p23-like proteins. sHsps are small stress induced proteins with monomeric masses between 12-43 kDa, whose common feature is this ACD. sHsps are generally active as large oligomers consisting of multiple subunits, and are believed to be ATP-independent chaperones that prevent aggregation and are important in refolding in combination with other Hsps. p23 is a cochaperone of the Hsp90 chaperoning pathway. It binds Hsp90 and participates in the folding of a number of Hsp90 clients including the progesterone receptor. p23 also has a passive chaperoning activity. p23 in addition may act as the cytosolic prostaglandin E2 synthase. Included in this superfamily is the p23-like C-terminal CHORD-SGT1 (CS) domain of suppressor of G2 allele of Skp1 (Sgt1) and the p23-like domains of human butyrate-induced transcript 1 (hB-ind1), NUD (nuclear distribution) C, Melusin, and NAD(P)H cytochrome b5 (NCB5) oxidoreductase (OR)." Q#12897 - CGI_10010172 superfamily 247856 163 215 4.12E-07 47.1573 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#12897 - CGI_10010172 superfamily 246925 1 141 4.24E-11 62.7582 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#12898 - CGI_10010173 superfamily 221416 62 223 2.46E-27 110.58 cl13517 Spt20 superfamily - - Spt20 family; This presumed domain is found in the Spt20 proteins from both human and yeast. The Spt20 protein is part of the SAGA complex which is a large complex mediating histone deacetylation. Yeast Spt20 has been shown to play a role in structural integrity of the SAGA complex, as no intact SAGA could be purified in spt20 deletion strains. Q#12900 - CGI_10027190 superfamily 243089 94 258 1.80E-42 153.953 cl02564 PXA superfamily - - PXA domain; This domain is associated with PX domains pfam00787. Q#12900 - CGI_10027190 superfamily 149621 839 946 2.50E-33 125.475 cl07303 Nexin_C superfamily - - Sorting nexin C terminal; This region is found at the C terminal of proteins belonging to the sorting nexin family. It is found on proteins which also contain pfam00787. Q#12900 - CGI_10027190 superfamily 243088 642 765 7.94E-30 115.934 cl02563 PX_domain superfamily - - "The Phox Homology domain, a phosphoinositide binding module; The PX domain is a phosphoinositide (PI) binding module involved in targeting proteins to membranes. 
Proteins containing PX domains interact with PIs and have been implicated in highly diverse functions such as cell signaling, vesicular trafficking, protein sorting, lipid modification, cell polarity and division, activation of T and B cells, and cell survival. Many members of this superfamily bind phosphatidylinositol-3-phosphate (PI3P) but in some cases, other PIs such as PI4P or PI(3,4)P2, among others, are the preferred substrates. In addition to protein-lipid interaction, the PX domain may also be involved in protein-protein interaction, as in the cases of p40phox, p47phox, and some sorting nexins (SNXs). The PX domain is conserved from yeast to humans and is found in more than 100 proteins. The majority of PX domain-containing proteins are SNXs, which play important roles in endosomal sorting." Q#12900 - CGI_10027190 superfamily 243090 390 463 1.48E-16 77.4541 cl02565 RGS superfamily N - "Regulator of G protein signaling (RGS) domain superfamily; The RGS domain is an essential part of the Regulator of G-protein Signaling (RGS) protein family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play critical regulatory roles as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. While inactive, G-alpha-subunits bind GDP, which is released and replaced by GTP upon agonist activation. GTP binding leads to dissociation of the alpha-subunit and the beta-gamma-dimer, allowing them to interact with effector molecules and propagate signaling cascades associated with cellular growth, survival, migration, and invasion. Deactivation of the G-protein signaling controlled by the RGS domain accelerates GTPase activity of the alpha subunit by hydrolysis of GTP to GDP, which results in the reassociation of the alpha-subunit with the beta-gamma-dimer and thereby inhibition of downstream activity. 
As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. RGS proteins are also involved in apoptosis and cell proliferation, as well as modulation of cardiac development. Several RGS proteins can fine-tune immune responses, while others play important roles in neuronal signals modulation. Some RGS proteins are principal elements needed for proper vision." Q#12901 - CGI_10027191 superfamily 247941 4 123 8.77E-06 41.9377 cl17387 Methyltransf_21 superfamily - - "Methyltransferase FkbM domain; This family has members from bacteria to human, and appears to be a methyltransferase." Q#12902 - CGI_10027192 superfamily 241825 41 180 5.08E-22 91.0623 cl00379 Ribosomal_L18_L5e superfamily - - "Ribosomal L18/L5e: L18 (L5e) is a ribosomal protein found in the central protuberance (CP) of the large subunit. L18 binds 5S rRNA and induces a conformational change that stimulates the binding of L5 to 5S rRNA. Association of 5S rRNA with 23S rRNA depends on the binding of L18 and L5 to 5S rRNA. L18/L5e is generally described as L18 in prokaryotes and archaea, and as L5e (or L5) in eukaryotes. In bacteria, the CP proteins L5, L18, and L25 are required for the ribosome to incorporate 5S rRNA into the large subunit, one of the last steps in ribosome assembly. In archaea, both L18 and L5 bind 5S rRNA; in eukaryotes, only the L18 homolog (L5e) binds 5S rRNA but a homolog to L5 is also identified." Q#12902 - CGI_10027192 superfamily 241825 242 381 5.08E-22 91.0623 cl00379 Ribosomal_L18_L5e superfamily - - "Ribosomal L18/L5e: L18 (L5e) is a ribosomal protein found in the central protuberance (CP) of the large subunit. L18 binds 5S rRNA and induces a conformational change that stimulates the binding of L5 to 5S rRNA. 
Association of 5S rRNA with 23S rRNA depends on the binding of L18 and L5 to 5S rRNA. L18/L5e is generally described as L18 in prokaryotes and archaea, and as L5e (or L5) in eukaryotes. In bacteria, the CP proteins L5, L18, and L25 are required for the ribosome to incorporate 5S rRNA into the large subunit, one of the last steps in ribosome assembly. In archaea, both L18 and L5 bind 5S rRNA; in eukaryotes, only the L18 homolog (L5e) binds 5S rRNA but a homolog to L5 is also identified." Q#12902 - CGI_10027192 superfamily 222592 403 483 6.27E-35 126.128 cl16705 Ribosomal_L18_c superfamily - - Ribosomal L18 C-terminal region; This domain is the C-terminal end of ribosomal L18/L5 proteins. Q#12904 - CGI_10027194 superfamily 247856 47 133 2.47E-05 39.0681 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#12909 - CGI_10027199 superfamily 245604 17 91 2.01E-32 111.086 cl11404 Biotinyl_lipoyl_domains superfamily N - "Biotinyl_lipoyl_domains are present in biotin-dependent carboxylases/decarboxylases, the dihydrolipoyl acyltransferase component (E2) of 2-oxo acid dehydrogenases, and the H-protein of the glycine cleavage system (GCS). These domains transport CO2, acyl, or methylamine, respectively, between components of the complex/protein via a biotinyl or lipoyl group, which is covalently attached to a highly conserved lysine residue." 
Q#12911 - CGI_10027201 superfamily 245226 16 83 6.85E-08 46.5249 cl10012 DnaQ_like_exo superfamily N - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#12912 - CGI_10027202 superfamily 243066 5 91 1.03E-10 53.7109 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. 
The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#12914 - CGI_10027204 superfamily 243072 188 304 2.85E-28 107.855 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#12914 - CGI_10027204 superfamily 243072 80 238 3.35E-28 107.855 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#12914 - CGI_10027204 superfamily 243073 403 444 3.54E-11 58.2759 cl02533 SOCS superfamily - - "SOCS (suppressors of cytokine signaling) box. 
The SOCS box is found in the C-terminal region of CIS/SOCS family proteins (in combination with a SH2 domain), ASBs (ankyrin repeat-containing proteins with a SOCS box), SSBs (SPRY domain-containing proteins with a SOCS box), and WSBs (WD40 repeat-containing proteins with a SOCS box), as well as, other miscellaneous proteins. The function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions." Q#12914 - CGI_10027204 superfamily 243072 31 53 0.00670043 34.452 cl02529 ANK superfamily C - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#12915 - CGI_10027205 superfamily 247724 204 290 0.00539716 36.3343 cl17170 Ras_like_GTPase superfamily N - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#12915 - CGI_10027205 superfamily 247755 41 75 0.00706298 36.0687 cl17201 ABC_ATPase superfamily C - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#12916 - CGI_10027206 superfamily 247724 27 189 7.12E-48 156.847 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. 
This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#12917 - CGI_10027207 superfamily 242323 81 320 2.90E-68 215.661 cl01132 FA_hydroxylase superfamily - - "Fatty acid hydroxylase superfamily; This superfamily includes fatty acid and carotene hydroxylases and sterol desaturases. Beta-carotene hydroxylase is involved in zeaxanthin synthesis by hydroxylating beta-carotene, but the enzyme may be involved in other pathways. This family includes C-5 sterol desaturase and C-4 sterol methyl oxidase. Members of this family are involved in cholesterol biosynthesis and biosynthesis of plant cuticular wax. These enzymes contain two copies of a HXHH motif. Members of this family are integral membrane proteins." Q#12917 - CGI_10027207 superfamily 242849 23 80 5.64E-11 57.5989 cl02041 Cyt-b5 superfamily N - Cytochrome b5-like Heme/Steroid binding domain; This family includes heme binding domains from a diverse range of proteins. This family also includes proteins that bind to steroids. The family includes progesterone receptors. Many members of this subfamily are membrane anchored by an N-terminal transmembrane alpha helix. This family also includes a domain in some chitin synthases. 
There is no known ligand for this domain in the chitin synthases. Q#12920 - CGI_10027210 superfamily 178061 2 281 1.10E-163 458.858 cl18094 PLN02442 superfamily - - S-formylglutathione hydrolase Q#12922 - CGI_10027212 superfamily 243092 35 195 2.10E-23 100.872 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#12922 - CGI_10027212 superfamily 243092 125 393 1.16E-12 67.7452 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#12925 - CGI_10027215 superfamily 246680 12 90 0.00545084 34.0996 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 
They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#12932 - CGI_10027222 superfamily 247684 12 446 0 585.975 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#12935 - CGI_10027225 superfamily 243084 573 675 8.50E-53 177.941 cl02556 Bromodomain superfamily - - Bromodomain. Bromodomains are found in many chromatin-associated proteins and in nuclear histone acetyltransferases. They interact specifically with acetylated lysine. 
Q#12936 - CGI_10027226 superfamily 241568 601 660 1.58E-05 43.6056 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#12936 - CGI_10027226 superfamily 241568 415 472 5.47E-05 42.0648 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#12936 - CGI_10027226 superfamily 241568 476 531 0.000779627 38.598 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. 
The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#12936 - CGI_10027226 superfamily 241568 665 720 0.000892298 38.2128 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#12937 - CGI_10027227 superfamily 248317 134 593 8.02E-104 325.49 cl17763 GDA1_CD39 superfamily - - GDA1/CD39 (nucleoside phosphatase) family; GDA1/CD39 (nucleoside phosphatase) family. 
Q#12938 - CGI_10027228 superfamily 243092 16 328 3.65E-24 99.3316 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#12939 - CGI_10027229 superfamily 221381 213 494 1.25E-93 292.373 cl13455 DUF3508 superfamily - - Domain of unknown function (DUF3508); This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is about 280 amino acids in length. This domain has two conserved sequence motifs: GFC and GLL. This family is also known as UPF0704. Q#12940 - CGI_10027230 superfamily 247683 16 68 3.43E-22 89.5641 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). 
SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#12940 - CGI_10027230 superfamily 246669 417 457 4.34E-12 63.9611 cl14603 C2 superfamily C - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#12941 - CGI_10027231 superfamily 246669 2 163 8.48E-53 180.291 cl14603 C2 superfamily N - "C2 domain; The C2 domain was first identified in PKC. 
C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#12942 - CGI_10027232 superfamily 244307 107 501 1.99E-178 523.044 cl06123 DHR2_DOCK superfamily - - "Dock Homology Region 2, a GEF domain, of Dedicator of Cytokinesis proteins; DOCK proteins comprise a family of atypical guanine nucleotide exchange factors (GEFs) that lack the conventional Dbl homology (DH) domain. As GEFs, they activate the small GTPases Rac and Cdc42 by exchanging bound GDP for free GTP. They are also called the CZH (CED-5, Dock180, and MBC-zizimin homology) family, after the first family members identified. Dock180 was first isolated as a binding partner for the adaptor protein Crk. The Caenorhabditis elegans protein, Ced-5, is essential for cell migration and phagocytosis, while the Drosophila ortholog, Myoblast city (MBC), is necessary for myoblast fusion and dorsal closure. DOCKs are divided into four classes (A-D) based on sequence similarity and domain architecture: class A includes Dock1 (or Dock180), 2 and 5; class B includes Dock3 and 4; class C includes Dock6, 7, and 8; and class D includes Dock9, 10 and 11. 
All DOCKs contain two homology domains: the DHR-1 (Dock homology region-1), also called CZH1, and DHR-2 (also called CZH2 or Docker). This alignment model represents the DHR-2 domain of DOCK proteins, which contains the catalytic GEF activity for Rac and/or Cdc42." Q#12942 - CGI_10027232 superfamily 247999 800 845 0.000135403 40.6582 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#12943 - CGI_10027233 superfamily 248241 42 503 0 569.968 cl17687 5_nucleotid superfamily - - "5' nucleotidase family; This family of eukaryotic proteins includes 5' nucleotidase enzymes, such as purine 5'-nucleotidase EC:3.1.3.5." Q#12944 - CGI_10027234 superfamily 243095 96 301 1.46E-66 218.099 cl02570 RhoGAP superfamily - - "RhoGAP: GTPase-activator protein (GAP) for Rho-like GTPases; GAPs towards Rho/Rac/Cdc42-like small GTPases. Small GTPases (G proteins) cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when bound to GDP. The Rho family of small G proteins, which includes Cdc42Hs, activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. G proteins generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. The RhoGAPs are one of the major classes of regulators of Rho G proteins." Q#12945 - CGI_10027235 superfamily 221838 224 429 1.10E-56 193.257 cl15150 Apc4 superfamily - - "Anaphase-promoting complex, cyclosome, subunit 4; Apc4 is one of the larger of the subunits of the anaphase-promoting complex or cyclosome. 
This family represents the long domain downstream of the WD40 repeat/s that are present on the Apc4 subunits. The anaphase-promoting complex is a multiprotein subunit E3 ubiquitin ligase complex that controls segregation of chromosomes and exit from mitosis in eukaryotes. Results in C. elegans show that the primary essential role of the spindle assembly checkpoint is not in the chromosome segregation process itself but rather in delaying anaphase onset until all chromosomes are properly attached to the spindle. The APC/C is likely to be required for all metaphase-to-anaphase transitions in a multicellular organism." Q#12945 - CGI_10027235 superfamily 205130 6 52 5.40E-14 67.7027 cl14918 Apc4_WD40 superfamily - - "Anaphase-promoting complex subunit 4 WD40 domain; Apc4 contains an N-terminal propeller-shaped WD40 domain. The N-terminus of Afi1 serves to stabilise the union between Apc4 and Apc5, both of which lie towards the bottom-front of the APC," Q#12946 - CGI_10027236 superfamily 222551 65 188 1.69E-06 43.9753 cl16625 DUF4285 superfamily - - Domain of unknown function (DUF4285); This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 157 and 206 amino acids in length. Q#12950 - CGI_10027240 superfamily 243263 21 410 1.13E-71 234.61 cl02990 ASC superfamily - - Amiloride-sensitive sodium channel; Amiloride-sensitive sodium channel. Q#12951 - CGI_10027241 superfamily 242166 78 174 2.67E-34 118.109 cl00881 SQR_QFR_TM superfamily - - "Succinate:quinone oxidoreductase (SQR) and Quinol:fumarate reductase (QFR) family, transmembrane subunits; SQR catalyzes the oxidation of succinate to fumarate coupled to the reduction of quinone to quinol, while QFR catalyzes the reverse reaction.
SQR, also called succinate dehydrogenase or Complex II, is part of the citric acid cycle and the aerobic respiratory chain, while QFR is involved in anaerobic respiration with fumarate as the terminal electron acceptor. SQRs may reduce either high or low potential quinones while QFRs oxidize only low potential quinols. SQR and QFR share a common subunit arrangement, composed of a flavoprotein catalytic subunit, an iron-sulfur protein and one or two hydrophobic transmembrane subunits. The structural arrangement allows efficient electron transfer between the catalytic subunit, through iron-sulfur centers, and the transmembrane subunit(s) containing the electron donor/acceptor (quinol or quinone). The reversible reduction of quinone is an essential feature of respiration, allowing the transfer of electrons between respiratory complexes. SQRs and QFRs can be classified into five types (A-E) according to the number of their hydrophobic subunits and heme groups. This classification is consistent with the characteristics and phylogeny of the catalytic and iron-sulfur subunits. Type E proteins, e.g. non-classical archaeal SQRs, contain atypical transmembrane subunits and are not included in this hierarchy. The heme and quinone binding sites reside in the transmembrane subunits. Although succinate oxidation and fumarate reduction are carried out by separate enzymes in most organisms, some bifunctional enzymes that exhibit both SQR and QFR activities exist." Q#12952 - CGI_10027242 superfamily 149071 266 344 1.52E-11 61.9501 cl06710 DUF1647 superfamily N - Protein of unknown function (DUF1647); The sequences making up this family are all derived from hypothetical proteins expressed by C. elegans. The region in question is approximately 160 amino acids long. The GO annotation for this protein indicates the protein to be involved in nematode larval development and to have a positive regulation on growth rate.
Q#12957 - CGI_10027247 superfamily 247744 1333 1509 3.01E-61 208.155 cl17190 NK superfamily - - "Nucleoside/nucleotide kinase (NK) is a protein superfamily consisting of multiple families of enzymes that share structural similarity and are functionally related to the catalysis of the reversible phosphate group transfer from nucleoside triphosphates to nucleosides/nucleotides, nucleoside monophosphates, or sugars. Members of this family play a wide variety of essential roles in nucleotide metabolism, the biosynthesis of coenzymes and aromatic compounds, as well as the metabolism of sugar and sulfate." Q#12957 - CGI_10027247 superfamily 203750 268 393 3.95E-29 115.87 cl18248 Sad1_UNC superfamily - - "Sad1 / UNC-like C-terminal; The C. elegans UNC-84 protein is a nuclear envelope protein that is involved in nuclear anchoring and migration during development. The S. pombe Sad1 protein localises at the spindle pole body. UNC-84 and Sad1 share a common C-terminal region, that is often termed the SUN (Sad1 and UNC) domain. In mammals, the SUN domain is present in two proteins, Sun1 and Sun2. The SUN domain of Sun2 has been demonstrated to be in the periplasm." Q#12958 - CGI_10027248 superfamily 244843 2 36 3.22E-05 41.8329 cl08040 Ggt superfamily NC - Gamma-glutamyltransferase [Amino acid transport and metabolism] Q#12961 - CGI_10027251 superfamily 244843 61 194 6.97E-29 111.939 cl08040 Ggt superfamily N - Gamma-glutamyltransferase [Amino acid transport and metabolism] Q#12961 - CGI_10027251 superfamily 244843 2 74 0.000525808 38.7513 cl08040 Ggt superfamily NC - Gamma-glutamyltransferase [Amino acid transport and metabolism] Q#12962 - CGI_10027252 superfamily 191217 1 67 4.64E-30 103.063 cl07859 zf-DNL superfamily - - DNL zinc finger; The domain is named after a short C-terminal motif of D(N/H)L. This domain is a novel zinc-finger protein essential for protein import into mitochondria.
Q#12963 - CGI_10027253 superfamily 216212 280 802 0 675.162 cl03037 HCO3_cotransp superfamily - - HCO3- transporter family; This family contains Band 3 anion exchange proteins that exchange Cl-/HCO3-. This family also includes cotransporters of Na+/HCO3-. Q#12964 - CGI_10027254 superfamily 216212 397 923 0 734.097 cl03037 HCO3_cotransp superfamily - - HCO3- transporter family; This family contains Band 3 anion exchange proteins that exchange Cl-/HCO3-. This family also includes cotransporters of Na+/HCO3-. Q#12967 - CGI_10027257 superfamily 241568 144 169 0.00724393 32.82 cl00043 CCP superfamily N - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#12967 - CGI_10027257 superfamily 241619 13 73 0.00998409 32.5541 cl00112 PAN_APPLE superfamily - - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions."
Q#12970 - CGI_10027260 superfamily 247724 1 169 9.62E-101 291.415 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#12971 - CGI_10027261 superfamily 243078 6 114 6.62E-36 134.647 cl02544 VHS_ENTH_ANTH superfamily - - "VHS, ENTH and ANTH domain superfamily; composed of proteins containing a VHS, ENTH or ANTH domain. The VHS domain is present in Vps27 (Vacuolar Protein Sorting), Hrs (Hepatocyte growth factor-regulated tyrosine kinase substrate) and STAM (Signal Transducing Adaptor Molecule). It is located at the N-termini of proteins involved in intracellular membrane trafficking. The epsin N-terminal homology (ENTH) domain is an evolutionarily conserved protein module found primarily in proteins that participate in clathrin-mediated endocytosis. 
A set of proteins previously designated as harboring an ENTH domain in fact contains a highly similar, yet unique module referred to as an AP180 N-terminal homology (ANTH) domain. VHS, ENTH and ANTH domains are structurally similar and are composed of a superhelix of eight alpha helices. ENTH and ANTH (E/ANTH) domains bind both inositol phospholipids and proteins and contribute to the nucleation and formation of clathrin coats on membranes. ENTH domains also function in the development of membrane curvature through lipid remodeling during the formation of clathrin-coated vesicles. E/ANTH domain-bearing proteins have recently been shown to function with adaptor protein-1 and GGA adaptors at the trans-Golgi network, which suggests that E/ANTH domains are universal components of the machinery for clathrin-mediated membrane budding." Q#12972 - CGI_10027262 superfamily 247724 27 196 6.10E-120 342.11 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization.
The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#12973 - CGI_10027263 superfamily 243034 53 137 7.90E-06 45.0636 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#12973 - CGI_10027263 superfamily 243034 375 437 0.00125068 38.13 cl02429 TPR superfamily N - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found
in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#12974 - CGI_10027264 superfamily 247771 22 210 1.26E-79 249.093 cl17217 malate_synt superfamily C - "Malate synthase catalyzes the Claisen condensation of glyoxylate and acetyl-CoA to malyl-CoA , which hydrolyzes to malate and CoA. This reaction is part of the glyoxylate cycle, which allows certain organisms, like plants and fungi, to derive their carbon requirements from two-carbon compounds, by bypassing the two carboxylation steps of the citric acid cycle." Q#12975 - CGI_10027265 superfamily 245009 18 134 2.42E-28 101.976 cl09109 NTF2_like superfamily - - "Nuclear transport factor 2 (NTF2-like) superfamily. This family includes members of the NTF2 family, Delta-5-3-ketosteroid isomerases, Scytalone Dehydratases, and the beta subunit of Ring hydroxylating dioxygenases.
This family is a classic example of divergent evolution wherein the proteins have many common structural details but diverge greatly in their function. For example, nuclear transport factor 2 (NTF2) mediates the nuclear import of RanGDP and binds to both RanGDP and FxFG repeat-containing nucleoporins while Ketosteroid isomerases catalyze the isomerization of delta-5-3-ketosteroid to delta-4-3-ketosteroid, by intramolecular transfer of the C4-beta proton to the C6-beta position. While the function of the beta sub-unit of the Ring hydroxylating dioxygenases is not known, Scytalone Dehydratases catalyzes two reactions in the biosynthetic pathway that produces fungal melanin. Members of the NTF2-like superfamily are widely distributed among bacteria, archaea and eukaryotes." Q#12981 - CGI_10016599 superfamily 241600 300 510 4.66E-78 245.998 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#12981 - CGI_10016599 superfamily 241600 74 250 2.70E-68 220.575 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. 
The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#12983 - CGI_10016601 superfamily 241609 30 110 2.26E-20 81.6555 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#12983 - CGI_10016601 superfamily 241609 115 188 1.21E-16 71.2674 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#12987 - CGI_10016606 superfamily 241691 236 356 9.64E-05 41.3436 cl00213 DNA_BRE_C superfamily N - "DNA breaking-rejoining enzymes, C-terminal catalytic domain. The DNA breaking-rejoining enzyme superfamily includes type IB topoisomerases and tyrosine recombinases that share the same fold in their catalytic domain containing six conserved active site residues. The best-studied members of this diverse superfamily include human topoisomerase I, the bacteriophage lambda integrase, the bacteriophage P1 Cre recombinase, the yeast Flp recombinase and the bacterial XerD/C recombinases. 
Their overall reaction mechanism is essentially identical and involves cleavage of a single strand of a DNA duplex by nucleophilic attack of a conserved tyrosine to give a 3' phosphotyrosyl protein-DNA adduct. In the second rejoining step, a terminal 5' hydroxyl attacks the covalent adduct to release the enzyme and generate duplex DNA. The enzymes differ in that topoisomerases cleave and then rejoin the same 5' and 3' termini, whereas a site-specific recombinase transfers a 5' hydroxyl generated by recombinase cleavage to a new 3' phosphate partner located in a different duplex region. Many DNA breaking-rejoining enzymes also have N-terminal domains, which show little sequence or structure similarity." Q#12988 - CGI_10016607 superfamily 241640 11 59 1.15E-23 90.0282 cl00149 Tryp_SPc superfamily N - Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues. Q#12989 - CGI_10016608 superfamily 245206 40 295 3.14E-146 415.455 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. 
Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#12990 - CGI_10016609 superfamily 216421 104 384 4.46E-47 164.133 cl03153 Lamp superfamily - - Lysosome-associated membrane glycoprotein (Lamp); Lysosome-associated membrane glycoprotein (Lamp). Q#12993 - CGI_10016612 superfamily 217615 60 232 1.09E-65 209.146 cl04158 Allantoicase superfamily - - "Allantoicase repeat; This family is found in pairs in Allantoicases, forming the majority of the protein. These proteins allow the use of purines as secondary nitrogen sources in nitrogen-limiting conditions through the reaction: allantoate + H(2)O = (-)-ureidoglycolate + urea." Q#12993 - CGI_10016612 superfamily 217615 255 413 4.91E-51 170.626 cl04158 Allantoicase superfamily - - "Allantoicase repeat; This family is found in pairs in Allantoicases, forming the majority of the protein. These proteins allow the use of purines as secondary nitrogen sources in nitrogen-limiting conditions through the reaction: allantoate + H(2)O = (-)-ureidoglycolate + urea." Q#12997 - CGI_10009619 superfamily 241752 80 168 5.88E-28 102.783 cl00283 ADP_ribosyl superfamily C - "ADP_ribosylating enzymes catalyze the transfer of ADP_ribose from NAD+ to substrates. Bacterial toxins are cytoplasmic and catalyze the transfer of a single ADP_ribose unit to eukaryotic elongation factor 2, halting protein synthesis and killing the cell. Poly(ADP-ribose) polymerases (PARPS 1-3, VPARP, tankyrase) catalyze the addition of up to 100 ADP_ribose units from NAD+. PARPs 1 and 2 are localized in the nucleus, bind DNA, and are activated by DNA damage. VPARP is part of the vault ribonucleoprotein complex. Tankyrases regulate telomere length in part through poly(ADP_ribosylation) of telomere repeat binding factor 1 (TRF1).
Poly(ADP-ribose) polymerase catalyses the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active." Q#12998 - CGI_10009620 superfamily 241752 102 214 1.18E-41 143.999 cl00283 ADP_ribosyl superfamily - - "ADP_ribosylating enzymes catalyze the transfer of ADP_ribose from NAD+ to substrates. Bacterial toxins are cytoplasmic and catalyze the transfer of a single ADP_ribose unit to eukaryotic elongation factor 2, halting protein synthesis and killing the cell. Poly(ADP-ribose) polymerases (PARPS 1-3, VPARP, tankyrase) catalyze the addition of up to 100 ADP_ribose units from NAD+. PARPs 1 and 2 are localized in the nucleus, bind DNA, and are activated by DNA damage. VPARP is part of the vault ribonucleoprotein complex. Tankyrases regulate telomere length in part through poly(ADP_ribosylation) of telomere repeat binding factor 1 (TRF1). Poly(ADP-ribose) polymerase catalyses the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active." Q#12999 - CGI_10009621 superfamily 241752 815 929 4.36E-38 139.762 cl00283 ADP_ribosyl superfamily - - "ADP_ribosylating enzymes catalyze the transfer of ADP_ribose from NAD+ to substrates. Bacterial toxins are cytoplasmic and catalyze the transfer of a single ADP_ribose unit to eukaryotic elongation factor 2, halting protein synthesis and killing the cell.
Poly(ADP-ribose) polymerases (PARPS 1-3, VPARP, tankyrase) catalyze the addition of up to 100 ADP_ribose units from NAD+. PARPs 1 and 2 are localized in the nucleus, bind DNA, and are activated by DNA damage. VPARP is part of the vault ribonucleoprotein complex. Tankyrases regulate telomere length in part through poly(ADP_ribosylation) of telomere repeat binding factor 1 (TRF1). Poly(ADP-ribose) polymerase catalyses the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active." Q#12999 - CGI_10009621 superfamily 207713 640 709 4.48E-06 45.7937 cl02729 WWE superfamily - - WWE domain; The WWE domain is named after three of its conserved residues and is predicted to mediate specific protein- protein interactions in ubiquitin and ADP ribose conjugation systems. Q#13000 - CGI_10009622 superfamily 243106 63 131 1.35E-19 80.8984 cl02608 BAH superfamily C - "BAH, or Bromo Adjacent Homology domain (also called ELM1 and BAM for Bromo Adjacent Motif). BAH domains have first been described as domains found in the polybromo protein and Yeast Rsc1/Rsc2 (Remodeling of the Structure of Chromatin). They also occur in mammalian DNA methyltransferases and the MTA1 subunits of histone deacetylase complexes. A BAH domain is also found in Yeast Sir3p and in the origin receptor complex protein 1 (Orc1p), where it was found to interact with the N-terminal lobe of the silence information regulator 1 protein (Sir1p), confirming the initial hypothesis that BAH plays a role in protein-protein interactions."
Q#13002 - CGI_10009624 superfamily 246723 113 753 0 750.672 cl14813 GluZincin superfamily - - "Peptidase Gluzincin family (thermolysin-like proteinases, TLPs) includes peptidases M1, M2, M3, M4, M13, M32 and M36 (fungalysins); Gluzincin family (thermolysin-like peptidases or TLPs) includes several zinc-dependent metallopeptidases such as the M1, M2, M3, M4, M13, M32, M36 peptidases (MEROPS classification), and contain HEXXH and EXXXD motifs as part of their active site. All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. M1 family includes aminopeptidase N (APN) and leukotriene A4 hydrolase (LTA4H). APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types. LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity such that the two activities occupy different, but overlapping sites. The peptidase M3 or neurolysin-like family, includes M3, M2 and M32 metallopeptidases. The M3 peptidases have two subfamilies: M3A, includes thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (3.4.24.16), and the mitochondrial intermediate peptidase; M3B contains oligopeptidase F. M2 peptidase angiotensin converting enzyme (ACE, EC 3.4.15.1) catalyzes the conversion of decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II. ACE is a key part of the renin-angiotensin system that regulates blood pressure, thus ACE inhibitors are important for the treatment of hypertension. M32 family includes two eukaryotic enzymes from protozoa Trypanosoma cruzi, a causative agent of Chagas' disease, and Leishmania major, a parasite that causes leishmaniasis, making them attractive targets for drug development. 
The M4 family includes secreted protease thermolysin (EC 3.4.24.27), pseudolysin, aureolysin, neutral protease as well as fungalysin and bacillolysin (EC 3.4.24.28) that degrade extracellular proteins and peptides for bacterial nutrition, especially prior to sporulation. Thermolysin is widely used as a nonspecific protease to obtain fragments for peptide sequencing as well as in production of the artificial sweetener aspartame. M13 family includes neprilysin (EC 3.4.24.11) and endothelin-converting enzyme I (ECE-1, EC 3.4.24.71), which fulfill a broad range of physiological roles due to the greater variation in the S2' subsite allowing substrate specificity and are prime therapeutic targets for selective inhibition. Peptidase M36 (fungalysin) family includes endopeptidases from pathogenic fungi. Fungalysin hydrolyzes extracellular matrix proteins such as elastin and keratin. Aspergillus fumigatus causes the pulmonary disease aspergillosis by invading the lungs of immuno-compromised animals and secreting fungalysin that possibly breaks down proteinaceous structural barriers." Q#13003 - CGI_10009625 superfamily 241584 57 135 0.00149655 35.4775 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases."
Q#13004 - CGI_10006516 superfamily 248318 203 236 1.11E-13 66.3053 cl17764 FYVE superfamily C - "FYVE domain; Zinc-binding domain; targets proteins to membrane lipids via interaction with phosphatidylinositol-3-phosphate, PI3P; present in Fab1, YOTB, Vac1, and EEA1;" Q#13004 - CGI_10006516 superfamily 151903 496 537 1.70E-11 59.7756 cl12988 Rbsn superfamily - - Rabenosyn Rab binding domain; Rabenosyn-5 (Rbsn) is a multivalent effector that interacts with the Rab family. Rbsn contains distinct Rab4 and Rab5 binding sites within residues 264-500 and 627-784 respectively. Rab proteins are GTPases involved in the regulation of all stages of membrane trafficking. Q#13005 - CGI_10006517 superfamily 210108 3 42 1.44E-11 53.0796 cl15447 Ribosomal_L29e superfamily - - Ribosomal L29e protein family; Ribosomal L29e protein family. Q#13007 - CGI_10006519 superfamily 216363 125 218 1.18E-12 61.3322 cl08312 UPF0029 superfamily - - Uncharacterized protein family UPF0029; Uncharacterized protein family UPF0029. Q#13008 - CGI_10006699 superfamily 245201 362 631 2.26E-117 364.451 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#13008 - CGI_10006699 superfamily 243146 970 1012 2.32E-07 49.197 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#13008 - CGI_10006699 superfamily 243146 789 834 4.56E-06 45.2419 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. 
" Q#13008 - CGI_10006699 superfamily 243146 934 980 0.000208726 40.2343 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#13008 - CGI_10006699 superfamily 243146 878 919 0.000362317 39.567 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#13009 - CGI_10006700 superfamily 247776 8 119 7.29E-45 143.45 cl17222 Transthyretin_like superfamily - - "Transthyretin_like. This domain is present in the transthyretin-like protein (TLP) family which includes transthyretin (TTR) and a transthyretin-related protein called 5-hydroxyisourate hydrolase (HIUase). TTR and HIUase are homotetrameric proteins with each subunit consisting of eight beta-strands arranged in two sheets and a short alpha-helix. The central channel of the tetramer contains two independent binding sites, each located between a pair of subunits. TTR transports thyroid hormones and retinol in the blood serum of vertebrates while HIUase catalyzes the second step in a three-step ureide pathway. TTRs are highly conserved and found only in vertebrates while the HIUases are found in a wide range of bacterial, plant, fungal, slime mold and vertebrate organisms." Q#13010 - CGI_10006701 superfamily 241571 466 530 3.07E-08 52.7998 cl00049 CUB superfamily N - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#13010 - CGI_10006701 superfamily 246918 373 424 1.65E-15 73.0047 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#13011 - CGI_10006702 superfamily 242006 87 280 4.14E-76 241.635 cl00653 Endonuclease_V superfamily - - "Endonuclease_V, a DNA repair enzyme that initiates repair of nitrosative deaminated purine bases; Endonuclease_V (EndoV) is an enzyme that can initiate repair of all possible deaminated DNA bases. 
EndoV cleaves the DNA strand containing lesions at the second phosphodiester bond 3' to the lesion using Mg2+ as a cofactor. EndoV homologs are conserved throughout all domains of life from bacteria to humans. EndoV is encoded by the nfi gene and nfi null mutant mice have a phenotype prone to cancer. The ability of endonuclease V to recognize mismatches and abnormal replicative DNA structures suggests that the enzyme plays an important role in DNA metabolism. The details of downstream processing for the EndoV pathway remain unknown." Q#13011 - CGI_10006702 superfamily 245010 405 487 0.0014762 37.2123 cl09111 Prefoldin superfamily - - "Prefoldin is a hexameric molecular chaperone complex, found in both eukaryotes and archaea, that binds and stabilizes newly synthesized polypeptides allowing them to fold correctly. The complex contains two alpha and four beta subunits, the two subunits being evolutionarily related. In archaea, there is usually only one gene for each subunit while in eukaryotes there are two or more paralogous genes encoding each subunit adding heterogeneity to the structure of the hexamer. The structure of the complex consists of a double beta barrel assembly with six protruding coiled-coils." Q#13012 - CGI_10006703 superfamily 216554 62 241 3.62E-33 122.203 cl15977 zf-DHHC superfamily - - DHHC palmitoyltransferase; This family includes the well known DHHC zinc binding domain as well as three of the four conserved transmembrane regions found in this family of palmitoyltransferase enzymes. Q#13013 - CGI_10006704 superfamily 241578 550 700 4.12E-14 72.5986 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. 
The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#13013 - CGI_10006704 superfamily 241578 182 310 6.47E-05 44.1281 cl00057 vWFA superfamily C - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." 
Q#13014 - CGI_10001377 superfamily 247856 84 136 8.45E-06 40.9941 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#13018 - CGI_10009304 superfamily 241563 59 94 0.00127193 37.844 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#13019 - CGI_10009305 superfamily 241563 61 96 0.00402266 36.3032 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#13020 - CGI_10009306 superfamily 247684 26 250 6.76E-56 189.411 cl17037 NBD_sugar-kinase_HSP70_actin superfamily C - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#13022 - CGI_10009308 superfamily 245226 24 127 6.35E-12 60.3921 cl10012 DnaQ_like_exo superfamily C - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. 
The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#13024 - CGI_10009310 superfamily 245606 37 203 2.08E-94 281.289 cl11410 TPP_enzyme_PYR superfamily - - "Pyrimidine (PYR) binding domain of thiamine pyrophosphate (TPP)-dependent enzymes; Thiamine pyrophosphate (TPP) family, pyrimidine (PYR) binding domain; found in many key metabolic enzymes which use TPP (also known as thiamine diphosphate) as a cofactor. TPP binds in the cleft formed by a PYR domain and a PP domain. The PYR domain, binds the aminopyrimidine ring of TPP, the PP domain binds the diphosphate residue. A polar interaction between the conserved glutamate of the PYR domain and the N1' of the TPP aminopyrimidine ring is shared by most TPP-dependent enzymes, and participates in the activation of TPP. The PYR and PP domains have a common fold, but do not share strong sequence conservation. The PP domain is not included in this group. Most TPP-dependent enzymes have the PYR and PP domains on the same subunit although these domains can be alternatively arranged in the primary structure. 
In the case of 2-oxoisovalerate dehydrogenase (2OXO), sulfopyruvate decarboxylase (ComDE), and the E1 component of human pyruvate dehydrogenase complex (E1- PDHc) the PYR and PP domains appear on different subunits. TPP-dependent enzymes are multisubunit proteins, the smallest catalytic unit being a dimer-of-active sites. For many of these enzymes the active sites lie between PP and PYR domains on different subunits. However, for the homodimeric enzymes 1-deoxy-D-xylulose 5-phosphate synthase (DXS) and Desulfovibrio africanus pyruvate:ferredoxin oxidoreductase (PFOR), each active site lies at the interface of the PYR and PP domains from the same subunit." Q#13024 - CGI_10009310 superfamily 217227 226 346 4.01E-31 115.001 cl08363 Transketolase_C superfamily - - "Transketolase, C-terminal domain; The C-terminal domain of transketolase has been proposed as a regulatory molecule binding site." Q#13025 - CGI_10020348 superfamily 245601 5 89 1.34E-31 112.03 cl11399 HP superfamily C - "Histidine phosphatase domain found in a functionally diverse set of proteins, mostly phosphatases; contains a His residue which is phosphorylated during the reaction; Catalytic domain of a functionally diverse set of proteins, most of which are phosphatases. The conserved catalytic core of this domain contains a His residue which is phosphorylated in the reaction. This set of proteins includes cofactor-dependent and cofactor-independent phosphoglycerate mutases (dPGM, and BPGM respectively), fructose-2,6-bisphosphatase (F26BP)ase, Sts-1, SixA, histidine acid phosphatases, phytases, and related proteins. 
Functions include roles in metabolism, signaling, or regulation, for example F26BPase affects glycolysis and gluconeogenesis through controlling the concentration of F26BP; BPGM controls the concentration of 2,3-BPG (the main allosteric effector of hemoglobin in human blood cells); human Sts-1 is a T-cell regulator; Escherichia coli Six A participates in the ArcB-dependent His-to-Asp phosphorelay signaling system; phytases scavenge phosphate from extracellular sources. Deficiency and mutation in many of the human members result in disease, for example erythrocyte BPGM deficiency is a disease associated with a decrease in the concentration of 2,3-BPG. Clinical applications include the use of prostatic acid phosphatase (PAP) as a serum marker for prostate cancer. Agricultural applications include the addition of phytases to animal feed." Q#13026 - CGI_10020349 superfamily 192445 45 154 1.62E-21 87.4675 cl10818 Med4 superfamily C - "Vitamin-D-receptor interacting Mediator subunit 4; Members of this family function as part of the Mediator (Med) complex, which links DNA-bound transcriptional regulators and the general transcription machinery, particularly the RNA polymerase II enzyme. They play a role in basal transcription by mediating activation or repression according to the specific complement of transcriptional regulators bound to the promoter." Q#13027 - CGI_10020350 superfamily 242885 41 202 5.63E-80 239.036 cl02106 IF4E superfamily - - Eukaryotic initiation factor 4E; Eukaryotic initiation factor 4E. Q#13029 - CGI_10020352 superfamily 245213 259 293 3.32E-06 43.7794 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. 
Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13029 - CGI_10020352 superfamily 245213 182 217 1.61E-05 41.8534 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13030 - CGI_10020353 superfamily 213152 1231 1260 2.40E-05 43.2878 cl17045 TM_EGFR-like superfamily C - "Transmembrane domain of the Epidermal Growth Factor Receptor family of Protein Tyrosine Kinases; PTKs catalyze the transfer of the gamma-phosphoryl group from ATP to tyrosine (tyr) residues in protein substrates. EGFR (HER, ErbB) subfamily members include EGFR (HER1, ErbB1), HER2 (ErbB2), HER3 (ErbB3), HER4 (ErbB4), and similar proteins. They are receptor PTKs (RTKs) containing an extracellular EGF-related ligand-binding region, a transmembrane (TM) helix, and a cytoplasmic region with a tyr kinase domain and a regulatory C-terminal tail. They are activated by ligand-induced dimerization, resulting in the phosphorylation of tyr residues in the C-terminal tail, which serve as binding sites for downstream signaling molecules. Collectively, they can recognize a variety of ligands including EGF, TGFalpha, and neuregulins, among others. All four subfamily members can form homo- or heterodimers. HER3 contains an impaired kinase domain and depends on its heterodimerization partner for activation. 
EGFR subfamily members are involved in signaling pathways leading to a broad range of cellular responses including cell proliferation, differentiation, migration, growth inhibition, and apoptosis. The TM domain not only serves as a membrane anchor, but also plays an important role in receptor dimerization and optimal activation. Mutations in the TM domain of EGFR family RTKs have been associated with increased breast cancer risk." Q#13030 - CGI_10020353 superfamily 219525 393 440 1.15E-07 50.4953 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#13030 - CGI_10020353 superfamily 219525 234 281 6.05E-06 45.4877 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#13030 - CGI_10020353 superfamily 219525 451 496 2.79E-05 43.5618 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#13030 - CGI_10020353 superfamily 219525 718 764 3.59E-05 43.1766 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#13030 - CGI_10020353 superfamily 219525 826 874 4.68E-05 42.7914 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#13030 - CGI_10020353 superfamily 219525 882 928 0.000122663 41.6358 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#13030 - CGI_10020353 superfamily 219525 777 819 0.000160393 41.2506 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#13030 - CGI_10020353 superfamily 219525 662 710 0.000187254 40.8654 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#13030 - CGI_10020353 superfamily 219525 935 978 0.000520367 39.7098 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#13030 - CGI_10020353 superfamily 219525 504 554 0.0010685 38.5542 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#13030 - CGI_10020353 superfamily 219525 1156 1187 0.00148313 38.169 cl06646 GCC2_GCC3 superfamily C - GCC2 and GCC3; GCC2 and GCC3. 
Q#13030 - CGI_10020353 superfamily 219525 347 385 0.00188905 37.7838 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#13030 - CGI_10020353 superfamily 219525 294 333 0.00237024 37.7838 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#13030 - CGI_10020353 superfamily 219525 563 607 0.00421841 37.0134 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#13031 - CGI_10020354 superfamily 241644 393 526 8.83E-25 103.051 cl00154 UBCc superfamily - - "Ubiquitin-conjugating enzyme E2, catalytic (UBCc) domain. This is part of the ubiquitin-mediated protein degradation pathway in which a thiol-ester linkage forms between a conserved cysteine and the C-terminus of ubiquitin and complexes with ubiquitin protein ligase enzymes, E3. This pathway regulates many fundamental cellular processes. There are also other E2s which form thiol-ester linkages without the use of E3s as well as several UBC homologs (TSG101, Mms2, Croc-1 and similar proteins) which lack the active site cysteine essential for ubiquitination and appear to function in DNA repair pathways which were omitted from the scope of this CD." Q#13031 - CGI_10020354 superfamily 241644 543 691 1.62E-24 102.28 cl00154 UBCc superfamily - - "Ubiquitin-conjugating enzyme E2, catalytic (UBCc) domain. This is part of the ubiquitin-mediated protein degradation pathway in which a thiol-ester linkage forms between a conserved cysteine and the C-terminus of ubiquitin and complexes with ubiquitin protein ligase enzymes, E3. This pathway regulates many fundamental cellular processes. There are also other E2s which form thiol-ester linkages without the use of E3s as well as several UBC homologs (TSG101, Mms2, Croc-1 and similar proteins) which lack the active site cysteine essential for ubiquitination and appear to function in DNA repair pathways which were omitted from the scope of this CD." 
Q#13031 - CGI_10020354 superfamily 241644 708 860 8.17E-18 83.0205 cl00154 UBCc superfamily - - "Ubiquitin-conjugating enzyme E2, catalytic (UBCc) domain. This is part of the ubiquitin-mediated protein degradation pathway in which a thiol-ester linkage forms between a conserved cysteine and the C-terminus of ubiquitin and complexes with ubiquitin protein ligase enzymes, E3. This pathway regulates many fundamental cellular processes. There are also other E2s which form thiol-ester linkages without the use of E3s as well as several UBC homologs (TSG101, Mms2, Croc-1 and similar proteins) which lack the active site cysteine essential for ubiquitination and appear to function in DNA repair pathways which were omitted from the scope of this CD." Q#13031 - CGI_10020354 superfamily 244994 1658 1741 1.03E-15 75.3502 cl08520 Cdc6_C superfamily - - "Winged-helix domain of essential DNA replication protein Cell division control protein (Cdc6), which mediates DNA binding; This model characterizes the winged-helix, C-terminal domain of the Cell division control protein (Cdc6_C). Cdc6 (also known as Cell division cycle 6 or Cdc18) functions as a regulator at the early stages of DNA replication, by helping to recruit and load the Minichromosome Maintenance Complex (MCM) onto DNA and may have additional roles in the control of mitotic entry. Precise duplication of chromosomal DNA is required for genomic stability during replication. Cdc6 has an essential role in DNA replication and irregular expression of Cdc6 may lead to genomic instability. Cdc6 over-expression is observed in many cancerous lesions. DNA replication begins when an origin recognition complex (ORC) binds to a replication origin site on the chromatin. Studies indicate that Cdc6 interacts with ORC through the Orc1 subunit, and that this association increases the specificity of the ORC-origins interaction. 
Further studies suggest that hydrolysis of Cdc6-bound ATP promotes the association of the replication licensing factor Cdt1 with origins through an interaction with Orc6 and this in turn promotes the loading of MCM2-7 helicase onto chromatin. The MCM2-7 complex promotes the unwinding of DNA origins, and the binding of additional factors to initiate the DNA replication. S-Cdk (S-phase cyclin and cyclin-dependent kinase complex) prevents rereplication by causing the Cdc6 protein to dissociate from ORC and prevents the Cdc6 and MCM proteins from reassembling at any origin. By phosphorylating Cdc6, S-Cdk also triggers Cdc6's ubiquitination. The Cdc6 protein is composed of three domains, an N-terminal AAA+ domain with Walker A and B, and Sensor-1 and -2 motifs. The central region contains a conserved nucleotide binding/ATPase domain and is a member of the ATPase superfamily. The C-terminal domain (Cdc6_C) is a conserved winged-helix domain that possibly mediates protein-protein interactions or direct DNA interactions. Cdc6 is conserved in eukaryotes, and related genes are found in Archaea. The winged helix fold structure of Cdc6_C is similar to the structures of other eukaryotic replication initiators without apparent sequence similarity." Q#13031 - CGI_10020354 superfamily 247743 1397 1560 4.91E-08 53.3039 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." 
Q#13031 - CGI_10020354 superfamily 243106 941 1063 1.60E-10 60.8532 cl02608 BAH superfamily - - "BAH, or Bromo Adjacent Homology domain (also called ELM1 and BAM for Bromo Adjacent Motif). BAH domains have first been described as domains found in the polybromo protein and Yeast Rsc1/Rsc2 (Remodeling of the Structure of Chromatin). They also occur in mammalian DNA methyltransferases and the MTA1 subunits of histone deacetylase complexes. A BAH domain is also found in Yeast Sir3p and in the origin recognition complex protein 1 (Orc1p), where it was found to interact with the N-terminal lobe of the silent information regulator 1 protein (Sir1p), confirming the initial hypothesis that BAH plays a role in protein-protein interactions." Q#13032 - CGI_10020355 superfamily 217519 7 177 4.28E-87 260.94 cl04030 PRP38 superfamily - - PRP38 family; Members of this family are related to the pre mRNA splicing factor PRP38 from yeast. Therefore all the members of this family could be involved in splicing. This conserved region could be involved in RNA binding. The putative domain is about 180 amino acids in length. PRP38 is a unique component of the U4/U6.U5 tri-small nuclear ribonucleoprotein (snRNP) particle and is necessary for an essential step late in spliceosome maturation. Q#13033 - CGI_10020356 superfamily 217519 1 101 5.97E-47 155.01 cl04030 PRP38 superfamily N - PRP38 family; Members of this family are related to the pre mRNA splicing factor PRP38 from yeast. Therefore all the members of this family could be involved in splicing. This conserved region could be involved in RNA binding. The putative domain is about 180 amino acids in length. PRP38 is a unique component of the U4/U6.U5 tri-small nuclear ribonucleoprotein (snRNP) particle and is necessary for an essential step late in spliceosome maturation. Q#13034 - CGI_10020357 superfamily 241644 183 216 0.000743509 37.5669 cl00154 UBCc superfamily C - "Ubiquitin-conjugating enzyme E2, catalytic (UBCc) domain. 
This is part of the ubiquitin-mediated protein degradation pathway in which a thiol-ester linkage forms between a conserved cysteine and the C-terminus of ubiquitin and complexes with ubiquitin protein ligase enzymes, E3. This pathway regulates many fundamental cellular processes. There are also other E2s which form thiol-ester linkages without the use of E3s as well as several UBC homologs (TSG101, Mms2, Croc-1 and similar proteins) which lack the active site cysteine essential for ubiquitination and appear to function in DNA repair pathways which were omitted from the scope of this CD." Q#13038 - CGI_10020361 superfamily 246680 608 689 1.85E-16 75.5885 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#13040 - CGI_10020363 superfamily 243141 26 181 3.76E-16 75.0454 cl02687 RWD superfamily - - "RWD domain; This domain was identified in WD40 repeat proteins and Ring finger domain proteins. The function of this domain is unknown. GCN2 is the alpha-subunit of the only translation initiation factor (eIF2 alpha) kinase that appears in all eukaryotes. 
Its function requires an interaction with GCN1 via the domain at its N-terminus, which is termed the RWD domain after three major RWD-containing proteins: RING finger-containing proteins, WD-repeat-containing proteins, and yeast DEAD (DEXD)-like helicases. The structure forms an alpha + beta sandwich fold consisting of two layers: a four-stranded antiparallel beta-sheet, and three side-by-side alpha-helices." Q#13041 - CGI_10020364 superfamily 247057 315 379 1.67E-33 120.068 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#13041 - CGI_10020364 superfamily 247999 201 246 2.59E-10 55.9596 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#13042 - CGI_10020365 superfamily 245847 108 158 0.00817129 33.63 cl12042 FA58C superfamily C - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." 
Q#13043 - CGI_10020366 superfamily 241574 599 805 5.31E-87 281.011 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#13043 - CGI_10020366 superfamily 241574 864 1054 2.88E-31 123.85 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#13045 - CGI_10020368 superfamily 151064 66 113 6.33E-23 88.1591 cl11145 DUF2453 superfamily C - Protein of unknown function (DUF2453); Some members of this family are purported to contain GAF domains but this could not be confirmed. The function is not known. It is likely to be a transmembrane protein. Q#13046 - CGI_10020369 superfamily 241578 267 403 4.40E-26 103.628 cl00057 vWFA superfamily C - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). 
Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#13046 - CGI_10020369 superfamily 219821 184 253 7.53E-11 58.9218 cl07136 VWA_N superfamily N - "VWA N-terminal; This domain is found at the N-terminus of proteins containing von Willebrand factor type A (VWA, pfam00092) and Cache (pfam02743) domains. It has been found in vertebrates, Drosophila and C. elegans but has not yet been identified in other eukaryotes. It is probably involved in the function of some voltage-dependent calcium channel subunits." Q#13047 - CGI_10020370 superfamily 247802 55 244 4.21E-139 408.978 cl17248 RIO superfamily - - "RIO kinase family, catalytic domain. The RIO kinase catalytic domain family is part of a larger superfamily, that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase (PI3K). RIO kinases are atypical protein serine kinases present in archaea, bacteria and eukaryotes. Serine kinases catalyze the transfer of the gamma-phosphoryl group from ATP to serine residues in protein substrates. 
RIO kinases contain a kinase catalytic signature, but otherwise show very little sequence similarity to typical PKs. The RIO catalytic domain is truncated compared to the catalytic domains of typical PKs, with deletions of the loops responsible for substrate binding. Most organisms contain at least two RIO kinases, RIO1 and RIO2. A third protein, RIO3, is present in multicellular eukaryotes. In yeast, RIO1 and RIO2 are essential for survival. They function as non-ribosomal factors necessary for late 18S rRNA processing. RIO1 is also required for proper cell cycle progression and chromosome maintenance. The biological substrates for RIO kinases are still unknown." Q#13047 - CGI_10020370 superfamily 218308 415 711 7.93E-63 216.409 cl09342 Peroxin-3 superfamily - - Peroxin-3; Peroxin-3 is a peroxisomal protein. It is thought to be involved in membrane vesicle assembly prior to the translocation of matrix proteins. Q#13048 - CGI_10020371 superfamily 243092 638 861 6.48E-27 113.584 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top 
and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#13048 - CGI_10020371 superfamily 243092 1262 1634 1.09E-24 106.65 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#13048 - CGI_10020371 superfamily 243092 424 740 1.62E-23 103.184 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#13048 - CGI_10020371 superfamily 243092 1502 1831 1.37E-16 81.6124 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#13048 - CGI_10020371 superfamily 243092 788 1017 1.11E-12 69.286 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#13048 - CGI_10020371 superfamily 191163 21 194 3.10E-11 63.9827 cl04888 DUF667 superfamily - - "Protein of unknown function (DUF667); This family of proteins are highly conserved in eukaryotes. Some proteins in the family are annotated as transcription factors. However, there is currently no support for this in the literature." 
Q#13048 - CGI_10020371 superfamily 243092 1189 1300 5.41E-11 64.2784 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#13052 - CGI_10020375 superfamily 241631 11 170 1.31E-62 201.682 cl00136 Sec7 superfamily - - Sec7 domain; Domain named after the S. cerevisiae SEC7 gene product. The Sec7 domain is the central domain of the guanine-nucleotide-exchange factors (GEFs) of the ADP-ribosylation factor family of small GTPases (ARFs) . It carries the exchange factor activity. Q#13052 - CGI_10020375 superfamily 247725 208 320 4.11E-62 198.304 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. 
All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#13053 - CGI_10020376 superfamily 241782 135 499 5.49E-97 301.044 cl00321 AAT_I superfamily - - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phosphorylase family (fold type V)." 
Q#13054 - CGI_10020377 superfamily 247725 7 120 4.74E-65 209.91 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#13054 - CGI_10020377 superfamily 243095 120 325 9.17E-61 201.518 cl02570 RhoGAP superfamily - - "RhoGAP: GTPase-activator protein (GAP) for Rho-like GTPases; GAPs towards Rho/Rac/Cdc42-like small GTPases. Small GTPases (G proteins) cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when bound to GDP. The Rho family of small G proteins, which includes Cdc42Hs, activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. G proteins generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. The RhoGAPs are one of the major classes of regulators of Rho G proteins." Q#13055 - CGI_10020378 superfamily 245201 9 294 1.46E-178 501.938 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. 
These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#13056 - CGI_10020379 superfamily 238191 110 209 1.15E-08 55.0308 cl18907 Esterase_lipase superfamily C - "Esterases and lipases (includes fungal lipases, cholinesterases, etc.) These enzymes act on carboxylic esters (EC: 3.1.1.-). The catalytic apparatus involves three residues (catalytic triad): a serine, a glutamate or aspartate and a histidine.These catalytic residues are responsible for the nucleophilic attack on the carbonyl carbon atom of the ester bond. In contrast with other alpha/beta hydrolase fold family members, p-nitrobenzyl esterase and acetylcholine esterase have a Glu instead of Asp at the active site carboxylate." Q#13056 - CGI_10020379 superfamily 215859 304 346 0.000174801 41.0479 cl18347 Peptidase_S9 superfamily N - Prolyl oligopeptidase family; Prolyl oligopeptidase family. Q#13058 - CGI_10020381 superfamily 241753 96 506 0 831.701 cl00285 Aconitase superfamily - - "Aconitase catalytic domain; Aconitase catalyzes the reversible isomerization of citrate and isocitrate as part of the TCA cycle; Aconitase catalytic domain. Aconitase (aconitate hydratase) catalyzes the reversible isomerization of citrate and isocitrate as part of the TCA cycle. Cis-aconitate is formed as an intermediate product during the course of the reaction. In eukaryotes two isozymes of aconitase are known to exist: one found in the mitochondrial matrix and the other found in the cytoplasm. Aconitase, in its active form, contains a 4Fe-4S iron-sulfur cluster; three cysteine residues have been shown to be ligands of the 4Fe-4S cluster. This is the Aconitase core domain, including structural domains 1, 2 and 3, which binds the Fe-S cluster. 
The aconitase family also contains the following proteins: - Iron-responsive element binding protein (IRE-BP), a cytosolic protein that binds to iron-responsive elements (IREs). IREs are stem-loop structures found in the 5'UTR of ferritin, and delta aminolevulinic acid synthase mRNAs, and in the 3'UTR of transferrin receptor mRNA. IRE-BP also express aconitase activity. - 3-isopropylmalate dehydratase (isopropylmalate isomerase), the enzyme that catalyzes the second step in the biosynthesis of leucine. - Homoaconitase (homoaconitate hydratase), an enzyme that participates in the alpha-aminoadipate pathway of lysine biosynthesis and that converts cis-homoaconitate into homoisocitric acid." Q#13058 - CGI_10020381 superfamily 241693 588 736 6.32E-99 305.162 cl00215 Aconitase_swivel superfamily - - "Aconitase swivel domain. Aconitase (aconitate hydratase) catalyzes the reversible isomerization of citrate and isocitrate as part of the TCA cycle. This is the aconitase swivel domain, which undergoes swivelling conformational change in the enzyme mechanism. The aconitase family contains the following proteins: - Iron-responsive element binding protein (IRE-BP). IRE-BP is a cytosolic protein that binds to iron-responsive elements (IREs). IREs are stem-loop structures found in the 5'UTR of ferritin, and delta aminolevulinic acid synthase mRNAs, and in the 3'UTR of transferrin receptor mRNA. IRE-BP also express aconitase activity. - 3-isopropylmalate dehydratase (isopropylmalate isomerase), the enzyme that catalyzes the second step in the biosynthesis of leucine. - Homoaconitase (homoaconitate hydratase), an enzyme that participates in the alpha-aminoadipate pathway of lysine biosynthesis and that converts cis-homoaconitate into homoisocitric acid." 
Q#13059 - CGI_10020382 superfamily 247905 587 709 8.46E-27 108.864 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#13059 - CGI_10020382 superfamily 247805 266 411 6.28E-17 80.4592 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#13060 - CGI_10020383 superfamily 203679 2 62 3.26E-09 47.9959 cl06533 Med9 superfamily N - "RNA polymerase II transcription mediator complex subunit 9; This family of Med9 proteins is conserved in yeasts. It forms part of the middle region of Mediator. Med9 has two functional domains. The species-specific amino-terminal half (aa 1-63) plays a regulatory role in transcriptional regulation, whereas this well-conserved carboxy-terminal half (aa 64-149) has a more fundamental function involved in direct binding to the amino-terminal portions of Med4 and Med7 and the assembly of Med9 into the Middle module. Also, some unidentified factor(s) in med9 extracts may impact the binding of TFIID to the promoter." 
Q#13061 - CGI_10020384 superfamily 241733 1 58 1.46E-16 69.9844 cl00259 Sm_like superfamily - - "Sm and related proteins; The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes." Q#13063 - CGI_10020386 superfamily 227527 8 230 2.41E-44 155.124 cl15315 LUC7 superfamily - - "U1 snRNP component, mediates U1 snRNP association with cap-binding complex [RNA processing and modification]" Q#13064 - CGI_10020387 superfamily 245210 88 470 2.70E-155 489.377 cl09938 cond_enzymes superfamily - - "Condensing enzymes; Family of enzymes that catalyze a (decarboxylating or non-decarboxylating) Claisen-like condensation reaction. Members share strong structural similarity, and are involved in the synthesis and degradation of fatty acids, and the production of polyketides, a diverse group of natural products." Q#13064 - CGI_10020387 superfamily 247637 1642 1930 5.73E-99 323.367 cl16912 MDR superfamily - - "Medium chain reductase/dehydrogenase (MDR)/zinc-dependent alcohol dehydrogenase-like family; The medium chain reductase/dehydrogenases (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH. MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). 
The MDR proteins have 2 domains: a C-terminal NAD(P) binding-Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES. The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH) , quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones. ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines. Other MDR members have only a catalytic zinc, and some contain no coordinated zinc." Q#13064 - CGI_10020387 superfamily 245206 1947 2184 2.12E-72 252.755 cl09931 NADB_Rossmann superfamily N - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. 
Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#13064 - CGI_10020387 superfamily 244888 578 892 1.07E-90 300.867 cl08282 Acyl_transf_1 superfamily - - Acyl transferase domain; Acyl transferase domain. Q#13064 - CGI_10020387 superfamily 245206 1386 1600 2.45E-20 95.5936 cl09931 NADB_Rossmann superfamily C - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#13064 - CGI_10020387 superfamily 214837 940 1088 1.75E-14 74.185 cl11739 PKS_DH superfamily - - Dehydratase domain in polyketide synthase (PKS) enzymes; Dehydratase domain in polyketide synthase (PKS) enzymes. Q#13065 - CGI_10020388 superfamily 241754 28 759 1.37E-158 483.306 cl00286 Motor_domain superfamily - - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. 
Q#13066 - CGI_10020389 superfamily 245208 39 468 0 651.014 cl09933 ACAD superfamily - - "Acyl-CoA dehydrogenase; Both mitochondrial acyl-CoA dehydrogenases (ACAD) and peroxisomal acyl-CoA oxidases (AXO) catalyze the alpha,beta dehydrogenation of the corresponding trans-enoyl-CoA by FAD, which becomes reduced. The reduced form of ACAD is reoxidized in the oxidative half-reaction by electron-transferring flavoprotein (ETF), from which the electrons are transferred to the mitochondrial respiratory chain coupled with ATP synthesis. In contrast, AXO catalyzes a different oxidative half-reaction, in which the reduced FAD is reoxidized by molecular oxygen. The ACAD family includes the eukaryotic beta-oxidation enzymes, short (SCAD), medium (MCAD), long (LCAD) and very-long (VLCAD) chain acyl-CoA dehydrogenases. These enzymes all share high sequence similarity, but differ in their substrate specificities. The ACAD family also includes amino acid catabolism enzymes such as Isovaleryl-CoA dehydrogenase (IVD), short/branched chain acyl-CoA dehydrogenases (SBCAD), Isobutyryl-CoA dehydrogenase (IBDH), glutaryl-CoA dehydrogenase (GCD) and Crotonobetainyl-CoA dehydrogenase. The mitochondrial ACAD's are generally homotetramers, except for VLCAD, which is a homodimer. Related enzymes include the SOS adaptive response protein aidB, Naphthocyclinone hydroxylase (NcnH), and Dibenzothiophene (DBT) desulfurization enzyme C (DszC)" Q#13067 - CGI_10020390 superfamily 220679 4 142 5.79E-20 82.7601 cl18567 Methyltransf_16 superfamily - - Putative methyltransferase; Putative methyltransferase. Q#13068 - CGI_10020391 superfamily 247805 50 242 9.26E-87 279.755 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 
Q#13068 - CGI_10020391 superfamily 247905 252 381 9.52E-35 130.821 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#13068 - CGI_10020391 superfamily 246723 494 1135 2.58E-84 288.581 cl14813 GluZincin superfamily - - "Peptidase Gluzincin family (thermolysin-like proteinases, TLPs) includes peptidases M1, M2, M3, M4, M13, M32 and M36 (fungalysins); Gluzincin family (thermolysin-like peptidases or TLPs) includes several zinc-dependent metallopeptidases such as the M1, M2, M3, M4, M13, M32, M36 peptidases (MEROPS classification), and contain HEXXH and EXXXD motifs as part of their active site. All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. M1 family includes aminopeptidase N (APN) and leukotriene A4 hydrolase (LTA4H). APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types. LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity such that the two activities occupy different, but overlapping sites. The peptidase M3 or neurolysin-like family, includes M3, M2 and M32 metallopeptidases. 
The M3 peptidases have two subfamilies: M3A, includes thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (3.4.24.16), and the mitochondrial intermediate peptidase; M3B contains oligopeptidase F. M2 peptidase angiotensin converting enzyme (ACE, EC 3.4.15.1) catalyzes the conversion of decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II. ACE is a key part of the renin-angiotensin system that regulates blood pressure, thus ACE inhibitors are important for the treatment of hypertension. M32 family includes two eukaryotic enzymes from protozoa Trypanosoma cruzi, a causative agent of Chagas' disease, and Leishmania major, a parasite that causes leishmaniasis, making them attractive targets for drug development. The M4 family includes secreted protease thermolysin (EC 3.4.24.27), pseudolysin, aureolysin, neutral protease as well as fungalysin and bacillolysin (EC 3.4.24.28) that degrade extracellular proteins and peptides for bacterial nutrition, especially prior to sporulation. Thermolysin is widely used as a nonspecific protease to obtain fragments for peptide sequencing as well as in production of the artificial sweetener aspartame. M13 family includes neprilysin (EC 3.4.24.11) and endothelin-converting enzyme I (ECE-1, EC 3.4.24.71), which fulfill a broad range of physiological roles due to the greater variation in the S2' subsite allowing substrate specificity and are prime therapeutic targets for selective inhibition. Peptidase M36 (fungamysin) family includes endopeptidases from pathogenic fungi. Fungalysin hydrolyzes extracellular matrix proteins such as elastin and keratin. Aspergillus fumigatus causes the pulmonary disease aspergillosis by invading the lungs of immuno-compromised animals and secreting fungalysin that possibly breaks down proteinaceous structural barriers." 
Q#13069 - CGI_10020392 superfamily 243161 3 68 3.31E-07 44.6926 cl02739 THAP superfamily C - "THAP domain; The THAP domain is a putative DNA-binding domain (DBD) and probably also binds a zinc ion. It features the conserved C2CH architecture (consensus sequence: Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His). Other universal features include the location of the domain at the N-termini of proteins, its size of about 90 residues, a C-terminal AVPTIF box and several other conserved residues. Orthologues of the human THAP domain have been identified in other vertebrates and probably worms and flies, but not in other eukaryotes or any prokaryotes." Q#13070 - CGI_10020393 superfamily 247684 2 38 3.81E-16 75.1215 cl17037 NBD_sugar-kinase_HSP70_actin superfamily C - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. 
The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#13071 - CGI_10020394 superfamily 218118 709 774 1.06E-11 61.8613 cl04552 CD225 superfamily - - "Interferon-induced transmembrane protein; This family includes the human leukocyte antigen CD225, which is an interferon inducible transmembrane protein, and is associated with interferon induced cell growth suppression." Q#13072 - CGI_10020395 superfamily 248022 97 381 1.09E-29 118.534 cl17468 Aa_trans superfamily - - "Transmembrane amino acid transporter protein; This transmembrane region is found in many amino acid transporters including UNC-47 and MTR. UNC-47 encodes a vesicular amino butyric acid (GABA) transporter, (VGAT). UNC-47 is predicted to have 10 transmembrane domains. MTR is a N system amino acid transporter system protein involved in methyltryptophan resistance. Other members of this family include proline transporters and amino acid permeases." 
Q#13073 - CGI_10020396 superfamily 247792 145 184 0.00113828 35.114 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#13074 - CGI_10020397 superfamily 247725 17 116 7.62E-08 48.6989 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#13074 - CGI_10020397 superfamily 247725 130 223 3.81E-07 46.6519 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." 
Q#13077 - CGI_10004552 superfamily 203134 7 67 3.75E-30 110.079 cl04866 CHORD superfamily - - "CHORD; CHORD represents a Zn binding domain. Silencing of the C. elegans CHORD-containing gene results in semisterility and embryo lethality, suggesting an essential function of the wild-type gene in nematode development." Q#13077 - CGI_10004552 superfamily 241659 227 314 2.35E-29 108.538 cl00175 alpha-crystallin-Hsps_p23-like superfamily - - "alpha-crystallin domain (ACD) found in alpha-crystallin-type small heat shock proteins, and a similar domain found in p23 (a cochaperone for Hsp90) and in other p23-like proteins.; The alpha-crystallin-Hsps_p23-like superfamily includes the alpha-crystallin domain (ACD) of alpha-crystallin-type small heat shock proteins (sHsps) and a similar domain found in p23-like proteins. sHsps are small stress induced proteins with monomeric masses between 12-43 kDa, whose common feature is this ACD. sHsps are generally active as large oligomers consisting of multiple subunits, and are believed to be ATP-independent chaperones that prevent aggregation and are important in refolding in combination with other Hsps. p23 is a cochaperone of the Hsp90 chaperoning pathway. It binds Hsp90 and participates in the folding of a number of Hsp90 clients including the progesterone receptor. p23 also has a passive chaperoning activity. p23 in addition may act as the cytosolic prostaglandin E2 synthase. Included in this superfamily is the p23-like C-terminal CHORD-SGT1 (CS) domain of suppressor of G2 allele of Skp1 (Sgt1) and the p23-like domains of human butyrate-induced transcript 1 (hB-ind1), NUD (nuclear distribution) C, Melusin, and NAD(P)H cytochrome b5 (NCB5) oxidoreductase (OR)." Q#13077 - CGI_10004552 superfamily 203134 150 214 2.87E-29 107.768 cl04866 CHORD superfamily - - "CHORD; CHORD represents a Zn binding domain. Silencing of the C. 
elegans CHORD-containing gene results in semisterility and embryo lethality, suggesting an essential function of the wild-type gene in nematode development." Q#13080 - CGI_10007398 superfamily 243077 1 29 4.03E-07 47.5401 cl02542 DnaJ superfamily N - "DnaJ domain or J-domain. DnaJ/Hsp40 (heat shock protein 40) proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperonine family. Hsp40 proteins are characterized by the presence of a J domain, which mediates the interaction with Hsp70. They may contain other domains as well, and the architectures provide a means of classification." Q#13080 - CGI_10007398 superfamily 241832 86 143 3.24E-11 60.8509 cl00388 Thioredoxin_like superfamily C - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#13081 - CGI_10007399 superfamily 247805 201 404 7.03E-68 220.819 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. 
This domain contains the ATP-binding region. Q#13081 - CGI_10007399 superfamily 247905 414 550 3.14E-32 120.806 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#13083 - CGI_10007401 superfamily 243092 75 233 5.15E-19 85.4644 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small 
ligands; 7 copies of the repeat are present in this alignment." Q#13085 - CGI_10007403 superfamily 245599 274 435 2.16E-30 115.401 cl11397 NR_LBD superfamily - - "The ligand binding domain of nuclear receptors, a family of ligand-activated transcription regulators; Ligand-binding domain (LBD) of nuclear receptor (NR): Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions in metazoans, from development, reproduction, to homeostasis and metabolism. The superfamily contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. The members of the family include receptors of steroids, thyroid hormone, retinoids, cholesterol by-products, lipids and heme. With few exceptions, NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD)." Q#13087 - CGI_10007405 superfamily 241752 124 238 5.92E-41 138.221 cl00283 ADP_ribosyl superfamily - - "ADP_ribosylating enzymes catalyze the transfer of ADP_ribose from NAD+ to substrates. Bacterial toxins are cytoplasmic and catalyze the transfer of a single ADP_ribose unit to eukaryotic elongation factor 2, halting protein synthesis and killing the cell. Poly(ADP-ribose) polymerases (PARPs 1-3, VPARP, tankyrase) catalyze the addition of up to 100 ADP_ribose units from NAD+. PARPs 1 and 2 are localized in the nucleus, bind DNA, and are activated by DNA damage. VPARP is part of the vault ribonucleoprotein complex. Tankyrases regulate telomere length in part through poly(ADP_ribosylation) of telomere repeat binding factor 1 (TRF1). Poly(ADP-ribose) polymerase catalyses the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. 
Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active." Q#13089 - CGI_10004647 superfamily 245205 22 144 3.34E-32 119.687 cl09930 RPA_2b-aaRSs_OBF_like superfamily - - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." Q#13089 - CGI_10004647 superfamily 245205 167 287 2.01E-18 81.3112 cl09930 RPA_2b-aaRSs_OBF_like superfamily - - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. 
One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." Q#13090 - CGI_10001876 superfamily 204080 214 411 2.07E-43 152.426 cl18252 BAAT_C superfamily - - BAAT / Acyl-CoA thioester hydrolase C terminal; This catalytic domain is found at the C terminal of acyl-CoA thioester hydrolases and bile acid-CoA:amino acid N-acetyltransferases (BAAT). Q#13090 - CGI_10001876 superfamily 218259 15 150 2.10E-37 133.163 cl04742 Bile_Hydr_Trans superfamily - - "Acyl-CoA thioester hydrolase/BAAT N-terminal region; This family consists of the amino termini of acyl-CoA thioester hydrolase and bile acid-CoA:amino acid N-acetyltransferase (BAAT). This region is not thought to contain the active site of either enzyme. Thioesterase isoforms have been identified in peroxisomes, cytoplasm and mitochondria, where they are thought to have distinct functions in lipid metabolism. For example, in peroxisomes, the hydrolase acts on bile-CoA esters." 
Q#13092 - CGI_10001878 superfamily 218259 15 150 3.13E-36 128.155 cl04742 Bile_Hydr_Trans superfamily - - "Acyl-CoA thioester hydrolase/BAAT N-terminal region; This family consists of the amino termini of acyl-CoA thioester hydrolase and bile acid-CoA:amino acid N-acetyltransferase (BAAT). This region is not thought to contain the active site of either enzyme. Thioesterase isoforms have been identified in peroxisomes, cytoplasm and mitochondria, where they are thought to have distinct functions in lipid metabolism. For example, in peroxisomes, the hydrolase acts on bile-CoA esters." Q#13092 - CGI_10001878 superfamily 204080 214 313 1.22E-09 56.1265 cl18252 BAAT_C superfamily C - BAAT / Acyl-CoA thioester hydrolase C terminal; This catalytic domain is found at the C terminal of acyl-CoA thioester hydrolases and bile acid-CoA:amino acid N-acetyltransferases (BAAT). Q#13099 - CGI_10005842 superfamily 248360 13 157 7.21E-43 143.185 cl17806 DER1 superfamily C - "Der1-like family; The endoplasmic reticulum (ER) of the yeast Saccharomyces cerevisiae contains a proteolytic system able to selectively degrade misfolded lumenal secretory proteins. For examination of the components involved in this degradation process, mutants were isolated. They could be divided into four complementation groups. The mutations led to stabilisation of two different substrates for this process. The mutant classes were called 'der' for 'degradation in the ER'. DER1 was cloned by complementation of the der1-2 mutation. The DER1 gene codes for a novel, hydrophobic protein that is localised to the ER. Deletion of DER1 abolished degradation of the substrate proteins. The function of the Der1 protein seems to be specifically required for the degradation process associated with the ER. Interestingly, this family seems distantly related to the Rhomboid family of membrane peptidases, suggesting that this family may also mediate degradation of misfolded proteins (Bateman A pers. obs.)." 
Q#13105 - CGI_10022058 superfamily 241972 49 288 5.86E-93 278.656 cl00600 Ribosomal_L7Ae superfamily - - "Ribosomal protein L7Ae/L30e/S12e/Gadd45 family; This family includes: Ribosomal L7A from metazoa, Ribosomal L8-A and L8-B from fungi, 30S ribosomal protein HS6 from archaebacteria, 40S ribosomal protein S12 from eukaryotes, Ribosomal protein L30 from eukaryotes and archaebacteria. Gadd45 and MyD118." Q#13106 - CGI_10022059 superfamily 203401 19 129 7.88E-28 100.453 cl05607 Med22 superfamily - - "Surfeit locus protein 5 subunit 22 of Mediator complex; This family consists of several eukaryotic Surfeit locus protein 5 (SURF5) sequences. The human Surfeit locus has been mapped on chromosome 9q34.1. The locus includes six tightly clustered housekeeping genes (Surf1-6), and the gene organisation is similar in human, mouse and chicken Surfeit locus. The Med22 subunit of Mediator complex is part of the essential core head region." Q#13107 - CGI_10022060 superfamily 247792 349 389 4.44E-05 40.892 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#13109 - CGI_10022062 superfamily 245201 34 296 2.84E-44 159.201 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. 
It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#13110 - CGI_10022063 superfamily 241750 25 140 1.90E-17 82.2459 cl00281 metallo-dependent_hydrolases superfamily NC - "Superfamily of metallo-dependent hydrolases (also called amidohydrolase superfamily) is a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. The family includes urease alpha, adenosine deaminase, phosphotriesterase dihydroorotases, allantoinases, hydantoinases, AMP-, adenine and cytosine deaminases, imidazolonepropionase, aryldialkylphosphatase, chlorohydrolases, formylmethanofuran dehydrogenases and others." Q#13113 - CGI_10022066 superfamily 241609 57 133 1.28E-22 85.8927 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#13113 - CGI_10022066 superfamily 241609 7 52 1.51E-13 61.6251 cl00100 KR superfamily N - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. 
They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#13114 - CGI_10022067 superfamily 241584 413 476 0.000939373 38.2463 cl00065 FN3 superfamily C - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#13115 - CGI_10022068 superfamily 222048 544 662 0.00562108 37.6474 cl16238 ATG2_CAD superfamily - - "Autophagy-related protein 2 CAD motif; The Atg2 protein, an integral membrane protein, is required for a range of functions including the regulation of autophagy in conjunction with the Atg1-Atg13 complex. Atg2 binds Atg9. The precise function of this region, with its characteristic highly conserved CAD sequence motif, is not known." Q#13116 - CGI_10022069 superfamily 241584 578 641 0.000203273 40.5575 cl00065 FN3 superfamily C - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." 
Q#13117 - CGI_10022070 superfamily 204985 19 107 1.29E-13 67.9671 cl14987 Chorein_N superfamily C - "N-terminal region of Chorein, a TM vesicle-mediated sorter; Although mutations in the full-length vacuolar protein sorting 13A (VPS13A) protein in vertebrates lead to the disease of chorea-acanthocytosis, the exact function of any of the regions within the protein is not yet known. This region is the proposed leucine zipper at the N-terminus. The full-length protein is a transmembrane protein with a presumed role in vesicle-mediated sorting and intracellular protein transport." Q#13118 - CGI_10022071 superfamily 241782 91 490 4.51E-115 345.728 cl00321 AAT_I superfamily - - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. 
Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phosphorylase family (fold type V)." Q#13119 - CGI_10022072 superfamily 183292 47 456 0 534.786 cl18135 PRK11728 superfamily - - hydroxyglutarate oxidase; Provisional Q#13120 - CGI_10022073 superfamily 243109 539 706 9.52E-108 327.3 cl02614 SPRY superfamily - - "SPRY domain; SPRY domains, first identified in the SP1A kinase of Dictyostelium and rabbit Ryanodine receptor (hence the name), are homologous to B30.2. SPRY domains have been identified in at least 11 protein families, covering a wide range of functions, including regulation of cytokine signaling (SOCS), RNA metabolism (DDX1 and hnRNP), immunity to retroviruses (TRIM5alpha), intracellular calcium release (ryanodine receptors or RyR) and regulatory and developmental processes (HERC1 and Ash2L). B30.2 also contains residues in the N-terminus that form a distinct PRY domain structure; i.e. B30.2 domain consists of PRY and SPRY subdomains. B30.2 domains comprise the C-terminus of three protein families: BTNs (receptor glycoproteins of immunoglobulin superfamily); several TRIM proteins (composed of RING/B-box/coiled-coil or RBCC core); Stonutoxin (secreted poisonous protein of the stonefish Synanceia horrida). While SPRY domains are evolutionarily ancient, B30.2 domains are a more recent adaptation where the SPRY/PRY combination is a possible component of immune defense. Mutations found in the SPRY-containing proteins have been shown to cause Mediterranean fever and Opitz syndrome." 
Q#13120 - CGI_10022073 superfamily 241584 452 541 9.94E-11 59.4323 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#13120 - CGI_10022073 superfamily 128778 283 386 1.13E-17 80.3866 cl17972 BBC superfamily - - B-Box C-terminal domain; Coiled coil region C-terminal to (some) B-Box domains Q#13120 - CGI_10022073 superfamily 241563 157 189 0.00240523 36.5463 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#13121 - CGI_10022074 superfamily 218223 242 511 2.63E-110 340.115 cl04698 Radial_spoke superfamily N - "Radial spokehead-like protein; This family includes the radial spoke head proteins RSP4 and RSP6 from Chlamydomonas reinhardtii, and several eukaryotic homologues, including mammalian RSHL1, the protein product of a familial ciliary dyskinesia candidate gene." Q#13121 - CGI_10022074 superfamily 218223 14 181 1.09E-56 198.362 cl04698 Radial_spoke superfamily C - "Radial spokehead-like protein; This family includes the radial spoke head proteins RSP4 and RSP6 from Chlamydomonas reinhardtii, and several eukaryotic homologues, including mammalian RSHL1, the protein product of a familial ciliary dyskinesia candidate gene." 
Q#13122 - CGI_10022075 superfamily 241647 251 281 1.16E-08 52.145 cl00157 WW superfamily - - Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs. Q#13125 - CGI_10022078 superfamily 247684 6 429 5.21E-88 280.318 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). 
Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#13126 - CGI_10022079 superfamily 245206 16 301 4.59E-101 318.177 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#13127 - CGI_10022080 superfamily 248438 7 108 8.29E-09 53.4001 cl17884 COG1214 superfamily C - "Inactive homolog of metal-dependent proteases, putative molecular chaperone [Posttranslational modification, protein turnover, chaperones]" Q#13128 - CGI_10022081 superfamily 247727 43 136 4.10E-15 68.6106 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. 
Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#13129 - CGI_10022082 superfamily 243035 120 223 2.54E-14 66.1041 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. 
Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#13130 - CGI_10022083 superfamily 241583 110 266 1.37E-77 242.495 cl00064 ZnMc superfamily - - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#13130 - CGI_10022083 superfamily 243048 311 493 2.14E-17 80.048 cl02471 HX superfamily - - Hemopexin-like repeats.; Hemopexin is a heme-binding protein that transports heme to the liver. Hemopexin-like repeats occur in vitronectin and some matrix metalloproteinases family (matrixins). The HX repeats of some matrixins bind tissue inhibitor of metalloproteinases (TIMPs). This CD contains 4 instances of the repeat. Q#13131 - CGI_10022084 superfamily 243072 653 758 4.64E-09 57.0082 cl02529 ANK superfamily C - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. 
The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#13131 - CGI_10022084 superfamily 247038 487 568 0.000836869 40.5224 cl15674 IPT superfamily - - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." Q#13131 - CGI_10022084 superfamily 243697 2 88 1.57E-46 166.039 cl04295 CG-1 superfamily N - CG-1 domain; CG-1 domains are highly conserved domains of about 130 amino-acid residues containing a predicted bipartite NLS and named after a partial cDNA clone isolated from parsley encoding a sequence-specific DNA-binding protein. CG-1 domains are associated with CAMTA proteins (for CAlModulin -binding Transcription Activator) that are transcription factors containing a calmodulin -binding domain and ankyrins (ANK) motifs. Q#13131 - CGI_10022084 superfamily 221856 1461 1625 4.64E-29 117.372 cl15165 Cnd1_N superfamily - - "non-SMC mitotic condensation complex subunit 1, N-term; The three non-SMC (structural maintenance of chromosomes) subunits of the mitotic condensation complex are Cnd1-3. The whole complex is essential for viability and the condensing of chromosomes in mitosis. 
This is the conserved N-terminus of the subunit 1." Q#13133 - CGI_10022086 superfamily 241624 13 265 7.51E-65 206.023 cl00120 PP2Cc superfamily - - "Serine/threonine phosphatases, family 2C, catalytic domain; The protein architecture and deduced catalytic mechanism of PP2C phosphatases are similar to the PP1, PP2A, PP2B family of protein Ser/Thr phosphatases, with which PP2C shares no sequence similarity." Q#13135 - CGI_10022088 superfamily 242385 1065 1370 0 608.676 cl01244 arom_aa_hydroxylase superfamily - - "Biopterin-dependent aromatic amino acid hydroxylase; a family of non-heme, iron(II)-dependent enzymes that includes prokaryotic and eukaryotic phenylalanine-4-hydroxylase (PheOH), eukaryotic tyrosine hydroxylase (TyrOH) and eukaryotic tryptophan hydroxylase (TrpOH). PheOH converts L-phenylalanine to L-tyrosine, an important step in phenylalanine catabolism and neurotransmitter biosynthesis, and is linked to a severe variant of phenylketonuria in humans. TyrOH and TrpOH are involved in the biosynthesis of catecholamine and serotonin, respectively. The eukaryotic enzymes are all homotetramers." Q#13136 - CGI_10022089 superfamily 241874 7 547 0 613.032 cl00456 SLC5-6-like_sbd superfamily - - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. 
NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#13137 - CGI_10022090 superfamily 241874 40 579 0 583.372 cl00456 SLC5-6-like_sbd superfamily - - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#13138 - CGI_10022091 superfamily 248097 25 154 2.26E-35 121.218 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. 
Q#13139 - CGI_10022092 superfamily 248097 53 176 1.16E-29 107.736 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#13140 - CGI_10022093 superfamily 241754 80 834 0 1227.06 cl00286 Motor_domain superfamily - - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. Q#13140 - CGI_10022093 superfamily 111612 32 74 1.41E-14 70.973 cl03686 Myosin_N superfamily - - Myosin N-terminal SH3-like domain; This domain has an SH3-like fold. It is found at the N-terminus of many but not all myosins. The function of this domain is unknown. Q#13140 - CGI_10022093 superfamily 225871 1838 1956 0.0065799 38.6066 cl18728 COG3334 superfamily C - Uncharacterized conserved protein [Function unknown] Q#13142 - CGI_10022095 superfamily 247723 184 254 2.78E-35 124.997 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)."
Q#13142 - CGI_10022095 superfamily 247723 290 363 4.17E-26 100.006 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#13142 - CGI_10022095 superfamily 247723 21 66 1.00E-20 85.4505 cl17169 RRM_SF superfamily N - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain.
The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#13143 - CGI_10022096 superfamily 247723 26 105 5.46E-36 122.478 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#13143 - CGI_10022096 superfamily 247723 118 161 1.02E-24 93.1793 cl17169 RRM_SF superfamily C - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight.
The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#13144 - CGI_10022097 superfamily 241632 161 238 0.00494133 37.2321 cl00137 SERPIN superfamily NC - "SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia." Q#13145 - CGI_10022098 superfamily 243072 10 112 3.89E-26 98.995 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#13146 - CGI_10022099 superfamily 247794 6 315 4.99E-170 477.657 cl17240 FDH_GDH_like superfamily - - "Formate/glycerate dehydrogenases, D-specific 2-hydroxy acid dehydrogenases and related dehydrogenases; The formate/glycerate dehydrogenase like family contains a diverse group of enzymes such as formate dehydrogenase (FDH), glycerate dehydrogenase (GDH), D-lactate dehydrogenase, L-alanine dehydrogenase, and S-Adenosylhomocysteine hydrolase, that share a common 2-domain structure.
Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar domains of the alpha/beta Rossmann fold NAD+ binding form. The NAD(P) binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD(P) is bound, primarily to the C-terminal portion of the 2nd (internal) domain. While many members of this family are dimeric, alanine DH is hexameric and phosphoglycerate DH is tetrameric. 2-hydroxyacid dehydrogenases are enzymes that catalyze the conversion of a wide variety of D-2-hydroxy acids to their corresponding keto acids. The general mechanism is (R)-lactate + acceptor to pyruvate + reduced acceptor. Formate dehydrogenase (FDH) catalyzes the NAD+-dependent oxidation of formate ion to carbon dioxide with the concomitant reduction of NAD+ to NADH. FDHs of this family contain no metal ions or prosthetic groups. Catalysis occurs through direct transfer of a hydride ion to NAD+ without the stages of acid-base catalysis typically found in related dehydrogenases." Q#13147 - CGI_10022100 superfamily 247794 1 238 5.15E-130 372.883 cl17240 FDH_GDH_like superfamily N - "Formate/glycerate dehydrogenases, D-specific 2-hydroxy acid dehydrogenases and related dehydrogenases; The formate/glycerate dehydrogenase like family contains a diverse group of enzymes such as formate dehydrogenase (FDH), glycerate dehydrogenase (GDH), D-lactate dehydrogenase, L-alanine dehydrogenase, and S-Adenosylhomocysteine hydrolase, that share a common 2-domain structure. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar domains of the alpha/beta Rossmann fold NAD+ binding form. The NAD(P) binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain.
Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD(P) is bound, primarily to the C-terminal portion of the 2nd (internal) domain. While many members of this family are dimeric, alanine DH is hexameric and phosphoglycerate DH is tetrameric. 2-hydroxyacid dehydrogenases are enzymes that catalyze the conversion of a wide variety of D-2-hydroxy acids to their corresponding keto acids. The general mechanism is (R)-lactate + acceptor to pyruvate + reduced acceptor. Formate dehydrogenase (FDH) catalyzes the NAD+-dependent oxidation of formate ion to carbon dioxide with the concomitant reduction of NAD+ to NADH. FDHs of this family contain no metal ions or prosthetic groups. Catalysis occurs through direct transfer of a hydride ion to NAD+ without the stages of acid-base catalysis typically found in related dehydrogenases." Q#13149 - CGI_10022102 superfamily 245303 29 385 4.85E-132 388.841 cl10447 GH18_chitinase-like superfamily - - "The GH18 (glycosyl hydrolase, family 18) type II chitinases hydrolyze chitin, an abundant polymer of beta-1,4-linked N-acetylglucosamine (GlcNAc) which is a major component of the cell wall of fungi and the exoskeleton of arthropods. Chitinases have been identified in viruses, bacteria, fungi, protozoan parasites, insects, and plants. The structure of the GH18 domain is an eight-stranded beta/alpha barrel with a pronounced active-site cleft at the C-terminal end of the beta-barrel. The GH18 family includes chitotriosidase, chitobiase, hevamine, zymocin-alpha, narbonin, SI-CLP (stabilin-1 interacting chitinase-like protein), IDGF (imaginal disc growth factor), CFLE (cortical fragment-lytic enzyme) spore hydrolase, the type III and type V plant chitinases, the endo-beta-N-acetylglucosaminidases, and the chitolectins. The GH85 (glycosyl hydrolase, family 85) ENGases (endo-beta-N-acetylglucosaminidases) are closely related to the GH18 chitinases and are included in this alignment model."
Q#13151 - CGI_10022104 superfamily 241599 11 66 3.43E-10 54.1717 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#13152 - CGI_10022105 superfamily 215647 463 537 3.92E-13 68.404 cl18338 7tm_2 superfamily N - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GPCRs). They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97 ; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#13152 - CGI_10022105 superfamily 215647 401 468 1.26E-12 66.8632 cl18338 7tm_2 superfamily C - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GPCRs). They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways.
Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97 ; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#13152 - CGI_10022105 superfamily 243086 340 386 9.99E-10 55.459 cl02559 GPS superfamily - - "Latrophilin/CL-1-like GPS domain; Domain present in latrophilin/CL-1, sea urchin REJ and polycystin." Q#13153 - CGI_10022106 superfamily 243026 4 130 5.15E-09 52.0731 cl02417 Myelin_PLP superfamily N - Myelin proteolipid protein (PLP or lipophilin); Myelin proteolipid protein (PLP or lipophilin). Q#13159 - CGI_10006254 superfamily 217293 38 233 1.24E-37 136.609 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#13159 - CGI_10006254 superfamily 202474 240 319 8.74E-10 57.2785 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#13161 - CGI_10006256 superfamily 196138 123 176 4.23E-19 78.4687 cl11065 TAF8 superfamily - - "TATA Binding Protein (TBP) Associated Factor 8; The TATA Binding Protein (TBP) Associated Factor 8 (TAF8) is one of several TAFs that bind TBP, and is involved in forming the Transcription Factor IID (TFIID) complex. TFIID is one of seven General Transcription Factors (GTF) (TFIIA, TFIIB, TFIID, TFIIE, TFIIF, and TFIID) that are involved in accurate initiation of transcription by RNA polymerase II in eukaryotes. 
TFIID plays an important role in the recognition of promoter DNA and the assembly of the preinitiation complex. The TFIID complex is composed of the TBP and at least 13 TAFs. TAFs from various species were originally named by their predicted molecular weight or their electrophoretic mobility in polyacrylamide gels. A new, unified nomenclature for the pol II TAFs has been suggested to show the relationship between TAF orthologs and paralogs. Several hypotheses are proposed for TAFs' functions, such as serving as activator-binding sites, involvement in the core-promoter recognition, or a role in the essential catalytic activity of the complex. The mouse ortholog of TAF8 is called taube nuss protein (TBN), and is required for early embryonic development. TBN mutant mice exhibit disturbances in the balance between cell death and cell survival in the early embryo. TAF8 plays a role in the differentiation of preadipocyte fibroblasts to adipocytes; it is also required for the integration of TAF10 into the TAF complex. In yeast and human cells, TAFs have been found as components of other complexes besides TFIID. TAF8 is also a component of a small TAF complex (SMAT), which contains TAF8, TAF10 and SUPT7L. Several TAFs interact via histone-fold motifs. The histone fold (HFD) is the interaction motif involved in heterodimerization of the core histones and their assembly into nucleosome octamer. TAF8 contains an H4 related histone fold motif, and interacts with several subunits of TFIID, including TBP and the histone-fold protein TAF10. Currently, five HF-containing TAF pairs have been described or suggested to exist in TFIID: TAF6-TAF9, TAF4-TAF12, TAF11-TAF13, TAF8-TAF10 and TAF3-TAF10." Q#13161 - CGI_10006256 superfamily 248099 10 84 7.32E-21 83.913 cl17545 Bromo_TP superfamily - - Bromodomain associated; This domain is predicted to bind DNA and is often found associated with pfam00439 and in transcription factors. It has a histone-like fold. 
Q#13167 - CGI_10006262 superfamily 243269 1 401 3.88E-97 300.339 cl03012 Ammonium_transp superfamily - - Ammonium Transporter Family; Ammonium Transporter Family. Q#13168 - CGI_10006263 superfamily 245201 213 542 1.16E-58 204.64 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#13170 - CGI_10002629 superfamily 243072 384 509 4.59E-40 145.219 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#13170 - CGI_10002629 superfamily 243072 714 839 5.76E-40 145.219 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#13170 - CGI_10002629 superfamily 243072 780 905 1.46E-38 140.982 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#13170 - CGI_10002629 superfamily 243072 846 971 3.72E-38 139.826 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#13170 - CGI_10002629 superfamily 243072 582 707 4.89E-36 134.048 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#13170 - CGI_10002629 superfamily 243072 450 575 1.86E-31 120.951 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#13170 - CGI_10002629 superfamily 243072 279 410 1.22E-29 115.559 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#13171 - CGI_10012455 superfamily 245814 14 89 4.59E-07 45.8302 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#13171 - CGI_10012455 superfamily 245814 160 235 0.000113723 39.2612 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. 
A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#13174 - CGI_10012458 superfamily 241640 1133 1368 3.62E-64 220.611 cl00149 Tryp_SPc superfamily - - Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues. Q#13174 - CGI_10012458 superfamily 243040 466 549 4.67E-14 71.3838 cl02447 CRD_FZ superfamily C - "CRD_domain cysteine-rich domain, also known as Fz (frizzled) domain; CRD_FZ is an essential component of a number of cell surface receptors, which are involved in multiple signal transduction pathways, particularly in modulating the activity of the Wnt proteins, which play a fundamental role in the early development of metazoans. CRD is also found in secreted frizzled related proteins (SFRPs), which lack the transmembrane segment found in the frizzled protein. The CRD domain is also present in the alpha-1 chain of mouse type XVIII collagen, in carboxypeptidase Z, several receptor tyrosine kinases, and the mosaic transmembrane serine protease corin. The CRD domain is well conserved in metazoans - 10 frizzled proteins have been identified in mammals, 4 in Drosophila and 3 in Caenorhabditis elegans. CRD domains have also been identified in multiple tandem copies in a Dictyostelium discoideum protein. Very little is known about the mechanism by which CRD domains interact with their ligands. The domain contains 10 conserved cysteines." 
Q#13174 - CGI_10012458 superfamily 243040 825 906 4.61E-13 68.6874 cl02447 CRD_FZ superfamily C - "CRD_domain cysteine-rich domain, also known as Fz (frizzled) domain; CRD_FZ is an essential component of a number of cell surface receptors, which are involved in multiple signal transduction pathways, particularly in modulating the activity of the Wnt proteins, which play a fundamental role in the early development of metazoans. CRD is also found in secreted frizzled related proteins (SFRPs), which lack the transmembrane segment found in the frizzled protein. The CRD domain is also present in the alpha-1 chain of mouse type XVIII collagen, in carboxypeptidase Z, several receptor tyrosine kinases, and the mosaic transmembrane serine protease corin. The CRD domain is well conserved in metazoans - 10 frizzled proteins have been identified in mammals, 4 in Drosophila and 3 in Caenorhabditis elegans. CRD domains have also been identified in multiple tandem copies in a Dictyostelium discoideum protein. Very little is known about the mechanism by which CRD domains interact with their ligands. The domain contains 10 conserved cysteines." Q#13174 - CGI_10012458 superfamily 243040 96 200 2.46E-12 66.3762 cl02447 CRD_FZ superfamily - - "CRD_domain cysteine-rich domain, also known as Fz (frizzled) domain; CRD_FZ is an essential component of a number of cell surface receptors, which are involved in multiple signal transduction pathways, particularly in modulating the activity of the Wnt proteins, which play a fundamental role in the early development of metazoans. CRD is also found in secreted frizzled related proteins (SFRPs), which lack the transmembrane segment found in the frizzled protein. The CRD domain is also present in the alpha-1 chain of mouse type XVIII collagen, in carboxypeptidase Z, several receptor tyrosine kinases, and the mosaic transmembrane serine protease corin. 
The CRD domain is well conserved in metazoans - 10 frizzled proteins have been identified in mammals, 4 in Drosophila and 3 in Caenorhabditis elegans. CRD domains have also been identified in multiple tandem copies in a Dictyostelium discoideum protein. Very little is known about the mechanism by which CRD domains interact with their ligands. The domain contains 10 conserved cysteines." Q#13174 - CGI_10012458 superfamily 243040 698 804 1.23E-11 64.4502 cl02447 CRD_FZ superfamily - - "CRD_domain cysteine-rich domain, also known as Fz (frizzled) domain; CRD_FZ is an essential component of a number of cell surface receptors, which are involved in multiple signal transduction pathways, particularly in modulating the activity of the Wnt proteins, which play a fundamental role in the early development of metazoans. CRD is also found in secreted frizzled related proteins (SFRPs), which lack the transmembrane segment found in the frizzled protein. The CRD domain is also present in the alpha-1 chain of mouse type XVIII collagen, in carboxypeptidase Z, several receptor tyrosine kinases, and the mosaic transmembrane serine protease corin. The CRD domain is well conserved in metazoans - 10 frizzled proteins have been identified in mammals, 4 in Drosophila and 3 in Caenorhabditis elegans. CRD domains have also been identified in multiple tandem copies in a Dictyostelium discoideum protein. Very little is known about the mechanism by which CRD domains interact with their ligands. The domain contains 10 conserved cysteines." Q#13174 - CGI_10012458 superfamily 243040 200 281 1.17E-10 61.3686 cl02447 CRD_FZ superfamily C - "CRD_domain cysteine-rich domain, also known as Fz (frizzled) domain; CRD_FZ is an essential component of a number of cell surface receptors, which are involved in multiple signal transduction pathways, particularly in modulating the activity of the Wnt proteins, which play a fundamental role in the early development of metazoans. 
CRD is also found in secreted frizzled related proteins (SFRPs), which lack the transmembrane segment found in the frizzled protein. The CRD domain is also present in the alpha-1 chain of mouse type XVIII collagen, in carboxypeptidase Z, several receptor tyrosine kinases, and the mosaic transmembrane serine protease corin. The CRD domain is well conserved in metazoans - 10 frizzled proteins have been identified in mammals, 4 in Drosophila and 3 in Caenorhabditis elegans. CRD domains have also been identified in multiple tandem copies in a Dictyostelium discoideum protein. Very little is known about the mechanism by which CRD domains interact with their ligands. The domain contains 10 conserved cysteines." Q#13174 - CGI_10012458 superfamily 243040 590 676 7.18E-10 59.0574 cl02447 CRD_FZ superfamily C - "CRD_domain cysteine-rich domain, also known as Fz (frizzled) domain; CRD_FZ is an essential component of a number of cell surface receptors, which are involved in multiple signal transduction pathways, particularly in modulating the activity of the Wnt proteins, which play a fundamental role in the early development of metazoans. CRD is also found in secreted frizzled related proteins (SFRPs), which lack the transmembrane segment found in the frizzled protein. The CRD domain is also present in the alpha-1 chain of mouse type XVIII collagen, in carboxypeptidase Z, several receptor tyrosine kinases, and the mosaic transmembrane serine protease corin. The CRD domain is well conserved in metazoans - 10 frizzled proteins have been identified in mammals, 4 in Drosophila and 3 in Caenorhabditis elegans. CRD domains have also been identified in multiple tandem copies in a Dictyostelium discoideum protein. Very little is known about the mechanism by which CRD domains interact with their ligands. The domain contains 10 conserved cysteines." 
Q#13174 - CGI_10012458 superfamily 243040 345 443 1.53E-09 57.9018 cl02447 CRD_FZ superfamily - - "CRD_domain cysteine-rich domain, also known as Fz (frizzled) domain; CRD_FZ is an essential component of a number of cell surface receptors, which are involved in multiple signal transduction pathways, particularly in modulating the activity of the Wnt proteins, which play a fundamental role in the early development of metazoans. CRD is also found in secreted frizzled related proteins (SFRPs), which lack the transmembrane segment found in the frizzled protein. The CRD domain is also present in the alpha-1 chain of mouse type XVIII collagen, in carboxypeptidase Z, several receptor tyrosine kinases, and the mosaic transmembrane serine protease corin. The CRD domain is well conserved in metazoans - 10 frizzled proteins have been identified in mammals, 4 in Drosophila and 3 in Caenorhabditis elegans. CRD domains have also been identified in multiple tandem copies in a Dictyostelium discoideum protein. Very little is known about the mechanism by which CRD domains interact with their ligands. The domain contains 10 conserved cysteines." 
Q#13174 - CGI_10012458 superfamily 241613 1815 1846 2.63E-08 52.5942 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#13174 - CGI_10012458 superfamily 241613 959 995 2.16E-07 49.8978 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#13174 - CGI_10012458 superfamily 241613 2051 2080 2.22E-06 47.2014 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and 
transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#13174 - CGI_10012458 superfamily 241613 998 1032 5.25E-06 46.0458 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#13174 - CGI_10012458 superfamily 241613 1373 1401 0.000187453 41.4234 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor 
and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#13174 - CGI_10012458 superfamily 241613 1778 1813 0.00179327 38.3418 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#13174 - CGI_10012458 superfamily 243051 1858 1990 1.29E-20 92.0284 cl02479 MAM superfamily C - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. 
In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#13174 - CGI_10012458 superfamily 243051 1411 1563 1.38E-13 70.8425 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#13174 - CGI_10012458 superfamily 243051 1611 1774 1.55E-08 55.4345 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). 
In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#13174 - CGI_10012458 superfamily 243040 3 62 0.000135577 42.4939 cl02447 CRD_FZ superfamily N - "CRD_domain cysteine-rich domain, also known as Fz (frizzled) domain; CRD_FZ is an essential component of a number of cell surface receptors, which are involved in multiple signal transduction pathways, particularly in modulating the activity of the Wnt proteins, which play a fundamental role in the early development of metazoans. CRD is also found in secreted frizzled related proteins (SFRPs), which lack the transmembrane segment found in the frizzled protein. The CRD domain is also present in the alpha-1 chain of mouse type XVIII collagen, in carboxypeptidase Z, several receptor tyrosine kinases, and the mosaic transmembrane serine protease corin. The CRD domain is well conserved in metazoans - 10 frizzled proteins have been identified in mammals, 4 in Drosophila and 3 in Caenorhabditis elegans. CRD domains have also been identified in multiple tandem copies in a Dictyostelium discoideum protein. Very little is known about the mechanism by which CRD domains interact with their ligands. The domain contains 10 conserved cysteines." 
Q#13177 - CGI_10012461 superfamily 245595 356 480 9.98E-50 177.943 cl11393 Peptidase_M14_like superfamily N - "M14 family of metallocarboxypeptidases and related proteins; The M14 family of metallocarboxypeptidases (MCPs), also known as funnelins, are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Two major subfamilies of the M14 family, defined based on sequence and structural homology, are the A/B and N/E subfamilies. Enzymes belonging to the A/B subfamily are normally synthesized as inactive precursors containing preceding signal peptide, followed by an N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases. The A/B enzymes can be further divided based on their substrate specificity; Carboxypeptidase A-like (CPA-like) enzymes favor hydrophobic residues while carboxypeptidase B-like (CPB-like) enzymes only cleave the basic residues lysine or arginine. The A forms have slightly different specificities, with Carboxypeptidase A1 (CPA1) preferring aliphatic and small aromatic residues, and CPA2 preferring the bulky aromatic side chains. Enzymes belonging to the N/E subfamily enzymes are not produced as inactive precursors and instead rely on their substrate specificity and subcellular compartmentalization to prevent inappropriate cleavage. They contain an extra C-terminal transthyretin-like domain, thought to be involved in folding or formation of oligomers. 
MCPs can also be classified based on their involvement in specific physiological processes; the pancreatic MCPs participate only in alimentary digestion and include carboxypeptidase A and B (A/B subfamily), while others, namely regulatory MCPs or the N/E subfamily, are involved in more selective reactions, mainly in non-digestive tissues and fluids, acting on blood coagulation/fibrinolysis, inflammation and local anaphylaxis, pro-hormone and neuropeptide processing, cellular response and others. Another MCP subfamily, is that of succinylglutamate desuccinylase /aspartoacylase, which hydrolyzes N-acetyl-L-aspartate (NAA), and deficiency in which is the established cause of Canavan disease. Another subfamily (referred to as subfamily C) includes an exceptional type of activity in the MCP family, that of dipeptidyl-peptidase activity of gamma-glutamyl-(L)-meso-diaminopimelate peptidase I which is involved in bacterial cell wall metabolism." Q#13177 - CGI_10012461 superfamily 245595 275 357 2.61E-30 121.704 cl11393 Peptidase_M14_like superfamily C - "M14 family of metallocarboxypeptidases and related proteins; The M14 family of metallocarboxypeptidases (MCPs), also known as funnelins, are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Two major subfamilies of the M14 family, defined based on sequence and structural homology, are the A/B and N/E subfamilies. Enzymes belonging to the A/B subfamily are normally synthesized as inactive precursors containing preceding signal peptide, followed by an N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases. 
The A/B enzymes can be further divided based on their substrate specificity; Carboxypeptidase A-like (CPA-like) enzymes favor hydrophobic residues while carboxypeptidase B-like (CPB-like) enzymes only cleave the basic residues lysine or arginine. The A forms have slightly different specificities, with Carboxypeptidase A1 (CPA1) preferring aliphatic and small aromatic residues, and CPA2 preferring the bulky aromatic side chains. Enzymes belonging to the N/E subfamily enzymes are not produced as inactive precursors and instead rely on their substrate specificity and subcellular compartmentalization to prevent inappropriate cleavage. They contain an extra C-terminal transthyretin-like domain, thought to be involved in folding or formation of oligomers. MCPs can also be classified based on their involvement in specific physiological processes; the pancreatic MCPs participate only in alimentary digestion and include carboxypeptidase A and B (A/B subfamily), while others, namely regulatory MCPs or the N/E subfamily, are involved in more selective reactions, mainly in non-digestive tissues and fluids, acting on blood coagulation/fibrinolysis, inflammation and local anaphylaxis, pro-hormone and neuropeptide processing, cellular response and others. Another MCP subfamily, is that of succinylglutamate desuccinylase /aspartoacylase, which hydrolyzes N-acetyl-L-aspartate (NAA), and deficiency in which is the established cause of Canavan disease. Another subfamily (referred to as subfamily C) includes an exceptional type of activity in the MCP family, that of dipeptidyl-peptidase activity of gamma-glutamyl-(L)-meso-diaminopimelate peptidase I which is involved in bacterial cell wall metabolism." Q#13182 - CGI_10012466 superfamily 216574 162 299 9.10E-38 136.569 cl14794 FAD_binding_4 superfamily - - "FAD binding domain; This family consists of various enzymes that use FAD as a co-factor, most of the enzymes are similar to oxygen oxidoreductase. 
One of the enzymes Vanillyl-alcohol oxidase (VAO) has a solved structure, the alignment includes the FAD binding site, called the PP-loop, between residues 99-110. The FAD molecule is covalently bound in the known structure, however the residue that links to the FAD is not in the alignment. VAO catalyzes the oxidation of a wide variety of substrates, ranging form aromatic amines to 4-alkylphenols. Other members of this family include D-lactate dehydrogenase, this enzyme catalyzes the conversion of D-lactate to pyruvate using FAD as a co-factor; mitomycin radical oxidase, this enzyme oxidises the reduced form of mitomycins and is involved in mitomycin resistance. This family includes MurB an UDP-N-acetylenolpyruvoylglucosamine reductase enzyme EC:1.1.1.158. This enzyme is involved in the biosynthesis of peptidoglycan." Q#13184 - CGI_10012468 superfamily 246748 26 461 0 648.736 cl14876 Zinc_peptidase_like superfamily - - "Zinc peptidases M18, M20, M28, and M42; Zinc peptidases play vital roles in metabolic and signaling pathways throughout all kingdoms of life. This family corresponds to several clans in the MEROPS database, including the MH clan, which contains 4 families (M18, M20, M28, M42). The peptidase M20 family includes carboxypeptidases such as the glutamate carboxypeptidase from Pseudomonas, the thermostable carboxypeptidase Ss1 of broad specificity from archaea and yeast Gly-X carboxypeptidase. The dipeptidases include bacterial dipeptidase, peptidase V (PepV), a eukaryotic, non-specific dipeptidase, and two Xaa-His dipeptidases (carnosinases). There is also the bacterial aminopeptidase, peptidase T (PepT) that acts only on tripeptide substrates and has therefore been termed a tripeptidase. Peptidase family M28 contains aminopeptidases and carboxypeptidases, and has co-catalytic zinc ions. However, several enzymes in this family utilize other first row transition metal ions such as cobalt and manganese. 
Each zinc ion is tetrahedrally co-ordinated, with three amino acid ligands plus activated water; one aspartate residue binds both metal ions. The aminopeptidases in this family are also called bacterial leucyl aminopeptidases, but are able to release a variety of N-terminal amino acids. IAP aminopeptidase and aminopeptidase Y preferentially release basic amino acids while glutamate carboxypeptidase II preferentially releases C-terminal glutamates. Glutamate carbxypeptidase II and plasma glutamate carboxypeptidase hydrolyze dipeptides. Peptidase families M18 and M42 contain metalloaminopeptidases. M18 is widely distributed in bacteria and eukaryotes. However, only yeast aminopeptidase I and mammalian aspartyl aminopeptidase have been characterized in detail. Some of M42 (also known as glutamyl aminopeptidase) enzymes exhibit aminopeptidase specificity while others also have acylaminoacylpeptidase activity (i.e. hydrolysis of acylated N-terminal residues)." Q#13185 - CGI_10012469 superfamily 241811 49 141 1.07E-19 79.0391 cl00355 Ribosomal_S14 superfamily - - Ribosomal protein S14p/S29e; This family includes both ribosomal S14 from prokaryotes and S29 from eukaryotes. Q#13187 - CGI_10012471 superfamily 243035 2 62 0.000163786 37.9646 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. 
For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#13187 - CGI_10012471 superfamily 241619 74 145 0.000284337 36.4061 cl00112 PAN_APPLE superfamily - - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#13188 - CGI_10012472 superfamily 241559 86 183 8.60E-08 47.3055 cl00030 CH superfamily - - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." Q#13189 - CGI_10012473 superfamily 217210 719 1161 1.32E-94 315.371 cl10595 Ald_Xan_dh_C2 superfamily - - Molybdopterin-binding domain of aldehyde dehydrogenase; Molybdopterin-binding domain of aldehyde dehydrogenase. Q#13189 - CGI_10012473 superfamily 243326 602 708 8.13E-26 104.522 cl03161 Ald_Xan_dh_C superfamily - - "Aldehyde oxidase and xanthine dehydrogenase, a/b hammerhead domain; Aldehyde oxidase and xanthine dehydrogenase, a/b hammerhead domain. " Q#13189 - CGI_10012473 superfamily 201981 170 244 8.99E-17 77.5208 cl08334 Fer2_2 superfamily - - [2Fe-2S] binding domain; [2Fe-2S] binding domain. Q#13189 - CGI_10012473 superfamily 244932 467 552 2.62E-09 56.3557 cl08390 CO_deh_flav_C superfamily - - CO dehydrogenase flavoprotein C-terminal domain; CO dehydrogenase flavoprotein C-terminal domain. 
Q#13189 - CGI_10012473 superfamily 241649 104 160 0.00419972 37.1102 cl00159 fer2 superfamily - - "2Fe-2S iron-sulfur cluster binding domain. Iron-sulfur proteins play an important role in electron transfer processes and in various enzymatic reactions. The family includes plant and algal ferredoxins, which act as electron carriers in photosynthesis and ferredoxins, which participate in redox chains (from bacteria to mammals). Fold is ismilar to thioredoxin." Q#13190 - CGI_10012474 superfamily 247724 9 239 8.89E-157 446.167 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#13190 - CGI_10012474 superfamily 243184 335 438 1.04E-59 192.794 cl02786 Translation_factor_III superfamily - - "Domain III of Elongation factor (EF) Tu (EF-TU) and EF-G. 
Elongation factors (EF) EF-Tu and EF-G participate in the elongation phase during protein biosynthesis on the ribosome. Their functional cycles depend on GTP binding and its hydrolysis. The EF-Tu complexed with GTP and aminoacyl-tRNA delivers tRNA to the ribosome, whereas EF-G stimulates translocation, a process in which tRNA and mRNA movements occur in the ribosome. Experimental data showed that: (1) intrinsic GTPase activity of EF-G is influenced by excision of its domain III; (2) that EF-G lacking domain III has a 1,000-fold decreased GTPase activity on the ribosome and, a slightly decreased affinity for GTP; and (3) EF-G lacking domain III does not stimulate translocation, despite the physical presence of domain IV which is also very important for translocation. These findings indicate an essential contribution of domain III to activation of GTP hydrolysis. Domains III and V of EF-G have the same fold (although they are not completely superimposable), the double split beta-alpha-beta fold. This fold is observed in a large number of ribonucleotide binding proteins and is also referred to as the ribonucleoprotein (RNP) or RNA recognition (RRM) motif. This domain III is found in several elongation factors, as well as in peptide chain release factors and in GT-1 family of GTPase (GTPBP1)." Q#13190 - CGI_10012474 superfamily 243185 242 332 1.53E-57 186.604 cl02787 Translation_Factor_II_like superfamily - - "Translation_Factor_II_like: Elongation factor Tu (EF-Tu) domain II-like proteins. Elongation factor Tu consists of three structural domains, this family represents the second domain. Domain II adopts a beta barrel structure and is involved in binding to charged tRNA. Domain II is found in other proteins such as elongation factor G and translation initiation factor IF-2. This group also includes the C2 subdomain of domain IV of IF-2 that has the same fold as domain II of (EF-Tu). 
Like IF-2 from certain prokaryotes such as Thermus thermophilus, mitochondrial IF-2 lacks domain II, which is thought to be involved in binding of E.coli IF-2 to 30S subunits." Q#13191 - CGI_10012475 superfamily 241894 16 74 3.63E-13 58.9562 cl00481 SecE superfamily N - "SecE/Sec61-gamma subunits of protein translocation complex; SecE is part of the SecYEG complex in bacteria which translocates proteins from the cytoplasm. In eukaryotes the complex, made from Sec61-gamma and Sec61-alpha translocates protein from the cytoplasm to the ER. Archaea have a similar complex." Q#13192 - CGI_10012476 superfamily 243056 116 419 1.86E-39 145.912 cl02495 RabGAP-TBC superfamily - - "Rab-GTPase-TBC domain; Identification of a TBC domain in GYP6_YEAST and GYP7_YEAST, which are GTPase activator proteins of yeast Ypt6 and Ypt7, implies that these domains are GTPase activator proteins of Rab-like small GTPases." Q#13194 - CGI_10012478 superfamily 243075 45 117 2.74E-19 81.9807 cl02536 SAND superfamily - - "SAND domain; The DNA binding activity of two proteins has been mapped to the SAND domain. The conserved KDWK motif is necessary for DNA binding, and it appears to be important for dimerisation. This region is also found in the putative transcription factor RegA from the multicellular green alga Volvox cateri. This region of RegA is known as the VARL domain." Q#13195 - CGI_10012479 superfamily 247038 99 187 5.06E-13 63.6277 cl15674 IPT superfamily - - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. 
In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." Q#13196 - CGI_10011917 superfamily 245201 587 835 3.95E-163 490.114 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#13197 - CGI_10011918 superfamily 247860 67 492 0 772.002 cl17306 HgmA superfamily - - "homogentisate 1,2-dioxygenase; Homogentisate dioxygenase cleaves the aromatic ring during the metabolic degradation of Phe and Tyr. Homogentisate dioxygenase deficiency causes alkaptonuria. The structure of homogentisate dioxygenase shows that the enzyme forms a hexamer arrangement comprised of a dimer of trimers. The active site iron ion is coordinated near the interface between the trimers." Q#13199 - CGI_10011920 superfamily 243109 426 543 2.59E-12 64.9905 cl02614 SPRY superfamily - - "SPRY domain; SPRY domains, first identified in the SP1A kinase of Dictyostelium and rabbit Ryanodine receptor (hence the name), are homologous to B30.2. SPRY domains have been identified in at least 11 protein families, covering a wide range of functions, including regulation of cytokine signaling (SOCS), RNA metabolism (DDX1 and hnRNP), immunity to retroviruses (TRIM5alpha), intracellular calcium release (ryanodine receptors or RyR) and regulatory and developmental processes (HERC1 and Ash2L). B30.2 also contains residues in the N-terminus that form a distinct PRY domain structure; i.e. B30.2 domain consists of PRY and SPRY subdomains. 
B30.2 domains comprise the C-terminus of three protein families: BTNs (receptor glycoproteins of immunoglobulin superfamily); several TRIM proteins (composed of RING/B-box/coiled-coil or RBCC core); Stonutoxin (secreted poisonous protein of the stonefish Synanceia horrida). While SPRY domains are evolutionarily ancient, B30.2 domains are a more recent adaptation where the SPRY/PRY combination is a possible component of immune defense. Mutations found in the SPRY-containing proteins have shown to cause Mediterranean fever and Opitz syndrome." Q#13199 - CGI_10011920 superfamily 243109 43 161 8.38E-09 54.5901 cl02614 SPRY superfamily - - "SPRY domain; SPRY domains, first identified in the SP1A kinase of Dictyostelium and rabbit Ryanodine receptor (hence the name), are homologous to B30.2. SPRY domains have been identified in at least 11 protein families, covering a wide range of functions, including regulation of cytokine signaling (SOCS), RNA metabolism (DDX1 and hnRNP), immunity to retroviruses (TRIM5alpha), intracellular calcium release (ryanodine receptors or RyR) and regulatory and developmental processes (HERC1 and Ash2L). B30.2 also contains residues in the N-terminus that form a distinct PRY domain structure; i.e. B30.2 domain consists of PRY and SPRY subdomains. B30.2 domains comprise the C-terminus of three protein families: BTNs (receptor glycoproteins of immunoglobulin superfamily); several TRIM proteins (composed of RING/B-box/coiled-coil or RBCC core); Stonutoxin (secreted poisonous protein of the stonefish Synanceia horrida). While SPRY domains are evolutionarily ancient, B30.2 domains are a more recent adaptation where the SPRY/PRY combination is a possible component of immune defense. Mutations found in the SPRY-containing proteins have shown to cause Mediterranean fever and Opitz syndrome." 
Q#13199 - CGI_10011920 superfamily 243109 273 344 9.02E-07 48.4443 cl02614 SPRY superfamily N - "SPRY domain; SPRY domains, first identified in the SP1A kinase of Dictyostelium and rabbit Ryanodine receptor (hence the name), are homologous to B30.2. SPRY domains have been identified in at least 11 protein families, covering a wide range of functions, including regulation of cytokine signaling (SOCS), RNA metabolism (DDX1 and hnRNP), immunity to retroviruses (TRIM5alpha), intracellular calcium release (ryanodine receptors or RyR) and regulatory and developmental processes (HERC1 and Ash2L). B30.2 also contains residues in the N-terminus that form a distinct PRY domain structure; i.e. B30.2 domain consists of PRY and SPRY subdomains. B30.2 domains comprise the C-terminus of three protein families: BTNs (receptor glycoproteins of immunoglobulin superfamily); several TRIM proteins (composed of RING/B-box/coiled-coil or RBCC core); Stonutoxin (secreted poisonous protein of the stonefish Synanceia horrida). While SPRY domains are evolutionarily ancient, B30.2 domains are a more recent adaptation where the SPRY/PRY combination is a possible component of immune defense. Mutations found in the SPRY-containing proteins have shown to cause Mediterranean fever and Opitz syndrome." Q#13199 - CGI_10011920 superfamily 243109 596 725 6.48E-05 43.0515 cl02614 SPRY superfamily - - "SPRY domain; SPRY domains, first identified in the SP1A kinase of Dictyostelium and rabbit Ryanodine receptor (hence the name), are homologous to B30.2. SPRY domains have been identified in at least 11 protein families, covering a wide range of functions, including regulation of cytokine signaling (SOCS), RNA metabolism (DDX1 and hnRNP), immunity to retroviruses (TRIM5alpha), intracellular calcium release (ryanodine receptors or RyR) and regulatory and developmental processes (HERC1 and Ash2L). B30.2 also contains residues in the N-terminus that form a distinct PRY domain structure; i.e. 
B30.2 domain consists of PRY and SPRY subdomains. B30.2 domains comprise the C-terminus of three protein families: BTNs (receptor glycoproteins of immunoglobulin superfamily); several TRIM proteins (composed of RING/B-box/coiled-coil or RBCC core); Stonutoxin (secreted poisonous protein of the stonefish Synanceia horrida). While SPRY domains are evolutionarily ancient, B30.2 domains are a more recent adaptation where the SPRY/PRY combination is a possible component of immune defense. Mutations found in the SPRY-containing proteins have shown to cause Mediterranean fever and Opitz syndrome." Q#13201 - CGI_10011922 superfamily 217473 99 323 2.34E-26 109.377 cl03978 Mab-21 superfamily - - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#13202 - CGI_10011923 superfamily 248097 186 309 1.51E-29 110.047 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#13205 - CGI_10011926 superfamily 245835 97 160 0.0030757 38.0031 cl12013 BAR superfamily NC - "The Bin/Amphiphysin/Rvs (BAR) domain, a dimerization module that binds membranes and detects membrane curvature; BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions including organelle biogenesis, membrane trafficking or remodeling, and cell division and migration. Mutations in BAR containing proteins have been linked to diseases and their inactivation in cells leads to altered membrane dynamics. A BAR domain with an additional N-terminal amphipathic helix (an N-BAR) can drive membrane curvature. These N-BAR domains are found in amphiphysins and endophilins, among others. 
BAR domains are also frequently found alongside domains that determine lipid specificity, such as the Pleckstrin Homology (PH) and Phox Homology (PX) domains which are present in beta centaurins (ACAPs and ASAPs) and sorting nexins, respectively. A FES-CIP4 Homology (FCH) domain together with a coiled coil region is called the F-BAR domain and is present in Pombe/Cdc15 homology (PCH) family proteins, which include Fes/Fes tyrosine kinases, PACSIN or syndapin, CIP4-like proteins, and srGAPs, among others. The Inverse (I)-BAR or IRSp53/MIM homology Domain (IMD) is found in multi-domain proteins, such as IRSp53 and MIM, that act as scaffolding proteins and transducers of a variety of signaling pathways that link membrane dynamics and the underlying actin cytoskeleton. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The I-BAR domain induces membrane protrusions in the opposite direction compared to classical BAR and F-BAR domains, which produce membrane invaginations. BAR domains that also serve as protein interaction domains include those of arfaptin and OPHN1-like proteins, among others, which bind to Rac and Rho GAP domains, respectively." Q#13205 - CGI_10011926 superfamily 241563 28 58 0.00405222 35.5328 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#13206 - CGI_10011927 superfamily 241563 18 50 0.000657748 37.7019 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#13208 - CGI_10011930 superfamily 218109 236 267 1.03E-07 49.2462 cl12292 Gly_transf_sug superfamily N - "Glycosyltransferase sugar-binding region containing DXD motif; The DXD motif is a short conserved motif found in many families of glycosyltransferases, which add a range of different sugars to other sugars, phosphates and proteins. DXD-containing glycosyltransferases all use nucleoside diphosphate sugars as donors and require divalent cations, usually manganese. The DXD motif is expected to play a carbohydrate binding role in sugar-nucleoside diphosphate and manganese dependent glycosyltransferases." Q#13209 - CGI_10011931 superfamily 215647 96 308 1.31E-45 157.385 cl18338 7tm_2 superfamily - - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GCPRs).They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97 ; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#13209 - CGI_10011931 superfamily 243029 3 55 1.03E-06 45.4193 cl02422 HRM superfamily - - Hormone receptor domain; This extracellular domain contains four conserved cysteines that probably for disulphide bridges. The domain is found in a variety of hormone receptors. It may be a ligand binding domain. 
Q#13210 - CGI_10011932 superfamily 215647 118 358 1.89E-53 180.882 cl18338 7tm_2 superfamily - - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GCPRs).They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97 ; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#13210 - CGI_10011932 superfamily 243029 36 108 5.05E-11 58.68 cl02422 HRM superfamily - - Hormone receptor domain; This extracellular domain contains four conserved cysteines that probably for disulphide bridges. The domain is found in a variety of hormone receptors. It may be a ligand binding domain. Q#13214 - CGI_10004900 superfamily 241817 29 74 2.08E-19 78.3882 cl00365 F1-ATPase_gamma superfamily C - "mitochondrial ATP synthase gamma subunit; The F-ATPase is found in bacterial plasma membranes, mitochondrial inner membranes and in chloroplast thylakoid membranes. It has also been found in the archaea Methanosarcina barkeri. It uses a proton gradient to drive ATP synthesis and hydrolyzes ATP to build the proton gradient. The extrinisic membrane domain of F-ATPases is composed of alpha, beta, gamma, delta, and epsilon (not present in bacteria) subunits with a stoichiometry of 3:3:1:1:1. 
Alpha and beta subunit form the globular catalytic moiety, a hexameric ring of alternating subunits. Gamma, delta and epsilon subunits form a stalk, connecting F1 to F0, the integral membrane proton translocating domain." Q#13215 - CGI_10004901 superfamily 241817 10 208 6.51E-70 217.06 cl00365 F1-ATPase_gamma superfamily N - "mitochondrial ATP synthase gamma subunit; The F-ATPase is found in bacterial plasma membranes, mitochondrial inner membranes and in chloroplast thylakoid membranes. It has also been found in the archaea Methanosarcina barkeri. It uses a proton gradient to drive ATP synthesis and hydrolyzes ATP to build the proton gradient. The extrinisic membrane domain of F-ATPases is composed of alpha, beta, gamma, delta, and epsilon (not present in bacteria) subunits with a stoichiometry of 3:3:1:1:1. Alpha and beta subunit form the globular catalytic moiety, a hexameric ring of alternating subunits. Gamma, delta and epsilon subunits form a stalk, connecting F1 to F0, the integral membrane proton translocating domain." Q#13221 - CGI_10013649 superfamily 198867 46 146 7.85E-43 148.075 cl06652 BACK superfamily - - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation)." Q#13221 - CGI_10013649 superfamily 243146 334 380 3.01E-15 70.2799 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#13221 - CGI_10013649 superfamily 243146 228 273 3.30E-13 64.605 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#13221 - CGI_10013649 superfamily 243146 289 333 1.34E-12 62.5759 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. 
" Q#13221 - CGI_10013649 superfamily 243146 422 466 5.40E-11 58.0566 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#13221 - CGI_10013649 superfamily 243146 195 239 4.88E-09 52.5607 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#13221 - CGI_10013649 superfamily 243146 369 418 9.55E-08 48.8118 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#13221 - CGI_10013649 superfamily 243066 5 41 9.95E-05 40.7521 cl02518 BTB superfamily N - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." 
Q#13222 - CGI_10013650 superfamily 243034 2040 2172 4.96E-19 85.5095 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#13222 - CGI_10013650 superfamily 243034 1272 1369 6.34E-14 70.8719 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the 
number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#13222 - CGI_10013650 superfamily 243034 1833 1934 3.15E-13 68.5608 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and 
the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#13222 - CGI_10013650 superfamily 243034 1657 1758 5.34E-13 68.1756 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#13222 - CGI_10013650 superfamily 243034 1763 1864 1.96E-11 63.5532 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; 
involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#13222 - CGI_10013650 superfamily 243034 1377 1508 2.00E-10 60.4716 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the 
Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#13222 - CGI_10013650 superfamily 243034 1982 2070 3.57E-10 59.7012 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#13222 - CGI_10013650 superfamily 243034 1480 1576 4.68E-09 56.2344 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein 
interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#13222 - CGI_10013650 superfamily 243034 1204 1301 7.91E-08 52.7676 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components 
of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#13222 - CGI_10013650 superfamily 243034 1581 1688 3.08E-07 50.8416 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#13224 - CGI_10013652 superfamily 118132 12 57 0.00987064 35.2001 cl09814 MamL-1 superfamily - - "MamL-1 domain; The MamL-1 domain is a polypeptide of up to 70 residues, numbers 15-67 of which adopt an elongated kinked helix that wraps around ANK and CSL forming one of the complexes in the build-up of the Notch transcriptional complex for recruiting general transcription 
factors." Q#13226 - CGI_10013654 superfamily 222150 263 288 4.10E-05 40.4529 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#13227 - CGI_10013655 superfamily 206088 18 49 5.72E-09 48.4863 cl16476 zf-CCHC_3 superfamily - - "Zinc knuckle; The zinc knuckle is a zinc binding motif composed of the following CX2CX4HX4C where X can be any amino acid. The motifs are mostly from retroviral gag proteins (nucleocapsid). Prototype structure is from HIV. Also contains members involved in eukaryotic gene regulation, such as C. elegans GLH-1. Structure is an 18-residue zinc finger." Q#13228 - CGI_10013656 superfamily 245213 399 433 4.75E-06 44.1646 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13228 - CGI_10013656 superfamily 245213 287 321 0.000175562 39.5422 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#13228 - CGI_10013656 superfamily 245213 479 511 0.0068564 34.9198 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13228 - CGI_10013656 superfamily 219501 19 86 4.41E-12 62.3478 cl06622 MNNL superfamily - - N terminus of Notch ligand; This entry represents a region of conserved sequence at the N terminus of several Notch ligand proteins. Q#13229 - CGI_10013657 superfamily 241686 539 601 1.05E-16 77.2609 cl00207 HMA superfamily - - "Heavy-metal-associated domain (HMA) is a conserved domain of approximately 30 amino acid residues found in a number of proteins that transport or detoxify heavy metals, for example, the CPx-type heavy metal ATPases and copper chaperones. HMA domain contains two cysteine residues that are important in binding and transfer of metal ions, such as copper, cadmium, cobalt and zinc. In the case of copper, stoichiometry of binding is one Cu+ ion per binding domain. Repeats of the HMA domain in copper chaperone has been associated with Menkes/Wilson disease due to binding of multiple copper ions." Q#13229 - CGI_10013657 superfamily 241686 339 402 1.48E-15 73.7941 cl00207 HMA superfamily - - "Heavy-metal-associated domain (HMA) is a conserved domain of approximately 30 amino acid residues found in a number of proteins that transport or detoxify heavy metals, for example, the CPx-type heavy metal ATPases and copper chaperones. 
HMA domain contains two cysteine residues that are important in binding and transfer of metal ions, such as copper, cadmium, cobalt and zinc. In the case of copper, stoichiometry of binding is one Cu+ ion per binding domain. Repeats of the HMA domain in copper chaperone has been associated with Menkes/Wilson disease due to binding of multiple copper ions." Q#13229 - CGI_10013657 superfamily 241686 170 232 1.85E-15 73.7941 cl00207 HMA superfamily - - "Heavy-metal-associated domain (HMA) is a conserved domain of approximately 30 amino acid residues found in a number of proteins that transport or detoxify heavy metals, for example, the CPx-type heavy metal ATPases and copper chaperones. HMA domain contains two cysteine residues that are important in binding and transfer of metal ions, such as copper, cadmium, cobalt and zinc. In the case of copper, stoichiometry of binding is one Cu+ ion per binding domain. Repeats of the HMA domain in copper chaperone has been associated with Menkes/Wilson disease due to binding of multiple copper ions." Q#13229 - CGI_10013657 superfamily 241686 248 308 5.96E-15 72.2533 cl00207 HMA superfamily - - "Heavy-metal-associated domain (HMA) is a conserved domain of approximately 30 amino acid residues found in a number of proteins that transport or detoxify heavy metals, for example, the CPx-type heavy metal ATPases and copper chaperones. HMA domain contains two cysteine residues that are important in binding and transfer of metal ions, such as copper, cadmium, cobalt and zinc. In the case of copper, stoichiometry of binding is one Cu+ ion per binding domain. Repeats of the HMA domain in copper chaperone has been associated with Menkes/Wilson disease due to binding of multiple copper ions." 
Q#13229 - CGI_10013657 superfamily 241686 419 479 6.52E-13 66.0901 cl00207 HMA superfamily - - "Heavy-metal-associated domain (HMA) is a conserved domain of approximately 30 amino acid residues found in a number of proteins that transport or detoxify heavy metals, for example, the CPx-type heavy metal ATPases and copper chaperones. HMA domain contains two cysteine residues that are important in binding and transfer of metal ions, such as copper, cadmium, cobalt and zinc. In the case of copper, stoichiometry of binding is one Cu+ ion per binding domain. Repeats of the HMA domain in copper chaperone has been associated with Menkes/Wilson disease due to binding of multiple copper ions." Q#13229 - CGI_10013657 superfamily 241686 614 677 8.03E-13 66.0901 cl00207 HMA superfamily - - "Heavy-metal-associated domain (HMA) is a conserved domain of approximately 30 amino acid residues found in a number of proteins that transport or detoxify heavy metals, for example, the CPx-type heavy metal ATPases and copper chaperones. HMA domain contains two cysteine residues that are important in binding and transfer of metal ions, such as copper, cadmium, cobalt and zinc. In the case of copper, stoichiometry of binding is one Cu+ ion per binding domain. Repeats of the HMA domain in copper chaperone has been associated with Menkes/Wilson disease due to binding of multiple copper ions." Q#13229 - CGI_10013657 superfamily 241686 73 151 4.65E-08 52.2229 cl00207 HMA superfamily - - "Heavy-metal-associated domain (HMA) is a conserved domain of approximately 30 amino acid residues found in a number of proteins that transport or detoxify heavy metals, for example, the CPx-type heavy metal ATPases and copper chaperones. HMA domain contains two cysteine residues that are important in binding and transfer of metal ions, such as copper, cadmium, cobalt and zinc. In the case of copper, stoichiometry of binding is one Cu+ ion per binding domain. 
Repeats of the HMA domain in copper chaperone has been associated with Menkes/Wilson disease due to binding of multiple copper ions." Q#13229 - CGI_10013657 superfamily 248469 1246 1372 0.0020597 38.8903 cl17915 HAD_like superfamily - - "Haloacid dehalogenase-like hydrolases. The haloacid dehalogenase-like (HAD) superfamily includes L-2-haloacid dehalogenase, epoxide hydrolase, phosphoserine phosphatase, phosphomannomutase, phosphoglycolate phosphatase, P-type ATPase, and many others, all of which use a nucleophilic aspartate in their phosphoryl transfer reaction. All members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. Members of this superfamily are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases." Q#13229 - CGI_10013657 superfamily 215733 838 1084 1.33E-60 208.958 cl02811 E1-E2_ATPase superfamily - - E1-E2 ATPase; E1-E2 ATPase. Q#13230 - CGI_10013658 superfamily 148061 61 240 2.71E-77 234.642 cl18026 FRG1 superfamily - - "FRG1-like family; The human FRG1 gene maps to human chromosome 4q35 and has been identified as a candidate for facioscapulohumeral muscular dystrophy. Currently, the function of FRG1 is unknown." Q#13233 - CGI_10013661 superfamily 241573 64 240 1.29E-45 163.272 cl00051 CysPc superfamily C - "Calpains, domains IIa, IIb; calcium-dependent cytoplasmic cysteine proteinases, papain-like. Functions in cytoskeletal remodeling processes, cell differentiation, apoptosis and signal transduction." Q#13234 - CGI_10013662 superfamily 241573 67 243 1.29E-47 169.82 cl00051 CysPc superfamily C - "Calpains, domains IIa, IIb; calcium-dependent cytoplasmic cysteine proteinases, papain-like. Functions in cytoskeletal remodeling processes, cell differentiation, apoptosis and signal transduction." 
Q#13236 - CGI_10013664 superfamily 246925 197 305 0.00132457 39.6462 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#13238 - CGI_10013666 superfamily 243034 266 347 0.000287262 39.6708 cl02429 TPR superfamily C - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" 
Q#13238 - CGI_10013666 superfamily 243158 335 370 1.54E-08 51.4056 cl02723 Sel1 superfamily - - Sel1 repeat; This short repeat is found in the Sel1 protein. It is related to TPR repeats. Q#13238 - CGI_10013666 superfamily 243158 373 407 0.000990117 37.5384 cl02723 Sel1 superfamily - - Sel1 repeat; This short repeat is found in the Sel1 protein. It is related to TPR repeats. Q#13239 - CGI_10013667 superfamily 247743 26 178 1.49E-07 49.0667 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#13239 - CGI_10013667 superfamily 203973 249 338 1.46E-20 84.4804 cl16006 Rep_fac_C superfamily - - "Replication factor C C-terminal domain; This is the C-terminal domain of RFC (replication factor-C) protein of the clamp loader complex which binds to the DNA sliding clamp (proliferating cell nuclear antigen, PCNA). The five modules of RFC assemble into a right-handed spiral, which results in only three of the five RFC subunits (RFC-A, RFC-B and RFC-C) making contact with PCNA, leaving a wedge-shaped gap between RFC-E and the PCNA clamp-loader complex. The C-terminal is vital for the correct orientation of RFC-E with respect to RFC-A." Q#13240 - CGI_10013668 superfamily 247068 457 553 5.17E-28 110.096 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#13240 - CGI_10013668 superfamily 247068 243 340 2.64E-25 102.392 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#13240 - CGI_10013668 superfamily 247068 562 657 8.27E-23 95.0729 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#13240 - CGI_10013668 superfamily 247068 356 449 1.93E-15 73.8869 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#13240 - CGI_10013668 superfamily 247068 133 235 9.05E-13 66.1829 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#13240 - CGI_10013668 superfamily 247068 679 756 9.85E-13 66.1829 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#13240 - CGI_10013668 superfamily 247068 24 124 7.32E-07 48.4638 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#13243 - CGI_10013671 superfamily 248012 564 695 1.26E-24 100.426 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. 
Q#13243 - CGI_10013671 superfamily 214507 450 503 6.44E-05 41.2616 cl15307 LRRCT superfamily - - Leucine rich repeat C-terminal domain; Leucine rich repeat C-terminal domain. Q#13245 - CGI_10005050 superfamily 224772 2 41 0.000368133 36.1565 cl15312 KptA superfamily N - "RNA:NAD 2'-phosphotransferase [Translation, ribosomal structure and biogenesis]" Q#13246 - CGI_10005051 superfamily 247725 633 748 6.83E-62 204.42 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#13246 - CGI_10005051 superfamily 243053 242 478 1.13E-58 199.786 cl02485 RasGEF superfamily - - "Guanine nucleotide exchange factor for Ras-like small GTPases. Small GTP-binding proteins of the Ras superfamily function as molecular switches in fundamental events such as signal transduction, cytoskeleton dynamics and intracellular trafficking. Guanine-nucleotide-exchange factors (GEFs) positively regulate these GTP-binding proteins in response to a variety of signals. GEFs catalyze the dissociation of GDP from the inactive GTP-binding proteins. GTP can then bind and induce structural changes that allow interaction with effectors." Q#13249 - CGI_10005054 superfamily 217293 318 502 6.10E-08 52.2499 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. 
Q#13250 - CGI_10005055 superfamily 241810 85 140 1.87E-07 46.739 cl00354 KOW superfamily - - "KOW: an acronym for the authors' surnames (Kyrpides, Ouzounis and Woese); KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. The KOW motif contains an invariant glycine residue and comprises alternating blocks of hydrophilic and hydrophobic residues." Q#13252 - CGI_10005057 superfamily 245814 692 768 2.10E-09 55.5224 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#13254 - CGI_10000710 superfamily 241888 1 182 1.66E-93 275.799 cl00473 BI-1-like superfamily N - "BAX inhibitor (BI)-1/YccA-like protein family; Mammalian members of the BAX inhibitor (BI)-1 like family of small transmembrane proteins have been shown to have an antiapoptotic effect either by stimulating the antiapoptotic function of Bcl-2, a well-characterized oncogene, or by inhibiting the proapoptotic effect of Bax, another member of the Bcl-2 family. Their broad tissue distribution and high degree of conservation suggests an important regulatory role. This superfamily also contains the lifeguard(LFG)-like proteins and other subfamilies which appear to be related by common descent and also function as inhibitors of apoptosis. 
In plants, BI-1 like proteins play a role in pathogen resistance. A prokaryotic member, Escherichia coli YccA, has been shown to interact with ATP-dependent protease FtsH, which degrades abnormal membrane proteins as part of a quality control mechanism to keep the integrity of biological membranes." Q#13255 - CGI_10000711 superfamily 218440 18 210 0.00409386 38.3641 cl14936 AF-4 superfamily NC - "AF-4 proto-oncoprotein; This family consists of AF4 (Proto-oncogene AF4) and FMR2 (Fragile X E mental retardation syndrome) nuclear proteins. These proteins have been linked to human diseases such as acute lymphoblastic leukaemia and mental retardation. The family also contains a Drosophila AF4 protein homologue Lilliputian which contains an AT-hook domain. Lilliputian represents a novel pair-rule gene that acts in cytoskeleton regulation, segmentation and morphogenesis in Drosophila." Q#13255 - CGI_10000711 superfamily 208802 216 275 0.00722572 37.0255 cl07974 DRE_TIM_metallolyase superfamily NC - "DRE-TIM metallolyase superfamily; The DRE-TIM metallolyase superfamily includes 2-isopropylmalate synthase (IPMS), alpha-isopropylmalate synthase (LeuA), 3-hydroxy-3-methylglutaryl-CoA lyase, homocitrate synthase, citramalate synthase, 4-hydroxy-2-oxovalerate aldolase, re-citrate synthase, transcarboxylase 5S, pyruvate carboxylase, AksA, and FrbC. These members all share a conserved triose-phosphate isomerase (TIM) barrel domain consisting of a core beta(8)-alpha(8) motif with the eight parallel beta strands forming an enclosed barrel surrounded by eight alpha helices. The domain has a catalytic center containing a divalent cation-binding site formed by a cluster of invariant residues that cap the core of the barrel. In addition, the catalytic site includes three invariant residues - an aspartate (D), an arginine (R), and a glutamate (E) - which is the basis for the domain name "DRE-TIM"." 
Q#13258 - CGI_10000633 superfamily 241645 6 92 1.82E-40 130.43 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#13259 - CGI_10000959 superfamily 241600 40 258 1.85E-86 259.095 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#13260 - CGI_10001106 superfamily 241811 5 55 4.45E-26 92.8257 cl00355 Ribosomal_S14 superfamily - - Ribosomal protein S14p/S29e; This family includes both ribosomal S14 from prokaryotes and S29 from eukaryotes. 
Q#13261 - CGI_10001107 superfamily 241739 158 413 2.62E-162 468.613 cl00268 class_II_aaRS-like_core superfamily - - "Class II tRNA amino-acyl synthetase-like catalytic core domain. Class II amino acyl-tRNA synthetases (aaRS) share a common fold and generally attach an amino acid to the 3' OH of ribose of the appropriate tRNA. PheRS is an exception in that it attaches the amino acid at the 2'-OH group, like class I aaRSs. These enzymes are usually homodimers. This domain is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. The substrate specificity of this reaction is further determined by additional domains. Interestingly, this domain is also found in asparagine synthase A (AsnA), in the accessory subunit of mitochondrial polymerase gamma and in the bacterial ATP phosphoribosyltransferase regulatory subunit HisZ." Q#13261 - CGI_10001107 superfamily 241738 419 641 3.02E-77 246.826 cl00266 HGTP_anticodon superfamily - - "HGTP anticodon binding domain, as found at the C-terminus of histidyl, glycyl, threonyl and prolyl tRNA synthetases, which are classified as a group of class II aminoacyl-tRNA synthetases (aaRS). In aaRSs, the anticodon binding domain is responsible for specificity in tRNA-binding, so that the activated amino acid is transferred to a ribose 3' OH group of the appropriate tRNA only. This domain is also found in the accessory subunit of mitochondrial polymerase gamma (Pol gamma b)." Q#13261 - CGI_10001107 superfamily 241805 63 112 1.42E-17 77.6608 cl00349 S15_NS1_EPRS_RNA-bind superfamily - - "S15/NS1/EPRS_RNA-binding domain. This short domain consists of a helix-turn-helix structure, which can bind to several types of RNA. It is found in the ribosomal protein S15, the influenza A viral nonstructural protein (NSA) and in several eukaryotic aminoacyl tRNA synthetases (aaRSs), where it occurs as a single or a repeated unit. 
It is involved in both protein-RNA interactions by binding tRNA and protein-protein interactions in the formation of tRNA-synthetases into multienzyme complexes. While this domain lacks significant sequence similarity between the subgroups in which it is found, they share similar electrostatic surface potentials and thus are likely to bind to RNA via the same mechanism." Q#13261 - CGI_10001107 superfamily 241805 17 38 0.000193971 39.9112 cl00349 S15_NS1_EPRS_RNA-bind superfamily N - "S15/NS1/EPRS_RNA-binding domain. This short domain consists of a helix-turn-helix structure, which can bind to several types of RNA. It is found in the ribosomal protein S15, the influenza A viral nonstructural protein (NSA) and in several eukaryotic aminoacyl tRNA synthetases (aaRSs), where it occurs as a single or a repeated unit. It is involved in both protein-RNA interactions by binding tRNA and protein-protein interactions in the formation of tRNA-synthetases into multienzyme complexes. While this domain lacks significant sequence similarity between the subgroups in which it is found, they share similar electrostatic surface potentials and thus are likely to bind to RNA via the same mechanism." Q#13261 - CGI_10001107 superfamily 241805 137 164 0.000788549 38.0622 cl00349 S15_NS1_EPRS_RNA-bind superfamily C - "S15/NS1/EPRS_RNA-binding domain. This short domain consists of a helix-turn-helix structure, which can bind to several types of RNA. It is found in the ribosomal protein S15, the influenza A viral nonstructural protein (NSA) and in several eukaryotic aminoacyl tRNA synthetases (aaRSs), where it occurs as a single or a repeated unit. It is involved in both protein-RNA interactions by binding tRNA and protein-protein interactions in the formation of tRNA-synthetases into multienzyme complexes. 
While this domain lacks significant sequence similarity between the subgroups in which it is found, they share similar electrostatic surface potentials and thus are likely to bind to RNA via the same mechanism." Q#13262 - CGI_10001720 superfamily 241623 123 365 1.01E-137 397.46 cl00119 PI3Kc_like superfamily - - "Phosphoinositide 3-kinase (PI3K)-like family, catalytic domain; The PI3K-like catalytic domain family is part of a larger superfamily that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and RIO kinases. Members of the family include PI3K, phosphoinositide 4-kinase (PI4K), PI3K-related protein kinases (PIKKs), and TRansformation/tRanscription domain-Associated Protein (TRRAP). PI3Ks catalyze the transfer of the gamma-phosphoryl group from ATP to the 3-hydroxyl of the inositol ring of D-myo-phosphatidylinositol (PtdIns) or its derivatives, while PI4K catalyze the phosphorylation of the 4-hydroxyl of PtdIns. PIKKs are protein kinases that catalyze the phosphorylation of serine/threonine residues, especially those that are followed by a glutamine. PI3Ks play an important role in a variety of fundamental cellular processes, including cell motility, the Ras pathway, vesicle trafficking and secretion, immune cell activation and apoptosis. PI4Ks produce PtdIns(4)P, the major precursor to important signaling phosphoinositides. PIKKs have diverse functions including cell-cycle checkpoints, genome surveillance, mRNA surveillance, and translation control." Q#13262 - CGI_10001720 superfamily 202180 408 439 1.68E-08 50.54 cl03505 FATC superfamily - - "FATC domain; The FATC domain is named after FRAP, ATM, TRRAP C-terminal. The solution structure of the FATC domain suggests it plays a role in redox-dependent structural and cellular stability." 
Q#13264 - CGI_10001816 superfamily 247684 24 87 3.19E-15 69.9987 cl17037 NBD_sugar-kinase_HSP70_actin superfamily NC - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#13265 - CGI_10001817 superfamily 110440 310 335 0.000487165 37.3873 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. 
Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#13270 - CGI_10003138 superfamily 221744 25 236 2.74E-14 72.0835 cl18614 CABIT superfamily - - "Cell-cycle sustaining, positive selection,; The 'CABIT' domain (for 'cysteine-containing, all-beta in Themis') is found in a newly identified gene family that has three mammalian homologues (Themis, Icb1 and 9130404H23Rik) that encode proteins with two CABIT domains and a highly conserved proline-rich region. In contrast, Fam59A, Fam59B and related proteins from mammals to cnidarians, including the insect Serrano proteins, have a single copy of the CABIT domain, a proline-rich region and often a C-terminal SAM (sterile-motif) domain. Multiple-sequence alignment has predicted that the CABIT domain adopts an all-strand structure with at least 12 strands, ie a dyad of six-stranded beta-barrel units. The CABIT domain contains a nearly absolutely conserved cysteine residue which is likely to be central to its function. CABIT domain proteins function downstream of tyrosine kinase signalling and interact with GRB2." Q#13270 - CGI_10003138 superfamily 247057 589 655 0.00524917 35.5896 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. 
Mutations in SAM domains have been linked to several diseases." Q#13271 - CGI_10003139 superfamily 245202 13 88 1.28E-29 107.67 cl09927 S1_like superfamily - - "S1_like: Ribosomal protein S1-like RNA-binding domain. Found in a wide variety of RNA-associated proteins. Originally identified in S1 ribosomal protein. This superfamily also contains the Cold Shock Domain (CSD), which is a homolog of the S1 domain. Both domains are members of the Oligonucleotide/oligosaccharide Binding (OB) fold." Q#13273 - CGI_10003141 superfamily 243689 35 97 2.07E-08 52.2457 cl04271 IBN_N superfamily - - Importin-beta N-terminal domain; Importin-beta N-terminal domain. Q#13273 - CGI_10003141 superfamily 219817 106 225 1.71E-06 47.2277 cl07129 Xpo1 superfamily - - "Exportin 1-like protein; The sequences featured in this family are similar to a region close to the N-terminus of yeast exportin 1 (Xpo1, Crm1). This region is found just C-terminal to an importin-beta N-terminal domain (pfam03810) in many members of this family. Exportin 1 is a nuclear export receptor that interacts with leucine-rich nuclear export signal (NES) sequences, and Ran-GTP, and is involved in translocation of proteins out of the nucleus." Q#13275 - CGI_10003143 superfamily 245210 13 260 6.89E-110 329.994 cl09938 cond_enzymes superfamily N - "Condensing enzymes; Family of enzymes that catalyze a (decarboxylating or non-decarboxylating) Claisen-like condensation reaction. Members share strong structural similarity, and are involved in the synthesis and degradation of fatty acids, and the production of polyketides, a diverse group of natural products." Q#13275 - CGI_10003143 superfamily 242376 295 387 2.62E-23 93.0586 cl01225 SCP2 superfamily - - "SCP-2 sterol transfer family; This domain is involved in binding sterols. It is found in the SCP2 protein, as well as the C terminus of the enzyme estradiol 17 beta-dehydrogenase EC:1.1.1.62. The UNC-24 protein contains an SPFH domain pfam01145." 
Q#13276 - CGI_10002148 superfamily 241600 115 265 1.37E-48 162.025 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#13276 - CGI_10002148 superfamily 241619 38 84 0.00016845 38.7173 cl00112 PAN_APPLE superfamily C - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#13277 - CGI_10002149 superfamily 241600 126 338 3.11E-83 253.702 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. 
C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#13277 - CGI_10002149 superfamily 241619 34 80 9.01E-05 39.8729 cl00112 PAN_APPLE superfamily C - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#13281 - CGI_10002034 superfamily 241832 127 230 1.17E-54 173.126 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." 
Q#13281 - CGI_10002034 superfamily 241832 32 86 3.35E-11 57.7316 cl00388 Thioredoxin_like superfamily N - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#13282 - CGI_10007588 superfamily 217293 81 228 1.57E-20 88.4587 cl03788 Neur_chan_LBD superfamily N - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#13282 - CGI_10007588 superfamily 202474 235 332 6.20E-13 66.5233 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#13283 - CGI_10007590 superfamily 241748 314 525 1.30E-25 105.721 cl00279 APP_MetAP superfamily - - "A family including aminopeptidase P, aminopeptidase M, and prolidase. Also known as metallopeptidase family M24. This family of enzymes is able to cleave amido-, imido- and amidino-containing bonds. 
Members exhibit relatively narrow substrate specificity compared to other metallo-aminopeptidases, suggesting they play roles in regulation of biological processes rather than general protein degradation." Q#13283 - CGI_10007590 superfamily 216431 117 253 2.62E-06 46.1221 cl08317 Creatinase_N superfamily - - Creatinase/Prolidase N-terminal domain; This family includes the N-terminal non-catalytic domains from creatinase and prolidase. The exact function of this domain is uncertain. Q#13286 - CGI_10007593 superfamily 241868 132 298 8.41E-74 227.388 cl00447 Nudix_Hydrolase superfamily - - "Nudix hydrolase is a superfamily of enzymes found in all three kingdoms of life, and it catalyzes the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+ for their activity. Members of this family are recognized by a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which forms a structural motif that functions as a metal binding and catalytic site. Substrates of nudix hydrolase include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance and "house-cleaning" enzymes. Substrate specificity is used to define child families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. 
This superfamily consists of at least nine families: IPP (isopentenyl diphosphate) isomerase, ADP ribose pyrophosphatase, mutT pyrophosphohydrolase, coenzyme-A pyrophosphatase, MTH1-7,8-dihydro-8-oxoguanine-triphosphatase, diadenosine tetraphosphate hydrolase, NADH pyrophosphatase, GDP-mannose hydrolase and the c-terminal portion of the mutY adenine glycosylase." Q#13287 - CGI_10007594 superfamily 220647 33 151 4.39E-11 56.5672 cl18565 L_HGMIC_fpl superfamily C - "Lipoma HMGIC fusion partner-like protein; This is a group of proteins expressed from a series of genes referred to as Lipoma HGMIC fusion partner-like. The proteins carry four highly conserved transmembrane domains in this entry. In certain instances, eg in LHFPL5, mutations cause deafness in humans and hypospadias, and LHFPL1 is transcribed in six liver tumour cell lines." Q#13289 - CGI_10007596 superfamily 220647 8 177 2.66E-20 83.9163 cl18565 L_HGMIC_fpl superfamily - - "Lipoma HMGIC fusion partner-like protein; This is a group of proteins expressed from a series of genes referred to as Lipoma HGMIC fusion partner-like. The proteins carry four highly conserved transmembrane domains in this entry. In certain instances, eg in LHFPL5, mutations cause deafness in humans and hypospadias, and LHFPL1 is transcribed in six liver tumour cell lines." Q#13295 - CGI_10018345 superfamily 247856 51 111 3.91E-14 70.3263 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#13297 - CGI_10018347 superfamily 247723 129 206 1.11E-43 144.69 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#13298 - CGI_10018348 superfamily 238191 25 238 5.95E-58 192.932 cl18907 Esterase_lipase superfamily C - "Esterases and lipases (includes fungal lipases, cholinesterases, etc.) These enzymes act on carboxylic esters (EC: 3.1.1.-). The catalytic apparatus involves three residues (catalytic triad): a serine, a glutamate or aspartate and a histidine.These catalytic residues are responsible for the nucleophilic attack on the carbonyl carbon atom of the ester bond. In contrast with other alpha/beta hydrolase fold family members, p-nitrobenzyl esterase and acetylcholine esterase have a Glu instead of Asp at the active site carboxylate." Q#13299 - CGI_10018349 superfamily 238191 21 256 9.33E-14 70.824 cl18907 Esterase_lipase superfamily N - "Esterases and lipases (includes fungal lipases, cholinesterases, etc.) These enzymes act on carboxylic esters (EC: 3.1.1.-). 
The catalytic apparatus involves three residues (catalytic triad): a serine, a glutamate or aspartate and a histidine.These catalytic residues are responsible for the nucleophilic attack on the carbonyl carbon atom of the ester bond. In contrast with other alpha/beta hydrolase fold family members, p-nitrobenzyl esterase and acetylcholine esterase have a Glu instead of Asp at the active site carboxylate." Q#13300 - CGI_10018350 superfamily 238191 3 80 9.80E-07 45.0156 cl18907 Esterase_lipase superfamily N - "Esterases and lipases (includes fungal lipases, cholinesterases, etc.) These enzymes act on carboxylic esters (EC: 3.1.1.-). The catalytic apparatus involves three residues (catalytic triad): a serine, a glutamate or aspartate and a histidine.These catalytic residues are responsible for the nucleophilic attack on the carbonyl carbon atom of the ester bond. In contrast with other alpha/beta hydrolase fold family members, p-nitrobenzyl esterase and acetylcholine esterase have a Glu instead of Asp at the active site carboxylate." Q#13302 - CGI_10018352 superfamily 238191 31 415 3.94E-81 261.113 cl18907 Esterase_lipase superfamily C - "Esterases and lipases (includes fungal lipases, cholinesterases, etc.) These enzymes act on carboxylic esters (EC: 3.1.1.-). The catalytic apparatus involves three residues (catalytic triad): a serine, a glutamate or aspartate and a histidine.These catalytic residues are responsible for the nucleophilic attack on the carbonyl carbon atom of the ester bond. In contrast with other alpha/beta hydrolase fold family members, p-nitrobenzyl esterase and acetylcholine esterase have a Glu instead of Asp at the active site carboxylate." 
Q#13303 - CGI_10018353 superfamily 247743 303 449 1.50E-07 51.8168 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#13303 - CGI_10018353 superfamily 243092 1069 1283 0.00233764 40.396 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#13305 - CGI_10018355 superfamily 248097 37 156 3.22E-31 110.433 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#13306 - CGI_10018356 superfamily 248097 5 69 1.26E-08 46.8746 cl17543 C1q superfamily N - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#13307 - CGI_10018357 superfamily 241613 37 72 0.000297508 38.3418 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#13307 - CGI_10018357 superfamily 214531 316 344 0.000572198 37.5813 cl18310 LY superfamily C - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#13309 - CGI_10018359 superfamily 247727 122 228 0.000110031 39.3355 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). 
There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#13310 - CGI_10018360 superfamily 202224 46 154 3.15E-19 81.1879 cl18224 JmjC superfamily - - "JmjC domain, hydroxylase; The JmjC domain belongs to the Cupin superfamily. JmjC-domain proteins may be protein hydroxylases that catalyze a novel histone modification. This is confirmed to be a hydroxylase: the human JmjC protein named Tyw5p unexpectedly acts in the biosynthesis of a hypermodified nucleoside, hydroxy-wybutosine, in tRNA-Phe by catalyzing hydroxylation." Q#13311 - CGI_10018361 superfamily 222437 612 763 3.34E-78 253.406 cl16458 Rab3-GTPase_cat superfamily - - "Rab3 GTPase-activating protein catalytic subunit; This family is the probable catalytic subunit of the GTPase activating protein that has specificity for Rab3 subfamily (RAB3A, RAB3B, RAB3C and RAB3D). It is likely to convert active Rab3-GTP to the inactive form Rab3-GDP. Rab3 proteins are involved in regulated exocytosis of neurotransmitters and hormones. The Rab3 GTPase-activating complex is a heterodimer composed of RAB3GAP and RAB3-GAP150. This complex interacts with DMXL2." Q#13312 - CGI_10018362 superfamily 247038 834 899 6.48E-09 56.3089 cl15674 IPT superfamily C - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. 
In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." Q#13312 - CGI_10018362 superfamily 247038 322 375 9.79E-08 52.8421 cl15674 IPT superfamily C - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." Q#13312 - CGI_10018362 superfamily 247038 56 124 4.73E-07 50.9161 cl15674 IPT superfamily C - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." Q#13312 - CGI_10018362 superfamily 247038 649 741 3.57E-06 48.2197 cl15674 IPT superfamily - - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. 
They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." Q#13312 - CGI_10018362 superfamily 247038 996 1079 7.84E-06 47.0641 cl15674 IPT superfamily - - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." Q#13312 - CGI_10018362 superfamily 220608 1177 1294 3.75E-34 130.889 cl10859 G8 superfamily - - G8 domain; This domain is found in disease proteins PKHD1 and KIAA1199 and is named G8 after its 8 conserved glycines. It is predicted to contain 10 beta strands and an alpha helix. Q#13312 - CGI_10018362 superfamily 220608 1963 2093 1.31E-13 71.183 cl10859 G8 superfamily - - G8 domain; This domain is found in disease proteins PKHD1 and KIAA1199 and is named G8 after its 8 conserved glycines. It is predicted to contain 10 beta strands and an alpha helix. Q#13312 - CGI_10018362 superfamily 247038 916 993 3.71E-11 62.8452 cl15674 IPT superfamily - - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. 
They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." Q#13312 - CGI_10018362 superfamily 247038 147 223 0.000296535 42.0444 cl15674 IPT superfamily - - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." Q#13312 - CGI_10018362 superfamily 247038 747 821 0.000925326 40.5036 cl15674 IPT superfamily - - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." 
Q#13313 - CGI_10018363 superfamily 243035 32 156 2.79E-15 68.0301 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#13314 - CGI_10018364 superfamily 243035 145 269 1.22E-15 70.3413 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. 
CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#13314 - CGI_10018364 superfamily 243035 29 126 4.55E-10 55.3185 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. 
Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. 
A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#13315 - CGI_10018365 superfamily 247038 218 301 4.09E-10 58.2349 cl15674 IPT superfamily C - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." Q#13315 - CGI_10018365 superfamily 247038 948 998 3.37E-06 46.2937 cl15674 IPT superfamily C - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." Q#13316 - CGI_10018366 superfamily 242856 79 152 6.90E-38 127.111 cl02050 Ribosomal_S25 superfamily N - S25 ribosomal protein; S25 ribosomal protein. 
Q#13317 - CGI_10018367 superfamily 241607 380 410 2.81E-06 44.183 cl00097 KAZAL_FS superfamily C - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chyomotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#13317 - CGI_10018367 superfamily 241607 313 353 7.78E-05 39.9458 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chyomotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. 
The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#13317 - CGI_10018367 superfamily 242318 91 212 1.43E-20 88.4106 cl01126 EI24 superfamily N - "Etoposide-induced protein 2.4 (EI24); This family contains a number of eukaryotic etoposide-induced 2.4 (EI24) proteins approximately 350 residues long as well as bacterial CysZ proteins (formerly known as DUF540). In cells treated with the cytotoxic drug etoposide, EI24 is induced by p53. It has been suggested to play an important role in negative cell growth control." Q#13319 - CGI_10018369 superfamily 243035 144 224 1.38E-06 44.533 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. 
This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. 
In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#13320 - CGI_10003317 superfamily 243179 39 117 4.78E-07 45.4165 cl02781 tetraspanin_LEL superfamily C - "Tetraspanin, extracellular domain or large extracellular loop (LEL). Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. The tetraspanin family contains CD9, CD63, CD37, CD53, CD82, CD151, and CD81, amongst others. Tetraspanins are involved in diverse processes such as cell activation and proliferation, adhesion and motility, differentiation, cancer, and others. Their various functions may relate to their ability to act as molecular facilitators, grouping specific cell-surface proteins and affecting formation and stability of signaling complexes. Tetraspanins associate laterally with one another and cluster dynamically with numerous parnter domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web", which may also include integrins." Q#13321 - CGI_10003318 superfamily 248469 407 522 4.56E-13 68.1655 cl17915 HAD_like superfamily - - "Haloacid dehalogenase-like hydrolases. The haloacid dehalogenase-like (HAD) superfamily includes L-2-haloacid dehalogenase, epoxide hydrolase, phosphoserine phosphatase, phosphomannomutase, phosphoglycolate phosphatase, P-type ATPase, and many others, all of which use a nucleophilic aspartate in their phosphoryl transfer reaction. 
All members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. Members of this superfamily are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases." Q#13321 - CGI_10003318 superfamily 248469 154 242 3.46E-11 62.3875 cl17915 HAD_like superfamily N - "Haloacid dehalogenase-like hydrolases. The haloacid dehalogenase-like (HAD) superfamily includes L-2-haloacid dehalogenase, epoxide hydrolase, phosphoserine phosphatase, phosphomannomutase, phosphoglycolate phosphatase, P-type ATPase, and many others, all of which use a nucleophilic aspartate in their phosphoryl transfer reaction. All members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. Members of this superfamily are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases." Q#13321 - CGI_10003318 superfamily 218493 1000 1138 1.50E-43 156.363 cl08434 GMC_oxred_C superfamily - - GMC oxidoreductase; This domain found associated with pfam00732. Q#13321 - CGI_10003318 superfamily 248054 618 648 0.00194061 37.8368 cl17500 NAD_binding_8 superfamily C - NAD(P)-binding Rossmann-like domain; NAD(P)-binding Rossmann-like domain. Q#13322 - CGI_10003319 superfamily 247740 71 393 1.67E-165 470.828 cl17186 TIM_phosphate_binding superfamily - - "TIM barrel proteins share a structurally conserved phosphate binding motif and in general share an eight beta/alpha closed barrel structure. Specific for this family is the conserved phosphate binding site at the edges of strands 7 and 8. The phosphate comes either from the substrate, as in the case of inosine monophosphate dehydrogenase (IMPDH), or from ribulose-5-phosphate 3-epimerase (RPE) or from cofactors, like FMN." 
Q#13325 - CGI_10003322 superfamily 216653 146 261 1.18E-09 56.0663 cl08331 Na_Ca_ex superfamily - - "Sodium/calcium exchanger protein; This is a family of sodium/calcium exchanger integral membrane proteins. This family covers the integral membrane regions of the proteins. Sodium/calcium exchangers regulate intracellular Ca2+ concentrations in many cells; cardiac myocytes, epithelial cells, neurons retinal rod photoreceptors and smooth muscle cells. Ca2+ is moved into or out of the cytosol depending on Na+ concentration. In humans and rats there are 3 isoforms; NCX1 NCX2 and NCX3." Q#13326 - CGI_10002975 superfamily 244859 2 190 3.57E-12 62.9493 cl08171 HtrL_YibB superfamily - - "Bacterial protein of unknown function (HtrL_YibB); The protein from this rare, uncharacterized protein family is designated HtrL or YibB in E. coli, where its gene is found in a region of LPS core biosynthesis genes. Homologues are found in Shigella flexneri, Campylobacter jejuni, and Caenorhabditis elegans only. The htrL gene may represent an insertion to the LPS core biosynthesis region, rather than an LPS biosynthetic protein." 
Q#13331 - CGI_10027269 superfamily 247792 332 375 2.51E-10 55.5296 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#13333 - CGI_10027271 superfamily 241600 64 274 1.39E-85 257.554 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#13335 - CGI_10027273 superfamily 213147 46 173 2.57E-57 196.367 cl17040 ADDz superfamily - - "ADDz for ATRX, Dnmt3 and Dnmt3l PHD-like zinc finger domain; The ADDz zinc finger domain is present in the chromatin-associated proteins cytosine-5-methyltransferase 3 (Dnmt3) and ATRX, a SNF2 type transcription factor protein. The Dnmt3 family includes two active DNA methyltransferases, Dnmt3a and -3b, and one regulatory factor Dnmt3l. 
DNA methylation is an important epigenetic mechanism involved in diverse biological processes such as embryonic development, gene expression, and genomic imprinting. The ADDz domain is a PHD-like zinc finger motif that contains two parts, a C2-C2 and a PHD-like zinc finger. PHD zinc finger domains have been identified in more than 40 proteins that are mainly involved in chromatin mediated transcriptional control; the classical PHD zinc finger has a C4-H-C3 motif that spans about 50-80 amino acids. In ADDz, the conserved histidine residue of the PHD finger is replaced by a cysteine, and an additional zinc finger C2-C2 like motif is located about twenty residues upstream of the C4-C-C3 motif." Q#13335 - CGI_10027273 superfamily 247905 1412 1595 8.03E-17 79.9744 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#13335 - CGI_10027273 superfamily 247805 998 1159 8.55E-14 71.2144 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 
Q#13338 - CGI_10027276 superfamily 247684 13 427 1.05E-87 279.933 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#13339 - CGI_10027277 superfamily 217661 20 142 2.05E-32 113.134 cl18422 Pam16 superfamily - - "Pam16; The Pam16 protein is the fifth essential subunit of the pre-sequence translocase-associated protein import motor (PAM). In Saccharomyces cerevisiae, Pam16 is required for preprotein translocation into the matrix, but not for protein insertion into the inner membrane. Pam16 has a degenerate J domain. 
J-domain proteins play important regulatory roles as co-chaperones, recruiting Hsp70 partners and accelerating the ATP-hydrolysis step of the chaperone cycle. Pam16's J-like domain strongly interacts with Pam18's J domain, leading to a productive interaction of Pam18 with mtHsp70 at the mitochondria import channel. Pam18 stimulates the ATPase activity of mtHsp70." Q#13340 - CGI_10027278 superfamily 246925 14 98 0.001289 38.8758 cl15309 LRR_RI superfamily NC - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#13341 - CGI_10027279 superfamily 192997 292 427 1.06E-28 113.445 cl18184 Sterol-sensing superfamily - - "Sterol-sensing domain of SREBP cleavage-activation; Sterol regulatory element-binding proteins (SREBPs) are membrane-bound transcription factors that promote lipid synthesis in animal cells. They are embedded in the membranes of the endoplasmic reticulum (ER) in a helical hairpin orientation and are released from the ER by a two-step proteolytic process. Proteolysis begins when the SREBPs are cleaved at Site-1, which is located at a leucine residue in the middle of the hydrophobic loop in the lumen of the ER. Upon proteolytic processing SREBP can activate the expression of genes involved in cholesterol biosynthesis and uptake. SCAP stimulates cleavage of SREBPs via fusion of the their two C-termini. This domain is the transmembrane region that traverses the membrane eight times and is the sterol-sensing domain of the cleavage protein. WD40 domains are found towards the C-terminus." 
Q#13342 - CGI_10027280 superfamily 243072 47 182 1.94E-24 98.6098 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#13342 - CGI_10027280 superfamily 241596 280 330 8.34E-07 46.4383 cl00081 HLH superfamily - - "Helix-loop-helix domain, found in specific DNA- binding proteins that act as transcription factors; 60-100 amino acids long. A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE 5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved in both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#13342 - CGI_10027280 superfamily 243123 348 389 2.07E-07 47.9381 cl02638 Hairy_orange superfamily - - "Hairy Orange; The Orange domain is found in the Drosophila proteins Hesr-1, Hairy, and Enhancer of Split. 
The Orange domain is proposed to mediate specific protein-protein interaction between Hairy and Scute." Q#13343 - CGI_10027281 superfamily 243093 19 99 3.05E-05 43.673 cl02568 WSC superfamily - - WSC domain; This domain may be involved in carbohydrate binding. Q#13344 - CGI_10027282 superfamily 247683 366 413 2.83E-16 75.1918 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#13344 - CGI_10027282 superfamily 247683 107 156 4.03E-16 74.8066 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. 
SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#13344 - CGI_10027282 superfamily 247683 1071 1121 5.36E-08 51.3095 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#13344 - CGI_10027282 superfamily 247683 265 321 7.73E-12 62.5557 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. 
They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#13345 - CGI_10027283 superfamily 222370 28 109 1.18E-28 104.912 cl16386 Longin superfamily - - "Regulated-SNARE-like domain; Longin is one of the approximately 26 components required for transporting proteins from the ER to the plasma membrane, via the Golgi apparatus. It is necessary for the steps of the transfer from the ER to the Golgi complex. Longins are the only R-SNAREs that are common to all eukaryotes, and they are characterized by a conserved N-terminal domain with a profilin-like fold called a longin domain." Q#13345 - CGI_10027283 superfamily 201526 124 180 2.49E-22 87.9788 cl09522 Synaptobrevin superfamily C - Synaptobrevin; Synaptobrevin. Q#13346 - CGI_10027284 superfamily 246597 55 129 1.06E-29 110.781 cl13995 MPP_superfamily superfamily C - "metallophosphatase superfamily, metallophosphatase domain; Metallophosphatases (MPPs), also known as metallophosphoesterases, phosphodiesterases (PDEs), binuclear metallophosphoesterases, and dimetal-containing phosphoesterases (DMPs), represent a diverse superfamily of enzymes with a conserved domain containing an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. 
This superfamily includes: the phosphoprotein phosphatases (PPPs), Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination." Q#13346 - CGI_10027284 superfamily 246597 203 271 1.12E-26 102.306 cl13995 MPP_superfamily superfamily N - "metallophosphatase superfamily, metallophosphatase domain; Metallophosphatases (MPPs), also known as metallophosphoesterases, phosphodiesterases (PDEs), binuclear metallophosphoesterases, and dimetal-containing phosphoesterases (DMPs), represent a diverse superfamily of enzymes with a conserved domain containing an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. This superfamily includes: the phosphoprotein phosphatases (PPPs), Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination." Q#13347 - CGI_10027285 superfamily 220736 407 543 2.38E-27 107.01 cl11068 PTEN_C2 superfamily - - "C2 domain of PTEN tumour-suppressor protein; This is the C2 domain-like domain, in greek key form, of the PTEN protein, phosphatidyl-inositol triphosphate phosphatase, and it is the C-terminus. This domain may well include a CBR3 loop which means it plays a central role in membrane binding. 
This domain associates across an extensive interface with the N-terminal phosphatase domain DSPc (pfam00782) suggesting that the C2 domain productively positions the catalytic part of the protein onto the membrane." Q#13347 - CGI_10027285 superfamily 241574 302 400 5.92E-09 54.3797 cl00053 PTPc superfamily N - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#13351 - CGI_10027289 superfamily 245226 146 313 7.28E-18 80.0372 cl10012 DnaQ_like_exo superfamily - - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. 
DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#13353 - CGI_10027291 superfamily 245864 52 319 7.37E-37 137.41 cl12078 p450 superfamily C - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#13354 - CGI_10027292 superfamily 248264 55 153 4.20E-10 55.7062 cl17710 DDE_4 superfamily N - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. 
This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#13357 - CGI_10027295 superfamily 215869 115 480 7.61E-40 147.274 cl10564 SecY superfamily - - SecY translocase; SecY translocase. Q#13357 - CGI_10027295 superfamily 204513 78 112 3.48E-07 47.0327 cl11188 Plug_translocon superfamily - - Plug domain of Sec61p; The Sec61/SecY translocon mediates translocation of proteins across the membrane and integration of membrane proteins into the lipid bilayer. The structure of the translocon revealed a plug domain blocking the pore on the lumenal side. The plug is unlikely to be important for sealing the translocation pore in yeast but it plays a role in stabilising Sec61p during translocon formation. The domain runs from residues 52-74. Q#13358 - CGI_10027296 superfamily 241640 215 453 5.57E-88 270.687 cl00149 Tryp_SPc superfamily - - Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues. Q#13359 - CGI_10027297 superfamily 241640 23 255 6.45E-88 263.368 cl00149 Tryp_SPc superfamily - - Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues. Q#13360 - CGI_10027298 superfamily 245323 2756 3037 9.15E-145 455.164 cl10511 Beach superfamily - - "BEACH (Beige and Chediak-Higashi) domains, implicated in membrane trafficking, are present in a family of proteins conserved throughout eukaryotes. 
This group contains human lysosomal trafficking regulator (LYST), LPS-responsive and beige-like anchor (LRBA) and neurobeachin. Disruption of LYST leads to Chediak-Higashi syndrome, characterized by severe immunodeficiency, albinism, poor blood coagulation and neurologic problems. Neurobeachin is a candidate gene linked to autism. LRBA seems to be upregulated in several cancer types. It has been shown that the BEACH domain itself is important for the function of these proteins." Q#13360 - CGI_10027298 superfamily 247725 2595 2719 6.28E-34 130.108 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting proteins to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and other proteins." 
Q#13360 - CGI_10027298 superfamily 248318 3572 3625 7.26E-22 93.2693 cl17764 FYVE superfamily - - "FYVE domain; Zinc-binding domain; targets proteins to membrane lipids via interaction with phosphatidylinositol-3-phosphate, PI3P; present in Fab1, YOTB, Vac1, and EEA1;" Q#13360 - CGI_10027298 superfamily 243092 3106 3298 6.96E-20 92.7832 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#13361 - CGI_10027299 superfamily 246680 551 633 0.000258885 40.0114 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. 
Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#13362 - CGI_10027300 superfamily 243352 3 279 7.26E-109 319.154 cl03224 Porin3 superfamily - - "Eukaryotic porin family that forms channels in the mitochondrial outer membrane; The porin family 3 contains two sub-families that play vital roles in the mitochondrial outer membrane, a translocase for unfolded pre-proteins (Tom40) and the voltage-dependent anion channel (VDAC) that regulates the flux of mostly anionic metabolites through the outer mitochondrial membrane." Q#13363 - CGI_10027301 superfamily 206528 197 270 1.64E-26 100.326 cl18290 PAP2_C superfamily - - PAP2 superfamily C-terminal; This family is closely related to the C-terminal a region of PAP2. Q#13367 - CGI_10027305 superfamily 243179 125 261 1.09E-18 78.9289 cl02781 tetraspanin_LEL superfamily - - "Tetraspanin, extracellular domain or large extracellular loop (LEL). Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. The tetraspanin family contains CD9, CD63, CD37, CD53, CD82, CD151, and CD81, amongst others. Tetraspanins are involved in diverse processes such as cell activation and proliferation, adhesion and motility, differentiation, cancer, and others. Their various functions may relate to their ability to act as molecular facilitators, grouping specific cell-surface proteins and affecting formation and stability of signaling complexes. 
Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web", which may also include integrins." Q#13369 - CGI_10027307 superfamily 243306 8 205 6.66E-94 275.597 cl03114 RNase_PH superfamily - - "RNase PH-like 3'-5' exoribonucleases; RNase PH-like 3'-5' exoribonucleases are enzymes that catalyze the 3' to 5' processing and decay of RNA substrates. Evolutionarily related members can be found in prokaryotes, archaea, and eukaryotes. Bacterial ribonuclease PH contains a single copy of this domain, and removes nucleotide residues following the -CCA terminus of tRNA. Polyribonucleotide nucleotidyltransferase (PNPase) contains two tandem copies of the domain and is involved in mRNA degradation in a 3'-5' direction. Archaeal exosomes contain two individually encoded RNase PH-like 3'-5' exoribonucleases and are required for 3' processing of the 5.8S rRNA. The eukaryotic exosome core is composed of six individually encoded RNase PH-like subunits, but it is not a phosphorolytic enzyme per se; it directly associates with Rrp44 and Rrp6, which are hydrolytic exoribonucleases related to bacterial RNase II/R and RNase D. All members of the RNase PH-like family form ring structures by oligomerization of six domains or subunits, except for a total of 3 subunits with tandem repeats in the case of PNPase, with a central channel through which the RNA substrate must pass to gain access to the phosphorolytic active sites." 
Q#13370 - CGI_10027308 superfamily 245835 10 252 4.94E-85 274.598 cl12013 BAR superfamily - - "The Bin/Amphiphysin/Rvs (BAR) domain, a dimerization module that binds membranes and detects membrane curvature; BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions including organelle biogenesis, membrane trafficking or remodeling, and cell division and migration. Mutations in BAR containing proteins have been linked to diseases and their inactivation in cells leads to altered membrane dynamics. A BAR domain with an additional N-terminal amphipathic helix (an N-BAR) can drive membrane curvature. These N-BAR domains are found in amphiphysins and endophilins, among others. BAR domains are also frequently found alongside domains that determine lipid specificity, such as the Pleckstrin Homology (PH) and Phox Homology (PX) domains which are present in beta centaurins (ACAPs and ASAPs) and sorting nexins, respectively. A FES-CIP4 Homology (FCH) domain together with a coiled coil region is called the F-BAR domain and is present in Pombe/Cdc15 homology (PCH) family proteins, which include Fes/Fes tyrosine kinases, PACSIN or syndapin, CIP4-like proteins, and srGAPs, among others. The Inverse (I)-BAR or IRSp53/MIM homology Domain (IMD) is found in multi-domain proteins, such as IRSp53 and MIM, that act as scaffolding proteins and transducers of a variety of signaling pathways that link membrane dynamics and the underlying actin cytoskeleton. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The I-BAR domain induces membrane protrusions in the opposite direction compared to classical BAR and F-BAR domains, which produce membrane invaginations. 
BAR domains that also serve as protein interaction domains include those of arfaptin and OPHN1-like proteins, among others, which bind to Rac and Rho GAP domains, respectively." Q#13370 - CGI_10027308 superfamily 243095 242 437 8.15E-84 269.712 cl02570 RhoGAP superfamily - - "RhoGAP: GTPase-activator protein (GAP) for Rho-like GTPases; GAPs towards Rho/Rac/Cdc42-like small GTPases. Small GTPases (G proteins) cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when bound to GDP. The Rho family of small G proteins, which includes Cdc42Hs, activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. G proteins generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. The RhoGAPs are one of the major classes of regulators of Rho G proteins." Q#13371 - CGI_10027309 superfamily 242916 28 179 1.60E-45 153.974 cl02166 RRS1 superfamily - - Ribosome biogenesis regulatory protein (RRS1); This family consists of several eukaryotic ribosome biogenesis regulatory (RRS1) proteins. RRS1 is a nuclear protein that is essential for the maturation of 25 S rRNA and the 60 S ribosomal subunit assembly in Saccharomyces cerevisiae. Q#13374 - CGI_10027312 superfamily 241675 173 408 8.96E-133 394.691 cl00195 SIR2 superfamily - - "SIR2 superfamily of proteins includes silent information regulator 2 (Sir2) enzymes which catalyze NAD+-dependent protein/histone deacetylation, where the acetyl group from the lysine epsilon-amino group is transferred to the ADP-ribose moiety of NAD+, producing nicotinamide and the novel metabolite O-acetyl-ADP-ribose. 
Sir2 proteins, also known as sirtuins, are found in all eukaryotes and many archaea and prokaryotes and have been shown to regulate gene silencing, DNA repair, metabolic enzymes, and life span. The most-studied function, gene silencing, involves the inactivation of chromosome domains containing key regulatory genes by packaging them into a specialized chromatin structure that is inaccessible to DNA-binding proteins. The oligomerization state of Sir2 appears to be organism-dependent, sometimes occurring as a monomer and sometimes as a multimer. Also included in this superfamily is a group of uncharacterized Sir2-like proteins which lack certain key catalytic residues and conserved zinc binding cysteines." Q#13374 - CGI_10027312 superfamily 146962 143 179 3.39E-05 43.1952 cl04602 DUF592 superfamily N - Protein of unknown function (DUF592); This region is found in some SIR2 family proteins (pfam02146). Q#13375 - CGI_10027313 superfamily 243077 14 68 9.26E-06 40.2213 cl02542 DnaJ superfamily - - "DnaJ domain or J-domain. DnaJ/Hsp40 (heat shock protein 40) proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperonine family. Hsp40 proteins are characterized by the presence of a J domain, which mediates the interaction with Hsp70. They may contain other domains as well, and the architectures provide a means of classification." Q#13376 - CGI_10027315 superfamily 245814 36 93 6.48E-08 49.3275 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#13376 - CGI_10027315 superfamily 245814 195 270 5.00E-06 43.9664 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#13376 - CGI_10027315 superfamily 245814 109 165 5.42E-05 40.8532 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#13377 - CGI_10027316 superfamily 245814 158 214 3.50E-06 43.1644 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. 
The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#13378 - CGI_10027317 superfamily 245814 8 62 7.18E-06 40.0828 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#13380 - CGI_10027319 superfamily 241666 39 357 3.52E-74 234.837 cl00184 CAS_like superfamily - - "Clavaminic acid synthetase (CAS) -like; CAS is a trifunctional Fe(II)/ 2-oxoglutarate (2OG) oxygenase carrying out three reactions in the biosynthesis of clavulanic acid, an inhibitor of class A serine beta-lactamases. 
In general, Fe(II)-2OG oxygenases catalyze a hydroxylation reaction, which leads to the incorporation of an oxygen atom from dioxygen into a hydroxyl group and conversion of 2OG to succinate and CO2" Q#13381 - CGI_10027320 superfamily 246680 1475 1542 5.42E-11 61.1974 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#13381 - CGI_10027320 superfamily 246680 1360 1437 2.53E-08 53.1082 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." 
Q#13382 - CGI_10027321 superfamily 222170 51 114 0.00112917 35.6838 cl16285 Dehalogenase superfamily N - Reductive dehalogenase subunit; This family is most frequently associated with a Fer4 iron-sulfur cluster towards the C-terminal region. Q#13384 - CGI_10027323 superfamily 245601 1 63 2.59E-16 74.2808 cl11399 HP superfamily C - "Histidine phosphatase domain found in a functionally diverse set of proteins, mostly phosphatases; contains a His residue which is phosphorylated during the reaction; Catalytic domain of a functionally diverse set of proteins, most of which are phosphatases. The conserved catalytic core of this domain contains a His residue which is phosphorylated in the reaction. This set of proteins includes cofactor-dependent and cofactor-independent phosphoglycerate mutases (dPGM, and BPGM respectively), fructose-2,6-bisphosphatase (F26BP)ase, Sts-1, SixA, histidine acid phosphatases, phytases, and related proteins. Functions include roles in metabolism, signaling, or regulation, for example F26BPase affects glycolysis and gluconeogenesis through controlling the concentration of F26BP; BPGM controls the concentration of 2,3-BPG (the main allosteric effector of hemoglobin in human blood cells); human Sts-1 is a T-cell regulator; Escherichia coli Six A participates in the ArcB-dependent His-to-Asp phosphorelay signaling system; phytases scavenge phosphate from extracellular sources. Deficiency and mutation in many of the human members result in disease, for example erythrocyte BPGM deficiency is a disease associated with a decrease in the concentration of 2,3-BPG. Clinical applications include the use of prostatic acid phosphatase (PAP) as a serum marker for prostate cancer. Agricultural applications include the addition of phytases to animal feed." 
Q#13384 - CGI_10027323 superfamily 245601 238 294 0.00387503 36.1461 cl11399 HP superfamily N - "Histidine phosphatase domain found in a functionally diverse set of proteins, mostly phosphatases; contains a His residue which is phosphorylated during the reaction; Catalytic domain of a functionally diverse set of proteins, most of which are phosphatases. The conserved catalytic core of this domain contains a His residue which is phosphorylated in the reaction. This set of proteins includes cofactor-dependent and cofactor-independent phosphoglycerate mutases (dPGM, and BPGM respectively), fructose-2,6-bisphosphatase (F26BP)ase, Sts-1, SixA, histidine acid phosphatases, phytases, and related proteins. Functions include roles in metabolism, signaling, or regulation, for example F26BPase affects glycolysis and gluconeogenesis through controlling the concentration of F26BP; BPGM controls the concentration of 2,3-BPG (the main allosteric effector of hemoglobin in human blood cells); human Sts-1 is a T-cell regulator; Escherichia coli Six A participates in the ArcB-dependent His-to-Asp phosphorelay signaling system; phytases scavenge phosphate from extracellular sources. Deficiency and mutation in many of the human members result in disease, for example erythrocyte BPGM deficiency is a disease associated with a decrease in the concentration of 2,3-BPG. Clinical applications include the use of prostatic acid phosphatase (PAP) as a serum marker for prostate cancer. Agricultural applications include the addition of phytases to animal feed." Q#13387 - CGI_10027326 superfamily 243050 292 346 1.12E-22 89.4573 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. 
LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#13387 - CGI_10027326 superfamily 243050 78 136 6.94E-22 87.1071 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#13387 - CGI_10027326 superfamily 243050 233 284 1.39E-21 86.3756 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. 
LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#13387 - CGI_10027326 superfamily 243050 182 225 6.58E-16 70.9299 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#13387 - CGI_10027326 superfamily 243050 28 75 1.80E-13 64.163 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. 
LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#13388 - CGI_10027327 superfamily 195146 119 202 1.48E-35 125.125 cl05674 PET superfamily - - "PET ((Prickle Espinas Testin) domain is involved in protein-protein interactions; PET domain is involved in protein-protein interactions and is usually found in conjunction with LIM domain, which is also a protein-protein interaction domain. The PET containing proteins serve as adaptors or scaffolds to support the assembly of multimeric protein complexes. The PET domain has been found at the N-terminal of four known groups of proteins: prickle, testin, LIMPETin/LIM-9 and overexpressed breast tumor protein (OEBT). Prickle has been implicated in regulation of cell movement through its association with the Dishevelled (Dsh) protein in the planar cell polarity (PCP) pathway. Testin is a cytoskeleton associated focal adhesion protein that localizes along actin stress fibers, at cell contact areas, and at focal adhesion plaques. It interacts with a variety of cytoskeletal proteins, including zyxin, mena, VASP, talin, and actin, and is involved in cell motility and adhesion events. Knockout mice experiments reveal tumor repressor function of Testin. LIMPETin/LIM-9 contains an N-terminal PET domain and 6 LIM domains at the C-terminal. In Schistosoma mansoni, where LIMPETin was first identified, it is down regulated in sexually mature adult females compared to sexually immature adult females and adult males. 
Its differential expression indicates that it is a transcription regulator. In C. elegans, LIM-9 may play a role in regulating the assembly and maintenance of the muscle A-band by forming a protein complex with SCPL-1 and UNC-89 and other proteins. OEBT displays a PET domain with two LIM domains, and is predicted to be localized in the nucleus with a possible role in cancer differentiation." Q#13388 - CGI_10027327 superfamily 243050 212 269 1.22E-23 91.6915 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#13389 - CGI_10027328 superfamily 241832 366 414 2.63E-05 42.2954 cl00388 Thioredoxin_like superfamily N - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. 
The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#13390 - CGI_10027329 superfamily 222150 560 582 0.000209901 40.0677 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#13390 - CGI_10027329 superfamily 222150 587 609 0.000572371 38.9121 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#13390 - CGI_10027329 superfamily 222150 94 116 0.000762343 38.5269 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#13390 - CGI_10027329 superfamily 222150 67 89 0.00180981 37.3713 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#13390 - CGI_10027329 superfamily 222150 670 694 0.00272428 36.9861 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#13392 - CGI_10027331 superfamily 204716 28 178 0.00121258 37.9903 cl18257 Git3 superfamily C - "G protein-coupled glucose receptor regulating Gpa2; Git3 is one of six proteins required for glucose-triggered adenylate cyclase activation, and is a G protein-coupled receptor responsible for the activation of adenylate cyclase through Gpa2 - heterotrimeric G protein alpha subunit, part of the glucose-detection pathway. Git3 contains seven predicted transmembrane domains, a third cytoplasmic loop and a cytoplasmic tail. This is the conserved N-terminus of these proteins, and the C-terminal conserved region is now in family Git3_C." 
Q#13392 - CGI_10027331 superfamily 207642 178 240 0.00336288 35.1594 cl02558 GED superfamily - - Dynamin GTPase effector domain; Dynamin GTPase effector domain. Q#13394 - CGI_10027333 superfamily 109875 25 95 1.28E-11 57.1004 cl17923 T4_deiodinase superfamily C - "Iodothyronine deiodinase; Iodothyronine deiodinase converts thyroxine (T4) to 3,5,3'-triiodothyronine (T3)." Q#13395 - CGI_10027334 superfamily 109875 1 67 1.38E-20 81.368 cl17923 T4_deiodinase superfamily N - "Iodothyronine deiodinase; Iodothyronine deiodinase converts thyroxine (T4) to 3,5,3'-triiodothyronine (T3)." Q#13399 - CGI_10027338 superfamily 245864 4 160 3.04E-41 147.04 cl12078 p450 superfamily N - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." 
Q#13400 - CGI_10003601 superfamily 149481 11 211 1.18E-28 116.355 cl07163 CLCA_N superfamily - - Calcium-activated chloride channel; The CLCA family of calcium-activated chloride channels has been identified in many epithelial and endothelial cell types as well as in smooth muscle cells and has four or five putative transmembrane regions. In addition to their role as chloride channels, some CLCA proteins function as adhesion molecules and may also have roles as tumour suppressors. The domain described here is found at the N-terminus of CLCAs. Q#13400 - CGI_10003601 superfamily 149481 210 416 2.34E-27 112.503 cl07163 CLCA_N superfamily N - Calcium-activated chloride channel; The CLCA family of calcium-activated chloride channels has been identified in many epithelial and endothelial cell types as well as in smooth muscle cells and has four or five putative transmembrane regions. In addition to their role as chloride channels, some CLCA proteins function as adhesion molecules and may also have roles as tumour suppressors. The domain described here is found at the N-terminus of CLCAs. Q#13400 - CGI_10003601 superfamily 150094 606 675 1.62E-13 69.7109 cl09605 DUF1973 superfamily N - Domain of unknown function (DUF1973); Members of this family of functionally uncharacterized domains are found in various eukaryotic calcium-dependent chloride channels. Q#13401 - CGI_10003602 superfamily 242123 41 352 1.87E-167 472.663 cl00826 DS superfamily - - "Deoxyhypusine synthase; Eukaryotic initiation factor 5A (eIF-5A) contains an unusual amino acid, hypusine [N epsilon-(4-aminobutyl-2-hydroxy)lysine]. The first step in the post-translational formation of hypusine is catalyzed by the enzyme deoxyhypusine synthase (DS) EC:1.1.1.249. The modified version of eIF-5A, and DS, are required for eukaryotic cell proliferation." 
Q#13402 - CGI_10003603 superfamily 221406 23 524 0 621.79 cl13500 DUF3550 superfamily - - Protein of unknown function (DUF3550/UPF0682); This family of proteins is functionally uncharacterized. This protein is found in eukaryotes. Proteins in this family are typically between 249 and 606 amino acids in length. Q#13404 - CGI_10003605 superfamily 245201 19 275 2.20E-172 487.651 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#13404 - CGI_10003605 superfamily 246710 307 370 2.23E-06 45.5712 cl14783 DOMON_like superfamily C - "Domon-like ligand-binding domains; DOMON-like domains can be found in all three kingdoms of life and are a diverse group of ligand binding domains that have been shown to interact with sugars and hemes. DOMON domains were initially thought to confer protein-protein interactions. They were subsequently found as a heme-binding motif in cellobiose dehydrogenase, an extracellular fungal oxidoreductase that degrades both lignin and cellulose, and in ethylbenzene dehydrogenase, an enzyme that aids in the anaerobic degradation of hydrocarbons. The domain interacts with sugars in the type 9 carbohydrate binding modules (CBM9), which are present in a variety of glycosyl hydrolases, and it can also be found at the N-terminus of sensor histidine kinases." 
Q#13404 - CGI_10003605 superfamily 241832 369 423 6.29E-25 97.91 cl00388 Thioredoxin_like superfamily NC - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#13405 - CGI_10002168 superfamily 247723 179 247 1.00E-05 43.4477 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. 
The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#13405 - CGI_10002168 superfamily 222269 474 534 1.82E-05 45.007 cl18657 Cupin_8 superfamily N - Cupin-like domain; This cupin like domain shares similarity to the JmjC domain. Q#13406 - CGI_10002169 superfamily 243703 2 38 1.34E-13 59.1443 cl04309 RNAP_Rpb7_N_like superfamily C - "RNAP_Rpb7_N_like: This conserved domain represents the N-terminal ribonucleoprotein (RNP) domain of the Rpb7 subunit of eukaryotic RNA polymerase (RNAP) II and its homologs, Rpa43 of eukaryotic RNAP I, Rpc25 of eukaryotic RNAP III, and RpoE (subunit E) of archaeal RNAP. These proteins have, in addition to their N-terminal RNP domain, a C-terminal oligonucleotide-binding (OB) domain. Each of these subunits heterodimerizes with another RNAP subunit (Rpb7 to Rpb4, Rpc25 to Rpc17, RpoE to RpoF, and Rpa43 to Rpa14). The heterodimer is thought to tether the RNAP to a given promoter via its interactions with a promoter-bound transcription factor. The heterodimer is also thought to bind and position nascent RNA as it exits the polymerase complex." Q#13408 - CGI_10013968 superfamily 217473 96 319 3.68E-28 114.384 cl03978 Mab-21 superfamily - - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#13411 - CGI_10013971 superfamily 241578 219 396 4.60E-47 163.54 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. 
The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#13411 - CGI_10013971 superfamily 207701 2 95 1.05E-16 76.9494 cl02699 VIT superfamily N - Vault protein inter-alpha-trypsin domain; Inter-alpha-trypsin inhibitors (ITIs) consist of one light chain and a variable set of heavy chains. ITIs play a role in extracellular matrix (ECM) stabilisation and tumour metastasis as well as in plasma protease inhibition. The vault protein inter-alpha-trypsin (VIT) domain described here is found to the N-terminus of a von Willebrand factor type A domain (pfam00092) in ITI heavy chains (ITIHs) and their precursors. Q#13412 - CGI_10013972 superfamily 241599 194 256 1.99E-20 82.6764 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#13413 - CGI_10013973 superfamily 246683 24 347 2.06E-138 399.187 cl14648 Aldose_epim superfamily - - "aldose 1-epimerase superfamily; Aldose 1-epimerases or mutarotases are key enzymes of carbohydrate metabolism; they catalyze the interconversion of the alpha- and beta-anomers of hexose sugars such as glucose and galactose. 
This interconversion is an important step that allows anomer specific metabolic conversion of sugars. Studies of the catalytic mechanism of the best known member of the family, galactose mutarotase, have shown a glutamate and a histidine residue to be critical for catalysis; the glutamate serves as the active site base to initiate the reaction by removing the proton from the C-1 hydroxyl group of the sugar substrate and the histidine as the active site acid to protonate the C-5 ring oxygen." Q#13414 - CGI_10013974 superfamily 246683 25 349 6.66E-153 436.167 cl14648 Aldose_epim superfamily - - "aldose 1-epimerase superfamily; Aldose 1-epimerases or mutarotases are key enzymes of carbohydrate metabolism; they catalyze the interconversion of the alpha- and beta-anomers of hexose sugars such as glucose and galactose. This interconversion is an important step that allows anomer specific metabolic conversion of sugars. Studies of the catalytic mechanism of the best known member of the family, galactose mutarotase, have shown a glutamate and a histidine residue to be critical for catalysis; the glutamate serves as the active site base to initiate the reaction by removing the proton from the C-1 hydroxyl group of the sugar substrate and the histidine as the active site acid to protonate the C-5 ring oxygen." Q#13417 - CGI_10013977 superfamily 247984 187 401 5.68E-43 154.305 cl17430 FtsJ superfamily - - "FtsJ-like methyltransferase; This family consists of FtsJ from various bacterial and archaeal sources FtsJ is a methyltransferase, but actually has no effect on cell division. FtsJ's substrate is the 23S rRNA. The 1.5 A crystal structure of FtsJ in complex with its cofactor S-adenosylmethionine revealed that FtsJ has a methyltransferase fold. This family also includes the N terminus of flaviviral NS5 protein. It has been hypothesised that the N-terminal domain of NS5 is a methyltransferase involved in viral RNA capping." 
Q#13417 - CGI_10013977 superfamily 243107 41 73 8.66E-11 58.6728 cl02611 G-patch superfamily C - "G-patch domain; This domain is found in a number of RNA binding proteins, and is also found in proteins that contain RNA binding domains. This suggests that this domain may have an RNA binding function. This domain has seven highly conserved glycines." Q#13417 - CGI_10013977 superfamily 241647 716 746 0.00361246 36.0395 cl00157 WW superfamily - - Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs. Q#13418 - CGI_10013978 superfamily 241624 435 709 9.84E-90 285.374 cl00120 PP2Cc superfamily - - "Serine/threonine phosphatases, family 2C, catalytic domain; The protein architecture and deduced catalytic mechanism of PP2C phosphatases are similar to the PP1, PP2A, PP2B family of protein Ser/Thr phosphatases, with which PP2C shares no sequence similarity." Q#13418 - CGI_10013978 superfamily 149089 703 781 2.91E-14 69.6078 cl06733 PP2C_C superfamily - - "Protein serine/threonine phosphatase 2C, C-terminal domain; Protein phosphatase 2C (PP2C) is involved in regulating cellular responses to stress in various eukaryotes. It consists of two domains: an N-terminal catalytic domain and a C-terminal domain characteristic of mammalian PP2Cs. This domain consists of three antiparallel alpha helices, one of which packs against two corresponding alpha-helices of the N-terminal domain. 
The C-terminal domain does not seem to play a role in catalysis, but it may provide protein substrate specificity due to the cleft that is created between it and the catalytic domain." Q#13418 - CGI_10013978 superfamily 215882 139 271 1.17E-11 62.6834 cl09511 FERM_M superfamily - - FERM central domain; This domain is the central structural domain of the FERM domain. Q#13418 - CGI_10013978 superfamily 247725 233 325 1.05E-05 45.3106 cl17171 PH-like superfamily NC - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting proteins to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and other proteins." Q#13420 - CGI_10013980 superfamily 241728 9 164 6.11E-47 151.54 cl00253 Dtyr_deacylase superfamily - - D-Tyrosyl-tRNAtyr deacylases; a class of tRNA-dependent hydrolases which are capable of hydrolyzing the ester bond of D-Tyrosyl-tRNA reducing the level of cellular D-Tyrosine while recycling the peptidyl-tRNA; found in bacteria and in eukaryotes but not in archaea; beta barrel-like fold structure; forms homodimers in which two surface cavities serve as the active site for tRNA binding Q#13421 - CGI_10013981 superfamily 217473 145 322 1.75E-26 109.762 cl03978 Mab-21 superfamily N - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. 
Q#13422 - CGI_10013982 superfamily 243034 27 129 8.31E-11 57.7752 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#13422 - CGI_10013982 superfamily 246597 144 332 2.83E-141 405.864 cl13995 MPP_superfamily superfamily N - "metallophosphatase superfamily, metallophosphatase domain; Metallophosphatases (MPPs), also known as metallophosphoesterases, phosphodiesterases (PDEs), binuclear metallophosphoesterases, and dimetal-containing phosphoesterases (DMPs), represent a diverse superfamily of enzymes with a conserved domain containing an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. 
This superfamily includes: the phosphoprotein phosphatases (PPPs), Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination." Q#13423 - CGI_10013983 superfamily 217473 140 199 1.30E-05 45.0485 cl03978 Mab-21 superfamily N - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#13425 - CGI_10013985 superfamily 219532 4 24 0.000395548 34.9826 cl06657 OB_NTP_bind superfamily N - "Oligonucleotide/oligosaccharide-binding (OB)-fold; This family is found towards the C-terminus of the DEAD-box helicases (pfam00270). In these helicases it is apparently always found in association with pfam04408. There do seem to be a couple of instances where it occurs by itself - . The structure PDB:3i4u adopts an OB-fold. helicases (pfam00270). In these helicases it is apparently always found in association with pfam04408. This C-terminal domain of the yeast helicase contains an oligonucleotide/oligosaccharide-binding (OB)-fold which seems to be placed at the entrance of the putative nucleic acid cavity. It also constitutes the binding site for the G-patch-containing domain of Pfa1p. When found on DEAH/RHA helicases, this domain is central to the regulation of the helicase activity through its binding of both RNA and G-patch domain proteins." Q#13426 - CGI_10013986 superfamily 241983 10 334 3.62E-44 154.823 cl00614 ADP_ribosyl_GH superfamily - - "ADP-ribosylglycohydrolase; This family includes enzymes that ADP-ribosylations, for example ADP-ribosylarginine hydrolase EC:3.2.2.19 cleaves ADP-ribose-L-arginine. 
The family also includes dinitrogenase reductase activating glycohydrolase. Most surprisingly the family also includes jellyfish crystallins, these proteins appear to have lost the presumed active site residues." Q#13427 - CGI_10005149 superfamily 241555 8 169 2.26E-25 98.7826 cl00020 GAT_1 superfamily - - "Type 1 glutamine amidotransferase (GATase1)-like domain; Type 1 glutamine amidotransferase (GATase1)-like domain. This group contains proteins similar to Class I glutamine amidotransferases, the intracellular PH1704 from Pyrococcus horikoshii, the C-terminal of the large catalase: Escherichia coli HP-II, Sinorhizobium meliloti Rm1021 ThuA, the A4 beta-galactosidase middle domain and peptidase E. The majority of proteins in this group have a reactive Cys found in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow. For Class I glutamine amidotransferases proteins which transfer ammonia from the amide side chain of glutamine to an acceptor substrate, this Cys forms a Cys-His-Glu catalytic triad in the active site. Glutamine amidotransferases activity can be found in a range of biosynthetic enzymes included in this cd: glutamine amidotransferase, formylglycinamide ribonucleotide, GMP synthetase, anthranilate synthase component II, glutamine-dependent carbamoyl phosphate synthase (CPSase), cytidine triphosphate synthetase, gamma-glutamyl hydrolase, imidazole glycerol phosphate synthase and, cobyric acid synthase. For Pyrococcus horikoshii PH1704, the Cys of the nucleophile elbow together with a different His and, a Glu from an adjacent monomer form a catalytic triad different from the typical GATase1 triad. Peptidase E is believed to be a serine peptidase having a Ser-His-Glu catalytic triad which differs from the Cys-His-Glu catalytic triad of typical GATase1 domains, by having a Ser in place of the reactive Cys at the nucleophile elbow. The E. coli HP-II C-terminal domain, S. 
meliloti Rm1021 ThuA and the A4 beta-galactosidase middle domain lack the catalytic triad typical of GATase1 domains. GATase1-like domains can occur either as single polypeptides, as in Class I glutamine amidotransferases, or as domains in a much larger multifunctional synthase protein, such as CPSase. Peptidase E has a circular permutation in the common core of a typical GATase1 domain." Q#13432 - CGI_10005154 superfamily 245827 172 583 9.23E-128 391.198 cl11986 TOP1Ac superfamily - - "DNA Topoisomerase, subtype IA; DNA-binding, ATP-binding and catalytic domain of bacterial DNA topoisomerases I and III, and eukaryotic DNA topoisomerase III and eubacterial and archaeal reverse gyrases. Topoisomerases cleave single or double stranded DNA and then rejoin the broken phosphodiester backbone. Proposed catalytic mechanism of single stranded DNA cleavage is by phosphoryl transfer through a tyrosine nucleophile using acid/base catalysis. Tyr is activated by a nearby group (not yet identified) acting as a general base for nucleophilic attack on the 5' phosphate of the scissile bond. Arg and Lys stabilize the pentavalent transition state. Glu then acts as a proton donor for the leaving 3'-oxygen, upon cleavage of the scissile strand." Q#13432 - CGI_10005154 superfamily 242046 3 166 1.05E-49 172.418 cl00718 TOPRIM superfamily - - "Topoisomerase-primase domain. This is a nucleotidyl transferase/hydrolase domain found in type IA, type IIA and type IIB topoisomerases, bacterial DnaG-type primases, small primase-like proteins from bacteria and archaea, OLD family nucleases from bacteria and archaea, and bacterial DNA repair proteins of the RecR/M family. This domain has two conserved motifs, one of which centers at a conserved glutamate and the other one at two conserved aspartates (DxD). This glutamate and two aspartates cluster together to form a highly acidic surface patch. 
The conserved glutamate may act as a general base in nucleotide polymerization by primases and in strand joining in topoisomerases and, as a general acid in strand cleavage by topoisomerases and nucleases. The DXD motif may co-ordinate Mg2+, a cofactor required for full catalytic function." Q#13433 - CGI_10005155 superfamily 241766 14 300 5.25E-126 364.467 cl00303 PNP_UDP_1 superfamily - - Phosphorylase superfamily; Members of this family include: purine nucleoside phosphorylase (PNP) Uridine phosphorylase (UdRPase) 5'-methylthioadenosine phosphorylase (MTA phosphorylase) Q#13435 - CGI_10004542 superfamily 241599 3 54 5.08E-11 54.9421 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#13436 - CGI_10004543 superfamily 110440 251 278 0.00557805 33.9205 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#13438 - CGI_10004545 superfamily 244906 9 90 2.45E-19 79.1664 cl08315 CAP_GLY superfamily - - "CAP-Gly domain; Cytoskeleton-associated proteins (CAPs) are involved in the organisation of microtubules and transportation of vesicles and organelles along the cytoskeletal network. A conserved motif, CAP-Gly, has been identified in a number of CAPs, including CLIP-170 and dynactins. The crystal structure of Caenorhabditis elegans F53F4.3 protein CAP-Gly domain was recently solved. 
The domain contains three beta-strands. The most conserved sequence, GKNDG, is located in two consecutive sharp turns on the surface, forming the entrance to a groove." Q#13439 - CGI_10004546 superfamily 248282 91 151 0.000136345 42.0009 cl17728 DUF11 superfamily - - Domain of unknown function DUF11; A domain of unknown function found in multiple copies in several archaebacterial proteins. Q#13441 - CGI_10004548 superfamily 147590 7 71 1.61E-19 76.5257 cl07862 ATP_synt_H superfamily - - ATP synthase subunit H; ATP synthase subunit H is an extremely hydrophobic of approximately 9 kDa. This subunit may be required for assembly of vacuolar ATPase. Q#13442 - CGI_10004549 superfamily 248054 15 49 7.78E-05 40.9184 cl17500 NAD_binding_8 superfamily C - NAD(P)-binding Rossmann-like domain; NAD(P)-binding Rossmann-like domain. Q#13444 - CGI_10004031 superfamily 245598 89 285 1.55E-73 235.251 cl11396 Patatin_and_cPLA2 superfamily - - "Patatins and Phospholipases; Patatin-like phospholipase. This family consists of various patatin glycoproteins from plants. The patatin protein accounts for up to 40% of the total soluble protein in potato tubers. Patatin is a storage protein, but it also has the enzymatic activity of a lipid acyl hydrolase, catalyzing the cleavage of fatty acids from membrane lipids. Members of this family have also been found in vertebrates. This family also includes the catalytic domain of cytosolic phospholipase A2 (PLA2; EC 3.1.1.4) hydrolyzes the sn-2-acyl ester bond of phospholipids to release arachidonic acid. At the active site, cPLA2 contains a serine nucleophile through which the catalytic mechanism is initiated. The active site is partially covered by a solvent-accessible flexible lid. cPLA2 displays interfacial activation as it exists in both "closed lid" and "open lid" forms." 
Q#13444 - CGI_10004031 superfamily 244899 395 451 0.000226704 39.7806 cl08302 S-100 superfamily N - "S-100: S-100 domain, which represents the largest family within the superfamily of proteins carrying the Ca-binding EF-hand motif. Note that this S-100 hierarchy contains only S-100 EF-hand domains, other EF-hands have been modeled separately. S100 proteins are expressed exclusively in vertebrates, and are implicated in intracellular and extracellular regulatory activities. Intracellularly, S100 proteins act as Ca-signaling or Ca-buffering proteins. The most unusual characteristic of certain S100 proteins is their occurrence in extracellular space, where they act in a cytokine-like manner through RAGE, the receptor for advanced glycation products. Structural data suggest that many S100 members exist within cells as homo- or heterodimers and even oligomers; oligomerization contributes to their functional diversification. Upon binding calcium, most S100 proteins change conformation to a more open structure exposing a hydrophobic cleft. This hydrophobic surface represents the interaction site of S100 proteins with their target proteins. There is experimental evidence showing that many S100 proteins have multiple binding partners with diverse mode of interaction with different targets. In addition to S100 proteins (such as S100A1,-3,-4,-6,-7,-10,-11,and -13), this group includes the ''fused'' gene family, a group of calcium binding S100-related proteins. The ''fused'' gene family includes multifunctional epidermal differentiation proteins - profilaggrin, trichohyalin, repetin, hornerin, and cornulin; functionally these proteins are associated with keratin intermediate filaments and partially crosslinked to the cell envelope. These ''fused'' gene proteins contain N-terminal sequence with two Ca-binding EF-hands motif, which may be associated with calcium signaling in epidermal cells and autoprocessing in a calcium-dependent manner. 
In contrast to S100 proteins, "fused" gene family proteins contain an extraordinary high number of almost perfect peptide repeats with regular array of polar and charged residues similar to many known cell envelope proteins." Q#13444 - CGI_10004031 superfamily 247856 427 479 0.000632126 38.2977 cl17302 EFh superfamily C - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#13445 - CGI_10004032 superfamily 247912 53 406 6.18E-28 113.75 cl17358 Beta-lactamase superfamily - - Beta-lactamase; This family appears to be distantly related to pfam00905 and PF00768 D-alanyl-D-alanine carboxypeptidase. Q#13447 - CGI_10019109 superfamily 243072 7 124 3.45E-29 104.773 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#13448 - CGI_10019110 superfamily 247792 477 517 6.49E-06 43.5884 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#13449 - CGI_10019111 superfamily 241677 39 161 1.25E-73 220.686 cl00197 cyclophilin superfamily C - "cyclophilin: cyclophilin-type peptidylprolyl cis- trans isomerases. This family contains eukaryotic, bacterial and archeal proteins which exhibit a peptidylprolyl cis- trans isomerases activity (PPIase, Rotamase) and in addition bind the immunosuppressive drug cyclosporin (CsA). Immunosuppression in vertebrates is believed to be the result of the cyclophilin A-cyclosporin protein drug complex binding to and inhibiting the protein-phosphatase calcineurin. PPIase is an enzyme which accelerates protein folding by catalyzing the cis-trans isomerization of the peptide bonds preceding proline residues. Cyclophilins are a diverse family in terms of function and have been implicated in protein folding processes which depend on catalytic /chaperone-like activities. This group contains human cyclophilin 40, a co-chaperone of the hsp90 chaperone system; human cyclophilin A, a chaperone in the HIV-1 infectious process and; human cyclophilin H, a component of the U4/U6 snRNP, whose isomerization or chaperoning activities may play a role in RNA splicing." 
Q#13452 - CGI_10019114 superfamily 247723 122 207 1.33E-59 186.061 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#13452 - CGI_10019114 superfamily 247723 22 111 3.21E-58 182.995 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. 
The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#13456 - CGI_10019118 superfamily 241619 4 53 0.000462518 37.1765 cl00112 PAN_APPLE superfamily N - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#13458 - CGI_10019120 superfamily 241619 1 53 5.22E-05 36.2937 cl00112 PAN_APPLE superfamily N - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#13464 - CGI_10019126 superfamily 202746 190 418 2.68E-82 255.682 cl08402 Hexokinase_2 superfamily - - Hexokinase; Hexokinase (EC:2.7.1.1) contains two structurally similar domains represented by this family and pfam00349. Some members of the family have two copies of each of these domains. 
Q#13465 - CGI_10019127 superfamily 241559 24 126 2.00E-26 100.248 cl00030 CH superfamily C - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." Q#13466 - CGI_10019128 superfamily 241559 22 124 5.21E-33 119.508 cl00030 CH superfamily C - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." Q#13469 - CGI_10019131 superfamily 241559 13 166 5.86E-30 110.263 cl00030 CH superfamily - - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." 
Q#13470 - CGI_10019132 superfamily 243092 1756 2043 1.80E-69 238.389 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#13470 - CGI_10019132 superfamily 243092 2007 2337 4.30E-44 164.815 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#13470 - CGI_10019132 superfamily 218721 509 665 1.76E-22 101.039 cl05344 TROVE superfamily N - "TROVE domain; This presumed domain is found in TEP1 and Ro60 proteins, that are RNA-binding components of Telomerase, Ro and Vault RNPs. This domain has been named TROVE, (after Telomerase, Ro and Vault). This domain is probably RNA-binding." Q#13470 - CGI_10019132 superfamily 218721 317 412 4.63E-17 84.0897 cl05344 TROVE superfamily C - "TROVE domain; This presumed domain is found in TEP1 and Ro60 proteins, that are RNA-binding components of Telomerase, Ro and Vault RNPs. This domain has been named TROVE, (after Telomerase, Ro and Vault). This domain is probably RNA-binding." 
Q#13470 - CGI_10019132 superfamily 205451 977 1057 1.18E-08 54.8919 cl16203 DUF4062 superfamily - - "Domain of unknown function (DUF4062); This presumed domain is functionally uncharacterized. This domain family is found in bacteria, archaea and eukaryotes, and is approximately 80 amino acids in length. There is a conserved SST sequence motif." Q#13470 - CGI_10019132 superfamily 247743 1212 1337 0.000132761 43.3424 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#13473 - CGI_10019135 superfamily 248054 38 83 0.000144492 39.7628 cl17500 NAD_binding_8 superfamily N - NAD(P)-binding Rossmann-like domain; NAD(P)-binding Rossmann-like domain. Q#13474 - CGI_10019136 superfamily 243158 284 318 2.84E-05 41.3904 cl02723 Sel1 superfamily - - Sel1 repeat; This short repeat is found in the Sel1 protein. It is related to TPR repeats. Q#13474 - CGI_10019136 superfamily 243158 210 243 0.000121636 39.4644 cl02723 Sel1 superfamily - - Sel1 repeat; This short repeat is found in the Sel1 protein. It is related to TPR repeats. Q#13474 - CGI_10019136 superfamily 243158 166 197 0.000446733 37.9236 cl02723 Sel1 superfamily - - Sel1 repeat; This short repeat is found in the Sel1 protein. It is related to TPR repeats. 
Q#13474 - CGI_10019136 superfamily 236274 69 144 0.00230708 37.7055 cl18891 PRK08485 superfamily N - DNA polymerase III subunit delta'; Validated Q#13474 - CGI_10019136 superfamily 243158 370 400 0.00886682 34.0716 cl02723 Sel1 superfamily - - Sel1 repeat; This short repeat is found in the Sel1 protein. It is related to TPR repeats. Q#13475 - CGI_10019137 superfamily 222150 1092 1117 0.00147585 38.1417 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#13475 - CGI_10019137 superfamily 222150 474 499 0.00181044 37.7565 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#13477 - CGI_10019139 superfamily 241636 190 376 6.15E-76 239.027 cl00145 TBOX superfamily - - "T-box DNA binding domain of the T-box family of transcriptional regulators. The T-box family is an ancient group that appears to play a critical role in development in all animal species. These genes were uncovered on the basis of similarity to the DNA binding domain of murine Brachyury (T) gene product, the defining feature of the family. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development and conserved expression patterns, most of the known genes in all species being expressed in mesoderm or mesoderm precursors." Q#13479 - CGI_10019141 superfamily 247727 159 293 1.28E-12 65.3903 cl17173 AdoMet_MTases superfamily C - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) 
and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#13479 - CGI_10019141 superfamily 241636 109 136 0.000376229 39.4932 cl00145 TBOX superfamily C - "T-box DNA binding domain of the T-box family of transcriptional regulators. The T-box family is an ancient group that appears to play a critical role in development in all animal species. These genes were uncovered on the basis of similarity to the DNA binding domain of murine Brachyury (T) gene product, the defining feature of the family. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development and conserved expression patterns, most of the known genes in all species being expressed in mesoderm or mesoderm precursors." Q#13480 - CGI_10019142 superfamily 247727 83 259 1.07E-18 81.5687 cl17173 AdoMet_MTases superfamily C - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#13482 - CGI_10019144 superfamily 242091 155 276 1.38E-22 93.5477 cl00786 MgtE superfamily - - Divalent cation transporter; This region is the integral membrane part of the eubacterial MgtE family of magnesium transporters. Related regions are found also in archaebacterial and eukaryotic proteins. All the archaebacterial and eukaryotic examples have two copies of the region. This suggests that the eubacterial examples may act as dimers. Members of this family probably transport Mg2+ or other divalent cations into the cell. 
The alignment contains two highly conserved aspartates that may be involved in cation binding (Bateman A unpubl.) Q#13482 - CGI_10019144 superfamily 242091 333 517 4.86E-08 51.9901 cl00786 MgtE superfamily - - Divalent cation transporter; This region is the integral membrane part of the eubacterial MgtE family of magnesium transporters. Related regions are found also in archaebacterial and eukaryotic proteins. All the archaebacterial and eukaryotic examples have two copies of the region. This suggests that the eubacterial examples may act as dimers. Members of this family probably transport Mg2+ or other divalent cations into the cell. The alignment contains two highly conserved aspartates that may be involved in cation binding (Bateman A unpubl.) Q#13484 - CGI_10019146 superfamily 243066 394 490 8.79E-29 112.014 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#13484 - CGI_10019146 superfamily 243146 724 779 1.26E-08 52.6638 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. 
" Q#13484 - CGI_10019146 superfamily 243146 834 884 1.85E-06 46.5006 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#13484 - CGI_10019146 superfamily 198867 497 588 3.41E-06 46.3827 cl06652 BACK superfamily - - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation)." Q#13484 - CGI_10019146 superfamily 243146 685 725 2.29E-05 43.2057 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#13484 - CGI_10019146 superfamily 243146 894 926 0.00370905 36.4855 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#13485 - CGI_10019147 superfamily 242035 6 157 3.23E-37 126.603 cl00698 CGI-121 superfamily - - "Kinase binding protein CGI-121; CGI-121 has been shown to bind to the p53-related protein kinase (PRPK). PRPK is a novel protein kinase which binds to and induces phosphorylation of the tumour suppressor protein p53. CGI-121 is part of a conserved protein complex, KEOPS. The KEOPS complex is involved in telomere uncapping and telomere elongation. Interestingly this family also include archaeal homologues, formerly in the DUF509 family. A structure for these proteins has been solved by structural genomics." Q#13486 - CGI_10019148 superfamily 214507 348 399 3.24E-08 49.736 cl15307 LRRCT superfamily - - Leucine rich repeat C-terminal domain; Leucine rich repeat C-terminal domain. Q#13487 - CGI_10019149 superfamily 242885 51 214 9.02E-57 180.101 cl02106 IF4E superfamily - - Eukaryotic initiation factor 4E; Eukaryotic initiation factor 4E. 
Q#13488 - CGI_10016102 superfamily 217613 177 290 1.13E-55 178.137 cl04154 Cullin_binding superfamily - - "Cullin binding; This domain binds to cullins and to Rbx-1, components of an E3 ubiquitin ligase complex for neddylation. Neddylation is the process by which the C-terminal glycine of the ubiquitin-like protein Nedd8 is covalently linked to lysine residues in a protein through an isopeptide bond. The structure of this domain is composed entirely of alpha helices." Q#13489 - CGI_10016103 superfamily 247743 378 517 2.88E-06 46.1344 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#13490 - CGI_10016104 superfamily 247824 104 295 7.61E-12 63.7827 cl17270 APH_ChoK_like superfamily N - "Aminoglycoside 3'-phosphotransferase (APH) and Choline Kinase (ChoK) family. The APH/ChoK family is part of a larger superfamily that includes the catalytic domains of other kinases, such as the typical serine/threonine/tyrosine protein kinases (PKs), RIO kinases, actin-fragmin kinase (AFK), and phosphoinositide 3-kinase (PI3K). The family is composed of APH, ChoK, ethanolamine kinase (ETNK), macrolide 2'-phosphotransferase (MPH2'), an unusual homoserine kinase, and uncharacterized proteins with similarity to the N-terminal domain of acyl-CoA dehydrogenase 10 (ACAD10). 
The members of this family catalyze the transfer of the gamma-phosphoryl group from ATP (or CTP) to small molecule substrates such as aminoglycosides, macrolides, choline, ethanolamine, and homoserine. Phosphorylation of the antibiotics, aminoglycosides and macrolides, leads to their inactivation and to bacterial antibiotic resistance. Phosphorylation of choline, ethanolamine, and homoserine serves as precursors to the synthesis of important biological compounds, such as the major phospholipids, phosphatidylcholine and phosphatidylethanolamine and the amino acids, threonine, methionine, and isoleucine." Q#13492 - CGI_10016106 superfamily 216334 97 202 0.00240058 37.7456 cl12245 Glypican superfamily N - Glypican; Glypican. Q#13493 - CGI_10016107 superfamily 247824 53 226 1.89E-07 48.5549 cl17270 APH_ChoK_like superfamily - - "Aminoglycoside 3'-phosphotransferase (APH) and Choline Kinase (ChoK) family. The APH/ChoK family is part of a larger superfamily that includes the catalytic domains of other kinases, such as the typical serine/threonine/tyrosine protein kinases (PKs), RIO kinases, actin-fragmin kinase (AFK), and phosphoinositide 3-kinase (PI3K). The family is composed of APH, ChoK, ethanolamine kinase (ETNK), macrolide 2'-phosphotransferase (MPH2'), an unusual homoserine kinase, and uncharacterized proteins with similarity to the N-terminal domain of acyl-CoA dehydrogenase 10 (ACAD10). The members of this family catalyze the transfer of the gamma-phosphoryl group from ATP (or CTP) to small molecule substrates such as aminoglycosides, macrolides, choline, ethanolamine, and homoserine. Phosphorylation of the antibiotics, aminoglycosides and macrolides, leads to their inactivation and to bacterial antibiotic resistance. 
Phosphorylation of choline, ethanolamine, and homoserine serves as precursors to the synthesis of important biological compounds, such as the major phospholipids, phosphatidylcholine and phosphatidylethanolamine and the amino acids, threonine, methionine, and isoleucine." Q#13494 - CGI_10016108 superfamily 216334 20 449 5.02E-162 479.569 cl12245 Glypican superfamily C - Glypican; Glypican. Q#13495 - CGI_10016109 superfamily 246669 623 818 6.02E-81 266.88 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#13495 - CGI_10016109 superfamily 244307 1690 2102 0 571.977 cl06123 DHR2_DOCK superfamily - - "Dock Homology Region 2, a GEF domain, of Dedicator of Cytokinesis proteins; DOCK proteins comprise a family of atypical guanine nucleotide exchange factors (GEFs) that lack the conventional Dbl homology (DH) domain. As GEFs, they activate the small GTPases Rac and Cdc42 by exchanging bound GDP for free GTP. They are also called the CZH (CED-5, Dock180, and MBC-zizimin homology) family, after the first family members identified. 
Dock180 was first isolated as a binding partner for the adaptor protein Crk. The Caenorhabditis elegans protein, Ced-5, is essential for cell migration and phagocytosis, while the Drosophila ortholog, Myoblast city (MBC), is necessary for myoblast fusion and dorsal closure. DOCKs are divided into four classes (A-D) based on sequence similarity and domain architecture: class A includes Dock1 (or Dock180), 2 and 5; class B includes Dock3 and 4; class C includes Dock6, 7, and 8; and class D includes Dock9, 10 and 11. All DOCKs contain two homology domains: the DHR-1 (Dock homology region-1), also called CZH1, and DHR-2 (also called CZH2 or Docker). This alignment model represents the DHR-2 domain of DOCK proteins, which contains the catalytic GEF activity for Rac and/or Cdc42." Q#13495 - CGI_10016109 superfamily 247725 150 257 4.75E-41 150.141 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#13495 - CGI_10016109 superfamily 221285 36 131 9.27E-21 90.4882 cl13339 DUF3398 superfamily - - Domain of unknown function (DUF3398); This domain is functionally uncharacterized. This domain is found in eukaryotes. This presumed domain is about 100 amino acids in length. Q#13496 - CGI_10016110 superfamily 248109 75 155 0.00459174 34.1741 cl17555 NLPC_P60 superfamily C - NlpC/P60 family; The function of this domain is unknown. It is found in several lipoproteins. 
Q#13497 - CGI_10016111 superfamily 245814 157 236 1.43E-08 50.1803 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#13497 - CGI_10016111 superfamily 241607 104 136 0.000175257 38.0198 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chyomotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. 
The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#13497 - CGI_10016111 superfamily 243049 26 84 0.000115692 38.9859 cl02472 IGFBP superfamily C - Insulin-like growth factor binding protein; Insulin-like growth factor binding protein. Q#13498 - CGI_10016112 superfamily 241563 68 109 1.00E-05 43.2368 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#13499 - CGI_10016113 superfamily 241563 31 72 2.80E-06 44.7776 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#13500 - CGI_10016114 superfamily 241563 83 124 1.41E-05 42.8516 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#13500 - CGI_10016114 superfamily 241563 43 73 0.00617166 35.1476 cl00034 BBOX superfamily N - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#13501 - CGI_10016115 superfamily 222150 70 97 0.000773286 36.2157 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#13501 - CGI_10016115 superfamily 222150 41 67 0.00213782 35.0601 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#13502 - CGI_10016116 superfamily 222150 279 306 0.000270919 38.5269 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#13502 - CGI_10016116 superfamily 222150 250 276 0.000691127 37.3713 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#13503 - CGI_10016117 superfamily 222150 277 304 0.00083981 37.3713 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#13503 - CGI_10016117 superfamily 222150 248 274 0.00120595 36.6009 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#13505 - CGI_10016119 superfamily 220647 7 169 2.54E-26 100.48 cl18565 L_HGMIC_fpl superfamily - - "Lipoma HMGIC fusion partner-like protein; This is a group of proteins expressed from a series of genes referred to as Lipoma HGMIC fusion partner-like. The proteins carry four highly conserved transmembrane domains in this entry. In certain instances, eg in LHFPL5, mutations cause deafness in humans and hypospadias, and LHFPL1 is transcribed in six liver tumour cell lines." Q#13506 - CGI_10016120 superfamily 190261 128 190 5.86E-22 89.9154 cl03504 RFX_DNA_binding superfamily - - RFX DNA-binding domain; RFX is a regulatory factor which binds to the X box of MHC class II genes and is essential for their expression. The DNA-binding domain of RFX is the central domain of the protein and binds ssDNA as either a monomer or homodimer. Q#13507 - CGI_10016121 superfamily 242902 1144 1312 4.18E-45 162.105 cl02144 TLD superfamily - - TLD; This domain is predicted to be an enzyme and is often found associated with pfam01476. 
Q#13507 - CGI_10016121 superfamily 243056 817 1012 8.53E-16 77.7845 cl02495 RabGAP-TBC superfamily - - "Rab-GTPase-TBC domain; Identification of a TBC domain in GYP6_YEAST and GYP7_YEAST, which are GTPase activator proteins of yeast Ypt6 and Ypt7, implies that these domains are GTPase activator proteins of Rab-like small GTPases." Q#13508 - CGI_10016122 superfamily 217598 1 71 6.50E-28 103.318 cl04130 KCNQ_channel superfamily N - KCNQ voltage-gated potassium channel; This family matches to the C-terminal tail of KCNQ type potassium channels. Q#13509 - CGI_10016123 superfamily 243072 93 222 5.19E-28 111.321 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#13509 - CGI_10016123 superfamily 243072 172 315 3.09E-24 100.151 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#13509 - CGI_10016123 superfamily 243072 302 415 1.36E-19 86.6686 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#13509 - CGI_10016123 superfamily 243072 428 548 1.02E-14 72.4162 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#13511 - CGI_10005242 superfamily 247725 148 275 2.96E-75 232.957 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#13511 - CGI_10005242 superfamily 241631 2 121 2.35E-43 151.221 cl00136 Sec7 superfamily N - Sec7 domain; Domain named after the S. cerevisiae SEC7 gene product. The Sec7 domain is the central domain of the guanine-nucleotide-exchange factors (GEFs) of the ADP-ribosylation factor family of small GTPases (ARFs) . It carries the exchange factor activity. 
Q#13512 - CGI_10012581 superfamily 241799 2 49 7.45E-13 59.1128 cl00339 SugarP_isomerase superfamily N - "SugarP_isomerase: Sugar Phosphate Isomerase family; includes type A ribose 5-phosphate isomerase (RPI_A), glucosamine-6-phosphate (GlcN6P) deaminase, and 6-phosphogluconolactonase (6PGL). RPI catalyzes the reversible conversion of ribose-5-phosphate to ribulose 5-phosphate, the first step of the non-oxidative branch of the pentose phosphate pathway. GlcN6P deaminase catalyzes the reversible conversion of GlcN6P to D-fructose-6-phosphate (Fru6P) and ammonium, the last step of the metabolic pathway of N-acetyl-D-glucosamine-6-phosphate. 6PGL converts 6-phosphoglucono-1,5-lactone to 6-phosphogluconate, the second step of the oxidative phase of the pentose phosphate pathway." Q#13513 - CGI_10012582 superfamily 245201 10 141 1.38E-82 249.415 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#13515 - CGI_10012584 superfamily 248458 175 348 3.95E-07 50.0049 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. 
Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#13516 - CGI_10012585 superfamily 248458 20 302 2.55E-08 53.8569 cl17904 MFS superfamily - - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. 
The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#13518 - CGI_10012587 superfamily 248458 61 312 0.00460321 37.2933 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. 
Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#13519 - CGI_10012588 superfamily 248458 169 548 3.91E-27 112.022 cl17904 MFS superfamily - - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 
Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#13520 - CGI_10012589 superfamily 247723 15 103 1.93E-66 202.483 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#13521 - CGI_10012590 superfamily 220626 130 387 6.35E-32 121.959 cl18564 GpcrRhopsn4 superfamily - - "Rhodopsin-like GPCR transmembrane domain; This region of 270 amino acids is the seven transmembrane alpha-helical domains included within five GPCRRHODOPSN4 motifs of a G-protein-coupled-receptor (GPCR) protein, conserved from nematodes to humans. GPCRs are integral membrane receptors whose intracellular actions are mediated by signalling pathways involving G proteins and downstream secondary messengers." 
Q#13525 - CGI_10012595 superfamily 248458 183 292 0.000738262 40.7601 cl17904 MFS superfamily NC - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#13525 - CGI_10012595 superfamily 241607 393 433 1.33E-10 57.6994 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chyomotrypsin, avian ovomucoids, and elastases. 
The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#13528 - CGI_10012598 superfamily 241609 141 214 5.78E-27 102.456 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." 
Q#13528 - CGI_10012598 superfamily 241609 221 291 1.83E-26 100.915 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#13528 - CGI_10012598 superfamily 243093 4 65 2.88E-08 50.6066 cl02568 WSC superfamily - - WSC domain; This domain may be involved in carbohydrate binding. Q#13530 - CGI_10012600 superfamily 241672 7 340 2.33E-98 296.448 cl00192 ribokinase_pfkB_like superfamily - - "ribokinase/pfkB superfamily: Kinases that accept a wide variety of substrates, including carbohydrates and aromatic small molecules, all are phosphorylated at a hydroxyl group. The superfamily includes ribokinase, fructokinase, ketohexokinase, 2-dehydro-3-deoxygluconokinase, 1-phosphofructokinase, the minor 6-phosphofructokinase (PfkB), inosine-guanosine kinase, and adenosine kinase. Even though there is a high degree of structural conservation within this superfamily, their multimerization level varies widely, monomeric (e.g. adenosine kinase), dimeric (e.g. ribokinase), and trimeric (e.g THZ kinase)." Q#13531 - CGI_10007464 superfamily 241574 354 582 5.59E-93 293.723 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. 
Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#13531 - CGI_10007464 superfamily 241574 639 824 3.38E-13 68.7665 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#13533 - CGI_10007466 superfamily 190881 374 413 0.00265632 35.5471 cl18167 Sulf_transp superfamily - - Sulphur transport; This is an integral membrane protein. It is predicted to have a function in the transport of sulphur-containing molecules. It contains several conserved glycines and an invariant cysteine that is probably an important functional residue. Q#13535 - CGI_10007468 superfamily 241574 398 573 1.02E-56 197.038 cl00053 PTPc superfamily C - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." 
Q#13535 - CGI_10007468 superfamily 241574 642 798 1.26E-12 67.6109 cl00053 PTPc superfamily C - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#13535 - CGI_10007468 superfamily 245847 49 183 3.57E-09 55.9715 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#13539 - CGI_10007473 superfamily 204376 86 143 1.11E-06 44.0561 cl10817 DUF2260 superfamily C - "Uncharacterized conserved protein (DUF2260); This domain, found in various hypothetical bacterial proteins, has no known function." Q#13540 - CGI_10007474 superfamily 241782 173 532 1.27E-105 324.156 cl00321 AAT_I superfamily - - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. 
Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phophorylase family (fold type V)." Q#13540 - CGI_10007474 superfamily 241782 1 69 1.00E-13 72.4357 cl00321 AAT_I superfamily C - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. 
Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phophorylase family (fold type V)." Q#13541 - CGI_10006391 superfamily 247683 25 80 7.21E-13 60.8441 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." 
Q#13542 - CGI_10006392 superfamily 247792 7 54 0.000184548 39.7364 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#13542 - CGI_10006392 superfamily 241563 152 183 0.000770489 37.844 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#13545 - CGI_10005402 superfamily 241563 68 109 0.000100903 40.5404 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#13546 - CGI_10005403 superfamily 215647 88 245 3.16E-34 127.34 cl18338 7tm_2 superfamily N - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GCPRs).They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. 
Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#13546 - CGI_10005403 superfamily 243029 13 80 1.69E-08 50.8121 cl02422 HRM superfamily - - Hormone receptor domain; This extracellular domain contains four conserved cysteines that probably form disulphide bridges. The domain is found in a variety of hormone receptors. It may be a ligand binding domain. Q#13547 - CGI_10005404 superfamily 215647 125 354 4.24E-53 178.186 cl18338 7tm_2 superfamily - - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GPCRs). They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." 
Q#13547 - CGI_10005404 superfamily 243029 36 102 5.52E-19 80.2512 cl02422 HRM superfamily - - Hormone receptor domain; This extracellular domain contains four conserved cysteines that probably form disulphide bridges. The domain is found in a variety of hormone receptors. It may be a ligand binding domain. Q#13550 - CGI_10028517 superfamily 243097 61 345 2.83E-97 298.141 cl02572 PIPKc superfamily - - "Phosphatidylinositol phosphate kinases (PIPK) catalyze the phosphorylation of phosphatidylinositol phosphate on the fourth or fifth hydroxyl of the inositol ring, to form phosphatidylinositol bisphosphate. CD alignment includes type II phosphatidylinositol phosphate kinases (PIPKII-beta), type I and II PIPK (-alpha, -beta, and -gamma) kinases and related yeast Fab1p and Mss4p kinases. Signaling by phosphorylated species of phosphatidylinositol regulates secretion, vesicular trafficking, membrane translocation, cell adhesion, chemotaxis, DNA synthesis, and cell cycling. The catalytic core domains of PIPKs are structurally similar to PI3K, PI4K, and cAMP-dependent protein kinases (PKA); the dimerization region is a unique feature of the PIPKs." Q#13551 - CGI_10028518 superfamily 243097 65 149 1.35E-18 81.5697 cl02572 PIPKc superfamily C - "Phosphatidylinositol phosphate kinases (PIPK) catalyze the phosphorylation of phosphatidylinositol phosphate on the fourth or fifth hydroxyl of the inositol ring, to form phosphatidylinositol bisphosphate. CD alignment includes type II phosphatidylinositol phosphate kinases (PIPKII-beta), type I and II PIPK (-alpha, -beta, and -gamma) kinases and related yeast Fab1p and Mss4p kinases. Signaling by phosphorylated species of phosphatidylinositol regulates secretion, vesicular trafficking, membrane translocation, cell adhesion, chemotaxis, DNA synthesis, and cell cycling. 
The catalytic core domains of PIPKs are structurally similar to PI3K, PI4K, and cAMP-dependent protein kinases (PKA); the dimerization region is a unique feature of the PIPKs." Q#13557 - CGI_10028525 superfamily 243161 7 80 2.58E-07 43.5778 cl02739 THAP superfamily - - "THAP domain; The THAP domain is a putative DNA-binding domain (DBD) and probably also binds a zinc ion. It features the conserved C2CH architecture (consensus sequence: Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His). Other universal features include the location of the domain at the N-termini of proteins, its size of about 90 residues, a C-terminal AVPTIF box and several other conserved residues. Orthologues of the human THAP domain have been identified in other vertebrates and probably worms and flies, but not in other eukaryotes or any prokaryotes." Q#13559 - CGI_10028528 superfamily 215647 97 342 1.38E-57 200.913 cl18338 7tm_2 superfamily - - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GPCRs). They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." 
Q#13559 - CGI_10028528 superfamily 217252 1476 1538 1.21E-19 87.2351 cl08372 Pyr_redox_dim superfamily - - "Pyridine nucleotide-disulphide oxidoreductase, dimerisation domain; This family includes both class I and class II oxidoreductases and also NADH oxidases and peroxidases." Q#13559 - CGI_10028528 superfamily 215691 1303 1381 1.78E-16 77.241 cl15766 Pyr_redox superfamily - - Pyridine nucleotide-disulphide oxidoreductase; This family includes both class I and class II oxidoreductases and also NADH oxidases and peroxidases. This domain is actually a small NADH binding domain within a larger FAD binding domain. Q#13559 - CGI_10028528 superfamily 248054 1133 1310 5.04E-07 50.7639 cl17500 NAD_binding_8 superfamily - - NAD(P)-binding Rossmann-like domain; NAD(P)-binding Rossmann-like domain. Q#13559 - CGI_10028528 superfamily 243146 458 507 1.90E-05 44.2023 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#13559 - CGI_10028528 superfamily 243146 561 614 0.000197988 41.1207 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#13559 - CGI_10028528 superfamily 243146 606 646 0.000272777 40.7226 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#13559 - CGI_10028528 superfamily 243146 664 710 0.00219458 38.0391 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#13560 - CGI_10028529 superfamily 245201 311 508 8.34E-41 152.305 cl09925 PKc_like superfamily C - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. 
These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#13560 - CGI_10028529 superfamily 246680 13 98 2.26E-14 70.9045 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#13561 - CGI_10028530 superfamily 242903 217 352 2.85E-55 186.847 cl02148 APC10-like superfamily - - "APC10-like DOC1 domains in E3 ubiquitin ligases that mediate substrate ubiquitination; This family contains the single domain protein, APC10, a subunit of the anaphase-promoting complex (APC), as well as the DOC1 domain of multi-domain proteins present in E3 ubiquitin ligases. E3 ubiquitin ligases mediate substrate ubiquitination (or ubiquitylation), a component of the ubiquitin-26S proteasome pathway for selective proteolytic degradation. The APC, a multi-protein complex (or cyclosome), is a cell cycle-regulated, E3 ubiquitin ligase that controls important transitions in mitosis and the G1 phase by ubiquitinating regulatory proteins, thereby targeting them for degradation. 
APC10-like DOC1 domains such as those present in HECT (Homologous to the E6-AP Carboxyl Terminus) and Cullin-RING (Really Interesting New Gene) E3 ubiquitin ligase proteins, HECTD3, and CUL7, respectively, are also included in this hierarchy. CUL7 is a member of the Cullin-RING ligase family and functions as a molecular scaffold assembling a SCF-ROC1-like E3 ubiquitin ligase complex consisting of Skp1, CUL7, Fbx29 F-box protein, and ROC1 (RING-box protein 1) and promotes ubiquitination. CUL7 is a multi-domain protein with a C-terminal cullin domain that binds ROC1 and a centrally positioned APC10/DOC1 domain. HECTD3 contains a C-terminal HECT domain which contains the active site for ubiquitin transfer onto substrates, and an N-terminal APC10 domain which is responsible for substrate recognition and binding. An APC10/DOC1 domain homolog is also present in HERC2 (HECT domain and RLD2), a large multi-domain protein with three RCC1-like domains (RLDs), additional internal domains including zinc finger ZZ-type and Cyt-b5 (Cytochrome b5-like Heme/Steroid binding) domains, and a C-terminal HECT domain. Recent studies have shown that the protein complex HERC2-RNF8 coordinates ubiquitin-dependent assembly of DNA repair factors on damaged chromosomes. Also included in this hierarchy is an uncharacterized APC10/DOC1-like domain found in a multi-domain protein, which also contains CUB, zinc finger ZZ-type, and EF-hand domains. The APC10/DOC1 domain forms a beta-sandwich structure that is related in architecture to the galactose-binding domain-like fold; their sequences are quite dissimilar, however, and are not included here." Q#13561 - CGI_10028530 superfamily 241594 531 831 9.24E-57 197.143 cl00077 HECTc superfamily - - "HECT domain; C-terminal catalytic domain of a subclass of Ubiquitin-protein ligase (E3). 
It binds specific ubiquitin-conjugating enzymes (E2), accepts ubiquitin from E2, transfers ubiquitin to substrate lysine side chains, and transfers additional ubiquitin molecules to the end of growing ubiquitin chains." Q#13563 - CGI_10028532 superfamily 204522 482 536 2.28E-15 70.5861 cl11212 PNPOx_C superfamily - - "Pyridoxine 5'-phosphate oxidase C-terminal dimerisation region; Pyridoxine 5'-phosphate oxidase (PNPOx) catalyzes the terminal step in the biosynthesis of pyridoxal 5'-phosphate (PLP), a cofactor used by many enzymes involved in amino acid metabolism. The enzyme oxidises either the 4'-hydroxyl group of pyridoxine 5'-phosphate (PNP) or the 4'-primary amine of pyridoxamine 5'-phosphate (PMP) to an aldehyde. PNPOx is a homodimeric enzyme with one flavin mononucleotide (FMN) molecule non-covalently bound to each subunit. This domain represents one of the two dimerisation regions of the protein, located at the edge of the dimer interface, at the C-terminus, being the last three beta strands, S6, S7, and S8 along with the last three residues to the end. In Myxococcus xanthus pdxH, S6 runs from residues 178-192, S7 from 200-206 and S8 from 211-215. The extended loop of residues 167-177 may well be involved in the pocket formed between the two dimers that positions the FMN molecule." Q#13563 - CGI_10028532 superfamily 241815 72 95 7.09E-10 58.1687 cl00361 Transcrip_reg superfamily C - "Transcriptional regulator; This is a family of transcriptional regulators. In mammals, it activates the transcription of mitochondrially-encoded COX1. In bacteria, it negatively regulates the quorum-sensing response regulator by binding to its promoter region." 
Q#13563 - CGI_10028532 superfamily 241827 141 195 1.03E-09 55.7082 cl00381 PNPOx_like superfamily C - "Pyridoxine 5'-phosphate (PNP) oxidase-like proteins; The PNPOx-like superfamily is composed of pyridoxine 5'-phosphate (PNP) oxidases and other flavin mononucleotide (FMN) binding proteins, which catalyze FMN-mediated redox reactions." Q#13563 - CGI_10028532 superfamily 241827 397 427 2.03E-06 45.693 cl00381 PNPOx_like superfamily N - "Pyridoxine 5'-phosphate (PNP) oxidase-like proteins; The PNPOx-like superfamily is composed of pyridoxine 5'-phosphate (PNP) oxidases and other flavin mononucleotide (FMN) binding proteins, which catalyze FMN-mediated redox reactions." Q#13564 - CGI_10028533 superfamily 241599 144 203 5.35E-13 64.1868 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#13565 - CGI_10028534 superfamily 223224 759 1301 0 648.535 cl18700 HyuB superfamily - - "N-methylhydantoinase B/acetone carboxylase, alpha subunit [Amino acid transport and metabolism / Secondary metabolites biosynthesis, transport, and catabolism]" Q#13565 - CGI_10028534 superfamily 216816 257 558 6.85E-97 312.686 cl18380 Hydantoinase_A superfamily - - Hydantoinase/oxoprolinase; This family includes the enzymes hydantoinase and oxoprolinase EC:3.5.2.9. Both reactions involve the hydrolysis of 5-membered rings via hydrolysis of their internal imide bonds. Q#13565 - CGI_10028534 superfamily 218571 33 238 2.19E-57 197.477 cl05110 Hydant_A_N superfamily - - Hydantoinase/oxoprolinase N-terminal region; This family is found at the N-terminus of the pfam01968 family. 
Q#13566 - CGI_10028535 superfamily 247792 21 46 0.00993585 33.9584 cl17238 RING superfamily N - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-(N/C/H)-X2-C-X(4-48)-C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#13566 - CGI_10028535 superfamily 241645 321 395 0.00150335 36.8524 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#13567 - CGI_10028536 superfamily 247724 44 130 7.44E-22 86.741 cl17170 Ras_like_GTPase superfamily N - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. 
The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#13569 - CGI_10028538 superfamily 216434 19 212 1.09E-59 205.389 cl08318 PPDK_N superfamily N - "Pyruvate phosphate dikinase, PEP/pyruvate binding domain; This enzyme catalyzes the reversible conversion of ATP to AMP, pyrophosphate and phosphoenolpyruvate (PEP)." Q#13569 - CGI_10028538 superfamily 248254 605 681 1.38E-11 62.0144 cl17700 PEP-utilizers superfamily C - "PEP-utilising enzyme, mobile domain; This domain is a "swivelling" beta/beta/alpha domain which is thought to be mobile in all proteins known to contain it." Q#13570 - CGI_10028539 superfamily 241680 54 246 0.00856987 36.8926 cl00200 MIP superfamily - - "Major intrinsic protein (MIP) superfamily. Members of the MIP superfamily function as membrane channels that selectively transport water, small neutral molecules, and ions out of and between cells. The channel proteins share a common fold: the N-terminal cytosolic portion followed by six transmembrane helices, which might have arisen through gene duplication. 
On the basis of sequence similarity and functional characteristics, the superfamily can be subdivided into two major groups: water-selective channels called aquaporins (AQPs) and glycerol uptake facilitators (GlpFs). AQPs are found in all three kingdoms of life, while GlpFs have been characterized only within microorganisms." Q#13572 - CGI_10028541 superfamily 248192 35 362 2.16E-74 239.864 cl17638 PLN02808 superfamily - - alpha-galactosidase Q#13573 - CGI_10028542 superfamily 243362 1239 1434 5.10E-33 127.156 cl03262 DnaJ_C superfamily - - C-terminal substrate binding domain of DnaJ and HSP40; The C-terminal region of the DnaJ/Hsp40 protein mediates oligomerization and binding to denatured polypeptide substrate. DnaJ/Hsp40 is a widely conserved heat-shock protein. It prevents the aggregation of unfolded substrate and forms a ternary complex with both substrate and DnaK/Hsp70; the N-terminal J-domain of DnaJ/Hsp40 stimulates the ATPase activity of DnaK/Hsp70. Q#13573 - CGI_10028542 superfamily 243077 1134 1188 1.90E-23 96.4605 cl02542 DnaJ superfamily - - "DnaJ domain or J-domain. DnaJ/Hsp40 (heat shock protein 40) proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperonine family. Hsp40 proteins are characterized by the presence of a J domain, which mediates the interaction with Hsp70. They may contain other domains as well, and the architectures provide a means of classification." 
Q#13573 - CGI_10028542 superfamily 241663 878 1128 1.56E-63 220 cl00181 Endostatin-like superfamily - - "Endostatin-like domain; the angiogenesis inhibitor endostatin is a C-terminal fragment of collagen XV/XVIII, a proteoglycan/collagen found in vessel walls and basement membranes; this domain has a compact globular fold similar to that of C-type lectins; endostatin XVIII is monomeric and contains a heparin-binding epitope and zinc binding sites while endostatin XV is trimeric and contains neither of these sites; the generation of endostatin or endostatin-like collagen XV/XVIII fragments is catalyzed by proteolytic enzymes within the protease-sensitive hinge region of the C-terminal domain; endostatin inhibits endothelial cell migration in vitro and appears to be highly effective in murine in vivo studies" Q#13574 - CGI_10028543 superfamily 247792 14 64 1.08E-06 46.67 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-(N/C/H)-X2-C-X(4-48)-C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#13574 - CGI_10028543 superfamily 241563 99 130 0.000341958 38.9996 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#13575 - CGI_10028544 superfamily 247792 14 62 1.02E-09 55.1444 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-(N/C/H)-X2-C-X(4-48)-C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#13575 - CGI_10028544 superfamily 241563 102 135 0.00148073 37.0736 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#13576 - CGI_10028545 superfamily 207662 107 190 1.07E-54 179.724 cl02596 NR_DBD_like superfamily - - "DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers; DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. It interacts with a specific DNA site upstream of the target gene and modulates the rate of transcriptional initiation. Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions, from development, reproduction, to homeostasis and metabolism in animals (metazoans). The family contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. 
NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). Most nuclear receptors bind as homodimers or heterodimers to their target sites, which consist of two hexameric half-sites. Specificity is determined by the half-site sequence, the relative orientation of the half-sites and the number of spacer nucleotides between the half-sites. However, a growing number of nuclear receptors have been reported to bind to DNA as monomers." Q#13576 - CGI_10028545 superfamily 245599 211 454 3.73E-99 300.961 cl11397 NR_LBD superfamily - - "The ligand binding domain of nuclear receptors, a family of ligand-activated transcription regulators; Ligand-binding domain (LBD) of nuclear receptor (NR): Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions in metazoans, from development, reproduction, to homeostasis and metabolism. The superfamily contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. The members of the family include receptors of steroids, thyroid hormone, retinoids, cholesterol by-products, lipids and heme. With few exceptions, NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD)." Q#13578 - CGI_10028547 superfamily 207662 130 199 1.07E-43 150.644 cl02596 NR_DBD_like superfamily - - "DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers; DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. 
It interacts with a specific DNA site upstream of the target gene and modulates the rate of transcriptional initiation. Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions, from development, reproduction, to homeostasis and metabolism in animals (metazoans). The family contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). Most nuclear receptors bind as homodimers or heterodimers to their target sites, which consist of two hexameric half-sites. Specificity is determined by the half-site sequence, the relative orientation of the half-sites and the number of spacer nucleotides between the half-sites. However, a growing number of nuclear receptors have been reported to bind to DNA as monomers." Q#13578 - CGI_10028547 superfamily 245599 291 513 7.78E-90 278.236 cl11397 NR_LBD superfamily - - "The ligand binding domain of nuclear receptors, a family of ligand-activated transcription regulators; Ligand-binding domain (LBD) of nuclear receptor (NR): Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions in metazoans, from development, reproduction, to homeostasis and metabolism. The superfamily contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. The members of the family include receptors of steroids, thyroid hormone, retinoids, cholesterol by-products, lipids and heme. 
With few exceptions, NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD)." Q#13579 - CGI_10028548 superfamily 248313 104 188 0.000553463 37.981 cl17759 EamA superfamily N - EamA-like transporter family; This family includes many hypothetical membrane proteins of unknown function. Many of the proteins contain two copies of the aligned region. The family used to be known as DUF6. Q#13581 - CGI_10028550 superfamily 241752 133 484 4.72E-158 455.576 cl00283 ADP_ribosyl superfamily - - "ADP_ribosylating enzymes catalyze the transfer of ADP_ribose from NAD+ to substrates. Bacterial toxins are cytoplasmic and catalyze the transfer of a single ADP_ribose unit to eukaryotic elongation factor 2, halting protein synthesis and killing the cell. Poly(ADP-ribose) polymerases (PARPs 1-3, VPARP, tankyrase) catalyze the addition of up to 100 ADP_ribose units from NAD+. PARPs 1 and 2 are localized in the nucleus, bind DNA, and are activated by DNA damage. VPARP is part of the vault ribonucleoprotein complex. Tankyrases regulate telomere length in part through poly(ADP_ribosylation) of telomere repeat binding factor 1 (TRF1). Poly(ADP-ribose) polymerase catalyses the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active." 
Q#13581 - CGI_10028550 superfamily 242589 7 108 1.55E-43 150.17 cl01581 WGR superfamily - - "WGR domain; The WGR domain is found in a variety of eukaryotic poly(ADP-ribose) polymerases (PARPs) as well as the putative Escherichia coli molybdate metabolism regulator and related bacterial proteins, a small family of bacterial DNA ligases, and various other bacterial proteins of unknown function. It has been called WGR after the most conserved central motif of the domain. The domain occurs in single-domain proteins and in a variety of domain architectures, and is between 70 and 80 residues in length. It has been proposed to function as a nucleic acid binding domain." Q#13587 - CGI_10028556 superfamily 248458 27 202 1.78E-29 118.57 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 
Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#13587 - CGI_10028556 superfamily 248458 447 540 0.000295197 41.9157 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. 
Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#13587 - CGI_10028556 superfamily 248458 269 338 0.00530101 38.0637 cl17904 MFS superfamily NC - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." 
Q#13589 - CGI_10028558 superfamily 243077 26 79 5.41E-23 88.3713 cl02542 DnaJ superfamily - - "DnaJ domain or J-domain. DnaJ/Hsp40 (heat shock protein 40) proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperonine family. Hsp40 proteins are characterized by the presence of a J domain, which mediates the interaction with Hsp70. They may contain other domains as well, and the architectures provide a means of classification." Q#13595 - CGI_10028564 superfamily 247057 36 74 0.00505098 31.5381 cl15755 SAM_superfamily superfamily N - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#13597 - CGI_10028566 superfamily 247856 65 120 8.28E-10 51.0093 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#13598 - CGI_10028568 superfamily 152928 30 111 2.25E-32 112.875 cl13875 DUF3695 superfamily - - Protein of unknown function (DUF3695); This family of proteins is found in eukaryotes. Proteins in this family are typically between 157 and 192 amino acids in length. There is a single completely conserved residue D that may be functionally important. Q#13602 - CGI_10028572 superfamily 246925 11 115 0.000477096 38.1054 cl15309 LRR_RI superfamily NC - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#13602 - CGI_10028572 superfamily 214507 134 158 0.00642333 32.402 cl15307 LRRCT superfamily C - Leucine rich repeat C-terminal domain; Leucine rich repeat C-terminal domain. Q#13603 - CGI_10028573 superfamily 214545 465 600 2.02E-20 88.916 cl10551 CULLIN superfamily - - Cullin; Cullin. Q#13603 - CGI_10028573 superfamily 198939 710 770 2.45E-20 86.1528 cl08488 APC2 superfamily - - "Anaphase promoting complex (APC) subunit 2; The anaphase promoting complex or cyclosome (APC2) is an E3 ubiquitin ligase which is part of the SCF family of ubiquitin ligases. Ubiquitin ligases catalyze the transfer of ubiquitin from the ubiquitin conjugating enzyme (E2), to the substrate protein." Q#13604 - CGI_10028574 superfamily 247743 446 532 0.000431576 41.3627 cl17189 AAA superfamily C - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. 
The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#13604 - CGI_10028574 superfamily 117164 1636 1695 3.09E-16 76.5173 cl07270 DUF1771 superfamily - - Domain of unknown function (DUF1771); This domain is always found adjacent to pfam01713. Q#13604 - CGI_10028574 superfamily 243111 1706 1778 1.87E-15 74.6419 cl02619 Smr superfamily - - "Smr domain; This family includes the Smr (Small MutS Related) proteins, and the C-terminal region of the MutS2 protein. It has been suggested that this domain interacts with the MutS1 protein in the case of Smr proteins and with the N-terminal MutS related region of MutS2. This domain exhibits nicking endonuclease activity that might have a role in mismatch repair or genetic recombination. It shows no significant double strand cleavage or exonuclease activity. The full-length human NEDD4-binding protein 2 also has the polynucleotide kinase activity." Q#13604 - CGI_10028574 superfamily 241998 1072 1179 0.00290897 40.7364 cl00640 DHQS superfamily C - "3-dehydroquinate synthase (EC 4.6.1.3); 3-Dehydroquinate synthase is an enzyme in the common pathway of aromatic amino acid biosynthesis that catalyzes the conversion of 3-deoxy-D-arabino-heptulosonic acid 7-phosphate (DAHP) into 3-dehydroquinic acid. This synthesis of aromatic amino acids is an essential metabolic function for most prokaryotic as well as lower eukaryotic cells, including plants. The pathway is absent in humans; therefore, DHQS represents a potential target for the development of novel and selective antimicrobial agents. 
Owing to the threat posed by the spread of pathogenic bacteria resistant to many currently used antimicrobial drugs, there is clearly a need to develop new anti-infective drugs acting at novel targets. A further potential use for DHQS inhibitors is as herbicides." Q#13604 - CGI_10028574 superfamily 243130 1186 1224 0.00307317 37.8299 cl02655 CUE superfamily - - "CUE domain; CUE domains have been shown to bind ubiquitin. It has been suggested that CUE domains are related to pfam00627 and this has been confirmed by the structure of the domain. CUE domains also occur in two protein of the IL-1 signal transduction pathway, tollip and TAB2." Q#13605 - CGI_10028575 superfamily 247805 347 486 1.54E-25 104.342 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#13605 - CGI_10028575 superfamily 247905 511 668 4.12E-10 58.7885 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#13605 - CGI_10028575 superfamily 243778 723 813 3.71E-37 135.815 cl04503 HA2 superfamily - - "Helicase associated domain (HA2); This presumed domain is about 90 amino acid residues in length. It is found is a diverse set of RNA helicases. Its function is unknown, however it seems likely to be involved in nucleic acid binding." 
Q#13605 - CGI_10028575 superfamily 219532 847 947 7.96E-32 121.267 cl06657 OB_NTP_bind superfamily - - "Oligonucleotide/oligosaccharide-binding (OB)-fold; This family is found towards the C-terminus of the DEAD-box helicases (pfam00270). In these helicases it is apparently always found in association with pfam04408. There do seem to be a couple of instances where it occurs by itself - . The structure PDB:3i4u adopts an OB-fold. helicases (pfam00270). In these helicases it is apparently always found in association with pfam04408. This C-terminal domain of the yeast helicase contains an oligonucleotide/oligosaccharide-binding (OB)-fold which seems to be placed at the entrance of the putative nucleic acid cavity. It also constitutes the binding site for the G-patch-containing domain of Pfa1p. When found on DEAH/RHA helicases, this domain is central to the regulation of the helicase activity through its binding of both RNA and G-patch domain proteins." Q#13606 - CGI_10028576 superfamily 245201 65 180 0.000121942 42.609 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#13607 - CGI_10028577 superfamily 241559 1287 1374 5.33E-07 50.7723 cl00030 CH superfamily N - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). 
The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." Q#13607 - CGI_10028577 superfamily 241559 1436 1485 1.76E-13 69.622 cl00030 CH superfamily C - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." Q#13608 - CGI_10028578 superfamily 245847 680 833 2.44E-34 129.394 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#13608 - CGI_10028578 superfamily 243119 859 910 2.92E-07 48.9692 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#13609 - CGI_10028579 superfamily 241600 8 76 4.28E-16 68.843 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. 
The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#13613 - CGI_10028583 superfamily 243035 317 443 9.47E-24 95.3793 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. 
Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#13613 - CGI_10028583 superfamily 243035 191 288 2.29E-19 83.4381 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. 
Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. 
Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#13613 - CGI_10028583 superfamily 243035 19 155 7.02E-24 96.2197 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. 
Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#13616 - CGI_10003868 superfamily 248289 14 50 0.000286368 34.414 cl17735 VWC superfamily C - von Willebrand factor type C domain; The high cutoff was used to prevent overlap with pfam00094. Q#13617 - CGI_10003869 superfamily 199005 13 49 2.79E-07 47.2267 cl10889 Cir_N superfamily - - "N-terminal domain of CBF1 interacting co-repressor CIR; This is a 45 residue conserved region at the N-terminal end of a family of proteins referred to as CIRs (CBF1-interacting co-repressors). CBF1 (centromere-binding factor 1) acts as a transcription factor that causes repression by binding specifically to GTGGGAA motifs in responsive promoters, and it requires CIR as a co-repressor. CIR binds to histone deacetylase and to SAP30 and serves as a linker between CBF1 and the histone deacetylase complex." Q#13618 - CGI_10003870 superfamily 222436 59 253 8.78E-40 140.868 cl16454 DUF4203 superfamily - - Domain of unknown function (DUF4203); This is the N-terminal region of 7tm proteins. The function is not known. 
Q#13623 - CGI_10017851 superfamily 248097 11 58 3.05E-06 39.941 cl17543 C1q superfamily N - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#13625 - CGI_10017853 superfamily 245847 31 176 0.00435542 35.556 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#13626 - CGI_10017854 superfamily 241763 165 390 4.39E-106 315.479 cl00298 Peptidase_C1 superfamily - - "C1 Peptidase family (MEROPS database nomenclature), also referred to as the papain family; composed of two subfamilies of cysteine peptidases (CPs), C1A (papain) and C1B (bleomycin hydrolase). Papain-like enzymes are mostly endopeptidases with some exceptions like cathepsins B, C, H and X, which are exopeptidases. Papain-like CPs have different functions in various organisms. Plant CPs are used to mobilize storage proteins in seeds while mammalian CPs are primarily lysosomal enzymes responsible for protein degradation in the lysosome. Papain-like CPs are synthesized as inactive proenzymes with N-terminal propeptide regions, which are removed upon activation. Bleomycin hydrolase (BH) is a CP that detoxifies bleomycin by hydrolysis of an amide group. It acts as a carboxypeptidase on its C-terminus to convert itself into an aminopeptidase and peptide ligase. BH is found in all tissues in mammals as well as in many other eukaryotes. It forms a hexameric ring barrel structure with the active sites imbedded in the central channel. Some members of the C1 family are proteins classified as non-peptidase homologs which lack peptidase activity or have missing active site residues." 
Q#13626 - CGI_10017854 superfamily 117343 1 77 1.54E-37 132.958 cl07399 CathepsinC_exc superfamily N - "Cathepsin C exclusion domain; Cathepsin C (dipeptidyl peptidase I) is the physiological activator of a group of serine proteases. This domain corresponds to the exclusion domain whose structure excludes the approach of a polypeptide apart from its termini. It forms an enclosed beta barrel structure composed from 8 anti-parallel beta strands. Based on a structural comparison and interaction data, it is suggested that the exclusion domain originates from a metallo-protease inhibitor." Q#13626 - CGI_10017854 superfamily 203856 106 142 0.00486481 34.4889 cl06937 Propeptide_C1 superfamily - - Peptidase family C1 propeptide; This motif is found at the N terminal of some members of the Peptidase_C1 family (pfam00112) and is involved in activation of this peptidase. Q#13627 - CGI_10017855 superfamily 243038 128 214 2.15E-13 65.0545 cl02442 DEP superfamily - - "DEP domain, named after Dishevelled, Egl-10, and Pleckstrin, where this domain was first discovered. The function of this domain is still not clear, but it is believed to be important for the membrane association of the signaling proteins in which it is present. New studies show that the DEP domain of Sst2, a yeast RGS protein is necessary and sufficient for receptor interaction." Q#13627 - CGI_10017855 superfamily 247725 7 111 2.35E-15 71.245 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." 
Q#13629 - CGI_10017857 superfamily 207701 43 150 5.64E-32 117.781 cl02699 VIT superfamily - - Vault protein inter-alpha-trypsin domain; Inter-alpha-trypsin inhibitors (ITIs) consist of one light chain and a variable set of heavy chains. ITIs play a role in extracellular matrix (ECM) stabilisation and tumour metastasis as well as in plasma protease inhibition. The vault protein inter-alpha-trypsin (VIT) domain described here is found to the N-terminus of a von Willebrand factor type A domain (pfam00092) in ITI heavy chains (ITIHs) and their precursors. Q#13629 - CGI_10017857 superfamily 241578 233 312 6.54E-10 56.8397 cl00057 vWFA superfamily C - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#13629 - CGI_10017857 superfamily 244875 342 410 0.00820938 36.6108 cl08255 Na_K-ATPase superfamily N - Sodium / potassium ATPase beta chain; Sodium / potassium ATPase beta chain. 
Q#13630 - CGI_10017858 superfamily 244837 86 341 5.83E-66 222.179 cl07971 Glyco_hydro_3 superfamily - - Glycosyl hydrolase family 3 N terminal domain; Glycosyl hydrolase family 3 N terminal domain. Q#13630 - CGI_10017858 superfamily 222669 652 719 2.30E-08 52.0108 cl17048 Fn3-like superfamily - - Fibronectin type III-like domain; This domain has a fibronectin type III-like structure. It is often found in association with pfam00933 and pfam01915. Its function is unknown. Q#13631 - CGI_10017859 superfamily 243353 656 699 1.78E-10 57.4392 cl03225 GRIP superfamily - - "GRIP domain; The GRIP (golgin-97, RanBP2alpha, Imh1p and p230/golgin-245) domain is found in many large coiled-coil proteins. It has been shown to be sufficient for targeting to the Golgi. The GRIP domain contains a completely conserved tyrosine residue. At least some of these domains have been shown to bind to GTPase Arl1, see structures in." Q#13632 - CGI_10017860 superfamily 203864 721 750 1.35E-05 43.1839 cl06967 NUC153 superfamily - - NUC153 domain; This small domain is found in a novel nucleolar family. Q#13633 - CGI_10017861 superfamily 241596 19 78 7.27E-10 56.0683 cl00081 HLH superfamily - - "Helix-loop-helix domain, found in specific DNA-binding proteins that act as transcription factors; 60-100 amino acids long. 
A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE 5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved in both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#13633 - CGI_10017861 superfamily 243045 99 166 8.96E-10 56.4875 cl02459 PAS superfamily C - "PAS domain; PAS motifs appear in archaea, eubacteria and eukarya. Probably the most surprising identification of a PAS domain was that in EAG-like K+-channels. PAS domains have been found to bind ligands, and to act as sensors for light and oxygen in signal transduction." Q#13633 - CGI_10017861 superfamily 243045 240 301 4.34E-07 48.3983 cl02459 PAS superfamily C - "PAS domain; PAS motifs appear in archaea, eubacteria and eukarya. Probably the most surprising identification of a PAS domain was that in EAG-like K+-channels. PAS domains have been found to bind ligands, and to act as sensors for light and oxygen in signal transduction." Q#13633 - CGI_10017861 superfamily 204647 463 492 0.00921523 34.5851 cl12942 HIF-1 superfamily - - "Hypoxia-inducible factor-1; HIF-1 is a transcriptional complex and controls cellular systemic homeostatic responses to oxygen availability. 
In the presence of oxygen HIF-1 alpha is targeted for proteasomal degradation by pVHL, a ubiquitination complex." Q#13634 - CGI_10017862 superfamily 247675 41 323 4.63E-123 358.731 cl17011 Arginase_HDAC superfamily - - "Arginase-like and histone-like hydrolases; Arginase-like/histone-like hydrolase superfamily includes metal-dependent enzymes that belong to Arginase-like amidino hydrolase family and histone/histone-like deacetylase class I, II, IV family, respectively. These enzymes catalyze hydrolysis of amide bond. Arginases are known to be involved in control of cellular levels of arginine and ornithine, in histidine and arginine degradation and in clavulanic acid biosynthesis. Deacetylases play a role in signal transduction through histone and/or other protein modification and can repress/activate transcription of a number of different genes. They participate in different cellular processes including cell cycle regulation, DNA damage response, embryonic development, cytokine signaling important for immune response and post-translational control of the acetyl coenzyme A synthetase. Mammalian histone deacetylases are known to be involved in progression of different tumors. Specific inhibitors of mammalian histone deacetylases are an emerging class of promising novel anticancer drugs." Q#13635 - CGI_10017863 superfamily 217408 566 1084 1.87E-64 228.964 cl15645 Nucleoporin_C superfamily - - "Non-repetitive/WGA-negative nucleoporin C-terminal; This is the C-terminal half of a family of nucleoporin proteins. Nucleoporins are the main components of the nuclear pore complex in eukaryotic cells, and mediate bidirectional nucleocytoplasmic transport, especially of mRNA and proteins. Two nucleoporin classes are known: one is characterized by the FG repeat pfam03093; the other is represented by this family, and lacks any repeats. 
RNA undergoing nuclear export first encounters the basket of the nuclear pore and many nucleoporins are accessible on the basket side of the pore." Q#13640 - CGI_10017868 superfamily 241870 14 168 1.32E-53 182.677 cl00451 MoCF_BD superfamily - - "MoCF_BD: molybdenum cofactor (MoCF) binding domain (BD). This domain is found in a variety of proteins involved in biosynthesis of molybdopterin cofactor, like MoaB, MogA, and MoeA. The domain is presumed to bind molybdopterin." Q#13641 - CGI_10017869 superfamily 241870 167 518 1.45E-125 375.293 cl00451 MoCF_BD superfamily - - "MoCF_BD: molybdenum cofactor (MoCF) binding domain (BD). This domain is found in a variety of proteins involved in biosynthesis of molybdopterin cofactor, like MoaB, MogA, and MoeA. The domain is presumed to bind molybdopterin." Q#13641 - CGI_10017869 superfamily 217567 42 131 5.63E-20 86.8476 cl04083 MoeA_N superfamily C - MoeA N-terminal region (domain I and II); This family contains two structural domains. One of these contains the conserved DGXA motif. This region is found in proteins involved in biosynthesis of molybdopterin cofactor however the exact molecular function of this region is uncertain. Q#13643 - CGI_10017871 superfamily 243263 64 524 3.63E-86 278.522 cl02990 ASC superfamily - - Amiloride-sensitive sodium channel; Amiloride-sensitive sodium channel. Q#13645 - CGI_10017873 superfamily 241584 3355 3431 0.00525466 38.6315 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." 
Q#13645 - CGI_10017873 superfamily 241619 1638 1673 0.0022071 39.4877 cl00112 PAN_APPLE superfamily NC - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#13646 - CGI_10017874 superfamily 201590 512 664 2.30E-10 60.3042 cl03090 HH_signal superfamily N - "Hedgehog amino-terminal signalling domain; For the carboxyl Hint module, see pfam01079. Hedgehog is a family of secreted signal molecules required for embryonic cell differentiation." Q#13646 - CGI_10017874 superfamily 201590 812 893 0.00336945 38.3478 cl03090 HH_signal superfamily N - "Hedgehog amino-terminal signalling domain; For the carboxyl Hint module, see pfam01079. Hedgehog is a family of secreted signal molecules required for embryonic cell differentiation." Q#13647 - CGI_10017875 superfamily 245213 42 78 4.19E-12 61.1134 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#13647 - CGI_10017875 superfamily 245213 10 40 0.000281304 38.7718 cl09941 EGF_CA superfamily N - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13647 - CGI_10017875 superfamily 220376 128 252 1.33E-10 58.5739 cl10729 DUF2040 superfamily - - "Coiled-coil domain-containing protein 55 (DUF2040); This entry is a conserved domain of approximately 130 residues of proteins conserved from fungi to humans. The proteins do contain a coiled-coil domain, but the function is unknown." Q#13648 - CGI_10000789 superfamily 248291 109 172 2.41E-13 62.3109 cl17737 Skp1_POZ superfamily N - "Skp1 family, tetramerisation domain; Skp1 family, tetramerisation domain. " Q#13650 - CGI_10005796 superfamily 241754 474 816 1.94E-67 233.281 cl00286 Motor_domain superfamily - - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. Q#13653 - CGI_10005799 superfamily 241563 84 122 6.43E-06 44.0072 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#13656 - CGI_10000747 superfamily 241760 9 57 2.58E-21 88.1727 cl00295 ZZ superfamily - - "Zinc finger, ZZ type. Zinc finger present in dystrophin, CBP/p300 and many other proteins. 
The ZZ motif coordinates one or two zinc ions and most likely participates in ligand binding or molecular scaffolding. Many proteins containing ZZ motifs have other zinc-binding motifs as well, and the majority serve as scaffolds in pathways involving acetyltransferase, protein kinase, or ubiquitin-related activity. ZZ proteins can be grouped into the following functional classes: chromatin modifying, cytoskeletal scaffolding, ubiquitin binding or conjugating, and membrane receptor or ion-channel modifying proteins." Q#13656 - CGI_10000747 superfamily 241760 65 113 2.58E-21 88.1727 cl00295 ZZ superfamily - - "Zinc finger, ZZ type. Zinc finger present in dystrophin, CBP/p300 and many other proteins. The ZZ motif coordinates one or two zinc ions and most likely participates in ligand binding or molecular scaffolding. Many proteins containing ZZ motifs have other zinc-binding motifs as well, and the majority serve as scaffolds in pathways involving acetyltransferase, protein kinase, or ubiquitin-related activity. ZZ proteins can be grouped into the following functional classes: chromatin modifying, cytoskeletal scaffolding, ubiquitin binding or conjugating, and membrane receptor or ion-channel modifying proteins." Q#13657 - CGI_10003823 superfamily 243035 287 404 2.34E-11 60.3261 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. 
For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#13657 - CGI_10003823 superfamily 243035 126 197 2.08E-09 54.5481 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#13657 - CGI_10003823 superfamily 244363 51 112 0.00151894 37.794 cl06336 Commd superfamily NC - "COMM_Domain, a family of domains found at the C-terminus of HCarG, the copper metabolism gene MURR1 product, and related proteins. Presumably all COMM_Domain containing proteins are located in the nucleus and the COMM domain plays a role in protein-protein interactions. Several family members have been shown to bind and inhibit NF-kappaB. Murr1/Commd1 is a protein involved in copper homeostasis, which has also been identified as a regulator of the human delta epithelial sodium channel. HCaRG, a nuclear protein that might be involved in cell proliferation, is negatively regulated by extracellular calcium concentration, and its basal mRNA levels are higher in hypertensive animals." Q#13659 - CGI_10003826 superfamily 243161 44 81 0.00242271 33.1774 cl02739 THAP superfamily NC - "THAP domain; The THAP domain is a putative DNA-binding domain (DBD) and probably also binds a zinc ion. It features the conserved C2CH architecture (consensus sequence: Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His). 
Other universal features include the location of the domain at the N-termini of proteins, its size of about 90 residues, a C-terminal AVPTIF box and several other conserved residues. Orthologues of the human THAP domain have been identified in other vertebrates and probably worms and flies, but not in other eukaryotes or any prokaryotes." Q#13661 - CGI_10004998 superfamily 115363 278 321 5.44E-05 40.433 cl05972 MIB_HERC2 superfamily C - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). Q#13661 - CGI_10004998 superfamily 241578 37 152 0.00534483 36.2784 cl00057 vWFA superfamily C - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#13662 - CGI_10004999 superfamily 222258 28 75 7.97E-05 42.17 cl18656 AAA_30 superfamily NC - AAA domain; This family of domains contain a P-loop motif that is characteristic of the AAA superfamily. Many of the proteins in this family are conjugative transfer proteins. There is a Walker A and Walker B. 
Q#13663 - CGI_10005000 superfamily 115363 273 292 4.49E-05 40.433 cl05972 MIB_HERC2 superfamily C - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). Q#13665 - CGI_10005002 superfamily 115363 153 215 2.91E-10 55.841 cl05972 MIB_HERC2 superfamily - - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). Q#13665 - CGI_10005002 superfamily 115363 228 271 6.47E-08 49.2926 cl05972 MIB_HERC2 superfamily C - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). Q#13665 - CGI_10005002 superfamily 207713 330 400 1.17E-05 42.7122 cl02729 WWE superfamily - - WWE domain; The WWE domain is named after three of its conserved residues and is predicted to mediate specific protein- protein interactions in ubiquitin and ADP ribose conjugation systems. Q#13666 - CGI_10002314 superfamily 245206 37 310 4.74E-83 254.895 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." 
Q#13667 - CGI_10002315 superfamily 243066 11 98 3.27E-16 69.5041 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#13669 - CGI_10002317 superfamily 247856 82 132 0.00798282 32.1345 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#13670 - CGI_10002318 superfamily 241874 15 95 3.67E-41 142.588 cl00456 SLC5-6-like_sbd superfamily C - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. 
SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#13671 - CGI_10012437 superfamily 115363 661 745 0.00149343 37.7366 cl05972 MIB_HERC2 superfamily - - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). Q#13671 - CGI_10012437 superfamily 241578 522 598 0.00186129 38.9748 cl00057 vWFA superfamily NC - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. 
Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#13673 - CGI_10012439 superfamily 216363 4 77 2.39E-15 65.5694 cl08312 UPF0029 superfamily N - Uncharacterized protein family UPF0029; Uncharacterized protein family UPF0029. Q#13676 - CGI_10012442 superfamily 241563 68 109 5.37E-06 44.0072 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#13676 - CGI_10012442 superfamily 241563 28 59 0.0020477 36.3032 cl00034 BBOX superfamily N - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#13677 - CGI_10012443 superfamily 241578 211 393 1.42E-64 211.869 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. 
Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#13677 - CGI_10012443 superfamily 219821 59 196 2.80E-22 92.8194 cl07136 VWA_N superfamily - - "VWA N-terminal; This domain is found at the N-terminus of proteins containing von Willebrand factor type A (VWA, pfam00092) and Cache (pfam02743) domains. It has been found in vertebrates, Drosophila and C. elegans but has not yet been identified in other eukaryotes. It is probably involved in the function of some voltage-dependent calcium channel subunits." Q#13677 - CGI_10012443 superfamily 217211 425 518 1.15E-15 72.7021 cl03691 Cache_1 superfamily - - Cache domain; Cache domain. Q#13678 - CGI_10012444 superfamily 247750 1 237 1.70E-105 314.976 cl17196 E1_enzyme_family superfamily - - "Superfamily of activating enzymes (E1) of the ubiquitin-like proteins. This family includes classical ubiquitin-activating enzymes E1, ubiquitin-like (ubl) activating enzymes and other mechanistic homologes, like MoeB, Thif1 and others. The common reaction mechanism catalyzed by MoeB, ThiF and the E1 enzymes begins with a nucleophilic attack of the C-terminal carboxylate of MoaD, ThiS and ubiquitin, respectively, on the alpha-phosphate of an ATP molecule bound at the active site of the activating enzymes, leading to the formation of a high-energy acyladenylate intermediate and subsequently to the formation of a thiocarboxylate at the C termini of MoaD and ThiS." Q#13678 - CGI_10012444 superfamily 247750 326 368 5.42E-14 70.3741 cl17196 E1_enzyme_family superfamily N - "Superfamily of activating enzymes (E1) of the ubiquitin-like proteins. This family includes classical ubiquitin-activating enzymes E1, ubiquitin-like (ubl) activating enzymes and other mechanistic homologes, like MoeB, Thif1 and others. 
The common reaction mechanism catalyzed by MoeB, ThiF and the E1 enzymes begins with a nucleophilic attack of the C-terminal carboxylate of MoaD, ThiS and ubiquitin, respectively, on the alpha-phosphate of an ATP molecule bound at the active site of the activating enzymes, leading to the formation of a high-energy acyladenylate intermediate and subsequently to the formation of a thiocarboxylate at the C termini of MoaD and ThiS." Q#13679 - CGI_10012445 superfamily 217473 130 338 1.97E-25 106.295 cl03978 Mab-21 superfamily N - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#13680 - CGI_10012446 superfamily 241563 187 228 3.87E-09 53.6372 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#13681 - CGI_10012447 superfamily 241571 603 708 3.94E-12 64.3558 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#13681 - CGI_10012447 superfamily 243061 182 287 3.53E-30 116.287 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#13681 - CGI_10012447 superfamily 243061 18 118 7.39E-30 115.132 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. 
Q#13681 - CGI_10012447 superfamily 243061 309 393 1.27E-19 85.8566 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#13681 - CGI_10012447 superfamily 243061 402 497 3.59E-17 78.923 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#13681 - CGI_10012447 superfamily 243061 505 600 2.17E-11 61.7138 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#13681 - CGI_10012447 superfamily 243068 716 857 3.61E-09 57.17 cl02523 Zona_pellucida superfamily NC - Zona pellucida-like domain; Zona pellucida-like domain. Q#13681 - CGI_10012447 superfamily 243061 123 175 2.92E-08 52.7294 cl02509 SRCR superfamily C - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#13686 - CGI_10012452 superfamily 247750 4 254 1.15E-118 359.296 cl17196 E1_enzyme_family superfamily C - "Superfamily of activating enzymes (E1) of the ubiquitin-like proteins. This family includes classical ubiquitin-activating enzymes E1, ubiquitin-like (ubl) activating enzymes and other mechanistic homologes, like MoeB, Thif1 and others. 
The common reaction mechanism catalyzed by MoeB, ThiF and the E1 enzymes begins with a nucleophilic attack of the C-terminal carboxylate of MoaD, ThiS and ubiquitin, respectively, on the alpha-phosphate of an ATP molecule bound at the active site of the activating enzymes, leading to the formation of a high-energy acyladenylate intermediate and subsequently to the formation of a thiocarboxylate at the C termini of MoaD and ThiS." Q#13686 - CGI_10012452 superfamily 247750 322 479 4.73E-74 243.351 cl17196 E1_enzyme_family superfamily N - "Superfamily of activating enzymes (E1) of the ubiquitin-like proteins. This family includes classical ubiquitin-activating enzymes E1, ubiquitin-like (ubl) activating enzymes and other mechanistic homologes, like MoeB, Thif1 and others. The common reaction mechanism catalyzed by MoeB, ThiF and the E1 enzymes begins with a nucleophilic attack of the C-terminal carboxylate of MoaD, ThiS and ubiquitin, respectively, on the alpha-phosphate of an ATP molecule bound at the active site of the activating enzymes, leading to the formation of a high-energy acyladenylate intermediate and subsequently to the formation of a thiocarboxylate at the C termini of MoaD and ThiS." Q#13686 - CGI_10012452 superfamily 202124 227 289 8.33E-12 61.024 cl08340 UBACT superfamily - - Repeat in ubiquitin-activating (UBA) protein; Repeat in ubiquitin-activating (UBA) protein. 
Q#13687 - CGI_10012453 superfamily 243092 305 431 0.00402984 38.0848 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#13687 - CGI_10012453 superfamily 110440 484 508 0.00629603 34.6909 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." 
Q#13688 - CGI_10012454 superfamily 245106 1 84 5.46E-45 143.168 cl09615 UBA_e1_C superfamily N - Ubiquitin-activating enzyme e1 C-terminal domain; This presumed domain found at the C-terminus of Ubiquitin-activating enzyme e1 proteins is functionally uncharacterized. Q#13689 - CGI_10008437 superfamily 243141 3 115 4.13E-24 93.535 cl02687 RWD superfamily - - "RWD domain; This domain was identified in WD40 repeat proteins and Ring finger domain proteins. The function of this domain is unknown. GCN2 is the alpha-subunit of the only translation initiation factor (eIF2 alpha) kinase that appears in all eukaryotes. Its function requires an interaction with GCN1 via the domain at its N-terminus, which is termed the RWD domain after three major RWD-containing proteins: RING finger-containing proteins, WD-repeat-containing proteins, and yeast DEAD (DEXD)-like helicases. The structure forms an alpha + beta sandwich fold consisting of two layers: a four-stranded antiparallel beta-sheet, and three side-by-side alpha-helices." Q#13690 - CGI_10008438 superfamily 248097 140 263 5.11E-15 68.831 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. 
Q#13691 - CGI_10008439 superfamily 243034 1245 1375 7.52E-07 49.3008 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#13691 - CGI_10008439 superfamily 247743 219 290 0.00303943 38.6663 cl17189 AAA superfamily C - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases.
The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#13691 - CGI_10008439 superfamily 247743 779 824 0.000497966 41.0312 cl17189 AAA superfamily C - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#13692 - CGI_10008440 superfamily 247743 118 274 4.09E-09 55.6688 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#13693 - CGI_10008441 superfamily 243072 2 122 7.73E-33 117.099 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#13695 - CGI_10008443 superfamily 245210 39 421 0 513.565 cl09938 cond_enzymes superfamily - - "Condensing enzymes; Family of enzymes that catalyze a (decarboxylating or non-decarboxylating) Claisen-like condensation reaction. Members share strong structural similarity, and are involved in the synthesis and degradation of fatty acids, and the production of polyketides, a diverse group of natural products." Q#13696 - CGI_10008444 superfamily 241554 53 130 7.56E-11 56.1998 cl00019 Macro superfamily C - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#13698 - CGI_10008446 superfamily 221499 18 72 9.53E-23 90.7437 cl13671 CAF1C_H4-bd superfamily - - Histone-binding protein RBBP4 or subunit C of CAF1 complex; The CAF-1 complex is a conserved heterotrimeric protein complex that promotes histone H3 and H4 deposition onto newly synthesized DNA during replication or DNA repair; specifically it facilitates replication-dependent nucleosome assembly with the major histone H3 (H3.1). This domain is an alpha helix which sits just upstream of the WD40 seven-bladed beta-propeller in the human RbAp46 protein.
RbAp46 folds into the beta-propeller and binds histone H4 in a groove formed between this N-terminal helix and an extended loop inserted into blade six. Q#13698 - CGI_10008446 superfamily 243092 106 279 8.21E-18 81.6124 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#13698 - CGI_10008446 superfamily 243092 203 379 3.69E-06 46.9444 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#13699 - CGI_10008447 superfamily 247856 271 336 9.82E-22 90.3566 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#13699 - CGI_10008447 superfamily 247856 14 80 1.34E-11 61.4667 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. 
Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#13700 - CGI_10008448 superfamily 244906 1531 1584 3.03E-18 81.8027 cl08315 CAP_GLY superfamily - - "CAP-Gly domain; Cytoskeleton-associated proteins (CAPs) are involved in the organisation of microtubules and transportation of vesicles and organelles along the cytoskeletal network. A conserved motif, CAP-Gly, has been identified in a number of CAPs, including CLIP-170 and dynactins. The crystal structure of Caenorhabditis elegans F53F4.3 protein CAP-Gly domain was recently solved. The domain contains three beta-strands. The most conserved sequence, GKNDG, is located in two consecutive sharp turns on the surface, forming the entrance to a groove." Q#13700 - CGI_10008448 superfamily 192987 731 836 0.0026048 38.3223 cl13724 TMF_TATA_bd superfamily - - "TATA element modulatory factor 1 TATA binding; This is the C-terminal conserved coiled coil region of a family of TATA element modulatory factor 1 proteins conserved in eukaryotes. The proteins bind to the TATA element of some RNA polymerase II promoters and repress their activity by competing with the binding of TATA binding protein. TMF1_TATA_bd is the most conserved part of the TMFs. TMFs are evolutionarily conserved golgins that bind Rab6, a ubiquitous ras-like GTP-binding Golgi protein, and contribute to Golgi organisation in animal and plant cells. The Rab6-binding domain appears to be the same region as this C-terminal family." Q#13701 - CGI_10008449 superfamily 241750 8 50 0.000258973 36.8836 cl00281 metallo-dependent_hydrolases superfamily N - "Superfamily of metallo-dependent hydrolases (also called amidohydrolase superfamily) is a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site.
The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. The family includes urease alpha, adenosine deaminase, phosphotriesterase dihydroorotases, allantoinases, hydantoinases, AMP-, adenine and cytosine deaminases, imidazolonepropionase, aryldialkylphosphatase, chlorohydrolases, formylmethanofuran dehydrogenases and others." Q#13706 - CGI_10018055 superfamily 243161 3 104 5.65E-16 72.853 cl02739 THAP superfamily - - "THAP domain; The THAP domain is a putative DNA-binding domain (DBD) and probably also binds a zinc ion. It features the conserved C2CH architecture (consensus sequence: Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His). Other universal features include the location of the domain at the N-termini of proteins, its size of about 90 residues, a C-terminal AVPTIF box and several other conserved residues. Orthologues of the human THAP domain have been identified in other vertebrates and probably worms and flies, but not in other eukaryotes or any prokaryotes." Q#13708 - CGI_10018057 superfamily 247755 441 654 3.92E-135 406.54 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." 
Q#13708 - CGI_10018057 superfamily 216049 148 396 6.85E-43 158.218 cl18356 ABC_membrane superfamily - - ABC transporter transmembrane region; This family represents a unit of six transmembrane helices. Many members of the ABC transporter family (pfam00005) have two such regions. Q#13708 - CGI_10018057 superfamily 216049 743 921 1.19E-24 105.061 cl18356 ABC_membrane superfamily C - ABC transporter transmembrane region; This family represents a unit of six transmembrane helices. Many members of the ABC transporter family (pfam00005) have two such regions. Q#13710 - CGI_10018059 superfamily 241636 62 254 4.52E-105 312.214 cl00145 TBOX superfamily - - "T-box DNA binding domain of the T-box family of transcriptional regulators. The T-box family is an ancient group that appears to play a critical role in development in all animal species. These genes were uncovered on the basis of similarity to the DNA binding domain of murine Brachyury (T) gene product, the defining feature of the family. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development and conserved expression patterns, most of the known genes in all species being expressed in mesoderm or mesoderm precursors." Q#13712 - CGI_10018061 superfamily 220533 53 670 0 541.545 cl12375 Dpy19 superfamily - - "Q-cell neuroblast polarisation; Dpy-19, formerly known as DUF2211, is a transmembrane domain family that is required to orient the neuroblast cells, QR and QL accurately on the anterior-posterior axis: QL and QR are born in the same anterior-posterior position, but polarise and migrate left-right asymmetrically, QL migrating towards the posterior and QR migrating towards the anterior. It is also required, with unc-40, to express mab-5 correctly in the Q cell descendants. The Dpy-19 protein derives from the C. elegans DUMPY mutant."
Q#13713 - CGI_10018062 superfamily 241554 63 90 0.0008769 35.0542 cl00019 Macro superfamily NC - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#13714 - CGI_10018063 superfamily 248458 327 474 1.28E-10 62.3313 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. 
Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#13714 - CGI_10018063 superfamily 248458 40 218 1.35E-10 62.3313 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 
Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#13715 - CGI_10018064 superfamily 244881 17 300 0 523.369 cl08267 ISOPREN_C2_like superfamily - - "This group contains class II terpene cyclases, protein prenyltransferases beta subunit, two broadly specific proteinase inhibitors alpha2-macroglobulin (alpha (2)-M) and pregnancy zone protein (PZP) and, the C3 C4 and C5 components of vertebrate complement. Class II terpene cyclases include squalene cyclase (SQCY) and 2,3-oxidosqualene cyclase (OSQCY), these integral membrane proteins catalyze a cationic cyclization cascade converting linear triterpenes to fused ring compounds. The protein prenyltransferases include protein farnesyltransferase (FTase) and geranylgeranyltransferase types I and II (GGTase-I and GGTase-II) which catalyze the carboxyl-terminal lipidation of Ras, Rab, and several other cellular signal transduction proteins, facilitating membrane associations and specific protein-protein interactions. Alpha (2)-M is a major carrier protein in serum and involved in the immobilization and entrapment of proteases. PZP is a pregnancy associated protein. Alpha (2)-M and PZP are known to bind to and, may modulate, the activity of placental protein-14 in T-cell growth and cytokine production thereby protecting the allogeneic fetus from attack by the maternal immune system." Q#13716 - CGI_10018065 superfamily 209898 31 53 0.00596541 32.3754 cl14787 MORN superfamily - - MORN repeat; The MORN (Membrane Occupation and Recognition Nexus) repeat is found in multiple copies in several proteins including junctophilins (See Takeshima et al. Mol. Cell 2000;6:11-22). 
A MORN-repeat protein has been identified in the parasite Toxoplasma gondii as a dynamic component of the cell division apparatus. It has been hypothesised to function as a linker protein between certain membrane regions and the parasite's cytoskeleton. Q#13717 - CGI_10018067 superfamily 245835 110 321 0.00753248 37.7206 cl12013 BAR superfamily - - "The Bin/Amphiphysin/Rvs (BAR) domain, a dimerization module that binds membranes and detects membrane curvature; BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions including organelle biogenesis, membrane trafficking or remodeling, and cell division and migration. Mutations in BAR containing proteins have been linked to diseases and their inactivation in cells leads to altered membrane dynamics. A BAR domain with an additional N-terminal amphipathic helix (an N-BAR) can drive membrane curvature. These N-BAR domains are found in amphiphysins and endophilins, among others. BAR domains are also frequently found alongside domains that determine lipid specificity, such as the Pleckstrin Homology (PH) and Phox Homology (PX) domains which are present in beta centaurins (ACAPs and ASAPs) and sorting nexins, respectively. A FES-CIP4 Homology (FCH) domain together with a coiled coil region is called the F-BAR domain and is present in Pombe/Cdc15 homology (PCH) family proteins, which include Fes/Fes tyrosine kinases, PACSIN or syndapin, CIP4-like proteins, and srGAPs, among others. The Inverse (I)-BAR or IRSp53/MIM homology Domain (IMD) is found in multi-domain proteins, such as IRSp53 and MIM, that act as scaffolding proteins and transducers of a variety of signaling pathways that link membrane dynamics and the underlying actin cytoskeleton. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions.
The I-BAR domain induces membrane protrusions in the opposite direction compared to classical BAR and F-BAR domains, which produce membrane invaginations. BAR domains that also serve as protein interaction domains include those of arfaptin and OPHN1-like proteins, among others, which bind to Rac and Rho GAP domains, respectively." Q#13719 - CGI_10018069 superfamily 221612 62 224 1.02E-20 92.122 cl13889 DUF3715 superfamily - - "Protein of unknown function (DUF3715); This domain family is found in eukaryotes, and is approximately 170 amino acids in length." Q#13719 - CGI_10018069 superfamily 222449 1418 1486 0.00585 38.646 cl16468 Shisa superfamily N - Wnt and FGF inhibitory regulator; Shisa is a transcription factor-type molecule that physically interacts with immature forms of the Wnt receptor Frizzled and the FGF receptor within the endoplasmic reticulum to inhibit their post-translational maturation and trafficking to the cell surface. Q#13721 - CGI_10018071 superfamily 243072 125 201 1.56E-07 47.3782 cl02529 ANK superfamily C - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#13721 - CGI_10018071 superfamily 241699 16 100 1.89E-28 103.81 cl00221 ACBP superfamily - - Acyl CoA binding protein (ACBP) binds thiol esters of long fatty acids and coenzyme A in a one-to-one binding mode with high specificity and affinity. Acyl-CoAs are important intermediates in fatty lipid synthesis and fatty acid degradation and play a role in regulation of intermediary metabolism and gene regulation. 
The suggested role of ACBP is to act as an intracellular acyl-CoA transporter and pool former. ACBPs are present in a large group of eukaryotic species and several tissue-specific isoforms have been detected. Q#13722 - CGI_10018072 superfamily 244897 26 210 1.65E-23 96.0158 cl08298 PTZ00007 superfamily C - (NAP-L) nucleosome assembly protein -L; Provisional Q#13723 - CGI_10018073 superfamily 243050 23 76 1.44E-25 94.0281 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#13723 - CGI_10018073 superfamily 243050 93 139 1.69E-19 77.9095 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners.
LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#13724 - CGI_10018074 superfamily 247724 10 175 1.81E-129 364.928 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#13727 - CGI_10018077 superfamily 243084 1413 1524 1.58E-48 170.908 cl02556 Bromodomain superfamily - - Bromodomain. Bromodomains are found in many chromatin-associated proteins and in nuclear histone acetyltransferases. 
They interact specifically with acetylated lysine. Q#13727 - CGI_10018077 superfamily 243084 1535 1645 1.03E-44 159.737 cl02556 Bromodomain superfamily - - Bromodomain. Bromodomains are found in many chromatin-associated proteins and in nuclear histone acetyltransferases. They interact specifically with acetylated lysine. Q#13727 - CGI_10018077 superfamily 246955 591 1056 1.54E-171 531.506 cl15417 DUF3591 superfamily - - "Protein of unknown function (DUF3591); This domain is found in eukaryotes and is typically between 445 to 462 amino acids in length. Most members are annotated as being transcription initiation factor TFIID subunit 1, and this region is the conserved central portion of these proteins." Q#13727 - CGI_10018077 superfamily 246955 1084 1320 2.22E-10 64.6951 cl15417 DUF3591 superfamily N - "Protein of unknown function (DUF3591); This domain is found in eukaryotes and is typically between 445 to 462 amino acids in length. Most members are annotated as being transcription initiation factor TFIID subunit 1, and this region is the conserved central portion of these proteins." Q#13727 - CGI_10018077 superfamily 150052 18 45 2.06E-05 44.6816 cl07760 TBP-binding superfamily C - "TATA box-binding protein binding; Members of this family adopt a structure consisting of three alpha helices and a beta-hairpin. They bind to TATA box-binding protein (TBP), inhibiting TBP interaction with the TATA element, thereby resulting in shutting down of gene transcription." Q#13728 - CGI_10018078 superfamily 216939 3 69 3.41E-09 50.3541 cl03492 PC4 superfamily - - Transcriptional Coactivator p15 (PC4); p15 has a bipartite structure composed of an amino-terminal regulatory domain and a carboxy-terminal cryptic DNA-binding domain. The DNA-binding activity of the carboxy-terminal is disguised by the amino-terminal p15 domain. Activity is controlled by protein kinases that target the regulatory domain. 
Q#13729 - CGI_10018079 superfamily 244880 2 163 1.14E-29 108.017 cl08263 TBP_TLF superfamily - - "TATA box binding protein (TBP): Present in archaea and eukaryotes, TBPs are transcription factors that recognize promoters and initiate transcription. TBP has been shown to be an essential component of three different transcription initiation complexes: SL1, TFIID and TFIIIB, directing transcription by RNA polymerases I, II and III, respectively. TBP binds directly to the TATA box promoter element, where it nucleates polymerase assembly, thus defining the transcription start site. TBP's binding in the minor groove induces a dramatic DNA bending while its own structure barely changes. The conserved core domain of TBP, which binds to the TATA box, has a bipartite structure, with intramolecular symmetry generating a saddle-shaped structure that sits astride the DNA. New members of the TBP family, called TBP-like proteins (TBLP, TLF, TLP) or TBP-related factors (TRF1, TRF2,TRP), are similar to the core domain of TBPs, with identical or chemically similar amino acids at many equivalent positions, suggesting similar structure. However, TLFs contain distinct, conserved amino acids at several positions that distinguish them from TBP." Q#13732 - CGI_10018083 superfamily 241555 3 216 1.48E-66 210.598 cl00020 GAT_1 superfamily - - "Type 1 glutamine amidotransferase (GATase1)-like domain; Type 1 glutamine amidotransferase (GATase1)-like domain. This group contains proteins similar to Class I glutamine amidotransferases, the intracellular PH1704 from Pyrococcus horikoshii, the C-terminal of the large catalase: Escherichia coli HP-II, Sinorhizobium meliloti Rm1021 ThuA, the A4 beta-galactosidase middle domain and peptidase E. The majority of proteins in this group have a reactive Cys found in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow. 
For Class I glutamine amidotransferase proteins which transfer ammonia from the amide side chain of glutamine to an acceptor substrate, this Cys forms a Cys-His-Glu catalytic triad in the active site. Glutamine amidotransferase activity can be found in a range of biosynthetic enzymes included in this cd: glutamine amidotransferase, formylglycinamide ribonucleotide, GMP synthetase, anthranilate synthase component II, glutamine-dependent carbamoyl phosphate synthase (CPSase), cytidine triphosphate synthetase, gamma-glutamyl hydrolase, imidazole glycerol phosphate synthase and cobyric acid synthase. For Pyrococcus horikoshii PH1704, the Cys of the nucleophile elbow together with a different His and a Glu from an adjacent monomer form a catalytic triad different from the typical GATase1 triad. Peptidase E is believed to be a serine peptidase having a Ser-His-Glu catalytic triad which differs from the Cys-His-Glu catalytic triad of typical GATase1 domains, by having a Ser in place of the reactive Cys at the nucleophile elbow. The E. coli HP-II C-terminal domain, S. meliloti Rm1021 ThuA and the A4 beta-galactosidase middle domain lack the catalytic triad typical of GATase1 domains. GATase1-like domains can occur either as single polypeptides, as in Class I glutamine amidotransferases, or as domains in a much larger multifunctional synthase protein, such as CPSase. Peptidase E has a circular permutation in the common core of a typical GATase1 domain." Q#13733 - CGI_10018084 superfamily 243066 50 69 0.000307535 34.5889 cl02518 BTB superfamily C - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation.
The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#13734 - CGI_10018085 superfamily 243066 48 141 3.03E-26 103.466 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#13734 - CGI_10018085 superfamily 198867 150 247 6.72E-26 102.42 cl06652 BACK superfamily - - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation)." 
Q#13734 - CGI_10018085 superfamily 243146 389 442 0.000217426 39.739 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#13734 - CGI_10018085 superfamily 243146 444 480 0.00922016 34.6524 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#13738 - CGI_10018089 superfamily 247724 14 174 6.27E-118 334.976 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." 
Q#13740 - CGI_10018091 superfamily 192535 43 166 0.00122864 38.3458 cl18179 7TM_GPCR_Srsx superfamily C - Serpentine type 7TM GPCR chemoreceptor Srsx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srsx is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#13741 - CGI_10022675 superfamily 247907 2499 2650 1.43E-27 112.897 cl17353 LamG superfamily - - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." Q#13741 - CGI_10022675 superfamily 247907 2929 3066 1.72E-24 103.652 cl17353 LamG superfamily - - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." Q#13741 - CGI_10022675 superfamily 247907 2311 2473 4.20E-24 102.496 cl17353 LamG superfamily - - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. 
Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." Q#13741 - CGI_10022675 superfamily 247907 2745 2882 3.88E-20 90.9404 cl17353 LamG superfamily - - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." Q#13741 - CGI_10022675 superfamily 247907 2120 2278 1.02E-16 80.9252 cl17353 LamG superfamily - - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." 
Q#13741 - CGI_10022675 superfamily 238012 956 1003 1.26E-15 75.0834 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#13741 - CGI_10022675 superfamily 238012 858 905 8.99E-14 69.6906 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#13741 - CGI_10022675 superfamily 238012 1052 1093 6.41E-12 64.2978 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#13741 - CGI_10022675 superfamily 238012 811 849 8.46E-12 63.9126 cl11390 EGF_Lam superfamily C - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement 
membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#13741 - CGI_10022675 superfamily 238012 1401 1450 1.74E-11 63.1422 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#13741 - CGI_10022675 superfamily 238012 647 692 2.86E-11 62.3718 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#13741 - CGI_10022675 superfamily 238012 1506 1543 2.94E-09 56.5938 cl11390 EGF_Lam superfamily C - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the 
first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#13741 - CGI_10022675 superfamily 238012 907 955 3.57E-09 56.2086 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#13741 - CGI_10022675 superfamily 238012 1007 1050 1.48E-07 51.5862 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#13741 - CGI_10022675 superfamily 238012 1451 1502 2.55E-05 45.0378 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#13741 - CGI_10022675 superfamily 
238012 352 395 3.35E-05 44.6526 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#13741 - CGI_10022675 superfamily 238012 1098 1155 7.18E-05 43.497 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#13741 - CGI_10022675 superfamily 243034 3142 3210 0.000102923 43.5228 cl02429 TPR superfamily C - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a
target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#13741 - CGI_10022675 superfamily 238012 759 808 0.000235327 41.9562 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#13741 - CGI_10022675 superfamily 238012 697 758 0.000443957 41.1858 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#13741 - CGI_10022675 superfamily 243198 27 264 6.81E-72 243.959 cl02806 Laminin_N superfamily - - Laminin N-terminal (Domain VI); Laminin N-terminal (Domain VI). 
Q#13741 - CGI_10022675 superfamily 243080 470 597 3.47E-29 116.592 cl02548 Laminin_B superfamily - - Laminin B (Domain IV); Laminin B (Domain IV). Q#13741 - CGI_10022675 superfamily 243080 1220 1345 9.61E-21 91.9388 cl02548 Laminin_B superfamily - - Laminin B (Domain IV); Laminin B (Domain IV). Q#13741 - CGI_10022675 superfamily 243092 3306 3485 7.18E-14 73.9084 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#13741 - CGI_10022675 superfamily 203372 1994 2124 2.66E-06 49.0088 cl05515 Laminin_II superfamily - - "Laminin Domain II; It has been suggested that the domains I and II from laminin A, B1 and B2 may come together to form a triple helical coiled-coil structure." 
Q#13742 - CGI_10022676 superfamily 243034 654 749 1.31E-11 62.3976 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#13742 - CGI_10022676 superfamily 242008 555 618 0.00660602 37.9876 cl00656 Cas1_I-II-III superfamily N - "CRISPR/Cas system-associated protein Cas1; CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas1 is the most universal CRISPR system protein thought to be involved in spacer integration; Cas1 is metal-dependent deoxyribonuclease, also binds RNA; Shown to possess a unique fold consisting of a N-terminal beta-strand domain and a C-terminal alpha-helical domain" Q#13743 - CGI_10022677
superfamily 243092 20 104 1.49E-13 66.2044 cl02567 WD40 superfamily NC - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#13744 - CGI_10022678 superfamily 201844 209 237 3.31E-15 70.017 cl03250 zf-C2HC superfamily - - "Zinc finger, C2HC type; This is a DNA binding zinc finger domain." Q#13745 - CGI_10022679 superfamily 201844 44 74 1.66E-14 66.5502 cl03250 zf-C2HC superfamily - - "Zinc finger, C2HC type; This is a DNA binding zinc finger domain." Q#13745 - CGI_10022679 superfamily 201844 92 121 9.92E-13 61.5426 cl03250 zf-C2HC superfamily - - "Zinc finger, C2HC type; This is a DNA binding zinc finger domain." Q#13745 - CGI_10022679 superfamily 201844 6 33 3.92E-12 60.0018 cl03250 zf-C2HC superfamily - - "Zinc finger, C2HC type; This is a DNA binding zinc finger domain." 
Q#13745 - CGI_10022679 superfamily 201844 152 179 1.75E-11 58.0758 cl03250 zf-C2HC superfamily - - "Zinc finger, C2HC type; This is a DNA binding zinc finger domain." Q#13745 - CGI_10022679 superfamily 219850 188 236 4.91E-05 43.3747 cl07173 SNF2_assoc superfamily N - Bacterial SNF2 helicase associated; This domain is found in bacterial proteins of the SWF/SNF/SWI helicase family to the N-terminus of the SNF2 family N-terminal domain (pfam00176) and together with the Helicase conserved C-terminal domain (pfam00271). The function of the domain is not clear. Q#13746 - CGI_10022680 superfamily 247727 84 191 0.000217388 39.3355 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#13750 - CGI_10022684 superfamily 189857 7 108 1.29E-28 102.713 cl07832 Caveolin superfamily N - "Caveolin; All three known Caveolin forms have the FEDVIAEP caveolin 'signature motif' within their hydrophilic N-terminal domain. Caveolin 2 (Cav-2) is co-localised and co-expressed with Cav-1/VIP21, forms heterodimers with it and needs Cav-1 for proper membrane localisation. Cav-3 has greater protein sequence similarity to Cav-1 than to Cav-2. Cellular processes caveolins are involved in include vesicular transport, cholesterol homeostasis, signal transduction, and tumour suppression." 
Q#13751 - CGI_10022685 superfamily 189857 1 121 1.02E-40 134.3 cl07832 Caveolin superfamily - - "Caveolin; All three known Caveolin forms have the FEDVIAEP caveolin 'signature motif' within their hydrophilic N-terminal domain. Caveolin 2 (Cav-2) is co-localised and co-expressed with Cav-1/VIP21, forms heterodimers with it and needs Cav-1 for proper membrane localisation. Cav-3 has greater protein sequence similarity to Cav-1 than to Cav-2. Cellular processes caveolins are involved in include vesicular transport, cholesterol homeostasis, signal transduction, and tumour suppression." Q#13752 - CGI_10022686 superfamily 247746 32 142 0.00268964 36.8526 cl17192 ATP-synt_B superfamily - - "ATP synthase B/B' CF(0); Part of the CF(0) (base unit) of the ATP synthase. The base unit is thought to translocate protons through membrane (inner membrane in mitochondria, thylakoid membrane in plants, cytoplasmic membrane in bacteria). The B subunits are thought to interact with the stalk of the CF(1) subunits. This domain should not be confused with the ab CF(1) proteins (in the head of the ATP synthase) which are found in pfam00006" Q#13753 - CGI_10022687 superfamily 245106 43 168 5.25E-72 215.586 cl09615 UBA_e1_C superfamily - - Ubiquitin-activating enzyme e1 C-terminal domain; This presumed domain found at the C-terminus of Ubiquitin-activating enzyme e1 proteins is functionally uncharacterized. Q#13754 - CGI_10022688 superfamily 247068 353 442 1.36E-14 71.1905 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#13754 - CGI_10022688 superfamily 247068 40 129 6.17E-13 66.1829 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#13754 - CGI_10022688 superfamily 247068 246 339 2.88E-11 61.5605 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#13754 - CGI_10022688 superfamily 247068 567 656 1.18E-08 53.8566 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#13754 - CGI_10022688 superfamily 247068 141 233 1.36E-08 53.4714 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#13754 - CGI_10022688 superfamily 247068 464 548 3.62E-08 52.3158 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#13754 - CGI_10022688 superfamily 247068 836 901 3.15E-07 49.2342 cl15786 CA_like superfamily C - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#13754 - CGI_10022688 superfamily 247068 666 757 1.31E-06 47.3082 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#13757 - CGI_10022691 superfamily 203031 32 86 1.23E-05 40.004 cl04548 FLYWCH superfamily - - "FLYWCH zinc finger domain; Mutations in the mod(mdg4) gene have effects on variegation (PEV), the properties of insulator sequences, correct path-finding of growing nerve cells, meiotic pairing of chromosomes, and apoptosis. The occurrence of FLYWCH motifs in mod(mdg4) gene product and other proteins is discussed in." Q#13758 - CGI_10022692 superfamily 241691 198 254 0.00114266 37.4916 cl00213 DNA_BRE_C superfamily N - "DNA breaking-rejoining enzymes, C-terminal catalytic domain. The DNA breaking-rejoining enzyme superfamily includes type IB topoisomerases and tyrosine recombinases that share the same fold in their catalytic domain containing six conserved active site residues. The best-studied members of this diverse superfamily include human topoisomerase I, the bacteriophage lambda integrase, the bacteriophage P1 Cre recombinase, the yeast Flp recombinase and the bacterial XerD/C recombinases. Their overall reaction mechanism is essentially identical and involves cleavage of a single strand of a DNA duplex by nucleophilic attack of a conserved tyrosine to give a 3' phosphotyrosyl protein-DNA adduct. In the second rejoining step, a terminal 5' hydroxyl attacks the covalent adduct to release the enzyme and generate duplex DNA. 
The enzymes differ in that topoisomerases cleave and then rejoin the same 5' and 3' termini, whereas a site-specific recombinase transfers a 5' hydroxyl generated by recombinase cleavage to a new 3' phosphate partner located in a different duplex region. Many DNA breaking-rejoining enzymes also have N-terminal domains, which show little sequence or structure similarity." Q#13759 - CGI_10022693 superfamily 218847 23 170 1.98E-38 131.854 cl18479 CDO_I superfamily - - Cysteine dioxygenase type I; Cysteine dioxygenase type I (EC:1.13.11.20) converts cysteine to cysteinesulphinic acid and is the rate-limiting step in sulphate production. Q#13760 - CGI_10022694 superfamily 246669 533 651 9.03E-72 232.19 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#13760 - CGI_10022694 superfamily 246669 379 493 1.82E-53 180.915 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. 
C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#13760 - CGI_10022694 superfamily 246669 221 340 2.53E-51 175.157 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." 
Q#13761 - CGI_10022695 superfamily 247723 148 199 1.04E-20 86.6876 cl17169 RRM_SF superfamily C - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#13762 - CGI_10022696 superfamily 204577 87 198 5.30E-25 97.3756 cl12569 DUF2838 superfamily - - Protein of unknown function (DUF2838); This bacterial family of proteins has no known function. 
Q#13763 - CGI_10022697 superfamily 247905 904 1036 1.49E-22 95.3824 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#13763 - CGI_10022697 superfamily 247805 501 690 1.90E-12 66.2068 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#13764 - CGI_10022698 superfamily 217311 33 399 1.86E-126 383.225 cl18402 DUF229 superfamily N - Protein of unknown function (DUF229); Members of this family are uncharacterized. They are 500-1200 amino acids in length and share a long region of conservation that probably corresponds to several domains. The GO annotation for the protein indicates that it is involved in nematode larval development and has a positive regulation on growth rate. Q#13767 - CGI_10022701 superfamily 244363 26 188 3.50E-75 226.041 cl06336 Commd superfamily - - "COMM_Domain, a family of domains found at the C-terminus of HCarG, the copper metabolism gene MURR1 product, and related proteins. Presumably all COMM_Domain containing proteins are located in the nucleus and the COMM domain plays a role in protein-protein interactions. Several family members have been shown to bind and inhibit NF-kappaB. 
Murr1/Commd1 is a protein involved in copper homeostasis, which has also been identified as a regulator of the human delta epithelial sodium channel. HCaRG, a nuclear protein that might be involved in cell proliferation, is negatively regulated by extracellular calcium concentration, and its basal mRNA levels are higher in hypertensive animals." Q#13769 - CGI_10022703 superfamily 245213 3264 3300 2.93E-10 59.5726 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13769 - CGI_10022703 superfamily 245213 1447 1484 2.99E-10 59.5726 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13769 - CGI_10022703 superfamily 245213 2069 2105 1.04E-09 58.0318 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13769 - CGI_10022703 superfamily 245213 1681 1718 1.51E-09 57.6466 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13769 - CGI_10022703 superfamily 245213 2532 2568 1.66E-09 57.2614 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13769 - CGI_10022703 superfamily 245213 905 942 2.22E-09 56.8762 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13769 - CGI_10022703 superfamily 245213 3186 3222 3.30E-09 56.491 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13769 - CGI_10022703 superfamily 245213 2763 2798 3.93E-09 56.491 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13769 - CGI_10022703 superfamily 245213 3109 3146 4.01E-09 56.1058 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13769 - CGI_10022703 superfamily 245213 944 980 4.59E-09 56.1058 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13769 - CGI_10022703 superfamily 245213 2953 2990 6.53E-09 55.7206 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13769 - CGI_10022703 superfamily 245213 1876 1913 9.04E-09 55.3354 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13769 - CGI_10022703 superfamily 245213 1720 1757 1.22E-08 54.9502 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13769 - CGI_10022703 superfamily 245213 2108 2143 1.67E-08 54.565 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13769 - CGI_10022703 superfamily 245213 1524 1560 2.38E-08 54.1798 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13769 - CGI_10022703 superfamily 245213 1331 1367 2.47E-08 53.7946 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13769 - CGI_10022703 superfamily 245213 1759 1797 2.50E-08 53.7946 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13769 - CGI_10022703 superfamily 245213 2380 2416 3.97E-08 53.4094 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13769 - CGI_10022703 superfamily 245213 2839 2874 4.46E-08 53.4094 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13769 - CGI_10022703 superfamily 245213 3032 3068 4.54E-08 53.0242 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13769 - CGI_10022703 superfamily 245213 2915 2950 6.54E-08 52.639 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13769 - CGI_10022703 superfamily 245213 2456 2492 9.79E-08 52.2538 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13769 - CGI_10022703 superfamily 245213 3224 3260 1.10E-07 52.2538 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13769 - CGI_10022703 superfamily 245213 2571 2606 1.25E-07 51.8686 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13769 - CGI_10022703 superfamily 245213 2029 2066 1.29E-07 51.8686 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13769 - CGI_10022703 superfamily 245213 1915 1950 1.51E-07 51.8686 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13769 - CGI_10022703 superfamily 245213 2647 2682 2.22E-07 51.0982 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13769 - CGI_10022703 superfamily 245213 3148 3183 2.32E-07 51.0982 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13769 - CGI_10022703 superfamily 245213 1371 1405 2.37E-07 51.0982 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13769 - CGI_10022703 superfamily 245213 1100 1137 2.86E-07 50.713 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13769 - CGI_10022703 superfamily 245213 2494 2530 3.25E-07 50.713 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13769 - CGI_10022703 superfamily 245213 3070 3106 4.05E-07 50.3278 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13769 - CGI_10022703 superfamily 245213 1953 1989 4.49E-07 50.3278 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13769 - CGI_10022703 superfamily 245213 1991 2026 4.69E-07 50.3278 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13769 - CGI_10022703 superfamily 245213 866 903 6.53E-07 49.9426 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13769 - CGI_10022703 superfamily 245213 2685 2721 6.81E-07 49.9426 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13769 - CGI_10022703 superfamily 245213 982 1020 7.02E-07 49.5574 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13769 - CGI_10022703 superfamily 245213 2877 2913 7.48E-07 49.5574 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13769 - CGI_10022703 superfamily 245213 1060 1097 8.32E-07 49.5574 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13769 - CGI_10022703 superfamily 245213 1799 1836 8.60E-07 49.5574 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13769 - CGI_10022703 superfamily 245213 3302 3338 9.11E-07 49.5574 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13769 - CGI_10022703 superfamily 245213 1142 1174 1.32E-06 48.787 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13769 - CGI_10022703 superfamily 245213 1177 1213 1.78E-06 48.4018 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13769 - CGI_10022703 superfamily 245213 2609 2644 2.46E-06 48.0166 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13769 - CGI_10022703 superfamily 245213 1022 1057 3.27E-06 47.6314 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13769 - CGI_10022703 superfamily 245213 1407 1445 3.36E-06 47.6314 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13769 - CGI_10022703 superfamily 245213 2723 2760 7.16E-06 46.861 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13769 - CGI_10022703 superfamily 245213 1563 1600 7.51E-06 46.861 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13769 - CGI_10022703 superfamily 245213 2992 3030 1.08E-05 46.0906 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13769 - CGI_10022703 superfamily 245213 1293 1328 1.16E-05 46.0906 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13769 - CGI_10022703 superfamily 245213 1255 1291 1.28E-05 46.0906 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13769 - CGI_10022703 superfamily 245213 1838 1873 1.42E-05 45.7054 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13769 - CGI_10022703 superfamily 245213 1603 1641 1.55E-05 45.7054 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13769 - CGI_10022703 superfamily 245213 1486 1521 1.64E-05 45.7054 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13769 - CGI_10022703 superfamily 245213 2146 2181 1.84E-05 45.7054 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13769 - CGI_10022703 superfamily 245213 2418 2454 1.98E-05 45.3202 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13769 - CGI_10022703 superfamily 245213 2340 2377 3.36E-05 44.935 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13769 - CGI_10022703 superfamily 245213 2223 2260 0.000124925 43.009 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13769 - CGI_10022703 superfamily 245213 2801 2836 0.000127195 43.009 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13769 - CGI_10022703 superfamily 245213 1643 1679 0.000135876 43.009 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13769 - CGI_10022703 superfamily 245213 1221 1252 0.00179573 39.5422 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13769 - CGI_10022703 superfamily 245213 840 864 0.00221218 39.5422 cl09941 EGF_CA superfamily N - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13769 - CGI_10022703 superfamily 245213 2306 2338 0.00561166 38.3866 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13769 - CGI_10022703 superfamily 243060 3580 3649 4.61E-09 57.3888 cl02507 SEA superfamily C - "SEA domain; Domain found in Sea urchin sperm protein, Enterokinase, Agrin (SEA). Proposed function of regulating or binding carbohydrate side chains. Recently a proteolytic activity has been shown for a SEA domain." Q#13769 - CGI_10022703 superfamily 243060 3837 3936 3.84E-06 48.5292 cl02507 SEA superfamily - - "SEA domain; Domain found in Sea urchin sperm protein, Enterokinase, Agrin (SEA). Proposed function of regulating or binding carbohydrate side chains. Recently a proteolytic activity has been shown for a SEA domain." Q#13769 - CGI_10022703 superfamily 243060 3459 3538 1.96E-05 46.218 cl02507 SEA superfamily C - "SEA domain; Domain found in Sea urchin sperm protein, Enterokinase, Agrin (SEA). Proposed function of regulating or binding carbohydrate side chains. Recently a proteolytic activity has been shown for a SEA domain." Q#13769 - CGI_10022703 superfamily 243060 117 215 2.39E-05 46.218 cl02507 SEA superfamily - - "SEA domain; Domain found in Sea urchin sperm protein, Enterokinase, Agrin (SEA). Proposed function of regulating or binding carbohydrate side chains. Recently a proteolytic activity has been shown for a SEA domain." Q#13769 - CGI_10022703 superfamily 243060 3714 3777 4.88E-05 45.0624 cl02507 SEA superfamily C - "SEA domain; Domain found in Sea urchin sperm protein, Enterokinase, Agrin (SEA). Proposed function of regulating or binding carbohydrate side chains. Recently a proteolytic activity has been shown for a SEA domain." 
Q#13773 - CGI_10022707 superfamily 247723 483 539 3.05E-08 51.4992 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#13774 - CGI_10022708 superfamily 243035 84 191 1.23E-21 86.1345 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#13775 - CGI_10022709 superfamily 243035 107 204 2.61E-20 82.6677 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#13776 - CGI_10022710 superfamily 241691 16 170 2.42E-06 44.697 cl00213 DNA_BRE_C superfamily N - "DNA breaking-rejoining enzymes, C-terminal catalytic domain. The DNA breaking-rejoining enzyme superfamily includes type IB topoisomerases and tyrosine recombinases that share the same fold in their catalytic domain containing six conserved active site residues. The best-studied members of this diverse superfamily include human topoisomerase I, the bacteriophage lambda integrase, the bacteriophage P1 Cre recombinase, the yeast Flp recombinase and the bacterial XerD/C recombinases. Their overall reaction mechanism is essentially identical and involves cleavage of a single strand of a DNA duplex by nucleophilic attack of a conserved tyrosine to give a 3' phosphotyrosyl protein-DNA adduct. In the second rejoining step, a terminal 5' hydroxyl attacks the covalent adduct to release the enzyme and generate duplex DNA. The enzymes differ in that topoisomerases cleave and then rejoin the same 5' and 3' termini, whereas a site-specific recombinase transfers a 5' hydroxyl generated by recombinase cleavage to a new 3' phosphate partner located in a different duplex region. 
Many DNA breaking-rejoining enzymes also have N-terminal domains, which show little sequence or structure similarity." Q#13780 - CGI_10004339 superfamily 248338 158 271 0.00311496 37.5809 cl17784 Peptidase_C48 superfamily NC - "Ulp1 protease family, C-terminal catalytic domain; This domain contains the catalytic triad Cys-His-Asn." Q#13781 - CGI_10004340 superfamily 243263 37 218 4.45E-36 137.539 cl02990 ASC superfamily N - Amiloride-sensitive sodium channel; Amiloride-sensitive sodium channel. Q#13781 - CGI_10004340 superfamily 243263 3 43 2.22E-08 54.3362 cl02990 ASC superfamily C - Amiloride-sensitive sodium channel; Amiloride-sensitive sodium channel. Q#13784 - CGI_10004489 superfamily 216339 170 406 5.32E-149 425.646 cl08308 Tub superfamily - - Tub family; Tub family. Q#13790 - CGI_10006088 superfamily 241622 235 315 1.96E-22 92.2446 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archebacteria, and eurkayotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." 
Q#13790 - CGI_10006088 superfamily 241622 362 415 1.68E-09 55.3384 cl00117 PDZ superfamily N - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archebacteria, and eurkayotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#13790 - CGI_10006088 superfamily 244397 569 616 0.00207632 37.2693 cl06515 DUF1525 superfamily NC - Protein of unknown function (DUF1525); Protein of unknown function (DUF1525). Q#13791 - CGI_10006089 superfamily 247058 1 185 7.09E-59 186.997 cl15762 crotonase-like superfamily - - "Crotonase/Enoyl-Coenzyme A (CoA) hydratase superfamily. This superfamily contains a diverse set of enzymes including enoyl-CoA hydratase, napthoate synthase, methylmalonyl-CoA decarboxylase, 3-hydoxybutyryl-CoA dehydratase, and dienoyl-CoA isomerase. Many of these play important roles in fatty acid metabolism. 
In addition to a conserved structural core and the formation of trimers (or dimers of trimers), a common feature in this superfamily is the stabilization of an enolate anion intermediate derived from an acyl-CoA substrate. This is accomplished by two conserved backbone NH groups in active sites that form an oxyanion hole." Q#13800 - CGI_10005287 superfamily 247724 3 137 1.64E-22 89.9019 cl17170 Ras_like_GTPase superfamily N - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#13802 - CGI_10005289 superfamily 217900 5 106 2.33E-33 127.698 cl04403 APG9 superfamily N - "Autophagy protein Apg9; In yeast, 15 Apg proteins coordinate the formation of autophagosomes. Autophagy is a bulk degradation process induced by starvation in eukaryotic cells. 
Apg9 plays a direct role in the formation of the cytoplasm to vacuole targeting and autophagic vesicles, possibly serving as a marker for a specialised compartment essential for these vesicle-mediated alternative targeting pathways." Q#13805 - CGI_10015178 superfamily 241563 11 43 1.41E-07 48.2444 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#13806 - CGI_10015179 superfamily 219431 138 188 2.24E-05 42.8032 cl06504 zf-CW superfamily - - "CW-type Zinc Finger; This domain appears to be a zinc finger. The alignment shows four conserved cysteine residues and a conserved tryptophan. It was first identified by, and is predicted to be a "highly specialised mononuclear four-cysteine zinc finger...that plays a role in DNA binding and/or promoting protein-protein interactions in complicated eukaryotic processes including...chromatin methylation status and early embryonic development." Weak homology to pfam00628 further evidences these predictions (personal obs: C Yeats). Twelve different CW-domain-containing protein subfamilies are described, with different subfamilies being characteristic of vertebrates, higher plants and other animals in which this domain is found." Q#13807 - CGI_10015180 superfamily 247856 97 159 3.17E-16 74.8917 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers."
Q#13807 - CGI_10015180 superfamily 247856 216 278 1.83E-15 72.5805 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#13807 - CGI_10015180 superfamily 247856 23 83 2.15E-15 72.1953 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#13807 - CGI_10015180 superfamily 247856 290 349 4.32E-13 65.6469 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#13807 - CGI_10015180 superfamily 247057 784 831 4.61E-05 42.2265 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. 
They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#13808 - CGI_10015181 superfamily 243069 192 278 0.00202081 37.7324 cl02525 Band_7 superfamily NC - "The band 7 domain of flotillin (reggie) like proteins. This group contains proteins similar to stomatin, prohibitin, flotillin, HflK/C and podocin. Many of these band 7 domain-containing proteins are lipid raft-associated. Individual proteins of this band 7 domain family may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. Microdomains formed from flotillin proteins may in addition be dynamic units with their own regulatory functions. Flotillins have been implicated in signal transduction, vesicle trafficking, cytoskeleton rearrangement and are known to interact with a variety of proteins. Stomatin interacts with and regulates members of the degenerin/epithelial Na+ channel family in mechanosensory cells of Caenorhabditis elegans and vertebrate neurons and participates in trafficking of Glut1 glucose transporters. Prohibitin may act as a chaperone for the stabilization of mitochondrial proteins. Prokaryotic HflK/C plays a role in the decision between lysogenic and lytic cycle growth during lambda phage infection. Flotillins have been implicated in the progression of prion disease, in the pathogenesis of neurodegenerative diseases such as Parkinson's and Alzheimer's disease and in cancer invasion and metastasis.
Mutations in the podocin gene give rise to autosomal recessive steroid resistant nephrotic syndrome" Q#13809 - CGI_10015182 superfamily 220608 39 151 1.17E-25 104.31 cl10859 G8 superfamily - - G8 domain; This domain is found in disease proteins PKHD1 and KIAA1199 and is named G8 after its 8 conserved glycines. It is predicted to contain 10 beta strands and an alpha helix. Q#13810 - CGI_10015183 superfamily 220608 31 151 8.20E-24 98.9174 cl10859 G8 superfamily - - G8 domain; This domain is found in disease proteins PKHD1 and KIAA1199 and is named G8 after its 8 conserved glycines. It is predicted to contain 10 beta strands and an alpha helix. Q#13811 - CGI_10015184 superfamily 222150 1531 1556 0.000171956 41.6085 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#13811 - CGI_10015184 superfamily 222150 233 257 0.000237433 41.2233 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#13812 - CGI_10015185 superfamily 241874 26 578 0 551.4 cl00456 SLC5-6-like_sbd superfamily - - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease.
They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#13814 - CGI_10015187 superfamily 245847 22 170 2.24E-37 134.787 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#13814 - CGI_10015187 superfamily 245847 212 349 2.26E-27 107.438 cl12042 FA58C superfamily N - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#13814 - CGI_10015187 superfamily 245847 399 535 0.000485771 39.0228 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#13816 - CGI_10015189 superfamily 147024 89 264 1.27E-62 204.585 cl04658 OGFr_N superfamily - - "Opioid growth factor receptor (OGFr) conserved region; Opioid peptides act as growth factors in neural and non-neural cells and tissues, in addition to serving in neurotransmission/neuromodulation in the nervous system. The Opioid growth factor receptor is an integral membrane protein associated with the nucleus. The conserved region is situated at the N-terminus of the member proteins with a series of imperfect repeats lying immediately to its C-terminus." Q#13817 - CGI_10015190 superfamily 247755 446 684 1.42E-138 408.851 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. 
The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#13817 - CGI_10015190 superfamily 216049 129 399 3.30E-33 128.943 cl18356 ABC_membrane superfamily - - ABC transporter transmembrane region; This family represents a unit of six transmembrane helices. Many members of the ABC transporter family (pfam00005) have two such regions. Q#13818 - CGI_10015191 superfamily 198738 317 405 1.50E-39 137.399 cl02599 Ets superfamily - - Ets-domain; Ets-domain. Q#13818 - CGI_10015191 superfamily 247057 116 194 6.03E-20 83.5657 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#13819 - CGI_10015192 superfamily 198738 314 402 1.26E-43 148.57 cl02599 Ets superfamily - - Ets-domain; Ets-domain. Q#13819 - CGI_10015192 superfamily 247057 163 241 5.43E-25 97.0477 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. 
This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#13822 - CGI_10015195 superfamily 245208 3 630 0 677.51 cl09933 ACAD superfamily - - "Acyl-CoA dehydrogenase; Both mitochondrial acyl-CoA dehydrogenases (ACAD) and peroxisomal acyl-CoA oxidases (AXO) catalyze the alpha,beta dehydrogenation of the corresponding trans-enoyl-CoA by FAD, which becomes reduced. The reduced form of ACAD is reoxidized in the oxidative half-reaction by electron-transferring flavoprotein (ETF), from which the electrons are transferred to the mitochondrial respiratory chain coupled with ATP synthesis. In contrast, AXO catalyzes a different oxidative half-reaction, in which the reduced FAD is reoxidized by molecular oxygen. The ACAD family includes the eukaryotic beta-oxidation enzymes, short (SCAD), medium (MCAD), long (LCAD) and very-long (VLCAD) chain acyl-CoA dehydrogenases. These enzymes all share high sequence similarity, but differ in their substrate specificities. The ACAD family also includes amino acid catabolism enzymes such as Isovaleryl-CoA dehydrogenase (IVD), short/branched chain acyl-CoA dehydrogenases (SBCAD), Isobutyryl-CoA dehydrogenase (IBDH), glutaryl-CoA dehydrogenase (GCD) and Crotonobetainyl-CoA dehydrogenase. The mitochondrial ACADs are generally homotetramers, except for VLCAD, which is a homodimer.
Related enzymes include the SOS adaptive response protein aidB, Naphthocyclinone hydroxylase (NcnH), and Dibenzothiophene (DBT) desulfurization enzyme C (DszC)" Q#13823 - CGI_10015196 superfamily 243082 421 523 8.47E-06 45.7414 cl02553 Peptidase_C19 superfamily N - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." Q#13824 - CGI_10005534 superfamily 246616 2 192 3.61E-18 81.1993 cl14105 MetH superfamily - - "Methionine synthase I (cobalamin-dependent), methyltransferase domain [Amino acid transport and metabolism]" Q#13825 - CGI_10005535 superfamily 245864 54 388 1.93E-53 185.174 cl12078 p450 superfamily C - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets.
These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#13826 - CGI_10005536 superfamily 243135 249 524 6.38E-103 316.144 cl02666 KU superfamily - - "Ku-core domain; includes the central DNA-binding beta-barrels, polypeptide rings, and the C-terminal arm of Ku proteins. The Ku protein consists of two tightly associated homologous subunits, Ku70 and Ku80, and was originally identified as an autoantigen recognized by the sera of patients with an autoimmune disease. In eukaryotes, the Ku heterodimer contributes to genomic integrity through its ability to bind DNA double-strand breaks and facilitate repair by non-homologous end-joining. The bacterial Ku homologs do not contain the conserved N-terminal extension that is present in the eukaryotic Ku protein." Q#13826 - CGI_10005536 superfamily 241578 28 244 6.34E-63 208.756 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers.
There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#13826 - CGI_10005536 superfamily 207684 570 605 0.000756736 37.7436 cl02640 SAP superfamily - - "SAP domain; The SAP (after SAF-A/B, Acinus and PIAS) motif is a putative DNA/RNA binding domain found in diverse nuclear and cytoplasmic proteins." Q#13828 - CGI_10005538 superfamily 241889 89 234 8.54E-57 183.986 cl00474 PAP2_like superfamily - - "PAP2_like proteins, a super-family of histidine phosphatases and vanadium haloperoxidases, includes type 2 phosphatidic acid phosphatase or lipid phosphate phosphatase (LPP), Glucose-6-phosphatase, Phosphatidylglycerophosphatase B and bacterial acid phosphatase, vanadium chloroperoxidases, vanadium bromoperoxidases, and several other mostly uncharacterized subfamilies. Several members of this superfamily have been predicted to be transmembrane proteins." Q#13830 - CGI_10005540 superfamily 241777 349 498 1.91E-37 138.892 cl00316 Cation_efflux superfamily N - "Cation efflux family; Members of this family are integral membrane proteins, that are found to increase tolerance to divalent metal ions such as cadmium, zinc, and cobalt. These proteins are thought to be efflux pumps that remove these ions from cells." Q#13830 - CGI_10005540 superfamily 241777 166 292 7.25E-29 114.239 cl00316 Cation_efflux superfamily C - "Cation efflux family; Members of this family are integral membrane proteins, that are found to increase tolerance to divalent metal ions such as cadmium, zinc, and cobalt. These proteins are thought to be efflux pumps that remove these ions from cells." 
Q#13831 - CGI_10005541 superfamily 241862 438 563 2.26E-23 99.3528 cl00437 COG0428 superfamily N - Predicted divalent heavy-metal cations transporter [Inorganic ion transport and metabolism] Q#13838 - CGI_10014948 superfamily 192535 71 189 3.24E-07 50.6722 cl18179 7TM_GPCR_Srsx superfamily C - Serpentine type 7TM GPCR chemoreceptor Srsx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srsx is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#13841 - CGI_10014951 superfamily 147626 77 106 0.000433474 36.8495 cl05227 DUF1519 superfamily N - Protein of unknown function (DUF1519); This family consists of several putative homing endonuclease proteins of around 245 residues in length which appear to be found exclusively in Naegleria species. The function of this family is unclear. Q#13842 - CGI_10014952 superfamily 222150 753 777 0.000511814 38.9121 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. 
Q#13843 - CGI_10014953 superfamily 247792 9 59 4.24E-08 50.1368 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#13845 - CGI_10014955 superfamily 247792 866 916 2.32E-08 52.448 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#13845 - CGI_10014955 superfamily 243056 143 303 9.05E-48 171.721 cl02495 RabGAP-TBC superfamily - - "Rab-GTPase-TBC domain; Identification of a TBC domain in GYP6_YEAST and GYP7_YEAST, which are GTPase activator proteins of yeast Ypt6 and Ypt7, implies that these domains are GTPase activator proteins of Rab-like small GTPases."
Q#13846 - CGI_10014956 superfamily 241647 6 36 4.35E-09 48.6782 cl00157 WW superfamily - - Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs. Q#13846 - CGI_10014956 superfamily 244886 45 153 4.67E-59 181.377 cl08278 Rotamase_2 superfamily - - PPIC-type PPIASE domain; PPIC-type PPIASE domain. Q#13847 - CGI_10014957 superfamily 243034 1127 1242 5.42E-16 75.8795 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the
cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#13847 - CGI_10014957 superfamily 243034 1066 1155 7.52E-08 51.9972 cl02429 TPR superfamily N - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#13847 - CGI_10014957 superfamily 243034 934 1049 2.28E-07 50.4564 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various
subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#13847 - CGI_10014957 superfamily 243034 1212 1282 2.97E-07 50.0712 cl02429 TPR superfamily C - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein
complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#13848 - CGI_10014958 superfamily 241795 445 575 3.24E-56 189.729 cl00335 NDPk superfamily - - "Nucleoside diphosphate kinases (NDP kinases, NDPks): NDP kinases, responsible for the synthesis of nucleoside triphosphates (NTPs), are involved in numerous regulatory processes associated with proliferation, development, and differentiation. They are vital for DNA/RNA synthesis, cell division, macromolecular metabolism and growth. The enzymes generate NTPs or their deoxy derivatives by terminal (gamma) phosphotransfer from an NTP such as ATP or GTP to any nucleoside diphosphate (NDP) or its deoxy derivative. The sequence of NDPk has been highly conserved through evolution. There is a single histidine residue conserved in all known NDK isozymes, which is involved in the catalytic mechanism. The first confirmed metastasis suppressor gene was the NDP kinase protein encoded by the nm23 gene. Unicellular organisms generally possess only one gene encoding NDP kinase, while most multicellular organisms possess not only an ortholog that provides most of the NDP kinase enzymatic activity but also multiple divergent paralogous genes. The human genome codes for at least nine NDP kinases and can be classified into two groups, Groups I and II, according to their genomic architecture and distinct enzymatic activity. Group I isoforms (A-D) are well-conserved, catalytically active, and share 58-88% identity between each other, while Group II are more divergent, with only NDPk6 shown to be active. 
NDP kinases exist in two different quaternary structures; all known eukaryotic enzymes are hexamers, while some bacterial enzymes are tetramers, as in Myxococcus. The hexamer can be viewed as trimer of dimers, while tetramers are dimers of dimers, with the dimerization interface conserved." Q#13848 - CGI_10014958 superfamily 241795 310 442 5.70E-56 188.958 cl00335 NDPk superfamily - - "Nucleoside diphosphate kinases (NDP kinases, NDPks): NDP kinases, responsible for the synthesis of nucleoside triphosphates (NTPs), are involved in numerous regulatory processes associated with proliferation, development, and differentiation. They are vital for DNA/RNA synthesis, cell division, macromolecular metabolism and growth. The enzymes generate NTPs or their deoxy derivatives by terminal (gamma) phosphotransfer from an NTP such as ATP or GTP to any nucleoside diphosphate (NDP) or its deoxy derivative. The sequence of NDPk has been highly conserved through evolution. There is a single histidine residue conserved in all known NDK isozymes, which is involved in the catalytic mechanism. The first confirmed metastasis suppressor gene was the NDP kinase protein encoded by the nm23 gene. Unicellular organisms generally possess only one gene encoding NDP kinase, while most multicellular organisms possess not only an ortholog that provides most of the NDP kinase enzymatic activity but also multiple divergent paralogous genes. The human genome codes for at least nine NDP kinases and can be classified into two groups, Groups I and II, according to their genomic architecture and distinct enzymatic activity. Group I isoforms (A-D) are well-conserved, catalytically active, and share 58-88% identity between each other, while Group II are more divergent, with only NDPk6 shown to be active. NDP kinases exist in two different quaternary structures; all known eukaryotic enzymes are hexamers, while some bacterial enzymes are tetramers, as in Myxococcus. 
The hexamer can be viewed as trimer of dimers, while tetramers are dimers of dimers, with the dimerization interface conserved." Q#13848 - CGI_10014958 superfamily 241795 161 295 1.11E-46 163.535 cl00335 NDPk superfamily - - "Nucleoside diphosphate kinases (NDP kinases, NDPks): NDP kinases, responsible for the synthesis of nucleoside triphosphates (NTPs), are involved in numerous regulatory processes associated with proliferation, development, and differentiation. They are vital for DNA/RNA synthesis, cell division, macromolecular metabolism and growth. The enzymes generate NTPs or their deoxy derivatives by terminal (gamma) phosphotransfer from an NTP such as ATP or GTP to any nucleoside diphosphate (NDP) or its deoxy derivative. The sequence of NDPk has been highly conserved through evolution. There is a single histidine residue conserved in all known NDK isozymes, which is involved in the catalytic mechanism. The first confirmed metastasis suppressor gene was the NDP kinase protein encoded by the nm23 gene. Unicellular organisms generally possess only one gene encoding NDP kinase, while most multicellular organisms possess not only an ortholog that provides most of the NDP kinase enzymatic activity but also multiple divergent paralogous genes. The human genome codes for at least nine NDP kinases and can be classified into two groups, Groups I and II, according to their genomic architecture and distinct enzymatic activity. Group I isoforms (A-D) are well-conserved, catalytically active, and share 58-88% identity between each other, while Group II are more divergent, with only NDPk6 shown to be active. NDP kinases exist in two different quaternary structures; all known eukaryotic enzymes are hexamers, while some bacterial enzymes are tetramers, as in Myxococcus. The hexamer can be viewed as trimer of dimers, while tetramers are dimers of dimers, with the dimerization interface conserved." 
Q#13848 - CGI_10014958 superfamily 241832 11 112 5.49E-39 140.934 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#13848 - CGI_10014958 superfamily 220249 787 854 3.41E-19 83.4236 cl09695 H_lectin superfamily - - "H-type lectin domain; The H-type lectin domain is a unit of six beta chains, combined into a homo-hexamer. It is involved in self/non-self recognition of cells, through binding with carbohydrates. It is sometimes found in association with the F5_F8_type_C domain pfam00754." Q#13849 - CGI_10014959 superfamily 220249 54 121 1.80E-17 71.8676 cl09695 H_lectin superfamily - - "H-type lectin domain; The H-type lectin domain is a unit of six beta chains, combined into a homo-hexamer. It is involved in self/non-self recognition of cells, through binding with carbohydrates. It is sometimes found in association with the F5_F8_type_C domain pfam00754." 
Q#13850 - CGI_10014960 superfamily 245230 2 305 5.69E-63 214.099 cl10017 Tubulin_FtsZ superfamily C - "Tubulin/FtsZ: Family includes tubulin alpha-, beta-, gamma-, delta-, and epsilon-tubulins as well as FtsZ, all of which are involved in polymer formation. Tubulin is the major component of microtubules, but also exists as a heterodimer and as a curved oligomer. Microtubules exist in all eukaryotic cells and are responsible for many functions, including cellular transport, cell motility, and mitosis. FtsZ forms a ring-shaped septum at the site of bacterial cell division, which is required for constriction of cell membrane and cell envelope to yield two daughter cells. FtsZ can polymerize into tubes, sheets, and rings in vitro and is ubiquitous in eubacteria, archaea, and chloroplasts." Q#13850 - CGI_10014960 superfamily 245230 258 356 2.14E-09 58.0931 cl10017 Tubulin_FtsZ superfamily N - "Tubulin/FtsZ: Family includes tubulin alpha-, beta-, gamma-, delta-, and epsilon-tubulins as well as FtsZ, all of which are involved in polymer formation. Tubulin is the major component of microtubules, but also exists as a heterodimer and as a curved oligomer. Microtubules exist in all eukaryotic cells and are responsible for many functions, including cellular transport, cell motility, and mitosis. FtsZ forms a ring-shaped septum at the site of bacterial cell division, which is required for constriction of cell membrane and cell envelope to yield two daughter cells. FtsZ can polymerize into tubes, sheets, and rings in vitro and is ubiquitous in eubacteria, archaea, and chloroplasts." Q#13851 - CGI_10014961 superfamily 247743 1317 1402 0.00759381 38.6663 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. 
The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#13851 - CGI_10014961 superfamily 193256 2299 2562 2.73E-64 223.671 cl18189 AAA_8 superfamily - - "P-loop containing dynein motor region D4; The 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This particular family is the D4 ATP-binding region of the motor." Q#13851 - CGI_10014961 superfamily 193251 1926 2199 4.90E-50 182.443 cl18188 AAA_7 superfamily - - "P-loop containing dynein motor region D3; the 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This particular family is the D3 and is an ATP binding site." Q#13851 - CGI_10014961 superfamily 193257 2942 3157 8.69E-45 165.544 cl15086 AAA_9 superfamily - - "ATP-binding dynein motor region D5; The 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. 
This particular family is the D5 ATP-binding region of the motor, but has lost its P-loop." Q#13851 - CGI_10014961 superfamily 193253 2574 2922 1.61E-39 153.654 cl15084 MT superfamily - - "Microtubule-binding stalk of dynein motor; the 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This family is the region between D4 and D5 and is the two predicted alpha-helical coiled coil segments that form the stalk supporting the ATP-sensitive microtubule binding component." Q#13851 - CGI_10014961 superfamily 247743 1593 1736 3.13E-06 48.8308 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#13855 - CGI_10005607 superfamily 198825 62 134 4.46E-30 105.959 cl03763 CaMBD superfamily - - "Calmodulin binding domain; Small-conductance Ca2+-activated K+ channels (SK channels) are independent of voltage and gated solely by intracellular Ca2+. These membrane channels are heteromeric complexes that comprise pore-forming alpha-subunits and the Ca2+-binding protein calmodulin (CaM). 
CaM binds to the SK channel through the CaM-binding domain (CaMBD), which is located in an intracellular region of the alpha-subunit immediately carboxy-terminal to the pore. Channel opening is triggered when Ca2+ binds the EF hands in the N-lobe of CaM. The structure of this domain complexed with CaM is known. This domain forms an elongated dimer with a CaM molecule bound at each end; each CaM wraps around three alpha-helices, two from one CaMBD subunit and one from the other." Q#13855 - CGI_10005607 superfamily 219619 4 44 1.28E-08 47.9728 cl18518 Ion_trans_2 superfamily N - Ion channel; This family includes the two membrane helix type ion channels found in bacteria. Q#13858 - CGI_10011151 superfamily 241792 3 141 7.33E-82 240.534 cl00332 Ribosomal_S11 superfamily - - Ribosomal protein S11; Ribosomal protein S11. Q#13859 - CGI_10011152 superfamily 241974 525 617 3.36E-19 83.8302 cl00604 STAS superfamily - - "Sulphate Transporter and Anti-Sigma factor antagonist domain found in the C-terminal region of sulphate transporters as well as in bacterial and archaeal proteins involved in the regulation of sigma factors; The STAS (Sulphate Transporter and Anti-Sigma factor antagonist) domain is found in the C-terminal region of sulphate transporters as well as in bacterial and archaeal proteins involved in the regulation of sigma factors, like anti-anti-sigma factors and "stressosome" components. The sigma factor regulators are involved in protein-protein interaction which is regulated by phosphorylation." Q#13859 - CGI_10011152 superfamily 216188 194 488 2.42E-54 188.196 cl18360 Sulfate_transp superfamily - - Sulfate transporter family; Mutations in human SLC26A2 lead to several human diseases. Q#13859 - CGI_10011152 superfamily 205965 78 158 4.85E-36 130.61 cl18285 Sulfate_tra_GLY superfamily - - "Sulfate transporter N-terminal domain with GLY motif; This domain is found usually at the N-terminus of sulfate-transporter proteins. 
It carries a highly conserved GLY sequence motif, but the function of the domain is not known." Q#13860 - CGI_10011153 superfamily 248020 23 353 1.12E-35 139.907 cl17466 Sulfatase superfamily - - Sulfatase; Sulfatase. Q#13861 - CGI_10011154 superfamily 248020 23 353 1.68E-37 140.678 cl17466 Sulfatase superfamily - - Sulfatase; Sulfatase. Q#13862 - CGI_10011155 superfamily 246925 29 171 8.30E-12 65.4545 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#13863 - CGI_10011156 superfamily 246925 19 158 0.000108946 40.0314 cl15309 LRR_RI superfamily NC - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#13864 - CGI_10011157 superfamily 241554 2 164 2.01E-26 103.504 cl00019 Macro superfamily - - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. 
Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#13865 - CGI_10011158 superfamily 241613 34 67 4.56E-09 53.7498 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#13865 - CGI_10011158 superfamily 241613 233 267 9.66E-09 52.5942 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native 
disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#13865 - CGI_10011158 superfamily 241613 74 107 1.31E-08 52.209 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#13865 - CGI_10011158 superfamily 241613 151 181 2.18E-07 48.7422 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#13865 - CGI_10011158 superfamily 241613 199 229 3.35E-06 45.2754 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central 
role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#13865 - CGI_10011158 superfamily 245213 353 384 0.00106566 38.0014 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13865 - CGI_10011158 superfamily 214531 472 511 2.29E-10 57.6116 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#13865 - CGI_10011158 superfamily 214531 513 554 2.08E-09 54.9153 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. 
Q#13865 - CGI_10011158 superfamily 214531 560 601 3.77E-09 54.1449 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#13865 - CGI_10011158 superfamily 214531 425 460 0.000751318 38.3517 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#13865 - CGI_10011158 superfamily 215683 622 662 0.00542738 35.9939 cl18339 Ldl_recept_b superfamily - - Low-density lipoprotein receptor repeat class B; This domain is also known as the YWTD motif after the most conserved region of the repeat. The YWTD repeat is found in multiple tandem repeats and has been predicted to form a beta-propeller structure. Q#13866 - CGI_10011159 superfamily 247692 68 432 3.85E-158 460.423 cl17068 AFD_class_I superfamily N - "Adenylate forming domain, Class I; This family includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain." Q#13867 - CGI_10011160 superfamily 247724 132 351 3.31E-142 412.655 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. 
The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#13867 - CGI_10011160 superfamily 243184 446 553 7.05E-57 187.494 cl02786 Translation_factor_III superfamily - - "Domain III of Elongation factor (EF) Tu (EF-TU) and EF-G. Elongation factors (EF) EF-Tu and EF-G participate in the elongation phase during protein biosynthesis on the ribosome. Their functional cycles depend on GTP binding and its hydrolysis. The EF-Tu complexed with GTP and aminoacyl-tRNA delivers tRNA to the ribosome, whereas EF-G stimulates translocation, a process in which tRNA and mRNA movements occur in the ribosome. 
Experimental data showed that: (1) intrinsic GTPase activity of EF-G is influenced by excision of its domain III; (2) that EF-G lacking domain III has a 1,000-fold decreased GTPase activity on the ribosome and, a slightly decreased affinity for GTP; and (3) EF-G lacking domain III does not stimulate translocation, despite the physical presence of domain IV which is also very important for translocation. These findings indicate an essential contribution of domain III to activation of GTP hydrolysis. Domains III and V of EF-G have the same fold (although they are not completely superimposable), the double split beta-alpha-beta fold. This fold is observed in a large number of ribonucleotide binding proteins and is also referred to as the ribonucleoprotein (RNP) or RNA recognition (RRM) motif. This domain III is found in several elongation factors, as well as in peptide chain release factors and in GT-1 family of GTPase (GTPBP1)." Q#13867 - CGI_10011160 superfamily 243185 359 440 1.20E-36 130.748 cl02787 Translation_Factor_II_like superfamily - - "Translation_Factor_II_like: Elongation factor Tu (EF-Tu) domain II-like proteins. Elongation factor Tu consists of three structural domains, this family represents the second domain. Domain II adopts a beta barrel structure and is involved in binding to charged tRNA. Domain II is found in other proteins such as elongation factor G and translation initiation factor IF-2. This group also includes the C2 subdomain of domain IV of IF-2 that has the same fold as domain II of (EF-Tu). Like IF-2 from certain prokaryotes such as Thermus thermophilus, mitochondrial IF-2 lacks domain II, which is thought to be involved in binding of E.coli IF-2 to 30S subunits." 
Q#13868 - CGI_10011161 superfamily 227527 8 224 1.96E-35 132.012 cl15315 LUC7 superfamily - - "U1 snRNP component, mediates U1 snRNP association with cap-binding complex [RNA processing and modification]" Q#13869 - CGI_10011162 superfamily 114045 836 902 6.91E-05 45.0126 cl15903 Herpes_LMP1 superfamily NC - "Herpesvirus latent membrane protein 1 (LMP1); This family consists of several latent membrane protein 1 or LMP1s mostly from Epstein-Barr virus. LMP1 of EBV is a 62-65 kDa plasma membrane protein possessing six membrane spanning regions, a short cytoplasmic N-terminus and a long cytoplasmic carboxy tail of 200 amino acids. EBV latent membrane protein 1 (LMP1) is essential for EBV-mediated transformation and has been associated with several cases of malignancies. EBV-like viruses in Cynomolgus monkeys (Macaca fascicularis) have been associated with high lymphoma rates in immunosuppressed monkeys" Q#13869 - CGI_10011162 superfamily 114045 720 817 0.000691354 41.931 cl15903 Herpes_LMP1 superfamily N - "Herpesvirus latent membrane protein 1 (LMP1); This family consists of several latent membrane protein 1 or LMP1s mostly from Epstein-Barr virus. LMP1 of EBV is a 62-65 kDa plasma membrane protein possessing six membrane spanning regions, a short cytoplasmic N-terminus and a long cytoplasmic carboxy tail of 200 amino acids. EBV latent membrane protein 1 (LMP1) is essential for EBV-mediated transformation and has been associated with several cases of malignancies. 
EBV-like viruses in Cynomolgus monkeys (Macaca fascicularis) have been associated with high lymphoma rates in immunosuppressed monkeys" Q#13870 - CGI_10011163 superfamily 243092 96 211 9.66E-06 45.7888 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#13872 - CGI_10011165 superfamily 241640 219 389 0.004801 36.0113 cl00149 Tryp_SPc superfamily - - Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues. 
Q#13874 - CGI_10011167 superfamily 241640 195 366 0.00268923 36.7817 cl00149 Tryp_SPc superfamily - - Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues. Q#13875 - CGI_10011168 superfamily 219740 194 351 1.42E-06 48.9546 cl06992 Peptidase_S64 superfamily N - "Peptidase family S64; This family of fungal proteins is involved in the processing of membrane bound transcription factor Stp1. The processing causes the signalling domain of Stp1 to be passed to the nucleus where several permease genes are induced. The permeases are important for uptake of amino acids, and processing of Stp1 only occurs in an amino acid-rich environment. This family is predicted to be distantly related to the trypsin family (MEROPS:S1) and to have a typical trypsin-like catalytic triad." Q#13876 - CGI_10000778 superfamily 243092 2 52 0.000601605 36.544 cl02567 WD40 superfamily NC - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to 
create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#13877 - CGI_10001266 superfamily 247684 9 100 4.84E-14 66.1467 cl17037 NBD_sugar-kinase_HSP70_actin superfamily C - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#13878 - CGI_10001267 superfamily 247684 34 253 1.57E-26 107.378 cl17037 NBD_sugar-kinase_HSP70_actin superfamily N - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#13879 - CGI_10018717 superfamily 243124 147 205 1.15E-12 62.0593 cl02648 NIDO superfamily C - Nidogen-like; This is a nidogen-like domain (NIDO) domain and is an extracellular domain found in nidogen and hypothetical proteins of unknown function. Q#13880 - CGI_10018718 superfamily 241886 5 234 1.18E-64 205.871 cl00470 Aldo_ket_red superfamily C - "Aldo-keto reductases (AKRs) are a superfamily of soluble NAD(P)(H) oxidoreductases whose chief purpose is to reduce aldehydes and ketones to primary and secondary alcohols. 
AKRs are present in all phyla and are of importance to both health and industrial applications. Members have very distinct functions and include the prokaryotic 2,5-diketo-D-gluconic acid reductases and beta-keto ester reductases, the eukaryotic aldose reductases, aldehyde reductases, hydroxysteroid dehydrogenases, steroid 5beta-reductases, potassium channel beta-subunits and aflatoxin aldehyde reductases, among others." Q#13886 - CGI_10018724 superfamily 245847 78 193 0.000697851 37.8672 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#13887 - CGI_10018725 superfamily 241610 1 52 4.01E-16 67.275 cl00101 KU superfamily - - BPTI/Kunitz family of serine protease inhibitors; Structure is a disulfide rich alpha+beta fold. BPTI (bovine pancreatic trypsin inhibitor) is an extensively studied model structure. Q#13888 - CGI_10018726 superfamily 247724 165 375 3.50E-15 73.3497 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. 
The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#13891 - CGI_10004715 superfamily 247792 275 315 1.09E-07 47.7596 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#13891 - CGI_10004715 superfamily 248312 25 143 2.25E-05 42.7344 cl17758 PMP22_Claudin superfamily C - PMP-22/EMP/MP20/Claudin family; PMP-22/EMP/MP20/Claudin family. Q#13892 - CGI_10004716 superfamily 248312 26 164 5.55E-05 40.4232 cl17758 PMP22_Claudin superfamily - - PMP-22/EMP/MP20/Claudin family; PMP-22/EMP/MP20/Claudin family. Q#13893 - CGI_10004717 superfamily 207690 382 402 0.00580427 34.6009 cl02656 zf-RanBP superfamily - - Zn-finger in Ran binding protein and others; Zn-finger in Ran binding protein and others. Q#13894 - CGI_10004718 superfamily 241628 250 429 1.47E-56 192.442 cl00130 PseudoU_synth superfamily C - "Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi); Pseudouridine synthases contains the RsuA/RluD, TruA, TruB and TruD families. This group consists of eukaryotic, bacterial and archeal pseudouridine synthases. 
Some psi sites such as psi55,13,38 and 39 in tRNA are highly conserved, being in the same position in eubacteria, archeabacteria and eukaryotes. Other psi sites occur in a more restricted fashion, for example psi2604in 23S RNA made by E.coli RluF has only been detected in E.coli. Human dyskerin with the help of guide RNAs makes the hundreds of psueudouridnes present in rRNA and small nuclear RNAs (snRNAs). Mutations in human dyskerin cause X-linked dyskeratosis congenitas. Missense mutation in human PUS1 causes mitochondrial myopathy and sideroblastic anemia (MLASA)." Q#13894 - CGI_10004718 superfamily 241628 107 135 3.89E-09 56.4662 cl00130 PseudoU_synth superfamily C - "Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi); Pseudouridine synthases contains the RsuA/RluD, TruA, TruB and TruD families. This group consists of eukaryotic, bacterial and archeal pseudouridine synthases. Some psi sites such as psi55,13,38 and 39 in tRNA are highly conserved, being in the same position in eubacteria, archeabacteria and eukaryotes. Other psi sites occur in a more restricted fashion, for example psi2604in 23S RNA made by E.coli RluF has only been detected in E.coli. Human dyskerin with the help of guide RNAs makes the hundreds of psueudouridnes present in rRNA and small nuclear RNAs (snRNAs). Mutations in human dyskerin cause X-linked dyskeratosis congenitas. Missense mutation in human PUS1 causes mitochondrial myopathy and sideroblastic anemia (MLASA)." Q#13895 - CGI_10004719 superfamily 241628 2 238 1.20E-55 185.123 cl00130 PseudoU_synth superfamily N - "Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi); Pseudouridine synthases contains the RsuA/RluD, TruA, TruB and TruD families. This group consists of eukaryotic, bacterial and archeal pseudouridine synthases. 
Some psi sites such as psi55,13,38 and 39 in tRNA are highly conserved, being in the same position in eubacteria, archeabacteria and eukaryotes. Other psi sites occur in a more restricted fashion, for example psi2604in 23S RNA made by E.coli RluF has only been detected in E.coli. Human dyskerin with the help of guide RNAs makes the hundreds of psueudouridnes present in rRNA and small nuclear RNAs (snRNAs). Mutations in human dyskerin cause X-linked dyskeratosis congenitas. Missense mutation in human PUS1 causes mitochondrial myopathy and sideroblastic anemia (MLASA)." Q#13896 - CGI_10004720 superfamily 217473 516 677 3.30E-18 85.4945 cl03978 Mab-21 superfamily N - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#13897 - CGI_10004721 superfamily 216981 145 268 4.07E-13 65.2466 cl17087 OTU superfamily - - "OTU-like cysteine protease; This family is comprised of a group of predicted cysteine proteases, homologous to the Ovarian Tumour (OTU) gene in Drosophila. Members include proteins from eukaryotes, viruses and pathogenic bacterium. The conserved cysteine and histidine, and possibly the aspartate, represent the catalytic residues in this putative group of proteases." Q#13898 - CGI_10004722 superfamily 247724 5 123 5.52E-62 189.847 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#13900 - CGI_10003351 superfamily 241567 10 148 5.93E-31 114.235 cl00042 CASc superfamily C - "Caspase, interleukin-1 beta converting enzyme (ICE) homologues; Cysteine-dependent aspartate-directed proteases that mediate programmed cell death (apoptosis). Caspases are synthesized as inactive zymogens and activated by proteolysis of the peptide backbone adjacent to an aspartate. The resulting two subunits associate to form an (alpha)2(beta)2-tetramer which is the active enzyme. Activation of caspases can be mediated by other caspase homologs." Q#13901 - CGI_10003352 superfamily 241567 378 512 1.07E-27 112.694 cl00042 CASc superfamily C - "Caspase, interleukin-1 beta converting enzyme (ICE) homologues; Cysteine-dependent aspartate-directed proteases that mediate programmed cell death (apoptosis). Caspases are synthesized as inactive zymogens and activated by proteolysis of the peptide backbone adjacent to an aspartate. The resulting two subunits associate to form an (alpha)2(beta)2-tetramer which is the active enzyme. Activation of caspases can be mediated by other caspase homologs." 
Q#13901 - CGI_10003352 superfamily 241567 14 156 1.48E-26 109.227 cl00042 CASc superfamily C - "Caspase, interleukin-1 beta converting enzyme (ICE) homologues; Cysteine-dependent aspartate-directed proteases that mediate programmed cell death (apoptosis). Caspases are synthesized as inactive zymogens and activated by proteolysis of the peptide backbone adjacent to an aspartate. The resulting two subunits associate to form an (alpha)2(beta)2-tetramer which is the active enzyme. Activation of caspases can be mediated by other caspase homologs." Q#13901 - CGI_10003352 superfamily 241567 663 753 3.27E-09 56.8399 cl00042 CASc superfamily N - "Caspase, interleukin-1 beta converting enzyme (ICE) homologues; Cysteine-dependent aspartate-directed proteases that mediate programmed cell death (apoptosis). Caspases are synthesized as inactive zymogens and activated by proteolysis of the peptide backbone adjacent to an aspartate. The resulting two subunits associate to form an (alpha)2(beta)2-tetramer which is the active enzyme. Activation of caspases can be mediated by other caspase homologs." Q#13901 - CGI_10003352 superfamily 241567 242 355 1.96E-06 48.3655 cl00042 CASc superfamily N - "Caspase, interleukin-1 beta converting enzyme (ICE) homologues; Cysteine-dependent aspartate-directed proteases that mediate programmed cell death (apoptosis). Caspases are synthesized as inactive zymogens and activated by proteolysis of the peptide backbone adjacent to an aspartate. The resulting two subunits associate to form an (alpha)2(beta)2-tetramer which is the active enzyme. Activation of caspases can be mediated by other caspase homologs." Q#13902 - CGI_10003353 superfamily 241787 31 85 1.92E-07 43.7374 cl00326 Ribosomal_L23 superfamily C - Ribosomal protein L23; Ribosomal protein L23. 
Q#13903 - CGI_10003354 superfamily 245084 1 318 0 576.02 cl09506 catalase_like superfamily N - "Catalase-like heme-binding proteins and protein domains; Catalase is a ubiquitous enzyme found in both prokaryotes and eukaryotes involved in the protection of cells from the toxic effects of peroxides. It catalyses the conversion of hydrogen peroxide to water and molecular oxygen. Several other related protein families share the catalase fold and bind to heme, but do not necessarily have catalase activity." Q#13904 - CGI_10003355 superfamily 245084 65 494 0 849.897 cl09506 catalase_like superfamily - - "Catalase-like heme-binding proteins and protein domains; Catalase is a ubiquitous enzyme found in both prokaryotes and eukaryotes involved in the protection of cells from the toxic effects of peroxides. It catalyses the conversion of hydrogen peroxide to water and molecular oxygen. Several other related protein families share the catalase fold and bind to heme, but do not necessarily have catalase activity." 
Q#13907 - CGI_10015305 superfamily 247905 47 91 1.86E-05 41.0693 cl17351 HELICc superfamily C - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#13908 - CGI_10015306 superfamily 247905 2 38 2.51E-06 43.7397 cl17351 HELICc superfamily N - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#13912 - CGI_10015310 superfamily 241563 60 99 7.56E-06 43.622 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#13912 - CGI_10015310 superfamily 110440 521 547 0.00327917 35.8465 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is me diated by the NHL repeats." Q#13917 - CGI_10015315 superfamily 247905 102 146 0.000123067 39.9137 cl17351 HELICc superfamily C - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#13918 - CGI_10015316 superfamily 245818 242 389 2.55E-19 85.3015 cl11966 Rel-Spo_like superfamily - - "RelA- and SpoT-like ppGpp Synthetases and Hydrolases, catalytic domain; The Rel-Spo superfamily includes the catalytic domains of Escherichia coli ppGpp synthetase (RelA), ppGpp synthetase/hydrolase (SpoT), and related proteins. RelA synthesizes (p)ppGpp in response to amino-acid starvation and in association with ribosomes. (p)ppGpp triggers the bacterial stringent response. SpoT catalyzes (p)ppGpp synthesis under carbon limitation in a ribosome-independent manner. 
It also catalyzes (p)ppGpp degradation. Gram-negative bacteria have two enzymes involved in (p)ppGpp metabolism while most Gram-positive organisms have a single Rel-Spo enzyme (Rel), which both synthesizes and degrades (p)ppGpp. The Arabidopsis thaliana Rel-Spo proteins, At-RSH1,-2, and-3 appear to regulate a rapid (p)ppGpp-mediated response to pathogens and other stresses. This catalytic domain is found in association with an N-terminal HD domain and a C-terminal metal dependent phosphohydrolase domain (TGS). Some Rel-Spo proteins also have a C-terminal regulatory ACT domain." Q#13918 - CGI_10015316 superfamily 247723 51 129 5.27E-13 65.8742 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#13918 - CGI_10015316 superfamily 217750 480 536 0.000215548 40.255 cl04280 PAP_assoc superfamily - - Cid1 family poly A polymerase; This domain is found in poly(A) polymerases and has been shown to have polynucleotide adenylyltransferase activity. Proteins in this family have been located to both the nucleus and the cytoplasm. 
Q#13919 - CGI_10015317 superfamily 241599 240 289 3.32E-06 43.7713 cl00084 homeodomain superfamily C - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#13920 - CGI_10015318 superfamily 245818 157 278 8.87E-16 71.8195 cl11966 Rel-Spo_like superfamily - - "RelA- and SpoT-like ppGpp Synthetases and Hydrolases, catalytic domain; The Rel-Spo superfamily includes the catalytic domains of Escherichia coli ppGpp synthetase (RelA), ppGpp synthetase/hydrolase (SpoT), and related proteins. RelA synthesizes (p)ppGpp in response to amino-acid starvation and in association with ribosomes. (p)ppGpp triggers the bacterial stringent response. SpoT catalyzes (p)ppGpp synthesis under carbon limitation in a ribosome-independent manner. It also catalyzes (p)ppGpp degradation. Gram-negative bacteria have two enzymes involved in (p)ppGpp metabolism while most Gram-positive organisms have a single Rel-Spo enzyme (Rel), which both synthesizes and degrades (p)ppGpp. The Arabidopsis thaliana Rel-Spo proteins, At-RSH1,-2, and-3 appear to regulate a rapid (p)ppGpp-mediated response to pathogens and other stresses. This catalytic domain is found in association with an N-terminal HD domain and a C-terminal metal dependent phosphohydrolase domain (TGS). Some Rel-Spo proteins also have a C-terminal regulatory ACT domain." Q#13922 - CGI_10015320 superfamily 238191 14 157 2.32E-22 97.788 cl18907 Esterase_lipase superfamily NC - "Esterases and lipases (includes fungal lipases, cholinesterases, etc.) These enzymes act on carboxylic esters (EC: 3.1.1.-). The catalytic apparatus involves three residues (catalytic triad): a serine, a glutamate or aspartate and a histidine.These catalytic residues are responsible for the nucleophilic attack on the carbonyl carbon atom of the ester bond. 
In contrast with other alpha/beta hydrolase fold family members, p-nitrobenzyl esterase and acetylcholine esterase have a Glu instead of Asp at the active site carboxylate." Q#13925 - CGI_10015323 superfamily 227463 7 274 1.99E-27 109.399 cl12195 COG5134 superfamily - - Uncharacterized conserved protein [Function unknown] Q#13926 - CGI_10015324 superfamily 248014 709 893 3.97E-53 183.152 cl17460 Csf4_U superfamily - - CRISPR/Cas system-associated DinG family helicase Csf4; CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; DinG family DNA helicase Q#13926 - CGI_10015324 superfamily 219153 246 432 4.44E-50 174.851 cl15854 DEAD_2 superfamily - - "DEAD_2; This represents a conserved region within a number of RAD3-like DNA-binding helicases that are seemingly ubiquitous - members include proteins of eukaryotic, bacterial and archaeal origin. RAD3 is involved in nucleotide excision repair, and forms part of the transcription factor TFIIH in yeast." 
Q#13926 - CGI_10015324 superfamily 248014 44 85 0.00506662 39.1863 cl17460 Csf4_U superfamily C - CRISPR/Cas system-associated DinG family helicase Csf4; CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; DinG family DNA helicase Q#13927 - CGI_10015325 superfamily 243175 123 194 1.04E-20 83.4443 cl02776 GST_C_family superfamily - - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. 
In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." Q#13927 - CGI_10015325 superfamily 241832 1 43 4.80E-05 39.9146 cl00388 Thioredoxin_like superfamily N - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#13928 - CGI_10015326 superfamily 207613 551 586 9.45E-10 55.0177 cl02491 VHP superfamily - - Villin headpiece domain; Villin headpiece domain. Q#13933 - CGI_10015331 superfamily 247692 244 590 0 552.898 cl17068 AFD_class_I superfamily - - "Adenylate forming domain, Class I; This family includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. 
The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain." Q#13933 - CGI_10015331 superfamily 247692 88 178 1.43E-23 102.295 cl17068 AFD_class_I superfamily C - "Adenylate forming domain, Class I; This family includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain." 
Q#13934 - CGI_10015332 superfamily 243092 58 346 3.60E-10 60.0412 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#13936 - CGI_10015334 superfamily 245213 62 85 0.00163161 34.7598 cl09941 EGF_CA superfamily C - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#13937 - CGI_10001870 superfamily 247866 320 471 0.00452365 37.8172 cl17312 PhyH superfamily C - "Phytanoyl-CoA dioxygenase (PhyH); This family is made up of several eukaryotic phytanoyl-CoA dioxygenase (PhyH) proteins, ectoine hydroxylases and a number of bacterial deoxygenases. PhyH is a peroxisomal enzyme catalyzing the first step of phytanic acid alpha-oxidation. PhyH deficiency causes Refsum's disease (RD) which is an inherited neurological syndrome biochemically characterized by the accumulation of phytanic acid in plasma and tissues." Q#13940 - CGI_10001220 superfamily 243051 57 143 1.09E-17 75.1069 cl02479 MAM superfamily N - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#13941 - CGI_10003739 superfamily 216981 51 140 9.46E-05 40.2086 cl17087 OTU superfamily C - "OTU-like cysteine protease; This family is comprised of a group of predicted cysteine proteases, homologous to the Ovarian Tumour (OTU) gene in Drosophila. Members include proteins from eukaryotes, viruses and pathogenic bacterium. 
The conserved cysteine and histidine, and possibly the aspartate, represent the catalytic residues in this putative group of proteases." Q#13942 - CGI_10003741 superfamily 227778 25 129 0.000208668 39.7915 cl17122 VPS24 superfamily C - Conserved protein implicated in secretion [Cell motility and secretion] Q#13943 - CGI_10003742 superfamily 243072 59 189 8.05E-31 114.788 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#13943 - CGI_10003742 superfamily 243072 140 271 7.71E-23 92.8318 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#13943 - CGI_10003742 superfamily 243073 361 400 1.06E-14 67.8805 cl02533 SOCS superfamily - - "SOCS (suppressors of cytokine signaling) box. The SOCS box is found in the C-terminal region of CIS/SOCS family proteins (in combination with a SH2 domain), ASBs (ankyrin repeat-containing proteins with a SOCS box), SSBs (SPRY domain-containing proteins with a SOCS box), and WSBs (WD40 repeat-containing proteins with a SOCS box), as well as, other miscellaneous proteins. The function of the SOCS box is the recruitment of the ubiquitin-transferase system. 
The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions." Q#13944 - CGI_10003743 superfamily 243130 805 844 2.52E-06 45.5339 cl02655 CUE superfamily - - "CUE domain; CUE domains have been shown to bind ubiquitin. It has been suggested that CUE domains are related to pfam00627 and this has been confirmed by the structure of the domain. CUE domains also occur in two protein of the IL-1 signal transduction pathway, tollip and TAB2." Q#13944 - CGI_10003743 superfamily 245342 353 454 4.25E-06 45.8623 cl10594 ERCC4 superfamily - - ERCC4 domain; This domain is a family of nucleases. The family includes EME1 which is an essential component of a Holliday junction resolvase. EME1 interacts with MUS81 to form a DNA structure-specific endonuclease. Q#13945 - CGI_10003744 superfamily 202224 1473 1573 3.55E-11 62.3132 cl18224 JmjC superfamily - - "JmjC domain, hydroxylase; The JmjC domain belongs to the Cupin superfamily. JmjC-domain proteins may be protein hydroxylases that catalyze a novel histone modification. This is confirmed to be a hydroxylase: the human JmjC protein named Tyw5p unexpectedly acts in the biosynthesis of a hypermodified nucleoside, hydroxy-wybutosine, in tRNA-Phe by catalyzing hydroxylation." Q#13945 - CGI_10003744 superfamily 214721 1387 1439 0.000374724 40.3132 cl18313 JmjC superfamily - - "A domain family that is part of the cupin metalloenzyme superfamily; Probable enzymes, but of unknown functions, that regulate chromatin reorganisation processes (Clissold and Ponting, in press)." 
Q#13950 - CGI_10023604 superfamily 207794 326 771 0 574.24 cl02948 GH20_hexosaminidase superfamily - - "Beta-N-acetylhexosaminidases of glycosyl hydrolase family 20 (GH20) catalyze the removal of beta-1,4-linked N-acetyl-D-hexosamine residues from the non-reducing ends of N-acetyl-beta-D-hexosaminides including N-acetylglucosides and N-acetylgalactosides. These enzymes are broadly distributed in microorganisms, plants and animals, and play roles in various key physiological and pathological processes. These processes include cell structural integrity, energy storage, cellular signaling, fertilization, pathogen defense, viral penetration, the development of carcinomas, inflammatory events and lysosomal storage disorders. The GH20 enzymes include the eukaryotic beta-N-acetylhexosaminidases A and B, the bacterial chitobiases, dispersin B, and lacto-N-biosidase. The GH20 hexosaminidases are thought to act via a catalytic mechanism in which the catalytic nucleophile is not provided by the solvent or the enzyme, but by the substrate itself." Q#13950 - CGI_10023604 superfamily 243574 873 1031 1.13E-28 114.349 cl03918 CHB_HEX superfamily - - Putative carbohydrate binding domain; This domain represents the N terminal domain in chitobiases and beta-hexosaminidases EC:3.2.1.52. It is composed of a beta sandwich structure that is similar in structure to the cellulose binding domain of cellulase from Cellulomonas fimi. This suggests that this may be a carbohydrate binding domain. Q#13950 - CGI_10023604 superfamily 243574 28 187 9.94E-21 91.2371 cl03918 CHB_HEX superfamily - - Putative carbohydrate binding domain; This domain represents the N terminal domain in chitobiases and beta-hexosaminidases EC:3.2.1.52. It is composed of a beta sandwich structure that is similar in structure to the cellulose binding domain of cellulase from Cellulomonas fimi. This suggests that this may be a carbohydrate binding domain. 
Q#13950 - CGI_10023604 superfamily 207794 1167 1203 1.48E-12 70.0133 cl02948 GH20_hexosaminidase superfamily C - "Beta-N-acetylhexosaminidases of glycosyl hydrolase family 20 (GH20) catalyze the removal of beta-1,4-linked N-acetyl-D-hexosamine residues from the non-reducing ends of N-acetyl-beta-D-hexosaminides including N-acetylglucosides and N-acetylgalactosides. These enzymes are broadly distributed in microorganisms, plants and animals, and play roles in various key physiological and pathological processes. These processes include cell structural integrity, energy storage, cellular signaling, fertilization, pathogen defense, viral penetration, the development of carcinomas, inflammatory events and lysosomal storage disorders. The GH20 enzymes include the eukaryotic beta-N-acetylhexosaminidases A and B, the bacterial chitobiases, dispersin B, and lacto-N-biosidase. The GH20 hexosaminidases are thought to act via a catalytic mechanism in which the catalytic nucleophile is not provided by the solvent or the enzyme, but by the substrate itself." Q#13950 - CGI_10023604 superfamily 245008 790 821 1.96E-07 49.8792 cl09101 E_set superfamily C - "Early set domain associated with the catalytic domain of sugar utilizing enzymes at either the N or C terminus; The E or "early" set domains of sugar utilizing enzymes are associated with different types of catalytic domains at either the N-terminal or C-terminal end. These domains may be related to the immunoglobulin and/or fibronectin type III superfamilies. Members of this family include alpha amylase, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitobiase, and chitinase. A subset of these members were recently identified as members of the CBM48 (Carbohydrate Binding Module 48) family. 
Members of the CBM48 family include pullulanase, maltooligosyl trehalose synthase, starch branching enzyme, glycogen branching enzyme, glycogen debranching enzyme, isoamylase, and the beta subunit of AMP-activated protein kinase." Q#13950 - CGI_10023604 superfamily 245008 1198 1247 0.000432752 39.864 cl09101 E_set superfamily - - "Early set domain associated with the catalytic domain of sugar utilizing enzymes at either the N or C terminus; The E or "early" set domains of sugar utilizing enzymes are associated with different types of catalytic domains at either the N-terminal or C-terminal end. These domains may be related to the immunoglobulin and/or fibronectin type III superfamilies. Members of this family include alpha amylase, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitobiase, and chitinase. A subset of these members were recently identified as members of the CBM48 (Carbohydrate Binding Module 48) family. Members of the CBM48 family include pullulanase, maltooligosyl trehalose synthase, starch branching enzyme, glycogen branching enzyme, glycogen debranching enzyme, isoamylase, and the beta subunit of AMP-activated protein kinase." Q#13951 - CGI_10023605 superfamily 207794 319 620 3.91E-133 400.9 cl02948 GH20_hexosaminidase superfamily C - "Beta-N-acetylhexosaminidases of glycosyl hydrolase family 20 (GH20) catalyze the removal of beta-1,4-linked N-acetyl-D-hexosamine residues from the non-reducing ends of N-acetyl-beta-D-hexosaminides including N-acetylglucosides and N-acetylgalactosides. These enzymes are broadly distributed in microorganisms, plants and animals, and play roles in various key physiological and pathological processes. These processes include cell structural integrity, energy storage, cellular signaling, fertilization, pathogen defense, viral penetration, the development of carcinomas, inflammatory events and lysosomal storage disorders. 
The GH20 enzymes include the eukaryotic beta-N-acetylhexosaminidases A and B, the bacterial chitobiases, dispersin B, and lacto-N-biosidase. The GH20 hexosaminidases are thought to act via a catalytic mechanism in which the catalytic nucleophile is not provided by the solvent or the enzyme, but by the substrate itself." Q#13951 - CGI_10023605 superfamily 243574 26 184 1.04E-26 107.03 cl03918 CHB_HEX superfamily - - Putative carbohydrate binding domain; This domain represents the N terminal domain in chitobiases and beta-hexosaminidases EC:3.2.1.52. It is composed of a beta sandwich structure that is similar in structure to the cellulose binding domain of cellulase from Cellulomonas fimi. This suggests that this may be a carbohydrate binding domain. Q#13953 - CGI_10023607 superfamily 246918 469 527 1.05E-11 60.6783 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#13954 - CGI_10023608 superfamily 246918 497 549 3.29E-12 62.9895 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#13954 - CGI_10023608 superfamily 246918 614 660 2.10E-10 57.5967 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#13954 - CGI_10023608 superfamily 246918 383 435 9.46E-10 55.6707 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#13954 - CGI_10023608 superfamily 246918 326 378 1.00E-09 55.6707 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#13954 - CGI_10023608 superfamily 246918 269 321 1.22E-09 55.2855 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#13954 - CGI_10023608 superfamily 246918 155 207 8.91E-08 49.8927 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. 
Q#13954 - CGI_10023608 superfamily 246918 98 136 7.74E-06 44.4999 cl15278 TSP_1 superfamily C - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#13954 - CGI_10023608 superfamily 246918 212 264 2.14E-05 42.9591 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#13954 - CGI_10023608 superfamily 246918 34 88 0.000113206 41.0331 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#13955 - CGI_10023609 superfamily 246918 808 860 6.42E-15 71.4639 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#13955 - CGI_10023609 superfamily 246918 922 974 1.99E-13 66.8415 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#13955 - CGI_10023609 superfamily 246918 694 746 6.45E-13 65.6859 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#13955 - CGI_10023609 superfamily 246918 580 618 4.61E-08 51.4335 cl15278 TSP_1 superfamily C - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#13955 - CGI_10023609 superfamily 246918 516 574 4.21E-07 48.7371 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#13955 - CGI_10023609 superfamily 246918 637 689 1.57E-06 46.8111 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#13957 - CGI_10023611 superfamily 219936 108 218 4.42E-13 64.1941 cl18534 SPA superfamily - - Stabilisation of polarity axis; Yeast AFI1 (ARF3-interaction protein 1) has been shown to interact with the outer plaque of the spindle pole body. In Aspergillus nidulans the protein member is necessary for stabilisation of the polarity axes during septation. and in S. cerevisiae it functions as a polarisation-specific docking factor. 
Q#13958 - CGI_10023612 superfamily 242889 422 490 2.42E-09 55.7111 cl02111 PCI superfamily C - "PCI domain; This domain has also been called the PINT motif (Proteasome, Int-6, Nip-1 and TRIP-15)." Q#13963 - CGI_10023617 superfamily 246709 34 123 0.00512693 34.9505 cl14782 RNase_H superfamily C - "RNase H is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a sequence non-specific manner; Ribonuclease H (RNase H) enzymes are divided into two major families, Type 1 and Type 2, based on amino acid sequence similarities and biochemical properties. RNase H is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a sequence non-specific manner in the presence of divalent cations. RNase H is widely present in various organisms, including bacteria, archaea and eukaryotes. Most prokaryotic and eukaryotic genomes contain multiple RNase H genes. Despite the lack of amino acid sequence homology, Type 1 and type 2 RNase H share a main-chain fold and steric configurations of the four acidic active-site residues and have the same catalytic mechanism and functions in cells. RNase H is involved in DNA replication, repair and transcription. One of the important functions of RNase H is to remove Okazaki fragments during DNA replication. RNase H inhibitors have been explored as an anti-HIV drug target because RNase H inactivation inhibits reverse transcription." Q#13965 - CGI_10023619 superfamily 241622 254 335 1.18E-20 86.0814 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. 
PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either an N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95 (post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#13967 - CGI_10023621 superfamily 248458 166 525 2.83E-16 79.2801 cl17904 MFS superfamily - - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement.
Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#13968 - CGI_10023622 superfamily 241563 60 95 0.00257223 37.0736 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#13970 - CGI_10023624 superfamily 128778 40 153 0.00104924 37.6295 cl17972 BBC superfamily - - B-Box C-terminal domain; Coiled coil region C-terminal to (some) B-Box domains Q#13970 - CGI_10023624 superfamily 110440 424 446 0.00491836 35.0761 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#13971 - CGI_10023625 superfamily 109916 1302 1449 2.38E-29 117.158 cl03002 CIMR superfamily - - Cation-independent mannose-6-phosphate receptor repeat; The cation-independent mannose-6-phosphate receptor contains 15 copies of a repeat. Q#13971 - CGI_10023625 superfamily 109916 412 559 2.85E-28 114.076 cl03002 CIMR superfamily - - Cation-independent mannose-6-phosphate receptor repeat; The cation-independent mannose-6-phosphate receptor contains 15 copies of a repeat.
Q#13971 - CGI_10023625 superfamily 109916 1009 1151 2.39E-25 105.602 cl03002 CIMR superfamily - - Cation-independent mannose-6-phosphate receptor repeat; The cation-independent mannose-6-phosphate receptor contains 15 copies of a repeat. Q#13971 - CGI_10023625 superfamily 109916 1599 1743 1.17E-23 100.594 cl03002 CIMR superfamily - - Cation-independent mannose-6-phosphate receptor repeat; The cation-independent mannose-6-phosphate receptor contains 15 copies of a repeat. Q#13971 - CGI_10023625 superfamily 109916 720 850 1.05E-20 92.12 cl03002 CIMR superfamily - - Cation-independent mannose-6-phosphate receptor repeat; The cation-independent mannose-6-phosphate receptor contains 15 copies of a repeat. Q#13971 - CGI_10023625 superfamily 109916 1457 1594 2.66E-15 75.9416 cl03002 CIMR superfamily - - Cation-independent mannose-6-phosphate receptor repeat; The cation-independent mannose-6-phosphate receptor contains 15 copies of a repeat. Q#13971 - CGI_10023625 superfamily 109916 111 244 2.85E-15 75.9416 cl03002 CIMR superfamily - - Cation-independent mannose-6-phosphate receptor repeat; The cation-independent mannose-6-phosphate receptor contains 15 copies of a repeat. Q#13971 - CGI_10023625 superfamily 109916 1161 1298 7.95E-14 71.7044 cl03002 CIMR superfamily - - Cation-independent mannose-6-phosphate receptor repeat; The cation-independent mannose-6-phosphate receptor contains 15 copies of a repeat. Q#13971 - CGI_10023625 superfamily 109916 258 408 8.15E-13 68.6228 cl03002 CIMR superfamily - - Cation-independent mannose-6-phosphate receptor repeat; The cation-independent mannose-6-phosphate receptor contains 15 copies of a repeat. Q#13971 - CGI_10023625 superfamily 109916 1895 2026 1.54E-12 67.8524 cl03002 CIMR superfamily - - Cation-independent mannose-6-phosphate receptor repeat; The cation-independent mannose-6-phosphate receptor contains 15 copies of a repeat. 
Q#13971 - CGI_10023625 superfamily 109916 565 713 2.85E-12 67.082 cl03002 CIMR superfamily - - Cation-independent mannose-6-phosphate receptor repeat; The cation-independent mannose-6-phosphate receptor contains 15 copies of a repeat. Q#13971 - CGI_10023625 superfamily 109916 858 971 1.07E-07 52.8296 cl03002 CIMR superfamily C - Cation-independent mannose-6-phosphate receptor repeat; The cation-independent mannose-6-phosphate receptor contains 15 copies of a repeat. Q#13971 - CGI_10023625 superfamily 109916 1750 1887 2.45E-05 45.1256 cl03002 CIMR superfamily - - Cation-independent mannose-6-phosphate receptor repeat; The cation-independent mannose-6-phosphate receptor contains 15 copies of a repeat. Q#13971 - CGI_10023625 superfamily 109916 2203 2243 0.000547022 41.2736 cl03002 CIMR superfamily C - Cation-independent mannose-6-phosphate receptor repeat; The cation-independent mannose-6-phosphate receptor contains 15 copies of a repeat. Q#13971 - CGI_10023625 superfamily 109916 12 102 0.00212965 39.3476 cl03002 CIMR superfamily N - Cation-independent mannose-6-phosphate receptor repeat; The cation-independent mannose-6-phosphate receptor contains 15 copies of a repeat. Q#13972 - CGI_10023626 superfamily 245213 622 658 5.98E-10 56.1058 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#13972 - CGI_10023626 superfamily 245213 546 582 3.65E-09 53.7946 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13972 - CGI_10023626 superfamily 245213 585 620 3.55E-08 51.0982 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13972 - CGI_10023626 superfamily 245814 336 406 8.07E-07 47.4839 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. 
A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#13972 - CGI_10023626 superfamily 245213 660 695 2.94E-06 45.3202 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13972 - CGI_10023626 superfamily 245814 235 308 5.11E-05 42.0911 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#13972 - CGI_10023626 superfamily 241578 17 185 6.34E-33 125.964 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. 
The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#13972 - CGI_10023626 superfamily 245213 193 228 0.000153004 40.3126 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#13972 - CGI_10023626 superfamily 245814 509 543 0.00739163 35.747 cl11960 Ig superfamily N - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. 
A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#13973 - CGI_10023627 superfamily 247780 364 501 3.66E-59 196.988 cl17226 NAD_bind_amino_acid_DH superfamily C - "NAD(P) binding domain of amino acid dehydrogenase-like proteins; Amino acid dehydrogenase(DH)-like NAD(P)-binding domains are members of the Rossmann fold superfamily and are found in glutamate, leucine, and phenylalanine DHs (DHs), methylene tetrahydrofolate DH, methylene-tetrahydromethanopterin DH, methylene-tetrahydropholate DH/cyclohydrolase, Shikimate DH-like proteins, malate oxidoreductases, and glutamyl tRNA reductase. Amino acid DHs catalyze the deamination of amino acids to keto acids with NAD(P)+ as a cofactor. The NAD(P)-binding Rossmann fold superfamily includes a wide variety of protein families including NAD(P)- binding domains of alcohol DHs, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate DH, lactate/malate DHs, formate/glycerate DHs, siroheme synthases, 6-phosphogluconate DH, amino acid DHs, repressor rex, NAD-binding potassium channel domain, CoA-binding, and ornithine cyclodeaminase-like domains. These domains have an alpha-beta-alpha configuration. NAD binding involves numerous hydrogen and van der Waals contacts." Q#13973 - CGI_10023627 superfamily 247739 62 248 6.01E-27 107.324 cl17185 LPLAT superfamily - - "Lysophospholipid acyltransferases (LPLATs) of glycerophospholipid biosynthesis; Lysophospholipid acyltransferase (LPLAT) superfamily members are acyltransferases of de novo and remodeling pathways of glycerophospholipid biosynthesis. These proteins catalyze the incorporation of an acyl group from either acylCoAs or acyl-acyl carrier proteins (acylACPs) into acceptors such as glycerol 3-phosphate, dihydroxyacetone phosphate or lyso-phosphatidic acid. 
Included in this superfamily are LPLATs such as glycerol-3-phosphate 1-acyltransferase (GPAT, PlsB), 1-acyl-sn-glycerol-3-phosphate acyltransferase (AGPAT, PlsC), lysophosphatidylcholine acyltransferase 1 (LPCAT-1), lysophosphatidylethanolamine acyltransferase (LPEAT, also known as, MBOAT2, membrane-bound O-acyltransferase domain-containing protein 2), lipid A biosynthesis lauroyl/myristoyl acyltransferase, 2-acylglycerol O-acyltransferase (MGAT), dihydroxyacetone phosphate acyltransferase (DHAPAT, also known as 1 glycerol-3-phosphate O-acyltransferase 1) and Tafazzin (the protein product of the Barth syndrome (TAZ) gene)." Q#13973 - CGI_10023627 superfamily 202408 244 341 8.58E-44 152.269 cl08368 ELFV_dehydrog_N superfamily N - "Glu/Leu/Phe/Val dehydrogenase, dimerisation domain; Glu/Leu/Phe/Val dehydrogenase, dimerisation domain. " Q#13977 - CGI_10023631 superfamily 195146 24 120 1.43E-61 203.355 cl05674 PET superfamily - - "PET ((Prickle Espinas Testin) domain is involved in protein-protein interactions; PET domain is involved in protein-protein interactions and is usually found in conjunction with LIM domain, which is also a protein-protein interaction domain. The PET containing proteins serve as adaptors or scaffolds to support the assembly of multimeric protein complexes. The PET domain has been found at the N-terminal of four known groups of proteins: prickle, testin, LIMPETin/LIM-9 and overexpressed breast tumor protein (OEBT). Prickle has been implicated in regulation of cell movement through its association with the Dishevelled (Dsh) protein in the planar cell polarity (PCP) pathway. Testin is a cytoskeleton associated focal adhesion protein that localizes along actin stress fibers, at cell contact areas, and at focal adhesion plaques. It interacts with a variety of cytoskeletal proteins, including zyxin, mena, VASP, talin, and actin, and is involved in cell motility and adhesion events. Knockout mice experiments reveal tumor repressor function of Testin. 
LIMPETin/LIM-9 contains an N-terminal PET domain and 6 LIM domains at the C-terminal. In Schistosoma mansoni, where LIMPETin was first identified, it is down regulated in sexually mature adult females compared to sexually immature adult females and adult males. Its differential expression indicates that it is a transcription regulator. In C. elegans, LIM-9 may play a role in regulating the assembly and maintenance of the muscle A-band by forming a protein complex with SCPL-1 and UNC-89 and other proteins. OEBT displays a PET domain with two LIM domains, and is predicted to be localized in the nucleus with a possible role in cancer differentiation." Q#13977 - CGI_10023631 superfamily 195146 197 272 2.28E-43 152.893 cl05674 PET superfamily N - "PET ((Prickle Espinas Testin) domain is involved in protein-protein interactions; PET domain is involved in protein-protein interactions and is usually found in conjunction with LIM domain, which is also a protein-protein interaction domain. The PET containing proteins serve as adaptors or scaffolds to support the assembly of multimeric protein complexes. The PET domain has been found at the N-terminal of four known groups of proteins: prickle, testin, LIMPETin/LIM-9 and overexpressed breast tumor protein (OEBT). Prickle has been implicated in regulation of cell movement through its association with the Dishevelled (Dsh) protein in the planar cell polarity (PCP) pathway. Testin is a cytoskeleton associated focal adhesion protein that localizes along actin stress fibers, at cell contact areas, and at focal adhesion plaques. It interacts with a variety of cytoskeletal proteins, including zyxin, mena, VASP, talin, and actin, and is involved in cell motility and adhesion events. Knockout mice experiments reveal tumor repressor function of Testin. LIMPETin/LIM-9 contains an N-terminal PET domain and 6 LIM domains at the C-terminal. 
In Schistosoma mansoni, where LIMPETin was first identified, it is down regulated in sexually mature adult females compared to sexually immature adult females and adult males. Its differential expression indicates that it is a transcription regulator. In C. elegans, LIM-9 may play a role in regulating the assembly and maintenance of the muscle A-band by forming a protein complex with SCPL-1 and UNC-89 and other proteins. OEBT displays a PET domain with two LIM domains, and is predicted to be localized in the nucleus with a possible role in cancer differentiation." Q#13977 - CGI_10023631 superfamily 243050 343 398 7.43E-34 124.847 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#13977 - CGI_10023631 superfamily 243050 403 461 1.26E-33 124.088 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. 
LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#13977 - CGI_10023631 superfamily 243050 280 338 3.88E-28 108.499 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#13977 - CGI_10023631 superfamily 243050 128 186 3.88E-28 108.499 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. 
LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#13980 - CGI_10023634 superfamily 246723 22 342 1.20E-173 499.743 cl14813 GluZincin superfamily N - "Peptidase Gluzincin family (thermolysin-like proteinases, TLPs) includes peptidases M1, M2, M3, M4, M13, M32 and M36 (fungalysins); Gluzincin family (thermolysin-like peptidases or TLPs) includes several zinc-dependent metallopeptidases such as the M1, M2, M3, M4, M13, M32, M36 peptidases (MEROPS classification), and contain HEXXH and EXXXD motifs as part of their active site. All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. M1 family includes aminopeptidase N (APN) and leukotriene A4 hydrolase (LTA4H). APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types. LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity such that the two activities occupy different, but overlapping sites. The peptidase M3 or neurolysin-like family, includes M3, M2 and M32 metallopeptidases. The M3 peptidases have two subfamilies: M3A, includes thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (3.4.24.16), and the mitochondrial intermediate peptidase; M3B contains oligopeptidase F. 
M2 peptidase angiotensin converting enzyme (ACE, EC 3.4.15.1) catalyzes the conversion of decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II. ACE is a key part of the renin-angiotensin system that regulates blood pressure, thus ACE inhibitors are important for the treatment of hypertension. M32 family includes two eukaryotic enzymes from protozoa Trypanosoma cruzi, a causative agent of Chagas' disease, and Leishmania major, a parasite that causes leishmaniasis, making them attractive targets for drug development. The M4 family includes secreted protease thermolysin (EC 3.4.24.27), pseudolysin, aureolysin, neutral protease as well as fungalysin and bacillolysin (EC 3.4.24.28) that degrade extracellular proteins and peptides for bacterial nutrition, especially prior to sporulation. Thermolysin is widely used as a nonspecific protease to obtain fragments for peptide sequencing as well as in production of the artificial sweetener aspartame. M13 family includes neprilysin (EC 3.4.24.11) and endothelin-converting enzyme I (ECE-1, EC 3.4.24.71), which fulfill a broad range of physiological roles due to the greater variation in the S2' subsite allowing substrate specificity and are prime therapeutic targets for selective inhibition. Peptidase M36 (fungalysin) family includes endopeptidases from pathogenic fungi. Fungalysin hydrolyzes extracellular matrix proteins such as elastin and keratin. Aspergillus fumigatus causes the pulmonary disease aspergillosis by invading the lungs of immuno-compromised animals and secreting fungalysin that possibly breaks down proteinaceous structural barriers."
Q#13980 - CGI_10023634 superfamily 204144 356 500 1.58E-42 148.936 cl08525 Leuk-A4-hydro_C superfamily - - "Leukotriene A4 hydrolase, C-terminal; Members of this family adopt a structure consisting of two layers of parallel alpha-helices, five in the inner layer and four in the outer, arranged in an antiparallel manner, with perpendicular loops containing short helical segments on top. They are required for the formation of a deep cleft harbouring the catalytic Zn2+ site in Leukotriene A4 hydrolase." Q#13982 - CGI_10023636 superfamily 215754 214 305 1.36E-21 86.9236 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#13982 - CGI_10023636 superfamily 215754 117 208 9.62E-20 81.916 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#13982 - CGI_10023636 superfamily 215754 1 115 3.22E-16 71.9008 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#13983 - CGI_10023637 superfamily 150854 32 278 2.56E-51 172.599 cl10929 MRP-S22 superfamily - - "Mitochondrial 28S ribosomal protein S22; This is the conserved N-terminus and central portion of the mitochondrial small subunit 28S ribosomal protein S22. Mammalian mitochondria carry out the synthesis of 13 polypeptides that are essential for oxidative phosphorylation and, hence, for the synthesis of the majority of the ATP used by eukaryotic organisms. The number of proteins produced by prokaryotes is smaller, reflected in the lower number of ribosomal proteins present in them." 
Q#13984 - CGI_10023638 superfamily 247905 336 458 2.62E-25 102.701 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#13984 - CGI_10023638 superfamily 247805 52 194 7.84E-19 84.3112 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#13985 - CGI_10023639 superfamily 151040 666 899 2.25E-96 304.244 cl11116 DUF2451 superfamily - - Protein of unknown function C-terminus (DUF2451); This protein is found in eukaryotes but its function is not known. The C-terminal part of some members is DUF2450. Q#13987 - CGI_10023641 superfamily 243109 219 422 4.12E-93 280.72 cl02614 SPRY superfamily - - "SPRY domain; SPRY domains, first identified in the SP1A kinase of Dictyostelium and rabbit Ryanodine receptor (hence the name), are homologous to B30.2. SPRY domains have been identified in at least 11 protein families, covering a wide range of functions, including regulation of cytokine signaling (SOCS), RNA metabolism (DDX1 and hnRNP), immunity to retroviruses (TRIM5alpha), intracellular calcium release (ryanodine receptors or RyR) and regulatory and developmental processes (HERC1 and Ash2L). B30.2 also contains residues in the N-terminus that form a distinct PRY domain structure; i.e. 
B30.2 domain consists of PRY and SPRY subdomains. B30.2 domains comprise the C-terminus of three protein families: BTNs (receptor glycoproteins of immunoglobulin superfamily); several TRIM proteins (composed of RING/B-box/coiled-coil or RBCC core); Stonutoxin (secreted poisonous protein of the stonefish Synanceia horrida). While SPRY domains are evolutionarily ancient, B30.2 domains are a more recent adaptation where the SPRY/PRY combination is a possible component of immune defense. Mutations found in the SPRY-containing proteins have been shown to cause Mediterranean fever and Opitz syndrome." Q#13987 - CGI_10023641 superfamily 243109 34 199 2.66E-70 221.784 cl02614 SPRY superfamily - - "SPRY domain; SPRY domains, first identified in the SP1A kinase of Dictyostelium and rabbit Ryanodine receptor (hence the name), are homologous to B30.2. SPRY domains have been identified in at least 11 protein families, covering a wide range of functions, including regulation of cytokine signaling (SOCS), RNA metabolism (DDX1 and hnRNP), immunity to retroviruses (TRIM5alpha), intracellular calcium release (ryanodine receptors or RyR) and regulatory and developmental processes (HERC1 and Ash2L). B30.2 also contains residues in the N-terminus that form a distinct PRY domain structure; i.e. B30.2 domain consists of PRY and SPRY subdomains. B30.2 domains comprise the C-terminus of three protein families: BTNs (receptor glycoproteins of immunoglobulin superfamily); several TRIM proteins (composed of RING/B-box/coiled-coil or RBCC core); Stonutoxin (secreted poisonous protein of the stonefish Synanceia horrida). While SPRY domains are evolutionarily ancient, B30.2 domains are a more recent adaptation where the SPRY/PRY combination is a possible component of immune defense. Mutations found in the SPRY-containing proteins have been shown to cause Mediterranean fever and Opitz syndrome."
Q#13988 - CGI_10023642 superfamily 247099 48 423 2.16E-142 418.742 cl15845 MntH superfamily - - Mn2+ and Fe2+ transporters of the NRAMP family [Inorganic ion transport and metabolism] Q#13989 - CGI_10023643 superfamily 243035 21 105 2.36E-15 66.8745 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. 
Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#13990 - CGI_10023644 superfamily 248020 72 404 1.61E-41 152.619 cl17466 Sulfatase superfamily - - Sulfatase; Sulfatase. Q#13991 - CGI_10023645 superfamily 241580 360 433 1.14E-08 52.9414 cl00061 FH superfamily - - "Forkhead (FH), also known as a "winged helix". FH is named for the Drosophila fork head protein, a transcription factor which promotes terminal rather than segmental development. This family of transcription factor domains, which bind to B-DNA as monomers, are also found in the Hepatocyte nuclear factor (HNF) proteins, which provide tissue-specific gene regulation. The structure contains 2 flexible loops or "wings" in the C-terminal region, hence the term winged helix." Q#13994 - CGI_10023648 superfamily 241754 17 153 3.65E-39 149.555 cl00286 Motor_domain superfamily N - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. 
Q#13995 - CGI_10023649 superfamily 216739 442 476 0.000236848 39.7234 cl03383 PC_rep superfamily - - Proteasome/cyclosome repeat; Proteasome/cyclosome repeat. Q#13995 - CGI_10023649 superfamily 216739 689 720 0.00433191 36.2566 cl03383 PC_rep superfamily - - Proteasome/cyclosome repeat; Proteasome/cyclosome repeat. Q#13996 - CGI_10023650 superfamily 243034 958 1057 4.76E-10 58.1604 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#13996 - CGI_10023650 superfamily 243034 661 736 1.79E-07 50.4564 cl02429 TPR superfamily C - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, 
cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#13996 - CGI_10023650 superfamily 243034 1027 1122 8.32E-07 48.5304 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many 
instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#13996 - CGI_10023650 superfamily 243034 695 790 0.00380463 36.9744 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#13999 - CGI_10026546 superfamily 245598 137 333 1.22E-71 230.629 cl11396 Patatin_and_cPLA2 superfamily - - "Patatins and Phospholipases; Patatin-like phospholipase. This family consists of various patatin glycoproteins from plants. 
The patatin protein accounts for up to 40% of the total soluble protein in potato tubers. Patatin is a storage protein, but it also has the enzymatic activity of a lipid acyl hydrolase, catalyzing the cleavage of fatty acids from membrane lipids. Members of this family have also been found in vertebrates. This family also includes the catalytic domain of cytosolic phospholipase A2 (PLA2; EC 3.1.1.4) hydrolyzes the sn-2-acyl ester bond of phospholipids to release arachidonic acid. At the active site, cPLA2 contains a serine nucleophile through which the catalytic mechanism is initiated. The active site is partially covered by a solvent-accessible flexible lid. cPLA2 displays interfacial activation as it exists in both "closed lid" and "open lid" forms." Q#13999 - CGI_10026546 superfamily 247856 430 500 0.00013069 40.2237 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#13999 - CGI_10026546 superfamily 247856 475 527 0.000687749 38.2977 cl17302 EFh superfamily C - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#14000 - CGI_10026547 superfamily 245598 84 207 4.87E-28 109.291 cl11396 Patatin_and_cPLA2 superfamily C - "Patatins and Phospholipases; Patatin-like phospholipase. This family consists of various patatin glycoproteins from plants. 
The patatin protein accounts for up to 40% of the total soluble protein in potato tubers. Patatin is a storage protein, but it also has the enzymatic activity of a lipid acyl hydrolase, catalyzing the cleavage of fatty acids from membrane lipids. Members of this family have also been found in vertebrates. This family also includes the catalytic domain of cytosolic phospholipase A2 (PLA2; EC 3.1.1.4) hydrolyzes the sn-2-acyl ester bond of phospholipids to release arachidonic acid. At the active site, cPLA2 contains a serine nucleophile through which the catalytic mechanism is initiated. The active site is partially covered by a solvent-accessible flexible lid. cPLA2 displays interfacial activation as it exists in both "closed lid" and "open lid" forms." Q#14000 - CGI_10026547 superfamily 244899 253 300 4.82E-05 41.3214 cl08302 S-100 superfamily N - "S-100: S-100 domain, which represents the largest family within the superfamily of proteins carrying the Ca-binding EF-hand motif. Note that this S-100 hierarchy contains only S-100 EF-hand domains, other EF-hands have been modeled separately. S100 proteins are expressed exclusively in vertebrates, and are implicated in intracellular and extracellular regulatory activities. Intracellularly, S100 proteins act as Ca-signaling or Ca-buffering proteins. The most unusual characteristic of certain S100 proteins is their occurrence in extracellular space, where they act in a cytokine-like manner through RAGE, the receptor for advanced glycation products. Structural data suggest that many S100 members exist within cells as homo- or heterodimers and even oligomers; oligomerization contributes to their functional diversification. Upon binding calcium, most S100 proteins change conformation to a more open structure exposing a hydrophobic cleft. This hydrophobic surface represents the interaction site of S100 proteins with their target proteins. 
There is experimental evidence showing that many S100 proteins have multiple binding partners with diverse mode of interaction with different targets. In addition to S100 proteins (such as S100A1,-3,-4,-6,-7,-10,-11,and -13), this group includes the ''fused'' gene family, a group of calcium binding S100-related proteins. The ''fused'' gene family includes multifunctional epidermal differentiation proteins - profilaggrin, trichohyalin, repetin, hornerin, and cornulin; functionally these proteins are associated with keratin intermediate filaments and partially crosslinked to the cell envelope. These ''fused'' gene proteins contain N-terminal sequence with two Ca-binding EF-hands motif, which may be associated with calcium signaling in epidermal cells and autoprocessing in a calcium-dependent manner. In contrast to S100 proteins, "fused" gene family proteins contain an extraordinary high number of almost perfect peptide repeats with regular array of polar and charged residues similar to many known cell envelope proteins." Q#14001 - CGI_10026548 superfamily 241620 27 65 6.41E-08 43.798 cl00113 CRIB superfamily - - "PAK (p21 activated kinase) Binding Domain (PBD), binds Cdc42p- and/or Rho-like small GTPases; also known as the Cdc42/Rac interactive binding (CRIB) motif; has been shown to inhibit transcriptional activation and cell transformation mediated by the Ras-Rac pathway. CRIB-containing effector proteins are functionally diverse and include serine/threonine kinases, tyrosine kinases, actin-binding proteins, and adapter molecules." Q#14002 - CGI_10026549 superfamily 241620 25 67 1.65E-15 70.7619 cl00113 CRIB superfamily - - "PAK (p21 activated kinase) Binding Domain (PBD), binds Cdc42p- and/or Rho-like small GTPases; also known as the Cdc42/Rac interactive binding (CRIB) motif; has been shown to inhibit transcriptional activation and cell transformation mediated by the Ras-Rac pathway. 
CRIB-containing effector proteins are functionally diverse and include serine/threonine kinases, tyrosine kinases, actin-binding proteins, and adapter molecules." Q#14002 - CGI_10026549 superfamily 245201 183 477 3.87E-150 433.176 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#14007 - CGI_10026554 superfamily 241742 893 1039 1.98E-41 150.443 cl00271 PI3Ka superfamily - - "Phosphoinositide 3-kinase family, accessory domain (PIK domain); PIK domain is conserved in PI3 and PI4-kinases. Its role is unclear, but it has been suggested to be involved in substrate presentation. Phosphoinositide 3-kinases play an important role in a variety of fundamental cellular processes and can be divided into three main classes, defined by their substrate specificity and domain architecture." Q#14007 - CGI_10026554 superfamily 246669 708 875 3.66E-32 124.778 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. 
Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#14007 - CGI_10026554 superfamily 207610 513 599 8.62E-16 75.4429 cl02484 PI3K_rbd superfamily - - "PI3-kinase family, ras-binding domain; Certain members of the PI3K family possess Ras-binding domains in their N-termini. These regions show some similarity (although not highly significant similarity) to Ras-binding pfam00788 domains (unpublished observation)." Q#14007 - CGI_10026554 superfamily 241623 1069 1129 2.80E-11 65.2938 cl00119 PI3Kc_like superfamily C - "Phosphoinositide 3-kinase (PI3K)-like family, catalytic domain; The PI3K-like catalytic domain family is part of a larger superfamily that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and RIO kinases. Members of the family include PI3K, phosphoinositide 4-kinase (PI4K), PI3K-related protein kinases (PIKKs), and TRansformation/tRanscription domain-Associated Protein (TRRAP). PI3Ks catalyze the transfer of the gamma-phosphoryl group from ATP to the 3-hydroxyl of the inositol ring of D-myo-phosphatidylinositol (PtdIns) or its derivatives, while PI4K catalyze the phosphorylation of the 4-hydroxyl of PtdIns. PIKKs are protein kinases that catalyze the phosphorylation of serine/threonine residues, especially those that are followed by a glutamine. 
PI3Ks play an important role in a variety of fundamental cellular processes, including cell motility, the Ras pathway, vesicle trafficking and secretion, immune cell activation and apoptosis. PI4Ks produce PtdIns(4)P, the major precursor to important signaling phosphoinositides. PIKKs have diverse functions including cell-cycle checkpoints, genome surveillance, mRNA surveillance, and translation control." Q#14009 - CGI_10026556 superfamily 243034 601 702 2.62E-07 50.4564 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#14009 - CGI_10026556 superfamily 243034 240 333 1.53E-05 45.0636 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids 
[WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#14010 - CGI_10026557 superfamily 216152 141 421 5.51E-94 302.696 cl02988 Glyco_transf_10 superfamily - - "Glycosyltransferase family 10 (fucosyltransferase); This family of Fucosyltransferases are the enzymes transferring fucose from GDP-Fucose to GlcNAc in an alpha1,3 linkage. This family is know as glycosyltransferase family 10." Q#14011 - CGI_10026558 superfamily 216152 130 348 6.22E-63 206.396 cl02988 Glyco_transf_10 superfamily C - "Glycosyltransferase family 10 (fucosyltransferase); This family of Fucosyltransferases are the enzymes transferring fucose from GDP-Fucose to GlcNAc in an alpha1,3 linkage. This family is know as glycosyltransferase family 10." 
Q#14013 - CGI_10026560 superfamily 243035 25 112 5.92E-17 73.8081 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#14013 - CGI_10026560 superfamily 243035 146 221 8.63E-10 53.3926 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. 
CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#14014 - CGI_10026561 superfamily 243035 204 300 2.00E-17 75.7341 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. 
Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. 
A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#14014 - CGI_10026561 superfamily 245205 42 123 4.43E-08 49.1585 cl09930 RPA_2b-aaRSs_OBF_like superfamily - - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." 
Q#14015 - CGI_10026562 superfamily 241599 188 246 4.14E-22 86.9136 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#14018 - CGI_10026565 superfamily 241599 128 173 3.05E-21 83.832 cl00084 homeodomain superfamily N - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#14020 - CGI_10026567 superfamily 215733 207 446 1.05E-39 148.482 cl02811 E1-E2_ATPase superfamily - - E1-E2 ATPase; E1-E2 ATPase. Q#14020 - CGI_10026567 superfamily 222006 569 605 6.21E-07 48.7578 cl16182 Hydrolase_like2 superfamily N - Putative hydrolase of sodium-potassium ATPase alpha subunit; This is a putative hydrolase of the sodium-potassium ATPase alpha subunit. Q#14020 - CGI_10026567 superfamily 221564 1 113 2.11E-06 47.5915 cl13797 P5-ATPase superfamily - - "P5-type ATPase cation transporter; This domain family is found in eukaryotes, and is typically between 110 and 126 amino acids in length. The family is found in association with pfam00122, pfam00702. P-type ATPases comprise a large superfamily of proteins, present in both prokaryotes and eukaryotes, that transport inorganic cations and other substrates across cell membranes." Q#14020 - CGI_10026567 superfamily 243244 147 195 3.13E-05 43.293 cl02930 Cation_ATPase_N superfamily N - "Cation transporter/ATPase, N-terminus; Members of this family are involved in Na+/K+, H+/K+, Ca++ and Mg++ transport." 
Q#14020 - CGI_10026567 superfamily 226572 795 838 9.59E-05 42.546 cl18761 COG4087 superfamily NC - Soluble P-type ATPase [General function prediction only] Q#14021 - CGI_10026568 superfamily 241874 40 620 0 590.305 cl00456 SLC5-6-like_sbd superfamily - - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#14022 - CGI_10026569 superfamily 241564 1103 1171 4.50E-22 92.3287 cl00035 BIR superfamily - - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." 
Q#14022 - CGI_10026569 superfamily 241564 729 796 2.54E-18 81.5431 cl00035 BIR superfamily - - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." Q#14023 - CGI_10026570 superfamily 247792 110 151 3.29E-11 55.0784 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#14024 - CGI_10026571 superfamily 217293 11 171 1.10E-29 114.267 cl03788 Neur_chan_LBD superfamily C - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#14024 - CGI_10026571 superfamily 202474 222 281 0.000449016 39.9445 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. 
Q#14025 - CGI_10026572 superfamily 118686 105 212 1.12E-22 90.2385 cl10855 LOH1CR12 superfamily - - Tumour suppressor protein; This is a region of 130 amino acids that is the most conserved region of hypothetical proteins involved in loss of heterozygosity and thus tumour suppression. The exact function is not known. Q#14026 - CGI_10026573 superfamily 247875 88 274 3.92E-36 128.935 cl17321 2OG-FeII_Oxy_2 superfamily - - 2OG-Fe(II) oxygenase superfamily; 2OG-Fe(II) oxygenase superfamily. Q#14029 - CGI_10026576 superfamily 221296 14 1682 0 1053.46 cl13353 DUF3414 superfamily - - Protein of unknown function (DUF3414); This family of proteins are functionally uncharacterized. This protein is found in eukaryotes. Proteins in this family are typically between 764 to 2011 amino acids in length. This protein has a conserved LLG sequence motif. Q#14031 - CGI_10015197 superfamily 243189 66 148 4.31E-30 107.015 cl02793 Cyt_c_Oxidase_Va superfamily N - "Cytochrome c oxidase subunit Va. Cytochrome c oxidase (CcO), the terminal oxidase in the respiratory chains of eukaryotes and most bacteria, is a multi-chain transmembrane protein located in the inner membrane of mitochondria and the cell membrane of prokaryotes. It catalyzes the reduction of O2 and simultaneously pumps protons across the membrane. The number of subunits varies from three to five in bacteria and up to 13 in mammalian mitochondria. Subunits I, II, and III of mammalian CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. Found only in eukaryotes, subunit Va is one of three mammalian subunits that lacks a transmembrane region. Subunit Va is located on the matrix side of the membrane and binds thyroid hormone T2, releasing allosteric inhibition caused by the binding of ATP to subunit IV and allowing high turnover at elevated intramitochondrial ATP/ADP ratios." 
Q#14034 - CGI_10015200 superfamily 248472 49 160 6.17E-32 111.972 cl17918 Ribosomal_P1_P2_L12p superfamily - - "Ribosomal protein P1, P2, and L12p. Ribosomal proteins P1 and P2 are the eukaryotic proteins that are functionally equivalent to bacterial L7/L12. L12p is the archaeal homolog. Unlike other ribosomal proteins, the archaeal L12p and eukaryotic P1 and P2 do not share sequence similarity with their bacterial counterparts. They are part of the ribosomal stalk (called the L7/L12 stalk in bacteria), along with 28S rRNA and the proteins L11 and P0 in eukaryotes (23S rRNA, L11, and L10e in archaea). In bacterial ribosomes, L7/L12 homodimers bind the extended C-terminal helix of L10 to anchor the L7/L12 molecules to the ribosome. Eukaryotic P1/P2 heterodimers and archaeal L12p homodimers are believed to bind the L10 equivalent proteins, eukaryotic P0 and archaeal L10e, in a similar fashion. P1 and P2 (L12p, L7/L12) are the only proteins in the ribosome to occur as multimers, always appearing as sets of dimers. Recent data indicate that most archaeal species contain six copies of L12p (three homodimers), while eukaryotes have two copies each of P1 and P2 (two heterodimers). Bacteria may have four or six copies (two or three homodimers), depending on the species. As in bacteria, the stalk is crucial for binding of initiation, elongation, and release factors in eukaryotes and archaea." Q#14035 - CGI_10015201 superfamily 204614 1 127 5.77E-39 129.691 cl12769 Med21 superfamily - - "Subunit 21 of Mediator complex; Med21 has been known as Srb7 in yeasts, hSrb7 in humans and Trap 19 in Drosophila. The heterodimer of the two subunits Med7 and Med21 appears to act as a hinge between the middle and the tail regions of Mediator." 
Q#14036 - CGI_10015202 superfamily 206009 356 421 2.34E-35 131.136 cl16430 Clathrin_H_link superfamily - - "Clathrin-H-link; This short domain is found on clathrins, and often appears on proteins directly downstream from the Clathrin-link domain pfam09268." Q#14036 - CGI_10015202 superfamily 150065 331 354 1.40E-05 44.4544 cl07778 Clathrin-link superfamily - - "Clathrin, heavy-chain linker; Members of this family adopt a structure consisting of alpha-alpha superhelix. They are predominantly found in clathrin, where they act as a heavy-chain linker domain." Q#14036 - CGI_10015202 superfamily 216475 256 288 0.00211362 37.9117 cl03194 Clathrin_propel superfamily - - "Clathrin propeller repeat; Clathrin is the scaffold protein of the basket-like coat that surrounds coated vesicles. The soluble assembly unit, a triskelion, contains three heavy chains and three light chains in an extended three-legged structure. Each leg contains one heavy and one light chain. The N-terminus of the heavy chain is known as the globular domain, and is composed of seven repeats which form a beta propeller." Q#14036 - CGI_10015202 superfamily 216475 19 56 0.00648526 36.3709 cl03194 Clathrin_propel superfamily - - "Clathrin propeller repeat; Clathrin is the scaffold protein of the basket-like coat that surrounds coated vesicles. The soluble assembly unit, a triskelion, contains three heavy chains and three light chains in an extended three-legged structure. Each leg contains one heavy and one light chain. The N-terminus of the heavy chain is known as the globular domain, and is composed of seven repeats which form a beta propeller." Q#14036 - CGI_10015202 superfamily 216475 296 330 0.00702617 36.3709 cl03194 Clathrin_propel superfamily - - "Clathrin propeller repeat; Clathrin is the scaffold protein of the basket-like coat that surrounds coated vesicles. The soluble assembly unit, a triskelion, contains three heavy chains and three light chains in an extended three-legged structure. 
Each leg contains one heavy and one light chain. The N-terminus of the heavy chain is known as the globular domain, and is composed of seven repeats which form a beta propeller." Q#14037 - CGI_10015203 superfamily 246918 87 134 8.09E-08 47.9667 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#14040 - CGI_10015206 superfamily 241886 1 236 1.53E-71 222.434 cl00470 Aldo_ket_red superfamily - - "Aldo-keto reductases (AKRs) are a superfamily of soluble NAD(P)(H) oxidoreductases whose chief purpose is to reduce aldehydes and ketones to primary and secondary alcohols. AKRs are present in all phyla and are of importance to both health and industrial applications. Members have very distinct functions and include the prokaryotic 2,5-diketo-D-gluconic acid reductases and beta-keto ester reductases, the eukaryotic aldose reductases, aldehyde reductases, hydroxysteroid dehydrogenases, steroid 5beta-reductases, potassium channel beta-subunits and aflatoxin aldehyde reductases, among others." Q#14041 - CGI_10015207 superfamily 241886 11 220 1.05E-71 222.434 cl00470 Aldo_ket_red superfamily C - "Aldo-keto reductases (AKRs) are a superfamily of soluble NAD(P)(H) oxidoreductases whose chief purpose is to reduce aldehydes and ketones to primary and secondary alcohols. AKRs are present in all phyla and are of importance to both health and industrial applications. Members have very distinct functions and include the prokaryotic 2,5-diketo-D-gluconic acid reductases and beta-keto ester reductases, the eukaryotic aldose reductases, aldehyde reductases, hydroxysteroid dehydrogenases, steroid 5beta-reductases, potassium channel beta-subunits and aflatoxin aldehyde reductases, among others." 
Q#14042 - CGI_10015208 superfamily 241886 233 507 6.50E-86 269.429 cl00470 Aldo_ket_red superfamily - - "Aldo-keto reductases (AKRs) are a superfamily of soluble NAD(P)(H) oxidoreductases whose chief purpose is to reduce aldehydes and ketones to primary and secondary alcohols. AKRs are present in all phyla and are of importance to both health and industrial applications. Members have very distinct functions and include the prokaryotic 2,5-diketo-D-gluconic acid reductases and beta-keto ester reductases, the eukaryotic aldose reductases, aldehyde reductases, hydroxysteroid dehydrogenases, steroid 5beta-reductases, potassium channel beta-subunits and aflatoxin aldehyde reductases, among others." Q#14042 - CGI_10015208 superfamily 241886 1 232 1.54E-66 218.968 cl00470 Aldo_ket_red superfamily - - "Aldo-keto reductases (AKRs) are a superfamily of soluble NAD(P)(H) oxidoreductases whose chief purpose is to reduce aldehydes and ketones to primary and secondary alcohols. AKRs are present in all phyla and are of importance to both health and industrial applications. Members have very distinct functions and include the prokaryotic 2,5-diketo-D-gluconic acid reductases and beta-keto ester reductases, the eukaryotic aldose reductases, aldehyde reductases, hydroxysteroid dehydrogenases, steroid 5beta-reductases, potassium channel beta-subunits and aflatoxin aldehyde reductases, among others." Q#14043 - CGI_10015209 superfamily 241886 11 269 2.59E-88 267.118 cl00470 Aldo_ket_red superfamily - - "Aldo-keto reductases (AKRs) are a superfamily of soluble NAD(P)(H) oxidoreductases whose chief purpose is to reduce aldehydes and ketones to primary and secondary alcohols. AKRs are present in all phyla and are of importance to both health and industrial applications. 
Members have very distinct functions and include the prokaryotic 2,5-diketo-D-gluconic acid reductases and beta-keto ester reductases, the eukaryotic aldose reductases, aldehyde reductases, hydroxysteroid dehydrogenases, steroid 5beta-reductases, potassium channel beta-subunits and aflatoxin aldehyde reductases, among others." Q#14044 - CGI_10015210 superfamily 199575 1 116 1.06E-52 166.246 cl15439 BTG superfamily - - BTG family; BTG family. Q#14045 - CGI_10015211 superfamily 218891 9 262 1.40E-22 97.0698 cl05564 Ins_P5_2-kin superfamily C - "Inositol-pentakisphosphate 2-kinase; This is a family of inositol-pentakisphosphate 2-kinases (EC 2.7.1.158) (also known as inositol 1,3,4,5,6-pentakisphosphate 2-kinase, Ins(1,3,4,5,6)P5 2-kinase) and InsP5 2-kinase). This enzyme phosphorylates Ins(1,3,4,5,6)P5 to form Ins(1,2,3,4,5,6)P6 (also known as InsP6 or phytate). InsP6 is involved in many processes such as mRNA export, nonhomologous end-joining, endocytosis and ion channel regulation." Q#14045 - CGI_10015211 superfamily 218891 418 502 1.37E-07 52.0015 cl05564 Ins_P5_2-kin superfamily N - "Inositol-pentakisphosphate 2-kinase; This is a family of inositol-pentakisphosphate 2-kinases (EC 2.7.1.158) (also known as inositol 1,3,4,5,6-pentakisphosphate 2-kinase, Ins(1,3,4,5,6)P5 2-kinase) and InsP5 2-kinase). This enzyme phosphorylates Ins(1,3,4,5,6)P5 to form Ins(1,2,3,4,5,6)P6 (also known as InsP6 or phytate). InsP6 is involved in many processes such as mRNA export, nonhomologous end-joining, endocytosis and ion channel regulation." Q#14046 - CGI_10015212 superfamily 241563 74 112 8.09E-05 40.5404 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#14047 - CGI_10015213 superfamily 241563 65 100 0.00120971 35.726 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#14047 - CGI_10015213 superfamily 242730 161 239 0.0024874 36.0851 cl01825 Phage_Mu_Gam superfamily C - Bacteriophage Mu Gam like protein; This family consists of bacterial and phage Gam proteins. The gam gene of bacteriophage Mu encodes a protein which protects linear double stranded DNA from exonuclease degradation in vitro and in vivo. Q#14049 - CGI_10015215 superfamily 110440 92 119 0.00281364 32.3797 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase); proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#14050 - CGI_10015216 superfamily 241563 62 102 0.000251731 38.6144 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#14054 - CGI_10001871 superfamily 219275 12 344 2.11E-101 310.821 cl06188 ORC3_N superfamily - - Origin recognition complex (ORC) subunit 3 N-terminus; This family represents the N-terminus (approximately 300 residues) of subunit 3 of the eukaryotic origin recognition complex (ORC). 
Origin recognition complex (ORC) is composed of six subunits that are essential for cell viability. They collectively bind to the autonomously replicating sequence (ARS) in a sequence-specific manner and lead to the chromatin loading of other replication factors that are essential for initiation of DNA replication. Q#14055 - CGI_10001357 superfamily 247916 68 171 2.33E-07 49.3214 cl17362 Transglut_core superfamily - - "Transglutaminase-like superfamily; This family includes animal transglutaminases and other bacterial proteins of unknown function. Sequence conservation in this superfamily primarily involves three motifs that centre around conserved cysteine, histidine, and aspartate residues that form the catalytic triad in the structurally characterized transglutaminase, the human blood clotting factor XIIIa'. On the basis of the experimentally demonstrated activity of the Methanobacterium phage pseudomurein endoisopeptidase, it is proposed that many, if not all, microbial homologues of the transglutaminases are proteases and that the eukaryotic transglutaminases have evolved from an ancestral protease." Q#14055 - CGI_10001357 superfamily 242215 128 287 0.00571213 37.6485 cl00949 Acetyltransf_2 superfamily N - "N-acetyltransferase; Arylamine N-acetyltransferase (NAT) is a cytosolic enzyme of approximately 30kDa. It facilitates the transfer of an acetyl group from Acetyl Coenzyme A on to a wide range of arylamine, N-hydroxyarylamines and hydrazines. Acetylation of these compounds generally results in inactivation. NAT is found in many species from Mycobacteria (M. tuberculosis, M. smegmatis etc) to man. It was the first enzyme to be observed to have polymorphic activity amongst human individuals. NAT is responsible for the inactivation of Isoniazid (a drug used to treat Tuberculosis) in humans. The NAT protein has also been shown to be involved in the breakdown of folic acid." 
Q#14056 - CGI_10001651 superfamily 241554 3 104 3.97E-21 88.0963 cl00019 Macro superfamily N - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#14056 - CGI_10001651 superfamily 241554 161 292 2.25E-07 48.806 cl00019 Macro superfamily C - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#14059 - CGI_10002364 superfamily 247805 35 181 1.21E-08 51.5692 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 
Q#14061 - CGI_10027425 superfamily 221744 35 299 1.07E-22 95.1954 cl18614 CABIT superfamily - - "Cell-cycle sustaining, positive selection,; The 'CABIT' domain (for 'cysteine-containing, all- in Themis') is found in a newly identified gene family that has three mammalian homologues (Themis, Icb1 and 9130404H23Rik) that encode proteins with two CABIT domains and a highly conserved proline-rich region. In contrast, Fam59A, Fam59B and related proteins from mammals to cnidarians, including the insect Serrano proteins, have a single copy of the CABIT domain, a proline-rich region and often a C-terminal SAM (sterile-motif) domain. Multiple-sequence alignment has predicted that the CABIT domain adopts an all-strand structure with at least 12 strands, ie a dyad of six-stranded beta-barrel units. The CABIT domain contains a nearly absolutely conserved cysteine residue which is likely to be central to its function. CABIT domain proteins function downstream of tyrosine kinase signalling and interact with GRB2." Q#14063 - CGI_10027427 superfamily 221744 308 557 8.45E-07 49.7419 cl18614 CABIT superfamily - - "Cell-cycle sustaining, positive selection,; The 'CABIT' domain (for 'cysteine-containing, all- in Themis') is found in a newly identified gene family that has three mammalian homologues (Themis, Icb1 and 9130404H23Rik) that encode proteins with two CABIT domains and a highly conserved proline-rich region. In contrast, Fam59A, Fam59B and related proteins from mammals to cnidarians, including the insect Serrano proteins, have a single copy of the CABIT domain, a proline-rich region and often a C-terminal SAM (sterile-motif) domain. Multiple-sequence alignment has predicted that the CABIT domain adopts an all-strand structure with at least 12 strands, ie a dyad of six-stranded beta-barrel units. The CABIT domain contains a nearly absolutely conserved cysteine residue which is likely to be central to its function. 
CABIT domain proteins function downstream of tyrosine kinase signalling and interact with GRB2." Q#14071 - CGI_10027437 superfamily 247792 75 122 5.12E-08 47.8256 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#14072 - CGI_10027438 superfamily 222324 56 144 3.56E-14 64.3354 cl16352 zf-3CxxC superfamily - - Zinc-binding domain; This is a family with several pairs of CxxC motifs possibly representing a multiple zinc-binding region. Only one pair of cysteines is associated with a highly conserved histidine residue. Q#14073 - CGI_10027439 superfamily 117982 61 95 0.000142179 36.9981 cl09683 CFC superfamily - - "Cripto_Frl-1_Cryptic (CFC); CFC domain is one half of the membrane protein Cripto, a protein overexpressed in many tumours and structurally similar to the C-terminal extracellular portions of Jagged 1 and Jagged 2. CFC is approx 40-residues long, compacted by three internal disulphide bridges, and binds Alk4 via a hydrophobic patch. CFC is structurally homologous to the VWFC-like domain." Q#14075 - CGI_10027441 superfamily 216112 402 766 7.34E-71 243.743 cl02964 RNB superfamily - - RNB domain; This domain is the catalytic domain of ribonuclease II. Q#14075 - CGI_10027441 superfamily 221913 1532 1741 2.75E-57 200.074 cl18626 AAA_12 superfamily - - AAA domain; This family of domains contain a P-loop motif that is characteristic of the AAA superfamily. 
Many of the proteins in this family are conjugative transfer proteins. Q#14075 - CGI_10027441 superfamily 221913 2634 2745 1.82E-27 113.404 cl18626 AAA_12 superfamily N - AAA domain; This family of domains contain a P-loop motif that is characteristic of the AAA superfamily. Many of the proteins in this family are conjugative transfer proteins. Q#14075 - CGI_10027441 superfamily 222258 1484 1521 4.44E-05 45.6368 cl18656 AAA_30 superfamily NC - AAA domain; This family of domains contain a P-loop motif that is characteristic of the AAA superfamily. Many of the proteins in this family are conjugative transfer proteins. There is a Walker A and Walker B. Q#14075 - CGI_10027441 superfamily 216112 2608 2635 0.000382127 43.8243 cl02964 RNB superfamily C - RNB domain; This domain is the catalytic domain of ribonuclease II. Q#14076 - CGI_10027442 superfamily 216112 516 720 1.38E-58 207.919 cl02964 RNB superfamily C - RNB domain; This domain is the catalytic domain of ribonuclease II. Q#14078 - CGI_10027444 superfamily 216112 10 368 1.15E-76 253.373 cl02964 RNB superfamily - - RNB domain; This domain is the catalytic domain of ribonuclease II. Q#14079 - CGI_10027445 superfamily 221913 276 483 4.21E-60 198.533 cl18626 AAA_12 superfamily - - AAA domain; This family of domains contain a P-loop motif that is characteristic of the AAA superfamily. Many of the proteins in this family are conjugative transfer proteins. Q#14079 - CGI_10027445 superfamily 222258 216 297 1.17E-07 50.6444 cl18656 AAA_30 superfamily N - AAA domain; This family of domains contain a P-loop motif that is characteristic of the AAA superfamily. Many of the proteins in this family are conjugative transfer proteins. There is a Walker A and Walker B. Q#14079 - CGI_10027445 superfamily 222258 28 62 3.09E-05 43.7108 cl18656 AAA_30 superfamily C - AAA domain; This family of domains contain a P-loop motif that is characteristic of the AAA superfamily. 
Many of the proteins in this family are conjugative transfer proteins. There is a Walker A and Walker B. Q#14080 - CGI_10027446 superfamily 222592 23 103 6.06E-35 118.038 cl16705 Ribosomal_L18_c superfamily - - Ribosomal L18 C-terminal region; This domain is the C-terminal end of ribosomal L18/L5 proteins. Q#14082 - CGI_10027448 superfamily 241571 203 315 2.17E-22 94.0162 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#14082 - CGI_10027448 superfamily 241613 165 199 1.29E-13 66.4613 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#14082 - CGI_10027448 superfamily 241571 461 570 1.42E-13 68.2078 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#14082 - CGI_10027448 superfamily 243035 39 162 3.63E-10 58.4001 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. 
This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. 
In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#14082 - CGI_10027448 superfamily 241571 339 449 3.55E-07 49.3331 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#14082 - CGI_10027448 superfamily 243092 607 687 3.17E-05 45.4036 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#14083 - CGI_10027449 superfamily 241571 166 278 3.64E-21 92.4754 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#14083 - CGI_10027449 superfamily 243035 2 125 9.91E-11 61.8669 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. 
Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#14083 - CGI_10027449 superfamily 241568 762 816 3.34E-10 59.0136 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." 
Q#14083 - CGI_10027449 superfamily 241613 128 162 4.78E-09 55.2906 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#14083 - CGI_10027449 superfamily 245213 2189 2225 3.62E-08 52.639 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14083 - CGI_10027449 superfamily 245213 2559 2594 5.42E-07 49.1722 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. 
Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14083 - CGI_10027449 superfamily 245213 2478 2516 3.28E-06 46.861 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14083 - CGI_10027449 superfamily 245213 2518 2555 1.97E-05 44.5498 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14083 - CGI_10027449 superfamily 245213 2677 2706 6.76E-05 43.009 cl09941 EGF_CA superfamily C - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. 
Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14083 - CGI_10027449 superfamily 241568 986 1041 9.64E-05 42.8352 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#14083 - CGI_10027449 superfamily 241568 1286 1341 0.000116124 42.8352 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#14083 - CGI_10027449 superfamily 245213 2442 2476 0.000344723 41.083 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14083 - CGI_10027449 superfamily 245213 2072 2107 0.000470545 40.6978 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14083 - CGI_10027449 superfamily 241568 820 863 0.00521634 37.8276 cl00043 CCP superfamily C - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." 
Q#14083 - CGI_10027449 superfamily 241568 1120 1163 0.00915174 37.0572 cl00043 CCP superfamily C - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#14083 - CGI_10027449 superfamily 222049 2343 2427 7.43E-06 46.5631 cl16239 Mucin2_WxxW superfamily - - "Mucin-2 protein WxxW repeating region; This family is repeating region found on mucins 2 and 5. The function is not known, but the repeat can be present in up to 32 copies, as in a member from Branchiostoma floridae. The region carries a highly conserved WxxW sequence motif and also has at least six well conserved cysteine residues." Q#14083 - CGI_10027449 superfamily 111397 1550 1628 1.77E-05 45.4099 cl03620 HYR superfamily - - "HYR domain; This domain is known as the HYR (Hyalin Repeat) domain, after the protein hyalin that is composed exclusively of this repeat. This domain probably corresponds to a new superfamily in the immunoglobulin fold. The function of this domain is uncertain it may be involved in cell adhesion." Q#14083 - CGI_10027449 superfamily 241571 302 412 1.79E-05 45.8663 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#14083 - CGI_10027449 superfamily 219525 1983 2019 0.000102206 42.7914 cl06646 GCC2_GCC3 superfamily C - GCC2 and GCC3; GCC2 and GCC3. 
Q#14083 - CGI_10027449 superfamily 111397 1630 1711 0.000154277 42.7135 cl03620 HYR superfamily - - "HYR domain; This domain is known as the HYR (Hyalin Repeat) domain, after the protein hyalin that is composed exclusively of this repeat. This domain probably corresponds to a new superfamily in the immunoglobulin fold. The function of this domain is uncertain it may be involved in cell adhesion." Q#14083 - CGI_10027449 superfamily 241578 1172 1211 0.000338763 43.5276 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#14083 - CGI_10027449 superfamily 241571 424 456 0.000478882 41.2439 cl00049 CUB superfamily C - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#14083 - CGI_10027449 superfamily 221695 896 918 0.00177943 38.9754 cl18612 cEGF superfamily - - "Complement Clr-like EGF-like; cEGF, or complement Clr-like EGF, domains have six conserved cysteine residues disulfide-bonded into the characteristic pattern 'ababcc'. 
They are found in blood coagulation proteins such as fibrillin, Clr and Cls, thrombomodulin, and the LDL receptor. The core fold of the EGF domain consists of two small beta-hairpins packed against each other. Two major structural variants have been identified based on the structural context of the C-terminal cysteine residue of disulfide 'c' in the C-terminal hairpin: hEGFs and cEGFs. In cEGFs the C-terminal thiol resides on the C-terminal beta-sheet, resulting in long loop-lengths between the cysteine residues of disulfide 'c', typically C[10+]XC. These longer loop-lengths may have arisen by selective cysteine loss from a four-disulfide EGF template such as laminin or integrin. Tandem cEGF domains have five linking residues between terminal cysteines of adjacent domains. cEGF domains may or may not bind calcium in the linker region. cEGF domains with the consensus motif CXN4X[F,Y]XCXC are hydroxylated exclusively on the asparagine residue." Q#14083 - CGI_10027449 superfamily 219525 1931 1976 0.00379193 38.169 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#14084 - CGI_10027450 superfamily 242876 50 91 4.47E-06 41.6276 cl02092 Clat_adaptor_s superfamily NC - Clathrin adaptor complex small chain; Clathrin adaptor complex small chain. Q#14085 - CGI_10027451 superfamily 242406 230 370 9.49E-23 93.8101 cl01271 DUF1768 superfamily - - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. 
Q#14088 - CGI_10027454 superfamily 241675 1012 1221 5.50E-103 326.951 cl00195 SIR2 superfamily - - "SIR2 superfamily of proteins includes silent information regulator 2 (Sir2) enzymes which catalyze NAD+-dependent protein/histone deacetylation, where the acetyl group from the lysine epsilon-amino group is transferred to the ADP-ribose moiety of NAD+, producing nicotinamide and the novel metabolite O-acetyl-ADP-ribose. Sir2 proteins, also known as sirtuins, are found in all eukaryotes and many archaea and prokaryotes and have been shown to regulate gene silencing, DNA repair, metabolic enzymes, and life span. The most-studied function, gene silencing, involves the inactivation of chromosome domains containing key regulatory genes by packaging them into a specialized chromatin structure that is inaccessible to DNA-binding proteins. The oligomerization state of Sir2 appears to be organism-dependent, sometimes occurring as a monomer and sometimes as a multimer. Also included in this superfamily is a group of uncharacterized Sir2-like proteins which lack certain key catalytic residues and conserved zinc binding cysteines." Q#14088 - CGI_10027454 superfamily 241574 728 957 2.44E-81 267.915 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." 
Q#14089 - CGI_10027455 superfamily 216347 6 387 1.81E-102 313.318 cl08309 Cu_amine_oxid superfamily - - "Copper amine oxidase, enzyme domain; Copper amine oxidases are a ubiquitous and novel group of quinoenzymes that catalyze the oxidative deamination of primary amines to the corresponding aldehydes, with concomitant reduction of molecular oxygen to hydrogen peroxide. The enzymes are dimers of identical 70-90 kDa subunits, each of which contains a single copper ion and a covalently bound cofactor formed by the post-translational modification of a tyrosine side chain to 2,4,5-trihydroxyphenylalanine quinone (TPQ). This family corresponds to the catalytic domain of the enzyme." Q#14090 - CGI_10027456 superfamily 145726 4 67 0.00312015 34.2434 cl08353 Cu_amine_oxidN2 superfamily N - "Copper amine oxidase, N2 domain; This domain is the first or second structural domain in copper amine oxidases, it is known as the N2 domain. Its function is uncertain. The catalytic domain can be found in pfam01179. Copper amine oxidases are a ubiquitous and novel group of quinoenzymes that catalyze the oxidative deamination of primary amines to the corresponding aldehydes, with concomitant reduction of molecular oxygen to hydrogen peroxide. The enzymes are dimers of identical 70-90 kDa subunits, each of which contains a single copper ion and a covalently bound cofactor formed by the post-translational modification of a tyrosine side chain to 2,4,5-trihydroxyphenylalanine quinone (TPQ)." Q#14091 - CGI_10027457 superfamily 247856 112 174 6.52E-22 84.9069 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#14091 - CGI_10027457 superfamily 247856 19 97 3.67E-14 63.7209 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#14092 - CGI_10027458 superfamily 241677 28 194 5.05E-64 198.25 cl00197 cyclophilin superfamily - - "cyclophilin: cyclophilin-type peptidylprolyl cis- trans isomerases. This family contains eukaryotic, bacterial and archeal proteins which exhibit a peptidylprolyl cis- trans isomerases activity (PPIase, Rotamase) and in addition bind the immunosuppressive drug cyclosporin (CsA). Immunosuppression in vertebrates is believed to be the result of the cyclophilin A-cyclosporin protein drug complex binding to and inhibiting the protein-phosphatase calcineurin. PPIase is an enzyme which accelerates protein folding by catalyzing the cis-trans isomerization of the peptide bonds preceding proline residues. Cyclophilins are a diverse family in terms of function and have been implicated in protein folding processes which depend on catalytic /chaperone-like activities. This group contains human cyclophilin 40, a co-chaperone of the hsp90 chaperone system; human cyclophilin A, a chaperone in the HIV-1 infectious process and; human cyclophilin H, a component of the U4/U6 snRNP, whose isomerization or chaperoning activities may play a role in RNA splicing." Q#14093 - CGI_10027459 superfamily 213401 701 746 6.72E-06 44.6661 cl17098 CASP8AP2 superfamily - - "Caspase 8-associated protein 2 myb-like domain; This domain is the SANT/myb-like domain of Caspase 8-associated protein 2 (CASP8AP2) / GON-4 like proteins. 
CASP8AP2 (aka Flice-Associated Huge Protein (FLASH)) is implicated in numerous gene regulatory roles including roles in embryogenesis, oncogenesis, down-regulation of replication-dependent histone genes, regulation of Caspase 8 activity at the death-inducing signaling complex (DISC), and as a useful marker in leukemia prognosis. Gon-4 is critical in Caenorhabditis elegans gonadogenesis. Danio rerio GON4 is a regulator of gene expression in hematopoietic development, possibly by repressing expression. These proteins are members of the SANT/myb group. SANT is named after 'SWI3, ADA2, N-CoR and TFIIIB', several factors that share this domain. The SANT domain resembles the 3 alpha-helix bundle of the DNA-binding Myb domains and is found in a diverse set of proteins." Q#14093 - CGI_10027459 superfamily 202341 1 27 0.00172787 37.0687 cl07842 PAH superfamily N - "Paired amphipathic helix repeat; This family contains the paired amphipathic helix repeat. The family contains the yeast SIN3 gene (also known as SDI1) that is a negative regulator of the yeast HO gene. This repeat may be distantly related to the helix-loop-helix motif, which mediate protein-protein interactions." Q#14094 - CGI_10027460 superfamily 220692 47 346 4.77E-09 56.0585 cl18570 7TM_GPCR_Srw superfamily - - Serpentine type 7TM GPCR chemoreceptor Srw; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srw is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. The genes encoding Srw do not appear to be under as strong an adaptive evolutionary pressure as those of Srz. 
Q#14096 - CGI_10027462 superfamily 247044 15 155 1.73E-42 139.614 cl15697 ADF_gelsolin superfamily - - Actin depolymerization factor/cofilin- and gelsolin-like domains; Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. Q#14097 - CGI_10027463 superfamily 245304 151 448 6.73E-149 436.606 cl10459 Peptidases_S8_S53 superfamily - - "Peptidase domain in the S8 and S53 families; Members of the peptidases S8 (subtilisin and kexin) and S53 (sedolisin) family include endopeptidases and exopeptidases. The S8 family has an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure and are not homologous to trypsin. Serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The S53 family contains a catalytic triad Glu/Asp/Ser with an additional acidic residue Asp in the oxyanion hole, similar to that of subtilisin. The serine residue here is the nucleophilic equivalent of the serine residue in the S8 family, while glutamic acid has the same role here as the histidine base. However, the aspartic acid residue that acts as an electrophile is quite different. In S53, it follows glutamic acid, while in S8 it precedes histidine. The stability of these enzymes may be enhanced by calcium; some members have been shown to bind up to 4 ions via binding sites with different affinity. There is a great diversity in the characteristics of their members: some contain disulfide bonds, some are intracellular while others are extracellular, some function at extreme temperatures, and others at high or low pH values." 
Q#14097 - CGI_10027463 superfamily 201820 536 626 1.63E-29 112.719 cl08326 P_proprotein superfamily - - Proprotein convertase P-domain; A unique feature of the eukaryotic subtilisin-like proprotein convertases is the presence of an additional highly conserved sequence of approximately 150 residues (P domain) located immediately downstream of the catalytic domain. Q#14097 - CGI_10027463 superfamily 245304 411 487 5.18E-06 47.2874 cl10459 Peptidases_S8_S53 superfamily N - "Peptidase domain in the S8 and S53 families; Members of the peptidases S8 (subtilisin and kexin) and S53 (sedolisin) family include endopeptidases and exopeptidases. The S8 family has an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure and are not homologous to trypsin. Serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The S53 family contains a catalytic triad Glu/Asp/Ser with an additional acidic residue Asp in the oxyanion hole, similar to that of subtilisin. The serine residue here is the nucleophilic equivalent of the serine residue in the S8 family, while glutamic acid has the same role here as the histidine base. However, the aspartic acid residue that acts as an electrophile is quite different. In S53, it follows glutamic acid, while in S8 it precedes histidine. The stability of these enzymes may be enhanced by calcium; some members have been shown to bind up to 4 ions via binding sites with different affinity. There is a great diversity in the characteristics of their members: some contain disulfide bonds, some are intracellular while others are extracellular, some function at extreme temperatures, and others at high or low pH values." Q#14098 - CGI_10027464 superfamily 206050 24 121 1.36E-29 114.293 cl16449 KIAA1430 superfamily - - KIAA1430 homologue; This is a family of KIAA1430 homologues. The function is not known. 
Q#14098 - CGI_10027464 superfamily 207654 743 808 5.38E-10 57.0674 cl02574 Annexin superfamily - - Annexin; This family of annexins also includes giardin that has been shown to function as an annexin. Q#14098 - CGI_10027464 superfamily 207654 671 729 5.45E-09 53.9858 cl02574 Annexin superfamily - - Annexin; This family of annexins also includes giardin that has been shown to function as an annexin. Q#14098 - CGI_10027464 superfamily 207654 900 965 7.77E-05 41.6594 cl02574 Annexin superfamily - - Annexin; This family of annexins also includes giardin that has been shown to function as an annexin. Q#14098 - CGI_10027464 superfamily 207654 404 470 0.00916958 35.4962 cl02574 Annexin superfamily - - Annexin; This family of annexins also includes giardin that has been shown to function as an annexin. Q#14099 - CGI_10027465 superfamily 216653 55 216 1.24E-11 59.5331 cl08331 Na_Ca_ex superfamily - - "Sodium/calcium exchanger protein; This is a family of sodium/calcium exchanger integral membrane proteins. This family covers the integral membrane regions of the proteins. Sodium/calcium exchangers regulate intracellular Ca2+ concentrations in many cells; cardiac myocytes, epithelial cells, neurons retinal rod photoreceptors and smooth muscle cells. Ca2+ is moved into or out of the cytosol depending on Na+ concentration. In humans and rats there are 3 isoforms; NCX1 NCX2 and NCX3." Q#14100 - CGI_10027466 superfamily 216653 795 932 4.14E-21 91.1194 cl08331 Na_Ca_ex superfamily - - "Sodium/calcium exchanger protein; This is a family of sodium/calcium exchanger integral membrane proteins. This family covers the integral membrane regions of the proteins. Sodium/calcium exchangers regulate intracellular Ca2+ concentrations in many cells; cardiac myocytes, epithelial cells, neurons retinal rod photoreceptors and smooth muscle cells. Ca2+ is moved into or out of the cytosol depending on Na+ concentration. In humans and rats there are 3 isoforms; NCX1 NCX2 and NCX3." 
Q#14100 - CGI_10027466 superfamily 216653 57 221 3.57E-17 79.5635 cl08331 Na_Ca_ex superfamily - - "Sodium/calcium exchanger protein; This is a family of sodium/calcium exchanger integral membrane proteins. This family covers the integral membrane regions of the proteins. Sodium/calcium exchangers regulate intracellular Ca2+ concentrations in many cells; cardiac myocytes, epithelial cells, neurons retinal rod photoreceptors and smooth muscle cells. Ca2+ is moved into or out of the cytosol depending on Na+ concentration. In humans and rats there are 3 isoforms; NCX1 NCX2 and NCX3." Q#14100 - CGI_10027466 superfamily 207627 583 675 1.18E-15 74.2131 cl02522 Calx-beta superfamily - - Calx-beta domain; Calx-beta domain. Q#14100 - CGI_10027466 superfamily 207627 500 558 9.08E-15 71.5119 cl02522 Calx-beta superfamily N - Calx-beta domain; Calx-beta domain. Q#14102 - CGI_10027468 superfamily 243051 9 115 2.20E-16 74.3365 cl02479 MAM superfamily N - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." 
Q#14102 - CGI_10027468 superfamily 243051 141 304 2.85E-16 73.9241 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#14103 - CGI_10027469 superfamily 192566 111 186 0.00604769 35.7432 cl18180 COG5 superfamily C - "Golgi transport complex subunit 5; The COG complex, the peripheral membrane oligomeric protein complex involved in intra-Golgi protein trafficking, consists of eight subunits arranged in two lobes bridged by Cog1. Cog5 is in the smaller, B lobe, bound in with Cog6-8, and is itself bound to Cog1 as well as, strongly, to Cog7." Q#14105 - CGI_10027471 superfamily 110440 136 163 2.50E-05 41.2393 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. 
This interaction is mediated by the NHL repeats." Q#14105 - CGI_10027471 superfamily 110440 187 215 0.000490166 37.3873 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#14106 - CGI_10027472 superfamily 247041 54 318 4.70E-76 241.452 cl15692 CE4_SF superfamily - - "Catalytic NodB homology domain of the carbohydrate esterase 4 superfamily; The carbohydrate esterase 4 (CE4) superfamily mainly includes chitin deacetylases (EC 3.5.1.41), bacterial peptidoglycan N-acetylglucosamine deacetylases (EC 3.5.1.-), and acetylxylan esterases (EC 3.1.1.72), which catalyze the N- or O-deacetylation of substrates such as acetylated chitin, peptidoglycan, and acetylated xylan, respectively. Members in this superfamily contain a NodB homology domain that adopts a deformed (beta/alpha)8 barrel fold, which encompasses a mononuclear metalloenzyme employing a conserved His-His-Asp zinc-binding triad, closely associated with the conserved catalytic base (aspartic acid) and acid (histidine) to carry out acid/base catalysis. The NodB homology domain of CE4 superfamily is remotely related to the 7-stranded beta/alpha barrel catalytic domain of the superfamily consisting of family 38 glycoside hydrolases (GH38), family 57 heat stable retaining glycoside hydrolases (GH57), lactam utilization protein LamB/YcsF family proteins, and YdjC-family proteins." 
Q#14107 - CGI_10027473 superfamily 247041 2 192 1.69E-58 190.221 cl15692 CE4_SF superfamily N - "Catalytic NodB homology domain of the carbohydrate esterase 4 superfamily; The carbohydrate esterase 4 (CE4) superfamily mainly includes chitin deacetylases (EC 3.5.1.41), bacterial peptidoglycan N-acetylglucosamine deacetylases (EC 3.5.1.-), and acetylxylan esterases (EC 3.1.1.72), which catalyze the N- or O-deacetylation of substrates such as acetylated chitin, peptidoglycan, and acetylated xylan, respectively. Members in this superfamily contain a NodB homology domain that adopts a deformed (beta/alpha)8 barrel fold, which encompasses a mononuclear metalloenzyme employing a conserved His-His-Asp zinc-binding triad, closely associated with the conserved catalytic base (aspartic acid) and acid (histidine) to carry out acid/base catalysis. The NodB homology domain of CE4 superfamily is remotely related to the 7-stranded beta/alpha barrel catalytic domain of the superfamily consisting of family 38 glycoside hydrolases (GH38), family 57 heat stable retaining glycoside hydrolases (GH57), lactam utilization protein LamB/YcsF family proteins, and YdjC-family proteins." Q#14108 - CGI_10027474 superfamily 247041 230 349 1.06E-19 86.9871 cl15692 CE4_SF superfamily C - "Catalytic NodB homology domain of the carbohydrate esterase 4 superfamily; The carbohydrate esterase 4 (CE4) superfamily mainly includes chitin deacetylases (EC 3.5.1.41), bacterial peptidoglycan N-acetylglucosamine deacetylases (EC 3.5.1.-), and acetylxylan esterases (EC 3.1.1.72), which catalyze the N- or O-deacetylation of substrates such as acetylated chitin, peptidoglycan, and acetylated xylan, respectively. 
Members in this superfamily contain a NodB homology domain that adopts a deformed (beta/alpha)8 barrel fold, which encompasses a mononuclear metalloenzyme employing a conserved His-His-Asp zinc-binding triad, closely associated with the conserved catalytic base (aspartic acid) and acid (histidine) to carry out acid/base catalysis. The NodB homology domain of CE4 superfamily is remotely related to the 7-stranded beta/alpha barrel catalytic domain of the superfamily consisting of family 38 glycoside hydrolases (GH38), family 57 heat stable retaining glycoside hydrolases (GH57), lactam utilization protein LamB/YcsF family proteins, and YdjC-family proteins." Q#14110 - CGI_10027476 superfamily 241592 39 127 1.80E-37 124.551 cl00074 H2A superfamily - - "Histone 2A; H2A is a subunit of the nucleosome. The nucleosome is an octamer containing two H2A, H2B, H3, and H4 subunits. The H2A subunit performs essential roles in maintaining structural integrity of the nucleosome, chromatin condensation, and binding of specific chromatin-associated proteins." Q#14112 - CGI_10027478 superfamily 241581 127 172 0.000778517 40.4474 cl00062 FHA superfamily N - "Forkhead associated domain (FHA); found in eukaryotic and prokaryotic proteins. Putative nuclear signalling domain. FHA domains may bind phosphothreonine, phosphoserine and sometimes phosphotyrosine. In eukaryotes, many FHA domain-containing proteins localize to the nucleus, where they participate in establishing or maintaining cell cycle checkpoints, DNA repair, or transcriptional regulation. Members of the FHA family include: Dun1, Rad53, Cds1, Mek1, KAPP(kinase-associated protein phosphatase),and Ki-67 (a human nuclear protein related to cell proliferation)." Q#14113 - CGI_10027479 superfamily 147570 1 189 1.45E-21 87.2698 cl09371 Siva superfamily - - "Cd27 binding protein (Siva); Siva binds to the CD27 cytoplasmic tail. It has a DD homology region, a box-B-like ring finger, and a zinc finger-like domain. 
Overexpression of Siva in various cell lines induces apoptosis, suggesting an important role for Siva in the CD27-transduced apoptotic pathway. Siva-1 binds to and inhibits BCL-X(L)-mediated protection against UV radiation-induced apoptosis. Indeed, the unique amphipathic helical region (SAH) present in Siva-1 is required for its binding to BCL-X(L) and sensitising cells to UV radiation. Natural complexes of Siva-1/BCL-X(L) are detected in HUT78 and murine thymocyte, suggesting a potential role for Siva-1 in regulating T cell homeostasis. This family contains both Siva-1 and the shorter Siva-2 lacking the sequence coded by exon 2. It has been suggested that Siva-2 could regulate the function of Siva-1." Q#14114 - CGI_10027480 superfamily 247799 167 215 7.43E-06 41.7786 cl17245 KH-I superfamily - - "K homology RNA-binding domain, type I. KH binds single-stranded RNA or DNA. It is found in a wide variety of proteins including ribosomal proteins, transcription factors and post-transcriptional modifiers of mRNA. There are two different KH domains that belong to different protein folds, but they share a single KH motif. The KH motif is folded into a beta alpha alpha beta unit. In addition to the core, type II KH domains (e.g. ribosomal protein S3) include N-terminal extension and type I KH domains (e.g. hnRNP K) contain C-terminal extension." Q#14115 - CGI_10027481 superfamily 247744 9 194 5.01E-56 178.576 cl17190 NK superfamily - - "Nucleoside/nucleotide kinase (NK) is a protein superfamily consisting of multiple families of enzymes that share structural similarity and are functionally related to the catalysis of the reversible phosphate group transfer from nucleoside triphosphates to nucleosides/nucleotides, nucleoside monophosphates, or sugars. Members of this family play a wide variety of essential roles in nucleotide metabolism, the biosynthesis of coenzymes and aromatic compounds, as well as the metabolism of sugar and sulfate." 
Q#14120 - CGI_10027486 superfamily 245194 29 107 0.00287962 35.5465 cl09909 Cas8a1_I-A superfamily N - "CRISPR/Cas system-associated protein Cas8a1; CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Large proteins, some contain Zn-finger domain; signature gene for I-A subtype; also known as CXXC_CXXC family" Q#14126 - CGI_10027494 superfamily 243072 48 171 1.60E-39 133.663 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#14126 - CGI_10027494 superfamily 243072 18 50 0.000530506 35.9928 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#14127 - CGI_10012699 superfamily 248097 7 69 1.29E-08 46.8746 cl17543 C1q superfamily N - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. 
Q#14128 - CGI_10012700 superfamily 245226 14 181 4.98E-21 86.5856 cl10012 DnaQ_like_exo superfamily - - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#14131 - CGI_10012703 superfamily 242406 42 113 4.31E-08 48.3565 cl01271 DUF1768 superfamily N - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. 
Q#14140 - CGI_10012712 superfamily 243134 444 567 1.38E-19 87.3196 cl02663 Fasciclin superfamily - - "Fasciclin domain; This extracellular domain is found repeated four times in grasshopper fasciclin I as well as in proteins from mammals, sea urchins, plants, yeast and bacteria." Q#14140 - CGI_10012712 superfamily 243134 938 1072 8.39E-19 85.0084 cl02663 Fasciclin superfamily - - "Fasciclin domain; This extracellular domain is found repeated four times in grasshopper fasciclin I as well as in proteins from mammals, sea urchins, plants, yeast and bacteria." Q#14140 - CGI_10012712 superfamily 243134 313 421 1.24E-17 81.5416 cl02663 Fasciclin superfamily - - "Fasciclin domain; This extracellular domain is found repeated four times in grasshopper fasciclin I as well as in proteins from mammals, sea urchins, plants, yeast and bacteria." Q#14140 - CGI_10012712 superfamily 243134 1090 1201 1.66E-12 66.5188 cl02663 Fasciclin superfamily - - "Fasciclin domain; This extracellular domain is found repeated four times in grasshopper fasciclin I as well as in proteins from mammals, sea urchins, plants, yeast and bacteria." Q#14140 - CGI_10012712 superfamily 205157 1339 1374 6.37E-08 50.9991 cl18264 EGF_3 superfamily - - EGF domain; This family includes a variety of EGF-like domain homologues. This family includes the C-terminal domain of the malaria parasite MSP1 protein. Q#14140 - CGI_10012712 superfamily 205157 172 208 8.34E-06 44.8359 cl18264 EGF_3 superfamily - - EGF domain; This family includes a variety of EGF-like domain homologues. This family includes the C-terminal domain of the malaria parasite MSP1 protein. Q#14140 - CGI_10012712 superfamily 205157 890 926 6.21E-05 42.5247 cl18264 EGF_3 superfamily - - EGF domain; This family includes a variety of EGF-like domain homologues. This family includes the C-terminal domain of the malaria parasite MSP1 protein. 
Q#14140 - CGI_10012712 superfamily 205157 1378 1416 0.000813473 39.0579 cl18264 EGF_3 superfamily - - EGF domain; This family includes a variety of EGF-like domain homologues. This family includes the C-terminal domain of the malaria parasite MSP1 protein. Q#14140 - CGI_10012712 superfamily 205157 811 842 0.00219967 37.9023 cl18264 EGF_3 superfamily - - EGF domain; This family includes a variety of EGF-like domain homologues. This family includes the C-terminal domain of the malaria parasite MSP1 protein. Q#14143 - CGI_10012715 superfamily 242564 58 155 3.60E-38 127.761 cl01534 NDUFA12 superfamily - - "NADH ubiquinone oxidoreductase subunit NDUFA12; This family contains the 17.2 kD subunit of complex I (NDUFA12) and its homologues. The family also contains a second related eukaryotic protein of unknown function, ." Q#14144 - CGI_10012716 superfamily 243072 115 241 4.89E-28 110.166 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#14144 - CGI_10012716 superfamily 243072 368 474 1.73E-23 97.069 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#14144 - CGI_10012716 superfamily 247057 553 596 4.59E-10 56.9114 cl15755 SAM_superfamily superfamily N - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#14145 - CGI_10012717 superfamily 215724 46 355 4.00E-167 472.489 cl14706 wnt superfamily - - "wnt family; Wnt genes have been identified in vertebrates and invertebrates but not in plants, unicellular eukaryotes or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathway) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families." Q#14146 - CGI_10012718 superfamily 211407 12 485 0 944.125 cl16940 ST7 superfamily - - "Suppression of tumorigenicity 7; ST7 is a metazoan protein that behaves as a tumor suppressor in human cancer cells. It appears to localize to the cytoplasm and plasma membrane, and may mediate tumor suppression by regulating genes that are involved in oncogenic pathways and/or maintain cellular structure. 
It has been suggested that the suppression of tumorigenicity is associated with a function in mediating the remodeling of the extracellular matrix. However, somatic mutations of ST7 have not been observed as being commonly associated with molecular pathogenesis in various human neoplasias." Q#14148 - CGI_10012720 superfamily 248345 892 1051 1.95E-38 143.548 cl17791 SAC3_GANP superfamily - - "SAC3/GANP/Nin1/mts3/eIF-3 p25 family; This large family includes diverse proteins involved in large complexes. The alignment contains one highly conserved negatively charged residue and one highly conserved positively charged residue that are probably important for the function of these proteins. The family includes the yeast nuclear export factor Sac3, and mammalian GANP/MCM3-associated proteins, which facilitate the nuclear localisation of MCM3, a protein that associates with chromatin in the G1 phase of the cell-cycle. The 26S protease (or 26S proteasome) is responsible for degrading ubiquitin conjugates. It consists of 19S regulatory complexes associated with the ends of 20S proteasomes. The 19S regulatory complex is composed of about 20 different polypeptides and confers ATP-dependence and substrate specificity to the 26S enzyme. The conserved region occurs at the C-terminal of the Nin1-like regulatory subunit. This family includes several eukaryotic translation initiation factor 3 subunit 11 (eIF-3 p25) proteins. Eukaryotic initiation factor 3 (eIF3) is a multisubunit complex that is required for binding of mRNA to 40 S ribosomal subunits, stabilisation of ternary complex binding to 40 S subunits, and dissociation of 40 and 60 S subunits." 
Q#14148 - CGI_10012720 superfamily 247723 561 632 2.11E-20 88.5935 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#14148 - CGI_10012720 superfamily 222274 18 88 5.31E-06 47.0956 cl18658 Nucleoporin_FG superfamily C - "Nucleoporin FG repeat region; This family includes a number of FG repeats that are found in nucleoporin proteins. This family includes the yeast nucleoporins Nup116, Nup100, Nup49, Nup57 and Nup 145." Q#14148 - CGI_10012720 superfamily 222274 313 400 0.00490019 37.4656 cl18658 Nucleoporin_FG superfamily - - "Nucleoporin FG repeat region; This family includes a number of FG repeats that are found in nucleoporin proteins. This family includes the yeast nucleoporins Nup116, Nup100, Nup49, Nup57 and Nup 145." 
Q#14149 - CGI_10025696 superfamily 241874 182 482 0.00351796 39.5922 cl00456 SLC5-6-like_sbd superfamily N - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#14151 - CGI_10025698 superfamily 247750 136 449 3.01E-156 449.131 cl17196 E1_enzyme_family superfamily - - "Superfamily of activating enzymes (E1) of the ubiquitin-like proteins. This family includes classical ubiquitin-activating enzymes E1, ubiquitin-like (ubl) activating enzymes and other mechanistic homologes, like MoeB, Thif1 and others. The common reaction mechanism catalyzed by MoeB, ThiF and the E1 enzymes begins with a nucleophilic attack of the C-terminal carboxylate of MoaD, ThiS and ubiquitin, respectively, on the alpha-phosphate of an ATP molecule bound at the active site of the activating enzymes, leading to the formation of a high-energy acyladenylate intermediate and subsequently to the formation of a thiocarboxylate at the C termini of MoaD and ThiS." 
Q#14152 - CGI_10025699 superfamily 243161 5 69 4.88E-10 54.3634 cl02739 THAP superfamily C - "THAP domain; The THAP domain is a putative DNA-binding domain (DBD) and probably also binds a zinc ion. It features the conserved C2CH architecture (consensus sequence: Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His). Other universal features include the location of the domain at the N-termini of proteins, its size of about 90 residues, a C-terminal AVPTIF box and several other conserved residues. Orthologues of the human THAP domain have been identified in other vertebrates and probably worms and flies, but not in other eukaryotes or any prokaryotes." Q#14157 - CGI_10025704 superfamily 247794 5 310 1.32E-125 364.408 cl17240 FDH_GDH_like superfamily - - "Formate/glycerate dehydrogenases, D-specific 2-hydroxy acid dehydrogenases and related dehydrogenases; The formate/glycerate dehydrogenase like family contains a diverse group of enzymes such as formate dehydrogenase (FDH), glycerate dehydrogenase (GDH), D-lactate dehydrogenase, L-alanine dehydrogenase, and S-Adenosylhomocysteine hydrolase, that share a common 2-domain structure. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar domains of the alpha/beta Rossmann fold NAD+ binding form. The NAD(P) binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD(P) is bound, primarily to the C-terminal portion of the 2nd (internal) domain. While many members of this family are dimeric, alanine DH is hexameric and phosphoglycerate DH is tetrameric. 2-hydroxyacid dehydrogenases are enzymes that catalyze the conversion of a wide variety of D-2-hydroxy acids to their corresponding keto acids. The general mechanism is (R)-lactate + acceptor to pyruvate + reduced acceptor. 
Formate dehydrogenase (FDH) catalyzes the NAD+-dependent oxidation of formate ion to carbon dioxide with the concomitant reduction of NAD+ to NADH. FDHs of this family contain no metal ions or prosthetic groups. Catalysis occurs through direct transfer of a hydride ion to NAD+ without the stages of acid-base catalysis typically found in related dehydrogenases." Q#14158 - CGI_10025705 superfamily 245602 262 561 1.56E-153 448.973 cl11402 GH31 superfamily - - "The enzymes of glycosyl hydrolase family 31 (GH31) occur in prokaryotes, eukaryotes, and archaea with a wide range of hydrolytic activities, including alpha-glucosidase (glucoamylase and sucrase-isomaltase), alpha-xylosidase, 6-alpha-glucosyltransferase, 3-alpha-isomaltosyltransferase and alpha-1,4-glucan lyase. All GH31 enzymes cleave a terminal carbohydrate moiety from a substrate that varies considerably in size, depending on the enzyme, and may be either a starch or a glycoprotein. In most cases, the pyranose moiety recognized in subsite -1 of the substrate binding site is an alpha-D-glucose, though some GH31 family members show a preference for alpha-D-xylose. Several GH31 enzymes can accommodate both glucose and xylose and different levels of discrimination between the two have been observed. Most characterized GH31 enzymes are alpha-glucosidases. In mammals, GH31 members with alpha-glucosidase activity are implicated in at least three distinct biological processes. The lysosomal acid alpha-glucosidase (GAA) is essential for glycogen degradation and a deficiency or malfunction of this enzyme causes glycogen storage disease II, also known as Pompe disease. In the endoplasmic reticulum, alpha-glucosidase II catalyzes the second step in the N-linked oligosaccharide processing pathway that constitutes part of the quality control system for glycoprotein folding and maturation. 
The intestinal enzymes sucrase-isomaltase (SI) and maltase-glucoamylase (MGAM) play key roles in the final stage of carbohydrate digestion, making alpha-glucosidase inhibitors useful in the treatment of type 2 diabetes. GH31 alpha-glycosidases are retaining enzymes that cleave their substrates via an acid/base-catalyzed, double-displacement mechanism involving a covalent glycosyl-enzyme intermediate. Two aspartic acid residues have been identified as the catalytic nucleophile and the acid/base, respectively." Q#14159 - CGI_10025706 superfamily 241721 50 329 5.16E-107 327.266 cl00246 MTHFR superfamily - - "Methylenetetrahydrofolate reductase (MTHFR). 5,10-Methylenetetrahydrofolate is reduced to 5-methyltetrahydrofolate by methylenetetrahydrofolate reductase, a cytoplasmic, NAD(P)-dependent enzyme. 5-methyltetrahydrofolate is utilized by methionine synthase to convert homocysteine to methionine. The enzymatic mechanism is a ping-pong bi-bi mechanism, in which NAD(P)+ release precedes the binding of methylenetetrahydrofolate and the acceptor is free FAD. The family includes the 5,10-methylenetetrahydrofolate reductase EC:1.7.99.5 from prokaryotes and methylenetetrahydrofolate reductase EC: 1.5.1.20 from eukaryotes. The bacterial enzyme is a homotetramer and NADH is the preferred reductant while the eukaryotic enzyme is a homodimer and NADPH is the preferred reductant. In humans, there are several clinically significant mutations in MTHFR that result in hyperhomocysteinemia, which is a risk factor for the development of cardiovascular disease." Q#14162 - CGI_10025709 superfamily 219119 45 93 5.45E-23 86.1609 cl05928 SPC12 superfamily N - "Microsomal signal peptidase 12 kDa subunit (SPC12); This family consists of several microsomal signal peptidase 12 kDa subunit proteins. Translocation of polypeptide chains across the endoplasmic reticulum (ER) membrane is triggered by signal sequences. 
Subsequently, signal recognition particle interacts with its membrane receptor and the ribosome-bound nascent chain is targeted to the ER where it is transferred into a protein-conducting channel. At some point, a second signal sequence recognition event takes place in the membrane and translocation of the nascent chain through the membrane occurs. The signal sequence of most secretory and membrane proteins is cleaved off at this stage. Cleavage occurs by the signal peptidase complex (SPC) as soon as the lumenal domain of the translocating polypeptide is large enough to expose its cleavage site to the enzyme. The signal peptidase complex is possibly also involved in proteolytic events in the ER membrane other than the processing of the signal sequence, for example the further digestion of the cleaved signal peptide or the degradation of membrane proteins. Mammalian signal peptidase is a complex of five different polypeptide chains. This family represents the 12 kDa subunit (SPC12)." Q#14163 - CGI_10025710 superfamily 241575 1564 1628 0.00855099 36.4815 cl00054 DSRM superfamily - - "Double-stranded RNA binding motif. Binding is not sequence specific but is highly specific for double stranded RNA. Found in a variety of proteins including dsRNA dependent protein kinase PKR, RNA helicases, Drosophila staufen protein, E. coli RNase III, RNases H1, and dsRNA dependent adenosine deaminases." Q#14163 - CGI_10025710 superfamily 202558 1 228 2.21E-118 374.373 cl03915 XRN_N superfamily - - "XRN 5'-3' exonuclease N-terminus; This family aligns residues towards the N-terminus of several proteins with multiple functions. The members of this family all appear to possess 5'-3' exonuclease activity EC:3.1.11.-. Thus, the aligned region may be necessary for 5' to 3' exonuclease function. The family also contains several Xrn1 and Xrn2 proteins. 
The 5'-3' exoribonucleases Xrn1p and Xrn2p/Rat1p function in the degradation and processing of several classes of RNA in Saccharomyces cerevisiae. Xrn1p is the main enzyme catalyzing cytoplasmic mRNA degradation in multiple decay pathways, whereas Xrn2p/Rat1p functions in the processing of rRNAs and small nucleolar RNAs (snoRNAs) in the nucleus." Q#14167 - CGI_10025714 superfamily 242905 49 150 3.25E-56 173.566 cl02150 TAF10 superfamily - - "The TATA Binding Protein (TBP) Associated Factor 10; The TATA Binding Protein (TBP) Associated Factor 10 (TAF 10) is one of several TAFs that bind TBP and are involved in forming the Transcription Factor IID (TFIID) complex. TFIID is one of the seven General Transcription Factors (GTF) (TFIIA, TFIIB, TFIID, TFIIE, TFIIF, and TFIID) that are involved in accurate initiation of transcription by RNA polymerase II in eukaryotes. TFIID plays an important role in the recognition of promoter DNA and the assembly of the preinitiation complex. The TFIID complex is composed of the TBP and at least 13 TAFs. TAFs are named after their electrophoretic mobility in polyacrylamide gels in different species. Several hypotheses are proposed for TAF functions, such as serving as activator-binding sites, being involved in core-promoter recognition, or to perform an essential catalytic activity. Each TAF - with the help of a specific activator - is required only for the expression of a subset of genes, and TAFs are not universally involved in transcription such as the GTFs. TAF10 regulates genes that are important for cell cycle progression and cell morphology. A lack of TAF10 leads to cell cycle arrest and cell death by apoptosis in mouse. In both yeast and human cells, TAFs have been found as components of other complexes besides TFIID. TAF10 is part of other transcription regulatory multiprotein complexes (e.g., SAGA, TBP-free TAF-containing complex [TFTC], STAGA, and PCAF/GCN5). Several TAFs interact via histone-fold motifs. 
The histone fold (HFD) is the interaction motif involved in heterodimerization of the core histones and their assembly into nucleosome octamer. The minimal HFD contains three alpha-helices linked by two loops. The HFD is found in core histones, TAFs and many other transcription factors. Five HF-containing TAF pairs have been described in TFIID: TAF6-TAF9, TAF4-TAF12, TAF11-TAF13, TAF8-TAF10 and TAF3-TAF10." Q#14168 - CGI_10025715 superfamily 243175 71 181 6.50E-45 147.755 cl02776 GST_C_family superfamily - - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. 
In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." Q#14168 - CGI_10025715 superfamily 241832 12 63 7.12E-08 47.5665 cl00388 Thioredoxin_like superfamily N - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#14171 - CGI_10025718 superfamily 248458 86 257 7.59E-20 89.6805 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. 
Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#14171 - CGI_10025718 superfamily 248458 315 506 1.26E-09 58.4793 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. 
The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#14172 - CGI_10025719 superfamily 199156 60 74 0.00329169 32.0409 cl15298 zf-CCHC superfamily - - "Zinc knuckle; The zinc knuckle is a zinc binding motif composed of the following CX2CX4HX4C where X can be any amino acid. The motifs are mostly from retroviral gag proteins (nucleocapsid). Prototype structure is from HIV. Also contains members involved in eukaryotic gene regulation, such as C. elegans GLH-1. Structure is an 18-residue zinc finger." Q#14173 - CGI_10025720 superfamily 241913 219 333 3.92E-06 45.3124 cl00509 hot_dog superfamily N - "The hotdog fold was initially identified in the E. coli FabA (beta-hydroxydecanoyl-acyl carrier protein (ACP)-dehydratase) structure and subsequently in 4HBT (4-hydroxybenzoyl-CoA thioesterase) from Pseudomonas. A number of other seemingly unrelated proteins also share the hotdog fold. These proteins have related, but distinct, catalytic activities that include metabolic roles such as thioester hydrolysis in fatty acid metabolism, and degradation of phenylacetic acid and the environmental pollutant 4-chlorobenzoate. 
This superfamily also includes the PaaI-like protein FapR, a non-catalytic bacterial homolog involved in transcriptional regulation of fatty acid biosynthesis." Q#14173 - CGI_10025720 superfamily 241913 133 206 2.48E-05 42.5933 cl00509 hot_dog superfamily C - "The hotdog fold was initially identified in the E. coli FabA (beta-hydroxydecanoyl-acyl carrier protein (ACP)-dehydratase) structure and subsequently in 4HBT (4-hydroxybenzoyl-CoA thioesterase) from Pseudomonas. A number of other seemingly unrelated proteins also share the hotdog fold. These proteins have related, but distinct, catalytic activities that include metabolic roles such as thioester hydrolysis in fatty acid metabolism, and degradation of phenylacetic acid and the environmental pollutant 4-chlorobenzoate. This superfamily also includes the PaaI-like protein FapR, a non-catalytic bacterial homolog involved in transcriptional regulation of fatty acid biosynthesis." Q#14173 - CGI_10025720 superfamily 241913 353 426 0.000241956 39.5117 cl00509 hot_dog superfamily C - "The hotdog fold was initially identified in the E. coli FabA (beta-hydroxydecanoyl-acyl carrier protein (ACP)-dehydratase) structure and subsequently in 4HBT (4-hydroxybenzoyl-CoA thioesterase) from Pseudomonas. A number of other seemingly unrelated proteins also share the hotdog fold. These proteins have related, but distinct, catalytic activities that include metabolic roles such as thioester hydrolysis in fatty acid metabolism, and degradation of phenylacetic acid and the environmental pollutant 4-chlorobenzoate. This superfamily also includes the PaaI-like protein FapR, a non-catalytic bacterial homolog involved in transcriptional regulation of fatty acid biosynthesis." 
Q#14174 - CGI_10025721 superfamily 243054 692 903 1.78E-44 163.385 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#14174 - CGI_10025721 superfamily 243054 1668 1879 2.02E-40 151.444 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#14174 - CGI_10025721 superfamily 243054 153 374 5.05E-38 144.511 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine 
(L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#14174 - CGI_10025721 superfamily 243054 481 690 3.23E-36 139.503 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#14174 - CGI_10025721 superfamily 243054 271 480 9.04E-35 135.266 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#14174 - CGI_10025721 superfamily 243054 1457 1667 2.27E-34 134.11 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an 
antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#14174 - CGI_10025721 superfamily 243054 1191 1339 2.86E-29 119.087 cl02488 SPEC superfamily C - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#14174 - CGI_10025721 superfamily 243054 1881 2088 3.43E-29 118.702 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#14174 - CGI_10025721 superfamily 247683 984 1036 1.12E-28 112.193 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. 
Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#14174 - CGI_10025721 superfamily 243054 1987 2202 1.24E-19 90.5827 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#14174 - CGI_10025721 superfamily 243054 2101 2315 7.54E-13 69.3967 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue 
in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#14174 - CGI_10025721 superfamily 247856 2330 2398 1.54E-12 66.0321 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#14174 - CGI_10025721 superfamily 243054 15 122 1.37E-09 59.3816 cl02488 SPEC superfamily N - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#14174 - CGI_10025721 superfamily 243054 1054 1190 4.41E-09 57.8408 cl02488 SPEC superfamily N - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat 
are present here" Q#14174 - CGI_10025721 superfamily 243054 903 962 1.17E-09 58.4842 cl02488 SPEC superfamily C - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#14175 - CGI_10025722 superfamily 248012 487 618 3.31E-23 96.1884 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#14175 - CGI_10025722 superfamily 214507 375 428 1.91E-06 45.4988 cl15307 LRRCT superfamily - - Leucine rich repeat C-terminal domain; Leucine rich repeat C-terminal domain. Q#14176 - CGI_10025723 superfamily 248012 52 183 4.14E-20 82.3213 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#14177 - CGI_10025724 superfamily 248012 52 183 4.80E-26 98.1144 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. 
Q#14183 - CGI_10025730 superfamily 241593 383 535 3.96E-08 52.2638 cl00075 HATPase_c superfamily - - "Histidine kinase-like ATPases; This family includes several ATP-binding proteins for example: histidine kinase, DNA gyrase B, topoisomerases, heat shock protein HSP90, phytochrome-like ATPases and DNA mismatch repair proteins" Q#14184 - CGI_10025731 superfamily 247743 335 452 0.00502898 37.5644 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperones, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." 
Q#14184 - CGI_10025731 superfamily 243034 1079 1198 0.00660902 36.5892 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#14187 - CGI_10025734 superfamily 217473 275 436 2.72E-18 84.3389 cl03978 Mab-21 superfamily N - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#14189 - CGI_10025737 superfamily 219911 224 432 1.07E-71 232.252 cl07255 PRP3 superfamily - - pre-mRNA processing factor 3 (PRP3); Pre-mRNA processing factor 3 (PRP3) is a U4/U6-associated splicing factor. The human PRP3 has been implicated in autosomal retinitis pigmentosa. 
Q#14189 - CGI_10025737 superfamily 219080 459 590 3.15E-35 128.998 cl05851 DUF1115 superfamily - - Protein of unknown function (DUF1115); This family represents the C-terminus of hypothetical eukaryotic proteins of unknown function. Q#14190 - CGI_10025738 superfamily 243119 260 304 0.000212169 38.9541 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#14190 - CGI_10025738 superfamily 243119 403 447 0.000402325 38.1837 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#14190 - CGI_10025738 superfamily 243119 75 114 0.00365353 35.4974 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. 
It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#14192 - CGI_10025740 superfamily 243119 216 260 2.05E-05 40.8801 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#14195 - CGI_10025743 superfamily 242428 1 291 7.84E-49 165.249 cl01315 NRDE superfamily - - "NRDE protein; In eukaryotes this family is predicted to play a role in protein secretion and Golgi organisation. In plants this family includes Solanum habrochaites Cwp, which is involved in water permeability in the cuticles of fruit. Mouse T10 has been found to be expressed during early embryogenesis in mice. This protein contains a conserved NRDE motif." Q#14196 - CGI_10025744 superfamily 245814 108 184 1.77E-07 46.5607 cl11960 Ig superfamily N - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#14198 - CGI_10025746 superfamily 217518 161 266 1.07E-37 131.203 cl08388 CBM_21 superfamily - - "Putative phosphatase regulatory subunit; This family consists of several eukaryotic proteins that are thought to be involved in the regulation of glycogen metabolism. For instance, the mouse PTG protein has been shown to interact with glycogen synthase, phosphorylase kinase, phosphorylase a: these three enzymes have key roles in the regulation of glycogen metabolism. PTG also binds the catalytic subunit of protein phosphatase 1 (PP1C) and localises it to glycogen. Subsets of similar interactions have been observed with several other members of this family, such as the yeast PIG1, PIG2, GAC1 and GIP2 proteins. While the precise function of these proteins is not known, they may serve a scaffold function, bringing together the key enzymes in glycogen metabolism. This family is a carbohydrate binding domain." Q#14199 - CGI_10025747 superfamily 246936 634 748 4.63E-20 87.546 cl15354 CBS_pair superfamily - - "The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. 
Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase)." Q#14199 - CGI_10025747 superfamily 246918 177 229 9.12E-13 64.9155 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#14199 - CGI_10025747 superfamily 246918 241 300 2.80E-09 54.9003 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#14199 - CGI_10025747 superfamily 246918 305 364 3.07E-09 54.9003 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#14199 - CGI_10025747 superfamily 246918 125 165 0.00109112 38.3367 cl15278 TSP_1 superfamily N - Thrombospondin type 1 domain; Thrombospondin type 1 domain. 
Q#14200 - CGI_10002491 superfamily 247684 37 459 1.67E-84 272.229 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#14201 - CGI_10002589 superfamily 218847 24 170 4.90E-38 137.631 cl18479 CDO_I superfamily - - Cysteine dioxygenase type I; Cysteine dioxygenase type I (EC:1.13.11.20) converts cysteine to cysteinesulphinic acid and is the rate-limiting step in sulphate production. Q#14202 - CGI_10002512 superfamily 149426 132 281 2.36E-27 105.556 cl18038 SEFIR superfamily - - "SEFIR domain; This family comprises IL17 receptors (IL17Rs) and SEF proteins. The latter are feedback inhibitors of FGF signalling and are also thought to be receptors. 
Due to its similarity to the TIR domain (pfam01582), the SEFIR region is thought to be involved in homotypic interactions with other SEFIR/TIR-domain-containing proteins. Thus, SEFs and IL17Rs may be involved in TOLL/IL1R-like signalling pathways." Q#14203 - CGI_10002513 superfamily 241563 68 109 7.39E-07 46.3184 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#14203 - CGI_10002513 superfamily 241563 21 59 0.00912112 33.992 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#14204 - CGI_10002514 superfamily 247044 267 352 1.18E-22 90.3863 cl15697 ADF_gelsolin superfamily - - Actin depolymerization factor/cofilin- and gelsolin-like domains; Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. Q#14204 - CGI_10002514 superfamily 247044 121 169 1.18E-20 85.3524 cl15697 ADF_gelsolin superfamily NC - Actin depolymerization factor/cofilin- and gelsolin-like domains; Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. 
Q#14204 - CGI_10002514 superfamily 247044 163 231 6.22E-14 66.492 cl15697 ADF_gelsolin superfamily - - Actin depolymerization factor/cofilin- and gelsolin-like domains; Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. Q#14205 - CGI_10002247 superfamily 248097 162 285 5.10E-29 108.121 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#14205 - CGI_10002247 superfamily 203593 26 109 0.00169368 36.8946 cl18243 Mod_r superfamily C - "Modifier of rudimentary (Mod(r)) protein; This family represents a conserved region approximately 150 residues long within a number of eukaryotic proteins that show homology with Drosophila melanogaster Modifier of rudimentary (Mod(r)) proteins. The N-terminal half of Mod(r) proteins is acidic, whereas the C-terminal half is basic, and both of these regions are represented in this family. Members of this family include the Vps37 subunit of the endosomal sorting complex ESCRT-I, a complex involved in recruiting transport machinery for protein sorting at the multivesicular body (MVB). The yeast ESCRT-I complex consists of three proteins (Vps23, Vps28 and Vps37). The mammalian homologue of Vps37 interacts with Tsg101 (Pfam: PF05743) through its mod(r) domain and its function is essential for lysosomal sorting of EGF receptors." 
Q#14210 - CGI_10002016 superfamily 247684 10 430 1.67E-97 305.356 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#14217 - CGI_10003215 superfamily 242406 514 546 8.96E-06 44.5045 cl01271 DUF1768 superfamily NC - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. 
Q#14222 - CGI_10012091 superfamily 242181 7 324 6.78E-96 290.688 cl00900 Ldh_2 superfamily - - "Malate/L-lactate dehydrogenase; This family consists of bacterial and archaeal Malate/L-lactate dehydrogenase. L-lactate dehydrogenase, EC:1.1.1.27, catalyzes the reaction (S)-lactate + NAD(+) <=> pyruvate + NADH. Malate dehydrogenase, EC:1.1.1.37 and EC:1.1.1.82, catalyzes the reactions: (S)-malate + NAD(+) <=> oxaloacetate + NADH, and (S)-malate + NADP(+) <=> oxaloacetate + NADPH respectively." Q#14225 - CGI_10012094 superfamily 247755 5 150 6.73E-65 220.907 cl17201 ABC_ATPase superfamily C - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#14225 - CGI_10012094 superfamily 247755 1056 1159 3.40E-64 218.981 cl17201 ABC_ATPase superfamily N - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." 
Q#14225 - CGI_10012094 superfamily 244201 519 560 4.94E-06 46.1268 cl05797 SMC_hinge superfamily N - SMC proteins Flexible Hinge Domain; This family represents the hinge region of the SMC (Structural Maintenance of Chromosomes) family of proteins. The hinge region is responsible for formation of the DNA interacting dimer. It is also possible that the precise structure of it is an essential determinant of the specificity of the DNA-protein interaction. Q#14226 - CGI_10012095 superfamily 241645 182 250 1.09E-05 44.1802 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#14227 - CGI_10012096 superfamily 247743 1939 2035 9.13E-05 44.4443 cl17189 AAA superfamily C - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. 
The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#14227 - CGI_10012096 superfamily 193256 2302 2563 1.90E-67 232.916 cl18189 AAA_8 superfamily - - "P-loop containing dynein motor region D4; The 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This particular family is the D4 ATP-binding region of the motor." Q#14227 - CGI_10012096 superfamily 193257 2949 3172 4.87E-51 183.648 cl15086 AAA_9 superfamily - - "ATP-binding dynein motor region D5; The 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This particular family is the D5 ATP-binding region of the motor, but has lost its P-loop." Q#14227 - CGI_10012096 superfamily 193253 2580 2917 4.86E-39 152.498 cl15084 MT superfamily - - "Microtubule-binding stalk of dynein motor; the 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This family is the region between D4 and D5 and is the two predicted alpha-helical coiled coil segments that form the stalk supporting the ATP-sensitive microtubule binding component." 
Q#14227 - CGI_10012096 superfamily 247743 1605 1749 6.42E-06 47.6752 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#14229 - CGI_10012098 superfamily 245201 21 261 7.25E-78 240.885 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#14230 - CGI_10012099 superfamily 243051 317 465 1.66E-18 82.4257 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). 
In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal cord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#14230 - CGI_10012099 superfamily 245206 35 244 3.99E-60 200.197 cl09931 NADB_Rossmann superfamily C - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#14230 - CGI_10012099 superfamily 241583 247 299 4.76E-06 46.311 cl00064 ZnMc superfamily N - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#14233 - CGI_10012102 superfamily 216363 449 528 4.47E-13 65.9546 cl08312 UPF0029 superfamily C - Uncharacterized protein family UPF0029; Uncharacterized protein family UPF0029. 
Q#14234 - CGI_10002939 superfamily 247762 122 259 1.43E-27 104.636 cl17208 intradiol_dioxygenase superfamily - - "Intradiol dioxygenases catalyze the critical ring-cleavage step in the conversion of catecholate derivatives to citric acid cycle intermediates. This family contains catechol 1,2-dioxygenases and protocatechuate 3,4-dioxygenases which are mononuclear non-heme iron enzymes that catalyze the oxygenation of catecholates to aliphatic acids via the cleavage of aromatic rings. The members are intradiol-cleaving enzymes which break the catechol C1-C2 bond and utilize Fe3+, as opposed to the extradiol-cleaving enzymes which break the C2-C3 or C1-C6 bond and utilize Fe2+ and Mn+. Catechol 1,2-dioxygenases are mostly homodimers with one catalytic ferric ion per monomer. Protocatechuate 3,4-dioxygenases form more diverse oligomers." Q#14236 - CGI_10024133 superfamily 247743 195 353 1.10E-22 94.9055 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#14236 - CGI_10024133 superfamily 216502 398 561 5.71E-33 125.012 cl03209 Peptidase_M41 superfamily - - Peptidase family M41; Peptidase family M41. 
Q#14237 - CGI_10024134 superfamily 243035 66 105 3.27E-08 51.8518 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#14238 - CGI_10024135 superfamily 241546 553 672 4.47E-33 124.696 cl00011 PLAT superfamily - - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats. The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#14238 - CGI_10024135 superfamily 242146 710 838 0.00863963 37.3276 cl00859 Cytochrome_b_N superfamily C - "Cytochrome b (N-terminus)/b6/petB: Cytochrome b is a subunit of cytochrome bc1, an 11-subunit mitochondrial respiratory enzyme. Cytochrome b spans the mitochondrial membrane with 8 transmembrane helices (A-H) in eukaryotes. In plants and cyanobacteria, cytochrome b6 is analogous to eukaryote cytochrome b, containing two chains: helices A-D are encoded by the petB gene and helices E-H are encoded by the petD gene in these organisms. Cytochrome b/b6 contains two bound hemes and two ubiquinol/ubiquinone binding sites. The C-terminal portion of cytochrome b is described in a separate CD." 
Q#14240 - CGI_10024137 superfamily 241594 406 761 3.31E-164 481.294 cl00077 HECTc superfamily - - "HECT domain; C-terminal catalytic domain of a subclass of Ubiquitin-protein ligase (E3). It binds specific ubiquitin-conjugating enzymes (E2), accepts ubiquitin from E2, transfers ubiquitin to substrate lysine side chains, and transfers additional ubiquitin molecules to the end of growing ubiquitin chains." Q#14240 - CGI_10024137 superfamily 246669 12 135 6.21E-60 199.07 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." 
Q#14240 - CGI_10024137 superfamily 241647 164 193 2.51E-10 56.7674 cl00157 WW superfamily - - Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs. Q#14240 - CGI_10024137 superfamily 241647 316 346 4.37E-08 50.6042 cl00157 WW superfamily - - Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs. 
Q#14240 - CGI_10024137 superfamily 241647 268 297 2.55E-06 45.2114 cl00157 WW superfamily - - Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs. Q#14241 - CGI_10024138 superfamily 216239 79 417 0 528.415 cl18361 IRK superfamily - - Inward rectifier potassium channel; Inward rectifier potassium channel. Q#14243 - CGI_10024140 superfamily 243072 371 499 4.61E-20 85.8982 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#14244 - CGI_10024141 superfamily 246974 32 375 1.86E-122 363.839 cl15477 diphth2_R superfamily - - "diphthamide biosynthesis enzyme Dph1/Dph2 domain; Archaea and Eukaryotes, but not Eubacteria, share the property of having a covalently modified residue, 2'-[3-carboxamido-3-(trimethylammonio)propyl]histidine, as a part of a cytosolic protein. The modified His, termed diphthamide, is part of translation elongation factor EF-2 and is the site for ADP-ribosylation by diphtheria toxin. 
This model includes both Dph1 and Dph2 from Saccharomyces cerevisiae, although only Dph2 is found in the Archaea (see TIGR03682). Dph2 has been shown to act analogously to the radical SAM (rSAM) family (pfam04055), with 4Fe-4S-assisted cleavage of S-adenosylmethionine to create a free radical, but a different organic radical than in rSAM." Q#14246 - CGI_10024143 superfamily 245226 185 328 1.72E-62 202.33 cl10012 DnaQ_like_exo superfamily - - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." 
Q#14247 - CGI_10024144 superfamily 245226 34 224 5.82E-71 228.259 cl10012 DnaQ_like_exo superfamily - - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#14247 - CGI_10024144 superfamily 219199 486 530 6.27E-10 55.464 cl06070 zf-GRF superfamily - - GRF zinc finger; This presumed zinc binding domain is found in a variety of DNA-binding proteins. It seems likely that this domain is involved in nucleic acid binding. It is named GRF after three conserved residues in the centre of the alignment of the domain. This zinc finger may be related to pfam01396. 
Q#14248 - CGI_10024145 superfamily 243066 9 105 1.45E-23 95.4504 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#14248 - CGI_10024145 superfamily 198867 110 210 1.49E-19 84.1323 cl06652 BACK superfamily - - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation)." Q#14248 - CGI_10024145 superfamily 243146 342 385 3.27E-07 47.6562 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#14248 - CGI_10024145 superfamily 243146 390 433 0.000962759 37.2558 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#14249 - CGI_10024146 superfamily 248458 51 229 2.10E-17 82.7469 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. 
MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#14249 - CGI_10024146 superfamily 248458 471 646 5.46E-17 81.5913 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. 
Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#14250 - CGI_10024147 superfamily 241625 12 116 2.04E-08 47.9705 cl00123 PROF superfamily - - "Profilin binds actin monomers, membrane polyphosphoinositides such as PI(4,5)P2, and poly-L-proline. Profilin can inhibit actin polymerization into F-actin by binding to monomeric actin (G-actin) and terminal F-actin subunits, but - as a regulator of the cytoskeleton - it may also promote actin polymerization. It plays a role in the assembly of branched actin filament networks, by activating WASP via binding to WASP's proline rich domain. Profilin may link the cytoskeleton with major signalling pathways by interacting with components of the phosphatidylinositol cycle and Ras pathway." 
Q#14251 - CGI_10024148 superfamily 247736 2465 2508 3.26E-06 47.2705 cl17182 NAT_SF superfamily - - "N-Acyltransferase superfamily: Various enzymes that characteristically catalyze the transfer of an acyl group to a substrate; NAT (N-Acyltransferase) is a large superfamily of enzymes that mostly catalyze the transfer of an acyl group to a substrate and are implicated in a variety of functions, ranging from bacterial antibiotic resistance to circadian rhythms in mammals. Members include GCN5-related N-Acetyltransferases (GNAT) such as Aminoglycoside N-acetyltransferases, Histone N-acetyltransferase (HAT) enzymes, and Serotonin N-acetyltransferase, which catalyze the transfer of an acetyl group to a substrate. The kinetic mechanism of most GNATs involves the ordered formation of a ternary complex: the reaction begins with Acetyl Coenzyme A (AcCoA) binding, followed by binding of substrate, then direct transfer of the acetyl group from AcCoA to the substrate, followed by product and subsequent CoA release. Other family members include Arginine/ornithine N-succinyltransferase, Myristoyl-CoA: protein N-myristoyltransferase, and Acyl-homoserinelactone synthase which have a similar catalytic mechanism but differ in types of acyl groups transferred. Leucyl/phenylalanyl-tRNA-protein transferase and FemXAB nonribosomal peptidyltransferases which catalyze similar peptidyltransferase reactions are also included." Q#14251 - CGI_10024148 superfamily 149273 103 305 4.02E-80 265.705 cl18036 DOT1 superfamily - - Histone methylation protein DOT1; The DOT1 domain regulates gene expression by methylating histone H3. H3 methylation by DOT1 has been shown to be required for the DNA damage checkpoint in yeast. Q#14251 - CGI_10024148 superfamily 247999 1735 1779 0.00106952 39.5026 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. 
Several PHD fingers have been identified as binding modules of methylated histone H3. Q#14253 - CGI_10024150 superfamily 247724 10 160 1.44E-73 222.702 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#14254 - CGI_10024151 superfamily 243082 106 474 2.85E-123 365.115 cl02553 Peptidase_C19 superfamily - - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. 
The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." Q#14254 - CGI_10024151 superfamily 241645 8 74 3.63E-06 44.8288 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#14259 - CGI_10024156 superfamily 245201 65 319 1.63E-47 162.406 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." 
Q#14260 - CGI_10024157 superfamily 207716 60 92 1.59E-09 50.3334 cl02754 zf-LITAF-like superfamily N - "LITAF-like zinc ribbon domain; Members of this family display a conserved zinc ribbon structure with the motif C-XX-C- separated from the more C-terminal HX-C(P)X-C-X4-G-R motif by a variable region of usually 25-30 (hydrophobic) residues. Although it belongs to one of the zinc finger's fold groups (zinc ribbon), this particular domain was first identified in LPS-induced tumour necrosis alpha factor (LITAF) which is produced in mammalian cells after being challenged with lipopolysaccharide (LPS). The hydrophobic region probably inserts into the membrane rather than traversing it. Such an insertion brings together the N- and C-terminal C-XX-C motifs to form a compact Zn2+-binding structure." Q#14260 - CGI_10024157 superfamily 207716 94 127 4.77E-09 48.7927 cl02754 zf-LITAF-like superfamily N - "LITAF-like zinc ribbon domain; Members of this family display a conserved zinc ribbon structure with the motif C-XX-C- separated from the more C-terminal HX-C(P)X-C-X4-G-R motif by a variable region of usually 25-30 (hydrophobic) residues. Although it belongs to one of the zinc finger's fold groups (zinc ribbon), this particular domain was first identified in LPS-induced tumour necrosis alpha factor (LITAF) which is produced in mammalian cells after being challenged with lipopolysaccharide (LPS). The hydrophobic region probably inserts into the membrane rather than traversing it. Such an insertion brings together the N- and C-terminal C-XX-C motifs to form a compact Zn2+-binding structure." Q#14261 - CGI_10024158 superfamily 207716 54 127 1.81E-17 71.9046 cl02754 zf-LITAF-like superfamily - - "LITAF-like zinc ribbon domain; Members of this family display a conserved zinc ribbon structure with the motif C-XX-C- separated from the more C-terminal HX-C(P)X-C-X4-G-R motif by a variable region of usually 25-30 (hydrophobic) residues. 
Although it belongs to one of the zinc finger's fold groups (zinc ribbon), this particular domain was first identified in LPS-induced tumour necrosis alpha factor (LITAF) which is produced in mammalian cells after being challenged with lipopolysaccharide (LPS). The hydrophobic region probably inserts into the membrane rather than traversing it. Such an insertion brings together the N- and C-terminal C-XX-C motifs to form a compact Zn2+-binding structure." Q#14262 - CGI_10024159 superfamily 207716 45 118 5.55E-16 67.6674 cl02754 zf-LITAF-like superfamily - - "LITAF-like zinc ribbon domain; Members of this family display a conserved zinc ribbon structure with the motif C-XX-C- separated from the more C-terminal HX-C(P)X-C-X4-G-R motif by a variable region of usually 25-30 (hydrophobic) residues. Although it belongs to one of the zinc finger's fold groups (zinc ribbon), this particular domain was first identified in LPS-induced tumour necrosis alpha factor (LITAF) which is produced in mammalian cells after being challenged with lipopolysaccharide (LPS). The hydrophobic region probably inserts into the membrane rather than traversing it. Such an insertion brings together the N- and C-terminal C-XX-C motifs to form a compact Zn2+-binding structure." Q#14263 - CGI_10024160 superfamily 207716 4 72 3.75E-17 69.2082 cl02754 zf-LITAF-like superfamily - - "LITAF-like zinc ribbon domain; Members of this family display a conserved zinc ribbon structure with the motif C-XX-C- separated from the more C-terminal HX-C(P)X-C-X4-G-R motif by a variable region of usually 25-30 (hydrophobic) residues. Although it belongs to one of the zinc finger's fold groups (zinc ribbon), this particular domain was first identified in LPS-induced tumour necrosis alpha factor (LITAF) which is produced in mammalian cells after being challenged with lipopolysaccharide (LPS). The hydrophobic region probably inserts into the membrane rather than traversing it. 
Such an insertion brings together the N- and C-terminal C-XX-C motifs to form a compact Zn2+-binding structure." Q#14264 - CGI_10024161 superfamily 207716 57 120 5.34E-23 91.935 cl02754 zf-LITAF-like superfamily - - "LITAF-like zinc ribbon domain; Members of this family display a conserved zinc ribbon structure with the motif C-XX-C- separated from the more C-terminal HX-C(P)X-C-X4-G-R motif by a variable region of usually 25-30 (hydrophobic) residues. Although it belongs to one of the zinc finger's fold groups (zinc ribbon), this particular domain was first identified in LPS-induced tumour necrosis alpha factor (LITAF) which is produced in mammalian cells after being challenged with lipopolysaccharide (LPS). The hydrophobic region probably inserts into the membrane rather than traversing it. Such an insertion brings together the N- and C-terminal C-XX-C motifs to form a compact Zn2+-binding structure." Q#14264 - CGI_10024161 superfamily 215754 130 224 1.08E-22 91.546 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#14264 - CGI_10024161 superfamily 215754 236 332 8.69E-19 81.1456 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#14264 - CGI_10024161 superfamily 215754 342 426 1.30E-15 72.286 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#14265 - CGI_10024162 superfamily 207716 54 127 4.01E-23 86.9274 cl02754 zf-LITAF-like superfamily - - "LITAF-like zinc ribbon domain; Members of this family display a conserved zinc ribbon structure with the motif C-XX-C- separated from the more C-terminal HX-C(P)X-C-X4-G-R motif by a variable region of usually 25-30 (hydrophobic) residues. 
Although it belongs to one of the zinc finger's fold groups (zinc ribbon), this particular domain was first identified in LPS-induced tumour necrosis alpha factor (LITAF) which is produced in mammalian cells after being challenged with lipopolysaccharide (LPS). The hydrophobic region probably inserts into the membrane rather than traversing it. Such an insertion brings together the N- and C-terminal C-XX-C motifs to form a compact Zn2+-binding structure." Q#14266 - CGI_10024163 superfamily 207716 26 99 1.32E-09 50.3334 cl02754 zf-LITAF-like superfamily - - "LITAF-like zinc ribbon domain; Members of this family display a conserved zinc ribbon structure with the motif C-XX-C- separated from the more C-terminal HX-C(P)X-C-X4-G-R motif by a variable region of usually 25-30 (hydrophobic) residues. Although it belongs to one of the zinc finger's fold groups (zinc ribbon), this particular domain was first identified in LPS-induced tumour necrosis alpha factor (LITAF) which is produced in mammalian cells after being challenged with lipopolysaccharide (LPS). The hydrophobic region probably inserts into the membrane rather than traversing it. Such an insertion brings together the N- and C-terminal C-XX-C motifs to form a compact Zn2+-binding structure." Q#14269 - CGI_10024166 superfamily 238076 43 170 2.78E-82 255.422 cl18938 PAX superfamily - - Paired Box domain Q#14270 - CGI_10024167 superfamily 247736 85 158 1.54E-08 48.8042 cl17182 NAT_SF superfamily - - "N-Acyltransferase superfamily: Various enzymes that characteristically catalyze the transfer of an acyl group to a substrate; NAT (N-Acyltransferase) is a large superfamily of enzymes that mostly catalyze the transfer of an acyl group to a substrate and are implicated in a variety of functions, ranging from bacterial antibiotic resistance to circadian rhythms in mammals. 
Members include GCN5-related N-Acetyltransferases (GNAT) such as Aminoglycoside N-acetyltransferases, Histone N-acetyltransferase (HAT) enzymes, and Serotonin N-acetyltransferase, which catalyze the transfer of an acetyl group to a substrate. The kinetic mechanism of most GNATs involves the ordered formation of a ternary complex: the reaction begins with Acetyl Coenzyme A (AcCoA) binding, followed by binding of substrate, then direct transfer of the acetyl group from AcCoA to the substrate, followed by product and subsequent CoA release. Other family members include Arginine/ornithine N-succinyltransferase, Myristoyl-CoA: protein N-myristoyltransferase, and Acyl-homoserinelactone synthase which have a similar catalytic mechanism but differ in types of acyl groups transferred. Leucyl/phenylalanyl-tRNA-protein transferase and FemXAB nonribosomal peptidyltransferases which catalyze similar peptidyltransferase reactions are also included." Q#14273 - CGI_10024170 superfamily 192367 50 198 4.30E-73 234.422 cl10739 FPL superfamily - - Uncharacterized conserved protein; This entry represents an N-terminal region of approximately 150 residues of a family of proteins of unknown function. It contains a highly conserved FPL motif. 
Q#14276 - CGI_10024173 superfamily 243092 346 672 2.08E-29 118.206 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#14278 - CGI_10024175 superfamily 219619 89 141 1.40E-13 64.9215 cl18518 Ion_trans_2 superfamily N - Ion channel; This family includes the two membrane helix type ion channels found in bacteria. Q#14278 - CGI_10024175 superfamily 219619 177 256 4.29E-10 55.2915 cl18518 Ion_trans_2 superfamily - - Ion channel; This family includes the two membrane helix type ion channels found in bacteria. Q#14281 - CGI_10024178 superfamily 222451 125 217 4.51E-20 85.0728 cl16470 DUF4209 superfamily - - "Domain of unknown function (DUF4209); This short domain is found in bacteria and eukaryotes, though not in yeasts or Archaea. It carries a highly conserved RNxxxHG sequence motif." 
Q#14287 - CGI_10024184 superfamily 148282 29 210 4.73E-17 78.1026 cl05873 p31comet superfamily - - "Mad1 and Cdc20-bound-Mad2 binding; This family is involved in the cell-cycle surveillance mechanism called the spindle checkpoint. This mechanism monitors the proper bipolar attachment of sister chromatids to spindle microtubules and ensures the fidelity of chromosome segregation during mitosis. A key player in mitosis is Mad2, and Mad2 exhibits an unusual two-state behaviour. A Mad1-Mad2 core complex recruits cytosolic Mad2 to kinetochores through Mad2 dimerisation and converts Mad2 to a conformer amenable to Cdc20 binding. p31comet inactivates the checkpoint by binding to Mad1- or Cdc20-bound Mad2 in such a way as to stop Mad2 activation and to promote the dissociation of the Mad2-Cdc20 complex." Q#14289 - CGI_10024186 superfamily 247724 20 247 5.77E-104 327.651 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. 
Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#14289 - CGI_10024186 superfamily 243183 976 1054 5.20E-36 132.664 cl02785 Elongation_Factor_C superfamily - - "Elongation factor G C-terminus. This domain includes the carboxyl terminal regions of elongation factors (EFs) bacterial EF-G, eukaryotic and archeal EF-2 and eukaryotic mitochondrial mtEFG1s and mtEFG2s. This group also includes proteins similar to the ribosomal protection proteins Tet(M) and Tet(O), BipA, LepA and, spliceosomal proteins: human 116kD U5 small nuclear ribonucleoprotein (snRNP) protein (U5-116 kD) and yeast counterpart Snu114p. This domain adopts a ferredoxin-like fold consisting of an alpha-beta sandwich with anti-parallel beta-sheets, resembling the topology of domain III found in the elongation factors EF-G and eukaryotic EF-2, with which it forms the C-terminal block. The two domains however are not superimposable and domain III lacks some of the characteristics of this domain. EF-2/EF-G in complex with GTP, promotes the translocation step of translation. During translocation the peptidyl-tRNA is moved from the A site to the P site, the uncharged tRNA from the P site to the E-site and, the mRNA is shifted one codon relative to the ribosome. Tet(M) and Tet(O) mediate Tc resistance. Typical Tcs bind to the ribosome and inhibit the elongation phase of protein synthesis, by inhibiting the occupation of site A by aminoacyl-tRNA. Tet(M) and Tet(O) catalyze the release of tetracycline (Tc) from the ribosome in a GTP-dependent manner. BipA is a highly conserved protein with global regulatory properties in Escherichia coli. Yeast Snu114p is essential for cell viability and for splicing in vivo. Experiments suggest that GTP binding and probably GTP hydrolysis is important for the function of the U5-116 kD/Snu114p. The function of LepA proteins is unknown." 
Q#14289 - CGI_10024186 superfamily 243187 673 876 2.12E-35 134.235 cl02789 EFG_like_IV superfamily C - "Elongation Factor G-like domain IV. This family includes the translational elongation factor termed EF-2 (for Archaea and Eukarya) and EF-G (for Bacteria), ribosomal protection proteins that mediate tetracycline resistance and, an evolutionarily conserved U5 snRNP-specific protein (U5-116kD). In complex with GTP, EF-G/EF-2 promotes the translocation step of translation. During translocation the peptidyl-tRNA is moved from the A site to the P site of the small subunit of ribosome and the mRNA is shifted one codon relative to the ribosome. It has been shown that EF-G/EF-2_IV domain mimics the shape of anticodon arm of the tRNA in the structurally homologous ternary complex of Phe-tRNA, EF-Tu (another translational elongation factor) and GTP analog. The tip portion of this domain is found in a position that overlaps the anticodon arm of the A-site tRNA, implying that EF-G/EF-2 displaces the A-site tRNA to the P-site by physical interaction with the anticodon arm." Q#14289 - CGI_10024186 superfamily 243185 485 586 6.92E-26 104.182 cl02787 Translation_Factor_II_like superfamily - - "Translation_Factor_II_like: Elongation factor Tu (EF-Tu) domain II-like proteins. Elongation factor Tu consists of three structural domains, this family represents the second domain. Domain II adopts a beta barrel structure and is involved in binding to charged tRNA. Domain II is found in other proteins such as elongation factor G and translation initiation factor IF-2. This group also includes the C2 subdomain of domain IV of IF-2 that has the same fold as domain II of (EF-Tu). Like IF-2 from certain prokaryotes such as Thermus thermophilus, mitochondrial IF-2 lacks domain II, which is thought to be involved in binding of E.coli IF-2 to 30S subunits." Q#14289 - CGI_10024186 superfamily 243187 951 979 0.00129132 39.861 cl02789 EFG_like_IV superfamily N - "Elongation Factor G-like domain IV. 
This family includes the translational elongation factor termed EF-2 (for Archaea and Eukarya) and EF-G (for Bacteria), ribosomal protection proteins that mediate tetracycline resistance and, an evolutionarily conserved U5 snRNP-specific protein (U5-116kD). In complex with GTP, EF-G/EF-2 promotes the translocation step of translation. During translocation the peptidyl-tRNA is moved from the A site to the P site of the small subunit of ribosome and the mRNA is shifted one codon relative to the ribosome. It has been shown that EF-G/EF-2_IV domain mimics the shape of anticodon arm of the tRNA in the structurally homologous ternary complex of Phe-tRNA, EF-Tu (another translational elongation factor) and GTP analog. The tip portion of this domain is found in a position that overlaps the anticodon arm of the A-site tRNA, implying that EF-G/EF-2 displaces the A-site tRNA to the P-site by physical interaction with the anticodon arm." Q#14290 - CGI_10024187 superfamily 248030 245 772 7.29E-147 443.353 cl17476 Glyco_transf_7C superfamily - - "N-terminal domain of galactosyltransferase; This is the N-terminal domain of a family of galactosyltransferases from a wide range of Metazoa with three related galactosyltransferases activities, all three of which are possessed by one sequence in some cases. EC:2.4.1.90, N-acetyllactosamine synthase; EC:2.4.1.38, Beta-N-acetylglucosaminyl-glycopeptide beta-1,4- galactosyltransferase; and EC:2.4.1.22 Lactose synthase. Note that N-acetyllactosamine synthase is a component of Lactose synthase along with alpha-lactalbumin, in the absence of alpha-lactalbumin EC:2.4.1.90 is the catalyzed reaction." Q#14290 - CGI_10024187 superfamily 190308 93 275 9.12E-10 58.4843 cl18163 Fringe superfamily C - "Fringe-like; The drosophila protein fringe (FNG) is a glucosaminyltransferase that controls the response of the Notch receptor to specific ligands. FNG is localised to the Golgi apparatus (not secreted as previously thought). 
Modification of Notch occurs through glycosylation by FNG. The xenopus homologue, lunatic fringe, has been implicated in a variety of functions." Q#14291 - CGI_10024188 superfamily 243078 2 114 2.42E-20 88.4233 cl02544 VHS_ENTH_ANTH superfamily - - "VHS, ENTH and ANTH domain superfamily; composed of proteins containing a VHS, ENTH or ANTH domain. The VHS domain is present in Vps27 (Vacuolar Protein Sorting), Hrs (Hepatocyte growth factor-regulated tyrosine kinase substrate) and STAM (Signal Transducing Adaptor Molecule). It is located at the N-termini of proteins involved in intracellular membrane trafficking. The epsin N-terminal homology (ENTH) domain is an evolutionarily conserved protein module found primarily in proteins that participate in clathrin-mediated endocytosis. A set of proteins previously designated as harboring an ENTH domain in fact contains a highly similar, yet unique module referred to as an AP180 N-terminal homology (ANTH) domain. VHS, ENTH and ANTH domains are structurally similar and are composed of a superhelix of eight alpha helices. ENTH and ANTH (E/ANTH) domains bind both inositol phospholipids and proteins and contribute to the nucleation and formation of clathrin coats on membranes. ENTH domains also function in the development of membrane curvature through lipid remodeling during the formation of clathrin-coated vesicles. E/ANTH domain-bearing proteins have recently been shown to function with adaptor protein-1 and GGA adaptors at the trans-Golgi network, which suggests that E/ANTH domains are universal components of the machinery for clathrin-mediated membrane budding." Q#14292 - CGI_10024189 superfamily 248458 144 187 0.00103481 37.6785 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. 
MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#14293 - CGI_10024190 superfamily 248458 129 239 1.24E-07 52.3161 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. 
Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#14293 - CGI_10024190 superfamily 248458 380 464 0.000245251 41.9157 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. 
The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#14296 - CGI_10024193 superfamily 248458 93 218 2.44E-09 57.7089 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. 
Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#14296 - CGI_10024193 superfamily 248458 315 479 0.00118621 39.6045 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. 
Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#14297 - CGI_10024194 superfamily 241607 165 197 7.38E-07 45.3386 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chymotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. 
The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters), which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#14297 - CGI_10024194 superfamily 241607 88 123 1.97E-06 43.7978 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chymotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters), which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." 
Q#14297 - CGI_10024194 superfamily 241607 235 274 0.00689932 33.7826 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chymotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters), which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." 
Q#14300 - CGI_10024197 superfamily 243175 2 41 5.63E-06 38.7878 cl02776 GST_C_family superfamily N - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." 
Q#14305 - CGI_10015492 superfamily 247684 34 397 0 746.057 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#14306 - CGI_10015493 superfamily 218246 7 137 5.51E-43 140.932 cl12301 OST3_OST6 superfamily - - "OST3 / OST6 family; The proteins in this family are part of a complex of eight ER proteins that transfers core oligosaccharide from dolichol carrier to Asn-X-Ser/Thr motifs. This family includes both OST3 and OST6, each of which contains four predicted transmembrane helices. Disruption of OST3 and OST6 leads to a defect in the assembly of the complex. 
Hence, the function of these genes seems to be essential for recruiting a fully active complex necessary for efficient N-glycosylation." Q#14308 - CGI_10015495 superfamily 214531 838 877 2.16E-07 49.1373 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#14308 - CGI_10015495 superfamily 214531 211 252 5.23E-05 42.2037 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#14308 - CGI_10015495 superfamily 214531 790 834 0.000331784 39.8925 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#14308 - CGI_10015495 superfamily 241578 314 356 0.00304586 39.2904 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. 
There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#14308 - CGI_10015495 superfamily 214531 134 155 0.00305955 37.1961 cl18310 LY superfamily NC - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#14313 - CGI_10015500 superfamily 243035 5 115 0.00194385 36.0586 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. 
Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#14314 - CGI_10015501 superfamily 244881 962 1261 2.39E-131 411.201 cl08267 ISOPREN_C2_like superfamily - - "This group contains class II terpene cyclases, protein prenyltransferases beta subunit, two broadly specific proteinase inhibitors alpha2-macroglobulin (alpha (2)-M) and pregnancy zone protein (PZP) and, the C3 C4 and C5 components of vertebrate complement. Class II terpene cyclases include squalene cyclase (SQCY) and 2,3-oxidosqualene cyclase (OSQCY), these integral membrane proteins catalyze a cationic cyclization cascade converting linear triterpenes to fused ring compounds. 
The protein prenyltransferases include protein farnesyltransferase (FTase) and geranylgeranyltransferase types I and II (GGTase-I and GGTase-II) which catalyze the carboxyl-terminal lipidation of Ras, Rab, and several other cellular signal transduction proteins, facilitating membrane associations and specific protein-protein interactions. Alpha (2)-M is a major carrier protein in serum and involved in the immobilization and entrapment of proteases. PZP is a pregnancy associated protein. Alpha (2)-M and PZP are known to bind to and, may modulate, the activity of placental protein-14 in T-cell growth and cytokine production thereby protecting the allogeneic fetus from attack by the maternal immune system." Q#14314 - CGI_10015501 superfamily 215788 746 837 7.19E-35 130.377 cl08251 A2M superfamily - - Alpha-2-macroglobulin family; This family includes the C-terminal region of the alpha-2-macroglobulin family. Q#14314 - CGI_10015501 superfamily 241977 1448 1554 4.38E-27 108.675 cl00607 PUA superfamily - - "PUA domain; The PUA domain named after Pseudouridine synthase and Archaeosine transglycosylase, was detected in archaeal and eukaryotic pseudouridine synthases, archaeal archaeosine synthases, a family of predicted ATPases that may be involved in RNA modification, a family of predicted archaeal and bacterial rRNA methylases. Additionally, the PUA domain was detected in a family of eukaryotic proteins that also contain a domain homologous to the translation initiation factor eIF1/SUI1; these proteins may comprise a novel type of translation factors. Unexpectedly, the PUA domain was detected also in bacterial and yeast glutamate kinases; this is compatible with the demonstrated role of these enzymes in the regulation of the expression of other genes. It is predicted that the PUA domain is an RNA binding domain." 
Q#14314 - CGI_10015501 superfamily 203720 1364 1441 1.21E-24 101.086 cl08457 A2M_recep superfamily - - A-macroglobulin receptor; This family includes the receptor domain region of the alpha-2-macroglobulin family. Q#14314 - CGI_10015501 superfamily 216731 57 148 6.46E-16 75.7583 cl12258 A2M_N superfamily - - MG2 domain; This is the MG2 (macroglobulin) domain of alpha-2-macroglobulin. Q#14316 - CGI_10015503 superfamily 243072 28 143 3.12E-34 124.803 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#14316 - CGI_10015503 superfamily 245201 204 469 3.88E-36 134.586 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#14317 - CGI_10015504 superfamily 241677 7 173 2.28E-93 278.757 cl00197 cyclophilin superfamily - - "cyclophilin: cyclophilin-type peptidylprolyl cis- trans isomerases. This family contains eukaryotic, bacterial and archeal proteins which exhibit a peptidylprolyl cis- trans isomerases activity (PPIase, Rotamase) and in addition bind the immunosuppressive drug cyclosporin (CsA). 
Immunosuppression in vertebrates is believed to be the result of the cyclophilin A-cyclosporin protein drug complex binding to and inhibiting the protein-phosphatase calcineurin. PPIase is an enzyme which accelerates protein folding by catalyzing the cis-trans isomerization of the peptide bonds preceding proline residues. Cyclophilins are a diverse family in terms of function and have been implicated in protein folding processes which depend on catalytic /chaperone-like activities. This group contains human cyclophilin 40, a co-chaperone of the hsp90 chaperone system; human cyclophilin A, a chaperone in the HIV-1 infectious process and; human cyclophilin H, a component of the U4/U6 snRNP, whose isomerization or chaperoning activities may play a role in RNA splicing." Q#14317 - CGI_10015504 superfamily 243034 272 336 6.84E-12 61.242 cl02429 TPR superfamily N - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for 
mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#14318 - CGI_10015505 superfamily 241644 6 106 2.94E-42 139.26 cl00154 UBCc superfamily N - "Ubiquitin-conjugating enzyme E2, catalytic (UBCc) domain. This is part of the ubiquitin-mediated protein degradation pathway in which a thiol-ester linkage forms between a conserved cysteine and the C-terminus of ubiquitin and complexes with ubiquitin protein ligase enzymes, E3. This pathway regulates many fundamental cellular processes. There are also other E2s which form thiol-ester linkages without the use of E3s as well as several UBC homologs (TSG101, Mms2, Croc-1 and similar proteins) which lack the active site cysteine essential for ubiquitination and appear to function in DNA repair pathways which were omitted from the scope of this CD." Q#14319 - CGI_10015506 superfamily 219635 403 588 7.48E-86 267.975 cl06790 Peptidase_C78 superfamily - - Peptidase family C78; This family formerly known as DUF1671 has been shown to be a cysteine peptidase called (Ufm1)-specific protease. Q#14320 - CGI_10015507 superfamily 247683 138 191 1.48E-25 100.811 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. 
Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#14320 - CGI_10015507 superfamily 245201 250 507 5.95E-128 384.107 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#14320 - CGI_10015507 superfamily 247725 5 112 2.31E-29 114.249 cl17171 PH-like superfamily C - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#14320 - CGI_10015507 superfamily 247725 510 655 1.84E-21 91.9075 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. 
This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#14320 - CGI_10015507 superfamily 246908 195 221 0.000140936 40.8465 cl15255 SH2 superfamily C - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." Q#14321 - CGI_10015508 superfamily 215754 8 97 7.97E-19 79.6048 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#14321 - CGI_10015508 superfamily 215754 119 193 1.30E-18 78.8344 cl02813 Mito_carr superfamily N - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#14321 - CGI_10015508 superfamily 215754 213 287 1.64E-15 70.36 cl02813 Mito_carr superfamily N - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#14322 - CGI_10015509 superfamily 218839 7 165 6.57E-53 169.802 cl05501 Med7 superfamily - - MED7 protein; This family consists of several eukaryotic proteins which are homologues of the yeast MED7 protein. Activation of gene transcription in metazoans is a multi-step process that is triggered by factors that recognise transcriptional enhancer sites in DNA. These factors work with co-activators such as MED7 to direct transcriptional initiation by the RNA polymerase II apparatus. 
Q#14324 - CGI_10015511 superfamily 248264 156 302 1.37E-20 85.7517 cl17710 DDE_4 superfamily - - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#14327 - CGI_10015514 superfamily 217381 3 96 2.70E-38 127.298 cl15956 TB2_DP1_HVA22 superfamily - - "TB2/DP1, HVA22 family; This family includes members from a wide variety of eukaryotes. It includes the TB2/DP1 (deleted in polyposis) protein, which in humans is deleted in severe forms of familial adenomatous polyposis, an autosomal dominant oncological inherited disease. The family also includes the plant protein of known similarity to TB2/DP1, the HVA22 abscisic acid-induced protein, which is thought to be a regulatory protein." Q#14329 - CGI_10015516 superfamily 248264 320 405 6.01E-06 44.9206 cl17710 DDE_4 superfamily C - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#14334 - CGI_10003635 superfamily 241594 711 1057 9.66E-144 437.382 cl00077 HECTc superfamily - - "HECT domain; C-terminal catalytic domain of a subclass of Ubiquitin-protein ligase (E3). 
It binds specific ubiquitin-conjugating enzymes (E2), accepts ubiquitin from E2, transfers ubiquitin to substrate lysine side chains, and transfers additional ubiquitin molecules to the end of growing ubiquitin chains." Q#14334 - CGI_10003635 superfamily 201217 165 215 3.32E-15 71.788 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#14334 - CGI_10003635 superfamily 201217 112 162 1.72E-14 69.862 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#14334 - CGI_10003635 superfamily 201217 221 267 3.63E-14 69.0916 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#14334 - CGI_10003635 superfamily 201217 283 318 1.15E-07 50.2168 cl08266 RCC1 superfamily N - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#14334 - CGI_10003635 superfamily 201217 4 52 1.41E-07 49.8316 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#14334 - CGI_10003635 superfamily 201217 55 108 1.94E-07 49.4464 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#14336 - CGI_10003637 superfamily 245213 595 625 4.00E-06 46.0906 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. 
Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14336 - CGI_10003637 superfamily 245213 511 545 0.000179006 41.083 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14336 - CGI_10003637 superfamily 245213 429 463 0.000315424 40.3126 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14336 - CGI_10003637 superfamily 215647 1275 1469 6.13E-37 141.592 cl18338 7tm_2 superfamily - - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GPCRs). They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. 
Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97 ; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#14336 - CGI_10003637 superfamily 243060 686 782 9.46E-14 70.1004 cl02507 SEA superfamily - - "SEA domain; Domain found in Sea urchin sperm protein, Enterokinase, Agrin (SEA). Proposed function of regulating or binding carbohydrate side chains. Recently a proteolytic activity has been shown for a SEA domain." Q#14336 - CGI_10003637 superfamily 245213 471 505 1.33E-07 50.3232 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14336 - CGI_10003637 superfamily 221370 1072 1235 2.46E-06 48.9069 cl13441 DUF3497 superfamily - - "Domain of unknown function (DUF3497); This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 213 to 257 amino acids in length. 
This domain is found associated with pfam02793, pfam00002, pfam01825. This domain has a single completely conserved residue W that may be functionally important." Q#14336 - CGI_10003637 superfamily 245213 553 593 7.42E-05 42.336 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14336 - CGI_10003637 superfamily 245213 633 671 0.000784209 39.1524 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14336 - CGI_10003637 superfamily 245814 910 978 0.000846129 39.701 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#14336 - CGI_10003637 superfamily 221695 409 432 0.00682975 36.279 cl18612 cEGF superfamily - - "Complement Clr-like EGF-like; cEGF, or complement Clr-like EGF, domains have six conserved cysteine residues disulfide-bonded into the characteristic pattern 'ababcc'. They are found in blood coagulation proteins such as fibrillin, Clr and Cls, thrombomodulin, and the LDL receptor. The core fold of the EGF domain consists of two small beta-hairpins packed against each other. Two major structural variants have been identified based on the structural context of the C-terminal cysteine residue of disulfide 'c' in the C-terminal hairpin: hEGFs and cEGFs. In cEGFs the C-terminal thiol resides on the C-terminal beta-sheet, resulting in long loop-lengths between the cysteine residues of disulfide 'c', typically C[10+]XC. These longer loop-lengths may have arisen by selective cysteine loss from a four-disulfide EGF template such as laminin or integrin. Tandem cEGF domains have five linking residues between terminal cysteines of adjacent domains. cEGF domains may or may not bind calcium in the linker region. cEGF domains with the consensus motif CXN4X[F,Y]XCXC are hydroxylated exclusively on the asparagine residue." 
Q#14337 - CGI_10003638 superfamily 243175 156 251 1.96E-23 91.8085 cl02776 GST_C_family superfamily - - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." 
Q#14338 - CGI_10001332 superfamily 241874 31 70 4.37E-26 98.9152 cl00456 SLC5-6-like_sbd superfamily C - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#14339 - CGI_10004292 superfamily 246902 22 167 4.27E-09 54.6075 cl15239 PLDc_SF superfamily - - "Catalytic domain of phospholipase D superfamily proteins; Catalytic domain of phospholipase D (PLD) superfamily proteins. The PLD superfamily is composed of a large and diverse group of proteins including plant, mammalian and bacterial PLDs, bacterial cardiolipin (CL) synthases, bacterial phosphatidylserine synthases (PSS), eukaryotic phosphatidylglycerophosphate (PGP) synthase, eukaryotic tyrosyl-DNA phosphodiesterase 1 (Tdp1), and some bacterial endonucleases (Nuc and BfiI), among others. PLD enzymes hydrolyze phospholipid phosphodiester bonds to yield phosphatidic acid and a free polar head group. They can also catalyze the transphosphatidylation of phospholipids to acceptor alcohols. 
The majority of members in this superfamily contain a short conserved sequence motif (H-x-K-x(4)-D, where x represents any amino acid residue), called the HKD signature motif. There are varying expanded forms of this motif in different family members. Some members contain variant HKD motifs. Most PLD enzymes are monomeric proteins with two HKD motif-containing domains. Two HKD motifs from two domains form a single active site. Some PLD enzymes have only one copy of the HKD motif per subunit but form a functionally active dimer, which has a single active site at the dimer interface containing the two HKD motifs from both subunits. Different PLD enzymes may have evolved through domain fusion of a common catalytic core with separate substrate recognition domains. Despite their various catalytic functions and a very broad range of substrate specificities, the diverse group of PLD enzymes can bind to a phosphodiester moiety. Most of them are active as bi-lobed monomers or dimers, and may possess similar core structures for catalytic activity. They are generally thought to utilize a common two-step ping-pong catalytic mechanism, involving an enzyme-substrate intermediate, to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group." Q#14340 - CGI_10004293 superfamily 247727 1 340 6.28E-144 412.244 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). 
There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#14342 - CGI_10004295 superfamily 155088 148 301 6.29E-27 103.299 cl02758 AMOP superfamily - - AMOP domain; This domain may have a role in cell adhesion. It is called the AMOP domain after Adhesion associated domain in MUC4 and Other Proteins. This domain is extracellular and contains a number of cysteines that probably form disulphide bridges. Q#14342 - CGI_10004295 superfamily 246918 74 113 1.27E-09 53.1782 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#14343 - CGI_10004296 superfamily 241913 34 170 4.34E-43 141.528 cl00509 hot_dog superfamily - - "The hotdog fold was initially identified in the E. coli FabA (beta-hydroxydecanoyl-acyl carrier protein (ACP)-dehydratase) structure and subsequently in 4HBT (4-hydroxybenzoyl-CoA thioesterase) from Pseudomonas. A number of other seemingly unrelated proteins also share the hotdog fold. These proteins have related, but distinct, catalytic activities that include metabolic roles such as thioester hydrolysis in fatty acid metabolism, and degradation of phenylacetic acid and the environmental pollutant 4-chlorobenzoate. This superfamily also includes the PaaI-like protein FapR, a non-catalytic bacterial homolog involved in transcriptional regulation of fatty acid biosynthesis." Q#14346 - CGI_10004299 superfamily 245206 45 301 1.61E-93 281.012 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. 
NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#14347 - CGI_10004300 superfamily 203458 1460 1489 2.59E-06 46.3613 cl05789 Hyd_WA superfamily - - Propeller; Probable beta-propeller. Q#14347 - CGI_10004300 superfamily 203458 1417 1446 1.58E-05 44.0501 cl05789 Hyd_WA superfamily - - Propeller; Probable beta-propeller. Q#14347 - CGI_10004300 superfamily 203458 1315 1342 4.33E-05 42.8945 cl05789 Hyd_WA superfamily - - Propeller; Probable beta-propeller. Q#14347 - CGI_10004300 superfamily 214782 1483 1516 9.50E-05 41.7138 cl02749 TECPR superfamily - - "Beta propeller repeats in Physarum polycephalum tectonins, Limulus lectin L-6 and animal hypothetical proteins; Beta propeller repeats in Physarum polycephalum tectonins, Limulus lectin L-6 and animal hypothetical proteins. " Q#14347 - CGI_10004300 superfamily 203458 1149 1181 0.00851603 35.9609 cl05789 Hyd_WA superfamily - - Propeller; Probable beta-propeller. Q#14349 - CGI_10004302 superfamily 244681 1 57 8.29E-13 59.3068 cl07291 RNase_H2-C superfamily N - "Ribonuclease H2-C is a subunit of the eukaryotic RNase H complex which cleaves RNA-DNA hybrids; Ribonuclease H2C is one of the three protein of eukaryotic RNase H2 complex that is required for nucleic acid binding and hydrolysis. 
RNase H is classified into two families, type I (prokaryotic RNase HI, eukaryotic RNase H1 and viral RNase H) and type II (prokaryotic RNase HII and HIII, and eukaryotic RNase H2/HII). RNase H endonucleolytically hydrolyzes an RNA strand when it is annealed to a complementary DNA strand in the presence of divalent cations, in DNA replication and repair. The enzyme can be found in bacteria, archaea, and eukaryotes. Most prokaryotic and eukaryotic genomes contain multiple RNase H genes. Despite a lack of evidence for homology from sequence comparisons, type I and type II RNase H share a common fold and similar steric configurations of the four acidic active-site residues, suggesting identical or very similar catalytic mechanisms. Eukaryotic RNase HII is active during replication and is believed to play a role in removal of Okazaki fragment primers and single ribonucleotides in DNA-DNA duplexes. Eukaryotic RNase HII is functional when it forms a complex with RNase H2B and RNase H2C proteins. It is speculated that the two accessory subunits are required for correct folding of the catalytic subunit of RNase HII. Mutations in the three subunits of human RNase HII cause neurological disorder." Q#14350 - CGI_10005997 superfamily 199166 82 196 1.15E-08 51.9444 cl15308 AMN1 superfamily NC - "Antagonist of mitotic exit network protein 1; Amn1 has been functionally characterized in Saccharomyces cerevisiae as a component of the Antagonist of MEN pathway (AMEN). The AMEN network is activated by MEN (mitotic exit network) via an active Cdc14, and in turn switches off MEN. Amn1 constitutes one of the alternative mechanisms by which MEN may be disrupted. Specifically, Amn1 binds Tem1 (Termination of M-phase, a GTPase that belongs to the RAS superfamily), and disrupts its association with Cdc15, the primary downstream target. Amn1 is a leucine-rich repeat (LRR) protein, with 12 repeats in the S. cerevisiae ortholog. 
As a negative regulator of the signal transduction pathway MEN, overexpression of AMN1 slows the growth of wild type cells. The function of the vertebrate members of this family has not been determined experimentally, they have fewer LRRs that determine the extent of this model." Q#14353 - CGI_10006000 superfamily 111646 268 404 2.94E-80 245.779 cl03707 S-AdoMet_synt_C superfamily - - "S-adenosylmethionine synthetase, C-terminal domain; The three domains of S-adenosylmethionine synthetase have the same alpha+beta fold." Q#14353 - CGI_10006000 superfamily 217221 144 266 7.20E-69 215.748 cl03706 S-AdoMet_synt_M superfamily - - "S-adenosylmethionine synthetase, central domain; The three domains of S-adenosylmethionine synthetase have the same alpha+beta fold." Q#14353 - CGI_10006000 superfamily 201226 32 131 1.81E-60 193.455 cl02868 S-AdoMet_synt_N superfamily - - "S-adenosylmethionine synthetase, N-terminal domain; The three domains of S-adenosylmethionine synthetase have the same alpha+beta fold." Q#14356 - CGI_10006003 superfamily 247805 823 1028 2.59E-68 229.294 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 
Q#14356 - CGI_10006003 superfamily 247905 1040 1174 1.66E-38 141.606 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#14357 - CGI_10006004 superfamily 216191 69 148 1.38E-26 103.734 cl03017 UPF0004 superfamily - - Uncharacterized protein family UPF0004; This family is the N terminal half of the Prosite family. The C-terminal half has been shown to be related to MiaB proteins. This domain is nearly always found in conjunction with pfam04055 and pfam01938 although its function is uncertain. Q#14358 - CGI_10006005 superfamily 243072 13 148 3.06E-21 88.5946 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#14359 - CGI_10006006 superfamily 241580 112 186 1.37E-31 115.344 cl00061 FH superfamily - - "Forkhead (FH), also known as a "winged helix". FH is named for the Drosophila fork head protein, a transcription factor which promotes terminal rather than segmental development. 
This family of transcription factor domains, which bind to B-DNA as monomers, are also found in the Hepatocyte nuclear factor (HNF) proteins, which provide tissue-specific gene regulation. The structure contains 2 flexible loops or "wings" in the C-terminal region, hence the term winged helix." Q#14360 - CGI_10006007 superfamily 247724 19 139 3.39E-76 226.455 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#14361 - CGI_10006008 superfamily 241868 19 111 2.52E-28 102.308 cl00447 Nudix_Hydrolase superfamily - - "Nudix hydrolase is a superfamily of enzymes found in all three kingdoms of life, and it catalyzes the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. 
Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+ for their activity. Members of this family are recognized by a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which forms a structural motif that functions as a metal binding and catalytic site. Substrates of nudix hydrolase include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance and "house-cleaning" enzymes. Substrate specificity is used to define child families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. This superfamily consists of at least nine families: IPP (isopentenyl diphosphate) isomerase, ADP ribose pyrophosphatase, mutT pyrophosphohydrolase, coenzyme-A pyrophosphatase, MTH1-7,8-dihydro-8-oxoguanine-triphosphatase, diadenosine tetraphosphate hydrolase, NADH pyrophosphatase, GDP-mannose hydrolase and the c-terminal portion of the mutY adenine glycosylase." Q#14362 - CGI_10003156 superfamily 247692 31 631 0 952.139 cl17068 AFD_class_I superfamily - - "Adenylate forming domain, Class I; This family includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. 
The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain." Q#14363 - CGI_10003157 superfamily 241698 17 171 3.61E-58 182.409 cl00220 cysteine_hydrolases superfamily - - "Cysteine hydrolases; This family contains amidohydrolases, like CSHase (N-carbamoylsarcosine amidohydrolase), involved in creatine metabolism and nicotinamidase, converting nicotinamide to nicotinic acid and ammonia in the pyridine nucleotide cycle. It also contains isochorismatase, an enzyme that catalyzes the conversion of isochorismate to 2,3-dihydroxybenzoate and pyruvate, via the hydrolysis of the vinyl ether bond, and other related enzymes with unknown function." Q#14364 - CGI_10003158 superfamily 241799 566 745 2.30E-44 161.191 cl00339 SugarP_isomerase superfamily - - "SugarP_isomerase: Sugar Phosphate Isomerase family; includes type A ribose 5-phosphate isomerase (RPI_A), glucosamine-6-phosphate (GlcN6P) deaminase, and 6-phosphogluconolactonase (6PGL). RPI catalyzes the reversible conversion of ribose-5-phosphate to ribulose 5-phosphate, the first step of the non-oxidative branch of the pentose phosphate pathway. GlcN6P deaminase catalyzes the reversible conversion of GlcN6P to D-fructose-6-phosphate (Fru6P) and ammonium, the last step of the metabolic pathway of N-acetyl-D-glucosamine-6-phosphate. 6PGL converts 6-phosphoglucono-1,5-lactone to 6-phosphogluconate, the second step of the oxidative phase of the pentose phosphate pathway." Q#14364 - CGI_10003158 superfamily 217228 224 506 3.21E-41 154.435 cl07843 G6PD_C superfamily - - "Glucose-6-phosphate dehydrogenase, C-terminal domain; Glucose-6-phosphate dehydrogenase, C-terminal domain. 
" Q#14364 - CGI_10003158 superfamily 215937 39 221 2.30E-40 148.419 cl02877 G6PD_N superfamily - - "Glucose-6-phosphate dehydrogenase, NAD binding domain; Glucose-6-phosphate dehydrogenase, NAD binding domain. " Q#14364 - CGI_10003158 superfamily 241600 834 1036 2.40E-35 135.061 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#14367 - CGI_10022713 superfamily 243058 281 389 4.23E-16 76.1991 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#14367 - CGI_10022713 superfamily 243058 396 515 1.55E-13 68.8803 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. 
An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#14367 - CGI_10022713 superfamily 243072 182 269 2.18E-08 53.5414 cl02529 ANK superfamily N - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#14367 - CGI_10022713 superfamily 243058 775 870 0.000370755 39.9904 cl02500 ARM superfamily C - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." 
Q#14367 - CGI_10022713 superfamily 202819 46 85 1.50E-14 70.0226 cl04335 Ribosomal_L23eN superfamily N - "Ribosomal protein L23, N-terminal domain; The N-terminal domain appears to be specific to the eukaryotic ribosomal proteins L25, L23, and L23a." Q#14367 - CGI_10022713 superfamily 243058 479 589 0.00346359 37.294 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#14368 - CGI_10022714 superfamily 241563 480 521 9.10E-05 40.9256 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#14368 - CGI_10022714 superfamily 241563 55 97 0.000569126 38.6144 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#14368 - CGI_10022714 superfamily 241563 427 472 0.00336131 36.3032 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#14369 - CGI_10022715 superfamily 241603 52 271 5.42E-70 219.164 cl00089 NUC superfamily - - DNA/RNA non-specific endonuclease; prokaryotic and eukaryotic double- and single-stranded DNA and RNA endonucleases also present in phosphodiesterases. They exist as monomers and homodimers. Q#14370 - CGI_10022716 superfamily 241563 60 99 9.56E-06 43.0447 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#14372 - CGI_10022718 superfamily 245835 43 142 0.00324938 37.9739 cl12013 BAR superfamily NC - "The Bin/Amphiphysin/Rvs (BAR) domain, a dimerization module that binds membranes and detects membrane curvature; BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions including organelle biogenesis, membrane trafficking or remodeling, and cell division and migration. Mutations in BAR containing proteins have been linked to diseases and their inactivation in cells leads to altered membrane dynamics. A BAR domain with an additional N-terminal amphipathic helix (an N-BAR) can drive membrane curvature. These N-BAR domains are found in amphiphysins and endophilins, among others. BAR domains are also frequently found alongside domains that determine lipid specificity, such as the Pleckstrin Homology (PH) and Phox Homology (PX) domains which are present in beta centaurins (ACAPs and ASAPs) and sorting nexins, respectively. 
A FES-CIP4 Homology (FCH) domain together with a coiled coil region is called the F-BAR domain and is present in Pombe/Cdc15 homology (PCH) family proteins, which include Fes/Fer tyrosine kinases, PACSIN or syndapin, CIP4-like proteins, and srGAPs, among others. The Inverse (I)-BAR or IRSp53/MIM homology Domain (IMD) is found in multi-domain proteins, such as IRSp53 and MIM, that act as scaffolding proteins and transducers of a variety of signaling pathways that link membrane dynamics and the underlying actin cytoskeleton. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The I-BAR domain induces membrane protrusions in the opposite direction compared to classical BAR and F-BAR domains, which produce membrane invaginations. BAR domains that also serve as protein interaction domains include those of arfaptin and OPHN1-like proteins, among others, which bind to Rac and Rho GAP domains, respectively." Q#14375 - CGI_10022721 superfamily 241546 2190 2309 1.44E-52 184.401 cl00011 PLAT superfamily - - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats. The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#14375 - CGI_10022721 superfamily 241546 4055 4174 5.57E-52 182.476 cl00011 PLAT superfamily - - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. 
The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats. The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#14375 - CGI_10022721 superfamily 248011 737 779 0.000122567 43.6394 cl17457 PKD superfamily N - "polycystic kidney disease I (PKD) domain; similar to other cell-surface modules, with an IG-like fold; domain probably functions as a ligand binding site in protein-protein or protein-carbohydrate interactions; a single instance of the repeat is presented here. The domain is also found in microbial collagenases and chitinases." Q#14375 - CGI_10022721 superfamily 243093 3279 3370 1.09E-15 76.7401 cl02568 WSC superfamily - - WSC domain; This domain may be involved in carbohydrate binding. Q#14375 - CGI_10022721 superfamily 243086 2080 2122 1.08E-08 55.0738 cl02559 GPS superfamily - - "Latrophilin/CL-1-like GPS domain; Domain present in latrophilin/CL-1, sea urchin REJ and polycystin." Q#14375 - CGI_10022721 superfamily 248011 592 683 0.00298018 39.3566 cl17457 PKD superfamily - - "polycystic kidney disease I (PKD) domain; similar to other cell-surface modules, with an IG-like fold; domain probably functions as a ligand binding site in protein-protein or protein-carbohydrate interactions; a single instance of the repeat is presented here. The domain is also found in microbial collagenases and chitinases." Q#14376 - CGI_10022722 superfamily 248011 870 918 1.86E-05 43.6394 cl17457 PKD superfamily N - "polycystic kidney disease I (PKD) domain; similar to other cell-surface modules, with an IG-like fold; domain probably functions as a ligand binding site in protein-protein or protein-carbohydrate interactions; a single instance of the repeat is presented here. The domain is also found in microbial collagenases and chitinases." 
Q#14376 - CGI_10022722 superfamily 243093 339 417 1.56E-13 67.5554 cl02568 WSC superfamily - - WSC domain; This domain may be involved in carbohydrate binding. Q#14376 - CGI_10022722 superfamily 243100 56 112 2.10E-07 49.2448 cl02576 B_zip1 superfamily - - "basic leucine zipper DNA-binding and multimerization region of GCN4 and related proteins; Basic leucine zipper (bZIP) transcription factors act in networks of homo- and hetero-dimers in the regulation in a diverse set of cellular pathways. Classical leucine zippers have alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization creates a pair of basic regions that bind DNA and undergo conformational change. GCN4 was identified in Saccharomyces cerevisiae from mutations in a deficiency in activation with the general amino acid control pathway. GCN4 encodes a trans-activator of amino acid biosynthetic genes containing 2 acidic activation domains and a C-terminal bZIP domain, comprised of a basic alpha-helical DNA-binding region and a coiled-coil dimerization region." Q#14377 - CGI_10022723 superfamily 204418 3 124 2.28E-59 187.733 cl10915 DUF2246 superfamily N - Uncharacterized conserved protein (DUF2246); This is a family of proteins conserved from worms to humans of approximately 300 residues. The function is unknown. Q#14378 - CGI_10022724 superfamily 245205 315 382 4.25E-07 48.3881 cl09930 RPA_2b-aaRSs_OBF_like superfamily - - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. 
RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." Q#14381 - CGI_10022727 superfamily 241972 24 95 0.003651 33.7067 cl00600 Ribosomal_L7Ae superfamily C - "Ribosomal protein L7Ae/L30e/S12e/Gadd45 family; This family includes: Ribosomal L7A from metazoa, Ribosomal L8-A and L8-B from fungi, 30S ribosomal protein HS6 from archaebacteria, 40S ribosomal protein S12 from eukaryotes, Ribosomal protein L30 from eukaryotes and archaebacteria. Gadd45 and MyD118." Q#14382 - CGI_10022728 superfamily 241972 8 85 0.000477986 36.7883 cl00600 Ribosomal_L7Ae superfamily C - "Ribosomal protein L7Ae/L30e/S12e/Gadd45 family; This family includes: Ribosomal L7A from metazoa, Ribosomal L8-A and L8-B from fungi, 30S ribosomal protein HS6 from archaebacteria, 40S ribosomal protein S12 from eukaryotes, Ribosomal protein L30 from eukaryotes and archaebacteria. Gadd45 and MyD118." 
Q#14383 - CGI_10022729 superfamily 247684 9 182 4.36E-21 88.7999 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#14384 - CGI_10022730 superfamily 247684 9 182 4.67E-20 86.1035 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#14385 - CGI_10022731 superfamily 247684 10 183 7.43E-16 74.1623 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#14386 - CGI_10022732 superfamily 247684 9 182 6.09E-17 77.2439 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#14386 - CGI_10022732 superfamily 247684 289 319 0.00987966 36.2527 cl17037 NBD_sugar-kinase_HSP70_actin superfamily N - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#14387 - CGI_10022733 superfamily 247684 9 160 1.18E-10 58.7543 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#14387 - CGI_10022733 superfamily 217211 138 194 0.00102128 36.8786 cl03691 Cache_1 superfamily - - Cache domain; Cache domain. Q#14388 - CGI_10022734 superfamily 247727 136 194 2.78E-07 47.3716 cl17173 AdoMet_MTases superfamily C - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). 
There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#14389 - CGI_10022735 superfamily 241843 62 179 8.34E-33 116.2 cl00402 UPF0054 superfamily N - Uncharacterized protein family UPF0054; Uncharacterized protein family UPF0054. Q#14390 - CGI_10022736 superfamily 246748 12 166 3.51E-59 204.245 cl14876 Zinc_peptidase_like superfamily C - "Zinc peptidases M18, M20, M28, and M42; Zinc peptidases play vital roles in metabolic and signaling pathways throughout all kingdoms of life. This family corresponds to several clans in the MEROPS database, including the MH clan, which contains 4 families (M18, M20, M28, M42). The peptidase M20 family includes carboxypeptidases such as the glutamate carboxypeptidase from Pseudomonas, the thermostable carboxypeptidase Ss1 of broad specificity from archaea and yeast Gly-X carboxypeptidase. The dipeptidases include bacterial dipeptidase, peptidase V (PepV), a eukaryotic, non-specific dipeptidase, and two Xaa-His dipeptidases (carnosinases). There is also the bacterial aminopeptidase, peptidase T (PepT) that acts only on tripeptide substrates and has therefore been termed a tripeptidase. Peptidase family M28 contains aminopeptidases and carboxypeptidases, and has co-catalytic zinc ions. However, several enzymes in this family utilize other first row transition metal ions such as cobalt and manganese. Each zinc ion is tetrahedrally co-ordinated, with three amino acid ligands plus activated water; one aspartate residue binds both metal ions. The aminopeptidases in this family are also called bacterial leucyl aminopeptidases, but are able to release a variety of N-terminal amino acids. 
IAP aminopeptidase and aminopeptidase Y preferentially release basic amino acids while glutamate carboxypeptidase II preferentially releases C-terminal glutamates. Glutamate carboxypeptidase II and plasma glutamate carboxypeptidase hydrolyze dipeptides. Peptidase families M18 and M42 contain metalloaminopeptidases. M18 is widely distributed in bacteria and eukaryotes. However, only yeast aminopeptidase I and mammalian aspartyl aminopeptidase have been characterized in detail. Some of M42 (also known as glutamyl aminopeptidase) enzymes exhibit aminopeptidase specificity while others also have acylaminoacylpeptidase activity (i.e. hydrolysis of acylated N-terminal residues)." Q#14391 - CGI_10022737 superfamily 246748 1 84 3.20E-27 102.167 cl14876 Zinc_peptidase_like superfamily N - "Zinc peptidases M18, M20, M28, and M42; Zinc peptidases play vital roles in metabolic and signaling pathways throughout all kingdoms of life. This family corresponds to several clans in the MEROPS database, including the MH clan, which contains 4 families (M18, M20, M28, M42). The peptidase M20 family includes carboxypeptidases such as the glutamate carboxypeptidase from Pseudomonas, the thermostable carboxypeptidase Ss1 of broad specificity from archaea and yeast Gly-X carboxypeptidase. The dipeptidases include bacterial dipeptidase, peptidase V (PepV), a eukaryotic, non-specific dipeptidase, and two Xaa-His dipeptidases (carnosinases). There is also the bacterial aminopeptidase, peptidase T (PepT) that acts only on tripeptide substrates and has therefore been termed a tripeptidase. Peptidase family M28 contains aminopeptidases and carboxypeptidases, and has co-catalytic zinc ions. However, several enzymes in this family utilize other first row transition metal ions such as cobalt and manganese. Each zinc ion is tetrahedrally co-ordinated, with three amino acid ligands plus activated water; one aspartate residue binds both metal ions. 
The aminopeptidases in this family are also called bacterial leucyl aminopeptidases, but are able to release a variety of N-terminal amino acids. IAP aminopeptidase and aminopeptidase Y preferentially release basic amino acids while glutamate carboxypeptidase II preferentially releases C-terminal glutamates. Glutamate carboxypeptidase II and plasma glutamate carboxypeptidase hydrolyze dipeptides. Peptidase families M18 and M42 contain metalloaminopeptidases. M18 is widely distributed in bacteria and eukaryotes. However, only yeast aminopeptidase I and mammalian aspartyl aminopeptidase have been characterized in detail. Some of M42 (also known as glutamyl aminopeptidase) enzymes exhibit aminopeptidase specificity while others also have acylaminoacylpeptidase activity (i.e. hydrolysis of acylated N-terminal residues)." Q#14392 - CGI_10022738 superfamily 246680 9 88 0.00014595 40.2628 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#14393 - CGI_10022739 superfamily 247683 14 68 6.41E-25 96.2656 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. 
Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#14393 - CGI_10022739 superfamily 245674 325 380 0.00392908 35.3426 cl11531 DUF904 superfamily N - Protein of unknown function (DUF904); This family consists of several bacterial and archaeal hypothetical proteins of unknown function. Q#14394 - CGI_10022740 superfamily 247683 3 55 8.35E-27 99.2633 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." 
Q#14394 - CGI_10022740 superfamily 247683 112 161 1.23E-22 88.1594 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#14395 - CGI_10022741 superfamily 244539 342 578 2.39E-63 209.731 cl06868 FNR_like superfamily - - "Ferredoxin reductase (FNR), an FAD and NAD(P) binding protein, was intially identified as a chloroplast reductase activity, catalyzing the electron transfer from reduced iron-sulfur protein ferredoxin to NADP+ as the final step in the electron transport mechanism of photosystem I. FNR transfers electrons from reduced ferredoxin to FAD (forming FADH2 via a semiquinone intermediate) and then transfers a hydride ion to convert NADP+ to NADPH. FNR has since been shown to utilize a variety of electron acceptors and donors and has a variety of physiological functions including nitrogen assimilation, dinitrogen fixation, steroid hydroxylation, fatty acid metabolism, oxygenase activity, and methane assimilation in many organisms. 
FNR has an NAD(P)-binding sub-domain of the alpha/beta class and a discrete (usually N-terminal) flavin sub-domain which vary in orientation with respect to the NAD(P) binding domain. The N-terminal moeity may contain a flavin prosthetic group (as in flavoenzymes) or use flavin as a substrate. Because flavins such as FAD can exist in oxidized, semiquinone (one- electron reduced), or fully reduced hydroquinone forms, FNR can interact with one and 2 electron carriers. FNR has a strong preference for NADP(H) vs NAD(H)." Q#14395 - CGI_10022741 superfamily 242849 101 174 4.97E-23 93.4224 cl02041 Cyt-b5 superfamily - - Cytochrome b5-like Heme/Steroid binding domain; This family includes heme binding domains from a diverse range of proteins. This family also includes proteins that bind to steroids. The family includes progesterone receptors. Many members of this subfamily are membrane anchored by an N-terminal transmembrane alpha helix. This family also includes a domain in some chitin synthases. There is no known ligand for this domain in the chitin synthases. Q#14395 - CGI_10022741 superfamily 241659 230 316 6.83E-15 70.8236 cl00175 alpha-crystallin-Hsps_p23-like superfamily - - "alpha-crystallin domain (ACD) found in alpha-crystallin-type small heat shock proteins, and a similar domain found in p23 (a cochaperone for Hsp90) and in other p23-like proteins.; The alpha-crystallin-Hsps_p23-like superfamily includes the alpha-crystallin domain (ACD) of alpha-crystallin-type small heat shock proteins (sHsps) and a similar domain found in p23-like proteins. sHsps are small stress induced proteins with monomeric masses between 12-43 kDa, whose common feature is this ACD. sHsps are generally active as large oligomers consisting of multiple subunits, and are believed to be ATP-independent chaperones that prevent aggregation and are important in refolding in combination with other Hsps. p23 is a cochaperone of the Hsp90 chaperoning pathway. 
It binds Hsp90 and participates in the folding of a number of Hsp90 clients including the progesterone receptor. p23 also has a passive chaperoning activity. p23 in addition may act as the cytosolic prostaglandin E2 synthase. Included in this superfamily is the p23-like C-terminal CHORD-SGT1 (CS) domain of suppressor of G2 allele of Skp1 (Sgt1) and the p23-like domains of human butyrate-induced transcript 1 (hB-ind1), NUD (nuclear distribution) C, Melusin, and NAD(P)H cytochrome b5 (NCB5) oxidoreductase (OR)." Q#14396 - CGI_10022742 superfamily 247057 726 781 9.57E-17 77.1585 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#14396 - CGI_10022742 superfamily 247057 4 64 5.08E-25 100.983 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. 
They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#14396 - CGI_10022742 superfamily 247683 644 695 3.86E-09 54.6277 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#14396 - CGI_10022742 superfamily 247057 1206 1265 0.00836634 35.949 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. 
They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#14397 - CGI_10022743 superfamily 241900 621 934 4.43E-41 154.768 cl00490 EEP superfamily - - "Exonuclease-Endonuclease-Phosphatase (EEP) domain superfamily; This large superfamily includes the catalytic domain (exonuclease/endonuclease/phosphatase or EEP domain) of a diverse set of proteins including the ExoIII family of apurinic/apyrimidinic (AP) endonucleases, inositol polyphosphate 5-phosphatases (INPP5), neutral sphingomyelinases (nSMases), deadenylases (such as the vertebrate circadian-clock regulated nocturnin), bacterial cytolethal distending toxin B (CdtB), deoxyribonuclease 1 (DNase1), the endonuclease domain of the non-LTR retrotransposon LINE-1, and related domains. These diverse enzymes share a common catalytic mechanism of cleaving phosphodiester bonds; their substrates range from nucleic acids to phospholipids and perhaps proteins." Q#14397 - CGI_10022743 superfamily 241900 931 995 2.86E-06 48.9589 cl00490 EEP superfamily N - "Exonuclease-Endonuclease-Phosphatase (EEP) domain superfamily; This large superfamily includes the catalytic domain (exonuclease/endonuclease/phosphatase or EEP domain) of a diverse set of proteins including the ExoIII family of apurinic/apyrimidinic (AP) endonucleases, inositol polyphosphate 5-phosphatases (INPP5), neutral sphingomyelinases (nSMases), deadenylases (such as the vertebrate circadian-clock regulated nocturnin), bacterial cytolethal distending toxin B (CdtB), deoxyribonuclease 1 (DNase1), the endonuclease domain of the non-LTR retrotransposon LINE-1, and related domains. 
These diverse enzymes share a common catalytic mechanism of cleaving phosphodiester bonds; their substrates range from nucleic acids to phospholipids and perhaps proteins." Q#14398 - CGI_10022744 superfamily 245226 317 473 3.71E-14 70.4072 cl10012 DnaQ_like_exo superfamily - - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." 
Q#14401 - CGI_10022748 superfamily 227448 224 337 2.31E-17 83.9942 cl18812 BDP1 superfamily N - "Transcription initiation factor TFIIIB, Bdp1 subunit [Transcription]" Q#14402 - CGI_10022749 superfamily 241832 4 73 4.20E-34 117.65 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#14402 - CGI_10022749 superfamily 243175 84 180 1.72E-21 85.367 cl02776 GST_C_family superfamily - - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). 
Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." Q#14403 - CGI_10022750 superfamily 241974 536 671 8.14E-17 77.2818 cl00604 STAS superfamily - - "Sulphate Transporter and Anti-Sigma factor antagonist domain found in the C-terminal region of sulphate transporters as well as in bacterial and archaeal proteins involved in the regulation of sigma factors; The STAS (Sulphate Transporter and Anti-Sigma factor antagonist) domain is found in the C-terminal region of sulphate transporters as well as in bacterial and archaeal proteins involved in the regulation of sigma factors, like anti-anti-sigma factors and "stressosome" components. The sigma factor regulators are involved in protein-protein interaction which is regulated by phosphorylation." Q#14403 - CGI_10022750 superfamily 216188 215 480 2.12E-38 143.512 cl18360 Sulfate_transp superfamily - - Sulfate transporter family; Mutations in human SLC26A2 lead to several human diseases. 
Q#14403 - CGI_10022750 superfamily 205965 72 155 3.26E-26 103.261 cl18285 Sulfate_tra_GLY superfamily - - "Sulfate transporter N-terminal domain with GLY motif; This domain is found usually at the N-terminus of sulfate-transporter proteins. It carries a highly conserved GLY sequence motif, but the function of the domain is not known." Q#14404 - CGI_10022751 superfamily 247684 19 430 3.02E-76 250.657 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#14405 - CGI_10022752 superfamily 216554 1 167 2.35E-39 139.922 cl15977 zf-DHHC superfamily - - DHHC palmitoyltransferase; This family includes the well known DHHC zinc binding domain as well as three of the four conserved transmembrane regions found in this family of palmitoyltransferase enzymes. Q#14406 - CGI_10022753 superfamily 247684 10 428 9.85E-85 272.229 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#14407 - CGI_10022754 superfamily 243072 102 231 8.79E-35 124.803 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. 
The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#14407 - CGI_10022754 superfamily 243072 79 104 0.000425457 37.5336 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#14409 - CGI_10022756 superfamily 243179 111 182 6.24E-13 61.3663 cl02781 tetraspanin_LEL superfamily - - "Tetraspanin, extracellular domain or large extracellular loop (LEL). Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. The tetraspanin family contains CD9, CD63, CD37, CD53, CD82, CD151, and CD81, amongst others. Tetraspanins are involved in diverse processes such as cell activation and proliferation, adhesion and motility, differentiation, cancer, and others. Their various functions may relate to their ability to act as molecular facilitators, grouping specific cell-surface proteins and affecting formation and stability of signaling complexes. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web", which may also include integrins." 
Q#14410 - CGI_10022757 superfamily 218118 64 127 0.00519347 32.9713 cl04552 CD225 superfamily - - "Interferon-induced transmembrane protein; This family includes the human leukocyte antigen CD225, which is an interferon inducible transmembrane protein, and is associated with interferon induced cell growth suppression." Q#14411 - CGI_10022758 superfamily 244970 228 269 6.44E-07 46.2214 cl08469 tRNA_SAD superfamily - - "Threonyl and Alanyl tRNA synthetase second additional domain; The catalytically active form of threonyl/alanyl tRNA synthetase is a dimer. Within the tRNA synthetase class II dimer, the bound tRNA interacts with both monomers making specific interactions with the catalytic domain, the C-terminal domain, and this domain (the second additional domain). The second additional domain is comprised of a pair of perpendicularly orientated antiparallel beta sheets, of four and three strands, respectively, that surround a central alpha helix that forms the core of the domain." Q#14412 - CGI_10022759 superfamily 197746 127 166 6.02E-06 41.5579 cl02624 MIR superfamily C - Domain in ryanodine and inositol trisphosphate receptors and protein O-mannosyltransferases; Domain in ryanodine and inositol trisphosphate receptors and protein O-mannosyltransferases. Q#14412 - CGI_10022759 superfamily 197746 4 54 1.38E-05 40.4023 cl02624 MIR superfamily - - Domain in ryanodine and inositol trisphosphate receptors and protein O-mannosyltransferases; Domain in ryanodine and inositol trisphosphate receptors and protein O-mannosyltransferases. Q#14412 - CGI_10022759 superfamily 197746 65 103 1.60E-05 40.4023 cl02624 MIR superfamily C - Domain in ryanodine and inositol trisphosphate receptors and protein O-mannosyltransferases; Domain in ryanodine and inositol trisphosphate receptors and protein O-mannosyltransferases. 
Q#14413 - CGI_10022760 superfamily 241574 46 107 5.43E-25 93.1632 cl00053 PTPc superfamily NC - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#14415 - CGI_10010969 superfamily 241584 570 644 4.65E-08 52.1135 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#14415 - CGI_10010969 superfamily 247724 718 802 5.79E-06 47.9256 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#14419 - CGI_10010973 superfamily 243066 24 104 4.48E-08 51.8493 cl02518 BTB superfamily N - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#14419 - CGI_10010973 superfamily 222150 729 753 0.00446218 35.8305 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. 
Q#14420 - CGI_10010974 superfamily 217235 155 312 2.67E-97 287.519 cl18396 Gp_dh_C superfamily - - "Glyceraldehyde 3-phosphate dehydrogenase, C-terminal domain; GAPDH is a tetrameric NAD-binding enzyme involved in glycolysis and gluconeogenesis. C-terminal domain is a mixed alpha/antiparallel beta fold." Q#14420 - CGI_10010974 superfamily 248296 3 149 4.71E-71 220.058 cl17742 Gp_dh_N superfamily - - "Glyceraldehyde 3-phosphate dehydrogenase, NAD binding domain; GAPDH is a tetrameric NAD-binding enzyme involved in glycolysis and gluconeogenesis. N-terminal domain is a Rossmann NAD(P) binding fold." Q#14423 - CGI_10010977 superfamily 245201 27 269 4.46E-71 231.742 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#14424 - CGI_10010978 superfamily 110440 385 411 0.000946276 37.0021 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#14425 - CGI_10010979 superfamily 243084 889 986 1.67E-27 109.952 cl02556 Bromodomain superfamily - - Bromodomain. 
Bromodomains are found in many chromatin-associated proteins and in nuclear histone acetyltransferases. They interact specifically with acetylated lysine. Q#14427 - CGI_10010981 superfamily 206097 642 729 6.79E-08 51.5279 cl16481 DUF4211 superfamily - - Domain of unknown function (DUF4211); Domain of unknown function (DUF4211). Q#14428 - CGI_10010982 superfamily 220711 10 181 2.27E-39 141.183 cl18573 DUF2431 superfamily - - Domain of unknown function (DUF2431); This is the N-terminal domain of a family of proteins found from plants to humans. The function is not known. Q#14428 - CGI_10010982 superfamily 244928 449 543 0.000402875 38.9458 cl08386 FDX-ACB superfamily - - Ferredoxin-fold anticodon binding domain; This is the anticodon binding domain found in some phenylalanyl tRNA synthetases. The domain has a ferredoxin fold. Q#14429 - CGI_10010983 superfamily 241752 566 632 1.39E-19 85.4489 cl00283 ADP_ribosyl superfamily N - "ADP_ribosylating enzymes catalyze the transfer of ADP_ribose from NAD+ to substrates. Bacterial toxins are cytoplasmic and catalyze the transfer of a single ADP_ribose unit to eukaryotic elongation factor 2, halting protein synthesis and killing the cell. Poly(ADP-ribose) polymerases (PARPS 1-3, VPARP, tankyrase) catalyze the addition of up to 100 ADP_ribose units from NAD+. PARPs 1 and 2 are localized in the nucleaus, bind DNA, and are activated by DNA damage. VPARP is part of the vault ribonucleoprotein complex. Tankyrases regulates telomere length in part through poy(ADP_ribosylation) of telomere repeat binding factor 1 (TRF1). Poly(ADP-ribose) polymerase catalyses the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. 
Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active." Q#14429 - CGI_10010983 superfamily 241752 371 418 3.44E-16 75.8189 cl00283 ADP_ribosyl superfamily C - "ADP_ribosylating enzymes catalyze the transfer of ADP_ribose from NAD+ to substrates. Bacterial toxins are cytoplasmic and catalyze the transfer of a single ADP_ribose unit to eukaryotic elongation factor 2, halting protein synthesis and killing the cell. Poly(ADP-ribose) polymerases (PARPS 1-3, VPARP, tankyrase) catalyze the addition of up to 100 ADP_ribose units from NAD+. PARPs 1 and 2 are localized in the nucleaus, bind DNA, and are activated by DNA damage. VPARP is part of the vault ribonucleoprotein complex. Tankyrases regulates telomere length in part through poy(ADP_ribosylation) of telomere repeat binding factor 1 (TRF1). Poly(ADP-ribose) polymerase catalyses the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active." Q#14431 - CGI_10010985 superfamily 244584 606 654 4.02E-05 42.3647 cl07029 SPT2 superfamily NC - SPT2 chromatin protein; This family includes the Saccharomyces cerevisiae protein SPT2 which is a chromatin protein involved in transcriptional regulation. Q#14433 - CGI_10010987 superfamily 245864 46 486 9.22E-89 282.245 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. 
Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#14435 - CGI_10010989 superfamily 247805 70 209 3.05E-15 70.8292 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#14436 - CGI_10022875 superfamily 248097 3 111 3.72E-21 82.6982 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#14437 - CGI_10022876 superfamily 243072 12 107 7.30E-13 61.6306 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#14438 - CGI_10022877 superfamily 248097 71 193 4.18E-18 76.535 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. 
Q#14439 - CGI_10022878 superfamily 241645 1 50 2.03E-16 70.436 cl00155 UBQ superfamily C - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#14441 - CGI_10022880 superfamily 247769 166 286 3.03E-10 56.5789 cl17215 HDc superfamily C - Metal dependent phosphohydrolases with conserved 'HD' motif Q#14441 - CGI_10022880 superfamily 247057 61 112 2.34E-05 41.1462 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." 
Q#14443 - CGI_10022882 superfamily 247769 162 325 6.36E-13 66.5941 cl17215 HDc superfamily - - Metal dependent phosphohydrolases with conserved 'HD' motif Q#14444 - CGI_10022883 superfamily 243100 209 264 1.22E-06 45.2476 cl02576 B_zip1 superfamily - - "basic leucine zipper DNA-binding and multimerization region of GCN4 and related proteins; Basic leucine zipper (bZIP) transcription factors act in networks of homo- and hetero-dimers in the regulation in a diverse set of cellular pathways. Classical leucine zippers have alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization creates a pair of basic regions that bind DNA and undergo conformational change. GCN4 was identified in Saccharomyces cerevisiae from mutations in a deficiency in activation with the general amino acid control pathway. GCN4 encodes a trans-activator of amino acid biosynthetic genes containing 2 acidic activation domains and a C-terminal bZIP domain, comprised of a basic alpha-helical DNA-binding region and a coiled-coil dimerization region." Q#14450 - CGI_10022889 superfamily 192556 853 937 1.39E-24 100.336 cl11030 DUF2435 superfamily - - Protein of unknown function (DUF2435); This is a conserved region of approximately 400 residues which is found only in eukaryotes. It is associated with HEAT domains pfam02985 in all members. The function is not known. Q#14451 - CGI_10022890 superfamily 243290 246 361 0.00545072 37.261 cl03075 GrpE superfamily C - "GrpE is the adenine nucleotide exchange factor of DnaK (Hsp70)-type ATPases. The GrpE dimer binds to the ATPase domain of Hsp70 catalyzing the dissociation of ADP, which enables rebinding of ATP, one step in the Hsp70 reaction cycle in protein folding. In eukaryotes, only the mitochondrial Hsp70, not the cytosolic form, is GrpE dependent." 
Q#14454 - CGI_10022893 superfamily 245040 535 600 0.00115522 37.8388 cl09238 CY superfamily C - "Cystatin-like domain; Cystatins are a family of cysteine protease inhibitors that occur mainly as single domain proteins. However some extracellular proteins such as kininogen, His-rich glycoprotein and fetuin also contain these domains." Q#14455 - CGI_10022894 superfamily 215754 215 305 1.07E-14 68.0488 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#14455 - CGI_10022894 superfamily 215754 104 204 4.02E-13 63.8116 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#14455 - CGI_10022894 superfamily 215754 2 90 8.19E-13 63.0412 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#14457 - CGI_10022896 superfamily 218182 14 213 1.42E-66 207.302 cl18445 ERG2_Sigma1R superfamily - - "ERG2 and Sigma1 receptor like protein; This family consists of the fungal C-8 sterol isomerase and mammalian sigma1 receptor. C-8 sterol isomerase (delta-8--delta-7 sterol isomerase), catalyzes a reaction in ergosterol biosynthesis, which results in unsaturation at C-7 in the B ring of sterols. Sigma 1 receptor is a low molecular mass mammalian protein located in the endoplasmic reticulum, which interacts with endogenous steroid hormones, such as progesterone and testosterone. It also binds the sigma ligands, which are a set of chemically unrelated drugs including haloperidol, pentazocine, and ditolylguanidine. Sigma1 effectors are not well understood, but sigma1 agonists have been observed to affect NMDA receptor function, the alpha-adrenergic system and opioid analgesia." Q#14460 - CGI_10022899 superfamily 245225 413 679 3.22E-90 299.158 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily N - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. 
This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein couples receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine- binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#14460 - CGI_10022899 superfamily 215648 794 1033 1.02E-78 258.68 cl02802 7tm_3 superfamily - - "7 transmembrane sweet-taste receptor of 3 GCPR; This is a domain of seven transmembrane regions that forms the C-terminus of some subclass 3 G-coupled-protein receptors. 
It is often associated with a downstream cysteine-rich linker domain, NCD3G pfam07562, which is the human sweet-taste receptor, and the N-terminal domain, ANF_receptor pfam01094. The seven TM regions assemble in such a way as to produce a docking pocket into which such molecules as cyclamate and lactisole have been found to bind and consequently confer the taste of sweetness." Q#14460 - CGI_10022899 superfamily 245225 307 413 3.93E-33 133.907 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily C - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein couples receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine- binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. 
These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#14460 - CGI_10022899 superfamily 216686 90 272 1.22E-20 91.6157 cl18377 Galactosyl_T superfamily - - "Galactosyltransferase; This family includes the galactosyltransferases UDP-galactose:2-acetamido-2-deoxy-D-glucose3beta-galactosyltransferase and UDP-Gal:beta-GlcNAc beta 1,3-galactosyltranferase. Specific galactosyltransferases transfer galactose to GlcNAc terminal chains in the synthesis of the lacto-series oligosaccharides types 1 and 2." Q#14460 - CGI_10022899 superfamily 219467 713 763 2.67E-12 63.8915 cl08456 NCD3G superfamily - - "Nine Cysteines Domain of family 3 GPCR; This conserved sequence contains several highly-conserved Cys residues that are predicted to form disulphide bridges. It is predicted to lie outside the cell membrane, tethered to the pfam00003 in several receptor proteins." Q#14461 - CGI_10022901 superfamily 241644 13 93 2.29E-23 91.4949 cl00154 UBCc superfamily - - "Ubiquitin-conjugating enzyme E2, catalytic (UBCc) domain. This is part of the ubiquitin-mediated protein degradation pathway in which a thiol-ester linkage forms between a conserved cysteine and the C-terminus of ubiquitin and complexes with ubiquitin protein ligase enzymes, E3. This pathway regulates many fundamental cellular processes. There are also other E2s which form thiol-ester linkages without the use of E3s as well as several UBC homologs (TSG101, Mms2, Croc-1 and similar proteins) which lack the active site cysteine essential for ubiquitination and appear to function in DNA repair pathways which were omitted from the scope of this CD." 
Q#14462 - CGI_10022902 superfamily 243072 1164 1275 1.45E-16 78.5794 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#14462 - CGI_10022902 superfamily 243072 1043 1206 2.84E-12 65.8678 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#14462 - CGI_10022902 superfamily 243072 716 816 1.95E-06 47.7634 cl02529 ANK superfamily N - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#14462 - CGI_10022902 superfamily 247743 19 137 0.000294838 41.8624 cl17189 AAA superfamily N - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. 
The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#14463 - CGI_10022903 superfamily 243555 10 202 6.10E-12 62.0234 cl03871 Chitin_bind_3 superfamily - - "Chitin binding domain; This domain is found associated with a wide variety of cellulose binding domain. This domain however is a chitin binding domain. This domain is found in isolation in baculoviral spheroidins and spindolins, protein of unknown function." Q#14466 - CGI_10022906 superfamily 247068 137 237 2.24E-18 77.7389 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#14466 - CGI_10022906 superfamily 247068 24 127 7.62E-17 73.5017 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#14467 - CGI_10022907 superfamily 247068 1017 1111 2.86E-23 97.3841 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#14467 - CGI_10022907 superfamily 247068 496 591 9.79E-23 95.8433 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#14467 - CGI_10022907 superfamily 247068 603 694 7.32E-20 87.3689 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#14467 - CGI_10022907 superfamily 247068 169 289 4.02E-18 82.3613 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#14467 - CGI_10022907 superfamily 247068 63 161 5.84E-18 81.9761 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#14467 - CGI_10022907 superfamily 247068 702 794 1.17E-17 81.2057 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#14467 - CGI_10022907 superfamily 247068 910 1006 1.86E-17 80.4353 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#14467 - CGI_10022907 superfamily 247068 804 901 7.21E-17 78.8945 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#14467 - CGI_10022907 superfamily 247068 1119 1209 1.02E-12 66.5681 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#14467 - CGI_10022907 superfamily 247068 1329 1409 1.37E-12 66.1829 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#14467 - CGI_10022907 superfamily 247068 397 487 2.50E-12 65.4125 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#14467 - CGI_10022907 superfamily 247068 308 386 9.38E-06 45.7674 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#14469 - CGI_10022909 superfamily 247742 72 396 5.88E-143 414.808 cl17188 enolase_like superfamily - - "Enolase-superfamily, characterized by the presence of an enolate anion intermediate which is generated by abstraction of the alpha-proton of the carboxylate substrate by an active site residue and is stabilized by coordination to the essential Mg2+ ion. Enolase superfamily contains different enzymes, like enolases, glutarate-, fucanate- and galactonate dehydratases, o-succinylbenzoate synthase, N-acylamino acid racemase, L-alanine-DL-glutamate epimerase, mandelate racemase, muconate lactonizing enzyme and 3-methylaspartase." Q#14470 - CGI_10022910 superfamily 247044 79 191 2.89E-56 182.808 cl15697 ADF_gelsolin superfamily - - Actin depolymerization factor/cofilin- and gelsolin-like domains; Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. 
Q#14470 - CGI_10022910 superfamily 247044 205 283 8.00E-20 83.4408 cl15697 ADF_gelsolin superfamily - - Actin depolymerization factor/cofilin- and gelsolin-like domains; Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. Q#14470 - CGI_10022910 superfamily 247044 319 404 2.08E-21 88.0752 cl15697 ADF_gelsolin superfamily - - Actin depolymerization factor/cofilin- and gelsolin-like domains; Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. Q#14471 - CGI_10022911 superfamily 241563 68 109 1.37E-06 45.9332 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#14471 - CGI_10022911 superfamily 241563 28 59 0.00373235 35.918 cl00034 BBOX superfamily N - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#14472 - CGI_10022912 superfamily 247044 10 58 6.02E-21 80.73 cl15697 ADF_gelsolin superfamily C - Actin depolymerization factor/cofilin- and gelsolin-like domains; Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. 
Q#14474 - CGI_10022914 superfamily 243091 45 116 2.11E-08 52.1092 cl02566 SET superfamily - - "SET domain; SET domains are protein lysine methyltransferase enzymes. SET domains appear to be protein-protein interaction domains. It has been demonstrated that SET domains mediate interactions with a family of proteins that display similarity with dual-specificity phosphatases (dsPTPases). A subset of SET domains have been called PR domains. These domains are divergent in sequence from other SET domains, but also appear to mediate protein-protein interaction. The SET domain consists of two regions known as SET-N and SET-C. SET-C forms an unusual and conserved knot-like structure of probably functional importance. Additionally to SET-N and SET-C, an insert region (SET-I) and flanking regions of high structural variability form part of the overall structure." Q#14477 - CGI_10003542 superfamily 245819 299 473 1.90E-49 175.845 cl11967 Nucleotidyl_cyc_III superfamily - - "Class III nucleotidyl cyclases; Class III nucleotidyl cyclases are the largest, most diverse group of nucleotidyl cyclases (NC's) containing prokaryotic and eukaryotic proteins. They can be divided into two major groups; the mononucleotidyl cyclases (MNC's) and the diguanylate cyclases (DGC's). The MNC's, which include the adenylate cyclases (AC's) and the guanylate cyclases (GC's), have a conserved cyclase homology domain (CHD), while the DGC's have a conserved GGDEF domain, named after a conserved motif within this subgroup. Their products, cyclic guanylyl and adenylyl nucleotides, are second messengers that play important roles in eukaryotic signal transduction and prokaryotic sensory pathways." Q#14477 - CGI_10003542 superfamily 245819 843 1030 1.94E-45 164.289 cl11967 Nucleotidyl_cyc_III superfamily - - "Class III nucleotidyl cyclases; Class III nucleotidyl cyclases are the largest, most diverse group of nucleotidyl cyclases (NC's) containing prokaryotic and eukaryotic proteins. 
They can be divided into two major groups; the mononucleotidyl cyclases (MNC's) and the diguanylate cyclases (DGC's). The MNC's, which include the adenylate cyclases (AC's) and the guanylate cyclases (GC's), have a conserved cyclase homology domain (CHD), while the DGC's have a conserved GGDEF domain, named after a conserved motif within this subgroup. Their products, cyclic guanylyl and adenylyl nucleotides, are second messengers that play important roles in eukaryotic signal transduction and prokaryotic sensory pathways." Q#14477 - CGI_10003542 superfamily 218992 516 609 5.07E-05 43.9425 cl05691 DUF1053 superfamily - - Domain of Unknown Function (DUF1053); This domain is found in Adenylate cyclases. Q#14478 - CGI_10003543 superfamily 241645 18 57 0.00267745 37.505 cl00155 UBQ superfamily C - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#14480 - CGI_10005182 superfamily 241575 599 665 8.39E-17 76.9275 cl00054 DSRM superfamily - - "Double-stranded RNA binding motif. Binding is not sequence specific but is highly specific for double stranded RNA. Found in a variety of proteins including dsRNA dependent protein kinase PKR, RNA helicases, Drosophila staufen protein, E. 
coli RNase III, RNases H1, and dsRNA dependent adenosine deaminases." Q#14480 - CGI_10005182 superfamily 243132 690 1077 3.18E-120 376.718 cl02661 A_deamin superfamily - - "Adenosine-deaminase (editase) domain; Adenosine deaminases acting on RNA (ADARs) can deaminate adenosine to form inosine. In long double-stranded RNA, this process is non-specific; it occurs site-specifically in RNA transcripts. The former is important in defence against viruses, whereas the latter may affect splicing or untranslated regions. They are primarily nuclear proteins, but a longer isoform of ADAR1 is found predominantly in the cytoplasm. ADARs are derived from the Tad1-like tRNA deaminases that are present across eukaryotes. These in turn belong to the nucleotide/nucleic acid deaminase superfamily and are characterized by a distinct insert between the two conserved cysteines that are involved in binding zinc." Q#14480 - CGI_10005182 superfamily 207691 28 90 0.000920374 38.8565 cl02659 z-alpha superfamily - - "Adenosine deaminase z-alpha domain; This family consists of the N-terminus and thus the z-alpha domain of double-stranded RNA-specific adenosine deaminase (ADAR), an RNA- editing enzyme. The z-alpha domain is a Z-DNA binding domain, and binding of this region to B-DNA has been shown to be disfavoured by steric hindrance." Q#14483 - CGI_10005185 superfamily 241750 85 534 0 738.26 cl00281 metallo-dependent_hydrolases superfamily - - "Superfamily of metallo-dependent hydrolases (also called amidohydrolase superfamily) is a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. 
The family includes urease alpha, adenosine deaminase, phosphotriesterase dihydroorotases, allantoinases, hydantoinases, AMP-, adenine and cytosine deaminases, imidazolonepropionase, aryldialkylphosphatase, chlorohydrolases, formylmethanofuran dehydrogenases and others." Q#14484 - CGI_10005186 superfamily 247844 1 163 7.23E-59 184.82 cl17290 Methyltransf_4 superfamily N - Putative methyltransferase; This is a family of putative methyltransferases. The aligned region contains the GXGXG S-AdoMet binding site suggesting a putative methyltransferase activity. Q#14485 - CGI_10005187 superfamily 202474 30 113 3.20E-12 61.9009 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#14486 - CGI_10005188 superfamily 217293 73 261 1.94E-35 131.216 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#14486 - CGI_10005188 superfamily 202474 269 351 3.32E-15 73.4569 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#14486 - CGI_10005188 superfamily 247844 1 71 0.00825438 36.5057 cl17290 Methyltransf_4 superfamily C - Putative methyltransferase; This is a family of putative methyltransferases. The aligned region contains the GXGXG S-AdoMet binding site suggesting a putative methyltransferase activity. Q#14487 - CGI_10005189 superfamily 114359 13 308 8.24E-69 225.173 cl17943 DUF791 superfamily - - Protein of unknown function (DUF791); This family consists of several eukaryotic proteins of unknown function. 
Q#14488 - CGI_10005190 superfamily 245814 296 356 7.37E-06 44.0171 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#14488 - CGI_10005190 superfamily 245814 493 556 3.50E-05 42.0911 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#14488 - CGI_10005190 superfamily 245814 196 263 0.000157573 40.3787 cl11960 Ig superfamily N - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. 
A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#14491 - CGI_10004188 superfamily 177822 43 290 2.99E-21 90.7497 cl18088 PLN02164 superfamily N - sulfotransferase Q#14493 - CGI_10013832 superfamily 227453 70 270 4.67E-27 105.037 cl14331 COG5124 superfamily - - Protein predicted to be involved in meiotic recombination [Cell division and chromosome partitioning / General function prediction only] Q#14494 - CGI_10013833 superfamily 243058 125 231 2.93E-14 69.2655 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#14494 - CGI_10013833 superfamily 243058 202 323 4.64E-10 56.9391 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. 
ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#14494 - CGI_10013833 superfamily 243058 391 487 3.75E-06 45.3832 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#14494 - CGI_10013833 superfamily 243058 282 405 3.82E-06 45.3832 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." 
Q#14494 - CGI_10013833 superfamily 201951 28 109 0.000167397 40.0566 cl03339 IBB superfamily - - "Importin beta binding domain; This family consists of the importin alpha (karyopherin alpha), importin beta (karyopherin beta) binding domain. The domain mediates formation of the importin alpha beta complex; required for classical NLS import of proteins into the nucleus, through the nuclear pore complex and across the nuclear envelope. Also in the alignment is the NLS of importin alpha which overlaps with the IBB domain." Q#14495 - CGI_10013834 superfamily 241696 221 560 4.13E-169 485.982 cl00218 Glyco_hydrolase_16 superfamily - - "glycosyl hydrolase family 16; The O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A glycosyl hydrolase classification system based on sequence similarity has led to the definition of more than 95 different families inlcuding glycosyl hydrolase family 16. Family 16 includes lichenase, xyloglucan endotransglycosylase (XET), beta-agarase, kappa-carrageenase, endo-beta-1,3-glucanase, endo-beta-1,3-1,4-glucanase, and endo-beta-galactosidase, all of which have a conserved jelly roll fold with a deep active site channel harboring the catalytic residues." Q#14498 - CGI_10013837 superfamily 241958 43 309 5.11E-37 137.262 cl00573 SDF superfamily C - Sodium:dicarboxylate symporter family; Sodium:dicarboxylate symporter family. Q#14499 - CGI_10013838 superfamily 241958 44 79 0.00372265 34.0282 cl00573 SDF superfamily N - Sodium:dicarboxylate symporter family; Sodium:dicarboxylate symporter family. Q#14500 - CGI_10013839 superfamily 248458 111 264 0.00120994 39.9897 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. 
MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#14502 - CGI_10013841 superfamily 245213 162 196 0.00455181 36.8458 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. 
Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14502 - CGI_10013841 superfamily 245213 219 250 0.00467505 36.8458 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14504 - CGI_10013843 superfamily 243092 3 293 1.94E-26 109.347 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other 
proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#14505 - CGI_10013844 superfamily 150957 761 869 3.30E-38 138.903 cl11034 Vps39_2 superfamily - - "Vacuolar sorting protein 39 domain 2; This domain is found on the vacuolar sorting protein Vps39 which is a component of the C-Vps complex. Vps39 is thought to be required for the fusion of endosomes and other types of transport intermediates with the vacuole. In Saccharomyces cerevisiae, Vps39 has been shown to stimulate nucleotide exchange. This domain is involved in localisation and in mediating the interactions of Vps39 with Vps11." Q#14505 - CGI_10013844 superfamily 243036 24 285 1.06E-31 125.428 cl02434 CNH superfamily - - "CNH domain; Domain found in NIK1-like kinase, mouse citron and yeast ROM1, ROM2. Unpublished observations." Q#14505 - CGI_10013844 superfamily 220718 450 552 9.13E-28 109.251 cl11033 Vps39_1 superfamily - - "Vacuolar sorting protein 39 domain 1; This domain is found on the vacuolar sorting protein Vps39 which is a component of the C-Vps complex. Vps39 is thought to be required for the fusion of endosomes and other types of transport intermediates with the vacuole. In Saccharomyces cerevisiae, Vps39 has been shown to stimulate nucleotide exchange. The precise function of this domain has not been characterized." Q#14506 - CGI_10013845 superfamily 247725 2274 2381 2.16E-48 170.468 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." 
Q#14506 - CGI_10013845 superfamily 243054 637 844 3.51E-37 142.199 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#14506 - CGI_10013845 superfamily 243054 522 739 5.71E-35 135.651 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#14506 - CGI_10013845 superfamily 243054 952 1156 7.76E-34 132.569 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine 
(L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#14506 - CGI_10013845 superfamily 243054 1590 1801 8.07E-31 123.71 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#14506 - CGI_10013845 superfamily 243054 1802 2024 3.09E-29 119.087 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#14506 - CGI_10013845 superfamily 243054 1168 1375 7.32E-28 114.85 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an 
antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#14506 - CGI_10013845 superfamily 243054 1380 1584 4.89E-27 112.539 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#14506 - CGI_10013845 superfamily 241559 165 268 4.89E-24 100.463 cl00030 CH superfamily - - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." Q#14506 - CGI_10013845 superfamily 241559 46 149 6.30E-17 79.6623 cl00030 CH superfamily - - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." 
Q#14506 - CGI_10013845 superfamily 243054 417 501 9.90E-11 61.5658 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#14506 - CGI_10013845 superfamily 243054 847 947 1.20E-10 61.1917 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#14506 - CGI_10013845 superfamily 243054 2013 2075 7.22E-09 55.7878 cl02488 SPEC superfamily C - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine 
(L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#14507 - CGI_10013846 superfamily 243072 1015 1090 2.85E-10 59.3194 cl02529 ANK superfamily C - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#14507 - CGI_10013846 superfamily 241645 19 93 0.00475539 36.5427 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." 
Q#14508 - CGI_10013847 superfamily 243034 429 514 4.48E-08 51.612 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#14508 - CGI_10013847 superfamily 243034 290 363 1.84E-06 46.9896 cl02429 TPR superfamily N - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number 
of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#14508 - CGI_10013847 superfamily 243034 333 447 6.40E-06 45.0636 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 
subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#14508 - CGI_10013847 superfamily 248006 569 616 4.39E-05 41.7879 cl17452 TPR_10 superfamily - - Tetratricopeptide repeat; Tetratricopeptide repeat. Q#14510 - CGI_10013849 superfamily 247677 244 394 1.58E-37 135.051 cl17013 W2 superfamily - - "C-terminal domain of eIF4-gamma/eIF5/eIF2b-epsilon; This domain is found at the C-terminus of several translation initiation factors, including the epsilon chain of eIF2b, where it has been found to catalyze the conversion of eIF2.GDP to its active eIF2.GTP form. The structure of the domain resembles that of a set of concatenated HEAT repeats." Q#14510 - CGI_10013849 superfamily 247678 4 131 1.00E-47 161.722 cl17014 eIF-5_eIF-2B superfamily - - "Domain found in IF2B/IF5; This family includes the N terminus of eIF-5, and the C terminus of eIF-2 beta. This region corresponds to the whole of the archaebacterial eIF-2 beta homologue. The region contains a putative zinc binding C4 finger." 
Q#14511 - CGI_10013850 superfamily 247684 47 201 2.38E-07 49.1243 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#14515 - CGI_10003747 superfamily 243035 37 103 1.93E-14 65.3337 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. 
Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. 
A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#14517 - CGI_10006430 superfamily 246723 83 536 0 646.158 cl14813 GluZincin superfamily - - "Peptidase Gluzincin family (thermolysin-like proteinases, TLPs) includes peptidases M1, M2, M3, M4, M13, M32 and M36 (fungalysins); Gluzincin family (thermolysin-like peptidases or TLPs) includes several zinc-dependent metallopeptidases such as the M1, M2, M3, M4, M13, M32, M36 peptidases (MEROPS classification), and contain HEXXH and EXXXD motifs as part of their active site. All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. M1 family includes aminopeptidase N (APN) and leukotriene A4 hydrolase (LTA4H). APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types. LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity such that the two activities occupy different, but overlapping sites. The peptidase M3 or neurolysin-like family, includes M3, M2 and M32 metallopeptidases. The M3 peptidases have two subfamilies: M3A, includes thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (3.4.24.16), and the mitochondrial intermediate peptidase; M3B contains oligopeptidase F. M2 peptidase angiotensin converting enzyme (ACE, EC 3.4.15.1) catalyzes the conversion of decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II. 
ACE is a key part of the renin-angiotensin system that regulates blood pressure, thus ACE inhibitors are important for the treatment of hypertension. M32 family includes two eukaryotic enzymes from protozoa Trypanosoma cruzi, a causative agent of Chagas' disease, and Leishmania major, a parasite that causes leishmaniasis, making them attractive targets for drug development. The M4 family includes secreted protease thermolysin (EC 3.4.24.27), pseudolysin, aureolysin, neutral protease as well as fungalysin and bacillolysin (EC 3.4.24.28) that degrade extracellular proteins and peptides for bacterial nutrition, especially prior to sporulation. Thermolysin is widely used as a nonspecific protease to obtain fragments for peptide sequencing as well as in production of the artificial sweetener aspartame. M13 family includes neprilysin (EC 3.4.24.11) and endothelin-converting enzyme I (ECE-1, EC 3.4.24.71), which fulfill a broad range of physiological roles due to the greater variation in the S2' subsite allowing substrate specificity and are prime therapeutic targets for selective inhibition. Peptidase M36 (fungamysin) family includes endopeptidases from pathogenic fungi. Fungalysin hydrolyzes extracellular matrix proteins such as elastin and keratin. Aspergillus fumigatus causes the pulmonary disease aspergillosis by invading the lungs of immuno-compromised animals and secreting fungalysin that possibly breaks down proteinaceous structural barriers." Q#14517 - CGI_10006430 superfamily 246723 979 1431 0 621.506 cl14813 GluZincin superfamily - - "Peptidase Gluzincin family (thermolysin-like proteinases, TLPs) includes peptidases M1, M2, M3, M4, M13, M32 and M36 (fungalysins); Gluzincin family (thermolysin-like peptidases or TLPs) includes several zinc-dependent metallopeptidases such as the M1, M2, M3, M4, M13, M32, M36 peptidases (MEROPS classification), and contain HEXXH and EXXXD motifs as part of their active site. 
All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. M1 family includes aminopeptidase N (APN) and leukotriene A4 hydrolase (LTA4H). APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types. LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity such that the two activities occupy different, but overlapping sites. The peptidase M3 or neurolysin-like family, includes M3, M2 and M32 metallopeptidases. The M3 peptidases have two subfamilies: M3A, includes thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (3.4.24.16), and the mitochondrial intermediate peptidase; M3B contains oligopeptidase F. M2 peptidase angiotensin converting enzyme (ACE, EC 3.4.15.1) catalyzes the conversion of decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II. ACE is a key part of the renin-angiotensin system that regulates blood pressure, thus ACE inhibitors are important for the treatment of hypertension. M32 family includes two eukaryotic enzymes from protozoa Trypanosoma cruzi, a causative agent of Chagas' disease, and Leishmania major, a parasite that causes leishmaniasis, making them attractive targets for drug development. The M4 family includes secreted protease thermolysin (EC 3.4.24.27), pseudolysin, aureolysin, neutral protease as well as fungalysin and bacillolysin (EC 3.4.24.28) that degrade extracellular proteins and peptides for bacterial nutrition, especially prior to sporulation. Thermolysin is widely used as a nonspecific protease to obtain fragments for peptide sequencing as well as in production of the artificial sweetener aspartame. 
M13 family includes neprilysin (EC 3.4.24.11) and endothelin-converting enzyme I (ECE-1, EC 3.4.24.71), which fulfill a broad range of physiological roles due to the greater variation in the S2' subsite allowing substrate specificity and are prime therapeutic targets for selective inhibition. Peptidase M36 (fungamysin) family includes endopeptidases from pathogenic fungi. Fungalysin hydrolyzes extracellular matrix proteins such as elastin and keratin. Aspergillus fumigatus causes the pulmonary disease aspergillosis by invading the lungs of immuno-compromised animals and secreting fungalysin that possibly breaks down proteinaceous structural barriers." Q#14517 - CGI_10006430 superfamily 246723 1869 2309 0 618.809 cl14813 GluZincin superfamily - - "Peptidase Gluzincin family (thermolysin-like proteinases, TLPs) includes peptidases M1, M2, M3, M4, M13, M32 and M36 (fungalysins); Gluzincin family (thermolysin-like peptidases or TLPs) includes several zinc-dependent metallopeptidases such as the M1, M2, M3, M4, M13, M32, M36 peptidases (MEROPS classification), and contain HEXXH and EXXXD motifs as part of their active site. All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. M1 family includes aminopeptidase N (APN) and leukotriene A4 hydrolase (LTA4H). APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types. LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity such that the two activities occupy different, but overlapping sites. The peptidase M3 or neurolysin-like family, includes M3, M2 and M32 metallopeptidases. 
The M3 peptidases have two subfamilies: M3A, includes thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (3.4.24.16), and the mitochondrial intermediate peptidase; M3B contains oligopeptidase F. M2 peptidase angiotensin converting enzyme (ACE, EC 3.4.15.1) catalyzes the conversion of decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II. ACE is a key part of the renin-angiotensin system that regulates blood pressure, thus ACE inhibitors are important for the treatment of hypertension. M32 family includes two eukaryotic enzymes from protozoa Trypanosoma cruzi, a causative agent of Chagas' disease, and Leishmania major, a parasite that causes leishmaniasis, making them attractive targets for drug development. The M4 family includes secreted protease thermolysin (EC 3.4.24.27), pseudolysin, aureolysin, neutral protease as well as fungalysin and bacillolysin (EC 3.4.24.28) that degrade extracellular proteins and peptides for bacterial nutrition, especially prior to sporulation. Thermolysin is widely used as a nonspecific protease to obtain fragments for peptide sequencing as well as in production of the artificial sweetener aspartame. M13 family includes neprilysin (EC 3.4.24.11) and endothelin-converting enzyme I (ECE-1, EC 3.4.24.71), which fulfill a broad range of physiological roles due to the greater variation in the S2' subsite allowing substrate specificity and are prime therapeutic targets for selective inhibition. Peptidase M36 (fungamysin) family includes endopeptidases from pathogenic fungi. Fungalysin hydrolyzes extracellular matrix proteins such as elastin and keratin. Aspergillus fumigatus causes the pulmonary disease aspergillosis by invading the lungs of immuno-compromised animals and secreting fungalysin that possibly breaks down proteinaceous structural barriers." 
Q#14519 - CGI_10006432 superfamily 219086 43 173 1.33E-16 73.376 cl05857 BNIP3 superfamily - - "BNIP3; This family consists of several mammalian specific BCL2/adenovirus E1B 19-kDa protein-interacting protein 3 or BNIP3 sequences. BNIP3 belongs to the Bcl-2 homology 3 (BH3)-only family, a Bcl-2-related family possessing an atypical Bcl-2 homology 3 (BH3) domain, which regulates PCD from mitochondrial sites by selective Bcl-2/Bcl-XL interactions. BNIP3 family members contain a C-terminal transmembrane domain that is required for their mitochondrial localisation, homodimerisation, as well as regulation of their pro-apoptotic activities. BNIP3-mediated apoptosis has been reported to be independent of caspase activation and cytochrome c release and is characterized by early plasma membrane and mitochondrial damage, prior to the appearance of chromatin condensation or DNA fragmentation." Q#14520 - CGI_10002053 superfamily 243116 444 819 0 527.76 cl02626 DNA_pol_A superfamily - - "Family A polymerase primarily fills DNA gaps that arise during DNA repair, recombination and replication; DNA polymerase family A, 5'-3' polymerase domain. Family A polymerase functions primarily to fill DNA gaps that arise during DNA repair, recombination and replication. DNA-dependent DNA polymerases can be classified into six main groups based upon phylogenetic relationships with E. coli polymerase I (classA), E. coli polymerase II (class B), E.coli polymerase III (class C), euryarchaeota polymerase II (class D), human polymerase beta (class X), E. coli UmuC/DinB and eukaryotic RAP 30/Xeroderma pigmentosum variant (class Y). Family A polymerases are found primarily in organisms related to prokaryotes and include prokaryotic DNA polymerase I, mitochondrial polymerase gamma, and several bacteriophage polymerases including those from odd-numbered phage (T3, T5, and T7). 
Prokaryotic polymerase I (pol I) has two functional domains located on the same polypeptide; a 5'-3' polymerase and a 5'-3' exonuclease. Pol I uses its 5' nuclease activity to remove the ribonucleotide portion of newly synthesized Okazaki fragments and the DNA polymerase activity to fill in the resulting gap. The structure of these polymerases resembles in overall morphology a cupped human right hand, with fingers (which bind an incoming nucleotide and interact with the single-stranded template), palm (which harbors the catalytic amino acid residues and also binds an incoming dNTP) and thumb (which binds double-stranded DNA) subdomains." Q#14520 - CGI_10002053 superfamily 245226 252 433 1.04E-46 165.386 cl10012 DnaQ_like_exo superfamily - - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. 
Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#14520 - CGI_10002053 superfamily 246724 95 167 1.21E-29 113.263 cl14815 H3TH_StructSpec-5'-nucleases superfamily - - "H3TH domains of structure-specific 5' nucleases (or flap endonuclease-1-like) involved in DNA replication, repair, and recombination; The 5' nucleases of this superfamily are capable of both 5'-3' exonucleolytic activity and cleaving bifurcated or branched DNA, in an endonucleolytic, structure-specific manner, and are involved in DNA replication, repair, and recombination. The superfamily includes the H3TH (helix-3-turn-helix) domains of Flap Endonuclease-1 (FEN1), Exonuclease-1 (EXO1), Mkt1, Gap Endonuclease 1 (GEN1) and Xeroderma pigmentosum complementation group G (XPG) nuclease. Also included are the H3TH domains of the 5'-3' exonucleases of DNA polymerase I and single domain protein homologs, as well as, the bacteriophage T4 RNase H, T5-5'nuclease, and other homologs. These nucleases contain a PIN (PilT N terminus) domain with a helical arch/clamp region/I domain (not included here) and inserted within the C-terminal region of the PIN domain is an atypical helix-hairpin-helix-2 (HhH2)-like region. This atypical HhH2 region, the H3TH domain, has an extended loop with at least three turns between the first two helices, and only three of the four helices appear to be conserved. Both the H3TH domain and the helical arch/clamp region are involved in DNA binding. Studies suggest that a glycine-rich loop in the H3TH domain contacts the phosphate backbone of the template strand in the downstream DNA duplex. Typically, the nucleases within this superfamily have a carboxylate rich active site that is involved in binding essential divalent metal ion cofactors (i. e., Mg2+, Mn2+, Zn2+, or Co2+) required for nuclease activity. 
The first metal binding site is composed entirely of Asp/Glu residues from the PIN domain, whereas, the second metal binding site is composed generally of two Asp residues from the PIN domain and one or two Asp residues from the H3TH domain. Together with the helical arch and network of amino acids interacting with metal binding ions, the H3TH region defines a positively charged active-site DNA-binding groove in structure-specific 5' nucleases." Q#14520 - CGI_10002053 superfamily 246722 1 89 1.84E-19 87.1011 cl14812 PIN_SF superfamily N - "PIN (PilT N terminus) domain: Superfamily; PIN_SF The PIN (PilT N terminus) domain belongs to a large nuclease superfamily with representatives from eukaryota, eubacteria, and archaea. PIN domains were originally named for their sequence similarity to the N-terminal domain of an annotated pili biogenesis protein, PilT, a domain fusion between a PIN-domain and a PilT ATPase domain. The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues) is geometrically similar in the active center of structure-specific 5' nucleases (also known as Flap endonuclease-1-like), PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Seen here, are two major divisions in the PIN domain superfamily. The first major division, the structure-specific 5' nuclease family, is represented by FEN1, the 5'-3' exonuclease of DNA polymerase I, and T4 RNase H nuclease PIN domains. These 5' nucleases are involved in DNA replication, repair, and recombination. They are capable of both 5'-3' exonucleolytic activity and cleaving bifurcated DNA, in an endonucleolytic, structure-specific manner. 
Unique to FEN1-like nucleases, the PIN domain has a helical arch/clamp region (I domain) of variable length (approximately 16 to 800 residues) and, inserted within the C-terminal region of the PIN domain, a H3TH (helix-3-turn-helix) domain, an atypical helix-hairpin-helix-2-like region. Both the H3TH domain (not included here) and the helical arch/clamp region are involved in DNA binding. With the exception of Mkt1, these nucleases have a carboxylate rich active site that is involved in binding essential divalent metal ion cofactors (Mg2+, Mn2+, Zn2+, or Co2+). The second major division of the PIN domain superfamily, the VapC-Smg6 family, includes such eukaryotic ribonucleases as, Smg6, an essential factor in nonsense-mediated mRNA decay; Rrp44, the catalytic subunit of the exosome; and Nob1, a ribosome assembly factor critical in pre-rRNA processing. A large percentage of members in this family are bacterial ribonuclease toxins of TA operons such as Mycobacterium tuberculosis VapC and Neisseria gonorrhoeae FitB, as well as, archaeal homologs, Pyrobaculum aerophilum Pea0151 and P. aerophilum Pae2754. Also included are the eukaryotic Fcf1/ Utp24 (FAF1-copurifying factor 1/U three-associated protein 24) and Utp23-like proteins. Components of the small subunit processome, Fcf1/Utp24 and Utp23 are essential proteins involved in pre-rRNA processing and 40S ribosomal subunit assembly." Q#14521 - CGI_10002056 superfamily 247744 2 165 7.67E-59 184.362 cl17190 NK superfamily - - "Nucleoside/nucleotide kinase (NK) is a protein superfamily consisting of multiple families of enzymes that share structural similarity and are functionally related to the catalysis of the reversible phosphate group transfer from nucleoside triphosphates to nucleosides/nucleotides, nucleoside monophosphates, or sugars. 
Members of this family play a wide variety of essential roles in nucleotide metabolism, the biosynthesis of coenzymes and aromatic compounds, as well as the metabolism of sugar and sulfate." Q#14522 - CGI_10002058 superfamily 245883 119 404 6.65E-98 295.921 cl12120 SDH_alpha superfamily - - Serine dehydratase alpha chain; L-serine dehydratase (EC:4.2.1.13) is found as a heterodimer of alpha and beta chains or as a fusion of the two chains in a single protein. This enzyme catalyzes the deamination of serine to form pyruvate. This enzyme is part of the gluconeogenesis pathway. Q#14522 - CGI_10002058 superfamily 217489 1 96 2.73E-26 102.985 cl04004 SDH_beta superfamily N - Serine dehydratase beta chain; L-serine dehydratase (EC:4.2.1.13) is found as a heterodimer of alpha and beta chains or as a fusion of the two chains in a single protein. This enzyme catalyzes the deamination of serine to form pyruvate. This enzyme is part of the gluconeogenesis pathway. Q#14523 - CGI_10002059 superfamily 245604 3 97 1.12E-41 134.969 cl11404 Biotinyl_lipoyl_domains superfamily - - "Biotinyl_lipoyl_domains are present in biotin-dependent carboxylases/decarboxylases, the dihydrolipoyl acyltransferase component (E2) of 2-oxo acid dehydrogenases, and the H-protein of the glycine cleavage system (GCS). These domains transport CO2, acyl, or methylamine, respectively, between components of the complex/protein via a biotinyl or lipoyl group, which is covalently attached to a highly conserved lysine residue." Q#14524 - CGI_10002063 superfamily 241758 2 343 9.96E-133 386.107 cl00292 AANH_like superfamily - - "Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases, ATP sulphurylases, Universal Stress Response protein and electron transfer flavoprotein (ETF). The domain forms an alpha/beta/alpha fold which binds to Adenosine nucleotide." 
Q#14528 - CGI_10013713 superfamily 245040 60 89 1.18E-05 39.7648 cl09238 CY superfamily NC - "Cystatin-like domain; Cystatins are a family of cysteine protease inhibitors that occur mainly as single domain proteins. However some extracellular proteins such as kininogen, His-rich glycoprotein and fetuin also contain these domains." Q#14529 - CGI_10013714 superfamily 245040 59 83 0.000183147 36.6832 cl09238 CY superfamily NC - "Cystatin-like domain; Cystatins are a family of cysteine protease inhibitors that occur mainly as single domain proteins. However some extracellular proteins such as kininogen, His-rich glycoprotein and fetuin also contain these domains." Q#14530 - CGI_10013715 superfamily 245040 26 104 9.45E-06 39.5923 cl09238 CY superfamily - - "Cystatin-like domain; Cystatins are a family of cysteine protease inhibitors that occur mainly as single domain proteins. However some extracellular proteins such as kininogen, His-rich glycoprotein and fetuin also contain these domains." Q#14531 - CGI_10013716 superfamily 245040 59 83 0.000355673 36.298 cl09238 CY superfamily NC - "Cystatin-like domain; Cystatins are a family of cysteine protease inhibitors that occur mainly as single domain proteins. However some extracellular proteins such as kininogen, His-rich glycoprotein and fetuin also contain these domains." Q#14532 - CGI_10013717 superfamily 245040 26 104 5.37E-06 40.3627 cl09238 CY superfamily - - "Cystatin-like domain; Cystatins are a family of cysteine protease inhibitors that occur mainly as single domain proteins. However some extracellular proteins such as kininogen, His-rich glycoprotein and fetuin also contain these domains." Q#14533 - CGI_10013718 superfamily 241571 17 77 7.10E-15 64.741 cl00049 CUB superfamily C - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." 
Q#14535 - CGI_10013720 superfamily 216981 421 542 1.03E-15 75.6469 cl17087 OTU superfamily - - "OTU-like cysteine protease; This family is comprised of a group of predicted cysteine proteases, homologous to the Ovarian Tumour (OTU) gene in Drosophila. Members include proteins from eukaryotes, viruses and pathogenic bacterium. The conserved cysteine and histidine, and possibly the aspartate, represent the catalytic residues in this putative group of proteases." Q#14535 - CGI_10013720 superfamily 217473 790 849 0.000107466 44.2782 cl03978 Mab-21 superfamily NC - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#14537 - CGI_10013722 superfamily 244509 31 105 6.33E-16 73.7515 cl06793 PRKCSH superfamily - - "Glucosidase II beta subunit-like protein; The sequences found in this family are similar to a region found in the beta-subunit of glucosidase II, which is also known as protein kinase C substrate 80K-H (PRKCSH). The enzyme catalyzes the sequential removal of two alpha-1,3-linked glucose residues in the second step of N-linked oligosaccharide processing. The beta subunit is required for the solubility and stability of the heterodimeric enzyme, and is involved in retaining the enzyme within the endoplasmic reticulum. Mutations in the gene coding for PRKCSH have been found to be involved in the development of autosomal dominant polycystic liver disease (ADPLD), but the precise role the protein has in the pathogenesis of this disease is unknown. This family also includes an ER sensor for misfolded glycoproteins and is therefore likely to be a generic sugar binding domain." Q#14538 - CGI_10013723 superfamily 241680 20 196 2.43E-42 145.132 cl00200 MIP superfamily - - "Major intrinsic protein (MIP) superfamily. Members of the MIP superfamily function as membrane channels that selectively transport water, small neutral molecules, and ions out of and between cells. 
The channel proteins share a common fold: the N-terminal cytosolic portion followed by six transmembrane helices, which might have arisen through gene duplication. On the basis of sequence similarity and functional characteristics, the superfamily can be subdivided into two major groups: water-selective channels called aquaporins (AQPs) and glycerol uptake facilitators (GlpFs). AQPs are found in all three kingdoms of life, while GlpFs have been characterized only within microorganisms." Q#14539 - CGI_10013724 superfamily 241564 65 90 2.44E-10 53.0567 cl00035 BIR superfamily N - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." Q#14541 - CGI_10013726 superfamily 242326 13 94 2.04E-15 68.4823 cl01136 DUF393 superfamily C - "Protein of unknown function, DUF393; Members of this family have two highly conserved cysteine residues near their N-terminus. The function of these proteins is unknown." Q#14544 - CGI_10013729 superfamily 222150 125 149 7.45E-05 38.9121 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#14545 - CGI_10013730 superfamily 247903 279 315 0.000411659 39.2021 cl17349 Peptidase_M54 superfamily N - "Peptidase family M54, also called archaemetzincins or archaelysins; Peptidase M54 (archaemetzincin or archaelysin) is a zinc-dependent aminopeptidase that contains the consensus zinc-binding sequence HEXXHXXGXXH/D and a conserved Met residue at the active site, and is thus classified as a metzincin. 
Archaemetzincins, first identified in archaea, are also found in bacteria and eukaryotes, including two human members, archaemetzincin-1 and -2 (AMZ1 and AMZ2). AMZ1 is mainly found in the liver and heart while AMZ2 is primarily expressed in testis and heart; both have been reported to degrade synthetic substrates and peptides. The Peptidase M54 family contains an extended metzincin consensus sequence of HEXXHXXGX3CX4CXMX17CXXC such that a second zinc ion is bound to four cysteines, thus resembling a zinc finger. Phylogenetic analysis of this family reveals a complex evolutionary process involving a series of lateral gene transfer, gene loss and genetic duplication events." Q#14547 - CGI_10013732 superfamily 241593 135 212 0.000954491 39.9374 cl00075 HATPase_c superfamily C - "Histidine kinase-like ATPases; This family includes several ATP-binding proteins for example: histidine kinase, DNA gyrase B, topoisomerases, heat shock protein HSP90, phytochrome-like ATPases and DNA mismatch repair proteins" Q#14547 - CGI_10013732 superfamily 244201 1712 1837 7.76E-06 46.4551 cl05797 SMC_hinge superfamily - - SMC proteins Flexible Hinge Domain; This family represents the hinge region of the SMC (Structural Maintenance of Chromosomes) family of proteins. The hinge region is responsible for formation of the DNA interacting dimer. It is also possible that the precise structure of it is an essential determinant of the specificity of the DNA-protein interaction. Q#14551 - CGI_10001474 superfamily 110440 496 522 0.00398192 36.2317 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase); proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. 
Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#14551 - CGI_10001474 superfamily 241563 75 110 0.00968628 35.3907 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#14553 - CGI_10001937 superfamily 243096 312 431 8.75E-17 79.264 cl02571 RhoGEF superfamily - - Guanine nucleotide exchange factor for Rho/Rac/Cdc42-like GTPases; Also called Dbl-homologous (DH) domain. It appears that PH domains invariably occur C-terminal to RhoGEF/DH domains. Q#14556 - CGI_10010544 superfamily 243045 230 311 3.33E-07 49.1687 cl02459 PAS superfamily - - "PAS domain; PAS motifs appear in archaea, eubacteria and eukarya. Probably the most surprising identification of a PAS domain was that in EAG-like K+-channels. PAS domains have been found to bind ligands, and to act as sensors for light and oxygen in signal transduction." Q#14556 - CGI_10010544 superfamily 241596 23 72 0.000838754 38.3491 cl00081 HLH superfamily - - "Helix-loop-helix domain, found in specific DNA-binding proteins that act as transcription factors; 60-100 amino acids long. 
A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE (5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#14556 - CGI_10010544 superfamily 243045 91 148 0.0078949 35.6868 cl02459 PAS superfamily C - "PAS domain; PAS motifs appear in archaea, eubacteria and eukarya. Probably the most surprising identification of a PAS domain was that in EAG-like K+-channels. PAS domains have been found to bind ligands, and to act as sensors for light and oxygen in signal transduction." Q#14558 - CGI_10010546 superfamily 241802 38 333 2.13E-133 386.869 cl00342 Trp-synth-beta_II superfamily - - "Tryptophan synthase beta superfamily (fold type II); this family of pyridoxal phosphate (PLP)-dependent enzymes catalyzes beta-replacement and beta-elimination reactions. 
This CD corresponds to aminocyclopropane-1-carboxylate deaminase (ACCD), tryptophan synthase beta chain (Trp-synth_B), cystathionine beta-synthase (CBS), O-acetylserine sulfhydrylase (CS), serine dehydratase (Ser-dehyd), threonine dehydratase (Thr-dehyd), diaminopropionate ammonia lyase (DAL), and threonine synthase (Thr-synth). ACCD catalyzes the conversion of 1-aminocyclopropane-1-carboxylate to alpha-ketobutyrate and ammonia. Tryptophan synthase folds into a tetramer, where the beta chain is the catalytic PLP-binding subunit and catalyzes the formation of L-tryptophan from indole and L-serine. CBS is a tetrameric hemeprotein that catalyzes condensation of serine and homocysteine to cystathionine. CS is a homodimer that catalyzes the formation of L-cysteine from O-acetyl-L-serine. Ser-dehyd catalyzes the conversion of L- or D-serine to pyruvate and ammonia. Thr-dehyd is active as a homodimer and catalyzes the conversion of L-threonine to 2-oxobutanoate and ammonia. DAL is also a homodimer and catalyzes the alpha, beta-elimination reaction of both L- and D-alpha, beta-diaminopropionate to form pyruvate and ammonia. Thr-synth catalyzes the formation of threonine and inorganic phosphate from O-phosphohomoserine." Q#14559 - CGI_10010547 superfamily 241743 21 139 1.19E-30 108.165 cl00274 ML superfamily - - "The ML (MD-2-related lipid-recognition) domain is present in MD-1, MD-2, GM2 activator protein, Niemann-Pick type C2 (Npc2) protein, phosphatidylinositol/phosphatidylglycerol transfer protein (PG/PI-TP), mite allergen Der p 2 and several proteins of unknown function in plants, animals and fungi. These single-domain proteins form two anti-parallel beta-pleated sheets stabilized by three disulfide bonds and with an accessible central hydrophobic cavity, and are predicted to mediate diverse biological functions through interaction with specific lipids." 
Q#14560 - CGI_10010548 superfamily 241743 24 140 5.68E-33 114.328 cl00274 ML superfamily - - "The ML (MD-2-related lipid-recognition) domain is present in MD-1, MD-2, GM2 activator protein, Niemann-Pick type C2 (Npc2) protein, phosphatidylinositol/phosphatidylglycerol transfer protein (PG/PI-TP), mite allergen Der p 2 and several proteins of unknown function in plants, animals and fungi. These single-domain proteins form two anti-parallel beta-pleated sheets stabilized by three disulfide bonds and with an accessible central hydrophobic cavity, and are predicted to mediate diverse biological functions through interaction with specific lipids." Q#14561 - CGI_10010549 superfamily 241743 25 153 2.51E-38 128.581 cl00274 ML superfamily - - "The ML (MD-2-related lipid-recognition) domain is present in MD-1, MD-2, GM2 activator protein, Niemann-Pick type C2 (Npc2) protein, phosphatidylinositol/phosphatidylglycerol transfer protein (PG/PI-TP), mite allergen Der p 2 and several proteins of unknown function in plants, animals and fungi. These single-domain proteins form two anti-parallel beta-pleated sheets stabilized by three disulfide bonds and with an accessible central hydrophobic cavity, and are predicted to mediate diverse biological functions through interaction with specific lipids." Q#14562 - CGI_10010550 superfamily 241743 356 480 9.46E-41 143.218 cl00274 ML superfamily - - "The ML (MD-2-related lipid-recognition) domain is present in MD-1, MD-2, GM2 activator protein, Niemann-Pick type C2 (Npc2) protein, phosphatidylinositol/phosphatidylglycerol transfer protein (PG/PI-TP), mite allergen Der p 2 and several proteins of unknown function in plants, animals and fungi. These single-domain proteins form two anti-parallel beta-pleated sheets stabilized by three disulfide bonds and with an accessible central hydrophobic cavity, and are predicted to mediate diverse biological functions through interaction with specific lipids." 
Q#14562 - CGI_10010550 superfamily 241642 214 273 6.36E-11 58.2758 cl00152 t_SNARE superfamily - - "Soluble NSF (N-ethylmaleimide-sensitive fusion protein)-Attachment protein (SNAP) REceptor domain; these alpha-helical motifs form twisted and parallel heterotetrameric helix bundles; the core complex contains one helix from a protein that is anchored in the vesicle membrane (synaptobrevin), one helix from a protein of the target membrane (syntaxin), and two helices from another protein anchored in the target membrane (SNAP-25); their interaction forms a core which is composed of a polar zero layer, a flanking leucine-zipper layer acts as a water tight shield to isolate ionic interactions in the zero layer from the surrounding solvent" Q#14562 - CGI_10010550 superfamily 241634 36 138 4.94E-10 56.5943 cl00143 SynN superfamily - - "Syntaxin N-terminus domain; syntaxins are nervous system-specific proteins implicated in the docking of synaptic vesicles with the presynaptic plasma membrane; they are a family of receptors for intracellular transport vesicles; each target membrane may be identified by a specific member of the syntaxin family; syntaxins contain a moderately well conserved amino-terminal domain, called Habc, whose structure is an antiparallel three-helix bundle; a linker of about 30 amino acids connects this to the carboxy-terminal region, designated H3 (t_SNARE), of the syntaxin cytoplasmic domain; the highly conserved H3 region forms a single, long alpha-helix when it is part of the core SNARE complex and anchors the protein on the cytoplasmic surface of cellular membranes; H3 is not included in defining this domain" Q#14563 - CGI_10010551 superfamily 241697 368 624 2.59E-120 377.888 cl00219 Pterin_binding superfamily - - "Pterin binding enzymes. 
This family includes dihydropteroate synthase (DHPS) and cobalamin-dependent methyltransferases such as methyltetrahydrofolate, corrinoid iron-sulfur protein methyltransferase (MeTr) and methionine synthase (MetH). DHPS, a functional homodimer, catalyzes the condensation of p-aminobenzoic acid (pABA) in the de novo biosynthesis of folate, which is an essential cofactor in both nucleic acid and protein biosynthesis. Prokaryotes (and some lower eukaryotes) must synthesize folate de novo, while higher eukaryotes are able to utilize dietary folate and therefore lack DHPS. Sulfonamide drugs, which are substrate analogs of pABA, target DHPS. Cobalamin-dependent methyltransferases catalyze the transfer of a methyl group via a methyl-cob(III)amide intermediate. These include MeTr, a functional heterodimer, and the folate binding domain of MetH." Q#14563 - CGI_10010551 superfamily 241759 668 892 1.24E-115 363.123 cl00293 B12-binding_like superfamily - - "B12 binding domain (B12-BD). Most of the members bind different cobalamid derivates, like B12 (adenosylcobamide) or methylcobalamin or methyl-Co(III) 5-hydroxybenzimidazolylcobamide. This domain is found in several enzymes, such as glutamate mutase, methionine synthase and methylmalonyl-CoA mutase. Cobalamin undergoes a conformational change on binding the protein; the dimethylbenzimidazole group, which is coordinated to the cobalt in the free cofactor, moves away from the corrin and is replaced by a histidine contributed by the protein. The sequence Asp-X-His-X-X-Gly, which contains this histidine ligand, is conserved in many cobalamin-binding proteins. Not all members of this family contain the conserved binding motif." 
Q#14563 - CGI_10010551 superfamily 246616 19 335 5.12E-128 401.3 cl14105 MetH superfamily - - "Methionine synthase I (cobalamin-dependent), methyltransferase domain [Amino acid transport and metabolism]" Q#14563 - CGI_10010551 superfamily 111813 1084 1220 1.23E-43 156.704 cl03805 Met_synt_B12 superfamily - - "Vitamin B12 dependent methionine synthase, activation domain; Vitamin B12 dependent methionine synthase, activation domain. " Q#14563 - CGI_10010551 superfamily 248312 1284 1428 4.82E-09 56.2077 cl17758 PMP22_Claudin superfamily - - PMP-22/EMP/MP20/Claudin family; PMP-22/EMP/MP20/Claudin family. Q#14565 - CGI_10010553 superfamily 243065 1004 1160 1.53E-23 99.8237 cl02516 VWD superfamily - - von Willebrand factor type D domain; Luciferin-2-monooxygenase from Vargula hilgendorfii contains a vwd domain. Its function is unrelated but the similarity is very strong by several methods. Q#14565 - CGI_10010553 superfamily 243065 106 262 4.67E-23 98.2829 cl02516 VWD superfamily - - von Willebrand factor type D domain; Luciferin-2-monooxygenase from Vargula hilgendorfii contains a vwd domain. Its function is unrelated but the similarity is very strong by several methods. Q#14565 - CGI_10010553 superfamily 243065 555 711 4.67E-23 98.2829 cl02516 VWD superfamily - - von Willebrand factor type D domain; Luciferin-2-monooxygenase from Vargula hilgendorfii contains a vwd domain. Its function is unrelated but the similarity is very strong by several methods. Q#14566 - CGI_10010554 superfamily 243065 76 232 1.68E-23 98.6681 cl02516 VWD superfamily - - von Willebrand factor type D domain; Luciferin-2-monooxygenase from Vargula hilgendorfii contains a vwd domain. Its function is unrelated but the similarity is very strong by several methods. Q#14566 - CGI_10010554 superfamily 243065 525 681 3.45E-23 97.8977 cl02516 VWD superfamily - - von Willebrand factor type D domain; Luciferin-2-monooxygenase from Vargula hilgendorfii contains a vwd domain. 
Its function is unrelated but the similarity is very strong by several methods. Q#14567 - CGI_10010555 superfamily 243065 50 183 2.38E-15 69.7781 cl02516 VWD superfamily - - von Willebrand factor type D domain; Luciferin-2-monooxygenase from Vargula hilgendorfii contains a vwd domain. Its function is unrelated but the similarity is very strong by several methods. Q#14568 - CGI_10010556 superfamily 243065 435 588 9.70E-13 66.2713 cl02516 VWD superfamily - - von Willebrand factor type D domain; Luciferin-2-monooxygenase from Vargula hilgendorfii contains a vwd domain. Its function is unrelated but the similarity is very strong by several methods. Q#14568 - CGI_10010556 superfamily 205157 368 396 0.00190783 36.7467 cl18264 EGF_3 superfamily - - EGF domain; This family includes a variety of EGF-like domain homologues. This family includes the C-terminal domain of the malaria parasite MSP1 protein. Q#14570 - CGI_10010558 superfamily 245332 4 341 1.90E-135 400.729 cl10557 PRK14481 superfamily - - dihydroxyacetone kinase subunit DhaK; Provisional Q#14570 - CGI_10010558 superfamily 243501 408 582 3.04E-37 135.755 cl03685 Dak2 superfamily - - DAK2 domain; This domain is the predicted phosphatase domain of the dihydroxyacetone kinase family. Q#14571 - CGI_10010559 superfamily 245332 4 339 2.79E-100 310.207 cl10557 PRK14481 superfamily - - dihydroxyacetone kinase subunit DhaK; Provisional Q#14571 - CGI_10010559 superfamily 243501 408 582 5.68E-38 138.067 cl03685 Dak2 superfamily - - DAK2 domain; This domain is the predicted phosphatase domain of the dihydroxyacetone kinase family. Q#14572 - CGI_10010560 superfamily 146451 72 88 0.00407828 31.9459 cl08404 OAR superfamily - - OAR domain; OAR domain. Q#14573 - CGI_10010561 superfamily 243501 37 51 0.00132226 36.3739 cl03685 Dak2 superfamily NC - DAK2 domain; This domain is the predicted phosphatase domain of the dihydroxyacetone kinase family. 
Q#14574 - CGI_10010562 superfamily 241599 167 225 1.06E-20 84.2172 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#14574 - CGI_10010562 superfamily 146451 305 321 0.00242964 35.4127 cl08404 OAR superfamily - - OAR domain; OAR domain. Q#14575 - CGI_10010563 superfamily 241599 155 213 3.46E-22 88.0692 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#14575 - CGI_10010563 superfamily 146451 291 310 0.00699336 33.8719 cl08404 OAR superfamily - - OAR domain; OAR domain. Q#14576 - CGI_10010564 superfamily 241578 1146 1307 6.53E-18 83.495 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." 
Q#14576 - CGI_10010564 superfamily 245213 329 366 2.99E-08 51.8686 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14576 - CGI_10010564 superfamily 245213 903 939 5.69E-07 48.4018 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14576 - CGI_10010564 superfamily 245213 1061 1097 1.03E-06 47.6314 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#14576 - CGI_10010564 superfamily 245213 941 977 1.18E-06 47.2462 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14576 - CGI_10010564 superfamily 245213 1099 1135 1.64E-06 46.861 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14576 - CGI_10010564 superfamily 245213 563 599 1.94E-06 46.861 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#14576 - CGI_10010564 superfamily 245213 291 326 4.58E-06 45.7054 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14576 - CGI_10010564 superfamily 245213 751 787 6.14E-06 45.3202 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14576 - CGI_10010564 superfamily 245213 865 901 9.81E-06 44.5498 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#14576 - CGI_10010564 superfamily 245213 368 403 1.39E-05 44.1646 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14576 - CGI_10010564 superfamily 245213 525 560 1.44E-05 44.1646 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14576 - CGI_10010564 superfamily 245213 216 252 3.73E-05 43.009 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#14576 - CGI_10010564 superfamily 245213 254 288 6.34E-05 42.2386 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14576 - CGI_10010564 superfamily 245213 827 863 8.54E-05 41.8534 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14576 - CGI_10010564 superfamily 245213 1019 1058 0.000296629 40.3126 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#14576 - CGI_10010564 superfamily 245213 979 1016 0.000803366 39.157 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14576 - CGI_10010564 superfamily 245213 406 445 0.00234332 37.6162 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14576 - CGI_10010564 superfamily 245213 790 825 0.00528988 36.4606 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#14576 - CGI_10010564 superfamily 245213 680 711 0.00926151 35.6902 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14576 - CGI_10010564 superfamily 243124 79 211 8.99E-16 76.6969 cl02648 NIDO superfamily - - Nidogen-like; This is a nidogen-like domain (NIDO) domain and is an extracellular domain found in nidogen and hypothetical proteins of unknown function. Q#14577 - CGI_10010565 superfamily 241578 103 261 8.35E-39 142.816 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." 
Q#14577 - CGI_10010565 superfamily 245213 320 355 6.44E-06 44.5498 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14577 - CGI_10010565 superfamily 245213 434 470 1.12E-05 44.1646 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14577 - CGI_10010565 superfamily 245213 776 812 3.54E-05 42.6238 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#14577 - CGI_10010565 superfamily 245213 358 392 9.15E-05 41.4682 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14577 - CGI_10010565 superfamily 245213 889 920 0.000232447 39.9274 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14577 - CGI_10010565 superfamily 245213 472 510 0.000414775 39.5422 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#14577 - CGI_10010565 superfamily 245213 959 995 0.000451088 39.157 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14577 - CGI_10010565 superfamily 245213 589 624 0.000539922 39.157 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14577 - CGI_10010565 superfamily 245213 519 548 0.00119285 38.0014 cl09941 EGF_CA superfamily N - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#14577 - CGI_10010565 superfamily 245213 395 432 0.00170164 37.6162 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14577 - CGI_10010565 superfamily 245213 851 886 0.00345784 36.4606 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14577 - CGI_10010565 superfamily 245213 550 585 0.00600963 36.0754 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#14577 - CGI_10010565 superfamily 245213 283 317 0.00896834 35.305 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14581 - CGI_10009654 superfamily 216049 1 85 0.0095118 33.0282 cl18356 ABC_membrane superfamily N - ABC transporter transmembrane region; This family represents a unit of six transmembrane helices. Many members of the ABC transporter family (pfam00005) have two such regions. Q#14582 - CGI_10009655 superfamily 247755 9 41 3.34E-14 62.1015 cl17201 ABC_ATPase superfamily C - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#14583 - CGI_10009656 superfamily 241578 1 130 3.82E-27 100.829 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. 
The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#14584 - CGI_10009657 superfamily 241578 638 787 8.88E-36 134.727 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." 
Q#14584 - CGI_10009657 superfamily 245213 816 850 1.15E-06 47.2462 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14584 - CGI_10009657 superfamily 245814 874 946 3.05E-05 43.6319 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#14584 - CGI_10009657 superfamily 245814 958 1048 6.78E-07 48.5453 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. 
A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#14587 - CGI_10021808 superfamily 217521 526 959 0 563.835 cl04032 CAS_CSE1 superfamily - - "CAS/CSE protein, C-terminus; Mammalian cellular apoptosis susceptibility (CAS) proteins are homologous to the yeast chromosome-segregation protein, CSE1. This family aligns the C-terminal halves (approximately). CAS is involved in both cellular apoptosis and proliferation. Apoptosis is inhibited in CAS-depleted cells, while the expression of CAS correlates to the degree of cellular proliferation. Like CSE1, it is essential for the mitotic checkpoint in the cell cycle (CAS depletion blocks the cell in the G2 phase), and has been shown to be associated with the microtubule network and the mitotic spindle, as is the protein MEK, which is thought to regulate the intracellular localisation (predominantly nuclear vs. predominantly cytosolic) of CAS. In the nucleus, CAS acts as a nuclear transport factor in the importin pathway. The importin pathway mediates the nuclear transport of several proteins that are necessary for mitosis and further progression. CAS is therefore thought to affect the cell cycle through its effect on the nuclear transport of these proteins. Since apoptosis also requires the nuclear import of several proteins (such as P53 and transcription factors), it has been suggested that CAS also enables apoptosis by facilitating the nuclear import of at least a subset of these essential proteins." Q#14587 - CGI_10021808 superfamily 243689 29 101 2.03E-09 55.3273 cl04271 IBN_N superfamily - - Importin-beta N-terminal domain; Importin-beta N-terminal domain. Q#14590 - CGI_10021811 superfamily 241597 34 105 2.32E-32 116.629 cl00082 HMG-box superfamily - - "High Mobility Group (HMG)-box is found in a variety of eukaryotic chromosomal proteins and transcription factors. 
HMGs bind to the minor groove of DNA and have been classified by DNA binding preferences. Two phylogenically distinct groups of Class I proteins bind DNA in a sequence specific fashion and contain a single HMG box. One group (SOX-TCF) includes transcription factors, TCF-1, -3, -4; and also SRY and LEF-1, which bind four-way DNA junctions and duplex DNA targets. The second group (MATA) includes fungal mating type gene products MC, MATA1 and Ste11. Class II and III proteins (HMGB-UBF) bind DNA in a non-sequence specific fashion and contain two or more tandem HMG boxes. Class II members include non-histone chromosomal proteins, HMG1 and HMG2, which bind to bent or distorted DNA such as four-way DNA junctions, synthetic DNA cruciforms, kinked cisplatin-modified DNA, DNA bulges, cross-overs in supercoiled DNA, and can cause looping of linear DNA. Class III members include nucleolar and mitochondrial transcription factors, UBF and mtTF1, which bind four-way DNA junctions." Q#14592 - CGI_10021813 superfamily 241594 159 496 6.54E-22 95.7096 cl00077 HECTc superfamily - - "HECT domain; C-terminal catalytic domain of a subclass of Ubiquitin-protein ligase (E3). It binds specific ubiquitin-conjugating enzymes (E2), accepts ubiquitin from E2, transfers ubiquitin to substrate lysine side chains, and transfers additional ubiquitin molecules to the end of growing ubiquitin chains." Q#14594 - CGI_10021815 superfamily 245201 18 176 4.31E-30 111.174 cl09925 PKc_like superfamily C - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." 
Q#14595 - CGI_10021816 superfamily 245201 7 125 2.86E-13 69.4709 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#14596 - CGI_10021817 superfamily 220131 741 1044 8.98E-66 227.544 cl11721 DUF1943 superfamily - - "Domain of unknown function (DUF1943); Members of this family adopt a structure consisting of several large open beta-sheets. Their exact function has not, as yet, been determined." Q#14596 - CGI_10021817 superfamily 243065 2121 2282 5.13E-25 105.176 cl02516 VWD superfamily - - von Willebrand factor type D domain; Luciferin-2-monooxygenase from Vargula hilgendorfii contains a vwd domain. Its function is unrelated but the similarity is very strong by several methods. Q#14597 - CGI_10021818 superfamily 241637 38 99 8.31E-12 61.9406 cl00146 TFIIS_I superfamily - - N-terminal domain (domain I) of transcription elongation factor S-II (TFIIS); similar to a domain found in elongin A and CRSP70; likely to be involved in transcription; domain I from TFIIS interacts with RNA polymerase II holoenzyme Q#14598 - CGI_10021819 superfamily 217943 10 400 4.55E-32 125.508 cl15508 LTV superfamily - - Low temperature viability protein; The low-temperature viability protein LTV1 is involved in ribosome biogenesis 40S subunit production. Q#14600 - CGI_10021822 superfamily 189650 114 190 2.18E-16 72.6998 cl02913 zf-PARP superfamily - - Poly(ADP-ribose) polymerase and DNA-Ligase Zn-finger region; Poly(ADP-ribose) polymerase is an important regulatory component of the cellular response to DNA damage. 
The amino-terminal region of Poly(ADP-ribose) polymerase consists of two PARP-type zinc fingers. This region acts as a DNA nick sensor. Q#14600 - CGI_10021822 superfamily 189650 215 285 5.86E-09 51.899 cl02913 zf-PARP superfamily - - Poly(ADP-ribose) polymerase and DNA-Ligase Zn-finger region; Poly(ADP-ribose) polymerase is an important regulatory component of the cellular response to DNA damage. The amino-terminal region of Poly(ADP-ribose) polymerase consists of two PARP-type zinc fingers. This region acts as a DNA nick sensor. Q#14600 - CGI_10021822 superfamily 189650 9 52 0.00282051 35.3354 cl02913 zf-PARP superfamily C - Poly(ADP-ribose) polymerase and DNA-Ligase Zn-finger region; Poly(ADP-ribose) polymerase is an important regulatory component of the cellular response to DNA damage. The amino-terminal region of Poly(ADP-ribose) polymerase consists of two PARP-type zinc fingers. This region acts as a DNA nick sensor. Q#14601 - CGI_10021823 superfamily 192472 161 224 1.34E-23 90.7808 cl10874 DUF2373 superfamily - - Uncharacterized conserved protein (DUF2373); This is the C-terminal conserved region of a family of proteins found from fungi to humans. The function is not known. Q#14603 - CGI_10021825 superfamily 206090 144 246 4.33E-36 134.8 cl16477 Asx-hm superfamily - - "Transcriptional enhancer, Asx-hm domain; The Asx-hm domain of the additional sex combs-like 1 proteins contains two putative nuclear receptor co-regulator binding (NR box) motifs. The Asx proteins acts as an enhancer of trithorax and polycomb in displaying bidirectional homoeotic phenotypes in Drosophila, suggesting that it is required for maintenance of both activation and silencing of Hox genes. Asx is required for normal adult haematopoiesis and its function depends on its cellular context." 
Q#14606 - CGI_10021828 superfamily 241567 51 298 4.92E-94 281.411 cl00042 CASc superfamily - - "Caspase, interleukin-1 beta converting enzyme (ICE) homologues; Cysteine-dependent aspartate-directed proteases that mediate programmed cell death (apoptosis). Caspases are synthesized as inactive zymogens and activated by proteolysis of the peptide backbone adjacent to an aspartate. The resulting two subunits associate to form an (alpha)2(beta)2-tetramer which is the active enzyme. Activation of caspases can be mediated by other caspase homologs." Q#14607 - CGI_10021829 superfamily 241640 1242 1480 1.03E-75 252.197 cl00149 Tryp_SPc superfamily - - Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues. Q#14607 - CGI_10021829 superfamily 241609 391 469 2.31E-21 91.2855 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#14607 - CGI_10021829 superfamily 243035 76 237 3.65E-09 56.4741 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. 
Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. 
A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#14607 - CGI_10021829 superfamily 241613 614 645 3.30E-08 51.8238 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#14607 - CGI_10021829 superfamily 241619 553 606 2.13E-07 50.5461 cl00112 PAN_APPLE superfamily N - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." 
Q#14607 - CGI_10021829 superfamily 241613 488 525 2.66E-05 43.3494 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#14607 - CGI_10021829 superfamily 243061 655 756 7.24E-38 139.399 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#14607 - CGI_10021829 superfamily 243061 763 861 7.10E-37 136.318 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#14607 - CGI_10021829 superfamily 243061 973 1071 5.54E-33 125.147 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#14607 - CGI_10021829 superfamily 243061 865 961 6.85E-29 113.591 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. 
These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#14607 - CGI_10021829 superfamily 243061 1093 1194 1.57E-28 112.435 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#14607 - CGI_10021829 superfamily 241609 334 370 6.12E-09 55.0767 cl00100 KR superfamily N - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#14607 - CGI_10021829 superfamily 241609 279 315 2.47E-08 53.1507 cl00100 KR superfamily N - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#14607 - CGI_10021829 superfamily 243061 3 59 0.00298449 37.7066 cl02509 SRCR superfamily N - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#14608 - CGI_10021830 superfamily 217359 241 607 3.64E-67 224.912 cl03878 Exo70 superfamily - - "Exo70 exocyst complex subunit; The Exo70 protein forms one subunit of the exocyst complex. First discovered in S. 
cerevisiae, Exo70 and other exocyst proteins have been observed in several other eukaryotes, including humans. In S. cerevisiae, the exocyst complex is involved in the late stages of exocytosis, and is localised at the tip of the bud, the major site of exocytosis in yeast. Exo70 interacts with the Rho3 GTPase. This interaction mediates one of the three known functions of Rho3 in cell polarity: vesicle docking and fusion with the plasma membrane (the other two functions are regulation of actin polarity and transport of exocytic vesicles from the mother cell to the bud). In humans, the functions of Exo70 and the exocyst complex are less well characterized: Exo70 is expressed in several tissues and is thought to also be involved in exocytosis." Q#14610 - CGI_10021832 superfamily 241580 1 67 3.80E-33 117.655 cl00061 FH superfamily - - "Forkhead (FH), also known as a "winged helix". FH is named for the Drosophila fork head protein, a transcription factor which promotes terminal rather than segmental development. This family of transcription factor domains, which bind to B-DNA as monomers, are also found in the Hepatocyte nuclear factor (HNF) proteins, which provide tissue-specific gene regulation. The structure contains 2 flexible loops or "wings" in the C-terminal region, hence the term winged helix." Q#14611 - CGI_10021833 superfamily 241782 244 411 7.09E-16 78.3991 cl00321 AAT_I superfamily NC - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. 
Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phophorylase family (fold type V)." Q#14613 - CGI_10021835 superfamily 207614 26 59 1.06E-11 61.5575 cl02494 SapA superfamily - - Saposin A-type domain; Saposin A-type domain. Q#14613 - CGI_10021835 superfamily 207614 1083 1113 9.46E-10 56.1647 cl02494 SapA superfamily - - Saposin A-type domain; Saposin A-type domain. Q#14613 - CGI_10021835 superfamily 112314 674 708 1.88E-06 46.4057 cl08395 SapB_2 superfamily - - "Saposin-like type B, region 2; Saposin-like type B, region 2. " Q#14613 - CGI_10021835 superfamily 112314 762 796 1.88E-06 46.4057 cl08395 SapB_2 superfamily - - "Saposin-like type B, region 2; Saposin-like type B, region 2. " Q#14613 - CGI_10021835 superfamily 112314 850 884 1.88E-06 46.4057 cl08395 SapB_2 superfamily - - "Saposin-like type B, region 2; Saposin-like type B, region 2. " Q#14613 - CGI_10021835 superfamily 112314 367 401 3.27E-06 45.6353 cl08395 SapB_2 superfamily - - "Saposin-like type B, region 2; Saposin-like type B, region 2. " Q#14613 - CGI_10021835 superfamily 191220 431 469 1.70E-05 43.7394 cl04972 SapB_1 superfamily - - "Saposin-like type B, region 1; Saposin-like type B, region 1. 
" Q#14613 - CGI_10021835 superfamily 191220 632 670 1.82E-05 43.7394 cl04972 SapB_1 superfamily - - "Saposin-like type B, region 1; Saposin-like type B, region 1. " Q#14613 - CGI_10021835 superfamily 112314 1012 1046 3.22E-05 42.9389 cl08395 SapB_2 superfamily - - "Saposin-like type B, region 2; Saposin-like type B, region 2. " Q#14613 - CGI_10021835 superfamily 191220 970 1008 3.62E-05 42.5838 cl04972 SapB_1 superfamily - - "Saposin-like type B, region 1; Saposin-like type B, region 1. " Q#14613 - CGI_10021835 superfamily 191220 912 950 3.71E-05 42.5838 cl04972 SapB_1 superfamily - - "Saposin-like type B, region 1; Saposin-like type B, region 1. " Q#14613 - CGI_10021835 superfamily 191220 810 846 4.04E-05 42.5838 cl04972 SapB_1 superfamily - - "Saposin-like type B, region 1; Saposin-like type B, region 1. " Q#14613 - CGI_10021835 superfamily 191220 722 758 4.04E-05 42.5838 cl04972 SapB_1 superfamily - - "Saposin-like type B, region 1; Saposin-like type B, region 1. " Q#14613 - CGI_10021835 superfamily 112314 566 599 7.98E-05 41.7833 cl08395 SapB_2 superfamily - - "Saposin-like type B, region 2; Saposin-like type B, region 2. " Q#14613 - CGI_10021835 superfamily 191220 525 561 0.000212361 40.6578 cl04972 SapB_1 superfamily - - "Saposin-like type B, region 1; Saposin-like type B, region 1. " Q#14613 - CGI_10021835 superfamily 112314 473 507 0.000749293 38.7017 cl08395 SapB_2 superfamily - - "Saposin-like type B, region 2; Saposin-like type B, region 2. " Q#14615 - CGI_10021837 superfamily 241584 2 71 1.10E-11 60.9731 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. 
Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#14615 - CGI_10021837 superfamily 241584 86 179 1.24E-06 45.9503 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#14616 - CGI_10021838 superfamily 247042 96 238 4.13E-10 58.7897 cl15693 Sema superfamily N - "The Sema domain, a protein interacting module, of semaphorins and plexins; Both semaphorins and plexins have a Sema domain on their N-termini. Plexins function as receptors for the semaphorins. Evolutionarily, plexins may be the ancestor of semaphorins. Semaphorins are regulatory molecules in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems, and cancer. Semaphorins can be divided into 7 classes. Vertebrates have members in classes 3-7, whereas classes 1 and 2 are known only in invertebrates. Class 2 and 3 semaphorins are secreted; classes 1 and 4 through 6 are transmembrane proteins; and class 7 is membrane associated via glycosylphosphatidylinositol (GPI) linkage. Plexins are a large family of transmembrane proteins, which are divided into four types (A-D) according to sequence similarity. 
In vertebrates, type A plexins serve as co-receptors for neuropilins to mediate the signalling of class 3 semaphorins. Plexins serve as direct receptors for several other members of the semaphorin family: class 6 semaphorins signal through type A plexins and class 4 semaphorins through type B plexins. This family also includes the MET and RON receptor tyrosine kinases. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves to recognize and bind receptors." Q#14616 - CGI_10021838 superfamily 247999 32 86 0.00724187 33.618 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#14618 - CGI_10021840 superfamily 198867 137 235 5.67E-45 155.009 cl06652 BACK superfamily - - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation)." Q#14618 - CGI_10021840 superfamily 243066 26 129 5.37E-35 127.348 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. 
POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#14618 - CGI_10021840 superfamily 243146 370 414 4.39E-13 64.605 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#14618 - CGI_10021840 superfamily 243146 286 332 4.91E-13 64.5019 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#14618 - CGI_10021840 superfamily 243146 321 360 7.63E-13 63.8346 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#14618 - CGI_10021840 superfamily 243146 476 521 1.50E-11 60.2647 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#14618 - CGI_10021840 superfamily 243146 510 555 1.40E-10 57.2862 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#14618 - CGI_10021840 superfamily 243146 436 474 4.85E-10 56.0275 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#14619 - CGI_10021841 superfamily 243072 57 149 4.26E-12 60.0898 cl02529 ANK superfamily N - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#14621 - CGI_10021843 superfamily 243064 24 190 1.48E-15 70.5222 cl02512 NTR_like superfamily - - "NTR_like domain; a beta barrel with an oligosaccharide/oligonucleotide-binding fold found in netrins, complement proteins, tissue inhibitors of metalloproteases (TIMP), and procollagen C-proteinase enhancers (PCOLCE), amongst others. In netrins, the domain plays a role in controlling axon branching in neural development, while the common function of these modules in TIMPs appears to be binding to metzincins. A subset of this family is also known as the C345C domain because it occurs as a C-terminal domain in complement C3, C4 and C5. In C5, the domain interacts with various partners during the formation of the membrane attack complex." Q#14622 - CGI_10021844 superfamily 222150 796 818 4.32E-05 42.3789 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#14622 - CGI_10021844 superfamily 222150 768 793 5.89E-05 41.9937 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#14622 - CGI_10021844 superfamily 246975 756 776 0.00527977 36.1709 cl15478 zf-C2H2 superfamily - - "Zinc finger, C2H2 type; The C2H2 zinc finger is the classical zinc finger domain. The two conserved cysteines and histidines co-ordinate a zinc ion. The following pattern describes the zinc finger. #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C] Where X can be any amino acid, and numbers in brackets indicate the number of residues. The positions marked # are those that are important for the stable fold of the zinc finger. The final position can be either his or cys. The C2H2 zinc finger is composed of two short beta strands followed by an alpha helix. The amino terminal part of the helix binds the major groove in DNA binding zinc fingers. 
The accepted consensus binding sequence for Sp1 is usually defined by the asymmetric hexanucleotide core GGGCGG but this sequence does not include, among others, the GAG (=CTC) repeat that constitutes a high-affinity site for Sp1 binding to the wt1 promoter." Q#14623 - CGI_10003298 superfamily 241647 89 116 5.05E-10 56.3822 cl00157 WW superfamily - - Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs. Q#14623 - CGI_10003298 superfamily 241647 127 157 2.12E-07 48.6782 cl00157 WW superfamily - - Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs. Q#14623 - CGI_10003298 superfamily 207669 262 311 1.49E-12 64.0026 cl02610 FF superfamily - - "FF domain; This domain has been predicted to be involved in protein-protein interaction. 
This domain was recently shown to bind the hyperphosphorylated C-terminal repeat domain of RNA polymerase II, confirming its role in protein-protein interactions." Q#14623 - CGI_10003298 superfamily 207669 615 663 7.35E-06 44.3574 cl02610 FF superfamily - - "FF domain; This domain has been predicted to be involved in protein-protein interaction. This domain was recently shown to bind the hyperphosphorylated C-terminal repeat domain of RNA polymerase II, confirming its role in protein-protein interactions." Q#14623 - CGI_10003298 superfamily 207669 395 453 0.000688965 38.7089 cl02610 FF superfamily - - "FF domain; This domain has been predicted to be involved in protein-protein interaction. This domain was recently shown to bind the hyperphosphorylated C-terminal repeat domain of RNA polymerase II, confirming its role in protein-protein interactions." Q#14624 - CGI_10003299 superfamily 216423 28 406 1.07E-138 412.785 cl18367 Glyco_hydro_35 superfamily - - Glycosyl hydrolases family 35; Glycosyl hydrolases family 35. Q#14626 - CGI_10003301 superfamily 222258 499 545 4.04E-05 43.7108 cl18656 AAA_30 superfamily C - AAA domain; This family of domains contain a P-loop motif that is characteristic of the AAA superfamily. Many of the proteins in this family are conjugative transfer proteins. There is a Walker A and Walker B. Q#14627 - CGI_10004441 superfamily 110440 134 160 0.00136103 34.3057 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats."
Q#14629 - CGI_10004443 superfamily 219542 56 163 5.52E-35 129.669 cl18517 Cu-oxidase_3 superfamily - - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. Q#14629 - CGI_10004443 superfamily 219541 447 618 1.60E-21 91.7611 cl18516 Cu-oxidase_2 superfamily - - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. Q#14629 - CGI_10004443 superfamily 215896 223 352 2.31E-14 71.172 cl18351 Cu-oxidase superfamily N - Multicopper oxidase; Many of the proteins in this family contain multiple similar copies of this plastocyanin-like domain. Q#14631 - CGI_10004445 superfamily 219541 232 405 2.15E-25 101.391 cl18516 Cu-oxidase_2 superfamily - - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. Q#14631 - CGI_10004445 superfamily 219542 1 63 3.00E-20 86.5267 cl18517 Cu-oxidase_3 superfamily N - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. Q#14631 - CGI_10004445 superfamily 215896 66 137 2.31E-11 61.542 cl18351 Cu-oxidase superfamily N - Multicopper oxidase; Many of the proteins in this family contain multiple similar copies of this plastocyanin-like domain. Q#14632 - CGI_10000662 superfamily 216315 185 274 0.000575883 39.3318 cl18364 ART superfamily N - NAD:arginine ADP-ribosyltransferase; NAD:arginine ADP-ribosyltransferase. 
Q#14633 - CGI_10013311 superfamily 246723 4 123 5.21E-48 163.628 cl14813 GluZincin superfamily N - "Peptidase Gluzincin family (thermolysin-like proteinases, TLPs) includes peptidases M1, M2, M3, M4, M13, M32 and M36 (fungalysins); Gluzincin family (thermolysin-like peptidases or TLPs) includes several zinc-dependent metallopeptidases such as the M1, M2, M3, M4, M13, M32, M36 peptidases (MEROPS classification), and contain HEXXH and EXXXD motifs as part of their active site. All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. M1 family includes aminopeptidase N (APN) and leukotriene A4 hydrolase (LTA4H). APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types. LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity such that the two activities occupy different, but overlapping sites. The peptidase M3 or neurolysin-like family, includes M3, M2 and M32 metallopeptidases. The M3 peptidases have two subfamilies: M3A, includes thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (3.4.24.16), and the mitochondrial intermediate peptidase; M3B contains oligopeptidase F. M2 peptidase angiotensin converting enzyme (ACE, EC 3.4.15.1) catalyzes the conversion of decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II. ACE is a key part of the renin-angiotensin system that regulates blood pressure, thus ACE inhibitors are important for the treatment of hypertension. M32 family includes two eukaryotic enzymes from protozoa Trypanosoma cruzi, a causative agent of Chagas' disease, and Leishmania major, a parasite that causes leishmaniasis, making them attractive targets for drug development. 
The M4 family includes secreted protease thermolysin (EC 3.4.24.27), pseudolysin, aureolysin, neutral protease as well as fungalysin and bacillolysin (EC 3.4.24.28) that degrade extracellular proteins and peptides for bacterial nutrition, especially prior to sporulation. Thermolysin is widely used as a nonspecific protease to obtain fragments for peptide sequencing as well as in production of the artificial sweetener aspartame. M13 family includes neprilysin (EC 3.4.24.11) and endothelin-converting enzyme I (ECE-1, EC 3.4.24.71), which fulfill a broad range of physiological roles due to the greater variation in the S2' subsite allowing substrate specificity and are prime therapeutic targets for selective inhibition. Peptidase M36 (fungamysin) family includes endopeptidases from pathogenic fungi. Fungalysin hydrolyzes extracellular matrix proteins such as elastin and keratin. Aspergillus fumigatus causes the pulmonary disease aspergillosis by invading the lungs of immuno-compromised animals and secreting fungalysin that possibly breaks down proteinaceous structural barriers." Q#14636 - CGI_10013314 superfamily 247825 71 346 7.12E-37 135.614 cl17271 BcrAD_BadFG superfamily - - "BadF/BadG/BcrA/BcrD ATPase family; This family includes the BadF and BadG proteins that are two subunits of Benzoyl-CoA reductase, that may be involved in ATP hydrolysis. The family also includes an activase subunit from the enzyme 2-hydroxyglutaryl-CoA dehydratase. An uncharacterized protein from Aquifex aeolicus contains two copies of this region suggesting that the family may structurally dimerise. This family appears to be related to pfam00370." Q#14637 - CGI_10013315 superfamily 218809 14 114 1.71E-25 93.6569 cl05468 DUF872 superfamily - - Eukaryotic protein of unknown function (DUF872); This family consists of several uncharacterized eukaryotic proteins. The function of this family is unknown. 
Q#14639 - CGI_10013317 superfamily 245864 4 431 8.88E-98 303.431 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#14640 - CGI_10013318 superfamily 243083 522 617 1.22E-41 149.004 cl02554 PWWP superfamily - - "The PWWP domain, named for a conserved Pro-Trp-Trp-Pro motif, is a small domain consisting of 100-150 amino acids. The PWWP domain is found in numerous proteins that are involved in cell division, growth and differentiation. Most PWWP-domain proteins seem to be nuclear, often DNA-binding, proteins that function as transcription factors regulating a variety of developmental processes. 
The function of the PWWP domain is still not known precisely; however, based on the fact that other regions of PWWP-domain proteins are responsible for nuclear localization and DNA-binding, it is likely that the PWWP domain acts as a site for protein-protein binding interactions, influencing chromatin remodeling and thereby regulating transcriptional processes. Some PWWP-domain proteins have been linked to cancer or other diseases; some are known to function as growth factors." Q#14640 - CGI_10013318 superfamily 241597 194 245 8.56E-05 41.8356 cl00082 HMG-box superfamily C - "High Mobility Group (HMG)-box is found in a variety of eukaryotic chromosomal proteins and transcription factors. HMGs bind to the minor groove of DNA and have been classified by DNA binding preferences. Two phylogenically distinct groups of Class I proteins bind DNA in a sequence specific fashion and contain a single HMG box. One group (SOX-TCF) includes transcription factors, TCF-1, -3, -4; and also SRY and LEF-1, which bind four-way DNA junctions and duplex DNA targets. The second group (MATA) includes fungal mating type gene products MC, MATA1 and Ste11. Class II and III proteins (HMGB-UBF) bind DNA in a non-sequence specific fashion and contain two or more tandem HMG boxes. Class II members include non-histone chromosomal proteins, HMG1 and HMG2, which bind to bent or distorted DNA such as four-way DNA junctions, synthetic DNA cruciforms, kinked cisplatin-modified DNA, DNA bulges, cross-overs in supercoiled DNA, and can cause looping of linear DNA. Class III members include nucleolar and mitochondrial transcription factors, UBF and mtTF1, which bind four-way DNA junctions." Q#14640 - CGI_10013318 superfamily 243091 716 834 2.36E-38 140.547 cl02566 SET superfamily - - "SET domain; SET domains are protein lysine methyltransferase enzymes. SET domains appear to be protein-protein interaction domains. 
It has been demonstrated that SET domains mediate interactions with a family of proteins that display similarity with dual-specificity phosphatases (dsPTPases). A subset of SET domains have been called PR domains. These domains are divergent in sequence from other SET domains, but also appear to mediate protein-protein interaction. The SET domain consists of two regions known as SET-N and SET-C. SET-C forms an unusual and conserved knot-like structure of probably functional importance. Additionally to SET-N and SET-C, an insert region (SET-I) and flanking regions of high structural variability form part of the overall structure." Q#14640 - CGI_10013318 superfamily 243083 2 114 7.24E-24 98.6125 cl02554 PWWP superfamily - - "The PWWP domain, named for a conserved Pro-Trp-Trp-Pro motif, is a small domain consisting of 100-150 amino acids. The PWWP domain is found in numerous proteins that are involved in cell division, growth and differentiation. Most PWWP-domain proteins seem to be nuclear, often DNA-binding, proteins that function as transcription factors regulating a variety of developmental processes. The function of the PWWP domain is still not known precisely; however, based on the fact that other regions of PWWP-domain proteins are responsible for nuclear localization and DNA-binding, it is likely that the PWWP domain acts as a site for protein-protein binding interactions, influencing chromatin remodeling and thereby regulating transcriptional processes. Some PWWP-domain proteins have been linked to cancer or other diseases; some are known to function as growth factors." Q#14640 - CGI_10013318 superfamily 197795 661 710 1.73E-09 55.4834 cl02673 AWS superfamily - - associated with SET domains; subdomain of PRESET Q#14640 - CGI_10013318 superfamily 247999 477 516 7.30E-07 47.8704 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. 
Several PHD fingers have been identified as binding modules of methylated histone H3. Q#14643 - CGI_10013321 superfamily 128778 199 323 1.34E-10 59.5859 cl17972 BBC superfamily - - B-Box C-terminal domain; Coiled coil region C-terminal to (some) B-Box domains Q#14643 - CGI_10013321 superfamily 243109 543 740 1.10E-07 51.1296 cl02614 SPRY superfamily - - "SPRY domain; SPRY domains, first identified in the SP1A kinase of Dictyostelium and rabbit Ryanodine receptor (hence the name), are homologous to B30.2. SPRY domains have been identified in at least 11 protein families, covering a wide range of functions, including regulation of cytokine signaling (SOCS), RNA metabolism (DDX1 and hnRNP), immunity to retroviruses (TRIM5alpha), intracellular calcium release (ryanodine receptors or RyR) and regulatory and developmental processes (HERC1 and Ash2L). B30.2 also contains residues in the N-terminus that form a distinct PRY domain structure; i.e. B30.2 domain consists of PRY and SPRY subdomains. B30.2 domains comprise the C-terminus of three protein families: BTNs (receptor glycoproteins of immunoglobulin superfamily); several TRIM proteins (composed of RING/B-box/coiled-coil or RBCC core); Stonutoxin (secreted poisonous protein of the stonefish Synanceia horrida). While SPRY domains are evolutionarily ancient, B30.2 domains are a more recent adaptation where the SPRY/PRY combination is a possible component of immune defense. Mutations found in the SPRY-containing proteins have shown to cause Mediterranean fever and Opitz syndrome." Q#14643 - CGI_10013321 superfamily 241563 143 191 0.00420593 35.918 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#14644 - CGI_10013322 superfamily 243092 205 481 4.43E-39 144.015 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#14646 - CGI_10013324 superfamily 245316 29 120 2.72E-34 118.862 cl10502 nitrobindin superfamily C - "nitrobindin heme-binding domain; Nitrobindin is a heme-containing lipocalin that may reversibly bind nitric oxide. This heme-binding domain forms a beta barrel structure, and in a small family of proteins from tetrapods, it is found C-terminal to a THAP zinc finger domain (a sequence-specific DNA binding domain). Members of this group are putatively related to fatty acid-binding proteins (FABPs)." Q#14647 - CGI_10013325 superfamily 245598 1 238 4.24E-113 354.062 cl11396 Patatin_and_cPLA2 superfamily C - "Patatins and Phospholipases; Patatin-like phospholipase. This family consists of various patatin glycoproteins from plants. 
The patatin protein accounts for up to 40% of the total soluble protein in potato tubers. Patatin is a storage protein, but it also has the enzymatic activity of a lipid acyl hydrolase, catalyzing the cleavage of fatty acids from membrane lipids. Members of this family have also been found in vertebrates. This family also includes the catalytic domain of cytosolic phospholipase A2 (PLA2; EC 3.1.1.4) hydrolyzes the sn-2-acyl ester bond of phospholipids to release arachidonic acid. At the active site, cPLA2 contains a serine nucleophile through which the catalytic mechanism is initiated. The active site is partially covered by a solvent-accessible flexible lid. cPLA2 displays interfacial activation as it exists in both "closed lid" and "open lid" forms." Q#14647 - CGI_10013325 superfamily 245598 324 581 1.63E-98 315.927 cl11396 Patatin_and_cPLA2 superfamily N - "Patatins and Phospholipases; Patatin-like phospholipase. This family consists of various patatin glycoproteins from plants. The patatin protein accounts for up to 40% of the total soluble protein in potato tubers. Patatin is a storage protein, but it also has the enzymatic activity of a lipid acyl hydrolase, catalyzing the cleavage of fatty acids from membrane lipids. Members of this family have also been found in vertebrates. This family also includes the catalytic domain of cytosolic phospholipase A2 (PLA2; EC 3.1.1.4) hydrolyzes the sn-2-acyl ester bond of phospholipids to release arachidonic acid. At the active site, cPLA2 contains a serine nucleophile through which the catalytic mechanism is initiated. The active site is partially covered by a solvent-accessible flexible lid. cPLA2 displays interfacial activation as it exists in both "closed lid" and "open lid" forms." Q#14648 - CGI_10013326 superfamily 246669 28 126 7.53E-29 103.111 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. 
C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#14649 - CGI_10013327 superfamily 245213 1320 1357 2.21E-08 53.4094 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14649 - CGI_10013327 superfamily 245213 1397 1434 2.51E-07 50.3278 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14649 - CGI_10013327 superfamily 245213 943 979 3.82E-07 49.5574 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14649 - CGI_10013327 superfamily 245213 1627 1663 5.35E-07 49.1722 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14649 - CGI_10013327 superfamily 245213 1436 1472 9.17E-07 48.787 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14649 - CGI_10013327 superfamily 245213 2112 2149 1.65E-06 48.0166 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14649 - CGI_10013327 superfamily 245213 1244 1280 1.70E-06 47.6314 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14649 - CGI_10013327 superfamily 245213 449 484 1.73E-06 47.6314 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14649 - CGI_10013327 superfamily 245213 1130 1165 2.16E-06 47.6314 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14649 - CGI_10013327 superfamily 245213 1474 1510 2.29E-06 47.6314 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14649 - CGI_10013327 superfamily 245213 1589 1625 2.30E-06 47.2462 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14649 - CGI_10013327 superfamily 245213 867 903 2.31E-06 47.2462 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14649 - CGI_10013327 superfamily 245213 487 523 4.37E-06 46.4758 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14649 - CGI_10013327 superfamily 245213 563 599 5.61E-06 46.4758 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14649 - CGI_10013327 superfamily 245213 1778 1814 6.29E-06 46.0906 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14649 - CGI_10013327 superfamily 245213 639 675 7.50E-06 46.0906 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14649 - CGI_10013327 superfamily 245213 1551 1587 9.15E-06 45.7054 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14649 - CGI_10013327 superfamily 245213 525 561 1.05E-05 45.3202 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14649 - CGI_10013327 superfamily 245213 1703 1739 1.12E-05 45.3202 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14649 - CGI_10013327 superfamily 245213 2188 2223 1.29E-05 45.3202 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14649 - CGI_10013327 superfamily 245213 601 636 1.44E-05 44.935 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14649 - CGI_10013327 superfamily 245213 715 750 1.71E-05 44.935 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14649 - CGI_10013327 superfamily 245213 1359 1395 2.28E-05 44.5498 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14649 - CGI_10013327 superfamily 245213 373 408 2.46E-05 44.5498 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14649 - CGI_10013327 superfamily 245213 829 864 2.66E-05 44.1646 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14649 - CGI_10013327 superfamily 245213 1513 1549 3.60E-05 43.7794 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14649 - CGI_10013327 superfamily 245213 1168 1204 3.74E-05 43.7794 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14649 - CGI_10013327 superfamily 245213 753 788 4.81E-05 43.3942 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14649 - CGI_10013327 superfamily 245213 1206 1242 4.93E-05 43.3942 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14649 - CGI_10013327 superfamily 245213 905 941 5.51E-05 43.3942 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14649 - CGI_10013327 superfamily 245213 1092 1128 0.00010381 42.6238 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14649 - CGI_10013327 superfamily 245213 1055 1089 0.000130017 42.2386 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14649 - CGI_10013327 superfamily 245213 1282 1318 0.000169292 41.8534 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14649 - CGI_10013327 superfamily 245213 2001 2036 0.000241122 41.4682 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14649 - CGI_10013327 superfamily 245213 2151 2185 0.000396351 40.6978 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14649 - CGI_10013327 superfamily 245213 1018 1052 0.000472853 40.6978 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14649 - CGI_10013327 superfamily 245213 2038 2072 0.00051723 40.3126 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14649 - CGI_10013327 superfamily 245213 411 446 0.000812117 39.9274 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14649 - CGI_10013327 superfamily 245213 1665 1700 0.00087505 39.9274 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14649 - CGI_10013327 superfamily 245213 791 826 0.00108156 39.5422 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14649 - CGI_10013327 superfamily 245213 1964 1998 0.00125924 39.5422 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14649 - CGI_10013327 superfamily 245213 1853 1887 0.00547321 37.6162 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14649 - CGI_10013327 superfamily 245213 2075 2109 0.00584228 37.231 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14649 - CGI_10013327 superfamily 245213 1927 1961 0.00618091 37.231 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14649 - CGI_10013327 superfamily 245213 1890 1924 0.00762863 36.8458 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14649 - CGI_10013327 superfamily 243060 192 275 6.17E-06 46.9884 cl02507 SEA superfamily - - "SEA domain; Domain found in Sea urchin sperm protein, Enterokinase, Agrin (SEA). Proposed function of regulating or binding carbohydrate side chains. Recently a proteolytic activity has been shown for a SEA domain." Q#14649 - CGI_10013327 superfamily 243060 2352 2430 1.09E-05 46.218 cl02507 SEA superfamily C - "SEA domain; Domain found in Sea urchin sperm protein, Enterokinase, Agrin (SEA). Proposed function of regulating or binding carbohydrate side chains. Recently a proteolytic activity has been shown for a SEA domain." Q#14649 - CGI_10013327 superfamily 243060 2231 2300 2.28E-05 45.4476 cl02507 SEA superfamily C - "SEA domain; Domain found in Sea urchin sperm protein, Enterokinase, Agrin (SEA). Proposed function of regulating or binding carbohydrate side chains. 
Recently a proteolytic activity has been shown for a SEA domain." Q#14649 - CGI_10013327 superfamily 243119 2642 2690 0.00100187 39.7245 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#14651 - CGI_10013329 superfamily 245864 56 150 1.31E-32 120.846 cl12078 p450 superfamily N - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." 
Q#14652 - CGI_10013330 superfamily 243072 1 113 1.21E-16 71.6458 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#14653 - CGI_10013331 superfamily 243072 36 152 6.93E-25 95.9134 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#14654 - CGI_10013332 superfamily 243072 1 108 1.05E-26 99.3802 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#14656 - CGI_10001118 superfamily 248054 11 53 2.87E-07 43.3529 cl17500 NAD_binding_8 superfamily C - NAD(P)-binding Rossmann-like domain; NAD(P)-binding Rossmann-like domain. 
Q#14658 - CGI_10001230 superfamily 241874 12 167 1.29E-46 159.954 cl00456 SLC5-6-like_sbd superfamily C - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#14659 - CGI_10002821 superfamily 248097 39 159 3.66E-18 76.1498 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#14660 - CGI_10002822 superfamily 247912 32 101 9.54E-15 67.5264 cl17358 Beta-lactamase superfamily C - Beta-lactamase; This family appears to be distantly related to pfam00905 and PF00768 D-alanyl-D-alanine carboxypeptidase. Q#14663 - CGI_10000380 superfamily 241733 1 63 2.29E-26 93.0053 cl00259 Sm_like superfamily - - "Sm and related proteins; The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. 
Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes." Q#14664 - CGI_10000455 superfamily 246723 7 207 7.64E-129 375.363 cl14813 GluZincin superfamily N - "Peptidase Gluzincin family (thermolysin-like proteinases, TLPs) includes peptidases M1, M2, M3, M4, M13, M32 and M36 (fungalysins); Gluzincin family (thermolysin-like peptidases or TLPs) includes several zinc-dependent metallopeptidases such as the M1, M2, M3, M4, M13, M32, M36 peptidases (MEROPS classification), and contain HEXXH and EXXXD motifs as part of their active site. All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. M1 family includes aminopeptidase N (APN) and leukotriene A4 hydrolase (LTA4H). APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types. LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity such that the two activities occupy different, but overlapping sites. The peptidase M3 or neurolysin-like family, includes M3, M2 and M32 metallopeptidases. The M3 peptidases have two subfamilies: M3A, includes thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (3.4.24.16), and the mitochondrial intermediate peptidase; M3B contains oligopeptidase F. M2 peptidase angiotensin converting enzyme (ACE, EC 3.4.15.1) catalyzes the conversion of decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II. 
ACE is a key part of the renin-angiotensin system that regulates blood pressure, thus ACE inhibitors are important for the treatment of hypertension. M32 family includes two eukaryotic enzymes from protozoa Trypanosoma cruzi, a causative agent of Chagas' disease, and Leishmania major, a parasite that causes leishmaniasis, making them attractive targets for drug development. The M4 family includes secreted protease thermolysin (EC 3.4.24.27), pseudolysin, aureolysin, neutral protease as well as fungalysin and bacillolysin (EC 3.4.24.28) that degrade extracellular proteins and peptides for bacterial nutrition, especially prior to sporulation. Thermolysin is widely used as a nonspecific protease to obtain fragments for peptide sequencing as well as in production of the artificial sweetener aspartame. M13 family includes neprilysin (EC 3.4.24.11) and endothelin-converting enzyme I (ECE-1, EC 3.4.24.71), which fulfill a broad range of physiological roles due to the greater variation in the S2' subsite allowing substrate specificity and are prime therapeutic targets for selective inhibition. Peptidase M36 (fungalysin) family includes endopeptidases from pathogenic fungi. Fungalysin hydrolyzes extracellular matrix proteins such as elastin and keratin. Aspergillus fumigatus causes the pulmonary disease aspergillosis by invading the lungs of immuno-compromised animals and secreting fungalysin that possibly breaks down proteinaceous structural barriers." Q#14665 - CGI_10000581 superfamily 216301 20 237 1.57E-55 177.841 cl03099 EMP24_GP25L superfamily - - emp24/gp25L/p24 family/GOLD; Members of this family are implicated in bringing cargo forward from the ER and binding to coat proteins by their cytoplasmic domains. This domain corresponds closely to the beta-strand rich GOLD domain described in. The GOLD domain is always found combined with lipid- or membrane-association domains.
Q#14666 - CGI_10003924 superfamily 245213 397 434 5.19E-08 50.713 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14666 - CGI_10003924 superfamily 245213 475 512 4.05E-07 48.0166 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14666 - CGI_10003924 superfamily 245213 319 354 5.13E-07 48.0166 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#14666 - CGI_10003924 superfamily 245213 357 394 6.75E-07 47.6314 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14666 - CGI_10003924 superfamily 245213 596 634 8.88E-07 47.2462 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14666 - CGI_10003924 superfamily 245213 554 594 1.20E-06 46.861 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#14666 - CGI_10003924 superfamily 245213 437 473 1.02E-05 44.1646 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14666 - CGI_10003924 superfamily 245213 637 672 0.00190312 37.231 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14666 - CGI_10003924 superfamily 245213 515 551 0.00723549 35.6902 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#14666 - CGI_10003924 superfamily 245814 240 304 0.000372967 39.6976 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#14667 - CGI_10000790 superfamily 245201 3 132 3.32E-55 176.188 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#14669 - CGI_10011388 superfamily 218284 26 197 2.97E-45 149.712 cl04786 SOUL superfamily - - SOUL heme-binding protein; This family represents a group of putative heme-binding proteins. Our family includes archaeal and bacterial homologues. Q#14670 - CGI_10011389 superfamily 218284 51 106 1.08E-10 54.1827 cl04786 SOUL superfamily C - SOUL heme-binding protein; This family represents a group of putative heme-binding proteins. Our family includes archaeal and bacterial homologues. Q#14671 - CGI_10011390 superfamily 247085 274 391 3.44E-11 60.597 cl15820 RICIN superfamily - - "Ricin-type beta-trefoil; Carbohydrate-binding domain formed from presumed gene triplication. 
The domain is found in a variety of molecules serving diverse functions such as enzymatic activity, inhibitory toxicity and signal transduction. Highly specific ligand binding occurs on exposed surfaces of the compact domain structure." Q#14671 - CGI_10011390 superfamily 245596 3 260 2.37E-97 300.66 cl11394 Glyco_tranf_GTA_type superfamily - - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#14671 - CGI_10011390 superfamily 218284 395 552 4.67E-34 126.6 cl04786 SOUL superfamily - - SOUL heme-binding protein; This family represents a group of putative heme-binding proteins. Our family includes archaeal and bacterial homologues. Q#14672 - CGI_10011391 superfamily 218284 1 109 2.90E-25 94.6287 cl04786 SOUL superfamily N - SOUL heme-binding protein; This family represents a group of putative heme-binding proteins. Our family includes archaeal and bacterial homologues.
Q#14674 - CGI_10011393 superfamily 246681 1 240 1.97E-51 168.476 cl14643 SRPBCC superfamily - - "START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC (SRPBCC) ligand-binding domain superfamily; SRPBCC domains have a deep hydrophobic ligand-binding pocket; they bind diverse ligands. Included in this superfamily are the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of mammalian STARD1-STARD15, and the C-terminal catalytic domains of the alpha oxygenase subunit of Rieske-type non-heme iron aromatic ring-hydroxylating oxygenases (RHOs_alpha_C), as well as the SRPBCC domains of phosphatidylinositol transfer proteins (PITPs), Bet v 1 (the major pollen allergen of white birch, Betula verrucosa), CoxG, CalC, and related proteins. Other members of this superfamily include PYR/PYL/RCAR plant proteins, the aromatase/cyclase (ARO/CYC) domains of proteins such as Streptomyces glaucescens tetracenomycin, and the SRPBCC domains of Streptococcus mutans Smu.440 and related proteins." Q#14677 - CGI_10011396 superfamily 222269 49 245 8.46E-45 152.478 cl18657 Cupin_8 superfamily - - Cupin-like domain; This cupin like domain shares similarity to the JmjC domain. Q#14679 - CGI_10011398 superfamily 241570 245 357 2.59E-34 123.59 cl00047 CAP_ED superfamily - - "effector domain of the CAP family of transcription factors; members include CAP (or cAMP receptor protein (CRP)), which binds cAMP, FNR (fumarate and nitrate reduction), which uses an iron-sulfur cluster to sense oxygen) and CooA, a heme containing CO sensor. In all cases binding of the effector leads to conformational changes and the ability to activate transcription. Cyclic nucleotide-binding domain similar to CAP are also present in cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) and vertebrate cyclic nucleotide-gated ion-channels. 
Cyclic nucleotide-monophosphate binding domain; proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues; the best studied is the prokaryotic catabolite gene activator, CAP, where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure; three conserved glycine residues are thought to be essential for maintenance of the structural integrity of the beta-barrel; CooA is a homodimeric transcription factor that belongs to CAP family; cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain; cAPK's are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain; cGPK's are single chain enzymes that include the two copies of the domain in their N-terminal section; also found in vertebrate cyclic nucleotide-gated ion-channels" Q#14679 - CGI_10011398 superfamily 241570 123 236 2.38E-27 104.33 cl00047 CAP_ED superfamily - - "effector domain of the CAP family of transcription factors; members include CAP (or cAMP receptor protein (CRP)), which binds cAMP, FNR (fumarate and nitrate reduction), which uses an iron-sulfur cluster to sense oxygen) and CooA, a heme containing CO sensor. In all cases binding of the effector leads to conformational changes and the ability to activate transcription. Cyclic nucleotide-binding domain similar to CAP are also present in cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) and vertebrate cyclic nucleotide-gated ion-channels. 
Cyclic nucleotide-monophosphate binding domain; proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues; the best studied is the prokaryotic catabolite gene activator, CAP, where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure; three conserved glycine residues are thought to be essential for maintenance of the structural integrity of the beta-barrel; CooA is a homodimeric transcription factor that belongs to CAP family; cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain; cAPK's are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain; cGPK's are single chain enzymes that include the two copies of the domain in their N-terminal section; also found in vertebrate cyclic nucleotide-gated ion-channels" Q#14679 - CGI_10011398 superfamily 213107 5 43 6.21E-14 65.3447 cl02594 DD_R_PKA superfamily - - "Dimerization/Docking domain of the Regulatory subunit of cAMP-dependent protein kinase and similar domains; cAMP-dependent protein kinase (PKA) is a serine/threonine kinase (STK), catalyzing the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The inactive PKA holoenzyme is a heterotetramer composed of two phosphorylated and active catalytic subunits with a dimer of regulatory (R) subunits. Activation is achieved through the binding of the important second messenger cAMP to the R subunits, which leads to the dissociation of PKA into the R dimer and two active subunits. There are two classes of R subunits, RI and RII; each exists as two isoforms (alpha and beta) from distinct genes. These functionally non-redundant R isoforms allow for specificity in PKA signaling. 
The R subunit contains an N-terminal dimerization/docking (D/D) domain, a linker with an inhibitory sequence (IS), and two c-AMP binding domains. RI and RII subunits are distinguished by their IS; RII subunits contain a phosphorylation site and are both substrates and inhibitors while RI subunits are pseudo-substrates. RI subunits require ATP and Mg ions to form a stable holoenzyme while RII subunits do not. The D/D domain dimerizes to form a four-helix bundle that serves as a docking site for A-kinase-anchoring proteins (AKAPs), which facilitates the localization of PKA to specific sites in the cell. PKA is present ubiquitously in cells and interacts with many different downstream targets. It plays a role in the regulation of diverse processes such as growth, development, memory, metabolism, gene expression, immunity, and lipolysis." Q#14682 - CGI_10011401 superfamily 243066 17 70 0.000542659 34.0657 cl02518 BTB superfamily C - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." 
Q#14683 - CGI_10011402 superfamily 247041 7 273 9.69E-89 283.054 cl15692 CE4_SF superfamily - - "Catalytic NodB homology domain of the carbohydrate esterase 4 superfamily; The carbohydrate esterase 4 (CE4) superfamily mainly includes chitin deacetylases (EC 3.5.1.41), bacterial peptidoglycan N-acetylglucosamine deacetylases (EC 3.5.1.-), and acetylxylan esterases (EC 3.1.1.72), which catalyze the N- or O-deacetylation of substrates such as acetylated chitin, peptidoglycan, and acetylated xylan, respectively. Members in this superfamily contain a NodB homology domain that adopts a deformed (beta/alpha)8 barrel fold, which encompasses a mononuclear metalloenzyme employing a conserved His-His-Asp zinc-binding triad, closely associated with the conserved catalytic base (aspartic acid) and acid (histidine) to carry out acid/base catalysis. The NodB homology domain of CE4 superfamily is remotely related to the 7-stranded beta/alpha barrel catalytic domain of the superfamily consisting of family 38 glycoside hydrolases (GH38), family 57 heat stable retaining glycoside hydrolases (GH57), lactam utilization protein LamB/YcsF family proteins, and YdjC-family proteins." Q#14683 - CGI_10011402 superfamily 247041 524 787 3.11E-59 202.932 cl15692 CE4_SF superfamily - - "Catalytic NodB homology domain of the carbohydrate esterase 4 superfamily; The carbohydrate esterase 4 (CE4) superfamily mainly includes chitin deacetylases (EC 3.5.1.41), bacterial peptidoglycan N-acetylglucosamine deacetylases (EC 3.5.1.-), and acetylxylan esterases (EC 3.1.1.72), which catalyze the N- or O-deacetylation of substrates such as acetylated chitin, peptidoglycan, and acetylated xylan, respectively. 
Members in this superfamily contain a NodB homology domain that adopts a deformed (beta/alpha)8 barrel fold, which encompasses a mononuclear metalloenzyme employing a conserved His-His-Asp zinc-binding triad, closely associated with the conserved catalytic base (aspartic acid) and acid (histidine) to carry out acid/base catalysis. The NodB homology domain of CE4 superfamily is remotely related to the 7-stranded beta/alpha barrel catalytic domain of the superfamily consisting of family 38 glycoside hydrolases (GH38), family 57 heat stable retaining glycoside hydrolases (GH57), lactam utilization protein LamB/YcsF family proteins, and YdjC-family proteins." Q#14685 - CGI_10011405 superfamily 241580 13 90 2.43E-47 155.404 cl00061 FH superfamily - - "Forkhead (FH), also known as a "winged helix". FH is named for the Drosophila fork head protein, a transcription factor which promotes terminal rather than segmental development. This family of transcription factor domains, which bind to B-DNA as monomers, are also found in the Hepatocyte nuclear factor (HNF) proteins, which provide tissue-specific gene regulation. The structure contains 2 flexible loops or "wings" in the C-terminal region, hence the term winged helix." Q#14686 - CGI_10011406 superfamily 192535 43 145 8.77E-05 42.583 cl18179 7TM_GPCR_Srsx superfamily C - Serpentine type 7TM GPCR chemoreceptor Srsx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srsx is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#14689 - CGI_10023926 superfamily 152088 131 213 9.21E-17 72.2355 cl13155 DUF3259 superfamily - - Protein of unknown function (DUF3259); This eukaryotic family of proteins has no known function. 
Q#14691 - CGI_10023928 superfamily 218493 429 571 1.89E-38 138.259 cl08434 GMC_oxred_C superfamily - - GMC oxidoreductase; This domain is found associated with pfam00732. Q#14691 - CGI_10023928 superfamily 248054 28 57 0.00663554 35.1404 cl17500 NAD_binding_8 superfamily C - NAD(P)-binding Rossmann-like domain; NAD(P)-binding Rossmann-like domain. Q#14692 - CGI_10023929 superfamily 246669 283 416 2.69E-46 157.36 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#14692 - CGI_10023929 superfamily 246669 149 275 1.19E-30 114.284 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. 
Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#14693 - CGI_10023930 superfamily 218387 12 229 2.76E-91 278.41 cl18452 Phytochelatin superfamily - - "Phytochelatin synthase; Phytochelatin synthase is the enzyme responsible for the synthesis of heavy-metal-binding peptides (phytochelatins) from glutathione and related thiols. The crystal structure of a member of this family shows it to possess a papain fold. The enzyme catalyzes the deglycination of a GSH donor molecule. The enzyme contains a catalytic triad of cysteine, histidine and aspartate residues." Q#14694 - CGI_10023931 superfamily 246751 132 372 1.41E-64 209.407 cl14883 Lipase superfamily - - "Lipase. Lipases are esterases that can hydrolyze long-chain acyl-triglycerides into di- and monoglycerides, glycerol, and free fatty acids at a water/lipid interface. A typical feature of lipases is "interfacial activation", the process of becoming active at the lipid/water interface, although several examples of lipases have been identified that do not undergo interfacial activation. The active site of a lipase contains a catalytic triad consisting of Ser - His - Asp/Glu, but unlike most serine proteases, the active site is buried inside the structure. A "lid" or "flap" covers the active site, making it inaccessible to solvent and substrates. The lid opens during the process of interfacial activation, allowing the lipid substrate access to the active site." 
Q#14695 - CGI_10023932 superfamily 246751 99 337 5.86E-62 201.318 cl14883 Lipase superfamily - - "Lipase. Lipases are esterases that can hydrolyze long-chain acyl-triglycerides into di- and monoglycerides, glycerol, and free fatty acids at a water/lipid interface. A typical feature of lipases is "interfacial activation", the process of becoming active at the lipid/water interface, although several examples of lipases have been identified that do not undergo interfacial activation. The active site of a lipase contains a catalytic triad consisting of Ser - His - Asp/Glu, but unlike most serine proteases, the active site is buried inside the structure. A "lid" or "flap" covers the active site, making it inaccessible to solvent and substrates. The lid opens during the process of interfacial activation, allowing the lipid substrate access to the active site." Q#14696 - CGI_10023933 superfamily 198867 254 353 3.48E-18 81.2336 cl06652 BACK superfamily - - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation)." Q#14696 - CGI_10023933 superfamily 243066 138 245 2.70E-15 73.0353 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. 
POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#14696 - CGI_10023933 superfamily 243146 577 623 2.34E-09 54.5898 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#14696 - CGI_10023933 superfamily 243146 539 588 1.75E-06 46.0123 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#14697 - CGI_10023934 superfamily 219000 103 246 3.26E-28 113.51 cl05717 Drf_FH3 superfamily - - Diaphanous FH3 Domain; This region is found in the Formin-like and diaphanous proteins. Q#14697 - CGI_10023934 superfamily 219001 4 97 8.87E-18 82.7419 cl05720 Drf_GBD superfamily N - "Diaphanous GTPase-binding Domain; This domain is bound to by GTP-attached Rho proteins, leading to activation of the Drf protein." Q#14700 - CGI_10023937 superfamily 219428 1 18 6.35E-07 43.0187 cl06499 PPI_Ypi1 superfamily NC - "Protein phosphatase inhibitor; These proteins include Ypi1, a novel Saccharomyces cerevisiae type 1 protein phosphatase inhibitor and ppp1r11/hcgv, annotated as having protein phosphatase inhibitor activity." Q#14701 - CGI_10023938 superfamily 243058 535 621 8.68E-10 56.9391 cl02500 ARM superfamily N - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. 
ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#14701 - CGI_10023938 superfamily 243058 264 367 1.32E-05 44.2276 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#14705 - CGI_10023942 superfamily 151147 26 55 1.03E-08 50.935 cl11240 DUF2475 superfamily C - Protein of unknown function (DUF2475); This family of proteins has no known function. Q#14705 - CGI_10023942 superfamily 151147 259 283 3.49E-05 40.9198 cl11240 DUF2475 superfamily C - Protein of unknown function (DUF2475); This family of proteins has no known function. Q#14706 - CGI_10023943 superfamily 243072 181 337 1.30E-26 106.314 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#14707 - CGI_10023944 superfamily 216981 1 28 0.00776577 32.8898 cl17087 OTU superfamily NC - "OTU-like cysteine protease; This family is comprised of a group of predicted cysteine proteases, homologous to the Ovarian Tumour (OTU) gene in Drosophila. Members include proteins from eukaryotes, viruses and pathogenic bacteria. The conserved cysteine and histidine, and possibly the aspartate, represent the catalytic residues in this putative group of proteases." Q#14709 - CGI_10023946 superfamily 248264 235 351 4.05E-09 54.1654 cl17710 DDE_4 superfamily N - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#14711 - CGI_10023948 superfamily 152683 8 100 7.45E-09 48.4381 cl13656 Methyltransf_FA superfamily - - "Farnesoic acid O-methyl transferase; This domain family is found in bacteria and eukaryotes, and is approximately 110 amino acids in length. Farnesoic acid O-methyl transferase (FAMeT) is the enzyme that catalyzes the formation of methyl farnesoate (MF) from farnesoic acid (FA) in the biosynthetic pathway of juvenile hormone (JH)." Q#14712 - CGI_10023949 superfamily 241589 92 219 1.26E-25 98.0899 cl00071 GLECT superfamily - - "Galectin/galactose-binding lectin. This domain exclusively binds beta-galactosides, such as lactose, and does not require metal ions for activity. GLECT domains occur as homodimers or tandemly repeated domains. They are developmentally regulated and may be involved in differentiation, cell-cell interaction and cellular regulation." 
Q#14713 - CGI_10023950 superfamily 241589 8 127 7.57E-43 145.469 cl00071 GLECT superfamily - - "Galectin/galactose-binding lectin. This domain exclusively binds beta-galactosides, such as lactose, and does not require metal ions for activity. GLECT domains occur as homodimers or tandemly repeated domains. They are developmentally regulated and may be involved in differentiation, cell-cell interaction and cellular regulation." Q#14713 - CGI_10023950 superfamily 241589 170 305 1.95E-32 117.735 cl00071 GLECT superfamily - - "Galectin/galactose-binding lectin. This domain exclusively binds beta-galactosides, such as lactose, and does not require metal ions for activity. GLECT domains occur as homodimers or tandemly repeated domains. They are developmentally regulated and may be involved in differentiation, cell-cell interaction and cellular regulation." Q#14716 - CGI_10023953 superfamily 145533 60 159 1.15E-47 162.148 cl03592 Ski_Sno superfamily - - "SKI/SNO/DAC family; This family contains a presumed domain that is about 100 amino acids long. All members of this family contain a conserved CLPQ motif. The c-ski proto-oncogene has been shown to influence proliferation, morphological transformation and myogenic differentiation. Sno, a Ski proto-oncogene homologue, is expressed in two isoforms and plays a role in the response to proliferation stimuli. Dachshund also contains this domain. It is involved in various aspects of development." Q#14717 - CGI_10023954 superfamily 146346 98 331 3.70E-101 300.902 cl04200 UPF0121 superfamily - - Uncharacterized protein family (UPF0121); Uncharacterized integral membrane protein family. Q#14718 - CGI_10023955 superfamily 241596 91 148 1.45E-16 71.4763 cl00081 HLH superfamily - - "Helix-loop-helix domain, found in specific DNA- binding proteins that act as transcription factors; 60-100 amino acids long. 
A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE (5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#14720 - CGI_10023957 superfamily 247792 10 61 1.94E-06 44.744 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#14721 - CGI_10023958 superfamily 220249 31 95 1.01E-12 60.6968 cl09695 H_lectin superfamily - - "H-type lectin domain; The H-type lectin domain is a unit of six beta chains, combined into a homo-hexamer. 
It is involved in self/non-self recognition of cells, through binding with carbohydrates. It is sometimes found in association with the F5_F8_type_C domain pfam00754." Q#14721 - CGI_10023958 superfamily 220249 174 241 2.04E-12 59.9264 cl09695 H_lectin superfamily - - "H-type lectin domain; The H-type lectin domain is a unit of six beta chains, combined into a homo-hexamer. It is involved in self/non-self recognition of cells, through binding with carbohydrates. It is sometimes found in association with the F5_F8_type_C domain pfam00754." Q#14722 - CGI_10023959 superfamily 220249 54 116 3.78E-13 59.9264 cl09695 H_lectin superfamily - - "H-type lectin domain; The H-type lectin domain is a unit of six beta chains, combined into a homo-hexamer. It is involved in self/non-self recognition of cells, through binding with carbohydrates. It is sometimes found in association with the F5_F8_type_C domain pfam00754." Q#14723 - CGI_10023960 superfamily 241567 13 231 1.70E-82 248.669 cl00042 CASc superfamily - - "Caspase, interleukin-1 beta converting enzyme (ICE) homologues; Cysteine-dependent aspartate-directed proteases that mediate programmed cell death (apoptosis). Caspases are synthesized as inactive zymogens and activated by proteolysis of the peptide backbone adjacent to an aspartate. The resulting two subunits associate to form an (alpha)2(beta)2-tetramer which is the active enzyme. Activation of caspases can be mediated by other caspase homologs." Q#14724 - CGI_10023961 superfamily 246680 27 103 1.44E-14 69.1802 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. 
Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#14724 - CGI_10023961 superfamily 246680 136 217 5.08E-06 44.5225 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#14724 - CGI_10023961 superfamily 246680 278 331 0.00379101 35.7742 cl14633 DD_superfamily superfamily C - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. 
They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#14725 - CGI_10023962 superfamily 220691 66 299 0.00902285 36.4418 cl18569 7TM_GPCR_Srv superfamily N - Serpentine type 7TM GPCR chemoreceptor Srv; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srv is a member of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#14727 - CGI_10023964 superfamily 243072 1 122 2.93E-09 52.3858 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#14728 - CGI_10023966 superfamily 243146 82 130 4.49E-08 47.6691 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#14732 - CGI_10001204 superfamily 241563 78 114 2.47E-06 44.7776 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#14732 - CGI_10001204 superfamily 245027 111 217 0.00160265 37.758 cl09176 FlgN superfamily C - FlgN protein; This family includes the FlgN protein and export chaperone involved in flagellar synthesis. Q#14734 - CGI_10001493 superfamily 242164 20 117 3.99E-30 108.578 cl00878 Ribosomal_S24e superfamily C - Ribosomal protein S24e; Ribosomal protein S24e. Q#14735 - CGI_10001494 superfamily 220695 96 237 1.70E-05 44.4919 cl18571 7TM_GPCR_Srx superfamily NC - Serpentine type 7TM GPCR chemoreceptor Srx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srx is part of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#14736 - CGI_10001495 superfamily 217293 10 144 7.49E-17 77.2879 cl03788 Neur_chan_LBD superfamily N - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#14736 - CGI_10001495 superfamily 202474 151 201 6.28E-10 57.2785 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#14738 - CGI_10001521 superfamily 243067 110 205 4.06E-08 51.6444 cl02520 REM superfamily - - "Guanine nucleotide exchange factor for Ras-like GTPases; N-terminal domain (RasGef_N), also called REM domain (Ras exchanger motif). This domain is common in nucleotide exchange factors for Ras-like small GTPases and is typically found immediately N-terminal to the RasGef (Cdc25-like) domain. REM contacts the GTPase and is assumed to participate in the catalytic activity of the exchange factor. 
Proteins with the REM domain include Sos1 and Sos2, which relay signals from tyrosine-kinase mediated signalling to Ras, RasGRP1-4, RasGRF1,2, CNrasGEF, and RAP-specific nucleotide exchange factors, to name a few." Q#14738 - CGI_10001521 superfamily 243053 315 524 2.39E-22 94.9189 cl02485 RasGEF superfamily - - "Guanine nucleotide exchange factor for Ras-like small GTPases. Small GTP-binding proteins of the Ras superfamily function as molecular switches in fundamental events such as signal transduction, cytoskeleton dynamics and intracellular trafficking. Guanine-nucleotide-exchange factors (GEFs) positively regulate these GTP-binding proteins in response to a variety of signals. GEFs catalyze the dissociation of GDP from the inactive GTP-binding proteins. GTP can then bind and induce structural changes that allow interaction with effectors." Q#14739 - CGI_10001522 superfamily 243179 128 196 2.08E-19 83.1217 cl02781 tetraspanin_LEL superfamily C - "Tetraspanin, extracellular domain or large extracellular loop (LEL). Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. The tetraspanin family contains CD9, CD63, CD37, CD53, CD82, CD151, and CD81, amongst others. Tetraspanins are involved in diverse processes such as cell activation and proliferation, adhesion and motility, differentiation, cancer, and others. Their various functions may relate to their ability to act as molecular facilitators, grouping specific cell-surface proteins and affecting formation and stability of signaling complexes. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the 'tetraspanin web', which may also include integrins." 
Q#14739 - CGI_10001522 superfamily 248019 259 317 0.000667719 40.6387 cl17465 DAGK_cat superfamily N - "Diacylglycerol kinase catalytic domain; Diacylglycerol (DAG) is a second messenger that acts as a protein kinase C activator. The catalytic domain is assumed from the finding of bacterial homologues. YegS is the Escherichia coli protein in this family whose crystal structure reveals an active site in the inter-domain cleft formed by four conserved sequence motifs, revealing a novel metal-binding site. The residues of this site are conserved across the family." Q#14740 - CGI_10001337 superfamily 248028 44 155 9.68E-06 42.4905 cl17474 Steroid_dh superfamily C - "3-oxo-5-alpha-steroid 4-dehydrogenase; This family consists of 3-oxo-5-alpha-steroid 4-dehydrogenases, EC:1.3.99.5. Also known as Steroid 5-alpha-reductase, the reaction catalyzed by this enzyme is: 3-oxo-5-alpha-steroid + acceptor <=> 3-oxo-delta(4)-steroid + reduced acceptor. The Steroid 5-alpha-reductase enzyme is responsible for the formation of dihydrotestosterone, this hormone promotes the differentiation of male external genitalia and the prostate during fetal development. In humans mutations in this enzyme can cause a form of male pseudohermaphroditism in which the external genitalia and prostate fail to develop normally. A related enzyme also found in plants is DET2, a steroid reductase from Arabidopsis. Mutations in this enzyme cause defects in light-regulated development." Q#14743 - CGI_10020256 superfamily 242280 53 82 0.000204109 34.5818 cl01066 Trm112p superfamily N - "Trm112p-like protein; The function of this family is uncertain. The bacterial members are about 60-70 amino acids in length and the eukaryotic examples are about 120 amino acids in length. The C terminus contains the strongest conservation. Trm112p is required for tRNA methylation in S. cerevisiae and is found in complexes with 2 tRNA methylases (TRM9 and TRM11) also with putative methyltransferase YDR140W. 
The zinc-finger protein Ynr046w is plurifunctional and a component of the eRF1 methyltransferase in yeast. The crystal structure of Ynr046w has been determined to 1.7 A resolution. It comprises a zinc-binding domain built from both the N- and C-terminal sequences and an inserted domain, absent from bacterial and archaeal orthologs of the protein, composed of three alpha-helices." Q#14744 - CGI_10020257 superfamily 241647 23 40 0.00485621 34.0406 cl00157 WW superfamily N - Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs. Q#14745 - CGI_10020258 superfamily 241647 8 39 5.91E-07 46.7522 cl00157 WW superfamily - - Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs. 
Q#14746 - CGI_10020259 superfamily 241644 52 143 1.48E-17 75.4084 cl00154 UBCc superfamily N - "Ubiquitin-conjugating enzyme E2, catalytic (UBCc) domain. This is part of the ubiquitin-mediated protein degradation pathway in which a thiol-ester linkage forms between a conserved cysteine and the C-terminus of ubiquitin and complexes with ubiquitin protein ligase enzymes, E3. This pathway regulates many fundamental cellular processes. There are also other E2s which form thiol-ester linkages without the use of E3s as well as several UBC homologs (TSG101, Mms2, Croc-1 and similar proteins) which lack the active site cysteine essential for ubiquitination and appear to function in DNA repair pathways which were omitted from the scope of this CD." Q#14747 - CGI_10020260 superfamily 248338 159 279 0.000808358 39.5069 cl17784 Peptidase_C48 superfamily N - "Ulp1 protease family, C-terminal catalytic domain; This domain contains the catalytic triad Cys-His-Asn." Q#14748 - CGI_10020261 superfamily 244824 49 507 0 617.706 cl07893 AmyAc_family superfamily - - "Alpha amylase catalytic domain family; The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; and C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. 
Other members of this family have lost this catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase." Q#14749 - CGI_10020262 superfamily 241559 40 154 3.45E-22 87.7515 cl00030 CH superfamily - - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." 
Q#14751 - CGI_10020264 superfamily 243092 381 683 1.51E-35 140.163 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#14751 - CGI_10020264 superfamily 243035 2694 2812 6.34E-28 112.713 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. 
Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. 
Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#14751 - CGI_10020264 superfamily 243035 2250 2364 3.35E-27 110.402 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. 
Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#14751 - CGI_10020264 superfamily 245847 1993 2134 1.74E-26 109.364 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#14751 - CGI_10020264 superfamily 243035 2988 3098 3.91E-26 107.32 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. 
Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. 
Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#14751 - CGI_10020264 superfamily 243035 2537 2658 7.38E-26 106.55 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. 
Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#14751 - CGI_10020264 superfamily 243035 2842 2959 6.70E-24 100.772 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#14751 - CGI_10020264 superfamily 245847 1842 1977 6.67E-23 98.9636 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#14751 - CGI_10020264 superfamily 243035 3125 3245 2.85E-22 96.1497 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. 
C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#14751 - CGI_10020264 superfamily 243035 1567 1670 2.39E-21 93.4533 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#14751 - CGI_10020264 superfamily 245847 1678 1822 5.62E-20 90.104 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#14751 - CGI_10020264 superfamily 243035 984 1112 7.64E-20 88.9285 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. 
C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#14751 - CGI_10020264 superfamily 243035 2392 2507 8.46E-20 88.8309 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#14751 - CGI_10020264 superfamily 243035 1413 1531 1.18E-19 88.4457 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#14751 - CGI_10020264 superfamily 243035 1263 1384 6.93E-19 86.5197 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. 
CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#14751 - CGI_10020264 superfamily 243035 1132 1248 3.26E-13 69.5709 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. 
Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. 
A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#14751 - CGI_10020264 superfamily 247856 53 117 7.54E-07 49.4685 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#14751 - CGI_10020264 superfamily 243092 556 862 1.87E-20 94.7092 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 
copies of the repeat are present in this alignment." Q#14751 - CGI_10020264 superfamily 243093 2144 2222 2.86E-12 66.0146 cl02568 WSC superfamily - - WSC domain; This domain may be involved in carbohydrate binding. Q#14751 - CGI_10020264 superfamily 243092 196 539 3.11E-12 68.9008 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#14752 - CGI_10020265 superfamily 241568 108 160 1.46E-06 47.8428 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. 
The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#14752 - CGI_10020265 superfamily 245213 1667 1699 0.000214159 41.4682 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14752 - CGI_10020265 superfamily 245213 1278 1320 0.000291575 41.083 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14752 - CGI_10020265 superfamily 245213 1500 1532 0.000667711 39.9274 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14752 - CGI_10020265 superfamily 219525 1698 1742 8.25E-07 48.5693 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#14752 - CGI_10020265 superfamily 241578 1454 1497 8.74E-06 47.7648 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#14752 - CGI_10020265 superfamily 241578 1233 1273 1.31E-05 47.3796 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). 
Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#14752 - CGI_10020265 superfamily 241578 1758 1798 4.55E-05 45.8388 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." 
Q#14752 - CGI_10020265 superfamily 219525 2003 2039 8.74E-05 42.7914 cl06646 GCC2_GCC3 superfamily C - GCC2 and GCC3; GCC2 and GCC3. Q#14752 - CGI_10020265 superfamily 219525 2071 2119 0.000211637 41.6358 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#14752 - CGI_10020265 superfamily 241578 1796 1838 0.000553478 42.372 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#14752 - CGI_10020265 superfamily 221695 1646 1670 0.000901014 39.3606 cl18612 cEGF superfamily - - "Complement Clr-like EGF-like; cEGF, or complement Clr-like EGF, domains have six conserved cysteine residues disulfide-bonded into the characteristic pattern 'ababcc'. They are found in blood coagulation proteins such as fibrillin, Clr and Cls, thrombomodulin, and the LDL receptor. The core fold of the EGF domain consists of two small beta-hairpins packed against each other. 
Two major structural variants have been identified based on the structural context of the C-terminal cysteine residue of disulfide 'c' in the C-terminal hairpin: hEGFs and cEGFs. In cEGFs the C-terminal thiol resides on the C-terminal beta-sheet, resulting in long loop-lengths between the cysteine residues of disulfide 'c', typically C[10+]XC. These longer loop-lengths may have arisen by selective cysteine loss from a four-disulfide EGF template such as laminin or integrin. Tandem cEGF domains have five linking residues between terminal cysteines of adjacent domains. cEGF domains may or may not bind calcium in the linker region. cEGF domains with the consensus motif CXN4X[F,Y]XCXC are hydroxylated exclusively on the asparagine residue." Q#14752 - CGI_10020265 superfamily 222150 718 743 0.00114938 38.9121 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#14752 - CGI_10020265 superfamily 221695 1562 1585 0.00152551 38.5902 cl18612 cEGF superfamily - - "Complement Clr-like EGF-like; cEGF, or complement Clr-like EGF, domains have six conserved cysteine residues disulfide-bonded into the characteristic pattern 'ababcc'. They are found in blood coagulation proteins such as fibrillin, Clr and Cls, thrombomodulin, and the LDL receptor. The core fold of the EGF domain consists of two small beta-hairpins packed against each other. Two major structural variants have been identified based on the structural context of the C-terminal cysteine residue of disulfide 'c' in the C-terminal hairpin: hEGFs and cEGFs. In cEGFs the C-terminal thiol resides on the C-terminal beta-sheet, resulting in long loop-lengths between the cysteine residues of disulfide 'c', typically C[10+]XC. These longer loop-lengths may have arisen by selective cysteine loss from a four-disulfide EGF template such as laminin or integrin. Tandem cEGF domains have five linking residues between terminal cysteines of adjacent domains. 
cEGF domains may or may not bind calcium in the linker region. cEGF domains with the consensus motif CXN4X[F,Y]XCXC are hydroxylated exclusively on the asparagine residue." Q#14752 - CGI_10020265 superfamily 241578 1362 1403 0.00204813 40.446 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#14752 - CGI_10020265 superfamily 241568 51 103 0.00339288 37.8913 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." 
Q#14752 - CGI_10020265 superfamily 241578 1837 1880 0.00449847 39.6756 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#14753 - CGI_10020266 superfamily 248022 19 429 1.94E-40 150.121 cl17468 Aa_trans superfamily - - "Transmembrane amino acid transporter protein; This transmembrane region is found in many amino acid transporters including UNC-47 and MTR. UNC-47 encodes a vesicular amino butyric acid (GABA) transporter, (VGAT). UNC-47 is predicted to have 10 transmembrane domains. MTR is a N system amino acid transporter system protein involved in methyltryptophan resistance. Other members of this family include proline transporters and amino acid permeases." 
Q#14754 - CGI_10020267 superfamily 247792 16 66 5.10E-08 50.9072 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#14754 - CGI_10020267 superfamily 241563 165 195 0.000430471 39.3848 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#14755 - CGI_10020268 superfamily 241868 171 282 1.47E-51 169.023 cl00447 Nudix_Hydrolase superfamily - - "Nudix hydrolase is a superfamily of enzymes found in all three kingdoms of life, and it catalyzes the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+ for their activity. Members of this family are recognized by a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which forms a structural motif that functions as a metal binding and catalytic site. Substrates of nudix hydrolase include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. 
In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance and "house-cleaning" enzymes. Substrate specificity is used to define child families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. This superfamily consists of at least nine families: IPP (isopentenyl diphosphate) isomerase, ADP ribose pyrophosphatase, mutT pyrophosphohydrolase, coenzyme-A pyrophosphatase, MTH1-7,8-dihydro-8-oxoguanine-triphosphatase, diadenosine tetraphosphate hydrolase, NADH pyrophosphatase, GDP-mannose hydrolase and the c-terminal portion of the mutY adenine glycosylase." Q#14759 - CGI_10020274 superfamily 247755 159 379 6.06E-135 388.39 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." 
Q#14760 - CGI_10020275 superfamily 243092 726 993 6.05E-14 72.3676 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#14760 - CGI_10020275 superfamily 243092 550 831 3.84E-06 48.4852 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#14760 - CGI_10020275 superfamily 243092 366 444 5.55E-06 48.1 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#14761 - CGI_10020276 superfamily 247856 55 92 0.00244352 33.2901 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#14765 - CGI_10002114 superfamily 219579 23 69 9.75E-06 44.0966 cl16001 Afi1 superfamily - - "Docking domain of Afi1 for Arf3 in vesicle trafficking; This domain occurs at the N-terminal of Afi1, a protein necessary for vesicle trafficking in yeast. 
This domain is the interacting region of the protein which binds to Arf3. Afi1 is distributed asymmetrically at the plasma membrane and is required for polarized distribution of Arf3 but not of an Arf3 guanine nucleotide-exchange factor, Yel1p. However, Afi1 is not required for targeting of Arf3 or Yel1p to the plasma membrane. Afi1 functions as an Arf3 polarization-specific adapter and participates in development of polarity. Although Arf3 is the homologue of human Arf6 it does not function in the same way, not being necessary for endocytosis or for mating factor receptor internalisation. In the S phase, however, it is concentrated at the plasma membrane of the emerging bud. Because of its polarized localisation and its critical function in the normal budding pattern of yeast, Arf3 is probably a regulator of vesicle trafficking, which is important for polarized growth." Q#14767 - CGI_10002116 superfamily 242748 209 280 0.00267773 36.6905 cl01853 COG4467 superfamily C - "Regulator of replication initiation timing [Replication, recombination, and repair]" Q#14770 - CGI_10020281 superfamily 243072 1 98 1.87E-18 76.2682 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#14771 - CGI_10020282 superfamily 243082 225 240 3.62E-05 43.3155 cl02553 Peptidase_C19 superfamily C - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquinated peptides by cleavage of isopeptide bonds. 
They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." Q#14772 - CGI_10020283 superfamily 241594 1509 1828 1.00E-88 295.243 cl00077 HECTc superfamily - - "HECT domain; C-terminal catalytic domain of a subclass of Ubiquitin-protein ligase (E3). It binds specific ubiquitin-conjugating enzymes (E2), accepts ubiquitin from E2, transfers ubiquitin to substrate lysine side chains, and transfers additional ubiquitin molecules to the end of growing ubiquitin chains." Q#14772 - CGI_10020283 superfamily 243072 851 980 3.81E-19 86.2834 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#14772 - CGI_10020283 superfamily 243072 757 869 2.65E-16 78.1942 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#14772 - CGI_10020283 superfamily 243072 670 816 1.94E-15 75.4978 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#14772 - CGI_10020283 superfamily 243072 994 1102 2.54E-15 75.1126 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#14772 - CGI_10020283 superfamily 243072 617 707 7.98E-11 62.0158 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#14772 - CGI_10020283 superfamily 243072 1176 1264 8.01E-10 58.9342 cl02529 ANK superfamily N - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#14772 - CGI_10020283 superfamily 243072 86 199 4.53E-05 43.9115 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#14772 - CGI_10020283 superfamily 243072 189 304 0.000279223 41.6003 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#14772 - CGI_10020283 superfamily 243072 281 404 0.00081694 40.0595 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#14772 - CGI_10020283 superfamily 243072 1298 1371 0.00175482 38.9039 cl02529 ANK superfamily C - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#14773 - CGI_10020284 superfamily 245864 27 418 9.07E-95 310.75 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#14773 - CGI_10020284 superfamily 243068 624 895 4.20E-10 60.2516 cl02523 Zona_pellucida superfamily - - Zona pellucida-like domain; Zona pellucida-like domain. 
Q#14774 - CGI_10020285 superfamily 247725 18 235 2.31E-30 121.307 cl17171 PH-like superfamily C - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#14774 - CGI_10020285 superfamily 247725 236 423 3.30E-18 85.4833 cl17171 PH-like superfamily N - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#14775 - CGI_10020286 superfamily 247727 51 159 7.85E-06 44.7283 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." 
Q#14775 - CGI_10020286 superfamily 247727 474 587 0.000905784 38.1799 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#14776 - CGI_10020287 superfamily 242206 43 158 2.59E-31 115.588 cl00938 Rieske superfamily - - "Rieske domain; a [2Fe-2S] cluster binding domain commonly found in Rieske non-heme iron oxygenase (RO) systems such as naphthalene and biphenyl dioxygenases, as well as in plant/cyanobacterial chloroplast b6f and mitochondrial cytochrome bc(1) complexes. The Rieske domain can be divided into two subdomains, with an incomplete six-stranded, antiparallel beta-barrel at one end, and an iron-sulfur cluster binding subdomain at the other. The Rieske iron-sulfur center contains a [2Fe-2S] cluster, which is involved in electron transfer, and is liganded to two histidine and two cysteine residues present in conserved sequences called Rieske motifs. In RO systems, the N-terminal Rieske domain of the alpha subunit acts as an electron shuttle that accepts electrons from a reductase or ferredoxin component and transfers them to the mononuclear iron in the alpha subunit C-terminal domain to be used for catalysis." 
Q#14777 - CGI_10020288 superfamily 242206 124 239 2.91E-33 121.751 cl00938 Rieske superfamily - - "Rieske domain; a [2Fe-2S] cluster binding domain commonly found in Rieske non-heme iron oxygenase (RO) systems such as naphthalene and biphenyl dioxygenases, as well as in plant/cyanobacterial chloroplast b6f and mitochondrial cytochrome bc(1) complexes. The Rieske domain can be divided into two subdomains, with an incomplete six-stranded, antiparallel beta-barrel at one end, and an iron-sulfur cluster binding subdomain at the other. The Rieske iron-sulfur center contains a [2Fe-2S] cluster, which is involved in electron transfer, and is liganded to two histidine and two cysteine residues present in conserved sequences called Rieske motifs. In RO systems, the N-terminal Rieske domain of the alpha subunit acts as an electron shuttle that accepts electrons from a reductase or ferredoxin component and transfers them to the mononuclear iron in the alpha subunit C-terminal domain to be used for catalysis." Q#14779 - CGI_10020290 superfamily 243035 849 960 8.52E-26 104.624 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model."
Q#14779 - CGI_10020290 superfamily 243035 977 1088 5.42E-23 96.5349 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#14779 - CGI_10020290 superfamily 245213 507 537 0.00213745 37.231 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14779 - CGI_10020290 superfamily 246918 658 710 4.53E-14 68.7675 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#14779 - CGI_10020290 superfamily 246918 716 768 1.37E-12 64.5303 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#14779 - CGI_10020290 superfamily 246918 618 653 0.00249469 37.1811 cl15278 TSP_1 superfamily N - Thrombospondin type 1 domain; Thrombospondin type 1 domain. 
Q#14780 - CGI_10020291 superfamily 246918 178 230 1.09E-11 60.2931 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#14780 - CGI_10020291 superfamily 246918 292 343 3.11E-11 59.1375 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#14780 - CGI_10020291 superfamily 246918 235 287 1.58E-10 56.8263 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#14780 - CGI_10020291 superfamily 246918 121 173 2.45E-10 56.4411 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#14780 - CGI_10020291 superfamily 246918 357 401 6.96E-06 43.7295 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#14780 - CGI_10020291 superfamily 246918 64 93 0.000319552 38.7219 cl15278 TSP_1 superfamily C - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#14780 - CGI_10020291 superfamily 246918 7 36 0.000373196 38.3367 cl15278 TSP_1 superfamily C - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#14781 - CGI_10020292 superfamily 243035 116 227 3.35E-28 110.402 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#14781 - CGI_10020292 superfamily 243035 2 99 5.79E-21 89.6013 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#14781 - CGI_10020292 superfamily 246918 578 629 3.92E-12 62.6043 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#14781 - CGI_10020292 superfamily 246918 407 459 1.43E-10 57.9819 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#14781 - CGI_10020292 superfamily 246918 302 345 3.50E-08 51.0483 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#14781 - CGI_10020292 superfamily 246918 635 686 2.71E-05 42.5739 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#14781 - CGI_10020292 superfamily 246918 530 558 0.000207214 40.2627 cl15278 TSP_1 superfamily C - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#14781 - CGI_10020292 superfamily 246918 253 288 0.0014021 37.5663 cl15278 TSP_1 superfamily N - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#14782 - CGI_10020294 superfamily 246918 14 66 1.72E-14 64.1451 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. 
Q#14783 - CGI_10020295 superfamily 216653 354 494 5.73E-20 86.1118 cl08331 Na_Ca_ex superfamily - - "Sodium/calcium exchanger protein; This is a family of sodium/calcium exchanger integral membrane proteins. This family covers the integral membrane regions of the proteins. Sodium/calcium exchangers regulate intracellular Ca2+ concentrations in many cells; cardiac myocytes, epithelial cells, neurons retinal rod photoreceptors and smooth muscle cells. Ca2+ is moved into or out of the cytosol depending on Na+ concentration. In humans and rats there are 3 isoforms; NCX1 NCX2 and NCX3." Q#14783 - CGI_10020295 superfamily 216653 100 181 9.13E-17 76.8671 cl08331 Na_Ca_ex superfamily C - "Sodium/calcium exchanger protein; This is a family of sodium/calcium exchanger integral membrane proteins. This family covers the integral membrane regions of the proteins. Sodium/calcium exchangers regulate intracellular Ca2+ concentrations in many cells; cardiac myocytes, epithelial cells, neurons retinal rod photoreceptors and smooth muscle cells. Ca2+ is moved into or out of the cytosol depending on Na+ concentration. In humans and rats there are 3 isoforms; NCX1 NCX2 and NCX3." Q#14784 - CGI_10020296 superfamily 219086 2 144 7.65E-18 76.0724 cl05857 BNIP3 superfamily - - "BNIP3; This family consists of several mammalian specific BCL2/adenovirus E1B 19-kDa protein-interacting protein 3 or BNIP3 sequences. BNIP3 belongs to the Bcl-2 homology 3 (BH3)-only family, a Bcl-2-related family possessing an atypical Bcl-2 homology 3 (BH3) domain, which regulates PCD from mitochondrial sites by selective Bcl-2/Bcl-XL interactions. BNIP3 family members contain a C-terminal transmembrane domain that is required for their mitochondrial localisation, homodimerisation, as well as regulation of their pro-apoptotic activities. 
BNIP3-mediated apoptosis has been reported to be independent of caspase activation and cytochrome c release and is characterized by early plasma membrane and mitochondrial damage, prior to the appearance of chromatin condensation or DNA fragmentation." Q#14785 - CGI_10020297 superfamily 238012 220 271 1.38E-10 58.5198 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#14785 - CGI_10020297 superfamily 238012 120 172 8.71E-08 50.4306 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#14785 - CGI_10020297 superfamily 238012 173 212 4.04E-05 42.3414 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly 
variable ranging from 3 up to 22 copies" Q#14785 - CGI_10020297 superfamily 245201 732 935 9.50E-51 180.04 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#14785 - CGI_10020297 superfamily 247038 306 402 8.09E-07 47.9893 cl15674 IPT superfamily - - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." Q#14785 - CGI_10020297 superfamily 247038 404 464 1.34E-05 44.3301 cl15674 IPT superfamily C - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. 
In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." Q#14792 - CGI_10020304 superfamily 243073 311 354 0.00215091 35.5491 cl02533 SOCS superfamily - - "SOCS (suppressors of cytokine signaling) box. The SOCS box is found in the C-terminal region of CIS/SOCS family proteins (in combination with a SH2 domain), ASBs (ankyrin repeat-containing proteins with a SOCS box), SSBs (SPRY domain-containing proteins with a SOCS box), and WSBs (WD40 repeat-containing proteins with a SOCS box), as well as, other miscellaneous proteins. The function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions." Q#14793 - CGI_10001789 superfamily 241584 802 882 5.92E-10 59.0471 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#14793 - CGI_10001789 superfamily 241584 693 767 2.63E-06 47.8763 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. 
Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#14793 - CGI_10001789 superfamily 241584 1475 1560 1.96E-05 45.1799 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#14793 - CGI_10001789 superfamily 241584 331 420 5.35E-05 44.0243 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#14793 - CGI_10001789 superfamily 241584 926 1018 0.000263697 41.7131 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. 
Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#14793 - CGI_10001789 superfamily 241584 425 514 0.000996139 39.7871 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#14793 - CGI_10001789 superfamily 241832 2001 2125 0.00130823 39.527 cl00388 Thioredoxin_like superfamily N - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." 
Q#14793 - CGI_10001789 superfamily 241584 1046 1131 0.00482509 37.5941 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#14794 - CGI_10003044 superfamily 243092 70 107 1.52E-05 42.2994 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#14794 - CGI_10003044 superfamily 243092 377 439 0.000943363 39.6256 cl02567 WD40 superfamily NC - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#14796 - CGI_10007526 superfamily 241563 135 171 0.00103409 37.4588 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#14797 - CGI_10007527 superfamily 243092 157 276 0.000682227 39.6256 cl02567 WD40 superfamily NC - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#14798 - CGI_10007528 superfamily 217953 43 285 8.62E-66 210.055 cl04440 DUF410 superfamily - - Protein of unknown function (DUF410); This family of proteins is from Caenorhabditis elegans and has no known function. The protein has some GO references indicating that the protein has a positive regulation of growth rate and is involved in nematode larval development. Q#14799 - CGI_10007529 superfamily 241599 104 162 9.77E-25 95.388 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." 
Q#14802 - CGI_10007533 superfamily 241568 49 77 0.00345078 34.4245 cl00043 CCP superfamily N - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#14803 - CGI_10007534 superfamily 214531 293 333 4.90E-10 54.5301 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#14803 - CGI_10007534 superfamily 215683 267 307 7.80E-06 42.5423 cl18339 Ldl_recept_b superfamily - - Low-density lipoprotein receptor repeat class B; This domain is also known as the YWTD motif after the most conserved region of the repeat. The YWTD repeat is found in multiple tandem repeats and has been predicted to form a beta-propeller structure. Q#14807 - CGI_10002798 superfamily 242889 208 277 2.50E-12 61.1039 cl02111 PCI superfamily N - "PCI domain; This domain has also been called the PINT motif (Proteasome, Int-6, Nip-1 and TRIP-15)." Q#14810 - CGI_10003195 superfamily 241578 1065 1226 1.57E-40 150.52 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. 
The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#14810 - CGI_10003195 superfamily 241578 41 202 1.05E-38 145.127 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." 
Q#14810 - CGI_10003195 superfamily 241578 664 825 1.05E-38 145.127 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#14810 - CGI_10003195 superfamily 241578 1902 2063 3.03E-31 123.556 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. 
Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#14810 - CGI_10003195 superfamily 241578 2770 2934 8.56E-24 101.599 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#14810 - CGI_10003195 superfamily 241578 1291 1436 1.94E-22 97.7474 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#14810 - CGI_10003195 superfamily 241578 241 401 2.48E-20 91.5842 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#14810 - CGI_10003195 superfamily 241578 2128 2279 3.94E-20 91.199 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. 
The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#14810 - CGI_10003195 superfamily 241578 452 606 2.70E-18 85.421 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." 
Q#14810 - CGI_10003195 superfamily 241578 1704 1844 1.58E-17 83.495 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#14810 - CGI_10003195 superfamily 241578 1492 1648 3.35E-17 82.3394 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. 
Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#14810 - CGI_10003195 superfamily 245213 2085 2120 2.54E-12 64.9654 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14810 - CGI_10003195 superfamily 245213 2310 2345 3.80E-11 61.4986 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14810 - CGI_10003195 superfamily 241578 2578 2747 1.92E-39 147.433 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. 
The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#14810 - CGI_10003195 superfamily 241578 860 1012 2.89E-24 103.622 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." 
Q#14810 - CGI_10003195 superfamily 241578 2353 2507 8.95E-20 90.2064 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#14810 - CGI_10003195 superfamily 245213 2538 2571 1.38E-07 51.0982 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14812 - CGI_10003771 superfamily 217643 1 262 1.25E-134 391.906 cl04182 Solute_trans_a superfamily - - "Organic solute transporter Ostalpha; This family is a transmembrane organic solute transport protein. 
In vertebrates these proteins form a complex with Ostbeta, and function as bile transporters. In plants they may transport brassinosteroid-like compounds and act as regulators of cell death." Q#14813 - CGI_10003772 superfamily 219226 338 535 5.66E-36 132.423 cl06118 Senescence superfamily - - "Senescence-associated protein; This family contains a number of plant senescence-associated proteins of approximately 450 residues in length. In Hemerocallis, petals have a genetically based program that leads to senescence and cell death approximately 24 hours after the flower opens, and it is believed that senescence proteins produced around that time have a role in this program. This family extends to the higher vertebrates where the full-length protein is often a Spartin, associated with mitochondrial membranes and transportation along microtubules." Q#14813 - CGI_10003772 superfamily 241764 40 115 6.79E-06 44.1907 cl00299 MIT superfamily - - "MIT: domain contained within Microtubule Interacting and Trafficking molecules. The MIT domain is found in sorting nexins, the nuclear thiol protease PalBH, the AAA protein spastin and archaebacterial proteins with similar domain architecture, vacuolar sorting proteins and others. The molecular function of the MIT domain is unclear." Q#14814 - CGI_10003773 superfamily 247068 1771 1871 6.53E-31 121.266 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. 
Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#14814 - CGI_10003773 superfamily 247068 1980 2076 3.30E-30 118.955 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#14814 - CGI_10003773 superfamily 247068 1667 1763 3.83E-29 115.874 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#14814 - CGI_10003773 superfamily 247068 2195 2291 1.94E-28 113.948 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#14814 - CGI_10003773 superfamily 247068 2936 3033 1.59E-26 108.555 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#14814 - CGI_10003773 superfamily 247068 997 1092 2.45E-26 107.784 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#14814 - CGI_10003773 superfamily 247068 1315 1410 2.59E-26 107.784 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#14814 - CGI_10003773 superfamily 247068 1209 1307 8.24E-26 106.244 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#14814 - CGI_10003773 superfamily 247068 1883 1972 2.90E-25 104.703 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#14814 - CGI_10003773 superfamily 247068 2085 2186 1.13E-23 100.081 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#14814 - CGI_10003773 superfamily 247068 2506 2606 1.90E-23 99.3101 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#14814 - CGI_10003773 superfamily 247068 584 684 2.49E-23 99.3101 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#14814 - CGI_10003773 superfamily 247068 479 575 4.22E-23 98.5397 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#14814 - CGI_10003773 superfamily 247068 1420 1519 1.16E-21 94.3025 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#14814 - CGI_10003773 superfamily 247068 2829 2924 7.06E-21 91.9913 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#14814 - CGI_10003773 superfamily 247068 1100 1201 7.95E-21 91.9913 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#14814 - CGI_10003773 superfamily 247068 3041 3141 8.45E-21 91.9913 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#14814 - CGI_10003773 superfamily 247068 3257 3354 4.12E-20 89.6801 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#14814 - CGI_10003773 superfamily 247068 2613 2710 2.04E-19 87.7541 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#14814 - CGI_10003773 superfamily 247907 3833 3998 5.35E-19 88.244 cl17353 LamG superfamily - - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." 
Q#14814 - CGI_10003773 superfamily 247068 3150 3248 7.96E-19 86.2133 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#14814 - CGI_10003773 superfamily 247068 2406 2498 5.31E-18 83.5169 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#14814 - CGI_10003773 superfamily 247068 252 356 1.45E-17 82.3613 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#14814 - CGI_10003773 superfamily 247068 370 470 1.68E-17 82.3613 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#14814 - CGI_10003773 superfamily 245213 3753 3789 7.26E-17 78.8326 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#14814 - CGI_10003773 superfamily 247068 796 885 7.28E-16 77.3537 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#14814 - CGI_10003773 superfamily 247068 1570 1658 3.61E-15 75.4277 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#14814 - CGI_10003773 superfamily 247068 3364 3465 3.85E-15 75.4277 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#14814 - CGI_10003773 superfamily 247068 2717 2820 2.29E-14 73.1165 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#14814 - CGI_10003773 superfamily 247068 144 244 2.77E-14 72.7313 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#14814 - CGI_10003773 superfamily 247068 693 783 1.54E-13 70.8053 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#14814 - CGI_10003773 superfamily 247068 896 988 1.72E-11 64.6421 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#14814 - CGI_10003773 superfamily 247068 44 131 1.11E-10 62.3309 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#14814 - CGI_10003773 superfamily 247068 2308 2391 4.18E-10 60.4049 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#14814 - CGI_10003773 superfamily 247907 4064 4224 6.58E-10 60.8949 cl17353 LamG superfamily - - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." 
Q#14814 - CGI_10003773 superfamily 245213 3793 3827 4.07E-08 53.4094 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14814 - CGI_10003773 superfamily 245213 4027 4059 8.03E-07 49.5574 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14814 - CGI_10003773 superfamily 245213 3728 3751 0.000983907 40.3126 cl09941 EGF_CA superfamily N - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#14815 - CGI_10003774 superfamily 241578 22 192 8.89E-38 133.507 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers, while in vWF they form multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#14816 - CGI_10001913 superfamily 245847 24 165 1.92E-09 55.5863 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#14817 - CGI_10002289 superfamily 201885 59 226 1.72E-37 131.206 cl12249 I_LWEQ superfamily - - "I/LWEQ domain; I/LWEQ domains bind to actin. It has been shown that the I/LWEQ domains from mouse talin and yeast Sla2p interact with F-actin. I/LWEQ domains can be placed into four major groups based on sequence similarity: (1) Metazoan talin; (2) Dictyostelium TalA/TalB and SLA110; (3) metazoan Hip1p; and (4) yeast Sla2p. The domain has four conserved blocks; the name of the domain is derived from the initial conserved amino acid of each of the four blocks."
Q#14818 - CGI_10001177 superfamily 222313 109 160 1.62E-08 51.4238 cl18662 Methyltransf_32 superfamily C - Methyltransferase domain; This family appears to be a methyltransferase domain. Q#14820 - CGI_10007313 superfamily 247725 154 274 1.57E-66 218.763 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting proteins to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and other proteins." Q#14820 - CGI_10007313 superfamily 246669 40 135 0.000346526 40.1279 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions."
Q#14820 - CGI_10007313 superfamily 218976 470 562 2.57E-36 133.728 cl05671 DUF1041 superfamily - - "Domain of Unknown Function (DUF1041); This family consists of several eukaryotic domains of unknown function. Members of this family are often found in tandem repeats and co-occur with pfam00168, pfam00130 and pfam00169 domains." Q#14823 - CGI_10003433 superfamily 247727 49 111 2.08E-25 96.3757 cl17173 AdoMet_MTases superfamily C - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#14824 - CGI_10003434 superfamily 205431 296 478 1.07E-61 208.704 cl16184 DUF4042 superfamily - - "Domain of unknown function (DUF4042); This presumed domain is functionally uncharacterized. This domain family is found in eukaryotes, and is approximately 180 amino acids in length." Q#14825 - CGI_10003435 superfamily 222150 456 480 2.06E-05 42.7641 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#14825 - CGI_10003435 superfamily 222150 428 449 2.32E-05 42.7641 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#14825 - CGI_10003435 superfamily 222150 343 367 3.33E-05 42.3789 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#14825 - CGI_10003435 superfamily 222150 370 395 8.94E-05 41.2233 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. 
Q#14825 - CGI_10003435 superfamily 222150 569 594 0.000533044 38.9121 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#14825 - CGI_10003435 superfamily 222150 398 421 0.00733108 35.4453 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#14828 - CGI_10004692 superfamily 219669 196 279 3.45E-07 46.6171 cl06832 Integrin_B_tail superfamily - - Integrin beta tail domain; This is the beta tail domain of the Integrin protein. Integrins are receptors which are involved in cell-cell and cell-extracellular matrix interactions. Q#14829 - CGI_10004693 superfamily 248247 53 123 0.00158819 37.3001 cl17693 Integrin_beta superfamily C - "Integrin, beta chain; Integrins have been found in animals and their homologues have also been found in cyanobacteria, probably due to horizontal gene transfer. The sequences repeats have been trimmed due to an overlap with EGF." Q#14830 - CGI_10004694 superfamily 246723 370 1004 0 791.774 cl14813 GluZincin superfamily - - "Peptidase Gluzincin family (thermolysin-like proteinases, TLPs) includes peptidases M1, M2, M3, M4, M13, M32 and M36 (fungalysins); Gluzincin family (thermolysin-like peptidases or TLPs) includes several zinc-dependent metallopeptidases such as the M1, M2, M3, M4, M13, M32, M36 peptidases (MEROPS classification), and contain HEXXH and EXXXD motifs as part of their active site. All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. M1 family includes aminopeptidase N (APN) and leukotriene A4 hydrolase (LTA4H). APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types. 
LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity such that the two activities occupy different, but overlapping sites. The peptidase M3 or neurolysin-like family includes M3, M2 and M32 metallopeptidases. The M3 peptidases have two subfamilies: M3A, includes thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (3.4.24.16), and the mitochondrial intermediate peptidase; M3B contains oligopeptidase F. M2 peptidase angiotensin converting enzyme (ACE, EC 3.4.15.1) catalyzes the conversion of decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II. ACE is a key part of the renin-angiotensin system that regulates blood pressure, thus ACE inhibitors are important for the treatment of hypertension. M32 family includes two eukaryotic enzymes from protozoa Trypanosoma cruzi, a causative agent of Chagas' disease, and Leishmania major, a parasite that causes leishmaniasis, making them attractive targets for drug development. The M4 family includes secreted protease thermolysin (EC 3.4.24.27), pseudolysin, aureolysin, neutral protease as well as fungalysin and bacillolysin (EC 3.4.24.28) that degrade extracellular proteins and peptides for bacterial nutrition, especially prior to sporulation. Thermolysin is widely used as a nonspecific protease to obtain fragments for peptide sequencing as well as in production of the artificial sweetener aspartame. M13 family includes neprilysin (EC 3.4.24.11) and endothelin-converting enzyme I (ECE-1, EC 3.4.24.71), which fulfill a broad range of physiological roles due to the greater variation in the S2' subsite allowing substrate specificity and are prime therapeutic targets for selective inhibition. Peptidase M36 (fungalysin) family includes endopeptidases from pathogenic fungi. Fungalysin hydrolyzes extracellular matrix proteins such as elastin and keratin. 
Aspergillus fumigatus causes the pulmonary disease aspergillosis by invading the lungs of immuno-compromised animals and secreting fungalysin that possibly breaks down proteinaceous structural barriers." Q#14831 - CGI_10004695 superfamily 243109 1 164 1.29E-115 327.923 cl02614 SPRY superfamily - - "SPRY domain; SPRY domains, first identified in the SP1A kinase of Dictyostelium and rabbit Ryanodine receptor (hence the name), are homologous to B30.2. SPRY domains have been identified in at least 11 protein families, covering a wide range of functions, including regulation of cytokine signaling (SOCS), RNA metabolism (DDX1 and hnRNP), immunity to retroviruses (TRIM5alpha), intracellular calcium release (ryanodine receptors or RyR) and regulatory and developmental processes (HERC1 and Ash2L). B30.2 also contains residues in the N-terminus that form a distinct PRY domain structure; i.e. B30.2 domain consists of PRY and SPRY subdomains. B30.2 domains comprise the C-terminus of three protein families: BTNs (receptor glycoproteins of immunoglobulin superfamily); several TRIM proteins (composed of RING/B-box/coiled-coil or RBCC core); Stonutoxin (secreted poisonous protein of the stonefish Synanceia horrida). While SPRY domains are evolutionarily ancient, B30.2 domains are a more recent adaptation where the SPRY/PRY combination is a possible component of immune defense. Mutations found in the SPRY-containing proteins have shown to cause Mediterranean fever and Opitz syndrome." Q#14833 - CGI_10003065 superfamily 241691 417 537 0.000204367 41.3436 cl00213 DNA_BRE_C superfamily N - "DNA breaking-rejoining enzymes, C-terminal catalytic domain. The DNA breaking-rejoining enzyme superfamily includes type IB topoisomerases and tyrosine recombinases that share the same fold in their catalytic domain containing six conserved active site residues. 
The best-studied members of this diverse superfamily include human topoisomerase I, the bacteriophage lambda integrase, the bacteriophage P1 Cre recombinase, the yeast Flp recombinase and the bacterial XerD/C recombinases. Their overall reaction mechanism is essentially identical and involves cleavage of a single strand of a DNA duplex by nucleophilic attack of a conserved tyrosine to give a 3' phosphotyrosyl protein-DNA adduct. In the second rejoining step, a terminal 5' hydroxyl attacks the covalent adduct to release the enzyme and generate duplex DNA. The enzymes differ in that topoisomerases cleave and then rejoin the same 5' and 3' termini, whereas a site-specific recombinase transfers a 5' hydroxyl generated by recombinase cleavage to a new 3' phosphate partner located in a different duplex region. Many DNA breaking-rejoining enzymes also have N-terminal domains, which show little sequence or structure similarity." Q#14835 - CGI_10026021 superfamily 247724 512 714 4.58E-35 134.069 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. 
The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#14835 - CGI_10026021 superfamily 248054 6 221 3.00E-10 59.6235 cl17500 NAD_binding_8 superfamily - - NAD(P)-binding Rossmann-like domain; NAD(P)-binding Rossmann-like domain. Q#14836 - CGI_10026022 superfamily 248363 7 131 7.21E-39 129.27 cl17809 Sedlin_N superfamily - - "Sedlin, N-terminal conserved region; Mutations in this protein are associated with the X-linked spondyloepiphyseal dysplasia tarda syndrome (OMIM:313400). This family represents an N-terminal conserved region." Q#14837 - CGI_10026023 superfamily 243173 389 603 1.22E-113 350.101 cl02774 Topoisomer_IB_N superfamily - - "Topoisomer_IB_N: N-terminal DNA binding fragment found in eukaryotic DNA topoisomerase (topo) IB proteins similar to the monomeric yeast and human topo I and heterodimeric topo I from Leishmania donovani. Topo I enzymes are divided into: topo type IA (bacterial) and type IB (eukaryotic). Topo I relaxes superhelical tension in duplex DNA by creating a single-strand nick; the broken strand can then rotate around the unbroken strand to remove DNA supercoils, and the nick is religated, liberating topo I. These enzymes regulate the topological changes that accompany DNA replication, transcription and other nuclear processes. Human topo I is the target of a diverse set of anticancer drugs including camptothecins (CPTs). CPTs bind to the topo I-DNA complex and inhibit re-ligation of the single-strand nick, resulting in the accumulation of topo I-DNA adducts. 
In addition to differences in structure and some biochemical properties, Trypanosomatid parasite topo I differ from human topo I in their sensitivity to CPTs and other classical topo I inhibitors. Trypanosomatid topos I play putative roles in organizing the kinetoplast DNA network unique to these parasites. This family may represent more than one structural domain." Q#14837 - CGI_10026023 superfamily 241691 613 812 6.63E-78 254.114 cl00213 DNA_BRE_C superfamily - - "DNA breaking-rejoining enzymes, C-terminal catalytic domain. The DNA breaking-rejoining enzyme superfamily includes type IB topoisomerases and tyrosine recombinases that share the same fold in their catalytic domain containing six conserved active site residues. The best-studied members of this diverse superfamily include human topoisomerase I, the bacteriophage lambda integrase, the bacteriophage P1 Cre recombinase, the yeast Flp recombinase and the bacterial XerD/C recombinases. Their overall reaction mechanism is essentially identical and involves cleavage of a single strand of a DNA duplex by nucleophilic attack of a conserved tyrosine to give a 3' phosphotyrosyl protein-DNA adduct. In the second rejoining step, a terminal 5' hydroxyl attacks the covalent adduct to release the enzyme and generate duplex DNA. The enzymes differ in that topoisomerases cleave and then rejoin the same 5' and 3' termini, whereas a site-specific recombinase transfers a 5' hydroxyl generated by recombinase cleavage to a new 3' phosphate partner located in a different duplex region. Many DNA breaking-rejoining enzymes also have N-terminal domains, which show little sequence or structure similarity." Q#14837 - CGI_10026023 superfamily 206538 867 938 1.55E-36 133.121 cl16833 Topo_C_assoc superfamily - - C-terminal topoisomerase domain; This domain is found at the C-terminal of topoisomerase and other similar enzymes. 
Q#14838 - CGI_10026024 superfamily 203031 97 157 3.10E-05 40.3892 cl04548 FLYWCH superfamily - - "FLYWCH zinc finger domain; Mutations in the mod(mdg4) gene have effects on variegation (PEV), the properties of insulator sequences, correct path-finding of growing nerve cells, meiotic pairing of chromosomes, and apoptosis. The occurrence of FLYWCH motifs in mod(mdg4) gene product and other proteins is discussed in." Q#14839 - CGI_10026025 superfamily 241748 178 411 1.51E-92 296.95 cl00279 APP_MetAP superfamily - - "A family including aminopeptidase P, aminopeptidase M, and prolidase. Also known as metallopeptidase family M24. This family of enzymes is able to cleave amido-, imido- and amidino-containing bonds. Members exhibit relatively narrow substrate specificity compared to other metallo-aminopeptidases, suggesting they play roles in regulation of biological processes rather than general protein degradation." Q#14839 - CGI_10026025 superfamily 219951 521 681 3.98E-68 225.191 cl07315 SPT16 superfamily - - FACT complex subunit (SPT16/CDC68); Proteins in this family are subunits of the FACT complex. The FACT complex plays a role in transcription initiation and promotes binding of TATA-binding protein (TBP) to a TATA box in chromatin. Q#14839 - CGI_10026025 superfamily 247725 798 888 8.97E-27 106.456 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting proteins to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and other proteins."
Q#14843 - CGI_10026029 superfamily 245213 168 204 6.90E-08 48.4018 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14843 - CGI_10026029 superfamily 245213 130 166 1.99E-07 47.2462 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#14843 - CGI_10026029 superfamily 245213 207 242 2.56E-06 44.1646 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#14845 - CGI_10026031 superfamily 247755 50 259 1.20E-72 235.628 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#14845 - CGI_10026031 superfamily 247789 386 593 5.85E-30 117.745 cl17235 ABC2_membrane superfamily - - ABC-2 type transporter; ABC-2 type transporter. Q#14848 - CGI_10026034 superfamily 246908 179 260 6.42E-05 42.4427 cl15255 SH2 superfamily - - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." Q#14848 - CGI_10026034 superfamily 246680 972 1048 0.00465668 36.5446 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. 
DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#14848 - CGI_10026034 superfamily 247724 326 558 1.77E-14 72.3687 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." 
Q#14848 - CGI_10026034 superfamily 246908 56 147 0.00128676 38.3639 cl15255 SH2 superfamily - - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." Q#14848 - CGI_10026034 superfamily 247803 244 359 0.00850285 36.5938 cl17249 YlqF_related_GTPase superfamily - - "Circularly permuted YlqF-related GTPases; These proteins are found in bacteria, eukaryotes, and archaea. They all exhibit a circular permutation of the GTPase signature motifs so that the order of the conserved G box motifs is G4-G5-G1-G2-G3, with G4 and G5 being permuted from the C-terminal region of proteins in the Ras superfamily to the N-terminus of YlqF-related GTPases." Q#14849 - CGI_10026035 superfamily 243072 15 158 6.09E-27 103.617 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#14849 - CGI_10026035 superfamily 247792 333 369 0.00974259 33.5732 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#14849 - CGI_10026035 superfamily 247792 242 285 3.69E-08 49.6856 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#14849 - CGI_10026035 superfamily 247792 195 238 1.90E-05 41.5964 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which 
have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#14850 - CGI_10026036 superfamily 243072 501 620 8.59E-32 120.566 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#14850 - CGI_10026036 superfamily 241760 94 138 2.61E-26 102.537 cl00295 ZZ superfamily - - "Zinc finger, ZZ type. Zinc finger present in dystrophin, CBP/p300 and many other proteins. The ZZ motif coordinates one or two zinc ions and most likely participates in ligand binding or molecular scaffolding. Many proteins containing ZZ motifs have other zinc-binding motifs as well, and the majority serve as scaffolds in pathways involving acetyltransferase, protein kinase, or ubiquitin-related activity. ZZ proteins can be grouped into the following functional classes: chromatin modifying, cytoskeletal scaffolding, ubiquitin binding or conjugating, and membrane receptor or ion-channel modifying proteins." Q#14850 - CGI_10026036 superfamily 115363 26 85 1.19E-24 98.2129 cl05972 MIB_HERC2 superfamily - - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). Q#14850 - CGI_10026036 superfamily 115363 165 232 3.59E-24 97.0573 cl05972 MIB_HERC2 superfamily - - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). 
Q#14852 - CGI_10026038 superfamily 242206 8 99 1.88E-15 67.8991 cl00938 Rieske superfamily - - "Rieske domain; a [2Fe-2S] cluster binding domain commonly found in Rieske non-heme iron oxygenase (RO) systems such as naphthalene and biphenyl dioxygenases, as well as in plant/cyanobacterial chloroplast b6f and mitochondrial cytochrome bc(1) complexes. The Rieske domain can be divided into two subdomains, with an incomplete six-stranded, antiparallel beta-barrel at one end, and an iron-sulfur cluster binding subdomain at the other. The Rieske iron-sulfur center contains a [2Fe-2S] cluster, which is involved in electron transfer, and is liganded to two histidine and two cysteine residues present in conserved sequences called Rieske motifs. In RO systems, the N-terminal Rieske domain of the alpha subunit acts as an electron shuttle that accepts electrons from a reductase or ferredoxin component and transfers them to the mononuclear iron in the alpha subunit C-terminal domain to be used for catalysis." Q#14853 - CGI_10026039 superfamily 246597 18 312 3.39E-86 263.001 cl13995 MPP_superfamily superfamily - - "metallophosphatase superfamily, metallophosphatase domain; Metallophosphatases (MPPs), also known as metallophosphoesterases, phosphodiesterases (PDEs), binuclear metallophosphoesterases, and dimetal-containing phosphoesterases (DMPs), represent a diverse superfamily of enzymes with a conserved domain containing an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. This superfamily includes: the phosphoprotein phosphatases (PPPs), Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). 
The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination." Q#14855 - CGI_10026041 superfamily 241607 483 519 1.77E-06 45.7238 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chymotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#14855 - CGI_10026041 superfamily 241607 532 559 0.000338376 39.1754 cl00097 KAZAL_FS superfamily C - "Kazal type serine protease inhibitors and follistatin-like domains. 
Kazal inhibitors inhibit serine proteases, such as, trypsin, chymotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#14855 - CGI_10026041 superfamily 241607 626 643 0.000548575 38.405 cl00097 KAZAL_FS superfamily C - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chymotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. 
These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#14855 - CGI_10026041 superfamily 217473 46 333 7.49E-06 46.9745 cl03978 Mab-21 superfamily - - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#14857 - CGI_10026043 superfamily 247724 46 222 6.66E-122 346.958 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#14860 - CGI_10026046 superfamily 243092 141 263 4.09E-08 52.7224 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#14861 - CGI_10026047 superfamily 247058 34 227 1.29E-74 228.984 cl15762 crotonase-like superfamily - - "Crotonase/Enoyl-Coenzyme A (CoA) hydratase superfamily. This superfamily contains a diverse set of enzymes including enoyl-CoA hydratase, naphthoate synthase, methylmalonyl-CoA decarboxylase, 3-hydroxybutyryl-CoA dehydratase, and dienoyl-CoA isomerase. Many of these play important roles in fatty acid metabolism. In addition to a conserved structural core and the formation of trimers (or dimers of trimers), a common feature in this superfamily is the stabilization of an enolate anion intermediate derived from an acyl-CoA substrate. This is accomplished by two conserved backbone NH groups in active sites that form an oxyanion hole." Q#14862 - CGI_10026048 superfamily 241832 363 464 8.60E-51 170.045 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." 
Q#14862 - CGI_10026048 superfamily 241832 25 124 8.25E-41 142.75 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#14862 - CGI_10026048 superfamily 241832 135 227 2.32E-21 88.9338 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). 
Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#14862 - CGI_10026048 superfamily 241832 272 341 3.21E-15 71.5338 cl00388 Thioredoxin_like superfamily N - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#14863 - CGI_10026049 superfamily 150787 32 540 9.20E-153 450.041 cl10851 DUF2362 superfamily - - Uncharacterized conserved protein (DUF2362); This is a family of proteins conserved from nematodes to humans. The function is not known. Q#14865 - CGI_10026051 superfamily 245599 522 695 2.70E-38 140.436 cl11397 NR_LBD superfamily - - "The ligand binding domain of nuclear receptors, a family of ligand-activated transcription regulators; Ligand-binding domain (LBD) of nuclear receptor (NR): Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions in metazoans, from development, reproduction, to homeostasis and metabolism. 
The superfamily contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. The members of the family include receptors of steroids, thyroid hormone, retinoids, cholesterol by-products, lipids and heme. With few exceptions, NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD)." Q#14865 - CGI_10026051 superfamily 207662 42 114 8.06E-35 127.293 cl02596 NR_DBD_like superfamily - - "DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers; DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. It interacts with a specific DNA site upstream of the target gene and modulates the rate of transcriptional initiation. Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions, from development, reproduction, to homeostasis and metabolism in animals (metazoans). The family contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). Most nuclear receptors bind as homodimers or heterodimers to their target sites, which consist of two hexameric half-sites. Specificity is determined by the half-site sequence, the relative orientation of the half-sites and the number of spacer nucleotides between the half-sites. However, a growing number of nuclear receptors have been reported to bind to DNA as monomers." 
Q#14866 - CGI_10026052 superfamily 218721 39 291 3.18E-32 126.462 cl05344 TROVE superfamily C - "TROVE domain; This presumed domain is found in TEP1 and Ro60 proteins, that are RNA-binding components of Telomerase, Ro and Vault RNPs. This domain has been named TROVE, (after Telomerase, Ro and Vault). This domain is probably RNA-binding." Q#14866 - CGI_10026052 superfamily 218721 282 479 3.07E-13 69.8374 cl05344 TROVE superfamily N - "TROVE domain; This presumed domain is found in TEP1 and Ro60 proteins, that are RNA-binding components of Telomerase, Ro and Vault RNPs. This domain has been named TROVE, (after Telomerase, Ro and Vault). This domain is probably RNA-binding." Q#14867 - CGI_10026053 superfamily 218721 13 351 3.24E-51 178.078 cl05344 TROVE superfamily - - "TROVE domain; This presumed domain is found in TEP1 and Ro60 proteins, that are RNA-binding components of Telomerase, Ro and Vault RNPs. This domain has been named TROVE, (after Telomerase, Ro and Vault). This domain is probably RNA-binding." Q#14868 - CGI_10026054 superfamily 218721 2 341 7.16E-58 197.724 cl05344 TROVE superfamily - - "TROVE domain; This presumed domain is found in TEP1 and Ro60 proteins, that are RNA-binding components of Telomerase, Ro and Vault RNPs. This domain has been named TROVE, (after Telomerase, Ro and Vault). This domain is probably RNA-binding." Q#14868 - CGI_10026054 superfamily 241578 345 478 5.24E-05 42.4416 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#14869 - CGI_10026055 superfamily 242102 1 129 6.75E-32 115.837 cl00801 FlaRed superfamily - - "Flavin Reductases; Flavin reductase catalyzes the reduction of FMN or FAD using NAD(P)H as a cofactor. It is part of a two-component enzyme system, composed of a smaller component (flavin reductase) and a larger component (an oxidase), which uses the reduced flavin to hydroxylate substrates." Q#14870 - CGI_10026056 superfamily 245212 484 526 4.82E-07 47.6294 cl09940 S4 superfamily C - "S4/Hsp/ tRNA synthetase RNA-binding domain; The domain surface is populated by conserved, charged residues that define a likely RNA-binding site; Found in stress proteins, ribosomal proteins and tRNA synthetases; This may imply a hitherto unrecognized functional similarity between these three protein classes." Q#14870 - CGI_10026056 superfamily 215762 385 483 1.80E-26 103.559 cl02814 Ribosomal_S4 superfamily - - Ribosomal protein S4/S9 N-terminal domain; This family includes small ribosomal subunit S9 from prokaryotes and S16 from metazoans. This domain is predicted to bind to ribosomal RNA. This domain is composed of four helices in the known structure. However the domain is discontinuous in sequence and the alignment for this family contains only the first three helices. 
Q#14872 - CGI_10026058 superfamily 245087 4 256 1.49E-98 291.07 cl09515 PCNA superfamily - - "Proliferating Cell Nuclear Antigen (PCNA) domain found in eukaryotes and archaea. These polymerase processivity factors play a role in DNA replication and repair. PCNA encircles duplex DNA in its central cavity, providing a DNA-bound platform for the attachment of the polymerase. The trimeric PCNA ring is structurally similar to the dimeric ring formed by the DNA polymerase processivity factors in bacteria (beta subunit DNA polymerase III holoenzyme) and in bacteriophages (catalytic subunits in T4 and RB69). This structural correspondence further substantiates the mechanistic connection between eukaryotic and prokaryotic DNA replication that has been suggested on biochemical grounds. PCNA is also involved with proteins involved in cell cycle processes such as DNA repair and apoptosis. Many of these proteins contain a highly conserved motif known as the PIP-box (PCNA interacting protein box) which contains the sequence Qxx[LIM]xxF[FY]." Q#14874 - CGI_10026060 superfamily 246680 825 907 1.61E-39 142.073 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." 
Q#14874 - CGI_10026060 superfamily 194336 490 589 5.75E-25 101.168 cl02517 ZU5 superfamily - - ZU5 domain; Domain present in ZO-1 and Unc5-like netrin receptors Domain of unknown function. Q#14874 - CGI_10026060 superfamily 245814 76 160 3.78E-16 75.5093 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#14874 - CGI_10026060 superfamily 246918 222 270 4.58E-10 56.8263 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#14874 - CGI_10026060 superfamily 246918 173 217 2.10E-08 52.2039 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#14878 - CGI_10026064 superfamily 245206 10 340 5.58E-162 461.202 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. 
Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#14879 - CGI_10026065 superfamily 245206 88 271 1.77E-78 245.876 cl09931 NADB_Rossmann superfamily N - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#14879 - CGI_10026065 superfamily 245206 12 84 1.77E-05 44.2971 cl09931 NADB_Rossmann superfamily C - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. 
Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#14881 - CGI_10026067 superfamily 110440 475 502 0.000532649 37.7725 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#14881 - CGI_10026067 superfamily 241563 50 86 0.00564561 35.0055 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#14882 - CGI_10026068 superfamily 245201 18 269 9.87E-71 227.403 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins."
Q#14882 - CGI_10026068 superfamily 248058 357 485 4.41E-60 195.35 cl17504 CaMKII_AD superfamily - - Calcium/calmodulin dependent protein kinase II Association; This domain is found at the C-terminus of the Calcium/calmodulin dependent protein kinases II (CaMKII). These proteins also have a Ser/Thr protein kinase domain (pfam00069) at their N-terminus. The function of the CaMKII association domain is the assembly of the single proteins into large (8 to 14 subunits) multimers. Q#14883 - CGI_10026069 superfamily 246918 46 98 5.26E-14 64.9155 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#14883 - CGI_10026069 superfamily 246918 160 212 4.10E-13 62.6043 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#14883 - CGI_10026069 superfamily 246918 103 155 5.36E-11 56.8263 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#14884 - CGI_10026070 superfamily 243035 340 415 5.12E-15 71.1117 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#14884 - CGI_10026070 superfamily 243035 212 321 3.02E-22 91.2121 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#14886 - CGI_10026072 superfamily 246597 38 342 0 682.492 cl13995 MPP_superfamily superfamily - - "metallophosphatase superfamily, metallophosphatase domain; Metallophosphatases (MPPs), also known as metallophosphoesterases, phosphodiesterases (PDEs), binuclear metallophosphoesterases, and dimetal-containing phosphoesterases (DMPs), represent a diverse superfamily of enzymes with a conserved domain containing an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. This superfamily includes: the phosphoprotein phosphatases (PPPs), Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination." 
Q#14887 - CGI_10026074 superfamily 215647 620 819 2.31E-11 63.3964 cl18338 7tm_2 superfamily - - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GPCRs). They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#14889 - CGI_10026076 superfamily 241584 944 1029 7.60E-05 42.0983 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#14889 - CGI_10026076 superfamily 216686 10 187 3.48E-46 165.574 cl18377 Galactosyl_T superfamily - - "Galactosyltransferase; This family includes the galactosyltransferases UDP-galactose:2-acetamido-2-deoxy-D-glucose3beta-galactosyltransferase and UDP-Gal:beta-GlcNAc beta 1,3-galactosyltransferase.
Specific galactosyltransferases transfer galactose to GlcNAc terminal chains in the synthesis of the lacto-series oligosaccharides types 1 and 2." Q#14889 - CGI_10026076 superfamily 222429 216 270 4.88E-06 45.6944 cl18676 Myb_DNA-bind_5 superfamily N - Myb/SANT-like DNA-binding domain; This presumed domain appears to be related to other Myb/SANT like DNA binding domains. This family is greatly expanded in arthropods and higher eukaryotes. Q#14889 - CGI_10026076 superfamily 241571 884 938 2.88E-05 43.5551 cl00049 CUB superfamily N - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#14889 - CGI_10026076 superfamily 241571 695 785 3.43E-05 43.5328 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#14889 - CGI_10026076 superfamily 241571 537 619 0.00230641 37.7771 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#14890 - CGI_10026077 superfamily 245248 102 566 1.21E-69 233.659 cl10080 RPE65 superfamily - - "Retinal pigment epithelial membrane protein; This family represents a retinal pigment epithelial membrane receptor which is abundantly expressed in retinal pigment epithelium, and binds plasma retinal binding protein. The family also includes the sequence related neoxanthin cleavage enzyme in plants and lignostilbene-alpha,beta-dioxygenase in bacteria." Q#14891 - CGI_10026078 superfamily 241599 387 444 8.12E-25 97.6992 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." 
Q#14891 - CGI_10026078 superfamily 217293 28 229 4.35E-30 116.963 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#14891 - CGI_10026078 superfamily 202474 236 313 1.91E-12 65.7529 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#14892 - CGI_10026079 superfamily 247727 123 220 2.96E-15 68.6106 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#14894 - CGI_10026081 superfamily 241750 573 770 2.12E-27 111.991 cl00281 metallo-dependent_hydrolases superfamily N - "Superfamily of metallo-dependent hydrolases (also called amidohydrolase superfamily) is a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. 
The family includes urease alpha, adenosine deaminase, phosphotriesterase dihydroorotases, allantoinases, hydantoinases, AMP-, adenine and cytosine deaminases, imidazolonepropionase, aryldialkylphosphatase, chlorohydrolases, formylmethanofuran dehydrogenases and others." Q#14895 - CGI_10026082 superfamily 241749 7 144 5.46E-33 115.561 cl00280 globin_like superfamily - - superfamily containing globins and truncated hemoglobins Q#14898 - CGI_10005949 superfamily 241638 76 198 1.56E-10 55.0669 cl00147 TNF superfamily - - "Tumor Necrosis Factor; TNF superfamily members include the cytokines: TNF (TNF-alpha), LT (lymphotoxin-alpha, TNF-beta), CD40 ligand, Apo2L (TRAIL), Fas ligand, and osteoprotegerin (OPG) ligand. These proteins generally have an intracellular N-terminal domain, a short transmembrane segment, an extracellular stalk, and a globular TNF-like extracellular domain of about 150 residues. They initiate apoptosis by binding to related receptors, some of which have intracellular death domains. They generally form homo- or hetero- trimeric complexes. TNF cytokines bind one elongated receptor molecule along each of three clefts formed by neighboring monomers of the trimer with ligand trimerization a requisite for receptor binding." Q#14901 - CGI_10005952 superfamily 221377 208 350 5.68E-07 47.8487 cl13449 DUF3504 superfamily - - Domain of unknown function (DUF3504); This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 156 to 173 amino acids in length. Q#14903 - CGI_10005954 superfamily 219127 69 367 2.08E-81 253.772 cl05943 MIG-14_Wnt-bd superfamily - - "Wnt-binding factor required for Wnt secretion; MIG-14 is a Wnt-binding factor. Newly synthesised EGL-20/Wnt binds to MIG-14 in the Golgi, targetting the Wnt to the cell membrane for secretion.
AP-2-mediated endocytosis and retromer retrieval at the sorting endosome would recycle MIG-14 to the Golgi, where it can bind to EGL-20/Wnt for the next cycle of secretion." Q#14907 - CGI_10017411 superfamily 110440 82 108 8.81E-05 37.0021 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#14909 - CGI_10017413 superfamily 241563 65 106 5.27E-05 41.3108 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#14910 - CGI_10017414 superfamily 241563 62 103 1.86E-05 42.4664 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#14911 - CGI_10017415 superfamily 246664 201 233 5.27E-08 51.5418 cl14561 An_peroxidase_like superfamily C - "Animal heme peroxidases and related proteins; A diverse family of enzymes, which includes prostaglandin G/H synthase, thyroid peroxidase, myeloperoxidase, linoleate diol synthase, lactoperoxidase, peroxinectin, peroxidasin, and others. Despite its name, this family is not restricted to metazoans: members are found in fungi, plants, and bacteria as well."
Q#14912 - CGI_10017416 superfamily 246664 104 562 2.71E-128 392.829 cl14561 An_peroxidase_like superfamily - - "Animal heme peroxidases and related proteins; A diverse family of enzymes, which includes prostaglandin G/H synthase, thyroid peroxidase, myeloperoxidase, linoleate diol synthase, lactoperoxidase, peroxinectin, peroxidasin, and others. Despite its name, this family is not restricted to metazoans: members are found in fungi, plants, and bacteria as well." Q#14914 - CGI_10017418 superfamily 245227 13 653 0 1204.09 cl10013 Glycosyltransferase_GTB_type superfamily - - "Glycosyltransferases catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. The acceptor molecule can be a lipid, a protein, a heterocyclic compound, or another carbohydrate residue. The structures of the formed glycoconjugates are extremely diverse, reflecting a wide range of biological functions. The members of this family share a common GTB topology, one of the two protein topologies observed for nucleotide-sugar-dependent glycosyltransferases. GTB proteins have distinct N- and C- terminal domains each containing a typical Rossmann fold. The two domains have high structural homology despite minimal sequence homology. The large cleft that separates the two domains includes the catalytic center and permits a high degree of flexibility." Q#14915 - CGI_10017419 superfamily 245660 171 270 0.00086872 41.0534 cl11493 PQQ_DH_like superfamily C - "PQQ-dependent dehydrogenases and related proteins; This family is composed of dehydrogenases with pyrroloquinoline quinone (PQQ) as a cofactor, such as ethanol, methanol, and membrane-bound glucose dehydrogenases. The alignment model contains an 8-bladed beta-propeller, and the family also includes distantly related proteins which are not enzymatically active and do not bind PQQ." 
Q#14916 - CGI_10017420 superfamily 242406 1 67 6.60E-10 52.2085 cl01271 DUF1768 superfamily N - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. Q#14917 - CGI_10017421 superfamily 247736 79 154 7.17E-07 46.0208 cl17182 NAT_SF superfamily - - "N-Acyltransferase superfamily: Various enzymes that characteristically catalyze the transfer of an acyl group to a substrate; NAT (N-Acyltransferase) is a large superfamily of enzymes that mostly catalyze the transfer of an acyl group to a substrate and are implicated in a variety of functions, ranging from bacterial antibiotic resistance to circadian rhythms in mammals. Members include GCN5-related N-Acetyltransferases (GNAT) such as Aminoglycoside N-acetyltransferases, Histone N-acetyltransferase (HAT) enzymes, and Serotonin N-acetyltransferase, which catalyze the transfer of an acetyl group to a substrate. The kinetic mechanism of most GNATs involves the ordered formation of a ternary complex: the reaction begins with Acetyl Coenzyme A (AcCoA) binding, followed by binding of substrate, then direct transfer of the acetyl group from AcCoA to the substrate, followed by product and subsequent CoA release. Other family members include Arginine/ornithine N-succinyltransferase, Myristoyl-CoA: protein N-myristoyltransferase, and Acyl-homoserinelactone synthase which have a similar catalytic mechanism but differ in types of acyl groups transferred. Leucyl/phenylalanyl-tRNA-protein transferase and FemXAB nonribosomal peptidyltransferases which catalyze similar peptidyltransferase reactions are also included." 
Q#14921 - CGI_10017425 superfamily 246664 56 150 1.66E-23 95.4545 cl14561 An_peroxidase_like superfamily C - "Animal heme peroxidases and related proteins; A diverse family of enzymes, which includes prostaglandin G/H synthase, thyroid peroxidase, myeloperoxidase, linoleate diol synthase, lactoperoxidase, peroxinectin, peroxidasin, and others. Despite its name, this family is not restricted to metazoans: members are found in fungi, plants, and bacteria as well." Q#14922 - CGI_10017426 superfamily 246664 316 693 1.20E-136 411.968 cl14561 An_peroxidase_like superfamily - - "Animal heme peroxidases and related proteins; A diverse family of enzymes, which includes prostaglandin G/H synthase, thyroid peroxidase, myeloperoxidase, linoleate diol synthase, lactoperoxidase, peroxinectin, peroxidasin, and others. Despite its name, this family is not restricted to metazoans: members are found in fungi, plants, and bacteria as well." Q#14924 - CGI_10017428 superfamily 241563 61 96 0.00597391 35.918 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#14928 - CGI_10008012 superfamily 241867 25 187 8.05E-07 46.3926 cl00446 Lactamase_B superfamily - - Metallo-beta-lactamase superfamily; Metallo-beta-lactamase superfamily. Q#14929 - CGI_10008013 superfamily 241867 29 188 7.43E-16 71.8158 cl00446 Lactamase_B superfamily - - Metallo-beta-lactamase superfamily; Metallo-beta-lactamase superfamily. Q#14930 - CGI_10008014 superfamily 217309 56 610 0 682.501 cl09289 EMP70 superfamily - - Endomembrane protein 70; Endomembrane protein 70. 
Q#14931 - CGI_10008015 superfamily 248372 7 114 1.61E-18 76.084 cl17818 Nuc_deoxyrib_tr superfamily - - Nucleoside 2-deoxyribosyltransferase; Nucleoside 2-deoxyribosyltransferase EC:2.4.2.6 catalyzes the cleavage of the glycosidic bonds of 2`-deoxyribonucleosides. Q#14932 - CGI_10008016 superfamily 202014 285 489 2.44E-60 203.417 cl03387 RB_A superfamily - - Retinoblastoma-associated protein A domain; This domain has the cyclin fold as predicted. Q#14932 - CGI_10008016 superfamily 216744 563 681 6.37E-33 124.801 cl18378 RB_B superfamily - - "Retinoblastoma-associated protein B domain; The crystal structure of the Rb pocket bound to a nine-residue E7 peptide containing the LxCxE motif, shared by other Rb-binding viral and cellular proteins, shows that the LxCxE peptide binds a highly conserved groove on the B domain. The B domain has a cyclin fold." Q#14932 - CGI_10008016 superfamily 149868 684 783 3.06E-13 68.4437 cl07508 Rb_C superfamily C - "Rb C-terminal domain; The Rb C-terminal domain is required for high-affinity binding to E2F-DP complexes and for maximal repression of E2F-responsive promoters, thereby acting as a growth suppressor by blocking the G1-S transition of the cell cycle. This domain has a strand-loop-helix structure, which directly interacts with both E2F1 and DP1, followed by a tail segment that lacks regular secondary structure." Q#14932 - CGI_10008016 superfamily 221325 76 167 0.000143104 41.5334 cl13385 DUF3452 superfamily N - "Domain of unknown function (DUF3452); This presumed domain is functionally uncharacterized. This domain is found in bacteria and eukaryotes. This domain is typically between 124 to 150 amino acids in length. This domain is found associated with pfam01858, pfam01857. This domain has a single completely conserved residue W that may be functionally important." 
Q#14935 - CGI_10008019 superfamily 219145 73 230 1.45E-57 182.445 cl05974 SPC25 superfamily - - "Microsomal signal peptidase 25 kDa subunit (SPC25); This family consists of several microsomal signal peptidase 25 kDa subunit proteins. Translocation of polypeptide chains across the endoplasmic reticulum (ER) membrane is triggered by signal sequences. Subsequently, signal recognition particle interacts with its membrane receptor and the ribosome-bound nascent chain is targeted to the ER where it is transferred into a protein-conducting channel. At some point, a second signal sequence recognition event takes place in the membrane and translocation of the nascent chain through the membrane occurs. The signal sequence of most secretory and membrane proteins is cleaved off at this stage. Cleavage occurs by the signal peptidase complex (SPC) as soon as the lumenal domain of the translocating polypeptide is large enough to expose its cleavage site to the enzyme. The signal peptidase complex is possibly also involved in proteolytic events in the ER membrane other than the processing of the signal sequence, for example the further digestion of the cleaved signal peptide or the degradation of membrane proteins. Mammalian signal peptidase is as a complex of five different polypeptide chains. This family represents the 25 kDa subunit (SPC25)." Q#14939 - CGI_10009361 superfamily 216363 3 64 1.27E-05 38.2202 cl08312 UPF0029 superfamily C - Uncharacterized protein family UPF0029; Uncharacterized protein family UPF0029. Q#14945 - CGI_10025908 superfamily 241563 68 105 9.20E-06 43.2368 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#14945 - CGI_10025908 superfamily 128778 113 235 0.00115999 38.0147 cl17972 BBC superfamily - - B-Box C-terminal domain; Coiled coil region C-terminal to (some) B-Box domains Q#14946 - CGI_10025909 superfamily 246908 286 387 1.11E-38 135.788 cl15255 SH2 superfamily - - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." Q#14947 - CGI_10025910 superfamily 245201 30 303 0 560.865 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#14948 - CGI_10025911 superfamily 198867 91 190 1.40E-41 144.994 cl06652 BACK superfamily - - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation)." 
Q#14948 - CGI_10025911 superfamily 243146 382 429 4.66E-14 67.1983 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#14948 - CGI_10025911 superfamily 243146 287 333 1.65E-13 65.6575 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#14948 - CGI_10025911 superfamily 243146 418 462 6.40E-13 63.8346 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#14948 - CGI_10025911 superfamily 243146 478 524 9.38E-13 63.3463 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#14948 - CGI_10025911 superfamily 243146 322 367 8.28E-11 57.6714 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#14948 - CGI_10025911 superfamily 243146 241 286 4.92E-10 55.6423 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#14948 - CGI_10025911 superfamily 243066 27 86 0.000139872 40.3669 cl02518 BTB superfamily N - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. 
The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#14949 - CGI_10025912 superfamily 241559 5 97 0.0005966 40.3404 cl00030 CH superfamily - - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." Q#14953 - CGI_10025916 superfamily 217958 102 192 0.00100928 39.8644 cl04445 Birna_RdRp superfamily C - "Birnavirus RNA dependent RNA polymerase (VP1); Birnaviruses are dsRNA viruses. This family corresponds to the RNA dependent RNA polymerase. This protein is also known as VP1. All of the birnavirus VP1 proteins contain conserved RdRp motifs that reside in the catalytic "palm" domain of all classes of polymerases. However, the birnavirus RdRps lack the highly conserved Gly-Asp-Asp (GDD) sequence, a component of the proposed catalytic site of this enzyme family that exists in the conserved motif VI of the palm domain of other RdRps." Q#14957 - CGI_10025920 superfamily 241583 60 251 1.09E-48 168.904 cl00064 ZnMc superfamily - - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#14957 - CGI_10025920 superfamily 241571 293 412 8.37E-17 77.0674 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#14959 - CGI_10025922 superfamily 247948 65 116 7.85E-16 69.245 cl17394 RINGv superfamily - - RING-variant domain; RING-variant domain. 
Q#14960 - CGI_10025923 superfamily 216152 21 369 3.18E-69 226.041 cl02988 Glyco_transf_10 superfamily - - "Glycosyltransferase family 10 (fucosyltransferase); This family of Fucosyltransferases are the enzymes transferring fucose from GDP-Fucose to GlcNAc in an alpha1,3 linkage. This family is known as glycosyltransferase family 10." Q#14961 - CGI_10025924 superfamily 241578 364 609 1.48E-106 333.086 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#14961 - CGI_10025924 superfamily 218277 711 813 5.44E-28 109.928 cl04773 Sec23_helical superfamily - - "Sec23/Sec24 helical domain; COPII-coated vesicles carry proteins from the endoplasmic reticulum to the Golgi complex. This vesicular transport can be reconstituted by using three cytosolic components containing five proteins: the small GTPase Sar1p, the Sec23p/24p complex, and the Sec13p/Sec31p complex. This domain is composed of five alpha helices." 
Q#14961 - CGI_10025924 superfamily 219707 614 697 4.35E-26 104.135 cl06871 Sec23_BS superfamily - - Sec23/Sec24 beta-sandwich domain; Sec23/Sec24 beta-sandwich domain. Q#14961 - CGI_10025924 superfamily 203092 287 325 9.89E-15 70.2819 cl04769 zf-Sec23_Sec24 superfamily - - "Sec23/Sec24 zinc finger; COPII-coated vesicles carry proteins from the endoplasmic reticulum to the Golgi complex. This vesicular transport can be reconstituted by using three cytosolic components containing five proteins: the small GTPase Sar1p, the Sec23p/24p complex, and the Sec13p/Sec31p complex. This domain is found to be zinc binding domain." Q#14961 - CGI_10025924 superfamily 247044 831 884 2.79E-09 55.3836 cl15697 ADF_gelsolin superfamily C - Actin depolymerization factor/cofilin- and gelsolin-like domains; Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. Q#14963 - CGI_10025926 superfamily 247743 1305 1343 0.000126305 42.5183 cl17189 AAA superfamily C - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." 
Q#14963 - CGI_10025926 superfamily 149859 18 103 1.27E-26 107.441 cl07498 MAPKK1_Int superfamily N - Mitogen-activated protein kinase kinase 1 interacting; Mitogen-activated protein kinase kinase 1 interacting protein is a small subcellular adaptor protein required for MAPK signaling and ERK1/2 activation. The overall topology of this domain has a central five-stranded beta-sheet sandwiched between a two alpha-helix and a one alpha-helix layer. Q#14964 - CGI_10025927 superfamily 220618 247 519 2.06E-16 78.38 cl10873 DUF2369 superfamily - - Uncharacterized conserved protein (DUF2369); This is a proline-rich region of a group of proteins found from plants to fungi. The function is not known. Q#14968 - CGI_10025931 superfamily 243066 21 125 1.74E-28 109.244 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#14968 - CGI_10025931 superfamily 198867 133 217 2.01E-23 95.3031 cl06652 BACK superfamily - - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. 
This family appears to be closely related to the BTB domain (Finn RD, personal observation)." Q#14968 - CGI_10025931 superfamily 243146 414 460 0.00155242 36.8707 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#14968 - CGI_10025931 superfamily 243146 318 359 0.00230538 36.4855 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#14968 - CGI_10025931 superfamily 243146 503 538 0.00669503 34.8951 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#14968 - CGI_10025931 superfamily 243146 363 414 0.00780338 34.7314 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#14970 - CGI_10025933 superfamily 152105 205 252 0.00190178 36.3567 cl13169 WBP-1 superfamily C - "WW domain-binding protein 1; This family of proteins represents WBP-1, a ligand of the WW domain of Yes-associated protein. This protein has a proline-rich domain. WBP-1 does not bind to the SH3 domain." 
Q#14971 - CGI_10025934 superfamily 243092 7 311 1.85E-88 269.205 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#14972 - CGI_10025935 superfamily 246597 12 296 0 615.774 cl13995 MPP_superfamily superfamily - - "metallophosphatase superfamily, metallophosphatase domain; Metallophosphatases (MPPs), also known as metallophosphoesterases, phosphodiesterases (PDEs), binuclear metallophosphoesterases, and dimetal-containing phosphoesterases (DMPs), represent a diverse superfamily of enzymes with a conserved domain containing an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. 
This superfamily includes: the phosphoprotein phosphatases (PPPs), Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination." Q#14974 - CGI_10025937 superfamily 241832 46 155 7.12E-59 186.18 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#14975 - CGI_10025938 superfamily 241810 635 760 1.06E-65 217.442 cl00354 KOW superfamily - - "KOW: an acronym for the authors' surnames (Kyrpides, Ouzounis and Woese); KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. 
The KOW motif contains an invariant glycine residue and comprises alternating blocks of hydrophilic and hydrophobic residues." Q#14975 - CGI_10025938 superfamily 247805 136 267 8.77E-23 96.6375 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#14975 - CGI_10025938 superfamily 247905 359 503 7.37E-06 45.6917 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#14975 - CGI_10025938 superfamily 219729 846 1025 4.11E-77 250.981 cl06956 DSHCT superfamily - - DSHCT (NUC185) domain; This C terminal domain is found in DOB1/SK12/helY-like DEAD box helicases. Q#14976 - CGI_10025939 superfamily 216212 356 858 8.46E-90 296.51 cl03037 HCO3_cotransp superfamily - - HCO3- transporter family; This family contains Band 3 anion exchange proteins that exchange Cl-/HCO3-. This family also includes cotransporters of Na+/HCO3-. Q#14976 - CGI_10025939 superfamily 241651 150 313 5.96E-08 51.9422 cl00163 PTS_IIA_fru superfamily - - "PTS_IIA, PTS system, fructose/mannitol specific IIA subunit. The bacterial phosphoenolpyruvate: sugar phosphotransferase system (PTS) is a multi-protein system involved in the regulation of a variety of metabolic and transcriptional processes. 
This family is one of four structurally and functionally distinct group IIA PTS system cytoplasmic enzymes, necessary for the uptake of carbohydrates across the cytoplasmic membrane and their phosphorylation." Q#14977 - CGI_10025940 superfamily 245835 10 267 2.34E-117 358.963 cl12013 BAR superfamily - - "The Bin/Amphiphysin/Rvs (BAR) domain, a dimerization module that binds membranes and detects membrane curvature; BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions including organelle biogenesis, membrane trafficking or remodeling, and cell division and migration. Mutations in BAR containing proteins have been linked to diseases and their inactivation in cells leads to altered membrane dynamics. A BAR domain with an additional N-terminal amphipathic helix (an N-BAR) can drive membrane curvature. These N-BAR domains are found in amphiphysins and endophilins, among others. BAR domains are also frequently found alongside domains that determine lipid specificity, such as the Pleckstrin Homology (PH) and Phox Homology (PX) domains which are present in beta centaurins (ACAPs and ASAPs) and sorting nexins, respectively. A FES-CIP4 Homology (FCH) domain together with a coiled coil region is called the F-BAR domain and is present in Pombe/Cdc15 homology (PCH) family proteins, which include Fes/Fes tyrosine kinases, PACSIN or syndapin, CIP4-like proteins, and srGAPs, among others. The Inverse (I)-BAR or IRSp53/MIM homology Domain (IMD) is found in multi-domain proteins, such as IRSp53 and MIM, that act as scaffolding proteins and transducers of a variety of signaling pathways that link membrane dynamics and the underlying actin cytoskeleton. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. 
The I-BAR domain induces membrane protrusions in the opposite direction compared to classical BAR and F-BAR domains, which produce membrane invaginations. BAR domains that also serve as protein interaction domains include those of arfaptin and OPHN1-like proteins, among others, which bind to Rac and Rho GAP domains, respectively." Q#14977 - CGI_10025940 superfamily 245456 612 820 1.24E-65 220.747 cl10970 AP_MHD_Cterm superfamily - - "C-terminal domain of adaptor protein (AP) complexes medium mu subunits and its homologs (MHD); This family corresponds to the C-terminal domain of heterotetrameric AP complexes medium mu subunits and its homologs existing in monomeric stonins, delta-subunit of the heteroheptameric coat protein I (delta-COPI), a protein encoded by a pro-death gene referred as MuD (also known as MUDENG, mu-2 related death-inducing gene), an endocytic adaptor syp1, the mammalian FCH domain only proteins (FCHo1/2), SH3-containing GRB2-like protein 3-interacting protein 1 (SGIP1), and related proteins. AP complexes participate in the formation of intracellular coated transport vesicles and select cargo molecules for incorporation into the coated vesicles in the late secretory and endocytic pathways. Stonins have been characterized as clathrin-dependent AP-2 mu chain related factors and may act as cargo-specific sorting adaptors in endocytosis. Coat protein complex I (COPI)-coated vesicles function in the early secretory pathway. They mediate the retrograde transport from the Golgi to the ER, and intra-Golgi transport. MuD is distantly related to the C-terminal domain of mu2 subunit of AP-2. It is able to induce cell death by itself and plays an important role in cell death in various tissues. Syp1 represents a novel type of endocytic adaptor protein that participates in endocytosis, promotes vesicle tabulation, and contributes to cell polarity and stress responses. 
It shares the same domain architecture with its two ubiquitously expressed mammalian counterparts, FCHo1/2, which represent key initial proteins ultimately controlling cellular nutrient uptake, receptor regulation, and synaptic vesicle retrieval. They bind specifically to the plasma membrane and recruit the scaffold proteins eps15 and intersectin, which subsequently engage the adaptor complex AP2 and clathrin, leading to coated vesicle formation. Another mammalian neuronal-specific protein SGIP1 does have a C-terminal MHD and has been classified into this family as well. It is an endophilin-interacting protein that plays an obligatory role in the regulation of energy homeostasis. It is also involved in clathrin-mediated endocytosis by interacting with phospholipids and eps15." Q#14978 - CGI_10025941 superfamily 243310 40 272 1.19E-74 230.973 cl03120 ELO superfamily - - "GNS1/SUR4 family; Members of this family are involved in long chain fatty acid elongation systems that produce the 26-carbon precursors for ceramide and sphingolipid synthesis. Predicted to be integral membrane proteins, in eukaryotes they are probably located on the endoplasmic reticulum. Yeast ELO3 affects plasma membrane H+-ATPase activity, and may act on a glucose-signaling pathway that controls the expression of several genes that are transcriptionally regulated by glucose such as PMA1." Q#14979 - CGI_10025942 superfamily 147693 43 107 3.54E-14 63.5391 cl05311 NDUF_B7 superfamily - - "NADH-ubiquinone oxidoreductase B18 subunit (NDUFB7); This family consists of several NADH-ubiquinone oxidoreductase B18 subunit proteins from different eukaryotic organisms. Oxidative phosphorylation is the well-characterized process in which ATP, the principal carrier of chemical energy of individual cells, is produced due to a mitochondrial proton gradient formed by the transfer of electrons from NADH and FADH2 to molecular oxygen. 
The oxidative phosphorylation (OXPHOS) system is located in the mitochondrial inner membrane and consists of five multi-subunit enzyme complexes and two small electron carriers: coenzyme Q10 and cytochrome C. At least 70 structural proteins involved in the formation of the whole OXPHOS system are encoded by nuclear genes, whereas 13 structural proteins are encoded by the mitochondrial genome. Deficiency of NADH ubiquinone oxidoreductase, the first enzyme complex of the mitochondrial respiratory chain, is one of the most frequent causes of human mitochondrial encephalomyopathies." Q#14980 - CGI_10025943 superfamily 243092 8 308 1.45E-31 120.132 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#14982 - CGI_10025945 superfamily 241546 476 594 5.52E-50 174.28 cl00011 PLAT superfamily - - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats.The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#14982 - CGI_10025945 superfamily 241546 1044 1164 2.51E-48 169.658 cl00011 PLAT superfamily - - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats.The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#14982 - CGI_10025945 superfamily 241645 93 166 5.09E-16 75.3036 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." 
Q#14982 - CGI_10025945 superfamily 241645 231 305 1.03E-14 71.8368 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#14982 - CGI_10025945 superfamily 241546 350 466 2.98E-30 117.656 cl00011 PLAT superfamily - - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats.The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#14982 - CGI_10025945 superfamily 241546 919 1033 8.78E-13 67.1948 cl00011 PLAT superfamily - - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats.The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." 
Q#14982 - CGI_10025945 superfamily 241546 1284 1333 1.12E-12 66.8096 cl00011 PLAT superfamily C - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats.The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#14982 - CGI_10025945 superfamily 241546 1188 1272 2.03E-05 44.468 cl00011 PLAT superfamily - - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats.The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#14982 - CGI_10025945 superfamily 241579 623 726 0.000248762 40.9696 cl00060 FGF superfamily - - "Acidic and basic fibroblast growth factor family; FGFs are mitogens, which stimulate growth or differentiation of cells of mesodermal or neuroectodermal origin. The family plays essential roles in patterning and differentiation during vertebrate embryogenesis, and has neurotrophic activities. FGFs have a high affinity for heparan sulfate proteoglycans and require heparan sulfate to activate one of four cell surface FGF receptors. Upon binding to FGF, the receptors dimerize and their intracellular tyrosine kinase domains become active. FGFs have internal pseudo-threefold symmetry (beta-trefoil topology)." Q#14983 - CGI_10025946 superfamily 241546 164 282 2.33E-41 150.398 cl00011 PLAT superfamily - - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. 
The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats.The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#14983 - CGI_10025946 superfamily 241546 1606 1723 1.35E-36 136.916 cl00011 PLAT superfamily - - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats.The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#14983 - CGI_10025946 superfamily 241546 673 789 2.36E-27 109.952 cl00011 PLAT superfamily - - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats.The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#14983 - CGI_10025946 superfamily 241546 943 1043 3.55E-27 109.567 cl00011 PLAT superfamily - - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats.The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." 
Q#14983 - CGI_10025946 superfamily 241546 421 522 1.40E-16 78.7508 cl00011 PLAT superfamily - - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats.The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#14983 - CGI_10025946 superfamily 241546 36 152 2.09E-16 78.3656 cl00011 PLAT superfamily - - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats.The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#14983 - CGI_10025946 superfamily 241546 1107 1198 3.17E-12 66.0392 cl00011 PLAT superfamily N - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats.The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#14983 - CGI_10025946 superfamily 241546 1210 1260 1.30E-10 61.0316 cl00011 PLAT superfamily C - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. 
The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats.The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#14983 - CGI_10025946 superfamily 241546 306 393 2.48E-08 54.098 cl00011 PLAT superfamily - - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats.The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#14983 - CGI_10025946 superfamily 241546 834 931 6.68E-08 52.6632 cl00011 PLAT superfamily - - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats.The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#14983 - CGI_10025946 superfamily 241546 1541 1595 2.92E-05 44.6373 cl00011 PLAT superfamily C - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats.The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." 
Q#14983 - CGI_10025946 superfamily 241546 549 639 0.00319591 38.0257 cl00011 PLAT superfamily - - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats.The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#14988 - CGI_10025951 superfamily 243091 3 33 0.000556469 34.6176 cl02566 SET superfamily N - "SET domain; SET domains are protein lysine methyltransferase enzymes. SET domains appear to be protein-protein interaction domains. It has been demonstrated that SET domains mediate interactions with a family of proteins that display similarity with dual-specificity phosphatases (dsPTPases). A subset of SET domains have been called PR domains. These domains are divergent in sequence from other SET domains, but also appear to mediate protein-protein interaction. The SET domain consists of two regions known as SET-N and SET-C. SET-C forms an unusual and conserved knot-like structure of probably functional importance. Additionally to SET-N and SET-C, an insert region (SET-I) and flanking regions of high structural variability form part of the overall structure." Q#14990 - CGI_10025953 superfamily 245206 40 286 7.30E-120 347.532 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. 
Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#14991 - CGI_10025954 superfamily 245206 50 286 4.84E-55 181.654 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#14992 - CGI_10025955 superfamily 248318 2 52 6.61E-14 68.2313 cl17764 FYVE superfamily - - "FYVE domain; Zinc-binding domain; targets proteins to membrane lipids via interaction with phosphatidylinositol-3-phosphate, PI3P; present in Fab1, YOTB, Vac1, and EEA1;" Q#14992 - CGI_10025955 superfamily 216371 502 912 4.27E-165 491.568 cl18365 ERG4_ERG24 superfamily - - Ergosterol biosynthesis ERG4/ERG24 family; Ergosterol biosynthesis ERG4/ERG24 family. 
Q#14993 - CGI_10025956 superfamily 245596 148 443 0 522.92 cl11394 Glyco_tranf_GTA_type superfamily - - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#14993 - CGI_10025956 superfamily 247085 463 575 1.15E-20 88.3314 cl15820 RICIN superfamily - - "Ricin-type beta-trefoil; Carbohydrate-binding domain formed from presumed gene triplication. The domain is found in a variety of molecules serving diverse functions such as enzymatic activity, inhibitory toxicity and signal transduction. Highly specific ligand binding occurs on exposed surfaces of the compact domain structure." Q#14994 - CGI_10025957 superfamily 147298 1003 1214 1.29E-44 162.69 cl04904 Pecanex_C superfamily - - Pecanex protein (C-terminus); This family consists of C terminal region of the pecanex protein homologues. The pecanex protein is a maternal-effect neurogenic gene found in Drosophila.
Q#14995 - CGI_10025958 superfamily 220605 10 462 2.01E-56 199.124 cl10853 Med17 superfamily - - Subunit 17 of Mediator complex; This Mediator complex subunit was formerly known as Srb4 in yeasts or Trap80 in Drosophila and human. The Med17 subunit is located within the head domain and is essential for cell viability to the extent that a mutant strain of cerevisiae lacking it shows all RNA polymerase II-dependent transcription ceasing at non-permissive temperatures. Q#14998 - CGI_10025961 superfamily 247057 40 98 2.15E-07 46.4637 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#14999 - CGI_10025962 superfamily 245596 100 199 3.83E-28 110.475 cl11394 Glyco_tranf_GTA_type superfamily C - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. 
To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#14999 - CGI_10025962 superfamily 245596 200 239 3.47E-10 58.4732 cl11394 Glyco_tranf_GTA_type superfamily N - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." 
Q#15001 - CGI_10025964 superfamily 243035 84 158 7.52E-05 39.5254 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations.
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#15002 - CGI_10025965 superfamily 247057 285 331 2.32E-06 44.9229 cl15755 SAM_superfamily superfamily C - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#15002 - CGI_10025965 superfamily 247057 215 272 7.04E-06 43.4409 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. 
Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#15003 - CGI_10025966 superfamily 245596 75 199 8.91E-31 113.942 cl11394 Glyco_tranf_GTA_type superfamily C - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#15006 - CGI_10025969 superfamily 217473 115 339 3.02E-28 114.77 cl03978 Mab-21 superfamily - - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. 
elegans these proteins are required for several aspects of embryonic development. Q#15007 - CGI_10025970 superfamily 241574 38 125 2.62E-28 111.138 cl00053 PTPc superfamily C - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#15007 - CGI_10025970 superfamily 241574 126 178 2.38E-16 76.8557 cl00053 PTPc superfamily N - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#15007 - CGI_10025970 superfamily 241574 240 416 5.21E-16 76.0853 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. 
Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#15008 - CGI_10025971 superfamily 245201 33 67 0.00613036 34.9102 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#15011 - CGI_10005414 superfamily 243250 80 359 2.11E-85 269.133 cl02959 Glyco_hydro_9 superfamily C - Glycosyl hydrolase family 9; Glycosyl hydrolase family 9. Q#15013 - CGI_10005416 superfamily 245814 283 371 0.00223113 36.9451 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#15015 - CGI_10005418 superfamily 248097 6 136 1.05E-15 68.831 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. 
Q#15016 - CGI_10005419 superfamily 248097 216 347 5.53E-21 86.5502 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#15016 - CGI_10005419 superfamily 221533 89 151 0.00169806 36.1356 cl13726 TMF_DNA_bd superfamily C - "TATA element modulatory factor 1 DNA binding; This is the middle region of a family of TATA element modulatory factor 1 proteins conserved in eukaryotes that contains at its N-terminal section a number of leucine zippers that could potentially form coiled coil structures. The whole proteins bind to the TATA element of some RNA polymerase II promoters and repress their activity by competing with the binding of TATA binding protein. TMFs are evolutionarily conserved golgins that bind Rab6, a ubiquitous ras-like GTP-binding Golgi protein, and contribute to Golgi organisation in animal and plant cells." Q#15017 - CGI_10005094 superfamily 204985 3 119 1.56E-45 146.933 cl14987 Chorein_N superfamily - - "N-terminal region of Chorein, a TM vesicle-mediated sorter; Although mutations in the full-length vacuolar protein sorting 13A (VPS13A) protein in vertebrates lead to the disease of chorea-acanthocytosis, the exact function of any of the regions within the protein is not yet known. This region is the proposed leucine zipper at the N-terminus. The full-length protein is a transmembrane protein with a presumed role in vesicle-mediated sorting and intracellular protein transport." Q#15018 - CGI_10005095 superfamily 241610 83 135 8.69E-16 67.275 cl00101 KU superfamily - - BPTI/Kunitz family of serine protease inhibitors; Structure is a disulfide rich alpha+beta fold. BPTI (bovine pancreatic trypsin inhibitor) is an extensively studied model structure. Q#15019 - CGI_10005096 superfamily 241646 40 69 4.01E-06 45.9039 cl00156 WAP superfamily N - "whey acidic protein-type four-disulfide core domains.
Members of the family include whey acidic protein, elafin (elastase-specific inhibitor), caltrin-like protein (a calcium transport inhibitor) and other extracellular proteinase inhibitors. A group of proteins containing 8 characteristically-spaced cysteine residues forming disulphide bonds, have been termed '4-disulphide core' proteins. Protease inhibition occurs by insertion of the inhibitory loop into the active site pocket and interference with the catalytic residues of the protease." Q#15020 - CGI_10005097 superfamily 241610 344 397 3.96E-20 83.0682 cl00101 KU superfamily - - BPTI/Kunitz family of serine protease inhibitors; Structure is a disulfide rich alpha+beta fold. BPTI (bovine pancreatic trypsin inhibitor) is an extensively studied model structure. Q#15020 - CGI_10005097 superfamily 241832 42 87 0.000768036 37.3852 cl00388 Thioredoxin_like superfamily C - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#15020 - CGI_10005097 superfamily 241646 271 319 8.63E-05 39.7414 cl00156 WAP superfamily - - "whey acidic protein-type four-disulfide core domains.
Members of the family include whey acidic protein, elafin (elastase-specific inhibitor), caltrin-like protein (a calcium transport inhibitor) and other extracellular proteinase inhibitors. A group of proteins containing 8 characteristically-spaced cysteine residues forming disulphide bonds, have been termed '4-disulphide core' proteins. Protease inhibition occurs by insertion of the inhibitory loop into the active site pocket and interference with the catalytic residues of the protease." Q#15022 - CGI_10005099 superfamily 243035 42 67 0.000329761 34.883 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose.
Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#15023 - CGI_10005100 superfamily 243035 26 99 1.25E-13 61.6232 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. 
Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome.
Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#15025 - CGI_10006637 superfamily 247724 13 181 1.61E-62 193.914 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#15027 - CGI_10006639 superfamily 247038 425 511 5.75E-16 76.7245 cl15674 IPT superfamily - - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. 
Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." Q#15027 - CGI_10006639 superfamily 247038 1442 1496 2.10E-13 69.4057 cl15674 IPT superfamily C - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." Q#15027 - CGI_10006639 superfamily 247038 1196 1267 1.84E-10 60.5461 cl15674 IPT superfamily - - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." Q#15027 - CGI_10006639 superfamily 247038 2036 2118 7.36E-10 59.0053 cl15674 IPT superfamily - - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. 
They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." Q#15027 - CGI_10006639 superfamily 247038 1273 1350 1.39E-08 55.16 cl15674 IPT superfamily - - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." Q#15027 - CGI_10006639 superfamily 247038 1867 1943 3.66E-08 53.6125 cl15674 IPT superfamily - - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." 
Q#15027 - CGI_10006639 superfamily 247038 1772 1847 7.29E-08 52.8421 cl15674 IPT superfamily - - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." Q#15027 - CGI_10006639 superfamily 247038 191 291 4.26E-05 44.3677 cl15674 IPT superfamily - - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." Q#15027 - CGI_10006639 superfamily 241629 92 164 2.73E-22 96.4285 cl00133 SCP superfamily C - "SCP: SCP-like extracellular protein domain, found in eukaryotes and prokaryotes. This family includes plant pathogenesis-related protein 1 (PR-1), which accumulates after infections with pathogens, and may act as an anti-fungal agent or be involved in cell wall loosening. This family also includes CRISPs, mammalian cysteine-rich secretory proteins, which combine SCP with a C-terminal cysteine rich domain, and allergen 5 from vespid venom. 
Roles for CRISP, in response to pathogens, fertilization, and sperm maturation have been proposed. One member, Tex31 from the venom duct of Conus textile, has been shown to possess proteolytic activity sensitive to serine protease inhibitors. The human GAPR-1 protein has been reported to dimerize, and such a dimer may form an active site containing a catalytic triad. SCP has also been proposed to be a Ca++ chelating serine protease. The Ca++-chelating function would fit with various signaling processes that members of this family, such as the CRISPs, are involved in, and is supported by sequence and structural evidence of a conserved pocket containing two histidines and a glutamate. It also may explain how helothermine, a toxic peptide secreted by the beaded lizard, blocks Ca++ transporting ryanodine receptors. Little is known about the biological roles of the bacterial and archaeal SCP domains." Q#15027 - CGI_10006639 superfamily 244965 492 632 1.40E-09 58.962 cl08459 PA14 superfamily - - "PA14 domain; This domain forms an insert in bacterial beta-glucosidases and is found in other glycosidases, glycosyltransferases, proteases, amidases, yeast adhesins, and bacterial toxins, including anthrax protective antigen (PA). The domain also occurs in a Dictyostelium prespore-cell-inducing factor Psi and in fibrocystin, the mammalian protein whose mutation leads to polycystic kidney and hepatic disease. The crystal structure of PA shows that this domain (named PA14 after its location in the PA20 pro-peptide) has a beta-barrel structure. The PA14 domain sequence suggests a binding function, rather than a catalytic role. The PA14 domain distribution is compatible with carbohydrate binding." Q#15027 - CGI_10006639 superfamily 247038 1972 2030 7.17E-05 43.5852 cl15674 IPT superfamily N - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. 
They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." Q#15027 - CGI_10006639 superfamily 247038 1679 1742 0.000187355 42.4296 cl15674 IPT superfamily C - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." Q#15028 - CGI_10006640 superfamily 247095 60 505 4.23E-168 484.082 cl15837 alkPPc superfamily - - "Alkaline phosphatase homologues; alkaline phosphatases are non-specific phosphomonoesterases that catalyze the hydrolysis reaction via a phosphoseryl intermediate to produce inorganic phosphate and the corresponding alcohol, optimally at high pH. Alkaline phosphatase exists as a dimer, each monomer binding 2 zinc atoms and one magnesium atom, which are essential for enzymatic activity." 
Q#15030 - CGI_10006642 superfamily 241872 62 162 4.63E-06 43.7593 cl00453 CDP-OH_P_transf superfamily - - CDP-alcohol phosphatidyltransferase; All of these members have the ability to catalyze the displacement of CMP from a CDP-alcohol by a second alcohol with formation of a phosphodiester bond and concomitant breaking of a phosphoride anhydride bond. Q#15031 - CGI_10006643 superfamily 216050 19 103 7.38E-06 42.6763 cl18357 rve superfamily C - Integrase core domain; Integrase mediates integration of a DNA copy of the viral genome into the host chromosome. Integrase is composed of three domains. The amino-terminal domain is a zinc binding domain pfam02022. This domain is the central catalytic domain. The carboxyl terminal domain that is a non-specific DNA binding domain pfam00552. The catalytic domain acts as an endonuclease when two nucleotides are removed from the 3' ends of the blunt-ended viral DNA made by reverse transcription. This domain also catalyzes the DNA strand transfer reaction of the 3' ends of the viral DNA to the 5' ends of the integration site. Q#15032 - CGI_10006644 superfamily 241594 433 496 1.45E-05 45.7592 cl00077 HECTc superfamily N - "HECT domain; C-terminal catalytic domain of a subclass of Ubiquitin-protein ligase (E3). It binds specific ubiquitin-conjugating enzymes (E2), accepts ubiquitin from E2, transfers ubiquitin to substrate lysine side chains, and transfers additional ubiquitin molecules to the end of growing ubiquitin chains." Q#15035 - CGI_10005789 superfamily 248281 1 60 3.39E-07 44.5687 cl17727 GT1 superfamily N - "GT1, myb-like, SANT family; GT-1, a myb-like protein, is one of the GT trihelix transcription factors. GT-1 binds the GT cis-element of rbcS-3A, a light-induced gene, as a dimer. Arabidopsis GT-1 is a trans-activator and acts in the stabilization of components of the transcription pre-initiation complex comprised of TFIIA-TBP-TATA. The isolated GT-1 DNA-binding domain is sufficient to bind DNA. 
This region closely resembles the myb domain, but with longer helices. It has been proposed that GT-1 may respond to light signals via calcium-dependent phosphorylation to create a light-modulated molecular switch. These proteins are members of the SANT/myb group. SANT is named after 'SWI3, ADA2, N-CoR and TFIIIB', several factors that share this domain. The SANT domain resembles the 3 alpha-helix bundle of the DNA-binding Myb domains and is found in a diverse set of proteins." Q#15037 - CGI_10005791 superfamily 245205 82 164 1.78E-06 44.1509 cl09930 RPA_2b-aaRSs_OBF_like superfamily - - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." 
Q#15039 - CGI_10005793 superfamily 245213 227 260 0.000111331 41.083 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#15039 - CGI_10005793 superfamily 245213 179 224 0.000176961 40.3126 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#15040 - CGI_10005794 superfamily 246910 503 612 2.58E-60 198.68 cl15257 GIY-YIG_SF superfamily - - "GIY-YIG nuclease domain superfamily; The GIY-YIG nuclease domain superfamily includes a large and diverse group of proteins involved in many cellular processes, such as class I homing GIY-YIG family endonucleases, prokaryotic nucleotide excision repair proteins UvrC and Cho, type II restriction enzymes, the endonuclease/reverse transcriptase of eukaryotic retrotransposable elements, and a family of eukaryotic enzymes that repair stalled replication forks. 
All of these members contain a conserved GIY-YIG nuclease domain that may serve as a scaffold for the coordination of a divalent metal ion required for catalysis of the phosphodiester bond cleavage. By combining with different specificity, targeting, or other domains, the GIY-YIG nucleases may perform different functions." Q#15040 - CGI_10005794 superfamily 243072 1 77 1.84E-13 67.7938 cl02529 ANK superfamily N - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#15040 - CGI_10005794 superfamily 243125 406 440 9.74E-12 60.881 cl02649 LEM superfamily - - "LEM (Lap2/Emerin/Man1) domain found in emerin, lamina-associated polypeptide 2 (LAP2), inner nuclear membrane protein Man1 and similar proteins; The family corresponds to a group of inner nuclear membrane proteins containing LEM domain. Emerin occurs in four phosphorylated forms and plays a role in cell cycle-dependent events. It is absent from the inner nuclear membrane in most patients with X-linked muscular dystrophy. Emerin interacts with A-type and B-type lamins. Man1, also termed LEM domain-containing protein 3 (LEMD3) is an integral protein of the inner nuclear membrane that binds to nuclear lamins and emerin, thus playing a role in nuclear organization. LAP2, also termed thymopoietin (TP), or thymopoietin-related peptide (TPRP), is composed of isoform alpha and isoforms beta/gamma and may be involved in chromatin organization and post-mitotic reassembly. Some LAP2 isoforms are inner nuclear membrane proteins that can bind to nuclear lamins and chromatin, while others are non-membrane nuclear polypeptides. 
This family also contains LEM domain-containing protein LEMP-1 and LEM2. LEMP-1, also termed cancer/testis antigen 50 (CT50), is encoded by LEMD1, a novel testis-specific gene expressed in colorectal cancers. LEMP-1 may function as a cancer-testis antigen for immunotherapy of colorectal carcinoma (CRC). LEM2, also termed LEMD2, is a novel Man1-related ubiquitously expressed inner nuclear membrane protein required for normal nuclear envelope morphology. Association with lamin A is required for its proper nuclear envelope localization while its binding to lamin C plays an important role in the organization of lamin A/C complexes. Some uncharacterized LEM domain-containing proteins are also included in this family. Unlike other family members, these harbor an ankyrin repeat region that may mediate protein-protein interactions." Q#15042 - CGI_10003981 superfamily 241551 77 138 1.89E-21 83.8875 cl00016 Cyt_c_Oxidase_Vb superfamily N - "Cytochrome c oxidase subunit Vb. Cytochrome c oxidase (CcO), the terminal oxidase in the respiratory chains of eukaryotes and most bacteria, is a multi-chain transmembrane protein located in the inner membrane of mitochondria and the cell membrane of prokaryotes. It catalyzes the reduction of O2 and simultaneously pumps protons across the membrane. The number of subunits varies from three to five in bacteria and up to 13 in mammalian mitochondria. Subunits I, II, and III of mammalian CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. Found only in eukaryotes, subunit Vb is one of three mammalian subunits that lacks a transmembrane region. Subunit Vb is located on the matrix side of the membrane and binds the regulatory subunit of protein kinase A. The abnormally extended conformation is stable only in the CcO assembly." Q#15043 - CGI_10003982 superfamily 243058 43 128 1.71E-07 47.3091 cl02500 ARM superfamily N - "Armadillo/beta-catenin-like repeats. 
An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#15044 - CGI_10003983 superfamily 245201 24 270 1.53E-75 242.04 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#15044 - CGI_10003983 superfamily 247694 400 555 3.17E-37 134.128 cl17070 AMPKA_C_like superfamily - - "C-terminal regulatory domain of 5'-AMP-activated protein kinase (AMPK) alpha subunit and similar domains; This family is composed of AMPKs, microtubule-associated protein/microtubule affinity regulating kinases (MARKs), yeast Kcc4p-like proteins, plant calcineurin B-Like (CBL)-interacting protein kinases (CIPKs), and similar proteins. They are serine/threonine protein kinases (STKs) that catalyze the transfer of the gamma-phosphoryl group from ATP to S/T residues on protein substrates. 
AMPKs act as sensors for the energy status of the cell and are activated by cellular stresses that lead to ATP depletion such as hypoxia, heat shock, and glucose deprivation, among others. MARKs phosphorylate the tau protein and related microtubule-associated proteins (MAPs) on tubulin binding sites to induce detachment from microtubules, and are involved in the regulation of cell shape and polarity, cell cycle control, transport, and the cytoskeleton. Kcc4p and related proteins are septin-associated proteins that are involved in septin organization and in the yeast morphogenesis checkpoint coordinating the cell cycle with bud formation. CIPKs interact with the calcineurin B-like (CBL) calcium sensors to form a signaling network that decodes specific calcium signals triggered by a variety of environmental stimuli including salinity, drought, cold, light, and mechanical perturbation, among others. All members of this family contain an N-terminal catalytic kinase domain and a C-terminal regulatory domain which is also called kinase associated domain 1 (KA1) in some cases. The C-terminal regulatory domain serves as a protein interaction domain in AMPKs and CIPKs. In MARKs and Kcc4p-like proteins, this domain binds phospholipids and may be involved in membrane localization." Q#15045 - CGI_10003984 superfamily 247755 439 644 1.80E-82 259.779 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." 
Q#15045 - CGI_10003984 superfamily 241940 57 338 2.48E-123 370.022 cl00549 ABC_membrane_2 superfamily - - ABC transporter transmembrane region 2; This domain covers the transmembrane of a small family of ABC transporters and shares sequence similarity with pfam00664. Mutations in this domain in human ABCD3 (PMP70) are believed responsible for Zellweger Syndrome-2; mutations in human ABCD1 (ALD) are responsible for recessive X-linked adrenoleukodystrophy. A Saccharomyces cerevisiae homolog is involved in the import of long-chain fatty acids. Q#15046 - CGI_10019421 superfamily 243098 479 517 0.00139307 37.9628 cl02573 TUDOR superfamily - - "Tudor domains are found in many eukaryotic organisms and have been implicated in protein-protein interactions in which methylated protein substrates bind to these domains. For example, the Tudor domain of Survival of Motor Neuron (SMN) binds to symmetrically dimethylated arginines of arginine-glycine (RG) rich sequences found in the C-terminal tails of Sm proteins. The SMN protein is linked to spinal muscular atrophy. Another example is the tandem tudor domains of 53BP1, which bind to histone H4 specifically dimethylated at Lys20 (H4-K20me2). 53BP1 is a key transducer of the DNA damage checkpoint signal." Q#15046 - CGI_10019421 superfamily 241565 768 832 0.00173292 37.6863 cl00038 BRCT superfamily - - "Breast Cancer Suppressor Protein (BRCA1), carboxy-terminal domain. The BRCT domain is found within many DNA damage repair and cell cycle checkpoint proteins. The unique diversity of this domain superfamily allows BRCT modules to interact forming homo/hetero BRCT multimers, BRCT-non-BRCT interactions, and interactions within DNA strand breaks." Q#15048 - CGI_10019423 superfamily 242199 48 261 6.75E-95 281.698 cl00931 Ribosomal_S6e superfamily - - Ribosomal protein S6e; Ribosomal protein S6e. 
Q#15050 - CGI_10019425 superfamily 241832 15 103 2.48E-32 109.937 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#15051 - CGI_10019426 superfamily 241832 15 103 4.39E-27 96.4547 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). 
Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#15054 - CGI_10019429 superfamily 247856 91 139 1.00E-07 51.0093 cl17302 EFh superfamily N - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#15058 - CGI_10019433 superfamily 243100 96 140 3.12E-05 40.3659 cl02576 B_zip1 superfamily - - "basic leucine zipper DNA-binding and multimerization region of GCN4 and related proteins; Basic leucine zipper (bZIP) transcription factors act in networks of homo- and hetero-dimers in the regulation in a diverse set of cellular pathways. Classical leucine zippers have alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization creates a pair of basic regions that bind DNA and undergo conformational change. GCN4 was identified in Saccharomyces cerevisiae from mutations in a deficiency in activation with the general amino acid control pathway. GCN4 encodes a trans-activator of amino acid biosynthetic genes containing 2 acidic activation domains and a C-terminal bZIP domain, comprised of a basic alpha-helical DNA-binding region and a coiled-coil dimerization region." Q#15060 - CGI_10019435 superfamily 241754 4 363 1.12E-174 516.094 cl00286 Motor_domain superfamily - - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. 
Q#15060 - CGI_10019435 superfamily 241581 452 543 2.05E-05 43.9142 cl00062 FHA superfamily - - "Forkhead associated domain (FHA); found in eukaryotic and prokaryotic proteins. Putative nuclear signalling domain. FHA domains may bind phosphothreonine, phosphoserine and sometimes phosphotyrosine. In eukaryotes, many FHA domain-containing proteins localize to the nucleus, where they participate in establishing or maintaining cell cycle checkpoints, DNA repair, or transcriptional regulation. Members of the FHA family include: Dun1, Rad53, Cds1, Mek1, KAPP(kinase-associated protein phosphatase),and Ki-67 (a human nuclear protein related to cell proliferation)." Q#15060 - CGI_10019435 superfamily 221571 659 699 3.17E-05 42.8751 cl13810 KIF1B superfamily - - "Kinesin protein 1B; This domain family is found in eukaryotes, and is approximately 50 amino acids in length. The family is found in association with pfam00225, pfam00498. KIF1B is an anterograde motor for transport of mitochondria in axons of neuronal cells." Q#15062 - CGI_10019437 superfamily 247787 8 191 8.62E-39 135.404 cl17233 RecA-like_NTPases superfamily N - "RecA-like NTPases. This family includes the NTP binding domain of F1 and V1 H+ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. This group also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion." Q#15063 - CGI_10019438 superfamily 242899 7 156 3.18E-51 162.345 cl02135 TRAPP superfamily - - "Transport protein particle (TRAPP) component; TRAPP plays a key role in the targeting and/or fusion of ER-to-Golgi transport vesicles with their acceptor compartment. TRAPP is a large multimeric protein that contains at least 10 subunits. This family contains many TRAPP family proteins. The Bet3 subunit is one of the better characterized TRAPP proteins and has a dimeric structure with hydrophobic channels. 
The channel entrances are located on a putative membrane-interacting surface that is distinctively flat, wide and decorated with positively charged residues. Bet3 is proposed to localise TRAPP to the Golgi." Q#15064 - CGI_10019439 superfamily 202351 92 215 4.92E-14 69.448 cl03662 Na_Pi_cotrans superfamily C - Na+/Pi-cotransporter; This is a family of mainly mammalian type II renal Na+/Pi-cotransporters with other related sequences from lower eukaryotes and bacteria some of which are also Na+/Pi-cotransporters. In the kidney the type II renal Na+/Pi-cotransporters protein allows re-absorption of filtered Pi in the proximal tubule. Q#15064 - CGI_10019439 superfamily 202351 388 477 8.17E-07 47.8768 cl03662 Na_Pi_cotrans superfamily N - Na+/Pi-cotransporter; This is a family of mainly mammalian type II renal Na+/Pi-cotransporters with other related sequences from lower eukaryotes and bacteria some of which are also Na+/Pi-cotransporters. In the kidney the type II renal Na+/Pi-cotransporters protein allows re-absorption of filtered Pi in the proximal tubule. Q#15065 - CGI_10019440 superfamily 202351 128 251 1.08E-13 68.6776 cl03662 Na_Pi_cotrans superfamily C - Na+/Pi-cotransporter; This is a family of mainly mammalian type II renal Na+/Pi-cotransporters with other related sequences from lower eukaryotes and bacteria some of which are also Na+/Pi-cotransporters. In the kidney the type II renal Na+/Pi-cotransporters protein allows re-absorption of filtered Pi in the proximal tubule. Q#15065 - CGI_10019440 superfamily 202351 374 502 4.49E-12 64.0552 cl03662 Na_Pi_cotrans superfamily C - Na+/Pi-cotransporter; This is a family of mainly mammalian type II renal Na+/Pi-cotransporters with other related sequences from lower eukaryotes and bacteria some of which are also Na+/Pi-cotransporters. In the kidney the type II renal Na+/Pi-cotransporters protein allows re-absorption of filtered Pi in the proximal tubule. 
Q#15066 - CGI_10019441 superfamily 202351 27 150 8.80E-12 62.1292 cl03662 Na_Pi_cotrans superfamily C - Na+/Pi-cotransporter; This is a family of mainly mammalian type II renal Na+/Pi-cotransporters with other related sequences from lower eukaryotes and bacteria some of which are also Na+/Pi-cotransporters. In the kidney the type II renal Na+/Pi-cotransporters protein allows re-absorption of filtered Pi in the proximal tubule. Q#15066 - CGI_10019441 superfamily 202351 181 314 6.64E-06 44.41 cl03662 Na_Pi_cotrans superfamily C - Na+/Pi-cotransporter; This is a family of mainly mammalian type II renal Na+/Pi-cotransporters with other related sequences from lower eukaryotes and bacteria some of which are also Na+/Pi-cotransporters. In the kidney the type II renal Na+/Pi-cotransporters protein allows re-absorption of filtered Pi in the proximal tubule. Q#15067 - CGI_10019442 superfamily 241782 42 408 2.40E-169 482.106 cl00321 AAT_I superfamily - - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. 
Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phosphorylase family (fold type V)." Q#15068 - CGI_10019443 superfamily 241884 5 217 8.77E-142 399.348 cl00467 Ntn_hydrolase superfamily - - "The Ntn hydrolases (N-terminal nucleophile) are a diverse superfamily of enzymes that are activated autocatalytically via an N-terminally located nucleophilic amino acid. N-terminal nucleophile (NTN-) hydrolase superfamily, which contains a four-layered alpha, beta, beta, alpha core structure. This family of hydrolases includes penicillin acylase, the 20S proteasome alpha and beta subunits, and glutamate synthase. The mechanism of activation of these proteins is conserved, although they differ in their substrate specificities. All known members catalyze the hydrolysis of amide bonds in either proteins or small molecules, and each one of them is synthesized as a preprotein. For each, an autocatalytic endoproteolytic process generates a new N-terminal residue. This mature N-terminal residue is central to catalysis and acts as both a polarizing base and a nucleophile during the reaction. The N-terminal amino group acts as the proton acceptor and activates either the nucleophilic hydroxyl in a Ser or Thr residue or the nucleophilic thiol in a Cys residue. The position of the N-terminal nucleophile in the active site and the mechanism of catalysis are conserved in this family, despite considerable variation in the protein sequences." 
Q#15069 - CGI_10019444 superfamily 241666 60 299 6.39E-50 178.745 cl00184 CAS_like superfamily - - "Clavaminic acid synthetase (CAS) -like; CAS is a trifunctional Fe(II)/ 2-oxoglutarate (2OG) oxygenase carrying out three reactions in the biosynthesis of clavulanic acid, an inhibitor of class A serine beta-lactamases. In general, Fe(II)-2OG oxygenases catalyze a hydroxylation reaction, which leads to the incorporation of an oxygen atom from dioxygen into a hydroxyl group and conversion of 2OG to succinate and CO2" Q#15069 - CGI_10019444 superfamily 190398 656 980 1.72E-52 188.115 cl03672 DUF221 superfamily - - "Domain of unknown function DUF221; This family consists of hypothetical transmembrane proteins none of which have any function, the aligned region is at 538 residues at maximum length." Q#15069 - CGI_10019444 superfamily 222479 429 500 0.0029925 38.2608 cl16507 RSN1_TM superfamily N - "Late exocytosis, associated with Golgi transport; This family represents the first three transmembrane regions of 11-TM proteins involved in vesicle transport. In S. cerevisiae these proteins are members of the yeast facilitator superfamily and are integral membrane proteins localised to the cell periphery, in particular to the bud-neck region. The distribution is consistent with a role in late exocytosis which is in agreement with the proteins' ability to substitute for the function of Sro7p, required for the sorting of the protein Enap1 into Golgi-derived vesicles destined for the cell surface." Q#15071 - CGI_10019446 superfamily 241616 59 140 7.61E-30 111.583 cl00109 MADS superfamily - - "MADS: MCM1, Agamous, Deficiens, and SRF (serum response factor) box family of eukaryotic transcriptional regulators. Binds DNA and exists as hetero and homo-dimers. Composed of 2 main subgroups: SRF-like/Type I and MEF2-like (myocyte enhancer factor 2)/ Type II. 
These subgroups differ mainly in position of the alpha 2 helix responsible for the dimerization interface; Important in homeotic regulation in plants and in immediate-early development in animals. Also found in fungi." Q#15072 - CGI_10019447 superfamily 246748 53 307 1.87E-117 343.815 cl14876 Zinc_peptidase_like superfamily - - "Zinc peptidases M18, M20, M28, and M42; Zinc peptidases play vital roles in metabolic and signaling pathways throughout all kingdoms of life. This family corresponds to several clans in the MEROPS database, including the MH clan, which contains 4 families (M18, M20, M28, M42). The peptidase M20 family includes carboxypeptidases such as the glutamate carboxypeptidase from Pseudomonas, the thermostable carboxypeptidase Ss1 of broad specificity from archaea and yeast Gly-X carboxypeptidase. The dipeptidases include bacterial dipeptidase, peptidase V (PepV), a eukaryotic, non-specific dipeptidase, and two Xaa-His dipeptidases (carnosinases). There is also the bacterial aminopeptidase, peptidase T (PepT) that acts only on tripeptide substrates and has therefore been termed a tripeptidase. Peptidase family M28 contains aminopeptidases and carboxypeptidases, and has co-catalytic zinc ions. However, several enzymes in this family utilize other first row transition metal ions such as cobalt and manganese. Each zinc ion is tetrahedrally co-ordinated, with three amino acid ligands plus activated water; one aspartate residue binds both metal ions. The aminopeptidases in this family are also called bacterial leucyl aminopeptidases, but are able to release a variety of N-terminal amino acids. IAP aminopeptidase and aminopeptidase Y preferentially release basic amino acids while glutamate carboxypeptidase II preferentially releases C-terminal glutamates. Glutamate carbxypeptidase II and plasma glutamate carboxypeptidase hydrolyze dipeptides. Peptidase families M18 and M42 contain metalloaminopeptidases. 
M18 is widely distributed in bacteria and eukaryotes. However, only yeast aminopeptidase I and mammalian aspartyl aminopeptidase have been characterized in detail. Some of M42 (also known as glutamyl aminopeptidase) enzymes exhibit aminopeptidase specificity while others also have acylaminoacylpeptidase activity (i.e. hydrolysis of acylated N-terminal residues)." Q#15074 - CGI_10019449 superfamily 243050 15 67 1.59E-21 85.8924 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#15074 - CGI_10019449 superfamily 241599 160 218 1.42E-19 80.3652 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." 
Q#15074 - CGI_10019449 superfamily 243050 74 128 1.98E-14 66.7022 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#15075 - CGI_10019450 superfamily 243050 10 62 7.58E-22 87.048 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." 
Q#15075 - CGI_10019450 superfamily 241599 155 211 2.80E-18 77.2836 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#15075 - CGI_10019450 superfamily 243050 70 124 3.61E-15 69.0134 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#15076 - CGI_10019451 superfamily 247069 101 157 4.13E-08 48.3972 cl15787 SEC14 superfamily C - "Sec14p-like lipid-binding domain. Found in secretory proteins, such as S. cerevisiae phosphatidylinositol transfer protein (Sec14p), and in lipid regulated proteins such as RhoGAPs, RhoGEFs and neurofibromin (NF1). SEC14 domain of Dbl is known to associate with G protein beta/gamma subunits." Q#15076 - CGI_10019451 superfamily 247643 21 71 1.49E-07 44.8849 cl16919 CRAL_TRIO_N superfamily - - "CRAL/TRIO, N-terminal domain; This all-alpha domain is found to the N-terminus of pfam00650." 
Q#15080 - CGI_10019455 superfamily 241832 49 70 1.93E-05 40.2477 cl00388 Thioredoxin_like superfamily N - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#15080 - CGI_10019455 superfamily 243175 115 172 2.29E-10 54.1958 cl02776 GST_C_family superfamily N - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). 
Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." Q#15081 - CGI_10019456 superfamily 241832 7 78 3.64E-17 72.9896 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). 
Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#15081 - CGI_10019456 superfamily 243175 153 210 6.47E-09 51.1142 cl02776 GST_C_family superfamily N - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. 
Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." Q#15082 - CGI_10019457 superfamily 241645 185 268 4.77E-29 106.904 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#15085 - CGI_10005856 superfamily 245596 409 638 8.40E-78 256.466 cl11394 Glyco_tranf_GTA_type superfamily - - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. 
This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#15085 - CGI_10005856 superfamily 245596 298 358 6.52E-06 47.6876 cl11394 Glyco_tranf_GTA_type superfamily C - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#15085 - CGI_10005856 superfamily 242144 696 803 0.00769055 38.0842 cl00857 DUF63 superfamily C - Membrane protein of unknown function DUF63; Proteins found in Archaebacteria of unknown function. These proteins are probably transmembrane proteins. 
Q#15086 - CGI_10005857 superfamily 241609 46 113 2.11E-21 83.9667 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#15086 - CGI_10005857 superfamily 241609 1 34 3.69E-14 64.3338 cl00100 KR superfamily N - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#15088 - CGI_10005859 superfamily 245596 156 269 1.11E-18 81.5852 cl11394 Glyco_tranf_GTA_type superfamily C - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. 
The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#15089 - CGI_10005860 superfamily 245596 517 766 2.95E-77 252.999 cl11394 Glyco_tranf_GTA_type superfamily - - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#15090 - CGI_10005861 superfamily 248458 202 381 9.25E-09 55.3977 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. 
Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#15091 - CGI_10005862 superfamily 243035 66 155 9.78E-15 71.8821 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. 
For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#15091 - CGI_10005862 superfamily 241613 158 187 4.05E-07 47.9718 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#15091 - CGI_10005862 superfamily 215647 459 528 6.44E-15 74.5672 cl18338 7tm_2 superfamily N - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GPCRs). They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling."
Q#15091 - CGI_10005862 superfamily 215647 398 473 3.47E-09 56.8481 cl18338 7tm_2 superfamily C - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GPCRs). They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#15094 - CGI_10008331 superfamily 247068 461 558 3.45E-14 70.0349 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan."
Q#15094 - CGI_10008331 superfamily 247068 369 449 5.03E-06 45.7674 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#15094 - CGI_10008331 superfamily 247068 587 667 2.06E-05 43.8414 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#15094 - CGI_10008331 superfamily 247068 676 751 0.000443427 39.5872 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#15095 - CGI_10008332 superfamily 241832 273 401 2.17E-66 210.671 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#15095 - CGI_10008332 superfamily 241832 26 125 1.73E-59 191.731 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. 
The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#15095 - CGI_10008332 superfamily 241832 160 263 1.03E-56 184.412 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#15097 - CGI_10008334 superfamily 220792 4 41 2.66E-10 58.5715 cl11150 EPL1 superfamily N - Enhancer of polycomb-like; This is a family of EPL1 (Enhancer of polycomb-like) proteins. The EPL1 protein is a member of a histone acetyltransferase complex which is involved in transcriptional activation of selected genes. 
Q#15097 - CGI_10008334 superfamily 191602 364 487 9.12E-10 58.0843 cl06011 E_Pc_C superfamily C - "Enhancer of Polycomb C-terminus; This family represents the C-terminus of eukaryotic enhancer of polycomb proteins, which have roles in heterochromatin formation. This family contains several conserved motifs." Q#15098 - CGI_10008335 superfamily 241599 1026 1082 1.25E-09 56.0977 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#15102 - CGI_10008339 superfamily 219225 1 177 4.22E-79 235.82 cl06114 FAIM1 superfamily - - "Fas apoptotic inhibitory molecule (FAIM1); This family consists of several fas apoptotic inhibitory molecule (FAIM1) proteins. FAIM expression is upregulated in B cells by anti-Ig treatment that induces Fas-resistance, and overexpression of FAIM diminishes sensitivity to Fas-mediated apoptosis of B and non-B cell lines. FAIM1 is highly evolutionarily conserved and is widely expressed in murine tissues, suggesting that FAIM plays an important role in cellular physiology." Q#15104 - CGI_10008341 superfamily 241696 121 446 8.79E-173 490.99 cl00218 Glyco_hydrolase_16 superfamily - - "glycosyl hydrolase family 16; The O-Glycosyl hydrolases are a widespread group of enzymes that hydrolyse the glycosidic bond between two or more carbohydrates, or between a carbohydrate and a non-carbohydrate moiety. A glycosyl hydrolase classification system based on sequence similarity has led to the definition of more than 95 different families including glycosyl hydrolase family 16. Family 16 includes lichenase, xyloglucan endotransglycosylase (XET), beta-agarase, kappa-carrageenase, endo-beta-1,3-glucanase, endo-beta-1,3-1,4-glucanase, and endo-beta-galactosidase, all of which have a conserved jelly roll fold with a deep active site channel harboring the catalytic residues."
Q#15105 - CGI_10011981 superfamily 242093 170 252 0.00404417 37.1089 cl00788 MttA_Hcf106 superfamily N - mttA/Hcf106 family; Members of this protein family are involved in a sec independent translocation mechanism. This pathway has been called the DeltapH pathway in chloroplasts. Members of this family in E. coli are involved in export of redox proteins with a "twin arginine" leader motif. Q#15106 - CGI_10011982 superfamily 247792 18 69 9.27E-08 49.3664 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#15108 - CGI_10011984 superfamily 243092 111 353 0.000687666 39.6256 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand 
in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#15109 - CGI_10011985 superfamily 247792 18 69 2.61E-09 53.9888 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#15109 - CGI_10011985 superfamily 243092 387 583 0.00633063 37.6996 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 
repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#15110 - CGI_10011986 superfamily 248097 356 478 1.34E-24 98.4914 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#15111 - CGI_10011987 superfamily 241563 68 109 9.70E-05 40.5404 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#15112 - CGI_10011988 superfamily 246875 183 312 7.48E-48 158.596 cl15166 RRP7_like superfamily - - "RRP7 domain ribosomal RNA-processing protein 7 (Rrp7p), ribosomal RNA-processing protein 7 homolog A (Rrp7A), and similar proteins; This CD corresponds to the RRP7 domain of Rrp7p and Rrp7A. Rrp7p is encoded by YCL031C gene from Saccharomyces cerevisiae. It is an essential yeast protein involved in pre-rRNA processing and ribosome assembly, and is speculated to be required for correct assembly of rpS27 into the pre-ribosomal particle. Rrp7A, also termed gastric cancer antigen Zg14, is the Rrp7p homolog mainly found in Metazoans. The cellular function of Rrp7A remains unclear currently. Both Rrp7p and Rrp7A harbor an N-terminal RNA recognition motif (RRM), also termed RBD (RNA binding domain) or RNP (ribonucleoprotein domain), and a C-terminal RRP7 domain." 
Q#15112 - CGI_10011988 superfamily 247723 89 191 2.16E-35 125.114 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#15113 - CGI_10011989 superfamily 241563 60 98 0.000320083 38.9996 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#15114 - CGI_10011990 superfamily 247724 313 481 1.78E-85 262.657 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#15114 - CGI_10011990 superfamily 128778 179 277 2.00E-11 61.1266 cl17972 BBC superfamily N - B-Box C-terminal domain; Coiled coil region C-terminal to (some) B-Box domains Q#15114 - CGI_10011990 superfamily 241563 34 75 0.0001149 39.77 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#15115 - CGI_10011991 superfamily 241563 68 109 5.35E-05 41.3108 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#15115 - CGI_10011991 superfamily 241563 28 59 0.00536719 35.1476 cl00034 BBOX superfamily N - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#15116 - CGI_10011992 superfamily 247724 8 108 5.21E-44 142.86 cl17170 Ras_like_GTPase superfamily N - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#15117 - CGI_10011993 superfamily 241563 68 109 0.000120925 40.1552 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#15118 - CGI_10011994 superfamily 241563 31 72 3.51E-05 41.696 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#15119 - CGI_10011995 superfamily 241563 111 152 0.00275704 36.3032 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#15120 - CGI_10011996 superfamily 241563 68 109 0.00012948 40.1552 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#15121 - CGI_10011997 superfamily 241563 68 109 0.000131488 40.1552 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#15122 - CGI_10011998 superfamily 241563 68 109 0.000518184 38.2292 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#15122 - CGI_10011998 superfamily 241563 21 59 0.00213879 36.3032 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#15123 - CGI_10011999 superfamily 241563 68 109 0.000542421 38.2292 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#15123 - CGI_10011999 superfamily 241563 22 59 0.00696747 34.7624 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#15124 - CGI_10012000 superfamily 243095 368 578 4.02E-85 267.623 cl02570 RhoGAP superfamily - - "RhoGAP: GTPase-activator protein (GAP) for Rho-like GTPases; GAPs towards Rho/Rac/Cdc42-like small GTPases. Small GTPases (G proteins) cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when bound to GDP. The Rho family of small G proteins, which includes Cdc42Hs, activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. G proteins generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. The RhoGAPs are one of the major classes of regulators of Rho G proteins." 
Q#15124 - CGI_10012000 superfamily 241566 304 352 5.32E-09 53.2648 cl00040 C1 superfamily - - "Protein kinase C conserved region 1 (C1). Cysteine-rich zinc binding domain. Some members of this domain family bind phorbol esters and diacylglycerol, some are reported to bind RasGTP. May occur in tandem arrangement. Diacylglycerol (DAG) is a second messenger, released by activation of Phospholipase D. Phorbol Esters (PE) can act as analogues of DAG and mimic its downstream effects in, for example, tumor promotion. Protein Kinases C are activated by DAG/PE, this activation is mediated by their N-terminal conserved region (C1). DAG/PE binding may be phospholipid dependent. C1 domains may also mediate DAG/PE signals in chimaerins (a family of Rac GTPase activating proteins), RasGRPs (exchange factors for Ras/Rap1), and Munc13 isoforms (scaffolding proteins involved in exocytosis)." Q#15125 - CGI_10012001 superfamily 207546 277 337 3.41E-32 116.366 cl02165 CBFB_NFYA superfamily - - CCAAT-binding transcription factor (CBF-B/NF-YA) subunit B; CCAAT-binding transcription factor (CBF-B/NF-YA) subunit B. Q#15126 - CGI_10012002 superfamily 242882 3 98 1.12E-63 192.803 cl02102 S10_plectin superfamily - - Plectin/S10 domain; This presumed domain is found at the N-terminus of some isoforms of the cytoskeletal muscle protein plectin as well as the ribosomal S10 protein. This domain may be involved in RNA binding. Q#15127 - CGI_10012003 superfamily 244913 1 454 5.47E-165 489.02 cl08327 Glyco_hydro_47 superfamily - - "Glycosyl hydrolase family 47; Members of this family are alpha-mannosidases that catalyze the hydrolysis of the terminal 1,2-linked alpha-D-mannose residues in the oligo-mannose oligosaccharide Man(9)(GlcNAc)(2)." Q#15129 - CGI_10004216 superfamily 245040 26 89 3.49E-05 38.5633 cl09238 CY superfamily C - "Cystatin-like domain; Cystatins are a family of cysteine protease inhibitors that occur mainly as single domain proteins. 
However some extracellular proteins such as kininogen, His-rich glycoprotein and fetuin also contain these domains." Q#15137 - CGI_10008292 superfamily 215686 11 145 0.00638987 34.3153 cl18340 Lipocalin superfamily - - "Lipocalin / cytosolic fatty-acid binding protein family; Lipocalins are transporters for small hydrophobic molecules, such as lipids, steroid hormones, bilins, and retinoids. The family also encompasses the enzyme prostaglandin D synthase (EC:5.3.99.2). Alignment subsumes both the lipocalin and fatty acid binding protein signatures from PROSITE. This is supported on structural and functional grounds. The structure is an eight-stranded beta barrel." Q#15138 - CGI_10008293 superfamily 247068 591 687 3.55E-15 73.1165 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#15138 - CGI_10008293 superfamily 247068 378 466 3.72E-09 55.3974 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#15138 - CGI_10008293 superfamily 247068 270 367 2.75E-06 46.5378 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#15138 - CGI_10008293 superfamily 245213 805 843 6.87E-06 44.5498 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#15138 - CGI_10008293 superfamily 247068 493 582 0.000118104 41.5302 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#15138 - CGI_10008293 superfamily 216265 874 985 1.79E-24 101.225 cl03079 Cadherin_C superfamily - - Cadherin cytoplasmic region; Cadherins are vital in cell-cell adhesion during tissue differentiation. Cadherins are linked to the cytoskeleton by catenins. Catenins bind to the cytoplasmic tail of the cadherin. Cadherins cluster to form foci of homophilic binding units. A key determinant to the strength of the binding that it is mediated by cadherins is the juxtamembrane region of the cadherin. This region induces clustering and also binds to the protein p120ctn. Q#15138 - CGI_10008293 superfamily 214565 767 813 0.000737598 39.0829 cl18312 VWC_out superfamily - - von Willebrand factor (vWF) type C domain; von Willebrand factor (vWF) type C domain. Q#15139 - CGI_10008294 superfamily 234583 1 365 1.61E-135 398.382 cl18873 PRK00029 superfamily N - hypothetical protein; Validated Q#15140 - CGI_10008295 superfamily 241831 19 81 4.21E-23 85.3006 cl00386 BolA superfamily - - BolA-like protein; This family consist of the morphoprotein BolA from E. coli and its various homologues. In E. 
coli over expression of this protein causes round morphology and may be involved in switching the cell between elongation and septation systems during cell division. The expression of BolA is growth rate regulated and is induced during the transition into the stationary phase. BolA is also induced by stress during early stages of growth and may have a general role in stress response. It has also been suggested that BolA can induce the transcription of penicillin binding proteins 6 and 5. Q#15144 - CGI_10008299 superfamily 248024 194 370 2.75E-17 79.6345 cl17470 SBF superfamily - - "Sodium Bile acid symporter family; This family consists of Na+/bile acid co-transporters. These transmembrane proteins function in the liver in the uptake of bile acids from portal blood plasma a process mediated by the co-transport of Na+. Also in the family is ARC3 from S. cerevisiae, this is a putative transmembrane protein involved in resistance to arsenic compounds." Q#15145 - CGI_10008300 superfamily 247648 165 255 2.24E-32 116.081 cl16941 NTP-PPase superfamily - - "Nucleoside Triphosphate Pyrophosphohydrolase (EC 3.6.1.8) MazG-like domain superfamily; This superfamily contains enzymes that hydrolyze the alpha-beta phosphodiester bond of all canonical NTPs into monophosphate derivatives and pyrophosphate (PPi). Divalent ions, such as Mg2+ ion(s), are essential to activate a proposed water nucleophile and stabilize the charged intermediates to facilitate catalysis. These enzymes share a conserved divalent ion-binding motif EXX[E/D] in their active sites. They also share a highly conserved four-helix bundle, where one face forms the active site, while the other participates in oligomer assembly. The four-helix bundle consists of two central antiparallel alpha-helices that can be contained within a single protomer or form upon dimerization. 
The superfamily members include dimeric dUTP pyrophosphatases (dUTPases; EC 3.6.1.23), the nonspecific NTP-PPase MazG proteins, HisE-encoded phosphoribosyl ATP pyrophosphohydolase (PRA-PH), fungal histidine biosynthesis trifunctional proteins, and several uncharacterized protein families." Q#15145 - CGI_10008300 superfamily 247648 41 131 5.67E-31 112.229 cl16941 NTP-PPase superfamily - - "Nucleoside Triphosphate Pyrophosphohydrolase (EC 3.6.1.8) MazG-like domain superfamily; This superfamily contains enzymes that hydrolyze the alpha-beta phosphodiester bond of all canonical NTPs into monophosphate derivatives and pyrophosphate (PPi). Divalent ions, such as Mg2+ ion(s), are essential to activate a proposed water nucleophile and stabilize the charged intermediates to facilitate catalysis. These enzymes share a conserved divalent ion-binding motif EXX[E/D] in their active sites. They also share a highly conserved four-helix bundle, where one face forms the active site, while the other participates in oligomer assembly. The four-helix bundle consists of two central antiparallel alpha-helices that can be contained within a single protomer or form upon dimerization. The superfamily members include dimeric dUTP pyrophosphatases (dUTPases; EC 3.6.1.23), the nonspecific NTP-PPase MazG proteins, HisE-encoded phosphoribosyl ATP pyrophosphohydolase (PRA-PH), fungal histidine biosynthesis trifunctional proteins, and several uncharacterized protein families." Q#15149 - CGI_10001752 superfamily 247743 13 166 5.85E-14 68.7119 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. 
Members of the AAA+ ATPases function as molecular chaperones, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#15150 - CGI_10001753 superfamily 243176 1 311 1.76E-89 278.956 cl02777 chaperonin_like superfamily N - "chaperonin_like superfamily. Chaperonins are involved in productive folding of proteins. They share a common general morphology, a double toroid of 2 stacked rings, each composed of 7-9 subunits. There are 2 main chaperonin groups. The symmetry of type I is seven-fold and they are found in eubacteria (GroEL) and in organelles of eubacterial descent (hsp60 and RBP). The symmetry of type II is eight- or nine-fold and they are found in archaea (thermosome), thermophilic bacteria (TF55) and in the eukaryotic cytosol (CTT). Their common function is to sequester nonnative proteins inside their central cavity and promote folding by using energy derived from ATP hydrolysis. This superfamily also contains related domains from Fab1-like phosphatidylinositol 3-phosphate (PtdIns3P) 5-kinases that only contain the intermediate and apical domains." Q#15151 - CGI_10001755 superfamily 243116 398 773 7.30E-161 474.217 cl02626 DNA_pol_A superfamily - - "Family A polymerase primarily fills DNA gaps that arise during DNA repair, recombination and replication; DNA polymerase family A, 5'-3' polymerase domain. Family A polymerase functions primarily to fill DNA gaps that arise during DNA repair, recombination and replication. DNA-dependent DNA polymerases can be classified into six main groups based upon phylogenetic relationships with E. coli polymerase I (class A), E. coli polymerase II (class B), E. coli polymerase III (class C), euryarchaeota polymerase II (class D), human polymerase beta (class X), E. 
coli UmuC/DinB and eukaryotic RAP 30/Xeroderma pigmentosum variant (class Y). Family A polymerases are found primarily in organisms related to prokaryotes and include prokaryotic DNA polymerase I, mitochondrial polymerase gamma, and several bacteriophage polymerases including those from odd-numbered phage (T3, T5, and T7). Prokaryotic polymerase I (pol I) has two functional domains located on the same polypeptide; a 5'-3' polymerase and a 5'-3' exonuclease. Pol I uses its 5' nuclease activity to remove the ribonucleotide portion of newly synthesized Okazaki fragments and the DNA polymerase activity to fill in the resulting gap. The structure of these polymerases resembles in overall morphology a cupped human right hand, with fingers (which bind an incoming nucleotide and interact with the single-stranded template), palm (which harbors the catalytic amino acid residues and also binds an incoming dNTP) and thumb (which binds double-stranded DNA) subdomains." Q#15151 - CGI_10001755 superfamily 246724 90 162 4.57E-29 111.337 cl14815 H3TH_StructSpec-5'-nucleases superfamily - - "H3TH domains of structure-specific 5' nucleases (or flap endonuclease-1-like) involved in DNA replication, repair, and recombination; The 5' nucleases of this superfamily are capable of both 5'-3' exonucleolytic activity and cleaving bifurcated or branched DNA, in an endonucleolytic, structure-specific manner, and are involved in DNA replication, repair, and recombination. The superfamily includes the H3TH (helix-3-turn-helix) domains of Flap Endonuclease-1 (FEN1), Exonuclease-1 (EXO1), Mkt1, Gap Endonuclease 1 (GEN1) and Xeroderma pigmentosum complementation group G (XPG) nuclease. Also included are the H3TH domains of the 5'-3' exonucleases of DNA polymerase I and single domain protein homologs, as well as, the bacteriophage T4 RNase H, T5-5'nuclease, and other homologs. 
These nucleases contain a PIN (PilT N terminus) domain with a helical arch/clamp region/I domain (not included here) and inserted within the C-terminal region of the PIN domain is an atypical helix-hairpin-helix-2 (HhH2)-like region. This atypical HhH2 region, the H3TH domain, has an extended loop with at least three turns between the first two helices, and only three of the four helices appear to be conserved. Both the H3TH domain and the helical arch/clamp region are involved in DNA binding. Studies suggest that a glycine-rich loop in the H3TH domain contacts the phosphate backbone of the template strand in the downstream DNA duplex. Typically, the nucleases within this superfamily have a carboxylate rich active site that is involved in binding essential divalent metal ion cofactors (i. e., Mg2+, Mn2+, Zn2+, or Co2+) required for nuclease activity. The first metal binding site is composed entirely of Asp/Glu residues from the PIN domain, whereas, the second metal binding site is composed generally of two Asp residues from the PIN domain and one or two Asp residues from the H3TH domain. Together with the helical arch and network of amino acids interacting with metal binding ions, the H3TH region defines a positively charged active-site DNA-binding groove in structure-specific 5' nucleases." Q#15151 - CGI_10001755 superfamily 246722 1 82 1.87E-10 59.7519 cl14812 PIN_SF superfamily N - "PIN (PilT N terminus) domain: Superfamily; PIN_SF The PIN (PilT N terminus) domain belongs to a large nuclease superfamily with representatives from eukaryota, eubacteria, and archaea. PIN domains were originally named for their sequence similarity to the N-terminal domain of an annotated pili biogenesis protein, PilT, a domain fusion between a PIN-domain and a PilT ATPase domain. 
The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues) is geometrically similar in the active center of structure-specific 5' nucleases (also known as Flap endonuclease-1-like), PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Seen here, are two major divisions in the PIN domain superfamily. The first major division, the structure-specific 5' nuclease family, is represented by FEN1, the 5'-3' exonuclease of DNA polymerase I, and T4 RNase H nuclease PIN domains. These 5' nucleases are involved in DNA replication, repair, and recombination. They are capable of both 5'-3' exonucleolytic activity and cleaving bifurcated DNA, in an endonucleolytic, structure-specific manner. Unique to FEN1-like nucleases, the PIN domain has a helical arch/clamp region (I domain) of variable length (approximately 16 to 800 residues) and, inserted within the C-terminal region of the PIN domain, a H3TH (helix-3-turn-helix) domain, an atypical helix-hairpin-helix-2-like region. Both the H3TH domain (not included here) and the helical arch/clamp region are involved in DNA binding. With the exception of Mkt1, these nucleases have a carboxylate rich active site that is involved in binding essential divalent metal ion cofactors (Mg2+, Mn2+, Zn2+, or Co2+). The second major division of the PIN domain superfamily, the VapC-Smg6 family, includes such eukaryotic ribonucleases as, Smg6, an essential factor in nonsense-mediated mRNA decay; Rrp44, the catalytic subunit of the exosome; and Nob1, a ribosome assembly factor critical in pre-rRNA processing. A large percentage of members in this family are bacterial ribonuclease toxins of TA operons such as Mycobacterium tuberculosis VapC and Neisseria gonorrhoeae FitB, as well as, archaeal homologs, Pyrobaculum aerophilum Pea0151 and P. aerophilum Pae2754. 
Also included are the eukaryotic Fcf1/ Utp24 (FAF1-copurifying factor 1/U three-associated protein 24) and Utp23-like proteins. Components of the small subunit processome, Fcf1/Utp24 and Utp23 are essential proteins involved in pre-rRNA processing and 40S ribosomal subunit assembly." Q#15151 - CGI_10001755 superfamily 245226 246 389 8.85E-08 51.4963 cl10012 DnaQ_like_exo superfamily - - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." 
Q#15152 - CGI_10001757 superfamily 241550 58 216 4.78E-36 132.303 cl00015 nt_trans superfamily N - "nucleotidyl transferase superfamily; nt_trans (nucleotidyl transferase) This superfamily includes the class I amino-acyl tRNA synthetases, pantothenate synthetase (PanC), ATP sulfurylase, and the cytidylyltransferases, all of which have a conserved dinucleotide-binding domain." Q#15152 - CGI_10001757 superfamily 245839 257 419 8.21E-34 124.249 cl12020 Anticodon_Ia_like superfamily - - "Anticodon-binding domain of class Ia aminoacyl tRNA synthetases and similar domains; This domain is found in a variety of class Ia aminoacyl tRNA synthetases, C-terminal to the catalytic core domain. It recognizes and specifically binds to the anticodon of the tRNA. Aminoacyl tRNA synthetases catalyze the transfer of cognate amino acids to the 3'-end of their tRNAs by specifically recognizing cognate from non-cognate amino acids. Members include valyl-, leucyl-, isoleucyl-, cysteinyl-, arginyl-, and methionyl-tRNA synthethases. This superfamily also includes a domain from MshC, an enzyme in the mycothiol biosynthetic pathway." Q#15155 - CGI_10011828 superfamily 241886 93 308 4.48E-57 188.537 cl00470 Aldo_ket_red superfamily C - "Aldo-keto reductases (AKRs) are a superfamily of soluble NAD(P)(H) oxidoreductases whose chief purpose is to reduce aldehydes and ketones to primary and secondary alcohols. AKRs are present in all phyla and are of importance to both health and industrial applications. Members have very distinct functions and include the prokaryotic 2,5-diketo-D-gluconic acid reductases and beta-keto ester reductases, the eukaryotic aldose reductases, aldehyde reductases, hydroxysteroid dehydrogenases, steroid 5beta-reductases, potassium channel beta-subunits and aflatoxin aldehyde reductases, among others." 
Q#15156 - CGI_10011829 superfamily 245603 311 397 2.16E-20 85.7767 cl11403 pepsin_retropepsin_like superfamily - - "Cellular and retroviral pepsin-like aspartate proteases; This family includes both cellular and retroviral pepsin-like aspartate proteases. The cellular pepsin and pepsin-like enzymes are twice as long as their retroviral counterparts. The cellular pepsin-like aspartic proteases are found in mammals, plants, fungi and bacteria. These well known and extensively characterized enzymes include pepsins, chymosin, rennin, cathepsins, and fungal aspartic proteases. Several have long been known to be medically (rennin, cathepsin D and E, pepsin) or commercially (chymosin) important. The eukaryotic pepsin-like proteases contain two domains possessing similar topological features. The N- and C-terminal domains, although structurally related by a 2-fold axis, have only limited sequence homology except in the vicinity of the active site. This suggests that the enzymes evolved by an ancient duplication event. The eukaryotic pepsin-like proteases have two active site ASP residues with each N- and C-terminal lobe contributing one residue. While the fungal and mammalian pepsins are bilobal proteins, retropepsins function as dimers and the monomer resembles structure of the N- or C-terminal domains of eukaryotic enzyme. The active site motif (Asp-Thr/Ser-Gly-Ser) is conserved between the retroviral and eukaryotic proteases and between the N-and C-terminal of eukaryotic pepsin-like proteases. The retropepsin-like family includes pepsin-like aspartate proteases from retroviruses, retrotransposons and retroelements; as well as eukaryotic DNA-damage-inducible proteins (DDIs), and bacterial aspartate peptidases. Retropepsin is synthesized as part of the POL polyprotein that contains an aspartyl-protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. 
This family of aspartate proteases is classified by MEROPS as the peptidase family A1 (pepsin A) and A2 (retropepsin family)." Q#15157 - CGI_10011831 superfamily 246925 153 281 2.06E-05 46.1946 cl15309 LRR_RI superfamily NC - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#15158 - CGI_10011832 superfamily 247683 226 280 2.88E-24 94.6856 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." 
Q#15158 - CGI_10011832 superfamily 247683 31 81 6.17E-24 94.0221 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#15158 - CGI_10011832 superfamily 246908 335 427 1.29E-34 124.549 cl15255 SH2 superfamily - - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." 
Q#15158 - CGI_10011832 superfamily 247683 140 193 9.79E-22 88.0911 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#15159 - CGI_10011833 superfamily 241599 789 847 1.76E-21 90.3804 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#15159 - CGI_10011833 superfamily 199908 231 291 2.71E-08 52.2579 cl16908 DnaJ_zf superfamily - - "Zinc finger domain of DnaJ and HSP40; Central/middle or CxxCxGxG-motif containing domain of DnaJ/Hsp40 (heat shock protein 40). DnaJ proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperonin family. Hsp40 proteins are characterized by the presence of an N-terminal J domain, which mediates the interaction with Hsp70. 
This central domain contains four repeats of a CxxCxGxG motif and binds to two Zinc ions. It has been implicated in substrate binding." Q#15159 - CGI_10011833 superfamily 199908 583 635 1.83E-06 46.8651 cl16908 DnaJ_zf superfamily N - "Zinc finger domain of DnaJ and HSP40; Central/middle or CxxCxGxG-motif containing domain of DnaJ/Hsp40 (heat shock protein 40). DnaJ proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperonin family. Hsp40 proteins are characterized by the presence of an N-terminal J domain, which mediates the interaction with Hsp70. This central domain contains four repeats of a CxxCxGxG motif and binds to two Zinc ions. It has been implicated in substrate binding." Q#15160 - CGI_10011834 superfamily 241596 256 294 4.16E-06 43.3567 cl00081 HLH superfamily N - "Helix-loop-helix domain, found in specific DNA- binding proteins that act as transcription factors; 60-100 amino acids long. 
A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE 5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved in both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#15162 - CGI_10011836 superfamily 248097 21 158 1.67E-22 87.7058 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#15163 - CGI_10011837 superfamily 248097 46 183 3.27E-23 93.0986 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#15163 - CGI_10011837 superfamily 248097 251 388 8.12E-23 92.3282 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#15164 - CGI_10011838 superfamily 217293 43 232 4.20E-37 135.068 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. 
Q#15164 - CGI_10011838 superfamily 202474 239 343 7.70E-12 63.4417 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#15165 - CGI_10011839 superfamily 217293 1 153 1.21E-28 110.8 cl03788 Neur_chan_LBD superfamily N - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#15165 - CGI_10011839 superfamily 202474 160 256 5.58E-09 54.5821 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#15166 - CGI_10011840 superfamily 217293 35 225 1.14E-35 134.683 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#15166 - CGI_10011840 superfamily 217293 425 552 6.22E-17 80.3695 cl03788 Neur_chan_LBD superfamily N - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#15166 - CGI_10011840 superfamily 202474 232 328 1.49E-08 54.5821 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#15166 - CGI_10011840 superfamily 202474 559 629 3.06E-07 50.7301 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. 
Q#15168 - CGI_10011842 superfamily 217293 464 667 2.05E-24 102.711 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#15168 - CGI_10011842 superfamily 217293 31 231 2.65E-24 102.326 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#15169 - CGI_10011843 superfamily 217293 25 212 8.48E-38 136.994 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#15169 - CGI_10011843 superfamily 247741 300 335 0.00186076 38.4317 cl17187 Aldolase_Class_I superfamily NC - "Class I aldolases; Class I aldolases. The class I aldolases use an active-site lysine which stabilizes a reaction intermediates via Schiff base formation, and have TIM beta/alpha barrel fold. The members of this family include 2-keto-3-deoxy-6-phosphogluconate (KDPG) and 2-keto-4-hydroxyglutarate (KHG) aldolases, transaldolase, dihydrodipicolinate synthase sub-family, Type I 3-dehydroquinate dehydratase, DeoC and DhnA proteins, and metal-independent fructose-1,6-bisphosphate aldolase. Although structurally similar, the class II aldolases use a different mechanism and are believed to have an independent evolutionary origin." Q#15170 - CGI_10011844 superfamily 217293 22 216 1.69E-25 102.711 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. 
Q#15170 - CGI_10011844 superfamily 202474 223 436 2.83E-11 61.9009 cl08379 Neur_chan_memb superfamily - - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#15171 - CGI_10011845 superfamily 217293 33 232 1.38E-28 111.185 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#15173 - CGI_10011847 superfamily 216554 147 329 1.72E-32 120.662 cl15977 zf-DHHC superfamily - - DHHC palmitoyltransferase; This family includes the well known DHHC zinc binding domain as well as three of the four conserved transmembrane regions found in this family of palmitoyltransferase enzymes. Q#15174 - CGI_10011848 superfamily 243072 121 245 3.15E-32 121.722 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#15174 - CGI_10011848 superfamily 243072 36 147 1.33E-21 91.6762 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#15174 - CGI_10011848 superfamily 243072 330 457 5.85E-21 89.7502 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#15174 - CGI_10011848 superfamily 243072 439 556 3.60E-20 87.439 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#15174 - CGI_10011848 superfamily 243073 617 654 2.04E-07 48.6205 cl02533 SOCS superfamily - - "SOCS (suppressors of cytokine signaling) box. The SOCS box is found in the C-terminal region of CIS/SOCS family proteins (in combination with a SH2 domain), ASBs (ankyrin repeat-containing proteins with a SOCS box), SSBs (SPRY domain-containing proteins with a SOCS box), and WSBs (WD40 repeat-containing proteins with a SOCS box), as well as, other miscellaneous proteins. The function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions." 
Q#15176 - CGI_10011850 superfamily 248054 5 86 1.06E-06 47.6823 cl17500 NAD_binding_8 superfamily N - NAD(P)-binding Rossmann-like domain; NAD(P)-binding Rossmann-like domain. Q#15177 - CGI_10011851 superfamily 241580 120 197 8.03E-49 162.338 cl00061 FH superfamily - - "Forkhead (FH), also known as a "winged helix". FH is named for the Drosophila fork head protein, a transcription factor which promotes terminal rather than segmental development. This family of transcription factor domains, which bind to B-DNA as monomers, are also found in the Hepatocyte nuclear factor (HNF) proteins, which provide tissue-specific gene regulation. The structure contains 2 flexible loops or "wings" in the C-terminal region, hence the term winged helix." Q#15178 - CGI_10012996 superfamily 177822 35 215 9.17E-18 79.1937 cl18088 PLN02164 superfamily N - sulfotransferase Q#15179 - CGI_10012997 superfamily 247727 218 299 0.00261428 36.6391 cl17173 AdoMet_MTases superfamily C - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#15179 - CGI_10012997 superfamily 177822 352 591 1.32E-21 95.3721 cl18088 PLN02164 superfamily N - sulfotransferase Q#15180 - CGI_10012998 superfamily 241575 20 87 4.48E-11 59.5935 cl00054 DSRM superfamily - - "Double-stranded RNA binding motif. Binding is not sequence specific but is highly specific for double stranded RNA. 
Found in a variety of proteins including dsRNA dependent protein kinase PKR, RNA helicases, Drosophila staufen protein, E. coli RNase III, RNases H1, and dsRNA dependent adenosine deaminases." Q#15180 - CGI_10012998 superfamily 241575 109 175 3.40E-09 54.2007 cl00054 DSRM superfamily - - "Double-stranded RNA binding motif. Binding is not sequence specific but is highly specific for double stranded RNA. Found in a variety of proteins including dsRNA dependent protein kinase PKR, RNA helicases, Drosophila staufen protein, E. coli RNase III, RNases H1, and dsRNA dependent adenosine deaminases." Q#15180 - CGI_10012998 superfamily 241575 211 278 1.78E-07 49.1931 cl00054 DSRM superfamily - - "Double-stranded RNA binding motif. Binding is not sequence specific but is highly specific for double stranded RNA. Found in a variety of proteins including dsRNA dependent protein kinase PKR, RNA helicases, Drosophila staufen protein, E. coli RNase III, RNases H1, and dsRNA dependent adenosine deaminases." Q#15180 - CGI_10012998 superfamily 243132 302 667 3.68E-126 381.341 cl02661 A_deamin superfamily - - "Adenosine-deaminase (editase) domain; Adenosine deaminases acting on RNA (ADARs) can deaminate adenosine to form inosine. In long double-stranded RNA, this process is non-specific; it occurs site-specifically in RNA transcripts. The former is important in defence against viruses, whereas the latter may affect splicing or untranslated regions. They are primarily nuclear proteins, but a longer isoform of ADAR1 is found predominantly in the cytoplasm. ADARs are derived from the Tad1-like tRNA deaminases that are present across eukaryotes. These in turn belong to the nucleotide/nucleic acid deaminase superfamily and are characterized by a distinct insert between the two conserved cysteines that are involved in binding zinc." 
Q#15181 - CGI_10012999 superfamily 241563 64 100 2.34E-06 45.1628 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#15182 - CGI_10013000 superfamily 241900 33 285 1.01E-101 306.087 cl00490 EEP superfamily - - "Exonuclease-Endonuclease-Phosphatase (EEP) domain superfamily; This large superfamily includes the catalytic domain (exonuclease/endonuclease/phosphatase or EEP domain) of a diverse set of proteins including the ExoIII family of apurinic/apyrimidinic (AP) endonucleases, inositol polyphosphate 5-phosphatases (INPP5), neutral sphingomyelinases (nSMases), deadenylases (such as the vertebrate circadian-clock regulated nocturnin), bacterial cytolethal distending toxin B (CdtB), deoxyribonuclease 1 (DNase1), the endonuclease domain of the non-LTR retrotransposon LINE-1, and related domains. These diverse enzymes share a common catalytic mechanism of cleaving phosphodiester bonds; their substrates range from nucleic acids to phospholipids and perhaps proteins." Q#15183 - CGI_10013001 superfamily 241900 37 292 2.67E-106 318.028 cl00490 EEP superfamily - - "Exonuclease-Endonuclease-Phosphatase (EEP) domain superfamily; This large superfamily includes the catalytic domain (exonuclease/endonuclease/phosphatase or EEP domain) of a diverse set of proteins including the ExoIII family of apurinic/apyrimidinic (AP) endonucleases, inositol polyphosphate 5-phosphatases (INPP5), neutral sphingomyelinases (nSMases), deadenylases (such as the vertebrate circadian-clock regulated nocturnin), bacterial cytolethal distending toxin B (CdtB), deoxyribonuclease 1 (DNase1), the endonuclease domain of the non-LTR retrotransposon LINE-1, and related domains. 
These diverse enzymes share a common catalytic mechanism of cleaving phosphodiester bonds; their substrates range from nucleic acids to phospholipids and perhaps proteins." Q#15185 - CGI_10013003 superfamily 243082 306 487 4.88E-39 143.778 cl02553 Peptidase_C19 superfamily - - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." Q#15185 - CGI_10013003 superfamily 243082 133 226 2.62E-07 50.9446 cl02553 Peptidase_C19 superfamily C - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." 
Q#15186 - CGI_10013004 superfamily 246675 388 678 2.14E-143 435.697 cl14615 PI-PLCc_GDPD_SF superfamily - - "Catalytic domain of phosphoinositide-specific phospholipase C-like phosphodiesterases superfamily; The PI-PLC-like phosphodiesterases superfamily represents the catalytic domains of bacterial phosphatidylinositol-specific phospholipase C (PI-PLC, EC 4.6.1.13), eukaryotic phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11), glycerophosphodiester phosphodiesterases (GP-GDE, EC 3.1.4.46), sphingomyelinases D (SMases D) (sphingomyelin phosphodiesterase D, EC 3.1.4.41) from spider venom, SMases D-like proteins, and phospholipase D (PLD) from several pathogenic bacteria, as well as their uncharacterized homologs found in organisms ranging from bacteria and archaea to metazoans, plants, and fungi. PI-PLCs are ubiquitous enzymes hydrolyzing the membrane lipid phosphoinositides to yield two important second messengers, inositol phosphates and diacylglycerol (DAG). GP-GDEs play essential roles in glycerol metabolism and catalyze the hydrolysis of glycerophosphodiesters to sn-glycerol-3-phosphate (G3P) and the corresponding alcohols that are major sources of carbon and phosphate. Both, PI-PLCs and GP-GDEs, can hydrolyze the 3'-5' phosphodiester bonds in different substrates, and utilize a similar mechanism of general base and acid catalysis with conserved histidine residues, which consists of two steps, a phosphotransfer and a phosphodiesterase reaction. This superfamily also includes Neurospora crassa ankyrin repeat protein NUC-2 and its Saccharomyces cerevisiae counterpart, Phosphate system positive regulatory protein PHO81, glycerophosphodiester phosphodiesterase (GP-GDE)-like protein SHV3 and SHV3-like proteins (SVLs). 
The residues essential for enzyme activities and metal binding are not conserved in these sequence homologs, which might suggest that the function of catalytic domains in these proteins might be distinct from those in typical PLC-like phosphodiesterases." Q#15186 - CGI_10013004 superfamily 246669 710 838 4.09E-58 197.379 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#15186 - CGI_10013004 superfamily 247725 97 205 2.74E-53 182.843 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting proteins to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." 
Q#15186 - CGI_10013004 superfamily 150071 306 387 6.57E-21 89.555 cl08538 efhand_like superfamily - - "Phosphoinositide-specific phospholipase C, efhand-like; Members of this family are predominantly found in phosphoinositide-specific phospholipase C. They adopt a structure consisting of a core of four alpha helices, in an EF like fold, and are required for functioning of the enzyme." Q#15188 - CGI_10013006 superfamily 245201 33 267 2.60E-67 221.24 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#15188 - CGI_10013006 superfamily 243239 394 481 1.15E-44 154.016 cl02916 POLO_box superfamily - - "Polo-box domain (PBD), a C-terminal tandemly repeated region of polo-like kinases; The polo-like Ser/Thr kinases (Plk1, Plk2/Snk, Plk3/Prk/Fnk, Plk4/Sak, and the inactive kinase Plk5) play various roles in cytokinesis and mitosis. At their C-terminus, they contain a tandemly repeated polo-box domain (in the case of Plk4, a tandem repeat of cryptic PBDs is found in the middle of the protein followed by a C-terminal single repeat), which appears to be involved in autoinhibition and in mediating the subcellular localization. The latter may be controlled via interactions between the polo-box domain and phospho-peptide motifs. The phosphopeptide binding site is formed at the interface between the two tandemly repeated PBDs. The PBDs of Plk4/Sak appear unique in participating in homodimer interactions, though it is not clear whether and how they interact with phosphopeptides." 
Q#15188 - CGI_10013006 superfamily 243239 496 577 1.11E-36 131.55 cl02916 POLO_box superfamily - - "Polo-box domain (PBD), a C-terminal tandemly repeated region of polo-like kinases; The polo-like Ser/Thr kinases (Plk1, Plk2/Snk, Plk3/Prk/Fnk, Plk4/Sak, and the inactive kinase Plk5) play various roles in cytokinesis and mitosis. At their C-terminus, they contain a tandemly repeated polo-box domain (in the case of Plk4, a tandem repeat of cryptic PBDs is found in the middle of the protein followed by a C-terminal single repeat), which appears to be involved in autoinhibition and in mediating the subcellular localization. The latter may be controlled via interactions between the polo-box domain and phospho-peptide motifs. The phosphopeptide binding site is formed at the interface between the two tandemly repeated PBDs. The PBDs of Plk4/Sak appear unique in participating in homodimer interactions, though it is not clear whether and how they interact with phosphopeptides." Q#15189 - CGI_10013007 superfamily 243107 22 68 1.84E-11 60.2514 cl02611 G-patch superfamily - - "G-patch domain; This domain is found in a number of RNA binding proteins, and is also found in proteins that contain RNA binding domains. This suggests that this domain may have an RNA binding function. This domain has seven highly conserved glycines." 
Q#15190 - CGI_10013008 superfamily 241900 538 866 0 545.834 cl00490 EEP superfamily - - "Exonuclease-Endonuclease-Phosphatase (EEP) domain superfamily; This large superfamily includes the catalytic domain (exonuclease/endonuclease/phosphatase or EEP domain) of a diverse set of proteins including the ExoIII family of apurinic/apyrimidinic (AP) endonucleases, inositol polyphosphate 5-phosphatases (INPP5), neutral sphingomyelinases (nSMases), deadenylases (such as the vertebrate circadian-clock regulated nocturnin), bacterial cytolethal distending toxin B (CdtB), deoxyribonuclease 1 (DNase1), the endonuclease domain of the non-LTR retrotransposon LINE-1, and related domains. These diverse enzymes share a common catalytic mechanism of cleaving phosphodiester bonds; their substrates range from nucleic acids to phospholipids and perhaps proteins." Q#15190 - CGI_10013008 superfamily 217007 58 354 1.52E-102 329.946 cl11995 Syja_N superfamily - - SacI homology domain; This Pfam family represents a protein domain which shows homology to the yeast protein SacI. The SacI homology domain is most notably found at the amino terminal of the inositol 5'-phosphatase synaptojanin. Q#15190 - CGI_10013008 superfamily 247723 865 1004 5.19E-34 129.833 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. 
The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#15192 - CGI_10013010 superfamily 247755 1 222 2.83E-70 227.819 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#15192 - CGI_10013010 superfamily 247789 340 546 5.12E-32 123.138 cl17235 ABC2_membrane superfamily - - ABC-2 type transporter; ABC-2 type transporter. Q#15195 - CGI_10013013 superfamily 152787 122 186 1.03E-13 63.0005 cl18053 V-SNARE_C superfamily - - Snare region anchored in the vesicle membrane C-terminus; Within the SNARE proteins interactions in the C-terminal half of the SNARE helix are critical to the driving of membrane fusion; whereas interactions in the N-terminal half of the SNARE domain are important for promoting priming or docking of the vesicle pfam05008. Q#15196 - CGI_10013014 superfamily 218028 166 299 2.01E-14 68.4931 cl04479 AAA_4 superfamily - - "Divergent AAA domain; This family is related to the pfam00004 family, and presumably has the same function (ATP-binding)." Q#15197 - CGI_10002545 superfamily 245201 24 216 1.49E-32 122.345 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. 
It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#15199 - CGI_10002547 superfamily 243689 29 95 9.51E-07 47.6233 cl04271 IBN_N superfamily - - Importin-beta N-terminal domain; Importin-beta N-terminal domain. Q#15200 - CGI_10002548 superfamily 241563 62 97 0.000351578 38.6144 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#15200 - CGI_10002548 superfamily 110440 483 510 0.00444607 35.4613 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#15200 - CGI_10002548 superfamily 241563 8 53 0.00716296 34.6203 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#15201 - CGI_10011741 superfamily 243058 513 592 1.51E-05 44.2276 cl02500 ARM superfamily C - "Armadillo/beta-catenin-like repeats. 
An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#15201 - CGI_10011741 superfamily 197676 19 41 0.00343878 36.6749 cl18194 ZnF_C2H2 superfamily - - zinc finger; zinc finger. Q#15203 - CGI_10011743 superfamily 207637 1 87 2.58E-17 75.267 cl02541 CIDE_N superfamily - - "CIDE_N domain, found at the N-terminus of the CIDE (cell death-inducing DFF45-like effector) proteins, as well as CAD nuclease (caspase-activated DNase/DNA fragmentation factor, DFF40) and its inhibitor, ICAD(DFF45). These proteins are associated with the chromatin condensation and DNA fragmentation events of apoptosis; the CIDE_N domain is thought to regulate the activity of ICAD/DFF45, and the CAD/DFF40 and CIDE nucleases during apoptosis. The CIDE-N domain is also found in the FSP27/CIDE-C protein." Q#15203 - CGI_10011743 superfamily 150043 108 324 1.32E-74 231.936 cl07748 DFF40 superfamily - - DNA fragmentation factor 40 kDa; Members of this family of eukaryotic apoptotic proteins induce DNA fragmentation and chromatin condensation during apoptosis. Q#15204 - CGI_10011744 superfamily 245603 213 336 2.14E-77 239.377 cl11403 pepsin_retropepsin_like superfamily - - "Cellular and retroviral pepsin-like aspartate proteases; This family includes both cellular and retroviral pepsin-like aspartate proteases. The cellular pepsin and pepsin-like enzymes are twice as long as their retroviral counterparts. 
The cellular pepsin-like aspartic proteases are found in mammals, plants, fungi and bacteria. These well known and extensively characterized enzymes include pepsins, chymosin, rennin, cathepsins, and fungal aspartic proteases. Several have long been known to be medically (rennin, cathepsin D and E, pepsin) or commercially (chymosin) important. The eukaryotic pepsin-like proteases contain two domains possessing similar topological features. The N- and C-terminal domains, although structurally related by a 2-fold axis, have only limited sequence homology except in the vicinity of the active site. This suggests that the enzymes evolved by an ancient duplication event. The eukaryotic pepsin-like proteases have two active site ASP residues with each N- and C-terminal lobe contributing one residue. While the fungal and mammalian pepsins are bilobal proteins, retropepsins function as dimers and the monomer resembles structure of the N- or C-terminal domains of eukaryotic enzyme. The active site motif (Asp-Thr/Ser-Gly-Ser) is conserved between the retroviral and eukaryotic proteases and between the N-and C-terminal of eukaryotic pepsin-like proteases. The retropepsin-like family includes pepsin-like aspartate proteases from retroviruses, retrotransposons and retroelements; as well as eukaryotic DNA-damage-inducible proteins (DDIs), and bacterial aspartate peptidases. Retropepsin is synthesized as part of the POL polyprotein that contains an aspartyl-protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. This family of aspartate proteases is classified by MEROPS as the peptidase family A1 (pepsin A) and A2 (retropepsin family)." Q#15204 - CGI_10011744 superfamily 241645 3 70 3.18E-24 95.5144 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. 
Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#15204 - CGI_10011744 superfamily 241643 410 445 0.000158045 39.3647 cl00153 UBA superfamily - - "Ubiquitin Associated domain. The UBA domain is a commonly occurring sequence motif in some members of the ubiquitination pathway, UV excision repair proteins, and certain protein kinases. Although its specific role is so far unknown, it has been suggested that UBA domains are involved in conferring protein target specificity. The domain, a compact three helix bundle, has a conserved GFP-loop and the proline is thought to be critical for binding. The UBA domain is distinct from the conserved three helical domain seen in the N-terminus of EF-TS and eukaryotic NAC proteins." Q#15207 - CGI_10011747 superfamily 245213 63 93 0.000489897 38.0014 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. 
Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#15209 - CGI_10011749 superfamily 243034 97 168 0.0054774 34.278 cl02429 TPR superfamily C - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#15210 - CGI_10011750 superfamily 219016 64 119 1.18E-19 83.0795 cl05757 DUF1077 superfamily N - Protein of unknown function (DUF1077); This family consists of several hypothetical eukaryotic proteins of unknown function. 
Q#15211 - CGI_10011751 superfamily 247723 60 127 3.65E-32 115.398 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#15211 - CGI_10011751 superfamily 206083 205 231 0.00993402 33.3504 cl16471 zf-C2H2_6 superfamily - - C2H2-type zinc finger; C2H2-type zinc finger. Q#15212 - CGI_10011752 superfamily 243109 88 242 7.70E-80 246.716 cl02614 SPRY superfamily - - "SPRY domain; SPRY domains, first identified in the SP1A kinase of Dictyostelium and rabbit Ryanodine receptor (hence the name), are homologous to B30.2. SPRY domains have been identified in at least 11 protein families, covering a wide range of functions, including regulation of cytokine signaling (SOCS), RNA metabolism (DDX1 and hnRNP), immunity to retroviruses (TRIM5alpha), intracellular calcium release (ryanodine receptors or RyR) and regulatory and developmental processes (HERC1 and Ash2L). B30.2 also contains residues in the N-terminus that form a distinct PRY domain structure; i.e. B30.2 domain consists of PRY and SPRY subdomains. 
B30.2 domains comprise the C-terminus of three protein families: BTNs (receptor glycoproteins of immunoglobulin superfamily); several TRIM proteins (composed of RING/B-box/coiled-coil or RBCC core); Stonutoxin (secreted poisonous protein of the stonefish Synanceia horrida). While SPRY domains are evolutionarily ancient, B30.2 domains are a more recent adaptation where the SPRY/PRY combination is a possible component of immune defense. Mutations found in the SPRY-containing proteins have shown to cause Mediterranean fever and Opitz syndrome." Q#15212 - CGI_10011752 superfamily 247805 273 418 4.25E-30 115.274 cl17251 DEXDc superfamily N - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#15212 - CGI_10011752 superfamily 247805 1 66 3.15E-16 76.3693 cl17251 DEXDc superfamily C - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#15214 - CGI_10019660 superfamily 241754 6 334 6.05E-156 446.633 cl00286 Motor_domain superfamily - - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. Q#15215 - CGI_10019661 superfamily 246918 32 82 8.22E-09 53.3595 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#15215 - CGI_10019661 superfamily 245814 673 732 2.38E-08 52.4091 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. 
A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#15215 - CGI_10019661 superfamily 246918 580 641 0.00205425 37.5663 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#15215 - CGI_10019661 superfamily 246918 318 347 0.0056273 36.2294 cl15278 TSP_1 superfamily C - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#15215 - CGI_10019661 superfamily 246918 793 813 0.00639939 36.0255 cl15278 TSP_1 superfamily C - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#15216 - CGI_10019662 superfamily 245201 1 240 2.20E-176 511.07 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#15216 - CGI_10019662 superfamily 243036 501 807 6.18E-79 258.82 cl02434 CNH superfamily - - "CNH domain; Domain found in NIK1-like kinase, mouse citron and yeast ROM1, ROM2. Unpublished observations." Q#15217 - CGI_10019663 superfamily 245201 2 287 0 609.372 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. 
These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#15218 - CGI_10019664 superfamily 247745 155 446 3.70E-148 439.008 cl17191 GH38-57_N_LamB_YdjC_SF superfamily - - "Catalytic domain of glycoside hydrolase (GH) families 38 and 57, lactam utilization protein LamB/YcsF family proteins, YdjC-family proteins, and similar proteins; The superfamily possesses strong sequence similarities across a wide range of all three kingdoms of life. It mainly includes four families, glycoside hydrolases family 38 (GH38), heat stable retaining glycoside hydrolases family 57 (GH57), lactam utilization protein LamB/YcsF family, and YdjC-family. The GH38 family corresponds to class II alpha-mannosidases (alphaMII, EC 3.2.1.24), which contain intermediate Golgi alpha-mannosidases II, acidic lysosomal alpha-mannosidases, animal sperm and epididymal alpha -mannosidases, neutral ER/cytosolic alpha-mannosidases, and some putative prokaryotic alpha-mannosidases. AlphaMII possess a-1,3, a-1,6, and a-1,2 hydrolytic activity, and catalyzes the degradation of N-linked oligosaccharides by employing a two-step mechanism involving the formation of a covalent glycosyl enzyme complex. GH57 is a purely prokaryotic family with the majority of thermostable enzymes from extremophiles (many of them are archaeal hyperthermophiles), which exhibit the enzyme specificities of alpha-amylase (EC 3.2.1.1), 4-alpha-glucanotransferase (EC 2.4.1.25), amylopullulanase (EC 3.2.1.1/41), and alpha-galactosidase (EC 3.2.1.22). This family also includes many hypothetical proteins with uncharacterized activity and specificity. GH57 cleaves alpha-glycosidic bond by employing a retaining mechanism, which involves a glycosyl-enzyme intermediate, allowing transglycosylation. 
Although the exact molecular function of LamB/YcsF family and YdjC-family remains unclear, they show high sequence and structure homology to the members of GH38 and GH57. Their catalytic domains adopt a similar parallel 7-stranded beta/alpha barrel, which is remotely related to catalytic NodB homology domain of the carbohydrate esterase 4 superfamily." Q#15218 - CGI_10019664 superfamily 245003 463 524 1.46E-09 55.6658 cl08536 Alpha-mann_mid superfamily N - "Alpha mannosidase, middle domain; Members of this family adopt a structure consisting of three alpha helices, in an immunoglobulin/albumin-binding domain-like fold. They are predominantly found in the enzyme alpha-mannosidase." Q#15219 - CGI_10019665 superfamily 187403 4 356 0 549.251 cl14649 BRO1_Alix_like superfamily - - "Protein-interacting Bro1-like domain of mammalian Alix and related domains; This superfamily includes the Bro1-like domains of mammalian Alix (apoptosis-linked gene-2 interacting protein X), His-Domain type N23 protein tyrosine phosphatase (HD-PTP, also known as PTPN23), RhoA-binding proteins Rhophilin-1 and Rhophilin-2, Brox, Bro1 and Rim20 (also known as PalA) from Saccharomyces cerevisiae, and related domains. Alix, HD-PTP, Brox, Bro1 and Rim20 interact with the ESCRT (Endosomal Sorting Complexes Required for Transport) system. Alix, also known as apoptosis-linked gene-2 interacting protein 1 (AIP1), participates in membrane remodeling processes during the budding of enveloped viruses, vesicle budding inside late endosomal multivesicular bodies (MVBs), and the abscission reactions of mammalian cell division. It also functions in apoptosis. HD-PTP functions in cell migration and endosomal trafficking, Bro1 in endosomal trafficking, and Rim20 in the response to the external pH via the Rim101 pathway. Bro1-like domains are boomerang-shaped, and part of the domain is a tetratricopeptide repeat (TPR)-like structure. 
Bro1-like domains bind components of the ESCRT-III complex: CHMP4 (in the case of Alix, HD-PTP, and Brox) and Snf7 (in the case of yeast Bro1, and Rim20). The single domain protein human Brox, and the isolated Bro1-like domains of Alix, HD-PTP and Rhophilin can bind human immunodeficiency virus type 1 (HIV-1) nucleocapsid. Alix, HD-PTP, Bro1, and Rim20 also have a V-shaped (V) domain, which in the case of Alix, has been shown to be a dimerization domain and to contain a binding site for the retroviral late assembly (L) domain YPXnL motif, which is partially conserved in this superfamily. Alix, HD-PTP and Bro1 also have a proline-rich region (PRR); the Alix PRR binds multiple partners. Rhophilin-1, and -2, in addition to this Bro1-like domain, have an N-terminal Rho-binding domain and a C-terminal PDZ (PSD-95, Disc-large, ZO-1) domain. HD-PTP is encoded by the PTPN23 gene, a tumor suppressor gene candidate frequently absent in human kidney, breast, lung, and cervical tumors. This protein has a C-terminal, catalytically inactive tyrosine phosphatase domain." Q#15220 - CGI_10019666 superfamily 241640 381 504 1.22E-11 62.5901 cl00149 Tryp_SPc superfamily - - Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues. Q#15222 - CGI_10019668 superfamily 241599 412 466 9.99E-12 60.72 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner."
Q#15222 - CGI_10019668 superfamily 202226 290 367 5.44E-35 125.869 cl08348 CUT superfamily - - "CUT domain; The CUT domain is a DNA-binding motif which can bind independently or in cooperation with the homeodomain, often found downstream of the CUT domain. Multiple copies of the CUT domain can exist in one protein." Q#15226 - CGI_10019672 superfamily 243034 423 527 8.67E-08 51.2268 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#15226 - CGI_10019672 superfamily 243034 101 175 3.61E-06 46.2192 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, 
cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#15226 - CGI_10019672 superfamily 243072 320 396 0.00781027 36.2075 cl02529 ANK superfamily N - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats."
Q#15228 - CGI_10019674 superfamily 246723 108 747 0 699.441 cl14813 GluZincin superfamily - - "Peptidase Gluzincin family (thermolysin-like proteinases, TLPs) includes peptidases M1, M2, M3, M4, M13, M32 and M36 (fungalysins); Gluzincin family (thermolysin-like peptidases or TLPs) includes several zinc-dependent metallopeptidases such as the M1, M2, M3, M4, M13, M32, M36 peptidases (MEROPS classification), and contain HEXXH and EXXXD motifs as part of their active site. All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. M1 family includes aminopeptidase N (APN) and leukotriene A4 hydrolase (LTA4H). APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types. LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity such that the two activities occupy different, but overlapping sites. The peptidase M3 or neurolysin-like family, includes M3, M2 and M32 metallopeptidases. The M3 peptidases have two subfamilies: M3A, includes thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (3.4.24.16), and the mitochondrial intermediate peptidase; M3B contains oligopeptidase F. M2 peptidase angiotensin converting enzyme (ACE, EC 3.4.15.1) catalyzes the conversion of decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II. ACE is a key part of the renin-angiotensin system that regulates blood pressure, thus ACE inhibitors are important for the treatment of hypertension. M32 family includes two eukaryotic enzymes from protozoa Trypanosoma cruzi, a causative agent of Chagas' disease, and Leishmania major, a parasite that causes leishmaniasis, making them attractive targets for drug development. 
The M4 family includes secreted protease thermolysin (EC 3.4.24.27), pseudolysin, aureolysin, neutral protease as well as fungalysin and bacillolysin (EC 3.4.24.28) that degrade extracellular proteins and peptides for bacterial nutrition, especially prior to sporulation. Thermolysin is widely used as a nonspecific protease to obtain fragments for peptide sequencing as well as in production of the artificial sweetener aspartame. M13 family includes neprilysin (EC 3.4.24.11) and endothelin-converting enzyme I (ECE-1, EC 3.4.24.71), which fulfill a broad range of physiological roles due to the greater variation in the S2' subsite allowing substrate specificity and are prime therapeutic targets for selective inhibition. Peptidase M36 (fungamysin) family includes endopeptidases from pathogenic fungi. Fungalysin hydrolyzes extracellular matrix proteins such as elastin and keratin. Aspergillus fumigatus causes the pulmonary disease aspergillosis by invading the lungs of immuno-compromised animals and secreting fungalysin that possibly breaks down proteinaceous structural barriers." Q#15229 - CGI_10019675 superfamily 243161 5 75 4.32E-11 57.0598 cl02739 THAP superfamily - - "THAP domain; The THAP domain is a putative DNA-binding domain (DBD) and probably also binds a zinc ion. It features the conserved C2CH architecture (consensus sequence: Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His). Other universal features include the location of the domain at the N-termini of proteins, its size of about 90 residues, a C-terminal AVPTIF box and several other conserved residues. Orthologues of the human THAP domain have been identified in other vertebrates and probably worms and flies, but not in other eukaryotes or any prokaryotes." Q#15231 - CGI_10019677 superfamily 243033 75 185 1.33E-11 58.4837 cl02428 Ependymin superfamily - - Ependymin; Ependymin. 
Q#15232 - CGI_10019678 superfamily 243161 86 150 3.72E-11 55.9042 cl02739 THAP superfamily - - "THAP domain; The THAP domain is a putative DNA-binding domain (DBD) and probably also binds a zinc ion. It features the conserved C2CH architecture (consensus sequence: Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His). Other universal features include the location of the domain at the N-termini of proteins, its size of about 90 residues, a C-terminal AVPTIF box and several other conserved residues. Orthologues of the human THAP domain have been identified in other vertebrates and probably worms and flies, but not in other eukaryotes or any prokaryotes." Q#15234 - CGI_10019680 superfamily 222090 1 113 1.28E-10 55.3566 cl18636 Methyltransf_22 superfamily NC - Methyltransferase domain; This family appears to be a methyltransferase domain. Q#15235 - CGI_10019681 superfamily 246723 281 811 0 550.369 cl14813 GluZincin superfamily - - "Peptidase Gluzincin family (thermolysin-like proteinases, TLPs) includes peptidases M1, M2, M3, M4, M13, M32 and M36 (fungalysins); Gluzincin family (thermolysin-like peptidases or TLPs) includes several zinc-dependent metallopeptidases such as the M1, M2, M3, M4, M13, M32, M36 peptidases (MEROPS classification), and contain HEXXH and EXXXD motifs as part of their active site. All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. M1 family includes aminopeptidase N (APN) and leukotriene A4 hydrolase (LTA4H). APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types. LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity such that the two activities occupy different, but overlapping sites. 
The peptidase M3 or neurolysin-like family, includes M3, M2 and M32 metallopeptidases. The M3 peptidases have two subfamilies: M3A, includes thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (3.4.24.16), and the mitochondrial intermediate peptidase; M3B contains oligopeptidase F. M2 peptidase angiotensin converting enzyme (ACE, EC 3.4.15.1) catalyzes the conversion of decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II. ACE is a key part of the renin-angiotensin system that regulates blood pressure, thus ACE inhibitors are important for the treatment of hypertension. M32 family includes two eukaryotic enzymes from protozoa Trypanosoma cruzi, a causative agent of Chagas' disease, and Leishmania major, a parasite that causes leishmaniasis, making them attractive targets for drug development. The M4 family includes secreted protease thermolysin (EC 3.4.24.27), pseudolysin, aureolysin, neutral protease as well as fungalysin and bacillolysin (EC 3.4.24.28) that degrade extracellular proteins and peptides for bacterial nutrition, especially prior to sporulation. Thermolysin is widely used as a nonspecific protease to obtain fragments for peptide sequencing as well as in production of the artificial sweetener aspartame. M13 family includes neprilysin (EC 3.4.24.11) and endothelin-converting enzyme I (ECE-1, EC 3.4.24.71), which fulfill a broad range of physiological roles due to the greater variation in the S2' subsite allowing substrate specificity and are prime therapeutic targets for selective inhibition. Peptidase M36 (fungamysin) family includes endopeptidases from pathogenic fungi. Fungalysin hydrolyzes extracellular matrix proteins such as elastin and keratin. Aspergillus fumigatus causes the pulmonary disease aspergillosis by invading the lungs of immuno-compromised animals and secreting fungalysin that possibly breaks down proteinaceous structural barriers." 
Q#15235 - CGI_10019681 superfamily 243033 92 189 1.21E-09 56.9429 cl02428 Ependymin superfamily N - Ependymin; Ependymin. Q#15237 - CGI_10019683 superfamily 247068 40 137 2.25E-06 44.997 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#15237 - CGI_10019683 superfamily 247068 145 237 1.76E-05 42.3006 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#15237 - CGI_10019683 superfamily 247068 4 32 0.0070485 34.5966 cl15786 CA_like superfamily N - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#15238 - CGI_10019684 superfamily 247068 145 219 9.58E-06 43.4562 cl15786 CA_like superfamily C - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#15238 - CGI_10019684 superfamily 247068 348 393 0.00419147 35.367 cl15786 CA_like superfamily C - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#15239 - CGI_10019685 superfamily 247724 34 211 1.70E-105 310.308 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." 
Q#15240 - CGI_10019686 superfamily 245206 562 803 1.15E-106 330.335 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#15242 - CGI_10019688 superfamily 191369 1 182 3.68E-46 153.417 cl05372 DUF837 superfamily - - Protein of unknown function (DUF837); This family consists of several eukaryotic proteins of unknown function. One of the family members is a circulating cathodic antigen (CCA) found in Schistosoma mansoni (Blood fluke). Q#15243 - CGI_10019689 superfamily 243072 408 532 1.07E-34 129.426 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#15243 - CGI_10019689 superfamily 243072 572 697 2.45E-33 125.574 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#15243 - CGI_10019689 superfamily 243072 638 763 3.28E-31 119.411 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#15243 - CGI_10019689 superfamily 243072 385 412 0.0014659 37.5336 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#15243 - CGI_10019689 superfamily 243072 547 575 0.00174975 37.1484 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#15244 - CGI_10009217 superfamily 248012 478 598 1.02E-10 59.5945 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#15244 - CGI_10009217 superfamily 214507 356 393 0.000239298 39.3356 cl15307 LRRCT superfamily C - Leucine rich repeat C-terminal domain; Leucine rich repeat C-terminal domain. Q#15246 - CGI_10009219 superfamily 241624 22 286 1.12E-91 275.744 cl00120 PP2Cc superfamily - - "Serine/threonine phosphatases, family 2C, catalytic domain; The protein architecture and deduced catalytic mechanism of PP2C phosphatases are similar to the PP1, PP2A, PP2B family of protein Ser/Thr phosphatases, with which PP2C shares no sequence similarity." Q#15247 - CGI_10009220 superfamily 217492 97 629 4.47E-104 327.693 cl18413 GH3 superfamily - - GH3 auxin-responsive promoter; GH3 auxin-responsive promoter. Q#15248 - CGI_10009221 superfamily 242849 6 104 2.98E-11 55.2877 cl02041 Cyt-b5 superfamily - - Cytochrome b5-like Heme/Steroid binding domain; This family includes heme binding domains from a diverse range of proteins. This family also includes proteins that bind to steroids. The family includes progesterone receptors. Many members of this subfamily are membrane anchored by an N-terminal transmembrane alpha helix. This family also includes a domain in some chitin synthases. There is no known ligand for this domain in the chitin synthases. Q#15249 - CGI_10009222 superfamily 218485 410 519 4.50E-63 204.292 cl04974 ETF_QO superfamily - - "Electron transfer flavoprotein-ubiquinone oxidoreductase; Electron-transfer flavoprotein-ubiquinone oxidoreductase (ETF-QO) in the inner mitochondrial membrane accepts electrons from electron-transfer flavoprotein which is located in the mitochondrial matrix and reduces ubiquinone in the mitochondrial membrane. 
The two redox centres in the protein, FAD and a [4Fe4S] cluster, are present in a 64-kDa monomer." Q#15249 - CGI_10009222 superfamily 248054 19 61 1.35E-05 43.2296 cl17500 NAD_binding_8 superfamily C - NAD(P)-binding Rossmann-like domain; NAD(P)-binding Rossmann-like domain. Q#15250 - CGI_10009223 superfamily 220657 1 196 4.85E-55 193.335 cl10940 RAI16-like superfamily N - Retinoic acid induced 16-like protein; This is the conserved N-terminal 450 residues of a family of proteins described as retinoic acid-induced protein 16-like proteins. The exact function is not known. The proteins are found from worms to humans. Q#15252 - CGI_10009225 superfamily 247683 173 226 1.36E-25 101.257 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." 
Q#15252 - CGI_10009225 superfamily 247792 11 56 6.80E-12 61.6928 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#15252 - CGI_10009225 superfamily 247683 368 422 1.88E-26 103.628 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction."
Q#15252 - CGI_10009225 superfamily 247683 112 163 5.28E-25 99.3584 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#15252 - CGI_10009225 superfamily 247683 728 782 9.64E-25 98.6923 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. 
Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#15256 - CGI_10009229 superfamily 243035 321 363 7.11E-05 41.4314 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. 
Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#15257 - CGI_10009230 superfamily 241763 16 194 2.65E-85 253.7 cl00298 Peptidase_C1 superfamily - - "C1 Peptidase family (MEROPS database nomenclature), also referred to as the papain family; composed of two subfamilies of cysteine peptidases (CPs), C1A (papain) and C1B (bleomycin hydrolase). Papain-like enzymes are mostly endopeptidases with some exceptions like cathepsins B, C, H and X, which are exopeptidases. Papain-like CPs have different functions in various organisms. Plant CPs are used to mobilize storage proteins in seeds while mammalian CPs are primarily lysosomal enzymes responsible for protein degradation in the lysosome. Papain-like CPs are synthesized as inactive proenzymes with N-terminal propeptide regions, which are removed upon activation. Bleomycin hydrolase (BH) is a CP that detoxifies bleomycin by hydrolysis of an amide group. It acts as a carboxypeptidase on its C-terminus to convert itself into an aminopeptidase and peptide ligase. 
BH is found in all tissues in mammals as well as in many other eukaryotes. It forms a hexameric ring barrel structure with the active sites imbedded in the central channel. Some members of the C1 family are proteins classified as non-peptidase homologs which lack peptidase activity or have missing active site residues." Q#15258 - CGI_10009231 superfamily 241763 52 263 1.07E-112 326.118 cl00298 Peptidase_C1 superfamily - - "C1 Peptidase family (MEROPS database nomenclature), also referred to as the papain family; composed of two subfamilies of cysteine peptidases (CPs), C1A (papain) and C1B (bleomycin hydrolase). Papain-like enzymes are mostly endopeptidases with some exceptions like cathepsins B, C, H and X, which are exopeptidases. Papain-like CPs have different functions in various organisms. Plant CPs are used to mobilize storage proteins in seeds while mammalian CPs are primarily lysosomal enzymes responsible for protein degradation in the lysosome. Papain-like CPs are synthesized as inactive proenzymes with N-terminal propeptide regions, which are removed upon activation. Bleomycin hydrolase (BH) is a CP that detoxifies bleomycin by hydrolysis of an amide group. It acts as a carboxypeptidase on its C-terminus to convert itself into an aminopeptidase and peptide ligase. BH is found in all tissues in mammals as well as in many other eukaryotes. It forms a hexameric ring barrel structure with the active sites imbedded in the central channel. Some members of the C1 family are proteins classified as non-peptidase homologs which lack peptidase activity or have missing active site residues." Q#15258 - CGI_10009231 superfamily 244586 5 25 0.000309094 37.6083 cl07031 Inhibitor_I29 superfamily N - Cathepsin propeptide inhibitor domain (I29); This domain is found at the N-terminus of some C1 peptidases such as Cathepsin L where it acts as a propeptide. 
There are also a number of proteins that are composed solely of multiple copies of this domain such as the peptidase inhibitor salarin. This family is classified as I29 by MEROPS. Q#15259 - CGI_10000914 superfamily 247684 1 289 1.45E-65 217.145 cl17037 NBD_sugar-kinase_HSP70_actin superfamily N - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#15261 - CGI_10009331 superfamily 220692 116 330 0.000221135 41.4209 cl18570 7TM_GPCR_Srw superfamily N - Serpentine type 7TM GPCR chemoreceptor Srw; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. 
Srw is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. The genes encoding Srw do not appear to be under as strong an adaptive evolutionary pressure as those of Srz. Q#15263 - CGI_10009333 superfamily 247724 27 81 6.97E-13 61.703 cl17170 Ras_like_GTPase superfamily N - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#15265 - CGI_10009844 superfamily 247743 246 389 1.19E-23 97.9871 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. 
The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#15265 - CGI_10009844 superfamily 216502 451 654 4.70E-52 179.71 cl03209 Peptidase_M41 superfamily - - Peptidase family M41; Peptidase family M41. Q#15266 - CGI_10009845 superfamily 150442 199 319 2.52E-54 175.737 cl10750 Tmemb_18A superfamily - - "Transmembrane protein 188; The function of this family of transmembrane proteins has not, as yet, been determined." Q#15266 - CGI_10009845 superfamily 218118 80 140 2.68E-09 53.0017 cl04552 CD225 superfamily - - "Interferon-induced transmembrane protein; This family includes the human leukocyte antigen CD225, which is an interferon inducible transmembrane protein, and is associated with interferon induced cell growth suppression." Q#15267 - CGI_10009846 superfamily 247723 126 194 1.86E-27 103.842 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. 
The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#15268 - CGI_10009847 superfamily 219120 297 428 0.00243526 39.8035 cl05929 Mycoplasma_p37 superfamily NC - "High affinity transport system protein p37; This family consists of several high affinity transport system protein p37 sequences which are specific to Mycoplasma species. The p37 gene is part of an operon encoding two additional proteins which are highly similar to components of the periplasmic binding-protein-dependent transport systems of Gram-negative bacteria. It has been suggested that p37 is part of a homologous, high-affinity transport system in M. hyorhinis, a Gram-positive bacterium." Q#15269 - CGI_10009848 superfamily 243176 87 313 1.80E-13 69.1785 cl02777 chaperonin_like superfamily N - "chaperonin_like superfamily. Chaperonins are involved in productive folding of proteins. They share a common general morphology, a double toroid of 2 stacked rings, each composed of 7-9 subunits. There are 2 main chaperonin groups. The symmetry of type I is seven-fold and they are found in eubacteria (GroEL) and in organelles of eubacterial descent (hsp60 and RBP). The symmetry of type II is eight- or nine-fold and they are found in archea (thermosome), thermophilic bacteria (TF55) and in the eukaryotic cytosol (CTT). Their common function is to sequester nonnative proteins inside their central cavity and promote folding by using energy derived from ATP hydrolysis. This superfamily also contains related domains from Fab1-like phosphatidylinositol 3-phosphate (PtdIns3P) 5-kinases that only contain the intermediate and apical domains." Q#15269 - CGI_10009848 superfamily 243176 34 96 4.35E-09 56.1418 cl02777 chaperonin_like superfamily C - "chaperonin_like superfamily. Chaperonins are involved in productive folding of proteins.
They share a common general morphology, a double toroid of 2 stacked rings, each composed of 7-9 subunits. There are 2 main chaperonin groups. The symmetry of type I is seven-fold and they are found in eubacteria (GroEL) and in organelles of eubacterial descent (hsp60 and RBP). The symmetry of type II is eight- or nine-fold and they are found in archea (thermosome), thermophilic bacteria (TF55) and in the eukaryotic cytosol (CTT). Their common function is to sequester nonnative proteins inside their central cavity and promote folding by using energy derived from ATP hydrolysis. This superfamily also contains related domains from Fab1-like phosphatidylinositol 3-phosphate (PtdIns3P) 5-kinases that only contain the intermediate and apical domains." Q#15270 - CGI_10009849 superfamily 243072 500 612 1.25E-28 113.633 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#15270 - CGI_10009849 superfamily 246925 1074 1307 4.14E-08 55.4394 cl15309 LRR_RI superfamily - - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." 
Q#15272 - CGI_10009851 superfamily 222150 317 342 0.000770835 36.9861 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#15272 - CGI_10009851 superfamily 246975 303 324 0.00546096 34.6301 cl15478 zf-C2H2 superfamily - - "Zinc finger, C2H2 type; The C2H2 zinc finger is the classical zinc finger domain. The two conserved cysteines and histidines co-ordinate a zinc ion. The following pattern describes the zinc finger. #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C] Where X can be any amino acid, and numbers in brackets indicate the number of residues. The positions marked # are those that are important for the stable fold of the zinc finger. The final position can be either his or cys. The C2H2 zinc finger is composed of two short beta strands followed by an alpha helix. The amino terminal part of the helix binds the major groove in DNA binding zinc fingers. The accepted consensus binding sequence for Sp1 is usually defined by the asymmetric hexanucleotide core GGGCGG but this sequence does not include, among others, the GAG (=CTC) repeat that constitutes a high-affinity site for Sp1 binding to the wt1 promoter." Q#15274 - CGI_10009853 superfamily 241628 978 1158 4.12E-89 287.99 cl00130 PseudoU_synth superfamily - - "Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi); Pseudouridine synthases contains the RsuA/RluD, TruA, TruB and TruD families. This group consists of eukaryotic, bacterial and archeal pseudouridine synthases. Some psi sites such as psi55,13,38 and 39 in tRNA are highly conserved, being in the same position in eubacteria, archeabacteria and eukaryotes. Other psi sites occur in a more restricted fashion, for example psi2604in 23S RNA made by E.coli RluF has only been detected in E.coli. Human dyskerin with the help of guide RNAs makes the hundreds of psueudouridnes present in rRNA and small nuclear RNAs (snRNAs). 
Mutations in human dyskerin cause X-linked dyskeratosis congenitas. Missense mutation in human PUS1 causes mitochondrial myopathy and sideroblastic anemia (MLASA)." Q#15274 - CGI_10009853 superfamily 245212 945 1002 0.000398787 40.3106 cl09940 S4 superfamily N - "S4/Hsp/ tRNA synthetase RNA-binding domain; The domain surface is populated by conserved, charged residues that define a likely RNA-binding site; Found in stress proteins, ribosomal proteins and tRNA synthetases; This may imply a hitherto unrecognized functional similarity between these three protein classes." Q#15274 - CGI_10009853 superfamily 241628 1292 1326 9.18E-08 52.6327 cl00130 PseudoU_synth superfamily N - "Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi); Pseudouridine synthases contains the RsuA/RluD, TruA, TruB and TruD families. This group consists of eukaryotic, bacterial and archeal pseudouridine synthases. Some psi sites such as psi55,13,38 and 39 in tRNA are highly conserved, being in the same position in eubacteria, archeabacteria and eukaryotes. Other psi sites occur in a more restricted fashion, for example psi2604in 23S RNA made by E.coli RluF has only been detected in E.coli. Human dyskerin with the help of guide RNAs makes the hundreds of psueudouridnes present in rRNA and small nuclear RNAs (snRNAs). Mutations in human dyskerin cause X-linked dyskeratosis congenitas. Missense mutation in human PUS1 causes mitochondrial myopathy and sideroblastic anemia (MLASA)." Q#15274 - CGI_10009853 superfamily 209898 37 53 0.000909565 38.8607 cl14787 MORN superfamily C - MORN repeat; The MORN (Membrane Occupation and Recognition Nexus) repeat is found in multiple copies in several proteins including junctophilins (See Takeshima et al. Mol. Cell 2000;6:11-22). A MORN-repeat protein has been identified in the parasite Toxoplasma gondiis a dynamic component of cell division apparatus in Toxoplasma gondii. 
It has been hypothesised to function as a linker protein between certain membrane regions and the parasite's cytoskeleton. Q#15274 - CGI_10009853 superfamily 209898 226 248 0.00150415 38.1534 cl14787 MORN superfamily - - MORN repeat; The MORN (Membrane Occupation and Recognition Nexus) repeat is found in multiple copies in several proteins including junctophilins (See Takeshima et al. Mol. Cell 2000;6:11-22). A MORN-repeat protein has been identified in the parasite Toxoplasma gondii as a dynamic component of the cell division apparatus. It has been hypothesised to function as a linker protein between certain membrane regions and the parasite's cytoskeleton. Q#15275 - CGI_10009854 superfamily 201479 155 284 3.06E-30 118.115 cl02994 Transglut_N superfamily - - Transglutaminase family; Transglutaminase family. Q#15275 - CGI_10009854 superfamily 201479 975 1096 1.10E-26 108.1 cl02994 Transglut_N superfamily - - Transglutaminase family; Transglutaminase family. Q#15275 - CGI_10009854 superfamily 247916 1250 1345 4.88E-18 81.275 cl17362 Transglut_core superfamily - - "Transglutaminase-like superfamily; This family includes animal transglutaminases and other bacterial proteins of unknown function. Sequence conservation in this superfamily primarily involves three motifs that centre around conserved cysteine, histidine, and aspartate residues that form the catalytic triad in the structurally characterized transglutaminase, the human blood clotting factor XIIIa'. On the basis of the experimentally demonstrated activity of the Methanobacterium phage pseudomurein endoisopeptidase, it is proposed that many, if not all, microbial homologues of the transglutaminases are proteases and that the eukaryotic transglutaminases have evolved from an ancestral protease."
Q#15275 - CGI_10009854 superfamily 247916 433 527 2.83E-17 79.349 cl17362 Transglut_core superfamily - - "Transglutaminase-like superfamily; This family includes animal transglutaminases and other bacterial proteins of unknown function. Sequence conservation in this superfamily primarily involves three motifs that centre around conserved cysteine, histidine, and aspartate residues that form the catalytic triad in the structurally characterized transglutaminase, the human blood clotting factor XIIIa'. On the basis of the experimentally demonstrated activity of the Methanobacterium phage pseudomurein endoisopeptidase, it is proposed that many, if not all, microbial homologues of the transglutaminases are proteases and that the eukaryotic transglutaminases have evolved from an ancestral protease." Q#15275 - CGI_10009854 superfamily 216198 1592 1680 2.71E-08 53.4716 cl08295 Transglut_C superfamily - - "Transglutaminase family, C-terminal ig like domain; Transglutaminase family, C-terminal ig like domain. " Q#15275 - CGI_10009854 superfamily 216198 777 871 3.10E-06 47.3084 cl08295 Transglut_C superfamily - - "Transglutaminase family, C-terminal ig like domain; Transglutaminase family, C-terminal ig like domain. " Q#15275 - CGI_10009854 superfamily 247916 519 563 2.95E-05 43.9107 cl17362 Transglut_core superfamily N - "Transglutaminase-like superfamily; This family includes animal transglutaminases and other bacterial proteins of unknown function. Sequence conservation in this superfamily primarily involves three motifs that centre around conserved cysteine, histidine, and aspartate residues that form the catalytic triad in the structurally characterized transglutaminase, the human blood clotting factor XIIIa'. 
On the basis of the experimentally demonstrated activity of the Methanobacterium phage pseudomurein endoisopeptidase, it is proposed that many, if not all, microbial homologues of the transglutaminases are proteases and that the eukaryotic transglutaminases have evolved from an ancestral protease." Q#15277 - CGI_10009856 superfamily 247724 199 407 8.51E-35 128.676 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#15279 - CGI_10009858 superfamily 147626 77 106 0.000824751 36.0791 cl05227 DUF1519 superfamily N - Protein of unknown function (DUF1519); This family consists of several putative homing endonuclease proteins of around 245 residues in length which appear to be found exclusively in Naegleria species. The function of this family is unclear. 
Q#15280 - CGI_10009859 superfamily 241739 220 548 0 577.614 cl00268 class_II_aaRS-like_core superfamily - - "Class II tRNA amino-acyl synthetase-like catalytic core domain. Class II amino acyl-tRNA synthetases (aaRS) share a common fold and generally attach an amino acid to the 3' OH of ribose of the appropriate tRNA. PheRS is an exception in that it attaches the amino acid at the 2'-OH group, like class I aaRSs. These enzymes are usually homodimers. This domain is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. The substrate specificity of this reaction is further determined by additional domains. Interestingly, this domain is also found in asparagine synthase A (AsnA), in the accessory subunit of mitochondrial polymerase gamma and in the bacterial ATP phosphoribosyltransferase regulatory subunit HisZ." Q#15280 - CGI_10009859 superfamily 245205 108 217 2.58E-48 164.573 cl09930 RPA_2b-aaRSs_OBF_like superfamily - - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization.
The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." Q#15281 - CGI_10009860 superfamily 243092 52 155 9.45E-20 82.3828 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#15284 - CGI_10001064 superfamily 241594 258 517 2.80E-12 66.1747 cl00077 HECTc superfamily - - "HECT domain; C-terminal catalytic domain of a subclass of Ubiquitin-protein ligase (E3). It binds specific ubiquitin-conjugating enzymes (E2), accepts ubiquitin from E2, transfers ubiquitin to substrate lysine side chains, and transfers additional ubiquitin molecules to the end of growing ubiquitin chains." 
Q#15286 - CGI_10001464 superfamily 217316 111 185 0.00155685 37.2208 cl03832 DUF234 superfamily - - Archaea bacterial proteins of unknown function; Archaea bacterial proteins of unknown function. Q#15287 - CGI_10001465 superfamily 110440 394 420 0.00399928 35.0761 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#15288 - CGI_10001466 superfamily 245106 134 259 3.30E-68 209.423 cl09615 UBA_e1_C superfamily - - Ubiquitin-activating enzyme e1 C-terminal domain; This presumed domain found at the C-terminus of Ubiquitin-activating enzyme e1 proteins is functionally uncharacterized. Q#15288 - CGI_10001466 superfamily 202124 69 95 2.27E-05 40.9936 cl08340 UBACT superfamily C - Repeat in ubiquitin-activating (UBA) protein; Repeat in ubiquitin-activating (UBA) protein. Q#15290 - CGI_10018405 superfamily 216897 2 80 7.29E-22 86.9665 cl03463 Gal_Lectin superfamily - - Galactose binding lectin domain; Galactose binding lectin domain. Q#15290 - CGI_10018405 superfamily 248097 160 273 1.67E-15 70.3718 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#15292 - CGI_10018407 superfamily 241563 69 110 3.16E-06 44.7776 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction."
Q#15293 - CGI_10018408 superfamily 216897 81 159 3.50E-23 88.5072 cl03463 Gal_Lectin superfamily - - Galactose binding lectin domain; Galactose binding lectin domain. Q#15294 - CGI_10018409 superfamily 216897 2 80 1.87E-23 90.4332 cl03463 Gal_Lectin superfamily - - Galactose binding lectin domain; Galactose binding lectin domain. Q#15294 - CGI_10018409 superfamily 248097 95 210 7.08E-16 70.757 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#15295 - CGI_10018411 superfamily 248097 1 103 8.69E-12 56.8898 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#15296 - CGI_10018412 superfamily 216897 103 181 1.02E-24 93.5148 cl03463 Gal_Lectin superfamily - - Galactose binding lectin domain; Galactose binding lectin domain. Q#15299 - CGI_10018415 superfamily 221897 115 320 1.06E-44 158.317 cl16028 Maelstrom superfamily - - "piRNA pathway germ-plasm component; Maelstrom is a germ-plasm component protein, that is shown to be functionally involved in the piRNA pathway. It is conserved throughout Eukaryota, though it appears to have been lost from all examined teleost fish species. The domain architecture shows that it is coupled with several DNA- and RNA- related domains such as HMG box, SR-25-like and HDAC_interact domains. Sequence analysis and fold recognition have found a distant similarity between Maelstrom domain and the DnaQ 3'-5' exonuclease family with the RNase H fold (Exonuc_X-T, pfam00929); notably, that the Maelstrom domains from basal eukaryotes contain the conserved 3'-5' exonuclease active site residues (Asp-Glu-Asp-His-Asp, DEDHD). However, the animal and some amoeba maelstrom contain another set of conserved residues (Glu-His-His-Cys-His-Cys, EHHCHC). 
This evolutionary link together with structural examinations leads to the hypothesis that Maelstrom domains may have a potential nuclease-transposase activity or RNA-binding ability that may be implicated in piRNA biogenesis. A protein function evolution mode, namely "active site switch", has been proposed, in which the amoeba Maelstrom domains are the possible evolutionary intermediates due to their harbouring of the specific characteristics of both 3'-5' exonuclease and Maelstrom domains." Q#15299 - CGI_10018415 superfamily 241597 1 51 1.77E-05 43.1936 cl00082 HMG-box superfamily N - "High Mobility Group (HMG)-box is found in a variety of eukaryotic chromosomal proteins and transcription factors. HMGs bind to the minor groove of DNA and have been classified by DNA binding preferences. Two phylogenically distinct groups of Class I proteins bind DNA in a sequence specific fashion and contain a single HMG box. One group (SOX-TCF) includes transcription factors, TCF-1, -3, -4; and also SRY and LEF-1, which bind four-way DNA junctions and duplex DNA targets. The second group (MATA) includes fungal mating type gene products MC, MATA1 and Ste11. Class II and III proteins (HMGB-UBF) bind DNA in a non-sequence specific fashion and contain two or more tandem HMG boxes. Class II members include non-histone chromosomal proteins, HMG1 and HMG2, which bind to bent or distorted DNA such as four-way DNA junctions, synthetic DNA cruciforms, kinked cisplatin-modified DNA, DNA bulges, cross-overs in supercoiled DNA, and can cause looping of linear DNA. Class III members include nucleolar and mitochondrial transcription factors, UBF and mtTF1, which bind four-way DNA junctions." Q#15301 - CGI_10018417 superfamily 241597 10 80 5.51E-31 106.999 cl00082 HMG-box superfamily - - "High Mobility Group (HMG)-box is found in a variety of eukaryotic chromosomal proteins and transcription factors. HMGs bind to the minor groove of DNA and have been classified by DNA binding preferences. 
Two phylogenically distinct groups of Class I proteins bind DNA in a sequence specific fashion and contain a single HMG box. One group (SOX-TCF) includes transcription factors, TCF-1, -3, -4; and also SRY and LEF-1, which bind four-way DNA junctions and duplex DNA targets. The second group (MATA) includes fungal mating type gene products MC, MATA1 and Ste11. Class II and III proteins (HMGB-UBF) bind DNA in a non-sequence specific fashion and contain two or more tandem HMG boxes. Class II members include non-histone chromosomal proteins, HMG1 and HMG2, which bind to bent or distorted DNA such as four-way DNA junctions, synthetic DNA cruciforms, kinked cisplatin-modified DNA, DNA bulges, cross-overs in supercoiled DNA, and can cause looping of linear DNA. Class III members include nucleolar and mitochondrial transcription factors, UBF and mtTF1, which bind four-way DNA junctions." Q#15303 - CGI_10018419 superfamily 243082 65 359 2.89E-50 172.075 cl02553 Peptidase_C19 superfamily - - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." Q#15305 - CGI_10018421 superfamily 241868 295 384 1.66E-47 162.706 cl00447 Nudix_Hydrolase superfamily C - "Nudix hydrolase is a superfamily of enzymes found in all three kingdoms of life, and it catalyzes the hydrolysis of NUcleoside DIphosphates linked to other moieties, X.
Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+ for their activity. Members of this family are recognized by a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which forms a structural motif that functions as a metal binding and catalytic site. Substrates of nudix hydrolase include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance and "house-cleaning" enzymes. Substrate specificity is used to define child families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. This superfamily consists of at least nine families: IPP (isopentenyl diphosphate) isomerase, ADP ribose pyrophosphatase, mutT pyrophosphohydrolase, coenzyme-A pyrophosphatase, MTH1-7,8-dihydro-8-oxoguanine-triphosphatase, diadenosine tetraphosphate hydrolase, NADH pyrophosphatase, GDP-mannose hydrolase and the c-terminal portion of the mutY adenine glycosylase." Q#15305 - CGI_10018421 superfamily 245010 401 532 1.67E-22 93.4483 cl09111 Prefoldin superfamily - - "Prefoldin is a hexameric molecular chaperone complex, found in both eukaryotes and archaea, that binds and stabilizes newly synthesized polypeptides allowing them to fold correctly. The complex contains two alpha and four beta subunits, the two subunits being evolutionarily related. 
In archaea, there is usually only one gene for each subunit while in eukaryotes there are two or more paralogous genes encoding each subunit adding heterogeneity to the structure of the hexamer. The structure of the complex consists of a double beta barrel assembly with six protruding coiled-coils." Q#15305 - CGI_10018421 superfamily 243072 21 99 1.42E-15 73.5718 cl02529 ANK superfamily C - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#15305 - CGI_10018421 superfamily 204192 252 280 5.51E-05 41.0606 cl07801 zf-NADH-PPase superfamily - - NADH pyrophosphatase zinc ribbon domain; This domain is found in between two duplicated NUDIX domains. It has a zinc ribbon structure. Q#15305 - CGI_10018421 superfamily 220167 184 249 0.00122019 37.3861 cl07800 NUDIX-like superfamily N - "NADH pyrophosphatase-like rudimentary NUDIX domain; The N-terminal domain in NADH pyrophosphatase, which has a rudiment Nudix fold according to SCOP."
Q#15309 - CGI_10018425 superfamily 247684 13 387 0 556.153 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#15310 - CGI_10018426 superfamily 243035 1 112 1.98E-20 83.4381 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. 
Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer.
A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#15313 - CGI_10018429 superfamily 241578 11 169 3.64E-38 136.653 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#15313 - CGI_10018429 superfamily 241578 244 409 5.97E-38 136.262 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#15314 - CGI_10018430 superfamily 243051 274 420 1.02E-16 77.8033 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#15314 - CGI_10018430 superfamily 241578 11 176 4.64E-41 146.278 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. 
The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#15315 - CGI_10018431 superfamily 115363 208 266 2.56E-08 51.989 cl05972 MIB_HERC2 superfamily - - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). Q#15315 - CGI_10018431 superfamily 115363 766 823 5.46E-08 51.2186 cl05972 MIB_HERC2 superfamily - - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). Q#15315 - CGI_10018431 superfamily 241578 445 534 7.67E-08 51.9572 cl00057 vWFA superfamily C - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. 
There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#15315 - CGI_10018431 superfamily 115363 692 737 2.62E-06 46.211 cl05972 MIB_HERC2 superfamily C - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). Q#15319 - CGI_10020401 superfamily 245864 57 148 3.50E-09 53.051 cl12078 p450 superfamily C - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#15320 - CGI_10020402 superfamily 245864 11 247 1.59E-39 143.958 cl12078 p450 superfamily NC - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. 
They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#15321 - CGI_10020403 superfamily 247792 691 736 1.30E-08 52.448 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#15321 - CGI_10020403 superfamily 245201 390 586 1.16E-103 323.918 cl09925 PKc_like superfamily C - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. 
It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#15321 - CGI_10020403 superfamily 246908 12 107 4.31E-39 141.378 cl15255 SH2 superfamily - - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." Q#15321 - CGI_10020403 superfamily 246908 154 255 2.72E-24 99.2252 cl15255 SH2 superfamily - - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." Q#15323 - CGI_10020405 superfamily 241760 264 312 4.60E-22 90.8781 cl00295 ZZ superfamily - - "Zinc finger, ZZ type. 
Zinc finger present in dystrophin, CBP/p300 and many other proteins. The ZZ motif coordinates one or two zinc ions and most likely participates in ligand binding or molecular scaffolding. Many proteins containing ZZ motifs have other zinc-binding motifs as well, and the majority serve as scaffolds in pathways involving acetyltransferase, protein kinase, or ubiquitin-related activity. ZZ proteins can be grouped into the following functional classes: chromatin modifying, cytoskeletal scaffolding, ubiquitin binding or conjugating, and membrane receptor or ion-channel modifying proteins." Q#15323 - CGI_10020405 superfamily 149946 167 254 6.89E-28 108.896 cl07621 efhand_2 superfamily - - "EF-hand; Members of this family adopt a helix-loop-helix motif, as per other EF hand domains. However, since they do not contain the canonical pattern of calcium binding residues found in many EF hand domains, they do not bind calcium ions. The main function of this domain is the provision of specificity in beta-dystroglycan recognition, though in dystrophin it serves an additional role: stabilisation of the WW domain (pfam00397), enhancing dystroglycan binding." Q#15325 - CGI_10020407 superfamily 147569 9 118 5.36E-38 125.922 cl05167 eIF_4EBP superfamily - - "Eukaryotic translation initiation factor 4E binding protein (EIF4EBP); This family consists of several eukaryotic translation initiation factor 4E binding proteins (EIF4EBP1,2 and 3). Translation initiation in eukaryotes is mediated by the cap structure (m7GpppN, where N is any nucleotide) present at the 5' end of all cellular mRNAs, except organellar. The cap is recognised by eukaryotic initiation factor 4F (eIF4F), which consists of three polypeptides, including eIF4E, the cap-binding protein subunit. The interaction of the cap with eIF4E facilitates the binding of the ribosome to the mRNA.
eIF4E activity is regulated in part by translational repressors, 4E-BP1, 4E-BP2 and 4E-BP3 which bind to it and prevent its assembly into eIF4F." Q#15326 - CGI_10020408 superfamily 245818 14 371 1.21E-128 374.609 cl11966 Rel-Spo_like superfamily - - "RelA- and SpoT-like ppGpp Synthetases and Hydrolases, catalytic domain; The Rel-Spo superfamily includes the catalytic domains of Escherichia coli ppGpp synthetase (RelA), ppGpp synthetase/hydrolase (SpoT), and related proteins. RelA synthesizes (p)ppGpp in response to amino-acid starvation and in association with ribosomes. (p)ppGpp triggers the bacterial stringent response. SpoT catalyzes (p)ppGpp synthesis under carbon limitation in a ribosome-independent manner. It also catalyzes (p)ppGpp degradation. Gram-negative bacteria have two enzymes involved in (p)ppGpp metabolism while most Gram-positive organisms have a single Rel-Spo enzyme (Rel), which both synthesizes and degrades (p)ppGpp. The Arabidopsis thaliana Rel-Spo proteins, At-RSH1,-2, and-3 appear to regulate a rapid (p)ppGpp-mediated response to pathogens and other stresses. This catalytic domain is found in association with an N-terminal HD domain and a C-terminal metal dependent phosphohydrolase domain (TGS). Some Rel-Spo proteins also have a C-terminal regulatory ACT domain." Q#15327 - CGI_10020409 superfamily 241780 2 281 9.30E-105 321.704 cl00319 Gn_AT_II superfamily - - "Glutamine amidotransferases class-II (GATase). The glutaminase domain catalyzes an amide nitrogen transfer from glutamine to the appropriate substrate. In this process, glutamine is hydrolyzed to glutamic acid and ammonia. This domain is related to members of the Ntn (N-terminal nucleophile) hydrolase superfamily and is found at the N-terminus of enzymes such as glucosamine-fructose 6-phosphate synthase (GLMS or GFAT), glutamine phosphoribosylpyrophosphate (Prpp) amidotransferase (GPATase), asparagine synthetase B (AsnB), beta lactam synthetase (beta-LS) and glutamate synthase (GltS). 
GLMS catalyzes the formation of glucosamine 6-phosphate from fructose 6-phosphate and glutamine in amino sugar synthesis. GPATase catalyzes the first step in purine biosynthesis, an amide transfer from glutamine to PRPP, resulting in phosphoribosylamine, pyrophosphate and glutamate. Asparagine synthetase B synthesizes asparagine from aspartate and glutamine. Beta-LS catalyzes the formation of the beta-lactam ring in the beta-lactamase inhibitor clavulanic acid. GltS synthesizes L-glutamate from 2-oxoglutarate and L-glutamine. These enzymes are generally dimers, but GPATase also exists as a homotetramer." Q#15327 - CGI_10020409 superfamily 241833 365 490 1.65E-58 195.023 cl00389 SIS superfamily - - SIS domain. SIS (Sugar ISomerase) domains are found in many phosphosugar isomerases and phosphosugar binding proteins. SIS domains are also found in proteins that regulate the expression of genes involved in synthesis of phosphosugars. Q#15327 - CGI_10020409 superfamily 241833 527 664 3.46E-52 178.225 cl00389 SIS superfamily - - SIS domain. SIS (Sugar ISomerase) domains are found in many phosphosugar isomerases and phosphosugar binding proteins. SIS domains are also found in proteins that regulate the expression of genes involved in synthesis of phosphosugars. Q#15328 - CGI_10020410 superfamily 241782 19 482 7.00E-113 344.167 cl00321 AAT_I superfamily - - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. 
Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phosphorylase family (fold type V)." Q#15330 - CGI_10020412 superfamily 247724 30 345 3.25E-159 451.596 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization.
The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#15331 - CGI_10020413 superfamily 247684 9 182 2.02E-16 78.0143 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#15331 - CGI_10020413 superfamily 247684 402 576 1.22E-15 75.7031 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#15337 - CGI_10020419 superfamily 192109 3 49 8.33E-09 50.2741 cl07313 U3_assoc_6 superfamily N - U3 small nucleolar RNA-associated protein 6; This is a family of U3 nucleolar RNA-associated proteins which are involved in nucleolar processing of pre-18S ribosomal RNA. Q#15341 - CGI_10020424 superfamily 198867 117 217 4.13E-36 129.971 cl06652 BACK superfamily - - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. 
The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation)." Q#15341 - CGI_10020424 superfamily 243066 5 108 4.48E-31 116.563 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#15341 - CGI_10020424 superfamily 243146 446 492 3.32E-12 61.9086 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#15341 - CGI_10020424 superfamily 243146 494 540 3.58E-11 59.2122 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#15341 - CGI_10020424 superfamily 243146 350 395 1.88E-10 56.901 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#15341 - CGI_10020424 superfamily 243146 315 361 1.84E-06 45.6271 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. 
" Q#15341 - CGI_10020424 superfamily 243146 399 447 0.000861034 37.813 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#15342 - CGI_10020425 superfamily 247755 1675 1890 2.63E-95 309.05 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#15342 - CGI_10020425 superfamily 247755 790 1000 3.08E-80 266.293 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." 
Q#15343 - CGI_10020426 superfamily 246598 233 414 1.05E-67 217.52 cl13996 MPN superfamily - - "Mpr1p, Pad1p N-terminal (MPN) domains; MPN (also known as Mov34, PAD-1, JAMM, JAB, MPN+) domains are found in the N-terminal termini of proteins with a variety of functions; they are components of the proteasome regulatory subunits, the signalosome (CSN), eukaryotic translation initiation factor 3 (eIF3) complexes, and regulators of transcription factors. These domains are isopeptidases that release ubiquitin from ubiquitinated proteins (thus having deubiquitinating (DUB) activity) that are tagged for degradation. Catalytically active MPN domains contain a metalloprotease signature known as the JAB1/MPN/Mov34 metalloenzyme (JAMM) motif. For example, Rpn11 (also known as POH1 or PSMD14), a subunit of the 19S proteasome lid is involved in the ATP-dependent degradation of ubiquitinated proteins, contains the conserved JAMM motif involved in zinc ion coordination. Poh1 is a regulator of c-Jun, an important regulator of cell proliferation, differentiation, survival and death. JAB1 is a component of the COP9 signalosome (CSN), a regulatory particle of the ubiquitin (Ub)/26S proteasome system occurring in all eukaryotic cells; it cleaves the ubiquitin-like protein NEDD8 from the cullin subunit of the SCF (Skp1, Cullins, F-box proteins) family of E3 ubiquitin ligases. AMSH (associated molecule with the SH3 domain of STAM, also known as STAMBP), a member of JAMM/MPN+ deubiquitinases (DUBs), specifically cleaves Lys 63-linked polyubiquitin (poly-Ub) chains, thus facilitating the recycling and subsequent trafficking of receptors to the cell surface. Similarly, BRCC36, part of the nuclear complex that includes BRCA1 protein and is targeted to DNA damage foci after irradiation, specifically disassembles K63-linked polyUb. BRCC36 is aberrantly expressed in sporadic breast tumors, indicative of a potential role in the pathogenesis of the disease. 
Some variants of the JAB1/MPN domains lack key residues in their JAMM motif and are unable to coordinate a metal ion. Comparisons of key catalytic and metal binding residues explain why the MPN-containing proteins Mov34/PSMD7, Rpn8, CSN6, Prp8p, and the translation initiation factor 3 subunits f (p47) and h (p40) do not show catalytic isopeptidase activity. It has been proposed that the MPN domain in these proteins has a primarily structural function." Q#15343 - CGI_10020426 superfamily 241571 102 196 3.07E-05 42.3772 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#15344 - CGI_10020427 superfamily 243051 58 127 1.26E-08 49.2713 cl02479 MAM superfamily NC - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal cord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." 
Q#15345 - CGI_10020428 superfamily 215647 560 823 3.23E-70 234.81 cl18338 7tm_2 superfamily - - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GPCRs). They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#15345 - CGI_10020428 superfamily 216897 48 130 2.62E-24 98.9076 cl03463 Gal_Lectin superfamily - - Galactose binding lectin domain; Galactose binding lectin domain. Q#15345 - CGI_10020428 superfamily 221370 283 493 1.29E-20 92.4344 cl13441 DUF3497 superfamily - - "Domain of unknown function (DUF3497); This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 213 to 257 amino acids in length. This domain is found associated with pfam02793, pfam00002, pfam01825. This domain has a single completely conserved residue W that may be functionally important." Q#15345 - CGI_10020428 superfamily 243086 504 545 2.05E-14 69.7113 cl02559 GPS superfamily - - "Latrophilin/CL-1-like GPS domain; Domain present in latrophilin/CL-1, sea urchin REJ and polycystin." 
Q#15345 - CGI_10020428 superfamily 243029 211 270 7.40E-08 51.3612 cl02422 HRM superfamily - - Hormone receptor domain; This extracellular domain contains four conserved cysteines that probably form disulphide bridges. The domain is found in a variety of hormone receptors. It may be a ligand binding domain. Q#15345 - CGI_10020428 superfamily 243029 144 198 7.34E-05 42.1164 cl02422 HRM superfamily - - Hormone receptor domain; This extracellular domain contains four conserved cysteines that probably form disulphide bridges. The domain is found in a variety of hormone receptors. It may be a ligand binding domain. Q#15350 - CGI_10020433 superfamily 216411 127 214 0.000258874 38.7971 cl15974 MARVEL superfamily N - "Membrane-associating domain; MARVEL domain-containing proteins are often found in lipid-associating proteins - such as Occludin and MAL family proteins. It may be part of the machinery of membrane apposition events, such as transport vesicle biogenesis." Q#15351 - CGI_10020434 superfamily 242232 122 178 4.16E-15 69.9982 cl00984 TM2 superfamily C - "TM2 domain; This family is composed of a pair of transmembrane alpha helices connected by a short linker. The function of this domain is unknown, however it occurs in a wide range of protein contexts." Q#15351 - CGI_10020434 superfamily 242232 267 338 5.05E-11 58.4422 cl00984 TM2 superfamily - - "TM2 domain; This family is composed of a pair of transmembrane alpha helices connected by a short linker. The function of this domain is unknown, however it occurs in a wide range of protein contexts." Q#15351 - CGI_10020434 superfamily 242232 44 93 3.55E-09 52.5604 cl00984 TM2 superfamily - - "TM2 domain; This family is composed of a pair of transmembrane alpha helices connected by a short linker. The function of this domain is unknown, however it occurs in a wide range of protein contexts." 
Q#15351 - CGI_10020434 superfamily 242232 197 246 9.04E-09 52.279 cl00984 TM2 superfamily C - "TM2 domain; This family is composed of a pair of transmembrane alpha helices connected by a short linker. The function of this domain is unknown, however it occurs in a wide range of protein contexts." Q#15352 - CGI_10020435 superfamily 216411 55 93 0.00624306 33.4043 cl15974 MARVEL superfamily N - "Membrane-associating domain; MARVEL domain-containing proteins are often found in lipid-associating proteins - such as Occludin and MAL family proteins. It may be part of the machinery of membrane apposition events, such as transport vesicle biogenesis." Q#15353 - CGI_10020436 superfamily 241636 257 476 3.22E-108 325.696 cl00145 TBOX superfamily - - "T-box DNA binding domain of the T-box family of transcriptional regulators. The T-box family is an ancient group that appears to play a critical role in development in all animal species. These genes were uncovered on the basis of similarity to the DNA binding domain of murine Brachyury (T) gene product, the defining feature of the family. Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development and conserved expression patterns, most of the known genes in all species being expressed in mesoderm or mesoderm precursors." Q#15354 - CGI_10020437 superfamily 241636 88 282 5.57E-120 347.653 cl00145 TBOX superfamily - - "T-box DNA binding domain of the T-box family of transcriptional regulators. The T-box family is an ancient group that appears to play a critical role in development in all animal species. These genes were uncovered on the basis of similarity to the DNA binding domain of murine Brachyury (T) gene product, the defining feature of the family. 
Common features shared by T-box family members are DNA-binding and transcriptional regulatory activity, a role in development and conserved expression patterns, most of the known genes in all species being expressed in mesoderm or mesoderm precursors." Q#15355 - CGI_10020438 superfamily 247724 5 164 1.50E-87 258.636 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#15356 - CGI_10020439 superfamily 247725 126 223 1.59E-12 65.3165 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. 
They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#15356 - CGI_10020439 superfamily 247725 4 118 1.85E-09 56.3213 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#15358 - CGI_10020441 superfamily 219936 200 303 1.18E-29 111.188 cl18534 SPA superfamily - - Stabilisation of polarity axis; Yeast AFI1 (ARF3-interaction protein 1) has been shown to interact with the outer plaque of the spindle pole body. In Aspergillus nidulans the protein member is necessary for stabilisation of the polarity axes during septation. and in S. cerevisiae it functions as a polarisation-specific docking factor. Q#15358 - CGI_10020441 superfamily 219579 47 99 1.69E-10 56.8082 cl16001 Afi1 superfamily - - "Docking domain of Afi1 for Arf3 in vesicle trafficking; This domain occurs at the N-terminal of Afi1, a protein necessary for vesicle trafficking in yeast. This domain is the interacting region of the protein which binds to Arf3. Afi1 is distributed asymmetrically at the plasma membrane and is required for polarized distribution of Arf3 but not of an Arf3 guanine nucleotide-exchange factor, Yel1p. However, Afi1 is not required for targeting of Arf3 or Yel1p to the plasma membrane. 
Afi1 functions as an Arf3 polarization-specific adapter and participates in development of polarity. Although Arf3 is the homologue of human Arf6 it does not function in the same way, not being necessary for endocytosis or for mating factor receptor internalisation. In the S phase, however, it is concentrated at the plasma membrane of the emerging bud. Because of its polarized localisation and its critical function in the normal budding pattern of yeast, Arf3 is probably a regulator of vesicle trafficking, which is important for polarized growth." Q#15359 - CGI_10020442 superfamily 246722 20 190 2.82E-79 256.81 cl14812 PIN_SF superfamily - - "PIN (PilT N terminus) domain: Superfamily; PIN_SF The PIN (PilT N terminus) domain belongs to a large nuclease superfamily with representatives from eukaryota, eubacteria, and archaea. PIN domains were originally named for their sequence similarity to the N-terminal domain of an annotated pili biogenesis protein, PilT, a domain fusion between a PIN-domain and a PilT ATPase domain. The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues) is geometrically similar in the active center of structure-specific 5' nucleases (also known as Flap endonuclease-1-like), PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Seen here, are two major divisions in the PIN domain superfamily. The first major division, the structure-specific 5' nuclease family, is represented by FEN1, the 5'-3' exonuclease of DNA polymerase I, and T4 RNase H nuclease PIN domains. These 5' nucleases are involved in DNA replication, repair, and recombination. They are capable of both 5'-3' exonucleolytic activity and cleaving bifurcated DNA, in an endonucleolytic, structure-specific manner. 
Unique to FEN1-like nucleases, the PIN domain has a helical arch/clamp region (I domain) of variable length (approximately 16 to 800 residues) and, inserted within the C-terminal region of the PIN domain, a H3TH (helix-3-turn-helix) domain, an atypical helix-hairpin-helix-2-like region. Both the H3TH domain (not included here) and the helical arch/clamp region are involved in DNA binding. With the exception of Mkt1, these nucleases have a carboxylate rich active site that is involved in binding essential divalent metal ion cofactors (Mg2+, Mn2+, Zn2+, or Co2+). The second major division of the PIN domain superfamily, the VapC-Smg6 family, includes such eukaryotic ribonucleases as, Smg6, an essential factor in nonsense-mediated mRNA decay; Rrp44, the catalytic subunit of the exosome; and Nob1, a ribosome assembly factor critical in pre-rRNA processing. A large percentage of members in this family are bacterial ribonuclease toxins of TA operons such as Mycobacterium tuberculosis VapC and Neisseria gonorrhoeae FitB, as well as, archaeal homologs, Pyrobaculum aerophilum Pea0151 and P. aerophilum Pae2754. Also included are the eukaryotic Fcf1/ Utp24 (FAF1-copurifying factor 1/U three-associated protein 24) and Utp23-like proteins. Components of the small subunit processome, Fcf1/Utp24 and Utp23 are essential proteins involved in pre-rRNA processing and 40S ribosomal subunit assembly." Q#15359 - CGI_10020442 superfamily 216112 455 781 1.17E-96 309.227 cl02964 RNB superfamily - - RNB domain; This domain is the catalytic domain of ribonuclease II. Q#15361 - CGI_10020444 superfamily 243088 118 227 1.64E-55 179.142 cl02563 PX_domain superfamily - - "The Phox Homology domain, a phosphoinositide binding module; The PX domain is a phosphoinositide (PI) binding module involved in targeting proteins to membranes. 
Proteins containing PX domains interact with PIs and have been implicated in highly diverse functions such as cell signaling, vesicular trafficking, protein sorting, lipid modification, cell polarity and division, activation of T and B cells, and cell survival. Many members of this superfamily bind phosphatidylinositol-3-phosphate (PI3P) but in some cases, other PIs such as PI4P or PI(3,4)P2, among others, are the preferred substrates. In addition to protein-lipid interaction, the PX domain may also be involved in protein-protein interaction, as in the cases of p40phox, p47phox, and some sorting nexins (SNXs). The PX domain is conserved from yeast to humans and is found in more than 100 proteins. The majority of PX domain-containing proteins are SNXs, which play important roles in endosomal sorting." Q#15362 - CGI_10020445 superfamily 241571 28 132 6.83E-11 57.8074 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." 
Q#15362 - CGI_10020445 superfamily 241613 139 170 0.000492956 37.1862 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#15363 - CGI_10020446 superfamily 243058 256 370 1.01E-13 67.7247 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#15363 - CGI_10020446 superfamily 243058 338 453 5.73E-12 62.3319 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. 
An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#15363 - CGI_10020446 superfamily 243058 48 162 5.85E-12 62.3319 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#15363 - CGI_10020446 superfamily 243058 130 285 6.37E-08 50.3907 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. 
ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#15364 - CGI_10002339 superfamily 248139 624 1135 0 654.253 cl17585 RNA_pol_B_RPB2 superfamily N - "RNA polymerase beta subunit. RNA polymerases catalyse the DNA dependent polymerization of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial. and chloroplast polymerases). Each RNA polymerase complex contains two related members of this family, in each case they are the two largest subunits.The clamp is a mobile structure that grips DNA during elongation." Q#15364 - CGI_10002339 superfamily 248139 332 517 1.24E-52 198.561 cl17585 RNA_pol_B_RPB2 superfamily NC - "RNA polymerase beta subunit. RNA polymerases catalyse the DNA dependent polymerization of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial. and chloroplast polymerases). Each RNA polymerase complex contains two related members of this family, in each case they are the two largest subunits.The clamp is a mobile structure that grips DNA during elongation." Q#15364 - CGI_10002339 superfamily 248139 32 227 1.13E-45 176.99 cl17585 RNA_pol_B_RPB2 superfamily C - "RNA polymerase beta subunit. RNA polymerases catalyse the DNA dependent polymerization of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial. and chloroplast polymerases). Each RNA polymerase complex contains two related members of this family, in each case they are the two largest subunits.The clamp is a mobile structure that grips DNA during elongation." 
Q#15364 - CGI_10002339 superfamily 219216 562 620 2.39E-18 81.4734 cl06098 RNA_pol_Rpa2_4 superfamily - - "RNA polymerase I, Rpa2 specific domain; This domain is found between domain 3 (pfam04565) and domain 5 (pfam04565), but shows no homology to domain 4 of Rpb2. The external domains in multisubunit RNA polymerase (those most distant from the active site) are known to demonstrate more sequence variability." Q#15364 - CGI_10002339 superfamily 218151 182 357 4.41E-12 65.8225 cl15858 RNA_pol_Rpb2_2 superfamily - - "RNA polymerase Rpb2, domain 2; RNA polymerases catalyze the DNA dependent polymerisation of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial. and chloroplast polymerases). Rpb2 is the second largest subunit of the RNA polymerase. This domain forms one of the two distinctive lobes of the Rpb2 structure. This domain is also known as the lobe domain. DNA has been demonstrated to bind to the concave surface of the lobe domain, and plays a role in maintaining the transcription bubble. Many of the bacterial members contain large insertions within this domain, as region known as dispensable region 1 (DRI)." Q#15365 - CGI_10002340 superfamily 220809 15 246 6.88E-49 166.096 cl11190 UPF0565 superfamily - - Uncharacterized protein family UPF0565; This family of proteins has no known function. Q#15366 - CGI_10002341 superfamily 247755 352 546 1.15E-52 177.641 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. 
ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#15366 - CGI_10002341 superfamily 247755 224 298 2.19E-32 121.402 cl17201 ABC_ATPase superfamily N - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#15366 - CGI_10002341 superfamily 247755 80 132 1.61E-10 58.9992 cl17201 ABC_ATPase superfamily C - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#15366 - CGI_10002341 superfamily 221805 292 335 4.13E-10 56.8218 cl14896 ABC_tran_2 superfamily C - ABC transporter; This domain is related to pfam00005. 
Q#15366 - CGI_10002341 superfamily 247755 182 251 7.71E-06 45.9967 cl17201 ABC_ATPase superfamily NC - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#15370 - CGI_10024351 superfamily 202234 4 85 4.65E-21 83.8949 cl03577 Viral_Rep superfamily - - Putative viral replication protein; This is a family of viral ORFs from various plant and animal ssDNA circoviruses. Published evidence to support the annotated function "viral replication associated protein" has not been found. Q#15376 - CGI_10024357 superfamily 218222 367 767 0 611.039 cl04696 Pellino superfamily - - "Pellino; Pellino is involved in Toll-like signalling pathways, and associates with the kinase domain of the Pelle Ser/Thr kinase." Q#15376 - CGI_10024357 superfamily 242350 72 173 3.22E-15 73.8982 cl01182 SprT superfamily N - "SprT homologues; Predicted to have roles in transcription elongation. Contains a conserved HExxH motif, indicating a metalloprotease function." Q#15379 - CGI_10024360 superfamily 245201 39 143 1.65E-19 81.1289 cl09925 PKc_like superfamily C - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. 
These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#15380 - CGI_10024361 superfamily 216599 46 115 1.20E-23 93.4935 cl18372 B56 superfamily C - "Protein phosphatase 2A regulatory B subunit (B56 family); Protein phosphatase 2A (PP2A) is a major intracellular protein phosphatase that regulates multiple aspects of cell growth and metabolism. The ability of this widely distributed heterotrimeric enzyme to act on a diverse array of substrates is largely controlled by the nature of its regulatory B subunit. There are multiple families of B subunits (See also pfam01240), this family is called the B56 family." Q#15381 - CGI_10024362 superfamily 243421 17 136 1.15E-29 107.696 cl03428 MAS20 superfamily - - MAS20 protein import receptor; MAS20 protein import receptor. Q#15382 - CGI_10024363 superfamily 246748 233 431 3.64E-101 305.654 cl14876 Zinc_peptidase_like superfamily N - "Zinc peptidases M18, M20, M28, and M42; Zinc peptidases play vital roles in metabolic and signaling pathways throughout all kingdoms of life. This family corresponds to several clans in the MEROPS database, including the MH clan, which contains 4 families (M18, M20, M28, M42). The peptidase M20 family includes carboxypeptidases such as the glutamate carboxypeptidase from Pseudomonas, the thermostable carboxypeptidase Ss1 of broad specificity from archaea and yeast Gly-X carboxypeptidase. The dipeptidases include bacterial dipeptidase, peptidase V (PepV), a eukaryotic, non-specific dipeptidase, and two Xaa-His dipeptidases (carnosinases). There is also the bacterial aminopeptidase, peptidase T (PepT) that acts only on tripeptide substrates and has therefore been termed a tripeptidase. Peptidase family M28 contains aminopeptidases and carboxypeptidases, and has co-catalytic zinc ions. 
However, several enzymes in this family utilize other first row transition metal ions such as cobalt and manganese. Each zinc ion is tetrahedrally co-ordinated, with three amino acid ligands plus activated water; one aspartate residue binds both metal ions. The aminopeptidases in this family are also called bacterial leucyl aminopeptidases, but are able to release a variety of N-terminal amino acids. IAP aminopeptidase and aminopeptidase Y preferentially release basic amino acids while glutamate carboxypeptidase II preferentially releases C-terminal glutamates. Glutamate carboxypeptidase II and plasma glutamate carboxypeptidase hydrolyze dipeptides. Peptidase families M18 and M42 contain metalloaminopeptidases. M18 is widely distributed in bacteria and eukaryotes. However, only yeast aminopeptidase I and mammalian aspartyl aminopeptidase have been characterized in detail. Some of M42 (also known as glutamyl aminopeptidase) enzymes exhibit aminopeptidase specificity while others also have acylaminoacylpeptidase activity (i.e. hydrolysis of acylated N-terminal residues)." Q#15382 - CGI_10024363 superfamily 244870 95 221 6.04E-58 188.608 cl08238 PA superfamily - - "PA: Protease-associated (PA) domain. The PA domain is an insert domain in a diverse fraction of proteases. The significance of the PA domain to many of the proteins in which it is inserted is undetermined. It may be a protein-protein interaction domain. At peptidase active sites, the PA domain may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate. 
Proteins into which the PA domain is inserted include the following: i) various signal peptide peptidases including, hSPPL2a and 2b which catalyze the intramembrane proteolysis of tumor necrosis factor alpha, ii) various proteins containing a C3H2C3 RING finger including, Arabidopsis ReMembR-H2 protein and various E3 ubiquitin ligases such as human GRAIL (gene related to anergy in lymphocytes), iii) EDEM3 (ER-degradation-enhancing mannosidase-like 3 protein), iv) various plant vacuolar sorting receptors such as Pisum sativum BP-80, v) glutamate carboxypeptidase II (GCPII), vi) yeast aminopeptidase Y, vii) Vibrio metschnikovii VapT, a sodium dodecyl sulfate (SDS) resistant extracellular alkaline serine protease, viii) lactocepin (a cell envelope-associated protease from Lactobacillus paracasei subsp. paracasei NCDO 151), ix) various subtilisin-like proteases such as melon Cucumisin, and x) human TfR (transferrin receptor) 1 and 2." Q#15382 - CGI_10024363 superfamily 246748 12 117 6.16E-30 116.907 cl14876 Zinc_peptidase_like superfamily C - "Zinc peptidases M18, M20, M28, and M42; Zinc peptidases play vital roles in metabolic and signaling pathways throughout all kingdoms of life. This family corresponds to several clans in the MEROPS database, including the MH clan, which contains 4 families (M18, M20, M28, M42). The peptidase M20 family includes carboxypeptidases such as the glutamate carboxypeptidase from Pseudomonas, the thermostable carboxypeptidase Ss1 of broad specificity from archaea and yeast Gly-X carboxypeptidase. The dipeptidases include bacterial dipeptidase, peptidase V (PepV), a eukaryotic, non-specific dipeptidase, and two Xaa-His dipeptidases (carnosinases). There is also the bacterial aminopeptidase, peptidase T (PepT) that acts only on tripeptide substrates and has therefore been termed a tripeptidase. Peptidase family M28 contains aminopeptidases and carboxypeptidases, and has co-catalytic zinc ions. 
However, several enzymes in this family utilize other first row transition metal ions such as cobalt and manganese. Each zinc ion is tetrahedrally co-ordinated, with three amino acid ligands plus activated water; one aspartate residue binds both metal ions. The aminopeptidases in this family are also called bacterial leucyl aminopeptidases, but are able to release a variety of N-terminal amino acids. IAP aminopeptidase and aminopeptidase Y preferentially release basic amino acids while glutamate carboxypeptidase II preferentially releases C-terminal glutamates. Glutamate carboxypeptidase II and plasma glutamate carboxypeptidase hydrolyze dipeptides. Peptidase families M18 and M42 contain metalloaminopeptidases. M18 is widely distributed in bacteria and eukaryotes. However, only yeast aminopeptidase I and mammalian aspartyl aminopeptidase have been characterized in detail. Some of M42 (also known as glutamyl aminopeptidase) enzymes exhibit aminopeptidase specificity while others also have acylaminoacylpeptidase activity (i.e. hydrolysis of acylated N-terminal residues)." Q#15383 - CGI_10024364 superfamily 219188 78 363 8.03E-89 274.548 cl18498 Lung_7-TM_R superfamily - - Lung seven transmembrane receptor; This family represents a conserved region with eukaryotic lung seven transmembrane receptors and related proteins. Q#15384 - CGI_10024365 superfamily 244509 75 160 8.28E-12 60.2695 cl06793 PRKCSH superfamily - - "Glucosidase II beta subunit-like protein; The sequences found in this family are similar to a region found in the beta-subunit of glucosidase II, which is also known as protein kinase C substrate 80K-H (PRKCSH). The enzyme catalyzes the sequential removal of two alpha-1,3-linked glucose residues in the second step of N-linked oligosaccharide processing. The beta subunit is required for the solubility and stability of the heterodimeric enzyme, and is involved in retaining the enzyme within the endoplasmic reticulum. 
Mutations in the gene coding for PRKCSH have been found to be involved in the development of autosomal dominant polycystic liver disease (ADPLD), but the precise role the protein has in the pathogenesis of this disease is unknown. This family also includes an ER sensor for misfolded glycoproteins and is therefore likely to be a generic sugar binding domain." Q#15384 - CGI_10024365 superfamily 244509 247 304 1.91E-07 47.9431 cl06793 PRKCSH superfamily - - "Glucosidase II beta subunit-like protein; The sequences found in this family are similar to a region found in the beta-subunit of glucosidase II, which is also known as protein kinase C substrate 80K-H (PRKCSH). The enzyme catalyzes the sequential removal of two alpha-1,3-linked glucose residues in the second step of N-linked oligosaccharide processing. The beta subunit is required for the solubility and stability of the heterodimeric enzyme, and is involved in retaining the enzyme within the endoplasmic reticulum. Mutations in the gene coding for PRKCSH have been found to be involved in the development of autosomal dominant polycystic liver disease (ADPLD), but the precise role the protein has in the pathogenesis of this disease is unknown. This family also includes an ER sensor for misfolded glycoproteins and is therefore likely to be a generic sugar binding domain." Q#15385 - CGI_10024366 superfamily 247637 2 351 1.83E-175 494.599 cl16912 MDR superfamily - - "Medium chain reductase/dehydrogenase (MDR)/zinc-dependent alcohol dehydrogenase-like family; The medium chain reductase/dehydrogenases (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH. MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). 
The MDR proteins have 2 domains: a C-terminal NAD(P) binding-Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES. The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH), quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones. ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines. Other MDR members have only a catalytic zinc, and some contain no coordinated zinc." Q#15387 - CGI_10024368 superfamily 219001 36 216 1.06E-53 186.361 cl05720 Drf_GBD superfamily - - "Diaphanous GTPase-binding Domain; This domain is bound by GTP-attached Rho proteins, leading to activation of the Drf protein." Q#15387 - CGI_10024368 superfamily 219000 219 408 3.45E-53 185.543 cl05717 Drf_FH3 superfamily - - Diaphanous FH3 Domain; This region is found in the Formin-like and diaphanous proteins. 
Q#15388 - CGI_10024369 superfamily 245456 890 1166 1.25E-110 348.97 cl10970 AP_MHD_Cterm superfamily - - "C-terminal domain of adaptor protein (AP) complexes medium mu subunits and its homologs (MHD); This family corresponds to the C-terminal domain of heterotetrameric AP complexes medium mu subunits and its homologs existing in monomeric stonins, delta-subunit of the heteroheptameric coat protein I (delta-COPI), a protein encoded by a pro-death gene referred to as MuD (also known as MUDENG, mu-2 related death-inducing gene), an endocytic adaptor syp1, the mammalian FCH domain only proteins (FCHo1/2), SH3-containing GRB2-like protein 3-interacting protein 1 (SGIP1), and related proteins. AP complexes participate in the formation of intracellular coated transport vesicles and select cargo molecules for incorporation into the coated vesicles in the late secretory and endocytic pathways. Stonins have been characterized as clathrin-dependent AP-2 mu chain related factors and may act as cargo-specific sorting adaptors in endocytosis. Coat protein complex I (COPI)-coated vesicles function in the early secretory pathway. They mediate the retrograde transport from the Golgi to the ER, and intra-Golgi transport. MuD is distantly related to the C-terminal domain of mu2 subunit of AP-2. It is able to induce cell death by itself and plays an important role in cell death in various tissues. Syp1 represents a novel type of endocytic adaptor protein that participates in endocytosis, promotes vesicle tubulation, and contributes to cell polarity and stress responses. It shares the same domain architecture with its two ubiquitously expressed mammalian counterparts, FCHo1/2, which represent key initial proteins ultimately controlling cellular nutrient uptake, receptor regulation, and synaptic vesicle retrieval. 
They bind specifically to the plasma membrane and recruit the scaffold proteins eps15 and intersectin, which subsequently engage the adaptor complex AP2 and clathrin, leading to coated vesicle formation. Another mammalian neuronal-specific protein SGIP1 does have a C-terminal MHD and has been classified into this family as well. It is an endophilin-interacting protein that plays an obligatory role in the regulation of energy homeostasis. It is also involved in clathrin-mediated endocytosis by interacting with phospholipids and eps15." Q#15388 - CGI_10024369 superfamily 241754 5 306 9.72E-134 413.417 cl00286 Motor_domain superfamily - - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. Q#15389 - CGI_10024370 superfamily 247724 136 354 3.57E-124 368.156 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. 
Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#15389 - CGI_10024370 superfamily 243184 465 551 1.59E-34 125.818 cl02786 Translation_factor_III superfamily - - "Domain III of Elongation factor (EF) Tu (EF-TU) and EF-G. Elongation factors (EF) EF-Tu and EF-G participate in the elongation phase during protein biosynthesis on the ribosome. Their functional cycles depend on GTP binding and its hydrolysis. The EF-Tu complexed with GTP and aminoacyl-tRNA delivers tRNA to the ribosome, whereas EF-G stimulates translocation, a process in which tRNA and mRNA movements occur in the ribosome. Experimental data showed that: (1) intrinsic GTPase activity of EF-G is influenced by excision of its domain III; (2) that EF-G lacking domain III has a 1,000-fold decreased GTPase activity on the ribosome and, a slightly decreased affinity for GTP; and (3) EF-G lacking domain III does not stimulate translocation, despite the physical presence of domain IV which is also very important for translocation. These findings indicate an essential contribution of domain III to activation of GTP hydrolysis. Domains III and V of EF-G have the same fold (although they are not completely superimposable), the double split beta-alpha-beta fold. This fold is observed in a large number of ribonucleotide binding proteins and is also referred to as the ribonucleoprotein (RNP) or RNA recognition (RRM) motif. This domain III is found in several elongation factors, as well as in peptide chain release factors and in GT-1 family of GTPase (GTPBP1)." Q#15389 - CGI_10024370 superfamily 243185 370 456 1.13E-32 120.787 cl02787 Translation_Factor_II_like superfamily - - "Translation_Factor_II_like: Elongation factor Tu (EF-Tu) domain II-like proteins. Elongation factor Tu consists of three structural domains, this family represents the second domain. 
Domain II adopts a beta barrel structure and is involved in binding to charged tRNA. Domain II is found in other proteins such as elongation factor G and translation initiation factor IF-2. This group also includes the C2 subdomain of domain IV of IF-2 that has the same fold as domain II of (EF-Tu). Like IF-2 from certain prokaryotes such as Thermus thermophilus, mitochondrial IF-2 lacks domain II, which is thought to be involved in binding of E.coli IF-2 to 30S subunits." Q#15391 - CGI_10024372 superfamily 247856 140 163 0.00223842 36.102 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#15393 - CGI_10024374 superfamily 248279 49 137 2.06E-20 83.5699 cl17725 zf-HC5HC2H superfamily - - "PHD-like zinc-binding domain; The members of this family are annotated as containing PHD domain, but the zinc-binding region here is not typical of PHD domains. The conformation here is a well-conserved cysteine-histidine rich region spanning 90 residues, where the Cys and His are arranged as HxxC(31)CxxC(6)CxxCxxxxCxxxxHxxC (21)CxxH." Q#15394 - CGI_10024375 superfamily 241686 13 71 1.41E-12 59.5417 cl00207 HMA superfamily - - "Heavy-metal-associated domain (HMA) is a conserved domain of approximately 30 amino acid residues found in a number of proteins that transport or detoxify heavy metals, for example, the CPx-type heavy metal ATPases and copper chaperones. HMA domain contains two cysteine residues that are important in binding and transfer of metal ions, such as copper, cadmium, cobalt and zinc. In the case of copper, stoichiometry of binding is one Cu+ ion per binding domain. 
Repeats of the HMA domain in copper chaperone have been associated with Menkes/Wilson disease due to binding of multiple copper ions." Q#15394 - CGI_10024375 superfamily 242173 88 194 5.02E-23 90.389 cl00891 Cu-Zn_Superoxide_Dismutase superfamily - - "Copper/zinc superoxide dismutase (SOD). Superoxide dismutases catalyse the conversion of superoxide radicals to molecular oxygen. Three evolutionarily distinct families of SODs are known, of which the copper/zinc-binding family is one. Defects in the human SOD1 gene cause familial amyotrophic lateral sclerosis (Lou Gehrig's disease). Cytoplasmic and periplasmic SODs exist as dimers, whereas chloroplastic and extracellular enzymes exist as tetramers. Structure supports independent functional evolution in prokaryotes (P-class) and eukaryotes (E-class) [PMID: 8176730]." Q#15395 - CGI_10024376 superfamily 246925 174 362 9.72E-26 107.441 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#15398 - CGI_10024379 superfamily 204741 5 61 1.39E-10 51.994 cl13257 DUF3317 superfamily - - "Protein of unknown function (DUF3317); This is a short family of proteins conserved from fungi and plants to human. One each of the human and mouse members is annotated as being androgen down-regulated protein expressed in mouse prostate, with a potential signal transduction function, and all appear to be membrane proteins." 
Q#15399 - CGI_10024380 superfamily 247856 117 178 6.54E-15 66.0321 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#15399 - CGI_10024380 superfamily 247856 44 104 2.50E-08 47.5425 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#15400 - CGI_10024381 superfamily 247856 219 280 4.36E-06 43.6905 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#15400 - CGI_10024381 superfamily 247856 295 353 1.29E-05 42.1497 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#15400 - CGI_10024381 superfamily 247856 46 87 5.32E-05 40.6089 cl17302 EFh superfamily C - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#15401 - CGI_10024382 superfamily 243183 825 902 6.87E-27 106.073 cl02785 Elongation_Factor_C superfamily - - "Elongation factor G C-terminus. This domain includes the carboxyl terminal regions of elongation factors (EFs) bacterial EF-G, eukaryotic and archeal EF-2 and eukaryotic mitochondrial mtEFG1s and mtEFG2s. This group also includes proteins similar to the ribosomal protection proteins Tet(M) and Tet(O), BipA, LepA and, spliceosomal proteins: human 116kD U5 small nuclear ribonucleoprotein (snRNP) protein (U5-116 kD) and yeast counterpart Snu114p. This domain adopts a ferredoxin-like fold consisting of an alpha-beta sandwich with anti-parallel beta-sheets, resembling the topology of domain III found in the elongation factors EF-G and eukaryotic EF-2, with which it forms the C-terminal block. The two domains however are not superimposable and domain III lacks some of the characteristics of this domain. EF-2/EF-G in complex with GTP, promotes the translocation step of translation. During translocation the peptidyl-tRNA is moved from the A site to the P site, the uncharged tRNA from the P site to the E-site and, the mRNA is shifted one codon relative to the ribosome. Tet(M) and Tet(O) mediate Tc resistance. Typical Tcs bind to the ribosome and inhibit the elongation phase of protein synthesis, by inhibiting the occupation of site A by aminoacyl-tRNA. Tet(M) and Tet(O) catalyze the release of tetracycline (Tc) from the ribosome in a GTP-dependent manner. 
BipA is a highly conserved protein with global regulatory properties in Escherichia coli. Yeast Snu114p is essential for cell viability and for splicing in vivo. Experiments suggest that GTP binding and probably GTP hydrolysis is important for the function of the U5-116 kD/Snu114p. The function of LepA proteins is unknown." Q#15401 - CGI_10024382 superfamily 247792 37 69 0.000783488 38.5808 cl17238 RING superfamily N - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#15401 - CGI_10024382 superfamily 241677 378 444 7.19E-25 102.72 cl00197 cyclophilin superfamily C - "cyclophilin: cyclophilin-type peptidylprolyl cis- trans isomerases. This family contains eukaryotic, bacterial and archeal proteins which exhibit a peptidylprolyl cis- trans isomerases activity (PPIase, Rotamase) and in addition bind the immunosuppressive drug cyclosporin (CsA). Immunosuppression in vertebrates is believed to be the result of the cyclophilin A-cyclosporin protein drug complex binding to and inhibiting the protein-phosphatase calcineurin. PPIase is an enzyme which accelerates protein folding by catalyzing the cis-trans isomerization of the peptide bonds preceding proline residues. Cyclophilins are a diverse family in terms of function and have been implicated in protein folding processes which depend on catalytic /chaperone-like activities. 
This group contains human cyclophilin 40, a co-chaperone of the hsp90 chaperone system; human cyclophilin A, a chaperone in the HIV-1 infectious process and; human cyclophilin H, a component of the U4/U6 snRNP, whose isomerization or chaperoning activities may play a role in RNA splicing." Q#15403 - CGI_10024384 superfamily 241664 48 162 2.38E-40 138.487 cl00182 Mth938-like superfamily - - "Mth938-like domain. The members of this family include: Mth938, 2P1, Xcr35, Rpa2829, and several uncharacterized sequences. Mth938 is a hypothetical protein encoded by the Methanobacterium thermoautotrophicum (Mth) genome. This protein crystallizes as a dimer, although it is monomeric in solution, with one disulfide bond in each monomer. 2P1 is a partially characterized nuclear protein which is homologous to E3-3 from rat and known to be alternately spliced. Xcr35 and Rpa2829 are hypothetical proteins of unknown function from the Xanthomonas campestris and Rhodopseudomonas palustris genomes, respectively, for which the crystal structures have been determined." Q#15405 - CGI_10024386 superfamily 241610 1665 1717 1.15E-19 86.535 cl00101 KU superfamily - - BPTI/Kunitz family of serine protease inhibitors; Structure is a disulfide rich alpha+beta fold. BPTI (bovine pancreatic trypsin inhibitor) is an extensively studied model structure. Q#15405 - CGI_10024386 superfamily 241610 1288 1341 1.97E-18 82.683 cl00101 KU superfamily - - BPTI/Kunitz family of serine protease inhibitors; Structure is a disulfide rich alpha+beta fold. BPTI (bovine pancreatic trypsin inhibitor) is an extensively studied model structure. Q#15405 - CGI_10024386 superfamily 241610 2315 2368 3.47E-16 76.5198 cl00101 KU superfamily - - BPTI/Kunitz family of serine protease inhibitors; Structure is a disulfide rich alpha+beta fold. BPTI (bovine pancreatic trypsin inhibitor) is an extensively studied model structure. 
Q#15405 - CGI_10024386 superfamily 241610 1730 1782 4.42E-16 76.1346 cl00101 KU superfamily - - BPTI/Kunitz family of serine protease inhibitors; Structure is a disulfide rich alpha+beta fold. BPTI (bovine pancreatic trypsin inhibitor) is an extensively studied model structure. Q#15405 - CGI_10024386 superfamily 241610 1919 1972 6.29E-16 75.7494 cl00101 KU superfamily - - BPTI/Kunitz family of serine protease inhibitors; Structure is a disulfide rich alpha+beta fold. BPTI (bovine pancreatic trypsin inhibitor) is an extensively studied model structure. Q#15405 - CGI_10024386 superfamily 241610 1123 1175 7.56E-16 75.3642 cl00101 KU superfamily - - BPTI/Kunitz family of serine protease inhibitors; Structure is a disulfide rich alpha+beta fold. BPTI (bovine pancreatic trypsin inhibitor) is an extensively studied model structure. Q#15405 - CGI_10024386 superfamily 241610 1865 1915 3.47E-15 73.4382 cl00101 KU superfamily - - BPTI/Kunitz family of serine protease inhibitors; Structure is a disulfide rich alpha+beta fold. BPTI (bovine pancreatic trypsin inhibitor) is an extensively studied model structure. Q#15405 - CGI_10024386 superfamily 241610 1542 1594 1.16E-14 71.8974 cl00101 KU superfamily - - BPTI/Kunitz family of serine protease inhibitors; Structure is a disulfide rich alpha+beta fold. BPTI (bovine pancreatic trypsin inhibitor) is an extensively studied model structure. Q#15405 - CGI_10024386 superfamily 241584 2094 2190 1.62E-13 69.8327 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. 
FN3-like domains are also found in bacterial glycosyl hydrolases." Q#15405 - CGI_10024386 superfamily 241610 1224 1279 4.04E-13 67.6602 cl00101 KU superfamily - - BPTI/Kunitz family of serine protease inhibitors; Structure is a disulfide rich alpha+beta fold. BPTI (bovine pancreatic trypsin inhibitor) is an extensively studied model structure. Q#15405 - CGI_10024386 superfamily 241610 2206 2256 3.87E-12 64.5786 cl00101 KU superfamily - - BPTI/Kunitz family of serine protease inhibitors; Structure is a disulfide rich alpha+beta fold. BPTI (bovine pancreatic trypsin inhibitor) is an extensively studied model structure. Q#15405 - CGI_10024386 superfamily 241610 1604 1644 9.76E-12 63.423 cl00101 KU superfamily C - BPTI/Kunitz family of serine protease inhibitors; Structure is a disulfide rich alpha+beta fold. BPTI (bovine pancreatic trypsin inhibitor) is an extensively studied model structure. Q#15405 - CGI_10024386 superfamily 241584 1347 1434 4.14E-11 62.5139 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." 
Q#15405 - CGI_10024386 superfamily 241613 1976 2008 6.34E-06 46.0458 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#15405 - CGI_10024386 superfamily 241613 2018 2051 1.34E-05 45.2754 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#15405 - CGI_10024386 superfamily 241613 2057 2090 3.27E-05 44.1198 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and 
transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#15405 - CGI_10024386 superfamily 245814 2621 2694 8.56E-13 67.5304 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#15405 - CGI_10024386 superfamily 245814 2524 2592 2.87E-10 59.8264 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#15405 - CGI_10024386 superfamily 245814 2457 2514 1.13E-06 49.229 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#15405 - CGI_10024386 superfamily 204025 2704 2737 0.0020949 38.7717 cl07344 PLAC superfamily - - PLAC (protease and lacunin) domain; The PLAC (protease and lacunin) domain is a short six-cysteine region that is usually found at the C terminal of proteins. It is found in a range of proteins including PACE4 (paired basic amino acid cleaving enzyme 4) and the extracellular matrix protein lacunin. Q#15406 - CGI_10024387 superfamily 245864 63 498 1.29E-117 357.359 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. 
These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#15408 - CGI_10024389 superfamily 241607 461 512 1.54E-15 71.9517 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chymotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters), which may regulate the specificity of anion uptake. 
The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#15408 - CGI_10024389 superfamily 220617 108 147 0.00946021 35.8877 cl10871 DUF2371 superfamily N - Uncharacterized conserved protein (DUF2371); This is a family of proteins conserved from nematodes to humans. The function is not known. Q#15409 - CGI_10024390 superfamily 247725 11 123 2.79E-43 149.003 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting proteins to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and other proteins." Q#15409 - CGI_10024390 superfamily 147416 318 413 5.65E-39 137.14 cl04988 Sprouty superfamily - - "Sprouty protein (Spry); This family consists of eukaryotic Sprouty protein homologues. Sprouty proteins have been revealed as inhibitors of the Ras/mitogen-activated protein kinase (MAPK) cascade, a pathway crucial for developmental processes initiated by activation of various receptor tyrosine kinases. The sprouty gene has been found to be expressed in the brain, cochlea, nasal organs, teeth, salivary gland, lungs, digestive tract, kidneys and limb buds in mice." Q#15411 - CGI_10024392 superfamily 247805 519 670 1.19E-18 84.6964 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 
Q#15411 - CGI_10024392 superfamily 247905 856 989 7.47E-12 64.1812 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#15411 - CGI_10024392 superfamily 221155 1069 1192 7.83E-17 78.9488 cl13152 RIG-I_C-RD superfamily - - "C-terminal domain of RIG-I; This family of proteins represents the regulatory domain RD of RIG-I, a protein which initiates a signalling cascade that provides essential antiviral protection for the host. The RD domain binds viral RNA, activating the RIG-I ATPase by RNA-dependant dimerisation. The structure of RD contains a zinc-binding domain and is thought to confer ligand specificity." Q#15411 - CGI_10024392 superfamily 213148 731 862 3.35E-12 65.0258 cl17041 helicase_insert_domain superfamily - - "helical domain inserted in SF2-type helicase domain in Hef-, MDA5- and FancM-like proteins; This helical domain can be found inserted in a subset of SF2-type DEAD-box related helicases, like archaeal Hef helicase, MDA5-like helicases and FancM-like helicases. The exact function of this domain is unknown, but seems to play a role in interaction with nucleotides and/or the stabilization of the nucleotide complex." 
Q#15411 - CGI_10024392 superfamily 246680 168 252 4.22E-08 52.3038 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#15411 - CGI_10024392 superfamily 246680 73 155 1.60E-05 44.5998 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#15412 - CGI_10024393 superfamily 247805 105 260 1.10E-19 86.6224 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. 
This domain contains the ATP-binding region. Q#15412 - CGI_10024393 superfamily 247905 323 457 1.50E-14 71.1148 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#15412 - CGI_10024393 superfamily 243035 565 649 3.00E-05 42.9922 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#15417 - CGI_10024398 superfamily 248458 98 478 8.96E-34 130.126 cl17904 MFS superfamily - - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#15418 - CGI_10024399 superfamily 216686 109 258 3.68E-29 112.031 cl18377 Galactosyl_T superfamily - - "Galactosyltransferase; This family includes the galactosyltransferases UDP-galactose:2-acetamido-2-deoxy-D-glucose3beta-galactosyltransferase and UDP-Gal:beta-GlcNAc beta 1,3-galactosyltranferase. 
Specific galactosyltransferases transfer galactose to GlcNAc terminal chains in the synthesis of the lacto-series oligosaccharides types 1 and 2." Q#15419 - CGI_10024400 superfamily 202668 441 544 1.43E-22 95.0386 cl04110 BK_channel_a superfamily - - Calcium-activated BK potassium channel alpha subunit; Calcium-activated BK potassium channel alpha subunit. Q#15419 - CGI_10024400 superfamily 219619 215 288 3.67E-11 61.4547 cl18518 Ion_trans_2 superfamily - - Ion channel; This family includes the two membrane helix type ion channels found in bacteria. Q#15420 - CGI_10024401 superfamily 186602 79 318 7.41E-84 260.717 cl03849 PSS superfamily - - Phosphatidyl serine synthase; Phosphatidyl serine synthase is also known as serine exchange enzyme. This family represents eukaryotic PSS I and II, which are membrane-bound proteins that catalyze the replacement of the head group of a phospholipid (phosphatidylcholine or phosphatidylethanolamine) by L-serine. Q#15422 - CGI_10024403 superfamily 241733 6 81 3.19E-24 88.0866 cl00259 Sm_like superfamily - - "Sm and related proteins; The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes." Q#15424 - CGI_10024405 superfamily 241680 39 246 1.27E-60 193.624 cl00200 MIP superfamily - - "Major intrinsic protein (MIP) superfamily. Members of the MIP superfamily function as membrane channels that selectively transport water, small neutral molecules, and ions out of and between cells. 
The channel proteins share a common fold: the N-terminal cytosolic portion followed by six transmembrane helices, which might have arisen through gene duplication. On the basis of sequence similarity and functional characteristics, the superfamily can be subdivided into two major groups: water-selective channels called aquaporins (AQPs) and glycerol uptake facilitators (GlpFs). AQPs are found in all three kingdoms of life, while GlpFs have been characterized only within microorganisms." Q#15427 - CGI_10001558 superfamily 244824 2 282 9.79E-59 198.737 cl07893 AmyAc_family superfamily N - "Alpha amylase catalytic domain family; The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; and C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost this catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase." 
Q#15428 - CGI_10001559 superfamily 241574 94 309 5.68E-84 260.981 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#15430 - CGI_10002463 superfamily 248264 119 170 6.70E-08 48.0022 cl17710 DDE_4 superfamily C - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#15431 - CGI_10010363 superfamily 243093 396 447 0.00470643 35.969 cl02568 WSC superfamily N - WSC domain; This domain may be involved in carbohydrate binding. Q#15436 - CGI_10010368 superfamily 245201 17 228 1.74E-64 213.638 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. 
These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#15436 - CGI_10010368 superfamily 247694 542 636 8.91E-29 110.782 cl17070 AMPKA_C_like superfamily - - "C-terminal regulatory domain of 5'-AMP-activated protein kinase (AMPK) alpha subunit and similar domains; This family is composed of AMPKs, microtubule-associated protein/microtubule affinity regulating kinases (MARKs), yeast Kcc4p-like proteins, plant calcineurin B-Like (CBL)-interacting protein kinases (CIPKs), and similar proteins. They are serine/threonine protein kinases (STKs) that catalyze the transfer of the gamma-phosphoryl group from ATP to S/T residues on protein substrates. AMPKs act as sensors for the energy status of the cell and are activated by cellular stresses that lead to ATP depletion such as hypoxia, heat shock, and glucose deprivation, among others. MARKs phosphorylate the tau protein and related microtubule-associated proteins (MAPs) on tubulin binding sites to induce detachment from microtubules, and are involved in the regulation of cell shape and polarity, cell cycle control, transport, and the cytoskeleton. Kcc4p and related proteins are septin-associated proteins that are involved in septin organization and in the yeast morphogenesis checkpoint coordinating the cell cycle with bud formation. CIPKs interact with the calcineurin B-like (CBL) calcium sensors to form a signaling network that decode specific calcium signals triggered by a variety of environmental stimuli including salinity, drought, cold, light, and mechanical perturbation, among others. All members of this family contain an N-terminal catalytic kinase domain and a C-terminal regulatory domain which is also called kinase associated domain 1 (KA1) in some cases. The C-terminal regulatory domain serves as a protein interaction domain in AMPKs and CIPKs. 
In MARKs and Kcc4p-like proteins, this domain binds phospholipids and may be involved in membrane localization." Q#15445 - CGI_10003259 superfamily 243072 52 201 2.17E-07 49.6894 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#15450 - CGI_10020797 superfamily 247683 302 354 2.07E-11 58.4803 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." 
Q#15451 - CGI_10020798 superfamily 245596 24 239 3.87E-106 308.742 cl11394 Glyco_tranf_GTA_type superfamily - - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#15452 - CGI_10020799 superfamily 245596 131 346 1.34E-109 321.839 cl11394 Glyco_tranf_GTA_type superfamily - - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. 
This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#15453 - CGI_10020800 superfamily 216363 185 280 1.55E-15 70.1918 cl08312 UPF0029 superfamily - - Uncharacterized protein family UPF0029; Uncharacterized protein family UPF0029. Q#15455 - CGI_10020803 superfamily 243072 20 133 4.59E-21 87.8242 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#15456 - CGI_10020804 superfamily 204080 169 274 1.29E-07 50.3485 cl18252 BAAT_C superfamily C - BAAT / Acyl-CoA thioester hydrolase C terminal; This catalytic domain is found at the C terminal of acyl-CoA thioester hydrolases and bile acid-CoA:amino acid N-acetyltransferases (BAAT). Q#15457 - CGI_10020805 superfamily 241566 29 78 9.04E-15 70.2135 cl00040 C1 superfamily - - "Protein kinase C conserved region 1 (C1) . Cysteine-rich zinc binding domain. Some members of this domain family bind phorbol esters and diacylglycerol, some are reported to bind RasGTP. May occur in tandem arrangement. Diacylglycerol (DAG) is a second messenger, released by activation of Phospholipase D. 
Phorbol Esters (PE) can act as analogues of DAG and mimic its downstream effects in, for example, tumor promotion. Protein Kinases C are activated by DAG/PE, this activation is mediated by their N-terminal conserved region (C1). DAG/PE binding may be phospholipid dependent. C1 domains may also mediate DAG/PE signals in chimaerins (a family of Rac GTPase activating proteins), RasGRPs (exchange factors for Ras/Rap1), and Munc13 isoforms (scaffolding proteins involved in exocytosis)." Q#15457 - CGI_10020805 superfamily 241566 161 211 8.93E-11 58.6576 cl00040 C1 superfamily - - "Protein kinase C conserved region 1 (C1) . Cysteine-rich zinc binding domain. Some members of this domain family bind phorbol esters and diacylglycerol, some are reported to bind RasGTP. May occur in tandem arrangement. Diacylglycerol (DAG) is a second messenger, released by activation of Phospholipase D. Phorbol Esters (PE) can act as analogues of DAG and mimic its downstream effects in, for example, tumor promotion. Protein Kinases C are activated by DAG/PE, this activation is mediated by their N-terminal conserved region (C1). DAG/PE binding may be phospholipid dependent. C1 domains may also mediate DAG/PE signals in chimaerins (a family of Rac GTPase activating proteins), RasGRPs (exchange factors for Ras/Rap1), and Munc13 isoforms (scaffolding proteins involved in exocytosis)." Q#15457 - CGI_10020805 superfamily 241566 92 139 2.37E-05 42.8644 cl00040 C1 superfamily - - "Protein kinase C conserved region 1 (C1) . Cysteine-rich zinc binding domain. Some members of this domain family bind phorbol esters and diacylglycerol, some are reported to bind RasGTP. May occur in tandem arrangement. Diacylglycerol (DAG) is a second messenger, released by activation of Phospholipase D. Phorbol Esters (PE) can act as analogues of DAG and mimic its downstream effects in, for example, tumor promotion. 
Protein Kinases C are activated by DAG/PE, this activation is mediated by their N-terminal conserved region (C1). DAG/PE binding may be phospholipid dependent. C1 domains may also mediate DAG/PE signals in chimaerins (a family of Rac GTPase activating proteins), RasGRPs (exchange factors for Ras/Rap1), and Munc13 isoforms (scaffolding proteins involved in exocytosis)." Q#15457 - CGI_10020805 superfamily 243037 638 777 1.18E-62 208.195 cl02440 DAGK_acc superfamily - - Diacylglycerol kinase accessory domain; Diacylglycerol (DAG) is a second messenger that acts as a protein kinase C activator. This domain is assumed to be an accessory domain: its function is unknown. Q#15457 - CGI_10020805 superfamily 248019 487 612 3.02E-46 161.31 cl17465 DAGK_cat superfamily - - "Diacylglycerol kinase catalytic domain; Diacylglycerol (DAG) is a second messenger that acts as a protein kinase C activator. The catalytic domain is assumed from the finding of bacterial homologues. YegS is the Escherichia coli protein in this family whose crystal structure reveals an active site in the inter-domain cleft formed by four conserved sequence motifs, revealing a novel metal-binding site. The residues of this site are conserved across the family." Q#15457 - CGI_10020805 superfamily 241645 350 446 5.74E-29 112.166 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. 
The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#15458 - CGI_10020806 superfamily 247723 131 212 1.20E-50 170.92 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#15458 - CGI_10020806 superfamily 247723 51 128 1.63E-46 159.262 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. 
RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#15458 - CGI_10020806 superfamily 247723 226 301 7.21E-33 121.578 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#15461 - CGI_10020809 superfamily 248097 140 266 4.99E-15 68.831 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#15462 - CGI_10020810 superfamily 152088 126 206 3.99E-12 59.1387 cl13155 DUF3259 superfamily - - Protein of unknown function (DUF3259); This eukaryotic family of proteins has no known function. 
Q#15463 - CGI_10020811 superfamily 152088 132 212 1.77E-20 82.2507 cl13155 DUF3259 superfamily - - Protein of unknown function (DUF3259); This eukaryotic family of proteins has no known function. Q#15466 - CGI_10020814 superfamily 247794 6 280 1.86E-54 181.824 cl17240 FDH_GDH_like superfamily - - "Formate/glycerate dehydrogenases, D-specific 2-hydroxy acid dehydrogenases and related dehydrogenases; The formate/glycerate dehydrogenase like family contains a diverse group of enzymes such as formate dehydrogenase (FDH), glycerate dehydrogenase (GDH), D-lactate dehydrogenase, L-alanine dehydrogenase, and S-Adenosylhomocysteine hydrolase, that share a common 2-domain structure. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar domains of the alpha/beta Rossmann fold NAD+ binding form. The NAD(P) binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD(P) is bound, primarily to the C-terminal portion of the 2nd (internal) domain. While many members of this family are dimeric, alanine DH is hexameric and phosphoglycerate DH is tetrameric. 2-hydroxyacid dehydrogenases are enzymes that catalyze the conversion of a wide variety of D-2-hydroxy acids to their corresponding keto acids. The general mechanism is (R)-lactate + acceptor to pyruvate + reduced acceptor. Formate dehydrogenase (FDH) catalyzes the NAD+-dependent oxidation of formate ion to carbon dioxide with the concomitant reduction of NAD+ to NADH. FDHs of this family contain no metal ions or prosthetic groups. Catalysis occurs though direct transfer of a hydride ion to NAD+ without the stages of acid-base catalysis typically found in related dehydrogenases." 
Q#15467 - CGI_10020815 superfamily 248097 51 161 6.44E-23 88.8614 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#15468 - CGI_10006595 superfamily 247807 165 182 0.00169094 36.119 cl17253 AAA_17 superfamily C - AAA domain; AAA domain. Q#15471 - CGI_10006599 superfamily 206083 56 78 0.00198666 37.9728 cl16471 zf-C2H2_6 superfamily - - C2H2-type zinc finger; C2H2-type zinc finger. Q#15478 - CGI_10002870 superfamily 221643 89 235 7.11E-21 85.5339 cl13947 DUF3752 superfamily - - "Protein of unknown function (DUF3752); This domain family is found in eukaryotes, and is typically between 140 and 163 amino acids in length." Q#15483 - CGI_10003964 superfamily 220389 2 160 6.67E-71 216.112 cl10747 DUF2053 superfamily - - Predicted membrane protein (DUF2053); This entry is of the conserved N-terminal 150 residues of proteins conserved from plants to humans. The function is unknown although some annotation suggests it to be a transmembrane protein. Q#15485 - CGI_10003250 superfamily 243074 144 189 5.85E-06 42.4937 cl02535 F-box-like superfamily - - F-box-like; This is an F-box-like family. 
Q#15487 - CGI_10003252 superfamily 247792 12 64 0.000323162 37.04 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#15489 - CGI_10003254 superfamily 247792 16 68 8.28E-07 45.8996 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#15489 - CGI_10003254 superfamily 241563 100 141 0.00731844 34.6203 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#15490 - CGI_10003104 superfamily 241563 88 127 6.43E-05 40.9256 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#15491 - CGI_10003105 superfamily 241563 171 207 0.00102253 37.4588 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#15492 - CGI_10003107 superfamily 207662 103 174 7.62E-38 135.382 cl02596 NR_DBD_like superfamily - - "DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers; DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. It interacts with a specific DNA site upstream of the target gene and modulates the rate of transcriptional initiation. Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions, from development, reproduction, to homeostasis and metabolism in animals (metazoans). The family contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). Most nuclear receptors bind as homodimers or heterodimers to their target sites, which consist of two hexameric half-sites. 
Specificity is determined by the half-site sequence, the relative orientation of the half-sites and the number of spacer nucleotides between the half-sites. However, a growing number of nuclear receptors have been reported to bind to DNA as monomers." Q#15492 - CGI_10003107 superfamily 245599 518 639 2.46E-34 129.265 cl11397 NR_LBD superfamily C - "The ligand binding domain of nuclear receptors, a family of ligand-activated transcription regulators; Ligand-binding domain (LBD) of nuclear receptor (NR): Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions in metazoans, from development, reproduction, to homeostasis and metabolism. The superfamily contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. The members of the family include receptors of steroids, thyroid hormone, retinoids, cholesterol by-products, lipids and heme. With few exceptions, NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD)." Q#15494 - CGI_10008916 superfamily 247805 39 147 2.60E-11 56.962 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#15495 - CGI_10008917 superfamily 245847 7 147 0.0010165 38.6376 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." 
Q#15496 - CGI_10008918 superfamily 241574 10 59 1.21E-08 49.1213 cl00053 PTPc superfamily N - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#15497 - CGI_10008919 superfamily 243066 41 139 2.28E-13 64.9461 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." 
Q#15499 - CGI_10008921 superfamily 243092 160 384 0.00577263 36.9292 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#15500 - CGI_10008922 superfamily 243092 288 431 0.00402451 38.47 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#15501 - CGI_10008923 superfamily 217316 138 195 0.00857908 34.9096 cl03832 DUF234 superfamily C - Archaea bacterial proteins of unknown function; Archaea bacterial proteins of unknown function. Q#15502 - CGI_10008924 superfamily 247750 135 292 1.17E-75 241.425 cl17196 E1_enzyme_family superfamily N - "Superfamily of activating enzymes (E1) of the ubiquitin-like proteins. This family includes classical ubiquitin-activating enzymes E1, ubiquitin-like (ubl) activating enzymes and other mechanistic homologes, like MoeB, Thif1 and others. 
The common reaction mechanism catalyzed by MoeB, ThiF and the E1 enzymes begins with a nucleophilic attack of the C-terminal carboxylate of MoaD, ThiS and ubiquitin, respectively, on the alpha-phosphate of an ATP molecule bound at the active site of the activating enzymes, leading to the formation of a high-energy acyladenylate intermediate and subsequently to the formation of a thiocarboxylate at the C termini of MoaD and ThiS." Q#15502 - CGI_10008924 superfamily 247750 1 67 5.00E-35 132.413 cl17196 E1_enzyme_family superfamily NC - "Superfamily of activating enzymes (E1) of the ubiquitin-like proteins. This family includes classical ubiquitin-activating enzymes E1, ubiquitin-like (ubl) activating enzymes and other mechanistic homologes, like MoeB, Thif1 and others. The common reaction mechanism catalyzed by MoeB, ThiF and the E1 enzymes begins with a nucleophilic attack of the C-terminal carboxylate of MoaD, ThiS and ubiquitin, respectively, on the alpha-phosphate of an ATP molecule bound at the active site of the activating enzymes, leading to the formation of a high-energy acyladenylate intermediate and subsequently to the formation of a thiocarboxylate at the C termini of MoaD and ThiS." Q#15502 - CGI_10008924 superfamily 202124 40 102 1.42E-11 59.098 cl08340 UBACT superfamily - - Repeat in ubiquitin-activating (UBA) protein; Repeat in ubiquitin-activating (UBA) protein. Q#15503 - CGI_10008925 superfamily 241563 62 103 0.00404099 35.5328 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#15504 - CGI_10008926 superfamily 241983 36 78 0.000219258 38.1078 cl00614 ADP_ribosyl_GH superfamily N - "ADP-ribosylglycohydrolase; This family includes enzymes that ADP-ribosylations, for example ADP-ribosylarginine hydrolase EC:3.2.2.19 cleaves ADP-ribose-L-arginine. The family also includes dinitrogenase reductase activating glycohydrolase. Most surprisingly the family also includes jellyfish crystallins, these proteins appear to have lost the presumed active site residues." Q#15506 - CGI_10008928 superfamily 245201 17 107 1.45E-21 87.1901 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#15507 - CGI_10008929 superfamily 245201 168 428 0 529.585 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." 
Q#15507 - CGI_10008929 superfamily 246908 52 153 5.55E-52 171.996 cl15255 SH2 superfamily - - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." Q#15507 - CGI_10008929 superfamily 247683 1 44 6.17E-08 49.118 cl17036 SH3 superfamily N - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#15509 - CGI_10011333 superfamily 243069 3 46 0.000104979 37.1267 cl02525 Band_7 superfamily N - "The band 7 domain of flotillin (reggie) like proteins. 
This group contains proteins similar to stomatin, prohibitin, flotillin, HlfK/C and podicin. Many of these band 7 domain-containing proteins are lipid raft-associated. Individual proteins of this band 7 domain family may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. Microdomains formed from flotillin proteins may in addition be dynamic units with their own regulatory functions. Flotillins have been implicated in signal transduction, vesicle trafficking, cytoskeleton rearrangement and are known to interact with a variety of proteins. Stomatin interacts with and regulates members of the degenerin/epithelia Na+ channel family in mechanosensory cells of Caenorhabditis elegans and vertebrate neurons and participates in trafficking of Glut1 glucose transporters. Prohibitin may act as a chaperone for the stabilization of mitochondrial proteins. Prokaryotic HflK/C plays a role in the decision between lysogenic and lytic cycle growth during lambda phage infection. Flotillins have been implicated in the progression of prion disease, in the pathogenesis of neurodegenerative diseases such as Parkinson's and Alzheimer's disease and, in cancer invasion and metastasis. Mutations in the podicin gene give rise to autosomal recessive steroid resistant nephritic syndrome" Q#15510 - CGI_10011334 superfamily 243119 2 22 0.00148722 33.9566 cl02629 CBM_14 superfamily N - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. 
Q#15511 - CGI_10011335 superfamily 222429 38 116 8.24E-10 53.3984 cl18676 Myb_DNA-bind_5 superfamily - - Myb/SANT-like DNA-binding domain; This presumed domain appears to be related to other Myb/SANT like DNA binding domains. This family is greatly expanded in arthropods and higher eukaryotes. Q#15512 - CGI_10011336 superfamily 217062 208 275 5.65E-09 54.5827 cl12266 Branch superfamily N - "Core-2/I-Branching enzyme; This is a family of two different beta-1,6-N-acetylglucosaminyltransferase enzymes, I-branching enzyme and core-2 branching enzyme . I-branching enzyme is responsible for the production of the blood group I-antigen during embryonic development. Core-2 branching enzyme forms crucial side-chain branches in O-glycans." Q#15513 - CGI_10011337 superfamily 221636 12 59 1.18E-18 75.6882 cl13929 MOZART1 superfamily - - "Mitotic-spindle organizing gamma-tubulin ring associated; The name MOZART is derived from letters of 'mitotic-spindle organizing proteins associated with a ring of gamma-tubulin'. This family operates as part of the gamma-tubulin ring complex, gamma-TuRC, one of the complexes necessary for chromosome segregation. This complex is located at centrosomes and mediates the formation of bipolar spindles in mitosis; it consists of six subunits. However, unlike the other four known subunits, this family does not carry the conserved 'Spc97-Spc98' GCP domain, so the TUBCGP nomenclature cannot be used for it. MOZART1 is required for gamma-TuRC recruitment to centrosomes." Q#15516 - CGI_10005647 superfamily 246918 118 161 1.08E-06 44.1147 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#15519 - CGI_10005650 superfamily 247724 29 193 1.46E-108 312.572 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. 
The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#15521 - CGI_10005652 superfamily 219565 86 400 1.21E-26 110.273 cl06690 DUF1619 superfamily - - Protein of unknown function (DUF1619); This is a family of sequences derived from hypothetical eukaryotic proteins. The region in question is approximately 330 residues long and has a cysteine rich amino-terminus. Q#15525 - CGI_10017434 superfamily 247725 121 153 0.00935034 32.6761 cl17171 PH-like superfamily N - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. 
This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#15526 - CGI_10017435 superfamily 217293 15 113 3.18E-05 42.6199 cl03788 Neur_chan_LBD superfamily N - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#15526 - CGI_10017435 superfamily 202474 135 197 0.0061411 35.7073 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#15527 - CGI_10017436 superfamily 248097 3 122 9.64E-20 79.2314 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#15529 - CGI_10017438 superfamily 245814 351 429 0.00038592 39.4109 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#15529 - CGI_10017438 superfamily 245814 270 321 0.00715487 35.4604 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#15531 - CGI_10017440 superfamily 241832 191 336 5.19E-47 157.783 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#15533 - CGI_10017442 superfamily 198738 242 325 5.72E-46 154.016 cl02599 Ets superfamily - - Ets-domain; Ets-domain. Q#15533 - CGI_10017442 superfamily 247057 122 210 5.99E-43 146.344 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. 
Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#15533 - CGI_10017442 superfamily 152056 1 81 3.51E-20 83.6904 cl13126 GABP-alpha superfamily - - "GA-binding protein alpha chain; This family of proteins represents the transcription factor GABP alpha. This alpha domain is a five-stranded beta-sheet crossed by a distorted helix termed an OST domain. The surface of the GABP alpha OST domain contains two clusters of negatively-charged residues suggesting there are positively-charged partner proteins. The OST domain binds to the CH1 and CH3 domains of the co-activator histone acetyltransferase CBP/p300, a direct link between GABP and transcriptional machinery has been made." Q#15534 - CGI_10004631 superfamily 248372 60 167 2.75E-18 76.8544 cl17818 Nuc_deoxyrib_tr superfamily - - Nucleoside 2-deoxyribosyltransferase; Nucleoside 2-deoxyribosyltransferase EC:2.4.2.6 catalyzes the cleavage of the glycosidic bonds of 2'-deoxyribonucleosides. 
Q#15535 - CGI_10004632 superfamily 243092 131 159 0.00133632 35.0204 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#15536 - CGI_10004633 superfamily 245205 131 212 0.00110277 36.8321 cl09930 RPA_2b-aaRSs_OBF_like superfamily - - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. 
This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." Q#15538 - CGI_10003780 superfamily 241750 226 563 1.60E-67 229.888 cl00281 metallo-dependent_hydrolases superfamily N - "Superfamily of metallo-dependent hydrolases (also called amidohydrolase superfamily) is a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. The family includes urease alpha, adenosine deaminase, phosphotriesterase dihydroorotases, allantoinases, hydantoinases, AMP-, adenine and cytosine deaminases, imidazolonepropionase, aryldialkylphosphatase, chlorohydrolases, formylmethanofuran dehydrogenases and others." Q#15539 - CGI_10004018 superfamily 247727 111 190 5.17E-06 43.5727 cl17173 AdoMet_MTases superfamily C - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). 
There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#15540 - CGI_10004019 superfamily 245847 25 93 1.38E-05 40.6178 cl12042 FA58C superfamily C - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#15541 - CGI_10004020 superfamily 218913 260 410 1.16E-06 48.8981 cl18486 Trehalose_recp superfamily N - "Trehalose receptor; In Drosophila, taste is perceived by gustatory neurons located in sensilla distributed on several different appendages throughout the body of the animal. This family represents the taste receptor sensitive to trehalose." Q#15542 - CGI_10004021 superfamily 247866 10 210 2.89E-05 43.5952 cl17312 PhyH superfamily - - "Phytanoyl-CoA dioxygenase (PhyH); This family is made up of several eukaryotic phytanoyl-CoA dioxygenase (PhyH) proteins, ectoine hydroxylases and a number of bacterial deoxygenases. PhyH is a peroxisomal enzyme catalyzing the first step of phytanic acid alpha-oxidation. PhyH deficiency causes Refsum's disease (RD) which is an inherited neurological syndrome biochemically characterized by the accumulation of phytanic acid in plasma and tissues." Q#15544 - CGI_10004023 superfamily 248054 10 83 6.26E-09 51.704 cl17500 NAD_binding_8 superfamily - - NAD(P)-binding Rossmann-like domain; NAD(P)-binding Rossmann-like domain. 
Q#15545 - CGI_10003781 superfamily 245226 7 197 2.11E-114 331.485 cl10012 DnaQ_like_exo superfamily - - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#15545 - CGI_10003781 superfamily 207658 226 306 4.69E-16 71.1742 cl02578 HRDC superfamily - - HRDC domain; The HRDC (Helicase and RNase D C-terminal) domain has a putative role in nucleic acid binding. Mutations in the HRDC domain cause human disease. It is interesting to note that the RecQ helicase in Deinococcus radiodurans has three tandem HRDC domains. Q#15550 - CGI_10013537 superfamily 241913 33 143 3.38E-23 89.9245 cl00509 hot_dog superfamily - - "The hotdog fold was initially identified in the E. 
coli FabA (beta-hydroxydecanoyl-acyl carrier protein (ACP)-dehydratase) structure and subsequently in 4HBT (4-hydroxybenzoyl-CoA thioesterase) from Pseudomonas. A number of other seemingly unrelated proteins also share the hotdog fold. These proteins have related, but distinct, catalytic activities that include metabolic roles such as thioester hydrolysis in fatty acid metabolism, and degradation of phenylacetic acid and the environmental pollutant 4-chlorobenzoate. This superfamily also includes the PaaI-like protein FapR, a non-catalytic bacterial homolog involved in transcriptional regulation of fatty acid biosynthesis." Q#15551 - CGI_10013538 superfamily 241640 214 353 1.04E-48 165.527 cl00149 Tryp_SPc superfamily C - Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues. Q#15551 - CGI_10013538 superfamily 241571 45 155 6.20E-33 119.439 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." 
Q#15551 - CGI_10013538 superfamily 241613 162 196 3.67E-08 49.1274 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#15552 - CGI_10013539 superfamily 241640 537 751 1.01E-77 251.427 cl00149 Tryp_SPc superfamily - - Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues. Q#15552 - CGI_10013539 superfamily 241571 405 515 7.20E-32 120.595 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#15552 - CGI_10013539 superfamily 241571 86 199 1.13E-31 120.21 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#15552 - CGI_10013539 superfamily 241571 270 382 1.99E-29 113.661 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." 
Q#15552 - CGI_10013539 superfamily 241913 28 77 2.55E-06 46.397 cl00509 hot_dog superfamily N - "The hotdog fold was initially identified in the E. coli FabA (beta-hydroxydecanoyl-acyl carrier protein (ACP)-dehydratase) structure and subsequently in 4HBT (4-hydroxybenzoyl-CoA thioesterase) from Pseudomonas. A number of other seemingly unrelated proteins also share the hotdog fold. These proteins have related, but distinct, catalytic activities that include metabolic roles such as thioester hydrolysis in fatty acid metabolism, and degradation of phenylacetic acid and the environmental pollutant 4-chlorobenzoate. This superfamily also includes the PaaI-like protein FapR, a non-catalytic bacterial homolog involved in transcriptional regulation of fatty acid biosynthesis." Q#15553 - CGI_10013540 superfamily 241640 239 474 1.11E-92 284.169 cl00149 Tryp_SPc superfamily - - Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues. 
Q#15553 - CGI_10013540 superfamily 241613 117 152 1.90E-09 53.7498 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#15553 - CGI_10013540 superfamily 243061 173 218 1.29E-12 64.2854 cl02509 SRCR superfamily N - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#15553 - CGI_10013540 superfamily 243061 16 105 4.40E-09 54.0098 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#15554 - CGI_10013541 superfamily 220695 66 256 9.94E-08 52.1959 cl18571 7TM_GPCR_Srx superfamily C - Serpentine type 7TM GPCR chemoreceptor Srx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srx is part of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. 
Q#15555 - CGI_10013542 superfamily 243072 3 123 1.16E-16 75.883 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#15555 - CGI_10013542 superfamily 243072 104 168 0.00621408 35.4371 cl02529 ANK superfamily C - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#15557 - CGI_10013544 superfamily 241644 1064 1212 1.25E-34 131.941 cl00154 UBCc superfamily - - "Ubiquitin-conjugating enzyme E2, catalytic (UBCc) domain. This is part of the ubiquitin-mediated protein degradation pathway in which a thiol-ester linkage forms between a conserved cysteine and the C-terminus of ubiquitin and complexes with ubiquitin protein ligase enzymes, E3. This pathway regulates many fundamental cellular processes. There are also other E2s which form thiol-ester linkages without the use of E3s as well as several UBC homologs (TSG101, Mms2, Croc-1 and similar proteins) which lack the active site cysteine essential for ubiquitination and appear to function in DNA repair pathways which were omitted from the scope of this CD." 
Q#15557 - CGI_10013544 superfamily 226749 1705 1901 1.15E-18 88.7832 cl18777 COG4299 superfamily N - Uncharacterized protein conserved in bacteria [Function unknown] Q#15557 - CGI_10013544 superfamily 226749 1549 1625 1.35E-08 57.582 cl18777 COG4299 superfamily C - Uncharacterized protein conserved in bacteria [Function unknown] Q#15558 - CGI_10013545 superfamily 245206 19 265 1.27E-110 322.999 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#15559 - CGI_10013546 superfamily 220607 23 98 2.56E-21 81.2988 cl10858 DDDD superfamily - - Putative mitochondrial precursor protein; This is a family of small conserved proteins found from nematodes to humans. The C-terminal region is rich in asparagine. Members are putatively assigned to be mitochondrial precursor proteins but this could not be confirmed. Q#15560 - CGI_10013547 superfamily 241631 734 919 2.86E-63 216.705 cl00136 Sec7 superfamily - - Sec7 domain; Domain named after the S. cerevisiae SEC7 gene product. The Sec7 domain is the central domain of the guanine-nucleotide-exchange factors (GEFs) of the ADP-ribosylation factor family of small GTPases (ARFs) . 
It carries the exchange factor activity. Q#15560 - CGI_10013547 superfamily 243072 1835 1962 9.29E-22 94.7578 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#15560 - CGI_10013547 superfamily 245201 2060 2218 6.26E-14 73.0397 cl09925 PKc_like superfamily C - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#15561 - CGI_10013548 superfamily 243062 235 334 1.90E-35 125.851 cl02510 TGF_beta superfamily - - Transforming growth factor beta like domain; Transforming growth factor beta like domain. Q#15564 - CGI_10023394 superfamily 241584 565 641 9.90E-08 51.3431 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." 
Q#15564 - CGI_10023394 superfamily 222150 52 77 3.30E-07 48.5421 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#15564 - CGI_10023394 superfamily 247724 713 815 2.11E-06 49.4664 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#15564 - CGI_10023394 superfamily 222150 136 160 3.77E-06 45.4605 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#15564 - CGI_10023394 superfamily 222150 108 133 2.23E-05 43.1493 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#15564 - CGI_10023394 superfamily 222150 83 105 2.55E-05 43.1493 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. 
Q#15564 - CGI_10023394 superfamily 242669 457 535 0.000312143 42.8345 cl01728 DUF2232 superfamily N - "Predicted membrane protein (DUF2232); This domain, found in various hypothetical bacterial proteins, has no known function." Q#15565 - CGI_10023395 superfamily 247724 1 159 1.37E-111 322.831 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#15566 - CGI_10023396 superfamily 222005 464 515 9.25E-07 48.8876 cl18632 AAA_19 superfamily C - Part of AAA domain; Part of AAA domain. Q#15566 - CGI_10023396 superfamily 221913 1025 1054 0.00520866 39.0607 cl18626 AAA_12 superfamily C - AAA domain; This family of domains contain a P-loop motif that is characteristic of the AAA superfamily. Many of the proteins in this family are conjugative transfer proteins. 
Q#15568 - CGI_10023398 superfamily 248458 176 598 2.60E-42 156.32 cl17904 MFS superfamily - - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#15569 - CGI_10023399 superfamily 245213 253 286 1.93E-05 41.4682 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#15570 - CGI_10023400 superfamily 216290 157 275 5.65E-08 51.135 cl03089 Cu2_monooxygen superfamily - - "Copper type II ascorbate-dependent monooxygenase, N-terminal domain; The N and C-terminal domains of members of this family adopt the same PNGase F-like fold." Q#15570 - CGI_10023400 superfamily 217685 289 435 1.28E-06 47.3288 cl04225 Cu2_monoox_C superfamily - - "Copper type II ascorbate-dependent monooxygenase, C-terminal domain; The N and C-terminal domains of members of this family adopt the same PNGase F-like fold." Q#15572 - CGI_10023402 superfamily 247725 9 105 2.53E-33 122.388 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#15572 - CGI_10023402 superfamily 207724 176 233 2.65E-15 71.1075 cl02772 BSD superfamily - - BSD domain; This domain contains a distinctive -FW- motif. It is found in a family of eukaryotic transcription factors as well as a set of proteins of unknown function. 
Q#15572 - CGI_10023402 superfamily 207724 109 152 5.53E-06 44.1435 cl02772 BSD superfamily - - BSD domain; This domain contains a distinctive -FW- motif. It is found in a family of eukaryotic transcription factors as well as a set of proteins of unknown function. Q#15573 - CGI_10023403 superfamily 245213 42 72 0.0049667 35.305 cl09941 EGF_CA superfamily N - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#15573 - CGI_10023403 superfamily 248012 394 534 0.00012284 41.1548 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. 
Q#15575 - CGI_10023405 superfamily 247792 103 143 1.08E-09 55.9148 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#15575 - CGI_10023405 superfamily 247999 174 228 1.33E-05 44.0184 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#15576 - CGI_10023406 superfamily 246748 87 277 9.82E-122 361.139 cl14876 Zinc_peptidase_like superfamily NC - "Zinc peptidases M18, M20, M28, and M42; Zinc peptidases play vital roles in metabolic and signaling pathways throughout all kingdoms of life. This family corresponds to several clans in the MEROPS database, including the MH clan, which contains 4 families (M18, M20, M28, M42). The peptidase M20 family includes carboxypeptidases such as the glutamate carboxypeptidase from Pseudomonas, the thermostable carboxypeptidase Ss1 of broad specificity from archaea and yeast Gly-X carboxypeptidase. The dipeptidases include bacterial dipeptidase, peptidase V (PepV), a eukaryotic, non-specific dipeptidase, and two Xaa-His dipeptidases (carnosinases). There is also the bacterial aminopeptidase, peptidase T (PepT) that acts only on tripeptide substrates and has therefore been termed a tripeptidase. 
Peptidase family M28 contains aminopeptidases and carboxypeptidases, and has co-catalytic zinc ions. However, several enzymes in this family utilize other first row transition metal ions such as cobalt and manganese. Each zinc ion is tetrahedrally co-ordinated, with three amino acid ligands plus activated water; one aspartate residue binds both metal ions. The aminopeptidases in this family are also called bacterial leucyl aminopeptidases, but are able to release a variety of N-terminal amino acids. IAP aminopeptidase and aminopeptidase Y preferentially release basic amino acids while glutamate carboxypeptidase II preferentially releases C-terminal glutamates. Glutamate carboxypeptidase II and plasma glutamate carboxypeptidase hydrolyze dipeptides. Peptidase families M18 and M42 contain metalloaminopeptidases. M18 is widely distributed in bacteria and eukaryotes. However, only yeast aminopeptidase I and mammalian aspartyl aminopeptidase have been characterized in detail. Some of M42 (also known as glutamyl aminopeptidase) enzymes exhibit aminopeptidase specificity while others also have acylaminoacylpeptidase activity (i.e. hydrolysis of acylated N-terminal residues)." Q#15576 - CGI_10023406 superfamily 246748 272 329 1.15E-32 125.782 cl14876 Zinc_peptidase_like superfamily N - "Zinc peptidases M18, M20, M28, and M42; Zinc peptidases play vital roles in metabolic and signaling pathways throughout all kingdoms of life. This family corresponds to several clans in the MEROPS database, including the MH clan, which contains 4 families (M18, M20, M28, M42). The peptidase M20 family includes carboxypeptidases such as the glutamate carboxypeptidase from Pseudomonas, the thermostable carboxypeptidase Ss1 of broad specificity from archaea and yeast Gly-X carboxypeptidase. The dipeptidases include bacterial dipeptidase, peptidase V (PepV), a eukaryotic, non-specific dipeptidase, and two Xaa-His dipeptidases (carnosinases). 
There is also the bacterial aminopeptidase, peptidase T (PepT) that acts only on tripeptide substrates and has therefore been termed a tripeptidase. Peptidase family M28 contains aminopeptidases and carboxypeptidases, and has co-catalytic zinc ions. However, several enzymes in this family utilize other first row transition metal ions such as cobalt and manganese. Each zinc ion is tetrahedrally co-ordinated, with three amino acid ligands plus activated water; one aspartate residue binds both metal ions. The aminopeptidases in this family are also called bacterial leucyl aminopeptidases, but are able to release a variety of N-terminal amino acids. IAP aminopeptidase and aminopeptidase Y preferentially release basic amino acids while glutamate carboxypeptidase II preferentially releases C-terminal glutamates. Glutamate carboxypeptidase II and plasma glutamate carboxypeptidase hydrolyze dipeptides. Peptidase families M18 and M42 contain metalloaminopeptidases. M18 is widely distributed in bacteria and eukaryotes. However, only yeast aminopeptidase I and mammalian aspartyl aminopeptidase have been characterized in detail. Some of M42 (also known as glutamyl aminopeptidase) enzymes exhibit aminopeptidase specificity while others also have acylaminoacylpeptidase activity (i.e. hydrolysis of acylated N-terminal residues)." Q#15576 - CGI_10023406 superfamily 246748 46 90 7.73E-14 70.6981 cl14876 Zinc_peptidase_like superfamily C - "Zinc peptidases M18, M20, M28, and M42; Zinc peptidases play vital roles in metabolic and signaling pathways throughout all kingdoms of life. This family corresponds to several clans in the MEROPS database, including the MH clan, which contains 4 families (M18, M20, M28, M42). The peptidase M20 family includes carboxypeptidases such as the glutamate carboxypeptidase from Pseudomonas, the thermostable carboxypeptidase Ss1 of broad specificity from archaea and yeast Gly-X carboxypeptidase. 
The dipeptidases include bacterial dipeptidase, peptidase V (PepV), a eukaryotic, non-specific dipeptidase, and two Xaa-His dipeptidases (carnosinases). There is also the bacterial aminopeptidase, peptidase T (PepT) that acts only on tripeptide substrates and has therefore been termed a tripeptidase. Peptidase family M28 contains aminopeptidases and carboxypeptidases, and has co-catalytic zinc ions. However, several enzymes in this family utilize other first row transition metal ions such as cobalt and manganese. Each zinc ion is tetrahedrally co-ordinated, with three amino acid ligands plus activated water; one aspartate residue binds both metal ions. The aminopeptidases in this family are also called bacterial leucyl aminopeptidases, but are able to release a variety of N-terminal amino acids. IAP aminopeptidase and aminopeptidase Y preferentially release basic amino acids while glutamate carboxypeptidase II preferentially releases C-terminal glutamates. Glutamate carboxypeptidase II and plasma glutamate carboxypeptidase hydrolyze dipeptides. Peptidase families M18 and M42 contain metalloaminopeptidases. M18 is widely distributed in bacteria and eukaryotes. However, only yeast aminopeptidase I and mammalian aspartyl aminopeptidase have been characterized in detail. Some of M42 (also known as glutamyl aminopeptidase) enzymes exhibit aminopeptidase specificity while others also have acylaminoacylpeptidase activity (i.e. hydrolysis of acylated N-terminal residues)." Q#15577 - CGI_10023407 superfamily 246748 6 461 0 871.143 cl14876 Zinc_peptidase_like superfamily - - "Zinc peptidases M18, M20, M28, and M42; Zinc peptidases play vital roles in metabolic and signaling pathways throughout all kingdoms of life. This family corresponds to several clans in the MEROPS database, including the MH clan, which contains 4 families (M18, M20, M28, M42). 
The peptidase M20 family includes carboxypeptidases such as the glutamate carboxypeptidase from Pseudomonas, the thermostable carboxypeptidase Ss1 of broad specificity from archaea and yeast Gly-X carboxypeptidase. The dipeptidases include bacterial dipeptidase, peptidase V (PepV), a eukaryotic, non-specific dipeptidase, and two Xaa-His dipeptidases (carnosinases). There is also the bacterial aminopeptidase, peptidase T (PepT) that acts only on tripeptide substrates and has therefore been termed a tripeptidase. Peptidase family M28 contains aminopeptidases and carboxypeptidases, and has co-catalytic zinc ions. However, several enzymes in this family utilize other first row transition metal ions such as cobalt and manganese. Each zinc ion is tetrahedrally co-ordinated, with three amino acid ligands plus activated water; one aspartate residue binds both metal ions. The aminopeptidases in this family are also called bacterial leucyl aminopeptidases, but are able to release a variety of N-terminal amino acids. IAP aminopeptidase and aminopeptidase Y preferentially release basic amino acids while glutamate carboxypeptidase II preferentially releases C-terminal glutamates. Glutamate carboxypeptidase II and plasma glutamate carboxypeptidase hydrolyze dipeptides. Peptidase families M18 and M42 contain metalloaminopeptidases. M18 is widely distributed in bacteria and eukaryotes. However, only yeast aminopeptidase I and mammalian aspartyl aminopeptidase have been characterized in detail. Some of M42 (also known as glutamyl aminopeptidase) enzymes exhibit aminopeptidase specificity while others also have acylaminoacylpeptidase activity (i.e. hydrolysis of acylated N-terminal residues)." Q#15580 - CGI_10023410 superfamily 247805 58 265 1.66E-82 258.954 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 
Q#15580 - CGI_10023410 superfamily 247905 307 408 1.83E-20 88.0636 cl17351 HELICc superfamily N - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#15580 - CGI_10023410 superfamily 222474 439 501 7.03E-23 92.8713 cl16500 DUF4217 superfamily - - Domain of unknown function (DUF4217); This short domain is found at the C-terminus of many helicase proteins. Q#15581 - CGI_10023411 superfamily 241563 74 113 3.54E-05 41.696 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#15581 - CGI_10023411 superfamily 110440 491 517 0.000407099 38.5429 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." 
Q#15581 - CGI_10023411 superfamily 218425 348 376 0.00994621 37.2918 cl04931 eIF-3_zeta superfamily NC - "Eukaryotic translation initiation factor 3 subunit 7 (eIF-3); This family is made up of eukaryotic translation initiation factor 3 subunit 7 (eIF-3 zeta/eIF3 p66/eIF3d). Eukaryotic initiation factor 3 is a multi-subunit complex that is required for binding of mRNA to 40 S ribosomal subunits, stabilisation of ternary complex binding to 40 S subunits, and dissociation of 40 and 60 S subunits. These functions and the complex nature of eIF3 suggest multiple interactions with many components of the translational machinery. The gene coding for the protein has been implicated in cancer in mammals." Q#15583 - CGI_10023413 superfamily 247724 35 198 1.80E-61 192.492 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. 
Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#15585 - CGI_10023415 superfamily 110440 82 108 0.000433768 35.0761 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#15585 - CGI_10023415 superfamily 110440 125 152 0.00156201 33.9205 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#15586 - CGI_10023416 superfamily 241563 83 121 6.77E-05 40.9256 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#15586 - CGI_10023416 superfamily 243362 382 423 0.000210634 40.4863 cl03262 DnaJ_C superfamily N - C-terminal substrate binding domain of DnaJ and HSP40; The C-terminal region of the DnaJ/Hsp40 protein mediates oligomerization and binding to denatured polypeptide substrate. DnaJ/Hsp40 is a widely conserved heat-shock protein. It prevents the aggregation of unfolded substrate and forms a ternary complex with both substrate and DnaK/Hsp70; the N-terminal J-domain of DnaJ/Hsp40 stimulates the ATPase activity of DnaK/Hsp70. Q#15587 - CGI_10023417 superfamily 243061 1 102 2.36E-39 142.481 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#15587 - CGI_10023417 superfamily 243061 110 209 3.64E-37 136.318 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#15587 - CGI_10023417 superfamily 215647 638 844 4.08E-26 108.465 cl18338 7tm_2 superfamily - - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GPCRs). They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. 
Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#15587 - CGI_10023417 superfamily 243086 586 626 3.06E-15 71.6373 cl02559 GPS superfamily - - "Latrophilin/CL-1-like GPS domain; Domain present in latrophilin/CL-1, sea urchin REJ and polycystin." Q#15587 - CGI_10023417 superfamily 221370 383 560 0.000566329 41.2029 cl13441 DUF3497 superfamily - - "Domain of unknown function (DUF3497); This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 213 to 257 amino acids in length. This domain is found associated with pfam02793, pfam00002, pfam01825. This domain has a single completely conserved residue W that may be functionally important." Q#15588 - CGI_10023418 superfamily 241739 224 413 6.07E-94 302.202 cl00268 class_II_aaRS-like_core superfamily N - "Class II tRNA amino-acyl synthetase-like catalytic core domain. Class II amino acyl-tRNA synthetases (aaRS) share a common fold and generally attach an amino acid to the 3' OH of ribose of the appropriate tRNA. PheRS is an exception in that it attaches the amino acid at the 2'-OH group, like class I aaRSs. These enzymes are usually homodimers. This domain is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. The substrate specificity of this reaction is further determined by additional domains. Interestingly, this domain is also found in asparagine synthase A (AsnA), in the accessory subunit of mitochondrial polymerase gamma and in the bacterial ATP phosphoribosyltransferase regulatory subunit HisZ." Q#15588 - CGI_10023418 superfamily 241805 9 57 2.93E-19 83.6898 cl00349 S15_NS1_EPRS_RNA-bind superfamily - - "S15/NS1/EPRS_RNA-binding domain. 
This short domain consists of a helix-turn-helix structure, which can bind to several types of RNA. It is found in the ribosomal protein S15, the influenza A viral nonstructural protein (NSA) and in several eukaryotic aminoacyl tRNA synthetases (aaRSs), where it occurs as a single or a repeated unit. It is involved in both protein-RNA interactions by binding tRNA and protein-protein interactions in the formation of tRNA-synthetases into multienzyme complexes. While this domain lacks significant sequence similarity between the subgroups in which it is found, they share similar electrostatic surface potentials and thus are likely to bind to RNA via the same mechanism." Q#15588 - CGI_10023418 superfamily 241739 68 140 3.22E-25 106.906 cl00268 class_II_aaRS-like_core superfamily C - "Class II tRNA amino-acyl synthetase-like catalytic core domain. Class II amino acyl-tRNA synthetases (aaRS) share a common fold and generally attach an amino acid to the 3' OH of ribose of the appropriate tRNA. PheRS is an exception in that it attaches the amino acid at the 2'-OH group, like class I aaRSs. These enzymes are usually homodimers. This domain is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. The substrate specificity of this reaction is further determined by additional domains. Interestingly, this domain is also found in asparagine synthase A (AsnA), in the accessory subunit of mitochondrial polymerase gamma and in the bacterial ATP phosphoribosyltransferase regulatory subunit HisZ." Q#15588 - CGI_10023418 superfamily 241738 541 575 9.47E-10 57.9528 cl00266 HGTP_anticodon superfamily C - "HGTP anticodon binding domain, as found at the C-terminus of histidyl, glycyl, threonyl and prolyl tRNA synthetases, which are classified as a group of class II aminoacyl-tRNA synthetases (aaRS). 
In aaRSs, the anticodon binding domain is responsible for specificity in tRNA-binding, so that the activated amino acid is transferred to a ribose 3' OH group of the appropriate tRNA only. This domain is also found in the accessory subunit of mitochondrial polymerase gamma (Pol gamma b)." Q#15588 - CGI_10023418 superfamily 241563 676 714 0.000204323 40.5404 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#15589 - CGI_10023419 superfamily 248438 35 141 1.01E-06 48.188 cl17884 COG1214 superfamily C - "Inactive homolog of metal-dependent proteases, putative molecular chaperone [Posttranslational modification, protein turnover, chaperones]" Q#15591 - CGI_10023421 superfamily 222370 38 119 1.43E-19 82.5709 cl16386 Longin superfamily - - "Regulated-SNARE-like domain; Longin is one of the approximately 26 components required for transporting proteins from the ER to the plasma membrane, via the Golgi apparatus. It is necessary for the steps of the transfer from the ER to the Golgi complex. Longins are the only R-SNAREs that are common to all eukaryotes, and they are characterized by a conserved N-terminal domain with a profilin-like fold called a longin domain." Q#15591 - CGI_10023421 superfamily 243035 318 411 5.57E-15 71.0917 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. 
Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. 
Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#15592 - CGI_10023422 superfamily 204415 13 83 6.24E-19 90.7413 cl16016 TTKRSYEDQ superfamily N - Predicted coiled-coil domain-containing protein; This is the C-terminal 500 amino acids of a family of proteins with a predicted coiled-coil domain conserved from nematodes to humans. It carries a characteristic TTKRSYEDQ sequence-motif. The function is not known. Q#15592 - CGI_10023422 superfamily 152912 527 553 0.000179164 40.9316 cl13860 DUF3697 superfamily C - "Ubiquitin-associated protein 2; This domain family is found in eukaryotes, and is approximately 30 amino acids in length. The family is found in association with pfam00627. There are two conserved sequence motifs: AVEMPG and QFG." Q#15593 - CGI_10023423 superfamily 241736 10 96 1.98E-20 84.2091 cl00263 TFold superfamily N - "Tunnelling fold (T-fold). The five known T-folds are found in five different enzymes with different functions: dihydroneopterin-triphosphate epimerase (DHNTPE), dihydroneopterin aldolase (DHNA), GTP cyclohydrolase I (GTPCH-1), 6-pyruvoyl tetrahydropterin synthetase (PTPS), and uricase (UO, uroate/urate oxidase). They bind to substrates belonging to the purine or pterin families, and share a fold-related binding site with a glutamate or glutamine residue anchoring the substrate and a lot of conserved interactions. They also share a similar oligomerization mode: several T-folds join together to form a beta(2n)alpha(n) barrel, then two barrels join together in a head-to-head fashion to make up the native enzymes. The functional enzyme is a tetramer for UO, a hexamer for PTPS, an octamer for DHNA/DHNTPE and a decamer for GTPCH-1. The substrate is located in a deep and narrow pocket at the interface between monomers. 
In PTPS, the active site is located at the interface of three monomers, two from one trimer and one from the other trimer. In GTPCH-1, it is also located at the interface of three subunits, two from one pentamer and one from the other pentamer. There are four equivalent active sites in UO, six in PTPS, eight in DHNA/DHNTPE and ten in GTPCH-1. Each globular multimeric enzyme encloses a tunnel which is lined with charged residues for DHNA and UO, and with basic residues in PTPS. The N and C-terminal ends are located on one side of the T-fold while the residues involved in the catalytic activity are located at the opposite side. In PTPS, UO and DHNA/DHNTPE, the N and C-terminal extremities of the enzyme are located on the exterior side of the functional multimeric enzyme. In GTPCH-1, the extra C-terminal helix places the extremity inside the tunnel." Q#15594 - CGI_10023424 superfamily 242232 108 157 5.46E-14 62.9608 cl00984 TM2 superfamily - - "TM2 domain; This family is composed of a pair of transmembrane alpha helices connected by a short linker. The function of this domain is unknown, however it occurs in a wide range of protein contexts." Q#15596 - CGI_10023426 superfamily 241567 472 707 5.66E-85 270.241 cl00042 CASc superfamily - - "Caspase, interleukin-1 beta converting enzyme (ICE) homologues; Cysteine-dependent aspartate-directed proteases that mediate programmed cell death (apoptosis). Caspases are synthesized as inactive zymogens and activated by proteolysis of the peptide backbone adjacent to an aspartate. The resulting two subunits associate to form an (alpha)2(beta)2-tetramer which is the active enzyme. Activation of caspases can be mediated by other caspase homologs." Q#15596 - CGI_10023426 superfamily 241567 62 251 1.95E-78 252.907 cl00042 CASc superfamily C - "Caspase, interleukin-1 beta converting enzyme (ICE) homologues; Cysteine-dependent aspartate-directed proteases that mediate programmed cell death (apoptosis). 
Caspases are synthesized as inactive zymogens and activated by proteolysis of the peptide backbone adjacent to an aspartate. The resulting two subunits associate to form an (alpha)2(beta)2-tetramer which is the active enzyme. Activation of caspases can be mediated by other caspase homologs." Q#15597 - CGI_10023427 superfamily 241567 59 304 5.85E-105 309.146 cl00042 CASc superfamily - - "Caspase, interleukin-1 beta converting enzyme (ICE) homologues; Cysteine-dependent aspartate-directed proteases that mediate programmed cell death (apoptosis). Caspases are synthesized as inactive zymogens and activated by proteolysis of the peptide backbone adjacent to an aspartate. The resulting two subunits associate to form an (alpha)2(beta)2-tetramer which is the active enzyme. Activation of caspases can be mediated by other caspase homologs." Q#15598 - CGI_10023428 superfamily 218556 18 509 4.01E-140 419.786 cl05076 RRN3 superfamily - - RNA polymerase I specific transcription initiation factor RRN3; This family consists of several eukaryotic proteins which are homologous to the yeast RRN3 protein. RRN3 is one of the RRN genes specifically required for the transcription of rDNA by RNA polymerase I (Pol I) in Saccharomyces cerevisiae. Q#15600 - CGI_10023430 superfamily 247743 495 585 1.66E-08 54.8447 cl17189 AAA superfamily N - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." 
Q#15601 - CGI_10023431 superfamily 219975 211 302 2.98E-32 116.644 cl07355 Fcf2 superfamily - - Fcf2 pre-rRNA processing; This is a family of eukaryotic nucleolar proteins that are involved in pre-rRNA processing. Q#15603 - CGI_10023433 superfamily 243179 186 322 1.03E-09 55.0465 cl02781 tetraspanin_LEL superfamily - - "Tetraspanin, extracellular domain or large extracellular loop (LEL). Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. The tetraspanin family contains CD9, CD63, CD37, CD53, CD82, CD151, and CD81, amongst others. Tetraspanins are involved in diverse processes such as cell activation and proliferation, adhesion and motility, differentiation, cancer, and others. Their various functions may relate to their ability to act as molecular facilitators, grouping specific cell-surface proteins and affecting formation and stability of signaling complexes. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web", which may also include integrins." Q#15605 - CGI_10023435 superfamily 243179 214 351 6.79E-13 64.4763 cl02781 tetraspanin_LEL superfamily - - "Tetraspanin, extracellular domain or large extracellular loop (LEL). Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. The tetraspanin family contains CD9, CD63, CD37, CD53, CD82, CD151, and CD81, amongst others. Tetraspanins are involved in diverse processes such as cell activation and proliferation, adhesion and motility, differentiation, cancer, and others. 
Their various functions may relate to their ability to act as molecular facilitators, grouping specific cell-surface proteins and affecting formation and stability of signaling complexes. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web", which may also include integrins." Q#15606 - CGI_10023436 superfamily 243179 135 263 6.22E-16 71.7951 cl02781 tetraspanin_LEL superfamily - - "Tetraspanin, extracellular domain or large extracellular loop (LEL). Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. The tetraspanin family contains CD9, CD63, CD37, CD53, CD82, CD151, and CD81, amongst others. Tetraspanins are involved in diverse processes such as cell activation and proliferation, adhesion and motility, differentiation, cancer, and others. Their various functions may relate to their ability to act as molecular facilitators, grouping specific cell-surface proteins and affecting formation and stability of signaling complexes. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web", which may also include integrins." Q#15607 - CGI_10023437 superfamily 243098 567 613 2.97E-15 73.4011 cl02573 TUDOR superfamily - - "Tudor domains are found in many eukaryotic organisms and have been implicated in protein-protein interactions in which methylated protein substrates bind to these domains. For example, the Tudor domain of Survival of Motor Neuron (SMN) binds to symmetrically dimethylated arginines of arginine-glycine (RG) rich sequences found in the C-terminal tails of Sm proteins. 
The SMN protein is linked to spinal muscular atrophy. Another example is the tandem tudor domains of 53BP1, which bind to histone H4 specifically dimethylated at Lys20 (H4-K20me2). 53BP1 is a key transducer of the DNA damage checkpoint signal." Q#15607 - CGI_10023437 superfamily 243098 111 158 3.72E-14 70.3195 cl02573 TUDOR superfamily - - "Tudor domains are found in many eukaryotic organisms and have been implicated in protein-protein interactions in which methylated protein substrates bind to these domains. For example, the Tudor domain of Survival of Motor Neuron (SMN) binds to symmetrically dimethylated arginines of arginine-glycine (RG) rich sequences found in the C-terminal tails of Sm proteins. The SMN protein is linked to spinal muscular atrophy. Another example is the tandem tudor domains of 53BP1, which bind to histone H4 specifically dimethylated at Lys20 (H4-K20me2). 53BP1 is a key transducer of the DNA damage checkpoint signal." Q#15607 - CGI_10023437 superfamily 243098 1340 1385 9.77E-14 69.1639 cl02573 TUDOR superfamily - - "Tudor domains are found in many eukaryotic organisms and have been implicated in protein-protein interactions in which methylated protein substrates bind to these domains. For example, the Tudor domain of Survival of Motor Neuron (SMN) binds to symmetrically dimethylated arginines of arginine-glycine (RG) rich sequences found in the C-terminal tails of Sm proteins. The SMN protein is linked to spinal muscular atrophy. Another example is the tandem tudor domains of 53BP1, which bind to histone H4 specifically dimethylated at Lys20 (H4-K20me2). 53BP1 is a key transducer of the DNA damage checkpoint signal." Q#15607 - CGI_10023437 superfamily 243098 1840 1887 2.05E-12 65.3119 cl02573 TUDOR superfamily - - "Tudor domains are found in many eukaryotic organisms and have been implicated in protein-protein interactions in which methylated protein substrates bind to these domains. 
For example, the Tudor domain of Survival of Motor Neuron (SMN) binds to symmetrically dimethylated arginines of arginine-glycine (RG) rich sequences found in the C-terminal tails of Sm proteins. The SMN protein is linked to spinal muscular atrophy. Another example is the tandem tudor domains of 53BP1, which bind to histone H4 specifically dimethylated at Lys20 (H4-K20me2). 53BP1 is a key transducer of the DNA damage checkpoint signal." Q#15607 - CGI_10023437 superfamily 243098 762 808 2.07E-12 65.3119 cl02573 TUDOR superfamily - - "Tudor domains are found in many eukaryotic organisms and have been implicated in protein-protein interactions in which methylated protein substrates bind to these domains. For example, the Tudor domain of Survival of Motor Neuron (SMN) binds to symmetrically dimethylated arginines of arginine-glycine (RG) rich sequences found in the C-terminal tails of Sm proteins. The SMN protein is linked to spinal muscular atrophy. Another example is the tandem tudor domains of 53BP1, which bind to histone H4 specifically dimethylated at Lys20 (H4-K20me2). 53BP1 is a key transducer of the DNA damage checkpoint signal." Q#15607 - CGI_10023437 superfamily 243098 961 1006 2.13E-12 65.3119 cl02573 TUDOR superfamily - - "Tudor domains are found in many eukaryotic organisms and have been implicated in protein-protein interactions in which methylated protein substrates bind to these domains. For example, the Tudor domain of Survival of Motor Neuron (SMN) binds to symmetrically dimethylated arginines of arginine-glycine (RG) rich sequences found in the C-terminal tails of Sm proteins. The SMN protein is linked to spinal muscular atrophy. Another example is the tandem tudor domains of 53BP1, which bind to histone H4 specifically dimethylated at Lys20 (H4-K20me2). 53BP1 is a key transducer of the DNA damage checkpoint signal." 
Q#15607 - CGI_10023437 superfamily 243098 1145 1190 1.92E-10 59.5339 cl02573 TUDOR superfamily - - "Tudor domains are found in many eukaryotic organisms and have been implicated in protein-protein interactions in which methylated protein substrates bind to these domains. For example, the Tudor domain of Survival of Motor Neuron (SMN) binds to symmetrically dimethylated arginines of arginine-glycine (RG) rich sequences found in the C-terminal tails of Sm proteins. The SMN protein is linked to spinal muscular atrophy. Another example is the tandem tudor domains of 53BP1, which bind to histone H4 specifically dimethylated at Lys20 (H4-K20me2). 53BP1 is a key transducer of the DNA damage checkpoint signal." Q#15607 - CGI_10023437 superfamily 243098 1583 1629 3.93E-07 49.9039 cl02573 TUDOR superfamily - - "Tudor domains are found in many eukaryotic organisms and have been implicated in protein-protein interactions in which methylated protein substrates bind to these domains. For example, the Tudor domain of Survival of Motor Neuron (SMN) binds to symmetrically dimethylated arginines of arginine-glycine (RG) rich sequences found in the C-terminal tails of Sm proteins. The SMN protein is linked to spinal muscular atrophy. Another example is the tandem tudor domains of 53BP1, which bind to histone H4 specifically dimethylated at Lys20 (H4-K20me2). 53BP1 is a key transducer of the DNA damage checkpoint signal." Q#15608 - CGI_10023438 superfamily 243098 61 106 1.09E-12 63.7711 cl02573 TUDOR superfamily - - "Tudor domains are found in many eukaryotic organisms and have been implicated in protein-protein interactions in which methylated protein substrates bind to these domains. For example, the Tudor domain of Survival of Motor Neuron (SMN) binds to symmetrically dimethylated arginines of arginine-glycine (RG) rich sequences found in the C-terminal tails of Sm proteins. The SMN protein is linked to spinal muscular atrophy. 
Another example is the tandem tudor domains of 53BP1, which bind to histone H4 specifically dimethylated at Lys20 (H4-K20me2). 53BP1 is a key transducer of the DNA damage checkpoint signal." Q#15608 - CGI_10023438 superfamily 243098 630 675 7.03E-11 58.7635 cl02573 TUDOR superfamily - - "Tudor domains are found in many eukaryotic organisms and have been implicated in protein-protein interactions in which methylated protein substrates bind to these domains. For example, the Tudor domain of Survival of Motor Neuron (SMN) binds to symmetrically dimethylated arginines of arginine-glycine (RG) rich sequences found in the C-terminal tails of Sm proteins. The SMN protein is linked to spinal muscular atrophy. Another example is the tandem tudor domains of 53BP1, which bind to histone H4 specifically dimethylated at Lys20 (H4-K20me2). 53BP1 is a key transducer of the DNA damage checkpoint signal." Q#15608 - CGI_10023438 superfamily 243098 238 284 3.86E-08 50.6743 cl02573 TUDOR superfamily - - "Tudor domains are found in many eukaryotic organisms and have been implicated in protein-protein interactions in which methylated protein substrates bind to these domains. For example, the Tudor domain of Survival of Motor Neuron (SMN) binds to symmetrically dimethylated arginines of arginine-glycine (RG) rich sequences found in the C-terminal tails of Sm proteins. The SMN protein is linked to spinal muscular atrophy. Another example is the tandem tudor domains of 53BP1, which bind to histone H4 specifically dimethylated at Lys20 (H4-K20me2). 53BP1 is a key transducer of the DNA damage checkpoint signal." Q#15608 - CGI_10023438 superfamily 243098 422 466 0.000166881 40.274 cl02573 TUDOR superfamily - - "Tudor domains are found in many eukaryotic organisms and have been implicated in protein-protein interactions in which methylated protein substrates bind to these domains. 
For example, the Tudor domain of Survival of Motor Neuron (SMN) binds to symmetrically dimethylated arginines of arginine-glycine (RG) rich sequences found in the C-terminal tails of Sm proteins. The SMN protein is linked to spinal muscular atrophy. Another example is the tandem tudor domains of 53BP1, which bind to histone H4 specifically dimethylated at Lys20 (H4-K20me2). 53BP1 is a key transducer of the DNA damage checkpoint signal." Q#15609 - CGI_10023439 superfamily 247058 1417 1619 3.61E-56 196.242 cl15762 crotonase-like superfamily - - "Crotonase/Enoyl-Coenzyme A (CoA) hydratase superfamily. This superfamily contains a diverse set of enzymes including enoyl-CoA hydratase, naphthoate synthase, methylmalonyl-CoA decarboxylase, 3-hydroxybutyryl-CoA dehydratase, and dienoyl-CoA isomerase. Many of these play important roles in fatty acid metabolism. In addition to a conserved structural core and the formation of trimers (or dimers of trimers), a common feature in this superfamily is the stabilization of an enolate anion intermediate derived from an acyl-CoA substrate. This is accomplished by two conserved backbone NH groups in active sites that form an oxyanion hole." Q#15609 - CGI_10023439 superfamily 247743 370 521 9.38E-27 109.543 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." 
Q#15609 - CGI_10023439 superfamily 202367 1739 1917 1.54E-61 210.859 cl18226 3HCDH_N superfamily - - "3-hydroxyacyl-CoA dehydrogenase, NAD binding domain; This family also includes lambda crystallin." Q#15609 - CGI_10023439 superfamily 243084 902 1013 2.03E-39 144.807 cl02556 Bromodomain superfamily - - Bromodomain. Bromodomains are found in many chromatin-associated proteins and in nuclear histone acetyltransferases. They interact specifically with acetylated lysine. Q#15609 - CGI_10023439 superfamily 216084 1920 2015 4.67E-28 111.53 cl08285 3HCDH superfamily - - "3-hydroxyacyl-CoA dehydrogenase, C-terminal domain; This family also includes lambda crystallin. Some proteins include two copies of this domain." Q#15610 - CGI_10023440 superfamily 245210 47 465 1.41E-158 457.711 cl09938 cond_enzymes superfamily - - "Condensing enzymes; Family of enzymes that catalyze a (decarboxylating or non-decarboxylating) Claisen-like condensation reaction. Members share strong structural similarity, and are involved in the synthesis and degradation of fatty acids, and the production of polyketides, a diverse group of natural products." Q#15611 - CGI_10023441 superfamily 246597 7 295 7.66E-108 318.092 cl13995 MPP_superfamily superfamily - - "metallophosphatase superfamily, metallophosphatase domain; Metallophosphatases (MPPs), also known as metallophosphoesterases, phosphodiesterases (PDEs), binuclear metallophosphoesterases, and dimetal-containing phosphoesterases (DMPs), represent a diverse superfamily of enzymes with a conserved domain containing an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. 
This superfamily includes: the phosphoprotein phosphatases (PPPs), Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination." Q#15612 - CGI_10023442 superfamily 241899 516 548 0.00340215 38.217 cl00489 60KD_IMP superfamily C - 60Kd inner membrane protein; 60Kd inner membrane protein. Q#15613 - CGI_10006895 superfamily 241619 276 313 0.00124353 38.7173 cl00112 PAN_APPLE superfamily NC - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#15614 - CGI_10006896 superfamily 245835 187 404 9.39E-121 352.402 cl12013 BAR superfamily - - "The Bin/Amphiphysin/Rvs (BAR) domain, a dimerization module that binds membranes and detects membrane curvature; BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions including organelle biogenesis, membrane trafficking or remodeling, and cell division and migration. Mutations in BAR containing proteins have been linked to diseases and their inactivation in cells leads to altered membrane dynamics. A BAR domain with an additional N-terminal amphipathic helix (an N-BAR) can drive membrane curvature. 
These N-BAR domains are found in amphiphysins and endophilins, among others. BAR domains are also frequently found alongside domains that determine lipid specificity, such as the Pleckstrin Homology (PH) and Phox Homology (PX) domains which are present in beta centaurins (ACAPs and ASAPs) and sorting nexins, respectively. A FES-CIP4 Homology (FCH) domain together with a coiled coil region is called the F-BAR domain and is present in Pombe/Cdc15 homology (PCH) family proteins, which include Fes/Fes tyrosine kinases, PACSIN or syndapin, CIP4-like proteins, and srGAPs, among others. The Inverse (I)-BAR or IRSp53/MIM homology Domain (IMD) is found in multi-domain proteins, such as IRSp53 and MIM, that act as scaffolding proteins and transducers of a variety of signaling pathways that link membrane dynamics and the underlying actin cytoskeleton. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The I-BAR domain induces membrane protrusions in the opposite direction compared to classical BAR and F-BAR domains, which produce membrane invaginations. BAR domains that also serve as protein interaction domains include those of arfaptin and OPHN1-like proteins, among others, which bind to Rac and Rho GAP domains, respectively." Q#15614 - CGI_10006896 superfamily 243088 31 171 2.23E-79 243.495 cl02563 PX_domain superfamily - - "The Phox Homology domain, a phosphoinositide binding module; The PX domain is a phosphoinositide (PI) binding module involved in targeting proteins to membranes. Proteins containing PX domains interact with PIs and have been implicated in highly diverse functions such as cell signaling, vesicular trafficking, protein sorting, lipid modification, cell polarity and division, activation of T and B cells, and cell survival. 
Many members of this superfamily bind phosphatidylinositol-3-phosphate (PI3P) but in some cases, other PIs such as PI4P or PI(3,4)P2, among others, are the preferred substrates. In addition to protein-lipid interaction, the PX domain may also be involved in protein-protein interaction, as in the cases of p40phox, p47phox, and some sorting nexins (SNXs). The PX domain is conserved from yeast to humans and is found in more than 100 proteins. The majority of PX domain-containing proteins are SNXs, which play important roles in endosomal sorting." Q#15618 - CGI_10020689 superfamily 219562 77 343 2.84E-87 278.031 cl06685 LETM1 superfamily - - LETM1-like protein; Members of this family are inner mitochondrial membrane proteins which play a role in potassium and hydrogen ion exchange. Deletion of LETM1 is thought to be involved in the development of Wolf-Hirschhorn syndrome in humans. Q#15619 - CGI_10020690 superfamily 241629 58 172 1.13E-30 113.564 cl00133 SCP superfamily - - "SCP: SCP-like extracellular protein domain, found in eukaryotes and prokaryotes. This family includes plant pathogenesis-related protein 1 (PR-1), which accumulates after infections with pathogens, and may act as an anti-fungal agent or be involved in cell wall loosening. This family also includes CRISPs, mammalian cysteine-rich secretory proteins, which combine SCP with a C-terminal cysteine rich domain, and allergen 5 from vespid venom. Roles for CRISP, in response to pathogens, fertilization, and sperm maturation have been proposed. One member, Tex31 from the venom duct of Conus textile, has been shown to possess proteolytic activity sensitive to serine protease inhibitors. The human GAPR-1 protein has been reported to dimerize, and such a dimer may form an active site containing a catalytic triad. SCP has also been proposed to be a Ca++ chelating serine protease. 
The Ca++-chelating function would fit with various signaling processes that members of this family, such as the CRISPs, are involved in, and is supported by sequence and structural evidence of a conserved pocket containing two histidines and a glutamate. It also may explain how helothermine, a toxic peptide secreted by the beaded lizard, blocks Ca++ transporting ryanodine receptors. Little is known about the biological roles of the bacterial and archaeal SCP domains." Q#15620 - CGI_10020691 superfamily 220657 78 469 1.03E-101 318.91 cl10940 RAI16-like superfamily - - Retinoic acid induced 16-like protein; This is the conserved N-terminal 450 residues of a family of proteins described as retinoic acid-induced protein 16-like proteins. The exact function is not known. The proteins are found from worms to humans. Q#15622 - CGI_10020693 superfamily 243035 9 136 1.85E-19 81.8973 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#15623 - CGI_10020694 superfamily 243035 345 469 4.11E-17 77.2749 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#15623 - CGI_10020694 superfamily 243035 212 326 2.23E-11 60.7113 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. 
CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#15624 - CGI_10020695 superfamily 214545 420 562 7.68E-56 188.298 cl10551 CULLIN superfamily - - Cullin; Cullin. Q#15624 - CGI_10020695 superfamily 245539 688 760 1.95E-24 98.0097 cl11186 Cullin_Nedd8 superfamily - - "Cullin protein neddylation domain; This is the neddylation site of cullin proteins which are a family of structurally related proteins containing an evolutionarily conserved cullin domain. 
With the exception of APC2, each member of the cullin family is modified by Nedd8 and several cullins function in Ubiquitin-dependent proteolysis, a process in which the 26S proteasome recognises and subsequently degrades a target protein tagged with K48-linked poly-ubiquitin chains. Cullins are molecular scaffolds responsible for assembling the ROC1/Rbx1 RING-based E3 ubiquitin ligases, of which several play a direct role in tumorigenesis. Nedd8/Rub1 is a small ubiquitin-like protein, which was originally found to be conjugated to Cdc53, a cullin component of the SCF (Skp1-Cdc53/CUL1-F-box protein) E3 Ub ligase complex in Saccharomyces cerevisiae, and Nedd8 modification has now emerged as a regulatory pathway of fundamental importance for cell cycle control and for embryogenesis in metazoans. The only identified Nedd8 substrates are cullins. Neddylation results in covalent conjugation of a Nedd8 moiety onto a conserved cullin lysine residue." Q#15625 - CGI_10020696 superfamily 241583 103 178 1.24E-24 100.723 cl00064 ZnMc superfamily C - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#15625 - CGI_10020696 superfamily 241583 177 222 0.00300743 37.5507 cl00064 ZnMc superfamily N - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." 
Q#15627 - CGI_10020698 superfamily 246908 581 691 1.22E-26 105.859 cl15255 SH2 superfamily - - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." Q#15628 - CGI_10020699 superfamily 207921 1 113 5.84E-45 151.166 cl03350 Ribosomal_L28e superfamily - - Ribosomal L28e protein family; Ribosomal L28e protein family. Q#15628 - CGI_10020699 superfamily 218303 131 176 2.28E-11 59.4992 cl15325 Mak16 superfamily C - Mak16 protein C-terminal region; The precise function of this eukaryotic protein family is unknown. The yeast orthologues have been implicated in cell cycle progression and biogenesis of 60S ribosomal subunits. The Schistosoma mansoni Mak16 has been shown to target protein transport to the nucleolus. Q#15629 - CGI_10020700 superfamily 241599 67 109 3.66E-13 63.0312 cl00084 homeodomain superfamily C - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#15629 - CGI_10020700 superfamily 243061 209 307 2.94E-44 148.259 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. 
Q#15631 - CGI_10020702 superfamily 241646 55 96 1.37E-07 43.9779 cl00156 WAP superfamily N - "whey acidic protein-type four-disulfide core domains. Members of the family include whey acidic protein, elafin (elastase-specific inhibitor), caltrin-like protein (a calcium transport inhibitor) and other extracellular proteinase inhibitors. A group of proteins containing 8 characteristically-spaced cysteine residues forming disulphide bonds, have been termed '4-disulphide core' proteins. Protease inhibition occurs by insertion of the inhibitory loop into the active site pocket and interference with the catalytic residues of the protease." Q#15633 - CGI_10020704 superfamily 247905 262 317 4.81E-08 50.3141 cl17351 HELICc superfamily N - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#15633 - CGI_10020704 superfamily 247805 57 143 2.89E-07 48.4095 cl17251 DEXDc superfamily N - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#15636 - CGI_10020707 superfamily 217293 35 234 5.88E-36 131.986 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. 
Q#15636 - CGI_10020707 superfamily 202474 241 329 5.00E-12 63.8269 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#15638 - CGI_10020709 superfamily 247743 580 646 0.00010289 42.1331 cl17189 AAA superfamily C - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#15639 - CGI_10020710 superfamily 242934 13 154 1.32E-43 148.864 cl02219 Bap31 superfamily - - "B-cell receptor-associated protein 31-like; Bap31 is a polytopic integral protein of the endoplasmic reticulum membrane and a substrate of caspase-8. Bap31 is cleaved within its cytosolic domain, generating pro-apoptotic p20 Bap31." Q#15640 - CGI_10020711 superfamily 243072 25 144 1.69E-09 52.771 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#15641 - CGI_10020712 superfamily 245206 3 290 3.26E-135 387.958 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#15643 - CGI_10020714 superfamily 243092 20 255 5.63E-25 104.724 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to 
create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#15644 - CGI_10020715 superfamily 248097 49 108 1.16E-10 54.1934 cl17543 C1q superfamily C - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#15645 - CGI_10020716 superfamily 241832 2 90 3.35E-38 130.166 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#15645 - CGI_10020716 superfamily 243175 104 227 3.05E-35 123.585 cl02776 GST_C_family superfamily - - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. 
In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." Q#15646 - CGI_10020717 superfamily 242730 73 120 0.00457812 36.4703 cl01825 Phage_Mu_Gam superfamily C - Bacteriophage Mu Gam like protein; This family consists of bacterial and phage Gam proteins. The gam gene of bacteriophage Mu encodes a protein which protects linear double stranded DNA from exonuclease degradation in vitro and in vivo. Q#15648 - CGI_10020719 superfamily 241640 225 329 5.35E-06 44.8709 cl00149 Tryp_SPc superfamily C - Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. 
Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues. Q#15649 - CGI_10020720 superfamily 247684 5 424 5.13E-106 327.697 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#15651 - CGI_10011595 superfamily 247799 1 76 3.10E-22 84.2177 cl17245 KH-I superfamily N - "K homology RNA-binding domain, type I. KH binds single-stranded RNA or DNA. It is found in a wide variety of proteins including ribosomal proteins, transcription factors and post-transcriptional modifiers of mRNA. There are two different KH domains that belong to different protein folds, but they share a single KH motif. 
The KH motif is folded into a beta alpha alpha beta unit. In addition to the core, type II KH domains (e.g. ribosomal protein S3) include N-terminal extension and type I KH domains (e.g. hnRNP K) contain C-terminal extension." Q#15652 - CGI_10011596 superfamily 245847 15 77 0.0010524 34.0694 cl12042 FA58C superfamily N - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#15654 - CGI_10011598 superfamily 245201 91 332 4.22E-97 292.887 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#15657 - CGI_10011601 superfamily 243035 64 109 0.000566864 37.2142 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#15660 - CGI_10015738 superfamily 247792 5194 5249 0.000692435 41.2772 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#15660 - CGI_10015738 superfamily 201217 1087 1134 1.52E-12 66.7804 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#15660 - CGI_10015738 superfamily 205718 1072 1100 3.15E-07 50.9518 cl16296 RCC1_2 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#15660 - CGI_10015738 superfamily 205718 856 885 0.00149414 40.1662 cl16296 RCC1_2 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#15661 - CGI_10015739 superfamily 202715 125 223 2.61E-31 111.901 cl04194 Tctex-1 superfamily - - Tctex-1 family; Tctex-1 is a dynein light chain. It has been shown that Tctex-1 can bind to the cytoplasmic tail of rhodopsin. C-terminal rhodopsin mutations responsible for retinitis pigmentosa inhibit this interaction. Q#15662 - CGI_10015740 superfamily 241631 248 422 2.28E-31 118.491 cl00136 Sec7 superfamily - - Sec7 domain; Domain named after the S. cerevisiae SEC7 gene product. The Sec7 domain is the central domain of the guanine-nucleotide-exchange factors (GEFs) of the ADP-ribosylation factor family of small GTPases (ARFs). It carries the exchange factor activity. 
Q#15662 - CGI_10015740 superfamily 202715 61 156 7.31E-29 108.82 cl04194 Tctex-1 superfamily - - Tctex-1 family; Tctex-1 is a dynein light chain. It has been shown that Tctex-1 can bind to the cytoplasmic tail of rhodopsin. C-terminal rhodopsin mutations responsible for retinitis pigmentosa inhibit this interaction. Q#15662 - CGI_10015740 superfamily 243074 185 226 5.53E-10 55.2053 cl02535 F-box-like superfamily - - F-box-like; This is an F-box-like family. Q#15663 - CGI_10015741 superfamily 247725 642 741 2.86E-39 143.208 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#15663 - CGI_10015741 superfamily 241645 1505 1581 2.29E-08 53.446 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. 
In poly-ubiquitination, ubiquitin itself is the substrate." Q#15663 - CGI_10015741 superfamily 243095 1277 1462 4.73E-57 197.916 cl02570 RhoGAP superfamily - - "RhoGAP: GTPase-activator protein (GAP) for Rho-like GTPases; GAPs towards Rho/Rac/Cdc42-like small GTPases. Small GTPases (G proteins) cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when bound to GDP. The Rho family of small G proteins, which includes Cdc42Hs, activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. G proteins generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. The RhoGAPs are one of the major classes of regulators of Rho G proteins." Q#15663 - CGI_10015741 superfamily 243047 850 978 2.15E-39 144.685 cl02464 ArfGap superfamily - - "Putative GTPase activating protein for Arf; Putative zinc fingers with GTPase activating proteins (GAPs) towards the small GTPase, Arf. The GAP of ARD1 stimulates GTPase hydrolysis for ARD1 but not ARFs." Q#15663 - CGI_10015741 superfamily 247725 1070 1178 4.03E-13 68.5374 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." 
Q#15663 - CGI_10015741 superfamily 247725 754 831 1.32E-11 63.1021 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#15663 - CGI_10015741 superfamily 247057 8 62 3.10E-11 61.5424 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#15664 - CGI_10015742 superfamily 241599 237 297 7.73E-09 51.0901 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." 
Q#15665 - CGI_10015743 superfamily 219608 8 174 5.19E-64 197.979 cl06751 DUF1649 superfamily - - Protein of unknown function (DUF1649); This family is made up of sequences derived from hypothetical eukaryotic proteins of unknown function. Q#15666 - CGI_10015744 superfamily 245021 29 57 0.00817743 30.3423 cl09153 PhdYeFM_antitox superfamily C - "Antitoxin Phd_YefM, type II toxin-antitoxin system; Members of this family act as antitoxins in type II toxin-antitoxin systems. When bound to their toxin partners, they can bind DNA via the N-terminus and repress the expression of operons containing genes encoding the toxin and the antitoxin. This domain complexes with Txe toxins containing pfam06769, Fic/DOC toxins containing pfam02661 and YafO toxins containing pfam13957." Q#15667 - CGI_10015745 superfamily 115278 1 128 2.64E-28 102.848 cl05898 DUF1143 superfamily - - Protein of unknown function (DUF1143); This family consists of several hypothetical mammalian proteins (from mouse and human). The function of this family is unknown. Q#15668 - CGI_10015746 superfamily 241597 958 1029 3.85E-38 140.126 cl00082 HMG-box superfamily - - "High Mobility Group (HMG)-box is found in a variety of eukaryotic chromosomal proteins and transcription factors. HMGs bind to the minor groove of DNA and have been classified by DNA binding preferences. Two phylogenically distinct groups of Class I proteins bind DNA in a sequence specific fashion and contain a single HMG box. One group (SOX-TCF) includes transcription factors, TCF-1, -3, -4; and also SRY and LEF-1, which bind four-way DNA junctions and duplex DNA targets. The second group (MATA) includes fungal mating type gene products MC, MATA1 and Ste11. Class II and III proteins (HMGB-UBF) bind DNA in a non-sequence specific fashion and contain two or more tandem HMG boxes. 
Class II members include non-histone chromosomal proteins, HMG1 and HMG2, which bind to bent or distorted DNA such as four-way DNA junctions, synthetic DNA cruciforms, kinked cisplatin-modified DNA, DNA bulges, cross-overs in supercoiled DNA, and can cause looping of linear DNA. Class III members include nucleolar and mitochondrial transcription factors, UBF and mtTF1, which bind four-way DNA junctions." Q#15668 - CGI_10015746 superfamily 215882 2260 2343 0.00250452 38.8011 cl09511 FERM_M superfamily C - FERM central domain; This domain is the central structural domain of the FERM domain. Q#15673 - CGI_10015752 superfamily 243094 46 260 6.64E-107 317.829 cl02569 RasGAP superfamily N - "Ras GTPase Activating Domain; RasGAP functions as an enhancer of the hydrolysis of GTP that is bound to Ras-GTPases. Proteins having a RasGAP domain include p120GAP, IQGAP, Rab5-activating protein 6, and Neurofibromin, among others. Although the Rho (Ras homolog) GTPases are most closely related to members of the Ras family, RhoGAP and RasGAP exhibit no similarity at their amino acid sequence level. RasGTPases function as molecular switches in a large number of signaling pathways. They are in the on state when bound to GTP, and in the off state when bound to GDP. The RasGAP domain speeds up the hydrolysis of GTP in Ras-like proteins acting as a negative regulator." Q#15675 - CGI_10015754 superfamily 248097 9 125 1.31E-16 70.757 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#15676 - CGI_10015755 superfamily 248097 45 153 2.49E-20 81.5426 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#15677 - CGI_10015756 superfamily 241571 190 295 1.07E-22 90.5494 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." 
Q#15677 - CGI_10015756 superfamily 241571 41 156 1.43E-10 57.037 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#15678 - CGI_10015757 superfamily 150144 27 170 1.58E-22 92.5362 cl09624 Tex_N superfamily N - Tex-like protein N-terminal domain; This presumed domain is found at the N-terminus of Bordetella pertussis tex. This protein defines a novel family of prokaryotic transcriptional accessory factors. Q#15679 - CGI_10015758 superfamily 241574 49 104 2.33E-07 46.0398 cl00053 PTPc superfamily N - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#15679 - CGI_10015758 superfamily 242611 12 52 0.00309615 34.1361 cl01629 TPP_enzymes superfamily NC - "Thiamine pyrophosphate (TPP) enzyme family, TPP-binding module; found in many key metabolic enzymes which use TPP (also known as thiamine diphosphate) as a cofactor. These enzymes include, among others, the E1 components of the pyruvate, the acetoin and the branched chain alpha-keto acid dehydrogenase complexes." Q#15680 - CGI_10008573 superfamily 248264 55 212 3.07E-42 141.991 cl17710 DDE_4 superfamily - - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. 
This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#15681 - CGI_10008574 superfamily 241552 35 112 3.99E-14 63.5464 cl00017 Cyt_c_Oxidase_VIa superfamily - - "Cytochrome c oxidase subunit VIa. Cytochrome c oxidase (CcO), the terminal oxidase in the respiratory chains of eukaryotes and most bacteria, is a multi-chain transmembrane protein located in the inner membrane of mitochondria and the cell membrane of prokaryotes. It catalyzes the reduction of O2 and simultaneously pumps protons across the membrane. The number of subunits varies from three to five in bacteria and up to 13 in mammalian mitochondria. Subunits I, II, and III of mammalian CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. Found only in eukaryotes, subunit VIa is expressed in two tissue-specific isoforms in mammals but not fish. VIa-H is the heart and skeletal muscle isoform; VIa-L is the liver or non-muscle isoform. Mammalian VIa-H induces a slip in CcO (decrease in proton/electron stoichiometry) at high intramitochondrial ATP/ADP ratios, while VIa-L induces a permanent slip in CcO, depending on the presence of cardiolipin and palmitate." Q#15682 - CGI_10008575 superfamily 243051 797 908 1.69E-22 95.9077 cl02479 MAM superfamily N - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. 
Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#15682 - CGI_10008575 superfamily 243051 651 732 9.35E-17 78.9589 cl02479 MAM superfamily N - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." 
Q#15682 - CGI_10008575 superfamily 241563 68 109 5.44E-07 47.8592 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#15682 - CGI_10008575 superfamily 241563 15 58 0.00673359 35.5328 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#15684 - CGI_10006230 superfamily 243077 173 208 8.65E-09 51.3772 cl02542 DnaJ superfamily N - "DnaJ domain or J-domain. DnaJ/Hsp40 (heat shock protein 40) proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperonine family. Hsp40 proteins are characterized by the presence of a J domain, which mediates the interaction with Hsp70. They may contain other domains as well, and the architectures provide a means of classification." Q#15685 - CGI_10006231 superfamily 247740 387 454 0.00257756 38.0307 cl17186 TIM_phosphate_binding superfamily NC - "TIM barrel proteins share a structurally conserved phosphate binding motif and in general share an eight beta/alpha closed barrel structure. Specific for this family is the conserved phosphate binding site at the edges of strands 7 and 8. The phosphate comes either from the substrate, as in the case of inosine monophosphate dehydrogenase (IMPDH), or from ribulose-5-phosphate 3-epimerase (RPE) or from cofactors, like FMN." Q#15686 - CGI_10006232 superfamily 222005 289 353 1.20E-05 43.88 cl18632 AAA_19 superfamily C - Part of AAA domain; Part of AAA domain. 
Q#15686 - CGI_10006232 superfamily 221913 702 749 1.87E-05 44.8387 cl18626 AAA_12 superfamily C - AAA domain; This family of domains contain a P-loop motif that is characteristic of the AAA superfamily. Many of the proteins in this family are conjugative transfer proteins. Q#15690 - CGI_10006236 superfamily 243065 243 378 3.80E-18 80.5237 cl02516 VWD superfamily - - von Willebrand factor type D domain; Luciferin-2-monooxygenase from Vargula hilgendorfii contains a vwd domain. Its function is unrelated but the similarity is very strong by several methods. Q#15692 - CGI_10006238 superfamily 220131 597 727 1.03E-29 122.385 cl11721 DUF1943 superfamily C - "Domain of unknown function (DUF1943); Members of this family adopt a structure consisting of several large open beta-sheets. Their exact function has not, as yet, been determined." Q#15692 - CGI_10006238 superfamily 243065 2376 2512 3.66E-18 85.1461 cl02516 VWD superfamily - - von Willebrand factor type D domain; Luciferin-2-monooxygenase from Vargula hilgendorfii contains a vwd domain. Its function is unrelated but the similarity is very strong by several methods. Q#15694 - CGI_10006240 superfamily 247786 95 183 2.57E-07 47.5881 cl17232 F420_oxidored superfamily - - NADP oxidoreductase coenzyme F420-dependent; NADP oxidoreductase coenzyme F420-dependent. Q#15695 - CGI_10003288 superfamily 241578 69 226 2.87E-33 122.785 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#15695 - CGI_10003288 superfamily 241578 253 373 3.17E-26 102.755 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#15695 - CGI_10003288 superfamily 241578 21 57 0.00670508 36.2093 cl00057 vWFA superfamily NC - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. 
The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#15696 - CGI_10003289 superfamily 241578 26 205 1.19E-36 132.352 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." 
Q#15696 - CGI_10003289 superfamily 241578 241 405 1.17E-30 115.847 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#15697 - CGI_10003290 superfamily 243146 16 66 2.23E-08 46.7827 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#15697 - CGI_10003290 superfamily 243146 55 109 0.00242251 33.1906 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#15698 - CGI_10003291 superfamily 243146 12 47 1.78E-06 41.3899 cl02701 Kelch_3 superfamily N - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#15698 - CGI_10003291 superfamily 243146 36 90 0.000450612 34.7314 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#15700 - CGI_10003293 superfamily 222090 331 516 2.72E-14 71.535 cl18636 Methyltransf_22 superfamily N - Methyltransferase domain; This family appears to be a methyltransferase domain. 
Q#15702 - CGI_10000838 superfamily 245814 327 394 0.00071344 37.4687 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#15703 - CGI_10013895 superfamily 245206 1 260 1.04E-92 279.109 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#15704 - CGI_10013896 superfamily 218375 23 283 2.33E-63 203.87 cl09349 IFRD superfamily - - "Interferon-related developmental regulator (IFRD); Interferon-related developmental regulator (IFRD1) is the human homologue of the rat early response protein PC4 and its murine homologue TIS7. 
The exact function of IFRD1 is unknown but it has been shown that PC4 is necessary to muscle differentiation and that it might have a role in signal transduction. This family also contains IFRD2 and its murine equivalent SKMc15 which are highly expressed soon after gastrulation and in the hepatic primordium, suggesting an involvement in early hematopoiesis." Q#15709 - CGI_10013901 superfamily 243072 31 147 1.02E-19 87.0538 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#15709 - CGI_10013901 superfamily 243072 298 421 2.81E-15 73.957 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#15709 - CGI_10013901 superfamily 243072 96 243 1.37E-14 72.031 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#15709 - CGI_10013901 superfamily 243072 395 453 1.99E-08 53.5414 cl02529 ANK superfamily C - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#15710 - CGI_10013902 superfamily 243072 955 1074 2.07E-25 104.388 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#15710 - CGI_10013902 superfamily 243072 892 1010 3.46E-24 100.921 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#15710 - CGI_10013902 superfamily 243072 1588 1703 1.87E-23 98.995 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#15710 - CGI_10013902 superfamily 243072 1361 1475 1.08E-22 96.6838 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#15710 - CGI_10013902 superfamily 243072 1024 1140 2.62E-22 95.5282 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#15710 - CGI_10013902 superfamily 243072 1123 1236 8.99E-22 93.9874 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#15710 - CGI_10013902 superfamily 243072 1525 1639 3.53E-21 92.4466 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#15710 - CGI_10013902 superfamily 243072 1185 1295 6.05E-20 88.5946 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#15710 - CGI_10013902 superfamily 243072 833 945 6.19E-19 85.8982 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#15712 - CGI_10013904 superfamily 192997 298 443 1.72E-37 138.869 cl18184 Sterol-sensing superfamily - - "Sterol-sensing domain of SREBP cleavage-activation; Sterol regulatory element-binding proteins (SREBPs) are membrane-bound transcription factors that promote lipid synthesis in animal cells. 
They are embedded in the membranes of the endoplasmic reticulum (ER) in a helical hairpin orientation and are released from the ER by a two-step proteolytic process. Proteolysis begins when the SREBPs are cleaved at Site-1, which is located at a leucine residue in the middle of the hydrophobic loop in the lumen of the ER. Upon proteolytic processing SREBP can activate the expression of genes involved in cholesterol biosynthesis and uptake. SCAP stimulates cleavage of SREBPs via fusion of their two C-termini. This domain is the transmembrane region that traverses the membrane eight times and is the sterol-sensing domain of the cleavage protein. WD40 domains are found towards the C-terminus." Q#15713 - CGI_10013905 superfamily 245210 5 252 1.07E-117 346.468 cl09938 cond_enzymes superfamily C - "Condensing enzymes; Family of enzymes that catalyze a (decarboxylating or non-decarboxylating) Claisen-like condensation reaction. Members share strong structural similarity, and are involved in the synthesis and degradation of fatty acids, and the production of polyketides, a diverse group of natural products." Q#15714 - CGI_10001123 superfamily 246664 287 673 1.05E-173 504.416 cl14561 An_peroxidase_like superfamily - - "Animal heme peroxidases and related proteins; A diverse family of enzymes, which includes prostaglandin G/H synthase, thyroid peroxidase, myeloperoxidase, linoleate diol synthase, lactoperoxidase, peroxinectin, peroxidasin, and others. Despite its name, this family is not restricted to metazoans: members are found in fungi, plants, and bacteria as well." 
Q#15714 - CGI_10001123 superfamily 243092 76 169 5.17E-17 80.842 cl02567 WD40 superfamily NC - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#15714 - CGI_10001123 superfamily 246664 183 225 0.00782974 37.675 cl14561 An_peroxidase_like superfamily C - "Animal heme peroxidases and related proteins; A diverse family of enzymes, which includes prostaglandin G/H synthase, thyroid peroxidase, myeloperoxidase, linoleate diol synthase, lactoperoxidase, peroxinectin, peroxidasin, and others. Despite its name, this family is not restricted to metazoans: members are found in fungi, plants, and bacteria as well." Q#15715 - CGI_10005302 superfamily 214531 40 83 0.00300797 31.4181 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. 
Also present in a variety of molecules similar to gp300/megalin. Q#15716 - CGI_10005303 superfamily 149010 39 111 1.49E-31 121.634 cl06656 DUF1604 superfamily - - Protein of unknown function (DUF1604); This family is found at the N-terminus of several eukaryotic RNA processing proteins. Q#15716 - CGI_10005303 superfamily 248289 1762 1819 9.13E-08 51.6559 cl17735 VWC superfamily - - von Willebrand factor type C domain; The high cutoff was used to prevent overlap with pfam00094. Q#15716 - CGI_10005303 superfamily 214531 1049 1081 1.10E-07 51.0633 cl18310 LY superfamily N - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#15716 - CGI_10005303 superfamily 111397 1859 1936 5.79E-07 49.6471 cl03620 HYR superfamily - - "HYR domain; This domain is known as the HYR (Hyalin Repeat) domain, after the protein hyalin that is composed exclusively of this repeat. This domain probably corresponds to a new superfamily in the immunoglobulin fold. The function of this domain is uncertain it may be involved in cell adhesion." Q#15716 - CGI_10005303 superfamily 243107 150 167 1.37E-05 45.2286 cl02611 G-patch superfamily C - "G-patch domain; This domain is found in a number of RNA binding proteins, and is also found in proteins that contain RNA binding domains. This suggests that this domain may have an RNA binding function. This domain has seven highly conserved glycines." Q#15716 - CGI_10005303 superfamily 214531 1083 1124 9.22E-05 42.5889 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. 
Q#15716 - CGI_10005303 superfamily 214531 1358 1400 0.00284964 38.3517 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#15716 - CGI_10005303 superfamily 214531 1304 1358 0.00376331 37.9665 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#15719 - CGI_10004614 superfamily 247085 161 234 6.64E-12 59.8266 cl15820 RICIN superfamily N - "Ricin-type beta-trefoil; Carbohydrate-binding domain formed from presumed gene triplication. The domain is found in a variety of molecules serving diverse functions such as enzymatic activity, inhibitory toxicity and signal transduction. Highly specific ligand binding occurs on exposed surfaces of the compact domain structure." Q#15719 - CGI_10004614 superfamily 245596 18 148 2.75E-77 237.873 cl11394 Glyco_tranf_GTA_type superfamily N - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. 
This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#15723 - CGI_10004618 superfamily 246680 1 66 0.00942431 32.944 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#15724 - CGI_10004619 superfamily 201362 4 75 0.000887223 35.4188 cl08277 Motile_Sperm superfamily C - MSP (Major sperm protein) domain; Major sperm proteins are involved in sperm motility. These proteins oligomerise to form filaments. This family contains many other proteins. Q#15725 - CGI_10004620 superfamily 219425 30 148 3.04E-17 72.9594 cl06494 Hydrolase_2 superfamily - - "Cell Wall Hydrolase; These enzymes have been implicated in cell wall hydrolysis, most extensively in Bacillus subtilis. For instance B. subtilis sleB is expressed during sporulation as an inactive form and then deposited on the cell outer cortex. 
During germination the enzyme is activated and hydrolyses the cortex. A similar role is carried out by the partially redundant B. subtilis cwlJ. It is not clear whether these enzymes are amidases or peptidases." Q#15726 - CGI_10004621 superfamily 247057 424 464 0.00105183 37.1304 cl15755 SAM_superfamily superfamily N - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#15727 - CGI_10000506 superfamily 241568 8 49 2.32E-05 40.1388 cl00043 CCP superfamily N - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." 
Q#15727 - CGI_10000506 superfamily 241568 53 108 0.000242349 37.1696 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#15731 - CGI_10004922 superfamily 191444 53 128 0.000525737 35.3777 cl05558 IL17 superfamily - - Interleukin-17; IL-17 is a potent proinflammatory cytokine produced by activated memory T cells. The IL-17 family is thought to represent a distinct signaling system that appears to have been highly conserved across vertebrate evolution. Q#15732 - CGI_10004923 superfamily 247727 330 421 8.01E-13 64.7586 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." 
Q#15732 - CGI_10004923 superfamily 247727 278 368 0.0020735 38.6116 cl17173 AdoMet_MTases superfamily C - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#15738 - CGI_10010262 superfamily 248097 9 125 2.46E-16 69.9866 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#15740 - CGI_10010264 superfamily 247769 252 380 8.75E-08 50.8009 cl17215 HDc superfamily C - Metal dependent phosphohydrolases with conserved 'HD' motif Q#15741 - CGI_10010265 superfamily 247792 57 103 0.000725457 38.1956 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#15741 - CGI_10010265 superfamily 110440 595 620 1.33E-05 43.1653 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. 
The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#15741 - CGI_10010265 superfamily 241563 209 238 0.000248092 39.3848 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#15741 - CGI_10010265 superfamily 110440 424 447 0.00306406 36.2317 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#15743 - CGI_10010267 superfamily 243056 38 237 1.62E-21 93.1925 cl02495 RabGAP-TBC superfamily - - "Rab-GTPase-TBC domain; Identification of a TBC domain in GYP6_YEAST and GYP7_YEAST, which are GTPase activator proteins of yeast Ypt6 and Ypt7, implies that these domains are GTPase activator proteins of Rab-like small GTPases." Q#15743 - CGI_10010267 superfamily 241626 313 364 0.00875114 35.1728 cl00125 RHOD superfamily C - "Rhodanese Homology Domain (RHOD); an alpha beta fold domain found duplicated in the rhodanese protein. 
The cysteine containing enzymatically active version of the domain is also found in the Cdc25 class of protein phosphatases and a variety of proteins such as sulfide dehydrogenases and certain stress proteins such as senescence specific protein 1 in plants, PspE and GlpE in bacteria and cyanide and arsenate resistance proteins. Inactive versions (no active site cysteine) are also seen in dual specificity phosphatases, ubiquitin hydrolases from yeast and in sulfuryltransferases, where they are believed to play a regulatory role in multidomain proteins." Q#15744 - CGI_10010268 superfamily 246908 241 335 4.68E-62 198.429 cl15255 SH2 superfamily - - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." 
Q#15744 - CGI_10010268 superfamily 247792 366 408 7.90E-10 54.7592 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#15744 - CGI_10010268 superfamily 111184 32 161 1.56E-74 232.078 cl03506 Cbl_N superfamily - - "CBL proto-oncogene N-terminal domain 1; Cbl is an adaptor protein that binds EGF receptors (or other tyrosine kinases) and SH3 domains, functioning as a negative regulator of many signaling pathways. The N-terminal domain is evolutionarily conserved, and is known to bind to phosphorylated tyrosine residues. Cbl_N is comprised of 3 structural domains of which this is the first - a four helix bundle." Q#15744 - CGI_10010268 superfamily 202379 164 247 1.65E-44 151.899 cl03701 Cbl_N2 superfamily - - "CBL proto-oncogene N-terminus, EF hand-like domain; Cbl is an adaptor protein that binds EGF receptors (or other tyrosine kinases) and SH3 domains, functioning as a negative regulator of many signaling pathways. The N-terminal domain is evolutionarily conserved, and is known to bind to phosphorylated tyrosine residues. The so called N-terminal domain is actually 3 structural domains, of which this is the central EF hand domain." 
Q#15745 - CGI_10010269 superfamily 241563 96 131 0.00080667 37.844 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#15749 - CGI_10007771 superfamily 202224 463 579 5.82E-49 172.095 cl18224 JmjC superfamily - - "JmjC domain, hydroxylase; The JmjC domain belongs to the Cupin superfamily. JmjC-domain proteins may be protein hydroxylases that catalyze a novel histone modification. This is confirmed to be a hydroxylase: the human JmjC protein named Tyw5p unexpectedly acts in the biosynthesis of a hypermodified nucleoside, hydroxy-wybutosine, in tRNA-Phe by catalyzing hydroxylation." Q#15749 - CGI_10007771 superfamily 243120 79 164 7.47E-31 119.221 cl02633 ARID superfamily - - "ARID/BRIGHT DNA binding domain; This domain is known as ARID for AT-Rich Interaction Domain, and also known as the BRIGHT domain." Q#15749 - CGI_10007771 superfamily 217292 669 722 2.02E-25 102.346 cl03786 zf-C5HC2 superfamily - - C5HC2 zinc finger; Predicted zinc finger with eight potential zinc ligand binding residues. This domain is found in Jumonji. This domain may have a DNA binding function. Q#15749 - CGI_10007771 superfamily 210240 13 54 1.22E-17 79.6142 cl15840 JmjN superfamily - - jmjN domain; jmjN domain. Q#15749 - CGI_10007771 superfamily 247999 289 337 4.04E-15 72.5231 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#15749 - CGI_10007771 superfamily 247999 1622 1665 9.54E-08 51.0586 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. 
Several PHD fingers have been identified as binding modules of methylated histone H3. Q#15749 - CGI_10007771 superfamily 247999 1151 1189 1.04E-05 45.2806 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#15750 - CGI_10007772 superfamily 218657 6 140 8.87E-29 105.465 cl09577 THOC7 superfamily - - Tho complex subunit 7; The Tho complex is involved in transcription elongation and mRNA export from the nucleus. Q#15751 - CGI_10007773 superfamily 149388 342 396 3.22E-17 78.1875 cl07068 SCA7 superfamily - - "SCA7, zinc-binding domain; This domain is found in the protein Sgf73/Sca7 which is a component of the multihistone acetyltransferase complexes SAGA and SILK. This domain is also found in Ataxin-7, a human protein which in its polyglutamine expanded pathological form, is responsible for the neurodegenerative disease spinocerebellar ataxia 7 (SCA7). Ataxin-7 is an integral component of the mammalian SAGA-like complexes, the TATA-binding protein-free TAF-containing complex (TFTC) and the SPT3/TAF9/GCN5 acetyltransferase complex (STAGA). This domain is a minimal domain in ataxin-7-like proteins that is required for interaction with TFTC/STAGA subunits and is conserved highly through evolution. The domain contains a conserved Cys(3)His motif that binds zinc, thus indicating this to be a new zinc-binding domain." Q#15752 - CGI_10007774 superfamily 247750 103 345 3.43E-133 393.557 cl17196 E1_enzyme_family superfamily - - "Superfamily of activating enzymes (E1) of the ubiquitin-like proteins. This family includes classical ubiquitin-activating enzymes E1, ubiquitin-like (ubl) activating enzymes and other mechanistic homologes, like MoeB, Thif1 and others. 
The common reaction mechanism catalyzed by MoeB, ThiF and the E1 enzymes begins with a nucleophilic attack of the C-terminal carboxylate of MoaD, ThiS and ubiquitin, respectively, on the alpha-phosphate of an ATP molecule bound at the active site of the activating enzymes, leading to the formation of a high-energy acyladenylate intermediate and subsequently to the formation of a thiocarboxylate at the C termini of MoaD and ThiS." Q#15752 - CGI_10007774 superfamily 245106 505 591 2.40E-45 157.036 cl09615 UBA_e1_C superfamily C - Ubiquitin-activating enzyme e1 C-terminal domain; This presumed domain found at the C-terminus of Ubiquitin-activating enzyme e1 proteins is functionally uncharacterized. Q#15752 - CGI_10007774 superfamily 247750 441 483 5.89E-20 89.6341 cl17196 E1_enzyme_family superfamily N - "Superfamily of activating enzymes (E1) of the ubiquitin-like proteins. This family includes classical ubiquitin-activating enzymes E1, ubiquitin-like (ubl) activating enzymes and other mechanistic homologes, like MoeB, Thif1 and others. The common reaction mechanism catalyzed by MoeB, ThiF and the E1 enzymes begins with a nucleophilic attack of the C-terminal carboxylate of MoaD, ThiS and ubiquitin, respectively, on the alpha-phosphate of an ATP molecule bound at the active site of the activating enzymes, leading to the formation of a high-energy acyladenylate intermediate and subsequently to the formation of a thiocarboxylate at the C termini of MoaD and ThiS." Q#15754 - CGI_10007776 superfamily 245206 165 303 1.00E-29 114.486 cl09931 NADB_Rossmann superfamily N - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. 
NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#15759 - CGI_10002620 superfamily 220788 517 577 2.02E-09 55.2317 cl11144 MCC-bdg_PDZ superfamily - - PDZ domain of MCC-2 bdg protein for Usher syndrome; The protein has a high homology to the tumour suppressor MCC (mutated in colon cancer; or MCC1 hereafter) and was named MCC2. MCC2 protein binds the first PDZ domain of AIE-75 with its C-terminal amino acids -DTFL. A possible role of MCC2 as a tumor suppressor has been put forward. The carboxyl terminus of the predicted protein was DTFL which matched the consensus motif X-S/T-X-phi (phi: hydrophobic amino acid residue) for binding to the PDZ domain of AIE-75. Q#15759 - CGI_10002620 superfamily 220788 811 853 0.00574385 35.9718 cl11144 MCC-bdg_PDZ superfamily C - PDZ domain of MCC-2 bdg protein for Usher syndrome; The protein has a high homology to the tumour suppressor MCC (mutated in colon cancer; or MCC1 hereafter) and was named MCC2. MCC2 protein binds the first PDZ domain of AIE-75 with its C-terminal amino acids -DTFL. A possible role of MCC2 as a tumor suppressor has been put forward. The carboxyl terminus of the predicted protein was DTFL which matched the consensus motif X-S/T-X-phi (phi: hydrophobic amino acid residue) for binding to the PDZ domain of AIE-75. 
Q#15761 - CGI_10011689 superfamily 243035 34 81 9.74E-09 51.4466 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#15763 - CGI_10011691 superfamily 241647 92 119 0.000831616 37.8926 cl00157 WW superfamily - - Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs. Q#15763 - CGI_10011691 superfamily 221479 461 636 8.89E-83 261.5 cl13646 PCIF1_WW superfamily - - "Phosphorylated CTD interacting factor 1 WW domain; This domain family is found in bacteria and eukaryotes, and is approximately 180 amino acids in length. This domain is the WW domain of PCIF1. PCIF1 interacts with phosphorylated RNA polymerase II carboxy-terminal domain (CTD). The WW domain of PCIF1 can directly and preferentially bind to the phosphorylated CTD compared to the unphosphorylated CTD. 
PCIF1 binds to the hyperphosphorylated RNAP II (RNAP IIO) in vitro and in vivo. Double immunofluorescence labeling in HeLa cells demonstrated that PCIF1 and endogenous RNAP IIO are co-localized in the cell nucleus. Thus, PCIF1 may play a role in mRNA synthesis by modulating RNAP IIO activity." Q#15764 - CGI_10011692 superfamily 247692 16 487 0 525.268 cl17068 AFD_class_I superfamily - - "Adenylate forming domain, Class I; This family includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain." Q#15768 - CGI_10011696 superfamily 246925 318 476 4.07E-09 57.7506 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#15768 - CGI_10011696 superfamily 246925 95 208 3.83E-08 54.669 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. 
LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#15768 - CGI_10011696 superfamily 214507 597 646 3.29E-05 42.8024 cl15307 LRRCT superfamily - - Leucine rich repeat C-terminal domain; Leucine rich repeat C-terminal domain. Q#15768 - CGI_10011696 superfamily 246925 397 573 0.000638539 41.5722 cl15309 LRR_RI superfamily - - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#15769 - CGI_10011697 superfamily 198738 1 31 2.15E-09 48.4183 cl02599 Ets superfamily N - Ets-domain; Ets-domain. Q#15771 - CGI_10011699 superfamily 198738 2 49 1.95E-16 67.293 cl02599 Ets superfamily N - Ets-domain; Ets-domain. Q#15773 - CGI_10011701 superfamily 245814 164 216 6.71E-09 52.1063 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. 
A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#15773 - CGI_10011701 superfamily 245814 51 130 2.03E-09 53.9303 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#15773 - CGI_10011701 superfamily 245814 245 326 4.30E-09 52.8929 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#15774 - CGI_10011702 superfamily 243250 415 841 3.47E-136 430.917 cl02959 Glyco_hydro_9 superfamily - - Glycosyl hydrolase family 9; Glycosyl hydrolase family 9. Q#15774 - CGI_10011702 superfamily 243250 1140 1566 7.09E-135 427.45 cl02959 Glyco_hydro_9 superfamily - - Glycosyl hydrolase family 9; Glycosyl hydrolase family 9. 
Q#15775 - CGI_10011703 superfamily 246925 187 501 7.32E-24 102.049 cl15309 LRR_RI superfamily - - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#15776 - CGI_10012360 superfamily 241752 15 135 5.30E-45 144.77 cl00283 ADP_ribosyl superfamily - - "ADP_ribosylating enzymes catalyze the transfer of ADP_ribose from NAD+ to substrates. Bacterial toxins are cytoplasmic and catalyze the transfer of a single ADP_ribose unit to eukaryotic elongation factor 2, halting protein synthesis and killing the cell. Poly(ADP-ribose) polymerases (PARPS 1-3, VPARP, tankyrase) catalyze the addition of up to 100 ADP_ribose units from NAD+. PARPs 1 and 2 are localized in the nucleaus, bind DNA, and are activated by DNA damage. VPARP is part of the vault ribonucleoprotein complex. Tankyrases regulates telomere length in part through poy(ADP_ribosylation) of telomere repeat binding factor 1 (TRF1). Poly(ADP-ribose) polymerase catalyses the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active." Q#15777 - CGI_10012361 superfamily 247097 47 83 9.66E-06 40.8974 cl15839 ShK superfamily - - ShK domain-like; This domain of is found in several C. elegans proteins. 
The domain is 30 amino acids long and rich in cysteine residues. There are 6 conserved cysteine positions in the domain that form three disulphide bridges. The domain is found in the potassium channel inhibitor ShK in sea anemone. Q#15779 - CGI_10012363 superfamily 247097 94 130 0.000121744 36.6602 cl15839 ShK superfamily - - ShK domain-like; This domain of is found in several C. elegans proteins. The domain is 30 amino acids long and rich in cysteine residues. There are 6 conserved cysteine positions in the domain that form three disulphide bridges. The domain is found in the potassium channel inhibitor ShK in sea anemone. Q#15780 - CGI_10012364 superfamily 247097 91 126 6.51E-05 37.0454 cl15839 ShK superfamily - - ShK domain-like; This domain of is found in several C. elegans proteins. The domain is 30 amino acids long and rich in cysteine residues. There are 6 conserved cysteine positions in the domain that form three disulphide bridges. The domain is found in the potassium channel inhibitor ShK in sea anemone. Q#15781 - CGI_10012365 superfamily 244895 41 520 0 547.145 cl08294 Peptidase_M17 superfamily - - "Cytosol aminopeptidase family, N-terminal and catalytic domains. Family M17 contains zinc- and manganese-dependent exopeptidases ( EC 3.4.11.1), including leucine aminopeptidase. They catalyze removal of amino acids from the N-terminus of a protein and play a key role in protein degradation and in the metabolism of biologically active peptides. They do not contain HEXXH motif (which is used as one of the signature patterns to group the peptidase families) in the metal-binding site. The two associated zinc ions and the active site are entirely enclosed within the C-terminal catalytic domain in leucine aminopeptidase. The enzyme is a hexamer, with the catalytic domains clustered around the three-fold axis, and the two trimers related to one another by a two-fold rotation. The N-terminal domain is structurally similar to the ADP-ribose binding Macro domain. 
This family includes proteins from bacteria, archaea, animals and plants." Q#15782 - CGI_10012366 superfamily 241594 453 810 1.92E-122 374.979 cl00077 HECTc superfamily - - "HECT domain; C-terminal catalytic domain of a subclass of Ubiquitin-protein ligase (E3). It binds specific ubiquitin-conjugating enzymes (E2), accepts ubiquitin from E2, transfers ubiquitin to substrate lysine side chains, and transfers additional ubiquitin molecules to the end of growing ubiquitin chains." Q#15785 - CGI_10012370 superfamily 243072 197 323 5.07E-33 124.418 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#15785 - CGI_10012370 superfamily 245201 532 783 1.46E-118 361.283 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#15785 - CGI_10012370 superfamily 246908 338 425 5.89E-19 83.2402 cl15255 SH2 superfamily - - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 
They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." Q#15785 - CGI_10012370 superfamily 246908 54 139 1.08E-18 82.2767 cl15255 SH2 superfamily - - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." 
Q#15786 - CGI_10012371 superfamily 247792 54 128 2.42E-28 100.58 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#15787 - CGI_10012372 superfamily 241592 23 117 4.03E-43 139.778 cl00074 H2A superfamily - - "Histone 2A; H2A is a subunit of the nucleosome. The nucleosome is an octamer containing two H2A, H2B, H3, and H4 subunits. The H2A subunit performs essential roles in maintaining structural integrity of the nucleosome, chromatin condensation, and binding of specific chromatin-associated proteins." Q#15788 - CGI_10012373 superfamily 192151 244 383 9.36E-60 194.479 cl07405 DP superfamily - - Transcription factor DP; DP forms a heterodimer with E2F and regulates genes involved in cell cycle progression. The transcriptional activity of E2F is inhibited by the retinoblastoma protein which binds to the E2F-DP heterodimer and negatively regulates the G1-S transition. Q#15788 - CGI_10012373 superfamily 202203 160 224 6.97E-12 61.0449 cl03534 E2F_TDP superfamily C - "E2F/DP family winged-helix DNA-binding domain; This family contains the transcription factor E2F and its dimerisation partners TDP1 and TDP2, which stimulate E2F-dependent transcription. E2F binds to DNA as a homodimer or as a heterodimer in association with TDP1/2, the heterodimer having increased binding efficiency. 
The crystal structure of an E2F4-DP2-DNA complex shows that the DNA-binding domains of the E2F and DP proteins both have a fold related to the winged-helix DNA-binding motif. Recognition of the central c/gGCGCg/c sequence of the consensus DNA-binding site is symmetric, and amino acids that contact these bases are conserved among all known E2F and DP proteins." Q#15790 - CGI_10012375 superfamily 248458 53 227 3.07E-12 66.9537 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. 
Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#15791 - CGI_10006857 superfamily 247684 24 450 2.45E-107 331.549 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#15792 - CGI_10006858 superfamily 245206 71 247 1.70E-48 163.549 cl09931 NADB_Rossmann superfamily N - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. 
The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#15792 - CGI_10006858 superfamily 245206 44 90 0.000173379 40.6481 cl09931 NADB_Rossmann superfamily C - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#15793 - CGI_10006859 superfamily 243093 111 202 0.00954782 34.3681 cl02568 WSC superfamily - - WSC domain; This domain may be involved in carbohydrate binding. 
Q#15794 - CGI_10006860 superfamily 245206 20 339 1.21E-161 465.234 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#15794 - CGI_10006860 superfamily 176932 366 455 1.22E-31 117.271 cl03838 FAR_C superfamily - - "C-terminal domain of fatty acyl CoA reductases; C-terminal domain of fatty acyl CoA reductases, a family of SDR-like proteins. SDRs or short-chain dehydrogenases/reductases are Rossmann-fold NAD(P)H-binding proteins. Many proteins in this FAR_C family may function as fatty acyl-CoA reductases (FARs), acting on medium and long chain fatty acids, and have been reported to be involved in diverse processes such as the biosynthesis of insect pheromones, plant cuticular wax production, and mammalian wax biosynthesis. In Arabidopsis thaliana, proteins with this particular architecture have also been identified as the MALE STERILITY 2 (MS2) gene product, which is implicated in male gametogenesis. Mutations in MS2 inhibit the synthesis of exine (sporopollenin), rendering plants unable to reduce pollen wall fatty acids to corresponding alcohols. The function of this C-terminal domain is unclear." 
Q#15796 - CGI_10006862 superfamily 216363 9 80 6.17E-10 50.9318 cl08312 UPF0029 superfamily N - Uncharacterized protein family UPF0029; Uncharacterized protein family UPF0029. Q#15797 - CGI_10006863 superfamily 243119 644 692 0.000114303 40.4949 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#15799 - CGI_10001894 superfamily 244895 1 267 1.53E-122 360.708 cl08294 Peptidase_M17 superfamily N - "Cytosol aminopeptidase family, N-terminal and catalytic domains. Family M17 contains zinc- and manganese-dependent exopeptidases ( EC 3.4.11.1), including leucine aminopeptidase. They catalyze removal of amino acids from the N-terminus of a protein and play a key role in protein degradation and in the metabolism of biologically active peptides. They do not contain HEXXH motif (which is used as one of the signature patterns to group the peptidase families) in the metal-binding site. The two associated zinc ions and the active site are entirely enclosed within the C-terminal catalytic domain in leucine aminopeptidase. The enzyme is a hexamer, with the catalytic domains clustered around the three-fold axis, and the two trimers related to one another by a two-fold rotation. The N-terminal domain is structurally similar to the ADP-ribose binding Macro domain. This family includes proteins from bacteria, archaea, animals and plants." Q#15800 - CGI_10001897 superfamily 215733 1 102 1.16E-23 97.6358 cl02811 E1-E2_ATPase superfamily N - E1-E2 ATPase; E1-E2 ATPase. 
Q#15800 - CGI_10001897 superfamily 226572 262 374 1.47E-08 52.5612 cl18761 COG4087 superfamily N - Soluble P-type ATPase [General function prediction only] Q#15801 - CGI_10001899 superfamily 241857 16 142 1.25E-06 44.5769 cl00427 TM_PBP2 superfamily N - "Transmembrane subunit (TM) found in Periplasmic Binding Protein (PBP)-dependent ATP-Binding Cassette (ABC) transporters which generally bind type 2 PBPs. These types of transporters consist of a PBP, two TMs, and two cytoplasmic ABC ATPase subunits, and are mainly involved in importing solutes from the environment. The solute is captured by the PBP which delivers it to a gated translocation pathway formed by the two TMs. The two ABCs bind and hydrolyze ATP and drive the transport reaction. For these transporters the ABCs and TMs are on independent polypeptide chains. These systems transport a diverse range of substrates. Most are specific for a single substrate or a group of related substrates; however some transporters are more promiscuous, transporting structurally diverse substrates such as the histidine/lysine and arginine transporter in Enterobacteriaceae. In the latter case, this is achieved through binding different PBPs with different specificities to the TMs. For other promiscuous transporters such as the multiple-sugar transporter Msm of Streptococcus mutans, the PBP has a wide substrate specificity. These transporters include the maltose-maltodextrin, phosphate and sulfate transporters, among others." Q#15802 - CGI_10001900 superfamily 247986 42 263 1.58E-38 137.89 cl17432 PBPb superfamily - - "Bacterial periplasmic transport systems use membrane-bound complexes and substrate-bound, membrane-associated, periplasmic binding proteins (PBPs) to transport a wide variety of substrates, such as, amino acids, peptides, sugars, vitamins and inorganic ions. PBPs have two cell-membrane translocation functions: bind substrate, and interact with the membrane bound complex. 
A diverse group of periplasmic transport receptors for lysine/arginine/ornithine (LAO), glutamine, histidine, sulfate, phosphate, molybdate, and methanol are included in the PBPb CD." Q#15803 - CGI_10001908 superfamily 241825 17 101 5.08E-18 73.3431 cl00379 Ribosomal_L18_L5e superfamily - - "Ribosomal L18/L5e: L18 (L5e) is a ribosomal protein found in the central protuberance (CP) of the large subunit. L18 binds 5S rRNA and induces a conformational change that stimulates the binding of L5 to 5S rRNA. Association of 5S rRNA with 23S rRNA depends on the binding of L18 and L5 to 5S rRNA. L18/L5e is generally described as L18 in prokaryotes and archaea, and as L5e (or L5) in eukaryotes. In bacteria, the CP proteins L5, L18, and L25 are required for the ribosome to incorporate 5S rRNA into the large subunit, one of the last steps in ribosome assembly. In archaea, both L18 and L5 bind 5S rRNA; in eukaryotes, only the L18 homolog (L5e) binds 5S rRNA but a homolog to L5 is also identified." Q#15804 - CGI_10001909 superfamily 215872 25 99 2.52E-19 76.4634 cl08261 Ribosomal_L6 superfamily - - Ribosomal protein L6; Ribosomal protein L6. 
Q#15806 - CGI_10006664 superfamily 247792 195 233 0.000937357 37.04 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#15808 - CGI_10006666 superfamily 247684 38 241 2.68E-26 108.204 cl17037 NBD_sugar-kinase_HSP70_actin superfamily N - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. 
The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#15809 - CGI_10006667 superfamily 241596 278 335 4.05E-13 64.1575 cl00081 HLH superfamily - - "Helix-loop-helix domain, found in specific DNA- binding proteins that act as transcription factors; 60-100 amino acids long. A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE 5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved in both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#15812 - CGI_10006670 superfamily 241580 686 760 1.96E-12 64.4974 cl00061 FH superfamily - - "Forkhead (FH), also known as a "winged helix". FH is named for the Drosophila fork head protein, a transcription factor which promotes terminal rather than segmental development. This family of transcription factor domains, which bind to B-DNA as monomers, are also found in the Hepatocyte nuclear factor (HNF) proteins, which provide tissue-specific gene regulation. The structure contains 2 flexible loops or "wings" in the C-terminal region, hence the term winged helix." 
Q#15813 - CGI_10006671 superfamily 241810 231 287 1.55E-21 88.0134 cl00354 KOW superfamily - - "KOW: an acronym for the authors' surnames (Kyrpides, Ouzounis and Woese); KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. The KOW motif contains an invariant glycine residue and comprises alternating blocks of hydrophilic and hydrophobic residues." Q#15813 - CGI_10006671 superfamily 241810 434 485 9.08E-14 66.0364 cl00354 KOW superfamily - - "KOW: an acronym for the authors' surnames (Kyrpides, Ouzounis and Woese); KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. The KOW motif contains an invariant glycine residue and comprises alternating blocks of hydrophilic and hydrophobic residues." Q#15813 - CGI_10006671 superfamily 243107 131 208 7.44E-23 92.4382 cl02611 G-patch superfamily - - "G-patch domain; This domain is found in a number of RNA binding proteins, and is also found in proteins that contain RNA binding domains. This suggests that this domain may have an RNA binding function. This domain has seven highly conserved glycines." 
Q#15814 - CGI_10006672 superfamily 241900 36 196 1.67E-65 210.558 cl00490 EEP superfamily C - "Exonuclease-Endonuclease-Phosphatase (EEP) domain superfamily; This large superfamily includes the catalytic domain (exonuclease/endonuclease/phosphatase or EEP domain) of a diverse set of proteins including the ExoIII family of apurinic/apyrimidinic (AP) endonucleases, inositol polyphosphate 5-phosphatases (INPP5), neutral sphingomyelinases (nSMases), deadenylases (such as the vertebrate circadian-clock regulated nocturnin), bacterial cytolethal distending toxin B (CdtB), deoxyribonuclease 1 (DNase1), the endonuclease domain of the non-LTR retrotransposon LINE-1, and related domains. These diverse enzymes share a common catalytic mechanism of cleaving phosphodiester bonds; their substrates range from nucleic acids to phospholipids and perhaps proteins." Q#15814 - CGI_10006672 superfamily 241900 200 372 3.20E-62 202.083 cl00490 EEP superfamily N - "Exonuclease-Endonuclease-Phosphatase (EEP) domain superfamily; This large superfamily includes the catalytic domain (exonuclease/endonuclease/phosphatase or EEP domain) of a diverse set of proteins including the ExoIII family of apurinic/apyrimidinic (AP) endonucleases, inositol polyphosphate 5-phosphatases (INPP5), neutral sphingomyelinases (nSMases), deadenylases (such as the vertebrate circadian-clock regulated nocturnin), bacterial cytolethal distending toxin B (CdtB), deoxyribonuclease 1 (DNase1), the endonuclease domain of the non-LTR retrotransposon LINE-1, and related domains. These diverse enzymes share a common catalytic mechanism of cleaving phosphodiester bonds; their substrates range from nucleic acids to phospholipids and perhaps proteins." Q#15815 - CGI_10006673 superfamily 241600 64 248 5.88E-71 219.419 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. 
The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#15816 - CGI_10006674 superfamily 241600 51 220 6.62E-67 207.478 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#15817 - CGI_10006675 superfamily 241600 1 168 7.56E-71 215.567 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. 
C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#15818 - CGI_10003205 superfamily 241563 68 109 1.92E-07 48.2444 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#15818 - CGI_10003205 superfamily 241563 28 59 0.00135158 37.0736 cl00034 BBOX superfamily N - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#15822 - CGI_10007566 superfamily 243310 287 524 3.19E-93 286.827 cl03120 ELO superfamily - - "GNS1/SUR4 family; Members of this family are involved in long chain fatty acid elongation systems that produce the 26-carbon precursors for ceramide and sphingolipid synthesis. Predicted to be integral membrane proteins, in eukaryotes they are probably located on the endoplasmic reticulum. Yeast ELO3 affects plasma membrane H+-ATPase activity, and may act on a glucose-signaling pathway that controls the expression of several genes that are transcriptionally regulated by glucose such as PMA1." Q#15823 - CGI_10007567 superfamily 241574 219 439 2.09E-64 211.29 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. 
This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#15823 - CGI_10007567 superfamily 241574 1 156 8.25E-59 196.268 cl00053 PTPc superfamily N - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#15826 - CGI_10007570 superfamily 245230 26 456 0 952.12 cl10017 Tubulin_FtsZ superfamily - - "Tubulin/FtsZ: Family includes tubulin alpha-, beta-, gamma-, delta-, and epsilon-tubulins as well as FtsZ, all of which are involved in polymer formation. Tubulin is the major component of microtubules, but also exists as a heterodimer and as a curved oligomer. Microtubules exist in all eukaryotic cells and are responsible for many functions, including cellular transport, cell motility, and mitosis. FtsZ forms a ring-shaped septum at the site of bacterial cell division, which is required for constriction of cell membrane and cell envelope to yield two daughter cells. FtsZ can polymerize into tubes, sheets, and rings in vitro and is ubiquitous in eubacteria, archaea, and chloroplasts." 
Q#15827 - CGI_10007571 superfamily 245230 9 81 8.53E-43 144.741 cl10017 Tubulin_FtsZ superfamily NC - "Tubulin/FtsZ: Family includes tubulin alpha-, beta-, gamma-, delta-, and epsilon-tubulins as well as FtsZ, all of which are involved in polymer formation. Tubulin is the major component of microtubules, but also exists as a heterodimer and as a curved oligomer. Microtubules exist in all eukaryotic cells and are responsible for many functions, including cellular transport, cell motility, and mitosis. FtsZ forms a ring-shaped septum at the site of bacterial cell division, which is required for constriction of cell membrane and cell envelope to yield two daughter cells. FtsZ can polymerize into tubes, sheets, and rings in vitro and is ubiquitous in eubacteria, archaea, and chloroplasts." Q#15828 - CGI_10007572 superfamily 247905 307 439 1.19E-22 93.8416 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#15828 - CGI_10007572 superfamily 247805 76 149 9.45E-06 44.2504 cl17251 DEXDc superfamily N - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 
Q#15829 - CGI_10000647 superfamily 221377 3 91 0.000191806 38.2187 cl13449 DUF3504 superfamily C - Domain of unknown function (DUF3504); This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 156 to 173 amino acids in length. Q#15834 - CGI_10003440 superfamily 243035 403 513 2.53E-12 63.7929 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. 
C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#15837 - CGI_10002327 superfamily 192107 63 137 5.15E-18 78.0277 cl07312 Med14 superfamily NC - "Mediator complex subunit MED14; Saccharomyces cerevisiae RGR1 mediator complex subunit affects chromatin structure, transcriptional regulation of diverse genes and sporulation, required for glucose repression, HO repression, RME1 repression and sporulation. This subunit is also found in higher eukaryotes and Med14 is the agreed unified nomenclature for this subunit. Med14 is found in the tail region of Mediator." Q#15841 - CGI_10017082 superfamily 246925 23 321 3.57E-23 98.1965 cl15309 LRR_RI superfamily - - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. 
LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#15843 - CGI_10017084 superfamily 219285 178 427 3.44E-90 278.823 cl06207 D123 superfamily - - "D123; This family contains a number of eukaryotic D123 proteins approximately 330 residues long. It has been shown that mutated variants of D123 exhibit temperature-dependent differences in their degradation rate. D123 proteins are regulators of eIF2, the central regulator of translational initiation." Q#15846 - CGI_10017087 superfamily 243119 8 54 6.69E-05 40.4949 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#15846 - CGI_10017087 superfamily 243119 99 152 0.000804671 37.0382 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. 
Q#15847 - CGI_10017088 superfamily 248100 26 78 2.90E-09 52.5416 cl17546 PQ-loop superfamily - - "PQ loop repeat; Members of this family are all membrane bound proteins possessing a pair of repeats each spanning two transmembrane helices connected by a loop. The PQ motif found on loop 2 is critical for the localisation of cystinosin to lysosomes. However, the PQ motif appears not to be a general lysosome-targeting motif. It is thought likely to possess a more general function. Most probably this involves a glutamine residue." Q#15847 - CGI_10017088 superfamily 248100 143 192 0.00226457 35.5928 cl17546 PQ-loop superfamily - - "PQ loop repeat; Members of this family are all membrane bound proteins possessing a pair of repeats each spanning two transmembrane helices connected by a loop. The PQ motif found on loop 2 is critical for the localisation of cystinosin to lysosomes. However, the PQ motif appears not to be a general lysosome-targeting motif. It is thought likely to possess a more general function. Most probably this involves a glutamine residue." Q#15848 - CGI_10017089 superfamily 216897 97 181 7.13E-18 74.6401 cl03463 Gal_Lectin superfamily - - Galactose binding lectin domain; Galactose binding lectin domain. Q#15849 - CGI_10017090 superfamily 151051 63 276 7.75E-107 320.728 cl11131 Nrf1_DNA-bind superfamily - - "NLS-binding and DNA-binding and dimerisation domains of Nrf1; In Drosophila, the erect wing (ewg) protein is required for proper development of the central nervous system and the indirect flight muscles. The fly ewg gene encodes a novel DNA-binding domain that is also found in four genes previously identified in sea urchin, chicken, zebrafish, and human. Nuclear respiratory factor-1 is a transcriptional activator that has been implicated in the nuclear control of respiratory chain expression in vertebrates. The first 26 amino acids of nuclear respiratory factor-1 are required for the binding of dynein light chain. 
The interaction with dynein light chain is observed for both ewg and Nrf-1, transcription factors that are structurally and functionally similar between humans and Drosophila. The highest level of expression of both ewg and Nrf-1 was found in the central nervous system, somites, first branchial arch, optic vesicle, and otic vesicle. In the mouse Nrf-1 protein, there is also an NLS domain at 88-116, and a DNA binding and dimerisation domain at 127-282. Ewg is a site-specific transcriptional activator, and evolutionarily conserved regions of ewg contribute both positively and negatively to transcriptional activity." Q#15851 - CGI_10017092 superfamily 241764 280 351 5.05E-23 95.1048 cl00299 MIT superfamily - - "MIT: domain contained within Microtubule Interacting and Trafficking molecules. The MIT domain is found in sorting nexins, the nuclear thiol protease PalBH, the AAA protein spastin and archaebacterial proteins with similar domain architecture, vacuolar sorting proteins and others. The molecular function of the MIT domain is unclear." Q#15851 - CGI_10017092 superfamily 245201 880 1040 1.80E-65 220.879 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#15851 - CGI_10017092 superfamily 243088 17 89 4.44E-24 99.3168 cl02563 PX_domain superfamily C - "The Phox Homology domain, a phosphoinositide binding module; The PX domain is a phosphoinositide (PI) binding module involved in targeting proteins to membranes. 
Proteins containing PX domains interact with PIs and have been implicated in highly diverse functions such as cell signaling, vesicular trafficking, protein sorting, lipid modification, cell polarity and division, activation of T and B cells, and cell survival. Many members of this superfamily bind phosphatidylinositol-3-phosphate (PI3P) but in some cases, other PIs such as PI4P or PI(3,4)P2, among others, are the preferred substrates. In addition to protein-lipid interaction, the PX domain may also be involved in protein-protein interaction, as in the cases of p40phox, p47phox, and some sorting nexins (SNXs). The PX domain is conserved from yeast to humans and is found in more than 100 proteins. The majority of PX domain-containing proteins are SNXs, which play important roles in endosomal sorting." Q#15851 - CGI_10017092 superfamily 245201 391 470 8.17E-24 101.852 cl09925 PKc_like superfamily C - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#15852 - CGI_10017093 superfamily 247724 3 175 1.98E-120 340.842 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. 
This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#15853 - CGI_10017094 superfamily 245201 831 1118 0 589.031 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#15853 - CGI_10017094 superfamily 247723 274 364 1.49E-37 138.162 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. 
RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#15853 - CGI_10017094 superfamily 247723 377 449 1.28E-31 120.344 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#15853 - CGI_10017094 superfamily 241620 461 491 1.69E-11 61.5171 cl00113 CRIB superfamily C - "PAK (p21 activated kinase) Binding Domain (PBD), binds Cdc42p- and/or Rho-like small GTPases; also known as the Cdc42/Rac interactive binding (CRIB) motif; has been shown to inhibit transcriptional activation and cell transformation mediated by the Ras-Rac pathway. 
CRIB-containing effector proteins are functionally diverse and include serine/threonine kinases, tyrosine kinases, actin-binding proteins, and adapter molecules." Q#15854 - CGI_10017095 superfamily 248013 699 738 0.000504793 39.9399 cl17459 CHROMO superfamily - - "Chromatin organization modifier (chromo) domain is a conserved region of around 50 amino acids found in a variety of chromosomal proteins, which appear to play a role in the functional organization of the eukaryotic nucleus. Experimental evidence implicates the chromo domain in the binding activity of these proteins to methylated histone tails and maybe RNA. May occur as single instance, in a tandem arrangement or followed by a related "chromo shadow" domain." Q#15854 - CGI_10017095 superfamily 243120 308 392 2.49E-26 105.739 cl02633 ARID superfamily - - "ARID/BRIGHT DNA binding domain; This domain is known as ARID for AT-Rich Interaction Domain, and also known as the BRIGHT domain." Q#15854 - CGI_10017095 superfamily 149305 184 277 1.49E-09 57.0733 cl06977 RBB1NT superfamily - - RBB1NT (NUC162) domain; This domain is found N terminal to the ARID/BRIGHT domain in DNA-binding proteins of the Retinoblastoma-binding protein 1 family. Q#15855 - CGI_10017096 superfamily 215754 102 193 3.13E-16 71.9008 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#15855 - CGI_10017096 superfamily 215754 221 296 5.78E-14 65.7376 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#15855 - CGI_10017096 superfamily 215754 21 93 7.40E-08 48.7888 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#15856 - CGI_10017097 superfamily 242896 21 311 1.03E-109 323.889 cl02127 RNA_pol_Rpc34 superfamily - - "RNA polymerase Rpc34 subunit; Subunit specific to RNA Pol III, the tRNA specific polymerase. 
The C34 subunit of yeast RNA Pol III is part of a subcomplex of three subunits which have no counterpart in the other two nuclear RNA polymerases. This subunit interacts with TFIIIB70 and therefore participates in Pol III recruitment." Q#15857 - CGI_10017098 superfamily 248013 26 76 0.000209268 39.1695 cl17459 CHROMO superfamily - - "Chromatin organization modifier (chromo) domain is a conserved region of around 50 amino acids found in a variety of chromosomal proteins, which appear to play a role in the functional organization of the eukaryotic nucleus. Experimental evidence implicates the chromo domain in the binding activity of these proteins to methylated histone tails and maybe RNA. May occur as single instance, in a tandem arrangement or followed by a related "chromo shadow" domain." Q#15859 - CGI_10017100 superfamily 242162 22 61 1.48E-20 77.8953 cl00876 Ribosomal_S27 superfamily - - Ribosomal protein S27a; This family of ribosomal proteins consists mainly of the 40S ribosomal protein S27a which is synthesised as a C-terminal extension of ubiquitin (CEP). The S27a domain comprises the C-terminal half of the protein. The synthesis of ribosomal proteins as extensions of ubiquitin promotes their incorporation into nascent ribosomes by a transient metabolic stabilisation and is required for efficient ribosome biogenesis. The ribosomal extension protein S27a contains a basic region that is proposed to form a zinc finger; its fusion gene is proposed as a mechanism to maintain a fixed ratio between ubiquitin necessary for degrading proteins and ribosomes, a source of proteins. 
Q#15860 - CGI_10017101 superfamily 243034 683 769 2.02E-11 61.6272 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#15860 - CGI_10017101 superfamily 243034 644 710 3.01E-06 46.2192 cl02429 TPR superfamily N - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number 
of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#15860 - CGI_10017101 superfamily 243034 343 457 2.04E-05 43.5228 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 
subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#15860 - CGI_10017101 superfamily 243034 441 525 0.000518964 39.2856 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#15862 - CGI_10017103 superfamily 247905 967 1091 2.01E-27 110.02 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to 
unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#15862 - CGI_10017103 superfamily 247805 434 579 2.10E-21 92.7856 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#15862 - CGI_10017103 superfamily 206063 242 362 9.56E-38 140.092 cl16460 DBINO superfamily - - DNA-binding domain; DBINO is a DNA-binding domain found on global transcription activator SNF2L1 proteins and chromatin re-modelling proteins. Q#15863 - CGI_10017104 superfamily 241565 20 89 2.47E-10 53.8647 cl00038 BRCT superfamily - - "Breast Cancer Suppressor Protein (BRCA1), carboxy-terminal domain. The BRCT domain is found within many DNA damage repair and cell cycle checkpoint proteins. The unique diversity of this domain superfamily allows BRCT modules to interact forming homo/hetero BRCT multimers, BRCT-non-BRCT interactions, and interactions within DNA strand breaks." Q#15864 - CGI_10017105 superfamily 247804 375 429 4.93E-05 41.407 cl17250 SANT superfamily - - "'SWI3, ADA2, N-CoR and TFIIIB' DNA-binding domains. Tandem copies of the domain bind telomeric DNA tandem repeats as part of the capping complex. Binding is sequence dependent for repeats which contain the G/C rich motif [C2-3 A (CA)1-6]. The domain is also found in regulatory transcriptional repressor complexes where it also binds DNA." Q#15864 - CGI_10017105 superfamily 247804 326 368 0.000293843 39.0958 cl17250 SANT superfamily - - "'SWI3, ADA2, N-CoR and TFIIIB' DNA-binding domains. Tandem copies of the domain bind telomeric DNA tandem repeats as part of the capping complex. Binding is sequence dependent for repeats which contain the G/C rich motif [C2-3 A (CA)1-6]. 
The domain is also found in regulatory transcriptional repressor complexes where it also binds DNA." Q#15865 - CGI_10017106 superfamily 245814 45 133 0.00099114 34.3871 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#15867 - CGI_10017108 superfamily 217685 288 434 4.69E-26 104.338 cl04225 Cu2_monoox_C superfamily - - "Copper type II ascorbate-dependent monooxygenase, C-terminal domain; The N and C-terminal domains of members of this family adopt the same PNGase F-like fold." Q#15867 - CGI_10017108 superfamily 216290 152 273 3.53E-21 89.6549 cl03089 Cu2_monooxygen superfamily - - "Copper type II ascorbate-dependent monooxygenase, N-terminal domain; The N and C-terminal domains of members of this family adopt the same PNGase F-like fold." Q#15868 - CGI_10017109 superfamily 245814 168 236 9.92E-05 39.7799 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. 
A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#15868 - CGI_10017109 superfamily 245814 82 135 0.00384262 35.0633 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#15871 - CGI_10017112 superfamily 247684 56 163 4.72E-14 69.9251 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. 
The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#15872 - CGI_10006447 superfamily 241864 25 187 9.63E-43 141.065 cl00439 UPF0047 superfamily - - Uncharacterized protein family UPF0047; This family has no known function. The alignment contains a conserved aspartate and histidine that may be functionally important. Q#15873 - CGI_10006448 superfamily 243072 35 154 3.97E-24 94.7578 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#15873 - CGI_10006448 superfamily 243072 128 197 4.08E-06 43.9115 cl02529 ANK superfamily C - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#15873 - CGI_10006448 superfamily 243073 221 260 5.84E-06 42.4573 cl02533 SOCS superfamily - - "SOCS (suppressors of cytokine signaling) box. 
The SOCS box is found in the C-terminal region of CIS/SOCS family proteins (in combination with a SH2 domain), ASBs (ankyrin repeat-containing proteins with a SOCS box), SSBs (SPRY domain-containing proteins with a SOCS box), and WSBs (WD40 repeat-containing proteins with a SOCS box), as well as, other miscellaneous proteins. The function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions." Q#15875 - CGI_10006451 superfamily 243161 14 68 8.14E-07 42.4222 cl02739 THAP superfamily C - "THAP domain; The THAP domain is a putative DNA-binding domain (DBD) and probably also binds a zinc ion. It features the conserved C2CH architecture (consensus sequence: Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His). Other universal features include the location of the domain at the N-termini of proteins, its size of about 90 residues, a C-terminal AVPTIF box and several other conserved residues. Orthologues of the human THAP domain have been identified in other vertebrates and probably worms and flies, but not in other eukaryotes or any prokaryotes." Q#15877 - CGI_10006453 superfamily 245226 277 448 1.25E-19 87.7412 cl10012 DnaQ_like_exo superfamily - - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. 
The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#15879 - CGI_10006455 superfamily 247743 335 368 1.66E-05 45.2147 cl17189 AAA superfamily C - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." 
Q#15879 - CGI_10006455 superfamily 243092 888 1222 1.55E-17 84.3088 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#15880 - CGI_10006456 superfamily 217410 30 71 0.000464934 37.3336 cl18409 DDE_1 superfamily NC - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction. Interestingly this family also includes the CENP-B protein. 
This domain in that protein appears to have lost the metal binding residues and is unlikely to have endonuclease activity. Centromere Protein B (CENP-B) is a DNA-binding protein localised to the centromere." Q#15882 - CGI_10021302 superfamily 238012 126 174 2.32E-08 50.0454 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#15882 - CGI_10021302 superfamily 238012 176 218 5.68E-07 45.8082 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#15882 - CGI_10021302 superfamily 238012 226 264 1.39E-06 44.6526 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#15882 - 
CGI_10021302 superfamily 238012 73 119 5.77E-05 40.0302 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#15883 - CGI_10021303 superfamily 241563 68 109 1.17E-06 45.9332 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#15883 - CGI_10021303 superfamily 241563 28 59 0.00120833 37.0736 cl00034 BBOX superfamily N - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#15885 - CGI_10021305 superfamily 245602 663 968 8.04E-70 235.573 cl11402 GH31 superfamily - - "The enzymes of glycosyl hydrolase family 31 (GH31) occur in prokaryotes, eukaryotes, and archaea with a wide range of hydrolytic activities, including alpha-glucosidase (glucoamylase and sucrase-isomaltase), alpha-xylosidase, 6-alpha-glucosyltransferase, 3-alpha-isomaltosyltransferase and alpha-1,4-glucan lyase. All GH31 enzymes cleave a terminal carbohydrate moiety from a substrate that varies considerably in size, depending on the enzyme, and may be either a starch or a glycoprotein. 
In most cases, the pyranose moiety recognized in subsite -1 of the substrate binding site is an alpha-D-glucose, though some GH31 family members show a preference for alpha-D-xylose. Several GH31 enzymes can accommodate both glucose and xylose and different levels of discrimination between the two have been observed. Most characterized GH31 enzymes are alpha-glucosidases. In mammals, GH31 members with alpha-glucosidase activity are implicated in at least three distinct biological processes. The lysosomal acid alpha-glucosidase (GAA) is essential for glycogen degradation and a deficiency or malfunction of this enzyme causes glycogen storage disease II, also known as Pompe disease. In the endoplasmic reticulum, alpha-glucosidase II catalyzes the second step in the N-linked oligosaccharide processing pathway that constitutes part of the quality control system for glycoprotein folding and maturation. The intestinal enzymes sucrase-isomaltase (SI) and maltase-glucoamylase (MGAM) play key roles in the final stage of carbohydrate digestion, making alpha-glucosidase inhibitors useful in the treatment of type 2 diabetes. GH31 alpha-glycosidases are retaining enzymes that cleave their substrates via an acid/base-catalyzed, double-displacement mechanism involving a covalent glycosyl-enzyme intermediate. Two aspartic acid residues have been identified as the catalytic nucleophile and the acid/base, respectively." Q#15885 - CGI_10021305 superfamily 245602 898 1051 1.06E-26 113.491 cl11402 GH31 superfamily N - "The enzymes of glycosyl hydrolase family 31 (GH31) occur in prokaryotes, eukaryotes, and archaea with a wide range of hydrolytic activities, including alpha-glucosidase (glucoamylase and sucrase-isomaltase), alpha-xylosidase, 6-alpha-glucosyltransferase, 3-alpha-isomaltosyltransferase and alpha-1,4-glucan lyase. 
All GH31 enzymes cleave a terminal carbohydrate moiety from a substrate that varies considerably in size, depending on the enzyme, and may be either a starch or a glycoprotein. In most cases, the pyranose moiety recognized in subsite -1 of the substrate binding site is an alpha-D-glucose, though some GH31 family members show a preference for alpha-D-xylose. Several GH31 enzymes can accommodate both glucose and xylose and different levels of discrimination between the two have been observed. Most characterized GH31 enzymes are alpha-glucosidases. In mammals, GH31 members with alpha-glucosidase activity are implicated in at least three distinct biological processes. The lysosomal acid alpha-glucosidase (GAA) is essential for glycogen degradation and a deficiency or malfunction of this enzyme causes glycogen storage disease II, also known as Pompe disease. In the endoplasmic reticulum, alpha-glucosidase II catalyzes the second step in the N-linked oligosaccharide processing pathway that constitutes part of the quality control system for glycoprotein folding and maturation. The intestinal enzymes sucrase-isomaltase (SI) and maltase-glucoamylase (MGAM) play key roles in the final stage of carbohydrate digestion, making alpha-glucosidase inhibitors useful in the treatment of type 2 diabetes. GH31 alpha-glycosidases are retaining enzymes that cleave their substrates via an acid/base-catalyzed, double-displacement mechanism involving a covalent glycosyl-enzyme intermediate. Two aspartic acid residues have been identified as the catalytic nucleophile and the acid/base, respectively." Q#15891 - CGI_10021311 superfamily 247745 5 283 2.77E-95 285.379 cl17191 GH38-57_N_LamB_YdjC_SF superfamily - - "Catalytic domain of glycoside hydrolase (GH) families 38 and 57, lactam utilization protein LamB/YcsF family proteins, YdjC-family proteins, and similar proteins; The superfamily possesses strong sequence similarities across a wide range of all three kingdoms of life. 
It mainly includes four families, glycoside hydrolases family 38 (GH38), heat stable retaining glycoside hydrolases family 57 (GH57), lactam utilization protein LamB/YcsF family, and YdjC-family. The GH38 family corresponds to class II alpha-mannosidases (alphaMII, EC 3.2.1.24), which contain intermediate Golgi alpha-mannosidases II, acidic lysosomal alpha-mannosidases, animal sperm and epididymal alpha -mannosidases, neutral ER/cytosolic alpha-mannosidases, and some putative prokaryotic alpha-mannosidases. AlphaMII possess a-1,3, a-1,6, and a-1,2 hydrolytic activity, and catalyzes the degradation of N-linked oligosaccharides by employing a two-step mechanism involving the formation of a covalent glycosyl enzyme complex. GH57 is a purely prokaryotic family with the majority of thermostable enzymes from extremophiles (many of them are archaeal hyperthermophiles), which exhibit the enzyme specificities of alpha-amylase (EC 3.2.1.1), 4-alpha-glucanotransferase (EC 2.4.1.25), amylopullulanase (EC 3.2.1.1/41), and alpha-galactosidase (EC 3.2.1.22). This family also includes many hypothetical proteins with uncharacterized activity and specificity. GH57 cleaves alpha-glycosidic bond by employing a retaining mechanism, which involves a glycosyl-enzyme intermediate, allowing transglycosylation. Although the exact molecular function of LamB/YcsF family and YdjC-family remains unclear, they show high sequence and structure homology to the members of GH38 and GH57. Their catalytic domains adopt a similar parallel 7-stranded beta/alpha barrel, which is remotely related to catalytic NodB homology domain of the carbohydrate esterase 4 superfamily." Q#15892 - CGI_10021312 superfamily 241832 277 333 0.000771864 38.3712 cl00388 Thioredoxin_like superfamily N - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. 
Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#15893 - CGI_10021313 superfamily 241563 105 143 5.64E-05 41.3108 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#15894 - CGI_10021314 superfamily 243091 104 204 4.54E-14 65.206 cl02566 SET superfamily - - "SET domain; SET domains are protein lysine methyltransferase enzymes. SET domains appear to be protein-protein interaction domains. It has been demonstrated that SET domains mediate interactions with a family of proteins that display similarity with dual-specificity phosphatases (dsPTPases). A subset of SET domains have been called PR domains. These domains are divergent in sequence from other SET domains, but also appear to mediate protein-protein interaction. The SET domain consists of two regions known as SET-N and SET-C. SET-C forms an unusual and conserved knot-like structure of probably functional importance. Additionally to SET-N and SET-C, an insert region (SET-I) and flanking regions of high structural variability form part of the overall structure." 
Q#15895 - CGI_10021315 superfamily 241563 62 100 7.25E-05 40.5404 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#15895 - CGI_10021315 superfamily 243092 306 434 0.00220636 38.8552 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#15895 - CGI_10021315 superfamily 217020 100 199 0.00371837 36.0334 cl03574 Seryl_tRNA_N superfamily - - Seryl-tRNA synthetase N-terminal domain; This domain is found associated with the Pfam tRNA synthetase class II domain (pfam00587) and represents the N-terminal domain of seryl-tRNA synthetase. 
Q#15896 - CGI_10021316 superfamily 241817 1 110 4.86E-41 138.479 cl00365 F1-ATPase_gamma superfamily N - "mitochondrial ATP synthase gamma subunit; The F-ATPase is found in bacterial plasma membranes, mitochondrial inner membranes and in chloroplast thylakoid membranes. It has also been found in the archaea Methanosarcina barkeri. It uses a proton gradient to drive ATP synthesis and hydrolyzes ATP to build the proton gradient. The extrinsic membrane domain of F-ATPases is composed of alpha, beta, gamma, delta, and epsilon (not present in bacteria) subunits with a stoichiometry of 3:3:1:1:1. Alpha and beta subunit form the globular catalytic moiety, a hexameric ring of alternating subunits. Gamma, delta and epsilon subunits form a stalk, connecting F1 to F0, the integral membrane proton translocating domain." Q#15897 - CGI_10021317 superfamily 216363 66 118 2.11E-13 62.4878 cl08312 UPF0029 superfamily NC - Uncharacterized protein family UPF0029; Uncharacterized protein family UPF0029. Q#15898 - CGI_10021318 superfamily 247905 403 519 1.54E-26 106.168 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#15898 - CGI_10021318 superfamily 192726 189 338 6.43E-53 181.111 cl12779 Med25 superfamily - - "Mediator complex subunit 25 PTOV activation and synapsin 2; Mediator is a large complex of up to 33 proteins that is conserved from 
plants to fungi to humans - the number and representation of individual subunits varying with species. It is arranged into four different sections, a core, a head, a tail and a kinase-active part, and the number of subunits within each of these is what varies with species. Overall, Mediator regulates the transcriptional activity of RNA polymerase II but it would appear that each of the four different sections has a slightly different function. The overall function of the full-length Med25 is efficiently to coordinate the transcriptional activation of RAR/RXR (retinoic acid receptor/retinoic X receptor) in higher eukaryotic cells. Human Med25 consists of several domains with different binding properties, the N-terminal, VWA domain, an SD1 - synapsin 1 - domain from residues 229-381, a PTOV(B) or ACID domain from 395-545, an SD2 domain from residues 564-645 and a C-terminal NR box-containing domain (646-650) from 646-747. This family is the combined PTOV and SD2 domains, the PTOV domain being the domain through which Med25 co-operates with the histone acetyltransferase CBP, but the function of the SD2 domain is unclear." Q#15901 - CGI_10021321 superfamily 245814 81 162 0.000176508 39.4109 cl11960 Ig superfamily C - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#15901 - CGI_10021321 superfamily 246918 222 245 0.000506741 37.9515 cl15278 TSP_1 superfamily C - Thrombospondin type 1 domain; Thrombospondin type 1 domain. 
Q#15902 - CGI_10021322 superfamily 241645 7 120 1.92E-53 165.486 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#15903 - CGI_10021323 superfamily 217293 1 194 4.02E-38 137.764 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#15906 - CGI_10021326 superfamily 244859 244 471 6.31E-17 79.9229 cl08171 HtrL_YibB superfamily - - "Bacterial protein of unknown function (HtrL_YibB); The protein from this rare, uncharacterized protein family is designated HtrL or YibB in E. coli, where its gene is found in a region of LPS core biosynthesis genes. Homologues are found in Shigella flexneri, Campylobacter jejuni, and Caenorhabditis elegans only. The htrL gene may represent an insertion to the LPS core biosynthesis region, rather than an LPS biosynthetic protein." Q#15907 - CGI_10021327 superfamily 243179 89 200 5.65E-11 57.1575 cl02781 tetraspanin_LEL superfamily - - "Tetraspanin, extracellular domain or large extracellular loop (LEL). 
Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. The tetraspanin family contains CD9, CD63, CD37, CD53, CD82, CD151, and CD81, amongst others. Tetraspanins are involved in diverse processes such as cell activation and proliferation, adhesion and motility, differentiation, cancer, and others. Their various functions may relate to their ability to act as molecular facilitators, grouping specific cell-surface proteins and affecting formation and stability of signaling complexes. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web", which may also include integrins." Q#15908 - CGI_10021328 superfamily 243035 132 262 6.34E-12 60.4802 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#15909 - CGI_10021329 superfamily 243035 128 252 3.32E-06 44.1478 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#15910 - CGI_10021330 superfamily 234583 6 422 6.90E-165 483.896 cl18873 PRK00029 superfamily C - hypothetical protein; Validated Q#15910 - CGI_10021330 superfamily 234583 508 585 5.36E-18 85.9852 cl18873 PRK00029 superfamily N - hypothetical protein; Validated Q#15911 - CGI_10021331 superfamily 243212 103 223 5.08E-16 71.6061 cl02844 Arrestin_C superfamily - - "Arrestin (or S-antigen), C-terminal domain; Ig-like beta-sandwich fold. Scop reports duplication with N-terminal domain." Q#15911 - CGI_10021331 superfamily 215866 10 78 6.78E-09 51.9424 cl18349 Arrestin_N superfamily N - "Arrestin (or S-antigen), N-terminal domain; Ig-like beta-sandwich fold. Scop reports duplication with C-terminal domain." Q#15913 - CGI_10021333 superfamily 205121 23 47 3.44E-05 40.5652 cl18263 zf-met superfamily - - "Zinc-finger of C2H2 type; This is a zinc-finger domain with the CxxCx(12)Hx(6)H motif, found in multiple copies in a wide range of proteins from plants to metazoans. Some member proteins, particularly those from plants, are annotated as being RNA-binding." 
Q#15914 - CGI_10021334 superfamily 198913 1628 1756 6.98E-55 189.491 cl07883 CAMSAP_CKK superfamily - - "Microtubule-binding calmodulin-regulated spectrin-associated; This is the C-terminal domain of a family of eumetazoan proteins collectively defined as calmodulin-regulated spectrin-associated, or CAMSAP, proteins. CAMSAP proteins carry an N-terminal region that includes the CH domain, a central region including a predicted coiled-coil and this C-terminal, or CKK, domain - defined as being present in CAMSAP, KIAA1078 and KIAA1543, The C-terminal domain is the part of the CAMSAP proteins that binds to microtubules. The domain appears to act by producing inhibition of neurite extension, probably by blocking microtubule function. CKK represents a domain that has evolved with the metazoa. The structure of a murine hypothetical protein from RIKEN cDNA has shown the domain to adopt a mainly beta barrel structure with an associated alpha-helical hairpin." Q#15914 - CGI_10021334 superfamily 241559 221 304 6.16E-21 90.4228 cl00030 CH superfamily - - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." Q#15915 - CGI_10021335 superfamily 203690 439 494 6.34E-13 64.4912 cl06576 UnbV_ASPIC superfamily C - ASPIC and UnbV; This conserved sequence is found associated with pfam00515 in several paralogous proteins in Rhodopirellula baltica. It is also found associated with pfam01839 in several eukaryotic integrin-like proteins (e.g. human ASPIC) and in several other bacterial proteins. Q#15916 - CGI_10021336 superfamily 216292 3 208 1.74E-37 131.403 cl03091 Clathrin_lg_ch superfamily - - Clathrin light chain; Clathrin light chain. 
Q#15917 - CGI_10021337 superfamily 245205 5 110 3.27E-23 88.0889 cl09930 RPA_2b-aaRSs_OBF_like superfamily - - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." Q#15918 - CGI_10021338 superfamily 148567 31 101 1.06E-16 69.9143 cl06182 Rab5ip superfamily - - Rab5-interacting protein (Rab5ip); This family consists of several Rab5-interacting protein (RIP5 or Rab5ip) sequences. The ras-related GTPase rab5 is rate-limiting for homotypic early endosome fusion. Rab5ip represents a novel rab5 interacting protein that may function on endocytic vesicles as a receptor for rab5-GDP and participate in the activation of rab5. 
Q#15920 - CGI_10006624 superfamily 218118 63 133 7.03E-16 68.7948 cl04552 CD225 superfamily - - "Interferon-induced transmembrane protein; This family includes the human leukocyte antigen CD225, which is an interferon inducible transmembrane protein, and is associated with interferon induced cell growth suppression." Q#15923 - CGI_10006627 superfamily 218118 163 240 2.27E-12 60.3205 cl04552 CD225 superfamily - - "Interferon-induced transmembrane protein; This family includes the human leukocyte antigen CD225, which is an interferon inducible transmembrane protein, and is associated with interferon induced cell growth suppression." Q#15923 - CGI_10006627 superfamily 218118 27 93 8.80E-12 58.7797 cl04552 CD225 superfamily - - "Interferon-induced transmembrane protein; This family includes the human leukocyte antigen CD225, which is an interferon inducible transmembrane protein, and is associated with interferon induced cell growth suppression." Q#15927 - CGI_10011136 superfamily 244859 8 115 1.27E-08 51.0081 cl08171 HtrL_YibB superfamily N - "Bacterial protein of unknown function (HtrL_YibB); The protein from this rare, uncharacterized protein family is designated HtrL or YibB in E. coli, where its gene is found in a region of LPS core biosynthesis genes. Homologues are found in Shigella flexneri, Campylobacter jejuni, and Caenorhabditis elegans only. The htrL gene may represent an insertion to the LPS core biosynthesis region, rather than an LPS biosynthetic protein." Q#15928 - CGI_10011137 superfamily 248097 101 226 4.32E-07 46.5265 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#15928 - CGI_10011137 superfamily 245674 17 77 0.00388719 34.187 cl11531 DUF904 superfamily - - Protein of unknown function (DUF904); This family consists of several bacterial and archaeal hypothetical proteins of unknown function. 
Q#15934 - CGI_10011143 superfamily 241754 208 429 7.28E-115 356.685 cl00286 Motor_domain superfamily N - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. Q#15934 - CGI_10011143 superfamily 241754 16 138 2.40E-46 169.478 cl00286 Motor_domain superfamily C - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. Q#15934 - CGI_10011143 superfamily 245835 495 618 0.00762155 37.456 cl12013 BAR superfamily N - "The Bin/Amphiphysin/Rvs (BAR) domain, a dimerization module that binds membranes and detects membrane curvature; BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions including organelle biogenesis, membrane trafficking or remodeling, and cell division and migration. Mutations in BAR containing proteins have been linked to diseases and their inactivation in cells leads to altered membrane dynamics. A BAR domain with an additional N-terminal amphipathic helix (an N-BAR) can drive membrane curvature. These N-BAR domains are found in amphiphysins and endophilins, among others. BAR domains are also frequently found alongside domains that determine lipid specificity, such as the Pleckstrin Homology (PH) and Phox Homology (PX) domains which are present in beta centaurins (ACAPs and ASAPs) and sorting nexins, respectively. A FES-CIP4 Homology (FCH) domain together with a coiled coil region is called the F-BAR domain and is present in Pombe/Cdc15 homology (PCH) family proteins, which include Fes/Fes tyrosine kinases, PACSIN or syndapin, CIP4-like proteins, and srGAPs, among others. 
The Inverse (I)-BAR or IRSp53/MIM homology Domain (IMD) is found in multi-domain proteins, such as IRSp53 and MIM, that act as scaffolding proteins and transducers of a variety of signaling pathways that link membrane dynamics and the underlying actin cytoskeleton. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The I-BAR domain induces membrane protrusions in the opposite direction compared to classical BAR and F-BAR domains, which produce membrane invaginations. BAR domains that also serve as protein interaction domains include those of arfaptin and OPHN1-like proteins, among others, which bind to Rac and Rho GAP domains, respectively." Q#15935 - CGI_10011144 superfamily 241571 366 489 2.46E-14 69.3634 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#15935 - CGI_10011144 superfamily 241583 171 361 5.69E-41 145.792 cl00064 ZnMc superfamily - - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#15937 - CGI_10011146 superfamily 216254 21 62 8.89E-09 51.4798 cl08303 Recep_L_domain superfamily C - Receptor L domain; The L domains from these receptors make up the bilobal ligand binding site. Each L domain consists of a single-stranded right hand beta-helix. This Pfam entry is missing the first 50 amino acid residues of the domain. Q#15942 - CGI_10003675 superfamily 247805 35 149 8.31E-14 64.666 cl17251 DEXDc superfamily C - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 
Q#15943 - CGI_10003676 superfamily 243072 21 122 3.68E-07 46.6078 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#15943 - CGI_10003676 superfamily 243072 94 197 0.000430162 37.7483 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#15946 - CGI_10023339 superfamily 221808 105 193 0.00826726 36.1718 cl15119 Tet_JBP superfamily C - "Oxygenase domain of the 2OGFeDO superfamily; A double-stranded beta helix (DSBH) fold domain of the 2-oxoglutarate (2OG)-Fe(II)-dependent dioxygenase (2OGFeDO) superfamily found in various eukaryotes, bacteria and bacteriophages. Members of this family catalyze nucleic acid modifications, such as thymidine hydroxylation during base J synthesis in kinetoplastids, and the conversion of 5 methyl-cytosine (5-mC) to 5-hydroxymethyl-cytosine (hmC), or further oxidation to 5-formylcytosine (5fC) and 5-carboxylcytosine (5caC). Metazoan TET proteins contain a cysteine-rich region inserted into the core of the DSBH fold. Vertebrate TET proteins are oncogenes that are mutated in various myeloid cancers. 
Fungal and algal versions of this family are linked to a predicted transposase and show lineage-specific expansions." Q#15946 - CGI_10023339 superfamily 241563 70 105 0.00971585 34.6203 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#15948 - CGI_10023341 superfamily 245882 21 116 1.13E-34 123.555 cl12119 Alpha_L_fucos superfamily C - Alpha-L-fucosidase; Alpha-L-fucosidase. Q#15949 - CGI_10023342 superfamily 245882 25 400 1.23E-166 477.553 cl12119 Alpha_L_fucos superfamily - - Alpha-L-fucosidase; Alpha-L-fucosidase. Q#15950 - CGI_10023343 superfamily 243146 161 206 3.98E-08 49.197 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#15950 - CGI_10023343 superfamily 243146 17 62 7.08E-08 48.4266 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#15950 - CGI_10023343 superfamily 243146 113 162 1.57E-07 47.4429 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#15950 - CGI_10023343 superfamily 243146 65 111 3.43E-05 40.7226 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#15950 - CGI_10023343 superfamily 243146 260 306 0.000105957 39.1818 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#15953 - CGI_10023346 superfamily 218165 30 103 8.37E-10 53.7763 cl04619 Ribophorin_I superfamily C - "Ribophorin I; Ribophorin I is an essential subunit of oligosaccharyltransferase (OST), which is also known as Dolichyl-diphosphooligosaccharide--protein glycosyltransferase, (EC:2.4.1.119). 
OST catalyzes the transfer of an oligosaccharide from dolichol pyrophosphate to selected asparagine residues of nascent polypeptides as they are translocated into the lumen of the rough endoplasmic reticulum. Ribophorin I and OST48 are thought to be responsible for OST catalytic activity. Both yeast and mammalian proteins are glycosylated but the sites are not conserved. Glycosylation may contribute towards general solubility but is unlikely to be involved in a specific biochemical function. Most family members are predicted to have a transmembrane helix at the C terminus of this region." Q#15954 - CGI_10023347 superfamily 218165 32 452 5.92E-169 490.978 cl04619 Ribophorin_I superfamily - - "Ribophorin I; Ribophorin I is an essential subunit of oligosaccharyltransferase (OST), which is also known as Dolichyl-diphosphooligosaccharide--protein glycosyltransferase, (EC:2.4.1.119). OST catalyzes the transfer of an oligosaccharide from dolichol pyrophosphate to selected asparagine residues of nascent polypeptides as they are translocated into the lumen of the rough endoplasmic reticulum. Ribophorin I and OST48 are thought to be responsible for OST catalytic activity. Both yeast and mammalian proteins are glycosylated but the sites are not conserved. Glycosylation may contribute towards general solubility but is unlikely to be involved in a specific biochemical function. Most family members are predicted to have a transmembrane helix at the C terminus of this region." Q#15956 - CGI_10023349 superfamily 243056 174 420 8.58E-42 148.661 cl02495 RabGAP-TBC superfamily - - "Rab-GTPase-TBC domain; Identification of a TBC domain in GYP6_YEAST and GYP7_YEAST, which are GTPase activator proteins of yeast Ypt6 and Ypt7, implies that these domains are GTPase activator proteins of Rab-like small GTPases." 
Q#15958 - CGI_10023351 superfamily 247724 37 362 0 575.63 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#15958 - CGI_10023351 superfamily 246925 601 741 0.000246629 42.7278 cl15309 LRR_RI superfamily NC - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." 
Q#15959 - CGI_10023352 superfamily 238012 15705 15739 0.00683274 39.645 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#15959 - CGI_10023352 superfamily 238012 14247 14295 0.00798614 39.645 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#15959 - CGI_10023352 superfamily 245847 156 283 6.28E-10 62.3697 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#15959 - CGI_10023352 superfamily 243065 15195 15347 1.09E-08 58.6073 cl02516 VWD superfamily - - von Willebrand factor type D domain; Luciferin-2-monooxygenase from Vargula hilgendorfii contains a vwd domain. Its function is unrelated but the similarity is very strong by several methods. Q#15959 - CGI_10023352 superfamily 219677 13695 13726 8.82E-05 45.1212 cl18521 EGF_2 superfamily - - EGF-like domain; This family contains EGF domains found in a variety of extracellular proteins. 
Q#15959 - CGI_10023352 superfamily 219677 12780 12813 0.000141245 44.3508 cl18521 EGF_2 superfamily - - EGF-like domain; This family contains EGF domains found in a variety of extracellular proteins. Q#15959 - CGI_10023352 superfamily 241611 14787 14946 0.000198994 45.072 cl00102 PTX superfamily - - "Pentraxins are plasma proteins characterized by their pentameric discoid assembly and their Ca2+ dependent ligand binding, such as Serum amyloid P component (SAP) and C-reactive Protein (CRP), which are cytokine-inducible acute-phase proteins implicated in innate immunity. CRP binds to ligands containing phosphocholine, SAP binds to amyloid fibrils, DNA, chromatin, fibronectin, C4-binding proteins and glycosaminoglycans. "Long" pentraxins have N-terminal extensions to the common pentraxin domain; one group, the neuronal pentraxins, may be involved in synapse formation and remodeling, and they may also be able to form heteromultimers." Q#15959 - CGI_10023352 superfamily 219677 13433 13464 0.00087752 42.0396 cl18521 EGF_2 superfamily - - EGF-like domain; This family contains EGF domains found in a variety of extracellular proteins. Q#15959 - CGI_10023352 superfamily 219677 13305 13336 0.000945954 42.0396 cl18521 EGF_2 superfamily - - EGF-like domain; This family contains EGF domains found in a variety of extracellular proteins. Q#15959 - CGI_10023352 superfamily 219677 12751 12775 0.00961244 38.958 cl18521 EGF_2 superfamily - - EGF-like domain; This family contains EGF domains found in a variety of extracellular proteins. Q#15961 - CGI_10023354 superfamily 219677 2 33 0.00260467 36.6468 cl18521 EGF_2 superfamily - - EGF-like domain; This family contains EGF domains found in a variety of extracellular proteins. 
Q#15962 - CGI_10023355 superfamily 243134 30 160 1.25E-33 120.832 cl02663 Fasciclin superfamily - - "Fasciclin domain; This extracellular domain is found repeated four times in grasshopper fasciclin I as well as in proteins from mammals, sea urchins, plants, yeast and bacteria." Q#15962 - CGI_10023355 superfamily 243134 175 307 1.58E-25 98.8755 cl02663 Fasciclin superfamily - - "Fasciclin domain; This extracellular domain is found repeated four times in grasshopper fasciclin I as well as in proteins from mammals, sea urchins, plants, yeast and bacteria." Q#15963 - CGI_10023356 superfamily 243056 200 422 1.37E-43 159.009 cl02495 RabGAP-TBC superfamily - - "Rab-GTPase-TBC domain; Identification of a TBC domain in GYP6_YEAST and GYP7_YEAST, which are GTPase activator proteins of yeast Ypt6 and Ypt7, implies that these domains are GTPase activator proteins of Rab-like small GTPases." Q#15964 - CGI_10023357 superfamily 247792 654 697 0.000159624 40.892 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#15965 - CGI_10023358 superfamily 247829 10 412 0 647.411 cl17275 PRTase_typeII superfamily - - "Phosphoribosyltransferase (PRTase) type II; This family contains two enzymes that play an important role in NAD production by either allowing quinolinic acid (QA) , quinolinate phosphoribosyl transferase (QAPRTase), or nicotinic acid (NA), nicotinate phosphoribosyltransferase (NAPRTase), to be used 
in the synthesis of NAD. QAPRTase catalyses the reaction of quinolinic acid (QA) with 5-phosphoribosyl-1-pyrophosphate (PRPP) in the presence of Mg2+ to produce nicotinic acid mononucleotide (NAMN), pyrophosphate and carbon dioxide, an important step in the de novo synthesis of NAD. NAPRTase catalyses a similar reaction leading to NAMN and pyrophosphate, using nicotinic acid and PRPP as substrates, used in the NAD salvage pathway." Q#15966 - CGI_10023359 superfamily 241584 625 715 8.04E-11 60.5879 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#15966 - CGI_10023359 superfamily 245814 60 134 7.38E-08 51.3359 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#15966 - CGI_10023359 superfamily 241584 944 1036 2.76E-07 50.1875 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. 
Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#15966 - CGI_10023359 superfamily 245814 547 618 3.01E-06 46.7135 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#15966 - CGI_10023359 superfamily 222432 1175 1258 1.78E-19 86.593 cl16451 Bravo_FIGEY superfamily N - C-terminal domain of Fibronectin type III; This is the very C-terminal region of neural adhesion molecule L1 proteins that are also known as Bravo or NrCAM. It lies upstream of the IG and Fn3 domains and has the highly conserved motif FIGEY. The function is not known. Q#15966 - CGI_10023359 superfamily 245814 356 429 1.32E-15 73.973 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#15966 - CGI_10023359 superfamily 245814 440 517 2.00E-11 62.1376 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#15966 - CGI_10023359 superfamily 245814 277 325 3.02E-08 52.4091 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#15966 - CGI_10023359 superfamily 245814 158 239 6.51E-07 49.0908 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. 
The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#15967 - CGI_10023360 superfamily 215754 102 197 5.03E-28 104.643 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#15967 - CGI_10023360 superfamily 215754 201 290 2.78E-25 97.324 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#15967 - CGI_10023360 superfamily 215754 2 99 1.94E-18 78.064 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#15968 - CGI_10023361 superfamily 243082 749 894 1.03E-53 187.11 cl02553 Peptidase_C19 superfamily N - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." Q#15968 - CGI_10023361 superfamily 243082 255 439 2.74E-22 98.2136 cl02553 Peptidase_C19 superfamily C - "Peptidase C19 contains ubiquitinyl hydrolases. 
They are intracellular peptidases that remove ubiquitin molecules from polyubiquinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." Q#15968 - CGI_10023361 superfamily 245879 19 116 2.78E-16 75.8577 cl12116 DUSP superfamily - - DUSP domain; The DUSP (domain present in ubiquitin-specific protease) domain is found at the N-terminus of Ubiquitin-specific proteases. The structure of this domain has been solved. Its tripod-like structure consists of a 3-fold alpha-helical bundle supporting a triple-stranded anti-parallel beta-sheet. Q#15969 - CGI_10023362 superfamily 247723 251 329 7.55E-42 148.703 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. 
The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#15969 - CGI_10023362 superfamily 247723 336 407 6.39E-33 123.085 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#15969 - CGI_10023362 superfamily 247723 79 157 7.30E-28 108.887 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. 
The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#15969 - CGI_10023362 superfamily 243072 545 660 1.01E-25 104.388 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#15969 - CGI_10023362 superfamily 243072 800 915 1.01E-25 104.388 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#15969 - CGI_10023362 superfamily 243073 990 1027 1.05E-07 50.2221 cl02533 SOCS superfamily - - "SOCS (suppressors of cytokine signaling) box. The SOCS box is found in the C-terminal region of CIS/SOCS family proteins (in combination with a SH2 domain), ASBs (ankyrin repeat-containing proteins with a SOCS box), SSBs (SPRY domain-containing proteins with a SOCS box), and WSBs (WD40 repeat-containing proteins with a SOCS box), as well as, other miscellaneous proteins. The function of the SOCS box is the recruitment of the ubiquitin-transferase system. 
The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions." Q#15970 - CGI_10023363 superfamily 244089 1 168 1.10E-102 294.941 cl05433 ARPC4 superfamily - - ARP2/3 complex 20 kDa subunit (ARPC4); This family consists of several eukaryotic ARP2/3 complex 20 kDa subunit (P20-ARC) proteins. The Arp2/3 protein complex has been implicated in the control of actin polymerisation in cells. The human complex consists of seven subunits which include the actin related proteins Arp2 and Arp3 it has been suggested that the complex promotes actin assembly in lamellipodia and may participate in lamellipodial protrusion. Q#15971 - CGI_10023364 superfamily 111626 44 244 1.05E-125 361.902 cl17929 Synapsin_C superfamily - - "Synapsin, ATP binding domain; Ca dependent ATP binding in this ATP grasp fold. Function unknown." Q#15971 - CGI_10023364 superfamily 111020 8 42 8.13E-11 57.9236 cl03435 Synapsin superfamily N - "Synapsin, N-terminal domain; Synapsin, N-terminal domain. " Q#15972 - CGI_10023365 superfamily 243064 20 186 5.09E-20 84.0042 cl02512 NTR_like superfamily - - "NTR_like domain; a beta barrel with an oligosaccharide/oligonucleotide-binding fold found in netrins, complement proteins, tissue inhibitors of metalloproteases (TIMP), and procollagen C-proteinase enhancers (PCOLCE), amongst others. In netrins, the domain plays a role in controlling axon branching in neural development, while the common function of these modules in TIMPs appears to be binding to metzincins. A subset of this family is also known as the C345C domain because it occurs as a C-terminal domain in complement C3, C4 and C5. In C5, the domain interacts with various partners during the formation of the membrane attack complex." 
Q#15973 - CGI_10023366 superfamily 150823 28 182 3.59E-48 156.079 cl10897 Armet superfamily - - "Degradation arginine-rich protein for mis-folding; This is a family of small proteins of approximately 170 residues which contain four di-sulfide bridges that are highly conserved from nematodes to humans. Armet is a soluble protein resident in the endoplasmic reticulum and induced by ER stress. It appears to be involved with dealing with mis-folded proteins in the ER, thus in quality control of ER stress." Q#15974 - CGI_10023367 superfamily 247684 2 358 2.07E-146 425.259 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#15975 - CGI_10023368 superfamily 241705 91 203 7.02E-29 106.903 cl00228 HIT_like superfamily - - "HIT family: HIT (Histidine triad) proteins, named for a motif related to the sequence HxHxH/Qxx (x, a hydrophobic amino acid), are a superfamily of nucleotide hydrolases and transferases, which act on the alpha-phosphate of ribonucleotides. On the basis of sequence, substrate specificity, structure, evolution and mechanism, HIT proteins are classified in the literature into three major branches: the Hint branch, which consists of adenosine 5'-monophosphoramide hydrolases, the Fhit branch, that consists of diadenosine polyphosphate hydrolases, and the GalT branch consisting of specific nucleoside monophosphate transferases. Further sequence analysis reveals several new closely related, yet uncharacterized subgroups." Q#15975 - CGI_10023368 superfamily 218677 2 63 9.72E-16 69.9749 cl12312 DcpS superfamily N - Scavenger mRNA decapping enzyme (DcpS) N-terminal; This family consists of several scavenger mRNA decapping enzymes (DcpS) and is the N-terminal domain of these proteins. DcpS is a scavenger pyrophosphatase that hydrolyses the residual cap structure following 3' to 5' decay of an mRNA. The association of DcpS with 3' to 5' exonuclease exosome components suggests that these two activities are linked and there is a coupled exonucleolytic decay-dependent decapping pathway. Q#15975 - CGI_10023368 superfamily 243836 197 236 0.00267467 35.7343 cl04661 Polysacc_synt_4 superfamily NC - Polysaccharide biosynthesis; This family of proteins plays a role in xylan biosynthesis in plant cell walls. Its precise role in xylan biosynthesis is unknown. Its function in other organisms is unknown. Q#15977 - CGI_10023370 superfamily 245213 84 121 0.000766461 37.6162 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#15977 - CGI_10023370 superfamily 245213 123 164 0.000828818 37.6162 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#15977 - CGI_10023370 superfamily 243119 496 533 0.000893112 37.7985 cl02629 CBM_14 superfamily C - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. 
Q#15981 - CGI_10023374 superfamily 241900 2 312 1.30E-127 377.813 cl00490 EEP superfamily - - "Exonuclease-Endonuclease-Phosphatase (EEP) domain superfamily; This large superfamily includes the catalytic domain (exonuclease/endonuclease/phosphatase or EEP domain) of a diverse set of proteins including the ExoIII family of apurinic/apyrimidinic (AP) endonucleases, inositol polyphosphate 5-phosphatases (INPP5), neutral sphingomyelinases (nSMases), deadenylases (such as the vertebrate circadian-clock regulated nocturnin), bacterial cytolethal distending toxin B (CdtB), deoxyribonuclease 1 (DNase1), the endonuclease domain of the non-LTR retrotransposon LINE-1, and related domains. These diverse enzymes share a common catalytic mechanism of cleaving phosphodiester bonds; their substrates range from nucleic acids to phospholipids and perhaps proteins." Q#15981 - CGI_10023374 superfamily 219199 474 523 2.82E-10 56.2344 cl06070 zf-GRF superfamily - - GRF zinc finger; This presumed zinc binding domain is found in a variety of DNA-binding proteins. It seems likely that this domain is involved in nucleic acid binding. It is named GRF after three conserved residues in the centre of the alignment of the domain. This zinc finger may be related to pfam01396. Q#15982 - CGI_10023375 superfamily 243125 12 28 0.00380983 32.6613 cl02649 LEM superfamily N - "LEM (Lap2/Emerin/Man1) domain found in emerin, lamina-associated polypeptide 2 (LAP2), inner nuclear membrane protein Man1 and similar proteins; The family corresponds to a group of inner nuclear membrane proteins containing LEM domain. Emerin occurs in four phosphorylated forms and plays a role in cell cycle-dependent events. It is absent from the inner nuclear membrane in most patients with X-linked muscular dystrophy. Emerin interacts with A-type and B-type lamins. 
Man1, also termed LEM domain-containing protein 3 (LEMD3) is an integral protein of the inner nuclear membrane that binds to nuclear lamins and emerin, thus playing a role in nuclear organization. LAP2, also termed thymopoietin (TP), or thymopoietin-related peptide (TPRP), is composed of isoform alpha and isoforms beta/gamma and may be involved in chromatin organization and post-mitotic reassembly. Some LAP2 isoforms are inner nuclear membrane proteins that can bind to nuclear lamins and chromatin, while others are non-membrane nuclear polypeptides. This family also contains LEM domain-containing protein LEMP-1 and LEM2. LEMP-1, also termed cancer/testis antigen 50 (CT50), is encoded by LEMD1, a novel testis-specific gene expressed in colorectal cancers. LEMP-1 may function as a cancer-testis antigen for immunotherapy of colorectal carcinoma (CRC). LEM2, also termed LEMD2, is a novel Man1-related ubiquitously expressed inner nuclear membrane protein required for normal nuclear envelope morphology. Association with lamin A is required for its proper nuclear envelope localization while its binding to lamin C plays an important role in the organization of lamin A/C complexes. Some uncharacterized LEM domain-containing proteins are also included in this family. Unlike other family members, these harbor an ankyrin repeat region that may mediate protein-protein interactions." Q#15984 - CGI_10023377 superfamily 241665 275 662 9.80E-89 284.255 cl00183 AGE superfamily - - "AGE domain; N-acyl-D-glucosamine 2-epimerase domain; Responsible for intermediate epimerization during biosynthesis of N-acetylneuraminic acid. Catalytic mechanism is believed to be via nucleotide elimination and readdition and is ATP modulated. AGE is structurally and mechanistically distinct from the other four types of epimerases. The AGE domain monomer is composed of an alpha(6)/alpha(6)-barrel, the structure of which is also found in glucoamylase and cellulase. The active form is a homodimer. 
The alignment also contains subtype III mannose 6-phosphate isomerases." Q#15985 - CGI_10023378 superfamily 217473 231 374 6.85E-15 73.1681 cl03978 Mab-21 superfamily N - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#15986 - CGI_10023379 superfamily 241617 186 251 3.85E-19 83.5761 cl00110 MBD superfamily - - "MeCP2, MBD1, MBD2, MBD3, MBD4, CLLD8-like, and BAZ2A-like proteins constitute a family of proteins that share the methyl-CpG-binding domain (MBD). The MBD consists of about 70 residues and is defined as the minimal region required for binding to methylated DNA by a methyl-CpG-binding protein which binds specifically to methylated DNA. The MBD can recognize a single symmetrically methylated CpG either as naked DNA or within chromatin. MeCP2, MBD1 and MBD2 (and likely MBD3) form complexes with histone deacetylase and are involved in histone deacetylase-dependent repression of transcription. MBD4 is an endonuclease that forms a complex with the DNA mismatch-repair protein MLH1. The MBDs present in putative chromatin remodelling subunit, BAZ2A, and putative histone methyltransferase, CLLD8, represent two phylogenetically distinct groups within the MBD protein family." 
Q#15986 - CGI_10023379 superfamily 246713 707 807 3.52E-05 43.1142 cl14786 ENDO3c superfamily C - "endonuclease III; includes endonuclease III (DNA-(apurinic or apyrimidinic site) lyase), alkylbase DNA glycosidases (Alka-family) and other DNA glycosidases" Q#15987 - CGI_10023380 superfamily 243092 24 296 4.89E-33 128.607 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#15988 - CGI_10023381 superfamily 244913 1 441 2.53E-157 458.204 cl08327 Glyco_hydro_47 superfamily - - "Glycosyl hydrolase family 47; Members of this family are alpha-mannosidases that catalyze the hydrolysis of the terminal 1,2-linked alpha-D-mannose residues in the oligo-mannose oligosaccharide Man(9)(GlcNAc)(2)." 
Q#15989 - CGI_10023382 superfamily 247068 69 130 3.90E-05 43.8414 cl15786 CA_like superfamily N - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#15989 - CGI_10023382 superfamily 245847 1368 1511 3.01E-06 47.7321 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#15990 - CGI_10023383 superfamily 245201 375 636 5.83E-122 366.798 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#15990 - CGI_10023383 superfamily 247038 48 155 3.08E-18 81.3093 cl15674 IPT superfamily - - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. 
They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." Q#15991 - CGI_10023384 superfamily 247042 41 488 2.13E-57 199.094 cl15693 Sema superfamily - - "The Sema domain, a protein interacting module, of semaphorins and plexins; Both semaphorins and plexins have a Sema domain on their N-termini. Plexins function as receptors for the semaphorins. Evolutionarily, plexins may be the ancestor of semaphorins. Semaphorins are regulatory molecules in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems, and cancer. Semaphorins can be divided into 7 classes. Vertebrates have members in classes 3-7, whereas classes 1 and 2 are known only in invertebrates. Class 2 and 3 semaphorins are secreted; classes 1 and 4 through 6 are transmembrane proteins; and class 7 is membrane associated via glycosylphosphatidylinositol (GPI) linkage. Plexins are a large family of transmembrane proteins, which are divided into four types (A-D) according to sequence similarity. In vertebrates, type A plexins serve as co-receptors for neuropilins to mediate the signalling of class 3 semaphorins. Plexins serve as direct receptors for several other members of the semaphorin family: class 6 semaphorins signal through type A plexins and class 4 semaphorins through type B plexins. This family also includes the MET and RON receptor tyrosine kinases. 
The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves to recognize and bind receptors." Q#15991 - CGI_10023384 superfamily 247038 542 586 2.56E-11 60.3157 cl15674 IPT superfamily C - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." Q#15991 - CGI_10023384 superfamily 243104 491 535 3.03E-09 53.7029 cl02601 PSI superfamily - - "Plexin repeat; A cysteine rich repeat found in several different extracellular receptors. The function of the repeat is unknown. Three copies of the repeat are found Plexin. Two copies of the repeat are found in mahogany protein. A related C. elegans protein contains four copies of the repeat. The Met receptor contains a single copy of the repeat. The Pfam alignment shows 6 conserved cysteine residues that may form three conserved disulphide bridges, whereas shows 8 conserved cysteines. The pattern of conservation suggests that cysteines 5 and 7 (that are not absolutely conserved) form a disulphide bridge (Personal observation. A Bateman)." Q#15992 - CGI_10023385 superfamily 247939 125 278 6.61E-05 40.8132 cl17385 DUF442 superfamily C - Putative phosphatase (DUF442); Although this domain is uncharacterized it seems likely that it performs a phosphatase function. 
Q#15993 - CGI_10023386 superfamily 243096 450 634 9.35E-21 91.2052 cl02571 RhoGEF superfamily - - Guanine nucleotide exchange factor for Rho/Rac/Cdc42-like GTPases; Also called Dbl-homologous (DH) domain. It appears that PH domains invariably occur C-terminal to RhoGEF/DH domains. Q#15993 - CGI_10023386 superfamily 247725 632 813 3.30E-29 115.738 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting proteins to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and other proteins." Q#15994 - CGI_10023387 superfamily 241578 598 785 1.37E-58 201.854 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. 
Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#15994 - CGI_10023387 superfamily 219821 476 586 7.02E-31 119.783 cl07136 VWA_N superfamily - - "VWA N-terminal; This domain is found at the N-terminus of proteins containing von Willebrand factor type A (VWA, pfam00092) and Cache (pfam02743) domains. It has been found in vertebrates, Drosophila and C. elegans but has not yet been identified in other eukaryotes. It is probably involved in the function of some voltage-dependent calcium channel subunits." Q#15994 - CGI_10023387 superfamily 217211 866 924 6.66E-10 57.6793 cl03691 Cache_1 superfamily N - Cache domain; Cache domain. Q#15994 - CGI_10023387 superfamily 117050 937 1007 3.60E-09 55.8842 cl07190 VGCC_alpha2 superfamily C - "Neuronal voltage-dependent calcium channel alpha 2acd; This eukaryotic domain has been found in the neuronal voltage-dependent calcium channel (VGCC) alpha 2a, 2c, and 2d subunits. It is also found in other calcium channel alpha-2 delta subunits to the N-terminus of a Cache domain (pfam02743)." Q#15995 - CGI_10023388 superfamily 248275 167 190 0.00175764 36.0188 cl17721 zf-C2H2_jaz superfamily - - "Zinc-finger double-stranded RNA-binding; This domain family is found in archaea and eukaryotes, and is approximately 30 amino acids in length. The mammalian members of this group occur multiple times along the protein, joined by flexible linkers, and are referred to as JAZ - dsRNA-binding ZF protein - zinc-fingers. The JAZ proteins are expressed in all tissues tested and localise in the nucleus, particularly the nucleolus. JAZ preferentially binds to double-stranded (ds) RNA or RNA/DNA hybrids rather than DNA. In addition to binding double-stranded RNA, these zinc-fingers are required for nucleolar localisation." 
Q#15996 - CGI_10023389 superfamily 241884 8 220 6.16E-149 418.279 cl00467 Ntn_hydrolase superfamily - - "The Ntn hydrolases (N-terminal nucleophile) are a diverse superfamily of enzymes that are activated autocatalytically via an N-terminally located nucleophilic amino acid. N-terminal nucleophile (NTN-) hydrolase superfamily, which contains a four-layered alpha, beta, beta, alpha core structure. This family of hydrolases includes penicillin acylase, the 20S proteasome alpha and beta subunits, and glutamate synthase. The mechanism of activation of these proteins is conserved, although they differ in their substrate specificities. All known members catalyze the hydrolysis of amide bonds in either proteins or small molecules, and each one of them is synthesized as a preprotein. For each, an autocatalytic endoproteolytic process generates a new N-terminal residue. This mature N-terminal residue is central to catalysis and acts as both a polarizing base and a nucleophile during the reaction. The N-terminal amino group acts as the proton acceptor and activates either the nucleophilic hydroxyl in a Ser or Thr residue or the nucleophilic thiol in a Cys residue. The position of the N-terminal nucleophile in the active site and the mechanism of catalysis are conserved in this family, despite considerable variation in the protein sequences." Q#15998 - CGI_10023391 superfamily 245847 1506 1629 7.20E-10 58.9029 cl12042 FA58C superfamily N - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#15999 - CGI_10023392 superfamily 245213 9 44 1.13E-10 53.7946 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#15999 - CGI_10023392 superfamily 245847 48 194 9.78E-15 67.5817 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#16000 - CGI_10023393 superfamily 248247 130 290 1.66E-23 101.527 cl17693 Integrin_beta superfamily N - "Integrin, beta chain; Integrins have been found in animals and their homologues have also been found in cyanobacteria, probably due to horizontal gene transfer. The sequences repeats have been trimmed due to an overlap with EGF." Q#16000 - CGI_10023393 superfamily 248247 18 132 2.51E-10 61.081 cl17693 Integrin_beta superfamily C - "Integrin, beta chain; Integrins have been found in animals and their homologues have also been found in cyanobacteria, probably due to horizontal gene transfer. The sequences repeats have been trimmed due to an overlap with EGF." Q#16000 - CGI_10023393 superfamily 219677 300 327 0.00925446 34.3356 cl18521 EGF_2 superfamily - - EGF-like domain; This family contains EGF domains found in a variety of extracellular proteins. 
Q#16001 - CGI_10004341 superfamily 247792 18 68 1.26E-06 46.2848 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#16001 - CGI_10004341 superfamily 220605 252 306 0.00951008 37.3403 cl10853 Med17 superfamily NC - Subunit 17 of Mediator complex; This Mediator complex subunit was formerly known as Srb4 in yeasts or Trap80 in Drosophila and human. The Med17 subunit is located within the head domain and is essential for cell viability to the extent that a mutant strain of cerevisiae lacking it shows all RNA polymerase II-dependent transcription ceasing at non-permissive temperatures. Q#16004 - CGI_10004344 superfamily 246954 34 578 1.20E-105 331.513 cl15415 Sec1 superfamily - - Sec1 family; Sec1 family. Q#16005 - CGI_10004345 superfamily 243103 15 56 0.00226873 35.9248 cl02600 HTH_MerR-SF superfamily N - "Helix-Turn-Helix DNA binding domain of transcription regulators from the MerR superfamily; Helix-turn-helix (HTH) transcription regulator MerR superfamily, N-terminal domain. The MerR family transcription regulators have been shown to mediate responses to stress including exposure to heavy metals, drugs, or oxygen radicals in eubacterial and some archaeal species. They regulate transcription of multidrug/metal ion transporter genes and oxidative stress regulons by reconfiguring the spacer between the -35 and -10 promoter elements. 
A typical MerR regulator is comprised of two distinct domains that harbor the regulatory (effector-binding) site and the active (DNA-binding) site. Their N-terminal domains are homologous and contain a DNA-binding winged HTH motif, while the C-terminal domains are often dissimilar and bind specific coactivator molecules such as metal ions, drugs, and organic substrates." Q#16006 - CGI_10004346 superfamily 243176 7 552 0 893.959 cl02777 chaperonin_like superfamily - - "chaperonin_like superfamily. Chaperonins are involved in productive folding of proteins. They share a common general morphology, a double toroid of 2 stacked rings, each composed of 7-9 subunits. There are 2 main chaperonin groups. The symmetry of type I is seven-fold and they are found in eubacteria (GroEL) and in organelles of eubacterial descent (hsp60 and RBP). The symmetry of type II is eight- or nine-fold and they are found in archaea (thermosome), thermophilic bacteria (TF55) and in the eukaryotic cytosol (CTT). Their common function is to sequester nonnative proteins inside their central cavity and promote folding by using energy derived from ATP hydrolysis. This superfamily also contains related domains from Fab1-like phosphatidylinositol 3-phosphate (PtdIns3P) 5-kinases that only contain the intermediate and apical domains." 
Q#16008 - CGI_10009050 superfamily 243034 32 59 0.00799382 31.2628 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#16009 - CGI_10009051 superfamily 247724 183 388 1.30E-56 186.546 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#16009 - CGI_10009051 superfamily 204472 8 88 9.10E-33 120.195 cl11057 TrmE_N superfamily N - "GTP-binding protein TrmE N-terminus; This family represents the shorter, B, chain of the homo-dimeric structure which is a guanine nucleotide-binding protein that binds and hydrolyses GTP. TrmE is homologous to the tetrahydrofolate-binding domain of N,N-dimethylglycine oxidase and indeed binds formyl-tetrahydrofolate. TrmE actively participates in the formylation reaction of uridine and regulates the ensuing hydrogenation reaction of a Schiff's base intermediate. This B chain is the N-terminal portion of the protein consisting of five beta-strands and three alpha helices and is necessary for mediating dimer formation within the protein." Q#16009 - CGI_10009051 superfamily 204989 396 464 4.27E-15 70.2405 cl14994 GTPase_Cys_C superfamily - - "Catalytic cysteine-containing C-terminus of GTPase, MnmE; This short C-terminal region contains the only cysteine present in these proteins. It is proposed that MnmE is a tRNA-modifying enzyme and that Cys-451 functions as a catalytic residue in the modification reaction." 
Q#16010 - CGI_10009052 superfamily 241563 58 98 5.29E-06 44.0072 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#16011 - CGI_10009053 superfamily 246910 10 85 7.98E-46 150.846 cl15257 GIY-YIG_SF superfamily - - "GIY-YIG nuclease domain superfamily; The GIY-YIG nuclease domain superfamily includes a large and diverse group of proteins involved in many cellular processes, such as class I homing GIY-YIG family endonucleases, prokaryotic nucleotide excision repair proteins UvrC and Cho, type II restriction enzymes, the endonuclease/reverse transcriptase of eukaryotic retrotransposable elements, and a family of eukaryotic enzymes that repair stalled replication forks. All of these members contain a conserved GIY-YIG nuclease domain that may serve as a scaffold for the coordination of a divalent metal ion required for catalysis of the phosphodiester bond cleavage. By combining with different specificity, targeting, or other domains, the GIY-YIG nucleases may perform different functions." Q#16012 - CGI_10009054 superfamily 215866 17 156 9.75E-41 142.464 cl18349 Arrestin_N superfamily - - "Arrestin (or S-antigen), N-terminal domain; Ig-like beta-sandwich fold. Scop reports duplication with C-terminal domain." Q#16012 - CGI_10009054 superfamily 243212 178 305 7.92E-27 103.963 cl02844 Arrestin_C superfamily - - "Arrestin (or S-antigen), C-terminal domain; Ig-like beta-sandwich fold. Scop reports duplication with N-terminal domain." Q#16013 - CGI_10009055 superfamily 242889 309 410 5.44E-24 95.3625 cl02111 PCI superfamily - - "PCI domain; This domain has also been called the PINT motif (Proteasome, Int-6, Nip-1 and TRIP-15)." 
Q#16014 - CGI_10009056 superfamily 220612 33 96 2.01E-28 99.5222 cl10867 DDA1 superfamily - - "Det1 complexing ubiquitin ligase; DDA1 (De-etiolated 1, Damaged DNA binding protein 1 associated 1) protein binds strongly with DDB1 and Det1 forming a DDD complex which is part of the ubiquitin conjugation system." Q#16015 - CGI_10009057 superfamily 177822 2 213 2.04E-12 63.7857 cl18088 PLN02164 superfamily N - sulfotransferase Q#16017 - CGI_10009059 superfamily 241547 96 323 4.98E-96 286.486 cl00012 alpha_CA superfamily - - "Carbonic anhydrase alpha (vertebrate-like) group. Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionary distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidine residues and a fourth conserved histidine plays a potential role in proton transfer." Q#16018 - CGI_10009060 superfamily 241547 89 317 1.02E-89 270.308 cl00012 alpha_CA superfamily - - "Carbonic anhydrase alpha (vertebrate-like) group. Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. 
They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionary distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidine residues and a fourth conserved histidine plays a potential role in proton transfer." Q#16021 - CGI_10000541 superfamily 241750 5 183 3.01E-125 365.537 cl00281 metallo-dependent_hydrolases superfamily N - "Superfamily of metallo-dependent hydrolases (also called amidohydrolase superfamily) is a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. The family includes urease alpha, adenosine deaminase, phosphotriesterase dihydroorotases, allantoinases, hydantoinases, AMP-, adenine and cytosine deaminases, imidazolonepropionase, aryldialkylphosphatase, chlorohydrolases, formylmethanofuran dehydrogenases and others." Q#16023 - CGI_10000738 superfamily 248009 15 112 9.67E-27 106.915 cl17455 UPF0027 superfamily N - Uncharacterized protein family UPF0027; Uncharacterized protein family UPF0027. Q#16024 - CGI_10000640 superfamily 248054 36 88 1.76E-09 52.8596 cl17500 NAD_binding_8 superfamily N - NAD(P)-binding Rossmann-like domain; NAD(P)-binding Rossmann-like domain. 
Q#16025 - CGI_10000754 superfamily 245205 42 123 5.56E-08 47.6177 cl09930 RPA_2b-aaRSs_OBF_like superfamily - - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." Q#16026 - CGI_10004553 superfamily 220627 13 43 4.90E-07 44.1382 cl10886 Telomere_reg-2 superfamily N - Telomere length regulation protein; This family is the central conserved 110 amino acid region of a group of proteins called telomere-length regulation or clock abnormal protein-2 which are conserved from plants to humans. The full-length protein regulates telomere length and contributes to silencing of sub-telomeric regions. In vitro the protein binds to telomeric DNA repeats. 
Q#16027 - CGI_10004554 superfamily 242184 1 27 5.54E-07 42.8861 cl00909 Ribosomal_L24e_L24 superfamily N - "Ribosomal protein L24e/L24 is a ribosomal protein found in eukaryotes (L24) and in archaea (L24e, distinct from archaeal L24). L24e/L24 is located on the surface of the large subunit, adjacent to proteins L14 and L3, and near the translation factor binding site. L24e/L24 appears to play a role in the kinetics of peptide synthesis, and may be involved in interactions between the large and small subunits, either directly or through other factors. In mouse, a deletion mutation in L24 has been identified as the cause for the belly spot and tail (Bst) mutation that results in disrupted pigmentation, somitogenesis and retinal cell fate determination. L24 may be an important protein in eukaryotic reproduction: in shrimp, L24 expression is elevated in the ovary, suggesting a role in oogenesis, and in Arabidopsis, L24 has been proposed to have a specific function in gynoecium development. No protein with sequence or structural homology to L24e/L24 has been identified in bacteria, but a functionally equivalent protein may exist. Bacterial L19 forms an interprotein beta sheet with L14 that is similar to the L24e/L14 interprotein beta sheet observed in the archaeal L24e structures. Some eukaryotic L24 proteins were initially identified as L30, and this alignment model contains several sequences called L30." Q#16030 - CGI_10004557 superfamily 221130 1 93 4.09E-17 71.1527 cl13043 ARL2_Bind_BART superfamily C - The ARF-like 2 binding protein BART; BART binds specifically to ARL2.GTP with a high affinity however it does not bind to ARL2.GDP. It is thought that this specific interaction is due to BART being the first identified ARL2-specific effector. The function is not completely characterized. BART is predominantly cytosolic but can also be found to be associated with mitochondria. BART is also involved in binding to the adenine nucleotide transporter ANT1. 
Q#16031 - CGI_10001008 superfamily 241619 34 80 2.13E-05 39.1025 cl00112 PAN_APPLE superfamily C - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#16035 - CGI_10010119 superfamily 246713 11 32 0.000806372 32.7612 cl14786 ENDO3c superfamily N - "endonuclease III; includes endonuclease III (DNA-(apurinic or apyrimidinic site) lyase), alkylbase DNA glycosidases (Alka-family) and other DNA glycosidases" Q#16036 - CGI_10010120 superfamily 241750 19 261 2.21E-19 83.871 cl00281 metallo-dependent_hydrolases superfamily - - "Superfamily of metallo-dependent hydrolases (also called amidohydrolase superfamily) is a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. The family includes urease alpha, adenosine deaminase, phosphotriesterase dihydroorotases, allantoinases, hydantoinases, AMP-, adenine and cytosine deaminases, imidazolonepropionase, aryldialkylphosphatase, chlorohydrolases, formylmethanofuran dehydrogenases and others." Q#16038 - CGI_10010122 superfamily 241600 17 97 1.17E-26 98.4666 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. 
Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#16040 - CGI_10001477 superfamily 241568 147 172 0.00292366 35.1312 cl00043 CCP superfamily N - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#16040 - CGI_10001477 superfamily 245847 184 329 1.55E-21 88.7677 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." 
Q#16040 - CGI_10001477 superfamily 241619 16 83 0.00280458 35.2505 cl00112 PAN_APPLE superfamily - - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#16041 - CGI_10001121 superfamily 217293 26 224 1.00E-38 139.305 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#16041 - CGI_10001121 superfamily 202474 231 304 0.00269032 37.6333 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#16042 - CGI_10001122 superfamily 248012 101 236 7.91E-27 101.631 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#16043 - CGI_10001523 superfamily 247999 667 717 1.12E-06 46.4362 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#16049 - CGI_10001980 superfamily 245213 154 189 2.02E-09 51.8686 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#16049 - CGI_10001980 superfamily 245213 116 151 1.83E-08 49.1722 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#16049 - CGI_10001980 superfamily 245847 195 261 1.31E-13 65.6557 cl12042 FA58C superfamily C - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#16051 - CGI_10001987 superfamily 247724 13 77 2.05E-30 109.127 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. 
This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#16051 - CGI_10001987 superfamily 247724 73 106 2.72E-11 57.1246 cl17170 Ras_like_GTPase superfamily N - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. 
The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#16056 - CGI_10002166 superfamily 243134 27 149 1.72E-36 128.151 cl02663 Fasciclin superfamily - - "Fasciclin domain; This extracellular domain is found repeated four times in grasshopper fasciclin I as well as in proteins from mammals, sea urchins, plants, yeast and bacteria." Q#16056 - CGI_10002166 superfamily 243134 163 285 3.25E-36 127.38 cl02663 Fasciclin superfamily - - "Fasciclin domain; This extracellular domain is found repeated four times in grasshopper fasciclin I as well as in proteins from mammals, sea urchins, plants, yeast and bacteria." Q#16057 - CGI_10002167 superfamily 248312 35 128 0.000984737 36.5625 cl17758 PMP22_Claudin superfamily C - PMP-22/EMP/MP20/Claudin family; PMP-22/EMP/MP20/Claudin family. 
Q#16058 - CGI_10002807 superfamily 247792 314 367 4.39E-05 40.892 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#16060 - CGI_10026085 superfamily 247743 53 178 0.00154039 37.9496 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." 
Q#16062 - CGI_10026087 superfamily 247792 94 139 3.55E-10 52.0628 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#16064 - CGI_10026089 superfamily 241976 1 110 3.19E-28 101.065 cl00606 Archease superfamily N - "Archease protein family (MTH1598/TM1083); This archease family of proteins has two SHS2 domains, with one inserted into another. It is predicted to be an enzyme. It is predicted to act as a chaperone in DNA/RNA metabolism." Q#16066 - CGI_10026091 superfamily 247725 48 131 5.89E-34 128.174 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting proteins to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and other proteins." Q#16066 - CGI_10026091 superfamily 247725 180 229 7.27E-10 58.0234 cl17171 PH-like superfamily C - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. 
All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting proteins to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and other proteins." Q#16067 - CGI_10026092 superfamily 246680 35 120 1.27E-19 83.142 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#16067 - CGI_10026092 superfamily 248012 179 312 7.86E-12 62.3408 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#16072 - CGI_10026097 superfamily 248458 67 451 2.48E-40 149.001 cl17904 MFS superfamily - - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. 
Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#16073 - CGI_10026098 superfamily 241874 29 589 0 577.208 cl00456 SLC5-6-like_sbd superfamily - - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. 
NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#16074 - CGI_10026099 superfamily 246680 24 107 4.19E-18 77.7492 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#16074 - CGI_10026099 superfamily 248012 172 282 9.78E-11 58.4888 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#16075 - CGI_10026100 superfamily 246680 24 105 2.23E-14 63.882 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. 
Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#16076 - CGI_10026101 superfamily 248012 22 132 2.25E-10 55.022 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#16077 - CGI_10026102 superfamily 245847 29 99 0.000948741 38.6918 cl12042 FA58C superfamily C - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#16079 - CGI_10026105 superfamily 247905 223 362 1.78E-19 84.2116 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#16079 - CGI_10026105 superfamily 247805 37 192 1.12E-15 73.9108 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 
Q#16080 - CGI_10026106 superfamily 245226 319 421 3.81E-11 61.5477 cl10012 DnaQ_like_exo superfamily C - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#16080 - CGI_10026106 superfamily 248264 585 626 9.66E-05 41.839 cl17710 DDE_4 superfamily C - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. 
The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#16082 - CGI_10026108 superfamily 242406 1 62 0.00106883 35.6449 cl01271 DUF1768 superfamily N - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. Q#16083 - CGI_10026109 superfamily 218200 114 343 5.25E-73 235.339 cl04660 Glyco_transf_54 superfamily - - "N-Acetylglucosaminyltransferase-IV (GnT-IV) conserved region; The complex-type of oligosaccharides are synthesised through elongation by glycosyltransferases after trimming of the precursor oligosaccharides transferred to proteins in the endoplasmic reticulum. N-Acetylglucosaminyltransferases (GnTs) take part in the formation of branches in the biosynthesis of complex-type sugar chains. In vertebrates, six GnTs, designated as GnT-I to -VI, which catalyze the transfer of GlcNAc to the core mannose residues of Asn-linked sugar chains, have been identified. GnT-IV (EC:2.4.1.145) catalyzes the transfer of GlcNAc from UDP-GlcNAc to the GlcNAc1-2Man1-3 arm of core oligosaccharide [Gn2(22)core oligosaccharide] and forms GlcNAc1-4(GlcNAc1-2)Man1-3 structure on the core oligosaccharide (Gn3(2,4,2)core oligosaccharide). In some members the conserved region occupies all but the very far N-terminus, where there is a signal sequence on all members. For other members the conserved region does not occupy the entire protein but is still to the N-terminus of the protein." 
Q#16088 - CGI_10026114 superfamily 246675 147 435 1.33E-132 390.93 cl14615 PI-PLCc_GDPD_SF superfamily - - "Catalytic domain of phosphoinositide-specific phospholipase C-like phosphodiesterases superfamily; The PI-PLC-like phosphodiesterases superfamily represents the catalytic domains of bacterial phosphatidylinositol-specific phospholipase C (PI-PLC, EC 4.6.1.13), eukaryotic phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11), glycerophosphodiester phosphodiesterases (GP-GDE, EC 3.1.4.46), sphingomyelinases D (SMases D) (sphingomyelin phosphodiesterase D, EC 3.1.4.41) from spider venom, SMases D-like proteins, and phospholipase D (PLD) from several pathogenic bacteria, as well as their uncharacterized homologs found in organisms ranging from bacteria and archaea to metazoans, plants, and fungi. PI-PLCs are ubiquitous enzymes hydrolyzing the membrane lipid phosphoinositides to yield two important second messengers, inositol phosphates and diacylglycerol (DAG). GP-GDEs play essential roles in glycerol metabolism and catalyze the hydrolysis of glycerophosphodiesters to sn-glycerol-3-phosphate (G3P) and the corresponding alcohols that are major sources of carbon and phosphate. Both, PI-PLCs and GP-GDEs, can hydrolyze the 3'-5' phosphodiester bonds in different substrates, and utilize a similar mechanism of general base and acid catalysis with conserved histidine residues, which consists of two steps, a phosphotransfer and a phosphodiesterase reaction. This superfamily also includes Neurospora crassa ankyrin repeat protein NUC-2 and its Saccharomyces cerevisiae counterpart, Phosphate system positive regulatory protein PHO81, glycerophosphodiester phosphodiesterase (GP-GDE)-like protein SHV3 and SHV3-like proteins (SVLs). 
The residues essential for enzyme activities and metal binding are not conserved in these sequence homologs, which might suggest that the function of catalytic domains in these proteins might be distinct from those in typical PLC-like phosphodiesterases." Q#16088 - CGI_10026114 superfamily 246669 461 587 6.67E-46 158.474 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#16088 - CGI_10026114 superfamily 247856 3 57 6.96E-05 40.9941 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#16088 - CGI_10026114 superfamily 150071 66 147 5.68E-17 76.4582 cl08538 efhand_like superfamily - - "Phosphoinositide-specific phospholipase C, efhand-like; Members of this family are predominantly found in phosphoinositide-specific phospholipase C. They adopt a structure consisting of a core of four alpha helices, in an EF like fold, and are required for functioning of the enzyme." Q#16089 - CGI_10026115 superfamily 247725 41 135 1.86E-15 68.1097 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#16090 - CGI_10026116 superfamily 247725 789 923 3.60E-60 205.576 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#16090 - CGI_10026116 superfamily 247725 1837 1933 5.15E-49 172.096 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. 
All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#16090 - CGI_10026116 superfamily 241566 1734 1783 1.50E-16 76.7619 cl00040 C1 superfamily - - "Protein kinase C conserved region 1 (C1) . Cysteine-rich zinc binding domain. Some members of this domain family bind phorbol esters and diacylglycerol, some are reported to bind RasGTP. May occur in tandem arrangement. Diacylglycerol (DAG) is a second messenger, released by activation of Phospholipase D. Phorbol Esters (PE) can act as analogues of DAG and mimic its downstream effects in, for example, tumor promotion. Protein Kinases C are activated by DAG/PE, this activation is mediated by their N-terminal conserved region (C1). DAG/PE binding may be phospholipid dependent. C1 domains may also mediate DAG/PE signals in chimaerins (a family of Rac GTPase activating proteins), RasGRPs (exchange factors for Ras/Rap1), and Munc13 isoforms (scaffolding proteins involved in exocytosis)." Q#16090 - CGI_10026116 superfamily 221536 416 667 8.35E-104 333.222 cl13732 SBF2 superfamily - - "Myotubularin protein; This domain family is found in eukaryotes, and is approximately 220 amino acids in length. The family is found in association with pfam02141, pfam03456, pfam03455. This family is the middle region of SBF2, a member of the myotubularin family. Myotubularin-related proteins have been suggested to work in phosphoinositide-mediated signalling events that may also convey control of myelination. Mutations of SBF2 are implicated in Charcot-Marie-Tooth disease." 
Q#16090 - CGI_10026116 superfamily 245670 2 164 2.42E-56 195.854 cl11519 DENN superfamily - - DENN (AEX-3) domain; DENN (after differentially expressed in neoplastic vs normal cells) is a domain which occurs in several proteins involved in Rab- mediated processes or regulation of MAPK signalling pathways. Q#16090 - CGI_10026116 superfamily 206020 1399 1453 1.65E-27 108.364 cl18286 Y_phosphatase_m superfamily - - "Myotubularin Y_phosphatase-like; This short region is highly conserved and seems to be common to many myotubularin proteins with protein tyrosine pyrophosphate activity. As the family has a number of highly conserved residues such as histidine, cysteine, glutamine and aspartate, it is possible that this represents a catalytic core of the active enzymatic part of the proteins." Q#16090 - CGI_10026116 superfamily 219103 1069 1125 1.73E-14 72.7891 cl05893 Myotub-related superfamily C - "Myotubularin-related; This family represents a region within eukaryotic myotubularin-related proteins that is sometimes found with pfam02893. Myotubularin is a dual-specific lipid phosphatase that dephosphorylates phosphatidylinositol 3-phosphate and phosphatidylinositol (3,5)-bi-phosphate. Mutations in gene encoding myotubularin-related proteins have been associated with disease." Q#16090 - CGI_10026116 superfamily 208095 240 308 5.84E-14 69.6262 cl04084 dDENN superfamily - - dDENN domain; This region is always found associated with pfam02141. It is predicted to form a globular domain. This domain is predicted to be completely alpha helical. Although not statistically supported it has been suggested that this domain may be similar to members of the Rho/Rac/Cdc42 GEF family. 
Q#16091 - CGI_10026117 superfamily 243066 23 122 1.65E-14 70.7241 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#16092 - CGI_10026118 superfamily 243066 23 117 3.33E-12 64.1757 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." 
Q#16092 - CGI_10026118 superfamily 243066 405 475 6.82E-07 48.3825 cl02518 BTB superfamily C - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#16092 - CGI_10026118 superfamily 221377 748 881 0.000481807 40.5299 cl13449 DUF3504 superfamily - - Domain of unknown function (DUF3504); This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 156 to 173 amino acids in length. Q#16093 - CGI_10026119 superfamily 220692 203 289 2.42E-06 47.1989 cl18570 7TM_GPCR_Srw superfamily N - Serpentine type 7TM GPCR chemoreceptor Srw; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srw is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. The genes encoding Srw do not appear to be under as strong an adaptive evolutionary pressure as those of Srz. 
Q#16095 - CGI_10026121 superfamily 245206 99 377 2.30E-95 289.178 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#16096 - CGI_10026122 superfamily 147120 79 208 1.98E-31 113.623 cl04763 Cor1 superfamily - - "Cor1/Xlr/Xmr conserved region; Cor1 is a component of the chromosome core in the meiotic prophase chromosomes. Xlr is a lymphoid cell specific protein. Xlm is abundantly transcribed in testis in a tissue-specific and developmentally regulated manner. The protein is located in the nuclei of spermatocytes, early in the prophase of the first meiotic division, and later becomes concentrated in the XY nuclear subregion where it is in particular associated with the axes of sex chromosomes." Q#16097 - CGI_10026123 superfamily 241692 14 213 2.40E-72 220.996 cl00214 Aldolase_II superfamily - - "Class II Aldolase and Adducin head (N-terminal) domain. Aldolases are ubiquitous enzymes catalyzing central steps of carbohydrate metabolism. Based on enzymatic mechanisms, this superfamily has been divided into two distinct classes (Class I and II). Class II enzymes are further divided into two sub-classes A and B. 
This family includes class II A aldolases and adducins which has not been ascribed any enzymatic function. Members of this class are primarily bacterial and eukaryotic in origin and include L-fuculose-1-phosphate, L-rhamnulose-1-phosphate aldolases and L-ribulose-5-phosphate 4-epimerases. They all share the ability to promote carbon-carbon bond cleavage and stabilize enolate intermediates using divalent cations." Q#16099 - CGI_10026125 superfamily 241578 16 141 0.00800869 35.8932 cl00057 vWFA superfamily C - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#16100 - CGI_10026126 superfamily 222340 533 757 3.42E-91 289.756 cl18665 GNAT_acetyltr_2 superfamily - - GNAT acetyltransferase 2; This domain has N-acetyltransferase activity. It has a GCN5-related N-acetyltransferase (GNAT) fold. Q#16100 - CGI_10026126 superfamily 218449 293 492 2.48E-61 206.273 cl18457 Helicase_RecD superfamily - - "Helicase; This domain contains a P-loop (Walker A) motif, suggesting that it has ATPase activity, and a Walker B motif. 
In tRNA(Met) cytidine acetyltransferase (TmcA) it may function as an RNA helicase motor (driven by ATP hydrolysis) which delivers the wobble base to the active centre of the GCN5-related N-acetyltransferase (GNAT) domain. It is found in the bacterial exodeoxyribonuclease V alpha chain (RecD), which has 5'-3' helicase activity. It is structurally similar to the motor domain 1A in other SF1 helicases." Q#16100 - CGI_10026126 superfamily 149420 107 201 2.34E-43 153.45 cl07096 DUF1726 superfamily - - "Domain of unknown function (DUF1726); This domain of unknown function is often found at the N-terminus of proteins containing pfam05127. Its fold resembles that of pfam05127, but it does not appear to bind ATP." Q#16100 - CGI_10026126 superfamily 222344 780 897 1.09E-27 109.292 cl16364 tRNA_bind_2 superfamily - - "Possible tRNA binding domain; This domain, found at the C-terminus of tRNA(Met) cytidine acetyltransferase, may be involved in tRNA-binding." Q#16101 - CGI_10026127 superfamily 247743 748 907 1.74E-06 48.6815 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." 
Q#16101 - CGI_10026127 superfamily 243092 1299 1648 4.87E-13 70.4416 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#16101 - CGI_10026127 superfamily 243092 1834 2022 4.73E-07 52.3372 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#16103 - CGI_10026129 superfamily 247723 10 82 5.35E-16 69.6632 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. 
The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#16104 - CGI_10026130 superfamily 217293 62 263 5.38E-48 165.113 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#16104 - CGI_10026130 superfamily 202474 270 330 1.84E-21 91.5612 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#16105 - CGI_10026131 superfamily 241641 143 213 6.46E-07 44.7621 cl00150 TY superfamily - - Thyroglobulin type I repeats.; The N-terminal region of human thyroglobulin contains 11 type-1 repeats TY repeats are proposed to be inhibitors of cysteine proteases Q#16106 - CGI_10026132 superfamily 245864 46 182 0.00013193 40.7246 cl12078 p450 superfamily C - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. 
These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#16107 - CGI_10026133 superfamily 245864 3 291 9.16E-52 177.856 cl12078 p450 superfamily N - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#16108 - CGI_10026134 superfamily 245864 243 452 1.40E-48 174.004 cl12078 p450 superfamily N - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. 
They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#16108 - CGI_10026134 superfamily 245864 27 181 8.95E-11 62.2958 cl12078 p450 superfamily C - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. 
While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#16110 - CGI_10026136 superfamily 177822 54 230 1.67E-09 56.0817 cl18088 PLN02164 superfamily N - sulfotransferase Q#16111 - CGI_10026137 superfamily 246748 267 598 4.02E-145 428.046 cl14876 Zinc_peptidase_like superfamily - - "Zinc peptidases M18, M20, M28, and M42; Zinc peptidases play vital roles in metabolic and signaling pathways throughout all kingdoms of life. This family corresponds to several clans in the MEROPS database, including the MH clan, which contains 4 families (M18, M20, M28, M42). The peptidase M20 family includes carboxypeptidases such as the glutamate carboxypeptidase from Pseudomonas, the thermostable carboxypeptidase Ss1 of broad specificity from archaea and yeast Gly-X carboxypeptidase. The dipeptidases include bacterial dipeptidase, peptidase V (PepV), a eukaryotic, non-specific dipeptidase, and two Xaa-His dipeptidases (carnosinases). There is also the bacterial aminopeptidase, peptidase T (PepT) that acts only on tripeptide substrates and has therefore been termed a tripeptidase. Peptidase family M28 contains aminopeptidases and carboxypeptidases, and has co-catalytic zinc ions. However, several enzymes in this family utilize other first row transition metal ions such as cobalt and manganese. Each zinc ion is tetrahedrally co-ordinated, with three amino acid ligands plus activated water; one aspartate residue binds both metal ions. The aminopeptidases in this family are also called bacterial leucyl aminopeptidases, but are able to release a variety of N-terminal amino acids. IAP aminopeptidase and aminopeptidase Y preferentially release basic amino acids while glutamate carboxypeptidase II preferentially releases C-terminal glutamates. 
Glutamate carbxypeptidase II and plasma glutamate carboxypeptidase hydrolyze dipeptides. Peptidase families M18 and M42 contain metalloaminopeptidases. M18 is widely distributed in bacteria and eukaryotes. However, only yeast aminopeptidase I and mammalian aspartyl aminopeptidase have been characterized in detail. Some of M42 (also known as glutamyl aminopeptidase) enzymes exhibit aminopeptidase specificity while others also have acylaminoacylpeptidase activity (i.e. hydrolysis of acylated N-terminal residues)." Q#16111 - CGI_10026137 superfamily 245213 79 110 5.57E-07 46.861 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#16111 - CGI_10026137 superfamily 245213 40 72 5.43E-06 44.1646 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#16111 - CGI_10026137 superfamily 245213 116 145 0.000667157 38.0014 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#16113 - CGI_10026139 superfamily 247792 11 56 1.79E-08 47.4404 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#16114 - CGI_10026140 superfamily 245202 33 109 4.10E-45 145.744 cl09927 S1_like superfamily - - "S1_like: Ribosomal protein S1-like RNA-binding domain. Found in a wide variety of RNA-associated proteins. Originally identified in S1 ribosomal protein. This superfamily also contains the Cold Shock Domain (CSD), which is a homolog of the S1 domain. Both domains are members of the Oligonucleotide/oligosaccharide Binding (OB) fold." 
Q#16117 - CGI_10026143 superfamily 246671 88 231 1.07E-23 93.6416 cl14606 Reeler_cohesin_like superfamily - - "Domains similar to the eukaryotic reeler domain and bacterial cohesins; This diverse family summarizes a set of distantly related domains, as revealed by structural similarity." Q#16118 - CGI_10026144 superfamily 245596 178 478 1.91E-177 508.283 cl11394 Glyco_tranf_GTA_type superfamily - - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#16118 - CGI_10026144 superfamily 247085 491 608 9.58E-21 88.7166 cl15820 RICIN superfamily - - "Ricin-type beta-trefoil; Carbohydrate-binding domain formed from presumed gene triplication. The domain is found in a variety of molecules serving diverse functions such as enzymatic activity, inhibitory toxicity and signal transduction. Highly specific ligand binding occurs on exposed surfaces of the compact domain structure."
Q#16119 - CGI_10026145 superfamily 247723 19 99 4.45E-41 143.241 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#16119 - CGI_10026145 superfamily 247723 117 186 9.33E-31 114.804 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. 
The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#16119 - CGI_10026145 superfamily 247723 478 545 1.17E-28 108.937 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#16119 - CGI_10026145 superfamily 247723 312 380 6.25E-21 87.3816 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. 
The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#16121 - CGI_10002480 superfamily 241599 191 247 8.63E-21 84.2172 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#16122 - CGI_10006332 superfamily 243035 126 173 0.000137743 38.3698 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. 
CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#16123 - CGI_10006333 superfamily 247941 175 316 1.89E-11 60.8124 cl17387 Methyltransf_21 superfamily - - "Methyltransferase FkbM domain; This family has members from bacteria to human, and appears to be a methyltransferase." 
Q#16124 - CGI_10006334 superfamily 247684 1153 1572 4.64E-108 351.194 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#16126 - CGI_10001975 superfamily 243152 204 332 4.15E-42 144.354 cl02712 PGRP superfamily - - "Peptidoglycan recognition proteins (PGRPs) are pattern recognition receptors that bind, and in certain cases, hydrolyze peptidoglycans (PGNs) of bacterial cell walls. 
PGRPs have been divided into three classes: short PGRPs (PGRP-S), that are small (20 kDa) extracellular proteins; intermediate PGRPs (PGRP-I) that are 40-45 kDa and are predicted to be transmembrane proteins; and long PGRPs (PGRP-L), up to 90 kDa, which may be either intracellular or transmembrane. Several structures of PGRPs are known in insects and mammals, some bound with substrates like Muramyl Tripeptide (MTP) or Tracheal Cytotoxin (TCT). The substrate binding site is conserved in PGRP-LCx, PGRP-LE, and PGRP-Ialpha proteins. This family includes Zn-dependent N-Acetylmuramoyl-L-alanine Amidase, EC:3.5.1.28. This enzyme cleaves the amide bond between N-acetylmuramoyl and L-amino acids, preferentially D-lactyl-L-Ala, in bacterial cell walls. The structure for the bacteriophage T7 lysozyme shows that two of the conserved histidines and a cysteine are zinc binding residues. Site-directed mutagenesis of T7 lysozyme indicates that two conserved residues, a Tyr and a Lys, are important for amidase activity." Q#16127 - CGI_10001976 superfamily 241638 49 145 0.00206491 35.0365 cl00147 TNF superfamily - - "Tumor Necrosis Factor; TNF superfamily members include the cytokines: TNF (TNF-alpha), LT (lymphotoxin-alpha, TNF-beta), CD40 ligand, Apo2L (TRAIL), Fas ligand, and osteoprotegerin (OPG) ligand. These proteins generally have an intracellular N-terminal domain, a short transmembrane segment, an extracellular stalk, and a globular TNF-like extracellular domain of about 150 residues. They initiate apoptosis by binding to related receptors, some of which have intracellular death domains. They generally form homo- or hetero- trimeric complexes. TNF cytokines bind one elongated receptor molecule along each of three clefts formed by neighboring monomers of the trimer with ligand trimerization a requisite for receptor binding."
Q#16130 - CGI_10003122 superfamily 241600 61 191 3.84E-65 209.789 cl00085 FReD superfamily C - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#16130 - CGI_10003122 superfamily 241600 289 423 9.63E-64 206.323 cl00085 FReD superfamily C - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#16131 - CGI_10003123 superfamily 241600 7 187 1.28E-89 264.102 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. 
The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#16133 - CGI_10016789 superfamily 243064 37 108 0.00597916 33.8679 cl02512 NTR_like superfamily C - "NTR_like domain; a beta barrel with an oligosaccharide/oligonucleotide-binding fold found in netrins, complement proteins, tissue inhibitors of metalloproteases (TIMP), and procollagen C-proteinase enhancers (PCOLCE), amongst others. In netrins, the domain plays a role in controlling axon branching in neural development, while the common function of these modules in TIMPs appears to be binding to metzincins. A subset of this family is also known as the C345C domain because it occurs as a C-terminal domain in complement C3, C4 and C5. In C5, the domain interacts with various partners during the formation of the membrane attack complex." Q#16134 - CGI_10016790 superfamily 243064 11 104 0.0061653 33.4863 cl02512 NTR_like superfamily - - "NTR_like domain; a beta barrel with an oligosaccharide/oligonucleotide-binding fold found in netrins, complement proteins, tissue inhibitors of metalloproteases (TIMP), and procollagen C-proteinase enhancers (PCOLCE), amongst others. In netrins, the domain plays a role in controlling axon branching in neural development, while the common function of these modules in TIMPs appears to be binding to metzincins. A subset of this family is also known as the C345C domain because it occurs as a C-terminal domain in complement C3, C4 and C5. In C5, the domain interacts with various partners during the formation of the membrane attack complex." 
Q#16137 - CGI_10016794 superfamily 247684 11 68 2.11E-13 62.7951 cl17037 NBD_sugar-kinase_HSP70_actin superfamily NC - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#16138 - CGI_10016795 superfamily 241889 110 258 1.24E-45 152.399 cl00474 PAP2_like superfamily - - "PAP2_like proteins, a super-family of histidine phosphatases and vanadium haloperoxidases, includes type 2 phosphatidic acid phosphatase or lipid phosphate phosphatase (LPP), Glucose-6-phosphatase, Phosphatidylglycerophosphatase B and bacterial acid phosphatase, vanadium chloroperoxidases, vanadium bromoperoxidases, and several other mostly uncharacterized subfamilies. 
Several members of this superfamily have been predicted to be transmembrane proteins." Q#16139 - CGI_10016796 superfamily 241600 2 176 1.79E-73 223.271 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#16140 - CGI_10016797 superfamily 241600 581 789 2.42E-84 269.495 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#16140 - CGI_10016797 superfamily 245205 397 464 1.24E-07 49.9289 cl09930 RPA_2b-aaRSs_OBF_like superfamily - - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. 
One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." 
Q#16143 - CGI_10016800 superfamily 243092 16 294 8.93E-86 261.501 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#16144 - CGI_10016801 superfamily 222150 189 214 3.19E-06 43.1493 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#16144 - CGI_10016801 superfamily 243091 41 136 1.38E-05 42.4792 cl02566 SET superfamily - - "SET domain; SET domains are protein lysine methyltransferase enzymes. SET domains appear to be protein-protein interaction domains. It has been demonstrated that SET domains mediate interactions with a family of proteins that display similarity with dual-specificity phosphatases (dsPTPases). A subset of SET domains have been called PR domains. These domains are divergent in sequence from other SET domains, but also appear to mediate protein-protein interaction. 
The SET domain consists of two regions known as SET-N and SET-C. SET-C forms an unusual and conserved knot-like structure of probably functional importance. Additionally to SET-N and SET-C, an insert region (SET-I) and flanking regions of high structural variability form part of the overall structure." Q#16146 - CGI_10016803 superfamily 245201 1488 1732 7.67E-47 169.725 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#16146 - CGI_10016803 superfamily 241584 1380 1469 1.96E-18 83.6999 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#16146 - CGI_10016803 superfamily 245814 505 569 1.33E-11 63.2771 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16146 - CGI_10016803 superfamily 245814 265 334 8.71E-11 60.5807 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16146 - CGI_10016803 superfamily 245814 166 235 3.05E-08 53.2619 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16146 - CGI_10016803 superfamily 245814 1169 1227 6.05E-08 52.4915 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. 
The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16146 - CGI_10016803 superfamily 245814 1877 1950 1.07E-06 48.6395 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16146 - CGI_10016803 superfamily 245814 368 431 1.76E-06 47.8691 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#16146 - CGI_10016803 superfamily 245814 1299 1373 2.69E-11 62.2132 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16146 - CGI_10016803 superfamily 245814 59 133 1.33E-06 48.6557 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16147 - CGI_10016804 superfamily 241584 44 136 4.08E-18 79.0775 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. 
FN3-like domains are also found in bacterial glycosyl hydrolases." Q#16149 - CGI_10016806 superfamily 245201 12218 12464 1.45E-56 200.926 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#16149 - CGI_10016806 superfamily 241584 8357 8451 2.44E-20 91.7891 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#16149 - CGI_10016806 superfamily 241584 11688 11774 1.95E-19 89.0927 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." 
Q#16149 - CGI_10016806 superfamily 241584 11094 11187 2.67E-19 88.7075 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#16149 - CGI_10016806 superfamily 241584 7548 7639 2.75E-19 88.7075 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#16149 - CGI_10016806 superfamily 241584 10891 10985 1.49E-18 86.3963 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." 
Q#16149 - CGI_10016806 superfamily 241584 11196 11285 1.77E-18 86.3963 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#16149 - CGI_10016806 superfamily 241584 7453 7536 2.80E-18 85.6259 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#16149 - CGI_10016806 superfamily 241584 10283 10372 4.81E-18 85.2407 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." 
Q#16149 - CGI_10016806 superfamily 241584 7852 7945 2.03E-17 83.3147 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#16149 - CGI_10016806 superfamily 241584 7143 7236 2.21E-17 83.3147 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#16149 - CGI_10016806 superfamily 241584 8156 8249 2.37E-17 82.9295 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." 
Q#16149 - CGI_10016806 superfamily 241584 7245 7336 2.48E-17 82.9295 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#16149 - CGI_10016806 superfamily 241584 7954 8044 3.46E-17 82.5443 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#16149 - CGI_10016806 superfamily 241584 9067 9156 3.53E-17 82.5443 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." 
Q#16149 - CGI_10016806 superfamily 241584 9978 10067 4.13E-17 82.5443 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#16149 - CGI_10016806 superfamily 241584 10486 10574 8.11E-17 81.3887 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#16149 - CGI_10016806 superfamily 241584 10789 10878 1.18E-16 81.0035 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." 
Q#16149 - CGI_10016806 superfamily 241584 7347 7438 1.28E-16 81.0035 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#16149 - CGI_10016806 superfamily 241584 8664 8755 1.36E-16 81.0035 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#16149 - CGI_10016806 superfamily 241584 8462 8552 1.73E-16 80.6183 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." 
Q#16149 - CGI_10016806 superfamily 241584 9675 9764 3.26E-16 79.8479 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#16149 - CGI_10016806 superfamily 241584 8763 8856 3.42E-16 79.8479 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#16149 - CGI_10016806 superfamily 241584 8965 9053 3.70E-16 79.4627 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." 
Q#16149 - CGI_10016806 superfamily 241584 11586 11679 3.89E-16 79.4627 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#16149 - CGI_10016806 superfamily 241584 8868 8957 4.37E-16 79.4627 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#16149 - CGI_10016806 superfamily 241584 9370 9459 5.37E-16 79.0775 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." 
Q#16149 - CGI_10016806 superfamily 241584 10993 11077 5.39E-16 79.0775 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#16149 - CGI_10016806 superfamily 241584 11788 11879 8.20E-16 78.6923 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#16149 - CGI_10016806 superfamily 241584 9270 9362 1.24E-15 77.9219 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." 
Q#16149 - CGI_10016806 superfamily 241584 10588 10675 1.99E-15 77.5367 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#16149 - CGI_10016806 superfamily 241584 9878 9970 2.53E-15 77.1515 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#16149 - CGI_10016806 superfamily 241584 11405 11492 2.86E-15 77.1515 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." 
Q#16149 - CGI_10016806 superfamily 241584 10188 10269 3.05E-15 76.7663 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#16149 - CGI_10016806 superfamily 241584 9580 9661 4.03E-15 76.3811 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#16149 - CGI_10016806 superfamily 241584 7759 7840 6.61E-15 75.9959 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." 
Q#16149 - CGI_10016806 superfamily 241584 11297 11384 7.22E-15 75.9959 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#16149 - CGI_10016806 superfamily 241584 8055 8144 1.48E-14 74.8403 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#16149 - CGI_10016806 superfamily 241584 10080 10170 1.09E-13 72.5291 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." 
Q#16149 - CGI_10016806 superfamily 241584 7650 7734 1.67E-13 71.7587 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#16149 - CGI_10016806 superfamily 241584 10689 10781 5.48E-13 70.2179 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#16149 - CGI_10016806 superfamily 241584 11498 11578 1.08E-12 69.4475 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." 
Q#16149 - CGI_10016806 superfamily 241584 8257 8348 3.62E-12 67.9067 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#16149 - CGI_10016806 superfamily 241584 12084 12173 9.10E-12 66.7511 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#16149 - CGI_10016806 superfamily 245814 12959 13026 1.66E-11 65.5883 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#16149 - CGI_10016806 superfamily 245814 13637 13704 6.33E-11 63.6623 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16149 - CGI_10016806 superfamily 241584 8561 8644 7.09E-11 64.0547 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#16149 - CGI_10016806 superfamily 241584 10385 10475 2.42E-10 62.5139 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." 
Q#16149 - CGI_10016806 superfamily 245814 4118 4189 4.02E-10 61.3511 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16149 - CGI_10016806 superfamily 241584 9472 9562 3.62E-09 59.0471 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#16149 - CGI_10016806 superfamily 245814 13063 13133 3.88E-09 58.6547 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. 
A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16149 - CGI_10016806 superfamily 245814 12826 12897 8.92E-09 57.4991 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16149 - CGI_10016806 superfamily 245814 4430 4498 1.62E-08 56.7287 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16149 - CGI_10016806 superfamily 245814 4339 4403 1.95E-08 56.3435 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16149 - CGI_10016806 superfamily 245814 2543 2616 8.14E-08 54.8027 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16149 - CGI_10016806 superfamily 245814 13397 13466 1.49E-07 54.0323 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16149 - CGI_10016806 superfamily 245814 4016 4082 1.57E-07 53.6471 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. 
The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16149 - CGI_10016806 superfamily 245814 3809 3879 3.40E-06 49.7951 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16149 - CGI_10016806 superfamily 245814 4626 4697 6.87E-06 49.0247 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#16149 - CGI_10016806 superfamily 245814 12736 12788 1.69E-05 47.8691 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16149 - CGI_10016806 superfamily 245814 5669 5724 9.18E-05 45.5579 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16149 - CGI_10016806 superfamily 241584 9777 9869 0.000170699 44.7947 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. 
FN3-like domains are also found in bacterial glycosyl hydrolases." Q#16149 - CGI_10016806 superfamily 241584 9169 9261 0.000170699 44.7947 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#16149 - CGI_10016806 superfamily 245814 6976 7031 0.00226403 41.3207 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16149 - CGI_10016806 superfamily 245814 5043 5107 0.00338166 40.5503 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. 
A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16149 - CGI_10016806 superfamily 245814 12007 12080 9.77E-15 74.9248 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16149 - CGI_10016806 superfamily 245814 11899 11982 9.60E-14 72.1528 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16149 - CGI_10016806 superfamily 245814 4715 4797 1.04E-13 72.1528 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16149 - CGI_10016806 superfamily 245814 2429 2512 3.11E-13 70.612 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16149 - CGI_10016806 superfamily 245814 3892 3975 2.44E-11 65.2192 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16149 - CGI_10016806 superfamily 245814 3610 3692 2.08E-10 62.5228 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. 
The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16149 - CGI_10016806 superfamily 245814 4532 4602 4.43E-10 61.0576 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16149 - CGI_10016806 superfamily 245814 4209 4292 5.40E-10 61.3672 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#16149 - CGI_10016806 superfamily 245814 13187 13270 5.40E-10 61.3672 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16149 - CGI_10016806 superfamily 245814 2679 2762 7.25E-10 60.982 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16149 - CGI_10016806 superfamily 245814 7066 7139 1.89E-09 59.5168 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. 
A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16149 - CGI_10016806 superfamily 245814 5390 5467 1.41E-07 54.0485 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16149 - CGI_10016806 superfamily 245814 5569 5632 4.06E-07 52.8929 cl11960 Ig superfamily C - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16149 - CGI_10016806 superfamily 245814 12555 12670 1.50E-06 50.9669 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16149 - CGI_10016806 superfamily 245814 3706 3787 1.84E-06 50.5817 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16149 - CGI_10016806 superfamily 245814 2910 2980 2.29E-06 50.1885 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16149 - CGI_10016806 superfamily 245814 13529 13604 2.90E-06 50.1965 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. 
The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16149 - CGI_10016806 superfamily 245814 5484 5557 1.47E-05 47.8853 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16149 - CGI_10016806 superfamily 245814 5223 5288 0.000448756 43.2629 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#16149 - CGI_10016806 superfamily 245814 5122 5199 0.000880507 42.4925 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16149 - CGI_10016806 superfamily 245814 2 55 0.00517876 40.1813 cl11960 Ig superfamily N - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16151 - CGI_10016808 superfamily 245814 6576 6647 1.38E-13 71.3663 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. 
A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16151 - CGI_10016808 superfamily 245814 3432 3503 2.68E-13 70.5959 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16151 - CGI_10016808 superfamily 245814 5793 5861 5.49E-13 69.4403 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#16151 - CGI_10016808 superfamily 243054 1921 2131 7.68E-12 68.6263 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#16151 - CGI_10016808 superfamily 245814 5683 5752 1.52E-11 65.2031 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16151 - CGI_10016808 superfamily 245814 10748 10817 2.50E-11 64.8179 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. 
A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16151 - CGI_10016808 superfamily 245814 6855 6924 3.38E-11 64.4327 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16151 - CGI_10016808 superfamily 245814 3904 3973 8.14E-11 63.2771 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16151 - CGI_10016808 superfamily 245814 8049 8115 8.86E-11 62.8919 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16151 - CGI_10016808 superfamily 245814 2384 2453 1.21E-10 62.5067 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16151 - CGI_10016808 superfamily 245814 2825 2893 2.02E-10 62.1215 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16151 - CGI_10016808 superfamily 245814 3014 3083 2.89E-10 61.7363 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. 
The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16151 - CGI_10016808 superfamily 245814 6284 6352 1.24E-09 59.8103 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16151 - CGI_10016808 superfamily 245814 6717 6785 1.26E-09 59.8103 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#16151 - CGI_10016808 superfamily 243054 588 792 2.17E-09 60.5372 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#16151 - CGI_10016808 superfamily 243054 2030 2241 2.18E-09 60.5372 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#16151 - CGI_10016808 superfamily 245814 2286 2354 4.72E-09 57.8843 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. 
A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16151 - CGI_10016808 superfamily 245814 6163 6232 6.77E-09 57.4991 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16151 - CGI_10016808 superfamily 245814 2592 2658 9.05E-09 57.1139 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#16151 - CGI_10016808 superfamily 243054 1363 1544 1.24E-08 58.226 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#16151 - CGI_10016808 superfamily 245814 5581 5650 1.69E-08 56.3435 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16151 - CGI_10016808 superfamily 245814 3672 3740 3.95E-08 55.1879 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. 
A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16151 - CGI_10016808 superfamily 245814 5220 5291 3.47E-07 52.4915 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16151 - CGI_10016808 superfamily 243054 818 1029 7.61E-07 52.8332 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#16151 - CGI_10016808 superfamily 243054 1552 1715 8.52E-07 52.8332 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing 
numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#16151 - CGI_10016808 superfamily 245814 3303 3372 2.45E-06 49.7951 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16151 - CGI_10016808 superfamily 243054 1128 1287 8.75E-06 49.7516 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#16151 - CGI_10016808 superfamily 245814 8755 8824 1.69E-05 47.4839 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16151 - CGI_10016808 superfamily 245814 5893 5975 4.40E-17 81.7828 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16151 - CGI_10016808 superfamily 245814 6027 6104 8.17E-15 75.2344 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16151 - CGI_10016808 superfamily 245814 3794 3878 1.13E-12 68.686 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. 
The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16151 - CGI_10016808 superfamily 245814 3100 3182 1.53E-12 68.3008 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16151 - CGI_10016808 superfamily 245814 2691 2770 5.67E-12 66.76 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#16151 - CGI_10016808 superfamily 245814 2469 2551 7.95E-11 63.2932 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16151 - CGI_10016808 superfamily 247069 356 459 1.42E-10 63.9018 cl15787 SEC14 superfamily - - "Sec14p-like lipid-binding domain. Found in secretory proteins, such as S. cerevisiae phosphatidylinositol transfer protein (Sec14p), and in lipid regulated proteins such as RhoGAPs, RhoGEFs and neurofibromin (NF1). SEC14 domain of Dbl is known to associate with G protein beta/gamma subunits." Q#16151 - CGI_10016808 superfamily 245814 5363 5430 1.67E-07 53.6633 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#16151 - CGI_10016808 superfamily 243054 468 691 2.47E-05 48.596 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#16151 - CGI_10016808 superfamily 245814 10539 10621 0.000428054 43.2629 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16153 - CGI_10016810 superfamily 241578 8 89 3.45E-09 52.2938 cl00057 vWFA superfamily C - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#16155 - CGI_10003058 superfamily 241631 666 850 5.30E-83 272.174 cl00136 Sec7 superfamily - - Sec7 domain; Domain named after the S. cerevisiae SEC7 gene product. The Sec7 domain is the central domain of the guanine-nucleotide-exchange factors (GEFs) of the ADP-ribosylation factor family of small GTPases (ARFs). It carries the exchange factor activity. Q#16155 - CGI_10003058 superfamily 204198 1182 1267 8.32E-34 127.321 cl07820 DUF1981 superfamily - - Domain of unknown function (DUF1981); Members of this family of functionally uncharacterized domains are found in various plant and yeast protein transport proteins. Q#16156 - CGI_10003059 superfamily 246598 41 311 5.70E-142 405.483 cl13996 MPN superfamily - - "Mpr1p, Pad1p N-terminal (MPN) domains; MPN (also known as Mov34, PAD-1, JAMM, JAB, MPN+) domains are found in the N-terminal termini of proteins with a variety of functions; they are components of the proteasome regulatory subunits, the signalosome (CSN), eukaryotic translation initiation factor 3 (eIF3) complexes, and regulators of transcription factors. These domains are isopeptidases that release ubiquitin from ubiquitinated proteins (thus having deubiquitinating (DUB) activity) that are tagged for degradation. Catalytically active MPN domains contain a metalloprotease signature known as the JAB1/MPN/Mov34 metalloenzyme (JAMM) motif. 
For example, Rpn11 (also known as POH1 or PSMD14), a subunit of the 19S proteasome lid is involved in the ATP-dependent degradation of ubiquitinated proteins, contains the conserved JAMM motif involved in zinc ion coordination. Poh1 is a regulator of c-Jun, an important regulator of cell proliferation, differentiation, survival and death. JAB1 is a component of the COP9 signalosome (CSN), a regulatory particle of the ubiquitin (Ub)/26S proteasome system occurring in all eukaryotic cells; it cleaves the ubiquitin-like protein NEDD8 from the cullin subunit of the SCF (Skp1, Cullins, F-box proteins) family of E3 ubiquitin ligases. AMSH (associated molecule with the SH3 domain of STAM, also known as STAMBP), a member of JAMM/MPN+ deubiquitinases (DUBs), specifically cleaves Lys 63-linked polyubiquitin (poly-Ub) chains, thus facilitating the recycling and subsequent trafficking of receptors to the cell surface. Similarly, BRCC36, part of the nuclear complex that includes BRCA1 protein and is targeted to DNA damage foci after irradiation, specifically disassembles K63-linked polyUb. BRCC36 is aberrantly expressed in sporadic breast tumors, indicative of a potential role in the pathogenesis of the disease. Some variants of the JAB1/MPN domains lack key residues in their JAMM motif and are unable to coordinate a metal ion. Comparisons of key catalytic and metal binding residues explain why the MPN-containing proteins Mov34/PSMD7, Rpn8, CSN6, Prp8p, and the translation initiation factor 3 subunits f (p47) and h (p40) do not show catalytic isopeptidase activity. It has been proposed that the MPN domain in these proteins has a primarily structural function." Q#16161 - CGI_10012682 superfamily 247724 5 179 6.99E-92 268.913 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. 
The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#16162 - CGI_10012683 superfamily 247724 5 169 1.30E-88 260.053 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#16163 - CGI_10012684 superfamily 247724 5 179 8.48E-134 380.621 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. 
Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#16163 - CGI_10012684 superfamily 247724 180 307 9.88E-94 278.543 cl17170 Ras_like_GTPase superfamily N - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." 
Q#16164 - CGI_10012685 superfamily 243092 728 1002 1.31E-31 126.296 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#16164 - CGI_10012685 superfamily 241563 87 125 6.33E-06 44.7776 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#16165 - CGI_10012686 superfamily 238076 7 135 1.95E-64 203.805 cl18938 PAX superfamily - - Paired Box domain Q#16166 - CGI_10012687 superfamily 216290 157 266 7.75E-16 71.1654 cl03089 Cu2_monooxygen superfamily - - "Copper type II ascorbate-dependent monooxygenase, N-terminal domain; The N and C-terminal domains of members of this family adopt the same PNGase F-like fold." 
Q#16167 - CGI_10012688 superfamily 221377 257 300 7.31E-05 41.6855 cl13449 DUF3504 superfamily C - Domain of unknown function (DUF3504); This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 156 to 173 amino acids in length. Q#16171 - CGI_10012692 superfamily 241675 66 297 1.80E-100 305.71 cl00195 SIR2 superfamily - - "SIR2 superfamily of proteins includes silent information regulator 2 (Sir2) enzymes which catalyze NAD+-dependent protein/histone deacetylation, where the acetyl group from the lysine epsilon-amino group is transferred to the ADP-ribose moiety of NAD+, producing nicotinamide and the novel metabolite O-acetyl-ADP-ribose. Sir2 proteins, also known as sirtuins, are found in all eukaryotes and many archaea and prokaryotes and have been shown to regulate gene silencing, DNA repair, metabolic enzymes, and life span. The most-studied function, gene silencing, involves the inactivation of chromosome domains containing key regulatory genes by packaging them into a specialized chromatin structure that is inaccessible to DNA-binding proteins. The oligomerization state of Sir2 appears to be organism-dependent, sometimes occurring as a monomer and sometimes as a multimer. Also included in this superfamily is a group of uncharacterized Sir2-like proteins which lack certain key catalytic residues and conserved zinc binding cysteines." Q#16172 - CGI_10012694 superfamily 152787 88 153 1.30E-11 56.8373 cl18053 V-SNARE_C superfamily - - Snare region anchored in the vesicle membrane C-terminus; Within the SNARE proteins interactions in the C-terminal half of the SNARE helix are critical to the driving of membrane fusion; whereas interactions in the N-terminal half of the SNARE domain are important for promoting priming or docking of the vesicle pfam05008. 
Q#16173 - CGI_10012695 superfamily 248318 481 533 1.53E-13 66.3053 cl17764 FYVE superfamily - - "FYVE domain; Zinc-binding domain; targets proteins to membrane lipids via interaction with phosphatidylinositol-3-phosphate, PI3P; present in Fab1, YOTB, Vac1, and EEA1;" Q#16173 - CGI_10012695 superfamily 248318 596 645 5.56E-13 64.7645 cl17764 FYVE superfamily - - "FYVE domain; Zinc-binding domain; targets proteins to membrane lipids via interaction with phosphatidylinositol-3-phosphate, PI3P; present in Fab1, YOTB, Vac1, and EEA1;" Q#16173 - CGI_10012695 superfamily 247724 61 290 1.38E-30 119.735 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." 
Q#16174 - CGI_10012696 superfamily 200724 308 395 5.05E-29 111.975 cl02186 Plus-3 superfamily - - "Plus-3 domain; This domain is about 90 residues in length and is often found associated with the pfam02213 domain. The function of this domain is uncertain. It is possible that this domain is involved in DNA binding as it has three conserved positively charged residues, hence this domain has been named the plus-3 domain. It is found in yeast Rtf1 which may be a transcription elongation factor." Q#16175 - CGI_10012697 superfamily 221913 91 252 2.05E-63 200.459 cl18626 AAA_12 superfamily - - AAA domain; This family of domains contain a P-loop motif that is characteristic of the AAA superfamily. Many of the proteins in this family are conjugative transfer proteins. Q#16176 - CGI_10012698 superfamily 222005 244 296 3.85E-08 49.658 cl18632 AAA_19 superfamily - - Part of AAA domain; Part of AAA domain. Q#16177 - CGI_10014735 superfamily 241600 61 273 4.10E-73 225.582 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#16179 - CGI_10014737 superfamily 241594 239 305 9.73E-06 44.9888 cl00077 HECTc superfamily N - "HECT domain; C-terminal catalytic domain of a subclass of Ubiquitin-protein ligase (E3). 
It binds specific ubiquitin-conjugating enzymes (E2), accepts ubiquitin from E2, transfers ubiquitin to substrate lysine side chains, and transfers additional ubiquitin molecules to the end of growing ubiquitin chains." Q#16180 - CGI_10014738 superfamily 246675 1 274 1.81E-68 217.109 cl14615 PI-PLCc_GDPD_SF superfamily - - "Catalytic domain of phosphoinositide-specific phospholipase C-like phosphodiesterases superfamily; The PI-PLC-like phosphodiesterases superfamily represents the catalytic domains of bacterial phosphatidylinositol-specific phospholipase C (PI-PLC, EC 4.6.1.13), eukaryotic phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11), glycerophosphodiester phosphodiesterases (GP-GDE, EC 3.1.4.46), sphingomyelinases D (SMases D) (sphingomyelin phosphodiesterase D, EC 3.1.4.41) from spider venom, SMases D-like proteins, and phospholipase D (PLD) from several pathogenic bacteria, as well as their uncharacterized homologs found in organisms ranging from bacteria and archaea to metazoans, plants, and fungi. PI-PLCs are ubiquitous enzymes hydrolyzing the membrane lipid phosphoinositides to yield two important second messengers, inositol phosphates and diacylglycerol (DAG). GP-GDEs play essential roles in glycerol metabolism and catalyze the hydrolysis of glycerophosphodiesters to sn-glycerol-3-phosphate (G3P) and the corresponding alcohols that are major sources of carbon and phosphate. Both, PI-PLCs and GP-GDEs, can hydrolyze the 3'-5' phosphodiester bonds in different substrates, and utilize a similar mechanism of general base and acid catalysis with conserved histidine residues, which consists of two steps, a phosphotransfer and a phosphodiesterase reaction. This superfamily also includes Neurospora crassa ankyrin repeat protein NUC-2 and its Saccharomyces cerevisiae counterpart, Phosphate system positive regulatory protein PHO81, glycerophosphodiester phosphodiesterase (GP-GDE)-like protein SHV3 and SHV3-like proteins (SVLs). 
The residues essential for enzyme activities and metal binding are not conserved in these sequence homologs, which might suggest that the function of catalytic domains in these proteins might be distinct from those in typical PLC-like phosphodiesterases." Q#16181 - CGI_10014739 superfamily 219541 1 112 1.29E-20 83.6719 cl18516 Cu-oxidase_2 superfamily N - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. Q#16182 - CGI_10014740 superfamily 219541 245 390 2.28E-19 85.2127 cl18516 Cu-oxidase_2 superfamily - - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. Q#16182 - CGI_10014740 superfamily 215896 49 120 2.85E-07 49.2156 cl18351 Cu-oxidase superfamily N - Multicopper oxidase; Many of the proteins in this family contain multiple similar copies of this plastocyanin-like domain. Q#16182 - CGI_10014740 superfamily 219542 483 568 3.71E-05 42.614 cl18517 Cu-oxidase_3 superfamily N - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. Q#16184 - CGI_10014742 superfamily 241570 271 388 3.44E-31 115.501 cl00047 CAP_ED superfamily - - "effector domain of the CAP family of transcription factors; members include CAP (or cAMP receptor protein (CRP)), which binds cAMP, FNR (fumarate and nitrate reduction), which uses an iron-sulfur cluster to sense oxygen) and CooA, a heme containing CO sensor. In all cases binding of the effector leads to conformational changes and the ability to activate transcription. Cyclic nucleotide-binding domain similar to CAP are also present in cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) and vertebrate cyclic nucleotide-gated ion-channels. 
Cyclic nucleotide-monophosphate binding domain; proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues; the best studied is the prokaryotic catabolite gene activator, CAP, where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure; three conserved glycine residues are thought to be essential for maintenance of the structural integrity of the beta-barrel; CooA is a homodimeric transcription factor that belongs to CAP family; cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain; cAPK's are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain; cGPK's are single chain enzymes that include the two copies of the domain in their N-terminal section; also found in vertebrate cyclic nucleotide-gated ion-channels" Q#16184 - CGI_10014742 superfamily 241570 159 262 4.29E-21 87.3813 cl00047 CAP_ED superfamily - - "effector domain of the CAP family of transcription factors; members include CAP (or cAMP receptor protein (CRP)), which binds cAMP, FNR (fumarate and nitrate reduction), which uses an iron-sulfur cluster to sense oxygen) and CooA, a heme containing CO sensor. In all cases binding of the effector leads to conformational changes and the ability to activate transcription. Cyclic nucleotide-binding domain similar to CAP are also present in cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) and vertebrate cyclic nucleotide-gated ion-channels. 
Cyclic nucleotide-monophosphate binding domain; proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues; the best studied is the prokaryotic catabolite gene activator, CAP, where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure; three conserved glycine residues are thought to be essential for maintenance of the structural integrity of the beta-barrel; CooA is a homodimeric transcription factor that belongs to CAP family; cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain; cAPK's are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain; cGPK's are single chain enzymes that include the two copies of the domain in their N-terminal section; also found in vertebrate cyclic nucleotide-gated ion-channels" Q#16184 - CGI_10014742 superfamily 213107 5 43 2.53E-15 69.5819 cl02594 DD_R_PKA superfamily - - "Dimerization/Docking domain of the Regulatory subunit of cAMP-dependent protein kinase and similar domains; cAMP-dependent protein kinase (PKA) is a serine/threonine kinase (STK), catalyzing the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The inactive PKA holoenzyme is a heterotetramer composed of two phosphorylated and active catalytic subunits with a dimer of regulatory (R) subunits. Activation is achieved through the binding of the important second messenger cAMP to the R subunits, which leads to the dissociation of PKA into the R dimer and two active subunits. There are two classes of R subunits, RI and RII; each exists as two isoforms (alpha and beta) from distinct genes. These functionally non-redundant R isoforms allow for specificity in PKA signaling. 
The R subunit contains an N-terminal dimerization/docking (D/D) domain, a linker with an inhibitory sequence (IS), and two c-AMP binding domains. RI and RII subunits are distinguished by their IS; RII subunits contain a phosphorylation site and are both substrates and inhibitors while RI subunits are pseudo-substrates. RI subunits require ATP and Mg ions to form a stable holoenzyme while RII subunits do not. The D/D domain dimerizes to form a four-helix bundle that serves as a docking site for A-kinase-anchoring proteins (AKAPs), which facilitates the localization of PKA to specific sites in the cell. PKA is present ubiquitously in cells and interacts with many different downstream targets. It plays a role in the regulation of diverse processes such as growth, development, memory, metabolism, gene expression, immunity, and lipolysis." Q#16185 - CGI_10014743 superfamily 110440 366 392 0.00343429 35.0761 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#16187 - CGI_10014745 superfamily 241659 153 230 1.69E-17 74.4787 cl00175 alpha-crystallin-Hsps_p23-like superfamily - - "alpha-crystallin domain (ACD) found in alpha-crystallin-type small heat shock proteins, and a similar domain found in p23 (a cochaperone for Hsp90) and in other p23-like proteins.; The alpha-crystallin-Hsps_p23-like superfamily includes the alpha-crystallin domain (ACD) of alpha-crystallin-type small heat shock proteins (sHsps) and a similar domain found in p23-like proteins. 
sHsps are small stress induced proteins with monomeric masses between 12-43 kDa, whose common feature is this ACD. sHsps are generally active as large oligomers consisting of multiple subunits, and are believed to be ATP-independent chaperones that prevent aggregation and are important in refolding in combination with other Hsps. p23 is a cochaperone of the Hsp90 chaperoning pathway. It binds Hsp90 and participates in the folding of a number of Hsp90 clients including the progesterone receptor. p23 also has a passive chaperoning activity. p23 in addition may act as the cytosolic prostaglandin E2 synthase. Included in this superfamily is the p23-like C-terminal CHORD-SGT1 (CS) domain of suppressor of G2 allele of Skp1 (Sgt1) and the p23-like domains of human butyrate-induced transcript 1 (hB-ind1), NUD (nuclear distribution) C, Melusin, and NAD(P)H cytochrome b5 (NCB5) oxidoreductase (OR)." Q#16188 - CGI_10014746 superfamily 247727 500 554 0.000960461 38.1799 cl17173 AdoMet_MTases superfamily C - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#16188 - CGI_10014746 superfamily 218498 533 683 3.78E-09 56.5548 cl18459 TRM13 superfamily N - Methyltransferase TRM13; This is a family of eukaryotic proteins which are responsible for 2'-O-methylation of tRNA at position 4. TRM13 shows no sequence similarity to other known methyltransferases. 
Q#16188 - CGI_10014746 superfamily 243175 322 374 0.000739344 38.4252 cl02776 GST_C_family superfamily N - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." 
Q#16189 - CGI_10014747 superfamily 241577 252 452 1.14E-159 452.797 cl00056 MH2 superfamily - - "C-terminal Mad Homology 2 (MH2) domain; The MH2 domain is found in the SMAD (small mothers against decapentaplegic) family of proteins and is responsible for type I receptor interactions, phosphorylation-triggered homo- and hetero-oligomerization, and transactivation. It is negatively regulated by the N-terminal MH1 domain which prevents it from forming a complex with SMAD4. The MH2 domain is multifunctional and provides SMADs with their specificity and selectivity, as well as transcriptional activity. Several transcriptional co-activators and repressors have also been reported to regulate SMAD signaling by interacting with the MH2 domain. Mutations in the MH2 domains of SMAD2 and especially SMAD4 have been detected in colorectal and other human cancers." Q#16189 - CGI_10014747 superfamily 241576 10 133 3.25E-79 243.949 cl00055 MH1 superfamily - - "N-terminal Mad Homology 1 (MH1) domain; The MH1 is a small DNA-binding domain present in SMAD (small mothers against decapentaplegic) family of proteins, which are signal transducers and transcriptional modulators that mediate multiple signaling pathways. MH1 binds to the DNA major groove in an unusual manner via a beta hairpin structure. It negatively regulates the functions of the MH2 domain, the C-terminal domain of SMAD. Receptor-regulated SMAD proteins (R-SMADs, including SMAD1, SMAD2, SMAD3, SMAD5, and SMAD9) are activated by phosphorylation by transforming growth factor (TGF)-beta type I receptors. The active R-SMAD associates with a common mediator SMAD (Co-SMAD or SMAD4) and other cofactors, which together translocate to the nucleus to regulate gene expression. The inhibitory or antagonistic SMADs (I-SMADs, including SMAD6 and SMAD7) negatively regulate TGF-beta signaling by competing with R-SMADs for type I receptor or Co-SMADs. 
MH1 domains of R-SMAD and SMAD4 contain a nuclear localization signal as well as DNA-binding activity. The activated R-SMAD/SMAD4 complex then binds with very low affinity to a DNA sequence CAGAC called SMAD-binding element (SBE) via the MH1 domain." Q#16190 - CGI_10014748 superfamily 243066 37 145 1.31E-26 103.081 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#16190 - CGI_10014748 superfamily 198867 155 259 1.12E-10 58.5068 cl06652 BACK superfamily - - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation)." 
Q#16191 - CGI_10014749 superfamily 247684 15 180 6.88E-43 158.609 cl17037 NBD_sugar-kinase_HSP70_actin superfamily C - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#16194 - CGI_10014752 superfamily 241589 11 132 9.12E-33 121.587 cl00071 GLECT superfamily - - "Galectin/galactose-binding lectin. This domain exclusively binds beta-galactosides, such as lactose, and does not require metal ions for activity. GLECT domains occur as homodimers or tandemly repeated domains. They are developmentally regulated and may be involved in differentiation, cell-cell interaction and cellular regulation." 
Q#16194 - CGI_10014752 superfamily 241589 428 538 2.95E-32 120.431 cl00071 GLECT superfamily - - "Galectin/galactose-binding lectin. This domain exclusively binds beta-galactosides, such as lactose, and does not require metal ions for activity. GLECT domains occur as homodimers or tandemly repeated domains. They are developmentally regulated and may be involved in differentiation, cell-cell interaction and cellular regulation." Q#16194 - CGI_10014752 superfamily 241589 150 278 5.14E-32 119.661 cl00071 GLECT superfamily - - "Galectin/galactose-binding lectin. This domain exclusively binds beta-galactosides, such as lactose, and does not require metal ions for activity. GLECT domains occur as homodimers or tandemly repeated domains. They are developmentally regulated and may be involved in differentiation, cell-cell interaction and cellular regulation." Q#16194 - CGI_10014752 superfamily 241589 287 416 1.60E-21 90.7711 cl00071 GLECT superfamily - - "Galectin/galactose-binding lectin. This domain exclusively binds beta-galactosides, such as lactose, and does not require metal ions for activity. GLECT domains occur as homodimers or tandemly repeated domains. They are developmentally regulated and may be involved in differentiation, cell-cell interaction and cellular regulation." Q#16196 - CGI_10003275 superfamily 222429 51 100 0.000455397 37.6053 cl18676 Myb_DNA-bind_5 superfamily N - Myb/SANT-like DNA-binding domain; This presumed domain appears to be related to other Myb/SANT like DNA binding domains. This family is greatly expanded in arthropods and higher eukaryotes. Q#16197 - CGI_10003276 superfamily 203031 96 154 6.97E-09 51.1748 cl04548 FLYWCH superfamily - - "FLYWCH zinc finger domain; Mutations in the mod(mdg4) gene have effects on variegation (PEV), the properties of insulator sequences, correct path-finding of growing nerve cells, meiotic pairing of chromosomes, and apoptosis. 
The occurrence of FLYWCH motifs in mod(mdg4) gene product and other proteins is discussed in." Q#16198 - CGI_10012924 superfamily 241691 224 268 2.31E-05 43.5414 cl00213 DNA_BRE_C superfamily N - "DNA breaking-rejoining enzymes, C-terminal catalytic domain. The DNA breaking-rejoining enzyme superfamily includes type IB topoisomerases and tyrosine recombinases that share the same fold in their catalytic domain containing six conserved active site residues. The best-studied members of this diverse superfamily include human topoisomerase I, the bacteriophage lambda integrase, the bacteriophage P1 Cre recombinase, the yeast Flp recombinase and the bacterial XerD/C recombinases. Their overall reaction mechanism is essentially identical and involves cleavage of a single strand of a DNA duplex by nucleophilic attack of a conserved tyrosine to give a 3' phosphotyrosyl protein-DNA adduct. In the second rejoining step, a terminal 5' hydroxyl attacks the covalent adduct to release the enzyme and generate duplex DNA. The enzymes differ in that topoisomerases cleave and then rejoin the same 5' and 3' termini, whereas a site-specific recombinase transfers a 5' hydroxyl generated by recombinase cleavage to a new 3' phosphate partner located in a different duplex region. Many DNA breaking-rejoining enzymes also have N-terminal domains, which show little sequence or structure similarity." Q#16199 - CGI_10012925 superfamily 241568 378 426 0.00617086 34.8584 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. 
Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#16200 - CGI_10012926 superfamily 192357 3 235 3.53E-92 279.499 cl10725 DUF2045 superfamily - - Uncharacterized conserved protein (DUF2045); This entry is the conserved 250 residues of proteins of approximately 450 amino acids. It contains several highly conserved motifs including a CVxLxxxD motif. The function is unknown. Q#16202 - CGI_10012929 superfamily 241610 3 55 8.60E-21 82.683 cl00101 KU superfamily - - BPTI/Kunitz family of serine protease inhibitors; Structure is a disulfide rich alpha+beta fold. BPTI (bovine pancreatic trypsin inhibitor) is an extensively studied model structure. Q#16203 - CGI_10012930 superfamily 247683 220 257 9.71E-16 69.2786 cl17036 SH3 superfamily C - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." 
Q#16204 - CGI_10012931 superfamily 247683 44 69 7.29E-10 49.6334 cl17036 SH3 superfamily N - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#16206 - CGI_10012933 superfamily 245598 90 201 6.18E-26 99.6611 cl11396 Patatin_and_cPLA2 superfamily C - "Patatins and Phospholipases; Patatin-like phospholipase. This family consists of various patatin glycoproteins from plants. The patatin protein accounts for up to 40% of the total soluble protein in potato tubers. Patatin is a storage protein, but it also has the enzymatic activity of a lipid acyl hydrolase, catalyzing the cleavage of fatty acids from membrane lipids. Members of this family have also been found in vertebrates. This family also includes the catalytic domain of cytosolic phospholipase A2 (PLA2; EC 3.1.1.4) hydrolyzes the sn-2-acyl ester bond of phospholipids to release arachidonic acid. At the active site, cPLA2 contains a serine nucleophile through which the catalytic mechanism is initiated. The active site is partially covered by a solvent-accessible flexible lid. 
cPLA2 displays interfacial activation as it exists in both "closed lid" and "open lid" forms." Q#16207 - CGI_10012934 superfamily 245598 1 82 7.89E-35 122.773 cl11396 Patatin_and_cPLA2 superfamily N - "Patatins and Phospholipases; Patatin-like phospholipase. This family consists of various patatin glycoproteins from plants. The patatin protein accounts for up to 40% of the total soluble protein in potato tubers. Patatin is a storage protein, but it also has the enzymatic activity of a lipid acyl hydrolase, catalyzing the cleavage of fatty acids from membrane lipids. Members of this family have also been found in vertebrates. This family also includes the catalytic domain of cytosolic phospholipase A2 (PLA2; EC 3.1.1.4) hydrolyzes the sn-2-acyl ester bond of phospholipids to release arachidonic acid. At the active site, cPLA2 contains a serine nucleophile through which the catalytic mechanism is initiated. The active site is partially covered by a solvent-accessible flexible lid. cPLA2 displays interfacial activation as it exists in both "closed lid" and "open lid" forms." Q#16208 - CGI_10012935 superfamily 242406 16 112 1.22E-12 61.0681 cl01271 DUF1768 superfamily N - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. Q#16209 - CGI_10012936 superfamily 245213 257 293 0.000400593 37.231 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. 
Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#16209 - CGI_10012936 superfamily 243061 99 198 5.31E-38 131.695 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#16209 - CGI_10012936 superfamily 243061 51 97 4.24E-06 43.8698 cl02509 SRCR superfamily N - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#16211 - CGI_10007716 superfamily 243098 100 148 4.97E-08 48.7483 cl02573 TUDOR superfamily - - "Tudor domains are found in many eukaryotic organisms and have been implicated in protein-protein interactions in which methylated protein substrates bind to these domains. For example, the Tudor domain of Survival of Motor Neuron (SMN) binds to symmetrically dimethylated arginines of arginine-glycine (RG) rich sequences found in the C-terminal tails of Sm proteins. The SMN protein is linked to spinal muscular atrophy. Another example is the tandem tudor domains of 53BP1, which bind to histone H4 specifically dimethylated at Lys20 (H4-K20me2). 53BP1 is a key transducer of the DNA damage checkpoint signal." Q#16212 - CGI_10007717 superfamily 247799 42 116 4.57E-09 50.2511 cl17245 KH-I superfamily - - "K homology RNA-binding domain, type I. KH binds single-stranded RNA or DNA. It is found in a wide variety of proteins including ribosomal proteins, transcription factors and post-transcriptional modifiers of mRNA. There are two different KH domains that belong to different protein folds, but they share a single KH motif. The KH motif is folded into a beta alpha alpha beta unit. 
In addition to the core, type II KH domains (e.g. ribosomal protein S3) include N-terminal extension and type I KH domains (e.g. hnRNP K) contain C-terminal extension." Q#16212 - CGI_10007717 superfamily 247799 128 187 2.29E-05 40.2359 cl17245 KH-I superfamily - - "K homology RNA-binding domain, type I. KH binds single-stranded RNA or DNA. It is found in a wide variety of proteins including ribosomal proteins, transcription factors and post-transcriptional modifiers of mRNA. There are two different KH domains that belong to different protein folds, but they share a single KH motif. The KH motif is folded into a beta alpha alpha beta unit. In addition to the core, type II KH domains (e.g. ribosomal protein S3) include N-terminal extension and type I KH domains (e.g. hnRNP K) contain C-terminal extension." Q#16213 - CGI_10007718 superfamily 247724 2492 2654 5.26E-47 169.439 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. 
Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#16213 - CGI_10007718 superfamily 243109 1493 1628 2.04E-24 104.298 cl02614 SPRY superfamily - - "SPRY domain; SPRY domains, first identified in the SP1A kinase of Dictyostelium and rabbit Ryanodine receptor (hence the name), are homologous to B30.2. SPRY domains have been identified in at least 11 protein families, covering a wide range of functions, including regulation of cytokine signaling (SOCS), RNA metabolism (DDX1 and hnRNP), immunity to retroviruses (TRIM5alpha), intracellular calcium release (ryanodine receptors or RyR) and regulatory and developmental processes (HERC1 and Ash2L). B30.2 also contains residues in the N-terminus that form a distinct PRY domain structure; i.e. B30.2 domain consists of PRY and SPRY subdomains. B30.2 domains comprise the C-terminus of three protein families: BTNs (receptor glycoproteins of immunoglobulin superfamily); several TRIM proteins (composed of RING/B-box/coiled-coil or RBCC core); Stonutoxin (secreted poisonous protein of the stonefish Synanceia horrida). While SPRY domains are evolutionarily ancient, B30.2 domains are a more recent adaptation where the SPRY/PRY combination is a possible component of immune defense. Mutations found in the SPRY-containing proteins have shown to cause Mediterranean fever and Opitz syndrome." Q#16213 - CGI_10007718 superfamily 243109 697 830 5.13E-11 63.4497 cl02614 SPRY superfamily - - "SPRY domain; SPRY domains, first identified in the SP1A kinase of Dictyostelium and rabbit Ryanodine receptor (hence the name), are homologous to B30.2. 
SPRY domains have been identified in at least 11 protein families, covering a wide range of functions, including regulation of cytokine signaling (SOCS), RNA metabolism (DDX1 and hnRNP), immunity to retroviruses (TRIM5alpha), intracellular calcium release (ryanodine receptors or RyR) and regulatory and developmental processes (HERC1 and Ash2L). B30.2 also contains residues in the N-terminus that form a distinct PRY domain structure; i.e. B30.2 domain consists of PRY and SPRY subdomains. B30.2 domains comprise the C-terminus of three protein families: BTNs (receptor glycoproteins of immunoglobulin superfamily); several TRIM proteins (composed of RING/B-box/coiled-coil or RBCC core); Stonutoxin (secreted poisonous protein of the stonefish Synanceia horrida). While SPRY domains are evolutionarily ancient, B30.2 domains are a more recent adaptation where the SPRY/PRY combination is a possible component of immune defense. Mutations found in the SPRY-containing proteins have been shown to cause Mediterranean fever and Opitz syndrome." Q#16217 - CGI_10006243 superfamily 245213 415 442 0.000149601 40.3126 cl09941 EGF_CA superfamily C - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#16219 - CGI_10002292 superfamily 241584 2 37 0.00529396 33.2387 cl00065 FN3 superfamily N - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. 
Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#16221 - CGI_10004705 superfamily 241563 61 96 0.00035992 39.3848 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#16223 - CGI_10003450 superfamily 218758 77 246 1.00E-27 106.477 cl05400 Noggin superfamily N - "Noggin; This family consists of the eukaryotic Noggin proteins. Noggin is a glycoprotein that binds bone morphogenetic proteins (BMPs) selectively and, when added to osteoblasts, it opposes the effects of BMPs. It has been found that noggin arrests the differentiation of stromal cells, preventing cellular maturation." Q#16225 - CGI_10003452 superfamily 218758 52 246 2.72E-55 178.895 cl05400 Noggin superfamily - - "Noggin; This family consists of the eukaryotic Noggin proteins. Noggin is a glycoprotein that binds bone morphogenetic proteins (BMPs) selectively and, when added to osteoblasts, it opposes the effects of BMPs. It has been found that noggin arrests the differentiation of stromal cells, preventing cellular maturation." Q#16226 - CGI_10003453 superfamily 245814 29 129 1.64E-10 53.6633 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16228 - CGI_10017993 superfamily 245226 19 182 4.39E-45 150.513 cl10012 DnaQ_like_exo superfamily - - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." 
Q#16229 - CGI_10017994 superfamily 247907 1620 1782 4.00E-36 136.394 cl17353 LamG superfamily - - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." Q#16229 - CGI_10017994 superfamily 247907 1401 1552 8.83E-31 120.986 cl17353 LamG superfamily - - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." Q#16229 - CGI_10017994 superfamily 247907 1131 1278 1.73E-25 105.578 cl17353 LamG superfamily - - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." 
Q#16229 - CGI_10017994 superfamily 238012 797 837 3.34E-08 52.3566 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#16229 - CGI_10017994 superfamily 241607 110 150 2.84E-07 49.5758 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chymotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. 
The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters), which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#16229 - CGI_10017994 superfamily 241607 311 344 1.95E-06 47.2646 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chymotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters), which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." 
Q#16229 - CGI_10017994 superfamily 241607 956 989 1.27E-05 44.5682 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chymotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters), which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#16229 - CGI_10017994 superfamily 241607 232 272 2.53E-05 43.7978 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chymotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. 
The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters), which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#16229 - CGI_10017994 superfamily 241607 676 708 3.05E-05 43.7978 cl00097 KAZAL_FS superfamily C - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chymotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. 
The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters), which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#16229 - CGI_10017994 superfamily 241607 175 211 0.000281677 40.7162 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chymotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. 
The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters), which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#16229 - CGI_10017994 superfamily 241607 453 493 0.00317818 37.6346 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chymotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters), which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." 
Q#16229 - CGI_10017994 superfamily 241607 276 330 0.000155764 42.0802 cl00097 KAZAL_FS superfamily C - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chymotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters), which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." 
Q#16229 - CGI_10017994 superfamily 238012 851 886 0.0020224 38.4894 cl11390 EGF_Lam superfamily C - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#16229 - CGI_10017994 superfamily 241607 578 621 0.00774717 36.5034 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chymotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. 
The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters), which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#16229 - CGI_10017994 superfamily 241607 512 552 0.00943032 36.1182 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chymotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters), which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." 
Q#16230 - CGI_10017995 superfamily 111984 1 114 2.13E-22 86.3013 cl03912 NtA superfamily - - "Agrin NtA domain; Agrin is a multidomain heparan sulphate proteoglycan, that is a key organiser for the induction of postsynaptic specialisations at the neuromuscular junction. Binding of agrin to basement membranes requires the amino terminal (NtA) domain. This region mediates high affinity interaction with the coiled-coil domain of laminins. The binding of agrin to laminins via the NtA domain is subject to tissue-specific regulation. The NtA domain-containing form of agrin is expressed in non-neuronal cells or in neurons that project to non-neuronal cells such as motor neurons. The structure of this domain is an OB-fold." Q#16231 - CGI_10017996 superfamily 219431 383 431 4.66E-10 56.6704 cl06504 zf-CW superfamily - - "CW-type Zinc Finger; This domain appears to be a zinc finger. The alignment shows four conserved cysteine residues and a conserved tryptophan. It was first identified by, and is predicted to be a "highly specialised mononuclear four-cysteine zinc finger...that plays a role in DNA binding and/or promoting protein-protein interactions in complicated eukaryotic processes including...chromatin methylation status and early embryonic development." Weak homology to pfam00628 further evidences these predictions (personal obs: C Yeats). Twelve different CW-domain-containing protein subfamilies are described, with different subfamilies being characteristic of vertebrates, higher plants and other animals in which this domain is found." Q#16231 - CGI_10017996 superfamily 243083 441 467 0.00725729 35.825 cl02554 PWWP superfamily C - "The PWWP domain, named for a conserved Pro-Trp-Trp-Pro motif, is a small domain consisting of 100-150 amino acids. The PWWP domain is found in numerous proteins that are involved in cell division, growth and differentiation. 
Most PWWP-domain proteins seem to be nuclear, often DNA-binding, proteins that function as transcription factors regulating a variety of developmental processes. The function of the PWWP domain is still not known precisely; however, based on the fact that other regions of PWWP-domain proteins are responsible for nuclear localization and DNA-binding, it is likely that the PWWP domain acts as a site for protein-protein binding interactions, influencing chromatin remodeling and thereby regulating transcriptional processes. Some PWWP-domain proteins have been linked to cancer or other diseases; some are known to function as growth factors." Q#16232 - CGI_10017997 superfamily 247727 48 103 4.11E-06 43.5727 cl17173 AdoMet_MTases superfamily N - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#16233 - CGI_10017998 superfamily 243035 40 169 2.60E-24 97.3053 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. 
Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. 
Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#16233 - CGI_10017998 superfamily 243035 319 447 4.82E-23 93.8385 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. 
Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#16234 - CGI_10017999 superfamily 247755 411 648 3.36E-142 433.889 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." 
Q#16234 - CGI_10017999 superfamily 247755 1054 1292 2.17E-132 407.696 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#16234 - CGI_10017999 superfamily 216049 734 1009 1.71E-48 175.552 cl18356 ABC_membrane superfamily - - ABC transporter transmembrane region; This family represents a unit of six transmembrane helices. Many members of the ABC transporter family (pfam00005) have two such regions. Q#16234 - CGI_10017999 superfamily 216049 116 364 1.70E-45 167.078 cl18356 ABC_membrane superfamily - - ABC transporter transmembrane region; This family represents a unit of six transmembrane helices. Many members of the ABC transporter family (pfam00005) have two such regions. Q#16238 - CGI_10018003 superfamily 217859 57 268 1.32E-120 347.726 cl04376 P34-Arc superfamily - - "Arp2/3 complex, 34 kD subunit p34-Arc; Arp2/3 protein complex has been implicated in the control of actin polymerisation in cells. The human complex consists of seven subunits which include the actin related Arp2 and Arp3, and five others referred to as p41-Arc, p34-Arc, p21-Arc, p20-Arc, and p16-Arc. This family represents the p34-Arc subunit." 
Q#16239 - CGI_10018004 superfamily 241601 759 843 0.00271963 38.1956 cl00086 HPT superfamily - - "Histidine Phosphotransfer domain, involved in signalling through a two part component systems in which an autophosphorylating histidine protein kinase serves as a phosphoryl donor to a response regulator protein; the response regulator protein is modulated by phosphorylation and dephosphorylation of a conserved aspartic acid residue; two-component proteins are abundant in most eubacteria; In E. coli there are 62 two-component proteins involved in a variety of processes such as chemotaxis, osmoregulation, metabolism and transport 1; also present in both Gram positive and Gram negative pathogenic bacteria where they regulate basic housekeeping functions and control expression of toxins and other proteins important for pathogenesis; in archaea and eukaryotes, two-component pathways constitute a very small number of all signaling systems; in fungi they mediate environmental stress responses and, in pathogenic yeast, hyphal development. In Dictyostelium and in plants, they are involved in important processes such as osmoregulation, cell growth, and differentiation; to date two-component proteins have not been identified in animals; in most prokaryotic systems, the output response is effected directly by the RR, which functions as a transcription factor while in eukaryotic systems, two-component proteins are found at the beginning of signaling pathways where they interface with more conventional eukaryotic signaling strategies such as MAP kinase and cyclic nucleotide cascades" Q#16241 - CGI_10018006 superfamily 247675 425 794 0 710.268 cl17011 Arginase_HDAC superfamily - - "Arginase-like and histone-like hydrolases; Arginase-like/histone-like hydrolase superfamily includes metal-dependent enzymes that belong to Arginase-like amidino hydrolase family and histone/histone-like deacetylase class I, II, IV family, respectively. These enzymes catalyze hydrolysis of amide bond. 
Arginases are known to be involved in control of cellular levels of arginine and ornithine, in histidine and arginine degradation and in clavulanic acid biosynthesis. Deacetylases play a role in signal transduction through histone and/or other protein modification and can repress/activate transcription of a number of different genes. They participate in different cellular processes including cell cycle regulation, DNA damage response, embryonic development, cytokine signaling important for immune response and post-translational control of the acetyl coenzyme A synthetase. Mammalian histone deacetylases are known to be involved in progression of different tumors. Specific inhibitors of mammalian histone deacetylases are an emerging class of promising novel anticancer drugs." Q#16242 - CGI_10018007 superfamily 241574 767 998 1.46E-102 322.613 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#16243 - CGI_10018008 superfamily 245213 430 458 0.000278727 39.157 cl09941 EGF_CA superfamily N - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. 
Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#16243 - CGI_10018008 superfamily 241583 124 219 1.44E-18 83.7746 cl00064 ZnMc superfamily C - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#16243 - CGI_10018008 superfamily 241571 304 421 0.000214696 40.0883 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#16244 - CGI_10018009 superfamily 241563 61 96 0.000763068 38.2292 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#16248 - CGI_10009121 superfamily 245206 44 283 4.02E-99 294.15 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. 
Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#16249 - CGI_10009122 superfamily 218540 130 428 0 510.666 cl05044 Bystin superfamily - - "Bystin; Trophinin and tastin form a cell adhesion molecule complex that potentially mediates an initial attachment of the blastocyst to uterine epithelial cells at the time of implantation. Trophinin and tastin bind to an intermediary cytoplasmic protein called bystin. Bystin may be involved in implantation and trophoblast invasion because bystin is found with trophinin and tastin in the cells at human implantation sites and also in the intermediate trophoblasts at invasion front in the placenta from early pregnancy. This family also includes the yeast protein ENP1. ENP1 is an essential protein in Saccharomyces cerevisiae and is localised in the nucleus. It is thought that ENP1 plays a direct role in the early steps of rRNA processing as enp1 defective yeast cannot synthesise 20S pre-rRNA and hence 18S rRNA, which leads to reduced formation of 40S ribosomal subunits." Q#16250 - CGI_10009123 superfamily 241583 202 411 6.79E-23 96.5305 cl00064 ZnMc superfamily - - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#16251 - CGI_10009124 superfamily 243744 155 254 2.15E-10 57.7586 cl04410 DFP superfamily N - "DNA / pantothenate metabolism flavoprotein; The DNA/pantothenate metabolism flavoprotein (EC:4.1.1.36) affects synthesis of DNA, and pantothenate metabolism." 
Q#16251 - CGI_10009124 superfamily 243744 36 70 0.00262464 37.2742 cl04410 DFP superfamily C - "DNA / pantothenate metabolism flavoprotein; The DNA/pantothenate metabolism flavoprotein (EC:4.1.1.36) affects synthesis of DNA, and pantothenate metabolism." Q#16252 - CGI_10009125 superfamily 246597 145 226 3.70E-32 120.097 cl13995 MPP_superfamily superfamily N - "metallophosphatase superfamily, metallophosphatase domain; Metallophosphatases (MPPs), also known as metallophosphoesterases, phosphodiesterases (PDEs), binuclear metallophosphoesterases, and dimetal-containing phosphoesterases (DMPs), represent a diverse superfamily of enzymes with a conserved domain containing an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. This superfamily includes: the phosphoprotein phosphatases (PPPs), Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination." Q#16252 - CGI_10009125 superfamily 246597 9 74 2.47E-24 98.5253 cl13995 MPP_superfamily superfamily C - "metallophosphatase superfamily, metallophosphatase domain; Metallophosphatases (MPPs), also known as metallophosphoesterases, phosphodiesterases (PDEs), binuclear metallophosphoesterases, and dimetal-containing phosphoesterases (DMPs), represent a diverse superfamily of enzymes with a conserved domain containing an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. 
This superfamily includes: the phosphoprotein phosphatases (PPPs), Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination." Q#16252 - CGI_10009125 superfamily 218208 305 401 3.05E-28 108.616 cl18447 CwfJ_C_1 superfamily - - Protein similar to CwfJ C-terminus 1; This region is found in the N terminus of Schizosaccharomyces pombe protein CwfJ. CwfJ is part of the Cdc5p complex involved in mRNA splicing. Q#16252 - CGI_10009125 superfamily 218207 411 500 4.28E-27 104.664 cl04666 CwfJ_C_2 superfamily - - Protein similar to CwfJ C-terminus 2; This region is found in the N terminus of Schizosaccharomyces pombe protein CwfJ. CwfJ is part of the Cdc5p complex involved in mRNA splicing. Q#16254 - CGI_10009127 superfamily 248136 102 208 9.37E-53 168.265 cl17582 Sybindin superfamily - - "Sybindin-like family; Sybindin is a physiological syndecan-2 ligand on dendritic spines, the small protrusions on the surface of dendrites that receive the vast majority of excitatory synapses." Q#16254 - CGI_10009127 superfamily 248136 3 48 0.00776808 34.2157 cl17582 Sybindin superfamily C - "Sybindin-like family; Sybindin is a physiological syndecan-2 ligand on dendritic spines, the small protrusions on the surface of dendrites that receive the vast majority of excitatory synapses." 
Q#16255 - CGI_10009128 superfamily 247792 62 104 0.000142562 35.8844 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#16256 - CGI_10009129 superfamily 241622 111 190 1.76E-15 72.2142 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." 
Q#16256 - CGI_10009129 superfamily 241622 206 286 2.38E-14 68.7474 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#16256 - CGI_10009129 superfamily 241622 326 413 1.76E-12 63.3546 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. 
This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#16256 - CGI_10009129 superfamily 241622 442 526 2.37E-11 60.273 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#16256 - CGI_10009129 superfamily 190233 24 78 4.75E-08 50.1454 cl08341 zf-TRAF superfamily - - TRAF-type zinc finger; TRAF-type zinc finger. 
Q#16258 - CGI_10009131 superfamily 247856 12 65 1.01E-09 50.2389 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#16259 - CGI_10009132 superfamily 247724 527 691 8.39E-72 232.346 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." 
Q#16260 - CGI_10009133 superfamily 190261 189 208 2.79E-06 46.0026 cl03504 RFX_DNA_binding superfamily N - RFX DNA-binding domain; RFX is a regulatory factor which binds to the X box of MHC class II genes and is essential for their expression. The DNA-binding domain of RFX is the central domain of the protein and binds ssDNA as either a monomer or homodimer. Q#16261 - CGI_10009134 superfamily 247723 29 119 2.74E-41 138.879 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#16262 - CGI_10009135 superfamily 246669 261 380 2.17E-43 149.75 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. 
Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#16263 - CGI_10003932 superfamily 220695 73 198 0.00502757 37.1731 cl18571 7TM_GPCR_Srx superfamily C - Serpentine type 7TM GPCR chemoreceptor Srx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srx is part of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#16264 - CGI_10003933 superfamily 247769 569 744 1.38E-10 60.0457 cl17215 HDc superfamily - - Metal dependent phosphohydrolases with conserved 'HD' motif Q#16264 - CGI_10003933 superfamily 248010 111 270 4.89E-24 99.7631 cl17456 GAF superfamily - - "GAF domain; This domain is present in cGMP-specific phosphodiesterases, adenylyl and guanylyl cyclases, phytochromes, FhlA and NifA. Adenylyl and guanylyl cyclases catalyze ATP and GTP to the second messengers cAMP and cGMP, respectively, these products up-regulating catalytic activity by binding to the regulatory GAF domain(s). The opposite hydrolysis reaction is catalyzed by phosphodiesterase. cGMP-dependent 3',5'-cyclic phosphodiesterase catalyzes the conversion of guanosine 3',5'-cyclic phosphate to guanosine 5'-phosphate. Here too, cGMP regulates catalytic activity by GAF-domain binding. 
Phytochromes are regulatory photoreceptors in plants and bacteria which exist in two thermally-stable states that are reversibly inter-convertible by light: the Pr state absorbs maximally in the red region of the spectrum, while the Pfr state absorbs maximally in the far-red region. This domain is also found in FhlA (formate hydrogen lyase transcriptional activator) and NifA, a transcriptional activator which is required for activation of most Nif operons which are directly involved in nitrogen fixation. NifA interacts with sigma-54." Q#16264 - CGI_10003933 superfamily 248010 295 471 1.84E-11 62.784 cl17456 GAF superfamily - - "GAF domain; This domain is present in cGMP-specific phosphodiesterases, adenylyl and guanylyl cyclases, phytochromes, FhlA and NifA. Adenylyl and guanylyl cyclases catalyze ATP and GTP to the second messengers cAMP and cGMP, respectively, these products up-regulating catalytic activity by binding to the regulatory GAF domain(s). The opposite hydrolysis reaction is catalyzed by phosphodiesterase. cGMP-dependent 3',5'-cyclic phosphodiesterase catalyzes the conversion of guanosine 3',5'-cyclic phosphate to guanosine 5'-phosphate. Here too, cGMP regulates catalytic activity by GAF-domain binding. Phytochromes are regulatory photoreceptors in plants and bacteria which exist in two thermally-stable states that are reversibly inter-convertible by light: the Pr state absorbs maximally in the red region of the spectrum, while the Pfr state absorbs maximally in the far-red region. This domain is also found in FhlA (formate hydrogen lyase transcriptional activator) and NifA, a transcriptional activator which is required for activation of most Nif operons which are directly involved in nitrogen fixation. NifA interacts with sigma-54." Q#16265 - CGI_10003934 superfamily 248097 75 197 4.62E-24 92.7134 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. 
Q#16268 - CGI_10003656 superfamily 243035 92 208 3.37E-12 59.9409 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#16269 - CGI_10003657 superfamily 243038 800 882 1.49E-31 119.358 cl02442 DEP superfamily - - "DEP domain, named after Dishevelled, Egl-10, and Pleckstrin, where this domain was first discovered. The function of this domain is still not clear, but it is believed to be important for the membrane association of the signaling proteins in which it is present. New studies show that the DEP domain of Sst2, a yeast RGS protein is necessary and sufficient for receptor interaction." Q#16269 - CGI_10003657 superfamily 245014 31 350 6.35E-21 93.8969 cl09117 Mem_trans superfamily - - Membrane transport protein; This family includes auxin efflux carrier proteins and other transporter proteins from all domains of life. 
Q#16270 - CGI_10003658 superfamily 247905 482 585 7.69E-29 113.872 cl17351 HELICc superfamily N - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#16270 - CGI_10003658 superfamily 247805 285 424 6.13E-17 79.6888 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#16270 - CGI_10003658 superfamily 245114 687 775 4.18E-14 69.8118 cl09632 RQC superfamily - - "RQC domain; This DNA-binding domain is found in the RecQ helicase among others and has a helix-turn-helix structure. The RQC domain, found only in RecQ family enzymes, is a high affinity G4 DNA binding domain." Q#16270 - CGI_10003658 superfamily 207658 856 921 1.25E-12 64.8782 cl02578 HRDC superfamily - - HRDC domain; The HRDC (Helicase and RNase D C-terminal) domain has a putative role in nucleic acid binding. Mutations in the HRDC domain cause human disease. It is interesting to note that the RecQ helicase in Deinococcus radiodurans has three tandem HRDC domains. Q#16270 - CGI_10003658 superfamily 247805 264 297 0.000138668 41.4334 cl17251 DEXDc superfamily C - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 
Q#16270 - CGI_10003658 superfamily 247811 947 995 0.00262773 37.4422 cl17257 Sigma70_r4 superfamily - - "Sigma70, region (SR) 4 refers to the most C-terminal of four conserved domains found in Escherichia coli (Ec) sigma70, the main housekeeping sigma, and related sigma-factors (SFs). A SF is a dissociable subunit of RNA polymerase, it directs bacterial or plastid core RNA polymerase to specific promoter elements located upstream of transcription initiation points. The SR4 of Ec sigma70 and other essential primary SFs contact promoter sequences located 35 base-pairs upstream of the initiation point, recognizing a 6-base-pair -35 consensus TTGACA. Sigma70 related SFs also include SFs which are dispensable for bacterial cell growth for example Ec sigmaS, SFs which activate regulons in response to a specific signal for example heat-shock Ec sigmaH, and a group of SFs which includes the extracytoplasmic function (ECF) SFs and is typified by Ec sigmaE which contains SR2 and -4 only. ECF SFs direct the transcription of genes that regulate various responses including periplasmic stress and pathogenesis. Ec sigmaE SR4 also contacts the -35 element, but recognizes a different consensus (a 7-base-pair GGAACTT). Plant SFs recognize sigma70 type promoters and direct transcription of the major plastid RNA polymerase, plastid-encoded RNA polymerase (PEP)." Q#16273 - CGI_10004328 superfamily 243083 6 90 3.39E-36 131.254 cl02554 PWWP superfamily - - "The PWWP domain, named for a conserved Pro-Trp-Trp-Pro motif, is a small domain consisting of 100-150 amino acids. The PWWP domain is found in numerous proteins that are involved in cell division, growth and differentiation. Most PWWP-domain proteins seem to be nuclear, often DNA-binding, proteins that function as transcription factors regulating a variety of developmental processes. 
The function of the PWWP domain is still not known precisely; however, based on the fact that other regions of PWWP-domain proteins are responsible for nuclear localization and DNA-binding, it is likely that the PWWP domain acts as a site for protein-protein binding interactions, influencing chromatin remodeling and thereby regulating transcriptional processes. Some PWWP-domain proteins have been linked to cancer or other diseases; some are known to function as growth factors." Q#16273 - CGI_10004328 superfamily 151906 554 662 8.15E-15 71.2381 cl12990 LEDGF superfamily - - Lens epithelium-derived growth factor (LEDGF); LEDGF is a chromatin-associated protein that protects cells from stress-induced apoptosis. It is the binding partner of HIV-1 integrase in human cells. The integrase binding domain (IBD) of LEDGF is a compact right-handed bundle composed of five alpha-helices. The residues essential for the interaction with the integrase are present in the inter-helical loop regions of the bundle structure. Q#16274 - CGI_10004329 superfamily 222047 13 134 2.85E-24 93.5564 cl18633 HD_4 superfamily - - HD domain; HD domains are metal dependent phosphohydrolases. Q#16275 - CGI_10004330 superfamily 209366 384 407 8.50E-09 52.1944 cl11604 zf-A20 superfamily - - A20-like zinc finger; The A20 Zn-finger of bovine/human Rabex5/rabGEF1 is a Ubiquitin Binding Domain. The zinc finger mediates self-association in A20. These fingers also mediate IL-1-induced NF-kappa B activation. Q#16275 - CGI_10004330 superfamily 207411 540 570 8.34E-07 46.6137 cl01438 zf-AN1 superfamily - - "AN1-like Zinc finger; Zinc finger at the C-terminus of An1, a ubiquitin-like protein in Xenopus laevis. The following pattern describes the zinc finger. C-X2-C-X(9-12)-C-X(1-2)-C-X4-C-X2-H-X5-H-X-C Where X can be any amino acid, and numbers in brackets indicate the number of residues." 
Q#16276 - CGI_10004331 superfamily 219574 370 561 1.12E-49 174.007 cl06698 DC_STAMP superfamily - - "DC-STAMP-like protein; This is a family of sequences which are similar to a region of the dendritic cell-specific transmembrane protein (DC-STAMP). This is thought to be a novel receptor protein that shares no identity with other multimembrane-spanning proteins. It is thought to have seven putative transmembrane regions, two of which are found in the region featured in this family. DC-STAMP is also described as having potential N-linked glycosylation sites and a potential phosphorylation site for PKC, but these are not conserved throughout the family." Q#16277 - CGI_10004332 superfamily 248028 126 359 3.39E-33 124.923 cl17474 Steroid_dh superfamily - - "3-oxo-5-alpha-steroid 4-dehydrogenase; This family consists of 3-oxo-5-alpha-steroid 4-dehydrogenases, EC:1.3.99.5. Also known as Steroid 5-alpha-reductase, the reaction catalyzed by this enzyme is: 3-oxo-5-alpha-steroid + acceptor <=> 3-oxo-delta(4)-steroid + reduced acceptor. The Steroid 5-alpha-reductase enzyme is responsible for the formation of dihydrotestosterone; this hormone promotes the differentiation of male external genitalia and the prostate during fetal development. In humans mutations in this enzyme can cause a form of male pseudohermaphroditism in which the external genitalia and prostate fail to develop normally. A related enzyme also found in plants is DET2, a steroid reductase from Arabidopsis. Mutations in this enzyme cause defects in light-regulated development." 
Q#16277 - CGI_10004332 superfamily 247684 1 87 2.96E-06 47.272 cl17037 NBD_sugar-kinase_HSP70_actin superfamily NC - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#16278 - CGI_10001478 superfamily 247692 84 646 0 700.07 cl17068 AFD_class_I superfamily - - "Adenylate forming domain, Class I; This family includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. 
The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain." Q#16279 - CGI_10002644 superfamily 241563 75 112 0.00325809 35.918 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#16280 - CGI_10002645 superfamily 243095 76 110 5.37E-07 48.9707 cl02570 RhoGAP superfamily N - "RhoGAP: GTPase-activator protein (GAP) for Rho-like GTPases; GAPs towards Rho/Rac/Cdc42-like small GTPases. Small GTPases (G proteins) cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when bound to GDP. The Rho family of small G proteins, which includes Cdc42Hs, activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. G proteins generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. The RhoGAPs are one of the major classes of regulators of Rho G proteins." Q#16283 - CGI_10006272 superfamily 245206 20 333 5.24E-115 342.327 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. 
Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#16284 - CGI_10006273 superfamily 247799 255 300 1.62E-07 47.5547 cl17245 KH-I superfamily C - "K homology RNA-binding domain, type I. KH binds single-stranded RNA or DNA. It is found in a wide variety of proteins including ribosomal proteins, transcription factors and post-transcriptional modifiers of mRNA. There are two different KH domains that belong to different protein folds, but they share a single KH motif. The KH motif is folded into a beta alpha alpha beta unit. In addition to the core, type II KH domains (e.g. ribosomal protein S3) include N-terminal extension and type I KH domains (e.g. hnRNP K) contain C-terminal extension." Q#16284 - CGI_10006273 superfamily 247799 192 252 0.000542286 37.5396 cl17245 KH-I superfamily - - "K homology RNA-binding domain, type I. KH binds single-stranded RNA or DNA. It is found in a wide variety of proteins including ribosomal proteins, transcription factors and post-transcriptional modifiers of mRNA. There are two different KH domains that belong to different protein folds, but they share a single KH motif. The KH motif is folded into a beta alpha alpha beta unit. In addition to the core, type II KH domains (e.g. ribosomal protein S3) include N-terminal extension and type I KH domains (e.g. hnRNP K) contain C-terminal extension." Q#16286 - CGI_10006275 superfamily 217473 57 335 1.73E-29 117.851 cl03978 Mab-21 superfamily - - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. 
Q#16287 - CGI_10006276 superfamily 243092 56 353 4.10E-30 117.051 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#16288 - CGI_10006277 superfamily 217473 55 333 1.02E-26 110.532 cl03978 Mab-21 superfamily - - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#16289 - CGI_10006278 superfamily 243353 1196 1236 1.11E-10 58.98 cl03225 GRIP superfamily - - "GRIP domain; The GRIP (golgin-97, RanBP2alpha,Imh1p and p230/golgin-245) domain is found in many large coiled-coil proteins. It has been shown to be sufficient for targeting to the Golgi. The GRIP domain contains a completely conserved tyrosine residue. At least some of these domains have been shown to bind to GTPase Arl1, see structures in." 
Q#16290 - CGI_10003160 superfamily 247725 892 1035 1.59E-38 140.531 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting the protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#16290 - CGI_10003160 superfamily 241622 34 148 4.66E-13 66.4362 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." 
Q#16291 - CGI_10009075 superfamily 189857 2 121 5.80E-43 140.078 cl07832 Caveolin superfamily - - "Caveolin; All three known Caveolin forms have the FEDVIAEP caveolin 'signature motif' within their hydrophilic N-terminal domain. Caveolin 2 (Cav-2) is co-localised and co-expressed with Cav-1/VIP21, forms heterodimers with it and needs Cav-1 for proper membrane localisation. Cav-3 has greater protein sequence similarity to Cav-1 than to Cav-2. Cellular processes caveolins are involved in include vesicular transport, cholesterol homeostasis, signal transduction, and tumour suppression." Q#16292 - CGI_10009076 superfamily 241584 1376 1468 2.05E-15 74.4551 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#16292 - CGI_10009076 superfamily 241584 1275 1368 1.76E-13 68.6771 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." 
Q#16292 - CGI_10009076 superfamily 241584 1177 1269 2.06E-13 68.6771 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#16292 - CGI_10009076 superfamily 241584 876 966 3.40E-13 67.9067 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#16292 - CGI_10009076 superfamily 241584 575 670 1.16E-11 63.2843 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." 
Q#16292 - CGI_10009076 superfamily 241584 1476 1567 1.30E-11 63.2843 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#16292 - CGI_10009076 superfamily 241584 781 864 6.22E-11 61.3583 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#16292 - CGI_10009076 superfamily 241584 978 1069 6.63E-11 61.3583 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." 
Q#16292 - CGI_10009076 superfamily 241584 678 776 8.02E-10 57.8915 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#16292 - CGI_10009076 superfamily 241584 1074 1166 1.03E-08 54.8099 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#16292 - CGI_10009076 superfamily 241584 461 563 1.13E-06 48.6467 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." 
Q#16292 - CGI_10009076 superfamily 245814 312 384 0.000216993 41.3207 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16292 - CGI_10009076 superfamily 241584 1572 1605 0.00377412 37.4759 cl00065 FN3 superfamily C - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#16292 - CGI_10009076 superfamily 248097 91 195 1.12E-23 99.2618 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#16292 - CGI_10009076 superfamily 245814 243 285 3.54E-05 43.7643 cl11960 Ig superfamily N - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#16294 - CGI_10009078 superfamily 216653 100 233 6.40E-34 124.632 cl08331 Na_Ca_ex superfamily - - "Sodium/calcium exchanger protein; This is a family of sodium/calcium exchanger integral membrane proteins. This family covers the integral membrane regions of the proteins. Sodium/calcium exchangers regulate intracellular Ca2+ concentrations in many cells; cardiac myocytes, epithelial cells, neurons retinal rod photoreceptors and smooth muscle cells. Ca2+ is moved into or out of the cytosol depending on Na+ concentration. In humans and rats there are 3 isoforms; NCX1 NCX2 and NCX3." Q#16295 - CGI_10009079 superfamily 216653 1 112 2.10E-13 62.2295 cl08331 Na_Ca_ex superfamily - - "Sodium/calcium exchanger protein; This is a family of sodium/calcium exchanger integral membrane proteins. This family covers the integral membrane regions of the proteins. Sodium/calcium exchangers regulate intracellular Ca2+ concentrations in many cells; cardiac myocytes, epithelial cells, neurons retinal rod photoreceptors and smooth muscle cells. Ca2+ is moved into or out of the cytosol depending on Na+ concentration. In humans and rats there are 3 isoforms; NCX1 NCX2 and NCX3." Q#16296 - CGI_10021368 superfamily 247724 9 170 8.32E-131 366.115 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. 
The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#16303 - CGI_10021375 superfamily 243072 765 890 1.04E-34 130.581 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#16303 - CGI_10021375 superfamily 243072 831 956 1.81E-34 129.811 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#16303 - CGI_10021375 superfamily 243072 666 791 3.03E-34 129.041 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#16303 - CGI_10021375 superfamily 243072 1029 1153 3.12E-30 117.87 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#16303 - CGI_10021375 superfamily 243072 897 1022 8.06E-29 113.633 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#16303 - CGI_10021375 superfamily 243072 591 725 1.86E-23 98.2246 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#16307 - CGI_10021379 superfamily 221550 163 268 0.00743396 36.9827 cl13761 Casc1 superfamily C - "Cancer susceptibility candidate 1; This domain family is found in eukaryotes, and is typically between 216 and 263 amino acids in length. Casc1 has many SNPs associated with cancer susceptibility." Q#16310 - CGI_10021382 superfamily 216152 90 358 5.91E-54 183.284 cl02988 Glyco_transf_10 superfamily N - "Glycosyltransferase family 10 (fucosyltransferase); This family of Fucosyltransferases are the enzymes transferring fucose from GDP-Fucose to GlcNAc in an alpha1,3 linkage. This family is known as glycosyltransferase family 10." Q#16312 - CGI_10021384 superfamily 246940 68 268 2.19E-19 84.6925 cl15377 Radical_SAM superfamily - - "Radical SAM superfamily. Enzymes of this family generate radicals by combining a 4Fe-4S cluster and S-adenosylmethionine (SAM) in close proximity. They are characterized by a conserved CxxxCxxC motif, which coordinates the conserved iron-sulfur cluster. Mechanistically, they share the transfer of a single electron from the iron-sulfur cluster to SAM, which leads to its reductive cleavage to methionine and a 5'-deoxyadenosyl radical, which, in turn, abstracts a hydrogen from the appropriately positioned carbon atom. Depending on the enzyme, SAM is consumed during this process or it is restored and reused. 
Radical SAM enzymes catalyze steps in metabolism, DNA repair, the biosynthesis of vitamins and coenzymes, and the biosynthesis of many antibiotics. Examples are biotin synthase (BioB), lipoyl synthase (LipA), pyruvate formate-lyase (PFL), coproporphyrinogen oxidase (HemN), lysine 2,3-aminomutase (LAM), anaerobic ribonucleotide reductase (ARR), and MoaA, an enzyme of the biosynthesis of molybdopterin." Q#16312 - CGI_10021384 superfamily 115139 238 364 4.00E-48 161.239 cl05790 Mob_synth_C superfamily - - Molybdenum Cofactor Synthesis C; This region contains two iron-sulphur (3Fe-4S) binding sites. Mutations in this region of human MOCS1 cause MOCOD (Molybdenum Co-Factor Deficiency) type A. Q#16313 - CGI_10021385 superfamily 110440 80 106 0.000251694 35.8465 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#16316 - CGI_10021388 superfamily 247941 144 278 5.47E-09 53.4937 cl17387 Methyltransf_21 superfamily - - "Methyltransferase FkbM domain; This family has members from bacteria to human, and appears to be a methyltransferase." Q#16317 - CGI_10021389 superfamily 247941 318 452 3.01E-08 51.9529 cl17387 Methyltransf_21 superfamily - - "Methyltransferase FkbM domain; This family has members from bacteria to human, and appears to be a methyltransferase." 
Q#16318 - CGI_10021390 superfamily 245206 3 174 1.11E-63 199.776 cl09931 NADB_Rossmann superfamily N - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#16319 - CGI_10021391 superfamily 246921 284 336 6.03E-11 59.6965 cl15299 FG-GAP superfamily - - "FG-GAP repeat; This family contains the extracellular repeat that is found in up to seven copies in alpha integrins. This repeat has been predicted to fold into a beta propeller structure. The repeat is called the FG-GAP repeat after two conserved motifs in the repeat. The FG-GAP repeats are found in the N terminus of integrin alpha chains, a region that has been shown to be important for ligand binding. A putative Ca2+ binding motif is found in some of the repeats." Q#16319 - CGI_10021391 superfamily 246921 212 274 0.00101618 38.5105 cl15299 FG-GAP superfamily - - "FG-GAP repeat; This family contains the extracellular repeat that is found in up to seven copies in alpha integrins. This repeat has been predicted to fold into a beta propeller structure. The repeat is called the FG-GAP repeat after two conserved motifs in the repeat. 
The FG-GAP repeats are found in the N terminus of integrin alpha chains, a region that has been shown to be important for ligand binding. A putative Ca2+ binding motif is found in some of the repeats." Q#16321 - CGI_10021393 superfamily 243033 55 181 1.44E-13 63.8765 cl02428 Ependymin superfamily - - Ependymin; Ependymin. Q#16322 - CGI_10021394 superfamily 243859 78 171 1.82E-20 81.9926 cl04722 PLAC8 superfamily - - PLAC8 family; This family includes the Placenta-specific gene 8 protein. Q#16322 - CGI_10021394 superfamily 243859 4 92 1.38E-10 54.6434 cl04722 PLAC8 superfamily - - PLAC8 family; This family includes the Placenta-specific gene 8 protein. Q#16323 - CGI_10021395 superfamily 243859 4 75 1.49E-09 49.6358 cl04722 PLAC8 superfamily C - PLAC8 family; This family includes the Placenta-specific gene 8 protein. Q#16324 - CGI_10021396 superfamily 243859 4 74 2.05E-09 49.2506 cl04722 PLAC8 superfamily C - PLAC8 family; This family includes the Placenta-specific gene 8 protein. Q#16325 - CGI_10021397 superfamily 241674 140 220 1.50E-35 122.32 cl00194 EF1B superfamily - - "Elongation factor 1 beta (EF1B) guanine nucleotide exchange domain. EF1B catalyzes the exchange of GDP bound to the G-protein, EF1A, for GTP, an important step in the elongation cycle of the protein biosynthesis. EF1A binds to and delivers the aminoacyl tRNA to the ribosome. The guanine nucleotide exchange domain of EF1B, which is the alpha subunit in yeast, is responsible for the catalysis of this exchange reaction." 
Q#16325 - CGI_10021397 superfamily 243175 3 62 4.72E-23 89.4052 cl02776 GST_C_family superfamily N - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." 
Q#16326 - CGI_10021398 superfamily 245203 255 636 0 623.903 cl09928 Molybdopterin-Binding superfamily - - "Molybdopterin-Binding (MopB) domain of the MopB superfamily of proteins, a large, diverse, heterogeneous superfamily of enzymes that, in general, bind molybdopterin as a cofactor. The MopB domain is found in a wide variety of molybdenum- and tungsten-containing enzymes, including formate dehydrogenase-H (Fdh-H) and -N (Fdh-N), several forms of nitrate reductase (Nap, Nas, NarG), dimethylsulfoxide reductase (DMSOR), thiosulfate reductase, formylmethanofuran dehydrogenase, and arsenite oxidase. Molybdenum is present in most of these enzymes in the form of molybdopterin, a modified pterin ring with a dithiolene side chain, which is responsible for ligating the Mo. In many bacterial and archaeal species, molybdopterin is in the form of a dinucleotide, with two molybdopterin dinucleotide units per molybdenum. These proteins can function as monomers, heterodimers, or heterotrimers, depending on the protein and organism. Also included in the MopB superfamily is the eukaryotic/eubacterial protein domain family of the 75-kDa subunit/Nad11/NuoG (second domain) of respiratory complex 1/NADH-quinone oxidoreductase which is postulated to have lost an ancestral formate dehydrogenase activity and only vestigial sequence evidence remains of a molybdopterin binding site." Q#16326 - CGI_10021398 superfamily 241649 33 105 2.05E-09 55.4788 cl00159 fer2 superfamily - - "2Fe-2S iron-sulfur cluster binding domain. Iron-sulfur proteins play an important role in electron transfer processes and in various enzymatic reactions. The family includes plant and algal ferredoxins, which act as electron carriers in photosynthesis and ferredoxins, which participate in redox chains (from bacteria to mammals). Fold is similar to thioredoxin." 
Q#16326 - CGI_10021398 superfamily 245548 115 155 8.89E-17 75.6072 cl11210 NADH-G_4Fe-4S_3 superfamily - - NADH-ubiquinone oxidoreductase-G iron-sulfur binding region; NADH-ubiquinone oxidoreductase-G iron-sulfur binding region. Q#16326 - CGI_10021398 superfamily 204199 663 716 1.08E-11 61.1338 cl08544 DUF1982 superfamily - - Domain of unknown function (DUF1982); Members of this family of functionally uncharacterized domains are found in the C-terminal region of various prokaryotic NADH dehydrogenases. Q#16327 - CGI_10021399 superfamily 100116 68 114 5.22E-07 46.1844 cl10082 NF-X1-zinc-finger superfamily - - "Presumably a zinc binding domain, which has been shown to bind to DNA in the human nuclear transcriptional repressor NF-X1. The zinc finger can be characterized by the pattern C-X(1-6)-H-X-C-X3-C(H/C)-X(3-4)-(H/C)-X(1-10)-C. The NF-X1 zinc finger co-occurs with atypical RING-finger and R3H domains. Human NF-X1 is involved in the transcriptional repression of major histocompatibility complex class II genes. The drosophila homolog encoded by stc (shuttle craft) plays a role in embryonic development, and the Arabidopsis homologue AtNFXL1 has been shown to function in the response to trichothecene and other defense mechanisms." Q#16327 - CGI_10021399 superfamily 241762 200 273 1.19E-28 106.673 cl00297 R3H superfamily - - "R3H domain. The name of the R3H domain comes from the characteristic spacing of the most conserved arginine and histidine residues. R3H domains are found in proteins together with ATPase domains, SF1 helicase domains, SF2 DEAH helicase domains, Cys-rich repeats, ring-type zinc fingers, and KH domains. The function of the domain is predicted to bind ssDNA or ssRNA in a sequence-specific manner." Q#16328 - CGI_10021400 superfamily 100116 723 771 5.85E-13 65.4444 cl10082 NF-X1-zinc-finger superfamily - - "Presumably a zinc binding domain, which has been shown to bind to DNA in the human nuclear transcriptional repressor NF-X1. 
The zinc finger can be characterized by the pattern C-X(1-6)-H-X-C-X3-C(H/C)-X(3-4)-(H/C)-X(1-10)-C. The NF-X1 zinc finger co-occurs with atypical RING-finger and R3H domains. Human NF-X1 is involved in the transcriptional repression of major histocompatibility complex class II genes. The drosophila homolog encoded by stc (shuttle craft) plays a role in embryonic development, and the Arabidopsis homologue AtNFXL1 has been shown to function in the response to trichothecene and other defense mechanisms." Q#16328 - CGI_10021400 superfamily 100116 843 895 5.88E-11 59.2812 cl10082 NF-X1-zinc-finger superfamily - - "Presumably a zinc binding domain, which has been shown to bind to DNA in the human nuclear transcriptional repressor NF-X1. The zinc finger can be characterized by the pattern C-X(1-6)-H-X-C-X3-C(H/C)-X(3-4)-(H/C)-X(1-10)-C. The NF-X1 zinc finger co-occurs with atypical RING-finger and R3H domains. Human NF-X1 is involved in the transcriptional repression of major histocompatibility complex class II genes. The drosophila homolog encoded by stc (shuttle craft) plays a role in embryonic development, and the Arabidopsis homologue AtNFXL1 has been shown to function in the response to trichothecene and other defense mechanisms." Q#16328 - CGI_10021400 superfamily 100116 781 824 1.06E-09 55.8144 cl10082 NF-X1-zinc-finger superfamily - - "Presumably a zinc binding domain, which has been shown to bind to DNA in the human nuclear transcriptional repressor NF-X1. The zinc finger can be characterized by the pattern C-X(1-6)-H-X-C-X3-C(H/C)-X(3-4)-(H/C)-X(1-10)-C. The NF-X1 zinc finger co-occurs with atypical RING-finger and R3H domains. Human NF-X1 is involved in the transcriptional repression of major histocompatibility complex class II genes. The drosophila homolog encoded by stc (shuttle craft) plays a role in embryonic development, and the Arabidopsis homologue AtNFXL1 has been shown to function in the response to trichothecene and other defense mechanisms." 
Q#16328 - CGI_10021400 superfamily 100116 905 932 5.74E-09 53.8884 cl10082 NF-X1-zinc-finger superfamily C - "Presumably a zinc binding domain, which has been shown to bind to DNA in the human nuclear transcriptional repressor NF-X1. The zinc finger can be characterized by the pattern C-X(1-6)-H-X-C-X3-C(H/C)-X(3-4)-(H/C)-X(1-10)-C. The NF-X1 zinc finger co-occurs with atypical RING-finger and R3H domains. Human NF-X1 is involved in the transcriptional repression of major histocompatibility complex class II genes. The drosophila homolog encoded by stc (shuttle craft) plays a role in embryonic development, and the Arabidopsis homologue AtNFXL1 has been shown to function in the response to trichothecene and other defense mechanisms." Q#16329 - CGI_10021401 superfamily 220630 20 138 8.61E-48 152.062 cl10898 DUF2340 superfamily - - Uncharacterized conserved protein (DUF2340); This is a family of small proteins of approximately 150 amino acids of unknown function. Q#16330 - CGI_10021402 superfamily 243082 39 381 9.12E-93 302.254 cl02553 Peptidase_C19 superfamily - - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." Q#16331 - CGI_10021403 superfamily 247085 511 628 2.60E-16 76.005 cl15820 RICIN superfamily - - "Ricin-type beta-trefoil; Carbohydrate-binding domain formed from presumed gene triplication. 
The domain is found in a variety of molecules serving diverse functions such as enzymatic activity, inhibitory toxicity and signal transduction. Highly specific ligand binding occurs on exposed surfaces of the compact domain structure." Q#16331 - CGI_10021403 superfamily 245596 294 491 1.00E-110 337.639 cl11394 Glyco_tranf_GTA_type superfamily N - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#16331 - CGI_10021403 superfamily 245596 124 293 9.64E-82 262.14 cl11394 Glyco_tranf_GTA_type superfamily C - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. 
Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#16333 - CGI_10021405 superfamily 248097 24 146 3.03E-13 66.1717 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#16335 - CGI_10021407 superfamily 218118 100 165 4.54E-07 44.1421 cl04552 CD225 superfamily - - "Interferon-induced transmembrane protein; This family includes the human leukocyte antigen CD225, which is an interferon inducible transmembrane protein, and is associated with interferon induced cell growth suppression." Q#16338 - CGI_10021410 superfamily 241754 7 315 0 579.238 cl00286 Motor_domain superfamily - - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. Q#16341 - CGI_10003021 superfamily 247799 178 252 5.88E-08 49.4807 cl17245 KH-I superfamily - - "K homology RNA-binding domain, type I. KH binds single-stranded RNA or DNA. It is found in a wide variety of proteins including ribosomal proteins, transcription factors and post-transcriptional modifiers of mRNA. There are two different KH domains that belong to different protein folds, but they share a single KH motif. The KH motif is folded into a beta alpha alpha beta unit. In addition to the core, type II KH domains (e.g. 
ribosomal protein S3) include N-terminal extension and type I KH domains (e.g. hnRNP K) contain C-terminal extension." Q#16341 - CGI_10003021 superfamily 247799 108 174 0.00345806 35.2302 cl17245 KH-I superfamily - - "K homology RNA-binding domain, type I. KH binds single-stranded RNA or DNA. It is found in a wide variety of proteins including ribosomal proteins, transcription factors and post-transcriptional modifiers of mRNA. There are two different KH domains that belong to different protein folds, but they share a single KH motif. The KH motif is folded into a beta alpha alpha beta unit. In addition to the core, type II KH domains (e.g. ribosomal protein S3) include N-terminal extension and type I KH domains (e.g. hnRNP K) contain C-terminal extension." Q#16342 - CGI_10006932 superfamily 247861 1329 1472 6.92E-28 111.835 cl17307 SpoU_methylase superfamily - - SpoU rRNA Methylase family; This family of proteins probably use S-AdoMet. Q#16344 - CGI_10006934 superfamily 199166 110 219 0.000211842 40.0032 cl15308 AMN1 superfamily C - "Antagonist of mitotic exit network protein 1; Amn1 has been functionally characterized in Saccharomyces cerevisiae as a component of the Antagonist of MEN pathway (AMEN). The AMEN network is activated by MEN (mitotic exit network) via an active Cdc14, and in turn switches off MEN. Amn1 constitutes one of the alternative mechanisms by which MEN may be disrupted. Specifically, Amn1 binds Tem1 (Termination of M-phase, a GTPase that belongs to the RAS superfamily), and disrupts its association with Cdc15, the primary downstream target. Amn1 is a leucine-rich repeat (LRR) protein, with 12 repeats in the S. cerevisiae ortholog. As a negative regulator of the signal transduction pathway MEN, overexpression of AMN1 slows the growth of wild type cells. The function of the vertebrate members of this family has not been determined experimentally, they have fewer LRRs that determine the extent of this model." 
Q#16345 - CGI_10006935 superfamily 214642 331 362 0.00479354 35.6026 cl02592 HAT superfamily - - HAT (Half-A-TPR) repeats; Present in several RNA-binding proteins. Structurally and sequentially thought to be similar to TPRs. Q#16347 - CGI_10006937 superfamily 243045 154 250 3.17E-15 72.6659 cl02459 PAS superfamily - - "PAS domain; PAS motifs appear in archaea, eubacteria and eukarya. Probably the most surprising identification of a PAS domain was that in EAG-like K+-channels. PAS domains have been found to bind ligands, and to act as sensors for light and oxygen in signal transduction." Q#16347 - CGI_10006937 superfamily 243045 5 57 7.63E-08 51.0947 cl02459 PAS superfamily C - "PAS domain; PAS motifs appear in archaea, eubacteria and eukarya. Probably the most surprising identification of a PAS domain was that in EAG-like K+-channels. PAS domains have been found to bind ligands, and to act as sensors for light and oxygen in signal transduction." Q#16349 - CGI_10006939 superfamily 247743 141 308 1.00E-24 98.3723 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#16350 - CGI_10006940 superfamily 244880 97 268 1.70E-117 344.203 cl08263 TBP_TLF superfamily - - "TATA box binding protein (TBP): Present in archaea and eukaryotes, TBPs are transcription factors that recognize promoters and initiate transcription. 
TBP has been shown to be an essential component of three different transcription initiation complexes: SL1, TFIID and TFIIIB, directing transcription by RNA polymerases I, II and III, respectively. TBP binds directly to the TATA box promoter element, where it nucleates polymerase assembly, thus defining the transcription start site. TBP's binding in the minor groove induces a dramatic DNA bending while its own structure barely changes. The conserved core domain of TBP, which binds to the TATA box, has a bipartite structure, with intramolecular symmetry generating a saddle-shaped structure that sits astride the DNA. New members of the TBP family, called TBP-like proteins (TBLP, TLF, TLP) or TBP-related factors (TRF1, TRF2, TRP), are similar to the core domain of TBPs, with identical or chemically similar amino acids at many equivalent positions, suggesting similar structure. However, TLFs contain distinct, conserved amino acids at several positions that distinguish them from TBP." Q#16350 - CGI_10006940 superfamily 241574 259 385 3.06E-32 119.636 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#16351 - CGI_10006941 superfamily 241772 37 112 0.00780207 35.0661 cl00311 UbiD superfamily NC - 3-octaprenyl-4-hydroxybenzoate carboxy-lyase; This family has been characterized as 3-octaprenyl-4-hydroxybenzoate carboxy-lyase enzymes. This enzyme catalyzes the third reaction in ubiquinone biosynthesis. 
For optimal activity the carboxy-lyase was shown to require Mn2+. Q#16352 - CGI_10006942 superfamily 247804 379 421 2.62E-05 42.1774 cl17250 SANT superfamily - - "'SWI3, ADA2, N-CoR and TFIIIB' DNA-binding domains. Tandem copies of the domain bind telomeric DNA tandem repeats as part of the capping complex. Binding is sequence dependent for repeats which contain the G/C rich motif [C2-3 A (CA)1-6]. The domain is also found in regulatory transcriptional repressor complexes where it also binds DNA." Q#16352 - CGI_10006942 superfamily 212559 156 200 4.29E-13 64.9431 cl18297 SANT_MTA3_like superfamily - - "Myb-Like Dna-Binding Domain of MTA3 and related proteins; Members in this SANT/myb family include domains found in mouse metastasis-associated protein 3 (MTA3) proteins and arginine-glutamic dipeptide (RERE) repeats proteins. SANT (SWI3, ADA2, N-CoR and TFIIIB) DNA-binding domains are a diverse set of proteins that share a common 3 alpha-helix bundle. MTA3 has been shown to interact with nucleosome remodeling and deacetylase (NuRD) proteins CHD4 and HDAC1, and the core cohesin complex protein RAD21 in the ovary, and regulate G2/M progression in proliferating granulosa cells. RERE belongs to the atrophin family and has been identified as a nuclear receptor corepressor; altered expression levels of RERE are associated with cancer in humans while mutations of Rere in mice cause failure in closing the anterior neural tube and fusion of the telencephalic and optic vesicles during embryogenesis." Q#16352 - CGI_10006942 superfamily 216509 71 122 1.00E-12 63.797 cl03218 ELM2 superfamily - - "ELM2 domain; The ELM2 (Egl-27 and MTA1 homology 2) domain is a small domain of unknown function. It is found in the MTA1 protein that is part of the NuRD complex. The domain is usually found to the N terminus of a myb-like DNA binding domain pfam00249. ELM2 is also found associated with an ARID DNA binding domain pfam01388 in a member from Arabidopsis thaliana. 
This suggests that ELM2 may also be involved in DNA binding, or perhaps is a protein-protein interaction domain." Q#16352 - CGI_10006942 superfamily 241648 249 291 6.75E-05 41.2042 cl00158 ZnF_GATA superfamily - - Zinc finger DNA binding domain; binds specifically to DNA consensus sequence [AT]GATA[AG] promoter elements; a subset of family members may also bind protein; zinc-finger consensus topology is C-X(2)-C-X(17)-C-X(2)-C Q#16353 - CGI_10006943 superfamily 241578 357 512 1.09E-13 70.2874 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#16353 - CGI_10006943 superfamily 246918 692 748 1.56E-16 75.7011 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#16353 - CGI_10006943 superfamily 246918 524 575 1.41E-13 67.2267 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#16353 - CGI_10006943 superfamily 246918 634 687 4.21E-09 54.5151 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. 
Q#16354 - CGI_10006944 superfamily 247739 14 207 1.03E-66 209.787 cl17185 LPLAT superfamily - - "Lysophospholipid acyltransferases (LPLATs) of glycerophospholipid biosynthesis; Lysophospholipid acyltransferase (LPLAT) superfamily members are acyltransferases of de novo and remodeling pathways of glycerophospholipid biosynthesis. These proteins catalyze the incorporation of an acyl group from either acylCoAs or acyl-acyl carrier proteins (acylACPs) into acceptors such as glycerol 3-phosphate, dihydroxyacetone phosphate or lyso-phosphatidic acid. Included in this superfamily are LPLATs such as glycerol-3-phosphate 1-acyltransferase (GPAT, PlsB), 1-acyl-sn-glycerol-3-phosphate acyltransferase (AGPAT, PlsC), lysophosphatidylcholine acyltransferase 1 (LPCAT-1), lysophosphatidylethanolamine acyltransferase (LPEAT, also known as, MBOAT2, membrane-bound O-acyltransferase domain-containing protein 2), lipid A biosynthesis lauroyl/myristoyl acyltransferase, 2-acylglycerol O-acyltransferase (MGAT), dihydroxyacetone phosphate acyltransferase (DHAPAT, also known as 1 glycerol-3-phosphate O-acyltransferase 1) and Tafazzin (the protein product of the Barth syndrome (TAZ) gene)." Q#16355 - CGI_10006945 superfamily 215577 8 243 2.10E-49 168.308 cl14728 PLN03103 superfamily C - GDP-L-galactose-hexose-1-phosphate guanyltransferase; Provisional Q#16359 - CGI_10016293 superfamily 246925 3 111 0.0095108 34.2534 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." 
Q#16362 - CGI_10016296 superfamily 241600 22 231 1.02E-97 286.444 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#16363 - CGI_10016297 superfamily 245206 15 245 1.34E-76 235.967 cl09931 NADB_Rossmann superfamily C - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." 
Q#16364 - CGI_10016298 superfamily 241563 89 126 1.06E-06 46.1263 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#16365 - CGI_10016299 superfamily 243092 4 217 9.17E-17 81.2272 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#16365 - CGI_10016299 superfamily 243092 850 907 2.92E-05 45.7888 cl02567 WD40 superfamily NC - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#16366 - CGI_10016301 superfamily 241596 60 118 3.34E-11 58.7647 cl00081 HLH superfamily - - "Helix-loop-helix domain, found in specific DNA- binding proteins that act as transcription factors; 60-100 amino acids long. 
A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE (5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#16368 - CGI_10016303 superfamily 245864 1 201 1.85E-75 236.406 cl12078 p450 superfamily N - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. 
While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#16369 - CGI_10016304 superfamily 244657 40 98 1.08E-22 85.8652 cl07247 CDC37_C superfamily C - Cdc37 C terminal domain; Cdc37 is a protein required for the activity of numerous eukaryotic protein kinases. This domains corresponds to the C terminal domain whose function is unclear. It is found C terminal to the Hsp90 chaperone (Heat shocked protein 90) binding domain pfam08565 and the N terminal kinase binding domain of Cdc37 pfam03234. Q#16369 - CGI_10016304 superfamily 244658 1 27 0.00167605 35.1158 cl07248 CDC37_M superfamily N - "Cdc37 Hsp90 binding domain; Cdc37 is a molecular chaperone required for the activity of numerous eukaryotic protein kinases. This domains corresponds to the Hsp90 chaperone (Heat shocked protein 90) binding domain of Cdc37. It is found between the N terminal Cdc37 domain pfam03234, which is predominantly involved in kinase binding, and the C terminal domain of Cdc37 pfam08564 whose function is unclear." Q#16370 - CGI_10016305 superfamily 247724 7 168 4.76E-60 188.255 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#16372 - CGI_10016307 superfamily 242385 125 430 0 608.291 cl01244 arom_aa_hydroxylase superfamily - - "Biopterin-dependent aromatic amino acid hydroxylase; a family of non-heme, iron(II)-dependent enzymes that includes prokaryotic and eukaryotic phenylalanine-4-hydroxylase (PheOH), eukaryotic tyrosine hydroxylase (TyrOH) and eukaryotic tryptophan hydroxylase (TrpOH). PheOH converts L-phenylalanine to L-tyrosine, an important step in phenylalanine catabolism and neurotransmitter biosynthesis, and is linked to a severe variant of phenylketonuria in humans. TyrOH and TrpOH are involved in the biosynthesis of catecholamine and serotonin, respectively. The eukaryotic enzymes are all homotetramers." Q#16372 - CGI_10016307 superfamily 245020 39 112 4.29E-26 100.711 cl09141 ACT superfamily - - "ACT domains are commonly involved in specifically binding an amino acid or other small ligand leading to regulation of the enzyme; Members of this CD belong to the superfamily of ACT regulatory domains. Pairs of ACT domains are commonly involved in specifically binding an amino acid or other small ligand leading to regulation of the enzyme. 
The ACT domain has been detected in a number of diverse proteins; some of these proteins are involved in amino acid and purine biosynthesis, phenylalanine hydroxylation, regulation of bacterial metabolism and transcription, and many remain to be characterized. ACT domain-containing enzymes involved in amino acid and purine synthesis are in many cases allosteric enzymes with complex regulation enforced by the binding of ligands. The ACT domain is commonly involved in the binding of a small regulatory molecule, such as the amino acids L-Ser and L-Phe in the case of D-3-phosphoglycerate dehydrogenase and the bifunctional chorismate mutase-prephenate dehydratase enzyme (P-protein), respectively. Aspartokinases typically consist of two C-terminal ACT domains in a tandem repeat, but the second ACT domain is inserted within the first, so that what is normally the terminal beta strand of ACT2 is formed from a region N-terminal of ACT1. ACT domain repeats have been shown to have nonequivalent ligand-binding sites with complex regulatory patterns such as those seen in the bifunctional enzyme, aspartokinase-homoserine dehydrogenase (ThrA). In other enzymes, such as phenylalanine hydroxylases, the ACT domain appears to function as a flexible small module providing allosteric regulation via transmission of conformational changes; these conformational changes are not necessarily initiated by regulatory ligand binding at the ACT domain itself. ACT domains are present either singly, N- or C-terminal, or in pairs present C-terminal or between two catalytic domains. Unique to cyanobacteria are four ACT domains C-terminal to an aspartokinase domain. A few proteins are composed almost entirely of ACT domain repeats as seen in the four ACT domain protein, the ACR protein, found in higher plants; and the two ACT domain protein, the glycine cleavage system transcriptional repressor (GcvR) protein, found in some bacteria. 
Also seen are single ACT domain proteins similar to the Streptococcus pneumoniae ACT domain protein (uncharacterized pdb structure 1ZPV) found in both bacteria and archaea. Purportedly, the ACT domain is an evolutionarily mobile ligand binding regulatory module that has been fused to different enzymes at various times." Q#16374 - CGI_10016309 superfamily 248097 1 127 1.06E-25 95.0246 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#16375 - CGI_10016310 superfamily 246597 25 311 5.22E-131 385.743 cl13995 MPP_superfamily superfamily - - "metallophosphatase superfamily, metallophosphatase domain; Metallophosphatases (MPPs), also known as metallophosphoesterases, phosphodiesterases (PDEs), binuclear metallophosphoesterases, and dimetal-containing phosphoesterases (DMPs), represent a diverse superfamily of enzymes with a conserved domain containing an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. This superfamily includes: the phosphoprotein phosphatases (PPPs), Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination." Q#16375 - CGI_10016310 superfamily 217260 347 509 1.83E-45 157.416 cl03752 5_nucleotid_C superfamily - - "5'-nucleotidase, C-terminal domain; 5'-nucleotidase, C-terminal domain. 
" Q#16377 - CGI_10016312 superfamily 243092 100 412 6.91E-39 148.252 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#16377 - CGI_10016312 superfamily 241607 1218 1243 4.64E-06 45.7238 cl00097 KAZAL_FS superfamily C - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chyomotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. 
Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#16377 - CGI_10016312 superfamily 241607 578 612 4.85E-05 42.6422 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chyomotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. 
Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#16377 - CGI_10016312 superfamily 241607 1047 1081 6.65E-05 42.257 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chyomotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. 
The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#16377 - CGI_10016312 superfamily 241607 921 955 0.000120189 41.4866 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chyomotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#16377 - CGI_10016312 superfamily 241607 878 912 0.000637876 39.1754 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chyomotrypsin, avian ovomucoids, and elastases. 
The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#16377 - CGI_10016312 superfamily 241607 708 742 0.00110708 38.405 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chyomotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. 
Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#16377 - CGI_10016312 superfamily 241607 1132 1157 0.00169407 38.0198 cl00097 KAZAL_FS superfamily C - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chyomotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. 
Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#16377 - CGI_10016312 superfamily 241607 663 687 0.00247034 37.6346 cl00097 KAZAL_FS superfamily C - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chyomotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. 
The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#16377 - CGI_10016312 superfamily 241607 544 572 0.00278991 37.2494 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chyomotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#16377 - CGI_10016312 superfamily 241607 750 774 0.00532181 36.479 cl00097 KAZAL_FS superfamily C - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chyomotrypsin, avian ovomucoids, and elastases. 
The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." 
Q#16377 - CGI_10016312 superfamily 243092 10 173 2.62E-20 92.398 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#16377 - CGI_10016312 superfamily 241607 839 874 1.69E-07 49.975 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chyomotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. 
Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#16377 - CGI_10016312 superfamily 241607 1008 1043 2.88E-07 49.2046 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chyomotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. 
Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#16377 - CGI_10016312 superfamily 241607 1083 1106 9.77E-05 41.5765 cl00097 KAZAL_FS superfamily C - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chyomotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. 
The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#16377 - CGI_10016312 superfamily 241607 1173 1193 0.000278398 40.3451 cl00097 KAZAL_FS superfamily C - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chyomotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#16377 - CGI_10016312 superfamily 241607 791 816 0.000468884 39.5747 cl00097 KAZAL_FS superfamily C - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chyomotrypsin, avian ovomucoids, and elastases. 
The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#16377 - CGI_10016312 superfamily 241607 963 984 0.000560517 39.5747 cl00097 KAZAL_FS superfamily C - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chyomotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. 
Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#16377 - CGI_10016312 superfamily 241607 616 640 0.000997441 38.8043 cl00097 KAZAL_FS superfamily C - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chyomotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. 
Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#16382 - CGI_10016317 superfamily 241737 129 258 8.18E-72 219.722 cl00264 Ferritin_like superfamily - - "Ferritin-like superfamily of diiron-containing four-helix-bundle proteins; Ferritin-like, diiron-carboxylate proteins participate in a range of functions including iron regulation, mono-oxygenation, and reactive radical production. These proteins are characterized by the fact that they catalyze dioxygen-dependent oxidation-hydroxylation reactions within diiron centers; one exception is manganese catalase, which catalyzes peroxide-dependent oxidation-reduction within a dimanganese center. Diiron-carboxylate proteins are further characterized by the presence of duplicate metal ligands, glutamates and histidines (ExxH) and two additional glutamates within a four-helix bundle. Outside of these conserved residues there is little obvious homology. Members include bacterioferritin, ferritin, rubrerythrin, aromatic and alkene monooxygenase hydroxylases (AAMH), ribonucleotide reductase R2 (RNRR2), acyl-ACP-desaturases (Acyl_ACP_Desat), manganese (Mn) catalases, demethoxyubiquinone hydroxylases (DMQH), DNA protecting proteins (DPS), and ubiquinol oxidases (AOX), and the aerobic cyclase system, Fe-containing subunit (ACSF)." 
Q#16382 - CGI_10016317 superfamily 241737 11 130 7.03E-66 204.7 cl00264 Ferritin_like superfamily C - "Ferritin-like superfamily of diiron-containing four-helix-bundle proteins; Ferritin-like, diiron-carboxylate proteins participate in a range of functions including iron regulation, mono-oxygenation, and reactive radical production. These proteins are characterized by the fact that they catalyze dioxygen-dependent oxidation-hydroxylation reactions within diiron centers; one exception is manganese catalase, which catalyzes peroxide-dependent oxidation-reduction within a dimanganese center. Diiron-carboxylate proteins are further characterized by the presence of duplicate metal ligands, glutamates and histidines (ExxH) and two additional glutamates within a four-helix bundle. Outside of these conserved residues there is little obvious homology. Members include bacterioferritin, ferritin, rubrerythrin, aromatic and alkene monooxygenase hydroxylases (AAMH), ribonucleotide reductase R2 (RNRR2), acyl-ACP-desaturases (Acyl_ACP_Desat), manganese (Mn) catalases, demethoxyubiquinone hydroxylases (DMQH), DNA protecting proteins (DPS), and ubiquinol oxidases (AOX), and the aerobic cyclase system, Fe-containing subunit (ACSF)." 
Q#16383 - CGI_10012322 superfamily 247792 251 290 0.00015154 39.7364 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#16383 - CGI_10012322 superfamily 148004 186 223 0.000139844 40.3857 cl05589 Ifi-6-16 superfamily N - Interferon-induced 6-16 family; Interferon-induced 6-16 family. Q#16384 - CGI_10012323 superfamily 247792 438 477 0.000159314 39.3512 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#16385 - CGI_10012324 superfamily 247792 297 336 4.31E-05 40.5068 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with 
a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#16385 - CGI_10012324 superfamily 148004 120 183 4.05E-12 61.1865 cl05589 Ifi-6-16 superfamily N - Interferon-induced 6-16 family; Interferon-induced 6-16 family. Q#16386 - CGI_10012325 superfamily 241822 143 187 2.94E-08 47.8544 cl00373 Ribosomal_S18 superfamily - - Ribosomal protein S18; Ribosomal protein S18. Q#16390 - CGI_10012329 superfamily 245235 864 1265 0 687.39 cl10023 POLBc superfamily - - "DNA polymerase type-B family catalytic domain. DNA-directed DNA polymerases elongate DNA by adding nucleotide triphosphate (dNTP) residues to the 5'-end of the growing chain of DNA. DNA-directed DNA polymerases are multifunctional with both synthetic (polymerase) and degradative modes (exonucleases) and play roles in the processes of DNA replication, repair, and recombination. DNA-dependent DNA polymerases can be classified in six main groups based upon their phylogenetic relationships with E. coli polymerase I (class A), E. coli polymerase II (class B), E. coli polymerase III (class C), euryarchaeota polymerase II (class D), human polymerase beta (class x), E. coli UmuC/DinB, and eukaryotic RAP 30/Xeroderma pigmentosum variant (class Y). Family B DNA polymerases include E. coli DNA polymerase II, some eubacterial phage DNA polymerases, nuclear replicative DNA polymerases (alpha, delta, epsilon, and zeta), and eukaryotic viral and plasmid-borne enzymes. DNA polymerase is made up of distinct domains and sub-domains. The polymerase domain of DNA polymerase type B (Pol domain) is responsible for the template-directed polymerization of dNTPs onto the growing primer strand of duplex DNA that is usually magnesium dependent. 
In general, the architecture of the Pol domain has been likened to a right hand with fingers, thumb, and palm sub-domains with a deep groove to accommodate the nucleic acid substrate. There are a few conserved motifs in the Pol domain of family B DNA polymerases. The conserved aspartic acid residues in the DTDS motifs of the palm sub-domain is crucial for binding to divalent metal ion and is suggested to be important for polymerase catalysis." Q#16390 - CGI_10012329 superfamily 245226 558 794 1.45E-97 313.78 cl10012 DnaQ_like_exo superfamily - - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." 
Q#16390 - CGI_10012329 superfamily 220087 1286 1477 1.97E-51 181.307 cl18544 zf-DNA_Pol superfamily - - "DNA Polymerase alpha zinc finger; The DNA Polymerase alpha zinc finger domain adopts an alpha-helix-like structure, followed by three turns, all of which involve proline. The resulting motif is a helix-turn-helix motif, in contrast to other zinc finger domains, which show anti-parallel sheet and helix conformation. Zinc binding occurs due to the presence of four cysteine residues positioned to bind the metal centre in a tetrahedral coordination geometry. Function of this domain is uncertain: it has been proposed that the zinc finger motif may be an essential part of the DNA binding domain." Q#16390 - CGI_10012329 superfamily 221491 42 105 1.85E-14 70.7749 cl13661 DNA_pol_alpha_N superfamily - - "DNA polymerase alpha subunit p180 N terminal; This domain family is found in eukaryotes, and is approximately 70 amino acids in length. The family is found in association with pfam00136, pfam08996, pfam03104. This family is the N terminal of DNA polymerase alpha subunit p180 protein. The N terminal contains the catalytic region of the alpha subunit." Q#16391 - CGI_10012330 superfamily 245230 2 426 0 950.949 cl10017 Tubulin_FtsZ superfamily - - "Tubulin/FtsZ: Family includes tubulin alpha-, beta-, gamma-, delta-, and epsilon-tubulins as well as FtsZ, all of which are involved in polymer formation. Tubulin is the major component of microtubules, but also exists as a heterodimer and as a curved oligomer. Microtubules exist in all eukaryotic cells and are responsible for many functions, including cellular transport, cell motility, and mitosis. FtsZ forms a ring-shaped septum at the site of bacterial cell division, which is required for constriction of cell membrane and cell envelope to yield two daughter cells. FtsZ can polymerize into tubes, sheets, and rings in vitro and is ubiquitous in eubacteria, archaea, and chloroplasts." 
Q#16392 - CGI_10012331 superfamily 242109 3 140 2.36E-34 122.484 cl00809 RbsD_FucU superfamily - - "RbsD / FucU transport protein family; The Escherichia coli high-affinity ribose-transport system consists of six proteins encoded by the rbs operon (rbsD, rbsA, rbsC, rbsB, rbsK and rbsR). Of the six components, RbsD is the only one whose function is unknown although it is thought that it somehow plays a critical role in PtsG-mediated ribose transport. This family also includes FucU a protein from the fucose biosynthesis operon that is presumably also involved in fucose transport by similarity to RbsD." Q#16392 - CGI_10012331 superfamily 242232 203 252 1.23E-12 60.6496 cl00984 TM2 superfamily - - "TM2 domain; This family is composed of a pair of transmembrane alpha helices connected by a short linker. The function of this domain is unknown, however it occurs in a wide range of protein contexts." Q#16393 - CGI_10012332 superfamily 220783 2 92 6.32E-15 68.2271 cl11136 Syntaxin-18_N superfamily - - "SNARE-complex protein Syntaxin-18 N-terminus; This is the conserved N-terminal of Syntaxin-18. Syntaxin-18 is found in the SNARE complex of the endoplasmic reticulum and functions in the trafficking between the ER intermediate compartment and the cis-Golgi vesicle. In particular, the N-terminal region is important for the formation of ER aggregates. More specifically, syntaxin-18 is involved in endoplasmic reticulum-mediated phagocytosis, presumably by regulating the specific and direct fusion of the ER with the plasma or phagosomal membranes."
Q#16394 - CGI_10012333 superfamily 247684 13 187 4.79E-17 77.6291 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#16396 - CGI_10012335 superfamily 241638 35 112 2.27E-05 42.7405 cl00147 TNF superfamily - - "Tumor Necrosis Factor; TNF superfamily members include the cytokines: TNF (TNF-alpha), LT (lymphotoxin-alpha, TNF-beta), CD40 ligand, Apo2L (TRAIL), Fas ligand, and osteoprotegerin (OPG) ligand. These proteins generally have an intracellular N-terminal domain, a short transmembrane segment, an extracellular stalk, and a globular TNF-like extracellular domain of about 150 residues. 
They initiate apoptosis by binding to related receptors, some of which have intracellular death domains. They generally form homo- or hetero- trimeric complexes. TNF cytokines bind one elongated receptor molecule along each of three clefts formed by neighboring monomers of the trimer with ligand trimerization a requisite for receptor binding." Q#16398 - CGI_10012337 superfamily 241638 150 259 0.00330163 35.8069 cl00147 TNF superfamily - - "Tumor Necrosis Factor; TNF superfamily members include the cytokines: TNF (TNF-alpha), LT (lymphotoxin-alpha, TNF-beta), CD40 ligand, Apo2L (TRAIL), Fas ligand, and osteoprotegerin (OPG) ligand. These proteins generally have an intracellular N-terminal domain, a short transmembrane segment, an extracellular stalk, and a globular TNF-like extracellular domain of about 150 residues. They initiate apoptosis by binding to related receptors, some of which have intracellular death domains. They generally form homo- or hetero- trimeric complexes. TNF cytokines bind one elongated receptor molecule along each of three clefts formed by neighboring monomers of the trimer with ligand trimerization a requisite for receptor binding." Q#16399 - CGI_10012338 superfamily 198867 747 844 4.09E-30 116.874 cl06652 BACK superfamily - - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation)." Q#16399 - CGI_10012338 superfamily 243066 646 742 1.73E-25 103.54 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins.
The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#16399 - CGI_10012338 superfamily 243146 992 1035 3.47E-10 57.6714 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#16399 - CGI_10012338 superfamily 243146 1086 1141 2.62E-08 52.2786 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#16399 - CGI_10012338 superfamily 243146 1038 1083 9.92E-08 50.7378 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#16399 - CGI_10012338 superfamily 243146 946 1002 0.00127251 38.6935 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#16400 - CGI_10012339 superfamily 207712 75 180 3.21E-38 137.828 cl02728 DUF1126 superfamily - - Repeat of unknown function (DUF1126); This family consists of several eukaryote specific repeats of around 35 residues in length. The function of this family is unknown. Q#16400 - CGI_10012339 superfamily 207712 407 514 1.48E-32 122.035 cl02728 DUF1126 superfamily - - Repeat of unknown function (DUF1126); This family consists of several eukaryote specific repeats of around 35 residues in length. The function of this family is unknown. 
Q#16400 - CGI_10012339 superfamily 207712 226 345 1.41E-30 116.643 cl02728 DUF1126 superfamily - - Repeat of unknown function (DUF1126); This family consists of several eukaryote specific repeats of around 35 residues in length. The function of this family is unknown. Q#16401 - CGI_10012340 superfamily 246679 178 395 3.90E-65 208.562 cl14632 Glo_EDI_BRP_like superfamily - - "This domain superfamily is found in a variety of structurally related metalloproteins, including the type I extradiol dioxygenases, glyoxalase I and a group of antibiotic resistance proteins; This domain superfamily is found in a variety of structurally related metalloproteins, including the type I extradiol dioxygenases, glyoxalase I and a group of antibiotic resistance proteins. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). Type I extradiol dioxygenases catalyze the incorporation of both atoms of molecular oxygen into aromatic substrates, which results in the cleavage of aromatic rings. They are key enzymes in the degradation of aromatic compounds. Type I extradiol dioxygenases include class I and class II enzymes. Class I and II enzymes show sequence similarity; the two-domain class II enzymes evolved from a class I enzyme through gene duplication. Glyoxylase I catalyzes the glutathione-dependent inactivation of toxic methylglyoxal, requiring zinc or nickel ions for activity. The antibiotic resistance proteins in this family use a variety of mechanisms to block the function of antibiotics. Bleomycin resistance protein (BLMA) sequesters bleomycin's activity by directly binding to it. Whereas, three types of fosfomycin resistance proteins employ different mechanisms to render fosfomycin inactive by modifying the fosfomycin molecule. 
Although the proteins in this superfamily are functionally distinct, their structures are similar. The difference among the three dimensional structures of the three types of proteins in this superfamily is interesting from an evolutionary perspective. Both glyoxalase I and BLMA show domain swapping between subunits. However, there is no domain swapping for type 1 extradiol dioxygenases." Q#16401 - CGI_10012340 superfamily 246679 6 161 3.57E-13 66.0142 cl14632 Glo_EDI_BRP_like superfamily - - "This domain superfamily is found in a variety of structurally related metalloproteins, including the type I extradiol dioxygenases, glyoxalase I and a group of antibiotic resistance proteins; This domain superfamily is found in a variety of structurally related metalloproteins, including the type I extradiol dioxygenases, glyoxalase I and a group of antibiotic resistance proteins. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). Type I extradiol dioxygenases catalyze the incorporation of both atoms of molecular oxygen into aromatic substrates, which results in the cleavage of aromatic rings. They are key enzymes in the degradation of aromatic compounds. Type I extradiol dioxygenases include class I and class II enzymes. Class I and II enzymes show sequence similarity; the two-domain class II enzymes evolved from a class I enzyme through gene duplication. Glyoxylase I catalyzes the glutathione-dependent inactivation of toxic methylglyoxal, requiring zinc or nickel ions for activity. The antibiotic resistance proteins in this family use a variety of mechanisms to block the function of antibiotics. Bleomycin resistance protein (BLMA) sequesters bleomycin's activity by directly binding to it. 
Whereas, three types of fosfomycin resistance proteins employ different mechanisms to render fosfomycin inactive by modifying the fosfomycin molecule. Although the proteins in this superfamily are functionally distinct, their structures are similar. The difference among the three dimensional structures of the three types of proteins in this superfamily is interesting from an evolutionary perspective. Both glyoxalase I and BLMA show domain swapping between subunits. However, there is no domain swapping for type 1 extradiol dioxygenases." Q#16402 - CGI_10012341 superfamily 246918 27 50 4.24E-05 39.1071 cl15278 TSP_1 superfamily C - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#16402 - CGI_10012341 superfamily 246918 90 133 0.00189857 34.4847 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#16403 - CGI_10012342 superfamily 248458 862 1150 1.56E-16 81.9765 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. 
Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#16403 - CGI_10012342 superfamily 248458 1219 1471 6.52E-13 71.1909 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. 
Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#16403 - CGI_10012342 superfamily 248458 512 800 2.40E-11 66.1833 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 
Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#16403 - CGI_10012342 superfamily 248458 37 206 2.06E-06 50.7753 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. 
Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#16404 - CGI_10012343 superfamily 241599 108 166 1.34E-23 91.1508 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#16404 - CGI_10012343 superfamily 146451 253 271 8.31E-05 38.8795 cl08404 OAR superfamily - - OAR domain; OAR domain. Q#16406 - CGI_10023267 superfamily 216101 1 463 8.58E-180 521.468 cl08288 Carn_acyltransf superfamily N - Choline/Carnitine o-acyltransferase; Choline/Carnitine o-acyltransferase. Q#16407 - CGI_10023268 superfamily 241563 56 92 0.00240893 35.918 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#16408 - CGI_10023269 superfamily 201678 134 163 5.44E-08 48.2652 cl03138 PPTA superfamily - - "Protein prenyltransferase alpha subunit repeat; Both farnesyltransferase (FT) and geranylgeranyltransferase 1 (GGT1) recognise a CaaX motif on their substrates where 'a' stands for preferably aliphatic residues, whereas GGT2 recognises a completely different motif. Important substrates for FT include, amongst others, many members of the Ras superfamily. GGT1 substrates include some of the other small GTPases and GGT2 substrates include the Rab family." 
Q#16408 - CGI_10023269 superfamily 201678 104 129 9.81E-05 39.0204 cl03138 PPTA superfamily - - "Protein prenyltransferase alpha subunit repeat; Both farnesyltransferase (FT) and geranylgeranyltransferase 1 (GGT1) recognise a CaaX motif on their substrates where 'a' stands for preferably aliphatic residues, whereas GGT2 recognises a completely different motif. Important substrates for FT include, amongst others, many members of the Ras superfamily. GGT1 substrates include some of the other small GTPases and GGT2 substrates include the Rab family." Q#16408 - CGI_10023269 superfamily 201678 65 88 0.000228952 38.25 cl03138 PPTA superfamily C - "Protein prenyltransferase alpha subunit repeat; Both farnesyltransferase (FT) and geranylgeranyltransferase 1 (GGT1) recognise a CaaX motif on their substrates where 'a' stands for preferably aliphatic residues, whereas GGT2 recognises a completely different motif. Important substrates for FT include, amongst others, many members of the Ras superfamily. GGT1 substrates include some of the other small GTPases and GGT2 substrates include the Rab family." Q#16408 - CGI_10023269 superfamily 201678 207 236 0.000820444 36.324 cl03138 PPTA superfamily - - "Protein prenyltransferase alpha subunit repeat; Both farnesyltransferase (FT) and geranylgeranyltransferase 1 (GGT1) recognise a CaaX motif on their substrates where 'a' stands for preferably aliphatic residues, whereas GGT2 recognises a completely different motif. Important substrates for FT include, amongst others, many members of the Ras superfamily. GGT1 substrates include some of the other small GTPases and GGT2 substrates include the Rab family." 
Q#16408 - CGI_10023269 superfamily 201678 168 196 0.00421634 34.398 cl03138 PPTA superfamily - - "Protein prenyltransferase alpha subunit repeat; Both farnesyltransferase (FT) and geranylgeranyltransferase 1 (GGT1) recognise a CaaX motif on their substrates where 'a' stands for preferably aliphatic residues, whereas GGT2 recognises a completely different motif. Important substrates for FT include, amongst others, many members of the Ras superfamily. GGT1 substrates include some of the other small GTPases and GGT2 substrates include the Rab family." Q#16409 - CGI_10023270 superfamily 246918 15 67 2.76E-12 56.4411 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#16410 - CGI_10023271 superfamily 246918 32 53 0.000431081 33.7143 cl15278 TSP_1 superfamily C - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#16411 - CGI_10023272 superfamily 248288 19 99 0.00014941 35.8422 cl17734 DAN superfamily - - "DAN domain; This domain contains 9 conserved cysteines and is extracellular. Therefore the cysteines may form disulphide bridges. This family of proteins has been termed the DAN family after the first member to be reported. This family includes DAN, Cerberus and Gremlin. The gremlin protein is an antagonist of bone morphogenetic protein signaling. It is postulated that all members of this family antagonise different TGF beta pfam00019 ligands. Recent work shows that the DAN protein is not an efficient antagonist of BMP-2/4 class signals, we found that DAN was able to interact with GDF-5 in a frog embryo assay, suggesting that DAN may regulate signaling by the GDF-5/6/7 class of BMPs in vivo." 
Q#16412 - CGI_10023273 superfamily 246598 24 306 1.29E-165 465.189 cl13996 MPN superfamily - - "Mpr1p, Pad1p N-terminal (MPN) domains; MPN (also known as Mov34, PAD-1, JAMM, JAB, MPN+) domains are found in the N-terminal termini of proteins with a variety of functions; they are components of the proteasome regulatory subunits, the signalosome (CSN), eukaryotic translation initiation factor 3 (eIF3) complexes, and regulators of transcription factors. These domains are isopeptidases that release ubiquitin from ubiquitinated proteins (thus having deubiquitinating (DUB) activity) that are tagged for degradation. Catalytically active MPN domains contain a metalloprotease signature known as the JAB1/MPN/Mov34 metalloenzyme (JAMM) motif. For example, Rpn11 (also known as POH1 or PSMD14), a subunit of the 19S proteasome lid is involved in the ATP-dependent degradation of ubiquitinated proteins, contains the conserved JAMM motif involved in zinc ion coordination. Poh1 is a regulator of c-Jun, an important regulator of cell proliferation, differentiation, survival and death. JAB1 is a component of the COP9 signalosome (CSN), a regulatory particle of the ubiquitin (Ub)/26S proteasome system occurring in all eukaryotic cells; it cleaves the ubiquitin-like protein NEDD8 from the cullin subunit of the SCF (Skp1, Cullins, F-box proteins) family of E3 ubiquitin ligases. AMSH (associated molecule with the SH3 domain of STAM, also known as STAMBP), a member of JAMM/MPN+ deubiquitinases (DUBs), specifically cleaves Lys 63-linked polyubiquitin (poly-Ub) chains, thus facilitating the recycling and subsequent trafficking of receptors to the cell surface. Similarly, BRCC36, part of the nuclear complex that includes BRCA1 protein and is targeted to DNA damage foci after irradiation, specifically disassembles K63-linked polyUb. BRCC36 is aberrantly expressed in sporadic breast tumors, indicative of a potential role in the pathogenesis of the disease. 
Some variants of the JAB1/MPN domains lack key residues in their JAMM motif and are unable to coordinate a metal ion. Comparisons of key catalytic and metal binding residues explain why the MPN-containing proteins Mov34/PSMD7, Rpn8, CSN6, Prp8p, and the translation initiation factor 3 subunits f (p47) and h (p40) do not show catalytic isopeptidase activity. It has been proposed that the MPN domain in these proteins has a primarily structural function." Q#16415 - CGI_10023276 superfamily 247743 42 145 5.17E-06 43.2887 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#16416 - CGI_10023277 superfamily 247941 100 248 4.50E-13 64.2792 cl17387 Methyltransf_21 superfamily - - "Methyltransferase FkbM domain; This family has members from bacteria to human, and appears to be a methyltransferase." Q#16417 - CGI_10023278 superfamily 247941 209 351 1.26E-16 75.8352 cl17387 Methyltransf_21 superfamily - - "Methyltransferase FkbM domain; This family has members from bacteria to human, and appears to be a methyltransferase." Q#16418 - CGI_10023280 superfamily 216301 42 223 8.02E-32 115.824 cl03099 EMP24_GP25L superfamily - - emp24/gp25L/p24 family/GOLD; Members of this family are implicated in bringing cargo forward from the ER and binding to coat proteins by their cytoplasmic domains. This domain corresponds closely to the beta-strand rich GOLD domain described in. 
The GOLD domain is always found combined with lipid- or membrane-association domains. Q#16419 - CGI_10023281 superfamily 241565 8 84 0.000140868 43.0791 cl00038 BRCT superfamily - - "Breast Cancer Suppressor Protein (BRCA1), carboxy-terminal domain. The BRCT domain is found within many DNA damage repair and cell cycle checkpoint proteins. The unique diversity of this domain superfamily allows BRCT modules to interact forming homo/hetero BRCT multimers, BRCT-non-BRCT interactions, and interactions within DNA strand breaks." Q#16419 - CGI_10023281 superfamily 242919 2844 3087 2.82E-91 301.122 cl02173 Tfb4 superfamily - - Transcription factor Tfb4; This family appears to be distantly related to the VWA domain. Q#16419 - CGI_10023281 superfamily 241752 269 567 2.29E-30 126.231 cl00283 ADP_ribosyl superfamily - - "ADP_ribosylating enzymes catalyze the transfer of ADP_ribose from NAD+ to substrates. Bacterial toxins are cytoplasmic and catalyze the transfer of a single ADP_ribose unit to eukaryotic elongation factor 2, halting protein synthesis and killing the cell. Poly(ADP-ribose) polymerases (PARPS 1-3, VPARP, tankyrase) catalyze the addition of up to 100 ADP_ribose units from NAD+. PARPs 1 and 2 are localized in the nucleus, bind DNA, and are activated by DNA damage. VPARP is part of the vault ribonucleoprotein complex. Tankyrases regulate telomere length in part through poly(ADP_ribosylation) of telomere repeat binding factor 1 (TRF1). Poly(ADP-ribose) polymerase catalyses the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active." 
Q#16419 - CGI_10023281 superfamily 241578 868 1031 7.15E-24 102.678 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#16419 - CGI_10023281 superfamily 207701 625 733 2.86E-23 99.3547 cl02699 VIT superfamily - - Vault protein inter-alpha-trypsin domain; Inter-alpha-trypsin inhibitors (ITIs) consist of one light chain and a variable set of heavy chains. ITIs play a role in extracellular matrix (ECM) stabilisation and tumour metastasis as well as in plasma protease inhibition. The vault protein inter-alpha-trypsin (VIT) domain described here is found to the N-terminus of a von Willebrand factor type A domain (pfam00092) in ITI heavy chains (ITIHs) and their precursors. Q#16419 - CGI_10023281 superfamily 241645 1280 1348 5.22E-08 53.3354 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. 
Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#16423 - CGI_10023285 superfamily 243077 3 62 5.42E-14 65.2593 cl02542 DnaJ superfamily - - "DnaJ domain or J-domain. DnaJ/Hsp40 (heat shock protein 40) proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperonine family. Hsp40 proteins are characterized by the presence of a J domain, which mediates the interaction with Hsp70. They may contain other domains as well, and the architectures provide a means of classification." Q#16423 - CGI_10023285 superfamily 242914 100 132 0.000472862 37.208 cl02163 zf-CSL superfamily C - CSL zinc finger; This is a zinc binding motif which contains four cysteine residues which chelate zinc. This domain is often found associated with a pfam00226 domain. This domain is named after the conserved motif of the final cysteine. Q#16424 - CGI_10023286 superfamily 218655 45 161 1.72E-66 202.082 cl05269 DUF778 superfamily - - Protein of unknown function (DUF778); This family consists of several eukaryotic proteins of unknown function. Q#16426 - CGI_10023288 superfamily 242849 7 60 2.08E-09 51.4357 cl02041 Cyt-b5 superfamily C - Cytochrome b5-like Heme/Steroid binding domain; This family includes heme binding domains from a diverse range of proteins. 
This family also includes proteins that bind to steroids. The family includes progesterone receptors. Many members of this subfamily are membrane anchored by an N-terminal transmembrane alpha helix. This family also includes a domain in some chitin synthases. There is no known ligand for this domain in the chitin synthases. Q#16428 - CGI_10023290 superfamily 246671 29 131 3.54E-20 85.1672 cl14606 Reeler_cohesin_like superfamily N - "Domains similar to the eukaryotic reeler domain and bacterial cohesins; This diverse family summarizes a set of distantly related domains, as revealed by structural similarity." Q#16429 - CGI_10023291 superfamily 246671 35 174 1.01E-18 79.7744 cl14606 Reeler_cohesin_like superfamily - - "Domains similar to the eukaryotic reeler domain and bacterial cohesins; This diverse family summarizes a set of distantly related domains, as revealed by structural similarity." Q#16430 - CGI_10023292 superfamily 245201 701 968 0 534.611 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#16430 - CGI_10023292 superfamily 247057 1003 1063 2.59E-24 98.4563 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. 
SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#16430 - CGI_10023292 superfamily 241584 408 492 1.64E-14 70.9883 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#16430 - CGI_10023292 superfamily 241584 512 601 5.99E-06 45.5651 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#16430 - CGI_10023292 superfamily 243148 130 280 8.95E-48 169.506 cl02704 EphR_LBD superfamily - - "Ligand Binding Domain of Ephrin Receptors; Ephrin receptors (EphRs) comprise the largest subfamily of receptor tyrosine kinases (RTKs). They are subdivided into 2 groups, A and B type receptors, depending on their ligand ephrin-A or ephrin-B, respectively. 
In general, class EphA receptors bind GPI-anchored ephrin-A ligands. There are ten vertebrate EphA receptors (EphA1-10), which display promiscuous interactions with six ephrin-A ligands. Class EphB receptors bind to transmembrane ephrin-B ligands. There are six vertebrate EphB receptors (EphB1-6), which display promiscuous interactions with three ephrin-B ligands. One exception is EphB2, which also interacts with ephrin A5. EphRs contain a ligand binding domain and two fibronectin repeats extracellularly, a transmembrane segment, and a cytoplasmic tyrosine kinase domain. Binding of the ephrin ligand to EphR requires cell-cell contact since both are anchored to the plasma membrane. The resulting downstream signals occur bidirectionally in both EphR-expressing cells (forward signaling) and ephrin-expressing cells (reverse signaling). Ephrin/EphR interaction mainly results in cell-cell repulsion or adhesion, making it important in neural development and plasticity, cell morphogenesis, cell-fate determination, embryonic development, tissue patterning, and angiogenesis." Q#16430 - CGI_10023292 superfamily 219467 354 393 0.0069368 36.1572 cl08456 NCD3G superfamily - - "Nine Cysteines Domain of family 3 GPCR; This conserved sequence contains several highly-conserved Cys residues that are predicted to form disulphide bridges. It is predicted to lie outside the cell membrane, tethered to the pfam00003 in several receptor proteins." Q#16432 - CGI_10023294 superfamily 192535 34 183 5.09E-06 45.2794 cl18179 7TM_GPCR_Srsx superfamily C - Serpentine type 7TM GPCR chemoreceptor Srsx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srsx is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. 
Q#16433 - CGI_10023295 superfamily 247870 70 244 0.00541927 37.6498 cl17316 Trp_Tyr_perm superfamily C - Tryptophan/tyrosine permease family; Tryptophan/tyrosine permease family. Q#16435 - CGI_10023297 superfamily 247856 97 151 1.12E-10 54.0909 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#16435 - CGI_10023297 superfamily 247856 29 88 4.16E-10 52.5501 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#16435 - CGI_10023297 superfamily 247856 134 185 0.00319683 33.6753 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#16438 - CGI_10023300 superfamily 241743 744 842 0.00197456 37.7413 cl00274 ML superfamily N - "The ML (MD-2-related lipid-recognition) domain is present in MD-1, MD-2, GM2 activator protein, Niemann-Pick type C2 (Npc2) protein, phosphatidylinositol/phosphatidylglycerol transfer protein (PG/PI-TP), mite allergen Der p 2 and several proteins of unknown function in plants, animals and fungi. 
These single-domain proteins form two anti-parallel beta-pleated sheets stabilized by three disulfide bonds and with an accessible central hydrophobic cavity, and are predicted to mediate diverse biological functions through interaction with specific lipids." Q#16439 - CGI_10023301 superfamily 241743 102 204 0.00562356 35.4301 cl00274 ML superfamily - - "The ML (MD-2-related lipid-recognition) domain is present in MD-1, MD-2, GM2 activator protein, Niemann-Pick type C2 (Npc2) protein, phosphatidylinositol/phosphatidylglycerol transfer protein (PG/PI-TP), mite allergen Der p 2 and several proteins of unknown function in plants, animals and fungi. These single-domain proteins form two anti-parallel beta-pleated sheets stabilized by three disulfide bonds and with an accessible central hydrophobic cavity, and are predicted to mediate diverse biological functions through interaction with specific lipids." Q#16442 - CGI_10023304 superfamily 245595 398 681 1.70E-153 454.358 cl11393 Peptidase_M14_like superfamily - - "M14 family of metallocarboxypeptidases and related proteins; The M14 family of metallocarboxypeptidases (MCPs), also known as funnelins, are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Two major subfamilies of the M14 family, defined based on sequence and structural homology, are the A/B and N/E subfamilies. Enzymes belonging to the A/B subfamily are normally synthesized as inactive precursors containing preceding signal peptide, followed by an N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases. The A/B enzymes can be further divided based on their substrate specificity; Carboxypeptidase A-like (CPA-like) enzymes favor hydrophobic residues while carboxypeptidase B-like (CPB-like) enzymes only cleave the basic residues lysine or arginine. 
The A forms have slightly different specificities, with Carboxypeptidase A1 (CPA1) preferring aliphatic and small aromatic residues, and CPA2 preferring the bulky aromatic side chains. Enzymes belonging to the N/E subfamily enzymes are not produced as inactive precursors and instead rely on their substrate specificity and subcellular compartmentalization to prevent inappropriate cleavage. They contain an extra C-terminal transthyretin-like domain, thought to be involved in folding or formation of oligomers. MCPs can also be classified based on their involvement in specific physiological processes; the pancreatic MCPs participate only in alimentary digestion and include carboxypeptidase A and B (A/B subfamily), while others, namely regulatory MCPs or the N/E subfamily, are involved in more selective reactions, mainly in non-digestive tissues and fluids, acting on blood coagulation/fibrinolysis, inflammation and local anaphylaxis, pro-hormone and neuropeptide processing, cellular response and others. Another MCP subfamily, is that of succinylglutamate desuccinylase /aspartoacylase, which hydrolyzes N-acetyl-L-aspartate (NAA), and deficiency in which is the established cause of Canavan disease. Another subfamily (referred to as subfamily C) includes an exceptional type of activity in the MCP family, that of dipeptidyl-peptidase activity of gamma-glutamyl-(L)-meso-diaminopimelate peptidase I which is involved in bacterial cell wall metabolism." Q#16442 - CGI_10023304 superfamily 245595 5 271 6.68E-146 434.713 cl11393 Peptidase_M14_like superfamily - - "M14 family of metallocarboxypeptidases and related proteins; The M14 family of metallocarboxypeptidases (MCPs), also known as funnelins, are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. 
Two major subfamilies of the M14 family, defined based on sequence and structural homology, are the A/B and N/E subfamilies. Enzymes belonging to the A/B subfamily are normally synthesized as inactive precursors containing preceding signal peptide, followed by an N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases. The A/B enzymes can be further divided based on their substrate specificity; Carboxypeptidase A-like (CPA-like) enzymes favor hydrophobic residues while carboxypeptidase B-like (CPB-like) enzymes only cleave the basic residues lysine or arginine. The A forms have slightly different specificities, with Carboxypeptidase A1 (CPA1) preferring aliphatic and small aromatic residues, and CPA2 preferring the bulky aromatic side chains. Enzymes belonging to the N/E subfamily enzymes are not produced as inactive precursors and instead rely on their substrate specificity and subcellular compartmentalization to prevent inappropriate cleavage. They contain an extra C-terminal transthyretin-like domain, thought to be involved in folding or formation of oligomers. MCPs can also be classified based on their involvement in specific physiological processes; the pancreatic MCPs participate only in alimentary digestion and include carboxypeptidase A and B (A/B subfamily), while others, namely regulatory MCPs or the N/E subfamily, are involved in more selective reactions, mainly in non-digestive tissues and fluids, acting on blood coagulation/fibrinolysis, inflammation and local anaphylaxis, pro-hormone and neuropeptide processing, cellular response and others. Another MCP subfamily, is that of succinylglutamate desuccinylase /aspartoacylase, which hydrolyzes N-acetyl-L-aspartate (NAA), and deficiency in which is the established cause of Canavan disease. 
Another subfamily (referred to as subfamily C) includes an exceptional type of activity in the MCP family, that of dipeptidyl-peptidase activity of gamma-glutamyl-(L)-meso-diaminopimelate peptidase I which is involved in bacterial cell wall metabolism." Q#16442 - CGI_10023304 superfamily 248053 277 326 4.29E-12 63.312 cl17499 Peptidase_M14NE-CP-C_like superfamily C - "Peptidase associated domain: C-terminal domain of M14 N/E carboxypeptidase; putative folding, regulation, or interaction domain; This domain is found C-terminal to the M14 carboxypeptidase (CP) N/E subfamily containing zinc-binding enzymes that hydrolyze single C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. The N/E subfamily includes enzymatically active members (carboxypeptidase N, E, M, D, and Z), as well as non-active members (carboxypeptidase-like protein 1, -2, aortic CP-like protein, and adipocyte enhancer binding protein-1) which lack the critical active site and substrate-binding residues considered necessary for activity. The active N/E enzymes fulfill a variety of cellular functions, including prohormone processing, regulation of peptide hormone activity, alteration of protein-protein or protein-cell interactions and transcriptional regulation. For M14 CPs, it has been suggested that this domain may assist in folding of the CP domain, regulate enzyme activity, or be involved in interactions with other proteins or with membranes; for carboxypeptidase M, it may interact with the bradykinin 1 receptor at the cell surface. This domain may also be found in other peptidase families." 
Q#16442 - CGI_10023304 superfamily 248053 687 736 2.82E-11 60.6156 cl17499 Peptidase_M14NE-CP-C_like superfamily C - "Peptidase associated domain: C-terminal domain of M14 N/E carboxypeptidase; putative folding, regulation, or interaction domain; This domain is found C-terminal to the M14 carboxypeptidase (CP) N/E subfamily containing zinc-binding enzymes that hydrolyze single C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. The N/E subfamily includes enzymatically active members (carboxypeptidase N, E, M, D, and Z), as well as non-active members (carboxypeptidase-like protein 1, -2, aortic CP-like protein, and adipocyte enhancer binding protein-1) which lack the critical active site and substrate-binding residues considered necessary for activity. The active N/E enzymes fulfill a variety of cellular functions, including prohormone processing, regulation of peptide hormone activity, alteration of protein-protein or protein-cell interactions and transcriptional regulation. For M14 CPs, it has been suggested that this domain may assist in folding of the CP domain, regulate enzyme activity, or be involved in interactions with other proteins or with membranes; for carboxypeptidase M, it may interact with the bradykinin 1 receptor at the cell surface. This domain may also be found in other peptidase families." Q#16443 - CGI_10023305 superfamily 245201 45 416 0 740.405 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. 
These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#16444 - CGI_10023306 superfamily 219316 305 489 1.34E-53 187.043 cl06268 B9-C2 superfamily - - "Ciliary basal body-associated, B9 protein; The B9-C2 domain is found in proteins associated with the ciliary basal body. B9 domains were identified as a specific family of C2 domains. There are three sub-families represented by this family, notably, Mks1-Xbx7, Stumpy-Tza1 and Tza2 groups of proteins. Mutations in human Mks1 result in the developmental disorder Meckel-Gruber syndrome; mutations in mouse Stumpy lead to perinatal hydrocephalus and severe polycystic kidney disease. All the three distinct types of B9-C2 proteins cooperatively localise to the basal body or centrosome of cilia." Q#16445 - CGI_10023307 superfamily 241565 537 607 1.60E-07 49.2423 cl00038 BRCT superfamily - - "Breast Cancer Suppressor Protein (BRCA1), carboxy-terminal domain. The BRCT domain is found within many DNA damage repair and cell cycle checkpoint proteins. The unique diversity of this domain superfamily allows BRCT modules to interact forming homo/hetero BRCT multimers, BRCT-non-BRCT interactions, and interactions within DNA strand breaks." Q#16445 - CGI_10023307 superfamily 241565 303 370 2.99E-05 42.3087 cl00038 BRCT superfamily - - "Breast Cancer Suppressor Protein (BRCA1), carboxy-terminal domain. The BRCT domain is found within many DNA damage repair and cell cycle checkpoint proteins. The unique diversity of this domain superfamily allows BRCT modules to interact forming homo/hetero BRCT multimers, BRCT-non-BRCT interactions, and interactions within DNA strand breaks." Q#16445 - CGI_10023307 superfamily 202000 1 148 1.51E-50 173.044 cl03375 XRCC1_N superfamily - - XRCC1 N terminal domain; XRCC1 N terminal domain. 
Q#16446 - CGI_10023308 superfamily 241600 19 217 8.05E-84 250.62 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#16447 - CGI_10023309 superfamily 243092 212 506 1.62E-52 180.994 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat 
are present in this alignment." Q#16447 - CGI_10023309 superfamily 203998 66 135 3.53E-36 128.901 cl07284 Prp19 superfamily - - Prp19/Pso4-like; This regions is found specifically in PRP19-like protein. The region represented by this family covers the sequence implicated in self-interaction and a coiled-coiled motif. PRP19-like proteins form an oligomer that is necessary for spliceosome assembly. Q#16447 - CGI_10023309 superfamily 248098 2 68 1.17E-19 83.0533 cl17544 U-box superfamily - - U-box domain; This domain is related to the Ring finger pfam00097 but lacks the zinc binding residues. Q#16448 - CGI_10003719 superfamily 247941 186 325 1.11E-12 64.2792 cl17387 Methyltransf_21 superfamily - - "Methyltransferase FkbM domain; This family has members from bacteria to human, and appears to be a methyltransferase." Q#16449 - CGI_10003720 superfamily 243146 257 298 1.91E-07 47.6562 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#16449 - CGI_10003720 superfamily 243146 151 191 0.00132674 36.4855 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#16449 - CGI_10003720 superfamily 243146 34 94 0.00227777 35.7151 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#16450 - CGI_10003721 superfamily 247743 337 482 0.0018217 38.6663 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. 
The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#16450 - CGI_10003721 superfamily 144608 699 757 5.96E-05 44.0429 cl18013 Mg_chelatase superfamily NC - "Magnesium chelatase, subunit ChlI; Magnesium-chelatase is a three-component enzyme that catalyzes the insertion of Mg2+ into protoporphyrin IX. This is the first unique step in the synthesis of (bacterio)chlorophyll. Due to this, it is thought that Mg-chelatase has an important role in channelling intermediates into the (bacterio)chlorophyll branch in response to conditions suitable for photosynthetic growth. ChlI and BchD have molecular weight between 38-42 kDa." Q#16451 - CGI_10003722 superfamily 245244 197 262 1.90E-21 86.1726 cl10045 tRNA_int_endo superfamily C - "tRNA intron endonuclease, catalytic C-terminal domain; Members of this family cleave pre tRNA at the 5' and 3' splice sites to release the intron EC:3.1.27.9." Q#16461 - CGI_10008371 superfamily 217658 23 70 2.97E-17 71.3901 cl04196 UPF0041 superfamily C - Uncharacterized protein family (UPF0041); Uncharacterized protein family (UPF0041). Q#16462 - CGI_10008372 superfamily 247724 17 180 3.77E-64 209.942 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#16463 - CGI_10008373 superfamily 207690 553 575 2.26E-06 45.0013 cl02656 zf-RanBP superfamily - - Zn-finger in Ran binding protein and others; Zn-finger in Ran binding protein and others. Q#16464 - CGI_10008374 superfamily 217658 64 154 9.92E-26 96.0428 cl04196 UPF0041 superfamily - - Uncharacterized protein family (UPF0041); Uncharacterized protein family (UPF0041). Q#16465 - CGI_10008375 superfamily 243072 563 659 1.14E-10 60.0898 cl02529 ANK superfamily C - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#16465 - CGI_10008375 superfamily 243072 630 809 7.03E-05 41.9855 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#16468 - CGI_10008378 superfamily 241832 76 262 6.91E-47 155.751 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#16469 - CGI_10008379 superfamily 248097 136 239 2.94E-09 52.6526 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#16470 - CGI_10008380 superfamily 248097 156 263 4.46E-09 52.6526 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#16471 - CGI_10008381 superfamily 247727 781 878 5.82E-09 54.7434 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). 
There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#16471 - CGI_10008381 superfamily 247727 114 209 7.19E-09 54.3583 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#16471 - CGI_10008381 superfamily 247727 558 654 7.28E-08 51.6619 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." 
Q#16471 - CGI_10008381 superfamily 247727 334 429 2.82E-07 49.7359 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#16474 - CGI_10008384 superfamily 199166 69 272 7.92E-19 84.6864 cl15308 AMN1 superfamily - - "Antagonist of mitotic exit network protein 1; Amn1 has been functionally characterized in Saccharomyces cerevisiae as a component of the Antagonist of MEN pathway (AMEN). The AMEN network is activated by MEN (mitotic exit network) via an active Cdc14, and in turn switches off MEN. Amn1 constitutes one of the alternative mechanisms by which MEN may be disrupted. Specifically, Amn1 binds Tem1 (Termination of M-phase, a GTPase that belongs to the RAS superfamily), and disrupts its association with Cdc15, the primary downstream target. Amn1 is a leucine-rich repeat (LRR) protein, with 12 repeats in the S. cerevisiae ortholog. As a negative regulator of the signal transduction pathway MEN, overexpression of AMN1 slows the growth of wild type cells. The function of the vertebrate members of this family has not been determined experimentally, they have fewer LRRs that determine the extent of this model." Q#16474 - CGI_10008384 superfamily 199166 348 434 6.94E-12 63.8856 cl15308 AMN1 superfamily NC - "Antagonist of mitotic exit network protein 1; Amn1 has been functionally characterized in Saccharomyces cerevisiae as a component of the Antagonist of MEN pathway (AMEN). 
The AMEN network is activated by MEN (mitotic exit network) via an active Cdc14, and in turn switches off MEN. Amn1 constitutes one of the alternative mechanisms by which MEN may be disrupted. Specifically, Amn1 binds Tem1 (Termination of M-phase, a GTPase that belongs to the RAS superfamily), and disrupts its association with Cdc15, the primary downstream target. Amn1 is a leucine-rich repeat (LRR) protein, with 12 repeats in the S. cerevisiae ortholog. As a negative regulator of the signal transduction pathway MEN, overexpression of AMN1 slows the growth of wild type cells. The function of the vertebrate members of this family has not been determined experimentally, they have fewer LRRs that determine the extent of this model." Q#16475 - CGI_10008385 superfamily 247724 11 172 9.04E-111 316.859 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. 
Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#16476 - CGI_10008386 superfamily 243056 298 533 2.06E-52 180.58 cl02495 RabGAP-TBC superfamily - - "Rab-GTPase-TBC domain; Identification of a TBC domain in GYP6_YEAST and GYP7_YEAST, which are GTPase activator proteins of yeast Ypt6 and Ypt7, implies that these domains are GTPase activator proteins of Rab-like small GTPases." Q#16476 - CGI_10008386 superfamily 192931 7 197 9.56E-48 167.259 cl13498 DUF3548 superfamily - - Domain of unknown function (DUF3548); This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 184 to 216 amino acids in length. This domain is found associated with pfam00566. This domain is found at the N-terminus of GYP7 proteins. Q#16477 - CGI_10008387 superfamily 247723 44 118 8.45E-29 104.6 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." 
Q#16478 - CGI_10011371 superfamily 247058 15 178 9.15E-49 160.804 cl15762 crotonase-like superfamily - - "Crotonase/Enoyl-Coenzyme A (CoA) hydratase superfamily. This superfamily contains a diverse set of enzymes including enoyl-CoA hydratase, napthoate synthase, methylmalonyl-CoA decarboxylase, 3-hydoxybutyryl-CoA dehydratase, and dienoyl-CoA isomerase. Many of these play important roles in fatty acid metabolism. In addition to a conserved structural core and the formation of trimers (or dimers of trimers), a common feature in this superfamily is the stabilization of an enolate anion intermediate derived from an acyl-CoA substrate. This is accomplished by two conserved backbone NH groups in active sites that form an oxyanion hole." Q#16479 - CGI_10011372 superfamily 220672 448 564 5.66E-13 68.0422 cl10957 Frag1 superfamily N - "Frag1/DRAM/Sfk1 family; This family includes Frag1, DRAM and Sfk1 proteins. Frag1 (FGF receptor activating protein 1) is a protein that is conserved from fungi to humans. There are four potential iso-prenylation sites throughout the peptide, viz CILW, CIIW and CIGL. Frag1 is a membrane-spanning protein that is ubiquitously expressed in adult tissues suggesting an important cellular function. Dram is a family of proteins conserved from nematodes to humans with six hydrophobic transmembrane regions and an Endoplasmic Reticulum signal peptide. It is a lysosomal protein that induces macro-autophagy as an effector of p53-mediated death, where p53 is the tumour-suppressor gene that is frequently mutated in cancer. Expression of Dram is stress-induced. This region is also part of a family of small plasma membrane proteins, referred to as Sfk1, that may act together with or upstream of Stt4p to generate normal levels of the essential phospholipid PI4P, thus allowing proper localisation of Stt4p to the actin cytoskeleton." 
Q#16479 - CGI_10011372 superfamily 220672 596 708 1.09E-12 66.8866 cl10957 Frag1 superfamily N - "Frag1/DRAM/Sfk1 family; This family includes Frag1, DRAM and Sfk1 proteins. Frag1 (FGF receptor activating protein 1) is a protein that is conserved from fungi to humans. There are four potential iso-prenylation sites throughout the peptide, viz CILW, CIIW and CIGL. Frag1 is a membrane-spanning protein that is ubiquitously expressed in adult tissues suggesting an important cellular function. Dram is a family of proteins conserved from nematodes to humans with six hydrophobic transmembrane regions and an Endoplasmic Reticulum signal peptide. It is a lysosomal protein that induces macro-autophagy as an effector of p53-mediated death, where p53 is the tumour-suppressor gene that is frequently mutated in cancer. Expression of Dram is stress-induced. This region is also part of a family of small plasma membrane proteins, referred to as Sfk1, that may act together with or upstream of Stt4p to generate normal levels of the essential phospholipid PI4P, thus allowing proper localisation of Stt4p to the actin cytoskeleton." Q#16479 - CGI_10011372 superfamily 220672 248 341 3.75E-09 56.101 cl10957 Frag1 superfamily C - "Frag1/DRAM/Sfk1 family; This family includes Frag1, DRAM and Sfk1 proteins. Frag1 (FGF receptor activating protein 1) is a protein that is conserved from fungi to humans. There are four potential iso-prenylation sites throughout the peptide, viz CILW, CIIW and CIGL. Frag1 is a membrane-spanning protein that is ubiquitously expressed in adult tissues suggesting an important cellular function. Dram is a family of proteins conserved from nematodes to humans with six hydrophobic transmembrane regions and an Endoplasmic Reticulum signal peptide. It is a lysosomal protein that induces macro-autophagy as an effector of p53-mediated death, where p53 is the tumour-suppressor gene that is frequently mutated in cancer. Expression of Dram is stress-induced. 
This region is also part of a family of small plasma membrane proteins, referred to as Sfk1, that may act together with or upstream of Stt4p to generate normal levels of the essential phospholipid PI4P, thus allowing proper localisation of Stt4p to the actin cytoskeleton." Q#16480 - CGI_10011373 superfamily 186602 50 328 4.46E-125 365.876 cl03849 PSS superfamily - - Phosphatidyl serine synthase; Phosphatidyl serine synthase is also known as serine exchange enzyme. This family represents eukaryotic PSS I and II which are membrane bound proteins which catalyzes the replacement of the head group of a phospholipid (phosphotidylcholine or phosphotidylethanolamine) by L-serine. Q#16481 - CGI_10011374 superfamily 241644 32 163 1.21E-37 128.474 cl00154 UBCc superfamily - - "Ubiquitin-conjugating enzyme E2, catalytic (UBCc) domain. This is part of the ubiquitin-mediated protein degradation pathway in which a thiol-ester linkage forms between a conserved cysteine and the C-terminus of ubiquitin and complexes with ubiquitin protein ligase enzymes, E3. This pathway regulates many fundamental cellular processes. There are also other E2s which form thiol-ester linkages without the use of E3s as well as several UBC homologs (TSG101, Mms2, Croc-1 and similar proteins) which lack the active site cysteine essential for ubiquitination and appear to function in DNA repair pathways which were omitted from the scope of this CD." Q#16482 - CGI_10011375 superfamily 241659 82 159 1.47E-30 107.991 cl00175 alpha-crystallin-Hsps_p23-like superfamily - - "alpha-crystallin domain (ACD) found in alpha-crystallin-type small heat shock proteins, and a similar domain found in p23 (a cochaperone for Hsp90) and in other p23-like proteins.; The alpha-crystallin-Hsps_p23-like superfamily includes the alpha-crystallin domain (ACD) of alpha-crystallin-type small heat shock proteins (sHsps) and a similar domain found in p23-like proteins. 
sHsps are small stress induced proteins with monomeric masses between 12-43 kDa, whose common feature is this ACD. sHsps are generally active as large oligomers consisting of multiple subunits, and are believed to be ATP-independent chaperones that prevent aggregation and are important in refolding in combination with other Hsps. p23 is a cochaperone of the Hsp90 chaperoning pathway. It binds Hsp90 and participates in the folding of a number of Hsp90 clients including the progesterone receptor. p23 also has a passive chaperoning activity. p23 in addition may act as the cytosolic prostaglandin E2 synthase. Included in this superfamily is the p23-like C-terminal CHORD-SGT1 (CS) domain of suppressor of G2 allele of Skp1 (Sgt1) and the p23-like domains of human butyrate-induced transcript 1 (hB-ind1), NUD (nuclear distribution) C, Melusin, and NAD(P)H cytochrome b5 (NCB5) oxidoreductase (OR)." Q#16483 - CGI_10011376 superfamily 241659 68 143 1.60E-26 97.2054 cl00175 alpha-crystallin-Hsps_p23-like superfamily - - "alpha-crystallin domain (ACD) found in alpha-crystallin-type small heat shock proteins, and a similar domain found in p23 (a cochaperone for Hsp90) and in other p23-like proteins.; The alpha-crystallin-Hsps_p23-like superfamily includes the alpha-crystallin domain (ACD) of alpha-crystallin-type small heat shock proteins (sHsps) and a similar domain found in p23-like proteins. sHsps are small stress induced proteins with monomeric masses between 12-43 kDa, whose common feature is this ACD. sHsps are generally active as large oligomers consisting of multiple subunits, and are believed to be ATP-independent chaperones that prevent aggregation and are important in refolding in combination with other Hsps. p23 is a cochaperone of the Hsp90 chaperoning pathway. It binds Hsp90 and participates in the folding of a number of Hsp90 clients including the progesterone receptor. p23 also has a passive chaperoning activity. 
p23 in addition may act as the cytosolic prostaglandin E2 synthase. Included in this superfamily is the p23-like C-terminal CHORD-SGT1 (CS) domain of suppressor of G2 allele of Skp1 (Sgt1) and the p23-like domains of human butyrate-induced transcript 1 (hB-ind1), NUD (nuclear distribution) C, Melusin, and NAD(P)H cytochrome b5 (NCB5) oxidoreductase (OR)." Q#16484 - CGI_10011377 superfamily 245206 53 296 1.45E-111 326.892 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#16485 - CGI_10011378 superfamily 241546 252 366 1.48E-39 139.612 cl00011 PLAT superfamily - - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats.The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." 
Q#16485 - CGI_10011378 superfamily 241546 380 461 4.69E-38 135.375 cl00011 PLAT superfamily C - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats.The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#16485 - CGI_10011378 superfamily 241546 118 244 8.02E-29 109.567 cl00011 PLAT superfamily - - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats.The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#16487 - CGI_10011380 superfamily 241546 716 841 1.46E-51 178.903 cl00011 PLAT superfamily - - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats.The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#16487 - CGI_10011380 superfamily 241546 868 987 3.44E-51 178.132 cl00011 PLAT superfamily - - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. 
The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats.The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#16487 - CGI_10011380 superfamily 241546 1152 1273 3.15E-47 166.576 cl00011 PLAT superfamily - - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats.The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#16487 - CGI_10011380 superfamily 241546 999 1125 3.16E-46 163.88 cl00011 PLAT superfamily - - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats.The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#16487 - CGI_10011380 superfamily 241546 434 541 1.18E-45 162.339 cl00011 PLAT superfamily - - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats.The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." 
Q#16487 - CGI_10011380 superfamily 241546 305 423 4.99E-42 151.939 cl00011 PLAT superfamily - - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats.The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#16487 - CGI_10011380 superfamily 241546 179 294 6.39E-37 137.301 cl00011 PLAT superfamily - - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats.The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#16487 - CGI_10011380 superfamily 241546 48 165 7.24E-36 134.22 cl00011 PLAT superfamily - - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats.The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#16487 - CGI_10011380 superfamily 241546 1311 1428 1.41E-31 121.893 cl00011 PLAT superfamily - - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. 
The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats.The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#16487 - CGI_10011380 superfamily 241546 563 602 2.85E-13 68.7356 cl00011 PLAT superfamily C - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats.The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#16487 - CGI_10011380 superfamily 241546 3 38 1.76E-07 51.0164 cl00011 PLAT superfamily N - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats.The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#16487 - CGI_10011380 superfamily 241546 668 702 0.000332085 41.0012 cl00011 PLAT superfamily N - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats.The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." 
Q#16488 - CGI_10011381 superfamily 243065 309 462 1.41E-10 60.5333 cl02516 VWD superfamily - - von Willebrand factor type D domain; Luciferin-2-monooxygenase from Vargula hilgendorfii contains a vwd domain. Its function is unrelated but the similarity is very strong by several methods. Q#16492 - CGI_10011385 superfamily 248097 59 97 5.04E-09 49.571 cl17543 C1q superfamily C - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#16493 - CGI_10011386 superfamily 248097 9 103 5.57E-13 60.7418 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#16494 - CGI_10011387 superfamily 248097 9 125 6.07E-18 74.609 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#16495 - CGI_10017902 superfamily 243058 670 768 3.85E-05 44.2276 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#16495 - CGI_10017902 superfamily 245201 10 283 9.49E-49 176.556 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. 
It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#16495 - CGI_10017902 superfamily 243093 1581 1659 3.00E-20 88.3561 cl02568 WSC superfamily - - WSC domain; This domain may be involved in carbohydrate binding. Q#16495 - CGI_10017902 superfamily 243093 1393 1471 3.46E-19 85.2745 cl02568 WSC superfamily - - WSC domain; This domain may be involved in carbohydrate binding. Q#16495 - CGI_10017902 superfamily 243093 1487 1565 2.02E-18 82.9633 cl02568 WSC superfamily - - WSC domain; This domain may be involved in carbohydrate binding. Q#16495 - CGI_10017902 superfamily 243093 1684 1796 2.01E-11 62.5478 cl02568 WSC superfamily - - WSC domain; This domain may be involved in carbohydrate binding. Q#16495 - CGI_10017902 superfamily 243093 1813 1855 1.98E-05 44.8286 cl02568 WSC superfamily C - WSC domain; This domain may be involved in carbohydrate binding. Q#16496 - CGI_10017903 superfamily 246683 66 373 2.23E-162 461.205 cl14648 Aldose_epim superfamily - - "aldose 1-epimerase superfamily; Aldose 1-epimerases or mutarotases are key enzymes of carbohydrate metabolism; they catalyze the interconversion of the alpha- and beta-anomers of hexose sugars such as glucose and galactose. This interconversion is an important step that allows anomer specific metabolic conversion of sugars. Studies of the catalytic mechanism of the best known member of the family, galactose mutarotase, have shown a glutamate and a histidine residue to be critical for catalysis; the glutamate serves as the active site base to initiate the reaction by removing the proton from the C-1 hydroxyl group of the sugar substrate and the histidine as the active site acid to protonate the C-5 ring oxygen." 
Q#16497 - CGI_10017904 superfamily 222150 247 272 0.00543566 34.6749 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#16497 - CGI_10017904 superfamily 246975 233 256 0.00935045 33.7655 cl15478 zf-C2H2 superfamily - - "Zinc finger, C2H2 type; The C2H2 zinc finger is the classical zinc finger domain. The two conserved cysteines and histidines co-ordinate a zinc ion. The following pattern describes the zinc finger. #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C] Where X can be any amino acid, and numbers in brackets indicate the number of residues. The positions marked # are those that are important for the stable fold of the zinc finger. The final position can be either his or cys. The C2H2 zinc finger is composed of two short beta strands followed by an alpha helix. The amino terminal part of the helix binds the major groove in DNA binding zinc fingers. The accepted consensus binding sequence for Sp1 is usually defined by the asymmetric hexanucleotide core GGGCGG but this sequence does not include, among others, the GAG (=CTC) repeat that constitutes a high-affinity site for Sp1 binding to the wt1 promoter." Q#16498 - CGI_10017905 superfamily 246680 115 177 6.34E-05 41.5522 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 
They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#16501 - CGI_10017908 superfamily 203495 42 92 1.34E-06 42.9954 cl10663 Cep57_MT_bd superfamily N - "Centrosome microtubule-binding domain of Cep57; This C-terminal region of Cep57 binds, nucleates and bundles microtubules. The N-terminal part, family Cep57_CLD, pfam14073, is the centrosome localisation domain Cep57." Q#16509 - CGI_10017916 superfamily 247805 71 209 4.76E-19 84.6964 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#16509 - CGI_10017916 superfamily 243778 496 587 3.30E-34 125.799 cl04503 HA2 superfamily - - "Helicase associated domain (HA2); This presumed domain is about 90 amino acid residues in length. It is found in a diverse set of RNA helicases. Its function is unknown, however it seems likely to be involved in nucleic acid binding."
Q#16509 - CGI_10017916 superfamily 247905 347 400 1.41E-08 52.5237 cl17351 HELICc superfamily C - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#16511 - CGI_10017918 superfamily 220692 46 340 6.42E-06 46.4285 cl18570 7TM_GPCR_Srw superfamily - - Serpentine type 7TM GPCR chemoreceptor Srw; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srw is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. The genes encoding Srw do not appear to be under as strong an adaptive evolutionary pressure as those of Srz. Q#16513 - CGI_10017920 superfamily 219532 1 78 1.74E-30 106.63 cl06657 OB_NTP_bind superfamily N - "Oligonucleotide/oligosaccharide-binding (OB)-fold; This family is found towards the C-terminus of the DEAD-box helicases (pfam00270). In these helicases it is apparently always found in association with pfam04408. There do seem to be a couple of instances where it occurs by itself - . The structure PDB:3i4u adopts an OB-fold.
This C-terminal domain of the yeast helicase contains an oligonucleotide/oligosaccharide-binding (OB)-fold which seems to be placed at the entrance of the putative nucleic acid cavity. It also constitutes the binding site for the G-patch-containing domain of Pfa1p. When found on DEAH/RHA helicases, this domain is central to the regulation of the helicase activity through its binding of both RNA and G-patch domain proteins." Q#16514 - CGI_10017921 superfamily 193256 2969 3231 4.91E-69 237.538 cl18189 AAA_8 superfamily - - "P-loop containing dynein motor region D4; The 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This particular family is the D4 ATP-binding region of the motor." Q#16514 - CGI_10017921 superfamily 193257 3609 3839 1.99E-54 193.663 cl15086 AAA_9 superfamily - - "ATP-binding dynein motor region D5; The 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This particular family is the D5 ATP-binding region of the motor, but has lost its P-loop." Q#16514 - CGI_10017921 superfamily 193251 2565 2829 1.02E-44 167.035 cl18188 AAA_7 superfamily - - "P-loop containing dynein motor region D3; the 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. 
The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This particular family is the D3 and is an ATP binding site." Q#16514 - CGI_10017921 superfamily 193253 3342 3589 2.72E-25 110.897 cl15084 MT superfamily N - "Microtubule-binding stalk of dynein motor; the 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This family is the region between D4 and D5 and is the two predicted alpha-helical coiled coil segments that form the stalk supporting the ATP-sensitive microtubule binding component." Q#16514 - CGI_10017921 superfamily 247743 2258 2393 1.36E-06 49.9864 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#16516 - CGI_10017923 superfamily 242169 10 100 1.73E-23 87.2894 cl00886 Robl_LC7 superfamily - - "Roadblock/LC7 domain; This family includes proteins that are about 100 amino acids long and have been shown to be related. 
Members of this family of proteins are associated with both flagellar outer arm dynein and Drosophila and rat brain cytoplasmic dynein. It is proposed that roadblock/LC7 family members may modulate specific dynein functions. This family also includes Golgi-associated MP1 adapter protein and MglB from Myxococcus xanthus, a protein involved in gliding motility. However the family also includes members from non-motile bacteria such as Streptomyces coelicolor, suggesting that the protein may play a structural or regulatory role." Q#16517 - CGI_10017924 superfamily 243158 87 122 5.66E-06 40.62 cl02723 Sel1 superfamily - - Sel1 repeat; This short repeat is found in the Sel1 protein. It is related to TPR repeats. Q#16518 - CGI_10017925 superfamily 241733 5 83 3.71E-50 158.098 cl00259 Sm_like superfamily - - "Sm and related proteins; The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes." Q#16519 - CGI_10017926 superfamily 243066 38 131 2.94E-17 74.1265 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. 
The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#16520 - CGI_10017927 superfamily 217311 4 378 1.55E-107 330.453 cl18402 DUF229 superfamily N - Protein of unknown function (DUF229); Members of this family are uncharacterized. They are 500-1200 amino acids in length and share a long region conservation that probably corresponds to several domains. The Go annotation for the protein indicates that it is involved in nematode larval development and has a positive regulation on growth rate. Q#16522 - CGI_10005801 superfamily 110998 27 248 1.33E-91 281.449 cl03422 Glyco_hydro_30 superfamily C - O-Glycosyl hydrolase family 30; O-Glycosyl hydrolase family 30. Q#16523 - CGI_10005802 superfamily 241592 37 97 2.46E-15 65.7109 cl00074 H2A superfamily - - "Histone 2A; H2A is a subunit of the nucleosome. The nucleosome is an octamer containing two H2A, H2B, H3, and H4 subunits. The H2A subunit performs essential roles in maintaining structural integrity of the nucleosome, chromatin condensation, and binding of specific chromatin-associated proteins." Q#16524 - CGI_10005803 superfamily 128778 110 218 0.00111167 38.0147 cl17972 BBC superfamily - - B-Box C-terminal domain; Coiled coil region C-terminal to (some) B-Box domains Q#16526 - CGI_10005805 superfamily 241832 22 118 9.57E-44 145.215 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. 
They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#16527 - CGI_10005806 superfamily 243074 28 62 0.00385049 34.7897 cl02535 F-box-like superfamily C - F-box-like; This is an F-box-like family. Q#16529 - CGI_10005808 superfamily 241979 7 127 3.59E-68 204.11 cl00610 Ribosomal_S17e superfamily - - Ribosomal S17; Ribosomal S17. Q#16530 - CGI_10005809 superfamily 243072 82 194 3.23E-31 112.092 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#16530 - CGI_10005809 superfamily 247683 25 72 1.07E-25 95.4445 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. 
They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#16532 - CGI_10005811 superfamily 148167 71 142 3.38E-35 119.157 cl05742 GFRP superfamily - - "GTP cyclohydrolase I feedback regulatory protein (GFRP); Tetrahydrobiopterin, the cofactor required for hydroxylation of aromatic amino acids, regulates its own synthesis via feedback inhibition of GTP cyclohydrolase I. This mechanism is mediated by the regulatory subunit called GTP cyclohydrolase I feedback regulatory protein (GFRP)." Q#16534 - CGI_10005813 superfamily 247724 60 300 9.66E-156 447.491 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization.
The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#16534 - CGI_10005813 superfamily 247856 449 514 2.39E-21 88.0454 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#16535 - CGI_10005814 superfamily 217436 1 109 9.68E-56 171.23 cl03943 Yippee superfamily - - Yippee putative zinc-binding protein; Yippee putative zinc-binding protein. Q#16536 - CGI_10005955 superfamily 245847 8 75 0.000838677 34.8398 cl12042 FA58C superfamily C - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#16539 - CGI_10005958 superfamily 241599 131 185 2.07E-17 73.4316 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." 
Q#16540 - CGI_10005959 superfamily 241659 103 180 2.00E-25 95.2794 cl00175 alpha-crystallin-Hsps_p23-like superfamily - - "alpha-crystallin domain (ACD) found in alpha-crystallin-type small heat shock proteins, and a similar domain found in p23 (a cochaperone for Hsp90) and in other p23-like proteins.; The alpha-crystallin-Hsps_p23-like superfamily includes the alpha-crystallin domain (ACD) of alpha-crystallin-type small heat shock proteins (sHsps) and a similar domain found in p23-like proteins. sHsps are small stress induced proteins with monomeric masses between 12-43 kDa, whose common feature is this ACD. sHsps are generally active as large oligomers consisting of multiple subunits, and are believed to be ATP-independent chaperones that prevent aggregation and are important in refolding in combination with other Hsps. p23 is a cochaperone of the Hsp90 chaperoning pathway. It binds Hsp90 and participates in the folding of a number of Hsp90 clients including the progesterone receptor. p23 also has a passive chaperoning activity. p23 in addition may act as the cytosolic prostaglandin E2 synthase. Included in this superfamily is the p23-like C-terminal CHORD-SGT1 (CS) domain of suppressor of G2 allele of Skp1 (Sgt1) and the p23-like domains of human butyrate-induced transcript 1 (hB-ind1), NUD (nuclear distribution) C, Melusin, and NAD(P)H cytochrome b5 (NCB5) oxidoreductase (OR)." Q#16541 - CGI_10005960 superfamily 241659 13 90 7.13E-27 96.0498 cl00175 alpha-crystallin-Hsps_p23-like superfamily - - "alpha-crystallin domain (ACD) found in alpha-crystallin-type small heat shock proteins, and a similar domain found in p23 (a cochaperone for Hsp90) and in other p23-like proteins.; The alpha-crystallin-Hsps_p23-like superfamily includes the alpha-crystallin domain (ACD) of alpha-crystallin-type small heat shock proteins (sHsps) and a similar domain found in p23-like proteins. 
sHsps are small stress induced proteins with monomeric masses between 12-43 kDa, whose common feature is this ACD. sHsps are generally active as large oligomers consisting of multiple subunits, and are believed to be ATP-independent chaperones that prevent aggregation and are important in refolding in combination with other Hsps. p23 is a cochaperone of the Hsp90 chaperoning pathway. It binds Hsp90 and participates in the folding of a number of Hsp90 clients including the progesterone receptor. p23 also has a passive chaperoning activity. p23 in addition may act as the cytosolic prostaglandin E2 synthase. Included in this superfamily is the p23-like C-terminal CHORD-SGT1 (CS) domain of suppressor of G2 allele of Skp1 (Sgt1) and the p23-like domains of human butyrate-induced transcript 1 (hB-ind1), NUD (nuclear distribution) C, Melusin, and NAD(P)H cytochrome b5 (NCB5) oxidoreductase (OR)." Q#16542 - CGI_10005961 superfamily 242898 11 122 8.11E-18 75.2086 cl02130 Got1 superfamily - - "Got1/Sft2-like family; Traffic through the yeast Golgi complex depends on a member of the syntaxin family of SNARE proteins, Sed5, present in early Golgi cisternae. Got1 is thought to facilitate Sed5-dependent fusion events. This is a family of sequences derived from eukaryotic proteins. They are similar to a region of a SNARE-like protein required for traffic through the Golgi complex, SFT2 protein. This is a conserved protein with four putative transmembrane helices, thought to be involved in vesicular transport in later Golgi compartments." Q#16546 - CGI_10024878 superfamily 248097 87 193 6.40E-16 74.2238 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#16546 - CGI_10024878 superfamily 245852 296 475 1.67E-08 55.0506 cl12050 TraB superfamily NC - "TraB family; pAD1 is a hemolysin/bacteriocin plasmid originally identified in Enterococcus faecalis DS16. 
It encodes a mating response to a peptide sex pheromone, cAD1, secreted by recipient bacteria. Once the plasmid pAD1 is acquired, production of the pheromone ceases--a trait related in part to a determinant designated traB. However a related protein is found in C. elegans, suggesting that members of the TraB family have some more general function. This family also includes the bacterial GumN protein. The family has a conserved GXXH motif close to the N-terminus, a conserved glutamate and a conserved arginine that may be catalytic. The family also includes a second conserved GXXH motif near the C-terminus." Q#16547 - CGI_10024879 superfamily 248097 11 124 1.43E-13 64.979 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#16548 - CGI_10024880 superfamily 241646 478 525 6.35E-08 49.7559 cl00156 WAP superfamily - - "whey acidic protein-type four-disulfide core domains. Members of the family include whey acidic protein, elafin (elastase-specific inhibitor), caltrin-like protein (a calcium transport inhibitor) and other extracellular proteinase inhibitors. A group of proteins containing 8 characteristically-spaced cysteine residues forming disulphide bonds have been termed '4-disulphide core' proteins. Protease inhibition occurs by insertion of the inhibitory loop into the active site pocket and interference with the catalytic residues of the protease." Q#16548 - CGI_10024880 superfamily 241646 25 66 1.46E-05 42.823 cl00156 WAP superfamily - - "whey acidic protein-type four-disulfide core domains. Members of the family include whey acidic protein, elafin (elastase-specific inhibitor), caltrin-like protein (a calcium transport inhibitor) and other extracellular proteinase inhibitors. A group of proteins containing 8 characteristically-spaced cysteine residues forming disulphide bonds have been termed '4-disulphide core' proteins.
Protease inhibition occurs by insertion of the inhibitory loop into the active site pocket and interference with the catalytic residues of the protease." Q#16550 - CGI_10024882 superfamily 243035 727 827 1.02E-15 74.9637 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. 
Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#16550 - CGI_10024882 superfamily 241874 22 520 0 768.392 cl00456 SLC5-6-like_sbd superfamily - - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. 
They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#16550 - CGI_10024882 superfamily 243035 886 986 5.57E-15 73.1077 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. 
Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#16551 - CGI_10024883 superfamily 241874 24 567 0 811.92 cl00456 SLC5-6-like_sbd superfamily - - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. 
They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#16552 - CGI_10024884 superfamily 243050 13 40 8.37E-06 37.9433 cl02475 LIM superfamily NC - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#16553 - CGI_10024885 superfamily 245201 328 528 1.86E-50 176.658 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." 
Q#16553 - CGI_10024885 superfamily 243050 55 108 1.73E-28 109.377 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#16553 - CGI_10024885 superfamily 241622 185 223 1.24E-06 47.1763 cl00117 PDZ superfamily N - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archebacteria, and eurkayotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. 
The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#16553 - CGI_10024885 superfamily 243050 22 48 4.20E-09 54.0323 cl02475 LIM superfamily N - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#16555 - CGI_10024887 superfamily 247044 7 135 1.68E-42 155.793 cl15697 ADF_gelsolin superfamily - - Actin depolymerization factor/cofilin- and gelsolin-like domains; Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. Q#16555 - CGI_10024887 superfamily 193256 3008 3276 6.84E-81 271.821 cl18189 AAA_8 superfamily - - "P-loop containing dynein motor region D4; The 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. 
The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This particular family is the D4 ATP-binding region of the motor." Q#16555 - CGI_10024887 superfamily 193251 2661 2929 1.57E-60 213.259 cl18188 AAA_7 superfamily - - "P-loop containing dynein motor region D3; the 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This particular family is the D3 and is an ATP binding site." Q#16555 - CGI_10024887 superfamily 193257 3643 3873 3.96E-41 155.529 cl15086 AAA_9 superfamily - - "ATP-binding dynein motor region D5; The 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This particular family is the D5 ATP-binding region of the motor, but has lost its P-loop." Q#16555 - CGI_10024887 superfamily 193253 3288 3624 2.07E-33 135.935 cl15084 MT superfamily - - "Microtubule-binding stalk of dynein motor; the 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. 
This family is the region between D4 and D5 and is the two predicted alpha-helical coiled coil segments that form the stalk supporting the ATP-sensitive microtubule binding component." Q#16555 - CGI_10024887 superfamily 248019 4758 4859 1.39E-10 66.4471 cl17465 DAGK_cat superfamily NC - "Diacylglycerol kinase catalytic domain; Diacylglycerol (DAG) is a second messenger that acts as a protein kinase C activator. The catalytic domain is assumed from the finding of bacterial homologues. YegS is the Escherichia coli protein in this family whose crystal structure reveals an active site in the inter-domain cleft formed by four conserved sequence motifs, revealing a novel metal-binding site. The residues of this site are conserved across the family." Q#16555 - CGI_10024887 superfamily 247743 2351 2486 8.06E-06 47.6752 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#16556 - CGI_10024888 superfamily 248374 14 171 1.19E-27 103.233 cl17820 Asp_Arg_Hydrox superfamily - - "Aspartyl/Asparaginyl beta-hydroxylase; Iron (II)/2-oxoglutarate (2-OG)-dependent oxygenases catalyze oxidative reactions in a range of metabolic processes. Proline 3-hydroxylase hydroxylates proline at position 3, the first of a 2-OG oxygenase catalyzing oxidation of a free alpha-amino acid. 
The structure of proline 3-hydroxylase contains the conserved motifs present in other 2-OG oxygenases including a jelly roll strand core and residues binding iron and 2-oxoglutarate, consistent with divergent evolution within the extended family. This family represent the arginine, asparagine and proline hydroxylases. The aspartyl/asparaginyl beta-hydroxylase (EC:1.14.11.16) specifically hydroxylates one aspartic or asparagine residue in certain epidermal growth factor-like domains of a number of proteins." Q#16557 - CGI_10024889 superfamily 248019 76 169 1.56E-12 69.5287 cl17465 DAGK_cat superfamily NC - "Diacylglycerol kinase catalytic domain; Diacylglycerol (DAG) is a second messenger that acts as a protein kinase C activator. The catalytic domain is assumed from the finding of bacterial homologues. YegS is the Escherichia coli protein in this family whose crystal structure reveals an active site in the inter-domain cleft formed by four conserved sequence motifs, revealing a novel metal-binding site. The residues of this site are conserved across the family." Q#16559 - CGI_10024891 superfamily 244083 136 229 1.35E-45 150.53 cl05417 PLA2_like superfamily - - "PLA2_like: Phospholipase A2, a super-family of secretory and cytosolic enzymes; the latter are either Ca dependent or Ca independent. PLA2 cleaves the sn-2 position of the glycerol backbone of phospholipids (PC or phosphatidylethanolamine), usually in a metal-dependent reaction, to generate lysophospholipid (LysoPL) and a free fatty acid (FA). The resulting products are either dietary or used in synthetic pathways for leukotrienes and prostaglandins. Often, arachidonic acid is released as a free fatty acid and acts as second messenger in signaling networks. Secreted PLA2s have also been found to specifically bind to a variety of soluble and membrane proteins in mammals, including receptors. 
As a toxin, PLA2 is a potent presynaptic neurotoxin which blocks nerve terminals by binding to the nerve membrane and hydrolyzing stable membrane lipids. The products of the hydrolysis (LysoPL and FA) cannot form bilayers leading to a change in membrane conformation and ultimately to a block in the release of neurotransmitters. PLA2 may form dimers or oligomers." Q#16560 - CGI_10024892 superfamily 243078 31 143 1.95E-27 109.231 cl02544 VHS_ENTH_ANTH superfamily - - "VHS, ENTH and ANTH domain superfamily; composed of proteins containing a VHS, ENTH or ANTH domain. The VHS domain is present in Vps27 (Vacuolar Protein Sorting), Hrs (Hepatocyte growth factor-regulated tyrosine kinase substrate) and STAM (Signal Transducing Adaptor Molecule). It is located at the N-termini of proteins involved in intracellular membrane trafficking. The epsin N-terminal homology (ENTH) domain is an evolutionarily conserved protein module found primarily in proteins that participate in clathrin-mediated endocytosis. A set of proteins previously designated as harboring an ENTH domain in fact contains a highly similar, yet unique module referred to as an AP180 N-terminal homology (ANTH) domain. VHS, ENTH and ANTH domains are structurally similar and are composed of a superhelix of eight alpha helices. ENTH adnd ANTH (E/ANTH) domains bind both inositol phospholipids and proteins and contribute to the nucleation and formation of clathrin coats on membranes. ENTH domains also function in the development of membrane curvature through lipid remodeling during the formation of clathrin-coated vesicles. E/ANTH domain-bearing proteins have recently been shown to function with adaptor protein-1 and GGA adaptors at the trans-Golgi network, which suggests that E/ANTH domains are universal components of the machinery for clathrin-mediated membrane budding." Q#16560 - CGI_10024892 superfamily 201885 834 1027 1.86E-77 252.929 cl12249 I_LWEQ superfamily - - "I/LWEQ domain; I/LWEQ domains bind to actin. 
It has been shown that the I/LWEQ domains from mouse talin and yeast Sla2p interact with F-actin. I/LWEQ domains can be placed into four major groups based on sequence similarity: (1) Metazoan talin; (2) Dictyostelium TalA/TalB and SLA110; (3) metazoan Hip1p; and (4) yeast Sla2p. The domain has four conserved blocks, the name of the domain is derived from the initial conserved amino acid of each of the four blocks." Q#16560 - CGI_10024892 superfamily 242926 476 623 0.00630527 38.1925 cl02193 PRK13874 superfamily C - conjugal transfer protein TrbJ; Provisional Q#16563 - CGI_10024895 superfamily 217293 19 210 8.22E-37 134.297 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#16563 - CGI_10024895 superfamily 202474 236 312 0.00648574 36.4777 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#16564 - CGI_10024896 superfamily 217293 8 111 1.16E-20 87.3031 cl03788 Neur_chan_LBD superfamily N - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#16564 - CGI_10024896 superfamily 202474 162 213 8.07E-06 44.9521 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#16565 - CGI_10024897 superfamily 241578 335 492 1.38E-40 148.594 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). 
Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#16565 - CGI_10024897 superfamily 241578 692 851 1.77E-38 142.431 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." 
Q#16565 - CGI_10024897 superfamily 247856 150 210 2.57E-05 43.6905 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#16565 - CGI_10024897 superfamily 245598 12 46 7.13E-07 49.5852 cl11396 Patatin_and_cPLA2 superfamily N - "Patatins and Phospholipases; Patatin-like phospholipase. This family consists of various patatin glycoproteins from plants. The patatin protein accounts for up to 40% of the total soluble protein in potato tubers. Patatin is a storage protein, but it also has the enzymatic activity of a lipid acyl hydrolase, catalyzing the cleavage of fatty acids from membrane lipids. Members of this family have also been found in vertebrates. This family also includes the catalytic domain of cytosolic phospholipase A2 (PLA2; EC 3.1.1.4) hydrolyzes the sn-2-acyl ester bond of phospholipids to release arachidonic acid. At the active site, cPLA2 contains a serine nucleophile through which the catalytic mechanism is initiated. The active site is partially covered by a solvent-accessible flexible lid. cPLA2 displays interfacial activation as it exists in both "closed lid" and "open lid" forms." Q#16565 - CGI_10024897 superfamily 247097 520 555 0.00571658 36.275 cl15839 ShK superfamily - - ShK domain-like; This domain of is found in several C. elegans proteins. The domain is 30 amino acids long and rich in cysteine residues. There are 6 conserved cysteine positions in the domain that form three disulphide bridges. The domain is found in the potassium channel inhibitor ShK in sea anemone. 
Q#16565 - CGI_10024897 superfamily 247097 965 998 0.00772745 35.8181 cl15839 ShK superfamily - - ShK domain-like; This domain of is found in several C. elegans proteins. The domain is 30 amino acids long and rich in cysteine residues. There are 6 conserved cysteine positions in the domain that form three disulphide bridges. The domain is found in the potassium channel inhibitor ShK in sea anemone. Q#16568 - CGI_10024900 superfamily 217293 28 228 1.78E-35 130.445 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#16568 - CGI_10024900 superfamily 202474 232 286 2.85E-06 46.8781 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#16569 - CGI_10024901 superfamily 217293 36 233 4.02E-38 138.149 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#16569 - CGI_10024901 superfamily 202474 240 330 1.81E-13 68.0641 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#16570 - CGI_10024902 superfamily 217293 30 227 7.99E-41 145.083 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. 
Q#16570 - CGI_10024902 superfamily 202474 254 330 4.80E-10 58.0489 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#16571 - CGI_10024903 superfamily 217293 30 227 8.64E-42 147.779 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#16571 - CGI_10024903 superfamily 202474 255 325 5.75E-10 57.6637 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#16574 - CGI_10024906 superfamily 219525 88 136 0.000232545 38.5542 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#16574 - CGI_10024906 superfamily 219525 129 170 0.00190201 35.8578 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#16574 - CGI_10024906 superfamily 219525 26 68 0.00774738 33.9318 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. 
Q#16575 - CGI_10024907 superfamily 247905 375 480 8.28E-15 72.2704 cl17351 HELICc superfamily N - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#16575 - CGI_10024907 superfamily 204947 631 679 5.00E-14 67.9324 cl13893 SUV3_C superfamily - - "Mitochondrial degradasome RNA helicase subunit C terminal; This domain family is found in bacteria and eukaryotes, and is approximately 50 amino acids in length. The family is found in association with pfam00271. The yeast mitochondrial degradosome (mtEXO) is an NTP-dependent exoribonuclease involved in mitochondrial RNA metabolism. mtEXO is made up of two subunits: an RNase (DSS1) and an RNA helicase (SUV3). These co-purify with mitochondrial ribosomes." Q#16575 - CGI_10024907 superfamily 247805 211 315 0.00021053 40.7836 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#16577 - CGI_10024909 superfamily 222150 99 122 0.00839212 32.749 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. 
Q#16578 - CGI_10024910 superfamily 247684 4 192 1.44E-29 116.608 cl17037 NBD_sugar-kinase_HSP70_actin superfamily N - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#16579 - CGI_10024911 superfamily 247684 10 113 1.99E-27 104.667 cl17037 NBD_sugar-kinase_HSP70_actin superfamily C - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#16579 - CGI_10024911 superfamily 247684 108 136 0.00242982 35.7304 cl17037 NBD_sugar-kinase_HSP70_actin superfamily NC - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#16580 - CGI_10024912 superfamily 247723 39 114 3.32E-16 70.7941 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. 
RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#16580 - CGI_10024912 superfamily 247723 130 185 1.32E-08 50.3785 cl17169 RRM_SF superfamily C - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#16581 - CGI_10024913 superfamily 242406 1 61 0.00467274 33.506 cl01271 DUF1768 superfamily N - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. 
Q#16582 - CGI_10024914 superfamily 216363 26 108 1.15E-14 65.5694 cl08312 UPF0029 superfamily C - Uncharacterized protein family UPF0029; Uncharacterized protein family UPF0029. Q#16587 - CGI_10024919 superfamily 193251 3016 3291 2.02E-26 113.492 cl18188 AAA_7 superfamily - - "P-loop containing dynein motor region D3; the 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This particular family is the D3 and is an ATP binding site." Q#16587 - CGI_10024919 superfamily 193256 3378 3613 1.24E-22 101.948 cl18189 AAA_8 superfamily - - "P-loop containing dynein motor region D4; The 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This particular family is the D4 ATP-binding region of the motor." Q#16587 - CGI_10024919 superfamily 193257 4218 4404 1.32E-09 60.7695 cl15086 AAA_9 superfamily - - "ATP-binding dynein motor region D5; The 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This particular family is the D5 ATP-binding region of the motor, but has lost its P-loop." 
Q#16588 - CGI_10024920 superfamily 241563 73 112 3.77E-06 44.3924 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#16589 - CGI_10024921 superfamily 245604 777 839 3.37E-10 57.8112 cl11404 Biotinyl_lipoyl_domains superfamily - - "Biotinyl_lipoyl_domains are present in biotin-dependent carboxylases/decarboxylases, the dihydrolipoyl acyltransferase component (E2) of 2-oxo acid dehydrogenases, and the H-protein of the glycine cleavage system (GCS). These domains transport CO2, acyl, or methylamine, respectively, between components of the complex/protein via a biotinyl or lipoyl group, which is covalently attached to a highly conserved lysine residue." Q#16589 - CGI_10024921 superfamily 247809 285 512 8.27E-58 197.909 cl17255 ATP-grasp_4 superfamily - - ATP-grasp domain; This family includes a diverse set of enzymes that possess ATP-dependent carboxylate-amine ligase activity. Q#16589 - CGI_10024921 superfamily 201133 161 279 1.67E-33 126.058 cl02837 CPSase_L_chain superfamily - - "Carbamoyl-phosphate synthase L chain, N-terminal domain; Carbamoyl-phosphate synthase catalyzes the ATP-dependent synthesis of carbamyl-phosphate from glutamine or ammonia and bicarbonate. This important enzyme initiates both the urea cycle and the biosynthesis of arginine and/or pyrimidines. The carbamoyl-phosphate synthase (CPS) enzyme in prokaryotes is a heterodimer of a small and large chain. The small chain promotes the hydrolysis of glutamine to ammonia, which is used by the large chain to synthesise carbamoyl phosphate. See pfam00988. The small chain has a GATase domain in the carboxyl terminus. See pfam00117." 
Q#16589 - CGI_10024921 superfamily 244920 550 656 4.15E-30 115.973 cl08365 Biotin_carb_C superfamily - - "Biotin carboxylase C-terminal domain; Biotin carboxylase is a component of the acetyl-CoA carboxylase multi-component enzyme which catalyzes the first committed step in fatty acid synthesis in animals, plants and bacteria. Most of the active site residues reported in reference are in this C-terminal domain." Q#16589 - CGI_10024921 superfamily 219797 840 928 1.39E-19 92.7637 cl09596 ACC_central superfamily C - "Acetyl-CoA carboxylase, central region; The region featured in this family is found in various eukaryotic acetyl-CoA carboxylases, N-terminal to the catalytic domain (pfam01039). This enzyme (EC:6.4.1.2) is involved in the synthesis of long-chain fatty acids, as it catalyzes the rate-limiting step in this process." Q#16590 - CGI_10024922 superfamily 247792 221 263 5.64E-13 64.004 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#16592 - CGI_10024924 superfamily 241596 99 159 1.21E-13 63.0019 cl00081 HLH superfamily - - "Helix-loop-helix domain, found in specific DNA- binding proteins that act as transcription factors; 60-100 amino acids long. 
A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE 5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved in both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#16593 - CGI_10024925 superfamily 247723 498 567 1.78E-09 55.3889 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. 
The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#16593 - CGI_10024925 superfamily 247723 245 317 2.10E-08 52.3073 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#16593 - CGI_10024925 superfamily 243098 727 766 7.53E-06 44.5111 cl02573 TUDOR superfamily - - "Tudor domains are found in many eukaryotic organisms and have been implicated in protein-protein interactions in which methylated protein substrates bind to these domains. For example, the Tudor domain of Survival of Motor Neuron (SMN) binds to symmetrically dimethylated arginines of arginine-glycine (RG) rich sequences found in the C-terminal tails of Sm proteins. The SMN protein is linked to spinal muscular atrophy. Another example is the tandem tudor domains of 53BP1, which bind to histone H4 specifically dimethylated at Lys20 (H4-K20me2). 53BP1 is a key transducer of the DNA damage checkpoint signal." 
Q#16593 - CGI_10024925 superfamily 246749 20 96 3.39E-07 49.0328 cl14879 LabA_like_C superfamily - - "C-terminal domain of LabA_like proteins; This C-terminal domain is found in a well conserved group of mainly bacterial proteins with no defined function, which contain an N-terminal LabA-like domain. LabA from Synechococcus elongatus PCC 7942, (which does not contain this C-terminal domain) has been shown to play a role in cyanobacterial circadian timing. LabA-like C-terminal domains described here may be related to the LOTUS domain family (which also co-occurs with LabA-like N-terminal domains)." Q#16593 - CGI_10024925 superfamily 247723 149 194 0.000311039 39.838 cl17169 RRM_SF superfamily C - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#16594 - CGI_10024926 superfamily 242253 382 494 4.27E-42 147.092 cl01020 ASCH superfamily - - "ASC-1 homology or ASCH domain, a small beta-barrel domain found in all three kingdoms of life. ASCH resembles the RNA-binding PUA domain and may also interact with RNA. 
ASCH has been proposed to function as an RNA-binding domain during coactivation, RNA-processing and the regulation of prokaryotic translation. The domain has been named after the ASC-1 protein, the activating signal cointegrator 1 or thyroid hormone receptor interactor protein 4 (TRIP4). ASC-1 is conserved in many eukaryotes and has been suggested to participate in a protein complex that interacts with RNA. It has been shown that ASC-1 mediates the interaction between various transcription factors and the basal transcriptional machinery." Q#16594 - CGI_10024926 superfamily 218944 143 195 7.32E-21 86.224 cl05634 zf-C2HC5 superfamily - - "Putative zinc finger motif, C2HC5-type; This zinc finger appears to be common in activating signal cointegrator 1/thyroid receptor interacting protein 4." Q#16596 - CGI_10004570 superfamily 243072 549 665 9.56E-29 115.559 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#16596 - CGI_10004570 superfamily 243072 213 331 5.64E-17 81.2758 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#16596 - CGI_10004570 superfamily 243072 1348 1488 4.64E-08 54.697 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#16596 - CGI_10004570 superfamily 243072 2177 2259 2.71E-06 48.919 cl02529 ANK superfamily N - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#16596 - CGI_10004570 superfamily 243072 1286 1403 6.96E-06 47.7634 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#16596 - CGI_10004570 superfamily 243072 366 464 0.000120644 43.9115 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#16597 - CGI_10004571 superfamily 111929 117 175 1.42E-05 42.0026 cl03885 Str_synth superfamily N - Strictosidine synthase; Strictosidine synthase (E.C. 4.3.3.2) is a key enzyme in alkaloid biosynthesis. It catalyzes the condensation of tryptamine with secologanin to form strictosidine. Q#16598 - CGI_10004572 superfamily 245239 38 162 1.27E-30 109.138 cl10033 F1-ATPase_delta superfamily - - "mitochondrial ATP synthase delta subunit; The F-ATPase is found in bacterial plasma membranes, mitochondrial inner membranes and in chloroplast thylakoid membranes. It has also been found in the archaeon Methanosarcina barkeri. It uses a proton gradient to drive ATP synthesis and hydrolyzes ATP to build the proton gradient. The extrinsic membrane domain, F1, is composed of alpha, beta, gamma, delta, and epsilon subunits with a stoichiometry of 3:3:1:1:1. Alpha and beta subunits form the globular catalytic moiety, a hexameric ring of alternating subunits. Gamma, delta and epsilon subunits form a stalk, connecting F1 to F0, the integral membrane proton translocating domain. In bacteria, which lack a eukaryotic epsilon subunit homolog, this subunit is called the epsilon subunit." Q#16599 - CGI_10004573 superfamily 247912 37 383 7.65E-41 148.804 cl17358 Beta-lactamase superfamily - - Beta-lactamase; This family appears to be distantly related to pfam00905 and PF00768 D-alanyl-D-alanine carboxypeptidase. 
Q#16600 - CGI_10004574 superfamily 247905 763 892 2.69E-27 109.635 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#16600 - CGI_10004574 superfamily 247805 463 611 3.34E-18 83.5408 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#16601 - CGI_10004575 superfamily 151909 6 71 2.94E-25 99.9871 cl12992 TUG superfamily - - GLUT4 regulating protein TUG; TUG is a GLUT4 regulating protein and functions to retain membrane vesicles containing GLUT4 intracellularly. TUG releases the GLUT4 containing vesicles to the cellular exocytic machinery in response to insulin stimulation which allows translocation to the plasma membrane. TUG has an N-terminal ubiquitin-like domain (UBL1) which in similar proteins appears to participate in protein-protein interactions. The region does have an area of negative electrostatic potential and increased backbone motility which leads to suggestions of a potential protein-protein interaction site. Q#16601 - CGI_10004575 superfamily 241645 467 537 0.000197336 40.3324 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. 
Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#16603 - CGI_10004577 superfamily 246669 284 377 0.00884092 35.1558 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#16604 - CGI_10011753 superfamily 241739 543 739 8.53E-50 173.213 cl00268 class_II_aaRS-like_core superfamily - - "Class II tRNA amino-acyl synthetase-like catalytic core domain. 
Class II amino acyl-tRNA synthetases (aaRS) share a common fold and generally attach an amino acid to the 3' OH of ribose of the appropriate tRNA. PheRS is an exception in that it attaches the amino acid at the 2'-OH group, like class I aaRSs. These enzymes are usually homodimers. This domain is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. The substrate specificity of this reaction is further determined by additional domains. Interestingly, this domain is also found in asparagine synthase A (AsnA), in the accessory subunit of mitochondrial polymerase gamma and in the bacterial ATP phosphoribosyltransferase regulatory subunit HisZ." Q#16604 - CGI_10011753 superfamily 202662 268 427 1.43E-20 89.5385 cl18231 B3_4 superfamily - - B3/4 domain; This domain is found in tRNA synthetase beta subunits as well as in some non tRNA synthetase proteins. Q#16604 - CGI_10011753 superfamily 208909 458 525 1.53E-07 49.4157 cl08394 B5 superfamily - - tRNA synthetase B5 domain; This domain is found in phenylalanine-tRNA synthetase beta subunits. Q#16605 - CGI_10011754 superfamily 245601 332 407 5.19E-12 64.3176 cl11399 HP superfamily N - "Histidine phosphatase domain found in a functionally diverse set of proteins, mostly phosphatases; contains a His residue which is phosphorylated during the reaction; Catalytic domain of a functionally diverse set of proteins, most of which are phosphatases. The conserved catalytic core of this domain contains a His residue which is phosphorylated in the reaction. This set of proteins includes cofactor-dependent and cofactor-independent phosphoglycerate mutases (dPGM, and BPGM respectively), fructose-2,6-bisphosphatase (F26BP)ase, Sts-1, SixA, histidine acid phosphatases, phytases, and related proteins. 
Functions include roles in metabolism, signaling, or regulation, for example F26BPase affects glycolysis and gluconeogenesis through controlling the concentration of F26BP; BPGM controls the concentration of 2,3-BPG (the main allosteric effector of hemoglobin in human blood cells); human Sts-1 is a T-cell regulator; Escherichia coli Six A participates in the ArcB-dependent His-to-Asp phosphorelay signaling system; phytases scavenge phosphate from extracellular sources. Deficiency and mutation in many of the human members result in disease, for example erythrocyte BPGM deficiency is a disease associated with a decrease in the concentration of 2,3-BPG. Clinical applications include the use of prostatic acid phosphatase (PAP) as a serum marker for prostate cancer. Agricultural applications include the addition of phytases to animal feed." Q#16605 - CGI_10011754 superfamily 245601 128 199 3.08E-07 50.0652 cl11399 HP superfamily C - "Histidine phosphatase domain found in a functionally diverse set of proteins, mostly phosphatases; contains a His residue which is phosphorylated during the reaction; Catalytic domain of a functionally diverse set of proteins, most of which are phosphatases. The conserved catalytic core of this domain contains a His residue which is phosphorylated in the reaction. This set of proteins includes cofactor-dependent and cofactor-independent phosphoglycerate mutases (dPGM, and BPGM respectively), fructose-2,6-bisphosphatase (F26BP)ase, Sts-1, SixA, histidine acid phosphatases, phytases, and related proteins. 
Functions include roles in metabolism, signaling, or regulation, for example F26BPase affects glycolysis and gluconeogenesis through controlling the concentration of F26BP; BPGM controls the concentration of 2,3-BPG (the main allosteric effector of hemoglobin in human blood cells); human Sts-1 is a T-cell regulator; Escherichia coli Six A participates in the ArcB-dependent His-to-Asp phosphorelay signaling system; phytases scavenge phosphate from extracellular sources. Deficiency and mutation in many of the human members result in disease, for example erythrocyte BPGM deficiency is a disease associated with a decrease in the concentration of 2,3-BPG. Clinical applications include the use of prostatic acid phosphatase (PAP) as a serum marker for prostate cancer. Agricultural applications include the addition of phytases to animal feed." Q#16610 - CGI_10011759 superfamily 243082 40 328 7.30E-11 63.271 cl02553 Peptidase_C19 superfamily - - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." 
Q#16611 - CGI_10011760 superfamily 241874 20 548 0 567.495 cl00456 SLC5-6-like_sbd superfamily - - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#16612 - CGI_10011761 superfamily 248312 1 102 7.38E-12 60.0683 cl17758 PMP22_Claudin superfamily N - PMP-22/EMP/MP20/Claudin family; PMP-22/EMP/MP20/Claudin family. Q#16613 - CGI_10006180 superfamily 217473 140 371 8.02E-24 101.673 cl03978 Mab-21 superfamily - - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#16615 - CGI_10006182 superfamily 241832 79 169 1.67E-05 41.9608 cl00388 Thioredoxin_like superfamily N - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. 
They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#16616 - CGI_10006183 superfamily 191582 71 146 1.98E-15 68.7953 cl05954 DUF1180 superfamily N - Protein of unknown function (DUF1180); This family consists of several hypothetical mammalian proteins of around 190 residues in length. The function of this family is unknown. Q#16618 - CGI_10006185 superfamily 245201 25 220 2.75E-47 165.488 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#16619 - CGI_10006186 superfamily 245201 18 213 5.84E-51 177.044 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. 
These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#16620 - CGI_10006187 superfamily 242323 125 231 3.79E-23 93.3419 cl01132 FA_hydroxylase superfamily - - "Fatty acid hydroxylase superfamily; This superfamily includes fatty acid and carotene hydroxylases and sterol desaturases. Beta-carotene hydroxylase is involved in zeaxanthin synthesis by hydroxylating beta-carotene, but the enzyme may be involved in other pathways. This family includes C-5 sterol desaturase and C-4 sterol methyl oxidase. Members of this family are involved in cholesterol biosynthesis and biosynthesis of plant cuticular wax. These enzymes contain two copies of a HXHH motif. Members of this family are integral membrane proteins." Q#16621 - CGI_10006188 superfamily 190308 135 312 2.37E-08 53.0915 cl18163 Fringe superfamily - - "Fringe-like; The Drosophila protein fringe (FNG) is a glucosaminyltransferase that controls the response of the Notch receptor to specific ligands. FNG is localised to the Golgi apparatus (not secreted as previously thought). Modification of Notch occurs through glycosylation by FNG. The Xenopus homologue, lunatic fringe, has been implicated in a variety of functions." Q#16623 - CGI_10005601 superfamily 247999 10 59 0.000771313 38.6256 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#16626 - CGI_10005604 superfamily 245847 9 79 1.53E-14 64.5001 cl12042 FA58C superfamily C - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." 
Q#16627 - CGI_10005605 superfamily 243099 6 142 5.72E-45 145.939 cl02575 Bcl-2_like superfamily - - "Apoptosis regulator proteins of the Bcl-2 family, named after B-cell lymphoma 2. This alignment model spans what have been described as Bcl-2 homology regions BH1, BH2, BH3, and BH4. Many members of this family have an additional C-terminal transmembrane segment. Some homologous proteins, which are not included in this model, may miss either the BH4 (Bax, Bak) or the BH2 (Bcl-X(S)) region, and some appear to only share the BH3 region (Bik, Bim, Bad, Bid, Egl-1). This family is involved in the regulation of the outer mitochondrial membrane's permeability and in promoting or preventing the release of apoptogenic factors, which in turn may trigger apoptosis by activating caspases. Bcl-2 and the closely related Bcl-X(L) are anti-apoptotic key regulators of programmed cell death. They are assumed to function via heterodimeric protein-protein interactions, binding pro-apoptotic proteins such as Bad (BCL2-antagonist of cell death), Bid, and Bim, by specifically interacting with their BH3 regions. Interfering with this heterodimeric interaction via small-molecule inhibitors may prove effective in targeting various cancers. This family also includes the Caenorhabditis elegans Bcl-2 homolog CED-9, which binds to CED-4, the C. elegans homolog of mammalian Apaf-1. Apaf-1, however, does not seem to be inhibited by Bcl-2 directly." Q#16628 - CGI_10005606 superfamily 241619 123 154 0.00834821 34.3327 cl00112 PAN_APPLE superfamily NC - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. 
PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#16628 - CGI_10005606 superfamily 241619 320 351 0.00834821 34.3327 cl00112 PAN_APPLE superfamily NC - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#16633 - CGI_10012605 superfamily 247805 35 233 9.47E-87 274.362 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 
Q#16633 - CGI_10012605 superfamily 247905 257 378 2.25E-21 91.1452 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#16634 - CGI_10012606 superfamily 243083 1630 1718 3.12E-21 91.2764 cl02554 PWWP superfamily - - "The PWWP domain, named for a conserved Pro-Trp-Trp-Pro motif, is a small domain consisting of 100-150 amino acids. The PWWP domain is found in numerous proteins that are involved in cell division, growth and differentiation. Most PWWP-domain proteins seem to be nuclear, often DNA-binding, proteins that function as transcription factors regulating a variety of developmental processes. The function of the PWWP domain is still not known precisely; however, based on the fact that other regions of PWWP-domain proteins are responsible for nuclear localization and DNA-binding, it is likely that the PWWP domain acts as a site for protein-protein binding interactions, influencing chromatin remodeling and thereby regulating transcriptional processes. Some PWWP-domain proteins have been linked to cancer or other diseases; some are known to function as growth factors." Q#16634 - CGI_10012606 superfamily 241617 321 380 0.000288512 41.2026 cl00110 MBD superfamily - - "MeCP2, MBD1, MBD2, MBD3, MBD4, CLLD8-like, and BAZ2A-like proteins constitute a family of proteins that share the methyl-CpG-binding domain (MBD). 
The MBD consists of about 70 residues and is defined as the minimal region required for binding to methylated DNA by a methyl-CpG-binding protein which binds specifically to methylated DNA. The MBD can recognize a single symmetrically methylated CpG either as naked DNA or within chromatin. MeCP2, MBD1 and MBD2 (and likely MBD3) form complexes with histone deacetylase and are involved in histone deacetylase-dependent repression of transcription. MBD4 is an endonuclease that forms a complex with the DNA mismatch-repair protein MLH1. The MBDs present in putative chromatin remodelling subunit, BAZ2A, and putative histone methyltransferase, CLLD8, represent two phylogenetically distinct groups within the MBD protein family." Q#16635 - CGI_10012607 superfamily 241645 11 87 5.96E-22 84.5484 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#16636 - CGI_10012608 superfamily 241645 28 107 2.81E-20 78.7704 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. 
Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#16637 - CGI_10012609 superfamily 241645 20 90 1.13E-16 77.6148 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#16637 - CGI_10012609 superfamily 241645 99 176 6.63E-16 75.3036 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. 
Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#16637 - CGI_10012609 superfamily 241645 224 313 0.000750201 39.934 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#16638 - CGI_10012610 superfamily 241866 36 440 0 828.861 cl00445 Iso_dh superfamily - - Isocitrate/isopropylmalate dehydrogenase; Isocitrate/isopropylmalate dehydrogenase. 
Q#16639 - CGI_10012611 superfamily 247794 253 616 0 611.333 cl17240 FDH_GDH_like superfamily - - "Formate/glycerate dehydrogenases, D-specific 2-hydroxy acid dehydrogenases and related dehydrogenases; The formate/glycerate dehydrogenase like family contains a diverse group of enzymes such as formate dehydrogenase (FDH), glycerate dehydrogenase (GDH), D-lactate dehydrogenase, L-alanine dehydrogenase, and S-Adenosylhomocysteine hydrolase, that share a common 2-domain structure. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar domains of the alpha/beta Rossmann fold NAD+ binding form. The NAD(P) binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD(P) is bound, primarily to the C-terminal portion of the 2nd (internal) domain. While many members of this family are dimeric, alanine DH is hexameric and phosphoglycerate DH is tetrameric. 2-hydroxyacid dehydrogenases are enzymes that catalyze the conversion of a wide variety of D-2-hydroxy acids to their corresponding keto acids. The general mechanism is (R)-lactate + acceptor to pyruvate + reduced acceptor. Formate dehydrogenase (FDH) catalyzes the NAD+-dependent oxidation of formate ion to carbon dioxide with the concomitant reduction of NAD+ to NADH. FDHs of this family contain no metal ions or prosthetic groups. Catalysis occurs through direct transfer of a hydride ion to NAD+ without the stages of acid-base catalysis typically found in related dehydrogenases." 
Q#16639 - CGI_10012611 superfamily 241952 803 1259 8.53E-174 525.434 cl00566 PntB superfamily - - NAD/NADP transhydrogenase beta subunit [Energy production and conversion] Q#16639 - CGI_10012611 superfamily 247794 53 244 5.94E-98 318.581 cl17240 FDH_GDH_like superfamily C - "Formate/glycerate dehydrogenases, D-specific 2-hydroxy acid dehydrogenases and related dehydrogenases; The formate/glycerate dehydrogenase like family contains a diverse group of enzymes such as formate dehydrogenase (FDH), glycerate dehydrogenase (GDH), D-lactate dehydrogenase, L-alanine dehydrogenase, and S-Adenosylhomocysteine hydrolase, that share a common 2-domain structure. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar domains of the alpha/beta Rossmann fold NAD+ binding form. The NAD(P) binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD(P) is bound, primarily to the C-terminal portion of the 2nd (internal) domain. While many members of this family are dimeric, alanine DH is hexameric and phosphoglycerate DH is tetrameric. 2-hydroxyacid dehydrogenases are enzymes that catalyze the conversion of a wide variety of D-2-hydroxy acids to their corresponding keto acids. The general mechanism is (R)-lactate + acceptor to pyruvate + reduced acceptor. Formate dehydrogenase (FDH) catalyzes the NAD+-dependent oxidation of formate ion to carbon dioxide with the concomitant reduction of NAD+ to NADH. FDHs of this family contain no metal ions or prosthetic groups. Catalysis occurs through direct transfer of a hydride ion to NAD+ without the stages of acid-base catalysis typically found in related dehydrogenases." Q#16639 - CGI_10012611 superfamily 221763 679 767 7.57E-40 144.19 cl15081 DUF3814 superfamily - - "Domain of unknown function (DUF3814); This is a domain of unknown function. 
It is often found in combination with pfam05222, pfam01262 and pfam02233 on alanine dehydrogenase and pyridine nucleotide transhydrogenase enzymes." Q#16642 - CGI_10012614 superfamily 245835 15 228 1.14E-90 281.51 cl12013 BAR superfamily - - "The Bin/Amphiphysin/Rvs (BAR) domain, a dimerization module that binds membranes and detects membrane curvature; BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions including organelle biogenesis, membrane trafficking or remodeling, and cell division and migration. Mutations in BAR containing proteins have been linked to diseases and their inactivation in cells leads to altered membrane dynamics. A BAR domain with an additional N-terminal amphipathic helix (an N-BAR) can drive membrane curvature. These N-BAR domains are found in amphiphysins and endophilins, among others. BAR domains are also frequently found alongside domains that determine lipid specificity, such as the Pleckstrin Homology (PH) and Phox Homology (PX) domains which are present in beta centaurins (ACAPs and ASAPs) and sorting nexins, respectively. A FES-CIP4 Homology (FCH) domain together with a coiled coil region is called the F-BAR domain and is present in Pombe/Cdc15 homology (PCH) family proteins, which include Fes/Fes tyrosine kinases, PACSIN or syndapin, CIP4-like proteins, and srGAPs, among others. The Inverse (I)-BAR or IRSp53/MIM homology Domain (IMD) is found in multi-domain proteins, such as IRSp53 and MIM, that act as scaffolding proteins and transducers of a variety of signaling pathways that link membrane dynamics and the underlying actin cytoskeleton. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The I-BAR domain induces membrane protrusions in the opposite direction compared to classical BAR and F-BAR domains, which produce membrane invaginations. 
BAR domains that also serve as protein interaction domains include those of arfaptin and OPHN1-like proteins, among others, which bind to Rac and Rho GAP domains, respectively." Q#16642 - CGI_10012614 superfamily 247683 500 552 3.28E-23 93.253 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#16643 - CGI_10012615 superfamily 247805 276 421 8.20E-24 98.5635 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#16644 - CGI_10012616 superfamily 197746 101 155 0.000141456 38.0911 cl02624 MIR superfamily - - Domain in ryanodine and inositol trisphosphate receptors and protein O-mannosyltransferases; Domain in ryanodine and inositol trisphosphate receptors and protein O-mannosyltransferases. 
Q#16644 - CGI_10012616 superfamily 197746 162 203 0.00866724 33.0835 cl02624 MIR superfamily - - Domain in ryanodine and inositol trisphosphate receptors and protein O-mannosyltransferases; Domain in ryanodine and inositol trisphosphate receptors and protein O-mannosyltransferases. Q#16648 - CGI_10005707 superfamily 243034 511 594 0.000395082 40.056 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#16648 - CGI_10005707 superfamily 221741 100 186 6.81E-20 86.6159 cl15057 Cadherin-like superfamily - - "Cadherin-like beta sandwich domain; This domain is found in several bacterial, metazoan and chlorophyte algal proteins. 
A profile-profile comparison recovered the cadherin domain and a comparison of the predicted structure of this domain with the crystal structure of the cadherin showed a congruent seven stranded secondary structure. The domain is widespread in bacteria and seen in the firmicutes, actinobacteria, certain proteobacteria, bacteroides and chlamydiae with an expansion in Clostridium. In contrast, it is limited in its distribution in eukaryotes suggesting that it was derived through lateral transfer from bacteria. In prokaryotes, this domain is widely fused to other domains such as FNIII (Fibronectin Type III), TIG, SLH (S-layer homology), discoidin, cell-wall-binding repeat domain and alpha-amylase-like glycohydrolases. These associations are suggestive of a carbohydrate-binding function for this cadherin-like domain. In animal proteins it is associated with an ATP-grasp domain." Q#16648 - CGI_10005707 superfamily 221741 6 93 2.89E-18 81.9935 cl15057 Cadherin-like superfamily - - "Cadherin-like beta sandwich domain; This domain is found in several bacterial, metazoan and chlorophyte algal proteins. A profile-profile comparison recovered the cadherin domain and a comparison of the predicted structure of this domain with the crystal structure of the cadherin showed a congruent seven stranded secondary structure. The domain is widespread in bacteria and seen in the firmicutes, actinobacteria, certain proteobacteria, bacteroides and chlamydiae with an expansion in Clostridium. In contrast, it is limited in its distribution in eukaryotes suggesting that it was derived through lateral transfer from bacteria. In prokaryotes, this domain is widely fused to other domains such as FNIII (Fibronectin Type III), TIG, SLH (S-layer homology), discoidin, cell-wall-binding repeat domain and alpha-amylase-like glycohydrolases. These associations are suggestive of a carbohydrate-binding function for this cadherin-like domain. In animal proteins it is associated with an ATP-grasp domain." 
Q#16649 - CGI_10005708 superfamily 246675 309 462 1.19E-107 338.158 cl14615 PI-PLCc_GDPD_SF superfamily C - "Catalytic domain of phosphoinositide-specific phospholipase C-like phosphodiesterases superfamily; The PI-PLC-like phosphodiesterases superfamily represents the catalytic domains of bacterial phosphatidylinositol-specific phospholipase C (PI-PLC, EC 4.6.1.13), eukaryotic phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11), glycerophosphodiester phosphodiesterases (GP-GDE, EC 3.1.4.46), sphingomyelinases D (SMases D) (sphingomyelin phosphodiesterase D, EC 3.1.4.41) from spider venom, SMases D-like proteins, and phospholipase D (PLD) from several pathogenic bacteria, as well as their uncharacterized homologs found in organisms ranging from bacteria and archaea to metazoans, plants, and fungi. PI-PLCs are ubiquitous enzymes hydrolyzing the membrane lipid phosphoinositides to yield two important second messengers, inositol phosphates and diacylglycerol (DAG). GP-GDEs play essential roles in glycerol metabolism and catalyze the hydrolysis of glycerophosphodiesters to sn-glycerol-3-phosphate (G3P) and the corresponding alcohols that are major sources of carbon and phosphate. Both, PI-PLCs and GP-GDEs, can hydrolyze the 3'-5' phosphodiester bonds in different substrates, and utilize a similar mechanism of general base and acid catalysis with conserved histidine residues, which consists of two steps, a phosphotransfer and a phosphodiesterase reaction. This superfamily also includes Neurospora crassa ankyrin repeat protein NUC-2 and its Saccharomyces cerevisiae counterpart, Phosphate system positive regulatory protein PHO81, glycerophosphodiester phosphodiesterase (GP-GDE)-like protein SHV3 and SHV3-like proteins (SVLs). 
The residues essential for enzyme activities and metal binding are not conserved in these sequence homologs, which might suggest that the function of catalytic domains in these proteins might be distinct from those in typical PLC-like phosphodiesterases." Q#16649 - CGI_10005708 superfamily 247725 13 137 1.77E-37 138.092 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting proteins to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#16649 - CGI_10005708 superfamily 246669 718 836 2.82E-37 137.673 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." 
Q#16649 - CGI_10005708 superfamily 246675 583 686 3.04E-68 229.532 cl14615 PI-PLCc_GDPD_SF superfamily N - "Catalytic domain of phosphoinositide-specific phospholipase C-like phosphodiesterases superfamily; The PI-PLC-like phosphodiesterases superfamily represents the catalytic domains of bacterial phosphatidylinositol-specific phospholipase C (PI-PLC, EC 4.6.1.13), eukaryotic phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11), glycerophosphodiester phosphodiesterases (GP-GDE, EC 3.1.4.46), sphingomyelinases D (SMases D) (sphingomyelin phosphodiesterase D, EC 3.1.4.41) from spider venom, SMases D-like proteins, and phospholipase D (PLD) from several pathogenic bacteria, as well as their uncharacterized homologs found in organisms ranging from bacteria and archaea to metazoans, plants, and fungi. PI-PLCs are ubiquitous enzymes hydrolyzing the membrane lipid phosphoinositides to yield two important second messengers, inositol phosphates and diacylglycerol (DAG). GP-GDEs play essential roles in glycerol metabolism and catalyze the hydrolysis of glycerophosphodiesters to sn-glycerol-3-phosphate (G3P) and the corresponding alcohols that are major sources of carbon and phosphate. Both, PI-PLCs and GP-GDEs, can hydrolyze the 3'-5' phosphodiester bonds in different substrates, and utilize a similar mechanism of general base and acid catalysis with conserved histidine residues, which consists of two steps, a phosphotransfer and a phosphodiesterase reaction. This superfamily also includes Neurospora crassa ankyrin repeat protein NUC-2 and its Saccharomyces cerevisiae counterpart, Phosphate system positive regulatory protein PHO81, glycerophosphodiester phosphodiesterase (GP-GDE)-like protein SHV3 and SHV3-like proteins (SVLs). 
The residues essential for enzyme activities and metal binding are not conserved in these sequence homologs, which might suggest that the function of catalytic domains in these proteins might be distinct from those in typical PLC-like phosphodiesterases." Q#16649 - CGI_10005708 superfamily 150071 215 309 8.84E-20 86.0882 cl08538 efhand_like superfamily - - "Phosphoinositide-specific phospholipase C, efhand-like; Members of this family are predominantly found in phosphoinositide-specific phospholipase C. They adopt a structure consisting of a core of four alpha helices, in an EF like fold, and are required for functioning of the enzyme." Q#16655 - CGI_10008864 superfamily 243092 230 513 7.66E-70 227.603 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#16655 - CGI_10008864 superfamily 207680 99 117 8.14E-06 43.1155 cl02632 PRP4 superfamily C - pre-mRNA processing factor 4 (PRP4) like; This small domain is found on PRP4 ribonucleoproteins. PRP4 is a U4/U6 small nuclear ribonucleoprotein that is involved in pre-mRNA processing. Q#16656 - CGI_10008865 superfamily 241884 18 229 3.51E-122 348.48 cl00467 Ntn_hydrolase superfamily - - "The Ntn hydrolases (N-terminal nucleophile) are a diverse superfamily of enzymes that are activated autocatalytically via an N-terminally located nucleophilic amino acid. N-terminal nucleophile (NTN-) hydrolase superfamily, which contains a four-layered alpha, beta, beta, alpha core structure. This family of hydrolases includes penicillin acylase, the 20S proteasome alpha and beta subunits, and glutamate synthase. The mechanism of activation of these proteins is conserved, although they differ in their substrate specificities. All known members catalyze the hydrolysis of amide bonds in either proteins or small molecules, and each one of them is synthesized as a preprotein. For each, an autocatalytic endoproteolytic process generates a new N-terminal residue. This mature N-terminal residue is central to catalysis and acts as both a polarizing base and a nucleophile during the reaction. The N-terminal amino group acts as the proton acceptor and activates either the nucleophilic hydroxyl in a Ser or Thr residue or the nucleophilic thiol in a Cys residue. The position of the N-terminal nucleophile in the active site and the mechanism of catalysis are conserved in this family, despite considerable variation in the protein sequences." Q#16657 - CGI_10008866 superfamily 247754 78 132 1.43E-08 47.5478 cl17200 HTH_XRE superfamily - - Helix-turn-helix XRE-family like proteins. Prokaryotic DNA binding proteins belonging to the xenobiotic response element family of transcriptional regulators. 
Q#16657 - CGI_10008866 superfamily 117100 5 73 8.54E-22 83.9448 cl07226 MBF1 superfamily - - "Multiprotein bridging factor 1; This domain is found in the multiprotein bridging factor 1 (MBF1) which forms a heterodimer with MBF2. It has been shown to make direct contact with the TATA-box binding protein (TBP) and interacts with Ftz-F1, stabilising the Ftz-F1-DNA complex. It is also found in the endothelial differentiation-related factor (EDF-1). Human EDF-1 is involved in the repression of endothelial differentiation, interacts with CaM and is phosphorylated by PKC. The domain is found in a wide range of eukaryotic proteins including metazoans, fungi and plants. A helix-turn-helix motif (pfam01381) is found to its C-terminus." Q#16658 - CGI_10008867 superfamily 241645 17 73 5.89E-19 83.4302 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." 
Q#16659 - CGI_10008868 superfamily 243034 216 302 3.75E-08 53.1528 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#16660 - CGI_10008869 superfamily 220715 14 147 3.66E-27 104.334 cl11025 NT-C2 superfamily - - "N-terminal C2 in EEIG1 and EHBP1 proteins; This version of the C2 domain was initially identified in the vertebrate estrogen early-induced gene 1 (EEIG1), and its Drosophila ortholog required for uptake of dsRNA via the endocytotic machinery to induce RNAi silencing. 
It is also in C.elegans ortholog Sym-3 (SYnthetic lethal with Mec-3) and the mammalian protein EHBP1 (EH domain Binding Protein-1) that regulates endocytotic recycling and two plant proteins, RPG that regulates Rhizobium-directed polar growth and PMI1 (Plastid Movement Impaired 1) that is essential for intracellular movement of chloroplasts in response to blue light." Q#16662 - CGI_10008871 superfamily 247739 14 193 8.71E-45 149.344 cl17185 LPLAT superfamily - - "Lysophospholipid acyltransferases (LPLATs) of glycerophospholipid biosynthesis; Lysophospholipid acyltransferase (LPLAT) superfamily members are acyltransferases of de novo and remodeling pathways of glycerophospholipid biosynthesis. These proteins catalyze the incorporation of an acyl group from either acylCoAs or acyl-acyl carrier proteins (acylACPs) into acceptors such as glycerol 3-phosphate, dihydroxyacetone phosphate or lyso-phosphatidic acid. Included in this superfamily are LPLATs such as glycerol-3-phosphate 1-acyltransferase (GPAT, PlsB), 1-acyl-sn-glycerol-3-phosphate acyltransferase (AGPAT, PlsC), lysophosphatidylcholine acyltransferase 1 (LPCAT-1), lysophosphatidylethanolamine acyltransferase (LPEAT, also known as, MBOAT2, membrane-bound O-acyltransferase domain-containing protein 2), lipid A biosynthesis lauroyl/myristoyl acyltransferase, 2-acylglycerol O-acyltransferase (MGAT), dihydroxyacetone phosphate acyltransferase (DHAPAT, also known as 1 glycerol-3-phosphate O-acyltransferase 1) and Tafazzin (the protein product of the Barth syndrome (TAZ) gene)." Q#16663 - CGI_10008872 superfamily 241563 68 109 5.50E-06 44.0072 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#16663 - CGI_10008872 superfamily 246954 135 207 0.00408072 38.5534 cl15415 Sec1 superfamily NC - Sec1 family; Sec1 family. Q#16664 - CGI_10008873 superfamily 247805 43 130 4.86E-05 41.4759 cl17251 DEXDc superfamily N - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#16665 - CGI_10008874 superfamily 247692 198 556 1.68E-63 213.537 cl17068 AFD_class_I superfamily - - "Adenylate forming domain, Class I; This family includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain." Q#16665 - CGI_10008874 superfamily 247692 30 232 2.40E-09 58.3601 cl17068 AFD_class_I superfamily C - "Adenylate forming domain, Class I; This family includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain." Q#16666 - CGI_10008875 superfamily 247692 71 410 4.40E-57 192.736 cl17068 AFD_class_I superfamily - - "Adenylate forming domain, Class I; This family includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. 
The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain." Q#16667 - CGI_10008876 superfamily 247856 483 545 6.00E-09 52.9353 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#16667 - CGI_10008876 superfamily 246925 185 402 5.63E-18 83.9441 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#16668 - CGI_10008877 superfamily 241832 4 218 1.84E-117 336.049 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. 
The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#16669 - CGI_10005072 superfamily 241583 12 127 6.99E-19 81.2661 cl00064 ZnMc superfamily N - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#16672 - CGI_10005075 superfamily 216363 133 238 1.40E-24 94.8445 cl08312 UPF0029 superfamily - - Uncharacterized protein family UPF0029; Uncharacterized protein family UPF0029. Q#16674 - CGI_10005077 superfamily 241568 34 71 0.0002324 38.598 cl00043 CCP superfamily N - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." 
Q#16676 - CGI_10004740 superfamily 245206 33 322 9.70E-101 311.134 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#16676 - CGI_10004740 superfamily 245206 361 618 7.46E-86 272.229 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." 
Q#16677 - CGI_10004741 superfamily 177822 42 291 3.40E-18 81.8901 cl18088 PLN02164 superfamily N - sulfotransferase Q#16678 - CGI_10004742 superfamily 217410 7 165 1.41E-10 56.5936 cl18409 DDE_1 superfamily N - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction. Interestingly this family also includes the CENP-B protein. This domain in that protein appears to have lost the metal binding residues and is unlikely to have endonuclease activity. Centromere Protein B (CENP-B) is a DNA-binding protein localised to the centromere." Q#16679 - CGI_10004743 superfamily 177822 24 146 0.000216137 39.9033 cl18088 PLN02164 superfamily C - sulfotransferase Q#16682 - CGI_10004746 superfamily 177822 4 129 7.45E-10 54.5409 cl18088 PLN02164 superfamily N - sulfotransferase Q#16685 - CGI_10004749 superfamily 241578 238 432 2.38E-18 83.9768 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. 
There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#16685 - CGI_10004749 superfamily 217211 446 514 1.44E-07 49.9754 cl03691 Cache_1 superfamily - - Cache domain; Cache domain. Q#16685 - CGI_10004749 superfamily 219821 184 212 0.000188728 40.8174 cl07136 VWA_N superfamily N - "VWA N-terminal; This domain is found at the N-terminus of proteins containing von Willebrand factor type A (VWA, pfam00092) and Cache (pfam02743) domains. It has been found in vertebrates, Drosophila and C. elegans but has not yet been identified in other eukaryotes. It is probably involved in the function of some voltage-dependent calcium channel subunits." Q#16686 - CGI_10010747 superfamily 244530 1 158 1.24E-25 99.9268 cl06844 SRR1 superfamily N - SRR1; SRR1 proteins are signalling proteins involved in regulating the circadian clock in Arabidopsis. Q#16687 - CGI_10010748 superfamily 243088 58 171 1.13E-09 55.0549 cl02563 PX_domain superfamily - - "The Phox Homology domain, a phosphoinositide binding module; The PX domain is a phosphoinositide (PI) binding module involved in targeting proteins to membranes. Proteins containing PX domains interact with PIs and have been implicated in highly diverse functions such as cell signaling, vesicular trafficking, protein sorting, lipid modification, cell polarity and division, activation of T and B cells, and cell survival. Many members of this superfamily bind phosphatidylinositol-3-phosphate (PI3P) but in some cases, other PIs such as PI4P or PI(3,4)P2, among others, are the preferred substrates. In addition to protein-lipid interaction, the PX domain may also be involved in protein-protein interaction, as in the cases of p40phox, p47phox, and some sorting nexins (SNXs). 
The PX domain is conserved from yeast to humans and is found in more than 100 proteins. The majority of PX domain-containing proteins are SNXs, which play important roles in endosomal sorting." Q#16689 - CGI_10010750 superfamily 241594 476 828 2.97E-165 486.302 cl00077 HECTc superfamily - - "HECT domain; C-terminal catalytic domain of a subclass of Ubiquitin-protein ligase (E3). It binds specific ubiquitin-conjugating enzymes (E2), accepts ubiquitin from E2, transfers ubiquitin to substrate lysine side chains, and transfers additional ubiquitin molecules to the end of growing ubiquitin chains." Q#16689 - CGI_10010750 superfamily 246669 14 137 2.55E-36 133.942 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." 
Q#16689 - CGI_10010750 superfamily 241647 365 394 3.02E-11 59.849 cl00157 WW superfamily - - Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs. Q#16689 - CGI_10010750 superfamily 241647 289 317 9.19E-11 58.3082 cl00157 WW superfamily - - Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs. 
Q#16689 - CGI_10010750 superfamily 241647 257 287 1.41E-10 57.923 cl00157 WW superfamily - - Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs. Q#16689 - CGI_10010750 superfamily 241647 405 434 1.07E-08 52.5302 cl00157 WW superfamily - - Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs. Q#16690 - CGI_10010751 superfamily 246722 9 112 6.57E-50 158.133 cl14812 PIN_SF superfamily - - "PIN (PilT N terminus) domain: Superfamily; PIN_SF The PIN (PilT N terminus) domain belongs to a large nuclease superfamily with representatives from eukaryota, eubacteria, and archaea. PIN domains were originally named for their sequence similarity to the N-terminal domain of an annotated pili biogenesis protein, PilT, a domain fusion between a PIN-domain and a PilT ATPase domain. 
The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues) is geometrically similar in the active center of structure-specific 5' nucleases (also known as Flap endonuclease-1-like), PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Seen here, are two major divisions in the PIN domain superfamily. The first major division, the structure-specific 5' nuclease family, is represented by FEN1, the 5'-3' exonuclease of DNA polymerase I, and T4 RNase H nuclease PIN domains. These 5' nucleases are involved in DNA replication, repair, and recombination. They are capable of both 5'-3' exonucleolytic activity and cleaving bifurcated DNA, in an endonucleolytic, structure-specific manner. Unique to FEN1-like nucleases, the PIN domain has a helical arch/clamp region (I domain) of variable length (approximately 16 to 800 residues) and, inserted within the C-terminal region of the PIN domain, a H3TH (helix-3-turn-helix) domain, an atypical helix-hairpin-helix-2-like region. Both the H3TH domain (not included here) and the helical arch/clamp region are involved in DNA binding. With the exception of Mkt1, these nucleases have a carboxylate rich active site that is involved in binding essential divalent metal ion cofactors (Mg2+, Mn2+, Zn2+, or Co2+). The second major division of the PIN domain superfamily, the VapC-Smg6 family, includes such eukaryotic ribonucleases as, Smg6, an essential factor in nonsense-mediated mRNA decay; Rrp44, the catalytic subunit of the exosome; and Nob1, a ribosome assembly factor critical in pre-rRNA processing. A large percentage of members in this family are bacterial ribonuclease toxins of TA operons such as Mycobacterium tuberculosis VapC and Neisseria gonorrhoeae FitB, as well as, archaeal homologs, Pyrobaculum aerophilum Pea0151 and P. aerophilum Pae2754. 
Also included are the eukaryotic Fcf1/ Utp24 (FAF1-copurifying factor 1/U three-associated protein 24) and Utp23-like proteins. Components of the small subunit processome, Fcf1/Utp24 and Utp23 are essential proteins involved in pre-rRNA processing and 40S ribosomal subunit assembly." Q#16693 - CGI_10010755 superfamily 245205 213 287 1.44E-07 49.1585 cl09930 RPA_2b-aaRSs_OBF_like superfamily - - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." 
Q#16693 - CGI_10010755 superfamily 245596 368 501 4.79E-48 167.374 cl11394 Glyco_tranf_GTA_type superfamily - - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#16694 - CGI_10010756 superfamily 183292 1 32 0.00167491 35.5676 cl18135 PRK11728 superfamily C - hydroxyglutarate oxidase; Provisional Q#16696 - CGI_10002294 superfamily 243109 37 196 1.83E-92 269.883 cl02614 SPRY superfamily - - "SPRY domain; SPRY domains, first identified in the SP1A kinase of Dictyostelium and rabbit Ryanodine receptor (hence the name), are homologous to B30.2. SPRY domains have been identified in at least 11 protein families, covering a wide range of functions, including regulation of cytokine signaling (SOCS), RNA metabolism (DDX1 and hnRNP), immunity to retroviruses (TRIM5alpha), intracellular calcium release (ryanodine receptors or RyR) and regulatory and developmental processes (HERC1 and Ash2L). 
B30.2 also contains residues in the N-terminus that form a distinct PRY domain structure; i.e. B30.2 domain consists of PRY and SPRY subdomains. B30.2 domains comprise the C-terminus of three protein families: BTNs (receptor glycoproteins of immunoglobulin superfamily); several TRIM proteins (composed of RING/B-box/coiled-coil or RBCC core); Stonutoxin (secreted poisonous protein of the stonefish Synanceia horrida). While SPRY domains are evolutionarily ancient, B30.2 domains are a more recent adaptation where the SPRY/PRY combination is a possible component of immune defense. Mutations found in the SPRY-containing proteins have been shown to cause Mediterranean fever and Opitz syndrome." Q#16697 - CGI_10002295 superfamily 213405 11 119 1.63E-26 97.1948 cl17101 Fis1 superfamily - - "Mitochondria Fission Protein Fis1, cytosolic domain; Fis1, along with Dnm1 and Mdv1, is an essential protein in mediating mitochondrial fission. Dnm1 and Fis1 are highly conserved, with a common mechanism in disparate species. In mutants of these proteins, mitochondrial fission is impaired, resulting in networks of undivided mitochondria. The Fis1 N-terminus is cytosolic and tethered to the mitochondrial outer membrane via a C-terminal transmembrane domain. Fis1 appears to act via the recruitment of division complexes to the mitochondrial outer membrane, via interactions with Mdv1 or Caf4. Fis1 has tandem TPR helix-turn-helix motifs which are known to mediate protein-protein interactions." Q#16698 - CGI_10002296 superfamily 243074 101 147 1.13E-08 48.6569 cl02535 F-box-like superfamily - - F-box-like; This is an F-box-like family. Q#16699 - CGI_10002297 superfamily 217617 71 170 4.83E-12 60.5077 cl15988 Sulfotransfer_2 superfamily C - "Sulfotransferase family; This family includes a variety of sulfotransferase enzymes. Chondroitin 6-sulfotransferase catalyzes the transfer of sulfate to position 6 of the N-acetylgalactosamine residue of chondroitin. 
This family also includes Heparan sulfate 2-O-sulfotransferase (HS2ST) and Heparan sulfate 6-sulfotransferase (HS6ST). Heparan sulfate (HS) is a co-receptor for a number of growth factors, morphogens, and adhesion proteins. HS biosynthetic modifications may determine the strength and outcome of HS-ligand interactions. Mice that lack HS2ST undergo developmental failure only after midgestation,the most dramatic effect being the complete failure of kidney development. Heparan sulphate 6- O -sulfotransferase (HS6ST) catalyzes the transfer of sulphate from adenosine 3'-phosphate, 5'-phosphosulphate to the 6th position of the N -sulphoglucosamine residue in heparan sulphate." Q#16701 - CGI_10004993 superfamily 241584 87 177 9.51E-17 75.6107 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#16701 - CGI_10004993 superfamily 241567 291 348 0.00154176 38.4502 cl00042 CASc superfamily N - "Caspase, interleukin-1 beta converting enzyme (ICE) homologues; Cysteine-dependent aspartate-directed proteases that mediate programmed cell death (apoptosis). Caspases are synthesized as inactive zymogens and activated by proteolysis of the peptide backbone adjacent to an aspartate. The resulting two subunits associate to form an (alpha)2(beta)2-tetramer which is the active enzyme. Activation of caspases can be mediated by other caspase homologs." 
Q#16702 - CGI_10004994 superfamily 241584 320 405 3.38E-11 60.9731 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#16702 - CGI_10004994 superfamily 241584 127 215 3.00E-09 55.1951 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#16702 - CGI_10004994 superfamily 241584 789 863 4.16E-05 42.8687 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." 
Q#16702 - CGI_10004994 superfamily 241584 525 585 0.00020274 40.5575 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#16702 - CGI_10004994 superfamily 241584 231 314 0.00303567 37.0907 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#16702 - CGI_10004994 superfamily 241567 682 779 2.16E-08 54.2434 cl00042 CASc superfamily C - "Caspase, interleukin-1 beta converting enzyme (ICE) homologues; Cysteine-dependent aspartate-directed proteases that mediate programmed cell death (apoptosis). Caspases are synthesized as inactive zymogens and activated by proteolysis of the peptide backbone adjacent to an aspartate. The resulting two subunits associate to form an (alpha)2(beta)2-tetramer which is the active enzyme. Activation of caspases can be mediated by other caspase homologs." 
Q#16703 - CGI_10004995 superfamily 241584 256 324 1.10E-07 50.5727 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#16703 - CGI_10004995 superfamily 241584 642 728 4.22E-07 49.0319 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#16703 - CGI_10004995 superfamily 241584 77 151 1.06E-06 47.8763 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." 
Q#16705 - CGI_10004997 superfamily 245225 29 426 1.13E-57 200.615 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily - - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein-coupled receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine-binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulators and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." 
Q#16706 - CGI_10010142 superfamily 215647 108 358 2.56E-67 217.476 cl18338 7tm_2 superfamily - - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GPCRs). They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#16706 - CGI_10010142 superfamily 243029 22 89 5.72E-20 83.5541 cl02422 HRM superfamily - - Hormone receptor domain; This extracellular domain contains four conserved cysteines that probably form disulphide bridges. The domain is found in a variety of hormone receptors. It may be a ligand binding domain. Q#16708 - CGI_10010144 superfamily 243029 212 277 8.94E-21 86.2505 cl02422 HRM superfamily - - Hormone receptor domain; This extracellular domain contains four conserved cysteines that probably form disulphide bridges. The domain is found in a variety of hormone receptors. It may be a ligand binding domain. 
Q#16708 - CGI_10010144 superfamily 215647 307 363 3.76E-10 58.7741 cl18338 7tm_2 superfamily NC - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GPCRs). They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#16708 - CGI_10010144 superfamily 215647 335 407 7.34E-08 51.8405 cl18338 7tm_2 superfamily N - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GPCRs). They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. 
Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#16708 - CGI_10010144 superfamily 245201 31 90 3.77E-05 44.0477 cl09925 PKc_like superfamily NC - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#16709 - CGI_10010145 superfamily 241687 17 94 4.02E-12 59.2681 cl00208 RNase_T2 superfamily N - "Ribonuclease T2 (RNase T2) is a widespread family of secreted RNases found in every organism examined thus far. This family includes RNase Rh, RNase MC1, RNase LE, and self-incompatibility RNases (S-RNases). Plant T2 RNases are expressed during leaf senescence in order to scavenge phosphate from ribonucleotides. They are also expressed in response to wounding or pathogen invasion. S-RNases are thought to prevent self-fertilization by acting as selective cytotoxins of "self" pollen." Q#16710 - CGI_10010146 superfamily 247068 47 148 3.65E-10 58.4789 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#16710 - CGI_10010146 superfamily 241611 354 532 1.74E-08 53.9316 cl00102 PTX superfamily - - "Pentraxins are plasma proteins characterized by their pentameric discoid assembly and their Ca2+ dependent ligand binding, such as Serum amyloid P component (SAP) and C-reactive Protein (CRP), which are cytokine-inducible acute-phase proteins implicated in innate immunity. CRP binds to ligands containing phosphocholine, SAP binds to amyloid fibrils, DNA, chromatin, fibronectin, C4-binding proteins and glycosaminoglycans. "Long" pentraxins have N-terminal extensions to the common pentraxin domain; one group, the neuronal pentraxins, may be involved in synapse formation and remodeling, and they may also be able to form heteromultimers." Q#16710 - CGI_10010146 superfamily 247068 159 245 0.000535926 39.6042 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." 
Q#16711 - CGI_10010147 superfamily 247724 102 331 1.87E-38 141.145 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#16711 - CGI_10010147 superfamily 191096 584 748 2.69E-71 231.904 cl12302 Fzo_mitofusin superfamily - - fzo-like conserved region; Family of putative transmembrane GTPase. The fzo protein is a mediator of mitochondrial fusion. This conserved region is also found in the human mitofusin protein. Q#16712 - CGI_10010148 superfamily 219603 454 722 2.11E-104 327.098 cl06744 GCFC superfamily - - "GC-rich sequence DNA-binding factor-like protein; Sequences found in this family are similar to a region of a human GC-rich sequence DNA-binding factor homolog. 
This is thought to be a protein involved in transcriptional regulation due to partial homologies to a transcription repressor and histone-interacting protein. This family also contains tuftelin interacting protein 11 which has been identified as both a nuclear and cytoplasmic protein, and has been implicated in the secretory pathway. Sip1, a septin interacting protein is also a member of this family." Q#16712 - CGI_10010148 superfamily 221586 81 145 1.32E-21 91.6632 cl13843 TIP_N superfamily C - "Tuftelin interacting protein N terminal; This domain family is found in eukaryotes, and is typically between 99 and 114 amino acids in length. The family is found in association with pfam08697, pfam01585. There are two completely conserved residues (G and F) that may be functionally important. TIP is involved in enamel assembly by interacting with one of the major proteins responsible for biomineralisation of enamel - tuftelin." Q#16712 - CGI_10010148 superfamily 243107 214 243 8.99E-11 58.6728 cl02611 G-patch superfamily C - "G-patch domain; This domain is found in a number of RNA binding proteins, and is also found in proteins that contain RNA binding domains. This suggests that this domain may have an RNA binding function. This domain has seven highly conserved glycines." Q#16713 - CGI_10010149 superfamily 246974 8 303 2.14E-129 377.321 cl15477 diphth2_R superfamily - - "diphthamide biosynthesis enzyme Dph1/Dph2 domain; Archaea and Eukaryotes, but not Eubacteria, share the property of having a covalently modified residue, 2'-[3-carboxamido-3-(trimethylammonio)propyl]histidine, as a part of a cytosolic protein. The modified His, termed diphthamide, is part of translation elongation factor EF-2 and is the site for ADP-ribosylation by diphtheria toxin. This model includes both Dph1 and Dph2 from Saccharomyces cerevisiae, although only Dph2 is found in the Archaea (see TIGR03682). 
Dph2 has been shown to act analogously to the radical SAM (rSAM) family (pfam04055), with 4Fe-4S-assisted cleavage of S-adenosylmethionine to create a free radical, but a different organic radical than in rSAM." Q#16714 - CGI_10010150 superfamily 242889 736 824 4.30E-14 69.5783 cl02111 PCI superfamily - - "PCI domain; This domain has also been called the PINT motif (Proteasome, Int-6, Nip-1 and TRIP-15)." Q#16717 - CGI_10010153 superfamily 241607 23 57 1.50E-05 39.5606 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chyomotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." 
Q#16717 - CGI_10010153 superfamily 241607 141 165 3.75E-05 38.405 cl00097 KAZAL_FS superfamily C - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chyomotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#16717 - CGI_10010153 superfamily 241607 63 97 4.95E-05 38.0198 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chyomotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. 
The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#16717 - CGI_10010153 superfamily 241607 103 137 0.000657265 34.9382 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chyomotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. 
The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#16718 - CGI_10010154 superfamily 241607 56 81 6.13E-08 44.9534 cl00097 KAZAL_FS superfamily C - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chyomotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. 
The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#16718 - CGI_10010154 superfamily 241607 16 50 3.76E-06 39.9458 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chyomotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#16719 - CGI_10010155 superfamily 241607 64 84 0.000275081 34.1819 cl00097 KAZAL_FS superfamily C - "Kazal type serine protease inhibitors and follistatin-like domains. 
Kazal inhibitors inhibit serine proteases, such as, trypsin, chyomotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#16719 - CGI_10010155 superfamily 241607 26 49 0.00855466 29.9306 cl00097 KAZAL_FS superfamily C - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chyomotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. 
These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#16721 - CGI_10010157 superfamily 241572 164 253 3.14E-09 53.7817 cl00050 CYCLIN superfamily - - "Cyclin box fold. Protein binding domain functioning in cell-cycle and transcription control. Present in cyclins, TFIIB and Retinoblastoma (RB).The cyclins consist of 8 classes of cell cycle regulators that regulate cyclin dependent kinases (CDKs). TFIIB is a transcription factor that binds the TATA box. Cyclins, TFIIB and RB contain 2 copies of the domain." Q#16721 - CGI_10010157 superfamily 241572 49 138 4.91E-09 53.3965 cl00050 CYCLIN superfamily - - "Cyclin box fold. Protein binding domain functioning in cell-cycle and transcription control. Present in cyclins, TFIIB and Retinoblastoma (RB).The cyclins consist of 8 classes of cell cycle regulators that regulate cyclin dependent kinases (CDKs). TFIIB is a transcription factor that binds the TATA box. Cyclins, TFIIB and RB contain 2 copies of the domain." 
Q#16722 - CGI_10010158 superfamily 247856 273 325 2.31E-06 44.4609 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#16723 - CGI_10003957 superfamily 248097 55 179 1.77E-22 88.091 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#16724 - CGI_10001543 superfamily 241596 59 120 1.91E-05 41.4307 cl00081 HLH superfamily - - "Helix-loop-helix domain, found in specific DNA- binding proteins that act as transcription factors; 60-100 amino acids long. A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE 5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved in both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." 
Q#16730 - CGI_10009960 superfamily 222429 38 86 0.00943035 32.9829 cl18676 Myb_DNA-bind_5 superfamily C - Myb/SANT-like DNA-binding domain; This presumed domain appears to be related to other Myb/SANT like DNA binding domains. This family is greatly expanded in arthropods and higher eukaryotes. Q#16736 - CGI_10006190 superfamily 245601 53 191 3.63E-18 81.3396 cl11399 HP superfamily C - "Histidine phosphatase domain found in a functionally diverse set of proteins, mostly phosphatases; contains a His residue which is phosphorylated during the reaction; Catalytic domain of a functionally diverse set of proteins, most of which are phosphatases. The conserved catalytic core of this domain contains a His residue which is phosphorylated in the reaction. This set of proteins includes cofactor-dependent and cofactor-independent phosphoglycerate mutases (dPGM, and BPGM respectively), fructose-2,6-bisphosphatase (F26BP)ase, Sts-1, SixA, histidine acid phosphatases, phytases, and related proteins. Functions include roles in metabolism, signaling, or regulation, for example F26BPase affects glycolysis and gluconeogenesis through controlling the concentration of F26BP; BPGM controls the concentration of 2,3-BPG (the main allosteric effector of hemoglobin in human blood cells); human Sts-1 is a T-cell regulator; Escherichia coli Six A participates in the ArcB-dependent His-to-Asp phosphorelay signaling system; phytases scavenge phosphate from extracellular sources. Deficiency and mutation in many of the human members result in disease, for example erythrocyte BPGM deficiency is a disease associated with a decrease in the concentration of 2,3-BPG. Clinical applications include the use of prostatic acid phosphatase (PAP) as a serum marker for prostate cancer. Agricultural applications include the addition of phytases to animal feed." 
Q#16738 - CGI_10006192 superfamily 219740 242 358 1.48E-06 48.9546 cl06992 Peptidase_S64 superfamily N - "Peptidase family S64; This family of fungal proteins is involved in the processing of membrane bound transcription factor Stp1. The processing causes the signalling domain of Stp1 to be passed to the nucleus where several permease genes are induced. The permeases are important for uptake of amino acids, and processing of Stp1 only occurs in an amino acid-rich environment. This family is predicted to be distantly related to the trypsin family (MEROPS:S1) and to have a typical trypsin-like catalytic triad." Q#16740 - CGI_10019942 superfamily 247743 216 383 1.85E-28 110.699 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#16740 - CGI_10019942 superfamily 247743 2 106 8.11E-13 66.0155 cl17189 AAA superfamily N - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases.
The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#16740 - CGI_10019942 superfamily 204202 452 495 0.00044059 38.3905 cl07827 Vps4_C superfamily - - Vps4 C terminal oligomerisation domain; This domain is found at the C terminal of ATPase proteins involved in vacuolar sorting. It forms an alpha helix structure and is required for oligomerisation. Q#16741 - CGI_10019943 superfamily 241902 19 86 3.92E-18 73.2973 cl00493 trimeric_dUTPase superfamily - - "Trimeric dUTP diphosphatases; Trimeric dUTP diphosphatases, or dUTPases, are the most common family of dUTPase, found in bacteria, eukaryotes, and archaea. They catalyze the hydrolysis of the dUTP-Mg complex (dUTP-Mg) into dUMP and pyrophosphate. This reaction is crucial for the preservation of chromosomal integrity as it removes dUTP and therefore reduces the cellular dUTP/dTTP ratio, and prevents dUTP from being incorporated into DNA. It also provides dUMP as the precursor for dTTP synthesis via the thymidylate synthase pathway. dUTPases are homotrimeric, except some monomeric viral dUTPases, which have been shown to mimic a trimer. Active sites are located at the subunit interface." Q#16742 - CGI_10019944 superfamily 243072 15 123 2.88E-27 99.3802 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#16745 - CGI_10019948 superfamily 247724 321 528 1.58E-55 188.898 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#16746 - CGI_10019949 superfamily 241902 48 135 9.63E-31 117.21 cl00493 trimeric_dUTPase superfamily - - "Trimeric dUTP diphosphatases; Trimeric dUTP diphosphatases, or dUTPases, are the most common family of dUTPase, found in bacteria, eukaryotes, and archaea. They catalyze the hydrolysis of the dUTP-Mg complex (dUTP-Mg) into dUMP and pyrophosphate. This reaction is crucial for the preservation of chromosomal integrity as it removes dUTP and therefore reduces the cellular dUTP/dTTP ratio, and prevents dUTP from being incorporated into DNA. It also provides dUMP as the precursor for dTTP synthesis via the thymidylate synthase pathway. 
dUTPases are homotrimeric, except some monomeric viral dUTPases, which have been shown to mimic a trimer. Active sites are located at the subunit interface." Q#16746 - CGI_10019949 superfamily 247724 552 740 4.42E-35 133.044 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#16747 - CGI_10019950 superfamily 243098 244 289 3.71E-06 44.5111 cl02573 TUDOR superfamily - - "Tudor domains are found in many eukaryotic organisms and have been implicated in protein-protein interactions in which methylated protein substrates bind to these domains. For example, the Tudor domain of Survival of Motor Neuron (SMN) binds to symmetrically dimethylated arginines of arginine-glycine (RG) rich sequences found in the C-terminal tails of Sm proteins. 
The SMN protein is linked to spinal muscular atrophy. Another example is the tandem tudor domains of 53BP1, which bind to histone H4 specifically dimethylated at Lys20 (H4-K20me2). 53BP1 is a key transducer of the DNA damage checkpoint signal." Q#16747 - CGI_10019950 superfamily 243107 348 389 9.10E-11 57.9024 cl02611 G-patch superfamily - - "G-patch domain; This domain is found in a number of RNA binding proteins, and is also found in proteins that contain RNA binding domains. This suggests that this domain may have an RNA binding function. This domain has seven highly conserved glycines." Q#16748 - CGI_10019951 superfamily 241564 415 482 5.63E-29 109.277 cl00035 BIR superfamily - - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." Q#16748 - CGI_10019951 superfamily 241564 271 336 5.10E-17 76.1503 cl00035 BIR superfamily - - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." Q#16753 - CGI_10019957 superfamily 243161 5 69 4.31E-09 52.0522 cl02739 THAP superfamily C - "THAP domain; The THAP domain is a putative DNA-binding domain (DBD) and probably also binds a zinc ion. 
It features the conserved C2CH architecture (consensus sequence: Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His). Other universal features include the location of the domain at the N-termini of proteins, its size of about 90 residues, a C-terminal AVPTIF box and several other conserved residues. Orthologues of the human THAP domain have been identified in other vertebrates and probably worms and flies, but not in other eukaryotes or any prokaryotes." Q#16755 - CGI_10019959 superfamily 241563 187 222 0.00883208 35.5328 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#16756 - CGI_10019960 superfamily 241609 2 57 9.46E-17 67.7883 cl00100 KR superfamily N - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#16760 - CGI_10019964 superfamily 247803 26 207 3.43E-53 174.254 cl17249 YlqF_related_GTPase superfamily - - "Circularly permuted YlqF-related GTPases; These proteins are found in bacteria, eukaryotes, and archaea. They all exhibit a circular permutation of the GTPase signature motifs so that the order of the conserved G box motifs is G4-G5-G1-G2-G3, with G4 and G5 being permuted from the C-terminal region of proteins in the Ras superfamily to the N-terminus of YlqF-related GTPases." 
Q#16761 - CGI_10019965 superfamily 247792 18 62 2.92E-09 54.7592 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#16761 - CGI_10019965 superfamily 110440 863 890 1.34E-07 49.3285 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#16761 - CGI_10019965 superfamily 110440 910 937 3.84E-07 48.1729 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats."
Q#16761 - CGI_10019965 superfamily 110440 960 984 4.11E-05 42.3949 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#16761 - CGI_10019965 superfamily 110440 816 843 4.80E-05 42.0097 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#16761 - CGI_10019965 superfamily 110440 768 796 0.000104532 41.2393 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats."
Q#16762 - CGI_10019966 superfamily 247792 93 148 5.00E-06 41.2113 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#16764 - CGI_10019968 superfamily 247792 54 102 8.44E-08 48.9812 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#16764 - CGI_10019968 superfamily 219412 253 329 0.00475218 35.6838 cl06465 DUF1515 superfamily C - Protein of unknown function (DUF1515); This family consists of several hypothetical bacterial proteins of around 130 residues in length. Members of this family seem to be found exclusively in Rhizobium species. The function of this family is unknown. Q#16765 - CGI_10019969 superfamily 242918 59 303 6.57E-95 284.525 cl02172 Per1 superfamily - - Per1-like; PER1 is required for GPI-phospholipase A2 activity and is involved in lipid remodelling of GPI-anchored proteins.
Q#16767 - CGI_10019972 superfamily 241563 61 96 0.00174454 37.4588 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#16769 - CGI_10019974 superfamily 241623 215 314 9.30E-25 102.496 cl00119 PI3Kc_like superfamily C - "Phosphoinositide 3-kinase (PI3K)-like family, catalytic domain; The PI3K-like catalytic domain family is part of a larger superfamily that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and RIO kinases. Members of the family include PI3K, phosphoinositide 4-kinase (PI4K), PI3K-related protein kinases (PIKKs), and TRansformation/tRanscription domain-Associated Protein (TRRAP). PI3Ks catalyze the transfer of the gamma-phosphoryl group from ATP to the 3-hydroxyl of the inositol ring of D-myo-phosphatidylinositol (PtdIns) or its derivatives, while PI4K catalyze the phosphorylation of the 4-hydroxyl of PtdIns. PIKKs are protein kinases that catalyze the phosphorylation of serine/threonine residues, especially those that are followed by a glutamine. PI3Ks play an important role in a variety of fundamental cellular processes, including cell motility, the Ras pathway, vesicle trafficking and secretion, immune cell activation and apoptosis. PI4Ks produce PtdIns(4)P, the major precursor to important signaling phosphoinositides. PIKKs have diverse functions including cell-cycle checkpoints, genome surveillance, mRNA surveillance, and translation control." 
Q#16769 - CGI_10019974 superfamily 245847 318 381 1.98E-05 42.929 cl12042 FA58C superfamily N - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#16769 - CGI_10019974 superfamily 241742 188 211 0.000211959 40.3738 cl00271 PI3Ka superfamily N - "Phosphoinositide 3-kinase family, accessory domain (PIK domain); PIK domain is conserved in PI3 and PI4-kinases. Its role is unclear, but it has been suggested to be involved in substrate presentation. Phosphoinositide 3-kinases play an important role in a variety of fundamental cellular processes and can be divided into three main classes, defined by their substrate specificity and domain architecture." Q#16771 - CGI_10006995 superfamily 243096 193 333 7.87E-25 102.761 cl02571 RhoGEF superfamily - - Guanine nucleotide exchange factor for Rho/Rac/Cdc42-like GTPases; Also called Dbl-homologous (DH) domain. It appears that PH domains invariably occur C-terminal to RhoGEF/DH domains. Q#16771 - CGI_10006995 superfamily 247725 347 438 5.12E-27 106.234 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#16773 - CGI_10006997 superfamily 242422 7 196 4.37E-19 81.4812 cl01306 LCM superfamily - - Leucine carboxyl methyltransferase; Family of leucine carboxyl methyltransferases EC:2.1.1.-. 
This family may need divides a the full alignment contains a significantly shorter mouse sequence. Q#16774 - CGI_10006998 superfamily 246710 339 480 1.14E-31 122.611 cl14783 DOMON_like superfamily - - "Domon-like ligand-binding domains; DOMON-like domains can be found in all three kingdoms of life and are a diverse group of ligand binding domains that have been shown to interact with sugars and hemes. DOMON domains were initially thought to confer protein-protein interactions. They were subsequently found as a heme-binding motif in cellobiose dehydrogenase, an extracellular fungal oxidoreductase that degrades both lignin and cellulose, and in ethylbenzene dehydrogenase, an enzyme that aids in the anaerobic degradation of hydrocarbons. The domain interacts with sugars in the type 9 carbohydrate binding modules (CBM9), which are present in a variety of glycosyl hydrolases, and it can also be found at the N-terminus of sensor histidine kinases." Q#16774 - CGI_10006998 superfamily 246710 528 666 5.23E-31 120.685 cl14783 DOMON_like superfamily - - "Domon-like ligand-binding domains; DOMON-like domains can be found in all three kingdoms of life and are a diverse group of ligand binding domains that have been shown to interact with sugars and hemes. DOMON domains were initially thought to confer protein-protein interactions. They were subsequently found as a heme-binding motif in cellobiose dehydrogenase, an extracellular fungal oxidoreductase that degrades both lignin and cellulose, and in ethylbenzene dehydrogenase, an enzyme that aids in the anaerobic degradation of hydrocarbons. The domain interacts with sugars in the type 9 carbohydrate binding modules (CBM9), which are present in a variety of glycosyl hydrolases, and it can also be found at the N-terminus of sensor histidine kinases."
Q#16774 - CGI_10006998 superfamily 246710 859 988 2.54E-30 118.374 cl14783 DOMON_like superfamily - - "Domon-like ligand-binding domains; DOMON-like domains can be found in all three kingdoms of life and are a diverse group of ligand binding domains that have been shown to interact with sugars and hemes. DOMON domains were initially thought to confer protein-protein interactions. They were subsequently found as a heme-binding motif in cellobiose dehydrogenase, an extracellular fungal oxidoreductase that degrades both lignin and cellulose, and in ethylbenzene dehydrogenase, an enzyme that aids in the anaerobic degradation of hydrocarbons. The domain interacts with sugars in the type 9 carbohydrate binding modules (CBM9), which are present in a variety of glycosyl hydrolases, and it can also be found at the N-terminus of sensor histidine kinases." Q#16774 - CGI_10006998 superfamily 246710 686 795 7.40E-19 85.2467 cl14783 DOMON_like superfamily - - "Domon-like ligand-binding domains; DOMON-like domains can be found in all three kingdoms of life and are a diverse group of ligand binding domains that have been shown to interact with sugars and hemes. DOMON domains were initially thought to confer protein-protein interactions. They were subsequently found as a heme-binding motif in cellobiose dehydrogenase, an extracellular fungal oxidoreductase that degrades both lignin and cellulose, and in ethylbenzene dehydrogenase, an enzyme that aids in the anaerobic degradation of hydrocarbons. The domain interacts with sugars in the type 9 carbohydrate binding modules (CBM9), which are present in a variety of glycosyl hydrolases, and it can also be found at the N-terminus of sensor histidine kinases." Q#16774 - CGI_10006998 superfamily 245213 198 236 0.00064171 39.157 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins.
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#16774 - CGI_10006998 superfamily 243060 1020 1117 1.10E-12 66.2484 cl02507 SEA superfamily - - "SEA domain; Domain found in Sea urchin sperm protein, Enterokinase, Agrin (SEA). Proposed function of regulating or binding carbohydrate side chains. Recently a proteolytic activity has been shown for a SEA domain." Q#16774 - CGI_10006998 superfamily 243555 66 162 0.000364608 41.6078 cl03871 Chitin_bind_3 superfamily N - "Chitin binding domain; This domain is found associated with a wide variety of cellulose binding domain. This domain however is a chitin binding domain. This domain is found in isolation in baculoviral spheroidins and spindolins, protein of unknown function." Q#16775 - CGI_10006999 superfamily 241645 189 240 0.00583079 33.7798 cl00155 UBQ superfamily C - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. 
In poly-ubiquitination, ubiquitin itself is the substrate." Q#16776 - CGI_10007000 superfamily 247941 10 144 2.26E-15 68.9016 cl17387 Methyltransf_21 superfamily - - "Methyltransferase FkbM domain; This family has members from bacteria to human, and appears to be a methyltransferase." Q#16777 - CGI_10007002 superfamily 241610 94 146 4.62E-18 73.8234 cl00101 KU superfamily - - BPTI/Kunitz family of serine protease inhibitors; Structure is a disulfide rich alpha+beta fold. BPTI (bovine pancreatic trypsin inhibitor) is an extensively studied model structure. Q#16777 - CGI_10007002 superfamily 241646 34 81 0.000269813 35.8894 cl00156 WAP superfamily - - "whey acidic protein-type four-disulfide core domains. Members of the family include whey acidic protein, elafin (elastase-specific inhibitor), caltrin-like protein (a calcium transport inhibitor) and other extracellular proteinase inhibitors. A group of proteins containing 8 characteristically-spaced cysteine residues forming disulphide bonds, have been termed '4-disulphide core' proteins. Protease inhibition occurs by insertion of the inhibitory loop into the active site pocket and interference with the catalytic residues of the protease." Q#16778 - CGI_10009504 superfamily 214531 141 184 6.34E-08 47.5965 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#16778 - CGI_10009504 superfamily 214531 186 227 5.93E-06 42.2037 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin.
Q#16779 - CGI_10009505 superfamily 215821 27 68 3.16E-08 47.2351 cl18346 FKBP_C superfamily N - FKBP-type peptidyl-prolyl cis-trans isomerase; FKBP-type peptidyl-prolyl cis-trans isomerase. Q#16779 - CGI_10009505 superfamily 247856 75 143 3.57E-05 38.2977 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#16780 - CGI_10009506 superfamily 215821 6 70 0.000761395 36.0643 cl18346 FKBP_C superfamily N - FKBP-type peptidyl-prolyl cis-trans isomerase; FKBP-type peptidyl-prolyl cis-trans isomerase. Q#16781 - CGI_10009507 superfamily 215821 38 135 2.16E-20 82.6734 cl18346 FKBP_C superfamily - - FKBP-type peptidyl-prolyl cis-trans isomerase; FKBP-type peptidyl-prolyl cis-trans isomerase. Q#16781 - CGI_10009507 superfamily 247856 142 212 8.06E-05 38.6829 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#16782 - CGI_10009508 superfamily 241573 52 373 6.81E-115 348.938 cl00051 CysPc superfamily - - "Calpains, domains IIa, IIb; calcium-dependent cytoplasmic cysteine proteinases, papain-like. Functions in cytoskeletal remodeling processes, cell differentiation, apoptosis and signal transduction." Q#16782 - CGI_10009508 superfamily 246669 504 631 4.10E-30 115.453 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. 
C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#16782 - CGI_10009508 superfamily 241653 385 475 4.83E-19 84.2944 cl00165 Calpain_III superfamily - - "Calpain, subdomain III. Calpains are calcium-activated cytoplasmic cysteine proteinases, participate in cytoskeletal remodeling processes, cell differentiation, apoptosis and signal transduction. Catalytic domain and the two calmodulin-like domains are separated by C2-like domain III. Domain III plays an important role in calcium-induced activation of calpain involving electrostatic interactions with subdomain II. Proposed to mediate calpain's interaction with phospholipids and translocation to cytoplasmic/nuclear membranes. CD includes subdomain III of typical and atypical calpains." Q#16783 - CGI_10009509 superfamily 241639 709 1170 3.48E-146 458.955 cl00148 TOP4c superfamily - - "DNA Topoisomerase, subtype IIA; domain A'; bacterial DNA topoisomerase IV (C subunit, ParC), bacterial DNA gyrases (A subunit, GyrA), mammalian DNA topoisomerases II.
DNA topoisomerases are essential enzymes that regulate the conformational changes in DNA topology by catalysing the concerted breakage and rejoining of DNA strands during normal cellular growth." Q#16783 - CGI_10009509 superfamily 243181 263 415 1.43E-84 274.934 cl02783 TopoII_MutL_Trans superfamily - - "MutL_Trans: transducer domain, having a ribosomal S5 domain 2-like fold, conserved in the C-terminal domain of type II DNA topoisomerases (Topo II) and DNA mismatch repair (MutL/MLH1/PMS2) proteins. This transducer domain is homologous to the second domain of the DNA gyrase B subunit, which is known to be important in nucleotide hydrolysis and the transduction of structural signals from ATP-binding site to the DNA breakage/reunion regions of the enzymes. The GyrB dimerizes in response to ATP binding, and is homologous to the N-terminal half of eukaryotic Topo II and the ATPase fragment of MutL. Type II DNA topoisomerases catalyze the ATP-dependent transport of one DNA duplex through another, in the process generating transient double strand breaks via covalent attachments to both DNA strands at the 5' positions. Included in this group are proteins similar to human MLH1 and PMS2. MLH1 forms a heterodimer with PMS2 which functions in meiosis and in DNA mismatch repair (MMR). Cells lacking either hMLH1 or hPMS2 have a strong mutator phenotype and display microsatellite instability (MSI). Mutation in hMLH1 accounts for a large fraction of Lynch syndrome (HNPCC) families." Q#16783 - CGI_10009509 superfamily 242046 453 573 2.06E-70 233.346 cl00718 TOPRIM superfamily - - "Topoisomerase-primase domain. This is a nucleotidyl transferase/hydrolase domain found in type IA, type IIA and type IIB topoisomerases, bacterial DnaG-type primases, small primase-like proteins from bacteria and archaea, OLD family nucleases from bacteria and archaea, and bacterial DNA repair proteins of the RecR/M family.
This domain has two conserved motifs, one of which centers at a conserved glutamate and the other one at two conserved aspartates (DxD). This glutamate and two aspartates cluster together to form a highly acidic surface patch. The conserved glutamate may act as a general base in nucleotide polymerization by primases and in strand joining in topoisomerases, and as a general acid in strand cleavage by topoisomerases and nucleases. The DXD motif may co-ordinate Mg2+, a cofactor required for full catalytic function." Q#16783 - CGI_10009509 superfamily 241593 79 172 9.15E-06 45.7154 cl00075 HATPase_c superfamily C - "Histidine kinase-like ATPases; This family includes several ATP-binding proteins for example: histidine kinase, DNA gyrase B, topoisomerases, heat shock protein HSP90, phytochrome-like ATPases and DNA mismatch repair proteins" Q#16784 - CGI_10009510 superfamily 241644 7 159 9.23E-61 187.024 cl00154 UBCc superfamily - - "Ubiquitin-conjugating enzyme E2, catalytic (UBCc) domain. This is part of the ubiquitin-mediated protein degradation pathway in which a thiol-ester linkage forms between a conserved cysteine and the C-terminus of ubiquitin and complexes with ubiquitin protein ligase enzymes, E3. This pathway regulates many fundamental cellular processes. There are also other E2s which form thiol-ester linkages without the use of E3s as well as several UBC homologs (TSG101, Mms2, Croc-1 and similar proteins) which lack the active site cysteine essential for ubiquitination and appear to function in DNA repair pathways which were omitted from the scope of this CD." Q#16785 - CGI_10009511 superfamily 245201 148 382 1.48E-67 227.034 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases.
It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#16786 - CGI_10009512 superfamily 241993 2 105 1.40E-23 91.6063 cl00630 YdcF-like superfamily - - "YdcF-like. YdcF-like is a large family of mainly bacterial proteins, with a few members found in fungi, plants, and archaea. Escherichia coli YdcF has been shown to bind S-adenosyl-L-methionine (AdoMet), but a biochemical function has not been identified. The family also includes Escherichia coli sanA and Salmonella typhimurium sfiX, which are involved in vancomycin resistance; sfiX may also be involved in murein synthesis." Q#16788 - CGI_10009514 superfamily 222366 833 1112 6.07E-120 395.93 cl16381 E3_UbLigase_R4 superfamily C - E3 ubiquitin-protein ligase UBR4; This is a family of E3 ubiquitin ligase enzymes. Q#16788 - CGI_10009514 superfamily 217293 32 219 2.03E-30 121.586 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#16788 - CGI_10009514 superfamily 191182 539 611 3.83E-28 111.624 cl04917 Nsp1_C superfamily C - Nsp1-like C-terminal region; This family probably forms a coiled-coil. This important region of Nsp1 is involved in binding Nup82. Q#16788 - CGI_10009514 superfamily 222274 295 375 2.97E-13 68.2815 cl18658 Nucleoporin_FG superfamily - - "Nucleoporin FG repeat region; This family includes a number of FG repeats that are found in nucleoporin proteins. This family includes the yeast nucleoporins Nup116, Nup100, Nup49, Nup57 and Nup145."
Q#16788 - CGI_10009514 superfamily 222003 1285 1381 5.24E-05 43.0642 cl17871 Hydrolase_like superfamily - - HAD-hydrolase-like; HAD-hydrolase-like. Q#16788 - CGI_10009514 superfamily 202474 226 270 0.00150699 40.3297 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#16788 - CGI_10009514 superfamily 248469 1230 1332 0.0034555 38.1199 cl17915 HAD_like superfamily - - "Haloacid dehalogenase-like hydrolases. The haloacid dehalogenase-like (HAD) superfamily includes L-2-haloacid dehalogenase, epoxide hydrolase, phosphoserine phosphatase, phosphomannomutase, phosphoglycolate phosphatase, P-type ATPase, and many others, all of which use a nucleophilic aspartate in their phosphoryl transfer reaction. All members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. Members of this superfamily are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases." Q#16789 - CGI_10009515 superfamily 245596 598 798 1.03E-78 261.859 cl11394 Glyco_tranf_GTA_type superfamily - - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold.
This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#16789 - CGI_10009515 superfamily 245596 476 531 1.07E-08 56.5472 cl11394 Glyco_tranf_GTA_type superfamily C - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." 
Q#16790 - CGI_10009516 superfamily 245596 112 329 5.11E-75 245.295 cl11394 Glyco_tranf_GTA_type superfamily - - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#16790 - CGI_10009516 superfamily 245596 26 82 1.33E-09 58.088 cl11394 Glyco_tranf_GTA_type superfamily C - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. 
This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#16791 - CGI_10009517 superfamily 245596 385 613 4.47E-79 255.695 cl11394 Glyco_tranf_GTA_type superfamily - - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." 
Q#16791 - CGI_10009517 superfamily 245596 299 358 5.82E-10 58.8584 cl11394 Glyco_tranf_GTA_type superfamily C - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#16792 - CGI_10009518 superfamily 241832 430 532 5.93E-42 148.594 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). 
Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#16792 - CGI_10009518 superfamily 243077 19 73 1.01E-17 78.7413 cl02542 DnaJ superfamily - - "DnaJ domain or J-domain. DnaJ/Hsp40 (heat shock protein 40) proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperonine family. Hsp40 proteins are characterized by the presence of a J domain, which mediates the interaction with Hsp70. They may contain other domains as well, and the architectures provide a means of classification." Q#16792 - CGI_10009518 superfamily 241832 363 424 0.000131871 41.057 cl00388 Thioredoxin_like superfamily N - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." 
Q#16792 - CGI_10009518 superfamily 241832 115 213 7.71E-45 156.531 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#16792 - CGI_10009518 superfamily 241832 538 647 3.46E-38 138.194 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). 
Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#16792 - CGI_10009518 superfamily 241832 656 735 2.78E-24 98.9033 cl00388 Thioredoxin_like superfamily C - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#16793 - CGI_10009519 superfamily 193687 2 146 1.62E-74 222.86 cl00160 LbetaH superfamily - - "Left-handed parallel beta-Helix (LbetaH or LbH) domain: The alignment contains 5 turns, each containing three imperfect tandem repeats of a hexapeptide repeat motif (X-[STAV]-X-[LIV]-[GAED]-X). Proteins containing hexapeptide repeats are often enzymes showing acyltransferase activity, however, some subfamilies in this hierarchy also show activities related to ion transport or translation initiation. Many are trimeric in their active forms." 
Q#16794 - CGI_10007189 superfamily 241749 29 169 5.99E-26 97.8417 cl00280 globin_like superfamily - - superfamily containing globins and truncated hemoglobins Q#16795 - CGI_10007190 superfamily 241571 311 437 2.52E-07 49.3331 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#16795 - CGI_10007190 superfamily 245213 442 475 0.000127178 40.3126 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#16795 - CGI_10007190 superfamily 241583 87 274 1.79E-38 140.014 cl00064 ZnMc superfamily - - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#16796 - CGI_10007191 superfamily 245213 655 683 0.00096144 38.3866 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. 
Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#16796 - CGI_10007191 superfamily 245213 454 484 0.00698343 35.6902 cl09941 EGF_CA superfamily N - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#16796 - CGI_10007191 superfamily 241583 96 283 8.74E-40 146.562 cl00064 ZnMc superfamily - - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#16796 - CGI_10007191 superfamily 246918 693 745 6.62E-18 79.9383 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#16796 - CGI_10007191 superfamily 246918 750 802 5.45E-17 77.2419 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#16796 - CGI_10007191 superfamily 246918 865 916 5.78E-17 77.2419 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#16796 - CGI_10007191 superfamily 246918 808 859 2.35E-16 75.3159 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. 
Q#16796 - CGI_10007191 superfamily 243051 937 1074 1.64E-13 69.7141 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal cord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich regions."
Q#16797 - CGI_10007192 superfamily 241613 153 184 7.29E-05 39.8826 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#16797 - CGI_10007192 superfamily 241613 202 242 0.000100043 39.4974 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#16797 - CGI_10007192 superfamily 241571 33 127 0.00415619 35.4659 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." 
Q#16798 - CGI_10007193 superfamily 202203 42 107 9.06E-34 120.751 cl03534 E2F_TDP superfamily - - "E2F/DP family winged-helix DNA-binding domain; This family contains the transcription factor E2F and its dimerisation partners TDP1 and TDP2, which stimulate E2F-dependent transcription. E2F binds to DNA as a homodimer or as a heterodimer in association with TDP1/2, the heterodimer having increased binding efficiency. The crystal structure of an E2F4-DP2-DNA complex shows that the DNA-binding domains of the E2F and DP proteins both have a fold related to the winged-helix DNA-binding motif. Recognition of the central c/gGCGCg/c sequence of the consensus DNA-binding site is symmetric, and amino acids that contact these bases are conserved among all known E2F and DP proteins." Q#16801 - CGI_10007196 superfamily 207794 218 572 9.05E-145 429.325 cl02948 GH20_hexosaminidase superfamily - - "Beta-N-acetylhexosaminidases of glycosyl hydrolase family 20 (GH20) catalyze the removal of beta-1,4-linked N-acetyl-D-hexosamine residues from the non-reducing ends of N-acetyl-beta-D-hexosaminides including N-acetylglucosides and N-acetylgalactosides. These enzymes are broadly distributed in microorganisms, plants and animals, and play roles in various key physiological and pathological processes. These processes include cell structural integrity, energy storage, cellular signaling, fertilization, pathogen defense, viral penetration, the development of carcinomas, inflammatory events and lysosomal storage disorders. The GH20 enzymes include the eukaryotic beta-N-acetylhexosaminidases A and B, the bacterial chitobiases, dispersin B, and lacto-N-biosidase. The GH20 hexosaminidases are thought to act via a catalytic mechanism in which the catalytic nucleophile is not provided by the solvent or the enzyme, but by the substrate itself." 
Q#16801 - CGI_10007196 superfamily 111707 146 216 3.94E-15 73.218 cl03741 Glyco_hydro_20b superfamily N - "Glycosyl hydrolase family 20, domain 2; This domain has a zincin-like fold." Q#16802 - CGI_10007197 superfamily 217904 3 503 7.28E-141 422.204 cl04404 Gpi16 superfamily - - "Gpi16 subunit, GPI transamidase component; GPI (glycosyl phosphatidyl inositol) transamidase is a multi-protein complex. Gpi16, Gpi8 and Gaa1 form a sub-complex of the GPI transamidase. GPI transamidase adds glycosylphosphatidylinositols (GPIs) to newly synthesised proteins. Gpi16 is an essential N-glycosylated transmembrane glycoprotein. Gpi16 is largely found on the lumenal side of the ER. It has a single C-terminal transmembrane domain and a small C-terminal, cytosolic extension with an ER retrieval motif." Q#16803 - CGI_10007198 superfamily 202224 1434 1549 1.21E-24 101.989 cl18224 JmjC superfamily - - "JmjC domain, hydroxylase; The JmjC domain belongs to the Cupin superfamily. JmjC-domain proteins may be protein hydroxylases that catalyze a novel histone modification. This is confirmed to be a hydroxylase: the human JmjC protein named Tyw5p unexpectedly acts in the biosynthesis of a hypermodified nucleoside, hydroxy-wybutosine, in tRNA-Phe by catalyzing hydroxylation." Q#16803 - CGI_10007198 superfamily 210240 1176 1216 1.11E-16 76.5326 cl15840 JmjN superfamily - - jmjN domain; jmjN domain. Q#16803 - CGI_10007198 superfamily 243120 1242 1330 6.47E-14 69.9153 cl02633 ARID superfamily - - "ARID/BRIGHT DNA binding domain; This domain is known as ARID for AT-Rich Interaction Domain, and also known as the BRIGHT domain."
Q#16805 - CGI_10006142 superfamily 247769 148 201 7.30E-05 41.1709 cl17215 HDc superfamily C - Metal dependent phosphohydrolases with conserved 'HD' motif Q#16808 - CGI_10006145 superfamily 247856 228 279 0.00183862 35.6013 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#16809 - CGI_10006146 superfamily 247742 128 173 0.00924379 34.5097 cl17188 enolase_like superfamily NC - "Enolase-superfamily, characterized by the presence of an enolate anion intermediate which is generated by abstraction of the alpha-proton of the carboxylate substrate by an active site residue and is stabilized by coordination to the essential Mg2+ ion. Enolase superfamily contains different enzymes, like enolases, glutarate-, fucanate- and galactonate dehydratases, o-succinylbenzoate synthase, N-acylamino acid racemase, L-alanine-DL-glutamate epimerase, mandelate racemase, muconate lactonizing enzyme and 3-methylaspartase." Q#16810 - CGI_10012904 superfamily 245226 271 432 1.56E-20 87.7412 cl10012 DnaQ_like_exo superfamily - - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). 
The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#16812 - CGI_10012906 superfamily 241563 156 190 0.00135703 37.0736 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#16814 - CGI_10012908 superfamily 248275 1 106 1.35E-26 97.4578 cl17721 zf-C2H2_jaz superfamily - - "Zinc-finger double-stranded RNA-binding; This domain family is found in archaea and eukaryotes, and is approximately 30 amino acids in length. The mammalian members of this group occur multiple times along the protein, joined by flexible linkers, and are referred to as JAZ - dsRNA-binding ZF protein - zinc-fingers. The JAZ proteins are expressed in all tissues tested and localise in the nucleus, particularly the nucleolus. JAZ preferentially binds to double-stranded (ds) RNA or RNA/DNA hybrids rather than DNA. In addition to binding double-stranded RNA, these zinc-fingers are required for nucleolar localisation." 
Q#16816 - CGI_10012910 superfamily 241613 522 556 1.19E-08 52.5942 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#16816 - CGI_10012910 superfamily 241613 559 593 2.38E-08 51.4386 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#16816 - CGI_10012910 superfamily 241613 487 515 3.94E-07 47.9718 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports 
it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#16816 - CGI_10012910 superfamily 243075 107 186 1.88E-32 122.041 cl02536 SAND superfamily - - "SAND domain; The DNA binding activity of two proteins has been mapped to the SAND domain. The conserved KDWK motif is necessary for DNA binding, and it appears to be important for dimerisation. This region is also found in the putative transcription factor RegA from the multicellular green alga Volvox cateri. This region of RegA is known as the VARL domain." Q#16818 - CGI_10012913 superfamily 222150 322 347 0.000723573 36.9861 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#16818 - CGI_10012913 superfamily 222150 209 234 0.00849382 33.9045 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#16819 - CGI_10012914 superfamily 242422 46 141 2.04E-19 86.874 cl01306 LCM superfamily C - Leucine carboxyl methyltransferase; Family of leucine carboxyl methyltransferases EC:2.1.1.-. This family may need divides a the full alignment contains a significantly shorter mouse sequence. Q#16819 - CGI_10012914 superfamily 243146 568 614 2.53E-05 42.6615 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. 
" Q#16820 - CGI_10012915 superfamily 217293 2 129 3.18E-25 100.785 cl03788 Neur_chan_LBD superfamily N - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#16820 - CGI_10012915 superfamily 202474 136 224 1.61E-24 98.88 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#16820 - CGI_10012915 superfamily 202474 264 287 5.70E-05 42.2557 cl08379 Neur_chan_memb superfamily N - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#16823 - CGI_10012918 superfamily 243110 229 283 0.000519125 39.3349 cl02616 MACPF superfamily C - "MAC/Perforin domain; The membrane-attack complex (MAC) of the complement system forms transmembrane channels. These channels disrupt the phospholipid bilayer of target cells, leading to cell lysis and death. A number of proteins participate in the assembly of the MAC. Freshly activated C5b binds to C6 to form a C5b-6 complex, then to C7 forming the C5b-7 complex. The C5b-7 complex binds to C8, which is composed of three chains (alpha, beta, and gamma), thus forming the C5b-8 complex. C5b-8 subsequently binds to C9 and acts as a catalyst in the polymerisation of C9. Active MAC has a subunit composition of C5b-C6-C7-C8-C9{n}. Perforin is a protein found in cytolytic T-cell and killer cells. In the presence of calcium, perforin polymerises into transmembrane tubules and is capable of lysing, non-specifically, a variety of target cells. There are a number of regions of similarity in the sequences of complement components C6, C7, C8-alpha, C8-beta, C9 and perforin. 
The X-ray crystal structure of a MACPF domain reveals that it shares a common fold with bacterial cholesterol dependent cytolysins (pfam01289) such as perfringolysin O. Three key pieces of evidence suggests that MACPF domains and CDCs are homologous: Functional similarity (pore formation), conservation of three glycine residues at a hinge in both families and conservation of a complex core fold." Q#16824 - CGI_10012919 superfamily 243190 23 96 4.11E-28 98.5342 cl02794 Cyt_c_Oxidase_VIb superfamily - - "Cytochrome c oxidase subunit VIb. Cytochrome c oxidase (CcO), the terminal oxidase in the respiratory chains of eukaryotes and most bacteria, is a multi-chain transmembrane protein located in the inner membrane of mitochondria and the cell membrane of prokaryotes. It catalyzes the reduction of O2 and simultaneously pumps protons across the membrane. The number of subunits varies from three to five in bacteria and up to 13 in mammalian mitochondria. Subunits I, II, and III of mammalian CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. Found only in eukaryotes, subunit VIb is one of three mammalian subunits that lacks a transmembrane region. It is located on the cytosolic side of the membrane and helps form the dimer interface with the corresponding subunit on the other monomer complex." Q#16825 - CGI_10012920 superfamily 247038 539 619 2.73E-06 45.9152 cl15674 IPT superfamily - - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. 
In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." Q#16825 - CGI_10012920 superfamily 241750 218 515 7.05E-69 229.852 cl00281 metallo-dependent_hydrolases superfamily - - "Superfamily of metallo-dependent hydrolases (also called amidohydrolase superfamily) is a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. The family includes urease alpha, adenosine deaminase, phosphotriesterase dihydroorotases, allantoinases, hydantoinases, AMP-, adenine and cytosine deaminases, imidazolonepropionase, aryldialkylphosphatase, chlorohydrolases, formylmethanofuran dehydrogenases and others." Q#16825 - CGI_10012920 superfamily 241750 78 131 0.000161291 43.0299 cl00281 metallo-dependent_hydrolases superfamily C - "Superfamily of metallo-dependent hydrolases (also called amidohydrolase superfamily) is a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. The family includes urease alpha, adenosine deaminase, phosphotriesterase dihydroorotases, allantoinases, hydantoinases, AMP-, adenine and cytosine deaminases, imidazolonepropionase, aryldialkylphosphatase, chlorohydrolases, formylmethanofuran dehydrogenases and others." 
Q#16829 - CGI_10009260 superfamily 243072 103 224 1.80E-34 127.115 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#16829 - CGI_10009260 superfamily 216554 331 443 4.66E-28 110.647 cl15977 zf-DHHC superfamily C - DHHC palmitoyltransferase; This family includes the well known DHHC zinc binding domain as well as three of the four conserved transmembrane regions found in this family of palmitoyltransferase enzymes. Q#16829 - CGI_10009260 superfamily 216554 491 574 0.00719313 36.6886 cl15977 zf-DHHC superfamily N - DHHC palmitoyltransferase; This family includes the well known DHHC zinc binding domain as well as three of the four conserved transmembrane regions found in this family of palmitoyltransferase enzymes. Q#16829 - CGI_10009260 superfamily 243072 40 91 0.00869161 35.4371 cl02529 ANK superfamily N - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#16830 - CGI_10009261 superfamily 216082 9 439 2.32E-74 254.294 cl08284 Glyco_hydro_15 superfamily C - Glycosyl hydrolases family 15; In higher organisms this family is represented by phosphorylase kinase subunits. 
Q#16830 - CGI_10009261 superfamily 216082 911 1024 1.61E-11 66.7023 cl08284 Glyco_hydro_15 superfamily N - Glycosyl hydrolases family 15; In higher organisms this family is represented by phosphorylase kinase subunits. Q#16831 - CGI_10009262 superfamily 202478 19 99 2.78E-05 42.7879 cl03789 UcrQ superfamily - - UcrQ family; The ubiquinol-cytochrome C reductase complex (cytochrome bc1 complex) is a respiratory multienzyme complex. This family represents the 9.5 kDa subunit of the complex. Q#16833 - CGI_10009264 superfamily 241564 195 259 1.33E-09 53.8087 cl00035 BIR superfamily - - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." Q#16838 - CGI_10009269 superfamily 248097 1 109 1.10E-16 70.757 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#16839 - CGI_10009270 superfamily 248097 92 219 1.87E-17 75.3794 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#16840 - CGI_10009271 superfamily 248097 104 182 1.94E-12 62.6678 cl17543 C1q superfamily C - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#16840 - CGI_10009271 superfamily 248097 238 304 1.42E-07 48.8006 cl17543 C1q superfamily C - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. 
Q#16841 - CGI_10009272 superfamily 248097 92 220 6.66E-19 79.2314 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#16842 - CGI_10009273 superfamily 248097 171 298 2.93E-18 78.461 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#16843 - CGI_10009274 superfamily 248097 91 218 2.71E-21 85.7798 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#16844 - CGI_10009275 superfamily 248097 88 216 1.31E-17 75.7646 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#16846 - CGI_10009277 superfamily 247794 16 314 1.72E-143 410.247 cl17240 FDH_GDH_like superfamily - - "Formate/glycerate dehydrogenases, D-specific 2-hydroxy acid dehydrogenases and related dehydrogenases; The formate/glycerate dehydrogenase like family contains a diverse group of enzymes such as formate dehydrogenase (FDH), glycerate dehydrogenase (GDH), D-lactate dehydrogenase, L-alanine dehydrogenase, and S-Adenosylhomocysteine hydrolase, that share a common 2-domain structure. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar domains of the alpha/beta Rossmann fold NAD+ binding form. The NAD(P) binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD(P) is bound, primarily to the C-terminal portion of the 2nd (internal) domain. While many members of this family are dimeric, alanine DH is hexameric and phosphoglycerate DH is tetrameric. 2-hydroxyacid dehydrogenases are enzymes that catalyze the conversion of a wide variety of D-2-hydroxy acids to their corresponding keto acids. 
The general mechanism is (R)-lactate + acceptor to pyruvate + reduced acceptor. Formate dehydrogenase (FDH) catalyzes the NAD+-dependent oxidation of formate ion to carbon dioxide with the concomitant reduction of NAD+ to NADH. FDHs of this family contain no metal ions or prosthetic groups. Catalysis occurs through direct transfer of a hydride ion to NAD+ without the stages of acid-base catalysis typically found in related dehydrogenases." Q#16847 - CGI_10009278 superfamily 222430 344 484 1.50E-46 160.489 cl16445 Nup54 superfamily - - "Nucleoporin complex subunit 54; This is the human Nup54 subunit of the nucleoporin complex, equivalent to Nup57 of yeast. Nup54, Nup58 and Nup62 all have similar affinities for importin-beta. It seems likely that they are the only FG-repeat nucleoporins of the central channel, and as such they would form a zone of equal affinity spanning the central channel. The diffusion of importin-beta import complexes through the central channel may be a stochastic process as the affinities are similar, whereas movement from cytoplasmic fibrils to the central channel and from the central channel to the nuclear basket would be facilitated by the subtle differences in affinity between them." Q#16848 - CGI_10009279 superfamily 248097 6 137 1.74E-16 70.757 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#16849 - CGI_10013853 superfamily 245213 273 308 1.98E-09 54.1798 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. 
Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#16849 - CGI_10013853 superfamily 245213 347 382 5.74E-09 53.0242 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#16849 - CGI_10013853 superfamily 245213 310 345 4.04E-08 50.713 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#16849 - CGI_10013853 superfamily 245213 237 271 6.59E-08 49.9426 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. 
Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#16849 - CGI_10013853 superfamily 243124 86 232 7.35E-37 135.632 cl02648 NIDO superfamily - - Nidogen-like; This is a nidogen-like domain (NIDO) domain and is an extracellular domain found in nidogen and hypothetical proteins of unknown function. Q#16849 - CGI_10013853 superfamily 243124 605 716 4.58E-20 87.8676 cl02648 NIDO superfamily - - Nidogen-like; This is a nidogen-like domain (NIDO) domain and is an extracellular domain found in nidogen and hypothetical proteins of unknown function. Q#16849 - CGI_10013853 superfamily 243124 429 487 9.56E-09 53.9701 cl02648 NIDO superfamily C - Nidogen-like; This is a nidogen-like domain (NIDO) domain and is an extracellular domain found in nidogen and hypothetical proteins of unknown function. Q#16872 - CGI_10007783 superfamily 243061 1 102 1.97E-37 123.606 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#16873 - CGI_10007784 superfamily 248458 113 248 2.90E-15 75.8133 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. 
MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#16873 - CGI_10007784 superfamily 248458 314 488 2.73E-10 60.7905 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. 
Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#16874 - CGI_10007785 superfamily 244882 73 507 1.14E-94 296.095 cl08270 Peptidase_S10 superfamily - - Serine carboxypeptidase; Serine carboxypeptidase. Q#16875 - CGI_10007786 superfamily 202715 109 206 5.61E-41 136.554 cl04194 Tctex-1 superfamily - - Tctex-1 family; Tctex-1 is a dynein light chain. It has been shown that Tctex-1 can bind to the cytoplasmic tail of rhodopsin. C-terminal rhodopsin mutations responsible for retinitis pigmentosa inhibit this interaction. Q#16876 - CGI_10007787 superfamily 241832 1 61 2.02E-35 118.385 cl00388 Thioredoxin_like superfamily C - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). 
Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#16877 - CGI_10006729 superfamily 243082 109 170 9.07E-11 59.419 cl02553 Peptidase_C19 superfamily N - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." Q#16881 - CGI_10006733 superfamily 248013 303 342 0.00383001 34.882 cl17459 CHROMO superfamily - - "Chromatin organization modifier (chromo) domain is a conserved region of around 50 amino acids found in a variety of chromosomal proteins, which appear to play a role in the functional organization of the eukaryotic nucleus. Experimental evidence implicates the chromo domain in the binding activity of these proteins to methylated histone tails and maybe RNA. May occur as single instance, in a tandem arrangement or followed by a related "chromo shadow" domain." Q#16888 - CGI_10015280 superfamily 248293 3 87 0.000837057 34.6419 cl17739 MADF_DNA_bdg superfamily - - Alcohol dehydrogenase transcription factor Myb/SANT-like; The myb/SANT-like domain in Adf-1 (MADF) is an approximately 80-amino-acid module that directs sequence specific DNA binding to a site consisting of multiple tri-nucleotide repeats. 
The MADF domain is found in one or more copies in eukaryotic and viral proteins and is often associated with the BESS domain. It is likely that the MADF domain is more closely related to the myb/SANT domain than it is to other HTH domains. Q#16890 - CGI_10015282 superfamily 245835 452 657 6.81E-65 217.183 cl12013 BAR superfamily - - "The Bin/Amphiphysin/Rvs (BAR) domain, a dimerization module that binds membranes and detects membrane curvature; BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions including organelle biogenesis, membrane trafficking or remodeling, and cell division and migration. Mutations in BAR containing proteins have been linked to diseases and their inactivation in cells leads to altered membrane dynamics. A BAR domain with an additional N-terminal amphipathic helix (an N-BAR) can drive membrane curvature. These N-BAR domains are found in amphiphysins and endophilins, among others. BAR domains are also frequently found alongside domains that determine lipid specificity, such as the Pleckstrin Homology (PH) and Phox Homology (PX) domains which are present in beta centaurins (ACAPs and ASAPs) and sorting nexins, respectively. A FES-CIP4 Homology (FCH) domain together with a coiled coil region is called the F-BAR domain and is present in Pombe/Cdc15 homology (PCH) family proteins, which include Fes/Fes tyrosine kinases, PACSIN or syndapin, CIP4-like proteins, and srGAPs, among others. The Inverse (I)-BAR or IRSp53/MIM homology Domain (IMD) is found in multi-domain proteins, such as IRSp53 and MIM, that act as scaffolding proteins and transducers of a variety of signaling pathways that link membrane dynamics and the underlying actin cytoskeleton. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. 
The I-BAR domain induces membrane protrusions in the opposite direction compared to classical BAR and F-BAR domains, which produce membrane invaginations. BAR domains that also serve as protein interaction domains include those of arfaptin and OPHN1-like proteins, among others, which bind to Rac and Rho GAP domains, respectively." Q#16890 - CGI_10015282 superfamily 248264 703 793 9.77E-16 75.7365 cl17710 DDE_4 superfamily N - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#16890 - CGI_10015282 superfamily 243066 20 57 0.000286475 40.2933 cl02518 BTB superfamily C - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." 
Q#16891 - CGI_10015283 superfamily 245823 66 584 0 611.593 cl11976 SNF superfamily - - Sodium:neurotransmitter symporter family; Sodium:neurotransmitter symporter family. Q#16893 - CGI_10015285 superfamily 246748 630 659 0.00814833 37.5836 cl14876 Zinc_peptidase_like superfamily N - "Zinc peptidases M18, M20, M28, and M42; Zinc peptidases play vital roles in metabolic and signaling pathways throughout all kingdoms of life. This family corresponds to several clans in the MEROPS database, including the MH clan, which contains 4 families (M18, M20, M28, M42). The peptidase M20 family includes carboxypeptidases such as the glutamate carboxypeptidase from Pseudomonas, the thermostable carboxypeptidase Ss1 of broad specificity from archaea and yeast Gly-X carboxypeptidase. The dipeptidases include bacterial dipeptidase, peptidase V (PepV), a eukaryotic, non-specific dipeptidase, and two Xaa-His dipeptidases (carnosinases). There is also the bacterial aminopeptidase, peptidase T (PepT) that acts only on tripeptide substrates and has therefore been termed a tripeptidase. Peptidase family M28 contains aminopeptidases and carboxypeptidases, and has co-catalytic zinc ions. However, several enzymes in this family utilize other first row transition metal ions such as cobalt and manganese. Each zinc ion is tetrahedrally co-ordinated, with three amino acid ligands plus activated water; one aspartate residue binds both metal ions. The aminopeptidases in this family are also called bacterial leucyl aminopeptidases, but are able to release a variety of N-terminal amino acids. IAP aminopeptidase and aminopeptidase Y preferentially release basic amino acids while glutamate carboxypeptidase II preferentially releases C-terminal glutamates. Glutamate carboxypeptidase II and plasma glutamate carboxypeptidase hydrolyze dipeptides. Peptidase families M18 and M42 contain metalloaminopeptidases. M18 is widely distributed in bacteria and eukaryotes. 
However, only yeast aminopeptidase I and mammalian aspartyl aminopeptidase have been characterized in detail. Some of M42 (also known as glutamyl aminopeptidase) enzymes exhibit aminopeptidase specificity while others also have acylaminoacylpeptidase activity (i.e. hydrolysis of acylated N-terminal residues)." Q#16894 - CGI_10015286 superfamily 190308 254 457 9.96E-57 190.993 cl18163 Fringe superfamily - - "Fringe-like; The drosophila protein fringe (FNG) is a glucosaminyltransferase that controls the response of the Notch receptor to specific ligands. FNG is localised to the Golgi apparatus (not secreted as previously thought). Modification of Notch occurs through glycosylation by FNG. The xenopus homologue, lunatic fringe, has been implicated in a variety of functions." Q#16895 - CGI_10015287 superfamily 245201 8 262 1.27E-64 218.58 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#16895 - CGI_10015287 superfamily 221402 712 921 6.35E-42 154.008 cl13493 DUF3543 superfamily - - Domain of unknown function (DUF3543); This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 217 to 291 amino acids in length. This domain is found associated with pfam00069. This domain has a single completely conserved residue A that may be functionally important. Q#16896 - CGI_10015288 superfamily 247799 456 512 2.80E-09 54.4883 cl17245 KH-I superfamily - - "K homology RNA-binding domain, type I. KH binds single-stranded RNA or DNA. 
It is found in a wide variety of proteins including ribosomal proteins, transcription factors and post-transcriptional modifiers of mRNA. There are two different KH domains that belong to different protein folds, but they share a single KH motif. The KH motif is folded into a beta alpha alpha beta unit. In addition to the core, type II KH domains (e.g. ribosomal protein S3) include N-terminal extension and type I KH domains (e.g. hnRNP K) contain C-terminal extension." Q#16896 - CGI_10015288 superfamily 243098 621 667 1.60E-07 49.1335 cl02573 TUDOR superfamily - - "Tudor domains are found in many eukaryotic organisms and have been implicated in protein-protein interactions in which methylated protein substrates bind to these domains. For example, the Tudor domain of Survival of Motor Neuron (SMN) binds to symmetrically dimethylated arginines of arginine-glycine (RG) rich sequences found in the C-terminal tails of Sm proteins. The SMN protein is linked to spinal muscular atrophy. Another example is the tandem tudor domains of 53BP1, which bind to histone H4 specifically dimethylated at Lys20 (H4-K20me2). 53BP1 is a key transducer of the DNA damage checkpoint signal." Q#16898 - CGI_10015290 superfamily 220416 35 321 1.72E-62 203.807 cl10783 Morph_protein1 superfamily - - "Defects in morphology protein 1, mitochondrial precursor; Members of this family of proteins are thought to be involved in cellular morphology, though little else is known about them." Q#16902 - CGI_10015294 superfamily 241574 43 177 3.48E-39 132.732 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. 
Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#16904 - CGI_10015296 superfamily 241574 44 178 1.01E-39 133.888 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#16905 - CGI_10015297 superfamily 222150 788 813 6.68E-06 44.6901 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#16905 - CGI_10015297 superfamily 222150 306 331 1.07E-05 43.9197 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#16905 - CGI_10015297 superfamily 222150 38 63 3.89E-05 42.3789 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#16905 - CGI_10015297 superfamily 222150 643 668 0.000229416 40.0677 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#16907 - CGI_10015299 superfamily 222150 155 180 0.000184131 39.2973 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#16907 - CGI_10015299 superfamily 222150 267 291 0.000301644 38.5269 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. 
Q#16908 - CGI_10015300 superfamily 247727 142 281 0.000826063 38.5651 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#16909 - CGI_10015301 superfamily 241583 209 408 1.43E-85 276.812 cl00064 ZnMc superfamily - - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#16909 - CGI_10015301 superfamily 216572 14 131 2.94E-15 74.2334 cl03265 Pep_M12B_propep superfamily - - Reprolysin family propeptide; This region is the propeptide for members of peptidase family M12B. The propeptide contains a sequence motif similar to the "cysteine switch" of the matrixins. This motif is found at the C terminus of the alignment but is not well aligned. Q#16909 - CGI_10015301 superfamily 204025 1096 1126 7.22E-07 47.6313 cl07344 PLAC superfamily - - PLAC (protease and lacunin) domain; The PLAC (protease and lacunin) domain is a short six-cysteine region that is usually found at the C terminal of proteins. It is found in a range of proteins including PACE4 (paired basic amino acid cleaving enzyme 4) and the extracellular matrix protein lacunin. 
Q#16909 - CGI_10015301 superfamily 246918 910 964 0.000201643 40.6479 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#16909 - CGI_10015301 superfamily 246918 1041 1091 0.000267134 40.2627 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#16909 - CGI_10015301 superfamily 246918 974 1031 0.00229097 37.5663 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#16909 - CGI_10015301 superfamily 246918 524 559 0.00486817 36.4107 cl15278 TSP_1 superfamily N - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#16909 - CGI_10015301 superfamily 221428 586 636 0.00572289 36.5975 cl13541 Salp15 superfamily N - "Salivary protein of 15kDa inhibits CD4+ T cell activation; This is a family of 15kDa salivary proteins from Acari Arachnids that is induced on feeding and assists the parasite to remain attached to its arthropod host. By repressing calcium fluxes triggered by TCR engagement, Salp15 inhibits CD4+ T cell activation. Salp15 shows weak similarity to Inhibin A, a member of the TGF-beta superfamily that inhibits the production of cytokines and the proliferation of T cells." Q#16910 - CGI_10015302 superfamily 243146 79 131 1.48E-06 45.3579 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#16910 - CGI_10015302 superfamily 243146 193 242 4.74E-05 41.1207 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#16910 - CGI_10015302 superfamily 243146 120 167 0.000852544 37.2063 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#16911 - CGI_10015303 superfamily 244530 8 93 1.37E-23 91.4524 cl06844 SRR1 superfamily N - SRR1; SRR1 proteins are signalling proteins involved in regulating the circadian clock in Arabidopsis. 
Q#16912 - CGI_10007978 superfamily 218118 54 115 4.33E-05 38.3641 cl04552 CD225 superfamily C - "Interferon-induced transmembrane protein; This family includes the human leukocyte antigen CD225, which is an interferon inducible transmembrane protein, and is associated with interferon induced cell growth suppression." Q#16913 - CGI_10007979 superfamily 241578 2509 2666 4.02E-15 76.1762 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#16913 - CGI_10007979 superfamily 241578 2307 2462 1.78E-14 74.2502 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#16913 - CGI_10007979 superfamily 241578 1408 1566 3.54E-12 67.3166 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#16913 - CGI_10007979 superfamily 241578 1164 1294 2.92E-10 61.1534 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. 
The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#16913 - CGI_10007979 superfamily 247068 420 511 5.94E-05 44.2266 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#16913 - CGI_10007979 superfamily 247068 2856 2950 0.00020491 42.6858 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#16913 - CGI_10007979 superfamily 152683 1940 2042 3.27E-10 60.3793 cl13656 Methyltransf_FA superfamily - - "Farnesoic acid O-methyl transferase; This domain family is found in bacteria and eukaryotes, and is approximately 110 amino acids in length. Farnesoic acid O-methyl transferase (FAMeT) is the enzyme that catalyzes the formation of methyl farnesoate (MF) from farnesoic acid (FA) in the biosynthetic pathway of juvenile hormone (JH)." Q#16913 - CGI_10007979 superfamily 241578 969 1136 6.01E-07 50.8497 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." 
Q#16913 - CGI_10007979 superfamily 246918 1851 1897 3.89E-06 47.1963 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#16913 - CGI_10007979 superfamily 241578 273 400 7.61E-06 47.2862 cl00057 vWFA superfamily C - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#16913 - CGI_10007979 superfamily 152683 799 900 9.26E-06 46.8973 cl13656 Methyltransf_FA superfamily - - "Farnesoic acid O-methyl transferase; This domain family is found in bacteria and eukaryotes, and is approximately 110 amino acids in length. Farnesoic acid O-methyl transferase (FAMeT) is the enzyme that catalyzes the formation of methyl farnesoate (MF) from farnesoic acid (FA) in the biosynthetic pathway of juvenile hormone (JH)." Q#16913 - CGI_10007979 superfamily 216897 644 702 0.000102777 43.4389 cl03463 Gal_Lectin superfamily - - Galactose binding lectin domain; Galactose binding lectin domain. 
Q#16913 - CGI_10007979 superfamily 246918 711 756 0.00471306 37.9515 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#16913 - CGI_10007979 superfamily 246918 15 60 0.00568059 37.9515 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#16914 - CGI_10006284 superfamily 248012 34 140 4.83E-09 53.7357 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#16915 - CGI_10006285 superfamily 248012 31 160 7.11E-08 50.3996 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#16918 - CGI_10006194 superfamily 241574 33 205 6.52E-63 198.964 cl00053 PTPc superfamily N - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#16920 - CGI_10006196 superfamily 241752 2005 2123 9.83E-30 117.42 cl00283 ADP_ribosyl superfamily - - "ADP_ribosylating enzymes catalyze the transfer of ADP_ribose from NAD+ to substrates. Bacterial toxins are cytoplasmic and catalyze the transfer of a single ADP_ribose unit to eukaryotic elongation factor 2, halting protein synthesis and killing the cell. Poly(ADP-ribose) polymerases (PARPS 1-3, VPARP, tankyrase) catalyze the addition of up to 100 ADP_ribose units from NAD+. PARPs 1 and 2 are localized in the nucleus, bind DNA, and are activated by DNA damage. VPARP is part of the vault ribonucleoprotein complex. 
Tankyrases regulate telomere length in part through poly(ADP_ribosylation) of telomere repeat binding factor 1 (TRF1). Poly(ADP-ribose) polymerase catalyses the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active." Q#16920 - CGI_10006196 superfamily 241554 1545 1685 4.75E-25 104.265 cl00019 Macro superfamily - - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#16920 - CGI_10006196 superfamily 241554 1168 1295 1.72E-20 90.7827 cl00019 Macro superfamily - - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. 
Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#16920 - CGI_10006196 superfamily 241554 1353 1493 1.43E-16 79.2267 cl00019 Macro superfamily - - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#16920 - CGI_10006196 superfamily 247723 571 643 1.06E-06 48.8028 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. 
The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#16921 - CGI_10006197 superfamily 243090 170 282 1.44E-40 146.314 cl02565 RGS superfamily - - "Regulator of G protein signaling (RGS) domain superfamily; The RGS domain is an essential part of the Regulator of G-protein Signaling (RGS) protein family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play critical regulatory roles as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. While inactive, G-alpha-subunits bind GDP, which is released and replaced by GTP upon agonist activation. GTP binding leads to dissociation of the alpha-subunit and the beta-gamma-dimer, allowing them to interact with effector molecules and propagate signaling cascades associated with cellular growth, survival, migration, and invasion. Deactivation of the G-protein signaling controlled by the RGS domain accelerates GTPase activity of the alpha subunit by hydrolysis of GTP to GDP, which results in the reassociation of the alpha-subunit with the beta-gamma-dimer and thereby inhibition of downstream activity. As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. RGS proteins are also involved in apoptosis and cell proliferation, as well as modulation of cardiac development. Several RGS proteins can fine-tune immune responses, while others play important roles in neuronal signals modulation. Some RGS proteins are principal elements needed for proper vision." 
Q#16921 - CGI_10006197 superfamily 241645 420 492 4.79E-28 109.386 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#16921 - CGI_10006197 superfamily 241645 491 557 1.39E-07 50.3599 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#16921 - CGI_10006197 superfamily 245875 693 714 7.67E-06 44.2317 cl12112 GoLoco superfamily - - GoLoco motif; GoLoco motif. 
Q#16923 - CGI_10006199 superfamily 247725 714 847 1.21E-48 170.651 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting proteins to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#16923 - CGI_10006199 superfamily 241622 432 508 7.25E-17 77.9922 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95 (post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." 
Q#16923 - CGI_10006199 superfamily 241862 139 391 4.00E-24 103.975 cl00437 COG0428 superfamily - - Predicted divalent heavy-metal cations transporter [Inorganic ion transport and metabolism] Q#16924 - CGI_10006200 superfamily 246709 31 251 1.72E-130 373.073 cl14782 RNase_H superfamily - - "RNase H is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a sequence non-specific manner; Ribonuclease H (RNase H) enzymes are divided into two major families, Type 1 and Type 2, based on amino acid sequence similarities and biochemical properties. RNase H is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a sequence non-specific manner in the presence of divalent cations. RNase H is widely present in various organisms, including bacteria, archaea and eukaryotes. Most prokaryotic and eukaryotic genomes contain multiple RNase H genes. Despite the lack of amino acid sequence homology, Type 1 and type 2 RNase H share a main-chain fold and steric configurations of the four acidic active-site residues and have the same catalytic mechanism and functions in cells. RNase H is involved in DNA replication, repair and transcription. One of the important functions of RNase H is to remove Okazaki fragments during DNA replication. RNase H inhibitors have been explored as an anti-HIV drug target because RNase H inactivation inhibits reverse transcription." 
Q#16926 - CGI_10006202 superfamily 241900 89 143 8.53E-12 60.3084 cl00490 EEP superfamily NC - "Exonuclease-Endonuclease-Phosphatase (EEP) domain superfamily; This large superfamily includes the catalytic domain (exonuclease/endonuclease/phosphatase or EEP domain) of a diverse set of proteins including the ExoIII family of apurinic/apyrimidinic (AP) endonucleases, inositol polyphosphate 5-phosphatases (INPP5), neutral sphingomyelinases (nSMases), deadenylases (such as the vertebrate circadian-clock regulated nocturnin), bacterial cytolethal distending toxin B (CdtB), deoxyribonuclease 1 (DNase1), the endonuclease domain of the non-LTR retrotransposon LINE-1, and related domains. These diverse enzymes share a common catalytic mechanism of cleaving phosphodiester bonds; their substrates range from nucleic acids to phospholipids and perhaps proteins." Q#16926 - CGI_10006202 superfamily 241900 19 88 0.000615616 37.9668 cl00490 EEP superfamily C - "Exonuclease-Endonuclease-Phosphatase (EEP) domain superfamily; This large superfamily includes the catalytic domain (exonuclease/endonuclease/phosphatase or EEP domain) of a diverse set of proteins including the ExoIII family of apurinic/apyrimidinic (AP) endonucleases, inositol polyphosphate 5-phosphatases (INPP5), neutral sphingomyelinases (nSMases), deadenylases (such as the vertebrate circadian-clock regulated nocturnin), bacterial cytolethal distending toxin B (CdtB), deoxyribonuclease 1 (DNase1), the endonuclease domain of the non-LTR retrotransposon LINE-1, and related domains. These diverse enzymes share a common catalytic mechanism of cleaving phosphodiester bonds; their substrates range from nucleic acids to phospholipids and perhaps proteins." 
Q#16927 - CGI_10006203 superfamily 193607 287 417 2.41E-65 207.424 cl15237 Deltex_C superfamily - - "Domain found at the C-terminus of deltex-like; The deltex family of proteins is involved in the regulation of Notch signaling, and therefore may play roles in cell-to-cell communications that regulate mechanisms determining cell fate. They have a central RING-type zinc finger domain and contain a C-terminal domain, described here, that is also found in other domain architectures. Deltex-1 (DTX1) contains a RING finger and two WWE domains, indicating that it may be an E3 ubiquitin ligase. Human deltex 3-like, which contains an additional N-terminal domain (presumably with ubiquitin ligase activity) is also described as E3 ubiquitin-protein ligase DTX3L, B-lymphoma- and BAL-associated protein (BBAP), or rhysin-2. DTX3L mediates monoubiquitination of K91 of histone H4 in response to DNA damage." Q#16927 - CGI_10006203 superfamily 247792 238 279 1.57E-10 56.6852 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#16929 - CGI_10003455 superfamily 241629 42 149 5.08E-19 86.0348 cl00133 SCP superfamily N - "SCP: SCP-like extracellular protein domain, found in eukaryotes and prokaryotes. This family includes plant pathogenesis-related protein 1 (PR-1), which accumulates after infections with pathogens, and may act as an anti-fungal agent or be involved in cell wall loosening. 
This family also includes CRISPs, mammalian cysteine-rich secretory proteins, which combine SCP with a C-terminal cysteine rich domain, and allergen 5 from vespid venom. Roles for CRISP, in response to pathogens, fertilization, and sperm maturation have been proposed. One member, Tex31 from the venom duct of Conus textile, has been shown to possess proteolytic activity sensitive to serine protease inhibitors. The human GAPR-1 protein has been reported to dimerize, and such a dimer may form an active site containing a catalytic triad. SCP has also been proposed to be a Ca++ chelating serine protease. The Ca++-chelating function would fit with various signaling processes that members of this family, such as the CRISPs, are involved in, and is supported by sequence and structural evidence of a conserved pocket containing two histidines and a glutamate. It also may explain how helothermine, a toxic peptide secreted by the beaded lizard, blocks Ca++ transporting ryanodine receptors. Little is known about the biological roles of the bacterial and archaeal SCP domains." Q#16929 - CGI_10003455 superfamily 247097 575 606 3.29E-06 45.8333 cl15839 ShK superfamily - - ShK domain-like; This domain of is found in several C. elegans proteins. The domain is 30 amino acids long and rich in cysteine residues. There are 6 conserved cysteine positions in the domain that form three disulphide bridges. The domain is found in the potassium channel inhibitor ShK in sea anemone. Q#16930 - CGI_10003456 superfamily 247057 749 816 3.05E-19 83.3543 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. 
SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#16930 - CGI_10003456 superfamily 221744 114 322 3.76E-10 59.7571 cl18614 CABIT superfamily N - "Cell-cycle sustaining, positive selection,; The 'CABIT' domain (for 'cysteine-containing, all- in Themis') is found in a newly identified gene family that has three mammalian homologues (Themis, Icb1 and 9130404H23Rik) that encode proteins with two CABIT domains and a highly conserved proline-rich region. In contrast, Fam59A, Fam59B and related proteins from mammals to cnidarians, including the insect Serrano proteins, have a single copy of the CABIT domain, a proline-rich region and often a C-terminal SAM (sterile-motif) domain. Multiple-sequence alignment has predicted that the CABIT domain adopts an all-strand structure with at least 12 strands, ie a dyad of six-stranded beta-barrel units. The CABIT domain contains a nearly absolutely conserved cysteine residue which is likely to be central to its function. CABIT domain proteins function downstream of tyrosine kinase signalling and interact with GRB2." Q#16931 - CGI_10000528 superfamily 209898 51 73 2.54E-05 37.383 cl14787 MORN superfamily - - MORN repeat; The MORN (Membrane Occupation and Recognition Nexus) repeat is found in multiple copies in several proteins including junctophilins (See Takeshima et al. Mol. Cell 2000;6:11-22). A MORN-repeat protein has been identified in the parasite Toxoplasma gondiis a dynamic component of cell division apparatus in Toxoplasma gondii. It has been hypothesised to functions as a linker protein between certain membrane regions and the parasite's cytoskeleton. 
Q#16933 - CGI_10000532 superfamily 243092 123 210 8.60E-05 43.4776 cl02567 WD40 superfamily NC - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#16935 - CGI_10001800 superfamily 241632 1 145 4.58E-46 155.103 cl00137 SERPIN superfamily N - "SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia." 
Q#16936 - CGI_10001801 superfamily 241632 3 79 2.78E-20 81.8847 cl00137 SERPIN superfamily C - "SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia." Q#16937 - CGI_10001802 superfamily 241632 1 177 3.71E-56 182.838 cl00137 SERPIN superfamily N - "SERine Proteinase INhibitors (serpins) exhibit conformational polymorphism shifting from native to cleaved, latent, delta, or polymorphic forms. Many serpins, such as antitrypsin and antichymotrypsin, function as serine protease inhibitors which regulate blood coagulation cascades. Non-inhibitory serpins perform many diverse functions such as chaperoning proteins or transporting hormones. Serpins are of medical interest because mutants have been associated with blood clotting disorders, emphysema, cirrhosis, and dementia." Q#16938 - CGI_10001803 superfamily 241629 69 200 6.85E-40 138.602 cl00133 SCP superfamily - - "SCP: SCP-like extracellular protein domain, found in eukaryotes and prokaryotes. This family includes plant pathogenesis-related protein 1 (PR-1), which accumulates after infections with pathogens, and may act as an anti-fungal agent or be involved in cell wall loosening. This family also includes CRISPs, mammalian cysteine-rich secretory proteins, which combine SCP with a C-terminal cysteine rich domain, and allergen 5 from vespid venom. Roles for CRISP, in response to pathogens, fertilization, and sperm maturation have been proposed. 
One member, Tex31 from the venom duct of Conus textile, has been shown to possess proteolytic activity sensitive to serine protease inhibitors. The human GAPR-1 protein has been reported to dimerize, and such a dimer may form an active site containing a catalytic triad. SCP has also been proposed to be a Ca++ chelating serine protease. The Ca++-chelating function would fit with various signaling processes that members of this family, such as the CRISPs, are involved in, and is supported by sequence and structural evidence of a conserved pocket containing two histidines and a glutamate. It also may explain how helothermine, a toxic peptide secreted by the beaded lizard, blocks Ca++ transporting ryanodine receptors. Little is known about the biological roles of the bacterial and archaeal SCP domains." Q#16939 - CGI_10001805 superfamily 248097 8 123 6.13E-16 68.831 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#16940 - CGI_10004596 superfamily 245231 21 107 1.48E-25 97.2063 cl10019 PurM-like superfamily NC - "AIR (aminoimidazole ribonucleotide) synthase related protein. This family includes Hydrogen expression/formation protein HypE, AIR synthases, FGAM (formylglycinamidine ribonucleotide) synthase and Selenophosphate synthetase (SelD). The N-terminal domain of AIR synthase forms the dimer interface of the protein, and is suggested as a putative ATP binding domain." Q#16942 - CGI_10004598 superfamily 216101 76 460 5.59E-124 376.248 cl08288 Carn_acyltransf superfamily N - Choline/Carnitine o-acyltransferase; Choline/Carnitine o-acyltransferase. Q#16943 - CGI_10004599 superfamily 248013 73 121 2.00E-10 57.2739 cl17459 CHROMO superfamily - - "Chromatin organization modifier (chromo) domain is a conserved region of around 50 amino acids found in a variety of chromosomal proteins, which appear to play a role in the functional organization of the eukaryotic nucleus. 
Experimental evidence implicates the chromo domain in the binding activity of these proteins to methylated histone tails and maybe RNA. May occur as single instance, in a tandem arrangement or followd by a related "chromo shadow" domain." Q#16943 - CGI_10004599 superfamily 243091 339 463 3.54E-39 140.162 cl02566 SET superfamily - - "SET domain; SET domains are protein lysine methyltransferase enzymes. SET domains appear to be protein-protein interaction domains. It has been demonstrated that SET domains mediate interactions with a family of proteins that display similarity with dual-specificity phosphatases (dsPTPases). A subset of SET domains have been called PR domains. These domains are divergent in sequence from other SET domains, but also appear to mediate protein-protein interaction. The SET domain consists of two regions known as SET-N and SET-C. SET-C forms an unusual and conserved knot-like structure of probably functional importance. Additionally to SET-N and SET-C, an insert region (SET-I) and flanking regions of high structural variability form part of the overall structure." Q#16943 - CGI_10004599 superfamily 243114 232 331 3.24E-35 128.682 cl02622 Pre-SET superfamily - - Pre-SET motif; This protein motif is a zinc binding motif. It contains 9 conserved cysteines that coordinate three zinc ions. It is thought that this region plays a structural role in stabilising SET domains. Q#16945 - CGI_10000691 superfamily 243077 26 78 6.57E-13 60.6369 cl02542 DnaJ superfamily - - "DnaJ domain or J-domain. DnaJ/Hsp40 (heat shock protein 40) proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperonine family. Hsp40 proteins are characterized by the presence of a J domain, which mediates the interaction with Hsp70. They may contain other domains as well, and the architectures provide a means of classification." 
Q#16947 - CGI_10002218 superfamily 247792 142 198 3.93E-11 56.1151 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#16948 - CGI_10002219 superfamily 217505 21 448 4.57E-159 459.788 cl04021 Serinc superfamily - - Serine incorporator (Serinc); This is a family of eukaryotic membrane proteins which incorporate serine into membranes and facilitate the synthesis of the serine-derived lipids phosphatidylserine and sphingolipid. Members of this family contain 11 transmembrane domains and form intracellular complexes with key enzymes involved in serine and sphingolipid biosynthesis. Q#16952 - CGI_10006149 superfamily 245213 23 64 0.00286757 36.456 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#16952 - CGI_10006149 superfamily 245213 118 159 0.00588778 35.6856 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#16953 - CGI_10006150 superfamily 241578 363 402 1.20E-05 45.4536 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#16953 - CGI_10006150 superfamily 241578 152 195 1.85E-05 45.0684 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). 
Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#16953 - CGI_10006150 superfamily 241578 572 611 2.43E-05 44.6832 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." 
Q#16953 - CGI_10006150 superfamily 241578 442 480 0.000256712 41.6016 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#16953 - CGI_10006150 superfamily 245213 282 315 0.0027505 36.0708 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#16953 - CGI_10006150 superfamily 241578 403 434 0.00399496 37.7496 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). 
Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#16954 - CGI_10006151 superfamily 241563 63 97 7.59E-05 40.7336 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#16955 - CGI_10006152 superfamily 110440 350 378 0.00600431 34.3057 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is me diated by the NHL repeats." 
Q#16956 - CGI_10001602 superfamily 243035 152 265 0.000874495 37.5794 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#16957 - CGI_10021155 superfamily 193307 19 93 4.84E-12 57.2661 cl14916 MFS_1_like superfamily - - MFS_1 like family; In fungal members this domain is found at the C-terminus of putative transporter proteins. Q#16958 - CGI_10021156 superfamily 220238 51 151 2.43E-26 99.2888 cl18552 DUF2012 superfamily - - Protein of unknown function (DUF2012); This is a eukaryotic family of uncharacterized proteins. Q#16960 - CGI_10021158 superfamily 246675 13 307 2.35E-98 294.149 cl14615 PI-PLCc_GDPD_SF superfamily - - "Catalytic domain of phosphoinositide-specific phospholipase C-like phosphodiesterases superfamily; The PI-PLC-like phosphodiesterases superfamily represents the catalytic domains of bacterial phosphatidylinositol-specific phospholipase C (PI-PLC, EC 4.6.1.13), eukaryotic phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11), glycerophosphodiester phosphodiesterases (GP-GDE, EC 3.1.4.46), sphingomyelinases D (SMases D) (sphingomyelin phosphodiesterase D, EC 3.1.4.41) from spider venom, SMases D-like proteins, and phospholipase D (PLD) from several pathogenic bacteria, as well as their uncharacterized homologs found in organisms ranging from bacteria and archaea to metazoans, plants, and fungi. 
PI-PLCs are ubiquitous enzymes hydrolyzing the membrane lipid phosphoinositides to yield two important second messengers, inositol phosphates and diacylglycerol (DAG). GP-GDEs play essential roles in glycerol metabolism and catalyze the hydrolysis of glycerophosphodiesters to sn-glycerol-3-phosphate (G3P) and the corresponding alcohols that are major sources of carbon and phosphate. Both, PI-PLCs and GP-GDEs, can hydrolyze the 3'-5' phosphodiester bonds in different substrates, and utilize a similar mechanism of general base and acid catalysis with conserved histidine residues, which consists of two steps, a phosphotransfer and a phosphodiesterase reaction. This superfamily also includes Neurospora crassa ankyrin repeat protein NUC-2 and its Saccharomyces cerevisiae counterpart, Phosphate system positive regulatory protein PHO81, glycerophosphodiester phosphodiesterase (GP-GDE)-like protein SHV3 and SHV3-like proteins (SVLs). The residues essential for enzyme activities and metal binding are not conserved in these sequence homologs, which might suggest that the function of catalytic domains in these proteins might be distinct from those in typical PLC-like phosphodiesterases." Q#16961 - CGI_10021159 superfamily 241795 6 135 1.76E-89 259.323 cl00335 NDPk superfamily - - "Nucleoside diphosphate kinases (NDP kinases, NDPks): NDP kinases, responsible for the synthesis of nucleoside triphosphates (NTPs), are involved in numerous regulatory processes associated with proliferation, development, and differentiation. They are vital for DNA/RNA synthesis, cell division, macromolecular metabolism and growth. The enzymes generate NTPs or their deoxy derivatives by terminal (gamma) phosphotransfer from an NTP such as ATP or GTP to any nucleoside diphosphate (NDP) or its deoxy derivative. The sequence of NDPk has been highly conserved through evolution. There is a single histidine residue conserved in all known NDK isozymes, which is involved in the catalytic mechanism. 
The first confirmed metastasis suppressor gene was the NDP kinase protein encoded by the nm23 gene. Unicellular organisms generally possess only one gene encoding NDP kinase, while most multicellular organisms possess not only an ortholog that provides most of the NDP kinase enzymatic activity but also multiple divergent paralogous genes. The human genome codes for at least nine NDP kinases and can be classified into two groups, Groups I and II, according to their genomic architecture and distinct enzymatic activity. Group I isoforms (A-D) are well-conserved, catalytically active, and share 58-88% identity between each other, while Group II are more divergent, with only NDPk6 shown to be active. NDP kinases exist in two different quaternary structures; all known eukaryotic enzymes are hexamers, while some bacterial enzymes are tetramers, as in Myxococcus. The hexamer can be viewed as trimer of dimers, while tetramers are dimers of dimers, with the dimerization interface conserved." Q#16963 - CGI_10021161 superfamily 245206 4 54 4.80E-10 52.9735 cl09931 NADB_Rossmann superfamily C - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." 
Q#16964 - CGI_10021162 superfamily 241795 9 140 1.70E-91 266.612 cl00335 NDPk superfamily - - "Nucleoside diphosphate kinases (NDP kinases, NDPks): NDP kinases, responsible for the synthesis of nucleoside triphosphates (NTPs), are involved in numerous regulatory processes associated with proliferation, development, and differentiation. They are vital for DNA/RNA synthesis, cell division, macromolecular metabolism and growth. The enzymes generate NTPs or their deoxy derivatives by terminal (gamma) phosphotransfer from an NTP such as ATP or GTP to any nucleoside diphosphate (NDP) or its deoxy derivative. The sequence of NDPk has been highly conserved through evolution. There is a single histidine residue conserved in all known NDK isozymes, which is involved in the catalytic mechanism. The first confirmed metastasis suppressor gene was the NDP kinase protein encoded by the nm23 gene. Unicellular organisms generally possess only one gene encoding NDP kinase, while most multicellular organisms possess not only an ortholog that provides most of the NDP kinase enzymatic activity but also multiple divergent paralogous genes. The human genome codes for at least nine NDP kinases and can be classified into two groups, Groups I and II, according to their genomic architecture and distinct enzymatic activity. Group I isoforms (A-D) are well-conserved, catalytically active, and share 58-88% identity between each other, while Group II are more divergent, with only NDPk6 shown to be active. NDP kinases exist in two different quaternary structures; all known eukaryotic enzymes are hexamers, while some bacterial enzymes are tetramers, as in Myxococcus. The hexamer can be viewed as trimer of dimers, while tetramers are dimers of dimers, with the dimerization interface conserved." Q#16964 - CGI_10021162 superfamily 147395 152 193 1.87E-15 67.2244 cl04973 Dpy-30 superfamily - - Dpy-30 motif; This motif is found in a wide variety of domain contexts. 
It is found in the Dpy-30 proteins, hence the motif's name. It is about 40 residues long and is probably formed of two alpha-helices. It may be a dimerisation motif analogous to pfam02197 (Bateman A pers obs). Q#16966 - CGI_10021164 superfamily 248019 24 124 4.96E-20 85.7988 cl17465 DAGK_cat superfamily C - "Diacylglycerol kinase catalytic domain; Diacylglycerol (DAG) is a second messenger that acts as a protein kinase C activator. The catalytic domain is assumed from the finding of bacterial homologues. YegS is the Escherichia coli protein in this family whose crystal structure reveals an active site in the inter-domain cleft formed by four conserved sequence motifs, revealing a novel metal-binding site. The residues of this site are conserved across the family." Q#16966 - CGI_10021164 superfamily 248019 394 480 0.00579257 37.9423 cl17465 DAGK_cat superfamily N - "Diacylglycerol kinase catalytic domain; Diacylglycerol (DAG) is a second messenger that acts as a protein kinase C activator. The catalytic domain is assumed from the finding of bacterial homologues. YegS is the Escherichia coli protein in this family whose crystal structure reveals an active site in the inter-domain cleft formed by four conserved sequence motifs, revealing a novel metal-binding site. The residues of this site are conserved across the family." Q#16966 - CGI_10021164 superfamily 248019 111 262 0.00913556 37.1719 cl17465 DAGK_cat superfamily NC - "Diacylglycerol kinase catalytic domain; Diacylglycerol (DAG) is a second messenger that acts as a protein kinase C activator. The catalytic domain is assumed from the finding of bacterial homologues. YegS is the Escherichia coli protein in this family whose crystal structure reveals an active site in the inter-domain cleft formed by four conserved sequence motifs, revealing a novel metal-binding site. The residues of this site are conserved across the family."
Q#16967 - CGI_10021165 superfamily 245840 10 186 1.53E-48 158.284 cl12022 Ribosomal_L18e superfamily - - Ribosomal protein L18e/L15; This family includes eukaryotic L18 as well as prokaryotic L15. Q#16971 - CGI_10021169 superfamily 247905 374 483 2.32E-33 124.658 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#16971 - CGI_10021169 superfamily 244562 572 663 2.36E-22 92.2447 cl06960 GUCT superfamily - - "RNA-binding GUCT domain found in the RNA helicase II/Gu protein family; This family includes vertebrate RNA helicase II/Gualpha (RH-II/Gualpha) and RNA helicase II/Gubeta (RH-II/Gubeta), both of which consist of a DEAD box helicase domain (DEAD), a helicase conserved C-terminal domain, and a Gu C-terminal (GUCT) domain. They localize to nucleoli, suggesting roles in ribosomal RNA production, but RH-II/Gubeta also localizes to nuclear speckles containing the splicing factor SC35, suggesting its possible involvement in pre-mRNA splicing. In contrast to RH-II/Gualpha, RH-II/Gubeta has RNA-unwinding activity, but no RNA-folding activity. The family also contains plant DEAD-box ATP-dependent RNA helicase 7 (RH7 or PRH75), Thermus thermophilus heat resistant RNA-dependent ATPase (Hera) and similar proteins. RH7 is a new nucleus-localized member of the DEAD-box protein family from higher plants. 
It displays a weak ATPase activity which is barely stimulated by RNA ligands. RH7 contains an N-terminal KDES domain rich in lysine, glutamic acid, aspartic acid, and serine residues, seven highly conserved helicase motifs in the central region, a GUCT domain, and a C-terminal GYR domain harboring a large number of glycine residues interrupted by either arginines or tyrosines. Thermus thermophilus Hera is a DEAD box helicase that binds fragments of 23S rRNA and RNase P RNA via its C-terminal domain. It contains a helicase core that harbors two RecA-like domains termed RecA_N and RecA_C, a dimerization domain (DD), and a C-terminal RNA-binding domain (RBD) that reveals a compact, RRM-like fold and shows sequence similarity with the typical GUCT domain found in the RNA helicase II/Gu protein family." Q#16971 - CGI_10021169 superfamily 247805 157 336 2.07E-60 202.33 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#16972 - CGI_10021170 superfamily 241606 16 121 1.16E-25 98.9025 cl00096 IRF superfamily - - Interferon Regulatory Factor (IRF); also known as tryptophan pentad repeat. The family of IRF transcription factors is important in the regulation of interferons in response to infection by virus and in the regulation of interferon-inducible genes. The IRF family is characterized by a unique 'tryptophan cluster' DNA-binding region. Viral IRFs bind to cellular IRFs; block type I and II interferons and host IRF-mediated transcriptional activation. Q#16973 - CGI_10021171 superfamily 241606 22 128 8.70E-31 113.925 cl00096 IRF superfamily - - Interferon Regulatory Factor (IRF); also known as tryptophan pentad repeat. The family of IRF transcription factors is important in the regulation of interferons in response to infection by virus and in the regulation of interferon-inducible genes. 
The IRF family is characterized by a unique 'tryptophan cluster' DNA-binding region. Viral IRFs bind to cellular IRFs; block type I and II interferons and host IRF-mediated transcriptional activation. Q#16977 - CGI_10021175 superfamily 248458 109 248 1.54E-15 76.5837 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." 
Q#16977 - CGI_10021175 superfamily 248458 323 500 5.04E-11 63.1017 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." 
Q#16980 - CGI_10021178 superfamily 246908 441 538 2.67E-32 119.455 cl15255 SH2 superfamily - - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." Q#16981 - CGI_10021179 superfamily 241640 795 1023 1.32E-75 249.116 cl00149 Tryp_SPc superfamily - - Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues. Q#16981 - CGI_10021179 superfamily 241578 466 596 1.90E-05 44.8642 cl00057 vWFA superfamily C - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers.
There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#16981 - CGI_10021179 superfamily 241640 188 282 2.47E-09 57.2862 cl00149 Tryp_SPc superfamily C - Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues. Q#16984 - CGI_10021182 superfamily 242406 4 86 3.79E-10 52.2085 cl01271 DUF1768 superfamily N - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. Q#16985 - CGI_10021183 superfamily 217293 22 227 5.53E-32 120.43 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#16985 - CGI_10021183 superfamily 202474 234 284 1.71E-09 56.1229 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#16986 - CGI_10021184 superfamily 217293 37 229 5.81E-31 117.349 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. 
Q#16987 - CGI_10021185 superfamily 241567 158 397 8.31E-55 183.185 cl00042 CASc superfamily - - "Caspase, interleukin-1 beta converting enzyme (ICE) homologues; Cysteine-dependent aspartate-directed proteases that mediate programmed cell death (apoptosis). Caspases are synthesized as inactive zymogens and activated by proteolysis of the peptide backbone adjacent to an aspartate. The resulting two subunits associate to form an (alpha)2(beta)2-tetramer which is the active enzyme. Activation of caspases can be mediated by other caspase homologs." Q#16989 - CGI_10021187 superfamily 245226 11 146 6.02E-51 174.059 cl10012 DnaQ_like_exo superfamily - - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." 
Q#16989 - CGI_10021187 superfamily 245226 500 533 1.03E-05 44.6318 cl10012 DnaQ_like_exo superfamily N - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#16990 - CGI_10021188 superfamily 245226 610 810 8.99E-36 135.49 cl10012 DnaQ_like_exo superfamily - - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. 
The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#16992 - CGI_10005421 superfamily 248012 581 725 1.06E-11 63.1112 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#16992 - CGI_10005421 superfamily 241646 90 120 0.000549293 38.5858 cl00156 WAP superfamily C - "whey acidic protein-type four-disulfide core domains. Members of the family include whey acidic protein, elafin (elastase-specific inhibitor), caltrin-like protein (a calcium transport inhibitor) and other extracellular proteinase inhibitors. A group of proteins containing 8 characteristically-spaced cysteine residues forming disulphide bonds, have been termed '4-disulphide core' proteins. Protease inhibition occurs by insertion of the inhibitory loop into the active site pocket and interference with the catalytic residues of the protease." Q#16992 - CGI_10005421 superfamily 214507 293 347 0.00136962 37.4096 cl15307 LRRCT superfamily - - Leucine rich repeat C-terminal domain; Leucine rich repeat C-terminal domain.
Q#16993 - CGI_10005422 superfamily 247912 1 333 4.61E-44 159.204 cl17358 Beta-lactamase superfamily - - Beta-lactamase; This family appears to be distantly related to pfam00905 and PF00768 D-alanyl-D-alanine carboxypeptidase. Q#16993 - CGI_10005422 superfamily 221337 383 468 3.10E-06 45.3855 cl13401 DUF3471 superfamily - - "Domain of unknown function (DUF3471); This presumed domain is functionally uncharacterized. This domain is found in bacteria, archaea and eukaryotes. This domain is typically between 98 to 114 amino acids in length. This domain is found associated with pfam00144." Q#16994 - CGI_10005423 superfamily 247912 509 842 7.40E-35 136.477 cl17358 Beta-lactamase superfamily - - Beta-lactamase; This family appears to be distantly related to pfam00905 and PF00768 D-alanyl-D-alanine carboxypeptidase. Q#16994 - CGI_10005423 superfamily 247912 1 334 3.14E-24 104.506 cl17358 Beta-lactamase superfamily - - Beta-lactamase; This family appears to be distantly related to pfam00905 and PF00768 D-alanyl-D-alanine carboxypeptidase. Q#16994 - CGI_10005423 superfamily 221337 380 411 0.0012491 38.4519 cl13401 DUF3471 superfamily C - "Domain of unknown function (DUF3471); This presumed domain is functionally uncharacterized. This domain is found in bacteria, archaea and eukaryotes. This domain is typically between 98 to 114 amino acids in length. This domain is found associated with pfam00144." Q#16995 - CGI_10005424 superfamily 247912 4 305 3.40E-22 95.646 cl17358 Beta-lactamase superfamily - - Beta-lactamase; This family appears to be distantly related to pfam00905 and PF00768 D-alanyl-D-alanine carboxypeptidase. Q#16996 - CGI_10005425 superfamily 247912 46 373 1.03E-15 76.7712 cl17358 Beta-lactamase superfamily - - Beta-lactamase; This family appears to be distantly related to pfam00905 and PF00768 D-alanyl-D-alanine carboxypeptidase. 
Q#16997 - CGI_10005426 superfamily 247912 32 96 4.28E-13 68.682 cl17358 Beta-lactamase superfamily C - Beta-lactamase; This family appears to be distantly related to pfam00905 and PF00768 D-alanyl-D-alanine carboxypeptidase. Q#16997 - CGI_10005426 superfamily 247912 176 304 5.16E-05 43.6441 cl17358 Beta-lactamase superfamily N - Beta-lactamase; This family appears to be distantly related to pfam00905 and PF00768 D-alanyl-D-alanine carboxypeptidase. Q#16998 - CGI_10005427 superfamily 248097 14 101 1.94E-10 54.9638 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#16999 - CGI_10005428 superfamily 248097 69 190 4.99E-22 87.3206 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#17000 - CGI_10005429 superfamily 248097 73 194 4.77E-18 76.535 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#17001 - CGI_10005430 superfamily 246918 17 69 8.33E-13 58.7523 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#17003 - CGI_10005432 superfamily 246597 362 558 3.40E-39 142.047 cl13995 MPP_superfamily superfamily - - "metallophosphatase superfamily, metallophosphatase domain; Metallophosphatases (MPPs), also known as metallophosphoesterases, phosphodiesterases (PDEs), binuclear metallophosphoesterases, and dimetal-containing phosphoesterases (DMPs), represent a diverse superfamily of enzymes with a conserved domain containing an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. 
This superfamily includes: the phosphoprotein phosphatases (PPPs), Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination." Q#17003 - CGI_10005432 superfamily 219833 7 250 3.02E-20 89.7631 cl07152 Pol_alpha_B_N superfamily - - DNA polymerase alpha subunit B N-terminal; This is the eukaryotic DNA polymerase alpha subunit B N-terminal domain which is involved in complex formation. Also see pfam04058. Q#17006 - CGI_10000907 superfamily 245864 4 134 3.33E-26 103.512 cl12078 p450 superfamily N - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. 
Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#17007 - CGI_10015628 superfamily 247058 2 186 1.11E-55 182.76 cl15762 crotonase-like superfamily - - "Crotonase/Enoyl-Coenzyme A (CoA) hydratase superfamily. This superfamily contains a diverse set of enzymes including enoyl-CoA hydratase, naphthoate synthase, methylmalonyl-CoA decarboxylase, 3-hydroxybutyryl-CoA dehydratase, and dienoyl-CoA isomerase. Many of these play important roles in fatty acid metabolism. In addition to a conserved structural core and the formation of trimers (or dimers of trimers), a common feature in this superfamily is the stabilization of an enolate anion intermediate derived from an acyl-CoA substrate. This is accomplished by two conserved backbone NH groups in active sites that form an oxyanion hole." Q#17007 - CGI_10015628 superfamily 202367 298 328 7.51E-07 47.5344 cl18226 3HCDH_N superfamily C - "3-hydroxyacyl-CoA dehydrogenase, NAD binding domain; This family also includes lambda crystallin." Q#17008 - CGI_10015629 superfamily 199166 287 506 1.48E-10 60.0336 cl15308 AMN1 superfamily - - "Antagonist of mitotic exit network protein 1; Amn1 has been functionally characterized in Saccharomyces cerevisiae as a component of the Antagonist of MEN pathway (AMEN). The AMEN network is activated by MEN (mitotic exit network) via an active Cdc14, and in turn switches off MEN. Amn1 constitutes one of the alternative mechanisms by which MEN may be disrupted. Specifically, Amn1 binds Tem1 (Termination of M-phase, a GTPase that belongs to the RAS superfamily), and disrupts its association with Cdc15, the primary downstream target. Amn1 is a leucine-rich repeat (LRR) protein, with 12 repeats in the S. cerevisiae ortholog. As a negative regulator of the signal transduction pathway MEN, overexpression of AMN1 slows the growth of wild type cells.
The function of the vertebrate members of this family has not been determined experimentally; they have fewer LRRs that determine the extent of this model." Q#17008 - CGI_10015629 superfamily 243074 229 274 4.91E-07 47.1161 cl02535 F-box-like superfamily - - F-box-like; This is an F-box-like family. Q#17009 - CGI_10015630 superfamily 114903 23 114 2.40E-10 56.0647 cl05626 BAMBI superfamily N - "BMP and activin membrane-bound inhibitor (BAMBI) N-terminal domain; This family consists of several eukaryotic BMP and activin membrane-bound inhibitor (BAMBI) proteins. Members of the transforming growth factor-beta (TGF-beta) superfamily, including TGF-beta, bone morphogenetic proteins (BMPs), activins and nodals, are vital for regulating growth and differentiation. BAMBI is related to TGF-beta-family type I receptors but lacks an intracellular kinase domain. BAMBI is co-expressed with the ventralising morphogen BMP4 during Xenopus embryogenesis and requires BMP signalling for its expression. The protein stably associates with TGF-beta-family receptors and inhibits BMP and activin as well as TGF-beta signalling." Q#17010 - CGI_10015631 superfamily 218259 324 459 4.51E-36 130.467 cl04742 Bile_Hydr_Trans superfamily - - "Acyl-CoA thioester hydrolase/BAAT N-terminal region; This family consists of the amino termini of acyl-CoA thioester hydrolase and bile acid-CoA:amino acid N-acetyltransferase (BAAT). This region is not thought to contain the active site of either enzyme. Thioesterase isoforms have been identified in peroxisomes, cytoplasm and mitochondria, where they are thought to have distinct functions in lipid metabolism. For example, in peroxisomes, the hydrolase acts on bile-CoA esters."
Q#17010 - CGI_10015631 superfamily 192535 20 283 0.000216898 41.8126 cl18179 7TM_GPCR_Srsx superfamily - - Serpentine type 7TM GPCR chemoreceptor Srsx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srsx is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#17013 - CGI_10015634 superfamily 247757 565 679 3.22E-20 88.221 cl17203 Fer4_NifH superfamily - - "The Fer4_NifH superfamily contains a variety of proteins which share a common ATP-binding domain. Functionally, proteins in this superfamily use the energy from hydrolysis of NTP to transfer electron or ion." Q#17014 - CGI_10015635 superfamily 245225 516 956 6.64E-55 200.547 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily - - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein coupled receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine-binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. 
In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#17014 - CGI_10015635 superfamily 245225 41 469 2.31E-48 181.287 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily - - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein coupled receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine-binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. 
In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#17014 - CGI_10015635 superfamily 245225 1244 1418 6.83E-13 71.8511 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily N - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein coupled receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine-binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. 
In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#17014 - CGI_10015635 superfamily 242034 1657 1716 0.00215413 39.4472 cl00695 Lysine_decarbox superfamily N - "Possible lysine decarboxylase; The members of this family share a highly conserved motif PGGXGTXXE that is probably functionally important. This family includes proteins annotated as lysine decarboxylases, although the evidence for this is not clear." Q#17017 - CGI_10015638 superfamily 245596 108 412 0 568.801 cl11394 Glyco_tranf_GTA_type superfamily - - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. 
This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#17018 - CGI_10015639 superfamily 116538 649 779 5.34E-43 153.6 cl06801 Vps54 superfamily - - "Vps54-like protein; This family contains various proteins that are homologs of the yeast Vps54 protein, such as the rat homolog, the human homolog, and the mouse homolog. In yeast, Vps54 associates with Vps52 and Vps53 proteins to form a trimolecular complex that is involved in protein transport between Golgi, endosomal, and vacuolar compartments. All Vps54 homologs contain a coiled coil region (not found in the region featured in this family) and multiple dileucine motifs." Q#17019 - CGI_10015640 superfamily 241760 4 50 1.10E-18 79.644 cl00295 ZZ superfamily - - "Zinc finger, ZZ type. Zinc finger present in dystrophin, CBP/p300 and many other proteins. The ZZ motif coordinates one or two zinc ions and most likely participates in ligand binding or molecular scaffolding. Many proteins containing ZZ motifs have other zinc-binding motifs as well, and the majority serve as scaffolds in pathways involving acetyltransferase, protein kinase, or ubiquitin-related activity. ZZ proteins can be grouped into the following functional classes: chromatin modifying, cytoskeletal scaffolding, ubiquitin binding or conjugating, and membrane receptor or ion-channel modifying proteins." Q#17019 - CGI_10015640 superfamily 247804 64 108 4.66E-09 52.1926 cl17250 SANT superfamily - - "'SWI3, ADA2, N-CoR and TFIIIB' DNA-binding domains. 
Tandem copies of the domain bind telomeric DNA tandem repeats as part of the capping complex. Binding is sequence dependent for repeats which contain the G/C rich motif [C2-3 A (CA)1-6]. The domain is also found in regulatory transcriptional repressor complexes where it also binds DNA." Q#17019 - CGI_10015640 superfamily 203011 362 436 0.00056655 37.9569 cl04515 SWIRM superfamily - - SWIRM domain; This SWIRM domain is a small alpha-helical domain of about 85 amino acid residues found in chromosomal proteins. It contains a helix-turn helix motif and binds to DNA. Q#17020 - CGI_10015641 superfamily 243084 1361 1473 4.19E-35 131.749 cl02556 Bromodomain superfamily - - Bromodomain. Bromodomains are found in many chromatin-associated proteins and in nuclear histone acetyltransferases. They interact specifically with acetylated lysine. Q#17020 - CGI_10015641 superfamily 204507 22 122 6.90E-35 130.867 cl11169 WAC_Acf1_DNA_bd superfamily - - "ATP-utilising chromatin assembly and remodelling N-terminal; ACF (for ATP-utilising chromatin assembly and remodelling factor) is a chromatin-remodelling complex that catalyzes the ATP-dependent assembly of periodic nucleosome arrays. The WAC (WSTF/Acf1/cbp146) domain is an approximately 110-residue module present at the N-termini of Acf1-related proteins in a variety of organisms. The DNA-binding region of Acf1 includes the WAC domain, which is necessary for the efficient binding of ACF complex to DNA." Q#17020 - CGI_10015641 superfamily 247999 1131 1179 9.11E-16 74.0639 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. 
Q#17020 - CGI_10015641 superfamily 247999 1229 1272 3.67E-09 55.1892 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#17020 - CGI_10015641 superfamily 243137 402 461 1.52E-08 53.4054 cl02674 DDT superfamily - - "DDT domain; This domain is approximately 60 residues in length, and is predicted to be a DNA binding domain. The DDT domain is named after (DNA binding homeobox and Different Transcription factors). It is exclusively associated with nuclear domains, and is thought to be arranged into three alpha helices." Q#17021 - CGI_10015642 superfamily 247749 1 326 0 598.841 cl17195 LDH_MDH_like superfamily - - "NAD-dependent, lactate dehydrogenase-like, 2-hydroxycarboxylate dehydrogenase family; Members of this family include ubiquitous enzymes like L-lactate dehydrogenases (LDH), L-2-hydroxyisocaproate dehydrogenases, and some malate dehydrogenases (MDH). LDH catalyzes the last step of glycolysis in which pyruvate is converted to L-lactate. MDH is one of the key enzymes in the citric acid cycle, facilitating both the conversion of malate to oxaloacetate and replenishing levels of oxalacetate by reductive carboxylation of pyruvate. The LDH/MDH-like proteins are part of the NAD(P)-binding Rossmann fold superfamily, which includes a wide variety of protein families including the NAD(P)-binding domains of alcohol dehydrogenases, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate dehydrogenases, formate/glycerate dehydrogenases, siroheme synthases, 6-phosphogluconate dehydrogenases, aminoacid dehydrogenases, repressor rex, and NAD-binding potassium channel domains, among others." 
Q#17022 - CGI_10015643 superfamily 241868 4 45 5.55E-05 39.7851 cl00447 Nudix_Hydrolase superfamily C - "Nudix hydrolase is a superfamily of enzymes found in all three kingdoms of life, and it catalyzes the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+ for their activity. Members of this family are recognized by a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which forms a structural motif that functions as a metal binding and catalytic site. Substrates of nudix hydrolase include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance and "house-cleaning" enzymes. Substrate specificity is used to define child families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. This superfamily consists of at least nine families: IPP (isopentenyl diphosphate) isomerase, ADP ribose pyrophosphatase, mutT pyrophosphohydrolase, coenzyme-A pyrophosphatase, MTH1-7,8-dihydro-8-oxoguanine-triphosphatase, diadenosine tetraphosphate hydrolase, NADH pyrophosphatase, GDP-mannose hydrolase and the c-terminal portion of the mutY adenine glycosylase." 
Q#17023 - CGI_10015644 superfamily 218223 17 273 7.32E-09 54.6822 cl04698 Radial_spoke superfamily N - "Radial spokehead-like protein; This family includes the radial spoke head proteins RSP4 and RSP6 from Chlamydomonas reinhardtii, and several eukaryotic homologues, including mammalian RSHL1, the protein product of a familial ciliary dyskinesia candidate gene." Q#17024 - CGI_10015645 superfamily 243061 1023 1123 5.92E-21 92.0198 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#17024 - CGI_10015645 superfamily 243061 159 261 5.23E-17 80.4638 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#17024 - CGI_10015645 superfamily 243061 1847 1950 7.43E-09 56.321 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#17024 - CGI_10015645 superfamily 243035 2498 2595 0.00019338 42.9722 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. 
Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. 
Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#17025 - CGI_10015646 superfamily 220161 316 448 7.86E-22 90.8738 cl07783 Rubis-subs-bind superfamily - - "Rubisco LSMT substrate-binding; Members of this family adopt a multihelical structure, with an irregular array of long and short alpha-helices. They allow binding of the protein to substrate, such as the N-terminal tails of histones H3 and H4 and the large subunit of the Rubisco holoenzyme complex." Q#17026 - CGI_10015647 superfamily 243092 313 406 0.00994106 36.9292 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#17027 - CGI_10015648 superfamily 241572 30 128 3.21E-17 77.2788 cl00050 CYCLIN superfamily - - "Cyclin box fold. 
Protein binding domain functioning in cell-cycle and transcription control. Present in cyclins, TFIIB and Retinoblastoma (RB). The cyclins consist of 8 classes of cell cycle regulators that regulate cyclin dependent kinases (CDKs). TFIIB is a transcription factor that binds the TATA box. Cyclins, TFIIB and RB contain 2 copies of the domain." Q#17027 - CGI_10015648 superfamily 241572 144 197 0.00170024 37.2181 cl00050 CYCLIN superfamily C - "Cyclin box fold. Protein binding domain functioning in cell-cycle and transcription control. Present in cyclins, TFIIB and Retinoblastoma (RB). The cyclins consist of 8 classes of cell cycle regulators that regulate cyclin dependent kinases (CDKs). TFIIB is a transcription factor that binds the TATA box. Cyclins, TFIIB and RB contain 2 copies of the domain." Q#17028 - CGI_10015649 superfamily 247724 19 114 3.88E-11 56.816 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. 
Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#17030 - CGI_10000773 superfamily 247068 131 177 0.00198423 35.4031 cl15786 CA_like superfamily N - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#17031 - CGI_10001016 superfamily 245814 39 125 0.0084768 32.8625 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#17033 - CGI_10011408 superfamily 245205 19 96 6.78E-17 70.7297 cl09930 RPA_2b-aaRSs_OBF_like superfamily - - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." 
Q#17034 - CGI_10011409 superfamily 241613 24 59 4.24E-07 45.2754 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#17037 - CGI_10011412 superfamily 216423 72 337 9.02E-88 277.966 cl18367 Glyco_hydro_35 superfamily - - Glycosyl hydrolases family 35; Glycosyl hydrolases family 35. Q#17038 - CGI_10011413 superfamily 217740 16 190 1.29E-21 88.1873 cl18427 Scramblase superfamily - - Scramblase; Scramblase is palmitoylated and contains a potential protein kinase C phosphorylation site. Scramblase exhibits Ca2+-activated phospholipid scrambling activity in vitro. There are also possible SH3 and WW binding motifs. Scramblase is involved in the redistribution of phospholipids after cell activation or injury. 
Q#17039 - CGI_10011414 superfamily 247684 266 696 3.62E-80 266.065 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#17039 - CGI_10011414 superfamily 247684 33 260 2.38E-33 132.786 cl17037 NBD_sugar-kinase_HSP70_actin superfamily C - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#17040 - CGI_10011415 superfamily 247684 40 382 2.88E-63 212.908 cl17037 NBD_sugar-kinase_HSP70_actin superfamily C - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#17042 - CGI_10011417 superfamily 247684 40 471 2.05E-89 285.325 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#17045 - CGI_10011420 superfamily 241621 74 158 2.82E-12 61.6165 cl00116 PDGF superfamily - - "Platelet-derived and vascular endothelial growth factors (PDGF, VEGF) family domain; PDGF is a potent activator for cells of mesenchymal origin; PDGF-A and PDGF-B form AA and BB homodimers and an AB heterodimer; VEGF is a potent mitogen in embryonic and somatic angiogenesis with a unique specificity for vascular endothelial cells; VEGF forms homodimers and exists in 4 different isoforms; overall, the VEGF monomer resembles that of PDGF, but its N-terminal segment is helical rather than extended; the cysteine knot motif is a common feature of this domain" Q#17048 - CGI_10011423 superfamily 241572 51 135 6.06E-12 61.1004 cl00050 CYCLIN superfamily - - "Cyclin box fold. Protein binding domain functioning in cell-cycle and transcription control. Present in cyclins, TFIIB and Retinoblastoma (RB).The cyclins consist of 8 classes of cell cycle regulators that regulate cyclin dependent kinases (CDKs). TFIIB is a transcription factor that binds the TATA box. Cyclins, TFIIB and RB contain 2 copies of the domain." Q#17049 - CGI_10011424 superfamily 202474 145 224 7.47E-31 115.829 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#17049 - CGI_10011424 superfamily 217293 49 126 2.54E-12 63.0355 cl03788 Neur_chan_LBD superfamily N - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. 
Q#17051 - CGI_10011426 superfamily 247755 230 484 3.45E-165 479.555 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#17051 - CGI_10011426 superfamily 247755 500 744 3.95E-143 422.587 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#17051 - CGI_10011426 superfamily 217870 158 189 1.30E-06 46.3361 cl04386 RLI superfamily - - "Possible Fer4-like domain in RNase L inhibitor, RLI; Possible metal-binding domain in endoribonuclease RNase L inhibitor. Found at the N-terminal end of RNase L inhibitor proteins, adjacent to the 4Fe-4S binding domain, fer4, pfam00037. Also often found adjacent to the DUF367 domain pfam04034 in uncharacterized proteins. 
The RNase L system plays a major role in the anti-viral and anti-proliferative activities of interferons, and could possibly play a more general role in the regulation of RNA stability in mammalian cells. Inhibitory activity requires concentration-dependent association of RLI with RNase L." Q#17051 - CGI_10011426 superfamily 243197 202 223 0.000151371 40.3084 cl02805 Fer4 superfamily - - "4Fe-4S binding domain; Superfamily includes proteins containing domains which bind to iron-sulfur clusters. Members include bacterial ferredoxins, various dehydrogenases, and various reductases. Structure of the domain is an alpha-antiparallel beta sandwich." Q#17053 - CGI_10011428 superfamily 216056 70 220 1.23E-30 116.255 cl08279 Peptidase_M16 superfamily - - Insulinase (Peptidase family M16); Insulinase (Peptidase family M16). Q#17053 - CGI_10011428 superfamily 218490 225 429 5.42E-22 93.3099 cl08432 Peptidase_M16_C superfamily - - "Peptidase M16 inactive domain; Peptidase M16 consists of two structurally related domains. One is the active peptidase, whereas the other is inactive. The two domains hold the substrate like a clamp." Q#17054 - CGI_10001319 superfamily 247724 46 215 1.57E-16 76.0461 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#17055 - CGI_10001321 superfamily 245201 1 125 4.03E-45 149.535 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#17056 - CGI_10001343 superfamily 245029 43 90 0.00868515 31.8492 cl09190 MAPEG superfamily N - "MAPEG family; This family is has been called MAPEG (Membrane Associated Proteins in Eicosanoid and Glutathione metabolism). It includes proteins such as Prostaglandin E synthase. This enzyme catalyzes the synthesis of PGE2 from PGH2 (produced by cyclooxygenase from arachidonic acid). Because of structural similarities in the active sites of FLAP, LTC4 synthase and PGE synthase, substrates for each enzyme can compete with one another and modulate synthetic activity." 
Q#17057 - CGI_10001344 superfamily 245029 18 130 3.48E-19 78.0732 cl09190 MAPEG superfamily - - "MAPEG family; This family is has been called MAPEG (Membrane Associated Proteins in Eicosanoid and Glutathione metabolism). It includes proteins such as Prostaglandin E synthase. This enzyme catalyzes the synthesis of PGE2 from PGH2 (produced by cyclooxygenase from arachidonic acid). Because of structural similarities in the active sites of FLAP, LTC4 synthase and PGE synthase, substrates for each enzyme can compete with one another and modulate synthetic activity." Q#17059 - CGI_10002025 superfamily 247068 131 223 1.47E-20 87.7541 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#17059 - CGI_10002025 superfamily 247068 502 585 1.48E-14 70.4201 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#17059 - CGI_10002025 superfamily 247068 244 335 2.07E-14 70.0349 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#17059 - CGI_10002025 superfamily 247068 37 122 3.52E-09 55.0122 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#17063 - CGI_10007320 superfamily 248213 60 107 0.000262598 38.3249 cl17659 DivIC superfamily C - Septum formation initiator; DivIC from B. subtilis is necessary for both vegetative and sporulation septum formation. These proteins are mainly composed of an amino terminal coiled-coil. Q#17066 - CGI_10007323 superfamily 246680 598 676 0.000368665 39.241 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#17069 - CGI_10007326 superfamily 193257 228 429 2.47E-17 82.7259 cl15086 AAA_9 superfamily - - "ATP-binding dynein motor region D5; The 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This particular family is the D5 ATP-binding region of the motor, but has lost its P-loop." 
Q#17070 - CGI_10007327 superfamily 193253 225 470 1.20E-19 88.9405 cl15084 MT superfamily C - "Microtubule-binding stalk of dynein motor; the 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This family is the region between D4 and D5 and is the two predicted alpha-helical coiled coil segments that form the stalk supporting the ATP-sensitive microtubule binding component." Q#17070 - CGI_10007327 superfamily 193256 32 193 1.34E-13 69.5912 cl18189 AAA_8 superfamily N - "P-loop containing dynein motor region D4; The 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This particular family is the D4 ATP-binding region of the motor." Q#17071 - CGI_10007328 superfamily 222647 38 88 0.00146934 36.2458 cl16771 FlxA superfamily NC - "FlxA-like protein; This family includes FlxA from E. coli. The expression of FlxA is regulated by the FliA sigma factor, a transcription factor specific for class 3 flagellar operons. However FlxA is not required for flagellar function or formation." Q#17072 - CGI_10007329 superfamily 193251 824 1113 1.83E-26 111.181 cl18188 AAA_7 superfamily - - "P-loop containing dynein motor region D3; the 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. 
The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This particular family is the D3 and is an ATP binding site." Q#17072 - CGI_10007329 superfamily 193256 1214 1306 8.79E-05 44.5532 cl18189 AAA_8 superfamily C - "P-loop containing dynein motor region D4; The 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This particular family is the D4 ATP-binding region of the motor." Q#17073 - CGI_10002099 superfamily 241600 1 86 2.45E-37 131.209 cl00085 FReD superfamily NC - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#17073 - CGI_10002099 superfamily 242566 147 182 4.02E-05 40.0766 cl01536 UPF0154 superfamily C - Uncharacterized protein family (UPF0154); This family contains a set of short bacterial proteins of unknown function. 
Q#17074 - CGI_10002100 superfamily 243072 413 532 3.88E-25 100.921 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#17074 - CGI_10002100 superfamily 243072 128 248 1.31E-21 90.9058 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#17074 - CGI_10002100 superfamily 243072 317 468 1.29E-20 88.2094 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#17074 - CGI_10002100 superfamily 243072 33 184 6.77E-15 71.6458 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#17075 - CGI_10002481 superfamily 243119 101 145 0.000114794 37.7985 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#17077 - CGI_10010175 superfamily 245816 280 399 3.55E-06 45.2994 cl11964 CYTH-like_Pase superfamily N - "CYTH-like (also known as triphosphate tunnel metalloenzyme (TTM)-like) Phosphatases; CYTH-like superfamily enzymes hydrolyze triphosphate-containing substrates and require metal cations as cofactors. They have a unique active site located at the center of an eight-stranded antiparallel beta barrel tunnel (the triphosphate tunnel). The name CYTH originated from the gene designation for bacterial class IV adenylyl cyclases (CyaB), and from thiamine triphosphatase. Class IV adenylate cyclases catalyze the conversion of ATP to 3',5'-cyclic AMP (cAMP) and PPi. Thiamine triphosphatase is a soluble cytosolic enzyme which converts thiamine triphosphate to thiamine diphosphate. This domain superfamily also contains RNA triphosphatases, membrane-associated polyphosphate polymerases, tripolyphosphatases, nucleoside triphosphatases, nucleoside tetraphosphatases and other proteins with unknown functions." Q#17079 - CGI_10010177 superfamily 243166 44 254 1.29E-12 63.4666 cl02759 TRAM_LAG1_CLN8 superfamily - - TLC domain; TLC domain. 
Q#17081 - CGI_10010179 superfamily 217062 24 268 9.16E-43 149.727 cl12266 Branch superfamily - - "Core-2/I-Branching enzyme; This is a family of two different beta-1,6-N-acetylglucosaminyltransferase enzymes, I-branching enzyme and core-2 branching enzyme . I-branching enzyme is responsible for the production of the blood group I-antigen during embryonic development. Core-2 branching enzyme forms crucial side-chain branches in O-glycans." Q#17082 - CGI_10010180 superfamily 247683 33 83 0.000146417 40.1387 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#17084 - CGI_10010182 superfamily 243100 43 91 4.30E-07 46.4529 cl02576 B_zip1 superfamily - - "basic leucine zipper DNA-binding and multimerization region of GCN4 and related proteins; Basic leucine zipper (bZIP) transcription factors act in networks of homo- and hetero-dimers in the regulation in a diverse set of cellular pathways. Classical leucine zippers have alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. 
Dimerization creates a pair of basic regions that bind DNA and undergo conformational change. GCN4 was identified in Saccharomyces cerevisiae from mutations in a deficiency in activation with the general amino acid control pathway. GCN4 encodes a trans-activator of amino acid biosynthetic genes containing 2 acidic activation domains and a C-terminal bZIP domain, comprised of a basic alpha-helical DNA-binding region and a coiled-coil dimerization region." Q#17085 - CGI_10001250 superfamily 241563 61 97 0.00332873 36.6884 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#17087 - CGI_10012883 superfamily 243092 28 270 6.99E-30 117.821 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 
copies of the repeat are present in this alignment." Q#17089 - CGI_10012885 superfamily 243045 113 165 8.25E-10 56.8727 cl02459 PAS superfamily C - "PAS domain; PAS motifs appear in archaea, eubacteria and eukarya. Probably the most surprising identification of a PAS domain was that in EAG-like K+-channels. PAS domains have been found to bind ligands, and to act as sensors for light and oxygen in signal transduction." Q#17089 - CGI_10012885 superfamily 241596 30 71 1.19E-08 52.6015 cl00081 HLH superfamily - - "Helix-loop-helix domain, found in specific DNA- binding proteins that act as transcription factors; 60-100 amino acids long. A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE 5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved in both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#17089 - CGI_10012885 superfamily 243045 278 356 2.29E-06 46.4723 cl02459 PAS superfamily N - "PAS domain; PAS motifs appear in archaea, eubacteria and eukarya. Probably the most surprising identification of a PAS domain was that in EAG-like K+-channels. 
PAS domains have been found to bind ligands, and to act as sensors for light and oxygen in signal transduction." Q#17090 - CGI_10012886 superfamily 248264 45 214 3.86E-17 77.6625 cl17710 DDE_4 superfamily - - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#17092 - CGI_10012888 superfamily 241596 25 78 8.54E-11 56.0683 cl00081 HLH superfamily - - "Helix-loop-helix domain, found in specific DNA- binding proteins that act as transcription factors; 60-100 amino acids long. A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE 5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved in both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." 
Q#17092 - CGI_10012888 superfamily 248097 155 283 1.67E-15 70.3718 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#17093 - CGI_10012889 superfamily 248097 124 234 6.93E-18 77.6906 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#17094 - CGI_10012890 superfamily 248097 17 144 1.66E-19 79.2314 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#17096 - CGI_10012892 superfamily 241570 558 668 1.47E-17 80.4478 cl00047 CAP_ED superfamily - - "effector domain of the CAP family of transcription factors; members include CAP (or cAMP receptor protein (CRP)), which binds cAMP, FNR (fumarate and nitrate reduction), which uses an iron-sulfur cluster to sense oxygen) and CooA, a heme containing CO sensor. In all cases binding of the effector leads to conformational changes and the ability to activate transcription. Cyclic nucleotide-binding domain similar to CAP are also present in cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) and vertebrate cyclic nucleotide-gated ion-channels. 
Cyclic nucleotide-monophosphate binding domain; proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues; the best studied is the prokaryotic catabolite gene activator, CAP, where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure; three conserved glycine residues are thought to be essential for maintenance of the structural integrity of the beta-barrel; CooA is a homodimeric transcription factor that belongs to CAP family; cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain; cAPK's are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain; cGPK's are single chain enzymes that include the two copies of the domain in their N-terminal section; also found in vertebrate cyclic nucleotide-gated ion-channels" Q#17096 - CGI_10012892 superfamily 243045 26 124 3.85E-07 49.5539 cl02459 PAS superfamily - - "PAS domain; PAS motifs appear in archaea, eubacteria and eukarya. Probably the most surprising identification of a PAS domain was that in EAG-like K+-channels. PAS domains have been found to bind ligands, and to act as sensors for light and oxygen in signal transduction." Q#17096 - CGI_10012892 superfamily 219619 428 477 7.82E-07 47.9728 cl18518 Ion_trans_2 superfamily N - Ion channel; This family includes the two membrane helix type ion channels found in bacteria. 
Q#17098 - CGI_10012894 superfamily 245596 704 903 2.84E-78 260.703 cl11394 Glyco_tranf_GTA_type superfamily - - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#17098 - CGI_10012894 superfamily 245596 581 638 2.13E-08 55.3916 cl11394 Glyco_tranf_GTA_type superfamily C - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. 
This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#17099 - CGI_10012895 superfamily 241698 8 208 2.22E-74 225.606 cl00220 cysteine_hydrolases superfamily - - "Cysteine hydrolases; This family contains amidohydrolases, like CSHase (N-carbamoylsarcosine amidohydrolase), involved in creatine metabolism and nicotinamidase, converting nicotinamide to nicotinic acid and ammonia in the pyridine nucleotide cycle. It also contains isochorismatase, an enzyme that catalyzes the conversion of isochorismate to 2,3-dihydroxybenzoate and pyruvate, via the hydrolysis of the vinyl ether bond, and other related enzymes with unknown function." Q#17100 - CGI_10012896 superfamily 245206 36 305 6.67E-89 270.303 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. 
Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#17101 - CGI_10012897 superfamily 247692 19 370 1.34E-46 164.001 cl17068 AFD_class_I superfamily - - "Adenylate forming domain, Class I; This family includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain." Q#17103 - CGI_10012899 superfamily 247792 17 76 0.00173762 37.04 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#17103 - CGI_10012899 superfamily 197380 466 627 7.62E-06 46.4609 cl16909 SdiA-regulated superfamily - - "SdiA-regulated; This model represents a bacterial family of proteins that may be regulated by SdiA, a member of the LuxR family of transcriptional regulators. The C-terminal domain included in the alignment forms a five-bladed beta-propeller structure. 
The X-ray structure of Escherichia coli yjiK (C-terminal domain) exhibits binding of calcium ions (Ca++) in what appears to be an evolutionarily conserved site. Sequence analysis suggests a distant relationship to proteins that are characterized as containing NHL-repeats. The latter also form beta-propeller structures, with several examples known to form six-bladed beta-propellers. Several of the six-bladed beta-propellers containing NHL repeats have been characterized functionally, including members with enzymatic functions that are dependent on metal ions. No functional characterization is available for this family of five-bladed propellers, though." Q#17103 - CGI_10012899 superfamily 241563 173 207 8.32E-05 40.9256 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#17104 - CGI_10012900 superfamily 241962 1110 1211 8.10E-39 141.692 cl00584 CutA1 superfamily - - "CutA1 divalent ion tolerance protein; Several gene loci with a possible involvement in cellular tolerance to copper have been identified. One such locus in eubacteria and archaebacteria, cutA, is thought to be involved in cellular tolerance to a wide variety of divalent cations other than copper. The cutA locus consists of two operons, of one and two genes. The CutA1 protein is a cytoplasmic protein, encoded by the single-gene operon and has been linked to divalent cation tolerance. It has no recognised structural motifs. This family also contains putative proteins from eukaryotes (human and Drosophila)." Q#17105 - CGI_10012901 superfamily 242494 10 146 1.85E-28 103.929 cl01418 Cupin_5 superfamily - - Cupin superfamily (DUF985); Family of uncharacterized proteins found in bacteria and eukaryotes that belongs to the Cupin superfamily. 
Q#17106 - CGI_10012902 superfamily 243034 423 530 1.53E-10 58.9308 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#17106 - CGI_10012902 superfamily 243034 198 300 6.68E-10 57.39 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of 
TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#17106 - CGI_10012902 superfamily 243034 500 605 2.87E-09 55.464 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 
subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#17106 - CGI_10012902 superfamily 243034 270 368 9.05E-07 47.76 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#17106 - CGI_10012902 superfamily 243034 362 449 0.00175759 37.7448 cl02429 TPR superfamily N - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in 
chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#17107 - CGI_10012903 superfamily 247907 73 184 4.99E-05 40.1244 cl17353 LamG superfamily - - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." Q#17107 - CGI_10012903 superfamily 241811 1 19 0.000241306 37.1801 cl00355 Ribosomal_S14 superfamily N - Ribosomal protein S14p/S29e; This family includes both ribosomal S14 from prokaryotes and S29 from eukaryotes. Q#17110 - CGI_10002249 superfamily 241563 18 50 7.12E-05 40.3983 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#17111 - CGI_10002250 superfamily 245596 22 236 3.10E-69 214.368 cl11394 Glyco_tranf_GTA_type superfamily - - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#17112 - CGI_10008577 superfamily 241563 210 241 3.74E-07 47.2819 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#17112 - CGI_10008577 superfamily 247792 69 120 3.99E-07 47.4404 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#17113 - CGI_10008578 superfamily 219619 135 187 6.48E-16 71.4699 cl18518 Ion_trans_2 superfamily N - Ion channel; This family includes the two membrane helix type ion channels found in bacteria. Q#17113 - CGI_10008578 superfamily 219619 259 336 1.98E-12 61.8399 cl18518 Ion_trans_2 superfamily - - Ion channel; This family includes the two membrane helix type ion channels found in bacteria. Q#17114 - CGI_10008579 superfamily 241749 28 169 1.48E-31 112.479 cl00280 globin_like superfamily - - superfamily containing globins and truncated hemoglobins Q#17115 - CGI_10008580 superfamily 243090 316 428 7.09E-44 150.789 cl02565 RGS superfamily - - "Regulator of G protein signaling (RGS) domain superfamily; The RGS domain is an essential part of the Regulator of G-protein Signaling (RGS) protein family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play critical regulatory roles as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. While inactive, G-alpha-subunits bind GDP, which is released and replaced by GTP upon agonist activation. 
GTP binding leads to dissociation of the alpha-subunit and the beta-gamma-dimer, allowing them to interact with effectors molecules and propagate signaling cascades associated with cellular growth, survival, migration, and invasion. Deactivation of the G-protein signaling controlled by the RGS domain accelerates GTPase activity of the alpha subunit by hydrolysis of GTP to GDP, which results in the reassociation of the alpha-subunit with the beta-gamma-dimer and thereby inhibition of downstream activity. As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. RGS proteins are also involved in apoptosis and cell proliferation, as well as modulation of cardiac development. Several RGS proteins can fine-tune immune responses, while others play important roles in neuronal signals modulation. Some RGS proteins are principal elements needed for proper vision." Q#17116 - CGI_10008581 superfamily 247710 106 467 4.92E-150 435.773 cl17114 metX superfamily - - homoserine O-acetyltransferase; Provisional Q#17117 - CGI_10002835 superfamily 199166 210 377 1.11E-18 83.1456 cl15308 AMN1 superfamily - - "Antagonist of mitotic exit network protein 1; Amn1 has been functionally characterized in Saccharomyces cerevisiae as a component of the Antagonist of MEN pathway (AMEN). The AMEN network is activated by MEN (mitotic exit network) via an active Cdc14, and in turn switches off MEN. Amn1 constitutes one of the alternative mechanisms by which MEN may be disrupted. Specifically, Amn1 binds Tem1 (Termination of M-phase, a GTPase that belongs to the RAS superfamily), and disrupts its association with Cdc15, the primary downstream target. Amn1 is a leucine-rich repeat (LRR) protein, with 12 repeats in the S. cerevisiae ortholog. 
As a negative regulator of the signal transduction pathway MEN, overexpression of AMN1 slows the growth of wild type cells. The function of the vertebrate members of this family has not been determined experimentally, they have fewer LRRs that determine the extent of this model." Q#17117 - CGI_10002835 superfamily 199166 97 250 4.00E-17 78.9084 cl15308 AMN1 superfamily - - "Antagonist of mitotic exit network protein 1; Amn1 has been functionally characterized in Saccharomyces cerevisiae as a component of the Antagonist of MEN pathway (AMEN). The AMEN network is activated by MEN (mitotic exit network) via an active Cdc14, and in turn switches off MEN. Amn1 constitutes one of the alternative mechanisms by which MEN may be disrupted. Specifically, Amn1 binds Tem1 (Termination of M-phase, a GTPase that belongs to the RAS superfamily), and disrupts its association with Cdc15, the primary downstream target. Amn1 is a leucine-rich repeat (LRR) protein, with 12 repeats in the S. cerevisiae ortholog. As a negative regulator of the signal transduction pathway MEN, overexpression of AMN1 slows the growth of wild type cells. The function of the vertebrate members of this family has not been determined experimentally, they have fewer LRRs that determine the extent of this model." Q#17117 - CGI_10002835 superfamily 243074 12 56 1.82E-08 50.5829 cl02535 F-box-like superfamily - - F-box-like; This is an F-box-like family. Q#17118 - CGI_10002836 superfamily 247675 29 197 2.40E-67 211.785 cl17011 Arginase_HDAC superfamily C - "Arginase-like and histone-like hydrolases; Arginase-like/histone-like hydrolase superfamily includes metal-dependent enzymes that belong to Arginase-like amidino hydrolase family and histone/histone-like deacetylase class I, II, IV family, respectively. These enzymes catalyze hydrolysis of amide bond. 
Arginases are known to be involved in control of cellular levels of arginine and ornithine, in histidine and arginine degradation and in clavulanic acid biosynthesis. Deacetylases play a role in signal transduction through histone and/or other protein modification and can repress/activate transcription of a number of different genes. They participate in different cellular processes including cell cycle regulation, DNA damage response, embryonic development, cytokine signaling important for immune response and post-translational control of the acetyl coenzyme A synthetase. Mammalian histone deacetylases are known to be involved in progression of different tumors. Specific inhibitors of mammalian histone deacetylases are an emerging class of promising novel anticancer drugs." Q#17119 - CGI_10002497 superfamily 247692 75 278 5.54E-30 116.237 cl17068 AFD_class_I superfamily N - "Adenylate forming domain, Class I; This family includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain." Q#17119 - CGI_10002497 superfamily 247692 1 44 2.99E-06 46.87 cl17068 AFD_class_I superfamily C - "Adenylate forming domain, Class I; This family includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. 
The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain." Q#17122 - CGI_10003326 superfamily 248264 150 295 7.56E-38 133.131 cl17710 DDE_4 superfamily - - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#17122 - CGI_10003326 superfamily 243161 1 52 9.17E-08 48.5446 cl02739 THAP superfamily N - "THAP domain; The THAP domain is a putative DNA-binding domain (DBD) and probably also binds a zinc ion. It features the conserved C2CH architecture (consensus sequence: Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His). Other universal features include the location of the domain at the N-termini of proteins, its size of about 90 residues, a C-terminal AVPTIF box and several other conserved residues. Orthologues of the human THAP domain have been identified in other vertebrates and probably worms and flies, but not in other eukaryotes or any prokaryotes." Q#17123 - CGI_10007788 superfamily 243072 103 224 9.32E-35 124.033 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#17123 - CGI_10007788 superfamily 243072 165 288 4.88E-29 108.625 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#17123 - CGI_10007788 superfamily 243072 9 158 1.95E-26 101.306 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#17124 - CGI_10007789 superfamily 247856 175 223 0.000300438 37.9125 cl17302 EFh superfamily N - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#17124 - CGI_10007789 superfamily 247999 90 147 0.00314385 34.7736 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. 
Q#17125 - CGI_10007790 superfamily 247856 106 150 0.000132098 38.2977 cl17302 EFh superfamily N - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#17125 - CGI_10007790 superfamily 247856 133 187 0.000495151 36.7569 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#17126 - CGI_10007791 superfamily 217394 25 462 3.60E-134 410.994 cl08387 Alg6_Alg8 superfamily - - "ALG6, ALG8 glycosyltransferase family; N-linked (asparagine-linked) glycosylation of proteins is mediated by a highly conserved pathway in eukaryotes, in which a lipid (dolichol phosphate)-linked oligosaccharide is assembled at the endoplasmic reticulum membrane prior to the transfer of the oligosaccharide moiety to the target asparagine residues. This oligosaccharide is composed of Glc(3)Man(9)GlcNAc(2). The addition of the three glucose residues is the final series of steps in the synthesis of the oligosaccharide precursor. Alg6 transfers the first glucose residue, and Alg8 transfers the second one. In the human alg6 gene, a C->T transition, which causes Ala333 to be replaced with Val, has been identified as the cause of a congenital disorder of glycosylation, designated as type Ic OMIM:603147." 
Q#17126 - CGI_10007791 superfamily 220363 512 826 2.84E-82 269.192 cl15585 DUF2036 superfamily - - Uncharacterized conserved protein (DUF2036); This family of proteins includes members ranging in size from approximately 300 to 460 residues. There are a number of well-conserved domains along the length. Q#17128 - CGI_10007793 superfamily 215827 145 314 2.13E-23 98.6947 cl02830 Tyrosinase superfamily - - Common central domain of tyrosinase; This family also contains polyphenol oxidases and some hemocyanins. Binds two copper ions via two sets of three histidines. This family is related to pfam00372. Q#17130 - CGI_10007795 superfamily 207662 28 102 1.17E-29 110.344 cl02596 NR_DBD_like superfamily - - "DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers; DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. It interacts with a specific DNA site upstream of the target gene and modulates the rate of transcriptional initiation. Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions, from development, reproduction, to homeostasis and metabolism in animals (metazoans). The family contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). Most nuclear receptors bind as homodimers or heterodimers to their target sites, which consist of two hexameric half-sites. Specificity is determined by the half-site sequence, the relative orientation of the half-sites and the number of spacer nucleotides between the half-sites. 
However, a growing number of nuclear receptors have been reported to bind to DNA as monomers." Q#17130 - CGI_10007795 superfamily 245599 311 458 4.47E-12 63.7811 cl11397 NR_LBD superfamily - - "The ligand binding domain of nuclear receptors, a family of ligand-activated transcription regulators; Ligand-binding domain (LBD) of nuclear receptor (NR): Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions in metazoans, from development, reproduction, to homeostasis and metabolism. The superfamily contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. The members of the family include receptors of steroids, thyroid hormone, retinoids, cholesterol by-products, lipids and heme. With few exceptions, NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD)." Q#17131 - CGI_10007796 superfamily 245864 30 485 7.58E-120 361.981 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. 
These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#17134 - CGI_10007799 superfamily 247905 1 69 1.07E-16 71.8852 cl17351 HELICc superfamily N - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#17135 - CGI_10002702 superfamily 241554 287 352 4.46E-12 62.6631 cl00019 Macro superfamily C - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." 
Q#17137 - CGI_10004862 superfamily 241832 54 196 1.93E-61 190.892 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#17138 - CGI_10004863 superfamily 241629 72 103 2.80E-05 42.122 cl00133 SCP superfamily N - "SCP: SCP-like extracellular protein domain, found in eukaryotes and prokaryotes. This family includes plant pathogenesis-related protein 1 (PR-1), which accumulates after infections with pathogens, and may act as an anti-fungal agent or be involved in cell wall loosening. This family also includes CRISPs, mammalian cysteine-rich secretory proteins, which combine SCP with a C-terminal cysteine rich domain, and allergen 5 from vespid venom. Roles for CRISP, in response to pathogens, fertilization, and sperm maturation have been proposed. One member, Tex31 from the venom duct of Conus textile, has been shown to possess proteolytic activity sensitive to serine protease inhibitors. The human GAPR-1 protein has been reported to dimerize, and such a dimer may form an active site containing a catalytic triad. SCP has also been proposed to be a Ca++ chelating serine protease. 
The Ca++-chelating function would fit with various signaling processes that members of this family, such as the CRISPs, are involved in, and is supported by sequence and structural evidence of a conserved pocket containing two histidines and a glutamate. It also may explain how helothermine, a toxic peptide secreted by the beaded lizard, blocks Ca++ transporting ryanodine receptors. Little is known about the biological roles of the bacterial and archaeal SCP domains." Q#17139 - CGI_10004864 superfamily 241629 159 300 1.94E-29 111.639 cl00133 SCP superfamily - - "SCP: SCP-like extracellular protein domain, found in eukaryotes and prokaryotes. This family includes plant pathogenesis-related protein 1 (PR-1), which accumulates after infections with pathogens, and may act as an anti-fungal agent or be involved in cell wall loosening. This family also includes CRISPs, mammalian cysteine-rich secretory proteins, which combine SCP with a C-terminal cysteine rich domain, and allergen 5 from vespid venom. Roles for CRISP, in response to pathogens, fertilization, and sperm maturation have been proposed. One member, Tex31 from the venom duct of Conus textile, has been shown to possess proteolytic activity sensitive to serine protease inhibitors. The human GAPR-1 protein has been reported to dimerize, and such a dimer may form an active site containing a catalytic triad. SCP has also been proposed to be a Ca++ chelating serine protease. The Ca++-chelating function would fit with various signaling processes that members of this family, such as the CRISPs, are involved in, and is supported by sequence and structural evidence of a conserved pocket containing two histidines and a glutamate. It also may explain how helothermine, a toxic peptide secreted by the beaded lizard, blocks Ca++ transporting ryanodine receptors. Little is known about the biological roles of the bacterial and archaeal SCP domains." 
Q#17140 - CGI_10004865 superfamily 241629 89 225 3.63E-33 122.629 cl00133 SCP superfamily - - "SCP: SCP-like extracellular protein domain, found in eukaryotes and prokaryotes. This family includes plant pathogenesis-related protein 1 (PR-1), which accumulates after infections with pathogens, and may act as an anti-fungal agent or be involved in cell wall loosening. This family also includes CRISPs, mammalian cysteine-rich secretory proteins, which combine SCP with a C-terminal cysteine rich domain, and allergen 5 from vespid venom. Roles for CRISP, in response to pathogens, fertilization, and sperm maturation have been proposed. One member, Tex31 from the venom duct of Conus textile, has been shown to possess proteolytic activity sensitive to serine protease inhibitors. The human GAPR-1 protein has been reported to dimerize, and such a dimer may form an active site containing a catalytic triad. SCP has also been proposed to be a Ca++ chelating serine protease. The Ca++-chelating function would fit with various signaling processes that members of this family, such as the CRISPs, are involved in, and is supported by sequence and structural evidence of a conserved pocket containing two histidines and a glutamate. It also may explain how helothermine, a toxic peptide secreted by the beaded lizard, blocks Ca++ transporting ryanodine receptors. Little is known about the biological roles of the bacterial and archaeal SCP domains." Q#17141 - CGI_10004866 superfamily 148147 69 237 6.03E-10 55.6727 cl05718 Anemone_cytotox superfamily - - "Sea anemone cytotoxic protein; Sea anemones are a rich source of cytotoxic proteins. Cytolysins comprise a group of more than 30 highly basic proteins with molecular masses of about 20 kDa. Cytolysins isolated from the sea anemone, Heteractis magnifica, include magnificalysin I (HMg I), magnificalysin II (HMg II) and Heteractis magnifica toxin (HMgtxn). These are highly homologous at their N-terminals. 
HMg I and II have molecular masses of approximately 19 kDa, and pI values of 9.4 and 10.0, respectively. Cytolysins isolated from other sea anemones Actinia tenebrosa (Tenebrosin-C, TN-C), Actinia equina (Equinatoxin, EqT) and Stichodactyla helianthus (ShC) exhibit pore-forming, haemolytic, cytotoxic, and heart stimulatory activities." Q#17142 - CGI_10004867 superfamily 241763 137 392 4.31E-74 232.129 cl00298 Peptidase_C1 superfamily - - "C1 Peptidase family (MEROPS database nomenclature), also referred to as the papain family; composed of two subfamilies of cysteine peptidases (CPs), C1A (papain) and C1B (bleomycin hydrolase). Papain-like enzymes are mostly endopeptidases with some exceptions like cathepsins B, C, H and X, which are exopeptidases. Papain-like CPs have different functions in various organisms. Plant CPs are used to mobilize storage proteins in seeds while mammalian CPs are primarily lysosomal enzymes responsible for protein degradation in the lysosome. Papain-like CPs are synthesized as inactive proenzymes with N-terminal propeptide regions, which are removed upon activation. Bleomycin hydrolase (BH) is a CP that detoxifies bleomycin by hydrolysis of an amide group. It acts as a carboxypeptidase on its C-terminus to convert itself into an aminopeptidase and peptide ligase. BH is found in all tissues in mammals as well as in many other eukaryotes. It forms a hexameric ring barrel structure with the active sites imbedded in the central channel. Some members of the C1 family are proteins classified as non-peptidase homologs which lack peptidase activity or have missing active site residues." Q#17142 - CGI_10004867 superfamily 244586 54 110 8.99E-16 71.1206 cl07031 Inhibitor_I29 superfamily - - Cathepsin propeptide inhibitor domain (I29); This domain is found at the N-terminus of some C1 peptidases such as Cathepsin L where it acts as a propeptide. 
There are also a number of proteins that are composed solely of multiple copies of this domain such as the peptidase inhibitor salarin. This family is classified as I29 by MEROPS. Q#17143 - CGI_10004112 superfamily 247775 219 514 8.52E-38 142.724 cl17221 ArsB_NhaD_permease superfamily N - "Anion permease ArsB/NhaD. These permeases have been shown to translocate sodium, arsenate, antimonite, sulfate and organic anions across biological membranes in all three kingdoms of life. A typical anion permease contains 8-13 transmembrane helices and can function either independently as a chemiosmotic transporter or as a channel-forming subunit of an ATP-driven anion pump." Q#17143 - CGI_10004112 superfamily 247775 26 145 5.86E-17 81.4773 cl17221 ArsB_NhaD_permease superfamily C - "Anion permease ArsB/NhaD. These permeases have been shown to translocate sodium, arsenate, antimonite, sulfate and organic anions across biological membranes in all three kingdoms of life. A typical anion permease contains 8-13 transmembrane helices and can function either independently as a chemiosmotic transporter or as a channel-forming subunit of an ATP-driven anion pump." Q#17144 - CGI_10004113 superfamily 245206 22 243 3.99E-49 163.224 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. 
Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#17145 - CGI_10004114 superfamily 247856 96 157 4.51E-16 69.4989 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#17145 - CGI_10004114 superfamily 247856 22 84 1.01E-08 49.0833 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#17146 - CGI_10004115 superfamily 247856 26 88 5.56E-12 57.5577 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#17146 - CGI_10004115 superfamily 247856 100 158 1.05E-10 54.0909 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. 
EF-hands tend to occur in pairs or higher copy numbers." Q#17147 - CGI_10003182 superfamily 219677 41 73 0.000100102 37.4172 cl18521 EGF_2 superfamily - - EGF-like domain; This family contains EGF domains found in a variety of extracellular proteins. Q#17150 - CGI_10003185 superfamily 216198 260 354 4.06E-11 58.8644 cl08295 Transglut_C superfamily - - "Transglutaminase family, C-terminal ig like domain; Transglutaminase family, C-terminal ig like domain. " Q#17150 - CGI_10003185 superfamily 216198 157 217 0.00270491 35.7525 cl08295 Transglut_C superfamily C - "Transglutaminase family, C-terminal ig like domain; Transglutaminase family, C-terminal ig like domain. " Q#17151 - CGI_10003186 superfamily 241764 4 79 7.54E-30 113.899 cl00299 MIT superfamily - - "MIT: domain contained within Microtubule Interacting and Trafficking molecules. The MIT domain is found in sorting nexins, the nuclear thiol protease PalBH, the AAA protein spastin and archaebacterial proteins with similar domain architecture, vacuolar sorting proteins and others. The molecular function of the MIT domain is unclear." Q#17151 - CGI_10003186 superfamily 247743 130 288 3.60E-18 82.5791 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#17151 - CGI_10003186 superfamily 201479 458 578 1.98E-30 116.96 cl02994 Transglut_N superfamily - - Transglutaminase family; Transglutaminase family. 
Q#17151 - CGI_10003186 superfamily 247916 727 770 3.32E-09 54.3111 cl17362 Transglut_core superfamily C - "Transglutaminase-like superfamily; This family includes animal transglutaminases and other bacterial proteins of unknown function. Sequence conservation in this superfamily primarily involves three motifs that centre around conserved cysteine, histidine, and aspartate residues that form the catalytic triad in the structurally characterized transglutaminase, the human blood clotting factor XIIIa'. On the basis of the experimentally demonstrated activity of the Methanobacterium phage pseudomurein endoisopeptidase, it is proposed that many, if not all, microbial homologues of the transglutaminases are proteases and that the eukaryotic transglutaminases have evolved from an ancestral protease." Q#17151 - CGI_10003186 superfamily 204202 368 403 9.23E-08 49.9465 cl07827 Vps4_C superfamily C - Vps4 C terminal oligomerisation domain; This domain is found at the C terminal of ATPase proteins involved in vacuolar sorting. It forms an alpha helix structure and is required for oligomerisation. Q#17152 - CGI_10002753 superfamily 243035 143 263 6.92E-22 88.8309 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#17152 - CGI_10002753 superfamily 243035 52 123 7.55E-07 46.459 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#17153 - CGI_10002754 superfamily 243072 82 202 2.44E-34 126.729 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#17153 - CGI_10002754 superfamily 243072 472 584 1.76E-13 67.7938 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#17153 - CGI_10002754 superfamily 243072 10 103 5.78E-09 54.3118 cl02529 ANK superfamily N - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#17153 - CGI_10002754 superfamily 243072 176 236 5.93E-09 54.3118 cl02529 ANK superfamily C - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#17154 - CGI_10002755 superfamily 245602 556 735 1.43E-22 100.009 cl11402 GH31 superfamily N - "The enzymes of glycosyl hydrolase family 31 (GH31) occur in prokaryotes, eukaryotes, and archaea with a wide range of hydrolytic activities, including alpha-glucosidase (glucoamylase and sucrase-isomaltase), alpha-xylosidase, 6-alpha-glucosyltransferase, 3-alpha-isomaltosyltransferase and alpha-1,4-glucan lyase. All GH31 enzymes cleave a terminal carbohydrate moiety from a substrate that varies considerably in size, depending on the enzyme, and may be either a starch or a glycoprotein. In most cases, the pyranose moiety recognized in subsite -1 of the substrate binding site is an alpha-D-glucose, though some GH31 family members show a preference for alpha-D-xylose. Several GH31 enzymes can accommodate both glucose and xylose and different levels of discrimination between the two have been observed. 
Most characterized GH31 enzymes are alpha-glucosidases. In mammals, GH31 members with alpha-glucosidase activity are implicated in at least three distinct biological processes. The lysosomal acid alpha-glucosidase (GAA) is essential for glycogen degradation and a deficiency or malfunction of this enzyme causes glycogen storage disease II, also known as pompe disease. In the endoplasmic reticulum, alpha-glucosidase II catalyzes the second step in the N-linked oligosaccharide processing pathway that constitutes part of the quality control system for glycoprotein folding and maturation. The intestinal enzymes sucrase-isomaltase (SI) and maltase-glucoamylase (MGAM) play key roles in the final stage of carbohydrate digestion, making alpha-glucosidase inhibitors useful in the treatment of type 2 diabetes. GH31 alpha-glycosidases are retaining enzymes that cleave their substrates via an acid/base-catalyzed, double-displacement mechanism involving a covalent glycosyl-enzyme intermediate. Two aspartic acid residues have been identified as the catalytic nucleophile and the acid/base, respectively." Q#17154 - CGI_10002755 superfamily 245602 350 606 8.96E-14 71.4778 cl11402 GH31 superfamily - - "The enzymes of glycosyl hydrolase family 31 (GH31) occur in prokaryotes, eukaryotes, and archaea with a wide range of hydrolytic activities, including alpha-glucosidase (glucoamylase and sucrase-isomaltase), alpha-xylosidase, 6-alpha-glucosyltransferase, 3-alpha-isomaltosyltransferase and alpha-1,4-glucan lyase. All GH31 enzymes cleave a terminal carbohydrate moiety from a substrate that varies considerably in size, depending on the enzyme, and may be either a starch or a glycoprotein. In most cases, the pyranose moiety recognized in subsite -1 of the substrate binding site is an alpha-D-glucose, though some GH31 family members show a preference for alpha-D-xylose. 
Several GH31 enzymes can accommodate both glucose and xylose and different levels of discrimination between the two have been observed. Most characterized GH31 enzymes are alpha-glucosidases. In mammals, GH31 members with alpha-glucosidase activity are implicated in at least three distinct biological processes. The lysosomal acid alpha-glucosidase (GAA) is essential for glycogen degradation and a deficiency or malfunction of this enzyme causes glycogen storage disease II, also known as pompe disease. In the endoplasmic reticulum, alpha-glucosidase II catalyzes the second step in the N-linked oligosaccharide processing pathway that constitutes part of the quality control system for glycoprotein folding and maturation. The intestinal enzymes sucrase-isomaltase (SI) and maltase-glucoamylase (MGAM) play key roles in the final stage of carbohydrate digestion, making alpha-glucosidase inhibitors useful in the treatment of type 2 diabetes. GH31 alpha-glycosidases are retaining enzymes that cleave their substrates via an acid/base-catalyzed, double-displacement mechanism involving a covalent glycosyl-enzyme intermediate. Two aspartic acid residues have been identified as the catalytic nucleophile and the acid/base, respectively." Q#17155 - CGI_10002756 superfamily 243035 64 180 2.00E-28 110.402 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. 
Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. 
Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#17155 - CGI_10002756 superfamily 243035 388 500 3.40E-25 101.157 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. 
Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#17155 - CGI_10002756 superfamily 243035 197 297 4.18E-15 72.2673 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#17155 - CGI_10002756 superfamily 243035 525 615 1.17E-14 71.1117 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#17156 - CGI_10004471 superfamily 217380 171 454 3.98E-89 285.373 cl18406 TTL superfamily - - "Tubulin-tyrosine ligase family; Tubulins and microtubules are subjected to several post-translational modifications of which the reversible detyrosination/tyrosination of the carboxy-terminal end of most alpha-tubulins has been extensively analysed. This modification cycle involves a specific carboxypeptidase and the activity of the tubulin-tyrosine ligase (TTL). The true physiological function of TTL has so far not been established. Tubulin-tyrosine ligase (TTL) catalyzes the ATP-dependent post-translational addition of a tyrosine to the carboxy terminal end of detyrosinated alpha-tubulin. In normally cycling cells, the tyrosinated form of tubulin predominates. However, in breast cancer cells, the detyrosinated form frequently predominates, with a correlation to tumour aggressiveness. On the other hand, 3-nitrotyrosine has been shown to be incorporated, by TTL, into the carboxy terminal end of detyrosinated alpha-tubulin. This reaction is not reversible by the carboxypeptidase enzyme. 
Cells cultured in 3-nitrotyrosine rich medium showed evidence of altered microtubule structure and function, including altered cell morphology, epithelial barrier dysfunction, and apoptosis. Bacterial homologs of TTL are predicted to form peptide tags. Some of these are fused to a 2-oxoglutarate Fe(II)-dependent dioxygenase domain." Q#17158 - CGI_10004473 superfamily 221488 145 344 6.54E-65 206.374 cl13659 zf-SNAP50_C superfamily - - "snRNA-activating protein of 50kDa MW C terminal; This domain family is found in eukaryotes, and is typically between 196 and 207 amino acids in length. There is a conserved CEH sequence motif. SNAP50 is part of the snRNA-activating protein complex which activates RNA polymerases II and III. There is a cysteine-histidine cluster which contains two possible zinc finger motifs." Q#17159 - CGI_10004474 superfamily 247723 303 378 5.33E-18 78.5068 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." 
Q#17159 - CGI_10004474 superfamily 247723 191 246 7.48E-10 55.6208 cl17169 RRM_SF superfamily C - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#17160 - CGI_10004475 superfamily 221490 480 516 1.08E-11 62.2398 cl13660 CAF1A superfamily C - "Chromatin assembly factor 1 subunit A; The CAF-1 or chromatin assembly factor-1 consists of three subunits, and this is the first, or A. The A domain is uniquely required for the progression of S phase in mouse cells, independent of its ability to promote histone deposition but dependent on its ability to interact with HP1 - heterochromatin protein 1-rich heterochromatin domains next to centromeres that are crucial for chromosome segregation during mitosis. This HP1-CAF-1 interaction module functions as a built-in replication control for heterochromatin, which, like a control barrier, has an impact on S-phase progression in addition to DNA-based checkpoints." 
Q#17162 - CGI_10004477 superfamily 247683 294 347 5.53E-34 120.442 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#17162 - CGI_10004477 superfamily 245835 25 250 4.66E-98 292.676 cl12013 BAR superfamily - - "The Bin/Amphiphysin/Rvs (BAR) domain, a dimerization module that binds membranes and detects membrane curvature; BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions including organelle biogenesis, membrane trafficking or remodeling, and cell division and migration. Mutations in BAR containing proteins have been linked to diseases and their inactivation in cells leads to altered membrane dynamics. A BAR domain with an additional N-terminal amphipathic helix (an N-BAR) can drive membrane curvature. These N-BAR domains are found in amphiphysins and endophilins, among others. 
BAR domains are also frequently found alongside domains that determine lipid specificity, such as the Pleckstrin Homology (PH) and Phox Homology (PX) domains which are present in beta centaurins (ACAPs and ASAPs) and sorting nexins, respectively. A FES-CIP4 Homology (FCH) domain together with a coiled coil region is called the F-BAR domain and is present in Pombe/Cdc15 homology (PCH) family proteins, which include Fes/Fes tyrosine kinases, PACSIN or syndapin, CIP4-like proteins, and srGAPs, among others. The Inverse (I)-BAR or IRSp53/MIM homology Domain (IMD) is found in multi-domain proteins, such as IRSp53 and MIM, that act as scaffolding proteins and transducers of a variety of signaling pathways that link membrane dynamics and the underlying actin cytoskeleton. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The I-BAR domain induces membrane protrusions in the opposite direction compared to classical BAR and F-BAR domains, which produce membrane invaginations. BAR domains that also serve as protein interaction domains include those of arfaptin and OPHN1-like proteins, among others, which bind to Rac and Rho GAP domains, respectively." Q#17165 - CGI_10003390 superfamily 248458 549 689 0.000628926 41.1453 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. 
MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#17165 - CGI_10003390 superfamily 233349 163 328 1.29E-17 84.8544 cl11688 GPH_sucrose superfamily C - "GPH family sucrose/H+ symporter; This model represents sucrose/proton symporters, found in plants, from the Glycoside-Pentoside-Hexuronide (GPH)/cation symporter family. These proteins are predicted to have 12 transmembrane domains. Members may export sucrose (e.g. SUT1, SUT4) from green parts to the phloem for long-distance transport or import sucrose (e.g SUT2) to sucrose sinks such as the tap root of the carrot." Q#17165 - CGI_10003390 superfamily 233349 460 587 9.33E-06 47.49 cl11688 GPH_sucrose superfamily NC - "GPH family sucrose/H+ symporter; This model represents sucrose/proton symporters, found in plants, from the Glycoside-Pentoside-Hexuronide (GPH)/cation symporter family. These proteins are predicted to have 12 transmembrane domains. Members may export sucrose (e.g. 
SUT1, SUT4) from green parts to the phloem for long-distance transport or import sucrose (e.g SUT2) to sucrose sinks such as the tap root of the carrot." Q#17166 - CGI_10003391 superfamily 222150 341 368 3.52E-06 44.6901 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#17166 - CGI_10003391 superfamily 222150 312 338 1.19E-05 43.1493 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#17166 - CGI_10003391 superfamily 222150 401 427 3.19E-05 41.9937 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#17166 - CGI_10003391 superfamily 222150 283 308 0.000797609 37.7565 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#17166 - CGI_10003391 superfamily 222150 373 398 0.00135806 36.9861 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#17167 - CGI_10003392 superfamily 247724 14 178 1.50E-97 285.789 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. 
The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#17167 - CGI_10003392 superfamily 243077 214 263 5.30E-13 61.7925 cl02542 DnaJ superfamily - - "DnaJ domain or J-domain. DnaJ/Hsp40 (heat shock protein 40) proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperonine family. Hsp40 proteins are characterized by the presence of a J domain, which mediates the interaction with Hsp70. They may contain other domains as well, and the architectures provide a means of classification." Q#17168 - CGI_10003393 superfamily 243072 17 93 4.56E-18 74.7274 cl02529 ANK superfamily C - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#17169 - CGI_10003394 superfamily 247792 16 58 5.87E-08 49.7516 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-(N/C/H)-X2-C-X(4-48)-C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#17170 - CGI_10003395 superfamily 207654 516 581 5.63E-28 109.069 cl02574 Annexin superfamily - - Annexin; This family of annexins also includes giardin that has been shown to function as an annexin. Q#17170 - CGI_10003395 superfamily 207654 444 509 3.40E-25 100.98 cl02574 Annexin superfamily - - Annexin; This family of annexins also includes giardin that has been shown to function as an annexin. Q#17170 - CGI_10003395 superfamily 207654 868 933 8.77E-25 99.8246 cl02574 Annexin superfamily - - Annexin; This family of annexins also includes giardin that has been shown to function as an annexin. Q#17170 - CGI_10003395 superfamily 207654 797 861 1.23E-22 93.6614 cl02574 Annexin superfamily - - Annexin; This family of annexins also includes giardin that has been shown to function as an annexin. Q#17170 - CGI_10003395 superfamily 207654 675 740 1.21E-20 88.2686 cl02574 Annexin superfamily - - Annexin; This family of annexins also includes giardin that has been shown to function as an annexin. Q#17170 - CGI_10003395 superfamily 207654 1027 1092 4.73E-19 83.6462 cl02574 Annexin superfamily - - Annexin; This family of annexins also includes giardin that has been shown to function as an annexin. 
Q#17170 - CGI_10003395 superfamily 207654 951 1017 4.82E-19 83.6462 cl02574 Annexin superfamily - - Annexin; This family of annexins also includes giardin that has been shown to function as an annexin. Q#17170 - CGI_10003395 superfamily 207654 608 665 2.55E-17 78.6386 cl02574 Annexin superfamily - - Annexin; This family of annexins also includes giardin that has been shown to function as an annexin. Q#17170 - CGI_10003395 superfamily 245008 40 102 8.65E-09 53.7168 cl09101 E_set superfamily - - "Early set domain associated with the catalytic domain of sugar utilizing enzymes at either the N or C terminus; The E or "early" set domains of sugar utilizing enzymes are associated with different types of catalytic domains at either the N-terminal or C-terminal end. These domains may be related to the immunoglobulin and/or fibronectin type III superfamilies. Members of this family include alpha amylase, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitobiase, and chitinase. A subset of these members were recently identified as members of the CBM48 (Carbohydrate Binding Module 48) family. Members of the CBM48 family include pullulanase, maltooligosyl trehalose synthase, starch branching enzyme, glycogen branching enzyme, glycogen debranching enzyme, isoamylase, and the beta subunit of AMP-activated protein kinase." Q#17171 - CGI_10003396 superfamily 202009 87 136 2.12E-17 73.3004 cl09271 NAC superfamily - - NAC domain; NAC domain. Q#17173 - CGI_10002350 superfamily 247856 157 220 3.45E-07 46.3869 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#17175 - CGI_10002352 superfamily 247684 7 411 6.69E-29 120.46 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#17178 - CGI_10008401 superfamily 247724 17 114 8.20E-06 42.8288 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. 
This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#17179 - CGI_10008402 superfamily 241874 4 551 0 632.292 cl00456 SLC5-6-like_sbd superfamily - - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. 
They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#17180 - CGI_10008403 superfamily 247724 60 217 7.06E-21 90.4106 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#17181 - CGI_10008404 superfamily 245716 125 147 4.82E-06 41.8461 cl11592 zf-CCCH superfamily - - Zinc finger C-x8-C-x5-C-x3-H type (and similar); Zinc finger C-x8-C-x5-C-x3-H type (and similar). Q#17182 - CGI_10008405 superfamily 243197 93 115 9.32E-06 39.9232 cl02805 Fer4 superfamily - - "4Fe-4S binding domain; Superfamily includes proteins containing domains which bind to iron-sulfur clusters. Members include bacterial ferredoxins, various dehydrogenases, and various reductases. 
Structure of the domain is an alpha-antiparallel beta sandwich." Q#17182 - CGI_10008405 superfamily 243197 58 77 0.00166843 33.76 cl02805 Fer4 superfamily - - "4Fe-4S binding domain; Superfamily includes proteins containing domains which bind to iron-sulfur clusters. Members include bacterial ferredoxins, various dehydrogenases, and various reductases. Structure of the domain is an alpha-antiparallel beta sandwich." Q#17183 - CGI_10008406 superfamily 241594 684 1066 1.58E-130 402.714 cl00077 HECTc superfamily - - "HECT domain; C-terminal catalytic domain of a subclass of Ubiquitin-protein ligase (E3). It binds specific ubiquitin-conjugating enzymes (E2), accepts ubiquitin from E2, transfers ubiquitin to substrate lysine side chains, and transfers additional ubiquitin molecules to the end of growing ubiquitin chains." Q#17185 - CGI_10008408 superfamily 216686 106 283 2.39E-23 95.0825 cl18377 Galactosyl_T superfamily - - "Galactosyltransferase; This family includes the galactosyltransferases UDP-galactose:2-acetamido-2-deoxy-D-glucose3beta-galactosyltransferase and UDP-Gal:beta-GlcNAc beta 1,3-galactosyltransferase. Specific galactosyltransferases transfer galactose to GlcNAc terminal chains in the synthesis of the lacto-series oligosaccharides types 1 and 2." Q#17186 - CGI_10008409 superfamily 245230 2 444 0 583.89 cl10017 Tubulin_FtsZ superfamily - - "Tubulin/FtsZ: Family includes tubulin alpha-, beta-, gamma-, delta-, and epsilon-tubulins as well as FtsZ, all of which are involved in polymer formation. Tubulin is the major component of microtubules, but also exists as a heterodimer and as a curved oligomer. Microtubules exist in all eukaryotic cells and are responsible for many functions, including cellular transport, cell motility, and mitosis. FtsZ forms a ring-shaped septum at the site of bacterial cell division, which is required for constriction of cell membrane and cell envelope to yield two daughter cells. 
FtsZ can polymerize into tubes, sheets, and rings in vitro and is ubiquitous in eubacteria, archaea, and chloroplasts." Q#17187 - CGI_10008410 superfamily 220786 84 181 1.10E-05 42.377 cl11141 Ribosomal_L50 superfamily - - "Ribosomal subunit 39S; The 39S ribosomal protein appears to be a subunit of one of the larger mitochondrial 66S or 70S units. Under conditions of ethanol-stress in rats the larger subunit is largely dissociated into its smaller components. In E. coli, in the absence of the enzyme pseudouridine synthase (RluD), there is an accumulation of 50S and 30S subunits and the appearance of abnormal particles (62S and 39S), with concomitant loss of 70S ribosomes." Q#17188 - CGI_10008411 superfamily 248010 856 978 3.62E-08 53.154 cl17456 GAF superfamily - - "GAF domain; This domain is present in cGMP-specific phosphodiesterases, adenylyl and guanylyl cyclases, phytochromes, FhlA and NifA. Adenylyl and guanylyl cyclases catalyze the conversion of ATP and GTP to the second messengers cAMP and cGMP, respectively, these products up-regulating catalytic activity by binding to the regulatory GAF domain(s). The opposite hydrolysis reaction is catalyzed by phosphodiesterase. cGMP-dependent 3',5'-cyclic phosphodiesterase catalyzes the conversion of guanosine 3',5'-cyclic phosphate to guanosine 5'-phosphate. Here too, cGMP regulates catalytic activity by GAF-domain binding. Phytochromes are regulatory photoreceptors in plants and bacteria which exist in two thermally-stable states that are reversibly inter-convertible by light: the Pr state absorbs maximally in the red region of the spectrum, while the Pfr state absorbs maximally in the far-red region. This domain is also found in FhlA (formate hydrogen lyase transcriptional activator) and NifA, a transcriptional activator which is required for activation of most Nif operons which are directly involved in nitrogen fixation. NifA interacts with sigma-54." 
Q#17189 - CGI_10008412 superfamily 212166 3 245 1.67E-75 251.107 cl17009 SSH-N superfamily - - "N-terminal domain conserved in slingshot (SSH) phosphatases; This domain or region conserved in Bilateria is found N-terminal to the DEK_C-like and catalytic domains of slingshot phosphatases. Slingshot is a cofilin-specific phosphatase. Dephosphorylation reactivates cofilin, which in turn depolymerizes actin and is thus required for actin filament reorganization. Slingshot is a member of the dual-specificity protein phosphatase family. This N-terminal SSH region may be involved in P-cofilin binding (the model C-terminus plus the DEK_C-like domain, which are characterized as the "B" domain in some of the literature), and may be required for the F-actin mediated activation of slingshot (the N-terminal region of this model, sometimes referred to as the "A" domain)." Q#17189 - CGI_10008412 superfamily 241574 321 456 3.95E-46 163.934 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#17189 - CGI_10008412 superfamily 204056 264 316 1.23E-09 56.3373 cl07395 DEK_C superfamily - - DEK C terminal domain; DEK is a chromatin associated protein that is linked with cancers and autoimmune disease. This domain is found at the C terminal of DEK and is of clinical importance since it can reverse the characteristic abnormal DNA-mutagen sensitivity in fibroblasts from ataxia-telangiectasia (A-T) patients. 
The structure of this domain shows it to be homologous to the E2F/DP transcription factor family. This domain is also found in chitin synthase proteins and in protein phosphatases. Q#17190 - CGI_10008413 superfamily 243066 30 82 3.00E-12 59.5533 cl02518 BTB superfamily C - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#17191 - CGI_10008414 superfamily 243100 189 231 3.70E-10 54.1569 cl02576 B_zip1 superfamily C - "basic leucine zipper DNA-binding and multimerization region of GCN4 and related proteins; Basic leucine zipper (bZIP) transcription factors act in networks of homo- and hetero-dimers in the regulation in a diverse set of cellular pathways. Classical leucine zippers have alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization creates a pair of basic regions that bind DNA and undergo conformational change. GCN4 was identified in Saccharomyces cerevisiae from mutations in a deficiency in activation with the general amino acid control pathway. 
GCN4 encodes a trans-activator of amino acid biosynthetic genes containing 2 acidic activation domains and a C-terminal bZIP domain, comprised of a basic alpha-helical DNA-binding region and a coiled-coil dimerization region." Q#17192 - CGI_10006030 superfamily 241563 65 102 1.25E-05 42.8516 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#17192 - CGI_10006030 superfamily 187408 147 282 0.000285481 41.944 cl14654 V_Alix_like superfamily NC - "Protein-interacting V-domain of mammalian Alix and related domains; This superfamily contains the V-shaped (V) domain of mammalian Alix (apoptosis-linked gene-2 interacting protein X), His-Domain type N23 protein tyrosine phosphatase (HD-PTP, also known as PTPN23), Bro1 and Rim20 (also known as PalA) from Saccharomyces cerevisiae, and related domains. Alix, HD-PTP, Bro1, and Rim20 all interact with the ESCRT (Endosomal Sorting Complexes Required for Transport) system. Alix, also known as apoptosis-linked gene-2 interacting protein 1 (AIP1), participates in membrane remodeling processes during the budding of enveloped viruses, vesicle budding inside late endosomal multivesicular bodies (MVBs), and the abscission reactions of mammalian cell division. It also functions in apoptosis. HD-PTP functions in cell migration and endosomal trafficking, Bro1 in endosomal trafficking, and Rim20 in the response to the external pH via the Rim101 pathway. The Alix V-domain contains a binding site, partially conserved in this superfamily, for the retroviral late assembly (L) domain YPXnL motif. The Alix V-domain is also a dimerization domain. Members of this superfamily have an N-terminal Bro1-like domain, which binds components of the ESCRT-III complex. 
The Bro1-like domains of Alix and HD-PTP can also bind human immunodeficiency virus type 1 (HIV-1) nucleocapsid. Many members, including Alix, HD-PTP, and Bro1, also have a proline-rich region (PRR), which binds multiple partners in Alix, including Tsg101 (tumor susceptibility gene 101, a component of ESCRT-1) and the apoptotic protein ALG-2. The C-terminal portion (V-domain and PRR) of Bro1 interacts with Doa4, a ubiquitin thiolesterase needed to remove ubiquitin from MVB cargoes; it interacts with a YPxL motif in Doa4s catalytic domain to stimulate its deubiquitination activity. Rim20 may bind the ESCRT-III subunit Snf7, bringing the protease Rim13 (a YPxL-containing transcription factor) into proximity with Rim101, and promoting the proteolytic activation of Rim101. HD-PTP is encoded by the PTPN23 gene, a tumor suppressor gene candidate often absent in human kidney, breast, lung, and cervical tumors. HD-PTP has a C-terminal catalytically inactive tyrosine phosphatase domain." Q#17195 - CGI_10006033 superfamily 246911 11 98 2.79E-28 109.285 cl15262 PUB superfamily - - "PNGase/UBA or UBX (PUB) domain of p97 adaptor proteins; The PUB domain is found in p97 adaptor proteins such as PNGase, UBXD1 (UBX domain-containing protein 1), and RNF31 (RING finger protein 31). It functions as a p97 (also known as valosin-containing protein or VCP) adaptor by interacting with the D1 and/or D2 ATPase domains. The p97, a type II AAA+ ATPase, is involved in a variety of cellular processes such as the deglycosylation of ERAD substrates, membrane fusion, transcription factor activation and cell cycle regulation through differential binding to specific adaptor proteins. The PUB domain in UBX-domain protein 1 (UBXD1), which is widely expressed in higher eukaryotes (except for fungi) and which is involved in substrate recruitment to p97, interacts strongly with the C-terminus of p97. 
Peptide:N-glycanase (PNGase), a deglycosylating enzyme that functions in proteasome-dependent degradation of misfolded glycoproteins which are translocated from the endoplasmic reticulum (ER) to the cytosol during ERAD, associates with the ubiquitin-proteasome system proteins mediated by the N-terminal PUB domain. PNGase is present in all eukaryotic organisms; however, the yeast PNGase ortholog does not contain the PUB domain. The RNF31 protein, also known as HOIP or Zibra, contains an N-terminal PUB domain similar to those in PNGase and UBXD1, suggesting its association with p97." Q#17195 - CGI_10006033 superfamily 247916 278 358 9.46E-19 82.8338 cl17362 Transglut_core superfamily - - "Transglutaminase-like superfamily; This family includes animal transglutaminases and other bacterial proteins of unknown function. Sequence conservation in this superfamily primarily involves three motifs that centre around conserved cysteine, histidine, and aspartate residues that form the catalytic triad in the structurally characterized transglutaminase, the human blood clotting factor XIIIa'. On the basis of the experimentally demonstrated activity of the Methanobacterium phage pseudomurein endoisopeptidase, it is proposed that many, if not all, microbial homologues of the transglutaminases are proteases and that the eukaryotic transglutaminases have evolved from an ancestral protease." Q#17195 - CGI_10006033 superfamily 245322 502 583 3.10E-12 63.4401 cl10509 DUF750 superfamily - - "Domain of unknown function (DUF750); This family of proteins with unknown function shows similarity to PNG-1, a enzyme responsible for de-N-glycosylation of misfolded glycoproteins in the cytosol. However, unlike PNG-1, this protein does not contain a catalytic triad in its transglutaminase domain." Q#17195 - CGI_10006033 superfamily 217753 342 410 3.88E-06 45.8374 cl10615 Rad4 superfamily NC - Rad4 transglutaminase-like domain; Rad4 transglutaminase-like domain. 
Q#17197 - CGI_10006035 superfamily 247805 246 381 6.36E-21 91.2448 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#17197 - CGI_10006035 superfamily 247905 477 570 6.51E-08 52.2401 cl17351 HELICc superfamily N - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#17197 - CGI_10006035 superfamily 243778 623 713 1.77E-25 103.073 cl04503 HA2 superfamily - - "Helicase associated domain (HA2); This presumed domain is about 90 amino acid residues in length. It is found is a diverse set of RNA helicases. Its function is unknown, however it seems likely to be involved in nucleic acid binding." Q#17197 - CGI_10006035 superfamily 219532 840 960 4.00E-07 49.6202 cl06657 OB_NTP_bind superfamily - - "Oligonucleotide/oligosaccharide-binding (OB)-fold; This family is found towards the C-terminus of the DEAD-box helicases (pfam00270). In these helicases it is apparently always found in association with pfam04408. There do seem to be a couple of instances where it occurs by itself - . The structure PDB:3i4u adopts an OB-fold. helicases (pfam00270). In these helicases it is apparently always found in association with pfam04408. 
This C-terminal domain of the yeast helicase contains an oligonucleotide/oligosaccharide-binding (OB)-fold which seems to be placed at the entrance of the putative nucleic acid cavity. It also constitutes the binding site for the G-patch-containing domain of Pfa1p. When found on DEAH/RHA helicases, this domain is central to the regulation of the helicase activity through its binding of both RNA and G-patch domain proteins." Q#17197 - CGI_10006035 superfamily 247805 339 517 0.00575599 38.9502 cl17251 DEXDc superfamily N - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#17198 - CGI_10006036 superfamily 246597 283 469 3.84E-55 184.804 cl13995 MPP_superfamily superfamily - - "metallophosphatase superfamily, metallophosphatase domain; Metallophosphatases (MPPs), also known as metallophosphoesterases, phosphodiesterases (PDEs), binuclear metallophosphoesterases, and dimetal-containing phosphoesterases (DMPs), represent a diverse superfamily of enzymes with a conserved domain containing an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. This superfamily includes: the phosphoprotein phosphatases (PPPs), Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination." Q#17198 - CGI_10006036 superfamily 152648 5 55 2.70E-14 68.1511 cl13624 Dpoe2NT superfamily N - "DNA polymerases epsilon N terminal; This domain is found in eukaryotes, and is approximately 70 amino acids in length. 
The family is found in association with pfam04042. There is a single completely conserved residue F that may be functionally important. This domain is the N terminal domain of DNA polymerase epsilon subunit B. It forms a primarily alpha helical structure in which four helices are arranged in two hairpins with connecting loops containing beta strands which form a short parallel sheet. DNA polymerase epsilon is required in DNA replication for synthesis of the leading strand. This domain has close structural relation to AAA+ protein C terminal domains." Q#17201 - CGI_10006039 superfamily 246748 358 601 2.05E-100 313.76 cl14876 Zinc_peptidase_like superfamily - - "Zinc peptidases M18, M20, M28, and M42; Zinc peptidases play vital roles in metabolic and signaling pathways throughout all kingdoms of life. This family corresponds to several clans in the MEROPS database, including the MH clan, which contains 4 families (M18, M20, M28, M42). The peptidase M20 family includes carboxypeptidases such as the glutamate carboxypeptidase from Pseudomonas, the thermostable carboxypeptidase Ss1 of broad specificity from archaea and yeast Gly-X carboxypeptidase. The dipeptidases include bacterial dipeptidase, peptidase V (PepV), a eukaryotic, non-specific dipeptidase, and two Xaa-His dipeptidases (carnosinases). There is also the bacterial aminopeptidase, peptidase T (PepT) that acts only on tripeptide substrates and has therefore been termed a tripeptidase. Peptidase family M28 contains aminopeptidases and carboxypeptidases, and has co-catalytic zinc ions. However, several enzymes in this family utilize other first row transition metal ions such as cobalt and manganese. Each zinc ion is tetrahedrally co-ordinated, with three amino acid ligands plus activated water; one aspartate residue binds both metal ions. The aminopeptidases in this family are also called bacterial leucyl aminopeptidases, but are able to release a variety of N-terminal amino acids. 
IAP aminopeptidase and aminopeptidase Y preferentially release basic amino acids while glutamate carboxypeptidase II preferentially releases C-terminal glutamates. Glutamate carbxypeptidase II and plasma glutamate carboxypeptidase hydrolyze dipeptides. Peptidase families M18 and M42 contain metalloaminopeptidases. M18 is widely distributed in bacteria and eukaryotes. However, only yeast aminopeptidase I and mammalian aspartyl aminopeptidase have been characterized in detail. Some of M42 (also known as glutamyl aminopeptidase) enzymes exhibit aminopeptidase specificity while others also have acylaminoacylpeptidase activity (i.e. hydrolysis of acylated N-terminal residues)." Q#17201 - CGI_10006039 superfamily 244870 143 339 5.65E-74 241.04 cl08238 PA superfamily - - "PA: Protease-associated (PA) domain. The PA domain is an insert domain in a diverse fraction of proteases. The significance of the PA domain to many of the proteins in which it is inserted is undetermined. It may be a protein-protein interaction domain. At peptidase active sites, the PA domain may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate. 
Proteins into which the PA domain is inserted include the following: i) various signal peptide peptidases including, hSPPL2a and 2b which catalyze the intramembrane proteolysis of tumor necrosis factor alpha, ii) various proteins containing a C3H2C3 RING finger including, Arabidopsis ReMembR-H2 protein and various E3 ubiquitin ligases such as human GRAIL (gene related to anergy in lymphocytes), iii) EDEM3 (ER-degradation-enhancing mannosidase-like 3 protein), iv) various plant vacuolar sorting receptors such as Pisum sativum BP-80, v) glutamate carboxypeptidase II (GCPII), vi) yeast aminopeptidase Y, vii) Vibrio metschnikovii VapT, a sodium dodecyl sulfate (SDS) resistant extracellular alkaline serine protease, viii) lactocepin (a cell envelope-associated protease from Lactobacillus paracasei subsp. paracasei NCDO 151), ix) various subtilisin-like proteases such as melon Cucumisin, and x) human TfR (transferrin receptor) 1 and 2." Q#17201 - CGI_10006039 superfamily 202944 629 752 3.93E-30 115.831 cl07854 TFR_dimer superfamily - - Transferrin receptor-like dimerisation domain; This domain is involved in dimerisation of the transferrin receptor as shown in its crystal structure. Q#17201 - CGI_10006039 superfamily 246748 59 115 7.07E-10 59.5285 cl14876 Zinc_peptidase_like superfamily C - "Zinc peptidases M18, M20, M28, and M42; Zinc peptidases play vital roles in metabolic and signaling pathways throughout all kingdoms of life. This family corresponds to several clans in the MEROPS database, including the MH clan, which contains 4 families (M18, M20, M28, M42). The peptidase M20 family includes carboxypeptidases such as the glutamate carboxypeptidase from Pseudomonas, the thermostable carboxypeptidase Ss1 of broad specificity from archaea and yeast Gly-X carboxypeptidase. The dipeptidases include bacterial dipeptidase, peptidase V (PepV), a eukaryotic, non-specific dipeptidase, and two Xaa-His dipeptidases (carnosinases). 
There is also the bacterial aminopeptidase, peptidase T (PepT) that acts only on tripeptide substrates and has therefore been termed a tripeptidase. Peptidase family M28 contains aminopeptidases and carboxypeptidases, and has co-catalytic zinc ions. However, several enzymes in this family utilize other first row transition metal ions such as cobalt and manganese. Each zinc ion is tetrahedrally co-ordinated, with three amino acid ligands plus activated water; one aspartate residue binds both metal ions. The aminopeptidases in this family are also called bacterial leucyl aminopeptidases, but are able to release a variety of N-terminal amino acids. IAP aminopeptidase and aminopeptidase Y preferentially release basic amino acids while glutamate carboxypeptidase II preferentially releases C-terminal glutamates. Glutamate carbxypeptidase II and plasma glutamate carboxypeptidase hydrolyze dipeptides. Peptidase families M18 and M42 contain metalloaminopeptidases. M18 is widely distributed in bacteria and eukaryotes. However, only yeast aminopeptidase I and mammalian aspartyl aminopeptidase have been characterized in detail. Some of M42 (also known as glutamyl aminopeptidase) enzymes exhibit aminopeptidase specificity while others also have acylaminoacylpeptidase activity (i.e. hydrolysis of acylated N-terminal residues)." Q#17202 - CGI_10006040 superfamily 244870 142 337 2.79E-71 233.336 cl08238 PA superfamily - - "PA: Protease-associated (PA) domain. The PA domain is an insert domain in a diverse fraction of proteases. The significance of the PA domain to many of the proteins in which it is inserted is undetermined. It may be a protein-protein interaction domain. At peptidase active sites, the PA domain may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate. 
Proteins into which the PA domain is inserted include the following: i) various signal peptide peptidases including, hSPPL2a and 2b which catalyze the intramembrane proteolysis of tumor necrosis factor alpha, ii) various proteins containing a C3H2C3 RING finger including, Arabidopsis ReMembR-H2 protein and various E3 ubiquitin ligases such as human GRAIL (gene related to anergy in lymphocytes), iii) EDEM3 (ER-degradation-enhancing mannosidase-like 3 protein), iv) various plant vacuolar sorting receptors such as Pisum sativum BP-80, v) glutamate carboxypeptidase II (GCPII), vi) yeast aminopeptidase Y, vii) Vibrio metschnikovii VapT, a sodium dodecyl sulfate (SDS) resistant extracellular alkaline serine protease, viii) lactocepin (a cell envelope-associated protease from Lactobacillus paracasei subsp. paracasei NCDO 151), ix) various subtilisin-like proteases such as melon Cucumisin, and x) human TfR (transferrin receptor) 1 and 2." Q#17202 - CGI_10006040 superfamily 246748 357 566 6.91E-77 250.973 cl14876 Zinc_peptidase_like superfamily - - "Zinc peptidases M18, M20, M28, and M42; Zinc peptidases play vital roles in metabolic and signaling pathways throughout all kingdoms of life. This family corresponds to several clans in the MEROPS database, including the MH clan, which contains 4 families (M18, M20, M28, M42). The peptidase M20 family includes carboxypeptidases such as the glutamate carboxypeptidase from Pseudomonas, the thermostable carboxypeptidase Ss1 of broad specificity from archaea and yeast Gly-X carboxypeptidase. The dipeptidases include bacterial dipeptidase, peptidase V (PepV), a eukaryotic, non-specific dipeptidase, and two Xaa-His dipeptidases (carnosinases). There is also the bacterial aminopeptidase, peptidase T (PepT) that acts only on tripeptide substrates and has therefore been termed a tripeptidase. Peptidase family M28 contains aminopeptidases and carboxypeptidases, and has co-catalytic zinc ions. 
However, several enzymes in this family utilize other first row transition metal ions such as cobalt and manganese. Each zinc ion is tetrahedrally co-ordinated, with three amino acid ligands plus activated water; one aspartate residue binds both metal ions. The aminopeptidases in this family are also called bacterial leucyl aminopeptidases, but are able to release a variety of N-terminal amino acids. IAP aminopeptidase and aminopeptidase Y preferentially release basic amino acids while glutamate carboxypeptidase II preferentially releases C-terminal glutamates. Glutamate carbxypeptidase II and plasma glutamate carboxypeptidase hydrolyze dipeptides. Peptidase families M18 and M42 contain metalloaminopeptidases. M18 is widely distributed in bacteria and eukaryotes. However, only yeast aminopeptidase I and mammalian aspartyl aminopeptidase have been characterized in detail. Some of M42 (also known as glutamyl aminopeptidase) enzymes exhibit aminopeptidase specificity while others also have acylaminoacylpeptidase activity (i.e. hydrolysis of acylated N-terminal residues)." Q#17202 - CGI_10006040 superfamily 202944 600 717 2.77E-28 110.438 cl07854 TFR_dimer superfamily - - Transferrin receptor-like dimerisation domain; This domain is involved in dimerisation of the transferrin receptor as shown in its crystal structure. Q#17202 - CGI_10006040 superfamily 246748 59 115 1.59E-09 58.3729 cl14876 Zinc_peptidase_like superfamily C - "Zinc peptidases M18, M20, M28, and M42; Zinc peptidases play vital roles in metabolic and signaling pathways throughout all kingdoms of life. This family corresponds to several clans in the MEROPS database, including the MH clan, which contains 4 families (M18, M20, M28, M42). The peptidase M20 family includes carboxypeptidases such as the glutamate carboxypeptidase from Pseudomonas, the thermostable carboxypeptidase Ss1 of broad specificity from archaea and yeast Gly-X carboxypeptidase. 
The dipeptidases include bacterial dipeptidase, peptidase V (PepV), a eukaryotic, non-specific dipeptidase, and two Xaa-His dipeptidases (carnosinases). There is also the bacterial aminopeptidase, peptidase T (PepT) that acts only on tripeptide substrates and has therefore been termed a tripeptidase. Peptidase family M28 contains aminopeptidases and carboxypeptidases, and has co-catalytic zinc ions. However, several enzymes in this family utilize other first row transition metal ions such as cobalt and manganese. Each zinc ion is tetrahedrally co-ordinated, with three amino acid ligands plus activated water; one aspartate residue binds both metal ions. The aminopeptidases in this family are also called bacterial leucyl aminopeptidases, but are able to release a variety of N-terminal amino acids. IAP aminopeptidase and aminopeptidase Y preferentially release basic amino acids while glutamate carboxypeptidase II preferentially releases C-terminal glutamates. Glutamate carbxypeptidase II and plasma glutamate carboxypeptidase hydrolyze dipeptides. Peptidase families M18 and M42 contain metalloaminopeptidases. M18 is widely distributed in bacteria and eukaryotes. However, only yeast aminopeptidase I and mammalian aspartyl aminopeptidase have been characterized in detail. Some of M42 (also known as glutamyl aminopeptidase) enzymes exhibit aminopeptidase specificity while others also have acylaminoacylpeptidase activity (i.e. hydrolysis of acylated N-terminal residues)." Q#17203 - CGI_10002786 superfamily 247044 10 122 6.35E-56 179.341 cl15697 ADF_gelsolin superfamily - - Actin depolymerization factor/cofilin- and gelsolin-like domains; Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. 
Q#17203 - CGI_10002786 superfamily 247044 136 214 1.60E-20 84.2112 cl15697 ADF_gelsolin superfamily - - Actin depolymerization factor/cofilin- and gelsolin-like domains; Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. Q#17203 - CGI_10002786 superfamily 247044 239 322 1.15E-19 81.912 cl15697 ADF_gelsolin superfamily - - Actin depolymerization factor/cofilin- and gelsolin-like domains; Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. Q#17204 - CGI_10002787 superfamily 248458 194 322 0.00171881 38.4489 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. 
Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#17206 - CGI_10002789 superfamily 220647 18 180 0.000676055 37.6924 cl18565 L_HGMIC_fpl superfamily - - "Lipoma HMGIC fusion partner-like protein; This is a group of proteins expressed from a series of genes referred to as Lipoma HGMIC fusion partner-like. The proteins carry four highly conserved transmembrane domains in this entry. In certain instances, eg in LHFPL5, mutations cause deafness in humans and hypospadias, and LHFPL1 is transcribed in six liver tumour cell lines." Q#17207 - CGI_10002790 superfamily 248279 6 85 8.82E-13 64.6655 cl17725 zf-HC5HC2H superfamily - - "PHD-like zinc-binding domain; The members of this family are annotated as containing PHD domain, but the zinc-binding region here is not typical of PHD domains. The conformation here is a well-conserved cysteine-histidine rich region spanning 90 residues, where the Cys and His are arranged as HxxC(31)CxxC(6)CxxCxxxxCxxxxHxxC (21)CxxH." Q#17207 - CGI_10002790 superfamily 247999 332 382 6.77E-08 49.4112 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. 
Q#17207 - CGI_10002790 superfamily 247999 186 235 1.39E-07 48.2556 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#17210 - CGI_10012565 superfamily 241752 316 661 2.48E-101 315.749 cl00283 ADP_ribosyl superfamily - - "ADP_ribosylating enzymes catalyze the transfer of ADP_ribose from NAD+ to substrates. Bacterial toxins are cytoplasmic and catalyze the transfer of a single ADP_ribose unit to eukaryotic elongation factor 2, halting protein synthesis and killing the cell. Poly(ADP-ribose) polymerases (PARPS 1-3, VPARP, tankyrase) catalyze the addition of up to 100 ADP_ribose units from NAD+. PARPs 1 and 2 are localized in the nucleaus, bind DNA, and are activated by DNA damage. VPARP is part of the vault ribonucleoprotein complex. Tankyrases regulates telomere length in part through poy(ADP_ribosylation) of telomere repeat binding factor 1 (TRF1). Poly(ADP-ribose) polymerase catalyses the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active." Q#17210 - CGI_10012565 superfamily 242589 192 289 1.65E-49 168.352 cl01581 WGR superfamily - - "WGR domain; The WGR domain is found in a variety of eukaryotic poly(ADP-ribose) polymerases (PARPs) as well as the putative Escherichia coli molybdate metabolism regulator and related bacterial proteins, a small family of bacterial DNA ligases, and various other bacterial proteins of unknown function. It has been called WGR after the most conserved central motif of the domain. 
The domain occurs in single-domain proteins and in a variety of domain architectures, and is between 70 and 80 residues in length. It has been proposed to function as a nucleic acid binding domain." Q#17210 - CGI_10012565 superfamily 190460 21 99 2.65E-28 110.729 cl03756 PARP_reg superfamily N - "Poly(ADP-ribose) polymerase, regulatory domain; Poly(ADP-ribose) polymerase catalyzes the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active." Q#17212 - CGI_10012567 superfamily 242059 246 474 3.24E-35 136.477 cl00738 MBOAT superfamily N - "MBOAT, membrane-bound O-acyltransferase family; The MBOAT (membrane bound O-acyl transferase) family of membrane proteins contains a variety of acyltransferase enzymes. A conserved histidine has been suggested to be the active site residue." Q#17213 - CGI_10012568 superfamily 246921 58 97 2.91E-10 57.0001 cl15299 FG-GAP superfamily C - "FG-GAP repeat; This family contains the extracellular repeat that is found in up to seven copies in alpha integrins. This repeat has been predicted to fold into a beta propeller structure. The repeat is called the FG-GAP repeat after two conserved motifs in the repeat. The FG-GAP repeats are found in the N terminus of integrin alpha chains, a region that has been shown to be important for ligand binding. A putative Ca2+ binding motif is found in some of the repeats." Q#17215 - CGI_10012570 superfamily 245201 3 303 0 590.174 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. 
It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#17217 - CGI_10012572 superfamily 243072 98 223 4.08E-33 125.574 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#17217 - CGI_10012572 superfamily 243072 230 389 9.14E-31 119.025 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#17217 - CGI_10012572 superfamily 243072 32 157 2.51E-30 117.87 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#17217 - CGI_10012572 superfamily 243072 330 457 3.68E-30 117.099 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#17217 - CGI_10012572 superfamily 243072 398 525 4.02E-30 117.099 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#17217 - CGI_10012572 superfamily 243072 802 930 2.69E-29 114.788 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#17217 - CGI_10012572 superfamily 243072 565 694 3.59E-29 114.403 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#17217 - CGI_10012572 superfamily 243072 635 761 3.59E-27 108.625 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#17219 - CGI_10012574 superfamily 245607 10 113 2.38E-46 147.922 cl11414 Cytochrom_C superfamily - - "Cytochrome c; The Pfam entry does not include all Prosite members. The cytochrome 556 and cytochrome c' families are not included. All these are now in a new clan together. The C-terminus of DUF989, pfam06181, has now been merged into this family." Q#17220 - CGI_10012575 superfamily 241760 248 288 3.88E-10 56.5864 cl00295 ZZ superfamily - - "Zinc finger, ZZ type. Zinc finger present in dystrophin, CBP/p300 and many other proteins. The ZZ motif coordinates one or two zinc ions and most likely participates in ligand binding or molecular scaffolding. Many proteins containing ZZ motifs have other zinc-binding motifs as well, and the majority serve as scaffolds in pathways involving acetyltransferase, protein kinase, or ubiquitin-related activity. ZZ proteins can be grouped into the following functional classes: chromatin modifying, cytoskeletal scaffolding, ubiquitin binding or conjugating, and membrane receptor or ion-channel modifying proteins." 
Q#17220 - CGI_10012575 superfamily 247792 368 410 9.97E-09 52.448 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#17220 - CGI_10012575 superfamily 247792 164 219 0.00924834 35.114 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#17221 - CGI_10012576 superfamily 241597 141 211 3.29E-14 64.9475 cl00082 HMG-box superfamily - - "High Mobility Group (HMG)-box is found in a variety of eukaryotic chromosomal proteins and transcription factors. HMGs bind to the minor groove of DNA and have been classified by DNA binding preferences. Two phylogenically distinct groups of Class I proteins bind DNA in a sequence specific fashion and contain a single HMG box. One group (SOX-TCF) includes transcription factors, TCF-1, -3, -4; and also SRY and LEF-1, which bind four-way DNA junctions and duplex DNA targets. 
The second group (MATA) includes fungal mating type gene products MC, MATA1 and Ste11. Class II and III proteins (HMGB-UBF) bind DNA in a non-sequence specific fashion and contain two or more tandem HMG boxes. Class II members include non-histone chromosomal proteins, HMG1 and HMG2, which bind to bent or distorted DNA such as four-way DNA junctions, synthetic DNA cruciforms, kinked cisplatin-modified DNA, DNA bulges, cross-overs in supercoiled DNA, and can cause looping of linear DNA. Class III members include nucleolar and mitochondrial transcription factors, UBF and mtTF1, which bind four-way DNA junctions." Q#17222 - CGI_10012577 superfamily 206035 151 248 2.87E-31 112.641 cl16440 Enkurin superfamily - - "Calmodulin-binding; This is a family of apparent calmodulin-binding proteins found at high levels in the testis and vomeronasal organ and at lower levels in certain other tissues. Enkurin is a scaffold protein that binds PI3 kinase to sperm transient receptor potential (canonical) (TRPC) channels. The mammalian transient receptor potential (canonical) channels are the primary candidates for the Ca(2+) entry pathway activated by the hormones, growth factors, and neurotransmitters that exert their effect through activation of PLC. Calmodulin binds to the C-terminus of all TRPC channels, and dissociation of calmodulin from TRPC4 results in profound activation of the channel." Q#17223 - CGI_10012578 superfamily 241644 11 134 1.23E-17 80.3241 cl00154 UBCc superfamily C - "Ubiquitin-conjugating enzyme E2, catalytic (UBCc) domain. This is part of the ubiquitin-mediated protein degradation pathway in which a thiol-ester linkage forms between a conserved cysteine and the C-terminus of ubiquitin and complexes with ubiquitin protein ligase enzymes, E3. This pathway regulates many fundamental cellular processes. 
There are also other E2s which form thiol-ester linkages without the use of E3s as well as several UBC homologs (TSG101, Mms2, Croc-1 and similar proteins) which lack the active site cysteine essential for ubiquitination and appear to function in DNA repair pathways which were omitted from the scope of this CD." Q#17224 - CGI_10012579 superfamily 245936 87 285 9.68E-58 185.446 cl12283 IPK superfamily - - Inositol polyphosphate kinase; ArgRIII has been demonstrated to be an inositol polyphosphate kinase. Q#17224 - CGI_10012579 superfamily 245936 1 39 2.99E-10 56.7892 cl12283 IPK superfamily C - Inositol polyphosphate kinase; ArgRIII has been demonstrated to be an inositol polyphosphate kinase. Q#17226 - CGI_10003806 superfamily 247684 20 462 1.31E-97 306.511 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. 
The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#17227 - CGI_10003807 superfamily 247684 13 433 5.24E-85 272.999 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#17228 - CGI_10003808 superfamily 241691 60 297 7.16E-47 159.355 cl00213 DNA_BRE_C superfamily - - "DNA breaking-rejoining enzymes, C-terminal catalytic domain. 
The DNA breaking-rejoining enzyme superfamily includes type IB topoisomerases and tyrosine recombinases that share the same fold in their catalytic domain containing six conserved active site residues. The best-studied members of this diverse superfamily include human topoisomerase I, the bacteriophage lambda integrase, the bacteriophage P1 Cre recombinase, the yeast Flp recombinase and the bacterial XerD/C recombinases. Their overall reaction mechanism is essentially identical and involves cleavage of a single strand of a DNA duplex by nucleophilic attack of a conserved tyrosine to give a 3' phosphotyrosyl protein-DNA adduct. In the second rejoining step, a terminal 5' hydroxyl attacks the covalent adduct to release the enzyme and generate duplex DNA. The enzymes differ in that topoisomerases cleave and then rejoin the same 5' and 3' termini, whereas a site-specific recombinase transfers a 5' hydroxyl generated by recombinase cleavage to a new 3' phosphate partner located in a different duplex region. Many DNA breaking-rejoining enzymes also have N-terminal domains, which show little sequence or structure similarity." Q#17228 - CGI_10003808 superfamily 206538 277 325 8.69E-16 70.719 cl16833 Topo_C_assoc superfamily N - C-terminal topoisomerase domain; This domain is found at the C-terminal of topoisomerase and other similar enzymes. Q#17230 - CGI_10003025 superfamily 241584 802 892 7.18E-22 94.4855 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." 
Q#17230 - CGI_10003025 superfamily 241584 1398 1492 8.73E-22 94.1003 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#17230 - CGI_10003025 superfamily 241584 1297 1386 5.45E-21 91.7891 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#17230 - CGI_10003025 superfamily 241584 903 986 2.09E-20 90.2483 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." 
Q#17230 - CGI_10003025 superfamily 241584 2075 2166 9.47E-20 88.3223 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#17230 - CGI_10003025 superfamily 241584 1196 1289 9.88E-20 87.9371 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#17230 - CGI_10003025 superfamily 241584 2884 2978 2.13E-19 87.1667 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." 
Q#17230 - CGI_10003025 superfamily 241584 1980 2063 9.54E-19 85.2407 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#17230 - CGI_10003025 superfamily 241584 703 794 4.23E-18 83.3147 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#17230 - CGI_10003025 superfamily 241584 2379 2472 7.01E-18 82.5443 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." 
Q#17230 - CGI_10003025 superfamily 241584 1670 1763 7.59E-18 82.5443 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#17230 - CGI_10003025 superfamily 241584 2683 2776 8.17E-18 82.5443 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#17230 - CGI_10003025 superfamily 241584 1772 1863 8.45E-18 82.5443 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." 
Q#17230 - CGI_10003025 superfamily 241584 2481 2571 1.17E-17 82.1591 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#17230 - CGI_10003025 superfamily 241584 1874 1965 4.46E-17 80.2331 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#17230 - CGI_10003025 superfamily 241584 2989 3079 5.77E-17 80.2331 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." 
Q#17230 - CGI_10003025 superfamily 241584 3191 3282 1.10E-16 79.0775 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#17230 - CGI_10003025 superfamily 241584 2286 2367 2.20E-15 75.6107 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#17230 - CGI_10003025 superfamily 241584 2582 2671 4.88E-15 74.4551 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." 
Q#17230 - CGI_10003025 superfamily 241584 2177 2261 5.41E-14 71.3735 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#17230 - CGI_10003025 superfamily 241584 2784 2875 1.22E-12 67.5215 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#17230 - CGI_10003025 superfamily 241584 3088 3171 2.28E-11 63.6695 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." 
Q#17230 - CGI_10003025 superfamily 245814 1498 1565 8.84E-11 61.3511 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#17230 - CGI_10003025 superfamily 245814 1023 1093 1.54E-08 54.8027 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#17230 - CGI_10003025 superfamily 245814 2 57 1.50E-05 45.9431 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. 
A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#17230 - CGI_10003025 superfamily 245814 1119 1190 8.58E-14 70.3024 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#17230 - CGI_10003025 superfamily 245814 630 699 1.14E-13 69.9172 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#17230 - CGI_10003025 superfamily 245814 1593 1666 4.61E-10 59.5168 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#17230 - CGI_10003025 superfamily 245814 534 605 3.24E-07 50.9669 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#17230 - CGI_10003025 superfamily 245814 436 498 1.73E-06 49.0409 cl11960 Ig superfamily C - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#17230 - CGI_10003025 superfamily 245814 360 424 1.14E-05 46.3445 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. 
The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#17230 - CGI_10003025 superfamily 245814 80 157 1.22E-05 46.3445 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#17230 - CGI_10003025 superfamily 245814 169 246 1.37E-05 46.3445 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#17230 - CGI_10003025 superfamily 245814 259 335 3.62E-05 44.8037 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#17231 - CGI_10003026 superfamily 245670 210 298 1.30E-24 103.815 cl11519 DENN superfamily N - DENN (AEX-3) domain; DENN (after differentially expressed in neoplastic vs normal cells) is a domain which occurs in several proteins involved in Rab- mediated processes or regulation of MAPK signalling pathways. Q#17231 - CGI_10003026 superfamily 243635 113 166 5.85E-10 58.1149 cl04085 uDENN superfamily N - uDENN domain; This region is always found associated with pfam02141. It is predicted to form an all beta domain. Q#17232 - CGI_10016448 superfamily 241571 727 828 1.58E-41 148.715 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#17232 - CGI_10016448 superfamily 241571 408 517 7.25E-41 146.789 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#17232 - CGI_10016448 superfamily 241571 563 675 1.72E-32 123.291 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." 
Q#17232 - CGI_10016448 superfamily 241571 299 406 4.69E-24 99.0238 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#17232 - CGI_10016448 superfamily 241571 833 921 4.83E-18 81.6898 cl00049 CUB superfamily C - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#17232 - CGI_10016448 superfamily 241583 120 297 1.59E-111 343.655 cl00064 ZnMc superfamily - - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#17232 - CGI_10016448 superfamily 241578 499 557 8.11E-08 52.7723 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." 
Q#17232 - CGI_10016448 superfamily 241578 676 712 3.99E-06 47.7648 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#17234 - CGI_10016450 superfamily 207627 355 442 6.25E-17 77.6751 cl02522 Calx-beta superfamily - - Calx-beta domain; Calx-beta domain. Q#17234 - CGI_10016450 superfamily 216653 57 216 4.60E-16 76.0967 cl08331 Na_Ca_ex superfamily - - "Sodium/calcium exchanger protein; This is a family of sodium/calcium exchanger integral membrane proteins. This family covers the integral membrane regions of the proteins. Sodium/calcium exchangers regulate intracellular Ca2+ concentrations in many cells; cardiac myocytes, epithelial cells, neurons retinal rod photoreceptors and smooth muscle cells. Ca2+ is moved into or out of the cytosol depending on Na+ concentration. In humans and rats there are 3 isoforms; NCX1 NCX2 and NCX3." Q#17234 - CGI_10016450 superfamily 207627 470 559 2.93E-14 69.9759 cl02522 Calx-beta superfamily - - Calx-beta domain; Calx-beta domain. 
Q#17234 - CGI_10016450 superfamily 216653 664 819 2.13E-11 62.2295 cl08331 Na_Ca_ex superfamily - - "Sodium/calcium exchanger protein; This is a family of sodium/calcium exchanger integral membrane proteins. This family covers the integral membrane regions of the proteins. Sodium/calcium exchangers regulate intracellular Ca2+ concentrations in many cells; cardiac myocytes, epithelial cells, neurons retinal rod photoreceptors and smooth muscle cells. Ca2+ is moved into or out of the cytosol depending on Na+ concentration. In humans and rats there are 3 isoforms; NCX1 NCX2 and NCX3." Q#17235 - CGI_10016451 superfamily 246680 16 95 4.53E-08 50.4118 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#17237 - CGI_10016453 superfamily 217605 567 921 2.68E-97 319.453 cl04144 Tuberin superfamily - - "Tuberin; Tuberous sclerosis complex (TSC) is an autosomal dominant disorder and is characterized by the presence of hamartomas in many organs, such as brain, skin, heart, lung, and kidney. It is caused by mutation either TSC1 or TSC2 tumour suppressor gene. The TSC2 gene codes for tuberin and interacts with hamartin pfam04388, containing two coiled-coil regions, which have been shown to mediate binding to tuberin. 
These two proteins function within the same pathway(s) regulating cell cycle, cell growth, adhesion, and vesicular trafficking." Q#17237 - CGI_10016453 superfamily 216901 1521 1711 2.59E-47 170.074 cl03466 Rap_GAP superfamily - - Rap/ran-GAP; Rap/ran-GAP. Q#17239 - CGI_10016455 superfamily 246751 92 387 2.79E-118 351.931 cl14883 Lipase superfamily - - "Lipase. Lipases are esterases that can hydrolyze long-chain acyl-triglycerides into di- and monoglycerides, glycerol, and free fatty acids at a water/lipid interface. A typical feature of lipases is "interfacial activation", the process of becoming active at the lipid/water interface, although several examples of lipases have been identified that do not undergo interfacial activation. The active site of a lipase contains a catalytic triad consisting of Ser - His - Asp/Glu, but unlike most serine proteases, the active site is buried inside the structure. A "lid" or "flap" covers the active site, making it inaccessible to solvent and substrates. The lid opens during the process of interfacial activation, allowing the lipid substrate access to the active site." Q#17239 - CGI_10016455 superfamily 241546 394 495 1.38E-08 52.6828 cl00011 PLAT superfamily - - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats. The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#17241 - CGI_10016457 superfamily 247805 28 69 2.35E-06 42.7096 cl17251 DEXDc superfamily N - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 
Q#17242 - CGI_10016458 superfamily 245847 190 335 3.84E-19 82.2193 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#17242 - CGI_10016458 superfamily 241619 85 134 0.000900737 36.7913 cl00112 PAN_APPLE superfamily C - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#17244 - CGI_10016460 superfamily 245847 137 207 6.26E-10 57.9517 cl12042 FA58C superfamily C - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#17244 - CGI_10016460 superfamily 241619 35 81 0.00132815 37.5617 cl00112 PAN_APPLE superfamily C - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." 
Q#17246 - CGI_10016462 superfamily 241563 63 99 3.78E-05 41.3108 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#17249 - CGI_10016465 superfamily 241574 135 334 5.65E-24 97.6565 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#17249 - CGI_10016465 superfamily 241574 2 85 5.89E-23 94.5749 cl00053 PTPc superfamily N - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#17251 - CGI_10016467 superfamily 243061 11 112 5.71E-38 135.547 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. 
These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#17251 - CGI_10016467 superfamily 243061 119 220 2.55E-35 128.614 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#17251 - CGI_10016467 superfamily 243061 448 549 2.96E-35 128.229 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#17251 - CGI_10016467 superfamily 243061 340 441 9.08E-31 115.902 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#17251 - CGI_10016467 superfamily 243061 225 333 2.30E-22 92.405 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#17251 - CGI_10016467 superfamily 243061 556 601 9.46E-08 50.4182 cl02509 SRCR superfamily C - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#17252 - CGI_10002774 superfamily 215754 36 119 6.90E-22 87.694 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#17252 - CGI_10002774 superfamily 215754 125 219 2.29E-20 83.4568 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. 
Q#17252 - CGI_10002774 superfamily 215754 224 260 5.00E-06 43.396 cl02813 Mito_carr superfamily C - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#17253 - CGI_10002775 superfamily 241583 253 460 6.65E-15 74.1889 cl00064 ZnMc superfamily - - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#17256 - CGI_10013681 superfamily 242667 1 164 5.30E-92 290.26 cl01720 Phage_Nu1 superfamily - - "Phage DNA packaging protein Nu1; Terminase, the DNA packaging enzyme of bacteriophage lambda, is a heteromultimer composed of subunits Nu1 and A. The smaller Nu1 terminase subunit has a low-affinity ATPase stimulated by non-specific DNA." Q#17256 - CGI_10013681 superfamily 233457 741 897 3.30E-48 178.15 cl18843 portal_lambda superfamily N - "phage portal protein, lambda family; This model represents one of several distantly related families of phage portal protein. This protein forms a hole, or portal, that enables DNA passage during packaging and ejection. It also forms the junction between the phage head (capsid) and the tail proteins. It functions as a dodecamer of a single polypeptide of average mol. wt. of 40-90 KDa [Mobile and extrachromosomal element functions, Prophage functions]." Q#17256 - CGI_10013681 superfamily 233457 669 725 2.07E-10 62.5898 cl18843 portal_lambda superfamily C - "phage portal protein, lambda family; This model represents one of several distantly related families of phage portal protein. This protein forms a hole, or portal, that enables DNA passage during packaging and ejection. It also forms the junction between the phage head (capsid) and the tail proteins. It functions as a dodecamer of a single polypeptide of average mol. wt. 
of 40-90 KDa [Mobile and extrachromosomal element functions, Prophage functions]." Q#17257 - CGI_10013691 superfamily 243078 22 138 2.48E-47 153.529 cl02544 VHS_ENTH_ANTH superfamily - - "VHS, ENTH and ANTH domain superfamily; composed of proteins containing a VHS, ENTH or ANTH domain. The VHS domain is present in Vps27 (Vacuolar Protein Sorting), Hrs (Hepatocyte growth factor-regulated tyrosine kinase substrate) and STAM (Signal Transducing Adaptor Molecule). It is located at the N-termini of proteins involved in intracellular membrane trafficking. The epsin N-terminal homology (ENTH) domain is an evolutionarily conserved protein module found primarily in proteins that participate in clathrin-mediated endocytosis. A set of proteins previously designated as harboring an ENTH domain in fact contains a highly similar, yet unique module referred to as an AP180 N-terminal homology (ANTH) domain. VHS, ENTH and ANTH domains are structurally similar and are composed of a superhelix of eight alpha helices. ENTH and ANTH (E/ANTH) domains bind both inositol phospholipids and proteins and contribute to the nucleation and formation of clathrin coats on membranes. ENTH domains also function in the development of membrane curvature through lipid remodeling during the formation of clathrin-coated vesicles. E/ANTH domain-bearing proteins have recently been shown to function with adaptor protein-1 and GGA adaptors at the trans-Golgi network, which suggests that E/ANTH domains are universal components of the machinery for clathrin-mediated membrane budding." Q#17259 - CGI_10013693 superfamily 243095 2 47 0.0017793 37.2813 cl02570 RhoGAP superfamily N - "RhoGAP: GTPase-activator protein (GAP) for Rho-like GTPases; GAPs towards Rho/Rac/Cdc42-like small GTPases. Small GTPases (G proteins) cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when bound to GDP. 
The Rho family of small G proteins, which includes Cdc42Hs, activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. G proteins generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. The RhoGAPs are one of the major classes of regulators of Rho G proteins." Q#17260 - CGI_10013694 superfamily 241574 319 446 4.11E-48 169.326 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#17260 - CGI_10013694 superfamily 241626 66 179 7.31E-25 102.36 cl00125 RHOD superfamily - - "Rhodanese Homology Domain (RHOD); an alpha beta fold domain found duplicated in the rhodanese protein. The cysteine containing enzymatically active version of the domain is also found in the Cdc25 class of protein phosphatases and a variety of proteins such as sulfide dehydrogenases and certain stress proteins such as senescence specific protein 1 in plants, PspE and GlpE in bacteria and cyanide and arsenate resistance proteins. Inactive versions (no active site cysteine) are also seen in dual specificity phosphatases, ubiquitin hydrolases from yeast and in sulfuryltransferases, where they are believed to play a regulatory role in multidomain proteins." 
Q#17263 - CGI_10013697 superfamily 242466 8 156 7.06E-54 169.672 cl01379 TspO_MBR superfamily - - "TspO/MBR family; Tryptophan-rich sensory protein (TspO) is an integral membrane protein that acts as a negative regulator of the expression of specific photosynthesis genes in response to oxygen/light. It is involved in the efflux of porphyrin intermediates from the cell. This reduces the activity of coproporphyrinogen III oxidase, which is thought to lead to the accumulation of a putative repressor molecule that inhibits the expression of specific photosynthesis genes. Several conserved aromatic residues are necessary for TspO function: they are thought to be involved in binding porphyrin intermediates. The rat mitochondrial peripheral benzodiazepine receptor (MBR) was shown to not only retain its structure within a bacterial outer membrane, but also to be able to functionally substitute for TspO in TspO- mutants, and to act in a similar manner to TspO in its in situ location: the outer mitochondrial membrane. The biological significance of MBR remains unclear, however. It is thought to be involved in a variety of cellular functions, including cholesterol transport in steroidogenic tissues." Q#17265 - CGI_10013699 superfamily 241559 2 34 0.000355034 36.8736 cl00030 CH superfamily C - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." Q#17266 - CGI_10013700 superfamily 243105 37 334 1.78E-96 295.152 cl02603 TEA superfamily N - TEA/ATTS domain family; TEA/ATTS domain family. 
Q#17268 - CGI_10013702 superfamily 243092 53 81 0.00141587 33.4796 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#17270 - CGI_10013704 superfamily 245819 65 233 2.68E-25 104.969 cl11967 Nucleotidyl_cyc_III superfamily - - "Class III nucleotidyl cyclases; Class III nucleotidyl cyclases are the largest, most diverse group of nucleotidyl cyclases (NC's) containing prokaryotic and eukaryotic proteins. They can be divided into two major groups; the mononucleotidyl cyclases (MNC's) and the diguanylate cyclases (DGC's). The MNC's, which include the adenylate cyclases (AC's) and the guanylate cyclases (GC's), have a conserved cyclase homology domain (CHD), while the DGC's have a conserved GGDEF domain, named after a conserved motif within this subgroup. 
Their products, cyclic guanylyl and adenylyl nucleotides, are second messengers that play important roles in eukaryotic signal transduction and prokaryotic sensory pathways." Q#17270 - CGI_10013704 superfamily 245819 399 557 3.24E-11 62.9819 cl11967 Nucleotidyl_cyc_III superfamily - - "Class III nucleotidyl cyclases; Class III nucleotidyl cyclases are the largest, most diverse group of nucleotidyl cyclases (NC's) containing prokaryotic and eukaryotic proteins. They can be divided into two major groups; the mononucleotidyl cyclases (MNC's) and the diguanylate cyclases (DGC's). The MNC's, which include the adenylate cyclases (AC's) and the guanylate cyclases (GC's), have a conserved cyclase homology domain (CHD), while the DGC's have a conserved GGDEF domain, named after a conserved motif within this subgroup. Their products, cyclic guanylyl and adenylyl nucleotides, are second messengers that play important roles in eukaryotic signal transduction and prokaryotic sensory pathways." Q#17270 - CGI_10013704 superfamily 247743 587 743 1.79E-10 60.2912 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#17271 - CGI_10013705 superfamily 216981 155 254 0.00380873 35.201 cl17087 OTU superfamily - - "OTU-like cysteine protease; This family is comprised of a group of predicted cysteine proteases, homologous to the Ovarian Tumour (OTU) gene in Drosophila. 
Members include proteins from eukaryotes, viruses and pathogenic bacterium. The conserved cysteine and histidine, and possibly the aspartate, represent the catalytic residues in this putative group of proteases." Q#17272 - CGI_10013706 superfamily 247692 63 566 0 573.815 cl17068 AFD_class_I superfamily - - "Adenylate forming domain, Class I; This family includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain." Q#17273 - CGI_10013707 superfamily 245814 231 303 4.11E-14 68.6699 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#17273 - CGI_10013707 superfamily 245814 574 644 1.75E-11 60.9659 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#17273 - CGI_10013707 superfamily 245814 132 197 0.000514219 39.0095 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#17276 - CGI_10021926 superfamily 247724 127 150 0.00256046 35.8431 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#17277 - CGI_10021927 superfamily 247724 78 275 2.39E-27 105.949 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. 
Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#17278 - CGI_10021928 superfamily 247724 222 326 2.93E-21 89.3858 cl17170 Ras_like_GTPase superfamily N - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#17279 - CGI_10021929 superfamily 247724 85 205 7.32E-16 72.8222 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. 
This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#17280 - CGI_10021930 superfamily 247724 116 306 1.26E-33 124.054 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. 
The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#17281 - CGI_10021931 superfamily 247725 1357 1493 6.84E-77 252.2 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#17281 - CGI_10021931 superfamily 243096 1191 1357 1.22E-30 121.251 cl02571 RhoGEF superfamily - - Guanine nucleotide exchange factor for Rho/Rac/Cdc42-like GTPases; Also called Dbl-homologous (DH) domain. It appears that PH domains invariably occur C-terminal to RhoGEF/DH domains. 
Q#17281 - CGI_10021931 superfamily 243054 840 1071 1.20E-10 62.078 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#17281 - CGI_10021931 superfamily 247069 693 839 4.87E-10 59.6646 cl15787 SEC14 superfamily - - "Sec14p-like lipid-binding domain. Found in secretory proteins, such as S. cerevisiae phosphatidylinositol transfer protein (Sec14p), and in lipid regulated proteins such as RhoGAPs, RhoGEFs and neurofibromin (NF1). SEC14 domain of Dbl is known to associate with G protein beta/gamma subunits." Q#17282 - CGI_10021932 superfamily 247905 808 937 7.57E-30 118.109 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#17282 - CGI_10021932 superfamily 247805 512 661 4.25E-23 98.9487 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. 
A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#17282 - CGI_10021932 superfamily 248013 407 459 3.31E-06 47.2587 cl17459 CHROMO superfamily - - "Chromatin organization modifier (chromo) domain is a conserved region of around 50 amino acids found in a variety of chromosomal proteins, which appear to play a role in the functional organization of the eukaryotic nucleus. Experimental evidence implicates the chromo domain in the binding activity of these proteins to methylated histone tails and maybe RNA. May occur as single instance, in a tandem arrangement or followed by a related "chromo shadow" domain." Q#17282 - CGI_10021932 superfamily 248013 313 380 7.08E-05 43.4067 cl17459 CHROMO superfamily - - "Chromatin organization modifier (chromo) domain is a conserved region of around 50 amino acids found in a variety of chromosomal proteins, which appear to play a role in the functional organization of the eukaryotic nucleus. Experimental evidence implicates the chromo domain in the binding activity of these proteins to methylated histone tails and maybe RNA. May occur as single instance, in a tandem arrangement or followed by a related "chromo shadow" domain." Q#17282 - CGI_10021932 superfamily 207699 2256 2299 5.72E-12 63.8523 cl02688 BRK superfamily - - BRK domain; The function of this domain is unknown. It is often found associated with helicases and transcription factors. Q#17282 - CGI_10021932 superfamily 207699 2181 2219 1.32E-11 62.6967 cl02688 BRK superfamily - - BRK domain; The function of this domain is unknown. It is often found associated with helicases and transcription factors. Q#17283 - CGI_10021933 superfamily 202014 498 695 2.42E-65 219.21 cl03387 RB_A superfamily - - Retinoblastoma-associated protein A domain; This domain has the cyclin fold as predicted. 
Q#17283 - CGI_10021933 superfamily 221325 64 207 8.59E-46 162.486 cl13385 DUF3452 superfamily - - "Domain of unknown function (DUF3452); This presumed domain is functionally uncharacterized. This domain is found in bacteria and eukaryotes. This domain is typically between 124 to 150 amino acids in length. This domain is found associated with pfam01858, pfam01857. This domain has a single completely conserved residue W that may be functionally important." Q#17283 - CGI_10021933 superfamily 216744 812 849 1.76E-08 53.9245 cl18378 RB_B superfamily N - "Retinoblastoma-associated protein B domain; The crystal structure of the Rb pocket bound to a nine-residue E7 peptide containing the LxCxE motif, shared by other Rb-binding viral and cellular proteins, shows that the LxCxE peptide binds a highly conserved groove on the B domain. The B domain has a cyclin fold." Q#17284 - CGI_10021934 superfamily 247787 29 191 3.20E-33 120.381 cl17233 RecA-like_NTPases superfamily C - "RecA-like NTPases. This family includes the NTP binding domain of F1 and V1 H+ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. This group also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion." Q#17288 - CGI_10021938 superfamily 247724 7 178 5.86E-119 337.209 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#17289 - CGI_10021939 superfamily 247856 128 176 2.63E-07 47.5425 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#17289 - CGI_10021939 superfamily 247856 258 306 2.63E-07 47.5425 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#17289 - CGI_10021939 superfamily 247856 193 248 0.000369397 38.2977 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. 
Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#17289 - CGI_10021939 superfamily 247856 323 378 0.000411093 37.9125 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#17292 - CGI_10021942 superfamily 245814 185 251 1.18E-09 55.1879 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#17293 - CGI_10021943 superfamily 221591 249 469 6.24E-58 191.98 cl13854 SUFU_C superfamily - - "Suppressor of Fused Gli/Ci N terminal binding domain; This domain family is found in eukaryotes, and is typically between 192 and 219 amino acids in length. The family is found in association with pfam05076. There is a conserved HGRHFT sequence motif. This family is the C terminal domain of the Suppressor of Fused protein (Su(fu)). Su(fu) is a repressor of the Gli and Ci transcription factors of the Hedgehog signalling cascade. It functions by binding these proteins and preventing their translocation to the nucleus. 
The C terminal domain is only found in eukaryotic Su(fu) proteins; it is not present in bacterial homologues. The C terminal domain binds to the N terminal of Gli/Ci while the N terminal of Su(fu) binds to the C terminal of Gli/Ci. This dual binding mechanism is likely an evolutionary advancement in this signalling cascade which is not present in bacterial homologues." Q#17293 - CGI_10021943 superfamily 218418 56 237 3.92E-28 109.334 cl04922 SUFU superfamily - - "Suppressor of fused protein (SUFU); SUFU, encoding the human orthologue of Drosophila suppressor of fused, appears to have a conserved role in the repression of Hedgehog signaling. SUFU exerts its repressor role by physically interacting with GLI proteins in both the cytoplasm and the nucleus. SUFU has been found to be a tumour-suppressor gene that predisposes individuals to medulloblastoma by modulating the SHH signaling pathway. Genomic contextual analysis of bacterial SUFU versions revealed that they are immunity proteins against diverse nuclease toxins in polymorphic toxin systems." Q#17294 - CGI_10021944 superfamily 247744 489 607 7.38E-30 115.322 cl17190 NK superfamily - - "Nucleoside/nucleotide kinase (NK) is a protein superfamily consisting of multiple families of enzymes that share structural similarity and are functionally related to the catalysis of the reversible phosphate group transfer from nucleoside triphosphates to nucleosides/nucleotides, nucleoside monophosphates, or sugars. Members of this family play a wide variety of essential roles in nucleotide metabolism, the biosynthesis of coenzymes and aromatic compounds, as well as the metabolism of sugar and sulfate." Q#17294 - CGI_10021944 superfamily 241622 264 344 1.20E-15 73.3698 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). 
Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either an N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95 (post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#17294 - CGI_10021944 superfamily 247683 365 427 3.20E-31 116.763 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. 
Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#17295 - CGI_10021945 superfamily 245201 60 332 2.39E-56 188.861 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#17296 - CGI_10021946 superfamily 243072 165 292 6.42E-35 124.803 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#17296 - CGI_10021946 superfamily 241584 4 98 9.71E-06 42.8687 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." 
Q#17296 - CGI_10021946 superfamily 243072 137 169 6.19E-06 42.5412 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#17297 - CGI_10021947 superfamily 243056 500 734 1.09E-51 179.425 cl02495 RabGAP-TBC superfamily - - "Rab-GTPase-TBC domain; Identification of a TBC domain in GYP6_YEAST and GYP7_YEAST, which are GTPase activator proteins of yeast Ypt6 and Ypt7, implies that these domains are GTPase activator proteins of Rab-like small GTPases." Q#17298 - CGI_10021948 superfamily 142634 1051 1469 0 769.834 cl11429 RNAP_largest_subunit_C superfamily - - "Largest subunit of RNA polymerase (RNAP), C-terminal domain; RNA polymerase (RNAP) is a large multi-subunit complex responsible for the synthesis of RNA. It is the principal enzyme of the transcription process, and is the final target in many regulatory pathways that control gene expression in all living cells. At least three distinct RNAP complexes are found in eukaryotic nuclei, RNAP I, RNAP II, and RNAP III, for the synthesis of ribosomal RNA precursor, mRNA precursor, and 5S and tRNA, respectively. A single distinct RNAP complex is found in prokaryotes and archaea, which may be responsible for the synthesis of all RNAs. Structure studies revealed that prokaryotic and eukaryotic RNAPs share a conserved crab-claw-shape structure. The largest and the second largest subunits each make up one clamp, one jaw, and part of the cleft. 
The largest RNAP subunit (Rpb1) interacts with the second-largest RNAP subunit (Rpb2) to form the DNA entry and RNA exit channels in addition to the catalytic center of RNA synthesis. The region covered by this domain makes up part of the foot and jaw structures. In archaea, some photosynthetic organisms, and some organelles, this domain exists as a separate subunit, while it forms the C-terminal region of the RNAP largest subunit in eukaryotes and bacteria." Q#17298 - CGI_10021948 superfamily 245715 239 539 1.18E-152 477.781 cl11591 RNA_pol_Rpb1_2 superfamily - - "RNA polymerase Rpb1, domain 2; RNA polymerases catalyze the DNA dependent polymerisation of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). This domain, domain 2, contains the active site. The invariant motif -NADFDGD- binds the active site magnesium ion." Q#17298 - CGI_10021948 superfamily 218370 11 347 1.41E-117 378.951 cl04880 RNA_pol_Rpb1_1 superfamily - - "RNA polymerase Rpb1, domain 1; RNA polymerases catalyze the DNA dependent polymerisation of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). This domain, domain 1, represents the clamp domain, which is a mobile domain involved in positioning the DNA, maintenance of the transcription bubble and positioning of the nascent RNA strand." Q#17298 - CGI_10021948 superfamily 218368 889 1072 1.97E-96 312.104 cl04878 RNA_pol_Rpb1_6 superfamily - - "RNA polymerase Rpb1, domain 6; RNA polymerases catalyze the DNA dependent polymerisation of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). This domain, domain 6, represents a mobile module of the RNA polymerase. Domain 6 forms part of the shelf module. This family appears to be specific to the largest subunit of RNA polymerase II." 
Q#17298 - CGI_10021948 superfamily 218361 518 685 5.09E-58 200.925 cl04873 RNA_pol_Rpb1_3 superfamily - - "RNA polymerase Rpb1, domain 3; RNA polymerases catalyze the DNA dependent polymerisation of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). This domain, domain 3, represents the pore domain. The 3' end of RNA is positioned close to this domain. The pore delimited by this domain is thought to act as a channel through which nucleotides enter the active site and/or where the 3' end of the RNA may be extruded during back-tracking." Q#17298 - CGI_10021948 superfamily 218372 710 816 5.03E-44 158.688 cl04881 RNA_pol_Rpb1_4 superfamily - - "RNA polymerase Rpb1, domain 4; RNA polymerases catalyze the DNA dependent polymerisation of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). This domain, domain 4, represents the funnel domain. The funnel contains the binding site for some elongation factors." Q#17299 - CGI_10021949 superfamily 218390 198 254 6.06E-17 78.5113 cl04895 PARG_cat superfamily N - "Poly (ADP-ribose) glycohydrolase (PARG); Poly(ADP-ribose) glycohydrolase (PARG), is a ubiquitously expressed exo- and endoglycohydrolase which mediates oxidative and excitotoxic neuronal death." Q#17301 - CGI_10021951 superfamily 151071 129 173 2.02E-19 77.9485 cl11152 APP_amyloid superfamily - - "beta-amyloid precursor protein C-terminus; This is the amyloid, C-terminal, protein of the beta-Amyloid precursor protein (APP) which is a conserved and ubiquitous transmembrane glycoprotein strongly implicated in the pathogenesis of Alzheimer's disease but whose normal biological function is unknown. The C-terminal 100 residues are released and aggregate into amyloid deposits which are strongly implicated in the pathology of Alzheimer's disease plaque-formation. 
The domain is associated with family A4_EXTRA, pfam02177, further towards the N-terminus." Q#17302 - CGI_10021952 superfamily 248458 43 461 1.18E-26 109.326 cl17904 MFS superfamily - - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#17304 - CGI_10021954 superfamily 247805 655 790 1.07E-24 103.186 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. 
This domain contains the ATP-binding region. Q#17304 - CGI_10021954 superfamily 247905 849 976 1.67E-11 64.1812 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#17304 - CGI_10021954 superfamily 247792 1874 1908 0.000572999 40.1216 cl17238 RING superfamily C - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#17304 - CGI_10021954 superfamily 243778 1028 1118 4.05E-13 68.0195 cl04503 HA2 superfamily - - "Helicase associated domain (HA2); This presumed domain is about 90 amino acid residues in length. It is found in a diverse set of RNA helicases. Its function is unknown; however, it seems likely to be involved in nucleic acid binding." 
Q#17304 - CGI_10021954 superfamily 219532 1156 1250 0.000681786 40.3754 cl06657 OB_NTP_bind superfamily - - "Oligonucleotide/oligosaccharide-binding (OB)-fold; This family is found towards the C-terminus of the DEAD-box helicases (pfam00270). In these helicases it is apparently always found in association with pfam04408. There do seem to be a couple of instances where it occurs by itself. The structure PDB:3i4u adopts an OB-fold. This C-terminal domain of the yeast helicase contains an oligonucleotide/oligosaccharide-binding (OB)-fold which seems to be placed at the entrance of the putative nucleic acid cavity. It also constitutes the binding site for the G-patch-containing domain of Pfa1p. When found on DEAH/RHA helicases, this domain is central to the regulation of the helicase activity through its binding of both RNA and G-patch domain proteins." Q#17306 - CGI_10008957 superfamily 241874 39 333 1.17E-159 462.531 cl00456 SLC5-6-like_sbd superfamily C - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. 
They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#17308 - CGI_10008959 superfamily 241640 293 462 1.96E-06 46.7969 cl00149 Tryp_SPc superfamily - - Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues. Q#17309 - CGI_10008960 superfamily 246664 1 91 1.58E-49 165.13 cl14561 An_peroxidase_like superfamily N - "Animal heme peroxidases and related proteins; A diverse family of enzymes, which includes prostaglandin G/H synthase, thyroid peroxidase, myeloperoxidase, linoleate diol synthase, lactoperoxidase, peroxinectin, peroxidasin, and others. Despite its name, this family is not restricted to metazoans: members are found in fungi, plants, and bacteria as well." Q#17310 - CGI_10008961 superfamily 246664 42 435 0 608.88 cl14561 An_peroxidase_like superfamily - - "Animal heme peroxidases and related proteins; A diverse family of enzymes, which includes prostaglandin G/H synthase, thyroid peroxidase, myeloperoxidase, linoleate diol synthase, lactoperoxidase, peroxinectin, peroxidasin, and others. Despite its name, this family is not restricted to metazoans: members are found in fungi, plants, and bacteria as well." Q#17311 - CGI_10008962 superfamily 245598 190 777 2.66E-119 373.98 cl11396 Patatin_and_cPLA2 superfamily - - "Patatins and Phospholipases; Patatin-like phospholipase. This family consists of various patatin glycoproteins from plants. The patatin protein accounts for up to 40% of the total soluble protein in potato tubers. Patatin is a storage protein, but it also has the enzymatic activity of a lipid acyl hydrolase, catalyzing the cleavage of fatty acids from membrane lipids. 
Members of this family have also been found in vertebrates. This family also includes the catalytic domain of cytosolic phospholipase A2 (PLA2; EC 3.1.1.4), which hydrolyzes the sn-2-acyl ester bond of phospholipids to release arachidonic acid. At the active site, cPLA2 contains a serine nucleophile through which the catalytic mechanism is initiated. The active site is partially covered by a solvent-accessible flexible lid. cPLA2 displays interfacial activation as it exists in both "closed lid" and "open lid" forms." Q#17311 - CGI_10008962 superfamily 246669 65 170 1.08E-23 97.7184 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#17312 - CGI_10008963 superfamily 222366 5221 6022 0 1090.45 cl16381 E3_UbLigase_R4 superfamily - - E3 ubiquitin-protein ligase UBR4; This is a family of E3 ubiquitin ligase enzymes. 
Q#17313 - CGI_10008964 superfamily 243039 718 851 2.13E-54 188.585 cl02446 MATH superfamily - - "MATH (meprin and TRAF-C homology) domain; an independent folding unit with an eight-stranded beta-sandwich structure found in meprins, TRAFs and other proteins. Meprins comprise a class of extracellular metalloproteases which are anchored to the membrane and are capable of cleaving growth factors, extracellular matrix proteins, and biologically active peptides. TRAF molecules serve as adapter proteins that link cell surface receptors of the Tumor Necrosis Factor and Interleukin-1/Toll-like families to downstream kinase cascades, which results in the activation of transcription factors and the regulation of cell survival, proliferation and stress responses in the immune and inflammatory systems. Other members include the ubiquitin ligases, TRIM37 and SPOP, and the ubiquitin-specific proteases, HAUSP and Ubp21p. A large number of uncharacterized members mostly from lineage-specific expansions in C. elegans and rice contain MATH and BTB domains, similar to SPOP. The MATH domain has been shown to bind peptide/protein substrates in TRAFs and HAUSP. It is possible that the MATH domain in other members of this superfamily also interacts with various protein substrates. The TRAF domain may also be involved in the trimerization of TRAFs. Based on homology, it is postulated that the MATH domain in meprins may be involved in its tetramer assembly and that the MATH domain, in general, may take part in diverse modular arrangements defined by adjacent multimerization domains." Q#17313 - CGI_10008964 superfamily 241736 94 346 1.06E-95 311.221 cl00263 TFold superfamily - - "Tunnelling fold (T-fold). The five known T-folds are found in five different enzymes with different functions: dihydroneopterin-triphosphate epimerase (DHNTPE), dihydroneopterin aldolase (DHNA), GTP cyclohydrolase I (GTPCH-1), 6-pyruvoyl tetrahydropterin synthetase (PTPS), and uricase (UO, uroate/urate oxidase). 
They bind to substrates belonging to the purine or pterin families, and share a fold-related binding site with a glutamate or glutamine residue anchoring the substrate and a lot of conserved interactions. They also share a similar oligomerization mode: several T-folds join together to form a beta(2n)alpha(n) barrel, then two barrels join together in a head-to-head fashion to make up the native enzymes. The functional enzyme is a tetramer for UO, a hexamer for PTPS, an octamer for DHNA/DHNTPE and a decamer for GTPCH-1. The substrate is located in a deep and narrow pocket at the interface between monomers. In PTPS, the active site is located at the interface of three monomers, two from one trimer and one from the other trimer. In GTPCH-1, it is also located at the interface of three subunits, two from one pentamer and one from the other pentamer. There are four equivalent active sites in UO, six in PTPS, eight in DHNA/DHNTPE and ten in GTPCH-1. Each globular multimeric enzyme encloses a tunnel which is lined with charged residues for DHNA and UO, and with basic residues in PTPS. The N and C-terminal ends are located on one side of the T-fold while the residues involved in the catalytic activity are located at the opposite side. In PTPS, UO and DHNA/DHNTPE, the N and C-terminal extremities of the enzyme are located on the exterior side of the functional multimeric enzyme. In GTPCH-1, the extra C-terminal helix places the extremity inside the tunnel." Q#17313 - CGI_10008964 superfamily 243066 1088 1184 1.56E-16 77.7312 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. 
The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#17313 - CGI_10008964 superfamily 190233 447 500 2.52E-07 49.7602 cl08341 zf-TRAF superfamily - - TRAF-type zinc finger; TRAF-type zinc finger. Q#17313 - CGI_10008964 superfamily 190233 500 558 1.20E-06 47.8342 cl08341 zf-TRAF superfamily - - TRAF-type zinc finger; TRAF-type zinc finger. Q#17313 - CGI_10008964 superfamily 198867 1284 1334 0.00982479 36.1653 cl06652 BACK superfamily C - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation)." Q#17316 - CGI_10012280 superfamily 241600 216 428 4.34E-83 256.398 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." 
Q#17316 - CGI_10012280 superfamily 241600 63 165 4.68E-32 121.193 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#17318 - CGI_10012282 superfamily 241785 1 84 6.02E-41 139.069 cl00324 Ribosomal_L3 superfamily N - Ribosomal protein L3; Ribosomal protein L3. Q#17320 - CGI_10012284 superfamily 243035 31 152 1.50E-29 111.558 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#17320 - CGI_10012284 superfamily 243035 174 265 5.08E-13 65.7189 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#17322 - CGI_10012286 superfamily 238191 34 560 1.30E-111 345.086 cl18907 Esterase_lipase superfamily - - "Esterases and lipases (includes fungal lipases, cholinesterases, etc.) These enzymes act on carboxylic esters (EC: 3.1.1.-). The catalytic apparatus involves three residues (catalytic triad): a serine, a glutamate or aspartate and a histidine. These catalytic residues are responsible for the nucleophilic attack on the carbonyl carbon atom of the ester bond. In contrast with other alpha/beta hydrolase fold family members, p-nitrobenzyl esterase and acetylcholine esterase have a Glu instead of Asp at the active site carboxylate." Q#17325 - CGI_10012289 superfamily 243051 291 453 1.29E-54 182.192 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). 
In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal cord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich regions." Q#17325 - CGI_10012289 superfamily 245835 71 281 0.00918733 36.1798 cl12013 BAR superfamily - - "The Bin/Amphiphysin/Rvs (BAR) domain, a dimerization module that binds membranes and detects membrane curvature; BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions including organelle biogenesis, membrane trafficking or remodeling, and cell division and migration. Mutations in BAR containing proteins have been linked to diseases and their inactivation in cells leads to altered membrane dynamics. A BAR domain with an additional N-terminal amphipathic helix (an N-BAR) can drive membrane curvature. These N-BAR domains are found in amphiphysins and endophilins, among others. BAR domains are also frequently found alongside domains that determine lipid specificity, such as the Pleckstrin Homology (PH) and Phox Homology (PX) domains which are present in beta centaurins (ACAPs and ASAPs) and sorting nexins, respectively. A FES-CIP4 Homology (FCH) domain together with a coiled coil region is called the F-BAR domain and is present in Pombe/Cdc15 homology (PCH) family proteins, which include Fes/Fer tyrosine kinases, PACSIN or syndapin, CIP4-like proteins, and srGAPs, among others. The Inverse (I)-BAR or IRSp53/MIM homology Domain (IMD) is found in multi-domain proteins, such as IRSp53 and MIM, that act as scaffolding proteins and transducers of a variety of signaling pathways that link membrane dynamics and the underlying actin cytoskeleton. 
BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The I-BAR domain induces membrane protrusions in the opposite direction compared to classical BAR and F-BAR domains, which produce membrane invaginations. BAR domains that also serve as protein interaction domains include those of arfaptin and OPHN1-like proteins, among others, which bind to Rac and Rho GAP domains, respectively." Q#17326 - CGI_10012290 superfamily 222150 490 515 1.33E-06 46.6161 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#17326 - CGI_10012290 superfamily 222150 518 542 7.48E-05 41.2233 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#17326 - CGI_10012290 superfamily 222150 376 403 0.00017005 40.4529 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#17326 - CGI_10012290 superfamily 222150 463 487 0.00106088 38.1417 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#17326 - CGI_10012290 superfamily 222150 406 431 0.00242807 36.9861 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#17327 - CGI_10012291 superfamily 242876 20 160 6.83E-67 202.97 cl02092 Clat_adaptor_s superfamily - - Clathrin adaptor complex small chain; Clathrin adaptor complex small chain. Q#17328 - CGI_10012292 superfamily 248458 12 189 4.16E-11 63.1017 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. 
Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#17328 - CGI_10012292 superfamily 248458 337 477 8.17E-08 52.7013 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. 
The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#17329 - CGI_10012293 superfamily 247675 37 325 4.85E-171 480.048 cl17011 Arginase_HDAC superfamily - - "Arginase-like and histone-like hydrolases; Arginase-like/histone-like hydrolase superfamily includes metal-dependent enzymes that belong to Arginase-like amidino hydrolase family and histone/histone-like deacetylase class I, II, IV family, respectively. These enzymes catalyze hydrolysis of amide bonds. Arginases are known to be involved in control of cellular levels of arginine and ornithine, in histidine and arginine degradation and in clavulanic acid biosynthesis. Deacetylases play a role in signal transduction through histone and/or other protein modification and can repress/activate transcription of a number of different genes. They participate in different cellular processes including cell cycle regulation, DNA damage response, embryonic development, cytokine signaling important for immune response and post-translational control of the acetyl coenzyme A synthetase. Mammalian histone deacetylases are known to be involved in progression of different tumors. 
Specific inhibitors of mammalian histone deacetylases are an emerging class of promising novel anticancer drugs." Q#17332 - CGI_10012296 superfamily 245201 60 108 2.14E-06 48.5245 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#17333 - CGI_10012297 superfamily 217293 1 177 1.06E-74 234.835 cl03788 Neur_chan_LBD superfamily N - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#17333 - CGI_10012297 superfamily 202474 184 410 4.13E-66 213.284 cl08379 Neur_chan_memb superfamily - - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#17334 - CGI_10012298 superfamily 217293 26 249 1.60E-89 275.666 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#17334 - CGI_10012298 superfamily 202474 256 482 2.90E-64 210.203 cl08379 Neur_chan_memb superfamily - - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. 
Q#17335 - CGI_10012299 superfamily 217293 33 259 7.60E-82 255.635 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#17335 - CGI_10012299 superfamily 202474 266 479 8.54E-52 177.076 cl08379 Neur_chan_memb superfamily - - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#17336 - CGI_10012300 superfamily 217293 1 222 8.28E-63 205.559 cl03788 Neur_chan_LBD superfamily N - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#17336 - CGI_10012300 superfamily 202474 229 466 2.92E-62 204.81 cl08379 Neur_chan_memb superfamily - - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#17337 - CGI_10012301 superfamily 202474 151 406 1.82E-47 163.979 cl08379 Neur_chan_memb superfamily - - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#17337 - CGI_10012301 superfamily 217293 28 144 2.20E-31 119.275 cl03788 Neur_chan_LBD superfamily N - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#17338 - CGI_10012302 superfamily 202474 111 332 2.34E-49 167.446 cl08379 Neur_chan_memb superfamily - - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. 
Q#17338 - CGI_10012302 superfamily 217293 2 104 1.50E-23 96.5479 cl03788 Neur_chan_LBD superfamily N - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#17339 - CGI_10012303 superfamily 217293 36 246 3.90E-77 243.309 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#17339 - CGI_10012303 superfamily 202474 253 482 4.06E-50 172.453 cl08379 Neur_chan_memb superfamily - - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#17340 - CGI_10012304 superfamily 217293 52 261 2.75E-85 264.11 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#17340 - CGI_10012304 superfamily 202474 268 471 1.91E-18 83.4721 cl08379 Neur_chan_memb superfamily - - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#17342 - CGI_10006348 superfamily 220622 31 174 2.71E-68 209.301 cl10879 Mesd superfamily - - "Chaperone for wingless signalling and trafficking of LDL receptor; Mesd is a family of highly conserved proteins found from nematodes to humans. The final C-terminal residues, KEDL, are the endoplasmic reticulum retention sequence as it is an ER protein specifically required for the intracellular trafficking of members of the low-density lipoprotein family of receptors (LDLRs). 
The N- and C-terminal sequences are predicted to adopt a random coil conformation, with the exception of an isolated predicted helix within the N-terminal region, The central folded domain flanked by natively unstructured regions is the necessary structure for facilitating maturation of LRP6 (Low-Density Lipoprotein Receptor-Related Protein 6 Maturation)." Q#17343 - CGI_10006349 superfamily 245225 17 183 5.71E-32 120.045 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily C - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein couples receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine- binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. 
These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#17345 - CGI_10006351 superfamily 245225 1 146 1.15E-26 105.471 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily C - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein couples receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine- binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. 
These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#17347 - CGI_10025340 superfamily 241578 272 411 2.40E-11 61.1534 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#17347 - CGI_10025340 superfamily 241568 22 73 0.006984 34.746 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. 
Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#17350 - CGI_10025343 superfamily 243050 63 118 8.23E-36 127.092 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#17350 - CGI_10025343 superfamily 243050 4 55 6.04E-34 121.766 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. 
The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#17350 - CGI_10025343 superfamily 241599 259 317 1.13E-20 85.758 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#17352 - CGI_10025345 superfamily 245208 54 434 0 540.318 cl09933 ACAD superfamily - - "Acyl-CoA dehydrogenase; Both mitochondrial acyl-CoA dehydrogenases (ACAD) and peroxisomal acyl-CoA oxidases (AXO) catalyze the alpha,beta dehydrogenation of the corresponding trans-enoyl-CoA by FAD, which becomes reduced. The reduced form of ACAD is reoxidized in the oxidative half-reaction by electron-transferring flavoprotein (ETF), from which the electrons are transferred to the mitochondrial respiratory chain coupled with ATP synthesis. In contrast, AXO catalyzes a different oxidative half-reaction, in which the reduced FAD is reoxidized by molecular oxygen. The ACAD family includes the eukaryotic beta-oxidation enzymes, short (SCAD), medium (MCAD), long (LCAD) and very-long (VLCAD) chain acyl-CoA dehydrogenases. These enzymes all share high sequence similarity, but differ in their substrate specificities. The ACAD family also includes amino acid catabolism enzymes such as Isovaleryl-CoA dehydrogenase (IVD), short/branched chain acyl-CoA dehydrogenases (SBCAD), Isobutyryl-CoA dehydrogenase (IBDH), glutaryl-CoA dehydrogenase (GCD) and Crotonobetainyl-CoA dehydrogenase. The mitochondrial ACAD's are generally homotetramers, except for VLCAD, which is a homodimer. 
Related enzymes include the SOS adaptive response protein aidB, Naphthocyclinone hydroxylase (NcnH), and Dibenzothiophene (DBT) desulfurization enzyme C (DszC)" Q#17353 - CGI_10025346 superfamily 241583 161 317 2.69E-91 280.63 cl00064 ZnMc superfamily - - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#17353 - CGI_10025346 superfamily 243048 353 549 1.19E-40 146.302 cl02471 HX superfamily - - Hemopexin-like repeats.; Hemopexin is a heme-binding protein that transports heme to the liver. Hemopexin-like repeats occur in vitronectin and some matrix metalloproteinases family (matrixins). The HX repeats of some matrixins bind tissue inhibitor of metalloproteinases (TIMPs). This CD contains 4 instances of the repeat. Q#17353 - CGI_10025346 superfamily 216518 72 134 0.000703512 37.8949 cl18368 PG_binding_1 superfamily - - Putative peptidoglycan binding domain; This domain is composed of three alpha helices. This domain is found at the N or C terminus of a variety of enzymes involved in bacterial cell wall degradation. This domain may have a general peptidoglycan binding function. This family is found N-terminal to the catalytic domain of matrixins. The domain is found to bind peptidoglycan experimentally. Q#17355 - CGI_10025348 superfamily 241583 107 264 2.57E-80 247.503 cl00064 ZnMc superfamily - - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." 
Q#17355 - CGI_10025348 superfamily 243048 316 436 1.07E-19 86.2112 cl02471 HX superfamily N - Hemopexin-like repeats.; Hemopexin is a heme-binding protein that transports heme to the liver. Hemopexin-like repeats occur in vitronectin and some matrix metalloproteinases family (matrixins). The HX repeats of some matrixins bind tissue inhibitor of metalloproteinases (TIMPs). This CD contains 4 instances of the repeat. Q#17355 - CGI_10025348 superfamily 216518 18 70 1.17E-07 48.6805 cl18368 PG_binding_1 superfamily - - Putative peptidoglycan binding domain; This domain is composed of three alpha helices. This domain is found at the N or C terminus of a variety of enzymes involved in bacterial cell wall degradation. This domain may have a general peptidoglycan binding function. This family is found N-terminal to the catalytic domain of matrixins. The domain is found to bind peptidoglycan experimentally. Q#17356 - CGI_10025349 superfamily 191444 134 198 0.00740895 33.4517 cl05558 IL17 superfamily N - Interleukin-17; IL-17 is a potent proinflammatory cytokine produced by activated memory T cells. The IL-17 family is thought to represent a distinct signaling system that appears to have been highly conserved across vertebrate evolution. Q#17357 - CGI_10025350 superfamily 244083 162 258 1.39E-35 125.106 cl05417 PLA2_like superfamily - - "PLA2_like: Phospholipase A2, a super-family of secretory and cytosolic enzymes; the latter are either Ca dependent or Ca independent. PLA2 cleaves the sn-2 position of the glycerol backbone of phospholipids (PC or phosphatidylethanolamine), usually in a metal-dependent reaction, to generate lysophospholipid (LysoPL) and a free fatty acid (FA). The resulting products are either dietary or used in synthetic pathways for leukotrienes and prostaglandins. Often, arachidonic acid is released as a free fatty acid and acts as second messenger in signaling networks. 
Secreted PLA2s have also been found to specifically bind to a variety of soluble and membrane proteins in mammals, including receptors. As a toxin, PLA2 is a potent presynaptic neurotoxin which blocks nerve terminals by binding to the nerve membrane and hydrolyzing stable membrane lipids. The products of the hydrolysis (LysoPL and FA) cannot form bilayers leading to a change in membrane conformation and ultimately to a block in the release of neurotransmitters. PLA2 may form dimers or oligomers." Q#17358 - CGI_10025351 superfamily 241590 289 348 3.48E-17 74.2656 cl00072 GYF superfamily - - GYF domain: contains conserved Gly-Tyr-Phe residues; Proline-binding domain in CD2-binding and other proteins. Involved in signaling lymphocyte activity. Also present in other unrelated proteins (mainly unknown) derived from diverse eukaryotic species. Q#17360 - CGI_10025353 superfamily 247724 7 142 1.86E-86 254.281 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. 
The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#17361 - CGI_10025355 superfamily 215866 68 218 2.74E-49 166.732 cl18349 Arrestin_N superfamily - - "Arrestin (or S-antigen), N-terminal domain; Ig-like beta-sandwich fold. Scop reports duplication with C-terminal domain." Q#17361 - CGI_10025355 superfamily 243212 241 370 8.51E-28 107.045 cl02844 Arrestin_C superfamily - - "Arrestin (or S-antigen), C-terminal domain; Ig-like beta-sandwich fold. Scop reports duplication with N-terminal domain." Q#17362 - CGI_10025356 superfamily 215866 27 177 4.03E-45 154.791 cl18349 Arrestin_N superfamily - - "Arrestin (or S-antigen), N-terminal domain; Ig-like beta-sandwich fold. Scop reports duplication with C-terminal domain." Q#17362 - CGI_10025356 superfamily 243212 200 328 4.07E-24 96.2589 cl02844 Arrestin_C superfamily - - "Arrestin (or S-antigen), C-terminal domain; Ig-like beta-sandwich fold. Scop reports duplication with N-terminal domain." Q#17363 - CGI_10025357 superfamily 243035 97 223 1.95E-24 94.9941 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. 
Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. 
Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#17364 - CGI_10025358 superfamily 243072 97 217 2.14E-36 127.115 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#17365 - CGI_10025359 superfamily 215866 9 157 2.18E-26 101.633 cl18349 Arrestin_N superfamily - - "Arrestin (or S-antigen), N-terminal domain; Ig-like beta-sandwich fold. Scop reports duplication with C-terminal domain." Q#17365 - CGI_10025359 superfamily 243212 180 287 2.64E-15 70.0653 cl02844 Arrestin_C superfamily - - "Arrestin (or S-antigen), C-terminal domain; Ig-like beta-sandwich fold. Scop reports duplication with N-terminal domain." Q#17366 - CGI_10025360 superfamily 215866 7 160 1.80E-33 122.819 cl18349 Arrestin_N superfamily - - "Arrestin (or S-antigen), N-terminal domain; Ig-like beta-sandwich fold. Scop reports duplication with C-terminal domain." Q#17366 - CGI_10025360 superfamily 243212 183 306 2.70E-15 71.9913 cl02844 Arrestin_C superfamily - - "Arrestin (or S-antigen), C-terminal domain; Ig-like beta-sandwich fold. Scop reports duplication with N-terminal domain." Q#17367 - CGI_10025361 superfamily 241570 223 322 9.97E-07 48.8614 cl00047 CAP_ED superfamily - - "effector domain of the CAP family of transcription factors; members include CAP (or cAMP receptor protein (CRP)), which binds cAMP, FNR (fumarate and nitrate reduction), which uses an iron-sulfur cluster to sense oxygen) and CooA, a heme containing CO sensor. 
In all cases binding of the effector leads to conformational changes and the ability to activate transcription. Cyclic nucleotide-binding domain similar to CAP are also present in cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) and vertebrate cyclic nucleotide-gated ion-channels. Cyclic nucleotide-monophosphate binding domain; proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues; the best studied is the prokaryotic catabolite gene activator, CAP, where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure; three conserved glycine residues are thought to be essential for maintenance of the structural integrity of the beta-barrel; CooA is a homodimeric transcription factor that belongs to CAP family; cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain; cAPK's are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain; cGPK's are single chain enzymes that include the two copies of the domain in their N-terminal section; also found in vertebrate cyclic nucleotide-gated ion-channels" Q#17367 - CGI_10025361 superfamily 241570 144 203 1.34E-06 48.4762 cl00047 CAP_ED superfamily N - "effector domain of the CAP family of transcription factors; members include CAP (or cAMP receptor protein (CRP)), which binds cAMP, FNR (fumarate and nitrate reduction), which uses an iron-sulfur cluster to sense oxygen) and CooA, a heme containing CO sensor. In all cases binding of the effector leads to conformational changes and the ability to activate transcription. Cyclic nucleotide-binding domain similar to CAP are also present in cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) and vertebrate cyclic nucleotide-gated ion-channels. 
Cyclic nucleotide-monophosphate binding domain; proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues; the best studied is the prokaryotic catabolite gene activator, CAP, where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure; three conserved glycine residues are thought to be essential for maintenance of the structural integrity of the beta-barrel; CooA is a homodimeric transcription factor that belongs to CAP family; cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain; cAPK's are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain; cGPK's are single chain enzymes that include the two copies of the domain in their N-terminal section; also found in vertebrate cyclic nucleotide-gated ion-channels" Q#17367 - CGI_10025361 superfamily 215866 1125 1267 3.07E-37 138.997 cl18349 Arrestin_N superfamily - - "Arrestin (or S-antigen), N-terminal domain; Ig-like beta-sandwich fold. Scop reports duplication with C-terminal domain." Q#17367 - CGI_10025361 superfamily 215866 753 899 4.94E-28 112.419 cl18349 Arrestin_N superfamily - - "Arrestin (or S-antigen), N-terminal domain; Ig-like beta-sandwich fold. Scop reports duplication with C-terminal domain." Q#17367 - CGI_10025361 superfamily 243212 922 1051 2.64E-20 89.7105 cl02844 Arrestin_C superfamily - - "Arrestin (or S-antigen), C-terminal domain; Ig-like beta-sandwich fold. Scop reports duplication with N-terminal domain." Q#17367 - CGI_10025361 superfamily 243212 1290 1419 5.43E-19 85.8585 cl02844 Arrestin_C superfamily - - "Arrestin (or S-antigen), C-terminal domain; Ig-like beta-sandwich fold. Scop reports duplication with N-terminal domain." 
Q#17368 - CGI_10025362 superfamily 241739 117 403 4.76E-125 367.27 cl00268 class_II_aaRS-like_core superfamily - - "Class II tRNA amino-acyl synthetase-like catalytic core domain. Class II amino acyl-tRNA synthetases (aaRS) share a common fold and generally attach an amino acid to the 3' OH of ribose of the appropriate tRNA. PheRS is an exception in that it attaches the amino acid at the 2'-OH group, like class I aaRSs. These enzymes are usually homodimers. This domain is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. The substrate specificity of this reaction is further determined by additional domains. Interestingly, this domain is also found in asparagine synthase A (AsnA), in the accessory subunit of mitochondrial polymerase gamma and in the bacterial ATP phosphoribosyltransferase regulatory subunit HisZ." Q#17369 - CGI_10025363 superfamily 243066 67 171 3.60E-19 82.6653 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." 
Q#17369 - CGI_10025363 superfamily 198867 187 290 4.10E-08 50.8028 cl06652 BACK superfamily - - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation)." Q#17370 - CGI_10025364 superfamily 247743 620 779 3.97E-13 68.3267 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#17370 - CGI_10025364 superfamily 191262 860 1070 1.84E-78 256.376 cl18170 Lon_C superfamily - - "Lon protease (S16) C-terminal proteolytic domain; The Lon serine proteases must hydrolyse ATP to degrade protein substrates. In Escherichia coli, these proteases are involved in turnover of intracellular proteins, including abnormal proteins following heat-shock. The active site for protease activity resides in a C-terminal domain. The Lon proteases are classified as family S16 in Merops." Q#17370 - CGI_10025364 superfamily 197740 144 210 9.31E-09 53.9804 cl15344 LON superfamily C - "Found in ATP-dependent protease La (LON); N-terminal domain of the ATP-dependent protease La (LON), present also in other bacterial ORFs." Q#17371 - CGI_10025365 superfamily 246669 287 423 8.82E-82 250.037 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. 
C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#17371 - CGI_10025365 superfamily 246669 152 277 1.49E-61 197.573 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." 
Q#17372 - CGI_10025366 superfamily 241615 79 117 0.000140044 38.6205 cl00107 LysM superfamily - - "Lysine Motif is a small domain involved in binding peptidoglycan; LysM, a small globular domain with approximately 40 amino acids, is a widespread protein module involved in binding peptidoglycan in bacteria and chitin in eukaryotes. The domain was originally identified in enzymes that degrade bacterial cell walls, but proteins involved in many other biological functions also contain this domain. It has been reported that the LysM domain functions as a signal for specific plant-bacteria recognition in bacterial pathogenesis. Many of these enzymes are modular and are composed of catalytic units linked to one or several repeats of LysM domains. LysM domains are found in bacteria and eukaryotes." Q#17374 - CGI_10025368 superfamily 247802 80 277 2.68E-118 354.141 cl17248 RIO superfamily - - "RIO kinase family, catalytic domain. The RIO kinase catalytic domain family is part of a larger superfamily, that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase (PI3K). RIO kinases are atypical protein serine kinases present in archaea, bacteria and eukaryotes. Serine kinases catalyze the transfer of the gamma-phosphoryl group from ATP to serine residues in protein substrates. RIO kinases contain a kinase catalytic signature, but otherwise show very little sequence similarity to typical PKs. The RIO catalytic domain is truncated compared to the catalytic domains of typical PKs, with deletions of the loops responsible for substrate binding. Most organisms contain at least two RIO kinases, RIO1 and RIO2. A third protein, RIO3, is present in multicellular eukaryotes. In yeast, RIO1 and RIO2 are essential for survival. They function as non-ribosomal factors necessary for late 18S rRNA processing. 
RIO1 is also required for proper cell cycle progression and chromosome maintenance. The biological substrates for RIO kinases are still unknown." Q#17374 - CGI_10025368 superfamily 220140 9 92 1.38E-34 126.507 cl07723 Rio2_N superfamily - - "Rio2, N-terminal; Members of this family are found in Rio2, and are structurally homologous to the winged helix (wHTH) domain. They adopt a structure consisting of four alpha helices followed by two beta strands and a fifth alpha helix. The domain confers DNA binding properties to the protein, as per other winged helix domains." Q#17374 - CGI_10025368 superfamily 245622 499 572 4.57E-06 45.6782 cl11446 Rhomboid superfamily N - "Rhomboid family; This family contains integral membrane proteins that are related to Drosophila rhomboid protein. Members of this family are found in bacteria and eukaryotes. Rhomboid promotes the cleavage of the membrane-anchored TGF-alpha-like growth factor Spitz, allowing it to activate the Drosophila EGF receptor. Analysis has shown that Rhomboid-1 is an intramembrane serine protease (EC:3.4.21.105). Parasite-encoded rhomboid enzymes are also important for invasion of host cells by Toxoplasma and the malaria parasite." Q#17375 - CGI_10025369 superfamily 217505 51 525 8.62E-117 355.014 cl04021 Serinc superfamily - - Serine incorporator (Serinc); This is a family of eukaryotic membrane proteins which incorporate serine into membranes and facilitate the synthesis of the serine-derived lipids phosphatidylserine and sphingolipid. Members of this family contain 11 transmembrane domains and form intracellular complexes with key enzymes involved in serine and sphingolipid biosynthesis. 
Q#17376 - CGI_10025370 superfamily 244539 1249 1396 1.98E-36 139.362 cl06868 FNR_like superfamily C - "Ferredoxin reductase (FNR), an FAD and NAD(P) binding protein, was initially identified as a chloroplast reductase activity, catalyzing the electron transfer from reduced iron-sulfur protein ferredoxin to NADP+ as the final step in the electron transport mechanism of photosystem I. FNR transfers electrons from reduced ferredoxin to FAD (forming FADH2 via a semiquinone intermediate) and then transfers a hydride ion to convert NADP+ to NADPH. FNR has since been shown to utilize a variety of electron acceptors and donors and has a variety of physiological functions including nitrogen assimilation, dinitrogen fixation, steroid hydroxylation, fatty acid metabolism, oxygenase activity, and methane assimilation in many organisms. FNR has an NAD(P)-binding sub-domain of the alpha/beta class and a discrete (usually N-terminal) flavin sub-domain which vary in orientation with respect to the NAD(P) binding domain. The N-terminal moiety may contain a flavin prosthetic group (as in flavoenzymes) or use flavin as a substrate. Because flavins such as FAD can exist in oxidized, semiquinone (one-electron reduced), or fully reduced hydroquinone forms, FNR can interact with one and 2 electron carriers. FNR has a strong preference for NADP(H) vs NAD(H)." Q#17376 - CGI_10025370 superfamily 247856 820 880 2.05E-07 50.2389 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#17376 - CGI_10025370 superfamily 246664 1 583 0 563.461 cl14561 An_peroxidase_like superfamily - - "Animal heme peroxidases and related proteins; A diverse family of enzymes, which includes prostaglandin G/H synthase, thyroid peroxidase, myeloperoxidase, linoleate diol synthase, lactoperoxidase, peroxinectin, peroxidasin, and others. Despite its name, this family is not restricted to metazoans: members are found in fungi, plants, and bacteria as well." Q#17376 - CGI_10025370 superfamily 242267 1075 1208 1.86E-05 44.9736 cl01043 Ferric_reduct superfamily - - "Ferric reductase like transmembrane component; This family includes a common region in the transmembrane proteins mammalian cytochrome B-245 heavy chain (gp91-phox), ferric reductase transmembrane component in yeast and respiratory burst oxidase from mouse-ear cress. This may be a family of flavocytochromes capable of moving electrons across the plasma membrane. The Frp1 protein from S. pombe is a ferric reductase component and is required for cell surface ferric reductase activity, mutants in frp1 are deficient in ferric iron uptake. Cytochrome B-245 heavy chain is a FAD-dependent dehydrogenase it is also has electron transferase activity which reduces molecular oxygen to superoxide anion, a precursor in the production of microbicidal oxidants. Mutations in the sequence of cytochrome B-245 heavy chain (gp91-phox) lead to the X-linked chronic granulomatous disease. The bacteriocidal ability of phagocytic cells is reduced and is characterized by the absence of a functional plasma membrane associated NADPH oxidase. The chronic granulomatous disease gene codes for the beta chain of cytochrome B-245 and cytochrome B-245 is missing from patients with the disease." Q#17377 - CGI_10025371 superfamily 241997 10 120 1.66E-23 89.1753 cl00638 RNA_pol_Rpb4 superfamily - - RNA polymerase Rpb4; This family includes the Rpb4 protein. This family also includes C17 (aka CGRP-RCP) is an essential subunit of RNA polymerase III. 
C17 forms a subcomplex with C25 which is likely to be the counterpart of subcomplex Rpb4/7 in Pol II. Q#17378 - CGI_10025372 superfamily 148511 35 98 2.01E-05 41.8005 cl06134 Selenoprotein_S superfamily NC - Selenoprotein S (SelS); This family consists of several mammalian selenoprotein S (SelS) sequences. SelS is a plasma membrane protein and is present in a variety of tissues and cell types. The function of this family is unknown. Q#17380 - CGI_10025374 superfamily 243072 388 515 2.41E-28 111.321 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#17381 - CGI_10025375 superfamily 245201 144 391 2.46E-62 207.089 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#17382 - CGI_10025376 superfamily 241592 23 119 7.12E-44 141.704 cl00074 H2A superfamily - - "Histone 2A; H2A is a subunit of the nucleosome. The nucleosome is an octamer containing two H2A, H2B, H3, and H4 subunits. The H2A subunit performs essential roles in maintaining structural integrity of the nucleosome, chromatin condensation, and binding of specific chromatin-associated proteins." 
Q#17383 - CGI_10025377 superfamily 245335 28 277 6.13E-125 363.875 cl10571 GT_MraY-like superfamily - - "Glycosyltransferase 4 (GT4) includes both eukaryotic and prokaryotic UDP-D-N-acetylhexosamine:polyprenol phosphate D-N-acetylhexosamine-1-phosphate transferases. They catalyze the transfer of a D-N-acetylhexosamine 1-phosphate to a membrane-bound polyprenol phosphate, which is the initiation step of protein N-glycosylation in eukaryotes and peptidoglycan biosynthesis in bacteria. One member, D-N-acetylhexosamine 1-phosphate transferase (GPT) is a eukaryotic enzyme, which is specific for UDP-GlcNAc as donor substrate and dolichol-phosphate as the membrane bound acceptor. The bacterial members MraY, WecA, and WbpL/WbcO utilize undecaprenol phosphate as the acceptor substrate, but use different UDP-sugar donor substrates. MraY-type transferases are highly specific for UDP-N-acetylmuramate-pentapeptide, whereas WecA proteins are selective for UDP-N-acetylglucosamine (UDP-GlcNAc). The WbcO/WbpL substrate specificity has not yet been determined, but the structure of their biosynthetic endproducts implies that UDP-N-acetyl-D-fucosamine (UDP-FucNAc) and/or UDPN-acetyl-D-quinosamine (UDP-QuiNAc) are used. The eukaryotic reaction is the first step in the assembly of dolichol-linked oligosaccharide intermediates and is essential for N-glycosylation. The prokaryotic reactions lead to the formation of polyprenol-linked oligosaccharides involved in bacterial cell wall and peptidoglycan assembly. Archaeal and eukaryotic enzymes may use the same substrates and are evolutionarily closer than the bacterial enzyme. Archaea possess the same N-glycosylation pathway as eukaryotes. A glycosyl transferase gene Mv1751 in M. voltae encodes for the enzyme that carries out the first step in the pathway, the attachment of GlcNAc to a dolichol lipid carrier in the membrane. 
A lethal mutation in the alg7 (GPT) gene in Saccharomyces cerevisiae was successfully complemented with Mv1751, the archaea gene." Q#17384 - CGI_10025378 superfamily 241559 26 141 7.31E-21 86.2107 cl00030 CH superfamily - - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." Q#17384 - CGI_10025378 superfamily 109460 213 237 4.42E-05 40.4846 cl02859 Calponin superfamily - - Calponin family repeat; Calponin family repeat. Q#17384 - CGI_10025378 superfamily 109460 294 317 6.83E-05 40.0994 cl02859 Calponin superfamily - - Calponin family repeat; Calponin family repeat. Q#17384 - CGI_10025378 superfamily 109460 334 357 0.00111481 36.6326 cl02859 Calponin superfamily - - Calponin family repeat; Calponin family repeat. Q#17384 - CGI_10025378 superfamily 109460 174 197 0.00714366 34.3214 cl02859 Calponin superfamily - - Calponin family repeat; Calponin family repeat. Q#17385 - CGI_10025379 superfamily 109460 92 116 0.000413309 38.1734 cl02859 Calponin superfamily - - Calponin family repeat; Calponin family repeat. Q#17385 - CGI_10025379 superfamily 109460 52 76 0.00739678 34.7066 cl02859 Calponin superfamily - - Calponin family repeat; Calponin family repeat. Q#17387 - CGI_10025381 superfamily 241584 310 392 2.50E-07 48.6467 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. 
Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#17387 - CGI_10025381 superfamily 245814 239 293 5.44E-07 47.4839 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#17390 - CGI_10025384 superfamily 241572 20 168 5.05E-05 41.7581 cl00050 CYCLIN superfamily - - "Cyclin box fold. Protein binding domain functioning in cell-cycle and transcription control. Present in cyclins, TFIIB and Retinoblastoma (RB).The cyclins consist of 8 classes of cell cycle regulators that regulate cyclin dependent kinases (CDKs). TFIIB is a transcription factor that binds the TATA box. Cyclins, TFIIB and RB contain 2 copies of the domain." Q#17391 - CGI_10025385 superfamily 247692 318 880 0 654.605 cl17068 AFD_class_I superfamily - - "Adenylate forming domain, Class I; This family includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. 
The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain." Q#17391 - CGI_10025385 superfamily 247692 990 1507 2.05E-171 528.645 cl17068 AFD_class_I superfamily - - "Adenylate forming domain, Class I; This family includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain." Q#17391 - CGI_10025385 superfamily 219040 11 101 1.19E-10 60.5565 cl05791 DMAP_binding superfamily - - "DMAP1-binding Domain; This domain binds DMAP1, a transcriptional co-repressor." Q#17392 - CGI_10025386 superfamily 248345 29 157 1.35E-30 112.753 cl17791 SAC3_GANP superfamily - - "SAC3/GANP/Nin1/mts3/eIF-3 p25 family; This large family includes diverse proteins involved in large complexes. The alignment contains one highly conserved negatively charged residue and one highly conserved positively charged residue that are probably important for the function of these proteins. The family includes the yeast nuclear export factor Sac3, and mammalian GANP/MCM3-associated proteins, which facilitate the nuclear localisation of MCM3, a protein that associates with chromatin in the G1 phase of the cell-cycle. The 26S protease (or 26S proteasome) is responsible for degrading ubiquitin conjugates. It consists of 19S regulatory complexes associated with the ends of 20S proteasomes. The 19S regulatory complex is composed of about 20 different polypeptides and confers ATP-dependence and substrate specificity to the 26S enzyme. The conserved region occurs at the C-terminal of the Nin1-like regulatory subunit. 
This family includes several eukaryotic translation initiation factor 3 subunit 11 (eIF-3 p25) proteins. Eukaryotic initiation factor 3 (eIF3) is a multisubunit complex that is required for binding of mRNA to 40 S ribosomal subunits, stabilisation of ternary complex binding to 40 S subunits, and dissociation of 40 and 60 S subunits." Q#17393 - CGI_10025387 superfamily 247727 52 157 1.95E-09 52.0471 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#17394 - CGI_10025388 superfamily 222420 390 470 0.000531482 39.1552 cl16438 DUF4199 superfamily N - Protein of unknown function (DUF4199); This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria. Proteins in this family are typically between 167 and 182 amino acids in length. Q#17394 - CGI_10025388 superfamily 220413 61 133 0.0083174 35.3807 cl10778 Nop25 superfamily C - "Nucleolar protein 12 (25kDa); Members of this family of proteins are part of the yeast nuclear pore complex-associated pre-60S ribosomal subunit. The family functions as a highly conserved exonuclease that is required for the 5'-end maturation of 5.8S and 25S rRNAs, demonstrating that 5'-end processing also has a redundant pathway. Nop25 binds late pre-60S ribosomes, accompanying them from the nucleolus to the nuclear periphery; and there is evidence for both physical and functional links between late 60S subunit processing and export." 
Q#17395 - CGI_10025389 superfamily 245008 72 151 8.42E-34 119.241 cl09101 E_set superfamily - - "Early set domain associated with the catalytic domain of sugar utilizing enzymes at either the N or C terminus; The E or "early" set domains of sugar utilizing enzymes are associated with different types of catalytic domains at either the N-terminal or C-terminal end. These domains may be related to the immunoglobulin and/or fibronectin type III superfamilies. Members of this family include alpha amylase, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitobiase, and chitinase. A subset of these members were recently identified as members of the CBM48 (Carbohydrate Binding Module 48) family. Members of the CBM48 family include pullulanase, maltooligosyl trehalose synthase, starch branching enzyme, glycogen branching enzyme, glycogen debranching enzyme, isoamylase, and the beta subunit of AMP-activated protein kinase." Q#17395 - CGI_10025389 superfamily 244949 181 269 1.77E-39 134.3 cl08426 AMPKBI superfamily - - "5'-AMP-activated protein kinase beta subunit, interaction domain; This region is found in the beta subunit of the 5'-AMP-activated protein kinase complex, and its yeast homologues Sip1, Sip2 and Gal83, which are found in the SNF1 kinase complex. This region is sufficient for interaction of this subunit with the kinase complex, but is not solely responsible for the interaction, and the interaction partner is not known. The isoamylase N-terminal domain (pfam02922) is sometimes found in proteins belonging to this family." 
Q#17398 - CGI_10025392 superfamily 243092 169 313 6.12E-20 88.9312 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#17398 - CGI_10025392 superfamily 243074 60 107 1.08E-08 51.7385 cl02535 F-box-like superfamily - - F-box-like; This is an F-box-like family. Q#17399 - CGI_10025393 superfamily 248458 49 227 9.89E-14 71.1909 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. 
Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#17400 - CGI_10025394 superfamily 246925 195 492 9.45E-36 135.176 cl15309 LRR_RI superfamily - - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." 
Q#17401 - CGI_10025395 superfamily 192098 334 401 2.06E-24 99.6006 cl07295 RPAP1_C superfamily - - "RPAP1-like, C-terminal; Inhibition of RPAP1 synthesis in Saccharomyces cerevisiae results in changes in global gene expression that are similar to those caused by the loss of the RNAPII subunit Rpb11. This entry represents the C-terminal region that contains the motif GLHHH. This region is conserved from yeast to humans." Q#17401 - CGI_10025395 superfamily 192099 215 262 1.26E-12 64.9668 cl07296 RPAP1_N superfamily - - "RPAP1-like, N-terminal; Inhibition of RPAP1 synthesis in Saccharomyces cerevisiae results in changes in global gene expression that are similar to those caused by the loss of the RNAPII subunit Rpb11. This entry represents the N-terminal region of RPAP-1 that is conserved from yeast to humans." Q#17404 - CGI_10025398 superfamily 247792 184 222 0.000820208 36.2696 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#17405 - CGI_10025399 superfamily 241728 46 179 2.45E-65 200.963 cl00253 Dtyr_deacylase superfamily - - D-Tyrosyl-tRNAtyr deacylases; a class of tRNA-dependent hydrolases which are capable of hydrolyzing the ester bond of D-Tyrosyl-tRNA reducing the level of cellular D-Tyrosine while recycling the peptidyl-tRNA; found in bacteria and in eukaryotes but not in archaea; beta barrel-like fold structure; forms homodimers in which two surface cavities serve as the active 
site for tRNA binding Q#17406 - CGI_10025400 superfamily 247058 7 110 3.42E-37 128.832 cl15762 crotonase-like superfamily N - "Crotonase/Enoyl-Coenzyme A (CoA) hydratase superfamily. This superfamily contains a diverse set of enzymes including enoyl-CoA hydratase, naphthoate synthase, methylmalonyl-CoA decarboxylase, 3-hydroxybutyryl-CoA dehydratase, and dienoyl-CoA isomerase. Many of these play important roles in fatty acid metabolism. In addition to a conserved structural core and the formation of trimers (or dimers of trimers), a common feature in this superfamily is the stabilization of an enolate anion intermediate derived from an acyl-CoA substrate. This is accomplished by two conserved backbone NH groups in active sites that form an oxyanion hole." Q#17407 - CGI_10009407 superfamily 245818 48 205 2.72E-25 97.2427 cl11966 Rel-Spo_like superfamily - - "RelA- and SpoT-like ppGpp Synthetases and Hydrolases, catalytic domain; The Rel-Spo superfamily includes the catalytic domains of Escherichia coli ppGpp synthetase (RelA), ppGpp synthetase/hydrolase (SpoT), and related proteins. RelA synthesizes (p)ppGpp in response to amino-acid starvation and in association with ribosomes. (p)ppGpp triggers the bacterial stringent response. SpoT catalyzes (p)ppGpp synthesis under carbon limitation in a ribosome-independent manner. It also catalyzes (p)ppGpp degradation. Gram-negative bacteria have two enzymes involved in (p)ppGpp metabolism while most Gram-positive organisms have a single Rel-Spo enzyme (Rel), which both synthesizes and degrades (p)ppGpp. The Arabidopsis thaliana Rel-Spo proteins, At-RSH1,-2, and-3 appear to regulate a rapid (p)ppGpp-mediated response to pathogens and other stresses. This catalytic domain is found in association with an N-terminal HD domain and a C-terminal metal dependent phosphohydrolase domain (TGS). Some Rel-Spo proteins also have a C-terminal regulatory ACT domain." 
Q#17408 - CGI_10009408 superfamily 218331 91 228 9.63E-41 140.867 cl08427 PAP_RNA-bind superfamily - - Poly(A) polymerase predicted RNA binding domain; Based on its similarity structurally to the RNA recognition motif this domain is thought to be RNA binding. Q#17413 - CGI_10009415 superfamily 147539 22 172 1.46E-69 210.717 cl05128 TRAP-delta superfamily - - "Translocon-associated protein, delta subunit precursor (TRAP-delta); This family consists of several eukaryotic translocon-associated protein, delta subunit precursors (TRAP-delta or SSR-delta). The exact function of this protein is unknown." Q#17414 - CGI_10009416 superfamily 245835 16 93 9.51E-25 93.5998 cl12013 BAR superfamily C - "The Bin/Amphiphysin/Rvs (BAR) domain, a dimerization module that binds membranes and detects membrane curvature; BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions including organelle biogenesis, membrane trafficking or remodeling, and cell division and migration. Mutations in BAR containing proteins have been linked to diseases and their inactivation in cells leads to altered membrane dynamics. A BAR domain with an additional N-terminal amphipathic helix (an N-BAR) can drive membrane curvature. These N-BAR domains are found in amphiphysins and endophilins, among others. BAR domains are also frequently found alongside domains that determine lipid specificity, such as the Pleckstrin Homology (PH) and Phox Homology (PX) domains which are present in beta centaurins (ACAPs and ASAPs) and sorting nexins, respectively. A FES-CIP4 Homology (FCH) domain together with a coiled coil region is called the F-BAR domain and is present in Pombe/Cdc15 homology (PCH) family proteins, which include Fes/Fes tyrosine kinases, PACSIN or syndapin, CIP4-like proteins, and srGAPs, among others. 
The Inverse (I)-BAR or IRSp53/MIM homology Domain (IMD) is found in multi-domain proteins, such as IRSp53 and MIM, that act as scaffolding proteins and transducers of a variety of signaling pathways that link membrane dynamics and the underlying actin cytoskeleton. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The I-BAR domain induces membrane protrusions in the opposite direction compared to classical BAR and F-BAR domains, which produce membrane invaginations. BAR domains that also serve as protein interaction domains include those of arfaptin and OPHN1-like proteins, among others, which bind to Rac and Rho GAP domains, respectively." Q#17416 - CGI_10009418 superfamily 241563 98 136 3.64E-07 47.6671 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#17416 - CGI_10009418 superfamily 110440 522 549 0.000171117 39.6985 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#17418 - CGI_10009420 superfamily 243072 512 548 0.0012299 37.7483 cl02529 ANK superfamily C - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. 
The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#17419 - CGI_10009421 superfamily 245201 436 668 3.33E-101 313.353 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#17419 - CGI_10009421 superfamily 247723 116 187 1.20E-28 109.779 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." 
Q#17419 - CGI_10009421 superfamily 243157 296 351 4.57E-13 66.048 cl02720 PB1 superfamily N - "The PB1 domain is a modular domain mediating specific protein-protein interactions which play a role in many critical cell processes, such as osteoclastogenesis, angiogenesis, early cardiovascular development, and cell polarity. A canonical PB1-PB1 interaction, which involves heterodimerization of two PB1 domain, is required for the formation of macromolecular signaling complexes ensuring specificity and fidelity during cellular signaling. The interaction between two PB1 domain depends on the type of PB1. There are three types of PB1 domains: type I which contains an OPCA motif, acidic aminoacid cluster, type II which contains a basic cluster, and type I/II which contains both an OPCA motif and a basic cluster. Interactions of PB1 domains with other protein domains have been described as a noncanonical PB1-interactions. The PB1 domain module is conserved in amoebas, fungi, animals, and plants." Q#17419 - CGI_10009421 superfamily 247723 17 91 1.84E-08 52.1743 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. 
The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#17419 - CGI_10009421 superfamily 247723 225 268 0.000612596 38.47 cl17169 RRM_SF superfamily NC - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#17421 - CGI_10011725 superfamily 184428 2 326 2.48E-62 203.26 cl14742 PRK13971 superfamily - - hydroxyproline-2-epimerase; Provisional Q#17422 - CGI_10011726 superfamily 220626 86 334 8.73E-54 181.665 cl18564 GpcrRhopsn4 superfamily - - "Rhodopsin-like GPCR transmembrane domain; This region of 270 amino acids is the seven transmembrane alpha-helical domains included within five GPCRRHODOPSN4 motifs of a G-protein-coupled-receptor (GPCR) protein, conserved from nematodes to humans. GPCRs are integral membrane receptors whose intracellular actions are mediated by signalling pathways involving G proteins and downstream secondary messengers." 
Q#17425 - CGI_10011729 superfamily 241581 1141 1246 9.21E-13 66.641 cl00062 FHA superfamily - - "Forkhead associated domain (FHA); found in eukaryotic and prokaryotic proteins. Putative nuclear signalling domain. FHA domains may bind phosphothreonine, phosphoserine and sometimes phosphotyrosine. In eukaryotes, many FHA domain-containing proteins localize to the nucleus, where they participate in establishing or maintaining cell cycle checkpoints, DNA repair, or transcriptional regulation. Members of the FHA family include: Dun1, Rad53, Cds1, Mek1, KAPP(kinase-associated protein phosphatase),and Ki-67 (a human nuclear protein related to cell proliferation)." Q#17425 - CGI_10011729 superfamily 241596 451 510 4.83E-12 63.3871 cl00081 HLH superfamily - - "Helix-loop-helix domain, found in specific DNA- binding proteins that act as transcription factors; 60-100 amino acids long. A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE 5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved in both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." 
Q#17425 - CGI_10011729 superfamily 243107 1344 1389 5.09E-12 63.333 cl02611 G-patch superfamily - - "G-patch domain; This domain is found in a number of RNA binding proteins, and is also found in proteins that contain RNA binding domains. This suggests that this domain may have an RNA binding function. This domain has seven highly conserved glycines." Q#17425 - CGI_10011729 superfamily 216347 321 376 5.69E-09 58.7013 cl08309 Cu_amine_oxid superfamily C - "Copper amine oxidase, enzyme domain; Copper amine oxidases are a ubiquitous and novel group of quinoenzymes that catalyze the oxidative deamination of primary amines to the corresponding aldehydes, with concomitant reduction of molecular oxygen to hydrogen peroxide. The enzymes are dimers of identical 70-90 kDa subunits, each of which contains a single copper ion and a covalently bound cofactor formed by the post-translational modification of a tyrosine side chain to 2,4,5-trihydroxyphenylalanine quinone (TPQ). This family corresponds to the catalytic domain of the enzyme." Q#17425 - CGI_10011729 superfamily 145726 70 159 1.61E-06 47.7254 cl08353 Cu_amine_oxidN2 superfamily - - "Copper amine oxidase, N2 domain; This domain is the first or second structural domain in copper amine oxidases, it is known as the N2 domain. Its function is uncertain. The catalytic domain can be found in pfam01179. Copper amine oxidases are a ubiquitous and novel group of quinoenzymes that catalyze the oxidative deamination of primary amines to the corresponding aldehydes, with concomitant reduction of molecular oxygen to hydrogen peroxide. The enzymes are dimers of identical 70-90 kDa subunits, each of which contains a single copper ion and a covalently bound cofactor formed by the post-translational modification of a tyrosine side chain to 2,4,5-trihydroxyphenylalanine quinone (TPQ)." 
Q#17426 - CGI_10011730 superfamily 203209 51 229 3.22E-11 61.3256 cl12305 STOP superfamily - - "STOP protein; Neurons contain abundant subsets of highly stable microtubules that resist de-polymerising conditions such as exposure to the cold. Stable microtubules are thought to be essential for neuronal development, maintenance, and function. STOP is a major factor responsible for the intriguing stability properties of neuronal microtubules and is important for synaptic plasticity. Additionally knowledge of STOPs function and properties may help in the treatment of neuroleptics in illnesses such as schizophrenia, currently thought to result from synaptic defects." Q#17426 - CGI_10011730 superfamily 203209 217 363 0.000236107 40.9101 cl12305 STOP superfamily N - "STOP protein; Neurons contain abundant subsets of highly stable microtubules that resist de-polymerising conditions such as exposure to the cold. Stable microtubules are thought to be essential for neuronal development, maintenance, and function. STOP is a major factor responsible for the intriguing stability properties of neuronal microtubules and is important for synaptic plasticity. Additionally knowledge of STOPs function and properties may help in the treatment of neuroleptics in illnesses such as schizophrenia, currently thought to result from synaptic defects." Q#17427 - CGI_10011731 superfamily 246669 1030 1118 9.18E-08 52.8395 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. 
Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#17427 - CGI_10011731 superfamily 247724 1665 1825 2.72E-41 152.49 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." 
Q#17427 - CGI_10011731 superfamily 248012 2307 2396 1.21E-11 64.5213 cl17458 TIR_2 superfamily C - TIR domain; This is a family of bacterial Toll-like receptors. Q#17427 - CGI_10011731 superfamily 199166 27 181 0.000318384 43.47 cl15308 AMN1 superfamily C - "Antagonist of mitotic exit network protein 1; Amn1 has been functionally characterized in Saccharomyces cerevisiae as a component of the Antagonist of MEN pathway (AMEN). The AMEN network is activated by MEN (mitotic exit network) via an active Cdc14, and in turn switches off MEN. Amn1 constitutes one of the alternative mechanisms by which MEN may be disrupted. Specifically, Amn1 binds Tem1 (Termination of M-phase, a GTPase that belongs to the RAS superfamily), and disrupts its association with Cdc15, the primary downstream target. Amn1 is a leucine-rich repeat (LRR) protein, with 12 repeats in the S. cerevisiae ortholog. As a negative regulator of the signal transduction pathway MEN, overexpression of AMN1 slows the growth of wild type cells. The function of the vertebrate members of this family has not been determined experimentally, they have fewer LRRs that determine the extent of this model." Q#17430 - CGI_10011734 superfamily 242274 191 342 2.18E-09 56.3657 cl01053 SGNH_hydrolase superfamily - - "SGNH_hydrolase, or GDSL_hydrolase, is a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the typical Ser-His-Asp(Glu) triad from other serine hydrolases, but may lack the carboxlic acid." Q#17431 - CGI_10011735 superfamily 243030 40 70 0.00420999 34.9155 cl02423 LRRNT superfamily - - Leucine rich repeat N-terminal domain; Leucine Rich Repeats pfam00560 are short sequence motifs present in a number of proteins with diverse functions and cellular locations. Leucine Rich Repeats are often flanked by cysteine rich domains. 
This domain is often found at the N-terminus of tandem leucine rich repeats. Q#17434 - CGI_10011739 superfamily 241563 7 45 1.05E-06 45.9332 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#17438 - CGI_10014004 superfamily 219165 165 542 3.00E-172 495.668 cl06019 LMF1 superfamily - - "Lipase maturation factor; This family of transmembrane proteins includes the lipase maturation factor, LMF1. Lipoprotein lipase and hepatic lipase require LMF1 to fold into their active states. The precise role of LMF1 in lipase folding has yet to be determined." Q#17439 - CGI_10014005 superfamily 241574 125 261 6.36E-57 182.808 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#17439 - CGI_10014005 superfamily 241626 3 90 5.95E-09 52.6694 cl00125 RHOD superfamily N - "Rhodanese Homology Domain (RHOD); an alpha beta fold domain found duplicated in the rhodanese protein. 
The cysteine containing enzymatically active version of the domain is also found in the Cdc25 class of protein phosphatases and a variety of proteins such as sulfide dehydrogenases and certain stress proteins such as senesence specific protein 1 in plants, PspE and GlpE in bacteria and cyanide and arsenate resistance proteins. Inactive versions (no active site cysteine) are also seen in dual specificity phosphatases, ubiquitin hydrolases from yeast and in sulfuryltransferases, where they are believed to play a regulatory role in multidomain proteins." Q#17442 - CGI_10014008 superfamily 241563 18 50 0.00156003 36.5463 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#17443 - CGI_10014009 superfamily 241626 144 265 5.90E-57 182.034 cl00125 RHOD superfamily - - "Rhodanese Homology Domain (RHOD); an alpha beta fold domain found duplicated in the rhodanese protein. The cysteine containing enzymatically active version of the domain is also found in the Cdc25 class of protein phosphatases and a variety of proteins such as sulfide dehydrogenases and certain stress proteins such as senesence specific protein 1 in plants, PspE and GlpE in bacteria and cyanide and arsenate resistance proteins. Inactive versions (no active site cysteine) are also seen in dual specificity phosphatases, ubiquitin hydrolases from yeast and in sulfuryltransferases, where they are believed to play a regulatory role in multidomain proteins." Q#17444 - CGI_10014010 superfamily 241626 159 281 1.34E-53 173.944 cl00125 RHOD superfamily - - "Rhodanese Homology Domain (RHOD); an alpha beta fold domain found duplicated in the rhodanese protein. 
The cysteine containing enzymatically active version of the domain is also found in the Cdc25 class of protein phosphatases and a variety of proteins such as sulfide dehydrogenases and certain stress proteins such as senesence specific protein 1 in plants, PspE and GlpE in bacteria and cyanide and arsenate resistance proteins. Inactive versions (no active site cysteine) are also seen in dual specificity phosphatases, ubiquitin hydrolases from yeast and in sulfuryltransferases, where they are believed to play a regulatory role in multidomain proteins." Q#17445 - CGI_10014011 superfamily 241626 159 281 1.22E-51 168.552 cl00125 RHOD superfamily - - "Rhodanese Homology Domain (RHOD); an alpha beta fold domain found duplicated in the rhodanese protein. The cysteine containing enzymatically active version of the domain is also found in the Cdc25 class of protein phosphatases and a variety of proteins such as sulfide dehydrogenases and certain stress proteins such as senesence specific protein 1 in plants, PspE and GlpE in bacteria and cyanide and arsenate resistance proteins. Inactive versions (no active site cysteine) are also seen in dual specificity phosphatases, ubiquitin hydrolases from yeast and in sulfuryltransferases, where they are believed to play a regulatory role in multidomain proteins." 
Q#17447 - CGI_10014013 superfamily 243092 62 236 0.00014185 41.5516 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#17448 - CGI_10014014 superfamily 245847 204 365 0.000373077 39.8474 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#17449 - CGI_10014015 superfamily 241574 207 396 6.58E-42 150.814 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. 
This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#17452 - CGI_10014018 superfamily 241574 72 140 3.80E-05 38.8791 cl00053 PTPc superfamily N - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#17455 - CGI_10003733 superfamily 247743 1084 1241 4.13E-07 50.2223 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." 
Q#17455 - CGI_10003733 superfamily 247743 774 919 8.52E-05 43.2887 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#17455 - CGI_10003733 superfamily 247743 1459 1602 0.000108958 42.9035 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#17455 - CGI_10003733 superfamily 247743 251 316 2.55E-09 57.3052 cl17189 AAA superfamily N - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. 
The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#17455 - CGI_10003733 superfamily 247743 508 598 1.71E-06 48.4456 cl17189 AAA superfamily N - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#17456 - CGI_10019976 superfamily 245602 330 646 1.16E-150 446.752 cl11402 GH31 superfamily - - "The enzymes of glycosyl hydrolase family 31 (GH31) occur in prokaryotes, eukaryotes, and archaea with a wide range of hydrolytic activities, including alpha-glucosidase (glucoamylase and sucrase-isomaltase), alpha-xylosidase, 6-alpha-glucosyltransferase, 3-alpha-isomaltosyltransferase and alpha-1,4-glucan lyase. All GH31 enzymes cleave a terminal carbohydrate moiety from a substrate that varies considerably in size, depending on the enzyme, and may be either a starch or a glycoprotein. In most cases, the pyranose moiety recognized in subsite -1 of the substrate binding site is an alpha-D-glucose, though some GH31 family members show a preference for alpha-D-xylose. Several GH31 enzymes can accommodate both glucose and xylose and different levels of discrimination between the two have been observed. Most characterized GH31 enzymes are alpha-glucosidases. In mammals, GH31 members with alpha-glucosidase activity are implicated in at least three distinct biological processes. 
The lysosomal acid alpha-glucosidase (GAA) is essential for glycogen degradation and a deficiency or malfunction of this enzyme causes glycogen storage disease II, also known as pompe disease. In the endoplasmic reticulum, alpha-glucosidase II catalyzes the second step in the N-linked oligosaccharide processing pathway that constitutes part of the quality control system for glycoprotein folding and maturation. The intestinal enzymes sucrase-isomaltase (SI) and maltase-glucoamylase (MGAM) play key roles in the final stage of carbohydrate digestion, making alpha-glucosidase inhibitors useful in the treatment of type 2 diabetes. GH31 alpha-glycosidases are retaining enzymes that cleave their substrates via an acid/base-catalyzed, double-displacement mechanism involving a covalent glycosyl-enzyme intermediate. Two aspartic acid residues have been identified as the catalytic nucleophile and the acid/base, respectively." Q#17457 - CGI_10019977 superfamily 216981 1083 1207 2.27E-15 75.2617 cl17087 OTU superfamily - - "OTU-like cysteine protease; This family is comprised of a group of predicted cysteine proteases, homologous to the Ovarian Tumour (OTU) gene in Drosophila. Members include proteins from eukaryotes, viruses and pathogenic bacterium. The conserved cysteine and histidine, and possibly the aspartate, represent the catalytic residues in this putative group of proteases." Q#17457 - CGI_10019977 superfamily 216981 582 707 1.45E-14 72.5654 cl17087 OTU superfamily - - "OTU-like cysteine protease; This family is comprised of a group of predicted cysteine proteases, homologous to the Ovarian Tumour (OTU) gene in Drosophila. Members include proteins from eukaryotes, viruses and pathogenic bacterium. The conserved cysteine and histidine, and possibly the aspartate, represent the catalytic residues in this putative group of proteases." 
Q#17458 - CGI_10019978 superfamily 241570 374 467 1.56E-15 74.2846 cl00047 CAP_ED superfamily - - "effector domain of the CAP family of transcription factors; members include CAP (or cAMP receptor protein (CRP)), which binds cAMP, FNR (fumarate and nitrate reduction), which uses an iron-sulfur cluster to sense oxygen) and CooA, a heme containing CO sensor. In all cases binding of the effector leads to conformational changes and the ability to activate transcription. Cyclic nucleotide-binding domain similar to CAP are also present in cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) and vertebrate cyclic nucleotide-gated ion-channels. Cyclic nucleotide-monophosphate binding domain; proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues; the best studied is the prokaryotic catabolite gene activator, CAP, where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure; three conserved glycine residues are thought to be essential for maintenance of the structural integrity of the beta-barrel; CooA is a homodimeric transcription factor that belongs to CAP family; cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain; cAPK's are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain; cGPK's are single chain enzymes that include the two copies of the domain in their N-terminal section; also found in vertebrate cyclic nucleotide-gated ion-channels" Q#17464 - CGI_10019984 superfamily 246680 150 211 1.42E-05 41.5522 cl14633 DD_superfamily superfamily N - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. 
DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#17464 - CGI_10019984 superfamily 245874 1 71 1.16E-14 67.0662 cl12111 TNFR superfamily - - "Tumor necrosis factor receptor (TNFR) domain; superfamily of TNF-like receptor domains. When bound to TNF-like cytokines, TNFRs trigger multiple signal transduction pathways, they are involved in inflammation response, apoptosis, autoimmunity and organogenesis. TNFRs domains are elongated with generally three tandem repeats of cysteine-rich domains (CRDs). They fit in the grooves between protomers within the ligand trimer. Some TNFRs, such as NGFR and HveA, bind ligands with no structural similarity to TNF and do not bind ligand trimers." Q#17466 - CGI_10019986 superfamily 243146 426 478 7.17E-05 40.6195 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#17466 - CGI_10019986 superfamily 243146 467 508 0.00695956 34.9447 cl02701 Kelch_3 superfamily C - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#17468 - CGI_10019988 superfamily 192581 34 68 0.00064423 34.7463 cl11073 TFIIIC_sub6 superfamily - - TFIIIC subunit; This is a family of proteins subunits of TFIIIC. TFIIIC in yeast and humans is required for transcription of tRNA and 5 S RNA genes by RNA polymerase III. Yeast members of this family are fused to phosphoglycerate mutase domain. 
Q#17469 - CGI_10019989 superfamily 243034 71 183 2.26E-08 50.8416 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#17471 - CGI_10019991 superfamily 241758 11 157 1.88E-26 98.5962 cl00292 AANH_like superfamily - - "Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases, ATP sulphurylases Universal Stress Response protein and electron transfer flavoprotein (ETF). The domain forms an alpha/beta/alpha fold which binds to Adenosine nucleotide." 
Q#17472 - CGI_10019992 superfamily 243119 445 487 0.000370561 38.5689 cl02629 CBM_14 superfamily C - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#17472 - CGI_10019992 superfamily 243119 233 291 0.000493198 38.1837 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#17473 - CGI_10019993 superfamily 243119 285 327 8.86E-05 39.7245 cl02629 CBM_14 superfamily C - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. 
Q#17473 - CGI_10019993 superfamily 243119 95 131 0.000982982 36.653 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#17478 - CGI_10019998 superfamily 241596 58 115 3.84E-12 58.3795 cl00081 HLH superfamily - - "Helix-loop-helix domain, found in specific DNA- binding proteins that act as transcription factors; 60-100 amino acids long. A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE 5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved in both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." 
Q#17479 - CGI_10019999 superfamily 218522 1 159 3.55E-51 163.271 cl05018 UPF0220 superfamily - - Uncharacterized protein family (UPF0220); This family of proteins is functionally uncharacterized. Q#17484 - CGI_10020005 superfamily 245847 86 226 7.05E-18 77.2117 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#17488 - CGI_10006435 superfamily 248264 395 554 2.14E-38 138.139 cl17710 DDE_4 superfamily - - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#17488 - CGI_10006435 superfamily 243161 136 171 1.11E-07 49.7002 cl02739 THAP superfamily NC - "THAP domain; The THAP domain is a putative DNA-binding domain (DBD) and probably also binds a zinc ion. It features the conserved C2CH architecture (consensus sequence: Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His). Other universal features include the location of the domain at the N-termini of proteins, its size of about 90 residues, a C-terminal AVPTIF box and several other conserved residues. Orthologues of the human THAP domain have been identified in other vertebrates and probably worms and flies, but not in other eukaryotes or any prokaryotes." Q#17488 - CGI_10006435 superfamily 222429 13 91 7.81E-07 47.2352 cl18676 Myb_DNA-bind_5 superfamily - - Myb/SANT-like DNA-binding domain; This presumed domain appears to be related to other Myb/SANT like DNA binding domains. 
This family is greatly expanded in arthropods and higher eukaryotes. Q#17488 - CGI_10006435 superfamily 222263 307 404 8.22E-06 44.2309 cl16321 DDE_4_2 superfamily - - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#17489 - CGI_10006436 superfamily 243072 1236 1361 9.78E-38 140.211 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#17489 - CGI_10006436 superfamily 243072 1500 1625 1.11E-36 137.13 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#17489 - CGI_10006436 superfamily 243072 1405 1526 6.17E-36 134.819 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. 
The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#17489 - CGI_10006436 superfamily 243072 1170 1295 1.59E-35 133.663 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#17489 - CGI_10006436 superfamily 243072 1096 1229 2.13E-23 98.6098 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#17489 - CGI_10006436 superfamily 243072 1373 1403 1.40E-05 44.4672 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#17491 - CGI_10006438 superfamily 241573 19 342 6.79E-113 343.93 cl00051 CysPc superfamily - - "Calpains, domains IIa, IIb; calcium-dependent cytoplasmic cysteine proteinases, papain-like. Functions in cytoskeletal remodeling processes, cell differentiation, apoptosis and signal transduction." Q#17491 - CGI_10006438 superfamily 246669 510 631 1.01E-28 111.601 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#17491 - CGI_10006438 superfamily 241653 354 487 1.16E-23 97.7764 cl00165 Calpain_III superfamily - - "Calpain, subdomain III. Calpains are calcium-activated cytoplasmic cysteine proteinases, participate in cytoskeletal remodeling processes, cell differentiation, apoptosis and signal transduction. Catalytic domain and the two calmodulin-like domains are separated by C2-like domain III. Domain III plays an important role in calcium-induced activation of calpain involving electrostatic interactions with subdomain II. Proposed to mediate calpain's interaction with phospholipids and translocation to cytoplasmic/nuclear membranes. 
CD includes subdomain III of typical and atypical calpains." Q#17492 - CGI_10006439 superfamily 213389 171 346 2.31E-12 65.7735 cl17092 STING_C superfamily - - "C-terminal domain of STING; STING (stimulator of interferon genes, also known as MITA, ERIS, MPYS and TMEM173) is a master regulator that mediates cytokine production in response to microbial invasion by directly sensing bacterial secondary messengers such as the cyclic dinucleotide bis-(3'-5')-cyclic dimeric GMP (c-di-GMP) and leading to the activation of IFN regulatory factor 3 (IRF3) through TANK-binding kinase 1 (TBK1) stimulation. STING is also a signaling adaptor in the IFN response to cytosolic DNA. This detection of foreign materials is the first step to a successful immune response. STING is localized in the ER and comprised of a predicted N-terminal transmembrane region and a C-terminal c-di-GMP binding domain." Q#17492 - CGI_10006439 superfamily 213389 529 686 4.54E-12 64.6179 cl17092 STING_C superfamily - - "C-terminal domain of STING; STING (stimulator of interferon genes, also known as MITA, ERIS, MPYS and TMEM173) is a master regulator that mediates cytokine production in response to microbial invasion by directly sensing bacterial secondary messengers such as the cyclic dinucleotide bis-(3'-5')-cyclic dimeric GMP (c-di-GMP) and leading to the activation of IFN regulatory factor 3 (IRF3) through TANK-binding kinase 1 (TBK1) stimulation. STING is also a signaling adaptor in the IFN response to cytosolic DNA. This detection of foreign materials is the first step to a successful immune response. STING is localized in the ER and comprised of a predicted N-terminal transmembrane region and a C-terminal c-di-GMP binding domain." 
Q#17493 - CGI_10006440 superfamily 241638 181 329 2.85E-24 96.26 cl00147 TNF superfamily - - "Tumor Necrosis Factor; TNF superfamily members include the cytokines: TNF (TNF-alpha), LT (lymphotoxin-alpha, TNF-beta), CD40 ligand, Apo2L (TRAIL), Fas ligand, and osteoprotegerin (OPG) ligand. These proteins generally have an intracellular N-terminal domain, a short transmembrane segment, an extracellular stalk, and a globular TNF-like extracellular domain of about 150 residues. They initiate apoptosis by binding to related receptors, some of which have intracellular death domains. They generally form homo- or hetero-trimeric complexes. TNF cytokines bind one elongated receptor molecule along each of three clefts formed by neighboring monomers of the trimer with ligand trimerization a requisite for receptor binding." Q#17494 - CGI_10003633 superfamily 215825 56 425 6.76E-160 466.009 cl02828 Calreticulin superfamily - - Calreticulin family; Calreticulin family. Q#17496 - CGI_10007607 superfamily 247068 799 892 4.04E-17 79.2797 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#17496 - CGI_10007607 superfamily 247068 476 614 2.10E-14 71.5757 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#17496 - CGI_10007607 superfamily 247068 51 142 7.04E-13 66.9533 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#17496 - CGI_10007607 superfamily 247068 1368 1460 2.72E-12 65.4125 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#17496 - CGI_10007607 superfamily 247068 379 464 2.26E-11 62.7161 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#17496 - CGI_10007607 superfamily 247068 260 359 5.87E-11 61.5605 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#17496 - CGI_10007607 superfamily 247068 1283 1355 6.58E-11 61.1753 cl15786 CA_like superfamily N - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#17496 - CGI_10007607 superfamily 247068 155 248 2.53E-08 53.4714 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#17496 - CGI_10007607 superfamily 247068 904 995 9.99E-08 51.5454 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#17496 - CGI_10007607 superfamily 247068 1007 1097 7.36E-06 46.1526 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#17496 - CGI_10007607 superfamily 247068 1110 1198 6.33E-05 43.071 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#17497 - CGI_10007608 superfamily 247068 484 581 2.17E-20 88.1393 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#17497 - CGI_10007608 superfamily 247068 594 683 4.19E-17 78.5093 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#17497 - CGI_10007608 superfamily 247068 167 319 1.35E-12 65.4125 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#17497 - CGI_10007608 superfamily 247068 768 852 2.34E-11 61.5605 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#17497 - CGI_10007608 superfamily 247068 338 476 4.76E-05 42.6858 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#17498 - CGI_10007609 superfamily 247856 174 234 1.72E-10 56.7873 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#17499 - CGI_10007610 superfamily 217324 35 155 9.14E-19 79.4149 cl03844 Folate_rec superfamily - - Folate receptor family; This family includes the folate receptor which binds to folate and reduced folic acid derivatives and mediates delivery of 5-methyltetrahydrofolate to the interior of cells. These proteins are attached to the membrane by a GPI-anchor. The proteins contain 16 conserved cysteines that form eight disulphide bridges. Q#17500 - CGI_10007611 superfamily 246723 368 553 1.13E-05 46.9127 cl14813 GluZincin superfamily N - "Peptidase Gluzincin family (thermolysin-like proteinases, TLPs) includes peptidases M1, M2, M3, M4, M13, M32 and M36 (fungalysins); Gluzincin family (thermolysin-like peptidases or TLPs) includes several zinc-dependent metallopeptidases such as the M1, M2, M3, M4, M13, M32, M36 peptidases (MEROPS classification), and contain HEXXH and EXXXD motifs as part of their active site. 
All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. M1 family includes aminopeptidase N (APN) and leukotriene A4 hydrolase (LTA4H). APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types. LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity such that the two activities occupy different, but overlapping sites. The peptidase M3 or neurolysin-like family, includes M3, M2 and M32 metallopeptidases. The M3 peptidases have two subfamilies: M3A, includes thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (3.4.24.16), and the mitochondrial intermediate peptidase; M3B contains oligopeptidase F. M2 peptidase angiotensin converting enzyme (ACE, EC 3.4.15.1) catalyzes the conversion of decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II. ACE is a key part of the renin-angiotensin system that regulates blood pressure, thus ACE inhibitors are important for the treatment of hypertension. M32 family includes two eukaryotic enzymes from protozoa Trypanosoma cruzi, a causative agent of Chagas' disease, and Leishmania major, a parasite that causes leishmaniasis, making them attractive targets for drug development. The M4 family includes secreted protease thermolysin (EC 3.4.24.27), pseudolysin, aureolysin, neutral protease as well as fungalysin and bacillolysin (EC 3.4.24.28) that degrade extracellular proteins and peptides for bacterial nutrition, especially prior to sporulation. Thermolysin is widely used as a nonspecific protease to obtain fragments for peptide sequencing as well as in production of the artificial sweetener aspartame. 
M13 family includes neprilysin (EC 3.4.24.11) and endothelin-converting enzyme I (ECE-1, EC 3.4.24.71), which fulfill a broad range of physiological roles due to the greater variation in the S2' subsite allowing substrate specificity and are prime therapeutic targets for selective inhibition. Peptidase M36 (fungamysin) family includes endopeptidases from pathogenic fungi. Fungalysin hydrolyzes extracellular matrix proteins such as elastin and keratin. Aspergillus fumigatus causes the pulmonary disease aspergillosis by invading the lungs of immuno-compromised animals and secreting fungalysin that possibly breaks down proteinaceous structural barriers." Q#17500 - CGI_10007611 superfamily 246723 233 368 0.000146128 43.4459 cl14813 GluZincin superfamily C - "Peptidase Gluzincin family (thermolysin-like proteinases, TLPs) includes peptidases M1, M2, M3, M4, M13, M32 and M36 (fungalysins); Gluzincin family (thermolysin-like peptidases or TLPs) includes several zinc-dependent metallopeptidases such as the M1, M2, M3, M4, M13, M32, M36 peptidases (MEROPS classification), and contain HEXXH and EXXXD motifs as part of their active site. All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. M1 family includes aminopeptidase N (APN) and leukotriene A4 hydrolase (LTA4H). APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types. LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity such that the two activities occupy different, but overlapping sites. The peptidase M3 or neurolysin-like family, includes M3, M2 and M32 metallopeptidases. 
The M3 peptidases have two subfamilies: M3A, includes thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (3.4.24.16), and the mitochondrial intermediate peptidase; M3B contains oligopeptidase F. M2 peptidase angiotensin converting enzyme (ACE, EC 3.4.15.1) catalyzes the conversion of decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II. ACE is a key part of the renin-angiotensin system that regulates blood pressure, thus ACE inhibitors are important for the treatment of hypertension. M32 family includes two eukaryotic enzymes from protozoa Trypanosoma cruzi, a causative agent of Chagas' disease, and Leishmania major, a parasite that causes leishmaniasis, making them attractive targets for drug development. The M4 family includes secreted protease thermolysin (EC 3.4.24.27), pseudolysin, aureolysin, neutral protease as well as fungalysin and bacillolysin (EC 3.4.24.28) that degrade extracellular proteins and peptides for bacterial nutrition, especially prior to sporulation. Thermolysin is widely used as a nonspecific protease to obtain fragments for peptide sequencing as well as in production of the artificial sweetener aspartame. M13 family includes neprilysin (EC 3.4.24.11) and endothelin-converting enzyme I (ECE-1, EC 3.4.24.71), which fulfill a broad range of physiological roles due to the greater variation in the S2' subsite allowing substrate specificity and are prime therapeutic targets for selective inhibition. Peptidase M36 (fungamysin) family includes endopeptidases from pathogenic fungi. Fungalysin hydrolyzes extracellular matrix proteins such as elastin and keratin. Aspergillus fumigatus causes the pulmonary disease aspergillosis by invading the lungs of immuno-compromised animals and secreting fungalysin that possibly breaks down proteinaceous structural barriers." 
Q#17501 - CGI_10007612 superfamily 245819 883 1018 5.76E-55 189.327 cl11967 Nucleotidyl_cyc_III superfamily C - "Class III nucleotidyl cyclases; Class III nucleotidyl cyclases are the largest, most diverse group of nucleotidyl cyclases (NC's) containing prokaryotic and eukaryotic proteins. They can be divided into two major groups; the mononucleotidyl cyclases (MNC's) and the diguanylate cyclases (DGC's). The MNC's, which include the adenylate cyclases (AC's) and the guanylate cyclases (GC's), have a conserved cyclase homology domain (CHD), while the DGC's have a conserved GGDEF domain, named after a conserved motif within this subgroup. Their products, cyclic guanylyl and adenylyl nucleotides, are second messengers that play important roles in eukaryotic signal transduction and prokaryotic sensory pathways." Q#17501 - CGI_10007612 superfamily 245225 28 434 1.90E-103 331.988 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily - - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein couples receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine- binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. 
In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#17501 - CGI_10007612 superfamily 245201 565 810 3.21E-42 156.157 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#17502 - CGI_10007613 superfamily 241584 26 118 7.03E-11 56.7359 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." 
Q#17506 - CGI_10003077 superfamily 219026 9 283 6.80E-90 271.025 cl05768 GPI2 superfamily - - "Phosphatidylinositol N-acetylglucosaminyltransferase; Glycosylphosphatidylinositol (GPI) represents an important anchoring molecule for cell surface proteins. The first step in its synthesis is the transfer of N-acetylglucosamine (GlcNAc) from UDP-N-acetylglucosamine to phosphatidylinositol (PI). This step involves products of three or four genes in both yeast (GPI1, GPI2 and GPI3) and mammals (GPI1, PIG A, PIG H and PIG C), respectively." Q#17508 - CGI_10003079 superfamily 213389 137 302 6.94E-56 182.489 cl17092 STING_C superfamily - - "C-terminal domain of STING; STING (stimulator of interferon genes, also known as MITA, ERIS, MPYS and TMEM173) is a master regulator that mediates cytokine production in response to microbial invasion by directly sensing bacterial secondary messengers such as the cyclic dinucleotide bis-(3'-5')-cyclic dimeric GMP (c-di-GMP) and leading to the activation of IFN regulatory factor 3 (IRF3) through TANK-binding kinase 1 (TBK1) stimulation. STING is also a signaling adaptor in the IFN response to cytosolic DNA. This detection of foreign materials is the first step to a successful immune response. STING is localized in the ER and comprised of a predicted N-terminal transmembrane region and a C-terminal c-di-GMP binding domain." Q#17510 - CGI_10004850 superfamily 217293 26 162 3.97E-31 113.882 cl03788 Neur_chan_LBD superfamily C - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#17512 - CGI_10004852 superfamily 245608 15 215 2.61E-75 228.737 cl11421 FAA_hydrolase superfamily - - "Fumarylacetoacetate (FAA) hydrolase family; This family consists of fumarylacetoacetate (FAA) hydrolase, or fumarylacetoacetate hydrolase (FAH) and it also includes HHDD isomerase/OPET decarboxylase from E. 
coli strain W. FAA is the last enzyme in the tyrosine catabolic pathway, it hydrolyses fumarylacetoacetate into fumarate and acetoacetate which then join the citric acid cycle. Mutations in FAA cause type I tyrosinemia in humans this is an inherited disorder mainly affecting the liver leading to liver cirrhosis, hepatocellular carcinoma, renal tubular damages and neurologic crises amongst other symptoms. The enzymatic defect causes the toxic accumulation of phenylalanine/tyrosine catabolites. The E. coli W enzyme HHDD isomerase/OPET decarboxylase contains two copies of this domain and functions in fourth and fifth steps of the homoprotocatechuate pathway; here it decarboxylates OPET to HHDD and isomerises this to OHED. The final products of this pathway are pyruvic acid and succinic semialdehyde. This family also includes various hydratases and 4-oxalocrotonate decarboxylases which are involved in the bacterial meta-cleavage pathways for degradation of aromatic compounds. 2-hydroxypentadienoic acid hydratase, encoded by mhpD in E. coli, is involved in the phenylpropionic acid pathway of E. coli and catalyzes the conversion of 2-hydroxy pentadienoate to 4-hydroxy-2-keto-pentanoate and uses a Mn2+ co-factor. OHED hydratase encoded by hpcG in E. coli is involved in the homoprotocatechuic acid (HPC) catabolism. XylI in P. putida is a 4-Oxalocrotonate decarboxylase." Q#17519 - CGI_10004859 superfamily 109845 111 150 3.80E-09 48.9191 cl02971 Pentapeptide superfamily - - "Pentapeptide repeats (8 copies); These repeats are found in many cyanobacterial proteins. The repeats were first identified in hglK. The function of these repeats is unknown. The structure of this repeat has been predicted to be a beta-helix. The repeat can be approximately described as A(D/N)LXX, where X can be any amino acid." 
Q#17519 - CGI_10004859 superfamily 109845 26 70 4.12E-05 38.1335 cl02971 Pentapeptide superfamily - - "Pentapeptide repeats (8 copies); These repeats are found in many cyanobacterial proteins. The repeats were first identified in hglK. The function of these repeats is unknown. The structure of this repeat has been predicted to be a beta-helix. The repeat can be approximately described as A(D/N)LXX, where X can be any amino acid." Q#17519 - CGI_10004859 superfamily 109845 2 40 8.72E-05 37.3631 cl02971 Pentapeptide superfamily - - "Pentapeptide repeats (8 copies); These repeats are found in many cyanobacterial proteins. The repeats were first identified in hglK. The function of these repeats is unknown. The structure of this repeat has been predicted to be a beta-helix. The repeat can be approximately described as A(D/N)LXX, where X can be any amino acid." Q#17519 - CGI_10004859 superfamily 109845 81 125 0.00485429 32.7407 cl02971 Pentapeptide superfamily - - "Pentapeptide repeats (8 copies); These repeats are found in many cyanobacterial proteins. The repeats were first identified in hglK. The function of these repeats is unknown. The structure of this repeat has been predicted to be a beta-helix. The repeat can be approximately described as A(D/N)LXX, where X can be any amino acid." Q#17520 - CGI_10004860 superfamily 109845 64 103 6.20E-10 50.4599 cl02971 Pentapeptide superfamily - - "Pentapeptide repeats (8 copies); These repeats are found in many cyanobacterial proteins. The repeats were first identified in hglK. The function of these repeats is unknown. The structure of this repeat has been predicted to be a beta-helix. The repeat can be approximately described as A(D/N)LXX, where X can be any amino acid." Q#17520 - CGI_10004860 superfamily 109845 34 78 0.000644233 33.8963 cl02971 Pentapeptide superfamily - - "Pentapeptide repeats (8 copies); These repeats are found in many cyanobacterial proteins. The repeats were first identified in hglK. 
The function of these repeats is unknown. The structure of this repeat has been predicted to be a beta-helix. The repeat can be approximately described as A(D/N)LXX, where X can be any amino acid." Q#17521 - CGI_10005091 superfamily 245225 34 164 7.29E-69 217.881 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily C - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein couples receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine- binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. 
Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#17522 - CGI_10005092 superfamily 245225 1 289 1.45E-97 311.099 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily N - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein couples receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine- binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. 
Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#17522 - CGI_10005092 superfamily 215648 382 622 4.16E-74 241.346 cl02802 7tm_3 superfamily - - "7 transmembrane sweet-taste receptor of 3 GCPR; This is a domain of seven transmembrane regions that forms the C-terminus of some subclass 3 G-coupled-protein receptors. It is often associated with a downstream cysteine-rich linker domain, NCD3G pfam07562, which is the human sweet-taste receptor, and the N-terminal domain, ANF_receptor pfam01094. The seven TM regions assemble in such a way as to produce a docking pocket into which such molecules as cyclamate and lactisole have been found to bind and consequently confer the taste of sweetness." Q#17522 - CGI_10005092 superfamily 219467 302 351 1.12E-12 63.8915 cl08456 NCD3G superfamily - - "Nine Cysteines Domain of family 3 GPCR; This conserved sequence contains several highly-conserved Cys residues that are predicted to form disulphide bridges. It is predicted to lie outside the cell membrane, tethered to the pfam00003 in several receptor proteins." Q#17524 - CGI_10002373 superfamily 246669 11 111 8.93E-24 89.5648 cl14603 C2 superfamily N - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. 
However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#17525 - CGI_10002374 superfamily 241868 295 430 6.05E-67 212.012 cl00447 Nudix_Hydrolase superfamily - - "Nudix hydrolase is a superfamily of enzymes found in all three kingdoms of life, and it catalyzes the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+ for their activity. Members of this family are recognized by a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which forms a structural motif that functions as a metal binding and catalytic site. Substrates of nudix hydrolase include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance and "house-cleaning" enzymes. Substrate specificity is used to define child families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. 
This superfamily consists of at least nine families: IPP (isopentenyl diphosphate) isomerase, ADP ribose pyrophosphatase, mutT pyrophosphohydrolase, coenzyme-A pyrophosphatase, MTH1-7,8-dihydro-8-oxoguanine-triphosphatase, diadenosine tetraphosphate hydrolase, NADH pyrophosphatase, GDP-mannose hydrolase and the c-terminal portion of the mutY adenine glycosylase." Q#17525 - CGI_10002374 superfamily 243072 21 102 2.33E-16 75.1126 cl02529 ANK superfamily C - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#17525 - CGI_10002374 superfamily 204192 252 280 1.21E-05 42.2162 cl07801 zf-NADH-PPase superfamily - - NADH pyrophosphatase zinc ribbon domain; This domain is found in between two duplicated NUDIX domains. It has a zinc ribbon structure. Q#17525 - CGI_10002374 superfamily 220167 184 249 0.0042145 35.4601 cl07800 NUDIX-like superfamily N - "NADH pyrophosphatase-like rudimentary NUDIX domain; The N-terminal domain in NADH pyrophosphatase, which has a rudiment Nudix fold according to SCOP." Q#17526 - CGI_10002375 superfamily 243176 240 748 0 735.567 cl02777 chaperonin_like superfamily - - "chaperonin_like superfamily. Chaperonins are involved in productive folding of proteins. They share a common general morphology, a double toroid of 2 stacked rings, each composed of 7-9 subunits. There are 2 main chaperonin groups. The symmetry of type I is seven-fold and they are found in eubacteria (GroEL) and in organelles of eubacterial descent (hsp60 and RBP). 
The symmetry of type II is eight- or nine-fold and they are found in archaea (thermosome), thermophilic bacteria (TF55) and in the eukaryotic cytosol (CCT). Their common function is to sequester nonnative proteins inside their central cavity and promote folding by using energy derived from ATP hydrolysis. This superfamily also contains related domains from Fab1-like phosphatidylinositol 3-phosphate (PtdIns3P) 5-kinases that only contain the intermediate and apical domains." Q#17526 - CGI_10002375 superfamily 243176 18 185 2.31E-90 293.743 cl02777 chaperonin_like superfamily C - "chaperonin_like superfamily. Chaperonins are involved in productive folding of proteins. They share a common general morphology, a double toroid of 2 stacked rings, each composed of 7-9 subunits. There are 2 main chaperonin groups. The symmetry of type I is seven-fold and they are found in eubacteria (GroEL) and in organelles of eubacterial descent (hsp60 and RBP). The symmetry of type II is eight- or nine-fold and they are found in archaea (thermosome), thermophilic bacteria (TF55) and in the eukaryotic cytosol (CCT). Their common function is to sequester nonnative proteins inside their central cavity and promote folding by using energy derived from ATP hydrolysis. This superfamily also contains related domains from Fab1-like phosphatidylinositol 3-phosphate (PtdIns3P) 5-kinases that only contain the intermediate and apical domains." Q#17530 - CGI_10006154 superfamily 248241 8 275 1.03E-31 122.042 cl17687 5_nucleotid superfamily N - "5' nucleotidase family; This family of eukaryotic proteins includes 5' nucleotidase enzymes, such as purine 5'-nucleotidase EC:3.1.3.5." Q#17531 - CGI_10006155 superfamily 248241 7 265 7.54E-21 90.0705 cl17687 5_nucleotid superfamily - - "5' nucleotidase family; This family of eukaryotic proteins includes 5' nucleotidase enzymes, such as purine 5'-nucleotidase EC:3.1.3.5." 
Q#17532 - CGI_10006156 superfamily 247743 539 682 8.78E-13 66.7859 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#17532 - CGI_10006156 superfamily 243514 678 744 1.63E-09 56.5255 cl03749 STAT_int superfamily N - "STAT protein, protein interaction domain; STAT proteins (Signal Transducers and Activators of Transcription) are a family of transcription factors that are specifically activated to regulate gene transcription when cells encounter cytokines and growth factors. STAT proteins also include an SH2 domain pfam00017." Q#17535 - CGI_10008135 superfamily 242406 1 63 1.32E-06 43.7341 cl01271 DUF1768 superfamily N - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. Q#17536 - CGI_10008137 superfamily 241600 1 131 3.55E-31 112.334 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. 
C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#17537 - CGI_10008138 superfamily 214781 532 620 2.72E-14 70.834 cl02747 NRF superfamily - - N-terminal domain in C. elegans NRF-6 (Nose Resistant to Fluoxetine-4) and NDG-4 (resistant to nordihydroguaiaretic acid-4); Also present in several other worm and fly proteins. Q#17537 - CGI_10008138 superfamily 214781 102 190 9.81E-14 69.2932 cl02747 NRF superfamily - - N-terminal domain in C. elegans NRF-6 (Nose Resistant to Fluoxetine-4) and NDG-4 (resistant to nordihydroguaiaretic acid-4); Also present in several other worm and fly proteins. Q#17537 - CGI_10008138 superfamily 226122 889 1072 0.00177894 40.4343 cl18744 NolL superfamily N - Fucose 4-O-acetylase and related acetyltransferases [Carbohydrate transport and metabolism] Q#17538 - CGI_10020990 superfamily 241550 382 524 1.23E-92 290.692 cl00015 nt_trans superfamily N - "nucleotidyl transferase superfamily; nt_trans (nucleotidyl transferase) This superfamily includes the class I amino-acyl tRNA synthetases, pantothenate synthetase (PanC), ATP sulfurylase, and the cytidylyltransferases, all of which have a conserved dinucleotide-binding domain." Q#17538 - CGI_10020990 superfamily 217810 522 709 2.09E-60 202.099 cl04341 tRNA-synt_1c_C superfamily - - "tRNA synthetases class I (E and Q), anti-codon binding domain; Other tRNA synthetase sub-families are too dissimilar to be included. This family includes only glutamyl and glutaminyl tRNA synthetases. In some organisms, a single glutamyl-tRNA synthetase aminoacylates both tRNA(Glu) and tRNA(Gln)." 
Q#17538 - CGI_10020990 superfamily 241550 220 321 7.87E-59 200.17 cl00015 nt_trans superfamily C - "nucleotidyl transferase superfamily; nt_trans (nucleotidyl transferase) This superfamily includes the class I amino-acyl tRNA synthetases, pantothenate synthetase (PanC), ATP sulfurylase, and the cytidylyltransferases, all of which have a conserved dinucleotide-binding domain." Q#17538 - CGI_10020990 superfamily 218149 3 115 2.71E-36 134.702 cl04588 tRNA_synt_1c_R1 superfamily N - "Glutaminyl-tRNA synthetase, non-specific RNA binding region part 1; This is a region found N terminal to the catalytic domain of glutaminyl-tRNA synthetase (EC 6.1.1.18) in eukaryotes but not in Escherichia coli. This region is thought to bind RNA in a non-specific manner, enhancing interactions between the tRNA and enzyme, but is not essential for enzyme function." Q#17538 - CGI_10020990 superfamily 218148 116 213 2.39E-19 83.9496 cl04587 tRNA_synt_1c_R2 superfamily - - "Glutaminyl-tRNA synthetase, non-specific RNA binding region part 2; This is a region found N terminal to the catalytic domain of glutaminyl-tRNA synthetase (EC 6.1.1.18) in eukaryotes but not in Escherichia coli. This region is thought to bind RNA in a non-specific manner, enhancing interactions between the tRNA and enzyme, but is not essential for enzyme function." 
Q#17539 - CGI_10020991 superfamily 243092 129 263 0.000757345 39.6256 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#17539 - CGI_10020991 superfamily 110440 308 335 0.00932999 33.5353 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." 
Q#17540 - CGI_10020992 superfamily 241563 102 138 7.66E-05 40.7336 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#17540 - CGI_10020992 superfamily 110440 522 548 0.000282969 38.9281 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#17540 - CGI_10020992 superfamily 191851 141 252 0.00354519 37.6095 cl06708 DUF1640 superfamily - - Protein of unknown function (DUF1640); This family consists of sequences derived from hypothetical eukaryotic proteins. A region approximately 100 residues in length is featured. Q#17540 - CGI_10020992 superfamily 110440 563 590 0.00781739 34.6909 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." 
Q#17541 - CGI_10020993 superfamily 243128 755 944 3.75E-33 129.017 cl02652 MIF4G superfamily - - "MIF4G domain; MIF4G is named after Middle domain of eukaryotic initiation factor 4G (eIF4G). Also occurs in NMD2p and CBP80. The domain is rich in alpha-helices and may contain multiple alpha-helical repeats. In eIF4G, this domain binds eIF4A, eIF3, RNA and DNA." Q#17541 - CGI_10020993 superfamily 243128 959 1154 9.48E-33 127.862 cl02652 MIF4G superfamily - - "MIF4G domain; MIF4G is named after Middle domain of eukaryotic initiation factor 4G (eIF4G). Also occurs in NMD2p and CBP80. The domain is rich in alpha-helices and may contain multiple alpha-helical repeats. In eIF4G, this domain binds eIF4A, eIF3, RNA and DNA." Q#17541 - CGI_10020993 superfamily 217861 1243 1334 2.22E-17 82.0386 cl04380 Upf2 superfamily N - "Up-frameshift suppressor 2; Transcripts harbouring premature signals for translation termination are recognised and rapidly degraded by eukaryotic cells through a pathway known as nonsense-mediated mRNA decay. In Saccharomyces cerevisiae, three trans-acting factors (Upf1 to Upf3) are required for nonsense-mediated mRNA decay." Q#17541 - CGI_10020993 superfamily 243128 400 592 9.37E-12 65.0743 cl02652 MIF4G superfamily - - "MIF4G domain; MIF4G is named after Middle domain of eukaryotic initiation factor 4G (eIF4G). Also occurs in NMD2p and CBP80. The domain is rich in alpha-helices and may contain multiple alpha-helical repeats. In eIF4G, this domain binds eIF4A, eIF3, RNA and DNA." Q#17541 - CGI_10020993 superfamily 198850 40 103 6.16E-09 54.8323 cl04907 L51_S25_CI-B8 superfamily - - "Mitochondrial ribosomal protein L51 / S25 / CI-B8 domain; The proteins in this family are located in the mitochondrion. The family includes ribosomal protein L51, and S25. This family also includes mitochondrial NADH-ubiquinone oxidoreductase B8 subunit (CI-B8) EC:1.6.5.3. 
It is not known whether all members of this family form part of the NADH-ubiquinone oxidoreductase and whether they are also all ribosomal proteins." Q#17543 - CGI_10020995 superfamily 216981 161 301 1.62E-20 87.5881 cl17087 OTU superfamily - - "OTU-like cysteine protease; This family is comprised of a group of predicted cysteine proteases, homologous to the Ovarian Tumour (OTU) gene in Drosophila. Members include proteins from eukaryotes, viruses and pathogenic bacterium. The conserved cysteine and histidine, and possibly the aspartate, represent the catalytic residues in this putative group of proteases." Q#17544 - CGI_10020996 superfamily 220135 3 156 2.87E-42 153.441 cl07707 PPP4R2 superfamily C - "PPP4R2; PPP4R2 (protein phosphatase 4 core regulatory subunit R2) is the regulatory subunit of the histone H2A phosphatase complex. It has been shown to confer resistance to the anticancer drug cisplatin in yeast, and may confer resistance in higher eukaryotes." Q#17545 - CGI_10020997 superfamily 247684 205 573 1.59E-48 174.187 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar 
proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#17545 - CGI_10020997 superfamily 247725 87 134 0.000348142 39.6444 cl17171 PH-like superfamily C - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#17546 - CGI_10020998 superfamily 247068 7 104 1.71E-28 111.636 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." 
Q#17546 - CGI_10020998 superfamily 247068 216 312 3.29E-27 107.784 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#17546 - CGI_10020998 superfamily 247068 117 209 1.50E-22 94.6877 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#17546 - CGI_10020998 superfamily 247068 321 420 1.61E-22 94.3025 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#17546 - CGI_10020998 superfamily 247068 429 530 2.81E-22 93.9173 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#17549 - CGI_10021001 superfamily 241944 5 505 0 846.707 cl00554 NAD_binding_5 superfamily - - "Myo-inositol-1-phosphate synthase; This is a family of myo-inositol-1-phosphate synthases. Inositol-1-phosphate catalyzes the conversion of glucose-6-phosphate to inositol-1-phosphate, which is then dephosphorylated to inositol. Inositol phosphates play an important role in signal transduction." Q#17550 - CGI_10021002 superfamily 217293 33 243 1.06E-67 218.656 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure.
Q#17550 - CGI_10021002 superfamily 202474 268 499 9.77E-12 63.4417 cl08379 Neur_chan_memb superfamily - - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#17551 - CGI_10021003 superfamily 247727 42 156 3.48E-13 65.9142 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#17551 - CGI_10021003 superfamily 247727 276 390 1.30E-12 64.3734 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#17552 - CGI_10021004 superfamily 245819 351 525 2.98E-50 172.764 cl11967 Nucleotidyl_cyc_III superfamily - - "Class III nucleotidyl cyclases; Class III nucleotidyl cyclases are the largest, most diverse group of nucleotidyl cyclases (NC's) containing prokaryotic and eukaryotic proteins. 
They can be divided into two major groups; the mononucleotidyl cyclases (MNC's) and the diguanylate cyclases (DGC's). The MNC's, which include the adenylate cyclases (AC's) and the guanylate cyclases (GC's), have a conserved cyclase homology domain (CHD), while the DGC's have a conserved GGDEF domain, named after a conserved motif within this subgroup. Their products, cyclic guanylyl and adenylyl nucleotides, are second messengers that play important roles in eukaryotic signal transduction and prokaryotic sensory pathways." Q#17552 - CGI_10021004 superfamily 219526 112 256 3.15E-35 131.587 cl06648 HNOBA superfamily C - "Heme NO binding associated; The HNOBA domain is found associated with the HNOB domain and pfam00211 in soluble cyclases and signalling proteins. The HNOB domain is predicted to function as a heme-dependent sensor for gaseous ligands, and transduce diverse downstream signals, in both bacteria and animals." Q#17553 - CGI_10021005 superfamily 217252 341 452 1.68E-41 144.245 cl08372 Pyr_redox_dim superfamily - - "Pyridine nucleotide-disulphide oxidoreductase, dimerisation domain; This family includes both class I and class II oxidoreductases and also NADH oxidases and peroxidases." Q#17553 - CGI_10021005 superfamily 215691 171 251 1.97E-16 74.1594 cl15766 Pyr_redox superfamily - - Pyridine nucleotide-disulphide oxidoreductase; This family includes both class I and class II oxidoreductases and also NADH oxidases and peroxidases. This domain is actually a small NADH binding domain within a larger FAD binding domain. Q#17553 - CGI_10021005 superfamily 248054 121 200 1.39E-05 44.6007 cl17500 NAD_binding_8 superfamily N - NAD(P)-binding Rossmann-like domain; NAD(P)-binding Rossmann-like domain. Q#17554 - CGI_10021006 superfamily 241626 166 285 3.05E-56 180.878 cl00125 RHOD superfamily - - "Rhodanese Homology Domain (RHOD); an alpha beta fold domain found duplicated in the rhodanese protein. 
The cysteine containing enzymatically active version of the domain is also found in the Cdc25 class of protein phosphatases and a variety of proteins such as sulfide dehydrogenases and certain stress proteins such as senescence specific protein 1 in plants, PspE and GlpE in bacteria and cyanide and arsenate resistance proteins. Inactive versions (no active site cysteine) are also seen in dual specificity phosphatases, ubiquitin hydrolases from yeast and in sulfuryltransferases, where they are believed to play a regulatory role in multidomain proteins." Q#17556 - CGI_10021008 superfamily 245213 317 356 9.34E-07 48.0166 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#17556 - CGI_10021008 superfamily 245874 357 430 7.84E-05 43.1838 cl12111 TNFR superfamily C - "Tumor necrosis factor receptor (TNFR) domain; superfamily of TNF-like receptor domains. When bound to TNF-like cytokines, TNFRs trigger multiple signal transduction pathways, they are involved in inflammation response, apoptosis, autoimmunity and organogenesis. TNFRs domains are elongated with generally three tandem repeats of cysteine-rich domains (CRDs). They fit in the grooves between protomers within the ligand trimer. Some TNFRs, such as NGFR and HveA, bind ligands with no structural similarity to TNF and do not bind ligand trimers."
Q#17556 - CGI_10021008 superfamily 245835 1092 1176 0.000549573 41.7233 cl12013 BAR superfamily N - "The Bin/Amphiphysin/Rvs (BAR) domain, a dimerization module that binds membranes and detects membrane curvature; BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions including organelle biogenesis, membrane trafficking or remodeling, and cell division and migration. Mutations in BAR containing proteins have been linked to diseases and their inactivation in cells leads to altered membrane dynamics. A BAR domain with an additional N-terminal amphipathic helix (an N-BAR) can drive membrane curvature. These N-BAR domains are found in amphiphysins and endophilins, among others. BAR domains are also frequently found alongside domains that determine lipid specificity, such as the Pleckstrin Homology (PH) and Phox Homology (PX) domains which are present in beta centaurins (ACAPs and ASAPs) and sorting nexins, respectively. A FES-CIP4 Homology (FCH) domain together with a coiled coil region is called the F-BAR domain and is present in Pombe/Cdc15 homology (PCH) family proteins, which include Fes/Fes tyrosine kinases, PACSIN or syndapin, CIP4-like proteins, and srGAPs, among others. The Inverse (I)-BAR or IRSp53/MIM homology Domain (IMD) is found in multi-domain proteins, such as IRSp53 and MIM, that act as scaffolding proteins and transducers of a variety of signaling pathways that link membrane dynamics and the underlying actin cytoskeleton. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The I-BAR domain induces membrane protrusions in the opposite direction compared to classical BAR and F-BAR domains, which produce membrane invaginations. 
BAR domains that also serve as protein interaction domains include those of arfaptin and OPHN1-like proteins, among others, which bind to Rac and Rho GAP domains, respectively." Q#17556 - CGI_10021008 superfamily 247725 7 27 0.00495958 37.5096 cl17171 PH-like superfamily NC - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#17557 - CGI_10021009 superfamily 247916 128 198 5.95E-11 58.5483 cl17362 Transglut_core superfamily - - "Transglutaminase-like superfamily; This family includes animal transglutaminases and other bacterial proteins of unknown function. Sequence conservation in this superfamily primarily involves three motifs that centre around conserved cysteine, histidine, and aspartate residues that form the catalytic triad in the structurally characterized transglutaminase, the human blood clotting factor XIIIa'. On the basis of the experimentally demonstrated activity of the Methanobacterium phage pseudomurein endoisopeptidase, it is proposed that many, if not all, microbial homologues of the transglutaminases are proteases and that the eukaryotic transglutaminases have evolved from an ancestral protease." Q#17559 - CGI_10021011 superfamily 110440 910 937 0.00312807 36.6169 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. 
The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#17559 - CGI_10021011 superfamily 110440 752 777 0.00476072 36.2317 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#17561 - CGI_10021013 superfamily 246616 12 287 2.48E-114 334.518 cl14105 MetH superfamily - - "Methionine synthase I (cobalamin-dependent), methyltransferase domain [Amino acid transport and metabolism]" Q#17562 - CGI_10021014 superfamily 189332 19 276 5.58E-73 243.845 cl14874 Luminal_IRE1_like superfamily - - "The Luminal domain, a dimerization domain, of Inositol-requiring protein 1-like proteins; The Luminal domain is a dimerization domain present in Inositol-requiring protein 1 (IRE1), eukaryotic translation Initiation Factor 2-Alpha Kinase 3 (EIF2AK3), and similar proteins. IRE1 and EIF2AK3 are serine/threonine protein kinases (STKs) and are type I transmembrane proteins that are localized in the endoplasmic reticulum (ER). They are kinase receptors that are activated through the release of BiP, a chaperone bound to their luminal domains under unstressed conditions.
This results in dimerization through their luminal domains, allowing trans-autophosphorylation of their kinase domains and activation. They play roles in the signaling of the unfolded protein response (UPR), which is activated when protein misfolding is detected in the ER in order to decrease the synthesis of new proteins and increase the capacity of the ER to cope with the stress. IRE1, also called Endoplasmic reticulum (ER)-to-nucleus signaling protein (or ERN), contains an endoribonuclease domain in its cytoplasmic side and acts as an ER stress sensor. It is the oldest and most conserved component of the UPR in eukaryotes. Its activation results in the cleavage of its mRNA substrate, HAC1 in yeast and Xbp1 in metazoans, promoting a splicing event that enables translation into a transcription factor which activates the UPR. EIF2AK3, also called PKR-like Endoplasmic Reticulum Kinase (PERK), phosphorylates the alpha subunit of eIF-2, resulting in the downregulation of protein synthesis. It functions as the central regulator of translational control during the UPR pathway. In addition to the eIF-2 alpha subunit, EIF2AK3 also phosphorylates Nrf2, a leucine zipper transcription factor which regulates cellular redox status and promotes cell survival during the UPR." Q#17562 - CGI_10021014 superfamily 245201 732 911 5.18E-29 117.599 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." 
Q#17562 - CGI_10021014 superfamily 245201 444 508 1.07E-13 70.7285 cl09925 PKc_like superfamily C - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#17563 - CGI_10021015 superfamily 247736 26 86 1.56E-08 49.9669 cl17182 NAT_SF superfamily - - "N-Acyltransferase superfamily: Various enzymes that characteristically catalyze the transfer of an acyl group to a substrate; NAT (N-Acyltransferase) is a large superfamily of enzymes that mostly catalyze the transfer of an acyl group to a substrate and are implicated in a variety of functions, ranging from bacterial antibiotic resistance to circadian rhythms in mammals. Members include GCN5-related N-Acetyltransferases (GNAT) such as Aminoglycoside N-acetyltransferases, Histone N-acetyltransferase (HAT) enzymes, and Serotonin N-acetyltransferase, which catalyze the transfer of an acetyl group to a substrate. The kinetic mechanism of most GNATs involves the ordered formation of a ternary complex: the reaction begins with Acetyl Coenzyme A (AcCoA) binding, followed by binding of substrate, then direct transfer of the acetyl group from AcCoA to the substrate, followed by product and subsequent CoA release. Other family members include Arginine/ornithine N-succinyltransferase, Myristoyl-CoA: protein N-myristoyltransferase, and Acyl-homoserinelactone synthase which have a similar catalytic mechanism but differ in types of acyl groups transferred. 
Leucyl/phenylalanyl-tRNA-protein transferase and FemXAB nonribosomal peptidyltransferases which catalyze similar peptidyltransferase reactions are also included." Q#17564 - CGI_10021016 superfamily 247736 25 93 3.24E-12 58.056 cl17182 NAT_SF superfamily - - "N-Acyltransferase superfamily: Various enzymes that characteristically catalyze the transfer of an acyl group to a substrate; NAT (N-Acyltransferase) is a large superfamily of enzymes that mostly catalyze the transfer of an acyl group to a substrate and are implicated in a variety of functions, ranging from bacterial antibiotic resistance to circadian rhythms in mammals. Members include GCN5-related N-Acetyltransferases (GNAT) such as Aminoglycoside N-acetyltransferases, Histone N-acetyltransferase (HAT) enzymes, and Serotonin N-acetyltransferase, which catalyze the transfer of an acetyl group to a substrate. The kinetic mechanism of most GNATs involves the ordered formation of a ternary complex: the reaction begins with Acetyl Coenzyme A (AcCoA) binding, followed by binding of substrate, then direct transfer of the acetyl group from AcCoA to the substrate, followed by product and subsequent CoA release. Other family members include Arginine/ornithine N-succinyltransferase, Myristoyl-CoA: protein N-myristoyltransferase, and Acyl-homoserinelactone synthase which have a similar catalytic mechanism but differ in types of acyl groups transferred. Leucyl/phenylalanyl-tRNA-protein transferase and FemXAB nonribosomal peptidyltransferases which catalyze similar peptidyltransferase reactions are also included." 
Q#17565 - CGI_10021017 superfamily 247736 69 130 8.59E-12 57.2856 cl17182 NAT_SF superfamily - - "N-Acyltransferase superfamily: Various enzymes that characteristically catalyze the transfer of an acyl group to a substrate; NAT (N-Acyltransferase) is a large superfamily of enzymes that mostly catalyze the transfer of an acyl group to a substrate and are implicated in a variety of functions, ranging from bacterial antibiotic resistance to circadian rhythms in mammals. Members include GCN5-related N-Acetyltransferases (GNAT) such as Aminoglycoside N-acetyltransferases, Histone N-acetyltransferase (HAT) enzymes, and Serotonin N-acetyltransferase, which catalyze the transfer of an acetyl group to a substrate. The kinetic mechanism of most GNATs involves the ordered formation of a ternary complex: the reaction begins with Acetyl Coenzyme A (AcCoA) binding, followed by binding of substrate, then direct transfer of the acetyl group from AcCoA to the substrate, followed by product and subsequent CoA release. Other family members include Arginine/ornithine N-succinyltransferase, Myristoyl-CoA: protein N-myristoyltransferase, and Acyl-homoserinelactone synthase which have a similar catalytic mechanism but differ in types of acyl groups transferred. Leucyl/phenylalanyl-tRNA-protein transferase and FemXAB nonribosomal peptidyltransferases which catalyze similar peptidyltransferase reactions are also included." Q#17566 - CGI_10021018 superfamily 246683 14 264 2.17E-78 245.108 cl14648 Aldose_epim superfamily - - "aldose 1-epimerase superfamily; Aldose 1-epimerases or mutarotases are key enzymes of carbohydrate metabolism; they catalyze the interconversion of the alpha- and beta-anomers of hexose sugars such as glucose and galactose. This interconversion is an important step that allows anomer specific metabolic conversion of sugars. 
Studies of the catalytic mechanism of the best known member of the family, galactose mutarotase, have shown a glutamate and a histidine residue to be critical for catalysis; the glutamate serves as the active site base to initiate the reaction by removing the proton from the C-1 hydroxyl group of the sugar substrate and the histidine as the active site acid to protonate the C-5 ring oxygen." Q#17570 - CGI_10006559 superfamily 241886 3 324 8.35E-111 327.594 cl00470 Aldo_ket_red superfamily - - "Aldo-keto reductases (AKRs) are a superfamily of soluble NAD(P)(H) oxidoreductases whose chief purpose is to reduce aldehydes and ketones to primary and secondary alcohols. AKRs are present in all phyla and are of importance to both health and industrial applications. Members have very distinct functions and include the prokaryotic 2,5-diketo-D-gluconic acid reductases and beta-keto ester reductases, the eukaryotic aldose reductases, aldehyde reductases, hydroxysteroid dehydrogenases, steroid 5beta-reductases, potassium channel beta-subunits and aflatoxin aldehyde reductases, among others." Q#17571 - CGI_10006560 superfamily 241886 3 327 2.48E-101 314.497 cl00470 Aldo_ket_red superfamily - - "Aldo-keto reductases (AKRs) are a superfamily of soluble NAD(P)(H) oxidoreductases whose chief purpose is to reduce aldehydes and ketones to primary and secondary alcohols. AKRs are present in all phyla and are of importance to both health and industrial applications. Members have very distinct functions and include the prokaryotic 2,5-diketo-D-gluconic acid reductases and beta-keto ester reductases, the eukaryotic aldose reductases, aldehyde reductases, hydroxysteroid dehydrogenases, steroid 5beta-reductases, potassium channel beta-subunits and aflatoxin aldehyde reductases, among others." 
Q#17571 - CGI_10006560 superfamily 241886 355 655 7.71E-90 284.452 cl00470 Aldo_ket_red superfamily - - "Aldo-keto reductases (AKRs) are a superfamily of soluble NAD(P)(H) oxidoreductases whose chief purpose is to reduce aldehydes and ketones to primary and secondary alcohols. AKRs are present in all phyla and are of importance to both health and industrial applications. Members have very distinct functions and include the prokaryotic 2,5-diketo-D-gluconic acid reductases and beta-keto ester reductases, the eukaryotic aldose reductases, aldehyde reductases, hydroxysteroid dehydrogenases, steroid 5beta-reductases, potassium channel beta-subunits and aflatoxin aldehyde reductases, among others." Q#17572 - CGI_10006561 superfamily 241886 1 297 8.82E-95 285.607 cl00470 Aldo_ket_red superfamily - - "Aldo-keto reductases (AKRs) are a superfamily of soluble NAD(P)(H) oxidoreductases whose chief purpose is to reduce aldehydes and ketones to primary and secondary alcohols. AKRs are present in all phyla and are of importance to both health and industrial applications. Members have very distinct functions and include the prokaryotic 2,5-diketo-D-gluconic acid reductases and beta-keto ester reductases, the eukaryotic aldose reductases, aldehyde reductases, hydroxysteroid dehydrogenases, steroid 5beta-reductases, potassium channel beta-subunits and aflatoxin aldehyde reductases, among others." 
Q#17575 - CGI_10002794 superfamily 247684 11 172 1.78E-44 153.202 cl17037 NBD_sugar-kinase_HSP70_actin superfamily C - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#17576 - CGI_10002795 superfamily 247684 12 427 2.66E-87 278.777 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#17577 - CGI_10002796 superfamily 242169 13 93 5.98E-13 59.8982 cl00886 Robl_LC7 superfamily - - "Roadblock/LC7 domain; This family includes proteins that are about 100 amino acids long and have been shown to be related. Members of this family of proteins are associated with both flagellar outer arm dynein and Drosophila and rat brain cytoplasmic dynein. It is proposed that roadblock/LC7 family members may modulate specific dynein functions. 
This family also includes Golgi-associated MP1 adapter protein and MglB from Myxococcus xanthus, a protein involved in gliding motility. However the family also includes members from non-motile bacteria such as Streptomyces coelicolor, suggesting that the protein may play a structural or regulatory role." Q#17578 - CGI_10020946 superfamily 238192 1 176 1.85E-47 163.945 cl18939 Cyt_C5_DNA_methylase superfamily C - "Cytosine-C5 specific DNA methylases; Methyl transfer reactions play an important role in many aspects of biology. Cytosine-specific DNA methylases are found both in prokaryotes and eukaryotes. DNA methylation, or the covalent addition of a methyl group to cytosine within the context of the CpG dinucleotide, has profound effects on the mammalian genome. These effects include transcriptional repression via inhibition of transcription factor binding or the recruitment of methyl-binding proteins and their associated chromatin remodeling factors, X chromosome inactivation, imprinting and the suppression of parasitic DNA sequences. DNA methylation is also essential for proper embryonic development and is an important player in both DNA repair and genome stability." Q#17578 - CGI_10020946 superfamily 238192 223 365 1.45E-11 63.023 cl18939 Cyt_C5_DNA_methylase superfamily N - "Cytosine-C5 specific DNA methylases; Methyl transfer reactions play an important role in many aspects of biology. Cytosine-specific DNA methylases are found both in prokaryotes and eukaryotes. DNA methylation, or the covalent addition of a methyl group to cytosine within the context of the CpG dinucleotide, has profound effects on the mammalian genome. These effects include transcriptional repression via inhibition of transcription factor binding or the recruitment of methyl-binding proteins and their associated chromatin remodeling factors, X chromosome inactivation, imprinting and the suppression of parasitic DNA sequences. 
DNA methylation is also essential for proper embryonic development and is an important player in both DNA repair and genome stability." Q#17580 - CGI_10020948 superfamily 146263 1 98 4.10E-16 73.8779 cl04138 SK_channel superfamily - - Calcium-activated SK potassium channel; Calcium-activated SK potassium channel. Q#17580 - CGI_10020948 superfamily 219619 174 250 2.24E-08 50.6692 cl18518 Ion_trans_2 superfamily - - Ion channel; This family includes the two membrane helix type ion channels found in bacteria. Q#17580 - CGI_10020948 superfamily 198825 271 341 0.00545997 34.7517 cl03763 CaMBD superfamily - - "Calmodulin binding domain; Small-conductance Ca2+-activated K+ channels (SK channels) are independent of voltage and gated solely by intracellular Ca2+. These membrane channels are heteromeric complexes that comprise pore-forming alpha-subunits and the Ca2+-binding protein calmodulin (CaM). CaM binds to the SK channel through the CaM-binding domain (CaMBD), which is located in an intracellular region of the alpha-subunit immediately carboxy-terminal to the pore. Channel opening is triggered when Ca2+ binds the EF hands in the N-lobe of CaM. The structure of this domain complexed with CaM is known. This domain forms an elongated dimer with a CaM molecule bound at each end; each CaM wraps around three alpha-helices, two from one CaMBD subunit and one from the other." Q#17583 - CGI_10020951 superfamily 222150 74 99 3.96E-06 43.9197 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#17583 - CGI_10020951 superfamily 222150 47 69 0.00973619 34.2897 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#17584 - CGI_10020952 superfamily 246925 182 417 2.81E-05 45.4242 cl15309 LRR_RI superfamily - - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. 
LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#17584 - CGI_10020952 superfamily 246925 543 630 0.000170295 43.113 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#17585 - CGI_10020953 superfamily 243058 216 334 7.26E-21 88.9107 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#17585 - CGI_10020953 superfamily 243058 139 248 3.75E-19 83.9031 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. 
An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#17585 - CGI_10020953 superfamily 243058 300 416 2.52E-15 73.1175 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#17586 - CGI_10020954 superfamily 247769 623 799 1.98E-12 65.8237 cl17215 HDc superfamily - - Metal dependent phosphohydrolases with conserved 'HD' motif Q#17586 - CGI_10020954 superfamily 248010 187 345 4.99E-18 82.4292 cl17456 GAF superfamily - - "GAF domain; This domain is present in cGMP-specific phosphodiesterases, adenylyl and guanylyl cyclases, phytochromes, FhlA and NifA. Adenylyl and guanylyl cyclases catalyze ATP and GTP to the second messengers cAMP and cGMP, respectively, these products up-regulating catalytic activity by binding to the regulatory GAF domain(s). 
The opposite hydrolysis reaction is catalyzed by phosphodiesterase. cGMP-dependent 3',5'-cyclic phosphodiesterase catalyzes the conversion of guanosine 3',5'-cyclic phosphate to guanosine 5'-phosphate. Here too, cGMP regulates catalytic activity by GAF-domain binding. Phytochromes are regulatory photoreceptors in plants and bacteria which exist in two thermally-stable states that are reversibly inter-convertible by light: the Pr state absorbs maximally in the red region of the spectrum, while the Pfr state absorbs maximally in the far-red region. This domain is also found in FhlA (formate hydrogen lyase transcriptional activator) and NifA, a transcriptional activator which is required for activation of most Nif operons which are directly involved in nitrogen fixation. NifA interacts with sigma-54." Q#17586 - CGI_10020954 superfamily 248010 371 527 1.86E-17 80.8884 cl17456 GAF superfamily - - "GAF domain; This domain is present in cGMP-specific phosphodiesterases, adenylyl and guanylyl cyclases, phytochromes, FhlA and NifA. Adenylyl and guanylyl cyclases catalyze ATP and GTP to the second messengers cAMP and cGMP, respectively, these products up-regulating catalytic activity by binding to the regulatory GAF domain(s). The opposite hydrolysis reaction is catalyzed by phosphodiesterase. cGMP-dependent 3',5'-cyclic phosphodiesterase catalyzes the conversion of guanosine 3',5'-cyclic phosphate to guanosine 5'-phosphate. Here too, cGMP regulates catalytic activity by GAF-domain binding. Phytochromes are regulatory photoreceptors in plants and bacteria which exist in two thermally-stable states that are reversibly inter-convertible by light: the Pr state absorbs maximally in the red region of the spectrum, while the Pfr state absorbs maximally in the far-red region. 
This domain is also found in FhlA (formate hydrogen lyase transcriptional activator) and NifA, a transcriptional activator which is required for activation of most Nif operons which are directly involved in nitrogen fixation. NifA interacts with sigma-54." Q#17587 - CGI_10020955 superfamily 248312 5 160 1.59E-05 41.5788 cl17758 PMP22_Claudin superfamily - - PMP-22/EMP/MP20/Claudin family; PMP-22/EMP/MP20/Claudin family. Q#17588 - CGI_10020956 superfamily 246723 45 724 1.10E-126 392.822 cl14813 GluZincin superfamily - - "Peptidase Gluzincin family (thermolysin-like proteinases, TLPs) includes peptidases M1, M2, M3, M4, M13, M32 and M36 (fungalysins); Gluzincin family (thermolysin-like peptidases or TLPs) includes several zinc-dependent metallopeptidases such as the M1, M2, M3, M4, M13, M32, M36 peptidases (MEROPS classification), and contain HEXXH and EXXXD motifs as part of their active site. All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. M1 family includes aminopeptidase N (APN) and leukotriene A4 hydrolase (LTA4H). APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types. LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity such that the two activities occupy different, but overlapping sites. The peptidase M3 or neurolysin-like family, includes M3, M2 and M32 metallopeptidases. The M3 peptidases have two subfamilies: M3A, includes thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (3.4.24.16), and the mitochondrial intermediate peptidase; M3B contains oligopeptidase F. M2 peptidase angiotensin converting enzyme (ACE, EC 3.4.15.1) catalyzes the conversion of decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II. 
ACE is a key part of the renin-angiotensin system that regulates blood pressure, thus ACE inhibitors are important for the treatment of hypertension. M32 family includes two eukaryotic enzymes from protozoa Trypanosoma cruzi, a causative agent of Chagas' disease, and Leishmania major, a parasite that causes leishmaniasis, making them attractive targets for drug development. The M4 family includes secreted protease thermolysin (EC 3.4.24.27), pseudolysin, aureolysin, neutral protease as well as fungalysin and bacillolysin (EC 3.4.24.28) that degrade extracellular proteins and peptides for bacterial nutrition, especially prior to sporulation. Thermolysin is widely used as a nonspecific protease to obtain fragments for peptide sequencing as well as in production of the artificial sweetener aspartame. M13 family includes neprilysin (EC 3.4.24.11) and endothelin-converting enzyme I (ECE-1, EC 3.4.24.71), which fulfill a broad range of physiological roles due to the greater variation in the S2' subsite allowing substrate specificity and are prime therapeutic targets for selective inhibition. Peptidase M36 (fungalysin) family includes endopeptidases from pathogenic fungi. Fungalysin hydrolyzes extracellular matrix proteins such as elastin and keratin. Aspergillus fumigatus causes the pulmonary disease aspergillosis by invading the lungs of immuno-compromised animals and secreting fungalysin that possibly breaks down proteinaceous structural barriers." Q#17589 - CGI_10020957 superfamily 246723 120 202 1.33E-20 89.2846 cl14813 GluZincin superfamily N - "Peptidase Gluzincin family (thermolysin-like proteinases, TLPs) includes peptidases M1, M2, M3, M4, M13, M32 and M36 (fungalysins); Gluzincin family (thermolysin-like peptidases or TLPs) includes several zinc-dependent metallopeptidases such as the M1, M2, M3, M4, M13, M32, M36 peptidases (MEROPS classification), and contain HEXXH and EXXXD motifs as part of their active site. 
All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. M1 family includes aminopeptidase N (APN) and leukotriene A4 hydrolase (LTA4H). APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types. LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity such that the two activities occupy different, but overlapping sites. The peptidase M3 or neurolysin-like family, includes M3, M2 and M32 metallopeptidases. The M3 peptidases have two subfamilies: M3A, includes thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (3.4.24.16), and the mitochondrial intermediate peptidase; M3B contains oligopeptidase F. M2 peptidase angiotensin converting enzyme (ACE, EC 3.4.15.1) catalyzes the conversion of decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II. ACE is a key part of the renin-angiotensin system that regulates blood pressure, thus ACE inhibitors are important for the treatment of hypertension. M32 family includes two eukaryotic enzymes from protozoa Trypanosoma cruzi, a causative agent of Chagas' disease, and Leishmania major, a parasite that causes leishmaniasis, making them attractive targets for drug development. The M4 family includes secreted protease thermolysin (EC 3.4.24.27), pseudolysin, aureolysin, neutral protease as well as fungalysin and bacillolysin (EC 3.4.24.28) that degrade extracellular proteins and peptides for bacterial nutrition, especially prior to sporulation. Thermolysin is widely used as a nonspecific protease to obtain fragments for peptide sequencing as well as in production of the artificial sweetener aspartame. 
M13 family includes neprilysin (EC 3.4.24.11) and endothelin-converting enzyme I (ECE-1, EC 3.4.24.71), which fulfill a broad range of physiological roles due to the greater variation in the S2' subsite allowing substrate specificity and are prime therapeutic targets for selective inhibition. Peptidase M36 (fungalysin) family includes endopeptidases from pathogenic fungi. Fungalysin hydrolyzes extracellular matrix proteins such as elastin and keratin. Aspergillus fumigatus causes the pulmonary disease aspergillosis by invading the lungs of immuno-compromised animals and secreting fungalysin that possibly breaks down proteinaceous structural barriers." Q#17590 - CGI_10020958 superfamily 246723 6 98 8.02E-20 83.1214 cl14813 GluZincin superfamily N - "Peptidase Gluzincin family (thermolysin-like proteinases, TLPs) includes peptidases M1, M2, M3, M4, M13, M32 and M36 (fungalysins); Gluzincin family (thermolysin-like peptidases or TLPs) includes several zinc-dependent metallopeptidases such as the M1, M2, M3, M4, M13, M32, M36 peptidases (MEROPS classification), and contain HEXXH and EXXXD motifs as part of their active site. All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. M1 family includes aminopeptidase N (APN) and leukotriene A4 hydrolase (LTA4H). APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types. LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity such that the two activities occupy different, but overlapping sites. The peptidase M3 or neurolysin-like family, includes M3, M2 and M32 metallopeptidases. 
The M3 peptidases have two subfamilies: M3A, includes thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (3.4.24.16), and the mitochondrial intermediate peptidase; M3B contains oligopeptidase F. M2 peptidase angiotensin converting enzyme (ACE, EC 3.4.15.1) catalyzes the conversion of decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II. ACE is a key part of the renin-angiotensin system that regulates blood pressure, thus ACE inhibitors are important for the treatment of hypertension. M32 family includes two eukaryotic enzymes from protozoa Trypanosoma cruzi, a causative agent of Chagas' disease, and Leishmania major, a parasite that causes leishmaniasis, making them attractive targets for drug development. The M4 family includes secreted protease thermolysin (EC 3.4.24.27), pseudolysin, aureolysin, neutral protease as well as fungalysin and bacillolysin (EC 3.4.24.28) that degrade extracellular proteins and peptides for bacterial nutrition, especially prior to sporulation. Thermolysin is widely used as a nonspecific protease to obtain fragments for peptide sequencing as well as in production of the artificial sweetener aspartame. M13 family includes neprilysin (EC 3.4.24.11) and endothelin-converting enzyme I (ECE-1, EC 3.4.24.71), which fulfill a broad range of physiological roles due to the greater variation in the S2' subsite allowing substrate specificity and are prime therapeutic targets for selective inhibition. Peptidase M36 (fungalysin) family includes endopeptidases from pathogenic fungi. Fungalysin hydrolyzes extracellular matrix proteins such as elastin and keratin. Aspergillus fumigatus causes the pulmonary disease aspergillosis by invading the lungs of immuno-compromised animals and secreting fungalysin that possibly breaks down proteinaceous structural barriers." 
Q#17591 - CGI_10020959 superfamily 217293 24 229 1.99E-25 101.941 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#17591 - CGI_10020959 superfamily 202474 251 287 5.93E-06 45.7225 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#17592 - CGI_10020960 superfamily 246723 71 684 1.66E-108 343.902 cl14813 GluZincin superfamily - - "Peptidase Gluzincin family (thermolysin-like proteinases, TLPs) includes peptidases M1, M2, M3, M4, M13, M32 and M36 (fungalysins); Gluzincin family (thermolysin-like peptidases or TLPs) includes several zinc-dependent metallopeptidases such as the M1, M2, M3, M4, M13, M32, M36 peptidases (MEROPS classification), and contain HEXXH and EXXXD motifs as part of their active site. All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. M1 family includes aminopeptidase N (APN) and leukotriene A4 hydrolase (LTA4H). APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types. LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity such that the two activities occupy different, but overlapping sites. The peptidase M3 or neurolysin-like family, includes M3, M2 and M32 metallopeptidases. The M3 peptidases have two subfamilies: M3A, includes thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (3.4.24.16), and the mitochondrial intermediate peptidase; M3B contains oligopeptidase F. 
M2 peptidase angiotensin converting enzyme (ACE, EC 3.4.15.1) catalyzes the conversion of decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II. ACE is a key part of the renin-angiotensin system that regulates blood pressure, thus ACE inhibitors are important for the treatment of hypertension. M32 family includes two eukaryotic enzymes from protozoa Trypanosoma cruzi, a causative agent of Chagas' disease, and Leishmania major, a parasite that causes leishmaniasis, making them attractive targets for drug development. The M4 family includes secreted protease thermolysin (EC 3.4.24.27), pseudolysin, aureolysin, neutral protease as well as fungalysin and bacillolysin (EC 3.4.24.28) that degrade extracellular proteins and peptides for bacterial nutrition, especially prior to sporulation. Thermolysin is widely used as a nonspecific protease to obtain fragments for peptide sequencing as well as in production of the artificial sweetener aspartame. M13 family includes neprilysin (EC 3.4.24.11) and endothelin-converting enzyme I (ECE-1, EC 3.4.24.71), which fulfill a broad range of physiological roles due to the greater variation in the S2' subsite allowing substrate specificity and are prime therapeutic targets for selective inhibition. Peptidase M36 (fungalysin) family includes endopeptidases from pathogenic fungi. Fungalysin hydrolyzes extracellular matrix proteins such as elastin and keratin. Aspergillus fumigatus causes the pulmonary disease aspergillosis by invading the lungs of immuno-compromised animals and secreting fungalysin that possibly breaks down proteinaceous structural barriers." Q#17593 - CGI_10020961 superfamily 241794 9 149 4.65E-77 228.07 cl00334 Ribosomal_S9 superfamily - - Ribosomal protein S9/S16; This family includes small ribosomal subunit S9 from prokaryotes and S16 from eukaryotes. 
Q#17594 - CGI_10020962 superfamily 241573 32 309 6.09E-43 157.108 cl00051 CysPc superfamily - - "Calpains, domains IIa, IIb; calcium-dependent cytoplasmic cysteine proteinases, papain-like. Functions in cytoskeletal remodeling processes, cell differentiation, apoptosis and signal transduction." Q#17594 - CGI_10020962 superfamily 241653 513 661 7.58E-11 60.412 cl00165 Calpain_III superfamily - - "Calpain, subdomain III. Calpains are calcium-activated cytoplasmic cysteine proteinases, participate in cytoskeletal remodeling processes, cell differentiation, apoptosis and signal transduction. Catalytic domain and the two calmodulin-like domains are separated by C2-like domain III. Domain III plays an important role in calcium-induced activation of calpain involving electrostatic interactions with subdomain II. Proposed to mediate calpain's interaction with phospholipids and translocation to cytoplasmic/nuclear membranes. CD includes subdomain III of typical and atypical calpains." Q#17595 - CGI_10020963 superfamily 247684 23 442 2.40E-96 302.659 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and 
similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#17596 - CGI_10020964 superfamily 241607 224 257 7.67E-07 46.8794 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as trypsin, chymotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as BM-40 or osteonectin, the Gallus gallus Flik protein, as well as agrin, which has a long array of FS domains. The Kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters), which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." 
Q#17597 - CGI_10020965 superfamily 245864 34 460 2.59E-75 245.651 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#17601 - CGI_10020969 superfamily 242220 25 224 2.19E-59 187.416 cl00957 Translin superfamily - - "Translin family; Members of this family include Translin that interacts with DNA and forms a ring around the DNA. This family also includes human translin-associated protein X, which was found to interact with translin with yeast two-hybrid screen." 
Q#17602 - CGI_10020970 superfamily 247792 138 182 1.13E-09 53.6036 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-(N/C/H)-X2-C-X(4-48)-C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#17602 - CGI_10020970 superfamily 210118 198 217 0.00134915 36.16 cl15479 IQ superfamily - - IQ calmodulin-binding motif; Calmodulin-binding motif. Q#17603 - CGI_10020971 superfamily 247755 1076 1299 2.97E-108 342.1 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#17603 - CGI_10020971 superfamily 248376 726 1049 1.00E-55 197.246 cl17822 MutS_III superfamily - - "MutS domain III; This domain is found in proteins of the MutS family (DNA mismatch repair proteins) and is found associated with pfam00488, pfam05188, pfam01624 and pfam05190. 
The MutS family of proteins is named after the Salmonella typhimurium MutS protein involved in mismatch repair; other members of the family included the eukaryotic MSH 1,2,3, 4,5 and 6 proteins. These have various roles in DNA repair and recombination. Human MSH has been implicated in non-polyposis colorectal carcinoma (HNPCC) and is a mismatch binding protein. The aligned region corresponds with domain III, which is central to the structure of Thermus aquaticus MutS as characterized in." Q#17603 - CGI_10020971 superfamily 216613 400 517 7.53E-39 142.328 cl03286 MutS_I superfamily - - "MutS domain I; This domain is found in proteins of the MutS family (DNA mismatch repair proteins) and is found associated with pfam00488, pfam05188, pfam05192 and pfam05190. The MutS family of proteins is named after the Salmonella typhimurium MutS protein involved in mismatch repair; other members of the family included the eukaryotic MSH 1,2,3, 4,5 and 6 proteins. These have various roles in DNA repair and recombination. Human MSH has been implicated in non-polyposis colorectal carcinoma (HNPCC) and is a mismatch binding protein. The aligned region corresponds with globular domain I, which is involved in DNA binding, in Thermus aquaticus MutS as characterized in." Q#17603 - CGI_10020971 superfamily 243083 65 169 3.72E-31 120.184 cl02554 PWWP superfamily - - "The PWWP domain, named for a conserved Pro-Trp-Trp-Pro motif, is a small domain consisting of 100-150 amino acids. The PWWP domain is found in numerous proteins that are involved in cell division, growth and differentiation. Most PWWP-domain proteins seem to be nuclear, often DNA-binding, proteins that function as transcription factors regulating a variety of developmental processes. 
The function of the PWWP domain is still not known precisely; however, based on the fact that other regions of PWWP-domain proteins are responsible for nuclear localization and DNA-binding, it is likely that the PWWP domain acts as a site for protein-protein binding interactions, influencing chromatin remodeling and thereby regulating transcriptional processes. Some PWWP-domain proteins have been linked to cancer or other diseases; some are known to function as growth factors." Q#17603 - CGI_10020971 superfamily 218486 531 687 2.14E-11 63.1489 cl04975 MutS_II superfamily - - "MutS domain II; This domain is found in proteins of the MutS family (DNA mismatch repair proteins) and is found associated with pfam00488, pfam01624, pfam05192 and pfam05190. The MutS family of proteins is named after the Salmonella typhimurium MutS protein involved in mismatch repair; other members of the family included the eukaryotic MSH 1,2,3, 4,5 and 6 proteins. These have various roles in DNA repair and recombination. Human MSH has been implicated in non-polyposis colorectal carcinoma (HNPCC) and is a mismatch binding protein. This domain corresponds to domain II in Thermus aquaticus MutS as characterized in, and resembles RNAse-H-like domains (see pfam00075)." Q#17605 - CGI_10020973 superfamily 243222 44 457 0 702.464 cl02872 DHQ_Fe-ADH superfamily - - "Dehydroquinate synthase-like (DHQ-like) and iron-containing alcohol dehydrogenases (Fe-ADH); Dehydroquinate synthase-like. This superfamily divides into two subgroups: the dehydroquinate synthase-like, and a large metal-containing alcohol dehydrogenases (ADH), known as iron-containing alcohol dehydrogenases. Dehydroquinate synthase (DHQS) catalyzes the conversion of 3-deoxy-D-arabino-heptulosonate-7-phosphate (DAHP) to dehydroquinate (DHQ) in the second step of the shikimate pathway. 
This pathway involves seven sequential enzymatic steps in the conversion of erythrose 4-phosphate and phosphoenolpyruvate into chorismate for subsequent synthesis of aromatic compounds. Dehydroquinate synthase-like group includes dehydroquinate synthase, 2-deoxy-scyllo-inosose synthase, and 2-epi-5-epi-valiolone synthase. The alcohol dehydrogenases in this superfamily contain a dehydroquinate synthase-like protein structural fold and mostly contain iron. They are distinct from other alcohol dehydrogenases which contains different protein domains. There are several distinct families of alcohol dehydrogenases: Zinc-containing long-chain alcohol dehydrogenases; insect-type, or short-chain alcohol dehydrogenases; iron-containing alcohol dehydrogenases, and others. The iron-containing family has a Rossmann fold-like topology that resembles the fold of the zinc-dependent alcohol dehydrogenases, but lacks sequence homology, and differs in strand arrangement. ADH catalyzes the reversible oxidation of alcohol to acetaldehyde with the simultaneous reduction of NAD(P)+ to NAD(P)H." Q#17606 - CGI_10020974 superfamily 218721 293 562 2.30E-39 149.188 cl05344 TROVE superfamily N - "TROVE domain; This presumed domain is found in TEP1 and Ro60 proteins, that are RNA-binding components of Telomerase, Ro and Vault RNPs. This domain has been named TROVE, (after Telomerase, Ro and Vault). This domain is probably RNA-binding." Q#17607 - CGI_10020975 superfamily 241682 86 245 3.79E-84 250.522 cl00203 Ribosomal_L30_like superfamily - - "Ribosomal protein L30, which is found in eukaryotes and prokaryotes but not in archaea, is one of the smallest ribosomal proteins with a molecular mass of about 7kDa. L30 binds the 23SrRNA as well as the 5S rRNA and is one of five ribosomal proteins that mediate the interactions 5S rRNA makes with the ribosome. The eukaryotic L30 members have N- and/or C-terminal extensions not found in their prokaryotic orthologs. 
L30 is closely related to the ribosomal L7 protein found in eukaryotes and archaea." Q#17607 - CGI_10020975 superfamily 203848 14 84 4.56E-28 103.078 cl06906 Ribosomal_L30_N superfamily - - Ribosomal L30 N-terminal domain; This presumed domain is found at the N-terminus of Ribosomal L30 proteins and has been termed RL30NT or NUC018. Q#17608 - CGI_10020976 superfamily 243310 1 182 2.10E-30 114.643 cl03120 ELO superfamily - - "GNS1/SUR4 family; Members of this family are involved in long chain fatty acid elongation systems that produce the 26-carbon precursors for ceramide and sphingolipid synthesis. Predicted to be integral membrane proteins, in eukaryotes they are probably located on the endoplasmic reticulum. Yeast ELO3 affects plasma membrane H+-ATPase activity, and may act on a glucose-signaling pathway that controls the expression of several genes that are transcriptionally regulated by glucose such as PMA1." Q#17609 - CGI_10020977 superfamily 243310 27 265 1.23E-78 241.374 cl03120 ELO superfamily - - "GNS1/SUR4 family; Members of this family are involved in long chain fatty acid elongation systems that produce the 26-carbon precursors for ceramide and sphingolipid synthesis. Predicted to be integral membrane proteins, in eukaryotes they are probably located on the endoplasmic reticulum. Yeast ELO3 affects plasma membrane H+-ATPase activity, and may act on a glucose-signaling pathway that controls the expression of several genes that are transcriptionally regulated by glucose such as PMA1." Q#17610 - CGI_10020978 superfamily 245206 1 244 1.32E-74 233.291 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. 
NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#17611 - CGI_10020979 superfamily 248012 59 118 9.72E-05 37.9425 cl17458 TIR_2 superfamily N - TIR domain; This is a family of bacterial Toll-like receptors. Q#17614 - CGI_10020982 superfamily 247858 9 149 6.58E-13 62.4054 cl17304 2OG-FeII_Oxy_3 superfamily - - 2OG-Fe(II) oxygenase superfamily; This family contains members of the 2-oxoglutarate (2OG) and Fe(II)-dependent oxygenase superfamily. Q#17616 - CGI_10020984 superfamily 241622 91 172 1.01E-26 98.4078 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. 
The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#17616 - CGI_10020984 superfamily 243136 13 64 1.79E-09 50.7056 cl02672 L27 superfamily - - L27 domain; The L27 domain is found in receptor targeting proteins Lin-2 and Lin-7. Q#17617 - CGI_10020985 superfamily 243066 7 95 1.13E-22 93.7716 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#17617 - CGI_10020985 superfamily 222150 659 683 0.000198273 39.6825 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. 
Q#17618 - CGI_10020986 superfamily 220097 30 113 2.11E-07 47.0133 cl08518 Phospholip_A2_3 superfamily N - "Prokaryotic phospholipase A2; The prokaryotic phospholipase A2 domain is predominantly found in bacterial and fungal phospholipases, as well as various hypothetical and putative proteins. It enables the liberation of fatty acids and lysophospholipid by hydrolysing the 2-ester bond of 1,2-diacyl-3-sn-phosphoglycerides. The domain adopts an alpha-helical secondary structure, consisting of five alpha-helices and two helical segments." Q#17620 - CGI_10020988 superfamily 248312 4 152 2.61E-05 40.8084 cl17758 PMP22_Claudin superfamily - - PMP-22/EMP/MP20/Claudin family; PMP-22/EMP/MP20/Claudin family. Q#17621 - CGI_10003366 superfamily 222313 63 93 0.000263286 38.7122 cl18662 Methyltransf_32 superfamily C - Methyltransferase domain; This family appears to be a methyltransferase domain. Q#17622 - CGI_10003367 superfamily 245304 41 335 1.51E-148 443.154 cl10459 Peptidases_S8_S53 superfamily - - "Peptidase domain in the S8 and S53 families; Members of the peptidases S8 (subtilisin and kexin) and S53 (sedolisin) family include endopeptidases and exopeptidases. The S8 family has an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure and are not homologous to trypsin. Serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The S53 family contains a catalytic triad Glu/Asp/Ser with an additional acidic residue Asp in the oxyanion hole, similar to that of subtilisin. The serine residue here is the nucleophilic equivalent of the serine residue in the S8 family, while glutamic acid has the same role here as the histidine base. However, the aspartic acid residue that acts as an electrophile is quite different. In S53, it follows glutamic acid, while in S8 it precedes histidine. 
The stability of these enzymes may be enhanced by calcium; some members have been shown to bind up to 4 ions via binding sites with different affinity. There is a great diversity in the characteristics of their members: some contain disulfide bonds, some are intracellular while others are extracellular, some function at extreme temperatures, and others at high or low pH values." Q#17622 - CGI_10003367 superfamily 201820 425 511 3.51E-33 123.89 cl08326 P_proprotein superfamily - - Proprotein convertase P-domain; A unique feature of the eukaryotic subtilisin-like proprotein convertases is the presence of an additional highly conserved sequence of approximately 150 residues (P domain) located immediately downstream of the catalytic domain. Q#17622 - CGI_10003367 superfamily 248097 773 870 1.73E-10 59.5862 cl17543 C1q superfamily C - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#17623 - CGI_10003368 superfamily 246676 308 471 2.23E-29 113.594 cl14616 Cyt_b561 superfamily - - "Eukaryotic cytochrome b(561); Cytochrome b(561) is a family of endosomal or secretory vesicle-specific electron transport proteins. They are integral membrane proteins that bind two heme groups non-covalently, and may have six alpha-helical trans-membrane segments. This is an exclusively eukaryotic family. Members of the prokaryotic cytochrome b561 family are not deemed homologous." Q#17623 - CGI_10003368 superfamily 243146 57 107 2.40E-06 44.9727 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#17623 - CGI_10003368 superfamily 243146 168 219 0.00559937 34.9575 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. 
" Q#17624 - CGI_10000974 superfamily 245201 113 366 1.70E-14 72.184 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#17625 - CGI_10004413 superfamily 241733 5 83 1.30E-50 158.098 cl00259 Sm_like superfamily - - "Sm and related proteins; The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes." Q#17628 - CGI_10004416 superfamily 248458 32 110 1.91E-07 50.3901 cl17904 MFS superfamily NC - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. 
MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#17630 - CGI_10005445 superfamily 243100 56 104 3.37E-06 44.2179 cl02576 B_zip1 superfamily - - "basic leucine zipper DNA-binding and multimerization region of GCN4 and related proteins; Basic leucine zipper (bZIP) transcription factors act in networks of homo- and hetero-dimers in the regulation in a diverse set of cellular pathways. Classical leucine zippers have alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization creates a pair of basic regions that bind DNA and undergo conformational change. GCN4 was identified in Saccharomyces cerevisiae from mutations in a deficiency in activation with the general amino acid control pathway. GCN4 encodes a trans-activator of amino acid biosynthetic genes containing 2 acidic activation domains and a C-terminal bZIP domain, comprised of a basic alpha-helical DNA-binding region and a coiled-coil dimerization region." 
Q#17631 - CGI_10005446 superfamily 242902 384 514 4.98E-16 75.8208 cl02144 TLD superfamily - - TLD; This domain is predicted to be an enzyme and is often found associated with pfam01476. Q#17631 - CGI_10005446 superfamily 246027 184 222 0.0073208 36.888 cl12560 DUF2806 superfamily N - Protein of unknown function (DUF2806); This bacterial family of proteins has no known function. Q#17633 - CGI_10005448 superfamily 216554 58 215 3.55E-46 160.723 cl15977 zf-DHHC superfamily - - DHHC palmitoyltransferase; This family includes the well known DHHC zinc binding domain as well as three of the four conserved transmembrane regions found in this family of palmitoyltransferase enzymes. Q#17636 - CGI_10005451 superfamily 222150 335 358 0.00018683 38.9121 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#17636 - CGI_10005451 superfamily 246975 350 370 0.00410261 35.0153 cl15478 zf-C2H2 superfamily - - "Zinc finger, C2H2 type; The C2H2 zinc finger is the classical zinc finger domain. The two conserved cysteines and histidines co-ordinate a zinc ion. The following pattern describes the zinc finger. #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C] Where X can be any amino acid, and numbers in brackets indicate the number of residues. The positions marked # are those that are important for the stable fold of the zinc finger. The final position can be either his or cys. The C2H2 zinc finger is composed of two short beta strands followed by an alpha helix. The amino terminal part of the helix binds the major groove in DNA binding zinc fingers. The accepted consensus binding sequence for Sp1 is usually defined by the asymmetric hexanucleotide core GGGCGG but this sequence does not include, among others, the GAG (=CTC) repeat that constitutes a high-affinity site for Sp1 binding to the wt1 promoter." 
Q#17636 - CGI_10005451 superfamily 222150 362 386 0.00587027 34.6749 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#17636 - CGI_10005451 superfamily 246975 321 342 0.00785722 34.2449 cl15478 zf-C2H2 superfamily - - "Zinc finger, C2H2 type; The C2H2 zinc finger is the classical zinc finger domain. The two conserved cysteines and histidines co-ordinate a zinc ion. The following pattern describes the zinc finger. #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C] Where X can be any amino acid, and numbers in brackets indicate the number of residues. The positions marked # are those that are important for the stable fold of the zinc finger. The final position can be either his or cys. The C2H2 zinc finger is composed of two short beta strands followed by an alpha helix. The amino terminal part of the helix binds the major groove in DNA binding zinc fingers. The accepted consensus binding sequence for Sp1 is usually defined by the asymmetric hexanucleotide core GGGCGG but this sequence does not include, among others, the GAG (=CTC) repeat that constitutes a high-affinity site for Sp1 binding to the wt1 promoter." Q#17637 - CGI_10003661 superfamily 245226 307 476 7.26E-28 109.312 cl10012 DnaQ_like_exo superfamily - - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). 
The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#17638 - CGI_10003662 superfamily 241563 97 138 2.49E-07 47.474 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#17638 - CGI_10003662 superfamily 241563 48 89 7.26E-05 40.1552 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#17640 - CGI_10001374 superfamily 247724 6 145 1.38E-55 174.638 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. 
This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#17641 - CGI_10018748 superfamily 241567 51 298 4.92E-94 281.411 cl00042 CASc superfamily - - "Caspase, interleukin-1 beta converting enzyme (ICE) homologues; Cysteine-dependent aspartate-directed proteases that mediate programmed cell death (apoptosis). Caspases are synthesized as inactive zymogens and activated by proteolysis of the peptide backbone adjacent to an aspartate. The resulting two subunits associate to form an (alpha)2(beta)2-tetramer which is the active enzyme. Activation of caspases can be mediated by other caspase homologs." Q#17645 - CGI_10018752 superfamily 219953 58 164 8.75E-25 94.5579 cl07317 DUF1777 superfamily N - Protein of unknown function (DUF1777); This is a family of eukaryotic proteins of unknown function. Some of the proteins in this family are putative nucleic acid binding proteins. Q#17646 - CGI_10018753 superfamily 243061 143 243 1.98E-38 137.088 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. 
Q#17646 - CGI_10018753 superfamily 243061 362 463 6.61E-38 135.933 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#17646 - CGI_10018753 superfamily 243061 470 570 1.04E-37 135.162 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#17646 - CGI_10018753 superfamily 243061 35 136 4.23E-36 130.925 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#17646 - CGI_10018753 superfamily 243061 250 350 9.32E-29 110.509 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#17646 - CGI_10018753 superfamily 243061 579 624 1.29E-17 79.3082 cl02509 SRCR superfamily C - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#17647 - CGI_10018754 superfamily 241578 345 502 8.18E-39 142.045 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. 
The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#17647 - CGI_10018754 superfamily 241613 707 739 0.00187268 37.1862 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#17647 - CGI_10018754 superfamily 243061 124 225 7.10E-40 142.866 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. 
Q#17647 - CGI_10018754 superfamily 243061 16 117 5.98E-37 134.777 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#17647 - CGI_10018754 superfamily 243061 230 330 2.02E-36 133.236 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#17647 - CGI_10018754 superfamily 246918 537 588 1.60E-14 69.5379 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#17647 - CGI_10018754 superfamily 246918 651 702 2.30E-14 69.1527 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#17647 - CGI_10018754 superfamily 246918 594 627 3.48E-09 54.1299 cl15278 TSP_1 superfamily C - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#17648 - CGI_10018755 superfamily 241578 555 716 7.33E-41 148.979 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. 
Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#17648 - CGI_10018755 superfamily 241578 350 508 1.21E-25 105.451 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#17648 - CGI_10018755 superfamily 241578 754 914 2.48E-21 92.7398 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#17648 - CGI_10018755 superfamily 245213 965 1004 2.95E-06 45.7054 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#17648 - CGI_10018755 superfamily 243061 132 232 2.11E-41 148.259 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#17648 - CGI_10018755 superfamily 243061 24 125 1.40E-40 145.948 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. 
Q#17648 - CGI_10018755 superfamily 243061 237 339 1.20E-34 129.384 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#17649 - CGI_10018756 superfamily 206077 535 585 1.66E-18 80.3595 cl18287 AA_permease_C superfamily - - C-terminus of AA_permease; This is the C-terminus of AA-permease enzymes that is not captured by the models pfam00324 and pfam13520. Q#17650 - CGI_10018757 superfamily 247792 56 95 2.10E-05 42.4328 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#17650 - CGI_10018757 superfamily 128778 229 301 0.0078234 35.7035 cl17972 BBC superfamily C - B-Box C-terminal domain; Coiled coil region C-terminal to (some) B-Box domains Q#17651 - CGI_10018758 superfamily 217293 31 240 2.89E-50 172.047 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#17651 - CGI_10018758 superfamily 202474 247 463 1.36E-10 59.9749 cl08379 Neur_chan_memb superfamily - - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. 
Q#17652 - CGI_10018759 superfamily 217293 532 731 9.22E-58 197.47 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#17652 - CGI_10018759 superfamily 217293 42 250 8.57E-47 167.039 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#17652 - CGI_10018759 superfamily 202474 257 475 1.13E-11 64.2121 cl08379 Neur_chan_memb superfamily - - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#17653 - CGI_10018760 superfamily 217293 35 242 2.23E-61 201.322 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#17653 - CGI_10018760 superfamily 202474 249 462 1.06E-10 59.9749 cl08379 Neur_chan_memb superfamily - - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#17654 - CGI_10018761 superfamily 217293 37 242 5.59E-53 179.366 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#17654 - CGI_10018761 superfamily 202474 249 462 4.34E-12 64.2121 cl08379 Neur_chan_memb superfamily - - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. 
Q#17658 - CGI_10018765 superfamily 241754 381 1135 0 825.26 cl00286 Motor_domain superfamily - - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. Q#17659 - CGI_10018766 superfamily 241622 189 279 3.08E-17 74.9106 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archebacteria, and eurkayotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#17660 - CGI_10018767 superfamily 247829 3 276 2.01E-106 313.259 cl17275 PRTase_typeII superfamily - - "Phosphoribosyltransferase (PRTase) type II; This family contains two enzymes that play an important role in NAD production by either allowing quinolinic acid (QA) , quinolinate phosphoribosyl transferase (QAPRTase), or nicotinic acid (NA), nicotinate phosphoribosyltransferase (NAPRTase), to be used in the synthesis of NAD. 
QAPRTase catalyses the reaction of quinolinic acid (QA) with 5-phosphoribosyl-1-pyrophosphate (PRPP) in the presence of Mg2+ to produce nicotinic acid mononucleotide (NAMN), pyrophosphate and carbon dioxide, an important step in the de novo synthesis of NAD. NAPRTase catalyses a similar reaction leading to NAMN and pyrophosphate, using nicotinic acid an PPRP as substrates, used in the NAD salvage pathway." Q#17661 - CGI_10018768 superfamily 222754 61 175 1.16E-25 103.504 cl18690 SM-ATX superfamily - - SM domain found in Ataxin-2; SM domain found in Ataxin-2. Q#17661 - CGI_10018768 superfamily 219155 208 275 1.22E-06 47.3241 cl10004 LsmAD superfamily - - LsmAD domain; This domain is found associated with Lsm domain. Q#17662 - CGI_10018769 superfamily 243056 111 310 1.84E-68 216.842 cl02495 RabGAP-TBC superfamily - - "Rab-GTPase-TBC domain; Identification of a TBC domain in GYP6_YEAST and GYP7_YEAST, which are GTPase activator proteins of yeast Ypt6 and Ypt7, implies that these domains are GTPase activator proteins of Rab-like small GTPases." Q#17664 - CGI_10018771 superfamily 246925 1005 1139 7.57E-14 72.7733 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." 
Q#17666 - CGI_10018773 superfamily 243066 51 155 1.14E-27 107.703 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#17666 - CGI_10018773 superfamily 243146 460 507 1.46E-07 48.8247 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#17666 - CGI_10018773 superfamily 243146 498 548 4.83E-07 47.271 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#17666 - CGI_10018773 superfamily 198867 163 256 5.29E-07 48.1064 cl06652 BACK superfamily - - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation)." Q#17666 - CGI_10018773 superfamily 243146 409 460 1.33E-06 46.0123 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. 
" Q#17666 - CGI_10018773 superfamily 243146 348 395 4.83E-06 44.5746 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#17666 - CGI_10018773 superfamily 243146 310 344 0.00725934 35.1166 cl02701 Kelch_3 superfamily N - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#17668 - CGI_10018775 superfamily 190261 67 140 2.40E-32 119.961 cl03504 RFX_DNA_binding superfamily - - RFX DNA-binding domain; RFX is a regulatory factor which binds to the X box of MHC class II genes and is essential for their expression. The DNA-binding domain of RFX is the central domain of the protein and binds ssDNA as either a monomer or homodimer. Q#17669 - CGI_10018776 superfamily 244603 174 483 2.51E-63 215.651 cl07072 COG4 superfamily - - COG4 transport protein; This region is found in yeast oligomeric golgi complex component 4 which is involved in ER to Golgi an intra Golgi transport. Q#17671 - CGI_10018778 superfamily 245201 25 269 5.32E-70 232.512 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#17673 - CGI_10018780 superfamily 248097 11 125 1.49E-15 68.0606 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#17674 - CGI_10018781 superfamily 214531 13 48 1.63E-07 49.9077 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. 
Also present in a variety of molecules similar to gp300/megalin. Q#17674 - CGI_10018781 superfamily 111397 1030 1109 7.05E-07 48.8767 cl03620 HYR superfamily - - "HYR domain; This domain is known as the HYR (Hyalin Repeat) domain, after the protein hyalin that is composed exclusively of this repeat. This domain probably corresponds to a new superfamily in the immunoglobulin fold. The function of this domain is uncertain it may be involved in cell adhesion." Q#17674 - CGI_10018781 superfamily 214531 50 91 2.88E-05 43.3593 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#17674 - CGI_10018781 superfamily 248289 889 947 6.61E-05 42.4111 cl17735 VWC superfamily - - von Willebrand factor type C domain; The high cutoff was used to prevent overlap with pfam00094. Q#17674 - CGI_10018781 superfamily 215683 344 392 0.00212836 37.9199 cl18339 Ldl_recept_b superfamily - - Low-density lipoprotein receptor repeat class B; This domain is also known as the YWTD motif after the most conserved region of the repeat. The YWTD repeat is found in multiple tandem repeats and has been predicted to form a beta-propeller structure. Q#17674 - CGI_10018781 superfamily 219525 1325 1349 0.00343995 37.3986 cl06646 GCC2_GCC3 superfamily C - GCC2 and GCC3; GCC2 and GCC3. Q#17674 - CGI_10018781 superfamily 214531 290 321 0.00679151 36.4257 cl18310 LY superfamily N - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. 
Q#17674 - CGI_10018781 superfamily 214531 376 419 0.00864785 36.0405 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#17675 - CGI_10018782 superfamily 248338 117 283 9.51E-12 64.93 cl17784 Peptidase_C48 superfamily N - "Ulp1 protease family, C-terminal catalytic domain; This domain contains the catalytic triad Cys-His-Asn." Q#17676 - CGI_10018783 superfamily 243077 617 673 0.00672733 35.2881 cl02542 DnaJ superfamily - - "DnaJ domain or J-domain. DnaJ/Hsp40 (heat shock protein 40) proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperonine family. Hsp40 proteins are characterized by the presence of a J domain, which mediates the interaction with Hsp70. They may contain other domains as well, and the architectures provide a means of classification." Q#17678 - CGI_10018786 superfamily 241638 204 337 2.99E-14 68.1636 cl00147 TNF superfamily - - "Tumor Necrosis Factor; TNF superfamily members include the cytokines: TNF (TNF-alpha), LT (lymphotoxin-alpha, TNF-beta), CD40 ligand, Apo2L (TRAIL), Fas ligand, and osteoprotegerin (OPG) ligand. These proteins generally have an intracellular N-terminal domain, a short transmembrane segment, an extracellular stalk, and a globular TNF-like extracellular domain of about 150 residues. They initiate apoptosis by binding to related receptors, some of which have intracellular death domains. They generally form homo- or hetero- trimeric complexes.TNF cytokines bind one elongated receptor molecule along each of three clefts formed by neighboring monomers of the trimer with ligand trimerization a requiste for receptor binding." 
Q#17679 - CGI_10018787 superfamily 241638 192 318 6.97E-13 63.9264 cl00147 TNF superfamily - - "Tumor Necrosis Factor; TNF superfamily members include the cytokines: TNF (TNF-alpha), LT (lymphotoxin-alpha, TNF-beta), CD40 ligand, Apo2L (TRAIL), Fas ligand, and osteoprotegerin (OPG) ligand. These proteins generally have an intracellular N-terminal domain, a short transmembrane segment, an extracellular stalk, and a globular TNF-like extracellular domain of about 150 residues. They initiate apoptosis by binding to related receptors, some of which have intracellular death domains. They generally form homo- or hetero- trimeric complexes.TNF cytokines bind one elongated receptor molecule along each of three clefts formed by neighboring monomers of the trimer with ligand trimerization a requiste for receptor binding." Q#17680 - CGI_10018788 superfamily 241638 185 331 1.06E-27 105.505 cl00147 TNF superfamily - - "Tumor Necrosis Factor; TNF superfamily members include the cytokines: TNF (TNF-alpha), LT (lymphotoxin-alpha, TNF-beta), CD40 ligand, Apo2L (TRAIL), Fas ligand, and osteoprotegerin (OPG) ligand. These proteins generally have an intracellular N-terminal domain, a short transmembrane segment, an extracellular stalk, and a globular TNF-like extracellular domain of about 150 residues. They initiate apoptosis by binding to related receptors, some of which have intracellular death domains. They generally form homo- or hetero- trimeric complexes.TNF cytokines bind one elongated receptor molecule along each of three clefts formed by neighboring monomers of the trimer with ligand trimerization a requiste for receptor binding." Q#17682 - CGI_10018790 superfamily 245208 44 410 0 631.811 cl09933 ACAD superfamily - - "Acyl-CoA dehydrogenase; Both mitochondrial acyl-CoA dehydrogenases (ACAD) and peroxisomal acyl-CoA oxidases (AXO) catalyze the alpha,beta dehydrogenation of the corresponding trans-enoyl-CoA by FAD, which becomes reduced. 
The reduced form of ACAD is reoxidized in the oxidative half-reaction by electron-transferring flavoprotein (ETF), from which the electrons are transferred to the mitochondrial respiratory chain coupled with ATP synthesis. In contrast, AXO catalyzes a different oxidative half-reaction, in which the reduced FAD is reoxidized by molecular oxygen. The ACAD family includes the eukaryotic beta-oxidation enzymes, short (SCAD), medium (MCAD), long (LCAD) and very-long (VLCAD) chain acyl-CoA dehydrogenases. These enzymes all share high sequence similarity, but differ in their substrate specificities. The ACAD family also includes amino acid catabolism enzymes such as Isovaleryl-CoA dehydrogenase (IVD), short/branched chain acyl-CoA dehydrogenases(SBCAD), Isobutyryl-CoA dehydrogenase (IBDH), glutaryl-CoA deydrogenase (GCD) and Crotonobetainyl-CoA dehydrogenase. The mitochondrial ACAD's are generally homotetramers, except for VLCAD, which is a homodimer. Related enzymes include the SOS adaptive reponse proten aidB, Naphthocyclinone hydroxylase (NcnH), and and Dibenzothiophene (DBT) desulfurization enzyme C (DszC)" Q#17682 - CGI_10018790 superfamily 245208 482 537 0.00328079 38.419 cl09933 ACAD superfamily N - "Acyl-CoA dehydrogenase; Both mitochondrial acyl-CoA dehydrogenases (ACAD) and peroxisomal acyl-CoA oxidases (AXO) catalyze the alpha,beta dehydrogenation of the corresponding trans-enoyl-CoA by FAD, which becomes reduced. The reduced form of ACAD is reoxidized in the oxidative half-reaction by electron-transferring flavoprotein (ETF), from which the electrons are transferred to the mitochondrial respiratory chain coupled with ATP synthesis. In contrast, AXO catalyzes a different oxidative half-reaction, in which the reduced FAD is reoxidized by molecular oxygen. The ACAD family includes the eukaryotic beta-oxidation enzymes, short (SCAD), medium (MCAD), long (LCAD) and very-long (VLCAD) chain acyl-CoA dehydrogenases. 
These enzymes all share high sequence similarity, but differ in their substrate specificities. The ACAD family also includes amino acid catabolism enzymes such as Isovaleryl-CoA dehydrogenase (IVD), short/branched chain acyl-CoA dehydrogenases(SBCAD), Isobutyryl-CoA dehydrogenase (IBDH), glutaryl-CoA deydrogenase (GCD) and Crotonobetainyl-CoA dehydrogenase. The mitochondrial ACAD's are generally homotetramers, except for VLCAD, which is a homodimer. Related enzymes include the SOS adaptive reponse proten aidB, Naphthocyclinone hydroxylase (NcnH), and and Dibenzothiophene (DBT) desulfurization enzyme C (DszC)" Q#17683 - CGI_10018791 superfamily 220673 12 112 1.87E-21 83.8833 cl10960 Med11 superfamily - - "Mediator complex protein; Mediator is a large, modular protein complex that is conserved from yeast to human and conveys regulatory signals from DNA-binding transcription factors to RNA polymerase II. Not only are the polypeptides conserved but the structural organisation is also largely conserved. One or two subunits are either fungal or vertebral specific but Med11 is one of the subunits that is conserved from fungi to humans. Med11 appears to be necessary for the full and successful assembly of the core head sub-region." Q#17684 - CGI_10004467 superfamily 247941 147 319 5.53E-11 59.8337 cl17387 Methyltransf_21 superfamily - - "Methyltransferase FkbM domain; This family has members from bacteria to human, and appears to be a methyltransferase." Q#17685 - CGI_10004468 superfamily 241563 63 95 2.14E-05 42.2744 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#17686 - CGI_10004469 superfamily 216112 507 874 2.13E-37 144.747 cl02964 RNB superfamily - - RNB domain; This domain is the catalytic domain of ribonuclease II. 
Q#17686 - CGI_10004469 superfamily 219408 101 209 0.0061231 38.5626 cl06454 DUF1510 superfamily C - Protein of unknown function (DUF1510); This family consists of several hypothetical bacterial proteins of around 200 residues in length. The function of this family is unknown. Q#17687 - CGI_10004470 superfamily 242406 4 107 2.08E-12 59.5273 cl01271 DUF1768 superfamily N - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. Q#17692 - CGI_10006366 superfamily 243035 60 176 9.94E-31 109.632 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. 
Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#17693 - CGI_10006367 superfamily 243035 23 100 1.41E-21 91.2121 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. 
Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. 
Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#17694 - CGI_10006368 superfamily 243035 18 146 9.65E-30 107.005 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. 
Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#17697 - CGI_10006371 superfamily 243061 929 997 1.75E-18 82.775 cl02509 SRCR superfamily N - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#17699 - CGI_10006373 superfamily 242274 47 330 1.93E-114 337.007 cl01053 SGNH_hydrolase superfamily - - "SGNH_hydrolase, or GDSL_hydrolase, is a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the typical Ser-His-Asp(Glu) triad from other serine hydrolases, but may lack the carboxlic acid." Q#17700 - CGI_10006374 superfamily 243058 341 433 5.48E-05 41.5312 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. 
An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#17701 - CGI_10004962 superfamily 199011 465 539 5.66E-36 129.597 cl11064 BHD_3 superfamily - - Rad4 beta-hairpin domain 3; This short domain is found in the Rad4 protein. This domain binds to DNA. Q#17701 - CGI_10004962 superfamily 245491 403 458 1.26E-18 80.8428 cl11063 BHD_2 superfamily - - Rad4 beta-hairpin domain 2; This short domain is found in the Rad4 protein. This domain binds to DNA. Q#17701 - CGI_10004962 superfamily 245490 349 400 3.49E-18 79.5765 cl11062 BHD_1 superfamily - - Rad4 beta-hairpin domain 1; This short domain is found in the Rad4 protein. This domain binds to DNA. Q#17701 - CGI_10004962 superfamily 217753 266 343 5.90E-15 72.4162 cl10615 Rad4 superfamily N - Rad4 transglutaminase-like domain; Rad4 transglutaminase-like domain. Q#17701 - CGI_10004962 superfamily 247916 85 152 0.00311759 36.6098 cl17362 Transglut_core superfamily C - "Transglutaminase-like superfamily; This family includes animal transglutaminases and other bacterial proteins of unknown function. Sequence conservation in this superfamily primarily involves three motifs that centre around conserved cysteine, histidine, and aspartate residues that form the catalytic triad in the structurally characterized transglutaminase, the human blood clotting factor XIIIa'. 
On the basis of the experimentally demonstrated activity of the Methanobacterium phage pseudomurein endoisopeptidase, it is proposed that many, if not all, microbial homologues of the transglutaminases are proteases and that the eukaryotic transglutaminases have evolved from an ancestral protease." Q#17702 - CGI_10004963 superfamily 241896 163 325 2.40E-77 247.05 cl00483 UDG_like superfamily - - "Uracil-DNA glycosylases (UDG) and related enzymes; Uracil-DNA glycosylases (UDG) catalyzes the removal of uracil from DNA, which initiates the DNA base excision repair pathway. Uracil in DNA can arise as a result of mis-incorporation of dUMP residues by DNA polymerase or via deamination of cytosine. Uracil in DNA mispaired with guanine is one of the major pro-mutagenic events, causing G:C->A:T mutations. Thus, UDG is an essential enzyme for maintaining the integrity of genetic information. At least five UDG families have been characterized so far; these families share similar overall folds and common active site motifs. They demonstrate different substrate specificities, but often the function of one enzyme can be complemented by the other. Family 1 enzymes are active against uracil in both ssDNA and dsDNA, and recognize uracil explicitly in an extrahelical conformation via a combination of protein and bound-water interactions. Family 2 enzymes are mismatch specific and explicitly recognize the widowed guanine on the complementary strand, rather than the extrahelical scissile pyrimidine. This allows a broader specificity so that some Family 2 enzymes can excise uracil as well as 3, N(4)-ethenocytosine from mismatches with guanine. A Family 3 UDG from human was first characterized to remove Uracil from ssDNA, hence the name hSMUG (single-strand-selective monofunctional uracil-DNA glycosylase). However, subsequent research has shown that hSMUG1 and its rat ortholog can remove uracil and its oxidized pyrimidine derivatives from both, ssDNA and dsDNA. 
Enzymes in Families 4 and 5 are both thermostable. Family 4 enzymes specifically recognize uracil in a manner similar to human UDG (Family 1), rather than guanine in the complementary strand DNA, as does E. coli MUG (Family 2). These results suggest that the mechanism by which Family 4 UDGs remove uracils from DNA is similar to that of Family 1 enzymes. Although Family 5 enzymes are close relatives of Family 4, they show different substrate specificities." Q#17703 - CGI_10004964 superfamily 248338 21 210 3.31E-13 66.856 cl17784 Peptidase_C48 superfamily N - "Ulp1 protease family, C-terminal catalytic domain; This domain contains the catalytic triad Cys-His-Asn." Q#17704 - CGI_10004965 superfamily 222058 166 230 3.66E-06 44.1449 cl16248 DUF4098 superfamily - - Domain of unknown function (DUF4098); This domain is a C-terminal repeat found in many bacterial species. Q#17705 - CGI_10004966 superfamily 241578 316 473 5.97E-27 110.733 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." 
Q#17705 - CGI_10004966 superfamily 245213 1916 1953 9.19E-10 57.2614 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#17705 - CGI_10004966 superfamily 245213 1763 1800 2.05E-07 50.3278 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#17705 - CGI_10004966 superfamily 245213 1840 1876 6.60E-07 48.787 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#17705 - CGI_10004966 superfamily 245213 2228 2263 7.19E-07 48.787 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#17705 - CGI_10004966 superfamily 245213 1802 1837 4.54E-06 46.4758 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#17705 - CGI_10004966 superfamily 245213 2195 2225 8.42E-06 45.7054 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#17705 - CGI_10004966 superfamily 245213 1350 1384 2.86E-05 44.1646 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#17705 - CGI_10004966 superfamily 245213 2080 2113 5.00E-05 43.3942 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#17705 - CGI_10004966 superfamily 245213 1644 1679 6.26E-05 43.009 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#17705 - CGI_10004966 superfamily 245213 1269 1303 7.13E-05 43.009 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#17705 - CGI_10004966 superfamily 245213 1569 1605 8.36E-05 42.6238 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#17705 - CGI_10004966 superfamily 245213 1879 1913 0.000367565 40.6978 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#17705 - CGI_10004966 superfamily 245213 1387 1420 0.000446468 40.6978 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#17705 - CGI_10004966 superfamily 245213 1231 1266 0.0041255 37.6162 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#17705 - CGI_10004966 superfamily 245213 2117 2151 0.00555089 37.231 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#17705 - CGI_10004966 superfamily 149481 3 269 4.11E-41 155.261 cl07163 CLCA_N superfamily - - Calcium-activated chloride channel; The CLCA family of calcium-activated chloride channels has been identified in many epithelial and endothelial cell types as well as in smooth muscle cells and has four or five putative transmembrane regions. Additionally to their role as chloride channels some CLCA proteins function as adhesion molecules and may also have roles as tumour suppressors. The domain described here is found at the N-terminus of CLCAs. Q#17705 - CGI_10004966 superfamily 150094 503 677 2.37E-25 106.69 cl09605 DUF1973 superfamily - - Domain of unknown function (DUF1973); Members of this family of functionally uncharacterized domains are found in various eukaryotic calcium-dependent chloride channels. Q#17708 - CGI_10001635 superfamily 247724 127 287 0.000360241 39.362 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. 
Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#17708 - CGI_10001635 superfamily 242902 25 75 3.42E-10 56.8715 cl02144 TLD superfamily C - TLD; This domain is predicted to be an enzyme and is often found associated with pfam01476. Q#17710 - CGI_10001788 superfamily 248019 2 155 1.16E-20 91.485 cl17465 DAGK_cat superfamily NC - "Diacylglycerol kinase catalytic domain; Diacylglycerol (DAG) is a second messenger that acts as a protein kinase C activator. The catalytic domain is assumed from the finding of bacterial homologues. YegS is the Escherichia coli protein in this family whose crystal structure reveals an active site in the inter-domain cleft formed by four conserved sequence motifs, revealing a novel metal-binding site. The residues of this site are conserved across the family." Q#17714 - CGI_10001440 superfamily 247744 2 150 5.28E-27 104.56 cl17190 NK superfamily - - "Nucleoside/nucleotide kinase (NK) is a protein superfamily consisting of multiple families of enzymes that share structural similarity and are functionally related to the catalysis of the reversible phosphate group transfer from nucleoside triphosphates to nucleosides/nucleotides, nucleoside monophosphates, or sugars. Members of this family play a wide variety of essential roles in nucleotide metabolism, the biosynthesis of coenzymes and aromatic compounds, as well as the metabolism of sugar and sulfate." Q#17714 - CGI_10001440 superfamily 241802 169 373 3.87E-72 234.272 cl00342 Trp-synth-beta_II superfamily C - "Tryptophan synthase beta superfamily (fold type II); this family of pyridoxal phosphate (PLP)-dependent enzymes catalyzes beta-replacement and beta-elimination reactions. 
This CD corresponds to aminocyclopropane-1-carboxylate deaminase (ACCD), tryptophan synthase beta chain (Trp-synth_B), cystathionine beta-synthase (CBS), O-acetylserine sulfhydrylase (CS), serine dehydratase (Ser-dehyd), threonine dehydratase (Thr-dehyd), diaminopropionate ammonia lyase (DAL), and threonine synthase (Thr-synth). ACCD catalyzes the conversion of 1-aminocyclopropane-1-carboxylate to alpha-ketobutyrate and ammonia. Tryptophan synthase folds into a tetramer, where the beta chain is the catalytic PLP-binding subunit and catalyzes the formation of L-tryptophan from indole and L-serine. CBS is a tetrameric hemeprotein that catalyzes condensation of serine and homocysteine to cystathionine. CS is a homodimer that catalyzes the formation of L-cysteine from O-acetyl-L-serine. Ser-dehyd catalyzes the conversion of L- or D-serine to pyruvate and ammonia. Thr-dehyd is active as a homodimer and catalyzes the conversion of L-threonine to 2-oxobutanoate and ammonia. DAL is also a homodimer and catalyzes the alpha, beta-elimination reaction of both L- and D-alpha, beta-diaminopropionate to form pyruvate and ammonia. Thr-synth catalyzes the formation of threonine and inorganic phosphate from O-phosphohomoserine." Q#17715 - CGI_10001441 superfamily 241810 355 407 8.11E-24 92.9845 cl00354 KOW superfamily - - "KOW: an acronym for the authors' surnames (Kyrpides, Ouzounis and Woese); KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. The KOW motif contains an invariant glycine residue and comprises alternating blocks of hydrophilic and hydrophobic residues." 
Q#17715 - CGI_10001441 superfamily 220714 54 180 5.59E-64 203.254 cl11024 Kin17_mid superfamily - - "Domain of Kin17 curved DNA-binding protein; Kin17_mid is the conserved central 169 residue region of a family of Kin17 proteins. Towards the N-terminal end there is a zinc-finger domain, and in human and mouse members there is a RecA-like domain further downstream. The Kin17 protein in humans forms intra-nuclear foci during cell proliferation and is re-distributed in the nucleoplasm during the cell cycle." Q#17715 - CGI_10001441 superfamily 205121 28 52 0.00650241 34.402 cl18263 zf-met superfamily - - "Zinc-finger of C2H2 type; This is a zinc-finger domain with the CxxCx(12)Hx(6)H motif, found in multiple copies in a wide range of proteins from plants to metazoans. Some member proteins, particularly those from plants, are annotated as being RNA-binding." Q#17716 - CGI_10002232 superfamily 247792 505 549 1.04E-08 52.0628 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#17717 - CGI_10002233 superfamily 245201 20 182 8.22E-46 153.461 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. 
These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#17718 - CGI_10002234 superfamily 246908 263 361 3.44E-19 82.7674 cl15255 SH2 superfamily C - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." Q#17718 - CGI_10002234 superfamily 241563 66 102 1.79E-06 44.7776 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#17718 - CGI_10002234 superfamily 243138 158 224 0.00748284 36.397 cl02675 DZF superfamily NC - DZF domain; The function of this domain is unknown. It is often found associated with pfam00098 or pfam00035. This domain has been predicted to belong to the nucleotidyltransferase superfamily. Q#17719 - CGI_10002235 superfamily 248097 27 133 1.05E-08 48.8006 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. 
Q#17722 - CGI_10002610 superfamily 241609 204 272 2.11E-24 93.9819 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#17722 - CGI_10002610 superfamily 241609 125 190 4.08E-24 93.2115 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#17723 - CGI_10002632 superfamily 243119 53 93 0.00458193 34.727 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#17724 - CGI_10002633 superfamily 248097 49 178 2.39E-19 79.6166 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. 
Q#17725 - CGI_10002706 superfamily 241584 11 83 1.32E-07 49.0319 cl00065 FN3 superfamily C - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#17727 - CGI_10002708 superfamily 219261 243 401 1.90E-51 177.956 cl15654 DUF1308 superfamily N - Protein of unknown function (DUF1308); This family consists of several hypothetical eukaryotic sequences of around 400 residues in length. The function of this family is unknown. Q#17727 - CGI_10002708 superfamily 219261 1 138 6.38E-32 124.028 cl15654 DUF1308 superfamily C - Protein of unknown function (DUF1308); This family consists of several hypothetical eukaryotic sequences of around 400 residues in length. The function of this family is unknown. Q#17728 - CGI_10002709 superfamily 219261 147 179 2.98E-05 41.9802 cl15654 DUF1308 superfamily N - Protein of unknown function (DUF1308); This family consists of several hypothetical eukaryotic sequences of around 400 residues in length. The function of this family is unknown. Q#17730 - CGI_10019250 superfamily 243047 38 81 1.18E-22 91.5275 cl02464 ArfGap superfamily C - "Putative GTPase activating protein for Arf; Putative zinc fingers with GTPase activating proteins (GAPs) towards the small GTPase, Arf. The GAP of ARD1 stimulates GTPase hydrolysis for ARD1 but not ARFs." 
Q#17731 - CGI_10019251 superfamily 245201 49 285 3.57E-61 202.082 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#17732 - CGI_10019252 superfamily 247757 59 274 1.49E-92 275.498 cl17203 Fer4_NifH superfamily - - "The Fer4_NifH superfamily contains a variety of proteins which share a common ATP-binding domain. Functionally, proteins in this superfamily use the energy from hydrolysis of NTP to transfer electrons or ions." Q#17733 - CGI_10019253 superfamily 242868 128 204 0.00103725 37.757 cl02079 YtxH superfamily N - YtxH-like protein; This family of proteins is found in bacteria. Proteins in this family are typically between 100 and 143 amino acids in length. The N-terminal region is the most conserved. Proteins in this family are functionally uncharacterized. Q#17735 - CGI_10019255 superfamily 245814 135 210 6.10E-13 61.5807 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#17735 - CGI_10019255 superfamily 245814 26 112 4.77E-05 39.7961 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#17737 - CGI_10019257 superfamily 220695 117 173 0.00318208 37.5583 cl18571 7TM_GPCR_Srx superfamily NC - Serpentine type 7TM GPCR chemoreceptor Srx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srx is part of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#17738 - CGI_10019258 superfamily 220695 38 271 3.63E-07 50.2699 cl18571 7TM_GPCR_Srx superfamily - - Serpentine type 7TM GPCR chemoreceptor Srx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srx is part of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#17738 - CGI_10019258 superfamily 243031 401 504 2.86E-05 44.5968 cl02425 Osteopontin superfamily C - Osteopontin; Osteopontin. 
Q#17739 - CGI_10019259 superfamily 241574 1038 1266 8.82E-105 334.169 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#17739 - CGI_10019259 superfamily 241574 1329 1561 8.63E-77 255.588 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#17739 - CGI_10019259 superfamily 241584 560 649 1.57E-15 74.8403 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. 
FN3-like domains are also found in bacterial glycosyl hydrolases." Q#17739 - CGI_10019259 superfamily 241584 660 749 2.41E-08 53.6543 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#17739 - CGI_10019259 superfamily 241584 336 438 4.47E-05 43.6391 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#17739 - CGI_10019259 superfamily 241609 62 123 1.95E-13 68.1735 cl00100 KR superfamily C - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." 
Q#17741 - CGI_10019261 superfamily 247723 191 276 7.62E-27 105.808 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#17741 - CGI_10019261 superfamily 247723 103 180 4.83E-16 74.6144 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. 
The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#17741 - CGI_10019261 superfamily 247723 31 95 0.00753888 35.5348 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#17742 - CGI_10019262 superfamily 245201 61 310 4.48E-58 197.374 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." 
Q#17745 - CGI_10019265 superfamily 192997 305 458 6.65E-38 140.024 cl18184 Sterol-sensing superfamily - - "Sterol-sensing domain of SREBP cleavage-activation; Sterol regulatory element-binding proteins (SREBPs) are membrane-bound transcription factors that promote lipid synthesis in animal cells. They are embedded in the membranes of the endoplasmic reticulum (ER) in a helical hairpin orientation and are released from the ER by a two-step proteolytic process. Proteolysis begins when the SREBPs are cleaved at Site-1, which is located at a leucine residue in the middle of the hydrophobic loop in the lumen of the ER. Upon proteolytic processing SREBP can activate the expression of genes involved in cholesterol biosynthesis and uptake. SCAP stimulates cleavage of SREBPs via fusion of their two C-termini. This domain is the transmembrane region that traverses the membrane eight times and is the sterol-sensing domain of the cleavage protein. WD40 domains are found towards the C-terminus." Q#17746 - CGI_10019266 superfamily 241596 39 76 7.53E-10 51.8311 cl00081 HLH superfamily N - "Helix-loop-helix domain, found in specific DNA-binding proteins that act as transcription factors; 60-100 amino acids long. 
A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE 5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved in both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#17747 - CGI_10019267 superfamily 247856 48 106 5.54E-10 52.5501 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#17747 - CGI_10019267 superfamily 247856 83 140 6.31E-07 44.4609 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#17748 - CGI_10019268 superfamily 202009 64 108 8.20E-16 67.9076 cl09271 NAC superfamily - - NAC domain; NAC domain. Q#17749 - CGI_10019269 superfamily 243034 121 222 8.81E-06 44.2932 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#17749 - CGI_10019269 superfamily 220793 394 431 6.14E-08 49.7733 cl11153 SHNi-TPR superfamily - - SHNi-TPR; SHNi-TPR family members contain a reiterated sequence motif that is an interrupted form of TPR repeat. 
Q#17750 - CGI_10019270 superfamily 245595 515 659 3.12E-78 260.007 cl11393 Peptidase_M14_like superfamily N - "M14 family of metallocarboxypeptidases and related proteins; The M14 family of metallocarboxypeptidases (MCPs), also known as funnelins, are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Two major subfamilies of the M14 family, defined based on sequence and structural homology, are the A/B and N/E subfamilies. Enzymes belonging to the A/B subfamily are normally synthesized as inactive precursors containing preceding signal peptide, followed by an N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases. The A/B enzymes can be further divided based on their substrate specificity; Carboxypeptidase A-like (CPA-like) enzymes favor hydrophobic residues while carboxypeptidase B-like (CPB-like) enzymes only cleave the basic residues lysine or arginine. The A forms have slightly different specificities, with Carboxypeptidase A1 (CPA1) preferring aliphatic and small aromatic residues, and CPA2 preferring the bulky aromatic side chains. Enzymes belonging to the N/E subfamily enzymes are not produced as inactive precursors and instead rely on their substrate specificity and subcellular compartmentalization to prevent inappropriate cleavage. They contain an extra C-terminal transthyretin-like domain, thought to be involved in folding or formation of oligomers. 
MCPs can also be classified based on their involvement in specific physiological processes; the pancreatic MCPs participate only in alimentary digestion and include carboxypeptidase A and B (A/B subfamily), while others, namely regulatory MCPs or the N/E subfamily, are involved in more selective reactions, mainly in non-digestive tissues and fluids, acting on blood coagulation/fibrinolysis, inflammation and local anaphylaxis, pro-hormone and neuropeptide processing, cellular response and others. Another MCP subfamily, is that of succinylglutamate desuccinylase /aspartoacylase, which hydrolyzes N-acetyl-L-aspartate (NAA), and deficiency in which is the established cause of Canavan disease. Another subfamily (referred to as subfamily C) includes an exceptional type of activity in the MCP family, that of dipeptidyl-peptidase activity of gamma-glutamyl-(L)-meso-diaminopimelate peptidase I which is involved in bacterial cell wall metabolism." Q#17750 - CGI_10019270 superfamily 245595 178 288 6.01E-46 168.715 cl11393 Peptidase_M14_like superfamily C - "M14 family of metallocarboxypeptidases and related proteins; The M14 family of metallocarboxypeptidases (MCPs), also known as funnelins, are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Two major subfamilies of the M14 family, defined based on sequence and structural homology, are the A/B and N/E subfamilies. Enzymes belonging to the A/B subfamily are normally synthesized as inactive precursors containing preceding signal peptide, followed by an N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases. 
The A/B enzymes can be further divided based on their substrate specificity; Carboxypeptidase A-like (CPA-like) enzymes favor hydrophobic residues while carboxypeptidase B-like (CPB-like) enzymes only cleave the basic residues lysine or arginine. The A forms have slightly different specificities, with Carboxypeptidase A1 (CPA1) preferring aliphatic and small aromatic residues, and CPA2 preferring the bulky aromatic side chains. Enzymes belonging to the N/E subfamily enzymes are not produced as inactive precursors and instead rely on their substrate specificity and subcellular compartmentalization to prevent inappropriate cleavage. They contain an extra C-terminal transthyretin-like domain, thought to be involved in folding or formation of oligomers. MCPs can also be classified based on their involvement in specific physiological processes; the pancreatic MCPs participate only in alimentary digestion and include carboxypeptidase A and B (A/B subfamily), while others, namely regulatory MCPs or the N/E subfamily, are involved in more selective reactions, mainly in non-digestive tissues and fluids, acting on blood coagulation/fibrinolysis, inflammation and local anaphylaxis, pro-hormone and neuropeptide processing, cellular response and others. Another MCP subfamily, is that of succinylglutamate desuccinylase /aspartoacylase, which hydrolyzes N-acetyl-L-aspartate (NAA), and deficiency in which is the established cause of Canavan disease. Another subfamily (referred to as subfamily C) includes an exceptional type of activity in the MCP family, that of dipeptidyl-peptidase activity of gamma-glutamyl-(L)-meso-diaminopimelate peptidase I which is involved in bacterial cell wall metabolism." Q#17751 - CGI_10019271 superfamily 248313 37 100 6.87E-05 41.1406 cl17759 EamA superfamily N - EamA-like transporter family; This family includes many hypothetical membrane proteins of unknown function. Many of the proteins contain two copies of the aligned region. 
The family used to be known as DUF6. Q#17753 - CGI_10019273 superfamily 219248 40 183 9.39E-40 145.159 cl06159 POP1 superfamily - - Ribonucleases P/MRP protein subunit POP1; This family represents a conserved region approximately 150 residues long located towards the N-terminus of the POP1 subunit that is common to both the RNase MRP and RNase P ribonucleoproteins (EC:3.1.26.5). These RNA-containing enzymes generate mature tRNA molecules by cleaving their 5' ends. Q#17753 - CGI_10019273 superfamily 219735 563 654 3.18E-32 121.235 cl06978 POPLD superfamily - - POPLD (NUC188) domain; This domain is found in POP1-like nucleolar proteins. Q#17754 - CGI_10019274 superfamily 245229 62 167 4.66E-41 136.149 cl10015 YjgF_YER057c_UK114_family superfamily - - "YjgF, YER057c, and UK114 belong to a large family of proteins present in bacteria, archaea, and eukaryotes with no definitive function. The conserved domain is similar in structure to chorismate mutase but there is no sequence similarity and no functional connection. Members of this family have been implicated in isoleucine (Yeo7, Ibm1, aldR) and purine (YjgF) biosynthesis, as well as threonine anaerobic degradation (tdcF) and mitochondrial DNA maintenance (Ibm1). This domain homotrimerizes forming a distinct intersubunit cavity that may serve as a small molecule binding site." Q#17755 - CGI_10019275 superfamily 241739 52 341 5.04E-123 363.05 cl00268 class_II_aaRS-like_core superfamily - - "Class II tRNA amino-acyl synthetase-like catalytic core domain. Class II amino acyl-tRNA synthetases (aaRS) share a common fold and generally attach an amino acid to the 3' OH of ribose of the appropriate tRNA. PheRS is an exception in that it attaches the amino acid at the 2'-OH group, like class I aaRSs. These enzymes are usually homodimers. This domain is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. The substrate specificity of this reaction is further determined by additional domains. 
Interestingly, this domain is also found in asparagine synthase A (AsnA), in the accessory subunit of mitochondrial polymerase gamma and in the bacterial ATP phosphoribosyltransferase regulatory subunit HisZ." Q#17755 - CGI_10019275 superfamily 241738 356 437 3.81E-10 56.829 cl00266 HGTP_anticodon superfamily C - "HGTP anticodon binding domain, as found at the C-terminus of histidyl, glycyl, threonyl and prolyl tRNA synthetases, which are classified as a group of class II aminoacyl-tRNA synthetases (aaRS). In aaRSs, the anticodon binding domain is responsible for specificity in tRNA-binding, so that the activated amino acid is transferred to a ribose 3' OH group of the appropriate tRNA only. This domain is also found in the accessory subunit of mitochondrial polymerase gamma (Pol gamma b)." Q#17756 - CGI_10019276 superfamily 244539 116 294 5.61E-45 163.892 cl06868 FNR_like superfamily - - "Ferredoxin reductase (FNR), an FAD and NAD(P) binding protein, was initially identified as a chloroplast reductase activity, catalyzing the electron transfer from reduced iron-sulfur protein ferredoxin to NADP+ as the final step in the electron transport mechanism of photosystem I. FNR transfers electrons from reduced ferredoxin to FAD (forming FADH2 via a semiquinone intermediate) and then transfers a hydride ion to convert NADP+ to NADPH. FNR has since been shown to utilize a variety of electron acceptors and donors and has a variety of physiological functions including nitrogen assimilation, dinitrogen fixation, steroid hydroxylation, fatty acid metabolism, oxygenase activity, and methane assimilation in many organisms. FNR has an NAD(P)-binding sub-domain of the alpha/beta class and a discrete (usually N-terminal) flavin sub-domain which vary in orientation with respect to the NAD(P) binding domain. The N-terminal moiety may contain a flavin prosthetic group (as in flavoenzymes) or use flavin as a substrate. 
Because flavins such as FAD can exist in oxidized, semiquinone (one- electron reduced), or fully reduced hydroquinone forms, FNR can interact with one and 2 electron carriers. FNR has a strong preference for NADP(H) vs NAD(H)." Q#17756 - CGI_10019276 superfamily 241584 368 452 0.00105668 39.0167 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#17756 - CGI_10019276 superfamily 150458 54 87 1.10E-07 50.3879 cl10765 Oxidored-like superfamily - - "Oxidoreductase-like protein, N-terminal; Members of this family are found in the N terminal region of various oxidoreductase like proteins. Their exact function is, as yet, unknown." Q#17758 - CGI_10019278 superfamily 248097 266 387 4.86E-28 109.277 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#17758 - CGI_10019278 superfamily 248097 123 248 2.45E-26 104.655 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#17758 - CGI_10019278 superfamily 248097 39 113 3.58E-11 61.127 cl17543 C1q superfamily C - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#17760 - CGI_10019280 superfamily 241691 72 283 1.82E-05 43.9266 cl00213 DNA_BRE_C superfamily C - "DNA breaking-rejoining enzymes, C-terminal catalytic domain. 
The DNA breaking-rejoining enzyme superfamily includes type IB topoisomerases and tyrosine recombinases that share the same fold in their catalytic domain containing six conserved active site residues. The best-studied members of this diverse superfamily include human topoisomerase I, the bacteriophage lambda integrase, the bacteriophage P1 Cre recombinase, the yeast Flp recombinase and the bacterial XerD/C recombinases. Their overall reaction mechanism is essentially identical and involves cleavage of a single strand of a DNA duplex by nucleophilic attack of a conserved tyrosine to give a 3' phosphotyrosyl protein-DNA adduct. In the second rejoining step, a terminal 5' hydroxyl attacks the covalent adduct to release the enzyme and generate duplex DNA. The enzymes differ in that topoisomerases cleave and then rejoin the same 5' and 3' termini, whereas a site-specific recombinase transfers a 5' hydroxyl generated by recombinase cleavage to a new 3' phosphate partner located in a different duplex region. Many DNA breaking-rejoining enzymes also have N-terminal domains, which show little sequence or structure similarity." Q#17762 - CGI_10019282 superfamily 241585 138 172 0.000197539 39.4244 cl00066 FU superfamily C - Furin-like repeats. Cysteine rich region. Exact function of the domain is not known. Furin is a serine-kinase dependent proprotein processor. Other members of this family include endoproteases and cell surface receptors. Q#17763 - CGI_10021414 superfamily 241686 325 387 3.31E-16 75.3349 cl00207 HMA superfamily - - "Heavy-metal-associated domain (HMA) is a conserved domain of approximately 30 amino acid residues found in a number of proteins that transport or detoxify heavy metals, for example, the CPx-type heavy metal ATPases and copper chaperones. HMA domain contains two cysteine residues that are important in binding and transfer of metal ions, such as copper, cadmium, cobalt and zinc. 
In the case of copper, stoichiometry of binding is one Cu+ ion per binding domain. Repeats of the HMA domain in copper chaperone has been associated with Menkes/Wilson disease due to binding of multiple copper ions." Q#17763 - CGI_10021414 superfamily 241686 213 273 1.26E-15 73.7941 cl00207 HMA superfamily - - "Heavy-metal-associated domain (HMA) is a conserved domain of approximately 30 amino acid residues found in a number of proteins that transport or detoxify heavy metals, for example, the CPx-type heavy metal ATPases and copper chaperones. HMA domain contains two cysteine residues that are important in binding and transfer of metal ions, such as copper, cadmium, cobalt and zinc. In the case of copper, stoichiometry of binding is one Cu+ ion per binding domain. Repeats of the HMA domain in copper chaperone has been associated with Menkes/Wilson disease due to binding of multiple copper ions." Q#17763 - CGI_10021414 superfamily 241686 134 196 8.38E-14 68.4013 cl00207 HMA superfamily - - "Heavy-metal-associated domain (HMA) is a conserved domain of approximately 30 amino acid residues found in a number of proteins that transport or detoxify heavy metals, for example, the CPx-type heavy metal ATPases and copper chaperones. HMA domain contains two cysteine residues that are important in binding and transfer of metal ions, such as copper, cadmium, cobalt and zinc. In the case of copper, stoichiometry of binding is one Cu+ ion per binding domain. Repeats of the HMA domain in copper chaperone has been associated with Menkes/Wilson disease due to binding of multiple copper ions." Q#17763 - CGI_10021414 superfamily 241686 44 104 3.15E-12 63.7789 cl00207 HMA superfamily - - "Heavy-metal-associated domain (HMA) is a conserved domain of approximately 30 amino acid residues found in a number of proteins that transport or detoxify heavy metals, for example, the CPx-type heavy metal ATPases and copper chaperones. 
HMA domain contains two cysteine residues that are important in binding and transfer of metal ions, such as copper, cadmium, cobalt and zinc. In the case of copper, stoichiometry of binding is one Cu+ ion per binding domain. Repeats of the HMA domain in copper chaperone has been associated with Menkes/Wilson disease due to binding of multiple copper ions." Q#17763 - CGI_10021414 superfamily 241686 400 463 1.87E-05 44.1337 cl00207 HMA superfamily - - "Heavy-metal-associated domain (HMA) is a conserved domain of approximately 30 amino acid residues found in a number of proteins that transport or detoxify heavy metals, for example, the CPx-type heavy metal ATPases and copper chaperones. HMA domain contains two cysteine residues that are important in binding and transfer of metal ions, such as copper, cadmium, cobalt and zinc. In the case of copper, stoichiometry of binding is one Cu+ ion per binding domain. Repeats of the HMA domain in copper chaperone has been associated with Menkes/Wilson disease due to binding of multiple copper ions." Q#17763 - CGI_10021414 superfamily 215733 611 841 1.67E-58 202.025 cl02811 E1-E2_ATPase superfamily - - E1-E2 ATPase; E1-E2 ATPase. Q#17763 - CGI_10021414 superfamily 226572 1039 1151 4.78E-08 52.9464 cl18761 COG4087 superfamily N - Soluble P-type ATPase [General function prediction only] Q#17764 - CGI_10021415 superfamily 216897 2 65 6.25E-24 86.9665 cl03463 Gal_Lectin superfamily - - Galactose binding lectin domain; Galactose binding lectin domain. Q#17766 - CGI_10021417 superfamily 248469 76 161 5.53E-05 41.2015 cl17915 HAD_like superfamily C - "Haloacid dehalogenase-like hydrolases. The haloacid dehalogenase-like (HAD) superfamily includes L-2-haloacid dehalogenase, epoxide hydrolase, phosphoserine phosphatase, phosphomannomutase, phosphoglycolate phosphatase, P-type ATPase, and many others, all of which use a nucleophilic aspartate in their phosphoryl transfer reaction. 
All members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. Members of this superfamily are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases." Q#17767 - CGI_10021418 superfamily 241670 144 324 8.72E-19 82.4059 cl00188 BPI superfamily - - "BPI/LBP/CETP domain; Bactericidal permeability-increasing protein (BPI) / Lipopolysaccharide-binding protein (LBP) / Cholesteryl ester transfer protein (CETP) domain; binds to and neutralizes lipopolysaccharides from the outer membrane of Gram-negative bacteria.; Apolar pockets on the concave surface bind a molecule of phosphatidylcholine, primarily by interacting with their acyl chains; this suggests that the pockets may also bind the acyl chains of lipopolysaccharide." Q#17767 - CGI_10021418 superfamily 241670 20 120 1.68E-05 43.9204 cl00188 BPI superfamily C - "BPI/LBP/CETP domain; Bactericidal permeability-increasing protein (BPI) / Lipopolysaccharide-binding protein (LBP) / Cholesteryl ester transfer protein (CETP) domain; binds to and neutralizes lipopolysaccharides from the outer membrane of Gram-negative bacteria.; Apolar pockets on the concave surface bind a molecule of phosphatidylcholine, primarily by interacting with their acyl chains; this suggests that the pockets may also bind the acyl chains of lipopolysaccharide." Q#17768 - CGI_10021419 superfamily 241670 31 186 6.66E-24 96.2248 cl00188 BPI superfamily - - "BPI/LBP/CETP domain; Bactericidal permeability-increasing protein (BPI) / Lipopolysaccharide-binding protein (LBP) / Cholesteryl ester transfer protein (CETP) domain; binds to and neutralizes lipopolysaccharides from the outer membrane of Gram-negative bacteria.; Apolar pockets on the concave surface bind a molecule of phosphatidylcholine, primarily by interacting with their acyl chains; this suggests that the pockets may also bind the acyl chains of lipopolysaccharide." 
Q#17768 - CGI_10021419 superfamily 241670 213 375 2.87E-23 96.2731 cl00188 BPI superfamily - - "BPI/LBP/CETP domain; Bactericidal permeability-increasing protein (BPI) / Lipopolysaccharide-binding protein (LBP) / Cholesteryl ester transfer protein (CETP) domain; binds to and neutralizes lipopolysaccharides from the outer membrane of Gram-negative bacteria.; Apolar pockets on the concave surface bind a molecule of phosphatidylcholine, primarily by interacting with their acyl chains; this suggests that the pockets may also bind the acyl chains of lipopolysaccharide." Q#17769 - CGI_10021420 superfamily 241670 1 131 2.14E-29 111.633 cl00188 BPI superfamily N - "BPI/LBP/CETP domain; Bactericidal permeability-increasing protein (BPI) / Lipopolysaccharide-binding protein (LBP) / Cholesteryl ester transfer protein (CETP) domain; binds to and neutralizes lipopolysaccharides from the outer membrane of Gram-negative bacteria.; Apolar pockets on the concave surface bind a molecule of phosphatidylcholine, primarily by interacting with their acyl chains; this suggests that the pockets may also bind the acyl chains of lipopolysaccharide." Q#17769 - CGI_10021420 superfamily 241670 198 384 1.97E-26 105.518 cl00188 BPI superfamily - - "BPI/LBP/CETP domain; Bactericidal permeability-increasing protein (BPI) / Lipopolysaccharide-binding protein (LBP) / Cholesteryl ester transfer protein (CETP) domain; binds to and neutralizes lipopolysaccharides from the outer membrane of Gram-negative bacteria.; Apolar pockets on the concave surface bind a molecule of phosphatidylcholine, primarily by interacting with their acyl chains; this suggests that the pockets may also bind the acyl chains of lipopolysaccharide." 
Q#17770 - CGI_10021421 superfamily 247743 161 298 6.04E-19 84.1199 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#17770 - CGI_10021421 superfamily 192911 415 563 9.21E-79 247.754 cl15342 MgsA_C superfamily - - "MgsA AAA+ ATPase C terminal; The MgsA protein possesses DNA-dependent ATPase and ssDNA annealing activities. MgsA contributes to the recovery of stalled replication forks and therefore prevents genomic instability caused by aberrant DNA replication. Additionally, MgsA may play a role in chromosomal segregation. This is consistent with a report that MgsA co-localises with the replisome and affects chromosome segregation. This domain represents the C terminal region of MgsA." Q#17772 - CGI_10021423 superfamily 241578 4 94 1.23E-20 82.3394 cl00057 vWFA superfamily C - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation, and immune defenses. In integrins these domains form heterodimers, while in vWF they form multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#17773 - CGI_10021424 superfamily 245226 6 193 2.68E-22 94.9797 cl10012 DnaQ_like_exo superfamily - - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." 
Q#17773 - CGI_10021424 superfamily 247723 652 725 3.74E-17 77.2123 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#17773 - CGI_10021424 superfamily 247723 225 304 1.06E-48 165.857 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. 
The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#17773 - CGI_10021424 superfamily 247723 308 405 9.80E-46 158.509 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#17773 - CGI_10021424 superfamily 247723 457 541 1.71E-33 124.232 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. 
The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#17774 - CGI_10021426 superfamily 216981 87 158 9.71E-07 45.2162 cl17087 OTU superfamily C - "OTU-like cysteine protease; This family is comprised of a group of predicted cysteine proteases, homologous to the Ovarian Tumour (OTU) gene in Drosophila. Members include proteins from eukaryotes, viruses and pathogenic bacteria. The conserved cysteine and histidine, and possibly the aspartate, represent the catalytic residues in this putative group of proteases." Q#17776 - CGI_10021428 superfamily 247683 419 471 3.40E-20 84.2352 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." 
Q#17776 - CGI_10021428 superfamily 247725 13 131 2.79E-23 94.9163 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting proteins to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and other proteins." Q#17777 - CGI_10021429 superfamily 241642 24 78 4.65E-07 42.5145 cl00152 t_SNARE superfamily - - "Soluble NSF (N-ethylmaleimide-sensitive fusion protein)-Attachment protein (SNAP) REceptor domain; these alpha-helical motifs form twisted and parallel heterotetrameric helix bundles; the core complex contains one helix from a protein that is anchored in the vesicle membrane (synaptobrevin), one helix from a protein of the target membrane (syntaxin), and two helices from another protein anchored in the target membrane (SNAP-25); their interaction forms a core which is composed of a polar zero layer, a flanking leucine-zipper layer acts as a water tight shield to isolate ionic interactions in the zero layer from the surrounding solvent" Q#17778 - CGI_10021430 superfamily 217294 8 217 1.27E-89 272.785 cl08381 GatB_N superfamily - - GatB/GatE catalytic domain; This domain is found in the GatB and GatE proteins. Q#17778 - CGI_10021430 superfamily 248248 218 330 5.42E-12 62.1365 cl17694 GatB_Yqey superfamily N - "GatB domain; This domain is found in GatB. It is about 140 amino acid residues long. This domain is found at the C terminus of GatB, which transamidates Glu-tRNA to Gln-tRNA." 
Q#17783 - CGI_10002990 superfamily 243092 21 302 1.12E-19 90.472 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#17784 - CGI_10002991 superfamily 244859 1 251 1.74E-39 139.244 cl08171 HtrL_YibB superfamily - - "Bacterial protein of unknown function (HtrL_YibB); The protein from this rare, uncharacterized protein family is designated HtrL or YibB in E. coli, where its gene is found in a region of LPS core biosynthesis genes. Homologues are found in Shigella flexneri, Campylobacter jejuni, and Caenorhabditis elegans only. The htrL gene may represent an insertion to the LPS core biosynthesis region, rather than an LPS biosynthetic protein." 
Q#17792 - CGI_10003599 superfamily 220775 4377 4946 0 574.576 cl11120 FSA_C superfamily - - Fragile site-associated protein C-terminus; This is the conserved C-terminal half of the protein KIAA1109 which is the fragile site-associated protein FSA. Genome-wide-association studies showed this protein to be linked to the susceptibility to coeliac disease. The protein may also be associated with polycystic kidney disease. Q#17793 - CGI_10003600 superfamily 243179 164 267 2.87E-12 61.7799 cl02781 tetraspanin_LEL superfamily - - "Tetraspanin, extracellular domain or large extracellular loop (LEL). Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. The tetraspanin family contains CD9, CD63, CD37, CD53, CD82, CD151, and CD81, amongst others. Tetraspanins are involved in diverse processes such as cell activation and proliferation, adhesion and motility, differentiation, cancer, and others. Their various functions may relate to their ability to act as molecular facilitators, grouping specific cell-surface proteins and affecting formation and stability of signaling complexes. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web", which may also include integrins." Q#17795 - CGI_10003867 superfamily 216363 102 194 4.90E-21 84.4442 cl08312 UPF0029 superfamily - - Uncharacterized protein family UPF0029; Uncharacterized protein family UPF0029. Q#17800 - CGI_10011361 superfamily 247805 33 147 3.17E-12 59.6584 cl17251 DEXDc superfamily C - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 
Q#17803 - CGI_10011364 superfamily 247866 6 73 4.70E-05 39.358 cl17312 PhyH superfamily N - "Phytanoyl-CoA dioxygenase (PhyH); This family is made up of several eukaryotic phytanoyl-CoA dioxygenase (PhyH) proteins, ectoine hydroxylases and a number of bacterial deoxygenases. PhyH is a peroxisomal enzyme catalyzing the first step of phytanic acid alpha-oxidation. PhyH deficiency causes Refsum's disease (RD) which is an inherited neurological syndrome biochemically characterized by the accumulation of phytanic acid in plasma and tissues." Q#17804 - CGI_10011365 superfamily 245201 155 221 2.85E-05 42.2238 cl09925 PKc_like superfamily C - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#17806 - CGI_10011368 superfamily 243035 74 147 5.85E-11 56.0889 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. 
For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#17807 - CGI_10011369 superfamily 241610 17 69 1.51E-15 64.5786 cl00101 KU superfamily - - BPTI/Kunitz family of serine protease inhibitors; Structure is a disulfide rich alpha+beta fold. BPTI (bovine pancreatic trypsin inhibitor) is an extensively studied model structure. Q#17810 - CGI_10004193 superfamily 241574 3 217 1.04E-89 287.175 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#17810 - CGI_10004193 superfamily 241574 557 663 1.80E-27 112.294 cl00053 PTPc superfamily C - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#17810 - CGI_10004193 superfamily 241574 242 451 1.19E-19 89.1821 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. 
The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#17810 - CGI_10004193 superfamily 241574 743 953 3.66E-12 66.0701 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#17811 - CGI_10004194 superfamily 248264 126 178 2.53E-15 69.1881 cl17710 DDE_4 superfamily C - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#17811 - CGI_10004194 superfamily 222263 43 137 1.93E-05 40.3789 cl16321 DDE_4_2 superfamily - - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. 
Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#17813 - CGI_10007517 superfamily 247746 16 55 0.00587499 31.5831 cl17192 ATP-synt_B superfamily NC - "ATP synthase B/B' CF(0); Part of the CF(0) (base unit) of the ATP synthase. The base unit is thought to translocate protons through membrane (inner membrane in mitochondria, thylakoid membrane in plants, cytoplasmic membrane in bacteria). The B subunits are thought to interact with the stalk of the CF(1) subunits. This domain should not be confused with the ab CF(1) proteins (in the head of the ATP synthase) which are found in pfam00006" Q#17816 - CGI_10007520 superfamily 242274 3 155 1.07E-11 58.963 cl01053 SGNH_hydrolase superfamily - - "SGNH_hydrolase, or GDSL_hydrolase, is a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the typical Ser-His-Asp(Glu) triad from other serine hydrolases, but may lack the carboxylic acid." Q#17817 - CGI_10007521 superfamily 245364 26 109 2.10E-23 88.9496 cl10717 CactinC_cactus superfamily N - "Cactus-binding C-terminus of cactin protein; CactinC_cactus is the C-terminal 200 residues of the cactin protein which are necessary for the association of cactin with IkappaB-cactus as one of the intracellular members of the Rel complex. The Rel (NF-kappaB) pathway is conserved in invertebrates and vertebrates. In mammals, it controls the activities of the immune and inflammatory response genes as well as viral genes, and is critical for cell growth and survival. 
In Drosophila, the Rel pathway functions in the innate cellular and humoral immune response, in muscle development, and in the establishment of dorsal-ventral polarity in the early embryo. Most members of the family also have a Cactin_mid domain pfam10312 further upstream." Q#17820 - CGI_10015384 superfamily 246676 315 472 8.77E-31 117.831 cl14616 Cyt_b561 superfamily - - "Eukaryotic cytochrome b(561); Cytochrome b(561) is a family of endosomal or secretory vesicle-specific electron transport proteins. They are integral membrane proteins that bind two heme groups non-covalently, and may have six alpha-helical trans-membrane segments. This is an exclusively eukaryotic family. Members of the prokaryotic cytochrome b561 family are not deemed homologous." Q#17820 - CGI_10015384 superfamily 246671 25 152 1.35E-19 85.1672 cl14606 Reeler_cohesin_like superfamily - - "Domains similar to the eukaryotic reeler domain and bacterial cohesins; This diverse family summarizes a set of distantly related domains, as revealed by structural similarity." Q#17820 - CGI_10015384 superfamily 246710 172 311 4.68E-17 78.6236 cl14783 DOMON_like superfamily - - "Domon-like ligand-binding domains; DOMON-like domains can be found in all three kingdoms of life and are a diverse group of ligand binding domains that have been shown to interact with sugars and hemes. DOMON domains were initially thought to confer protein-protein interactions. They were subsequently found as a heme-binding motif in cellobiose dehydrogenase, an extracellular fungal oxidoreductase that degrades both lignin and cellulose, and in ethylbenzene dehydrogenase, an enzyme that aids in the anaerobic degradation of hydrocarbons. The domain interacts with sugars in the type 9 carbohydrate binding modules (CBM9), which are present in a variety of glycosyl hydrolases, and it can also be found at the N-terminus of sensor histidine kinases." 
Q#17821 - CGI_10015385 superfamily 216897 38 88 0.00123099 36.5053 cl03463 Gal_Lectin superfamily C - Galactose binding lectin domain; Galactose binding lectin domain. Q#17822 - CGI_10015386 superfamily 247637 40 200 4.03E-31 124.604 cl16912 MDR superfamily C - "Medium chain reductase/dehydrogenase (MDR)/zinc-dependent alcohol dehydrogenase-like family; The medium chain reductase/dehydrogenases (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH. MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P) binding-Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES. The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH) , quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones. ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines. Other MDR members have only a catalytic zinc, and some contain no coordinated zinc." 
Q#17822 - CGI_10015386 superfamily 242289 375 570 2.27E-21 93.9607 cl01077 SIMPL superfamily - - "Protein of unknown function (DUF541); Members of this family have so far been found in bacteria and mouse SwissProt or TrEMBL entries. However possible family members have also been identified in translated rat (Genbank:AW144450) and human (Genbank:AI478629) ESTs. A mouse family member has been named SIMPL (signalling molecule that associates with mouse pelle-like kinase). SIMPL appears to facilitate and/or regulate complex formation between IRAK/mPLK (IL-1 receptor-associated kinase) and IKK (inhibitor of kappa-B kinase) containing complexes, and thus regulate NF-kappa-B activity. Separate experiments demonstrate that a mouse family member (named LaXp180) binds the Listeria monocytogenes surface protein ActA, which is a virulence factor that induces actin polymerisation. It may also bind stathmin, a protein involved in signal transduction and in the regulation of microtubule dynamics. In bacteria its function is unknown, but it is thought to be located in the periplasm or outer membrane." Q#17822 - CGI_10015386 superfamily 216897 602 652 0.00236421 37.2757 cl03463 Gal_Lectin superfamily C - Galactose binding lectin domain; Galactose binding lectin domain. Q#17823 - CGI_10015387 superfamily 148600 1 189 1.03E-38 132.548 cl06218 DUF1352 superfamily - - Protein of unknown function (DUF1352); This family consists of several hypothetical eukaryotic proteins of around 190 residues in length. The function of this family is unknown. Q#17824 - CGI_10015388 superfamily 247683 1022 1071 8.53E-26 102.493 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. 
They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#17824 - CGI_10015388 superfamily 247683 927 978 3.25E-25 100.903 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#17824 - CGI_10015388 superfamily 247856 169 233 2.38E-23 95.7494 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. 
EF-hands tend to occur in pairs or higher copy numbers." Q#17824 - CGI_10015388 superfamily 247683 820 875 7.13E-22 91.2654 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#17824 - CGI_10015388 superfamily 247856 394 427 3.10E-06 46.0587 cl17302 EFh superfamily N - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#17824 - CGI_10015388 superfamily 218657 674 754 0.0067657 36.5138 cl09577 THOC7 superfamily N - Tho complex subunit 7; The Tho complex is involved in transcription elongation and mRNA export from the nucleus. Q#17828 - CGI_10015392 superfamily 217939 124 302 3.77E-88 268.679 cl04432 TIP41 superfamily - - TIP41-like family; The TOR signalling pathway activates a cell-growth program in response to nutrients. 
TIP41 interacts with TAP42 and negatively regulates the TOR signaling pathway. Q#17828 - CGI_10015392 superfamily 217939 337 399 6.57E-26 103.043 cl04432 TIP41 superfamily N - TIP41-like family; The TOR signalling pathway activates a cell-growth program in response to nutrients. TIP41 interacts with TAP42 and negatively regulates the TOR signaling pathway. Q#17831 - CGI_10015395 superfamily 144618 53 103 0.000973543 35.1516 cl03092 PTN_MK_C superfamily - - "PTN/MK heparin-binding protein family, C-terminal domain; PTN/MK heparin-binding protein family, C-terminal domain. " Q#17831 - CGI_10015395 superfamily 144618 99 141 0.00302154 33.996 cl03092 PTN_MK_C superfamily C - "PTN/MK heparin-binding protein family, C-terminal domain; PTN/MK heparin-binding protein family, C-terminal domain. " Q#17832 - CGI_10015396 superfamily 246669 66 185 1.81E-77 233.762 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." 
Q#17832 - CGI_10015396 superfamily 243130 241 282 4.21E-13 62.1229 cl02655 CUE superfamily - - "CUE domain; CUE domains have been shown to bind ubiquitin. It has been suggested that CUE domains are related to pfam00627 and this has been confirmed by the structure of the domain. CUE domains also occur in two proteins of the IL-1 signal transduction pathway, tollip and TAB2." Q#17833 - CGI_10015397 superfamily 217645 96 364 3.08E-81 255.732 cl04187 DUF303 superfamily - - Domain of unknown function (DUF303); Distribution of this domain seems limited to prokaryotes and viruses. Q#17835 - CGI_10015400 superfamily 241754 10 557 0 604.259 cl00286 Motor_domain superfamily C - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. Q#17835 - CGI_10015400 superfamily 241577 714 880 4.23E-19 86.5946 cl00056 MH2 superfamily - - "C-terminal Mad Homology 2 (MH2) domain; The MH2 domain is found in the SMAD (small mothers against decapentaplegic) family of proteins and is responsible for type I receptor interactions, phosphorylation-triggered homo- and hetero-oligomerization, and transactivation. It is negatively regulated by the N-terminal MH1 domain which prevents it from forming a complex with SMAD4. The MH2 domain is multifunctional and provides SMADs with their specificity and selectivity, as well as transcriptional activity. Several transcriptional co-activators and repressors have also been reported to regulate SMAD signaling by interacting with the MH2 domain. Mutations in the MH2 domains of SMAD2 and especially SMAD4 have been detected in colorectal and other human cancers." Q#17836 - CGI_10015401 superfamily 246925 1258 1503 4.12E-18 86.2553 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. 
LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#17837 - CGI_10015402 superfamily 242909 67 228 3.70E-70 214.424 cl02156 PTPLA superfamily - - "Protein tyrosine phosphatase-like protein, PTPLA; This family includes the mammalian protein tyrosine phosphatase-like protein, PTPLA. A significant variation of PTPLA from other protein tyrosine phosphatases is the presence of proline instead of catalytic arginine at the active site. It is thought that PTPLA proteins have a role in the development, differentiation, and maintenance of a number of tissue types." Q#17839 - CGI_10015404 superfamily 243035 44 118 0.00027731 37.5994 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#17840 - CGI_10015405 superfamily 248098 45 76 2.45E-05 40.2961 cl17544 U-box superfamily C - U-box domain; This domain is related to the Ring finger pfam00097 but lacks the zinc binding residues. Q#17842 - CGI_10015407 superfamily 241599 102 160 2.79E-22 88.4544 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#17843 - CGI_10015408 superfamily 243072 22 129 1.66E-19 82.4314 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#17843 - CGI_10015408 superfamily 201217 196 245 3.72E-06 43.6684 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#17843 - CGI_10015408 superfamily 201217 144 193 0.00126965 36.3496 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#17848 - CGI_10027044 superfamily 241782 221 567 6.35E-148 433.937 cl00321 AAT_I superfamily - - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. 
PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phophorylase family (fold type V)." Q#17849 - CGI_10027045 superfamily 245206 60 282 9.45E-93 277.982 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. 
Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#17850 - CGI_10027046 superfamily 247044 8 119 1.80E-40 133.145 cl15697 ADF_gelsolin superfamily - - Actin depolymerization factor/cofilin- and gelsolin-like domains; Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. Q#17851 - CGI_10027047 superfamily 247727 37 89 3.06E-07 48.9655 cl17173 AdoMet_MTases superfamily C - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#17852 - CGI_10027048 superfamily 241571 65 161 1.78E-13 65.5114 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." 
Q#17852 - CGI_10027048 superfamily 241613 170 208 1.80E-07 47.2014 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#17856 - CGI_10027052 superfamily 128926 27 81 1.63E-21 81.526 cl02732 DM16 superfamily - - "Repeats in sea squirt COS41.4, worm R01H10.6, fly CG1126 etc; Repeats in sea squirt COS41.4, worm R01H10.6, fly CG1126 etc. " Q#17858 - CGI_10027054 superfamily 237847 1 216 8.47E-68 209.763 cl17178 PRK14879 superfamily - - serine/threonine protein kinase; Provisional Q#17859 - CGI_10027055 superfamily 241832 11 85 7.12E-13 62.9482 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). 
Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#17859 - CGI_10027055 superfamily 243175 165 275 5.08E-08 49.7551 cl02776 GST_C_family superfamily - - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. 
Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." Q#17860 - CGI_10027056 superfamily 148567 42 122 1.46E-16 69.9143 cl06182 Rab5ip superfamily - - Rab5-interacting protein (Rab5ip); This family consists of several Rab5-interacting protein (RIP5 or Rab5ip) sequences. The ras-related GTPase rab5 is rate-limiting for homotypic early endosome fusion. Rab5ip represents a novel rab5 interacting protein that may function on endocytic vesicles as a receptor for rab5-GDP and participate in the activation of rab5. Q#17861 - CGI_10027057 superfamily 243092 96 436 2.32E-30 118.592 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#17862 - CGI_10027058 superfamily 245599 149 368 1.68E-147 424.48 cl11397 NR_LBD superfamily - - "The ligand binding domain of nuclear receptors, a family of ligand-activated transcription regulators; Ligand-binding domain (LBD) of nuclear receptor (NR): Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions in metazoans, from development, reproduction, to homeostasis and metabolism. The superfamily contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. The members of the family include receptors of steroids, thyroid hormone, retinoids, cholesterol by-products, lipids and heme. With few exceptions, NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD)." Q#17862 - CGI_10027058 superfamily 207662 55 130 2.48E-52 173.144 cl02596 NR_DBD_like superfamily - - "DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers; DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. It interacts with a specific DNA site upstream of the target gene and modulates the rate of transcriptional initiation. Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions, from development, reproduction, to homeostasis and metabolism in animals (metazoans). The family contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). 
Most nuclear receptors bind as homodimers or heterodimers to their target sites, which consist of two hexameric half-sites. Specificity is determined by the half-site sequence, the relative orientation of the half-sites and the number of spacer nucleotides between the half-sites. However, a growing number of nuclear receptors have been reported to bind to DNA as monomers." Q#17863 - CGI_10027059 superfamily 241782 58 279 1.56E-27 108.968 cl00321 AAT_I superfamily C - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phophorylase family (fold type V)." 
Q#17864 - CGI_10027060 superfamily 221913 1584 1794 2.04E-56 196.607 cl18626 AAA_12 superfamily - - AAA domain; This family of domains contain a P-loop motif that is characteristic of the AAA superfamily. Many of the proteins in this family are conjugative transfer proteins. Q#17864 - CGI_10027060 superfamily 216112 668 1026 7.52E-39 149.754 cl02964 RNB superfamily - - RNB domain; This domain is the catalytic domain of ribonuclease II. Q#17864 - CGI_10027060 superfamily 222258 1529 1573 3.15E-05 45.2516 cl18656 AAA_30 superfamily NC - AAA domain; This family of domains contain a P-loop motif that is characteristic of the AAA superfamily. Many of the proteins in this family are conjugative transfer proteins. There is a Walker A and Walker B. Q#17864 - CGI_10027060 superfamily 222005 1353 1412 4.11E-05 43.88 cl18632 AAA_19 superfamily C - Part of AAA domain; Part of AAA domain. Q#17866 - CGI_10027062 superfamily 115363 289 337 3.55E-11 58.1522 cl05972 MIB_HERC2 superfamily - - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). Q#17868 - CGI_10027064 superfamily 115363 4 21 0.00427234 32.729 cl05972 MIB_HERC2 superfamily C - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). Q#17869 - CGI_10027065 superfamily 115363 172 222 8.91E-12 60.0781 cl05972 MIB_HERC2 superfamily - - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). Q#17871 - CGI_10027067 superfamily 241578 10 77 4.57E-07 48.9237 cl00057 vWFA superfamily C - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. 
The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and immune defenses. In integrins these domains form heterodimers while in vWF they form multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#17872 - CGI_10027068 superfamily 115363 185 225 1.29E-06 45.4406 cl05972 MIB_HERC2 superfamily C - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). Q#17872 - CGI_10027068 superfamily 207713 302 366 0.000481515 37.7046 cl02729 WWE superfamily - - WWE domain; The WWE domain is named after three of its conserved residues and is predicted to mediate specific protein-protein interactions in ubiquitin and ADP ribose conjugation systems. Q#17874 - CGI_10027070 superfamily 241563 13 44 0.00123243 36.6884 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#17875 - CGI_10027071 superfamily 247792 21 71 0.000156157 36.6548 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#17879 - CGI_10027075 superfamily 247725 106 265 2.85E-86 268.356 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting the protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and other proteins." Q#17879 - CGI_10027075 superfamily 206020 304 364 1.98E-15 71.385 cl18286 Y_phosphatase_m superfamily - - "Myotubularin Y_phosphatase-like; This short region is highly conserved and seems to be common to many myotubularin proteins with protein tyrosine pyrophosphate activity. As the family has a number of highly conserved residues such as histidine, cysteine, glutamine and aspartate, it is possible that this represents a catalytic core of the active enzymatic part of the proteins." 
Q#17881 - CGI_10027077 superfamily 241758 10 373 7.65E-167 474.318 cl00292 AANH_like superfamily - - "Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases, ATP sulphurylases, Universal Stress Response protein and electron transfer flavoprotein (ETF). The domain forms an alpha/beta/alpha fold which binds to Adenosine nucleotide." Q#17882 - CGI_10027078 superfamily 205524 43 145 1.27E-20 85.9408 cl17723 Hydrolase_6 superfamily - - Haloacid dehalogenase-like hydrolase; This family is part of the HAD superfamily. Q#17883 - CGI_10027079 superfamily 247941 82 231 5.86E-17 75.0648 cl17387 Methyltransf_21 superfamily - - "Methyltransferase FkbM domain; This family has members from bacteria to human, and appears to be a methyltransferase." Q#17884 - CGI_10027080 superfamily 247941 15 164 7.07E-15 68.1312 cl17387 Methyltransf_21 superfamily - - "Methyltransferase FkbM domain; This family has members from bacteria to human, and appears to be a methyltransferase." Q#17885 - CGI_10027081 superfamily 243859 5 97 1.09E-18 75.059 cl04722 PLAC8 superfamily - - PLAC8 family; This family includes the Placenta-specific gene 8 protein. Q#17886 - CGI_10027082 superfamily 247746 8935 9062 5.54E-10 62.0358 cl17192 ATP-synt_B superfamily - - "ATP synthase B/B' CF(0); Part of the CF(0) (base unit) of the ATP synthase. The base unit is thought to translocate protons through membrane (inner membrane in mitochondria, thylakoid membrane in plants, cytoplasmic membrane in bacteria). The B subunits are thought to interact with the stalk of the CF(1) subunits. This domain should not be confused with the ab CF(1) proteins (in the head of the ATP synthase) which are found in pfam00006" Q#17891 - CGI_10027087 superfamily 247900 56 550 1.08E-146 436.344 cl17346 Trehalase superfamily - - Trehalase; Trehalase (EC:3.2.1.28) is known to recycle trehalose to glucose. 
Trehalose is a physiological hallmark of heat-shock response in yeast and protects of proteins and membranes against a variety of stresses. This family is found in conjunction with pfam07492 in fungi. Q#17894 - CGI_10027090 superfamily 245814 157 237 0.00695018 35.4795 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#17897 - CGI_10027093 superfamily 243092 90 383 6.30E-21 92.7832 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the 
propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#17898 - CGI_10027094 superfamily 243092 355 670 5.43E-21 93.5536 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#17899 - CGI_10027095 superfamily 203428 8 319 5.26E-94 299.757 cl05703 HSL_N superfamily - - "Hormone-sensitive lipase (HSL) N-terminus; This family consists of several mammalian hormone-sensitive lipase (HSL) proteins (EC:3.1.1.-). Hormone-sensitive lipase, a key enzyme in fatty acid mobilisation, overall energy homeostasis, and possibly steroidogenesis, is acutely controlled through reversible phosphorylation by catecholamines and insulin." 
Q#17899 - CGI_10027095 superfamily 220701 325 425 2.42E-05 45.9823 cl18572 DUF2424 superfamily NC - Protein of unknown function (DUF2424); This is a family of proteins conserved in yeasts. The function is not known. Q#17900 - CGI_10027096 superfamily 245201 267 467 5.53E-41 147.768 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#17900 - CGI_10027096 superfamily 243113 231 259 4.81E-12 61.3574 cl02621 TGF_beta_GS superfamily - - Transforming growth factor beta type I GS-motif; This motif is found in the transforming growth factor beta (TGF-beta) type I which regulates cell growth and differentiation. The name of the GS motif comes from its highly conserved GSGSGLP signature in the cytoplasmic juxtamembrane region immediately preceding the protein's kinase domain. Point mutations in the GS motif modify the signaling ability of the type I receptor. Q#17900 - CGI_10027096 superfamily 245201 387 554 0.000160766 42.68 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." 
Q#17900 - CGI_10027096 superfamily 245309 73 151 0.0017409 37.1121 cl10471 LU superfamily - - "Ly-6 antigen / uPA receptor -like domain; occurs singly in GPI-linked cell-surface glycoproteins (Ly-6 family,CD59, thymocyte B cell antigen, Sgp-2) or as three-fold repeated domain in urokinase-type plasminogen activator receptor. Topology of these domains is similar to that of snake venom neurotoxins." Q#17905 - CGI_10027101 superfamily 245213 117 148 3.73E-05 42.2386 cl09941 EGF_CA superfamily N - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#17905 - CGI_10027101 superfamily 243092 637 913 2.78E-13 70.0564 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#17905 - CGI_10027101 superfamily 246680 535 592 0.00232116 37.5508 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 
They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#17906 - CGI_10027102 superfamily 243091 199 237 1.38E-06 45.5608 cl02566 SET superfamily N - "SET domain; SET domains are protein lysine methyltransferase enzymes. SET domains appear to be protein-protein interaction domains. It has been demonstrated that SET domains mediate interactions with a family of proteins that display similarity with dual-specificity phosphatases (dsPTPases). A subset of SET domains have been called PR domains. These domains are divergent in sequence from other SET domains, but also appear to mediate protein-protein interaction. The SET domain consists of two regions known as SET-N and SET-C. SET-C forms an unusual and conserved knot-like structure of probably functional importance. Additionally to SET-N and SET-C, an insert region (SET-I) and flanking regions of high structural variability form part of the overall structure." Q#17908 - CGI_10027104 superfamily 241550 1 22 3.19E-06 46.4181 cl00015 nt_trans superfamily NC - "nucleotidyl transferase superfamily; nt_trans (nucleotidyl transferase) This superfamily includes the class I amino-acyl tRNA synthetases, pantothenate synthetase (PanC), ATP sulfurylase, and the cytidylyltransferases, all of which have a conserved dinucleotide-binding domain." Q#17908 - CGI_10027104 superfamily 245839 150 202 0.00260001 36.3138 cl12020 Anticodon_Ia_like superfamily NC - "Anticodon-binding domain of class Ia aminoacyl tRNA synthetases and similar domains; This domain is found in a variety of class Ia aminoacyl tRNA synthetases, C-terminal to the catalytic core domain. It recognizes and specifically binds to the anticodon of the tRNA. 
Aminoacyl tRNA synthetases catalyze the transfer of cognate amino acids to the 3'-end of their tRNAs by specifically recognizing cognate from non-cognate amino acids. Members include valyl-, leucyl-, isoleucyl-, cysteinyl-, arginyl-, and methionyl-tRNA synthethases. This superfamily also includes a domain from MshC, an enzyme in the mycothiol biosynthetic pathway." Q#17909 - CGI_10027105 superfamily 219574 371 561 2.74E-29 116.227 cl06698 DC_STAMP superfamily - - "DC-STAMP-like protein; This is a family of sequences which are similar to a region of the dendritic cell-specific transmembrane protein (DC-STAMP). This is thought to be a novel receptor protein that shares no identity with other multimembrane-spanning proteins. It is thought to have seven putative transmembrane regions, two of which are found in the region featured in this family. DC-STAMP is also described as having potential N-linked glycosylation sites and a potential phosphorylation site for PKC, but these are not conserved throughout the family." Q#17911 - CGI_10027107 superfamily 242889 230 325 1.55E-10 57.2278 cl02111 PCI superfamily - - "PCI domain; This domain has also been called the PINT motif (Proteasome, Int-6, Nip-1 and TRIP-15)." Q#17912 - CGI_10027108 superfamily 198738 653 743 1.81E-25 102.731 cl02599 Ets superfamily - - Ets-domain; Ets-domain. Q#17912 - CGI_10027108 superfamily 243066 22 121 5.89E-12 64.1757 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. 
The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#17912 - CGI_10027108 superfamily 248312 845 959 0.00207986 39.2589 cl17758 PMP22_Claudin superfamily N - PMP-22/EMP/MP20/Claudin family; PMP-22/EMP/MP20/Claudin family. Q#17913 - CGI_10027109 superfamily 247057 603 654 9.26E-06 44.1525 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#17913 - CGI_10027109 superfamily 245596 18 206 1.19E-67 226.035 cl11394 Glyco_tranf_GTA_type superfamily N - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. 
Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#17913 - CGI_10027109 superfamily 247057 533 592 6.55E-06 44.9817 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#17916 - CGI_10027112 superfamily 241564 82 150 3.29E-21 82.6987 cl00035 BIR superfamily - - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. 
This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." Q#17919 - CGI_10027115 superfamily 245226 48 216 5.63E-26 99.6824 cl10012 DnaQ_like_exo superfamily - - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#17921 - CGI_10027118 superfamily 247999 38 82 1.18E-09 49.5178 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. 
Q#17922 - CGI_10027120 superfamily 246680 2 75 3.18E-05 37.6662 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#17925 - CGI_10027124 superfamily 216709 132 280 5.62E-72 224.463 cl03357 Nop superfamily - - Putative snoRNA binding domain; This family consists of various Pre RNA processing ribonucleoproteins. The function of the aligned region is unknown however it may be a common RNA or snoRNA or Nop1p binding domain. Nop5p (Nop58p) from yeast is the protein component of a ribonucleoprotein protein required for pre-18s rRNA processing and is suggested to function with Nop1p in a snoRNA complex. Nop56p and Nop5p interact with Nop1p and are required for ribosome biogenesis. Prp31p is required for pre-mRNA splicing in S. cerevisiae. Q#17925 - CGI_10027124 superfamily 208568 37 88 2.19E-27 102.585 cl06890 NOSIC superfamily - - NOSIC (NUC001) domain; This is the central domain in Nop56/SIK1-like proteins. Q#17926 - CGI_10002384 superfamily 241609 18 85 2.81E-27 101.686 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. 
They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#17926 - CGI_10002384 superfamily 241609 94 162 4.97E-21 84.7371 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#17927 - CGI_10002385 superfamily 241609 446 520 5.49E-28 109.005 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#17927 - CGI_10002385 superfamily 241609 528 606 4.96E-26 103.227 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#17927 - CGI_10002385 superfamily 216897 721 799 2.36E-26 103.915 cl03463 Gal_Lectin superfamily - - Galactose binding lectin domain; Galactose binding lectin domain. 
Q#17927 - CGI_10002385 superfamily 243093 272 353 3.56E-17 77.9558 cl02568 WSC superfamily - - WSC domain; This domain may be involved in carbohydrate binding. Q#17927 - CGI_10002385 superfamily 243093 173 250 2.90E-16 75.2594 cl02568 WSC superfamily - - WSC domain; This domain may be involved in carbohydrate binding. Q#17927 - CGI_10002385 superfamily 243093 73 151 3.10E-15 72.563 cl02568 WSC superfamily - - WSC domain; This domain may be involved in carbohydrate binding. Q#17927 - CGI_10002385 superfamily 243093 1 57 3.16E-11 60.6218 cl02568 WSC superfamily N - WSC domain; This domain may be involved in carbohydrate binding. Q#17928 - CGI_10002386 superfamily 243093 152 230 1.60E-15 72.9482 cl02568 WSC superfamily - - WSC domain; This domain may be involved in carbohydrate binding. Q#17928 - CGI_10002386 superfamily 243093 546 624 1.68E-15 72.9482 cl02568 WSC superfamily - - WSC domain; This domain may be involved in carbohydrate binding. Q#17928 - CGI_10002386 superfamily 243093 53 131 2.10E-15 72.563 cl02568 WSC superfamily - - WSC domain; This domain may be involved in carbohydrate binding. Q#17928 - CGI_10002386 superfamily 243093 639 717 2.13E-13 66.785 cl02568 WSC superfamily - - WSC domain; This domain may be involved in carbohydrate binding. Q#17928 - CGI_10002386 superfamily 243093 345 432 2.75E-12 64.0285 cl02568 WSC superfamily - - WSC domain; This domain may be involved in carbohydrate binding. Q#17928 - CGI_10002386 superfamily 243093 447 525 3.46E-12 63.3182 cl02568 WSC superfamily - - WSC domain; This domain may be involved in carbohydrate binding. Q#17928 - CGI_10002386 superfamily 243093 246 324 1.54E-10 58.6958 cl02568 WSC superfamily - - WSC domain; This domain may be involved in carbohydrate binding. 
Q#17933 - CGI_10009400 superfamily 247725 441 568 8.71E-45 159.723 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#17933 - CGI_10009400 superfamily 243096 231 423 9.29E-35 132.807 cl02571 RhoGEF superfamily - - Guanine nucleotide exchange factor for Rho/Rac/Cdc42-like GTPases; Also called Dbl-homologous (DH) domain. It appears that PH domains invariably occur C-terminal to RhoGEF/DH domains. Q#17933 - CGI_10009400 superfamily 241559 28 120 1.18E-13 69.2619 cl00030 CH superfamily - - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." Q#17933 - CGI_10009400 superfamily 241566 573 622 3.85E-07 49.0276 cl00040 C1 superfamily - - "Protein kinase C conserved region 1 (C1) . Cysteine-rich zinc binding domain. Some members of this domain family bind phorbol esters and diacylglycerol, some are reported to bind RasGTP. May occur in tandem arrangement. Diacylglycerol (DAG) is a second messenger, released by activation of Phospholipase D. Phorbol Esters (PE) can act as analogues of DAG and mimic its downstream effects in, for example, tumor promotion. 
Protein Kinases C are activated by DAG/PE, this activation is mediated by their N-terminal conserved region (C1). DAG/PE binding may be phospholipid dependent. C1 domains may also mediate DAG/PE signals in chimaerins (a family of Rac GTPase activating proteins), RasGRPs (exchange factors for Ras/Rap1), and Munc13 isoforms (scaffolding proteins involved in exocytosis)." Q#17933 - CGI_10009400 superfamily 246908 737 885 4.52E-30 116.624 cl15255 SH2 superfamily - - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." Q#17933 - CGI_10009400 superfamily 247683 628 683 1.58E-10 58.6325 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. 
SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#17935 - CGI_10009402 superfamily 243146 99 147 3.16E-07 47.2839 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#17935 - CGI_10009402 superfamily 243146 238 276 2.61E-06 44.5251 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#17935 - CGI_10009402 superfamily 243146 201 250 4.42E-05 41.1207 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#17935 - CGI_10009402 superfamily 243146 138 192 7.72E-05 40.5093 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#17935 - CGI_10009402 superfamily 243146 47 93 0.000846225 37.2687 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. 
" Q#17936 - CGI_10009404 superfamily 243092 218 508 2.99E-35 138.622 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#17936 - CGI_10009404 superfamily 243056 1071 1290 3.96E-63 217.174 cl02495 RabGAP-TBC superfamily - - "Rab-GTPase-TBC domain; Identification of a TBC domain in GYP6_YEAST and GYP7_YEAST, which are GTPase activator proteins of yeast Ypt6 and Ypt7, implies that these domains are GTPase activator proteins of Rab-like small GTPases." Q#17936 - CGI_10009404 superfamily 247725 548 635 2.55E-22 95.426 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. 
They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#17937 - CGI_10009405 superfamily 241600 548 752 1.55E-74 242.531 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#17938 - CGI_10009406 superfamily 248458 24 139 2.66E-09 57.3237 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. 
The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#17938 - CGI_10009406 superfamily 248458 339 486 0.000296839 41.5305 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. 
Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#17940 - CGI_10018093 superfamily 241976 1 110 2.23E-28 101.45 cl00606 Archease superfamily N - "Archease protein family (MTH1598/TM1083); This archease family of proteins, has two SHS2 domains, with one inserted into another. It is predicted to be an enzyme. It is predicted to act as a chaperone in DNA/RNA metabolism." Q#17942 - CGI_10018095 superfamily 216686 137 323 6.56E-36 130.521 cl18377 Galactosyl_T superfamily - - "Galactosyltransferase; This family includes the galactosyltransferases UDP-galactose:2-acetamido-2-deoxy-D-glucose3beta-galactosyltransferase and UDP-Gal:beta-GlcNAc beta 1,3-galactosyltranferase. Specific galactosyltransferases transfer galactose to GlcNAc terminal chains in the synthesis of the lacto-series oligosaccharides types 1 and 2." Q#17943 - CGI_10018096 superfamily 241563 269 301 1.67E-05 43.0447 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#17943 - CGI_10018096 superfamily 243176 20 210 1.35E-106 337.722 cl02777 chaperonin_like superfamily C - "chaperonin_like superfamily. Chaperonins are involved in productive folding of proteins. 
They share a common general morphology, a double toroid of 2 stacked rings, each composed of 7-9 subunits. There are 2 main chaperonin groups. The symmetry of type I is seven-fold and they are found in eubacteria (GroEL) and in organelles of eubacterial descent (hsp60 and RBP). The symmetry of type II is eight- or nine-fold and they are found in archaea (thermosome), thermophilic bacteria (TF55) and in the eukaryotic cytosol (CTT). Their common function is to sequester nonnative proteins inside their central cavity and promote folding by using energy derived from ATP hydrolysis. This superfamily also contains related domains from Fab1-like phosphatidylinositol 3-phosphate (PtdIns3P) 5-kinases that only contain the intermediate and apical domains." Q#17943 - CGI_10018096 superfamily 245010 308 402 0.000183824 40.6791 cl09111 Prefoldin superfamily - - "Prefoldin is a hexameric molecular chaperone complex, found in both eukaryotes and archaea, that binds and stabilizes newly synthesized polypeptides allowing them to fold correctly. The complex contains two alpha and four beta subunits, the two subunits being evolutionarily related. In archaea, there is usually only one gene for each subunit while in eukaryotes there are two or more paralogous genes encoding each subunit adding heterogeneity to the structure of the hexamer. The structure of the complex consists of a double beta barrel assembly with six protruding coiled-coils." Q#17943 - CGI_10018096 superfamily 241563 214 245 0.00974717 35.0055 cl00034 BBOX superfamily C - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#17944 - CGI_10018097 superfamily 243092 3 293 3.97E-28 113.969 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#17947 - CGI_10018100 superfamily 241687 32 122 4.18E-19 79.7929 cl00208 RNase_T2 superfamily N - "Ribonuclease T2 (RNase T2) is a widespread family of secreted RNases found in every organism examined thus far. This family includes RNase Rh, RNase MC1, RNase LE, and self-incompatibility RNases (S-RNases). Plant T2 RNases are expressed during leaf senescence in order to scavenge phosphate from ribonucleotides. They are also expressed in response to wounding or pathogen invasion. S-RNases are thought to prevent self-fertilization by acting as selective cytotoxins of "self" pollen." 
Q#17948 - CGI_10018101 superfamily 241687 28 67 5.50E-07 42.8137 cl00208 RNase_T2 superfamily C - "Ribonuclease T2 (RNase T2) is a widespread family of secreted RNases found in every organism examined thus far. This family includes RNase Rh, RNase MC1, RNase LE, and self-incompatibility RNases (S-RNases). Plant T2 RNases are expressed during leaf senescence in order to scavenge phosphate from ribonucleotides. They are also expressed in response to wounding or pathogen invasion. S-RNases are thought to prevent self-fertilization by acting as selective cytotoxins of "self" pollen." Q#17950 - CGI_10018105 superfamily 243035 20 131 4.51E-30 112.713 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. 
Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#17950 - CGI_10018105 superfamily 246918 254 305 1.17E-12 62.9895 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#17950 - CGI_10018105 superfamily 246918 140 192 2.32E-10 56.4411 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#17950 - CGI_10018105 superfamily 246918 311 363 6.40E-10 55.2855 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#17950 - CGI_10018105 superfamily 246918 197 249 8.51E-10 54.9003 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. 
Q#17951 - CGI_10018106 superfamily 217280 1 190 7.10E-43 148.508 cl14953 Fe_hyd_lg_C superfamily N - "Iron only hydrogenase large subunit, C-terminal domain; Iron only hydrogenase large subunit, C-terminal domain. " Q#17951 - CGI_10018106 superfamily 243441 203 251 3.66E-09 51.0997 cl03503 Fe_hyd_SSU superfamily - - Iron hydrogenase small subunit; This family represents the small subunit of the Fe-only hydrogenases EC:1.18.99.1. The subunit is comprised of alternating random coil and alpha helical structures that encompasses the large subunit in a novel protein fold. Q#17952 - CGI_10018107 superfamily 243092 158 397 1.06E-30 119.362 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#17952 - CGI_10018107 superfamily 221499 54 124 4.70E-13 64.5501 cl13671 CAF1C_H4-bd superfamily - - Histone-binding protein RBBP4 or subunit C of CAF1 complex; The CAF-1 complex is a conserved heterotrimeric protein complex that promotes histone H3 and H4 deposition onto newly synthesized DNA during replication or DNA repair; specifically it facilitates replication-dependent nucleosome assembly with the major histone H3 (H3.1). This domain is an alpha helix which sits just upstream of the WD40 seven-bladed beta-propeller in the human RbAp46 protein. RbAp46 folds into the beta-propeller and binds histone H4 in a groove formed between this N-terminal helix and an extended loop inserted into blade six. Q#17957 - CGI_10018112 superfamily 245201 456 716 7.82E-167 484.43 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#17957 - CGI_10018112 superfamily 241570 295 409 8.90E-27 106.256 cl00047 CAP_ED superfamily - - "effector domain of the CAP family of transcription factors; members include CAP (or cAMP receptor protein (CRP)), which binds cAMP, FNR (fumarate and nitrate reduction), which uses an iron-sulfur cluster to sense oxygen) and CooA, a heme containing CO sensor. In all cases binding of the effector leads to conformational changes and the ability to activate transcription. Cyclic nucleotide-binding domain similar to CAP are also present in cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) and vertebrate cyclic nucleotide-gated ion-channels. 
Cyclic nucleotide-monophosphate binding domain; proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues; the best studied is the prokaryotic catabolite gene activator, CAP, where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure; three conserved glycine residues are thought to be essential for maintenance of the structural integrity of the beta-barrel; CooA is a homodimeric transcription factor that belongs to CAP family; cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain; cAPK's are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain; cGPK's are single chain enzymes that include the two copies of the domain in their N-terminal section; also found in vertebrate cyclic nucleotide-gated ion-channels" Q#17957 - CGI_10018112 superfamily 241570 199 278 1.49E-17 79.6774 cl00047 CAP_ED superfamily C - "effector domain of the CAP family of transcription factors; members include CAP (or cAMP receptor protein (CRP)), which binds cAMP, FNR (fumarate and nitrate reduction), which uses an iron-sulfur cluster to sense oxygen) and CooA, a heme containing CO sensor. In all cases binding of the effector leads to conformational changes and the ability to activate transcription. Cyclic nucleotide-binding domain similar to CAP are also present in cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) and vertebrate cyclic nucleotide-gated ion-channels. 
Cyclic nucleotide-monophosphate binding domain; proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues; the best studied is the prokaryotic catabolite gene activator, CAP, where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure; three conserved glycine residues are thought to be essential for maintenance of the structural integrity of the beta-barrel; CooA is a homodimeric transcription factor that belongs to CAP family; cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain; cAPK's are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain; cGPK's are single chain enzymes that include the two copies of the domain in their N-terminal section; also found in vertebrate cyclic nucleotide-gated ion-channels" Q#17957 - CGI_10018112 superfamily 213458 69 121 0.000197166 39.8588 cl17044 DD_cGKI superfamily - - "Dimerization/Docking domain of Cyclic GMP-dependent Protein Kinase I; Cyclic GMP-dependent Protein Kinase I (PKG1 or cGKI) is a Serine/Threonine Kinase (STK), catalyzing the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. cGKI exists as two splice variants, cGKI-alpha and cGKI-beta. They contain an N-terminal regulatory domain containing a dimerization/docking region and an autoinhibitory pseudosubstrate region, two cGMP-binding domains, and a C-terminal catalytic domain. Binding of cGMP to both binding sites releases the inhibition of the catalytic center by the pseudosubstrate region, allowing autophosphorylation and activation of the kinase. cGKI is a soluble protein expressed in all smooth muscles, platelets, cerebellum, and kidney. It is also expressed at lower concentrations in other tissues. 
It is involved in the regulation of smooth muscle tone, smooth cell proliferation, and platelet activation. The dimerization/docking (D/D) domain is a leucine/isoleucine zipper that mediates both homodimerization and interaction with isotype-specific G-kinase-anchoring proteins (GKAPs). The D/D domain of the two variants (alpha and beta) differ, allowing their targeting to different subcellular compartments and intracellular substrates." Q#17957 - CGI_10018112 superfamily 245597 711 760 5.07E-10 56.5999 cl11395 Pkinase_C superfamily - - Protein kinase C terminal domain; Protein kinase C terminal domain. Q#17959 - CGI_10018114 superfamily 241874 14 474 1.34E-164 480.478 cl00456 SLC5-6-like_sbd superfamily - - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." 
Q#17960 - CGI_10018115 superfamily 247707 304 527 1.84E-119 354.643 cl17107 PMM superfamily - - Eukaryotic phosphomannomutase; This enzyme EC:5.4.2.8 is involved in the synthesis of the GDP-mannose and dolichol-phosphate-mannose required for a number of critical mannosyl transfer reactions. Q#17961 - CGI_10018116 superfamily 198867 160 259 5.24E-25 99.7232 cl06652 BACK superfamily - - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation)." Q#17961 - CGI_10018116 superfamily 243066 48 152 5.96E-25 99.6141 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#17961 - CGI_10018116 superfamily 243146 385 427 3.96E-08 50.5245 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. 
" Q#17961 - CGI_10018116 superfamily 243146 350 393 6.50E-05 41.1207 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#17962 - CGI_10018117 superfamily 247792 320 370 1.86E-07 47.3744 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#17964 - CGI_10018119 superfamily 246748 116 415 8.67E-133 390.98 cl14876 Zinc_peptidase_like superfamily - - "Zinc peptidases M18, M20, M28, and M42; Zinc peptidases play vital roles in metabolic and signaling pathways throughout all kingdoms of life. This family corresponds to several clans in the MEROPS database, including the MH clan, which contains 4 families (M18, M20, M28, M42). The peptidase M20 family includes carboxypeptidases such as the glutamate carboxypeptidase from Pseudomonas, the thermostable carboxypeptidase Ss1 of broad specificity from archaea and yeast Gly-X carboxypeptidase. The dipeptidases include bacterial dipeptidase, peptidase V (PepV), a eukaryotic, non-specific dipeptidase, and two Xaa-His dipeptidases (carnosinases). There is also the bacterial aminopeptidase, peptidase T (PepT) that acts only on tripeptide substrates and has therefore been termed a tripeptidase. Peptidase family M28 contains aminopeptidases and carboxypeptidases, and has co-catalytic zinc ions. 
However, several enzymes in this family utilize other first row transition metal ions such as cobalt and manganese. Each zinc ion is tetrahedrally co-ordinated, with three amino acid ligands plus activated water; one aspartate residue binds both metal ions. The aminopeptidases in this family are also called bacterial leucyl aminopeptidases, but are able to release a variety of N-terminal amino acids. IAP aminopeptidase and aminopeptidase Y preferentially release basic amino acids while glutamate carboxypeptidase II preferentially releases C-terminal glutamates. Glutamate carboxypeptidase II and plasma glutamate carboxypeptidase hydrolyze dipeptides. Peptidase families M18 and M42 contain metalloaminopeptidases. M18 is widely distributed in bacteria and eukaryotes. However, only yeast aminopeptidase I and mammalian aspartyl aminopeptidase have been characterized in detail. Some of M42 (also known as glutamyl aminopeptidase) enzymes exhibit aminopeptidase specificity while others also have acylaminoacylpeptidase activity (i.e. hydrolysis of acylated N-terminal residues)." Q#17969 - CGI_10018124 superfamily 213107 54 85 0.0005712 33.7583 cl02594 DD_R_PKA superfamily - - "Dimerization/Docking domain of the Regulatory subunit of cAMP-dependent protein kinase and similar domains; cAMP-dependent protein kinase (PKA) is a serine/threonine kinase (STK), catalyzing the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The inactive PKA holoenzyme is a heterotetramer composed of two phosphorylated and active catalytic subunits with a dimer of regulatory (R) subunits. Activation is achieved through the binding of the important second messenger cAMP to the R subunits, which leads to the dissociation of PKA into the R dimer and two active subunits. There are two classes of R subunits, RI and RII; each exists as two isoforms (alpha and beta) from distinct genes. 
These functionally non-redundant R isoforms allow for specificity in PKA signaling. The R subunit contains an N-terminal dimerization/docking (D/D) domain, a linker with an inhibitory sequence (IS), and two c-AMP binding domains. RI and RII subunits are distinguished by their IS; RII subunits contain a phosphorylation site and are both substrates and inhibitors while RI subunits are pseudo-substrates. RI subunits require ATP and Mg ions to form a stable holoenzyme while RII subunits do not. The D/D domain dimerizes to form a four-helix bundle that serves as a docking site for A-kinase-anchoring proteins (AKAPs), which facilitates the localization of PKA to specific sites in the cell. PKA is present ubiquitously in cells and interacts with many different downstream targets. It plays a role in the regulation of diverse processes such as growth, development, memory, metabolism, gene expression, immunity, and lipolysis." Q#17971 - CGI_10018126 superfamily 243269 15 408 2.90E-76 247.566 cl03012 Ammonium_transp superfamily - - Ammonium Transporter Family; Ammonium Transporter Family. Q#17972 - CGI_10018127 superfamily 248458 113 375 5.69E-13 68.4945 cl17904 MFS superfamily - - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. 
The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#17973 - CGI_10018128 superfamily 247743 153 238 1.70E-11 61.7783 cl17189 AAA superfamily C - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#17973 - CGI_10018128 superfamily 209247 366 453 3.24E-15 71.3207 cl11083 ClpB_D2-small superfamily - - "C-terminal, D2-small domain, of ClpB protein; This is the C-terminal domain of ClpB protein, referred to as the D2-small domain, and is a mixed alpha-beta structure. 
Compared with the D1-small domain (included in AAA, pfam00004) it lacks the long coiled-coil insertion, and instead of helix C4 contains a beta-strand (e3) that is part of a three stranded beta-pleated sheet. In Thermophilus the whole protein forms a hexamer with the D1-small and D2-small domains located on the outside of the hexamer, with the long coiled-coil being exposed on the surface. The D2-small domain is essential for oligomerisation, forming a tight interface with the D2-large domain of a neighboring subunit and thereby providing enough binding energy to stabilise the functional assembly. The domain is associated with two Clp_N, pfam02861, at the N-terminus as well as AAA, pfam00004 and AAA_2, pfam07724." Q#17974 - CGI_10018129 superfamily 221175 1 118 1.09E-20 86.3455 cl13200 RNA_pol_3_Rpc31 superfamily C - "DNA-directed RNA polymerase III subunit Rpc31; RNA polymerase III contains seventeen subunits in yeasts and in human cells. Twelve of these are akin to RNA polymerase I or II and the other five are RNA pol III-specific, and form the functionally distinct groups (i) Rpc31-Rpc34-Rpc82, and (ii) Rpc37-Rpc53. Rpc31, Rpc34 and Rpc82 form a cluster of enzyme-specific subunits that contribute to transcription initiation in S.cerevisiae and H.sapiens. There is evidence that these subunits are anchored at or near the N-terminal Zn-fold of Rpc1, itself prolonged by a highly conserved but RNA polymerase III-specific domain." Q#17975 - CGI_10018130 superfamily 241643 279 317 1.25E-06 44.7575 cl00153 UBA superfamily - - "Ubiquitin Associated domain. The UBA domain is a commonly occurring sequence motif in some members of the ubiquitination pathway, UV excision repair proteins, and certain protein kinases. Although its specific role is so far unknown, it has been suggested that UBA domains are involved in conferring protein target specificity. The domain, a compact three helix bundle, has a conserved GFP-loop and the proline is thought to be critical for binding. 
The UBA domain is distinct from the conserved three helical domain seen in the N-terminus of EF-TS and eukaryotic NAC proteins." Q#17975 - CGI_10018130 superfamily 241645 35 64 6.58E-07 46.0658 cl00155 UBQ superfamily N - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#17977 - CGI_10003770 superfamily 247792 127 167 1.28E-05 39.7364 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#17977 - CGI_10003770 superfamily 148004 46 90 7.24E-08 46.5489 cl05589 Ifi-6-16 superfamily NC - Interferon-induced 6-16 family; Interferon-induced 6-16 family. 
Q#17978 - CGI_10012657 superfamily 221377 257 300 6.94E-05 41.6855 cl13449 DUF3504 superfamily C - Domain of unknown function (DUF3504); This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 156 to 173 amino acids in length. Q#17979 - CGI_10012659 superfamily 245847 166 208 3.94E-13 65.0661 cl12042 FA58C superfamily N - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#17983 - CGI_10012663 superfamily 245384 75 165 1.58E-30 109.285 cl10767 AD superfamily - - "Anticodon-binding domain; This domain of approximately 100 residues is conserved from plants to humans. It is frequently found in association with Lsm domain-containing proteins. It is an anticodon-binding domain of a prolyl-tRNA synthetase, whose PDB structure is available under the identifier 1h4q." Q#17983 - CGI_10012663 superfamily 241733 7 67 1.90E-23 89.7262 cl00259 Sm_like superfamily - - "Sm and related proteins; The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes." Q#17984 - CGI_10006520 superfamily 242025 41 132 6.82E-26 95.7473 cl00682 Alba superfamily - - Alba; Alba is a novel chromosomal protein that coats archaeal DNA without compacting it. 
Q#17985 - CGI_10006521 superfamily 242047 856 957 2.96E-29 113.835 cl00720 DUF296 superfamily - - "Domain of unknown function found in archaea, bacteria, and plants; This domain is found in proteins that contain AT-hook motifs, which suggests a role in DNA-binding for the proteins as a whole. Three conserved histidine residues appear to form a zinc-binding site, and the domain has been observed to form homotrimers. It co-occurs with a thioredoxin-like domain in uncharacterized cyanobacterial proteins." Q#17985 - CGI_10006521 superfamily 218535 704 838 3.16E-45 163.14 cl05037 Secretogranin_V superfamily N - "Neuroendocrine protein 7B2 precursor (Secretogranin V); The neuroendocrine protein 7B2 has a critical role in the proteolytic conversion and activation of proPC2, the enzyme responsible for the proteolytic conversion of many peptide hormone precursors. The 7B2 protein acts as an intracellular binding protein for proPC2, facilitates its maturation, and is required for its enzymatic activity. Processing of many important peptide precursors does not occur in 7B2 nulls. 7B2 null mice exhibit a unique form of Cushing's disease with many atypical symptoms, such as hypoglycemia." 
Q#17985 - CGI_10006521 superfamily 243092 389 608 1.15E-20 93.1684 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#17985 - CGI_10006521 superfamily 243092 50 141 2.96E-08 55.0336 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#17985 - CGI_10006521 superfamily 243092 350 418 2.27E-07 52.3372 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#17986 - CGI_10006522 superfamily 243158 767 801 1.12E-07 49.8648 cl02723 Sel1 superfamily - - Sel1 repeat; This short repeat is found in the Sel1 protein. It is related to TPR repeats. Q#17986 - CGI_10006522 superfamily 243158 984 1016 1.87E-06 46.398 cl02723 Sel1 superfamily - - Sel1 repeat; This short repeat is found in the Sel1 protein. It is related to TPR repeats. Q#17986 - CGI_10006522 superfamily 243158 803 838 4.61E-06 45.2424 cl02723 Sel1 superfamily - - Sel1 repeat; This short repeat is found in the Sel1 protein. It is related to TPR repeats. Q#17986 - CGI_10006522 superfamily 243158 693 729 0.000304361 39.8496 cl02723 Sel1 superfamily - - Sel1 repeat; This short repeat is found in the Sel1 protein. It is related to TPR repeats. 
Q#17986 - CGI_10006522 superfamily 243158 843 873 0.000311763 39.8496 cl02723 Sel1 superfamily - - Sel1 repeat; This short repeat is found in the Sel1 protein. It is related to TPR repeats. Q#17986 - CGI_10006522 superfamily 243158 949 982 0.000338097 39.8496 cl02723 Sel1 superfamily - - Sel1 repeat; This short repeat is found in the Sel1 protein. It is related to TPR repeats. Q#17986 - CGI_10006522 superfamily 243158 610 641 0.00124519 37.9236 cl02723 Sel1 superfamily - - Sel1 repeat; This short repeat is found in the Sel1 protein. It is related to TPR repeats. Q#17986 - CGI_10006522 superfamily 243158 736 766 0.00141605 37.9236 cl02723 Sel1 superfamily - - Sel1 repeat; This short repeat is found in the Sel1 protein. It is related to TPR repeats. Q#17988 - CGI_10006524 superfamily 220695 57 175 9.64E-06 46.0327 cl18571 7TM_GPCR_Srx superfamily C - Serpentine type 7TM GPCR chemoreceptor Srx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srx is part of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#17989 - CGI_10006525 superfamily 243767 11 122 5.87E-53 167.506 cl04466 P-mevalo_kinase superfamily - - "Phosphomevalonate kinase; Phosphomevalonate kinase (EC:2.7.4.2) catalyzes the phosphorylation of 5-phosphomevalonate into 5-diphosphomevalonate, an essential step in isoprenoid biosynthesis via the mevalonate pathway. This family represents the animal type of the enzyme. The other is the ERG8 type, found in plants and fungi, and some bacteria (see pfam00288)." 
Q#17990 - CGI_10006526 superfamily 247727 63 166 3.81E-05 42.0319 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#17992 - CGI_10006528 superfamily 220629 138 211 7.85E-06 43.2259 cl10891 Adaptin_binding superfamily C - "Alpha and gamma adaptin binding protein p34; p34 is a protein involved in membrane trafficking. It is known to interact with both alpha and gamma adaptin. It has been speculated that p34 may play a chaperone role such as preventing the soluble adaptors from co-assembling with soluble clathrin, or helping to remove the adaptors from the coated vesicle. Another possible function is in aiding the recruitment of soluble adaptors onto the membrane." Q#17993 - CGI_10006529 superfamily 219667 72 155 6.29E-33 119.635 cl06829 Swi3 superfamily - - "Replication Fork Protection Component Swi3; Replication fork pausing is required to initiate a recombination events. More specifically, Swi1 is required for recombination near the mat1 locus. Swi3 has been found to co-purify with Swi1 Swi3, together with Swi1, define a fork protection complex that coordinates leading- and lagging-strand synthesis and stabilises stalled replication forks. 
The Swi1-Swi3 complex is required for accurate replication, fork protection and replication checkpoint signalling" Q#17999 - CGI_10003902 superfamily 241600 117 335 1.13E-89 270.266 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#17999 - CGI_10003902 superfamily 241619 37 85 0.000131269 39.4877 cl00112 PAN_APPLE superfamily C - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#18000 - CGI_10005902 superfamily 245814 784 857 9.80E-07 48.2543 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18000 - CGI_10005902 superfamily 245814 986 1059 1.24E-06 47.8691 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18000 - CGI_10005902 superfamily 245814 672 747 1.35E-06 47.8691 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18000 - CGI_10005902 superfamily 245814 1086 1161 2.62E-06 47.0987 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. 
The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18000 - CGI_10005902 superfamily 245814 1295 1366 5.53E-05 42.8615 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18000 - CGI_10005902 superfamily 245814 886 961 7.22E-05 42.4763 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#18000 - CGI_10005902 superfamily 245814 1177 1260 1.64E-10 59.4412 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18000 - CGI_10005902 superfamily 245814 544 630 4.60E-06 46.2651 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18000 - CGI_10005902 superfamily 245814 452 531 2.38E-05 44.2468 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. 
A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18001 - CGI_10005903 superfamily 243092 197 506 1.03E-15 78.916 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#18001 - CGI_10005903 superfamily 240521 1121 1180 0.000375071 43.8348 cl18940 Syo1_like superfamily C - "Fungal symportin 1 (syo1) and similar proteins; This family of eukaryotic proteins includes Saccharomyces cerevisiae Ydl063c and Chaetomium thermophilum Syo1, which mediate the co-import of two ribosomal proteins, Rpl5 and Rpl11 (which both interact with 5S rRNA) into the nucleus. Import precedes their association with rRNA and subsequent ribosome assembly in the nucleolus. 
The primary structure of syo1 is a mixture of Armadillo- (ARM, N-terminal part of syo1) and HEAT-repeats (C-terminal part of syo1)." Q#18002 - CGI_10005904 superfamily 247723 78 149 2.67E-24 96.8857 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#18002 - CGI_10005904 superfamily 247723 195 232 1.91E-05 43.0115 cl17169 RRM_SF superfamily C - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. 
The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#18004 - CGI_10005906 superfamily 242406 676 803 3.50E-23 98.0472 cl01271 DUF1768 superfamily - - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. Q#18004 - CGI_10005906 superfamily 245119 36 129 5.54E-14 70.1695 cl09653 Btz superfamily - - "CASC3/Barentsz eIF4AIII binding; This domain is found on CASC3 (cancer susceptibility candidate gene 3 protein) which is also known as Barentsz (Btz). CASC3 is a component of the EJC (exon junction complex) which is a complex that is involved in post-transcriptional regulation of mRNA in metazoa. The complex is formed by the association of four proteins (eIF4AIII, Barentsz, Mago, and Y14), mRNA, and ATP. This domain wraps around eIF4AIII and stacks against the 5' nucleotide." Q#18004 - CGI_10005906 superfamily 245119 227 263 1.22E-06 48.2132 cl09653 Btz superfamily N - "CASC3/Barentsz eIF4AIII binding; This domain is found on CASC3 (cancer susceptibility candidate gene 3 protein) which is also known as Barentsz (Btz). CASC3 is a component of the EJC (exon junction complex) which is a complex that is involved in post-transcriptional regulation of mRNA in metazoa. The complex is formed by the association of four proteins (eIF4AIII, Barentsz, Mago, and Y14), mRNA, and ATP. This domain wraps around eIF4AIII and stacks against the 5' nucleotide." Q#18005 - CGI_10005907 superfamily 247856 424 482 1.34E-11 60.6393 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. 
Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#18006 - CGI_10005249 superfamily 241623 3232 3519 1.83E-92 304.17 cl00119 PI3Kc_like superfamily - - "Phosphoinositide 3-kinase (PI3K)-like family, catalytic domain; The PI3K-like catalytic domain family is part of a larger superfamily that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and RIO kinases. Members of the family include PI3K, phosphoinositide 4-kinase (PI4K), PI3K-related protein kinases (PIKKs), and TRansformation/tRanscription domain-Associated Protein (TRRAP). PI3Ks catalyze the transfer of the gamma-phosphoryl group from ATP to the 3-hydroxyl of the inositol ring of D-myo-phosphatidylinositol (PtdIns) or its derivatives, while PI4K catalyze the phosphorylation of the 4-hydroxyl of PtdIns. PIKKs are protein kinases that catalyze the phosphorylation of serine/threonine residues, especially those that are followed by a glutamine. PI3Ks play an important role in a variety of fundamental cellular processes, including cell motility, the Ras pathway, vesicle trafficking and secretion, immune cell activation and apoptosis. PI4Ks produce PtdIns(4)P, the major precursor to important signaling phosphoinositides. PIKKs have diverse functions including cell-cycle checkpoints, genome surveillance, mRNA surveillance, and translation control." Q#18006 - CGI_10005249 superfamily 202180 3563 3591 0.00720323 37.4432 cl03505 FATC superfamily - - "FATC domain; The FATC domain is named after FRAP, ATM, TRRAP C-terminal. The solution structure of the FATC domain suggests it plays a role in redox-dependent structural and cellular stability." 
Q#18007 - CGI_10025183 superfamily 241749 26 170 2.20E-22 88.5969 cl00280 globin_like superfamily - - superfamily containing globins and truncated hemoglobins Q#18012 - CGI_10025188 superfamily 241862 38 329 2.32E-08 53.5442 cl00437 COG0428 superfamily - - Predicted divalent heavy-metal cations transporter [Inorganic ion transport and metabolism] Q#18013 - CGI_10025189 superfamily 241599 170 228 5.37E-21 84.2172 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#18016 - CGI_10025192 superfamily 243092 52 388 8.07E-47 166.741 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#18016 - CGI_10025192 superfamily 243092 313 602 4.97E-18 83.5384 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#18020 - CGI_10025196 superfamily 242611 329 362 0.00765644 35.964 cl01629 TPP_enzymes superfamily C - "Thiamine pyrophosphate (TPP) enzyme family, TPP-binding module; found in many key metabolic enzymes which use TPP (also known as thiamine diphosphate) as a cofactor. These enzymes include, among others, the E1 components of the pyruvate, the acetoin and the branched chain alpha-keto acid dehydrogenase complexes." Q#18021 - CGI_10025197 superfamily 220695 57 146 2.31E-05 43.7215 cl18571 7TM_GPCR_Srx superfamily C - Serpentine type 7TM GPCR chemoreceptor Srx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. 
Srx is part of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#18022 - CGI_10025198 superfamily 243184 439 541 6.48E-40 142.333 cl02786 Translation_factor_III superfamily - - "Domain III of Elongation factor (EF) Tu (EF-TU) and EF-G. Elongation factors (EF) EF-Tu and EF-G participate in the elongation phase during protein biosynthesis on the ribosome. Their functional cycles depend on GTP binding and its hydrolysis. The EF-Tu complexed with GTP and aminoacyl-tRNA delivers tRNA to the ribosome, whereas EF-G stimulates translocation, a process in which tRNA and mRNA movements occur in the ribosome. Experimental data showed that: (1) intrinsic GTPase activity of EF-G is influenced by excision of its domain III; (2) that EF-G lacking domain III has a 1,000-fold decreased GTPase activity on the ribosome and, a slightly decreased affinity for GTP; and (3) EF-G lacking domain III does not stimulate translocation, despite the physical presence of domain IV which is also very important for translocation. These findings indicate an essential contribution of domain III to activation of GTP hydrolysis. Domains III and V of EF-G have the same fold (although they are not completely superimposable), the double split beta-alpha-beta fold. This fold is observed in a large number of ribonucleotide binding proteins and is also referred to as the ribonucleoprotein (RNP) or RNA recognition (RRM) motif. This domain III is found in several elongation factors, as well as in peptide chain release factors and in GT-1 family of GTPase (GTPBP1)." Q#18022 - CGI_10025198 superfamily 247724 111 335 1.30E-80 257.805 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. 
The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#18022 - CGI_10025198 superfamily 243185 345 435 1.04E-22 93.7706 cl02787 Translation_Factor_II_like superfamily - - "Translation_Factor_II_like: Elongation factor Tu (EF-Tu) domain II-like proteins. Elongation factor Tu consists of three structural domains, this family represents the second domain. Domain II adopts a beta barrel structure and is involved in binding to charged tRNA. Domain II is found in other proteins such as elongation factor G and translation initiation factor IF-2. This group also includes the C2 subdomain of domain IV of IF-2 that has the same fold as domain II of (EF-Tu). Like IF-2 from certain prokaryotes such as Thermus thermophilus, mitochondrial IF-2 lacks domain II, which is thought to be involved in binding of E.coli IF-2 to 30S subunits." 
Q#18024 - CGI_10025200 superfamily 217255 726 929 2.46E-63 214.155 cl03746 DDHD superfamily - - "DDHD domain; The DDHD domain is 180 residues long and contains four conserved residues that may form a metal binding site. The domain is named after these four residues. This pattern of conservation of metal binding residues is often seen in phosphoesterase domains. This domain is found in retinal degeneration B proteins, as well as a family of probable phospholipases. It has been shown that this domain is found in a longer C terminal region that binds to PYK2 tyrosine kinase. These proteins have been called N-terminal domain-interacting receptor (Nir1, Nir2 and Nir3). This suggests that this region is involved in functionally important interactions in other members of this family." Q#18024 - CGI_10025200 superfamily 247057 610 656 1.10E-17 79.3785 cl15755 SAM_superfamily superfamily N - "SAM (Sterile alpha motif); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#18025 - CGI_10025201 superfamily 218565 121 441 2.63E-105 323.515 cl05086 DUF747 superfamily - - Eukaryotic membrane protein family; This family is a family of eukaryotic membrane proteins. It was previously annotated as including a putative receptor for human cytomegalovirus gH but this has since been disputed. 
Analysis of the mouse Tapt1 protein (transmembrane anterior posterior transformation 1) has shown it to be involved in patterning of the vertebrate axial skeleton. Q#18028 - CGI_10025204 superfamily 246751 61 349 5.22E-114 335.367 cl14883 Lipase superfamily - - "Lipase. Lipases are esterases that can hydrolyze long-chain acyl-triglycerides into di- and monoglycerides, glycerol, and free fatty acids at a water/lipid interface. A typical feature of lipases is "interfacial activation", the process of becoming active at the lipid/water interface, although several examples of lipases have been identified that do not undergo interfacial activation. The active site of a lipase contains a catalytic triad consisting of Ser - His - Asp/Glu, but unlike most serine proteases, the active site is buried inside the structure. A "lid" or "flap" covers the active site, making it inaccessible to solvent and substrates. The lid opens during the process of interfacial activation, allowing the lipid substrate access to the active site." Q#18029 - CGI_10025205 superfamily 245201 429 693 9.71E-123 369.558 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." 
Q#18029 - CGI_10025205 superfamily 238012 183 217 0.00458605 35.793 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#18030 - CGI_10025206 superfamily 209898 399 420 0.000162178 40.4015 cl14787 MORN superfamily - - MORN repeat; The MORN (Membrane Occupation and Recognition Nexus) repeat is found in multiple copies in several proteins including junctophilins (See Takeshima et al. Mol. Cell 2000;6:11-22). A MORN-repeat protein has been identified as a dynamic component of the cell division apparatus in the parasite Toxoplasma gondii. It has been hypothesised to function as a linker protein between certain membrane regions and the parasite's cytoskeleton. Q#18030 - CGI_10025206 superfamily 209898 424 444 0.000348042 39.6942 cl14787 MORN superfamily - - MORN repeat; The MORN (Membrane Occupation and Recognition Nexus) repeat is found in multiple copies in several proteins including junctophilins (See Takeshima et al. Mol. Cell 2000;6:11-22). A MORN-repeat protein has been identified as a dynamic component of the cell division apparatus in the parasite Toxoplasma gondii. It has been hypothesised to function as a linker protein between certain membrane regions and the parasite's cytoskeleton. Q#18030 - CGI_10025206 superfamily 209898 249 270 0.00299494 36.6126 cl14787 MORN superfamily - - MORN repeat; The MORN (Membrane Occupation and Recognition Nexus) repeat is found in multiple copies in several proteins including junctophilins (See Takeshima et al. Mol. 
Cell 2000;6:11-22). A MORN-repeat protein has been identified as a dynamic component of the cell division apparatus in the parasite Toxoplasma gondii. It has been hypothesised to function as a linker protein between certain membrane regions and the parasite's cytoskeleton. Q#18030 - CGI_10025206 superfamily 209898 342 363 0.00376913 36.5495 cl14787 MORN superfamily - - MORN repeat; The MORN (Membrane Occupation and Recognition Nexus) repeat is found in multiple copies in several proteins including junctophilins (See Takeshima et al. Mol. Cell 2000;6:11-22). A MORN-repeat protein has been identified as a dynamic component of the cell division apparatus in the parasite Toxoplasma gondii. It has been hypothesised to function as a linker protein between certain membrane regions and the parasite's cytoskeleton. Q#18030 - CGI_10025206 superfamily 209898 226 246 0.00395043 36.2274 cl14787 MORN superfamily - - MORN repeat; The MORN (Membrane Occupation and Recognition Nexus) repeat is found in multiple copies in several proteins including junctophilins (See Takeshima et al. Mol. Cell 2000;6:11-22). A MORN-repeat protein has been identified as a dynamic component of the cell division apparatus in the parasite Toxoplasma gondii. It has been hypothesised to function as a linker protein between certain membrane regions and the parasite's cytoskeleton. Q#18031 - CGI_10025207 superfamily 243035 1 117 2.38E-23 96.9201 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. 
Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. 
Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#18031 - CGI_10025207 superfamily 215647 577 813 1.90E-34 132.347 cl18338 7tm_2 superfamily - - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GPCRs). They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#18031 - CGI_10025207 superfamily 243086 518 564 3.16E-16 74.3337 cl02559 GPS superfamily - - "Latrophilin/CL-1-like GPS domain; Domain present in latrophilin/CL-1, sea urchin REJ and polycystin." Q#18031 - CGI_10025207 superfamily 221370 300 501 5.74E-11 62.0037 cl13441 DUF3497 superfamily - - "Domain of unknown function (DUF3497); This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 213 to 257 amino acids in length. This domain is found associated with pfam02793, pfam00002, pfam01825. This domain has a single completely conserved residue W that may be functionally important." 
Q#18031 - CGI_10025207 superfamily 243029 239 291 3.14E-05 42.7229 cl02422 HRM superfamily - - Hormone receptor domain; This extracellular domain contains four conserved cysteines that probably form disulphide bridges. The domain is found in a variety of hormone receptors. It may be a ligand binding domain. Q#18032 - CGI_10025208 superfamily 246676 586 804 1.08E-44 161.744 cl14616 Cyt_b561 superfamily - - "Eukaryotic cytochrome b(561); Cytochrome b(561) is a family of endosomal or secretory vesicle-specific electron transport proteins. They are integral membrane proteins that bind two heme groups non-covalently, and may have six alpha-helical trans-membrane segments. This is an exclusively eukaryotic family. Members of the prokaryotic cytochrome b561 family are not deemed homologous." Q#18032 - CGI_10025208 superfamily 246710 461 622 2.27E-35 134.092 cl14783 DOMON_like superfamily - - "Domon-like ligand-binding domains; DOMON-like domains can be found in all three kingdoms of life and are a diverse group of ligand binding domains that have been shown to interact with sugars and hemes. DOMON domains were initially thought to confer protein-protein interactions. They were subsequently found as a heme-binding motif in cellobiose dehydrogenase, an extracellular fungal oxidoreductase that degrades both lignin and cellulose, and in ethylbenzene dehydrogenase, an enzyme that aids in the anaerobic degradation of hydrocarbons. The domain interacts with sugars in the type 9 carbohydrate binding modules (CBM9), which are present in a variety of glycosyl hydrolases, and it can also be found at the N-terminus of sensor histidine kinases." Q#18032 - CGI_10025208 superfamily 246671 37 177 5.09E-28 111.746 cl14606 Reeler_cohesin_like superfamily - - "Domains similar to the eukaryotic reeler domain and bacterial cohesins; This diverse family summarizes a set of distantly related domains, as revealed by structural similarity." 
Q#18032 - CGI_10025208 superfamily 246710 196 359 4.49E-24 101.35 cl14783 DOMON_like superfamily - - "Domon-like ligand-binding domains; DOMON-like domains can be found in all three kingdoms of life and are a diverse group of ligand binding domains that have been shown to interact with sugars and hemes. DOMON domains were initially thought to confer protein-protein interactions. They were subsequently found as a heme-binding motif in cellobiose dehydrogenase, an extracellular fungal oxidoreductase that degrades both lignin and cellulose, and in ethylbenzene dehydrogenase, an enzyme that aids in the anaerobic degradation of hydrocarbons. The domain interacts with sugars in the type 9 carbohydrate binding modules (CBM9), which are present in a variety of glycosyl hydrolases, and it can also be found at the N-terminus of sensor histidine kinases." Q#18035 - CGI_10025211 superfamily 241750 1520 1786 1.96E-62 215.9 cl00281 metallo-dependent_hydrolases superfamily - - "Superfamily of metallo-dependent hydrolases (also called amidohydrolase superfamily) is a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. The family includes urease alpha, adenosine deaminase, phosphotriesterase, dihydroorotases, allantoinases, hydantoinases, AMP-, adenine and cytosine deaminases, imidazolonepropionase, aryldialkylphosphatase, chlorohydrolases, formylmethanofuran dehydrogenases and others." Q#18036 - CGI_10025212 superfamily 241578 266 428 1.21E-41 148.594 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). 
Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#18036 - CGI_10025212 superfamily 241578 75 241 3.06E-41 147.76 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." 
Q#18036 - CGI_10025212 superfamily 241578 451 609 6.09E-32 124.034 cl00057 vWFA superfamily C - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#18037 - CGI_10025213 superfamily 226426 5 185 0.000382914 38.6959 cl18756 COG3911 superfamily - - Predicted ATPase [General function prediction only] Q#18041 - CGI_10025217 superfamily 241832 1 78 2.32E-48 150.721 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). 
Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#18042 - CGI_10025218 superfamily 247743 368 517 5.27E-05 42.5183 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperones, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#18043 - CGI_10025219 superfamily 248020 200 375 3.61E-12 66.334 cl17466 Sulfatase superfamily N - Sulfatase; Sulfatase. Q#18044 - CGI_10025220 superfamily 244906 147 212 4.84E-22 91.1076 cl08315 CAP_GLY superfamily - - "CAP-Gly domain; Cytoskeleton-associated proteins (CAPs) are involved in the organisation of microtubules and transportation of vesicles and organelles along the cytoskeletal network. A conserved motif, CAP-Gly, has been identified in a number of CAPs, including CLIP-170 and dynactins. The crystal structure of Caenorhabditis elegans F53F4.3 protein CAP-Gly domain was recently solved. The domain contains three beta-strands. The most conserved sequence, GKNDG, is located in two consecutive sharp turns on the surface, forming the entrance to a groove." Q#18044 - CGI_10025220 superfamily 246925 251 482 0.00127638 40.0314 cl15309 LRR_RI superfamily - - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. 
LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#18045 - CGI_10025221 superfamily 241594 550 908 6.11E-116 360.727 cl00077 HECTc superfamily - - "HECT domain; C-terminal catalytic domain of a subclass of Ubiquitin-protein ligase (E3). It binds specific ubiquitin-conjugating enzymes (E2), accepts ubiquitin from E2, transfers ubiquitin to substrate lysine side chains, and transfers additional ubiquitin molecules to the end of growing ubiquitin chains." Q#18045 - CGI_10025221 superfamily 216033 194 289 5.27E-08 51.5656 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#18049 - CGI_10025225 superfamily 248289 25 83 0.00472443 32.8732 cl17735 VWC superfamily - - von Willebrand factor type C domain; The high cutoff was used to prevent overlap with pfam00094. Q#18051 - CGI_10025227 superfamily 246681 218 339 8.65E-59 187.769 cl14643 SRPBCC superfamily - - "START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC (SRPBCC) ligand-binding domain superfamily; SRPBCC domains have a deep hydrophobic ligand-binding pocket; they bind diverse ligands. Included in this superfamily are the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of mammalian STARD1-STARD15, and the C-terminal catalytic domains of the alpha oxygenase subunit of Rieske-type non-heme iron aromatic ring-hydroxylating oxygenases (RHOs_alpha_C), as well as the SRPBCC domains of phosphatidylinositol transfer proteins (PITPs), Bet v 1 (the major pollen allergen of white birch, Betula verrucosa), CoxG, CalC, and related proteins. 
Other members of this superfamily include PYR/PYL/RCAR plant proteins, the aromatase/cyclase (ARO/CYC) domains of proteins such as Streptomyces glaucescens tetracenomycin, and the SRPBCC domains of Streptococcus mutans Smu.440 and related proteins." Q#18051 - CGI_10025227 superfamily 244786 29 160 2.34E-37 132.005 cl07747 Aha1_N superfamily - - "Activator of Hsp90 ATPase, N-terminal; Members of this family, which are predominantly found in the protein 'Activator of Hsp90 ATPase' adopt a secondary structure consisting of an N-terminal alpha-helix leading into a four-stranded meandering antiparallel beta-sheet, followed by a C-terminal alpha-helix. The two helices are packed together, with the beta-sheet curving around them. They bind to the molecular chaperone HSP82 and stimulate its ATPase activity." Q#18054 - CGI_10025230 superfamily 248097 14 121 1.15E-14 65.3642 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#18057 - CGI_10025234 superfamily 202095 541 625 1.24E-31 120.416 cl03409 RyR superfamily - - RyR domain; This domain is called RyR for Ryanodine receptor. The domain is found in four copies in the ryanodine receptor. The function of this domain is unknown. Q#18057 - CGI_10025234 superfamily 202095 427 516 7.62E-30 115.409 cl03409 RyR superfamily - - RyR domain; This domain is called RyR for Ryanodine receptor. The domain is found in four copies in the ryanodine receptor. The function of this domain is unknown. Q#18057 - CGI_10025234 superfamily 243094 1046 1117 6.80E-19 88.8187 cl02569 RasGAP superfamily C - "Ras GTPase Activating Domain; RasGAP functions as an enhancer of the hydrolysis of GTP that is bound to Ras-GTPases. Proteins having a RasGAP domain include p120GAP, IQGAP, Rab5-activating protein 6, and Neurofibromin, among others. 
Although the Rho (Ras homolog) GTPases are most closely related to members of the Ras family, RhoGAP and RasGAP exhibit no similarity at their amino acid sequence level. RasGTPases function as molecular switches in a large number of signaling pathways. They are in the on state when bound to GTP, and in the off state when bound to GDP. The RasGAP domain speeds up the hydrolysis of GTP in Ras-like proteins acting as a negative regulator." Q#18060 - CGI_10005375 superfamily 241596 740 800 1.74E-12 64.5427 cl00081 HLH superfamily - - "Helix-loop-helix domain, found in specific DNA- binding proteins that act as transcription factors; 60-100 amino acids long. A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE 5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved in both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#18060 - CGI_10005375 superfamily 241596 824 870 1.03E-06 47.5939 cl00081 HLH superfamily N - "Helix-loop-helix domain, found in specific DNA- binding proteins that act as transcription factors; 60-100 amino acids long. 
A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE 5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved in both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#18060 - CGI_10005375 superfamily 241596 897 943 1.39E-06 47.2087 cl00081 HLH superfamily N - "Helix-loop-helix domain, found in specific DNA- binding proteins that act as transcription factors; 60-100 amino acids long. 
A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE 5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved in both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#18061 - CGI_10005376 superfamily 245213 75 101 2.33E-07 47.6314 cl09941 EGF_CA superfamily N - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#18062 - CGI_10005377 superfamily 243051 356 512 1.55E-48 166.014 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. 
MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal cord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#18062 - CGI_10005377 superfamily 245814 275 327 0.00298612 35.9279 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18064 - CGI_10010783 superfamily 207662 327 400 3.33E-38 136.421 cl02596 NR_DBD_like superfamily - - "DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers; DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. 
It interacts with a specific DNA site upstream of the target gene and modulates the rate of transcriptional initiation. Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions, from development, reproduction, to homeostasis and metabolism in animals (metazoans). The family contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). Most nuclear receptors bind as homodimers or heterodimers to their target sites, which consist of two hexameric half-sites. Specificity is determined by the half-site sequence, the relative orientation of the half-sites and the number of spacer nucleotides between the half-sites. However, a growing number of nuclear receptors have been reported to bind to DNA as monomers." Q#18064 - CGI_10010783 superfamily 245599 428 664 2.66E-98 304.056 cl11397 NR_LBD superfamily - - "The ligand binding domain of nuclear receptors, a family of ligand-activated transcription regulators; Ligand-binding domain (LBD) of nuclear receptor (NR): Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions in metazoans, from development, reproduction, to homeostasis and metabolism. The superfamily contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. The members of the family include receptors of steroids, thyroid hormone, retinoids, cholesterol by-products, lipids and heme. 
With few exceptions, NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD)." Q#18065 - CGI_10010784 superfamily 243058 864 978 1.01E-14 72.3471 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#18065 - CGI_10010784 superfamily 243058 656 770 6.41E-13 66.9543 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#18065 - CGI_10010784 superfamily 243058 946 1061 7.57E-13 66.9543 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. 
An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#18065 - CGI_10010784 superfamily 243058 564 686 9.46E-11 60.7911 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#18065 - CGI_10010784 superfamily 243058 738 893 4.62E-09 55.3983 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. 
ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#18065 - CGI_10010784 superfamily 243058 488 597 4.30E-07 49.6203 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#18066 - CGI_10010785 superfamily 243179 104 205 1.06E-21 86.4042 cl02781 tetraspanin_LEL superfamily - - "Tetraspanin, extracellular domain or large extracellular loop (LEL). Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. The tetraspanin family contains CD9, CD63, CD37, CD53, CD82, CD151, and CD81, amongst others. Tetraspanins are involved in diverse processes such as cell activation and proliferation, adhesion and motility, differentiation, cancer, and others. Their various functions may relate to their ability to act as molecular facilitators, grouping specific cell-surface proteins and affecting formation and stability of signaling complexes. 
Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web", which may also include integrins." Q#18068 - CGI_10010787 superfamily 243040 35 152 1.04E-76 241.148 cl02447 CRD_FZ superfamily - - "CRD_domain cysteine-rich domain, also known as Fz (frizzled) domain; CRD_FZ is an essential component of a number of cell surface receptors, which are involved in multiple signal transduction pathways, particularly in modulating the activity of the Wnt proteins, which play a fundamental role in the early development of metazoans. CRD is also found in secreted frizzled related proteins (SFRPs), which lack the transmembrane segment found in the frizzled protein. The CRD domain is also present in the alpha-1 chain of mouse type XVIII collagen, in carboxypeptidase Z, several receptor tyrosine kinases, and the mosaic transmembrane serine protease corin. The CRD domain is well conserved in metazoans - 10 frizzled proteins have been identified in mammals, 4 in Drosophila and 3 in Caenorhabditis elegans. CRD domains have also been identified in multiple tandem copies in a Dictyostelium discoideum protein. Very little is known about the mechanism by which CRD domains interact with their ligands. The domain contains 10 conserved cysteines." Q#18071 - CGI_10010790 superfamily 222150 65 90 0.00293578 31.5934 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#18073 - CGI_10010792 superfamily 241644 26 149 1.00E-57 178.935 cl00154 UBCc superfamily - - "Ubiquitin-conjugating enzyme E2, catalytic (UBCc) domain. This is part of the ubiquitin-mediated protein degradation pathway in which a thiol-ester linkage forms between a conserved cysteine and the C-terminus of ubiquitin and complexes with ubiquitin protein ligase enzymes, E3. This pathway regulates many fundamental cellular processes. 
There are also other E2s which form thiol-ester linkages without the use of E3s as well as several UBC homologs (TSG101, Mms2, Croc-1 and similar proteins) which lack the active site cysteine essential for ubiquitination and appear to function in DNA repair pathways which were omitted from the scope of this CD." Q#18074 - CGI_10010590 superfamily 248097 86 186 1.25E-24 95.0246 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#18076 - CGI_10010592 superfamily 245205 94 164 1.83E-08 50.3141 cl09930 RPA_2b-aaRSs_OBF_like superfamily - - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." 
Q#18079 - CGI_10010595 superfamily 245847 200 266 4.24E-13 64.5001 cl12042 FA58C superfamily C - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#18079 - CGI_10010595 superfamily 241619 33 100 0.00607204 34.0949 cl00112 PAN_APPLE superfamily - - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#18080 - CGI_10010596 superfamily 243161 2 47 7.69E-07 41.6518 cl02739 THAP superfamily C - "THAP domain; The THAP domain is a putative DNA-binding domain (DBD) and probably also binds a zinc ion. It features the conserved C2CH architecture (consensus sequence: Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His). Other universal features include the location of the domain at the N-termini of proteins, its size of about 90 residues, a C-terminal AVPTIF box and several other conserved residues. Orthologues of the human THAP domain have been identified in other vertebrates and probably worms and flies, but not in other eukaryotes or any prokaryotes." Q#18083 - CGI_10026468 superfamily 241770 51 187 2.91E-18 77.8212 cl00309 PRTases_typeI superfamily - - "Phosphoribosyl transferase (PRT)-type I domain; Phosphoribosyl transferase (PRT) domain. The type I PRTases are identified by a conserved PRPP binding motif which features two adjacent acidic residues surrounded by one or more hydrophobic residue. 
PRTases catalyze the displacement of the alpha-1'-pyrophosphate of 5-phosphoribosyl-alpha1-pyrophosphate (PRPP) by a nitrogen-containing nucleophile. The reaction products are an alpha-1 substituted ribose-5'-phosphate and a free pyrophosphate (PP). PRPP, an activated form of ribose-5-phosphate, is a key metabolite connecting nucleotide synthesis and salvage pathways. The type I PRTase family includes a range of diverse phosphoribosyl transferase enzymes and regulatory proteins of the nucleotide synthesis and salvage pathways, including adenine phosphoribosyltransferase EC:2.4.2.7., hypoxanthine-guanine-xanthine phosphoribosyltransferase, hypoxanthine phosphoribosyltransferase EC:2.4.2.8., ribose-phosphate pyrophosphokinase EC:2.7.6.1., amidophosphoribosyltransferase EC:2.4.2.14., orotate phosphoribosyltransferase EC:2.4.2.10., uracil phosphoribosyltransferase EC:2.4.2.9., and xanthine-guanine phosphoribosyltransferase EC:2.4.2.22." Q#18084 - CGI_10026470 superfamily 248347 63 309 4.95E-18 84.4424 cl17793 Peptidase_C69 superfamily C - Peptidase family C69; Peptidase family C69. Q#18088 - CGI_10026475 superfamily 248264 2 67 1.07E-09 51.469 cl17710 DDE_4 superfamily N - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." 
Q#18089 - CGI_10026476 superfamily 241642 214 272 2.35E-09 52.4978 cl00152 t_SNARE superfamily - - "Soluble NSF (N-ethylmaleimide-sensitive fusion protein)-Attachment protein (SNAP) REceptor domain; these alpha-helical motifs form twisted and parallel heterotetrameric helix bundles; the core complex contains one helix from a protein that is anchored in the vesicle membrane (synaptobrevin), one helix from a protein of the target membrane (syntaxin), and two helices from another protein anchored in the target membrane (SNAP-25); their interaction forms a core which is composed of a polar zero layer, a flanking leucine-zipper layer acts as a water tight shield to isolate ionic interactions in the zero layer from the surrounding solvent" Q#18089 - CGI_10026476 superfamily 241634 40 171 1.79E-10 57.2942 cl00143 SynN superfamily - - "Syntaxin N-terminus domain; syntaxins are nervous system-specific proteins implicated in the docking of synaptic vesicles with the presynaptic plasma membrane; they are a family of receptors for intracellular transport vesicles; each target membrane may be identified by a specific member of the syntaxin family; syntaxins contain a moderately well conserved amino-terminal domain, called Habc, whose structure is an antiparallel three-helix bundle; a linker of about 30 amino acids connects this to the carboxy-terminal region, designated H3 (t_SNARE), of the syntaxin cytoplasmic domain; the highly conserved H3 region forms a single, long alpha-helix when it is part of the core SNARE complex and anchors the protein on the cytoplasmic surface of cellular membranes; H3 is not included in defining this domain" Q#18090 - CGI_10026477 superfamily 245847 6 126 2.63E-23 89.5381 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." 
Q#18091 - CGI_10026478 superfamily 220348 16 117 6.20E-25 92.2848 cl09902 Ctf8 superfamily - - Ctf8; Ctf8 (chromosome transmissions fidelity 8) is a component of the Ctf18 RFC-like complex which is a DNA clamp loader involved in sister chromatid cohesion. Q#18092 - CGI_10026479 superfamily 219749 53 153 8.89E-43 149.043 cl07010 DUF1716 superfamily - - Eukaryotic domain of unknown function (DUF1716); This domain is found in eukaryotic proteins. A human nuclear protein with this domain is thought to have a role in apoptosis. Q#18093 - CGI_10026480 superfamily 245716 545 567 1.86E-05 43.3869 cl11592 zf-CCCH superfamily - - Zinc finger C-x8-C-x5-C-x3-H type (and similar); Zinc finger C-x8-C-x5-C-x3-H type (and similar). Q#18094 - CGI_10026481 superfamily 247725 599 698 1.33E-09 56.9933 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#18094 - CGI_10026481 superfamily 247725 356 450 3.13E-05 43.6913 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. 
This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#18096 - CGI_10026483 superfamily 222462 20 235 2.12E-37 142.358 cl16486 ELYS superfamily - - Nuclear pore complex assembly; ELYS (embryonic large molecule derived from yolk sac) is conserved from fungi such as Aspergillus nidulans and Schizosaccharomyces pombe to human. It is important for the assembly of the nuclear pore complex. Q#18098 - CGI_10026485 superfamily 247725 1 58 2.62E-11 55.7638 cl17171 PH-like superfamily N - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#18099 - CGI_10026486 superfamily 241578 56 240 5.90E-98 292.697 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. 
Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#18099 - CGI_10026486 superfamily 244528 343 383 3.75E-12 60.8383 cl06838 C1_4 superfamily - - "TFIIH C1-like domain; The carboxyl-terminal region of TFIIH is essential for transcription activity. This region binds three zinc atoms through two independent domains. The first contains a C4 zinc finger motif, whereas the second is characterized by a CX(2)CX(2-4)FCADCD motif. The solution structure of the second C-terminal domain revealed homology with the regulatory domain of protein kinase C (pfam00130)." Q#18100 - CGI_10026487 superfamily 246723 69 116 1.64E-18 80.425 cl14813 GluZincin superfamily C - "Peptidase Gluzincin family (thermolysin-like proteinases, TLPs) includes peptidases M1, M2, M3, M4, M13, M32 and M36 (fungalysins); Gluzincin family (thermolysin-like peptidases or TLPs) includes several zinc-dependent metallopeptidases such as the M1, M2, M3, M4, M13, M32, M36 peptidases (MEROPS classification), and contain HEXXH and EXXXD motifs as part of their active site. All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. M1 family includes aminopeptidase N (APN) and leukotriene A4 hydrolase (LTA4H). APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types. LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity such that the two activities occupy different, but overlapping sites. The peptidase M3 or neurolysin-like family, includes M3, M2 and M32 metallopeptidases. 
The M3 peptidases have two subfamilies: M3A, includes thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (3.4.24.16), and the mitochondrial intermediate peptidase; M3B contains oligopeptidase F. M2 peptidase angiotensin converting enzyme (ACE, EC 3.4.15.1) catalyzes the conversion of decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II. ACE is a key part of the renin-angiotensin system that regulates blood pressure, thus ACE inhibitors are important for the treatment of hypertension. M32 family includes two eukaryotic enzymes from protozoa Trypanosoma cruzi, a causative agent of Chagas' disease, and Leishmania major, a parasite that causes leishmaniasis, making them attractive targets for drug development. The M4 family includes secreted protease thermolysin (EC 3.4.24.27), pseudolysin, aureolysin, neutral protease as well as fungalysin and bacillolysin (EC 3.4.24.28) that degrade extracellular proteins and peptides for bacterial nutrition, especially prior to sporulation. Thermolysin is widely used as a nonspecific protease to obtain fragments for peptide sequencing as well as in production of the artificial sweetener aspartame. M13 family includes neprilysin (EC 3.4.24.11) and endothelin-converting enzyme I (ECE-1, EC 3.4.24.71), which fulfill a broad range of physiological roles due to the greater variation in the S2' subsite allowing substrate specificity and are prime therapeutic targets for selective inhibition. Peptidase M36 (fungamysin) family includes endopeptidases from pathogenic fungi. Fungalysin hydrolyzes extracellular matrix proteins such as elastin and keratin. Aspergillus fumigatus causes the pulmonary disease aspergillosis by invading the lungs of immuno-compromised animals and secreting fungalysin that possibly breaks down proteinaceous structural barriers." 
Q#18101 - CGI_10026488 superfamily 246723 48 569 1.59E-166 490.663 cl14813 GluZincin superfamily - - "Peptidase Gluzincin family (thermolysin-like proteinases, TLPs) includes peptidases M1, M2, M3, M4, M13, M32 and M36 (fungalysins); Gluzincin family (thermolysin-like peptidases or TLPs) includes several zinc-dependent metallopeptidases such as the M1, M2, M3, M4, M13, M32, M36 peptidases (MEROPS classification), and contain HEXXH and EXXXD motifs as part of their active site. All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. M1 family includes aminopeptidase N (APN) and leukotriene A4 hydrolase (LTA4H). APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types. LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity such that the two activities occupy different, but overlapping sites. The peptidase M3 or neurolysin-like family, includes M3, M2 and M32 metallopeptidases. The M3 peptidases have two subfamilies: M3A, includes thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (3.4.24.16), and the mitochondrial intermediate peptidase; M3B contains oligopeptidase F. M2 peptidase angiotensin converting enzyme (ACE, EC 3.4.15.1) catalyzes the conversion of decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II. ACE is a key part of the renin-angiotensin system that regulates blood pressure, thus ACE inhibitors are important for the treatment of hypertension. M32 family includes two eukaryotic enzymes from protozoa Trypanosoma cruzi, a causative agent of Chagas' disease, and Leishmania major, a parasite that causes leishmaniasis, making them attractive targets for drug development. 
The M4 family includes secreted protease thermolysin (EC 3.4.24.27), pseudolysin, aureolysin, neutral protease as well as fungalysin and bacillolysin (EC 3.4.24.28) that degrade extracellular proteins and peptides for bacterial nutrition, especially prior to sporulation. Thermolysin is widely used as a nonspecific protease to obtain fragments for peptide sequencing as well as in production of the artificial sweetener aspartame. M13 family includes neprilysin (EC 3.4.24.11) and endothelin-converting enzyme I (ECE-1, EC 3.4.24.71), which fulfill a broad range of physiological roles due to the greater variation in the S2' subsite allowing substrate specificity and are prime therapeutic targets for selective inhibition. Peptidase M36 (fungamysin) family includes endopeptidases from pathogenic fungi. Fungalysin hydrolyzes extracellular matrix proteins such as elastin and keratin. Aspergillus fumigatus causes the pulmonary disease aspergillosis by invading the lungs of immuno-compromised animals and secreting fungalysin that possibly breaks down proteinaceous structural barriers." Q#18102 - CGI_10026489 superfamily 243035 132 244 1.32E-20 84.2085 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#18102 - CGI_10026489 superfamily 241619 13 78 0.000567932 36.6789 cl00112 PAN_APPLE superfamily - - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#18103 - CGI_10026490 superfamily 247684 8 121 4.91E-11 59.5247 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. 
The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#18104 - CGI_10026491 superfamily 247684 16 213 2.01E-87 265.41 cl17037 NBD_sugar-kinase_HSP70_actin superfamily N - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#18105 - CGI_10026492 superfamily 207684 6 40 4.51E-08 46.6031 cl02640 SAP superfamily - - "SAP domain; The SAP (after SAF-A/B, Acinus and PIAS) motif is a putative DNA/RNA binding domain found in diverse nuclear and cytoplasmic proteins." 
Q#18106 - CGI_10026493 superfamily 248012 284 417 3.48E-14 68.8393 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#18107 - CGI_10026494 superfamily 241597 54 110 3.58E-12 63.0215 cl00082 HMG-box superfamily - - "High Mobility Group (HMG)-box is found in a variety of eukaryotic chromosomal proteins and transcription factors. HMGs bind to the minor groove of DNA and have been classified by DNA binding preferences. Two phylogenically distinct groups of Class I proteins bind DNA in a sequence specific fashion and contain a single HMG box. One group (SOX-TCF) includes transcription factors, TCF-1, -3, -4; and also SRY and LEF-1, which bind four-way DNA junctions and duplex DNA targets. The second group (MATA) includes fungal mating type gene products MC, MATA1 and Ste11. Class II and III proteins (HMGB-UBF) bind DNA in a non-sequence specific fashion and contain two or more tandem HMG boxes. Class II members include non-histone chromosomal proteins, HMG1 and HMG2, which bind to bent or distorted DNA such as four-way DNA junctions, synthetic DNA cruciforms, kinked cisplatin-modified DNA, DNA bulges, cross-overs in supercoiled DNA, and can cause looping of linear DNA. Class III members include nucleolar and mitochondrial transcription factors, UBF and mtTF1, which bind four-way DNA junctions." Q#18109 - CGI_10026496 superfamily 247727 160 261 1.99E-09 54.3583 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) 
and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#18110 - CGI_10026497 superfamily 244535 173 346 9.25E-07 49.6568 cl06858 DUF1704 superfamily N - Domain of unknown function (DUF1704); This family contains many hypothetical proteins. Q#18112 - CGI_10026499 superfamily 241622 126 210 4.53E-18 81.8442 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either an N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#18112 - CGI_10026499 superfamily 241622 43 114 8.35E-16 75.2958 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. 
Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either an N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#18112 - CGI_10026499 superfamily 241622 225 309 3.63E-09 55.6507 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either an N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. 
PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#18113 - CGI_10026500 superfamily 216686 21 211 4.27E-36 128.98 cl18377 Galactosyl_T superfamily - - "Galactosyltransferase; This family includes the galactosyltransferases UDP-galactose:2-acetamido-2-deoxy-D-glucose3beta-galactosyltransferase and UDP-Gal:beta-GlcNAc beta 1,3-galactosyltranferase. Specific galactosyltransferases transfer galactose to GlcNAc terminal chains in the synthesis of the lacto-series oligosaccharides types 1 and 2." Q#18114 - CGI_10026501 superfamily 241570 378 455 9.48E-12 62.7286 cl00047 CAP_ED superfamily - - "effector domain of the CAP family of transcription factors; members include CAP (or cAMP receptor protein (CRP)), which binds cAMP, FNR (fumarate and nitrate reduction), which uses an iron-sulfur cluster to sense oxygen) and CooA, a heme containing CO sensor. In all cases binding of the effector leads to conformational changes and the ability to activate transcription. Cyclic nucleotide-binding domain similar to CAP are also present in cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) and vertebrate cyclic nucleotide-gated ion-channels. 
Cyclic nucleotide-monophosphate binding domain; proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues; the best studied is the prokaryotic catabolite gene activator, CAP, where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure; three conserved glycine residues are thought to be essential for maintenance of the structural integrity of the beta-barrel; CooA is a homodimeric transcription factor that belongs to CAP family; cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain; cAPK's are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain; cGPK's are single chain enzymes that include the two copies of the domain in their N-terminal section; also found in vertebrate cyclic nucleotide-gated ion-channels" Q#18115 - CGI_10026502 superfamily 241758 15 154 7.82E-29 104.374 cl00292 AANH_like superfamily - - "Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases, ATP sulphurylases, Universal Stress Response protein and electron transfer flavoprotein (ETF). The domain forms an alpha/beta/alpha fold which binds to adenosine nucleotide." Q#18116 - CGI_10026503 superfamily 247724 62 255 2.29E-46 160.779 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#18117 - CGI_10026504 superfamily 241754 1 338 1.25E-169 502.997 cl00286 Motor_domain superfamily - - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. Q#18117 - CGI_10026504 superfamily 221571 648 690 9.16E-13 64.4462 cl13810 KIF1B superfamily - - "Kinesin protein 1B; This domain family is found in eukaryotes, and is approximately 50 amino acids in length. The family is found in association with pfam00225, pfam00498. KIF1B is an anterograde motor for transport of mitochondria in axons of neuronal cells." Q#18117 - CGI_10026504 superfamily 241581 471 528 1.22E-05 44.1243 cl00062 FHA superfamily - - "Forkhead associated domain (FHA); found in eukaryotic and prokaryotic proteins. Putative nuclear signalling domain. FHA domains may bind phosphothreonine, phosphoserine and sometimes phosphotyrosine. In eukaryotes, many FHA domain-containing proteins localize to the nucleus, where they participate in establishing or maintaining cell cycle checkpoints, DNA repair, or transcriptional regulation. Members of the FHA family include: Dun1, Rad53, Cds1, Mek1, KAPP(kinase-associated protein phosphatase),and Ki-67 (a human nuclear protein related to cell proliferation)." 
Q#18118 - CGI_10026505 superfamily 221913 20 221 1.00E-54 176.962 cl18626 AAA_12 superfamily - - AAA domain; This family of domains contain a P-loop motif that is characteristic of the AAA superfamily. Many of the proteins in this family are conjugative transfer proteins. Q#18119 - CGI_10026506 superfamily 247805 675 796 0.00024685 41.1688 cl17251 DEXDc superfamily C - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#18119 - CGI_10026506 superfamily 221913 841 1035 2.33E-43 157.317 cl18626 AAA_12 superfamily - - AAA domain; This family of domains contain a P-loop motif that is characteristic of the AAA superfamily. Many of the proteins in this family are conjugative transfer proteins. Q#18120 - CGI_10026507 superfamily 243074 394 440 8.62E-10 55.5905 cl02535 F-box-like superfamily - - F-box-like; This is an F-box-like family. Q#18120 - CGI_10026507 superfamily 243092 467 503 9.36E-05 41.1438 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring 
propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#18120 - CGI_10026507 superfamily 243092 712 784 0.000290292 42.322 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#18120 - CGI_10026507 superfamily 243092 265 349 0.00141147 40.0108 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#18121 - CGI_10026508 superfamily 220710 1671 2115 6.19E-90 303.055 cl11019 Apt1 superfamily - - "Golgi-body localisation protein domain; This is the C-terminus of a family of proteins conserved from plants to humans. The plant members are localised to the Golgi proteins and appear to regulate membrane trafficking, as they are required for rapid vesicle accumulation at the tip of the pollen tube. The C-terminus probably contains the Golgi localisation signal and it is well-conserved." 
Q#18121 - CGI_10026508 superfamily 220707 996 1128 1.14E-32 126.974 cl11015 Fmp27_GFWDK superfamily - - RNA pol II promoter Fmp27 protein domain; Fmp27_GFWDK is a conserved domain of a family of proteins involved in RNA polymerase II transcription initiation. It contains characteristic GFWDK sequence motifs. Some members are associated with domain Fmp27_SW (pfam10305) towards the N terminus. Q#18121 - CGI_10026508 superfamily 220678 777 887 0.000571394 41.0545 cl10972 DUF2405 superfamily - - Domain of unknown function (DUF2405); This is a conserved region of a family of proteins conserved in fungi. The function is unknown. Q#18122 - CGI_10026509 superfamily 218994 1472 2006 7.60E-54 196.411 cl09408 Med13_C superfamily - - "Mediator complex subunit 13 C-terminal; Mediator is a large complex of up to 33 proteins that is conserved from plants through fungi to humans - the number and representation of individual subunits varying with species. It is arranged into four different sections, a core, a head, a tail and a kinase-activity part, and the number of subunits within each of these is what varies with species. Overall, Mediator regulates the transcriptional activity of RNA polymerase II but it would appear that each of the four different sections has a slightly different function. Med13 is part of the ancillary kinase module, together with Med12, CDK8 and CycC, which in yeast is implicated in transcriptional repression, though most of this activity is likely attributable to the CDK8 kinase. The large Med12 and Med13 proteins are required for specific developmental processes in Drosophila, zebrafish, and Caenorhabditis elegans but their biochemical functions are not understood." 
Q#18123 - CGI_10026510 superfamily 241733 6 75 2.18E-45 144.201 cl00259 Sm_like superfamily - - "Sm and related proteins; The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes." Q#18126 - CGI_10026514 superfamily 241836 49 76 0.00219222 34.7673 cl00393 Ribosomal_L20 superfamily NC - "Ribosomal protein L20; The ribosomal protein family L20 contains members from eubacteria, as well as their mitochondrial and plastid homologs. L20 is an assembly protein, required for the first in-vitro reconstitution step of the 50S ribosomal subunit, but does not seem to be essential for ribosome activity. L20 has been shown to partially unfold in the absence of RNA, in regions corresponding to the RNA-binding sites. L20 represses the translation of its own mRNA via specific binding to two distinct mRNA sites, in a manner similar to the L20 interaction with 23S ribosomal RNA." Q#18127 - CGI_10026515 superfamily 215895 181 471 0 581.419 cl02856 6PGD superfamily - - "6-phosphogluconate dehydrogenase, C-terminal domain; This family represents the C-terminal all-alpha domain of 6-phosphogluconate dehydrogenase. The domain contains two structural repeats of 5 helices each." Q#18127 - CGI_10026515 superfamily 236582 7 295 5.90E-77 245.429 cl18895 PRK09599 superfamily - - 6-phosphogluconate dehydrogenase-like protein; Reviewed Q#18128 - CGI_10026516 superfamily 215733 380 616 1.12E-29 119.592 cl02811 E1-E2_ATPase superfamily - - E1-E2 ATPase; E1-E2 ATPase. 
Q#18128 - CGI_10026516 superfamily 222006 733 777 7.98E-06 45.6762 cl16182 Hydrolase_like2 superfamily N - Putative hydrolase of sodium-potassium ATPase alpha subunit; This is a putative hydrolase of the sodium-potassium ATPase alpha subunit. Q#18128 - CGI_10026516 superfamily 247756 948 1015 0.00380296 39.1401 cl17202 HAD superfamily N - haloacid dehalogenase-like hydrolase; haloacid dehalogenase-like hydrolase. Q#18129 - CGI_10026517 superfamily 247684 1 124 7.92E-26 102.741 cl17037 NBD_sugar-kinase_HSP70_actin superfamily NC - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#18130 - CGI_10026518 superfamily 241619 33 100 0.00280866 32.5541 cl00112 PAN_APPLE superfamily - - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#18131 - CGI_10026519 superfamily 245847 22 87 8.19E-17 71.0485 cl12042 FA58C superfamily C - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#18132 - CGI_10026520 superfamily 247684 1 34 6.08E-05 41.1788 cl17037 NBD_sugar-kinase_HSP70_actin superfamily N - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. 
The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#18133 - CGI_10026521 superfamily 241758 26 75 3.07E-14 63.1578 cl00292 AANH_like superfamily N - "Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases, ATP sulphurylases, Universal Stress Response protein and electron transfer flavoprotein (ETF). The domain forms an alpha/beta/alpha fold which binds to adenosine nucleotide." Q#18134 - CGI_10026522 superfamily 241563 62 102 0.000561023 38.2292 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#18134 - CGI_10026522 superfamily 110440 484 508 0.00446848 35.4613 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase); proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." 
Q#18135 - CGI_10026523 superfamily 241758 31 176 1.23E-21 86.2698 cl00292 AANH_like superfamily - - "Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases, ATP sulphurylases, Universal Stress Response protein and electron transfer flavoprotein (ETF). The domain forms an alpha/beta/alpha fold which binds to adenosine nucleotide." Q#18136 - CGI_10026524 superfamily 241749 2 115 1.14E-24 97.4565 cl00280 globin_like superfamily C - superfamily containing globins and truncated hemoglobins Q#18137 - CGI_10026525 superfamily 241797 2 301 2.04E-131 382.117 cl00337 UbiA superfamily - - 4-hydroxybenzoate polyprenyltransferase and related prenyltransferases [Coenzyme metabolism] Q#18139 - CGI_10026527 superfamily 245213 972 1009 1.01E-09 56.491 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#18139 - CGI_10026527 superfamily 245213 823 857 5.40E-05 42.6238 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. 
EGF_CA can be found in tandem repeat arrangements." Q#18139 - CGI_10026527 superfamily 245213 1149 1184 0.000255246 40.6978 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#18139 - CGI_10026527 superfamily 245213 1261 1294 0.000283476 40.3126 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#18139 - CGI_10026527 superfamily 245213 896 931 0.000388248 39.9274 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. 
EGF_CA can be found in tandem repeat arrangements." Q#18139 - CGI_10026527 superfamily 245213 939 970 0.000613643 39.5422 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#18139 - CGI_10026527 superfamily 245213 1298 1332 0.00063416 39.5422 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#18139 - CGI_10026527 superfamily 245213 597 632 0.00151176 38.3866 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. 
EGF_CA can be found in tandem repeat arrangements." Q#18139 - CGI_10026527 superfamily 245213 1224 1257 0.00864802 36.0754 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#18139 - CGI_10026527 superfamily 220692 66 344 2.00E-11 65.6885 cl18570 7TM_GPCR_Srw superfamily - - Serpentine type 7TM GPCR chemoreceptor Srw; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srw is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. The genes encoding Srw do not appear to be under as strong an adaptive evolutionary pressure as those of Srz. Q#18140 - CGI_10026528 superfamily 243066 78 160 2.16E-34 127.284 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. 
The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#18140 - CGI_10026528 superfamily 243066 415 497 2.16E-34 127.284 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#18140 - CGI_10026528 superfamily 219619 691 760 3.67E-10 57.6027 cl18518 Ion_trans_2 superfamily - - Ion channel; This family includes the two membrane helix type ion channels found in bacteria. Q#18145 - CGI_10026533 superfamily 242575 113 214 3.80E-32 114.216 cl01548 YccV-like superfamily - - Hemimethylated DNA-binding protein YccV like; YccV is a hemimethylated DNA binding protein which has been shown to regulate dnaA gene expression. 
The structure of one of the hypothetical proteins in this family has been solved and it forms a beta sheet structure with a terminating alpha helix. Q#18146 - CGI_10026534 superfamily 217804 175 219 0.00015238 38.0355 cl04337 INCENP_ARK-bind superfamily N - "Inner centromere protein, ARK binding region; This region of the inner centromere protein has been found to be necessary and sufficient for binding to aurora-related kinase. This interaction has been implicated in the coordination of chromosome segregation with cell division in yeast." Q#18147 - CGI_10026535 superfamily 241633 703 831 4.03E-36 134.324 cl00140 SNc superfamily - - "Staphylococcal nuclease homologues. SNase homologues are found in bacteria, archaea, and eukaryotes. They contain no disulfide bonds." Q#18147 - CGI_10026535 superfamily 241633 522 666 5.75E-31 119.687 cl00140 SNc superfamily - - "Staphylococcal nuclease homologues. SNase homologues are found in bacteria, archaea, and eukaryotes. They contain no disulfide bonds." Q#18147 - CGI_10026535 superfamily 241633 197 337 2.02E-23 98.1153 cl00140 SNc superfamily - - "Staphylococcal nuclease homologues. SNase homologues are found in bacteria, archaea, and eukaryotes. They contain no disulfide bonds." Q#18147 - CGI_10026535 superfamily 241633 373 477 2.34E-23 97.7301 cl00140 SNc superfamily - - "Staphylococcal nuclease homologues. SNase homologues are found in bacteria, archaea, and eukaryotes. They contain no disulfide bonds." Q#18147 - CGI_10026535 superfamily 243098 903 940 2.20E-12 63.7711 cl02573 TUDOR superfamily C - "Tudor domains are found in many eukaryotic organisms and have been implicated in protein-protein interactions in which methylated protein substrates bind to these domains. For example, the Tudor domain of Survival of Motor Neuron (SMN) binds to symmetrically dimethylated arginines of arginine-glycine (RG) rich sequences found in the C-terminal tails of Sm proteins. The SMN protein is linked to spinal muscular atrophy. 
Another example is the tandem tudor domains of 53BP1, which bind to histone H4 specifically dimethylated at Lys20 (H4-K20me2). 53BP1 is a key transducer of the DNA damage checkpoint signal." Q#18147 - CGI_10026535 superfamily 247724 10 170 3.01E-67 223.438 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#18150 - CGI_10026538 superfamily 241882 590 824 2.06E-13 71.5137 cl00465 yhhT superfamily N - "Predicted permease, member of the PurR regulon [General function prediction only]" Q#18151 - CGI_10026539 superfamily 243519 23 536 0 764.06 cl03757 phosphohexomutase superfamily - - "The alpha-D-phosphohexomutase superfamily includes several related enzymes that catalyze a reversible intramolecular phosphoryl transfer on their sugar substrates. 
Members of this family include the phosphoglucomutases (PGM1 and PGM2), phosphoglucosamine mutase (PNGM), phosphoacetylglucosamine mutase (PAGM), the bacterial phosphomannomutase ManB, the bacterial phosphoglucosamine mutase GlmM, and the bifunctional phosphomannomutase/phosphoglucomutase (PMM/PGM). These enzymes play important and diverse roles in carbohydrate metabolism in organisms from bacteria to humans. Each of these enzymes has four domains with a centrally located active site formed by four loops, one from each domain. All four domains are included in this alignment model." Q#18153 - CGI_10026542 superfamily 241622 118 185 8.32E-10 52.236 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95 (post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#18154 - CGI_10026543 superfamily 243540 275 499 1.11E-27 110.03 cl03831 HlyIII superfamily - - "Haemolysin-III related; Members of this family are integral membrane proteins. 
This family includes a protein with hemolytic activity from Bacillus cereus. It has been proposed that YOL002c encodes a Saccharomyces cerevisiae protein that plays a key role in metabolic pathways that regulate lipid and phosphate metabolism. In eukaryotes, members are seven-transmembrane pass molecules found to encode functional receptors with a broad range of apparent ligand specificities, including progestin and adipoQ receptors, and hence have been named PAQR proteins. The mammalian members include progesterone binding proteins. Unlike the case with GPCR receptor proteins, the evolutionary ancestry of the members of this family can be traced back to the Archaea." Q#18156 - CGI_10005838 superfamily 245213 74 109 2.85E-09 50.713 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#18156 - CGI_10005838 superfamily 245213 112 147 2.95E-09 50.713 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#18156 - CGI_10005838 superfamily 245213 36 71 3.13E-09 50.713 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#18156 - CGI_10005838 superfamily 245847 151 219 2.72E-11 58.3369 cl12042 FA58C superfamily C - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#18158 - CGI_10008594 superfamily 241874 1 533 0 710.873 cl00456 SLC5-6-like_sbd superfamily - - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. 
They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#18159 - CGI_10008595 superfamily 241874 24 503 0 664.649 cl00456 SLC5-6-like_sbd superfamily - - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#18159 - CGI_10008595 superfamily 241874 503 719 3.69E-90 294.857 cl00456 SLC5-6-like_sbd superfamily N - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. 
SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#18160 - CGI_10008596 superfamily 247792 50 93 3.93E-05 36.9741 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#18161 - CGI_10008597 superfamily 248458 102 490 1.05E-24 104.318 cl17904 MFS superfamily - - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. 
Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#18163 - CGI_10008599 superfamily 248458 373 513 9.35E-05 43.4565 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. 
The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#18163 - CGI_10008599 superfamily 248458 115 200 0.00194221 39.2193 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. 
Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#18164 - CGI_10008600 superfamily 241802 19 300 4.57E-88 268.398 cl00342 Trp-synth-beta_II superfamily - - "Tryptophan synthase beta superfamily (fold type II); this family of pyridoxal phosphate (PLP)-dependent enzymes catalyzes beta-replacement and beta-elimination reactions. This CD corresponds to aminocyclopropane-1-carboxylate deaminase (ACCD), tryptophan synthase beta chain (Trp-synth_B), cystathionine beta-synthase (CBS), O-acetylserine sulfhydrylase (CS), serine dehydratase (Ser-dehyd), threonine dehydratase (Thr-dehyd), diaminopropionate ammonia lyase (DAL), and threonine synthase (Thr-synth). ACCD catalyzes the conversion of 1-aminocyclopropane-1-carboxylate to alpha-ketobutyrate and ammonia. Tryptophan synthase folds into a tetramer, where the beta chain is the catalytic PLP-binding subunit and catalyzes the formation of L-tryptophan from indole and L-serine. CBS is a tetrameric hemeprotein that catalyzes condensation of serine and homocysteine to cystathionine. CS is a homodimer that catalyzes the formation of L-cysteine from O-acetyl-L-serine. Ser-dehyd catalyzes the conversion of L- or D-serine to pyruvate and ammonia. Thr-dehyd is active as a homodimer and catalyzes the conversion of L-threonine to 2-oxobutanoate and ammonia. 
DAL is also a homodimer and catalyzes the alpha, beta-elimination reaction of both L- and D-alpha, beta-diaminopropionate to form pyruvate and ammonia. Thr-synth catalyzes the formation of threonine and inorganic phosphate from O-phosphohomoserine." Q#18165 - CGI_10008601 superfamily 241802 17 58 2.16E-15 67.7094 cl00342 Trp-synth-beta_II superfamily C - "Tryptophan synthase beta superfamily (fold type II); this family of pyridoxal phosphate (PLP)-dependent enzymes catalyzes beta-replacement and beta-elimination reactions. This CD corresponds to aminocyclopropane-1-carboxylate deaminase (ACCD), tryptophan synthase beta chain (Trp-synth_B), cystathionine beta-synthase (CBS), O-acetylserine sulfhydrylase (CS), serine dehydratase (Ser-dehyd), threonine dehydratase (Thr-dehyd), diaminopropionate ammonia lyase (DAL), and threonine synthase (Thr-synth). ACCD catalyzes the conversion of 1-aminocyclopropane-1-carboxylate to alpha-ketobutyrate and ammonia. Tryptophan synthase folds into a tetramer, where the beta chain is the catalytic PLP-binding subunit and catalyzes the formation of L-tryptophan from indole and L-serine. CBS is a tetrameric hemeprotein that catalyzes condensation of serine and homocysteine to cystathionine. CS is a homodimer that catalyzes the formation of L-cysteine from O-acetyl-L-serine. Ser-dehyd catalyzes the conversion of L- or D-serine to pyruvate and ammonia. Thr-dehyd is active as a homodimer and catalyzes the conversion of L-threonine to 2-oxobutanoate and ammonia. DAL is also a homodimer and catalyzes the alpha, beta-elimination reaction of both L- and D-alpha, beta-diaminopropionate to form pyruvate and ammonia. Thr-synth catalyzes the formation of threonine and inorganic phosphate from O-phosphohomoserine." 
Q#18166 - CGI_10008602 superfamily 241802 12 246 2.67E-68 216.396 cl00342 Trp-synth-beta_II superfamily - - "Tryptophan synthase beta superfamily (fold type II); this family of pyridoxal phosphate (PLP)-dependent enzymes catalyzes beta-replacement and beta-elimination reactions. This CD corresponds to aminocyclopropane-1-carboxylate deaminase (ACCD), tryptophan synthase beta chain (Trp-synth_B), cystathionine beta-synthase (CBS), O-acetylserine sulfhydrylase (CS), serine dehydratase (Ser-dehyd), threonine dehydratase (Thr-dehyd), diaminopropionate ammonia lyase (DAL), and threonine synthase (Thr-synth). ACCD catalyzes the conversion of 1-aminocyclopropane-1-carboxylate to alpha-ketobutyrate and ammonia. Tryptophan synthase folds into a tetramer, where the beta chain is the catalytic PLP-binding subunit and catalyzes the formation of L-tryptophan from indole and L-serine. CBS is a tetrameric hemeprotein that catalyzes condensation of serine and homocysteine to cystathionine. CS is a homodimer that catalyzes the formation of L-cysteine from O-acetyl-L-serine. Ser-dehyd catalyzes the conversion of L- or D-serine to pyruvate and ammonia. Thr-dehyd is active as a homodimer and catalyzes the conversion of L-threonine to 2-oxobutanoate and ammonia. DAL is also a homodimer and catalyzes the alpha, beta-elimination reaction of both L- and D-alpha, beta-diaminopropionate to form pyruvate and ammonia. Thr-synth catalyzes the formation of threonine and inorganic phosphate from O-phosphohomoserine." Q#18168 - CGI_10008604 superfamily 248097 60 186 1.95E-24 93.4838 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#18168 - CGI_10008604 superfamily 150420 14 84 0.00555129 34.7111 cl18042 Jnk-SapK_ap_N superfamily N - JNK_SAPK-associated protein-1; This is the N-terminal 200 residues of a set of proteins conserved from yeasts to humans. 
Most of the proteins in this entry have an RhoGEF pfam00621 domain at their C-terminal end. Q#18169 - CGI_10008605 superfamily 222150 527 551 2.94E-05 41.9937 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#18169 - CGI_10008605 superfamily 246975 514 535 0.00726751 35.0153 cl15478 zf-C2H2 superfamily - - "Zinc finger, C2H2 type; The C2H2 zinc finger is the classical zinc finger domain. The two conserved cysteines and histidines co-ordinate a zinc ion. The following pattern describes the zinc finger. #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C] Where X can be any amino acid, and numbers in brackets indicate the number of residues. The positions marked # are those that are important for the stable fold of the zinc finger. The final position can be either his or cys. The C2H2 zinc finger is composed of two short beta strands followed by an alpha helix. The amino terminal part of the helix binds the major groove in DNA binding zinc fingers. The accepted consensus binding sequence for Sp1 is usually defined by the asymmetric hexanucleotide core GGGCGG but this sequence does not include, among others, the GAG (=CTC) repeat that constitutes a high-affinity site for Sp1 binding to the wt1 promoter." Q#18170 - CGI_10008606 superfamily 245456 183 454 2.36E-165 469.757 cl10970 AP_MHD_Cterm superfamily - - "C-terminal domain of adaptor protein (AP) complexes medium mu subunits and its homologs (MHD); This family corresponds to the C-terminal domain of heterotetrameric AP complexes medium mu subunits and its homologs existing in monomeric stonins, delta-subunit of the heteroheptameric coat protein I (delta-COPI), a protein encoded by a pro-death gene referred as MuD (also known as MUDENG, mu-2 related death-inducing gene), an endocytic adaptor syp1, the mammalian FCH domain only proteins (FCHo1/2), SH3-containing GRB2-like protein 3-interacting protein 1 (SGIP1), and related proteins. 
AP complexes participate in the formation of intracellular coated transport vesicles and select cargo molecules for incorporation into the coated vesicles in the late secretory and endocytic pathways. Stonins have been characterized as clathrin-dependent AP-2 mu chain related factors and may act as cargo-specific sorting adaptors in endocytosis. Coat protein complex I (COPI)-coated vesicles function in the early secretory pathway. They mediate the retrograde transport from the Golgi to the ER, and intra-Golgi transport. MuD is distantly related to the C-terminal domain of mu2 subunit of AP-2. It is able to induce cell death by itself and plays an important role in cell death in various tissues. Syp1 represents a novel type of endocytic adaptor protein that participates in endocytosis, promotes vesicle tubulation, and contributes to cell polarity and stress responses. It shares the same domain architecture with its two ubiquitously expressed mammalian counterparts, FCHo1/2, which represent key initial proteins ultimately controlling cellular nutrient uptake, receptor regulation, and synaptic vesicle retrieval. They bind specifically to the plasma membrane and recruit the scaffold proteins eps15 and intersectin, which subsequently engage the adaptor complex AP2 and clathrin, leading to coated vesicle formation. Another mammalian neuronal-specific protein SGIP1 does have a C-terminal MHD and has been classified into this family as well. It is an endophilin-interacting protein that plays an obligatory role in the regulation of energy homeostasis. It is also involved in clathrin-mediated endocytosis by interacting with phospholipids and eps15." Q#18170 - CGI_10008606 superfamily 242876 18 142 0.00204562 37.3337 cl02092 Clat_adaptor_s superfamily - - Clathrin adaptor complex small chain; Clathrin adaptor complex small chain. 
Q#18171 - CGI_10008607 superfamily 207919 26 126 1.20E-48 153.915 cl03348 Ribosomal_L22e superfamily - - Ribosomal L22e protein family; Ribosomal L22e protein family. Q#18172 - CGI_10014390 superfamily 241563 39 77 5.60E-05 40.9256 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#18172 - CGI_10014390 superfamily 110440 506 533 0.00354108 35.4613 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#18173 - CGI_10014391 superfamily 222070 26 106 0.00601381 33.4201 cl18634 DDE_3 superfamily C - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." 
Q#18174 - CGI_10014392 superfamily 247683 391 443 5.43E-26 99.7245 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#18174 - CGI_10014392 superfamily 245835 61 171 1.13E-10 60.0874 cl12013 BAR superfamily N - "The Bin/Amphiphysin/Rvs (BAR) domain, a dimerization module that binds membranes and detects membrane curvature; BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions including organelle biogenesis, membrane trafficking or remodeling, and cell division and migration. Mutations in BAR containing proteins have been linked to diseases and their inactivation in cells leads to altered membrane dynamics. A BAR domain with an additional N-terminal amphipathic helix (an N-BAR) can drive membrane curvature. These N-BAR domains are found in amphiphysins and endophilins, among others. 
BAR domains are also frequently found alongside domains that determine lipid specificity, such as the Pleckstrin Homology (PH) and Phox Homology (PX) domains which are present in beta centaurins (ACAPs and ASAPs) and sorting nexins, respectively. A FES-CIP4 Homology (FCH) domain together with a coiled coil region is called the F-BAR domain and is present in Pombe/Cdc15 homology (PCH) family proteins, which include Fes/Fer tyrosine kinases, PACSIN or syndapin, CIP4-like proteins, and srGAPs, among others. The Inverse (I)-BAR or IRSp53/MIM homology Domain (IMD) is found in multi-domain proteins, such as IRSp53 and MIM, that act as scaffolding proteins and transducers of a variety of signaling pathways that link membrane dynamics and the underlying actin cytoskeleton. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The I-BAR domain induces membrane protrusions in the opposite direction compared to classical BAR and F-BAR domains, which produce membrane invaginations. BAR domains that also serve as protein interaction domains include those of arfaptin and OPHN1-like proteins, among others, which bind to Rac and Rho GAP domains, respectively." Q#18176 - CGI_10014394 superfamily 247724 58 127 3.31E-10 58.7121 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins.
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#18179 - CGI_10014397 superfamily 217437 59 170 3.71E-27 102.025 cl03944 GILT superfamily - - "Gamma interferon inducible lysosomal thiol reductase (GILT); This family includes the two characterized human gamma-interferon-inducible lysosomal thiol reductase (GILT) sequences. It also contains several other eukaryotic putative proteins with similarity to GILT. The aligned region contains three conserved cysteine residues. In addition, the two GILT sequences possess a C-X(2)-C motif that is shared by some of the other sequences in the family. This motif is thought to be associated with disulphide bond reduction." Q#18185 - CGI_10014403 superfamily 241567 115 212 0.000123365 40.7614 cl00042 CASc superfamily N - "Caspase, interleukin-1 beta converting enzyme (ICE) homologues; Cysteine-dependent aspartate-directed proteases that mediate programmed cell death (apoptosis). Caspases are synthesized as inactive zymogens and activated by proteolysis of the peptide backbone adjacent to an aspartate. The resulting two subunits associate to form an (alpha)2(beta)2-tetramer which is the active enzyme. Activation of caspases can be mediated by other caspase homologs." 
Q#18186 - CGI_10014404 superfamily 241567 271 468 1.26E-06 48.7507 cl00042 CASc superfamily C - "Caspase, interleukin-1 beta converting enzyme (ICE) homologues; Cysteine-dependent aspartate-directed proteases that mediate programmed cell death (apoptosis). Caspases are synthesized as inactive zymogens and activated by proteolysis of the peptide backbone adjacent to an aspartate. The resulting two subunits associate to form an (alpha)2(beta)2-tetramer which is the active enzyme. Activation of caspases can be mediated by other caspase homologs." Q#18186 - CGI_10014404 superfamily 241567 568 646 0.00271687 38.4502 cl00042 CASc superfamily N - "Caspase, interleukin-1 beta converting enzyme (ICE) homologues; Cysteine-dependent aspartate-directed proteases that mediate programmed cell death (apoptosis). Caspases are synthesized as inactive zymogens and activated by proteolysis of the peptide backbone adjacent to an aspartate. The resulting two subunits associate to form an (alpha)2(beta)2-tetramer which is the active enzyme. Activation of caspases can be mediated by other caspase homologs." Q#18188 - CGI_10014406 superfamily 222258 490 625 2.06E-14 73.3711 cl18656 AAA_30 superfamily N - AAA domain; This family of domains contain a P-loop motif that is characteristic of the AAA superfamily. Many of the proteins in this family are conjugative transfer proteins. There is a Walker A and Walker B. Q#18188 - CGI_10014406 superfamily 222209 802 880 6.48E-06 45.8549 cl18648 UvrD_C_2 superfamily N - Family description; This domain is found at the C-terminus of a wide variety of helicase enzymes. This domain has a AAA-like structural fold. Q#18191 - CGI_10014409 superfamily 243088 28 116 6.67E-40 135.243 cl02563 PX_domain superfamily - - "The Phox Homology domain, a phosphoinositide binding module; The PX domain is a phosphoinositide (PI) binding module involved in targeting proteins to membranes. 
Proteins containing PX domains interact with PIs and have been implicated in highly diverse functions such as cell signaling, vesicular trafficking, protein sorting, lipid modification, cell polarity and division, activation of T and B cells, and cell survival. Many members of this superfamily bind phosphatidylinositol-3-phosphate (PI3P) but in some cases, other PIs such as PI4P or PI(3,4)P2, among others, are the preferred substrates. In addition to protein-lipid interaction, the PX domain may also be involved in protein-protein interaction, as in the cases of p40phox, p47phox, and some sorting nexins (SNXs). The PX domain is conserved from yeast to humans and is found in more than 100 proteins. The majority of PX domain-containing proteins are SNXs, which play important roles in endosomal sorting." Q#18192 - CGI_10014410 superfamily 247683 147 200 3.23E-19 83.275 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." 
Q#18192 - CGI_10014410 superfamily 247683 838 890 2.04E-17 78.0619 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#18192 - CGI_10014410 superfamily 247683 758 808 3.21E-15 71.8987 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. 
Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#18192 - CGI_10014410 superfamily 247683 77 129 3.04E-19 83.2378 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#18192 - CGI_10014410 superfamily 247683 290 314 0.000573468 38.9733 cl17036 SH3 superfamily C - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. 
SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#18193 - CGI_10014411 superfamily 247805 175 340 1.34E-09 55.8064 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#18194 - CGI_10001090 superfamily 247856 33 84 9.22E-05 36.7569 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#18197 - CGI_10008977 superfamily 241568 214 267 5.88E-10 56.3172 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." 
Q#18197 - CGI_10008977 superfamily 245213 273 305 4.43E-08 50.3278 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#18197 - CGI_10008977 superfamily 241568 173 210 3.30E-07 48.228 cl00043 CCP superfamily N - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#18197 - CGI_10008977 superfamily 241863 525 702 3.83E-25 103.623 cl00438 Flavodoxin_2 superfamily - - Flavodoxin-like fold; This family consists of a domain with a flavodoxin-like fold. The family includes bacterial and eukaryotic NAD(P)H dehydrogenase (quinone) EC:1.6.99.2. These enzymes catalyze the NAD(P)H-dependent two-electron reductions of quinones and protect cells against damage by free radicals and reactive oxygen species. This enzyme uses a FAD co-factor. The equation for this reaction is:- NAD(P)H + acceptor <=> NAD(P)(+) + reduced acceptor. 
This enzyme is also involved in the bioactivation of prodrugs used in chemotherapy. The family also includes acyl carrier protein phosphodiesterase EC:3.1.4.14. This enzyme converts holo-ACP to apo-ACP by hydrolytic cleavage of the phosphopantetheine residue from ACP. This family is related to pfam03358 and pfam00258. Q#18197 - CGI_10008977 superfamily 111397 70 149 1.89E-17 78.537 cl03620 HYR superfamily - - "HYR domain; This domain is known as the HYR (Hyalin Repeat) domain, after the protein hyalin that is composed exclusively of this repeat. This domain probably corresponds to a new superfamily in the immunoglobulin fold. The function of this domain is uncertain it may be involved in cell adhesion." Q#18197 - CGI_10008977 superfamily 111397 306 387 1.53E-06 46.5655 cl03620 HYR superfamily - - "HYR domain; This domain is known as the HYR (Hyalin Repeat) domain, after the protein hyalin that is composed exclusively of this repeat. This domain probably corresponds to a new superfamily in the immunoglobulin fold. The function of this domain is uncertain it may be involved in cell adhesion." Q#18201 - CGI_10008981 superfamily 198877 361 439 4.23E-43 148.913 cl06957 BING4CT superfamily - - BING4CT (NUC141) domain; This C terminal domain is found in the BING4 family of nucleolar WD40 repeat proteins. 
Q#18201 - CGI_10008981 superfamily 243092 208 330 4.07E-09 56.5744 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#18207 - CGI_10001142 superfamily 248067 120 233 3.12E-24 96.5083 cl17513 ABC1 superfamily - - "ABC1 family; This family includes ABC1 from yeast and AarF from E. coli. These proteins have a nuclear or mitochondrial subcellular location in eukaryotes. The exact molecular functions of these proteins is not clear, however yeast ABC1 suppresses a cytochrome b mRNA translation defect and is essential for the electron transfer in the bc 1 complex and E. coli AarF is required for ubiquinone production. It has been suggested that members of the ABC1 family are novel chaperonins. These proteins are unrelated to the ABC transporter proteins." 
Q#18209 - CGI_10001170 superfamily 245606 1 66 4.58E-36 126.439 cl11410 TPP_enzyme_PYR superfamily N - "Pyrimidine (PYR) binding domain of thiamine pyrophosphate (TPP)-dependent enzymes; Thiamine pyrophosphate (TPP) family, pyrimidine (PYR) binding domain; found in many key metabolic enzymes which use TPP (also known as thiamine diphosphate) as a cofactor. TPP binds in the cleft formed by a PYR domain and a PP domain. The PYR domain binds the aminopyrimidine ring of TPP, the PP domain binds the diphosphate residue. A polar interaction between the conserved glutamate of the PYR domain and the N1' of the TPP aminopyrimidine ring is shared by most TPP-dependent enzymes, and participates in the activation of TPP. The PYR and PP domains have a common fold, but do not share strong sequence conservation. The PP domain is not included in this group. Most TPP-dependent enzymes have the PYR and PP domains on the same subunit although these domains can be alternatively arranged in the primary structure. In the case of 2-oxoisovalerate dehydrogenase (2OXO), sulfopyruvate decarboxylase (ComDE), and the E1 component of human pyruvate dehydrogenase complex (E1-PDHc) the PYR and PP domains appear on different subunits. TPP-dependent enzymes are multisubunit proteins, the smallest catalytic unit being a dimer-of-active sites. For many of these enzymes the active sites lie between PP and PYR domains on different subunits. However, for the homodimeric enzymes 1-deoxy-D-xylulose 5-phosphate synthase (DXS) and Desulfovibrio africanus pyruvate:ferredoxin oxidoreductase (PFOR), each active site lies at the interface of the PYR and PP domains from the same subunit." Q#18209 - CGI_10001170 superfamily 217227 89 209 1.13E-31 113.46 cl08363 Transketolase_C superfamily - - "Transketolase, C-terminal domain; The C-terminal domain of transketolase has been proposed as a regulatory molecule binding site."
Q#18211 - CGI_10004779 superfamily 245201 209 500 0 545.824 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#18212 - CGI_10004780 superfamily 222150 344 367 4.13E-06 43.5345 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#18212 - CGI_10004780 superfamily 222150 314 341 0.000199413 38.9121 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#18213 - CGI_10004781 superfamily 222150 355 378 3.20E-06 43.9197 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#18213 - CGI_10004781 superfamily 222150 325 352 0.000140723 39.2973 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#18213 - CGI_10004781 superfamily 197676 369 391 0.00831784 34.3638 cl18194 ZnF_C2H2 superfamily - - zinc finger; zinc finger. Q#18214 - CGI_10004782 superfamily 245531 248 300 9.45E-07 45.8178 cl11158 BEN superfamily C - "BEN domain; The BEN domain is found in diverse animal proteins such as BANP/SMAR1, NAC1 and the Drosophila mod(mdg4) isoform C, in the chordopoxvirus virosomal protein E5R and in several proteins of polydnaviruses. Computational analysis suggests that the BEN domain mediates protein-DNA and protein-protein interactions during chromatin organisation and transcription." 
Q#18218 - CGI_10012939 superfamily 244539 65 241 3.44E-27 106.62 cl06868 FNR_like superfamily C - "Ferredoxin reductase (FNR), an FAD and NAD(P) binding protein, was initially identified as a chloroplast reductase activity, catalyzing the electron transfer from reduced iron-sulfur protein ferredoxin to NADP+ as the final step in the electron transport mechanism of photosystem I. FNR transfers electrons from reduced ferredoxin to FAD (forming FADH2 via a semiquinone intermediate) and then transfers a hydride ion to convert NADP+ to NADPH. FNR has since been shown to utilize a variety of electron acceptors and donors and has a variety of physiological functions including nitrogen assimilation, dinitrogen fixation, steroid hydroxylation, fatty acid metabolism, oxygenase activity, and methane assimilation in many organisms. FNR has an NAD(P)-binding sub-domain of the alpha/beta class and a discrete (usually N-terminal) flavin sub-domain which vary in orientation with respect to the NAD(P) binding domain. The N-terminal moiety may contain a flavin prosthetic group (as in flavoenzymes) or use flavin as a substrate. Because flavins such as FAD can exist in oxidized, semiquinone (one-electron reduced), or fully reduced hydroquinone forms, FNR can interact with one and 2 electron carriers. FNR has a strong preference for NADP(H) vs NAD(H)." Q#18218 - CGI_10012939 superfamily 247856 17 65 2.88E-05 41.3793 cl17302 EFh superfamily C - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#18218 - CGI_10012939 superfamily 203841 167 344 3.97E-24 96.2528 cl17716 NAD_binding_6 superfamily - - Ferric reductase NAD binding domain; Ferric reductase NAD binding domain.
Q#18222 - CGI_10012943 superfamily 241563 62 102 0.0036005 35.5328 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#18224 - CGI_10012945 superfamily 243555 378 567 3.85E-09 56.2454 cl03871 Chitin_bind_3 superfamily - - "Chitin binding domain; This domain is found associated with a wide variety of cellulose binding domain. This domain however is a chitin binding domain. This domain is found in isolation in baculoviral spheroidins and spindolins, protein of unknown function." Q#18226 - CGI_10012947 superfamily 245306 45 139 9.98E-19 76.8555 cl10465 Peptidase_S24_S26 superfamily - - "The S24, S26 LexA/signal peptidase superfamily contains LexA-related and type I signal peptidase families. The S24 LexA protein domains include: the lambda repressor CI/C2 family and related bacterial prophage repressor proteins; LexA (EC 3.4.21.88), the repressor of genes in the cellular SOS response to DNA damage; MucA and the related UmuD proteins, which are lesion-bypass DNA polymerases, induced in response to mitogenic DNA damage; RulA, a component of the rulAB locus that confers resistance to UV, and RuvA, which is a component of the RuvABC resolvasome that catalyzes the resolution of Holliday junctions that arise during genetic recombination and DNA repair. The S26 type I signal peptidase (SPase) family also includes mitochondrial inner membrane protease (IMP)-like members. SPases are essential membrane-bound proteases which function to cleave away the amino-terminal signal peptide from the translocated pre-protein, thus playing a crucial role in the transport of proteins across membranes in all living organisms. 
All members in this superfamily are unique serine proteases that carry out catalysis using a serine/lysine dyad instead of the prototypical serine/histidine/aspartic acid triad found in most serine proteases." Q#18227 - CGI_10012948 superfamily 245226 57 170 3.05E-18 81.1767 cl10012 DnaQ_like_exo superfamily N - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#18228 - CGI_10012949 superfamily 245864 32 419 5.35E-71 233.71 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. 
They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#18230 - CGI_10012951 superfamily 148767 7 89 2.34E-20 79.4596 cl06404 CI-B14_5a superfamily - - "NADH:ubiquinone oxidoreductase subunit B14.5a (Complex I-B14.5a); This family contains the eukaryotic NADH:ubiquinone oxidoreductase subunit B14.5a (Complex I-B14.5a) (EC:1.6.5.3). This is approximately 100 residues long, and forms part of a multiprotein complex that resides on the inner mitochondrial membrane. The main function of the complex is the transport of electrons from NADH to ubiquinone, accompanied by translocation of protons from the mitochondrial matrix to the intermembrane space." Q#18231 - CGI_10012952 superfamily 245847 25 139 1.27E-06 45.2402 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes."
Q#18234 - CGI_10012955 superfamily 217293 76 245 1.36E-40 144.698 cl03788 Neur_chan_LBD superfamily N - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#18234 - CGI_10012955 superfamily 202474 255 312 2.27E-15 73.8421 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#18235 - CGI_10004400 superfamily 222413 98 141 0.00804306 34.1678 cl16433 DDE_Tnp_1_7 superfamily C - Transposase IS4; Transposase IS4. Q#18236 - CGI_10004402 superfamily 243066 12 123 1.31E-17 75.7317 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#18236 - CGI_10004402 superfamily 198867 135 222 8.59E-06 42.916 cl06652 BACK superfamily - - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. 
This family appears to be closely related to the BTB domain (Finn RD, personal observation)." Q#18240 - CGI_10021594 superfamily 243072 748 891 1.28E-27 110.166 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#18240 - CGI_10021594 superfamily 241565 20 89 4.40E-09 54.6351 cl00038 BRCT superfamily - - "Breast Cancer Suppressor Protein (BRCA1), carboxy-terminal domain. The BRCT domain is found within many DNA damage repair and cell cycle checkpoint proteins. The unique diversity of this domain superfamily allows BRCT modules to interact forming homo/hetero BRCT multimers, BRCT-non-BRCT interactions, and interactions within DNA strand breaks." Q#18241 - CGI_10021595 superfamily 241563 58 96 8.52E-07 46.3184 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#18241 - CGI_10021595 superfamily 241563 13 42 0.000307325 38.8575 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#18242 - CGI_10021596 superfamily 118308 47 138 1.18E-32 112.925 cl10755 Mitoc_L55 superfamily N - Mitochondrial ribosomal protein L55; Members of this family are involved in mitochondrial biogenesis and G2/M phase cell cycle progression. They form a component of the mitochondrial ribosome large subunit (39S) which comprises a 16S rRNA and about 50 distinct proteins. Q#18243 - CGI_10021597 superfamily 248100 159 215 7.32E-12 60.6308 cl17546 PQ-loop superfamily - - "PQ loop repeat; Members of this family are all membrane bound proteins possessing a pair of repeats each spanning two transmembrane helices connected by a loop. The PQ motif found on loop 2 is critical for the localisation of cystinosin to lysosomes. However, the PQ motif appears not to be a general lysosome-targeting motif. It is thought likely to possess a more general function. Most probably this involves a glutamine residue." Q#18243 - CGI_10021597 superfamily 248100 300 357 1.53E-10 56.7788 cl17546 PQ-loop superfamily - - "PQ loop repeat; Members of this family are all membrane bound proteins possessing a pair of repeats each spanning two transmembrane helices connected by a loop. The PQ motif found on loop 2 is critical for the localisation of cystinosin to lysosomes. However, the PQ motif appears not to be a general lysosome-targeting motif. It is thought likely to possess a more general function. Most probably this involves a glutamine residue." Q#18245 - CGI_10021599 superfamily 241748 130 366 4.79E-139 398.79 cl00279 APP_MetAP superfamily - - "A family including aminopeptidase P, aminopeptidase M, and prolidase. Also known as metallopeptidase family M24. This family of enzymes is able to cleave amido-, imido- and amidino-containing bonds. Members exhibit relatively narrow substrate specificity compared to other metallo-aminopeptidases, suggesting they play roles in regulation of biological processes rather than general protein degradation." 
Q#18246 - CGI_10021600 superfamily 245205 181 279 1.09E-10 57.9794 cl09930 RPA_2b-aaRSs_OBF_like superfamily - - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." Q#18249 - CGI_10021603 superfamily 247805 258 459 1.39E-91 290.54 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 
Q#18249 - CGI_10021603 superfamily 247905 470 600 2.99E-28 111.561 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#18251 - CGI_10021605 superfamily 242414 42 120 1.58E-26 100.045 cl01285 Gar1 superfamily C - "Gar1/Naf1 RNA binding region; Gar1 is a small nucleolar RNP that is required for pre-mRNA processing and pseudouridylation. It is co-immunoprecipitated with the H/ACA families of snoRNAs. This family represents the conserved central region of Gar1. This region is necessary and sufficient for normal cell growth, and specifically binds two snoRNAs snR10 and snR30. This region is also necessary for nucleolar targeting, and it is thought that the protein is co-transported to the nucleolus as part of a nucleoprotein complex. In humans, Gar1 is also a component of telomerase in vivo. Naf1 is an essential protein that plays a role in ribosome biogenesis, modification of spliceosomal small nuclear RNAs and telomere synthesis, and is homologous to Gar1." 
Q#18252 - CGI_10021606 superfamily 247723 119 204 1.17E-49 167.138 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#18252 - CGI_10021606 superfamily 247723 27 106 8.11E-37 131.338 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. 
The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#18252 - CGI_10021606 superfamily 247723 343 412 1.54E-34 124.612 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#18252 - CGI_10021606 superfamily 247723 460 543 2.61E-34 124.273 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. 
The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#18255 - CGI_10021609 superfamily 220695 31 209 2.56E-06 46.8031 cl18571 7TM_GPCR_Srx superfamily C - Serpentine type 7TM GPCR chemoreceptor Srx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srx is part of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#18256 - CGI_10021610 superfamily 220695 28 175 2.63E-05 43.7215 cl18571 7TM_GPCR_Srx superfamily C - Serpentine type 7TM GPCR chemoreceptor Srx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srx is part of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#18257 - CGI_10021611 superfamily 241832 8 82 3.84E-16 67.5833 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. 
The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#18258 - CGI_10021612 superfamily 220695 52 224 3.44E-10 58.7442 cl18571 7TM_GPCR_Srx superfamily C - Serpentine type 7TM GPCR chemoreceptor Srx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srx is part of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#18259 - CGI_10021613 superfamily 220695 81 136 0.000251397 40.6399 cl18571 7TM_GPCR_Srx superfamily NC - Serpentine type 7TM GPCR chemoreceptor Srx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srx is part of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#18260 - CGI_10021614 superfamily 220695 37 154 0.00152171 38.3287 cl18571 7TM_GPCR_Srx superfamily C - Serpentine type 7TM GPCR chemoreceptor Srx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srx is part of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. 
Q#18261 - CGI_10021615 superfamily 243035 1 67 6.39E-16 66.8745 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#18263 - CGI_10021617 superfamily 243092 427 670 1.68E-36 138.622 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#18263 - CGI_10021617 superfamily 202808 1 75 1.22E-49 170.404 cl08405 TLE_N superfamily C - Groucho/TLE N-terminal Q-rich domain; The N-terminal domain of the Grouch/TLE co-repressor proteins are involved in oligomerisation. Q#18264 - CGI_10021618 superfamily 246722 931 1087 2.27E-42 153.318 cl14812 PIN_SF superfamily - - "PIN (PilT N terminus) domain: Superfamily; PIN_SF The PIN (PilT N terminus) domain belongs to a large nuclease superfamily with representatives from eukaryota, eubacteria, and archaea. PIN domains were originally named for their sequence similarity to the N-terminal domain of an annotated pili biogenesis protein, PilT, a domain fusion between a PIN-domain and a PilT ATPase domain. The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues) is geometrically similar in the active center of structure-specific 5' nucleases (also known as Flap endonuclease-1-like), PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Seen here, are two major divisions in the PIN domain superfamily. The first major division, the structure-specific 5' nuclease family, is represented by FEN1, the 5'-3' exonuclease of DNA polymerase I, and T4 RNase H nuclease PIN domains. These 5' nucleases are involved in DNA replication, repair, and recombination. They are capable of both 5'-3' exonucleolytic activity and cleaving bifurcated DNA, in an endonucleolytic, structure-specific manner. Unique to FEN1-like nucleases, the PIN domain has a helical arch/clamp region (I domain) of variable length (approximately 16 to 800 residues) and, inserted within the C-terminal region of the PIN domain, a H3TH (helix-3-turn-helix) domain, an atypical helix-hairpin-helix-2-like region. Both the H3TH domain (not included here) and the helical arch/clamp region are involved in DNA binding. 
With the exception of Mkt1, these nucleases have a carboxylate rich active site that is involved in binding essential divalent metal ion cofactors (Mg2+, Mn2+, Zn2+, or Co2+). The second major division of the PIN domain superfamily, the VapC-Smg6 family, includes such eukaryotic ribonucleases as, Smg6, an essential factor in nonsense-mediated mRNA decay; Rrp44, the catalytic subunit of the exosome; and Nob1, a ribosome assembly factor critical in pre-rRNA processing. A large percentage of members in this family are bacterial ribonuclease toxins of TA operons such as Mycobacterium tuberculosis VapC and Neisseria gonorrhoeae FitB, as well as, archaeal homologs, Pyrobaculum aerophilum Pea0151 and P. aerophilum Pae2754. Also included are the eukaryotic Fcf1/ Utp24 (FAF1-copurifying factor 1/U three-associated protein 24) and Utp23-like proteins. Components of the small subunit processome, Fcf1/Utp24 and Utp23 are essential proteins involved in pre-rRNA processing and 40S ribosomal subunit assembly." Q#18264 - CGI_10021618 superfamily 220722 74 196 1.78E-20 89.4009 cl11040 EST1 superfamily - - Telomerase activating protein Est1; Est1 is a protein which recruits or activates telomerase at the site of polymerisation. Q#18265 - CGI_10021619 superfamily 247042 1 428 0 558.486 cl15693 Sema superfamily - - "The Sema domain, a protein interacting module, of semaphorins and plexins; Both semaphorins and plexins have a Sema domain on their N-termini. Plexins function as receptors for the semaphorins. Evolutionarily, plexins may be the ancestor of semaphorins. Semaphorins are regulatory molecules in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems, and cancer. Semaphorins can be divided into 7 classes. Vertebrates have members in classes 3-7, whereas classes 1 and 2 are known only in invertebrates. 
Class 2 and 3 semaphorins are secreted; classes 1 and 4 through 6 are transmembrane proteins; and class 7 is membrane associated via glycosylphosphatidylinositol (GPI) linkage. Plexins are a large family of transmembrane proteins, which are divided into four types (A-D) according to sequence similarity. In vertebrates, type A plexins serve as co-receptors for neuropilins to mediate the signalling of class 3 semaphorins. Plexins serve as direct receptors for several other members of the semaphorin family: class 6 semaphorins signal through type A plexins and class 4 semaphorins through type B plexins. This family also includes the MET and RON receptor tyrosine kinases. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves to recognize and bind receptors." Q#18265 - CGI_10021619 superfamily 243104 428 470 0.000148166 40.2292 cl02601 PSI superfamily - - "Plexin repeat; A cysteine rich repeat found in several different extracellular receptors. The function of the repeat is unknown. Three copies of the repeat are found Plexin. Two copies of the repeat are found in mahogany protein. A related C. elegans protein contains four copies of the repeat. The Met receptor contains a single copy of the repeat. The Pfam alignment shows 6 conserved cysteine residues that may form three conserved disulphide bridges, whereas shows 8 conserved cysteines. The pattern of conservation suggests that cysteines 5 and 7 (that are not absolutely conserved) form a disulphide bridge (Personal observation. A Bateman)." Q#18266 - CGI_10021620 superfamily 244363 11 142 8.73E-35 123.259 cl06336 Commd superfamily - - "COMM_Domain, a family of domains found at the C-terminus of HCarG, the copper metabolism gene MURR1 product, and related proteins. Presumably all COMM_Domain containing proteins are located in the nucleus and the COMM domain plays a role in protein-protein interactions. 
Several family members have been shown to bind and inhibit NF-kappaB. Murr1/Commd1 is a protein involved in copper homeostasis, which has also been identified as a regulator of the human delta epithelial sodium channel. HCaRG, a nuclear protein that might be involved in cell proliferation, is negatively regulated by extracellular calcium concentration, and its basal mRNA levels are higher in hypertensive animals." Q#18267 - CGI_10021621 superfamily 245874 16 54 0.0034899 35.5217 cl12111 TNFR superfamily - - "Tumor necrosis factor receptor (TNFR) domain; superfamily of TNF-like receptor domains. When bound to TNF-like cytokines, TNFRs trigger multiple signal transduction pathways, they are involved in inflammation response, apoptosis, autoimmunity and organogenesis. TNFRs domains are elongated with generally three tandem repeats of cysteine-rich domains (CRDs). They fit in the grooves between protomers within the ligand trimer. Some TNFRs, such as NGFR and HveA, bind ligands with no structural similarity to TNF and do not bind ligand trimers." Q#18268 - CGI_10021622 superfamily 203031 32 88 2.21E-09 50.4044 cl04548 FLYWCH superfamily - - "FLYWCH zinc finger domain; Mutations in the mod(mdg4) gene have effects on variegation (PEV), the properties of insulator sequences, correct path-finding of growing nerve cells, meiotic pairing of chromosomes, and apoptosis. The occurrence of FLYWCH motifs in mod(mdg4) gene product and other proteins is discussed in." Q#18270 - CGI_10021624 superfamily 245874 163 204 0.00126857 36.6354 cl12111 TNFR superfamily NC - "Tumor necrosis factor receptor (TNFR) domain; superfamily of TNF-like receptor domains. When bound to TNF-like cytokines, TNFRs trigger multiple signal transduction pathways, they are involved in inflammation response, apoptosis, autoimmunity and organogenesis. TNFRs domains are elongated with generally three tandem repeats of cysteine-rich domains (CRDs). 
They fit in the grooves between protomers within the ligand trimer. Some TNFRs, such as NGFR and HveA, bind ligands with no structural similarity to TNF and do not bind ligand trimers." Q#18274 - CGI_10021628 superfamily 247684 10 430 1.72E-94 297.652 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#18275 - CGI_10021629 superfamily 241752 1 62 4.53E-21 80.0561 cl00283 ADP_ribosyl superfamily N - "ADP_ribosylating enzymes catalyze the transfer of ADP_ribose from NAD+ to substrates. 
Bacterial toxins are cytoplasmic and catalyze the transfer of a single ADP_ribose unit to eukaryotic elongation factor 2, halting protein synthesis and killing the cell. Poly(ADP-ribose) polymerases (PARPS 1-3, VPARP, tankyrase) catalyze the addition of up to 100 ADP_ribose units from NAD+. PARPs 1 and 2 are localized in the nucleus, bind DNA, and are activated by DNA damage. VPARP is part of the vault ribonucleoprotein complex. Tankyrases regulate telomere length in part through poly(ADP_ribosylation) of telomere repeat binding factor 1 (TRF1). Poly(ADP-ribose) polymerase catalyses the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active." Q#18277 - CGI_10000225 superfamily 247805 99 156 5.55E-12 58.888 cl17251 DEXDc superfamily C - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#18282 - CGI_10012348 superfamily 245814 592 662 2.66E-06 49.0247 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#18282 - CGI_10012348 superfamily 245814 5276 5329 5.71E-06 47.8691 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18282 - CGI_10012348 superfamily 245814 403 471 1.09E-05 47.0987 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18282 - CGI_10012348 superfamily 245814 1319 1385 2.00E-05 46.3283 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. 
A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18282 - CGI_10012348 superfamily 245814 3637 3706 3.13E-05 45.5579 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18282 - CGI_10012348 superfamily 245814 702 767 5.18E-05 45.1727 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18282 - CGI_10012348 superfamily 245814 1808 1865 7.59E-05 44.4023 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18282 - CGI_10012348 superfamily 245814 1411 1478 0.000155538 43.6319 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18282 - CGI_10012348 superfamily 245814 3815 3882 0.000192767 43.2467 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18282 - CGI_10012348 superfamily 245814 5177 5235 0.00035189 42.4763 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. 
The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18282 - CGI_10012348 superfamily 245814 2932 2983 0.00037174 42.4763 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18282 - CGI_10012348 superfamily 245814 997 1054 0.000531605 42.0911 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#18282 - CGI_10012348 superfamily 245814 5367 5423 0.000842478 41.3207 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18282 - CGI_10012348 superfamily 245814 3987 4045 0.0024946 39.7799 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18282 - CGI_10012348 superfamily 245814 193 255 0.00269441 39.7799 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. 
A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18282 - CGI_10012348 superfamily 245814 2326 2384 0.00970455 38.2391 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18282 - CGI_10012348 superfamily 245814 1674 1756 1.83E-08 55.5893 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18282 - CGI_10012348 superfamily 245814 2441 2515 4.52E-08 54.4337 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18282 - CGI_10012348 superfamily 245814 2037 2100 7.06E-08 53.5647 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18282 - CGI_10012348 superfamily 245814 94 168 7.82E-08 53.6633 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18282 - CGI_10012348 superfamily 245814 499 569 1.64E-07 52.8929 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. 
The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18282 - CGI_10012348 superfamily 245814 3723 3779 3.33E-07 51.2535 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18282 - CGI_10012348 superfamily 245814 3137 3214 5.08E-07 51.3521 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#18282 - CGI_10012348 superfamily 245814 4739 4809 5.31E-06 48.4267 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18282 - CGI_10012348 superfamily 245814 792 863 6.96E-06 47.8853 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18282 - CGI_10012348 superfamily 245814 2130 2192 1.16E-05 46.6312 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. 
A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18282 - CGI_10012348 superfamily 245814 1594 1651 1.43E-05 46.9941 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18282 - CGI_10012348 superfamily 245814 5079 5135 1.89E-05 46.6193 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18282 - CGI_10012348 superfamily 245814 4251 4311 2.30E-05 45.8608 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18282 - CGI_10012348 superfamily 245814 4856 4913 2.56E-05 45.8608 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18282 - CGI_10012348 superfamily 245814 4154 4226 4.64E-05 45.1889 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18282 - CGI_10012348 superfamily 245814 1192 1274 6.12E-05 44.8037 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. 
The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18282 - CGI_10012348 superfamily 245814 4452 4500 0.000112263 43.9348 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18282 - CGI_10012348 superfamily 245814 2783 2857 0.000140062 43.6481 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#18282 - CGI_10012348 superfamily 245814 3888 3972 0.00018106 43.3102 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18282 - CGI_10012348 superfamily 245814 1900 1948 0.00027007 42.7569 cl11960 Ig superfamily N - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18282 - CGI_10012348 superfamily 245814 4558 4633 0.000631836 41.7969 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. 
A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18282 - CGI_10012348 superfamily 245814 3527 3597 0.000664178 41.7221 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18282 - CGI_10012348 superfamily 245814 902 959 0.000664361 41.6236 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18282 - CGI_10012348 superfamily 245814 4976 5034 0.00069819 41.6236 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18282 - CGI_10012348 superfamily 245814 1107 1166 0.00103499 40.9583 cl11960 Ig superfamily C - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18282 - CGI_10012348 superfamily 245814 297 364 0.00184555 40.4457 cl11960 Ig superfamily C - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18282 - CGI_10012348 superfamily 245814 2216 2289 0.00269429 39.6753 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. 
The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18282 - CGI_10012348 superfamily 245814 2664 2726 0.00373917 39.3124 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18282 - CGI_10012348 superfamily 245814 2558 2631 0.00412275 39.2901 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#18282 - CGI_10012348 superfamily 245814 4667 4714 0.00456534 38.9272 cl11960 Ig superfamily N - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18284 - CGI_10012350 superfamily 243119 86 123 0.000744311 37.0382 cl02629 CBM_14 superfamily N - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#18285 - CGI_10012351 superfamily 241609 114 189 7.98E-25 96.2931 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." 
Q#18285 - CGI_10012351 superfamily 241609 24 100 4.48E-19 80.8851 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#18285 - CGI_10012351 superfamily 245814 243 300 7.93E-06 43.2467 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18286 - CGI_10012352 superfamily 245814 402 471 3.48E-11 60.9659 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#18286 - CGI_10012352 superfamily 241611 1018 1112 0.000449371 40.4496 cl00102 PTX superfamily N - "Pentraxins are plasma proteins characterized by their pentameric discoid assembly and their Ca2+ dependent ligand binding, such as Serum amyloid P component (SAP) and C-reactive Protein (CRP), which are cytokine-inducible acute-phase proteins implicated in innate immunity. CRP binds to ligands containing phosphocholine, SAP binds to amyloid fibrils, DNA, chromatin, fibronectin, C4-binding proteins and glycosaminoglycans. "Long" pentraxins have N-terminal extensions to the common pentraxin domain; one group, the neuronal pentraxins, may be involved in synapse formation and remodeling, and they may also be able to form heteromultimers." Q#18286 - CGI_10012352 superfamily 243119 11 57 0.00663438 36.2577 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#18287 - CGI_10012353 superfamily 243119 67 117 0.00118332 37.0382 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. 
Q#18288 - CGI_10012354 superfamily 222150 730 755 0.000970503 38.1417 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#18288 - CGI_10012354 superfamily 222150 790 815 0.00307308 36.6009 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#18288 - CGI_10012354 superfamily 197676 776 798 0.00560622 35.9045 cl18194 ZnF_C2H2 superfamily - - zinc finger; zinc finger. Q#18290 - CGI_10012356 superfamily 246921 306 351 6.58E-09 53.9185 cl15299 FG-GAP superfamily C - "FG-GAP repeat; This family contains the extracellular repeat that is found in up to seven copies in alpha integrins. This repeat has been predicted to fold into a beta propeller structure. The repeat is called the FG-GAP repeat after two conserved motifs in the repeat. The FG-GAP repeats are found in the N terminus of integrin alpha chains, a region that has been shown to be important for ligand binding. A putative Ca2+ binding motif is found in some of the repeats." Q#18290 - CGI_10012356 superfamily 246921 373 417 5.02E-06 45.4441 cl15299 FG-GAP superfamily - - "FG-GAP repeat; This family contains the extracellular repeat that is found in up to seven copies in alpha integrins. This repeat has been predicted to fold into a beta propeller structure. The repeat is called the FG-GAP repeat after two conserved motifs in the repeat. The FG-GAP repeats are found in the N terminus of integrin alpha chains, a region that has been shown to be important for ligand binding. A putative Ca2+ binding motif is found in some of the repeats." Q#18294 - CGI_10000839 superfamily 241563 65 102 3.25E-05 41.696 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#18297 - CGI_10002968 superfamily 243058 250 347 1.08E-07 50.0055 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#18298 - CGI_10002969 superfamily 216363 3 108 1.33E-23 88.6813 cl08312 UPF0029 superfamily - - Uncharacterized protein family UPF0029; Uncharacterized protein family UPF0029. Q#18300 - CGI_10002971 superfamily 241563 68 104 2.34E-05 42.0812 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#18300 - CGI_10002971 superfamily 241563 21 59 0.000502081 38.2292 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#18301 - CGI_10002972 superfamily 241563 71 104 0.000333753 38.9996 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#18301 - CGI_10002972 superfamily 241563 28 59 0.000398472 38.6144 cl00034 BBOX superfamily N - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#18305 - CGI_10009558 superfamily 246680 9 87 5.38E-14 64.5304 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#18307 - CGI_10009560 superfamily 243035 46 168 4.27E-27 105.394 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. 
For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#18311 - CGI_10009564 superfamily 214531 145 188 6.89E-08 49.1373 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#18311 - CGI_10009564 superfamily 214531 190 231 4.56E-06 44.1297 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#18312 - CGI_10009565 superfamily 219542 88 203 4.58E-34 126.202 cl18517 Cu-oxidase_3 superfamily - - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. Q#18312 - CGI_10009565 superfamily 215896 310 401 3.42E-15 73.4832 cl18351 Cu-oxidase superfamily N - Multicopper oxidase; Many of the proteins in this family contain multiple similar copies of this plastocyanin-like domain. Q#18312 - CGI_10009565 superfamily 219541 592 628 3.83E-08 52.0855 cl18516 Cu-oxidase_2 superfamily N - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. Q#18314 - CGI_10009567 superfamily 241568 2514 2550 0.000684434 40.1388 cl00043 CCP superfamily N - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. 
Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#18314 - CGI_10009567 superfamily 214531 1790 1832 6.40E-09 54.9153 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#18314 - CGI_10009567 superfamily 214531 816 856 1.78E-08 53.7597 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#18314 - CGI_10009567 superfamily 214531 1305 1346 2.63E-07 50.2929 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#18314 - CGI_10009567 superfamily 215683 1767 1807 4.29E-07 49.4759 cl18339 Ldl_recept_b superfamily - - Low-density lipoprotein receptor repeat class B; This domain is also known as the YWTD motif after the most conserved region of the repeat. The YWTD repeat is found in multiple tandem repeats and has been predicted to form a beta-propeller structure. Q#18314 - CGI_10009567 superfamily 214531 171 213 8.21E-07 48.7521 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. 
Q#18314 - CGI_10009567 superfamily 214531 2019 2060 1.46E-06 47.9817 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#18314 - CGI_10009567 superfamily 214531 1260 1302 3.09E-05 44.1297 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#18314 - CGI_10009567 superfamily 215683 791 830 0.000140129 42.1571 cl18339 Ldl_recept_b superfamily - - Low-density lipoprotein receptor repeat class B; This domain is also known as the YWTD motif after the most conserved region of the repeat. The YWTD repeat is found in multiple tandem repeats and has been predicted to form a beta-propeller structure. Q#18315 - CGI_10001046 superfamily 247805 23 96 0.00485147 32.6944 cl17251 DEXDc superfamily N - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#18317 - CGI_10000911 superfamily 216363 7 73 1.03E-08 47.465 cl08312 UPF0029 superfamily N - Uncharacterized protein family UPF0029; Uncharacterized protein family UPF0029. Q#18320 - CGI_10001152 superfamily 242542 26 188 8.38E-09 51.848 cl01505 YhhN superfamily - - "YhhN-like protein; The members of this family are similar to the hypothetical protein yhhN expressed by E. coli. Many of the members of this family are annotated as being possible transmembrane proteins, and in fact they all have a high proportion of hydrophobic residues." 
Q#18321 - CGI_10001379 superfamily 243161 4 91 4.28E-20 82.4421 cl02739 THAP superfamily - - "THAP domain; The THAP domain is a putative DNA-binding domain (DBD) and probably also binds a zinc ion. It features the conserved C2CH architecture (consensus sequence: Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His). Other universal features include the location of the domain at the N-termini of proteins, its size of about 90 residues, a C-terminal AVPTIF box and several other conserved residues. Orthologues of the human THAP domain have been identified in other vertebrates and probably worms and flies, but not in other eukaryotes or any prokaryotes." Q#18324 - CGI_10001659 superfamily 241674 141 216 3.89E-32 119.238 cl00194 EF1B superfamily - - "Elongation factor 1 beta (EF1B) guanine nucleotide exchange domain. EF1B catalyzes the exchange of GDP bound to the G-protein, EF1A, for GTP, an important step in the elongation cycle of the protein biosynthesis. EF1A binds to and delivers the aminoacyl tRNA to the ribosome. The guanine nucleotide exchange domain of EF1B, which is the alpha subunit in yeast, is responsible for the catalysis of this exchange reaction." Q#18324 - CGI_10001659 superfamily 204519 111 134 0.00192912 36.4563 cl11209 EF-1_beta_acid superfamily - - Eukaryotic elongation factor 1 beta central acidic region; Eukaryotic elongation factor 1 beta central acidic region. Q#18326 - CGI_10001699 superfamily 218118 31 101 1.91E-17 72.2616 cl04552 CD225 superfamily - - "Interferon-induced transmembrane protein; This family includes the human leukocyte antigen CD225, which is an interferon inducible transmembrane protein, and is associated with interferon induced cell growth suppression." Q#18328 - CGI_10002658 superfamily 220131 627 907 3.25E-64 220.611 cl11721 DUF1943 superfamily - - "Domain of unknown function (DUF1943); Members of this family adopt a structure consisting of several large open beta-sheets. 
Their exact function has not, as yet, been determined." Q#18328 - CGI_10002658 superfamily 219034 935 1028 2.04E-07 50.4126 cl05778 DUF1081 superfamily - - Domain of Unknown Function (DUF1081); This region is found in Apolipophorin proteins. Q#18329 - CGI_10002659 superfamily 245210 13 381 8.55E-83 261.213 cl09938 cond_enzymes superfamily - - "Condensing enzymes; Family of enzymes that catalyze a (decarboxylating or non-decarboxylating) Claisen-like condensation reaction. Members share strong structural similarity, and are involved in the synthesis and degradation of fatty acids, and the production of polyketides, a diverse group of natural products." Q#18332 - CGI_10023840 superfamily 243035 45 166 2.82E-27 100.387 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. 
CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#18334 - CGI_10023842 superfamily 241563 61 99 0.000173403 39.77 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#18335 - CGI_10023843 superfamily 248097 72 195 1.19E-16 72.683 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#18335 - CGI_10023843 superfamily 248213 6 63 5.76E-05 39.0953 cl17659 DivIC superfamily C - Septum formation initiator; DivIC from B. subtilis is necessary for both vegetative and sporulation septum formation. These proteins are mainly composed of an amino terminal coiled-coil. Q#18336 - CGI_10023844 superfamily 248097 24 133 4.06E-13 61.8974 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#18337 - CGI_10023845 superfamily 248097 72 197 5.67E-15 68.0606 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#18337 - CGI_10023845 superfamily 243100 33 63 0.0001272 37.9288 cl02576 B_zip1 superfamily NC - "basic leucine zipper DNA-binding and multimerization region of GCN4 and related proteins; Basic leucine zipper (bZIP) transcription factors act in networks of homo- and hetero-dimers in the regulation in a diverse set of cellular pathways. Classical leucine zippers have alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization creates a pair of basic regions that bind DNA and undergo conformational change. GCN4 was identified in Saccharomyces cerevisiae from mutations in a deficiency in activation with the general amino acid control pathway. GCN4 encodes a trans-activator of amino acid biosynthetic genes containing 2 acidic activation domains and a C-terminal bZIP domain, comprised of a basic alpha-helical DNA-binding region and a coiled-coil dimerization region." 
Q#18339 - CGI_10023847 superfamily 245604 1389 1454 9.76E-21 89.0123 cl11404 Biotinyl_lipoyl_domains superfamily - - "Biotinyl_lipoyl_domains are present in biotin-dependent carboxylases/decarboxylases, the dihydrolipoyl acyltransferase component (E2) of 2-oxo acid dehydrogenases, and the H-protein of the glycine cleavage system (GCS). These domains transport CO2, acyl, or methylamine, respectively, between components of the complex/protein via a biotinyl or lipoyl group, which is covalently attached to a highly conserved lysine residue." Q#18339 - CGI_10023847 superfamily 247809 904 1112 4.96E-82 269.171 cl17255 ATP-grasp_4 superfamily - - ATP-grasp domain; This family includes a diverse set of enzymes that possess ATP-dependent carboxylate-amine ligase activity. Q#18339 - CGI_10023847 superfamily 201133 790 897 4.38E-56 191.542 cl02837 CPSase_L_chain superfamily - - "Carbamoyl-phosphate synthase L chain, N-terminal domain; Carbamoyl-phosphate synthase catalyzes the ATP-dependent synthesis of carbamyl-phosphate from glutamine or ammonia and bicarbonate. This important enzyme initiates both the urea cycle and the biosynthesis of arginine and/or pyrimidines. The carbamoyl-phosphate synthase (CPS) enzyme in prokaryotes is a heterodimer of a small and large chain. The small chain promotes the hydrolysis of glutamine to ammonia, which is used by the large chain to synthesise carbamoyl phosphate. See pfam00988. The small chain has a GATase domain in the carboxyl terminus. See pfam00117." Q#18339 - CGI_10023847 superfamily 244920 1124 1230 1.22E-37 138.7 cl08365 Biotin_carb_C superfamily - - "Biotin carboxylase C-terminal domain; Biotin carboxylase is a component of the acetyl-CoA carboxylase multi-component enzyme which catalyzes the first committed step in fatty acid synthesis in animals, plants and bacteria. Most of the active site residues reported in reference are in this C-terminal domain." 
Q#18339 - CGI_10023847 superfamily 202474 626 673 0.00426459 39.1741 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#18340 - CGI_10023848 superfamily 245864 44 380 2.98E-53 184.789 cl12078 p450 superfamily N - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#18341 - CGI_10023849 superfamily 241868 5 138 3.76E-53 166.269 cl00447 Nudix_Hydrolase superfamily - - "Nudix hydrolase is a superfamily of enzymes found in all three kingdoms of life, and it catalyzes the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+ for their activity. 
Members of this family are recognized by a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which forms a structural motif that functions as a metal binding and catalytic site. Substrates of nudix hydrolase include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance and "house-cleaning" enzymes. Substrate specificity is used to define child families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. This superfamily consists of at least nine families: IPP (isopentenyl diphosphate) isomerase, ADP ribose pyrophosphatase, mutT pyrophosphohydrolase, coenzyme-A pyrophosphatase, MTH1-7,8-dihydro-8-oxoguanine-triphosphatase, diadenosine tetraphosphate hydrolase, NADH pyrophosphatase, GDP-mannose hydrolase and the c-terminal portion of the mutY adenine glycosylase." Q#18342 - CGI_10023850 superfamily 241677 25 130 2.65E-57 178.219 cl00197 cyclophilin superfamily C - "cyclophilin: cyclophilin-type peptidylprolyl cis- trans isomerases. This family contains eukaryotic, bacterial and archeal proteins which exhibit a peptidylprolyl cis- trans isomerases activity (PPIase, Rotamase) and in addition bind the immunosuppressive drug cyclosporin (CsA). Immunosuppression in vertebrates is believed to be the result of the cyclophilin A-cyclosporin protein drug complex binding to and inhibiting the protein-phosphatase calcineurin. 
PPIase is an enzyme which accelerates protein folding by catalyzing the cis-trans isomerization of the peptide bonds preceding proline residues. Cyclophilins are a diverse family in terms of function and have been implicated in protein folding processes which depend on catalytic /chaperone-like activities. This group contains human cyclophilin 40, a co-chaperone of the hsp90 chaperone system; human cyclophilin A, a chaperone in the HIV-1 infectious process and; human cyclophilin H, a component of the U4/U6 snRNP, whose isomerization or chaperoning activities may play a role in RNA splicing." Q#18343 - CGI_10023851 superfamily 241677 85 237 7.68E-61 191.316 cl00197 cyclophilin superfamily - - "cyclophilin: cyclophilin-type peptidylprolyl cis- trans isomerases. This family contains eukaryotic, bacterial and archeal proteins which exhibit a peptidylprolyl cis- trans isomerases activity (PPIase, Rotamase) and in addition bind the immunosuppressive drug cyclosporin (CsA). Immunosuppression in vertebrates is believed to be the result of the cyclophilin A-cyclosporin protein drug complex binding to and inhibiting the protein-phosphatase calcineurin. PPIase is an enzyme which accelerates protein folding by catalyzing the cis-trans isomerization of the peptide bonds preceding proline residues. Cyclophilins are a diverse family in terms of function and have been implicated in protein folding processes which depend on catalytic /chaperone-like activities. This group contains human cyclophilin 40, a co-chaperone of the hsp90 chaperone system; human cyclophilin A, a chaperone in the HIV-1 infectious process and; human cyclophilin H, a component of the U4/U6 snRNP, whose isomerization or chaperoning activities may play a role in RNA splicing." 
Q#18344 - CGI_10023852 superfamily 247856 128 154 0.00988073 34.4457 cl17302 EFh superfamily C - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#18347 - CGI_10023855 superfamily 247857 78 170 0.000703866 39.5344 cl17303 Esterase_713_like superfamily NC - Novel bacterial esterase that cleaves esters on halogenated cyclic compounds; This family contains proteins similar to a novel bacterial esterase (Alcaligenes esterase 713) with the alpha/beta hydrolase fold but does not contain the GXSXXG pentapeptide around the active site serine residue as commonly seen in other enzymes of this class. Esterase 713 shows negligible sequence homology to other esterase and lipase enzymes. It is active as a dimer and cleaves esters on halogenated cyclic compounds though its natural substrate is unknown. This enzyme is possibly exported from the cytosol to the periplasmic space. A large majority of sequences in this family have yet to be characterized. Q#18348 - CGI_10023856 superfamily 245818 137 248 1.82E-33 124.592 cl11966 Rel-Spo_like superfamily - - "RelA- and SpoT-like ppGpp Synthetases and Hydrolases, catalytic domain; The Rel-Spo superfamily includes the catalytic domains of Escherichia coli ppGpp synthetase (RelA), ppGpp synthetase/hydrolase (SpoT), and related proteins. RelA synthesizes (p)ppGpp in response to amino-acid starvation and in association with ribosomes. (p)ppGpp triggers the bacterial stringent response. SpoT catalyzes (p)ppGpp synthesis under carbon limitation in a ribosome-independent manner. It also catalyzes (p)ppGpp degradation. 
Gram-negative bacteria have two enzymes involved in (p)ppGpp metabolism while most Gram-positive organisms have a single Rel-Spo enzyme (Rel), which both synthesizes and degrades (p)ppGpp. The Arabidopsis thaliana Rel-Spo proteins, At-RSH1,-2, and-3 appear to regulate a rapid (p)ppGpp-mediated response to pathogens and other stresses. This catalytic domain is found in association with an N-terminal HD domain and a C-terminal metal dependent phosphohydrolase domain (TGS). Some Rel-Spo proteins also have a C-terminal regulatory ACT domain." Q#18348 - CGI_10023856 superfamily 217750 305 365 1.53E-15 72.2266 cl04280 PAP_assoc superfamily - - Cid1 family poly A polymerase; This domain is found in poly(A) polymerases and has been shown to have polynucleotide adenylyltransferase activity. Proteins in this family have been located to both the nucleus and the cytoplasm. Q#18350 - CGI_10023858 superfamily 241594 67 139 5.38E-05 40.3664 cl00077 HECTc superfamily N - "HECT domain; C-terminal catalytic domain of a subclass of Ubiquitin-protein ligase (E3). It binds specific ubiquitin-conjugating enzymes (E2), accepts ubiquitin from E2, transfers ubiquitin to substrate lysine side chains, and transfers additional ubiquitin molecules to the end of growing ubiquitin chains." Q#18354 - CGI_10023862 superfamily 243072 58 183 4.34E-35 124.418 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#18354 - CGI_10023862 superfamily 243072 124 249 2.00E-31 114.788 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#18354 - CGI_10023862 superfamily 243072 223 287 7.24E-11 57.7786 cl02529 ANK superfamily C - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#18354 - CGI_10023862 superfamily 243072 30 59 0.000190516 37.9188 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#18355 - CGI_10023863 superfamily 241573 14 334 5.54E-121 363.961 cl00051 CysPc superfamily - - "Calpains, domains IIa, IIb; calcium-dependent cytoplasmic cysteine proteinases, papain-like. Functions in cytoskeletal remodeling processes, cell differentiation, apoptosis and signal transduction." 
Q#18355 - CGI_10023863 superfamily 246669 482 606 6.33E-23 95.0379 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#18355 - CGI_10023863 superfamily 241653 347 457 8.07E-23 95.1247 cl00165 Calpain_III superfamily - - "Calpain, subdomain III. Calpains are calcium-activated cytoplasmic cysteine proteinases, participate in cytoskeletal remodeling processes, cell differentiation, apoptosis and signal transduction. Catalytic domain and the two calmodulin-like domains are separated by C2-like domain III. Domain III plays an important role in calcium-induced activation of calpain involving electrostatic interactions with subdomain II. Proposed to mediate calpain's interaction with phospholipids and translocation to cytoplasmic/nuclear membranes. CD includes subdomain III of typical and atypical calpains."
Q#18358 - CGI_10023866 superfamily 238012 171 197 0.00013629 39.2598 cl11390 EGF_Lam superfamily C - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#18358 - CGI_10023866 superfamily 243061 205 302 9.17E-28 105.117 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#18358 - CGI_10023866 superfamily 243061 309 356 5.11E-13 64.2854 cl02509 SRCR superfamily C - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#18360 - CGI_10023868 superfamily 243051 89 142 6.89E-09 50.8121 cl02479 MAM superfamily NC - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. 
In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal cord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich regions." Q#18361 - CGI_10023869 superfamily 241889 1 47 1.42E-11 56.8697 cl00474 PAP2_like superfamily N - "PAP2_like proteins, a super-family of histidine phosphatases and vanadium haloperoxidases, includes type 2 phosphatidic acid phosphatase or lipid phosphate phosphatase (LPP), Glucose-6-phosphatase, Phosphatidylglycerophosphatase B and bacterial acid phosphatase, vanadium chloroperoxidases, vanadium bromoperoxidases, and several other mostly uncharacterized subfamilies. Several members of this superfamily have been predicted to be transmembrane proteins." Q#18362 - CGI_10023870 superfamily 241889 76 199 4.14E-22 88.8412 cl00474 PAP2_like superfamily - - "PAP2_like proteins, a super-family of histidine phosphatases and vanadium haloperoxidases, includes type 2 phosphatidic acid phosphatase or lipid phosphate phosphatase (LPP), Glucose-6-phosphatase, Phosphatidylglycerophosphatase B and bacterial acid phosphatase, vanadium chloroperoxidases, vanadium bromoperoxidases, and several other mostly uncharacterized subfamilies. Several members of this superfamily have been predicted to be transmembrane proteins." Q#18363 - CGI_10023871 superfamily 241610 121 174 1.07E-07 47.2446 cl00101 KU superfamily - - BPTI/Kunitz family of serine protease inhibitors; Structure is a disulfide rich alpha+beta fold. BPTI (bovine pancreatic trypsin inhibitor) is an extensively studied model structure.
Q#18364 - CGI_10023872 superfamily 247743 221 355 8.11E-08 50.2223 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperones, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#18364 - CGI_10023872 superfamily 244709 30 191 1.56E-60 196.297 cl07381 BCS1_N superfamily - - BCS1 N terminal; This domain is found at the N terminal of the mitochondrial ATPase BCS1. It encodes the import and intramitochondrial sorting for the protein. Q#18365 - CGI_10023873 superfamily 245201 86 232 1.49E-14 71.8841 cl09925 PKc_like superfamily C - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#18365 - CGI_10023873 superfamily 245201 178 303 3.75E-07 50.1348 cl09925 PKc_like superfamily NC - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases.
These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#18366 - CGI_10023874 superfamily 246925 137 444 1.37E-43 162.525 cl15309 LRR_RI superfamily - - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#18366 - CGI_10023874 superfamily 247684 866 1224 1.83E-21 96.7621 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). 
Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#18368 - CGI_10023876 superfamily 247805 298 439 1.95E-23 98.9487 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#18368 - CGI_10023876 superfamily 247905 524 653 8.83E-12 64.1812 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#18368 - CGI_10023876 superfamily 243098 1089 1144 6.22E-10 57.2227 cl02573 TUDOR superfamily - - "Tudor domains are found in many eukaryotic organisms and have been implicated in protein-protein interactions in which methylated protein substrates bind to these domains. For example, the Tudor domain of Survival of Motor Neuron (SMN) binds to symmetrically dimethylated arginines of arginine-glycine (RG) rich sequences found in the C-terminal tails of Sm proteins. The SMN protein is linked to spinal muscular atrophy. Another example is the tandem tudor domains of 53BP1, which bind to histone H4 specifically dimethylated at Lys20 (H4-K20me2). 
53BP1 is a key transducer of the DNA damage checkpoint signal." Q#18368 - CGI_10023876 superfamily 243778 709 809 1.85E-17 79.9607 cl04503 HA2 superfamily - - "Helicase associated domain (HA2); This presumed domain is about 90 amino acid residues in length. It is found in a diverse set of RNA helicases. Its function is unknown, however it seems likely to be involved in nucleic acid binding." Q#18369 - CGI_10023877 superfamily 191675 1 84 8.08E-13 61.1275 cl06196 OPA3 superfamily N - "Optic atrophy 3 protein (OPA3); This family consists of several optic atrophy 3 (OPA3) proteins. OPA3 deficiency causes type III 3-methylglutaconic aciduria (MGA) in humans. This disease manifests with early bilateral optic atrophy, spasticity, extrapyramidal dysfunction, ataxia, and cognitive deficits, but normal longevity." Q#18370 - CGI_10023878 superfamily 241559 1 97 2.80E-15 73.1139 cl00030 CH superfamily - - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." Q#18371 - CGI_10006959 superfamily 245847 136 208 1.06E-08 52.9442 cl12042 FA58C superfamily C - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#18371 - CGI_10006959 superfamily 241619 34 80 0.000103217 40.2581 cl00112 PAN_APPLE superfamily C - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins.
Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#18376 - CGI_10006965 superfamily 216901 299 483 4.60E-84 265.989 cl03466 Rap_GAP superfamily - - Rap/ran-GAP; Rap/ran-GAP. Q#18379 - CGI_10006968 superfamily 243091 29 122 1.49E-06 48.6424 cl02566 SET superfamily - - "SET domain; SET domains are protein lysine methyltransferase enzymes. SET domains appear to be protein-protein interaction domains. It has been demonstrated that SET domains mediate interactions with a family of proteins that display similarity with dual-specificity phosphatases (dsPTPases). A subset of SET domains have been called PR domains. These domains are divergent in sequence from other SET domains, but also appear to mediate protein-protein interaction. The SET domain consists of two regions known as SET-N and SET-C. SET-C forms an unusual and conserved knot-like structure of probably functional importance. Additionally to SET-N and SET-C, an insert region (SET-I) and flanking regions of high structural variability form part of the overall structure." Q#18379 - CGI_10006968 superfamily 246975 528 549 0.00283774 38.0969 cl15478 zf-C2H2 superfamily - - "Zinc finger, C2H2 type; The C2H2 zinc finger is the classical zinc finger domain. The two conserved cysteines and histidines co-ordinate a zinc ion. The following pattern describes the zinc finger. #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C] Where X can be any amino acid, and numbers in brackets indicate the number of residues. The positions marked # are those that are important for the stable fold of the zinc finger. The final position can be either his or cys. The C2H2 zinc finger is composed of two short beta strands followed by an alpha helix. 
The amino terminal part of the helix binds the major groove in DNA binding zinc fingers. The accepted consensus binding sequence for Sp1 is usually defined by the asymmetric hexanucleotide core GGGCGG but this sequence does not include, among others, the GAG (=CTC) repeat that constitutes a high-affinity site for Sp1 binding to the wt1 promoter." Q#18379 - CGI_10006968 superfamily 222150 513 536 0.00781531 36.6009 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#18382 - CGI_10025975 superfamily 243161 10 56 0.00020522 37.759 cl02739 THAP superfamily C - "THAP domain; The THAP domain is a putative DNA-binding domain (DBD) and probably also binds a zinc ion. It features the conserved C2CH architecture (consensus sequence: Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His). Other universal features include the location of the domain at the N-termini of proteins, its size of about 90 residues, a C-terminal AVPTIF box and several other conserved residues. Orthologues of the human THAP domain have been identified in other vertebrates and probably worms and flies, but not in other eukaryotes or any prokaryotes." Q#18387 - CGI_10025981 superfamily 203213 1 21 0.00179968 33.2454 cl04999 HTH_psq superfamily NC - "helix-turn-helix, Psq domain; This DNA-binding motif is found in four copies in the pipsqueak protein of Drosophila melanogaster. In pipsqueak this domain binds to GAGA sequence." Q#18390 - CGI_10025984 superfamily 241574 1 113 1.33E-35 123.85 cl00053 PTPc superfamily NC - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. 
Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#18392 - CGI_10025987 superfamily 241644 1 94 1.48E-17 72.712 cl00154 UBCc superfamily N - "Ubiquitin-conjugating enzyme E2, catalytic (UBCc) domain. This is part of the ubiquitin-mediated protein degradation pathway in which a thiol-ester linkage forms between a conserved cysteine and the C-terminus of ubiquitin and complexes with ubiquitin protein ligase enzymes, E3. This pathway regulates many fundamental cellular processes. There are also other E2s which form thiol-ester linkages without the use of E3s as well as several UBC homologs (TSG101, Mms2, Croc-1 and similar proteins) which lack the active site cysteine essential for ubiquitination and appear to function in DNA repair pathways which were omitted from the scope of this CD." Q#18393 - CGI_10025988 superfamily 243778 355 446 3.30E-31 116.169 cl04503 HA2 superfamily - - "Helicase associated domain (HA2); This presumed domain is about 90 amino acid residues in length. It is found in a diverse set of RNA helicases. Its function is unknown, however it seems likely to be involved in nucleic acid binding." Q#18393 - CGI_10025988 superfamily 243082 34 159 6.64E-24 101.68 cl02553 Peptidase_C19 superfamily N - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin.
The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." Q#18394 - CGI_10025989 superfamily 220692 52 363 1.40E-10 60.6809 cl18570 7TM_GPCR_Srw superfamily - - Serpentine type 7TM GPCR chemoreceptor Srw; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srw is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. The genes encoding Srw do not appear to be under as strong an adaptive evolutionary pressure as those of Srz. Q#18396 - CGI_10025991 superfamily 216939 9 68 2.66E-08 47.6577 cl03492 PC4 superfamily N - Transcriptional Coactivator p15 (PC4); p15 has a bipartite structure composed of an amino-terminal regulatory domain and a carboxy-terminal cryptic DNA-binding domain. The DNA-binding activity of the carboxy-terminal is disguised by the amino-terminal p15 domain. Activity is controlled by protein kinases that target the regulatory domain. Q#18396 - CGI_10025991 superfamily 216939 93 135 0.000195133 37.2573 cl03492 PC4 superfamily N - Transcriptional Coactivator p15 (PC4); p15 has a bipartite structure composed of an amino-terminal regulatory domain and a carboxy-terminal cryptic DNA-binding domain. The DNA-binding activity of the carboxy-terminal is disguised by the amino-terminal p15 domain. Activity is controlled by protein kinases that target the regulatory domain. Q#18397 - CGI_10025992 superfamily 241564 193 261 1.39E-16 71.9131 cl00035 BIR superfamily - - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. 
In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." Q#18397 - CGI_10025992 superfamily 247792 4 62 1.13E-09 52.8332 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#18398 - CGI_10025993 superfamily 245864 5 440 1.52E-72 237.947 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. 
These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#18400 - CGI_10025995 superfamily 247725 176 305 7.47E-50 167.396 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting proteins to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and other proteins."
Q#18400 - CGI_10025995 superfamily 247792 389 427 7.00E-05 40.5068 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#18400 - CGI_10025995 superfamily 220215 10 89 2.92E-18 79.1914 cl09630 FERM_N superfamily - - FERM N-terminal domain; This domain is the N-terminal ubiquitin-like structural domain of the FERM domain. Q#18400 - CGI_10025995 superfamily 215882 94 201 3.29E-14 68.8466 cl09511 FERM_M superfamily - - FERM central domain; This domain is the central structural domain of the FERM domain. Q#18401 - CGI_10025996 superfamily 219963 160 269 7.42E-16 70.7146 cl08487 GCV_T_C superfamily - - "Glycine cleavage T-protein C-terminal barrel domain; This is a family of glycine cleavage T-proteins, part of the glycine cleavage multienzyme complex (GCV) found in bacteria and the mitochondria of eukaryotes. GCV catalyzes the catabolism of glycine in eukaryotes. The T-protein is an aminomethyl transferase." Q#18402 - CGI_10025997 superfamily 240523 619 817 1.74E-76 251.387 cl18941 DAXX_histone_binding superfamily - - "Histone binding domain of the death-domain associated protein (DAXX); DAXX is a nuclear protein that modulates transcription of various genes and is involved in cell death and/or the suppression of growth. DAXX is also a histone chaperone conserved in Metazoa that acts specifically on histone H3.3. 
This alignment models a functional domain of DAXX that interacts with the histone H3.3-H4 dimer, and in doing so competes with DNA binding and interactions between the histone chaperone ASF1/CIA and the H3-H4 dimer." Q#18403 - CGI_10025998 superfamily 217293 27 232 2.81E-41 146.239 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#18405 - CGI_10026000 superfamily 241578 167 317 5.18E-31 115.852 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#18406 - CGI_10026001 superfamily 241599 264 324 2.22E-16 73.8168 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." 
Q#18406 - CGI_10026001 superfamily 217730 35 262 3.01E-97 294.807 cl04264 PBC superfamily - - PBC domain; The PBC domain is a member of the TALE (three-amino-acid loop extension) superclass of homeodomain proteins. Q#18407 - CGI_10026002 superfamily 248458 148 285 2.64E-21 94.3028 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." 
Q#18408 - CGI_10026003 superfamily 215733 221 464 2.42E-20 91.4726 cl02811 E1-E2_ATPase superfamily - - E1-E2 ATPase; E1-E2 ATPase. Q#18408 - CGI_10026003 superfamily 226572 783 836 0.000167587 41.7756 cl18761 COG4087 superfamily NC - Soluble P-type ATPase [General function prediction only] Q#18408 - CGI_10026003 superfamily 222006 535 610 0.000794265 39.1278 cl16182 Hydrolase_like2 superfamily - - Putative hydrolase of sodium-potassium ATPase alpha subunit; This is a putative hydrolase of the sodium-potassium ATPase alpha subunit. Q#18409 - CGI_10026004 superfamily 155755 1 113 8.79E-18 73.3254 cl03922 V-ATPase_G superfamily - - "Vacuolar (H+)-ATPase G subunit; This family represents the eukaryotic vacuolar (H+)-ATPase (V-ATPase) G subunit. V-ATPases generate an acidic environment in several intracellular compartments. Correspondingly, they are found as membrane-attached proteins in several organelles. They are also found in the plasma membranes of some specialised cells. V-ATPases consist of peripheral (V1) and membrane integral (V0) heteromultimeric complexes. The G subunit is part of the V1 subunit, but is also thought to be strongly attached to the V0 complex. It may be involved in the coupling of ATP degradation to H+ translocation." Q#18410 - CGI_10026005 superfamily 218721 12 337 9.92E-39 144.951 cl05344 TROVE superfamily - - "TROVE domain; This presumed domain is found in TEP1 and Ro60 proteins, that are RNA-binding components of Telomerase, Ro and Vault RNPs. This domain has been named TROVE, (after Telomerase, Ro and Vault). This domain is probably RNA-binding." Q#18413 - CGI_10026008 superfamily 241599 67 125 3.79E-23 95.0028 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." 
Q#18415 - CGI_10026010 superfamily 219852 871 1313 2.71E-09 60.867 cl15656 Sfi1 superfamily - - Sfi1 spindle body protein; This is a family of fungal spindle pole body proteins that play a role in spindle body duplication. They contain binding sites for calmodulin-like proteins called centrins which are present in microtubule-organising centres. Q#18415 - CGI_10026010 superfamily 219852 1540 2054 1.08E-08 58.941 cl15656 Sfi1 superfamily - - Sfi1 spindle body protein; This is a family of fungal spindle pole body proteins that play a role in spindle body duplication. They contain binding sites for calmodulin-like proteins called centrins which are present in microtubule-organising centres. Q#18416 - CGI_10026011 superfamily 247905 1003 1125 6.66E-29 115.028 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#18416 - CGI_10026011 superfamily 247805 700 858 4.62E-20 89.704 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 
Q#18416 - CGI_10026011 superfamily 248013 576 624 1.03E-05 45.3327 cl17459 CHROMO superfamily - - "Chromatin organization modifier (chromo) domain is a conserved region of around 50 amino acids found in a variety of chromosomal proteins, which appear to play a role in the functional organization of the eukaryotic nucleus. Experimental evidence implicates the chromo domain in the binding activity of these proteins to methylated histone tails and maybe RNA. May occur as single instance, in a tandem arrangement or followed by a related "chromo shadow" domain." Q#18416 - CGI_10026011 superfamily 248013 478 514 3.17E-05 44.1771 cl17459 CHROMO superfamily N - "Chromatin organization modifier (chromo) domain is a conserved region of around 50 amino acids found in a variety of chromosomal proteins, which appear to play a role in the functional organization of the eukaryotic nucleus. Experimental evidence implicates the chromo domain in the binding activity of these proteins to methylated histone tails and maybe RNA. May occur as single instance, in a tandem arrangement or followed by a related "chromo shadow" domain." Q#18416 - CGI_10026011 superfamily 219716 1730 1902 3.93E-101 324.147 cl06903 CHDCT2 superfamily - - CHDCT2 (NUC038) domain; The CHDCT2 C-terminal domain is found in PHD/RING finger and chromo domain-associated CHD-like helicases. Q#18416 - CGI_10026011 superfamily 219039 1333 1466 2.75E-66 223.698 cl05788 DUF1086 superfamily - - "Domain of Unknown Function (DUF1086); This family consists of several eukaryotic domains of unknown function which are present in chromodomain helicase DNA binding proteins. This domain is often found in conjunction with pfam00176, pfam00271, pfam06465, pfam00385 and pfam00628." Q#18416 - CGI_10026011 superfamily 148208 1245 1297 4.32E-18 81.8243 cl05792 DUF1087 superfamily - - "Domain of Unknown Function (DUF1087); Members of this family are found in various chromatin remodelling factors and transposases. 
Their exact function is, as yet, unknown." Q#18416 - CGI_10026011 superfamily 116680 111 160 7.42E-17 77.8834 cl06902 CHDNT superfamily - - CHDNT (NUC034) domain; The CHDNT domain is found in PHD/RING finger and chromo domain-associated helicases. Q#18416 - CGI_10026011 superfamily 247999 385 430 1.30E-15 74.4491 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#18416 - CGI_10026011 superfamily 247999 327 369 3.16E-14 70.2119 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#18417 - CGI_10026012 superfamily 148593 15 184 3.56E-102 294.353 cl06212 TRAP-gamma superfamily - - "Translocon-associated protein, gamma subunit (TRAP-gamma); This family consists of several eukaryotic translocon-associated protein, gamma subunit (TRAP-gamma) sequences. The translocation site (translocon), at which nascent polypeptides pass through the endoplasmic reticulum membrane, contains a component previously called 'signal sequence receptor' that is now renamed as 'translocon-associated protein' (TRAP). The TRAP complex is comprised of four membrane proteins alpha, beta, gamma and delta which are present in a stoichiometric relation, and are genuine neighbors in intact microsomes. The gamma subunit is predicted to span the membrane four times." Q#18418 - CGI_10026013 superfamily 220692 12 233 6.51E-05 42.1913 cl18570 7TM_GPCR_Srw superfamily - - Serpentine type 7TM GPCR chemoreceptor Srw; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. 
Srw is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. The genes encoding Srw do not appear to be under as strong an adaptive evolutionary pressure as those of Srz. Q#18420 - CGI_10026015 superfamily 198858 1 178 1.73E-25 97.2271 cl05741 AGTRAP superfamily - - "Angiotensin II, type I receptor-associated protein (AGTRAP); This family consists of several angiotensin II, type I receptor-associated protein (AGTRAP) sequences. AGTRAP is known to interact specifically with the carboxyl-terminal cytoplasmic region of the angiotensin II type 1 (AT(1)) receptor to regulate different aspects of AT(1) receptor physiology. The function of this family is unclear." Q#18421 - CGI_10026016 superfamily 204041 22 172 1.29E-47 155.047 cl07367 GLTP superfamily - - Glycolipid transfer protein (GLTP); GLTP is a cytosolic protein that catalyzes the intermembrane transfer of glycolipids. Q#18424 - CGI_10026019 superfamily 241563 62 98 1.06E-05 43.622 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#18424 - CGI_10026019 superfamily 110440 662 688 0.000220893 39.6985 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." 
Q#18424 - CGI_10026019 superfamily 241563 8 53 0.000328085 39.3848 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#18424 - CGI_10026019 superfamily 110440 703 730 0.00564259 35.4613 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#18425 - CGI_10026020 superfamily 243092 12 169 2.15E-17 76.99 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring 
propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#18427 - CGI_10014527 superfamily 241574 253 406 1.59E-44 158.903 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#18427 - CGI_10014527 superfamily 241574 451 625 6.23E-19 86.1005 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#18427 - CGI_10014527 superfamily 245847 29 97 0.000139895 40.9488 cl12042 FA58C superfamily C - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." 
Q#18429 - CGI_10014529 superfamily 145792 193 218 0.00206569 34.5439 cl10597 Antistasin superfamily - - Antistasin family; Members of this family are inhibitors of trypsin family proteases. This domain is highly disulphide bonded. The domain is also found in some large extracellular proteins in multiple copies. Q#18432 - CGI_10014532 superfamily 247724 248 408 4.99E-05 42.4436 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#18432 - CGI_10014532 superfamily 242902 41 190 1.28E-14 70.7386 cl02144 TLD superfamily - - TLD; This domain is predicted to be an enzyme and is often found associated with pfam01476. 
Q#18433 - CGI_10014533 superfamily 247724 109 227 0.00226455 37.9104 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#18434 - CGI_10014534 superfamily 247724 31 197 0.000558938 38.9768 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#18439 - CGI_10014539 superfamily 246723 2 340 2.96E-168 480.523 cl14813 GluZincin superfamily - - "Peptidase Gluzincin family (thermolysin-like proteinases, TLPs) includes peptidases M1, M2, M3, M4, M13, M32 and M36 (fungalysins); Gluzincin family (thermolysin-like peptidases or TLPs) includes several zinc-dependent metallopeptidases such as the M1, M2, M3, M4, M13, M32, M36 peptidases (MEROPS classification), and contain HEXXH and EXXXD motifs as part of their active site. All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. M1 family includes aminopeptidase N (APN) and leukotriene A4 hydrolase (LTA4H). APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types. LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity such that the two activities occupy different, but overlapping sites. The peptidase M3 or neurolysin-like family, includes M3, M2 and M32 metallopeptidases. 
The M3 peptidases have two subfamilies: M3A, includes thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (3.4.24.16), and the mitochondrial intermediate peptidase; M3B contains oligopeptidase F. M2 peptidase angiotensin converting enzyme (ACE, EC 3.4.15.1) catalyzes the conversion of decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II. ACE is a key part of the renin-angiotensin system that regulates blood pressure, thus ACE inhibitors are important for the treatment of hypertension. M32 family includes two eukaryotic enzymes from protozoa Trypanosoma cruzi, a causative agent of Chagas' disease, and Leishmania major, a parasite that causes leishmaniasis, making them attractive targets for drug development. The M4 family includes secreted protease thermolysin (EC 3.4.24.27), pseudolysin, aureolysin, neutral protease as well as fungalysin and bacillolysin (EC 3.4.24.28) that degrade extracellular proteins and peptides for bacterial nutrition, especially prior to sporulation. Thermolysin is widely used as a nonspecific protease to obtain fragments for peptide sequencing as well as in production of the artificial sweetener aspartame. M13 family includes neprilysin (EC 3.4.24.11) and endothelin-converting enzyme I (ECE-1, EC 3.4.24.71), which fulfill a broad range of physiological roles due to the greater variation in the S2' subsite allowing substrate specificity and are prime therapeutic targets for selective inhibition. Peptidase M36 (fungalysin) family includes endopeptidases from pathogenic fungi. Fungalysin hydrolyzes extracellular matrix proteins such as elastin and keratin. Aspergillus fumigatus causes the pulmonary disease aspergillosis by invading the lungs of immuno-compromised animals and secreting fungalysin that possibly breaks down proteinaceous structural barriers." Q#18441 - CGI_10014541 superfamily 241754 12 718 0 1056.81 cl00286 Motor_domain superfamily - - Myosin and Kinesin motor domain. 
These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. Q#18441 - CGI_10014541 superfamily 218855 826 1023 1.05E-34 132.811 cl10652 Myosin_TH1 superfamily - - Myosin tail; Myosin tail. Q#18444 - CGI_10001810 superfamily 241584 25 73 0.00047885 34.0091 cl00065 FN3 superfamily C - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#18447 - CGI_10002069 superfamily 243065 310 464 3.26E-09 55.9109 cl02516 VWD superfamily - - von Willebrand factor type D domain; Luciferin-2-monooxygenase from Vargula hilgendorfii contains a vwd domain. Its function is unrelated but the similarity is very strong by several methods. Q#18454 - CGI_10004766 superfamily 248020 526 762 1.77E-29 120.262 cl17466 Sulfatase superfamily N - Sulfatase; Sulfatase. Q#18454 - CGI_10004766 superfamily 248020 27 90 6.70E-11 63.2524 cl17466 Sulfatase superfamily C - Sulfatase; Sulfatase. Q#18455 - CGI_10004767 superfamily 216807 126 214 8.14E-28 105.04 cl18379 DUF106 superfamily N - Integral membrane protein DUF106; This archaebacterial protein family has no known function. Members are predicted to be integral membrane proteins. Q#18456 - CGI_10004768 superfamily 245819 778 955 1.96E-63 214.75 cl11967 Nucleotidyl_cyc_III superfamily - - "Class III nucleotidyl cyclases; Class III nucleotidyl cyclases are the largest, most diverse group of nucleotidyl cyclases (NC's) containing prokaryotic and eukaryotic proteins. 
They can be divided into two major groups; the mononucleotidyl cyclases (MNC's) and the diguanylate cyclases (DGC's). The MNC's, which include the adenylate cyclases (AC's) and the guanylate cyclases (GC's), have a conserved cyclase homology domain (CHD), while the DGC's have a conserved GGDEF domain, named after a conserved motif within this subgroup. Their products, cyclic guanylyl and adenylyl nucleotides, are second messengers that play important roles in eukaryotic signal transduction and prokaryotic sensory pathways." Q#18456 - CGI_10004768 superfamily 245201 473 706 3.68E-28 114.641 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#18456 - CGI_10004768 superfamily 219526 724 765 5.62E-06 47.6139 cl06648 HNOBA superfamily N - "Heme NO binding associated; The HNOBA domain is found associated with the HNOB domain and pfam00211 in soluble cyclases and signalling proteins. The HNOB domain is predicted to function as a heme-dependent sensor for gaseous ligands, and transduce diverse downstream signals, in both bacteria and animals." Q#18457 - CGI_10004769 superfamily 241886 9 295 2.92E-93 284.837 cl00470 Aldo_ket_red superfamily - - "Aldo-keto reductases (AKRs) are a superfamily of soluble NAD(P)(H) oxidoreductases whose chief purpose is to reduce aldehydes and ketones to primary and secondary alcohols. AKRs are present in all phyla and are of importance to both health and industrial applications. 
Members have very distinct functions and include the prokaryotic 2,5-diketo-D-gluconic acid reductases and beta-keto ester reductases, the eukaryotic aldose reductases, aldehyde reductases, hydroxysteroid dehydrogenases, steroid 5beta-reductases, potassium channel beta-subunits and aflatoxin aldehyde reductases, among others." Q#18457 - CGI_10004769 superfamily 247792 352 406 0.00105085 36.6548 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#18458 - CGI_10004770 superfamily 243179 104 207 3.94E-27 103.196 cl02781 tetraspanin_LEL superfamily - - "Tetraspanin, extracellular domain or large extracellular loop (LEL). Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. The tetraspanin family contains CD9, CD63, CD37, CD53, CD82, CD151, and CD81, amongst others. Tetraspanins are involved in diverse processes such as cell activation and proliferation, adhesion and motility, differentiation, cancer, and others. Their various functions may relate to their ability to act as molecular facilitators, grouping specific cell-surface proteins and affecting formation and stability of signaling complexes. 
Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web", which may also include integrins." Q#18458 - CGI_10004770 superfamily 241622 241 311 4.13E-12 61.0434 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#18459 - CGI_10004771 superfamily 247856 577 638 2.22E-06 45.6165 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#18459 - CGI_10004771 superfamily 207712 88 193 3.44E-33 123.191 cl02728 DUF1126 superfamily - - Repeat of unknown function (DUF1126); This family consists of several eukaryote specific repeats of around 35 residues in length. The function of this family is unknown. Q#18459 - CGI_10004771 superfamily 207712 413 517 1.93E-31 118.183 cl02728 DUF1126 superfamily - - Repeat of unknown function (DUF1126); This family consists of several eukaryote specific repeats of around 35 residues in length. The function of this family is unknown. Q#18459 - CGI_10004771 superfamily 207712 236 352 6.71E-30 113.946 cl02728 DUF1126 superfamily - - Repeat of unknown function (DUF1126); This family consists of several eukaryote specific repeats of around 35 residues in length. The function of this family is unknown. Q#18460 - CGI_10004772 superfamily 241622 52 134 1.25E-19 83.385 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. 
PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#18460 - CGI_10004772 superfamily 247725 264 353 0.000766684 37.9133 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#18461 - CGI_10004773 superfamily 246597 47 113 2.32E-29 109.625 cl13995 MPP_superfamily superfamily C - "metallophosphatase superfamily, metallophosphatase domain; Metallophosphatases (MPPs), also known as metallophosphoesterases, phosphodiesterases (PDEs), binuclear metallophosphoesterases, and dimetal-containing phosphoesterases (DMPs), represent a diverse superfamily of enzymes with a conserved domain containing an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. This superfamily includes: the phosphoprotein phosphatases (PPPs), Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination." 
Q#18461 - CGI_10004773 superfamily 246597 195 261 5.67E-28 105.773 cl13995 MPP_superfamily superfamily N - "metallophosphatase superfamily, metallophosphatase domain; Metallophosphatases (MPPs), also known as metallophosphoesterases, phosphodiesterases (PDEs), binuclear metallophosphoesterases, and dimetal-containing phosphoesterases (DMPs), represent a diverse superfamily of enzymes with a conserved domain containing an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. This superfamily includes: the phosphoprotein phosphatases (PPPs), Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination." Q#18465 - CGI_10026268 superfamily 242406 61 165 9.44E-11 56.0605 cl01271 DUF1768 superfamily N - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. Q#18466 - CGI_10026269 superfamily 245814 119 190 1.38E-12 64.0475 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. 
A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18467 - CGI_10026270 superfamily 248458 23 199 2.63E-13 69.6501 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." 
Q#18467 - CGI_10026270 superfamily 248458 267 440 9.40E-06 46.1529 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#18468 - CGI_10026271 superfamily 248458 195 340 6.91E-07 49.2345 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. 
MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#18469 - CGI_10026272 superfamily 248458 262 439 2.04E-09 57.7089 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. 
Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#18469 - CGI_10026272 superfamily 248458 11 183 0.00627981 37.2933 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. 
The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#18470 - CGI_10026273 superfamily 248458 12 188 3.76E-13 69.2649 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. 
Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#18471 - CGI_10026274 superfamily 247743 178 344 9.51E-22 90.6683 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." 
Q#18472 - CGI_10026275 superfamily 246713 29 185 8.87E-42 145.077 cl14786 ENDO3c superfamily - - "endonuclease III; includes endonuclease III (DNA-(apurinic or apyrimidinic site) lyase), alkylbase DNA glycosidases (Alka-family) and other DNA glycosidases" Q#18472 - CGI_10026275 superfamily 246713 232 327 2.71E-26 102.705 cl14786 ENDO3c superfamily N - "endonuclease III; includes endonuclease III (DNA-(apurinic or apyrimidinic site) lyase), alkylbase DNA glycosidases (Alka-family) and other DNA glycosidases" Q#18473 - CGI_10026276 superfamily 246713 58 214 5.46E-44 147.773 cl14786 ENDO3c superfamily - - "endonuclease III; includes endonuclease III (DNA-(apurinic or apyrimidinic site) lyase), alkylbase DNA glycosidases (Alka-family) and other DNA glycosidases" Q#18475 - CGI_10026278 superfamily 241770 153 310 4.02E-07 47.3904 cl00309 PRTases_typeI superfamily - - "Phosphoribosyl transferase (PRT)-type I domain; Phosphoribosyl transferase (PRT) domain. The type I PRTases are identified by a conserved PRPP binding motif which features two adjacent acidic residues surrounded by one or more hydrophobic residues. PRTases catalyze the displacement of the alpha-1'-pyrophosphate of 5-phosphoribosyl-alpha1-pyrophosphate (PRPP) by a nitrogen-containing nucleophile. The reaction products are an alpha-1 substituted ribose-5'-phosphate and a free pyrophosphate (PP). PRPP, an activated form of ribose-5-phosphate, is a key metabolite connecting nucleotide synthesis and salvage pathways. 
The type I PRTase family includes a range of diverse phosphoribosyl transferase enzymes and regulatory proteins of the nucleotide synthesis and salvage pathways, including adenine phosphoribosyltransferase EC:2.4.2.7., hypoxanthine-guanine-xanthine phosphoribosyltransferase, hypoxanthine phosphoribosyltransferase EC:2.4.2.8., ribose-phosphate pyrophosphokinase EC:2.7.6.1., amidophosphoribosyltransferase EC:2.4.2.14., orotate phosphoribosyltransferase EC:2.4.2.10., uracil phosphoribosyltransferase EC:2.4.2.9., and xanthine-guanine phosphoribosyltransferase EC:2.4.2.22." Q#18475 - CGI_10026278 superfamily 222383 11 126 3.93E-47 157.573 cl16402 Pribosyltran_N superfamily - - "N-terminal domain of ribose phosphate pyrophosphokinase; This family is frequently found N-terminal to the Pribosyltran, pfam00156." Q#18476 - CGI_10026279 superfamily 247724 12 182 1.77E-112 324.581 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. 
Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#18476 - CGI_10026279 superfamily 243073 190 232 1.71E-15 68.3566 cl02533 SOCS superfamily - - "SOCS (suppressors of cytokine signaling) box. The SOCS box is found in the C-terminal region of CIS/SOCS family proteins (in combination with a SH2 domain), ASBs (ankyrin repeat-containing proteins with a SOCS box), SSBs (SPRY domain-containing proteins with a SOCS box), and WSBs (WD40 repeat-containing proteins with a SOCS box), as well as, other miscellaneous proteins. The function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions." Q#18477 - CGI_10026280 superfamily 245201 56 343 0 553.896 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#18479 - CGI_10026282 superfamily 241868 203 315 1.60E-36 130.118 cl00447 Nudix_Hydrolase superfamily - - "Nudix hydrolase is a superfamily of enzymes found in all three kingdoms of life, and it catalyzes the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+ for their activity. 
Members of this family are recognized by a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which forms a structural motif that functions as a metal binding and catalytic site. Substrates of nudix hydrolase include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance and "house-cleaning" enzymes. Substrate specificity is used to define child families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. This superfamily consists of at least nine families: IPP (isopentenyl diphosphate) isomerase, ADP ribose pyrophosphatase, mutT pyrophosphohydrolase, coenzyme-A pyrophosphatase, MTH1-7,8-dihydro-8-oxoguanine-triphosphatase, diadenosine tetraphosphate hydrolase, NADH pyrophosphatase, GDP-mannose hydrolase and the c-terminal portion of the mutY adenine glycosylase." Q#18479 - CGI_10026282 superfamily 247976 51 115 5.54E-27 103.419 cl17422 RF-1 superfamily C - "RF-1 domain; This domain is found in peptide chain release factors such as RF-1 and RF-2, and a number of smaller proteins of unknown function. This domain contains the peptidyl-tRNA hydrolase activity. The domain contains a highly conserved motif GGQ, where the glutamine is thought to coordinate the water that mediates the hydrolysis." 
Q#18480 - CGI_10026283 superfamily 191640 25 127 3.71E-24 92.7185 cl06121 DUF1279 superfamily - - Protein of unknown function (DUF1279); This family represents the C-terminus (approx. 120 residues) of a number of eukaryotic proteins of unknown function. Q#18481 - CGI_10026284 superfamily 247727 84 202 0.000100338 40.1059 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#18482 - CGI_10026285 superfamily 245206 11 237 2.28E-58 188.663 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." 
Q#18485 - CGI_10026288 superfamily 246937 110 190 2.34E-11 57.5835 cl15368 RNase_Ire1_like superfamily N - "RNase domain (also known as the kinase extension nuclease domain) of Ire1 and RNase L; This RNase domain is found in the multi-functional protein Ire1; Ire1 also contains a type I transmembrane serine/threonine protein kinase (STK) domain, and a Luminal dimerization domain. Ire1 is essential for the endoplasmic reticulum (ER) unfolded protein response (UPR). The UPR is activated when protein misfolding is detected in the ER in order to reduce the synthesis of new proteins and increase the capacity of the ER to cope with the stress. IRE1 acts as an ER stress sensor; IRE1 dimerizes through its N-terminal luminal domain and forms oligomers, promoting trans-autophosphorylation by its cytosolic kinase domain which stimulates its endoribonuclease (RNase) activity and results in the cleavage of its mRNA substrate, Hac1 in yeast and Xbp1 in metazoans, thus promoting a splicing event that enables translation into a transcription factor which activates the UPR. This RNase domain is also found in Ribonuclease L (RNase L), sometimes referred to as the 2-5A-dependent RNase. RNase L is a highly regulated, latent endoribonuclease widely expressed in most mammalian tissues. It is involved in the mediation of the antiviral and pro-apoptotic activities of the interferon-inducible 2-5A system; the interferon (IFN)-inducible 2'-5'-oligoadenylate synthetase (OAS)/RNase L pathway blocks infections by certain types of viruses through cleavage of viral and cellular single-stranded RNA. RNase L has been shown to have an impact on the pathogenesis of prostate cancer; the RNase L gene, RNASEL, has been identified as a strong candidate for the hereditary prostate cancer 1 (HPC1) allele." 
Q#18485 - CGI_10026288 superfamily 245201 22 61 0.000967558 37.8487 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#18486 - CGI_10026289 superfamily 245201 1208 1351 2.51E-13 69.9581 cl09925 PKc_like superfamily C - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#18488 - CGI_10026291 superfamily 243072 81 203 1.10E-23 92.8318 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#18490 - CGI_10026293 superfamily 198757 98 193 1.99E-41 144.774 cl02658 TAFH superfamily - - "NHR1 homology to TAF; This corresponds to the region NHR1 that is conserved between the product of the nervy gene in Drosophila and the human mtg8b protein, which is hypothesised to be a transcription factor." Q#18490 - CGI_10026293 superfamily 149750 320 366 8.80E-15 69.896 cl07409 NHR2 superfamily N - NHR2 domain like; The NHR2 (Nervy homology 2) domain is found in the ETO protein where it mediates oligomerisation and protein-protein interactions. It forms an alpha-helical tetramer. Q#18491 - CGI_10026294 superfamily 244881 337 656 6.30E-74 244.795 cl08267 ISOPREN_C2_like superfamily - - "This group contains class II terpene cyclases, protein prenyltransferases beta subunit, two broadly specific proteinase inhibitors alpha2-macroglobulin (alpha (2)-M) and pregnancy zone protein (PZP) and, the C3 C4 and C5 components of vertebrate complement. Class II terpene cyclases include squalene cyclase (SQCY) and 2,3-oxidosqualene cyclase (OSQCY), these integral membrane proteins catalyze a cationic cyclization cascade converting linear triterpenes to fused ring compounds. The protein prenyltransferases include protein farnesyltransferase (FTase) and geranylgeranyltransferase types I and II (GGTase-I and GGTase-II) which catalyze the carboxyl-terminal lipidation of Ras, Rab, and several other cellular signal transduction proteins, facilitating membrane associations and specific protein-protein interactions. Alpha (2)-M is a major carrier protein in serum and involved in the immobilization and entrapment of proteases. PZP is a pregnancy associated protein. Alpha (2)-M and PZP are known to bind to and, may modulate, the activity of placental protein-14 in T-cell growth and cytokine production thereby protecting the allogeneic fetus from attack by the maternal immune system." 
Q#18491 - CGI_10026294 superfamily 215788 162 254 1.55E-20 88.0051 cl08251 A2M superfamily - - Alpha-2-macroglobulin family; This family includes the C-terminal region of the alpha-2-macroglobulin family. Q#18491 - CGI_10026294 superfamily 203720 766 808 0.000155836 40.9946 cl08457 A2M_recep superfamily C - A-macroglobulin receptor; This family includes the receptor domain region of the alpha-2-macroglobulin family. Q#18492 - CGI_10026295 superfamily 220692 85 343 1.91E-06 47.5841 cl18570 7TM_GPCR_Srw superfamily - - Serpentine type 7TM GPCR chemoreceptor Srw; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srw is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. The genes encoding Srw do not appear to be under as strong an adaptive evolutionary pressure as those of Srz. Q#18494 - CGI_10026297 superfamily 220695 49 193 0.000834217 39.4843 cl18571 7TM_GPCR_Srx superfamily C - Serpentine type 7TM GPCR chemoreceptor Srx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srx is part of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#18496 - CGI_10026299 superfamily 242206 140 261 5.64E-60 188.152 cl00938 Rieske superfamily - - "Rieske domain; a [2Fe-2S] cluster binding domain commonly found in Rieske non-heme iron oxygenase (RO) systems such as naphthalene and biphenyl dioxygenases, as well as in plant/cyanobacterial chloroplast b6f and mitochondrial cytochrome bc(1) complexes. 
The Rieske domain can be divided into two subdomains, with an incomplete six-stranded, antiparallel beta-barrel at one end, and an iron-sulfur cluster binding subdomain at the other. The Rieske iron-sulfur center contains a [2Fe-2S] cluster, which is involved in electron transfer, and is liganded to two histidine and two cysteine residues present in conserved sequences called Rieske motifs. In RO systems, the N-terminal Rieske domain of the alpha subunit acts as an electron shuttle that accepts electrons from a reductase or ferredoxin component and transfers them to the mononuclear iron in the alpha subunit C-terminal domain to be used for catalysis." Q#18496 - CGI_10026299 superfamily 217287 68 128 2.55E-10 54.6779 cl03782 UCR_TM superfamily - - Ubiquinol cytochrome reductase transmembrane region; Each subunit of the cytochrome bc1 complex provides a single helix (this family) to make up the transmembrane region of the complex. Q#18498 - CGI_10026301 superfamily 218942 1 394 2.90E-126 374.734 cl12330 NPR2 superfamily - - Nitrogen permease regulator 2; This family of regulators are involved in post-translational control of nitrogen permease. Q#18499 - CGI_10026302 superfamily 243100 248 296 8.77E-06 42.2919 cl02576 B_zip1 superfamily - - "basic leucine zipper DNA-binding and multimerization region of GCN4 and related proteins; Basic leucine zipper (bZIP) transcription factors act in networks of homo- and hetero-dimers in the regulation in a diverse set of cellular pathways. Classical leucine zippers have alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization creates a pair of basic regions that bind DNA and undergo conformational change. GCN4 was identified in Saccharomyces cerevisiae from mutations in a deficiency in activation with the general amino acid control pathway. 
GCN4 encodes a trans-activator of amino acid biosynthetic genes containing 2 acidic activation domains and a C-terminal bZIP domain, comprised of a basic alpha-helical DNA-binding region and a coiled-coil dimerization region." Q#18500 - CGI_10026303 superfamily 243100 182 232 1.35E-10 54.9273 cl02576 B_zip1 superfamily - - "basic leucine zipper DNA-binding and multimerization region of GCN4 and related proteins; Basic leucine zipper (bZIP) transcription factors act in networks of homo- and hetero-dimers in the regulation in a diverse set of cellular pathways. Classical leucine zippers have alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization creates a pair of basic regions that bind DNA and undergo conformational change. GCN4 was identified in Saccharomyces cerevisiae from mutations in a deficiency in activation with the general amino acid control pathway. GCN4 encodes a trans-activator of amino acid biosynthetic genes containing 2 acidic activation domains and a C-terminal bZIP domain, comprised of a basic alpha-helical DNA-binding region and a coiled-coil dimerization region." Q#18501 - CGI_10026304 superfamily 243100 233 285 1.84E-14 66.4833 cl02576 B_zip1 superfamily - - "basic leucine zipper DNA-binding and multimerization region of GCN4 and related proteins; Basic leucine zipper (bZIP) transcription factors act in networks of homo- and hetero-dimers in the regulation in a diverse set of cellular pathways. Classical leucine zippers have alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization creates a pair of basic regions that bind DNA and undergo conformational change. GCN4 was identified in Saccharomyces cerevisiae from mutations in a deficiency in activation with the general amino acid control pathway. 
GCN4 encodes a trans-activator of amino acid biosynthetic genes containing 2 acidic activation domains and a C-terminal bZIP domain, comprised of a basic alpha-helical DNA-binding region and a coiled-coil dimerization region." Q#18502 - CGI_10026305 superfamily 243100 29 82 1.43E-07 44.1417 cl02576 B_zip1 superfamily - - "basic leucine zipper DNA-binding and multimerization region of GCN4 and related proteins; Basic leucine zipper (bZIP) transcription factors act in networks of homo- and hetero-dimers in the regulation in a diverse set of cellular pathways. Classical leucine zippers have alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization creates a pair of basic regions that bind DNA and undergo conformational change. GCN4 was identified in Saccharomyces cerevisiae from mutations in a deficiency in activation with the general amino acid control pathway. GCN4 encodes a trans-activator of amino acid biosynthetic genes containing 2 acidic activation domains and a C-terminal bZIP domain, comprised of a basic alpha-helical DNA-binding region and a coiled-coil dimerization region." Q#18503 - CGI_10026306 superfamily 241704 91 241 2.35E-39 138.275 cl00227 PEBP superfamily - - "PhosphatidylEthanolamine-Binding Protein (PEBP) domain; PhosphatidylEthanolamine-Binding Proteins (PEBPs) are represented in all three major phylogenetic divisions (eukaryotes, bacteria, archaea). A number of biological roles for members of the PEBP family include serine protease inhibition, membrane biogenesis, regulation of flowering plant stem architecture, and Raf-1 kinase inhibition. Although their overall structures are similar, the members of the PEBP family bind very different substrates including phospholipids, opioids, and hydrophobic odorant molecules as well as having different oligomerization states (monomer/dimer/tetramer)." 
Q#18504 - CGI_10026308 superfamily 246908 226 325 2.67E-57 184.656 cl15255 SH2 superfamily - - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." Q#18504 - CGI_10026308 superfamily 243073 348 388 6.95E-20 82.0803 cl02533 SOCS superfamily - - "SOCS (suppressors of cytokine signaling) box. The SOCS box is found in the C-terminal region of CIS/SOCS family proteins (in combination with a SH2 domain), ASBs (ankyrin repeat-containing proteins with a SOCS box), SSBs (SPRY domain-containing proteins with a SOCS box), and WSBs (WD40 repeat-containing proteins with a SOCS box), as well as, other miscellaneous proteins. The function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions." Q#18505 - CGI_10026309 superfamily 215859 124 278 4.13E-05 42.5887 cl18347 Peptidase_S9 superfamily N - Prolyl oligopeptidase family; Prolyl oligopeptidase family. 
Q#18506 - CGI_10026310 superfamily 220692 75 370 8.66E-10 58.3697 cl18570 7TM_GPCR_Srw superfamily - - Serpentine type 7TM GPCR chemoreceptor Srw; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srw is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. The genes encoding Srw do not appear to be under as strong an adaptive evolutionary pressure as those of Srz. Q#18508 - CGI_10026312 superfamily 241782 45 406 6.34E-84 263.433 cl00321 AAT_I superfamily - - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. 
Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phosphorylase family (fold type V)." Q#18511 - CGI_10026315 superfamily 220665 2 493 3.59E-142 420.679 cl10950 Tmemb_161AB superfamily - - Predicted transmembrane protein 161AB; Transmemb_161AB is a family of conserved proteins found from worms to humans. Members are putative transmembrane proteins but otherwise the function is not known. Q#18512 - CGI_10026316 superfamily 241564 32 97 8.76E-28 103.885 cl00035 BIR superfamily - - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger."
Q#18512 - CGI_10026316 superfamily 247792 310 355 1.49E-10 56.234 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-(N/C/H)-X2-C-X(4-48)-C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#18513 - CGI_10026317 superfamily 247057 234 264 0.00336255 34.7624 cl15755 SAM_superfamily superfamily N - "SAM (Sterile alpha motif); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain phosphorylation sites and/or kinase docking sites, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#18517 - CGI_10026321 superfamily 204632 223 312 0.000447815 40.6181 cl12901 DUF3166 superfamily - - Protein of unknown function (DUF3166); This eukaryotic family of proteins has no known function.
Q#18518 - CGI_10026322 superfamily 241832 1 179 7.12E-42 141.563 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#18519 - CGI_10026323 superfamily 241750 9 430 0 540.328 cl00281 metallo-dependent_hydrolases superfamily - - "Superfamily of metallo-dependent hydrolases (also called amidohydrolase superfamily) is a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. The family includes urease alpha, adenosine deaminase, phosphotriesterase dihydroorotases, allantoinases, hydantoinases, AMP-, adenine and cytosine deaminases, imidazolonepropionase, aryldialkylphosphatase, chlorohydrolases, formylmethanofuran dehydrogenases and others." 
Q#18521 - CGI_10026325 superfamily 241884 119 349 2.42E-97 291.484 cl00467 Ntn_hydrolase superfamily - - "The Ntn hydrolases (N-terminal nucleophile) are a diverse superfamily of enzymes that are activated autocatalytically via an N-terminally located nucleophilic amino acid. The N-terminal nucleophile (NTN-) hydrolase superfamily contains a four-layered alpha, beta, beta, alpha core structure. This family of hydrolases includes penicillin acylase, the 20S proteasome alpha and beta subunits, and glutamate synthase. The mechanism of activation of these proteins is conserved, although they differ in their substrate specificities. All known members catalyze the hydrolysis of amide bonds in either proteins or small molecules, and each one of them is synthesized as a preprotein. For each, an autocatalytic endoproteolytic process generates a new N-terminal residue. This mature N-terminal residue is central to catalysis and acts as both a polarizing base and a nucleophile during the reaction. The N-terminal amino group acts as the proton acceptor and activates either the nucleophilic hydroxyl in a Ser or Thr residue or the nucleophilic thiol in a Cys residue. The position of the N-terminal nucleophile in the active site and the mechanism of catalysis are conserved in this family, despite considerable variation in the protein sequences."
Q#18522 - CGI_10026326 superfamily 247905 88 219 2.45E-29 113.102 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#18522 - CGI_10026326 superfamily 247805 21 76 6.46E-22 94.0885 cl17251 DEXDc superfamily N - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#18522 - CGI_10026326 superfamily 222474 251 314 7.36E-22 90.1749 cl16500 DUF4217 superfamily - - Domain of unknown function (DUF4217); This short domain is found at the C-terminus of many helicase proteins. Q#18523 - CGI_10026327 superfamily 217293 36 244 1.09E-60 199.011 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#18523 - CGI_10026327 superfamily 202474 252 455 2.81E-07 49.9597 cl08379 Neur_chan_memb superfamily - - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#18525 - CGI_10004278 superfamily 241600 694 902 2.24E-83 268.725 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. 
Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#18525 - CGI_10004278 superfamily 243092 32 301 1.36E-53 188.313 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#18525 - CGI_10004278 superfamily 221242 370 669 1.20E-44 164.246 cl13285 DUF3337 superfamily - - Domain of unknown function (DUF3337); This family of proteins are functionally uncharacterized. This family is only found in eukaryotes. This presumed domain is typically between 285 to 342 amino acids in length. Q#18526 - CGI_10004279 superfamily 248264 1 110 0.000308566 36.4462 cl17710 DDE_4 superfamily NC - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#18527 - CGI_10004280 superfamily 241600 120 256 1.04E-49 164.336 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#18529 - CGI_10004282 superfamily 241600 145 357 1.42E-74 232.131 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. 
The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#18530 - CGI_10004283 superfamily 241600 133 243 1.25E-29 111.215 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#18534 - CGI_10022832 superfamily 245601 318 496 2.51E-28 110.49 cl11399 HP superfamily - - "Histidine phosphatase domain found in a functionally diverse set of proteins, mostly phosphatases; contains a His residue which is phosphorylated during the reaction; Catalytic domain of a functionally diverse set of proteins, most of which are phosphatases. The conserved catalytic core of this domain contains a His residue which is phosphorylated in the reaction. This set of proteins includes cofactor-dependent and cofactor-independent phosphoglycerate mutases (dPGM, and BPGM respectively), fructose-2,6-bisphosphatase (F26BP)ase, Sts-1, SixA, histidine acid phosphatases, phytases, and related proteins. 
Functions include roles in metabolism, signaling, or regulation, for example F26BPase affects glycolysis and gluconeogenesis through controlling the concentration of F26BP; BPGM controls the concentration of 2,3-BPG (the main allosteric effector of hemoglobin in human blood cells); human Sts-1 is a T-cell regulator; Escherichia coli Six A participates in the ArcB-dependent His-to-Asp phosphorelay signaling system; phytases scavenge phosphate from extracellular sources. Deficiency and mutation in many of the human members result in disease, for example erythrocyte BPGM deficiency is a disease associated with a decrease in the concentration of 2,3-BPG. Clinical applications include the use of prostatic acid phosphatase (PAP) as a serum marker for prostate cancer. Agricultural applications include the addition of phytases to animal feed." Q#18535 - CGI_10022833 superfamily 216981 115 228 5.81E-19 80.2693 cl17087 OTU superfamily - - "OTU-like cysteine protease; This family is comprised of a group of predicted cysteine proteases, homologous to the Ovarian Tumour (OTU) gene in Drosophila. Members include proteins from eukaryotes, viruses and pathogenic bacterium. The conserved cysteine and histidine, and possibly the aspartate, represent the catalytic residues in this putative group of proteases." Q#18537 - CGI_10022835 superfamily 241600 102 321 2.76E-91 273.732 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. 
C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#18538 - CGI_10022836 superfamily 241600 101 320 1.82E-89 269.11 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#18540 - CGI_10022838 superfamily 241600 163 382 3.49E-82 252.546 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." 
Q#18541 - CGI_10022839 superfamily 241600 19 64 1.71E-14 63.7987 cl00085 FReD superfamily C - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#18543 - CGI_10022841 superfamily 242902 27 126 1.04E-15 69.9682 cl02144 TLD superfamily C - TLD; This domain is predicted to be an enzyme and is often found associated with pfam01476. Q#18544 - CGI_10022842 superfamily 243119 97 145 2.27E-09 49.7396 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#18544 - CGI_10022842 superfamily 243119 36 78 7.29E-05 37.4133 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. 
Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#18547 - CGI_10022845 superfamily 248054 26 77 4.46E-05 41.3036 cl17500 NAD_binding_8 superfamily N - NAD(P)-binding Rossmann-like domain; NAD(P)-binding Rossmann-like domain. Q#18548 - CGI_10022846 superfamily 238191 94 208 7.34E-09 55.8012 cl18907 Esterase_lipase superfamily C - "Esterases and lipases (includes fungal lipases, cholinesterases, etc.). These enzymes act on carboxylic esters (EC: 3.1.1.-). The catalytic apparatus involves three residues (catalytic triad): a serine, a glutamate or aspartate and a histidine. These catalytic residues are responsible for the nucleophilic attack on the carbonyl carbon atom of the ester bond. In contrast with other alpha/beta hydrolase fold family members, p-nitrobenzyl esterase and acetylcholine esterase have a Glu instead of Asp at the active site carboxylate." Q#18550 - CGI_10022848 superfamily 248289 707 764 3.44E-15 72.0715 cl17735 VWC superfamily - - von Willebrand factor type C domain; The high cutoff was used to prevent overlap with pfam00094. Q#18550 - CGI_10022848 superfamily 244382 427 547 6.32E-13 66.9539 cl06473 CHRD superfamily - - "CHRD domain; CHRD (after SWISS-PROT abbreviation for chordin) is a novel domain identified in chordin, an inhibitor of bone morphogenetic proteins. This family includes bacterial homologues. It is anticipated to have an immunoglobulin-like beta-barrel structure based on limited similarity to superoxide dismutases but, as yet, no clear functional prediction can be made. Its most conserved feature is a GE[I/L]RCG[V/I/L] motif towards its C-terminal end. Most bacterial proteins in this family have only one CHRD domain, whereas it is found repeated in many eukaryotic proteins such as human chordin and Drosophila SOG."
Q#18550 - CGI_10022848 superfamily 244382 569 672 8.18E-07 48.4643 cl06473 CHRD superfamily - - "CHRD domain; CHRD (after SWISS-PROT abbreviation for chordin) is a novel domain identified in chordin, an inhibitor of bone morphogenetic proteins. This family includes bacterial homologues. It is anticipated to have an immunoglobulin-like beta-barrel structure based on limited similarity to superoxide dismutases but, as yet, no clear functional prediction can be made. Its most conserved feature is a GE[I/L]RCG[V/I/L] motif towards its C-terminal end. Most bacterial proteins in this family have only one CHRD domain, whereas it is found repeated in many eukaryotic proteins such as human chordin and Drosophila SOG." Q#18550 - CGI_10022848 superfamily 248289 780 843 1.12E-05 44.3371 cl17735 VWC superfamily - - von Willebrand factor type C domain; The high cutoff was used to prevent overlap with pfam00094. Q#18550 - CGI_10022848 superfamily 244382 189 298 5.36E-05 42.6863 cl06473 CHRD superfamily - - "CHRD domain; CHRD (after SWISS-PROT abbreviation for chordin) is a novel domain identified in chordin, an inhibitor of bone morphogenetic proteins. This family includes bacterial homologues. It is anticipated to have an immunoglobulin-like beta-barrel structure based on limited similarity to superoxide dismutases but, as yet, no clear functional prediction can be made. Its most conserved feature is a GE[I/L]RCG[V/I/L] motif towards its C-terminal end. Most bacterial proteins in this family have only one CHRD domain, whereas it is found repeated in many eukaryotic proteins such as human chordin and Drosophila SOG." Q#18551 - CGI_10022849 superfamily 243238 41 337 2.32E-93 298.027 cl02915 Voltage_gated_ClC superfamily C - "CLC voltage-gated chloride channel. The ClC chloride channels catalyse the selective flow of Cl- ions across cell membranes, thereby regulating electrical excitation in skeletal muscle and the flow of salt and water across epithelial barriers.
This domain is found in the halogen ions (Cl-, Br- and I-) transport proteins of the ClC family. The ClC channels are found in all three kingdoms of life and perform a variety of functions including cellular excitability regulation, cell volume regulation, membrane potential stabilization, acidification of intracellular organelles, signal transduction, transepithelial transport in animals, and the extreme acid resistance response in eubacteria. They lack any structural or sequence similarity to other known ion channels and exhibit unique properties of ion permeation and gating. Unlike cation-selective ion channels, which form oligomers containing a single pore along the axis of symmetry, the ClC channels form two-pore homodimers with one pore per subunit without axial symmetry. Although lacking the typical voltage-sensor found in cation channels, all studied ClC channels are gated (opened and closed) by transmembrane voltage. The gating is conferred by the permeating ion itself, acting as the gating charge. In addition, eukaryotic and some prokaryotic ClC channels have two additional C-terminal CBS (cystathionine beta synthase) domains of putative regulatory function." Q#18551 - CGI_10022849 superfamily 243238 438 557 2.11E-42 158.585 cl02915 Voltage_gated_ClC superfamily N - "CLC voltage-gated chloride channel. The ClC chloride channels catalyse the selective flow of Cl- ions across cell membranes, thereby regulating electrical excitation in skeletal muscle and the flow of salt and water across epithelial barriers. This domain is found in the halogen ions (Cl-, Br- and I-) transport proteins of the ClC family. The ClC channels are found in all three kingdoms of life and perform a variety of functions including cellular excitability regulation, cell volume regulation, membrane potential stabilization, acidification of intracellular organelles, signal transduction, transepithelial transport in animals, and the extreme acid resistance response in eubacteria. 
They lack any structural or sequence similarity to other known ion channels and exhibit unique properties of ion permeation and gating. Unlike cation-selective ion channels, which form oligomers containing a single pore along the axis of symmetry, the ClC channels form two-pore homodimers with one pore per subunit without axial symmetry. Although lacking the typical voltage-sensor found in cation channels, all studied ClC channels are gated (opened and closed) by transmembrane voltage. The gating is conferred by the permeating ion itself, acting as the gating charge. In addition, eukaryotic and some prokaryotic ClC channels have two additional C-terminal CBS (cystathionine beta synthase) domains of putative regulatory function." Q#18553 - CGI_10022851 superfamily 246936 37 75 2.16E-14 63.8105 cl15354 CBS_pair superfamily N - "The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. 
Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase)." Q#18554 - CGI_10022852 superfamily 220419 93 257 1.79E-18 82.9666 cl17022 DUF2351 superfamily NC - Uncharacterized conserved protein (DUF2351); Members of this family of proteins have no known function. Q#18555 - CGI_10022853 superfamily 216574 82 219 6.47E-42 145.813 cl14794 FAD_binding_4 superfamily - - "FAD binding domain; This family consists of various enzymes that use FAD as a co-factor, most of the enzymes are similar to oxygen oxidoreductase. One of the enzymes Vanillyl-alcohol oxidase (VAO) has a solved structure, the alignment includes the FAD binding site, called the PP-loop, between residues 99-110. The FAD molecule is covalently bound in the known structure, however the residue that links to the FAD is not in the alignment. VAO catalyzes the oxidation of a wide variety of substrates, ranging from aromatic amines to 4-alkylphenols. Other members of this family include D-lactate dehydrogenase, this enzyme catalyzes the conversion of D-lactate to pyruvate using FAD as a co-factor; mitomycin radical oxidase, this enzyme oxidises the reduced form of mitomycins and is involved in mitomycin resistance. This family includes MurB an UDP-N-acetylenolpyruvoylglucosamine reductase enzyme EC:1.1.1.158. This enzyme is involved in the biosynthesis of peptidoglycan." Q#18556 - CGI_10022854 superfamily 222269 106 293 1.30E-20 88.5346 cl18657 Cupin_8 superfamily - - Cupin-like domain; This cupin like domain shares similarity to the JmjC domain. 
Q#18557 - CGI_10022855 superfamily 248100 134 194 2.82E-11 57.164 cl17546 PQ-loop superfamily - - "PQ loop repeat; Members of this family are all membrane bound proteins possessing a pair of repeats each spanning two transmembrane helices connected by a loop. The PQ motif found on loop 2 is critical for the localisation of cystinosin to lysosomes. However, the PQ motif appears not to be a general lysosome-targeting motif. It is thought likely to possess a more general function. Most probably this involves a glutamine residue." Q#18557 - CGI_10022855 superfamily 248100 68 128 2.90E-10 54.4676 cl17546 PQ-loop superfamily - - "PQ loop repeat; Members of this family are all membrane bound proteins possessing a pair of repeats each spanning two transmembrane helices connected by a loop. The PQ motif found on loop 2 is critical for the localisation of cystinosin to lysosomes. However, the PQ motif appears not to be a general lysosome-targeting motif. It is thought likely to possess a more general function. Most probably this involves a glutamine residue." Q#18558 - CGI_10022856 superfamily 247677 1376 1508 9.78E-44 157.446 cl17013 W2 superfamily - - "C-terminal domain of eIF4-gamma/eIF5/eIF2b-epsilon; This domain is found at the C-terminus of several translation initiation factors, including the epsilon chain of eIF2b, where it has been found to catalyze the conversion of eIF2.GDP to its active eIF2.GTP form. The structure of the domain resembles that of a set of concatenated HEAT repeats." Q#18558 - CGI_10022856 superfamily 243128 712 939 8.32E-43 157.106 cl02652 MIF4G superfamily - - "MIF4G domain; MIF4G is named after Middle domain of eukaryotic initiation factor 4G (eIF4G). Also occurs in NMD2p and CBP80. The domain is rich in alpha-helices and may contain multiple alpha-helical repeats. In eIF4G, this domain binds eIF4A, eIF3, RNA and DNA." 
Q#18558 - CGI_10022856 superfamily 243129 1180 1291 4.21E-25 103.102 cl02653 MA3 superfamily - - "MA3 domain; Domain in DAP-5, eIF4G, MA-3 and other proteins. Highly alpha-helical. May contain repeats and/or regions similar to MIF4G domains." Q#18558 - CGI_10022856 superfamily 247677 1488 1534 0.0097964 37.5777 cl17013 W2 superfamily N - "C-terminal domain of eIF4-gamma/eIF5/eIF2b-epsilon; This domain is found at the C-terminus of several translation initiation factors, including the epsilon chain of eIF2b, where it has been found to catalyze the conversion of eIF2.GDP to its active eIF2.GTP form. The structure of the domain resembles that of a set of concatenated HEAT repeats." Q#18559 - CGI_10022857 superfamily 245610 20 305 5.31E-122 362.717 cl11424 nitrilase superfamily - - "Nitrilase superfamily, including nitrile- or amide-hydrolyzing enzymes and amide-condensing enzymes; This superfamily (also known as the C-N hydrolase superfamily) contains hydrolases that break carbon-nitrogen bonds; it includes nitrilases, cyanide dihydratases, aliphatic amidases, N-terminal amidases, beta-ureidopropionases, biotinidases, pantotheinase, N-carbamyl-D-amino acid amidohydrolases, the glutaminase domain of glutamine-dependent NAD+ synthetase, apolipoprotein N-acyltransferases, and N-carbamoylputrescine amidohydrolases, among others. These enzymes depend on a Glu-Lys-Cys catalytic triad, and work through a thiol acylenzyme intermediate. Members of this superfamily generally form homomeric complexes, the basic building block of which is a homodimer. These oligomers include dimers, tetramers, hexamers, octamers, tetradecamers, octadecamers, as well as variable length helical arrangements and homo-oligomeric spirals. These proteins have roles in vitamin and co-enzyme metabolism, in detoxifying small molecules, in the synthesis of signaling molecules, and in the post-translational modification of proteins. 
They are used industrially, as biocatalysts in the fine chemical and pharmaceutical industry, in cyanide remediation, and in the treatment of toxic effluent. This superfamily has been classified previously in the literature, based on global and structure-based sequence analysis, into thirteen different enzyme classes (referred to as 1-13). This hierarchy includes those thirteen classes and a few additional subfamilies. A putative distant relative, the plasmid-borne TraB family, has not been included in the hierarchy." Q#18561 - CGI_10022859 superfamily 241886 64 418 2.85E-78 246.702 cl00470 Aldo_ket_red superfamily - - "Aldo-keto reductases (AKRs) are a superfamily of soluble NAD(P)(H) oxidoreductases whose chief purpose is to reduce aldehydes and ketones to primary and secondary alcohols. AKRs are present in all phyla and are of importance to both health and industrial applications. Members have very distinct functions and include the prokaryotic 2,5-diketo-D-gluconic acid reductases and beta-keto ester reductases, the eukaryotic aldose reductases, aldehyde reductases, hydroxysteroid dehydrogenases, steroid 5beta-reductases, potassium channel beta-subunits and aflatoxin aldehyde reductases, among others." Q#18562 - CGI_10022860 superfamily 241913 329 431 1.46E-34 124.668 cl00509 hot_dog superfamily - - "The hotdog fold was initially identified in the E. coli FabA (beta-hydroxydecanoyl-acyl carrier protein (ACP)-dehydratase) structure and subsequently in 4HBT (4-hydroxybenzoyl-CoA thioesterase) from Pseudomonas. A number of other seemingly unrelated proteins also share the hotdog fold. These proteins have related, but distinct, catalytic activities that include metabolic roles such as thioester hydrolysis in fatty acid metabolism, and degradation of phenylacetic acid and the environmental pollutant 4-chlorobenzoate. 
This superfamily also includes the PaaI-like protein FapR, a non-catalytic bacterial homolog involved in transcriptional regulation of fatty acid biosynthesis." Q#18562 - CGI_10022860 superfamily 241913 32 126 1.25E-33 121.959 cl00509 hot_dog superfamily - - "The hotdog fold was initially identified in the E. coli FabA (beta-hydroxydecanoyl-acyl carrier protein (ACP)-dehydratase) structure and subsequently in 4HBT (4-hydroxybenzoyl-CoA thioesterase) from Pseudomonas. A number of other seemingly unrelated proteins also share the hotdog fold. These proteins have related, but distinct, catalytic activities that include metabolic roles such as thioester hydrolysis in fatty acid metabolism, and degradation of phenylacetic acid and the environmental pollutant 4-chlorobenzoate. This superfamily also includes the PaaI-like protein FapR, a non-catalytic bacterial homolog involved in transcriptional regulation of fatty acid biosynthesis." Q#18562 - CGI_10022860 superfamily 241913 203 254 9.15E-12 61.0975 cl00509 hot_dog superfamily N - "The hotdog fold was initially identified in the E. coli FabA (beta-hydroxydecanoyl-acyl carrier protein (ACP)-dehydratase) structure and subsequently in 4HBT (4-hydroxybenzoyl-CoA thioesterase) from Pseudomonas. A number of other seemingly unrelated proteins also share the hotdog fold. These proteins have related, but distinct, catalytic activities that include metabolic roles such as thioester hydrolysis in fatty acid metabolism, and degradation of phenylacetic acid and the environmental pollutant 4-chlorobenzoate. This superfamily also includes the PaaI-like protein FapR, a non-catalytic bacterial homolog involved in transcriptional regulation of fatty acid biosynthesis." Q#18569 - CGI_10022867 superfamily 199168 219 241 0.00783984 33.8644 cl15310 LRR_TYP superfamily - - "Leucine-rich repeats, typical (most populated) subfamily; Leucine-rich repeats, typical (most populated) subfamily. 
" Q#18571 - CGI_10022869 superfamily 218140 291 835 4.01E-143 434.719 cl04579 Anoctamin superfamily - - "Calcium-activated chloride channel; The family carries eight putative transmembrane domains, and, although it has no similarity to other known channel proteins, it is clearly a calcium-activated ionic channel. It is expressed in various secretory epithelia, the retina and sensory neurons, and mediates receptor-activated chloride currents in diverse physiological processes." Q#18573 - CGI_10022871 superfamily 248458 120 252 4.22E-13 68.8797 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. 
Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#18576 - CGI_10022874 superfamily 247912 25 369 5.40E-44 159.589 cl17358 Beta-lactamase superfamily - - Beta-lactamase; This family appears to be distantly related to pfam00905 and PF00768 D-alanyl-D-alanine carboxypeptidase. Q#18576 - CGI_10022874 superfamily 221337 425 483 1.25E-06 46.5411 cl13401 DUF3471 superfamily C - "Domain of unknown function (DUF3471); This presumed domain is functionally uncharacterized. This domain is found in bacteria, archaea and eukaryotes. This domain is typically between 98 to 114 amino acids in length. This domain is found associated with pfam00144." Q#18577 - CGI_10004039 superfamily 244757 29 58 0.00164408 32.8329 cl07585 YopR_core superfamily N - "YopR Core; The YopR core domain, predominantly found in the Yersinia pestis virulence factor YopR, is composed of five alpha-helices, four of which are arranged in an antiparallel bundle. Little is known about this domain, though it may contribute to the virulence of the protein YopR." Q#18579 - CGI_10004041 superfamily 245531 64 116 0.000298552 36.6003 cl11158 BEN superfamily C - "BEN domain; The BEN domain is found in diverse animal proteins such as BANP/SMAR1, NAC1 and the Drosophila mod(mdg4) isoform C, in the chordopoxvirus virosomal protein E5R and in several proteins of polydnaviruses. Computational analysis suggests that the BEN domain mediates protein-DNA and protein-protein interactions during chromatin organisation and transcription." Q#18583 - CGI_10006206 superfamily 248312 50 196 1.20E-05 42.7257 cl17758 PMP22_Claudin superfamily - - PMP-22/EMP/MP20/Claudin family; PMP-22/EMP/MP20/Claudin family. 
Q#18585 - CGI_10006208 superfamily 248312 17 195 0.000212958 39.2589 cl17758 PMP22_Claudin superfamily - - PMP-22/EMP/MP20/Claudin family; PMP-22/EMP/MP20/Claudin family. Q#18587 - CGI_10006210 superfamily 221612 55 197 7.36E-14 70.936 cl13889 DUF3715 superfamily - - "Protein of unknown function (DUF3715); This domain family is found in eukaryotes, and is approximately 170 amino acids in length." Q#18588 - CGI_10006211 superfamily 245303 63 439 0 539.454 cl10447 GH18_chitinase-like superfamily - - "The GH18 (glycosyl hydrolase, family 18) type II chitinases hydrolyze chitin, an abundant polymer of beta-1,4-linked N-acetylglucosamine (GlcNAc) which is a major component of the cell wall of fungi and the exoskeleton of arthropods. Chitinases have been identified in viruses, bacteria, fungi, protozoan parasites, insects, and plants. The structure of the GH18 domain is an eight-stranded beta/alpha barrel with a pronounced active-site cleft at the C-terminal end of the beta-barrel. The GH18 family includes chitotriosidase, chitobiase, hevamine, zymocin-alpha, narbonin, SI-CLP (stabilin-1 interacting chitinase-like protein), IDGF (imaginal disc growth factor), CFLE (cortical fragment-lytic enzyme) spore hydrolase, the type III and type V plant chitinases, the endo-beta-N-acetylglucosaminidases, and the chitolectins. The GH85 (glycosyl hydrolase, family 85) ENGases (endo-beta-N-acetylglucosaminidases) are closely related to the GH18 chitinases and are included in this alignment model." Q#18588 - CGI_10006211 superfamily 243119 712 756 9.16E-06 44.732 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. 
It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#18588 - CGI_10006211 superfamily 243119 660 692 1.14E-05 44.3468 cl02629 CBM_14 superfamily N - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#18588 - CGI_10006211 superfamily 243119 466 498 1.14E-05 44.3468 cl02629 CBM_14 superfamily N - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#18588 - CGI_10006211 superfamily 243119 518 562 1.18E-05 44.3468 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. 
Chitin binding has been demonstrated for a protein containing only two of these domains. Q#18590 - CGI_10006213 superfamily 243078 15 155 5.59E-72 227.305 cl02544 VHS_ENTH_ANTH superfamily - - "VHS, ENTH and ANTH domain superfamily; composed of proteins containing a VHS, ENTH or ANTH domain. The VHS domain is present in Vps27 (Vacuolar Protein Sorting), Hrs (Hepatocyte growth factor-regulated tyrosine kinase substrate) and STAM (Signal Transducing Adaptor Molecule). It is located at the N-termini of proteins involved in intracellular membrane trafficking. The epsin N-terminal homology (ENTH) domain is an evolutionarily conserved protein module found primarily in proteins that participate in clathrin-mediated endocytosis. A set of proteins previously designated as harboring an ENTH domain in fact contains a highly similar, yet unique module referred to as an AP180 N-terminal homology (ANTH) domain. VHS, ENTH and ANTH domains are structurally similar and are composed of a superhelix of eight alpha helices. ENTH and ANTH (E/ANTH) domains bind both inositol phospholipids and proteins and contribute to the nucleation and formation of clathrin coats on membranes. ENTH domains also function in the development of membrane curvature through lipid remodeling during the formation of clathrin-coated vesicles. E/ANTH domain-bearing proteins have recently been shown to function with adaptor protein-1 and GGA adaptors at the trans-Golgi network, which suggests that E/ANTH domains are universal components of the machinery for clathrin-mediated membrane budding." Q#18590 - CGI_10006213 superfamily 190532 194 285 1.06E-26 103.487 cl03906 GAT superfamily - - "GAT domain; The GAT domain is responsible for binding of GGA proteins to several members of the ARF family including ARF1 and ARF3. The GAT domain stabilises membrane bound ARF1 in its GTP bound state, by interfering with GAP proteins." 
Q#18592 - CGI_10006215 superfamily 242940 38 160 2.18E-49 161.531 cl02228 ATP12 superfamily - - "ATP12 chaperone protein; Mitochondrial F1-ATPase is an oligomeric enzyme composed of five distinct subunit polypeptides. The alpha and beta subunits make up the bulk of protein mass of F1. In Saccharomyces cerevisiae both subunits are synthesised as precursors with amino-terminal targeting signals that are removed upon translocation of the proteins to the matrix compartment. These proteins include examples from eukaryotes and bacteria and may have chaperone activity, being involved in F1 ATPase complex assembly." Q#18593 - CGI_10006216 superfamily 242890 22 192 5.10E-85 251.436 cl02113 Vac_ImportDeg superfamily - - "Vacuolar import and degradation protein; Members of this family are involved in the negative regulation of gluconeogenesis. They are required for both proteasome-dependent and vacuolar catabolite degradation of fructose-1,6-bisphosphatase (FBPase), where they probably regulate FBPase targeting from the FBPase-containing vesicles to the vacuole." Q#18594 - CGI_10006217 superfamily 206221 59 131 9.28E-30 110.067 cl16570 Requiem_N superfamily - - "N-terminal domain of DPF2/REQ; This putative domain has been detected on the human DPF2 protein and was subsequently targeted for structure determination by the Joint Center for Structural Genomics (JCSG). Possibly, the C-terminus extends by 30 amino acids and forms a separate domain. DPF2 interacts with estrogen related receptor alpha (Err-alpha), an orphan receptor which acts as a regulator in energy metabolism. It was also identified as an adaptor molecule that links nuclear factor kappa-light-chain-enhancer of activated B cells (NF-kappa-B) dimer RelB/p52 and switch/sucrose-nonfermentable (SWI/SNF) chromatin remodeling factor." 
Q#18594 - CGI_10006217 superfamily 247999 331 373 2.07E-06 44.7888 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#18596 - CGI_10022023 superfamily 220695 73 193 0.0029086 36.4027 cl18571 7TM_GPCR_Srx superfamily C - Serpentine type 7TM GPCR chemoreceptor Srx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srx is part of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#18600 - CGI_10022027 superfamily 245604 3 75 1.08E-19 82.0671 cl11404 Biotinyl_lipoyl_domains superfamily - - "Biotinyl_lipoyl_domains are present in biotin-dependent carboxylases/decarboxylases, the dihydrolipoyl acyltransferase component (E2) of 2-oxo acid dehydrogenases, and the H-protein of the glycine cleavage system (GCS). These domains transport CO2, acyl, or methylamine, respectively, between components of the complex/protein via a biotinyl or lipoyl group, which is covalently attached to a highly conserved lysine residue." Q#18600 - CGI_10022027 superfamily 215782 177 388 4.16E-93 280.976 cl18344 2-oxoacid_dh superfamily - - 2-oxoacid dehydrogenases acyltransferase (catalytic domain); These proteins contain one to three copies of a lipoyl binding domain followed by the catalytic domain. Q#18601 - CGI_10022028 superfamily 245206 50 295 9.44E-68 214.638 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. 
NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#18602 - CGI_10022029 superfamily 241583 5 208 2.50E-44 157.921 cl00064 ZnMc superfamily - - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#18602 - CGI_10022029 superfamily 245321 216 302 6.52E-10 56.0947 cl10507 Disintegrin superfamily - - Disintegrin; Disintegrin. Q#18603 - CGI_10022030 superfamily 245206 27 302 3.29E-135 387.978 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. 
Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#18604 - CGI_10022031 superfamily 245321 27 118 2.06E-10 59.1763 cl10507 Disintegrin superfamily - - Disintegrin; Disintegrin. Q#18604 - CGI_10022031 superfamily 241563 847 886 0.000340746 40.1552 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#18605 - CGI_10022032 superfamily 247757 333 917 0 904.219 cl17203 Fer4_NifH superfamily - - "The Fer4_NifH superfamily contains a variety of proteins which share a common ATP-binding domain. Functionally, proteins in this superfamily use the energy from hydrolysis of NTP to transfer electron or ion." Q#18605 - CGI_10022032 superfamily 247780 118 290 7.67E-88 279.44 cl17226 NAD_bind_amino_acid_DH superfamily - - "NAD(P) binding domain of amino acid dehydrogenase-like proteins; Amino acid dehydrogenase(DH)-like NAD(P)-binding domains are members of the Rossmann fold superfamily and are found in glutamate, leucine, and phenylalanine DHs (DHs), methylene tetrahydrofolate DH, methylene-tetrahydromethanopterin DH, methylene-tetrahydropholate DH/cyclohydrolase, Shikimate DH-like proteins, malate oxidoreductases, and glutamyl tRNA reductase. Amino acid DHs catalyze the deamination of amino acids to keto acids with NAD(P)+ as a cofactor. 
The NAD(P)-binding Rossmann fold superfamily includes a wide variety of protein families including NAD(P)- binding domains of alcohol DHs, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate DH, lactate/malate DHs, formate/glycerate DHs, siroheme synthases, 6-phosphogluconate DH, amino acid DHs, repressor rex, NAD-binding potassium channel domain, CoA-binding, and ornithine cyclodeaminase-like domains. These domains have an alpha-beta-alpha configuration. NAD binding involves numerous hydrogen and van der Waals contacts." Q#18605 - CGI_10022032 superfamily 201431 3 123 1.70E-35 131.861 cl08290 THF_DHG_CYH superfamily - - "Tetrahydrofolate dehydrogenase/cyclohydrolase, catalytic domain; Tetrahydrofolate dehydrogenase/cyclohydrolase, catalytic domain. " Q#18606 - CGI_10022033 superfamily 243034 132 231 9.82E-10 56.6196 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial 
targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#18606 - CGI_10022033 superfamily 243058 632 730 0.000468988 39.22 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#18606 - CGI_10022033 superfamily 243058 588 662 0.00306046 36.9088 cl02500 ARM superfamily C - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#18607 - CGI_10022034 superfamily 247757 102 274 6.50E-68 217.797 cl17203 Fer4_NifH superfamily - - "The Fer4_NifH superfamily contains a variety of proteins which share a common ATP-binding domain. Functionally, proteins in this superfamily use the energy from hydrolysis of NTP to transfer electron or ion." 
Q#18607 - CGI_10022034 superfamily 202493 326 431 8.76E-31 114.907 cl03814 SRP_SPB superfamily - - Signal peptide binding domain; Signal peptide binding domain. Q#18607 - CGI_10022034 superfamily 243520 10 86 7.23E-14 67.1911 cl03758 SRP54_N superfamily - - "SRP54-type protein, helical bundle domain; SRP54-type protein, helical bundle domain. " Q#18608 - CGI_10022035 superfamily 243066 6 98 4.81E-25 100.32 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#18609 - CGI_10022036 superfamily 243992 10 65 8.18E-07 41.4054 cl05087 Complex1_LYR_1 superfamily - - "Complex1_LYR-like; This is a family of proteins carrying the LYR motif of family Complex1_LYR, pfam05347, likely to be involved in Fe-S cluster biogenesis in mitochondria." Q#18610 - CGI_10022037 superfamily 241578 136 251 5.03E-22 90.4286 cl00057 vWFA superfamily C - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. 
The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#18610 - CGI_10022037 superfamily 245213 89 125 1.20E-08 50.3278 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#18610 - CGI_10022037 superfamily 245213 10 47 2.94E-06 43.7794 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. 
EGF_CA can be found in tandem repeat arrangements." Q#18610 - CGI_10022037 superfamily 245213 50 86 3.10E-06 43.3942 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#18611 - CGI_10022038 superfamily 241763 11 222 1.54E-95 280.664 cl00298 Peptidase_C1 superfamily - - "C1 Peptidase family (MEROPS database nomenclature), also referred to as the papain family; composed of two subfamilies of cysteine peptidases (CPs), C1A (papain) and C1B (bleomycin hydrolase). Papain-like enzymes are mostly endopeptidases with some exceptions like cathepsins B, C, H and X, which are exopeptidases. Papain-like CPs have different functions in various organisms. Plant CPs are used to mobilize storage proteins in seeds while mammalian CPs are primarily lysosomal enzymes responsible for protein degradation in the lysosome. Papain-like CPs are synthesized as inactive proenzymes with N-terminal propeptide regions, which are removed upon activation. Bleomycin hydrolase (BH) is a CP that detoxifies bleomycin by hydrolysis of an amide group. It acts as a carboxypeptidase on its C-terminus to convert itself into an aminopeptidase and peptide ligase. BH is found in all tissues in mammals as well as in many other eukaryotes. It forms a hexameric ring barrel structure with the active sites imbedded in the central channel. Some members of the C1 family are proteins classified as non-peptidase homologs which lack peptidase activity or have missing active site residues." 
Q#18612 - CGI_10022039 superfamily 241763 729 940 2.67E-101 317.258 cl00298 Peptidase_C1 superfamily - - "C1 Peptidase family (MEROPS database nomenclature), also referred to as the papain family; composed of two subfamilies of cysteine peptidases (CPs), C1A (papain) and C1B (bleomycin hydrolase). Papain-like enzymes are mostly endopeptidases with some exceptions like cathepsins B, C, H and X, which are exopeptidases. Papain-like CPs have different functions in various organisms. Plant CPs are used to mobilize storage proteins in seeds while mammalian CPs are primarily lysosomal enzymes responsible for protein degradation in the lysosome. Papain-like CPs are synthesized as inactive proenzymes with N-terminal propeptide regions, which are removed upon activation. Bleomycin hydrolase (BH) is a CP that detoxifies bleomycin by hydrolysis of an amide group. It acts as a carboxypeptidase on its C-terminus to convert itself into an aminopeptidase and peptide ligase. BH is found in all tissues in mammals as well as in many other eukaryotes. It forms a hexameric ring barrel structure with the active sites imbedded in the central channel. Some members of the C1 family are proteins classified as non-peptidase homologs which lack peptidase activity or have missing active site residues." Q#18612 - CGI_10022039 superfamily 220393 13 297 1.33E-68 230.725 cl10751 Tmem26 superfamily - - "Transmembrane protein 26; The function of this family of transmembrane proteins has not, as yet, been determined." Q#18612 - CGI_10022039 superfamily 220393 431 612 1.45E-39 149.063 cl10751 Tmem26 superfamily N - "Transmembrane protein 26; The function of this family of transmembrane proteins has not, as yet, been determined." Q#18612 - CGI_10022039 superfamily 244586 643 701 1.66E-14 69.9651 cl07031 Inhibitor_I29 superfamily - - Cathepsin propeptide inhibitor domain (I29); This domain is found at the N-terminus of some C1 peptidases such as Cathepsin L where it acts as a propeptide. 
There are also a number of proteins that are composed solely of multiple copies of this domain such as the peptidase inhibitor salarin. This family is classified as I29 by MEROPS. Q#18614 - CGI_10022041 superfamily 222269 236 491 8.42E-24 100.091 cl18657 Cupin_8 superfamily - - Cupin-like domain; This cupin like domain shares similarity to the JmjC domain. Q#18615 - CGI_10022042 superfamily 247725 2670 2783 4.20E-44 159.673 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#18615 - CGI_10022042 superfamily 247683 2146 2201 7.43E-21 90.0775 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. 
Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#18615 - CGI_10022042 superfamily 241754 79 716 0 846.526 cl00286 Motor_domain superfamily - - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. Q#18615 - CGI_10022042 superfamily 243052 2315 2469 5.38E-41 151.359 cl02480 MyTH4 superfamily - - "MyTH4 domain; Domain in myosin and kinesin tails, present twice in myosin-VIIa, and also present in 3 other myosins." Q#18615 - CGI_10022042 superfamily 243052 990 1136 4.45E-31 122.469 cl02480 MyTH4 superfamily - - "MyTH4 domain; Domain in myosin and kinesin tails, present twice in myosin-VIIa, and also present in 3 other myosins." Q#18615 - CGI_10022042 superfamily 215882 2582 2692 4.60E-06 47.6606 cl09511 FERM_M superfamily - - FERM central domain; This domain is the central structural domain of the FERM domain. Q#18616 - CGI_10022043 superfamily 245227 1 403 0 546.827 cl10013 Glycosyltransferase_GTB_type superfamily - - "Glycosyltransferases catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. The acceptor molecule can be a lipid, a protein, a heterocyclic compound, or another carbohydrate residue. The structures of the formed glycoconjugates are extremely diverse, reflecting a wide range of biological functions. The members of this family share a common GTB topology, one of the two protein topologies observed for nucleotide-sugar-dependent glycosyltransferases. GTB proteins have distinct N- and C- terminal domains each containing a typical Rossmann fold. The two domains have high structural homology despite minimal sequence homology. The large cleft that separates the two domains includes the catalytic center and permits a high degree of flexibility." 
Q#18617 - CGI_10022044 superfamily 248390 15 205 3.02E-52 168.424 cl17836 NT5C superfamily - - "5' nucleotidase, deoxy (Pyrimidine), cytosolic type C protein (NT5C); This family consists of several 5' nucleotidase, deoxy (Pyrimidine), cytosolic type C (NT5C) proteins. 5'(3')-Deoxyribonucleotidase is a ubiquitous enzyme in mammalian cells whose physiological function is not known." Q#18618 - CGI_10022045 superfamily 149077 545 657 5.25E-32 121.192 cl06719 TMC superfamily - - "TMC domain; These sequences are similar to a region conserved amongst various protein products of the transmembrane channel-like (TMC) gene family, such as Transmembrane channel-like protein 3 and EVIN2 - this region is termed the TMC domain. Mutations in these genes are implicated in a number of human conditions, such as deafness and epidermodysplasia verruciformis. TMC proteins are thought to have important cellular roles, and may be modifiers of ion channels or transporters." Q#18619 - CGI_10022046 superfamily 241574 76 254 6.26E-75 234.402 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#18619 - CGI_10022046 superfamily 241574 273 359 1.51E-07 50.2769 cl00053 PTPc superfamily N - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. 
The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#18622 - CGI_10022049 superfamily 241574 27 167 1.21E-53 182.786 cl00053 PTPc superfamily N - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#18622 - CGI_10022049 superfamily 241574 241 436 6.78E-14 69.9221 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." 
Q#18623 - CGI_10022050 superfamily 242493 210 432 7.15E-16 76.5568 cl01417 Nuc-transf superfamily - - Predicted nucleotidyltransferase; Members of this family of bacterial proteins catalyze the transfer of nucleotide residues from nucleoside diphosphates or triphosphates into dimer or polymer forms. Q#18623 - CGI_10022050 superfamily 217926 7 168 6.79E-12 62.963 cl04418 YTH superfamily - - "YT521-B-like domain; A protein of the YTH family has been shown to selectively remove transcripts of meiosis-specific genes expressed in mitotic cells. It has been speculated that in higher eukaryotic YTH-family members may be involved in similar mechanisms to suppress gene regulation during gametogenesis or general silencing. The rat protein YT521-B is a tyrosine-phosphorylated nuclear protein, that interacts with the nuclear transcriptosomal component scaffold attachment factor B, and the 68-kDa Src substrate associated during mitosis, Sam68. In vivo splicing assays demonstrated that YT521-B modulates alternative splice site selection in a concentration-dependent manner. The YTH domain has been identified as part of the PUA superfamily." 
Q#18625 - CGI_10022052 superfamily 241613 221 255 2.74E-10 54.135 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#18627 - CGI_10022054 superfamily 248097 8 120 8.72E-17 71.1422 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#18628 - CGI_10022055 superfamily 220695 82 196 0.0001168 41.0251 cl18571 7TM_GPCR_Srx superfamily NC - Serpentine type 7TM GPCR chemoreceptor Srx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srx is part of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. 
Q#18629 - CGI_10022056 superfamily 246597 30 324 1.82E-88 274.563 cl13995 MPP_superfamily superfamily - - "metallophosphatase superfamily, metallophosphatase domain; Metallophosphatases (MPPs), also known as metallophosphoesterases, phosphodiesterases (PDEs), binuclear metallophosphoesterases, and dimetal-containing phosphoesterases (DMPs), represent a diverse superfamily of enzymes with a conserved domain containing an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. This superfamily includes: the phosphoprotein phosphatases (PPPs), Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination." Q#18631 - CGI_10003680 superfamily 199156 138 153 0.000404377 36.6524 cl15298 zf-CCHC superfamily - - "Zinc knuckle; The zinc knuckle is a zinc binding motif composed of the following CX2CX4HX4C where X can be any amino acid. The motifs are mostly from retroviral gag proteins (nucleocapsid). Prototype structure is from HIV. Also contains members involved in eukaryotic gene regulation, such as C. elegans GLH-1. Structure is an 18-residue zinc finger." Q#18631 - CGI_10003680 superfamily 199156 100 114 0.00464236 33.5817 cl15298 zf-CCHC superfamily - - "Zinc knuckle; The zinc knuckle is a zinc binding motif composed of the following CX2CX4HX4C where X can be any amino acid. The motifs are mostly from retroviral gag proteins (nucleocapsid). Prototype structure is from HIV. Also contains members involved in eukaryotic gene regulation, such as C. elegans GLH-1. 
Structure is an 18-residue zinc finger." Q#18637 - CGI_10004461 superfamily 241758 27 124 1.22E-13 62.7726 cl00292 AANH_like superfamily N - "Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases, ATP sulphurylases, Universal Stress Response protein and electron transfer flavoprotein (ETF). The domain forms an alpha/beta/alpha fold which binds to adenosine nucleotide." Q#18638 - CGI_10004462 superfamily 241563 167 205 3.55E-05 42.0812 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#18638 - CGI_10004462 superfamily 110440 629 656 0.00898975 34.6909 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#18639 - CGI_10004463 superfamily 241563 71 109 0.000167256 39.77 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#18639 - CGI_10004463 superfamily 191851 115 222 0.000972371 39.1503 cl06708 DUF1640 superfamily - - Protein of unknown function (DUF1640); This family consists of sequences derived from hypothetical eukaryotic proteins. A region approximately 100 residues in length is featured. 
Q#18639 - CGI_10004463 superfamily 110440 492 518 0.00581179 35.0761 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#18643 - CGI_10018841 superfamily 247725 3 121 2.01E-62 201.218 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#18644 - CGI_10018842 superfamily 198896 1579 1595 1.58E-06 47.5758 cl07394 Ca_chan_IQ superfamily C - "Voltage gated calcium channel IQ domain; Voltage gated calcium channels control cellular calcium entry in response to changes in membrane potential. The isoleucine-glutamine (IQ) motif in the voltage gated calcium channel IQ domain interacts with hydrophobic pockets of Ca2+/calmodulin. The interaction regulates two self-regulatory calcium-dependent feedback mechanisms, calcium-dependent inactivation (CDI) and calcium-dependent facilitation (CDF)." Q#18645 - CGI_10018843 superfamily 241644 33 122 3.46E-21 88.4133 cl00154 UBCc superfamily C - "Ubiquitin-conjugating enzyme E2, catalytic (UBCc) domain. 
This is part of the ubiquitin-mediated protein degradation pathway in which a thiol-ester linkage forms between a conserved cysteine and the C-terminus of ubiquitin and complexes with ubiquitin protein ligase enzymes, E3. This pathway regulates many fundamental cellular processes. There are also other E2s which form thiol-ester linkages without the use of E3s as well as several UBC homologs (TSG101, Mms2, Croc-1 and similar proteins) which lack the active site cysteine essential for ubiquitination and appear to function in DNA repair pathways which were omitted from the scope of this CD." Q#18646 - CGI_10018844 superfamily 243134 28 148 3.19E-36 127.38 cl02663 Fasciclin superfamily - - "Fasciclin domain; This extracellular domain is found repeated four times in grasshopper fasciclin I as well as in proteins from mammals, sea urchins, plants, yeast and bacteria." Q#18646 - CGI_10018844 superfamily 243134 165 284 6.92E-32 115.824 cl02663 Fasciclin superfamily - - "Fasciclin domain; This extracellular domain is found repeated four times in grasshopper fasciclin I as well as in proteins from mammals, sea urchins, plants, yeast and bacteria." Q#18647 - CGI_10018845 superfamily 243134 29 149 2.93E-36 127.38 cl02663 Fasciclin superfamily - - "Fasciclin domain; This extracellular domain is found repeated four times in grasshopper fasciclin I as well as in proteins from mammals, sea urchins, plants, yeast and bacteria." Q#18647 - CGI_10018845 superfamily 243134 166 285 6.47E-32 115.824 cl02663 Fasciclin superfamily - - "Fasciclin domain; This extracellular domain is found repeated four times in grasshopper fasciclin I as well as in proteins from mammals, sea urchins, plants, yeast and bacteria." Q#18648 - CGI_10018846 superfamily 248097 54 177 2.39E-23 90.4022 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. 
Q#18650 - CGI_10018848 superfamily 248458 91 235 1.11E-15 77.3541 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#18650 - CGI_10018848 superfamily 248458 389 567 7.26E-12 65.7981 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. 
MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#18651 - CGI_10018849 superfamily 248458 68 246 3.24E-20 91.2213 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. 
Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#18651 - CGI_10018849 superfamily 248458 424 599 1.61E-12 68.1093 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. 
The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#18652 - CGI_10018850 superfamily 248458 399 574 1.60E-17 83.1321 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. 
Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#18652 - CGI_10018850 superfamily 248458 36 214 1.83E-16 79.6653 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 
Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#18653 - CGI_10018851 superfamily 248458 38 212 7.81E-21 93.1472 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. 
Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#18653 - CGI_10018851 superfamily 248458 491 629 1.50E-10 61.9461 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." 
Q#18654 - CGI_10018852 superfamily 243072 69 195 1.21E-36 134.819 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#18654 - CGI_10018852 superfamily 247057 607 667 1.44E-30 115.669 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#18655 - CGI_10018853 superfamily 241795 1437 1571 2.60E-21 92.9171 cl00335 NDPk superfamily - - "Nucleoside diphosphate kinases (NDP kinases, NDPks): NDP kinases, responsible for the synthesis of nucleoside triphosphates (NTPs), are involved in numerous regulatory processes associated with proliferation, development, and differentiation. They are vital for DNA/RNA synthesis, cell division, macromolecular metabolism and growth. 
The enzymes generate NTPs or their deoxy derivatives by terminal (gamma) phosphotransfer from an NTP such as ATP or GTP to any nucleoside diphosphate (NDP) or its deoxy derivative. The sequence of NDPk has been highly conserved through evolution. There is a single histidine residue conserved in all known NDK isozymes, which is involved in the catalytic mechanism. The first confirmed metastasis suppressor gene was the NDP kinase protein encoded by the nm23 gene. Unicellular organisms generally possess only one gene encoding NDP kinase, while most multicellular organisms possess not only an ortholog that provides most of the NDP kinase enzymatic activity but also multiple divergent paralogous genes. The human genome codes for at least nine NDP kinases and can be classified into two groups, Groups I and II, according to their genomic architecture and distinct enzymatic activity. Group I isoforms (A-D) are well-conserved, catalytically active, and share 58-88% identity between each other, while Group II are more divergent, with only NDPk6 shown to be active. NDP kinases exist in two different quaternary structures; all known eukaryotic enzymes are hexamers, while some bacterial enzymes are tetramers, as in Myxococcus. The hexamer can be viewed as trimer of dimers, while tetramers are dimers of dimers, with the dimerization interface conserved." Q#18655 - CGI_10018853 superfamily 241795 1634 1760 4.55E-08 53.3236 cl00335 NDPk superfamily - - "Nucleoside diphosphate kinases (NDP kinases, NDPks): NDP kinases, responsible for the synthesis of nucleoside triphosphates (NTPs), are involved in numerous regulatory processes associated with proliferation, development, and differentiation. They are vital for DNA/RNA synthesis, cell division, macromolecular metabolism and growth. The enzymes generate NTPs or their deoxy derivatives by terminal (gamma) phosphotransfer from an NTP such as ATP or GTP to any nucleoside diphosphate (NDP) or its deoxy derivative. 
The sequence of NDPk has been highly conserved through evolution. There is a single histidine residue conserved in all known NDK isozymes, which is involved in the catalytic mechanism. The first confirmed metastasis suppressor gene was the NDP kinase protein encoded by the nm23 gene. Unicellular organisms generally possess only one gene encoding NDP kinase, while most multicellular organisms possess not only an ortholog that provides most of the NDP kinase enzymatic activity but also multiple divergent paralogous genes. The human genome codes for at least nine NDP kinases and can be classified into two groups, Groups I and II, according to their genomic architecture and distinct enzymatic activity. Group I isoforms (A-D) are well-conserved, catalytically active, and share 58-88% identity between each other, while Group II are more divergent, with only NDPk6 shown to be active. NDP kinases exist in two different quaternary structures; all known eukaryotic enzymes are hexamers, while some bacterial enzymes are tetramers, as in Myxococcus. The hexamer can be viewed as trimer of dimers, while tetramers are dimers of dimers, with the dimerization interface conserved." Q#18655 - CGI_10018853 superfamily 241795 885 973 1.54E-06 48.3991 cl00335 NDPk superfamily - - "Nucleoside diphosphate kinases (NDP kinases, NDPks): NDP kinases, responsible for the synthesis of nucleoside triphosphates (NTPs), are involved in numerous regulatory processes associated with proliferation, development, and differentiation. They are vital for DNA/RNA synthesis, cell division, macromolecular metabolism and growth. The enzymes generate NTPs or their deoxy derivatives by terminal (gamma) phosphotransfer from an NTP such as ATP or GTP to any nucleoside diphosphate (NDP) or its deoxy derivative. The sequence of NDPk has been highly conserved through evolution. There is a single histidine residue conserved in all known NDK isozymes, which is involved in the catalytic mechanism. 
The first confirmed metastasis suppressor gene was the NDP kinase protein encoded by the nm23 gene. Unicellular organisms generally possess only one gene encoding NDP kinase, while most multicellular organisms possess not only an ortholog that provides most of the NDP kinase enzymatic activity but also multiple divergent paralogous genes. The human genome codes for at least nine NDP kinases and can be classified into two groups, Groups I and II, according to their genomic architecture and distinct enzymatic activity. Group I isoforms (A-D) are well-conserved, catalytically active, and share 58-88% identity between each other, while Group II are more divergent, with only NDPk6 shown to be active. NDP kinases exist in two different quaternary structures; all known eukaryotic enzymes are hexamers, while some bacterial enzymes are tetramers, as in Myxococcus. The hexamer can be viewed as trimer of dimers, while tetramers are dimers of dimers, with the dimerization interface conserved." Q#18656 - CGI_10018854 superfamily 241584 997 1093 3.96E-16 76.7663 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#18656 - CGI_10018854 superfamily 241584 1101 1194 5.72E-13 67.5215 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. 
Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#18656 - CGI_10018854 superfamily 241584 1386 1476 9.20E-13 67.1363 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#18656 - CGI_10018854 superfamily 241584 896 989 1.23E-12 66.7511 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#18656 - CGI_10018854 superfamily 245814 336 402 1.73E-09 57.1139 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18656 - CGI_10018854 superfamily 245814 246 312 2.23E-09 56.7287 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18656 - CGI_10018854 superfamily 241584 1199 1288 8.06E-09 55.5803 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#18656 - CGI_10018854 superfamily 245814 428 498 2.53E-08 53.6471 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. 
The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18656 - CGI_10018854 superfamily 241584 1484 1573 2.53E-06 47.8763 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#18656 - CGI_10018854 superfamily 245814 614 674 4.68E-06 46.7135 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#18656 - CGI_10018854 superfamily 245814 709 793 3.02E-14 71.1135 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18656 - CGI_10018854 superfamily 245814 521 583 1.15E-09 57.4167 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18656 - CGI_10018854 superfamily 245814 811 899 3.32E-09 56.5771 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. 
A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18656 - CGI_10018854 superfamily 245814 1312 1383 3.43E-09 56.0999 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18656 - CGI_10018854 superfamily 245814 25 113 3.50E-07 50.6194 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18656 - CGI_10018854 superfamily 245814 159 219 1.23E-05 45.513 cl11960 Ig superfamily N - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18657 - CGI_10018855 superfamily 247725 494 597 2.74E-39 139.354 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting the protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and other proteins." Q#18659 - CGI_10018857 superfamily 245213 398 432 3.99E-05 41.4682 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#18659 - CGI_10018857 superfamily 245213 361 395 0.000204431 39.5422 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#18659 - CGI_10018857 superfamily 245213 250 284 0.000241392 39.157 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#18659 - CGI_10018857 superfamily 245213 213 248 0.000404457 38.3866 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#18659 - CGI_10018857 superfamily 245213 472 506 0.000456298 38.3866 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#18659 - CGI_10018857 superfamily 245213 435 469 0.00070832 37.6162 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#18659 - CGI_10018857 superfamily 245213 509 544 0.00109876 37.231 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#18659 - CGI_10018857 superfamily 245213 288 321 0.00179222 36.4606 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#18659 - CGI_10018857 superfamily 245213 324 359 0.00206959 36.4606 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#18659 - CGI_10018857 superfamily 243124 64 205 7.73E-36 131.01 cl02648 NIDO superfamily - - Nidogen-like; This is a nidogen-like domain (NIDO) domain and is an extracellular domain found in nidogen and hypothetical proteins of unknown function. Q#18661 - CGI_10018859 superfamily 245201 54 338 3.95E-86 266.116 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." 
Q#18663 - CGI_10018861 superfamily 247792 17 64 5.13E-11 57.8408 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-(N/C/H)-X2-C-X(4-48)-C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#18663 - CGI_10018861 superfamily 110440 327 354 9.38E-06 42.7801 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#18663 - CGI_10018861 superfamily 110440 280 306 0.00743213 34.3057 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." 
Q#18664 - CGI_10018862 superfamily 241781 154 256 1.87E-52 169.716 cl00320 tRNA_bindingDomain superfamily - - "The tRNA binding domain is also known as the Myf domain in literature. This domain is found in a diverse collection of tRNA binding proteins, including prokaryotic phenylalanyl tRNA synthetases (PheRS), methionyl-tRNA synthetases (MetRS), human tyrosyl-tRNA synthetase(hTyrRS), Saccharomyces cerevisiae Arc1p, Thermus thermophilus CsaA, Aquifex aeolicus Trbp111, human p43 and human EMAP-II. PheRS, MetRS and hTyrRS aminoacylate their cognate tRNAs. Arc1p is a transactivator of yeast methionyl-tRNA and glutamyl-tRNA synthetases. The molecular chaperones Trbp111 and CsaA also contain this domain. CsaA has export related activities; Trbp111 is structure-specific recognizing the L-shape of the tRNA fold. This domain has general tRNA binding properties. In a subset of this family this domain has the added capability of a cytokine. For example the p43 component of the Human aminoacyl-tRNA synthetase complex is cleaved to release EMAP-II cytokine. EMAP-II has multiple activities during apoptosis, angiogenesis and inflammation and participates in malignant transformation. An EMAP-II-like cytokine is released from hTyrRS upon cleavage. The active cytokine heptapeptide locates to this domain. For homodimeric members of this group which include CsaA, Trbp111 and Escherichia coli MetRS this domain acts as a dimerization domain." Q#18665 - CGI_10018863 superfamily 241743 66 215 1.80E-39 135.008 cl00274 ML superfamily - - "The ML (MD-2-related lipid-recognition) domain is present in MD-1, MD-2, GM2 activator protein, Niemann-Pick type C2 (Npc2) protein, phosphatidylinositol/phosphatidylglycerol transfer protein (PG/PI-TP), mite allergen Der p 2 and several proteins of unknown function in plants, animals and fungi. 
These single-domain proteins form two anti-parallel beta-pleated sheets stabilized by three disulfide bonds and with an accessible central hydrophobic cavity, and are predicted to mediate diverse biological functions through interaction with specific lipids." Q#18666 - CGI_10018864 superfamily 241743 278 427 1.85E-34 126.534 cl00274 ML superfamily - - "The ML (MD-2-related lipid-recognition) domain is present in MD-1, MD-2, GM2 activator protein, Niemann-Pick type C2 (Npc2) protein, phosphatidylinositol/phosphatidylglycerol transfer protein (PG/PI-TP), mite allergen Der p 2 and several proteins of unknown function in plants, animals and fungi. These single-domain proteins form two anti-parallel beta-pleated sheets stabilized by three disulfide bonds and with an accessible central hydrophobic cavity, and are predicted to mediate diverse biological functions through interaction with specific lipids." Q#18666 - CGI_10018864 superfamily 241743 68 213 1.51E-26 104.192 cl00274 ML superfamily - - "The ML (MD-2-related lipid-recognition) domain is present in MD-1, MD-2, GM2 activator protein, Niemann-Pick type C2 (Npc2) protein, phosphatidylinositol/phosphatidylglycerol transfer protein (PG/PI-TP), mite allergen Der p 2 and several proteins of unknown function in plants, animals and fungi. These single-domain proteins form two anti-parallel beta-pleated sheets stabilized by three disulfide bonds and with an accessible central hydrophobic cavity, and are predicted to mediate diverse biological functions through interaction with specific lipids." Q#18667 - CGI_10002979 superfamily 245225 374 661 4.81E-59 203.648 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily - - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. 
This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein coupled receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine-binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#18667 - CGI_10002979 superfamily 242197 18 84 0.00190102 37.3444 cl00928 dsDNA_bind superfamily N - Double-stranded DNA-binding domain; This domain is believed to bind double-stranded DNA of 20 bases length. 
Q#18668 - CGI_10002980 superfamily 247856 31 68 1.31E-05 38.6829 cl17302 EFh superfamily C - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#18669 - CGI_10002981 superfamily 247856 42 113 9.90E-10 52.9353 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#18669 - CGI_10002981 superfamily 247856 178 248 2.19E-09 52.1649 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#18669 - CGI_10002981 superfamily 247856 4 67 0.00274791 34.8309 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#18670 - CGI_10002982 superfamily 242323 54 159 3.32E-15 68.3039 cl01132 FA_hydroxylase superfamily - - "Fatty acid hydroxylase superfamily; This superfamily includes fatty acid and carotene hydroxylases and sterol desaturases. Beta-carotene hydroxylase is involved in zeaxanthin synthesis by hydroxylating beta-carotene, but the enzyme may be involved in other pathways. This family includes C-5 sterol desaturase and C-4 sterol methyl oxidase. Members of this family are involved in cholesterol biosynthesis and biosynthesis of plant cuticular wax. These enzymes contain two copies of a HXHH motif. Members of this family are integral membrane proteins." Q#18672 - CGI_10009945 superfamily 222070 75 185 3.11E-08 49.2133 cl18634 DDE_3 superfamily C - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#18674 - CGI_10009948 superfamily 241563 61 100 2.10E-05 42.2744 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#18677 - CGI_10009951 superfamily 244895 85 429 7.64E-96 298.306 cl08294 Peptidase_M17 superfamily N - "Cytosol aminopeptidase family, N-terminal and catalytic domains. Family M17 contains zinc- and manganese-dependent exopeptidases (EC 3.4.11.1), including leucine aminopeptidase. 
They catalyze removal of amino acids from the N-terminus of a protein and play a key role in protein degradation and in the metabolism of biologically active peptides. They do not contain HEXXH motif (which is used as one of the signature patterns to group the peptidase families) in the metal-binding site. The two associated zinc ions and the active site are entirely enclosed within the C-terminal catalytic domain in leucine aminopeptidase. The enzyme is a hexamer, with the catalytic domains clustered around the three-fold axis, and the two trimers related to one another by a two-fold rotation. The N-terminal domain is structurally similar to the ADP-ribose binding Macro domain. This family includes proteins from bacteria, archaea, animals and plants." Q#18678 - CGI_10009952 superfamily 241583 144 305 2.67E-48 168.904 cl00064 ZnMc superfamily - - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#18678 - CGI_10009952 superfamily 247097 365 401 0.00700143 35.4329 cl15839 ShK superfamily - - ShK domain-like; This domain is found in several C. elegans proteins. The domain is 30 amino acids long and rich in cysteine residues. There are 6 conserved cysteine positions in the domain that form three disulphide bridges. The domain is found in the potassium channel inhibitor ShK in sea anemone. Q#18679 - CGI_10009953 superfamily 245847 2 91 5.56E-08 46.0106 cl12042 FA58C superfamily N - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." 
Q#18680 - CGI_10009954 superfamily 241629 37 142 1.79E-30 108.309 cl00133 SCP superfamily - - "SCP: SCP-like extracellular protein domain, found in eukaryotes and prokaryotes. This family includes plant pathogenesis-related protein 1 (PR-1), which accumulates after infections with pathogens, and may act as an anti-fungal agent or be involved in cell wall loosening. This family also includes CRISPs, mammalian cysteine-rich secretory proteins, which combine SCP with a C-terminal cysteine rich domain, and allergen 5 from vespid venom. Roles for CRISP, in response to pathogens, fertilization, and sperm maturation have been proposed. One member, Tex31 from the venom duct of Conus textile, has been shown to possess proteolytic activity sensitive to serine protease inhibitors. The human GAPR-1 protein has been reported to dimerize, and such a dimer may form an active site containing a catalytic triad. SCP has also been proposed to be a Ca++ chelating serine protease. The Ca++-chelating function would fit with various signaling processes that members of this family, such as the CRISPs, are involved in, and is supported by sequence and structural evidence of a conserved pocket containing two histidines and a glutamate. It also may explain how helothermine, a toxic peptide secreted by the beaded lizard, blocks Ca++ transporting ryanodine receptors. Little is known about the biological roles of the bacterial and archaeal SCP domains." Q#18681 - CGI_10001400 superfamily 245201 1 175 3.28E-115 343.853 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. 
These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#18681 - CGI_10001400 superfamily 243035 207 299 1.21E-13 67.4003 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. 
Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#18681 - CGI_10001400 superfamily 243035 371 461 1.75E-13 66.6308 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#18685 - CGI_10009996 superfamily 241750 9 72 7.10E-13 60.665 cl00281 metallo-dependent_hydrolases superfamily N - "Superfamily of metallo-dependent hydrolases (also called amidohydrolase superfamily) is a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. The family includes urease alpha, adenosine deaminase, phosphotriesterase dihydroorotases, allantoinases, hydantoinases, AMP-, adenine and cytosine deaminases, imidazolonepropionase, aryldialkylphosphatase, chlorohydrolases, formylmethanofuran dehydrogenases and others." Q#18692 - CGI_10010003 superfamily 247677 204 397 1.71E-92 284.105 cl17013 W2 superfamily - - "C-terminal domain of eIF4-gamma/eIF5/eIF2b-epsilon; This domain is found at the C-terminus of several translation initiation factors, including the epsilon chain of eIF2b, where it has been found to catalyze the conversion of eIF2.GDP to its active eIF2.GTP form. The structure of the domain resembles that of a set of concatenated HEAT repeats." Q#18693 - CGI_10010004 superfamily 243035 97 217 2.47E-25 96.5349 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. 
Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. 
Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#18697 - CGI_10007672 superfamily 245864 5 86 1.56E-18 79.6298 cl12078 p450 superfamily N - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#18698 - CGI_10007673 superfamily 245864 20 416 2.15E-45 164.759 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. 
Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#18699 - CGI_10007674 superfamily 245864 120 428 1.25E-43 160.136 cl12078 p450 superfamily N - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. 
Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#18700 - CGI_10007675 superfamily 247727 259 323 0.000147012 40.1111 cl17173 AdoMet_MTases superfamily C - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#18700 - CGI_10007675 superfamily 247637 111 189 0.00613326 37.2299 cl16912 MDR superfamily N - "Medium chain reductase/dehydrogenase (MDR)/zinc-dependent alcohol dehydrogenase-like family; The medium chain reductase/dehydrogenases (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH. MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P) binding-Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES. The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH), quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. 
The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones. ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines. Other MDR members have only a catalytic zinc, and some contain no coordinated zinc." Q#18701 - CGI_10007676 superfamily 241563 60 102 7.58E-05 40.5404 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#18701 - CGI_10007676 superfamily 110440 421 447 9.90E-05 40.0837 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." 
Q#18702 - CGI_10007678 superfamily 241563 62 98 0.00117132 36.6884 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#18702 - CGI_10007678 superfamily 110440 429 456 0.00189788 36.2317 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#18702 - CGI_10007678 superfamily 247740 199 270 0.00487268 37.186 cl17186 TIM_phosphate_binding superfamily C - "TIM barrel proteins share a structurally conserved phosphate binding motif and in general share an eight beta/alpha closed barrel structure. Specific for this family is the conserved phosphate binding site at the edges of strands 7 and 8. The phosphate comes either from the substrate, as in the case of inosine monophosphate dehydrogenase (IMPDH), or from ribulose-5-phosphate 3-epimerase (RPE) or from cofactors, like FMN." Q#18703 - CGI_10007679 superfamily 221433 1886 1994 3.30E-29 116.244 cl13553 DUF3585 superfamily - - Protein of unknown function (DUF3585); This domain is found in eukaryotes. This domain is typically between 135 and 149 amino acids in length and is found associated with pfam00307. 
Q#18703 - CGI_10007679 superfamily 243050 663 720 1.86E-25 102.759 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#18703 - CGI_10007679 superfamily 248054 66 103 0.0012877 39.3776 cl17500 NAD_binding_8 superfamily C - NAD(P)-binding Rossmann-like domain; NAD(P)-binding Rossmann-like domain. Q#18704 - CGI_10007680 superfamily 110440 135 161 0.00537201 33.1501 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." 
Q#18705 - CGI_10007681 superfamily 241563 40 75 0.000134323 39.77 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#18705 - CGI_10007681 superfamily 110440 463 489 0.00284661 35.8465 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#18705 - CGI_10007681 superfamily 110440 504 531 0.00945211 34.3057 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#18706 - CGI_10005830 superfamily 248264 171 311 3.67E-36 128.894 cl17710 DDE_4 superfamily - - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. 
This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#18706 - CGI_10005830 superfamily 222263 83 182 1.43E-06 45.3865 cl16321 DDE_4_2 superfamily - - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#18708 - CGI_10005832 superfamily 207666 7 61 0.000406194 38.0103 cl02605 SCAN superfamily N - "SCAN oligomerization domain; The SCAN domain (named after SRE-ZBP, CTfin51, AW-1 and Number 18 cDNA) is found in several vertebrate proteins that contain C2H2 zinc finger motifs, many of which may be transcription factors playing roles in cell survival and differentiation. This protein-interaction domain is able to mediate homo- and hetero-oligomerization of SCAN-containing proteins. Some SCAN-containing proteins, including those of lower vertebrates, do not contain zinc finger motifs. It has been noted that the SCAN domain resembles a domain-swapped version of the C-terminal domain of the HIV capsid protein. This domain model features elements common to the three general groups of SCAN domains (SCAN-A1, SCAN-A2, and SCAN-B). The SCAND1 protein is truncated at the C-terminus with respect to this model, the SCAND2 protein appears to have a truncated central helix." 
Q#18711 - CGI_10003762 superfamily 247809 628 797 1.03E-21 94.2651 cl17255 ATP-grasp_4 superfamily - - ATP-grasp domain; This family includes a diverse set of enzymes that possess ATP-dependent carboxylate-amine ligase activity. Q#18712 - CGI_10003763 superfamily 202805 56 92 4.48E-16 67.5978 cl18232 Sec61_beta superfamily - - "Sec61beta family; This family consists of homologues of Sec61beta - a component of the Sec61/SecYEG protein secretory system. The domain is found in eukaryotes and archaea and is possibly homologous to the bacterial SecG. It consists of a single putative transmembrane helix, preceded by a short stretch containing various charged residues; this arrangement may help determine orientation in the cell membrane." Q#18713 - CGI_10003764 superfamily 245814 31 110 0.00331574 34.0181 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18714 - CGI_10003765 superfamily 241832 23 114 2.09E-39 128.83 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. 
The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#18715 - CGI_10003766 superfamily 246680 189 269 4.38E-14 65.462 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#18715 - CGI_10003766 superfamily 245874 20 83 7.20E-10 54.3546 cl12111 TNFR superfamily C - "Tumor necrosis factor receptor (TNFR) domain; superfamily of TNF-like receptor domains. When bound to TNF-like cytokines, TNFRs trigger multiple signal transduction pathways, they are involved in inflammation response, apoptosis, autoimmunity and organogenesis. TNFRs domains are elongated with generally three tandem repeats of cysteine-rich domains (CRDs). They fit in the grooves between protomers within the ligand trimer. 
Some TNFRs, such as NGFR and HveA, bind ligands with no structural similarity to TNF and do not bind ligand trimers." Q#18716 - CGI_10003767 superfamily 241567 150 414 5.70E-87 267.159 cl00042 CASc superfamily - - "Caspase, interleukin-1 beta converting enzyme (ICE) homologues; Cysteine-dependent aspartate-directed proteases that mediate programmed cell death (apoptosis). Caspases are synthesized as inactive zymogens and activated by proteolysis of the peptide backbone adjacent to an aspartate. The resulting two subunits associate to form an (alpha)2(beta)2-tetramer which is the active enzyme. Activation of caspases can be mediated by other caspase homologs." Q#18717 - CGI_10003768 superfamily 246680 7 83 4.10E-13 59.9354 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#18718 - CGI_10001245 superfamily 243033 225 356 7.22E-25 97.7741 cl02428 Ependymin superfamily - - Ependymin; Ependymin. Q#18718 - CGI_10001245 superfamily 243033 67 162 1.68E-15 71.9657 cl02428 Ependymin superfamily C - Ependymin; Ependymin. 
Q#18719 - CGI_10005681 superfamily 218118 94 133 2.80E-08 47.2237 cl04552 CD225 superfamily C - "Interferon-induced transmembrane protein; This family includes the human leukocyte antigen CD225, which is an interferon inducible transmembrane protein, and is associated with interferon induced cell growth suppression." Q#18723 - CGI_10005685 superfamily 218118 73 113 2.41E-07 44.1421 cl04552 CD225 superfamily C - "Interferon-induced transmembrane protein; This family includes the human leukocyte antigen CD225, which is an interferon inducible transmembrane protein, and is associated with interferon induced cell growth suppression." Q#18724 - CGI_10005686 superfamily 218118 66 98 2.50E-05 39.9049 cl04552 CD225 superfamily C - "Interferon-induced transmembrane protein; This family includes the human leukocyte antigen CD225, which is an interferon inducible transmembrane protein, and is associated with interferon induced cell growth suppression." Q#18725 - CGI_10005687 superfamily 218118 69 108 1.76E-06 41.4457 cl04552 CD225 superfamily C - "Interferon-induced transmembrane protein; This family includes the human leukocyte antigen CD225, which is an interferon inducible transmembrane protein, and is associated with interferon induced cell growth suppression." Q#18727 - CGI_10005058 superfamily 216378 1 362 1.06E-142 413.54 cl03133 IDO superfamily - - "Indoleamine 2,3-dioxygenase; Indoleamine 2,3-dioxygenase. " Q#18729 - CGI_10005060 superfamily 216686 2 63 3.17E-14 64.6517 cl18377 Galactosyl_T superfamily N - "Galactosyltransferase; This family includes the galactosyltransferases UDP-galactose:2-acetamido-2-deoxy-D-glucose3beta-galactosyltransferase and UDP-Gal:beta-GlcNAc beta 1,3-galactosyltranferase. Specific galactosyltransferases transfer galactose to GlcNAc terminal chains in the synthesis of the lacto-series oligosaccharides types 1 and 2." 
Q#18731 - CGI_10005062 superfamily 246669 59 184 2.82E-54 175.523 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#18731 - CGI_10005062 superfamily 246669 194 327 5.96E-45 151.719 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. 
C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#18732 - CGI_10005063 superfamily 243072 91 214 4.81E-26 100.151 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#18732 - CGI_10005063 superfamily 243073 252 291 2.23E-08 49.3909 cl02533 SOCS superfamily - - "SOCS (suppressors of cytokine signaling) box. The SOCS box is found in the C-terminal region of CIS/SOCS family proteins (in combination with a SH2 domain), ASBs (ankyrin repeat-containing proteins with a SOCS box), SSBs (SPRY domain-containing proteins with a SOCS box), and WSBs (WD40 repeat-containing proteins with a SOCS box), as well as, other miscellaneous proteins. The function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions." Q#18734 - CGI_10005065 superfamily 241596 148 191 2.08E-11 56.0683 cl00081 HLH superfamily N - "Helix-loop-helix domain, found in specific DNA- binding proteins that act as transcription factors; 60-100 amino acids long. 
A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE 5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved in both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#18735 - CGI_10012412 superfamily 241563 61 100 9.75E-05 41.3108 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#18736 - CGI_10012413 superfamily 243058 268 381 8.12E-09 53.8575 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. 
ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#18736 - CGI_10012413 superfamily 243058 351 469 0.000163241 40.7608 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#18736 - CGI_10012413 superfamily 241645 1 75 0.000658567 38.3126 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." 
Q#18737 - CGI_10012414 superfamily 247924 126 204 7.01E-22 86.9937 cl17370 PEMT superfamily - - Phospholipid methyltransferase; The S. cerevisiae phospholipid methyltransferase (EC:2.1.1.16) has a broad substrate specificity of unsaturated phospholipids. Q#18738 - CGI_10012415 superfamily 241600 14 188 7.71E-50 162.795 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#18740 - CGI_10012417 superfamily 220397 73 182 1.88E-20 89.8748 cl10759 NDUF_B5 superfamily N - "NADH:ubiquinone oxidoreductase, NDUFB5/SGDH subunit; Members of this family mediate the transfer of electrons from NADH to the respiratory chain. The immediate electron acceptor for the enzyme is believed to be ubiquinone, the reaction that occurs being: NADH + ubiquinone = NAD(+) + ubiquinol." Q#18741 - CGI_10012418 superfamily 241762 31 91 3.21E-19 78.5244 cl00297 R3H superfamily - - "R3H domain. The name of the R3H domain comes from the characteristic spacing of the most conserved arginine and histidine residues. R3H domains are found in proteins together with ATPase domains, SF1 helicase domains, SF2 DEAH helicase domains, Cys-rich repeats, ring-type zinc fingers, and KH domains. The function of the domain is predicted to bind ssDNA or ssRNA in a sequence-specific manner." 
Q#18743 - CGI_10012420 superfamily 193607 516 647 1.43E-69 224.373 cl15237 Deltex_C superfamily - - "Domain found at the C-terminus of deltex-like; The deltex family of proteins is involved in the regulation of Notch signaling, and therefore may play roles in cell-to-cell communications that regulate mechanisms determining cell fate. They have a central RING-type zinc finger domain and contain a C-terminal domain, described here, that is also found in other domain architectures. Deltex-1 (DTX1) contains a RING finger and two WWE domains, indicating that it may be an E3 ubiquitin ligase. Human deltex 3-like, which contains an additional N-terminal domain (presumably with ubiquitin ligase activity) is also described as E3 ubiquitin-protein ligase DTX3L, B-lymphoma- and BAL-associated protein (BBAP), or rhysin-2. DTX3L mediates monoubiquitination of K91 of histone H4 in response to DNA damage." Q#18743 - CGI_10012420 superfamily 247792 469 509 4.92E-11 58.9964 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#18743 - CGI_10012420 superfamily 241554 56 220 2.84E-46 162.055 cl00019 Macro superfamily - - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. 
Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#18743 - CGI_10012420 superfamily 241554 333 435 3.65E-06 46.3737 cl00019 Macro superfamily N - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#18744 - CGI_10002412 superfamily 245201 1 275 0 577.831 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#18745 - CGI_10002413 superfamily 217062 251 489 1.94E-36 138.171 cl12266 Branch superfamily - - "Core-2/I-Branching enzyme; This is a family of two different beta-1,6-N-acetylglucosaminyltransferase enzymes, I-branching enzyme and core-2 branching enzyme . 
I-branching enzyme is responsible for the production of the blood group I-antigen during embryonic development. Core-2 branching enzyme forms crucial side-chain branches in O-glycans." Q#18745 - CGI_10002413 superfamily 221621 524 699 2.85E-31 120.505 cl13906 Xylo_C superfamily - - "Xylosyltransferase C terminal; This domain family is found in eukaryotes, and is typically between 169 and 183 amino acids in length. The family is found in association with pfam02485. There is a single completely conserved residue G that may be functionally important. Xylosyltransferases are enzymes involved in the biosynthesis of the glycosaminoglycan linker region in proteoglycans." Q#18745 - CGI_10002413 superfamily 243093 126 204 1.31E-20 87.9709 cl02568 WSC superfamily - - WSC domain; This domain may be involved in carbohydrate binding. Q#18746 - CGI_10002414 superfamily 147445 4 182 3.39E-20 85.8035 cl05014 UPF0193 superfamily - - Uncharacterized protein family (UPF0193); This family of proteins is functionally uncharacterized. Q#18747 - CGI_10002415 superfamily 221683 72 161 8.63E-10 56.1231 cl15002 UPF0489 superfamily - - UPF0489 domain; This family is probably an enzyme which is related to the Arginase family. Q#18748 - CGI_10002416 superfamily 241984 61 151 2.04E-44 147.702 cl00615 Membrane-FADS-like superfamily C - "The membrane fatty acid desaturase (Membrane_FADS)-like CD includes membrane FADSs, alkane hydroxylases, beta carotene ketolases (CrtW-like), hydroxylases (CrtR-like), and other related proteins. They are present in all groups of organisms with the exception of archaea. Membrane FADSs are non-heme, iron-containing, oxygen-dependent enzymes involved in regioselective introduction of double bonds in fatty acyl aliphatic chains. They play an important role in the maintenance of the proper structure and functioning of biological membranes. 
Alkane hydroxylases are bacterial, integral-membrane di-iron enzymes that share a requirement for iron and oxygen for activity similar to that of membrane FADSs, and are involved in the initial oxidation of inactivated alkanes. Beta-carotene ketolase and beta-carotene hydroxylase are carotenoid biosynthetic enzymes for astaxanthin and zeaxanthin, respectively. This superfamily domain has extensive hydrophobic regions that would be capable of spanning the membrane bilayer at least twice. Comparison of these sequences also reveals three regions of conserved histidine cluster motifs that contain eight histidine residues: HXXX(X)H, HXX(X)HH, and HXXHH (an additional conserved histidine residue is seen between clusters 2 and 3). Spectroscopic and genetic evidence point to a nitrogen-rich coordination environment located in the cytoplasm with as many as eight histidines coordinating the two iron ions and a carboxylate residue bridging the two metals in the Pseudomonas oleovorans alkane hydroxylase (AlkB). In addition, the eight histidine residues are reported to be catalytically essential and proposed to be the ligands for the iron atoms contained within the rat stearoyl CoA delta-9 desaturase." Q#18749 - CGI_10002417 superfamily 241984 17 113 4.98E-32 113.034 cl00615 Membrane-FADS-like superfamily N - "The membrane fatty acid desaturase (Membrane_FADS)-like CD includes membrane FADSs, alkane hydroxylases, beta carotene ketolases (CrtW-like), hydroxylases (CrtR-like), and other related proteins. They are present in all groups of organisms with the exception of archaea. Membrane FADSs are non-heme, iron-containing, oxygen-dependent enzymes involved in regioselective introduction of double bonds in fatty acyl aliphatic chains. They play an important role in the maintenance of the proper structure and functioning of biological membranes. 
Alkane hydroxylases are bacterial, integral-membrane di-iron enzymes that share a requirement for iron and oxygen for activity similar to that of membrane FADSs, and are involved in the initial oxidation of inactivated alkanes. Beta-carotene ketolase and beta-carotene hydroxylase are carotenoid biosynthetic enzymes for astaxanthin and zeaxanthin, respectively. This superfamily domain has extensive hydrophobic regions that would be capable of spanning the membrane bilayer at least twice. Comparison of these sequences also reveals three regions of conserved histidine cluster motifs that contain eight histidine residues: HXXX(X)H, HXX(X)HH, and HXXHH (an additional conserved histidine residue is seen between clusters 2 and 3). Spectroscopic and genetic evidence point to a nitrogen-rich coordination environment located in the cytoplasm with as many as eight histidines coordinating the two iron ions and a carboxylate residue bridging the two metals in the Pseudomonas oleovorans alkane hydroxylase (AlkB). In addition, the eight histidine residues are reported to be catalytically essential and proposed to be the ligands for the iron atoms contained within the rat stearoyl CoA delta-9 desaturase." Q#18750 - CGI_10002418 superfamily 241984 17 113 4.98E-32 113.034 cl00615 Membrane-FADS-like superfamily N - "The membrane fatty acid desaturase (Membrane_FADS)-like CD includes membrane FADSs, alkane hydroxylases, beta carotene ketolases (CrtW-like), hydroxylases (CrtR-like), and other related proteins. They are present in all groups of organisms with the exception of archaea. Membrane FADSs are non-heme, iron-containing, oxygen-dependent enzymes involved in regioselective introduction of double bonds in fatty acyl aliphatic chains. They play an important role in the maintenance of the proper structure and functioning of biological membranes. 
Alkane hydroxylases are bacterial, integral-membrane di-iron enzymes that share a requirement for iron and oxygen for activity similar to that of membrane FADSs, and are involved in the initial oxidation of inactivated alkanes. Beta-carotene ketolase and beta-carotene hydroxylase are carotenoid biosynthetic enzymes for astaxanthin and zeaxanthin, respectively. This superfamily domain has extensive hydrophobic regions that would be capable of spanning the membrane bilayer at least twice. Comparison of these sequences also reveals three regions of conserved histidine cluster motifs that contain eight histidine residues: HXXX(X)H, HXX(X)HH, and HXXHH (an additional conserved histidine residue is seen between clusters 2 and 3). Spectroscopic and genetic evidence point to a nitrogen-rich coordination environment located in the cytoplasm with as many as eight histidines coordinating the two iron ions and a carboxylate residue bridging the two metals in the Pseudomonas oleovorans alkane hydroxylase (AlkB). In addition, the eight histidine residues are reported to be catalytically essential and proposed to be the ligands for the iron atoms contained within the rat stearoyl CoA delta-9 desaturase." Q#18762 - CGI_10022297 superfamily 115363 212 236 0.000290236 37.7366 cl05972 MIB_HERC2 superfamily C - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). Q#18762 - CGI_10022297 superfamily 115363 240 260 0.000323627 37.7366 cl05972 MIB_HERC2 superfamily C - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). Q#18762 - CGI_10022297 superfamily 241578 29 55 0.00305479 36.5973 cl00057 vWFA superfamily C - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). 
Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#18763 - CGI_10022298 superfamily 241578 18 140 1.29E-05 44.0938 cl00057 vWFA superfamily C - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." 
Q#18763 - CGI_10022298 superfamily 115363 264 326 1.46E-13 65.8561 cl05972 MIB_HERC2 superfamily - - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). Q#18764 - CGI_10022299 superfamily 241578 24 125 2.51E-05 42.553 cl00057 vWFA superfamily C - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#18764 - CGI_10022299 superfamily 115363 212 259 2.76E-12 61.2337 cl05972 MIB_HERC2 superfamily C - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). Q#18765 - CGI_10022300 superfamily 241578 51 227 1.17E-07 50.257 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold.
The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#18765 - CGI_10022300 superfamily 115363 300 337 1.22E-08 51.989 cl05972 MIB_HERC2 superfamily C - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). Q#18765 - CGI_10022300 superfamily 207713 516 556 0.00190556 36.9137 cl02729 WWE superfamily C - WWE domain; The WWE domain is named after three of its conserved residues and is predicted to mediate specific protein-protein interactions in ubiquitin and ADP ribose conjugation systems. Q#18765 - CGI_10022300 superfamily 115363 380 392 0.00375588 35.8106 cl05972 MIB_HERC2 superfamily C - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). Q#18766 - CGI_10022301 superfamily 248458 122 234 0.00283134 38.8341 cl17904 MFS superfamily NC - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates.
Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#18766 - CGI_10022301 superfamily 241607 394 432 1.76E-08 51.5362 cl00097 KAZAL_FS superfamily N - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chymotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation.
The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters), which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#18768 - CGI_10022303 superfamily 243072 167 293 6.15E-21 87.8242 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#18768 - CGI_10022303 superfamily 243072 70 221 2.62E-12 63.5566 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats."
Q#18770 - CGI_10022305 superfamily 215821 45 137 1.08E-45 146.231 cl18346 FKBP_C superfamily - - FKBP-type peptidyl-prolyl cis-trans isomerase; FKBP-type peptidyl-prolyl cis-trans isomerase. Q#18771 - CGI_10022306 superfamily 241733 25 114 2.23E-50 156.722 cl00259 Sm_like superfamily - - "Sm and related proteins; The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes." Q#18774 - CGI_10022309 superfamily 247736 108 175 2.13E-05 40.7221 cl17182 NAT_SF superfamily - - "N-Acyltransferase superfamily: Various enzymes that characteristically catalyze the transfer of an acyl group to a substrate; NAT (N-Acyltransferase) is a large superfamily of enzymes that mostly catalyze the transfer of an acyl group to a substrate and are implicated in a variety of functions, ranging from bacterial antibiotic resistance to circadian rhythms in mammals. Members include GCN5-related N-Acetyltransferases (GNAT) such as Aminoglycoside N-acetyltransferases, Histone N-acetyltransferase (HAT) enzymes, and Serotonin N-acetyltransferase, which catalyze the transfer of an acetyl group to a substrate. The kinetic mechanism of most GNATs involves the ordered formation of a ternary complex: the reaction begins with Acetyl Coenzyme A (AcCoA) binding, followed by binding of substrate, then direct transfer of the acetyl group from AcCoA to the substrate, followed by product and subsequent CoA release. 
Other family members include Arginine/ornithine N-succinyltransferase, Myristoyl-CoA: protein N-myristoyltransferase, and Acyl-homoserinelactone synthase which have a similar catalytic mechanism but differ in types of acyl groups transferred. Leucyl/phenylalanyl-tRNA-protein transferase and FemXAB nonribosomal peptidyltransferases which catalyze similar peptidyltransferase reactions are also included." Q#18775 - CGI_10022310 superfamily 243092 371 716 1.08E-32 130.918 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#18775 - CGI_10022310 superfamily 243092 127 537 3.81E-24 105.11 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#18775 - CGI_10022310 superfamily 243092 69 159 3.24E-10 61.9672 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#18779 - CGI_10022314 superfamily 247858 1 92 7.26E-26 95.9178 cl17304 2OG-FeII_Oxy_3 superfamily N - 2OG-Fe(II) oxygenase superfamily; This family contains members of the 2-oxoglutarate (2OG) and Fe(II)-dependent oxygenase superfamily. Q#18783 - CGI_10022318 superfamily 241570 115 220 2.73E-18 81.2182 cl00047 CAP_ED superfamily - - "effector domain of the CAP family of transcription factors; members include CAP (or cAMP receptor protein (CRP)), which binds cAMP, FNR (fumarate and nitrate reduction), which uses an iron-sulfur cluster to sense oxygen) and CooA, a heme containing CO sensor. In all cases binding of the effector leads to conformational changes and the ability to activate transcription. 
Cyclic nucleotide-binding domain similar to CAP are also present in cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) and vertebrate cyclic nucleotide-gated ion-channels. Cyclic nucleotide-monophosphate binding domain; proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues; the best studied is the prokaryotic catabolite gene activator, CAP, where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure; three conserved glycine residues are thought to be essential for maintenance of the structural integrity of the beta-barrel; CooA is a homodimeric transcription factor that belongs to CAP family; cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain; cAPK's are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain; cGPK's are single chain enzymes that include the two copies of the domain in their N-terminal section; also found in vertebrate cyclic nucleotide-gated ion-channels" Q#18783 - CGI_10022318 superfamily 241570 242 301 0.00767754 35.3794 cl00047 CAP_ED superfamily C - "effector domain of the CAP family of transcription factors; members include CAP (or cAMP receptor protein (CRP)), which binds cAMP, FNR (fumarate and nitrate reduction), which uses an iron-sulfur cluster to sense oxygen) and CooA, a heme containing CO sensor. In all cases binding of the effector leads to conformational changes and the ability to activate transcription. Cyclic nucleotide-binding domain similar to CAP are also present in cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) and vertebrate cyclic nucleotide-gated ion-channels. 
Cyclic nucleotide-monophosphate binding domain; proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues; the best studied is the prokaryotic catabolite gene activator, CAP, where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure; three conserved glycine residues are thought to be essential for maintenance of the structural integrity of the beta-barrel; CooA is a homodimeric transcription factor that belongs to CAP family; cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain; cAPK's are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain; cGPK's are single chain enzymes that include the two copies of the domain in their N-terminal section; also found in vertebrate cyclic nucleotide-gated ion-channels" Q#18784 - CGI_10022319 superfamily 247724 17 269 3.28E-110 321.313 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. 
The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#18785 - CGI_10022320 superfamily 241571 29 151 7.40E-07 48.9479 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#18785 - CGI_10022320 superfamily 245814 321 397 9.22E-07 47.8691 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18786 - CGI_10022321 superfamily 241629 66 201 1.49E-34 123.014 cl00133 SCP superfamily - - "SCP: SCP-like extracellular protein domain, found in eukaryotes and prokaryotes. This family includes plant pathogenesis-related protein 1 (PR-1), which accumulates after infections with pathogens, and may act as an anti-fungal agent or be involved in cell wall loosening. This family also includes CRISPs, mammalian cysteine-rich secretory proteins, which combine SCP with a C-terminal cysteine rich domain, and allergen 5 from vespid venom. Roles for CRISP, in response to pathogens, fertilization, and sperm maturation have been proposed. 
One member, Tex31 from the venom duct of Conus textile, has been shown to possess proteolytic activity sensitive to serine protease inhibitors. The human GAPR-1 protein has been reported to dimerize, and such a dimer may form an active site containing a catalytic triad. SCP has also been proposed to be a Ca++ chelating serine protease. The Ca++-chelating function would fit with various signaling processes that members of this family, such as the CRISPs, are involved in, and is supported by sequence and structural evidence of a conserved pocket containing two histidines and a glutamate. It also may explain how helothermine, a toxic peptide secreted by the beaded lizard, blocks Ca++ transporting ryanodine receptors. Little is known about the biological roles of the bacterial and archaeal SCP domains." Q#18787 - CGI_10022322 superfamily 238191 30 523 2.16E-96 319.278 cl18907 Esterase_lipase superfamily - - "Esterases and lipases (includes fungal lipases, cholinesterases, etc.) These enzymes act on carboxylic esters (EC: 3.1.1.-). The catalytic apparatus involves three residues (catalytic triad): a serine, a glutamate or aspartate and a histidine. These catalytic residues are responsible for the nucleophilic attack on the carbonyl carbon atom of the ester bond. In contrast with other alpha/beta hydrolase fold family members, p-nitrobenzyl esterase and acetylcholine esterase have a Glu instead of Asp at the active site carboxylate." Q#18787 - CGI_10022322 superfamily 248012 902 1001 2.15E-18 83.0108 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#18796 - CGI_10022331 superfamily 245864 34 441 5.00E-95 297.268 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens.
They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#18797 - CGI_10022332 superfamily 243051 237 327 4.53E-22 90.1297 cl02479 MAM superfamily N - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal cord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region."
Q#18797 - CGI_10022332 superfamily 241583 37 221 3.84E-32 119.213 cl00064 ZnMc superfamily - - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#18799 - CGI_10022334 superfamily 222429 6 82 6.58E-15 71.1176 cl18676 Myb_DNA-bind_5 superfamily - - Myb/SANT-like DNA-binding domain; This presumed domain appears to be related to other Myb/SANT like DNA binding domains. This family is greatly expanded in arthropods and higher eukaryotes. Q#18803 - CGI_10004883 superfamily 247724 32 56 0.000143407 36.16 cl17170 Ras_like_GTPase superfamily NC - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. 
Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#18804 - CGI_10004884 superfamily 241688 46 352 5.31E-80 247.465 cl00210 Isoprenoid_Biosyn_C1 superfamily - - "Isoprenoid Biosynthesis enzymes, Class 1; Superfamily of trans-isoprenyl diphosphate synthases (IPPS) and class I terpene cyclases which either synthesize geranyl/farnesyl diphosphates (GPP/FPP) or longer chained products from isoprene precursors, isopentenyl diphosphate (IPP) and dimethylallyl diphosphate (DMAPP), or use geranyl (C10)-, farnesyl (C15)-, or geranylgeranyl (C20)-diphosphate as substrate. These enzymes produce a myriad of precursors for such end products as steroids, cholesterol, sesquiterpenes, heme, carotenoids, retinoids, and diterpenes; and are widely distributed among archaea, bacteria, and eukaryota. The enzymes in this superfamily share the same 'isoprenoid synthase fold' and include several subgroups. The head-to-tail (HT) IPPS catalyze the successive 1'-4 condensation of the 5-carbon IPP to the growing isoprene chain to form linear, all-trans, C10-, C15-, C20-, C25-, C30-, C35-, C40-, C45-, or C50-isoprenoid diphosphates. Cyclic monoterpenes, diterpenes, and sesquiterpenes, are formed from their respective linear isoprenoid diphosphates by class I terpene cyclases. The head-to-head (HH) IPPS catalyze the successive 1'-1 condensation of 2 farnesyl or 2 geranylgeranyl isoprenoid diphosphates. Cyclization of these 30- and 40-carbon linear forms is catalyzed by class II cyclases. Both the isoprenoid chain elongation reactions and the class I terpene cyclization reactions proceed via electrophilic alkylations in which a new carbon-carbon single bond is generated through interaction between a highly reactive electron-deficient allylic carbocation and an electron-rich carbon-carbon double bond.
The catalytic site consists of a large central cavity formed by mostly antiparallel alpha helices with two aspartate-rich regions located on opposite walls. These residues mediate binding of prenyl phosphates via bridging Mg2+ ions, inducing proposed conformational changes that close the active site to solvent, stabilizing reactive carbocation intermediates. Generally, the enzymes in this family exhibit an all-trans reaction pathway, an exception, is the cis-trans terpene cyclase, trichodiene synthase. Mechanistically and structurally distinct, class II terpene cyclases and cis-IPPS are not included in this CD." Q#18805 - CGI_10004885 superfamily 216977 110 291 2.94E-20 85.7497 cl03541 MAM33 superfamily - - Mitochondrial glycoprotein; This mitochondrial matrix protein family contains members of the MAM33 family which bind to the globular 'heads' of C1Q. It is thought to be involved in mitochondrial oxidative phosphorylation and in nucleus-mitochondrion interactions. Q#18806 - CGI_10004886 superfamily 217394 9 475 6.42E-114 346.665 cl08387 Alg6_Alg8 superfamily - - "ALG6, ALG8 glycosyltransferase family; N-linked (asparagine-linked) glycosylation of proteins is mediated by a highly conserved pathway in eukaryotes, in which a lipid (dolichol phosphate)-linked oligosaccharide is assembled at the endoplasmic reticulum membrane prior to the transfer of the oligosaccharide moiety to the target asparagine residues. This oligosaccharide is composed of Glc(3)Man(9)GlcNAc(2). The addition of the three glucose residues is the final series of steps in the synthesis of the oligosaccharide precursor. Alg6 transfers the first glucose residue, and Alg8 transfers the second one. In the human alg6 gene, a C->T transition, which causes Ala333 to be replaced with Val, has been identified as the cause of a congenital disorder of glycosylation, designated as type Ic OMIM:603147." 
Q#18807 - CGI_10004887 superfamily 247724 6 167 1.93E-45 149.35 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#18809 - CGI_10004889 superfamily 243072 739 859 4.33E-27 108.24 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#18809 - CGI_10004889 superfamily 243072 430 557 1.19E-17 80.8906 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#18809 - CGI_10004889 superfamily 243072 68 181 2.83E-14 71.2606 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#18809 - CGI_10004889 superfamily 243072 362 483 3.40E-12 65.0974 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#18809 - CGI_10004889 superfamily 243072 133 247 6.39E-12 63.9418 cl02529 ANK superfamily C - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#18811 - CGI_10019179 superfamily 192535 247 283 0.00149966 37.9606 cl18179 7TM_GPCR_Srsx superfamily C - Serpentine type 7TM GPCR chemoreceptor Srsx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srsx is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#18814 - CGI_10019182 superfamily 241574 1 58 7.24E-17 76.4705 cl00053 PTPc superfamily N - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#18815 - CGI_10019183 superfamily 243066 4 93 2.22E-24 96.0828 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. 
The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#18815 - CGI_10019183 superfamily 219619 315 391 1.99E-11 59.9139 cl18518 Ion_trans_2 superfamily - - Ion channel; This family includes the two membrane helix type ion channels found in bacteria. Q#18816 - CGI_10019184 superfamily 243035 161 231 4.13E-16 71.4969 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose.
Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#18818 - CGI_10019186 superfamily 245213 45 80 3.03E-11 54.9502 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. 
EGF_CA can be found in tandem repeat arrangements." Q#18818 - CGI_10019186 superfamily 245847 86 152 1.30E-12 61.0333 cl12042 FA58C superfamily C - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#18819 - CGI_10019187 superfamily 247905 1 34 3.62E-06 40.6581 cl17351 HELICc superfamily N - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#18821 - CGI_10019189 superfamily 216939 1 26 7.94E-05 35.7165 cl03492 PC4 superfamily N - Transcriptional Coactivator p15 (PC4); p15 has a bipartite structure composed of an amino-terminal regulatory domain and a carboxy-terminal cryptic DNA-binding domain. The DNA-binding activity of the carboxy-terminal is disguised by the amino-terminal p15 domain. Activity is controlled by protein kinases that target the regulatory domain. Q#18826 - CGI_10019194 superfamily 241574 346 602 4.48E-89 283.323 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. 
This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#18826 - CGI_10019194 superfamily 241574 740 808 2.45E-09 57.2105 cl00053 PTPc superfamily N - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#18827 - CGI_10019195 superfamily 248293 46 94 0.00205706 32.711 cl17739 MADF_DNA_bdg superfamily C - Alcohol dehydrogenase transcription factor Myb/SANT-like; The myb/SANT-like domain in Adf-1 (MADF) is an approximately 80-amino-acid module that directs sequence specific DNA binding to a site consisting of multiple tri-nucleotide repeats. The MADF domain is found in one or more copies in eukaryotic and viral proteins and is often associated with the BESS domain. It is likely that the MADF domain is more closely related to the myb/SANT domain than it is to other HTH domains. Q#18829 - CGI_10019197 superfamily 245226 285 333 6.40E-06 44.2137 cl10012 DnaQ_like_exo superfamily C - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. 
It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#18832 - CGI_10019200 superfamily 203031 193 251 3.73E-08 50.0192 cl04548 FLYWCH superfamily - - "FLYWCH zinc finger domain; Mutations in the mod(mdg4) gene have effects on variegation (PEV), the properties of insulator sequences, correct path-finding of growing nerve cells, meiotic pairing of chromosomes, and apoptosis. The occurrence of FLYWCH motifs in mod(mdg4) gene product and other proteins is discussed in." Q#18832 - CGI_10019200 superfamily 247999 16 55 0.000422898 38.2404 cl17445 PHD superfamily N - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#18833 - CGI_10009736 superfamily 245814 63 142 1.81E-10 54.4337 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. 
The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18834 - CGI_10009737 superfamily 248289 34 89 0.00121954 34.0288 cl17735 VWC superfamily - - von Willebrand factor type C domain; The high cutoff was used to prevent overlap with pfam00094. Q#18837 - CGI_10009740 superfamily 243062 282 366 1.29E-18 79.6272 cl02510 TGF_beta superfamily - - Transforming growth factor beta like domain; Transforming growth factor beta like domain. Q#18838 - CGI_10009741 superfamily 241570 242 348 1.44E-19 85.4553 cl00047 CAP_ED superfamily - - "effector domain of the CAP family of transcription factors; members include CAP (or cAMP receptor protein (CRP)), which binds cAMP, FNR (fumarate and nitrate reduction), which uses an iron-sulfur cluster to sense oxygen) and CooA, a heme containing CO sensor. In all cases binding of the effector leads to conformational changes and the ability to activate transcription. Cyclic nucleotide-binding domain similar to CAP are also present in cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) and vertebrate cyclic nucleotide-gated ion-channels. 
Cyclic nucleotide-monophosphate binding domain; proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues; the best studied is the prokaryotic catabolite gene activator, CAP, where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure; three conserved glycine residues are thought to be essential for maintenance of the structural integrity of the beta-barrel; CooA is a homodimeric transcription factor that belongs to CAP family; cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain; cAPK's are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain; cGPK's are single chain enzymes that include the two copies of the domain in their N-terminal section; also found in vertebrate cyclic nucleotide-gated ion-channels" Q#18839 - CGI_10009742 superfamily 243072 29 153 1.74E-29 113.247 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#18839 - CGI_10009742 superfamily 245201 384 491 6.62E-06 46.0757 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. 
These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#18840 - CGI_10009743 superfamily 241571 536 643 0.00384582 36.9844 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#18841 - CGI_10009744 superfamily 241563 1 32 0.00139003 32.6444 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#18842 - CGI_10009745 superfamily 215647 144 406 3.49E-64 215.165 cl18338 7tm_2 superfamily - - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GPCRs). They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling."
Q#18842 - CGI_10009745 superfamily 243029 61 127 9.26E-24 96.2657 cl02422 HRM superfamily - - Hormone receptor domain; This extracellular domain contains four conserved cysteines that probably form disulphide bridges. The domain is found in a variety of hormone receptors. It may be a ligand binding domain. Q#18844 - CGI_10006923 superfamily 219542 71 170 1.37E-06 48.0068 cl18517 Cu-oxidase_3 superfamily - - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. Q#18846 - CGI_10006925 superfamily 245814 18 49 0.00606889 31.4994 cl11960 Ig superfamily N - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18847 - CGI_10006926 superfamily 241638 36 163 5.68E-16 70.0896 cl00147 TNF superfamily - - "Tumor Necrosis Factor; TNF superfamily members include the cytokines: TNF (TNF-alpha), LT (lymphotoxin-alpha, TNF-beta), CD40 ligand, Apo2L (TRAIL), Fas ligand, and osteoprotegerin (OPG) ligand. These proteins generally have an intracellular N-terminal domain, a short transmembrane segment, an extracellular stalk, and a globular TNF-like extracellular domain of about 150 residues. They initiate apoptosis by binding to related receptors, some of which have intracellular death domains.
They generally form homo- or hetero- trimeric complexes. TNF cytokines bind one elongated receptor molecule along each of three clefts formed by neighboring monomers of the trimer with ligand trimerization a requisite for receptor binding." Q#18852 - CGI_10006931 superfamily 220647 12 49 0.000339779 38.4628 cl18565 L_HGMIC_fpl superfamily C - "Lipoma HMGIC fusion partner-like protein; This is a group of proteins expressed from a series of genes referred to as Lipoma HMGIC fusion partner-like. The proteins carry four highly conserved transmembrane domains in this entry. In certain instances, e.g. in LHFPL5, mutations cause deafness in humans and hypospadias, and LHFPL1 is transcribed in six liver tumour cell lines." Q#18853 - CGI_10016903 superfamily 243035 20 134 4.34E-15 66.8745 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration.
CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#18855 - CGI_10016905 superfamily 245595 1 158 1.14E-43 150.217 cl11393 Peptidase_M14_like superfamily N - "M14 family of metallocarboxypeptidases and related proteins; The M14 family of metallocarboxypeptidases (MCPs), also known as funnelins, are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. 
Two major subfamilies of the M14 family, defined based on sequence and structural homology, are the A/B and N/E subfamilies. Enzymes belonging to the A/B subfamily are normally synthesized as inactive precursors containing preceding signal peptide, followed by an N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases. The A/B enzymes can be further divided based on their substrate specificity; Carboxypeptidase A-like (CPA-like) enzymes favor hydrophobic residues while carboxypeptidase B-like (CPB-like) enzymes only cleave the basic residues lysine or arginine. The A forms have slightly different specificities, with Carboxypeptidase A1 (CPA1) preferring aliphatic and small aromatic residues, and CPA2 preferring the bulky aromatic side chains. Enzymes belonging to the N/E subfamily enzymes are not produced as inactive precursors and instead rely on their substrate specificity and subcellular compartmentalization to prevent inappropriate cleavage. They contain an extra C-terminal transthyretin-like domain, thought to be involved in folding or formation of oligomers. MCPs can also be classified based on their involvement in specific physiological processes; the pancreatic MCPs participate only in alimentary digestion and include carboxypeptidase A and B (A/B subfamily), while others, namely regulatory MCPs or the N/E subfamily, are involved in more selective reactions, mainly in non-digestive tissues and fluids, acting on blood coagulation/fibrinolysis, inflammation and local anaphylaxis, pro-hormone and neuropeptide processing, cellular response and others. Another MCP subfamily, is that of succinylglutamate desuccinylase /aspartoacylase, which hydrolyzes N-acetyl-L-aspartate (NAA), and deficiency in which is the established cause of Canavan disease. 
Another subfamily (referred to as subfamily C) includes an exceptional type of activity in the MCP family, that of dipeptidyl-peptidase activity of gamma-glutamyl-(L)-meso-diaminopimelate peptidase I which is involved in bacterial cell wall metabolism." Q#18856 - CGI_10016906 superfamily 151798 53 182 9.00E-56 178.02 cl12893 Spy1 superfamily - - "Cell cycle regulatory protein; Speedy (Spy1) is a cell cycle regulatory protein which activates CDK2, the major kinase that allows progression through G1/S phase and further replication events. Spy1 expression overcomes a p27-induced cell cycle arrest to allow for DNA synthesis, so cell cycle progression occurs due to an interaction between Spy1 and p27. Spy1 is also known as Ringo protein A." Q#18858 - CGI_10016908 superfamily 247724 32 192 2.79E-55 176.699 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. 
Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#18859 - CGI_10016909 superfamily 203011 111 197 2.74E-18 81.0993 cl04515 SWIRM superfamily - - SWIRM domain; This SWIRM domain is a small alpha-helical domain of about 85 amino acid residues found in chromosomal proteins. It contains a helix-turn helix motif and binds to DNA. Q#18859 - CGI_10016909 superfamily 248054 224 273 2.93E-07 48.6224 cl17500 NAD_binding_8 superfamily - - NAD(P)-binding Rossmann-like domain; NAD(P)-binding Rossmann-like domain. Q#18860 - CGI_10016910 superfamily 245202 12 74 4.02E-25 93.0269 cl09927 S1_like superfamily - - "S1_like: Ribosomal protein S1-like RNA-binding domain. Found in a wide variety of RNA-associated proteins. Originally identified in S1 ribosomal protein. This superfamily also contains the Cold Shock Domain (CSD), which is a homolog of the S1 domain. Both domains are members of the Oligonucleotide/oligosaccharide Binding (OB) fold." Q#18863 - CGI_10016913 superfamily 184428 15 156 3.98E-30 112.353 cl14742 PRK13971 superfamily C - hydroxyproline-2-epimerase; Provisional Q#18864 - CGI_10016914 superfamily 184428 2 326 8.91E-64 207.112 cl14742 PRK13971 superfamily - - hydroxyproline-2-epimerase; Provisional Q#18865 - CGI_10016915 superfamily 184428 6 138 1.40E-14 68.0551 cl14742 PRK13971 superfamily N - hydroxyproline-2-epimerase; Provisional Q#18867 - CGI_10016917 superfamily 216653 166 286 6.03E-19 80.7191 cl08331 Na_Ca_ex superfamily - - "Sodium/calcium exchanger protein; This is a family of sodium/calcium exchanger integral membrane proteins. This family covers the integral membrane regions of the proteins. Sodium/calcium exchangers regulate intracellular Ca2+ concentrations in many cells; cardiac myocytes, epithelial cells, neurons retinal rod photoreceptors and smooth muscle cells. 
Ca2+ is moved into or out of the cytosol depending on Na+ concentration. In humans and rats there are 3 isoforms; NCX1 NCX2 and NCX3." Q#18868 - CGI_10016918 superfamily 207627 386 476 2.02E-37 133.529 cl02522 Calx-beta superfamily - - Calx-beta domain; Calx-beta domain. Q#18868 - CGI_10016918 superfamily 216653 81 240 1.06E-21 91.5046 cl08331 Na_Ca_ex superfamily - - "Sodium/calcium exchanger protein; This is a family of sodium/calcium exchanger integral membrane proteins. This family covers the integral membrane regions of the proteins. Sodium/calcium exchangers regulate intracellular Ca2+ concentrations in many cells; cardiac myocytes, epithelial cells, neurons retinal rod photoreceptors and smooth muscle cells. Ca2+ is moved into or out of the cytosol depending on Na+ concentration. In humans and rats there are 3 isoforms; NCX1 NCX2 and NCX3." Q#18868 - CGI_10016918 superfamily 207627 515 580 7.60E-17 76.5243 cl02522 Calx-beta superfamily C - Calx-beta domain; Calx-beta domain. Q#18870 - CGI_10016920 superfamily 219458 868 1012 2.38E-46 163.908 cl06529 DRIM superfamily - - "Down-regulated in metastasis; These eukaryotic proteins include DRIM (Down-Regulated In Metastasis), which is differentially expressed in metastatic and non-metastatic human breast carcinoma cells. It is believed to be involved in processing of non-coding RNA." Q#18871 - CGI_10016921 superfamily 243114 3 70 9.65E-09 48.9457 cl02622 Pre-SET superfamily - - Pre-SET motif; This protein motif is a zinc binding motif. It contains 9 conserved cysteines that coordinate three zinc ions. It is thought that this region plays a structural role in stabilising SET domains. Q#18871 - CGI_10016921 superfamily 243091 79 107 5.81E-05 39.2399 cl02566 SET superfamily C - "SET domain; SET domains are protein lysine methyltransferase enzymes. SET domains appear to be protein-protein interaction domains. 
It has been demonstrated that SET domains mediate interactions with a family of proteins that display similarity with dual-specificity phosphatases (dsPTPases). A subset of SET domains have been called PR domains. These domains are divergent in sequence from other SET domains, but also appear to mediate protein-protein interaction. The SET domain consists of two regions known as SET-N and SET-C. SET-C forms an unusual and conserved knot-like structure of probably functional importance. Additionally to SET-N and SET-C, an insert region (SET-I) and flanking regions of high structural variability form part of the overall structure." Q#18872 - CGI_10016922 superfamily 243072 855 980 7.54E-36 133.663 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#18872 - CGI_10016922 superfamily 243072 789 914 5.14E-35 130.967 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#18877 - CGI_10016927 superfamily 247866 16 207 6.54E-26 101.76 cl17312 PhyH superfamily - - "Phytanoyl-CoA dioxygenase (PhyH); This family is made up of several eukaryotic phytanoyl-CoA dioxygenase (PhyH) proteins, ectoine hydroxylases and a number of bacterial deoxygenases. PhyH is a peroxisomal enzyme catalyzing the first step of phytanic acid alpha-oxidation. PhyH deficiency causes Refsum's disease (RD) which is an inherited neurological syndrome biochemically characterized by the accumulation of phytanic acid in plasma and tissues." Q#18878 - CGI_10016928 superfamily 247866 88 223 6.11E-26 106.768 cl17312 PhyH superfamily N - "Phytanoyl-CoA dioxygenase (PhyH); This family is made up of several eukaryotic phytanoyl-CoA dioxygenase (PhyH) proteins, ectoine hydroxylases and a number of bacterial deoxygenases. PhyH is a peroxisomal enzyme catalyzing the first step of phytanic acid alpha-oxidation. PhyH deficiency causes Refsum's disease (RD) which is an inherited neurological syndrome biochemically characterized by the accumulation of phytanic acid in plasma and tissues." Q#18878 - CGI_10016928 superfamily 244539 178 290 0.00626572 38.0716 cl06868 FNR_like superfamily C - "Ferredoxin reductase (FNR), an FAD and NAD(P) binding protein, was initially identified as a chloroplast reductase activity, catalyzing the electron transfer from reduced iron-sulfur protein ferredoxin to NADP+ as the final step in the electron transport mechanism of photosystem I. FNR transfers electrons from reduced ferredoxin to FAD (forming FADH2 via a semiquinone intermediate) and then transfers a hydride ion to convert NADP+ to NADPH. FNR has since been shown to utilize a variety of electron acceptors and donors and has a variety of physiological functions including nitrogen assimilation, dinitrogen fixation, steroid hydroxylation, fatty acid metabolism, oxygenase activity, and methane assimilation in many organisms. 
FNR has an NAD(P)-binding sub-domain of the alpha/beta class and a discrete (usually N-terminal) flavin sub-domain which vary in orientation with respect to the NAD(P) binding domain. The N-terminal moiety may contain a flavin prosthetic group (as in flavoenzymes) or use flavin as a substrate. Because flavins such as FAD can exist in oxidized, semiquinone (one-electron reduced), or fully reduced hydroquinone forms, FNR can interact with one- and two-electron carriers. FNR has a strong preference for NADP(H) vs NAD(H)." Q#18879 - CGI_10016929 superfamily 152695 240 439 7.53E-34 126.756 cl13667 PIP49_C superfamily - - Pancreatitis induced protein 49 C terminal; This protein is found in bacteria and eukaryotes. Proteins in this family are typically between 344 and 431 amino acids in length. This protein has a single completely conserved residue C that may be functionally important. PIP49 is a putative transmembrane protein which is induced to express during pancreatitis. Q#18880 - CGI_10016930 superfamily 244265 98 302 5.83E-113 328.556 cl05973 FAM20_C_like superfamily - - "C-terminal putative kinase domain of FAM20 (family with sequence similarity 20), Drosophila Four-jointed (Fj), and related proteins; Drosophila Fj is a Golgi kinase that phosphorylates Ser or Thr residues within extracellular cadherin domains of a transmembrane receptor Fat and its ligand, Dachsous (Ds). The Fat signaling pathway regulates growth, gene expression, and planar cell polarity (PCP). Defects from mutation in the Drosophila fj gene include loss of the intermediate leg joint, and a PCP defect in the eye. Fjx1, the murine homologue of Fj, has been shown to be involved in both the Fat and Hippo signaling pathways, these two pathways intersect at multiple points. The Hippo pathway is important in organ size control and in cancer. 
FAM20B is a xylose kinase that may regulate the number of glycosaminoglycan chains by phosphorylating the xylose residue in the glycosaminoglycan-protein linkage region of proteoglycans. This domain has homology to a kinase-active site, mutation of three conserved Asp residues at the Drosophila Fj putative active site abolished its ability to phosphorylate Ft and Ds cadherin domains. FAM20A may participate in enamel development and gingival homeostasis, FAM20B in proteoglycan production, and FAM20C in bone development. FAM20C, also called Dentin Matrix Protein 4, is abundant in the dentin matrix, and may participate in the differentiation of mesenchymal precursor cells into functional odontoblast-like cells. Mutations in FAM20C are associated with lethal Osteosclerotic Bone Dysplasia (Raine Syndrome), and mutations in FAM20A with Amelogenesis imperfecta (AI) and Gingival Hyperplasia Syndrome. This model includes the FAM20_C domain family, previously known as DUF1193; FAM20_C appears to be homologous to the catalytic domain of the phosphoinositide 3-kinase (PI3K)-like family." Q#18881 - CGI_10016931 superfamily 202668 454 558 6.81E-25 101.587 cl04110 BK_channel_a superfamily - - Calcium-activated BK potassium channel alpha subunit; Calcium-activated BK potassium channel alpha subunit. Q#18881 - CGI_10016931 superfamily 219619 236 302 1.62E-10 59.1435 cl18518 Ion_trans_2 superfamily - - Ion channel; This family includes the two membrane helix type ion channels found in bacteria. Q#18882 - CGI_10014835 superfamily 248345 123 238 2.17E-34 122.383 cl17791 SAC3_GANP superfamily - - "SAC3/GANP/Nin1/mts3/eIF-3 p25 family; This large family includes diverse proteins involved in large complexes. The alignment contains one highly conserved negatively charged residue and one highly conserved positively charged residue that are probably important for the function of these proteins. 
The family includes the yeast nuclear export factor Sac3, and mammalian GANP/MCM3-associated proteins, which facilitate the nuclear localisation of MCM3, a protein that associates with chromatin in the G1 phase of the cell-cycle. The 26S protease (or 26S proteasome) is responsible for degrading ubiquitin conjugates. It consists of 19S regulatory complexes associated with the ends of 20S proteasomes. The 19S regulatory complex is composed of about 20 different polypeptides and confers ATP-dependence and substrate specificity to the 26S enzyme. The conserved region occurs at the C-terminal of the Nin1-like regulatory subunit. This family includes several eukaryotic translation initiation factor 3 subunit 11 (eIF-3 p25) proteins. Eukaryotic initiation factor 3 (eIF3) is a multisubunit complex that is required for binding of mRNA to 40 S ribosomal subunits, stabilisation of ternary complex binding to 40 S subunits, and dissociation of 40 and 60 S subunits." Q#18883 - CGI_10014836 superfamily 220645 5 326 5.80E-116 344.513 cl10924 DUF2465 superfamily - - Protein of unknown function (DUF2465); FAM98A and B proteins are found from worms to humans but their function is unknown. This entry is of a family of proteins that is rich in glycines. Q#18885 - CGI_10014838 superfamily 246918 462 516 2.51E-05 42.7778 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#18885 - CGI_10014838 superfamily 246918 639 690 0.00124197 37.5663 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#18886 - CGI_10014839 superfamily 245814 112 202 0.00153162 35.1575 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18889 - CGI_10014842 superfamily 248012 22 157 3.07E-27 100.476 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#18893 - CGI_10014846 superfamily 241555 1023 1284 3.07E-92 297.604 cl00020 GAT_1 superfamily - - "Type 1 glutamine amidotransferase (GATase1)-like domain; Type 1 glutamine amidotransferase (GATase1)-like domain. This group contains proteins similar to Class I glutamine amidotransferases, the intracellular PH1704 from Pyrococcus horikoshii, the C-terminal of the large catalase: Escherichia coli HP-II, Sinorhizobium meliloti Rm1021 ThuA, the A4 beta-galactosidase middle domain and peptidase E. The majority of proteins in this group have a reactive Cys found in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow. For Class I glutamine amidotransferases proteins which transfer ammonia from the amide side chain of glutamine to an acceptor substrate, this Cys forms a Cys-His-Glu catalytic triad in the active site. Glutamine amidotransferases activity can be found in a range of biosynthetic enzymes included in this cd: glutamine amidotransferase, formylglycinamide ribonucleotide, GMP synthetase, anthranilate synthase component II, glutamine-dependent carbamoyl phosphate synthase (CPSase), cytidine triphosphate synthetase, gamma-glutamyl hydrolase, imidazole glycerol phosphate synthase and, cobyric acid synthase. 
For Pyrococcus horikoshii PH1704, the Cys of the nucleophile elbow together with a different His and a Glu from an adjacent monomer form a catalytic triad different from the typical GATase1 triad. Peptidase E is believed to be a serine peptidase having a Ser-His-Glu catalytic triad which differs from the Cys-His-Glu catalytic triad of typical GATase1 domains, by having a Ser in place of the reactive Cys at the nucleophile elbow. The E. coli HP-II C-terminal domain, S. meliloti Rm1021 ThuA and the A4 beta-galactosidase middle domain lack the catalytic triad typical of GATase1 domains. GATase1-like domains can occur either as single polypeptides, as in Class I glutamine amidotransferases, or as domains in a much larger multifunctional synthase protein, such as CPSase. Peptidase E has a circular permutation in the common core of a typical GATase1 domain." Q#18893 - CGI_10014846 superfamily 245231 662 937 1.24E-85 280.191 cl10019 PurM-like superfamily - - "AIR (aminoimidazole ribonucleotide) synthase related protein. This family includes Hydrogen expression/formation protein HypE, AIR synthases, FGAM (formylglycinamidine ribonucleotide) synthase and Selenophosphate synthetase (SelD). The N-terminal domain of AIR synthase forms the dimer interface of the protein, and is suggested as a putative ATP binding domain." Q#18893 - CGI_10014846 superfamily 245231 255 553 1.40E-80 267.801 cl10019 PurM-like superfamily - - "AIR (aminoimidazole ribonucleotide) synthase related protein. This family includes Hydrogen expression/formation protein HypE, AIR synthases, FGAM (formylglycinamidine ribonucleotide) synthase and Selenophosphate synthetase (SelD). The N-terminal domain of AIR synthase forms the dimer interface of the protein, and is suggested as a putative ATP binding domain." 
Q#18894 - CGI_10014847 superfamily 241832 384 468 0.0010876 37.9229 cl00388 Thioredoxin_like superfamily N - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#18894 - CGI_10014847 superfamily 247805 131 324 1.34E-53 183.455 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#18894 - CGI_10014847 superfamily 218246 462 615 8.09E-27 106.649 cl12301 OST3_OST6 superfamily - - "OST3 / OST6 family; The proteins in this family are part of a complex of eight ER proteins that transfers core oligosaccharide from dolichol carrier to Asn-X-Ser/Thr motifs. This family includes both OST3 and OST6, each of which contains four predicted transmembrane helices. Disruption of OST3 and OST6 leads to a defect in the assembly of the complex. Hence, the function of these genes seems to be essential for recruiting a fully active complex necessary for efficient N-glycosylation." 
Q#18896 - CGI_10014849 superfamily 219621 30 152 1.13E-19 81.2535 cl06777 Rrp15p superfamily - - Rrp15p; Rrp15p is required for the formation of 60S ribosomal subunits. Q#18897 - CGI_10014850 superfamily 247907 1755 1914 9.57E-17 81.6956 cl17353 LamG superfamily - - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." Q#18897 - CGI_10014850 superfamily 247907 1503 1717 1.39E-15 78.2288 cl17353 LamG superfamily - - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." 
Q#18897 - CGI_10014850 superfamily 238012 722 770 1.11E-14 72.7722 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#18897 - CGI_10014850 superfamily 238012 770 820 2.88E-12 65.8386 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#18897 - CGI_10014850 superfamily 238012 616 668 8.25E-11 61.6014 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#18897 - CGI_10014850 superfamily 241584 1232 1340 1.01E-09 59.4323 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. 
Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#18897 - CGI_10014850 superfamily 241584 4574 4659 1.16E-09 59.4323 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#18897 - CGI_10014850 superfamily 238012 988 1037 2.14E-09 57.7494 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#18897 - CGI_10014850 superfamily 241584 3639 3711 4.17E-09 57.5063 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. 
Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#18897 - CGI_10014850 superfamily 238012 874 921 5.91E-09 56.2086 cl11390 EGF_Lam superfamily C - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#18897 - CGI_10014850 superfamily 241584 4192 4293 1.76E-08 55.9655 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#18897 - CGI_10014850 superfamily 241584 4003 4097 2.07E-08 55.5803 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. 
Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#18897 - CGI_10014850 superfamily 238012 670 720 3.13E-08 54.2826 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#18897 - CGI_10014850 superfamily 238012 945 987 3.14E-08 54.2826 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#18897 - CGI_10014850 superfamily 241584 2670 2766 5.68E-08 54.4247 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. 
Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#18897 - CGI_10014850 superfamily 241584 1045 1126 6.80E-08 54.0395 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#18897 - CGI_10014850 superfamily 238012 821 873 8.07E-08 53.127 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#18897 - CGI_10014850 superfamily 241584 4479 4569 1.95E-07 52.4987 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. 
Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#18897 - CGI_10014850 superfamily 241584 1351 1443 2.52E-07 52.4987 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#18897 - CGI_10014850 superfamily 241584 3066 3145 3.23E-07 52.1135 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#18897 - CGI_10014850 superfamily 241584 2483 2576 9.92E-07 50.5727 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. 
Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#18897 - CGI_10014850 superfamily 241584 2289 2363 1.17E-06 50.1875 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#18897 - CGI_10014850 superfamily 241584 1128 1210 4.59E-06 48.6467 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#18897 - CGI_10014850 superfamily 241584 2771 2856 5.35E-06 48.2615 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. 
Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#18897 - CGI_10014850 superfamily 241584 2106 2190 6.38E-06 47.8763 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#18897 - CGI_10014850 superfamily 241584 3921 3988 1.18E-05 47.1059 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#18897 - CGI_10014850 superfamily 241584 4867 4963 1.39E-05 47.1059 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. 
Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#18897 - CGI_10014850 superfamily 238012 493 541 2.42E-05 45.8082 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#18897 - CGI_10014850 superfamily 241584 4685 4765 2.82E-05 45.9503 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#18897 - CGI_10014850 superfamily 241584 2197 2284 9.31E-05 44.4095 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. 
Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#18897 - CGI_10014850 superfamily 241584 4108 4177 0.00036549 42.4835 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#18897 - CGI_10014850 superfamily 241584 2970 3055 0.000406915 42.4835 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#18897 - CGI_10014850 superfamily 241584 2870 2959 0.000674186 41.7131 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. 
Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#18897 - CGI_10014850 superfamily 241584 3816 3881 0.00165252 40.5575 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#18897 - CGI_10014850 superfamily 238012 550 603 0.00337863 39.2598 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#18897 - CGI_10014850 superfamily 241584 4399 4463 0.00452664 39.0167 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. 
Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#18897 - CGI_10014850 superfamily 241584 2585 2667 0.00577211 38.6315 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#18897 - CGI_10014850 superfamily 243198 298 492 2.71E-21 97.4308 cl02806 Laminin_N superfamily - - Laminin N-terminal (Domain VI); Laminin N-terminal (Domain VI). Q#18897 - CGI_10014850 superfamily 241611 132 272 1.58E-12 69.3396 cl00102 PTX superfamily - - "Pentraxins are plasma proteins characterized by their pentameric discoid assembly and their Ca2+ dependent ligand binding, such as Serum amyloid P component (SAP) and C-reactive Protein (CRP), which are cytokine-inducible acute-phase proteins implicated in innate immunity. CRP binds to ligands containing phosphocholine, SAP binds to amyloid fibrils, DNA, chromatin, fibronectin, C4-binding proteins and glycosaminoglycans. "Long" pentraxins have N-terminal extensions to the common pentraxin domain; one group, the neuronal pentraxins, may be involved in synapse formation and remodeling, and they may also be able to form heteromultimers." 
Q#18898 - CGI_10014851 superfamily 219619 118 193 3.86E-08 48.358 cl18518 Ion_trans_2 superfamily - - Ion channel; This family includes the two membrane helix type ion channels found in bacteria. Q#18898 - CGI_10014851 superfamily 219619 40 78 0.00152402 35.2612 cl18518 Ion_trans_2 superfamily N - Ion channel; This family includes the two membrane helix type ion channels found in bacteria. Q#18899 - CGI_10014852 superfamily 247725 506 652 1.92E-72 240.695 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#18899 - CGI_10014852 superfamily 243096 348 524 2.45E-45 164.008 cl02571 RhoGEF superfamily - - Guanine nucleotide exchange factor for Rho/Rac/Cdc42-like GTPases; Also called Dbl-homologous (DH) domain. It appears that PH domains invariably occur C-terminal to RhoGEF/DH domains. Q#18900 - CGI_10016882 superfamily 248369 104 229 8.39E-05 41.2646 cl17815 Yip1 superfamily - - Yip1 domain; The Yip1 integral membrane domain contains four transmembrane alpha helices. The domain is characterized by the motifs DLYGP and GY. The Yip1 protein is a golgi protein involved in vesicular transport that interacts with GTPases. Q#18901 - CGI_10016883 superfamily 218653 1 194 4.82E-67 206.006 cl05265 DUF775 superfamily - - Protein of unknown function (DUF775); This family consists of several eukaryotic proteins of unknown function. 
Q#18904 - CGI_10016886 superfamily 245814 177 263 2.81E-06 43.8616 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18905 - CGI_10016887 superfamily 241642 22 83 0.000190622 35.5809 cl00152 t_SNARE superfamily - - "Soluble NSF (N-ethylmaleimide-sensitive fusion protein)-Attachment protein (SNAP) REceptor domain; these alpha-helical motifs form twisted and parallel heterotetrameric helix bundles; the core complex contains one helix from a protein that is anchored in the vesicle membrane (synaptobrevin), one helix from a protein of the target membrane (syntaxin), and two helices from another protein anchored in the target membrane (SNAP-25); their interaction forms a core which is composed of a polar zero layer, a flanking leucine-zipper layer acts as a water tight shield to isolate ionic interactions in the zero layer from the surrounding solvent" Q#18909 - CGI_10016891 superfamily 245814 350 420 1.66E-05 42.4763 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. 
A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18909 - CGI_10016891 superfamily 245814 232 312 7.08E-05 40.78 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18910 - CGI_10016892 superfamily 243161 3 49 0.000192268 35.833 cl02739 THAP superfamily C - "THAP domain; The THAP domain is a putative DNA-binding domain (DBD) and probably also binds a zinc ion. It features the conserved C2CH architecture (consensus sequence: Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His). Other universal features include the location of the domain at the N-termini of proteins, its size of about 90 residues, a C-terminal AVPTIF box and several other conserved residues. Orthologues of the human THAP domain have been identified in other vertebrates and probably worms and flies, but not in other eukaryotes or any prokaryotes." Q#18911 - CGI_10016893 superfamily 241564 82 150 3.19E-21 82.6987 cl00035 BIR superfamily - - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. 
This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." Q#18920 - CGI_10011642 superfamily 248019 157 283 2.05E-17 76.9393 cl17465 DAGK_cat superfamily - - "Diacylglycerol kinase catalytic domain; Diacylglycerol (DAG) is a second messenger that acts as a protein kinase C activator. The catalytic domain is assumed from the finding of bacterial homologues. YegS is the Escherichia coli protein in this family whose crystal structure reveals an active site in the inter-domain cleft formed by four conserved sequence motifs, revealing a novel metal-binding site. The residues of this site are conserved across the family." Q#18920 - CGI_10011642 superfamily 248019 16 235 8.86E-10 58.3579 cl17465 DAGK_cat superfamily C - "Diacylglycerol kinase catalytic domain; Diacylglycerol (DAG) is a second messenger that acts as a protein kinase C activator. The catalytic domain is assumed from the finding of bacterial homologues. YegS is the Escherichia coli protein in this family whose crystal structure reveals an active site in the inter-domain cleft formed by four conserved sequence motifs, revealing a novel metal-binding site. The residues of this site are conserved across the family." Q#18921 - CGI_10011643 superfamily 241607 80 112 7.65E-05 38.7902 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chymotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. 
Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters), which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#18921 - CGI_10011643 superfamily 241607 158 184 0.00249545 34.1678 cl00097 KAZAL_FS superfamily C - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chymotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. 
Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters), which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#18922 - CGI_10011644 superfamily 248338 104 381 1.92E-58 200.905 cl17784 Peptidase_C48 superfamily N - "Ulp1 protease family, C-terminal catalytic domain; This domain contains the catalytic triad Cys-His-Asn." Q#18923 - CGI_10011645 superfamily 220695 51 175 0.00112827 39.4843 cl18571 7TM_GPCR_Srx superfamily C - Serpentine type 7TM GPCR chemoreceptor Srx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srx is part of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#18925 - CGI_10011647 superfamily 244882 100 378 4.48E-30 118.518 cl08270 Peptidase_S10 superfamily N - Serine carboxypeptidase; Serine carboxypeptidase. Q#18925 - CGI_10011647 superfamily 244882 30 97 1.26E-12 67.2866 cl08270 Peptidase_S10 superfamily C - Serine carboxypeptidase; Serine carboxypeptidase. Q#18926 - CGI_10011648 superfamily 245201 1318 1552 2.01E-38 145.457 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. 
These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#18926 - CGI_10011648 superfamily 243072 41 132 4.06E-09 56.623 cl02529 ANK superfamily N - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#18926 - CGI_10011648 superfamily 247724 662 814 5.92E-15 75.0651 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. 
Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#18926 - CGI_10011648 superfamily 246925 420 548 0.0020527 41.187 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#18927 - CGI_10011649 superfamily 243034 2042 2162 2.48E-06 48.1452 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p 
co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#18927 - CGI_10011649 superfamily 243034 1862 1969 7.91E-05 43.5228 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#18927 - CGI_10011649 superfamily 243034 1767 1886 8.21E-05 43.5228 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein 
interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#18927 - CGI_10011649 superfamily 243034 2136 2250 0.000255259 41.982 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components 
of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#18929 - CGI_10011651 superfamily 241617 7 81 2.10E-20 82.4205 cl00110 MBD superfamily - - "MeCP2, MBD1, MBD2, MBD3, MBD4, CLLD8-like, and BAZ2A-like proteins constitute a family of proteins that share the methyl-CpG-binding domain (MBD). The MBD consists of about 70 residues and is defined as the minimal region required for binding to methylated DNA by a methyl-CpG-binding protein which binds specifically to methylated DNA. The MBD can recognize a single symmetrically methylated CpG either as naked DNA or within chromatin. MeCP2, MBD1 and MBD2 (and likely MBD3) form complexes with histone deacetylase and are involved in histone deacetylase-dependent repression of transcription. MBD4 is an endonuclease that forms a complex with the DNA mismatch-repair protein MLH1. The MBDs present in putative chromatin remodelling subunit, BAZ2A, and putative histone methyltransferase, CLLD8, represent two phylogenetically distinct groups within the MBD protein family." Q#18929 - CGI_10011651 superfamily 222512 149 244 2.65E-30 109.692 cl16567 MBD_C superfamily - - "C-terminal domain of methyl-CpG binding protein 2 and 3; CpG-methylation is a frequently occurring epigenetic modification of vertebrate genomes resulting in transcriptional repression. This domain was found at the C-terminus of the methyl-CpG-binding domain (MBD) containing proteins MBD2 and MBD3, the latter was shown to not bind directly to methyl-CpG DNA but rather interact with components of the NuRD/Mi2 complex, an abundant deacetylase complex. The domain is subject to structure determination by the Joint Center of Structural Genomics." 
Q#18930 - CGI_10011652 superfamily 241832 364 467 4.22E-51 170.815 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#18930 - CGI_10011652 superfamily 241832 234 344 3.56E-42 146.713 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). 
Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#18930 - CGI_10011652 superfamily 241832 20 120 3.03E-39 138.127 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#18930 - CGI_10011652 superfamily 241832 128 230 8.01E-32 117.429 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. 
The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#18931 - CGI_10001214 superfamily 248097 89 211 1.18E-17 75.7646 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#18932 - CGI_10001215 superfamily 248097 4 43 9.67E-10 50.3414 cl17543 C1q superfamily C - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#18933 - CGI_10001216 superfamily 248097 44 166 5.63E-22 86.5502 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#18945 - CGI_10005071 superfamily 217473 138 361 1.32E-31 124.4 cl03978 Mab-21 superfamily - - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#18947 - CGI_10010344 superfamily 226749 212 549 5.61E-34 131.54 cl18777 COG4299 superfamily - - Uncharacterized protein conserved in bacteria [Function unknown] Q#18949 - CGI_10010346 superfamily 243035 214 328 6.04E-28 105.78 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. 
Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. 
A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#18949 - CGI_10010346 superfamily 243035 16 106 7.06E-20 83.0529 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. 
C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#18949 - CGI_10010346 superfamily 243035 115 204 1.81E-16 73.8081 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#18952 - CGI_10010349 superfamily 241749 23 167 9.50E-28 102.464 cl00280 globin_like superfamily - - superfamily containing globins and truncated hemoglobins Q#18955 - CGI_10008023 superfamily 247905 59 180 2.90E-17 80.7448 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#18956 - CGI_10008024 superfamily 245819 644 818 4.20E-65 216.676 cl11967 Nucleotidyl_cyc_III superfamily - - "Class III nucleotidyl cyclases; Class III nucleotidyl cyclases are the largest, most diverse group of nucleotidyl cyclases (NC's) containing prokaryotic and eukaryotic proteins. They can be divided into two major groups; the mononucleotidyl cyclases (MNC's) and the diguanylate cyclases (DGC's). The MNC's, which include the adenylate cyclases (AC's) and the guanylate cyclases (GC's), have a conserved cyclase homology domain (CHD), while the DGC's have a conserved GGDEF domain, named after a conserved motif within this subgroup. Their products, cyclic guanylyl and adenylyl nucleotides, are second messengers that play important roles in eukaryotic signal transduction and prokaryotic sensory pathways." 
Q#18956 - CGI_10008024 superfamily 245201 324 501 5.12E-38 142.376 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#18956 - CGI_10008024 superfamily 219526 588 629 9.65E-09 55.3179 cl06648 HNOBA superfamily N - "Heme NO binding associated; The HNOBA domain is found associated with the HNOB domain and pfam00211 in soluble cyclases and signalling proteins. The HNOB domain is predicted to function as a heme-dependent sensor for gaseous ligands, and transduce diverse downstream signals, in both bacteria and animals." Q#18957 - CGI_10006053 superfamily 247745 37 379 0 568.82 cl17191 GH38-57_N_LamB_YdjC_SF superfamily - - "Catalytic domain of glycoside hydrolase (GH) families 38 and 57, lactam utilization protein LamB/YcsF family proteins, YdjC-family proteins, and similar proteins; The superfamily possesses strong sequence similarities across a wide range of all three kingdoms of life. It mainly includes four families, glycoside hydrolases family 38 (GH38), heat stable retaining glycoside hydrolases family 57 (GH57), lactam utilization protein LamB/YcsF family, and YdjC-family. The GH38 family corresponds to class II alpha-mannosidases (alphaMII, EC 3.2.1.24), which contain intermediate Golgi alpha-mannosidases II, acidic lysosomal alpha-mannosidases, animal sperm and epididymal alpha -mannosidases, neutral ER/cytosolic alpha-mannosidases, and some putative prokaryotic alpha-mannosidases. 
AlphaMII possess a-1,3, a-1,6, and a-1,2 hydrolytic activity, and catalyzes the degradation of N-linked oligosaccharides by employing a two-step mechanism involving the formation of a covalent glycosyl enzyme complex. GH57 is a purely prokaryotic family with the majority of thermostable enzymes from extremophiles (many of them are archaeal hyperthermophiles), which exhibit the enzyme specificities of alpha-amylase (EC 3.2.1.1), 4-alpha-glucanotransferase (EC 2.4.1.25), amylopullulanase (EC 3.2.1.1/41), and alpha-galactosidase (EC 3.2.1.22). This family also includes many hypothetical proteins with uncharacterized activity and specificity. GH57 cleaves alpha-glycosidic bond by employing a retaining mechanism, which involves a glycosyl-enzyme intermediate, allowing transglycosylation. Although the exact molecular function of LamB/YcsF family and YdjC-family remains unclear, they show high sequence and structure homology to the members of GH38 and GH57. Their catalytic domains adopt a similar parallel 7-stranded beta/alpha barrel, which is remotely related to catalytic NodB homology domain of the carbohydrate esterase 4 superfamily." Q#18957 - CGI_10006053 superfamily 245003 373 458 2.25E-26 101.89 cl08536 Alpha-mann_mid superfamily - - "Alpha mannosidase, middle domain; Members of this family adopt a structure consisting of three alpha helices, in an immunoglobulin/albumin-binding domain-like fold. They are predominantly found in the enzyme alpha-mannosidase." Q#18959 - CGI_10006055 superfamily 241691 1 53 0.00150363 35.1804 cl00213 DNA_BRE_C superfamily N - "DNA breaking-rejoining enzymes, C-terminal catalytic domain. The DNA breaking-rejoining enzyme superfamily includes type IB topoisomerases and tyrosine recombinases that share the same fold in their catalytic domain containing six conserved active site residues. 
The best-studied members of this diverse superfamily include human topoisomerase I, the bacteriophage lambda integrase, the bacteriophage P1 Cre recombinase, the yeast Flp recombinase and the bacterial XerD/C recombinases. Their overall reaction mechanism is essentially identical and involves cleavage of a single strand of a DNA duplex by nucleophilic attack of a conserved tyrosine to give a 3' phosphotyrosyl protein-DNA adduct. In the second rejoining step, a terminal 5' hydroxyl attacks the covalent adduct to release the enzyme and generate duplex DNA. The enzymes differ in that topoisomerases cleave and then rejoin the same 5' and 3' termini, whereas a site-specific recombinase transfers a 5' hydroxyl generated by recombinase cleavage to a new 3' phosphate partner located in a different duplex region. Many DNA breaking-rejoining enzymes also have N-terminal domains, which show little sequence or structure similarity." Q#18961 - CGI_10006057 superfamily 245819 343 498 6.08E-50 175.845 cl11967 Nucleotidyl_cyc_III superfamily - - "Class III nucleotidyl cyclases; Class III nucleotidyl cyclases are the largest, most diverse group of nucleotidyl cyclases (NC's) containing prokaryotic and eukaryotic proteins. They can be divided into two major groups; the mononucleotidyl cyclases (MNC's) and the diguanylate cyclases (DGC's). The MNC's, which include the adenylate cyclases (AC's) and the guanylate cyclases (GC's), have a conserved cyclase homology domain (CHD), while the DGC's have a conserved GGDEF domain, named after a conserved motif within this subgroup. Their products, cyclic guanylyl and adenylyl nucleotides, are second messengers that play important roles in eukaryotic signal transduction and prokaryotic sensory pathways." 
Q#18961 - CGI_10006057 superfamily 245819 924 1108 7.70E-44 158.511 cl11967 Nucleotidyl_cyc_III superfamily - - "Class III nucleotidyl cyclases; Class III nucleotidyl cyclases are the largest, most diverse group of nucleotidyl cyclases (NC's) containing prokaryotic and eukaryotic proteins. They can be divided into two major groups; the mononucleotidyl cyclases (MNC's) and the diguanylate cyclases (DGC's). The MNC's, which include the adenylate cyclases (AC's) and the guanylate cyclases (GC's), have a conserved cyclase homology domain (CHD), while the DGC's have a conserved GGDEF domain, named after a conserved motif within this subgroup. Their products, cyclic guanylyl and adenylyl nucleotides, are second messengers that play important roles in eukaryotic signal transduction and prokaryotic sensory pathways." Q#18961 - CGI_10006057 superfamily 218992 545 627 5.94E-18 81.3068 cl05691 DUF1053 superfamily - - Domain of Unknown Function (DUF1053); This domain is found in Adenylate cyclases. Q#18962 - CGI_10002585 superfamily 247068 693 761 2.26E-07 49.6194 cl15786 CA_like superfamily C - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." 
Q#18962 - CGI_10002585 superfamily 247068 513 575 5.49E-05 42.3006 cl15786 CA_like superfamily N - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#18962 - CGI_10002585 superfamily 247068 603 676 0.000328422 40.0255 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." 
Q#18963 - CGI_10002586 superfamily 243092 10 330 4.47E-69 219.899 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#18964 - CGI_10002587 superfamily 246925 55 196 0.00052423 39.261 cl15309 LRR_RI superfamily NC - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." 
Q#18968 - CGI_10005226 superfamily 241886 5 236 9.66E-57 185.07 cl00470 Aldo_ket_red superfamily N - "Aldo-keto reductases (AKRs) are a superfamily of soluble NAD(P)(H) oxidoreductases whose chief purpose is to reduce aldehydes and ketones to primary and secondary alcohols. AKRs are present in all phyla and are of importance to both health and industrial applications. Members have very distinct functions and include the prokaryotic 2,5-diketo-D-gluconic acid reductases and beta-keto ester reductases, the eukaryotic aldose reductases, aldehyde reductases, hydroxysteroid dehydrogenases, steroid 5beta-reductases, potassium channel beta-subunits and aflatoxin aldehyde reductases, among others." Q#18969 - CGI_10005227 superfamily 247805 36 217 1.96E-12 63.1252 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#18970 - CGI_10005228 superfamily 247905 223 298 3.45E-12 66.1072 cl17351 HELICc superfamily N - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#18971 - CGI_10005229 superfamily 241564 54 123 1.90E-22 85.0099 cl00035 BIR superfamily - - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. 
In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." Q#18973 - CGI_10005231 superfamily 243161 7 91 3.25E-09 49.3558 cl02739 THAP superfamily - - "THAP domain; The THAP domain is a putative DNA-binding domain (DBD) and probably also binds a zinc ion. It features the conserved C2CH architecture (consensus sequence: Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His). Other universal features include the location of the domain at the N-termini of proteins, its size of about 90 residues, a C-terminal AVPTIF box and several other conserved residues. Orthologues of the human THAP domain have been identified in other vertebrates and probably worms and flies, but not in other eukaryotes or any prokaryotes." Q#18974 - CGI_10005232 superfamily 248281 50 82 0.000235549 37.2595 cl17727 GT1 superfamily C - "GT1, myb-like, SANT family; GT-1, a myb-like protein, is one of the GT trihelix transcription factors. GT-1 binds the GT cis-element of rbcS-3A, a light-induced gene, as a dimer. Arabidopsis GT-1 is a trans-activator and acts in the stabilization of components of the transcription pre-initiation complex comprised of TFIIA-TBP-TATA. The isolated GT-1 DNA-binding domain is sufficient to bind DNA. This region closely resembles the myb domain, but with longer helices. It has been proposed that GT-1 may respond to light signals via calcium-dependent phosphorylation to create a light-modulated molecular switch. These proteins are members of the SANT/myb group. SANT is named after 'SWI3, ADA2, N-CoR and TFIIIB', several factors that share this domain. The SANT domain resembles the 3 alpha-helix bundle of the DNA-binding Myb domains and is found in a diverse set of proteins." 
Q#18976 - CGI_10005234 superfamily 246975 10 33 0.000430665 37.6175 cl15478 zf-C2H2 superfamily - - "Zinc finger, C2H2 type; The C2H2 zinc finger is the classical zinc finger domain. The two conserved cysteines and histidines co-ordinate a zinc ion. The following pattern describes the zinc finger. #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C] Where X can be any amino acid, and numbers in brackets indicate the number of residues. The positions marked # are those that are important for the stable fold of the zinc finger. The final position can be either his or cys. The C2H2 zinc finger is composed of two short beta strands followed by an alpha helix. The amino terminal part of the helix binds the major groove in DNA binding zinc fingers. The accepted consensus binding sequence for Sp1 is usually defined by the asymmetric hexanucleotide core GGGCGG but this sequence does not include, among others, the GAG (=CTC) repeat that constitutes a high-affinity site for Sp1 binding to the wt1 promoter." Q#18978 - CGI_10002724 superfamily 220701 181 354 2.08E-06 48.2935 cl18572 DUF2424 superfamily N - Protein of unknown function (DUF2424); This is a family of proteins conserved in yeasts. The function is not known. Q#18979 - CGI_10002725 superfamily 248312 2 113 0.00397486 35.0217 cl17758 PMP22_Claudin superfamily N - PMP-22/EMP/MP20/Claudin family; PMP-22/EMP/MP20/Claudin family. Q#18980 - CGI_10002726 superfamily 192780 80 155 5.52E-18 76.528 cl13103 Med28 superfamily C - "Mediator complex subunit 28; Mediator is a large complex of up to 33 proteins that is conserved from plants to fungi to humans - the number and representation of individual subunits varying with species. It is arranged into four different sections, a core, a head, a tail and a kinase-activity part, and the number of subunits within each of these is what varies with species. 
Overall, Mediator regulates the transcriptional activity of RNA polymerase II but it would appear that each of the four different sections has a slightly different function. Subunit Med28 of the Mediator may function as a scaffolding protein within Mediator by maintaining the stability of a submodule within the head module, and components of this submodule act together in a gene-regulatory programme to suppress smooth muscle cell differentiation. Thus, mammalian Mediator subunit Med28 functions as a repressor of smooth muscle-cell differentiation, which could have implications for disorders associated with abnormalities in smooth muscle cell growth and differentiation, including atherosclerosis, asthma, hypertension, and smooth muscle tumours." Q#18984 - CGI_10007721 superfamily 222429 80 122 0.000961386 36.4497 cl18676 Myb_DNA-bind_5 superfamily N - Myb/SANT-like DNA-binding domain; This presumed domain appears to be related to other Myb/SANT like DNA binding domains. This family is greatly expanded in arthropods and higher eukaryotes. Q#18985 - CGI_10007722 superfamily 189857 1 69 5.88E-16 67.6602 cl07832 Caveolin superfamily N - "Caveolin; All three known Caveolin forms have the FEDVIAEP caveolin 'signature motif' within their hydrophilic N-terminal domain. Caveolin 2 (Cav-2) is co-localised and co-expressed with Cav-1/VIP21, forms heterodimers with it and needs Cav-1 for proper membrane localisation. Cav-3 has greater protein sequence similarity to Cav-1 than to Cav-2. Cellular processes caveolins are involved in include vesicular transport, cholesterol homeostasis, signal transduction, and tumour suppression." 
Q#18986 - CGI_10007723 superfamily 241577 124 288 5.34E-81 244.731 cl00056 MH2 superfamily - - "C-terminal Mad Homology 2 (MH2) domain; The MH2 domain is found in the SMAD (small mothers against decapentaplegic) family of proteins and is responsible for type I receptor interactions, phosphorylation-triggered homo- and hetero-oligomerization, and transactivation. It is negatively regulated by the N-terminal MH1 domain which prevents it from forming a complex with SMAD4. The MH2 domain is multifunctional and provides SMADs with their specificity and selectivity, as well as transcriptional activity. Several transcriptional co-activators and repressors have also been reported to regulate SMAD signaling by interacting with the MH2 domain. Mutations in the MH2 domains of SMAD2 and especially SMAD4 have been detected in colorectal and other human cancers." Q#18987 - CGI_10007724 superfamily 219574 434 592 1.31E-24 101.974 cl06698 DC_STAMP superfamily - - "DC-STAMP-like protein; This is a family of sequences which are similar to a region of the dendritic cell-specific transmembrane protein (DC-STAMP). This is thought to be a novel receptor protein that shares no identity with other multimembrane-spanning proteins. It is thought to have seven putative transmembrane regions, two of which are found in the region featured in this family. DC-STAMP is also described as having potential N-linked glycosylation sites and a potential phosphorylation site for PKC, but these are not conserved throughout the family." Q#18988 - CGI_10007725 superfamily 245814 410 474 2.26E-10 57.1139 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#18989 - CGI_10007726 superfamily 245596 77 435 6.63E-162 463.339 cl11394 Glyco_tranf_GTA_type superfamily - - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#18992 - CGI_10002944 superfamily 222429 3 78 7.20E-20 78.8216 cl18676 Myb_DNA-bind_5 superfamily - - Myb/SANT-like DNA-binding domain; This presumed domain appears to be related to other Myb/SANT like DNA binding domains. This family is greatly expanded in arthropods and higher eukaryotes. 
Q#18993 - CGI_10012790 superfamily 247792 13 65 6.54E-06 42.4328 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#18995 - CGI_10012792 superfamily 241578 23 201 8.43E-12 61.813 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#18997 - CGI_10012794 superfamily 220215 56 87 0.00101976 34.123 cl09630 FERM_N superfamily N - FERM N-terminal domain; This domain is the N-terminal ubiquitin-like structural domain of the FERM domain. 
Q#18999 - CGI_10012796 superfamily 241571 479 584 3.90E-13 68.593 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#18999 - CGI_10012796 superfamily 245213 812 845 3.18E-08 52.2538 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#18999 - CGI_10012796 superfamily 245213 928 970 0.00124319 38.7718 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#18999 - CGI_10012796 superfamily 245213 891 924 0.00442894 37.231 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#18999 - CGI_10012796 superfamily 219525 1285 1332 3.22E-06 46.6433 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#18999 - CGI_10012796 superfamily 219525 57 96 1.09E-05 45.1026 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#19001 - CGI_10012798 superfamily 245814 31 113 4.22E-12 58.2856 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#19002 - CGI_10012799 superfamily 241862 46 279 9.07E-13 66.6409 cl00437 COG0428 superfamily - - Predicted divalent heavy-metal cations transporter [Inorganic ion transport and metabolism] Q#19004 - CGI_10002773 superfamily 242274 6 165 8.01E-06 43.555 cl01053 SGNH_hydrolase superfamily - - "SGNH_hydrolase, or GDSL_hydrolase, is a diverse family of lipases and esterases. 
The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the typical Ser-His-Asp(Glu) triad from other serine hydrolases, but may lack the carboxylic acid." Q#19005 - CGI_10008388 superfamily 245596 310 631 4.40E-26 108.955 cl11394 Glyco_tranf_GTA_type superfamily - - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT-13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#19006 - CGI_10008389 superfamily 241563 537 573 5.16E-06 45.1628 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction."
Q#19006 - CGI_10008389 superfamily 241563 67 98 0.000118303 40.9256 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#19007 - CGI_10008390 superfamily 241758 14 63 1.22E-14 63.543 cl00292 AANH_like superfamily N - "Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases, ATP sulphurylases, Universal Stress Response protein and electron transfer flavoprotein (ETF). The domain forms an alpha/beta/alpha fold which binds to adenosine nucleotide." Q#19009 - CGI_10008392 superfamily 245815 64 516 0 795.099 cl11961 ALDH-SF superfamily - - "NAD(P)+-dependent aldehyde dehydrogenase superfamily; The aldehyde dehydrogenase superfamily (ALDH-SF) of NAD(P)+-dependent enzymes, in general, oxidize a wide range of endogenous and exogenous aliphatic and aromatic aldehydes to their corresponding carboxylic acids and play an important role in detoxification. Besides aldehyde detoxification, many ALDH isozymes possess multiple additional catalytic and non-catalytic functions such as participating in metabolic pathways, or as binding proteins, or osmoregulants, to mention a few. The enzyme has three domains, a NAD(P)+ cofactor-binding domain, a catalytic domain, and a bridging domain; and the active enzyme is generally either homodimeric or homotetrameric. The catalytic mechanism is proposed to involve cofactor binding, resulting in a conformational change and activation of an invariant catalytic cysteine nucleophile. The cysteine and aldehyde substrate form an oxyanion thiohemiacetal intermediate resulting in hydride transfer to the cofactor and formation of a thioacylenzyme intermediate. Hydrolysis of the thioacylenzyme and release of the carboxylic acid product occurs, and in most cases, the reduced cofactor dissociates from the enzyme.
The evolutionary phylogenetic tree of ALDHs appears to have an initial bifurcation between what has been characterized as the classical aldehyde dehydrogenases, the ALDH family (ALDH) and extended family members or aldehyde dehydrogenase-like (ALDH-L) proteins. The ALDH proteins are represented by enzymes which share a number of highly conserved residues necessary for catalysis and cofactor binding and they include such proteins as retinal dehydrogenase, 10-formyltetrahydrofolate dehydrogenase, non-phosphorylating glyceraldehyde 3-phosphate dehydrogenase, delta(1)-pyrroline-5-carboxylate dehydrogenases, alpha-ketoglutaric semialdehyde dehydrogenase, alpha-aminoadipic semialdehyde dehydrogenase, coniferyl aldehyde dehydrogenase and succinate-semialdehyde dehydrogenase. Included in this larger group are all human, Arabidopsis, Tortula, fungal, protozoan, and Drosophila ALDHs identified in families ALDH1 through ALDH22 with the exception of families ALDH18, ALDH19, and ALDH20 which are present in the ALDH-like group. The ALDH-like group is represented by such proteins as gamma-glutamyl phosphate reductase, LuxC-like acyl-CoA reductase, and coenzyme A acylating aldehyde dehydrogenase. All of these proteins have a conserved cysteine that aligns with the catalytic cysteine of the ALDH group." Q#19009 - CGI_10008392 superfamily 245815 561 699 7.36E-78 258.131 cl11961 ALDH-SF superfamily N - "NAD(P)+-dependent aldehyde dehydrogenase superfamily; The aldehyde dehydrogenase superfamily (ALDH-SF) of NAD(P)+-dependent enzymes, in general, oxidize a wide range of endogenous and exogenous aliphatic and aromatic aldehydes to their corresponding carboxylic acids and play an important role in detoxification. Besides aldehyde detoxification, many ALDH isozymes possess multiple additional catalytic and non-catalytic functions such as participating in metabolic pathways, or as binding proteins, or osmoregulants, to mention a few. 
The enzyme has three domains, a NAD(P)+ cofactor-binding domain, a catalytic domain, and a bridging domain; and the active enzyme is generally either homodimeric or homotetrameric. The catalytic mechanism is proposed to involve cofactor binding, resulting in a conformational change and activation of an invariant catalytic cysteine nucleophile. The cysteine and aldehyde substrate form an oxyanion thiohemiacetal intermediate resulting in hydride transfer to the cofactor and formation of a thioacylenzyme intermediate. Hydrolysis of the thioacylenzyme and release of the carboxylic acid product occurs, and in most cases, the reduced cofactor dissociates from the enzyme. The evolutionary phylogenetic tree of ALDHs appears to have an initial bifurcation between what has been characterized as the classical aldehyde dehydrogenases, the ALDH family (ALDH) and extended family members or aldehyde dehydrogenase-like (ALDH-L) proteins. The ALDH proteins are represented by enzymes which share a number of highly conserved residues necessary for catalysis and cofactor binding and they include such proteins as retinal dehydrogenase, 10-formyltetrahydrofolate dehydrogenase, non-phosphorylating glyceraldehyde 3-phosphate dehydrogenase, delta(1)-pyrroline-5-carboxylate dehydrogenases, alpha-ketoglutaric semialdehyde dehydrogenase, alpha-aminoadipic semialdehyde dehydrogenase, coniferyl aldehyde dehydrogenase and succinate-semialdehyde dehydrogenase. Included in this larger group are all human, Arabidopsis, Tortula, fungal, protozoan, and Drosophila ALDHs identified in families ALDH1 through ALDH22 with the exception of families ALDH18, ALDH19, and ALDH20 which are present in the ALDH-like group. The ALDH-like group is represented by such proteins as gamma-glutamyl phosphate reductase, LuxC-like acyl-CoA reductase, and coenzyme A acylating aldehyde dehydrogenase. All of these proteins have a conserved cysteine that aligns with the catalytic cysteine of the ALDH group." 
Q#19010 - CGI_10008393 superfamily 247068 979 1077 7.84E-27 107.399 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#19010 - CGI_10008393 superfamily 247068 668 765 1.42E-24 100.851 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#19010 - CGI_10008393 superfamily 247068 1087 1188 5.48E-22 93.5321 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#19010 - CGI_10008393 superfamily 247068 876 970 3.75E-21 91.2209 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#19010 - CGI_10008393 superfamily 247068 573 659 1.39E-20 89.2949 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#19010 - CGI_10008393 superfamily 247068 778 855 1.70E-16 77.7389 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#19010 - CGI_10008393 superfamily 247068 373 451 5.36E-12 64.2569 cl15786 CA_like superfamily C - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#19010 - CGI_10008393 superfamily 247068 205 267 1.26E-10 60.4049 cl15786 CA_like superfamily N - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#19010 - CGI_10008393 superfamily 247068 90 186 2.31E-09 56.553 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#19010 - CGI_10008393 superfamily 247068 1210 1283 7.86E-09 55.0122 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#19010 - CGI_10008393 superfamily 247068 279 364 8.83E-08 51.9306 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#19010 - CGI_10008393 superfamily 247068 487 562 3.42E-05 43.8414 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#19011 - CGI_10008394 superfamily 243161 3 59 1.66E-11 58.1745 cl02739 THAP superfamily C - "THAP domain; The THAP domain is a putative DNA-binding domain (DBD) and probably also binds a zinc ion. It features the conserved C2CH architecture (consensus sequence: Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His). Other universal features include the location of the domain at the N-termini of proteins, its size of about 90 residues, a C-terminal AVPTIF box and several other conserved residues. Orthologues of the human THAP domain have been identified in other vertebrates and probably worms and flies, but not in other eukaryotes or any prokaryotes." Q#19016 - CGI_10003097 superfamily 241546 87 211 2.35E-28 111.984 cl00011 PLAT superfamily - - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats. The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#19017 - CGI_10017962 superfamily 245201 156 252 2.03E-15 71.4989 cl09925 PKc_like superfamily C - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins."
Q#19018 - CGI_10017963 superfamily 245201 5 98 3.90E-16 76.7682 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#19019 - CGI_10017964 superfamily 245220 134 193 5.67E-19 78.2154 cl09957 zf-UBP superfamily - - Zn-finger in ubiquitin-hydrolases and other protein; Zn-finger in ubiquitin-hydrolases and other protein. Q#19021 - CGI_10017966 superfamily 247675 109 465 2.96E-171 502.996 cl17011 Arginase_HDAC superfamily - - "Arginase-like and histone-like hydrolases; Arginase-like/histone-like hydrolase superfamily includes metal-dependent enzymes that belong to Arginase-like amidino hydrolase family and histone/histone-like deacetylase class I, II, IV family, respectively. These enzymes catalyze hydrolysis of amide bond. Arginases are known to be involved in control of cellular levels of arginine and ornithine, in histidine and arginine degradation and in clavulanic acid biosynthesis. Deacetylases play a role in signal transduction through histone and/or other protein modification and can repress/activate transcription of a number of different genes. They participate in different cellular processes including cell cycle regulation, DNA damage response, embryonic development, cytokine signaling important for immune response and post-translational control of the acetyl coenzyme A synthetase. Mammalian histone deacetylases are known to be involved in progression of different tumors. Specific inhibitors of mammalian histone deacetylases are an emerging class of promising novel anticancer drugs."
Q#19021 - CGI_10017966 superfamily 247675 547 853 4.04E-133 404.798 cl17011 Arginase_HDAC superfamily - - "Arginase-like and histone-like hydrolases; Arginase-like/histone-like hydrolase superfamily includes metal-dependent enzymes that belong to Arginase-like amidino hydrolase family and histone/histone-like deacetylase class I, II, IV family, respectively. These enzymes catalyze hydrolysis of amide bond. Arginases are known to be involved in control of cellular levels of arginine and ornithine, in histidine and arginine degradation and in clavulanic acid biosynthesis. Deacetylases play a role in signal transduction through histone and/or other protein modification and can repress/activate transcription of a number of different genes. They participate in different cellular processes including cell cycle regulation, DNA damage response, embryonic development, cytokine signaling important for immune response and post-translational control of the acetyl coenzyme A synthetase. Mammalian histone deacetylases are known to be involved in progression of different tumors. Specific inhibitors of mammalian histone deacetylases are an emerging class of promising novel anticancer drugs." Q#19023 - CGI_10017968 superfamily 243082 1013 1185 1.04E-56 197.51 cl02553 Peptidase_C19 superfamily N - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome."
Q#19023 - CGI_10017968 superfamily 241659 196 291 8.93E-15 71.8491 cl00175 alpha-crystallin-Hsps_p23-like superfamily - - "alpha-crystallin domain (ACD) found in alpha-crystallin-type small heat shock proteins, and a similar domain found in p23 (a cochaperone for Hsp90) and in other p23-like proteins.; The alpha-crystallin-Hsps_p23-like superfamily includes the alpha-crystallin domain (ACD) of alpha-crystallin-type small heat shock proteins (sHsps) and a similar domain found in p23-like proteins. sHsps are small stress induced proteins with monomeric masses between 12-43 kDa, whose common feature is this ACD. sHsps are generally active as large oligomers consisting of multiple subunits, and are believed to be ATP-independent chaperones that prevent aggregation and are important in refolding in combination with other Hsps. p23 is a cochaperone of the Hsp90 chaperoning pathway. It binds Hsp90 and participates in the folding of a number of Hsp90 clients including the progesterone receptor. p23 also has a passive chaperoning activity. p23 in addition may act as the cytosolic prostaglandin E2 synthase. Included in this superfamily is the p23-like C-terminal CHORD-SGT1 (CS) domain of suppressor of G2 allele of Skp1 (Sgt1) and the p23-like domains of human butyrate-induced transcript 1 (hB-ind1), NUD (nuclear distribution) C, Melusin, and NAD(P)H cytochrome b5 (NCB5) oxidoreductase (OR)." Q#19023 - CGI_10017968 superfamily 243082 394 577 1.31E-23 103.221 cl02553 Peptidase_C19 superfamily C - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin.
The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." Q#19023 - CGI_10017968 superfamily 241659 39 97 0.00453241 36.881 cl00175 alpha-crystallin-Hsps_p23-like superfamily C - "alpha-crystallin domain (ACD) found in alpha-crystallin-type small heat shock proteins, and a similar domain found in p23 (a cochaperone for Hsp90) and in other p23-like proteins.; The alpha-crystallin-Hsps_p23-like superfamily includes the alpha-crystallin domain (ACD) of alpha-crystallin-type small heat shock proteins (sHsps) and a similar domain found in p23-like proteins. sHsps are small stress induced proteins with monomeric masses between 12-43 kDa, whose common feature is this ACD. sHsps are generally active as large oligomers consisting of multiple subunits, and are believed to be ATP-independent chaperones that prevent aggregation and are important in refolding in combination with other Hsps. p23 is a cochaperone of the Hsp90 chaperoning pathway. It binds Hsp90 and participates in the folding of a number of Hsp90 clients including the progesterone receptor. p23 also has a passive chaperoning activity. p23 in addition may act as the cytosolic prostaglandin E2 synthase. Included in this superfamily is the p23-like C-terminal CHORD-SGT1 (CS) domain of suppressor of G2 allele of Skp1 (Sgt1) and the p23-like domains of human butyrate-induced transcript 1 (hB-ind1), NUD (nuclear distribution) C, Melusin, and NAD(P)H cytochrome b5 (NCB5) oxidoreductase (OR)." Q#19028 - CGI_10017973 superfamily 247724 43 225 4.28E-49 166.557 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. 
The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#19030 - CGI_10017975 superfamily 245847 150 264 0.0016836 36.7658 cl12042 FA58C superfamily N - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." 
Q#19032 - CGI_10017977 superfamily 243034 89 158 6.28E-09 53.9232 cl02429 TPR superfamily N - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#19033 - CGI_10017978 superfamily 242406 3 120 1.60E-20 83.7949 cl01271 DUF1768 superfamily - - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate.
Q#19035 - CGI_10017980 superfamily 245595 95 388 3.11E-131 381.484 cl11393 Peptidase_M14_like superfamily - - "M14 family of metallocarboxypeptidases and related proteins; The M14 family of metallocarboxypeptidases (MCPs), also known as funnelins, are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Two major subfamilies of the M14 family, defined based on sequence and structural homology, are the A/B and N/E subfamilies. Enzymes belonging to the A/B subfamily are normally synthesized as inactive precursors containing a preceding signal peptide, followed by an N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases. The A/B enzymes can be further divided based on their substrate specificity; Carboxypeptidase A-like (CPA-like) enzymes favor hydrophobic residues while carboxypeptidase B-like (CPB-like) enzymes only cleave the basic residues lysine or arginine. The A forms have slightly different specificities, with Carboxypeptidase A1 (CPA1) preferring aliphatic and small aromatic residues, and CPA2 preferring the bulky aromatic side chains. Enzymes belonging to the N/E subfamily are not produced as inactive precursors and instead rely on their substrate specificity and subcellular compartmentalization to prevent inappropriate cleavage. They contain an extra C-terminal transthyretin-like domain, thought to be involved in folding or formation of oligomers.
MCPs can also be classified based on their involvement in specific physiological processes; the pancreatic MCPs participate only in alimentary digestion and include carboxypeptidase A and B (A/B subfamily), while others, namely regulatory MCPs or the N/E subfamily, are involved in more selective reactions, mainly in non-digestive tissues and fluids, acting on blood coagulation/fibrinolysis, inflammation and local anaphylaxis, pro-hormone and neuropeptide processing, cellular response and others. Another MCP subfamily is that of succinylglutamate desuccinylase/aspartoacylase, which hydrolyzes N-acetyl-L-aspartate (NAA), and deficiency in which is the established cause of Canavan disease. Another subfamily (referred to as subfamily C) includes an exceptional type of activity in the MCP family, that of dipeptidyl-peptidase activity of gamma-glutamyl-(L)-meso-diaminopimelate peptidase I which is involved in bacterial cell wall metabolism." Q#19035 - CGI_10017980 superfamily 216944 30 79 4.40E-09 52.9699 cl03496 Propep_M14 superfamily N - "Carboxypeptidase activation peptide; Carboxypeptidases are found in abundance in pancreatic secretions. The pro-segment moiety (activation peptide) accounts for up to a quarter of the total length of the peptidase, and is responsible for modulation of folding and activity of the pro-enzyme." Q#19037 - CGI_10017982 superfamily 247097 290 325 0.00325269 35.5046 cl15839 ShK superfamily - - ShK domain-like; This domain is found in several C. elegans proteins. The domain is 30 amino acids long and rich in cysteine residues. There are 6 conserved cysteine positions in the domain that form three disulphide bridges. The domain is found in the potassium channel inhibitor ShK in sea anemone. Q#19037 - CGI_10017982 superfamily 247097 371 407 0.00901983 34.349 cl15839 ShK superfamily - - ShK domain-like; This domain is found in several C. elegans proteins. The domain is 30 amino acids long and rich in cysteine residues.
There are 6 conserved cysteine positions in the domain that form three disulphide bridges. The domain is found in the potassium channel inhibitor ShK in sea anemone. Q#19038 - CGI_10017983 superfamily 215827 8 172 2.07E-30 119.11 cl02830 Tyrosinase superfamily - - Common central domain of tyrosinase; This family also contains polyphenol oxidases and some hemocyanins. Binds two copper ions via two sets of three histidines. This family is related to pfam00372. Q#19040 - CGI_10017985 superfamily 241613 115 148 1.21E-10 53.7498 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#19041 - CGI_10017986 superfamily 241613 340 370 1.64E-08 50.6682 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 
component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#19041 - CGI_10017986 superfamily 241613 163 194 8.39E-05 39.8826 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#19041 - CGI_10017986 superfamily 241571 258 331 0.00129762 37.0067 cl00049 CUB superfamily N - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#19043 - CGI_10017988 superfamily 241619 48 101 3.52E-05 39.4844 cl00112 PAN_APPLE superfamily N - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." 
Q#19044 - CGI_10017990 superfamily 241564 78 138 7.64E-19 75.7651 cl00035 BIR superfamily - - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." Q#19053 - CGI_10008147 superfamily 216363 2 57 2.35E-06 40.5314 cl08312 UPF0029 superfamily N - Uncharacterized protein family UPF0029; Uncharacterized protein family UPF0029. Q#19055 - CGI_10003809 superfamily 241584 415 526 1.80E-07 49.4171 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." 
Q#19055 - CGI_10003809 superfamily 247792 48 92 6.08E-07 47.0552 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-(N/C/H)-X2-C-X(4-48)-C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#19055 - CGI_10003809 superfamily 128778 254 373 0.00018706 40.3259 cl17972 BBC superfamily - - B-Box C-terminal domain; Coiled coil region C-terminal to (some) B-Box domains Q#19055 - CGI_10003809 superfamily 241563 210 242 0.000960426 37.652 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#19056 - CGI_10003810 superfamily 241874 10 562 0 564.497 cl00456 SLC5-6-like_sbd superfamily - - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine.
NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#19058 - CGI_10003961 superfamily 216423 11 210 5.12E-71 227.504 cl18367 Glyco_hydro_35 superfamily N - Glycosyl hydrolases family 35; Glycosyl hydrolases family 35. Q#19059 - CGI_10003962 superfamily 110440 425 451 0.00503257 35.0761 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#19060 - CGI_10003963 superfamily 241597 395 464 6.36E-21 86.5833 cl00082 HMG-box superfamily - - "High Mobility Group (HMG)-box is found in a variety of eukaryotic chromosomal proteins and transcription factors. HMGs bind to the minor groove of DNA and have been classified by DNA binding preferences. Two phylogenically distinct groups of Class I proteins bind DNA in a sequence specific fashion and contain a single HMG box. One group (SOX-TCF) includes transcription factors, TCF-1, -3, -4; and also SRY and LEF-1, which bind four-way DNA junctions and duplex DNA targets. The second group (MATA) includes fungal mating type gene products MC, MATA1 and Ste11.
Class II and III proteins (HMGB-UBF) bind DNA in a non-sequence specific fashion and contain two or more tandem HMG boxes. Class II members include non-histone chromosomal proteins, HMG1 and HMG2, which bind to bent or distorted DNA such as four-way DNA junctions, synthetic DNA cruciforms, kinked cisplatin-modified DNA, DNA bulges, cross-overs in supercoiled DNA, and can cause looping of linear DNA. Class III members include nucleolar and mitochondrial transcription factors, UBF and mtTF1, which bind four-way DNA junctions." Q#19060 - CGI_10003963 superfamily 207687 195 329 1.49E-16 75.5996 cl02647 AXH superfamily - - Ataxin-1 and HBP1 module (AXH); AXH is a protein-protein and RNA binding motif found in Ataxin-1 (ATX1). ATX1 is responsible for the autosomal-dominant neurodegenerative disorder Spinocerebellar ataxia type-1 (SCA1) in humans. The AXH module has also been identified in the apparently unrelated transcription factor HBP1 which is thought to be involved in the architectural regulation of chromatin and in specific gene expression. Q#19062 - CGI_10014216 superfamily 246680 293 362 2.02E-07 49.1224 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." 
Q#19063 - CGI_10014217 superfamily 247684 16 108 3.56E-18 77.7027 cl17037 NBD_sugar-kinase_HSP70_actin superfamily C - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#19064 - CGI_10014218 superfamily 227036 112 341 4.66E-12 65.7122 cl18794 COG4692 superfamily C - Predicted neuraminidase (sialidase) [Carbohydrate transport and metabolism] Q#19065 - CGI_10014219 superfamily 215647 418 616 8.50E-32 123.873 cl18338 7tm_2 superfamily - - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GPCRs). They have been described in many animal species, but not in plants, fungi or prokaryotes.
Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#19065 - CGI_10014219 superfamily 243029 332 395 7.89E-15 70.4573 cl02422 HRM superfamily - - Hormone receptor domain; This extracellular domain contains four conserved cysteines that probably form disulphide bridges. The domain is found in a variety of hormone receptors. It may be a ligand binding domain. Q#19066 - CGI_10014220 superfamily 245206 3 320 1.72E-62 205.921 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand.
Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#19068 - CGI_10014222 superfamily 245610 6 233 5.73E-94 278.927 cl11424 nitrilase superfamily - - "Nitrilase superfamily, including nitrile- or amide-hydrolyzing enzymes and amide-condensing enzymes; This superfamily (also known as the C-N hydrolase superfamily) contains hydrolases that break carbon-nitrogen bonds; it includes nitrilases, cyanide dihydratases, aliphatic amidases, N-terminal amidases, beta-ureidopropionases, biotinidases, pantotheinase, N-carbamyl-D-amino acid amidohydrolases, the glutaminase domain of glutamine-dependent NAD+ synthetase, apolipoprotein N-acyltransferases, and N-carbamoylputrescine amidohydrolases, among others. These enzymes depend on a Glu-Lys-Cys catalytic triad, and work through a thiol acylenzyme intermediate. Members of this superfamily generally form homomeric complexes, the basic building block of which is a homodimer. These oligomers include dimers, tetramers, hexamers, octamers, tetradecamers, octadecamers, as well as variable length helical arrangements and homo-oligomeric spirals. These proteins have roles in vitamin and co-enzyme metabolism, in detoxifying small molecules, in the synthesis of signaling molecules, and in the post-translational modification of proteins. They are used industrially, as biocatalysts in the fine chemical and pharmaceutical industry, in cyanide remediation, and in the treatment of toxic effluent. This superfamily has been classified previously in the literature, based on global and structure-based sequence analysis, into thirteen different enzyme classes (referred to as 1-13). This hierarchy includes those thirteen classes and a few additional subfamilies. A putative distant relative, the plasmid-borne TraB family, has not been included in the hierarchy." 
Q#19071 - CGI_10014225 superfamily 248281 113 186 1.72E-05 41.4871 cl17727 GT1 superfamily - - "GT1, myb-like, SANT family; GT-1, a myb-like protein, is one of the GT trihelix transcription factors. GT-1 binds the GT cis-element of rbcS-3A, a light-induced gene, as a dimer. Arabidopsis GT-1 is a trans-activator and acts in the stabilization of components of the transcription pre-initiation complex comprised of TFIIA-TBP-TATA. The isolated GT-1 DNA-binding domain is sufficient to bind DNA. This region closely resembles the myb domain, but with longer helices. It has been proposed that GT-1 may respond to light signals via calcium-dependent phosphorylation to create a light-modulated molecular switch. These proteins are members of the SANT/myb group. SANT is named after 'SWI3, ADA2, N-CoR and TFIIIB', several factors that share this domain. The SANT domain resembles the 3 alpha-helix bundle of the DNA-binding Myb domains and is found in a diverse set of proteins." Q#19072 - CGI_10014226 superfamily 241886 5 227 3.66E-62 205.486 cl00470 Aldo_ket_red superfamily C - "Aldo-keto reductases (AKRs) are a superfamily of soluble NAD(P)(H) oxidoreductases whose chief purpose is to reduce aldehydes and ketones to primary and secondary alcohols. AKRs are present in all phyla and are of importance to both health and industrial applications. Members have very distinct functions and include the prokaryotic 2,5-diketo-D-gluconic acid reductases and beta-keto ester reductases, the eukaryotic aldose reductases, aldehyde reductases, hydroxysteroid dehydrogenases, steroid 5beta-reductases, potassium channel beta-subunits and aflatoxin aldehyde reductases, among others." Q#19072 - CGI_10014226 superfamily 241886 247 433 2.36E-35 132.683 cl00470 Aldo_ket_red superfamily N - "Aldo-keto reductases (AKRs) are a superfamily of soluble NAD(P)(H) oxidoreductases whose chief purpose is to reduce aldehydes and ketones to primary and secondary alcohols.
AKRs are present in all phyla and are of importance to both health and industrial applications. Members have very distinct functions and include the prokaryotic 2,5-diketo-D-gluconic acid reductases and beta-keto ester reductases, the eukaryotic aldose reductases, aldehyde reductases, hydroxysteroid dehydrogenases, steroid 5beta-reductases, potassium channel beta-subunits and aflatoxin aldehyde reductases, among others." Q#19074 - CGI_10014228 superfamily 243100 50 99 0.00397625 34.8472 cl02576 B_zip1 superfamily - - "basic leucine zipper DNA-binding and multimerization region of GCN4 and related proteins; Basic leucine zipper (bZIP) transcription factors act in networks of homo- and hetero-dimers in the regulation in a diverse set of cellular pathways. Classical leucine zippers have alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization creates a pair of basic regions that bind DNA and undergo conformational change. GCN4 was identified in Saccharomyces cerevisiae from mutations in a deficiency in activation with the general amino acid control pathway. GCN4 encodes a trans-activator of amino acid biosynthetic genes containing 2 acidic activation domains and a C-terminal bZIP domain, comprised of a basic alpha-helical DNA-binding region and a coiled-coil dimerization region." Q#19078 - CGI_10014232 superfamily 220776 1 279 1.37E-10 59.6555 cl11123 Hap2_elong superfamily - - "Histone acetylation protein 2; Hap2 is one of three histone acetyltransferases proteins that, in yeasts, are found associated with elongating forms of RNA polymerase II (Elongator). The Haps can be isolated in two forms, as a six-subunit complex with Elongator and as a complex of the three proteins on their own. 
The role of the Hap complex in transcription is still speculative, being possibly to keep the HAT activity of free Elongator in check, allowing histone acetylation only in the presence of a transcribing polymerase, or the interaction with Haps might render Elongator susceptible to modifications thereby altering its activity." Q#19080 - CGI_10014234 superfamily 247684 2 73 2.85E-18 77.7027 cl17037 NBD_sugar-kinase_HSP70_actin superfamily N - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#19081 - CGI_10014235 superfamily 247684 1 171 4.00E-35 128.164 cl17037 NBD_sugar-kinase_HSP70_actin superfamily NC - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#19082 - CGI_10014236 superfamily 247905 102 190 2.62E-13 67.2628 cl17351 HELICc superfamily N - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#19083 - CGI_10015801 superfamily 245226 80 251 2.46E-24 96.986 cl10012 DnaQ_like_exo superfamily - - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. 
The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#19086 - CGI_10015804 superfamily 247725 6 125 3.11E-74 244.014 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting proteins to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#19086 - CGI_10015804 superfamily 220787 1351 1549 4.56E-24 103.01 cl11143 NARG2_C superfamily - - "NMDA receptor-regulated gene protein 2; The transition of neuronal cells from pre-cursor to mature state is regulated by the N-methyl-d-aspartate (NMDA) receptor, a glutamate-gated ion channel that is permeable to Ca2+. NMDA receptors probably mediate this activity by permitting expression of NARG2. NARG2 is transiently expressed, being a regulatory protein that is present in the nucleus of dividing cells and then down-regulated as progenitors exit the cell cycle and begin to differentiate. NARG2 contains repeats of (S/T)PXX, (11 in mouse, six in human), a putative DNA-binding motif that is found in many gene-regulatory proteins including Kruppel, Hunchback and Antennapedia." Q#19089 - CGI_10015807 superfamily 241832 5 73 4.29E-28 102.242 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold.
Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#19089 - CGI_10015807 superfamily 243175 135 188 5.91E-08 48.0026 cl02776 GST_C_family superfamily N - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. 
GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." Q#19090 - CGI_10015808 superfamily 241832 4 73 6.02E-31 110.716 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." 
Q#19090 - CGI_10015808 superfamily 243175 84 197 3.87E-08 49.1582 cl02776 GST_C_family superfamily - - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." 
Q#19090 - CGI_10015808 superfamily 243175 195 249 0.0036154 35.3711 cl02776 GST_C_family superfamily N - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." 
Q#19091 - CGI_10015809 superfamily 241832 4 73 2.06E-28 102.242 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#19091 - CGI_10015809 superfamily 243175 84 161 7.53E-05 38.7579 cl02776 GST_C_family superfamily - - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). 
Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." Q#19092 - CGI_10015810 superfamily 214781 146 262 8.12E-15 71.6044 cl02747 NRF superfamily - - N-terminal domain in C. elegans NRF-6 (Nose Resistant to Fluoxetine-4) and NDG-4 (resistant to nordihydroguaiaretic acid-4); Also present in several other worm and fly proteins. Q#19092 - CGI_10015810 superfamily 226446 496 730 0.000961388 40.5797 cl18757 COG3936 superfamily N - Protein involved in polysaccharide intercellular adhesin (PIA) synthesis/biofilm formation [Carbohydrate transport and metabolism] Q#19094 - CGI_10015812 superfamily 248022 12 366 6.11E-33 127.394 cl17468 Aa_trans superfamily - - "Transmembrane amino acid transporter protein; This transmembrane region is found in many amino acid transporters including UNC-47 and MTR. UNC-47 encodes a vesicular amino butyric acid (GABA) transporter, (VGAT). UNC-47 is predicted to have 10 transmembrane domains. 
MTR is a N system amino acid transporter system protein involved in methyltryptophan resistance. Other members of this family include proline transporters and amino acid permeases." Q#19098 - CGI_10015816 superfamily 205121 868 892 3.48E-06 45.1876 cl18263 zf-met superfamily - - "Zinc-finger of C2H2 type; This is a zinc-finger domain with the CxxCx(12)Hx(6)H motif, found in multiple copies in a wide range of proteins from plants to metazoans. Some member proteins, particularly those from plants, are annotated as being RNA-binding." Q#19098 - CGI_10015816 superfamily 197732 630 664 1.58E-05 43.3951 cl18195 ZnF_U1 superfamily - - "U1-like zinc finger; Family of C2H2-type zinc fingers, present in matrin, U1 small nuclear ribonucleoprotein C and other RNA-binding proteins." Q#19098 - CGI_10015816 superfamily 205121 730 753 6.39E-05 41.7208 cl18263 zf-met superfamily - - "Zinc-finger of C2H2 type; This is a zinc-finger domain with the CxxCx(12)Hx(6)H motif, found in multiple copies in a wide range of proteins from plants to metazoans. Some member proteins, particularly those from plants, are annotated as being RNA-binding." Q#19098 - CGI_10015816 superfamily 205121 780 804 0.00083316 38.254 cl18263 zf-met superfamily - - "Zinc-finger of C2H2 type; This is a zinc-finger domain with the CxxCx(12)Hx(6)H motif, found in multiple copies in a wide range of proteins from plants to metazoans. Some member proteins, particularly those from plants, are annotated as being RNA-binding." Q#19099 - CGI_10015817 superfamily 241742 512 682 3.01E-72 237.211 cl00271 PI3Ka superfamily - - "Phosphoinositide 3-kinase family, accessory domain (PIK domain); PIK domain is conserved in PI3 and PI4-kinases. Its role is unclear, but it has been suggested to be involved in substrate presentation. Phosphoinositide 3-kinases play an important role in a variety of fundamental cellular processes and can be divided into three main classes, defined by their substrate specificity and domain architecture." 
Q#19099 - CGI_10015817 superfamily 241623 685 1046 3.46E-154 464.968 cl00119 PI3Kc_like superfamily - - "Phosphoinositide 3-kinase (PI3K)-like family, catalytic domain; The PI3K-like catalytic domain family is part of a larger superfamily that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and RIO kinases. Members of the family include PI3K, phosphoinositide 4-kinase (PI4K), PI3K-related protein kinases (PIKKs), and TRansformation/tRanscription domain-Associated Protein (TRRAP). PI3Ks catalyze the transfer of the gamma-phosphoryl group from ATP to the 3-hydroxyl of the inositol ring of D-myo-phosphatidylinositol (PtdIns) or its derivatives, while PI4Ks catalyze the phosphorylation of the 4-hydroxyl of PtdIns. PIKKs are protein kinases that catalyze the phosphorylation of serine/threonine residues, especially those that are followed by a glutamine. PI3Ks play an important role in a variety of fundamental cellular processes, including cell motility, the Ras pathway, vesicle trafficking and secretion, immune cell activation and apoptosis. PI4Ks produce PtdIns(4)P, the major precursor to important signaling phosphoinositides. PIKKs have diverse functions including cell-cycle checkpoints, genome surveillance, mRNA surveillance, and translation control." Q#19099 - CGI_10015817 superfamily 246669 324 479 5.68E-67 222.359 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. 
Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#19099 - CGI_10015817 superfamily 198687 29 106 5.93E-27 106.59 cl02483 PI3K_p85B superfamily - - "PI3-kinase family, p85-binding domain; PI3-kinase family, p85-binding domain. " Q#19099 - CGI_10015817 superfamily 207610 175 291 1.17E-12 66.1981 cl02484 PI3K_rbd superfamily - - "PI3-kinase family, ras-binding domain; Certain members of the PI3K family possess Ras-binding domains in their N-termini. These regions show some similarity (although not highly significant similarity) to Ras-binding pfam00788 domains (unpublished observation)." Q#19100 - CGI_10015818 superfamily 247057 837 904 1.37E-43 153.874 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." 
Q#19100 - CGI_10015818 superfamily 247057 924 989 2.22E-42 150.316 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#19100 - CGI_10015818 superfamily 247057 1009 1081 7.57E-39 140.531 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." 
Q#19100 - CGI_10015818 superfamily 243054 204 383 0.000211946 42.4328 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#19101 - CGI_10015819 superfamily 201526 67 125 2.16E-09 50.9997 cl09522 Synaptobrevin superfamily C - Synaptobrevin; Synaptobrevin. Q#19103 - CGI_10015821 superfamily 247939 346 398 0.000179496 39.7914 cl17385 DUF442 superfamily N - Putative phosphatase (DUF442); Although this domain is uncharacterized it seems likely that it performs a phosphatase function. Q#19103 - CGI_10015821 superfamily 247939 62 198 0.00151839 37.3464 cl17385 DUF442 superfamily - - Putative phosphatase (DUF442); Although this domain is uncharacterized it seems likely that it performs a phosphatase function. Q#19104 - CGI_10015822 superfamily 205166 14 43 0.00839391 33.8755 cl18265 DUF3850 superfamily C - "Domain of Unknown Function with PDB structure (DUF3850); The search results from NCBI sequence alignment indicates a conserved domain belonging to ASCH superfamily. Dali searching results show that the protein is a structurally similar to the PUA domain, suggesting it may be involved in RNA recognition. It has been reported that the deletion of PUA genes results in impaired growth (RluD) and competitive disadvantage (TruB) in Escherichia coli. Suggestions have been put forward that, apart from their usual catalytic role, certain PUS enzymes (e.g. 
TruB) may also act as chaperones for RNA folding. The interface interaction indicates that the biomolecule of protein NP_809782.1 should be a dimer." Q#19105 - CGI_10015823 superfamily 247724 1 108 2.47E-20 82.2392 cl17170 Ras_like_GTPase superfamily N - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#19107 - CGI_10015825 superfamily 247743 333 478 0.00128546 38.6663 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. 
Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#19108 - CGI_10015826 superfamily 248458 31 351 1.07E-22 100.081 cl17904 MFS superfamily - - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." 
Q#19108 - CGI_10015826 superfamily 248458 555 786 2.58E-19 89.6805 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#19109 - CGI_10015827 superfamily 248458 108 404 2.57E-10 60.7905 cl17904 MFS superfamily - - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. 
MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#19110 - CGI_10015828 superfamily 247725 1344 1483 2.96E-44 158.927 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. 
This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#19110 - CGI_10015828 superfamily 243096 1137 1322 6.00E-32 125.103 cl02571 RhoGEF superfamily - - Guanine nucleotide exchange factor for Rho/Rac/Cdc42-like GTPases; Also called Dbl-homologous (DH) domain. It appears that PH domains invariably occur C-terminal to RhoGEF/DH domains. Q#19110 - CGI_10015828 superfamily 247683 1496 1550 8.30E-29 112.044 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#19112 - CGI_10015830 superfamily 243035 35 158 7.79E-28 104.239 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. 
Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)-selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. 
A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#19113 - CGI_10015831 superfamily 247725 714 821 3.71E-47 164.062 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#19116 - CGI_10003213 superfamily 110440 141 161 0.00549754 32.3797 cl03211 NHL superfamily C - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#19117 - CGI_10015453 superfamily 247856 76 125 0.000383836 35.6013 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. 
Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#19121 - CGI_10015457 superfamily 247792 46 90 8.80E-15 67.4708 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#19122 - CGI_10015458 superfamily 247727 296 401 5.07E-18 79.7814 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#19122 - CGI_10015458 superfamily 247727 62 160 1.49E-11 61.2918 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). 
There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#19123 - CGI_10015459 superfamily 241565 521 600 6.00E-07 48.0867 cl00038 BRCT superfamily - - "Breast Cancer Suppressor Protein (BRCA1), carboxy-terminal domain. The BRCT domain is found within many DNA damage repair and cell cycle checkpoint proteins. The unique diversity of this domain superfamily allows BRCT modules to interact forming homo/hetero BRCT multimers, BRCT-non-BRCT interactions, and interactions within DNA strand breaks." Q#19123 - CGI_10015459 superfamily 247908 156 291 4.88E-40 145.446 cl17354 NIF superfamily - - NLI interacting factor-like phosphatase; This family contains a number of NLI interacting factor isoforms and also an N-terminal regions of RNA polymerase II CTC phosphatase and FCP1 serine phosphatase. This region has been identified as the minimal phosphatase domain. Q#19124 - CGI_10015460 superfamily 241644 15 141 2.82E-23 89.6607 cl00154 UBCc superfamily - - "Ubiquitin-conjugating enzyme E2, catalytic (UBCc) domain. This is part of the ubiquitin-mediated protein degradation pathway in which a thiol-ester linkage forms between a conserved cysteine and the C-terminus of ubiquitin and complexes with ubiquitin protein ligase enzymes, E3. This pathway regulates many fundamental cellular processes. There are also other E2s which form thiol-ester linkages without the use of E3s as well as several UBC homologs (TSG101, Mms2, Croc-1 and similar proteins) which lack the active site cysteine essential for ubiquitination and appear to function in DNA repair pathways which were omitted from the scope of this CD." 
Q#19126 - CGI_10015462 superfamily 245864 69 497 2.41E-111 341.951 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#19127 - CGI_10015463 superfamily 242849 7 79 1.54E-31 108.83 cl02041 Cyt-b5 superfamily - - Cytochrome b5-like Heme/Steroid binding domain; This family includes heme binding domains from a diverse range of proteins. This family also includes proteins that bind to steroids. The family includes progesterone receptors. Many members of this subfamily are membrane anchored by an N-terminal transmembrane alpha helix. This family also includes a domain in some chitin synthases. There is no known ligand for this domain in the chitin synthases. Q#19128 - CGI_10015464 superfamily 243072 112 217 6.40E-31 114.788 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins.
The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#19128 - CGI_10015464 superfamily 243072 33 163 4.35E-24 95.5282 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#19128 - CGI_10015464 superfamily 243073 334 373 4.76E-05 40.536 cl02533 SOCS superfamily - - "SOCS (suppressors of cytokine signaling) box. The SOCS box is found in the C-terminal region of CIS/SOCS family proteins (in combination with a SH2 domain), ASBs (ankyrin repeat-containing proteins with a SOCS box), SSBs (SPRY domain-containing proteins with a SOCS box), and WSBs (WD40 repeat-containing proteins with a SOCS box), as well as, other miscellaneous proteins. The function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions." Q#19129 - CGI_10015465 superfamily 201894 28 345 2.63E-98 299.659 cl03288 Glyco_hydro_56 superfamily - - Hyaluronidase; Hyaluronidase. Q#19130 - CGI_10015466 superfamily 201894 56 373 5.10E-93 295.036 cl03288 Glyco_hydro_56 superfamily - - Hyaluronidase; Hyaluronidase. 
Q#19130 - CGI_10015466 superfamily 242240 520 673 9.08E-53 181.721 cl00997 DUF297 superfamily - - TM1410 hypothetical-related protein; TM1410 hypothetical-related protein. Q#19131 - CGI_10015467 superfamily 242240 37 211 1.36E-49 164.772 cl00997 DUF297 superfamily - - TM1410 hypothetical-related protein; TM1410 hypothetical-related protein. Q#19132 - CGI_10015468 superfamily 241680 15 229 1.89E-68 214.425 cl00200 MIP superfamily - - "Major intrinsic protein (MIP) superfamily. Members of the MIP superfamily function as membrane channels that selectively transport water, small neutral molecules, and ions out of and between cells. The channel proteins share a common fold: the N-terminal cytosolic portion followed by six transmembrane helices, which might have arisen through gene duplication. On the basis of sequence similarity and functional characteristics, the superfamily can be subdivided into two major groups: water-selective channels called aquaporins (AQPs) and glycerol uptake facilitators (GlpFs). AQPs are found in all three kingdoms of life, while GlpFs have been characterized only within microorganisms." Q#19133 - CGI_10015469 superfamily 245819 838 997 9.13E-55 188.557 cl11967 Nucleotidyl_cyc_III superfamily - - "Class III nucleotidyl cyclases; Class III nucleotidyl cyclases are the largest, most diverse group of nucleotidyl cyclases (NC's) containing prokaryotic and eukaryotic proteins. They can be divided into two major groups; the mononucleotidyl cyclases (MNC's) and the diguanylate cyclases (DGC's). The MNC's, which include the adenylate cyclases (AC's) and the guanylate cyclases (GC's), have a conserved cyclase homology domain (CHD), while the DGC's have a conserved GGDEF domain, named after a conserved motif within this subgroup. Their products, cyclic guanylyl and adenylyl nucleotides, are second messengers that play important roles in eukaryotic signal transduction and prokaryotic sensory pathways." 
Q#19133 - CGI_10015469 superfamily 245225 25 401 5.83E-75 253.324 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily - - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein coupled receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine-binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold."
Q#19133 - CGI_10015469 superfamily 245201 507 761 1.58E-42 156.928 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#19133 - CGI_10015469 superfamily 219526 769 824 2.60E-06 47.9991 cl06648 HNOBA superfamily N - "Heme NO binding associated; The HNOBA domain is found associated with the HNOB domain and pfam00211 in soluble cyclases and signalling proteins. The HNOB domain is predicted to function as a heme-dependent sensor for gaseous ligands, and transduce diverse downstream signals, in both bacteria and animals." Q#19134 - CGI_10015470 superfamily 243035 23 138 7.97E-19 78.0453 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model."
Q#19135 - CGI_10003663 superfamily 209898 13 35 0.000598477 35.0718 cl14787 MORN superfamily - - MORN repeat; The MORN (Membrane Occupation and Recognition Nexus) repeat is found in multiple copies in several proteins including junctophilins (See Takeshima et al. Mol. Cell 2000;6:11-22). A MORN-repeat protein has been identified in the parasite Toxoplasma gondii as a dynamic component of the cell division apparatus. It has been hypothesised to function as a linker protein between certain membrane regions and the parasite's cytoskeleton. Q#19135 - CGI_10003663 superfamily 209898 57 78 0.00138363 34.2383 cl14787 MORN superfamily - - MORN repeat; The MORN (Membrane Occupation and Recognition Nexus) repeat is found in multiple copies in several proteins including junctophilins (See Takeshima et al. Mol. Cell 2000;6:11-22). A MORN-repeat protein has been identified in the parasite Toxoplasma gondii as a dynamic component of the cell division apparatus. It has been hypothesised to function as a linker protein between certain membrane regions and the parasite's cytoskeleton. Q#19136 - CGI_10003664 superfamily 203213 485 521 2.36E-07 48.2682 cl04999 HTH_psq superfamily - - "helix-turn-helix, Psq domain; This DNA-binding motif is found in four copies in the pipsqueak protein of Drosophila melanogaster. In pipsqueak this domain binds to GAGA sequence." Q#19136 - CGI_10003664 superfamily 203213 187 218 6.63E-07 47.1126 cl04999 HTH_psq superfamily C - "helix-turn-helix, Psq domain; This DNA-binding motif is found in four copies in the pipsqueak protein of Drosophila melanogaster. In pipsqueak this domain binds to GAGA sequence." Q#19137 - CGI_10003665 superfamily 241646 1102 1147 8.14E-11 59.7711 cl00156 WAP superfamily - - "whey acidic protein-type four-disulfide core domains.
Members of the family include whey acidic protein, elafin (elastase-specific inhibitor), caltrin-like protein (a calcium transport inhibitor) and other extracellular proteinase inhibitors. A group of proteins containing 8 characteristically-spaced cysteine residues forming disulphide bonds have been termed '4-disulphide core' proteins. Protease inhibition occurs by insertion of the inhibitory loop into the active site pocket and interference with the catalytic residues of the protease." Q#19137 - CGI_10003665 superfamily 241646 674 722 3.97E-10 57.8451 cl00156 WAP superfamily - - "whey acidic protein-type four-disulfide core domains. Members of the family include whey acidic protein, elafin (elastase-specific inhibitor), caltrin-like protein (a calcium transport inhibitor) and other extracellular proteinase inhibitors. A group of proteins containing 8 characteristically-spaced cysteine residues forming disulphide bonds have been termed '4-disulphide core' proteins. Protease inhibition occurs by insertion of the inhibitory loop into the active site pocket and interference with the catalytic residues of the protease." Q#19137 - CGI_10003665 superfamily 241646 853 901 4.98E-10 57.4599 cl00156 WAP superfamily - - "whey acidic protein-type four-disulfide core domains. Members of the family include whey acidic protein, elafin (elastase-specific inhibitor), caltrin-like protein (a calcium transport inhibitor) and other extracellular proteinase inhibitors. A group of proteins containing 8 characteristically-spaced cysteine residues forming disulphide bonds have been termed '4-disulphide core' proteins. Protease inhibition occurs by insertion of the inhibitory loop into the active site pocket and interference with the catalytic residues of the protease." Q#19137 - CGI_10003665 superfamily 243060 571 640 1.14E-09 57.3888 cl02507 SEA superfamily C - "SEA domain; Domain found in Sea urchin sperm protein, Enterokinase, Agrin (SEA).
Proposed function of regulating or binding carbohydrate side chains. Recently a proteolytic activity has been shown for a SEA domain." Q#19137 - CGI_10003665 superfamily 243060 999 1068 1.14E-09 57.3888 cl02507 SEA superfamily C - "SEA domain; Domain found in Sea urchin sperm protein, Enterokinase, Agrin (SEA). Proposed function of regulating or binding carbohydrate side chains. Recently a proteolytic activity has been shown for a SEA domain." Q#19137 - CGI_10003665 superfamily 243060 183 284 1.39E-08 53.922 cl02507 SEA superfamily - - "SEA domain; Domain found in Sea urchin sperm protein, Enterokinase, Agrin (SEA). Proposed function of regulating or binding carbohydrate side chains. Recently a proteolytic activity has been shown for a SEA domain." Q#19137 - CGI_10003665 superfamily 243060 786 819 0.00639977 36.588 cl02507 SEA superfamily NC - "SEA domain; Domain found in Sea urchin sperm protein, Enterokinase, Agrin (SEA). Proposed function of regulating or binding carbohydrate side chains. Recently a proteolytic activity has been shown for a SEA domain." Q#19139 - CGI_10003667 superfamily 241733 8 68 2.79E-34 115.306 cl00259 Sm_like superfamily - - "Sm and related proteins; The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes." 
Q#19140 - CGI_10003668 superfamily 241733 4 77 9.34E-43 137.648 cl00259 Sm_like superfamily - - "Sm and related proteins; The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes." Q#19142 - CGI_10002963 superfamily 243035 33 152 1.05E-23 90.7569 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration.
CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#19145 - CGI_10002966 superfamily 248097 2 113 1.09E-11 56.8898 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#19148 - CGI_10003284 superfamily 245847 143 217 5.33E-05 41.334 cl12042 FA58C superfamily C - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." 
Q#19149 - CGI_10013796 superfamily 247727 36 138 5.90E-09 50.5063 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#19150 - CGI_10013797 superfamily 247727 98 200 4.15E-08 49.3507 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#19151 - CGI_10013798 superfamily 245201 1 256 5.12E-139 396.424 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." 
Q#19152 - CGI_10013799 superfamily 243058 356 462 3.41E-07 49.6203 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#19153 - CGI_10013800 superfamily 241629 23 147 1.52E-29 110.302 cl00133 SCP superfamily - - "SCP: SCP-like extracellular protein domain, found in eukaryotes and prokaryotes. This family includes plant pathogenesis-related protein 1 (PR-1), which accumulates after infections with pathogens, and may act as an anti-fungal agent or be involved in cell wall loosening. This family also includes CRISPs, mammalian cysteine-rich secretory proteins, which combine SCP with a C-terminal cysteine rich domain, and allergen 5 from vespid venom. Roles for CRISP, in response to pathogens, fertilization, and sperm maturation have been proposed. One member, Tex31 from the venom duct of Conus textile, has been shown to possess proteolytic activity sensitive to serine protease inhibitors. The human GAPR-1 protein has been reported to dimerize, and such a dimer may form an active site containing a catalytic triad. SCP has also been proposed to be a Ca++ chelating serine protease. The Ca++-chelating function would fit with various signaling processes that members of this family, such as the CRISPs, are involved in, and is supported by sequence and structural evidence of a conserved pocket containing two histidines and a glutamate. 
It also may explain how helothermine, a toxic peptide secreted by the beaded lizard, blocks Ca++ transporting ryanodine receptors. Little is known about the biological roles of the bacterial and archaeal SCP domains." Q#19154 - CGI_10013801 superfamily 243074 31 77 6.21E-14 66.7613 cl02535 F-box-like superfamily - - F-box-like; This is an F-box-like family. Q#19156 - CGI_10013803 superfamily 245208 22 511 0 593.537 cl09933 ACAD superfamily N - "Acyl-CoA dehydrogenase; Both mitochondrial acyl-CoA dehydrogenases (ACAD) and peroxisomal acyl-CoA oxidases (AXO) catalyze the alpha,beta dehydrogenation of the corresponding trans-enoyl-CoA by FAD, which becomes reduced. The reduced form of ACAD is reoxidized in the oxidative half-reaction by electron-transferring flavoprotein (ETF), from which the electrons are transferred to the mitochondrial respiratory chain coupled with ATP synthesis. In contrast, AXO catalyzes a different oxidative half-reaction, in which the reduced FAD is reoxidized by molecular oxygen. The ACAD family includes the eukaryotic beta-oxidation enzymes, short (SCAD), medium (MCAD), long (LCAD) and very-long (VLCAD) chain acyl-CoA dehydrogenases. These enzymes all share high sequence similarity, but differ in their substrate specificities. The ACAD family also includes amino acid catabolism enzymes such as Isovaleryl-CoA dehydrogenase (IVD), short/branched chain acyl-CoA dehydrogenases (SBCAD), Isobutyryl-CoA dehydrogenase (IBDH), glutaryl-CoA dehydrogenase (GCD) and Crotonobetainyl-CoA dehydrogenase. The mitochondrial ACAD's are generally homotetramers, except for VLCAD, which is a homodimer.
Related enzymes include the SOS adaptive response protein aidB, Naphthocyclinone hydroxylase (NcnH), and Dibenzothiophene (DBT) desulfurization enzyme C (DszC)" Q#19158 - CGI_10013805 superfamily 245208 42 463 0 533.06 cl09933 ACAD superfamily C - "Acyl-CoA dehydrogenase; Both mitochondrial acyl-CoA dehydrogenases (ACAD) and peroxisomal acyl-CoA oxidases (AXO) catalyze the alpha,beta dehydrogenation of the corresponding trans-enoyl-CoA by FAD, which becomes reduced. The reduced form of ACAD is reoxidized in the oxidative half-reaction by electron-transferring flavoprotein (ETF), from which the electrons are transferred to the mitochondrial respiratory chain coupled with ATP synthesis. In contrast, AXO catalyzes a different oxidative half-reaction, in which the reduced FAD is reoxidized by molecular oxygen. The ACAD family includes the eukaryotic beta-oxidation enzymes, short (SCAD), medium (MCAD), long (LCAD) and very-long (VLCAD) chain acyl-CoA dehydrogenases. These enzymes all share high sequence similarity, but differ in their substrate specificities. The ACAD family also includes amino acid catabolism enzymes such as Isovaleryl-CoA dehydrogenase (IVD), short/branched chain acyl-CoA dehydrogenases (SBCAD), Isobutyryl-CoA dehydrogenase (IBDH), glutaryl-CoA dehydrogenase (GCD) and Crotonobetainyl-CoA dehydrogenase. The mitochondrial ACAD's are generally homotetramers, except for VLCAD, which is a homodimer. Related enzymes include the SOS adaptive response protein aidB, Naphthocyclinone hydroxylase (NcnH), and Dibenzothiophene (DBT) desulfurization enzyme C (DszC)" Q#19158 - CGI_10013805 superfamily 201956 451 537 5.50E-24 99.2062 cl08333 ACOX superfamily N - Acyl-CoA oxidase; This is a family of Acyl-CoA oxidases EC:1.3.3.6. Acyl-CoA oxidase converts acyl-CoA into trans-2-enoyl-CoA.
Q#19159 - CGI_10013806 superfamily 218499 28 393 1.88E-134 393.714 cl04986 ALG3 superfamily - - "ALG3 protein; The formation of N-glycosidic linkages of glycoproteins involves the ordered assembly of the common Glc3Man9GlcNAc2 core-oligosaccharide on the lipid carrier dolichyl pyrophosphate. Whereas early mannosylation steps occur on the cytoplasmic side of the endoplasmic reticulum with GDP-Man as donor, the final reactions from Man5GlcNAc2-PP-Dol to Man9GlcNAc2-PP-Dol on the lumenal side use Dol-P-Man. ALG3 gene encodes the Dol-P-Man:Man5GlcNAc2-PP-Dol mannosyltransferase." Q#19160 - CGI_10013807 superfamily 245231 443 740 2.70E-164 486.212 cl10019 PurM-like superfamily - - "AIR (aminoimidazole ribonucleotide) synthase related protein. This family includes Hydrogen expression/formation protein HypE, AIR synthases, FGAM (formylglycinamidine ribonucleotide) synthase and Selenophosphate synthetase (SelD). The N-terminal domain of AIR synthase forms the dimer interface of the protein, and is suggested as a putative ATP binding domain." Q#19160 - CGI_10013807 superfamily 241838 772 956 1.08E-93 296.221 cl00395 FMT_core superfamily - - "Formyltransferase, catalytic core domain; Formyltransferase, catalytic core domain. The proteins of this superfamily contain a formyltransferase domain that hydrolyzes the removal of a formyl group from its substrate as part of a multistep transfer mechanism, and this alignment model represents the catalytic core of the formyltransferase domain. This family includes the following known members; Glycinamide Ribonucleotide Transformylase (GART), Formyl-FH4 Hydrolase, Methionyl-tRNA Formyltransferase, ArnA, and 10-Formyltetrahydrofolate Dehydrogenase (FDH). Glycinamide Ribonucleotide Transformylase (GART) catalyzes the third step in de novo purine biosynthesis, the transfer of a formyl group to 5'-phosphoribosylglycinamide. Formyl-FH4 Hydrolase catalyzes the hydrolysis of 10-formyltetrahydrofolate (formyl-FH4) to FH4 and formate. 
Methionyl-tRNA Formyltransferase transfers a formyl group onto the amino terminus of the acyl moiety of the methionyl aminoacyl-tRNA, which plays an important role in translation initiation. ArnA is required for the modification of lipid A with 4-amino-4-deoxy-l-arabinose (Ara4N) that leads to resistance to cationic antimicrobial peptides (CAMPs) and clinical antimicrobials such as polymyxin. 10-formyltetrahydrofolate dehydrogenase (FDH) catalyzes the conversion of 10-formyltetrahydrofolate, a precursor for nucleotide biosynthesis, to tetrahydrofolate. Members of this family are multidomain proteins. The formyltransferase domain is located at the N-terminus of FDH, Methionyl-tRNA Formyltransferase and ArnA, and at the C-terminus of Formyl-FH4 Hydrolase. Prokaryotic Glycinamide Ribonucleotide Transformylase (GART) is a single domain protein while eukaryotic GART is a trifunctional protein that catalyzes the second, third and fifth steps in de novo purine biosynthesis." Q#19160 - CGI_10013807 superfamily 247809 104 269 4.24E-83 267.97 cl17255 ATP-grasp_4 superfamily - - ATP-grasp domain; This family includes a diverse set of enzymes that possess ATP-dependent carboxylate-amine ligase activity. Q#19160 - CGI_10013807 superfamily 217251 4 103 7.84E-43 152.215 cl03744 GARS_N superfamily - - "Phosphoribosylglycinamide synthetase, N domain; Phosphoribosylglycinamide synthetase catalyzes the second step in the de novo biosynthesis of purine. The reaction catalyzed by Phosphoribosylglycinamide synthetase is the ATP-dependent addition of 5-phosphoribosylamine to glycine to form 5'-phosphoribosylglycinamide. This domain is related to the N-terminal domain of biotin carboxylase/carbamoyl phosphate synthetase (see pfam00289)." Q#19160 - CGI_10013807 superfamily 217250 304 399 1.30E-19 85.6554 cl03743 GARS_C superfamily - - "Phosphoribosylglycinamide synthetase, C domain; Phosphoribosylglycinamide synthetase catalyzes the second step in the de novo biosynthesis of purine.
The reaction catalyzed by Phosphoribosylglycinamide synthetase is the ATP-dependent addition of 5-phosphoribosylamine to glycine to form 5'-phosphoribosylglycinamide. This domain is related to the C-terminal domain of biotin carboxylase/carbamoyl phosphate synthetase (see pfam02787)." Q#19161 - CGI_10013808 superfamily 217725 65 333 9.29E-60 195.795 cl18425 FGE-sulfatase superfamily - - "Formylglycine-generating sulfatase enzyme; This domain is found in eukaryotic proteins required for post-translational sulphatase modification (SUMF1). These proteins are associated with the rare disorder multiple sulphatase deficiency (MSD). The protein product of the SUMF1 gene is FGE, formylglycine (FGly)-generating enzyme, which is a sulfatase. Sulfatases are enzymes essential for degradation and remodelling of sulfate esters, and formylglycine (FGly), the key catalytic residue in the active site, is unique to sulfatases. FGE is localised to the endoplasmic reticulum (ER) and interacts with and modifies the unfolded form of newly synthesised sulfatases. FGE is a single-domain monomer with a surprising paucity of secondary structure that adopts a unique fold which is stabilised by two Ca2+ ions. The effect of all mutations found in MSD patients is explained by the FGE structure, providing a molecular basis for MSD. A redox-active disulfide bond is present in the active site of FGE. An oxidized cysteine residue, possibly cysteine sulfenic acid, has been detected that may allow formulation of a structure-based mechanism for FGly formation from cysteine residues in all sulfatases." Q#19164 - CGI_10013811 superfamily 219566 32 107 5.32E-17 72.6655 cl06691 DUF1620 superfamily N - Protein of unknown function (DUF1620); These sequences are mainly derived from predicted eukaryotic proteins. The region in question lies towards the C-terminus of these large proteins and is approximately 300 amino acid residues long.
Q#19165 - CGI_10017040 superfamily 245201 36 97 2.91E-07 46.2393 cl09925 PKc_like superfamily C - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#19166 - CGI_10017041 superfamily 247683 499 559 4.10E-37 133.988 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." 
Q#19166 - CGI_10017041 superfamily 247744 654 770 1.75E-25 103.381 cl17190 NK superfamily - - "Nucleoside/nucleotide kinase (NK) is a protein superfamily consisting of multiple families of enzymes that share structural similarity and are functionally related to the catalysis of the reversible phosphate group transfer from nucleoside triphosphates to nucleosides/nucleotides, nucleoside monophosphates, or sugars. Members of this family play a wide variety of essential roles in nucleotide metabolism, the biosynthesis of coenzymes and aromatic compounds, as well as the metabolism of sugar and sulfate." Q#19166 - CGI_10017041 superfamily 241622 382 462 8.18E-21 88.3926 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either an N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95 (postsynaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein."
Q#19166 - CGI_10017041 superfamily 241622 164 242 2.75E-20 86.8518 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either an N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95 (postsynaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#19166 - CGI_10017041 superfamily 243136 2 64 8.04E-28 108.026 cl02672 L27 superfamily - - L27 domain; The L27 domain is found in receptor targeting proteins Lin-2 and Lin-7.
Q#19167 - CGI_10017042 superfamily 247684 111 268 2.71E-12 64.5323 cl17037 NBD_sugar-kinase_HSP70_actin superfamily C - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#19167 - CGI_10017042 superfamily 202746 250 487 3.87E-85 264.927 cl08402 Hexokinase_2 superfamily - - Hexokinase; Hexokinase (EC:2.7.1.1) contains two structurally similar domains represented by this family and pfam00349. Some members of the family have two copies of each of these domains. Q#19168 - CGI_10017043 superfamily 145949 87 199 9.93E-24 94.3121 cl03870 Nucleoplasmin superfamily C - Nucleoplasmin; Nucleoplasmins are also known as chromatin decondensation proteins. 
They bind to core histones and transfer DNA to them in a reaction that requires ATP. This is thought to play a role in the assembly of regular nucleosomal arrays. Q#19170 - CGI_10017045 superfamily 241579 37 85 0.00828024 33.6508 cl00060 FGF superfamily N - "Acidic and basic fibroblast growth factor family; FGFs are mitogens, which stimulate growth or differentiation of cells of mesodermal or neuroectodermal origin. The family plays essential roles in patterning and differentiation during vertebrate embryogenesis, and has neurotrophic activities. FGFs have a high affinity for heparan sulfate proteoglycans and require heparan sulfate to activate one of four cell surface FGF receptors. Upon binding to FGF, the receptors dimerize and their intracellular tyrosine kinase domains become active. FGFs have internal pseudo-threefold symmetry (beta-trefoil topology)." Q#19176 - CGI_10017052 superfamily 241749 1 90 9.18E-13 59.7069 cl00280 globin_like superfamily N - superfamily containing globins and truncated hemoglobins Q#19177 - CGI_10017053 superfamily 247724 118 270 2.39E-22 90.9782 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. 
The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#19180 - CGI_10017056 superfamily 247856 55 103 6.60E-15 64.4913 cl17302 EFh superfamily C - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#19181 - CGI_10017057 superfamily 246680 30 90 7.26E-15 64.6494 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#19182 - CGI_10012829 superfamily 243088 14 132 9.26E-53 177.167 cl02563 PX_domain superfamily - - "The Phox Homology domain, a phosphoinositide binding module; The PX domain is a phosphoinositide (PI) binding module involved in targeting proteins to membranes. 
Proteins containing PX domains interact with PIs and have been implicated in highly diverse functions such as cell signaling, vesicular trafficking, protein sorting, lipid modification, cell polarity and division, activation of T and B cells, and cell survival. Many members of this superfamily bind phosphatidylinositol-3-phosphate (PI3P) but in some cases, other PIs such as PI4P or PI(3,4)P2, among others, are the preferred substrates. In addition to protein-lipid interaction, the PX domain may also be involved in protein-protein interaction, as in the cases of p40phox, p47phox, and some sorting nexins (SNXs). The PX domain is conserved from yeast to humans and is found in more than 100 proteins. The majority of PX domain-containing proteins are SNXs, which play important roles in endosomal sorting." Q#19182 - CGI_10012829 superfamily 245201 185 394 9.59E-10 57.6317 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#19183 - CGI_10012830 superfamily 218118 761 826 7.57E-06 44.9125 cl04552 CD225 superfamily N - "Interferon-induced transmembrane protein; This family includes the human leukocyte antigen CD225, which is an interferon inducible transmembrane protein, and is associated with interferon induced cell growth suppression." Q#19184 - CGI_10012831 superfamily 191163 21 186 3.68E-20 90.1763 cl04888 DUF667 superfamily - - "Protein of unknown function (DUF667); This family of proteins are highly conserved in eukaryotes. Some proteins in the family are annotated as transcription factors. 
However, there is currently no support for this in the literature." Q#19185 - CGI_10012832 superfamily 248097 86 224 1.37E-20 83.8538 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#19186 - CGI_10012833 superfamily 204122 32 69 3.91E-07 42.5271 cl07623 TMA7 superfamily N - Translation machinery associated TMA7; TMA7 plays a role in protein translation. Deletion of the TMA7 gene results in altered protein synthesis rates. Q#19187 - CGI_10012834 superfamily 241563 66 102 0.00457871 35.5328 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#19191 - CGI_10012838 superfamily 217725 73 350 2.95E-79 251.649 cl18425 FGE-sulfatase superfamily - - "Formylglycine-generating sulfatase enzyme; This domain is found in eukaryotic proteins required for post-translational sulphatase modification (SUMF1). These proteins are associated with the rare disorder multiple sulphatase deficiency (MSD). The protein product of the SUMF1 gene is FGE, formylglycine (FGly)-generating enzyme, which is a sulfatase. Sulfatases are enzymes essential for degradation and remodelling of sulfate esters, and formylglycine (FGly), the key catalytic residue in the active site, is unique to sulfatases. FGE is localised to the endoplasmic reticulum (ER) and interacts with and modifies the unfolded form of newly synthesised sulfatases. FGE is a single-domain monomer with a surprising paucity of secondary structure that adopts a unique fold which is stabilised by two Ca2+ ions. The effect of all mutations found in MSD patients is explained by the FGE structure, providing a molecular basis for MSD. A redox-active disulfide bond is present in the active site of FGE.
An oxidized cysteine residue, possibly cysteine sulfenic acid, has been detected that may allow formulation of a structure-based mechanism for FGly formation from cysteine residues in all sulfatases." Q#19192 - CGI_10012839 superfamily 247727 68 173 1.84E-09 54.3583 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#19192 - CGI_10012839 superfamily 247727 260 341 1.77E-08 51.6619 cl17173 AdoMet_MTases superfamily C - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#19192 - CGI_10012839 superfamily 219506 332 380 0.00467234 35.3284 cl18513 Eco57I superfamily C - Eco57I restriction-modification methylase; Homologues of the Escherichia coli Eco57I restriction-modification methylase are found in several phylogenetically diverse bacteria. The structure of TaqI has been solved. 
Q#19193 - CGI_10012840 superfamily 245213 41 77 8.33E-06 42.6238 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#19194 - CGI_10012841 superfamily 245213 447 484 3.18E-06 45.3202 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#19194 - CGI_10012841 superfamily 245213 333 368 4.93E-06 44.935 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#19194 - CGI_10012841 superfamily 245213 295 331 0.000171876 40.3126 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#19194 - CGI_10012841 superfamily 245213 371 407 0.000603014 38.7718 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#19194 - CGI_10012841 superfamily 245213 763 792 0.00467544 36.0754 cl09941 EGF_CA superfamily N - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#19194 - CGI_10012841 superfamily 245213 652 681 0.00587773 35.6902 cl09941 EGF_CA superfamily N - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#19195 - CGI_10012842 superfamily 216363 74 182 5.44E-11 55.9394 cl08312 UPF0029 superfamily - - Uncharacterized protein family UPF0029; Uncharacterized protein family UPF0029. Q#19197 - CGI_10012844 superfamily 248097 9 130 9.82E-20 79.2314 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#19199 - CGI_10012846 superfamily 245205 61 161 2.09E-32 115.311 cl09930 RPA_2b-aaRSs_OBF_like superfamily - - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. 
RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." Q#19199 - CGI_10012846 superfamily 241739 165 198 2.84E-09 54.1183 cl00268 class_II_aaRS-like_core superfamily C - "Class II tRNA amino-acyl synthetase-like catalytic core domain. Class II amino acyl-tRNA synthetases (aaRS) share a common fold and generally attach an amino acid to the 3' OH of ribose of the appropriate tRNA. PheRS is an exception in that it attaches the amino acid at the 2'-OH group, like class I aaRSs. These enzymes are usually homodimers. This domain is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. The substrate specificity of this reaction is further determined by additional domains. Interestingly, this domain is also found in asparagine synthase A (AsnA), in the accessory subunit of mitochondrial polymerase gamma and in the bacterial ATP phosphoribosyltransferase regulatory subunit HisZ." Q#19200 - CGI_10014753 superfamily 247750 7 393 0 560.77 cl17196 E1_enzyme_family superfamily - - "Superfamily of activating enzymes (E1) of the ubiquitin-like proteins. This family includes classical ubiquitin-activating enzymes E1, ubiquitin-like (ubl) activating enzymes and other mechanistic homologs, like MoeB, Thif1 and others.
The common reaction mechanism catalyzed by MoeB, ThiF and the E1 enzymes begins with a nucleophilic attack of the C-terminal carboxylate of MoaD, ThiS and ubiquitin, respectively, on the alpha-phosphate of an ATP molecule bound at the active site of the activating enzymes, leading to the formation of a high-energy acyladenylate intermediate and subsequently to the formation of a thiocarboxylate at the C termini of MoaD and ThiS." Q#19200 - CGI_10014753 superfamily 247750 420 525 6.67E-20 90.8268 cl17196 E1_enzyme_family superfamily N - "Superfamily of activating enzymes (E1) of the ubiquitin-like proteins. This family includes classical ubiquitin-activating enzymes E1, ubiquitin-like (ubl) activating enzymes and other mechanistic homologs, like MoeB, Thif1 and others. The common reaction mechanism catalyzed by MoeB, ThiF and the E1 enzymes begins with a nucleophilic attack of the C-terminal carboxylate of MoaD, ThiS and ubiquitin, respectively, on the alpha-phosphate of an ATP molecule bound at the active site of the activating enzymes, leading to the formation of a high-energy acyladenylate intermediate and subsequently to the formation of a thiocarboxylate at the C termini of MoaD and ThiS." Q#19201 - CGI_10014754 superfamily 217881 33 90 8.36E-18 76.8214 cl04390 Abhydro_lipase superfamily - - Partial alpha/beta-hydrolase lipase region; This family corresponds to an N-terminal part of an alpha/beta hydrolase domain. Q#19202 - CGI_10014755 superfamily 243035 24 138 6.35E-20 82.6677 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs.
Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer.
A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#19203 - CGI_10014756 superfamily 247743 356 512 1.35E-12 66.4007 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#19203 - CGI_10014756 superfamily 191262 629 833 1.61E-96 302.6 cl18170 Lon_C superfamily - - "Lon protease (S16) C-terminal proteolytic domain; The Lon serine proteases must hydrolyse ATP to degrade protein substrates. In Escherichia coli, these proteases are involved in turnover of intracellular proteins, including abnormal proteins following heat-shock. The active site for protease activity resides in a C-terminal domain. The Lon proteases are classified as family S16 in Merops." Q#19204 - CGI_10014757 superfamily 199166 261 433 4.77E-12 64.656 cl15308 AMN1 superfamily N - "Antagonist of mitotic exit network protein 1; Amn1 has been functionally characterized in Saccharomyces cerevisiae as a component of the Antagonist of MEN pathway (AMEN). The AMEN network is activated by MEN (mitotic exit network) via an active Cdc14, and in turn switches off MEN. 
Amn1 constitutes one of the alternative mechanisms by which MEN may be disrupted. Specifically, Amn1 binds Tem1 (Termination of M-phase, a GTPase that belongs to the RAS superfamily), and disrupts its association with Cdc15, the primary downstream target. Amn1 is a leucine-rich repeat (LRR) protein, with 12 repeats in the S. cerevisiae ortholog. As a negative regulator of the signal transduction pathway MEN, overexpression of AMN1 slows the growth of wild type cells. The function of the vertebrate members of this family has not been determined experimentally, they have fewer LRRs that determine the extent of this model." Q#19204 - CGI_10014757 superfamily 243074 18 50 0.00140843 37.1009 cl02535 F-box-like superfamily - - F-box-like; This is an F-box-like family. Q#19204 - CGI_10014757 superfamily 199166 492 567 0.00424731 37.692 cl15308 AMN1 superfamily NC - "Antagonist of mitotic exit network protein 1; Amn1 has been functionally characterized in Saccharomyces cerevisiae as a component of the Antagonist of MEN pathway (AMEN). The AMEN network is activated by MEN (mitotic exit network) via an active Cdc14, and in turn switches off MEN. Amn1 constitutes one of the alternative mechanisms by which MEN may be disrupted. Specifically, Amn1 binds Tem1 (Termination of M-phase, a GTPase that belongs to the RAS superfamily), and disrupts its association with Cdc15, the primary downstream target. Amn1 is a leucine-rich repeat (LRR) protein, with 12 repeats in the S. cerevisiae ortholog. As a negative regulator of the signal transduction pathway MEN, overexpression of AMN1 slows the growth of wild type cells. The function of the vertebrate members of this family has not been determined experimentally, they have fewer LRRs that determine the extent of this model." 
Q#19205 - CGI_10014758 superfamily 243092 9 236 9.22E-16 74.2936 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#19206 - CGI_10014759 superfamily 247724 34 348 0 589.112 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#19207 - CGI_10014760 superfamily 247724 34 54 6.77E-12 57.5364 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. 
Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#19208 - CGI_10014761 superfamily 248247 26 379 4.51E-107 334.187 cl17693 Integrin_beta superfamily - - "Integrin, beta chain; Integrins have been found in animals and their homologues have also been found in cyanobacteria, probably due to horizontal gene transfer. The sequences repeats have been trimmed due to an overlap with EGF." Q#19208 - CGI_10014761 superfamily 149701 643 681 4.95E-09 53.3741 cl07373 Integrin_b_cyt superfamily - - "Integrin beta cytoplasmic domain; Integrins are a group of transmembrane proteins which function as extracellular matrix receptors and in cell adhesion. Integrins are ubiquitously expressed and are heterodimeric, each composed of an alpha and beta subunit. Several variations of the the alpha and beta subunits exist, and association of different alpha and beta subunits can have different a different binding specificity. This domain corresponds to the cytoplasmic domain of the beta subunit." Q#19208 - CGI_10014761 superfamily 219669 545 609 0.00584176 35.8316 cl06832 Integrin_B_tail superfamily C - Integrin beta tail domain; This is the beta tail domain of the Integrin protein. Integrins are receptors which are involved in cell-cell and cell-extracellular matrix interactions. Q#19210 - CGI_10014763 superfamily 219849 521 612 1.42E-16 77.9951 cl09597 RIH_assoc superfamily - - "RyR and IP3R Homology associated; This eukaryotic domain is found in ryanodine receptors (RyR) and inositol 1,4,5-trisphosphate receptors (IP3R) which together form a superfamily of homotetrameric ligand-gated intracellular Ca2+ channels. There seems to be no known function for this domain. Also see the IP3-binding domain pfam01365 and pfam02815." 
Q#19211 - CGI_10014764 superfamily 216456 468 628 9.01E-07 49.6294 cl03182 RYDR_ITPR superfamily - - "RIH domain; The RIH (RyR and IP3R Homology) domain is an extracellular domain from two types of calcium channels. This region is found in the ryanodine receptor and the inositol-1,4,5- trisphosphate receptor. This domain may form a binding site for IP3." Q#19211 - CGI_10014764 superfamily 197746 322 368 0.000389882 40.0171 cl02624 MIR superfamily - - Domain in ryanodine and inositol trisphosphate receptors and protein O-mannosyltransferases; Domain in ryanodine and inositol trisphosphate receptors and protein O-mannosyltransferases. Q#19213 - CGI_10014766 superfamily 241619 26 95 0.000453707 38.6049 cl00112 PAN_APPLE superfamily - - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#19214 - CGI_10014767 superfamily 241787 92 170 1.33E-31 110.717 cl00326 Ribosomal_L23 superfamily - - Ribosomal protein L23; Ribosomal protein L23. Q#19214 - CGI_10014767 superfamily 202819 46 85 2.47E-09 49.9922 cl04335 Ribosomal_L23eN superfamily N - "Ribosomal protein L23, N-terminal domain; The N-terminal domain appears to be specific to the eukaryotic ribosomal proteins L25, L23, and L23a." Q#19218 - CGI_10014771 superfamily 222429 11 91 4.02E-09 51.4724 cl18676 Myb_DNA-bind_5 superfamily - - Myb/SANT-like DNA-binding domain; This presumed domain appears to be related to other Myb/SANT like DNA binding domains. This family is greatly expanded in arthropods and higher eukaryotes. 
Q#19221 - CGI_10018282 superfamily 248391 25 259 5.67E-81 247.658 cl17837 DUF1295 superfamily - - Protein of unknown function (DUF1295); This family contains a number of bacterial and eukaryotic proteins of unknown function that are approximately 300 residues long. Q#19222 - CGI_10018283 superfamily 245814 135 190 5.31E-05 40.468 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#19223 - CGI_10018284 superfamily 222150 148 170 5.05E-05 39.2973 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#19223 - CGI_10018284 superfamily 222150 177 196 0.00104251 35.4453 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#19223 - CGI_10018284 superfamily 222150 118 144 0.00274894 34.2897 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#19225 - CGI_10018286 superfamily 242406 131 276 2.59E-39 136.182 cl01271 DUF1768 superfamily - - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. Q#19235 - CGI_10018297 superfamily 241610 88 141 1.01E-19 82.683 cl00101 KU superfamily - - BPTI/Kunitz family of serine protease inhibitors; Structure is a disulfide rich alpha+beta fold. 
BPTI (bovine pancreatic trypsin inhibitor) is an extensively studied model structure. Q#19235 - CGI_10018297 superfamily 241610 182 230 3.98E-18 78.4458 cl00101 KU superfamily - - BPTI/Kunitz family of serine protease inhibitors; Structure is a disulfide rich alpha+beta fold. BPTI (bovine pancreatic trypsin inhibitor) is an extensively studied model structure. Q#19235 - CGI_10018297 superfamily 241610 354 407 9.09E-18 77.2902 cl00101 KU superfamily - - BPTI/Kunitz family of serine protease inhibitors; Structure is a disulfide rich alpha+beta fold. BPTI (bovine pancreatic trypsin inhibitor) is an extensively studied model structure. Q#19235 - CGI_10018297 superfamily 241646 290 339 4.62E-05 41.2007 cl00156 WAP superfamily - - "whey acidic protein-type four-disulfide core domains. Members of the family include whey acidic protein, elafin (elastase-specific inhibitor), caltrin-like protein (a calcium transport inhibitor) and other extracellular proteinase inhibitors. A group of proteins containing 8 characteristically-spaced cysteine residuesforming disulphide bonds, have been termed '4-disulphide core' proteins. Protease inhibition occurs by insertion of the inhibitory loop into the active site pocket and interference with the catalytic residues of the protease." Q#19237 - CGI_10018299 superfamily 218118 54 134 8.53E-13 59.9353 cl04552 CD225 superfamily - - "Interferon-induced transmembrane protein; This family includes the human leukocyte antigen CD225, which is an interferon inducible transmembrane protein, and is associated with interferon induced cell growth suppression." Q#19238 - CGI_10018300 superfamily 243107 945 989 1.28E-11 61.407 cl02611 G-patch superfamily - - "G-patch domain; This domain is found in a number of RNA binding proteins, and is also found in proteins that contain RNA binding domains. This suggests that this domain may have an RNA binding function. This domain has seven highly conserved glycines." 
Q#19238 - CGI_10018300 superfamily 243154 689 740 2.81E-06 46.0564 cl02715 Surp superfamily - - Surp module; This domain is also known as the SWAP domain. SWAP stands for Suppressor-of-White-APricot. It has been suggested that these domains may be RNA binding. Q#19239 - CGI_10018301 superfamily 243098 80 128 1.59E-08 49.5187 cl02573 TUDOR superfamily - - "Tudor domains are found in many eukaryotic organisms and have been implicated in protein-protein interactions in which methylated protein substrates bind to these domains. For example, the Tudor domain of Survival of Motor Neuron (SMN) binds to symmetrically dimethylated arginines of arginine-glycine (RG) rich sequences found in the C-terminal tails of Sm proteins. The SMN protein is linked to spinal muscular atrophy. Another example is the tandem tudor domains of 53BP1, which bind to histone H4 specifically dimethylated at Lys20 (H4-K20me2). 53BP1 is a key transducer of the DNA damage checkpoint signal." Q#19240 - CGI_10018302 superfamily 202529 114 184 4.28E-09 50.6376 cl18228 MtN3_slv superfamily - - "Sugar efflux transporter for intercellular exchange; This family includes proteins such as drosophila saliva, MtN3 involved in root nodule development and a protein involved in activation and expression of recombination activation genes (RAGs). Although the molecular function of these proteins is unknown, they are almost certainly transmembrane proteins. This family contains a region of two transmembrane helices that is found in two copies in most members of the family. This family also contains specific sugar efflux transporters that are essential for the maintenance of animal blood glucose levels, plant nectar production, and plant seed and pollen development. 
In many organisims it meditaes gluose transport; in Arabidopsis it is necessary for pollen viability; and two of the rice homologues are specifically exploited by bacterial pathogens for virulence by means of direct binding of a bacterial effector to the SWEET promoter." Q#19242 - CGI_10018304 superfamily 222302 204 290 1.87E-19 81.7141 cl16342 DUF4151 superfamily - - Domain of unknown function (DUF4151); This domain is found on dynein heavy chain proteins. The exact function is not known but it is conserved from plants to Sch. pombe to human. Q#19242 - CGI_10018304 superfamily 197732 46 73 0.000286785 38.0023 cl18195 ZnF_U1 superfamily - - "U1-like zinc finger; Family of C2H2-type zinc fingers, present in matrin, U1 small nuclear ribonucleoprotein C and other RNA-binding proteins." Q#19243 - CGI_10018305 superfamily 241999 204 361 1.01E-05 44.7957 cl00641 Cas4_I-A_I-B_I-C_I-D_II-B superfamily - - CRISPR/Cas system-associated protein Cas4; CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; Cas4 is RecB-like nuclease with three-cysteine C-terminal cluster Q#19244 - CGI_10018306 superfamily 248289 584 636 0.000117059 40.4851 cl17735 VWC superfamily - - von Willebrand factor type C domain; The high cutoff was used to prevent overlap with pfam00094. Q#19244 - CGI_10018306 superfamily 248289 260 311 0.00164385 37.0183 cl17735 VWC superfamily - - von Willebrand factor type C domain; The high cutoff was used to prevent overlap with pfam00094. Q#19244 - CGI_10018306 superfamily 248289 318 368 0.00230877 36.6331 cl17735 VWC superfamily - - von Willebrand factor type C domain; The high cutoff was used to prevent overlap with pfam00094. Q#19244 - CGI_10018306 superfamily 243049 13 67 0.00344155 36.2895 cl02472 IGFBP superfamily C - Insulin-like growth factor binding protein; Insulin-like growth factor binding protein. 
Q#19245 - CGI_10018307 superfamily 243212 176 304 2.15E-18 80.4657 cl02844 Arrestin_C superfamily - - "Arrestin (or S-antigen), C-terminal domain; Ig-like beta-sandwich fold. Scop reports duplication with N-terminal domain." Q#19245 - CGI_10018307 superfamily 215866 6 149 4.08E-12 62.728 cl18349 Arrestin_N superfamily - - "Arrestin (or S-antigen), N-terminal domain; Ig-like beta-sandwich fold. Scop reports duplication with C-terminal domain." Q#19246 - CGI_10018308 superfamily 215866 7 147 9.44E-18 77.3655 cl18349 Arrestin_N superfamily - - "Arrestin (or S-antigen), N-terminal domain; Ig-like beta-sandwich fold. Scop reports duplication with C-terminal domain." Q#19246 - CGI_10018308 superfamily 243212 174 278 4.87E-13 63.9022 cl02844 Arrestin_C superfamily - - "Arrestin (or S-antigen), C-terminal domain; Ig-like beta-sandwich fold. Scop reports duplication with N-terminal domain." Q#19247 - CGI_10018309 superfamily 247639 172 497 0 555.706 cl16914 O-FucT_like superfamily - - "GDP-fucose protein O-fucosyltransferase and related proteins; O-fucosyltransferase-like proteins are GDP-fucose dependent enzymes with similarities to the family 1 glycosyltransferases (GT1). They are soluble ER proteins that may be proteolytically cleaved from a membrane-associated preprotein, and are involved in the O-fucosylation of protein substrates, the core fucosylation of growth factor receptors, and other processes." Q#19247 - CGI_10018309 superfamily 247683 506 558 3.95E-24 96.1223 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. 
They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#19248 - CGI_10018310 superfamily 201678 208 236 0.000185752 38.6352 cl03138 PPTA superfamily - - "Protein prenyltransferase alpha subunit repeat; Both farnesyltransferase (FT) and geranylgeranyltransferase 1 (GGT1) recognise a CaaX motif on their substrates where 'a' stands for preferably aliphatic residues, whereas GGT2 recognises a completely different motif. Important substrates for FT include, amongst others, many members of the Ras superfamily. GGT1 substrates include some of the other small GTPases and GGT2 substrates include the Rab family." Q#19255 - CGI_10018317 superfamily 245205 36 70 0.001345 35.2913 cl09930 RPA_2b-aaRSs_OBF_like superfamily NC - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. 
The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." Q#19260 - CGI_10000425 superfamily 247866 3 153 1.05E-12 63.6256 cl17312 PhyH superfamily N - "Phytanoyl-CoA dioxygenase (PhyH); This family is made up of several eukaryotic phytanoyl-CoA dioxygenase (PhyH) proteins, ectoine hydroxylases and a number of bacterial deoxygenases. PhyH is a peroxisomal enzyme catalyzing the first step of phytanic acid alpha-oxidation. PhyH deficiency causes Refsum's disease (RD) which is an inherited neurological syndrome biochemically characterized by the accumulation of phytanic acid in plasma and tissues." Q#19261 - CGI_10001632 superfamily 245201 108 429 0 648.445 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#19261 - CGI_10001632 superfamily 241566 1 22 0.00445664 35.1227 cl00040 C1 superfamily N - "Protein kinase C conserved region 1 (C1) . Cysteine-rich zinc binding domain. Some members of this domain family bind phorbol esters and diacylglycerol, some are reported to bind RasGTP. May occur in tandem arrangement. 
Diacylglycerol (DAG) is a second messenger, released by activation of Phospholipase D. Phorbol Esters (PE) can act as analogues of DAG and mimic its downstream effects in, for example, tumor promotion. Protein Kinases C are activated by DAG/PE, this activation is mediated by their N-terminal conserved region (C1). DAG/PE binding may be phospholipid dependent. C1 domains may also mediate DAG/PE signals in chimaerins (a family of Rac GTPase activating proteins), RasGRPs (exchange factors for Ras/Rap1), and Munc13 isoforms (scaffolding proteins involved in exocytosis)." Q#19262 - CGI_10001633 superfamily 243066 23 126 5.47E-17 77.6577 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#19262 - CGI_10001633 superfamily 198867 136 234 3.03E-11 60.818 cl06652 BACK superfamily - - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation)." 
Q#19263 - CGI_10003124 superfamily 217473 103 326 9.88E-25 104.369 cl03978 Mab-21 superfamily - - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#19264 - CGI_10003125 superfamily 247750 10 84 1.31E-39 137.421 cl17196 E1_enzyme_family superfamily C - "Superfamily of activating enzymes (E1) of the ubiquitin-like proteins. This family includes classical ubiquitin-activating enzymes E1, ubiquitin-like (ubl) activating enzymes and other mechanistic homologes, like MoeB, Thif1 and others. The common reaction mechanism catalyzed by MoeB, ThiF and the E1 enzymes begins with a nucleophilic attack of the C-terminal carboxylate of MoaD, ThiS and ubiquitin, respectively, on the alpha-phosphate of an ATP molecule bound at the active site of the activating enzymes, leading to the formation of a high-energy acyladenylate intermediate and subsequently to the formation of a thiocarboxylate at the C termini of MoaD and ThiS." Q#19265 - CGI_10007008 superfamily 205157 347 382 1.32E-06 45.6063 cl18264 EGF_3 superfamily - - EGF domain; This family includes a variety of EGF-like domain homologues. This family includes the C-terminal domain of the malaria parasite MSP1 protein. Q#19265 - CGI_10007008 superfamily 241578 124 168 4.64E-05 43.5276 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#19267 - CGI_10007010 superfamily 245213 6 38 5.37E-05 40.6978 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#19267 - CGI_10007010 superfamily 245213 132 172 0.00403164 35.305 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#19268 - CGI_10007011 superfamily 245213 152 188 8.46E-07 47.6314 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#19268 - CGI_10007011 superfamily 245213 39 74 1.42E-06 46.861 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#19268 - CGI_10007011 superfamily 245213 76 112 1.52E-06 46.861 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#19268 - CGI_10007011 superfamily 245213 114 150 4.27E-06 45.3202 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#19268 - CGI_10007011 superfamily 245213 1055 1089 0.000403224 39.5422 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#19268 - CGI_10007011 superfamily 245213 801 832 0.00362789 36.8458 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#19268 - CGI_10007011 superfamily 241578 716 759 4.66E-09 56.6243 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#19268 - CGI_10007011 superfamily 241578 962 1008 0.0033637 38.9052 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. 
Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#19270 - CGI_10007013 superfamily 241574 46 283 1.57E-102 318.376 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#19271 - CGI_10007014 superfamily 202711 2 145 2.06E-76 227.62 cl04190 Mob1_phocein superfamily - - "Mob1/phocein family; Mob1 is an essential Saccharomyces cerevisiae protein, identified from a two-hybrid screen, that binds Mps1p, a protein kinase essential for spindle pole body duplication and mitotic checkpoint regulation. Mob1 contains no known structural motifs; however MOB1 is a member of a conserved gene family and shares sequence similarity with a nonessential yeast gene, MOB2. Mob1 is a phosphoprotein in vivo and a substrate for the Mps1p kinase in vitro. Conditional alleles of MOB1 cause a late nuclear division arrest at restrictive temperature. This family also includes phocein, a rat protein that by yeast two hybrid interacts with striatin." Q#19272 - CGI_10007015 superfamily 248097 17 46 3.79E-08 45.3338 cl17543 C1q superfamily C - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. 
Q#19274 - CGI_10003082 superfamily 241896 82 285 5.13E-96 284.008 cl00483 UDG_like superfamily - - "Uracil-DNA glycosylases (UDG) and related enzymes; Uracil-DNA glycosylases (UDG) catalyzes the removal of uracil from DNA, which initiates the DNA base excision repair pathway. Uracil in DNA can arise as a result of mis-incorporation of dUMP residues by DNA polymerase or via deamination of cytosine. Uracil in DNA mispaired with guanine is one of the major pro-mutagenic events, causing G:C->A:T mutations. Thus, UDG is an essential enzyme for maintaining the integrity of genetic information. At least five UDG families have been characterized so far; these families share similar overall folds and common active site motifs. They demonstrate different substrate specificities, but often the function of one enzyme can be complemented by the other. Family 1 enzymes are active against uracil in both ssDNA and dsDNA, and recognize uracil explicitly in an extrahelical conformation via a combination of protein and bound-water interactions. Family 2 enzymes are mismatch specific and explicitly recognize the widowed guanine on the complementary strand, rather than the extrahelical scissile pyrimidine. This allows a broader specificity so that some Family 2 enzymes can excise uracil as well as 3, N(4)-ethenocytosine from mismatches with guanine. A Family 3 UDG from human was first characterized to remove Uracil from ssDNA, hence the name hSMUG (single-strand-selective monofunctional uracil-DNA glycosylase). However, subsequent research has shown that hSMUG1 and its rat ortholog can remove uracil and its oxidized pyrimidine derivatives from both, ssDNA and dsDNA. Enzymes in Families 4 and 5 are both thermostable. Family 4 enzymes specifically recognize uracil in a manner similar to human UDG (Family 1), rather than guanine in the complementary strand DNA, as does E. coli MUG (Family 2). 
These results suggest that the mechanism by which Family 4 UDGs remove uracils from DNA is similar to that of Family 1 enzyme. Although Family 5 enzymes are close relatives of Family 4, they show different substrate specificities." Q#19277 - CGI_10011615 superfamily 238012 1973 2009 1.52E-05 45.0378 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#19277 - CGI_10011615 superfamily 238012 1841 1875 6.44E-05 43.1118 cl11390 EGF_Lam superfamily C - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#19277 - CGI_10011615 superfamily 245213 1925 1957 0.000283114 41.083 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. 
Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#19277 - CGI_10011615 superfamily 202224 174 290 2.48E-49 173.636 cl18224 JmjC superfamily - - "JmjC domain, hydroxylase; The JmjC domain belongs to the Cupin superfamily. JmjC-domain proteins may be protein hydroxylases that catalyze a novel histone modification. This is confirmed to be a hydroxylase: the human JmjC protein named Tyw5p unexpectedly acts in the biosynthesis of a hypermodified nucleoside, hydroxy-wybutosine, in tRNA-Phe by catalyzing hydroxylation." Q#19277 - CGI_10011615 superfamily 210240 13 54 1.75E-11 62.2802 cl15840 JmjN superfamily - - jmjN domain; jmjN domain. Q#19277 - CGI_10011615 superfamily 248279 1609 1641 2.71E-10 60.4283 cl17725 zf-HC5HC2H superfamily C - "PHD-like zinc-binding domain; The members of this family are annotated as containing PHD domain, but the zinc-binding region here is not typical of PHD domains. The conformation here is a well-conserved cysteine-histidine rich region spanning 90 residues, where the Cys and His are arranged as HxxC(31)CxxC(6)CxxCxxxxCxxxxHxxC (21)CxxH." Q#19277 - CGI_10011615 superfamily 247999 1570 1602 2.35E-06 47.2551 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. 
Q#19277 - CGI_10011615 superfamily 238012 1887 1925 0.00499293 37.719 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#19281 - CGI_10011619 superfamily 241597 100 165 4.13E-13 63.4157 cl00082 HMG-box superfamily - - "High Mobility Group (HMG)-box is found in a variety of eukaryotic chromosomal proteins and transcription factors. HMGs bind to the minor groove of DNA and have been classified by DNA binding preferences. Two phylogenically distinct groups of Class I proteins bind DNA in a sequence specific fashion and contain a single HMG box. One group (SOX-TCF) includes transcription factors, TCF-1, -3, -4; and also SRY and LEF-1, which bind four-way DNA junctions and duplex DNA targets. The second group (MATA) includes fungal mating type gene products MC, MATA1 and Ste11. Class II and III proteins (HMGB-UBF) bind DNA in a non-sequence specific fashion and contain two or more tandem HMG boxes. Class II members include non-histone chromosomal proteins, HMG1 and HMG2, which bind to bent or distorted DNA such as four-way DNA junctions, synthetic DNA cruciforms, kinked cisplatin-modified DNA, DNA bulges, cross-overs in supercoiled DNA, and can cause looping of linear DNA. Class III members include nucleolar and mitochondrial transcription factors, UBF and mtTF1, which bind four-way DNA junctions." Q#19283 - CGI_10011621 superfamily 248097 144 273 7.10E-15 68.4458 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. 
Q#19286 - CGI_10005435 superfamily 241563 61 97 1.65E-05 43.622 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#19287 - CGI_10002625 superfamily 244843 98 622 5.11E-139 418.943 cl08040 Ggt superfamily - - Gamma-glutamyltransferase [Amino acid transport and metabolism] Q#19288 - CGI_10002626 superfamily 244843 42 94 0.00141917 34.8993 cl08040 Ggt superfamily N - Gamma-glutamyltransferase [Amino acid transport and metabolism] Q#19289 - CGI_10002627 superfamily 222617 256 394 2.48E-39 143.565 cl16738 YHYH superfamily - - "YHYH protein; This domain family is found in bacteria, eukaryotes and viruses, and is typically between 141 and 198 amino acids in length. There is a conserved YHYH sequence motif." Q#19289 - CGI_10002627 superfamily 222617 36 174 6.00E-39 142.409 cl16738 YHYH superfamily - - "YHYH protein; This domain family is found in bacteria, eukaryotes and viruses, and is typically between 141 and 198 amino acids in length. There is a conserved YHYH sequence motif." Q#19289 - CGI_10002627 superfamily 207546 689 749 5.11E-33 122.529 cl02165 CBFB_NFYA superfamily - - CCAAT-binding transcription factor (CBF-B/NF-YA) subunit B; CCAAT-binding transcription factor (CBF-B/NF-YA) subunit B. Q#19290 - CGI_10002628 superfamily 241733 5 74 3.83E-40 128.403 cl00259 Sm_like superfamily - - "Sm and related proteins; The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. 
Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes." Q#19291 - CGI_10021774 superfamily 216653 155 276 3.10E-09 53.3699 cl08331 Na_Ca_ex superfamily - - "Sodium/calcium exchanger protein; This is a family of sodium/calcium exchanger integral membrane proteins. This family covers the integral membrane regions of the proteins. Sodium/calcium exchangers regulate intracellular Ca2+ concentrations in many cells; cardiac myocytes, epithelial cells, neurons, retinal rod photoreceptors and smooth muscle cells. Ca2+ is moved into or out of the cytosol depending on Na+ concentration. In humans and rats there are 3 isoforms; NCX1, NCX2 and NCX3." Q#19292 - CGI_10021775 superfamily 219000 185 345 3.10E-16 78.4571 cl05717 Drf_FH3 superfamily - - Diaphanous FH3 Domain; This region is found in the Formin-like and diaphanous proteins. Q#19292 - CGI_10021775 superfamily 219001 70 177 1.57E-05 45.3775 cl05720 Drf_GBD superfamily N - "Diaphanous GTPase-binding Domain; This domain is bound to by GTP-attached Rho proteins, leading to activation of the Drf protein." 
Q#19293 - CGI_10021776 superfamily 247684 58 190 1.28E-08 54.976 cl17037 NBD_sugar-kinase_HSP70_actin superfamily N - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#19294 - CGI_10021777 superfamily 247684 27 131 1.47E-06 44.9608 cl17037 NBD_sugar-kinase_HSP70_actin superfamily NC - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#19298 - CGI_10021781 superfamily 241563 45 82 1.41E-08 51.9043 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#19298 - CGI_10021781 superfamily 128778 89 214 1.58E-21 91.1722 cl17972 BBC superfamily - - B-Box C-terminal domain; Coiled coil region C-terminal to (some) B-Box domains Q#19298 - CGI_10021781 superfamily 216033 257 347 5.72E-20 85.8484 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#19298 - CGI_10021781 superfamily 110440 594 621 4.57E-07 47.4025 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#19298 - CGI_10021781 superfamily 110440 547 574 2.30E-05 42.3949 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#19298 - CGI_10021781 superfamily 110440 638 665 0.000308854 38.9281 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. 
The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#19298 - CGI_10021781 superfamily 110440 500 527 0.000978473 37.3873 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#19300 - CGI_10021783 superfamily 219000 145 310 5.83E-22 95.4059 cl05717 Drf_FH3 superfamily - - Diaphanous FH3 Domain; This region is found in the Formin-like and diaphanous proteins. Q#19300 - CGI_10021783 superfamily 219001 19 139 2.98E-10 59.6299 cl05720 Drf_GBD superfamily N - "Diaphanous GTPase-binding Domain; This domain is bound to by GTP-attached Rho proteins, leading to activation of the Drf protein." Q#19303 - CGI_10021786 superfamily 245226 170 359 1.46E-49 174.78 cl10012 DnaQ_like_exo superfamily - - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. 
The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#19305 - CGI_10021788 superfamily 248363 9 138 3.57E-59 181.657 cl17809 Sedlin_N superfamily - - "Sedlin, N-terminal conserved region; Mutations in this protein are associated with the X-linked spondyloepiphyseal dysplasia tarda syndrome (OMIM:313400). This family represents an N-terminal conserved region." Q#19306 - CGI_10021789 superfamily 245213 393 427 1.29E-05 42.6238 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#19307 - CGI_10021790 superfamily 243074 9 55 7.12E-06 43.2641 cl02535 F-box-like superfamily - - F-box-like; This is an F-box-like family. Q#19308 - CGI_10021791 superfamily 243074 13 58 8.19E-05 40.1825 cl02535 F-box-like superfamily - - F-box-like; This is an F-box-like family. Q#19309 - CGI_10021792 superfamily 245814 55 146 0.000111938 40.4996 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#19310 - CGI_10021793 superfamily 243119 128 173 0.00314156 33.5613 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#19312 - CGI_10021795 superfamily 247743 234 286 0.00128589 37.8959 cl17189 AAA superfamily C - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. 
The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#19313 - CGI_10021796 superfamily 241600 1 165 1.23E-61 191.685 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#19317 - CGI_10021800 superfamily 241578 219 373 5.17E-24 98.903 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. 
There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#19317 - CGI_10021800 superfamily 241578 411 549 7.47E-14 69.517 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#19317 - CGI_10021800 superfamily 245213 136 172 1.56E-11 59.9578 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. 
EGF_CA can be found in tandem repeat arrangements." Q#19317 - CGI_10021800 superfamily 245213 174 211 3.74E-06 44.5498 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#19318 - CGI_10021801 superfamily 245213 14 50 4.20E-09 48.0166 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#19318 - CGI_10021801 superfamily 245213 52 89 0.000482888 33.7642 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. 
EGF_CA can be found in tandem repeat arrangements." Q#19321 - CGI_10021804 superfamily 219000 149 337 2.72E-21 92.7095 cl05717 Drf_FH3 superfamily - - Diaphanous FH3 Domain; This region is found in the Formin-like and diaphanous proteins. Q#19321 - CGI_10021804 superfamily 219001 7 145 2.12E-10 59.6299 cl05720 Drf_GBD superfamily - - "Diaphanous GTPase-binding Domain; This domain is bound to by GTP-attached Rho proteins, leading to activation of the Drf protein." Q#19325 - CGI_10011454 superfamily 245201 1 217 2.08E-94 279.421 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#19326 - CGI_10011455 superfamily 241600 208 419 1.81E-85 262.176 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." 
Q#19326 - CGI_10011455 superfamily 241600 4 129 3.48E-38 138.142 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#19327 - CGI_10011456 superfamily 245205 13 113 3.52E-06 45.7003 cl09930 RPA_2b-aaRSs_OBF_like superfamily - - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). 
aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." Q#19327 - CGI_10011456 superfamily 241640 430 533 0.000175403 40.6337 cl00149 Tryp_SPc superfamily C - Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues. Q#19329 - CGI_10011458 superfamily 241563 52 92 7.50E-06 42.4664 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#19331 - CGI_10011460 superfamily 245596 16 217 5.05E-45 154.773 cl11394 Glyco_tranf_GTA_type superfamily - - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. 
But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#19333 - CGI_10011462 superfamily 241600 1 134 4.38E-40 135.061 cl00085 FReD superfamily C - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#19334 - CGI_10011463 superfamily 241563 57 100 2.24E-06 44.3924 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#19335 - CGI_10011464 superfamily 241563 63 102 3.05E-06 44.7776 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#19336 - CGI_10011465 superfamily 243092 399 694 1.02E-65 222.595 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#19336 - CGI_10011465 superfamily 243092 36 220 7.37E-40 149.793 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#19336 - CGI_10011465 superfamily 217837 793 899 1.23E-19 86.1073 cl04367 Utp12 superfamily - - Dip2/Utp12 Family; This domain is found at the C-terminus of proteins containing WD40 repeats. These proteins are part of the U3 ribonucleoprotein the yeast protein is called Utp12 or DIP2. Q#19337 - CGI_10013390 superfamily 219501 25 65 0.00286561 36.1542 cl06622 MNNL superfamily C - N terminus of Notch ligand; This entry represents a region of conserved sequence at the N terminus of several Notch ligand proteins. Q#19338 - CGI_10013391 superfamily 243072 170 291 2.39E-32 119.796 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. 
The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#19338 - CGI_10013391 superfamily 243072 36 163 3.22E-29 111.321 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#19338 - CGI_10013391 superfamily 243072 243 357 1.12E-27 107.084 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#19338 - CGI_10013391 superfamily 246680 410 476 1.14E-05 43.5328 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. 
They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#19339 - CGI_10013392 superfamily 207609 146 206 4.73E-07 45.9328 cl02481 NGF superfamily N - Nerve growth factor family; Nerve growth factor family. Q#19340 - CGI_10013393 superfamily 198867 45 144 6.12E-27 104.163 cl06652 BACK superfamily - - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation)." Q#19340 - CGI_10013393 superfamily 243146 231 282 1.95E-09 53.9913 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#19340 - CGI_10013393 superfamily 243146 434 479 2.72E-08 50.3526 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#19340 - CGI_10013393 superfamily 243146 335 378 5.54E-08 49.5822 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#19340 - CGI_10013393 superfamily 243146 282 333 2.33E-06 44.9598 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#19340 - CGI_10013393 superfamily 243146 192 240 1.07E-05 43.0467 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. 
" Q#19340 - CGI_10013393 superfamily 243066 1 36 8.66E-05 40.6785 cl02518 BTB superfamily N - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#19340 - CGI_10013393 superfamily 243146 405 444 0.000262372 39.0787 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#19344 - CGI_10013397 superfamily 219619 365 433 2.58E-12 62.6103 cl18518 Ion_trans_2 superfamily - - Ion channel; This family includes the two membrane helix type ion channels found in bacteria. Q#19344 - CGI_10013397 superfamily 243066 9 90 1.00E-08 52.5553 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. 
The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#19345 - CGI_10013398 superfamily 245304 115 404 2.50E-174 508.638 cl10459 Peptidases_S8_S53 superfamily - - "Peptidase domain in the S8 and S53 families; Members of the peptidases S8 (subtilisin and kexin) and S53 (sedolisin) family include endopeptidases and exopeptidases. The S8 family has an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure and are not homologous to trypsin. Serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The S53 family contains a catalytic triad Glu/Asp/Ser with an additional acidic residue Asp in the oxyanion hole, similar to that of subtilisin. The serine residue here is the nucleophilic equivalent of the serine residue in the S8 family, while glutamic acid has the same role here as the histidine base. However, the aspartic acid residue that acts as an electrophile is quite different. In S53, it follows glutamic acid, while in S8 it precedes histidine. The stability of these enzymes may be enhanced by calcium; some members have been shown to bind up to 4 ions via binding sites with different affinity. There is a great diversity in the characteristics of their members: some contain disulfide bonds, some are intracellular while others are extracellular, some function at extreme temperatures, and others at high or low pH values." Q#19345 - CGI_10013398 superfamily 241585 721 754 6.50E-05 41.7356 cl00066 FU superfamily C - Furin-like repeats. Cysteine rich region. Exact function of the domain is not known. 
Furin is a serine-kinase dependent proprotein processor. Other members of this family include endoproteases and cell surface receptors. Q#19345 - CGI_10013398 superfamily 241585 665 696 0.000231131 40.1948 cl00066 FU superfamily N - Furin-like repeats. Cysteine rich region. Exact function of the domain is not known. Furin is a serine-kinase dependent proprotein processor. Other members of this family include endoproteases and cell surface receptors. Q#19345 - CGI_10013398 superfamily 201820 489 575 2.32E-33 124.66 cl08326 P_proprotein superfamily - - Proprotein convertase P-domain; A unique feature of the eukaryotic subtilisin-like proprotein convertases is the presence of an additional highly conserved sequence of approximately 150 residues (P domain) located immediately downstream of the catalytic domain. Q#19346 - CGI_10013399 superfamily 243072 7 50 2.00E-09 53.5414 cl02529 ANK superfamily NC - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#19349 - CGI_10013402 superfamily 241733 2 66 2.59E-25 99.3152 cl00259 Sm_like superfamily - - "Sm and related proteins; The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. 
Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes." Q#19349 - CGI_10013402 superfamily 241779 343 494 2.54E-13 67.6415 cl00318 YjeF_N superfamily - - "YjeF-related protein N-terminus; YjeF-N domain is a novel version of the Rossmann fold with a set of catalytic residues and structural features that are different from the conventional dehydrogenases. YjeF-N domain is fused to Ribokinases in bacteria (YjeF), where they may be phosphatases, and to divergent Sm and the FDF domain in eukaryotes (Dcp3p and FLJ21128), where they may be involved in decapping and catalyze hydrolytic RNA-processing reactions." Q#19349 - CGI_10013402 superfamily 220282 241 335 3.77E-09 54.4122 cl09757 FDF superfamily - - "FDF domain; The FDF domain, so called because of the conserved FDF at its N termini, is an entirely alpha-helical domain with multiple exposed hydrophilic loops. It is found at the C terminus of Scd6p-like SM domains. It is also found with other divergent Sm domains and in proteins such as Dcp3p and FLJ21128, where it is found N terminal to the YjeF-N domain, a novel Rossmann fold domain." Q#19350 - CGI_10013403 superfamily 241563 63 95 4.70E-06 44.2003 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#19350 - CGI_10013403 superfamily 245027 97 228 0.000726515 38.5284 cl09176 FlgN superfamily - - FlgN protein; This family includes the FlgN protein and export chaperone involved in flagellar synthesis. 
Q#19354 - CGI_10018134 superfamily 247792 33 71 0.000422296 37.04 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#19354 - CGI_10018134 superfamily 190233 176 233 0.00703345 33.5818 cl08341 zf-TRAF superfamily - - TRAF-type zinc finger; TRAF-type zinc finger. Q#19355 - CGI_10018135 superfamily 246669 239 361 3.73E-51 172.056 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." 
Q#19355 - CGI_10018135 superfamily 246669 370 508 3.98E-39 139.393 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#19356 - CGI_10018136 superfamily 247068 39 137 2.96E-24 98.1545 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." 
Q#19356 - CGI_10018136 superfamily 247068 249 348 1.07E-21 90.8357 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#19356 - CGI_10018136 superfamily 247068 496 594 6.46E-14 68.4941 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#19356 - CGI_10018136 superfamily 247068 603 687 1.93E-09 55.3974 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#19357 - CGI_10018137 superfamily 243066 266 356 1.90E-13 66.5604 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#19357 - CGI_10018137 superfamily 220077 462 508 7.11E-06 44.7331 cl07512 DUF1916 superfamily N - Domain of unknown function (DUF1916); This domain is found in various eukaryotic HBS1-like proteins. Q#19358 - CGI_10018138 superfamily 243161 14 101 3.97E-16 71.2713 cl02739 THAP superfamily - - "THAP domain; The THAP domain is a putative DNA-binding domain (DBD) and probably also binds a zinc ion. It features the conserved C2CH architecture (consensus sequence: Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His). 
Other universal features include the location of the domain at the N-termini of proteins, its size of about 90 residues, a C-terminal AVPTIF box and several other conserved residues. Orthologues of the human THAP domain have been identified in other vertebrates and probably worms and flies, but not in other eukaryotes or any prokaryotes." Q#19361 - CGI_10018141 superfamily 245201 269 437 2.86E-09 56.8613 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#19362 - CGI_10018142 superfamily 247038 229 306 1.17E-07 50.3582 cl15674 IPT superfamily - - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." Q#19362 - CGI_10018142 superfamily 208843 46 163 0.000984111 39.2112 cl08275 RHD-n superfamily - - "N-terminal sub-domain of the Rel homology domain (RHD); Proteins containing the Rel homology domain (RHD) are metazoan transcription factors. 
The RHD is composed of two structural sub-domains; this model characterizes the N-terminal sub-domain, which may be distantly related to the DNA-binding domain found in P53. The C-terminal sub-domain has an immunoglobulin-like fold and serves as a dimerization module that also binds DNA (see cd00102). The RHD is found in NF-kappa B, nuclear factor of activated T-cells (NFAT), the tonicity-responsive enhancer binding protein (TonEBP), and the arthropod proteins Dorsal and Relish (Rel)." Q#19363 - CGI_10018143 superfamily 219817 110 241 3.85E-15 74.5768 cl07129 Xpo1 superfamily - - "Exportin 1-like protein; The sequences featured in this family are similar to a region close to the N-terminus of yeast exportin 1 (Xpo1, Crm1). This region is found just C-terminal to an importin-beta N-terminal domain (pfam03810) in many members of this family. Exportin 1 is a nuclear export receptor that interacts with leucine-rich nuclear export signal (NES) sequences, and Ran-GTP, and is involved in translocation of proteins out of the nucleus." Q#19363 - CGI_10018143 superfamily 243689 32 99 4.00E-09 54.9421 cl04271 IBN_N superfamily - - Importin-beta N-terminal domain; Importin-beta N-terminal domain. 
Q#19365 - CGI_10018145 superfamily 247792 18 68 1.11E-06 46.2848 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#19365 - CGI_10018145 superfamily 115400 562 584 0.00469044 35.6489 cl06002 SBBP superfamily N - Beta-propeller repeat; This family is related to pfam00400 and is likely to also form a beta-propeller. SBBP stands for Seven Bladed Beta Propeller. Q#19366 - CGI_10018146 superfamily 247792 18 68 3.75E-07 47.8256 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#19368 - CGI_10018148 superfamily 247792 18 68 4.49E-07 47.4404 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in 
mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#19374 - CGI_10018154 superfamily 247724 27 230 3.35E-35 126.881 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." 
Q#19377 - CGI_10018157 superfamily 187408 333 666 1.47E-127 399.743 cl14654 V_Alix_like superfamily - - "Protein-interacting V-domain of mammalian Alix and related domains; This superfamily contains the V-shaped (V) domain of mammalian Alix (apoptosis-linked gene-2 interacting protein X), His-Domain type N23 protein tyrosine phosphatase (HD-PTP, also known as PTPN23), Bro1 and Rim20 (also known as PalA) from Saccharomyces cerevisiae, and related domains. Alix, HD-PTP, Bro1, and Rim20 all interact with the ESCRT (Endosomal Sorting Complexes Required for Transport) system. Alix, also known as apoptosis-linked gene-2 interacting protein 1 (AIP1), participates in membrane remodeling processes during the budding of enveloped viruses, vesicle budding inside late endosomal multivesicular bodies (MVBs), and the abscission reactions of mammalian cell division. It also functions in apoptosis. HD-PTP functions in cell migration and endosomal trafficking, Bro1 in endosomal trafficking, and Rim20 in the response to the external pH via the Rim101 pathway. The Alix V-domain contains a binding site, partially conserved in this superfamily, for the retroviral late assembly (L) domain YPXnL motif. The Alix V-domain is also a dimerization domain. Members of this superfamily have an N-terminal Bro1-like domain, which binds components of the ESCRT-III complex. The Bro1-like domains of Alix and HD-PTP can also bind human immunodeficiency virus type 1 (HIV-1) nucleocapsid. Many members, including Alix, HD-PTP, and Bro1, also have a proline-rich region (PRR), which binds multiple partners in Alix, including Tsg101 (tumor susceptibility gene 101, a component of ESCRT-1) and the apoptotic protein ALG-2. The C-terminal portion (V-domain and PRR) of Bro1 interacts with Doa4, a ubiquitin thiolesterase needed to remove ubiquitin from MVB cargoes; it interacts with a YPxL motif in Doa4s catalytic domain to stimulate its deubiquitination activity. 
Rim20 may bind the ESCRT-III subunit Snf7, bringing the protease Rim13 (a YPxL-containing transcription factor) into proximity with Rim101, and promoting the proteolytic activation of Rim101. HD-PTP is encoded by the PTPN23 gene, a tumor suppressor gene candidate often absent in human kidney, breast, lung, and cervical tumors. HD-PTP has a C-terminal catalytically inactive tyrosine phosphatase domain." Q#19377 - CGI_10018157 superfamily 187403 153 328 1.14E-73 250.808 cl14649 BRO1_Alix_like superfamily N - "Protein-interacting Bro1-like domain of mammalian Alix and related domains; This superfamily includes the Bro1-like domains of mammalian Alix (apoptosis-linked gene-2 interacting protein X), His-Domain type N23 protein tyrosine phosphatase (HD-PTP, also known as PTPN23), RhoA-binding proteins Rhophilin-1 and Rhophilin-2, Brox, Bro1 and Rim20 (also known as PalA) from Saccharomyces cerevisiae, and related domains. Alix, HD-PTP, Brox, Bro1 and Rim20 interact with the ESCRT (Endosomal Sorting Complexes Required for Transport) system. Alix, also known as apoptosis-linked gene-2 interacting protein 1 (AIP1), participates in membrane remodeling processes during the budding of enveloped viruses, vesicle budding inside late endosomal multivesicular bodies (MVBs), and the abscission reactions of mammalian cell division. It also functions in apoptosis. HD-PTP functions in cell migration and endosomal trafficking, Bro1 in endosomal trafficking, and Rim20 in the response to the external pH via the Rim101 pathway. Bro1-like domains are boomerang-shaped, and part of the domain is a tetratricopeptide repeat (TPR)-like structure. Bro1-like domains bind components of the ESCRT-III complex: CHMP4 (in the case of Alix, HD-PTP, and Brox) and Snf7 (in the case of yeast Bro1, and Rim20). The single domain protein human Brox, and the isolated Bro1-like domains of Alix, HD-PTP and Rhophilin can bind human immunodeficiency virus type 1 (HIV-1) nucleocapsid. 
Alix, HD-PTP, Bro1, and Rim20 also have a V-shaped (V) domain, which in the case of Alix, has been shown to be a dimerization domain and to contain a binding site for the retroviral late assembly (L) domain YPXnL motif, which is partially conserved in this superfamily. Alix, HD-PTP and Bro1 also have a proline-rich region (PRR); the Alix PRR binds multiple partners. Rhophilin-1, and -2, in addition to this Bro1-like domain, have an N-terminal Rho-binding domain and a C-terminal PDZ (PS.D.-95, Disc-large, ZO-1) domain. HD-PTP is encoded by the PTPN23 gene, a tumor suppressor gene candidate frequently absent in human kidney, breast, lung, and cervical tumors. This protein has a C-terminal, catalytically inactive tyrosine phosphatase domain." Q#19378 - CGI_10018158 superfamily 241574 272 501 1.54E-68 226.313 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#19381 - CGI_10018162 superfamily 243072 1246 1371 2.23E-35 132.893 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#19381 - CGI_10018162 superfamily 243072 683 808 3.25E-32 124.033 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#19381 - CGI_10018162 superfamily 243072 1147 1272 6.61E-32 122.877 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#19381 - CGI_10018162 superfamily 243072 951 1074 1.19E-30 119.411 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#19381 - CGI_10018162 superfamily 243072 782 907 1.76E-30 119.025 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#19381 - CGI_10018162 superfamily 243072 1015 1140 3.67E-30 117.87 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#19381 - CGI_10018162 superfamily 243072 1316 1437 4.92E-30 117.485 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#19381 - CGI_10018162 superfamily 243072 919 950 0.000266905 40.6152 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#19382 - CGI_10019630 superfamily 247725 8 146 2.11E-71 214.91 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#19383 - CGI_10019631 superfamily 247799 157 221 1.44E-19 83.2947 cl17245 KH-I superfamily - - "K homology RNA-binding domain, type I. KH binds single-stranded RNA or DNA. It is found in a wide variety of proteins including ribosomal proteins, transcription factors and post-transcriptional modifiers of mRNA. There are two different KH domains that belong to different protein folds, but they share a single KH motif. The KH motif is folded into a beta alpha alpha beta unit. In addition to the core, type II KH domains (e.g. ribosomal protein S3) include N-terminal extension and type I KH domains (e.g. hnRNP K) contain C-terminal extension." Q#19383 - CGI_10019631 superfamily 247799 79 140 5.43E-14 67.5015 cl17245 KH-I superfamily - - "K homology RNA-binding domain, type I. KH binds single-stranded RNA or DNA. It is found in a wide variety of proteins including ribosomal proteins, transcription factors and post-transcriptional modifiers of mRNA. There are two different KH domains that belong to different protein folds, but they share a single KH motif. The KH motif is folded into a beta alpha alpha beta unit. In addition to the core, type II KH domains (e.g. ribosomal protein S3) include N-terminal extension and type I KH domains (e.g. hnRNP K) contain C-terminal extension." 
Q#19383 - CGI_10019631 superfamily 247799 472 535 4.17E-08 50.6363 cl17245 KH-I superfamily - - "K homology RNA-binding domain, type I. KH binds single-stranded RNA or DNA. It is found in a wide variety of proteins including ribosomal proteins, transcription factors and post-transcriptional modifiers of mRNA. There are two different KH domains that belong to different protein folds, but they share a single KH motif. The KH motif is folded into a beta alpha alpha beta unit. In addition to the core, type II KH domains (e.g. ribosomal protein S3) include N-terminal extension and type I KH domains (e.g. hnRNP K) contain C-terminal extension." Q#19386 - CGI_10019634 superfamily 241559 67 174 1.28E-11 63.4839 cl00030 CH superfamily - - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." Q#19386 - CGI_10019634 superfamily 243094 1023 1302 1.27E-136 427.773 cl02569 RasGAP superfamily - - "Ras GTPase Activating Domain; RasGAP functions as an enhancer of the hydrolysis of GTP that is bound to Ras-GTPases. Proteins having a RasGAP domain include p120GAP, IQGAP, Rab5-activating protein 6, and Neurofibromin, among others. Although the Rho (Ras homolog) GTPases are most closely related to members of the Ras family, RhoGAP and RasGAP exhibit no similarity at their amino acid sequence level. RasGTPases function as molecular switches in a large number of signaling pathways. They are in the on state when bound to GTP, and in the off state when bound to GDP. The RasGAP domain speeds up the hydrolysis of GTP in Ras-like proteins acting as a negative regulator." 
Q#19386 - CGI_10019634 superfamily 217754 1416 1551 2.64E-32 124.666 cl04284 RasGAP_C superfamily - - RasGAP C-terminus; RasGAP C-terminus. Q#19386 - CGI_10019634 superfamily 210118 823 843 0.000377822 40.012 cl15479 IQ superfamily - - IQ calmodulin-binding motif; Calmodulin-binding motif. Q#19386 - CGI_10019634 superfamily 210118 763 783 0.0023111 37.6903 cl15479 IQ superfamily - - IQ calmodulin-binding motif; Calmodulin-binding motif. Q#19387 - CGI_10019635 superfamily 245814 130 204 2.35E-06 45.1727 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#19388 - CGI_10019636 superfamily 243053 1141 1373 1.57E-67 229.061 cl02485 RasGEF superfamily - - "Guanine nucleotide exchange factor for Ras-like small GTPases. Small GTP-binding proteins of the Ras superfamily function as molecular switches in fundamental events such as signal transduction, cytoskeleton dynamics and intracellular trafficking. Guanine-nucleotide-exchange factors (GEFs) positively regulate these GTP-binding proteins in response to a variety of signals. GEFs catalyze the dissociation of GDP from the inactive GTP-binding proteins. GTP can then bind and induce structural changes that allow interaction with effectors." Q#19388 - CGI_10019636 superfamily 243096 157 340 4.94E-36 136.659 cl02571 RhoGEF superfamily - - Guanine nucleotide exchange factor for Rho/Rac/Cdc42-like GTPases; Also called Dbl-homologous (DH) domain. 
It appears that PH domains invariably occur C-terminal to RhoGEF/DH domains. Q#19388 - CGI_10019636 superfamily 243067 606 642 7.54E-05 42.7848 cl02520 REM superfamily C - "Guanine nucleotide exchange factor for Ras-like GTPases; N-terminal domain (RasGef_N), also called REM domain (Ras exchanger motif). This domain is common in nucleotide exchange factors for Ras-like small GTPases and is typically found immediately N-terminal to the RasGef (Cdc25-like) domain. REM contacts the GTPase and is assumed to participate in the catalytic activity of the exchange factor. Proteins with the REM domain include Sos1 and Sos2, which relay signals from tyrosine-kinase mediated signalling to Ras, RasGRP1-4, RasGRF1,2, CNrasGEF, and RAP-specific nucleotide exchange factors, to name a few." Q#19388 - CGI_10019636 superfamily 243067 1034 1116 0.000167436 41.6292 cl02520 REM superfamily N - "Guanine nucleotide exchange factor for Ras-like GTPases; N-terminal domain (RasGef_N), also called REM domain (Ras exchanger motif). This domain is common in nucleotide exchange factors for Ras-like small GTPases and is typically found immediately N-terminal to the RasGef (Cdc25-like) domain. REM contacts the GTPase and is assumed to participate in the catalytic activity of the exchange factor. Proteins with the REM domain include Sos1 and Sos2, which relay signals from tyrosine-kinase mediated signalling to Ras, RasGRP1-4, RasGRF1,2, CNrasGEF, and RAP-specific nucleotide exchange factors, to name a few." Q#19388 - CGI_10019636 superfamily 247725 10 53 0.00103012 39.6616 cl17171 PH-like superfamily N - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. 
They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#19389 - CGI_10019637 superfamily 247725 20 91 4.91E-28 100.523 cl17171 PH-like superfamily C - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#19390 - CGI_10019638 superfamily 247042 76 240 2.72E-22 97.1103 cl15693 Sema superfamily N - "The Sema domain, a protein interacting module, of semaphorins and plexins; Both semaphorins and plexins have a Sema domain on their N-termini. Plexins function as receptors for the semaphorins. Evolutionarily, plexins may be the ancestor of semaphorins. Semaphorins are regulatory molecules in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems, and cancer. Semaphorins can be divided into 7 classes. Vertebrates have members in classes 3-7, whereas classes 1 and 2 are known only in invertebrates. Class 2 and 3 semaphorins are secreted; classes 1 and 4 through 6 are transmembrane proteins; and class 7 is membrane associated via glycosylphosphatidylinositol (GPI) linkage. Plexins are a large family of transmembrane proteins, which are divided into four types (A-D) according to sequence similarity. 
In vertebrates, type A plexins serve as co-receptors for neuropilins to mediate the signalling of class 3 semaphorins. Plexins serve as direct receptors for several other members of the semaphorin family: class 6 semaphorins signal through type A plexins and class 4 semaphorins through type B plexins. This family also includes the MET and RON receptor tyrosine kinases. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves to recognize and bind receptors." Q#19394 - CGI_10019642 superfamily 220381 3 179 3.90E-32 118.717 cl10736 Use1 superfamily - - Membrane fusion protein Use1; This entry is of a family of proteins all approximately 300 residues in length. The proteins have a single C-terminal trans-membrane domain and a SNARE [soluble NSF (N-ethylmaleimide-sensitive fusion protein) attachment protein receptor] domain of approximately 60 residues. The SNARE domains are essential for membrane fusion and are conserved from yeasts to humans. Use1 is one of the three protein subunits that make up the SNARE complex and it is specifically required for Golgi-endoplasmic reticulum retrograde transport. Q#19396 - CGI_10019644 superfamily 215724 50 355 5.15E-151 430.887 cl14706 wnt superfamily - - "wnt family; Wnt genes have been identified in vertebrates and invertebrates but not in plants, unicellular eukaryotes or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathway) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families." Q#19397 - CGI_10019645 superfamily 217473 141 388 2.68E-05 45.0485 cl03978 Mab-21 superfamily C - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. 
elegans these proteins are required for several aspects of embryonic development. Q#19398 - CGI_10019646 superfamily 217473 177 297 5.71E-12 65.0789 cl03978 Mab-21 superfamily NC - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#19398 - CGI_10019646 superfamily 243034 481 502 0.00953095 34.3156 cl02429 TPR superfamily C - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#19399 - CGI_10019647 superfamily 247792 16 63 4.02E-07 47.4404 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- 
H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#19403 - CGI_10019652 superfamily 247941 224 348 5.08E-12 62.7384 cl17387 Methyltransf_21 superfamily - - "Methyltransferase FkbM domain; This family has members from bacteria to human, and appears to be a methyltransferase." Q#19404 - CGI_10019653 superfamily 245304 196 429 1.11E-105 333.758 cl10459 Peptidases_S8_S53 superfamily N - "Peptidase domain in the S8 and S53 families; Members of the peptidases S8 (subtilisin and kexin) and S53 (sedolisin) family include endopeptidases and exopeptidases. The S8 family has an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure and are not homologous to trypsin. Serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The S53 family contains a catalytic triad Glu/Asp/Ser with an additional acidic residue Asp in the oxyanion hole, similar to that of subtilisin. The serine residue here is the nucleophilic equivalent of the serine residue in the S8 family, while glutamic acid has the same role here as the histidine base. However, the aspartic acid residue that acts as an electrophile is quite different. In S53, it follows glutamic acid, while in S8 it precedes histidine. The stability of these enzymes may be enhanced by calcium; some members have been shown to bind up to 4 ions via binding sites with different affinity. 
There is a great diversity in the characteristics of their members: some contain disulfide bonds, some are intracellular while others are extracellular, some function at extreme temperatures, and others at high or low pH values." Q#19404 - CGI_10019653 superfamily 241585 642 681 3.31E-05 42.8912 cl00066 FU superfamily C - Furin-like repeats. Cysteine rich region. Exact function of the domain is not known. Furin is a serine-kinase dependent proprotein processor. Other members of this family include endoproteases and cell surface receptors. Q#19404 - CGI_10019653 superfamily 201820 513 607 6.82E-37 135.061 cl08326 P_proprotein superfamily - - Proprotein convertase P-domain; A unique feature of the eukaryotic subtilisin-like proprotein convertases is the presence of an additional highly conserved sequence of approximately 150 residues (P domain) located immediately downstream of the catalytic domain. Q#19404 - CGI_10019653 superfamily 241585 788 829 2.59E-05 42.8822 cl00066 FU superfamily - - Furin-like repeats. Cysteine rich region. Exact function of the domain is not known. Furin is a serine-kinase dependent proprotein processor. Other members of this family include endoproteases and cell surface receptors. Q#19404 - CGI_10019653 superfamily 243212 2 68 3.18E-05 43.4866 cl02844 Arrestin_C superfamily N - "Arrestin (or S-antigen), C-terminal domain; Ig-like beta-sandwich fold. Scop reports duplication with N-terminal domain." Q#19404 - CGI_10019653 superfamily 241585 885 928 0.000161475 40.571 cl00066 FU superfamily - - Furin-like repeats. Cysteine rich region. Exact function of the domain is not known. Furin is a serine-kinase dependent proprotein processor. Other members of this family include endoproteases and cell surface receptors. Q#19404 - CGI_10019653 superfamily 241585 686 733 0.000906232 38.645 cl00066 FU superfamily - - Furin-like repeats. Cysteine rich region. Exact function of the domain is not known. 
Furin is a serine-kinase dependent proprotein processor. Other members of this family include endoproteases and cell surface receptors. Q#19405 - CGI_10019654 superfamily 241758 79 230 8.00E-49 164.651 cl00292 AANH_like superfamily - - "Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases, ATP sulphurylases, Universal Stress Response protein and electron transfer flavoprotein (ETF). The domain forms an alpha/beta/alpha fold which binds to adenosine nucleotide." Q#19405 - CGI_10019654 superfamily 189709 267 352 1.09E-34 124.203 cl02960 ETF_alpha superfamily - - "Electron transfer flavoprotein FAD-binding domain; This domain is found at the C-terminus of electron transfer flavoprotein alpha chain and binds to FAD. The fold consists of a five-stranded parallel beta sheet as the core of the domain, flanked by alternating helices. A small part of this domain is donated by the beta chain." Q#19409 - CGI_10019658 superfamily 220080 257 393 7.35E-67 216.646 cl07526 DUF1900 superfamily - - "Domain of unknown function (DUF1900); This domain is predominantly found in the structural protein coronin, and is duplicated in some sequences. It has no known function." Q#19409 - CGI_10019658 superfamily 149883 5 67 1.99E-32 119.651 cl07525 DUF1899 superfamily - - Domain of unknown function (DUF1899); This set of domains is found in various eukaryotic proteins. Function is unknown. 
Q#19409 - CGI_10019658 superfamily 243092 77 301 3.10E-21 93.5536 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#19411 - CGI_10007345 superfamily 218493 412 556 1.12E-48 166.378 cl08434 GMC_oxred_C superfamily - - GMC oxidoreductase; This domain found associated with pfam00732. Q#19411 - CGI_10007345 superfamily 248054 13 69 0.00623642 35.1404 cl17500 NAD_binding_8 superfamily C - NAD(P)-binding Rossmann-like domain; NAD(P)-binding Rossmann-like domain. Q#19412 - CGI_10007346 superfamily 247856 8 71 1.45E-05 38.2977 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. 
EF-hands tend to occur in pairs or higher copy numbers." Q#19413 - CGI_10007347 superfamily 149431 795 861 1.14E-28 113.177 cl07111 LLGL superfamily C - LLGL2; This domain is found in lethal giant larvae homolog 2 (LLGL2) proteins and syntaxin-binding proteins like tomosyn. It has been identified in eukaryotes and tends to be found together with WD repeats (pfam00400). Q#19413 - CGI_10007347 superfamily 243092 559 782 3.52E-11 64.6636 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#19413 - CGI_10007347 superfamily 187408 294 498 0.00156796 41.1736 cl14654 V_Alix_like superfamily C - "Protein-interacting V-domain of mammalian Alix and related domains; This superfamily contains the V-shaped (V) domain of mammalian Alix (apoptosis-linked gene-2 interacting protein X), His-Domain type N23 protein tyrosine phosphatase (HD-PTP, also known as PTPN23), Bro1 and Rim20 (also known as PalA) from Saccharomyces cerevisiae, and related domains. Alix, HD-PTP, Bro1, and Rim20 all interact with the ESCRT (Endosomal Sorting Complexes Required for Transport) system. Alix, also known as apoptosis-linked gene-2 interacting protein 1 (AIP1), participates in membrane remodeling processes during the budding of enveloped viruses, vesicle budding inside late endosomal multivesicular bodies (MVBs), and the abscission reactions of mammalian cell division. It also functions in apoptosis. HD-PTP functions in cell migration and endosomal trafficking, Bro1 in endosomal trafficking, and Rim20 in the response to the external pH via the Rim101 pathway. The Alix V-domain contains a binding site, partially conserved in this superfamily, for the retroviral late assembly (L) domain YPXnL motif. The Alix V-domain is also a dimerization domain. Members of this superfamily have an N-terminal Bro1-like domain, which binds components of the ESCRT-III complex. The Bro1-like domains of Alix and HD-PTP can also bind human immunodeficiency virus type 1 (HIV-1) nucleocapsid. Many members, including Alix, HD-PTP, and Bro1, also have a proline-rich region (PRR), which binds multiple partners in Alix, including Tsg101 (tumor susceptibility gene 101, a component of ESCRT-1) and the apoptotic protein ALG-2. The C-terminal portion (V-domain and PRR) of Bro1 interacts with Doa4, a ubiquitin thiolesterase needed to remove ubiquitin from MVB cargoes; it interacts with a YPxL motif in Doa4's catalytic domain to stimulate its deubiquitination activity. 
Rim20 may bind the ESCRT-III subunit Snf7, bringing the protease Rim13 (a YPxL-containing transcription factor) into proximity with Rim101, and promoting the proteolytic activation of Rim101. HD-PTP is encoded by the PTPN23 gene, a tumor suppressor gene candidate often absent in human kidney, breast, lung, and cervical tumors. HD-PTP has a C-terminal catalytically inactive tyrosine phosphatase domain." Q#19415 - CGI_10007349 superfamily 248458 100 419 9.91E-06 45.7677 cl17904 MFS superfamily - - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. 
Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#19423 - CGI_10004064 superfamily 245814 101 160 0.000430413 36.3293 cl11960 Ig superfamily N - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#19424 - CGI_10004065 superfamily 245814 24 79 0.000205668 38.1568 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#19425 - CGI_10007125 superfamily 247792 315 357 1.06E-05 42.4328 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#19425 - CGI_10007125 superfamily 221597 97 242 9.61E-14 67.3805 cl13864 GIDE superfamily - - "E3 Ubiquitin ligase; This domain family is found in bacteria, archaea and eukaryotes, and is typically between 150 and 163 amino acids in length. There is a single completely conserved residue E that may be functionally important. GIDE is an E3 ubiquitin ligase which is involved in inducing apoptosis." Q#19426 - CGI_10007126 superfamily 247727 99 194 9.11E-07 45.8839 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." 
Q#19427 - CGI_10007127 superfamily 216301 45 243 1.38E-27 105.039 cl03099 EMP24_GP25L superfamily - - emp24/gp25L/p24 family/GOLD; Members of this family are implicated in bringing cargo forward from the ER and binding to coat proteins by their cytoplasmic domains. This domain corresponds closely to the beta-strand rich GOLD domain described in. The GOLD domain is always found combined with lipid- or membrane-association domains. Q#19428 - CGI_10007128 superfamily 243050 51 108 2.24E-13 66.6742 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#19428 - CGI_10007128 superfamily 247916 447 515 3.35E-06 45.8367 cl17362 Transglut_core superfamily - - "Transglutaminase-like superfamily; This family includes animal transglutaminases and other bacterial proteins of unknown function. Sequence conservation in this superfamily primarily involves three motifs that centre around conserved cysteine, histidine, and aspartate residues that form the catalytic triad in the structurally characterized transglutaminase, the human blood clotting factor XIIIa'. 
On the basis of the experimentally demonstrated activity of the Methanobacterium phage pseudomurein endoisopeptidase, it is proposed that many, if not all, microbial homologues of the transglutaminases are proteases and that the eukaryotic transglutaminases have evolved from an ancestral protease." Q#19429 - CGI_10007129 superfamily 243050 24 79 4.77E-14 68.215 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#19429 - CGI_10007129 superfamily 247916 381 449 1.55E-08 52.7703 cl17362 Transglut_core superfamily - - "Transglutaminase-like superfamily; This family includes animal transglutaminases and other bacterial proteins of unknown function. Sequence conservation in this superfamily primarily involves three motifs that centre around conserved cysteine, histidine, and aspartate residues that form the catalytic triad in the structurally characterized transglutaminase, the human blood clotting factor XIIIa'. 
On the basis of the experimentally demonstrated activity of the Methanobacterium phage pseudomurein endoisopeptidase, it is proposed that many, if not all, microbial homologues of the transglutaminases are proteases and that the eukaryotic transglutaminases have evolved from an ancestral protease." Q#19430 - CGI_10007130 superfamily 217879 47 94 7.43E-14 68.5418 cl04388 DNA_pol_delta_4 superfamily N - "DNA polymerase delta, subunit 4; DNA polymerase delta, subunit 4. " Q#19430 - CGI_10007130 superfamily 238191 244 355 3.28E-09 57.7272 cl18907 Esterase_lipase superfamily C - "Esterases and lipases (includes fungal lipases, cholinesterases, etc.) These enzymes act on carboxylic esters (EC: 3.1.1.-). The catalytic apparatus involves three residues (catalytic triad): a serine, a glutamate or aspartate and a histidine. These catalytic residues are responsible for the nucleophilic attack on the carbonyl carbon atom of the ester bond. In contrast with other alpha/beta hydrolase fold family members, p-nitrobenzyl esterase and acetylcholine esterase have a Glu instead of Asp at the active site carboxylate." 
Q#19431 - CGI_10007131 superfamily 243092 97 405 1.47E-103 311.191 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#19431 - CGI_10007131 superfamily 199226 11 37 0.00273964 35.488 cl11662 LisH superfamily - - "LisH; The LisH (lis homology) domain mediates protein dimerisation and tetramerisation. The LisH domain is found in Sif2, a component of the Set3 complex which is responsible for repressing meiotic genes. It has been shown that the LisH domain helps mediate interaction with components of the Set3 complex." 
Q#19432 - CGI_10007132 superfamily 247727 102 193 6.38E-05 41.2615 cl17173 AdoMet_MTases superfamily C - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#19433 - CGI_10007133 superfamily 247710 83 130 2.34E-05 44.0251 cl17114 metX superfamily NC - homoserine O-acetyltransferase; Provisional Q#19433 - CGI_10007133 superfamily 217881 26 43 0.0084617 33.6791 cl04390 Abhydro_lipase superfamily N - Partial alpha/beta-hydrolase lipase region; This family corresponds to a N-terminal part of an alpha/beta hydrolase domain. Q#19434 - CGI_10007134 superfamily 247724 3 192 9.83E-118 336.056 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#19435 - CGI_10007135 superfamily 243066 53 111 2.91E-09 49.6116 cl02518 BTB superfamily C - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." 
Q#19438 - CGI_10007138 superfamily 203593 49 123 5.04E-08 47.295 cl18243 Mod_r superfamily C - "Modifier of rudimentary (Mod(r)) protein; This family represents a conserved region approximately 150 residues long within a number of eukaryotic proteins that show homology with Drosophila melanogaster Modifier of rudimentary (Mod(r)) proteins. The N-terminal half of Mod(r) proteins is acidic, whereas the C-terminal half is basic, and both of these regions are represented in this family. Members of this family include the Vps37 subunit of the endosomal sorting complex ESCRT-I, a complex involved in recruiting transport machinery for protein sorting at the multivesicular body (MVB). The yeast ESCRT-I complex consists of three proteins (Vps23, Vps28 and Vps37). The mammalian homologue of Vps37 interacts with Tsg101 (Pfam: PF05743) through its mod(r) domain and its function is essential for lysosomal sorting of EGF receptors." Q#19440 - CGI_10007140 superfamily 215754 210 301 5.99E-21 85.3828 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#19440 - CGI_10007140 superfamily 215754 20 100 9.26E-19 79.2196 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#19440 - CGI_10007140 superfamily 215754 109 207 1.06E-13 65.3524 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#19441 - CGI_10007141 superfamily 241832 2 91 5.92E-30 103.505 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. 
The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#19442 - CGI_10007142 superfamily 243092 1119 1310 8.71E-06 48.1 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#19443 - CGI_10007143 superfamily 241607 41 72 0.000120789 36.479 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. 
Kazal inhibitors inhibit serine proteases, such as, trypsin, chymotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters), which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#19444 - CGI_10007144 superfamily 247749 8 105 1.40E-42 143.013 cl17195 LDH_MDH_like superfamily N - "NAD-dependent, lactate dehydrogenase-like, 2-hydroxycarboxylate dehydrogenase family; Members of this family include ubiquitous enzymes like L-lactate dehydrogenases (LDH), L-2-hydroxyisocaproate dehydrogenases, and some malate dehydrogenases (MDH). LDH catalyzes the last step of glycolysis in which pyruvate is converted to L-lactate. 
MDH is one of the key enzymes in the citric acid cycle, facilitating both the conversion of malate to oxaloacetate and replenishing levels of oxalacetate by reductive carboxylation of pyruvate. The LDH/MDH-like proteins are part of the NAD(P)-binding Rossmann fold superfamily, which includes a wide variety of protein families including the NAD(P)-binding domains of alcohol dehydrogenases, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate dehydrogenases, formate/glycerate dehydrogenases, siroheme synthases, 6-phosphogluconate dehydrogenases, aminoacid dehydrogenases, repressor rex, and NAD-binding potassium channel domains, among others." Q#19445 - CGI_10005354 superfamily 245864 72 540 5.79E-116 353.892 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." 
Q#19446 - CGI_10005355 superfamily 245864 10 113 1.02E-39 138.565 cl12078 p450 superfamily N - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#19447 - CGI_10005356 superfamily 245864 33 458 5.58E-102 315.372 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. 
These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#19453 - CGI_10004972 superfamily 241754 368 720 8.32E-175 506.339 cl00286 Motor_domain superfamily - - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. Q#19455 - CGI_10004974 superfamily 241754 21 326 6.58E-130 405.912 cl00286 Motor_domain superfamily - - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. Q#19456 - CGI_10004975 superfamily 191863 123 211 2.04E-28 105.841 cl06724 HCNGP superfamily - - HCNGP-like protein; This family comprises sequences bearing significant similarity to the mouse transcriptional regulator protein HCNGP. This protein is localised to the nucleus and is thought to be involved in the regulation of beta-2-microglobulin genes. Q#19457 - CGI_10004976 superfamily 241568 94 128 0.000341618 37.0572 cl00043 CCP superfamily N - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. 
Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#19458 - CGI_10006621 superfamily 247792 22 61 2.87E-07 48.596 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#19460 - CGI_10006623 superfamily 247792 13 51 1.17E-08 52.8332 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#19463 - CGI_10006658 superfamily 248028 28 266 2.69E-55 180.777 cl17474 Steroid_dh superfamily - - "3-oxo-5-alpha-steroid 4-dehydrogenase; This family consists of 3-oxo-5-alpha-steroid 4-dehydrogenases (EC:1.3.99.5), also known as steroid 5-alpha-reductases. The reaction catalyzed by this enzyme is: 3-oxo-5-alpha-steroid + acceptor <=> 3-oxo-delta(4)-steroid + reduced acceptor. 
The Steroid 5-alpha-reductase enzyme is responsible for the formation of dihydrotestosterone; this hormone promotes the differentiation of male external genitalia and the prostate during fetal development. In humans, mutations in this enzyme can cause a form of male pseudohermaphroditism in which the external genitalia and prostate fail to develop normally. A related enzyme also found in plants is DET2, a steroid reductase from Arabidopsis. Mutations in this enzyme cause defects in light-regulated development." Q#19464 - CGI_10006659 superfamily 243671 36 208 1.71E-07 52.7466 cl04219 UPF0104 superfamily - - Uncharacterized protein family (UPF0104); This family of proteins are integral membrane proteins. These proteins are uncharacterized but contain a conserved PG motif. Some members of this family are annotated as dolichol-P-glucose synthetase and contain a pfam00535 domain. Q#19467 - CGI_10006662 superfamily 241877 72 194 0.00714826 37.609 cl00459 MIT_CorA-like superfamily NC - "metal ion transporter CorA-like divalent cation transporter superfamily; This superfamily of essential membrane proteins is involved in transporting divalent cations (uptake or efflux) across membranes. They are found in most bacteria and archaea, and in some eukaryotes. It is a functionally diverse group which includes the Mg2+ transporters of Escherichia coli and Salmonella typhimurium CorAs (which can also transport Co2+, and Ni2+ ), the CorA Co2+ transporter from the hyperthermophilic Thermotoga maritima, and the Zn2+ transporter Salmonella typhimurium ZntB, which mediates the efflux of Zn2+ (and Cd2+). It includes five Saccharomyces cerevisiae members: i) two plasma membrane proteins, the Mg2+ transporter Alr1p/Swc3p and the putative Mg2+ transporter, Alr2p, ii) two mitochondrial inner membrane Mg2+ transporters: Mfm1p/Lpe10p, and Mrs2p, and iii) the vacuole membrane protein Mnr2p, a putative Mg2+ transporter. 
It also includes a family of Arabidopsis thaliana members (AtMGTs), some of which are localized to distinct tissues, and not all of which can transport Mg2+. Thermotoga maritima CorA and Vibrio parahaemolyticus and Salmonella typhimurium ZntB form funnel-shaped homopentamers, the tip of the funnel is formed from two C-terminal transmembrane (TM) helices from each monomer, and the large opening of the funnel from the N-terminal cytoplasmic domains. The GMN signature motif of the MIT superfamily occurs just after TM1, mutation within this motif is known to abolish Mg2+ transport through Salmonella typhimurium CorA, Mrs2p, and Alr1p. Natural variants such as GVN and GIN, as in some ZntB family proteins, may be associated with the transport of different divalent cations, such as zinc and cadmium. The functional diversity of MIT transporters may also be due to minor structural differences regulating gating, substrate selection, and transport." Q#19468 - CGI_10022430 superfamily 109874 29 155 2.06E-09 53.4522 cl02980 Stathmin superfamily - - Stathmin family; The Stathmin family of proteins play an important role in the regulation of the microtubule cytoskeleton. They regulate microtubule dynamics by promoting depolymerization of microtubules and/or preventing polymerisation of tubulin heterodimers. Q#19468 - CGI_10022430 superfamily 109874 162 247 1.42E-07 48.4446 cl02980 Stathmin superfamily N - Stathmin family; The Stathmin family of proteins play an important role in the regulation of the microtubule cytoskeleton. They regulate microtubule dynamics by promoting depolymerization of microtubules and/or preventing polymerisation of tubulin heterodimers. 
Q#19469 - CGI_10022431 superfamily 248014 1 72 1.48E-21 83.0972 cl17460 Csf4_U superfamily NC - CRISPR/Cas system-associated DinG family helicase Csf4; CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; DinG family DNA helicase Q#19470 - CGI_10022432 superfamily 248014 15 68 5.82E-19 83.7703 cl17460 Csf4_U superfamily N - CRISPR/Cas system-associated DinG family helicase Csf4; CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; DinG family DNA helicase Q#19472 - CGI_10022434 superfamily 241783 18 241 6.13E-36 133.078 cl00322 Ribosomal_L1 superfamily - - "Ribosomal protein L1. The L1 protein, located near the E-site of the ribosome, forms part of the L1 stalk along with 23S rRNA. In bacteria and archaea, L1 functions both as a ribosomal protein that binds rRNA, and as a translation repressor that binds its own mRNA. Like several other large ribosomal subunit proteins, L1 displays RNA chaperone activity. L1 is one of the largest ribosomal proteins. It is composed of two domains that cycle between open and closed conformations via a hinge motion. The RNA-binding site of L1 is highly conserved, with both mRNA and rRNA binding the same binding site." Q#19473 - CGI_10022435 superfamily 244121 101 390 2.03E-149 427.482 cl05556 Apyrase superfamily - - Apyrase; This family consists of several eukaryotic apyrase proteins (EC:3.6.1.5). The salivary apyrases of blood-feeding arthropods are nucleotide hydrolysing enzymes implicated in the inhibition of host platelet aggregation through the hydrolysis of extracellular adenosine diphosphate. Q#19474 - CGI_10022436 superfamily 243077 108 161 6.51E-19 79.1265 cl02542 DnaJ superfamily - - "DnaJ domain or J-domain. 
DnaJ/Hsp40 (heat shock protein 40) proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperonine family. Hsp40 proteins are characterized by the presence of a J domain, which mediates the interaction with Hsp70. They may contain other domains as well, and the architectures provide a means of classification." Q#19474 - CGI_10022436 superfamily 150099 260 360 1.87E-34 123.52 cl07818 DUF1977 superfamily - - Domain of unknown function (DUF1977); Members of this family of functionally uncharacterized domains are predominantly found in dnaj-like proteins. Q#19475 - CGI_10022437 superfamily 243092 71 319 1.43E-10 60.4264 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#19476 - CGI_10022438 superfamily 247723 231 304 2.54E-36 128.532 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#19476 - CGI_10022438 superfamily 245716 159 185 5.67E-05 40.3053 cl11592 zf-CCCH superfamily - - Zinc finger C-x8-C-x5-C-x3-H type (and similar); Zinc finger C-x8-C-x5-C-x3-H type (and similar). Q#19477 - CGI_10022439 superfamily 248097 168 283 4.11E-15 69.2162 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#19477 - CGI_10022439 superfamily 221533 35 93 0.0078456 33.8244 cl13726 TMF_DNA_bd superfamily - - "TATA element modulatory factor 1 DNA binding; This is the middle region of a family of TATA element modulatory factor 1 proteins conserved in eukaryotes that contains at its N-terminal section a number of leucine zippers that could potentially form coiled coil structures. The whole proteins bind to the TATA element of some RNA polymerase II promoters and repress their activity by competing with the binding of TATA binding protein. 
TMFs are evolutionarily conserved golgins that bind Rab6, a ubiquitous ras-like GTP-binding Golgi protein, and contribute to Golgi organisation in animal and plant cells." Q#19478 - CGI_10022440 superfamily 241596 10 70 8.65E-15 66.0835 cl00081 HLH superfamily - - "Helix-loop-helix domain, found in specific DNA- binding proteins that act as transcription factors; 60-100 amino acids long. A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE 5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved in both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#19478 - CGI_10022440 superfamily 243123 93 134 6.93E-05 39.0786 cl02638 Hairy_orange superfamily - - "Hairy Orange; The Orange domain is found in the Drosophila proteins Hesr-1, Hairy, and Enhancer of Split. The Orange domain is proposed to mediate specific protein-protein interaction between Hairy and Scute." Q#19479 - CGI_10022441 superfamily 191220 401 439 0.000718778 38.3466 cl04972 SapB_1 superfamily - - "Saposin-like type B, region 1; Saposin-like type B, region 1. 
" Q#19479 - CGI_10022441 superfamily 191220 554 592 0.00073843 37.9614 cl04972 SapB_1 superfamily - - "Saposin-like type B, region 1; Saposin-like type B, region 1. " Q#19480 - CGI_10022442 superfamily 243090 100 205 2.68E-70 212.324 cl02565 RGS superfamily - - "Regulator of G protein signaling (RGS) domain superfamily; The RGS domain is an essential part of the Regulator of G-protein Signaling (RGS) protein family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play critical regulatory roles as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. While inactive, G-alpha-subunits bind GDP, which is released and replaced by GTP upon agonist activation. GTP binding leads to dissociation of the alpha-subunit and the beta-gamma-dimer, allowing them to interact with effectors molecules and propagate signaling cascades associated with cellular growth, survival, migration, and invasion. Deactivation of the G-protein signaling controlled by the RGS domain accelerates GTPase activity of the alpha subunit by hydrolysis of GTP to GDP, which results in the reassociation of the alpha-subunit with the beta-gamma-dimer and thereby inhibition of downstream activity. As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. RGS proteins are also involved in apoptosis and cell proliferation, as well as modulation of cardiac development. Several RGS proteins can fine-tune immune responses, while others play important roles in neuronal signals modulation. Some RGS proteins are principal elements needed for proper vision." 
Q#19480 - CGI_10022442 superfamily 243090 1 100 1.75E-64 197.686 cl02565 RGS superfamily - - "Regulator of G protein signaling (RGS) domain superfamily; The RGS domain is an essential part of the Regulator of G-protein Signaling (RGS) protein family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play critical regulatory roles as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. While inactive, G-alpha-subunits bind GDP, which is released and replaced by GTP upon agonist activation. GTP binding leads to dissociation of the alpha-subunit and the beta-gamma-dimer, allowing them to interact with effectors molecules and propagate signaling cascades associated with cellular growth, survival, migration, and invasion. Deactivation of the G-protein signaling controlled by the RGS domain accelerates GTPase activity of the alpha subunit by hydrolysis of GTP to GDP, which results in the reassociation of the alpha-subunit with the beta-gamma-dimer and thereby inhibition of downstream activity. As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. RGS proteins are also involved in apoptosis and cell proliferation, as well as modulation of cardiac development. Several RGS proteins can fine-tune immune responses, while others play important roles in neuronal signals modulation. Some RGS proteins are principal elements needed for proper vision." 
Q#19487 - CGI_10022449 superfamily 243066 155 254 6.88E-14 68.0277 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#19487 - CGI_10022449 superfamily 198867 265 365 2.21E-12 63.7167 cl06652 BACK superfamily - - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation)." Q#19488 - CGI_10022450 superfamily 241578 366 485 9.99E-07 48.7162 cl00057 vWFA superfamily C - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#19488 - CGI_10022450 superfamily 115363 609 669 1.15E-15 73.5601 cl05972 MIB_HERC2 superfamily - - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). Q#19488 - CGI_10022450 superfamily 115363 680 704 3.23E-05 43.1294 cl05972 MIB_HERC2 superfamily C - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). Q#19488 - CGI_10022450 superfamily 207713 967 1033 0.000776139 38.8602 cl02729 WWE superfamily - - WWE domain; The WWE domain is named after three of its conserved residues and is predicted to mediate specific protein-protein interactions in ubiquitin and ADP ribose conjugation systems. Q#19490 - CGI_10022452 superfamily 241733 10 88 2.38E-52 160.412 cl00259 Sm_like superfamily - - "Sm and related proteins; The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes." 
Q#19491 - CGI_10022453 superfamily 219057 15 135 2.98E-61 187.086 cl05809 SAP18 superfamily - - Sin3 associated polypeptide p18 (SAP18); This family consists of several eukaryotic Sin3 associated polypeptide p18 (SAP18) sequences. SAP18 is known to be a component of the Sin3-containing complex which is responsible for the repression of transcription via the modification of histone polypeptides. SAP18 is also present in the ASAP complex which is thought to be involved in the regulation of splicing during the execution of programmed cell death. Q#19492 - CGI_10022454 superfamily 241571 92 205 2.93E-05 42.0143 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#19493 - CGI_10022455 superfamily 243072 10 131 5.23E-15 72.8014 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#19493 - CGI_10022455 superfamily 243072 510 645 8.24E-14 69.3346 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#19493 - CGI_10022455 superfamily 243072 367 491 9.50E-07 48.1486 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#19493 - CGI_10022455 superfamily 243072 102 261 2.54E-06 46.6078 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#19493 - CGI_10022455 superfamily 243072 583 763 0.000440302 39.6743 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#19495 - CGI_10022457 superfamily 247736 330 426 5.68E-09 52.6632 cl17182 NAT_SF superfamily - - "N-Acyltransferase superfamily: Various enzymes that characteristically catalyze the transfer of an acyl group to a substrate; NAT (N-Acyltransferase) is a large superfamily of enzymes that mostly catalyze the transfer of an acyl group to a substrate and are implicated in a variety of functions, ranging from bacterial antibiotic resistance to circadian rhythms in mammals. Members include GCN5-related N-Acetyltransferases (GNAT) such as Aminoglycoside N-acetyltransferases, Histone N-acetyltransferase (HAT) enzymes, and Serotonin N-acetyltransferase, which catalyze the transfer of an acetyl group to a substrate. The kinetic mechanism of most GNATs involves the ordered formation of a ternary complex: the reaction begins with Acetyl Coenzyme A (AcCoA) binding, followed by binding of substrate, then direct transfer of the acetyl group from AcCoA to the substrate, followed by product and subsequent CoA release. Other family members include Arginine/ornithine N-succinyltransferase, Myristoyl-CoA: protein N-myristoyltransferase, and Acyl-homoserinelactone synthase which have a similar catalytic mechanism but differ in types of acyl groups transferred. Leucyl/phenylalanyl-tRNA-protein transferase and FemXAB nonribosomal peptidyltransferases which catalyze similar peptidyltransferase reactions are also included." Q#19496 - CGI_10022459 superfamily 247675 68 342 2.13E-140 402.646 cl17011 Arginase_HDAC superfamily - - "Arginase-like and histone-like hydrolases; Arginase-like/histone-like hydrolase superfamily includes metal-dependent enzymes that belong to Arginase-like amidino hydrolase family and histone/histone-like deacetylase class I, II, IV family, respectively. These enzymes catalyze hydrolysis of amide bond. Arginases are known to be involved in control of cellular levels of arginine and ornithine, in histidine and arginine degradation and in clavulanic acid biosynthesis. 
Deacetylases play a role in signal transduction through histone and/or other protein modification and can repress/activate transcription of a number of different genes. They participate in different cellular processes including cell cycle regulation, DNA damage response, embryonic development, cytokine signaling important for immune response and post-translational control of the acetyl coenzyme A synthetase. Mammalian histone deacetylases are known to be involved in progression of different tumors. Specific inhibitors of mammalian histone deacetylases are an emerging class of promising novel anticancer drugs." Q#19497 - CGI_10022460 superfamily 241777 75 272 2.17E-12 65.3996 cl00316 Cation_efflux superfamily - - "Cation efflux family; Members of this family are integral membrane proteins, that are found to increase tolerance to divalent metal ions such as cadmium, zinc, and cobalt. These proteins are thought to be efflux pumps that remove these ions from cells." Q#19498 - CGI_10022461 superfamily 217414 322 692 1.68E-87 283.067 cl03927 Otopetrin superfamily - - "Protein of unknown function, DUF270; Protein of unknown function, DUF270. " Q#19499 - CGI_10022462 superfamily 247683 499 560 2.71E-30 114.985 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. 
SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#19499 - CGI_10022462 superfamily 241622 408 486 1.66E-13 67.5918 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." 
Q#19499 - CGI_10022462 superfamily 247744 686 868 3.99E-45 160.921 cl17190 NK superfamily - - "Nucleoside/nucleotide kinase (NK) is a protein superfamily consisting of multiple families of enzymes that share structural similarity and are functionally related to the catalysis of the reversible phosphate group transfer from nucleoside triphosphates to nucleosides/nucleotides, nucleoside monophosphates, or sugars. Members of this family play a wide variety of essential roles in nucleotide metabolism, the biosynthesis of coenzymes and aromatic compounds, as well as the metabolism of sugar and sulfate." Q#19499 - CGI_10022462 superfamily 243136 336 387 2.53E-05 42.8804 cl02672 L27 superfamily - - L27 domain; The L27 domain is found in receptor targeting proteins Lin-2 and Lin-7. Q#19500 - CGI_10022464 superfamily 241568 988 1034 0.00139248 38.2128 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#19500 - CGI_10022464 superfamily 243124 97 255 1.34E-43 157.204 cl02648 NIDO superfamily - - Nidogen-like; This is a nidogen-like domain (NIDO) domain and is an extracellular domain found in nidogen and hypothetical proteins of unknown function. Q#19500 - CGI_10022464 superfamily 155088 469 537 1.52E-09 57.4599 cl02758 AMOP superfamily C - AMOP domain; This domain may have a role in cell adhesion. It is called the AMOP domain after Adhesion associated domain in MUC4 and Other Proteins. 
This domain is extracellular and contains a number of cysteines that probably form disulphide bridges. Q#19501 - CGI_10022465 superfamily 243124 94 186 3.14E-24 95.1864 cl02648 NIDO superfamily C - Nidogen-like; This is a nidogen-like domain (NIDO) domain and is an extracellular domain found in nidogen and hypothetical proteins of unknown function. Q#19502 - CGI_10022466 superfamily 247038 271 342 0.00059945 38.9816 cl15674 IPT superfamily - - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." Q#19502 - CGI_10022466 superfamily 241568 738 783 0.00218685 37.0572 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." 
Q#19502 - CGI_10022466 superfamily 243124 99 256 2.64E-44 157.589 cl02648 NIDO superfamily - - Nidogen-like; This is a nidogen-like domain (NIDO) domain and is an extracellular domain found in nidogen and hypothetical proteins of unknown function. Q#19502 - CGI_10022466 superfamily 243065 470 643 5.79E-09 55.1405 cl02516 VWD superfamily - - von Willebrand factor type D domain; Luciferin-2-monooxygenase from Vargula hilgendorfii contains a vwd domain. Its function is unrelated but the similarity is very strong by several methods. Q#19502 - CGI_10022466 superfamily 155088 377 452 6.13E-07 48.7462 cl02758 AMOP superfamily N - AMOP domain; This domain may have a role in cell adhesion. It is called the AMOP domain after Adhesion associated domain in MUC4 and Other Proteins. This domain is extracellular and contains a number of cysteines that probably form disulphide bridges. Q#19502 - CGI_10022466 superfamily 241623 586 702 0.00776041 37.2932 cl00119 PI3Kc_like superfamily N - "Phosphoinositide 3-kinase (PI3K)-like family, catalytic domain; The PI3K-like catalytic domain family is part of a larger superfamily that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and RIO kinases. Members of the family include PI3K, phosphoinositide 4-kinase (PI4K), PI3K-related protein kinases (PIKKs), and TRansformation/tRanscription domain-Associated Protein (TRRAP). PI3Ks catalyze the transfer of the gamma-phosphoryl group from ATP to the 3-hydroxyl of the inositol ring of D-myo-phosphatidylinositol (PtdIns) or its derivatives, while PI4K catalyze the phosphorylation of the 4-hydroxyl of PtdIns. PIKKs are protein kinases that catalyze the phosphorylation of serine/threonine residues, especially those that are followed by a glutamine. 
PI3Ks play an important role in a variety of fundamental cellular processes, including cell motility, the Ras pathway, vesicle trafficking and secretion, immune cell activation and apoptosis. PI4Ks produce PtdIns(4)P, the major precursor to important signaling phosphoinositides. PIKKs have diverse functions including cell-cycle checkpoints, genome surveillance, mRNA surveillance, and translation control." Q#19504 - CGI_10022468 superfamily 241592 34 122 9.17E-52 161.145 cl00074 H2A superfamily - - "Histone 2A; H2A is a subunit of the nucleosome. The nucleosome is an octamer containing two H2A, H2B, H3, and H4 subunits. The H2A subunit performs essential roles in maintaining structural integrity of the nucleosome, chromatin condensation, and binding of specific chromatin-associated proteins." Q#19506 - CGI_10022470 superfamily 247068 566 663 7.24E-23 95.4581 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#19506 - CGI_10022470 superfamily 247068 460 539 1.66E-19 85.8281 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#19506 - CGI_10022470 superfamily 247068 242 341 3.05E-17 79.2797 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#19506 - CGI_10022470 superfamily 247068 131 233 3.39E-17 78.8945 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#19506 - CGI_10022470 superfamily 247068 355 452 1.89E-16 76.9685 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#19506 - CGI_10022470 superfamily 247068 683 770 1.24E-15 74.6573 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#19506 - CGI_10022470 superfamily 247068 19 123 4.88E-06 46.1526 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#19507 - CGI_10022471 superfamily 247068 569 665 5.59E-29 112.792 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#19507 - CGI_10022471 superfamily 247068 465 560 8.92E-25 100.851 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#19507 - CGI_10022471 superfamily 247068 251 351 1.97E-24 99.6953 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#19507 - CGI_10022471 superfamily 247068 142 243 6.36E-16 75.0425 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#19507 - CGI_10022471 superfamily 247068 679 769 9.72E-15 71.9609 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#19507 - CGI_10022471 superfamily 247068 365 456 2.55E-12 64.6421 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#19507 - CGI_10022471 superfamily 247068 30 132 1.55E-07 50.3898 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#19508 - CGI_10022472 superfamily 114912 4 38 1.62E-05 39.7269 cl17946 zf-U1 superfamily - - "U1 zinc finger; This family consists of several U1 small nuclear ribonucleoprotein C (U1-C) proteins. The U1 small nuclear ribonucleoprotein (U1 snRNP) binds to the pre-mRNA 5' splice site (ss) at early stages of spliceosome assembly. Recruitment of U1 to a class of weak 5' ss is promoted by binding of the protein TIA-1 to uridine-rich sequences immediately downstream from the 5' ss. Binding of TIA-1 in the vicinity of a 5' ss helps to stabilise U1 snRNP recruitment, at least in part, via a direct interaction with U1-C, thus providing one molecular mechanism for the function of this splicing regulator. This domain is probably a zinc-binding. It is found in multiple copies in some members of the family." Q#19508 - CGI_10022472 superfamily 245716 55 76 0.00249195 33.6859 cl11592 zf-CCCH superfamily - - Zinc finger C-x8-C-x5-C-x3-H type (and similar); Zinc finger C-x8-C-x5-C-x3-H type (and similar). Q#19509 - CGI_10022473 superfamily 245201 74 297 8.19E-133 392.549 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. 
It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#19511 - CGI_10004154 superfamily 241568 3 59 0.000312607 37.8276 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#19513 - CGI_10004156 superfamily 245201 15 343 0 649.74 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#19515 - CGI_10004159 superfamily 220635 6 182 3.95E-55 189.668 cl12380 DUF2151 superfamily N - "Cell cycle and development regulator; This is a set of proteins conserved from worms to humans. 
The proteins are a PAN GU kinase substrate, Mat89Bb, essential for S-M cycles of early Drosophila embryogenesis, Xenopus embryonic cell cycles and morphogenesis, and cell division in cultured mammalian cells." Q#19516 - CGI_10004160 superfamily 247724 228 400 2.47E-06 46.2956 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#19516 - CGI_10004160 superfamily 242902 32 122 9.77E-15 70.7386 cl02144 TLD superfamily C - TLD; This domain is predicted to be an enzyme and is often found associated with pfam01476. Q#19517 - CGI_10003507 superfamily 241645 1 72 2.41E-25 92.9067 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. 
Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#19517 - CGI_10003507 superfamily 248233 73 125 2.10E-18 73.9447 cl17679 Ribosomal_S30 superfamily - - Ribosomal protein S30; Ribosomal protein S30. Q#19518 - CGI_10003522 superfamily 243992 9 65 1.68E-06 40.635 cl05087 Complex1_LYR_1 superfamily - - "Complex1_LYR-like; This is a family of proteins carrying the LYR motif of family Complex1_LYR, pfam05347, likely to be involved in Fe-S cluster biogenesis in mitochondria." Q#19520 - CGI_10003524 superfamily 204376 201 270 2.01E-07 47.9081 cl10817 DUF2260 superfamily C - "Uncharacterized conserved protein (DUF2260); This domain, found in various hypothetical bacterial proteins, has no known function." Q#19521 - CGI_10020896 superfamily 207662 35 125 1.72E-37 133.34 cl02596 NR_DBD_like superfamily - - "DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers; DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. It interacts with a specific DNA site upstream of the target gene and modulates the rate of transcriptional initiation. 
Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions, from development, reproduction, to homeostasis and metabolism in animals (metazoans). The family contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). Most nuclear receptors bind as homodimers or heterodimers to their target sites, which consist of two hexameric half-sites. Specificity is determined by the half-site sequence, the relative orientation of the half-sites and the number of spacer nucleotides between the half-sites. However, a growing number of nuclear receptors have been reported to bind to DNA as monomers." Q#19521 - CGI_10020896 superfamily 245599 395 496 2.86E-18 82.2706 cl11397 NR_LBD superfamily N - "The ligand binding domain of nuclear receptors, a family of ligand-activated transcription regulators; Ligand-binding domain (LBD) of nuclear receptor (NR): Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions in metazoans, from development, reproduction, to homeostasis and metabolism. The superfamily contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. The members of the family include receptors of steroids, thyroid hormone, retinoids, cholesterol by-products, lipids and heme. With few exceptions, NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD)." 
Q#19521 - CGI_10020896 superfamily 245599 248 310 8.04E-16 74.9518 cl11397 NR_LBD superfamily C - "The ligand binding domain of nuclear receptors, a family of ligand-activated transcription regulators; Ligand-binding domain (LBD) of nuclear receptor (NR): Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions in metazoans, from development, reproduction, to homeostasis and metabolism. The superfamily contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. The members of the family include receptors of steroids, thyroid hormone, retinoids, cholesterol by-products, lipids and heme. With few exceptions, NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD)." Q#19524 - CGI_10020899 superfamily 241578 127 291 1.82E-38 138.188 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. 
Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#19527 - CGI_10020902 superfamily 243100 79 135 6.75E-07 45.2476 cl02576 B_zip1 superfamily - - "basic leucine zipper DNA-binding and multimerization region of GCN4 and related proteins; Basic leucine zipper (bZIP) transcription factors act in networks of homo- and hetero-dimers in the regulation in a diverse set of cellular pathways. Classical leucine zippers have alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization creates a pair of basic regions that bind DNA and undergo conformational change. GCN4 was identified in Saccharomyces cerevisiae from mutations in a deficiency in activation with the general amino acid control pathway. GCN4 encodes a trans-activator of amino acid biosynthetic genes containing 2 acidic activation domains and a C-terminal bZIP domain, comprised of a basic alpha-helical DNA-binding region and a coiled-coil dimerization region." Q#19528 - CGI_10020903 superfamily 246669 411 545 3.87E-69 221.302 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. 
However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#19528 - CGI_10020903 superfamily 246669 274 402 5.60E-46 158.574 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#19529 - CGI_10020904 superfamily 220393 2 253 1.46E-77 241.511 cl10751 Tmem26 superfamily - - "Transmembrane protein 26; The function of this family of transmembrane proteins has not, as yet, been determined." Q#19530 - CGI_10020905 superfamily 241555 7 210 3.38E-77 239.766 cl00020 GAT_1 superfamily - - "Type 1 glutamine amidotransferase (GATase1)-like domain; Type 1 glutamine amidotransferase (GATase1)-like domain. 
This group contains proteins similar to Class I glutamine amidotransferases, the intracellular PH1704 from Pyrococcus horikoshii, the C-terminal of the large catalase: Escherichia coli HP-II, Sinorhizobium meliloti Rm1021 ThuA, the A4 beta-galactosidase middle domain and peptidase E. The majority of proteins in this group have a reactive Cys found in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow. For Class I glutamine amidotransferases proteins which transfer ammonia from the amide side chain of glutamine to an acceptor substrate, this Cys forms a Cys-His-Glu catalytic triad in the active site. Glutamine amidotransferases activity can be found in a range of biosynthetic enzymes included in this cd: glutamine amidotransferase, formylglycinamide ribonucleotide, GMP synthetase, anthranilate synthase component II, glutamine-dependent carbamoyl phosphate synthase (CPSase), cytidine triphosphate synthetase, gamma-glutamyl hydrolase, imidazole glycerol phosphate synthase and, cobyric acid synthase. For Pyrococcus horikoshii PH1704, the Cys of the nucleophile elbow together with a different His and, a Glu from an adjacent monomer form a catalytic triad different from the typical GATase1 triad. Peptidase E is believed to be a serine peptidase having a Ser-His-Glu catalytic triad which differs from the Cys-His-Glu catalytic triad of typical GATase1 domains, by having a Ser in place of the reactive Cys at the nucleophile elbow. The E. coli HP-II C-terminal domain, S. meliloti Rm1021 ThuA and the A4 beta-galactosidase middle domain lack the catalytic triad typical GATaseI domains. GATase1-like domains can occur either as single polypeptides, as in Class I glutamine amidotransferases, or as domains in a much larger multifunctional synthase protein, such as CPSase. Peptidase E has a circular permutation in the common core of a typical GTAse1 domain." 
Q#19530 - CGI_10020905 superfamily 241555 203 371 6.43E-66 210.49 cl00020 GAT_1 superfamily N - "Type 1 glutamine amidotransferase (GATase1)-like domain; Type 1 glutamine amidotransferase (GATase1)-like domain. This group contains proteins similar to Class I glutamine amidotransferases, the intracellular PH1704 from Pyrococcus horikoshii, the C-terminal of the large catalase: Escherichia coli HP-II, Sinorhizobium meliloti Rm1021 ThuA, the A4 beta-galactosidase middle domain and peptidase E. The majority of proteins in this group have a reactive Cys found in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow. For Class I glutamine amidotransferases proteins which transfer ammonia from the amide side chain of glutamine to an acceptor substrate, this Cys forms a Cys-His-Glu catalytic triad in the active site. Glutamine amidotransferases activity can be found in a range of biosynthetic enzymes included in this cd: glutamine amidotransferase, formylglycinamide ribonucleotide, GMP synthetase, anthranilate synthase component II, glutamine-dependent carbamoyl phosphate synthase (CPSase), cytidine triphosphate synthetase, gamma-glutamyl hydrolase, imidazole glycerol phosphate synthase and, cobyric acid synthase. For Pyrococcus horikoshii PH1704, the Cys of the nucleophile elbow together with a different His and, a Glu from an adjacent monomer form a catalytic triad different from the typical GATase1 triad. Peptidase E is believed to be a serine peptidase having a Ser-His-Glu catalytic triad which differs from the Cys-His-Glu catalytic triad of typical GATase1 domains, by having a Ser in place of the reactive Cys at the nucleophile elbow. The E. coli HP-II C-terminal domain, S. meliloti Rm1021 ThuA and the A4 beta-galactosidase middle domain lack the catalytic triad typical GATaseI domains. 
GATase1-like domains can occur either as single polypeptides, as in Class I glutamine amidotransferases, or as domains in a much larger multifunctional synthase protein, such as CPSase. Peptidase E has a circular permutation in the common core of a typical GTAse1 domain." Q#19531 - CGI_10020906 superfamily 241675 134 330 1.72E-51 172.871 cl00195 SIR2 superfamily - - "SIR2 superfamily of proteins includes silent information regulator 2 (Sir2) enzymes which catalyze NAD+-dependent protein/histone deacetylation, where the acetyl group from the lysine epsilon-amino group is transferred to the ADP-ribose moiety of NAD+, producing nicotinamide and the novel metabolite O-acetyl-ADP-ribose. Sir2 proteins, also known as sirtuins, are found in all eukaryotes and many archaea and prokaryotes and have been shown to regulate gene silencing, DNA repair, metabolic enzymes, and life span. The most-studied function, gene silencing, involves the inactivation of chromosome domains containing key regulatory genes by packaging them into a specialized chromatin structure that is inaccessible to DNA-binding proteins. The oligomerization state of Sir2 appears to be organism-dependent, sometimes occurring as a monomer and sometimes as a multimer. Also included in this superfamily is a group of uncharacterized Sir2-like proteins which lack certain key catalytic residues and conserved zinc binding cysteines." Q#19532 - CGI_10020907 superfamily 241675 137 333 7.80E-56 184.427 cl00195 SIR2 superfamily - - "SIR2 superfamily of proteins includes silent information regulator 2 (Sir2) enzymes which catalyze NAD+-dependent protein/histone deacetylation, where the acetyl group from the lysine epsilon-amino group is transferred to the ADP-ribose moiety of NAD+, producing nicotinamide and the novel metabolite O-acetyl-ADP-ribose. 
Sir2 proteins, also known as sirtuins, are found in all eukaryotes and many archaea and prokaryotes and have been shown to regulate gene silencing, DNA repair, metabolic enzymes, and life span. The most-studied function, gene silencing, involves the inactivation of chromosome domains containing key regulatory genes by packaging them into a specialized chromatin structure that is inaccessible to DNA-binding proteins. The oligomerization state of Sir2 appears to be organism-dependent, sometimes occurring as a monomer and sometimes as a multimer. Also included in this superfamily is a group of uncharacterized Sir2-like proteins which lack certain key catalytic residues and conserved zinc binding cysteines." Q#19533 - CGI_10020908 superfamily 206035 179 276 3.03E-21 85.6767 cl16440 Enkurin superfamily - - "Calmodulin-binding; This is a family of apparent calmodulin-binding proteins found at high levels in the testis and vomeronasal organ and at lower levels in certain other tissues. Enkurin is a scaffold protein that binds PI3 kinase to sperm transient receptor potential (canonical) (TRPC) channels. The mammalian transient receptor potential (canonical) channels are the primary candidates for the Ca(2+) entry pathway activated by the hormones, growth factors, and neurotransmitters that exert their effect through activation of PLC. Calmodulin binds to the C-terminus of all TRPC channels, and dissociation of calmodulin from TRPC4 results in profound activation of the channel." Q#19535 - CGI_10020910 superfamily 245835 231 474 6.18E-73 233.737 cl12013 BAR superfamily - - "The Bin/Amphiphysin/Rvs (BAR) domain, a dimerization module that binds membranes and detects membrane curvature; BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions including organelle biogenesis, membrane trafficking or remodeling, and cell division and migration. 
Mutations in BAR containing proteins have been linked to diseases and their inactivation in cells leads to altered membrane dynamics. A BAR domain with an additional N-terminal amphipathic helix (an N-BAR) can drive membrane curvature. These N-BAR domains are found in amphiphysins and endophilins, among others. BAR domains are also frequently found alongside domains that determine lipid specificity, such as the Pleckstrin Homology (PH) and Phox Homology (PX) domains which are present in beta centaurins (ACAPs and ASAPs) and sorting nexins, respectively. A FES-CIP4 Homology (FCH) domain together with a coiled coil region is called the F-BAR domain and is present in Pombe/Cdc15 homology (PCH) family proteins, which include Fes/Fes tyrosine kinases, PACSIN or syndapin, CIP4-like proteins, and srGAPs, among others. The Inverse (I)-BAR or IRSp53/MIM homology Domain (IMD) is found in multi-domain proteins, such as IRSp53 and MIM, that act as scaffolding proteins and transducers of a variety of signaling pathways that link membrane dynamics and the underlying actin cytoskeleton. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The I-BAR domain induces membrane protrusions in the opposite direction compared to classical BAR and F-BAR domains, which produce membrane invaginations. BAR domains that also serve as protein interaction domains include those of arfaptin and OPHN1-like proteins, among others, which bind to Rac and Rho GAP domains, respectively." Q#19535 - CGI_10020910 superfamily 243088 111 214 3.30E-55 182.043 cl02563 PX_domain superfamily - - "The Phox Homology domain, a phosphoinositide binding module; The PX domain is a phosphoinositide (PI) binding module involved in targeting proteins to membranes. 
Proteins containing PX domains interact with PIs and have been implicated in highly diverse functions such as cell signaling, vesicular trafficking, protein sorting, lipid modification, cell polarity and division, activation of T and B cells, and cell survival. Many members of this superfamily bind phosphatidylinositol-3-phosphate (PI3P) but in some cases, other PIs such as PI4P or PI(3,4)P2, among others, are the preferred substrates. In addition to protein-lipid interaction, the PX domain may also be involved in protein-protein interaction, as in the cases of p40phox, p47phox, and some sorting nexins (SNXs). The PX domain is conserved from yeast to humans and is found in more than 100 proteins. The majority of PX domain-containing proteins are SNXs, which play important roles in endosomal sorting." Q#19538 - CGI_10020913 superfamily 241733 151 208 1.03E-16 73.0618 cl00259 Sm_like superfamily - - "Sm and related proteins; The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes." Q#19540 - CGI_10020915 superfamily 242414 143 284 1.10E-31 119.305 cl01285 Gar1 superfamily - - "Gar1/Naf1 RNA binding region; Gar1 is a small nucleolar RNP that is required for pre-mRNA processing and pseudouridylation. It is co-immunoprecipitated with the H/ACA families of snoRNAs. This family represents the conserved central region of Gar1. This region is necessary and sufficient for normal cell growth, and specifically binds two snoRNAs snR10 and snR30. 
This region is also necessary for nucleolar targeting, and it is thought that the protein is co-transported to the nucleolus as part of a nucleoprotein complex. In humans, Gar1 is also a component of telomerase in vivo. Naf1 is an essential protein that plays a role in ribosome biogenesis, modification of spliceosomal small nuclear RNAs and telomere synthesis, and is homologous to Gar1." Q#19541 - CGI_10020916 superfamily 243015 37 177 1.34E-09 53.6885 cl02381 Tim17 superfamily - - "Tim17/Tim22/Tim23/Pmp24 family; The pre-protein translocase of the mitochondrial outer membrane (Tom) allows the import of pre-proteins from the cytoplasm. Tom forms a complex with a number of proteins, including Tim17. Tim17 and Tim23 are thought to form the translocation channel of the inner membrane. This family includes Tim17, Tim22 and Tim23. This family also includes Pmp24 a peroxisomal protein. The involvement of this domain in the targeting of PMP24 remains to be proved. PMP24 was known as Pmp27 in." Q#19542 - CGI_10020917 superfamily 246669 979 1117 8.77E-55 188.986 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. 
C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#19542 - CGI_10020917 superfamily 246669 101 317 2.18E-52 182.956 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#19542 - CGI_10020917 superfamily 247912 1156 1472 7.59E-17 82.164 cl17358 Beta-lactamase superfamily - - Beta-lactamase; This family appears to be distantly related to pfam00905 and PF00768 D-alanyl-D-alanine carboxypeptidase. Q#19542 - CGI_10020917 superfamily 220800 862 935 3.28E-12 65.4043 cl11172 Membr_traf_MHD superfamily C - "Munc13 (mammalian uncoordinated) homology domain; Munc13 proteins constitute a family of three highly homologous molecules (Munc13-1, Munc13-2 and Munc13-3) with homology to Caenorhabditis elegans unc-13p. Munc13 proteins contain a phorbol ester-binding C1 domain and two C2 domains, which are Ca2+/phospholipid binding domains. 
Sequence analyses have uncovered two regions called Munc13 homology domains 1 (MHD1) and 2 (MHD2) that are arranged between two flanking C2 domains. MHD1 and MHD2 domains are present in a wide variety of proteins from Arabidopsis thaliana, C. elegans, Drosophila melanogaster, mouse, rat and human, some of which may function in a Munc13-like manner to regulate membrane trafficking. The MHD1 and MHD2 domains are predicted to be alpha-helical." Q#19544 - CGI_10020919 superfamily 247792 100 143 5.21E-08 50.1368 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#19544 - CGI_10020919 superfamily 241563 171 208 0.000266649 39.2427 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#19545 - CGI_10020920 superfamily 218118 93 146 3.12E-11 55.6981 cl04552 CD225 superfamily C - "Interferon-induced transmembrane protein; This family includes the human leukocyte antigen CD225, which is an interferon inducible transmembrane protein, and is associated with interferon induced cell growth suppression." Q#19546 - CGI_10020921 superfamily 241592 19 116 2.75E-48 160.964 cl00074 H2A superfamily - - "Histone 2A; H2A is a subunit of the nucleosome. 
The nucleosome is an octamer containing two H2A, H2B, H3, and H4 subunits. The H2A subunit performs essential roles in maintaining structural integrity of the nucleosome, chromatin condensation, and binding of specific chromatin-associated proteins." Q#19546 - CGI_10020921 superfamily 241554 183 368 1.86E-61 197.764 cl00019 Macro superfamily - - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#19547 - CGI_10020922 superfamily 247723 60 131 5.50E-34 122.309 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. 
The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#19547 - CGI_10020922 superfamily 163611 374 472 8.69E-27 103.525 cl10824 BLOC1_2 superfamily - - "Biogenesis of lysosome-related organelles complex-1 subunit 2; Members of this family of proteins play a role in cellular proliferation, as well as in the biogenesis of specialized organelles of the endosomal-lysosomal system." Q#19547 - CGI_10020922 superfamily 247723 144 208 4.41E-19 81.6435 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." 
Q#19548 - CGI_10020923 superfamily 247792 78 124 5.94E-12 57.0704 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#19550 - CGI_10020925 superfamily 241874 24 484 1.24E-174 504.361 cl00456 SLC5-6-like_sbd superfamily - - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." 
Q#19551 - CGI_10020926 superfamily 241874 8 490 0 521.695 cl00456 SLC5-6-like_sbd superfamily - - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#19552 - CGI_10020927 superfamily 222070 85 170 0.00287271 36.1165 cl18634 DDE_3 superfamily C - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#19556 - CGI_10020931 superfamily 247097 359 393 0.000337232 38.5862 cl15839 ShK superfamily - - ShK domain-like; This domain of is found in several C. elegans proteins. 
The domain is 30 amino acids long and rich in cysteine residues. There are 6 conserved cysteine positions in the domain that form three disulphide bridges. The domain is found in the potassium channel inhibitor ShK in sea anemone. Q#19558 - CGI_10020933 superfamily 246908 740 834 2.09E-18 82.7469 cl15255 SH2 superfamily - - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." Q#19560 - CGI_10020935 superfamily 241563 62 101 9.65E-06 43.2368 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#19560 - CGI_10020935 superfamily 241563 8 53 0.000407573 38.4723 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#19560 - CGI_10020935 superfamily 110440 482 508 0.00558134 35.0761 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. 
The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#19561 - CGI_10020936 superfamily 241563 62 101 2.49E-05 42.0812 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#19561 - CGI_10020936 superfamily 110440 508 535 0.00828517 34.3057 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#19562 - CGI_10020937 superfamily 241563 40 80 6.25E-05 40.9256 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#19563 - CGI_10020938 superfamily 241574 162 297 5.46E-57 184.349 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. 
The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#19563 - CGI_10020938 superfamily 241626 9 116 8.02E-25 97.7378 cl00125 RHOD superfamily - - "Rhodanese Homology Domain (RHOD); an alpha beta fold domain found duplicated in the rhodanese protein. The cysteine containing enzymatically active version of the domain is also found in the Cdc25 class of protein phosphatases and a variety of proteins such as sulfide dehydrogenases and certain stress proteins such as senescence specific protein 1 in plants, PspE and GlpE in bacteria and cyanide and arsenate resistance proteins. Inactive versions (no active site cysteine) are also seen in dual specificity phosphatases, ubiquitin hydrolases from yeast and in sulfuryltransferases, where they are believed to play a regulatory role in multidomain proteins." Q#19565 - CGI_10020940 superfamily 241563 81 122 6.54E-06 44.0072 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#19567 - CGI_10020942 superfamily 241645 673 748 7.75E-34 124.875 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. 
Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#19567 - CGI_10020942 superfamily 243154 40 93 1.08E-22 92.6505 cl02715 Surp superfamily - - Surp module; This domain is also known as the SWAP domain. SWAP stands for Suppressor-of-White-APricot. It has been suggested that these domains may be RNA binding. Q#19567 - CGI_10020942 superfamily 243154 147 200 2.99E-19 83.0205 cl02715 Surp superfamily - - Surp module; This domain is also known as the SWAP domain. SWAP stands for Suppressor-of-White-APricot. It has been suggested that these domains may be RNA binding. Q#19567 - CGI_10020942 superfamily 221473 205 297 7.07E-16 77.0971 cl18609 PRP21_like_P superfamily C - "Pre-mRNA splicing factor PRP21 like protein; This domain family is found in eukaryotes, and is typically between 212 and 238 amino acids in length. The family is found in association with pfam01805. There are two completely conserved residues (W and H) that may be functionally important. PRP21 is required for assembly of the prespliceosome and it interacts with U2 snRNP and/or pre-mRNA in the prespliceosome. This family also contains proteins similar to PRP21, such as the mammalian SF3a. SF3a also interacts with U2 snRNP from the prespliceosome, converting it to its active form." Q#19567 - CGI_10020942 superfamily 221473 387 469 2.42E-11 62.8447 cl18609 PRP21_like_P superfamily N - "Pre-mRNA splicing factor PRP21 like protein; This domain family is found in eukaryotes, and is typically between 212 and 238 amino acids in length. 
The family is found in association with pfam01805. There are two completely conserved residues (W and H) that may be functionally important. PRP21 is required for assembly of the prespliceosome and it interacts with U2 snRNP and/or pre-mRNA in the prespliceosome. This family also contains proteins similar to PRP21, such as the mammalian SF3a. SF3a also interacts with U2 snRNP from the prespliceosome, converting it to its active form." Q#19570 - CGI_10020945 superfamily 245847 70 160 3.62E-10 53.8953 cl12042 FA58C superfamily C - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#19574 - CGI_10005086 superfamily 245009 33 159 5.71E-45 154.748 cl09109 NTF2_like superfamily - - "Nuclear transport factor 2 (NTF2-like) superfamily. This family includes members of the NTF2 family, Delta-5-3-ketosteroid isomerases, Scytalone Dehydratases, and the beta subunit of Ring hydroxylating dioxygenases. This family is a classic example of divergent evolution wherein the proteins have many common structural details but diverge greatly in their function. For example, nuclear transport factor 2 (NTF2) mediates the nuclear import of RanGDP and binds to both RanGDP and FxFG repeat-containing nucleoporins while Ketosteroid isomerases catalyze the isomerization of delta-5-3-ketosteroid to delta-4-3-ketosteroid, by intramolecular transfer of the C4-beta proton to the C6-beta position. While the function of the beta sub-unit of the Ring hydroxylating dioxygenases is not known, Scytalone Dehydratases catalyzes two reactions in the biosynthetic pathway that produces fungal melanin. Members of the NTF2-like superfamily are widely distributed among bacteria, archaea and eukaryotes." 
Q#19574 - CGI_10005086 superfamily 247723 360 440 1.51E-35 127.126 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#19575 - CGI_10005087 superfamily 247769 563 738 4.10E-11 61.5865 cl17215 HDc superfamily - - Metal dependent phosphohydrolases with conserved 'HD' motif Q#19575 - CGI_10005087 superfamily 248010 154 299 2.38E-21 92.0591 cl17456 GAF superfamily - - "GAF domain; This domain is present in cGMP-specific phosphodiesterases, adenylyl and guanylyl cyclases, phytochromes, FhlA and NifA. Adenylyl and guanylyl cyclases catalyze ATP and GTP to the second messengers cAMP and cGMP, respectively, these products up-regulating catalytic activity by binding to the regulatory GAF domain(s). The opposite hydrolysis reaction is catalyzed by phosphodiesterase. cGMP-dependent 3',5'-cyclic phosphodiesterase catalyzes the conversion of guanosine 3',5'-cyclic phosphate to guanosine 5'-phosphate. Here too, cGMP regulates catalytic activity by GAF-domain binding. 
Phytochromes are regulatory photoreceptors in plants and bacteria which exist in two thermally-stable states that are reversibly inter-convertible by light: the Pr state absorbs maximally in the red region of the spectrum, while the Pfr state absorbs maximally in the far-red region. This domain is also found in FhlA (formate hydrogen lyase transcriptional activator) and NifA, a transcriptional activator which is required for activation of most Nif operons which are directly involved in nitrogen fixation. NifA interacts with sigma-54." Q#19575 - CGI_10005087 superfamily 248010 308 467 4.74E-21 91.2887 cl17456 GAF superfamily - - "GAF domain; This domain is present in cGMP-specific phosphodiesterases, adenylyl and guanylyl cyclases, phytochromes, FhlA and NifA. Adenylyl and guanylyl cyclases catalyze ATP and GTP to the second messengers cAMP and cGMP, respectively, these products up-regulating catalytic activity by binding to the regulatory GAF domain(s). The opposite hydrolysis reaction is catalyzed by phosphodiesterase. cGMP-dependent 3',5'-cyclic phosphodiesterase catalyzes the conversion of guanosine 3',5'-cyclic phosphate to guanosine 5'-phosphate. Here too, cGMP regulates catalytic activity by GAF-domain binding. Phytochromes are regulatory photoreceptors in plants and bacteria which exist in two thermally-stable states that are reversibly inter-convertible by light: the Pr state absorbs maximally in the red region of the spectrum, while the Pfr state absorbs maximally in the far-red region. This domain is also found in FhlA (formate hydrogen lyase transcriptional activator) and NifA, a transcriptional activator which is required for activation of most Nif operons which are directly involved in nitrogen fixation. NifA interacts with sigma-54." 
Q#19576 - CGI_10005088 superfamily 238155 159 279 2.48E-28 105.92 cl08547 SPARC_EC superfamily - - "SPARC_EC; extracellular Ca2+ binding domain (containing 2 EF-hand motifs) of SPARC and related proteins (QR1, SC1/hevin, testican and tsc-36/FRP). SPARC (BM-40) is a multifunctional glycoprotein, a matricellular protein, that functions to regulate cell-matrix interactions; binds to such proteins as collagen and vitronectin and binds to endothelial cells thus inhibiting cellular proliferation. The EC domain interacts with a follistatin-like (FS) domain which appears to stabilize Ca2+ binding. The two EF-hands interact canonically but their conserved disulfide bonds confer a tight association between the EF-hand pair and an acid/amphiphilic N-terminal helix. Proposed active form involves a Ca2+ dependent symmetric homodimerization of EC-FS modules." Q#19576 - CGI_10005088 superfamily 241607 71 156 4.07E-23 91.0006 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chymotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. 
Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#19577 - CGI_10005089 superfamily 247724 62 116 6.74E-24 98.069 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." 
Q#19577 - CGI_10005089 superfamily 219841 116 204 9.12E-20 81.8475 cl07167 MMR_HSR1_C superfamily - - "GTPase of unknown function C-terminal; This domain is found at the C-terminus of pfam01926 in archaeal and eukaryotic GTP-binding proteins. The C-terminal domain of the GTP-binding proteins is necessary for the complete activity of the protein of interacting with the 50S ribosome and binding of both adenine and guanine nucleotides, with a preference for guanine nucleotides." Q#19578 - CGI_10008650 superfamily 241785 231 443 9.78E-42 148.776 cl00324 Ribosomal_L3 superfamily - - Ribosomal protein L3; Ribosomal protein L3. Q#19578 - CGI_10008650 superfamily 243066 19 112 5.47E-20 84.9765 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#19580 - CGI_10008652 superfamily 241554 121 222 3.42E-23 95.7903 cl00019 Macro superfamily - - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). 
Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#19580 - CGI_10008652 superfamily 241752 439 574 9.88E-16 75.8299 cl00283 ADP_ribosyl superfamily N - "ADP_ribosylating enzymes catalyze the transfer of ADP_ribose from NAD+ to substrates. Bacterial toxins are cytoplasmic and catalyze the transfer of a single ADP_ribose unit to eukaryotic elongation factor 2, halting protein synthesis and killing the cell. Poly(ADP-ribose) polymerases (PARPS 1-3, VPARP, tankyrase) catalyze the addition of up to 100 ADP_ribose units from NAD+. PARPs 1 and 2 are localized in the nucleus, bind DNA, and are activated by DNA damage. VPARP is part of the vault ribonucleoprotein complex. Tankyrases regulate telomere length in part through poly(ADP_ribosylation) of telomere repeat binding factor 1 (TRF1). Poly(ADP-ribose) polymerase catalyses the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active." Q#19580 - CGI_10008652 superfamily 241554 6 80 1.48E-07 50.4219 cl00019 Macro superfamily N - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. 
Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#19581 - CGI_10008653 superfamily 241554 33 162 1.18E-27 108.887 cl00019 Macro superfamily - - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#19581 - CGI_10008653 superfamily 241554 197 317 3.43E-26 105.035 cl00019 Macro superfamily - - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." 
Q#19581 - CGI_10008653 superfamily 241752 424 669 5.66E-16 76.9855 cl00283 ADP_ribosyl superfamily - - "ADP_ribosylating enzymes catalyze the transfer of ADP_ribose from NAD+ to substrates. Bacterial toxins are cytoplasmic and catalyze the transfer of a single ADP_ribose unit to eukaryotic elongation factor 2, halting protein synthesis and killing the cell. Poly(ADP-ribose) polymerases (PARPS 1-3, VPARP, tankyrase) catalyze the addition of up to 100 ADP_ribose units from NAD+. PARPs 1 and 2 are localized in the nucleus, bind DNA, and are activated by DNA damage. VPARP is part of the vault ribonucleoprotein complex. Tankyrases regulate telomere length in part through poly(ADP_ribosylation) of telomere repeat binding factor 1 (TRF1). Poly(ADP-ribose) polymerase catalyses the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active." Q#19586 - CGI_10008658 superfamily 241554 36 152 2.51E-22 93.8643 cl00019 Macro superfamily - - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." 
Q#19586 - CGI_10008658 superfamily 241554 216 327 2.04E-20 88.4715 cl00019 Macro superfamily - - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#19586 - CGI_10008658 superfamily 241752 415 658 2.24E-15 75.0595 cl00283 ADP_ribosyl superfamily - - "ADP_ribosylating enzymes catalyze the transfer of ADP_ribose from NAD+ to substrates. Bacterial toxins are cytoplasmic and catalyze the transfer of a single ADP_ribose unit to eukaryotic elongation factor 2, halting protein synthesis and killing the cell. Poly(ADP-ribose) polymerases (PARPS 1-3, VPARP, tankyrase) catalyze the addition of up to 100 ADP_ribose units from NAD+. PARPs 1 and 2 are localized in the nucleus, bind DNA, and are activated by DNA damage. VPARP is part of the vault ribonucleoprotein complex. Tankyrases regulate telomere length in part through poly(ADP_ribosylation) of telomere repeat binding factor 1 (TRF1). Poly(ADP-ribose) polymerase catalyses the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active." 
Q#19587 - CGI_10008659 superfamily 241554 235 353 1.11E-29 115.05 cl00019 Macro superfamily - - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#19587 - CGI_10008659 superfamily 241554 40 169 4.09E-23 96.1755 cl00019 Macro superfamily - - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#19587 - CGI_10008659 superfamily 241752 459 706 8.69E-16 76.2151 cl00283 ADP_ribosyl superfamily - - "ADP_ribosylating enzymes catalyze the transfer of ADP_ribose from NAD+ to substrates. Bacterial toxins are cytoplasmic and catalyze the transfer of a single ADP_ribose unit to eukaryotic elongation factor 2, halting protein synthesis and killing the cell. Poly(ADP-ribose) polymerases (PARPS 1-3, VPARP, tankyrase) catalyze the addition of up to 100 ADP_ribose units from NAD+. 
PARPs 1 and 2 are localized in the nucleus, bind DNA, and are activated by DNA damage. VPARP is part of the vault ribonucleoprotein complex. Tankyrases regulate telomere length in part through poly(ADP_ribosylation) of telomere repeat binding factor 1 (TRF1). Poly(ADP-ribose) polymerase catalyses the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active." Q#19590 - CGI_10008662 superfamily 192384 6 149 5.68E-47 151.611 cl10775 SYS1 superfamily - - Integral membrane protein S linking to the trans Golgi network; Members of this family are integral membrane proteins involved in protein trafficking between the late Golgi and endosome. They may also serve as a receptor for ADP-ribosylation factor-related protein 1 (ARFRP1). Sys1p is a small integral membrane protein with four predicted transmembrane domains that localises to the Trans Golgi network TGN in yeast and human cells. Q#19592 - CGI_10008664 superfamily 243188 61 194 4.22E-33 117.757 cl02792 Cyt_c_Oxidase_IV superfamily - - "Cytochrome c oxidase subunit IV. Cytochrome c oxidase (CcO), the terminal oxidase in the respiratory chains of eukaryotes and most bacteria, is a multi-chain transmembrane protein located in the inner membrane of mitochondria and the cell membrane of prokaryotes. It catalyzes the reduction of O2 and simultaneously pumps protons across the membrane. The number of subunits varies from three to five in bacteria and up to 13 in mammalian mitochondria. Subunits I, II, and III of mammalian CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. 
Found only in eukaryotes, subunit IV is the largest of the nuclear-encoded subunits. It binds ATP at the matrix side, leading to an allosteric inhibition of enzyme activity at high intramitochondrial ATP/ADP ratios. In mammals, subunit IV has a lung-specific isoform and a ubiquitously expressed isoform." Q#19593 - CGI_10008665 superfamily 246911 49 156 1.09E-26 107.909 cl15262 PUB superfamily - - "PNGase/UBA or UBX (PUB) domain of p97 adaptor proteins; The PUB domain is found in p97 adaptor proteins such as PNGase, UBXD1 (UBX domain-containing protein 1), and RNF31 (RING finger protein 31). It functions as a p97 (also known as valosin-containing protein or VCP) adaptor by interacting with the D1 and/or D2 ATPase domains. The p97, a type II AAA+ ATPase, is involved in a variety of cellular processes such as the deglycosylation of ERAD substrates, membrane fusion, transcription factor activation and cell cycle regulation through differential binding to specific adaptor proteins. The PUB domain in UBX-domain protein 1 (UBXD1), which is widely expressed in higher eukaryotes (except for fungi) and which is involved in substrate recruitment to p97, interacts strongly with the C-terminus of p97. Peptide:N-glycanase (PNGase), a deglycosylating enzyme that functions in proteasome-dependent degradation of misfolded glycoproteins which are translocated from the endoplasmic reticulum (ER) to the cytosol during ERAD, associates with the ubiquitin-proteasome system proteins mediated by the N-terminal PUB domain. PNGase is present in all eukaryotic organisms; however, the yeast PNGase ortholog does not contain the PUB domain. The RNF31 protein, also known as HOIP or Zibra, contains an N-terminal PUB domain similar to those in PNGase and UBXD1, suggesting its association with p97." 
Q#19596 - CGI_10010047 superfamily 247986 557 668 2.72E-10 60.0794 cl17432 PBPb superfamily N - "Bacterial periplasmic transport systems use membrane-bound complexes and substrate-bound, membrane-associated, periplasmic binding proteins (PBPs) to transport a wide variety of substrates, such as, amino acids, peptides, sugars, vitamins and inorganic ions. PBPs have two cell-membrane translocation functions: bind substrate, and interact with the membrane bound complex. A diverse group of periplasmic transport receptors for lysine/arginine/ornithine (LAO), glutamine, histidine, sulfate, phosphate, molybdate, and methanol are included in the PBPb CD." Q#19596 - CGI_10010047 superfamily 247986 369 425 1.91E-09 57.383 cl17432 PBPb superfamily NC - "Bacterial periplasmic transport systems use membrane-bound complexes and substrate-bound, membrane-associated, periplasmic binding proteins (PBPs) to transport a wide variety of substrates, such as, amino acids, peptides, sugars, vitamins and inorganic ions. PBPs have two cell-membrane translocation functions: bind substrate, and interact with the membrane bound complex. A diverse group of periplasmic transport receptors for lysine/arginine/ornithine (LAO), glutamine, histidine, sulfate, phosphate, molybdate, and methanol are included in the PBPb CD." Q#19596 - CGI_10010047 superfamily 245225 1 314 5.24E-31 125.49 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily N - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein coupled receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine-binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). 
In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#19597 - CGI_10010048 superfamily 241832 423 528 1.45E-35 129.984 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). 
Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#19597 - CGI_10010048 superfamily 241832 11 94 3.12E-24 98.6964 cl00388 Thioredoxin_like superfamily N - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#19598 - CGI_10010049 superfamily 243072 261 386 3.16E-39 138.285 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#19598 - CGI_10010049 superfamily 243072 162 287 1.19E-36 131.352 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#19600 - CGI_10010051 superfamily 241874 316 435 3.76E-09 57.5168 cl00456 SLC5-6-like_sbd superfamily N - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." 
Q#19600 - CGI_10010051 superfamily 241874 69 196 2.47E-07 52.1814 cl00456 SLC5-6-like_sbd superfamily NC - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#19601 - CGI_10010052 superfamily 217403 166 270 1.24E-16 73.611 cl18408 2OG-FeII_Oxy superfamily - - "2OG-Fe(II) oxygenase superfamily; This family contains members of the 2-oxoglutarate (2OG) and Fe(II)-dependent oxygenase superfamily. This family includes the C-terminal of prolyl 4-hydroxylase alpha subunit. The holoenzyme has the activity EC:1.14.11.2 catalyzing the reaction: Procollagen L-proline + 2-oxoglutarate + O2 <=> procollagen trans-4-hydroxy-L-proline + succinate + CO2. The full enzyme consists of an alpha2 beta2 complex with the alpha subunit contributing most of the parts of the active site. The family also includes lysyl hydrolases, isopenicillin synthases and AlkB." 
Q#19601 - CGI_10010052 superfamily 222608 1 113 5.28E-11 58.4198 cl18680 DIOX_N superfamily - - non-haem dioxygenase in morphine synthesis N-terminal; This is the highly conserved N-terminal region of proteins with 2-oxoglutarate/Fe(II)-dependent dioxygenase activity. Q#19602 - CGI_10010053 superfamily 245814 219 286 2.30E-07 48.6395 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#19602 - CGI_10010053 superfamily 245814 295 373 1.24E-10 58.405 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#19602 - CGI_10010053 superfamily 245814 119 194 2.84E-08 51.3521 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#19602 - CGI_10010053 superfamily 245814 374 456 2.41E-07 48.6934 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#19606 - CGI_10010057 superfamily 215754 4 97 5.83E-19 79.2196 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#19606 - CGI_10010057 superfamily 215754 97 187 1.07E-14 66.8932 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#19606 - CGI_10010057 superfamily 215754 191 250 5.99E-09 51.4852 cl02813 Mito_carr superfamily C - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#19607 - CGI_10010058 superfamily 241578 4 181 1.78E-45 154.68 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. 
The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#19608 - CGI_10010059 superfamily 207701 2 108 4.38E-30 109.306 cl02699 VIT superfamily - - Vault protein inter-alpha-trypsin domain; Inter-alpha-trypsin inhibitors (ITIs) consist of one light chain and a variable set of heavy chains. ITIs play a role in extracellular matrix (ECM) stabilisation and tumour metastasis as well as in plasma protease inhibition. The vault protein inter-alpha-trypsin (VIT) domain described here is found to the N-terminus of a von Willebrand factor type A domain (pfam00092) in ITI heavy chains (ITIHs) and their precursors. Q#19610 - CGI_10010061 superfamily 248097 129 259 1.21E-17 76.1498 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#19610 - CGI_10010061 superfamily 248213 39 89 0.00446022 34.0877 cl17659 DivIC superfamily N - Septum formation initiator; DivIC from B. subtilis is necessary for both vegetative and sporulation septum formation. These proteins are mainly composed of an amino terminal coiled-coil. Q#19611 - CGI_10016150 superfamily 241596 45 101 4.74E-07 45.6679 cl00081 HLH superfamily - - "Helix-loop-helix domain, found in specific DNA- binding proteins that act as transcription factors; 60-100 amino acids long. 
A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE 5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved in both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#19612 - CGI_10016151 superfamily 241596 44 101 1.94E-08 49.5199 cl00081 HLH superfamily - - "Helix-loop-helix domain, found in specific DNA- binding proteins that act as transcription factors; 60-100 amino acids long. 
A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE 5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved in both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#19613 - CGI_10016152 superfamily 241596 48 105 7.40E-08 47.9791 cl00081 HLH superfamily - - "Helix-loop-helix domain, found in specific DNA- binding proteins that act as transcription factors; 60-100 amino acids long. 
A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE (5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#19614 - CGI_10016153 superfamily 245612 69 506 1.71E-63 215.238 cl11426 Amidase superfamily - - Amidase; Amidase. Q#19615 - CGI_10016154 superfamily 216981 137 295 2.23E-06 47.1422 cl17087 OTU superfamily - - "OTU-like cysteine protease; This family is comprised of a group of predicted cysteine proteases, homologous to the Ovarian Tumour (OTU) gene in Drosophila. Members include proteins from eukaryotes, viruses and pathogenic bacterium. The conserved cysteine and histidine, and possibly the aspartate, represent the catalytic residues in this putative group of proteases." Q#19615 - CGI_10016154 superfamily 209366 1006 1029 0.001751 37.5658 cl11604 zf-A20 superfamily - - A20-like zinc finger; The A20 Zn-finger of bovine/human Rabex5/rabGEF1 is a Ubiquitin Binding Domain. The zinc finger mediates self-association in A20. These fingers also mediate IL-1-induced NF-kappa B activation. 
Q#19620 - CGI_10016159 superfamily 217860 5 99 3.93E-26 95.8178 cl04379 APC8 superfamily N - "Anaphase promoting complex subunit 8 / Cdc23; The anaphase-promoting complex is composed of eight protein subunits, including BimE (APC1), CDC27 (APC3), CDC16 (APC6), and CDC23 (APC8)." Q#19621 - CGI_10016160 superfamily 243072 67 176 2.94E-23 92.4466 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#19621 - CGI_10016160 superfamily 243073 238 277 0.000142832 38.6053 cl02533 SOCS superfamily - - "SOCS (suppressors of cytokine signaling) box. The SOCS box is found in the C-terminal region of CIS/SOCS family proteins (in combination with a SH2 domain), ASBs (ankyrin repeat-containing proteins with a SOCS box), SSBs (SPRY domain-containing proteins with a SOCS box), and WSBs (WD40 repeat-containing proteins with a SOCS box), as well as, other miscellaneous proteins. The function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions." Q#19621 - CGI_10016160 superfamily 243072 32 56 0.00111696 35.9928 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#19623 - CGI_10016162 superfamily 247684 68 440 0 772.318 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#19624 - CGI_10016163 superfamily 217928 1 155 3.60E-60 189.087 cl04419 Erv26 superfamily - - Transmembrane adaptor Erv26; Erv26 is an integral membrane protein that is packed into COPII vesicles and cycles between the ER and Golgi compartments. It directs pro-alkaline phosphatase into endoplasmic reticulum-derived COPII transport vesicles. 
Q#19625 - CGI_10016164 superfamily 243066 18 121 5.18E-23 92.6805 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#19625 - CGI_10016164 superfamily 198867 130 238 6.23E-15 70.448 cl06652 BACK superfamily - - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation)." Q#19627 - CGI_10016167 superfamily 248458 322 494 3.97E-10 60.0201 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. 
Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#19627 - CGI_10016167 superfamily 248458 21 197 2.80E-08 54.2421 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. 
The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#19629 - CGI_10016169 superfamily 248458 21 197 1.93E-06 47.6937 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. 
Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#19629 - CGI_10016169 superfamily 248458 224 331 0.000291018 40.7601 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. 
Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#19630 - CGI_10016170 superfamily 219464 1 270 8.36E-114 351.182 cl06544 NAGidase superfamily - - "beta-N-acetylglucosaminidase; This family has previously been described as a hyaluronidase. However, more recently it has been shown that this family has beta-N-acetylglucosaminidase activity." Q#19631 - CGI_10016171 superfamily 245864 5 374 5.46E-23 98.8898 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. 
While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#19632 - CGI_10016172 superfamily 217381 57 142 3.21E-43 146.943 cl15956 TB2_DP1_HVA22 superfamily - - "TB2/DP1, HVA22 family; This family includes members from a wide variety of eukaryotes. It includes the TB2/DP1 (deleted in polyposis) protein, which in humans is deleted in severe forms of familial adenomatous polyposis, an autosomal dominant oncological inherited disease. The family also includes the plant protein of known similarity to TB2/DP1, the HVA22 abscisic acid-induced protein, which is thought to be a regulatory protein." Q#19635 - CGI_10016175 superfamily 243072 132 241 5.90E-25 96.2986 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#19635 - CGI_10016175 superfamily 243072 25 187 1.24E-22 90.1354 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#19636 - CGI_10016176 superfamily 243072 62 166 5.28E-17 73.957 cl02529 ANK superfamily N - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#19637 - CGI_10016177 superfamily 243072 104 225 3.22E-25 100.151 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#19637 - CGI_10016177 superfamily 243072 171 333 4.13E-21 88.5946 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#19638 - CGI_10016178 superfamily 247856 394 440 3.69E-08 50.6241 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. 
Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#19638 - CGI_10016178 superfamily 246925 89 295 2.78E-18 84.7145 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#19639 - CGI_10016179 superfamily 241599 85 139 1.40E-14 68.424 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#19640 - CGI_10016180 superfamily 247999 650 702 1.80E-10 57.8856 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#19641 - CGI_10016181 superfamily 243100 280 332 2.99E-05 41.0601 cl02576 B_zip1 superfamily - - "basic leucine zipper DNA-binding and multimerization region of GCN4 and related proteins; Basic leucine zipper (bZIP) transcription factors act in networks of homo- and hetero-dimers in the regulation in a diverse set of cellular pathways. Classical leucine zippers have alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. 
Dimerization creates a pair of basic regions that bind DNA and undergo conformational change. GCN4 was identified in Saccharomyces cerevisiae from mutations in a deficiency in activation with the general amino acid control pathway. GCN4 encodes a trans-activator of amino acid biosynthetic genes containing 2 acidic activation domains and a C-terminal bZIP domain, comprised of a basic alpha-helical DNA-binding region and a coiled-coil dimerization region." Q#19642 - CGI_10008943 superfamily 247723 16 95 3.84E-51 160.777 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#19642 - CGI_10008943 superfamily 247723 101 172 8.79E-46 146.928 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. 
This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#19643 - CGI_10008944 superfamily 128937 4 69 1.49E-15 64.9764 cl02743 DM9 superfamily - - Repeats found in Drosophila proteins; Repeats found in Drosophila proteins. Q#19644 - CGI_10008945 superfamily 245815 24 155 6.81E-47 165.087 cl11961 ALDH-SF superfamily N - "NAD(P)+-dependent aldehyde dehydrogenase superfamily; The aldehyde dehydrogenase superfamily (ALDH-SF) of NAD(P)+-dependent enzymes, in general, oxidize a wide range of endogenous and exogenous aliphatic and aromatic aldehydes to their corresponding carboxylic acids and play an important role in detoxification. Besides aldehyde detoxification, many ALDH isozymes possess multiple additional catalytic and non-catalytic functions such as participating in metabolic pathways, or as binding proteins, or osmoregulants, to mention a few. The enzyme has three domains, a NAD(P)+ cofactor-binding domain, a catalytic domain, and a bridging domain; and the active enzyme is generally either homodimeric or homotetrameric. The catalytic mechanism is proposed to involve cofactor binding, resulting in a conformational change and activation of an invariant catalytic cysteine nucleophile. The cysteine and aldehyde substrate form an oxyanion thiohemiacetal intermediate resulting in hydride transfer to the cofactor and formation of a thioacylenzyme intermediate. 
Hydrolysis of the thioacylenzyme and release of the carboxylic acid product occurs, and in most cases, the reduced cofactor dissociates from the enzyme. The evolutionary phylogenetic tree of ALDHs appears to have an initial bifurcation between what has been characterized as the classical aldehyde dehydrogenases, the ALDH family (ALDH) and extended family members or aldehyde dehydrogenase-like (ALDH-L) proteins. The ALDH proteins are represented by enzymes which share a number of highly conserved residues necessary for catalysis and cofactor binding and they include such proteins as retinal dehydrogenase, 10-formyltetrahydrofolate dehydrogenase, non-phosphorylating glyceraldehyde 3-phosphate dehydrogenase, delta(1)-pyrroline-5-carboxylate dehydrogenases, alpha-ketoglutaric semialdehyde dehydrogenase, alpha-aminoadipic semialdehyde dehydrogenase, coniferyl aldehyde dehydrogenase and succinate-semialdehyde dehydrogenase. Included in this larger group are all human, Arabidopsis, Tortula, fungal, protozoan, and Drosophila ALDHs identified in families ALDH1 through ALDH22 with the exception of families ALDH18, ALDH19, and ALDH20 which are present in the ALDH-like group. The ALDH-like group is represented by such proteins as gamma-glutamyl phosphate reductase, LuxC-like acyl-CoA reductase, and coenzyme A acylating aldehyde dehydrogenase. All of these proteins have a conserved cysteine that aligns with the catalytic cysteine of the ALDH group." Q#19644 - CGI_10008945 superfamily 128937 254 314 4.41E-14 65.7468 cl02743 DM9 superfamily - - Repeats found in Drosophila proteins; Repeats found in Drosophila proteins. Q#19644 - CGI_10008945 superfamily 128937 179 244 3.76E-11 58.0428 cl02743 DM9 superfamily - - Repeats found in Drosophila proteins; Repeats found in Drosophila proteins. 
Q#19645 - CGI_10008946 superfamily 246722 369 512 1.30E-54 183.546 cl14812 PIN_SF superfamily - - "PIN (PilT N terminus) domain: Superfamily; PIN_SF The PIN (PilT N terminus) domain belongs to a large nuclease superfamily with representatives from eukaryota, eubacteria, and archaea. PIN domains were originally named for their sequence similarity to the N-terminal domain of an annotated pili biogenesis protein, PilT, a domain fusion between a PIN-domain and a PilT ATPase domain. The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues) is geometrically similar in the active center of structure-specific 5' nucleases (also known as Flap endonuclease-1-like), PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Seen here, are two major divisions in the PIN domain superfamily. The first major division, the structure-specific 5' nuclease family, is represented by FEN1, the 5'-3' exonuclease of DNA polymerase I, and T4 RNase H nuclease PIN domains. These 5' nucleases are involved in DNA replication, repair, and recombination. They are capable of both 5'-3' exonucleolytic activity and cleaving bifurcated DNA, in an endonucleolytic, structure-specific manner. Unique to FEN1-like nucleases, the PIN domain has a helical arch/clamp region (I domain) of variable length (approximately 16 to 800 residues) and, inserted within the C-terminal region of the PIN domain, a H3TH (helix-3-turn-helix) domain, an atypical helix-hairpin-helix-2-like region. Both the H3TH domain (not included here) and the helical arch/clamp region are involved in DNA binding. With the exception of Mkt1, these nucleases have a carboxylate rich active site that is involved in binding essential divalent metal ion cofactors (Mg2+, Mn2+, Zn2+, or Co2+). 
The second major division of the PIN domain superfamily, the VapC-Smg6 family, includes such eukaryotic ribonucleases as, Smg6, an essential factor in nonsense-mediated mRNA decay; Rrp44, the catalytic subunit of the exosome; and Nob1, a ribosome assembly factor critical in pre-rRNA processing. A large percentage of members in this family are bacterial ribonuclease toxins of TA operons such as Mycobacterium tuberculosis VapC and Neisseria gonorrhoeae FitB, as well as, archaeal homologs, Pyrobaculum aerophilum Pea0151 and P. aerophilum Pae2754. Also included are the eukaryotic Fcf1/ Utp24 (FAF1-copurifying factor 1/U three-associated protein 24) and Utp23-like proteins. Components of the small subunit processome, Fcf1/Utp24 and Utp23 are essential proteins involved in pre-rRNA processing and 40S ribosomal subunit assembly." Q#19645 - CGI_10008946 superfamily 241777 170 290 6.65E-31 122.036 cl00316 Cation_efflux superfamily N - "Cation efflux family; Members of this family are integral membrane proteins, that are found to increase tolerance to divalent metal ions such as cadmium, zinc, and cobalt. These proteins are thought to be efflux pumps that remove these ions from cells." Q#19646 - CGI_10008947 superfamily 246598 14 278 1.97E-106 314.59 cl13996 MPN superfamily - - "Mpr1p, Pad1p N-terminal (MPN) domains; MPN (also known as Mov34, PAD-1, JAMM, JAB, MPN+) domains are found in the N-terminal termini of proteins with a variety of functions; they are components of the proteasome regulatory subunits, the signalosome (CSN), eukaryotic translation initiation factor 3 (eIF3) complexes, and regulators of transcription factors. These domains are isopeptidases that release ubiquitin from ubiquitinated proteins (thus having deubiquitinating (DUB) activity) that are tagged for degradation. Catalytically active MPN domains contain a metalloprotease signature known as the JAB1/MPN/Mov34 metalloenzyme (JAMM) motif. 
For example, Rpn11 (also known as POH1 or PSMD14), a subunit of the 19S proteasome lid is involved in the ATP-dependent degradation of ubiquitinated proteins, contains the conserved JAMM motif involved in zinc ion coordination. Poh1 is a regulator of c-Jun, an important regulator of cell proliferation, differentiation, survival and death. JAB1 is a component of the COP9 signalosome (CSN), a regulatory particle of the ubiquitin (Ub)/26S proteasome system occurring in all eukaryotic cells; it cleaves the ubiquitin-like protein NEDD8 from the cullin subunit of the SCF (Skp1, Cullins, F-box proteins) family of E3 ubiquitin ligases. AMSH (associated molecule with the SH3 domain of STAM, also known as STAMBP), a member of JAMM/MPN+ deubiquitinases (DUBs), specifically cleaves Lys 63-linked polyubiquitin (poly-Ub) chains, thus facilitating the recycling and subsequent trafficking of receptors to the cell surface. Similarly, BRCC36, part of the nuclear complex that includes BRCA1 protein and is targeted to DNA damage foci after irradiation, specifically disassembles K63-linked polyUb. BRCC36 is aberrantly expressed in sporadic breast tumors, indicative of a potential role in the pathogenesis of the disease. Some variants of the JAB1/MPN domains lack key residues in their JAMM motif and are unable to coordinate a metal ion. Comparisons of key catalytic and metal binding residues explain why the MPN-containing proteins Mov34/PSMD7, Rpn8, CSN6, Prp8p, and the translation initiation factor 3 subunits f (p47) and h (p40) do not show catalytic isopeptidase activity. It has been proposed that the MPN domain in these proteins has a primarily structural function." 
Q#19647 - CGI_10008948 superfamily 247727 45 150 2.54E-05 41.2615 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#19649 - CGI_10008950 superfamily 207724 150 207 2.25E-11 59.5515 cl02772 BSD superfamily - - BSD domain; This domain contains a distinctive -FW- motif. It is found in a family of eukaryotic transcription factors as well as a set of proteins of unknown function. Q#19650 - CGI_10008951 superfamily 248012 9 128 5.14E-11 55.7924 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#19651 - CGI_10008952 superfamily 221744 27 205 4.22E-20 86.3358 cl18614 CABIT superfamily C - "Cell-cycle sustaining, positive selection,; The 'CABIT' domain (for 'cysteine-containing, all- in Themis') is found in a newly identified gene family that has three mammalian homologues (Themis, Icb1 and 9130404H23Rik) that encode proteins with two CABIT domains and a highly conserved proline-rich region. In contrast, Fam59A, Fam59B and related proteins from mammals to cnidarians, including the insect Serrano proteins, have a single copy of the CABIT domain, a proline-rich region and often a C-terminal SAM (sterile-motif) domain. Multiple-sequence alignment has predicted that the CABIT domain adopts an all-strand structure with at least 12 strands, ie a dyad of six-stranded beta-barrel units. 
The CABIT domain contains a nearly absolutely conserved cysteine residue which is likely to be central to its function. CABIT domain proteins function downstream of tyrosine kinase signalling and interact with GRB2." Q#19652 - CGI_10008953 superfamily 221744 25 207 8.21E-13 67.4611 cl18614 CABIT superfamily C - "Cell-cycle sustaining, positive selection,; The 'CABIT' domain (for 'cysteine-containing, all- in Themis') is found in a newly identified gene family that has three mammalian homologues (Themis, Icb1 and 9130404H23Rik) that encode proteins with two CABIT domains and a highly conserved proline-rich region. In contrast, Fam59A, Fam59B and related proteins from mammals to cnidarians, including the insect Serrano proteins, have a single copy of the CABIT domain, a proline-rich region and often a C-terminal SAM (sterile-motif) domain. Multiple-sequence alignment has predicted that the CABIT domain adopts an all-strand structure with at least 12 strands, ie a dyad of six-stranded beta-barrel units. The CABIT domain contains a nearly absolutely conserved cysteine residue which is likely to be central to its function. CABIT domain proteins function downstream of tyrosine kinase signalling and interact with GRB2." Q#19655 - CGI_10008956 superfamily 247057 304 365 6.90E-08 49.0716 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. 
They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#19659 - CGI_10002672 superfamily 245847 19 155 1.12E-16 75.2857 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#19660 - CGI_10012721 superfamily 248097 81 207 2.00E-21 86.165 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#19661 - CGI_10012722 superfamily 248012 9 119 7.11E-10 52.5801 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#19664 - CGI_10012725 superfamily 248012 10 106 0.000280864 38.4584 cl17458 TIR_2 superfamily C - TIR domain; This is a family of bacterial Toll-like receptors. Q#19665 - CGI_10012726 superfamily 241832 52 165 5.40E-70 210.614 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). 
Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#19666 - CGI_10012727 superfamily 247724 107 205 2.13E-42 143.754 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." 
Q#19668 - CGI_10012729 superfamily 243050 1295 1347 1.54E-24 99.2653 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#19672 - CGI_10012733 superfamily 245303 6 362 2.99E-31 122.156 cl10447 GH18_chitinase-like superfamily - - "The GH18 (glycosyl hydrolase, family 18) type II chitinases hydrolyze chitin, an abundant polymer of beta-1,4-linked N-acetylglucosamine (GlcNAc) which is a major component of the cell wall of fungi and the exoskeleton of arthropods. Chitinases have been identified in viruses, bacteria, fungi, protozoan parasites, insects, and plants. The structure of the GH18 domain is an eight-stranded beta/alpha barrel with a pronounced active-site cleft at the C-terminal end of the beta-barrel. The GH18 family includes chitotriosidase, chitobiase, hevamine, zymocin-alpha, narbonin, SI-CLP (stabilin-1 interacting chitinase-like protein), IDGF (imaginal disc growth factor), CFLE (cortical fragment-lytic enzyme) spore hydrolase, the type III and type V plant chitinases, the endo-beta-N-acetylglucosaminidases, and the chitolectins. 
The GH85 (glycosyl hydrolase, family 85) ENGases (endo-beta-N-acetylglucosaminidases) are closely related to the GH18 chitinases and are included in this alignment model." Q#19673 - CGI_10012734 superfamily 243058 146 266 3.09E-07 47.6943 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#19673 - CGI_10012734 superfamily 243058 101 180 7.38E-05 40.3756 cl02500 ARM superfamily N - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#19674 - CGI_10012735 superfamily 243091 135 175 1.01E-08 51.724 cl02566 SET superfamily N - "SET domain; SET domains are protein lysine methyltransferase enzymes. SET domains appear to be protein-protein interaction domains. 
It has been demonstrated that SET domains mediate interactions with a family of proteins that display similarity with dual-specificity phosphatases (dsPTPases). A subset of SET domains have been called PR domains. These domains are divergent in sequence from other SET domains, but also appear to mediate protein-protein interaction. The SET domain consists of two regions known as SET-N and SET-C. SET-C forms an unusual and conserved knot-like structure of probable functional importance. In addition to SET-N and SET-C, an insert region (SET-I) and flanking regions of high structural variability form part of the overall structure." Q#19675 - CGI_10012736 superfamily 218358 38 157 1.21E-24 94.4023 cl04871 IPP-2 superfamily - - "Protein phosphatase inhibitor 2 (IPP-2); Protein phosphatase inhibitor 2 (IPP-2) is a phosphoprotein conserved among all eukaryotes, and it appears in both the nucleus and cytoplasm of tissue culture cells." Q#19676 - CGI_10012737 superfamily 216981 184 324 9.04E-11 61.0094 cl17087 OTU superfamily - - "OTU-like cysteine protease; This family is comprised of a group of predicted cysteine proteases, homologous to the Ovarian Tumour (OTU) gene in Drosophila. Members include proteins from eukaryotes, viruses and pathogenic bacteria. The conserved cysteine and histidine, and possibly the aspartate, represent the catalytic residues in this putative group of proteases." Q#19677 - CGI_10012738 superfamily 222608 28 147 3.15E-25 98.0954 cl18680 DIOX_N superfamily - - non-haem dioxygenase in morphine synthesis N-terminal; This is the highly conserved N-terminal region of proteins with 2-oxoglutarate/Fe(II)-dependent dioxygenase activity. Q#19677 - CGI_10012738 superfamily 217403 198 298 6.30E-22 88.2486 cl18408 2OG-FeII_Oxy superfamily - - "2OG-Fe(II) oxygenase superfamily; This family contains members of the 2-oxoglutarate (2OG) and Fe(II)-dependent oxygenase superfamily. This family includes the C-terminal of prolyl 4-hydroxylase alpha subunit. 
The holoenzyme has the activity EC:1.14.11.2 catalyzing the reaction: Procollagen L-proline + 2-oxoglutarate + O2 <=> procollagen trans-4-hydroxy-L-proline + succinate + CO2. The full enzyme consists of an alpha2 beta2 complex with the alpha subunit contributing most of the parts of the active site. The family also includes lysyl hydrolases, isopenicillin synthases and AlkB." Q#19678 - CGI_10012739 superfamily 215647 19 206 8.89E-08 50.6849 cl18338 7tm_2 superfamily N - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GPCRs). They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#19679 - CGI_10012740 superfamily 148314 1 139 5.61E-06 44.8754 cl05919 XRCC4 superfamily C - "DNA double-strand break repair and V(D)J recombination protein XRCC4; This family consists of several eukaryotic DNA double-strand break repair and V(D)J recombination protein XRCC4 sequences. In the non-homologous end joining pathway of DNA double-strand break repair, the ligation step is catalyzed by a complex of XRCC4 and DNA ligase IV. 
It is thought that XRCC4 and ligase IV are essential for alignment-based gap filling, as well as for final ligation of the breaks." Q#19680 - CGI_10001663 superfamily 242120 2 136 1.72E-63 196.21 cl00821 Ribosomal_S3Ae superfamily N - Ribosomal S3Ae family; Ribosomal S3Ae family. Q#19685 - CGI_10024981 superfamily 243033 29 162 2.33E-27 101.241 cl02428 Ependymin superfamily - - Ependymin; Ependymin. Q#19686 - CGI_10024982 superfamily 247755 1872 2075 3.19E-82 271.347 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." 
Q#19686 - CGI_10024982 superfamily 243034 218 284 7.45E-13 67.7904 cl02429 TPR superfamily C - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#19686 - CGI_10024982 superfamily 247792 1830 1870 1.79E-12 65.1596 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-(N/C/H)-X2-C-X(4-48)-C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of 
RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#19686 - CGI_10024982 superfamily 247789 2282 2421 4.59E-36 138.931 cl17235 ABC2_membrane superfamily N - ABC-2 type transporter; ABC-2 type transporter. Q#19687 - CGI_10024983 superfamily 243483 188 308 1.51E-30 117.898 cl03646 DUF159 superfamily C - "Uncharacterized ACR, COG2135; Uncharacterized ACR, COG2135. " Q#19687 - CGI_10024983 superfamily 243483 396 516 3.63E-15 73.9854 cl03646 DUF159 superfamily N - "Uncharacterized ACR, COG2135; Uncharacterized ACR, COG2135. " Q#19688 - CGI_10024984 superfamily 241602 129 198 1.32E-30 116.563 cl00087 HR1 superfamily - - "Protein kinase C-related kinase homology region 1 (HR1) domain that binds Rho family small GTPases; The HR1 domain, also called the ACC (anti-parallel coiled-coil) finger domain or Rho-binding domain binds small GTPases from the Rho family. It is found in Rho effector proteins including PKC-related kinases such as vertebrate PRK1 (or PKN) and yeast PKC1 protein kinases C, as well as in rhophilins and Rho-associated kinase (ROCK). Rho family members function as molecular switches, cycling between inactive and active forms, controlling a variety of cellular processes. HR1 domains may occur in repeat arrangements (PKN contains three HR1 domains), separated by a short linker region." Q#19688 - CGI_10024984 superfamily 241602 209 281 1.37E-28 110.843 cl00087 HR1 superfamily - - "Protein kinase C-related kinase homology region 1 (HR1) domain that binds Rho family small GTPases; The HR1 domain, also called the ACC (anti-parallel coiled-coil) finger domain or Rho-binding domain binds small GTPases from the Rho family. It is found in Rho effector proteins including PKC-related kinases such as vertebrate PRK1 (or PKN) and yeast PKC1 protein kinases C, as well as in rhophilins and Rho-associated kinase (ROCK). Rho family members function as molecular switches, cycling between inactive and active forms, controlling a variety of cellular processes. 
HR1 domains may occur in repeat arrangements (PKN contains three HR1 domains), separated by a short linker region." Q#19688 - CGI_10024984 superfamily 241602 38 102 2.57E-15 72.2986 cl00087 HR1 superfamily - - "Protein kinase C-related kinase homology region 1 (HR1) domain that binds Rho family small GTPases; The HR1 domain, also called the ACC (anti-parallel coiled-coil) finger domain or Rho-binding domain binds small GTPases from the Rho family. It is found in Rho effector proteins including PKC-related kinases such as vertebrate PRK1 (or PKN) and yeast PKC1 protein kinases C, as well as in rhophilins and Rho-associated kinase (ROCK). Rho family members function as molecular switches, cycling between inactive and active forms, controlling a variety of cellular processes. HR1 domains may occur in repeat arrangements (PKN contains three HR1 domains), separated by a short linker region." Q#19688 - CGI_10024984 superfamily 245201 678 953 9.17E-133 404.834 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#19688 - CGI_10024984 superfamily 246669 388 475 2.24E-29 114.022 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. 
Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#19689 - CGI_10024985 superfamily 247803 160 237 6.52E-38 137.363 cl17249 YlqF_related_GTPase superfamily C - "Circularly permuted YlqF-related GTPases; These proteins are found in bacteria, eukaryotes, and archaea. They all exhibit a circular permutation of the GTPase signature motifs so that the order of the conserved G box motifs is G4-G5-G1-G2-G3, with G4 and G5 being permuted from the C-terminal region of proteins in the Ras superfamily to the N-terminus of YlqF-related GTPases." Q#19689 - CGI_10024985 superfamily 247803 340 424 1.76E-35 130.43 cl17249 YlqF_related_GTPase superfamily N - "Circularly permuted YlqF-related GTPases; These proteins are found in bacteria, eukaryotes, and archaea. They all exhibit a circular permutation of the GTPase signature motifs so that the order of the conserved G box motifs is G4-G5-G1-G2-G3, with G4 and G5 being permuted from the C-terminal region of proteins in the Ras superfamily to the N-terminus of YlqF-related GTPases." 
Q#19693 - CGI_10024989 superfamily 241623 2098 2406 2.17E-165 515.449 cl00119 PI3Kc_like superfamily - - "Phosphoinositide 3-kinase (PI3K)-like family, catalytic domain; The PI3K-like catalytic domain family is part of a larger superfamily that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and RIO kinases. Members of the family include PI3K, phosphoinositide 4-kinase (PI4K), PI3K-related protein kinases (PIKKs), and TRansformation/tRanscription domain-Associated Protein (TRRAP). PI3Ks catalyze the transfer of the gamma-phosphoryl group from ATP to the 3-hydroxyl of the inositol ring of D-myo-phosphatidylinositol (PtdIns) or its derivatives, while PI4K catalyze the phosphorylation of the 4-hydroxyl of PtdIns. PIKKs are protein kinases that catalyze the phosphorylation of serine/threonine residues, especially those that are followed by a glutamine. PI3Ks play an important role in a variety of fundamental cellular processes, including cell motility, the Ras pathway, vesicle trafficking and secretion, immune cell activation and apoptosis. PI4Ks produce PtdIns(4)P, the major precursor to important signaling phosphoinositides. PIKKs have diverse functions including cell-cycle checkpoints, genome surveillance, mRNA surveillance, and translation control." Q#19693 - CGI_10024989 superfamily 202180 3567 3595 6.37E-05 43.6064 cl03505 FATC superfamily - - "FATC domain; The FATC domain is named after FRAP, ATM, TRRAP C-terminal. The solution structure of the FATC domain suggests it plays a role in redox-dependent structural and cellular stability." 
Q#19696 - CGI_10024992 superfamily 247745 55 339 0 536.795 cl17191 GH38-57_N_LamB_YdjC_SF superfamily - - "Catalytic domain of glycoside hydrolase (GH) families 38 and 57, lactam utilization protein LamB/YcsF family proteins, YdjC-family proteins, and similar proteins; The superfamily possesses strong sequence similarities across a wide range of all three kingdoms of life. It mainly includes four families, glycoside hydrolases family 38 (GH38), heat stable retaining glycoside hydrolases family 57 (GH57), lactam utilization protein LamB/YcsF family, and YdjC-family. The GH38 family corresponds to class II alpha-mannosidases (alphaMII, EC 3.2.1.24), which contain intermediate Golgi alpha-mannosidases II, acidic lysosomal alpha-mannosidases, animal sperm and epididymal alpha -mannosidases, neutral ER/cytosolic alpha-mannosidases, and some putative prokaryotic alpha-mannosidases. AlphaMII possess a-1,3, a-1,6, and a-1,2 hydrolytic activity, and catalyzes the degradation of N-linked oligosaccharides by employing a two-step mechanism involving the formation of a covalent glycosyl enzyme complex. GH57 is a purely prokaryotic family with the majority of thermostable enzymes from extremophiles (many of them are archaeal hyperthermophiles), which exhibit the enzyme specificities of alpha-amylase (EC 3.2.1.1), 4-alpha-glucanotransferase (EC 2.4.1.25), amylopullulanase (EC 3.2.1.1/41), and alpha-galactosidase (EC 3.2.1.22). This family also includes many hypothetical proteins with uncharacterized activity and specificity. GH57 cleaves alpha-glycosidic bond by employing a retaining mechanism, which involves a glycosyl-enzyme intermediate, allowing transglycosylation. Although the exact molecular function of LamB/YcsF family and YdjC-family remains unclear, they show high sequence and structure homology to the members of GH38 and GH57. 
Their catalytic domains adopt a similar parallel 7-stranded beta/alpha barrel, which is remotely related to catalytic NodB homology domain of the carbohydrate esterase 4 superfamily." Q#19696 - CGI_10024992 superfamily 245003 385 462 7.22E-23 94.9562 cl08536 Alpha-mann_mid superfamily - - "Alpha mannosidase, middle domain; Members of this family adopt a structure consisting of three alpha helices, in an immunoglobulin/albumin-binding domain-like fold. They are predominantly found in the enzyme alpha-mannosidase." Q#19697 - CGI_10024993 superfamily 247745 37 284 5.11E-152 453.592 cl17191 GH38-57_N_LamB_YdjC_SF superfamily - - "Catalytic domain of glycoside hydrolase (GH) families 38 and 57, lactam utilization protein LamB/YcsF family proteins, YdjC-family proteins, and similar proteins; The superfamily possesses strong sequence similarities across a wide range of all three kingdoms of life. It mainly includes four families, glycoside hydrolases family 38 (GH38), heat stable retaining glycoside hydrolases family 57 (GH57), lactam utilization protein LamB/YcsF family, and YdjC-family. The GH38 family corresponds to class II alpha-mannosidases (alphaMII, EC 3.2.1.24), which contain intermediate Golgi alpha-mannosidases II, acidic lysosomal alpha-mannosidases, animal sperm and epididymal alpha -mannosidases, neutral ER/cytosolic alpha-mannosidases, and some putative prokaryotic alpha-mannosidases. AlphaMII possess a-1,3, a-1,6, and a-1,2 hydrolytic activity, and catalyzes the degradation of N-linked oligosaccharides by employing a two-step mechanism involving the formation of a covalent glycosyl enzyme complex. GH57 is a purely prokaryotic family with the majority of thermostable enzymes from extremophiles (many of them are archaeal hyperthermophiles), which exhibit the enzyme specificities of alpha-amylase (EC 3.2.1.1), 4-alpha-glucanotransferase (EC 2.4.1.25), amylopullulanase (EC 3.2.1.1/41), and alpha-galactosidase (EC 3.2.1.22). 
This family also includes many hypothetical proteins with uncharacterized activity and specificity. GH57 cleaves alpha-glycosidic bond by employing a retaining mechanism, which involves a glycosyl-enzyme intermediate, allowing transglycosylation. Although the exact molecular function of LamB/YcsF family and YdjC-family remains unclear, they show high sequence and structure homology to the members of GH38 and GH57. Their catalytic domains adopt a similar parallel 7-stranded beta/alpha barrel, which is remotely related to catalytic NodB homology domain of the carbohydrate esterase 4 superfamily." Q#19697 - CGI_10024993 superfamily 245003 329 406 5.90E-23 94.9562 cl08536 Alpha-mann_mid superfamily - - "Alpha mannosidase, middle domain; Members of this family adopt a structure consisting of three alpha helices, in an immunoglobulin/albumin-binding domain-like fold. They are predominantly found in the enzyme alpha-mannosidase." Q#19698 - CGI_10024994 superfamily 247858 428 601 4.01E-40 144.068 cl17304 2OG-FeII_Oxy_3 superfamily - - 2OG-Fe(II) oxygenase superfamily; This family contains members of the 2-oxoglutarate (2OG) and Fe(II)-dependent oxygenase superfamily. Q#19698 - CGI_10024994 superfamily 203913 23 154 2.22E-30 116.162 cl07084 P4Ha_N superfamily - - "Prolyl 4-Hydroxylase alpha-subunit, N-terminal region; The members of this family are eukaryotic proteins, and include all three isoforms of the prolyl 4-hydroxylase alpha subunit. This enzyme (EC:1.14.11.2) is important in the post-translational modification of collagen, as it catalyzes the formation of 4-hydroxyproline. In vertebrates, the complete enzyme is an alpha2-beta2 tetramer; the beta-subunit is identical to protein disulphide isomerase. The function of the N-terminal region featured in this family does not seem to be known." 
Q#19699 - CGI_10024995 superfamily 243056 625 849 1.11E-45 164.402 cl02495 RabGAP-TBC superfamily - - "Rab-GTPase-TBC domain; Identification of a TBC domain in GYP6_YEAST and GYP7_YEAST, which are GTPase activator proteins of yeast Ypt6 and Ypt7, implies that these domains are GTPase activator proteins of Rab-like small GTPases." Q#19700 - CGI_10024996 superfamily 192111 732 831 0.000122467 41.5241 cl07316 BRE1 superfamily - - BRE1 E3 ubiquitin ligase; BRE1 is an E3 ubiquitin ligase that has been shown to act as a transcriptional activator through direct activator interactions. Q#19701 - CGI_10024997 superfamily 241572 412 498 3.20E-11 60.7152 cl00050 CYCLIN superfamily - - "Cyclin box fold. Protein binding domain functioning in cell-cycle and transcription control. Present in cyclins, TFIIB and Retinoblastoma (RB). The cyclins consist of 8 classes of cell cycle regulators that regulate cyclin dependent kinases (CDKs). TFIIB is a transcription factor that binds the TATA box. Cyclins, TFIIB and RB contain 2 copies of the domain." Q#19701 - CGI_10024997 superfamily 241572 314 402 7.37E-11 59.5596 cl00050 CYCLIN superfamily - - "Cyclin box fold. Protein binding domain functioning in cell-cycle and transcription control. Present in cyclins, TFIIB and Retinoblastoma (RB). The cyclins consist of 8 classes of cell cycle regulators that regulate cyclin dependent kinases (CDKs). TFIIB is a transcription factor that binds the TATA box. Cyclins, TFIIB and RB contain 2 copies of the domain." Q#19701 - CGI_10024997 superfamily 243074 38 78 9.58E-07 46.661 cl02535 F-box-like superfamily - - F-box-like; This is an F-box-like family. Q#19702 - CGI_10024998 superfamily 245230 2 432 0 958.283 cl10017 Tubulin_FtsZ superfamily - - "Tubulin/FtsZ: Family includes tubulin alpha-, beta-, gamma-, delta-, and epsilon-tubulins as well as FtsZ, all of which are involved in polymer formation. 
Tubulin is the major component of microtubules, but also exists as a heterodimer and as a curved oligomer. Microtubules exist in all eukaryotic cells and are responsible for many functions, including cellular transport, cell motility, and mitosis. FtsZ forms a ring-shaped septum at the site of bacterial cell division, which is required for constriction of cell membrane and cell envelope to yield two daughter cells. FtsZ can polymerize into tubes, sheets, and rings in vitro and is ubiquitous in eubacteria, archaea, and chloroplasts." Q#19703 - CGI_10024999 superfamily 245230 2 432 0 956.742 cl10017 Tubulin_FtsZ superfamily - - "Tubulin/FtsZ: Family includes tubulin alpha-, beta-, gamma-, delta-, and epsilon-tubulins as well as FtsZ, all of which are involved in polymer formation. Tubulin is the major component of microtubules, but also exists as a heterodimer and as a curved oligomer. Microtubules exist in all eukaryotic cells and are responsible for many functions, including cellular transport, cell motility, and mitosis. FtsZ forms a ring-shaped septum at the site of bacterial cell division, which is required for constriction of cell membrane and cell envelope to yield two daughter cells. FtsZ can polymerize into tubes, sheets, and rings in vitro and is ubiquitous in eubacteria, archaea, and chloroplasts." Q#19706 - CGI_10025002 superfamily 247954 19 295 1.27E-25 103.314 cl17400 PRK06823 superfamily - - ornithine cyclodeaminase; Validated Q#19707 - CGI_10025003 superfamily 219472 106 242 4.62E-12 63.8927 cl09593 Nucleopor_Nup85 superfamily C - "Nup85 Nucleoporin; A family of nucleoporins conserved from yeast to human. THe nuclear pore complex is a large assembly composed of two essential complexes: the heptameric Nup84 complex and the heteromeric Nic96-containing complex. The Nup84 complex is composed of one copy each of Nup84, Nup85, Nup120, Nup133, Nup145C, Sec13, and Seh1. The structure of a complex of Nup85 and Seh1 was solved. 
The N-terminus of Nup85 is inserted and forms a three-stranded blade that completes the Seh1 6-bladed beta-propeller in trans. Following its N-terminal insertion blade, Nup85 forms a compact cuboid structure composed of 20 helices, with two distinct modules, referred to as crown and trunk." Q#19708 - CGI_10025004 superfamily 219472 2 336 1.33E-101 314.272 cl09593 Nucleopor_Nup85 superfamily N - "Nup85 Nucleoporin; A family of nucleoporins conserved from yeast to human. THe nuclear pore complex is a large assembly composed of two essential complexes: the heptameric Nup84 complex and the heteromeric Nic96-containing complex. The Nup84 complex is composed of one copy each of Nup84, Nup85, Nup120, Nup133, Nup145C, Sec13, and Seh1. The structure of a complex of Nup85 and Seh1 was solved. The N-terminus of Nup85 is inserted and forms a three-stranded blade that completes the Seh1 6-bladed beta-propeller in trans. Following its N-terminal insertion blade, Nup85 forms a compact cuboid structure composed of 20 helices, with two distinct modules, referred to as crown and trunk." Q#19710 - CGI_10025006 superfamily 241629 26 157 5.18E-33 120.703 cl00133 SCP superfamily - - "SCP: SCP-like extracellular protein domain, found in eukaryotes and prokaryotes. This family includes plant pathogenesis-related protein 1 (PR-1), which accumulates after infections with pathogens, and may act as an anti-fungal agent or be involved in cell wall loosening. This family also includes CRISPs, mammalian cysteine-rich secretory proteins, which combine SCP with a C-terminal cysteine rich domain, and allergen 5 from vespid venom. Roles for CRISP, in response to pathogens, fertilization, and sperm maturation have been proposed. One member, Tex31 from the venom duct of Conus textile, has been shown to possess proteolytic activity sensitive to serine protease inhibitors. The human GAPR-1 protein has been reported to dimerize, and such a dimer may form an active site containing a catalytic triad. 
SCP has also been proposed to be a Ca++ chelating serine protease. The Ca++-chelating function would fit with various signaling processes that members of this family, such as the CRISPs, are involved in, and is supported by sequence and structural evidence of a conserved pocket containing two histidines and a glutamate. It also may explain how helothermine, a toxic peptide secreted by the beaded lizard, blocks Ca++ transporting ryanodine receptors. Little is known about the biological roles of the bacterial and archaeal SCP domains." Q#19710 - CGI_10025006 superfamily 247097 196 230 0.000215219 38.1293 cl15839 ShK superfamily - - ShK domain-like; This domain is found in several C. elegans proteins. The domain is 30 amino acids long and rich in cysteine residues. There are 6 conserved cysteine positions in the domain that form three disulphide bridges. The domain is found in the potassium channel inhibitor ShK in sea anemone. Q#19711 - CGI_10025007 superfamily 243035 718 812 0.00116778 38.755 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#19711 - CGI_10025007 superfamily 243093 622 693 6.58E-06 45.599 cl02568 WSC superfamily - - WSC domain; This domain may be involved in carbohydrate binding. 
Q#19711 - CGI_10025007 superfamily 221533 7 68 0.00191805 37.6764 cl13726 TMF_DNA_bd superfamily - - "TATA element modulatory factor 1 DNA binding; This is the middle region of a family of TATA element modulatory factor 1 proteins conserved in eukaryotes that contains at its N-terminal section a number of leucine zippers that could potentially form coiled coil structures. The whole proteins bind to the TATA element of some RNA polymerase II promoters and repress their activity by competing with the binding of TATA binding protein. TMFs are evolutionarily conserved golgins that bind Rab6, a ubiquitous ras-like GTP-binding Golgi protein, and contribute to Golgi organisation in animal and plant cells." Q#19712 - CGI_10025008 superfamily 248318 578 630 2.54E-23 94.0397 cl17764 FYVE superfamily - - "FYVE domain; Zinc-binding domain; targets proteins to membrane lipids via interaction with phosphatidylinositol-3-phosphate, PI3P; present in Fab1, YOTB, Vac1, and EEA1;" Q#19712 - CGI_10025008 superfamily 243142 85 208 1.51E-38 138.528 cl02689 RUN superfamily - - "RUN domain; This domain is present in several proteins that are linked to the functions of GTPases in the Rap and Rab families. They could hence play important roles in multiple Ras-like GTPase signalling pathways. The domain comprises six conserved regions, which in some proteins have considerable insertions between them. The domain core is thought to take up a predominantly alpha fold, with basic amino acids in regions A and D possibly playing a functional role in interactions with Ras GTPases." 
Q#19716 - CGI_10025012 superfamily 245312 103 304 2.68E-07 49.9552 cl10482 KefB superfamily C - "Kef-type K+ transport systems, membrane components [Inorganic ion transport and metabolism]" Q#19717 - CGI_10025013 superfamily 245312 46 447 1.95E-06 48.7751 cl10482 KefB superfamily - - "Kef-type K+ transport systems, membrane components [Inorganic ion transport and metabolism]" Q#19719 - CGI_10025015 superfamily 241645 1 76 5.12E-42 147.335 cl00155 UBQ superfamily N - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#19719 - CGI_10025015 superfamily 207411 564 602 2.06E-13 65.4884 cl01438 zf-AN1 superfamily - - "AN1-like Zinc finger; Zinc finger at the C-terminus of An1, a ubiquitin-like protein in Xenopus laevis. The following pattern describes the zinc finger. C-X2-C-X(9-12)-C-X(1-2)-C-X4-C-X2-H-X5-H-X-C Where X can be any amino acid, and numbers in brackets indicate the number of residues." Q#19720 - CGI_10025016 superfamily 241672 107 439 2.99E-133 389.281 cl00192 ribokinase_pfkB_like superfamily - - "ribokinase/pfkB superfamily: Kinases that accept a wide variety of substrates, including carbohydrates and aromatic small molecules, all are phosphorylated at a hydroxyl group. 
The superfamily includes ribokinase, fructokinase, ketohexokinase, 2-dehydro-3-deoxygluconokinase, 1-phosphofructokinase, the minor 6-phosphofructokinase (PfkB), inosine-guanosine kinase, and adenosine kinase. Even though there is a high degree of structural conservation within this superfamily, their multimerization level varies widely, monomeric (e.g. adenosine kinase), dimeric (e.g. ribokinase), and trimeric (e.g THZ kinase)." Q#19721 - CGI_10025017 superfamily 247736 950 1000 0.000436075 40.7221 cl17182 NAT_SF superfamily C - "N-Acyltransferase superfamily: Various enzymes that characteristically catalyze the transfer of an acyl group to a substrate; NAT (N-Acyltransferase) is a large superfamily of enzymes that mostly catalyze the transfer of an acyl group to a substrate and are implicated in a variety of functions, ranging from bacterial antibiotic resistance to circadian rhythms in mammals. Members include GCN5-related N-Acetyltransferases (GNAT) such as Aminoglycoside N-acetyltransferases, Histone N-acetyltransferase (HAT) enzymes, and Serotonin N-acetyltransferase, which catalyze the transfer of an acetyl group to a substrate. The kinetic mechanism of most GNATs involves the ordered formation of a ternary complex: the reaction begins with Acetyl Coenzyme A (AcCoA) binding, followed by binding of substrate, then direct transfer of the acetyl group from AcCoA to the substrate, followed by product and subsequent CoA release. Other family members include Arginine/ornithine N-succinyltransferase, Myristoyl-CoA: protein N-myristoyltransferase, and Acyl-homoserinelactone synthase which have a similar catalytic mechanism but differ in types of acyl groups transferred. Leucyl/phenylalanyl-tRNA-protein transferase and FemXAB nonribosomal peptidyltransferases which catalyze similar peptidyltransferase reactions are also included." 
Q#19721 - CGI_10025017 superfamily 247999 265 312 7.62E-14 69.0564 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#19721 - CGI_10025017 superfamily 241591 103 159 6.00E-08 52.1946 cl00073 H15 superfamily - - "linker histone 1 and histone 5 domains; the basic subunit of chromatin is the nucleosome, consisting of an octamer of core histones, two full turns of DNA, a linker histone (H1 or H5) and a variable length of linker DNA; H1/H5 are chromatin-associated proteins that bind to the exterior of nucleosomes and dramatically stabilize the highly condensed states of chromatin fibers; stabilization of higher order folding occurs through electrostatic neutralization of the linker DNA segments, through a highly positively charged carboxy- terminal domain known as the AKP helix (Ala, Lys, Pro); thought to be involved in specific protein-protein and protein-DNA interactions and play a role in suppressing core histone tail domain acetylation in the chromatin fiber" Q#19721 - CGI_10025017 superfamily 247999 208 265 1.51E-07 50.952 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#19721 - CGI_10025017 superfamily 201844 347 373 5.37E-07 48.831 cl03250 zf-C2HC superfamily - - "Zinc finger, C2HC type; This is a DNA binding zinc finger domain." Q#19726 - CGI_10025022 superfamily 247725 327 428 2.53E-43 150.474 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. 
All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#19726 - CGI_10025022 superfamily 241622 241 319 6.23E-16 73.3698 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archebacteria, and eurkayotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#19726 - CGI_10025022 superfamily 247725 456 548 1.57E-21 89.9973 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. 
They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#19727 - CGI_10025023 superfamily 222150 585 606 0.000236598 40.0677 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#19727 - CGI_10025023 superfamily 222150 609 634 0.00311422 36.6009 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#19727 - CGI_10025023 superfamily 222150 403 424 0.00647625 35.8305 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#19728 - CGI_10025024 superfamily 202894 65 130 2.74E-20 80.7278 cl04406 Mpv17_PMP22 superfamily - - "Mpv17 / PMP22 family; The 22-kDa peroxisomal membrane protein (PMP22) is a major component of peroxisomal membranes. PMP22 seems to be involved in pore forming activity and may contribute to the unspecific permeability of the organelle membrane. PMP22 is synthesised on free cytosolic ribosomes and then directed to the peroxisome membrane by specific targeting information. Mpv17 is a closely related peroxisomal protein. In mouse, the Mpv17 protein is involved in the development of early-onset glomerulosclerosis. More recently a homolog of Mpv17 in S. cerevisiae has been found to be an integral membrane protein of the inner mitochondrial membrane where it has been proposed to have a role in ethanol metabolism and tolerance during heat-shock. Defects in MPV17 are associated with mitochondrial DNA depletion syndrome (MDDS) and Navajo neurohepatopathy (NNH). MDDS is a clinically heterogeneous group of disorders characterized by a reduction in mitochondrial DNA (mtDNA) copy number. Primary mtDNA depletion is inherited as an autosomal recessive trait and may affect single organs, typically muscle or liver, or multiple tissues. 
Individuals with the hepatocerebral form of mitochondrial DNA depletion syndrome have early progressive liver failure and neurologic abnormalities, hypoglycemia, and increased lactate in body fluids. NNH is an autosomal recessive disease that is prevalent among Navajo children in the South Western states of America. The major clinical features are hepatopathy, peripheral neuropathy, corneal anesthesia and scarring, acral mutilation, cerebral leukoencephalopathy, failure to thrive, and recurrent metabolic acidosis with intercurrent infections. Infantile, childhood, and classic forms of NNH have been described. Mitochondrial DNA depletion was detected in the livers of patients, suggesting a primary defect in mtDNA maintenance." Q#19729 - CGI_10025025 superfamily 241622 764 850 1.63E-20 89.163 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archebacteria, and eurkayotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." 
Q#19729 - CGI_10025025 superfamily 241622 1199 1287 1.05E-19 86.8518 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archebacteria, and eurkayotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#19729 - CGI_10025025 superfamily 241622 896 980 8.70E-18 81.0738 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archebacteria, and eurkayotes. 
This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#19729 - CGI_10025025 superfamily 241622 1295 1365 7.91E-07 49.1023 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archebacteria, and eurkayotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#19729 - CGI_10025025 superfamily 246925 31 235 1.56E-12 69.3065 cl15309 LRR_RI superfamily - - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. 
LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#19730 - CGI_10025026 superfamily 245716 1 24 1.29E-07 48.0093 cl11592 zf-CCCH superfamily - - Zinc finger C-x8-C-x5-C-x3-H type (and similar); Zinc finger C-x8-C-x5-C-x3-H type (and similar). Q#19731 - CGI_10025027 superfamily 243066 18 118 9.68E-22 89.5989 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#19731 - CGI_10025027 superfamily 198867 128 234 0.00028188 39.2469 cl06652 BACK superfamily - - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. 
This family appears to be closely related to the BTB domain (Finn RD, personal observation)." Q#19732 - CGI_10025028 superfamily 243098 83 122 1.80E-11 55.6819 cl02573 TUDOR superfamily - - "Tudor domains are found in many eukaryotic organisms and have been implicated in protein-protein interactions in which methylated protein substrates bind to these domains. For example, the Tudor domain of Survival of Motor Neuron (SMN) binds to symmetrically dimethylated arginines of arginine-glycine (RG) rich sequences found in the C-terminal tails of Sm proteins. The SMN protein is linked to spinal muscular atrophy. Another example is the tandem tudor domains of 53BP1, which bind to histone H4 specifically dimethylated at Lys20 (H4-K20me2). 53BP1 is a key transducer of the DNA damage checkpoint signal." Q#19733 - CGI_10025029 superfamily 247725 260 382 7.44E-60 192.905 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#19733 - CGI_10025029 superfamily 243088 2 105 1.91E-58 189.078 cl02563 PX_domain superfamily - - "The Phox Homology domain, a phosphoinositide binding module; The PX domain is a phosphoinositide (PI) binding module involved in targeting proteins to membranes. Proteins containing PX domains interact with PIs and have been implicated in highly diverse functions such as cell signaling, vesicular trafficking, protein sorting, lipid modification, cell polarity and division, activation of T and B cells, and cell survival. 
Many members of this superfamily bind phosphatidylinositol-3-phosphate (PI3P) but in some cases, other PIs such as PI4P or PI(3,4)P2, among others, are the preferred substrates. In addition to protein-lipid interaction, the PX domain may also be involved in protein-protein interaction, as in the cases of p40phox, p47phox, and some sorting nexins (SNXs). The PX domain is conserved from yeast to humans and is found in more than 100 proteins. The majority of PX domain-containing proteins are SNXs, which play important roles in endosomal sorting." Q#19734 - CGI_10025030 superfamily 245815 24 480 0 738.608 cl11961 ALDH-SF superfamily - - "NAD(P)+-dependent aldehyde dehydrogenase superfamily; The aldehyde dehydrogenase superfamily (ALDH-SF) of NAD(P)+-dependent enzymes, in general, oxidize a wide range of endogenous and exogenous aliphatic and aromatic aldehydes to their corresponding carboxylic acids and play an important role in detoxification. Besides aldehyde detoxification, many ALDH isozymes possess multiple additional catalytic and non-catalytic functions such as participating in metabolic pathways, or as binding proteins, or osmoregulants, to mention a few. The enzyme has three domains, a NAD(P)+ cofactor-binding domain, a catalytic domain, and a bridging domain; and the active enzyme is generally either homodimeric or homotetrameric. The catalytic mechanism is proposed to involve cofactor binding, resulting in a conformational change and activation of an invariant catalytic cysteine nucleophile. The cysteine and aldehyde substrate form an oxyanion thiohemiacetal intermediate resulting in hydride transfer to the cofactor and formation of a thioacylenzyme intermediate. Hydrolysis of the thioacylenzyme and release of the carboxylic acid product occurs, and in most cases, the reduced cofactor dissociates from the enzyme. 
The evolutionary phylogenetic tree of ALDHs appears to have an initial bifurcation between what has been characterized as the classical aldehyde dehydrogenases, the ALDH family (ALDH) and extended family members or aldehyde dehydrogenase-like (ALDH-L) proteins. The ALDH proteins are represented by enzymes which share a number of highly conserved residues necessary for catalysis and cofactor binding and they include such proteins as retinal dehydrogenase, 10-formyltetrahydrofolate dehydrogenase, non-phosphorylating glyceraldehyde 3-phosphate dehydrogenase, delta(1)-pyrroline-5-carboxylate dehydrogenases, alpha-ketoglutaric semialdehyde dehydrogenase, alpha-aminoadipic semialdehyde dehydrogenase, coniferyl aldehyde dehydrogenase and succinate-semialdehyde dehydrogenase. Included in this larger group are all human, Arabidopsis, Tortula, fungal, protozoan, and Drosophila ALDHs identified in families ALDH1 through ALDH22 with the exception of families ALDH18, ALDH19, and ALDH20 which are present in the ALDH-like group. The ALDH-like group is represented by such proteins as gamma-glutamyl phosphate reductase, LuxC-like acyl-CoA reductase, and coenzyme A acylating aldehyde dehydrogenase. All of these proteins have a conserved cysteine that aligns with the catalytic cysteine of the ALDH group." Q#19737 - CGI_10025033 superfamily 217062 66 218 4.09E-11 59.9755 cl12266 Branch superfamily N - "Core-2/I-Branching enzyme; This is a family of two different beta-1,6-N-acetylglucosaminyltransferase enzymes, I-branching enzyme and core-2 branching enzyme . I-branching enzyme is responsible for the production of the blood group I-antigen during embryonic development. Core-2 branching enzyme forms crucial side-chain branches in O-glycans." 
Q#19738 - CGI_10025034 superfamily 247856 337 405 5.54E-13 65.6469 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#19738 - CGI_10025034 superfamily 241566 497 545 8.35E-07 47.4868 cl00040 C1 superfamily - - "Protein kinase C conserved region 1 (C1) . Cysteine-rich zinc binding domain. Some members of this domain family bind phorbol esters and diacylglycerol, some are reported to bind RasGTP. May occur in tandem arrangement. Diacylglycerol (DAG) is a second messenger, released by activation of Phospholipase D. Phorbol Esters (PE) can act as analogues of DAG and mimic its downstream effects in, for example, tumor promotion. Protein Kinases C are activated by DAG/PE, this activation is mediated by their N-terminal conserved region (C1). DAG/PE binding may be phospholipid dependent. C1 domains may also mediate DAG/PE signals in chimaerins (a family of Rac GTPase activating proteins), RasGRPs (exchange factors for Ras/Rap1), and Munc13 isoforms (scaffolding proteins involved in exocytosis)." Q#19738 - CGI_10025034 superfamily 241566 429 481 9.54E-07 47.1016 cl00040 C1 superfamily - - "Protein kinase C conserved region 1 (C1) . Cysteine-rich zinc binding domain. Some members of this domain family bind phorbol esters and diacylglycerol, some are reported to bind RasGTP. May occur in tandem arrangement. Diacylglycerol (DAG) is a second messenger, released by activation of Phospholipase D. Phorbol Esters (PE) can act as analogues of DAG and mimic its downstream effects in, for example, tumor promotion. Protein Kinases C are activated by DAG/PE, this activation is mediated by their N-terminal conserved region (C1). 
DAG/PE binding may be phospholipid dependent. C1 domains may also mediate DAG/PE signals in chimaerins (a family of Rac GTPase activating proteins), RasGRPs (exchange factors for Ras/Rap1), and Munc13 isoforms (scaffolding proteins involved in exocytosis)." Q#19738 - CGI_10025034 superfamily 243037 750 937 2.35E-64 214.121 cl02440 DAGK_acc superfamily - - Diacylglycerol kinase accessory domain; Diacylglycerol (DAG) is a second messenger that acts as a protein kinase C activator. This domain is assumed to be an accessory domain: its function is unknown. Q#19738 - CGI_10025034 superfamily 248019 605 726 6.00E-54 184.037 cl17465 DAGK_cat superfamily - - "Diacylglycerol kinase catalytic domain; Diacylglycerol (DAG) is a second messenger that acts as a protein kinase C activator. The catalytic domain is assumed from the finding of bacterial homologues. YegS is the Escherichia coli protein in this family whose crystal structure reveals an active site in the inter-domain cleft formed by four conserved sequence motifs, revealing a novel metal-binding site. The residues of this site are conserved across the family." Q#19739 - CGI_10025035 superfamily 198738 366 450 1.37E-46 158.971 cl02599 Ets superfamily - - Ets-domain; Ets-domain. Q#19740 - CGI_10025036 superfamily 247068 15 112 1.70E-08 51.9306 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#19741 - CGI_10025037 superfamily 247856 21 83 7.23E-11 54.0909 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#19741 - CGI_10025037 superfamily 247856 95 157 1.11E-09 51.0093 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#19742 - CGI_10025038 superfamily 219425 28 133 2.32E-09 51.0031 cl06494 Hydrolase_2 superfamily - - "Cell Wall Hydrolase; These enzymes have been implicated in cell wall hydrolysis, most extensively in Bacillus subtilis. For instance B. subtilis sleB is expressed during sporulation as an inactive form and then deposited on the cell outer cortex. During germination the enzyme is activated and hydrolyses the cortex. A similar role is carried out by the partially redundant B. subtilis cwlJ. It is not clear whether these enzymes are amidases or peptidases." Q#19744 - CGI_10021221 superfamily 202894 51 116 9.53E-17 71.8682 cl04406 Mpv17_PMP22 superfamily - - "Mpv17 / PMP22 family; The 22-kDa peroxisomal membrane protein (PMP22) is a major component of peroxisomal membranes. PMP22 seems to be involved in pore forming activity and may contribute to the unspecific permeability of the organelle membrane. 
PMP22 is synthesised on free cytosolic ribosomes and then directed to the peroxisome membrane by specific targeting information. Mpv17 is a closely related peroxisomal protein. In mouse, the Mpv17 protein is involved in the development of early-onset glomerulosclerosis. More recently a homolog of Mpv17 in S. cerevisiae has been found to be an integral membrane protein of the inner mitochondrial membrane where it has been proposed to have a role in ethanol metabolism and tolerance during heat-shock. Defects in MPV17 are associated with mitochondrial DNA depletion syndrome (MDDS) and Navajo neurohepatopathy (NNH). MDDS is a clinically heterogeneous group of disorders characterized by a reduction in mitochondrial DNA (mtDNA) copy number. Primary mtDNA depletion is inherited as an autosomal recessive trait and may affect single organs, typically muscle or liver, or multiple tissues. Individuals with the hepatocerebral form of mitochondrial DNA depletion syndrome have early progressive liver failure and neurologic abnormalities, hypoglycemia, and increased lactate in body fluids. NNH is an autosomal recessive disease that is prevalent among Navajo children in the South Western states of America. The major clinical features are hepatopathy, peripheral neuropathy, corneal anesthesia and scarring, acral mutilation, cerebral leukoencephalopathy, failure to thrive, and recurrent metabolic acidosis with intercurrent infections. Infantile, childhood, and classic forms of NNH have been described. Mitochondrial DNA depletion was detected in the livers of patients, suggesting a primary defect in mtDNA maintenance." Q#19746 - CGI_10021223 superfamily 247724 500 719 2.28E-131 397.247 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. 
The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#19746 - CGI_10021223 superfamily 243184 814 920 2.48E-41 148.139 cl02786 Translation_factor_III superfamily - - "Domain III of Elongation factor (EF) Tu (EF-TU) and EF-G. Elongation factors (EF) EF-Tu and EF-G participate in the elongation phase during protein biosynthesis on the ribosome. Their functional cycles depend on GTP binding and its hydrolysis. The EF-Tu complexed with GTP and aminoacyl-tRNA delivers tRNA to the ribosome, whereas EF-G stimulates translocation, a process in which tRNA and mRNA movements occur in the ribosome. Experimental data showed that: (1) intrinsic GTPase activity of EF-G is influenced by excision of its domain III; (2) that EF-G lacking domain III has a 1,000-fold decreased GTPase activity on the ribosome and, a slightly decreased affinity for GTP; and (3) EF-G lacking domain III does not stimulate translocation, despite the physical presence of domain IV which is also very important for translocation. 
These findings indicate an essential contribution of domain III to activation of GTP hydrolysis. Domains III and V of EF-G have the same fold (although they are not completely superimposable), the double split beta-alpha-beta fold. This fold is observed in a large number of ribonucleotide binding proteins and is also referred to as the ribonucleoprotein (RNP) or RNA recognition (RRM) motif. This domain III is found in several elongation factors, as well as in peptide chain release factors and in GT-1 family of GTPase (GTPBP1)." Q#19746 - CGI_10021223 superfamily 243185 725 808 1.03E-14 71.4024 cl02787 Translation_Factor_II_like superfamily - - "Translation_Factor_II_like: Elongation factor Tu (EF-Tu) domain II-like proteins. Elongation factor Tu consists of three structural domains, this family represents the second domain. Domain II adopts a beta barrel structure and is involved in binding to charged tRNA. Domain II is found in other proteins such as elongation factor G and translation initiation factor IF-2. This group also includes the C2 subdomain of domain IV of IF-2 that has the same fold as domain II of (EF-Tu). Like IF-2 from certain prokaryotes such as Thermus thermophilus, mitochondrial IF-2 lacks domain II, which is thought to be involved in binding of E.coli IF-2 to 30S subunits." Q#19746 - CGI_10021223 superfamily 220077 1 132 5.30E-27 108.676 cl07512 DUF1916 superfamily - - Domain of unknown function (DUF1916); This domain is found in various eukaryotic HBS1-like proteins. Q#19747 - CGI_10021224 superfamily 241745 8 185 2.16E-74 224.322 cl00276 Maf_Ham1 superfamily - - "Maf_Ham1. Maf, a nucleotide binding protein, has been implicated in inhibition of septum formation in eukaryotes, bacteria and archaea. A Ham1-related protein from Methanococcus jannaschii is a novel NTPase that has been shown to hydrolyze nonstandard nucleotides, such as hypoxanthine/xanthine NTP, but not standard nucleotides." 
Q#19748 - CGI_10021225 superfamily 207654 159 224 6.58E-20 84.4166 cl02574 Annexin superfamily - - Annexin; This family of annexins also includes giardin that has been shown to function as an annexin. Q#19748 - CGI_10021225 superfamily 207654 368 433 6.58E-20 84.4166 cl02574 Annexin superfamily - - Annexin; This family of annexins also includes giardin that has been shown to function as an annexin. Q#19748 - CGI_10021225 superfamily 207654 90 152 4.57E-18 79.409 cl02574 Annexin superfamily - - Annexin; This family of annexins also includes giardin that has been shown to function as an annexin. Q#19748 - CGI_10021225 superfamily 207654 527 592 2.06E-15 71.705 cl02574 Annexin superfamily - - Annexin; This family of annexins also includes giardin that has been shown to function as an annexin. Q#19748 - CGI_10021225 superfamily 207654 242 308 2.46E-15 71.3198 cl02574 Annexin superfamily - - Annexin; This family of annexins also includes giardin that has been shown to function as an annexin. Q#19748 - CGI_10021225 superfamily 207654 451 517 2.46E-15 71.3198 cl02574 Annexin superfamily - - Annexin; This family of annexins also includes giardin that has been shown to function as an annexin. Q#19748 - CGI_10021225 superfamily 207654 309 361 6.72E-15 69.7435 cl02574 Annexin superfamily - - Annexin; This family of annexins also includes giardin that has been shown to function as an annexin. Q#19752 - CGI_10021229 superfamily 222575 76 174 2.04E-12 60.0956 cl16668 FAM110_C superfamily - - Centrosome-associated C terminus; This is the C-terminus of a family of proteins that colocalise with the centrosome/microtubule organisation centre in interphase and at the spindle poles in mitosis. 
Q#19752 - CGI_10021229 superfamily 206330 12 70 9.43E-09 49.5489 cl16669 FAM110_N superfamily C - Centrosome-associated N terminus; This is the N-terminus of a family of proteins that colocalise with the centrosome/microtubule organisation centre in interphase and at the spindle poles in mitosis. Q#19754 - CGI_10021231 superfamily 148646 61 366 1.15E-173 488.902 cl12343 DUF1394 superfamily - - Protein of unknown function (DUF1394); This family consists of several hypothetical eukaryotic proteins of around 320 residues in length. The function of this family is unknown. Q#19755 - CGI_10021232 superfamily 148646 1 122 8.79E-78 238.908 cl12343 DUF1394 superfamily N - Protein of unknown function (DUF1394); This family consists of several hypothetical eukaryotic proteins of around 320 residues in length. The function of this family is unknown. Q#19756 - CGI_10021233 superfamily 243077 2778 2823 0.00702518 37.51 cl02542 DnaJ superfamily - - "DnaJ domain or J-domain. DnaJ/Hsp40 (heat shock protein 40) proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperonine family. Hsp40 proteins are characterized by the presence of a J domain, which mediates the interaction with Hsp70. They may contain other domains as well, and the architectures provide a means of classification." Q#19757 - CGI_10021234 superfamily 243072 250 369 1.60E-32 119.796 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#19757 - CGI_10021234 superfamily 243072 143 270 1.69E-29 111.321 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#19757 - CGI_10021234 superfamily 243072 44 169 4.44E-22 90.9058 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#19758 - CGI_10021235 superfamily 218498 359 545 3.00E-59 201.775 cl18459 TRM13 superfamily C - Methyltransferase TRM13; This is a family of eukaryotic proteins which are responsible for 2'-O-methylation of tRNA at position 4. TRM13 shows no sequence similarity to other known methyltransferases. Q#19758 - CGI_10021235 superfamily 218498 636 714 3.23E-17 81.2076 cl18459 TRM13 superfamily N - Methyltransferase TRM13; This is a family of eukaryotic proteins which are responsible for 2'-O-methylation of tRNA at position 4. TRM13 shows no sequence similarity to other known methyltransferases. Q#19758 - CGI_10021235 superfamily 191243 270 296 0.00320612 36.2663 cl05016 zf-U11-48K superfamily - - U11-48K-like CHHC zinc finger; This zinc binding domain has four conserved zinc chelating residues in a CHHC pattern. This domain is predicted to have an RNA-binding function. 
Q#19759 - CGI_10021236 superfamily 241643 164 200 8.02E-11 57.0839 cl00153 UBA superfamily - - "Ubiquitin Associated domain. The UBA domain is a commonly occurring sequence motif in some members of the ubiquitination pathway, UV excision repair proteins, and certain protein kinases. Although its specific role is so far unknown, it has been suggested that UBA domains are involved in conferring protein target specificity. The domain, a compact three helix bundle, has a conserved GFP-loop and the proline is thought to be critical for binding. The UBA domain is distinct from the conserved three helical domain seen in the N-terminus of EF-TS and eukaryotic NAC proteins." Q#19759 - CGI_10021236 superfamily 241643 344 379 0.000123187 39.3647 cl00153 UBA superfamily - - "Ubiquitin Associated domain. The UBA domain is a commonly occurring sequence motif in some members of the ubiquitination pathway, UV excision repair proteins, and certain protein kinases. Although its specific role is so far unknown, it has been suggested that UBA domains are involved in conferring protein target specificity. The domain, a compact three helix bundle, has a conserved GFP-loop and the proline is thought to be critical for binding. The UBA domain is distinct from the conserved three helical domain seen in the N-terminus of EF-TS and eukaryotic NAC proteins." Q#19759 - CGI_10021236 superfamily 241645 1 74 9.46E-24 93.1628 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. 
The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#19759 - CGI_10021236 superfamily 192241 254 312 1.53E-20 84.2046 cl18177 XPC-binding superfamily - - "XPC-binding domain; Members of this family adopt a structure consisting of four alpha helices, arranged in an array. They bind specifically and directly to the xeroderma pigmentosum group C protein (XPC) to initiate nucleotide excision repair." Q#19760 - CGI_10021237 superfamily 241881 376 716 5.35E-157 460.842 cl00464 URO-D_CIMS_like superfamily - - "The URO-D_CIMS_like protein superfamily includes bacterial and eukaryotic uroporphyrinogen decarboxylases (URO-D), coenzyme M methyltransferases and other putative bacterial methyltransferases, as well as cobalamine (B12) independent methionine synthases. Despite their sequence similarities, members of this family have clearly different functions. Uroporphyrinogen decarboxylase (URO-D) decarboxylates the four acetate side chains of uroporphyrinogen III (uro-III) to create coproporphyrinogen III, an important branching point of the tetrapyrrole biosynthetic pathway. The methyltransferases represented here are important for ability of methanogenic organisms to use other compounds than carbon dioxide for reduction to methane, and methionine synthases transfer a methyl group from a folate cofactor to L-homocysteine in a reaction requiring zinc." 
Q#19760 - CGI_10021237 superfamily 243092 124 343 1.56E-24 103.954 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#19760 - CGI_10021237 superfamily 243074 49 92 3.27E-06 45.1901 cl02535 F-box-like superfamily - - F-box-like; This is an F-box-like family. Q#19761 - CGI_10021238 superfamily 247920 21 88 0.00207519 38.5689 cl17366 Herpes_HEPA superfamily C - "Herpesvirus DNA helicase/primase complex associated protein; This family includes HSV UL8, EHV-1 54, VZV 52 AND HCMV 102." Q#19763 - CGI_10021240 superfamily 191362 157 209 9.78E-23 87.7114 cl05351 zf-nanos superfamily - - "Nanos RNA binding domain; This family consists of several conserved novel zinc finger domains found in the eukaryotic proteins Nanos and Xcat-2. In Drosophila melanogaster, Nanos functions as a localised determinant of posterior pattern. 
Nanos RNA is localised to the posterior pole of the maturing egg cell and encodes a protein that emanates from this localised source. Nanos acts as a translational repressor and thereby establishes a gradient of the morphogen Hunchback. Xcat-2 is found in the vegetal cortical region and is inherited by the vegetal blastomeres during development, and is degraded very early in development. The localised and maternally restricted expression of Xcat-2 RNA suggests a role for its protein in setting up regional differences in gene expression that occur early in development." Q#19764 - CGI_10021241 superfamily 246921 284 338 2.52E-05 41.5921 cl15299 FG-GAP superfamily - - "FG-GAP repeat; This family contains the extracellular repeat that is found in up to seven copies in alpha integrins. This repeat has been predicted to fold into a beta propeller structure. The repeat is called the FG-GAP repeat after two conserved motifs in the repeat. The FG-GAP repeats are found in the N terminus of integrin alpha chains, a region that has been shown to be important for ligand binding. A putative Ca2+ binding motif is found in some of the repeats." Q#19764 - CGI_10021241 superfamily 246921 16 45 0.000425008 38.1253 cl15299 FG-GAP superfamily NC - "FG-GAP repeat; This family contains the extracellular repeat that is found in up to seven copies in alpha integrins. This repeat has been predicted to fold into a beta propeller structure. The repeat is called the FG-GAP repeat after two conserved motifs in the repeat. The FG-GAP repeats are found in the N terminus of integrin alpha chains, a region that has been shown to be important for ligand binding. A putative Ca2+ binding motif is found in some of the repeats." Q#19765 - CGI_10021242 superfamily 244880 52 218 4.61E-78 236.847 cl08263 TBP_TLF superfamily - - "TATA box binding protein (TBP): Present in archaea and eukaryotes, TBPs are transcription factors that recognize promoters and initiate transcription. 
TBP has been shown to be an essential component of three different transcription initiation complexes: SL1, TFIID and TFIIIB, directing transcription by RNA polymerases I, II and III, respectively. TBP binds directly to the TATA box promoter element, where it nucleates polymerase assembly, thus defining the transcription start site. TBP's binding in the minor groove induces a dramatic DNA bending while its own structure barely changes. The conserved core domain of TBP, which binds to the TATA box, has a bipartite structure, with intramolecular symmetry generating a saddle-shaped structure that sits astride the DNA. New members of the TBP family, called TBP-like proteins (TBLP, TLF, TLP) or TBP-related factors (TRF1, TRF2, TRP), are similar to the core domain of TBPs, with identical or chemically similar amino acids at many equivalent positions, suggesting similar structure. However, TLFs contain distinct, conserved amino acids at several positions that distinguish them from TBP." Q#19766 - CGI_10021243 superfamily 216574 72 154 5.70E-19 83.026 cl14794 FAD_binding_4 superfamily N - "FAD binding domain; This family consists of various enzymes that use FAD as a co-factor, most of the enzymes are similar to oxygen oxidoreductase. One of the enzymes Vanillyl-alcohol oxidase (VAO) has a solved structure, the alignment includes the FAD binding site, called the PP-loop, between residues 99-110. The FAD molecule is covalently bound in the known structure, however the residue that links to the FAD is not in the alignment. VAO catalyzes the oxidation of a wide variety of substrates, ranging from aromatic amines to 4-alkylphenols. Other members of this family include D-lactate dehydrogenase, this enzyme catalyzes the conversion of D-lactate to pyruvate using FAD as a co-factor; mitomycin radical oxidase, this enzyme oxidises the reduced form of mitomycins and is involved in mitomycin resistance. 
This family includes MurB, a UDP-N-acetylenolpyruvoylglucosamine reductase enzyme EC:1.1.1.158. This enzyme is involved in the biosynthesis of peptidoglycan." Q#19767 - CGI_10021244 superfamily 247792 722 765 1.34E-08 52.448 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#19767 - CGI_10021244 superfamily 243112 440 607 2.03E-67 221.356 cl02620 YDG_SRA superfamily - - "YDG/SRA domain; The function of this domain is unknown, it contains a conserved motif YDG after which it has been named." Q#19767 - CGI_10021244 superfamily 241645 1 77 5.08E-35 128.397 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. 
In poly-ubiquitination, ubiquitin itself is the substrate." Q#19767 - CGI_10021244 superfamily 152583 151 251 4.45E-24 97.878 cl13567 DUF3590 superfamily - - "Protein of unknown function (DUF3590); This domain is found in eukaryotes, and is typically between 83 and 97 amino acids in length. It is found in association with pfam00097, pfam02182, pfam00628, pfam00240. There are two conserved sequence motifs: RAR and NYN. The domain is part of the protein NIRF which has zinc finger and ubiquitinating domains. The function of this domain is likely to be mainly structural, however this has not been confirmed." Q#19767 - CGI_10021244 superfamily 247999 350 387 0.000181055 40.273 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#19769 - CGI_10021246 superfamily 241584 127 218 3.14E-07 47.8763 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#19771 - CGI_10021248 superfamily 243857 27 696 0 1069.17 cl04712 Ceramidase_alk superfamily - - Neutral/alkaline non-lysosomal ceramidase; This family represents a group of neutral/alkaline ceramidases found in both bacteria and eukaryotes. Q#19772 - CGI_10021249 superfamily 247824 213 273 1.16E-08 52.7921 cl17270 APH_ChoK_like superfamily N - "Aminoglycoside 3'-phosphotransferase (APH) and Choline Kinase (ChoK) family. 
The APH/ChoK family is part of a larger superfamily that includes the catalytic domains of other kinases, such as the typical serine/threonine/tyrosine protein kinases (PKs), RIO kinases, actin-fragmin kinase (AFK), and phosphoinositide 3-kinase (PI3K). The family is composed of APH, ChoK, ethanolamine kinase (ETNK), macrolide 2'-phosphotransferase (MPH2'), an unusual homoserine kinase, and uncharacterized proteins with similarity to the N-terminal domain of acyl-CoA dehydrogenase 10 (ACAD10). The members of this family catalyze the transfer of the gamma-phosphoryl group from ATP (or CTP) to small molecule substrates such as aminoglycosides, macrolides, choline, ethanolamine, and homoserine. Phosphorylation of the antibiotics, aminoglycosides and macrolides, leads to their inactivation and to bacterial antibiotic resistance. Phosphorylation of choline, ethanolamine, and homoserine serves as precursors to the synthesis of important biological compounds, such as the major phospholipids, phosphatidylcholine and phosphatidylethanolamine and the amino acids, threonine, methionine, and isoleucine." Q#19778 - CGI_10021255 superfamily 242169 13 93 5.98E-13 59.8982 cl00886 Robl_LC7 superfamily - - "Roadblock/LC7 domain; This family includes proteins that are about 100 amino acids long and have been shown to be related. Members of this family of proteins are associated with both flagellar outer arm dynein and Drosophila and rat brain cytoplasmic dynein. It is proposed that roadblock/LC7 family members may modulate specific dynein functions. This family also includes Golgi-associated MP1 adapter protein and MglB from Myxococcus xanthus, a protein involved in gliding motility. However the family also includes members from non-motile bacteria such as Streptomyces coelicolor, suggesting that the protein may play a structural or regulatory role." 
Q#19779 - CGI_10021256 superfamily 247640 27 148 2.94E-06 46.9373 cl16915 ZnPC_S1P1 superfamily C - "Zinc dependent phospholipase C/S1-P1 nuclease; This model describes both the bacterial and archaeal zinc-dependent phospholipase C, a domain found in the alpha toxin of Clostridium perfringens, as well as S1/P1 nucleases, which predominantly act on single-stranded DNA and RNA." Q#19779 - CGI_10021256 superfamily 246921 508 549 0.00152154 37.7401 cl15299 FG-GAP superfamily C - "FG-GAP repeat; This family contains the extracellular repeat that is found in up to seven copies in alpha integrins. This repeat has been predicted to fold into a beta propeller structure. The repeat is called the FG-GAP repeat after two conserved motifs in the repeat. The FG-GAP repeats are found in the N terminus of integrin alpha chains, a region that has been shown to be important for ligand binding. A putative Ca2+ binding motif is found in some of the repeats." Q#19779 - CGI_10021256 superfamily 246921 708 739 0.00344614 36.5845 cl15299 FG-GAP superfamily C - "FG-GAP repeat; This family contains the extracellular repeat that is found in up to seven copies in alpha integrins. This repeat has been predicted to fold into a beta propeller structure. The repeat is called the FG-GAP repeat after two conserved motifs in the repeat. The FG-GAP repeats are found in the N terminus of integrin alpha chains, a region that has been shown to be important for ligand binding. A putative Ca2+ binding motif is found in some of the repeats." Q#19780 - CGI_10021257 superfamily 247640 9 127 3.95E-09 56.1821 cl16915 ZnPC_S1P1 superfamily C - "Zinc dependent phospholipase C/S1-P1 nuclease; This model describes both the bacterial and archaeal zinc-dependent phospholipase C, a domain found in the alpha toxin of Clostridium perfringens, as well as S1/P1 nucleases, which predominantly act on single-stranded DNA and RNA." 
Q#19780 - CGI_10021257 superfamily 246921 678 730 1.78E-06 46.5997 cl15299 FG-GAP superfamily - - "FG-GAP repeat; This family contains the extracellular repeat that is found in up to seven copies in alpha integrins. This repeat has been predicted to fold into a beta propeller structure. The repeat is called the FG-GAP repeat after two conserved motifs in the repeat. The FG-GAP repeats are found in the N terminus of integrin alpha chains, a region that has been shown to be important for ligand binding. A putative Ca2+ binding motif is found in some of the repeats." Q#19780 - CGI_10021257 superfamily 246921 338 388 1.98E-06 46.5997 cl15299 FG-GAP superfamily - - "FG-GAP repeat; This family contains the extracellular repeat that is found in up to seven copies in alpha integrins. This repeat has been predicted to fold into a beta propeller structure. The repeat is called the FG-GAP repeat after two conserved motifs in the repeat. The FG-GAP repeats are found in the N terminus of integrin alpha chains, a region that has been shown to be important for ligand binding. A putative Ca2+ binding motif is found in some of the repeats." Q#19780 - CGI_10021257 superfamily 246921 415 472 3.60E-05 42.7477 cl15299 FG-GAP superfamily - - "FG-GAP repeat; This family contains the extracellular repeat that is found in up to seven copies in alpha integrins. This repeat has been predicted to fold into a beta propeller structure. The repeat is called the FG-GAP repeat after two conserved motifs in the repeat. The FG-GAP repeats are found in the N terminus of integrin alpha chains, a region that has been shown to be important for ligand binding. A putative Ca2+ binding motif is found in some of the repeats." Q#19780 - CGI_10021257 superfamily 246921 479 513 0.00620808 36.1993 cl15299 FG-GAP superfamily C - "FG-GAP repeat; This family contains the extracellular repeat that is found in up to seven copies in alpha integrins. 
This repeat has been predicted to fold into a beta propeller structure. The repeat is called the FG-GAP repeat after two conserved motifs in the repeat. The FG-GAP repeats are found in the N terminus of integrin alpha chains, a region that has been shown to be important for ligand binding. A putative Ca2+ binding motif is found in some of the repeats." Q#19783 - CGI_10021260 superfamily 246709 222 362 6.46E-71 220.929 cl14782 RNase_H superfamily - - "RNase H is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a sequence non-specific manner; Ribonuclease H (RNase H) enzymes are divided into two major families, Type 1 and Type 2, based on amino acid sequence similarities and biochemical properties. RNase H is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a sequence non-specific manner in the presence of divalent cations. RNase H is widely present in various organisms, including bacteria, archaea and eukaryotes. Most prokaryotic and eukaryotic genomes contain multiple RNase H genes. Despite the lack of amino acid sequence homology, Type 1 and type 2 RNase H share a main-chain fold and steric configurations of the four acidic active-site residues and have the same catalytic mechanism and functions in cells. RNase H is involved in DNA replication, repair and transcription. One of the important functions of RNase H is to remove Okazaki fragments during DNA replication. RNase H inhibitors have been explored as an anti-HIV drug target because RNase H inactivation inhibits reverse transcription." Q#19783 - CGI_10021260 superfamily 150794 1 136 3.70E-45 153.279 cl10862 DUF2368 superfamily - - Uncharacterized conserved protein (DUF2368); This family is conserved from nematodes to humans. The function is not known. Q#19783 - CGI_10021260 superfamily 201924 156 182 3.48E-06 43.6725 cl03316 Cauli_VI superfamily N - "Caulimovirus viroplasmin; This family consists of various caulimovirus viroplasmin proteins. 
The viroplasmin protein is encoded by gene VI and is the main component of viral inclusion bodies or viroplasms. Inclusions are the site of viral assembly, DNA synthesis and accumulation. Two domains exist within gene VI corresponding approximately to the 5' third and middle third of gene VI, these influence systemic infection in a light-dependent manner." Q#19785 - CGI_10021262 superfamily 243035 449 566 2.29E-15 73.4229 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. 
C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#19785 - CGI_10021262 superfamily 241804 4 343 1.79E-179 516.029 cl00348 COG0182 superfamily - - "Predicted translation initiation factor 2B subunit, eIF-2B alpha/beta/delta family [Translation, ribosomal structure and biogenesis]" Q#19786 - CGI_10021263 superfamily 245201 4 284 0 569.089 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." 
Q#19787 - CGI_10021264 superfamily 247724 7 173 1.55E-117 334.236 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#19788 - CGI_10021265 superfamily 243038 125 200 7.51E-14 66.9805 cl02442 DEP superfamily - - "DEP domain, named after Dishevelled, Egl-10, and Pleckstrin, where this domain was first discovered. The function of this domain is still not clear, but it is believed to be important for the membrane association of the signaling proteins in which it is present. New studies show that the DEP domain of Sst2, a yeast RGS protein is necessary and sufficient for receptor interaction." Q#19788 - CGI_10021265 superfamily 218263 272 403 4.59E-37 132.242 cl04748 DUF547 superfamily - - "Protein of unknown function, DUF547; Family of uncharacterized proteins from C. 
elegans and A. thaliana." Q#19788 - CGI_10021265 superfamily 241832 7 86 2.58E-25 98.871 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#19790 - CGI_10021267 superfamily 247856 93 154 1.94E-13 65.6469 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#19791 - CGI_10021268 superfamily 247739 49 246 3.93E-60 193.609 cl17185 LPLAT superfamily - - "Lysophospholipid acyltransferases (LPLATs) of glycerophospholipid biosynthesis; Lysophospholipid acyltransferase (LPLAT) superfamily members are acyltransferases of de novo and remodeling pathways of glycerophospholipid biosynthesis. 
These proteins catalyze the incorporation of an acyl group from either acylCoAs or acyl-acyl carrier proteins (acylACPs) into acceptors such as glycerol 3-phosphate, dihydroxyacetone phosphate or lyso-phosphatidic acid. Included in this superfamily are LPLATs such as glycerol-3-phosphate 1-acyltransferase (GPAT, PlsB), 1-acyl-sn-glycerol-3-phosphate acyltransferase (AGPAT, PlsC), lysophosphatidylcholine acyltransferase 1 (LPCAT-1), lysophosphatidylethanolamine acyltransferase (LPEAT, also known as, MBOAT2, membrane-bound O-acyltransferase domain-containing protein 2), lipid A biosynthesis lauroyl/myristoyl acyltransferase, 2-acylglycerol O-acyltransferase (MGAT), dihydroxyacetone phosphate acyltransferase (DHAPAT, also known as 1 glycerol-3-phosphate O-acyltransferase 1) and Tafazzin (the protein product of the Barth syndrome (TAZ) gene)." Q#19792 - CGI_10021269 superfamily 247792 29 71 1.83E-07 47.0552 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#19792 - CGI_10021269 superfamily 207713 97 147 1.30E-13 65.0537 cl02729 WWE superfamily C - WWE domain; The WWE domain is named after three of its conserved residues and is predicted to mediate specific protein-protein interactions in ubiquitin and ADP ribose conjugation systems. 
Q#19793 - CGI_10021270 superfamily 220243 5 139 2.62E-52 173.625 cl09680 eIF3_N superfamily - - eIF3 subunit 6 N terminal domain; This is the N terminal domain of subunit 6 translation initiation factor eIF3. Q#19793 - CGI_10021270 superfamily 242889 291 392 2.33E-17 77.2581 cl02111 PCI superfamily - - "PCI domain; This domain has also been called the PINT motif (Proteasome, Int-6, Nip-1 and TRIP-15)." Q#19794 - CGI_10021271 superfamily 241585 72 115 0.00190617 34.802 cl00066 FU superfamily - - Furin-like repeats. Cysteine rich region. Exact function of the domain is not known. Furin is a serine-kinase dependent proprotein processor. Other members of this family include endoproteases and cell surface receptors. Q#19795 - CGI_10021272 superfamily 243072 175 268 3.00E-16 75.4978 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#19799 - CGI_10021276 superfamily 247725 79 178 9.48E-36 132.405 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting proteins to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and other proteins." 
Q#19799 - CGI_10021276 superfamily 243056 566 784 8.98E-59 201.381 cl02495 RabGAP-TBC superfamily - - "Rab-GTPase-TBC domain; Identification of a TBC domain in GYP6_YEAST and GYP7_YEAST, which are GTPase activator proteins of yeast Ypt6 and Ypt7, implies that these domains are GTPase activator proteins of Rab-like small GTPases." Q#19805 - CGI_10003153 superfamily 219188 8 44 4.16E-06 42.2723 cl18498 Lung_7-TM_R superfamily N - Lung seven transmembrane receptor; This family represents a conserved region with eukaryotic lung seven transmembrane receptors and related proteins. Q#19808 - CGI_10003912 superfamily 220388 98 457 3.99E-118 360.14 cl12372 FimP superfamily - - "Fms-interacting protein; This entry carries part of the crucial 144 N-terminal residues of the FmiP protein, which is essential for the binding of the protein to the cytoplasmic domain of activated Fms-molecules in M-CSF induced haematopoietic differentiation of macrophages. The C-terminus contains a putative nuclear localisation sequence and a leucine zipper which suggest further, as yet unknown, nuclear functions. The level of FMIP expression might form a threshold that determines whether cells differentiate into macrophages or into granulocytes." Q#19809 - CGI_10003913 superfamily 247856 56 100 3.61E-08 47.9277 cl17302 EFh superfamily C - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#19809 - CGI_10003913 superfamily 247856 126 183 0.000873913 35.6013 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. 
Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#19812 - CGI_10002782 superfamily 248458 38 118 9.27E-08 50.3901 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." 
Q#19813 - CGI_10005250 superfamily 214531 60 102 5.18E-07 46.4409 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#19813 - CGI_10005250 superfamily 215683 388 428 0.000384287 37.9199 cl18339 Ldl_recept_b superfamily - - Low-density lipoprotein receptor repeat class B; This domain is also known as the YWTD motif after the most conserved region of the repeat. The YWTD repeat is found in multiple tandem repeats and has been predicted to form a beta-propeller structure. Q#19813 - CGI_10005250 superfamily 245213 170 206 0.00800473 34.1448 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#19814 - CGI_10005251 superfamily 243056 1 107 1.74E-21 93.1925 cl02495 RabGAP-TBC superfamily N - "Rab-GTPase-TBC domain; Identification of a TBC domain in GYP6_YEAST and GYP7_YEAST, which are GTPase activator proteins of yeast Ypt6 and Ypt7, implies that these domains are GTPase activator proteins of Rab-like small GTPases." 
Q#19815 - CGI_10009353 superfamily 247755 44 102 4.31E-08 51.8796 cl17201 ABC_ATPase superfamily NC - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#19816 - CGI_10009354 superfamily 245814 498 570 7.09E-08 50.5655 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#19816 - CGI_10009354 superfamily 241756 8 316 3.98E-51 180.212 cl00289 FIG superfamily - - "FIG, FBPase/IMPase/glpX-like domain. A superfamily of metal-dependent phosphatases with various substrates. Fructose-1,6-bisphosphatase (both the major and the glpX-encoded variant) hydrolyze fructose-1,6-bisphosphate to fructose-6-phosphate in gluconeogenesis. Inositol-monophosphatases and inositol polyphosphatases play vital roles in eukaryotic signalling, as they participate in metabolizing the messenger molecule Inositol-1,4,5-triphosphate. Many of these enzymes are inhibited by Li+." 
Q#19816 - CGI_10009354 superfamily 245814 396 475 3.16E-17 78.1952 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#19817 - CGI_10009355 superfamily 245201 511 813 0 591.751 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#19818 - CGI_10009356 superfamily 222090 333 507 2.25E-18 83.8614 cl18636 Methyltransf_22 superfamily N - Methyltransferase domain; This family appears to be a methyltransferase domain. Q#19820 - CGI_10009358 superfamily 215647 200 445 9.19E-34 128.11 cl18338 7tm_2 superfamily - - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GPCRs). They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. 
Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#19820 - CGI_10009358 superfamily 243029 114 180 1.56E-13 66.2201 cl02422 HRM superfamily - - Hormone receptor domain; This extracellular domain contains four conserved cysteines that probably form disulphide bridges. The domain is found in a variety of hormone receptors. It may be a ligand binding domain. Q#19823 - CGI_10006754 superfamily 241563 63 102 2.26E-05 42.2744 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#19823 - CGI_10006754 superfamily 191851 101 213 4.14E-06 46.0839 cl06708 DUF1640 superfamily - - Protein of unknown function (DUF1640); This family consists of sequences derived from hypothetical eukaryotic proteins. A region approximately 100 residues in length is featured. Q#19823 - CGI_10006754 superfamily 219497 11 37 0.0023524 36.1835 cl06619 C1_3 superfamily - - C1-like domain; This short domain is rich in cysteines and histidines. The pattern of conservation is similar to that found in pfam00130. 
Q#19824 - CGI_10006755 superfamily 242902 25 174 1.60E-15 69.1978 cl02144 TLD superfamily - - TLD; This domain is predicted to be an enzyme and is often found associated with pfam01476. Q#19825 - CGI_10006756 superfamily 242902 25 87 1.24E-13 63.8051 cl02144 TLD superfamily C - TLD; This domain is predicted to be an enzyme and is often found associated with pfam01476. Q#19828 - CGI_10000503 superfamily 243035 77 188 8.59E-29 108.476 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. 
C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#19828 - CGI_10000503 superfamily 246918 311 362 2.45E-13 64.1451 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#19828 - CGI_10000503 superfamily 246918 197 249 1.40E-12 62.2191 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#19828 - CGI_10000503 superfamily 243035 26 59 7.70E-09 52.6835 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. 
Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. 
A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#19830 - CGI_10005111 superfamily 241974 578 737 7.21E-18 80.7486 cl00604 STAS superfamily - - "Sulphate Transporter and Anti-Sigma factor antagonist domain found in the C-terminal region of sulphate transporters as well as in bacterial and archaeal proteins involved in the regulation of sigma factors; The STAS (Sulphate Transporter and Anti-Sigma factor antagonist) domain is found in the C-terminal region of sulphate transporters as well as in bacterial and archaeal proteins involved in the regulation of sigma factors, like anti-anti-sigma factors and "stressosome" components. The sigma factor regulators are involved in protein-protein interaction which is regulated by phosphorylation." Q#19830 - CGI_10005111 superfamily 216188 242 523 9.47E-67 224.019 cl18360 Sulfate_transp superfamily - - Sulfate transporter family; Mutations in human SLC26A2 lead to several human diseases. Q#19830 - CGI_10005111 superfamily 205965 93 176 1.40E-35 130.225 cl18285 Sulfate_tra_GLY superfamily - - "Sulfate transporter N-terminal domain with GLY motif; This domain is found usually at the N-terminus of sulfate-transporter proteins. It carries a highly conserved GLY sequence motif, but the function of the domain is not known." Q#19833 - CGI_10005114 superfamily 218493 434 581 2.77E-48 165.993 cl08434 GMC_oxred_C superfamily - - GMC oxidoreductase; This domain found associated with pfam00732. Q#19833 - CGI_10005114 superfamily 248054 26 54 0.00456256 35.5256 cl17500 NAD_binding_8 superfamily C - NAD(P)-binding Rossmann-like domain; NAD(P)-binding Rossmann-like domain. 
Q#19833 - CGI_10005114 superfamily 248054 228 294 0.00695902 36.8967 cl17500 NAD_binding_8 superfamily NC - NAD(P)-binding Rossmann-like domain; NAD(P)-binding Rossmann-like domain. Q#19834 - CGI_10005115 superfamily 245823 17 486 5.60E-179 517.99 cl11976 SNF superfamily - - Sodium:neurotransmitter symporter family; Sodium:neurotransmitter symporter family. Q#19835 - CGI_10005116 superfamily 245213 234 270 1.85E-08 50.713 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#19835 - CGI_10005116 superfamily 245213 348 383 9.68E-08 48.787 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#19835 - CGI_10005116 superfamily 245213 272 307 2.99E-07 47.2462 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#19835 - CGI_10005116 superfamily 245213 310 345 1.92E-06 44.935 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#19835 - CGI_10005116 superfamily 245213 424 459 2.18E-06 44.935 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#19835 - CGI_10005116 superfamily 245213 386 422 3.26E-06 44.1646 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#19835 - CGI_10005116 superfamily 243124 87 228 2.49E-37 134.477 cl02648 NIDO superfamily - - Nidogen-like; This is a nidogen-like domain (NIDO) domain and is an extracellular domain found in nidogen and hypothetical proteins of unknown function. Q#19839 - CGI_10000672 superfamily 241600 105 230 3.08E-37 130.86 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#19840 - CGI_10000741 superfamily 241567 17 188 1.40E-19 83.0335 cl00042 CASc superfamily C - "Caspase, interleukin-1 beta converting enzyme (ICE) homologues; Cysteine-dependent aspartate-directed proteases that mediate programmed cell death (apoptosis). Caspases are synthesized as inactive zymogens and activated by proteolysis of the peptide backbone adjacent to an aspartate. The resulting two subunits associate to form an (alpha)2(beta)2-tetramer which is the active enzyme. 
Activation of caspases can be mediated by other caspase homologs." Q#19843 - CGI_10019396 superfamily 217293 28 73 0.000592692 37.9975 cl03788 Neur_chan_LBD superfamily C - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#19844 - CGI_10019397 superfamily 243074 4 42 7.39E-07 46.7309 cl02535 F-box-like superfamily - - F-box-like; This is an F-box-like family. Q#19844 - CGI_10019397 superfamily 243092 78 108 7.50E-05 40.7984 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#19846 - CGI_10019399 superfamily 243074 8 52 6.75E-14 63.6797 cl02535 F-box-like superfamily - - F-box-like; This is an F-box-like family. 
Q#19846 - CGI_10019399 superfamily 243092 90 120 6.00E-08 47.732 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#19847 - CGI_10019400 superfamily 243074 8 52 7.33E-14 61.3685 cl02535 F-box-like superfamily - - F-box-like; This is an F-box-like family. 
Q#19847 - CGI_10019400 superfamily 243092 90 118 6.79E-06 39.2576 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#19847 - CGI_10019400 superfamily 219215 44 76 0.00950437 31.924 cl06096 Elongin_A superfamily NC - "RNA polymerase II transcription factor SIII (Elongin) subunit A; This family represents a conserved region within RNA polymerase II transcription factor SIII (Elongin) subunit A. In mammals, the Elongin complex activates elongation by RNA polymerase II by suppressing transient pausing of the polymerase at many sites within transcription units. Elongin is a heterotrimer composed of A, B, and C subunits of 110, 18, and 15 kilodaltons, respectively. Subunit A has been shown to function as the transcriptionally active component of Elongin." 
Q#19848 - CGI_10019401 superfamily 247792 51 88 0.00103488 37.4252 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#19848 - CGI_10019401 superfamily 243039 398 576 8.56E-82 256.799 cl02446 MATH superfamily - - "MATH (meprin and TRAF-C homology) domain; an independent folding unit with an eight-stranded beta-sandwich structure found in meprins, TRAFs and other proteins. Meprins comprise a class of extracellular metalloproteases which are anchored to the membrane and are capable of cleaving growth factors, extracellular matrix proteins, and biologically active peptides. TRAF molecules serve as adapter proteins that link cell surface receptors of the Tumor Necrosis Factor and Interleukin-1/Toll-like families to downstream kinase cascades, which results in the activation of transcription factors and the regulation of cell survival, proliferation and stress responses in the immune and inflammatory systems. Other members include the ubiquitin ligases, TRIM37 and SPOP, and the ubiquitin-specific proteases, HAUSP and Ubp21p. A large number of uncharacterized members mostly from lineage-specific expansions in C. elegans and rice contain MATH and BTB domains, similar to SPOP. The MATH domain has been shown to bind peptide/protein substrates in TRAFs and HAUSP. It is possible that the MATH domain in other members of this superfamily also interacts with various protein substrates. 
The TRAF domain may also be involved in the trimerization of TRAFs. Based on homology, it is postulated that the MATH domain in meprins may be involved in its tetramer assembly and that the MATH domain, in general, may take part in diverse modular arrangements defined by adjacent multimerization domains." Q#19848 - CGI_10019401 superfamily 190233 197 253 1.17E-12 63.6274 cl08341 zf-TRAF superfamily - - TRAF-type zinc finger; TRAF-type zinc finger. Q#19848 - CGI_10019401 superfamily 190233 138 194 3.31E-06 45.1378 cl08341 zf-TRAF superfamily - - TRAF-type zinc finger; TRAF-type zinc finger. Q#19849 - CGI_10019402 superfamily 227674 361 584 1.04E-21 98.2233 cl02227 Mpp10 superfamily N - "U3 small nucleolar ribonucleoprotein component [Translation, ribosomal structure and biogenesis]" Q#19849 - CGI_10019402 superfamily 227674 153 300 0.000612147 41.599 cl02227 Mpp10 superfamily C - "U3 small nucleolar ribonucleoprotein component [Translation, ribosomal structure and biogenesis]" Q#19850 - CGI_10019403 superfamily 215821 325 413 1.71E-29 110.793 cl18346 FKBP_C superfamily - - FKBP-type peptidyl-prolyl cis-trans isomerase; FKBP-type peptidyl-prolyl cis-trans isomerase. Q#19850 - CGI_10019403 superfamily 191166 417 490 2.16E-09 54.2094 cl04891 SRP40_C superfamily - - "SRP40, C-terminal domain; This presumed domain is found at the C-terminus of the S. cerevisiae SRP40 protein and its homologues. SRP40/nopp40 is a chaperone involved in nucleocytoplasmic transport. SRP40 is also a suppressor of mutant AC40 subunit of RNA polymerase I and III." Q#19851 - CGI_10019404 superfamily 148072 146 228 3.11E-32 115.222 cl09585 DUF1014 superfamily N - Protein of unknown function (DUF1014); This family consists of several hypothetical eukaryotic proteins of unknown function. Q#19852 - CGI_10019405 superfamily 242274 1 39 2.47E-05 40.0052 cl01053 SGNH_hydrolase superfamily NC - "SGNH_hydrolase, or GDSL_hydrolase, is a diverse family of lipases and esterases. 
The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the typical Ser-His-Asp(Glu) triad from other serine hydrolases, but may lack the carboxylic acid." Q#19853 - CGI_10019406 superfamily 220692 60 348 2.39E-05 44.8877 cl18570 7TM_GPCR_Srw superfamily - - Serpentine type 7TM GPCR chemoreceptor Srw; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srw is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. The genes encoding Srw do not appear to be under as strong an adaptive evolutionary pressure as those of Srz. Q#19854 - CGI_10019407 superfamily 243035 141 190 2.67E-07 48.0698 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#19855 - CGI_10019408 superfamily 151671 291 394 9.81E-18 83.9355 cl12777 DUF3028 superfamily C - Protein of unknown function (DUF3028); This eukaryotic family of proteins has no known function. Q#19857 - CGI_10019410 superfamily 245596 1155 1396 4.86E-171 515.402 cl11394 Glyco_tranf_GTA_type superfamily - - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#19857 - CGI_10019410 superfamily 219023 896 1073 2.29E-74 247.966 cl05763 UDP-g_GGTase superfamily - - UDP-glucose:Glycoprotein Glucosyltransferase; The N-terminal region of this group of proteins is required for correct folding of the ER UDP-Glc: glucosyltransferase. Q#19858 - CGI_10019411 superfamily 238191 19 526 2.19E-145 432.912 cl18907 Esterase_lipase superfamily - - "Esterases and lipases (includes fungal lipases, cholinesterases, etc.) These enzymes act on carboxylic esters (EC: 3.1.1.-). 
The catalytic apparatus involves three residues (catalytic triad): a serine, a glutamate or aspartate and a histidine. These catalytic residues are responsible for the nucleophilic attack on the carbonyl carbon atom of the ester bond. In contrast with other alpha/beta hydrolase fold family members, p-nitrobenzyl esterase and acetylcholine esterase have a Glu instead of Asp at the active site carboxylate." Q#19858 - CGI_10019411 superfamily 149659 565 603 4.47E-05 41.4389 cl07334 AChE_tetra superfamily - - Acetylcholinesterase tetramerisation domain; The acetylcholinesterase tetramerisation domain is found at the C terminus and forms a left handed superhelix. Q#19859 - CGI_10019412 superfamily 222428 82 394 4.64E-154 466.673 cl18675 AAA_34 superfamily - - P-loop containing NTP hydrolase pore-1; P-loop containing NTP hydrolase pore-1. Q#19859 - CGI_10019412 superfamily 222427 536 822 1.62E-117 368.051 cl18674 Helicase_C_4 superfamily - - "Helicase_C-like; Strawberry notch proteins carry DExD/H-box groups and Helicase_C domains. These proteins promote the expression of diverse targets, potentially through interactions with transcriptional activator or repressor complexes." Q#19860 - CGI_10019413 superfamily 247724 69 204 4.58E-36 126.537 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#19861 - CGI_10019414 superfamily 247724 31 167 2.53E-29 107.662 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. 
Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#19863 - CGI_10019416 superfamily 247724 31 150 1.35E-20 84.9356 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#19864 - CGI_10019417 superfamily 241599 188 246 2.11E-21 85.758 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." 
Q#19865 - CGI_10019418 superfamily 241554 346 482 1.06E-28 111.198 cl00019 Macro superfamily - - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#19866 - CGI_10019419 superfamily 241554 5 45 2.50E-10 53.8035 cl00019 Macro superfamily N - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#19868 - CGI_10001188 superfamily 245814 210 280 0.000116247 39.9154 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#19869 - CGI_10002881 superfamily 241584 405 497 1.85E-11 62.5139 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#19869 - CGI_10002881 superfamily 245814 326 397 3.25E-07 49.4099 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#19869 - CGI_10002881 superfamily 241584 733 831 8.80E-05 42.0983 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. 
Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#19869 - CGI_10002881 superfamily 241584 614 727 0.000405417 40.1723 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#19869 - CGI_10002881 superfamily 241584 508 592 0.00787425 35.9351 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#19869 - CGI_10002881 superfamily 245814 233 293 2.05E-15 73.2099 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#19869 - CGI_10002881 superfamily 245814 135 215 7.21E-15 71.6618 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#19869 - CGI_10002881 superfamily 245814 45 106 1.45E-14 70.5135 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#19873 - CGI_10001303 superfamily 246918 157 200 4.43E-07 46.0407 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. 
Q#19874 - CGI_10001304 superfamily 241619 37 101 0.000210296 35.6324 cl00112 PAN_APPLE superfamily - - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#19877 - CGI_10000995 superfamily 218307 133 189 1.77E-14 69.5044 cl04818 NUDE_C superfamily C - "NUDE protein, C-terminal conserved region; This family represents the C-terminal conserved region of the NUDE proteins. NUDE proteins are involved in nuclear migration." Q#19879 - CGI_10002923 superfamily 243072 860 985 1.59E-35 132.893 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#19879 - CGI_10002923 superfamily 243072 761 886 6.82E-35 130.967 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#19879 - CGI_10002923 superfamily 243072 695 820 1.94E-32 124.033 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#19879 - CGI_10002923 superfamily 243072 619 753 3.20E-28 111.707 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#19879 - CGI_10002923 superfamily 243072 926 1049 4.03E-26 105.543 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#19879 - CGI_10002923 superfamily 247755 76 119 0.000275271 41.0763 cl17201 ABC_ATPase superfamily C - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#19880 - CGI_10002924 superfamily 247856 12 74 3.40E-16 69.8841 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#19880 - CGI_10002924 superfamily 247856 101 154 7.04E-16 68.7285 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#19882 - CGI_10002140 superfamily 248097 74 134 0.00658114 33.4297 cl17543 C1q superfamily N - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. 
Q#19883 - CGI_10002141 superfamily 248097 186 314 4.67E-17 75.3794 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#19883 - CGI_10002141 superfamily 243066 8 79 3.23E-11 58.7829 cl02518 BTB superfamily C - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#19884 - CGI_10002154 superfamily 241550 1 217 3.42E-79 242.875 cl00015 nt_trans superfamily N - "nucleotidyl transferase superfamily; nt_trans (nucleotidyl transferase) This superfamily includes the class I amino-acyl tRNA synthetases, pantothenate synthetase (PanC), ATP sulfurylase, and the cytidylyltransferases, all of which have a conserved dinucleotide-binding domain." 
Q#19890 - CGI_10001584 superfamily 241874 14 305 4.33E-130 392.64 cl00456 SLC5-6-like_sbd superfamily N - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#19891 - CGI_10001585 superfamily 243058 439 544 0.000631544 38.8348 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." 
Q#19895 - CGI_10006442 superfamily 241547 46 285 3.65E-94 294.947 cl00012 alpha_CA superfamily - - "Carbonic anhydrase alpha (vertebrate-like) group. Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionary distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidine residues and a fourth conserved histidine plays a potential role in proton transfer." Q#19895 - CGI_10006442 superfamily 243092 312 601 5.46E-66 221.825 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring 
propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#19897 - CGI_10006444 superfamily 240421 3 168 3.13E-09 53.1676 cl14774 PTZ00445 superfamily - - p36-lilke protein; Provisional Q#19898 - CGI_10006445 superfamily 241574 995 1243 1.27E-87 286.404 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#19898 - CGI_10006445 superfamily 241704 547 664 2.72E-24 102.066 cl00227 PEBP superfamily - - "PhosphatidylEthanolamine-Binding Protein (PEBP) domain; PhosphatidylEthanolamine-Binding Proteins (PEBPs) are represented in all three major phylogenetic divisions (eukaryotes, bacteria, archaea). A number of biological roles for members of the PEBP family include serine protease inhibition, membrane biogenesis, regulation of flowering plant stem architecture, and Raf-1 kinase inhibition. Although their overall structures are similar, the members of the PEBP family bind very different substrates including phospholipids, opioids, and hydrophobic odorant molecules as well as having different oligomerization states (monomer/dimer/tetramer)." 
Q#19898 - CGI_10006445 superfamily 247792 827 893 6.77E-06 45.5144 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#19898 - CGI_10006445 superfamily 243141 710 817 2.05E-26 107.017 cl02687 RWD superfamily - - "RWD domain; This domain was identified in WD40 repeat proteins and Ring finger domain proteins. The function of this domain is unknown. GCN2 is the alpha-subunit of the only translation initiation factor (eIF2 alpha) kinase that appears in all eukaryotes. Its function requires an interaction with GCN1 via the domain at its N-terminus, which is termed the RWD domain after three major RWD-containing proteins: RING finger-containing proteins, WD-repeat-containing proteins, and yeast DEAD (DEXD)-like helicases. The structure forms an alpha + beta sandwich fold consisting of two layers: a four-stranded antiparallel beta-sheet, and three side-by-side alpha-helices." Q#19898 - CGI_10006445 superfamily 241704 289 445 6.11E-07 49.6792 cl00227 PEBP superfamily - - "PhosphatidylEthanolamine-Binding Protein (PEBP) domain; PhosphatidylEthanolamine-Binding Proteins (PEBPs) are represented in all three major phylogenetic divisions (eukaryotes, bacteria, archaea). A number of biological roles for members of the PEBP family include serine protease inhibition, membrane biogenesis, regulation of flowering plant stem architecture, and Raf-1 kinase inhibition. 
Although their overall structures are similar, the members of the PEBP family bind very different substrates including phospholipids, opioids, and hydrophobic odorant molecules as well as having different oligomerization states (monomer/dimer/tetramer)." Q#19898 - CGI_10006445 superfamily 148004 19 78 9.42E-07 48.4749 cl05589 Ifi-6-16 superfamily C - Interferon-induced 6-16 family; Interferon-induced 6-16 family. Q#19900 - CGI_10012306 superfamily 246664 13 203 3.12E-54 182.895 cl14561 An_peroxidase_like superfamily C - "Animal heme peroxidases and related proteins; A diverse family of enzymes, which includes prostaglandin G/H synthase, thyroid peroxidase, myeloperoxidase, linoleate diol synthase, lactoperoxidase, peroxinectin, peroxidasin, and others. Despite its name, this family is not restricted to metazoans: members are found in fungi, plants, and bacteria as well." Q#19901 - CGI_10012307 superfamily 246664 1 294 3.52E-129 380.767 cl14561 An_peroxidase_like superfamily N - "Animal heme peroxidases and related proteins; A diverse family of enzymes, which includes prostaglandin G/H synthase, thyroid peroxidase, myeloperoxidase, linoleate diol synthase, lactoperoxidase, peroxinectin, peroxidasin, and others. Despite its name, this family is not restricted to metazoans: members are found in fungi, plants, and bacteria as well." Q#19902 - CGI_10012308 superfamily 246664 300 670 1.28E-148 440.473 cl14561 An_peroxidase_like superfamily - - "Animal heme peroxidases and related proteins; A diverse family of enzymes, which includes prostaglandin G/H synthase, thyroid peroxidase, myeloperoxidase, linoleate diol synthase, lactoperoxidase, peroxinectin, peroxidasin, and others. Despite its name, this family is not restricted to metazoans: members are found in fungi, plants, and bacteria as well." 
Q#19902 - CGI_10012308 superfamily 246664 153 237 7.89E-07 50.746 cl14561 An_peroxidase_like superfamily C - "Animal heme peroxidases and related proteins; A diverse family of enzymes, which includes prostaglandin G/H synthase, thyroid peroxidase, myeloperoxidase, linoleate diol synthase, lactoperoxidase, peroxinectin, peroxidasin, and others. Despite its name, this family is not restricted to metazoans: members are found in fungi, plants, and bacteria as well." Q#19904 - CGI_10012310 superfamily 245201 18 211 1.04E-49 173.577 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#19904 - CGI_10012310 superfamily 240618 296 382 1.05E-08 53.4068 cl18927 UBL_TBK1_like superfamily - - "Ubiquitin-Like Domain Of Human Tbk1 and similar proteins; This family contains ubiquitin-like domain (UBL) found in TANK-binding kinase 1 (TBK1) and similar proteins. TBK1 regulates factors such as IRF3 and IRF7, promoting antiviral activity in the interferon signaling pathways. In addition to the central UBL, these proteins have an N-terminal kinase domain and a C-terminal elongated helical domain. The ubiquitin-like domain acts as a protein-protein interaction domain, and has been implicated in regulating kinase activity, which modulates interactions in the IFN pathway." 
Q#19906 - CGI_10012312 superfamily 241563 68 109 1.06E-05 43.2368 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#19907 - CGI_10012313 superfamily 245201 18 211 7.32E-50 173.962 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#19907 - CGI_10012313 superfamily 240618 296 382 7.94E-09 53.792 cl18927 UBL_TBK1_like superfamily - - "Ubiquitin-Like Domain Of Human Tbk1 and similar proteins; This family contains ubiquitin-like domain (UBL) found in TANK-binding kinase 1 (TBK1) and similar proteins. TBK1 regulates factors such as IRF3 and IRF7, promoting antiviral activity in the interferon signaling pathways. In addition to the central UBL, these proteins have an N-terminal kinase domain and a C-terminal elongated helical domain. The ubiquitin-like domain acts as a protein-protein interaction domain, and has been implicated in regulating kinase activity, which modulates interactions in the IFN pathway." Q#19908 - CGI_10012314 superfamily 217064 1 86 2.07E-18 77.92 cl03617 CLN3 superfamily N - CLN3 protein; This is a family of proteins from the CLN3 gene. A missense mutation of glutamic acid (E) to lysine (K) at position 295 in the human protein has been implicated in Juvenile neuronal ceroid lipofuscinosis (Batten disease). 
Q#19911 - CGI_10012317 superfamily 217064 51 230 4.24E-62 203.11 cl03617 CLN3 superfamily C - CLN3 protein; This is a family of proteins from the CLN3 gene. A missense mutation of glutamic acid (E) to lysine (K) at position 295 in the human protein has been implicated in Juvenile neuronal ceroid lipofuscinosis (Batten disease). Q#19911 - CGI_10012317 superfamily 241683 234 256 3.15E-05 43.658 cl00204 PFK superfamily N - "Phosphofructokinase, a key regulatory enzyme in glycolysis, catalyzes the phosphorylation of fructose-6-phosphate to fructose-1,6-biphosphate. The members belong to PFK family that includes ATP- and pyrophosphate (PPi)- dependent phosphofructokinases. Some members evolved by gene duplication and thus have a large C-terminal/N-terminal extension comprising a second PFK domain. Generally, ATP-PFKs are allosteric homotetramers, and PPi-PFKs are dimeric and nonallosteric except for plant PPi-PFKs which are allosteric heterotetramers." Q#19912 - CGI_10012318 superfamily 247724 1 125 4.02E-42 146.141 cl17170 Ras_like_GTPase superfamily N - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. 
The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#19914 - CGI_10012320 superfamily 248097 7 69 4.15E-05 37.2446 cl17543 C1q superfamily N - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#19915 - CGI_10012321 superfamily 248097 14 133 5.89E-24 90.7874 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#19916 - CGI_10003329 superfamily 247724 9 179 5.59E-131 368.53 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. 
Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#19917 - CGI_10003330 superfamily 247727 53 166 1.57E-09 53.2027 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#19917 - CGI_10003330 superfamily 247844 32 70 0.00562815 35.7481 cl17290 Methyltransf_4 superfamily C - Putative methyltransferase; This is a family of putative methyltransferases. The aligned region contains the GXGXG S-AdoMet binding site suggesting a putative methyltransferase activity. Q#19919 - CGI_10003332 superfamily 219817 126 274 2.19E-37 138.52 cl07129 Xpo1 superfamily - - "Exportin 1-like protein; The sequences featured in this family are similar to a region close to the N-terminus of yeast exportin 1 (Xpo1, Crm1). This region is found just C-terminal to an importin-beta N-terminal domain (pfam03810) in many members of this family. Exportin 1 is a nuclear export receptor that interacts with leucine-rich nuclear export signal (NES) sequences, and Ran-GTP, and is involved in translocation of proteins out of the nucleus." Q#19919 - CGI_10003332 superfamily 243689 49 119 0.00803334 35.6821 cl04271 IBN_N superfamily - - Importin-beta N-terminal domain; Importin-beta N-terminal domain. 
Q#19920 - CGI_10003333 superfamily 241599 13 70 4.28E-19 78.4392 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#19921 - CGI_10003334 superfamily 241563 64 100 0.00011897 38.2292 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#19923 - CGI_10000659 superfamily 247684 11 48 0.00302228 32.6082 cl17037 NBD_sugar-kinase_HSP70_actin superfamily N - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. 
The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#19925 - CGI_10004643 superfamily 241584 60 101 0.000108548 37.0907 cl00065 FN3 superfamily N - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#19926 - CGI_10004644 superfamily 221584 1 196 1.10E-57 185.197 cl13841 Dynactin superfamily C - "Dynein associated protein; This domain family is found in eukaryotes, and is approximately 280 amino acids in length. The family is found in association with pfam01302. There is a single completely conserved residue E that may be functionally important. Dynactin has been associated with Dynein, a kinesin protein which is involved in organelle transport, mitotic spindle assembly and chromosome segregation. Dynactin anchors Dynein to specific subcellular structures." Q#19927 - CGI_10004645 superfamily 241752 70 190 7.61E-45 146.696 cl00283 ADP_ribosyl superfamily - - "ADP_ribosylating enzymes catalyze the transfer of ADP_ribose from NAD+ to substrates. Bacterial toxins are cytoplasmic and catalyze the transfer of a single ADP_ribose unit to eukaryotic elongation factor 2, halting protein synthesis and killing the cell. Poly(ADP-ribose) polymerases (PARPS 1-3, VPARP, tankyrase) catalyze the addition of up to 100 ADP_ribose units from NAD+. PARPs 1 and 2 are localized in the nucleaus, bind DNA, and are activated by DNA damage. 
VPARP is part of the vault ribonucleoprotein complex. Tankyrases regulates telomere length in part through poy(ADP_ribosylation) of telomere repeat binding factor 1 (TRF1). Poly(ADP-ribose) polymerase catalyses the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active." Q#19928 - CGI_10002258 superfamily 241600 303 484 3.95E-64 209.404 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#19928 - CGI_10002258 superfamily 241600 2 65 5.08E-13 67.3022 cl00085 FReD superfamily NC - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. 
C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#19929 - CGI_10002259 superfamily 245226 18 116 3.59E-15 69.2516 cl10012 DnaQ_like_exo superfamily N - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." 
Q#19932 - CGI_10011763 superfamily 246664 333 701 6.46E-152 449.718 cl14561 An_peroxidase_like superfamily - - "Animal heme peroxidases and related proteins; A diverse family of enzymes, which includes prostaglandin G/H synthase, thyroid peroxidase, myeloperoxidase, linoleate diol synthase, lactoperoxidase, peroxinectin, peroxidasin, and others. Despite its name, this family is not restricted to metazoans: members are found in fungi, plants, and bacteria as well." Q#19932 - CGI_10011763 superfamily 246664 188 274 0.00509422 38.4196 cl14561 An_peroxidase_like superfamily C - "Animal heme peroxidases and related proteins; A diverse family of enzymes, which includes prostaglandin G/H synthase, thyroid peroxidase, myeloperoxidase, linoleate diol synthase, lactoperoxidase, peroxinectin, peroxidasin, and others. Despite its name, this family is not restricted to metazoans: members are found in fungi, plants, and bacteria as well." Q#19933 - CGI_10011764 superfamily 191507 62 116 0.000691197 34.3342 cl05726 Adipokin_hormo superfamily - - "Adipokinetic hormone; This family consists of several insect adipokinetic hormone as well as the related crustacean red pigment concentrating hormone. Flight activity of insects comprises one of the most intense biochemical processes known in nature, and therefore provides an attractive model system to study the hormonal regulation of metabolism during physical exercise. In long-distance flying insects, such as the migratory locust, both carbohydrate and lipid reserves are utilised as fuels for sustained flight activity. The mobilization of these energy stores in Locusta migratoria is mediated by three structurally related adipokinetic hormones (AKHs), which are all capable of stimulating the release of both carbohydrates and lipids from the fat body." 
Q#19934 - CGI_10011765 superfamily 245601 5 89 5.05E-30 110.875 cl11399 HP superfamily C - "Histidine phosphatase domain found in a functionally diverse set of proteins, mostly phosphatases; contains a His residue which is phosphorylated during the reaction; Catalytic domain of a functionally diverse set of proteins, most of which are phosphatases. The conserved catalytic core of this domain contains a His residue which is phosphorylated in the reaction. This set of proteins includes cofactor-dependent and cofactor-independent phosphoglycerate mutases (dPGM, and BPGM respectively), fructose-2,6-bisphosphatase (F26BP)ase, Sts-1, SixA, histidine acid phosphatases, phytases, and related proteins. Functions include roles in metabolism, signaling, or regulation, for example F26BPase affects glycolysis and gluconeogenesis through controlling the concentration of F26BP; BPGM controls the concentration of 2,3-BPG (the main allosteric effector of hemoglobin in human blood cells); human Sts-1 is a T-cell regulator; Escherichia coli Six A participates in the ArcB-dependent His-to-Asp phosphorelay signaling system; phytases scavenge phosphate from extracellular sources. Deficiency and mutation in many of the human members result in disease, for example erythrocyte BPGM deficiency is a disease associated with a decrease in the concentration of 2,3-BPG. Clinical applications include the use of prostatic acid phosphatase (PAP) as a serum marker for prostate cancer. Agricultural applications include the addition of phytases to animal feed." Q#19934 - CGI_10011765 superfamily 245601 159 230 5.29E-18 77.7476 cl11399 HP superfamily N - "Histidine phosphatase domain found in a functionally diverse set of proteins, mostly phosphatases; contains a His residue which is phosphorylated during the reaction; Catalytic domain of a functionally diverse set of proteins, most of which are phosphatases. 
The conserved catalytic core of this domain contains a His residue which is phosphorylated in the reaction. This set of proteins includes cofactor-dependent and cofactor-independent phosphoglycerate mutases (dPGM, and BPGM respectively), fructose-2,6-bisphosphatase (F26BP)ase, Sts-1, SixA, histidine acid phosphatases, phytases, and related proteins. Functions include roles in metabolism, signaling, or regulation, for example F26BPase affects glycolysis and gluconeogenesis through controlling the concentration of F26BP; BPGM controls the concentration of 2,3-BPG (the main allosteric effector of hemoglobin in human blood cells); human Sts-1 is a T-cell regulator; Escherichia coli Six A participates in the ArcB-dependent His-to-Asp phosphorelay signaling system; phytases scavenge phosphate from extracellular sources. Deficiency and mutation in many of the human members result in disease, for example erythrocyte BPGM deficiency is a disease associated with a decrease in the concentration of 2,3-BPG. Clinical applications include the use of prostatic acid phosphatase (PAP) as a serum marker for prostate cancer. Agricultural applications include the addition of phytases to animal feed." Q#19935 - CGI_10011766 superfamily 192445 45 174 7.61E-30 111.35 cl10818 Med4 superfamily C - "Vitamin-D-receptor interacting Mediator subunit 4; Members of this family function as part of the Mediator (Med) complex, which links DNA-bound transcriptional regulators and the general transcription machinery, particularly the RNA polymerase II enzyme. They play a role in basal transcription by mediating activation or repression according to the specific complement of transcriptional regulators bound to the promoter." Q#19936 - CGI_10011767 superfamily 242885 41 202 5.63E-80 239.036 cl02106 IF4E superfamily - - Eukaryotic initiation factor 4E; Eukaryotic initiation factor 4E. 
Q#19937 - CGI_10011768 superfamily 241563 60 102 0.000197505 39.3848 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#19937 - CGI_10011768 superfamily 110440 481 508 0.0048006 35.0761 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is me diated by the NHL repeats." Q#19938 - CGI_10011769 superfamily 247684 59 542 0 701.754 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. 
The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#19939 - CGI_10011770 superfamily 245206 103 390 0 542.228 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#19940 - CGI_10011771 superfamily 243098 839 886 2.71E-13 67.6231 cl02573 TUDOR superfamily - - "Tudor domains are found in many eukaryotic organisms and have been implicated in protein-protein interactions in which methylated protein substrates bind to these domains. For example, the Tudor domain of Survival of Motor Neuron (SMN) binds to symmetrically dimethylated arginines of arginine-glycine (RG) rich sequences found in the C-terminal tails of Sm proteins. The SMN protein is linked to spinal muscular atrophy. 
Another example is the tandem tudor domains of 53BP1, which bind to histone H4 specifically dimethylated at Lys20 (H4-K20me2). 53BP1 is a key transducer of the DNA damage checkpoint signal." Q#19940 - CGI_10011771 superfamily 243098 2146 2196 1.60E-12 65.3119 cl02573 TUDOR superfamily - - "Tudor domains are found in many eukaryotic organisms and have been implicated in protein-protein interactions in which methylated protein substrates bind to these domains. For example, the Tudor domain of Survival of Motor Neuron (SMN) binds to symmetrically dimethylated arginines of arginine-glycine (RG) rich sequences found in the C-terminal tails of Sm proteins. The SMN protein is linked to spinal muscular atrophy. Another example is the tandem tudor domains of 53BP1, which bind to histone H4 specifically dimethylated at Lys20 (H4-K20me2). 53BP1 is a key transducer of the DNA damage checkpoint signal." Q#19940 - CGI_10011771 superfamily 243098 1239 1286 1.29E-10 59.9191 cl02573 TUDOR superfamily - - "Tudor domains are found in many eukaryotic organisms and have been implicated in protein-protein interactions in which methylated protein substrates bind to these domains. For example, the Tudor domain of Survival of Motor Neuron (SMN) binds to symmetrically dimethylated arginines of arginine-glycine (RG) rich sequences found in the C-terminal tails of Sm proteins. The SMN protein is linked to spinal muscular atrophy. Another example is the tandem tudor domains of 53BP1, which bind to histone H4 specifically dimethylated at Lys20 (H4-K20me2). 53BP1 is a key transducer of the DNA damage checkpoint signal." Q#19940 - CGI_10011771 superfamily 243098 1908 1955 1.29E-10 59.9191 cl02573 TUDOR superfamily - - "Tudor domains are found in many eukaryotic organisms and have been implicated in protein-protein interactions in which methylated protein substrates bind to these domains. 
For example, the Tudor domain of Survival of Motor Neuron (SMN) binds to symmetrically dimethylated arginines of arginine-glycine (RG) rich sequences found in the C-terminal tails of Sm proteins. The SMN protein is linked to spinal muscular atrophy. Another example is the tandem tudor domains of 53BP1, which bind to histone H4 specifically dimethylated at Lys20 (H4-K20me2). 53BP1 is a key transducer of the DNA damage checkpoint signal." Q#19940 - CGI_10011771 superfamily 243098 209 254 2.23E-10 59.1487 cl02573 TUDOR superfamily - - "Tudor domains are found in many eukaryotic organisms and have been implicated in protein-protein interactions in which methylated protein substrates bind to these domains. For example, the Tudor domain of Survival of Motor Neuron (SMN) binds to symmetrically dimethylated arginines of arginine-glycine (RG) rich sequences found in the C-terminal tails of Sm proteins. The SMN protein is linked to spinal muscular atrophy. Another example is the tandem tudor domains of 53BP1, which bind to histone H4 specifically dimethylated at Lys20 (H4-K20me2). 53BP1 is a key transducer of the DNA damage checkpoint signal." Q#19940 - CGI_10011771 superfamily 243098 615 660 7.78E-10 57.6079 cl02573 TUDOR superfamily - - "Tudor domains are found in many eukaryotic organisms and have been implicated in protein-protein interactions in which methylated protein substrates bind to these domains. For example, the Tudor domain of Survival of Motor Neuron (SMN) binds to symmetrically dimethylated arginines of arginine-glycine (RG) rich sequences found in the C-terminal tails of Sm proteins. The SMN protein is linked to spinal muscular atrophy. Another example is the tandem tudor domains of 53BP1, which bind to histone H4 specifically dimethylated at Lys20 (H4-K20me2). 53BP1 is a key transducer of the DNA damage checkpoint signal." 
Q#19940 - CGI_10011771 superfamily 243098 428 472 1.27E-09 56.8375 cl02573 TUDOR superfamily - - "Tudor domains are found in many eukaryotic organisms and have been implicated in protein-protein interactions in which methylated protein substrates bind to these domains. For example, the Tudor domain of Survival of Motor Neuron (SMN) binds to symmetrically dimethylated arginines of arginine-glycine (RG) rich sequences found in the C-terminal tails of Sm proteins. The SMN protein is linked to spinal muscular atrophy. Another example is the tandem tudor domains of 53BP1, which bind to histone H4 specifically dimethylated at Lys20 (H4-K20me2). 53BP1 is a key transducer of the DNA damage checkpoint signal." Q#19940 - CGI_10011771 superfamily 243098 1056 1102 1.90E-09 56.4523 cl02573 TUDOR superfamily - - "Tudor domains are found in many eukaryotic organisms and have been implicated in protein-protein interactions in which methylated protein substrates bind to these domains. For example, the Tudor domain of Survival of Motor Neuron (SMN) binds to symmetrically dimethylated arginines of arginine-glycine (RG) rich sequences found in the C-terminal tails of Sm proteins. The SMN protein is linked to spinal muscular atrophy. Another example is the tandem tudor domains of 53BP1, which bind to histone H4 specifically dimethylated at Lys20 (H4-K20me2). 53BP1 is a key transducer of the DNA damage checkpoint signal." Q#19940 - CGI_10011771 superfamily 243098 1725 1771 1.90E-09 56.4523 cl02573 TUDOR superfamily - - "Tudor domains are found in many eukaryotic organisms and have been implicated in protein-protein interactions in which methylated protein substrates bind to these domains. For example, the Tudor domain of Survival of Motor Neuron (SMN) binds to symmetrically dimethylated arginines of arginine-glycine (RG) rich sequences found in the C-terminal tails of Sm proteins. The SMN protein is linked to spinal muscular atrophy. 
Another example is the tandem tudor domains of 53BP1, which bind to histone H4 specifically dimethylated at Lys20 (H4-K20me2). 53BP1 is a key transducer of the DNA damage checkpoint signal." Q#19941 - CGI_10011772 superfamily 216363 2 57 1.60E-08 46.3094 cl08312 UPF0029 superfamily N - Uncharacterized protein family UPF0029; Uncharacterized protein family UPF0029. Q#19948 - CGI_10011779 superfamily 222150 102 127 7.79E-05 38.5269 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#19948 - CGI_10011779 superfamily 246975 89 110 0.00626374 33.0893 cl15478 zf-C2H2 superfamily - - "Zinc finger, C2H2 type; The C2H2 zinc finger is the classical zinc finger domain. The two conserved cysteines and histidines co-ordinate a zinc ion. The following pattern describes the zinc finger. #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C] Where X can be any amino acid, and numbers in brackets indicate the number of residues. The positions marked # are those that are important for the stable fold of the zinc finger. The final position can be either his or cys. The C2H2 zinc finger is composed of two short beta strands followed by an alpha helix. The amino terminal part of the helix binds the major groove in DNA binding zinc fingers. The accepted consensus binding sequence for Sp1 is usually defined by the asymmetric hexanucleotide core GGGCGG but this sequence does not include, among others, the GAG (=CTC) repeat that constitutes a high-affinity site for Sp1 binding to the wt1 promoter." Q#19949 - CGI_10006307 superfamily 244824 1 321 1.56E-71 230.709 cl07893 AmyAc_family superfamily N - "Alpha amylase catalytic domain family; The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. 
The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; and C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost this catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase." Q#19950 - CGI_10006308 superfamily 244824 2 40 5.55E-05 38.5165 cl07893 AmyAc_family superfamily N - "Alpha amylase catalytic domain family; The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; and C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. 
Other members of this family have lost this catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase." Q#19951 - CGI_10006309 superfamily 244859 62 301 1.26E-13 68.7273 cl08171 HtrL_YibB superfamily - - "Bacterial protein of unknown function (HtrL_YibB); The protein from this rare, uncharacterized protein family is designated HtrL or YibB in E. coli, where its gene is found in a region of LPS core biosynthesis genes. Homologues are found in Shigella flexneri, Campylobacter jejuni, and Caenorhabditis elegans only. The htrL gene may represent an insertion to the LPS core biosynthesis region, rather than an LPS biosynthetic protein." Q#19953 - CGI_10002542 superfamily 219542 51 129 7.27E-25 97.3123 cl18517 Cu-oxidase_3 superfamily C - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. Q#19953 - CGI_10002542 superfamily 215896 179 224 1.41E-05 43.0524 cl18351 Cu-oxidase superfamily N - Multicopper oxidase; Many of the proteins in this family contain multiple similar copies of this plastocyanin-like domain. Q#19955 - CGI_10003538 superfamily 247740 6 79 3.51E-24 92.5991 cl17186 TIM_phosphate_binding superfamily C - "TIM barrel proteins share a structurally conserved phosphate binding motif and in general share an eight beta/alpha closed barrel structure. Specific for this family is the conserved phosphate binding site at the edges of strands 7 and 8. 
The phosphate comes either from the substrate, as in the case of inosine monophosphate dehydrogenase (IMPDH), or from ribulose-5-phosphate 3-epimerase (RPE) or from cofactors, like FMN." Q#19956 - CGI_10003539 superfamily 247740 1 166 1.66E-91 269.02 cl17186 TIM_phosphate_binding superfamily N - "TIM barrel proteins share a structurally conserved phosphate binding motif and in general share an eight beta/alpha closed barrel structure. Specific for this family is the conserved phosphate binding site at the edges of strands 7 and 8. The phosphate comes either from the substrate, as in the case of inosine monophosphate dehydrogenase (IMPDH), or from ribulose-5-phosphate 3-epimerase (RPE) or from cofactors, like FMN." Q#19957 - CGI_10006974 superfamily 246748 12 355 9.42E-149 428.046 cl14876 Zinc_peptidase_like superfamily - - "Zinc peptidases M18, M20, M28, and M42; Zinc peptidases play vital roles in metabolic and signaling pathways throughout all kingdoms of life. This family corresponds to several clans in the MEROPS database, including the MH clan, which contains 4 families (M18, M20, M28, M42). The peptidase M20 family includes carboxypeptidases such as the glutamate carboxypeptidase from Pseudomonas, the thermostable carboxypeptidase Ss1 of broad specificity from archaea and yeast Gly-X carboxypeptidase. The dipeptidases include bacterial dipeptidase, peptidase V (PepV), a eukaryotic, non-specific dipeptidase, and two Xaa-His dipeptidases (carnosinases). There is also the bacterial aminopeptidase, peptidase T (PepT) that acts only on tripeptide substrates and has therefore been termed a tripeptidase. Peptidase family M28 contains aminopeptidases and carboxypeptidases, and has co-catalytic zinc ions. However, several enzymes in this family utilize other first row transition metal ions such as cobalt and manganese. Each zinc ion is tetrahedrally co-ordinated, with three amino acid ligands plus activated water; one aspartate residue binds both metal ions. 
The aminopeptidases in this family are also called bacterial leucyl aminopeptidases, but are able to release a variety of N-terminal amino acids. IAP aminopeptidase and aminopeptidase Y preferentially release basic amino acids while glutamate carboxypeptidase II preferentially releases C-terminal glutamates. Glutamate carbxypeptidase II and plasma glutamate carboxypeptidase hydrolyze dipeptides. Peptidase families M18 and M42 contain metalloaminopeptidases. M18 is widely distributed in bacteria and eukaryotes. However, only yeast aminopeptidase I and mammalian aspartyl aminopeptidase have been characterized in detail. Some of M42 (also known as glutamyl aminopeptidase) enzymes exhibit aminopeptidase specificity while others also have acylaminoacylpeptidase activity (i.e. hydrolysis of acylated N-terminal residues)." Q#19958 - CGI_10006975 superfamily 216167 700 861 6.68E-47 166.993 cl02999 DNA_photolyase superfamily - - DNA photolyase; This domain binds a light harvesting cofactor. Q#19959 - CGI_10006976 superfamily 242920 52 141 1.20E-39 130.728 cl02174 TAF13 superfamily - - "The TATA Binding Protein (TBP) Associated Factor 13 (TAF13) is one of several TAFs that bind TBP and is involved in forming Transcription Factor IID (TFIID) complex; The TATA Binding Protein (TBP) Associated Factor 13 (TAF13) is one of several TAFs that bind TBP and is involved in forming the Transcription Factor IID (TFIID) complex. TFIID is one of seven General Transcription Factors (GTF) (TFIIA, TFIIB, TFIID, TFIIE, TFIIF, and TFIID) that are involved in accurate initiation of transcription by RNA polymerase II in eukaryotes. TFIID plays an important role in the recognition of promoter DNA and assembly of the pre-initiation complex. TFIID complex is composed of the TBP and at least 13 TAFs. TAFs from various species were originally named by their predicted molecular weight or their electrophoretic mobility in polyacrylamide gels. 
A new, unified nomenclature for the pol II TAFs has been suggested to show the relationship between TAFs orthologs and paralogs. Several hypotheses are proposed for TAFs functions such as serving as activator-binding sites, core-promoter recognition or a role in essential catalytic activity. Each TAF, with the help of a specific activator, is required only for expression of subset of genes and is not universally involved for transcription as are GTFs. In yeast and human cells, TAFs have been found as components of other complexes besides TFIID. Several TAFs interact via histone-fold (HFD) motifs; the HFD is the interaction motif involved in heterodimerization of the core histones and their assembly into nucleosome octamers. The minimal HFD contains three alpha-helices linked by two loops and are found in core histones, TAFs and many other transcription factors. TFIID has a histone octamer-like substructure. TAF13 interacts with TAF11 and makes a histone-like heterodimer similar to H3/H4-like proteins. The dimer may be structurally and functionally similar to the spt3 protein within the SAGA histone acetyltransferase complex." Q#19960 - CGI_10006977 superfamily 243362 139 303 2.32E-48 161.054 cl03262 DnaJ_C superfamily - - C-terminal substrate binding domain of DnaJ and HSP40; The C-terminal region of the DnaJ/Hsp40 protein mediates oligomerization and binding to denatured polypeptide substrate. DnaJ/Hsp40 is a widely conserved heat-shock protein. It prevents the aggregation of unfolded substrate and forms a ternary complex with both substrate and DnaK/Hsp70; the N-terminal J-domain of DnaJ/Hsp40 stimulates the ATPase activity of DnaK/Hsp70. Q#19960 - CGI_10006977 superfamily 243077 4 58 1.55E-23 91.4529 cl02542 DnaJ superfamily - - "DnaJ domain or J-domain. DnaJ/Hsp40 (heat shock protein 40) proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. 
They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperonine family. Hsp40 proteins are characterized by the presence of a J domain, which mediates the interaction with Hsp70. They may contain other domains as well, and the architectures provide a means of classification." Q#19961 - CGI_10006978 superfamily 241559 1747 1865 0.00360584 38.7996 cl00030 CH superfamily - - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." Q#19962 - CGI_10006979 superfamily 191166 264 337 3.31E-23 91.1886 cl04891 SRP40_C superfamily - - "SRP40, C-terminal domain; This presumed domain is found at the C-terminus of the S. cerevisiae SRP40 protein and its homologues. SRP40/nopp40 is a chaperone involved in nucleocytoplasmic transport. SRP40 is also a suppressor of mutant AC40 subunit of RNA polymerase I and III." Q#19963 - CGI_10006980 superfamily 152466 2246 2507 6.36E-99 321.768 cl13467 DUF3518 superfamily - - Domain of unknown function (DUF3518); This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is about 260 amino acids in length. This domain is found associated with pfam01388. Q#19963 - CGI_10006980 superfamily 243120 893 985 1.36E-29 116.22 cl02633 ARID superfamily - - "ARID/BRIGHT DNA binding domain; This domain is know as ARID for AT-Rich Interaction Domain, and also known as the BRIGHT domain." Q#19964 - CGI_10006981 superfamily 248458 36 201 1.39E-14 73.8873 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. 
MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#19964 - CGI_10006981 superfamily 113509 278 361 0.00143401 38.1801 cl17052 InvH superfamily C - "InvH outer membrane lipoprotein; This family represents the Salmonella outer membrane lipoprotein InvH. The molecular function of this protein is unknown, but it is required for the localisation to outer membrane of InvG, which is involved in a type III secretion apparatus mediating host cell invasion." 
Q#19965 - CGI_10006982 superfamily 247724 106 181 0.00483524 37.141 cl17170 Ras_like_GTPase superfamily NC - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#19967 - CGI_10014302 superfamily 247750 22 235 4.38E-94 284.37 cl17196 E1_enzyme_family superfamily - - "Superfamily of activating enzymes (E1) of the ubiquitin-like proteins. This family includes classical ubiquitin-activating enzymes E1, ubiquitin-like (ubl) activating enzymes and other mechanistic homologes, like MoeB, Thif1 and others. 
The common reaction mechanism catalyzed by MoeB, ThiF and the E1 enzymes begins with a nucleophilic attack of the C-terminal carboxylate of MoaD, ThiS and ubiquitin, respectively, on the alpha-phosphate of an ATP molecule bound at the active site of the activating enzymes, leading to the formation of a high-energy acyladenylate intermediate and subsequently to the formation of a thiocarboxylate at the C termini of MoaD and ThiS." Q#19967 - CGI_10014302 superfamily 241626 272 400 8.30E-36 128.581 cl00125 RHOD superfamily - - "Rhodanese Homology Domain (RHOD); an alpha beta fold domain found duplicated in the rhodanese protein. The cysteine containing enzymatically active version of the domain is also found in the Cdc25 class of protein phosphatases and a variety of proteins such as sulfide dehydrogenases and certain stress proteins such as senescence specific protein 1 in plants, PspE and GlpE in bacteria and cyanide and arsenate resistance proteins. Inactive versions (no active site cysteine) are also seen in dual specificity phosphatases, ubiquitin hydrolases from yeast and in sulfuryltransferases, where they are believed to play a regulatory role in multidomain proteins." Q#19968 - CGI_10014303 superfamily 241592 1 143 2.61E-41 136.573 cl00074 H2A superfamily - - "Histone 2A; H2A is a subunit of the nucleosome. The nucleosome is an octamer containing two each of the H2A, H2B, H3, and H4 subunits. The H2A subunit performs essential roles in maintaining structural integrity of the nucleosome, chromatin condensation, and binding of specific chromatin-associated proteins." Q#19969 - CGI_10014304 superfamily 243106 674 771 6.97E-34 127.133 cl02608 BAH superfamily C - "BAH, or Bromo Adjacent Homology domain (also called ELM1 and BAM for Bromo Adjacent Motif). BAH domains were first described as domains found in the polybromo protein and Yeast Rsc1/Rsc2 (Remodeling of the Structure of Chromatin). 
They also occur in mammalian DNA methyltransferases and the MTA1 subunits of histone deacetylase complexes. A BAH domain is also found in Yeast Sir3p and in the origin receptor complex protein 1 (Orc1p), where it was found to interact with the N-terminal lobe of the silence information regulator 1 protein (Sir1p), confirming the initial hypothesis that BAH plays a role in protein-protein interactions." Q#19970 - CGI_10014305 superfamily 245213 609 643 1.55E-05 43.7794 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#19972 - CGI_10014307 superfamily 245201 219 486 2.93E-53 181.986 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#19973 - CGI_10014308 superfamily 245201 519 771 9.51E-41 150.4 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. 
It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#19977 - CGI_10014312 superfamily 241622 863 945 5.61E-24 99.1782 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archebacteria, and eurkayotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#19977 - CGI_10014312 superfamily 241622 1730 1809 2.14E-18 83.385 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. 
Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archebacteria, and eurkayotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#19977 - CGI_10014312 superfamily 241622 1113 1199 5.14E-17 79.1478 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archebacteria, and eurkayotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. 
PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#19977 - CGI_10014312 superfamily 241622 1877 1951 3.49E-15 73.755 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archebacteria, and eurkayotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#19977 - CGI_10014312 superfamily 241622 1371 1440 4.96E-06 46.7911 cl00117 PDZ superfamily C - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. 
PDZ domains are found in diverse signaling proteins in bacteria, archebacteria, and eurkayotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#19977 - CGI_10014312 superfamily 247725 585 713 2.36E-35 133.583 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#19977 - CGI_10014312 superfamily 215882 499 610 5.16E-30 117.767 cl09511 FERM_M superfamily - - FERM central domain; This domain is the central structural domain of the FERM domain. Q#19977 - CGI_10014312 superfamily 220215 406 493 1.48E-14 71.8726 cl09630 FERM_N superfamily - - FERM N-terminal domain; This domain is the N-terminal ubiquitin-like structural domain of the FERM domain. Q#19978 - CGI_10014313 superfamily 241574 24 152 2.85E-47 155.822 cl00053 PTPc superfamily C - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. 
The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#19979 - CGI_10014314 superfamily 241838 140 342 3.98E-134 404.523 cl00395 FMT_core superfamily - - "Formyltransferase, catalytic core domain; Formyltransferase, catalytic core domain. The proteins of this superfamily contain a formyltransferase domain that hydrolyzes the removal of a formyl group from its substrate as part of a multistep transfer mechanism, and this alignment model represents the catalytic core of the formyltransferase domain. This family includes the following known members; Glycinamide Ribonucleotide Transformylase (GART), Formyl-FH4 Hydrolase, Methionyl-tRNA Formyltransferase, ArnA, and 10-Formyltetrahydrofolate Dehydrogenase (FDH). Glycinamide Ribonucleotide Transformylase (GART) catalyzes the third step in de novo purine biosynthesis, the transfer of a formyl group to 5'-phosphoribosylglycinamide. Formyl-FH4 Hydrolase catalyzes the hydrolysis of 10-formyltetrahydrofolate (formyl-FH4) to FH4 and formate. Methionyl-tRNA Formyltransferase transfers a formyl group onto the amino terminus of the acyl moiety of the methionyl aminoacyl-tRNA, which plays important role in translation initiation. ArnA is required for the modification of lipid A with 4-amino-4-deoxy-l-arabinose (Ara4N) that leads to resistance to cationic antimicrobial peptides (CAMPs) and clinical antimicrobials such as polymyxin. 10-formyltetrahydrofolate dehydrogenase (FDH) catalyzes the conversion of 10-formyltetrahydrofolate, a precursor for nucleotide biosynthesis, to tetrahydrofolate. Members of this family are multidomain proteins. 
The formyltransferase domain is located at the N-terminus of FDH, Methionyl-tRNA Formyltransferase and ArnA, and at the C-terminus of Formyl-FH4 Hydrolase. Prokaryotic Glycinamide Ribonucleotide Transformylase (GART) is a single domain protein while eukaryotic GART is a trifunctional protein that catalyzes the second, third and fifth steps in de novo purine biosynthesis." Q#19979 - CGI_10014314 superfamily 246712 345 444 1.07E-43 154.812 cl14785 FMT_C_like superfamily - - "Carboxy-terminal domain of Formyltransferase and similar domains; This family represents the C-terminal domain of formyltransferase and similar proteins. This domain is found in a variety of enzymes with formyl transferase and alkyladenine DNA glycosylase activities. The proteins with formyltransferase function include methionyl-tRNA formyltransferase, ArnA, 10-formyltetrahydrofolate dehydrogenase and HypX proteins. Although most proteins with formyl transferase activity contain this C-terminal domain, prokaryotic glycinamide ribonucleotide transformylase (GART), a single domain protein, only contains the core catalytic domain. Thus, the C-terminal domain is not required for formyl transferase catalytic activity and may be involved in substrate binding. Some members of this family have shown nucleic acid binding capacity. The C-terminal domain of methionyl-tRNA formyltransferase is involved in tRNA binding. Alkyladenine DNA glycosylase is a distant member of this family with very low sequence similarity to other members. It catalyzes the first step in base excision repair (BER) by cleaving damaged DNA bases within double-stranded DNA to produce an abasic site and shows ability to bind to DNA." 
Q#19979 - CGI_10014314 superfamily 245815 708 986 0 570.206 cl11961 ALDH-SF superfamily N - "NAD(P)+-dependent aldehyde dehydrogenase superfamily; The aldehyde dehydrogenase superfamily (ALDH-SF) of NAD(P)+-dependent enzymes, in general, oxidize a wide range of endogenous and exogenous aliphatic and aromatic aldehydes to their corresponding carboxylic acids and play an important role in detoxification. Besides aldehyde detoxification, many ALDH isozymes possess multiple additional catalytic and non-catalytic functions such as participating in metabolic pathways, or as binding proteins, or osmoregulants, to mention a few. The enzyme has three domains, a NAD(P)+ cofactor-binding domain, a catalytic domain, and a bridging domain; and the active enzyme is generally either homodimeric or homotetrameric. The catalytic mechanism is proposed to involve cofactor binding, resulting in a conformational change and activation of an invariant catalytic cysteine nucleophile. The cysteine and aldehyde substrate form an oxyanion thiohemiacetal intermediate resulting in hydride transfer to the cofactor and formation of a thioacylenzyme intermediate. Hydrolysis of the thioacylenzyme and release of the carboxylic acid product occurs, and in most cases, the reduced cofactor dissociates from the enzyme. The evolutionary phylogenetic tree of ALDHs appears to have an initial bifurcation between what has been characterized as the classical aldehyde dehydrogenases, the ALDH family (ALDH) and extended family members or aldehyde dehydrogenase-like (ALDH-L) proteins. 
The ALDH proteins are represented by enzymes which share a number of highly conserved residues necessary for catalysis and cofactor binding and they include such proteins as retinal dehydrogenase, 10-formyltetrahydrofolate dehydrogenase, non-phosphorylating glyceraldehyde 3-phosphate dehydrogenase, delta(1)-pyrroline-5-carboxylate dehydrogenases, alpha-ketoglutaric semialdehyde dehydrogenase, alpha-aminoadipic semialdehyde dehydrogenase, coniferyl aldehyde dehydrogenase and succinate-semialdehyde dehydrogenase. Included in this larger group are all human, Arabidopsis, Tortula, fungal, protozoan, and Drosophila ALDHs identified in families ALDH1 through ALDH22 with the exception of families ALDH18, ALDH19, and ALDH20 which are present in the ALDH-like group. The ALDH-like group is represented by such proteins as gamma-glutamyl phosphate reductase, LuxC-like acyl-CoA reductase, and coenzyme A acylating aldehyde dehydrogenase. All of these proteins have a conserved cysteine that aligns with the catalytic cysteine of the ALDH group." Q#19979 - CGI_10014314 superfamily 245815 554 678 6.29E-72 247.408 cl11961 ALDH-SF superfamily C - "NAD(P)+-dependent aldehyde dehydrogenase superfamily; The aldehyde dehydrogenase superfamily (ALDH-SF) of NAD(P)+-dependent enzymes, in general, oxidize a wide range of endogenous and exogenous aliphatic and aromatic aldehydes to their corresponding carboxylic acids and play an important role in detoxification. Besides aldehyde detoxification, many ALDH isozymes possess multiple additional catalytic and non-catalytic functions such as participating in metabolic pathways, or as binding proteins, or osmoregulants, to mention a few. The enzyme has three domains, a NAD(P)+ cofactor-binding domain, a catalytic domain, and a bridging domain; and the active enzyme is generally either homodimeric or homotetrameric. 
The catalytic mechanism is proposed to involve cofactor binding, resulting in a conformational change and activation of an invariant catalytic cysteine nucleophile. The cysteine and aldehyde substrate form an oxyanion thiohemiacetal intermediate resulting in hydride transfer to the cofactor and formation of a thioacylenzyme intermediate. Hydrolysis of the thioacylenzyme and release of the carboxylic acid product occurs, and in most cases, the reduced cofactor dissociates from the enzyme. The evolutionary phylogenetic tree of ALDHs appears to have an initial bifurcation between what has been characterized as the classical aldehyde dehydrogenases, the ALDH family (ALDH) and extended family members or aldehyde dehydrogenase-like (ALDH-L) proteins. The ALDH proteins are represented by enzymes which share a number of highly conserved residues necessary for catalysis and cofactor binding and they include such proteins as retinal dehydrogenase, 10-formyltetrahydrofolate dehydrogenase, non-phosphorylating glyceraldehyde 3-phosphate dehydrogenase, delta(1)-pyrroline-5-carboxylate dehydrogenases, alpha-ketoglutaric semialdehyde dehydrogenase, alpha-aminoadipic semialdehyde dehydrogenase, coniferyl aldehyde dehydrogenase and succinate-semialdehyde dehydrogenase. Included in this larger group are all human, Arabidopsis, Tortula, fungal, protozoan, and Drosophila ALDHs identified in families ALDH1 through ALDH22 with the exception of families ALDH18, ALDH19, and ALDH20 which are present in the ALDH-like group. The ALDH-like group is represented by such proteins as gamma-glutamyl phosphate reductase, LuxC-like acyl-CoA reductase, and coenzyme A acylating aldehyde dehydrogenase. All of these proteins have a conserved cysteine that aligns with the catalytic cysteine of the ALDH group." 
Q#19979 - CGI_10014314 superfamily 245209 466 528 0.00198975 37.5342 cl09936 PP-binding superfamily - - Phosphopantetheine attachment site; A 4'-phosphopantetheine prosthetic group is attached through a serine. This prosthetic group acts as a 'swinging arm' for the attachment of activated fatty acid and amino-acid groups. This domain forms a four helix bundle. This family includes members not included in Prosite. The inclusion of these members is supported by sequence analysis and functional evidence. The related domain of Vibrio anguillarum angR has the attachment serine replaced by an alanine. Q#19982 - CGI_10014317 superfamily 241600 154 350 9.35E-66 209.019 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#19982 - CGI_10014317 superfamily 241600 32 147 1.19E-26 105.015 cl00085 FReD superfamily C - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. 
C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#19983 - CGI_10014318 superfamily 241600 156 364 5.54E-89 269.495 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#19983 - CGI_10014318 superfamily 241600 1 128 2.52E-58 190.529 cl00085 FReD superfamily C - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." 
Q#19984 - CGI_10014319 superfamily 243034 447 544 3.53E-17 78.5759 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#19984 - CGI_10014319 superfamily 243034 678 764 3.56E-15 72.7979 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number 
of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#19984 - CGI_10014319 superfamily 243034 561 662 1.89E-13 67.7904 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 
subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#19984 - CGI_10014319 superfamily 149463 252 290 0.00318525 36.8089 cl07144 DUF1736 superfamily C - Domain of unknown function (DUF1736); This domain of unknown function is found in various hypothetical metazoan proteins. Q#19986 - CGI_10014321 superfamily 217584 33 152 4.23E-18 77.7827 cl04100 MOSC_N superfamily - - "MOSC N-terminal beta barrel domain; This domain is found to the N-terminus of pfam03473. The function of this domain is unknown, however it is predicted to adopt a beta barrel fold." Q#19986 - CGI_10014321 superfamily 217583 197 289 3.63E-07 46.9724 cl04097 MOSC superfamily - - "MOSC domain; The MOSC (MOCO sulfurase C-terminal) domain is a superfamily of beta-strand-rich domains identified in the molybdenum cofactor sulfurase and several other proteins from both prokaryotes and eukaryotes. These MOSC domains contain an absolutely conserved cysteine and occur either as stand-alone forms, or fused to other domains such as NifS-like catalytic domain in Molybdenum cofactor sulfurase. The MOSC domain is predicted to be a sulfur-carrier domain that receives sulfur abstracted by the pyridoxal phosphate-dependent NifS-like enzymes, on its conserved cysteine, and delivers it for the formation of diverse sulfur-metal clusters." Q#19990 - CGI_10009817 superfamily 241589 64 198 1.04E-42 141.617 cl00071 GLECT superfamily - - "Galectin/galactose-binding lectin. This domain exclusively binds beta-galactosides, such as lactose, and does not require metal ions for activity. GLECT domains occur as homodimers or tandemly repeated domains. They are developmentally regulated and may be involved in differentiation, cell-cell interaction and cellular regulation." Q#19995 - CGI_10009822 superfamily 248098 85 162 1.68E-13 65.0073 cl17544 U-box superfamily - - U-box domain; This domain is related to the Ring finger pfam00097 but lacks the zinc binding residues. 
Q#19996 - CGI_10009823 superfamily 241622 215 307 3.17E-14 66.5092 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95 (post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#19996 - CGI_10009823 superfamily 241640 39 174 6.35E-27 103.421 cl00149 Tryp_SPc superfamily - - Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues. Q#19997 - CGI_10009824 superfamily 246681 159 329 1.55E-63 203.29 cl14643 SRPBCC superfamily - - "START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC (SRPBCC) ligand-binding domain superfamily; SRPBCC domains have a deep hydrophobic ligand-binding pocket; they bind diverse ligands. 
Included in this superfamily are the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of mammalian STARD1-STARD15, and the C-terminal catalytic domains of the alpha oxygenase subunit of Rieske-type non-heme iron aromatic ring-hydroxylating oxygenases (RHOs_alpha_C), as well as the SRPBCC domains of phosphatidylinositol transfer proteins (PITPs), Bet v 1 (the major pollen allergen of white birch, Betula verrucosa), CoxG, CalC, and related proteins. Other members of this superfamily include PYR/PYL/RCAR plant proteins, the aromatase/cyclase (ARO/CYC) domains of proteins such as Streptomyces glaucescens tetracenomycin, and the SRPBCC domains of Streptococcus mutans Smu.440 and related proteins." Q#20000 - CGI_10009827 superfamily 247769 472 670 8.69E-09 54.2677 cl17215 HDc superfamily - - Metal dependent phosphohydrolases with conserved 'HD' motif Q#20000 - CGI_10009827 superfamily 243045 156 258 2.79E-07 49.1687 cl02459 PAS superfamily - - "PAS domain; PAS motifs appear in archaea, eubacteria and eukarya. Probably the most surprising identification of a PAS domain was that in EAG-like K+-channels. PAS domains have been found to bind ligands, and to act as sensors for light and oxygen in signal transduction." Q#20001 - CGI_10009828 superfamily 247724 48 168 8.83E-31 118.063 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#20001 - CGI_10009828 superfamily 192499 427 511 6.82E-23 93.8394 cl10941 RNA_GG_bind superfamily - - "PHAX RNA-binding domain; RNA_GG_bind is the highly conserved U3 snoRNA-binding domain of PHAX (phosphorylated adaptor for RNA export) whose function is to transport U3 snoRNA from the nucleus after transcription. It is characterized by having two pairs of adjacent glycines, as GGx12GG." Q#20002 - CGI_10003930 superfamily 247068 489 580 4.03E-19 85.8281 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." 
Q#20002 - CGI_10003930 superfamily 247068 698 787 7.24E-17 79.2797 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#20002 - CGI_10003930 superfamily 247068 588 685 1.04E-14 73.1165 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#20002 - CGI_10003930 superfamily 247068 1836 1929 8.39E-14 70.4201 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#20002 - CGI_10003930 superfamily 247068 1406 1502 7.91E-13 67.3385 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#20002 - CGI_10003930 superfamily 247068 2047 2141 8.40E-13 67.3385 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#20002 - CGI_10003930 superfamily 247068 1074 1163 2.24E-12 66.1829 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#20002 - CGI_10003930 superfamily 247068 873 957 3.43E-11 62.7161 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#20002 - CGI_10003930 superfamily 247068 1518 1605 4.21E-11 62.3309 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#20002 - CGI_10003930 superfamily 247068 1942 2035 1.20E-09 58.0938 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#20002 - CGI_10003930 superfamily 247068 262 366 4.63E-08 53.4714 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#20002 - CGI_10003930 superfamily 247068 1295 1393 5.77E-08 53.0862 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#20002 - CGI_10003930 superfamily 247068 1177 1284 6.90E-08 52.701 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#20002 - CGI_10003930 superfamily 247068 970 1062 1.64E-07 51.5454 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#20002 - CGI_10003930 superfamily 247068 1623 1719 3.42E-06 47.6934 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#20002 - CGI_10003930 superfamily 247068 385 459 9.81E-05 43.071 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#20002 - CGI_10003930 superfamily 247068 1730 1811 0.00016208 42.3006 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#20002 - CGI_10003930 superfamily 247068 107 167 0.000358493 41.145 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#20003 - CGI_10003931 superfamily 247068 37 122 2.30E-13 62.7161 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#20003 - CGI_10003931 superfamily 247068 135 207 0.00276375 34.5966 cl15786 CA_like superfamily C - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#20005 - CGI_10006389 superfamily 247692 111 653 0 542.372 cl17068 AFD_class_I superfamily - - "Adenylate forming domain, Class I; This family includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain." Q#20006 - CGI_10006390 superfamily 245814 139 217 6.04E-08 48.2543 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#20006 - CGI_10006390 superfamily 245814 29 121 9.53E-07 44.8037 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. 
A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#20007 - CGI_10020528 superfamily 218493 5 152 2.77E-47 152.896 cl08434 GMC_oxred_C superfamily - - GMC oxidoreductase; This domain found associated with pfam00732. Q#20008 - CGI_10020530 superfamily 218493 345 391 6.97E-12 62.3746 cl08434 GMC_oxred_C superfamily C - GMC oxidoreductase; This domain found associated with pfam00732. Q#20009 - CGI_10020531 superfamily 218493 1 50 4.52E-18 73.1602 cl08434 GMC_oxred_C superfamily N - GMC oxidoreductase; This domain found associated with pfam00732. Q#20010 - CGI_10020532 superfamily 218493 454 601 1.13E-45 158.674 cl08434 GMC_oxred_C superfamily - - GMC oxidoreductase; This domain found associated with pfam00732. Q#20010 - CGI_10020532 superfamily 130849 51 84 0.00525101 37.922 cl17981 lycopene_cycl superfamily C - lycopene cyclase; This model represents a family of bacterial lycopene cyclases catalyzing the transformation of lycopene to carotene. These enzymes are found in a limited spectrum of alpha and gamma proteobacteria as well as Flavobacterium. 
Q#20014 - CGI_10020536 superfamily 238076 22 144 8.32E-54 179.922 cl18938 PAX superfamily - - Paired Box domain Q#20016 - CGI_10020538 superfamily 243034 466 559 1.07E-06 47.76 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#20017 - CGI_10020539 superfamily 247637 330 654 2.57E-177 510.446 cl16912 MDR superfamily - - "Medium chain reductase/dehydrogenase (MDR)/zinc-dependent alcohol dehydrogenase-like family; The medium chain reductase/dehydrogenases (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH. 
MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P) binding-Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES. The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH), quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones. ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines. Other MDR members have only a catalytic zinc, and some contain no coordinated zinc." 
Q#20020 - CGI_10020542 superfamily 177822 23 278 1.16E-31 120.795 cl18088 PLN02164 superfamily N - sulfotransferase Q#20021 - CGI_10020543 superfamily 243034 333 430 1.49E-11 61.242 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#20021 - CGI_10020543 superfamily 243034 292 360 8.23E-05 40.8264 cl02429 TPR superfamily N - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners 
have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#20022 - CGI_10020544 superfamily 241622 1041 1125 2.66E-15 74.5254 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. 
PDZ domains have been named after PSD95 (post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#20022 - CGI_10020544 superfamily 241581 360 456 6.15E-09 56.2406 cl00062 FHA superfamily - - "Forkhead associated domain (FHA); found in eukaryotic and prokaryotic proteins. Putative nuclear signalling domain. FHA domains may bind phosphothreonine, phosphoserine and sometimes phosphotyrosine. In eukaryotes, many FHA domain-containing proteins localize to the nucleus, where they participate in establishing or maintaining cell cycle checkpoints, DNA repair, or transcriptional regulation. Members of the FHA family include: Dun1, Rad53, Cds1, Mek1, KAPP (kinase-associated protein phosphatase), and Ki-67 (a human nuclear protein related to cell proliferation)." Q#20022 - CGI_10020544 superfamily 241645 18 128 3.98E-42 153.21 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#20022 - CGI_10020544 superfamily 216736 804 909 7.53E-33 126.145 cl03379 DIL superfamily - - DIL domain; The DIL domain has no known function. 
Q#20022 - CGI_10020544 superfamily 241645 224 326 4.85E-28 111.962 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#20024 - CGI_10020546 superfamily 243072 1 86 2.80E-16 72.4162 cl02529 ANK superfamily N - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#20025 - CGI_10020547 superfamily 245596 202 242 2.70E-24 98.0453 cl11394 Glyco_tranf_GTA_type superfamily C - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. 
Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#20027 - CGI_10020549 superfamily 198867 85 185 3.65E-33 121.497 cl06652 BACK superfamily - - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation)." Q#20027 - CGI_10020549 superfamily 243066 2 80 4.20E-16 74.2644 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. 
POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#20027 - CGI_10020549 superfamily 243146 366 411 9.46E-13 63.4494 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#20027 - CGI_10020549 superfamily 243146 332 377 3.69E-11 58.7239 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#20027 - CGI_10020549 superfamily 243146 272 317 5.91E-11 58.4418 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#20027 - CGI_10020549 superfamily 243146 426 471 2.52E-10 56.4127 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#20027 - CGI_10020549 superfamily 243146 473 518 8.11E-10 54.8719 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#20027 - CGI_10020549 superfamily 243146 237 282 9.56E-06 43.3159 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#20028 - CGI_10020550 superfamily 198867 118 218 2.45E-34 125.349 cl06652 BACK superfamily - - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation)." 
Q#20028 - CGI_10020550 superfamily 243066 7 109 7.13E-27 105.007 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#20028 - CGI_10020550 superfamily 243146 411 457 4.73E-17 75.6727 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#20028 - CGI_10020550 superfamily 243146 317 363 1.25E-13 66.0427 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#20028 - CGI_10020550 superfamily 243146 352 396 4.32E-12 61.5234 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#20028 - CGI_10020550 superfamily 243146 459 504 1.21E-10 57.5683 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#20028 - CGI_10020550 superfamily 243146 493 538 9.38E-10 54.975 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. 
" Q#20028 - CGI_10020550 superfamily 243146 270 315 1.16E-05 42.9307 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#20029 - CGI_10020551 superfamily 245814 155 235 6.11E-05 40.1651 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#20029 - CGI_10020551 superfamily 245814 36 135 6.81E-07 45.9593 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#20032 - CGI_10020554 superfamily 217473 127 289 4.75E-25 105.14 cl03978 Mab-21 superfamily N - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. 
Q#20034 - CGI_10020556 superfamily 218088 3 133 3.12E-27 104.353 cl04518 RINT1_TIP1 superfamily N - "RINT-1 / TIP-1 family; This family includes RINT-1, a Rad50 interacting protein which participates in radiation induced checkpoint control, as well as the TIP-1 protein from yeast that seems to be involved in a complex with Sec20p that is required for golgi transport." Q#20035 - CGI_10020557 superfamily 243540 64 256 8.59E-29 109.645 cl03831 HlyIII superfamily - - "Haemolysin-III related; Members of this family are integral membrane proteins. This family includes a protein with hemolytic activity from Bacillus cereus. It has been proposed that YOL002c encodes a Saccharomyces cerevisiae protein that plays a key role in metabolic pathways that regulate lipid and phosphate metabolism. In eukaryotes, members are seven-transmembrane pass molecules found to encode functional receptors with a broad range of apparent ligand specificities, including progestin and adipoQ receptors, and hence have been named PAQR proteins. The mammalian members include progesterone binding proteins. Unlike the case with GPCR receptor proteins, the evolutionary ancestry of the members of this family can be traced back to the Archaea." Q#20037 - CGI_10020559 superfamily 245206 1 244 6.17E-96 283.593 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. 
Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#20038 - CGI_10020560 superfamily 243078 1 61 5.05E-14 67.3481 cl02544 VHS_ENTH_ANTH superfamily N - "VHS, ENTH and ANTH domain superfamily; composed of proteins containing a VHS, ENTH or ANTH domain. The VHS domain is present in Vps27 (Vacuolar Protein Sorting), Hrs (Hepatocyte growth factor-regulated tyrosine kinase substrate) and STAM (Signal Transducing Adaptor Molecule). It is located at the N-termini of proteins involved in intracellular membrane trafficking. The epsin N-terminal homology (ENTH) domain is an evolutionarily conserved protein module found primarily in proteins that participate in clathrin-mediated endocytosis. A set of proteins previously designated as harboring an ENTH domain in fact contains a highly similar, yet unique module referred to as an AP180 N-terminal homology (ANTH) domain. VHS, ENTH and ANTH domains are structurally similar and are composed of a superhelix of eight alpha helices. ENTH and ANTH (E/ANTH) domains bind both inositol phospholipids and proteins and contribute to the nucleation and formation of clathrin coats on membranes. ENTH domains also function in the development of membrane curvature through lipid remodeling during the formation of clathrin-coated vesicles. E/ANTH domain-bearing proteins have recently been shown to function with adaptor protein-1 and GGA adaptors at the trans-Golgi network, which suggests that E/ANTH domains are universal components of the machinery for clathrin-mediated membrane budding." 
Q#20041 - CGI_10020563 superfamily 243034 105 216 1.02E-13 70.8719 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#20041 - CGI_10020563 superfamily 216112 2537 2904 1.08E-58 209.075 cl02964 RNB superfamily - - RNB domain; This domain is the catalytic domain of ribonuclease II. Q#20041 - CGI_10020563 superfamily 221913 1873 2089 9.08E-46 167.332 cl18626 AAA_12 superfamily - - AAA domain; This family of domains contain a P-loop motif that is characteristic of the AAA superfamily. Many of the proteins in this family are conjugative transfer proteins. 
Q#20041 - CGI_10020563 superfamily 221913 3638 3831 1.95E-31 125.731 cl18626 AAA_12 superfamily - - AAA domain; This family of domains contain a P-loop motif that is characteristic of the AAA superfamily. Many of the proteins in this family are conjugative transfer proteins. Q#20041 - CGI_10020563 superfamily 222005 3392 3461 2.62E-05 45.4208 cl18632 AAA_19 superfamily - - Part of AAA domain; Part of AAA domain. Q#20041 - CGI_10020563 superfamily 222258 1674 1738 7.00E-05 45.2516 cl18656 AAA_30 superfamily C - AAA domain; This family of domains contain a P-loop motif that is characteristic of the AAA superfamily. Many of the proteins in this family are conjugative transfer proteins. There is a Walker A and Walker B. Q#20041 - CGI_10020563 superfamily 248275 1184 1208 0.000194166 42.182 cl17721 zf-C2H2_jaz superfamily - - "Zinc-finger double-stranded RNA-binding; This domain family is found in archaea and eukaryotes, and is approximately 30 amino acids in length. The mammalian members of this group occur multiple times along the protein, joined by flexible linkers, and are referred to as JAZ - dsRNA-binding ZF protein - zinc-fingers. The JAZ proteins are expressed in all tissues tested and localise in the nucleus, particularly the nucleolus. JAZ preferentially binds to double-stranded (ds) RNA or RNA/DNA hybrids rather than DNA. In addition to binding double-stranded RNA, these zinc-fingers are required for nucleolar localisation." Q#20044 - CGI_10020567 superfamily 248097 14 117 1.20E-17 73.4534 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#20045 - CGI_10008096 superfamily 215724 1 138 1.26E-51 170.492 cl14706 wnt superfamily C - "wnt family; Wnt genes have been identified in vertebrates and invertebrates but not in plants, unicellular eukaryotes or prokaryotes. In humans, 19 WNT proteins are known. 
Because of their insolubility little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathway) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families." Q#20048 - CGI_10008099 superfamily 247743 256 399 5.23E-20 87.9719 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#20048 - CGI_10008099 superfamily 247743 545 657 7.32E-07 48.2963 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." 
Q#20048 - CGI_10008099 superfamily 216993 7 86 2.84E-10 57.9553 cl15641 CDC48_N superfamily - - "Cell division protein 48 (CDC48), N-terminal domain; This domain has a double psi-beta barrel fold and includes VCP-like ATPase and N-ethylmaleimide sensitive fusion protein N-terminal domains. Both the VAT and NSF N-terminal functional domains consist of two structural domains of which this is at the N-terminus. The VAT-N domain found in AAA ATPases pfam00004 is a substrate 185-residue recognition domain." Q#20050 - CGI_10008101 superfamily 241791 1 131 3.23E-78 231.456 cl00331 Ribosomal_S13 superfamily - - Ribosomal protein S13/S18; This family includes ribosomal protein S13 from prokaryotes and S18 from eukaryotes. Q#20052 - CGI_10008103 superfamily 243091 4357 4476 3.11E-40 149.022 cl02566 SET superfamily - - "SET domain; SET domains are protein lysine methyltransferase enzymes. SET domains appear to be protein-protein interaction domains. It has been demonstrated that SET domains mediate interactions with a family of proteins that display similarity with dual-specificity phosphatases (dsPTPases). A subset of SET domains have been called PR domains. These domains are divergent in sequence from other SET domains, but also appear to mediate protein-protein interaction. The SET domain consists of two regions known as SET-N and SET-C. SET-C forms an unusual and conserved knot-like structure of probably functional importance. Additionally to SET-N and SET-C, an insert region (SET-I) and flanking regions of high structural variability form part of the overall structure." Q#20052 - CGI_10008103 superfamily 248279 1655 1734 6.85E-26 106.297 cl17725 zf-HC5HC2H superfamily - - "PHD-like zinc-binding domain; The members of this family are annotated as containing PHD domain, but the zinc-binding region here is not typical of PHD domains. 
The conformation here is a well-conserved cysteine-histidine rich region spanning 90 residues, where the Cys and His are arranged as HxxC(31)CxxC(6)CxxCxxxxCxxxxHxxC (21)CxxH." Q#20052 - CGI_10008103 superfamily 243127 4193 4273 1.02E-18 85.424 cl02651 FYRC superfamily - - F/Y rich C-terminus; This region is normally found in the trithorax/ALL1 family proteins. It is similar to SMART:SM00542. Q#20052 - CGI_10008103 superfamily 247999 994 1038 4.22E-10 59.1478 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#20052 - CGI_10008103 superfamily 243126 1780 1830 4.41E-10 59.1416 cl02650 FYRN superfamily - - F/Y-rich N-terminus; This region is normally found in the trithorax/ALL1 family proteins. It is similar to SMART:SM00541. Q#20052 - CGI_10008103 superfamily 243084 1475 1548 1.72E-09 58.9855 cl02556 Bromodomain superfamily N - Bromodomain. Bromodomains are found in many chromatin-associated proteins and in nuclear histone acetyltransferases. They interact specifically with acetylated lysine. Q#20052 - CGI_10008103 superfamily 202085 711 738 0.000182964 42.7326 cl03401 zf-CXXC superfamily C - "CXXC zinc finger domain; This domain contains eight conserved cysteine residues that bind to two zinc ions. The CXXC domain is found in a variety of chromatin-associated proteins. This domain binds to nonmethyl-CpG dinucleotides. The domain is characterized by two CGXCXXC repeats. The RecQ helicase has a single repeat that also binds to zinc, but this has not been included in this family. The DNA binding interface has been identified by NMR." Q#20054 - CGI_10008105 superfamily 241559 57 162 3.23E-10 58.0911 cl00030 CH superfamily - - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). 
The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." Q#20054 - CGI_10008105 superfamily 198670 644 723 7.79E-28 108.083 cl02426 DIX superfamily - - DIX domain; The DIX domain is present in Dishevelled and axin. This domain is involved in homo- and hetero-oligomerisation. It is involved in the homo- oligomerisation of mouse axin. The axin DIX domain also interacts with the dishevelled DIX domain. The DIX domain has also been called the DAX domain. Q#20054 - CGI_10008105 superfamily 221533 402 471 9.52E-05 41.1432 cl13726 TMF_DNA_bd superfamily - - "TATA element modulatory factor 1 DNA binding; This is the middle region of a family of TATA element modulatory factor 1 proteins conserved in eukaryotes that contains at its N-terminal section a number of leucine zippers that could potentially form coiled coil structures. The whole proteins bind to the TATA element of some RNA polymerase II promoters and repress their activity by competing with the binding of TATA binding protein. TMFs are evolutionarily conserved golgins that bind Rab6, a ubiquitous ras-like GTP-binding Golgi protein, and contribute to Golgi organisation in animal and plant cells." Q#20055 - CGI_10008106 superfamily 247755 1308 1528 9.05E-129 400.716 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. 
ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#20055 - CGI_10008106 superfamily 247755 643 844 1.37E-101 324.037 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#20055 - CGI_10008106 superfamily 216049 330 599 1.07E-33 132.795 cl18356 ABC_membrane superfamily - - ABC transporter transmembrane region; This family represents a unit of six transmembrane helices. Many members of the ABC transporter family (pfam00005) have two such regions. Q#20055 - CGI_10008106 superfamily 216049 1003 1261 5.45E-33 130.869 cl18356 ABC_membrane superfamily - - ABC transporter transmembrane region; This family represents a unit of six transmembrane helices. Many members of the ABC transporter family (pfam00005) have two such regions. Q#20056 - CGI_10008107 superfamily 241599 100 158 9.17E-21 83.0616 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." 
Q#20057 - CGI_10009636 superfamily 241584 176 244 0.000504073 37.4035 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#20057 - CGI_10009636 superfamily 246925 54 135 0.00050561 39.6462 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#20058 - CGI_10009637 superfamily 218849 8 213 1.02E-72 237.136 cl05508 Nop52 superfamily - - "Nucleolar protein,Nop52; Nop52 believed to be involved in the generation of 28S rRNA." Q#20059 - CGI_10009638 superfamily 219903 1430 1603 8.80E-33 128.541 cl09456 Cut8 superfamily - - "Cut8; In Schizosaccharomyces pombe, Cut8 is a nuclear envelope protein that physically interacts with and tethers 26S proteasome in the nucleus resulting in the nuclear accumulation of proteasome. Cut8 is a proteasome substrate and amino terminal residues 1-72 are polyubiquitinated and function as a degron tag. Ubiquitination of the amino terminal is essential to the function of Cut8. 
Lysine residues in the amino terminal 72 amino acids of Cut8 are required for physical interaction with proteasome. In fission yeast the function of Cut8 has been demonstrated to be regulated by ubiquitin-conjugating Rhp6/Ubc2/Rad6 and ligating enzymes Ubr1. Cut8 homologs have been identified in Drosophila melanogaster, Anopheles gambiae and Dictyostelium discoideum." Q#20059 - CGI_10009638 superfamily 243091 450 553 8.35E-05 43.0919 cl02566 SET superfamily - - "SET domain; SET domains are protein lysine methyltransferase enzymes. SET domains appear to be protein-protein interaction domains. It has been demonstrated that SET domains mediate interactions with a family of proteins that display similarity with dual-specificity phosphatases (dsPTPases). A subset of SET domains have been called PR domains. These domains are divergent in sequence from other SET domains, but also appear to mediate protein-protein interaction. The SET domain consists of two regions known as SET-N and SET-C. SET-C forms an unusual and conserved knot-like structure of probably functional importance. Additionally to SET-N and SET-C, an insert region (SET-I) and flanking regions of high structural variability form part of the overall structure." Q#20059 - CGI_10009638 superfamily 197676 915 937 0.00944111 36.2897 cl18194 ZnF_C2H2 superfamily - - zinc finger; zinc finger. Q#20061 - CGI_10009640 superfamily 243267 35 399 4.04E-131 383.888 cl03000 Innexin superfamily - - "Innexin; This family includes the drosophila proteins Ogre and shaking-B, and the C. elegans proteins Unc-7 and Unc-9. Members of this family are integral membrane proteins which are involved in the formation of gap junctions. This family has been named the Innexins." 
Q#20062 - CGI_10009641 superfamily 241550 458 733 3.92E-108 344.23 cl00015 nt_trans superfamily N - "nucleotidyl transferase superfamily; nt_trans (nucleotidyl transferase) This superfamily includes the class I amino-acyl tRNA synthetases, pantothenate synthetase (PanC), ATP sulfurylase, and the cytidylyltransferases, all of which have a conserved dinucleotide-binding domain." Q#20062 - CGI_10009641 superfamily 241550 132 287 5.48E-84 278.361 cl00015 nt_trans superfamily C - "nucleotidyl transferase superfamily; nt_trans (nucleotidyl transferase) This superfamily includes the class I amino-acyl tRNA synthetases, pantothenate synthetase (PanC), ATP sulfurylase, and the cytidylyltransferases, all of which have a conserved dinucleotide-binding domain." Q#20062 - CGI_10009641 superfamily 245839 733 870 3.27E-60 202.787 cl12020 Anticodon_Ia_like superfamily - - "Anticodon-binding domain of class Ia aminoacyl tRNA synthetases and similar domains; This domain is found in a variety of class Ia aminoacyl tRNA synthetases, C-terminal to the catalytic core domain. It recognizes and specifically binds to the anticodon of the tRNA. Aminoacyl tRNA synthetases catalyze the transfer of cognate amino acids to the 3'-end of their tRNAs by specifically recognizing cognate from non-cognate amino acids. Members include valyl-, leucyl-, isoleucyl-, cysteinyl-, arginyl-, and methionyl-tRNA synthetases. This superfamily also includes a domain from MshC, an enzyme in the mycothiol biosynthetic pathway." Q#20062 - CGI_10009641 superfamily 222257 309 406 1.34E-11 64.0343 cl16317 tRNA-synt_1_2 superfamily C - "Leucyl-tRNA synthetase, Domain 2; This is a family of the conserved region of Leucine-tRNA ligase or Leucyl-tRNA synthetase, EC:6.1.1.4." 
Q#20063 - CGI_10009642 superfamily 241832 210 322 7.15E-59 189.136 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#20063 - CGI_10009642 superfamily 241832 6 101 4.82E-54 176.427 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). 
Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#20063 - CGI_10009642 superfamily 241832 109 196 8.13E-21 85.8442 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#20064 - CGI_10009643 superfamily 221761 4 153 1.10E-14 69.8183 cl15080 SAGA-Tad1 superfamily N - "Transcriptional regulator of RNA polII, SAGA, subunit; The yeast SAGA complex is a multifunctional coactivator that regulates transcription by RNA polymerase II. It is formed of five major modular subunits and shows a high degree of structural conservation to human TFTC and STAGA. The complex can also be conceived of as consisting of two histone-fold-containing core subunits, and this family is one of these. As a family it is likely to carry binding regions for interactions with a number of the other components of the complex." 
Q#20065 - CGI_10009644 superfamily 111019 5 270 1.95E-117 340.529 cl17927 SURF4 superfamily - - SURF4 family; SURF4 family. Q#20066 - CGI_10009645 superfamily 247744 270 462 5.19E-56 186.674 cl17190 NK superfamily - - "Nucleoside/nucleotide kinase (NK) is a protein superfamily consisting of multiple families of enzymes that share structural similarity and are functionally related to the catalysis of the reversible phosphate group transfer from nucleoside triphosphates to nucleosides/nucleotides, nucleoside monophosphates, or sugars. Members of this family play a wide variety of essential roles in nucleotide metabolism, the biosynthesis of coenzymes and aromatic compounds, as well as the metabolism of sugar and sulfate." Q#20066 - CGI_10009645 superfamily 247744 59 249 9.35E-35 128.509 cl17190 NK superfamily - - "Nucleoside/nucleotide kinase (NK) is a protein superfamily consisting of multiple families of enzymes that share structural similarity and are functionally related to the catalysis of the reversible phosphate group transfer from nucleoside triphosphates to nucleosides/nucleotides, nucleoside monophosphates, or sugars. Members of this family play a wide variety of essential roles in nucleotide metabolism, the biosynthesis of coenzymes and aromatic compounds, as well as the metabolism of sugar and sulfate." Q#20066 - CGI_10009645 superfamily 213107 17 54 0.00510957 34.9384 cl02594 DD_R_PKA superfamily - - "Dimerization/Docking domain of the Regulatory subunit of cAMP-dependent protein kinase and similar domains; cAMP-dependent protein kinase (PKA) is a serine/threonine kinase (STK), catalyzing the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The inactive PKA holoenzyme is a heterotetramer composed of two phosphorylated and active catalytic subunits with a dimer of regulatory (R) subunits. 
Activation is achieved through the binding of the important second messenger cAMP to the R subunits, which leads to the dissociation of PKA into the R dimer and two active subunits. There are two classes of R subunits, RI and RII; each exists as two isoforms (alpha and beta) from distinct genes. These functionally non-redundant R isoforms allow for specificity in PKA signaling. The R subunit contains an N-terminal dimerization/docking (D/D) domain, a linker with an inhibitory sequence (IS), and two c-AMP binding domains. RI and RII subunits are distinguished by their IS; RII subunits contain a phosphorylation site and are both substrates and inhibitors while RI subunits are pseudo-substrates. RI subunits require ATP and Mg ions to form a stable holoenzyme while RII subunits do not. The D/D domain dimerizes to form a four-helix bundle that serves as a docking site for A-kinase-anchoring proteins (AKAPs), which facilitates the localization of PKA to specific sites in the cell. PKA is present ubiquitously in cells and interacts with many different downstream targets. It plays a role in the regulation of diverse processes such as growth, development, memory, metabolism, gene expression, immunity, and lipolysis." Q#20067 - CGI_10009646 superfamily 241622 384 455 3.64E-15 72.5994 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes.
This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95 (post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#20067 - CGI_10009646 superfamily 246669 223 340 9.70E-33 124.103 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#20067 - CGI_10009646 superfamily 243096 546 727 3.36E-09 56.0698 cl02571 RhoGEF superfamily - - Guanine nucleotide exchange factor for Rho/Rac/Cdc42-like GTPases; Also called Dbl-homologous (DH) domain. It appears that PH domains invariably occur C-terminal to RhoGEF/DH domains.
Q#20069 - CGI_10009648 superfamily 220662 110 311 2.50E-73 236.968 cl10947 DUF2217 superfamily N - Uncharacterized conserved protein (DUF2217); This is a family of conserved proteins of from 500 - 600 residues found from worms to humans. Its function is not known. Q#20069 - CGI_10009648 superfamily 220662 33 149 5.29E-09 55.9245 cl10947 DUF2217 superfamily NC - Uncharacterized conserved protein (DUF2217); This is a family of conserved proteins of from 500 - 600 residues found from worms to humans. Its function is not known. Q#20070 - CGI_10009649 superfamily 241568 644 686 4.89E-05 42.0648 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#20070 - CGI_10009649 superfamily 214531 70 121 2.96E-06 45.2853 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#20070 - CGI_10009649 superfamily 214531 491 531 3.52E-06 45.2853 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. 
Q#20070 - CGI_10009649 superfamily 205157 602 627 0.00126246 37.5171 cl18264 EGF_3 superfamily C - EGF domain; This family includes a variety of EGF-like domain homologues. This family includes the C-terminal domain of the malaria parasite MSP1 protein. Q#20071 - CGI_10009650 superfamily 241607 126 153 0.000865852 36.479 cl00097 KAZAL_FS superfamily C - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as trypsin, chymotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as BM-40 or osteonectin, the Gallus gallus Flik protein, as well as agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters), which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#20071 - CGI_10009650 superfamily 241607 293 311 0.004274 34.5671 cl00097 KAZAL_FS superfamily C - "Kazal type serine protease inhibitors and follistatin-like domains.
Kazal inhibitors inhibit serine proteases, such as trypsin, chymotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as BM-40 or osteonectin, the Gallus gallus Flik protein, as well as agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters), which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#20072 - CGI_10009651 superfamily 241774 1 128 2.98E-83 244.677 cl00313 Ribosomal_S7 superfamily N - Ribosomal protein S7p/S5e; This family contains ribosomal protein S7 from prokaryotes and S5 from eukaryotes. Q#20073 - CGI_10003285 superfamily 245815 37 558 0 913.125 cl11961 ALDH-SF superfamily - - "NAD(P)+-dependent aldehyde dehydrogenase superfamily; The aldehyde dehydrogenase superfamily (ALDH-SF) of NAD(P)+-dependent enzymes, in general, oxidize a wide range of endogenous and exogenous aliphatic and aromatic aldehydes to their corresponding carboxylic acids and play an important role in detoxification.
Besides aldehyde detoxification, many ALDH isozymes possess multiple additional catalytic and non-catalytic functions such as participating in metabolic pathways, or as binding proteins, or osmoregulants, to mention a few. The enzyme has three domains, a NAD(P)+ cofactor-binding domain, a catalytic domain, and a bridging domain; and the active enzyme is generally either homodimeric or homotetrameric. The catalytic mechanism is proposed to involve cofactor binding, resulting in a conformational change and activation of an invariant catalytic cysteine nucleophile. The cysteine and aldehyde substrate form an oxyanion thiohemiacetal intermediate resulting in hydride transfer to the cofactor and formation of a thioacylenzyme intermediate. Hydrolysis of the thioacylenzyme and release of the carboxylic acid product occurs, and in most cases, the reduced cofactor dissociates from the enzyme. The evolutionary phylogenetic tree of ALDHs appears to have an initial bifurcation between what has been characterized as the classical aldehyde dehydrogenases, the ALDH family (ALDH) and extended family members or aldehyde dehydrogenase-like (ALDH-L) proteins. The ALDH proteins are represented by enzymes which share a number of highly conserved residues necessary for catalysis and cofactor binding and they include such proteins as retinal dehydrogenase, 10-formyltetrahydrofolate dehydrogenase, non-phosphorylating glyceraldehyde 3-phosphate dehydrogenase, delta(1)-pyrroline-5-carboxylate dehydrogenases, alpha-ketoglutaric semialdehyde dehydrogenase, alpha-aminoadipic semialdehyde dehydrogenase, coniferyl aldehyde dehydrogenase and succinate-semialdehyde dehydrogenase. Included in this larger group are all human, Arabidopsis, Tortula, fungal, protozoan, and Drosophila ALDHs identified in families ALDH1 through ALDH22 with the exception of families ALDH18, ALDH19, and ALDH20 which are present in the ALDH-like group. 
The ALDH-like group is represented by such proteins as gamma-glutamyl phosphate reductase, LuxC-like acyl-CoA reductase, and coenzyme A acylating aldehyde dehydrogenase. All of these proteins have a conserved cysteine that aligns with the catalytic cysteine of the ALDH group." Q#20075 - CGI_10017821 superfamily 248097 6 126 2.13E-11 56.5046 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#20077 - CGI_10017823 superfamily 243175 130 243 4.84E-51 165.859 cl02776 GST_C_family superfamily - - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. 
In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." Q#20077 - CGI_10017823 superfamily 241832 18 75 4.28E-12 60.4636 cl00388 Thioredoxin_like superfamily C - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#20084 - CGI_10017830 superfamily 216056 35 163 4.39E-38 140.137 cl08279 Peptidase_M16 superfamily - - Insulinase (Peptidase family M16); Insulinase (Peptidase family M16). Q#20084 - CGI_10017830 superfamily 218490 190 368 9.31E-23 97.1619 cl08432 Peptidase_M16_C superfamily - - "Peptidase M16 inactive domain; Peptidase M16 consists of two structurally related domains. One is the active peptidase, whereas the other is inactive. The two domains hold the substrate like a clamp." 
Q#20084 - CGI_10017830 superfamily 218490 626 809 2.72E-10 59.7975 cl08432 Peptidase_M16_C superfamily - - "Peptidase M16 inactive domain; Peptidase M16 consists of two structurally related domains. One is the active peptidase, whereas the other is inactive. The two domains hold the substrate like a clamp." Q#20086 - CGI_10002556 superfamily 215647 106 265 6.36E-13 66.0928 cl18338 7tm_2 superfamily N - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GPCRs). They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#20088 - CGI_10002558 superfamily 247755 252 487 4.54E-154 441.593 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family.
ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#20088 - CGI_10002558 superfamily 216049 17 205 6.42E-39 143.195 cl18356 ABC_membrane superfamily N - ABC transporter transmembrane region; This family represents a unit of six transmembrane helices. Many members of the ABC transporter family (pfam00005) have two such regions. Q#20090 - CGI_10005936 superfamily 142268 1 32 0.00954306 30.2948 cl09807 Lin0512_fam superfamily C - "Conserved hypothetical protein (Lin0512_fam); This family consists of few members, broadly distributed. It occurs so far in several Firmicutes (twice in Oceanobacillus), one Cyanobacterium, one alpha Proteobacterium, and (with a long prefix) in plants. The function is unknown. The alignment includes a well conserved motif GxGxDxHG near the N-terminus." Q#20091 - CGI_10005937 superfamily 247792 582 626 5.71E-06 44.3588 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#20091 - CGI_10005937 superfamily 141623 91 211 8.79E-42 147.457 cl02685 Neuralized superfamily - - Neuralized; This family contains a conserved region approximately 60 residues long within eukaryotic neuralized and neuralized-like proteins.
Neuralized belongs to a group of ubiquitin ligases and is required in a subset of Notch pathway-mediated cell fate decisions during development of the Drosophila nervous system. Some family members contain multiple copies of this region. Q#20091 - CGI_10005937 superfamily 141623 316 439 2.83E-23 95.8405 cl02685 Neuralized superfamily - - Neuralized; This family contains a conserved region approximately 60 residues long within eukaryotic neuralized and neuralized-like proteins. Neuralized belongs to a group of ubiquitin ligases and is required in a subset of Notch pathway-mediated cell fate decisions during development of the Drosophila nervous system. Some family members contain multiple copies of this region. Q#20094 - CGI_10005940 superfamily 247986 43 83 0.000638447 40.8194 cl17432 PBPb superfamily NC - "Bacterial periplasmic transport systems use membrane-bound complexes and substrate-bound, membrane-associated, periplasmic binding proteins (PBPs) to transport a wide variety of substrates, such as, amino acids, peptides, sugars, vitamins and inorganic ions. PBPs have two cell-membrane translocation functions: bind substrate, and interact with the membrane bound complex. A diverse group of periplasmic transport receptors for lysine/arginine/ornithine (LAO), glutamine, histidine, sulfate, phosphate, molybdate, and methanol are included in the PBPb CD." Q#20094 - CGI_10005940 superfamily 197504 194 326 1.04E-07 51.1361 cl18192 PBPe superfamily - - Eukaryotic homologues of bacterial periplasmic substrate binding proteins; Prokaryotic homologues are represented by a separate alignment: PBPb Q#20095 - CGI_10004403 superfamily 214531 24 63 1.32E-07 43.7445 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. 
Q#20096 - CGI_10004404 superfamily 241568 124 178 2.24E-05 43.6056 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#20096 - CGI_10004404 superfamily 241568 930 963 0.000604454 39.3684 cl00043 CCP superfamily N - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#20096 - CGI_10004404 superfamily 214531 761 801 1.07E-08 52.9893 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. 
Q#20096 - CGI_10004404 superfamily 214531 353 394 6.52E-06 44.9001 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#20096 - CGI_10004404 superfamily 214531 308 351 4.82E-05 42.2037 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#20096 - CGI_10004404 superfamily 215683 735 775 0.000411875 39.4607 cl18339 Ldl_recept_b superfamily - - Low-density lipoprotein receptor repeat class B; This domain is also known as the YWTD motif after the most conserved region of the repeat. The YWTD repeat is found in multiple tandem repeats and has been predicted to form a beta-propeller structure. Q#20098 - CGI_10004406 superfamily 246669 58 175 3.26E-21 84.6375 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain.
C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#20099 - CGI_10004407 superfamily 221377 248 344 5.81E-07 47.4635 cl13449 DUF3504 superfamily C - Domain of unknown function (DUF3504); This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 156 to 173 amino acids in length. Q#20102 - CGI_10027874 superfamily 241571 24 124 1.57E-08 50.8739 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#20102 - CGI_10027874 superfamily 241571 128 226 0.00628194 34.6955 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#20103 - CGI_10027875 superfamily 241734 69 288 1.54E-75 245.856 cl00261 PLPDE_III superfamily C - "Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzymes; The fold type III PLP-dependent enzyme family is predominantly composed of two-domain proteins with similarity to bacterial alanine racemases (AR) including eukaryotic ornithine decarboxylases (ODC), prokaryotic diaminopimelate decarboxylases (DapDC), biosynthetic arginine decarboxylases (ADC), carboxynorspermidine decarboxylases (CANSDC), and similar proteins. AR-like proteins contain an N-terminal PLP-binding TIM-barrel domain and a C-terminal beta-sandwich domain. They exist as homodimers with active sites that lie at the interface between the TIM barrel domain of one subunit and the beta-sandwich domain of the other subunit. These proteins play important roles in the biosynthesis of amino acids and polyamine. The family also includes the single-domain YBL036c-like proteins, which contain a single PLP-binding TIM-barrel domain without any N- or C-terminal extensions. 
Due to the lack of a second domain, these proteins may possess only limited D- to L-alanine racemase activity or non-specific racemase activity." Q#20103 - CGI_10027875 superfamily 241734 373 522 7.07E-53 185.379 cl00261 PLPDE_III superfamily N - "Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzymes; The fold type III PLP-dependent enzyme family is predominantly composed of two-domain proteins with similarity to bacterial alanine racemases (AR) including eukaryotic ornithine decarboxylases (ODC), prokaryotic diaminopimelate decarboxylases (DapDC), biosynthetic arginine decarboxylases (ADC), carboxynorspermidine decarboxylases (CANSDC), and similar proteins. AR-like proteins contain an N-terminal PLP-binding TIM-barrel domain and a C-terminal beta-sandwich domain. They exist as homodimers with active sites that lie at the interface between the TIM barrel domain of one subunit and the beta-sandwich domain of the other subunit. These proteins play important roles in the biosynthesis of amino acids and polyamine. The family also includes the single-domain YBL036c-like proteins, which contain a single PLP-binding TIM-barrel domain without any N- or C-terminal extensions. Due to the lack of a second domain, these proteins may possess only limited D- to L-alanine racemase activity or non-specific racemase activity." Q#20104 - CGI_10027876 superfamily 247916 141 199 2.30E-07 48.9183 cl17362 Transglut_core superfamily - - "Transglutaminase-like superfamily; This family includes animal transglutaminases and other bacterial proteins of unknown function. Sequence conservation in this superfamily primarily involves three motifs that centre around conserved cysteine, histidine, and aspartate residues that form the catalytic triad in the structurally characterized transglutaminase, the human blood clotting factor XIIIa'. 
On the basis of the experimentally demonstrated activity of the Methanobacterium phage pseudomurein endoisopeptidase, it is proposed that many, if not all, microbial homologues of the transglutaminases are proteases and that the eukaryotic transglutaminases have evolved from an ancestral protease." Q#20105 - CGI_10027877 superfamily 247916 184 247 2.52E-08 51.9999 cl17362 Transglut_core superfamily - - "Transglutaminase-like superfamily; This family includes animal transglutaminases and other bacterial proteins of unknown function. Sequence conservation in this superfamily primarily involves three motifs that centre around conserved cysteine, histidine, and aspartate residues that form the catalytic triad in the structurally characterized transglutaminase, the human blood clotting factor XIIIa'. On the basis of the experimentally demonstrated activity of the Methanobacterium phage pseudomurein endoisopeptidase, it is proposed that many, if not all, microbial homologues of the transglutaminases are proteases and that the eukaryotic transglutaminases have evolved from an ancestral protease." Q#20105 - CGI_10027877 superfamily 193253 601 713 0.00686469 38.0941 cl15084 MT superfamily NC - "Microtubule-binding stalk of dynein motor; the 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This family is the region between D4 and D5 and is the two predicted alpha-helical coiled coil segments that form the stalk supporting the ATP-sensitive microtubule binding component." 
Q#20106 - CGI_10027878 superfamily 247916 141 194 2.95E-08 51.6147 cl17362 Transglut_core superfamily - - "Transglutaminase-like superfamily; This family includes animal transglutaminases and other bacterial proteins of unknown function. Sequence conservation in this superfamily primarily involves three motifs that centre around conserved cysteine, histidine, and aspartate residues that form the catalytic triad in the structurally characterized transglutaminase, the human blood clotting factor XIIIa'. On the basis of the experimentally demonstrated activity of the Methanobacterium phage pseudomurein endoisopeptidase, it is proposed that many, if not all, microbial homologues of the transglutaminases are proteases and that the eukaryotic transglutaminases have evolved from an ancestral protease." Q#20107 - CGI_10027879 superfamily 247916 38 145 3.63E-07 50.0918 cl17362 Transglut_core superfamily - - "Transglutaminase-like superfamily; This family includes animal transglutaminases and other bacterial proteins of unknown function. Sequence conservation in this superfamily primarily involves three motifs that centre around conserved cysteine, histidine, and aspartate residues that form the catalytic triad in the structurally characterized transglutaminase, the human blood clotting factor XIIIa'. On the basis of the experimentally demonstrated activity of the Methanobacterium phage pseudomurein endoisopeptidase, it is proposed that many, if not all, microbial homologues of the transglutaminases are proteases and that the eukaryotic transglutaminases have evolved from an ancestral protease." Q#20107 - CGI_10027879 superfamily 247916 789 842 8.03E-06 45.4515 cl17362 Transglut_core superfamily - - "Transglutaminase-like superfamily; This family includes animal transglutaminases and other bacterial proteins of unknown function. 
Sequence conservation in this superfamily primarily involves three motifs that centre around conserved cysteine, histidine, and aspartate residues that form the catalytic triad in the structurally characterized transglutaminase, the human blood clotting factor XIIIa'. On the basis of the experimentally demonstrated activity of the Methanobacterium phage pseudomurein endoisopeptidase, it is proposed that many, if not all, microbial homologues of the transglutaminases are proteases and that the eukaryotic transglutaminases have evolved from an ancestral protease." Q#20107 - CGI_10027879 superfamily 192292 516 577 0.00705568 36.4822 cl09671 Wbp11 superfamily - - WW domain binding protein 11; The WW domain is a small protein module with a triple-stranded beta-sheet fold. This is a family of WW domain binding proteins. Q#20109 - CGI_10027881 superfamily 241640 20 260 1.62E-84 254.894 cl00149 Tryp_SPc superfamily - - Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues. Q#20110 - CGI_10027882 superfamily 241777 407 721 3.71E-68 227.966 cl00316 Cation_efflux superfamily - - "Cation efflux family; Members of this family are integral membrane proteins, that are found to increase tolerance to divalent metal ions such as cadmium, zinc, and cobalt. These proteins are thought to be efflux pumps that remove these ions from cells." Q#20111 - CGI_10027883 superfamily 243689 28 95 9.40E-13 65.343 cl04271 IBN_N superfamily - - Importin-beta N-terminal domain; Importin-beta N-terminal domain. Q#20112 - CGI_10027884 superfamily 243074 125 170 1.78E-06 45.1901 cl02535 F-box-like superfamily - - F-box-like; This is an F-box-like family. 
Q#20113 - CGI_10027885 superfamily 243146 69 121 1.84E-08 50.1393 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#20113 - CGI_10027885 superfamily 243146 253 295 4.68E-07 46.1154 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#20113 - CGI_10027885 superfamily 243146 26 78 3.57E-05 40.7355 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#20113 - CGI_10027885 superfamily 243146 200 238 0.000205878 38.4114 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#20117 - CGI_10027889 superfamily 215754 128 225 2.34E-25 96.1684 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#20117 - CGI_10027889 superfamily 215754 28 123 5.28E-23 90.0052 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#20119 - CGI_10027891 superfamily 215754 211 308 2.89E-25 97.324 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#20119 - CGI_10027891 superfamily 215754 111 206 8.64E-24 93.0868 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#20119 - CGI_10027891 superfamily 215754 11 104 4.97E-21 85.3828 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#20121 - CGI_10027893 superfamily 241874 25 551 0 852.353 cl00456 SLC5-6-like_sbd superfamily - - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. 
SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#20122 - CGI_10027894 superfamily 247724 2 108 6.16E-39 129.704 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. 
Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#20123 - CGI_10027895 superfamily 218708 58 286 4.19E-33 122.458 cl05328 DUF829 superfamily - - Eukaryotic protein of unknown function (DUF829); This family consists of several uncharacterized eukaryotic proteins. Q#20125 - CGI_10027897 superfamily 241862 58 267 5.59E-09 55.4702 cl00437 COG0428 superfamily C - Predicted divalent heavy-metal cations transporter [Inorganic ion transport and metabolism] Q#20126 - CGI_10027898 superfamily 245206 6 250 1.03E-61 196.753 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#20129 - CGI_10027901 superfamily 245206 6 251 1.75E-72 224.487 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. 
NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#20130 - CGI_10027902 superfamily 245206 6 251 1.46E-71 221.791 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#20133 - CGI_10027905 superfamily 243110 135 356 1.62E-21 93.6481 cl02616 MACPF superfamily - - "MAC/Perforin domain; The membrane-attack complex (MAC) of the complement system forms transmembrane channels. These channels disrupt the phospholipid bilayer of target cells, leading to cell lysis and death. A number of proteins participate in the assembly of the MAC. 
Freshly activated C5b binds to C6 to form a C5b-6 complex, then to C7 forming the C5b-7 complex. The C5b-7 complex binds to C8, which is composed of three chains (alpha, beta, and gamma), thus forming the C5b-8 complex. C5b-8 subsequently binds to C9 and acts as a catalyst in the polymerisation of C9. Active MAC has a subunit composition of C5b-C6-C7-C8-C9{n}. Perforin is a protein found in cytolytic T-cell and killer cells. In the presence of calcium, perforin polymerises into transmembrane tubules and is capable of lysing, non-specifically, a variety of target cells. There are a number of regions of similarity in the sequences of complement components C6, C7, C8-alpha, C8-beta, C9 and perforin. The X-ray crystal structure of a MACPF domain reveals that it shares a common fold with bacterial cholesterol dependent cytolysins (pfam01289) such as perfringolysin O. Three key pieces of evidence suggests that MACPF domains and CDCs are homologous: Functional similarity (pore formation), conservation of three glycine residues at a hinge in both families and conservation of a complex core fold." Q#20135 - CGI_10027907 superfamily 245226 20 213 7.07E-48 164.765 cl10012 DnaQ_like_exo superfamily - - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). 
The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#20136 - CGI_10027908 superfamily 241874 9 547 8.33E-168 492.381 cl00456 SLC5-6-like_sbd superfamily - - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." 
Q#20137 - CGI_10027909 superfamily 241992 1 304 1.61E-102 309.966 cl00628 Piwi-like superfamily C - "Piwi-like: PIWI domain. Domain found in proteins involved in RNA silencing. RNA silencing refers to a group of related gene-silencing mechanisms mediated by short RNA molecules, including siRNAs, miRNAs, and heterochromatin-related guide RNAs. The central component of the RNA-induced silencing complex (RISC) and related complexes is Argonaute. The PIWI domain is the C-terminal portion of Argonaute and consists of two subdomains, one of which provides the 5' anchoring of the guide RNA and the other, the catalytic site for slicing. This domain is also found in closely related proteins, including the Piwi subfamily, where it is believed to perform a crucial role in germline cells, via a similar mechanism." Q#20138 - CGI_10027910 superfamily 241765 1395 1480 5.92E-37 137.392 cl00301 PAZ superfamily N - "PAZ domain, named PAZ after the proteins Piwi Argonaut and Zwille. PAZ is found in two families of proteins that are essential components of RNA-mediated gene-silencing pathways, including RNA interference, the piwi and Dicer families. PAZ functions as a nucleic-acid binding domain, with a strong preference for single-stranded nucleic acids (RNA or DNA) or RNA duplexes with single-stranded 3' overhangs. It has been suggested that the PAZ domain provides a unique mode for the recognition of the two 3'-terminal nucleotides in single-stranded nucleic acids and buries the 3' OH group, and that it might recognize characteristic 3' overhangs in siRNAs within RISC (RNA-induced silencing) and other complexes. This parent model also contains structures of an archaeal PAZ domain." Q#20139 - CGI_10027911 superfamily 242122 88 205 1.89E-08 51.0937 cl00824 HEPN superfamily - - HEPN domain; HEPN domain. Q#20139 - CGI_10027911 superfamily 243077 3 34 0.000920298 36.4437 cl02542 DnaJ superfamily N - "DnaJ domain or J-domain. 
DnaJ/Hsp40 (heat shock protein 40) proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperonine family. Hsp40 proteins are characterized by the presence of a J domain, which mediates the interaction with Hsp70. They may contain other domains as well, and the architectures provide a means of classification." Q#20141 - CGI_10027913 superfamily 243053 755 991 1.95E-63 216.735 cl02485 RasGEF superfamily - - "Guanine nucleotide exchange factor for Ras-like small GTPases. Small GTP-binding proteins of the Ras superfamily function as molecular switches in fundamental events such as signal transduction, cytoskeleton dynamics and intracellular trafficking. Guanine-nucleotide-exchange factors (GEFs) positively regulate these GTP-binding proteins in response to a variety of signals. GEFs catalyze the dissociation of GDP from the inactive GTP-binding proteins. GTP can then bind and induce structural changes that allow interaction with effectors." Q#20141 - CGI_10027913 superfamily 247725 430 539 1.09E-42 152.872 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#20141 - CGI_10027913 superfamily 243067 599 722 1.01E-29 116.358 cl02520 REM superfamily - - "Guanine nucleotide exchange factor for Ras-like GTPases; N-terminal domain (RasGef_N), also called REM domain (Ras exchanger motif). 
This domain is common in nucleotide exchange factors for Ras-like small GTPases and is typically found immediately N-terminal to the RasGef (Cdc25-like) domain. REM contacts the GTPase and is assumed to participate in the catalytic activity of the exchange factor. Proteins with the REM domain include Sos1 and Sos2, which relay signals from tyrosine-kinase mediated signalling to Ras, RasGRP1-4, RasGRF1,2, CNrasGEF, and RAP-specific nucleotide exchange factors, to name a few." Q#20141 - CGI_10027913 superfamily 243096 197 381 1.89E-17 81.9604 cl02571 RhoGEF superfamily - - Guanine nucleotide exchange factor for Rho/Rac/Cdc42-like GTPases; Also called Dbl-homologous (DH) domain. It appears that PH domains invariably occur C-terminal to RhoGEF/DH domains. Q#20141 - CGI_10027913 superfamily 241592 99 169 7.30E-09 55.2534 cl00074 H2A superfamily C - "Histone 2A; H2A is a subunit of the nucleosome. The nucleosome is an octamer containing two H2A, H2B, H3, and H4 subunits. The H2A subunit performs essential roles in maintaining structural integrity of the nucleosome, chromatin condensation, and binding of specific chromatin-associated proteins." Q#20143 - CGI_10027915 superfamily 206354 147 230 2.78E-27 103.978 cl16692 Aida_C2 superfamily C - "Cytoskeletal adhesion; This is the C-terminal domain of the axin-interacting protein family, and is a distinct version of the C2 domain. This domain is critical for interactions with cytoskeletal in the context of cellular adhesion points." Q#20143 - CGI_10027915 superfamily 204091 15 113 2.45E-24 94.5577 cl07485 Aida-C2 superfamily - - "C2 domain of Aida; This is the C2 domain in the axin interaction dorsal-associated (Aida) proteins. In all proteins the Aida-C2 domain is found in the C-terminal portion of the protein. these proteins also contain diverse domains related to cytoskeletal functions, e.g. EF hands, coiled coils, IQ calmodulin-binding motifs, ankyrin repeats and myosin head motor domain. 
Aida blocks Axin-mediated JNK (c-Jun N-terminal kinases) activation by disrupting Axin homodimerisation, thereby having an anti-dorsalisation action in zebrafish. Axin is a scaffold protein that controls multiple, important pathways, including the canonical Wnt pathway and the JNK signalling pathway and besides its ventralising activity mediated though facilitating beta-catenin degradation, possesses a dorsalising activity that is mediated by Axin-induced JNK activation." Q#20144 - CGI_10027917 superfamily 241832 6 155 2.22E-76 227.06 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#20145 - CGI_10027918 superfamily 241581 11 100 7.81E-19 82.4342 cl00062 FHA superfamily - - "Forkhead associated domain (FHA); found in eukaryotic and prokaryotic proteins. Putative nuclear signalling domain. FHA domains may bind phosphothreonine, phosphoserine and sometimes phosphotyrosine. In eukaryotes, many FHA domain-containing proteins localize to the nucleus, where they participate in establishing or maintaining cell cycle checkpoints, DNA repair, or transcriptional regulation. 
Members of the FHA family include: Dun1, Rad53, Cds1, Mek1, KAPP(kinase-associated protein phosphatase),and Ki-67 (a human nuclear protein related to cell proliferation)." Q#20145 - CGI_10027918 superfamily 247792 391 433 1.16E-10 57.8408 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#20147 - CGI_10027920 superfamily 243035 33 156 9.76E-26 96.1497 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#20148 - CGI_10027921 superfamily 243035 33 156 1.73E-26 98.0757 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#20149 - CGI_10027922 superfamily 243035 36 158 7.20E-28 101.928 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. 
CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#20150 - CGI_10027923 superfamily 245864 30 477 1.39E-62 212.524 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. 
Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#20151 - CGI_10027924 superfamily 241749 54 194 8.35E-29 105.931 cl00280 globin_like superfamily - - superfamily containing globins and truncated hemoglobins Q#20152 - CGI_10027925 superfamily 241749 49 154 1.76E-10 55.0845 cl00280 globin_like superfamily - - superfamily containing globins and truncated hemoglobins Q#20153 - CGI_10027926 superfamily 241749 37 185 4.26E-18 77.8113 cl00280 globin_like superfamily - - superfamily containing globins and truncated hemoglobins Q#20155 - CGI_10027928 superfamily 217293 4 209 1.48E-73 230.983 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#20155 - CGI_10027928 superfamily 202474 202 374 4.17E-07 49.1893 cl08379 Neur_chan_memb superfamily - - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel.
Q#20156 - CGI_10027929 superfamily 247044 129 152 5.63E-07 44.1361 cl15697 ADF_gelsolin superfamily C - Actin depolymerization factor/cofilin- and gelsolin-like domains; Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. Q#20158 - CGI_10027931 superfamily 245814 51 134 3.93E-16 72.9232 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#20159 - CGI_10027932 superfamily 243075 3 33 5.97E-05 40.7643 cl02536 SAND superfamily N - "SAND domain; The DNA binding activity of two proteins has been mapped to the SAND domain. The conserved KDWK motif is necessary for DNA binding, and it appears to be important for dimerisation. This region is also found in the putative transcription factor RegA from the multicellular green alga Volvox carteri. This region of RegA is known as the VARL domain." Q#20160 - CGI_10027933 superfamily 218871 172 289 4.43E-21 92.3948 cl15997 Sec6 superfamily C - "Exocyst complex component Sec6; Sec6 is a component of the multiprotein exocyst complex.
Sec6 interacts with Sec8, Sec10 and Exo70. These exocyst proteins localise to regions of active exocytosis (at the growing ends of interphase cells and in the medial region of cells undergoing cytokinesis) in an F-actin-dependent and exocytosis-independent manner." Q#20161 - CGI_10027934 superfamily 218871 1 150 1.02E-35 129.374 cl15997 Sec6 superfamily NC - "Exocyst complex component Sec6; Sec6 is a component of the multiprotein exocyst complex. Sec6 interacts with Sec8, Sec10 and Exo70. These exocyst proteins localise to regions of active exocytosis (at the growing ends of interphase cells and in the medial region of cells undergoing cytokinesis) in an F-actin-dependent and exocytosis-independent manner." Q#20162 - CGI_10027935 superfamily 243035 1 67 2.27E-16 68.0301 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration.
CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#20166 - CGI_10027939 superfamily 247637 1039 1359 8.91E-50 182.045 cl16912 MDR superfamily - - "Medium chain reductase/dehydrogenase (MDR)/zinc-dependent alcohol dehydrogenase-like family; The medium chain reductase/dehydrogenases (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH. 
MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P) binding-Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES. The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH), quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones. ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines. Other MDR members have only a catalytic zinc, and some contain no coordinated zinc." Q#20167 - CGI_10027940 superfamily 243072 327 452 2.02E-32 123.263 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats."
Q#20167 - CGI_10027940 superfamily 243072 32 157 2.00E-30 117.485 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#20167 - CGI_10027940 superfamily 243072 494 618 3.96E-30 116.714 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#20167 - CGI_10027940 superfamily 243072 131 259 1.74E-24 100.536 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#20167 - CGI_10027940 superfamily 243072 592 726 1.18E-23 97.8394 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#20167 - CGI_10027940 superfamily 243072 469 497 0.00164269 37.1484 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#20170 - CGI_10027943 superfamily 193235 220 269 6.92E-26 97.2351 cl15075 HTH_Tnp_IS1 superfamily - - InsA C-terminal domain; This short domain is found at the C-terminus of the InsA protein. This domain contains a helix-turn-helix domain. Q#20170 - CGI_10027943 superfamily 190760 182 217 2.87E-14 64.8683 cl09312 Zn_Tnp_IS1 superfamily - - InsA N-terminal domain; This appears to be a short zinc binding domain found in IS1 InsA family protein. It is found at the N-terminus of the protein and may be a DNA-binding domain. Q#20173 - CGI_10027948 superfamily 110440 381 408 0.000254445 38.5429 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase); proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats."
Q#20173 - CGI_10027948 superfamily 243092 144 232 0.00992645 36.1588 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#20174 - CGI_10027949 superfamily 247755 182 211 1.63E-10 58.2852 cl17201 ABC_ATPase superfamily N - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." 
Q#20175 - CGI_10027950 superfamily 247755 1375 1593 7.31E-101 323.688 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#20175 - CGI_10027950 superfamily 247755 504 723 7.61E-100 320.991 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#20175 - CGI_10027950 superfamily 247789 1062 1209 0.00275187 39.9346 cl17235 ABC2_membrane superfamily - - ABC-2 type transporter; ABC-2 type transporter. Q#20176 - CGI_10027951 superfamily 247724 45 181 6.92E-90 262.838 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. 
The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#20177 - CGI_10027952 superfamily 247755 582 802 1.56E-119 362.967 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#20177 - CGI_10027952 superfamily 216049 373 531 6.46E-17 80.793 cl18356 ABC_membrane superfamily N - ABC transporter transmembrane region; This family represents a unit of six transmembrane helices. Many members of the ABC transporter family (pfam00005) have two such regions. 
Q#20177 - CGI_10027952 superfamily 110440 315 339 4.60E-05 41.6245 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase); proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#20179 - CGI_10027954 superfamily 248469 90 187 5.03E-11 57.3799 cl17915 HAD_like superfamily - - "Haloacid dehalogenase-like hydrolases. The haloacid dehalogenase-like (HAD) superfamily includes L-2-haloacid dehalogenase, epoxide hydrolase, phosphoserine phosphatase, phosphomannomutase, phosphoglycolate phosphatase, P-type ATPase, and many others, all of which use a nucleophilic aspartate in their phosphoryl transfer reaction. All members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. Members of this superfamily are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases." Q#20179 - CGI_10027954 superfamily 247756 12 116 0.000532612 38.1099 cl17202 HAD superfamily C - haloacid dehalogenase-like hydrolase; haloacid dehalogenase-like hydrolase. Q#20180 - CGI_10027955 superfamily 243072 36 148 2.96E-36 124.418 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains.
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#20182 - CGI_10027957 superfamily 247724 1 283 0 582.248 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#20183 - CGI_10027958 superfamily 241572 13 102 1.55E-18 77.2788 cl00050 CYCLIN superfamily - - "Cyclin box fold. Protein binding domain functioning in cell-cycle and transcription control. Present in cyclins, TFIIB and Retinoblastoma (RB). The cyclins consist of 8 classes of cell cycle regulators that regulate cyclin dependent kinases (CDKs). TFIIB is a transcription factor that binds the TATA box. Cyclins, TFIIB and RB contain 2 copies of the domain."
Q#20183 - CGI_10027958 superfamily 241572 112 195 1.16E-16 72.2712 cl00050 CYCLIN superfamily - - "Cyclin box fold. Protein binding domain functioning in cell-cycle and transcription control. Present in cyclins, TFIIB and Retinoblastoma (RB). The cyclins consist of 8 classes of cell cycle regulators that regulate cyclin dependent kinases (CDKs). TFIIB is a transcription factor that binds the TATA box. Cyclins, TFIIB and RB contain 2 copies of the domain." Q#20184 - CGI_10027959 superfamily 241572 209 298 5.03E-20 84.2124 cl00050 CYCLIN superfamily - - "Cyclin box fold. Protein binding domain functioning in cell-cycle and transcription control. Present in cyclins, TFIIB and Retinoblastoma (RB). The cyclins consist of 8 classes of cell cycle regulators that regulate cyclin dependent kinases (CDKs). TFIIB is a transcription factor that binds the TATA box. Cyclins, TFIIB and RB contain 2 copies of the domain." Q#20184 - CGI_10027959 superfamily 241572 308 383 1.51E-17 77.2788 cl00050 CYCLIN superfamily - - "Cyclin box fold. Protein binding domain functioning in cell-cycle and transcription control. Present in cyclins, TFIIB and Retinoblastoma (RB). The cyclins consist of 8 classes of cell cycle regulators that regulate cyclin dependent kinases (CDKs). TFIIB is a transcription factor that binds the TATA box. Cyclins, TFIIB and RB contain 2 copies of the domain." Q#20185 - CGI_10027960 superfamily 245213 4 34 0.000586984 36.8458 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar.
EGF_CA can be found in tandem repeat arrangements." Q#20185 - CGI_10027960 superfamily 245213 45 73 0.000910963 36.0754 cl09941 EGF_CA superfamily N - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#20187 - CGI_10027962 superfamily 243035 135 250 2.29E-18 78.8157 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration.
CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#20187 - CGI_10027962 superfamily 241619 34 98 0.000109558 39.4877 cl00112 PAN_APPLE superfamily - - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. 
PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#20191 - CGI_10018592 superfamily 241599 153 210 1.20E-20 82.6764 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#20193 - CGI_10018594 superfamily 247724 9 80 7.61E-31 109.184 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." 
Q#20194 - CGI_10018595 superfamily 241718 2 218 3.78E-115 340.417 cl00241 IF6 superfamily - - "Ribosome anti-association factor IF6 binds the large ribosomal subunit and prevents the two subunits from associating during translation initiation. IF6 comprises a family of translation factors that includes both eukaryotic (eIF6) and archeal (aIF6) members. All members of this family have a conserved pentameric fold referred to as a beta/alpha propeller. The eukaryotic IF6 members have a moderately conserved C-terminal extension which is not required for ribosomal binding, and may have an alternative function." Q#20197 - CGI_10018599 superfamily 245213 1497 1526 0.00016304 41.4682 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#20197 - CGI_10018599 superfamily 245213 1621 1650 0.00016304 41.4682 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#20197 - CGI_10018599 superfamily 245213 1373 1406 0.000210379 41.083 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#20197 - CGI_10018599 superfamily 245213 1580 1614 0.00126151 38.7718 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#20197 - CGI_10018599 superfamily 245213 1456 1491 0.00224176 38.0014 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#20197 - CGI_10018599 superfamily 245213 996 1029 0.00362898 37.231 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#20197 - CGI_10018599 superfamily 245213 1704 1738 0.00931724 36.0754 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#20197 - CGI_10018599 superfamily 241578 1244 1283 2.85E-06 48.9204 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#20197 - CGI_10018599 superfamily 241578 1160 1204 6.43E-06 47.7648 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#20197 - CGI_10018599 superfamily 243124 345 404 1.47E-05 45.4957 cl02648 NIDO superfamily C - Nidogen-like; This is a nidogen-like domain (NIDO) domain and is an extracellular domain found in nidogen and hypothetical proteins of unknown function. 
Q#20197 - CGI_10018599 superfamily 241578 828 868 3.17E-05 45.8388 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#20197 - CGI_10018599 superfamily 241578 1410 1452 0.000125151 43.9128 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. 
Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#20197 - CGI_10018599 superfamily 241578 1070 1111 0.00120858 41.2164 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#20197 - CGI_10018599 superfamily 245213 1330 1368 0.00149227 38.382 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#20197 - CGI_10018599 superfamily 245213 1207 1236 0.00287097 37.6116 cl09941 EGF_CA superfamily C - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#20197 - CGI_10018599 superfamily 245213 1660 1692 0.007114 36.456 cl09941 EGF_CA superfamily C - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#20197 - CGI_10018599 superfamily 245213 1536 1568 0.007114 36.456 cl09941 EGF_CA superfamily C - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#20197 - CGI_10018599 superfamily 241578 910 950 0.00945096 38.1348 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#20198 - CGI_10018600 superfamily 247905 190 296 5.12E-12 62.2552 cl17351 HELICc superfamily N - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#20198 - CGI_10018600 superfamily 247805 43 130 7.27E-05 41.4759 cl17251 DEXDc superfamily N - DEAD-like helicases superfamily. 
A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#20200 - CGI_10018602 superfamily 241596 52 111 2.06E-14 65.6983 cl00081 HLH superfamily - - "Helix-loop-helix domain, found in specific DNA- binding proteins that act as transcription factors; 60-100 amino acids long. A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE 5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved in both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#20201 - CGI_10018603 superfamily 241599 156 214 2.21E-21 84.9876 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#20201 - CGI_10018603 superfamily 247725 112 168 0.00557175 35.5242 cl17171 PH-like superfamily C - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. 
All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#20204 - CGI_10019337 superfamily 245847 286 363 0.00101741 37.9214 cl12042 FA58C superfamily N - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#20205 - CGI_10019338 superfamily 241563 40 76 3.73E-06 44.3924 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#20205 - CGI_10019338 superfamily 110440 460 487 0.000910138 37.3873 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#20206 - CGI_10019339 superfamily 245847 161 306 2.14E-18 79.5229 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." 
Q#20206 - CGI_10019339 superfamily 241568 128 149 0.0064174 33.9756 cl00043 CCP superfamily N - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#20207 - CGI_10019340 superfamily 247097 78 112 0.001111 35.4329 cl15839 ShK superfamily - - ShK domain-like; This domain of is found in several C. elegans proteins. The domain is 30 amino acids long and rich in cysteine residues. There are 6 conserved cysteine positions in the domain that form three disulphide bridges. The domain is found in the potassium channel inhibitor ShK in sea anemone. Q#20207 - CGI_10019340 superfamily 245226 20 62 0.0012325 37.2801 cl10012 DnaQ_like_exo superfamily N - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). 
The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#20208 - CGI_10019341 superfamily 245226 12 60 0.000136356 39.2061 cl10012 DnaQ_like_exo superfamily N - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. 
Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#20212 - CGI_10019345 superfamily 245213 197 232 5.33E-09 52.639 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#20212 - CGI_10019345 superfamily 245847 238 304 3.87E-14 69.8929 cl12042 FA58C superfamily C - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#20213 - CGI_10019346 superfamily 245847 113 250 1.25E-16 73.7449 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#20218 - CGI_10019351 superfamily 245847 1 142 1.65E-19 79.9081 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#20219 - CGI_10019352 superfamily 245213 74 109 5.42E-11 57.6466 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#20219 - CGI_10019352 superfamily 245213 112 147 5.88E-11 57.6466 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#20219 - CGI_10019352 superfamily 245847 288 429 4.91E-19 82.9897 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#20219 - CGI_10019352 superfamily 245847 153 219 8.36E-11 59.4925 cl12042 FA58C superfamily C - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#20220 - CGI_10019353 superfamily 242406 1 119 7.93E-35 119.233 cl01271 DUF1768 superfamily N - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. 
The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. Q#20222 - CGI_10019357 superfamily 248097 4 114 3.18E-19 77.6906 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#20223 - CGI_10019358 superfamily 241563 348 383 2.66E-05 42.8516 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#20223 - CGI_10019358 superfamily 110440 813 840 0.00289866 36.6169 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#20226 - CGI_10019361 superfamily 241564 157 222 2.21E-21 86.9359 cl00035 BIR superfamily - - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." 
Q#20226 - CGI_10019361 superfamily 241564 2 68 1.65E-13 65.3647 cl00035 BIR superfamily - - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." Q#20226 - CGI_10019361 superfamily 247792 368 406 0.00152227 36.2696 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#20229 - CGI_10002198 superfamily 246723 12 829 0 960.562 cl14813 GluZincin superfamily - - "Peptidase Gluzincin family (thermolysin-like proteinases, TLPs) includes peptidases M1, M2, M3, M4, M13, M32 and M36 (fungalysins); Gluzincin family (thermolysin-like peptidases or TLPs) includes several zinc-dependent metallopeptidases such as the M1, M2, M3, M4, M13, M32, M36 peptidases (MEROPS classification), and contain HEXXH and EXXXD motifs as part of their active site. All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. 
M1 family includes aminopeptidase N (APN) and leukotriene A4 hydrolase (LTA4H). APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types. LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity such that the two activities occupy different, but overlapping sites. The peptidase M3 or neurolysin-like family, includes M3, M2 and M32 metallopeptidases. The M3 peptidases have two subfamilies: M3A, includes thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (3.4.24.16), and the mitochondrial intermediate peptidase; M3B contains oligopeptidase F. M2 peptidase angiotensin converting enzyme (ACE, EC 3.4.15.1) catalyzes the conversion of decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II. ACE is a key part of the renin-angiotensin system that regulates blood pressure, thus ACE inhibitors are important for the treatment of hypertension. M32 family includes two eukaryotic enzymes from protozoa Trypanosoma cruzi, a causative agent of Chagas' disease, and Leishmania major, a parasite that causes leishmaniasis, making them attractive targets for drug development. The M4 family includes secreted protease thermolysin (EC 3.4.24.27), pseudolysin, aureolysin, neutral protease as well as fungalysin and bacillolysin (EC 3.4.24.28) that degrade extracellular proteins and peptides for bacterial nutrition, especially prior to sporulation. Thermolysin is widely used as a nonspecific protease to obtain fragments for peptide sequencing as well as in production of the artificial sweetener aspartame. M13 family includes neprilysin (EC 3.4.24.11) and endothelin-converting enzyme I (ECE-1, EC 3.4.24.71), which fulfill a broad range of physiological roles due to the greater variation in the S2' subsite allowing substrate specificity and are prime therapeutic targets for selective inhibition. 
Peptidase M36 (fungamysin) family includes endopeptidases from pathogenic fungi. Fungalysin hydrolyzes extracellular matrix proteins such as elastin and keratin. Aspergillus fumigatus causes the pulmonary disease aspergillosis by invading the lungs of immuno-compromised animals and secreting fungalysin that possibly breaks down proteinaceous structural barriers." Q#20230 - CGI_10002199 superfamily 241678 5 400 3.43E-174 495.236 cl00198 Phosphoglycerate_kinase superfamily - - "Phosphoglycerate kinase (PGK) is a monomeric enzyme which catalyzes the transfer of the high-energy phosphate group of 1,3-bisphosphoglycerate to ADP, forming ATP and 3-phosphoglycerate. This reaction represents the first of the two substrate-level phosphorylation events in the glycolytic pathway. Substrate-level phosphorylation is defined as production of ATP by a process, which is catalyzed by water-soluble enzymes in the cytosol; not involving membranes and ion gradients." Q#20231 - CGI_10002201 superfamily 248262 1 237 1.43E-72 227.112 cl17708 HMBS superfamily - - "Hydroxymethylbilane synthase (HMBS), also known as porphobilinogen deaminase (PBGD), is an intermediate enzyme in the biosynthetic pathway of tetrapyrrolic ring systems, such as heme, chlorophylls, and vitamin B12. HMBS catalyzes the conversion of porphobilinogen (PBG) into hydroxymethylbilane (HMB). HMBS consists of three domains, and is believed to bind substrate through a hinge-bending motion of domains I and II. HMBS is found in all organisms except viruses." Q#20232 - CGI_10002204 superfamily 241550 549 741 3.23E-80 267.962 cl00015 nt_trans superfamily C - "nucleotidyl transferase superfamily; nt_trans (nucleotidyl transferase) This superfamily includes the class I amino-acyl tRNA synthetases, pantothenate synthetase (PanC), ATP sulfurylase, and the cytidylyltransferases, all of which have a conserved dinucleotide-binding domain." 
Q#20232 - CGI_10002204 superfamily 247896 2 264 4.18E-69 242.22 cl17342 Pyruvate_Kinase superfamily N - "Pyruvate kinase (PK): Large allosteric enzyme that regulates glycolysis through binding of the substrate, phosphoenolpyruvate, and one or more allosteric effectors. Like other allosteric enzymes, PK has a high substrate affinity R state and a low affinity T state. PK exists as several different isozymes, depending on organism and tissue type. In mammals, there are four PK isozymes: R, found in red blood cells, L, found in liver, M1, found in skeletal muscle, and M2, found in kidney, adipose tissue, and lung. PK forms a homotetramer, with each subunit containing three domains. The T state to R state transition of PK is more complex than in most allosteric enzymes, involving a concerted rotation of all 3 domains of each monomer in the homotetramer." Q#20232 - CGI_10002204 superfamily 222257 753 914 1.11E-57 198.854 cl16317 tRNA-synt_1_2 superfamily - - "Leucyl-tRNA synthetase, Domain 2; This is a family of the conserved region of Leucine-tRNA ligase or Leucyl-tRNA synthetase, EC:6.1.1.4." Q#20232 - CGI_10002204 superfamily 241550 1035 1194 8.90E-49 178.21 cl00015 nt_trans superfamily N - "nucleotidyl transferase superfamily; nt_trans (nucleotidyl transferase) This superfamily includes the class I amino-acyl tRNA synthetases, pantothenate synthetase (PanC), ATP sulfurylase, and the cytidylyltransferases, all of which have a conserved dinucleotide-binding domain." Q#20232 - CGI_10002204 superfamily 241870 288 434 1.53E-22 97.1692 cl00451 MoCF_BD superfamily - - "MoCF_BD: molybdenum cofactor (MoCF) binding domain (BD). This domain is found a variety of proteins involved in biosynthesis of molybdopterin cofactor, like MoaB, MogA, and MoeA. The domain is presumed to bind molybdopterin." 
Q#20232 - CGI_10002204 superfamily 245839 1194 1325 2.74E-12 65.7041 cl12020 Anticodon_Ia_like superfamily - - "Anticodon-binding domain of class Ia aminoacyl tRNA synthetases and similar domains; This domain is found in a variety of class Ia aminoacyl tRNA synthetases, C-terminal to the catalytic core domain. It recognizes and specifically binds to the anticodon of the tRNA. Aminoacyl tRNA synthetases catalyze the transfer of cognate amino acids to the 3'-end of their tRNAs by specifically recognizing cognate from non-cognate amino acids. Members include valyl-, leucyl-, isoleucyl-, cysteinyl-, arginyl-, and methionyl-tRNA synthethases. This superfamily also includes a domain from MshC, an enzyme in the mycothiol biosynthetic pathway." Q#20232 - CGI_10002204 superfamily 241550 935 962 9.90E-06 47.9971 cl00015 nt_trans superfamily NC - "nucleotidyl transferase superfamily; nt_trans (nucleotidyl transferase) This superfamily includes the class I amino-acyl tRNA synthetases, pantothenate synthetase (PanC), ATP sulfurylase, and the cytidylyltransferases, all of which have a conserved dinucleotide-binding domain." Q#20233 - CGI_10002205 superfamily 247755 386 619 1.22E-97 301.072 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." 
Q#20233 - CGI_10002205 superfamily 216049 93 297 4.81E-10 59.607 cl18356 ABC_membrane superfamily - - ABC transporter transmembrane region; This family represents a unit of six transmembrane helices. Many members of the ABC transporter family (pfam00005) have two such regions. Q#20234 - CGI_10002209 superfamily 241808 2 138 1.25E-22 88.6854 cl00352 PTH superfamily N - "Peptidyl-tRNA hydrolase (PTH) is a monomeric protein that cleaves the ester bond linking the nascent peptide and tRNA when peptidyl-tRNA is released prematurely from the ribosome. This ensures the recycling of peptidyl-tRNAs into tRNAs produced through abortion of translation and is essential for cell viability. This group also contains chloroplast RNA splicing 2 (CRS2), which is a closely related nuclear-encoded protein required for the splicing of nine group II introns in chloroplasts." Q#20235 - CGI_10002213 superfamily 247794 3 193 5.52E-81 245.825 cl17240 FDH_GDH_like superfamily N - "Formate/glycerate dehydrogenases, D-specific 2-hydroxy acid dehydrogenases and related dehydrogenases; The formate/glycerate dehydrogenase like family contains a diverse group of enzymes such as formate dehydrogenase (FDH), glycerate dehydrogenase (GDH), D-lactate dehydrogenase, L-alanine dehydrogenase, and S-Adenosylhomocysteine hydrolase, that share a common 2-domain structure. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar domains of the alpha/beta Rossmann fold NAD+ binding form. The NAD(P) binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD(P) is bound, primarily to the C-terminal portion of the 2nd (internal) domain. While many members of this family are dimeric, alanine DH is hexameric and phosphoglycerate DH is tetrameric. 
2-hydroxyacid dehydrogenases are enzymes that catalyze the conversion of a wide variety of D-2-hydroxy acids to their corresponding keto acids. The general mechanism is (R)-lactate + acceptor to pyruvate + reduced acceptor. Formate dehydrogenase (FDH) catalyzes the NAD+-dependent oxidation of formate ion to carbon dioxide with the concomitant reduction of NAD+ to NADH. FDHs of this family contain no metal ions or prosthetic groups. Catalysis occurs though direct transfer of a hydride ion to NAD+ without the stages of acid-base catalysis typically found in related dehydrogenases." Q#20236 - CGI_10004926 superfamily 247724 35 202 4.73E-96 279.998 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." 
Q#20237 - CGI_10004927 superfamily 241547 38 226 2.61E-59 189.788 cl00012 alpha_CA superfamily - - "Carbonic anhydrase alpha (vertebrate-like) group. Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionary distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidine residues and a fourth conserved histidine plays a potential role in proton transfer." Q#20239 - CGI_10004929 superfamily 241573 877 1200 7.94E-106 340.078 cl00051 CysPc superfamily - - "Calpains, domains IIa, IIb; calcium-dependent cytoplasmic cysteine proteinases, papain-like. Functions in cytoskeletal remodeling processes, cell differentiation, apoptosis and signal transduction." Q#20239 - CGI_10004929 superfamily 246669 1368 1493 1.01E-29 116.609 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. 
Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#20239 - CGI_10004929 superfamily 241653 1211 1352 8.76E-27 108.925 cl00165 Calpain_III superfamily - - "Calpain, subdomain III. Calpains are calcium-activated cytoplasmic cysteine proteinases, participate in cytoskeletal remodeling processes, cell differentiation, apoptosis and signal transduction. Catalytic domain and the two calmodulin-like domains are separated by C2-like domain III. Domain III plays an important role in calcium-induced activation of calpain involving electrostatic interactions with subdomain II. Proposed to mediate calpain's interaction with phospholipids and translocation to cytoplasmic/nuclear membranes. CD includes subdomain III of typical and atypical calpains." Q#20240 - CGI_10004930 superfamily 244539 463 547 1.47E-16 78.8858 cl06868 FNR_like superfamily C - "Ferredoxin reductase (FNR), an FAD and NAD(P) binding protein, was intially identified as a chloroplast reductase activity, catalyzing the electron transfer from reduced iron-sulfur protein ferredoxin to NADP+ as the final step in the electron transport mechanism of photosystem I. FNR transfers electrons from reduced ferredoxin to FAD (forming FADH2 via a semiquinone intermediate) and then transfers a hydride ion to convert NADP+ to NADPH. 
FNR has since been shown to utilize a variety of electron acceptors and donors and has a variety of physiological functions including nitrogen assimilation, dinitrogen fixation, steroid hydroxylation, fatty acid metabolism, oxygenase activity, and methane assimilation in many organisms. FNR has an NAD(P)-binding sub-domain of the alpha/beta class and a discrete (usually N-terminal) flavin sub-domain which vary in orientation with respect to the NAD(P) binding domain. The N-terminal moeity may contain a flavin prosthetic group (as in flavoenzymes) or use flavin as a substrate. Because flavins such as FAD can exist in oxidized, semiquinone (one- electron reduced), or fully reduced hydroquinone forms, FNR can interact with one and 2 electron carriers. FNR has a strong preference for NADP(H) vs NAD(H)." Q#20240 - CGI_10004930 superfamily 244539 593 721 1.10E-15 76.1894 cl06868 FNR_like superfamily N - "Ferredoxin reductase (FNR), an FAD and NAD(P) binding protein, was intially identified as a chloroplast reductase activity, catalyzing the electron transfer from reduced iron-sulfur protein ferredoxin to NADP+ as the final step in the electron transport mechanism of photosystem I. FNR transfers electrons from reduced ferredoxin to FAD (forming FADH2 via a semiquinone intermediate) and then transfers a hydride ion to convert NADP+ to NADPH. FNR has since been shown to utilize a variety of electron acceptors and donors and has a variety of physiological functions including nitrogen assimilation, dinitrogen fixation, steroid hydroxylation, fatty acid metabolism, oxygenase activity, and methane assimilation in many organisms. FNR has an NAD(P)-binding sub-domain of the alpha/beta class and a discrete (usually N-terminal) flavin sub-domain which vary in orientation with respect to the NAD(P) binding domain. The N-terminal moeity may contain a flavin prosthetic group (as in flavoenzymes) or use flavin as a substrate. 
Because flavins such as FAD can exist in oxidized, semiquinone (one- electron reduced), or fully reduced hydroquinone forms, FNR can interact with one and 2 electron carriers. FNR has a strong preference for NADP(H) vs NAD(H)." Q#20240 - CGI_10004930 superfamily 247856 127 196 5.09E-09 53.7057 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#20240 - CGI_10004930 superfamily 247856 92 153 2.24E-06 46.0017 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#20240 - CGI_10004930 superfamily 242267 287 425 5.52E-15 72.3228 cl01043 Ferric_reduct superfamily - - "Ferric reductase like transmembrane component; This family includes a common region in the transmembrane proteins mammalian cytochrome B-245 heavy chain (gp91-phox), ferric reductase transmembrane component in yeast and respiratory burst oxidase from mouse-ear cress. This may be a family of flavocytochromes capable of moving electrons across the plasma membrane. The Frp1 protein from S. pombe is a ferric reductase component and is required for cell surface ferric reductase activity, mutants in frp1 are deficient in ferric iron uptake. 
Cytochrome B-245 heavy chain is a FAD-dependent dehydrogenase it is also has electron transferase activity which reduces molecular oxygen to superoxide anion, a precursor in the production of microbicidal oxidants. Mutations in the sequence of cytochrome B-245 heavy chain (gp91-phox) lead to the X-linked chronic granulomatous disease. The bacteriocidal ability of phagocytic cells is reduced and is characterized by the absence of a functional plasma membrane associated NADPH oxidase. The chronic granulomatous disease gene codes for the beta chain of cytochrome B-245 and cytochrome B-245 is missing from patients with the disease." Q#20246 - CGI_10006870 superfamily 247683 544 590 1.88E-29 112.207 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#20246 - CGI_10006870 superfamily 243056 162 375 7.46E-54 185.588 cl02495 RabGAP-TBC superfamily - - "Rab-GTPase-TBC domain; Identification of a TBC domain in GYP6_YEAST and GYP7_YEAST, which are GTPase activator proteins of yeast Ypt6 and Ypt7, implies that these domains are GTPase activator proteins of Rab-like small GTPases." 
Q#20246 - CGI_10006870 superfamily 243142 617 771 3.83E-34 127.743 cl02689 RUN superfamily - - "RUN domain; This domain is present in several proteins that are linked to the functions of GTPases in the Rap and Rab families. They could hence play important roles in multiple Ras-like GTPase signalling pathways. The domain comprises six conserved regions, which in some proteins have considerable insertions between them. The domain core is thought to take up a predominantly alpha fold, with basic amino acids in regions A and D possibly playing a functional role in interactions with Ras GTPases." Q#20247 - CGI_10006871 superfamily 241546 371 488 4.84E-36 133.449 cl00011 PLAT superfamily - - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats. The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#20247 - CGI_10006871 superfamily 243072 698 822 2.26E-31 120.181 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#20248 - CGI_10006872 superfamily 245827 119 534 1.03E-141 446.281 cl11986 TOP1Ac superfamily - - "DNA Topoisomerase, subtype IA; DNA-binding, ATP-binding and catalytic domain of bacterial DNA topoisomerases I and III, and eukaryotic DNA topoisomerase III and eubacterial and archaeal reverse gyrases. 
Topoisomerases cleave single or double stranded DNA and then rejoin the broken phosphodiester backbone. Proposed catalytic mechanism of single stranded DNA cleavage is by phosphoryl transfer through a tyrosine nucleophile using acid/base catalysis. Tyr is activated by a nearby group (not yet identified) acting as a general base for nucleophilic attack on the 5' phosphate of the scissile bond. Arg and Lys stabilize the pentavalent transition state. Glu then acts as a proton donor for the leaving 3'-oxygen, upon cleavage of the scissile strand." Q#20248 - CGI_10006872 superfamily 242046 1 110 2.25E-37 140.061 cl00718 TOPRIM superfamily N - "Topoisomerase-primase domain. This is a nucleotidyl transferase/hydrolase domain found in type IA, type IIA and type IIB topoisomerases, bacterial DnaG-type primases, small primase-like proteins from bacteria and archaea, OLD family nucleases from bacteria and archaea, and bacterial DNA repair proteins of the RecR/M family. This domain has two conserved motifs, one of which centers at a conserved glutamate and the other one at two conserved aspartates (DxD). This glutamate and two aspartates cluster together to form a highly acidic surface patch. The conserved glutamate may act as a general base in nucleotide polymerization by primases and in strand joining in topoisomerases, and as a general acid in strand cleavage by topoisomerases and nucleases. The DXD motif may co-ordinate Mg2+, a cofactor required for full catalytic function." Q#20248 - CGI_10006872 superfamily 149077 1560 1634 2.31E-27 110.022 cl06719 TMC superfamily - - "TMC domain; These sequences are similar to a region conserved amongst various protein products of the transmembrane channel-like (TMC) gene family, such as Transmembrane channel-like protein 3 and EVIN2 - this region is termed the TMC domain. Mutations in these genes are implicated in a number of human conditions, such as deafness and epidermodysplasia verruciformis. 
TMC proteins are thought to have important cellular roles, and may be modifiers of ion channels or transporters." Q#20248 - CGI_10006872 superfamily 219199 866 908 7.90E-18 80.502 cl06070 zf-GRF superfamily - - GRF zinc finger; This presumed zinc binding domain is found in a variety of DNA-binding proteins. It seems likely that this domain is involved in nucleic acid binding. It is named GRF after three conserved residues in the centre of the alignment of the domain. This zinc finger may be related to pfam01396. Q#20248 - CGI_10006872 superfamily 219199 940 983 5.15E-17 78.1908 cl06070 zf-GRF superfamily - - GRF zinc finger; This presumed zinc binding domain is found in a variety of DNA-binding proteins. It seems likely that this domain is involved in nucleic acid binding. It is named GRF after three conserved residues in the centre of the alignment of the domain. This zinc finger may be related to pfam01396. Q#20248 - CGI_10006872 superfamily 219199 740 782 7.77E-16 74.724 cl06070 zf-GRF superfamily - - GRF zinc finger; This presumed zinc binding domain is found in a variety of DNA-binding proteins. It seems likely that this domain is involved in nucleic acid binding. It is named GRF after three conserved residues in the centre of the alignment of the domain. This zinc finger may be related to pfam01396. Q#20248 - CGI_10006872 superfamily 219199 800 831 1.86E-11 62.0124 cl06070 zf-GRF superfamily N - GRF zinc finger; This presumed zinc binding domain is found in a variety of DNA-binding proteins. It seems likely that this domain is involved in nucleic acid binding. It is named GRF after three conserved residues in the centre of the alignment of the domain. This zinc finger may be related to pfam01396. Q#20248 - CGI_10006872 superfamily 110400 572 610 0.000144453 41.8683 cl15497 zf-C4_Topoisom superfamily - - Topoisomerase DNA binding C4 zinc finger; Topoisomerase DNA binding C4 zinc finger. 
Q#20249 - CGI_10006873 superfamily 143631 1 107 5.22E-32 110.892 cl12007 Cby_like superfamily - - "Chibby, a nuclear inhibitor of Wnt/beta-catenin mediated transcription, and similar proteins; Chibby(Cby) is a well-conserved nuclear protein that functions as part of the Wnt/beta-catenin signaling pathway. Specifically, Cby binds directly to beta-catenin by interacting with its central region, which harbors armadillo repeats. Cby-beta-catenin interactions may also involve 14-3-3 proteins. By competing with other binding partners of beta-catenin, the Tcf/Lef transcription factors, Cby inhibits transcriptional activation. Cby has been shown to play a role in adipocyte differentiation. The C-terminal region of Cby appears to contain an alpha-helical coiled-coil motif." Q#20252 - CGI_10003521 superfamily 241559 33 137 1.37E-26 108.167 cl00030 CH superfamily - - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." Q#20252 - CGI_10003521 superfamily 241559 159 203 2.11E-07 51.9279 cl00030 CH superfamily C - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." Q#20252 - CGI_10003521 superfamily 216033 1714 1800 1.48E-23 98.9452 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. 
Q#20252 - CGI_10003521 superfamily 216033 2398 2483 9.94E-23 96.634 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#20252 - CGI_10003521 superfamily 216033 735 826 6.55E-22 94.3228 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#20252 - CGI_10003521 superfamily 216033 2215 2302 1.57E-21 93.1672 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#20252 - CGI_10003521 superfamily 216033 1317 1404 2.72E-21 92.3968 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#20252 - CGI_10003521 superfamily 216033 1223 1309 8.30E-21 90.856 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#20252 - CGI_10003521 superfamily 216033 1033 1119 1.43E-20 90.4708 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#20252 - CGI_10003521 superfamily 216033 2612 2695 1.56E-20 90.0856 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#20252 - CGI_10003521 superfamily 216033 1614 1707 3.28E-20 89.3152 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#20252 - CGI_10003521 superfamily 216033 1806 1897 6.41E-20 88.5448 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#20252 - CGI_10003521 superfamily 216033 640 725 6.88E-20 88.5448 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#20252 - CGI_10003521 superfamily 216033 1514 1598 3.95E-19 86.2336 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#20252 - CGI_10003521 superfamily 216033 1128 1214 5.47E-19 85.8484 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#20252 - CGI_10003521 superfamily 216033 542 635 9.60E-19 85.078 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. 
Q#20252 - CGI_10003521 superfamily 216033 841 931 1.72E-17 81.226 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#20252 - CGI_10003521 superfamily 216033 429 535 2.65E-16 77.7592 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#20252 - CGI_10003521 superfamily 216033 2799 2884 3.62E-16 77.374 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#20252 - CGI_10003521 superfamily 216033 337 421 4.47E-16 77.374 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#20252 - CGI_10003521 superfamily 216033 2924 3011 4.83E-16 76.9888 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#20252 - CGI_10003521 superfamily 216033 1904 1993 5.51E-16 76.9888 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#20252 - CGI_10003521 superfamily 216033 1999 2085 1.59E-15 75.8332 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#20252 - CGI_10003521 superfamily 216033 2703 2791 1.71E-15 75.448 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#20252 - CGI_10003521 superfamily 216033 1415 1498 2.24E-12 66.5884 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#20252 - CGI_10003521 superfamily 216033 939 1026 8.12E-12 64.6624 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#20252 - CGI_10003521 superfamily 216033 2144 2210 3.76E-11 62.7364 cl16959 Filamin superfamily N - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#20252 - CGI_10003521 superfamily 216033 2541 2600 3.45E-10 60.04 cl16959 Filamin superfamily N - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#20252 - CGI_10003521 superfamily 216033 2310 2389 1.10E-08 55.4176 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. 
Q#20252 - CGI_10003521 superfamily 241559 232 315 6.53E-08 53.4372 cl00030 CH superfamily - - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." Q#20253 - CGI_10011808 superfamily 247725 394 503 1.86E-57 190.295 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#20253 - CGI_10011808 superfamily 215882 538 592 5.36E-13 66.1502 cl09511 FERM_M superfamily N - FERM central domain; This domain is the central structural domain of the FERM domain. Q#20253 - CGI_10011808 superfamily 215882 302 337 5.87E-09 54.209 cl09511 FERM_M superfamily C - FERM central domain; This domain is the central structural domain of the FERM domain. 
Q#20254 - CGI_10011809 superfamily 243034 346 447 5.51E-12 62.3976 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#20254 - CGI_10011809 superfamily 243034 282 377 1.03E-11 61.6272 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number 
of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#20254 - CGI_10011809 superfamily 243034 215 304 2.08E-06 45.834 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110
subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#20254 - CGI_10011809 superfamily 243034 416 485 0.00116876 37.3596 cl02429 TPR superfamily C - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#20255 - CGI_10011810 superfamily 243092 248 548 3.13E-33 127.836 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can
bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#20255 - CGI_10011810 superfamily 128914 85 134 2.65E-06 45.2546 cl15352 CTLH superfamily - - C-terminal to LisH motif; Alpha-helical motif of unknown function. Q#20257 - CGI_10011812 superfamily 247739 118 321 4.73E-97 299.874 cl17185 LPLAT superfamily - - "Lysophospholipid acyltransferases (LPLATs) of glycerophospholipid biosynthesis; Lysophospholipid acyltransferase (LPLAT) superfamily members are acyltransferases of de novo and remodeling pathways of glycerophospholipid biosynthesis. These proteins catalyze the incorporation of an acyl group from either acylCoAs or acyl-acyl carrier proteins (acylACPs) into acceptors such as glycerol 3-phosphate, dihydroxyacetone phosphate or lyso-phosphatidic acid. 
Included in this superfamily are LPLATs such as glycerol-3-phosphate 1-acyltransferase (GPAT, PlsB), 1-acyl-sn-glycerol-3-phosphate acyltransferase (AGPAT, PlsC), lysophosphatidylcholine acyltransferase 1 (LPCAT-1), lysophosphatidylethanolamine acyltransferase (LPEAT, also known as, MBOAT2, membrane-bound O-acyltransferase domain-containing protein 2), lipid A biosynthesis lauroyl/myristoyl acyltransferase, 2-acylglycerol O-acyltransferase (MGAT), dihydroxyacetone phosphate acyltransferase (DHAPAT, also known as 1 glycerol-3-phosphate O-acyltransferase 1) and Tafazzin (the protein product of the Barth syndrome (TAZ) gene)." Q#20258 - CGI_10011813 superfamily 248458 34 367 7.14E-25 103.548 cl17904 MFS superfamily - - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 
Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#20259 - CGI_10011814 superfamily 202715 26 117 3.53E-21 82.2408 cl04194 Tctex-1 superfamily - - Tctex-1 family; Tctex-1 is a dynein light chain. It has been shown that Tctex-1 can bind to the cytoplasmic tail of rhodopsin. C-terminal rhodopsin mutations responsible for retinitis pigmentosa inhibit this interaction. Q#20260 - CGI_10011815 superfamily 241563 58 99 2.51E-05 42.0812 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#20261 - CGI_10011816 superfamily 243519 12 174 7.74E-89 281.418 cl03757 phosphohexomutase superfamily C - "The alpha-D-phosphohexomutase superfamily includes several related enzymes that catalyze a reversible intramolecular phosphoryl transfer on their sugar substrates. Members of this family include the phosphoglucomutases (PGM1 and PGM2), phosphoglucosamine mutase (PNGM), phosphoacetylglucosamine mutase (PAGM), the bacterial phosphomannomutase ManB, the bacterial phosphoglucosamine mutase GlmM, and the bifunctional phosphomannomutase/phosphoglucomutase (PMM/PGM). These enzymes play important and diverse roles in carbohydrate metabolism in organisms from bacteria to humans. Each of these enzymes has four domains with a centrally located active site formed by four loops, one from each domain. All four domains are included in this alignment model." 
Q#20261 - CGI_10011816 superfamily 202715 310 407 1.79E-22 90.7152 cl04194 Tctex-1 superfamily - - Tctex-1 family; Tctex-1 is a dynein light chain. It has been shown that Tctex-1 can bind to the cytoplasmic tail of rhodopsin. C-terminal rhodopsin mutations responsible for retinitis pigmentosa inhibit this interaction. Q#20263 - CGI_10011818 superfamily 243519 12 593 0 992.496 cl03757 phosphohexomutase superfamily - - "The alpha-D-phosphohexomutase superfamily includes several related enzymes that catalyze a reversible intramolecular phosphoryl transfer on their sugar substrates. Members of this family include the phosphoglucomutases (PGM1 and PGM2), phosphoglucosamine mutase (PNGM), phosphoacetylglucosamine mutase (PAGM), the bacterial phosphomannomutase ManB, the bacterial phosphoglucosamine mutase GlmM, and the bifunctional phosphomannomutase/phosphoglucomutase (PMM/PGM). These enzymes play important and diverse roles in carbohydrate metabolism in organisms from bacteria to humans. Each of these enzymes has four domains with a centrally located active site formed by four loops, one from each domain. All four domains are included in this alignment model." Q#20264 - CGI_10011819 superfamily 245201 91 367 9.44E-63 206.602 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#20265 - CGI_10011820 superfamily 243154 13 60 7.82E-18 75.3165 cl02715 Surp superfamily - - Surp module; This domain is also known as the SWAP domain. SWAP stands for Suppressor-of-White-APricot. 
It has been suggested that these domains may be RNA binding. Q#20265 - CGI_10011820 superfamily 243078 210 273 6.51E-12 59.5358 cl02544 VHS_ENTH_ANTH superfamily - - "VHS, ENTH and ANTH domain superfamily; composed of proteins containing a VHS, ENTH or ANTH domain. The VHS domain is present in Vps27 (Vacuolar Protein Sorting), Hrs (Hepatocyte growth factor-regulated tyrosine kinase substrate) and STAM (Signal Transducing Adaptor Molecule). It is located at the N-termini of proteins involved in intracellular membrane trafficking. The epsin N-terminal homology (ENTH) domain is an evolutionarily conserved protein module found primarily in proteins that participate in clathrin-mediated endocytosis. A set of proteins previously designated as harboring an ENTH domain in fact contains a highly similar, yet unique module referred to as an AP180 N-terminal homology (ANTH) domain. VHS, ENTH and ANTH domains are structurally similar and are composed of a superhelix of eight alpha helices. ENTH and ANTH (E/ANTH) domains bind both inositol phospholipids and proteins and contribute to the nucleation and formation of clathrin coats on membranes. ENTH domains also function in the development of membrane curvature through lipid remodeling during the formation of clathrin-coated vesicles. E/ANTH domain-bearing proteins have recently been shown to function with adaptor protein-1 and GGA adaptors at the trans-Golgi network, which suggests that E/ANTH domains are universal components of the machinery for clathrin-mediated membrane budding." Q#20267 - CGI_10011822 superfamily 243992 5 72 3.81E-09 48.7242 cl05087 Complex1_LYR_1 superfamily - - "Complex1_LYR-like; This is a family of proteins carrying the LYR motif of family Complex1_LYR, pfam05347, likely to be involved in Fe-S cluster biogenesis in mitochondria."
Q#20268 - CGI_10011823 superfamily 241624 60 354 8.43E-45 159.414 cl00120 PP2Cc superfamily - - "Serine/threonine phosphatases, family 2C, catalytic domain; The protein architecture and deduced catalytic mechanism of PP2C phosphatases are similar to the PP1, PP2A, PP2B family of protein Ser/Thr phosphatases, with which PP2C shares no sequence similarity." Q#20269 - CGI_10011824 superfamily 247986 205 298 2.28E-08 53.531 cl17432 PBPb superfamily C - "Bacterial periplasmic transport systems use membrane-bound complexes and substrate-bound, membrane-associated, periplasmic binding proteins (PBPs) to transport a wide variety of substrates, such as, amino acids, peptides, sugars, vitamins and inorganic ions. PBPs have two cell-membrane translocation functions: bind substrate, and interact with the membrane bound complex. A diverse group of periplasmic transport receptors for lysine/arginine/ornithine (LAO), glutamine, histidine, sulfate, phosphate, molybdate, and methanol are included in the PBPb CD." Q#20269 - CGI_10011824 superfamily 197504 417 578 4.28E-27 106.99 cl18192 PBPe superfamily - - Eukaryotic homologues of bacterial periplasmic substrate binding proteins; Prokaryotic homologues are represented by a separate alignment: PBPb Q#20270 - CGI_10011825 superfamily 247986 206 307 1.94E-05 44.6714 cl17432 PBPb superfamily C - "Bacterial periplasmic transport systems use membrane-bound complexes and substrate-bound, membrane-associated, periplasmic binding proteins (PBPs) to transport a wide variety of substrates, such as, amino acids, peptides, sugars, vitamins and inorganic ions. PBPs have two cell-membrane translocation functions: bind substrate, and interact with the membrane bound complex. A diverse group of periplasmic transport receptors for lysine/arginine/ornithine (LAO), glutamine, histidine, sulfate, phosphate, molybdate, and methanol are included in the PBPb CD." 
Q#20271 - CGI_10011826 superfamily 116798 43 166 4.63E-13 64.6202 cl17955 Lipocalin_2 superfamily - - "Lipocalin-like domain; Lipocalins are transporters for small hydrophobic molecules, such as lipids, steroid hormones, bilins, and retinoids. The structure is an eight-stranded beta barrel." Q#20271 - CGI_10011826 superfamily 245225 198 314 7.79E-13 66.9188 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily C - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein couples receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine- binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. 
Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#20274 - CGI_10019084 superfamily 145792 18 38 0.000938664 33.3883 cl10597 Antistasin superfamily - - Antistasin family; Members of this family are inhibitors of trypsin family proteases. This domain is highly disulphide bonded. The domain is also found in some large extracellular proteins in multiple copies. Q#20275 - CGI_10019085 superfamily 222150 301 326 9.37E-05 41.2233 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#20275 - CGI_10019085 superfamily 206083 61 87 0.00035152 39.5136 cl16471 zf-C2H2_6 superfamily - - C2H2-type zinc finger; C2H2-type zinc finger. Q#20275 - CGI_10019085 superfamily 222150 631 655 0.00617349 35.8305 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#20276 - CGI_10019086 superfamily 248020 20 371 5.79E-62 207.702 cl17466 Sulfatase superfamily - - Sulfatase; Sulfatase. Q#20277 - CGI_10019087 superfamily 245201 35 131 8.76E-16 70.258 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#20280 - CGI_10019090 superfamily 110440 481 508 0.00129764 37.0021 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. 
The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#20280 - CGI_10019090 superfamily 110440 523 550 0.00933853 34.3057 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#20281 - CGI_10019092 superfamily 242902 23 119 2.45E-14 68.0422 cl02144 TLD superfamily C - TLD; This domain is predicted to be an enzyme and is often found associated with pfam01476. Q#20283 - CGI_10019094 superfamily 247675 198 456 4.36E-103 314.277 cl17011 Arginase_HDAC superfamily - - "Arginase-like and histone-like hydrolases; Arginase-like/histone-like hydrolase superfamily includes metal-dependent enzymes that belong to Arginase-like amidino hydrolase family and histone/histone-like deacetylase class I, II, IV family, respectively. These enzymes catalyze hydrolysis of amide bond. Arginases are known to be involved in control of cellular levels of arginine and ornithine, in histidine and arginine degradation and in clavulanic acid biosynthesis. Deacetylases play a role in signal transduction through histone and/or other protein modification and can repress/activate transcription of a number of different genes.
They participate in different cellular processes including cell cycle regulation, DNA damage response, embryonic development, cytokine signaling important for immune response and post-translational control of the acetyl coenzyme A synthetase. Mammalian histone deacetylases are known to be involved in progression of different tumors. Specific inhibitors of mammalian histone deacetylases are an emerging class of promising novel anticancer drugs." Q#20283 - CGI_10019094 superfamily 247675 10 110 1.68E-44 159.783 cl17011 Arginase_HDAC superfamily N - "Arginase-like and histone-like hydrolases; Arginase-like/histone-like hydrolase superfamily includes metal-dependent enzymes that belong to Arginase-like amidino hydrolase family and histone/histone-like deacetylase class I, II, IV family, respectively. These enzymes catalyze hydrolysis of amide bond. Arginases are known to be involved in control of cellular levels of arginine and ornithine, in histidine and arginine degradation and in clavulanic acid biosynthesis. Deacetylases play a role in signal transduction through histone and/or other protein modification and can repress/activate transcription of a number of different genes. They participate in different cellular processes including cell cycle regulation, DNA damage response, embryonic development, cytokine signaling important for immune response and post-translational control of the acetyl coenzyme A synthetase. Mammalian histone deacetylases are known to be involved in progression of different tumors. Specific inhibitors of mammalian histone deacetylases are an emerging class of promising novel anticancer drugs." Q#20284 - CGI_10019095 superfamily 217473 146 322 8.63E-27 110.532 cl03978 Mab-21 superfamily N - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development.
Q#20286 - CGI_10019097 superfamily 217473 221 368 9.14E-21 92.8133 cl03978 Mab-21 superfamily N - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#20287 - CGI_10019098 superfamily 241563 136 182 8.61E-06 44.3924 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#20289 - CGI_10019100 superfamily 114751 156 257 8.87E-56 176.843 cl05538 SynMuv_product superfamily - - Ras-induced vulval development antagonist; This family is from synthetic multi-vulval genes which encode chromatin-associated proteins involved in transcriptional repression. This protein has a role in antagonising Ras-induced vulval development. Q#20290 - CGI_10019101 superfamily 247792 16 66 1.87E-06 46.2848 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#20291 - CGI_10019102 superfamily 219619 389 438 2.17E-05 42.58 cl18518 Ion_trans_2 superfamily N - Ion channel; This family includes the two membrane helix type ion channels found in bacteria.
Q#20296 - CGI_10019107 superfamily 243092 608 864 1.90E-16 79.6864 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#20296 - CGI_10019107 superfamily 208095 74 130 4.93E-07 48.3587 cl04084 dDENN superfamily - - dDENN domain; This region is always found associated with pfam02141. It is predicted to form a globular domain. This domain is predicted to be completely alpha helical. Although not statistically supported it has been suggested that this domain may be similar to members of the Rho/Rac/Cdc42 GEF family. Q#20300 - CGI_10015041 superfamily 248097 60 141 1.21E-05 42.6374 cl17543 C1q superfamily C - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. 
Q#20301 - CGI_10015042 superfamily 248097 21 137 9.30E-14 63.4382 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#20302 - CGI_10015043 superfamily 248097 23 139 2.33E-15 68.0606 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#20303 - CGI_10015044 superfamily 248097 80 196 2.41E-13 63.053 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#20304 - CGI_10015045 superfamily 245210 4 391 0 537.448 cl09938 cond_enzymes superfamily - - "Condensing enzymes; Family of enzymes that catalyze a (decarboxylating or non-decarboxylating) Claisen-like condensation reaction. Members share strong structural similarity, and are involved in the synthesis and degradation of fatty acids, and the production of polyketides, a diverse group of natural products." Q#20305 - CGI_10015046 superfamily 242039 73 143 2.52E-31 109.228 cl00706 Ribosomal_L44 superfamily - - Ribosomal protein L44; Ribosomal protein L44. Q#20308 - CGI_10015049 superfamily 217062 5 209 5.28E-27 105.044 cl12266 Branch superfamily N - "Core-2/I-Branching enzyme; This is a family of two different beta-1,6-N-acetylglucosaminyltransferase enzymes, I-branching enzyme and core-2 branching enzyme. I-branching enzyme is responsible for the production of the blood group I-antigen during embryonic development. Core-2 branching enzyme forms crucial side-chain branches in O-glycans." Q#20309 - CGI_10015050 superfamily 243051 17 42 5.90E-06 40.8242 cl02479 MAM superfamily N - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion.
Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal cord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#20311 - CGI_10005243 superfamily 245612 39 462 7.84E-87 275.329 cl11426 Amidase superfamily - - Amidase; Amidase. Q#20313 - CGI_10005245 superfamily 241570 6 56 7.95E-09 51.5578 cl00047 CAP_ED superfamily C - "effector domain of the CAP family of transcription factors; members include CAP (or cAMP receptor protein (CRP)), which binds cAMP, FNR (fumarate and nitrate reduction), which uses an iron-sulfur cluster to sense oxygen) and CooA, a heme containing CO sensor. In all cases binding of the effector leads to conformational changes and the ability to activate transcription. Cyclic nucleotide-binding domain similar to CAP are also present in cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) and vertebrate cyclic nucleotide-gated ion-channels.
Cyclic nucleotide-monophosphate binding domain; proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues; the best studied is the prokaryotic catabolite gene activator, CAP, where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure; three conserved glycine residues are thought to be essential for maintenance of the structural integrity of the beta-barrel; CooA is a homodimeric transcription factor that belongs to CAP family; cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain; cAPK's are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain; cGPK's are single chain enzymes that include the two copies of the domain in their N-terminal section; also found in vertebrate cyclic nucleotide-gated ion-channels" Q#20313 - CGI_10005245 superfamily 241570 135 180 1.41E-06 45.0094 cl00047 CAP_ED superfamily N - "effector domain of the CAP family of transcription factors; members include CAP (or cAMP receptor protein (CRP)), which binds cAMP, FNR (fumarate and nitrate reduction), which uses an iron-sulfur cluster to sense oxygen) and CooA, a heme containing CO sensor. In all cases binding of the effector leads to conformational changes and the ability to activate transcription. Cyclic nucleotide-binding domain similar to CAP are also present in cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) and vertebrate cyclic nucleotide-gated ion-channels. 
Cyclic nucleotide-monophosphate binding domain; proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues; the best studied is the prokaryotic catabolite gene activator, CAP, where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure; three conserved glycine residues are thought to be essential for maintenance of the structural integrity of the beta-barrel; CooA is a homodimeric transcription factor that belongs to CAP family; cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain; cAPK's are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain; cGPK's are single chain enzymes that include the two copies of the domain in their N-terminal section; also found in vertebrate cyclic nucleotide-gated ion-channels" Q#20313 - CGI_10005245 superfamily 241570 200 257 0.00266803 35.3794 cl00047 CAP_ED superfamily C - "effector domain of the CAP family of transcription factors; members include CAP (or cAMP receptor protein (CRP)), which binds cAMP, FNR (fumarate and nitrate reduction), which uses an iron-sulfur cluster to sense oxygen) and CooA, a heme containing CO sensor. In all cases binding of the effector leads to conformational changes and the ability to activate transcription. Cyclic nucleotide-binding domain similar to CAP are also present in cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) and vertebrate cyclic nucleotide-gated ion-channels. 
Cyclic nucleotide-monophosphate binding domain; proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues; the best studied is the prokaryotic catabolite gene activator, CAP, where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure; three conserved glycine residues are thought to be essential for maintenance of the structural integrity of the beta-barrel; CooA is a homodimeric transcription factor that belongs to CAP family; cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain; cAPK's are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain; cGPK's are single chain enzymes that include the two copies of the domain in their N-terminal section; also found in vertebrate cyclic nucleotide-gated ion-channels" Q#20316 - CGI_10005248 superfamily 243079 704 754 2.27E-10 57.724 cl02546 Granulin superfamily - - Granulin; Granulin. Q#20316 - CGI_10005248 superfamily 243079 309 351 2.58E-10 57.3199 cl02546 Granulin superfamily - - Granulin; Granulin. Q#20316 - CGI_10005248 superfamily 243079 155 197 2.58E-10 57.3199 cl02546 Granulin superfamily - - Granulin; Granulin. Q#20316 - CGI_10005248 superfamily 243079 611 661 3.24E-10 57.3388 cl02546 Granulin superfamily - - Granulin; Granulin. Q#20316 - CGI_10005248 superfamily 243079 222 272 3.86E-10 56.9536 cl02546 Granulin superfamily - - Granulin; Granulin. Q#20316 - CGI_10005248 superfamily 243079 376 426 4.14E-10 56.9536 cl02546 Granulin superfamily - - Granulin; Granulin. Q#20316 - CGI_10005248 superfamily 243079 452 502 4.61E-10 56.9536 cl02546 Granulin superfamily - - Granulin; Granulin. Q#20316 - CGI_10005248 superfamily 243079 68 118 6.60E-10 56.5684 cl02546 Granulin superfamily - - Granulin; Granulin. 
Q#20316 - CGI_10005248 superfamily 243079 781 831 6.60E-10 56.5684 cl02546 Granulin superfamily - - Granulin; Granulin. Q#20316 - CGI_10005248 superfamily 243079 538 580 1.15E-09 55.7792 cl02546 Granulin superfamily - - Granulin; Granulin. Q#20316 - CGI_10005248 superfamily 243079 867 912 3.71E-07 48.4792 cl02546 Granulin superfamily - - Granulin; Granulin. Q#20316 - CGI_10005248 superfamily 243079 18 43 0.00220789 37.2896 cl02546 Granulin superfamily N - Granulin; Granulin. Q#20317 - CGI_10005499 superfamily 243072 155 271 5.62E-26 105.158 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#20317 - CGI_10005499 superfamily 243072 860 981 1.34E-25 104.003 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#20317 - CGI_10005499 superfamily 243072 793 913 3.47E-25 102.847 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#20317 - CGI_10005499 superfamily 243072 928 1047 5.49E-25 102.462 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#20317 - CGI_10005499 superfamily 243072 60 176 2.38E-24 100.536 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#20317 - CGI_10005499 superfamily 243072 347 463 2.62E-24 100.536 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#20317 - CGI_10005499 superfamily 243072 505 623 2.81E-24 100.151 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#20317 - CGI_10005499 superfamily 243072 668 783 1.14E-22 95.5282 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#20317 - CGI_10005499 superfamily 243072 252 399 1.58E-20 89.365 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#20317 - CGI_10005499 superfamily 243072 1 112 9.17E-15 72.8014 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#20318 - CGI_10005500 superfamily 241574 217 264 3.90E-13 66.8405 cl00053 PTPc superfamily N - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#20318 - CGI_10005500 superfamily 241574 281 367 2.81E-09 55.6697 cl00053 PTPc superfamily N - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#20319 - CGI_10000762 superfamily 241832 14 68 9.19E-26 96.3692 cl00388 Thioredoxin_like superfamily NC - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. 
They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#20321 - CGI_10017162 superfamily 219619 40 118 1.13E-11 58.3731 cl18518 Ion_trans_2 superfamily - - Ion channel; This family includes the two membrane helix type ion channels found in bacteria. Q#20323 - CGI_10017164 superfamily 198867 127 223 4.09E-33 122.652 cl06652 BACK superfamily - - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation)." Q#20323 - CGI_10017164 superfamily 243066 15 118 1.19E-26 104.622 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. 
POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#20323 - CGI_10017164 superfamily 243146 481 534 1.16E-08 51.8934 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#20323 - CGI_10017164 superfamily 243146 447 492 7.84E-06 43.7011 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#20323 - CGI_10017164 superfamily 243146 386 432 8.63E-06 43.8042 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#20324 - CGI_10017165 superfamily 227412 114 268 1.67E-07 49.7746 cl18811 YIP1 superfamily N - "Rab GTPase interacting factor, Golgi membrane protein [Intracellular trafficking and secretion]" Q#20325 - CGI_10017166 superfamily 246722 1 207 2.27E-121 362.199 cl14812 PIN_SF superfamily - - "PIN (PilT N terminus) domain: Superfamily; PIN_SF The PIN (PilT N terminus) domain belongs to a large nuclease superfamily with representatives from eukaryota, eubacteria, and archaea. PIN domains were originally named for their sequence similarity to the N-terminal domain of an annotated pili biogenesis protein, PilT, a domain fusion between a PIN-domain and a PilT ATPase domain. The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues) is geometrically similar in the active center of structure-specific 5' nucleases (also known as Flap endonuclease-1-like), PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Seen here, are two major divisions in the PIN domain superfamily. 
The first major division, the structure-specific 5' nuclease family, is represented by FEN1, the 5'-3' exonuclease of DNA polymerase I, and T4 RNase H nuclease PIN domains. These 5' nucleases are involved in DNA replication, repair, and recombination. They are capable of both 5'-3' exonucleolytic activity and cleaving bifurcated DNA, in an endonucleolytic, structure-specific manner. Unique to FEN1-like nucleases, the PIN domain has a helical arch/clamp region (I domain) of variable length (approximately 16 to 800 residues) and, inserted within the C-terminal region of the PIN domain, a H3TH (helix-3-turn-helix) domain, an atypical helix-hairpin-helix-2-like region. Both the H3TH domain (not included here) and the helical arch/clamp region are involved in DNA binding. With the exception of Mkt1, these nucleases have a carboxylate rich active site that is involved in binding essential divalent metal ion cofactors (Mg2+, Mn2+, Zn2+, or Co2+). The second major division of the PIN domain superfamily, the VapC-Smg6 family, includes such eukaryotic ribonucleases as, Smg6, an essential factor in nonsense-mediated mRNA decay; Rrp44, the catalytic subunit of the exosome; and Nob1, a ribosome assembly factor critical in pre-rRNA processing. A large percentage of members in this family are bacterial ribonuclease toxins of TA operons such as Mycobacterium tuberculosis VapC and Neisseria gonorrhoeae FitB, as well as, archaeal homologs, Pyrobaculum aerophilum Pea0151 and P. aerophilum Pae2754. Also included are the eukaryotic Fcf1/ Utp24 (FAF1-copurifying factor 1/U three-associated protein 24) and Utp23-like proteins. Components of the small subunit processome, Fcf1/Utp24 and Utp23 are essential proteins involved in pre-rRNA processing and 40S ribosomal subunit assembly." 
Q#20325 - CGI_10017166 superfamily 246724 214 247 1.40E-14 69.9117 cl14815 H3TH_StructSpec-5'-nucleases superfamily C - "H3TH domains of structure-specific 5' nucleases (or flap endonuclease-1-like) involved in DNA replication, repair, and recombination; The 5' nucleases of this superfamily are capable of both 5'-3' exonucleolytic activity and cleaving bifurcated or branched DNA, in an endonucleolytic, structure-specific manner, and are involved in DNA replication, repair, and recombination. The superfamily includes the H3TH (helix-3-turn-helix) domains of Flap Endonuclease-1 (FEN1), Exonuclease-1 (EXO1), Mkt1, Gap Endonuclease 1 (GEN1) and Xeroderma pigmentosum complementation group G (XPG) nuclease. Also included are the H3TH domains of the 5'-3' exonucleases of DNA polymerase I and single domain protein homologs, as well as, the bacteriophage T4 RNase H, T5-5'nuclease, and other homologs. These nucleases contain a PIN (PilT N terminus) domain with a helical arch/clamp region/I domain (not included here) and inserted within the C-terminal region of the PIN domain is an atypical helix-hairpin-helix-2 (HhH2)-like region. This atypical HhH2 region, the H3TH domain, has an extended loop with at least three turns between the first two helices, and only three of the four helices appear to be conserved. Both the H3TH domain and the helical arch/clamp region are involved in DNA binding. Studies suggest that a glycine-rich loop in the H3TH domain contacts the phosphate backbone of the template strand in the downstream DNA duplex. Typically, the nucleases within this superfamily have a carboxylate rich active site that is involved in binding essential divalent metal ion cofactors (i. e., Mg2+, Mn2+, Zn2+, or Co2+) required for nuclease activity. 
The first metal binding site is composed entirely of Asp/Glu residues from the PIN domain, whereas, the second metal binding site is composed generally of two Asp residues from the PIN domain and one or two Asp residues from the H3TH domain. Together with the helical arch and network of amino acids interacting with metal binding ions, the H3TH region defines a positively charged active-site DNA-binding groove in structure-specific 5' nucleases." Q#20326 - CGI_10017167 superfamily 247739 2 205 2.43E-34 124.658 cl17185 LPLAT superfamily - - "Lysophospholipid acyltransferases (LPLATs) of glycerophospholipid biosynthesis; Lysophospholipid acyltransferase (LPLAT) superfamily members are acyltransferases of de novo and remodeling pathways of glycerophospholipid biosynthesis. These proteins catalyze the incorporation of an acyl group from either acylCoAs or acyl-acyl carrier proteins (acylACPs) into acceptors such as glycerol 3-phosphate, dihydroxyacetone phosphate or lyso-phosphatidic acid. Included in this superfamily are LPLATs such as glycerol-3-phosphate 1-acyltransferase (GPAT, PlsB), 1-acyl-sn-glycerol-3-phosphate acyltransferase (AGPAT, PlsC), lysophosphatidylcholine acyltransferase 1 (LPCAT-1), lysophosphatidylethanolamine acyltransferase (LPEAT, also known as, MBOAT2, membrane-bound O-acyltransferase domain-containing protein 2), lipid A biosynthesis lauroyl/myristoyl acyltransferase, 2-acylglycerol O-acyltransferase (MGAT), dihydroxyacetone phosphate acyltransferase (DHAPAT, also known as 1 glycerol-3-phosphate O-acyltransferase 1) and Tafazzin (the protein product of the Barth syndrome (TAZ) gene)." 
Q#20327 - CGI_10017168 superfamily 247744 358 559 1.02E-27 111.175 cl17190 NK superfamily - - "Nucleoside/nucleotide kinase (NK) is a protein superfamily consisting of multiple families of enzymes that share structural similarity and are functionally related to the catalysis of the reversible phosphate group transfer from nucleoside triphosphates to nucleosides/nucleotides, nucleoside monophosphates, or sugars. Members of this family play a wide variety of essential roles in nucleotide metabolism, the biosynthesis of coenzymes and aromatic compounds, as well as the metabolism of sugar and sulfate." Q#20327 - CGI_10017168 superfamily 147395 660 701 1.69E-15 71.8468 cl04973 Dpy-30 superfamily - - Dpy-30 motif; This motif is found in a wide variety of domain contexts. It is found in the Dpy-30 proteins hence the motif's name. It is about 40 residues long and is probably formed of two alpha-helices. It may be a dimerisation motif analogous to pfam02197 (Bateman A pers obs). Q#20327 - CGI_10017168 superfamily 245206 73 287 3.69E-05 44.5889 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." 
Q#20328 - CGI_10017169 superfamily 247044 17 98 9.26E-38 125.427 cl15697 ADF_gelsolin superfamily N - Actin depolymerization factor/cofilin- and gelsolin-like domains; Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. Q#20329 - CGI_10017170 superfamily 241574 547 728 5.48E-63 216.298 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#20329 - CGI_10017170 superfamily 241574 1311 1448 2.08E-49 177.393 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." 
Q#20329 - CGI_10017170 superfamily 241574 791 874 2.76E-11 64.1441 cl00053 PTPc superfamily C - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#20329 - CGI_10017170 superfamily 238012 224 269 0.00293748 37.719 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#20329 - CGI_10017170 superfamily 241574 1537 1584 0.00325106 39.4914 cl00053 PTPc superfamily N - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." 
Q#20330 - CGI_10017171 superfamily 243859 175 258 4.49E-17 74.2886 cl04722 PLAC8 superfamily - - PLAC8 family; This family includes the Placenta-specific gene 8 protein. Q#20331 - CGI_10017172 superfamily 247724 137 341 1.23E-43 151.403 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#20332 - CGI_10017173 superfamily 241874 45 604 0 571.43 cl00456 SLC5-6-like_sbd superfamily - - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. 
SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#20333 - CGI_10017174 superfamily 246597 83 311 3.36E-81 254.148 cl13995 MPP_superfamily superfamily N - "metallophosphatase superfamily, metallophosphatase domain; Metallophosphatases (MPPs), also known as metallophosphoesterases, phosphodiesterases (PDEs), binuclear metallophosphoesterases, and dimetal-containing phosphoesterases (DMPs), represent a diverse superfamily of enzymes with a conserved domain containing an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. This superfamily includes: the phosphoprotein phosphatases (PPPs), Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination." 
Q#20335 - CGI_10017176 superfamily 149105 144 253 3.94E-44 153.745 cl12353 TMPIT superfamily N - "TMPIT-like protein; A number of members of this family are annotated as being transmembrane proteins induced by tumour necrosis factor alpha, but no literature was found to support this." Q#20335 - CGI_10017176 superfamily 149105 14 133 3.79E-24 98.6612 cl12353 TMPIT superfamily C - "TMPIT-like protein; A number of members of this family are annotated as being transmembrane proteins induced by tumour necrosis factor alpha, but no literature was found to support this." Q#20336 - CGI_10001150 superfamily 220672 3 190 3.38E-37 131.6 cl10957 Frag1 superfamily - - "Frag1/DRAM/Sfk1 family; This family includes Frag1, DRAM and Sfk1 proteins. Frag1 (FGF receptor activating protein 1) is a protein that is conserved from fungi to humans. There are four potential iso-prenylation sites throughout the peptide, viz CILW, CIIW and CIGL. Frag1 is a membrane-spanning protein that is ubiquitously expressed in adult tissues suggesting an important cellular function. Dram is a family of proteins conserved from nematodes to humans with six hydrophobic transmembrane regions and an Endoplasmic Reticulum signal peptide. It is a lysosomal protein that induces macro-autophagy as an effector of p53-mediated death, where p53 is the tumour-suppressor gene that is frequently mutated in cancer. Expression of Dram is stress-induced. This region is also part of a family of small plasma membrane proteins, referred to as Sfk1, that may act together with or upstream of Stt4p to generate normal levels of the essential phospholipid PI4P, thus allowing proper localisation of Stt4p to the actin cytoskeleton." Q#20337 - CGI_10001151 superfamily 241571 257 365 8.91E-15 69.7486 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." 
Q#20337 - CGI_10001151 superfamily 243072 28 124 4.61E-13 65.0974 cl02529 ANK superfamily C - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#20337 - CGI_10001151 superfamily 243073 205 243 1.83E-07 47.4649 cl02533 SOCS superfamily - - "SOCS (suppressors of cytokine signaling) box. The SOCS box is found in the C-terminal region of CIS/SOCS family proteins (in combination with a SH2 domain), ASBs (ankyrin repeat-containing proteins with a SOCS box), SSBs (SPRY domain-containing proteins with a SOCS box), and WSBs (WD40 repeat-containing proteins with a SOCS box), as well as, other miscellaneous proteins. The function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions." Q#20338 - CGI_10004033 superfamily 243066 132 215 6.15E-15 67.1929 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. 
The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#20339 - CGI_10004034 superfamily 247057 9 53 2.32E-05 42.2265 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#20339 - CGI_10004034 superfamily 241594 341 374 0.00805592 36.774 cl00077 HECTc superfamily C - "HECT domain; C-terminal catalytic domain of a subclass of Ubiquitin-protein ligase (E3). It binds specific ubiquitin-conjugating enzymes (E2), accepts ubiquitin from E2, transfers ubiquitin to substrate lysine side chains, and transfers additional ubiquitin molecules to the end of growing ubiquitin chains." 
Q#20340 - CGI_10004035 superfamily 243092 1424 1492 6.03E-06 48.4852 cl02567 WD40 superfamily NC - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#20340 - CGI_10004035 superfamily 243092 958 1206 6.53E-06 48.4852 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#20341 - CGI_10004036 superfamily 247916 219 277 3.47E-05 42.3699 cl17362 Transglut_core superfamily - - "Transglutaminase-like superfamily; This family includes animal transglutaminases and other bacterial proteins of unknown function. Sequence conservation in this superfamily primarily involves three motifs that centre around conserved cysteine, histidine, and aspartate residues that form the catalytic triad in the structurally characterized transglutaminase, the human blood clotting factor XIIIa'. 
On the basis of the experimentally demonstrated activity of the Methanobacterium phage pseudomurein endoisopeptidase, it is proposed that many, if not all, microbial homologues of the transglutaminases are proteases and that the eukaryotic transglutaminases have evolved from an ancestral protease." Q#20342 - CGI_10004037 superfamily 241609 112 165 1.75E-18 76.2627 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#20342 - CGI_10004037 superfamily 241568 61 102 0.000232964 36.672 cl00043 CCP superfamily N - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#20343 - CGI_10004038 superfamily 247916 181 247 0.000140915 40.8291 cl17362 Transglut_core superfamily - - "Transglutaminase-like superfamily; This family includes animal transglutaminases and other bacterial proteins of unknown function. 
Sequence conservation in this superfamily primarily involves three motifs that centre around conserved cysteine, histidine, and aspartate residues that form the catalytic triad in the structurally characterized transglutaminase, the human blood clotting factor XIIIa'. On the basis of the experimentally demonstrated activity of the Methanobacterium phage pseudomurein endoisopeptidase, it is proposed that many, if not all, microbial homologues of the transglutaminases are proteases and that the eukaryotic transglutaminases have evolved from an ancestral protease." Q#20343 - CGI_10004038 superfamily 247916 142 202 0.0030793 36.995 cl17362 Transglut_core superfamily C - "Transglutaminase-like superfamily; This family includes animal transglutaminases and other bacterial proteins of unknown function. Sequence conservation in this superfamily primarily involves three motifs that centre around conserved cysteine, histidine, and aspartate residues that form the catalytic triad in the structurally characterized transglutaminase, the human blood clotting factor XIIIa'. On the basis of the experimentally demonstrated activity of the Methanobacterium phage pseudomurein endoisopeptidase, it is proposed that many, if not all, microbial homologues of the transglutaminases are proteases and that the eukaryotic transglutaminases have evolved from an ancestral protease." Q#20344 - CGI_10006887 superfamily 245201 123 240 1.46E-07 50.2276 cl09925 PKc_like superfamily C - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." 
Q#20345 - CGI_10006888 superfamily 243092 176 410 9.46E-42 151.334 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#20345 - CGI_10006888 superfamily 243074 68 112 3.94E-11 58.6721 cl02535 F-box-like superfamily - - F-box-like; This is an F-box-like family. Q#20346 - CGI_10006889 superfamily 220964 5626 5695 3.61E-05 46.0549 cl12630 DUF2869 superfamily C - Protein of unknown function (DUF2869); This bacterial family of proteins has no known function. Q#20346 - CGI_10006889 superfamily 219525 528 563 0.00297599 39.3246 cl06646 GCC2_GCC3 superfamily N - GCC2 and GCC3; GCC2 and GCC3. Q#20349 - CGI_10006892 superfamily 245814 21 102 4.91E-11 59.8264 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. 
The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#20349 - CGI_10006892 superfamily 245814 115 147 5.69E-05 41.661 cl11960 Ig superfamily N - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#20351 - CGI_10006894 superfamily 241750 13 342 9.41E-158 455.596 cl00281 metallo-dependent_hydrolases superfamily - - "Superfamily of metallo-dependent hydrolases (also called amidohydrolase superfamily) is a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. 
The family includes urease alpha, adenosine deaminase, phosphotriesterase dihydroorotases, allantoinases, hydantoinases, AMP-, adenine and cytosine deaminases, imidazolonepropionase, aryldialkylphosphatase, chlorohydrolases, formylmethanofuran dehydrogenases and others." Q#20352 - CGI_10017058 superfamily 248097 194 308 1.66E-15 73.0682 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#20352 - CGI_10017058 superfamily 248097 387 516 1.22E-14 70.7941 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#20353 - CGI_10017059 superfamily 245248 10 444 1.73E-59 203.228 cl10080 RPE65 superfamily - - "Retinal pigment epithelial membrane protein; This family represents a retinal pigment epithelial membrane receptor which is abundantly expressed in retinal pigment epithelium, and binds plasma retinal binding protein. The family also includes the sequence related neoxanthin cleavage enzyme in plants and lignostilbene-alpha,beta-dioxygenase in bacteria." Q#20355 - CGI_10017061 superfamily 241563 38 73 0.00408687 35.5328 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#20356 - CGI_10017062 superfamily 204985 1156 1252 1.60E-37 140.385 cl14987 Chorein_N superfamily - - "N-terminal region of Chorein, a TM vesicle-mediated sorter; Although mutations in the full-length vacuolar protein sorting 13A (VPS13A) protein in vertebrates lead to the disease of chorea-acanthocytosis, the exact function of any of the regions within the protein is not yet known. This region is the proposed leucine zipper at the N-terminus. 
The full-length protein is a transmembrane protein with a presumed role in vesicle-mediated sorting and intracellular protein transport." Q#20356 - CGI_10017062 superfamily 241750 986 1156 5.51E-12 67.5986 cl00281 metallo-dependent_hydrolases superfamily N - "Superfamily of metallo-dependent hydrolases (also called amidohydrolase superfamily) is a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. The family includes urease alpha, adenosine deaminase, phosphotriesterase dihydroorotases, allantoinases, hydantoinases, AMP-, adenine and cytosine deaminases, imidazolonepropionase, aryldialkylphosphatase, chlorohydrolases, formylmethanofuran dehydrogenases and others." Q#20357 - CGI_10017063 superfamily 241563 17 49 0.000263826 38.9996 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#20360 - CGI_10017066 superfamily 218226 43 119 2.71E-05 39.1883 cl04703 ATP-synt_G superfamily - - "Mitochondrial ATP synthase g subunit; The Fo sector of the ATP synthase is a membrane bound complex which mediates proton transport. It is composed of nine different polypeptide subunits (a, b, c, d, e, f, g, F6, A6L). The function of subunit g is currently unknown. The conserved region covers all but the very N-terminus of the member sequences. No prokaryotic members have been identified thus far." 
Q#20361 - CGI_10017067 superfamily 242323 127 237 7.62E-19 79.4747 cl01132 FA_hydroxylase superfamily - - "Fatty acid hydroxylase superfamily; This superfamily includes fatty acid and carotene hydroxylases and sterol desaturases. Beta-carotene hydroxylase is involved in zeaxanthin synthesis by hydroxylating beta-carotene, but the enzyme may be involved in other pathways. This family includes C-5 sterol desaturase and C-4 sterol methyl oxidase. Members of this family are involved in cholesterol biosynthesis and biosynthesis of plant cuticular wax. These enzymes contain two copies of a HXHH motif. Members of this family are integral membrane proteins." Q#20363 - CGI_10017069 superfamily 219542 45 149 6.08E-41 144.692 cl18517 Cu-oxidase_3 superfamily - - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. Q#20363 - CGI_10017069 superfamily 215896 158 327 1.88E-21 91.5876 cl18351 Cu-oxidase superfamily - - Multicopper oxidase; Many of the proteins in this family contain multiple similar copies of this plastocyanin-like domain. Q#20363 - CGI_10017069 superfamily 219541 438 582 2.57E-19 85.2127 cl18516 Cu-oxidase_2 superfamily - - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. Q#20364 - CGI_10017070 superfamily 247724 186 339 2.77E-67 215.012 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#20364 - CGI_10017070 superfamily 247724 14 186 1.39E-66 213.086 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. 
Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#20365 - CGI_10017071 superfamily 246908 182 269 4.47E-30 112.621 cl15255 SH2 superfamily - - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." Q#20365 - CGI_10017071 superfamily 243034 11 99 2.30E-08 51.612 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of 
TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#20366 - CGI_10017072 superfamily 217380 59 372 1.88E-58 198.703 cl18406 TTL superfamily - - "Tubulin-tyrosine ligase family; Tubulins and microtubules are subjected to several post-translational modifications of which the reversible detyrosination/tyrosination of the carboxy-terminal end of most alpha-tubulins has been extensively analysed. This modification cycle involves a specific carboxypeptidase and the activity of the tubulin-tyrosine ligase (TTL). The true physiological function of TTL has so far not been established. Tubulin-tyrosine ligase (TTL) catalyzes the ATP-dependent post-translational addition of a tyrosine to the carboxy terminal end of detyrosinated alpha-tubulin. In normally cycling cells, the tyrosinated form of tubulin predominates. However, in breast cancer cells, the detyrosinated form frequently predominates, with a correlation to tumour aggressiveness. On the other hand, 3-nitrotyrosine has been shown to be incorporated, by TTL, into the carboxy terminal end of detyrosinated alpha-tubulin. This reaction is not reversible by the carboxypeptidase enzyme. Cells cultured in 3-nitrotyrosine rich medium showed evidence of altered microtubule structure and function, including altered cell morphology, epithelial barrier dysfunction, and apoptosis. Bacterial homologs of TTL are predicted to form peptide tags. Some of these are fused to a 2-oxoglutarate Fe(II)-dependent dioxygenase domain." 
Q#20367 - CGI_10017073 superfamily 217380 108 415 1.08E-53 187.532 cl18406 TTL superfamily - - "Tubulin-tyrosine ligase family; Tubulins and microtubules are subjected to several post-translational modifications of which the reversible detyrosination/tyrosination of the carboxy-terminal end of most alpha-tubulins has been extensively analysed. This modification cycle involves a specific carboxypeptidase and the activity of the tubulin-tyrosine ligase (TTL). The true physiological function of TTL has so far not been established. Tubulin-tyrosine ligase (TTL) catalyzes the ATP-dependent post-translational addition of a tyrosine to the carboxy terminal end of detyrosinated alpha-tubulin. In normally cycling cells, the tyrosinated form of tubulin predominates. However, in breast cancer cells, the detyrosinated form frequently predominates, with a correlation to tumour aggressiveness. On the other hand, 3-nitrotyrosine has been shown to be incorporated, by TTL, into the carboxy terminal end of detyrosinated alpha-tubulin. This reaction is not reversible by the carboxypeptidase enzyme. Cells cultured in 3-nitrotyrosine rich medium showed evidence of altered microtubule structure and function, including altered cell morphology, epithelial barrier dysfunction, and apoptosis. Bacterial homologs of TTL are predicted to form peptide tags. Some of these are fused to a 2-oxoglutarate Fe(II)-dependent dioxygenase domain." Q#20368 - CGI_10017074 superfamily 222090 284 490 7.18E-16 76.1574 cl18636 Methyltransf_22 superfamily - - Methyltransferase domain; This family appears to be a methyltransferase domain. Q#20369 - CGI_10017075 superfamily 243072 80 165 9.83E-07 48.1486 cl02529 ANK superfamily C - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#20369 - CGI_10017075 superfamily 149414 165 227 2.70E-20 86.5566 cl07091 TRP_2 superfamily - - Transient receptor ion channel II; This domain is found in the transient receptor ion channel (Trp) family of proteins. There is strong evidence that Trp proteins are structural elements of calcium-ion entry channels activated by G protein-coupled receptors. This domain does not tend to appear with the TRP domain (pfam06011) but is often found to the C-terminus of Ankyrin repeats (pfam00023). Q#20372 - CGI_10017078 superfamily 215724 40 347 1.70E-157 447.066 cl14706 wnt superfamily - - "wnt family; Wnt genes have been identified in vertebrates and invertebrates but not in plants, unicellular eukaryotes or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathway) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families." Q#20373 - CGI_10017079 superfamily 215724 6 313 1.58E-140 402.383 cl14706 wnt superfamily - - "wnt family; Wnt genes have been identified in vertebrates and invertebrates but not in plants, unicellular eukaryotes or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathway) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families." 
Q#20374 - CGI_10017080 superfamily 215724 40 396 1.62E-158 451.688 cl14706 wnt superfamily - - "wnt family; Wnt genes have been identified in vertebrates and invertebrates but not in plants, unicellular eukaryotes or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathway) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families." Q#20375 - CGI_10009916 superfamily 241584 522 607 2.60E-10 59.0471 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#20375 - CGI_10009916 superfamily 241584 428 520 8.87E-10 57.5063 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." 
Q#20375 - CGI_10009916 superfamily 241584 619 708 3.23E-09 55.9655 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#20375 - CGI_10009916 superfamily 241584 811 902 4.80E-09 55.1951 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#20375 - CGI_10009916 superfamily 245814 147 208 2.04E-06 47.0987 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#20375 - CGI_10009916 superfamily 241584 719 802 2.58E-06 47.1059 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#20375 - CGI_10009916 superfamily 245814 39 125 5.27E-13 67.118 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#20375 - CGI_10009916 superfamily 245814 251 327 4.31E-11 61.3672 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. 
A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#20375 - CGI_10009916 superfamily 245814 349 411 2.61E-10 58.7656 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#20378 - CGI_10009919 superfamily 241629 53 107 1.54E-20 80.8225 cl00133 SCP superfamily C - "SCP: SCP-like extracellular protein domain, found in eukaryotes and prokaryotes. This family includes plant pathogenesis-related protein 1 (PR-1), which accumulates after infections with pathogens, and may act as an anti-fungal agent or be involved in cell wall loosening. This family also includes CRISPs, mammalian cysteine-rich secretory proteins, which combine SCP with a C-terminal cysteine rich domain, and allergen 5 from vespid venom. Roles for CRISP, in response to pathogens, fertilization, and sperm maturation have been proposed. One member, Tex31 from the venom duct of Conus textile, has been shown to possess proteolytic activity sensitive to serine protease inhibitors. The human GAPR-1 protein has been reported to dimerize, and such a dimer may form an active site containing a catalytic triad. SCP has also been proposed to be a Ca++ chelating serine protease. 
The Ca++-chelating function would fit with various signaling processes that members of this family, such as the CRISPs, are involved in, and is supported by sequence and structural evidence of a conserved pocket containing two histidines and a glutamate. It also may explain how helothermine, a toxic peptide secreted by the beaded lizard, blocks Ca++ transporting ryanodine receptors. Little is known about the biological roles of the bacterial and archaeal SCP domains." Q#20382 - CGI_10003458 superfamily 247042 166 283 1.47E-10 62.2565 cl15693 Sema superfamily N - "The Sema domain, a protein interacting module, of semaphorins and plexins; Both semaphorins and plexins have a Sema domain on their N-termini. Plexins function as receptors for the semaphorins. Evolutionarily, plexins may be the ancestor of semaphorins. Semaphorins are regulatory molecules in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems, and cancer. Semaphorins can be divided into 7 classes. Vertebrates have members in classes 3-7, whereas classes 1 and 2 are known only in invertebrates. Class 2 and 3 semaphorins are secreted; classes 1 and 4 through 6 are transmembrane proteins; and class 7 is membrane associated via glycosylphosphatidylinositol (GPI) linkage. Plexins are a large family of transmembrane proteins, which are divided into four types (A-D) according to sequence similarity. In vertebrates, type A plexins serve as co-receptors for neuropilins to mediate the signalling of class 3 semaphorins. Plexins serve as direct receptors for several other members of the semaphorin family: class 6 semaphorins signal through type A plexins and class 4 semaphorins through type B plexins. This family also includes the MET and RON receptor tyrosine kinases. 
The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves to recognize and bind receptors." Q#20385 - CGI_10000562 superfamily 248372 7 114 8.49E-14 63.3724 cl17818 Nuc_deoxyrib_tr superfamily - - Nucleoside 2-deoxyribosyltransferase; Nucleoside 2-deoxyribosyltransferase EC:2.4.2.6 catalyzes the cleavage of the glycosidic bonds of 2`-deoxyribonucleosides. Q#20388 - CGI_10001360 superfamily 216502 5 128 9.74E-32 113.456 cl03209 Peptidase_M41 superfamily N - Peptidase family M41; Peptidase family M41. Q#20389 - CGI_10001361 superfamily 245213 363 399 5.76E-05 41.4682 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#20389 - CGI_10001361 superfamily 241571 218 306 9.62E-07 47.77 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#20391 - CGI_10001513 superfamily 241567 6 74 0.00485106 34.9066 cl00042 CASc superfamily N - "Caspase, interleukin-1 beta converting enzyme (ICE) homologues; Cysteine-dependent aspartate-directed proteases that mediate programmed cell death (apoptosis). Caspases are synthesized as inactive zymogens and activated by proteolysis of the peptide backbone adjacent to an aspartate. The resulting two subunits associate to form an (alpha)2(beta)2-tetramer which is the active enzyme. 
Activation of caspases can be mediated by other caspase homologs." Q#20394 - CGI_10010272 superfamily 241563 724 760 4.37E-05 42.6595 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#20394 - CGI_10010272 superfamily 243092 965 1075 0.00228072 40.0108 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#20394 - CGI_10010272 superfamily 110440 387 413 0.00714069 35.8465 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. 
The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#20394 - CGI_10010272 superfamily 149105 12 129 0.00761059 38.5701 cl12353 TMPIT superfamily C - "TMPIT-like protein; A number of members of this family are annotated as being transmembrane proteins induced by tumour necrosis factor alpha, but no literature was found to support this." Q#20395 - CGI_10010273 superfamily 246936 23 139 5.83E-42 140.055 cl15354 CBS_pair superfamily - - "The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. 
Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase)." Q#20396 - CGI_10010274 superfamily 128778 5 78 0.00113691 37.6295 cl17972 BBC superfamily N - B-Box C-terminal domain; Coiled coil region C-terminal to (some) B-Box domains Q#20397 - CGI_10010275 superfamily 246936 267 387 1.09E-44 152.381 cl15354 CBS_pair superfamily - - "The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase)." 
Q#20397 - CGI_10010275 superfamily 246936 111 239 7.17E-36 128.127 cl15354 CBS_pair superfamily - - "The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase)." Q#20401 - CGI_10010279 superfamily 247986 440 546 5.46E-13 68.939 cl17432 PBPb superfamily C - "Bacterial periplasmic transport systems use membrane-bound complexes and substrate-bound, membrane-associated, periplasmic binding proteins (PBPs) to transport a wide variety of substrates, such as, amino acids, peptides, sugars, vitamins and inorganic ions. PBPs have two cell-membrane translocation functions: bind substrate, and interact with the membrane bound complex. 
A diverse group of periplasmic transport receptors for lysine/arginine/ornithine (LAO), glutamine, histidine, sulfate, phosphate, molybdate, and methanol are included in the PBPb CD." Q#20401 - CGI_10010279 superfamily 197504 653 790 2.12E-14 71.9369 cl18192 PBPe superfamily - - Eukaryotic homologues of bacterial periplasmic substrate binding proteins; Prokaryotic homologues are represented by a separate alignment: PBPb Q#20402 - CGI_10010280 superfamily 241571 74 176 1.16E-10 57.037 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#20403 - CGI_10010281 superfamily 241583 171 262 5.70E-21 87.2414 cl00064 ZnMc superfamily NC - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#20404 - CGI_10010282 superfamily 241571 576 642 1.96E-09 55.8814 cl00049 CUB superfamily C - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#20404 - CGI_10010282 superfamily 241583 432 532 3.12E-28 111.894 cl00064 ZnMc superfamily N - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." 
Q#20406 - CGI_10010284 superfamily 247908 121 314 4.20E-109 319.071 cl17354 NIF superfamily - - NLI interacting factor-like phosphatase; This family contains a number of NLI interacting factor isoforms and also an N-terminal region of RNA polymerase II CTD phosphatase and FCP1 serine phosphatase. This region has been identified as the minimal phosphatase domain. Q#20406 - CGI_10010284 superfamily 241645 9 81 5.48E-17 74.104 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#20409 - CGI_10015654 superfamily 243035 81 196 2.11E-14 66.8745 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. 
Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. 
Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#20409 - CGI_10015654 superfamily 243035 198 266 4.92E-12 60.3875 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. 
Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#20410 - CGI_10015655 superfamily 243035 7 113 3.42E-20 79.9713 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#20413 - CGI_10015658 superfamily 219542 1 61 4.02E-20 82.6747 cl18517 Cu-oxidase_3 superfamily N - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. Q#20417 - CGI_10010645 superfamily 243077 240 261 0.00154839 36.7396 cl02542 DnaJ superfamily N - "DnaJ domain or J-domain. DnaJ/Hsp40 (heat shock protein 40) proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperonine family. Hsp40 proteins are characterized by the presence of a J domain, which mediates the interaction with Hsp70. They may contain other domains as well, and the architectures provide a means of classification." Q#20418 - CGI_10010646 superfamily 247684 7 382 0 774.593 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). 
Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#20419 - CGI_10010647 superfamily 247684 7 382 0 774.208 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#20420 - CGI_10010648 superfamily 220831 1 96 8.63E-41 132.82 cl11248 UPF0546 superfamily - - Uncharacterized protein family UPF0546; This family of proteins has no known function. Many members are annotated as potential transmembrane proteins. 
Q#20421 - CGI_10010649 superfamily 241622 1431 1522 3.45E-22 94.1706 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archebacteria, and eurkayotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#20421 - CGI_10010649 superfamily 241622 2229 2311 9.84E-21 89.9334 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archebacteria, and eurkayotes. 
This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#20421 - CGI_10010649 superfamily 241622 139 223 5.02E-18 82.2294 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archebacteria, and eurkayotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#20421 - CGI_10010649 superfamily 241622 1966 2046 1.35E-17 81.0738 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). 
Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archebacteria, and eurkayotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#20421 - CGI_10010649 superfamily 241622 1725 1807 1.59E-17 80.6886 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archebacteria, and eurkayotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. 
The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#20421 - CGI_10010649 superfamily 241622 1870 1951 1.24E-16 77.9922 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archebacteria, and eurkayotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#20421 - CGI_10010649 superfamily 241622 416 502 2.81E-15 74.1402 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. 
Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archebacteria, and eurkayotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#20421 - CGI_10010649 superfamily 241622 632 709 1.12E-13 69.5178 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archebacteria, and eurkayotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. 
PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#20421 - CGI_10010649 superfamily 241622 791 872 3.60E-13 67.977 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archebacteria, and eurkayotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#20421 - CGI_10010649 superfamily 241622 1599 1682 7.16E-13 67.2066 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. 
PDZ domains are found in diverse signaling proteins in bacteria, archebacteria, and eurkayotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#20421 - CGI_10010649 superfamily 241622 289 344 1.37E-12 66.4362 cl00117 PDZ superfamily N - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archebacteria, and eurkayotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." 
Q#20421 - CGI_10010649 superfamily 241622 2121 2177 4.52E-11 61.8138 cl00117 PDZ superfamily N - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archebacteria, and eurkayotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#20421 - CGI_10010649 superfamily 117611 12 64 1.14E-07 51.485 cl07602 L27_2 superfamily - - "L27_2; The L27_2 domain is a protein-protein interaction domain capable of organising scaffold proteins into supramolecular assemblies by formation of heteromeric L27_2 domain complexes. L27_2 domain-mediated protein assemblies have been shown to play essential roles in cellular processes including asymmetric cell division, establishment and maintenance of cell polarity, and clustering of receptors and ion channels. Members of this family form specific heterotetrameric complexes, in which each domain contains three alpha-helices. 
The two N-terminal helices of each L27_2 domain pack together to form a tight, four-helix bundle in the heterodimer, whilst the third helix of each L27_2 domain forms another four-helix bundle that assembles the two units of the heterodimer into a tetramer." Q#20421 - CGI_10010649 superfamily 241622 1265 1322 9.76E-05 43.1364 cl00117 PDZ superfamily C - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archebacteria, and eurkayotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#20422 - CGI_10010650 superfamily 244859 54 309 5.25E-12 64.1049 cl08171 HtrL_YibB superfamily - - "Bacterial protein of unknown function (HtrL_YibB); The protein from this rare, uncharacterized protein family is designated HtrL or YibB in E. coli, where its gene is found in a region of LPS core biosynthesis genes. Homologues are found in Shigella flexneri, Campylobacter jejuni, and Caenorhabditis elegans only. 
The htrL gene may represent an insertion to the LPS core biosynthesis region, rather than an LPS biosynthetic protein." Q#20425 - CGI_10010653 superfamily 241645 114 126 0.000787508 34.3808 cl00155 UBQ superfamily N - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#20427 - CGI_10010655 superfamily 245201 39 281 2.62E-73 235.209 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#20428 - CGI_10010656 superfamily 247725 34 141 1.45E-47 161.223 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. 
All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting proteins to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and other proteins." Q#20428 - CGI_10010656 superfamily 241620 207 249 7.31E-10 55.1373 cl00113 CRIB superfamily - - "PAK (p21 activated kinase) Binding Domain (PBD), binds Cdc42p- and/or Rho-like small GTPases; also known as the Cdc42/Rac interactive binding (CRIB) motif; has been shown to inhibit transcriptional activation and cell transformation mediated by the Ras-Rac pathway. CRIB-containing effector proteins are functionally diverse and include serine/threonine kinases, tyrosine kinases, actin-binding proteins, and adapter molecules." Q#20429 - CGI_10010657 superfamily 243091 338 447 1.30E-26 108.191 cl02566 SET superfamily - - "SET domain; SET domains are protein lysine methyltransferase enzymes. SET domains appear to be protein-protein interaction domains. It has been demonstrated that SET domains mediate interactions with a family of proteins that display similarity with dual-specificity phosphatases (dsPTPases). A subset of SET domains have been called PR domains. These domains are divergent in sequence from other SET domains, but also appear to mediate protein-protein interaction. The SET domain consists of two regions known as SET-N and SET-C. SET-C forms an unusual and conserved knot-like structure of probably functional importance. Additionally to SET-N and SET-C, an insert region (SET-I) and flanking regions of high structural variability form part of the overall structure."
Q#20430 - CGI_10005156 superfamily 247755 1152 1372 1.40E-115 362.967 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#20430 - CGI_10005156 superfamily 247755 466 683 1.79E-87 283.591 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#20430 - CGI_10005156 superfamily 216049 818 1101 3.78E-30 122.395 cl18356 ABC_membrane superfamily - - ABC transporter transmembrane region; This family represents a unit of six transmembrane helices. Many members of the ABC transporter family (pfam00005) have two such regions. Q#20430 - CGI_10005156 superfamily 216049 155 423 7.87E-17 81.5634 cl18356 ABC_membrane superfamily - - ABC transporter transmembrane region; This family represents a unit of six transmembrane helices. Many members of the ABC transporter family (pfam00005) have two such regions. 
Q#20431 - CGI_10005157 superfamily 201778 6 65 1.29E-08 51.0554 cl18219 GFO_IDH_MocA superfamily N - "Oxidoreductase family, NAD-binding Rossmann fold; This family of enzymes utilise NADP or NAD. This family is called the GFO/IDH/MOCA family in swiss-prot." Q#20431 - CGI_10005157 superfamily 217272 82 189 6.30E-06 43.292 cl18400 GFO_IDH_MocA_C superfamily - - "Oxidoreductase family, C-terminal alpha/beta domain; This family of enzymes utilise NADP or NAD. This family is called the GFO/IDH/MOCA family in swiss-prot." Q#20433 - CGI_10001872 superfamily 241867 58 302 3.88E-78 240.516 cl00446 Lactamase_B superfamily - - Metallo-beta-lactamase superfamily; Metallo-beta-lactamase superfamily. Q#20436 - CGI_10001875 superfamily 245608 1 204 2.81E-80 240.678 cl11421 FAA_hydrolase superfamily - - "Fumarylacetoacetate (FAA) hydrolase family; This family consists of fumarylacetoacetate (FAA) hydrolase, or fumarylacetoacetate hydrolase (FAH) and it also includes HHDD isomerase/OPET decarboxylase from E. coli strain W. FAA is the last enzyme in the tyrosine catabolic pathway, it hydrolyses fumarylacetoacetate into fumarate and acetoacetate which then join the citric acid cycle. Mutations in FAA cause type I tyrosinemia in humans this is an inherited disorder mainly affecting the liver leading to liver cirrhosis, hepatocellular carcinoma, renal tubular damages and neurologic crises amongst other symptoms. The enzymatic defect causes the toxic accumulation of phenylalanine/tyrosine catabolites. The E. coli W enzyme HHDD isomerase/OPET decarboxylase contains two copies of this domain and functions in fourth and fifth steps of the homoprotocatechuate pathway; here it decarboxylates OPET to HHDD and isomerises this to OHED. The final products of this pathway are pyruvic acid and succinic semialdehyde. This family also includes various hydratases and 4-oxalocrotonate decarboxylases which are involved in the bacterial meta-cleavage pathways for degradation of aromatic compounds. 
2-hydroxypentadienoic acid hydratase, encoded by mhpD in E. coli, is involved in the phenylpropionic acid pathway of E. coli and catalyzes the conversion of 2-hydroxy pentadienoate to 4-hydroxy-2-keto-pentanoate and uses a Mn2+ co-factor. OHED hydratase encoded by hpcG in E. coli is involved in the homoprotocatechuic acid (HPC) catabolism. XylI in P. putida is a 4-Oxalocrotonate decarboxylase." Q#20437 - CGI_10001449 superfamily 203841 2 118 5.59E-23 88.5488 cl17716 NAD_binding_6 superfamily N - Ferric reductase NAD binding domain; Ferric reductase NAD binding domain. Q#20438 - CGI_10001450 superfamily 241733 5 82 1.51E-30 106.855 cl00259 Sm_like superfamily - - "Sm and related proteins; The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes." Q#20439 - CGI_10001451 superfamily 244539 16 97 8.14E-18 80.4266 cl06868 FNR_like superfamily N - "Ferredoxin reductase (FNR), an FAD and NAD(P) binding protein, was initially identified as a chloroplast reductase activity, catalyzing the electron transfer from reduced iron-sulfur protein ferredoxin to NADP+ as the final step in the electron transport mechanism of photosystem I. FNR transfers electrons from reduced ferredoxin to FAD (forming FADH2 via a semiquinone intermediate) and then transfers a hydride ion to convert NADP+ to NADPH.
FNR has since been shown to utilize a variety of electron acceptors and donors and has a variety of physiological functions including nitrogen assimilation, dinitrogen fixation, steroid hydroxylation, fatty acid metabolism, oxygenase activity, and methane assimilation in many organisms. FNR has an NAD(P)-binding sub-domain of the alpha/beta class and a discrete (usually N-terminal) flavin sub-domain which vary in orientation with respect to the NAD(P) binding domain. The N-terminal moiety may contain a flavin prosthetic group (as in flavoenzymes) or use flavin as a substrate. Because flavins such as FAD can exist in oxidized, semiquinone (one-electron reduced), or fully reduced hydroquinone forms, FNR can interact with one- and two-electron carriers. FNR has a strong preference for NADP(H) vs NAD(H)." Q#20440 - CGI_10001452 superfamily 241733 1 36 1.83E-11 55.2386 cl00259 Sm_like superfamily N - "Sm and related proteins; The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes." Q#20444 - CGI_10018261 superfamily 215872 51 129 1.21E-10 53.7366 cl08261 Ribosomal_L6 superfamily - - Ribosomal protein L6; Ribosomal protein L6. Q#20444 - CGI_10018261 superfamily 215872 9 39 0.00618865 32.5506 cl08261 Ribosomal_L6 superfamily N - Ribosomal protein L6; Ribosomal protein L6.
Q#20445 - CGI_10018262 superfamily 227412 88 283 1.84E-26 103.703 cl18811 YIP1 superfamily N - "Rab GTPase interacting factor, Golgi membrane protein [Intracellular trafficking and secretion]" Q#20446 - CGI_10018263 superfamily 245814 248 316 7.55E-08 50.1803 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#20446 - CGI_10018263 superfamily 214507 174 230 3.49E-08 50.8916 cl15307 LRRCT superfamily - - Leucine rich repeat C-terminal domain; Leucine rich repeat C-terminal domain. Q#20447 - CGI_10018264 superfamily 247727 149 259 2.92E-06 44.3431 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#20448 - CGI_10018265 superfamily 247068 110 194 8.04E-14 63.8717 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#20450 - CGI_10018267 superfamily 241563 61 97 4.27E-05 43.0447 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#20452 - CGI_10018269 superfamily 248458 51 164 2.41E-14 71.5761 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. 
Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#20453 - CGI_10018270 superfamily 248281 3 64 4.11E-06 41.8723 cl17727 GT1 superfamily C - "GT1, myb-like, SANT family; GT-1, a myb-like protein, is one of the GT trihelix transcription factors. GT-1 binds the GT cis-element of rbcS-3A, a light-induced gene, as a dimer. Arabidopsis GT-1 is a trans-activator and acts in the stabilization of components of the transcription pre-initiation complex comprised of TFIIA-TBP-TATA. The isolated GT-1 DNA-binding domain is sufficient to bind DNA. This region closely resembles the myb domain, but with longer helices. It has been proposed that GT-1 may respond to light signals via calcium-dependent phosphorylation to create a light-modulated molecular switch. These proteins are members of the SANT/myb group. SANT is named after 'SWI3, ADA2, N-CoR and TFIIIB', several factors that share this domain. The SANT domain resembles the 3 alpha-helix bundle of the DNA-binding Myb domains and is found in a diverse set of proteins." Q#20455 - CGI_10018272 superfamily 201217 69 121 4.72E-09 54.0688 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat.
Q#20455 - CGI_10018272 superfamily 201217 278 329 2.66E-07 48.676 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#20455 - CGI_10018272 superfamily 205718 165 194 2.89E-06 45.559 cl16296 RCC1_2 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#20455 - CGI_10018272 superfamily 205718 51 80 8.68E-06 44.0182 cl16296 RCC1_2 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#20455 - CGI_10018272 superfamily 201217 232 275 2.27E-05 43.2832 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#20455 - CGI_10018272 superfamily 201217 137 177 0.00146722 37.8904 cl08266 RCC1 superfamily N - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#20455 - CGI_10018272 superfamily 205718 215 240 0.00436112 36.3142 cl16296 RCC1_2 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#20456 - CGI_10018273 superfamily 245599 182 433 3.39E-167 472.325 cl11397 NR_LBD superfamily - - "The ligand binding domain of nuclear receptors, a family of ligand-activated transcription regulators; Ligand-binding domain (LBD) of nuclear receptor (NR): Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions in metazoans, from development, reproduction, to homeostasis and metabolism. The superfamily contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. The members of the family include receptors of steroids, thyroid hormone, retinoids, cholesterol by-products, lipids and heme. 
With few exceptions, NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD)." Q#20456 - CGI_10018273 superfamily 207662 83 155 4.67E-53 173.877 cl02596 NR_DBD_like superfamily - - "DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers; DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. It interacts with a specific DNA site upstream of the target gene and modulates the rate of transcriptional initiation. Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions, from development, reproduction, to homeostasis and metabolism in animals (metazoans). The family contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). Most nuclear receptors bind as homodimers or heterodimers to their target sites, which consist of two hexameric half-sites. Specificity is determined by the half-site sequence, the relative orientation of the half-sites and the number of spacer nucleotides between the half-sites. However, a growing number of nuclear receptors have been reported to bind to DNA as monomers." Q#20459 - CGI_10018276 superfamily 247912 26 339 5.35E-35 133.01 cl17358 Beta-lactamase superfamily - - Beta-lactamase; This family appears to be distantly related to pfam00905 and PF00768 D-alanyl-D-alanine carboxypeptidase. 
Q#20459 - CGI_10018276 superfamily 221337 397 480 4.49E-07 47.6967 cl13401 DUF3471 superfamily - - "Domain of unknown function (DUF3471); This presumed domain is functionally uncharacterized. This domain is found in bacteria, archaea and eukaryotes. This domain is typically between 98 to 114 amino acids in length. This domain is found associated with pfam00144." Q#20460 - CGI_10018277 superfamily 247912 44 358 2.94E-38 142.64 cl17358 Beta-lactamase superfamily - - Beta-lactamase; This family appears to be distantly related to pfam00905 and PF00768 D-alanyl-D-alanine carboxypeptidase. Q#20460 - CGI_10018277 superfamily 221337 420 454 6.79E-06 44.2299 cl13401 DUF3471 superfamily C - "Domain of unknown function (DUF3471); This presumed domain is functionally uncharacterized. This domain is found in bacteria, archaea and eukaryotes. This domain is typically between 98 to 114 amino acids in length. This domain is found associated with pfam00144." Q#20461 - CGI_10018278 superfamily 247912 32 232 3.67E-30 115.291 cl17358 Beta-lactamase superfamily C - Beta-lactamase; This family appears to be distantly related to pfam00905 and PF00768 D-alanyl-D-alanine carboxypeptidase. Q#20465 - CGI_10008617 superfamily 248013 33 49 0.00027423 36.7824 cl17459 CHROMO superfamily NC - "Chromatin organization modifier (chromo) domain is a conserved region of around 50 amino acids found in a variety of chromosomal proteins, which appear to play a role in the functional organization of the eukaryotic nucleus. Experimental evidence implicates the chromo domain in the binding activity of these proteins to methylated histone tails and maybe RNA. May occur as single instance, in a tandem arrangement or followed by a related "chromo shadow" domain." 
Q#20466 - CGI_10008618 superfamily 207794 4 67 6.43E-21 85.4212 cl02948 GH20_hexosaminidase superfamily N - "Beta-N-acetylhexosaminidases of glycosyl hydrolase family 20 (GH20) catalyze the removal of beta-1,4-linked N-acetyl-D-hexosamine residues from the non-reducing ends of N-acetyl-beta-D-hexosaminides including N-acetylglucosides and N-acetylgalactosides. These enzymes are broadly distributed in microorganisms, plants and animals, and play roles in various key physiological and pathological processes. These processes include cell structural integrity, energy storage, cellular signaling, fertilization, pathogen defense, viral penetration, the development of carcinomas, inflammatory events and lysosomal storage disorders. The GH20 enzymes include the eukaryotic beta-N-acetylhexosaminidases A and B, the bacterial chitobiases, dispersin B, and lacto-N-biosidase. The GH20 hexosaminidases are thought to act via a catalytic mechanism in which the catalytic nucleophile is not provided by the solvent or the enzyme, but by the substrate itself." Q#20468 - CGI_10008620 superfamily 216363 96 197 3.65E-20 82.133 cl08312 UPF0029 superfamily - - Uncharacterized protein family UPF0029; Uncharacterized protein family UPF0029. Q#20470 - CGI_10008622 superfamily 216363 146 247 5.75E-21 85.2145 cl08312 UPF0029 superfamily - - Uncharacterized protein family UPF0029; Uncharacterized protein family UPF0029. Q#20474 - CGI_10002035 superfamily 217900 55 120 3.43E-23 91.4895 cl04403 APG9 superfamily C - "Autophagy protein Apg9; In yeast, 15 Apg proteins coordinate the formation of autophagosomes. Autophagy is a bulk degradation process induced by starvation in eukaryotic cells. Apg9 plays a direct role in the formation of the cytoplasm to vacuole targeting and autophagic vesicles, possibly serving as a marker for a specialised compartment essential for these vesicle-mediated alternative targeting pathways." 
Q#20476 - CGI_10002009 superfamily 198738 214 305 1.11E-28 106.583 cl02599 Ets superfamily - - Ets-domain; Ets-domain. Q#20477 - CGI_10001849 superfamily 217685 216 362 2.04E-27 107.42 cl04225 Cu2_monoox_C superfamily - - "Copper type II ascorbate-dependent monooxygenase, C-terminal domain; The N and C-terminal domains of members of this family adopt the same PNGase F-like fold." Q#20477 - CGI_10001849 superfamily 216290 80 201 5.65E-22 91.5809 cl03089 Cu2_monooxygen superfamily - - "Copper type II ascorbate-dependent monooxygenase, N-terminal domain; The N and C-terminal domains of members of this family adopt the same PNGase F-like fold." Q#20480 - CGI_10016063 superfamily 247805 24 98 1.69E-10 53.8023 cl17251 DEXDc superfamily C - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#20483 - CGI_10016066 superfamily 241619 617 681 0.000587186 39.1025 cl00112 PAN_APPLE superfamily - - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." 
Q#20486 - CGI_10016069 superfamily 245596 243 539 3.72E-149 437.021 cl11394 Glyco_tranf_GTA_type superfamily - - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#20486 - CGI_10016069 superfamily 247085 555 633 9.49E-07 47.5002 cl15820 RICIN superfamily C - "Ricin-type beta-trefoil; Carbohydrate-binding domain formed from presumed gene triplication. The domain is found in a variety of molecules serving diverse functions such as enzymatic activity, inhibitory toxicity and signal transduction. Highly specific ligand binding occurs on exposed surfaces of the compact domain structure." Q#20487 - CGI_10016070 superfamily 241575 107 155 2.12E-12 61.5195 cl00054 DSRM superfamily C - "Double-stranded RNA binding motif. Binding is not sequence specific but is highly specific for double stranded RNA. Found in a variety of proteins including dsRNA dependent protein kinase PKR, RNA helicases, Drosophila staufen protein, E. 
coli RNase III, RNases H1, and dsRNA dependent adenosine deaminases." Q#20487 - CGI_10016070 superfamily 241575 9 51 1.83E-06 44.5707 cl00054 DSRM superfamily C - "Double-stranded RNA binding motif. Binding is not sequence specific but is highly specific for double stranded RNA. Found in a variety of proteins including dsRNA dependent protein kinase PKR, RNA helicases, Drosophila staufen protein, E. coli RNase III, RNases H1, and dsRNA dependent adenosine deaminases." Q#20487 - CGI_10016070 superfamily 241575 251 317 0.000819931 36.8667 cl00054 DSRM superfamily - - "Double-stranded RNA binding motif. Binding is not sequence specific but is highly specific for double stranded RNA. Found in a variety of proteins including dsRNA dependent protein kinase PKR, RNA helicases, Drosophila staufen protein, E. coli RNase III, RNases H1, and dsRNA dependent adenosine deaminases." Q#20488 - CGI_10016071 superfamily 242025 57 119 5.41E-13 65.3455 cl00682 Alba superfamily - - Alba; Alba is a novel chromosomal protein that coats archaeal DNA without compacting it. Q#20488 - CGI_10016071 superfamily 216574 284 419 1.09E-10 59.914 cl14794 FAD_binding_4 superfamily - - "FAD binding domain; This family consists of various enzymes that use FAD as a co-factor, most of the enzymes are similar to oxygen oxidoreductase. One of the enzymes Vanillyl-alcohol oxidase (VAO) has a solved structure, the alignment includes the FAD binding site, called the PP-loop, between residues 99-110. The FAD molecule is covalently bound in the known structure, however the residue that links to the FAD is not in the alignment. VAO catalyzes the oxidation of a wide variety of substrates, ranging from aromatic amines to 4-alkylphenols. Other members of this family include D-lactate dehydrogenase, this enzyme catalyzes the conversion of D-lactate to pyruvate using FAD as a co-factor; mitomycin radical oxidase, this enzyme oxidises the reduced form of mitomycins and is involved in mitomycin resistance. 
This family includes MurB an UDP-N-acetylenolpyruvoylglucosamine reductase enzyme EC:1.1.1.158. This enzyme is involved in the biosynthesis of peptidoglycan." Q#20489 - CGI_10016072 superfamily 245201 17 297 1.34E-47 161.897 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#20492 - CGI_10002637 superfamily 216939 104 171 0.00061031 37.2573 cl03492 PC4 superfamily - - Transcriptional Coactivator p15 (PC4); p15 has a bipartite structure composed of an amino-terminal regulatory domain and a carboxy-terminal cryptic DNA-binding domain. The DNA-binding activity of the carboxy-terminal is disguised by the amino-terminal p15 domain. Activity is controlled by protein kinases that target the regulatory domain. Q#20493 - CGI_10002638 superfamily 243091 202 245 0.000868406 38.2421 cl02566 SET superfamily C - "SET domain; SET domains are protein lysine methyltransferase enzymes. SET domains appear to be protein-protein interaction domains. It has been demonstrated that SET domains mediate interactions with a family of proteins that display similarity with dual-specificity phosphatases (dsPTPases). A subset of SET domains have been called PR domains. These domains are divergent in sequence from other SET domains, but also appear to mediate protein-protein interaction. The SET domain consists of two regions known as SET-N and SET-C. SET-C forms an unusual and conserved knot-like structure of probably functional importance. 
Additionally to SET-N and SET-C, an insert region (SET-I) and flanking regions of high structural variability form part of the overall structure." Q#20494 - CGI_10002508 superfamily 245864 33 337 1.31E-53 185.945 cl12078 p450 superfamily C - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#20494 - CGI_10002508 superfamily 245864 343 395 4.35E-07 50.3546 cl12078 p450 superfamily C - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. 
Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#20495 - CGI_10002509 superfamily 245864 12 261 8.08E-84 260.674 cl12078 p450 superfamily N - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. 
their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#20496 - CGI_10002510 superfamily 245864 33 188 2.23E-20 86.9486 cl12078 p450 superfamily C - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#20498 - CGI_10028131 superfamily 241584 358 421 0.000329445 39.0167 cl00065 FN3 superfamily C - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. 
FN3-like domains are also found in bacterial glycosyl hydrolases." Q#20500 - CGI_10028133 superfamily 241754 628 944 1.31E-140 425.447 cl00286 Motor_domain superfamily - - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. Q#20501 - CGI_10028134 superfamily 219111 172 284 2.33E-16 72.899 cl05914 DUF1151 superfamily - - Protein of unknown function (DUF1151); This family consists of several hypothetical eukaryotic proteins of unknown function. Q#20502 - CGI_10028135 superfamily 247683 18 61 5.35E-07 42.3438 cl17036 SH3 superfamily N - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#20503 - CGI_10028136 superfamily 241597 99 162 1.54E-11 56.8674 cl00082 HMG-box superfamily - - "High Mobility Group (HMG)-box is found in a variety of eukaryotic chromosomal proteins and transcription factors. HMGs bind to the minor groove of DNA and have been classified by DNA binding preferences. Two phylogenically distinct groups of Class I proteins bind DNA in a sequence specific fashion and contain a single HMG box. 
One group (SOX-TCF) includes transcription factors, TCF-1, -3, -4; and also SRY and LEF-1, which bind four-way DNA junctions and duplex DNA targets. The second group (MATA) includes fungal mating type gene products MC, MATA1 and Ste11. Class II and III proteins (HMGB-UBF) bind DNA in a non-sequence specific fashion and contain two or more tandem HMG boxes. Class II members include non-histone chromosomal proteins, HMG1 and HMG2, which bind to bent or distorted DNA such as four-way DNA junctions, synthetic DNA cruciforms, kinked cisplatin-modified DNA, DNA bulges, cross-overs in supercoiled DNA, and can cause looping of linear DNA. Class III members include nucleolar and mitochondrial transcription factors, UBF and mtTF1, which bind four-way DNA junctions." Q#20503 - CGI_10028136 superfamily 241597 20 86 7.52E-10 52.245 cl00082 HMG-box superfamily - - "High Mobility Group (HMG)-box is found in a variety of eukaryotic chromosomal proteins and transcription factors. HMGs bind to the minor groove of DNA and have been classified by DNA binding preferences. Two phylogenically distinct groups of Class I proteins bind DNA in a sequence specific fashion and contain a single HMG box. One group (SOX-TCF) includes transcription factors, TCF-1, -3, -4; and also SRY and LEF-1, which bind four-way DNA junctions and duplex DNA targets. The second group (MATA) includes fungal mating type gene products MC, MATA1 and Ste11. Class II and III proteins (HMGB-UBF) bind DNA in a non-sequence specific fashion and contain two or more tandem HMG boxes. Class II members include non-histone chromosomal proteins, HMG1 and HMG2, which bind to bent or distorted DNA such as four-way DNA junctions, synthetic DNA cruciforms, kinked cisplatin-modified DNA, DNA bulges, cross-overs in supercoiled DNA, and can cause looping of linear DNA. Class III members include nucleolar and mitochondrial transcription factors, UBF and mtTF1, which bind four-way DNA junctions." 
Q#20504 - CGI_10028137 superfamily 241563 63 102 2.83E-05 41.8892 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#20505 - CGI_10028138 superfamily 241563 13 55 5.00E-06 43.622 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#20506 - CGI_10028139 superfamily 216421 240 397 2.27E-05 44.7209 cl03153 Lamp superfamily N - Lysosome-associated membrane glycoprotein (Lamp); Lysosome-associated membrane glycoprotein (Lamp). Q#20507 - CGI_10028140 superfamily 241773 26 139 3.99E-76 241.79 cl00312 Ribosomal_S12_like superfamily - - "Ribosomal protein S12-like family; composed of prokaryotic 30S ribosomal protein S12, eukaryotic 40S ribosomal protein S23 and similar proteins. S12 and S23 are located at the interface of the large and small ribosomal subunits, adjacent to the decoding center. They play an important role in translocation during the peptide elongation step of protein synthesis. They are also involved in important RNA and protein interactions. Ribosomal protein S12 is essential for maintenance of a pretranslocation state and, together with S13, functions as a control element for the rRNA- and tRNA-driven movements of translocation. S23 interacts with domain III of the eukaryotic elongation factor 2 (eEF2), which catalyzes translocation. Mutations in S12 and S23 have been found to affect translational accuracy. Antibiotics such as streptomycin may also bind S12/S23 and cause the ribosome to misread the genetic code." 
Q#20507 - CGI_10028140 superfamily 243066 168 275 1.90E-20 87.6729 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#20507 - CGI_10028140 superfamily 198867 284 383 1.22E-14 70.8332 cl06652 BACK superfamily - - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation)." Q#20507 - CGI_10028140 superfamily 243146 630 698 3.42E-09 53.7163 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#20507 - CGI_10028140 superfamily 243146 568 619 0.000380378 38.9686 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#20507 - CGI_10028140 superfamily 243146 420 454 0.00118139 37.5379 cl02701 Kelch_3 superfamily C - "Galactose oxidase, central domain; Galactose oxidase, central domain. 
" Q#20508 - CGI_10028141 superfamily 243072 379 539 2.19E-20 88.2094 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#20508 - CGI_10028141 superfamily 243072 603 702 1.61E-19 85.8982 cl02529 ANK superfamily N - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#20508 - CGI_10028141 superfamily 243072 96 221 2.90E-14 70.4902 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#20508 - CGI_10028141 superfamily 243072 22 131 3.14E-07 49.3042 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#20510 - CGI_10028143 superfamily 248264 338 497 1.09E-56 187.83 cl17710 DDE_4 superfamily - - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#20510 - CGI_10028143 superfamily 222263 259 346 1.84E-07 48.8533 cl16321 DDE_4_2 superfamily - - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#20510 - CGI_10028143 superfamily 243161 3 60 2.57E-05 42.3814 cl02739 THAP superfamily C - "THAP domain; The THAP domain is a putative DNA-binding domain (DBD) and probably also binds a zinc ion. It features the conserved C2CH architecture (consensus sequence: Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His). Other universal features include the location of the domain at the N-termini of proteins, its size of about 90 residues, a C-terminal AVPTIF box and several other conserved residues. 
Orthologues of the human THAP domain have been identified in other vertebrates and probably worms and flies, but not in other eukaryotes or any prokaryotes." Q#20511 - CGI_10028144 superfamily 241583 225 364 5.11E-31 118.245 cl00064 ZnMc superfamily C - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#20511 - CGI_10028144 superfamily 216572 33 139 1.55E-05 42.6471 cl03265 Pep_M12B_propep superfamily - - Reprolysin family propeptide; This region is the propeptide for members of peptidase family M12B. The propeptide contains a sequence motif similar to the "cysteine switch" of the matrixins. This motif is found at the C terminus of the alignment but is not well aligned. Q#20512 - CGI_10028145 superfamily 243072 178 335 3.12E-15 73.957 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#20512 - CGI_10028145 superfamily 243072 440 571 3.34E-11 62.0158 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#20512 - CGI_10028145 superfamily 243072 116 231 1.45E-07 50.845 cl02529 ANK superfamily N - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#20512 - CGI_10028145 superfamily 243072 316 370 0.000118167 41.6003 cl02529 ANK superfamily C - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#20512 - CGI_10028145 superfamily 241583 794 934 2.77E-30 120.942 cl00064 ZnMc superfamily C - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." 
Q#20513 - CGI_10028146 superfamily 245303 2 115 5.53E-57 180.967 cl10447 GH18_chitinase-like superfamily N - "The GH18 (glycosyl hydrolase, family 18) type II chitinases hydrolyze chitin, an abundant polymer of beta-1,4-linked N-acetylglucosamine (GlcNAc) which is a major component of the cell wall of fungi and the exoskeleton of arthropods. Chitinases have been identified in viruses, bacteria, fungi, protozoan parasites, insects, and plants. The structure of the GH18 domain is an eight-stranded beta/alpha barrel with a pronounced active-site cleft at the C-terminal end of the beta-barrel. The GH18 family includes chitotriosidase, chitobiase, hevamine, zymocin-alpha, narbonin, SI-CLP (stabilin-1 interacting chitinase-like protein), IDGF (imaginal disc growth factor), CFLE (cortical fragment-lytic enzyme) spore hydrolase, the type III and type V plant chitinases, the endo-beta-N-acetylglucosaminidases, and the chitolectins. The GH85 (glycosyl hydrolase, family 85) ENGases (endo-beta-N-acetylglucosaminidases) are closely related to the GH18 chitinases and are included in this alignment model." Q#20514 - CGI_10028147 superfamily 245815 204 597 0 537.787 cl11961 ALDH-SF superfamily - - "NAD(P)+-dependent aldehyde dehydrogenase superfamily; The aldehyde dehydrogenase superfamily (ALDH-SF) of NAD(P)+-dependent enzymes, in general, oxidize a wide range of endogenous and exogenous aliphatic and aromatic aldehydes to their corresponding carboxylic acids and play an important role in detoxification. Besides aldehyde detoxification, many ALDH isozymes possess multiple additional catalytic and non-catalytic functions such as participating in metabolic pathways, or as binding proteins, or osmoregulants, to mention a few. The enzyme has three domains, a NAD(P)+ cofactor-binding domain, a catalytic domain, and a bridging domain; and the active enzyme is generally either homodimeric or homotetrameric. 
The catalytic mechanism is proposed to involve cofactor binding, resulting in a conformational change and activation of an invariant catalytic cysteine nucleophile. The cysteine and aldehyde substrate form an oxyanion thiohemiacetal intermediate resulting in hydride transfer to the cofactor and formation of a thioacylenzyme intermediate. Hydrolysis of the thioacylenzyme and release of the carboxylic acid product occurs, and in most cases, the reduced cofactor dissociates from the enzyme. The evolutionary phylogenetic tree of ALDHs appears to have an initial bifurcation between what has been characterized as the classical aldehyde dehydrogenases, the ALDH family (ALDH) and extended family members or aldehyde dehydrogenase-like (ALDH-L) proteins. The ALDH proteins are represented by enzymes which share a number of highly conserved residues necessary for catalysis and cofactor binding and they include such proteins as retinal dehydrogenase, 10-formyltetrahydrofolate dehydrogenase, non-phosphorylating glyceraldehyde 3-phosphate dehydrogenase, delta(1)-pyrroline-5-carboxylate dehydrogenases, alpha-ketoglutaric semialdehyde dehydrogenase, alpha-aminoadipic semialdehyde dehydrogenase, coniferyl aldehyde dehydrogenase and succinate-semialdehyde dehydrogenase. Included in this larger group are all human, Arabidopsis, Tortula, fungal, protozoan, and Drosophila ALDHs identified in families ALDH1 through ALDH22 with the exception of families ALDH18, ALDH19, and ALDH20 which are present in the ALDH-like group. The ALDH-like group is represented by such proteins as gamma-glutamyl phosphate reductase, LuxC-like acyl-CoA reductase, and coenzyme A acylating aldehyde dehydrogenase. All of these proteins have a conserved cysteine that aligns with the catalytic cysteine of the ALDH group." 
Q#20514 - CGI_10028147 superfamily 241871 8 181 7.23E-86 272.383 cl00452 AAK superfamily C - "Amino Acid Kinases (AAK) superfamily, catalytic domain; present in such enzymes like N-acetylglutamate kinase (NAGK), carbamate kinase (CK), aspartokinase (AK), glutamate-5-kinase (G5K) and UMP kinase (UMPK). The AAK superfamily includes kinases that phosphorylate a variety of amino acid substrates. These kinases catalyze the formation of phosphoric anhydrides, generally with a carboxylate, and use ATP as the source of the phosphoryl group; are involved in amino acid biosynthesis. Some of these kinases control the process via allosteric feed-back inhibition." Q#20515 - CGI_10028148 superfamily 246723 17 461 0 742.073 cl14813 GluZincin superfamily - - "Peptidase Gluzincin family (thermolysin-like proteinases, TLPs) includes peptidases M1, M2, M3, M4, M13, M32 and M36 (fungalysins); Gluzincin family (thermolysin-like peptidases or TLPs) includes several zinc-dependent metallopeptidases such as the M1, M2, M3, M4, M13, M32, M36 peptidases (MEROPS classification), and contain HEXXH and EXXXD motifs as part of their active site. All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. M1 family includes aminopeptidase N (APN) and leukotriene A4 hydrolase (LTA4H). APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types. LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity such that the two activities occupy different, but overlapping sites. The peptidase M3 or neurolysin-like family, includes M3, M2 and M32 metallopeptidases. 
The M3 peptidases have two subfamilies: M3A, includes thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (3.4.24.16), and the mitochondrial intermediate peptidase; M3B contains oligopeptidase F. M2 peptidase angiotensin converting enzyme (ACE, EC 3.4.15.1) catalyzes the conversion of decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II. ACE is a key part of the renin-angiotensin system that regulates blood pressure, thus ACE inhibitors are important for the treatment of hypertension. M32 family includes two eukaryotic enzymes from protozoa Trypanosoma cruzi, a causative agent of Chagas' disease, and Leishmania major, a parasite that causes leishmaniasis, making them attractive targets for drug development. The M4 family includes secreted protease thermolysin (EC 3.4.24.27), pseudolysin, aureolysin, neutral protease as well as fungalysin and bacillolysin (EC 3.4.24.28) that degrade extracellular proteins and peptides for bacterial nutrition, especially prior to sporulation. Thermolysin is widely used as a nonspecific protease to obtain fragments for peptide sequencing as well as in production of the artificial sweetener aspartame. M13 family includes neprilysin (EC 3.4.24.11) and endothelin-converting enzyme I (ECE-1, EC 3.4.24.71), which fulfill a broad range of physiological roles due to the greater variation in the S2' subsite allowing substrate specificity and are prime therapeutic targets for selective inhibition. Peptidase M36 (fungalysin) family includes endopeptidases from pathogenic fungi. Fungalysin hydrolyzes extracellular matrix proteins such as elastin and keratin. Aspergillus fumigatus causes the pulmonary disease aspergillosis by invading the lungs of immuno-compromised animals and secreting fungalysin that possibly breaks down proteinaceous structural barriers." 
Q#20517 - CGI_10028150 superfamily 241578 4 131 4.23E-11 61.813 cl00057 vWFA superfamily C - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#20518 - CGI_10028151 superfamily 248097 93 217 9.14E-22 87.3206 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#20518 - CGI_10028151 superfamily 128778 4 50 0.0044113 34.9331 cl17972 BBC superfamily C - B-Box C-terminal domain; Coiled coil region C-terminal to (some) B-Box domains Q#20521 - CGI_10028154 superfamily 247744 59 249 1.94E-35 128.894 cl17190 NK superfamily - - "Nucleoside/nucleotide kinase (NK) is a protein superfamily consisting of multiple families of enzymes that share structural similarity and are functionally related to the catalysis of the reversible phosphate group transfer from nucleoside triphosphates to nucleosides/nucleotides, nucleoside monophosphates, or sugars. 
Members of this family play a wide variety of essential roles in nucleotide metabolism, the biosynthesis of coenzymes and aromatic compounds, as well as the metabolism of sugar and sulfate." Q#20521 - CGI_10028154 superfamily 247744 270 323 1.95E-15 72.6546 cl17190 NK superfamily C - "Nucleoside/nucleotide kinase (NK) is a protein superfamily consisting of multiple families of enzymes that share structural similarity and are functionally related to the catalysis of the reversible phosphate group transfer from nucleoside triphosphates to nucleosides/nucleotides, nucleoside monophosphates, or sugars. Members of this family play a wide variety of essential roles in nucleotide metabolism, the biosynthesis of coenzymes and aromatic compounds, as well as the metabolism of sugar and sulfate." Q#20521 - CGI_10028154 superfamily 213107 17 54 0.00213244 35.7088 cl02594 DD_R_PKA superfamily - - "Dimerization/Docking domain of the Regulatory subunit of cAMP-dependent protein kinase and similar domains; cAMP-dependent protein kinase (PKA) is a serine/threonine kinase (STK), catalyzing the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The inactive PKA holoenzyme is a heterotetramer composed of two phosphorylated and active catalytic subunits with a dimer of regulatory (R) subunits. Activation is achieved through the binding of the important second messenger cAMP to the R subunits, which leads to the dissociation of PKA into the R dimer and two active subunits. There are two classes of R subunits, RI and RII; each exists as two isoforms (alpha and beta) from distinct genes. These functionally non-redundant R isoforms allow for specificity in PKA signaling. The R subunit contains an N-terminal dimerization/docking (D/D) domain, a linker with an inhibitory sequence (IS), and two c-AMP binding domains. 
RI and RII subunits are distinguished by their IS; RII subunits contain a phosphorylation site and are both substrates and inhibitors while RI subunits are pseudo-substrates. RI subunits require ATP and Mg ions to form a stable holoenzyme while RII subunits do not. The D/D domain dimerizes to form a four-helix bundle that serves as a docking site for A-kinase-anchoring proteins (AKAPs), which facilitates the localization of PKA to specific sites in the cell. PKA is present ubiquitously in cells and interacts with many different downstream targets. It plays a role in the regulation of diverse processes such as growth, development, memory, metabolism, gene expression, immunity, and lipolysis." Q#20522 - CGI_10028155 superfamily 241622 76 147 4.05E-15 71.4438 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." 
Q#20522 - CGI_10028155 superfamily 243096 238 419 6.43E-10 57.6106 cl02571 RhoGEF superfamily - - Guanine nucleotide exchange factor for Rho/Rac/Cdc42-like GTPases; Also called Dbl-homologous (DH) domain. It appears that PH domains invariably occur C-terminal to RhoGEF/DH domains. Q#20522 - CGI_10028155 superfamily 246669 6 32 0.000249656 40.1296 cl14603 C2 superfamily N - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#20526 - CGI_10028161 superfamily 241592 9 133 6.96E-26 97.3433 cl00074 H2A superfamily - - "Histone 2A; H2A is a subunit of the nucleosome. The nucleosome is an octamer containing two H2A, H2B, H3, and H4 subunits. The H2A subunit performs essential roles in maintaining structural integrity of the nucleosome, chromatin condensation, and binding of specific chromatin-associated proteins." 
Q#20528 - CGI_10028163 superfamily 207794 174 392 7.64E-102 308.372 cl02948 GH20_hexosaminidase superfamily C - "Beta-N-acetylhexosaminidases of glycosyl hydrolase family 20 (GH20) catalyze the removal of beta-1,4-linked N-acetyl-D-hexosamine residues from the non-reducing ends of N-acetyl-beta-D-hexosaminides including N-acetylglucosides and N-acetylgalactosides. These enzymes are broadly distributed in microorganisms, plants and animals, and play roles in various key physiological and pathological processes. These processes include cell structural integrity, energy storage, cellular signaling, fertilization, pathogen defense, viral penetration, the development of carcinomas, inflammatory events and lysosomal storage disorders. The GH20 enzymes include the eukaryotic beta-N-acetylhexosaminidases A and B, the bacterial chitobiases, dispersin B, and lacto-N-biosidase. The GH20 hexosaminidases are thought to act via a catalytic mechanism in which the catalytic nucleophile is not provided by the solvent or the enzyme, but by the substrate itself." Q#20528 - CGI_10028163 superfamily 111707 53 172 5.42E-16 73.6032 cl03741 Glyco_hydro_20b superfamily - - "Glycosyl hydrolase family 20, domain 2; This domain has a zincin-like fold." 
Q#20532 - CGI_10028167 superfamily 247684 2 367 2.98E-179 514.552 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#20534 - CGI_10028169 superfamily 147626 55 110 0.000228059 38.7755 cl05227 DUF1519 superfamily N - Protein of unknown function (DUF1519); This family consists of several putative homing endonuclease proteins of around 245 residues in length which appear to be found exclusively in Naegleria species. The function of this family is unclear. 
Q#20536 - CGI_10028171 superfamily 243050 141 196 1.95E-37 131.318 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#20536 - CGI_10028171 superfamily 241599 214 272 8.19E-20 82.6764 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#20536 - CGI_10028171 superfamily 243050 44 99 1.47E-28 106.401 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. 
The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#20538 - CGI_10028173 superfamily 247684 26 399 0 586.199 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#20539 - CGI_10028174 superfamily 241995 1 249 2.28E-76 236.344 cl00635 Ntn_Asparaginase_2_like superfamily N - "Ntn-hydrolase superfamily, L-Asparaginase type 2-like enzymes. 
This family includes Glycosylasparaginase, Taspase 1 and L-Asparaginase type 2 enzymes. Glycosylasparaginase catalyzes the hydrolysis of the glycosylamide bond of asparagine-linked glycoprotein. Taspase1 catalyzes the cleavage of the Mix Lineage Leukemia (MLL) nuclear protein and transcription factor TFIIA. L-Asparaginase type 2 hydrolyzes L-asparagine to L-aspartate and ammonia. The proenzymes of this family undergo autoproteolytic cleavage before a threonine to generate alpha and beta subunits. The threonine becomes the N-terminal residue of the beta subunit and is the catalytic residue." Q#20540 - CGI_10028175 superfamily 247725 46 185 1.07E-78 248.122 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#20540 - CGI_10028175 superfamily 219103 204 322 6.95E-56 186.038 cl05893 Myotub-related superfamily - - "Myotubularin-related; This family represents a region within eukaryotic myotubularin-related proteins that is sometimes found with pfam02893. Myotubularin is a dual-specific lipid phosphatase that dephosphorylates phosphatidylinositol 3-phosphate and phosphatidylinositol (3,5)-bi-phosphate. Mutations in gene encoding myotubularin-related proteins have been associated with disease." 
Q#20540 - CGI_10028175 superfamily 206020 382 436 1.18E-35 128.009 cl18286 Y_phosphatase_m superfamily - - "Myotubularin Y_phosphatase-like; This short region is highly conserved and seems to be common to many myotubularin proteins with protein tyrosine pyrophosphate activity. As the family has a number of highly conserved residues such as histidine, cysteine, glutamine and aspartate, it is possible that this represents a catalytic core of the active enzymatic part of the proteins." Q#20543 - CGI_10028178 superfamily 245814 733 802 1.28E-10 59.8103 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#20543 - CGI_10028178 superfamily 245213 65 101 1.00E-09 56.1058 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#20543 - CGI_10028178 superfamily 245213 33 63 1.74E-05 43.7794 cl09941 EGF_CA superfamily N - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#20543 - CGI_10028178 superfamily 245201 836 1160 4.74E-126 395.758 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#20543 - CGI_10028178 superfamily 245814 345 444 5.42E-05 43.1397 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#20544 - CGI_10028179 superfamily 114091 1 79 1.11E-20 80.1902 cl05088 UMP1 superfamily N - Proteasome maturation factor UMP1; UMP1 is a short-lived chaperone present in the precursor form of the 20S proteasome and absent in the mature complex. UMP1 is required for the correct assembly and enzymatic activation of the proteasome. UMP1 seems to be degraded by the proteasome upon its formation Q#20545 - CGI_10028180 superfamily 241640 43 121 8.14E-29 107.379 cl00149 Tryp_SPc superfamily C - Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues. Q#20545 - CGI_10028180 superfamily 241640 121 169 1.64E-13 64.605 cl00149 Tryp_SPc superfamily N - Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues. Q#20547 - CGI_10028182 superfamily 245033 1 115 7.99E-21 82.2899 cl09208 Tim44 superfamily N - Tim44-like domain; Tim44 is an essential component of the machinery that mediates the translocation of nuclear-encoded proteins across the mitochondrial inner membrane. Tim44 is thought to bind phospholipids of the mitochondrial inner membrane both by electrostatic interactions and by penetrating the polar head group region. This family includes the C-terminal region of Tim44 that has been shown to form a stable proteolytic fragment in yeast. This region is also found in a set of smaller bacterial proteins. The molecular function of the bacterial members of this family is unknown but transport seems likely. 
The crystal structure of the C terminal of Tim44 has revealed a large hydrophobic pocket which might play an important role in interacting with the acyl chains of lipid molecules in the mitochondrial membrane. Q#20548 - CGI_10028183 superfamily 241754 16 699 0 1026.37 cl00286 Motor_domain superfamily - - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. Q#20548 - CGI_10028183 superfamily 218855 880 1079 4.27E-30 119.329 cl10652 Myosin_TH1 superfamily - - Myosin tail; Myosin tail. Q#20549 - CGI_10028184 superfamily 241584 340 423 5.68E-15 70.2179 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#20549 - CGI_10028184 superfamily 246664 157 263 7.52E-44 158.507 cl14561 An_peroxidase_like superfamily N - "Animal heme peroxidases and related proteins; A diverse family of enzymes, which includes prostaglandin G/H synthase, thyroid peroxidase, myeloperoxidase, linoleate diol synthase, lactoperoxidase, peroxinectin, peroxidasin, and others. Despite its name, this family is not restricted to metazoans: members are found in fungi, plants, and bacteria as well." 
Q#20549 - CGI_10028184 superfamily 246664 64 153 1.41E-15 77.3501 cl14561 An_peroxidase_like superfamily C - "Animal heme peroxidases and related proteins; A diverse family of enzymes, which includes prostaglandin G/H synthase, thyroid peroxidase, myeloperoxidase, linoleate diol synthase, lactoperoxidase, peroxinectin, peroxidasin, and others. Despite its name, this family is not restricted to metazoans: members are found in fungi, plants, and bacteria as well." Q#20549 - CGI_10028184 superfamily 243091 27 63 0.00019722 39.6251 cl02566 SET superfamily C - "SET domain; SET domains are protein lysine methyltransferase enzymes. SET domains appear to be protein-protein interaction domains. It has been demonstrated that SET domains mediate interactions with a family of proteins that display similarity with dual-specificity phosphatases (dsPTPases). A subset of SET domains have been called PR domains. These domains are divergent in sequence from other SET domains, but also appear to mediate protein-protein interaction. The SET domain consists of two regions known as SET-N and SET-C. SET-C forms an unusual and conserved knot-like structure of probably functional importance. Additionally to SET-N and SET-C, an insert region (SET-I) and flanking regions of high structural variability form part of the overall structure." Q#20549 - CGI_10028184 superfamily 241584 280 328 0.00843439 34.3943 cl00065 FN3 superfamily N - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." 
Q#20552 - CGI_10028187 superfamily 248097 153 285 3.88E-19 80.7722 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#20553 - CGI_10028188 superfamily 234278 96 308 6.86E-19 86.8937 cl15938 non_repeat_PQQ superfamily C - "dehydrogenase, PQQ-dependent, s-GDH family; PQQ, or pyrroloquinoline-quinone, serves as a cofactor for a number of sugar and alcohol dehydrogenases in a limited number of bacterial species. Most characterized PQQ-dependent enzymes have multiple repeats of a sequence region described by pfam01011 (PQQ enzyme repeat), but this protein family is unusual in lacking that repeat. Below the noise cutoff are related proteins mostly from species that lack PQQ biosynthesis." Q#20554 - CGI_10028189 superfamily 247986 14 103 3.18E-09 55.457 cl17432 PBPb superfamily C - "Bacterial periplasmic transport systems use membrane-bound complexes and substrate-bound, membrane-associated, periplasmic binding proteins (PBPs) to transport a wide variety of substrates, such as, amino acids, peptides, sugars, vitamins and inorganic ions. PBPs have two cell-membrane translocation functions: bind substrate, and interact with the membrane bound complex. A diverse group of periplasmic transport receptors for lysine/arginine/ornithine (LAO), glutamine, histidine, sulfate, phosphate, molybdate, and methanol are included in the PBPb CD." 
Q#20554 - CGI_10028189 superfamily 197504 232 362 2.58E-12 63.8477 cl18192 PBPe superfamily - - Eukaryotic homologues of bacterial periplasmic substrate binding proteins; Prokaryotic homologues are represented by a separate alignment: PBPb Q#20555 - CGI_10028190 superfamily 241613 69 102 1.13E-07 44.1198 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#20555 - CGI_10028190 superfamily 241613 28 64 0.00574187 31.023 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#20556 - CGI_10028191 superfamily 221574 
331 411 4.36E-25 102.471 cl13819 DUF3677 superfamily - - "Protein of unknown function (DUF3677); This domain family is found in eukaryotes, and is approximately 80 amino acids in length." Q#20558 - CGI_10028193 superfamily 217408 674 1155 1.34E-51 192.755 cl15645 Nucleoporin_C superfamily - - "Non-repetitive/WGA-negative nucleoporin C-terminal; This is the C-termainl half of a family of nucleoporin proteins. Nucleoporins are the main components of the nuclear pore complex in eukaryotic cells, and mediate bidirectional nucleocytoplasmic transport, especially of mRNA and proteins. Two nucleoporin classes are known: one is characterized by the FG repeat pfam03093; the other is represented by this family, and lacks any repeats. RNA undergoing nuclear export first encounters the basket of the nuclear pore and many nucleoporins are accessible on the basket side of the pore." Q#20558 - CGI_10028193 superfamily 217408 1154 1370 1.45E-35 143.065 cl15645 Nucleoporin_C superfamily N - "Non-repetitive/WGA-negative nucleoporin C-terminal; This is the C-termainl half of a family of nucleoporin proteins. Nucleoporins are the main components of the nuclear pore complex in eukaryotic cells, and mediate bidirectional nucleocytoplasmic transport, especially of mRNA and proteins. Two nucleoporin classes are known: one is characterized by the FG repeat pfam03093; the other is represented by this family, and lacks any repeats. RNA undergoing nuclear export first encounters the basket of the nuclear pore and many nucleoporins are accessible on the basket side of the pore." Q#20559 - CGI_10028194 superfamily 247727 195 279 4.43E-11 58.9806 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). 
There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#20560 - CGI_10028195 superfamily 247057 36 110 5.31E-27 97.3738 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#20561 - CGI_10028196 superfamily 241625 20 143 3.37E-15 67.3504 cl00123 PROF superfamily - - "Profilin binds actin monomers, membrane polyphosphoinositides such as PI(4,5)P2, and poly-L-proline. Profilin can inhibit actin polymerization into F-actin by binding to monomeric actin (G-actin) and terminal F-actin subunits, but - as a regulator of the cytoskeleton - it may also promote actin polymerization. It plays a role in the assembly of branched actin filament networks, by activating WASP via binding to WASP's proline rich domain. Profilin may link the cytoskeleton with major signalling pathways by interacting with components of the phosphatidylinositol cycle and Ras pathway." 
Q#20562 - CGI_10028197 superfamily 241733 8 83 4.32E-51 171.99 cl00259 Sm_like superfamily - - "Sm and related proteins; The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes." Q#20562 - CGI_10028197 superfamily 245852 194 501 9.81E-14 70.4562 cl12050 TraB superfamily - - "TraB family; pAD1 is a hemolysin/bacteriocin plasmid originally identified in Enterococcus faecalis DS16. It encodes a mating response to a peptide sex pheromone, cAD1, secreted by recipient bacteria. Once the plasmid pAD1 is acquired, production of the pheromone ceases--a trait related in part to a determinant designated traB. However a related protein is found in C. elegans, suggesting that members of the TraB family have some more general function. This family also includes the bacterial GumN protein. The family has a conserved GXXH motif close to the N-terminus, a conserved glutamate and a conserved arginine that may be catalytic. The family also includes a second conserved GXXH motif near the C-terminus." Q#20563 - CGI_10028198 superfamily 243310 30 270 7.30E-74 229.818 cl03120 ELO superfamily - - "GNS1/SUR4 family; Members of this family are involved in long chain fatty acid elongation systems that produce the 26-carbon precursors for ceramide and sphingolipid synthesis. Predicted to be integral membrane proteins, in eukaryotes they are probably located on the endoplasmic reticulum. 
Yeast ELO3 affects plasma membrane H+-ATPase activity, and may act on a glucose-signaling pathway that controls the expression of several genes that are transcriptionally regulated by glucose such as PMA1." Q#20564 - CGI_10028199 superfamily 247856 76 135 6.90E-09 50.6241 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#20564 - CGI_10028199 superfamily 247856 146 212 0.000150855 38.2977 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#20565 - CGI_10028200 superfamily 217380 125 410 8.90E-78 256.098 cl18406 TTL superfamily - - "Tubulin-tyrosine ligase family; Tubulins and microtubules are subjected to several post-translational modifications of which the reversible detyrosination/tyrosination of the carboxy-terminal end of most alpha-tubulins has been extensively analysed. This modification cycle involves a specific carboxypeptidase and the activity of the tubulin-tyrosine ligase (TTL). The true physiological function of TTL has so far not been established. Tubulin-tyrosine ligase (TTL) catalyzes the ATP-dependent post-translational addition of a tyrosine to the carboxy terminal end of detyrosinated alpha-tubulin. In normally cycling cells, the tyrosinated form of tubulin predominates. 
However, in breast cancer cells, the detyrosinated form frequently predominates, with a correlation to tumour aggressiveness. On the other hand, 3-nitrotyrosine has been shown to be incorporated, by TTL, into the carboxy terminal end of detyrosinated alpha-tubulin. This reaction is not reversible by the carboxypeptidase enzyme. Cells cultured in 3-nitrotyrosine rich medium showed evidence of altered microtubule structure and function, including altered cell morphology, epithelial barrier dysfunction, and apoptosis. Bacterial homologs of TTL are predicted to form peptide tags. Some of these are fused to a 2-oxoglutarate Fe(II)-dependent dioxygenase domain." Q#20565 - CGI_10028200 superfamily 248034 593 653 0.00101916 41.763 cl17480 Herpes_ICP4_N superfamily NC - "Herpesvirus ICP4-like protein N-terminal region; The immediate-early protein ICP4 (infected-cell polypeptide 4) is required for efficient transcription of early and late viral genes and is thus essential for productive infection. ICP4 is a large phosphoprotein that binds DNA in a sequence specific manner as a homodimer. ICP4 represses transcription from LAT, ICP4 and ORF-P that have high-affinity a ICP4 binding site that spans the transcription initiation site. ICP4 proteins have two highly conserved regions, this family contains the N-terminal region that contains sites for DNA binding and homodimerisation." Q#20567 - CGI_10028202 superfamily 241972 21 116 5.80E-31 108.05 cl00600 Ribosomal_L7Ae superfamily - - "Ribosomal protein L7Ae/L30e/S12e/Gadd45 family; This family includes: Ribosomal L7A from metazoa, Ribosomal L8-A and L8-B from fungi, 30S ribosomal protein HS6 from archaebacteria, 40S ribosomal protein S12 from eukaryotes, Ribosomal protein L30 from eukaryotes and archaebacteria. Gadd45 and MyD118." 
Q#20568 - CGI_10028203 superfamily 245226 279 331 1.01E-07 49.9917 cl10012 DnaQ_like_exo superfamily C - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#20569 - CGI_10028205 superfamily 247794 54 96 0.000459396 39.9077 cl17240 FDH_GDH_like superfamily NC - "Formate/glycerate dehydrogenases, D-specific 2-hydroxy acid dehydrogenases and related dehydrogenases; The formate/glycerate dehydrogenase like family contains a diverse group of enzymes such as formate dehydrogenase (FDH), glycerate dehydrogenase (GDH), D-lactate dehydrogenase, L-alanine dehydrogenase, and S-Adenosylhomocysteine hydrolase, that share a common 2-domain structure. 
Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar domains of the alpha/beta Rossmann fold NAD+ binding form. The NAD(P) binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD(P) is bound, primarily to the C-terminal portion of the 2nd (internal) domain. While many members of this family are dimeric, alanine DH is hexameric and phosphoglycerate DH is tetrameric. 2-hydroxyacid dehydrogenases are enzymes that catalyze the conversion of a wide variety of D-2-hydroxy acids to their corresponding keto acids. The general mechanism is (R)-lactate + acceptor to pyruvate + reduced acceptor. Formate dehydrogenase (FDH) catalyzes the NAD+-dependent oxidation of formate ion to carbon dioxide with the concomitant reduction of NAD+ to NADH. FDHs of this family contain no metal ions or prosthetic groups. Catalysis occurs though direct transfer of a hydride ion to NAD+ without the stages of acid-base catalysis typically found in related dehydrogenases." Q#20571 - CGI_10028207 superfamily 217390 156 286 0.000165357 39.8505 cl18407 TPT superfamily - - Triose-phosphate Transporter family; This family includes transporters with a specificity for triose phosphate. Q#20572 - CGI_10028208 superfamily 247744 95 294 2.77E-89 275.587 cl17190 NK superfamily - - "Nucleoside/nucleotide kinase (NK) is a protein superfamily consisting of multiple families of enzymes that share structural similarity and are functionally related to the catalysis of the reversible phosphate group transfer from nucleoside triphosphates to nucleosides/nucleotides, nucleoside monophosphates, or sugars. Members of this family play a wide variety of essential roles in nucleotide metabolism, the biosynthesis of coenzymes and aromatic compounds, as well as the metabolism of sugar and sulfate." 
Q#20572 - CGI_10028208 superfamily 241770 394 518 4.96E-06 45.0792 cl00309 PRTases_typeI superfamily - - "Phosphoribosyl transferase (PRT)-type I domain; Phosphoribosyl transferase (PRT) domain. The type I PRTases are identified by a conserved PRPP binding motif which features two adjacent acidic residues surrounded by one or more hydrophobic residue. PRTases catalyze the displacement of the alpha-1'-pyrophosphate of 5-phosphoribosyl-alpha1-pyrpphosphate (PRPP) by a nitrogen-containing nucleophile. The reaction products are an alpha-1 substituted ribose-5'-phosphate and a free pyrophosphate (PP). PRPP, an activated form of ribose-5-phosphate, is a key metabolite connecting nucleotide synthesis and salvage pathways. The type I PRTase family includes a range of diverse phosphoribosyl transferase enzymes and regulatory proteins of the nucleotide synthesis and salvage pathways, including adenine phosphoribosyltransferase EC:2.4.2.7., hypoxanthine-guanine-xanthine phosphoribosyltransferase, hypoxanthine phosphoribosyltransferase EC:2.4.2.8., ribose-phosphate pyrophosphokinase EC:2.7.6.1., amidophosphoribosyltransferase EC:2.4.2.14., orotate phosphoribosyltransferase EC:2.4.2.10., uracil phosphoribosyltransferase EC:2.4.2.9., and xanthine-guanine phosphoribosyltransferase EC:2.4.2.22." 
Q#20573 - CGI_10028209 superfamily 243092 17 314 3.71E-18 83.5384 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#20578 - CGI_10028218 superfamily 242274 1 169 6.20E-08 49.333 cl01053 SGNH_hydrolase superfamily - - "SGNH_hydrolase, or GDSL_hydrolase, is a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the typical Ser-His-Asp(Glu) triad from other serine hydrolases, but may lack the carboxlic acid." 
Q#20579 - CGI_10028219 superfamily 243035 29 143 3.04E-25 96.9201 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#20579 - CGI_10028219 superfamily 243035 172 240 7.33E-14 64.9485 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. 
CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#20581 - CGI_10028221 superfamily 242889 260 360 3.81E-18 79.1841 cl02111 PCI superfamily - - "PCI domain; This domain has also been called the PINT motif (Proteasome, Int-6, Nip-1 and TRIP-15)." Q#20582 - CGI_10028222 superfamily 241640 136 388 1.89E-82 262.598 cl00149 Tryp_SPc superfamily - - Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. 
Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues. Q#20582 - CGI_10028222 superfamily 241640 437 674 5.22E-65 215.988 cl00149 Tryp_SPc superfamily - - Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues. Q#20583 - CGI_10028223 superfamily 241640 44 275 3.73E-74 228.7 cl00149 Tryp_SPc superfamily - - Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues. Q#20585 - CGI_10028225 superfamily 243072 269 344 0.00624243 35.8223 cl02529 ANK superfamily N - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#20586 - CGI_10028226 superfamily 247792 281 324 8.03E-10 55.5296 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#20587 - CGI_10028227 superfamily 198842 7 49 0.00714014 31.5037 cl04338 TAP_C superfamily - - "TAP C-terminal domain; The vertebrate Tap protein is a member of the NXF family of shuttling transport receptors for nuclear export of mRNA. Tap has a modular structure, and its most C-terminal domain is important for binding to FG repeat-containing nuclear pore proteins (FG-nucleoporins) and is sufficient to mediate nuclear shuttling. The structure of the C-terminal domain is composed of four helices. The structure is related to the UBA domain." Q#20589 - CGI_10028229 superfamily 207654 236 300 1.47E-22 89.039 cl02574 Annexin superfamily - - Annexin; This family of annexins also includes giardin that has been shown to function as an annexin. Q#20590 - CGI_10003002 superfamily 243035 71 183 3.49E-10 56.8593 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. 
Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. 
A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#20590 - CGI_10003002 superfamily 243035 194 309 2.16E-06 45.6886 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. 
C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#20591 - CGI_10003003 superfamily 243035 4 117 2.88E-13 63.7929 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#20592 - CGI_10003004 superfamily 243035 35 160 4.90E-17 72.6525 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#20593 - CGI_10003371 superfamily 245303 11 366 7.91E-125 366.758 cl10447 GH18_chitinase-like superfamily - - "The GH18 (glycosyl hydrolase, family 18) type II chitinases hydrolyze chitin, an abundant polymer of beta-1,4-linked N-acetylglucosamine (GlcNAc) which is a major component of the cell wall of fungi and the exoskeleton of arthropods. Chitinases have been identified in viruses, bacteria, fungi, protozoan parasites, insects, and plants. The structure of the GH18 domain is an eight-stranded beta/alpha barrel with a pronounced active-site cleft at the C-terminal end of the beta-barrel. The GH18 family includes chitotriosidase, chitobiase, hevamine, zymocin-alpha, narbonin, SI-CLP (stabilin-1 interacting chitinase-like protein), IDGF (imaginal disc growth factor), CFLE (cortical fragment-lytic enzyme) spore hydrolase, the type III and type V plant chitinases, the endo-beta-N-acetylglucosaminidases, and the chitolectins. The GH85 (glycosyl hydrolase, family 85) ENGases (endo-beta-N-acetylglucosaminidases) are closely related to the GH18 chitinases and are included in this alignment model." 
Q#20594 - CGI_10003372 superfamily 217331 39 418 2.97E-58 198.844 cl03851 Perilipin superfamily - - Perilipin family; The perilipin family includes lipid droplet-associated protein (perilipin) and adipose differentiation-related protein (adipophilin). Q#20595 - CGI_10003373 superfamily 245201 1040 1326 5.32E-170 513.814 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#20595 - CGI_10003373 superfamily 241584 584 623 8.95E-06 45.5651 cl00065 FN3 superfamily C - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#20595 - CGI_10003373 superfamily 241584 881 975 8.23E-05 42.8687 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. 
Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#20595 - CGI_10003373 superfamily 241585 209 249 0.000223919 40.9652 cl00066 FU superfamily C - Furin-like repeats. Cysteine rich region. Exact function of the domain is not known. Furin is a serine-kinase dependent proprotein processor. Other members of this family include endoproteases and cell surface receptors. Q#20595 - CGI_10003373 superfamily 216254 20 131 7.14E-30 116.579 cl08303 Recep_L_domain superfamily - - Receptor L domain; The L domains from these receptors make up the bilobal ligand binding site. Each L domain consists of a single-stranded right hand beta-helix. This Pfam entry is missing the first 50 amino acid residues of the domain. Q#20595 - CGI_10003373 superfamily 216254 328 442 3.14E-26 106.178 cl08303 Recep_L_domain superfamily - - Receptor L domain; The L domains from these receptors make up the bilobal ligand binding site. Each L domain consists of a single-stranded right hand beta-helix. This Pfam entry is missing the first 50 amino acid residues of the domain. Q#20597 - CGI_10002392 superfamily 201956 7 91 3.31E-25 93.8134 cl08333 ACOX superfamily N - Acyl-CoA oxidase; This is a family of Acyl-CoA oxidases EC:1.3.3.6. Acyl-coA oxidase converts acyl-CoA into trans-2- enoyl-CoA. Q#20600 - CGI_10017444 superfamily 241584 304 380 0.000266713 39.7871 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. 
Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#20600 - CGI_10017444 superfamily 241584 109 172 0.000928497 38.2463 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#20600 - CGI_10017444 superfamily 243124 568 635 3.73E-05 42.7993 cl02648 NIDO superfamily C - Nidogen-like; This is a nidogen-like domain (NIDO) domain and is an extracellular domain found in nidogen and hypothetical proteins of unknown function. Q#20602 - CGI_10017446 superfamily 241596 21 80 3.59E-08 49.5199 cl00081 HLH superfamily - - "Helix-loop-helix domain, found in specific DNA- binding proteins that act as transcription factors; 60-100 amino acids long. 
A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE (5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#20602 - CGI_10017446 superfamily 243123 102 137 5.55E-06 42.9306 cl02638 Hairy_orange superfamily - - "Hairy Orange; The Orange domain is found in the Drosophila proteins Hesr-1, Hairy, and Enhancer of Split. The Orange domain is proposed to mediate specific protein-protein interaction between Hairy and Scute." Q#20604 - CGI_10017448 superfamily 241596 15 47 0.00125148 36.0379 cl00081 HLH superfamily N - "Helix-loop-helix domain, found in specific DNA-binding proteins that act as transcription factors; 60-100 amino acids long. 
A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE (5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#20604 - CGI_10017448 superfamily 243123 61 100 7.81E-14 64.1165 cl02638 Hairy_orange superfamily - - "Hairy Orange; The Orange domain is found in the Drosophila proteins Hesr-1, Hairy, and Enhancer of Split. The Orange domain is proposed to mediate specific protein-protein interaction between Hairy and Scute." Q#20605 - CGI_10017449 superfamily 241596 25 45 0.000378898 37.1935 cl00081 HLH superfamily N - "Helix-loop-helix domain, found in specific DNA-binding proteins that act as transcription factors; 60-100 amino acids long. 
A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE (5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#20605 - CGI_10017449 superfamily 243123 61 98 4.97E-11 55.6421 cl02638 Hairy_orange superfamily - - "Hairy Orange; The Orange domain is found in the Drosophila proteins Hesr-1, Hairy, and Enhancer of Split. The Orange domain is proposed to mediate specific protein-protein interaction between Hairy and Scute." Q#20606 - CGI_10017450 superfamily 217926 361 444 1.75E-37 134.225 cl04418 YTH superfamily C - "YT521-B-like domain; A protein of the YTH family has been shown to selectively remove transcripts of meiosis-specific genes expressed in mitotic cells. It has been speculated that in higher eukaryotes YTH-family members may be involved in similar mechanisms to suppress gene regulation during gametogenesis or general silencing. The rat protein YT521-B is a tyrosine-phosphorylated nuclear protein that interacts with the nuclear transcriptosomal component scaffold attachment factor B, and the 68-kDa Src substrate associated during mitosis, Sam68. 
In vivo splicing assays demonstrated that YT521-B modulates alternative splice site selection in a concentration-dependent manner. The YTH domain has been identified as part of the PUA superfamily." Q#20607 - CGI_10017451 superfamily 216554 130 270 8.41E-27 103.713 cl15977 zf-DHHC superfamily N - DHHC palmitoyltransferase; This family includes the well known DHHC zinc binding domain as well as three of the four conserved transmembrane regions found in this family of palmitoyltransferase enzymes. Q#20608 - CGI_10017452 superfamily 222150 1224 1249 0.003509 36.9861 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#20610 - CGI_10017454 superfamily 216056 53 200 3.29E-69 219.489 cl08279 Peptidase_M16 superfamily - - Insulinase (Peptidase family M16); Insulinase (Peptidase family M16). Q#20610 - CGI_10017454 superfamily 218490 205 393 3.07E-34 126.822 cl08432 Peptidase_M16_C superfamily - - "Peptidase M16 inactive domain; Peptidase M16 consists of two structurally related domains. One is the active peptidase, whereas the other is inactive. The two domains hold the substrate like a clamp." Q#20611 - CGI_10017455 superfamily 245208 71 513 0 568.253 cl09933 ACAD superfamily - - "Acyl-CoA dehydrogenase; Both mitochondrial acyl-CoA dehydrogenases (ACAD) and peroxisomal acyl-CoA oxidases (AXO) catalyze the alpha,beta dehydrogenation of the corresponding trans-enoyl-CoA by FAD, which becomes reduced. The reduced form of ACAD is reoxidized in the oxidative half-reaction by electron-transferring flavoprotein (ETF), from which the electrons are transferred to the mitochondrial respiratory chain coupled with ATP synthesis. In contrast, AXO catalyzes a different oxidative half-reaction, in which the reduced FAD is reoxidized by molecular oxygen. The ACAD family includes the eukaryotic beta-oxidation enzymes, short (SCAD), medium (MCAD), long (LCAD) and very-long (VLCAD) chain acyl-CoA dehydrogenases. 
These enzymes all share high sequence similarity, but differ in their substrate specificities. The ACAD family also includes amino acid catabolism enzymes such as Isovaleryl-CoA dehydrogenase (IVD), short/branched chain acyl-CoA dehydrogenases (SBCAD), Isobutyryl-CoA dehydrogenase (IBDH), glutaryl-CoA dehydrogenase (GCD) and Crotonobetainyl-CoA dehydrogenase. The mitochondrial ACADs are generally homotetramers, except for VLCAD, which is a homodimer. Related enzymes include the SOS adaptive response protein aidB, Naphthocyclinone hydroxylase (NcnH), and Dibenzothiophene (DBT) desulfurization enzyme C (DszC)" Q#20613 - CGI_10017457 superfamily 110998 33 469 1.46E-64 218.276 cl03422 Glyco_hydro_30 superfamily - - O-Glycosyl hydrolase family 30; O-Glycosyl hydrolase family 30. Q#20614 - CGI_10017458 superfamily 243072 978 1110 8.28E-26 105.158 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#20614 - CGI_10017458 superfamily 243072 857 1002 6.35E-09 55.4674 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#20614 - CGI_10017458 superfamily 247743 185 285 0.00624028 36.8089 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperones, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#20617 - CGI_10017461 superfamily 201540 25 111 3.20E-05 40.2245 cl16960 Troponin superfamily C - "Troponin; Troponin (Tn) contains three subunits, Ca2+ binding (TnC), inhibitory (TnI), and tropomyosin binding (TnT). This Pfam contains members of the TnT subunit. Troponin is a complex of three proteins, Ca2+ binding (TnC), inhibitory (TnI), and tropomyosin binding (TnT). The troponin complex regulates Ca2+-induced muscle contraction. This family includes troponin T and troponin I. Troponin I binds to actin and troponin T binds to tropomyosin." Q#20618 - CGI_10017462 superfamily 243035 39 158 5.93E-27 99.2313 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. 
Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. 
Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#20619 - CGI_10017463 superfamily 243035 157 276 3.78E-27 102.698 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. 
Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#20619 - CGI_10017463 superfamily 243035 91 138 1.46E-06 45.3034 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#20620 - CGI_10017464 superfamily 243035 77 193 5.32E-27 100.387 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#20620 - CGI_10017464 superfamily 243035 19 60 7.11E-05 39.5254 cl02432 CLECT superfamily NC - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. 
CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#20621 - CGI_10017465 superfamily 241644 236 375 1.06E-49 166.224 cl00154 UBCc superfamily - - "Ubiquitin-conjugating enzyme E2, catalytic (UBCc) domain. This is part of the ubiquitin-mediated protein degradation pathway in which a thiol-ester linkage forms between a conserved cysteine and the C-terminus of ubiquitin and complexes with ubiquitin protein ligase enzymes, E3. This pathway regulates many fundamental cellular processes. 
There are also other E2s which form thiol-ester linkages without the use of E3s as well as several UBC homologs (TSG101, Mms2, Croc-1 and similar proteins) which lack the active site cysteine essential for ubiquitination and appear to function in DNA repair pathways which were omitted from the scope of this CD." Q#20621 - CGI_10017465 superfamily 243074 138 185 2.34E-07 47.1161 cl02535 F-box-like superfamily - - F-box-like; This is an F-box-like family. Q#20625 - CGI_10017469 superfamily 247637 21 369 4.10E-136 395.801 cl16912 MDR superfamily - - "Medium chain reductase/dehydrogenase (MDR)/zinc-dependent alcohol dehydrogenase-like family; The medium chain reductase/dehydrogenases (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH. MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P) binding-Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES. The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH) , quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones. ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. 
The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines. Other MDR members have only a catalytic zinc, and some contain no coordinated zinc." Q#20626 - CGI_10017470 superfamily 218652 8 410 2.56E-115 350.825 cl12311 CLPTM1 superfamily - - "Cleft lip and palate transmembrane protein 1 (CLPTM1); This family consists of several eukaryotic cleft lip and palate transmembrane protein 1 sequences. Cleft lip with or without cleft palate is a common birth defect that is genetically complex. The nonsyndromic forms have been studied genetically using linkage and candidate-gene association studies with only partial success in defining the loci responsible for orofacial clefting. CLPTM1 encodes a transmembrane protein and has strong homology to two Caenorhabditis elegans genes, suggesting that CLPTM1 may belong to a new gene family. This family also contains the human cisplatin resistance related protein CRR9p which is associated with CDDP-induced apoptosis." Q#20627 - CGI_10017471 superfamily 247792 197 238 2.83E-09 51.6776 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#20628 - CGI_10017472 superfamily 241645 8 82 5.21E-05 41.9 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. 
Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#20629 - CGI_10017473 superfamily 241578 1 125 9.79E-16 75.0206 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." 
Q#20629 - CGI_10017473 superfamily 243119 194 243 1.39E-05 43.1913 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#20629 - CGI_10017473 superfamily 241611 433 589 0.000141287 41.22 cl00102 PTX superfamily - - "Pentraxins are plasma proteins characterized by their pentameric discoid assembly and their Ca2+ dependent ligand binding, such as Serum amyloid P component (SAP) and C-reactive Protein (CRP), which are cytokine-inducible acute-phase proteins implicated in innate immunity. CRP binds to ligands containing phosphocholine, SAP binds to amyloid fibrils, DNA, chromatin, fibronectin, C4-binding proteins and glycosaminoglycans. "Long" pentraxins have N-terminal extensions to the common pentraxin domain; one group, the neuronal pentraxins, may be involved in synapse formation and remodeling, and they may also be able to form heteromultimers." Q#20630 - CGI_10017474 superfamily 245206 535 814 7.68E-60 202.853 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. 
Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#20632 - CGI_10017476 superfamily 148285 102 207 1.45E-40 142.489 cl05876 DIRP superfamily - - "DIRP; DIRP (Domain in Rb-related Pathway) is postulated to be involved in the Rb-related pathway, which is encoded by multiple eukaryotic genomes and is present in proteins including lin-9 of Caenorhabditis elegans, aly of fruit fly and mustard weed. Studies of lin-9 and aly of fruit fly proteins containing DIRP suggest that this domain might be involved in development. Aly, lin-9, act in parallel to, or downstream of, activation of MAPK by the RTK-Ras signalling pathway." Q#20633 - CGI_10017477 superfamily 241626 174 291 1.07E-53 174.33 cl00125 RHOD superfamily - - "Rhodanese Homology Domain (RHOD); an alpha beta fold domain found duplicated in the rhodanese protein. The cysteine containing enzymatically active version of the domain is also found in the Cdc25 class of protein phosphatases and a variety of proteins such as sulfide dehydrogenases and certain stress proteins such as senescence specific protein 1 in plants, PspE and GlpE in bacteria and cyanide and arsenate resistance proteins. Inactive versions (no active site cysteine) are also seen in dual specificity phosphatases, ubiquitin hydrolases from yeast and in sulfuryltransferases, where they are believed to play a regulatory role in multidomain proteins." 
Q#20634 - CGI_10017478 superfamily 245819 254 430 2.76E-69 222.069 cl11967 Nucleotidyl_cyc_III superfamily - - "Class III nucleotidyl cyclases; Class III nucleotidyl cyclases are the largest, most diverse group of nucleotidyl cyclases (NC's) containing prokaryotic and eukaryotic proteins. They can be divided into two major groups; the mononucleotidyl cyclases (MNC's) and the diguanylate cyclases (DGC's). The MNC's, which include the adenylate cyclases (AC's) and the guanylate cyclases (GC's), have a conserved cyclase homology domain (CHD), while the DGC's have a conserved GGDEF domain, named after a conserved motif within this subgroup. Their products, cyclic guanylyl and adenylyl nucleotides, are second messengers that play important roles in eukaryotic signal transduction and prokaryotic sensory pathways." Q#20634 - CGI_10017478 superfamily 245201 3 180 2.03E-20 90.2884 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#20634 - CGI_10017478 superfamily 219526 199 240 2.13E-07 50.3103 cl06648 HNOBA superfamily N - "Heme NO binding associated; The HNOBA domain is found associated with the HNOB domain and pfam00211 in soluble cyclases and signalling proteins. The HNOB domain is predicted to function as a heme-dependent sensor for gaseous ligands, and transduce diverse downstream signals, in both bacteria and animals." Q#20637 - CGI_10002357 superfamily 243091 818 936 9.67E-14 69.058 cl02566 SET superfamily - - "SET domain; SET domains are protein lysine methyltransferase enzymes. 
SET domains appear to be protein-protein interaction domains. It has been demonstrated that SET domains mediate interactions with a family of proteins that display similarity with dual-specificity phosphatases (dsPTPases). A subset of SET domains have been called PR domains. These domains are divergent in sequence from other SET domains, but also appear to mediate protein-protein interaction. The SET domain consists of two regions known as SET-N and SET-C. SET-C forms an unusual and conserved knot-like structure of probably functional importance. Additionally to SET-N and SET-C, an insert region (SET-I) and flanking regions of high structural variability form part of the overall structure." Q#20638 - CGI_10002837 superfamily 247724 50 252 7.12E-60 195.061 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. 
Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#20638 - CGI_10002837 superfamily 221533 295 350 0.00359185 35.3652 cl13726 TMF_DNA_bd superfamily C - "TATA element modulatory factor 1 DNA binding; This is the middle region of a family of TATA element modulatory factor 1 proteins conserved in eukaryotes that contains at its N-terminal section a number of leucine zippers that could potentially form coiled coil structures. The whole proteins bind to the TATA element of some RNA polymerase II promoters and repress their activity by competing with the binding of TATA binding protein. TMFs are evolutionarily conserved golgins that bind Rab6, a ubiquitous ras-like GTP-binding Golgi protein, and contribute to Golgi organisation in animal and plant cells." Q#20639 - CGI_10002838 superfamily 243035 80 128 2.26E-10 53.901 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#20639 - CGI_10002838 superfamily 243035 46 98 2.16E-05 39.623 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#20640 - CGI_10002839 superfamily 243035 130 248 7.86E-29 106.935 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. 
CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#20644 - CGI_10006510 superfamily 241559 8 151 2.57E-23 93.314 cl00030 CH superfamily - - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." 
Q#20645 - CGI_10006511 superfamily 241739 73 403 9.89E-95 291.043 cl00268 class_II_aaRS-like_core superfamily - - "Class II tRNA amino-acyl synthetase-like catalytic core domain. Class II amino acyl-tRNA synthetases (aaRS) share a common fold and generally attach an amino acid to the 3' OH of ribose of the appropriate tRNA. PheRS is an exception in that it attaches the amino acid at the 2'-OH group, like class I aaRSs. These enzymes are usually homodimers. This domain is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. The substrate specificity of this reaction is further determined by additional domains. Interestingly, this domain is also found in asparagine synthase A (AsnA), in the accessory subunit of mitochondrial polymerase gamma and in the bacterial ATP phosphoribosyltransferase regulatory subunit HisZ." Q#20645 - CGI_10006511 superfamily 241738 417 508 1.16E-13 67.1796 cl00266 HGTP_anticodon superfamily - - "HGTP anticodon binding domain, as found at the C-terminus of histidyl, glycyl, threonyl and prolyl tRNA synthetases, which are classified as a group of class II aminoacyl-tRNA synthetases (aaRS). In aaRSs, the anticodon binding domain is responsible for specificity in tRNA-binding, so that the activated amino acid is transferred to a ribose 3' OH group of the appropriate tRNA only. This domain is also found in the accessory subunit of mitochondrial polymerase gamma (Pol gamma b)." Q#20645 - CGI_10006511 superfamily 241805 11 35 8.58E-05 40.5307 cl00349 S15_NS1_EPRS_RNA-bind superfamily C - "S15/NS1/EPRS_RNA-binding domain. This short domain consists of a helix-turn-helix structure, which can bind to several types of RNA. It is found in the ribosomal protein S15, the influenza A viral nonstructural protein (NSA) and in several eukaryotic aminoacyl tRNA synthetases (aaRSs), where it occurs as a single or a repeated unit. 
It is involved in both protein-RNA interactions by binding tRNA and protein-protein interactions in the formation of tRNA-synthetases into multienzyme complexes. While this domain lacks significant sequence similarity between the subgroups in which it is found, they share similar electrostatic surface potentials and thus are likely to bind to RNA via the same mechanism." Q#20646 - CGI_10006512 superfamily 241743 37 86 3.85E-05 41.7898 cl00274 ML superfamily NC - "The ML (MD-2-related lipid-recognition) domain is present in MD-1, MD-2, GM2 activator protein, Niemann-Pick type C2 (Npc2) protein, phosphatidylinositol/phosphatidylglycerol transfer protein (PG/PI-TP), mite allergen Der p 2 and several proteins of unknown function in plants, animals and fungi. These single-domain proteins form two anti-parallel beta-pleated sheets stabilized by three disulfide bonds and with an accessible central hydrophobic cavity, and are predicted to mediate diverse biological functions through interaction with specific lipids." Q#20647 - CGI_10006513 superfamily 243124 95 254 2.45E-34 124.462 cl02648 NIDO superfamily - - Nidogen-like; This is a nidogen-like domain (NIDO) domain and is an extracellular domain found in nidogen and hypothetical proteins of unknown function. Q#20648 - CGI_10006515 superfamily 245814 22 97 5.95E-07 43.2467 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#20649 - CGI_10011080 superfamily 245011 61 153 2.14E-41 135.327 cl09113 cpn10 superfamily - - "Chaperonin 10 Kd subunit (cpn10 or GroES); Cpn10 cooperates with chaperonin 60 (cpn60 or GroEL), an ATPase, to assist the folding and assembly of proteins and is found in eubacterial cytosol, as well as in the matrix of mitochondria and chloroplasts. It forms heptameric rings with a dome-like structure, forming a lid to the large cavity of the tetradecameric cpn60 cylinder and thereby tightly regulating release and binding of proteins to the cpn60 surface." Q#20650 - CGI_10011081 superfamily 243176 25 533 0 787.034 cl02777 chaperonin_like superfamily - - "chaperonin_like superfamily. Chaperonins are involved in productive folding of proteins. They share a common general morphology, a double toroid of 2 stacked rings, each composed of 7-9 subunits. There are 2 main chaperonin groups. The symmetry of type I is seven-fold and they are found in eubacteria (GroEL) and in organelles of eubacterial descent (hsp60 and RBP). The symmetry of type II is eight- or nine-fold and they are found in archaea (thermosome), thermophilic bacteria (TF55) and in the eukaryotic cytosol (CTT). Their common function is to sequester nonnative proteins inside their central cavity and promote folding by using energy derived from ATP hydrolysis. This superfamily also contains related domains from Fab1-like phosphatidylinositol 3-phosphate (PtdIns3P) 5-kinases that only contain the intermediate and apical domains." Q#20653 - CGI_10011084 superfamily 245847 307 448 4.09E-08 52.1738 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#20653 - CGI_10011084 superfamily 221377 95 183 4.94E-05 42.8411 cl13449 DUF3504 superfamily C - Domain of unknown function (DUF3504); This presumed domain is functionally uncharacterized. 
This domain is found in eukaryotes. This domain is typically between 156 to 173 amino acids in length. Q#20655 - CGI_10011086 superfamily 202484 20 84 2.41E-18 73.032 cl03798 zf-Tim10_DDP superfamily - - Tim10/DDP family zinc finger; Putative zinc binding domain with four conserved cysteine residues. This domain is found in the human disease protein TIMM8A. Members of this family such as Tim9 and Tim10 are involved in mitochondrial protein import. Members of this family seem to be localised to the mitochondrial intermembrane space. Q#20656 - CGI_10011087 superfamily 243742 15 304 6.46E-107 346.497 cl04407 Dopey_N superfamily - - "Dopey, N-terminal; DopA is the founding member of the Dopey family and is required for correct cell morphology and spatiotemporal organisation of multicellular structures in the filamentous fungus Aspergillus nidulans. DopA homologues are found in mammals. S. cerevisiae DOP1 is essential for viability and, affects cellular morphogenesis." Q#20656 - CGI_10011087 superfamily 243742 1932 2092 0.00010712 46.4604 cl04407 Dopey_N superfamily N - "Dopey, N-terminal; DopA is the founding member of the Dopey family and is required for correct cell morphology and spatiotemporal organisation of multicellular structures in the filamentous fungus Aspergillus nidulans. DopA homologues are found in mammals. S. cerevisiae DOP1 is essential for viability and, affects cellular morphogenesis." Q#20657 - CGI_10011088 superfamily 218050 300 445 3.19E-49 167.254 cl18439 ATE_C superfamily - - "Arginine-tRNA-protein transferase, C terminus; This family represents the C terminal region of the enzyme arginine-tRNA-protein transferase (EC 2.3.2.8), which catalyzes the post-translational conjugation of arginine to the N terminus of a protein. 
In eukaryotes, this functions as part of the N-end rule pathway of protein degradation by conjugating a destabilising amino acid to the amino terminal aspartate or glutamate of a protein, targeting the protein for ubiquitin-dependent proteolysis. N terminal cysteine is sometimes modified." Q#20657 - CGI_10011088 superfamily 202988 17 93 1.17E-24 97.6027 cl04490 ATE_N superfamily - - "Arginine-tRNA-protein transferase, N terminus; This family represents the N terminal region of the enzyme arginine-tRNA-protein transferase (EC 2.3.2.8), which catalyzes the post-translational conjugation of arginine to the N terminus of a protein. In eukaryotes, this functions as part of the N-end rule pathway of protein degradation by conjugating a de-stabilising amino acid to the amino terminal aspartate or glutamate of a protein, targeting the protein for ubiquitin-dependent proteolysis. N terminal cysteine is sometimes modified. In S cerevisiae, Cys20, 23, 94 and/or 95 are thought to be important for activity. Of these, only Cys 94 appears to be completely conserved in this family." Q#20658 - CGI_10011089 superfamily 245606 56 224 4.31E-76 235.45 cl11410 TPP_enzyme_PYR superfamily - - "Pyrimidine (PYR) binding domain of thiamine pyrophosphate (TPP)-dependent enzymes; Thiamine pyrophosphate (TPP) family, pyrimidine (PYR) binding domain; found in many key metabolic enzymes which use TPP (also known as thiamine diphosphate) as a cofactor. TPP binds in the cleft formed by a PYR domain and a PP domain. The PYR domain, binds the aminopyrimidine ring of TPP, the PP domain binds the diphosphate residue. A polar interaction between the conserved glutamate of the PYR domain and the N1' of the TPP aminopyrimidine ring is shared by most TPP-dependent enzymes, and participates in the activation of TPP. The PYR and PP domains have a common fold, but do not share strong sequence conservation. The PP domain is not included in this group. 
Most TPP-dependent enzymes have the PYR and PP domains on the same subunit although these domains can be alternatively arranged in the primary structure. In the case of 2-oxoisovalerate dehydrogenase (2OXO), sulfopyruvate decarboxylase (ComDE), and the E1 component of human pyruvate dehydrogenase complex (E1- PDHc) the PYR and PP domains appear on different subunits. TPP-dependent enzymes are multisubunit proteins, the smallest catalytic unit being a dimer-of-active sites. For many of these enzymes the active sites lie between PP and PYR domains on different subunits. However, for the homodimeric enzymes 1-deoxy-D-xylulose 5-phosphate synthase (DXS) and Desulfovibrio africanus pyruvate:ferredoxin oxidoreductase (PFOR), each active site lies at the interface of the PYR and PP domains from the same subunit." Q#20658 - CGI_10011089 superfamily 217227 266 380 1.76E-37 133.105 cl08363 Transketolase_C superfamily - - "Transketolase, C-terminal domain; The C-terminal domain of transketolase has been proposed as a regulatory molecule binding site." Q#20659 - CGI_10011090 superfamily 245864 19 428 1.70E-110 335.788 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. 
These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#20660 - CGI_10011091 superfamily 245201 343 603 8.15E-82 260.547 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#20661 - CGI_10007693 superfamily 243029 10 73 2.90E-09 49.6565 cl02422 HRM superfamily - - Hormone receptor domain; This extracellular domain contains four conserved cysteines that probably form disulphide bridges. The domain is found in a variety of hormone receptors. It may be a ligand binding domain. Q#20661 - CGI_10007693 superfamily 215647 84 130 0.00739585 33.7361 cl18338 7tm_2 superfamily NC - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GPCRs). They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. 
Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97 ; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#20662 - CGI_10007694 superfamily 241738 108 228 1.43E-19 81.3106 cl00266 HGTP_anticodon superfamily - - "HGTP anticodon binding domain, as found at the C-terminus of histidyl, glycyl, threonyl and prolyl tRNA synthetases, which are classified as a group of class II aminoacyl-tRNA synthetases (aaRS). In aaRSs, the anticodon binding domain is responsible for specificity in tRNA-binding, so that the activated amino acid is transferred to a ribose 3' OH group of the appropriate tRNA only. This domain is also found in the accessory subunit of mitochondrial polymerase gamma (Pol gamma b)." Q#20666 - CGI_10007698 superfamily 192535 34 151 0.000448797 40.2718 cl18179 7TM_GPCR_Srsx superfamily C - Serpentine type 7TM GPCR chemoreceptor Srsx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srsx is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#20667 - CGI_10007699 superfamily 202367 24 208 4.47E-74 227.808 cl18226 3HCDH_N superfamily - - "3-hydroxyacyl-CoA dehydrogenase, NAD binding domain; This family also includes lambda crystallin." 
Q#20667 - CGI_10007699 superfamily 216084 210 307 1.04E-45 151.976 cl08285 3HCDH superfamily - - "3-hydroxyacyl-CoA dehydrogenase, C-terminal domain; This family also includes lambda crystallin. Some proteins include two copies of this domain." Q#20669 - CGI_10007701 superfamily 247724 41 146 6.55E-39 145.945 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#20669 - CGI_10007701 superfamily 244558 165 251 9.41E-38 137.687 cl06950 AARP2CN superfamily - - AARP2CN (NUC121) domain; This domain is the central domain of AARP2. It is weakly similar to the GTP-binding domain of elongation factor TU. Q#20670 - CGI_10007702 superfamily 242225 3 137 1.60E-75 232.779 cl00969 Ribosomal_S19e superfamily - - Ribosomal protein S19e; Ribosomal protein S19e. 
Q#20671 - CGI_10010204 superfamily 247805 133 282 1.49E-18 80.074 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#20671 - CGI_10010204 superfamily 204187 21 67 0.000270713 38.2484 cl08537 MerR-DNA-bind superfamily N - "MerR, DNA binding; Members of this family of DNA-binding domains are predominantly found in the prokaryotic transcriptional regulator MerR. They adopt a structure consisting of a core of three alpha helices, with an architecture that is similar to that of the 'winged helix' fold." Q#20672 - CGI_10010205 superfamily 241644 51 174 1.30E-28 108.058 cl00154 UBCc superfamily - - "Ubiquitin-conjugating enzyme E2, catalytic (UBCc) domain. This is part of the ubiquitin-mediated protein degradation pathway in which a thiol-ester linkage forms between a conserved cysteine and the C-terminus of ubiquitin and complexes with ubiquitin protein ligase enzymes, E3. This pathway regulates many fundamental cellular processes. There are also other E2s which form thiol-ester linkages without the use of E3s as well as several UBC homologs (TSG101, Mms2, Croc-1 and similar proteins) which lack the active site cysteine essential for ubiquitination and appear to function in DNA repair pathways which were omitted from the scope of this CD." Q#20673 - CGI_10010206 superfamily 243039 450 598 3.34E-69 222.868 cl02446 MATH superfamily - - "MATH (meprin and TRAF-C homology) domain; an independent folding unit with an eight-stranded beta-sandwich structure found in meprins, TRAFs and other proteins. Meprins comprise a class of extracellular metalloproteases which are anchored to the membrane and are capable of cleaving growth factors, extracellular matrix proteins, and biologically active peptides. 
TRAF molecules serve as adapter proteins that link cell surface receptors of the Tumor Necrosis Factor and Interleukin-1/Toll-like families to downstream kinase cascades, which results in the activation of transcription factors and the regulation of cell survival, proliferation and stress responses in the immune and inflammatory systems. Other members include the ubiquitin ligases, TRIM37 and SPOP, and the ubiquitin-specific proteases, HAUSP and Ubp21p. A large number of uncharacterized members mostly from lineage-specific expansions in C. elegans and rice contain MATH and BTB domains, similar to SPOP. The MATH domain has been shown to bind peptide/protein substrates in TRAFs and HAUSP. It is possible that the MATH domain in other members of this superfamily also interacts with various protein substrates. The TRAF domain may also be involved in the trimerization of TRAFs. Based on homology, it is postulated that the MATH domain in meprins may be involved in its tetramer assembly and that the MATH domain, in general, may take part in diverse modular arrangements defined by adjacent multimerization domains." Q#20673 - CGI_10010206 superfamily 247792 54 92 2.02E-09 53.9888 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#20673 - CGI_10010206 superfamily 190233 138 192 3.30E-07 47.8342 cl08341 zf-TRAF superfamily - - TRAF-type zinc finger; TRAF-type zinc finger. 
Q#20673 - CGI_10010206 superfamily 190233 196 251 5.23E-07 47.449 cl08341 zf-TRAF superfamily - - TRAF-type zinc finger; TRAF-type zinc finger. Q#20682 - CGI_10010215 superfamily 247068 498 571 9.36E-05 41.1811 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#20684 - CGI_10004559 superfamily 144608 2 39 2.15E-05 39.0353 cl18013 Mg_chelatase superfamily C - "Magnesium chelatase, subunit ChlI; Magnesium-chelatase is a three-component enzyme that catalyzes the insertion of Mg2+ into protoporphyrin IX. This is the first unique step in the synthesis of (bacterio)chlorophyll. Due to this, it is thought that Mg-chelatase has an important role in channelling intermediates into the (bacterio)chlorophyll branch in response to conditions suitable for photosynthetic growth. ChlI and BchD have molecular weight between 38-42 kDa." Q#20686 - CGI_10005317 superfamily 245874 207 270 0.00214401 35.865 cl12111 TNFR superfamily C - "Tumor necrosis factor receptor (TNFR) domain; superfamily of TNF-like receptor domains. When bound to TNF-like cytokines, TNFRs trigger multiple signal transduction pathways, they are involved in inflammation response, apoptosis, autoimmunity and organogenesis. TNFRs domains are elongated with generally three tandem repeats of cysteine-rich domains (CRDs). 
They fit in the grooves between protomers within the ligand trimer. Some TNFRs, such as NGFR and HveA, bind ligands with no structural similarity to TNF and do not bind ligand trimers." Q#20688 - CGI_10005319 superfamily 241659 274 341 1.88E-09 53.7471 cl00175 alpha-crystallin-Hsps_p23-like superfamily - - "alpha-crystallin domain (ACD) found in alpha-crystallin-type small heat shock proteins, and a similar domain found in p23 (a cochaperone for Hsp90) and in other p23-like proteins.; The alpha-crystallin-Hsps_p23-like superfamily includes the alpha-crystallin domain (ACD) of alpha-crystallin-type small heat shock proteins (sHsps) and a similar domain found in p23-like proteins. sHsps are small stress induced proteins with monomeric masses between 12-43 kDa, whose common feature is this ACD. sHsps are generally active as large oligomers consisting of multiple subunits, and are believed to be ATP-independent chaperones that prevent aggregation and are important in refolding in combination with other Hsps. p23 is a cochaperone of the Hsp90 chaperoning pathway. It binds Hsp90 and participates in the folding of a number of Hsp90 clients including the progesterone receptor. p23 also has a passive chaperoning activity. p23 in addition may act as the cytosolic prostaglandin E2 synthase. Included in this superfamily is the p23-like C-terminal CHORD-SGT1 (CS) domain of suppressor of G2 allele of Skp1 (Sgt1) and the p23-like domains of human butyrate-induced transcript 1 (hB-ind1), NUD (nuclear distribution) C, Melusin, and NAD(P)H cytochrome b5 (NCB5) oxidoreductase (OR)." 
Q#20690 - CGI_10005321 superfamily 241659 151 218 9.43E-11 55.2879 cl00175 alpha-crystallin-Hsps_p23-like superfamily - - "alpha-crystallin domain (ACD) found in alpha-crystallin-type small heat shock proteins, and a similar domain found in p23 (a cochaperone for Hsp90) and in other p23-like proteins.; The alpha-crystallin-Hsps_p23-like superfamily includes the alpha-crystallin domain (ACD) of alpha-crystallin-type small heat shock proteins (sHsps) and a similar domain found in p23-like proteins. sHsps are small stress induced proteins with monomeric masses between 12-43 kDa, whose common feature is this ACD. sHsps are generally active as large oligomers consisting of multiple subunits, and are believed to be ATP-independent chaperones that prevent aggregation and are important in refolding in combination with other Hsps. p23 is a cochaperone of the Hsp90 chaperoning pathway. It binds Hsp90 and participates in the folding of a number of Hsp90 clients including the progesterone receptor. p23 also has a passive chaperoning activity. p23 in addition may act as the cytosolic prostaglandin E2 synthase. Included in this superfamily is the p23-like C-terminal CHORD-SGT1 (CS) domain of suppressor of G2 allele of Skp1 (Sgt1) and the p23-like domains of human butyrate-induced transcript 1 (hB-ind1), NUD (nuclear distribution) C, Melusin, and NAD(P)H cytochrome b5 (NCB5) oxidoreductase (OR)." Q#20692 - CGI_10005323 superfamily 244265 149 421 7.93E-126 368.729 cl05973 FAM20_C_like superfamily - - "C-terminal putative kinase domain of FAM20 (family with sequence similarity 20), Drosophila Four-jointed (Fj), and related proteins; Drosophila Fj is a Golgi kinase that phosphorylates Ser or Thr residues within extracellular cadherin domains of a transmembrane receptor Fat and its ligand, Dachsous (Ds). The Fat signaling pathway regulates growth, gene expression, and planar cell polarity (PCP). 
Defects from mutation in the Drosophila fj gene include loss of the intermediate leg joint, and a PCP defect in the eye. Fjx1, the murine homologue of Fj, has been shown to be involved in both the Fat and Hippo signaling pathways, these two pathways intersect at multiple points. The Hippo pathway is important in organ size control and in cancer. FAM20B is a xylose kinase that may regulate the number of glycosaminoglycan chains by phosphorylating the xylose residue in the glycosaminoglycan-protein linkage region of proteoglycans. This domain has homology to a kinase-active site, mutation of three conserved Asp residues at the Drosophila Fj putative active site abolished its ability to phosphorylate Ft and Ds cadherin domains. FAM20A may participate in enamel development and gingival homeostasis, FAM20B in proteoglycan production, and FAM20C in bone development. FAM20C, also called Dentin Matrix Protein 4, is abundant in the dentin matrix, and may participate in the differentiation of mesenchymal precursor cells into functional odontoblast-like cells. Mutations in FAM20C are associated with lethal Osteosclerotic Bone Dysplasia (Raine Syndrome), and mutations in FAM20A with Amelogenesis imperfecta (AI) and Gingival Hyperplasia Syndrome. This model includes the FAM20_C domain family, previously known as DUF1193; FAM20_C appears to be homologous to the catalytic domain of the phosphoinositide 3-kinase (PI3K)-like family." Q#20693 - CGI_10005324 superfamily 246925 239 347 4.57E-12 66.2249 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. 
This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#20694 - CGI_10005325 superfamily 245213 740 767 0.000195449 40.3126 cl09941 EGF_CA superfamily C - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#20694 - CGI_10005325 superfamily 245213 696 738 0.00265685 36.8458 cl09941 EGF_CA superfamily N - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#20696 - CGI_10002704 superfamily 207701 81 198 1.84E-29 114.699 cl02699 VIT superfamily - - Vault protein inter-alpha-trypsin domain; Inter-alpha-trypsin inhibitors (ITIs) consist of one light chain and a variable set of heavy chains. ITIs play a role in extracellular matrix (ECM) stabilisation and tumour metastasis as well as in plasma protease inhibition. 
The vault protein inter-alpha-trypsin (VIT) domain described here is found to the N-terminus of a von Willebrand factor type A domain (pfam00092) in ITI heavy chains (ITIHs) and their precursors. Q#20696 - CGI_10002704 superfamily 241578 333 473 1.05E-14 73.4033 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." 
Q#20697 - CGI_10005672 superfamily 247684 14 183 1.35E-12 64.9175 cl17037 NBD_sugar-kinase_HSP70_actin superfamily C - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#20699 - CGI_10005674 superfamily 243179 128 233 1.84E-17 80.6262 cl02781 tetraspanin_LEL superfamily - - "Tetraspanin, extracellular domain or large extracellular loop (LEL). Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. The tetraspanin family contains CD9, CD63, CD37, CD53, CD82, CD151, and CD81, amongst others. 
Tetraspanins are involved in diverse processes such as cell activation and proliferation, adhesion and motility, differentiation, cancer, and others. Their various functions may relate to their ability to act as molecular facilitators, grouping specific cell-surface proteins and affecting formation and stability of signaling complexes. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web", which may also include integrins." Q#20699 - CGI_10005674 superfamily 241613 1470 1506 2.78E-11 61.0685 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#20699 - CGI_10005674 superfamily 241613 1513 1544 2.47E-07 49.5126 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL 
receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#20699 - CGI_10005674 superfamily 214531 680 722 1.69E-13 67.6268 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#20699 - CGI_10005674 superfamily 214531 723 766 2.54E-11 61.4636 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#20699 - CGI_10005674 superfamily 214531 767 811 4.71E-10 57.6116 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#20699 - CGI_10005674 superfamily 214531 457 498 1.19E-09 56.4561 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#20699 - CGI_10005674 superfamily 214531 916 957 3.92E-09 54.9153 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. 
Q#20699 - CGI_10005674 superfamily 214531 958 1000 9.67E-09 53.7597 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#20699 - CGI_10005674 superfamily 214531 369 409 4.95E-08 51.8337 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#20699 - CGI_10005674 superfamily 214531 411 450 1.11E-07 50.6781 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#20699 - CGI_10005674 superfamily 214531 1318 1358 2.37E-07 49.9077 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#20699 - CGI_10005674 superfamily 214531 638 679 1.38E-05 44.5149 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#20699 - CGI_10005674 superfamily 215683 1292 1333 2.45E-05 44.0831 cl18339 Ldl_recept_b superfamily - - Low-density lipoprotein receptor repeat class B; This domain is also known as the YWTD motif after the most conserved region of the repeat. The YWTD repeat is found in multiple tandem repeats and has been predicted to form a beta-propeller structure. 
Q#20699 - CGI_10005674 superfamily 214531 1003 1041 4.11E-05 43.3593 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#20699 - CGI_10005674 superfamily 215683 521 558 7.59E-05 42.5423 cl18339 Ldl_recept_b superfamily - - Low-density lipoprotein receptor repeat class B; This domain is also known as the YWTD motif after the most conserved region of the repeat. The YWTD repeat is found in multiple tandem repeats and has been predicted to form a beta-propeller structure. Q#20699 - CGI_10005674 superfamily 214531 813 855 8.99E-05 42.2037 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#20699 - CGI_10005674 superfamily 214531 1044 1084 0.000109161 41.8185 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#20699 - CGI_10005674 superfamily 214531 1359 1400 0.00107878 39.1221 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#20701 - CGI_10005676 superfamily 218453 297 408 7.28E-50 173.184 cl04949 Pep3_Vps18 superfamily N - Pep3/Vps18/deep orange family; This region is found in a number of protein identified as involved in golgi function and vacuolar sorting. The molecular function of this region is unknown. 
The members of this family contain a C-terminal ring finger domain. Q#20703 - CGI_10005678 superfamily 243072 2 109 3.47E-29 107.855 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#20705 - CGI_10023738 superfamily 241572 51 135 1.60E-11 60.33 cl00050 CYCLIN superfamily - - "Cyclin box fold. Protein binding domain functioning in cell-cycle and transcription control. Present in cyclins, TFIIB and Retinoblastoma (RB).The cyclins consist of 8 classes of cell cycle regulators that regulate cyclin dependent kinases (CDKs). TFIIB is a transcription factor that binds the TATA box. Cyclins, TFIIB and RB contain 2 copies of the domain." Q#20706 - CGI_10023739 superfamily 241572 55 140 2.01E-05 41.8405 cl00050 CYCLIN superfamily - - "Cyclin box fold. Protein binding domain functioning in cell-cycle and transcription control. Present in cyclins, TFIIB and Retinoblastoma (RB).The cyclins consist of 8 classes of cell cycle regulators that regulate cyclin dependent kinases (CDKs). TFIIB is a transcription factor that binds the TATA box. Cyclins, TFIIB and RB contain 2 copies of the domain." Q#20708 - CGI_10023741 superfamily 222424 39 226 2.92E-58 186.626 cl16439 BCIP superfamily - - "p21-C-terminal region-binding protein; This family of p21-binding proteins is important as a modulator of p21 activity. The domain binds the C-terminal region of p21 in a ternary complex with CDK2, which results in inhibition of the kinase activity of CDK2." 
Q#20712 - CGI_10023745 superfamily 243065 241 331 1.95E-12 64.7705 cl02516 VWD superfamily N - von Willebrand factor type D domain; Luciferin-2-monooxygenase from Vargula hilgendorfii contains a vwd domain. Its function is unrelated but the similarity is very strong by several methods. Q#20712 - CGI_10023745 superfamily 244710 378 443 4.79E-12 62.0225 cl07383 C8 superfamily - - "C8 domain; This domain contains 8 conserved cysteine residues, but this family only contains 7 of them to overlaps with other domains. It is found in disease-related proteins including von Willebrand factor, Alpha tectorin, Zonadhesin and Mucin. It is often found on proteins containing pfam00094 and pfam01826." Q#20712 - CGI_10023745 superfamily 248289 180 238 2.56E-09 53.5819 cl17735 VWC superfamily - - von Willebrand factor type C domain; The high cutoff was used to prevent overlap with pfam00094. Q#20712 - CGI_10023745 superfamily 248289 116 174 6.87E-05 40.8703 cl17735 VWC superfamily - - von Willebrand factor type C domain; The high cutoff was used to prevent overlap with pfam00094. Q#20713 - CGI_10023746 superfamily 214531 139 182 4.05E-07 48.3669 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#20713 - CGI_10023746 superfamily 241578 286 329 0.000129346 43.5276 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#20713 - CGI_10023746 superfamily 214531 450 492 0.00133337 37.9665 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#20713 - CGI_10023746 superfamily 215683 113 156 0.00137949 37.9199 cl18339 Ldl_recept_b superfamily - - Low-density lipoprotein receptor repeat class B; This domain is also known as the YWTD motif after the most conserved region of the repeat. The YWTD repeat is found in multiple tandem repeats and has been predicted to form a beta-propeller structure. Q#20713 - CGI_10023746 superfamily 214531 200 226 0.00538244 36.4257 cl18310 LY superfamily N - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#20713 - CGI_10023746 superfamily 214531 741 776 0.0054625 36.0405 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin.
Q#20716 - CGI_10023749 superfamily 216363 70 156 3.69E-19 78.6662 cl08312 UPF0029 superfamily C - Uncharacterized protein family UPF0029; Uncharacterized protein family UPF0029. Q#20720 - CGI_10023754 superfamily 243199 65 99 1.35E-09 52.5977 cl02808 RT_like superfamily N - "RT_like: Reverse transcriptase (RT, RNA-dependent DNA polymerase)_like family. An RT gene is usually indicative of a mobile element such as a retrotransposon or retrovirus. RTs occur in a variety of mobile elements, including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, and caulimoviruses. These elements can be divided into two major groups. One group contains retroviruses and DNA viruses whose propagation involves an RNA intermediate. They are grouped together with transposable elements containing long terminal repeats (LTRs). The other group, also called poly(A)-type retrotransposons, contain fungal mitochondrial introns and transposable elements that lack LTRs." Q#20723 - CGI_10023757 superfamily 241593 34 155 2.36E-08 50.7485 cl00075 HATPase_c superfamily - - "Histidine kinase-like ATPases; This family includes several ATP-binding proteins for example: histidine kinase, DNA gyrase B, topoisomerases, heat shock protein HSP90, phytochrome-like ATPases and DNA mismatch repair proteins" Q#20723 - CGI_10023757 superfamily 243078 152 187 1.73E-05 43.8339 cl02544 VHS_ENTH_ANTH superfamily N - "VHS, ENTH and ANTH domain superfamily; composed of proteins containing a VHS, ENTH or ANTH domain. The VHS domain is present in Vps27 (Vacuolar Protein Sorting), Hrs (Hepatocyte growth factor-regulated tyrosine kinase substrate) and STAM (Signal Transducing Adaptor Molecule). It is located at the N-termini of proteins involved in intracellular membrane trafficking. The epsin N-terminal homology (ENTH) domain is an evolutionarily conserved protein module found primarily in proteins that participate in clathrin-mediated endocytosis. 
A set of proteins previously designated as harboring an ENTH domain in fact contains a highly similar, yet unique module referred to as an AP180 N-terminal homology (ANTH) domain. VHS, ENTH and ANTH domains are structurally similar and are composed of a superhelix of eight alpha helices. ENTH and ANTH (E/ANTH) domains bind both inositol phospholipids and proteins and contribute to the nucleation and formation of clathrin coats on membranes. ENTH domains also function in the development of membrane curvature through lipid remodeling during the formation of clathrin-coated vesicles. E/ANTH domain-bearing proteins have recently been shown to function with adaptor protein-1 and GGA adaptors at the trans-Golgi network, which suggests that E/ANTH domains are universal components of the machinery for clathrin-mediated membrane budding." Q#20724 - CGI_10023758 superfamily 222006 554 653 4.46E-12 64.1658 cl16182 Hydrolase_like2 superfamily - - Putative hydrolase of sodium-potassium ATPase alpha subunit; This is a putative hydrolase of the sodium-potassium ATPase alpha subunit. Q#20724 - CGI_10023758 superfamily 215733 191 278 1.58E-11 64.5087 cl02811 E1-E2_ATPase superfamily C - E1-E2 ATPase; E1-E2 ATPase. Q#20725 - CGI_10023759 superfamily 243306 174 416 2.91E-139 415.773 cl03114 RNase_PH superfamily - - "RNase PH-like 3'-5' exoribonucleases; RNase PH-like 3'-5' exoribonucleases are enzymes that catalyze the 3' to 5' processing and decay of RNA substrates. Evolutionarily related members can be found in prokaryotes, archaea, and eukaryotes. Bacterial ribonuclease PH contains a single copy of this domain, and removes nucleotide residues following the -CCA terminus of tRNA. Polyribonucleotide nucleotidyltransferase (PNPase) contains two tandem copies of the domain and is involved in mRNA degradation in a 3'-5' direction. Archaeal exosomes contain two individually encoded RNase PH-like 3'-5' exoribonucleases and are required for 3' processing of the 5.8S rRNA.
The eukaryotic exosome core is composed of six individually encoded RNase PH-like subunits, but it is not a phosphorolytic enzyme per se; it directly associates with Rrp44 and Rrp6, which are hydrolytic exoribonucleases related to bacterial RNase II/R and RNase D. All members of the RNase PH-like family form ring structures by oligomerization of six domains or subunits, except for a total of 3 subunits with tandem repeats in the case of PNPase, with a central channel through which the RNA substrate must pass to gain access to the phosphorolytic active sites." Q#20725 - CGI_10023759 superfamily 243306 6 181 4.28E-88 281.724 cl03114 RNase_PH superfamily C - "RNase PH-like 3'-5' exoribonucleases; RNase PH-like 3'-5' exoribonucleases are enzymes that catalyze the 3' to 5' processing and decay of RNA substrates. Evolutionarily related members can be found in prokaryotes, archaea, and eukaryotes. Bacterial ribonuclease PH contains a single copy of this domain, and removes nucleotide residues following the -CCA terminus of tRNA. Polyribonucleotide nucleotidyltransferase (PNPase) contains two tandem copies of the domain and is involved in mRNA degradation in a 3'-5' direction. Archaeal exosomes contain two individually encoded RNase PH-like 3'-5' exoribonucleases and are required for 3' processing of the 5.8S rRNA. The eukaryotic exosome core is composed of six individually encoded RNase PH-like subunits, but it is not a phosphorolytic enzyme per se; it directly associates with Rrp44 and Rrp6, which are hydrolytic exoribonucleases related to bacterial RNase II/R and RNase D. All members of the RNase PH-like family form ring structures by oligomerization of six domains or subunits, except for a total of 3 subunits with tandem repeats in the case of PNPase, with a central channel through which the RNA substrate must pass to gain access to the phosphorolytic active sites."
Q#20725 - CGI_10023759 superfamily 242274 536 713 2.29E-10 59.7334 cl01053 SGNH_hydrolase superfamily - - "SGNH_hydrolase, or GDSL_hydrolase, is a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the typical Ser-His-Asp(Glu) triad from other serine hydrolases, but may lack the carboxylic acid." Q#20726 - CGI_10023760 superfamily 241572 205 294 7.73E-27 102.317 cl00050 CYCLIN superfamily - - "Cyclin box fold. Protein binding domain functioning in cell-cycle and transcription control. Present in cyclins, TFIIB and Retinoblastoma (RB).The cyclins consist of 8 classes of cell cycle regulators that regulate cyclin dependent kinases (CDKs). TFIIB is a transcription factor that binds the TATA box. Cyclins, TFIIB and RB contain 2 copies of the domain." Q#20726 - CGI_10023760 superfamily 241572 303 389 4.39E-09 53.0113 cl00050 CYCLIN superfamily - - "Cyclin box fold. Protein binding domain functioning in cell-cycle and transcription control. Present in cyclins, TFIIB and Retinoblastoma (RB).The cyclins consist of 8 classes of cell cycle regulators that regulate cyclin dependent kinases (CDKs). TFIIB is a transcription factor that binds the TATA box. Cyclins, TFIIB and RB contain 2 copies of the domain." Q#20727 - CGI_10023761 superfamily 247755 470 604 2.38E-44 156.999 cl17201 ABC_ATPase superfamily N - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family.
ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#20729 - CGI_10023763 superfamily 247755 4 167 1.91E-33 124.642 cl17201 ABC_ATPase superfamily C - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#20730 - CGI_10023764 superfamily 241599 301 357 8.79E-15 68.0388 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#20730 - CGI_10023764 superfamily 198730 203 272 4.01E-37 129.866 cl02582 Pou superfamily - - Pou domain - N-terminal to homeobox domain; Pou domain - N-terminal to homeobox domain. Q#20731 - CGI_10023765 superfamily 244881 329 627 8.66E-142 424.298 cl08267 ISOPREN_C2_like superfamily - - "This group contains class II terpene cyclases, protein prenyltransferases beta subunit, two broadly specific proteinase inhibitors alpha2-macroglobulin (alpha (2)-M) and pregnancy zone protein (PZP) and, the C3 C4 and C5 components of vertebrate complement. 
Class II terpene cyclases include squalene cyclase (SQCY) and 2,3-oxidosqualene cyclase (OSQCY), these integral membrane proteins catalyze a cationic cyclization cascade converting linear triterpenes to fused ring compounds. The protein prenyltransferases include protein farnesyltransferase (FTase) and geranylgeranyltransferase types I and II (GGTase-I and GGTase-II) which catalyze the carboxyl-terminal lipidation of Ras, Rab, and several other cellular signal transduction proteins, facilitating membrane associations and specific protein-protein interactions. Alpha (2)-M is a major carrier protein in serum and involved in the immobilization and entrapment of proteases. PZP is a pregnancy associated protein. Alpha (2)-M and PZP are known to bind to and, may modulate, the activity of placental protein-14 in T-cell growth and cytokine production thereby protecting the allogeneic fetus from attack by the maternal immune system." Q#20731 - CGI_10023765 superfamily 215788 110 199 4.14E-33 123.829 cl08251 A2M superfamily - - Alpha-2-macroglobulin family; This family includes the C-terminal region of the alpha-2-macroglobulin family. Q#20731 - CGI_10023765 superfamily 203720 734 820 1.14E-23 96.8485 cl08457 A2M_recep superfamily - - A-macroglobulin receptor; This family includes the receptor domain region of the alpha-2-macroglobulin family. Q#20733 - CGI_10023767 superfamily 216731 242 330 3.20E-20 87.3143 cl12258 A2M_N superfamily - - MG2 domain; This is the MG2 (macroglobulin) domain of alpha-2-macroglobulin. Q#20734 - CGI_10023768 superfamily 247743 478 644 6.64E-29 113.395 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. 
Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#20734 - CGI_10023768 superfamily 247743 258 378 2.97E-12 64.8599 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#20738 - CGI_10023772 superfamily 216341 206 274 7.95E-21 84.064 cl03125 UPF0016 superfamily - - Uncharacterized protein family UPF0016; This family contains integral membrane proteins of unknown function. Most members of the family contain two copies of a region that contains an EXGD motif. Each of these regions contains three predicted transmembrane regions. Q#20738 - CGI_10023772 superfamily 216341 68 142 7.56E-19 78.6712 cl03125 UPF0016 superfamily - - Uncharacterized protein family UPF0016; This family contains integral membrane proteins of unknown function. Most members of the family contain two copies of a region that contains an EXGD motif. Each of these regions contains three predicted transmembrane regions. 
Q#20741 - CGI_10023775 superfamily 245211 172 736 0 729.341 cl09939 RNR_PFL superfamily - - "Ribonucleotide reductase and Pyruvate formate lyase; Ribonucleotide reductase (RNR) and pyruvate formate lyase (PFL) are believed to have diverged from a common ancestor. They have a structurally similar ten-stranded alpha-beta barrel domain that hosts the active site, and are radical enzymes. RNRs are found in all organisms and provide the only mechanism by which nucleotides are converted to deoxynucleotides. RNRs are separated into three classes based on their metallocofactor usage. Class I RNRs use a diiron-tyrosyl radical while Class II RNRs use coenzyme B12 (adenosylcobalamin, AdoCbl). Class III RNRs use an FeS cluster and S-adenosylmethionine to generate a glycyl radical. PFL, an essential enzyme in anaerobic bacteria, catalyzes the conversion of pyruvate and CoA to acetyl-CoA and formate in a mechanism that uses a glycyl radical." Q#20741 - CGI_10023775 superfamily 217585 12 94 6.27E-13 65.7308 cl12279 ATP-cone superfamily - - ATP cone domain; ATP cone domain. Q#20742 - CGI_10023776 superfamily 246925 221 369 5.13E-17 80.8625 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#20742 - CGI_10023776 superfamily 247856 482 529 0.00123321 37.1421 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands.
Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#20743 - CGI_10023777 superfamily 148145 52 87 0.0082893 33.2416 cl05715 CD34_antigen superfamily NC - CD34/Podocalyxin family; This family consists of several mammalian CD34 antigen proteins. The CD34 antigen is a human leukocyte membrane protein expressed specifically by lymphohematopoietic progenitor cells. CD34 is a phosphoprotein. Activation of protein kinase C (PKC) has been found to enhance CD34 phosphorylation. This family contains several eukaryotic podocalyxin proteins. Podocalyxin is a major membrane protein of the glomerular epithelium and is thought to be involved in maintenance of the architecture of the foot processes and filtration slits characteristic of this unique epithelium by virtue of its high negative charge. Podocalyxin functions as an anti-adhesin that maintains an open filtration pathway between neighboring foot processes in the glomerular epithelium by charge repulsion. Q#20744 - CGI_10023778 superfamily 244824 23 395 0 544.85 cl07893 AmyAc_family superfamily - - "Alpha amylase catalytic domain family; The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; and C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. 
Other members of this family have lost this catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase." Q#20744 - CGI_10023778 superfamily 243149 401 469 2.51E-09 54.9365 cl02706 Alpha-amylase_C superfamily - - "Alpha amylase, C-terminal all-beta domain; Alpha amylase is classified as family 13 of the glycosyl hydrolases. The structure is an 8 stranded alpha/beta barrel containing the active site, interrupted by a ~70 a.a. calcium-binding domain protruding between beta strand 3 and alpha helix 3, and a carboxyl-terminal Greek key beta-barrel domain." Q#20745 - CGI_10023779 superfamily 216686 1 175 1.85E-45 151.707 cl18377 Galactosyl_T superfamily - - "Galactosyltransferase; This family includes the galactosyltransferases UDP-galactose:2-acetamido-2-deoxy-D-glucose3beta-galactosyltransferase and UDP-Gal:beta-GlcNAc beta 1,3-galactosyltransferase. Specific galactosyltransferases transfer galactose to GlcNAc terminal chains in the synthesis of the lacto-series oligosaccharides types 1 and 2." Q#20747 - CGI_10023781 superfamily 244824 57 429 0 534.835 cl07893 AmyAc_family superfamily - - "Alpha amylase catalytic domain family; The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center.
The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; and C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost this catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase." Q#20747 - CGI_10023781 superfamily 243149 437 506 1.28E-11 61.8701 cl02706 Alpha-amylase_C superfamily - - "Alpha amylase, C-terminal all-beta domain; Alpha amylase is classified as family 13 of the glycosyl hydrolases. The structure is an 8 stranded alpha/beta barrel containing the active site, interrupted by a ~70 a.a. calcium-binding domain protruding between beta strand 3 and alpha helix 3, and a carboxyl-terminal Greek key beta-barrel domain." Q#20748 - CGI_10023782 superfamily 246925 138 316 3.07E-13 68.9213 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. 
This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#20749 - CGI_10023783 superfamily 246918 20 73 1.61E-09 49.1223 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#20750 - CGI_10023784 superfamily 246918 20 72 0.000120332 35.6403 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#20751 - CGI_10023785 superfamily 247723 66 153 4.49E-59 192.818 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)."
Q#20751 - CGI_10023785 superfamily 243034 411 518 1.60E-07 49.3008 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#20751 - CGI_10023785 superfamily 243034 169 276 2.20E-06 45.834 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number
of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#20751 - CGI_10023785 superfamily 243034 330 435 0.00820528 35.0484 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110
subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#20752 - CGI_10023786 superfamily 245201 1 128 2.22E-24 96.0663 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#20753 - CGI_10023787 superfamily 220394 251 357 3.64E-28 115.634 cl10752 Meckelin superfamily N - "Meckelin (Transmembrane protein 67); Members of this family are thought to be related to the ciliary basal body. Defects result in Meckel syndrome type 3, an autosomal recessive disorder characterized by a combination of renal cysts and variably associated features including developmental anomalies of the central nervous system (typically encephalocele), hepatic ductal dysplasia and cysts, and polydactyly. Joubert syndrome type 6 is also a manifestation of certain mutations; it is an autosomal recessive congenital malformation of the cerebellar vermis and brainstem with abnormalities of axonal decussation (crossing in the brain) affecting the corticospinal tract and superior cerebellar peduncles. Individuals with Joubert syndrome have motor and behavioral abnormalities, including an inability to walk due to severe clumsiness and 'mirror' movements, and cognitive and behavioural disturbances." Q#20753 - CGI_10023787 superfamily 243521 209 253 2.56E-09 53.7862 cl03759 Alpha_adaptinC2 superfamily C - "Adaptin C-terminal domain; Alpha adaptin is a heterotetramer which regulates clathrin-bud formation. 
The carboxyl-terminal appendage of the alpha subunit regulates translocation of endocytic accessory proteins to the bud site. This ig-fold domain is found in alpha, beta and gamma adaptins." Q#20757 - CGI_10001592 superfamily 241609 712 780 3.85E-20 86.6631 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#20757 - CGI_10001592 superfamily 243065 175 329 4.10E-24 100.209 cl02516 VWD superfamily - - von Willebrand factor type D domain; Luciferin-2-monooxygenase from Vargula hilgendorfii contains a vwd domain. Its function is unrelated but the similarity is very strong by several methods. Q#20757 - CGI_10001592 superfamily 243093 43 119 6.68E-11 59.8514 cl02568 WSC superfamily - - WSC domain; This domain may be involved in carbohydrate binding. Q#20757 - CGI_10001592 superfamily 244710 374 457 0.000864183 38.4756 cl07383 C8 superfamily - - "C8 domain; This domain contains 8 conserved cysteine residues, but this family only contains 7 of them to overlaps with other domains. It is found in disease-related proteins including von Willebrand factor, Alpha tectorin, Zonadhesin and Mucin. It is often found on proteins containing pfam00094 and pfam01826." Q#20758 - CGI_10001593 superfamily 221370 55 118 0.00246916 34.6545 cl13441 DUF3497 superfamily C - "Domain of unknown function (DUF3497); This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 213 to 257 amino acids in length. This domain is found associated with pfam02793, pfam00002, pfam01825. 
This domain has a single completely conserved residue W that may be functionally important." Q#20762 - CGI_10003854 superfamily 241750 158 462 2.44E-138 405.117 cl00281 metallo-dependent_hydrolases superfamily - - "Superfamily of metallo-dependent hydrolases (also called amidohydrolase superfamily) is a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. The family includes urease alpha, adenosine deaminase, phosphotriesterase dihydroorotases, allantoinases, hydantoinases, AMP-, adenine and cytosine deaminases, imidazolonepropionase, aryldialkylphosphatase, chlorohydrolases, formylmethanofuran dehydrogenases and others." Q#20762 - CGI_10003854 superfamily 241750 41 90 2.75E-18 84.6314 cl00281 metallo-dependent_hydrolases superfamily C - "Superfamily of metallo-dependent hydrolases (also called amidohydrolase superfamily) is a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. The family includes urease alpha, adenosine deaminase, phosphotriesterase dihydroorotases, allantoinases, hydantoinases, AMP-, adenine and cytosine deaminases, imidazolonepropionase, aryldialkylphosphatase, chlorohydrolases, formylmethanofuran dehydrogenases and others." 
Q#20762 - CGI_10003854 superfamily 219848 1 61 1.31E-13 66.9542 cl07170 A_deaminase_N superfamily N - Adenosine/AMP deaminase N-terminal; This domain is found to the N-terminus of the Adenosine/AMP deaminase domain (pfam00962) in metazoan proteins such as the Cat eye syndrome critical region protein 1 and its homologues. Q#20764 - CGI_10003856 superfamily 245205 3 130 4.03E-11 59.6291 cl09930 RPA_2b-aaRSs_OBF_like superfamily C - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." Q#20765 - CGI_10003857 superfamily 242156 85 201 2.63E-45 148.055 cl00869 PTH2_family superfamily - - "Peptidyl-tRNA hydrolase, type 2 (PTH2)_like . Peptidyl-tRNA hydrolase activity releases tRNA from the premature translation termination product peptidyl-tRNA. 
Two structurally different enzymes have been reported to encode such activity, Pth present in bacteria and eukaryotes and Pth2 present in archaea and eukaryotes." Q#20765 - CGI_10003857 superfamily 241643 13 50 0.000106955 37.8239 cl00153 UBA superfamily - - "Ubiquitin Associated domain. The UBA domain is a commonly occurring sequence motif in some members of the ubiquitination pathway, UV excision repair proteins, and certain protein kinases. Although its specific role is so far unknown, it has been suggested that UBA domains are involved in conferring protein target specificity. The domain, a compact three helix bundle, has a conserved GFP-loop and the proline is thought to be critical for binding. The UBA domain is distinct from the conserved three helical domain seen in the N-terminus of EF-TS and eukaryotic NAC proteins." Q#20766 - CGI_10003858 superfamily 243039 15 158 6.43E-57 177.415 cl02446 MATH superfamily - - "MATH (meprin and TRAF-C homology) domain; an independent folding unit with an eight-stranded beta-sandwich structure found in meprins, TRAFs and other proteins. Meprins comprise a class of extracellular metalloproteases which are anchored to the membrane and are capable of cleaving growth factors, extracellular matrix proteins, and biologically active peptides. TRAF molecules serve as adapter proteins that link cell surface receptors of the Tumor Necrosis Factor and Interleukin-1/Toll-like families to downstream kinase cascades, which results in the activation of transcription factors and the regulation of cell survival, proliferation and stress responses in the immune and inflammatory systems. Other members include the ubiquitin ligases, TRIM37 and SPOP, and the ubiquitin-specific proteases, HAUSP and Ubp21p. A large number of uncharacterized members mostly from lineage-specific expansions in C. elegans and rice contain MATH and BTB domains, similar to SPOP. The MATH domain has been shown to bind peptide/protein substrates in TRAFs and HAUSP. 
It is possible that the MATH domain in other members of this superfamily also interacts with various protein substrates. The TRAF domain may also be involved in the trimerization of TRAFs. Based on homology, it is postulated that the MATH domain in meprins may be involved in its tetramer assembly and that the MATH domain, in general, may take part in diverse modular arrangements defined by adjacent multimerization domains." Q#20767 - CGI_10003859 superfamily 247792 47 85 0.000306714 37.8104 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#20767 - CGI_10003859 superfamily 190233 186 245 1.24E-13 64.0126 cl08341 zf-TRAF superfamily - - TRAF-type zinc finger; TRAF-type zinc finger. Q#20767 - CGI_10003859 superfamily 190233 132 187 2.99E-06 43.597 cl08341 zf-TRAF superfamily - - TRAF-type zinc finger; TRAF-type zinc finger. Q#20771 - CGI_10003546 superfamily 245814 157 218 5.43E-15 68.9727 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. 
A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#20771 - CGI_10003546 superfamily 245814 234 336 1.64E-09 54.553 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#20772 - CGI_10003547 superfamily 245814 184 266 3.44E-08 50.5655 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#20772 - CGI_10003547 superfamily 245814 378 460 6.26E-07 47.0987 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#20772 - CGI_10003547 superfamily 245814 88 149 3.83E-13 64.7355 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#20772 - CGI_10003547 superfamily 245814 285 343 4.74E-13 64.3503 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#20774 - CGI_10003549 superfamily 219345 87 175 0.00101969 39.4331 cl06326 Phlebovirus_G1 superfamily C - Phlebovirus glycoprotein G1; This family consists of several Phlebovirus glycoprotein G1 sequences. 
Members of the Bunyaviridae family acquire an envelope by budding through the lipid bilayer of the Golgi complex. The budding compartment is thought to be determined by the accumulation of the two heterodimeric membrane glycoproteins G1 and G2 in the Golgi. Q#20774 - CGI_10003549 superfamily 243119 191 222 0.00871188 33.5714 cl02629 CBM_14 superfamily N - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#20776 - CGI_10003197 superfamily 216554 1 67 2.25E-26 95.2389 cl15977 zf-DHHC superfamily C - DHHC palmitoyltransferase; This family includes the well known DHHC zinc binding domain as well as three of the four conserved transmembrane regions found in this family of palmitoyltransferase enzymes. Q#20777 - CGI_10015954 superfamily 248458 191 306 0.00219488 39.2193 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. 
MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#20781 - CGI_10015959 superfamily 238012 502 541 2.01E-10 57.7494 cl11390 EGF_Lam superfamily C - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#20781 - CGI_10015959 superfamily 238012 391 450 1.22E-09 55.4382 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble 
epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#20781 - CGI_10015959 superfamily 238012 451 501 2.32E-09 54.6678 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#20781 - CGI_10015959 superfamily 238012 265 317 3.54E-09 54.2826 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#20781 - CGI_10015959 superfamily 243198 32 264 1.36E-83 270.152 cl02806 Laminin_N superfamily - - Laminin N-terminal (Domain VI); Laminin N-terminal (Domain VI). Q#20781 - CGI_10015959 superfamily 199166 729 863 5.97E-10 58.878 cl15308 AMN1 superfamily N - "Antagonist of mitotic exit network protein 1; Amn1 has been functionally characterized in Saccharomyces cerevisiae as a component of the Antagonist of MEN pathway (AMEN). The AMEN network is activated by MEN (mitotic exit network) via an active Cdc14, and in turn switches off MEN. Amn1 constitutes one of the alternative mechanisms by which MEN may be disrupted. 
Specifically, Amn1 binds Tem1 (Termination of M-phase, a GTPase that belongs to the RAS superfamily), and disrupts its association with Cdc15, the primary downstream target. Amn1 is a leucine-rich repeat (LRR) protein, with 12 repeats in the S. cerevisiae ortholog. As a negative regulator of the signal transduction pathway MEN, overexpression of AMN1 slows the growth of wild type cells. The function of the vertebrate members of this family has not been determined experimentally, they have fewer LRRs that determine the extent of this model." Q#20781 - CGI_10015959 superfamily 238012 353 381 0.00506461 36.1782 cl11390 EGF_Lam superfamily N - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#20782 - CGI_10015960 superfamily 243066 120 213 6.28E-12 58.7185 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. 
POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#20783 - CGI_10015961 superfamily 241563 71 115 0.000147476 40.1552 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#20790 - CGI_10015968 superfamily 241972 21 88 2.84E-13 60.7936 cl00600 Ribosomal_L7Ae superfamily C - "Ribosomal protein L7Ae/L30e/S12e/Gadd45 family; This family includes: Ribosomal L7A from metazoa, Ribosomal L8-A and L8-B from fungi, 30S ribosomal protein HS6 from archaebacteria, 40S ribosomal protein S12 from eukaryotes, Ribosomal protein L30 from eukaryotes and archaebacteria. Gadd45 and MyD118." Q#20791 - CGI_10015969 superfamily 241972 47 158 1.51E-25 95.8468 cl00600 Ribosomal_L7Ae superfamily - - "Ribosomal protein L7Ae/L30e/S12e/Gadd45 family; This family includes: Ribosomal L7A from metazoa, Ribosomal L8-A and L8-B from fungi, 30S ribosomal protein HS6 from archaebacteria, 40S ribosomal protein S12 from eukaryotes, Ribosomal protein L30 from eukaryotes and archaebacteria. Gadd45 and MyD118." Q#20792 - CGI_10015970 superfamily 247692 17 443 0 598.762 cl17068 AFD_class_I superfamily - - "Adenylate forming domain, Class I; This family includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. 
The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain." Q#20793 - CGI_10015971 superfamily 241609 1 85 6.32E-30 117.094 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#20793 - CGI_10015971 superfamily 247038 341 399 1.61E-15 75.5689 cl15674 IPT superfamily C - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." Q#20793 - CGI_10015971 superfamily 247038 1110 1196 1.83E-14 72.4873 cl15674 IPT superfamily - - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. 
In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." Q#20793 - CGI_10015971 superfamily 247038 1705 1793 7.29E-11 62.0869 cl15674 IPT superfamily - - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." Q#20793 - CGI_10015971 superfamily 247038 1982 2053 1.23E-10 61.3165 cl15674 IPT superfamily - - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." Q#20793 - CGI_10015971 superfamily 247038 1885 1949 3.38E-10 60.1609 cl15674 IPT superfamily C - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. 
They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." Q#20793 - CGI_10015971 superfamily 247038 1371 1425 1.88E-09 57.8497 cl15674 IPT superfamily C - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." Q#20793 - CGI_10015971 superfamily 247038 1198 1266 1.49E-07 52.0717 cl15674 IPT superfamily - - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." 
Q#20793 - CGI_10015971 superfamily 247038 1286 1360 1.04E-06 49.7672 cl15674 IPT superfamily - - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." Q#20793 - CGI_10015971 superfamily 247038 1799 1878 2.81E-06 48.2264 cl15674 IPT superfamily - - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." Q#20793 - CGI_10015971 superfamily 247038 2056 2140 7.34E-06 47.0641 cl15674 IPT superfamily - - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. 
In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." Q#20793 - CGI_10015971 superfamily 247038 2144 2231 2.48E-05 45.5233 cl15674 IPT superfamily - - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." Q#20793 - CGI_10015971 superfamily 220608 2238 2357 2.48E-34 131.274 cl10859 G8 superfamily - - G8 domain; This domain is found in disease proteins PKHD1 and KIAA1199 and is named G8 after its 8 conserved glycines. It is predicted to contain 10 beta strands and an alpha helix. Q#20793 - CGI_10015971 superfamily 220608 3092 3213 1.68E-19 88.1318 cl10859 G8 superfamily - - G8 domain; This domain is found in disease proteins PKHD1 and KIAA1199 and is named G8 after its 8 conserved glycines. It is predicted to contain 10 beta strands and an alpha helix. Q#20794 - CGI_10015972 superfamily 241563 68 109 1.19E-06 45.9332 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#20794 - CGI_10015972 superfamily 241563 28 59 0.00229057 36.3032 cl00034 BBOX superfamily N - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#20795 - CGI_10015973 superfamily 241705 104 219 5.38E-31 112.77 cl00228 HIT_like superfamily - - "HIT family: HIT (Histidine triad) proteins, named for a motif related to the sequence HxHxH/Qxx (x, a hydrophobic amino acid), are a superfamily of nucleotide hydrolases and transferases, which act on the alpha-phosphate of ribonucleotides. On the basis of sequence, substrate specificity, structure, evolution and mechanism, HIT proteins are classified in the literature into three major branches: the Hint branch, which consists of adenosine 5'-monophosphoramide hydrolases, the Fhit branch, that consists of diadenosine polyphosphate hydrolases, and the GalT branch consisting of specific nucleoside monophosphate transferases. Further sequence analysis reveals several new closely related, yet uncharacterized subgroups." Q#20795 - CGI_10015973 superfamily 245610 2 81 4.29E-48 161.826 cl11424 nitrilase superfamily N - "Nitrilase superfamily, including nitrile- or amide-hydrolyzing enzymes and amide-condensing enzymes; This superfamily (also known as the C-N hydrolase superfamily) contains hydrolases that break carbon-nitrogen bonds; it includes nitrilases, cyanide dihydratases, aliphatic amidases, N-terminal amidases, beta-ureidopropionases, biotinidases, pantetheinase, N-carbamyl-D-amino acid amidohydrolases, the glutaminase domain of glutamine-dependent NAD+ synthetase, apolipoprotein N-acyltransferases, and N-carbamoylputrescine amidohydrolases, among others. These enzymes depend on a Glu-Lys-Cys catalytic triad, and work through a thiol acylenzyme intermediate. 
Members of this superfamily generally form homomeric complexes, the basic building block of which is a homodimer. These oligomers include dimers, tetramers, hexamers, octamers, tetradecamers, octadecamers, as well as variable length helical arrangements and homo-oligomeric spirals. These proteins have roles in vitamin and co-enzyme metabolism, in detoxifying small molecules, in the synthesis of signaling molecules, and in the post-translational modification of proteins. They are used industrially, as biocatalysts in the fine chemical and pharmaceutical industry, in cyanide remediation, and in the treatment of toxic effluent. This superfamily has been classified previously in the literature, based on global and structure-based sequence analysis, into thirteen different enzyme classes (referred to as 1-13). This hierarchy includes those thirteen classes and a few additional subfamilies. A putative distant relative, the plasmid-borne TraB family, has not been included in the hierarchy." Q#20797 - CGI_10001778 superfamily 242274 15 191 5.32E-09 52.4146 cl01053 SGNH_hydrolase superfamily - - "SGNH_hydrolase, or GDSL_hydrolase, is a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the typical Ser-His-Asp(Glu) triad from other serine hydrolases, but may lack the carboxlic acid." Q#20798 - CGI_10001779 superfamily 248097 19 141 6.97E-16 69.2162 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#20800 - CGI_10001781 superfamily 248097 27 131 8.21E-15 66.1346 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. 
Q#20802 - CGI_10005407 superfamily 215754 131 213 1.07E-10 57.6484 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#20802 - CGI_10005407 superfamily 215754 63 107 0.000878028 37.2328 cl02813 Mito_carr superfamily N - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#20805 - CGI_10005410 superfamily 215691 240 320 1.25E-07 49.5066 cl15766 Pyr_redox superfamily - - Pyridine nucleotide-disulphide oxidoreductase; This family includes both class I and class II oxidoreductases and also NADH oxidases and peroxidases. This domain is actually a small NADH binding domain within a larger FAD binding domain. Q#20807 - CGI_10005412 superfamily 247725 1 113 1.06E-41 143.999 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." 
Q#20812 - CGI_10016282 superfamily 247905 303 430 1.90E-25 101.546 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#20812 - CGI_10016282 superfamily 247805 1 133 7.12E-21 88.9336 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#20814 - CGI_10016284 superfamily 216301 43 224 1.03E-26 102.342 cl03099 EMP24_GP25L superfamily - - emp24/gp25L/p24 family/GOLD; Members of this family are implicated in bringing cargo forward from the ER and binding to coat proteins by their cytoplasmic domains. This domain corresponds closely to the beta-strand rich GOLD domain described in. The GOLD domain is always found combined with lipid- or membrane-association domains. Q#20815 - CGI_10016285 superfamily 242156 58 171 1.41E-52 165.389 cl00869 PTH2_family superfamily - - "Peptidyl-tRNA hydrolase, type 2 (PTH2)_like . Peptidyl-tRNA hydrolase activity releases tRNA from the premature translation termination product peptidyl-tRNA. Two structurally different enzymes have been reported to encode such activity, Pth present in bacteria and eukaryotes and Pth2 present in archaea and eukaryotes." 
Q#20816 - CGI_10016286 superfamily 245847 23 166 4.46E-17 73.7449 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#20817 - CGI_10016287 superfamily 248458 37 390 2.31E-24 103.933 cl17904 MFS superfamily - - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." 
Q#20817 - CGI_10016287 superfamily 197361 554 602 0.00852511 35.0175 cl15254 UBAN superfamily N - "polyubiquitin binding domain of NEMO and related proteins; NEMO (NF-kappaB essential modulator) is a regulatory subunit of the kinase complex IKK, which is involved in the activation of NF-kappaB via phosphorylation of inhibitory IkappaBs. This mechanism requires the binding of NEMO to ubiquitinated substrates. Binding is achieved via the UBAN motif (ubiquitin binding in ABIN and NEMO), which is described in this model. This region of NEMO has also been named CoZi (for coiled-coil 2 and leucine zipper). ABINs (A20-binding inhibitors of NF-kappaB) are sensors for ubiquitin that are involved in regulation of apoptosis, ABIN-1 is presumed to inhibit signalling via the NF-kappaB route. The UBAN motif is also found in optineurin, the product of a gene associated with glaucoma, which has been characterized as a negative regulator of NF-kappaB as well." Q#20818 - CGI_10016288 superfamily 197361 525 553 0.0001542 40.4103 cl15254 UBAN superfamily NC - "polyubiquitin binding domain of NEMO and related proteins; NEMO (NF-kappaB essential modulator) is a regulatory subunit of the kinase complex IKK, which is involved in the activation of NF-kappaB via phosphorylation of inhibitory IkappaBs. This mechanism requires the binding of NEMO to ubiquitinated substrates. Binding is achieved via the UBAN motif (ubiquitin binding in ABIN and NEMO), which is described in this model. This region of NEMO has also been named CoZi (for coiled-coil 2 and leucine zipper). ABINs (A20-binding inhibitors of NF-kappaB) are sensors for ubiquitin that are involved in regulation of apoptosis, ABIN-1 is presumed to inhibit signalling via the NF-kappaB route. The UBAN motif is also found in optineurin, the product of a gene associated with glaucoma, which has been characterized as a negative regulator of NF-kappaB as well." 
Q#20818 - CGI_10016288 superfamily 152013 51 118 0.000713732 38.5425 cl13089 NEMO superfamily - - "NF-kappa-B essential modulator NEMO; NEMO is a regulatory protein which is part of the IKK complex along with the catalytic IKKalpha and beta kinases. The IKK complex phosphorylates IkappaB targeting it for degradation which results in the release of NF-kappaB which initiates the inflammatory response, cell proliferation or cell differentiation. NEMO activates the IKK complex's activity by associating with the unphosphorylated IKK kinase C termini. The core domain of NEMO is a dimer which binds to two fragments of IKK." Q#20819 - CGI_10016289 superfamily 128778 19 111 2.27E-05 42.6371 cl17972 BBC superfamily - - B-Box C-terminal domain; Coiled coil region C-terminal to (some) B-Box domains Q#20822 - CGI_10008668 superfamily 191362 149 202 6.75E-31 109.668 cl05351 zf-nanos superfamily - - "Nanos RNA binding domain; This family consists of several conserved novel zinc finger domains found in the eukaryotic proteins Nanos and Xcat-2. In Drosophila melanogaster, Nanos functions as a localised determinant of posterior pattern. Nanos RNA is localised to the posterior pole of the maturing egg cell and encodes a protein that emanates from this localised source. Nanos acts as a translational repressor and thereby establishes a gradient of the morphogen Hunchback. Xcat-2 is found in the vegetal cortical region and is inherited by the vegetal blastomeres during development, and is degraded very early in development. The localised and maternally restricted expression of Xcat-2 RNA suggests a role for its protein in setting up regional differences in gene expression that occur early in development." 
Q#20824 - CGI_10008670 superfamily 241750 61 221 2.76E-31 123.838 cl00281 metallo-dependent_hydrolases superfamily N - "Superfamily of metallo-dependent hydrolases (also called amidohydrolase superfamily) is a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. The family includes urease alpha, adenosine deaminase, phosphotriesterase dihydroorotases, allantoinases, hydantoinases, AMP-, adenine and cytosine deaminases, imidazolonepropionase, aryldialkylphosphatase, chlorohydrolases, formylmethanofuran dehydrogenases and others." Q#20825 - CGI_10008671 superfamily 247856 117 188 5.74E-13 60.6393 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#20825 - CGI_10008671 superfamily 247856 81 143 2.15E-12 59.0985 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#20827 - CGI_10008673 superfamily 204716 29 223 6.85E-07 48.3907 cl18257 Git3 superfamily - - "G protein-coupled glucose receptor regulating Gpa2; Git3 is one of six proteins required for glucose-triggered adenylate cyclase activation, and is a G protein-coupled receptor responsible for the activation of adenylate cyclase through Gpa2 - heterotrimeric G protein alpha subunit, part of the glucose-detection pathway. Git3 contains seven predicted transmembrane domains, a third cytoplasmic loop and a cytoplasmic tail. This is the conserved N-terminus of these proteins, and the C-terminal conserved region is now in family Git3_C." Q#20828 - CGI_10008674 superfamily 221441 279 777 1.05E-135 430.401 cl13566 Med12-LCEWAV superfamily - - "Eukaryotic Mediator 12 subunit domain; This domain is found in eukaryotes, and is typically between 325 and 354 amino acids in length. The function of this particular region of the Mediator subunit Med12 is not known, but there is a conserved sequence motif: LCEWAV, from which the name derives." Q#20828 - CGI_10008674 superfamily 192306 110 157 1.38E-14 71.1215 cl09730 Med12 superfamily N - "Transcription mediator complex subunit Med12; Med12 is a negative regulator of the Gli3-dependent sonic hedgehog signalling pathway via its interaction with Gli3 within the RNA polymerase II transcriptional Mediator. A complex is formed between Med12, Med13, CDK8 and CycC which is responsible for suppression of transcription. This subunit forms part of the Kinase section of Mediator." Q#20829 - CGI_10008675 superfamily 110998 10 186 1.37E-63 205.949 cl03422 Glyco_hydro_30 superfamily N - O-Glycosyl hydrolase family 30; O-Glycosyl hydrolase family 30. Q#20832 - CGI_10012973 superfamily 245864 23 457 1.94E-95 298.038 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. 
They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#20835 - CGI_10012976 superfamily 110440 86 110 4.56E-05 37.7725 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is me diated by the NHL repeats." Q#20836 - CGI_10012977 superfamily 247724 51 364 5.98E-126 368.007 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. 
The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#20837 - CGI_10012978 superfamily 247684 32 331 2.52E-64 213.678 cl17037 NBD_sugar-kinase_HSP70_actin superfamily C - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. 
The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#20838 - CGI_10012979 superfamily 243072 61 191 5.00E-11 61.6306 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#20838 - CGI_10012979 superfamily 149414 205 267 4.03E-25 100.809 cl07091 TRP_2 superfamily - - Transient receptor ion channel II; This domain is found in the transient receptor ion channel (Trp) family of proteins. There is strong evidence that Trp proteins are structural elements of calcium-ion entry channels activated by G protein-coupled receptors. This domain does not tend to appear with the TRP domain (pfam06011) but is often found to the C-terminus of Ankyrin repeats (pfam00023). 
Q#20840 - CGI_10012981 superfamily 247792 14 64 0.00460105 34.2777 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#20841 - CGI_10012982 superfamily 216411 23 168 1.30E-20 84.2506 cl15974 MARVEL superfamily - - "Membrane-associating domain; MARVEL domain-containing proteins are often found in lipid-associating proteins - such as Occludin and MAL family proteins. It may be part of the machinery of membrane apposition events, such as transport vesicle biogenesis." Q#20844 - CGI_10002261 superfamily 247723 170 239 2.45E-40 138.554 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. 
The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#20844 - CGI_10002261 superfamily 247723 108 162 2.45E-28 105.931 cl17169 RRM_SF superfamily C - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#20846 - CGI_10002263 superfamily 241645 6 70 6.06E-27 94.4653 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. 
The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#20847 - CGI_10002264 superfamily 241770 42 133 2.63E-16 70.5024 cl00309 PRTases_typeI superfamily C - "Phosphoribosyl transferase (PRT)-type I domain; Phosphoribosyl transferase (PRT) domain. The type I PRTases are identified by a conserved PRPP binding motif which features two adjacent acidic residues surrounded by one or more hydrophobic residue. PRTases catalyze the displacement of the alpha-1'-pyrophosphate of 5-phosphoribosyl-alpha1-pyrpphosphate (PRPP) by a nitrogen-containing nucleophile. The reaction products are an alpha-1 substituted ribose-5'-phosphate and a free pyrophosphate (PP). PRPP, an activated form of ribose-5-phosphate, is a key metabolite connecting nucleotide synthesis and salvage pathways. The type I PRTase family includes a range of diverse phosphoribosyl transferase enzymes and regulatory proteins of the nucleotide synthesis and salvage pathways, including adenine phosphoribosyltransferase EC:2.4.2.7., hypoxanthine-guanine-xanthine phosphoribosyltransferase, hypoxanthine phosphoribosyltransferase EC:2.4.2.8., ribose-phosphate pyrophosphokinase EC:2.7.6.1., amidophosphoribosyltransferase EC:2.4.2.14., orotate phosphoribosyltransferase EC:2.4.2.10., uracil phosphoribosyltransferase EC:2.4.2.9., and xanthine-guanine phosphoribosyltransferase EC:2.4.2.22." Q#20849 - CGI_10007876 superfamily 199156 398 413 0.00108345 37.4228 cl15298 zf-CCHC superfamily - - "Zinc knuckle; The zinc knuckle is a zinc binding motif composed of the the following CX2CX4HX4C where X can be any amino acid. The motifs are mostly from retroviral gag proteins (nucleocapsid). 
Prototype structure is from HIV. Also contains members involved in eukaryotic gene regulation, such as C. elegans GLH-1. Structure is an 18-residue zinc finger." Q#20852 - CGI_10007879 superfamily 243175 65 133 5.43E-14 67.2659 cl02776 GST_C_family superfamily N - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. 
Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." Q#20852 - CGI_10007879 superfamily 243175 251 382 1.41E-13 66.1104 cl02776 GST_C_family superfamily - - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. 
Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." Q#20852 - CGI_10007879 superfamily 241832 162 234 1.48E-19 82.2866 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#20853 - CGI_10007880 superfamily 243175 136 268 1.77E-14 66.8808 cl02776 GST_C_family superfamily - - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. 
This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." Q#20853 - CGI_10007880 superfamily 241832 45 119 1.43E-12 61.4858 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). 
Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#20854 - CGI_10007881 superfamily 219541 53 102 2.58E-17 72.8863 cl18516 Cu-oxidase_2 superfamily N - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. Q#20855 - CGI_10007882 superfamily 248097 10 131 1.42E-23 89.6318 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#20856 - CGI_10002121 superfamily 246902 197 366 3.32E-53 176.96 cl15239 PLDc_SF superfamily - - "Catalytic domain of phospholipase D superfamily proteins; Catalytic domain of phospholipase D (PLD) superfamily proteins. The PLD superfamily is composed of a large and diverse group of proteins including plant, mammalian and bacterial PLDs, bacterial cardiolipin (CL) synthases, bacterial phosphatidylserine synthases (PSS), eukaryotic phosphatidylglycerophosphate (PGP) synthase, eukaryotic tyrosyl-DNA phosphodiesterase 1 (Tdp1), and some bacterial endonucleases (Nuc and BfiI), among others. PLD enzymes hydrolyze phospholipid phosphodiester bonds to yield phosphatidic acid and a free polar head group. They can also catalyze the transphosphatidylation of phospholipids to acceptor alcohols. The majority of members in this superfamily contain a short conserved sequence motif (H-x-K-x(4)-D, where x represents any amino acid residue), called the HKD signature motif. There are varying expanded forms of this motif in different family members. Some members contain variant HKD motifs. Most PLD enzymes are monomeric proteins with two HKD motif-containing domains. Two HKD motifs from two domains form a single active site. 
Some PLD enzymes have only one copy of the HKD motif per subunit but form a functionally active dimer, which has a single active site at the dimer interface containing the two HKD motifs from both subunits. Different PLD enzymes may have evolved through domain fusion of a common catalytic core with separate substrate recognition domains. Despite their various catalytic functions and a very broad range of substrate specificities, the diverse group of PLD enzymes can bind to a phosphodiester moiety. Most of them are active as bi-lobed monomers or dimers, and may possess similar core structures for catalytic activity. They are generally thought to utilize a common two-step ping-pong catalytic mechanism, involving an enzyme-substrate intermediate, to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group." Q#20856 - CGI_10002121 superfamily 246902 15 167 2.53E-49 166.112 cl15239 PLDc_SF superfamily - - "Catalytic domain of phospholipase D superfamily proteins; Catalytic domain of phospholipase D (PLD) superfamily proteins. The PLD superfamily is composed of a large and diverse group of proteins including plant, mammalian and bacterial PLDs, bacterial cardiolipin (CL) synthases, bacterial phosphatidylserine synthases (PSS), eukaryotic phosphatidylglycerophosphate (PGP) synthase, eukaryotic tyrosyl-DNA phosphodiesterase 1 (Tdp1), and some bacterial endonucleases (Nuc and BfiI), among others. PLD enzymes hydrolyze phospholipid phosphodiester bonds to yield phosphatidic acid and a free polar head group. They can also catalyze the transphosphatidylation of phospholipids to acceptor alcohols. 
The majority of members in this superfamily contain a short conserved sequence motif (H-x-K-x(4)-D, where x represents any amino acid residue), called the HKD signature motif. There are varying expanded forms of this motif in different family members. Some members contain variant HKD motifs. Most PLD enzymes are monomeric proteins with two HKD motif-containing domains. Two HKD motifs from two domains form a single active site. Some PLD enzymes have only one copy of the HKD motif per subunit but form a functionally active dimer, which has a single active site at the dimer interface containing the two HKD motifs from both subunits. Different PLD enzymes may have evolved through domain fusion of a common catalytic core with separate substrate recognition domains. Despite their various catalytic functions and a very broad range of substrate specificities, the diverse group of PLD enzymes can bind to a phosphodiester moiety. Most of them are active as bi-lobed monomers or dimers, and may possess similar core structures for catalytic activity. They are generally thought to utilize a common two-step ping-pong catalytic mechanism, involving an enzyme-substrate intermediate, to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group." 
Q#20857 - CGI_10002122 superfamily 241888 1 110 5.98E-43 141.545 cl00473 BI-1-like superfamily N - "BAX inhibitor (BI)-1/YccA-like protein family; Mammalian members of the BAX inhibitor (BI)-1 like family of small transmembrane proteins have been shown to have an antiapoptotic effect either by stimulating the antiapoptotic function of Bcl-2, a well-characterized oncogene, or by inhibiting the proapoptotic effect of Bax, another member of the Bcl-2 family. Their broad tissue distribution and high degree of conservation suggests an important regulatory role. This superfamily also contains the lifeguard(LFG)-like proteins and other subfamilies which appear to be related by common descent and also function as inhibitors of apoptosis. In plants, BI-1 like proteins play a role in pathogen resistance. A prokaryotic member, Escherichia coli YccA, has been shown to interact with ATP-dependent protease FtsH, which degrades abnormal membrane proteins as part of a quality control mechanism to keep the integrity of biological membranes." Q#20858 - CGI_10002123 superfamily 241841 13 137 4.99E-60 183.872 cl00399 MoaE superfamily - - "MoaE family. Members of this family are involved in biosynthesis of the molybdenum cofactor (Moco), an essential cofactor for a diverse group of redox enzymes. Moco biosynthesis is an evolutionarily conserved pathway present in eubacteria, archaea and eukaryotes. Moco contains a tricyclic pyranopterin, termed molybdopterin (MPT), which carries the cis-dithiolene group responsible for molybdenum ligation. This dithiolene group is generated by MPT synthase in the second major step in Moco biosynthesis. MPT synthase is a heterotetramer consisting of two large (MoaE) and two small (MoaD) subunits." Q#20859 - CGI_10002124 superfamily 241719 15 154 1.14E-80 237.445 cl00242 MoaC superfamily - - "MoaC family. Members of this family are involved in molybdenum cofactor (Moco) biosynthesis, an essential cofactor of a diverse group of redox enzymes. 
MoaC, a small hexameric protein, converts, together with MoaA, a guanosine derivative to the precursor Z by inserting the carbon-8 of the purine between the 2' and 3' ribose carbon atoms, which is the first of three phases of Moco biosynthesis." Q#20860 - CGI_10002125 superfamily 241870 1 62 1.09E-38 126.767 cl00451 MoCF_BD superfamily N - "MoCF_BD: molybdenum cofactor (MoCF) binding domain (BD). This domain is found in a variety of proteins involved in biosynthesis of molybdopterin cofactor, like MoaB, MogA, and MoeA. The domain is presumed to bind molybdopterin." Q#20861 - CGI_10002126 superfamily 241855 1 105 2.06E-24 93.8657 cl00425 CofD_YvcK superfamily N - "Family of CofD-like proteins and proteins related to YvcK; CofD is a 2-phospho-L-lactate transferase that catalyzes the last step in the biosynthesis of coenzyme F(420)-0 (F(420) without polyglutamate) by transferring the lactyl phosphate moiety of lactyl(2)diphospho-(5')guanosine (LPPG) to 7,8-didemethyl-8-hydroxy-5-deazariboflavin ribitol (F0). F420 is a hydride carrier, important for energy metabolism of methanogenic archaea, as well as for the biosynthesis of other natural products, like tetracycline in Streptomyces. F420 and some of its precursors are also utilized as cofactors for enzymes, like DNA photolyase in Mycobacterium tuberculosis. YvcK from Bacillus subtilis is a member of a family of mostly uncharacterized proteins and has been proposed to play a role in carbon metabolism, since its function is essential for growth on intermediates of the Krebs cycle and pentose phosphate pathway. Both families appear to have a conserved phosphate binding site, but have different substrate binding residues conserved within each family." 
Q#20862 - CGI_10002127 superfamily 247905 82 204 8.22E-29 108.479 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#20862 - CGI_10002127 superfamily 204889 203 246 5.73E-22 86.7149 cl13741 UvrB superfamily - - "Ultra-violet resistance protein B; This domain family is found in bacteria, archaea and eukaryotes, and is approximately 40 amino acids in length. The family is found in association with pfam00271, pfam02151, pfam04851. There are two conserved sequence motifs: YAD and RRR. This family is the C terminal region of the UvrB protein which conveys mutational resistance against UV light to various different species." Q#20862 - CGI_10002127 superfamily 145355 284 319 0.00118991 36.2256 cl12262 UVR superfamily - - UvrB/uvrC motif; UvrB/uvrC motif. Q#20863 - CGI_10002128 superfamily 247757 10 107 3.13E-31 109.418 cl17203 Fer4_NifH superfamily N - "The Fer4_NifH superfamily contains a variety of proteins which share a common ATP-binding domain. Functionally, proteins in this superfamily use the energy from hydrolysis of NTP to transfer electrons or ions." Q#20864 - CGI_10002130 superfamily 241782 1 181 1.13E-92 277.424 cl00321 AAT_I superfamily N - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. 
PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into five major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), D-amino acid superfamily (fold type IV), and glycogen phosphorylase family (fold type V)." Q#20865 - CGI_10002132 superfamily 246940 50 249 2.24E-21 89.7001 cl15377 Radical_SAM superfamily - - "Radical SAM superfamily. Enzymes of this family generate radicals by combining a 4Fe-4S cluster and S-adenosylmethionine (SAM) in close proximity. They are characterized by a conserved CxxxCxxC motif, which coordinates the conserved iron-sulfur cluster. Mechanistically, they share the transfer of a single electron from the iron-sulfur cluster to SAM, which leads to its reductive cleavage to methionine and a 5'-deoxyadenosyl radical, which, in turn, abstracts a hydrogen from the appropriately positioned carbon atom. Depending on the enzyme, SAM is consumed during this process or it is restored and reused. 
Radical SAM enzymes catalyze steps in metabolism, DNA repair, the biosynthesis of vitamins and coenzymes, and the biosynthesis of many antibiotics. Examples are biotin synthase (BioB), lipoyl synthase (LipA), pyruvate formate-lyase (PFL), coproporphyrinogen oxidase (HemN), lysine 2,3-aminomutase (LAM), anaerobic ribonucleotide reductase (ARR), and MoaA, an enzyme of the biosynthesis of molybdopterin." Q#20865 - CGI_10002132 superfamily 244315 220 312 5.09E-41 140.309 cl06149 BATS superfamily - - "Biotin and Thiamin Synthesis associated domain; Biotin synthase (BioB), EC:2.8.1.6, catalyzes the last step of the biotin biosynthetic pathway. The reaction consists in the introduction of a sulphur atom into dethiobiotin. BioB functions as a homodimer. Thiamin synthesis is a complex process involving at least six gene products (ThiFSGH, ThiI and ThiJ). Two of the proteins required for the biosynthesis of the thiazole moiety of thiamine (vitamin B(1)) are ThiG and ThiH (this family) and form a heterodimer. Both of these reactions are thought to involve the binding of co-factors, and both function as dimers. This domain therefore may be involved in co-factor binding or dimerisation (Finn, RD personal observation)." Q#20866 - CGI_10002135 superfamily 116980 52 110 4.28E-13 65.5009 cl17962 phage_tail_N superfamily N - Prophage tail fibre N-terminal; This domain is found at the N-terminus of prophage tail fibre proteins. Q#20866 - CGI_10002135 superfamily 146180 346 383 2.46E-11 58.4602 cl04049 Phage_fiber_2 superfamily - - Phage tail fibre repeat; This repeat is found in the tail fibres of phage. For example protein K. The repeats are about 40 residues long. Q#20867 - CGI_10002137 superfamily 242762 1 281 6.78E-158 447.546 cl01886 Omptin superfamily - - Omptin family; The omptin family is a family of serine proteases. 
Q#20868 - CGI_10002139 superfamily 242667 1 164 2.51E-93 271.385 cl01720 Phage_Nu1 superfamily - - "Phage DNA packaging protein Nu1; Terminase, the DNA packaging enzyme of bacteriophage lambda, is a heteromultimer composed of subunits Nu1 and A. The smaller Nu1 terminase subunit has a low-affinity ATPase stimulated by non-specific DNA." Q#20871 - CGI_10004752 superfamily 246597 30 307 7.51E-145 422.722 cl13995 MPP_superfamily superfamily - - "metallophosphatase superfamily, metallophosphatase domain; Metallophosphatases (MPPs), also known as metallophosphoesterases, phosphodiesterases (PDEs), binuclear metallophosphoesterases, and dimetal-containing phosphoesterases (DMPs), represent a diverse superfamily of enzymes with a conserved domain containing an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. This superfamily includes: the phosphoprotein phosphatases (PPPs), Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination." Q#20871 - CGI_10004752 superfamily 217260 345 506 2.00E-42 149.327 cl03752 5_nucleotid_C superfamily - - "5'-nucleotidase, C-terminal domain; 5'-nucleotidase, C-terminal domain. " Q#20874 - CGI_10010726 superfamily 241583 99 285 1.39E-48 170.059 cl00064 ZnMc superfamily - - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. 
Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#20874 - CGI_10010726 superfamily 243051 663 818 1.04E-26 107.464 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal cord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#20874 - CGI_10010726 superfamily 245213 621 656 0.00207251 37.071 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#20874 - CGI_10010726 superfamily 241571 330 446 0.00301032 37.0067 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#20875 - CGI_10010727 superfamily 177822 232 505 3.00E-23 102.306 cl18088 PLN02164 superfamily - - sulfotransferase Q#20875 - CGI_10010727 superfamily 246921 683 736 1.43E-12 65.0893 cl15299 FG-GAP superfamily - - "FG-GAP repeat; This family contains the extracellular repeat that is found in up to seven copies in alpha integrins. This repeat has been predicted to fold into a beta propeller structure. The repeat is called the FG-GAP repeat after two conserved motifs in the repeat. The FG-GAP repeats are found in the N terminus of integrin alpha chains, a region that has been shown to be important for ligand binding. A putative Ca2+ binding motif is found in some of the repeats." Q#20875 - CGI_10010727 superfamily 246921 613 663 1.87E-05 43.9033 cl15299 FG-GAP superfamily C - "FG-GAP repeat; This family contains the extracellular repeat that is found in up to seven copies in alpha integrins. This repeat has been predicted to fold into a beta propeller structure. The repeat is called the FG-GAP repeat after two conserved motifs in the repeat. The FG-GAP repeats are found in the N terminus of integrin alpha chains, a region that has been shown to be important for ligand binding. A putative Ca2+ binding motif is found in some of the repeats." Q#20876 - CGI_10010728 superfamily 217380 674 956 1.57E-106 338.916 cl18406 TTL superfamily - - "Tubulin-tyrosine ligase family; Tubulins and microtubules are subjected to several post-translational modifications of which the reversible detyrosination/tyrosination of the carboxy-terminal end of most alpha-tubulins has been extensively analysed. This modification cycle involves a specific carboxypeptidase and the activity of the tubulin-tyrosine ligase (TTL). 
The true physiological function of TTL has so far not been established. Tubulin-tyrosine ligase (TTL) catalyzes the ATP-dependent post-translational addition of a tyrosine to the carboxy terminal end of detyrosinated alpha-tubulin. In normally cycling cells, the tyrosinated form of tubulin predominates. However, in breast cancer cells, the detyrosinated form frequently predominates, with a correlation to tumour aggressiveness. On the other hand, 3-nitrotyrosine has been shown to be incorporated, by TTL, into the carboxy terminal end of detyrosinated alpha-tubulin. This reaction is not reversible by the carboxypeptidase enzyme. Cells cultured in 3-nitrotyrosine rich medium showed evidence of altered microtubule structure and function, including altered cell morphology, epithelial barrier dysfunction, and apoptosis. Bacterial homologs of TTL are predicted to form peptide tags. Some of these are fused to a 2-oxoglutarate Fe(II)-dependent dioxygenase domain." Q#20877 - CGI_10010729 superfamily 243092 23 308 3.88E-18 85.0792 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed 
ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#20878 - CGI_10010730 superfamily 241613 161 192 3.12E-07 46.431 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#20880 - CGI_10008311 superfamily 245206 37 291 1.36E-33 126.251 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. 
Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#20881 - CGI_10008312 superfamily 247684 32 456 2.76E-110 339.253 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#20883 - CGI_10008314 superfamily 243066 2 106 9.18E-22 91.1397 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#20883 - CGI_10008314 superfamily 198867 120 203 3.84E-19 83.3619 cl06652 BACK superfamily - - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation)." Q#20883 - CGI_10008314 superfamily 243146 471 512 1.41E-09 54.5898 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#20883 - CGI_10008314 superfamily 243146 426 468 1.65E-07 48.8118 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#20883 - CGI_10008314 superfamily 243146 374 421 6.86E-07 46.8858 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. 
" Q#20883 - CGI_10008314 superfamily 243146 519 571 0.000257276 39.1818 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#20884 - CGI_10008315 superfamily 241874 47 610 0 717.421 cl00456 SLC5-6-like_sbd superfamily - - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#20885 - CGI_10008316 superfamily 241874 1 175 7.90E-54 181.493 cl00456 SLC5-6-like_sbd superfamily N - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. 
SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#20888 - CGI_10008319 superfamily 241554 229 364 6.32E-37 136.621 cl00019 Macro superfamily - - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#20888 - CGI_10008319 superfamily 241554 429 571 3.91E-36 134.31 cl00019 Macro superfamily - - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. 
Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#20888 - CGI_10008319 superfamily 241554 48 185 1.48E-28 112.739 cl00019 Macro superfamily - - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#20888 - CGI_10008319 superfamily 241752 670 918 4.91E-16 77.7559 cl00283 ADP_ribosyl superfamily - - "ADP_ribosylating enzymes catalyze the transfer of ADP_ribose from NAD+ to substrates. Bacterial toxins are cytoplasmic and catalyze the transfer of a single ADP_ribose unit to eukaryotic elongation factor 2, halting protein synthesis and killing the cell. Poly(ADP-ribose) polymerases (PARPS 1-3, VPARP, tankyrase) catalyze the addition of up to 100 ADP_ribose units from NAD+. PARPs 1 and 2 are localized in the nucleus, bind DNA, and are activated by DNA damage. VPARP is part of the vault ribonucleoprotein complex. Tankyrases regulate telomere length in part through poly(ADP_ribosylation) of telomere repeat binding factor 1 (TRF1). 
Poly(ADP-ribose) polymerase catalyses the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active." Q#20890 - CGI_10008321 superfamily 243035 221 347 3.41E-24 95.3793 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. 
Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#20890 - CGI_10008321 superfamily 243051 84 197 1.16E-14 70.0721 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). 
In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal cord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#20891 - CGI_10008322 superfamily 114686 27 174 8.35E-06 44.2364 cl11626 UNC-93 superfamily - - Ion channel regulatory protein UNC-93; This family of proteins is a component of a multi-subunit protein complex which is involved in the coordination of muscle contraction. UNC-93 is most likely an ion channel regulatory protein. Q#20892 - CGI_10008323 superfamily 243035 93 219 1.41E-24 94.9941 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. 
CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#20893 - CGI_10008324 superfamily 248458 299 395 0.00782805 36.9081 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. 
They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#20893 - CGI_10008324 superfamily 114686 27 174 0.00821344 35.762 cl11626 UNC-93 superfamily - - Ion channel regulatory protein UNC-93; This family of proteins is a component of a multi-subunit protein complex which is involved in the coordination of muscle contraction. UNC-93 is most likely an ion channel regulatory protein. 
Q#20895 - CGI_10008326 superfamily 241974 454 563 1.68E-12 64.185 cl00604 STAS superfamily C - "Sulphate Transporter and Anti-Sigma factor antagonist domain found in the C-terminal region of sulphate transporters as well as in bacterial and archaeal proteins involved in the regulation of sigma factors; The STAS (Sulphate Transporter and Anti-Sigma factor antagonist) domain is found in the C-terminal region of sulphate transporters as well as in bacterial and archaeal proteins involved in the regulation of sigma factors, like anti-anti-sigma factors and "stressosome" components. The sigma factor regulators are involved in protein-protein interaction which is regulated by phosphorylation." Q#20895 - CGI_10008326 superfamily 216188 139 393 4.77E-58 197.055 cl18360 Sulfate_transp superfamily - - Sulfate transporter family; Mutations in human SLC26A2 lead to several human diseases. Q#20895 - CGI_10008326 superfamily 205965 1 56 2.77E-20 85.9271 cl18285 Sulfate_tra_GLY superfamily N - "Sulfate transporter N-terminal domain with GLY motif; This domain is found usually at the N-terminus of sulfate-transporter proteins. It carries a highly conserved GLY sequence motif, but the function of the domain is not known." Q#20900 - CGI_10017214 superfamily 215827 164 334 1.83E-25 104.858 cl02830 Tyrosinase superfamily - - Common central domain of tyrosinase; This family also contains polyphenol oxidases and some hemocyanins. Binds two copper ions via two sets of three histidines. This family is related to pfam00372. Q#20902 - CGI_10017216 superfamily 245306 33 161 7.47E-15 66.4551 cl10465 Peptidase_S24_S26 superfamily - - "The S24, S26 LexA/signal peptidase superfamily contains LexA-related and type I signal peptidase families. 
The S24 LexA protein domains include: the lambda repressor CI/C2 family and related bacterial prophage repressor proteins; LexA (EC 3.4.21.88), the repressor of genes in the cellular SOS response to DNA damage; MucA and the related UmuD proteins, which are lesion-bypass DNA polymerases, induced in response to mitogenic DNA damage; RulA, a component of the rulAB locus that confers resistance to UV, and RuvA, which is a component of the RuvABC resolvasome that catalyzes the resolution of Holliday junctions that arise during genetic recombination and DNA repair. The S26 type I signal peptidase (SPase) family also includes mitochondrial inner membrane protease (IMP)-like members. SPases are essential membrane-bound proteases which function to cleave away the amino-terminal signal peptide from the translocated pre-protein, thus playing a crucial role in the transport of proteins across membranes in all living organisms. All members in this superfamily are unique serine proteases that carry out catalysis using a serine/lysine dyad instead of the prototypical serine/histidine/aspartic acid triad found in most serine proteases." Q#20903 - CGI_10017217 superfamily 241584 300 374 5.50E-06 44.4095 cl00065 FN3 superfamily C - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#20905 - CGI_10017219 superfamily 147395 14 47 1.69E-05 37.9492 cl04973 Dpy-30 superfamily - - Dpy-30 motif; This motif is found in a wide variety of domain contexts. It is found in the Dpy-30 proteins hence the motifs name. 
It is about 40 residues long and is probably formed of two alpha-helices. It may be a dimerisation motif analogous to pfam02197 (Bateman A pers obs). Q#20906 - CGI_10017220 superfamily 218015 1098 1255 9.84E-28 112.609 cl08416 FBA superfamily - - "F-box associated region; Members of this family are associated with F-box domains, hence the name FBA. This domain is probably involved in binding other proteins that will be targeted for ubiquitination. Human FBXO2 is involved in binding to N-glycosylated proteins." Q#20906 - CGI_10017220 superfamily 221187 885 1036 0.000104787 43.049 cl13212 Malectin superfamily - - "Di-glucose binding within endoplasmic reticulum; Malectin is a membrane-anchored protein of the endoplasmic reticulum that recognises and binds Glc2-N-glycan. It carries a signal peptide from residues 1-26, a C-terminal transmembrane helix from residues 255-274, and a highly conserved central part of approximately 190 residues followed by an acidic, glutamate-rich region. Carbohydrate-binding is mediated by the four aromatic residues, Y67, Y89, Y116, and F117 and the aspartate at D186. NMR-based ligand-screening studies have shown binding of the protein to maltose and related oligosaccharides, on the basis of which the protein has been designated "malectin", and its endogenous ligand is found to be Glc2-high-mannose N-glycan." Q#20907 - CGI_10017221 superfamily 218721 38 401 4.81E-68 226.228 cl05344 TROVE superfamily - - "TROVE domain; This presumed domain is found in TEP1 and Ro60 proteins, that are RNA-binding components of Telomerase, Ro and Vault RNPs. This domain has been named TROVE, (after Telomerase, Ro and Vault). This domain is probably RNA-binding." Q#20908 - CGI_10017222 superfamily 241752 725 841 4.38E-43 152.859 cl00283 ADP_ribosyl superfamily - - "ADP_ribosylating enzymes catalyze the transfer of ADP_ribose from NAD+ to substrates. 
Bacterial toxins are cytoplasmic and catalyze the transfer of a single ADP_ribose unit to eukaryotic elongation factor 2, halting protein synthesis and killing the cell. Poly(ADP-ribose) polymerases (PARPS 1-3, VPARP, tankyrase) catalyze the addition of up to 100 ADP_ribose units from NAD+. PARPs 1 and 2 are localized in the nucleus, bind DNA, and are activated by DNA damage. VPARP is part of the vault ribonucleoprotein complex. Tankyrases regulate telomere length in part through poly(ADP_ribosylation) of telomere repeat binding factor 1 (TRF1). Poly(ADP-ribose) polymerase catalyses the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active." Q#20908 - CGI_10017222 superfamily 207713 515 592 1.52E-10 58.5053 cl02729 WWE superfamily - - WWE domain; The WWE domain is named after three of its conserved residues and is predicted to mediate specific protein-protein interactions in ubiquitin and ADP ribose conjugation systems. Q#20910 - CGI_10017224 superfamily 241900 510 853 4.64E-177 517.241 cl00490 EEP superfamily - - "Exonuclease-Endonuclease-Phosphatase (EEP) domain superfamily; This large superfamily includes the catalytic domain (exonuclease/endonuclease/phosphatase or EEP domain) of a diverse set of proteins including the ExoIII family of apurinic/apyrimidinic (AP) endonucleases, inositol polyphosphate 5-phosphatases (INPP5), neutral sphingomyelinases (nSMases), deadenylases (such as the vertebrate circadian-clock regulated nocturnin), bacterial cytolethal distending toxin B (CdtB), deoxyribonuclease 1 (DNase1), the endonuclease domain of the non-LTR retrotransposon LINE-1, and related domains. 
These diverse enzymes share a common catalytic mechanism of cleaving phosphodiester bonds; their substrates range from nucleic acids to phospholipids and perhaps proteins." Q#20912 - CGI_10017226 superfamily 241783 38 129 1.98E-05 42.9416 cl00322 Ribosomal_L1 superfamily N - "Ribosomal protein L1. The L1 protein, located near the E-site of the ribosome, forms part of the L1 stalk along with 23S rRNA. In bacteria and archaea, L1 functions both as a ribosomal protein that binds rRNA, and as a translation repressor that binds its own mRNA. Like several other large ribosomal subunit proteins, L1 displays RNA chaperone activity. L1 is one of the largest ribosomal proteins. It is composed of two domains that cycle between open and closed conformations via a hinge motion. The RNA-binding site of L1 is highly conserved, with both mRNA and rRNA binding the same binding site." Q#20913 - CGI_10017227 superfamily 247780 246 532 3.11E-106 320.252 cl17226 NAD_bind_amino_acid_DH superfamily - - "NAD(P) binding domain of amino acid dehydrogenase-like proteins; Amino acid dehydrogenase(DH)-like NAD(P)-binding domains are members of the Rossmann fold superfamily and are found in glutamate, leucine, and phenylalanine DHs (DHs), methylene tetrahydrofolate DH, methylene-tetrahydromethanopterin DH, methylene-tetrahydrofolate DH/cyclohydrolase, Shikimate DH-like proteins, malate oxidoreductases, and glutamyl tRNA reductase. Amino acid DHs catalyze the deamination of amino acids to keto acids with NAD(P)+ as a cofactor. The NAD(P)-binding Rossmann fold superfamily includes a wide variety of protein families including NAD(P)-binding domains of alcohol DHs, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate DH, lactate/malate DHs, formate/glycerate DHs, siroheme synthases, 6-phosphogluconate DH, amino acid DHs, repressor rex, NAD-binding potassium channel domain, CoA-binding, and ornithine cyclodeaminase-like domains. These domains have an alpha-beta-alpha configuration. 
NAD binding involves numerous hydrogen and van der Waals contacts." Q#20913 - CGI_10017227 superfamily 202408 96 223 4.87E-61 199.264 cl08368 ELFV_dehydrog_N superfamily - - "Glu/Leu/Phe/Val dehydrogenase, dimerisation domain; Glu/Leu/Phe/Val dehydrogenase, dimerisation domain. " Q#20914 - CGI_10017228 superfamily 243179 98 215 4.95E-34 120.608 cl02781 tetraspanin_LEL superfamily - - "Tetraspanin, extracellular domain or large extracellular loop (LEL). Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. The tetraspanin family contains CD9, CD63, CD37, CD53, CD82, CD151, and CD81, amongst others. Tetraspanins are involved in diverse processes such as cell activation and proliferation, adhesion and motility, differentiation, cancer, and others. Their various functions may relate to their ability to act as molecular facilitators, grouping specific cell-surface proteins and affecting formation and stability of signaling complexes. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web", which may also include integrins." Q#20919 - CGI_10017233 superfamily 192535 48 329 6.98E-06 45.6646 cl18179 7TM_GPCR_Srsx superfamily - - Serpentine type 7TM GPCR chemoreceptor Srsx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srsx is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. 
Q#20925 - CGI_10001542 superfamily 245670 147 335 2.35E-38 136.171 cl11519 DENN superfamily C - DENN (AEX-3) domain; DENN (after differentially expressed in neoplastic vs normal cells) is a domain which occurs in several proteins involved in Rab-mediated processes or regulation of MAPK signalling pathways. Q#20925 - CGI_10001542 superfamily 243635 7 97 1.92E-21 86.6196 cl04085 uDENN superfamily - - uDENN domain; This region is always found associated with pfam02141. It is predicted to form an all beta domain. Q#20926 - CGI_10000933 superfamily 243072 9 65 4.07E-09 49.6894 cl02529 ANK superfamily NC - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#20928 - CGI_10001497 superfamily 241646 40 74 0.000197819 36.6598 cl00156 WAP superfamily - - "whey acidic protein-type four-disulfide core domains. Members of the family include whey acidic protein, elafin (elastase-specific inhibitor), caltrin-like protein (a calcium transport inhibitor) and other extracellular proteinase inhibitors. A group of proteins containing 8 characteristically-spaced cysteine residues forming disulphide bonds, have been termed '4-disulphide core' proteins. Protease inhibition occurs by insertion of the inhibitory loop into the active site pocket and interference with the catalytic residues of the protease." 
Q#20929 - CGI_10001498 superfamily 244574 11 37 0.00402202 33.132 cl06998 LEM_like superfamily - - "LEM-like domain of lamina-associated polypeptide 2 (LAP2) and similar proteins; LAP2, also termed thymopoietin (TP), or thymopoietin-related peptide (TPRP), is composed of isoform alpha and isoforms beta/gamma and may be involved in chromatin organization and postmitotic reassembly. Some of the LAP2 isoforms are inner nuclear membrane proteins that can bind to nuclear lamins and chromatin, while others are nonmembrane nuclear polypeptides. All LAP2 isoforms contain an N-terminal lamina-associated polypeptide-Emerin-MAN1 (LEM)-domain that is connected to a highly divergent LEM-like domain by an unstructured linker. Both LEM and LEM-like domains share the same structural fold, mainly composed of two large parallel alpha helices. However, their biochemical nature of the solvent-accessible residues is completely different, which indicates the two domains may target different protein surfaces. The LEM domain is responsible for the interaction with the nonspecific DNA binding protein barrier-to-autointegration factor (BAF), and the LEM-like domain is involved in chromosome binding. The family also includes the yeast helix-extension-helix domain-containing proteins, Heh1p (formerly called Src1p) and Heh2p, and their uncharacterized homologs found mainly in fungi and several in bacteria. Heh1p and Heh2p are inner nuclear membrane proteins that might interact with nuclear pore complexes (NPCs). Heh1p is involved in mitosis. It functions at the interface between subtelomeric gene expression and transcription export (TREX)-dependent messenger RNA export through NPCs. The function of Heh2p remains ill-defined. Both Heh1p and Heh2p contain a LEM-like domain (also termed HeH domain), but lack a LEM domain." 
Q#20931 - CGI_10002215 superfamily 241619 381 430 0.000834707 37.5584 cl00112 PAN_APPLE superfamily N - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#20932 - CGI_10002216 superfamily 242911 30 290 3.88E-179 498.291 cl02160 Rcd1 superfamily - - "Cell differentiation family, Rcd1-like; Two of the members in this family have been characterized as being involved in regulation of Ste11 regulated sex genes. Mammalian Rcd1 is a novel transcriptional cofactor that mediates retinoic acid-induced cell differentiation." Q#20935 - CGI_10008887 superfamily 247913 124 464 2.74E-47 170.185 cl17359 PTR2 superfamily - - POT family; The POT (proton-dependent oligopeptide transport) family all appear to be proton dependent transporters. Q#20937 - CGI_10008889 superfamily 243072 5 117 2.57E-22 92.0614 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#20937 - CGI_10008889 superfamily 247057 414 477 6.59E-09 52.724 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#20938 - CGI_10008890 superfamily 241563 68 108 4.04E-05 41.696 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#20940 - CGI_10008892 superfamily 243082 208 521 9.05E-146 442.852 cl02553 Peptidase_C19 superfamily - - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." 
Q#20940 - CGI_10008892 superfamily 243039 63 198 5.28E-64 213.854 cl02446 MATH superfamily - - "MATH (meprin and TRAF-C homology) domain; an independent folding unit with an eight-stranded beta-sandwich structure found in meprins, TRAFs and other proteins. Meprins comprise a class of extracellular metalloproteases which are anchored to the membrane and are capable of cleaving growth factors, extracellular matrix proteins, and biologically active peptides. TRAF molecules serve as adapter proteins that link cell surface receptors of the Tumor Necrosis Factor and Interleukin-1/Toll-like families to downstream kinase cascades, which results in the activation of transcription factors and the regulation of cell survival, proliferation and stress responses in the immune and inflammatory systems. Other members include the ubiquitin ligases, TRIM37 and SPOP, and the ubiquitin-specific proteases, HAUSP and Ubp21p. A large number of uncharacterized members mostly from lineage-specific expansions in C. elegans and rice contain MATH and BTB domains, similar to SPOP. The MATH domain has been shown to bind peptide/protein substrates in TRAFs and HAUSP. It is possible that the MATH domain in other members of this superfamily also interacts with various protein substrates. The TRAF domain may also be involved in the trimerization of TRAFs. Based on homology, it is postulated that the MATH domain in meprins may be involved in its tetramer assembly and that the MATH domain, in general, may take part in diverse modular arrangements defined by adjacent multimerization domains." 
Q#20942 - CGI_10008894 superfamily 246597 11 272 1.18E-55 190.269 cl13995 MPP_superfamily superfamily - - "metallophosphatase superfamily, metallophosphatase domain; Metallophosphatases (MPPs), also known as metallophosphoesterases, phosphodiesterases (PDEs), binuclear metallophosphoesterases, and dimetal-containing phosphoesterases (DMPs), represent a diverse superfamily of enzymes with a conserved domain containing an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. This superfamily includes: the phosphoprotein phosphatases (PPPs), Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination." Q#20942 - CGI_10008894 superfamily 217930 291 455 1.36E-53 182.389 cl04421 Mre11_DNA_bind superfamily - - "Mre11 DNA-binding presumed domain; The Mre11 complex is a multi-subunit nuclease that is composed of Mre11, Rad50 and Nbs1/Xrs2, and is involved in checkpoint signalling and DNA replication. Mre11 has an intrinsic DNA-binding activity that is stimulated by Rad50 on its own or in combination with Nbs1." Q#20943 - CGI_10008895 superfamily 243058 138 226 0.000579295 39.6052 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. 
An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#20943 - CGI_10008895 superfamily 218299 350 627 1.08E-81 268.363 cl04810 Uso1_p115_head superfamily - - "Uso1 / p115 like vesicle tethering protein, head region; Also known as General vesicular transport factor, Transcytosis associated protein (TAP) and Vesicle docking protein, this myosin-shaped molecule consists of an N-terminal globular head region, a coiled-coil tail which mediates dimerisation, and a short C-terminal acidic region. p115 tethers COP1 vesicles to the Golgi by binding the coiled coil proteins giantin (on the vesicles) and GM130 (on the Golgi), via its C-terminal acidic region. It is required for intercisternal transport in the golgi stack. This family consists of part of the head region. The head region is highly conserved, but its function is unknown. It does not seem to be essential for vesicle tethering. The N-terminal part of the head region, not within this family, contains context-detected Armadillo/beta-catenin-like repeats (pfam00514)." 
Q#20943 - CGI_10008895 superfamily 218301 888 926 0.00108171 38.7871 cl04811 Uso1_p115_C superfamily N - "Uso1 / p115 like vesicle tethering protein, C terminal region; Also known as General vesicular transport factor, Transcytosis associate protein (TAP) and Vesicle docking protein, this myosin-shaped molecule consists of an N-terminal globular head region, a coiled-coil tail which mediates dimerisation, and a short C-terminal acidic region. p115 tethers COP1 vesicles to the Golgi by binding the coiled coil proteins giantin (on the vesicles) and GM130 (on the Golgi), via its C-terminal acidic region. It is required for intercisternal transport in the golgi stack. This family consists of the acidic C-terminus, which binds to the golgins giantin and GM130. p115 is thought to juxtapose two membranes by binding giantin with one acidic region, and GM130 with another." Q#20944 - CGI_10008896 superfamily 243310 126 329 1.44E-45 157.4 cl03120 ELO superfamily - - "GNS1/SUR4 family; Members of this family are involved in long chain fatty acid elongation systems that produce the 26-carbon precursors for ceramide and sphingolipid synthesis. Predicted to be integral membrane proteins, in eukaryotes they are probably located on the endoplasmic reticulum. Yeast ELO3 affects plasma membrane H+-ATPase activity, and may act on a glucose-signaling pathway that controls the expression of several genes that are transcriptionally regulated by glucose such as PMA1." Q#20945 - CGI_10008897 superfamily 245213 552 592 0.000134058 40.3126 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. 
Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#20945 - CGI_10008897 superfamily 243124 117 230 4.15E-08 52.0441 cl02648 NIDO superfamily - - Nidogen-like; This is a nidogen-like domain (NIDO) domain and is an extracellular domain found in nidogen and hypothetical proteins of unknown function. Q#20945 - CGI_10008897 superfamily 245213 593 632 1.29E-06 46.188 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#20945 - CGI_10008897 superfamily 241578 468 504 1.69E-05 45.4536 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. 
Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#20946 - CGI_10008898 superfamily 245612 419 607 8.22E-68 230.662 cl11426 Amidase superfamily NC - Amidase; Amidase. Q#20946 - CGI_10008898 superfamily 245612 15 93 2.54E-29 120.495 cl11426 Amidase superfamily C - Amidase; Amidase. Q#20947 - CGI_10008899 superfamily 248264 338 497 1.12E-41 147.384 cl17710 DDE_4 superfamily - - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#20947 - CGI_10008899 superfamily 243161 3 60 8.40E-05 40.8406 cl02739 THAP superfamily C - "THAP domain; The THAP domain is a putative DNA-binding domain (DBD) and probably also binds a zinc ion. It features the conserved C2CH architecture (consensus sequence: Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His). Other universal features include the location of the domain at the N-termini of proteins, its size of about 90 residues, a C-terminal AVPTIF box and several other conserved residues. Orthologues of the human THAP domain have been identified in other vertebrates and probably worms and flies, but not in other eukaryotes or any prokaryotes." Q#20947 - CGI_10008899 superfamily 222263 256 342 0.00161186 36.9121 cl16321 DDE_4_2 superfamily - - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. 
This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#20949 - CGI_10007362 superfamily 246676 130 285 4.28E-30 114.364 cl14616 Cyt_b561 superfamily - - "Eukaryotic cytochrome b(561); Cytochrome b(561) is a family of endosomal or secretory vesicle-specific electron transport proteins. They are integral membrane proteins that bind two heme groups non-covalently, and may have six alpha-helical trans-membrane segments. This is an exclusively eukaryotic family. Members of the prokaryotic cytochrome b561 family are not deemed homologous." Q#20949 - CGI_10007362 superfamily 243146 93 129 0.00559941 34.1247 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#20949 - CGI_10007362 superfamily 243146 1 41 0.00682918 34.1871 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#20953 - CGI_10007366 superfamily 216316 38 259 2.20E-74 239.838 cl10574 CD36 superfamily C - CD36 family; The CD36 family is thought to be a novel class of scavenger receptors. There is also evidence suggesting a possible role in signal transduction. CD36 is involved in cell adhesion. Q#20955 - CGI_10007369 superfamily 243040 42 155 2.70E-54 174.749 cl02447 CRD_FZ superfamily - - "CRD_domain cysteine-rich domain, also known as Fz (frizzled) domain; CRD_FZ is an essential component of a number of cell surface receptors, which are involved in multiple signal transduction pathways, particularly in modulating the activity of the Wnt proteins, which play a fundamental role in the early development of metazoans. CRD is also found in secreted frizzled related proteins (SFRPs), which lack the transmembrane segment found in the frizzled protein. 
The CRD domain is also present in the alpha-1 chain of mouse type XVIII collagen, in carboxypeptidase Z, several receptor tyrosine kinases, and the mosaic transmembrane serine protease corin. The CRD domain is well conserved in metazoans - 10 frizzled proteins have been identified in mammals, 4 in Drosophila and 3 in Caenorhabditis elegans. CRD domains have also been identified in multiple tandem copies in a Dictyostelium discoideum protein. Very little is known about the mechanism by which CRD domains interact with their ligands. The domain contains 10 conserved cysteines." Q#20955 - CGI_10007369 superfamily 243064 168 286 2.94E-09 53.4523 cl02512 NTR_like superfamily - - "NTR_like domain; a beta barrel with an oligosaccharide/oligonucleotide-binding fold found in netrins, complement proteins, tissue inhibitors of metalloproteases (TIMP), and procollagen C-proteinase enhancers (PCOLCE), amongst others. In netrins, the domain plays a role in controlling axon branching in neural development, while the common function of these modules in TIMPs appears to be binding to metzincins. A subset of this family is also known as the C345C domain because it occurs as a C-terminal domain in complement C3, C4 and C5. In C5, the domain interacts with various partners during the formation of the membrane attack complex." Q#20956 - CGI_10007370 superfamily 247736 131 196 6.35E-08 47.7094 cl17182 NAT_SF superfamily - - "N-Acyltransferase superfamily: Various enzymes that characteristically catalyze the transfer of an acyl group to a substrate; NAT (N-Acyltransferase) is a large superfamily of enzymes that mostly catalyze the transfer of an acyl group to a substrate and are implicated in a variety of functions, ranging from bacterial antibiotic resistance to circadian rhythms in mammals. 
Members include GCN5-related N-Acetyltransferases (GNAT) such as Aminoglycoside N-acetyltransferases, Histone N-acetyltransferase (HAT) enzymes, and Serotonin N-acetyltransferase, which catalyze the transfer of an acetyl group to a substrate. The kinetic mechanism of most GNATs involves the ordered formation of a ternary complex: the reaction begins with Acetyl Coenzyme A (AcCoA) binding, followed by binding of substrate, then direct transfer of the acetyl group from AcCoA to the substrate, followed by product and subsequent CoA release. Other family members include Arginine/ornithine N-succinyltransferase, Myristoyl-CoA: protein N-myristoyltransferase, and Acyl-homoserinelactone synthase which have a similar catalytic mechanism but differ in types of acyl groups transferred. Leucyl/phenylalanyl-tRNA-protein transferase and FemXAB nonribosomal peptidyltransferases which catalyze similar peptidyltransferase reactions are also included." Q#20957 - CGI_10004092 superfamily 242173 47 186 3.14E-16 71.5142 cl00891 Cu-Zn_Superoxide_Dismutase superfamily - - "Copper/zinc superoxide dismutase (SOD). Superoxide dismutases catalyse the conversion of superoxide radicals to molecular oxygen. Three evolutionarily distinct families of SODs are known, of which the copper/zinc-binding family is one. Defects in the human SOD1 gene cause familial amyotrophic lateral sclerosis (Lou Gehrig's disease). Cytoplasmic and periplasmic SODs exist as dimers, whereas chloroplastic and extracellular enzymes exist as tetramers. Structure supports independent functional evolution in prokaryotes (P-class) and eukaryotes (E-class) [PMID:8176730]." Q#20958 - CGI_10004093 superfamily 110440 472 496 0.0014798 36.6169 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. 
The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#20958 - CGI_10004093 superfamily 128778 88 197 0.00388672 36.4739 cl17972 BBC superfamily - - B-Box C-terminal domain; Coiled coil region C-terminal to (some) B-Box domains Q#20958 - CGI_10004093 superfamily 241563 43 80 0.00661092 35.0055 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#20960 - CGI_10004095 superfamily 241563 76 118 1.73E-05 42.4664 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#20960 - CGI_10004095 superfamily 128778 125 233 0.000543417 38.7851 cl17972 BBC superfamily - - B-Box C-terminal domain; Coiled coil region C-terminal to (some) B-Box domains Q#20962 - CGI_10013432 superfamily 238012 139 185 6.00E-07 45.0378 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#20962 - CGI_10013432 superfamily 238012 53 79 0.000627739 36.5634 cl11390 EGF_Lam superfamily N - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#20965 - CGI_10013435 superfamily 238012 4 46 2.06E-06 40.4154 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#20965 - CGI_10013435 superfamily 238012 
50 81 0.00493651 31.1706 cl11390 EGF_Lam superfamily C - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#20966 - CGI_10013436 superfamily 243100 164 239 1.65E-08 50.6403 cl02576 B_zip1 superfamily - - "basic leucine zipper DNA-binding and multimerization region of GCN4 and related proteins; Basic leucine zipper (bZIP) transcription factors act in networks of homo- and hetero-dimers in the regulation in a diverse set of cellular pathways. Classical leucine zippers have alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization creates a pair of basic regions that bind DNA and undergo conformational change. GCN4 was identified in Saccharomyces cerevisiae from mutations in a deficiency in activation with the general amino acid control pathway. GCN4 encodes a trans-activator of amino acid biosynthetic genes containing 2 acidic activation domains and a C-terminal bZIP domain, comprised of a basic alpha-helical DNA-binding region and a coiled-coil dimerization region." Q#20968 - CGI_10013438 superfamily 147395 55 96 1.98E-16 67.6096 cl04973 Dpy-30 superfamily - - Dpy-30 motif; This motif is found in a wide variety of domain contexts. It is found in the Dpy-30 proteins hence the motif's name. It is about 40 residues long and is probably formed of two alpha-helices. It may be a dimerisation motif analogous to pfam02197 (Bateman A pers obs). 
Q#20969 - CGI_10013439 superfamily 202715 115 213 2.97E-30 108.82 cl04194 Tctex-1 superfamily - - Tctex-1 family; Tctex-1 is a dynein light chain. It has been shown that Tctex-1 can bind to the cytoplasmic tail of rhodopsin. C-terminal rhodopsin mutations responsible for retinitis pigmentosa inhibit this interaction. Q#20970 - CGI_10013440 superfamily 202715 93 193 3.01E-30 108.434 cl04194 Tctex-1 superfamily - - Tctex-1 family; Tctex-1 is a dynein light chain. It has been shown that Tctex-1 can bind to the cytoplasmic tail of rhodopsin. C-terminal rhodopsin mutations responsible for retinitis pigmentosa inhibit this interaction. Q#20971 - CGI_10013441 superfamily 241699 40 127 6.38E-18 79.1577 cl00221 ACBP superfamily - - Acyl CoA binding protein (ACBP) binds thiol esters of long fatty acids and coenzyme A in a one-to-one binding mode with high specificity and affinity. Acyl-CoAs are important intermediates in fatty lipid synthesis and fatty acid degradation and play a role in regulation of intermediary metabolism and gene regulation. The suggested role of ACBP is to act as an intracellular acyl-CoA transporter and pool former. ACBPs are present in a large group of eukaryotic species and several tissue-specific isoforms have been detected. 
Q#20972 - CGI_10013442 superfamily 247905 1114 1242 5.69E-17 79.5892 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#20972 - CGI_10013442 superfamily 247792 1042 1087 1.13E-10 58.9964 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#20972 - CGI_10013442 superfamily 247805 759 867 2.43E-07 50.7988 cl17251 DEXDc superfamily N - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#20972 - CGI_10013442 superfamily 244719 44 134 3.14E-25 102.439 cl07418 HIRAN superfamily - - "HIRAN domain; The HIRAN domain (HIP116, Rad5p N-terminal) is found in the N-terminal regions of the SWI2/SNF2 proteins typified by HIP116 and Rad5p. 
The HIRAN domain is found as a standalone protein in several bacteria and prophages, or fused to other catalytic domains, such as a nuclease of the restriction endonuclease fold and TDP1-like DNA phosphoesterases, in the eukaryotes. It has been predicted that this domain functions as a DNA-binding domain that probably recognises features associated with damaged DNA or stalled replication forks" Q#20973 - CGI_10013443 superfamily 248458 98 457 3.65E-08 53.8569 cl17904 MFS superfamily - - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. 
Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#20974 - CGI_10013444 superfamily 248458 83 437 1.30E-09 58.0941 cl17904 MFS superfamily - - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." 
Q#20975 - CGI_10013445 superfamily 248458 84 432 6.30E-14 71.5761 cl17904 MFS superfamily - - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." 
Q#20976 - CGI_10013446 superfamily 245596 64 287 4.24E-92 275.615 cl11394 Glyco_tranf_GTA_type superfamily - - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." 
Q#20977 - CGI_10013447 superfamily 243092 207 491 1.97E-70 228.759 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#20977 - CGI_10013447 superfamily 199091 43 82 7.89E-19 80.2202 cl13550 Beta-TrCP_D superfamily - - "D domain of beta-TrCP; This domain is found in eukaryotes, and is approximately 40 amino acids in length. It is found associated with pfam00646, pfam00400. The protein that contains this domain functions as a ubiquitin ligase. Ubiquitination is required to direct proteins towards the proteasome for degradation. This protein is part of the WD40 class of F box proteins. The D domain of these F box proteins is involved in mediating the dimerisation of the protein. Dimerisation is necessary to polyubiquitinate substrates so this D domain is vital in directing substrates towards the proteasome for degradation." 
Q#20977 - CGI_10013447 superfamily 243074 86 133 9.61E-08 49.0421 cl02535 F-box-like superfamily - - F-box-like; This is an F-box-like family. Q#20978 - CGI_10013448 superfamily 241599 105 162 1.31E-18 76.5132 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#20979 - CGI_10013449 superfamily 242274 149 352 1.17E-60 201.96 cl01053 SGNH_hydrolase superfamily - - "SGNH_hydrolase, or GDSL_hydrolase, is a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the typical Ser-His-Asp(Glu) triad from other serine hydrolases, but may lack the carboxylic acid." Q#20979 - CGI_10013449 superfamily 242274 21 133 5.19E-23 97.186 cl01053 SGNH_hydrolase superfamily C - "SGNH_hydrolase, or GDSL_hydrolase, is a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the typical Ser-His-Asp(Glu) triad from other serine hydrolases, but may lack the carboxylic acid." Q#20981 - CGI_10013451 superfamily 248458 18 332 6.37E-10 58.4793 cl17904 MFS superfamily - - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. 
Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#20984 - CGI_10013454 superfamily 245213 485 521 8.85E-09 53.7946 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#20984 - CGI_10013454 superfamily 245213 638 675 3.30E-08 52.2538 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#20984 - CGI_10013454 superfamily 245213 1208 1243 8.10E-08 51.0982 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#20984 - CGI_10013454 superfamily 245213 715 752 1.72E-07 49.9426 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#20984 - CGI_10013454 superfamily 245213 1246 1282 1.94E-07 49.9426 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#20984 - CGI_10013454 superfamily 245213 1170 1206 2.58E-07 49.5574 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#20984 - CGI_10013454 superfamily 245213 1284 1320 2.60E-07 49.5574 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#20984 - CGI_10013454 superfamily 245213 523 558 2.64E-07 49.5574 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#20984 - CGI_10013454 superfamily 245213 409 445 6.58E-07 48.4018 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#20984 - CGI_10013454 superfamily 245213 561 598 6.66E-07 48.4018 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#20984 - CGI_10013454 superfamily 245213 447 483 6.99E-07 48.4018 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#20984 - CGI_10013454 superfamily 245213 830 865 8.62E-07 48.0166 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#20984 - CGI_10013454 superfamily 245213 868 904 2.26E-06 46.861 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#20984 - CGI_10013454 superfamily 245213 755 791 3.22E-06 46.0906 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#20984 - CGI_10013454 superfamily 245213 372 407 8.62E-06 44.935 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#20984 - CGI_10013454 superfamily 245213 677 713 9.08E-06 44.935 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#20984 - CGI_10013454 superfamily 245213 600 636 2.04E-05 43.7794 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#20984 - CGI_10013454 superfamily 245213 793 828 4.23E-05 43.009 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#20984 - CGI_10013454 superfamily 245814 1356 1429 0.00389038 37.6136 cl11960 Ig superfamily N - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. 
A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#20985 - CGI_10013455 superfamily 245213 671 707 2.44E-12 63.4246 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#20985 - CGI_10013455 superfamily 245213 709 746 1.20E-11 61.4986 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#20985 - CGI_10013455 superfamily 245213 748 783 2.27E-10 57.6466 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. 
Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#20985 - CGI_10013455 superfamily 245213 633 669 4.43E-10 56.8762 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#20985 - CGI_10013455 superfamily 248458 863 1077 1.55E-09 59.6349 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. 
Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#20985 - CGI_10013455 superfamily 245213 786 821 2.09E-08 52.2538 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#20985 - CGI_10013455 superfamily 245213 596 631 5.24E-05 42.2386 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#20985 - CGI_10013455 superfamily 243060 330 419 4.15E-07 49.2996 cl02507 SEA superfamily - - "SEA domain; Domain found in Sea urchin sperm protein, Enterokinase, Agrin (SEA). Proposed function of regulating or binding carbohydrate side chains. Recently a proteolytic activity has been shown for a SEA domain." Q#20987 - CGI_10013457 superfamily 243072 6 86 2.47E-11 60.8602 cl02529 ANK superfamily N - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#20987 - CGI_10013457 superfamily 243066 116 212 8.31E-15 70.0272 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." 
Q#20987 - CGI_10013457 superfamily 243066 270 368 2.24E-14 69.1833 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#20988 - CGI_10013458 superfamily 245596 253 490 1.52E-58 195.097 cl11394 Glyco_tranf_GTA_type superfamily - - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. 
The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#20988 - CGI_10013458 superfamily 248046 33 234 2.72E-56 187.715 cl17492 DUF2064 superfamily - - Uncharacterized protein conserved in bacteria (DUF2064); This family has structural similarity to proteins in the nucleotide-diphospho-sugar transferases superfamily. The similarity suggests that it is an enzyme with a sugar substrate. Q#20989 - CGI_10013459 superfamily 245206 23 257 1.57E-131 375.088 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#20991 - CGI_10013461 superfamily 241563 61 95 6.80E-05 40.7336 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#20991 - CGI_10013461 superfamily 128778 105 211 0.00109254 38.0147 cl17972 BBC superfamily - - B-Box C-terminal domain; Coiled coil region C-terminal to (some) B-Box domains Q#20992 - CGI_10005969 superfamily 221316 973 1076 7.64E-32 121.669 cl13375 DUF3441 superfamily - - "Domain of unknown function (DUF3441); This presumed domain is functionally uncharacterized. This domain is found in archaea and eukaryotes. This domain is typically between 104 to 119 amino acids in length. This domain is found associated with pfam05833, pfam05670. This domain has two conserved residues (P and G) that may be functionally important." Q#20992 - CGI_10005969 superfamily 218683 504 602 1.13E-27 108.834 cl05307 DUF814 superfamily - - "Domain of unknown function (DUF814); This domain occurs in proteins that have been annotated as Fibronectin/fibrinogen binding protein by similarity. This annotation comes from B. subtilis yloA, where the N-terminal region is involved in this activity. Hence the activity of this C-terminal domain is unknown. This domain contains a conserved motif D/E-X-W/Y-X-H that may be functionally important." Q#20995 - CGI_10005972 superfamily 110998 163 625 2.55E-172 504.094 cl03422 Glyco_hydro_30 superfamily - - O-Glycosyl hydrolase family 30; O-Glycosyl hydrolase family 30. Q#20996 - CGI_10005973 superfamily 241868 99 210 1.05E-39 140.065 cl00447 Nudix_Hydrolase superfamily - - "Nudix hydrolase is a superfamily of enzymes found in all three kingdoms of life, and it catalyzes the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+ for their activity. Members of this family are recognized by a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which forms a structural motif that functions as a metal binding and catalytic site. 
Substrates of nudix hydrolase include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance and "house-cleaning" enzymes. Substrate specificity is used to define child families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. This superfamily consists of at least nine families: IPP (isopentenyl diphosphate) isomerase, ADP ribose pyrophosphatase, mutT pyrophosphohydrolase, coenzyme-A pyrophosphatase, MTH1-7,8-dihydro-8-oxoguanine-triphosphatase, diadenosine tetraphosphate hydrolase, NADH pyrophosphatase, GDP-mannose hydrolase and the c-terminal portion of the mutY adenine glycosylase." Q#20996 - CGI_10005973 superfamily 191168 21 96 7.09E-30 110.803 cl04894 DCP2 superfamily - - "Dcp2, box A domain; This domain is always found to the amino terminal side of pfam00293. This domain is specific to mRNA decapping protein 2 and this region has been termed Box A. Removal of the cap structure is catalyzed by the Dcp1-Dcp2 complex." Q#20997 - CGI_10005974 superfamily 241758 21 163 2.78E-19 79.3362 cl00292 AANH_like superfamily - - "Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases, ATP sulphurylases Universal Stress Response protein and electron transfer flavoprotein (ETF). The domain forms an alpha/beta/alpha fold which binds to Adenosine nucleotide." 
Q#20998 - CGI_10005975 superfamily 241758 26 123 3.94E-16 69.7062 cl00292 AANH_like superfamily N - "Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases, ATP sulphurylases Universal Stress Response protein and electron transfer flavoprotein (ETF). The domain forms an alpha/beta/alpha fold which binds to Adenosine nucleotide." Q#20999 - CGI_10005976 superfamily 241758 2 129 1.88E-16 74.3286 cl00292 AANH_like superfamily - - "Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases, ATP sulphurylases Universal Stress Response protein and electron transfer flavoprotein (ETF). The domain forms an alpha/beta/alpha fold which binds to Adenosine nucleotide." Q#20999 - CGI_10005976 superfamily 241758 129 239 1.03E-07 49.2906 cl00292 AANH_like superfamily - - "Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases, ATP sulphurylases Universal Stress Response protein and electron transfer flavoprotein (ETF). The domain forms an alpha/beta/alpha fold which binds to Adenosine nucleotide." Q#21000 - CGI_10005977 superfamily 241578 219 381 9.74E-45 154.752 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. 
Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#21000 - CGI_10005977 superfamily 241578 7 182 5.89E-41 144.838 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#21001 - CGI_10005978 superfamily 241578 162 324 4.51E-38 137.423 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#21001 - CGI_10005978 superfamily 241578 2 93 1.54E-18 83.9735 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#21001 - CGI_10005978 superfamily 241578 364 504 3.87E-05 43.212 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. 
The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#21004 - CGI_10015365 superfamily 248020 25 352 1.76E-50 176.501 cl17466 Sulfatase superfamily - - Sulfatase; Sulfatase. Q#21007 - CGI_10015368 superfamily 248020 25 352 6.68E-50 174.96 cl17466 Sulfatase superfamily - - Sulfatase; Sulfatase. Q#21008 - CGI_10015369 superfamily 248020 25 353 2.07E-50 176.116 cl17466 Sulfatase superfamily - - Sulfatase; Sulfatase. Q#21009 - CGI_10015370 superfamily 241578 464 626 9.95E-11 60.6574 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. 
Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#21010 - CGI_10015371 superfamily 221476 1634 1976 3.74E-57 213.003 cl13644 Rav1p_C superfamily N - "RAVE protein 1 C terminal; This domain family is found in eukaryotes, and is typically between 621 and 644 amino acids in length. This family is the C terminal region of the protein RAVE (regulator of the ATPase of vacuolar and endosomal membranes). Rav1p is involved in regulating the glucose dependent assembly and disassembly of vacuolar ATPase V1 and V0 subunits." Q#21010 - CGI_10015371 superfamily 243092 3047 3314 5.44E-30 123.599 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#21014 - CGI_10015375 superfamily 217020 27 112 0.0092667 33.337 cl03574 Seryl_tRNA_N superfamily N - Seryl-tRNA synthetase N-terminal domain; This domain is found associated with the Pfam tRNA synthetase class II domain (pfam00587) and represents the N-terminal domain of seryl-tRNA synthetase. Q#21020 - CGI_10015381 superfamily 241640 42 266 5.90E-91 271.842 cl00149 Tryp_SPc superfamily - - Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues. Q#21021 - CGI_10015382 superfamily 243100 75 125 2.71E-11 54.9273 cl02576 B_zip1 superfamily - - "basic leucine zipper DNA-binding and multimerization region of GCN4 and related proteins; Basic leucine zipper (bZIP) transcription factors act in networks of homo- and hetero-dimers in the regulation in a diverse set of cellular pathways. Classical leucine zippers have alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization creates a pair of basic regions that bind DNA and undergo conformational change. GCN4 was identified in Saccharomyces cerevisiae from mutations in a deficiency in activation with the general amino acid control pathway. GCN4 encodes a trans-activator of amino acid biosynthetic genes containing 2 acidic activation domains and a C-terminal bZIP domain, comprised of a basic alpha-helical DNA-binding region and a coiled-coil dimerization region." Q#21024 - CGI_10006460 superfamily 241563 107 148 2.70E-06 45.1628 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#21024 - CGI_10006460 superfamily 241563 67 98 0.0023617 36.3032 cl00034 BBOX superfamily N - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#21027 - CGI_10006463 superfamily 241563 68 109 9.55E-06 43.2368 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#21029 - CGI_10006465 superfamily 243540 914 1140 1.71E-29 118.119 cl03831 HlyIII superfamily - - "Haemolysin-III related; Members of this family are integral membrane proteins. This family includes a protein with hemolytic activity from Bacillus cereus. It has been proposed that YOL002c encodes a Saccharomyces cerevisiae protein that plays a key role in metabolic pathways that regulate lipid and phosphate metabolism. In eukaryotes, members are seven-transmembrane pass molecules found to encode functional receptors with a broad range of apparent ligand specificities, including progestin and adipoQ receptors, and hence have been named PAQR proteins. The mammalian members include progesterone binding proteins. Unlike the case with GPCR receptor proteins, the evolutionary ancestry of the members of this family can be traced back to the Archaea." Q#21029 - CGI_10006465 superfamily 241563 59 95 1.53E-05 43.8651 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#21029 - CGI_10006465 superfamily 221533 115 208 6.64E-05 42.2988 cl13726 TMF_DNA_bd superfamily - - "TATA element modulatory factor 1 DNA binding; This is the middle region of a family of TATA element modulatory factor 1 proteins conserved in eukaryotes that contains at its N-terminal section a number of leucine zippers that could potentially form coiled coil structures. The whole proteins bind to the TATA element of some RNA polymerase II promoters and repress their activity. by competing with the binding of TATA binding protein. TMFs are evolutionarily conserved golgins that bind Rab6, a ubiquitous ras-like GTP-binding Golgi protein, and contribute to Golgi organisation in animal and plant cells." Q#21030 - CGI_10011953 superfamily 247724 274 434 0.000186416 40.9028 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. 
Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#21030 - CGI_10011953 superfamily 242902 69 219 3.17E-16 75.361 cl02144 TLD superfamily - - TLD; This domain is predicted to be an enzyme and is often found associated with pfam01476. Q#21031 - CGI_10011954 superfamily 247724 89 249 0.000177283 40.1324 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#21033 - CGI_10011956 superfamily 202224 244 344 2.01E-14 70.7875 cl18224 JmjC superfamily - - "JmjC domain, hydroxylase; The JmjC domain belongs to the Cupin superfamily. JmjC-domain proteins may be protein hydroxylases that catalyze a novel histone modification. 
This is confirmed to be a hydroxylase: the human JmjC protein named Tyw5p unexpectedly acts in the biosynthesis of a hypermodified nucleoside, hydroxy-wybutosine, in tRNA-Phe by catalyzing hydroxylation." Q#21033 - CGI_10011956 superfamily 247999 9 55 6.53E-10 56.0662 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#21033 - CGI_10011956 superfamily 214721 205 272 1.66E-07 49.1728 cl18313 JmjC superfamily - - "A domain family that is part of the cupin metalloenzyme superfamily; Probable enzymes, but of unknown functions, that regulate chromatin reorganisation processes (Clissold and Ponting, in press)." Q#21036 - CGI_10011959 superfamily 246748 358 592 4.10E-104 321.464 cl14876 Zinc_peptidase_like superfamily - - "Zinc peptidases M18, M20, M28, and M42; Zinc peptidases play vital roles in metabolic and signaling pathways throughout all kingdoms of life. This family corresponds to several clans in the MEROPS database, including the MH clan, which contains 4 families (M18, M20, M28, M42). The peptidase M20 family includes carboxypeptidases such as the glutamate carboxypeptidase from Pseudomonas, the thermostable carboxypeptidase Ss1 of broad specificity from archaea and yeast Gly-X carboxypeptidase. The dipeptidases include bacterial dipeptidase, peptidase V (PepV), a eukaryotic, non-specific dipeptidase, and two Xaa-His dipeptidases (carnosinases). There is also the bacterial aminopeptidase, peptidase T (PepT) that acts only on tripeptide substrates and has therefore been termed a tripeptidase. Peptidase family M28 contains aminopeptidases and carboxypeptidases, and has co-catalytic zinc ions. However, several enzymes in this family utilize other first row transition metal ions such as cobalt and manganese. 
Each zinc ion is tetrahedrally co-ordinated, with three amino acid ligands plus activated water; one aspartate residue binds both metal ions. The aminopeptidases in this family are also called bacterial leucyl aminopeptidases, but are able to release a variety of N-terminal amino acids. IAP aminopeptidase and aminopeptidase Y preferentially release basic amino acids while glutamate carboxypeptidase II preferentially releases C-terminal glutamates. Glutamate carbxypeptidase II and plasma glutamate carboxypeptidase hydrolyze dipeptides. Peptidase families M18 and M42 contain metalloaminopeptidases. M18 is widely distributed in bacteria and eukaryotes. However, only yeast aminopeptidase I and mammalian aspartyl aminopeptidase have been characterized in detail. Some of M42 (also known as glutamyl aminopeptidase) enzymes exhibit aminopeptidase specificity while others also have acylaminoacylpeptidase activity (i.e. hydrolysis of acylated N-terminal residues)." Q#21036 - CGI_10011959 superfamily 244870 132 348 1.66E-73 238.729 cl08238 PA superfamily - - "PA: Protease-associated (PA) domain. The PA domain is an insert domain in a diverse fraction of proteases. The significance of the PA domain to many of the proteins in which it is inserted is undetermined. It may be a protein-protein interaction domain. At peptidase active sites, the PA domain may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate. 
Proteins into which the PA domain is inserted include the following: i) various signal peptide peptidases including, hSPPL2a and 2b which catalyze the intramembrane proteolysis of tumor necrosis factor alpha, ii) various proteins containing a C3H2C3 RING finger including, Arabidopsis ReMembR-H2 protein and various E3 ubiquitin ligases such as human GRAIL (gene related to anergy in lymphocytes), iii) EDEM3 (ER-degradation-enhancing mannosidase-like 3 protein), iv) various plant vacuolar sorting receptors such as Pisum sativum BP-80, v) glutamate carboxypeptidase II (GCPII), vi) yeast aminopeptidase Y, vii) Vibrio metschnikovii VapT, a sodium dodecyl sulfate (SDS) resistant extracellular alkaline serine protease, viii) lactocepin (a cell envelope-associated protease from Lactobacillus paracasei subsp. paracasei NCDO 151), ix) various subtilisin-like proteases such as melon Cucumisin, and x) human TfR (transferrin receptor) 1 and 2." Q#21036 - CGI_10011959 superfamily 202944 592 683 4.28E-29 112.749 cl07854 TFR_dimer superfamily N - Transferrin receptor-like dimerisation domain; This domain is involved in dimerisation of the transferrin receptor as shown in its crystal structure. Q#21036 - CGI_10011959 superfamily 246748 72 126 6.39E-10 59.5285 cl14876 Zinc_peptidase_like superfamily C - "Zinc peptidases M18, M20, M28, and M42; Zinc peptidases play vital roles in metabolic and signaling pathways throughout all kingdoms of life. This family corresponds to several clans in the MEROPS database, including the MH clan, which contains 4 families (M18, M20, M28, M42). The peptidase M20 family includes carboxypeptidases such as the glutamate carboxypeptidase from Pseudomonas, the thermostable carboxypeptidase Ss1 of broad specificity from archaea and yeast Gly-X carboxypeptidase. The dipeptidases include bacterial dipeptidase, peptidase V (PepV), a eukaryotic, non-specific dipeptidase, and two Xaa-His dipeptidases (carnosinases). 
There is also the bacterial aminopeptidase, peptidase T (PepT) that acts only on tripeptide substrates and has therefore been termed a tripeptidase. Peptidase family M28 contains aminopeptidases and carboxypeptidases, and has co-catalytic zinc ions. However, several enzymes in this family utilize other first row transition metal ions such as cobalt and manganese. Each zinc ion is tetrahedrally co-ordinated, with three amino acid ligands plus activated water; one aspartate residue binds both metal ions. The aminopeptidases in this family are also called bacterial leucyl aminopeptidases, but are able to release a variety of N-terminal amino acids. IAP aminopeptidase and aminopeptidase Y preferentially release basic amino acids while glutamate carboxypeptidase II preferentially releases C-terminal glutamates. Glutamate carbxypeptidase II and plasma glutamate carboxypeptidase hydrolyze dipeptides. Peptidase families M18 and M42 contain metalloaminopeptidases. M18 is widely distributed in bacteria and eukaryotes. However, only yeast aminopeptidase I and mammalian aspartyl aminopeptidase have been characterized in detail. Some of M42 (also known as glutamyl aminopeptidase) enzymes exhibit aminopeptidase specificity while others also have acylaminoacylpeptidase activity (i.e. hydrolysis of acylated N-terminal residues)." Q#21039 - CGI_10011962 superfamily 247856 48 101 2.49E-05 38.6829 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#21040 - CGI_10011963 superfamily 247856 22 78 0.000306656 36.7569 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#21040 - CGI_10011963 superfamily 247856 58 102 0.000655986 35.9865 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#21040 - CGI_10011963 superfamily 150838 136 177 0.00700218 34.7474 cl10913 DUF2216 superfamily NC - "Uncharacterized conserved proteins (DUF2216); This is the conserved N-terminal half of a proteins which are found from worms to humans. some annotation suggests it might be PKR, the Hepatitis delta antigen-interacting protein A, but this could not be confirmed." Q#21042 - CGI_10000259 superfamily 243609 1 109 4.01E-32 112.313 cl04000 Cornichon superfamily N - Cornichon protein; Cornichon protein. Q#21046 - CGI_10018692 superfamily 248097 3 118 1.58E-21 83.8538 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#21048 - CGI_10018694 superfamily 222150 553 578 6.67E-05 41.6085 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#21048 - CGI_10018694 superfamily 222150 526 549 0.000159763 40.4529 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. 
Q#21048 - CGI_10018694 superfamily 222150 434 458 0.000231843 39.6825 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#21050 - CGI_10018696 superfamily 248054 187 359 1.04E-05 44.9859 cl17500 NAD_binding_8 superfamily N - NAD(P)-binding Rossmann-like domain; NAD(P)-binding Rossmann-like domain. Q#21051 - CGI_10018697 superfamily 243034 296 394 7.91E-14 68.1756 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#21051 - CGI_10018697 superfamily 243034 160 258 5.37E-12 62.7828 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms 
including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#21051 - CGI_10018697 superfamily 243034 92 191 3.29E-09 54.6936 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, 
but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#21052 - CGI_10018698 superfamily 245864 1 162 1.33E-46 158.981 cl12078 p450 superfamily N - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#21053 - CGI_10018699 superfamily 245864 1 219 8.23E-31 117.765 cl12078 p450 superfamily C - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. 
They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#21055 - CGI_10018701 superfamily 241571 114 226 1.05E-30 117.128 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." 
Q#21055 - CGI_10018701 superfamily 241613 378 411 1.01E-06 46.431 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#21055 - CGI_10018701 superfamily 241571 250 339 2.13E-06 46.6367 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#21056 - CGI_10018702 superfamily 202894 65 130 5.98E-14 63.0086 cl04406 Mpv17_PMP22 superfamily - - "Mpv17 / PMP22 family; The 22-kDa peroxisomal membrane protein (PMP22) is a major component of peroxisomal membranes. PMP22 seems to be involved in pore forming activity and may contribute to the unspecific permeability of the organelle membrane. PMP22 is synthesised on free cytosolic ribosomes and then directed to the peroxisome membrane by specific targeting information. Mpv17 is a closely related peroxisomal protein. In mouse, the Mpv17 protein is involved in the development of early-onset glomerulosclerosis. More recently a homolog of Mpv17 in S. cerevisiae has been been found to be an integral membrane protein of the inner mitochondrial membrane where it has been proposed to have a role in ethanol metabolism and tolerance during heat-shock. 
Defects in MPV17 is associated with mitochondrial DNA depletion syndrome (MDDS) and Navajo neurohepatopathy (NNH). MDDS is a clinically heterogeneous group of disorders characterized by a reduction in mitochondrial DNA (mtDNA) copy number. Primary mtDNA depletion is inherited as an autosomal recessive trait and may affect single organs, typically muscle or liver, or multiple tissues. Individuals with the hepatocerebral form of mitochondrial DNA depletion syndrome have early progressive liver failure and neurologic abnormalities, hypoglycemia, and increased lactate in body fluids. NNH is an autosomal recessive disease that is prevalent among Navajo children in the South Western states of America. The major clinical features are hepatopathy, peripheral neuropathy, corneal anesthesia and scarring, acral mutilation, cerebral leukoencephalopathy, failure to thrive, and recurrent metabolic acidosis with intercurrent infections. Infantile, childhood, and classic forms of NNH have been described. Mitochondrial DNA depletion was detected in the livers of patients, suggesting a primary defect in mtDNA maintenance." Q#21057 - CGI_10018703 superfamily 241733 572 658 2.22E-43 151.223 cl00259 Sm_like superfamily - - "Sm and related proteins; The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes." 
Q#21057 - CGI_10018703 superfamily 241571 166 276 9.00E-09 53.9554 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#21057 - CGI_10018703 superfamily 241613 419 447 1.64E-05 42.9642 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#21057 - CGI_10018703 superfamily 241613 476 511 7.04E-05 41.0382 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#21060 - CGI_10018706 superfamily 
247724 171 326 1.92E-40 143.699 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#21061 - CGI_10018707 superfamily 247724 165 324 5.44E-40 142.158 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#21062 - CGI_10018708 superfamily 241600 1 166 2.45E-79 237.138 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#21063 - CGI_10018709 superfamily 241600 20 230 1.35E-95 281.051 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. 
C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#21066 - CGI_10018712 superfamily 245814 607 679 3.36E-27 107.328 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#21066 - CGI_10018712 superfamily 245814 691 771 1.17E-13 68.686 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#21066 - CGI_10018712 superfamily 245814 486 586 1.39E-06 47.858 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#21066 - CGI_10018712 superfamily 214507 431 479 7.51E-06 45.1136 cl15307 LRRCT superfamily - - Leucine rich repeat C-terminal domain; Leucine rich repeat C-terminal domain. Q#21066 - CGI_10018712 superfamily 246925 61 189 0.000104844 44.2686 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#21066 - CGI_10018712 superfamily 243030 28 54 0.00972076 35.3711 cl02423 LRRNT superfamily C - Leucine rich repeat N-terminal domain; Leucine Rich Repeats pfam00560 are short sequence motifs present in a number of proteins with diverse functions and cellular locations. Leucine Rich Repeats are often flanked by cysteine rich domains. This domain is often found at the N-terminus of tandem leucine rich repeats. Q#21067 - CGI_10018713 superfamily 245864 43 502 4.02E-84 269.918 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. 
They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#21068 - CGI_10018714 superfamily 214531 194 235 2.39E-06 44.9001 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#21068 - CGI_10018714 superfamily 214531 149 192 1.52E-05 42.5889 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#21075 - CGI_10010794 superfamily 245847 131 279 0.000894975 37.5362 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." 
Q#21076 - CGI_10010795 superfamily 243104 28 71 9.65E-06 40.6061 cl02601 PSI superfamily - - "Plexin repeat; A cysteine rich repeat found in several different extracellular receptors. The function of the repeat is unknown. Three copies of the repeat are found Plexin. Two copies of the repeat are found in mahogany protein. A related C. elegans protein contains four copies of the repeat. The Met receptor contains a single copy of the repeat. The Pfam alignment shows 6 conserved cysteine residues that may form three conserved disulphide bridges, whereas shows 8 conserved cysteines. The pattern of conservation suggests that cysteines 5 and 7 (that are not absolutely conserved) form a disulphide bridge (Personal observation. A Bateman)." Q#21078 - CGI_10010798 superfamily 111397 111 184 2.76E-08 51.9582 cl03620 HYR superfamily - - "HYR domain; This domain is known as the HYR (Hyalin Repeat) domain, after the protein hyalin that is composed exclusively of this repeat. This domain probably corresponds to a new superfamily in the immunoglobulin fold. The function of this domain is uncertain it may be involved in cell adhesion." Q#21078 - CGI_10010798 superfamily 241611 296 430 9.18E-06 45.072 cl00102 PTX superfamily - - "Pentraxins are plasma proteins characterized by their pentameric discoid assembly and their Ca2+ dependent ligand binding, such as Serum amyloid P component (SAP) and C-reactive Protein (CRP), which are cytokine-inducible acute-phase proteins implicated in innate immunity. CRP binds to ligands containing phosphocholine, SAP binds to amyloid fibrils, DNA, chromatin, fibronectin, C4-binding proteins and glycosaminoglycans. "Long" pentraxins have N-terminal extensions to the common pentraxin domain; one group, the neuronal pentraxins, may be involved in synapse formation and remodeling, and they may also be able to form heteromultimers." 
Q#21081 - CGI_10001126 superfamily 207654 119 184 4.00E-28 103.677 cl02574 Annexin superfamily - - Annexin; This family of annexins also includes giardin that has been shown to function as an annexin. Q#21081 - CGI_10001126 superfamily 207654 48 112 3.29E-22 87.4982 cl02574 Annexin superfamily - - Annexin; This family of annexins also includes giardin that has been shown to function as an annexin. Q#21081 - CGI_10001126 superfamily 207654 203 268 6.61E-21 83.6462 cl02574 Annexin superfamily - - Annexin; This family of annexins also includes giardin that has been shown to function as an annexin. Q#21083 - CGI_10008025 superfamily 247941 147 286 1.21E-14 69.2868 cl17387 Methyltransf_21 superfamily - - "Methyltransferase FkbM domain; This family has members from bacteria to human, and appears to be a methyltransferase." Q#21084 - CGI_10008026 superfamily 241619 770 807 8.35E-05 42.0717 cl00112 PAN_APPLE superfamily NC - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#21084 - CGI_10008026 superfamily 241619 838 909 1.08E-05 44.8805 cl00112 PAN_APPLE superfamily - - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. 
PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#21085 - CGI_10008027 superfamily 247724 22 180 1.54E-84 249.031 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#21089 - CGI_10008031 superfamily 243161 5 64 4.33E-11 54.7077 cl02739 THAP superfamily C - "THAP domain; The THAP domain is a putative DNA-binding domain (DBD) and probably also binds a zinc ion. It features the conserved C2CH architecture (consensus sequence: Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His). 
Other universal features include the location of the domain at the N-termini of proteins, its size of about 90 residues, a C-terminal AVPTIF box and several other conserved residues. Orthologues of the human THAP domain have been identified in other vertebrates and probably worms and flies, but not in other eukaryotes or any prokaryotes." Q#21091 - CGI_10001364 superfamily 247746 156 213 0.00471534 36.0822 cl17192 ATP-synt_B superfamily N - "ATP synthase B/B' CF(0); Part of the CF(0) (base unit) of the ATP synthase. The base unit is thought to translocate protons through membrane (inner membrane in mitochondria, thylakoid membrane in plants, cytoplasmic membrane in bacteria). The B subunits are thought to interact with the stalk of the CF(1) subunits. This domain should not be confused with the ab CF(1) proteins (in the head of the ATP synthase) which are found in pfam00006" Q#21094 - CGI_10001350 superfamily 245814 29 92 8.29E-07 43.9348 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#21094 - CGI_10001350 superfamily 245814 113 190 0.00289342 34.4033 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#21095 - CGI_10003473 superfamily 243072 1 83 2.22E-13 62.7862 cl02529 ANK superfamily C - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#21095 - CGI_10003473 superfamily 243073 112 145 2.82E-07 43.6129 cl02533 SOCS superfamily - - "SOCS (suppressors of cytokine signaling) box. The SOCS box is found in the C-terminal region of CIS/SOCS family proteins (in combination with a SH2 domain), ASBs (ankyrin repeat-containing proteins with a SOCS box), SSBs (SPRY domain-containing proteins with a SOCS box), and WSBs (WD40 repeat-containing proteins with a SOCS box), as well as, other miscellaneous proteins. The function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions." Q#21096 - CGI_10003474 superfamily 243072 80 189 3.39E-19 79.735 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. 
The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#21097 - CGI_10003475 superfamily 241866 15 346 3.53E-169 489.97 cl00445 Iso_dh superfamily - - Isocitrate/isopropylmalate dehydrogenase; Isocitrate/isopropylmalate dehydrogenase. Q#21098 - CGI_10003476 superfamily 241563 71 109 3.07E-06 44.7776 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#21098 - CGI_10003476 superfamily 241563 28 59 0.000803123 37.844 cl00034 BBOX superfamily N - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#21099 - CGI_10005574 superfamily 241600 2 164 2.54E-54 172.81 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." 
Q#21100 - CGI_10005575 superfamily 241600 1 206 6.98E-95 278.355 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#21102 - CGI_10005577 superfamily 241600 1 205 3.90E-94 276.429 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#21103 - CGI_10005578 superfamily 243064 68 224 5.21E-05 41.6322 cl02512 NTR_like superfamily - - "NTR_like domain; a beta barrel with an oligosaccharide/oligonucleotide-binding fold found in netrins, complement proteins, tissue inhibitors of metalloproteases (TIMP), and procollagen C-proteinase enhancers (PCOLCE), amongst others. 
In netrins, the domain plays a role in controlling axon branching in neural development, while the common function of these modules in TIMPs appears to be binding to metzincins. A subset of this family is also known as the C345C domain because it occurs as a C-terminal domain in complement C3, C4 and C5. In C5, the domain interacts with various partners during the formation of the membrane attack complex." Q#21104 - CGI_10005579 superfamily 243064 1 119 5.19E-12 59.3514 cl02512 NTR_like superfamily N - "NTR_like domain; a beta barrel with an oligosaccharide/oligonucleotide-binding fold found in netrins, complement proteins, tissue inhibitors of metalloproteases (TIMP), and procollagen C-proteinase enhancers (PCOLCE), amongst others. In netrins, the domain plays a role in controlling axon branching in neural development, while the common function of these modules in TIMPs appears to be binding to metzincins. A subset of this family is also known as the C345C domain because it occurs as a C-terminal domain in complement C3, C4 and C5. In C5, the domain interacts with various partners during the formation of the membrane attack complex." Q#21105 - CGI_10005580 superfamily 245201 7 261 1.76E-106 321.37 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." 
Q#21106 - CGI_10001524 superfamily 247743 429 552 1.24E-07 50.2223 cl17189 AAA superfamily C - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#21106 - CGI_10001524 superfamily 205451 136 228 0.00175584 37.1727 cl16203 DUF4062 superfamily - - "Domain of unknown function (DUF4062); This presumed domain is functionally uncharacterized. This domain family is found in bacteria, archaea and eukaryotes, and is approximately 80 amino acids in length. There is a conserved SST sequence motif." Q#21109 - CGI_10011210 superfamily 247739 88 300 1.20E-91 280.261 cl17185 LPLAT superfamily - - "Lysophospholipid acyltransferases (LPLATs) of glycerophospholipid biosynthesis; Lysophospholipid acyltransferase (LPLAT) superfamily members are acyltransferases of de novo and remodeling pathways of glycerophospholipid biosynthesis. These proteins catalyze the incorporation of an acyl group from either acylCoAs or acyl-acyl carrier proteins (acylACPs) into acceptors such as glycerol 3-phosphate, dihydroxyacetone phosphate or lyso-phosphatidic acid. 
Included in this superfamily are LPLATs such as glycerol-3-phosphate 1-acyltransferase (GPAT, PlsB), 1-acyl-sn-glycerol-3-phosphate acyltransferase (AGPAT, PlsC), lysophosphatidylcholine acyltransferase 1 (LPCAT-1), lysophosphatidylethanolamine acyltransferase (LPEAT, also known as, MBOAT2, membrane-bound O-acyltransferase domain-containing protein 2), lipid A biosynthesis lauroyl/myristoyl acyltransferase, 2-acylglycerol O-acyltransferase (MGAT), dihydroxyacetone phosphate acyltransferase (DHAPAT, also known as 1 glycerol-3-phosphate O-acyltransferase 1) and Tafazzin (the protein product of the Barth syndrome (TAZ) gene)." Q#21109 - CGI_10011210 superfamily 247856 405 465 1.36E-11 60.2541 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#21109 - CGI_10011210 superfamily 247856 329 389 1.97E-06 45.2313 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#21110 - CGI_10011211 superfamily 245201 25 217 6.77E-33 121.96 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. 
These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#21111 - CGI_10011212 superfamily 241754 918 1244 1.54E-173 519.821 cl00286 Motor_domain superfamily - - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. Q#21112 - CGI_10011213 superfamily 241613 189 218 5.42E-08 49.1274 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#21112 - CGI_10011213 superfamily 241571 26 136 4.00E-07 47.7923 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#21113 - CGI_10011214 superfamily 241858 152 265 2.64E-18 79.2516 cl00429 SNARE_assoc superfamily - - SNARE associated Golgi protein; This is a family of SNARE associated Golgi proteins. The yeast member of this family localises with the t-SNARE Tlg2. 
Q#21116 - CGI_10011217 superfamily 247725 51 65 0.00862844 32.5612 cl17171 PH-like superfamily C - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#21120 - CGI_10011221 superfamily 243648 27 141 1.24E-11 64.2346 cl04109 Methyltransf_7 superfamily C - "SAM dependent carboxyl methyltransferase; This family of plant methyltransferases contains enzymes that act on a variety of substrates including salicylic acid, jasmonic acid and 7-Methylxanthine. Caffeine is synthesised through sequential three-step methylation of xanthine derivatives at positions 7-N, 3-N, and 1-N. The protein 7-methylxanthine methyltransferase (designated as CaMXMT) catalyzes the second step to produce theobromine." Q#21120 - CGI_10011221 superfamily 243648 292 397 3.29E-09 56.9158 cl04109 Methyltransf_7 superfamily C - "SAM dependent carboxyl methyltransferase; This family of plant methyltransferases contains enzymes that act on a variety of substrates including salicylic acid, jasmonic acid and 7-Methylxanthine. Caffeine is synthesised through sequential three-step methylation of xanthine derivatives at positions 7-N, 3-N, and 1-N. The protein 7-methylxanthine methyltransferase (designated as CaMXMT) catalyzes the second step to produce theobromine." Q#21121 - CGI_10011222 superfamily 215754 23 107 1.37E-26 103.872 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. 
Q#21121 - CGI_10011222 superfamily 215754 121 204 1.40E-14 69.9748 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#21121 - CGI_10011222 superfamily 215754 211 256 9.11E-05 41.0848 cl02813 Mito_carr superfamily C - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#21123 - CGI_10011224 superfamily 245206 2 232 6.63E-75 229.046 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#21125 - CGI_10001851 superfamily 241753 87 619 0 722.939 cl00285 Aconitase superfamily - - "Aconitase catalytic domain; Aconitase catalyzes the reversible isomerization of citrate and isocitrate as part of the TCA cycle; Aconitase catalytic domain. Aconitase (aconitate hydratase) catalyzes the reversible isomerization of citrate and isocitrate as part of the TCA cycle. Cis-aconitate is formed as an intermediate product during the course of the reaction. In eukaryotes two isozymes of aconitase are known to exist: one found in the mitochondrial matrix and the other found in the cytoplasm. 
Aconitase, in its active form, contains a 4Fe-4S iron-sulfur cluster; three cysteine residues have been shown to be ligands of the 4Fe-4S cluster. This is the Aconitase core domain, including structural domains 1, 2 and 3, which binds the Fe-S cluster. The aconitase family also contains the following proteins: - Iron-responsive element binding protein (IRE-BP), a cytosolic protein that binds to iron-responsive elements (IREs). IREs are stem-loop structures found in the 5'UTR of ferritin, and delta aminolevulinic acid synthase mRNAs, and in the 3'UTR of transferrin receptor mRNA. IRE-BP also express aconitase activity. - 3-isopropylmalate dehydratase (isopropylmalate isomerase), the enzyme that catalyzes the second step in the biosynthesis of leucine. - Homoaconitase (homoaconitate hydratase), an enzyme that participates in the alpha-aminoadipate pathway of lysine biosynthesis and that converts cis-homoaconitate into homoisocitric acid." Q#21125 - CGI_10001851 superfamily 241693 723 892 1.74E-103 321.533 cl00215 Aconitase_swivel superfamily - - "Aconitase swivel domain. Aconitase (aconitate hydratase) catalyzes the reversible isomerization of citrate and isocitrate as part of the TCA cycle. This is the aconitase swivel domain, which undergoes swivelling conformational change in the enzyme mechanism. The aconitase family contains the following proteins: - Iron-responsive element binding protein (IRE-BP). IRE-BP is a cytosolic protein that binds to iron-responsive elements (IREs). IREs are stem-loop structures found in the 5'UTR of ferritin, and delta aminolevulinic acid synthase mRNAs, and in the 3'UTR of transferrin receptor mRNA. IRE-BP also express aconitase activity. - 3-isopropylmalate dehydratase (isopropylmalate isomerase), the enzyme that catalyzes the second step in the biosynthesis of leucine. 
- Homoaconitase (homoaconitate hydratase), an enzyme that participates in the alpha-aminoadipate pathway of lysine biosynthesis and that converts cis-homoaconitate into homoisocitric acid." Q#21128 - CGI_10001394 superfamily 246680 7 103 4.19E-33 118.822 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#21129 - CGI_10001427 superfamily 245814 148 197 5.91E-08 47.4016 cl11960 Ig superfamily N - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#21130 - CGI_10001453 superfamily 236795 117 163 0.000541097 37.7682 cl15947 PRK10920 superfamily C - putative uroporphyrinogen III C-methyltransferase; Provisional Q#21131 - CGI_10018226 superfamily 241623 3575 3866 2.49E-115 369.341 cl00119 PI3Kc_like superfamily - - "Phosphoinositide 3-kinase (PI3K)-like family, catalytic domain; The PI3K-like catalytic domain family is part of a larger superfamily that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and RIO kinases. Members of the family include PI3K, phosphoinositide 4-kinase (PI4K), PI3K-related protein kinases (PIKKs), and TRansformation/tRanscription domain-Associated Protein (TRRAP). PI3Ks catalyze the transfer of the gamma-phosphoryl group from ATP to the 3-hydroxyl of the inositol ring of D-myo-phosphatidylinositol (PtdIns) or its derivatives, while PI4K catalyze the phosphorylation of the 4-hydroxyl of PtdIns. PIKKs are protein kinases that catalyze the phosphorylation of serine/threonine residues, especially those that are followed by a glutamine. PI3Ks play an important role in a variety of fundamental cellular processes, including cell motility, the Ras pathway, vesicle trafficking and secretion, immune cell activation and apoptosis. PI4Ks produce PtdIns(4)P, the major precursor to important signaling phosphoinositides. PIKKs have diverse functions including cell-cycle checkpoints, genome surveillance, mRNA surveillance, and translation control." Q#21131 - CGI_10018226 superfamily 219733 1687 2091 3.30E-114 372.551 cl06971 NUC194 superfamily - - NUC194 domain; This is domain B in the catalytic subunit of DNA-dependent protein kinases. Q#21132 - CGI_10018227 superfamily 241622 5 87 2.72E-16 69.5178 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). 
Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either an N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95 (post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein."
Q#21133 - CGI_10018228 superfamily 243034 2250 2337 9.35E-06 46.6044 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#21133 - CGI_10018228 superfamily 243034 2161 2212 0.000450226 41.2116 cl02429 TPR superfamily C - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the 
number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#21133 - CGI_10018228 superfamily 222150 1809 1831 0.00294307 38.1417 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#21133 - CGI_10018228 superfamily 222150 1836 1859 0.00909525 36.6009 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#21134 - CGI_10018229 superfamily 219592 650 810 3.54E-43 161.851 cl06720 WAPL superfamily C - "Wings apart-like protein regulation of heterochromatin; This family contains sequences expressed in eukaryotic organisms bearing high similarity to the WAPL conserved region of D. melanogaster wings apart-like protein. This protein is involved in the regulation of heterochromatin structure. hWAPL, the human homologue, is found to play a role in the development of cervical carcinogenesis, and is thought to have similar functions to Drosophila wapl protein. Malfunction of the hWAPL pathway is thought to activate an apoptotic pathway that consequently leads to cell death." Q#21134 - CGI_10018229 superfamily 219592 882 960 5.36E-27 113.315 cl06720 WAPL superfamily C - "Wings apart-like protein regulation of heterochromatin; This family contains sequences expressed in eukaryotic organisms bearing high similarity to the WAPL conserved region of D. 
melanogaster wings apart-like protein. This protein is involved in the regulation of heterochromatin structure. hWAPL, the human homologue, is found to play a role in the development of cervical carcinogenesis, and is thought to have similar functions to Drosophila wapl protein. Malfunction of the hWAPL pathway is thought to activate an apoptotic pathway that consequently leads to cell death." Q#21135 - CGI_10018230 superfamily 219592 2 265 5.56E-54 186.118 cl06720 WAPL superfamily N - "Wings apart-like protein regulation of heterochromatin; This family contains sequences expressed in eukaryotic organisms bearing high similarity to the WAPL conserved region of D. melanogaster wings apart-like protein. This protein is involved in the regulation of heterochromatin structure. hWAPL, the human homologue, is found to play a role in the development of cervical carcinogenesis, and is thought to have similar functions to Drosophila wapl protein. Malfunction of the hWAPL pathway is thought to activate an apoptotic pathway that consequently leads to cell death." Q#21136 - CGI_10018231 superfamily 243036 303 598 4.32E-89 281.161 cl02434 CNH superfamily - - "CNH domain; Domain found in NIK1-like kinase, mouse citron and yeast ROM1, ROM2. Unpublished observations." Q#21140 - CGI_10018235 superfamily 247856 103 131 0.000271486 35.6484 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#21142 - CGI_10018237 superfamily 246975 39 60 0.00521454 32.7041 cl15478 zf-C2H2 superfamily - - "Zinc finger, C2H2 type; The C2H2 zinc finger is the classical zinc finger domain. The two conserved cysteines and histidines co-ordinate a zinc ion. 
The following pattern describes the zinc finger. #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C] Where X can be any amino acid, and numbers in brackets indicate the number of residues. The positions marked # are those that are important for the stable fold of the zinc finger. The final position can be either his or cys. The C2H2 zinc finger is composed of two short beta strands followed by an alpha helix. The amino terminal part of the helix binds the major groove in DNA binding zinc fingers. The accepted consensus binding sequence for Sp1 is usually defined by the asymmetric hexanucleotide core GGGCGG but this sequence does not include, among others, the GAG (=CTC) repeat that constitutes a high-affinity site for Sp1 binding to the wt1 promoter." Q#21143 - CGI_10018238 superfamily 244824 1 368 1.49E-107 328.935 cl07893 AmyAc_family superfamily - - "Alpha amylase catalytic domain family; The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; and C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost this catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). 
The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase." Q#21144 - CGI_10018239 superfamily 245201 21 350 8.41E-141 420.339 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#21151 - CGI_10018246 superfamily 242406 2 122 2.31E-20 83.4097 cl01271 DUF1768 superfamily - - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. Q#21154 - CGI_10002800 superfamily 217410 246 422 1.02E-16 78.1647 cl18409 DDE_1 superfamily - - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction. Interestingly this family also includes the CENP-B protein. 
This domain in that protein appears to have lost the metal binding residues and is unlikely to have endonuclease activity. Centromere Protein B (CENP-B) is a DNA-binding protein localised to the centromere." Q#21158 - CGI_10003130 superfamily 248369 85 261 2.41E-07 48.5834 cl17815 Yip1 superfamily - - Yip1 domain; The Yip1 integral membrane domain contains four transmembrane alpha helices. The domain is characterized by the motifs DLYGP and GY. The Yip1 protein is a golgi protein involved in vesicular transport that interacts with GTPases. Q#21161 - CGI_10003133 superfamily 218482 73 163 4.13E-19 83.0801 cl04969 Kri1 superfamily - - KRI1-like family; The yeast member of this family (Kri1p) is found to be required for 40S ribosome biogenesis in the nucleolus. Q#21161 - CGI_10003133 superfamily 193409 262 330 1.99E-18 80.7077 cl15178 Kri1_C superfamily N - KRI1-like family C-terminal; The yeast member of this family (Kri1p) is found to be required for 40S ribosome biogenesis in the nucleolus. This is the C-terminal domain of the family. Q#21162 - CGI_10003134 superfamily 248012 445 581 2.41E-28 112.367 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#21162 - CGI_10003134 superfamily 245213 309 338 0.00334075 36.7261 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#21165 - CGI_10002119 superfamily 242274 9 90 0.000287212 36.5384 cl01053 SGNH_hydrolase superfamily C - "SGNH_hydrolase, or GDSL_hydrolase, is a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the typical Ser-His-Asp(Glu) triad from other serine hydrolases, but may lack the carboxlic acid." Q#21167 - CGI_10002543 superfamily 216456 442 534 7.76E-07 48.4738 cl03182 RYDR_ITPR superfamily C - "RIH domain; The RIH (RyR and IP3R Homology) domain is an extracellular domain from two types of calcium channels. This region is found in the ryanodine receptor and the inositol-1,4,5- trisphosphate receptor. This domain may form a binding site for IP3." Q#21167 - CGI_10002543 superfamily 197746 95 137 0.00400138 35.7799 cl02624 MIR superfamily C - Domain in ryanodine and inositol trisphosphate receptors and protein O-mannosyltransferases; Domain in ryanodine and inositol trisphosphate receptors and protein O-mannosyltransferases. Q#21168 - CGI_10003028 superfamily 221683 61 102 5.31E-08 47.2635 cl15002 UPF0489 superfamily C - UPF0489 domain; This family is probably an enzyme which is related to the Arginase family. Q#21169 - CGI_10003029 superfamily 245847 26 120 1.47E-16 71.0485 cl12042 FA58C superfamily N - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#21172 - CGI_10004196 superfamily 201831 3 27 0.00481962 32.136 cl03238 Vault superfamily N - Major Vault Protein repeat; The vault is a ubiquitous and highly conserved ribonucleoprotein particle of approximately 13 mDa of unknown function. This family corresponds to a repeat found in the amino terminal half of the major vault protein. 
Q#21176 - CGI_10003710 superfamily 247856 10 72 1.87E-06 42.9201 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#21177 - CGI_10003711 superfamily 246918 63 119 5.31E-05 39.1071 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#21179 - CGI_10005631 superfamily 243051 1 100 6.33E-12 58.9013 cl02479 MAM superfamily N - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#21180 - CGI_10004724 superfamily 241783 33 239 3.21E-51 167.746 cl00322 Ribosomal_L1 superfamily - - "Ribosomal protein L1. The L1 protein, located near the E-site of the ribosome, forms part of the L1 stalk along with 23S rRNA. In bacteria and archaea, L1 functions both as a ribosomal protein that binds rRNA, and as a translation repressor that binds its own mRNA. 
Like several other large ribosomal subunit proteins, L1 displays RNA chaperone activity. L1 is one of the largest ribosomal proteins. It is composed of two domains that cycle between open and closed conformations via a hinge motion. The RNA-binding site of L1 is highly conserved, with both mRNA and rRNA binding the same binding site." Q#21181 - CGI_10004725 superfamily 243082 223 571 8.07E-113 358.108 cl02553 Peptidase_C19 superfamily - - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." Q#21182 - CGI_10004726 superfamily 216981 35 73 4.20E-05 37.5122 cl17087 OTU superfamily C - "OTU-like cysteine protease; This family is comprised of a group of predicted cysteine proteases, homologous to the Ovarian Tumour (OTU) gene in Drosophila. Members include proteins from eukaryotes, viruses and pathogenic bacterium. The conserved cysteine and histidine, and possibly the aspartate, represent the catalytic residues in this putative group of proteases." Q#21183 - CGI_10008967 superfamily 219542 29 142 1.66E-39 140.84 cl18517 Cu-oxidase_3 superfamily - - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. 
Q#21183 - CGI_10008967 superfamily 219541 429 603 9.03E-26 103.317 cl18516 Cu-oxidase_2 superfamily - - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. Q#21183 - CGI_10008967 superfamily 215896 152 342 1.99E-12 65.0088 cl18351 Cu-oxidase superfamily - - Multicopper oxidase; Many of the proteins in this family contain multiple similar copies of this plastocyanin-like domain. Q#21185 - CGI_10008969 superfamily 219542 99 211 5.14E-41 148.544 cl18517 Cu-oxidase_3 superfamily - - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. Q#21185 - CGI_10008969 superfamily 219541 736 892 8.62E-28 111.021 cl18516 Cu-oxidase_2 superfamily - - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. Q#21185 - CGI_10008969 superfamily 215896 533 632 5.89E-22 94.6692 cl18351 Cu-oxidase superfamily N - Multicopper oxidase; Many of the proteins in this family contain multiple similar copies of this plastocyanin-like domain. Q#21185 - CGI_10008969 superfamily 113452 284 369 0.00902412 38.5613 cl04672 BAF1_ABF1 superfamily NC - "BAF1 / ABF1 chromatin reorganising factor; ABF1 is a sequence-specific DNA binding protein involved in transcription activation, gene silencing and initiation of DNA replication. ABF1 is known to remodel chromatin, and it is proposed that it mediates its effects on transcription and gene expression by modifying local chromatin architecture. These functions require a conserved stretch of 20 amino acids in the C-terminal region of ABF1 (amino acids 639 to 662 S. cerevisiae). The N-terminal two thirds of the protein are necessary for DNA binding, and the N-terminus (amino acids 9 to 91 in S. cerevisiae) is thought to contain a novel zinc-finger motif which may stabilise the protein structure." 
Q#21186 - CGI_10008970 superfamily 241574 78 209 6.06E-44 147.755 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#21187 - CGI_10008971 superfamily 245864 65 484 1.24E-67 226.391 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." 
Q#21188 - CGI_10008972 superfamily 245864 56 490 8.96E-76 248.732 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#21189 - CGI_10008973 superfamily 217324 27 136 3.64E-09 51.2954 cl03844 Folate_rec superfamily - - Folate receptor family; This family includes the folate receptor which binds to folate and reduced folic acid derivatives and mediates delivery of 5-methyltetrahydrofolate to the interior of cells. These proteins are attached to the membrane by a GPI-anchor. The proteins contain 16 conserved cysteines that form eight disulphide bridges. Q#21190 - CGI_10008974 superfamily 244083 303 391 2.82E-34 125.492 cl05417 PLA2_like superfamily - - "PLA2_like: Phospholipase A2, a super-family of secretory and cytosolic enzymes; the latter are either Ca dependent or Ca independent. 
PLA2 cleaves the sn-2 position of the glycerol backbone of phospholipids (PC or phosphatidylethanolamine), usually in a metal-dependent reaction, to generate lysophospholipid (LysoPL) and a free fatty acid (FA). The resulting products are either dietary or used in synthetic pathways for leukotrienes and prostaglandins. Often, arachidonic acid is released as a free fatty acid and acts as second messenger in signaling networks. Secreted PLA2s have also been found to specifically bind to a variety of soluble and membrane proteins in mammals, including receptors. As a toxin, PLA2 is a potent presynaptic neurotoxin which blocks nerve terminals by binding to the nerve membrane and hydrolyzing stable membrane lipids. The products of the hydrolysis (LysoPL and FA) cannot form bilayers leading to a change in membrane conformation and ultimately to a block in the release of neurotransmitters. PLA2 may form dimers or oligomers." Q#21190 - CGI_10008974 superfamily 217740 25 242 1.73E-76 244.578 cl18427 Scramblase superfamily - - Scramblase; Scramblase is palmitoylated and contains a potential protein kinase C phosphorylation site. Scramblase exhibits Ca2+-activated phospholipid scrambling activity in vitro. There are also possible SH3 and WW binding motifs. Scramblase is involved in the redistribution of phospholipids after cell activation or injury. Q#21190 - CGI_10008974 superfamily 217252 482 577 1.17E-20 88.0055 cl08372 Pyr_redox_dim superfamily - - "Pyridine nucleotide-disulphide oxidoreductase, dimerisation domain; This family includes both class I and class II oxidoreductases and also NADH oxidases and peroxidases." Q#21192 - CGI_10011852 superfamily 243061 55 155 8.63E-41 134.392 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. 
Q#21192 - CGI_10011852 superfamily 243061 2 52 1.21E-16 71.219 cl02509 SRCR superfamily N - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#21193 - CGI_10011853 superfamily 242274 48 136 0.000299658 38.5474 cl01053 SGNH_hydrolase superfamily NC - "SGNH_hydrolase, or GDSL_hydrolase, is a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the typical Ser-His-Asp(Glu) triad from other serine hydrolases, but may lack the carboxylic acid." Q#21196 - CGI_10011856 superfamily 243035 28 85 1.98E-14 64.2395 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. 
CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#21198 - CGI_10011858 superfamily 245819 110 284 2.09E-70 219.373 cl11967 Nucleotidyl_cyc_III superfamily - - "Class III nucleotidyl cyclases; Class III nucleotidyl cyclases are the largest, most diverse group of nucleotidyl cyclases (NC's) containing prokaryotic and eukaryotic proteins. They can be divided into two major groups; the mononucleotidyl cyclases (MNC's) and the diguanylate cyclases (DGC's). 
The MNC's, which include the adenylate cyclases (AC's) and the guanylate cyclases (GC's), have a conserved cyclase homology domain (CHD), while the DGC's have a conserved GGDEF domain, named after a conserved motif within this subgroup. Their products, cyclic guanylyl and adenylyl nucleotides, are second messengers that play important roles in eukaryotic signal transduction and prokaryotic sensory pathways." Q#21198 - CGI_10011858 superfamily 219526 1 97 7.12E-24 96.9194 cl06648 HNOBA superfamily N - "Heme NO binding associated; The HNOBA domain is found associated with the HNOB domain and pfam00211 in soluble cyclases and signalling proteins. The HNOB domain is predicted to function as a heme-dependent sensor for gaseous ligands, and transduce diverse downstream signals, in both bacteria and animals." Q#21202 - CGI_10003915 superfamily 215647 167 210 7.96E-11 59.9297 cl18338 7tm_2 superfamily N - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GPCRs). They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." 
Q#21203 - CGI_10003916 superfamily 215647 2 37 1.46E-07 45.6773 cl18338 7tm_2 superfamily N - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GPCRs). They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#21205 - CGI_10020653 superfamily 247736 175 231 0.00342663 35.3293 cl17182 NAT_SF superfamily - - "N-Acyltransferase superfamily: Various enzymes that characteristically catalyze the transfer of an acyl group to a substrate; NAT (N-Acyltransferase) is a large superfamily of enzymes that mostly catalyze the transfer of an acyl group to a substrate and are implicated in a variety of functions, ranging from bacterial antibiotic resistance to circadian rhythms in mammals. Members include GCN5-related N-Acetyltransferases (GNAT) such as Aminoglycoside N-acetyltransferases, Histone N-acetyltransferase (HAT) enzymes, and Serotonin N-acetyltransferase, which catalyze the transfer of an acetyl group to a substrate. 
The kinetic mechanism of most GNATs involves the ordered formation of a ternary complex: the reaction begins with Acetyl Coenzyme A (AcCoA) binding, followed by binding of substrate, then direct transfer of the acetyl group from AcCoA to the substrate, followed by product and subsequent CoA release. Other family members include Arginine/ornithine N-succinyltransferase, Myristoyl-CoA: protein N-myristoyltransferase, and Acyl-homoserinelactone synthase which have a similar catalytic mechanism but differ in types of acyl groups transferred. Leucyl/phenylalanyl-tRNA-protein transferase and FemXAB nonribosomal peptidyltransferases which catalyze similar peptidyltransferase reactions are also included." Q#21205 - CGI_10020653 superfamily 247736 34 91 0.000684335 37.5464 cl17182 NAT_SF superfamily C - "N-Acyltransferase superfamily: Various enzymes that characteristically catalyze the transfer of an acyl group to a substrate; NAT (N-Acyltransferase) is a large superfamily of enzymes that mostly catalyze the transfer of an acyl group to a substrate and are implicated in a variety of functions, ranging from bacterial antibiotic resistance to circadian rhythms in mammals. Members include GCN5-related N-Acetyltransferases (GNAT) such as Aminoglycoside N-acetyltransferases, Histone N-acetyltransferase (HAT) enzymes, and Serotonin N-acetyltransferase, which catalyze the transfer of an acetyl group to a substrate. The kinetic mechanism of most GNATs involves the ordered formation of a ternary complex: the reaction begins with Acetyl Coenzyme A (AcCoA) binding, followed by binding of substrate, then direct transfer of the acetyl group from AcCoA to the substrate, followed by product and subsequent CoA release. Other family members include Arginine/ornithine N-succinyltransferase, Myristoyl-CoA: protein N-myristoyltransferase, and Acyl-homoserinelactone synthase which have a similar catalytic mechanism but differ in types of acyl groups transferred. 
Leucyl/phenylalanyl-tRNA-protein transferase and FemXAB nonribosomal peptidyltransferases which catalyze similar peptidyltransferase reactions are also included." Q#21206 - CGI_10020655 superfamily 245604 1 64 1.83E-19 82.0671 cl11404 Biotinyl_lipoyl_domains superfamily - - "Biotinyl_lipoyl_domains are present in biotin-dependent carboxylases/decarboxylases, the dihydrolipoyl acyltransferase component (E2) of 2-oxo acid dehydrogenases, and the H-protein of the glycine cleavage system (GCS). These domains transport CO2, acyl, or methylamine, respectively, between components of the complex/protein via a biotinyl or lipoyl group, which is covalently attached to a highly conserved lysine residue." Q#21206 - CGI_10020655 superfamily 215782 211 414 4.36E-75 235.137 cl18344 2-oxoacid_dh superfamily - - 2-oxoacid dehydrogenases acyltransferase (catalytic domain); These proteins contain one to three copies of a lipoyl binding domain followed by the catalytic domain. Q#21206 - CGI_10020655 superfamily 202412 104 142 1.86E-07 47.4169 cl03729 E3_binding superfamily - - e3 binding domain; This family represents a small domain of the E2 subunit of 2-oxo-acid dehydrogenases responsible for the binding of the E3 subunit. Q#21207 - CGI_10020656 superfamily 241614 11 138 4.86E-41 135.471 cl00105 LMWPc superfamily - - Low molecular weight phosphatase family; Q#21208 - CGI_10020657 superfamily 245849 155 235 2.63E-12 61.1042 cl12045 Ubiq_cyt_C_chap superfamily C - Ubiquinol-cytochrome C chaperone; Ubiquinol-cytochrome C chaperone. Q#21210 - CGI_10020659 superfamily 244881 134 187 4.48E-15 72.2256 cl08267 ISOPREN_C2_like superfamily NC - "This group contains class II terpene cyclases, protein prenyltransferases beta subunit, two broadly specific proteinase inhibitors alpha2-macroglobulin (alpha (2)-M) and pregnancy zone protein (PZP) and, the C3 C4 and C5 components of vertebrate complement. 
Class II terpene cyclases include squalene cyclase (SQCY) and 2,3-oxidosqualene cyclase (OSQCY), these integral membrane proteins catalyze a cationic cyclization cascade converting linear triterpenes to fused ring compounds. The protein prenyltransferases include protein farnesyltransferase (FTase) and geranylgeranyltransferase types I and II (GGTase-I and GGTase-II) which catalyze the carboxyl-terminal lipidation of Ras, Rab, and several other cellular signal transduction proteins, facilitating membrane associations and specific protein-protein interactions. Alpha (2)-M is a major carrier protein in serum and involved in the immobilization and entrapment of proteases. PZP is a pregnancy associated protein. Alpha (2)-M and PZP are known to bind to and, may modulate, the activity of placental protein-14 in T-cell growth and cytokine production thereby protecting the allogeneic fetus from attack by the maternal immune system." Q#21210 - CGI_10020659 superfamily 203720 188 230 7.28E-07 45.617 cl08457 A2M_recep superfamily N - A-macroglobulin receptor; This family includes the receptor domain region of the alpha-2-macroglobulin family. Q#21211 - CGI_10020660 superfamily 243072 48 163 3.73E-24 98.995 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#21211 - CGI_10020660 superfamily 245814 387 449 0.00608777 35.5589 cl11960 Ig superfamily N - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#21212 - CGI_10020661 superfamily 243072 60 178 7.98E-26 100.151 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#21212 - CGI_10020661 superfamily 243072 1 113 1.54E-24 96.6838 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#21213 - CGI_10020662 superfamily 247794 16 421 0 768.159 cl17240 FDH_GDH_like superfamily - - "Formate/glycerate dehydrogenases, D-specific 2-hydroxy acid dehydrogenases and related dehydrogenases; The formate/glycerate dehydrogenase like family contains a diverse group of enzymes such as formate dehydrogenase (FDH), glycerate dehydrogenase (GDH), D-lactate dehydrogenase, L-alanine dehydrogenase, and S-Adenosylhomocysteine hydrolase, that share a common 2-domain structure. 
Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar domains of the alpha/beta Rossmann fold NAD+ binding form. The NAD(P) binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD(P) is bound, primarily to the C-terminal portion of the 2nd (internal) domain. While many members of this family are dimeric, alanine DH is hexameric and phosphoglycerate DH is tetrameric. 2-hydroxyacid dehydrogenases are enzymes that catalyze the conversion of a wide variety of D-2-hydroxy acids to their corresponding keto acids. The general mechanism is (R)-lactate + acceptor to pyruvate + reduced acceptor. Formate dehydrogenase (FDH) catalyzes the NAD+-dependent oxidation of formate ion to carbon dioxide with the concomitant reduction of NAD+ to NADH. FDHs of this family contain no metal ions or prosthetic groups. Catalysis occurs through direct transfer of a hydride ion to NAD+ without the stages of acid-base catalysis typically found in related dehydrogenases." Q#21215 - CGI_10020664 superfamily 245814 105 194 1.31E-13 65.5446 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#21215 - CGI_10020664 superfamily 245814 204 281 4.77E-12 61.4172 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#21216 - CGI_10020665 superfamily 201590 31 187 1.61E-93 274.475 cl03090 HH_signal superfamily - - "Hedgehog amino-terminal signalling domain; For the carboxyl Hint module, see pfam01079. Hedgehog is a family of secreted signal molecules required for embryonic cell differentiation." Q#21216 - CGI_10020665 superfamily 247996 200 230 0.000338923 37.6348 cl17442 Hint superfamily C - "Hedgehog/Intein domain, found in Hedgehog proteins as well as proteins which contain inteins and undergo protein splicing (e.g. DnaB, RIR1-2, GyrA and Pol). In protein splicing an intervening polypeptide sequence - the intein - is excised from a protein, and the flanking polypeptide sequences - the exteins - are joined by a peptide bond. In addition to the autocatalytic splicing domain, many inteins contain an inserted endonuclease domain, which plays a role in spreading inteins. Hedgehog proteins are a major class of intercellular signaling molecules, which control inductive interactions during animal development. The mature signaling forms of hedgehog proteins are the N-terminal fragments, which are covalently linked to cholesterol at their C-termini. This modification is the result of an autoprocessing step catalyzed by the C-terminal fragments, which are aligned here." 
Q#21217 - CGI_10020666 superfamily 248097 54 169 6.07E-21 83.8538 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#21218 - CGI_10020667 superfamily 247684 176 330 0.00717299 37.9536 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#21218 - CGI_10020667 superfamily 248383 668 1035 8.70E-92 300.633 cl17829 DUF917 superfamily - - Protein of unknown function (DUF917); This family consists of hypothetical bacterial and archaeal proteins of unknown function. 
Q#21218 - CGI_10020667 superfamily 218571 11 184 7.15E-31 121.207 cl05110 Hydant_A_N superfamily - - Hydantoinase/oxoprolinase N-terminal region; This family is found at the N-terminus of the pfam01968 family. Q#21218 - CGI_10020667 superfamily 248097 1125 1241 1.80E-20 89.6318 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#21219 - CGI_10020668 superfamily 248458 31 204 3.11E-05 44.6121 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. 
Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#21219 - CGI_10020668 superfamily 248458 282 408 0.000706861 40.3749 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." 
Q#21220 - CGI_10020669 superfamily 243096 305 487 6.66E-20 89.2792 cl02571 RhoGEF superfamily - - Guanine nucleotide exchange factor for Rho/Rac/Cdc42-like GTPases; Also called Dbl-homologous (DH) domain. It appears that PH domains invariably occur C-terminal to RhoGEF/DH domains. Q#21221 - CGI_10020670 superfamily 241752 130 292 2.77E-24 98.1714 cl00283 ADP_ribosyl superfamily - - "ADP_ribosylating enzymes catalyze the transfer of ADP_ribose from NAD+ to substrates. Bacterial toxins are cytoplasmic and catalyze the transfer of a single ADP_ribose unit to eukaryotic elongation factor 2, halting protein synthesis and killing the cell. Poly(ADP-ribose) polymerases (PARPS 1-3, VPARP, tankyrase) catalyze the addition of up to 100 ADP_ribose units from NAD+. PARPs 1 and 2 are localized in the nucleus, bind DNA, and are activated by DNA damage. VPARP is part of the vault ribonucleoprotein complex. Tankyrases regulate telomere length in part through poly(ADP_ribosylation) of telomere repeat binding factor 1 (TRF1). Poly(ADP-ribose) polymerase catalyses the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active." Q#21222 - CGI_10020671 superfamily 247743 3 43 6.72E-07 45.2147 cl17189 AAA superfamily C - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. 
Members of the AAA+ ATPases function as molecular chaperones, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#21224 - CGI_10020673 superfamily 241572 196 285 1.48E-23 93.4572 cl00050 CYCLIN superfamily - - "Cyclin box fold. Protein binding domain functioning in cell-cycle and transcription control. Present in cyclins, TFIIB and Retinoblastoma (RB). The cyclins consist of 8 classes of cell cycle regulators that regulate cyclin dependent kinases (CDKs). TFIIB is a transcription factor that binds the TATA box. Cyclins, TFIIB and RB contain 2 copies of the domain." Q#21224 - CGI_10020673 superfamily 241572 295 379 9.06E-17 74.9676 cl00050 CYCLIN superfamily - - "Cyclin box fold. Protein binding domain functioning in cell-cycle and transcription control. Present in cyclins, TFIIB and Retinoblastoma (RB). The cyclins consist of 8 classes of cell cycle regulators that regulate cyclin dependent kinases (CDKs). TFIIB is a transcription factor that binds the TATA box. Cyclins, TFIIB and RB contain 2 copies of the domain." Q#21225 - CGI_10020674 superfamily 244962 60 250 2.88E-27 106.852 cl08452 SKA1_N superfamily - - "Spindle and kinetochore-associated protein 1, N-terminal domain; SKA1 is a component of the SKA complex, which is formed by the association of three subunits (SKA1, SKA2, and SKA3). The SKA complex is essential for accurate cell division. It functions with the Ndc80 network to establish stable kinetochore-microtubule interactions, which are crucial for the highly orchestrated chromosome movements during mitosis. The biological unit is a W-shaped homodimer of the three-subunit complex. This model represents the N-terminal domain of SKA1, which is involved in interactions with SKA2 and SKA3 to form the SKA complex. 
The C-terminal portion of SKA1 is involved in creating a microtubule-binding surface." Q#21226 - CGI_10020675 superfamily 245841 692 1055 1.51E-124 392.655 cl12025 PolY superfamily - - "Y-family of DNA polymerases; Y-family DNA polymerases are a specialized subset of polymerases that facilitate translesion synthesis (TLS), a process that allows the bypass of a variety of DNA lesions. Unlike replicative polymerases, TLS polymerases lack proofreading activity and have low fidelity and low processivity. They use damaged DNA as templates and insert nucleotides opposite the lesions. The active sites of TLS polymerases are large and flexible to allow the accommodation of distorted bases. Most TLS polymerases are members of the Y-family, including Pol eta, Pol kappa/IV, Pol iota, Rev1, and Pol V, which is found exclusively in bacteria. In eukaryotes, the B-family polymerase Pol zeta also functions as a TLS polymerase. Expression of Y-family polymerases is often induced by DNA damage and is believed to be highly regulated. TLS is likely induced by the monoubiquitination of the replication clamp PCNA, which provides a scaffold for TLS polymerases to bind in order to access the lesion. Because of their high error rates, TLS polymerases are potential targets for cancer treatment and prevention." Q#21226 - CGI_10020675 superfamily 128973 1345 1370 0.00117248 38.3574 cl02765 ZnF_Rad18 superfamily - - Rad18-like CCHC zinc finger; Yeast Rad18p functions with Rad5p in error-free post-replicative DNA repair. This zinc finger is likely to bind nucleic-acids. Q#21227 - CGI_10020676 superfamily 247725 33 129 1.15E-57 190.157 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. 
They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#21227 - CGI_10020676 superfamily 246681 367 599 9.23E-123 365.125 cl14643 SRPBCC superfamily - - "START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC (SRPBCC) ligand-binding domain superfamily; SRPBCC domains have a deep hydrophobic ligand-binding pocket; they bind diverse ligands. Included in this superfamily are the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of mammalian STARD1-STARD15, and the C-terminal catalytic domains of the alpha oxygenase subunit of Rieske-type non-heme iron aromatic ring-hydroxylating oxygenases (RHOs_alpha_C), as well as the SRPBCC domains of phosphatidylinositol transfer proteins (PITPs), Bet v 1 (the major pollen allergen of white birch, Betula verrucosa), CoxG, CalC, and related proteins. Other members of this superfamily include PYR/PYL/RCAR plant proteins, the aromatase/cyclase (ARO/CYC) domains of proteins such as Streptomyces glaucescens tetracenomycin, and the SRPBCC domains of Streptococcus mutans Smu.440 and related proteins." Q#21228 - CGI_10020677 superfamily 241578 409 592 3.78E-14 71.443 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#21228 - CGI_10020677 superfamily 115363 758 817 1.34E-12 64.7005 cl05972 MIB_HERC2 superfamily - - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). Q#21228 - CGI_10020677 superfamily 115363 686 744 1.83E-12 64.3153 cl05972 MIB_HERC2 superfamily - - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). Q#21229 - CGI_10020678 superfamily 241578 167 329 4.78E-11 61.4278 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. 
Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#21231 - CGI_10020680 superfamily 242165 4 260 0 515.635 cl00880 Ribosomal_S8e_like superfamily - - "Eukaryotic/archaeal ribosomal protein S8e and similar proteins; This family contains the eukaryotic/archaeal ribosomal protein S8, a component of the small ribosomal subunits, as well as the NSA2 gene product." Q#21232 - CGI_10020681 superfamily 243072 473 595 8.02E-34 127.115 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#21232 - CGI_10020681 superfamily 243072 637 758 3.41E-29 114.018 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#21232 - CGI_10020681 superfamily 241760 87 131 3.90E-18 79.8099 cl00295 ZZ superfamily - - "Zinc finger, ZZ type. Zinc finger present in dystrophin, CBP/p300 and many other proteins. The ZZ motif coordinates one or two zinc ions and most likely participates in ligand binding or molecular scaffolding. 
Many proteins containing ZZ motifs have other zinc-binding motifs as well, and the majority serve as scaffolds in pathways involving acetyltransferase, protein kinase, or ubiquitin-related activity. ZZ proteins can be grouped into the following functional classes: chromatin modifying, cytoskeletal scaffolding, ubiquitin binding or conjugating, and membrane receptor or ion-channel modifying proteins." Q#21232 - CGI_10020681 superfamily 115363 13 78 7.01E-15 70.8637 cl05972 MIB_HERC2 superfamily - - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). Q#21232 - CGI_10020681 superfamily 115363 159 222 3.61E-11 60.0781 cl05972 MIB_HERC2 superfamily - - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). Q#21234 - CGI_10020683 superfamily 222557 26 201 2.01E-75 227.864 cl16634 DUF4291 superfamily - - Domain of unknown function (DUF4291); This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 190 and 214 amino acids in length. There are two conserved sequence motifs: VYQAY and RMTW. Q#21235 - CGI_10020684 superfamily 241570 65 165 8.15E-14 67.351 cl00047 CAP_ED superfamily - - "effector domain of the CAP family of transcription factors; members include CAP (or cAMP receptor protein (CRP)), which binds cAMP, FNR (fumarate and nitrate reduction), which uses an iron-sulfur cluster to sense oxygen) and CooA, a heme containing CO sensor. In all cases binding of the effector leads to conformational changes and the ability to activate transcription. Cyclic nucleotide-binding domain similar to CAP are also present in cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) and vertebrate cyclic nucleotide-gated ion-channels. 
Cyclic nucleotide-monophosphate binding domain; proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues; the best studied is the prokaryotic catabolite gene activator, CAP, where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure; three conserved glycine residues are thought to be essential for maintenance of the structural integrity of the beta-barrel; CooA is a homodimeric transcription factor that belongs to CAP family; cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain; cAPK's are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain; cGPK's are single chain enzymes that include the two copies of the domain in their N-terminal section; also found in vertebrate cyclic nucleotide-gated ion-channels" Q#21235 - CGI_10020684 superfamily 241570 205 319 1.14E-07 49.2466 cl00047 CAP_ED superfamily - - "effector domain of the CAP family of transcription factors; members include CAP (or cAMP receptor protein (CRP)), which binds cAMP, FNR (fumarate and nitrate reduction), which uses an iron-sulfur cluster to sense oxygen) and CooA, a heme containing CO sensor. In all cases binding of the effector leads to conformational changes and the ability to activate transcription. Cyclic nucleotide-binding domain similar to CAP are also present in cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) and vertebrate cyclic nucleotide-gated ion-channels. 
Cyclic nucleotide-monophosphate binding domain; proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues; the best studied is the prokaryotic catabolite gene activator, CAP, where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure; three conserved glycine residues are thought to be essential for maintenance of the structural integrity of the beta-barrel; CooA is a homodimeric transcription factor that belongs to CAP family; cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain; cAPK's are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain; cGPK's are single chain enzymes that include the two copies of the domain in their N-terminal section; also found in vertebrate cyclic nucleotide-gated ion-channels" Q#21237 - CGI_10020686 superfamily 245863 69 359 2.69E-110 326.608 cl12077 Methyltrn_RNA_3 superfamily - - "Putative RNA methyltransferase; This family has a TIM barrel-like fold with a deep C-terminal trefoil knot. The arrangement of its hydrophilic and hydrophobic surfaces are opposite to that of the classic TIM barrel proteins. It is likely to bind RNA, and may function as a methyltransferase." 
Q#21238 - CGI_10020687 superfamily 243092 183 496 1.08E-57 203.721 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#21238 - CGI_10020687 superfamily 243084 1179 1294 8.15E-30 117.438 cl02556 Bromodomain superfamily - - Bromodomain. Bromodomains are found in many chromatin-associated proteins and in nuclear histone acetyltransferases. They interact specifically with acetylated lysine. Q#21238 - CGI_10020687 superfamily 243084 1359 1471 1.27E-47 168.405 cl02556 Bromodomain superfamily - - Bromodomain. Bromodomains are found in many chromatin-associated proteins and in nuclear histone acetyltransferases. They interact specifically with acetylated lysine. 
Q#21239 - CGI_10020688 superfamily 241707 28 251 9.88E-109 319.198 cl00230 CIS_IPPS superfamily - - "Cis (Z)-Isoprenyl Diphosphate Synthases (cis-IPPS); homodimers which catalyze the successive 1'-4 condensation of the isopentenyl diphosphate (IPP) molecule to trans,trans-farnesyl diphosphate (FPP) or to cis,trans-FPP to form long-chain polyprenyl diphosphates. A few can also catalyze the condensation of IPP to trans-geranyl diphosphate to form the short-chain cis,trans-FPP. In prokaryotes, the cis-IPPS, undecaprenyl diphosphate synthase (UPP synthase) catalyzes the formation of the carrier lipid UPP in bacterial cell wall peptidoglycan biosynthesis. Similarly, in eukaryotes, the cis-IPPS, dehydrodolichyl diphosphate (dedol-PP) synthase catalyzes the formation of the polyisoprenoid glycosyl carrier lipid dolichyl monophosphate. cis-IPPS are mechanistically and structurally distinct from trans-IPPS, lacking the DDXXD motifs, yet requiring Mg2+ for activity." Q#21242 - CGI_10020195 superfamily 217518 254 356 1.90E-36 129.277 cl08388 CBM_21 superfamily - - "Putative phosphatase regulatory subunit; This family consists of several eukaryotic proteins that are thought to be involved in the regulation of glycogen metabolism. For instance, the mouse PTG protein has been shown to interact with glycogen synthase, phosphorylase kinase, phosphorylase a: these three enzymes have key roles in the regulation of glycogen metabolism. PTG also binds the catalytic subunit of protein phosphatase 1 (PP1C) and localises it to glycogen. Subsets of similar interactions have been observed with several other members of this family, such as the yeast PIG1, PIG2, GAC1 and GIP2 proteins. While the precise function of these proteins is not known, they may serve a scaffold function, bringing together the key enzymes in glycogen metabolism. This family is a carbohydrate binding domain." 
Q#21243 - CGI_10020196 superfamily 247724 6 92 1.06E-09 51.3724 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#21244 - CGI_10020197 superfamily 220692 2 257 3.07E-08 52.5917 cl18570 7TM_GPCR_Srw superfamily - - Serpentine type 7TM GPCR chemoreceptor Srw; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srw is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. The genes encoding Srw do not appear to be under as strong an adaptive evolutionary pressure as those of Srz. 
Q#21247 - CGI_10020200 superfamily 217293 26 217 1.53E-30 116.578 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#21249 - CGI_10020202 superfamily 243034 348 443 3.00E-10 59.316 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#21249 - CGI_10020202 superfamily 243034 155 260 0.000534598 40.4412 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, 
cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#21249 - CGI_10020202 superfamily 202224 1034 1142 8.54E-36 133.96 cl18224 JmjC superfamily - - "JmjC domain, hydroxylase; The JmjC domain belongs to the Cupin superfamily. JmjC-domain proteins may be protein hydroxylases that catalyze a novel histone modification. This is confirmed to be a hydroxylase: the human JmjC protein named Tyw5p unexpectedly acts in the biosynthesis of a hypermodified nucleoside, hydroxy-wybutosine, in tRNA-Phe by catalyzing hydroxylation." Q#21249 - CGI_10020202 superfamily 214721 1000 1064 1.58E-06 47.632 cl18313 JmjC superfamily - - "A domain family that is part of the cupin metalloenzyme superfamily; Probable enzymes, but of unknown functions, that regulate chromatin reorganisation processes (Clissold and Ponting, in press)." 
Q#21249 - CGI_10020202 superfamily 242988 1447 1585 0.00249573 39.9116 cl02331 Intg_mem_TP0381 superfamily N - "Integral membrane protein (intg_mem_TP0381); This entry represents a family of hydrophobic proteins with seven predicted transmembrane alpha helices. Members are found in Bacillus subtilis (ywaF), TP0381 from Treponema pallidum (TP0381), Streptococcus pyogenes, Rhodococcus erythropolis, etc." Q#21251 - CGI_10020204 superfamily 247725 44 92 9.39E-18 74.9439 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#21253 - CGI_10020206 superfamily 242251 49 103 4.26E-13 61.5315 cl01015 FUN14 superfamily C - FUN14 family; This family of short proteins are found in eukaryotes and some archaea. Although the function of these proteins is not known they may contain transmembrane helices. 
Q#21255 - CGI_10020208 superfamily 243034 460 525 2.08E-08 52.3824 cl02429 TPR superfamily C - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#21255 - CGI_10020208 superfamily 243034 349 447 3.09E-07 48.5304 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number 
of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#21255 - CGI_10020208 superfamily 243034 291 378 3.42E-05 42.3672 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 
subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#21259 - CGI_10020212 superfamily 217888 474 782 4.19E-95 302.261 cl04396 Sec15 superfamily - - Exocyst complex subunit Sec15-like; Exocyst complex subunit Sec15-like. Q#21260 - CGI_10020213 superfamily 245206 9 237 4.40E-63 199.435 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#21261 - CGI_10020214 superfamily 247723 29 58 1.78E-05 43.366 cl17169 RRM_SF superfamily C - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. 
The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#21263 - CGI_10020216 superfamily 241570 27 87 0.00191635 34.609 cl00047 CAP_ED superfamily N - "effector domain of the CAP family of transcription factors; members include CAP (or cAMP receptor protein (CRP)), which binds cAMP, FNR (fumarate and nitrate reduction), which uses an iron-sulfur cluster to sense oxygen) and CooA, a heme containing CO sensor. In all cases binding of the effector leads to conformational changes and the ability to activate transcription. Cyclic nucleotide-binding domain similar to CAP are also present in cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) and vertebrate cyclic nucleotide-gated ion-channels. Cyclic nucleotide-monophosphate binding domain; proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues; the best studied is the prokaryotic catabolite gene activator, CAP, where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure; three conserved glycine residues are thought to be essential for maintenance of the structural integrity of the beta-barrel; CooA is a homodimeric transcription factor that belongs to CAP family; cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain; cAPK's are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain; cGPK's are single chain enzymes that include the two copies of the domain in their N-terminal section; also found in vertebrate cyclic nucleotide-gated ion-channels" Q#21265 - CGI_10020218 superfamily 241874 644 
1162 2.42E-161 495.463 cl00456 SLC5-6-like_sbd superfamily - - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#21265 - CGI_10020218 superfamily 241874 1 499 8.30E-156 480.825 cl00456 SLC5-6-like_sbd superfamily - - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. 
NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#21266 - CGI_10020219 superfamily 241874 1 64 4.20E-14 66.7358 cl00456 SLC5-6-like_sbd superfamily C - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#21270 - CGI_10020223 superfamily 242406 22 105 3.20E-09 51.4381 cl01271 DUF1768 superfamily N - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. 
Q#21272 - CGI_10013812 superfamily 241618 1 35 2.06E-06 41.5527 cl00111 PAH superfamily - - "Pancreatic Hormone domain, a regulator of pancreatic and gastrointestinal functions; neuropeptide Y (NPY), peptide YY (PYY), and pancreatic polypeptide (PP) are closely related; propeptide is enzymatically cleaved to yield the mature active peptide with amidated C-terminal ends; receptor binding and activation functions may reside in the N- and C-termini respectively; occurs in neurons, intestinal endocrine cells, and pancreas; exist as monomers and dimers" Q#21275 - CGI_10013816 superfamily 241563 62 102 0.000144869 39.77 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#21276 - CGI_10013817 superfamily 241563 62 102 0.000106305 40.5404 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#21276 - CGI_10013817 superfamily 110440 493 517 0.00647422 35.0761 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats."
Q#21276 - CGI_10013817 superfamily 220631 576 612 0.0094989 34.7606 cl10899 MRP-S32 superfamily N - "Mitochondrial 28S ribosomal protein S32; This entry is of a family of short, approximately 100 amino acid residues, proteins which are mitochondrial 28S ribosomal proteins named as MRP-S32. Their exact function could not be confirmed." Q#21277 - CGI_10013818 superfamily 241563 132 172 0.000114162 40.1552 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#21277 - CGI_10013818 superfamily 242730 231 372 0.000584894 39.1667 cl01825 Phage_Mu_Gam superfamily - - Bacteriophage Mu Gam like protein; This family consists of bacterial and phage Gam proteins. The gam gene of bacteriophage Mu encodes a protein which protects linear double stranded DNA from exonuclease degradation in vitro and in vivo. Q#21278 - CGI_10013819 superfamily 191362 217 269 2.08E-26 98.8822 cl05351 zf-nanos superfamily - - "Nanos RNA binding domain; This family consists of several conserved novel zinc finger domains found in the eukaryotic proteins Nanos and Xcat-2. In Drosophila melanogaster, Nanos functions as a localised determinant of posterior pattern. Nanos RNA is localised to the posterior pole of the maturing egg cell and encodes a protein that emanates from this localised source. Nanos acts as a translational repressor and thereby establishes a gradient of the morphogen Hunchback. Xcat-2 is found in the vegetal cortical region and is inherited by the vegetal blastomeres during development, and is degraded very early in development. The localised and maternally restricted expression of Xcat-2 RNA suggests a role for its protein in setting up regional differences in gene expression that occur early in development."
Q#21279 - CGI_10013820 superfamily 215859 429 631 1.06E-33 127.718 cl18347 Peptidase_S9 superfamily - - Prolyl oligopeptidase family; Prolyl oligopeptidase family. Q#21280 - CGI_10013821 superfamily 241563 59 98 6.60E-05 40.9256 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#21281 - CGI_10013822 superfamily 241563 67 104 0.000186504 39.3848 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#21282 - CGI_10013823 superfamily 247746 41 98 0.00747019 35.3118 cl17192 ATP-synt_B superfamily N - "ATP synthase B/B' CF(0); Part of the CF(0) (base unit) of the ATP synthase. The base unit is thought to translocate protons through membrane (inner membrane in mitochondria, thylakoid membrane in plants, cytoplasmic membrane in bacteria). The B subunits are thought to interact with the stalk of the CF(1) subunits. This domain should not be confused with the ab CF(1) proteins (in the head of the ATP synthase) which are found in pfam00006" Q#21288 - CGI_10013829 superfamily 241600 411 616 7.36E-45 158.943 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. 
C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#21289 - CGI_10013830 superfamily 149409 182 725 2.27E-119 372.109 cl18037 Plexin_cytopl superfamily - - "Plexin cytoplasmic RasGAP domain; This family features the C-terminal regions of various plexins. Plexins are receptors for semaphorins, and plexin signalling is important in path finding and patterning of both neurons and developing blood vessels. The cytoplasmic region, which has been called a SEX domain in some members of this family, is involved in downstream signalling pathways, by interaction with proteins such as Rac1, RhoD, Rnd1 and other plexins. This domain acts as a RasGAP domain." Q#21290 - CGI_10018247 superfamily 247684 249 452 6.47E-21 94.6515 cl17037 NBD_sugar-kinase_HSP70_actin superfamily N - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. 
The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#21290 - CGI_10018247 superfamily 245847 11 171 0.000243186 40.6178 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#21290 - CGI_10018247 superfamily 110440 691 717 0.00636305 35.4613 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#21291 - CGI_10018248 superfamily 214531 79 120 1.38E-06 43.7445 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#21291 - CGI_10018248 superfamily 214531 34 77 5.89E-06 42.2037 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism.
Also present in a variety of molecules similar to gp300/megalin. Q#21291 - CGI_10018248 superfamily 215683 8 51 0.00696546 33.2975 cl18339 Ldl_recept_b superfamily - - Low-density lipoprotein receptor repeat class B; This domain is also known as the YWTD motif after the most conserved region of the repeat. The YWTD repeat is found in multiple tandem repeats and has been predicted to form a beta-propeller structure. Q#21293 - CGI_10018250 superfamily 245213 198 233 0.000576304 38.3866 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#21293 - CGI_10018250 superfamily 245213 240 275 0.0026518 36.0754 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#21293 - CGI_10018250 superfamily 222083 406 472 0.00493131 36.4334 cl16258 DUF2407_C superfamily C - DUF2407 C-terminal domain; This is a family of proteins found in fungi. The function is not known. There is a characteristic GFDRL sequence motif. 
Q#21294 - CGI_10018251 superfamily 218520 31 211 3.28E-69 212.909 cl05007 EBP superfamily - - "Emopamil binding protein; Emopamil binding protein (EBP) is a gene that encodes a non-glycosylated type I integral membrane protein of endoplasmic reticulum and shows high level expression in epithelial tissues. The EBP protein has emopamil binding domains, including the sterol acceptor site and the catalytic centre, which show Delta8-Delta7 sterol isomerase activity. Human sterol isomerase, a homologue of mouse EBP, is suggested not only to play a role in cholesterol biosynthesis, but also to affect lipoprotein internalisation. In humans, mutations of EBP are known to cause the genetic disorder of X-linked dominant chondrodysplasia punctata (CDPX2). This syndrome of humans is lethal in most males, and affected females display asymmetric hyperkeratotic skin and skeletal abnormalities." Q#21296 - CGI_10018253 superfamily 243166 61 253 5.06E-16 73.4818 cl02759 TRAM_LAG1_CLN8 superfamily - - TLC domain; TLC domain. Q#21297 - CGI_10018254 superfamily 243051 80 164 0.00115298 36.587 cl02479 MAM superfamily N - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization.
In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal cord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#21299 - CGI_10018256 superfamily 243106 1085 1204 7.42E-54 187.021 cl02608 BAH superfamily - - "BAH, or Bromo Adjacent Homology domain (also called ELM1 and BAM for Bromo Adjacent Motif). BAH domains have first been described as domains found in the polybromo protein and Yeast Rsc1/Rsc2 (Remodeling of the Structure of Chromatin). They also occur in mammalian DNA methyltransferases and the MTA1 subunits of histone deacetylase complexes. A BAH domain is also found in Yeast Sir3p and in the origin receptor complex protein 1 (Orc1p), where it was found to interact with the N-terminal lobe of the silence information regulator 1 protein (Sir1p), confirming the initial hypothesis that BAH plays a role in protein-protein interactions." Q#21299 - CGI_10018256 superfamily 243084 231 341 8.82E-48 169.436 cl02556 Bromodomain superfamily - - Bromodomain. Bromodomains are found in many chromatin-associated proteins and in nuclear histone acetyltransferases. They interact specifically with acetylated lysine. Q#21299 - CGI_10018256 superfamily 243084 907 1019 3.19E-46 164.848 cl02556 Bromodomain superfamily - - Bromodomain. Bromodomains are found in many chromatin-associated proteins and in nuclear histone acetyltransferases. They interact specifically with acetylated lysine. Q#21299 - CGI_10018256 superfamily 241597 1390 1452 9.30E-16 75.3479 cl00082 HMG-box superfamily - - "High Mobility Group (HMG)-box is found in a variety of eukaryotic chromosomal proteins and transcription factors. HMGs bind to the minor groove of DNA and have been classified by DNA binding preferences. Two phylogenically distinct groups of Class I proteins bind DNA in a sequence specific fashion and contain a single HMG box.
One group (SOX-TCF) includes transcription factors, TCF-1, -3, -4; and also SRY and LEF-1, which bind four-way DNA junctions and duplex DNA targets. The second group (MATA) includes fungal mating type gene products MC, MATA1 and Ste11. Class II and III proteins (HMGB-UBF) bind DNA in a non-sequence specific fashion and contain two or more tandem HMG boxes. Class II members include non-histone chromosomal proteins, HMG1 and HMG2, which bind to bent or distorted DNA such as four-way DNA junctions, synthetic DNA cruciforms, kinked cisplatin-modified DNA, DNA bulges, cross-overs in supercoiled DNA, and can cause looping of linear DNA. Class III members include nucleolar and mitochondrial transcription factors, UBF and mtTF1, which bind four-way DNA junctions." Q#21299 - CGI_10018256 superfamily 243084 386 485 5.72E-35 132.178 cl02556 Bromodomain superfamily - - Bromodomain. Bromodomains are found in many chromatin-associated proteins and in nuclear histone acetyltransferases. They interact specifically with acetylated lysine. Q#21299 - CGI_10018256 superfamily 243084 556 652 4.53E-30 117.825 cl02556 Bromodomain superfamily - - Bromodomain. Bromodomains are found in many chromatin-associated proteins and in nuclear histone acetyltransferases. They interact specifically with acetylated lysine. Q#21299 - CGI_10018256 superfamily 243084 811 893 5.83E-30 117.791 cl02556 Bromodomain superfamily - - Bromodomain. Bromodomains are found in many chromatin-associated proteins and in nuclear histone acetyltransferases. They interact specifically with acetylated lysine. Q#21299 - CGI_10018256 superfamily 243106 1247 1314 5.92E-12 65.6835 cl02608 BAH superfamily C - "BAH, or Bromo Adjacent Homology domain (also called ELM1 and BAM for Bromo Adjacent Motif). BAH domains have first been described as domains found in the polybromo protein and Yeast Rsc1/Rsc2 (Remodeling of the Structure of Chromatin). 
They also occur in mammalian DNA methyltransferases and the MTA1 subunits of histone deacetylase complexes. A BAH domain is also found in Yeast Sir3p and in the origin receptor complex protein 1 (Orc1p), where it was found to interact with the N-terminal lobe of the silence information regulator 1 protein (Sir1p), confirming the initial hypothesis that BAH plays a role in protein-protein interactions." Q#21299 - CGI_10018256 superfamily 243084 737 776 1.72E-11 64.0038 cl02556 Bromodomain superfamily N - Bromodomain. Bromodomains are found in many chromatin-associated proteins and in nuclear histone acetyltransferases. They interact specifically with acetylated lysine. Q#21299 - CGI_10018256 superfamily 241563 1817 1852 0.00196121 38.6144 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#21301 - CGI_10018258 superfamily 248458 67 230 7.77E-26 108.17 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. 
The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#21301 - CGI_10018258 superfamily 248458 298 364 0.000919184 40.3749 cl17904 MFS superfamily NC - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. 
Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#21301 - CGI_10018258 superfamily 248458 471 556 0.0049336 38.0637 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 
Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#21302 - CGI_10018259 superfamily 217228 113 407 4.45E-167 473.766 cl07843 G6PD_C superfamily - - "Glucose-6-phosphate dehydrogenase, C-terminal domain; Glucose-6-phosphate dehydrogenase, C-terminal domain. " Q#21302 - CGI_10018259 superfamily 215937 1 111 1.77E-54 180.776 cl02877 G6PD_N superfamily N - "Glucose-6-phosphate dehydrogenase, NAD binding domain; Glucose-6-phosphate dehydrogenase, NAD binding domain. " Q#21303 - CGI_10008625 superfamily 216686 180 364 1.14E-34 128.21 cl18377 Galactosyl_T superfamily - - "Galactosyltransferase; This family includes the galactosyltransferases UDP-galactose:2-acetamido-2-deoxy-D-glucose3beta-galactosyltransferase and UDP-Gal:beta-GlcNAc beta 1,3-galactosyltranferase. Specific galactosyltransferases transfer galactose to GlcNAc terminal chains in the synthesis of the lacto-series oligosaccharides types 1 and 2." Q#21304 - CGI_10008626 superfamily 243035 368 490 6.58E-18 79.5861 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. 
For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#21304 - CGI_10008626 superfamily 246918 124 175 2.58E-14 67.9971 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#21304 - CGI_10008626 superfamily 246918 181 220 2.94E-06 44.8851 cl15278 TSP_1 superfamily C - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#21305 - CGI_10008627 superfamily 246918 168 220 8.53E-12 61.0635 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#21305 - CGI_10008627 superfamily 246918 288 335 4.01E-11 59.1375 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#21305 - CGI_10008627 superfamily 246918 54 106 8.60E-11 58.3671 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#21305 - CGI_10008627 superfamily 246918 111 163 1.15E-10 57.9819 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#21305 - CGI_10008627 superfamily 246918 3 49 3.17E-08 50.6631 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#21305 - CGI_10008627 superfamily 246918 346 385 2.34E-07 48.3519 cl15278 TSP_1 superfamily C - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#21306 - CGI_10008628 superfamily 246918 432 484 5.45E-10 56.8263 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#21306 - CGI_10008628 superfamily 246918 716 768 8.34E-10 56.0559 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#21306 - CGI_10008628 superfamily 246918 659 711 1.44E-09 55.2855 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#21306 - CGI_10008628 superfamily 246918 773 825 5.12E-09 53.7447 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. 
Q#21306 - CGI_10008628 superfamily 246918 602 654 1.25E-08 52.5891 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#21306 - CGI_10008628 superfamily 246918 260 307 1.34E-08 52.5891 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#21306 - CGI_10008628 superfamily 246918 67 113 5.39E-08 51.0483 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#21306 - CGI_10008628 superfamily 246918 119 170 1.04E-07 49.8927 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#21306 - CGI_10008628 superfamily 246918 545 597 1.23E-07 49.8927 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#21306 - CGI_10008628 superfamily 246918 830 882 4.32E-07 48.3519 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#21306 - CGI_10008628 superfamily 246918 318 368 7.39E-06 44.4999 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#21306 - CGI_10008628 superfamily 246918 204 254 3.13E-05 42.5739 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#21308 - CGI_10008630 superfamily 217925 45 90 4.27E-11 55.5901 cl04417 Ctr superfamily N - "Ctr copper transporter family; The redox active metal copper is an essential cofactor in critical biological processes such as respiration, iron transport, oxidative stress protection, hormone production, and pigmentation. A widely conserved family of high-affinity copper transport proteins (Ctr proteins) mediates copper uptake at the plasma membrane. A series of clustered methionine residues in the hydrophilic extracellular domain, and an MXXXM motif in the second transmembrane domain, are important for copper uptake. These methionine probably coordinate copper during the process of metal transport." 
Q#21309 - CGI_10008631 superfamily 241571 244 355 2.28E-14 69.3634 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#21309 - CGI_10008631 superfamily 241584 154 241 2.96E-08 51.3431 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#21309 - CGI_10008631 superfamily 241583 14 109 8.84E-38 136.932 cl00064 ZnMc superfamily C - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#21312 - CGI_10014772 superfamily 247085 728 817 0.000357997 40.9519 cl15820 RICIN superfamily N - "Ricin-type beta-trefoil; Carbohydrate-binding domain formed from presumed gene triplication. The domain is found in a variety of molecules serving diverse functions such as enzymatic activity, inhibitory toxicity and signal transduction. Highly specific ligand binding occurs on exposed surfaces of the compact domain structure." Q#21312 - CGI_10014772 superfamily 241645 519 615 0.000863927 39.5488 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins.
Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#21312 - CGI_10014772 superfamily 241645 207 272 0.00980552 36.082 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#21313 - CGI_10014773 superfamily 243146 19 57 0.00210017 34.9447 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. 
" Q#21314 - CGI_10014774 superfamily 241568 880 911 0.00663557 36.2868 cl00043 CCP superfamily N - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#21314 - CGI_10014774 superfamily 214531 695 737 2.98E-10 57.6116 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#21314 - CGI_10014774 superfamily 215683 670 712 3.44E-08 51.7871 cl18339 Ldl_recept_b superfamily - - Low-density lipoprotein receptor repeat class B; This domain is also known as the YWTD motif after the most conserved region of the repeat. The YWTD repeat is found in multiple tandem repeats and has been predicted to form a beta-propeller structure. Q#21314 - CGI_10014774 superfamily 214531 70 112 1.48E-07 49.9077 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#21314 - CGI_10014774 superfamily 214531 26 69 2.34E-06 46.4409 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. 
Also present in a variety of molecules similar to gp300/megalin. Q#21314 - CGI_10014774 superfamily 241578 182 214 0.00151473 40.0608 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#21316 - CGI_10014776 superfamily 191378 53 102 0.000437215 37.4876 cl05416 PSP94 superfamily NC - "Beta-microseminoprotein (PSP-94); This family consists of the mammalian specific protein beta-microseminoprotein. Prostatic secretory protein of 94 amino acids (PSP94), also called beta-microseminoprotein, is a small, nonglycosylated protein, rich in cysteine residues. It was first isolated as a major protein from human seminal plasma. The exact function of this protein is unknown." Q#21318 - CGI_10014778 superfamily 247042 41 154 1.36E-09 55.7998 cl15693 Sema superfamily C - "The Sema domain, a protein interacting module, of semaphorins and plexins; Both semaphorins and plexins have a Sema domain on their N-termini. Plexins function as receptors for the semaphorins. Evolutionarily, plexins may be the ancestor of semaphorins. 
Semaphorins are regulatory molecules in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems, and cancer. Semaphorins can be divided into 7 classes. Vertebrates have members in classes 3-7, whereas classes 1 and 2 are known only in invertebrates. Class 2 and 3 semaphorins are secreted; classes 1 and 4 through 6 are transmembrane proteins; and class 7 is membrane associated via glycosylphosphatidylinositol (GPI) linkage. Plexins are a large family of transmembrane proteins, which are divided into four types (A-D) according to sequence similarity. In vertebrates, type A plexins serve as co-receptors for neuropilins to mediate the signalling of class 3 semaphorins. Plexins serve as direct receptors for several other members of the semaphorin family: class 6 semaphorins signal through type A plexins and class 4 semaphorins through type B plexins. This family also includes the MET and RON receptor tyrosine kinases. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves to recognize and bind receptors." Q#21319 - CGI_10014779 superfamily 157570 1 31 0.00955277 32.6875 cl07019 SHR3_chaperone superfamily N - "ER membrane protein SH3; This family of proteins are membrane localised chaperones that are required for correct plasma membrane localisation of amino acid permeases (AAPs). SH3 prevents AAPs proteins from aggregating and assists in their correct folding. In the absence of SH3, AAPs are retained in the ER." Q#21321 - CGI_10014781 superfamily 217293 30 234 1.53E-65 213.263 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. 
Q#21321 - CGI_10014781 superfamily 202474 241 491 2.06E-29 114.673 cl08379 Neur_chan_memb superfamily - - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#21322 - CGI_10014782 superfamily 241812 37 100 0.000616695 37.0156 cl00356 Ribosomal_L17 superfamily C - Ribosomal protein L17; Ribosomal protein L17. Q#21323 - CGI_10014783 superfamily 217293 3 182 2.57E-30 115.808 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#21323 - CGI_10014783 superfamily 202474 212 288 4.49E-09 54.9673 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#21325 - CGI_10014785 superfamily 246710 82 194 1.64E-06 45.8372 cl14783 DOMON_like superfamily N - "Domon-like ligand-binding domains; DOMON-like domains can be found in all three kindgoms of life and are a diverse group of ligand binding domains that have been shown to interact with sugars and hemes. DOMON domains were initially thought to confer protein-protein interactions. They were subsequently found as a heme-binding motif in cellobiose dehydrogenase, an extracellular fungal oxidoreductase that degrades both lignin and cellulose, and in ethylbenzene dehydrogenase, an enzyme that aids in the anaerobic degradation of hydrocarbons. The domain interacts with sugars in the type 9 carbohydrate binding modules (CBM9), which are present in a variety of glycosyl hydrolases, and it can also be found at the N-terminus of sensor histidine kinases." 
Q#21327 - CGI_10014787 superfamily 243072 33 149 2.67E-19 85.513 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#21327 - CGI_10014787 superfamily 245814 178 231 0.00933124 35.1575 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#21327 - CGI_10014787 superfamily 245814 406 463 0.00171365 37.6626 cl11960 Ig superfamily C - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#21327 - CGI_10014787 superfamily 245814 542 607 0.00260013 37.0714 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#21329 - CGI_10014790 superfamily 247999 98 143 0.00146205 34.1098 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#21333 - CGI_10006564 superfamily 177822 18 299 1.84E-22 94.2165 cl18088 PLN02164 superfamily - - sulfotransferase Q#21334 - CGI_10006565 superfamily 177822 82 292 4.28E-20 87.2829 cl18088 PLN02164 superfamily N - sulfotransferase Q#21335 - CGI_10006566 superfamily 177822 34 291 1.69E-21 91.1349 cl18088 PLN02164 superfamily N - sulfotransferase Q#21337 - CGI_10006568 superfamily 243066 23 124 7.39E-13 65.3313 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. 
The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#21338 - CGI_10006569 superfamily 247677 940 1087 1.83E-43 155.52 cl17013 W2 superfamily - - "C-terminal domain of eIF4-gamma/eIF5/eIF2b-epsilon; This domain is found at the C-terminus of several translation initiation factors, including the epsilon chain of eIF2b, where it has been found to catalyze the conversion of eIF2.GDP to its active eIF2.GTP form. The structure of the domain resembles that of a set of concatenated HEAT repeats." Q#21338 - CGI_10006569 superfamily 243128 271 493 5.99E-47 167.892 cl02652 MIF4G superfamily - - "MIF4G domain; MIF4G is named after Middle domain of eukaryotic initiation factor 4G (eIF4G). Also occurs in NMD2p and CBP80. The domain is rich in alpha-helices and may contain multiple alpha-helical repeats. In eIF4G, this domain binds eIF4A, eIF3, RNA and DNA." Q#21338 - CGI_10006569 superfamily 241899 15 169 3.50E-11 62.946 cl00489 60KD_IMP superfamily C - 60Kd inner membrane protein; 60Kd inner membrane protein. Q#21338 - CGI_10006569 superfamily 243129 758 863 1.47E-05 44.543 cl02653 MA3 superfamily - - "MA3 domain; Domain in DAP-5, eIF4G, MA-3 and other proteins. Highly alpha-helical. May contain repeats and/or regions similar to MIF4G domains." Q#21339 - CGI_10006570 superfamily 241574 325 559 4.68E-100 305.664 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. 
The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#21341 - CGI_10006572 superfamily 218663 4 380 1.62E-114 340.785 cl05280 PAXNEB superfamily - - "PAXNEB protein; PAXNEB or PAX6 neighbor is found in several eukaryotic organisms. PAXNED is an RNA polymerase II Elongator protein subunit. It is part of the HAP subcomplex of Elongator, which is a six-subunit component of the RNA polymerase II holoenzyme. The HAP subcomplex is required for Elongator structural integrity and histone acetyltransferase activity. This protein family has a P-loop motif. However its sequence has degraded in many members of the family." Q#21343 - CGI_10006574 superfamily 202184 10 87 5.30E-18 73.0822 cl03509 UCR_14kD superfamily N - "Ubiquinol-cytochrome C reductase complex 14kD subunit; The ubiquinol-cytochrome C reductase complex (cytochrome bc1 complex) is a respiratory multienzyme complex. This Pfam family represents the 14kD (or VI) subunit of the complex which is not directly involved in electron transfer, but has a role in assembly of the complex." Q#21344 - CGI_10006575 superfamily 192997 659 782 4.55E-51 178.544 cl18184 Sterol-sensing superfamily - - "Sterol-sensing domain of SREBP cleavage-activation; Sterol regulatory element-binding proteins (SREBPs) are membrane-bound transcription factors that promote lipid synthesis in animal cells. They are embedded in the membranes of the endoplasmic reticulum (ER) in a helical hairpin orientation and are released from the ER by a two-step proteolytic process. 
Proteolysis begins when the SREBPs are cleaved at Site-1, which is located at a leucine residue in the middle of the hydrophobic loop in the lumen of the ER. Upon proteolytic processing SREBP can activate the expression of genes involved in cholesterol biosynthesis and uptake. SCAP stimulates cleavage of SREBPs via fusion of the their two C-termini. This domain is the transmembrane region that traverses the membrane eight times and is the sterol-sensing domain of the cleavage protein. WD40 domains are found towards the C-terminus." Q#21347 - CGI_10010843 superfamily 241584 278 361 0.000602883 37.8611 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#21350 - CGI_10010846 superfamily 219276 371 535 8.02E-61 200.434 cl06189 Mic1 superfamily - - Colon cancer-associated protein Mic1-like; This family represents the C-terminus (approximately 160 residues) of a number of proteins that resemble colon cancer-associated protein Mic1. Q#21352 - CGI_10010848 superfamily 218549 76 234 1.31E-07 49.7507 cl18461 Mito_fiss_reg superfamily C - "Mitochondrial fission regulator; In eukaryotes, this family of proteins induces mitochondrial fission." Q#21355 - CGI_10010851 superfamily 247856 372 434 0.00363135 35.9865 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. 
Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#21356 - CGI_10010852 superfamily 241563 286 326 9.42E-06 44.0072 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#21357 - CGI_10010853 superfamily 241574 168 395 1.55E-92 280.626 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#21358 - CGI_10010854 superfamily 215724 41 334 1.12E-111 329.965 cl14706 wnt superfamily - - "wnt family; Wnt genes have been identified in vertebrates and invertebrates but not in plants, unicellular eukaryotes or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathway) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families." 
Q#21360 - CGI_10010856 superfamily 192535 43 334 0.000534109 39.8866 cl18179 7TM_GPCR_Srsx superfamily - - Serpentine type 7TM GPCR chemoreceptor Srsx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srsx is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#21361 - CGI_10010857 superfamily 243053 404 634 6.24E-52 180.141 cl02485 RasGEF superfamily - - "Guanine nucleotide exchange factor for Ras-like small GTPases. Small GTP-binding proteins of the Ras superfamily function as molecular switches in fundamental events such as signal transduction, cytoskeleton dynamics and intracellular trafficking. Guanine-nucleotide-exchange factors (GEFs) positively regulate these GTP-binding proteins in response to a variety of signals. GEFs catalyze the dissociation of GDP from the inactive GTP-binding proteins. GTP can then bind and induce structural changes that allow interaction with effectors." Q#21361 - CGI_10010857 superfamily 243067 248 373 1.87E-15 73.6007 cl02520 REM superfamily - - "Guanine nucleotide exchange factor for Ras-like GTPases; N-terminal domain (RasGef_N), also called REM domain (Ras exchanger motif). This domain is common in nucleotide exchange factors for Ras-like small GTPases and is typically found immediately N-terminal to the RasGef (Cdc25-like) domain. REM contacts the GTPase and is assumed to participate in the catalytic activity of the exchange factor. Proteins with the REM domain include Sos1 and Sos2, which relay signals from tyrosine-kinase mediated signalling to Ras, RasGRP1-4, RasGRF1,2, CNrasGEF, and RAP-specific nucleotide exchange factors, to name a few." 
Q#21362 - CGI_10010858 superfamily 247743 2016 2156 0.000940959 41.3627 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#21362 - CGI_10010858 superfamily 193256 2342 2611 3.79E-64 223.286 cl18189 AAA_8 superfamily - - "P-loop containing dynein motor region D4; The 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This particular family is the D4 ATP-binding region of the motor." Q#21362 - CGI_10010858 superfamily 193257 2995 3218 2.53E-46 170.166 cl15086 AAA_9 superfamily - - "ATP-binding dynein motor region D5; The 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This particular family is the D5 ATP-binding region of the motor, but has lost its P-loop." 
Q#21362 - CGI_10010858 superfamily 193253 2624 2968 1.45E-38 150.958 cl15084 MT superfamily - - "Microtubule-binding stalk of dynein motor; the 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This family is the region between D4 and D5 and is the two predicted alpha-helical coiled coil segments that form the stalk supporting the ATP-sensitive microtubule binding component." Q#21362 - CGI_10010858 superfamily 247743 1646 1789 8.97E-06 47.29 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#21365 - CGI_10010861 superfamily 192535 50 148 0.000290818 39.8866 cl18179 7TM_GPCR_Srsx superfamily C - Serpentine type 7TM GPCR chemoreceptor Srsx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srsx is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. 
Q#21367 - CGI_10010863 superfamily 241868 19 154 4.78E-44 144.254 cl00447 Nudix_Hydrolase superfamily - - "Nudix hydrolase is a superfamily of enzymes found in all three kingdoms of life, and it catalyzes the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+ for their activity. Members of this family are recognized by a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which forms a structural motif that functions as a metal binding and catalytic site. Substrates of nudix hydrolase include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance and "house-cleaning" enzymes. Substrate specificity is used to define child families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. This superfamily consists of at least nine families: IPP (isopentenyl diphosphate) isomerase, ADP ribose pyrophosphatase, mutT pyrophosphohydrolase, coenzyme-A pyrophosphatase, MTH1-7,8-dihydro-8-oxoguanine-triphosphatase, diadenosine tetraphosphate hydrolase, NADH pyrophosphatase, GDP-mannose hydrolase and the c-terminal portion of the mutY adenine glycosylase." 
Q#21368 - CGI_10010864 superfamily 241868 1 90 7.90E-29 102.653 cl00447 Nudix_Hydrolase superfamily N - "Nudix hydrolase is a superfamily of enzymes found in all three kingdoms of life, and it catalyzes the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+ for their activity. Members of this family are recognized by a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which forms a structural motif that functions as a metal binding and catalytic site. Substrates of nudix hydrolase include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance and "house-cleaning" enzymes. Substrate specificity is used to define child families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. This superfamily consists of at least nine families: IPP (isopentenyl diphosphate) isomerase, ADP ribose pyrophosphatase, mutT pyrophosphohydrolase, coenzyme-A pyrophosphatase, MTH1-7,8-dihydro-8-oxoguanine-triphosphatase, diadenosine tetraphosphate hydrolase, NADH pyrophosphatase, GDP-mannose hydrolase and the c-terminal portion of the mutY adenine glycosylase." 
Q#21369 - CGI_10010865 superfamily 247792 39 82 0.00116984 36.2696 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#21369 - CGI_10010865 superfamily 190233 102 151 0.00065749 37.0486 cl08341 zf-TRAF superfamily - - TRAF-type zinc finger; TRAF-type zinc finger. Q#21371 - CGI_10010867 superfamily 215647 2 152 0.000606618 39.1289 cl18338 7tm_2 superfamily C - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GPCRs). They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." 
Q#21372 - CGI_10005565 superfamily 241752 27 146 1.17E-36 123.969 cl00283 ADP_ribosyl superfamily - - "ADP_ribosylating enzymes catalyze the transfer of ADP_ribose from NAD+ to substrates. Bacterial toxins are cytoplasmic and catalyze the transfer of a single ADP_ribose unit to eukaryotic elongation factor 2, halting protein synthesis and killing the cell. Poly(ADP-ribose) polymerases (PARPS 1-3, VPARP, tankyrase) catalyze the addition of up to 100 ADP_ribose units from NAD+. PARPs 1 and 2 are localized in the nucleus, bind DNA, and are activated by DNA damage. VPARP is part of the vault ribonucleoprotein complex. Tankyrases regulate telomere length in part through poly(ADP_ribosylation) of telomere repeat binding factor 1 (TRF1). Poly(ADP-ribose) polymerase catalyses the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active." 
Q#21373 - CGI_10005566 superfamily 247684 9 78 4.57E-13 66.5319 cl17037 NBD_sugar-kinase_HSP70_actin superfamily NC - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#21373 - CGI_10005566 superfamily 247684 66 102 3.37E-05 43.0348 cl17037 NBD_sugar-kinase_HSP70_actin superfamily N - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#21374 - CGI_10005567 superfamily 243035 115 186 5.07E-16 70.7965 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. 
Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. 
A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#21376 - CGI_10005569 superfamily 217473 105 144 0.00388722 38.115 cl03978 Mab-21 superfamily C - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#21377 - CGI_10005570 superfamily 215647 716 929 7.27E-38 142.748 cl18338 7tm_2 superfamily - - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GPCRs). They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#21377 - CGI_10005570 superfamily 243086 628 674 3.01E-10 57.385 cl02559 GPS superfamily - - "Latrophilin/CL-1-like GPS domain; Domain present in latrophilin/CL-1, sea urchin REJ and polycystin." 
Q#21378 - CGI_10005571 superfamily 220695 99 292 0.000126435 42.5659 cl18571 7TM_GPCR_Srx superfamily C - Serpentine type 7TM GPCR chemoreceptor Srx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srx is part of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#21380 - CGI_10005573 superfamily 241646 8 51 6.78E-07 42.4378 cl00156 WAP superfamily - - "whey acidic protein-type four-disulfide core domains. Members of the family include whey acidic protein, elafin (elastase-specific inhibitor), caltrin-like protein (a calcium transport inhibitor) and other extracellular proteinase inhibitors. A group of proteins containing 8 characteristically-spaced cysteine residues forming disulphide bonds, have been termed '4-disulphide core' proteins. Protease inhibition occurs by insertion of the inhibitory loop into the active site pocket and interference with the catalytic residues of the protease." Q#21381 - CGI_10005892 superfamily 220782 53 253 2.27E-33 122.103 cl11134 Stk19 superfamily - - "Serine-threonine protein kinase 19; This serine-threonine protein kinase number 19 is expressed from the MHC and predominantly in the nucleus. Protein kinases are involved in signal transduction pathways and play fundamental roles in the regulation of cell functions. This is a novel Ser/Thr protein kinase, that has Mn2+-dependent protein kinase activity that phosphorylates alpha-casein at Ser/Thr residues and histone at Ser residues. It can be covalently modified by the reactive ATP analogue 5'-p-fluorosulfonylbenzoyladenosine in the absence of ATP, and this modification is prevented in the presence of 1 mM ATP, indicating that the kinase domain is capable of binding ATP." 
Q#21382 - CGI_10005893 superfamily 204014 245 310 5.08E-20 83.0329 cl07321 RAI1 superfamily - - "RAI1 like PD-(D/E)XK nuclease; RAI1 is homologous to Caenorhabditis elegans DOM-3 and human DOM3Z and binds to a nuclear exoribonuclease. It is required for 5.8S rRNA processing. Profile-profile comparison tools demonstrate this to be a PD-(D/E)XK nuclease, with a full set of canonical active site signature motifs characteristic to the PD-(D/E)XK nuclease superfamily." Q#21383 - CGI_10005894 superfamily 247804 196 244 0.000810329 37.1698 cl17250 SANT superfamily - - "'SWI3, ADA2, N-CoR and TFIIIB' DNA-binding domains. Tandem copies of the domain bind telomeric DNA tandem repeats as part of the capping complex. Binding is sequence dependent for repeats which contain the G/C rich motif [C2-3 A (CA)1-6]. The domain is also found in regulatory transcriptional repressor complexes where it also binds DNA." Q#21383 - CGI_10005894 superfamily 241760 371 419 1.87E-12 62.0655 cl00295 ZZ superfamily - - "Zinc finger, ZZ type. Zinc finger present in dystrophin, CBP/p300 and many other proteins. The ZZ motif coordinates one or two zinc ions and most likely participates in ligand binding or molecular scaffolding. Many proteins containing ZZ motifs have other zinc-binding motifs as well, and the majority serve as scaffolds in pathways involving acetyltransferase, protein kinase, or ubiquitin-related activity. ZZ proteins can be grouped into the following functional classes: chromatin modifying, cytoskeletal scaffolding, ubiquitin binding or conjugating, and membrane receptor or ion-channel modifying proteins." Q#21393 - CGI_10007216 superfamily 243082 317 580 2.35E-102 320.809 cl02553 Peptidase_C19 superfamily C - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. 
They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." Q#21393 - CGI_10007216 superfamily 241643 677 710 2.50E-11 60.1655 cl00153 UBA superfamily - - "Ubiquitin Associated domain. The UBA domain is a commonly occurring sequence motif in some members of the ubiquitination pathway, UV excision repair proteins, and certain protein kinases. Although its specific role is so far unknown, it has been suggested that UBA domains are involved in conferring protein target specificity. The domain, a compact three helix bundle, has a conserved GFP-loop and the proline is thought to be critical for binding. The UBA domain is distinct from the conserved three helical domain seen in the N-terminus of EF-TS and eukaryotic NAC proteins." Q#21393 - CGI_10007216 superfamily 241643 609 647 4.15E-08 50.5355 cl00153 UBA superfamily - - "Ubiquitin Associated domain. The UBA domain is a commonly occurring sequence motif in some members of the ubiquitination pathway, UV excision repair proteins, and certain protein kinases. Although its specific role is so far unknown, it has been suggested that UBA domains are involved in conferring protein target specificity. The domain, a compact three helix bundle, has a conserved GFP-loop and the proline is thought to be critical for binding. The UBA domain is distinct from the conserved three helical domain seen in the N-terminus of EF-TS and eukaryotic NAC proteins." Q#21393 - CGI_10007216 superfamily 243082 736 793 1.94E-25 107.409 cl02553 Peptidase_C19 superfamily N - "Peptidase C19 contains ubiquitinyl hydrolases. 
They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." Q#21393 - CGI_10007216 superfamily 245220 189 262 3.81E-20 85.9194 cl09957 zf-UBP superfamily - - Zn-finger in ubiquitin-hydrolases and other protein; Zn-finger in ubiquitin-hydrolases and other protein. Q#21394 - CGI_10007217 superfamily 247725 89 234 2.30E-48 166 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting the protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and other proteins." Q#21395 - CGI_10007218 superfamily 246908 70 164 4.33E-19 79.3934 cl15255 SH2 superfamily - - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 
They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." Q#21398 - CGI_10007221 superfamily 241874 73 418 4.37E-123 374.125 cl00456 SLC5-6-like_sbd superfamily N - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." 
Q#21398 - CGI_10007221 superfamily 241874 20 87 1.43E-28 117.967 cl00456 SLC5-6-like_sbd superfamily C - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#21399 - CGI_10007222 superfamily 241644 1 127 1.12E-57 177.78 cl00154 UBCc superfamily - - "Ubiquitin-conjugating enzyme E2, catalytic (UBCc) domain. This is part of the ubiquitin-mediated protein degradation pathway in which a thiol-ester linkage forms between a conserved cysteine and the C-terminus of ubiquitin and complexes with ubiquitin protein ligase enzymes, E3. This pathway regulates many fundamental cellular processes. There are also other E2s which form thiol-ester linkages without the use of E3s as well as several UBC homologs (TSG101, Mms2, Croc-1 and similar proteins) which lack the active site cysteine essential for ubiquitination and appear to function in DNA repair pathways which were omitted from the scope of this CD." 
Q#21400 - CGI_10007223 superfamily 241607 383 421 7.94E-10 55.3538 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chymotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters), which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#21401 - CGI_10008423 superfamily 248318 262 314 9.68E-22 86.3357 cl17764 FYVE superfamily - - "FYVE domain; Zinc-binding domain; targets proteins to membrane lipids via interaction with phosphatidylinositol-3-phosphate, PI3P; present in Fab1, YOTB, Vac1, and EEA1;" Q#21402 - CGI_10008424 superfamily 247750 8 147 2.75E-51 170.164 cl17196 E1_enzyme_family superfamily C - "Superfamily of activating enzymes (E1) of the ubiquitin-like proteins. 
This family includes classical ubiquitin-activating enzymes E1, ubiquitin-like (ubl) activating enzymes and other mechanistic homologs, like MoeB, Thif1 and others. The common reaction mechanism catalyzed by MoeB, ThiF and the E1 enzymes begins with a nucleophilic attack of the C-terminal carboxylate of MoaD, ThiS and ubiquitin, respectively, on the alpha-phosphate of an ATP molecule bound at the active site of the activating enzymes, leading to the formation of a high-energy acyladenylate intermediate and subsequently to the formation of a thiocarboxylate at the C termini of MoaD and ThiS." Q#21402 - CGI_10008424 superfamily 247750 266 317 6.45E-15 70.7823 cl17196 E1_enzyme_family superfamily N - "Superfamily of activating enzymes (E1) of the ubiquitin-like proteins. This family includes classical ubiquitin-activating enzymes E1, ubiquitin-like (ubl) activating enzymes and other mechanistic homologs, like MoeB, Thif1 and others. The common reaction mechanism catalyzed by MoeB, ThiF and the E1 enzymes begins with a nucleophilic attack of the C-terminal carboxylate of MoaD, ThiS and ubiquitin, respectively, on the alpha-phosphate of an ATP molecule bound at the active site of the activating enzymes, leading to the formation of a high-energy acyladenylate intermediate and subsequently to the formation of a thiocarboxylate at the C termini of MoaD and ThiS." Q#21404 - CGI_10008426 superfamily 247755 132 234 7.07E-61 193.196 cl17201 ABC_ATPase superfamily N - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. 
ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#21404 - CGI_10008426 superfamily 217962 24 96 0.006145 35.314 cl09558 TPD52 superfamily C - "Tumour protein D52 family; The hD52 gene was originally identified through its elevated expression level in human breast carcinoma. Cloning of D52 homologues from other species has indicated that D52 may play roles in calcium-mediated signal transduction and cell proliferation. Two human homologues of hD52, hD53 and hD54, have also been identified, demonstrating the existence of a novel gene/protein family. These proteins have an amino terminal coiled-coil that allows members to form homo- and heterodimers with each other." Q#21409 - CGI_10008431 superfamily 243310 27 265 3.23E-77 236.751 cl03120 ELO superfamily - - "GNS1/SUR4 family; Members of this family are involved in long chain fatty acid elongation systems that produce the 26-carbon precursors for ceramide and sphingolipid synthesis. Predicted to be integral membrane proteins, in eukaryotes they are probably located on the endoplasmic reticulum. Yeast ELO3 affects plasma membrane H+-ATPase activity, and may act on a glucose-signaling pathway that controls the expression of several genes that are transcriptionally regulated by glucose such as PMA1." Q#21410 - CGI_10008432 superfamily 245602 183 375 3.27E-37 139.684 cl11402 GH31 superfamily N - "The enzymes of glycosyl hydrolase family 31 (GH31) occur in prokaryotes, eukaryotes, and archaea with a wide range of hydrolytic activities, including alpha-glucosidase (glucoamylase and sucrase-isomaltase), alpha-xylosidase, 6-alpha-glucosyltransferase, 3-alpha-isomaltosyltransferase and alpha-1,4-glucan lyase. 
All GH31 enzymes cleave a terminal carbohydrate moiety from a substrate that varies considerably in size, depending on the enzyme, and may be either a starch or a glycoprotein. In most cases, the pyranose moiety recognized in subsite -1 of the substrate binding site is an alpha-D-glucose, though some GH31 family members show a preference for alpha-D-xylose. Several GH31 enzymes can accommodate both glucose and xylose and different levels of discrimination between the two have been observed. Most characterized GH31 enzymes are alpha-glucosidases. In mammals, GH31 members with alpha-glucosidase activity are implicated in at least three distinct biological processes. The lysosomal acid alpha-glucosidase (GAA) is essential for glycogen degradation and a deficiency or malfunction of this enzyme causes glycogen storage disease II, also known as Pompe disease. In the endoplasmic reticulum, alpha-glucosidase II catalyzes the second step in the N-linked oligosaccharide processing pathway that constitutes part of the quality control system for glycoprotein folding and maturation. The intestinal enzymes sucrase-isomaltase (SI) and maltase-glucoamylase (MGAM) play key roles in the final stage of carbohydrate digestion, making alpha-glucosidase inhibitors useful in the treatment of type 2 diabetes. GH31 alpha-glycosidases are retaining enzymes that cleave their substrates via an acid/base-catalyzed, double-displacement mechanism involving a covalent glycosyl-enzyme intermediate. Two aspartic acid residues have been identified as the catalytic nucleophile and the acid/base, respectively." 
Q#21410 - CGI_10008432 superfamily 245602 109 193 0.00531776 36.8099 cl11402 GH31 superfamily C - "The enzymes of glycosyl hydrolase family 31 (GH31) occur in prokaryotes, eukaryotes, and archaea with a wide range of hydrolytic activities, including alpha-glucosidase (glucoamylase and sucrase-isomaltase), alpha-xylosidase, 6-alpha-glucosyltransferase, 3-alpha-isomaltosyltransferase and alpha-1,4-glucan lyase. All GH31 enzymes cleave a terminal carbohydrate moiety from a substrate that varies considerably in size, depending on the enzyme, and may be either a starch or a glycoprotein. In most cases, the pyranose moiety recognized in subsite -1 of the substrate binding site is an alpha-D-glucose, though some GH31 family members show a preference for alpha-D-xylose. Several GH31 enzymes can accommodate both glucose and xylose and different levels of discrimination between the two have been observed. Most characterized GH31 enzymes are alpha-glucosidases. In mammals, GH31 members with alpha-glucosidase activity are implicated in at least three distinct biological processes. The lysosomal acid alpha-glucosidase (GAA) is essential for glycogen degradation and a deficiency or malfunction of this enzyme causes glycogen storage disease II, also known as Pompe disease. In the endoplasmic reticulum, alpha-glucosidase II catalyzes the second step in the N-linked oligosaccharide processing pathway that constitutes part of the quality control system for glycoprotein folding and maturation. The intestinal enzymes sucrase-isomaltase (SI) and maltase-glucoamylase (MGAM) play key roles in the final stage of carbohydrate digestion, making alpha-glucosidase inhibitors useful in the treatment of type 2 diabetes. GH31 alpha-glycosidases are retaining enzymes that cleave their substrates via an acid/base-catalyzed, double-displacement mechanism involving a covalent glycosyl-enzyme intermediate. 
Two aspartic acid residues have been identified as the catalytic nucleophile and the acid/base, respectively." Q#21413 - CGI_10008435 superfamily 243555 2 148 1.65E-09 55.0898 cl03871 Chitin_bind_3 superfamily N - "Chitin binding domain; This domain is found associated with a wide variety of cellulose binding domain. This domain however is a chitin binding domain. This domain is found in isolation in baculoviral spheroidins and spindolins, protein of unknown function." Q#21417 - CGI_10005546 superfamily 192604 70 136 1.01E-16 70.4058 cl11135 PACT_coil_coil superfamily C - "Pericentrin-AKAP-450 domain of centrosomal targeting protein; This domain is a coiled-coil region close to the C-terminus of centrosomal proteins that is directly responsible for recruiting AKAP-450 and pericentrin to the centrosome. Hence the suggested name for this region is a PACT domain (pericentrin-AKAP-450 centrosomal targeting). This domain is also present at the C-terminus of coiled-coil proteins from Drosophila and S. pombe, and that from the Drosophila protein is sufficient for targeting to the centrosome in mammalian cells. The function of these proteins is unknown but they seem good candidates for having a centrosomal or spindle pole body location. The final 22 residues of this domain in AKAP-450 appear specifically to be a calmodulin-binding domain indicating that this member at least is likely to contribute to centrosome assembly." Q#21419 - CGI_10005548 superfamily 245201 1 172 5.68E-37 134.163 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. 
These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#21419 - CGI_10005548 superfamily 242406 200 292 8.55E-31 114.226 cl01271 DUF1768 superfamily N - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. Q#21422 - CGI_10018731 superfamily 220131 17 212 8.36E-46 156.282 cl11721 DUF1943 superfamily C - "Domain of unknown function (DUF1943); Members of this family adopt a structure consisting of several large open beta-sheets. Their exact function has not, as yet, been determined." Q#21430 - CGI_10018739 superfamily 242113 1 186 5.57E-42 143.236 cl00814 Cyclase superfamily N - Putative cyclase; Proteins in this family are thought to be cyclase enzymes. They are found in proteins involved in antibiotic synthesis. However they are also found in organisms that do not make antibiotics pointing to a wider role for these proteins. The proteins contain a conserved motif HXGTHXDXPXH that is likely to form part of the active site. Q#21431 - CGI_10018740 superfamily 242113 5 121 3.44E-28 104.716 cl00814 Cyclase superfamily N - Putative cyclase; Proteins in this family are thought to be cyclase enzymes. They are found in proteins involved in antibiotic synthesis. However they are also found in organisms that do not make antibiotics pointing to a wider role for these proteins. The proteins contain a conserved motif HXGTHXDXPXH that is likely to form part of the active site. Q#21432 - CGI_10018741 superfamily 242113 1 186 3.31E-44 149.014 cl00814 Cyclase superfamily N - Putative cyclase; Proteins in this family are thought to be cyclase enzymes. They are found in proteins involved in antibiotic synthesis. 
However they are also found in organisms that do not make antibiotics pointing to a wider role for these proteins. The proteins contain a conserved motif HXGTHXDXPXH that is likely to form part of the active site. Q#21436 - CGI_10018746 superfamily 248054 75 130 3.99E-06 44.7704 cl17500 NAD_binding_8 superfamily - - NAD(P)-binding Rossmann-like domain; NAD(P)-binding Rossmann-like domain. Q#21437 - CGI_10018747 superfamily 248054 28 80 0.00049581 38.222 cl17500 NAD_binding_8 superfamily N - NAD(P)-binding Rossmann-like domain; NAD(P)-binding Rossmann-like domain. Q#21440 - CGI_10024508 superfamily 245226 2 132 8.21E-13 61.1625 cl10012 DnaQ_like_exo superfamily - - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. 
Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases is accompanied by horizontal gene transfer." Q#21442 - CGI_10024510 superfamily 216363 28 96 3.29E-07 43.613 cl08312 UPF0029 superfamily C - Uncharacterized protein family UPF0029; Uncharacterized protein family UPF0029. Q#21444 - CGI_10024512 superfamily 221154 26 191 1.07E-09 57.4104 cl13144 TAN superfamily - - "Telomere-length maintenance and DNA damage repair; ATM is a large protein kinase, in humans, critical for responding to DNA double-strand breaks (DSBs). Tel1, the orthologue from budding yeast, also regulates responses to DSBs. Tel1 is important for maintaining viability and for phosphorylation of the DNA damage signal transducer kinase Rad53 (an orthologue of mammalian CHK2). In addition to functioning in the response to DSBs, numerous findings indicate that Tel1/ATM regulates telomeres. The overall domain structure of Tel1/ATM is shared by proteins of the phosphatidylinositol 3-kinase (PI3K)-related kinase (PIKK) family, but this family carries a unique and functionally important TAN sequence motif, near its N-terminal, LxxxKxxE/DRxxxL, which is conserved specifically in the Tel1/ATM subclass of the PIKKs. The TAN motif is essential for both telomere length maintenance and Tel1 action in response to DNA damage. It is classified as an EC:2.7.11.1." Q#21449 - CGI_10024517 superfamily 242485 35 99 7.66E-10 50.7399 cl01407 Rdx superfamily - - "Rdx family; This entry is an approximately 100 residue region of selenoprotein-T, conserved from plants to humans. The protein binds to UDP-glucose:glycoprotein glucosyltransferase (UGTR), the endoplasmic reticulum (ER)-resident protein, which is known to be involved in the quality control of protein folding. Selenium (Se) plays an essential role in cell survival and most of the effects of Se are probably mediated by selenoproteins, including selenoprotein T. 
However, despite its binding to UGTR and that its mRNA is up-regulated in extended asphyxia, the function of the protein and hence of this region of it is unknown. Selenoprotein W contains selenium as selenocysteine in the primary protein structure and levels of this selenoprotein are affected by selenium." Q#21450 - CGI_10024518 superfamily 220074 22 301 1.64E-144 411.683 cl07499 DUF1907 superfamily - - "Domain of Unknown Function (DUF1907); The structure of this domain displays an alpha-beta-beta-alpha four layer topology, with an HxHxxxxxxxxxH motif that coordinates a zinc ion, and an acetate anion at a site that likely supports the enzymatic activity of an ester hydrolase." Q#21451 - CGI_10024519 superfamily 243072 49 188 6.01E-08 50.845 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#21452 - CGI_10024520 superfamily 245201 150 400 1.57E-70 227.403 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." 
Q#21453 - CGI_10024521 superfamily 248097 123 250 7.55E-21 85.0094 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#21454 - CGI_10024522 superfamily 248097 201 327 3.47E-23 92.7134 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#21455 - CGI_10024523 superfamily 245201 55 321 4.35E-66 216.334 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#21456 - CGI_10024524 superfamily 241704 337 462 1.86E-23 96.2884 cl00227 PEBP superfamily - - "PhosphatidylEthanolamine-Binding Protein (PEBP) domain; PhosphatidylEthanolamine-Binding Proteins (PEBPs) are represented in all three major phylogenetic divisions (eukaryotes, bacteria, archaea). A number of biological roles for members of the PEBP family include serine protease inhibition, membrane biogenesis, regulation of flowering plant stem architecture, and Raf-1 kinase inhibition. Although their overall structures are similar, the members of the PEBP family bind very different substrates including phospholipids, opioids, and hydrophobic odorant molecules as well as having different oligomerization states (monomer/dimer/tetramer)." 
Q#21456 - CGI_10024524 superfamily 241704 118 250 2.22E-08 52.3756 cl00227 PEBP superfamily - - "PhosphatidylEthanolamine-Binding Protein (PEBP) domain; PhosphatidylEthanolamine-Binding Proteins (PEBPs) are represented in all three major phylogenetic divisions (eukaryotes, bacteria, archaea). A number of biological roles for members of the PEBP family include serine protease inhibition, membrane biogenesis, regulation of flowering plant stem architecture, and Raf-1 kinase inhibition. Although their overall structures are similar, the members of the PEBP family bind very different substrates including phospholipids, opioids, and hydrophobic odorant molecules as well as having different oligomerization states (monomer/dimer/tetramer)." Q#21458 - CGI_10024526 superfamily 247743 40 174 3.30E-22 90.6683 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#21458 - CGI_10024526 superfamily 203973 245 333 2.91E-29 108.748 cl16006 Rep_fac_C superfamily - - "Replication factor C C-terminal domain; This is the C-terminal domain of RFC (replication factor-C) protein of the clamp loader complex which binds to the DNA sliding clamp (proliferating cell nuclear antigen, PCNA). 
The five modules of RFC assemble into a right-handed spiral, which results in only three of the five RFC subunits (RFC-A, RFC-B and RFC-C) making contact with PCNA, leaving a wedge-shaped gap between RFC-E and the PCNA clamp-loader complex. The C-terminal is vital for the correct orientation of RFC-E with respect to RFC-A." Q#21459 - CGI_10024527 superfamily 217868 156 331 9.61E-59 192.457 cl04384 DUF383 superfamily - - Domain of unknown function (DUF383); Domain of unknown function (DUF383). Q#21459 - CGI_10024527 superfamily 190851 336 393 2.75E-14 67.2738 cl04385 DUF384 superfamily - - Domain of unknown function (DUF384); Domain of unknown function (DUF384). Q#21461 - CGI_10024529 superfamily 247805 283 425 2.11E-27 110.12 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#21461 - CGI_10024529 superfamily 219532 918 1020 8.65E-22 92.7626 cl06657 OB_NTP_bind superfamily - - "Oligonucleotide/oligosaccharide-binding (OB)-fold; This family is found towards the C-terminus of the DEAD-box helicases (pfam00270). In these helicases it is apparently always found in association with pfam04408. There do seem to be a couple of instances where it occurs by itself. The structure PDB:3i4u adopts an OB-fold. This C-terminal domain of the yeast helicase contains an oligonucleotide/oligosaccharide-binding (OB)-fold which seems to be placed at the entrance of the putative nucleic acid cavity. It also constitutes the binding site for the G-patch-containing domain of Pfa1p. When found on DEAH/RHA helicases, this domain is central to the regulation of the helicase activity through its binding of both RNA and G-patch domain proteins." 
Q#21461 - CGI_10024529 superfamily 243778 743 835 5.27E-13 66.8639 cl04503 HA2 superfamily - - "Helicase associated domain (HA2); This presumed domain is about 90 amino acid residues in length. It is found in a diverse set of RNA helicases. Its function is unknown, however it seems likely to be involved in nucleic acid binding." Q#21461 - CGI_10024529 superfamily 247905 596 645 1.16E-06 47.9769 cl17351 HELICc superfamily C - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#21463 - CGI_10024531 superfamily 245213 69 102 6.38E-06 42.6238 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#21463 - CGI_10024531 superfamily 245213 212 254 6.92E-05 39.5422 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#21463 - CGI_10024531 superfamily 245213 29 64 0.000537414 36.8458 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#21463 - CGI_10024531 superfamily 245213 169 201 0.00189042 35.305 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#21463 - CGI_10024531 superfamily 201391 132 169 1.79E-06 44.2446 cl02926 TB superfamily - - TB domain; This domain is also known as the 8 cysteine domain. This family includes the hybrid domains. 
This cysteine rich repeat is found in TGF binding protein and fibrillin. Q#21463 - CGI_10024531 superfamily 245213 255 295 0.00263404 35.0172 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#21465 - CGI_10024533 superfamily 248097 468 587 4.94E-21 89.6318 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#21465 - CGI_10024533 superfamily 248097 193 310 2.04E-15 73.4534 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#21465 - CGI_10024533 superfamily 248097 362 447 1.20E-06 47.2969 cl17543 C1q superfamily C - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#21465 - CGI_10024533 superfamily 248097 92 176 5.72E-06 44.9486 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#21467 - CGI_10024535 superfamily 246918 263 307 1.40E-09 54.5151 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#21467 - CGI_10024535 superfamily 246918 116 164 4.58E-09 52.9743 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#21467 - CGI_10024535 superfamily 246918 62 110 3.93E-06 44.4999 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. 
Q#21467 - CGI_10024535 superfamily 246918 310 355 6.45E-06 44.1147 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#21469 - CGI_10024537 superfamily 246918 284 332 4.08E-11 57.5967 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#21469 - CGI_10024537 superfamily 246918 176 224 4.66E-09 51.8187 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#21469 - CGI_10024537 superfamily 246918 122 170 2.67E-07 46.8111 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#21469 - CGI_10024537 superfamily 246918 230 277 0.000168085 38.7219 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#21470 - CGI_10024538 superfamily 246918 21 73 2.57E-10 54.9003 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#21472 - CGI_10024540 superfamily 241913 19 112 1.71E-05 39.3706 cl00509 hot_dog superfamily C - "The hotdog fold was initially identified in the E. coli FabA (beta-hydroxydecanoyl-acyl carrier protein (ACP)-dehydratase) structure and subsequently in 4HBT (4-hydroxybenzoyl-CoA thioesterase) from Pseudomonas. A number of other seemingly unrelated proteins also share the hotdog fold. These proteins have related, but distinct, catalytic activities that include metabolic roles such as thioester hydrolysis in fatty acid metabolism, and degradation of phenylacetic acid and the environmental pollutant 4-chlorobenzoate. This superfamily also includes the PaaI-like protein FapR, a non-catalytic bacterial homolog involved in transcriptional regulation of fatty acid biosynthesis." Q#21473 - CGI_10024541 superfamily 247858 673 862 5.45E-33 125.963 cl17304 2OG-FeII_Oxy_3 superfamily - - 2OG-Fe(II) oxygenase superfamily; This family contains members of the 2-oxoglutarate (2OG) and Fe(II)-dependent oxygenase superfamily. 
Q#21473 - CGI_10024541 superfamily 203913 347 515 4.77E-13 67.2421 cl07084 P4Ha_N superfamily - - "Prolyl 4-Hydroxylase alpha-subunit, N-terminal region; The members of this family are eukaryotic proteins, and include all three isoforms of the prolyl 4-hydroxylase alpha subunit. This enzyme (EC:1.14.11.2) is important in the post-translational modification of collagen, as it catalyzes the formation of 4-hydroxyproline. In vertebrates, the complete enzyme is an alpha2-beta2 tetramer; the beta-subunit is identical to protein disulphide isomerase. The function of the N-terminal region featured in this family does not seem to be known." Q#21473 - CGI_10024541 superfamily 247799 146 196 0.000679137 38.8156 cl17245 KH-I superfamily - - "K homology RNA-binding domain, type I. KH binds single-stranded RNA or DNA. It is found in a wide variety of proteins including ribosomal proteins, transcription factors and post-transcriptional modifiers of mRNA. There are two different KH domains that belong to different protein folds, but they share a single KH motif. The KH motif is folded into a beta alpha alpha beta unit. In addition to the core, type II KH domains (e.g. ribosomal protein S3) include N-terminal extension and type I KH domains (e.g. hnRNP K) contain C-terminal extension." Q#21474 - CGI_10024542 superfamily 245596 135 249 6.73E-47 167.485 cl11394 Glyco_tranf_GTA_type superfamily N - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. 
To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#21474 - CGI_10024542 superfamily 245814 415 481 5.86E-07 48.2705 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#21474 - CGI_10024542 superfamily 247775 279 369 0.00105473 40.6982 cl17221 ArsB_NhaD_permease superfamily NC - "Anion permease ArsB/NhaD. These permeases have been shown to translocate sodium, arsenate, antimonite, sulfate and organic anions across biological membranes in all three kingdoms of life. A typical anion permease contains 8-13 transmembrane helices and can function either independently as a chemiosmotic transporter or as a channel-forming subunit of an ATP-driven anion pump." Q#21474 - CGI_10024542 superfamily 245814 512 566 0.00267949 37.0328 cl11960 Ig superfamily C - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. 
The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#21475 - CGI_10024543 superfamily 247858 377 568 7.10E-31 118.259 cl17304 2OG-FeII_Oxy_3 superfamily - - 2OG-Fe(II) oxygenase superfamily; This family contains members of the 2-oxoglutarate (2OG) and Fe(II)-dependent oxygenase superfamily. Q#21475 - CGI_10024543 superfamily 203913 73 239 3.14E-12 64.1605 cl07084 P4Ha_N superfamily - - "Prolyl 4-Hydroxylase alpha-subunit, N-terminal region; The members of this family are eukaryotic proteins, and include all three isoforms of the prolyl 4-hydroxylase alpha subunit. This enzyme (EC:1.14.11.2) is important in the post-translational modification of collagen, as it catalyzes the formation of 4-hydroxyproline. In vertebrates, the complete enzyme is an alpha2-beta2 tetramer; the beta-subunit is identical to protein disulphide isomerase. The function of the N-terminal region featured in this family does not seem to be known." Q#21476 - CGI_10024544 superfamily 209898 138 157 4.11E-05 38.8607 cl14787 MORN superfamily - - MORN repeat; The MORN (Membrane Occupation and Recognition Nexus) repeat is found in multiple copies in several proteins including junctophilins (See Takeshima et al. Mol. Cell 2000;6:11-22). A MORN-repeat protein identified in the parasite Toxoplasma gondii is a dynamic component of the cell division apparatus. It has been hypothesised to function as a linker protein between certain membrane regions and the parasite's cytoskeleton. 
Q#21476 - CGI_10024544 superfamily 209898 115 135 0.00602048 32.7606 cl14787 MORN superfamily - - MORN repeat; The MORN (Membrane Occupation and Recognition Nexus) repeat is found in multiple copies in several proteins including junctophilins (See Takeshima et al. Mol. Cell 2000;6:11-22). A MORN-repeat protein identified in the parasite Toxoplasma gondii is a dynamic component of the cell division apparatus. It has been hypothesised to function as a linker protein between certain membrane regions and the parasite's cytoskeleton. Q#21477 - CGI_10024545 superfamily 245604 49 123 2.63E-27 104.023 cl11404 Biotinyl_lipoyl_domains superfamily - - "Biotinyl_lipoyl_domains are present in biotin-dependent carboxylases/decarboxylases, the dihydrolipoyl acyltransferase component (E2) of 2-oxo acid dehydrogenases, and the H-protein of the glycine cleavage system (GCS). These domains transport CO2, acyl, or methylamine, respectively, between components of the complex/protein via a biotinyl or lipoyl group, which is covalently attached to a highly conserved lysine residue." Q#21477 - CGI_10024545 superfamily 215782 271 484 8.35E-94 285.983 cl18344 2-oxoacid_dh superfamily - - 2-oxoacid dehydrogenases acyltransferase (catalytic domain); These proteins contain one to three copies of a lipoyl binding domain followed by the catalytic domain. Q#21477 - CGI_10024545 superfamily 202412 200 238 1.41E-07 48.1873 cl03729 E3_binding superfamily - - e3 binding domain; This family represents a small domain of the E2 subunit of 2-oxo-acid dehydrogenases responsible for the binding of the E3 subunit. Q#21478 - CGI_10024546 superfamily 241580 65 160 2.45E-17 76.8903 cl00061 FH superfamily - - "Forkhead (FH), also known as a "winged helix". FH is named for the Drosophila fork head protein, a transcription factor which promotes terminal rather than segmental development. 
This family of transcription factor domains, which bind to B-DNA as monomers, are also found in the Hepatocyte nuclear factor (HNF) proteins, which provide tissue-specific gene regulation. The structure contains 2 flexible loops or "wings" in the C-terminal region, hence the term winged helix." Q#21479 - CGI_10024547 superfamily 222436 231 367 3.49E-23 96.1844 cl16454 DUF4203 superfamily N - Domain of unknown function (DUF4203); This is the N-terminal region of 7tm proteins. The function is not known. Q#21480 - CGI_10024548 superfamily 247740 262 474 4.59E-54 182.376 cl17186 TIM_phosphate_binding superfamily - - "TIM barrel proteins share a structurally conserved phosphate binding motif and in general share an eight beta/alpha closed barrel structure. Specific for this family is the conserved phosphate binding site at the edges of strands 7 and 8. The phosphate comes either from the substrate, as in the case of inosine monophosphate dehydrogenase (IMPDH), or from ribulose-5-phosphate 3-epimerase (RPE) or from cofactors, like FMN." Q#21480 - CGI_10024548 superfamily 241770 76 182 2.25E-17 78.5916 cl00309 PRTases_typeI superfamily - - "Phosphoribosyl transferase (PRT)-type I domain; Phosphoribosyl transferase (PRT) domain. The type I PRTases are identified by a conserved PRPP binding motif which features two adjacent acidic residues surrounded by one or more hydrophobic residues. PRTases catalyze the displacement of the alpha-1'-pyrophosphate of 5-phosphoribosyl-alpha1-pyrophosphate (PRPP) by a nitrogen-containing nucleophile. The reaction products are an alpha-1 substituted ribose-5'-phosphate and a free pyrophosphate (PP). PRPP, an activated form of ribose-5-phosphate, is a key metabolite connecting nucleotide synthesis and salvage pathways. 
The type I PRTase family includes a range of diverse phosphoribosyl transferase enzymes and regulatory proteins of the nucleotide synthesis and salvage pathways, including adenine phosphoribosyltransferase EC:2.4.2.7., hypoxanthine-guanine-xanthine phosphoribosyltransferase, hypoxanthine phosphoribosyltransferase EC:2.4.2.8., ribose-phosphate pyrophosphokinase EC:2.7.6.1., amidophosphoribosyltransferase EC:2.4.2.14., orotate phosphoribosyltransferase EC:2.4.2.10., uracil phosphoribosyltransferase EC:2.4.2.9., and xanthine-guanine phosphoribosyltransferase EC:2.4.2.22." Q#21481 - CGI_10024549 superfamily 241997 27 125 1.71E-26 97.4227 cl00638 RNA_pol_Rpb4 superfamily - - RNA polymerase Rpb4; This family includes the Rpb4 protein. This family also includes C17 (aka CGRP-RCP) is an essential subunit of RNA polymerase III. C17 forms a subcomplex with C25 which is likely to be the counterpart of subcomplex Rpb4/7 in Pol II. Q#21483 - CGI_10024551 superfamily 246676 340 494 3.43E-24 99.7265 cl14616 Cyt_b561 superfamily - - "Eukaryotic cytochrome b(561); Cytochrome b(561) is a family of endosomal or secretory vesicle-specific electron transport proteins. They are integral membrane proteins that bind two heme groups non-covalently, and may have six alpha-helical trans-membrane segments. This is an exclusively eukaryotic family. Members of the prokaryotic cytochrome b561 family are not deemed homologous." Q#21483 - CGI_10024551 superfamily 243146 58 115 0.000453051 38.4243 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#21483 - CGI_10024551 superfamily 243146 177 228 0.00814102 34.5723 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#21486 - CGI_10024554 superfamily 241807 38 190 3.16E-77 230.742 cl00351 Ribosomal_S17 superfamily - - Ribosomal protein S17; Ribosomal protein S17. 
Q#21487 - CGI_10024555 superfamily 243066 129 220 1.02E-19 82.2801 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#21487 - CGI_10024555 superfamily 243179 1 69 1.11E-06 45.9867 cl02781 tetraspanin_LEL superfamily N - "Tetraspanin, extracellular domain or large extracellular loop (LEL). Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. The tetraspanin family contains CD9, CD63, CD37, CD53, CD82, CD151, and CD81, amongst others. Tetraspanins are involved in diverse processes such as cell activation and proliferation, adhesion and motility, differentiation, cancer, and others. Their various functions may relate to their ability to act as molecular facilitators, grouping specific cell-surface proteins and affecting formation and stability of signaling complexes. 
Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web", which may also include integrins." Q#21489 - CGI_10024557 superfamily 243179 1 107 1.79E-17 74.4915 cl02781 tetraspanin_LEL superfamily - - "Tetraspanin, extracellular domain or large extracellular loop (LEL). Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. The tetraspanin family contains CD9, CD63, CD37, CD53, CD82, CD151, and CD81, amongst others. Tetraspanins are involved in diverse processes such as cell activation and proliferation, adhesion and motility, differentiation, cancer, and others. Their various functions may relate to their ability to act as molecular facilitators, grouping specific cell-surface proteins and affecting formation and stability of signaling complexes. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web", which may also include integrins." Q#21490 - CGI_10024558 superfamily 243066 2 95 4.81E-20 80.3541 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. 
The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#21491 - CGI_10024559 superfamily 246902 283 399 0.000255032 39.2484 cl15239 PLDc_SF superfamily - - "Catalytic domain of phospholipase D superfamily proteins; Catalytic domain of phospholipase D (PLD) superfamily proteins. The PLD superfamily is composed of a large and diverse group of proteins including plant, mammalian and bacterial PLDs, bacterial cardiolipin (CL) synthases, bacterial phosphatidylserine synthases (PSS), eukaryotic phosphatidylglycerophosphate (PGP) synthase, eukaryotic tyrosyl-DNA phosphodiesterase 1 (Tdp1), and some bacterial endonucleases (Nuc and BfiI), among others. PLD enzymes hydrolyze phospholipid phosphodiester bonds to yield phosphatidic acid and a free polar head group. They can also catalyze the transphosphatidylation of phospholipids to acceptor alcohols. The majority of members in this superfamily contain a short conserved sequence motif (H-x-K-x(4)-D, where x represents any amino acid residue), called the HKD signature motif. There are varying expanded forms of this motif in different family members. Some members contain variant HKD motifs. Most PLD enzymes are monomeric proteins with two HKD motif-containing domains. Two HKD motifs from two domains form a single active site. Some PLD enzymes have only one copy of the HKD motif per subunit but form a functionally active dimer, which has a single active site at the dimer interface containing the two HKD motifs from both subunits. Different PLD enzymes may have evolved through domain fusion of a common catalytic core with separate substrate recognition domains. 
Despite their various catalytic functions and a very broad range of substrate specificities, the diverse group of PLD enzymes can bind to a phosphodiester moiety. Most of them are active as bi-lobed monomers or dimers, and may possess similar core structures for catalytic activity. They are generally thought to utilize a common two-step ping-pong catalytic mechanism, involving an enzyme-substrate intermediate, to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group." Q#21492 - CGI_10008676 superfamily 241574 165 383 1.57E-87 267.53 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#21493 - CGI_10008677 superfamily 241574 16 183 4.44E-46 154.281 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. 
Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#21494 - CGI_10008678 superfamily 241574 146 360 1.31E-87 282.167 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#21494 - CGI_10008678 superfamily 241574 553 638 1.48E-17 83.0189 cl00053 PTPc superfamily C - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#21495 - CGI_10008679 superfamily 241574 234 446 3.94E-37 136.562 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. 
The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#21500 - CGI_10008684 superfamily 243092 964 1250 2.18E-23 102.028 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#21502 - CGI_10003647 superfamily 245213 284 320 4.87E-07 46.861 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#21502 - CGI_10003647 superfamily 245213 171 207 1.91E-06 44.935 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#21502 - CGI_10003647 superfamily 245213 210 245 2.12E-06 44.935 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#21502 - CGI_10003647 superfamily 245213 95 131 2.63E-06 44.5498 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#21502 - CGI_10003647 superfamily 245213 133 169 4.12E-05 41.083 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#21502 - CGI_10003647 superfamily 245213 362 395 5.83E-05 40.6978 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#21502 - CGI_10003647 superfamily 245213 251 282 0.000756301 37.6162 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#21502 - CGI_10003647 superfamily 245213 327 358 0.00159591 36.4606 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#21504 - CGI_10003649 superfamily 215866 7 164 4.81E-36 130.523 cl18349 Arrestin_N superfamily - - "Arrestin (or S-antigen), N-terminal domain; Ig-like beta-sandwich fold. Scop reports duplication with C-terminal domain." Q#21504 - CGI_10003649 superfamily 243212 187 356 4.28E-19 83.1621 cl02844 Arrestin_C superfamily - - "Arrestin (or S-antigen), C-terminal domain; Ig-like beta-sandwich fold. Scop reports duplication with N-terminal domain." Q#21506 - CGI_10003651 superfamily 220384 133 282 3.02E-40 142.48 cl10738 Arb2 superfamily - - "Arb2 domain; A second fission yeast Argonaute complex (Argonaute siRNA chaperone, ARC) that contains two previously uncharacterized proteins, Arb1 and Arb2, both of which are required for histone H3 Lys9 (H3-K9) methylation, heterochromatin assembly and siRNA generation. This family includes a region found in Arb2 and the Hda1 protein." 
Q#21507 - CGI_10003652 superfamily 245201 103 359 2.98E-77 254.769 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#21507 - CGI_10003652 superfamily 247683 18 74 8.74E-28 107.93 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#21510 - CGI_10010439 superfamily 197448 2950 3103 4.39E-58 200.917 cl15240 Reelin_subrepeat_like superfamily - - "Tandem repeat subunit of reelin and related proteins; Reelin is an extracellular glycoprotein involved in neuronal development, specifically in the brain cortex. 
It contains 8 tandemly repeated units, each of which is composed of two highly similar subrepeats and a central EGF domain. This model characterizes the subrepeats, which directly contact each other in a compact arrangement. Consecutive reelin repeat units are packed together to form an overall rod-like molecular structure. Reelin repeats 5 and 6 are reported to interact with neuronal receptors, the apolipoprotein E receptor 2 (ApoER2) and the very-low-density lipoprotein receptor (VLDLR), triggering a signaling cascade upon binding and subsequent tyrosine phosphorylation of the cytoplasmic disabled-1 (Dab1). Genetic deficiency of reelin, or ApoER2 and VLDLR, or Dab1, all exhibit the same phenotypes, including ataxia, cortical layer inversion and abnormal positioning patterns." Q#21510 - CGI_10010439 superfamily 197448 796 944 2.80E-53 187.049 cl15240 Reelin_subrepeat_like superfamily - - "Tandem repeat subunit of reelin and related proteins; Reelin is an extracellular glycoprotein involved in neuronal development, specifically in the brain cortex. It contains 8 tandemly repeated units, each of which is composed of two highly similar subrepeats and a central EGF domain. This model characterizes the subrepeats, which directly contact each other in a compact arrangement. Consecutive reelin repeat units are packed together to form an overall rod-like molecular structure. Reelin repeats 5 and 6 are reported to interact with neuronal receptors, the apolipoprotein E receptor 2 (ApoER2) and the very-low-density lipoprotein receptor (VLDLR), triggering a signaling cascade upon binding and subsequent tyrosine phosphorylation of the cytoplasmic disabled-1 (Dab1). Genetic deficiency of reelin, or ApoER2 and VLDLR, or Dab1, all exhibit the same phenotypes, including ataxia, cortical layer inversion and abnormal positioning patterns." 
Q#21510 - CGI_10010439 superfamily 197448 1161 1302 4.90E-48 172.027 cl15240 Reelin_subrepeat_like superfamily - - "Tandem repeat subunit of reelin and related proteins; Reelin is an extracellular glycoprotein involved in neuronal development, specifically in the brain cortex. It contains 8 tandemly repeated units, each of which is composed of two highly similar subrepeats and a central EGF domain. This model characterizes the subrepeats, which directly contact each other in a compact arrangement. Consecutive reelin repeat units are packed together to form an overall rod-like molecular structure. Reelin repeats 5 and 6 are reported to interact with neuronal receptors, the apolipoprotein E receptor 2 (ApoER2) and the very-low-density lipoprotein receptor (VLDLR), triggering a signaling cascade upon binding and subsequent tyrosine phosphorylation of the cytoplasmic disabled-1 (Dab1). Genetic deficiency of reelin, or ApoER2 and VLDLR, or Dab1, all exhibit the same phenotypes, including ataxia, cortical layer inversion and abnormal positioning patterns." Q#21510 - CGI_10010439 superfamily 197448 1865 2011 3.60E-46 166.634 cl15240 Reelin_subrepeat_like superfamily - - "Tandem repeat subunit of reelin and related proteins; Reelin is an extracellular glycoprotein involved in neuronal development, specifically in the brain cortex. It contains 8 tandemly repeated units, each of which is composed of two highly similar subrepeats and a central EGF domain. This model characterizes the subrepeats, which directly contact each other in a compact arrangement. Consecutive reelin repeat units are packed together to form an overall rod-like molecular structure. Reelin repeats 5 and 6 are reported to interact with neuronal receptors, the apolipoprotein E receptor 2 (ApoER2) and the very-low-density lipoprotein receptor (VLDLR), triggering a signaling cascade upon binding and subsequent tyrosine phosphorylation of the cytoplasmic disabled-1 (Dab1). 
Genetic deficiency of reelin, or ApoER2 and VLDLR, or Dab1, all exhibit the same phenotypes, including ataxia, cortical layer inversion and abnormal positioning patterns." Q#21510 - CGI_10010439 superfamily 246671 59 173 3.43E-12 67.0628 cl14606 Reeler_cohesin_like superfamily - - "Domains similar to the eukaryotic reeler domain and bacterial cohesins; This diverse family summarizes a set of distantly related domains, as revealed by structural similarity." Q#21510 - CGI_10010439 superfamily 197448 1668 1835 2.05E-54 191.276 cl15240 Reelin_subrepeat_like superfamily - - "Tandem repeat subunit of reelin and related proteins; Reelin is an extracellular glycoprotein involved in neuronal development, specifically in the brain cortex. It contains 8 tandemly repeated units, each of which is composed of two highly similar subrepeats and a central EGF domain. This model characterizes the subrepeats, which directly contact each other in a compact arrangement. Consecutive reelin repeat units are packed together to form an overall rod-like molecular structure. Reelin repeats 5 and 6 are reported to interact with neuronal receptors, the apolipoprotein E receptor 2 (ApoER2) and the very-low-density lipoprotein receptor (VLDLR), triggering a signaling cascade upon binding and subsequent tyrosine phosphorylation of the cytoplasmic disabled-1 (Dab1). Genetic deficiency of reelin, or ApoER2 and VLDLR, or Dab1, all exhibit the same phenotypes, including ataxia, cortical layer inversion and abnormal positioning patterns." Q#21510 - CGI_10010439 superfamily 197448 2229 2393 2.15E-50 178.937 cl15240 Reelin_subrepeat_like superfamily - - "Tandem repeat subunit of reelin and related proteins; Reelin is an extracellular glycoprotein involved in neuronal development, specifically in the brain cortex. It contains 8 tandemly repeated units, each of which is composed of two highly similar subrepeats and a central EGF domain. 
This model characterizes the subrepeats, which directly contact each other in a compact arrangement. Consecutive reelin repeat units are packed together to form an overall rod-like molecular structure. Reelin repeats 5 and 6 are reported to interact with neuronal receptors, the apolipoprotein E receptor 2 (ApoER2) and the very-low-density lipoprotein receptor (VLDLR), triggering a signaling cascade upon binding and subsequent tyrosine phosphorylation of the cytoplasmic disabled-1 (Dab1). Genetic deficiency of reelin, or ApoER2 and VLDLR, or Dab1, all exhibit the same phenotypes, including ataxia, cortical layer inversion and abnormal positioning patterns." Q#21510 - CGI_10010439 superfamily 197448 3585 3737 1.35E-49 176.91 cl15240 Reelin_subrepeat_like superfamily - - "Tandem repeat subunit of reelin and related proteins; Reelin is an extracellular glycoprotein involved in neuronal development, specifically in the brain cortex. It contains 8 tandemly repeated units, each of which is composed of two highly similar subrepeats and a central EGF domain. This model characterizes the subrepeats, which directly contact each other in a compact arrangement. Consecutive reelin repeat units are packed together to form an overall rod-like molecular structure. Reelin repeats 5 and 6 are reported to interact with neuronal receptors, the apolipoprotein E receptor 2 (ApoER2) and the very-low-density lipoprotein receptor (VLDLR), triggering a signaling cascade upon binding and subsequent tyrosine phosphorylation of the cytoplasmic disabled-1 (Dab1). Genetic deficiency of reelin, or ApoER2 and VLDLR, or Dab1, all exhibit the same phenotypes, including ataxia, cortical layer inversion and abnormal positioning patterns." 
Q#21510 - CGI_10010439 superfamily 197448 1504 1659 9.91E-49 174.236 cl15240 Reelin_subrepeat_like superfamily - - "Tandem repeat subunit of reelin and related proteins; Reelin is an extracellular glycoprotein involved in neuronal development, specifically in the brain cortex. It contains 8 tandemly repeated units, each of which is composed of two highly similar subrepeats and a central EGF domain. This model characterizes the subrepeats, which directly contact each other in a compact arrangement. Consecutive reelin repeat units are packed together to form an overall rod-like molecular structure. Reelin repeats 5 and 6 are reported to interact with neuronal receptors, the apolipoprotein E receptor 2 (ApoER2) and the very-low-density lipoprotein receptor (VLDLR), triggering a signaling cascade upon binding and subsequent tyrosine phosphorylation of the cytoplasmic disabled-1 (Dab1). Genetic deficiency of reelin, or ApoER2 and VLDLR, or Dab1, all exhibit the same phenotypes, including ataxia, cortical layer inversion and abnormal positioning patterns." Q#21510 - CGI_10010439 superfamily 197448 2401 2557 3.05E-46 167.005 cl15240 Reelin_subrepeat_like superfamily - - "Tandem repeat subunit of reelin and related proteins; Reelin is an extracellular glycoprotein involved in neuronal development, specifically in the brain cortex. It contains 8 tandemly repeated units, each of which is composed of two highly similar subrepeats and a central EGF domain. This model characterizes the subrepeats, which directly contact each other in a compact arrangement. Consecutive reelin repeat units are packed together to form an overall rod-like molecular structure. Reelin repeats 5 and 6 are reported to interact with neuronal receptors, the apolipoprotein E receptor 2 (ApoER2) and the very-low-density lipoprotein receptor (VLDLR), triggering a signaling cascade upon binding and subsequent tyrosine phosphorylation of the cytoplasmic disabled-1 (Dab1). 
Genetic deficiency of reelin, or ApoER2 and VLDLR, or Dab1, all exhibit the same phenotypes, including ataxia, cortical layer inversion and abnormal positioning patterns." Q#21510 - CGI_10010439 superfamily 197448 3372 3544 7.52E-46 166.552 cl15240 Reelin_subrepeat_like superfamily - - "Tandem repeat subunit of reelin and related proteins; Reelin is an extracellular glycoprotein involved in neuronal development, specifically in the brain cortex. It contains 8 tandemly repeated units, each of which is composed of two highly similar subrepeats and a central EGF domain. This model characterizes the subrepeats, which directly contact each other in a compact arrangement. Consecutive reelin repeat units are packed together to form an overall rod-like molecular structure. Reelin repeats 5 and 6 are reported to interact with neuronal receptors, the apolipoprotein E receptor 2 (ApoER2) and the very-low-density lipoprotein receptor (VLDLR), triggering a signaling cascade upon binding and subsequent tyrosine phosphorylation of the cytoplasmic disabled-1 (Dab1). Genetic deficiency of reelin, or ApoER2 and VLDLR, or Dab1, all exhibit the same phenotypes, including ataxia, cortical layer inversion and abnormal positioning patterns." Q#21510 - CGI_10010439 superfamily 197448 2593 2740 1.81E-45 164.623 cl15240 Reelin_subrepeat_like superfamily - - "Tandem repeat subunit of reelin and related proteins; Reelin is an extracellular glycoprotein involved in neuronal development, specifically in the brain cortex. It contains 8 tandemly repeated units, each of which is composed of two highly similar subrepeats and a central EGF domain. This model characterizes the subrepeats, which directly contact each other in a compact arrangement. Consecutive reelin repeat units are packed together to form an overall rod-like molecular structure. 
Reelin repeats 5 and 6 are reported to interact with neuronal receptors, the apolipoprotein E receptor 2 (ApoER2) and the very-low-density lipoprotein receptor (VLDLR), triggering a signaling cascade upon binding and subsequent tyrosine phosphorylation of the cytoplasmic disabled-1 (Dab1). Genetic deficiency of reelin, or ApoER2 and VLDLR, or Dab1, all exhibit the same phenotypes, including ataxia, cortical layer inversion and abnormal positioning patterns." Q#21510 - CGI_10010439 superfamily 197448 3109 3281 1.39E-44 163.085 cl15240 Reelin_subrepeat_like superfamily - - "Tandem repeat subunit of reelin and related proteins; Reelin is an extracellular glycoprotein involved in neuronal development, specifically in the brain cortex. It contains 8 tandemly repeated units, each of which is composed of two highly similar subrepeats and a central EGF domain. This model characterizes the subrepeats, which directly contact each other in a compact arrangement. Consecutive reelin repeat units are packed together to form an overall rod-like molecular structure. Reelin repeats 5 and 6 are reported to interact with neuronal receptors, the apolipoprotein E receptor 2 (ApoER2) and the very-low-density lipoprotein receptor (VLDLR), triggering a signaling cascade upon binding and subsequent tyrosine phosphorylation of the cytoplasmic disabled-1 (Dab1). Genetic deficiency of reelin, or ApoER2 and VLDLR, or Dab1, all exhibit the same phenotypes, including ataxia, cortical layer inversion and abnormal positioning patterns." Q#21510 - CGI_10010439 superfamily 197448 950 1116 2.43E-42 156.232 cl15240 Reelin_subrepeat_like superfamily - - "Tandem repeat subunit of reelin and related proteins; Reelin is an extracellular glycoprotein involved in neuronal development, specifically in the brain cortex. It contains 8 tandemly repeated units, each of which is composed of two highly similar subrepeats and a central EGF domain. 
This model characterizes the subrepeats, which directly contact each other in a compact arrangement. Consecutive reelin repeat units are packed together to form an overall rod-like molecular structure. Reelin repeats 5 and 6 are reported to interact with neuronal receptors, the apolipoprotein E receptor 2 (ApoER2) and the very-low-density lipoprotein receptor (VLDLR), triggering a signaling cascade upon binding and subsequent tyrosine phosphorylation of the cytoplasmic disabled-1 (Dab1). Genetic deficiency of reelin, or ApoER2 and VLDLR, or Dab1, all exhibit the same phenotypes, including ataxia, cortical layer inversion and abnormal positioning patterns." Q#21510 - CGI_10010439 superfamily 197448 2020 2192 4.00E-38 143.664 cl15240 Reelin_subrepeat_like superfamily - - "Tandem repeat subunit of reelin and related proteins; Reelin is an extracellular glycoprotein involved in neuronal development, specifically in the brain cortex. It contains 8 tandemly repeated units, each of which is composed of two highly similar subrepeats and a central EGF domain. This model characterizes the subrepeats, which directly contact each other in a compact arrangement. Consecutive reelin repeat units are packed together to form an overall rod-like molecular structure. Reelin repeats 5 and 6 are reported to interact with neuronal receptors, the apolipoprotein E receptor 2 (ApoER2) and the very-low-density lipoprotein receptor (VLDLR), triggering a signaling cascade upon binding and subsequent tyrosine phosphorylation of the cytoplasmic disabled-1 (Dab1). Genetic deficiency of reelin, or ApoER2 and VLDLR, or Dab1, all exhibit the same phenotypes, including ataxia, cortical layer inversion and abnormal positioning patterns." 
Q#21510 - CGI_10010439 superfamily 197448 1313 1468 1.73E-34 133.792 cl15240 Reelin_subrepeat_like superfamily - - "Tandem repeat subunit of reelin and related proteins; Reelin is an extracellular glycoprotein involved in neuronal development, specifically in the brain cortex. It contains 8 tandemly repeated units, each of which is composed of two highly similar subrepeats and a central EGF domain. This model characterizes the subrepeats, which directly contact each other in a compact arrangement. Consecutive reelin repeat units are packed together to form an overall rod-like molecular structure. Reelin repeats 5 and 6 are reported to interact with neuronal receptors, the apolipoprotein E receptor 2 (ApoER2) and the very-low-density lipoprotein receptor (VLDLR), triggering a signaling cascade upon binding and subsequent tyrosine phosphorylation of the cytoplasmic disabled-1 (Dab1). Genetic deficiency of reelin, or ApoER2 and VLDLR, or Dab1, all exhibit the same phenotypes, including ataxia, cortical layer inversion and abnormal positioning patterns." Q#21510 - CGI_10010439 superfamily 197448 611 757 4.54E-33 128.824 cl15240 Reelin_subrepeat_like superfamily - - "Tandem repeat subunit of reelin and related proteins; Reelin is an extracellular glycoprotein involved in neuronal development, specifically in the brain cortex. It contains 8 tandemly repeated units, each of which is composed of two highly similar subrepeats and a central EGF domain. This model characterizes the subrepeats, which directly contact each other in a compact arrangement. Consecutive reelin repeat units are packed together to form an overall rod-like molecular structure. Reelin repeats 5 and 6 are reported to interact with neuronal receptors, the apolipoprotein E receptor 2 (ApoER2) and the very-low-density lipoprotein receptor (VLDLR), triggering a signaling cascade upon binding and subsequent tyrosine phosphorylation of the cytoplasmic disabled-1 (Dab1). 
Genetic deficiency of reelin, or ApoER2 and VLDLR, or Dab1, all exhibit the same phenotypes, including ataxia, cortical layer inversion and abnormal positioning patterns." Q#21510 - CGI_10010439 superfamily 197448 197 323 2.87E-31 123.735 cl15240 Reelin_subrepeat_like superfamily - - "Tandem repeat subunit of reelin and related proteins; Reelin is an extracellular glycoprotein involved in neuronal development, specifically in the brain cortex. It contains 8 tandemly repeated units, each of which is composed of two highly similar subrepeats and a central EGF domain. This model characterizes the subrepeats, which directly contact each other in a compact arrangement. Consecutive reelin repeat units are packed together to form an overall rod-like molecular structure. Reelin repeats 5 and 6 are reported to interact with neuronal receptors, the apolipoprotein E receptor 2 (ApoER2) and the very-low-density lipoprotein receptor (VLDLR), triggering a signaling cascade upon binding and subsequent tyrosine phosphorylation of the cytoplasmic disabled-1 (Dab1). Genetic deficiency of reelin, or ApoER2 and VLDLR, or Dab1, all exhibit the same phenotypes, including ataxia, cortical layer inversion and abnormal positioning patterns." Q#21510 - CGI_10010439 superfamily 197448 2752 2911 4.34E-25 105.915 cl15240 Reelin_subrepeat_like superfamily - - "Tandem repeat subunit of reelin and related proteins; Reelin is an extracellular glycoprotein involved in neuronal development, specifically in the brain cortex. It contains 8 tandemly repeated units, each of which is composed of two highly similar subrepeats and a central EGF domain. This model characterizes the subrepeats, which directly contact each other in a compact arrangement. Consecutive reelin repeat units are packed together to form an overall rod-like molecular structure. 
Reelin repeats 5 and 6 are reported to interact with neuronal receptors, the apolipoprotein E receptor 2 (ApoER2) and the very-low-density lipoprotein receptor (VLDLR), triggering a signaling cascade upon binding and subsequent tyrosine phosphorylation of the cytoplasmic disabled-1 (Dab1). Genetic deficiency of reelin, or ApoER2 and VLDLR, or Dab1, all exhibit the same phenotypes, including ataxia, cortical layer inversion and abnormal positioning patterns." Q#21510 - CGI_10010439 superfamily 197448 3306 3366 2.22E-18 86.1272 cl15240 Reelin_subrepeat_like superfamily N - "Tandem repeat subunit of reelin and related proteins; Reelin is an extracellular glycoprotein involved in neuronal development, specifically in the brain cortex. It contains 8 tandemly repeated units, each of which is composed of two highly similar subrepeats and a central EGF domain. This model characterizes the subrepeats, which directly contact each other in a compact arrangement. Consecutive reelin repeat units are packed together to form an overall rod-like molecular structure. Reelin repeats 5 and 6 are reported to interact with neuronal receptors, the apolipoprotein E receptor 2 (ApoER2) and the very-low-density lipoprotein receptor (VLDLR), triggering a signaling cascade upon binding and subsequent tyrosine phosphorylation of the cytoplasmic disabled-1 (Dab1). Genetic deficiency of reelin, or ApoER2 and VLDLR, or Dab1, all exhibit the same phenotypes, including ataxia, cortical layer inversion and abnormal positioning patterns." Q#21510 - CGI_10010439 superfamily 197448 504 643 1.22E-12 68.7932 cl15240 Reelin_subrepeat_like superfamily - - "Tandem repeat subunit of reelin and related proteins; Reelin is an extracellular glycoprotein involved in neuronal development, specifically in the brain cortex. It contains 8 tandemly repeated units, each of which is composed of two highly similar subrepeats and a central EGF domain. 
This model characterizes the subrepeats, which directly contact each other in a compact arrangement. Consecutive reelin repeat units are packed together to form an overall rod-like molecular structure. Reelin repeats 5 and 6 are reported to interact with neuronal receptors, the apolipoprotein E receptor 2 (ApoER2) and the very-low-density lipoprotein receptor (VLDLR), triggering a signaling cascade upon binding and subsequent tyrosine phosphorylation of the cytoplasmic disabled-1 (Dab1). Genetic deficiency of reelin, or ApoER2 and VLDLR, or Dab1, all exhibit the same phenotypes, including ataxia, cortical layer inversion and abnormal positioning patterns." Q#21510 - CGI_10010439 superfamily 197448 443 494 0.000921727 41.2598 cl15240 Reelin_subrepeat_like superfamily C - "Tandem repeat subunit of reelin and related proteins; Reelin is an extracellular glycoprotein involved in neuronal development, specifically in the brain cortex. It contains 8 tandemly repeated units, each of which is composed of two highly similar subrepeats and a central EGF domain. This model characterizes the subrepeats, which directly contact each other in a compact arrangement. Consecutive reelin repeat units are packed together to form an overall rod-like molecular structure. Reelin repeats 5 and 6 are reported to interact with neuronal receptors, the apolipoprotein E receptor 2 (ApoER2) and the very-low-density lipoprotein receptor (VLDLR), triggering a signaling cascade upon binding and subsequent tyrosine phosphorylation of the cytoplasmic disabled-1 (Dab1). Genetic deficiency of reelin, or ApoER2 and VLDLR, or Dab1, all exhibit the same phenotypes, including ataxia, cortical layer inversion and abnormal positioning patterns." Q#21510 - CGI_10010439 superfamily 219677 3547 3572 0.00387278 38.1876 cl18521 EGF_2 superfamily - - EGF-like domain; This family contains EGF domains found in a variety of extracellular proteins. 
Q#21510 - CGI_10010439 superfamily 219677 2564 2586 0.0062057 37.8024 cl18521 EGF_2 superfamily - - EGF-like domain; This family contains EGF domains found in a variety of extracellular proteins. Q#21511 - CGI_10010440 superfamily 243119 49 87 0.000409856 37.0382 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#21513 - CGI_10010442 superfamily 247743 19 165 1.63E-11 61.4468 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#21514 - CGI_10010443 superfamily 241600 71 144 2.23E-26 101.933 cl00085 FReD superfamily C - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. 
The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#21514 - CGI_10010443 superfamily 220647 145 192 8.82E-18 77.3679 cl18565 L_HGMIC_fpl superfamily N - "Lipoma HMGIC fusion partner-like protein; This is a group of proteins expressed from a series of genes referred to as Lipoma HGMIC fusion partner-like. The proteins carry four highly conserved transmembrane domains in this entry. In certain instances, eg in LHFPL5, mutations cause deafness in humans and hypospadias, and LHFPL1 is transcribed in six liver tumour cell lines." Q#21515 - CGI_10010444 superfamily 241609 31 108 6.56E-28 98.9894 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#21516 - CGI_10010445 superfamily 247041 292 556 9.78E-78 249.156 cl15692 CE4_SF superfamily - - "Catalytic NodB homology domain of the carbohydrate esterase 4 superfamily; The carbohydrate esterase 4 (CE4) superfamily mainly includes chitin deacetylases (EC 3.5.1.41), bacterial peptidoglycan N-acetylglucosamine deacetylases (EC 3.5.1.-), and acetylxylan esterases (EC 3.1.1.72), which catalyze the N- or O-deacetylation of substrates such as acetylated chitin, peptidoglycan, and acetylated xylan, respectively. 
Members in this superfamily contain a NodB homology domain that adopts a deformed (beta/alpha)8 barrel fold, which encompasses a mononuclear metalloenzyme employing a conserved His-His-Asp zinc-binding triad, closely associated with the conserved catalytic base (aspartic acid) and acid (histidine) to carry out acid/base catalysis. The NodB homology domain of CE4 superfamily is remotely related to the 7-stranded beta/alpha barrel catalytic domain of the superfamily consisting of family 38 glycoside hydrolases (GH38), family 57 heat stable retaining glycoside hydrolases (GH57), lactam utilization protein LamB/YcsF family proteins, and YdjC-family proteins." Q#21516 - CGI_10010445 superfamily 247041 7 227 1.22E-44 159.79 cl15692 CE4_SF superfamily - - "Catalytic NodB homology domain of the carbohydrate esterase 4 superfamily; The carbohydrate esterase 4 (CE4) superfamily mainly includes chitin deacetylases (EC 3.5.1.41), bacterial peptidoglycan N-acetylglucosamine deacetylases (EC 3.5.1.-), and acetylxylan esterases (EC 3.1.1.72), which catalyze the N- or O-deacetylation of substrates such as acetylated chitin, peptidoglycan, and acetylated xylan, respectively. Members in this superfamily contain a NodB homology domain that adopts a deformed (beta/alpha)8 barrel fold, which encompasses a mononuclear metalloenzyme employing a conserved His-His-Asp zinc-binding triad, closely associated with the conserved catalytic base (aspartic acid) and acid (histidine) to carry out acid/base catalysis. The NodB homology domain of CE4 superfamily is remotely related to the 7-stranded beta/alpha barrel catalytic domain of the superfamily consisting of family 38 glycoside hydrolases (GH38), family 57 heat stable retaining glycoside hydrolases (GH57), lactam utilization protein LamB/YcsF family proteins, and YdjC-family proteins." 
Q#21517 - CGI_10010446 superfamily 190614 30 113 2.32E-28 109.958 cl15647 YEATS superfamily - - "YEATS family; We have named this family the YEATS family, after `YNK7', `ENL', `AF-9', and `TFIIF small subunit'. This family also contains the GAS41 protein. All these proteins are thought to have a transcription stimulatory activity" Q#21518 - CGI_10010447 superfamily 243072 51 145 5.12E-23 89.365 cl02529 ANK superfamily N - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#21519 - CGI_10010448 superfamily 247724 5 149 9.08E-42 138.831 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. 
The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#21520 - CGI_10010449 superfamily 247724 5 152 3.76E-42 139.986 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#21521 - CGI_10010450 superfamily 222269 692 979 5.91E-62 211.028 cl18657 Cupin_8 superfamily - - Cupin-like domain; This cupin like domain shares similarity to the JmjC domain. 
Q#21521 - CGI_10010450 superfamily 222269 224 504 1.27E-60 207.561 cl18657 Cupin_8 superfamily - - Cupin-like domain; This cupin like domain shares similarity to the JmjC domain. Q#21522 - CGI_10010451 superfamily 222269 20 305 6.21E-67 212.569 cl18657 Cupin_8 superfamily - - Cupin-like domain; This cupin like domain shares similarity to the JmjC domain. Q#21523 - CGI_10010452 superfamily 241754 6 558 0 583.073 cl00286 Motor_domain superfamily - - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. Q#21524 - CGI_10010453 superfamily 245596 635 871 3.19E-73 246.065 cl11394 Glyco_tranf_GTA_type superfamily - - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." 
Q#21524 - CGI_10010453 superfamily 248458 897 1038 6.39E-06 48.8493 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#21524 - CGI_10010453 superfamily 241754 1 96 2.20E-14 76.9207 cl00286 Motor_domain superfamily N - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. 
Q#21524 - CGI_10010453 superfamily 245596 512 598 2.73E-05 46.1468 cl11394 Glyco_tranf_GTA_type superfamily C - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#21525 - CGI_10010454 superfamily 241754 6 57 9.60E-15 68.4463 cl00286 Motor_domain superfamily C - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. Q#21526 - CGI_10010455 superfamily 241754 3 220 8.85E-61 205.192 cl00286 Motor_domain superfamily N - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. Q#21527 - CGI_10010456 superfamily 241754 1 96 3.42E-15 72.2983 cl00286 Motor_domain superfamily N - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. 
Q#21530 - CGI_10003478 superfamily 218713 13 62 1.63E-19 78.1528 cl05332 MRG superfamily N - "MRG; This family consists of three different eukaryotic proteins (mortality factor 4 (MORF4/MRG15), male-specific lethal 3(MSL-3) and ESA1-associated factor 3(EAF3)). It is thought that the MRG family is involved in transcriptional regulation via histone acetylation. It contains 2 chromo domains and a leucine zipper motif." Q#21536 - CGI_10007372 superfamily 217869 352 505 0.00978679 36.4962 cl12286 Not3 superfamily C - "Not1 N-terminal domain, CCR4-Not complex component; Not1 N-terminal domain, CCR4-Not complex component. " Q#21538 - CGI_10007374 superfamily 248097 46 170 3.69E-21 84.239 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#21539 - CGI_10007376 superfamily 248097 256 380 1.23E-21 88.8614 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#21540 - CGI_10007377 superfamily 220097 27 102 1.49E-07 45.0873 cl08518 Phospholip_A2_3 superfamily N - "Prokaryotic phospholipase A2; The prokaryotic phospholipase A2 domain is predominantly found in bacterial and fungal phospholipases, as well as various hypothetical and putative proteins. It enables the liberation of fatty acids and lysophospholipid by hydrolysing the 2-ester bond of 1,2-diacyl-3-sn-phosphoglycerides. The domain adopts an alpha-helical secondary structure, consisting of five alpha-helices and two helical segments." Q#21541 - CGI_10007378 superfamily 207609 15 115 2.52E-19 77.9044 cl02481 NGF superfamily - - Nerve growth factor family; Nerve growth factor family. Q#21543 - CGI_10007380 superfamily 241596 216 256 3.16E-07 46.8235 cl00081 HLH superfamily N - "Helix-loop-helix domain, found in specific DNA- binding proteins that act as transcription factors; 60-100 amino acids long. 
A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE (5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#21543 - CGI_10007380 superfamily 245721 171 200 5.80E-09 52.4864 cl11603 Basic superfamily N - Myogenic Basic domain; This basic domain is found in the MyoD family of muscle specific proteins that control muscle development. The bHLH region of the MyoD family includes the basic domain and the Helix-loop-helix (HLH) motif. The bHLH region mediates specific DNA binding, with 12 residues of the basic domain involved in DNA binding. The basic domain forms an extended alpha helix in the structure. 
Q#21545 - CGI_10007382 superfamily 243034 10 121 0.000693331 37.7448 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#21547 - CGI_10011192 superfamily 241754 185 546 2.04E-164 499.916 cl00286 Motor_domain superfamily - - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. Q#21547 - CGI_10011192 superfamily 241581 638 734 1.19E-11 63.5594 cl00062 FHA superfamily - - "Forkhead associated domain (FHA); found in eukaryotic and prokaryotic proteins. Putative nuclear signalling domain. FHA domains may bind phosphothreonine, phosphoserine and sometimes phosphotyrosine. 
In eukaryotes, many FHA domain-containing proteins localize to the nucleus, where they participate in establishing or maintaining cell cycle checkpoints, DNA repair, or transcriptional regulation. Members of the FHA family include: Dun1, Rad53, Cds1, Mek1, KAPP(kinase-associated protein phosphatase),and Ki-67 (a human nuclear protein related to cell proliferation)." Q#21549 - CGI_10011194 superfamily 241563 68 109 1.90E-06 45.548 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#21549 - CGI_10011194 superfamily 241563 28 59 0.000784212 37.844 cl00034 BBOX superfamily N - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#21550 - CGI_10011195 superfamily 243604 12 389 1.44E-122 363.153 cl03994 Trp_dioxygenase superfamily - - "Tryptophan 2,3-dioxygenase; Tryptophan 2,3-dioxygenase. " Q#21551 - CGI_10011196 superfamily 247757 48 193 6.07E-71 219.1 cl17203 Fer4_NifH superfamily - - "The Fer4_NifH superfamily contains a variety of proteins which share a common ATP-binding domain. Functionally, proteins in this superfamily use the energy from hydrolysis of NTP to transfer electron or ion." Q#21551 - CGI_10011196 superfamily 150200 17 60 1.61E-05 41.8909 cl09685 VMA21 superfamily C - VMA21-like domain; This presumed short domain appears to contain two potential transmembrane helices. VMA21 is localised in the ER where it is needed as an accessory factor for assembly of the V0 component of the vacuolar ATPase. 
Q#21551 - CGI_10011196 superfamily 247724 171 246 0.00027665 39.7152 cl17170 Ras_like_GTPase superfamily N - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#21553 - CGI_10011198 superfamily 241563 18 50 0.00162157 36.6884 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#21555 - CGI_10011200 superfamily 241750 201 581 2.30E-142 420.447 cl00281 metallo-dependent_hydrolases superfamily - - "Superfamily of metallo-dependent hydrolases (also called amidohydrolase superfamily) is a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. The family includes urease alpha, adenosine deaminase, phosphotriesterase dihydroorotases, allantoinases, hydantoinases, AMP-, adenine and cytosine deaminases, imidazolonepropionase, aryldialkylphosphatase, chlorohydrolases, formylmethanofuran dehydrogenases and others." Q#21555 - CGI_10011200 superfamily 203031 16 73 9.04E-08 49.634 cl04548 FLYWCH superfamily - - "FLYWCH zinc finger domain; Mutations in the mod(mdg4) gene have effects on variegation (PEV), the properties of insulator sequences, correct path-finding of growing nerve cells, meiotic pairing of chromosomes, and apoptosis. The occurrence of FLYWCH motifs in mod(mdg4) gene product and other proteins is discussed in." Q#21556 - CGI_10011201 superfamily 203031 6 63 3.68E-07 44.6264 cl04548 FLYWCH superfamily - - "FLYWCH zinc finger domain; Mutations in the mod(mdg4) gene have effects on variegation (PEV), the properties of insulator sequences, correct path-finding of growing nerve cells, meiotic pairing of chromosomes, and apoptosis. The occurrence of FLYWCH motifs in mod(mdg4) gene product and other proteins is discussed in." Q#21557 - CGI_10011202 superfamily 203136 42 170 5.91E-11 56.9692 cl04867 LRAT superfamily - - "Lecithin retinol acyltransferase; The full-length members of this family are representatives of a novel class II tumour-suppressor family, designated as H-REV107-like. 
This domain is the catalytic N-terminal proline-rich region of the protein. The downstream region is a putative C-terminal transmembrane domain which is found to be crucial for cellular localisation, but not necessary for the enzyme activity. H-REV107-like proteins are homologous to lecithin retinol acyltransferase (LRAT), an enzyme that catalyzes the transfer of the sn-1 acyl group of phosphatidylcholine to all-trans-retinol and forming a retinyl ester." Q#21560 - CGI_10011205 superfamily 241613 1223 1258 4.10E-11 60.6833 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#21560 - CGI_10011205 superfamily 241613 1182 1218 2.33E-10 58.7574 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 
component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#21560 - CGI_10011205 superfamily 241584 1428 1519 1.64E-09 57.5063 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#21560 - CGI_10011205 superfamily 241613 1378 1413 5.50E-09 54.5202 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#21560 - CGI_10011205 superfamily 241613 1331 1368 1.12E-07 51.0534 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive 
cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#21560 - CGI_10011205 superfamily 241613 1064 1098 1.48E-06 47.5866 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#21560 - CGI_10011205 superfamily 241613 1283 1314 0.000227532 41.0382 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 
2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#21560 - CGI_10011205 superfamily 241584 1604 1672 0.000566991 40.5575 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#21560 - CGI_10011205 superfamily 241584 1904 1986 0.00792425 37.0907 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#21560 - CGI_10011205 superfamily 214531 789 829 2.61E-09 55.6857 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. 
Q#21560 - CGI_10011205 superfamily 214531 876 919 6.82E-08 51.4485 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#21560 - CGI_10011205 superfamily 215683 851 893 8.51E-07 48.3203 cl18339 Ldl_recept_b superfamily - - Low-density lipoprotein receptor repeat class B; This domain is also known as the YWTD motif after the most conserved region of the repeat. The YWTD repeat is found in multiple tandem repeats and has been predicted to form a beta-propeller structure. Q#21563 - CGI_10003698 superfamily 247741 47 70 0.00862582 32.0801 cl17187 Aldolase_Class_I superfamily C - "Class I aldolases; Class I aldolases. The class I aldolases use an active-site lysine which stabilizes a reaction intermediates via Schiff base formation, and have TIM beta/alpha barrel fold. The members of this family include 2-keto-3-deoxy-6-phosphogluconate (KDPG) and 2-keto-4-hydroxyglutarate (KHG) aldolases, transaldolase, dihydrodipicolinate synthase sub-family, Type I 3-dehydroquinate dehydratase, DeoC and DhnA proteins, and metal-independent fructose-1,6-bisphosphate aldolase. Although structurally similar, the class II aldolases use a different mechanism and are believed to have an independent evolutionary origin." Q#21564 - CGI_10003699 superfamily 218312 120 217 1.86E-37 132.154 cl04824 Cwf_Cwc_15 superfamily N - "Cwf15/Cwc15 cell cycle control protein; This family represents Cwf15/Cwc15 (from Schizosaccharomyces pombe and Saccharomyces cerevisiae respectively) and their homologues. The function of these proteins is unknown, but they form part of the spliceosome and are thus thought to be involved in mRNA splicing." 
Q#21564 - CGI_10003699 superfamily 218312 1 51 1.20E-16 74.7594 cl04824 Cwf_Cwc_15 superfamily C - "Cwf15/Cwc15 cell cycle control protein; This family represents Cwf15/Cwc15 (from Schizosaccharomyces pombe and Saccharomyces cerevisiae respectively) and their homologues. The function of these proteins is unknown, but they form part of the spliceosome and are thus thought to be involved in mRNA splicing." Q#21565 - CGI_10003700 superfamily 245201 356 640 8.35E-144 422.208 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#21565 - CGI_10003700 superfamily 243115 64 196 3.24E-40 143.253 cl02623 WIF superfamily - - "WIF domain; The WIF domain is found in the RYK tyrosine kinase receptors and WIF, the Wnt-inhibitory- factor. The domain is extracellular and contains two conserved cysteines that may form a disulphide bridge. This domain is Wnt binding in WIF, and it has been suggested that RYK may also bind to Wnt. The WIF domain is a member of the immunoglobulin superfamily, and it comprises nine beta-strands and two alpha-helices, with two of the beta-strands (6 and 9) interrupted by four and six residues of irregular secondary structure, respectively. 
Considering that the activity of Wnts depends on the presence of a palmitoylated cysteine residue in their amino-terminal polypeptide segment, Wnt proteins are lipid-modified and can act as stem cell growth factors, it is likely that the WIF domain recognises and binds to Wnts that have been activated by palmitoylation and that the recognition of palmitoylated Wnts by WIF-1 is effected by its WIF domain rather than by its EGF domains. A strong binding affinity for palmitoylated cysteine residues would further explain the remarkably high affinity of human WIF-1 not only for mammalian Wnts, but also for Wnts from Xenopus and Drosophila." Q#21566 - CGI_10003701 superfamily 217473 121 344 1.66E-27 112.073 cl03978 Mab-21 superfamily - - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#21567 - CGI_10004874 superfamily 248232 229 443 1.48E-40 153.991 cl17678 MRS6 superfamily C - "RAB proteins geranylgeranyltransferase component A (RAB escort protein) [Posttranslational modification, protein turnover, chaperones]" Q#21567 - CGI_10004874 superfamily 248232 4 66 2.44E-12 68.2728 cl17678 MRS6 superfamily C - "RAB proteins geranylgeranyltransferase component A (RAB escort protein) [Posttranslational modification, protein turnover, chaperones]" Q#21568 - CGI_10004875 superfamily 222150 397 422 0.00506328 35.8305 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#21568 - CGI_10004875 superfamily 222150 425 448 0.00586633 35.4453 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#21568 - CGI_10004875 superfamily 218108 527 694 0.0061117 37.5629 cl04540 CITED superfamily C - "CITED; CITED, CBP/p300-interacting transactivator with ED-rich tail, are characterized by a conserved 32-amino acid sequence at the C-terminus. 
CITED proteins do not bind DNA directly and are thought to function as transcriptional co-activators." Q#21570 - CGI_10004877 superfamily 245847 16 81 0.000751628 34.4546 cl12042 FA58C superfamily N - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#21571 - CGI_10004879 superfamily 245213 19 47 0.000633511 36.0754 cl09941 EGF_CA superfamily N - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#21571 - CGI_10004879 superfamily 245847 55 175 3.42E-10 55.2553 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#21572 - CGI_10001615 superfamily 217574 1 131 1.86E-43 148.91 cl04089 eRF1_1 superfamily - - "eRF1 domain 1; The release factor eRF1 terminates protein biosynthesis by recognising stop codons at the A site of the ribosome and stimulating peptidyl-tRNA bond hydrolysis at the peptidyl transferase centre. The crystal structure of human eRF1 is known. The overall shape and dimensions of eRF1 resemble a tRNA molecule with domains 1, 2, and 3 of eRF1 corresponding to the anticodon loop, aminoacyl acceptor stem, and T stem of a tRNA molecule, respectively. 
The position of the essential GGQ motif at an exposed tip of domain 2 suggests that the Gln residue coordinates a water molecule to mediate the hydrolytic activity at the peptidyl transferase centre. A conserved groove on domain 1, 80 A from the GGQ motif, is proposed to form the codon recognition site. This family also includes other proteins for which the precise molecular function is unknown. Many of them are from Archaebacteria. These proteins may also be involved in translation termination but this awaits experimental verification." Q#21572 - CGI_10001615 superfamily 217575 136 268 2.22E-37 132.783 cl04090 eRF1_2 superfamily - - "eRF1 domain 2; The release factor eRF1 terminates protein biosynthesis by recognising stop codons at the A site of the ribosome and stimulating peptidyl-tRNA bond hydrolysis at the peptidyl transferase centre. The crystal structure of human eRF1 is known. The overall shape and dimensions of eRF1 resemble a tRNA molecule with domains 1, 2, and 3 of eRF1 corresponding to the anticodon loop, aminoacyl acceptor stem, and T stem of a tRNA molecule, respectively. The position of the essential GGQ motif at an exposed tip of domain 2 suggests that the Gln residue coordinates a water molecule to mediate the hydrolytic activity at the peptidyl transferase centre. A conserved groove on domain 1, 80 A from the GGQ motif, is proposed to form the codon recognition site. This family also includes other proteins for which the precise molecular function is unknown. Many of them are from Archaebacteria. These proteins may also be involved in translation termination but this awaits experimental verification." Q#21572 - CGI_10001615 superfamily 146221 271 370 1.43E-36 129.595 cl04091 eRF1_3 superfamily - - "eRF1 domain 3; The release factor eRF1 terminates protein biosynthesis by recognising stop codons at the A site of the ribosome and stimulating peptidyl-tRNA bond hydrolysis at the peptidyl transferase centre. 
The crystal structure of human eRF1 is known. The overall shape and dimensions of eRF1 resemble a tRNA molecule with domains 1, 2, and 3 of eRF1 corresponding to the anticodon loop, aminoacyl acceptor stem, and T stem of a tRNA molecule, respectively. The position of the essential GGQ motif at an exposed tip of domain 2 suggests that the Gln residue coordinates a water molecule to mediate the hydrolytic activity at the peptidyl transferase centre. A conserved groove on domain 1, 80 A from the GGQ motif, is proposed to form the codon recognition site. This family also includes other proteins for which the precise molecular function is unknown. Many of them are from Archaebacteria. These proteins may also be involved in translation termination but this awaits experimental verification." Q#21573 - CGI_10001616 superfamily 243091 141 259 1.88E-10 58.4999 cl02566 SET superfamily - - "SET domain; SET domains are protein lysine methyltransferase enzymes. SET domains appear to be protein-protein interaction domains. It has been demonstrated that SET domains mediate interactions with a family of proteins that display similarity with dual-specificity phosphatases (dsPTPases). A subset of SET domains have been called PR domains. These domains are divergent in sequence from other SET domains, but also appear to mediate protein-protein interaction. The SET domain consists of two regions known as SET-N and SET-C. SET-C forms an unusual and conserved knot-like structure of probably functional importance. Additionally to SET-N and SET-C, an insert region (SET-I) and flanking regions of high structural variability form part of the overall structure." Q#21573 - CGI_10001616 superfamily 222150 394 418 3.87E-05 41.2233 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#21573 - CGI_10001616 superfamily 222150 365 389 0.000145543 39.6825 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. 
Q#21573 - CGI_10001616 superfamily 246975 352 373 0.00404293 35.4005 cl15478 zf-C2H2 superfamily - - "Zinc finger, C2H2 type; The C2H2 zinc finger is the classical zinc finger domain. The two conserved cysteines and histidines co-ordinate a zinc ion. The following pattern describes the zinc finger. #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C] Where X can be any amino acid, and numbers in brackets indicate the number of residues. The positions marked # are those that are important for the stable fold of the zinc finger. The final position can be either his or cys. The C2H2 zinc finger is composed of two short beta strands followed by an alpha helix. The amino terminal part of the helix binds the major groove in DNA binding zinc fingers. The accepted consensus binding sequence for Sp1 is usually defined by the asymmetric hexanucleotide core GGGCGG but this sequence does not include, among others, the GAG (=CTC) repeat that constitutes a high-affinity site for Sp1 binding to the wt1 promoter." Q#21574 - CGI_10015975 superfamily 241574 320 548 1.78E-98 302.197 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#21575 - CGI_10015976 superfamily 247858 18 150 2.35E-19 80.5098 cl17304 2OG-FeII_Oxy_3 superfamily N - 2OG-Fe(II) oxygenase superfamily; This family contains members of the 2-oxoglutarate (2OG) and Fe(II)-dependent oxygenase superfamily. 
Q#21577 - CGI_10015978 superfamily 247755 1106 1333 1.20E-145 448.142 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#21577 - CGI_10015978 superfamily 247755 469 706 2.53E-145 446.986 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#21577 - CGI_10015978 superfamily 216049 216 422 1.86E-53 190.575 cl18356 ABC_membrane superfamily N - ABC transporter transmembrane region; This family represents a unit of six transmembrane helices. Many members of the ABC transporter family (pfam00005) have two such regions. Q#21577 - CGI_10015978 superfamily 216049 790 1059 3.19E-46 169.389 cl18356 ABC_membrane superfamily - - ABC transporter transmembrane region; This family represents a unit of six transmembrane helices. Many members of the ABC transporter family (pfam00005) have two such regions. 
Q#21577 - CGI_10015978 superfamily 190308 1403 1531 5.43E-08 54.2471 cl18163 Fringe superfamily C - "Fringe-like; The drosophila protein fringe (FNG) is a glucosaminyltransferase that controls the response of the Notch receptor to specific ligands. FNG is localised to the Golgi apparatus (not secreted as previously thought). Modification of Notch occurs through glycosylation by FNG. The xenopus homologue, lunatic fringe, has been implicated in a variety of functions." Q#21580 - CGI_10015981 superfamily 247769 46 102 1.05E-09 53.4973 cl17215 HDc superfamily C - Metal dependent phosphohydrolases with conserved 'HD' motif Q#21581 - CGI_10015982 superfamily 241619 85 121 0.00300041 36.2602 cl00112 PAN_APPLE superfamily NC - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#21582 - CGI_10015983 superfamily 242406 14 88 5.99E-08 47.2009 cl01271 DUF1768 superfamily N - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. Q#21584 - CGI_10015985 superfamily 243082 647 782 1.88E-31 124.133 cl02553 Peptidase_C19 superfamily N - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquinated peptides by cleavage of isopeptide bonds. 
They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." Q#21584 - CGI_10015985 superfamily 243082 477 564 1.92E-06 48.6335 cl02553 Peptidase_C19 superfamily C - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." Q#21584 - CGI_10015985 superfamily 243082 292 404 2.72E-06 48.8111 cl02553 Peptidase_C19 superfamily C - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." 
Q#21585 - CGI_10015986 superfamily 241900 19 177 1.22E-11 63.39 cl00490 EEP superfamily C - "Exonuclease-Endonuclease-Phosphatase (EEP) domain superfamily; This large superfamily includes the catalytic domain (exonuclease/endonuclease/phosphatase or EEP domain) of a diverse set of proteins including the ExoIII family of apurinic/apyrimidinic (AP) endonucleases, inositol polyphosphate 5-phosphatases (INPP5), neutral sphingomyelinases (nSMases), deadenylases (such as the vertebrate circadian-clock regulated nocturnin), bacterial cytolethal distending toxin B (CdtB), deoxyribonuclease 1 (DNase1), the endonuclease domain of the non-LTR retrotransposon LINE-1, and related domains. These diverse enzymes share a common catalytic mechanism of cleaving phosphodiester bonds; their substrates range from nucleic acids to phospholipids and perhaps proteins." Q#21586 - CGI_10015987 superfamily 217249 192 335 3.28E-80 243.683 cl03742 Prp18 superfamily - - "Prp18 domain; The splicing factor Prp18 is required for the second step of pre-mRNA splicing. The structure of a large fragment of the Saccharomyces cerevisiae Prp18 is known. This fragment is fully active in yeast splicing in vitro and includes the sequences of Prp18 that have been evolutionarily conserved. The core structure consists of five alpha-helices that adopt a novel fold. The most highly conserved region of Prp18, a nearly invariant stretch of 19 aa, forms part of a loop between two alpha-helices and may interact with the U5 small nuclear ribonucleoprotein particles." Q#21586 - CGI_10015987 superfamily 207680 85 114 1.15E-08 50.4343 cl02632 PRP4 superfamily - - pre-mRNA processing factor 4 (PRP4) like; This small domain is found on PRP4 ribonuleoproteins. PRP4 is a U4/U6 small nuclear ribonucleoprotein that is involved in pre-mRNA processing. Q#21587 - CGI_10015988 superfamily 217473 172 342 8.76E-26 104.754 cl03978 Mab-21 superfamily N - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. 
elegans these proteins are required for several aspects of embryonic development. Q#21589 - CGI_10015990 superfamily 245852 95 289 1.20E-32 125.927 cl12050 TraB superfamily C - "TraB family; pAD1 is a hemolysin/bacteriocin plasmid originally identified in Enterococcus faecalis DS16. It encodes a mating response to a peptide sex pheromone, cAD1, secreted by recipient bacteria. Once the plasmid pAD1 is acquired, production of the pheromone ceases--a trait related in part to a determinant designated traB. However a related protein is found in C. elegans, suggesting that members of the TraB family have some more general function. This family also includes the bacterial GumN protein. The family has a conserved GXXH motif close to the N-terminus, a conserved glutamate and a conserved arginine that may be catalytic. The family also includes a second conserved GXXH motif near the C-terminus." Q#21591 - CGI_10015992 superfamily 218118 40 120 5.85E-16 73.032 cl04552 CD225 superfamily - - "Interferon-induced transmembrane protein; This family includes the human leukocyte antigen CD225, which is an interferon inducible transmembrane protein, and is associated with interferon induced cell growth suppression." Q#21592 - CGI_10015993 superfamily 241758 15 167 5.76E-27 99.7518 cl00292 AANH_like superfamily - - "Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases, ATP sulphurylases Universal Stress Response protein and electron transfer flavoprotein (ETF). The domain forms a apha/beta/apha fold which binds to Adenosine nucleotide." Q#21594 - CGI_10003268 superfamily 150423 11 129 1.24E-49 156.639 cl10732 Med10 superfamily - - "Transcription factor subunit Med10 of Mediator complex; Med10 is one of the protein subunits of the Mediator complex, tethered to Rgr1 protein. The Mediator complex is required for the transcription of most RNA polymerase II (Pol II)-transcribed genes. 
Med10 specifically mediates basal-level HIS4 transcription via Gcn4, and, additionally, there is a putative requirement for Med10 in Bas2-mediated transcription. Med10 is part of the middle region of Mediator." Q#21595 - CGI_10003269 superfamily 116878 112 222 1.25E-26 100.829 cl17958 TIM21 superfamily N - TIM21; TIM21 interacts with the outer mitochondrial TOM complex and promotes the insertion of proteins into the inner mitochondrial membrane. Q#21596 - CGI_10003270 superfamily 241606 21 128 5.40E-45 153.986 cl00096 IRF superfamily - - Interferon Regulatory Factor (IRF); also known as tryptophan pentad repeat. The family of IRF transcription factors is important in the regulation of interferons in response to infection by virus and in the regulation of interferon-inducible genes. The IRF family is characterized by a unique 'tryptophan cluster' DNA-binding region. Viral IRFs bind to cellular IRFs; block type I and II interferons and host IRF-mediated transcriptional activation. Q#21596 - CGI_10003270 superfamily 220732 254 414 4.32E-24 98.4993 cl11061 IRF-3 superfamily - - "Interferon-regulatory factor 3; This is the interferon-regulatory factor 3 chain of the hetero-dimeric structure which also contains the shorter chain CREB-binding protein. These two subunits make up the DRAF1 (double-stranded RNA-activated factor 1). Viral dsRNA produced during viral transcription or replication leads to the activation of DRAF1. The DNA-binding specificity of DRAF1 correlates with transcriptional induction of ISG (interferon-alpha,beta-stimulated gene). IRF-3 preexists in the cytoplasm of uninfected cells and translocates to the nucleus following viral infection. Translocation of IRF-3 is accompanied by an increase in serine and threonine phosphorylation, and association with the CREB coactivator occurs only after infection." 
Q#21597 - CGI_10003271 superfamily 245596 54 283 2.07E-44 155.717 cl11394 Glyco_tranf_GTA_type superfamily - - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#21600 - CGI_10002254 superfamily 216653 142 230 5.12E-20 83.0302 cl08331 Na_Ca_ex superfamily C - "Sodium/calcium exchanger protein; This is a family of sodium/calcium exchanger integral membrane proteins. This family covers the integral membrane regions of the proteins. Sodium/calcium exchangers regulate intracellular Ca2+ concentrations in many cells; cardiac myocytes, epithelial cells, neurons retinal rod photoreceptors and smooth muscle cells. Ca2+ is moved into or out of the cytosol depending on Na+ concentration. In humans and rats there are 3 isoforms; NCX1 NCX2 and NCX3." Q#21602 - CGI_10002256 superfamily 241754 5 156 2.51E-46 157.473 cl00286 Motor_domain superfamily NC - Myosin and Kinesin motor domain. 
These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. Q#21603 - CGI_10004906 superfamily 245847 335 474 7.25E-08 51.3492 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#21604 - CGI_10004907 superfamily 248097 34 163 8.98E-13 61.127 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#21605 - CGI_10004908 superfamily 248097 30 159 3.39E-13 62.2826 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#21606 - CGI_10004909 superfamily 241568 40 76 3.32E-07 46.6872 cl00043 CCP superfamily N - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#21607 - CGI_10004910 superfamily 243051 1 70 1.53E-06 45.0614 cl02479 MAM superfamily N - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. 
Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#21609 - CGI_10004912 superfamily 247856 93 142 0.000113654 37.1421 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#21610 - CGI_10011113 superfamily 215872 8 86 1.99E-11 54.8922 cl08261 Ribosomal_L6 superfamily - - Ribosomal protein L6; Ribosomal protein L6. Q#21614 - CGI_10011117 superfamily 242268 136 598 3.13E-176 510.321 cl01046 Glyco_hydro_1 superfamily - - Glycosyl hydrolase family 1; Glycosyl hydrolase family 1. Q#21614 - CGI_10011117 superfamily 242268 2 111 5.99E-34 133.211 cl01046 Glyco_hydro_1 superfamily N - Glycosyl hydrolase family 1; Glycosyl hydrolase family 1. Q#21615 - CGI_10011118 superfamily 243119 78 129 4.73E-08 45.8876 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. 
Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#21615 - CGI_10011118 superfamily 243119 21 69 1.26E-05 39.3393 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#21616 - CGI_10011119 superfamily 243119 72 121 0.000214425 36.653 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#21617 - CGI_10011120 superfamily 243119 25 73 2.13E-06 40.505 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. 
It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#21618 - CGI_10011121 superfamily 241703 5 308 2.98E-116 340.776 cl00226 nuc_hydro superfamily - - "nuc_hydro: Nucleoside hydrolases. Nucleoside hydrolases cleave the N-glycosidic bond in nucleosides generating ribose and the respective base. These enzymes vary in their substrate specificity. This group contains eukaryotic, bacterial and archeal proteins similar to the inosine-uridine preferring nucleoside hydrolase from Crithidia fasciculata, the xanthosine-inosine-uridine-adenosine-preferring nucleoside hydrolase RihC from Salmonella enterica serovar Typhimurium, the purine-specific inosine-adenosine-guanosine-preferring nucleoside hydrolase from Trypanosoma vivax and, pyrimidine-specific uridine-cytidine preferring nucleoside hydrolases such as URH1 from Saccharomyces cerevisiae, RihA and RihB from Escherichia coli. Nucleoside hydrolases are of interest as a target for antiprotozoan drugs as, no nucleoside hydrolase activity or genes encoding these enzymes have been detected in humans and, parasitic protozoans lack de novo purine synthesis relying on nucleoside hydrolase to scavenge purine and/or pyrimidines from the environment." Q#21620 - CGI_10001460 superfamily 241563 75 115 2.44E-08 50.9408 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#21621 - CGI_10001461 superfamily 241563 72 112 1.73E-07 48.2444 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#21622 - CGI_10022265 superfamily 241563 96 136 0.000945709 38.2292 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#21625 - CGI_10022268 superfamily 241622 1073 1114 0.00740493 36.2028 cl00117 PDZ superfamily N - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95 (post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." 
Q#21626 - CGI_10022269 superfamily 247809 659 859 1.03E-29 117.762 cl17255 ATP-grasp_4 superfamily - - ATP-grasp domain; This family includes a diverse set of enzymes that possess ATP-dependent carboxylate-amine ligase activity. Q#21628 - CGI_10022271 superfamily 246669 772 871 2.02E-12 65.5511 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#21628 - CGI_10022271 superfamily 204692 615 724 7.83E-33 124.763 cl13124 DUF3250 superfamily - - Protein of unknown function (DUF3250); This family of proteins represents a protein with unknown function. It may be the C2 domain from KIAA1005 however this cannot be confirmed. Q#21629 - CGI_10022272 superfamily 247724 40 407 7.80E-172 486.264 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. 
The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#21631 - CGI_10022274 superfamily 243238 105 570 1.35E-136 414.743 cl02915 Voltage_gated_ClC superfamily - - "CLC voltage-gated chloride channel. The ClC chloride channels catalyse the selective flow of Cl- ions across cell membranes, thereby regulating electrical excitation in skeletal muscle and the flow of salt and water across epithelial barriers. This domain is found in the halogen ions (Cl-, Br- and I-) transport proteins of the ClC family. The ClC channels are found in all three kingdoms of life and perform a variety of functions including cellular excitability regulation, cell volume regulation, membrane potential stabilization, acidification of intracellular organelles, signal transduction, transepithelial transport in animals, and the extreme acid resistance response in eubacteria. They lack any structural or sequence similarity to other known ion channels and exhibit unique properties of ion permeation and gating. 
Unlike cation-selective ion channels, which form oligomers containing a single pore along the axis of symmetry, the ClC channels form two-pore homodimers with one pore per subunit without axial symmetry. Although lacking the typical voltage-sensor found in cation channels, all studied ClC channels are gated (opened and closed) by transmembrane voltage. The gating is conferred by the permeating ion itself, acting as the gating charge. In addition, eukaryotic and some prokaryotic ClC channels have two additional C-terminal CBS (cystathionine beta synthase) domains of putative regulatory function." Q#21631 - CGI_10022274 superfamily 246936 687 743 1.15E-18 82.6852 cl15354 CBS_pair superfamily N - "The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase)." 
Q#21632 - CGI_10022275 superfamily 242664 179 257 0.00841561 35.9932 cl01709 PBP2_NikA_DppA_OppA_like superfamily NC - "The substrate-binding domain of an ABC-type nickel/oligopeptide-like import system contains the type 2 periplasmic binding fold; This family represents the periplasmic substrate-binding domain of nickel/dipeptide/oligopeptide transport systems, which function in the import of nickel and peptides, and other closely related proteins. The oligopeptide-binding protein OppA is a periplasmic component of an ATP-binding cassette (ABC) transport system OppABCDEF consisting of five subunits: two homologous integral membrane proteins OppB and OppF that form the translocation pore; two homologous nucleotide-binding domains OppD and OppF that drive the transport process through binding and hydrolysis of ATP; and the substrate-binding protein or receptor OppA that determines the substrate specificity of the transport system. The dipeptide (DppA) and oligopeptide (OppA) binding proteins differ in several ways. The DppA binds dipeptides and some tripeptides and is involved in chemotaxis toward dipeptides, whereas the OppA binds peptides of a wide range of lengths (2-35 amino acid residues) and plays a role in recycling of cell wall peptides, which precludes any involvement in chemotaxis. Similar to the ABC-type dipeptide and oligopeptide import systems, nickel transporter is comprised of five subunits NikABCDE: the two pore-forming integral inner membrane proteins NikB and NikC; the two inner membrane-associated proteins with ATPase activity NikD and NikE; and the periplasmic nickel binding NikA, which is the initial nickel receptor that controls the chemotactic response away from nickel. Most of other periplasmic binding proteins are comprised of only two globular subdomains corresponding to domains I and III of the dipeptide/oligopeptide binding proteins. 
The structural topology of these domains is most similar to that of the type 2 periplasmic binding proteins (PBP2), which are responsible for the uptake of a variety of substrates such as phosphate, sulfate, polysaccharides, lysine/arginine/ornithine, and histidine. The PBP2 bind their ligand in the cleft between these domains in a manner resembling a Venus flytrap. After binding their specific ligand with high affinity, they can interact with a cognate membrane transport complex comprised of two integral membrane domains and two cytoplasmically located ATPase domains. This interaction triggers the ligand translocation across the cytoplasmic membrane energized by ATP hydrolysis. Besides transport proteins, the PBP2 superfamily includes the ligand binding domains of ionotropic glutamate receptors, LysR-type transcriptional regulators, and unorthodox sensor proteins involved in signal transduction." Q#21633 - CGI_10022276 superfamily 244888 37 338 1.40E-73 233.501 cl08282 Acyl_transf_1 superfamily - - Acyl transferase domain; Acyl transferase domain. Q#21634 - CGI_10022277 superfamily 221056 1 943 0 628.765 cl12819 Med24_N superfamily - - "Mediator complex subunit 24 N-terminal; This subunit of the Mediator complex appears to be conserved only from insects to humans. It is essential for correct retinal development in fish. Subunit composition of the mediator contributes to the control of differentiation in the vertebrate CNS as there are divergent functions of the mediator subunits Crsp34/Med27, Trap100/Med24, and Crsp150/Med14." Q#21636 - CGI_10022279 superfamily 245814 40 126 0.00393971 34.7216 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#21639 - CGI_10022282 superfamily 245306 92 186 2.79E-19 79.1667 cl10465 Peptidase_S24_S26 superfamily - - "The S24, S26 LexA/signal peptidase superfamily contains LexA-related and type I signal peptidase families. The S24 LexA protein domains include: the lambda repressor CI/C2 family and related bacterial prophage repressor proteins; LexA (EC 3.4.21.88), the repressor of genes in the cellular SOS response to DNA damage; MucA and the related UmuD proteins, which are lesion-bypass DNA polymerases, induced in response to mitogenic DNA damage; RulA, a component of the rulAB locus that confers resistance to UV, and RuvA, which is a component of the RuvABC resolvasome that catalyzes the resolution of Holliday junctions that arise during genetic recombination and DNA repair. The S26 type I signal peptidase (SPase) family also includes mitochondrial inner membrane protease (IMP)-like members. SPases are essential membrane-bound proteases which function to cleave away the amino-terminal signal peptide from the translocated pre-protein, thus playing a crucial role in the transport of proteins across membranes in all living organisms. All members in this superfamily are unique serine proteases that carry out catalysis using a serine/lysine dyad instead of the prototypical serine/histidine/aspartic acid triad found in most serine proteases." 
Q#21640 - CGI_10022283 superfamily 247792 18 66 1.53E-08 52.0628 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#21640 - CGI_10022283 superfamily 248097 371 406 0.000171684 40.7114 cl17543 C1q superfamily C - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#21641 - CGI_10022284 superfamily 219502 227 431 4.93E-65 212.305 cl06625 Nucleos_tra2_C superfamily - - Na+ dependent nucleoside transporter C-terminus; This family consists of nucleoside transport proteins. Rat CNT 2 is a purine-specific Na+-nucleoside cotransporter localised to the bile canalicular membrane. CNT 1 is a Na+-dependent nucleoside transporter selective for pyrimidine nucleosides and adenosine; it also transports the anti-viral nucleoside analogues AZT and ddC. This alignment covers the C-terminus of this family of transporters. Q#21641 - CGI_10022284 superfamily 219507 129 208 4.18E-06 44.9227 cl18514 Gate superfamily - - "Nucleoside recognition; This region in the nucleoside transporter proteins is responsible for determining nucleoside specificity in the human CNT1 and CNT2 proteins. In the FeoB proteins, which are believed to be Fe2+ transporters, it includes the membrane pore region, so the function of this region is likely to be more general than just nucleoside specificity. 
This family may represent the pore and gate, with a wide potential range of specificity. Hence its name 'Gate'." Q#21641 - CGI_10022284 superfamily 201962 61 115 0.00237932 36.1996 cl03347 Nucleos_tra2_N superfamily - - Na+ dependent nucleoside transporter N-terminus; This family consists of nucleoside transport proteins. Rat CNT 2 is a purine-specific Na+-nucleoside cotransporter localised to the bile canalicular membrane. Rat CNT 1 is a Na+-dependent nucleoside transporter selective for pyrimidine nucleosides and adenosine; it also transports the anti-viral nucleoside analogues AZT and ddC. This alignment covers the N terminus of this family. Q#21642 - CGI_10022285 superfamily 247792 18 66 7.28E-06 42.4328 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#21643 - CGI_10022286 superfamily 247736 85 157 0.0030132 34.2274 cl17182 NAT_SF superfamily - - "N-Acyltransferase superfamily: Various enzymes that characteristically catalyze the transfer of an acyl group to a substrate; NAT (N-Acyltransferase) is a large superfamily of enzymes that mostly catalyze the transfer of an acyl group to a substrate and are implicated in a variety of functions, ranging from bacterial antibiotic resistance to circadian rhythms in mammals. 
Members include GCN5-related N-Acetyltransferases (GNAT) such as Aminoglycoside N-acetyltransferases, Histone N-acetyltransferase (HAT) enzymes, and Serotonin N-acetyltransferase, which catalyze the transfer of an acetyl group to a substrate. The kinetic mechanism of most GNATs involves the ordered formation of a ternary complex: the reaction begins with Acetyl Coenzyme A (AcCoA) binding, followed by binding of substrate, then direct transfer of the acetyl group from AcCoA to the substrate, followed by product and subsequent CoA release. Other family members include Arginine/ornithine N-succinyltransferase, Myristoyl-CoA: protein N-myristoyltransferase, and Acyl-homoserinelactone synthase which have a similar catalytic mechanism but differ in types of acyl groups transferred. Leucyl/phenylalanyl-tRNA-protein transferase and FemXAB nonribosomal peptidyltransferases which catalyze similar peptidyltransferase reactions are also included." Q#21644 - CGI_10022287 superfamily 192997 25 171 4.84E-27 107.282 cl18184 Sterol-sensing superfamily - - "Sterol-sensing domain of SREBP cleavage-activation; Sterol regulatory element-binding proteins (SREBPs) are membrane-bound transcription factors that promote lipid synthesis in animal cells. They are embedded in the membranes of the endoplasmic reticulum (ER) in a helical hairpin orientation and are released from the ER by a two-step proteolytic process. Proteolysis begins when the SREBPs are cleaved at Site-1, which is located at a leucine residue in the middle of the hydrophobic loop in the lumen of the ER. Upon proteolytic processing SREBP can activate the expression of genes involved in cholesterol biosynthesis and uptake. SCAP stimulates cleavage of SREBPs via fusion of the their two C-termini. This domain is the transmembrane region that traverses the membrane eight times and is the sterol-sensing domain of the cleavage protein. WD40 domains are found towards the C-terminus." 
Q#21645 - CGI_10022288 superfamily 242406 1 71 1.47E-07 45.2749 cl01271 DUF1768 superfamily N - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. Q#21647 - CGI_10022290 superfamily 247692 47 540 4.36E-45 166.256 cl17068 AFD_class_I superfamily - - "Adenylate forming domain, Class I; This family includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain." Q#21649 - CGI_10002477 superfamily 217473 4 284 8.90E-27 110.147 cl03978 Mab-21 superfamily - - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#21650 - CGI_10002478 superfamily 241792 92 178 8.08E-24 91.7641 cl00332 Ribosomal_S11 superfamily - - Ribosomal protein S11; Ribosomal protein S11. Q#21652 - CGI_10001664 superfamily 241563 15 59 0.000585486 38.2292 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#21654 - CGI_10001868 superfamily 220635 24 504 4.88E-161 477.026 cl12380 DUF2151 superfamily N - "Cell cycle and development regulator; This is a set of proteins conserved from worms to humans. 
The proteins are a PAN GU kinase substrate, Mat89Bb, essential for S-M cycles of early Drosophila embryogenesis, Xenopus embryonic cell cycles and morphogenesis, and cell division in cultured mammalian cells." Q#21655 - CGI_10001869 superfamily 247724 221 381 9.93E-06 44.3696 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#21655 - CGI_10001869 superfamily 242902 25 104 6.88E-15 71.509 cl02144 TLD superfamily C - TLD; This domain is predicted to be an enzyme and is often found associated with pfam01476. 
Q#21656 - CGI_10008643 superfamily 243035 13 76 6.61E-15 64.5633 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#21657 - CGI_10008644 superfamily 243035 282 390 2.04E-26 101.928 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration.
CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#21657 - CGI_10008644 superfamily 241568 213 265 0.00234481 35.9016 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. 
The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#21657 - CGI_10008644 superfamily 216533 105 152 0.000142482 39.5344 cl09264 HTH_Tnp_Tc3_2 superfamily C - "Transposase; Transposase proteins are necessary for efficient DNA transposition. This family includes the amino-terminal region of Tc1, Tc1A, Tc1B and Tc2B transposases of C. elegans. The region encompasses the specific DNA binding and second DNA recognition domains as well as an amino-terminal region of the catalytic domain of Tc3 as described in. Tc3 is a member of the Tc1/mariner family of transposable elements." Q#21659 - CGI_10008646 superfamily 243051 307 467 6.39E-27 105.538 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal cord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region."
Q#21659 - CGI_10008646 superfamily 245213 31 64 0.000358664 38.3866 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#21659 - CGI_10008646 superfamily 245213 267 299 0.00521983 34.9198 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#21661 - CGI_10008648 superfamily 216347 206 623 3.15E-128 388.047 cl08309 Cu_amine_oxid superfamily - - "Copper amine oxidase, enzyme domain; Copper amine oxidases are a ubiquitous and novel group of quinoenzymes that catalyze the oxidative deamination of primary amines to the corresponding aldehydes, with concomitant reduction of molecular oxygen to hydrogen peroxide. The enzymes are dimers of identical 70-90 kDa subunits, each of which contains a single copper ion and a covalently bound cofactor formed by the post-translational modification of a tyrosine side chain to 2,4,5-trihydroxyphenylalanine quinone (TPQ). This family corresponds to the catalytic domain of the enzyme." 
Q#21662 - CGI_10008649 superfamily 217473 96 216 1.52E-09 57.7601 cl03978 Mab-21 superfamily C - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#21663 - CGI_10007059 superfamily 246031 218 336 5.81E-41 144.605 cl12567 Beta-Casp superfamily - - Beta-Casp domain; The beta-CASP domain is found C terminal to the beta-lactamase domain in pre-mRNA 3'-end-processing endonuclease. The active site of this enzyme is located at the interface of these two domains. Q#21663 - CGI_10007059 superfamily 203663 349 390 2.52E-10 56.7291 cl06522 RMMBL superfamily - - RNA-metabolising metallo-beta-lactamase; The metallo-beta-lactamase fold contains five sequence motifs. The first four motifs are found in pfam00753 and are common to all metallo-beta-lactamases. The fifth motif appears to be specific to function. This entry represents the fifth motif from metallo-beta-lactamases involved in RNA metabolism. Q#21663 - CGI_10007059 superfamily 241867 34 157 4.16E-09 55.2522 cl00446 Lactamase_B superfamily C - Metallo-beta-lactamase superfamily; Metallo-beta-lactamase superfamily. Q#21664 - CGI_10007060 superfamily 218375 51 355 2.45E-92 284.762 cl09349 IFRD superfamily - - "Interferon-related developmental regulator (IFRD); Interferon-related developmental regulator (IFRD1) is the human homologue of the rat early response protein PC4 and its murine homologue TIS7. The exact function of IFRD1 is unknown but it has been shown that PC4 is necessary to muscle differentiation and that it might have a role in signal transduction. This family also contains IFRD2 and its murine equivalent SKMc15 which are highly expressed soon after gastrulation and in the hepatic primordium, suggesting an involvement in early hematopoiesis." 
Q#21664 - CGI_10007060 superfamily 203103 400 451 2.39E-14 67.8005 cl04790 IFRD_C superfamily - - Interferon-related protein conserved region; Family of proteins thought to be involved in regulating gene activity in the proliferative and/or differentiative pathways induced by NGF. Q#21665 - CGI_10007061 superfamily 245201 688 935 7.11E-116 358.772 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#21665 - CGI_10007061 superfamily 247038 257 340 1.15E-11 62.4721 cl15674 IPT superfamily - - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." Q#21665 - CGI_10007061 superfamily 247038 350 426 8.70E-10 56.8489 cl15674 IPT superfamily - - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. 
They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." Q#21665 - CGI_10007061 superfamily 247042 77 214 0.00346089 39.6509 cl15693 Sema superfamily N - "The Sema domain, a protein interacting module, of semaphorins and plexins; Both semaphorins and plexins have a Sema domain on their N-termini. Plexins function as receptors for the semaphorins. Evolutionarily, plexins may be the ancestor of semaphorins. Semaphorins are regulatory molecules in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems, and cancer. Semaphorins can be divided into 7 classes. Vertebrates have members in classes 3-7, whereas classes 1 and 2 are known only in invertebrates. Class 2 and 3 semaphorins are secreted; classes 1 and 4 through 6 are transmembrane proteins; and class 7 is membrane associated via glycosylphosphatidylinositol (GPI) linkage. Plexins are a large family of transmembrane proteins, which are divided into four types (A-D) according to sequence similarity. In vertebrates, type A plexins serve as co-receptors for neuropilins to mediate the signalling of class 3 semaphorins. Plexins serve as direct receptors for several other members of the semaphorin family: class 6 semaphorins signal through type A plexins and class 4 semaphorins through type B plexins. This family also includes the MET and RON receptor tyrosine kinases. 
The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves to recognize and bind receptors." Q#21666 - CGI_10007062 superfamily 245201 525 781 1.10E-118 362.561 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#21666 - CGI_10007062 superfamily 247042 32 229 4.55E-16 79.6822 cl15693 Sema superfamily C - "The Sema domain, a protein interacting module, of semaphorins and plexins; Both semaphorins and plexins have a Sema domain on their N-termini. Plexins function as receptors for the semaphorins. Evolutionarily, plexins may be the ancestor of semaphorins. Semaphorins are regulatory molecules in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems, and cancer. Semaphorins can be divided into 7 classes. Vertebrates have members in classes 3-7, whereas classes 1 and 2 are known only in invertebrates. Class 2 and 3 semaphorins are secreted; classes 1 and 4 through 6 are transmembrane proteins; and class 7 is membrane associated via glycosylphosphatidylinositol (GPI) linkage. Plexins are a large family of transmembrane proteins, which are divided into four types (A-D) according to sequence similarity. In vertebrates, type A plexins serve as co-receptors for neuropilins to mediate the signalling of class 3 semaphorins. 
Plexins serve as direct receptors for several other members of the semaphorin family: class 6 semaphorins signal through type A plexins and class 4 semaphorins through type B plexins. This family also includes the MET and RON receptor tyrosine kinases. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves to recognize and bind receptors." Q#21666 - CGI_10007062 superfamily 243104 276 317 4.13E-09 53.7029 cl02601 PSI superfamily - - "Plexin repeat; A cysteine rich repeat found in several different extracellular receptors. The function of the repeat is unknown. Three copies of the repeat are found in Plexin. Two copies of the repeat are found in mahogany protein. A related C. elegans protein contains four copies of the repeat. The Met receptor contains a single copy of the repeat. The Pfam alignment shows 6 conserved cysteine residues that may form three conserved disulphide bridges, whereas shows 8 conserved cysteines. The pattern of conservation suggests that cysteines 5 and 7 (that are not absolutely conserved) form a disulphide bridge (Personal observation. A Bateman)." Q#21668 - CGI_10007064 superfamily 245847 4 71 0.00011107 36.3264 cl12042 FA58C superfamily N - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#21669 - CGI_10002466 superfamily 245213 117 151 0.00273852 34.9198 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions.
Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#21671 - CGI_10013211 superfamily 191128 16 117 5.90E-17 71.0308 cl04846 Ninjurin superfamily - - Ninjurin; Ninjurin (nerve injury-induced protein) is involved in nerve regeneration and in the formation and function in some tissues. Q#21673 - CGI_10013213 superfamily 241599 12 69 7.06E-17 71.8908 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#21674 - CGI_10013214 superfamily 243119 110 160 7.70E-05 39.7346 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#21674 - CGI_10013214 superfamily 243119 248 300 0.00060644 37.0382 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. 
Q#21674 - CGI_10013214 superfamily 243119 186 225 0.000991093 36.2678 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#21675 - CGI_10013215 superfamily 221135 186 344 1.06E-13 68.5526 cl13079 PI31_Prot_N superfamily - - PI31 proteasome regulator N-terminal; PI31 is a regulatory subunit of the immuno-proteasome which is an inhibitor of the 20 S proteasome in vitro. PI31 is also an F-box protein Fbxo7.Skp1 binding partner which requires an N terminal FP domain in both proteins for the interaction to occur via the FP beta sheets. The structure of PI31 FP domain contains a novel alpha/beta-fold and two intermolecular contact surfaces. This is the N-terminal domain of the members. Q#21675 - CGI_10013215 superfamily 243074 360 405 1.97E-08 50.9681 cl02535 F-box-like superfamily - - F-box-like; This is an F-box-like family.
Q#21677 - CGI_10013217 superfamily 241648 213 263 1.48E-16 74.3314 cl00158 ZnF_GATA superfamily - - Zinc finger DNA binding domain; binds specifically to DNA consensus sequence [AT]GATA[AG] promoter elements; a subset of family members may also bind protein; zinc-finger consensus topology is C-X(2)-C-X(17)-C-X(2)-C Q#21677 - CGI_10013217 superfamily 241648 267 300 3.17E-12 62.005 cl00158 ZnF_GATA superfamily C - Zinc finger DNA binding domain; binds specifically to DNA consensus sequence [AT]GATA[AG] promoter elements; a subset of family members may also bind protein; zinc-finger consensus topology is C-X(2)-C-X(17)-C-X(2)-C Q#21679 - CGI_10013220 superfamily 246669 269 372 1.84E-23 94.0558 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#21679 - CGI_10013220 superfamily 246669 121 245 1.22E-24 97.7109 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC.
C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#21680 - CGI_10013221 superfamily 215869 75 459 4.10E-84 265.145 cl10564 SecY superfamily - - SecY translocase; SecY translocase. Q#21680 - CGI_10013221 superfamily 204513 40 74 1.82E-13 64.7519 cl11188 Plug_translocon superfamily - - Plug domain of Sec61p; The Sec61/SecY translocon mediates translocation of proteins across the membrane and integration of membrane proteins into the lipid bilayer. The structure of the translocon revealed a plug domain blocking the pore on the lumenal side. The plug is unlikely to be important for sealing the translocation pore in yeast but it plays a role in stabilising Sec61p during translocon formation. The domain runs from residues 52-74. Q#21681 - CGI_10013222 superfamily 247743 40 123 7.47E-08 50.6075 cl17189 AAA superfamily C - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases.
Members of the AAA+ ATPases function as molecular chaperones, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#21683 - CGI_10013224 superfamily 245201 2 140 7.93E-18 77.5768 cl09925 PKc_like superfamily C - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#21684 - CGI_10013225 superfamily 218109 257 289 6.98E-09 52.7129 cl12292 Gly_transf_sug superfamily N - "Glycosyltransferase sugar-binding region containing DXD motif; The DXD motif is a short conserved motif found in many families of glycosyltransferases, which add a range of different sugars to other sugars, phosphates and proteins. DXD-containing glycosyltransferases all use nucleoside diphosphate sugars as donors and require divalent cations, usually manganese. The DXD motif is expected to play a carbohydrate binding role in sugar-nucleoside diphosphate and manganese dependent glycosyltransferases."
Q#21685 - CGI_10013226 superfamily 246597 3 262 1.26E-169 487.879 cl13995 MPP_superfamily superfamily - - "metallophosphatase superfamily, metallophosphatase domain; Metallophosphatases (MPPs), also known as metallophosphoesterases, phosphodiesterases (PDEs), binuclear metallophosphoesterases, and dimetal-containing phosphoesterases (DMPs), represent a diverse superfamily of enzymes with a conserved domain containing an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. This superfamily includes: the phosphoprotein phosphatases (PPPs), Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination." Q#21686 - CGI_10013227 superfamily 241574 449 515 6.28E-30 116.531 cl00053 PTPc superfamily C - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." 
Q#21687 - CGI_10013228 superfamily 241574 1 105 1.12E-34 129.628 cl00053 PTPc superfamily N - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#21687 - CGI_10013228 superfamily 241574 170 342 5.84E-12 64.1441 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#21688 - CGI_10013229 superfamily 241574 485 713 2.15E-98 311.057 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. 
Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#21688 - CGI_10013229 superfamily 241574 778 976 1.27E-17 83.0189 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#21695 - CGI_10023882 superfamily 245864 57 504 2.01E-126 379.7 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. 
Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#21699 - CGI_10023886 superfamily 247684 83 251 2.10E-07 49.5095 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#21699 - CGI_10023886 superfamily 202746 225 460 3.57E-60 198.673 cl08402 Hexokinase_2 superfamily - - Hexokinase; Hexokinase (EC:2.7.1.1) contains two structurally similar domains represented by this family and pfam00349. Some members of the family have two copies of each of these domains.
Q#21700 - CGI_10023887 superfamily 241640 28 302 2.89E-65 207.129 cl00149 Tryp_SPc superfamily - - Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues. Q#21701 - CGI_10023888 superfamily 248097 103 226 1.70E-15 69.6014 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#21702 - CGI_10023889 superfamily 248097 163 277 2.12E-17 75.7646 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#21703 - CGI_10023890 superfamily 247792 18 68 1.23E-06 46.2848 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#21704 - CGI_10023891 superfamily 245201 8 307 0 562.197 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. 
These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#21707 - CGI_10023894 superfamily 245213 292 329 1.25E-06 46.4758 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#21707 - CGI_10023894 superfamily 245213 139 176 1.91E-05 43.009 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#21707 - CGI_10023894 superfamily 245213 178 214 5.96E-05 41.4682 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. 
Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#21707 - CGI_10023894 superfamily 245213 570 599 0.000126817 40.3126 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#21707 - CGI_10023894 superfamily 245213 216 252 0.00134056 37.231 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#21708 - CGI_10023895 superfamily 247723 16 88 3.15E-46 155.917 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. 
This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#21708 - CGI_10023895 superfamily 217936 221 359 3.65E-07 48.826 cl18432 Arv1 superfamily N - Arv1-like family; Arv1 is a transmembrane protein with potential zinc-binding motifs. ARV1 is a novel mediator of eukaryotic sterol homeostasis. Q#21709 - CGI_10023896 superfamily 242025 69 160 6.65E-26 96.5177 cl00682 Alba superfamily - - Alba; Alba is a novel chromosomal protein that coats archaeal DNA without compacting it. Q#21710 - CGI_10023897 superfamily 245213 257 292 2.13E-07 49.1722 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#21710 - CGI_10023897 superfamily 245213 332 367 2.28E-07 49.1722 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#21710 - CGI_10023897 superfamily 245213 178 216 6.62E-06 44.935 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#21710 - CGI_10023897 superfamily 245213 512 547 7.71E-06 44.5498 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#21710 - CGI_10023897 superfamily 245213 703 739 1.39E-05 43.7794 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#21710 - CGI_10023897 superfamily 245213 294 329 4.13E-05 42.6238 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#21710 - CGI_10023897 superfamily 245213 632 663 5.16E-05 42.2386 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#21710 - CGI_10023897 superfamily 245213 226 254 6.43E-05 41.8534 cl09941 EGF_CA superfamily N - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#21710 - CGI_10023897 superfamily 245213 407 443 8.82E-05 41.4682 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#21710 - CGI_10023897 superfamily 245213 665 700 0.000137297 41.083 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#21710 - CGI_10023897 superfamily 245213 372 405 0.000169943 40.6978 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#21710 - CGI_10023897 superfamily 245213 550 586 0.000829362 38.7718 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#21710 - CGI_10023897 superfamily 214565 751 822 7.54E-07 48.3277 cl18312 VWC_out superfamily - - von Willebrand factor (vWF) type C domain; von Willebrand factor (vWF) type C domain. Q#21711 - CGI_10023898 superfamily 247856 101 157 1.17E-16 74.8917 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#21711 - CGI_10023898 superfamily 247856 193 255 1.67E-14 68.7285 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#21711 - CGI_10023898 superfamily 247856 28 89 8.47E-14 66.8025 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#21711 - CGI_10023898 superfamily 247856 305 366 1.70E-12 62.9505 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#21711 - CGI_10023898 superfamily 247856 361 423 2.91E-11 59.4837 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#21711 - CGI_10023898 superfamily 247856 435 490 1.17E-07 49.0833 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#21711 - CGI_10023898 superfamily 247856 269 325 1.95E-07 48.3129 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#21712 - CGI_10023899 superfamily 215754 27 119 1.92E-23 91.9312 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#21712 - CGI_10023899 superfamily 215754 130 224 2.87E-23 91.546 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#21712 - CGI_10023899 superfamily 215754 227 288 1.65E-12 61.8856 cl02813 Mito_carr superfamily C - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#21714 - CGI_10023901 superfamily 247856 98 159 7.23E-07 43.3053 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#21714 - CGI_10023901 superfamily 247856 19 82 7.65E-05 37.9125 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#21715 - CGI_10023902 superfamily 243053 229 462 1.39E-71 234.454 cl02485 RasGEF superfamily - - "Guanine nucleotide exchange factor for Ras-like small GTPases. Small GTP-binding proteins of the Ras superfamily function as molecular switches in fundamental events such as signal transduction, cytoskeleton dynamics and intracellular trafficking. Guanine-nucleotide-exchange factors (GEFs) positively regulate these GTP-binding proteins in response to a variety of signals. GEFs catalyze the dissociation of GDP from the inactive GTP-binding proteins. GTP can then bind and induce structural changes that allow interaction with effectors." Q#21715 - CGI_10023902 superfamily 241566 577 626 4.60E-16 73.6803 cl00040 C1 superfamily - - "Protein kinase C conserved region 1 (C1) . Cysteine-rich zinc binding domain. Some members of this domain family bind phorbol esters and diacylglycerol, some are reported to bind RasGTP. May occur in tandem arrangement. Diacylglycerol (DAG) is a second messenger, released by activation of Phospholipase D. Phorbol Esters (PE) can act as analogues of DAG and mimic its downstream effects in, for example, tumor promotion. Protein Kinases C are activated by DAG/PE, this activation is mediated by their N-terminal conserved region (C1). DAG/PE binding may be phospholipid dependent. C1 domains may also mediate DAG/PE signals in chimaerins (a family of Rac GTPase activating proteins), RasGRPs (exchange factors for Ras/Rap1), and Munc13 isoforms (scaffolding proteins involved in exocytosis)." 
Q#21715 - CGI_10023902 superfamily 243067 75 195 3.44E-09 55.1112 cl02520 REM superfamily - - "Guanine nucleotide exchange factor for Ras-like GTPases; N-terminal domain (RasGef_N), also called REM domain (Ras exchanger motif). This domain is common in nucleotide exchange factors for Ras-like small GTPases and is typically found immediately N-terminal to the RasGef (Cdc25-like) domain. REM contacts the GTPase and is assumed to participate in the catalytic activity of the exchange factor. Proteins with the REM domain include Sos1 and Sos2, which relay signals from tyrosine-kinase mediated signalling to Ras, RasGRP1-4, RasGRF1,2, CNrasGEF, and RAP-specific nucleotide exchange factors, to name a few." Q#21715 - CGI_10023902 superfamily 247856 509 559 5.08E-06 44.8461 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#21717 - CGI_10023904 superfamily 243095 54 249 1.03E-55 193.457 cl02570 RhoGAP superfamily - - "RhoGAP: GTPase-activator protein (GAP) for Rho-like GTPases; GAPs towards Rho/Rac/Cdc42-like small GTPases. Small GTPases (G proteins) cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when bound to GDP. The Rho family of small G proteins, which includes Cdc42Hs, activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. G proteins generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 
The RhoGAPs are one of the major classes of regulators of Rho G proteins." Q#21718 - CGI_10023905 superfamily 242876 1 122 1.12E-57 177.546 cl02092 Clat_adaptor_s superfamily - - Clathrin adaptor complex small chain; Clathrin adaptor complex small chain. Q#21719 - CGI_10023906 superfamily 245605 47 332 6.52E-179 505.976 cl11409 RNAP_RPB11_RPB3 superfamily - - "RPB11 and RPB3 subunits of RNA polymerase; The eukaryotic RPB11 and RPB3 subunits of RNA polymerase (RNAP), as well as their archaeal (L and D subunits) and bacterial (alpha subunit) counterparts, are involved in the assembly of RNAP, a large multi-subunit complex responsible for the synthesis of RNA. It is the principal enzyme of the transcription process, and is a final target in many regulatory pathways that control gene expression in all living cells. At least three distinct RNAP complexes are found in eukaryotic nuclei: RNAP I, RNAP II, and RNAP III, for the synthesis of ribosomal RNA precursor, mRNA precursor, and 5S and tRNA, respectively. A single distinct RNAP complex is found in prokaryotes and archaea, which may be responsible for the synthesis of all RNAs. The assembly of the two largest eukaryotic RNAP subunits that provide most of the enzyme's catalytic functions depends on the presence of RPB3/RPB11 heterodimer subunits. This is also true for the archaeal (D/L subunits) and bacterial (alpha subunit) counterparts." Q#21719 - CGI_10023906 superfamily 245814 379 436 6.86E-08 49.7951 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. 
A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#21720 - CGI_10023907 superfamily 245814 33 93 3.14E-09 55.1879 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#21720 - CGI_10023907 superfamily 245814 416 481 4.56E-08 51.7211 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#21720 - CGI_10023907 superfamily 245814 130 181 0.004084 36.6983 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#21720 - CGI_10023907 superfamily 245201 696 972 2.85E-113 352.537 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#21720 - CGI_10023907 superfamily 245814 494 574 8.74E-14 68.686 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#21720 - CGI_10023907 superfamily 245814 309 395 1.86E-11 62.0273 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#21720 - CGI_10023907 superfamily 245814 235 291 0.000109647 41.6117 cl11960 Ig superfamily N - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#21721 - CGI_10023908 superfamily 247743 285 451 2.12E-20 88.3571 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#21721 - CGI_10023908 superfamily 199226 26 58 0.000108301 40.1104 cl11662 LisH superfamily - - "LisH; The LisH (lis homology) domain mediates protein dimerisation and tetramerisation. 
The LisH domain is found in Sif2, a component of the Set3 complex which is responsible for repressing meiotic genes. It has been shown that the LisH domain helps mediate interaction with components of the Set3 complex." Q#21722 - CGI_10023909 superfamily 243072 83 205 2.68E-30 109.781 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#21723 - CGI_10023910 superfamily 222500 110 320 3.03E-06 46.2165 cl16546 DUF4239 superfamily - - Protein of unknown function (DUF4239); This family of proteins is functionally uncharacterized. This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 254 and 270 amino acids in length. Q#21724 - CGI_10023911 superfamily 245814 160 212 0.000209456 40.1651 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#21724 - CGI_10023911 superfamily 216981 314 469 1.03E-14 71.795 cl17087 OTU superfamily - - "OTU-like cysteine protease; This family is comprised of a group of predicted cysteine proteases, homologous to the Ovarian Tumour (OTU) gene in Drosophila. Members include proteins from eukaryotes, viruses and pathogenic bacterium. The conserved cysteine and histidine, and possibly the aspartate, represent the catalytic residues in this putative group of proteases." Q#21725 - CGI_10023912 superfamily 247723 173 265 1.28E-44 153.577 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#21725 - CGI_10023912 superfamily 207717 83 159 2.59E-43 149.355 cl02755 LAM superfamily - - "LA motif RNA-binding domain; This domain is found at the N-terminus of La RNA-binding proteins as well as in other related proteins. Typically, the domain co-occurs with an RNA-recognition motif (RRM), and together these domains function to bind primary transcripts of RNA polymerase III in the La autoantigen (Lupus La protein, LARP3, or Sjoegren syndrome type B antigen, SS-B). 
A variety of La-related proteins (LARPs or La ribonucleoproteins), with differing domain architecture, appear to function as RNA-binding proteins in eukaryotic cellular processes." Q#21725 - CGI_10023912 superfamily 193374 526 544 0.000470813 38.3012 cl15152 SUZ-C superfamily N - SUZ-C motif; The SUZ-C domain is a conserved motif found in one or more copies in several RNA-binding proteins. It is always found at the C-terminus of the protein and appear to be required for localization of the protein to specific subcellular structures. It was first characterized in the C.elegans protein Szy-20 which localizes to the centrosome. It is widely distributed in eukaryotes. Q#21727 - CGI_10023914 superfamily 222269 74 293 4.83E-36 133.603 cl18657 Cupin_8 superfamily - - Cupin-like domain; This cupin like domain shares similarity to the JmjC domain. Q#21728 - CGI_10023915 superfamily 238191 26 498 2.32E-123 373.591 cl18907 Esterase_lipase superfamily - - "Esterases and lipases (includes fungal lipases, cholinesterases, etc.) These enzymes act on carboxylic esters (EC: 3.1.1.-). The catalytic apparatus involves three residues (catalytic triad): a serine, a glutamate or aspartate and a histidine.These catalytic residues are responsible for the nucleophilic attack on the carbonyl carbon atom of the ester bond. In contrast with other alpha/beta hydrolase fold family members, p-nitrobenzyl esterase and acetylcholine esterase have a Glu instead of Asp at the active site carboxylate." Q#21732 - CGI_10023919 superfamily 241599 117 175 2.03E-18 76.128 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." 
Q#21734 - CGI_10023921 superfamily 247684 164 608 0 549.201 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#21735 - CGI_10023922 superfamily 247725 355 462 2.32E-39 139.777 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. 
This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#21735 - CGI_10023922 superfamily 247725 183 300 3.79E-39 139.504 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#21741 - CGI_10011506 superfamily 243161 3 60 4.23E-12 63.9525 cl02739 THAP superfamily C - "THAP domain; The THAP domain is a putative DNA-binding domain (DBD) and probably also binds a zinc ion. It features the conserved C2CH architecture (consensus sequence: Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His). Other universal features include the location of the domain at the N-termini of proteins, its size of about 90 residues, a C-terminal AVPTIF box and several other conserved residues. Orthologues of the human THAP domain have been identified in other vertebrates and probably worms and flies, but not in other eukaryotes or any prokaryotes." Q#21743 - CGI_10011508 superfamily 217293 22 120 5.78E-16 71.8951 cl03788 Neur_chan_LBD superfamily N - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#21743 - CGI_10011508 superfamily 202474 127 190 0.000303357 38.7889 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. 
Q#21744 - CGI_10011509 superfamily 207662 173 255 7.84E-36 128.439 cl02596 NR_DBD_like superfamily - - "DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers; DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. It interacts with a specific DNA site upstream of the target gene and modulates the rate of transcriptional initiation. Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions, from development, reproduction, to homeostasis and metabolism in animals (metazoans). The family contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). Most nuclear receptors bind as homodimers or heterodimers to their target sites, which consist of two hexameric half-sites. Specificity is determined by the half-site sequence, the relative orientation of the half-sites and the number of spacer nucleotides between the half-sites. However, a growing number of nuclear receptors have been reported to bind to DNA as monomers." Q#21744 - CGI_10011509 superfamily 245599 326 491 5.98E-23 95.3674 cl11397 NR_LBD superfamily - - "The ligand binding domain of nuclear receptors, a family of ligand-activated transcription regulators; Ligand-binding domain (LBD) of nuclear receptor (NR): Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions in metazoans, from development, reproduction, to homeostasis and metabolism. 
The superfamily contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. The members of the family include receptors of steroids, thyroid hormone, retinoids, cholesterol by-products, lipids and heme. With few exceptions, NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD)." Q#21745 - CGI_10011510 superfamily 241574 76 139 1.38E-11 57.6792 cl00053 PTPc superfamily NC - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#21746 - CGI_10011511 superfamily 241758 5 199 4.78E-82 247.075 cl00292 AANH_like superfamily - - "Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases, ATP sulphurylases, Universal Stress Response protein and electron transfer flavoprotein (ETF). The domain forms an alpha/beta/alpha fold which binds to Adenosine nucleotide." Q#21747 - CGI_10011512 superfamily 216371 21 369 2.42E-93 288.183 cl18365 ERG4_ERG24 superfamily - - Ergosterol biosynthesis ERG4/ERG24 family; Ergosterol biosynthesis ERG4/ERG24 family. Q#21748 - CGI_10011513 superfamily 241564 224 290 3.27E-26 99.6475 cl00035 BIR superfamily - - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. 
In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." Q#21748 - CGI_10011513 superfamily 241564 42 108 1.41E-16 73.4539 cl00035 BIR superfamily - - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." Q#21748 - CGI_10011513 superfamily 247792 328 373 1.22E-11 59.3156 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#21751 - CGI_10003379 superfamily 243077 58 106 9.83E-11 57.1701 cl02542 DnaJ superfamily - - "DnaJ domain or J-domain. DnaJ/Hsp40 (heat shock protein 40) proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperonine family. 
Hsp40 proteins are characterized by the presence of a J domain, which mediates the interaction with Hsp70. They may contain other domains as well, and the architectures provide a means of classification." Q#21751 - CGI_10003379 superfamily 244850 210 274 7.02E-17 74.9563 cl08096 DUF1992 superfamily - - Domain of unknown function (DUF1992); This family of proteins are functionally uncharacterized. Q#21752 - CGI_10003380 superfamily 245815 77 553 0 749.646 cl11961 ALDH-SF superfamily - - "NAD(P)+-dependent aldehyde dehydrogenase superfamily; The aldehyde dehydrogenase superfamily (ALDH-SF) of NAD(P)+-dependent enzymes, in general, oxidize a wide range of endogenous and exogenous aliphatic and aromatic aldehydes to their corresponding carboxylic acids and play an important role in detoxification. Besides aldehyde detoxification, many ALDH isozymes possess multiple additional catalytic and non-catalytic functions such as participating in metabolic pathways, or as binding proteins, or osmoregulants, to mention a few. The enzyme has three domains, a NAD(P)+ cofactor-binding domain, a catalytic domain, and a bridging domain; and the active enzyme is generally either homodimeric or homotetrameric. The catalytic mechanism is proposed to involve cofactor binding, resulting in a conformational change and activation of an invariant catalytic cysteine nucleophile. The cysteine and aldehyde substrate form an oxyanion thiohemiacetal intermediate resulting in hydride transfer to the cofactor and formation of a thioacylenzyme intermediate. Hydrolysis of the thioacylenzyme and release of the carboxylic acid product occurs, and in most cases, the reduced cofactor dissociates from the enzyme. The evolutionary phylogenetic tree of ALDHs appears to have an initial bifurcation between what has been characterized as the classical aldehyde dehydrogenases, the ALDH family (ALDH) and extended family members or aldehyde dehydrogenase-like (ALDH-L) proteins. 
The ALDH proteins are represented by enzymes which share a number of highly conserved residues necessary for catalysis and cofactor binding and they include such proteins as retinal dehydrogenase, 10-formyltetrahydrofolate dehydrogenase, non-phosphorylating glyceraldehyde 3-phosphate dehydrogenase, delta(1)-pyrroline-5-carboxylate dehydrogenases, alpha-ketoglutaric semialdehyde dehydrogenase, alpha-aminoadipic semialdehyde dehydrogenase, coniferyl aldehyde dehydrogenase and succinate-semialdehyde dehydrogenase. Included in this larger group are all human, Arabidopsis, Tortula, fungal, protozoan, and Drosophila ALDHs identified in families ALDH1 through ALDH22 with the exception of families ALDH18, ALDH19, and ALDH20 which are present in the ALDH-like group. The ALDH-like group is represented by such proteins as gamma-glutamyl phosphate reductase, LuxC-like acyl-CoA reductase, and coenzyme A acylating aldehyde dehydrogenase. All of these proteins have a conserved cysteine that aligns with the catalytic cysteine of the ALDH group." 
Q#21753 - CGI_10003381 superfamily 243092 11 360 6.88E-25 105.11 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#21753 - CGI_10003381 superfamily 243092 502 749 3.33E-16 78.5308 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#21754 - CGI_10002829 superfamily 241596 188 244 5.49E-10 53.3719 cl00081 HLH superfamily - - "Helix-loop-helix domain, found in specific DNA- binding proteins that act as transcription factors; 60-100 amino acids long. 
A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE 5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved in both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#21757 - CGI_10018961 superfamily 241563 86 126 2.33E-06 45.1628 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#21757 - CGI_10018961 superfamily 241563 32 77 0.000392874 38.6144 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#21757 - CGI_10018961 superfamily 110440 548 575 0.00705548 34.6909 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. 
The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#21758 - CGI_10018962 superfamily 243092 13 217 2.15E-32 119.747 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#21759 - CGI_10018963 superfamily 245835 85 207 0.000391002 40.3143 cl12013 BAR superfamily N - "The Bin/Amphiphysin/Rvs (BAR) domain, a dimerization module that binds membranes and detects membrane curvature; BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions including organelle biogenesis, membrane trafficking or remodeling, and cell division and migration. Mutations in BAR containing proteins have been linked to diseases and their inactivation in cells leads to altered membrane dynamics. A BAR domain with an additional N-terminal amphipathic helix (an N-BAR) can drive membrane curvature. These N-BAR domains are found in amphiphysins and endophilins, among others. BAR domains are also frequently found alongside domains that determine lipid specificity, such as the Pleckstrin Homology (PH) and Phox Homology (PX) domains which are present in beta centaurins (ACAPs and ASAPs) and sorting nexins, respectively. A FES-CIP4 Homology (FCH) domain together with a coiled coil region is called the F-BAR domain and is present in Pombe/Cdc15 homology (PCH) family proteins, which include Fes/Fes tyrosine kinases, PACSIN or syndapin, CIP4-like proteins, and srGAPs, among others. The Inverse (I)-BAR or IRSp53/MIM homology Domain (IMD) is found in multi-domain proteins, such as IRSp53 and MIM, that act as scaffolding proteins and transducers of a variety of signaling pathways that link membrane dynamics and the underlying actin cytoskeleton. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The I-BAR domain induces membrane protrusions in the opposite direction compared to classical BAR and F-BAR domains, which produce membrane invaginations. 
BAR domains that also serve as protein interaction domains include those of arfaptin and OPHN1-like proteins, among others, which bind to Rac and Rho GAP domains, respectively." Q#21759 - CGI_10018963 superfamily 241563 18 50 0.000464865 37.7019 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#21760 - CGI_10018964 superfamily 247805 499 657 2.90E-19 87.3928 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#21760 - CGI_10018964 superfamily 247805 1291 1438 2.75E-17 81.6148 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 
Q#21760 - CGI_10018964 superfamily 247905 686 872 4.03E-11 63.0256 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#21760 - CGI_10018964 superfamily 247905 1560 1658 5.40E-07 50.3141 cl17351 HELICc superfamily N - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#21760 - CGI_10018964 superfamily 214946 1765 2077 1.69E-86 287.718 cl15345 Sec63 superfamily - - "Sec63 Brl domain; This domain was named after the yeast Sec63 (or NPL1) (also known as the Brl domain) protein in which it was found. This protein is required for assembly of functional endoplasmic reticulum translocons. Other yeast proteins containing this domain include pre-mRNA splicing helicase BRR2, HFM1 protein and putative helicases." 
Q#21760 - CGI_10018964 superfamily 214946 1064 1233 1.10E-37 146.349 cl15345 Sec63 superfamily N - "Sec63 Brl domain; This domain was named after the yeast Sec63 (or NPL1) (also known as the Brl domain) protein in which it was found. This protein is required for assembly of functional endoplasmic reticulum translocons. Other yeast proteins containing this domain include pre-mRNA splicing helicase BRR2, HFM1 protein and putative helicases." Q#21760 - CGI_10018964 superfamily 214946 983 1058 2.12E-17 84.3323 cl15345 Sec63 superfamily C - "Sec63 Brl domain; This domain was named after the yeast Sec63 (or NPL1) (also known as the Brl domain) protein in which it was found. This protein is required for assembly of functional endoplasmic reticulum translocons. Other yeast proteins containing this domain include pre-mRNA splicing helicase BRR2, HFM1 protein and putative helicases." Q#21765 - CGI_10018969 superfamily 247805 299 503 2.72E-93 292.466 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 
Q#21765 - CGI_10018969 superfamily 247905 514 645 1.34E-36 134.673 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#21766 - CGI_10018970 superfamily 245716 120 141 1.96E-06 44.5425 cl11592 zf-CCCH superfamily - - Zinc finger C-x8-C-x5-C-x3-H type (and similar); Zinc finger C-x8-C-x5-C-x3-H type (and similar). Q#21766 - CGI_10018970 superfamily 245716 24 44 0.000197459 38.7645 cl11592 zf-CCCH superfamily - - Zinc finger C-x8-C-x5-C-x3-H type (and similar); Zinc finger C-x8-C-x5-C-x3-H type (and similar). Q#21766 - CGI_10018970 superfamily 245674 212 262 0.00327344 35.3426 cl11531 DUF904 superfamily - - Protein of unknown function (DUF904); This family consists of several bacterial and archaeal hypothetical proteins of unknown function. 
Q#21767 - CGI_10018971 superfamily 247792 15 63 1.81E-06 45.5144 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#21767 - CGI_10018971 superfamily 243054 193 309 0.00115292 39.3512 cl02488 SPEC superfamily C - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#21767 - CGI_10018971 superfamily 241563 164 195 0.00393077 35.7759 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#21768 - CGI_10018972 superfamily 245819 94 263 4.39E-66 206.276 cl11967 Nucleotidyl_cyc_III superfamily - - "Class III nucleotidyl cyclases; Class III nucleotidyl cyclases are the largest, most diverse group of nucleotidyl cyclases (NC's) containing prokaryotic and eukaryotic proteins. They can be divided into two major groups; the mononucleotidyl cyclases (MNC's) and the diguanylate cyclases (DGC's). The MNC's, which include the adenylate cyclases (AC's) and the guanylate cyclases (GC's), have a conserved cyclase homology domain (CHD), while the DGC's have a conserved GGDEF domain, named after a conserved motif within this subgroup. Their products, cyclic guanylyl and adenylyl nucleotides, are second messengers that play important roles in eukaryotic signal transduction and prokaryotic sensory pathways." Q#21768 - CGI_10018972 superfamily 219526 38 81 0.000892447 38.3691 cl06648 HNOBA superfamily N - "Heme NO binding associated; The HNOBA domain is found associated with the HNOB domain and pfam00211 in soluble cyclases and signalling proteins. The HNOB domain is predicted to function as a heme-dependent sensor for gaseous ligands, and transduce diverse downstream signals, in both bacteria and animals." Q#21769 - CGI_10018973 superfamily 217473 1 130 2.31E-20 90.1169 cl03978 Mab-21 superfamily N - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#21770 - CGI_10018974 superfamily 248097 86 193 1.55E-23 91.5578 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. 
Q#21775 - CGI_10018979 superfamily 192535 63 342 6.99E-08 51.8278 cl18179 7TM_GPCR_Srsx superfamily - - Serpentine type 7TM GPCR chemoreceptor Srsx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srsx is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#21776 - CGI_10018980 superfamily 246031 1 73 3.70E-10 52.9272 cl12567 Beta-Casp superfamily C - Beta-Casp domain; The beta-CASP domain is found C terminal to the beta-lactamase domain in pre-mRNA 3'-end-processing endonuclease. The active site of this enzyme is located at the interface of these two domains. Q#21779 - CGI_10019489 superfamily 247684 17 201 6.23E-17 78.0143 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. 
The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#21781 - CGI_10019491 superfamily 243058 463 577 3.90E-12 65.7987 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#21781 - CGI_10019491 superfamily 220736 1159 1282 2.37E-41 150.923 cl11068 PTEN_C2 superfamily - - "C2 domain of PTEN tumour-suppressor protein; This is the C2 domain-like domain, in greek key form, of the PTEN protein, phosphatidyl-inositol triphosphate phosphatase, and it is the C-terminus. This domain may well include a CBR3 loop which means it plays a central role in membrane binding. This domain associates across an extensive interface with the N-terminal phosphatase domain DSPc (pfam00782) suggesting that the C2 domain productively positions the catalytic part of the protein onto the membrane." 
Q#21781 - CGI_10019491 superfamily 246908 1940 2010 9.18E-28 111.365 cl15255 SH2 superfamily C - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." Q#21781 - CGI_10019491 superfamily 243689 21 101 1.29E-07 51.0901 cl04271 IBN_N superfamily - - Importin-beta N-terminal domain; Importin-beta N-terminal domain. Q#21781 - CGI_10019491 superfamily 241574 1051 1120 3.28E-06 47.7387 cl00053 PTPc superfamily C - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#21781 - CGI_10019491 superfamily 243689 132 199 6.08E-05 43.3861 cl04271 IBN_N superfamily - - Importin-beta N-terminal domain; Importin-beta N-terminal domain. 
Q#21782 - CGI_10019492 superfamily 221788 505 668 3.77E-57 197.409 cl15106 Vps8 superfamily - - "Golgi CORVET complex core vacuolar protein 8; Vps8 is one of the Golgi complex components necessary for vacuolar sorting. Eukaryotic cells contain a highly dynamic endo-membrane system, in which individual organelles keep their identity despite continuous vesicle generation and fusion. Vesicles that bud from a donor membrane are targeted and delivered to each individual organelle, where they release their cargo after fusion with the acceptor membrane. Vps8 is the core component of the endosomal tethering complex CORVET (class C core vacuole/endosome tethering). Vps8 co-operates with Vps21-GTP to mediate endosomal clustering in a reaction that is dependent on Vps3. Vps8 is the only CORVET subunit that is enriched on late endosomes, suggesting that it is a marker for the maturation of late endosomes. Late endosomes form intralumenal vesicles, and the resulting multivesicular bodies fuse with the vacuole to release their cargoes." 
Q#21782 - CGI_10019492 superfamily 243092 86 166 2.30E-06 49.6408 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#21783 - CGI_10019493 superfamily 207654 268 328 9.15E-20 84.0314 cl02574 Annexin superfamily - - Annexin; This family of annexins also includes giardin that has been shown to function as an annexin. Q#21783 - CGI_10019493 superfamily 207654 335 406 1.42E-16 75.1718 cl02574 Annexin superfamily - - Annexin; This family of annexins also includes giardin that has been shown to function as an annexin. Q#21783 - CGI_10019493 superfamily 207654 528 593 1.48E-14 69.3938 cl02574 Annexin superfamily - - Annexin; This family of annexins also includes giardin that has been shown to function as an annexin. 
Q#21783 - CGI_10019493 superfamily 207654 452 518 1.54E-08 52.0598 cl02574 Annexin superfamily - - Annexin; This family of annexins also includes giardin that has been shown to function as an annexin. Q#21785 - CGI_10019495 superfamily 241584 583 677 1.24E-18 83.6999 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#21785 - CGI_10019495 superfamily 245814 496 560 1.14E-12 65.9735 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#21785 - CGI_10019495 superfamily 241584 817 906 2.40E-12 65.5955 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. 
Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#21785 - CGI_10019495 superfamily 245814 90 187 1.34E-30 118.8 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#21785 - CGI_10019495 superfamily 245814 197 279 1.21E-20 89.3764 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#21785 - CGI_10019495 superfamily 245814 300 368 3.92E-20 87.4621 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#21785 - CGI_10019495 superfamily 245814 387 468 1.16E-12 66.115 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#21785 - CGI_10019495 superfamily 241584 712 789 8.64E-07 48.9595 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#21794 - CGI_10019504 superfamily 243069 3 46 0.000101796 37.1267 cl02525 Band_7 superfamily N - "The band 7 domain of flotillin (reggie) like proteins. This group contains proteins similar to stomatin, prohibitin, flotillin, HlfK/C and podicin. 
Many of these band 7 domain-containing proteins are lipid raft-associated. Individual proteins of this band 7 domain family may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. Microdomains formed from flotillin proteins may in addition be dynamic units with their own regulatory functions. Flotillins have been implicated in signal transduction, vesicle trafficking, cytoskeleton rearrangement and are known to interact with a variety of proteins. Stomatin interacts with and regulates members of the degenerin/epithelia Na+ channel family in mechanosensory cells of Caenorhabditis elegans and vertebrate neurons and participates in trafficking of Glut1 glucose transporters. Prohibitin may act as a chaperone for the stabilization of mitochondrial proteins. Prokaryotic HflK/C plays a role in the decision between lysogenic and lytic cycle growth during lambda phage infection. Flotillins have been implicated in the progression of prion disease, in the pathogenesis of neurodegenerative diseases such as Parkinson's and Alzheimer's disease and, in cancer invasion and metastasis. Mutations in the podicin gene give rise to autosomal recessive steroid resistant nephritic syndrome" Q#21797 - CGI_10019507 superfamily 241568 763 817 2.85E-11 62.0952 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." 
Q#21797 - CGI_10019507 superfamily 241568 1938 1992 2.29E-10 59.3988 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#21797 - CGI_10019507 superfamily 241568 1004 1057 2.50E-10 59.3988 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#21797 - CGI_10019507 superfamily 241568 1178 1231 7.90E-10 57.858 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. 
The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#21797 - CGI_10019507 superfamily 241568 2347 2401 9.20E-10 57.858 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#21797 - CGI_10019507 superfamily 241568 2056 2109 2.02E-09 56.7024 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." 
Q#21797 - CGI_10019507 superfamily 241568 1120 1174 2.19E-09 56.7024 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#21797 - CGI_10019507 superfamily 241568 1294 1347 2.29E-09 56.7024 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#21797 - CGI_10019507 superfamily 241568 2289 2343 5.22E-09 55.5468 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. 
The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#21797 - CGI_10019507 superfamily 241568 2114 2168 5.40E-09 55.5468 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#21797 - CGI_10019507 superfamily 241568 1582 1636 6.82E-09 55.1616 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." 
Q#21797 - CGI_10019507 superfamily 241568 1817 1874 1.23E-08 54.3912 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#21797 - CGI_10019507 superfamily 241568 1062 1115 2.44E-08 53.6208 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#21797 - CGI_10019507 superfamily 241568 2172 2226 3.25E-08 53.2356 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. 
The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#21797 - CGI_10019507 superfamily 241568 1414 1464 3.81E-08 52.8504 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#21797 - CGI_10019507 superfamily 241568 1532 1577 5.05E-08 52.4652 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." 
Q#21797 - CGI_10019507 superfamily 241568 1352 1407 6.81E-08 52.08 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#21797 - CGI_10019507 superfamily 241568 1996 2051 8.02E-08 52.08 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#21797 - CGI_10019507 superfamily 241568 945 999 2.33E-07 50.5392 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. 
The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#21797 - CGI_10019507 superfamily 241568 1236 1289 2.51E-07 50.5392 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#21797 - CGI_10019507 superfamily 241568 821 877 3.58E-07 50.154 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." 
Q#21797 - CGI_10019507 superfamily 241568 2230 2277 3.70E-07 50.154 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#21797 - CGI_10019507 superfamily 245213 463 499 1.23E-06 48.0166 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#21797 - CGI_10019507 superfamily 241568 1879 1934 1.54E-06 48.228 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. 
Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#21797 - CGI_10019507 superfamily 241568 1652 1694 2.65E-06 47.4576 cl00043 CCP superfamily N - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#21797 - CGI_10019507 superfamily 245213 388 423 3.58E-06 46.861 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#21797 - CGI_10019507 superfamily 241568 1755 1812 3.82E-06 47.0724 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. 
The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#21797 - CGI_10019507 superfamily 245213 425 461 6.08E-06 46.0906 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#21797 - CGI_10019507 superfamily 241568 882 941 8.34E-06 45.9168 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." 
Q#21797 - CGI_10019507 superfamily 241568 1698 1751 1.03E-05 45.9168 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#21797 - CGI_10019507 superfamily 245213 313 347 0.000121686 42.2386 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#21797 - CGI_10019507 superfamily 245213 349 384 0.000223677 41.4682 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. 
EGF_CA can be found in tandem repeat arrangements." Q#21797 - CGI_10019507 superfamily 241568 705 758 0.000316855 41.2944 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#21797 - CGI_10019507 superfamily 245213 282 309 0.000746827 39.9274 cl09941 EGF_CA superfamily N - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#21797 - CGI_10019507 superfamily 241568 2463 2515 0.00128467 39.3684 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. 
The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#21797 - CGI_10019507 superfamily 241568 1484 1516 0.00175112 38.9832 cl00043 CCP superfamily N - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#21797 - CGI_10019507 superfamily 241568 2418 2459 0.00544905 37.4424 cl00043 CCP superfamily N - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." 
Q#21797 - CGI_10019507 superfamily 241611 508 692 5.92E-39 147.03 cl00102 PTX superfamily - - "Pentraxins are plasma proteins characterized by their pentameric discoid assembly and their Ca2+ dependent ligand binding, such as Serum amyloid P component (SAP) and C-reactive Protein (CRP), which are cytokine-inducible acute-phase proteins implicated in innate immunity. CRP binds to ligands containing phosphocholine, SAP binds to amyloid fibrils, DNA, chromatin, fibronectin, C4-binding proteins and glycosaminoglycans. "Long" pentraxins have N-terminal extensions to the common pentraxin domain; one group, the neuronal pentraxins, may be involved in synapse formation and remodeling, and they may also be able to form heteromultimers." Q#21797 - CGI_10019507 superfamily 219525 98 145 4.14E-10 58.5845 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#21797 - CGI_10019507 superfamily 219525 152 199 1.11E-07 51.2657 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#21797 - CGI_10019507 superfamily 219525 206 253 1.23E-06 48.1841 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#21798 - CGI_10019508 superfamily 241578 76 232 1.62E-36 136.267 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. 
There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#21798 - CGI_10019508 superfamily 241568 441 497 4.03E-12 62.8656 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#21798 - CGI_10019508 superfamily 241568 381 437 1.66E-09 55.5468 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." 
Q#21798 - CGI_10019508 superfamily 241568 501 555 6.70E-07 47.8428 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#21798 - CGI_10019508 superfamily 111397 648 725 9.53E-15 71.2182 cl03620 HYR superfamily - - "HYR domain; This domain is known as the HYR (Hyalin Repeat) domain, after the protein hyalin that is composed exclusively of this repeat. This domain probably corresponds to a new superfamily in the immunoglobulin fold. The function of this domain is uncertain it may be involved in cell adhesion." Q#21798 - CGI_10019508 superfamily 111397 565 646 9.52E-10 56.5806 cl03620 HYR superfamily - - "HYR domain; This domain is known as the HYR (Hyalin Repeat) domain, after the protein hyalin that is composed exclusively of this repeat. This domain probably corresponds to a new superfamily in the immunoglobulin fold. The function of this domain is uncertain it may be involved in cell adhesion." Q#21798 - CGI_10019508 superfamily 219525 313 362 0.000204454 40.095 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#21800 - CGI_10019510 superfamily 245201 11 278 6.39E-86 264.36 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. 
It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#21801 - CGI_10019511 superfamily 245201 6 262 1.41E-71 225.455 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#21803 - CGI_10019513 superfamily 241777 204 349 6.91E-37 137.059 cl00316 Cation_efflux superfamily N - "Cation efflux family; Members of this family are integral membrane proteins, that are found to increase tolerance to divalent metal ions such as cadmium, zinc, and cobalt. These proteins are thought to be efflux pumps that remove these ions from cells." Q#21803 - CGI_10019513 superfamily 241777 10 110 7.03E-30 117.028 cl00316 Cation_efflux superfamily C - "Cation efflux family; Members of this family are integral membrane proteins, that are found to increase tolerance to divalent metal ions such as cadmium, zinc, and cobalt. These proteins are thought to be efflux pumps that remove these ions from cells." Q#21804 - CGI_10019514 superfamily 247725 62 192 1.04E-79 254.197 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. 
All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#21804 - CGI_10019514 superfamily 216381 326 645 6.54E-101 318.38 cl03136 Oxysterol_BP superfamily - - Oxysterol-binding protein; Oxysterol-binding protein. Q#21806 - CGI_10019516 superfamily 241599 356 405 1.19E-07 48.3937 cl00084 homeodomain superfamily C - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#21807 - CGI_10019517 superfamily 243034 451 562 6.14E-12 62.7828 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p 
co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#21807 - CGI_10019517 superfamily 243034 343 442 7.68E-08 50.8416 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#21807 - CGI_10019517 superfamily 243034 165 243 3.17E-07 48.9156 cl02429 TPR superfamily C - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, 
but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#21807 - CGI_10019517 superfamily 243034 531 637 0.000202052 40.056 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the 
cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#21807 - CGI_10019517 superfamily 243034 101 194 0.00344488 36.5892 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#21808 - CGI_10019518 superfamily 241599 297 353 1.67E-11 59.1793 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." 
Q#21810 - CGI_10019520 superfamily 247068 1148 1235 2.69E-11 62.3309 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#21810 - CGI_10019520 superfamily 247068 849 924 2.11E-08 53.4714 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#21810 - CGI_10019520 superfamily 247068 1066 1139 3.98E-07 49.6194 cl15786 CA_like superfamily N - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#21810 - CGI_10019520 superfamily 247068 229 307 3.64E-06 46.923 cl15786 CA_like superfamily C - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#21810 - CGI_10019520 superfamily 247068 429 517 0.0004598 40.3746 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#21810 - CGI_10019520 superfamily 247068 332 417 0.00434701 37.293 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#21810 - CGI_10019520 superfamily 247068 729 823 0.0098531 36.1374 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." 
Q#21812 - CGI_10019522 superfamily 247723 11 90 1.68E-52 173.104 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#21812 - CGI_10019522 superfamily 247723 96 171 2.01E-48 161.95 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. 
The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#21812 - CGI_10019522 superfamily 247723 293 370 1.66E-45 154.314 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#21812 - CGI_10019522 superfamily 247723 189 269 5.31E-39 136.55 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. 
The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#21813 - CGI_10019523 superfamily 247724 3 105 1.43E-25 99.9171 cl17170 Ras_like_GTPase superfamily N - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#21814 - CGI_10004774 superfamily 243051 90 221 6.83E-16 71.6129 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. 
Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal cord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#21814 - CGI_10004774 superfamily 241583 29 81 9.52E-08 49.7778 cl00064 ZnMc superfamily N - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#21815 - CGI_10004776 superfamily 248264 112 281 8.13E-22 91.5297 cl17710 DDE_4 superfamily - - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#21821 - CGI_10006785 superfamily 247856 428 492 0.00700723 34.8309 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. 
Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#21824 - CGI_10006246 superfamily 247038 408 494 1.77E-15 73.7977 cl15674 IPT superfamily - - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." Q#21824 - CGI_10006246 superfamily 247038 311 407 8.12E-15 71.9631 cl15674 IPT superfamily - - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." Q#21824 - CGI_10006246 superfamily 215686 693 793 7.83E-10 58.1976 cl18340 Lipocalin superfamily N - "Lipocalin / cytosolic fatty-acid binding protein family; Lipocalins are transporters for small hydrophobic molecules, such as lipids, steroid hormones, bilins, and retinoids. The family also encompasses the enzyme prostaglandin D synthase (EC:5.3.99.2). 
Alignment subsumes both the lipocalin and fatty acid binding protein signatures from PROSITE. This is supported on structural and functional grounds. The structure is an eight-stranded beta barrel." Q#21824 - CGI_10006246 superfamily 215686 901 1048 3.67E-08 52.8048 cl18340 Lipocalin superfamily - - "Lipocalin / cytosolic fatty-acid binding protein family; Lipocalins are transporters for small hydrophobic molecules, such as lipids, steroid hormones, bilins, and retinoids. The family also encompasses the enzyme prostaglandin D synthase (EC:5.3.99.2). Alignment subsumes both the lipocalin and fatty acid binding protein signatures from PROSITE. This is supported on structural and functional grounds. The structure is an eight-stranded beta barrel." Q#21824 - CGI_10006246 superfamily 215686 783 865 9.23E-07 48.5677 cl18340 Lipocalin superfamily N - "Lipocalin / cytosolic fatty-acid binding protein family; Lipocalins are transporters for small hydrophobic molecules, such as lipids, steroid hormones, bilins, and retinoids. The family also encompasses the enzyme prostaglandin D synthase (EC:5.3.99.2). Alignment subsumes both the lipocalin and fatty acid binding protein signatures from PROSITE. This is supported on structural and functional grounds. The structure is an eight-stranded beta barrel." Q#21824 - CGI_10006246 superfamily 215686 650 718 4.67E-06 46.2565 cl18340 Lipocalin superfamily N - "Lipocalin / cytosolic fatty-acid binding protein family; Lipocalins are transporters for small hydrophobic molecules, such as lipids, steroid hormones, bilins, and retinoids. The family also encompasses the enzyme prostaglandin D synthase (EC:5.3.99.2). Alignment subsumes both the lipocalin and fatty acid binding protein signatures from PROSITE. This is supported on structural and functional grounds. The structure is an eight-stranded beta barrel." 
Q#21824 - CGI_10006246 superfamily 247038 496 564 4.46E-05 43.1745 cl15674 IPT superfamily C - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." Q#21824 - CGI_10006246 superfamily 243104 266 309 9.92E-05 41.3848 cl02601 PSI superfamily - - "Plexin repeat; A cysteine rich repeat found in several different extracellular receptors. The function of the repeat is unknown. Three copies of the repeat are found Plexin. Two copies of the repeat are found in mahogany protein. A related C. elegans protein contains four copies of the repeat. The Met receptor contains a single copy of the repeat. The Pfam alignment shows 6 conserved cysteine residues that may form three conserved disulphide bridges, whereas shows 8 conserved cysteines. The pattern of conservation suggests that cysteines 5 and 7 (that are not absolutely conserved) form a disulphide bridge (Personal observation. A Bateman)." Q#21824 - CGI_10006246 superfamily 243104 96 126 0.00305426 37.1393 cl02601 PSI superfamily C - "Plexin repeat; A cysteine rich repeat found in several different extracellular receptors. The function of the repeat is unknown. Three copies of the repeat are found Plexin. Two copies of the repeat are found in mahogany protein. A related C. elegans protein contains four copies of the repeat. The Met receptor contains a single copy of the repeat. 
The Pfam alignment shows 6 conserved cysteine residues that may form three conserved disulphide bridges, whereas shows 8 conserved cysteines. The pattern of conservation suggests that cysteines 5 and 7 (that are not absolutely conserved) form a disulphide bridge (Personal observation. A Bateman)." Q#21825 - CGI_10006247 superfamily 247856 65 123 1.82E-10 52.5501 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#21825 - CGI_10006247 superfamily 247856 7 50 1.87E-07 44.0757 cl17302 EFh superfamily N - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#21826 - CGI_10006248 superfamily 224772 99 275 5.41E-39 142.086 cl15312 KptA superfamily - - "RNA:NAD 2'-phosphotransferase [Translation, ribosomal structure and biogenesis]" Q#21827 - CGI_10006249 superfamily 242611 166 435 4.70E-147 438.116 cl01629 TPP_enzymes superfamily - - "Thiamine pyrophosphate (TPP) enzyme family, TPP-binding module; found in many key metabolic enzymes which use TPP (also known as thiamine diphosphate) as a cofactor. These enzymes include, among others, the E1 components of the pyruvate, the acetoin and the branched chain alpha-keto acid dehydrogenase complexes." 
Q#21827 - CGI_10006249 superfamily 245606 552 758 4.34E-46 163.461 cl11410 TPP_enzyme_PYR superfamily - - "Pyrimidine (PYR) binding domain of thiamine pyrophosphate (TPP)-dependent enzymes; Thiamine pyrophosphate (TPP) family, pyrimidine (PYR) binding domain; found in many key metabolic enzymes which use TPP (also known as thiamine diphosphate) as a cofactor. TPP binds in the cleft formed by a PYR domain and a PP domain. The PYR domain, binds the aminopyrimidine ring of TPP, the PP domain binds the diphosphate residue. A polar interaction between the conserved glutamate of the PYR domain and the N1' of the TPP aminopyrimidine ring is shared by most TPP-dependent enzymes, and participates in the activation of TPP. The PYR and PP domains have a common fold, but do not share strong sequence conservation. The PP domain is not included in this group. Most TPP-dependent enzymes have the PYR and PP domains on the same subunit although these domains can be alternatively arranged in the primary structure. In the case of 2-oxoisovalerate dehydrogenase (2OXO), sulfopyruvate decarboxylase (ComDE), and the E1 component of human pyruvate dehydrogenase complex (E1- PDHc) the PYR and PP domains appear on different subunits. TPP-dependent enzymes are multisubunit proteins, the smallest catalytic unit being a dimer-of-active sites. For many of these enzymes the active sites lie between PP and PYR domains on different subunits. However, for the homodimeric enzymes 1-deoxy-D-xylulose 5-phosphate synthase (DXS) and Desulfovibrio africanus pyruvate:ferredoxin oxidoreductase (PFOR), each active site lies at the interface of the PYR and PP domains from the same subunit." Q#21828 - CGI_10004126 superfamily 245213 63 99 5.81E-09 48.0166 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#21828 - CGI_10004126 superfamily 245213 32 61 5.49E-08 44.935 cl09941 EGF_CA superfamily N - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#21830 - CGI_10008516 superfamily 242907 95 157 4.18E-13 61.9322 cl02154 YL1_C superfamily N - YL1 nuclear protein C-terminal domain; This domain is found in proteins of the YL1 family. These proteins have been shown to be DNA-binding and may be a transcription factor. This domain is found in proteins that are not YL1 proteins. 
Q#21832 - CGI_10008518 superfamily 243092 9 300 2.16E-18 82.768 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#21837 - CGI_10008523 superfamily 244658 85 233 8.83E-45 150.222 cl07248 CDC37_M superfamily - - "Cdc37 Hsp90 binding domain; Cdc37 is a molecular chaperone required for the activity of numerous eukaryotic protein kinases. This domain corresponds to the Hsp90 chaperone (heat shock protein 90) binding domain of Cdc37. It is found between the N terminal Cdc37 domain pfam03234, which is predominantly involved in kinase binding, and the C terminal domain of Cdc37 pfam08564 whose function is unclear." 
Q#21839 - CGI_10003277 superfamily 241779 125 329 3.91E-64 205.495 cl00318 YjeF_N superfamily - - "YjeF-related protein N-terminus; YjeF-N domain is a novel version of the Rossmann fold with a set of catalytic residues and structural features that are different from the conventional dehydrogenases. YjeF-N domain is fused to Ribokinases in bacteria (YjeF), where they may be phosphatases, and to divergent Sm and the FDF domain in eukaryotes (Dcp3p and FLJ21128), where they may be involved in decapping and catalyze hydrolytic RNA-processing reactions." Q#21839 - CGI_10003277 superfamily 241874 87 155 4.12E-07 49.9763 cl00456 SLC5-6-like_sbd superfamily C - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." 
Q#21841 - CGI_10003279 superfamily 216554 93 216 1.29E-22 91.7721 cl15977 zf-DHHC superfamily N - DHHC palmitoyltransferase; This family includes the well known DHHC zinc binding domain as well as three of the four conserved transmembrane regions found in this family of palmitoyltransferase enzymes. Q#21842 - CGI_10004825 superfamily 248264 1 138 1.12E-08 52.2394 cl17710 DDE_4 superfamily N - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#21843 - CGI_10004826 superfamily 244881 605 765 2.26E-33 131.161 cl08267 ISOPREN_C2_like superfamily N - "This group contains class II terpene cyclases, protein prenyltransferases beta subunit, two broadly specific proteinase inhibitors alpha2-macroglobulin (alpha (2)-M) and pregnancy zone protein (PZP) and, the C3 C4 and C5 components of vertebrate complement. Class II terpene cyclases include squalene cyclase (SQCY) and 2,3-oxidosqualene cyclase (OSQCY), these integral membrane proteins catalyze a cationic cyclization cascade converting linear triterpenes to fused ring compounds. The protein prenyltransferases include protein farnesyltransferase (FTase) and geranylgeranyltransferase types I and II (GGTase-I and GGTase-II) which catalyze the carboxyl-terminal lipidation of Ras, Rab, and several other cellular signal transduction proteins, facilitating membrane associations and specific protein-protein interactions. Alpha (2)-M is a major carrier protein in serum and involved in the immobilization and entrapment of proteases. PZP is a pregnancy associated protein. 
Alpha (2)-M and PZP are known to bind to and, may modulate, the activity of placental protein-14 in T-cell growth and cytokine production thereby protecting the allogeneic fetus from attack by the maternal immune system." Q#21843 - CGI_10004826 superfamily 203720 853 935 2.59E-15 72.9662 cl08457 A2M_recep superfamily - - A-macroglobulin receptor; This family includes the receptor domain region of the alpha-2-macroglobulin family. Q#21843 - CGI_10004826 superfamily 215788 584 621 0.000106527 41.7811 cl08251 A2M superfamily N - Alpha-2-macroglobulin family; This family includes the C-terminal region of the alpha-2-macroglobulin family. Q#21843 - CGI_10004826 superfamily 219677 229 254 0.00582962 35.8764 cl18521 EGF_2 superfamily - - EGF-like domain; This family contains EGF domains found in a variety of extracellular proteins. Q#21844 - CGI_10004827 superfamily 244339 2 183 2.60E-16 75.2629 cl06253 VDE superfamily NC - "Violaxanthin de-epoxidase (VDE); This family represents a conserved region approximately 150 residues long within plant violaxanthin de-epoxidase (VDE). In higher plants, violaxanthin de-epoxidase forms part of a conserved system that dissipates excess energy as heat in the light-harvesting complexes of photosystem II (PSII), thus protecting them from photo-inhibitory damage." Q#21845 - CGI_10004828 superfamily 243035 23 97 3.59E-10 56.8593 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. 
Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. 
Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#21847 - CGI_10012639 superfamily 242406 3 121 1.10E-15 69.5425 cl01271 DUF1768 superfamily - - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. Q#21850 - CGI_10012642 superfamily 245201 2 66 1.21E-20 83.0549 cl09925 PKc_like superfamily NC - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#21850 - CGI_10012642 superfamily 245201 40 108 0.000383225 37.2949 cl09925 PKc_like superfamily NC - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." 
Q#21851 - CGI_10012643 superfamily 247792 7 79 1.51E-31 107.227 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#21853 - CGI_10012645 superfamily 217293 33 232 8.11E-44 154.328 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#21853 - CGI_10012645 superfamily 202474 240 460 2.38E-17 80.0053 cl08379 Neur_chan_memb superfamily - - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#21855 - CGI_10012647 superfamily 245835 26 267 4.74E-131 377.887 cl12013 BAR superfamily - - "The Bin/Amphiphysin/Rvs (BAR) domain, a dimerization module that binds membranes and detects membrane curvature; BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions including organelle biogenesis, membrane trafficking or remodeling, and cell division and migration. Mutations in BAR containing proteins have been linked to diseases and their inactivation in cells leads to altered membrane dynamics. A BAR domain with an additional N-terminal amphipathic helix (an N-BAR) can drive membrane curvature. 
These N-BAR domains are found in amphiphysins and endophilins, among others. BAR domains are also frequently found alongside domains that determine lipid specificity, such as the Pleckstrin Homology (PH) and Phox Homology (PX) domains which are present in beta centaurins (ACAPs and ASAPs) and sorting nexins, respectively. A FES-CIP4 Homology (FCH) domain together with a coiled coil region is called the F-BAR domain and is present in Pombe/Cdc15 homology (PCH) family proteins, which include Fes/Fes tyrosine kinases, PACSIN or syndapin, CIP4-like proteins, and srGAPs, among others. The Inverse (I)-BAR or IRSp53/MIM homology Domain (IMD) is found in multi-domain proteins, such as IRSp53 and MIM, that act as scaffolding proteins and transducers of a variety of signaling pathways that link membrane dynamics and the underlying actin cytoskeleton. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The I-BAR domain induces membrane protrusions in the opposite direction compared to classical BAR and F-BAR domains, which produce membrane invaginations. BAR domains that also serve as protein interaction domains include those of arfaptin and OPHN1-like proteins, among others, which bind to Rac and Rho GAP domains, respectively." Q#21855 - CGI_10012647 superfamily 247683 319 370 7.75E-27 101.21 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. 
SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#21856 - CGI_10012648 superfamily 241640 36 271 2.90E-86 259.516 cl00149 Tryp_SPc superfamily - - Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues. Q#21858 - CGI_10012650 superfamily 241647 189 219 2.31E-05 40.2038 cl00157 WW superfamily - - Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs. Q#21860 - CGI_10012652 superfamily 243072 45 169 1.73E-26 100.921 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#21862 - CGI_10010378 superfamily 241581 26 117 2.30E-13 68.567 cl00062 FHA superfamily - - "Forkhead associated domain (FHA); found in eukaryotic and prokaryotic proteins. Putative nuclear signalling domain. FHA domains may bind phosphothreonine, phosphoserine and sometimes phosphotyrosine. In eukaryotes, many FHA domain-containing proteins localize to the nucleus, where they participate in establishing or maintaining cell cycle checkpoints, DNA repair, or transcriptional regulation. Members of the FHA family include: Dun1, Rad53, Cds1, Mek1, KAPP(kinase-associated protein phosphatase),and Ki-67 (a human nuclear protein related to cell proliferation)." Q#21863 - CGI_10010379 superfamily 248458 46 211 6.57E-11 61.9461 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. 
Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#21864 - CGI_10010380 superfamily 203016 5 135 7.25E-72 217.415 cl15994 Thg1 superfamily - - "tRNAHis guanylyltransferase; The Thg1 protein from Saccharomyces cerevisiae is responsible for adding a GMP residue to the 5' end of tRNA His. The catalytic domain Thg1 contains a RRM (ferredoxin) fold palm domain, just like the viral RNA-dependent RNA polymerases, reverse transcriptases, family A and B DNA polymerases, adenylyl cyclases, diguanylate cyclases (GGDEF domain) and the predicted polymerase of the CRISPR system. Thg1 possesses an active site with three acidic residues that chelate Mg++ cations. Thg1 catalyzes polymerization similar to the 5'-3' polymerases." Q#21864 - CGI_10010380 superfamily 222741 137 218 2.92E-41 138.196 cl16863 Thg1C superfamily C - "Thg1 C terminal domain; Thg1 polymerases contain an additional region of conservation C-terminal to the core palm domain that comprise of 5 helices and two strands. This region has several well-conserved charged residues including a basic residue found towards the end of the first helix of this unit might contribute to the Thg1-specific active site. 
This C-terminal module of Thg1 is predicted to form a helical bundle that functions equivalently to the fingers of the other nucleic acid polymerases, probably in interacting with the template HtRNA." Q#21865 - CGI_10010381 superfamily 243072 7 114 3.06E-26 105.543 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#21865 - CGI_10010381 superfamily 246680 750 827 0.00143676 37.7002 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#21866 - CGI_10010382 superfamily 247068 64 149 6.20E-06 44.6118 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#21866 - CGI_10010382 superfamily 216152 257 610 2.65E-65 219.878 cl02988 Glyco_transf_10 superfamily - - "Glycosyltransferase family 10 (fucosyltransferase); This family of Fucosyltransferases are the enzymes transferring fucose from GDP-Fucose to GlcNAc in an alpha1,3 linkage. This family is know as glycosyltransferase family 10." Q#21870 - CGI_10010386 superfamily 243179 271 358 2.38E-05 42.9051 cl02781 tetraspanin_LEL superfamily N - "Tetraspanin, extracellular domain or large extracellular loop (LEL). Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. The tetraspanin family contains CD9, CD63, CD37, CD53, CD82, CD151, and CD81, amongst others. Tetraspanins are involved in diverse processes such as cell activation and proliferation, adhesion and motility, differentiation, cancer, and others. Their various functions may relate to their ability to act as molecular facilitators, grouping specific cell-surface proteins and affecting formation and stability of signaling complexes. Tetraspanins associate laterally with one another and cluster dynamically with numerous parnter domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web", which may also include integrins." 
Q#21870 - CGI_10010386 superfamily 131187 54 163 0.00809283 36.3441 cl11779 phaR_Bmeg superfamily N - "polyhydroxyalkanoic acid synthase, PhaR subunit; This model describes a protein, PhaR, localized to polyhydroxyalkanoic acid (PHA) inclusion granules in Bacillus cereus and related species. PhaR is required for PHA biosynthesis along with PhaC and may be a regulatory subunit." Q#21873 - CGI_10010389 superfamily 220415 68 146 1.56E-19 79.3631 cl10782 MRP-L27 superfamily N - Mitochondrial ribosomal protein L27; Members of this family of proteins are components of the mitochondrial ribosome large subunit. They are also involved in apoptosis and cell cycle regulation. Q#21874 - CGI_10010390 superfamily 243058 386 502 0.00148803 38.0644 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#21875 - CGI_10010391 superfamily 243179 21 102 5.47E-07 43.6755 cl02781 tetraspanin_LEL superfamily N - "Tetraspanin, extracellular domain or large extracellular loop (LEL). Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. The tetraspanin family contains CD9, CD63, CD37, CD53, CD82, CD151, and CD81, amongst others. 
Tetraspanins are involved in diverse processes such as cell activation and proliferation, adhesion and motility, differentiation, cancer, and others. Their various functions may relate to their ability to act as molecular facilitators, grouping specific cell-surface proteins and affecting formation and stability of signaling complexes. Tetraspanins associate laterally with one another and cluster dynamically with numerous parnter domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web", which may also include integrins." Q#21876 - CGI_10010392 superfamily 243179 41 161 1.18E-15 69.8691 cl02781 tetraspanin_LEL superfamily - - "Tetraspanin, extracellular domain or large extracellular loop (LEL). Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. The tetraspanin family contains CD9, CD63, CD37, CD53, CD82, CD151, and CD81, amongst others. Tetraspanins are involved in diverse processes such as cell activation and proliferation, adhesion and motility, differentiation, cancer, and others. Their various functions may relate to their ability to act as molecular facilitators, grouping specific cell-surface proteins and affecting formation and stability of signaling complexes. Tetraspanins associate laterally with one another and cluster dynamically with numerous parnter domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web", which may also include integrins." Q#21877 - CGI_10010393 superfamily 241629 90 224 4.85E-57 180.099 cl00133 SCP superfamily - - "SCP: SCP-like extracellular protein domain, found in eukaryotes and prokaryotes. 
This family includes plant pathogenesis-related protein 1 (PR-1), which accumulates after infections with pathogens, and may act as an anti-fungal agent or be involved in cell wall loosening. This family also includes CRISPs, mammalian cysteine-rich secretory proteins, which combine SCP with a C-terminal cysteine rich domain, and allergen 5 from vespid venom. Roles for CRISP, in response to pathogens, fertilization, and sperm maturation have been proposed. One member, Tex31 from the venom duct of Conus textile, has been shown to possess proteolytic activity sensitive to serine protease inhibitors. The human GAPR-1 protein has been reported to dimerize, and such a dimer may form an active site containing a catalytic triad. SCP has also been proposed to be a Ca++ chelating serine protease. The Ca++-chelating function would fit with various signaling processes that members of this family, such as the CRISPs, are involved in, and is supported by sequence and structural evidence of a conserved pocket containing two histidines and a glutamate. It also may explain how helothermine, a toxic peptide secreted by the beaded lizard, blocks Ca++ transporting ryanodine receptors. Little is known about the biological roles of the bacterial and archaeal SCP domains." Q#21878 - CGI_10010394 superfamily 131388 42 255 4.53E-05 44.1201 cl17985 hydr_PhnA superfamily C - "phosphonoacetate hydrolase; This family consists of examples of phosphonoacetate hydrolase, an enzyme specific for the cleavage of the C-P bond in phosphonoacetate. Phosphonates are organic compounds with a direct C-P bond that is far less labile that the C-O-P bonds of phosphate attachment sites. Phosphonates may be degraded for phosphorus and energy by broad spectrum C-P lyase encoded by large operon or by specific enzymes for some of the more common phosphonates in nature. This family represents an enzyme from the latter category. 
It may be found encoded near genes for phosphonate transport and for other specific phosphonatases." Q#21879 - CGI_10010395 superfamily 243179 29 142 4.27E-15 67.5579 cl02781 tetraspanin_LEL superfamily - - "Tetraspanin, extracellular domain or large extracellular loop (LEL). Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. The tetraspanin family contains CD9, CD63, CD37, CD53, CD82, CD151, and CD81, amongst others. Tetraspanins are involved in diverse processes such as cell activation and proliferation, adhesion and motility, differentiation, cancer, and others. Their various functions may relate to their ability to act as molecular facilitators, grouping specific cell-surface proteins and affecting formation and stability of signaling complexes. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web", which may also include integrins." Q#21880 - CGI_10012779 superfamily 243090 295 416 5.92E-63 201.313 cl02565 RGS superfamily - - "Regulator of G protein signaling (RGS) domain superfamily; The RGS domain is an essential part of the Regulator of G-protein Signaling (RGS) protein family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play critical regulatory roles as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. While inactive, G-alpha-subunits bind GDP, which is released and replaced by GTP upon agonist activation. 
GTP binding leads to dissociation of the alpha-subunit and the beta-gamma-dimer, allowing them to interact with effectors molecules and propagate signaling cascades associated with cellular growth, survival, migration, and invasion. Deactivation of the G-protein signaling controlled by the RGS domain accelerates GTPase activity of the alpha subunit by hydrolysis of GTP to GDP, which results in the reassociation of the alpha-subunit with the beta-gamma-dimer and thereby inhibition of downstream activity. As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. RGS proteins are also involved in apoptosis and cell proliferation, as well as modulation of cardiac development. Several RGS proteins can fine-tune immune responses, while others play important roles in neuronal signals modulation. Some RGS proteins are principal elements needed for proper vision." Q#21880 - CGI_10012779 superfamily 243038 31 86 7.79E-18 78.1009 cl02442 DEP superfamily N - "DEP domain, named after Dishevelled, Egl-10, and Pleckstrin, where this domain was first discovered. The function of this domain is still not clear, but it is believed to be important for the membrane association of the signaling proteins in which it is present. New studies show that the DEP domain of Sst2, a yeast RGS protein is necessary and sufficient for receptor interaction." 
Q#21880 - CGI_10012779 superfamily 241587 240 276 9.30E-10 54.6026 cl00069 GGL superfamily N - "G protein gamma subunit-like motifs, the alpha-helical G-gamma chain dimerizes with the G-beta propeller subunit as part of the heterotrimeric G-protein complex; involved in signal transduction via G-protein-coupled receptors" Q#21881 - CGI_10012780 superfamily 241563 134 166 3.96E-05 41.504 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#21881 - CGI_10012780 superfamily 220764 161 272 0.00368309 37.1693 cl11102 DUF2458 superfamily C - Protein of unknown function (DUF2458); This is a family of uncharacterized proteins. Q#21883 - CGI_10012782 superfamily 177822 42 232 1.04E-19 84.9717 cl18088 PLN02164 superfamily N - sulfotransferase Q#21884 - CGI_10012783 superfamily 244837 57 83 0.00833994 33.9765 cl07971 Glyco_hydro_3 superfamily NC - Glycosyl hydrolase family 3 N terminal domain; Glycosyl hydrolase family 3 N terminal domain. Q#21885 - CGI_10012784 superfamily 245206 5 229 7.46E-47 159.362 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. 
Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#21886 - CGI_10012785 superfamily 218028 660 713 0.000150028 41.144 cl04479 AAA_4 superfamily C - "Divergent AAA domain; This family is related to the pfam00004 family, and presumably has the same function (ATP-binding)." Q#21886 - CGI_10012785 superfamily 243058 365 468 0.00332276 36.9088 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#21889 - CGI_10012788 superfamily 245602 259 561 2.89E-148 452.825 cl11402 GH31 superfamily - - "The enzymes of glycosyl hydrolase family 31 (GH31) occur in prokaryotes, eukaryotes, and archaea with a wide range of hydrolytic activities, including alpha-glucosidase (glucoamylase and sucrase-isomaltase), alpha-xylosidase, 6-alpha-glucosyltransferase, 3-alpha-isomaltosyltransferase and alpha-1,4-glucan lyase. All GH31 enzymes cleave a terminal carbohydrate moiety from a substrate that varies considerably in size, depending on the enzyme, and may be either a starch or a glycoprotein. In most cases, the pyranose moiety recognized in subsite -1 of the substrate binding site is an alpha-D-glucose, though some GH31 family members show a preference for alpha-D-xylose. 
Several GH31 enzymes can accommodate both glucose and xylose and different levels of discrimination between the two have been observed. Most characterized GH31 enzymes are alpha-glucosidases. In mammals, GH31 members with alpha-glucosidase activity are implicated in at least three distinct biological processes. The lysosomal acid alpha-glucosidase (GAA) is essential for glycogen degradation and a deficiency or malfunction of this enzyme causes glycogen storage disease II, also known as pompe disease. In the endoplasmic reticulum, alpha-glucosidase II catalyzes the second step in the N-linked oligosaccharide processing pathway that constitutes part of the quality control system for glycoprotein folding and maturation. The intestinal enzymes sucrase-isomaltase (SI) and maltase-glucoamylase (MGAM) play key roles in the final stage of carbohydrate digestion, making alpha-glucosidase inhibitors useful in the treatment of type 2 diabetes. GH31 alpha-glycosidases are retaining enzymes that cleave their substrates via an acid/base-catalyzed, double-displacement mechanism involving a covalent glycosyl-enzyme intermediate. Two aspartic acid residues have been identified as the catalytic nucleophile and the acid/base, respectively." Q#21889 - CGI_10012788 superfamily 207662 909 991 5.88E-51 176.106 cl02596 NR_DBD_like superfamily - - "DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers; DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. It interacts with a specific DNA site upstream of the target gene and modulates the rate of transcriptional initiation. Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions, from development, reproduction, to homeostasis and metabolism in animals (metazoans). 
The family contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). Most nuclear receptors bind as homodimers or heterodimers to their target sites, which consist of two hexameric half-sites. Specificity is determined by the half-site sequence, the relative orientation of the half-sites and the number of spacer nucleotides between the half-sites. However, a growing number of nuclear receptors have been reported to bind to DNA as monomers." Q#21889 - CGI_10012788 superfamily 245599 1098 1260 6.60E-41 150.069 cl11397 NR_LBD superfamily - - "The ligand binding domain of nuclear receptors, a family of ligand-activated transcription regulators; Ligand-binding domain (LBD) of nuclear receptor (NR): Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions in metazoans, from development, reproduction, to homeostasis and metabolism. The superfamily contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. The members of the family include receptors of steroids, thyroid hormone, retinoids, cholesterol by-products, lipids and heme. With few exceptions, NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD)." 
Q#21892 - CGI_10014107 superfamily 243050 441 508 5.03E-42 145.301 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#21892 - CGI_10014107 superfamily 243050 381 440 1.09E-36 129.972 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." 
Q#21892 - CGI_10014107 superfamily 243050 321 374 3.36E-22 90.1553 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#21893 - CGI_10014108 superfamily 221564 32 105 2.27E-26 97.6675 cl13797 P5-ATPase superfamily C - "P5-type ATPase cation transporter; This domain family is found in eukaryotes, and is typically between 110 and 126 amino acids in length. The family is found in association with pfam00122, pfam00702. P-type ATPases comprise a large superfamily of proteins, present in both prokaryotes and eukaryotes, that transport inorganic cations and other substrates across cell membranes." Q#21894 - CGI_10014109 superfamily 215733 116 345 3.25E-36 138.082 cl02811 E1-E2_ATPase superfamily - - E1-E2 ATPase; E1-E2 ATPase. Q#21894 - CGI_10014109 superfamily 247756 707 757 8.57E-05 43.7626 cl17202 HAD superfamily N - haloacid dehalogenase-like hydrolase; haloacid dehalogenase-like hydrolase. 
Q#21894 - CGI_10014109 superfamily 222006 469 511 0.000747832 39.1278 cl16182 Hydrolase_like2 superfamily N - Putative hydrolase of sodium-potassium ATPase alpha subunit; This is a putative hydrolase of the sodium-potassium ATPase alpha subunit. Q#21896 - CGI_10014111 superfamily 245202 42 110 6.85E-25 90.6744 cl09927 S1_like superfamily - - "S1_like: Ribosomal protein S1-like RNA-binding domain. Found in a wide variety of RNA-associated proteins. Originally identified in S1 ribosomal protein. This superfamily also contains the Cold Shock Domain (CSD), which is a homolog of the S1 domain. Both domains are members of the Oligonucleotide/oligosaccharide Binding (OB) fold." Q#21897 - CGI_10014112 superfamily 245202 6 74 5.60E-21 83.7822 cl09927 S1_like superfamily - - "S1_like: Ribosomal protein S1-like RNA-binding domain. Found in a wide variety of RNA-associated proteins. Originally identified in S1 ribosomal protein. This superfamily also contains the Cold Shock Domain (CSD), which is a homolog of the S1 domain. Both domains are members of the Oligonucleotide/oligosaccharide Binding (OB) fold." Q#21899 - CGI_10014114 superfamily 216152 257 552 3.48E-64 215.641 cl02988 Glyco_transf_10 superfamily - - "Glycosyltransferase family 10 (fucosyltransferase); This family of Fucosyltransferases are the enzymes transferring fucose from GDP-Fucose to GlcNAc in an alpha1,3 linkage. This family is know as glycosyltransferase family 10." Q#21899 - CGI_10014114 superfamily 216152 12 254 2.08E-49 175.58 cl02988 Glyco_transf_10 superfamily N - "Glycosyltransferase family 10 (fucosyltransferase); This family of Fucosyltransferases are the enzymes transferring fucose from GDP-Fucose to GlcNAc in an alpha1,3 linkage. This family is know as glycosyltransferase family 10." 
Q#21900 - CGI_10014115 superfamily 216152 58 360 5.87E-65 212.944 cl02988 Glyco_transf_10 superfamily - - "Glycosyltransferase family 10 (fucosyltransferase); This family of Fucosyltransferases are the enzymes transferring fucose from GDP-Fucose to GlcNAc in an alpha1,3 linkage. This family is know as glycosyltransferase family 10." Q#21901 - CGI_10014116 superfamily 215647 619 735 4.11E-05 44.5217 cl18338 7tm_2 superfamily NC - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GCPRs).They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97 ; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#21903 - CGI_10014118 superfamily 222150 93 120 0.00106894 35.8305 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#21903 - CGI_10014118 superfamily 246975 80 101 0.00263367 34.6301 cl15478 zf-C2H2 superfamily - - "Zinc finger, C2H2 type; The C2H2 zinc finger is the classical zinc finger domain. The two conserved cysteines and histidines co-ordinate a zinc ion. The following pattern describes the zinc finger. #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C] Where X can be any amino acid, and numbers in brackets indicate the number of residues. 
The positions marked # are those that are important for the stable fold of the zinc finger. The final position can be either his or cys. The C2H2 zinc finger is composed of two short beta strands followed by an alpha helix. The amino terminal part of the helix binds the major groove in DNA binding zinc fingers. The accepted consensus binding sequence for Sp1 is usually defined by the asymmetric hexanucleotide core GGGCGG but this sequence does not include, among others, the GAG (=CTC) repeat that constitutes a high-affinity site for Sp1 binding to the wt1 promoter." Q#21904 - CGI_10014119 superfamily 246671 26 164 8.26E-25 96.7232 cl14606 Reeler_cohesin_like superfamily - - "Domains similar to the eukaryotic reeler domain and bacterial cohesins; This diverse family summarizes a set of distantly related domains, as revealed by structural similarity." Q#21906 - CGI_10014121 superfamily 245201 34 289 9.87E-145 413.701 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#21906 - CGI_10014121 superfamily 247683 276 345 2.91E-05 41.3697 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. 
They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#21907 - CGI_10014122 superfamily 222150 1484 1508 0.00978524 35.8305 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#21909 - CGI_10014124 superfamily 241578 94 140 1.64E-05 46.6092 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." 
Q#21909 - CGI_10014124 superfamily 241578 395 434 0.000141956 43.5276 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#21909 - CGI_10014124 superfamily 241578 353 390 0.000278274 42.7572 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. 
Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#21909 - CGI_10014124 superfamily 241578 225 264 0.00186449 40.446 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#21909 - CGI_10014124 superfamily 241578 179 224 0.00188606 40.446 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#21910 - CGI_10014125 superfamily 247743 312 471 1.19E-05 45.9851 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." 
Q#21910 - CGI_10014125 superfamily 243092 1169 1351 0.00256761 40.396 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#21911 - CGI_10014126 superfamily 247057 6 64 8.18E-22 91.872 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. 
They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#21911 - CGI_10014126 superfamily 245201 198 445 1.18E-121 382.885 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#21911 - CGI_10014126 superfamily 247683 457 509 8.09E-05 42.5042 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." 
Q#21912 - CGI_10014127 superfamily 221585 82 195 4.78E-25 97.1448 cl13842 hSac2 superfamily - - "Inositol phosphatase; This domain family is found in eukaryotes, and is approximately 120 amino acids in length. The family is found in association with pfam02383. hSac2 functions as an inositol polyphosphate 5-phosphatase." Q#21913 - CGI_10014128 superfamily 247095 14 465 6.31E-152 442.095 cl15837 alkPPc superfamily - - "Alkaline phosphatase homologues; alkaline phosphatases are non-specific phosphomonoesterases that catalyze the hydrolysis reaction via a phosphoseryl intermediate to produce inorganic phosphate and the corresponding alcohol, optimally at high pH. Alkaline phosphatase exists as a dimer, each monomer binding 2 zinc atoms and one magnesium atom, which are essential for enzymatic activity." Q#21914 - CGI_10014129 superfamily 241607 156 195 1.14E-09 51.1166 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chyomotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. 
Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#21914 - CGI_10014129 superfamily 241607 73 107 0.0004052 36.479 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chyomotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. 
The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#21915 - CGI_10014130 superfamily 247095 28 477 1.21E-158 459.044 cl15837 alkPPc superfamily - - "Alkaline phosphatase homologues; alkaline phosphatases are non-specific phosphomonoesterases that catalyze the hydrolysis reaction via a phosphoseryl intermediate to produce inorganic phosphate and the corresponding alcohol, optimally at high pH. Alkaline phosphatase exists as a dimer, each monomer binding 2 zinc atoms and one magnesium atom, which are essential for enzymatic activity." Q#21916 - CGI_10014131 superfamily 247095 452 790 6.85E-124 380.463 cl15837 alkPPc superfamily N - "Alkaline phosphatase homologues; alkaline phosphatases are non-specific phosphomonoesterases that catalyze the hydrolysis reaction via a phosphoseryl intermediate to produce inorganic phosphate and the corresponding alcohol, optimally at high pH. Alkaline phosphatase exists as a dimer, each monomer binding 2 zinc atoms and one magnesium atom, which are essential for enzymatic activity." Q#21916 - CGI_10014131 superfamily 247095 1 215 6.35E-54 192.67 cl15837 alkPPc superfamily C - "Alkaline phosphatase homologues; alkaline phosphatases are non-specific phosphomonoesterases that catalyze the hydrolysis reaction via a phosphoseryl intermediate to produce inorganic phosphate and the corresponding alcohol, optimally at high pH. Alkaline phosphatase exists as a dimer, each monomer binding 2 zinc atoms and one magnesium atom, which are essential for enzymatic activity." Q#21919 - CGI_10012423 superfamily 241858 81 202 9.98E-09 51.132 cl00429 SNARE_assoc superfamily - - SNARE associated Golgi protein; This is a family of SNARE associated Golgi proteins. The yeast member of this family localises with the t-SNARE Tlg2. 
Q#21922 - CGI_10012426 superfamily 243035 59 163 2.10E-10 54.163 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#21925 - CGI_10012429 superfamily 243072 181 354 6.82E-24 100.536 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#21925 - CGI_10012429 superfamily 243072 86 241 4.43E-23 98.2246 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#21925 - CGI_10012429 superfamily 245201 2118 2378 6.99E-137 430.101 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#21927 - CGI_10012431 superfamily 245670 146 340 2.48E-49 175.462 cl11519 DENN superfamily - - DENN (AEX-3) domain; DENN (after differentially expressed in neoplastic vs normal cells) is a domain which occurs in several proteins involved in Rab- mediated processes or regulation of MAPK signalling pathways. Q#21927 - CGI_10012431 superfamily 243635 6 96 5.42E-21 90.4716 cl04085 uDENN superfamily - - uDENN domain; This region is always found associated with pfam02141. It is predicted to form an all beta domain. Q#21927 - CGI_10012431 superfamily 208095 417 485 3.64E-09 55.3738 cl04084 dDENN superfamily - - dDENN domain; This region is always found associated with pfam02141. It is predicted to form a globular domain. This domain is predicted to be completely alpha helical. Although not statistically supported it has been suggested that this domain may be similar to members of the Rho/Rac/Cdc42 GEF family. Q#21928 - CGI_10012432 superfamily 243040 29 168 2.48E-44 157.914 cl02447 CRD_FZ superfamily - - "CRD_domain cysteine-rich domain, also known as Fz (frizzled) domain; CRD_FZ is an essential component of a number of cell surface receptors, which are involved in multiple signal transduction pathways, particularly in modulating the activity of the Wnt proteins, which play a fundamental role in the early development of metazoans. 
CRD is also found in secreted frizzled related proteins (SFRPs), which lack the transmembrane segment found in the frizzled protein. The CRD domain is also present in the alpha-1 chain of mouse type XVIII collagen, in carboxypeptidase Z, several receptor tyrosine kinases, and the mosaic transmembrane serine protease corin. The CRD domain is well conserved in metazoans - 10 frizzled proteins have been identified in mammals, 4 in Drosophila and 3 in Caenorhabditis elegans. CRD domains have also been identified in multiple tandem copies in a Dictyostelium discoideum protein. Very little is known about the mechanism by which CRD domains interact with their ligands. The domain contains 10 conserved cysteines." Q#21928 - CGI_10012432 superfamily 215647 211 348 0.00111597 40.6697 cl18338 7tm_2 superfamily C - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GPCRs). They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." 
Q#21929 - CGI_10012433 superfamily 245244 102 195 1.25E-20 83.091 cl10045 tRNA_int_endo superfamily - - "tRNA intron endonuclease, catalytic C-terminal domain; Members of this family cleave pre tRNA at the 5' and 3' splice sites to release the intron EC:3.1.27.9." Q#21929 - CGI_10012433 superfamily 217225 45 92 0.00690317 33.3655 cl03709 tRNA_int_endo_N superfamily - - "tRNA intron endonuclease, N-terminal domain; Members of this family cleave pre tRNA at the 5' and 3' splice sites to release the intron EC:3.1.27.9." Q#21931 - CGI_10012435 superfamily 243092 59 352 5.65E-37 137.466 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#21931 - CGI_10012435 superfamily 217935 353 440 1.80E-30 112.754 cl04424 Sof1 superfamily - - Sof1-like domain; Sof1 is essential for cell growth and is a component of the nucleolar rRNA processing machinery. 
Q#21932 - CGI_10012436 superfamily 247727 8 105 1.57E-13 63.2178 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#21933 - CGI_10000090 superfamily 243092 38 280 3.10E-13 68.1304 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#21934 - CGI_10006288 superfamily 246669 408 545 3.18E-83 257.682 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#21934 - CGI_10006288 superfamily 246669 252 375 6.02E-34 124.661 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. 
C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#21936 - CGI_10006290 superfamily 220107 488 754 1.35E-60 206.116 cl07638 MIF4G_like_2 superfamily - - "MIF4G like; Members of this family are involved in mediating U snRNA export from the nucleus. They adopt a highly helical structure, wherein the polypeptide chain forms a right-handed solenoid. At the tertiary level, the domain is composed of a superhelical arrangement of successive antiparallel pairs of helices." Q#21936 - CGI_10006290 superfamily 149958 307 474 1.75E-58 197.624 cl07636 MIF4G_like superfamily - - "MIF4G like; Members of this family are involved in mediating U snRNA export from the nucleus. They adopt a highly helical structure, wherein the polypeptide chain forms a right-handed solenoid. At the tertiary level, the domain is composed of a superhelical arrangement of successive antiparallel pairs of helices." Q#21936 - CGI_10006290 superfamily 243128 31 243 1.37E-24 102.824 cl02652 MIF4G superfamily - - "MIF4G domain; MIF4G is named after Middle domain of eukaryotic initiation factor 4G (eIF4G). Also occurs in NMD2p and CBP80. The domain is rich in alpha-helices and may contain multiple alpha-helical repeats. In eIF4G, this domain binds eIF4A, eIF3, RNA and DNA." Q#21937 - CGI_10006291 superfamily 219626 69 108 3.29E-14 61.8127 cl12354 DUF1674 superfamily - - Protein of unknown function (DUF1674); The members of this family are sequences derived from hypothetical eukaryotic and bacterial proteins. The region in question is approximately 60 residues long. Q#21939 - CGI_10006294 superfamily 241802 187 296 7.88E-28 109.499 cl00342 Trp-synth-beta_II superfamily N - "Tryptophan synthase beta superfamily (fold type II); this family of pyridoxal phosphate (PLP)-dependent enzymes catalyzes beta-replacement and beta-elimination reactions. 
This CD corresponds to aminocyclopropane-1-carboxylate deaminase (ACCD), tryptophan synthase beta chain (Trp-synth_B), cystathionine beta-synthase (CBS), O-acetylserine sulfhydrylase (CS), serine dehydratase (Ser-dehyd), threonine dehydratase (Thr-dehyd), diaminopropionate ammonia lyase (DAL), and threonine synthase (Thr-synth). ACCD catalyzes the conversion of 1-aminocyclopropane-1-carboxylate to alpha-ketobutyrate and ammonia. Tryptophan synthase folds into a tetramer, where the beta chain is the catalytic PLP-binding subunit and catalyzes the formation of L-tryptophan from indole and L-serine. CBS is a tetrameric hemeprotein that catalyzes condensation of serine and homocysteine to cystathionine. CS is a homodimer that catalyzes the formation of L-cysteine from O-acetyl-L-serine. Ser-dehyd catalyzes the conversion of L- or D-serine to pyruvate and ammonia. Thr-dehyd is active as a homodimer and catalyzes the conversion of L-threonine to 2-oxobutanoate and ammonia. DAL is also a homodimer and catalyzes the alpha, beta-elimination reaction of both L- and D-alpha, beta-diaminopropionate to form pyruvate and ammonia. Thr-synth catalyzes the formation of threonine and inorganic phosphate from O-phosphohomoserine." Q#21939 - CGI_10006294 superfamily 245874 21 64 7.42E-05 40.1022 cl12111 TNFR superfamily NC - "Tumor necrosis factor receptor (TNFR) domain; superfamily of TNF-like receptor domains. When bound to TNF-like cytokines, TNFRs trigger multiple signal transduction pathways, they are involved in inflammation response, apoptosis, autoimmunity and organogenesis. TNFRs domains are elongated with generally three tandem repeats of cysteine-rich domains (CRDs). They fit in the grooves between protomers within the ligand trimer. Some TNFRs, such as NGFR and HveA, bind ligands with no structural similarity to TNF and do not bind ligand trimers." 
Q#21940 - CGI_10006295 superfamily 216939 34 86 0.00506249 33.4053 cl03492 PC4 superfamily N - Transcriptional Coactivator p15 (PC4); p15 has a bipartite structure composed of an amino-terminal regulatory domain and a carboxy-terminal cryptic DNA-binding domain. The DNA-binding activity of the carboxy-terminal is disguised by the amino-terminal p15 domain. Activity is controlled by protein kinases that target the regulatory domain. Q#21942 - CGI_10006297 superfamily 248458 50 110 1.66E-05 44.6121 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. 
Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#21942 - CGI_10006297 superfamily 242406 137 275 4.55E-21 87.2617 cl01271 DUF1768 superfamily - - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. Q#21945 - CGI_10001341 superfamily 243179 112 217 7.10E-19 78.7002 cl02781 tetraspanin_LEL superfamily - - "Tetraspanin, extracellular domain or large extracellular loop (LEL). Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. The tetraspanin family contains CD9, CD63, CD37, CD53, CD82, CD151, and CD81, amongst others. Tetraspanins are involved in diverse processes such as cell activation and proliferation, adhesion and motility, differentiation, cancer, and others. Their various functions may relate to their ability to act as molecular facilitators, grouping specific cell-surface proteins and affecting formation and stability of signaling complexes. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web", which may also include integrins." Q#21946 - CGI_10010295 superfamily 247057 767 834 2.19E-38 137.919 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. 
Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#21946 - CGI_10010295 superfamily 247744 2 45 2.33E-07 50.6893 cl17190 NK superfamily C - "Nucleoside/nucleotide kinase (NK) is a protein superfamily consisting of multiple families of enzymes that share structural similarity and are functionally related to the catalysis of the reversible phosphate group transfer from nucleoside triphosphates to nucleosides/nucleotides, nucleoside monophosphates, or sugars. Members of this family play a wide variety of essential roles in nucleotide metabolism, the biosynthesis of coenzymes and aromatic compounds, as well as the metabolism of sugar and sulfate." Q#21946 - CGI_10010295 superfamily 247744 42 74 0.00345046 38.3629 cl17190 NK superfamily N - "Nucleoside/nucleotide kinase (NK) is a protein superfamily consisting of multiple families of enzymes that share structural similarity and are functionally related to the catalysis of the reversible phosphate group transfer from nucleoside triphosphates to nucleosides/nucleotides, nucleoside monophosphates, or sugars. Members of this family play a wide variety of essential roles in nucleotide metabolism, the biosynthesis of coenzymes and aromatic compounds, as well as the metabolism of sugar and sulfate." Q#21947 - CGI_10010296 superfamily 245342 424 526 1.35E-13 69.2998 cl10594 ERCC4 superfamily - - ERCC4 domain; This domain is a family of nucleases. The family includes EME1 which is an essential component of a Holliday junction resolvase. 
EME1 interacts with MUS81 to form a DNA structure-specific endonuclease. Q#21947 - CGI_10010296 superfamily 241563 239 288 0.00222887 37.844 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#21949 - CGI_10010298 superfamily 247724 3 177 3.32E-131 368.432 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." 
Q#21950 - CGI_10010299 superfamily 247856 47 105 1.43E-10 54.0909 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#21951 - CGI_10010300 superfamily 245819 284 431 0.00195876 38.3291 cl11967 Nucleotidyl_cyc_III superfamily - - "Class III nucleotidyl cyclases; Class III nucleotidyl cyclases are the largest, most diverse group of nucleotidyl cyclases (NC's) containing prokaryotic and eukaryotic proteins. They can be divided into two major groups; the mononucleotidyl cyclases (MNC's) and the diguanylate cyclases (DGC's). The MNC's, which include the adenylate cyclases (AC's) and the guanylate cyclases (GC's), have a conserved cyclase homology domain (CHD), while the DGC's have a conserved GGDEF domain, named after a conserved motif within this subgroup. Their products, cyclic guanylyl and adenylyl nucleotides, are second messengers that play important roles in eukaryotic signal transduction and prokaryotic sensory pathways." Q#21951 - CGI_10010300 superfamily 245201 1 180 1.09E-23 100.304 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." 
Q#21952 - CGI_10010301 superfamily 247850 11 90 1.23E-10 57.0671 cl17296 PBP_like_2 superfamily C - PBP superfamily domain; This domain belongs to the periplasmic binding protein superfamily. Q#21953 - CGI_10010302 superfamily 245819 704 851 0.0016311 39.0995 cl11967 Nucleotidyl_cyc_III superfamily - - "Class III nucleotidyl cyclases; Class III nucleotidyl cyclases are the largest, most diverse group of nucleotidyl cyclases (NC's) containing prokaryotic and eukaryotic proteins. They can be divided into two major groups; the mononucleotidyl cyclases (MNC's) and the diguanylate cyclases (DGC's). The MNC's, which include the adenylate cyclases (AC's) and the guanylate cyclases (GC's), have a conserved cyclase homology domain (CHD), while the DGC's have a conserved GGDEF domain, named after a conserved motif within this subgroup. Their products, cyclic guanylyl and adenylyl nucleotides, are second messengers that play important roles in eukaryotic signal transduction and prokaryotic sensory pathways." Q#21953 - CGI_10010302 superfamily 245201 370 600 4.74E-35 135.357 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#21953 - CGI_10010302 superfamily 247850 7 224 3.83E-26 110.225 cl17296 PBP_like_2 superfamily N - PBP superfamily domain; This domain belongs to the periplasmic binding protein superfamily. Q#21954 - CGI_10010303 superfamily 247850 14 104 0.000952721 35.5735 cl17296 PBP_like_2 superfamily C - PBP superfamily domain; This domain belongs to the periplasmic binding protein superfamily. 
Q#21955 - CGI_10010304 superfamily 247850 7 45 0.00347807 34.0327 cl17296 PBP_like_2 superfamily C - PBP superfamily domain; This domain belongs to the periplasmic binding protein superfamily. Q#21956 - CGI_10010305 superfamily 241884 18 205 6.66E-111 318.4 cl00467 Ntn_hydrolase superfamily - - "The Ntn hydrolases (N-terminal nucleophile) are a diverse superfamily of enzymes that are activated autocatalytically via an N-terminally located nucleophilic amino acid. N-terminal nucleophile (NTN-) hydrolase superfamily, which contains a four-layered alpha, beta, beta, alpha core structure. This family of hydrolases includes penicillin acylase, the 20S proteasome alpha and beta subunits, and glutamate synthase. The mechanism of activation of these proteins is conserved, although they differ in their substrate specificities. All known members catalyze the hydrolysis of amide bonds in either proteins or small molecules, and each one of them is synthesized as a preprotein. For each, an autocatalytic endoproteolytic process generates a new N-terminal residue. This mature N-terminal residue is central to catalysis and acts as both a polarizing base and a nucleophile during the reaction. The N-terminal amino group acts as the proton acceptor and activates either the nucleophilic hydroxyl in a Ser or Thr residue or the nucleophilic thiol in a Cys residue. The position of the N-terminal nucleophile in the active site and the mechanism of catalysis are conserved in this family, despite considerable variation in the protein sequences." Q#21957 - CGI_10010306 superfamily 241622 15 88 6.54E-18 74.1402 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. 
Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archebacteria, and eurkayotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#21958 - CGI_10010307 superfamily 241628 18 300 1.24E-15 75.2026 cl00130 PseudoU_synth superfamily - - "Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi); Pseudouridine synthases contains the RsuA/RluD, TruA, TruB and TruD families. This group consists of eukaryotic, bacterial and archeal pseudouridine synthases. Some psi sites such as psi55,13,38 and 39 in tRNA are highly conserved, being in the same position in eubacteria, archeabacteria and eukaryotes. Other psi sites occur in a more restricted fashion, for example psi2604in 23S RNA made by E.coli RluF has only been detected in E.coli. Human dyskerin with the help of guide RNAs makes the hundreds of psueudouridnes present in rRNA and small nuclear RNAs (snRNAs). Mutations in human dyskerin cause X-linked dyskeratosis congenitas. Missense mutation in human PUS1 causes mitochondrial myopathy and sideroblastic anemia (MLASA)." 
Q#21960 - CGI_10010309 superfamily 192370 37 116 6.44E-17 73.2257 cl12371 WD-3 superfamily C - "WD-repeat region; This entry is of a region of approximately 100 residues containing three WD repeats and six cysteine residues possibly as three cystine-bridges. These regions are contained within the Fancl protein in humans which is the putative E3 ubiquitin ligase subunit of the FA complex (Fanconi anaemia). Eight subunits of the Fanconi anaemia gene products form a multisubunit nuclear complex which is required for mono-ubiquitination of a downstream FA protein, FANCD2. The WD repeats are required for interaction with other subunits of the FA complex." Q#21961 - CGI_10010310 superfamily 241628 99 317 1.21E-47 160.966 cl00130 PseudoU_synth superfamily - - "Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi); Pseudouridine synthases contains the RsuA/RluD, TruA, TruB and TruD families. This group consists of eukaryotic, bacterial and archeal pseudouridine synthases. Some psi sites such as psi55,13,38 and 39 in tRNA are highly conserved, being in the same position in eubacteria, archeabacteria and eukaryotes. Other psi sites occur in a more restricted fashion, for example psi2604in 23S RNA made by E.coli RluF has only been detected in E.coli. Human dyskerin with the help of guide RNAs makes the hundreds of psueudouridnes present in rRNA and small nuclear RNAs (snRNAs). Mutations in human dyskerin cause X-linked dyskeratosis congenitas. Missense mutation in human PUS1 causes mitochondrial myopathy and sideroblastic anemia (MLASA)." Q#21963 - CGI_10010312 superfamily 192370 3 146 2.31E-40 141.406 cl12371 WD-3 superfamily N - "WD-repeat region; This entry is of a region of approximately 100 residues containing three WD repeats and six cysteine residues possibly as three cystine-bridges. 
These regions are contained within the Fancl protein in humans which is the putative E3 ubiquitin ligase subunit of the FA complex (Fanconi anaemia). Eight subunits of the Fanconi anaemia gene products form a multisubunit nuclear complex which is required for mono-ubiquitination of a downstream FA protein, FANCD2. The WD repeats are required for interaction with other subunits of the FA complex." Q#21963 - CGI_10010312 superfamily 204746 154 222 1.32E-31 111.744 cl18258 FANCL_C superfamily - - "FANCL C-terminal domain; This domain is found at the C-terminus of the Fancl protein in humans which is the putative E3 ubiquitin ligase subunit of the FA complex (Fanconi anaemia). Eight subunits of the Fanconi anaemia gene products form a multisubunit nuclear complex which is required for mono-ubiquitination of a downstream FA protein, FANCD2." Q#21964 - CGI_10010313 superfamily 243166 42 231 2.33E-16 73.867 cl02759 TRAM_LAG1_CLN8 superfamily - - TLC domain; TLC domain. Q#21965 - CGI_10010314 superfamily 243166 56 248 6.29E-19 81.5709 cl02759 TRAM_LAG1_CLN8 superfamily - - TLC domain; TLC domain. Q#21966 - CGI_10010315 superfamily 243166 57 243 2.22E-18 80.0301 cl02759 TRAM_LAG1_CLN8 superfamily - - TLC domain; TLC domain. 
Q#21967 - CGI_10010316 superfamily 246675 9 308 1.31E-135 390.037 cl14615 PI-PLCc_GDPD_SF superfamily - - "Catalytic domain of phosphoinositide-specific phospholipase C-like phosphodiesterases superfamily; The PI-PLC-like phosphodiesterases superfamily represents the catalytic domains of bacterial phosphatidylinositol-specific phospholipase C (PI-PLC, EC 4.6.1.13), eukaryotic phosphoinositide-specific phospholipase C (PI-PLC, EC 3.1.4.11), glycerophosphodiester phosphodiesterases (GP-GDE, EC 3.1.4.46), sphingomyelinases D (SMases D) (sphingomyelin phosphodiesterase D, EC 3.1.4.41) from spider venom, SMases D-like proteins, and phospholipase D (PLD) from several pathogenic bacteria, as well as their uncharacterized homologs found in organisms ranging from bacteria and archaea to metazoans, plants, and fungi. PI-PLCs are ubiquitous enzymes hydrolyzing the membrane lipid phosphoinositides to yield two important second messengers, inositol phosphates and diacylglycerol (DAG). GP-GDEs play essential roles in glycerol metabolism and catalyze the hydrolysis of glycerophosphodiesters to sn-glycerol-3-phosphate (G3P) and the corresponding alcohols that are major sources of carbon and phosphate. Both, PI-PLCs and GP-GDEs, can hydrolyze the 3'-5' phosphodiester bonds in different substrates, and utilize a similar mechanism of general base and acid catalysis with conserved histidine residues, which consists of two steps, a phosphotransfer and a phosphodiesterase reaction. This superfamily also includes Neurospora crassa ankyrin repeat protein NUC-2 and its Saccharomyces cerevisiae counterpart, Phosphate system positive regulatory protein PHO81, glycerophosphodiester phosphodiesterase (GP-GDE)-like protein SHV3 and SHV3-like proteins (SVLs). 
The residues essential for enzyme activities and metal binding are not conserved in these sequence homologs, which might suggest that the function of catalytic domains in these proteins might be distinct from those in typical PLC-like phosphodiesterases." Q#21968 - CGI_10010317 superfamily 243092 14 308 7.50E-50 169.053 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#21970 - CGI_10004502 superfamily 241574 340 509 2.88E-69 224.002 cl00053 PTPc superfamily N - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. 
Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#21971 - CGI_10006767 superfamily 241736 88 251 8.00E-78 235.739 cl00263 TFold superfamily - - "Tunnelling fold (T-fold). The five known T-folds are found in five different enzymes with different functions: dihydroneopterin-triphosphate epimerase (DHNTPE), dihydroneopterin aldolase (DHNA) , GTP cyclohydrolase I (GTPCH-1), 6-pyrovoyl tetrahydropterin synthetase (PTPS), and uricase (UO,uroate/urate oxidase). They bind to substrates belonging to the purine or pterin families, and share a fold-related binding site with a glutamate or glutamine residue anchoring the substrate and a lot of conserved interactions. They also share a similar oligomerization mode: several T-folds join together to form a beta(2n)alpha(n) barrel, then two barrels join together in a head-to-head fashion to made up the native enzymes. The functional enzyme is a tetramer for UO, a hexamer for PTPS, an octamer for DHNA/DHNTPE and a decamer for GTPCH-1. The substrate is located in a deep and narrow pocket at the interface between monomers. In PTPS, the active site is located at the interface of three monomers, two from one trimer and one from the other trimer. In GTPCH-1, it is also located at the interface of three subunits, two from one pentamer and one from the other pentamer. There are four equivalent active sites in UO, six in PTPS, eight in DHNA/DHNTPE and ten in GTPCH-1. Each globular multimeric enzyme encloses a tunnel which is lined with charged residues for DHNA and UO, and with basic residues in PTPS. The N and C-terminal ends are located on one side of the T-fold while the residues involved in the catalytic activity are located at the opposite side. 
In PTPS, UO and DHNA/DHNTPE, the N and C-terminal extremities of the enzyme are located on the exterior side of the functional multimeric enzyme. In GTPCH-1, the extra C-terminal helix places the extremity inside the tunnel." Q#21972 - CGI_10006768 superfamily 247057 362 424 1.01E-34 126.673 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#21972 - CGI_10006768 superfamily 204178 425 535 9.86E-05 41.3642 cl07759 PHAT superfamily - - "PHAT; The PHAT (pseudo-HEAT analogous topology) domain assumes a structure consisting of a layer of three parallel helices packed against a layer of two antiparallel helices, into a cylindrical shaped five-helix bundle. It is found in the RNA-binding protein Smaug, where it is essential for high-affinity RNA binding." Q#21974 - CGI_10006770 superfamily 215754 23 107 4.80E-26 99.6352 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#21974 - CGI_10006770 superfamily 215754 211 306 6.30E-21 84.9976 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#21974 - CGI_10006770 superfamily 215754 121 204 2.94E-14 66.8932 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. 
Q#21975 - CGI_10006771 superfamily 241773 40 148 1.95E-42 138.42 cl00312 Ribosomal_S12_like superfamily - - "Ribosomal protein S12-like family; composed of prokaryotic 30S ribosomal protein S12, eukaryotic 40S ribosomal protein S23 and similar proteins. S12 and S23 are located at the interface of the large and small ribosomal subunits, adjacent to the decoding center. They play an important role in translocation during the peptide elongation step of protein synthesis. They are also involved in important RNA and protein interactions. Ribosomal protein S12 is essential for maintenance of a pretranslocation state and, together with S13, functions as a control element for the rRNA- and tRNA-driven movements of translocation. S23 interacts with domain III of the eukaryotic elongation factor 2 (eEF2), which catalyzes translocation. Mutations in S12 and S23 have been found to affect translational accuracy. Antibiotics such as streptomycin may also bind S12/S23 and cause the ribosome to misread the genetic code." Q#21976 - CGI_10006772 superfamily 247723 74 156 8.10E-40 141.978 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. 
The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#21976 - CGI_10006772 superfamily 243087 688 758 5.35E-22 91.2527 cl02562 PWI superfamily - - PWI domain; PWI domain. Q#21985 - CGI_10001104 superfamily 241619 25 94 3.63E-06 42.4569 cl00112 PAN_APPLE superfamily - - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#21985 - CGI_10001104 superfamily 241568 132 184 0.00902469 32.5472 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#21989 - CGI_10025458 superfamily 216981 57 209 7.41E-11 59.4686 cl17087 OTU superfamily - - "OTU-like cysteine protease; This family is comprised of a group of predicted cysteine proteases, homologous to the Ovarian Tumour (OTU) gene in Drosophila. 
Members include proteins from eukaryotes, viruses and pathogenic bacterium. The conserved cysteine and histidine, and possibly the aspartate, represent the catalytic residues in this putative group of proteases." Q#21995 - CGI_10025464 superfamily 246902 23 89 2.50E-24 92.2968 cl15239 PLDc_SF superfamily N - "Catalytic domain of phospholipase D superfamily proteins; Catalytic domain of phospholipase D (PLD) superfamily proteins. The PLD superfamily is composed of a large and diverse group of proteins including plant, mammalian and bacterial PLDs, bacterial cardiolipin (CL) synthases, bacterial phosphatidylserine synthases (PSS), eukaryotic phosphatidylglycerophosphate (PGP) synthase, eukaryotic tyrosyl-DNA phosphodiesterase 1 (Tdp1), and some bacterial endonucleases (Nuc and BfiI), among others. PLD enzymes hydrolyze phospholipid phosphodiester bonds to yield phosphatidic acid and a free polar head group. They can also catalyze the transphosphatidylation of phospholipids to acceptor alcohols. The majority of members in this superfamily contain a short conserved sequence motif (H-x-K-x(4)-D, where x represents any amino acid residue), called the HKD signature motif. There are varying expanded forms of this motif in different family members. Some members contain variant HKD motifs. Most PLD enzymes are monomeric proteins with two HKD motif-containing domains. Two HKD motifs from two domains form a single active site. Some PLD enzymes have only one copy of the HKD motif per subunit but form a functionally active dimer, which has a single active site at the dimer interface containing the two HKD motifs from both subunits. Different PLD enzymes may have evolved through domain fusion of a common catalytic core with separate substrate recognition domains. Despite their various catalytic functions and a very broad range of substrate specificities, the diverse group of PLD enzymes can bind to a phosphodiester moiety. 
Most of them are active as bi-lobed monomers or dimers, and may possess similar core structures for catalytic activity. They are generally thought to utilize a common two-step ping-pong catalytic mechanism, involving an enzyme-substrate intermediate, to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group." Q#21999 - CGI_10025468 superfamily 245816 4 163 1.44E-44 147.03 cl11964 CYTH-like_Pase superfamily - - "CYTH-like (also known as triphosphate tunnel metalloenzyme (TTM)-like) Phosphatases; CYTH-like superfamily enzymes hydrolyze triphosphate-containing substrates and require metal cations as cofactors. They have a unique active site located at the center of an eight-stranded antiparallel beta barrel tunnel (the triphosphate tunnel). The name CYTH originated from the gene designation for bacterial class IV adenylyl cyclases (CyaB), and from thiamine triphosphatase. Class IV adenylate cyclases catalyze the conversion of ATP to 3',5'-cyclic AMP (cAMP) and PPi. Thiamine triphosphatase is a soluble cytosolic enzyme which converts thiamine triphosphate to thiamine diphosphate. This domain superfamily also contains RNA triphosphatases, membrane-associated polyphosphate polymerases, tripolyphosphatases, nucleoside triphosphatases, nucleoside tetraphosphatases and other proteins with unknown functions." Q#22000 - CGI_10025469 superfamily 242274 1 130 7.14E-05 40.0882 cl01053 SGNH_hydrolase superfamily C - "SGNH_hydrolase, or GDSL_hydrolase, is a diverse family of lipases and esterases. 
The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the typical Ser-His-Asp(Glu) triad from other serine hydrolases, but may lack the carboxylic acid." Q#22001 - CGI_10025470 superfamily 193607 573 704 1.50E-63 208.965 cl15237 Deltex_C superfamily - - "Domain found at the C-terminus of deltex-like; The deltex family of proteins is involved in the regulation of Notch signaling, and therefore may play roles in cell-to-cell communications that regulate mechanisms determining cell fate. They have a central RING-type zinc finger domain and contain a C-terminal domain, described here, that is also found in other domain architectures. Deltex-1 (DTX1) contains a RING finger and two WWE domains, indicating that it may be an E3 ubiquitin ligase. Human deltex 3-like, which contains an additional N-terminal domain (presumably with ubiquitin ligase activity) is also described as E3 ubiquitin-protein ligase DTX3L, B-lymphoma- and BAL-associated protein (BBAP), or rhysin-2. DTX3L mediates monoubiquitination of K91 of histone H4 in response to DNA damage." 
Q#22001 - CGI_10025470 superfamily 247792 526 567 9.64E-05 40.892 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#22001 - CGI_10025470 superfamily 241554 277 458 2.34E-47 165.136 cl00019 Macro superfamily - - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#22002 - CGI_10025471 superfamily 193607 7 138 4.84E-61 187.009 cl15237 Deltex_C superfamily - - "Domain found at the C-terminus of deltex-like; The deltex family of proteins is involved in the regulation of Notch signaling, and therefore may play roles in cell-to-cell communications that regulate mechanisms determining cell fate. They have a central RING-type zinc finger domain and contain a C-terminal domain, described here, that is also found in other domain architectures. 
Deltex-1 (DTX1) contains a RING finger and two WWE domains, indicating that it may be an E3 ubiquitin ligase. Human deltex 3-like, which contains an additional N-terminal domain (presumably with ubiquitin ligase activity) is also described as E3 ubiquitin-protein ligase DTX3L, B-lymphoma- and BAL-associated protein (BBAP), or rhysin-2. DTX3L mediates monoubiquitination of K91 of histone H4 in response to DNA damage." Q#22003 - CGI_10025472 superfamily 241554 233 408 7.87E-42 148.958 cl00019 Macro superfamily - - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#22003 - CGI_10025472 superfamily 241554 468 606 6.62E-15 73.0735 cl00019 Macro superfamily C - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." 
Q#22004 - CGI_10025473 superfamily 241600 4 139 1.88E-52 167.032 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#22005 - CGI_10025474 superfamily 241694 6 279 2.83E-96 297.568 cl00216 L-asparaginase_like superfamily - - "Bacterial L-asparaginases and related enzymes; Asparaginases (amidohydrolases, E.C. 3.5.1.1) are dimeric or tetrameric enzymes that catalyze the hydrolysis of asparagine to aspartic acid and ammonia. In bacteria, there are two classes of amidohydrolases, one highly specific for asparagine and localized to the periplasm (type II L-asparaginase), and a second (asparaginase- glutaminase) present in the cytosol (type I L-asparaginase) that hydrolyzes both asparagine and glutamine with similar specificities and has a lower affinity for its substrate. Bacterial L-asparaginases (type II) are potent antileukemic agents and have been used in the treatment of acute lymphoblastic leukemia (ALL). A conserved threonine residue is thought to supply the nucleophile hydroxy-group that attacks the amide bond. Many bacterial L-asparaginases have both L-asparagine and L-glutamine hydrolysis activities, to a different degree, and some of them are annotated as asparaginase/glutaminase. This wider family also includes a subunit of an archaeal Glu-tRNA amidotransferase." 
Q#22005 - CGI_10025474 superfamily 243072 356 512 6.60E-28 108.24 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#22006 - CGI_10025475 superfamily 241694 29 372 1.50E-100 311.05 cl00216 L-asparaginase_like superfamily - - "Bacterial L-asparaginases and related enzymes; Asparaginases (amidohydrolases, E.C. 3.5.1.1) are dimeric or tetrameric enzymes that catalyze the hydrolysis of asparagine to aspartic acid and ammonia. In bacteria, there are two classes of amidohydrolases, one highly specific for asparagine and localized to the periplasm (type II L-asparaginase), and a second (asparaginase- glutaminase) present in the cytosol (type I L-asparaginase) that hydrolyzes both asparagine and glutamine with similar specificities and has a lower affinity for its substrate. Bacterial L-asparaginases (type II) are potent antileukemic agents and have been used in the treatment of acute lymphoblastic leukemia (ALL). A conserved threonine residue is thought to supply the nucleophile hydroxy-group that attacks the amide bond. Many bacterial L-asparaginases have both L-asparagine and L-glutamine hydrolysis activities, to a different degree, and some of them are annotated as asparaginase/glutaminase. This wider family also includes a subunit of an archaeal Glu-tRNA amidotransferase." Q#22006 - CGI_10025475 superfamily 243072 449 605 9.73E-28 108.625 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. 
The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#22007 - CGI_10025477 superfamily 241694 94 282 4.77E-59 194.034 cl00216 L-asparaginase_like superfamily N - "Bacterial L-asparaginases and related enzymes; Asparaginases (amidohydrolases, E.C. 3.5.1.1) are dimeric or tetrameric enzymes that catalyze the hydrolysis of asparagine to aspartic acid and ammonia. In bacteria, there are two classes of amidohydrolases, one highly specific for asparagine and localized to the periplasm (type II L-asparaginase), and a second (asparaginase- glutaminase) present in the cytosol (type I L-asparaginase) that hydrolyzes both asparagine and glutamine with similar specificities and has a lower affinity for its substrate. Bacterial L-asparaginases (type II) are potent antileukemic agents and have been used in the treatment of acute lymphoblastic leukemia (ALL). A conserved threonine residue is thought to supply the nucleophile hydroxy-group that attacks the amide bond. Many bacterial L-asparaginases have both L-asparagine and L-glutamine hydrolysis activities, to a different degree, and some of them are annotated as asparaginase/glutaminase. This wider family also includes a subunit of an archaeal Glu-tRNA amidotransferase." Q#22007 - CGI_10025477 superfamily 241694 36 79 0.00118758 38.6772 cl00216 L-asparaginase_like superfamily C - "Bacterial L-asparaginases and related enzymes; Asparaginases (amidohydrolases, E.C. 3.5.1.1) are dimeric or tetrameric enzymes that catalyze the hydrolysis of asparagine to aspartic acid and ammonia. 
In bacteria, there are two classes of amidohydrolases, one highly specific for asparagine and localized to the periplasm (type II L-asparaginase), and a second (asparaginase- glutaminase) present in the cytosol (type I L-asparaginase) that hydrolyzes both asparagine and glutamine with similar specificities and has a lower affinity for its substrate. Bacterial L-asparaginases (type II) are potent antileukemic agents and have been used in the treatment of acute lymphoblastic leukemia (ALL). A conserved threonine residue is thought to supply the nucleophile hydroxy-group that attacks the amide bond. Many bacterial L-asparaginases have both L-asparagine and L-glutamine hydrolysis activities, to a different degree, and some of them are annotated as asparaginase/glutaminase. This wider family also includes a subunit of an archaeal Glu-tRNA amidotransferase." Q#22008 - CGI_10025478 superfamily 243072 247 371 3.19E-27 104.773 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#22008 - CGI_10025478 superfamily 243072 346 403 1.52E-07 49.3042 cl02529 ANK superfamily C - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#22008 - CGI_10025478 superfamily 241694 41 181 9.04E-50 173.234 cl00216 L-asparaginase_like superfamily N - "Bacterial L-asparaginases and related enzymes; Asparaginases (amidohydrolases, E.C. 3.5.1.1) are dimeric or tetrameric enzymes that catalyze the hydrolysis of asparagine to aspartic acid and ammonia. In bacteria, there are two classes of amidohydrolases, one highly specific for asparagine and localized to the periplasm (type II L-asparaginase), and a second (asparaginase- glutaminase) present in the cytosol (type I L-asparaginase) that hydrolyzes both asparagine and glutamine with similar specificities and has a lower affinity for its substrate. Bacterial L-asparaginases (type II) are potent antileukemic agents and have been used in the treatment of acute lymphoblastic leukemia (ALL). A conserved threonine residue is thought to supply the nucleophile hydroxy-group that attacks the amide bond. Many bacterial L-asparaginases have both L-asparagine and L-glutamine hydrolysis activities, to a different degree, and some of them are annotated as asparaginase/glutaminase. This wider family also includes a subunit of an archaeal Glu-tRNA amidotransferase." Q#22009 - CGI_10025479 superfamily 217046 180 348 6.68E-46 156.199 cl03599 Reticulon superfamily - - "Reticulon; Reticulon, also known as neuroendocrine-specific protein (NSP), is a protein of unknown function which associates with the endoplasmic reticulum. This family represents the C-terminal domain of the three reticulon isoforms and their homologues." Q#22010 - CGI_10025480 superfamily 241782 159 523 2.57E-154 448.933 cl00321 AAT_I superfamily - - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. 
PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phosphorylase family (fold type V)." Q#22011 - CGI_10025481 superfamily 241782 54 377 1.12E-136 402.218 cl00321 AAT_I superfamily C - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. 
Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phosphorylase family (fold type V)." Q#22014 - CGI_10025484 superfamily 241862 428 585 1.79E-21 93.96 cl00437 COG0428 superfamily N - Predicted divalent heavy-metal cations transporter [Inorganic ion transport and metabolism] Q#22016 - CGI_10025486 superfamily 241622 6 84 3.37E-19 76.0662 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. 
The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#22017 - CGI_10025487 superfamily 217247 373 488 9.76E-07 50.4694 cl18397 Glyco_hydro_2_C superfamily C - "Glycosyl hydrolases family 2, TIM barrel domain; This family contains beta-galactosidase, beta-mannosidase and beta-glucuronidase activities." Q#22017 - CGI_10025487 superfamily 217248 67 131 0.00521736 37.2503 cl18398 Glyco_hydro_2_N superfamily NC - "Glycosyl hydrolases family 2, sugar binding domain; This family contains beta-galactosidase, beta-mannosidase and beta-glucuronidase activities and has a jelly-roll fold." Q#22018 - CGI_10025488 superfamily 217247 371 485 1.15E-06 49.3138 cl18397 Glyco_hydro_2_C superfamily C - "Glycosyl hydrolases family 2, TIM barrel domain; This family contains beta-galactosidase, beta-mannosidase and beta-glucuronidase activities." Q#22018 - CGI_10025488 superfamily 217248 79 143 0.0097869 35.7095 cl18398 Glyco_hydro_2_N superfamily NC - "Glycosyl hydrolases family 2, sugar binding domain; This family contains beta-galactosidase, beta-mannosidase and beta-glucuronidase activities and has a jelly-roll fold." Q#22021 - CGI_10025491 superfamily 243098 14 63 0.000117468 39.8888 cl02573 TUDOR superfamily - - "Tudor domains are found in many eukaryotic organisms and have been implicated in protein-protein interactions in which methylated protein substrates bind to these domains. For example, the Tudor domain of Survival of Motor Neuron (SMN) binds to symmetrically dimethylated arginines of arginine-glycine (RG) rich sequences found in the C-terminal tails of Sm proteins. The SMN protein is linked to spinal muscular atrophy. 
Another example is the tandem tudor domains of 53BP1, which bind to histone H4 specifically dimethylated at Lys20 (H4-K20me2). 53BP1 is a key transducer of the DNA damage checkpoint signal." Q#22025 - CGI_10025495 superfamily 248097 115 229 9.09E-10 53.8453 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#22026 - CGI_10025496 superfamily 201217 13 61 2.03E-07 47.1352 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#22026 - CGI_10025496 superfamily 201217 65 121 3.51E-05 40.5868 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#22027 - CGI_10025497 superfamily 248312 20 173 4.46E-05 40.7997 cl17758 PMP22_Claudin superfamily - - PMP-22/EMP/MP20/Claudin family; PMP-22/EMP/MP20/Claudin family. Q#22029 - CGI_10025499 superfamily 247736 174 219 9.83E-07 44.6278 cl17182 NAT_SF superfamily C - "N-Acyltransferase superfamily: Various enzymes that characteristically catalyze the transfer of an acyl group to a substrate; NAT (N-Acyltransferase) is a large superfamily of enzymes that mostly catalyze the transfer of an acyl group to a substrate and are implicated in a variety of functions, ranging from bacterial antibiotic resistance to circadian rhythms in mammals. Members include GCN5-related N-Acetyltransferases (GNAT) such as Aminoglycoside N-acetyltransferases, Histone N-acetyltransferase (HAT) enzymes, and Serotonin N-acetyltransferase, which catalyze the transfer of an acetyl group to a substrate. The kinetic mechanism of most GNATs involves the ordered formation of a ternary complex: the reaction begins with Acetyl Coenzyme A (AcCoA) binding, followed by binding of substrate, then direct transfer of the acetyl group from AcCoA to the substrate, followed by product and subsequent CoA release. 
Other family members include Arginine/ornithine N-succinyltransferase, Myristoyl-CoA: protein N-myristoyltransferase, and Acyl-homoserinelactone synthase which have a similar catalytic mechanism but differ in types of acyl groups transferred. Leucyl/phenylalanyl-tRNA-protein transferase and FemXAB nonribosomal peptidyltransferases which catalyze similar peptidyltransferase reactions are also included." Q#22031 - CGI_10025501 superfamily 191430 14 66 0.00746089 32.833 cl05523 Gly_acyl_tr_N superfamily C - Aralkyl acyl-CoA:amino acid N-acyltransferase; This family consists of several mammalian specific aralkyl acyl-CoA:amino acid N-acyltransferase (glycine N-acyltransferase) proteins EC:2.3.1.13. Q#22032 - CGI_10025502 superfamily 247736 163 232 7.69E-10 53.4874 cl17182 NAT_SF superfamily - - "N-Acyltransferase superfamily: Various enzymes that characteristically catalyze the transfer of an acyl group to a substrate; NAT (N-Acyltransferase) is a large superfamily of enzymes that mostly catalyze the transfer of an acyl group to a substrate and are implicated in a variety of functions, ranging from bacterial antibiotic resistance to circadian rhythms in mammals. Members include GCN5-related N-Acetyltransferases (GNAT) such as Aminoglycoside N-acetyltransferases, Histone N-acetyltransferase (HAT) enzymes, and Serotonin N-acetyltransferase, which catalyze the transfer of an acetyl group to a substrate. The kinetic mechanism of most GNATs involves the ordered formation of a ternary complex: the reaction begins with Acetyl Coenzyme A (AcCoA) binding, followed by binding of substrate, then direct transfer of the acetyl group from AcCoA to the substrate, followed by product and subsequent CoA release. Other family members include Arginine/ornithine N-succinyltransferase, Myristoyl-CoA: protein N-myristoyltransferase, and Acyl-homoserinelactone synthase which have a similar catalytic mechanism but differ in types of acyl groups transferred. 
Leucyl/phenylalanyl-tRNA-protein transferase and FemXAB nonribosomal peptidyltransferases which catalyze similar peptidyltransferase reactions are also included." Q#22035 - CGI_10025505 superfamily 247736 88 152 5.09E-11 55.4134 cl17182 NAT_SF superfamily - - "N-Acyltransferase superfamily: Various enzymes that characteristically catalyze the transfer of an acyl group to a substrate; NAT (N-Acyltransferase) is a large superfamily of enzymes that mostly catalyze the transfer of an acyl group to a substrate and are implicated in a variety of functions, ranging from bacterial antibiotic resistance to circadian rhythms in mammals. Members include GCN5-related N-Acetyltransferases (GNAT) such as Aminoglycoside N-acetyltransferases, Histone N-acetyltransferase (HAT) enzymes, and Serotonin N-acetyltransferase, which catalyze the transfer of an acetyl group to a substrate. The kinetic mechanism of most GNATs involves the ordered formation of a ternary complex: the reaction begins with Acetyl Coenzyme A (AcCoA) binding, followed by binding of substrate, then direct transfer of the acetyl group from AcCoA to the substrate, followed by product and subsequent CoA release. Other family members include Arginine/ornithine N-succinyltransferase, Myristoyl-CoA: protein N-myristoyltransferase, and Acyl-homoserinelactone synthase which have a similar catalytic mechanism but differ in types of acyl groups transferred. Leucyl/phenylalanyl-tRNA-protein transferase and FemXAB nonribosomal peptidyltransferases which catalyze similar peptidyltransferase reactions are also included." 
Q#22036 - CGI_10025506 superfamily 247736 128 195 1.38E-13 63.1174 cl17182 NAT_SF superfamily - - "N-Acyltransferase superfamily: Various enzymes that characteristically catalyze the transfer of an acyl group to a substrate; NAT (N-Acyltransferase) is a large superfamily of enzymes that mostly catalyze the transfer of an acyl group to a substrate and are implicated in a variety of functions, ranging from bacterial antibiotic resistance to circadian rhythms in mammals. Members include GCN5-related N-Acetyltransferases (GNAT) such as Aminoglycoside N-acetyltransferases, Histone N-acetyltransferase (HAT) enzymes, and Serotonin N-acetyltransferase, which catalyze the transfer of an acetyl group to a substrate. The kinetic mechanism of most GNATs involves the ordered formation of a ternary complex: the reaction begins with Acetyl Coenzyme A (AcCoA) binding, followed by binding of substrate, then direct transfer of the acetyl group from AcCoA to the substrate, followed by product and subsequent CoA release. Other family members include Arginine/ornithine N-succinyltransferase, Myristoyl-CoA: protein N-myristoyltransferase, and Acyl-homoserinelactone synthase which have a similar catalytic mechanism but differ in types of acyl groups transferred. Leucyl/phenylalanyl-tRNA-protein transferase and FemXAB nonribosomal peptidyltransferases which catalyze similar peptidyltransferase reactions are also included." Q#22039 - CGI_10025509 superfamily 241580 137 214 6.37E-47 156.945 cl00061 FH superfamily - - "Forkhead (FH), also known as a "winged helix". FH is named for the Drosophila fork head protein, a transcription factor which promotes terminal rather than segmental development. This family of transcription factor domains, which bind to B-DNA as monomers, are also found in the Hepatocyte nuclear factor (HNF) proteins, which provide tissue-specific gene regulation. The structure contains 2 flexible loops or "wings" in the C-terminal region, hence the term winged helix." 
Q#22039 - CGI_10025509 superfamily 220199 334 391 5.13E-05 40.9148 cl09612 HNF_C superfamily - - "HNF3 C-terminal domain; This presumed domain is found in the C-terminal region of Hepatocyte Nuclear Factor 3 alpha and beta chains. Its specific function is uncertain. The N-terminal region of this presumed domain contains an EH1 (engrailed homology 1) motif, that is characterized by the FxIxxIL sequence." Q#22045 - CGI_10025515 superfamily 145533 23 120 1.07E-40 142.888 cl03592 Ski_Sno superfamily - - "SKI/SNO/DAC family; This family contains a presumed domain that is about 100 amino acids long. All members of this family contain a conserved CLPQ motif. The c-ski proto-oncogene has been shown to influence proliferation, morphological transformation and myogenic differentiation. Sno, a Ski proto-oncogene homologue, is expressed in two isoforms and plays a role in the response to proliferation stimuli. Dachshund also contains this domain. It is involved in various aspects of development." Q#22045 - CGI_10025515 superfamily 198898 131 222 5.68E-40 140.966 cl07406 c-SKI_SMAD_bind superfamily - - c-SKI Smad4 binding domain; c-SKI is an oncoprotein that inhibits TGF-beta signaling through interaction with Smad proteins. This domain binds to Smad4 Q#22046 - CGI_10007338 superfamily 241874 8 485 2.60E-176 508.983 cl00456 SLC5-6-like_sbd superfamily - - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. 
SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#22047 - CGI_10007339 superfamily 247724 147 178 1.51E-05 42.162 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." 
Q#22048 - CGI_10007340 superfamily 247724 44 236 6.37E-46 154.099 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." 
Q#22050 - CGI_10007342 superfamily 247792 16 58 1.55E-07 47.0552 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#22051 - CGI_10007343 superfamily 247792 16 55 3.47E-08 51.6776 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#22052 - CGI_10010533 superfamily 243066 46 133 4.33E-27 104.557 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. 
The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#22052 - CGI_10010533 superfamily 219619 382 451 2.97E-11 59.5287 cl18518 Ion_trans_2 superfamily - - Ion channel; This family includes the two membrane helix type ion channels found in bacteria. Q#22054 - CGI_10010536 superfamily 241750 6 121 1.31E-16 72.221 cl00281 metallo-dependent_hydrolases superfamily N - "Superfamily of metallo-dependent hydrolases (also called amidohydrolase superfamily) is a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. The family includes urease alpha, adenosine deaminase, phosphotriesterase dihydroorotases, allantoinases, hydantoinases, AMP-, adenine and cytosine deaminases, imidazolonepropionase, aryldialkylphosphatase, chlorohydrolases, formylmethanofuran dehydrogenases and others." Q#22057 - CGI_10010540 superfamily 245531 81 152 8.22E-06 41.1954 cl11158 BEN superfamily - - "BEN domain; The BEN domain is found in diverse animal proteins such as BANP/SMAR1, NAC1 and the Drosophila mod(mdg4) isoform C, in the chordopoxvirus virosomal protein E5R and in several proteins of polydnaviruses. Computational analysis suggests that the BEN domain mediates protein-DNA and protein-protein interactions during chromatin organisation and transcription." 
Q#22060 - CGI_10017629 superfamily 246908 548 696 1.97E-12 64.913 cl15255 SH2 superfamily - - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." Q#22060 - CGI_10017629 superfamily 145817 454 526 0.00288539 38.6149 cl03748 STAT_bind superfamily NC - "STAT protein, DNA binding domain; STAT proteins (Signal Transducers and Activators of Transcription) are a family of transcription factors that are specifically activated to regulate gene transcription when cells encounter cytokines and growth factors. This family represents the DNA binding domain of STAT, which has an ig-like fold. STAT proteins also include an SH2 domain pfam00017." Q#22061 - CGI_10017630 superfamily 246908 19 101 6.79E-14 67.9946 cl15255 SH2 superfamily C - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 
They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." Q#22062 - CGI_10017631 superfamily 247856 1007 1065 4.72E-07 49.4685 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#22062 - CGI_10017631 superfamily 247856 552 610 1.67E-06 47.5425 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#22062 - CGI_10017631 superfamily 247856 428 483 0.000101497 42.1497 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#22062 - CGI_10017631 superfamily 247856 1718 1778 0.000454124 40.2237 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#22062 - CGI_10017631 superfamily 247856 897 957 0.000528594 40.2237 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#22062 - CGI_10017631 superfamily 247856 318 376 0.000732798 39.8385 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#22062 - CGI_10017631 superfamily 117542 1486 1595 2.50E-07 50.9703 cl07545 DUF1880 superfamily - - Domain of unknown function (DUF1880); This domain is found predominantly in DJ binding protein. It has no known function. Q#22062 - CGI_10017631 superfamily 241707 151 218 5.27E-05 45.6377 cl00230 CIS_IPPS superfamily N - "Cis (Z)-Isoprenyl Diphosphate Synthases (cis-IPPS); homodimers which catalyze the successive 1'-4 condensation of the isopentenyl diphosphate (IPP) molecule to trans,trans-farnesyl diphosphate (FPP) or to cis,trans-FPP to form long-chain polyprenyl diphosphates. 
A few can also catalyze the condensation of IPP to trans-geranyl diphosphate to form the short-chain cis,trans- FPP. In prokaryotes, the cis-IPPS, undecaprenyl diphosphate synthase (UPP synthase) catalyzes the formation of the carrier lipid UPP in bacterial cell wall peptidoglycan biosynthesis. Similarly, in eukaryotes, the cis-IPPS, dehydrodolichyl diphosphate (dedol-PP) synthase catalyzes the formation of the polyisoprenoid glycosyl carrier lipid dolichyl monophosphate. cis-IPPS are mechanistically and structurally distinct from trans-IPPS, lacking the DDXXD motifs, yet requiring Mg2+ for activity." Q#22062 - CGI_10017631 superfamily 247856 662 722 0.00193988 38.2977 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#22062 - CGI_10017631 superfamily 247856 1596 1657 0.00242447 38.2977 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#22062 - CGI_10017631 superfamily 247856 1113 1173 0.00969477 36.3717 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#22063 - CGI_10017632 superfamily 247744 74 212 7.38E-19 87.2922 cl17190 NK superfamily C - "Nucleoside/nucleotide kinase (NK) is a protein superfamily consisting of multiple families of enzymes that share structural similarity and are functionally related to the catalysis of the reversible phosphate group transfer from nucleoside triphosphates to nucleosides/nucleotides, nucleoside monophosphates, or sugars. Members of this family play a wide variety of essential roles in nucleotide metabolism, the biosynthesis of coenzymes and aromatic compounds, as well as the metabolism of sugar and sulfate." Q#22063 - CGI_10017632 superfamily 247744 1471 1632 5.35E-17 81.8994 cl17190 NK superfamily - - "Nucleoside/nucleotide kinase (NK) is a protein superfamily consisting of multiple families of enzymes that share structural similarity and are functionally related to the catalysis of the reversible phosphate group transfer from nucleoside triphosphates to nucleosides/nucleotides, nucleoside monophosphates, or sugars. Members of this family play a wide variety of essential roles in nucleotide metabolism, the biosynthesis of coenzymes and aromatic compounds, as well as the metabolism of sugar and sulfate." Q#22063 - CGI_10017632 superfamily 247744 1136 1216 1.64E-10 61.4838 cl17190 NK superfamily NC - "Nucleoside/nucleotide kinase (NK) is a protein superfamily consisting of multiple families of enzymes that share structural similarity and are functionally related to the catalysis of the reversible phosphate group transfer from nucleoside triphosphates to nucleosides/nucleotides, nucleoside monophosphates, or sugars. Members of this family play a wide variety of essential roles in nucleotide metabolism, the biosynthesis of coenzymes and aromatic compounds, as well as the metabolism of sugar and sulfate." 
Q#22063 - CGI_10017632 superfamily 247744 1028 1073 1.72E-07 52.239 cl17190 NK superfamily C - "Nucleoside/nucleotide kinase (NK) is a protein superfamily consisting of multiple families of enzymes that share structural similarity and are functionally related to the catalysis of the reversible phosphate group transfer from nucleoside triphosphates to nucleosides/nucleotides, nucleoside monophosphates, or sugars. Members of this family play a wide variety of essential roles in nucleotide metabolism, the biosynthesis of coenzymes and aromatic compounds, as well as the metabolism of sugar and sulfate." Q#22063 - CGI_10017632 superfamily 247807 431 465 6.39E-06 46.5193 cl17253 AAA_17 superfamily C - AAA domain; AAA domain. Q#22063 - CGI_10017632 superfamily 221381 961 1017 2.17E-05 47.0008 cl13455 DUF3508 superfamily N - Domain of unknown function (DUF3508); This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is about 280 amino acids in length. This domain has two conserved sequence motifs: GFC and GLL. This family is also known as UPF0704. Q#22064 - CGI_10017633 superfamily 247848 52 139 0.00478903 37.45 cl17294 PhosphMutase superfamily N - "2,3-bisphosphoglycerate-independent phosphoglycerate mutase; Members of this family are found in various bacterial 2,3-bisphosphoglycerate-independent phosphoglycerate mutase enzymes, which catalyze the interconversion of 2-phosphoglycerate and 3-phosphoglycerate in the reaction: [2-phospho-D-glycerate + 2,3-diphosphoglycerate = 3-phospho-D-glycerate + 2,3-diphosphoglycerate]." Q#22065 - CGI_10017634 superfamily 247805 142 230 6.37E-05 41.7014 cl17251 DEXDc superfamily NC - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#22066 - CGI_10017635 superfamily 242203 21 297 5.81E-71 225.256 cl00935 Brix superfamily - - Brix domain; Brix domain. 
Q#22068 - CGI_10017637 superfamily 248022 11 368 7.69E-47 167.455 cl17468 Aa_trans superfamily - - "Transmembrane amino acid transporter protein; This transmembrane region is found in many amino acid transporters including UNC-47 and MTR. UNC-47 encodes a vesicular amino butyric acid (GABA) transporter, (VGAT). UNC-47 is predicted to have 10 transmembrane domains. MTR is a N system amino acid transporter system protein involved in methyltryptophan resistance. Other members of this family include proline transporters and amino acid permeases." Q#22069 - CGI_10017638 superfamily 241607 157 185 7.51E-05 39.9458 cl00097 KAZAL_FS superfamily C - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chymotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. 
The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#22069 - CGI_10017638 superfamily 241607 23 48 0.00351083 34.9382 cl00097 KAZAL_FS superfamily C - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chymotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#22070 - CGI_10017639 superfamily 241607 72 91 2.97E-06 39.9458 cl00097 KAZAL_FS superfamily C - "Kazal type serine protease inhibitors and follistatin-like domains. 
Kazal inhibitors inhibit serine proteases, such as, trypsin, chymotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#22071 - CGI_10017640 superfamily 241832 3 97 1.08E-29 111.594 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. 
The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#22072 - CGI_10017641 superfamily 219275 317 640 6.96E-99 315.829 cl06188 ORC3_N superfamily - - Origin recognition complex (ORC) subunit 3 N-terminus; This family represents the N-terminus (approximately 300 residues) of subunit 3 of the eukaryotic origin recognition complex (ORC). Origin recognition complex (ORC) is composed of six subunits that are essential for cell viability. They collectively bind to the autonomously replicating sequence (ARS) in a sequence-specific manner and lead to the chromatin loading of other replication factors that are essential for initiation of DNA replication. Q#22073 - CGI_10017642 superfamily 218079 21 155 3.15E-22 88.4936 cl04507 CHD5 superfamily - - CHD5-like protein; Members of this family are probably coiled-coil proteins that are similar to the CHD5 (Congenital heart disease 5) protein. In Saccharomyces cerevisiae this protein localises to the ER and is thought to play a homeostatic role. Q#22074 - CGI_10017643 superfamily 222150 847 872 0.000160532 40.4529 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#22074 - CGI_10017643 superfamily 222150 875 900 0.00284073 36.9861 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#22075 - CGI_10017644 superfamily 241596 90 126 3.71E-05 39.8899 cl00081 HLH superfamily N - "Helix-loop-helix domain, found in specific DNA- binding proteins that act as transcription factors; 60-100 amino acids long. 
A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE 5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved in both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#22078 - CGI_10017647 superfamily 241596 99 142 6.54E-10 52.2163 cl00081 HLH superfamily N - "Helix-loop-helix domain, found in specific DNA- binding proteins that act as transcription factors; 60-100 amino acids long. 
A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE 5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved in both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#22079 - CGI_10017648 superfamily 241596 17 58 2.81E-13 61.0759 cl00081 HLH superfamily N - "Helix-loop-helix domain, found in specific DNA- binding proteins that act as transcription factors; 60-100 amino acids long. 
A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE 5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved in both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#22080 - CGI_10017650 superfamily 241596 2 43 3.33E-15 65.3131 cl00081 HLH superfamily N - "Helix-loop-helix domain, found in specific DNA- binding proteins that act as transcription factors; 60-100 amino acids long. 
A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE 5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved in both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#22083 - CGI_10017653 superfamily 246664 2 78 3.30E-36 126.535 cl14561 An_peroxidase_like superfamily N - "Animal heme peroxidases and related proteins; A diverse family of enzymes, which includes prostaglandin G/H synthase, thyroid peroxidase, myeloperoxidase, linoleate diol synthase, lactoperoxidase, peroxinectin, peroxidasin, and others. Despite its name, this family is not restricted to metazoans: members are found in fungi, plants, and bacteria as well." Q#22084 - CGI_10017654 superfamily 243072 118 250 3.33E-30 112.477 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#22084 - CGI_10017654 superfamily 243072 189 318 1.46E-28 107.855 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#22088 - CGI_10017658 superfamily 245201 21 216 2.93E-39 143.146 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#22089 - CGI_10017659 superfamily 241659 275 357 4.05E-19 82.5954 cl00175 alpha-crystallin-Hsps_p23-like superfamily - - "alpha-crystallin domain (ACD) found in alpha-crystallin-type small heat shock proteins, and a similar domain found in p23 (a cochaperone for Hsp90) and in other p23-like proteins.; The alpha-crystallin-Hsps_p23-like superfamily includes the alpha-crystallin domain (ACD) of alpha-crystallin-type small heat shock proteins (sHsps) and a similar domain found in p23-like proteins. sHsps are small stress induced proteins with monomeric masses between 12-43 kDa, whose common feature is this ACD. 
sHsps are generally active as large oligomers consisting of multiple subunits, and are believed to be ATP-independent chaperones that prevent aggregation and are important in refolding in combination with other Hsps. p23 is a cochaperone of the Hsp90 chaperoning pathway. It binds Hsp90 and participates in the folding of a number of Hsp90 clients including the progesterone receptor. p23 also has a passive chaperoning activity. p23 in addition may act as the cytosolic prostaglandin E2 synthase. Included in this superfamily is the p23-like C-terminal CHORD-SGT1 (CS) domain of suppressor of G2 allele of Skp1 (Sgt1) and the p23-like domains of human butyrate-induced transcript 1 (hB-ind1), NUD (nuclear distribution) C, Melusin, and NAD(P)H cytochrome b5 (NCB5) oxidoreductase (OR)." Q#22090 - CGI_10017660 superfamily 241754 14 350 2.28E-161 486.214 cl00286 Motor_domain superfamily - - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. Q#22090 - CGI_10017660 superfamily 242185 1145 1179 5.87E-05 42.9054 cl00910 Multi_Drug_Res superfamily N - "Small Multidrug Resistance protein; This family is the Small Multidrug Resistance (SMR) family. Several members have been shown to export a range of toxins, including ethidium bromide and quaternary ammonium compounds, through coupling with proton influx." Q#22091 - CGI_10017661 superfamily 246598 6 189 1.40E-63 197.132 cl13996 MPN superfamily - - "Mpr1p, Pad1p N-terminal (MPN) domains; MPN (also known as Mov34, PAD-1, JAMM, JAB, MPN+) domains are found in the N-terminal termini of proteins with a variety of functions; they are components of the proteasome regulatory subunits, the signalosome (CSN), eukaryotic translation initiation factor 3 (eIF3) complexes, and regulators of transcription factors. 
These domains are isopeptidases that release ubiquitin from ubiquitinated proteins (thus having deubiquitinating (DUB) activity) that are tagged for degradation. Catalytically active MPN domains contain a metalloprotease signature known as the JAB1/MPN/Mov34 metalloenzyme (JAMM) motif. For example, Rpn11 (also known as POH1 or PSMD14), a subunit of the 19S proteasome lid is involved in the ATP-dependent degradation of ubiquitinated proteins, contains the conserved JAMM motif involved in zinc ion coordination. Poh1 is a regulator of c-Jun, an important regulator of cell proliferation, differentiation, survival and death. JAB1 is a component of the COP9 signalosome (CSN), a regulatory particle of the ubiquitin (Ub)/26S proteasome system occurring in all eukaryotic cells; it cleaves the ubiquitin-like protein NEDD8 from the cullin subunit of the SCF (Skp1, Cullins, F-box proteins) family of E3 ubiquitin ligases. AMSH (associated molecule with the SH3 domain of STAM, also known as STAMBP), a member of JAMM/MPN+ deubiquitinases (DUBs), specifically cleaves Lys 63-linked polyubiquitin (poly-Ub) chains, thus facilitating the recycling and subsequent trafficking of receptors to the cell surface. Similarly, BRCC36, part of the nuclear complex that includes BRCA1 protein and is targeted to DNA damage foci after irradiation, specifically disassembles K63-linked polyUb. BRCC36 is aberrantly expressed in sporadic breast tumors, indicative of a potential role in the pathogenesis of the disease. Some variants of the JAB1/MPN domains lack key residues in their JAMM motif and are unable to coordinate a metal ion. Comparisons of key catalytic and metal binding residues explain why the MPN-containing proteins Mov34/PSMD7, Rpn8, CSN6, Prp8p, and the translation initiation factor 3 subunits f (p47) and h (p40) do not show catalytic isopeptidase activity. It has been proposed that the MPN domain in these proteins has a primarily structural function." 
Q#22092 - CGI_10017662 superfamily 245206 3 271 4.61E-139 397.202 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#22093 - CGI_10017663 superfamily 221913 164 375 2.58E-65 209.319 cl18626 AAA_12 superfamily - - AAA domain; This family of domains contain a P-loop motif that is characteristic of the AAA superfamily. Many of the proteins in this family are conjugative transfer proteins. Q#22093 - CGI_10017663 superfamily 222258 116 153 3.52E-05 42.9404 cl18656 AAA_30 superfamily NC - AAA domain; This family of domains contain a P-loop motif that is characteristic of the AAA superfamily. Many of the proteins in this family are conjugative transfer proteins. There is a Walker A and Walker B. Q#22094 - CGI_10017664 superfamily 216112 102 153 1.93E-08 55.3803 cl02964 RNB superfamily N - RNB domain; This domain is the catalytic domain of ribonuclease II. Q#22096 - CGI_10017666 superfamily 221913 485 696 1.72E-60 203.156 cl18626 AAA_12 superfamily - - AAA domain; This family of domains contain a P-loop motif that is characteristic of the AAA superfamily. 
Many of the proteins in this family are conjugative transfer proteins. Q#22096 - CGI_10017666 superfamily 222005 254 314 2.41E-05 43.1096 cl18632 AAA_19 superfamily C - Part of AAA domain; Part of AAA domain. Q#22096 - CGI_10017666 superfamily 222258 397 474 3.32E-05 44.096 cl18656 AAA_30 superfamily C - AAA domain; This family of domains contain a P-loop motif that is characteristic of the AAA superfamily. Many of the proteins in this family are conjugative transfer proteins. There is a Walker A and Walker B. Q#22097 - CGI_10017667 superfamily 216112 919 1021 8.89E-15 75.4107 cl02964 RNB superfamily C - RNB domain; This domain is the catalytic domain of ribonuclease II. Q#22098 - CGI_10017668 superfamily 241782 23 390 1.07E-48 169.829 cl00321 AAT_I superfamily - - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. 
Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phosphorylase family (fold type V)." Q#22099 - CGI_10007421 superfamily 241564 23 88 8.21E-30 108.507 cl00035 BIR superfamily - - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." Q#22099 - CGI_10007421 superfamily 247792 236 272 0.000242751 37.8104 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#22100 - CGI_10007422 superfamily 246669 231 364 1.57E-77 238.562 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC.
C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#22100 - CGI_10007422 superfamily 246669 98 222 2.93E-60 193.25 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions."
Q#22101 - CGI_10007423 superfamily 243092 399 694 5.26E-65 220.669 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#22101 - CGI_10007423 superfamily 243092 36 220 1.70E-40 151.719 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#22101 - CGI_10007423 superfamily 217837 793 899 1.36E-19 86.1073 cl04367 Utp12 superfamily - - Dip2/Utp12 Family; This domain is found at the C-terminus of proteins containing WD40 repeats. These proteins are part of the U3 ribonucleoprotein the yeast protein is called Utp12 or DIP2. Q#22102 - CGI_10007424 superfamily 241563 63 100 2.02E-05 42.4664 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#22103 - CGI_10007425 superfamily 241563 92 135 1.60E-06 44.7776 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#22103 - CGI_10007425 superfamily 128778 133 249 0.00573388 35.3183 cl17972 BBC superfamily - - B-Box C-terminal domain; Coiled coil region C-terminal to (some) B-Box domains Q#22104 - CGI_10007426 superfamily 241563 57 100 2.24E-06 44.3924 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#22108 - CGI_10015109 superfamily 243161 11 96 3.12E-16 71.6565 cl02739 THAP superfamily - - "THAP domain; The THAP domain is a putative DNA-binding domain (DBD) and probably also binds a zinc ion. It features the conserved C2CH architecture (consensus sequence: Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His). Other universal features include the location of the domain at the N-termini of proteins, its size of about 90 residues, a C-terminal AVPTIF box and several other conserved residues. Orthologues of the human THAP domain have been identified in other vertebrates and probably worms and flies, but not in other eukaryotes or any prokaryotes." Q#22109 - CGI_10015110 superfamily 247856 223 281 2.84E-06 44.0757 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#22109 - CGI_10015110 superfamily 246925 16 161 2.15E-17 80.0921 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#22110 - CGI_10015111 superfamily 243066 198 289 5.82E-43 149.626 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#22110 - CGI_10015111 superfamily 219619 501 562 7.51E-13 64.9215 cl18518 Ion_trans_2 superfamily - - Ion channel; This family includes the two membrane helix type ion channels found in bacteria. 
Q#22111 - CGI_10015112 superfamily 247723 101 183 1.71E-50 165.882 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#22111 - CGI_10015112 superfamily 221466 1 92 3.22E-11 59.2023 cl13630 U1snRNP70_N superfamily - - "U1 small nuclear ribonucleoprotein of 70kDa MW N terminal; This domain is found in eukaryotes. This domain is about 90 amino acids in length. This domain is found associated with pfam00076. This domain is part of U1 snRNP, which is the pre-mRNA binding protein of the penta-snRNP spliceosome complex. It extends over a distance of 180 A from its RNA binding domain, wraps around the core domain of U1 snRNP consisting of the seven Sm proteins and finally contacts U1-C, which is crucial for 5'-splice-site recognition."
Q#22115 - CGI_10015116 superfamily 243034 749 848 2.89E-10 59.316 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#22115 - CGI_10015116 superfamily 243034 1259 1359 1.84E-07 50.8416 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number
of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#22115 - CGI_10015116 superfamily 243034 518 617 1.24E-05 45.0636 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110
subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#22115 - CGI_10015116 superfamily 243034 993 1080 3.87E-05 43.5228 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#22115 - CGI_10015116 superfamily 243034 331 413 0.00265705 37.7448 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in
chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#22115 - CGI_10015116 superfamily 243034 1336 1403 0.00877533 36.204 cl02429 TPR superfamily C - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p
co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#22117 - CGI_10015118 superfamily 243176 8 533 0 1033.02 cl02777 chaperonin_like superfamily - - "chaperonin_like superfamily. Chaperonins are involved in productive folding of proteins. They share a common general morphology, a double toroid of 2 stacked rings, each composed of 7-9 subunits. There are 2 main chaperonin groups. The symmetry of type I is seven-fold and they are found in eubacteria (GroEL) and in organelles of eubacterial descent (hsp60 and RBP). The symmetry of type II is eight- or nine-fold and they are found in archaea (thermosome), thermophilic bacteria (TF55) and in the eukaryotic cytosol (CTT). Their common function is to sequester nonnative proteins inside their central cavity and promote folding by using energy derived from ATP hydrolysis. This superfamily also contains related domains from Fab1-like phosphatidylinositol 3-phosphate (PtdIns3P) 5-kinases that only contain the intermediate and apical domains." Q#22119 - CGI_10015120 superfamily 144129 243 270 0.00969671 33.9095 cl02863 Tubulin-binding superfamily - - "Tau and MAP protein, tubulin-binding repeat; This family includes the vertebrate proteins MAP2, MAP4 and Tau, as well as other animal homologs. MAP4 is present in many tissues but is usually absent from neurons; MAP2 and Tau are mainly neuronal. Members of this family have the ability to bind to and stabilise microtubules. As a result, they are involved in neuronal migration, supporting dendrite elongation, and regulating microtubules during mitotic metaphase. Note that Tau is involved in neurofibrillary tangle formation in Alzheimer's disease and some other dementias. This family features a C-terminal microtubule binding repeat that contains a conserved KXGS motif."
Q#22120 - CGI_10015121 superfamily 144129 324 353 0.000101044 40.0727 cl02863 Tubulin-binding superfamily - - "Tau and MAP protein, tubulin-binding repeat; This family includes the vertebrate proteins MAP2, MAP4 and Tau, as well as other animal homologs. MAP4 is present in many tissues but is usually absent from neurons; MAP2 and Tau are mainly neuronal. Members of this family have the ability to bind to and stabilise microtubules. As a result, they are involved in neuronal migration, supporting dendrite elongation, and regulating microtubules during mitotic metaphase. Note that Tau is involved in neurofibrillary tangle formation in Alzheimer's disease and some other dementias. This family features a C-terminal microtubule binding repeat that contains a conserved KXGS motif." Q#22120 - CGI_10015121 superfamily 144129 385 414 0.00225066 35.8355 cl02863 Tubulin-binding superfamily - - "Tau and MAP protein, tubulin-binding repeat; This family includes the vertebrate proteins MAP2, MAP4 and Tau, as well as other animal homologs. MAP4 is present in many tissues but is usually absent from neurons; MAP2 and Tau are mainly neuronal. Members of this family have the ability to bind to and stabilise microtubules. As a result, they are involved in neuronal migration, supporting dendrite elongation, and regulating microtubules during mitotic metaphase. Note that Tau is involved in neurofibrillary tangle formation in Alzheimer's disease and some other dementias. This family features a C-terminal microtubule binding repeat that contains a conserved KXGS motif." Q#22120 - CGI_10015121 superfamily 144129 356 384 0.00287428 35.8355 cl02863 Tubulin-binding superfamily - - "Tau and MAP protein, tubulin-binding repeat; This family includes the vertebrate proteins MAP2, MAP4 and Tau, as well as other animal homologs. MAP4 is present in many tissues but is usually absent from neurons; MAP2 and Tau are mainly neuronal. 
Members of this family have the ability to bind to and stabilise microtubules. As a result, they are involved in neuronal migration, supporting dendrite elongation, and regulating microtubules during mitotic metaphase. Note that Tau is involved in neurofibrillary tangle formation in Alzheimer's disease and some other dementias. This family features a C-terminal microtubule binding repeat that contains a conserved KXGS motif." Q#22122 - CGI_10015123 superfamily 242372 33 199 1.67E-28 107.457 cl01221 DTW superfamily - - DTW domain; This presumed domain is found in bacterial and eukaryotic proteins. Its function is unknown. The domain contains multiple conserved motifs including a DTXW motif that this domain has been named after. Q#22123 - CGI_10015124 superfamily 247723 145 218 1.01E-33 125.957 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#22123 - CGI_10015124 superfamily 241546 454 520 0.00148402 38.4109 cl00011 PLAT superfamily C - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. 
The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats. The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#22125 - CGI_10015126 superfamily 220115 398 542 3.94E-42 147.889 cl07655 N-glycanase_C superfamily - - "Peptide-N-glycosidase F, C terminal; Members of this family adopt an eight-stranded antiparallel beta jelly roll configuration, with the beta strands arranged into two sheets. They are similar in topology to many viral capsid proteins, as well as lectins and several glucanases. The domain allows the protein to bind sugars and catalyzes the complete removal of N-linked oligosaccharide chains from glycoproteins." Q#22125 - CGI_10015126 superfamily 220114 254 376 9.24E-24 97.4725 cl07654 N-glycanase_N superfamily - - "Peptide-N-glycosidase F, N terminal; Members of this family adopt an eight-stranded antiparallel beta jelly roll configuration, with the beta strands arranged into two sheets. They are similar in topology to many viral capsid proteins, as well as lectins and several glucanases. The domain allows the protein to bind sugars and catalyzes the complete removal of N-linked oligosaccharide chains from glycoproteins." Q#22125 - CGI_10015126 superfamily 244870 103 183 9.70E-14 68.6047 cl08238 PA superfamily N - "PA: Protease-associated (PA) domain. The PA domain is an insert domain in a diverse fraction of proteases. The significance of the PA domain to many of the proteins in which it is inserted is undetermined. It may be a protein-protein interaction domain. At peptidase active sites, the PA domain may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate.
Proteins into which the PA domain is inserted include the following: i) various signal peptide peptidases including, hSPPL2a and 2b which catalyze the intramembrane proteolysis of tumor necrosis factor alpha, ii) various proteins containing a C3H2C3 RING finger including, Arabidopsis ReMembR-H2 protein and various E3 ubiquitin ligases such as human GRAIL (gene related to anergy in lymphocytes), iii) EDEM3 (ER-degradation-enhancing mannosidase-like 3 protein), iv) various plant vacuolar sorting receptors such as Pisum sativum BP-80, v) glutamate carboxypeptidase II (GCPII), vi) yeast aminopeptidase Y, vii) Vibrio metschnikovii VapT, a sodium dodecyl sulfate (SDS) resistant extracellular alkaline serine protease, viii) lactocepin (a cell envelope-associated protease from Lactobacillus paracasei subsp. paracasei NCDO 151), ix) various subtilisin-like proteases such as melon Cucumisin, and x) human TfR (transferrin receptor) 1 and 2." Q#22128 - CGI_10015129 superfamily 220545 266 315 2.17E-24 95.0002 cl15332 DUF2296 superfamily - - "Predicted integral membrane metal-binding protein (DUF2296); This domain, found in various hypothetical bacterial and eukaryotic metal-binding proteins, has no known function." Q#22128 - CGI_10015129 superfamily 221116 46 117 0.00689106 35.3634 cl12981 DUF3021 superfamily N - Protein of unknown function (DUF3021); This is a bacterial family of uncharacterized proteins. Q#22129 - CGI_10015130 superfamily 247999 277 328 4.93E-05 40.273 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#22129 - CGI_10015130 superfamily 245225 30 83 0.00131634 38.9819 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily C - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. 
This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein couples receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine- binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." 
Q#22131 - CGI_10015132 superfamily 241563 62 101 0.00220982 33.6068 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#22137 - CGI_10017116 superfamily 241645 12 76 1.54E-05 37.9274 cl00155 UBQ superfamily N - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#22139 - CGI_10017118 superfamily 222150 306 329 0.00527523 35.0601 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#22143 - CGI_10017122 superfamily 246616 1 319 1.12E-31 121.645 cl14105 MetH superfamily - - "Methionine synthase I (cobalamin-dependent), methyltransferase domain [Amino acid transport and metabolism]" Q#22144 - CGI_10017123 superfamily 246918 353 389 1.05E-05 42.9591 cl15278 TSP_1 superfamily C - Thrombospondin type 1 domain; Thrombospondin type 1 domain. 
Q#22145 - CGI_10017124 superfamily 246616 3 261 3.72E-21 90.0589 cl14105 MetH superfamily - - "Methionine synthase I (cobalamin-dependent), methyltransferase domain [Amino acid transport and metabolism]" Q#22146 - CGI_10017125 superfamily 246669 1923 2039 3.16E-05 45.1355 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#22146 - CGI_10017125 superfamily 246669 1480 1631 2.74E-53 186.477 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. 
Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#22148 - CGI_10017127 superfamily 242913 121 167 2.42E-22 89.2171 cl02162 Fip1 superfamily - - Fip1 motif; This short motif is about 40 amino acids in length. In the Fip1 protein that is a component of a yeast pre-mRNA polyadenylation factor that directly interacts with poly(A) polymerase. This region of Fip1 is needed for the interaction with the Th1 subunit of the complex and for specific polyadenylation of the cleaved mRNA precursor. Q#22152 - CGI_10017131 superfamily 248097 89 214 1.03E-22 89.6318 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#22154 - CGI_10017133 superfamily 128469 350 446 5.64E-25 100.222 cl17971 VPS9 superfamily - - Domain present in VPS9; Domain present in yeast vacuolar sorting protein 9 and other proteins. Q#22155 - CGI_10014336 superfamily 245213 821 853 1.59E-05 44.5498 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#22155 - CGI_10014336 superfamily 245213 1287 1321 2.40E-05 44.1646 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#22155 - CGI_10014336 superfamily 245213 1203 1231 0.00228272 38.0014 cl09941 EGF_CA superfamily C - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#22155 - CGI_10014336 superfamily 243124 84 237 1.02E-18 85.9416 cl02648 NIDO superfamily - - Nidogen-like; This is a nidogen-like domain (NIDO) domain and is an extracellular domain found in nidogen and hypothetical proteins of unknown function. Q#22155 - CGI_10014336 superfamily 243065 364 544 1.59E-08 55.1405 cl02516 VWD superfamily - - von Willebrand factor type D domain; Luciferin-2-monooxygenase from Vargula hilgendorfii contains a vwd domain. Its function is unrelated but the similarity is very strong by several methods. 
Q#22155 - CGI_10014336 superfamily 241578 1369 1412 9.34E-06 47.7648 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#22155 - CGI_10014336 superfamily 243060 1468 1555 1.37E-05 45.4476 cl02507 SEA superfamily - - "SEA domain; Domain found in Sea urchin sperm protein, Enterokinase, Agrin (SEA). Proposed function of regulating or binding carbohydrate side chains. Recently a proteolytic activity has been shown for a SEA domain." Q#22155 - CGI_10014336 superfamily 241578 862 902 1.47E-05 46.9944 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#22155 - CGI_10014336 superfamily 245213 1116 1155 0.000155635 41.5656 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#22155 - CGI_10014336 superfamily 241578 1323 1371 0.000342606 42.7572 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#22155 - CGI_10014336 superfamily 241578 942 986 0.000822324 41.6016 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#22155 - CGI_10014336 superfamily 221695 927 950 0.000910639 39.3606 cl18612 cEGF superfamily - - "Complement Clr-like EGF-like; cEGF, or complement Clr-like EGF, domains have six conserved cysteine residues disulfide-bonded into the characteristic pattern 'ababcc'. 
They are found in blood coagulation proteins such as fibrillin, Clr and Cls, thrombomodulin, and the LDL receptor. The core fold of the EGF domain consists of two small beta-hairpins packed against each other. Two major structural variants have been identified based on the structural context of the C-terminal cysteine residue of disulfide 'c' in the C-terminal hairpin: hEGFs and cEGFs. In cEGFs the C-terminal thiol resides on the C-terminal beta-sheet, resulting in long loop-lengths between the cysteine residues of disulfide 'c', typically C[10+]XC. These longer loop-lengths may have arisen by selective cysteine loss from a four-disulfide EGF template such as laminin or integrin. Tandem cEGF domains have five linking residues between terminal cysteines of adjacent domains. cEGF domains may or may not bind calcium in the linker region. cEGF domains with the consensus motif CXN4X[F,Y]XCXC are hydroxylated exclusively on the asparagine residue." Q#22155 - CGI_10014336 superfamily 245213 1074 1107 0.00475642 37.2264 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#22156 - CGI_10014337 superfamily 238159 26 456 0 575.89 cl15670 VATPase_H superfamily - - "VATPase_H, regulatory vacuolar ATP synthase subunit H (Vma13p); activation component of the peripheral V1 complex of V-ATPase, a heteromultimeric enzyme which uses ATP to actively transport protons into organelles and extracellular compartments. 
The topology is that of a superhelical spiral, in part the geometry is similar to superhelices composed of armadillo repeat motifs, as found in importins for example." Q#22159 - CGI_10014340 superfamily 241629 108 245 1.37E-45 156.52 cl00133 SCP superfamily - - "SCP: SCP-like extracellular protein domain, found in eukaryotes and prokaryotes. This family includes plant pathogenesis-related protein 1 (PR-1), which accumulates after infections with pathogens, and may act as an anti-fungal agent or be involved in cell wall loosening. This family also includes CRISPs, mammalian cysteine-rich secretory proteins, which combine SCP with a C-terminal cysteine rich domain, and allergen 5 from vespid venom. Roles for CRISP, in response to pathogens, fertilization, and sperm maturation have been proposed. One member, Tex31 from the venom duct of Conus textile, has been shown to possess proteolytic activity sensitive to serine protease inhibitors. The human GAPR-1 protein has been reported to dimerize, and such a dimer may form an active site containing a catalytic triad. SCP has also been proposed to be a Ca++ chelating serine protease. The Ca++-chelating function would fit with various signaling processes that members of this family, such as the CRISPs, are involved in, and is supported by sequence and structural evidence of a conserved pocket containing two histidines and a glutamate. It also may explain how helothermine, a toxic peptide secreted by the beaded lizard, blocks Ca++ transporting ryanodine receptors. Little is known about the biological roles of the bacterial and archaeal SCP domains." Q#22160 - CGI_10014341 superfamily 245213 257 286 0.000186571 38.3866 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#22160 - CGI_10014341 superfamily 241629 41 139 4.38E-32 118.212 cl00133 SCP superfamily C - "SCP: SCP-like extracellular protein domain, found in eukaryotes and prokaryotes. This family includes plant pathogenesis-related protein 1 (PR-1), which accumulates after infections with pathogens, and may act as an anti-fungal agent or be involved in cell wall loosening. This family also includes CRISPs, mammalian cysteine-rich secretory proteins, which combine SCP with a C-terminal cysteine rich domain, and allergen 5 from vespid venom. Roles for CRISP, in response to pathogens, fertilization, and sperm maturation have been proposed. One member, Tex31 from the venom duct of Conus textile, has been shown to possess proteolytic activity sensitive to serine protease inhibitors. The human GAPR-1 protein has been reported to dimerize, and such a dimer may form an active site containing a catalytic triad. SCP has also been proposed to be a Ca++ chelating serine protease. The Ca++-chelating function would fit with various signaling processes that members of this family, such as the CRISPs, are involved in, and is supported by sequence and structural evidence of a conserved pocket containing two histidines and a glutamate. It also may explain how helothermine, a toxic peptide secreted by the beaded lizard, blocks Ca++ transporting ryanodine receptors. Little is known about the biological roles of the bacterial and archaeal SCP domains." 
Q#22161 - CGI_10014342 superfamily 247912 44 405 9.84E-45 159.589 cl17358 Beta-lactamase superfamily - - Beta-lactamase; This family appears to be distantly related to pfam00905 and PF00768 D-alanyl-D-alanine carboxypeptidase. Q#22164 - CGI_10014345 superfamily 241584 645 733 0.000492401 39.4019 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#22164 - CGI_10014345 superfamily 241571 736 823 0.000191267 40.8587 cl00049 CUB superfamily C - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#22167 - CGI_10014348 superfamily 247725 19 147 4.89E-50 169.479 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." 
Q#22167 - CGI_10014348 superfamily 245201 187 436 4.21E-65 214.306 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#22168 - CGI_10000558 superfamily 246936 415 435 0.00964864 34.7816 cl15354 CBS_pair superfamily C - "The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase)." 
Q#22170 - CGI_10000886 superfamily 247755 252 489 1.58E-147 425.415 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#22170 - CGI_10000886 superfamily 216049 1 207 7.96E-44 157.062 cl18356 ABC_membrane superfamily N - ABC transporter transmembrane region; This family represents a unit of six transmembrane helices. Many members of the ABC transporter family (pfam00005) have two such regions. Q#22173 - CGI_10019912 superfamily 216939 70 127 1.88E-07 44.5761 cl03492 PC4 superfamily N - Transcriptional Coactivator p15 (PC4); p15 has a bipartite structure composed of an amino-terminal regulatory domain and a carboxy-terminal cryptic DNA-binding domain. The DNA-binding activity of the carboxy-terminal is disguised by the amino-terminal p15 domain. Activity is controlled by protein kinases that target the regulatory domain. Q#22173 - CGI_10019912 superfamily 216939 5 55 0.000520262 35.3313 cl03492 PC4 superfamily N - Transcriptional Coactivator p15 (PC4); p15 has a bipartite structure composed of an amino-terminal regulatory domain and a carboxy-terminal cryptic DNA-binding domain. The DNA-binding activity of the carboxy-terminal is disguised by the amino-terminal p15 domain. Activity is controlled by protein kinases that target the regulatory domain. 
Q#22175 - CGI_10019914 superfamily 247743 161 279 4.05E-11 59.8523 cl17189 AAA superfamily N - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#22177 - CGI_10019916 superfamily 241845 95 259 4.28E-40 140.934 cl00407 tRNA_m1G_MT superfamily - - "tRNA (Guanine-1)-methyltransferase; This is a family of tRNA (Guanine-1)-methyltransferases EC:2.1.1.31. In E.coli K12 this enzyme catalyzes the conversion of a guanosine residue to N1-methylguanine in position 37, next to the anticodon, in tRNA." Q#22179 - CGI_10019918 superfamily 222324 148 214 1.25E-15 69.343 cl16352 zf-3CxxC superfamily - - Zinc-binding domain; This is a family with several pairs of CxxC motifs possibly representing a multiple zinc-binding region. Only one pair of cysteines is associated with a highly conserved histidine residue. Q#22179 - CGI_10019918 superfamily 205872 109 138 2.54E-11 55.9637 cl16353 zf-CCHC_2 superfamily - - Zinc knuckle; This is a zinc-binding domain of the form CxxCxxxGHxxxxC from a variety of different species. Q#22180 - CGI_10019919 superfamily 247069 187 358 1.36E-34 127.887 cl15787 SEC14 superfamily - - "Sec14p-like lipid-binding domain. Found in secretory proteins, such as S. cerevisiae phosphatidylinositol transfer protein (Sec14p), and in lipid regulated proteins such as RhoGAPs, RhoGEFs and neurofibromin (NF1). SEC14 domain of Dbl is known to associate with G protein beta/gamma subunits." 
Q#22180 - CGI_10019919 superfamily 247643 124 169 3.35E-11 59.1003 cl16919 CRAL_TRIO_N superfamily - - "CRAL/TRIO, N-terminal domain; This all-alpha domain is found to the N-terminus of pfam00650." Q#22180 - CGI_10019919 superfamily 218219 2 31 8.51E-06 44.6179 cl04693 PRELI superfamily N - "PRELI-like family; This family includes a conserved region found in the PRELI protein and yeast YLR168C gene MSF1 product. The function of this protein is unknown, though it is thought to be involved in intra-mitochondrial protein sorting. This region is also found in a number of other eukaryotic proteins." Q#22182 - CGI_10019921 superfamily 246669 444 593 1.07E-84 276.051 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#22182 - CGI_10019921 superfamily 246669 1861 1993 2.53E-70 234.48 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. 
C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#22182 - CGI_10019921 superfamily 246669 1622 1745 1.76E-64 217.032 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." 
Q#22182 - CGI_10019921 superfamily 246669 983 1119 1.37E-63 215.1 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#22182 - CGI_10019921 superfamily 246669 279 389 1.54E-56 193.947 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. 
C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#22182 - CGI_10019921 superfamily 149289 366 436 9.82E-30 115.461 cl06959 FerI superfamily - - FerI (NUC094) domain; This domain is present in proteins of the Ferlin family. It is often located between two C2 domains. Q#22182 - CGI_10019921 superfamily 116739 864 941 3.00E-29 114.517 cl06958 FerB superfamily - - FerB (NUC096) domain; This is central domain B in proteins of the Ferlin family. Q#22182 - CGI_10019921 superfamily 246669 14 117 1.29E-28 114.273 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#22183 - CGI_10019922 superfamily 233514 387 658 1.27E-127 381.122 cl11769 EYA-cons_domain superfamily - - eyes absent protein conserved domain; This domain is common to all eyes absent (EYA) homologs. Metazoan EYA's also contain a variable N-terminal domain consisting largely of low-complexity sequences. 
Q#22185 - CGI_10019924 superfamily 243090 550 679 8.62E-63 210.031 cl02565 RGS superfamily - - "Regulator of G protein signaling (RGS) domain superfamily; The RGS domain is an essential part of the Regulator of G-protein Signaling (RGS) protein family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play critical regulatory roles as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. While inactive, G-alpha-subunits bind GDP, which is released and replaced by GTP upon agonist activation. GTP binding leads to dissociation of the alpha-subunit and the beta-gamma-dimer, allowing them to interact with effectors molecules and propagate signaling cascades associated with cellular growth, survival, migration, and invasion. Deactivation of the G-protein signaling controlled by the RGS domain accelerates GTPase activity of the alpha subunit by hydrolysis of GTP to GDP, which results in the reassociation of the alpha-subunit with the beta-gamma-dimer and thereby inhibition of downstream activity. As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. RGS proteins are also involved in apoptosis and cell proliferation, as well as modulation of cardiac development. Several RGS proteins can fine-tune immune responses, while others play important roles in neuronal signals modulation. Some RGS proteins are principal elements needed for proper vision." Q#22185 - CGI_10019924 superfamily 243077 23 74 1.37E-13 67.5705 cl02542 DnaJ superfamily - - "DnaJ domain or J-domain. 
DnaJ/Hsp40 (heat shock protein 40) proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperonine family. Hsp40 proteins are characterized by the presence of a J domain, which mediates the interaction with Hsp70. They may contain other domains as well, and the architectures provide a means of classification." Q#22185 - CGI_10019924 superfamily 243090 723 850 5.95E-50 173.732 cl02565 RGS superfamily - - "Regulator of G protein signaling (RGS) domain superfamily; The RGS domain is an essential part of the Regulator of G-protein Signaling (RGS) protein family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play critical regulatory roles as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. While inactive, G-alpha-subunits bind GDP, which is released and replaced by GTP upon agonist activation. GTP binding leads to dissociation of the alpha-subunit and the beta-gamma-dimer, allowing them to interact with effectors molecules and propagate signaling cascades associated with cellular growth, survival, migration, and invasion. Deactivation of the G-protein signaling controlled by the RGS domain accelerates GTPase activity of the alpha subunit by hydrolysis of GTP to GDP, which results in the reassociation of the alpha-subunit with the beta-gamma-dimer and thereby inhibition of downstream activity. As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. 
RGS proteins are also involved in apoptosis and cell proliferation, as well as modulation of cardiac development. Several RGS proteins can fine-tune immune responses, while others play important roles in neuronal signals modulation. Some RGS proteins are principal elements needed for proper vision." Q#22185 - CGI_10019924 superfamily 243090 371 486 7.25E-39 141.802 cl02565 RGS superfamily - - "Regulator of G protein signaling (RGS) domain superfamily; The RGS domain is an essential part of the Regulator of G-protein Signaling (RGS) protein family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play critical regulatory roles as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. While inactive, G-alpha-subunits bind GDP, which is released and replaced by GTP upon agonist activation. GTP binding leads to dissociation of the alpha-subunit and the beta-gamma-dimer, allowing them to interact with effectors molecules and propagate signaling cascades associated with cellular growth, survival, migration, and invasion. Deactivation of the G-protein signaling controlled by the RGS domain accelerates GTPase activity of the alpha subunit by hydrolysis of GTP to GDP, which results in the reassociation of the alpha-subunit with the beta-gamma-dimer and thereby inhibition of downstream activity. As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. RGS proteins are also involved in apoptosis and cell proliferation, as well as modulation of cardiac development. Several RGS proteins can fine-tune immune responses, while others play important roles in neuronal signals modulation. 
Some RGS proteins are principal elements needed for proper vision." Q#22187 - CGI_10019926 superfamily 197729 128 146 0.00871968 31.8937 cl11732 LRRcap superfamily - - occurring C-terminal to leucine-rich repeats; A motif occurring C-terminal to leucine-rich repeats in "sds22-like" and "typical" LRR-containing proteins. Q#22188 - CGI_10019927 superfamily 243199 5 87 1.68E-09 54.9898 cl02808 RT_like superfamily N - "RT_like: Reverse transcriptase (RT, RNA-dependent DNA polymerase)_like family. An RT gene is usually indicative of a mobile element such as a retrotransposon or retrovirus. RTs occur in a variety of mobile elements, including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, and caulimoviruses. These elements can be divided into two major groups. One group contains retroviruses and DNA viruses whose propagation involves an RNA intermediate. They are grouped together with transposable elements containing long terminal repeats (LTRs). The other group, also called poly(A)-type retrotransposons, contain fungal mitochondrial introns and transposable elements that lack LTRs." Q#22189 - CGI_10019928 superfamily 243109 20 203 1.49E-90 267.546 cl02614 SPRY superfamily - - "SPRY domain; SPRY domains, first identified in the SP1A kinase of Dictyostelium and rabbit Ryanodine receptor (hence the name), are homologous to B30.2. SPRY domains have been identified in at least 11 protein families, covering a wide range of functions, including regulation of cytokine signaling (SOCS), RNA metabolism (DDX1 and hnRNP), immunity to retroviruses (TRIM5alpha), intracellular calcium release (ryanodine receptors or RyR) and regulatory and developmental processes (HERC1 and Ash2L). B30.2 also contains residues in the N-terminus that form a distinct PRY domain structure; i.e. B30.2 domain consists of PRY and SPRY subdomains. 
B30.2 domains comprise the C-terminus of three protein families: BTNs (receptor glycoproteins of immunoglobulin superfamily); several TRIM proteins (composed of RING/B-box/coiled-coil or RBCC core); Stonutoxin (secreted poisonous protein of the stonefish Synanceia horrida). While SPRY domains are evolutionarily ancient, B30.2 domains are a more recent adaptation where the SPRY/PRY combination is a possible component of immune defense. Mutations found in the SPRY-containing proteins have shown to cause Mediterranean fever and Opitz syndrome." Q#22189 - CGI_10019928 superfamily 243073 193 227 6.87E-07 44.5098 cl02533 SOCS superfamily - - "SOCS (suppressors of cytokine signaling) box. The SOCS box is found in the C-terminal region of CIS/SOCS family proteins (in combination with a SH2 domain), ASBs (ankyrin repeat-containing proteins with a SOCS box), SSBs (SPRY domain-containing proteins with a SOCS box), and WSBs (WD40 repeat-containing proteins with a SOCS box), as well as, other miscellaneous proteins. The function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions." Q#22190 - CGI_10019929 superfamily 247725 34 169 3.43E-30 114.399 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." 
Q#22191 - CGI_10019930 superfamily 247725 6 125 4.82E-21 84.3982 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#22192 - CGI_10019931 superfamily 241637 1 70 7.77E-12 60.3998 cl00146 TFIIS_I superfamily - - N-terminal domain (domain I) of transcription elongation factor S-II (TFIIS); similar to a domain found in elongin A and CRSP70; likely to be involved in transcription; domain I from TFIIS interacts with RNA polymerase II holoenzyme Q#22192 - CGI_10019931 superfamily 243122 129 246 2.15E-35 126.601 cl02637 TFIIS_M superfamily - - "Transcription factor S-II (TFIIS), central domain; Transcription elongation by RNA polymerase II is regulated by the general elongation factor TFIIS. This factor stimulates RNA polymerase II to transcribe through regions of DNA that promote the formation of stalled ternary complexes. TFIIS is composed of three structural domains, termed I, II, and III. The two C-terminal domains (II and III), this domain and pfam01096 are required for transcription activity." Q#22192 - CGI_10019931 superfamily 207668 306 344 1.46E-17 75.3058 cl02609 TFIIS_C superfamily - - Transcription factor S-II (TFIIS); Transcription factor S-II (TFIIS). Q#22193 - CGI_10019932 superfamily 217725 302 583 4.12E-25 105.658 cl18425 FGE-sulfatase superfamily - - "Formylglycine-generating sulfatase enzyme; This domain is found in eukaryotic proteins required for post-translational sulphatase modification (SUMF1). 
These proteins are associated with the rare disorder multiple sulphatase deficiency (MSD). The protein product of the SUMF1 gene is FGE, formylglycine (FGly),-generating enzyme, which is a sulfatase. Sulfatases are enzymes essential for degradation and remodelling of sulfate esters, and formylglycine (FGly), the key catalytic in the active site, is unique to sulfatases. FGE is localised to the endoplasmic reticulum (ER) and interacts with and modifies the unfolded form of newly synthesised sulfatases. FGE is a single-domain monomer with a surprising paucity of secondary structure that adopts a unique fold which is stabilised by two Ca2+ ions. The effect of all mutations found in MSD patients is explained by the FGE structure, providing a molecular basis for MSD. A redox-active disulfide bond is present in the active site of FGE. An oxidized cysteine residue, possibly cysteine sulfenic acid, has been detected that may allow formulation of a structure-based mechanism for FGly formation from cysteine residues in all sulfatases." Q#22193 - CGI_10019932 superfamily 247727 644 800 1.31E-14 73.404 cl17173 AdoMet_MTases superfamily C - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." 
Q#22194 - CGI_10019933 superfamily 247907 926 1077 2.42E-33 127.92 cl17353 LamG superfamily - - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." Q#22194 - CGI_10019933 superfamily 247907 1139 1298 2.55E-30 119.06 cl17353 LamG superfamily - - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." Q#22194 - CGI_10019933 superfamily 247907 677 828 3.43E-29 115.978 cl17353 LamG superfamily - - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." Q#22194 - CGI_10019933 superfamily 246676 341 561 4.58E-22 96.2598 cl14616 Cyt_b561 superfamily - - "Eukaryotic cytochrome b(561); Cytochrome b(561) is a family of endosomal or secretory vesicle-specific electron transport proteins. They are integral membrane proteins that bind two heme groups non-covalently, and may have six alpha-helical trans-membrane segments. This is an exclusively eukaryotic family. 
Members of the prokaryotic cytochrome b561 family are not deemed homologous." Q#22194 - CGI_10019933 superfamily 246671 25 176 3.91E-09 56.2772 cl14606 Reeler_cohesin_like superfamily - - "Domains similar to the eukaryotic reeler domain and bacterial cohesins; This diverse family summarizes a set of distantly related domains, as revealed by structural similarity." Q#22194 - CGI_10019933 superfamily 245213 1104 1132 8.20E-06 44.935 cl09941 EGF_CA superfamily N - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#22194 - CGI_10019933 superfamily 245213 874 916 0.0013592 38.3866 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#22194 - CGI_10019933 superfamily 246710 199 373 6.68E-17 80.5496 cl14783 DOMON_like superfamily - - "Domon-like ligand-binding domains; DOMON-like domains can be found in all three kingdoms of life and are a diverse group of ligand binding domains that have been shown to interact with sugars and hemes. 
DOMON domains were initially thought to confer protein-protein interactions. They were subsequently found as a heme-binding motif in cellobiose dehydrogenase, an extracellular fungal oxidoreductase that degrades both lignin and cellulose, and in ethylbenzene dehydrogenase, an enzyme that aids in the anaerobic degradation of hydrocarbons. The domain interacts with sugars in the type 9 carbohydrate binding modules (CBM9), which are present in a variety of glycosyl hydrolases, and it can also be found at the N-terminus of sensor histidine kinases." Q#22195 - CGI_10019934 superfamily 245201 554 824 2.68E-166 498.485 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#22195 - CGI_10019934 superfamily 247725 391 497 2.01E-45 160.845 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." 
Q#22195 - CGI_10019934 superfamily 146323 1139 1271 6.28E-52 180.32 cl04185 Focal_AT superfamily - - "Focal adhesion targeting region; Focal adhesion kinase (FAK) is a tyrosine kinase found in focal adhesions, intracellular signaling complexes that are formed following engagement of the extracellular matrix by integrins. The C-terminal 'focal adhesion targeting' (FAT) region is necessary and sufficient for localising FAK to focal adhesions. The crystal structure of FAT shows it forms a four-helix bundle that resembles those found in two other proteins involved in cell adhesion, alpha-catenin and vinculin. The binding of FAT to the focal adhesion protein, paxillin, requires the integrity of the helical bundle, whereas binding to another focal adhesion protein, talin, does not." Q#22195 - CGI_10019934 superfamily 215882 275 395 4.98E-13 67.691 cl09511 FERM_M superfamily - - FERM central domain; This domain is the central structural domain of the FERM domain. Q#22197 - CGI_10019936 superfamily 241551 54 153 2.73E-30 107.385 cl00016 Cyt_c_Oxidase_Vb superfamily - - "Cytochrome c oxidase subunit Vb. Cytochrome c oxidase (CcO), the terminal oxidase in the respiratory chains of eukaryotes and most bacteria, is a multi-chain transmembrane protein located in the inner membrane of mitochondria and the cell membrane of prokaryotes. It catalyzes the reduction of O2 and simultaneously pumps protons across the membrane. The number of subunits varies from three to five in bacteria and up to 13 in mammalian mitochondria. Subunits I, II, and III of mammalian CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. Found only in eukaryotes, subunit Vb is one of three mammalian subunits that lacks a transmembrane region. Subunit Vb is located on the matrix side of the membrane and binds the regulatory subunit of protein kinase A. The abnormally extended conformation is stable only in the CcO assembly." 
Q#22199 - CGI_10019938 superfamily 220723 172 298 7.53E-23 91.5986 cl11043 ATG11 superfamily - - "Autophagy-related protein 11; The function of this family is conflicting. In the fission yeast, Schizosaccharomyces pombe, this protein has been shown to interact with the telomere cap complex. However, in budding yeast, Saccharomyces cerevisiae, this protein is called ATG11 and is shown to be involved in autophagy." Q#22201 - CGI_10019940 superfamily 112433 40 180 3.25E-83 258.184 cl04181 GCM superfamily - - GCM motif protein; GCM motif protein. Q#22209 - CGI_10009003 superfamily 241563 8 53 0.00106703 37.3167 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#22209 - CGI_10009003 superfamily 241563 62 103 0.00463115 35.5328 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#22210 - CGI_10009004 superfamily 241563 62 98 0.00162332 36.6884 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#22210 - CGI_10009004 superfamily 241563 8 53 0.00606967 35.1476 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#22212 - CGI_10009007 superfamily 110440 176 203 0.000470377 36.2317 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#22214 - CGI_10009009 superfamily 222150 100 125 0.00142549 34.2897 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#22214 - CGI_10009009 superfamily 222150 45 67 0.00249516 33.9045 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#22214 - CGI_10009009 superfamily 222150 128 153 0.00462655 33.1342 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#22215 - CGI_10006420 superfamily 241874 10 225 7.06E-106 317.886 cl00456 SLC5-6-like_sbd superfamily C - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. 
NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#22216 - CGI_10006421 superfamily 241874 7 306 2.40E-95 297.47 cl00456 SLC5-6-like_sbd superfamily N - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#22217 - CGI_10006422 superfamily 241832 196 277 1.75E-36 127.25 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. 
They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#22218 - CGI_10006423 superfamily 243072 97 179 6.47E-24 95.5282 cl02529 ANK superfamily N - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#22220 - CGI_10006425 superfamily 241575 477 541 2.00E-09 54.9711 cl00054 DSRM superfamily - - "Double-stranded RNA binding motif. Binding is not sequence specific but is highly specific for double stranded RNA. Found in a variety of proteins including dsRNA dependent protein kinase PKR, RNA helicases, Drosophila staufen protein, E. coli RNase III, RNases H1, and dsRNA dependent adenosine deaminases." 
Q#22220 - CGI_10006425 superfamily 241647 321 350 0.000844798 37.9655 cl00157 WW superfamily - - Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs. Q#22221 - CGI_10006426 superfamily 221494 48 334 5.05E-49 177.983 cl13664 DUF3608 superfamily - - "Protein of unknown function (DUF3608); This domain family is found in eukaryotes, and is approximately 280 amino acids in length. The family is found in association with pfam00610." Q#22221 - CGI_10006426 superfamily 243038 1070 1152 1.54E-14 71.5381 cl02442 DEP superfamily - - "DEP domain, named after Dishevelled, Egl-10, and Pleckstrin, where this domain was first discovered. The function of this domain is still not clear, but it is believed to be important for the membrane association of the signaling proteins in which it is present. New studies show that the DEP domain of Sst2, a yeast RGS protein is necessary and sufficient for receptor interaction." Q#22222 - CGI_10006427 superfamily 241809 179 288 7.33E-18 79.8431 cl00353 Ribosomal_L16_L10e superfamily - - "Ribosomal_L16_L10e: L16 is an essential protein in the large ribosomal subunit of bacteria, mitochondria, and chloroplasts. Large subunits that lack L16 are defective in peptidyl transferase activity, peptidyl-tRNA hydrolysis activity, association with the 30S subunit, binding of aminoacyl-tRNA and interaction with antibiotics. 
L16 is required for the function of elongation factor P (EF-P), a protein involved in peptide bond synthesis through the stimulation of peptidyl transferase activity by the ribosome. Mutations in L16 and the adjoining bases of 23S rRNA confer antibiotic resistance in bacteria, suggesting a role for L16 in the formation of the antibiotic binding site. The GTPase RbgA (YlqF) is essential for the assembly of the large subunit, and it is believed to regulate the incorporation of L16. L10e is the archaeal and eukaryotic cytosolic homolog of bacterial L16. L16 and L10e exhibit structural differences at the N-terminus." Q#22222 - CGI_10006427 superfamily 241809 392 501 7.33E-18 79.8431 cl00353 Ribosomal_L16_L10e superfamily - - "Ribosomal_L16_L10e: L16 is an essential protein in the large ribosomal subunit of bacteria, mitochondria, and chloroplasts. Large subunits that lack L16 are defective in peptidyl transferase activity, peptidyl-tRNA hydrolysis activity, association with the 30S subunit, binding of aminoacyl-tRNA and interaction with antibiotics. L16 is required for the function of elongation factor P (EF-P), a protein involved in peptide bond synthesis through the stimulation of peptidyl transferase activity by the ribosome. Mutations in L16 and the adjoining bases of 23S rRNA confer antibiotic resistance in bacteria, suggesting a role for L16 in the formation of the antibiotic binding site. The GTPase RbgA (YlqF) is essential for the assembly of the large subunit, and it is believed to regulate the incorporation of L16. L10e is the archaeal and eukaryotic cytosolic homolog of bacterial L16. L16 and L10e exhibit structural differences at the N-terminus." 
Q#22225 - CGI_10001381 superfamily 246680 40 105 8.20E-08 47.1964 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#22225 - CGI_10001381 superfamily 246680 127 210 5.39E-07 45.0586 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." 
Q#22228 - CGI_10001462 superfamily 241574 112 201 1.27E-19 86.8709 cl00053 PTPc superfamily NC - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#22228 - CGI_10001462 superfamily 241574 276 455 2.32E-14 71.0777 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#22231 - CGI_10001010 superfamily 242878 59 348 2.01E-99 297.663 cl02095 CDC50 superfamily - - "LEM3 (ligand-effect modulator 3) family / CDC50 family; Members of this family have been predicted to contain transmembrane helices. The family member LEM3 is a ligand-effect modulator, mutation of which increases glucocorticoid receptor activity in response to dexamethasone and also confers increased activity on other intracellular receptors including the progesterone, oestrogen and mineralocorticoid receptors. LEM3 is thought to affect a downstream step in the glucocorticoid receptor pathway. 
Factors that modulate ligand responsiveness are likely to contribute to the context-specific actions of the glucocorticoid receptor in mammalian cells. The products of genes YNR048w, YNL323w and YCR094w (CDC50) show redundancy of function and are involved in regulation of transcription via CDC39. CDC39 (also known as NOT1) is normally a negative regulator of transcription either by affecting the general RNA polymerase II machinery or by altering chromatin structure. One function of CDC39 is to block activation of the mating response pathway in the absence of pheromone, and mutation causes arrest in G1 by activation of the pathway. It may be that the cold-sensitive arrest in G1 noticed in CDC50 mutants may be due to inactivation of CDC39. The effects of LEM3 on glucocorticoid receptor activity may also be due to effects on transcription via CDC39." Q#22233 - CGI_10002196 superfamily 246683 79 388 2.29E-108 326.77 cl14648 Aldose_epim superfamily - - "aldose 1-epimerase superfamily; Aldose 1-epimerases or mutarotases are key enzymes of carbohydrate metabolism; they catalyze the interconversion of the alpha- and beta-anomers of hexose sugars such as glucose and galactose. This interconversion is an important step that allows anomer specific metabolic conversion of sugars. Studies of the catalytic mechanism of the best known member of the family, galactose mutarotase, have shown a glutamate and a histidine residue to be critical for catalysis; the glutamate serves as the active site base to initiate the reaction by removing the proton from the C-1 hydroxyl group of the sugar substrate and the histidine as the active site acid to protonate the C-5 ring oxygen." Q#22234 - CGI_10005487 superfamily 203136 17 138 9.23E-30 107.815 cl04867 LRAT superfamily - - "Lecithin retinol acyltransferase; The full-length members of this family are representatives of a novel class II tumour-suppressor family, designated as H-REV107-like. 
This domain is the catalytic N-terminal proline-rich region of the protein. The downstream region is a putative C-terminal transmembrane domain which is found to be crucial for cellular localisation, but not necessary for the enzyme activity. H-REV107-like proteins are homologous to lecithin retinol acyltransferase (LRAT), an enzyme that catalyzes the transfer of the sn-1 acyl group of phosphatidylcholine to all-trans-retinol and forming a retinyl ester." Q#22235 - CGI_10005488 superfamily 203136 18 138 3.30E-31 110.897 cl04867 LRAT superfamily - - "Lecithin retinol acyltransferase; The full-length members of this family are representatives of a novel class II tumour-suppressor family, designated as H-REV107-like. This domain is the catalytic N-terminal proline-rich region of the protein. The downstream region is a putative C-terminal transmembrane domain which is found to be crucial for cellular localisation, but not necessary for the enzyme activity. H-REV107-like proteins are homologous to lecithin retinol acyltransferase (LRAT), an enzyme that catalyzes the transfer of the sn-1 acyl group of phosphatidylcholine to all-trans-retinol and forming a retinyl ester." Q#22236 - CGI_10005490 superfamily 203136 22 146 7.88E-32 112.823 cl04867 LRAT superfamily - - "Lecithin retinol acyltransferase; The full-length members of this family are representatives of a novel class II tumour-suppressor family, designated as H-REV107-like. This domain is the catalytic N-terminal proline-rich region of the protein. The downstream region is a putative C-terminal transmembrane domain which is found to be crucial for cellular localisation, but not necessary for the enzyme activity. H-REV107-like proteins are homologous to lecithin retinol acyltransferase (LRAT), an enzyme that catalyzes the transfer of the sn-1 acyl group of phosphatidylcholine to all-trans-retinol and forming a retinyl ester." 
Q#22237 - CGI_10005491 superfamily 245596 86 383 1.57E-112 336.099 cl11394 Glyco_tranf_GTA_type superfamily - - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#22239 - CGI_10005493 superfamily 241644 59 197 6.44E-64 196.654 cl00154 UBCc superfamily - - "Ubiquitin-conjugating enzyme E2, catalytic (UBCc) domain. This is part of the ubiquitin-mediated protein degradation pathway in which a thiol-ester linkage forms between a conserved cysteine and the C-terminus of ubiquitin and complexes with ubiquitin protein ligase enzymes, E3. This pathway regulates many fundamental cellular processes. There are also other E2s which form thiol-ester linkages without the use of E3s as well as several UBC homologs (TSG101, Mms2, Croc-1 and similar proteins) which lack the active site cysteine essential for ubiquitination and appear to function in DNA repair pathways which were omitted from the scope of this CD." 
Q#22240 - CGI_10005494 superfamily 220393 169 408 2.01E-57 192.59 cl10751 Tmem26 superfamily - - "Transmembrane protein 26; The function of this family of transmembrane proteins has not, as yet, been determined." Q#22243 - CGI_10002360 superfamily 241575 280 335 0.00534282 36.13 cl00054 DSRM superfamily - - "Double-stranded RNA binding motif. Binding is not sequence specific but is highly specific for double stranded RNA. Found in a variety of proteins including dsRNA dependent protein kinase PKR, RNA helicases, Drosophila staufen protein, E. coli RNase III, RNases H1, and dsRNA dependent adenosine deaminases." Q#22244 - CGI_10002361 superfamily 245029 15 140 2.51E-20 81.1547 cl09190 MAPEG superfamily - - "MAPEG family; This family has been called MAPEG (Membrane Associated Proteins in Eicosanoid and Glutathione metabolism). It includes proteins such as Prostaglandin E synthase. This enzyme catalyzes the synthesis of PGE2 from PGH2 (produced by cyclooxygenase from arachidonic acid). Because of structural similarities in the active sites of FLAP, LTC4 synthase and PGE synthase, substrates for each enzyme can compete with one another and modulate synthetic activity." Q#22246 - CGI_10002363 superfamily 220646 29 285 1.26E-54 179.505 cl10925 DUF2464 superfamily - - "Protein of unknown function (DUF2464); This is a family of proteins conserved from worms to humans. Members have been annotated as FAM125A proteins, but their function is unknown." Q#22247 - CGI_10013291 superfamily 243033 51 147 3.16E-07 46.1573 cl02428 Ependymin superfamily - - Ependymin; Ependymin. Q#22248 - CGI_10013292 superfamily 243033 1 107 1.31E-08 48.0833 cl02428 Ependymin superfamily - - Ependymin; Ependymin. Q#22249 - CGI_10013293 superfamily 243033 126 244 5.72E-23 91.2257 cl02428 Ependymin superfamily - - Ependymin; Ependymin. 
Q#22250 - CGI_10013294 superfamily 192535 28 216 8.61E-07 48.361 cl18179 7TM_GPCR_Srsx superfamily C - Serpentine type 7TM GPCR chemoreceptor Srsx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srsx is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#22251 - CGI_10013295 superfamily 248067 635 751 4.91E-41 147.74 cl17513 ABC1 superfamily - - "ABC1 family; This family includes ABC1 from yeast and AarF from E. coli. These proteins have a nuclear or mitochondrial subcellular location in eukaryotes. The exact molecular functions of these proteins is not clear, however yeast ABC1 suppresses a cytochrome b mRNA translation defect and is essential for the electron transfer in the bc 1 complex and E. coli AarF is required for ubiquinone production. It has been suggested that members of the ABC1 family are novel chaperonins. These proteins are unrelated to the ABC transporter proteins." Q#22252 - CGI_10013296 superfamily 247799 121 186 2.62E-19 82.5243 cl17245 KH-I superfamily - - "K homology RNA-binding domain, type I. KH binds single-stranded RNA or DNA. It is found in a wide variety of proteins including ribosomal proteins, transcription factors and post-transcriptional modifiers of mRNA. There are two different KH domains that belong to different protein folds, but they share a single KH motif. The KH motif is folded into a beta alpha alpha beta unit. In addition to the core, type II KH domains (e.g. ribosomal protein S3) include N-terminal extension and type I KH domains (e.g. hnRNP K) contain C-terminal extension." Q#22252 - CGI_10013296 superfamily 247799 480 543 1.09E-11 61.0367 cl17245 KH-I superfamily - - "K homology RNA-binding domain, type I. KH binds single-stranded RNA or DNA. 
It is found in a wide variety of proteins including ribosomal proteins, transcription factors and post-transcriptional modifiers of mRNA. There are two different KH domains that belong to different protein folds, but they share a single KH motif. The KH motif is folded into a beta alpha alpha beta unit. In addition to the core, type II KH domains (e.g. ribosomal protein S3) include N-terminal extension and type I KH domains (e.g. hnRNP K) contain C-terminal extension." Q#22253 - CGI_10013297 superfamily 248054 11 77 2.61E-14 69.8084 cl17500 NAD_binding_8 superfamily - - NAD(P)-binding Rossmann-like domain; NAD(P)-binding Rossmann-like domain. Q#22253 - CGI_10013297 superfamily 241563 535 560 0.00151999 37.844 cl00034 BBOX superfamily N - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#22253 - CGI_10013297 superfamily 245835 559 669 0.00155538 40.0595 cl12013 BAR superfamily NC - "The Bin/Amphiphysin/Rvs (BAR) domain, a dimerization module that binds membranes and detects membrane curvature; BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions including organelle biogenesis, membrane trafficking or remodeling, and cell division and migration. Mutations in BAR containing proteins have been linked to diseases and their inactivation in cells leads to altered membrane dynamics. A BAR domain with an additional N-terminal amphipathic helix (an N-BAR) can drive membrane curvature. These N-BAR domains are found in amphiphysins and endophilins, among others. 
BAR domains are also frequently found alongside domains that determine lipid specificity, such as the Pleckstrin Homology (PH) and Phox Homology (PX) domains which are present in beta centaurins (ACAPs and ASAPs) and sorting nexins, respectively. A FES-CIP4 Homology (FCH) domain together with a coiled coil region is called the F-BAR domain and is present in Pombe/Cdc15 homology (PCH) family proteins, which include Fes/Fes tyrosine kinases, PACSIN or syndapin, CIP4-like proteins, and srGAPs, among others. The Inverse (I)-BAR or IRSp53/MIM homology Domain (IMD) is found in multi-domain proteins, such as IRSp53 and MIM, that act as scaffolding proteins and transducers of a variety of signaling pathways that link membrane dynamics and the underlying actin cytoskeleton. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The I-BAR domain induces membrane protrusions in the opposite direction compared to classical BAR and F-BAR domains, which produce membrane invaginations. BAR domains that also serve as protein interaction domains include those of arfaptin and OPHN1-like proteins, among others, which bind to Rac and Rho GAP domains, respectively." Q#22255 - CGI_10013299 superfamily 243109 325 461 1.41E-16 77.7021 cl02614 SPRY superfamily - - "SPRY domain; SPRY domains, first identified in the SP1A kinase of Dictyostelium and rabbit Ryanodine receptor (hence the name), are homologous to B30.2. SPRY domains have been identified in at least 11 protein families, covering a wide range of functions, including regulation of cytokine signaling (SOCS), RNA metabolism (DDX1 and hnRNP), immunity to retroviruses (TRIM5alpha), intracellular calcium release (ryanodine receptors or RyR) and regulatory and developmental processes (HERC1 and Ash2L). B30.2 also contains residues in the N-terminus that form a distinct PRY domain structure; i.e. B30.2 domain consists of PRY and SPRY subdomains. 
B30.2 domains comprise the C-terminus of three protein families: BTNs (receptor glycoproteins of immunoglobulin superfamily); several TRIM proteins (composed of RING/B-box/coiled-coil or RBCC core); Stonutoxin (secreted poisonous protein of the stonefish Synanceia horrida). While SPRY domains are evolutionarily ancient, B30.2 domains are a more recent adaptation where the SPRY/PRY combination is a possible component of immune defense. Mutations found in the SPRY-containing proteins have been shown to cause Mediterranean fever and Opitz syndrome." Q#22255 - CGI_10013299 superfamily 243109 697 812 4.05E-09 55.3605 cl02614 SPRY superfamily - - "SPRY domain; SPRY domains, first identified in the SP1A kinase of Dictyostelium and rabbit Ryanodine receptor (hence the name), are homologous to B30.2. SPRY domains have been identified in at least 11 protein families, covering a wide range of functions, including regulation of cytokine signaling (SOCS), RNA metabolism (DDX1 and hnRNP), immunity to retroviruses (TRIM5alpha), intracellular calcium release (ryanodine receptors or RyR) and regulatory and developmental processes (HERC1 and Ash2L). B30.2 also contains residues in the N-terminus that form a distinct PRY domain structure; i.e. B30.2 domain consists of PRY and SPRY subdomains. B30.2 domains comprise the C-terminus of three protein families: BTNs (receptor glycoproteins of immunoglobulin superfamily); several TRIM proteins (composed of RING/B-box/coiled-coil or RBCC core); Stonutoxin (secreted poisonous protein of the stonefish Synanceia horrida). While SPRY domains are evolutionarily ancient, B30.2 domains are a more recent adaptation where the SPRY/PRY combination is a possible component of immune defense. Mutations found in the SPRY-containing proteins have been shown to cause Mediterranean fever and Opitz syndrome." 
Q#22259 - CGI_10013303 superfamily 217062 174 414 1.06E-39 143.949 cl12266 Branch superfamily - - "Core-2/I-Branching enzyme; This is a family of two different beta-1,6-N-acetylglucosaminyltransferase enzymes, I-branching enzyme and core-2 branching enzyme . I-branching enzyme is responsible for the production of the blood group I-antigen during embryonic development. Core-2 branching enzyme forms crucial side-chain branches in O-glycans." Q#22261 - CGI_10013306 superfamily 203495 356 406 1.78E-05 42.6102 cl10663 Cep57_MT_bd superfamily N - "Centrosome microtubule-binding domain of Cep57; This C-terminal region of Cep57 binds, nucleates and bundles microtubules. The N-terminal part, family Cep57_CLD, pfam14073, is the centrosome localisation domain Cep57." Q#22265 - CGI_10013310 superfamily 247637 55 404 3.26E-158 453.218 cl16912 MDR superfamily - - "Medium chain reductase/dehydrogenase (MDR)/zinc-dependent alcohol dehydrogenase-like family; The medium chain reductase/dehydrogenases (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH. MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P) binding-Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES. The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH) , quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones. 
ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines. Other MDR members have only a catalytic zinc, and some contain no coordinated zinc." Q#22267 - CGI_10021955 superfamily 245847 62 203 3.01E-34 130.165 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#22267 - CGI_10021955 superfamily 247907 210 366 4.59E-27 109.815 cl17353 LamG superfamily - - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." Q#22267 - CGI_10021955 superfamily 247907 825 970 1.78E-21 93.6368 cl17353 LamG superfamily - - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." 
Q#22267 - CGI_10021955 superfamily 247907 421 542 9.72E-16 76.688 cl17353 LamG superfamily - - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." Q#22267 - CGI_10021955 superfamily 247907 1045 1185 4.08E-15 74.762 cl17353 LamG superfamily - - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." Q#22267 - CGI_10021955 superfamily 245213 570 604 4.64E-05 42.6238 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#22268 - CGI_10021956 superfamily 222150 355 378 5.22E-05 41.6085 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#22268 - CGI_10021956 superfamily 222150 232 256 0.00108577 37.7565 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. 
Q#22269 - CGI_10021957 superfamily 241739 203 527 2.87E-171 489.77 cl00268 class_II_aaRS-like_core superfamily - - "Class II tRNA amino-acyl synthetase-like catalytic core domain. Class II amino acyl-tRNA synthetases (aaRS) share a common fold and generally attach an amino acid to the 3' OH of ribose of the appropriate tRNA. PheRS is an exception in that it attaches the amino acid at the 2'-OH group, like class I aaRSs. These enzymes are usually homodimers. This domain is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. The substrate specificity of this reaction is further determined by additional domains. Interestingly, this domain is also found in asparagine synthase A (AsnA), in the accessory subunit of mitochondrial polymerase gamma and in the bacterial ATP phosphoribosyltransferase regulatory subunit HisZ." Q#22269 - CGI_10021957 superfamily 245205 92 191 2.33E-45 155.799 cl09930 RPA_2b-aaRSs_OBF_like superfamily - - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. 
The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." Q#22270 - CGI_10021958 superfamily 206130 100 191 0.000343476 39.1091 cl16501 DUF4218 superfamily C - Domain of unknown function (DUF4218); Domain of unknown function (DUF4218). Q#22271 - CGI_10021959 superfamily 247941 42 176 2.69E-08 50.4121 cl17387 Methyltransf_21 superfamily - - "Methyltransferase FkbM domain; This family has members from bacteria to human, and appears to be a methyltransferase." Q#22272 - CGI_10021960 superfamily 247724 21 185 1.11E-118 337.655 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. 
Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#22274 - CGI_10021962 superfamily 243058 399 510 2.23E-07 50.0055 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#22274 - CGI_10021962 superfamily 243058 130 246 2.44E-07 49.6203 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#22274 - CGI_10021962 superfamily 243689 36 103 2.86E-10 57.639 cl04271 IBN_N superfamily - - Importin-beta N-terminal domain; Importin-beta N-terminal domain. 
Q#22275 - CGI_10021963 superfamily 192997 289 443 1.96E-29 115.757 cl18184 Sterol-sensing superfamily - - "Sterol-sensing domain of SREBP cleavage-activation; Sterol regulatory element-binding proteins (SREBPs) are membrane-bound transcription factors that promote lipid synthesis in animal cells. They are embedded in the membranes of the endoplasmic reticulum (ER) in a helical hairpin orientation and are released from the ER by a two-step proteolytic process. Proteolysis begins when the SREBPs are cleaved at Site-1, which is located at a leucine residue in the middle of the hydrophobic loop in the lumen of the ER. Upon proteolytic processing SREBP can activate the expression of genes involved in cholesterol biosynthesis and uptake. SCAP stimulates cleavage of SREBPs via fusion of their two C-termini. This domain is the transmembrane region that traverses the membrane eight times and is the sterol-sensing domain of the cleavage protein. WD40 domains are found towards the C-terminus." Q#22276 - CGI_10021964 superfamily 243072 511 638 4.35E-31 120.181 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#22276 - CGI_10021964 superfamily 243072 580 700 6.64E-25 102.462 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#22276 - CGI_10021964 superfamily 243072 340 468 8.54E-25 102.077 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#22276 - CGI_10021964 superfamily 243072 268 396 3.86E-24 100.151 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#22276 - CGI_10021964 superfamily 243072 164 290 2.03E-22 95.143 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#22278 - CGI_10021966 superfamily 247986 428 506 0.000169123 43.1306 cl17432 PBPb superfamily C - "Bacterial periplasmic transport systems use membrane-bound complexes and substrate-bound, membrane-associated, periplasmic binding proteins (PBPs) to transport a wide variety of substrates, such as, amino acids, peptides, sugars, vitamins and inorganic ions. PBPs have two cell-membrane translocation functions: bind substrate, and interact with the membrane bound complex. A diverse group of periplasmic transport receptors for lysine/arginine/ornithine (LAO), glutamine, histidine, sulfate, phosphate, molybdate, and methanol are included in the PBPb CD." Q#22279 - CGI_10021967 superfamily 247986 444 585 4.65E-10 59.309 cl17432 PBPb superfamily N - "Bacterial periplasmic transport systems use membrane-bound complexes and substrate-bound, membrane-associated, periplasmic binding proteins (PBPs) to transport a wide variety of substrates, such as, amino acids, peptides, sugars, vitamins and inorganic ions. PBPs have two cell-membrane translocation functions: bind substrate, and interact with the membrane bound complex. A diverse group of periplasmic transport receptors for lysine/arginine/ornithine (LAO), glutamine, histidine, sulfate, phosphate, molybdate, and methanol are included in the PBPb CD." Q#22279 - CGI_10021967 superfamily 247986 262 342 3.43E-08 53.531 cl17432 PBPb superfamily C - "Bacterial periplasmic transport systems use membrane-bound complexes and substrate-bound, membrane-associated, periplasmic binding proteins (PBPs) to transport a wide variety of substrates, such as, amino acids, peptides, sugars, vitamins and inorganic ions. PBPs have two cell-membrane translocation functions: bind substrate, and interact with the membrane bound complex. A diverse group of periplasmic transport receptors for lysine/arginine/ornithine (LAO), glutamine, histidine, sulfate, phosphate, molybdate, and methanol are included in the PBPb CD." 
Q#22279 - CGI_10021967 superfamily 245225 6 186 1.02E-10 63.0873 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily N - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein couples receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine- binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." 
Q#22279 - CGI_10021967 superfamily 241888 347 456 0.000265945 41.7853 cl00473 BI-1-like superfamily C - "BAX inhibitor (BI)-1/YccA-like protein family; Mammalian members of the BAX inhibitor (BI)-1 like family of small transmembrane proteins have been shown to have an antiapoptotic effect either by stimulating the antiapoptotic function of Bcl-2, a well-characterized oncogene, or by inhibiting the proapoptotic effect of Bax, another member of the Bcl-2 family. Their broad tissue distribution and high degree of conservation suggests an important regulatory role. This superfamily also contains the lifeguard(LFG)-like proteins and other subfamilies which appear to be related by common descent and also function as inhibitors of apoptosis. In plants, BI-1 like proteins play a role in pathogen resistance. A prokaryotic member, Escherichia coli YccA, has been shown to interact with ATP-dependent protease FtsH, which degrades abnormal membrane proteins as part of a quality control mechanism to keep the integrity of biological membranes." Q#22280 - CGI_10021968 superfamily 247692 2 639 0 1053.68 cl17068 AFD_class_I superfamily - - "Adenylate forming domain, Class I; This family includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain." Q#22281 - CGI_10021969 superfamily 218423 143 501 7.12E-168 493.307 cl09358 NAGLU superfamily - - "Alpha-N-acetylglucosaminidase (NAGLU) tim-barrel domain; Alpha-N-acetylglucosaminidase, a lysosomal enzyme required for the stepwise degradation of heparan sulfate. 
Mutations on the alpha-N-acetylglucosaminidase (NAGLU) gene can lead to Mucopolysaccharidosis type IIIB (MPS IIIB; or Sanfilippo syndrome type B) characterized by neurological dysfunction but relatively mild somatic manifestations. The structure shows that the enzyme is composed of three domains. This central domain has a tim barrel fold." Q#22281 - CGI_10021969 superfamily 221877 506 801 4.90E-75 247.236 cl15206 NAGLU_C superfamily - - "Alpha-N-acetylglucosaminidase (NAGLU) C-terminal domain; Alpha-N-acetylglucosaminidase, a lysosomal enzyme required for the stepwise degradation of heparan sulfate. Mutations on the alpha-N-acetylglucosaminidase (NAGLU) gene can lead to Mucopolysaccharidosis type IIIB (MPS IIIB; or Sanfilippo syndrome type B) characterized by neurological dysfunction but relatively mild somatic manifestations. The structure shows that the enzyme is composed of three domains. This C-terminal domain has an all alpha helical fold." Q#22281 - CGI_10021969 superfamily 193444 43 129 5.09E-21 89.2288 cl15205 NAGLU_N superfamily - - "Alpha-N-acetylglucosaminidase (NAGLU) N-terminal domain; Alpha-N-acetylglucosaminidase, a lysosomal enzyme required for the stepwise degradation of heparan sulfate. Mutations on the alpha-N-acetylglucosaminidase (NAGLU) gene can lead to Mucopolysaccharidosis type IIIB (MPS IIIB; or Sanfilippo syndrome type B) characterized by neurological dysfunction but relatively mild somatic manifestations. The structure shows that the enzyme is composed of three domains. This N-terminal domain has an alpha-beta fold." Q#22282 - CGI_10021970 superfamily 247769 187 304 0.000198835 40.7857 cl17215 HDc superfamily - - Metal dependent phosphohydrolases with conserved 'HD' motif Q#22283 - CGI_10021971 superfamily 247856 33 75 9.73E-07 42.9201 cl17302 EFh superfamily C - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. 
Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#22283 - CGI_10021971 superfamily 247856 103 163 1.11E-06 42.9201 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#22284 - CGI_10021972 superfamily 221138 10 303 1.33E-64 228.592 cl18592 Med23 superfamily C - "Mediator complex subunit 23; Med23 is one of the subunits of the Tail portion of the Mediator complex that regulates RNA polymerase II activity. Med23 is required for heat-shock-specific gene expression, and has been shown to mediate transcriptional activation of E1A in mice." Q#22285 - CGI_10021973 superfamily 221138 2 991 0 1085.66 cl18592 Med23 superfamily N - "Mediator complex subunit 23; Med23 is one of the subunits of the Tail portion of the Mediator complex that regulates RNA polymerase II activity. Med23 is required for heat-shock-specific gene expression, and has been shown to mediate transcriptional activation of E1A in mice." Q#22287 - CGI_10021975 superfamily 217473 106 276 1.42E-14 70.8569 cl03978 Mab-21 superfamily C - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#22288 - CGI_10021976 superfamily 246918 249 300 0.000167775 39.1071 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#22288 - CGI_10021976 superfamily 246918 14 70 0.000376313 37.9515 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. 
Q#22288 - CGI_10021976 superfamily 246918 190 217 0.0076423 33.9182 cl15278 TSP_1 superfamily C - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#22289 - CGI_10021977 superfamily 245201 3 258 2.47E-157 460.478 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#22289 - CGI_10021977 superfamily 201217 577 625 5.20E-10 56.38 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#22289 - CGI_10021977 superfamily 201217 410 458 2.24E-09 54.454 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#22289 - CGI_10021977 superfamily 205718 561 590 0.000280812 39.3958 cl16296 RCC1_2 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#22289 - CGI_10021977 superfamily 201217 628 672 0.000340644 39.4312 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#22290 - CGI_10021978 superfamily 245604 6 78 1.11E-19 82.8375 cl11404 Biotinyl_lipoyl_domains superfamily - - "Biotinyl_lipoyl_domains are present in biotin-dependent carboxylases/decarboxylases, the dihydrolipoyl acyltransferase component (E2) of 2-oxo acid dehydrogenases, and the H-protein of the glycine cleavage system (GCS). 
These domains transport CO2, acyl, or methylamine, respectively, between components of the complex/protein via a biotinyl or lipoyl group, which is covalently attached to a highly conserved lysine residue." Q#22290 - CGI_10021978 superfamily 215782 218 430 1.30E-87 268.264 cl18344 2-oxoacid_dh superfamily - - 2-oxoacid dehydrogenases acyltransferase (catalytic domain); These proteins contain one to three copies of a lipoyl binding domain followed by the catalytic domain. Q#22290 - CGI_10021978 superfamily 202412 115 151 3.42E-13 63.9805 cl03729 E3_binding superfamily - - e3 binding domain; This family represents a small domain of the E2 subunit of 2-oxo-acid dehydrogenases responsible for the binding of the E3 subunit. Q#22291 - CGI_10021979 superfamily 246597 27 324 4.41E-86 269.556 cl13995 MPP_superfamily superfamily - - "metallophosphatase superfamily, metallophosphatase domain; Metallophosphatases (MPPs), also known as metallophosphoesterases, phosphodiesterases (PDEs), binuclear metallophosphoesterases, and dimetal-containing phosphoesterases (DMPs), represent a diverse superfamily of enzymes with a conserved domain containing an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. This superfamily includes: the phosphoprotein phosphatases (PPPs), Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination." 
Q#22292 - CGI_10021980 superfamily 241766 1 258 5.05E-107 324.675 cl00303 PNP_UDP_1 superfamily - - Phosphorylase superfamily; Members of this family include: purine nucleoside phosphorylase (PNP) Uridine phosphorylase (UdRPase) 5'-methylthioadenosine phosphorylase (MTA phosphorylase) Q#22292 - CGI_10021980 superfamily 241766 307 546 5.78E-95 293.474 cl00303 PNP_UDP_1 superfamily - - Phosphorylase superfamily; Members of this family include: purine nucleoside phosphorylase (PNP) Uridine phosphorylase (UdRPase) 5'-methylthioadenosine phosphorylase (MTA phosphorylase) Q#22294 - CGI_10021983 superfamily 217956 227 374 7.93E-38 134.845 cl04443 PDCD2_C superfamily N - "Programmed cell death protein 2, C-terminal putative domain; Programmed cell death protein 2, C-terminal putative domain. " Q#22295 - CGI_10021984 superfamily 218936 32 176 8.35E-55 173.93 cl05619 PITH superfamily - - "PITH domain; This family was formerly known as DUF1000. The full-length, Txnl1, protein which is a probable component of the 26S proteasome, uses its C-terminal, PITH, domain to associate specifically with the 26S proteasome. PITH derives from proteasome-interacting thioredoxin domain." Q#22296 - CGI_10021985 superfamily 220226 9 279 4.09E-107 314.551 cl09658 XendoU superfamily - - Endoribonuclease XendoU; This is a family of endoribonucleases involved in RNA biosynthesis which has been named XendoU in Xenopus laevis. XendoU is a U-specific metal dependent enzyme that produces products with a 2'-3' cyclic phosphate termini. Q#22297 - CGI_10021986 superfamily 241872 4 73 1.17E-16 73.2694 cl00453 CDP-OH_P_transf superfamily C - CDP-alcohol phosphatidyltransferase; All of these members have the ability to catalyze the displacement of CMP from a CDP-alcohol by a second alcohol with formation of a phosphodiester bond and concomitant breaking of a phosphoride anhydride bond. 
Q#22298 - CGI_10021987 superfamily 241886 1 226 1.84E-61 197.782 cl00470 Aldo_ket_red superfamily C - "Aldo-keto reductases (AKRs) are a superfamily of soluble NAD(P)(H) oxidoreductases whose chief purpose is to reduce aldehydes and ketones to primary and secondary alcohols. AKRs are present in all phyla and are of importance to both health and industrial applications. Members have very distinct functions and include the prokaryotic 2,5-diketo-D-gluconic acid reductases and beta-keto ester reductases, the eukaryotic aldose reductases, aldehyde reductases, hydroxysteroid dehydrogenases, steroid 5beta-reductases, potassium channel beta-subunits and aflatoxin aldehyde reductases, among others." Q#22300 - CGI_10021989 superfamily 241884 2 138 2.12E-84 249.844 cl00467 Ntn_hydrolase superfamily N - "The Ntn hydrolases (N-terminal nucleophile) are a diverse superfamily of enzymes that are activated autocatalytically via an N-terminally located nucleophilic amino acid. N-terminal nucleophile (NTN-) hydrolase superfamily, which contains a four-layered alpha, beta, beta, alpha core structure. This family of hydrolases includes penicillin acylase, the 20S proteasome alpha and beta subunits, and glutamate synthase. The mechanism of activation of these proteins is conserved, although they differ in their substrate specificities. All known members catalyze the hydrolysis of amide bonds in either proteins or small molecules, and each one of them is synthesized as a preprotein. For each, an autocatalytic endoproteolytic process generates a new N-terminal residue. This mature N-terminal residue is central to catalysis and acts as both a polarizing base and a nucleophile during the reaction. The N-terminal amino group acts as the proton acceptor and activates either the nucleophilic hydroxyl in a Ser or Thr residue or the nucleophilic thiol in a Cys residue. 
The position of the N-terminal nucleophile in the active site and the mechanism of catalysis are conserved in this family, despite considerable variation in the protein sequences." Q#22302 - CGI_10021991 superfamily 247916 152 222 2.82E-13 65.8671 cl17362 Transglut_core superfamily - - "Transglutaminase-like superfamily; This family includes animal transglutaminases and other bacterial proteins of unknown function. Sequence conservation in this superfamily primarily involves three motifs that centre around conserved cysteine, histidine, and aspartate residues that form the catalytic triad in the structurally characterized transglutaminase, the human blood clotting factor XIIIa'. On the basis of the experimentally demonstrated activity of the Methanobacterium phage pseudomurein endoisopeptidase, it is proposed that many, if not all, microbial homologues of the transglutaminases are proteases and that the eukaryotic transglutaminases have evolved from an ancestral protease." Q#22303 - CGI_10021992 superfamily 248312 25 182 5.66E-05 40.4232 cl17758 PMP22_Claudin superfamily - - PMP-22/EMP/MP20/Claudin family; PMP-22/EMP/MP20/Claudin family. Q#22304 - CGI_10019751 superfamily 220087 63 137 9.02E-12 58.8132 cl18544 zf-DNA_Pol superfamily C - "DNA Polymerase alpha zinc finger; The DNA Polymerase alpha zinc finger domain adopts an alpha-helix-like structure, followed by three turns, all of which involve proline. The resulting motif is a helix-turn-helix motif, in contrast to other zinc finger domains, which show anti-parallel sheet and helix conformation. Zinc binding occurs due to the presence of four cysteine residues positioned to bind the metal centre in a tetrahedral coordination geometry. Function of this domain is uncertain: it has been proposed that the zinc finger motif may be an essential part of the DNA binding domain." 
Q#22305 - CGI_10019752 superfamily 245230 3 409 0 906.651 cl10017 Tubulin_FtsZ superfamily - - "Tubulin/FtsZ: Family includes tubulin alpha-, beta-, gamma-, delta-, and epsilon-tubulins as well as FtsZ, all of which are involved in polymer formation. Tubulin is the major component of microtubules, but also exists as a heterodimer and as a curved oligomer. Microtubules exist in all eukaryotic cells and are responsible for many functions, including cellular transport, cell motility, and mitosis. FtsZ forms a ring-shaped septum at the site of bacterial cell division, which is required for constriction of cell membrane and cell envelope to yield two daughter cells. FtsZ can polymerize into tubes, sheets, and rings in vitro and is ubiquitous in eubacteria, archaea, and chloroplasts." Q#22308 - CGI_10019756 superfamily 241594 1910 2378 5.96E-104 339.926 cl00077 HECTc superfamily - - "HECT domain; C-terminal catalytic domain of a subclass of Ubiquitin-protein ligase (E3). It binds specific ubiquitin-conjugating enzymes (E2), accepts ubiquitin from E2, transfers ubiquitin to substrate lysine side chains, and transfers additional ubiquitin molecules to the end of growing ubiquitin chains." Q#22308 - CGI_10019756 superfamily 243072 364 476 3.54E-21 92.8318 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#22308 - CGI_10019756 superfamily 115363 1286 1347 7.24E-31 118.628 cl05972 MIB_HERC2 superfamily - - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). 
Q#22308 - CGI_10019756 superfamily 203750 1117 1250 1.94E-16 79.2761 cl18248 Sad1_UNC superfamily - - "Sad1 / UNC-like C-terminal; The C. elegans UNC-84 protein is a nuclear envelope protein that is involved in nuclear anchoring and migration during development. The S. pombe Sad1 protein localises at the spindle pole body. UNC-84 and Sad1 share a common C-terminal region, that is often termed the SUN (Sad1 and UNC) domain. In mammals, the SUN domain is present in two proteins, Sun1 and Sun2. The SUN domain of Sun2 has been demonstrated to be in the periplasm." Q#22309 - CGI_10019757 superfamily 247755 392 603 5.85E-69 222.799 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#22309 - CGI_10019757 superfamily 241940 14 293 3.86E-57 195.526 cl00549 ABC_membrane_2 superfamily - - ABC transporter transmembrane region 2; This domain covers the transmembrane of a small family of ABC transporters and shares sequence similarity with pfam00664. Mutations in this domain in human ABCD3 (PMP70) are believed responsible for Zellweger Syndrome-2; mutations in human ABCD1 (ALD) are responsible for recessive X-linked adrenoleukodystrophy. A Saccharomyces cerevisiae homolog is involved in the import of long-chain fatty acids. 
Q#22311 - CGI_10019759 superfamily 243072 124 189 5.42E-17 75.1126 cl02529 ANK superfamily C - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#22312 - CGI_10019760 superfamily 242205 34 148 5.98E-39 130.887 cl00937 Ribosomal_L21e superfamily N - Ribosomal protein L21e; Ribosomal protein L21e. Q#22315 - CGI_10019763 superfamily 241874 36 429 1.45E-88 280.56 cl00456 SLC5-6-like_sbd superfamily - - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." 
Q#22317 - CGI_10019765 superfamily 241984 148 398 2.97E-56 186.693 cl00615 Membrane-FADS-like superfamily - - "The membrane fatty acid desaturase (Membrane_FADS)-like CD includes membrane FADSs, alkane hydroxylases, beta carotene ketolases (CrtW-like), hydroxylases (CrtR-like), and other related proteins. They are present in all groups of organisms with the exception of archaea. Membrane FADSs are non-heme, iron-containing, oxygen-dependent enzymes involved in regioselective introduction of double bonds in fatty acyl aliphatic chains. They play an important role in the maintenance of the proper structure and functioning of biological membranes. Alkane hydroxylases are bacterial, integral-membrane di-iron enzymes that share a requirement for iron and oxygen for activity similar to that of membrane FADSs, and are involved in the initial oxidation of inactivated alkanes. Beta-carotene ketolase and beta-carotene hydroxylase are carotenoid biosynthetic enzymes for astaxanthin and zeaxanthin, respectively. This superfamily domain has extensive hydrophobic regions that would be capable of spanning the membrane bilayer at least twice. Comparison of these sequences also reveals three regions of conserved histidine cluster motifs that contain eight histidine residues: HXXX(X)H, HXX(X)HH, and HXXHH (an additional conserved histidine residue is seen between clusters 2 and 3). Spectroscopic and genetic evidence point to a nitrogen-rich coordination environment located in the cytoplasm with as many as eight histidines coordinating the two iron ions and a carboxylate residue bridging the two metals in the Pseudomonas oleovorans alkane hydroxylase (AlkB). In addition, the eight histidine residues are reported to be catalytically essential and proposed to be the ligands for the iron atoms contained within the rat stearoyl CoA delta-9 desaturase." 
Q#22317 - CGI_10019765 superfamily 242849 11 84 4.06E-23 92.2668 cl02041 Cyt-b5 superfamily - - Cytochrome b5-like Heme/Steroid binding domain; This family includes heme binding domains from a diverse range of proteins. This family also includes proteins that bind to steroids. The family includes progesterone receptors. Many members of this subfamily are membrane anchored by an N-terminal transmembrane alpha helix. This family also includes a domain in some chitin synthases. There is no known ligand for this domain in the chitin synthases. Q#22320 - CGI_10019769 superfamily 247723 596 689 5.79E-33 123.482 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#22320 - CGI_10019769 superfamily 247723 243 318 1.99E-28 110.038 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. 
This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#22320 - CGI_10019769 superfamily 247723 446 526 7.95E-27 105.371 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." 
Q#22320 - CGI_10019769 superfamily 247723 16 89 1.94E-26 104.236 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#22320 - CGI_10019769 superfamily 247723 148 217 0.000593476 38.7289 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. 
The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#22326 - CGI_10019775 superfamily 245226 29 116 0.00149432 37.3356 cl10012 DnaQ_like_exo superfamily N - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#22328 - CGI_10019777 superfamily 216939 60 124 3.17E-06 41.4945 cl03492 PC4 superfamily - - Transcriptional Coactivator p15 (PC4); p15 has a bipartite structure composed of an amino-terminal regulatory domain and a carboxy-terminal cryptic DNA-binding domain. 
The DNA-binding activity of the carboxy-terminal is disguised by the amino-terminal p15 domain. Activity is controlled by protein kinases that target the regulatory domain. Q#22331 - CGI_10019780 superfamily 241568 455 493 0.000124918 40.9092 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#22331 - CGI_10019780 superfamily 241568 391 435 0.00109908 38.2128 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#22331 - CGI_10019780 superfamily 214531 731 774 4.01E-07 47.9817 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. 
Q#22331 - CGI_10019780 superfamily 214531 774 814 1.21E-05 43.7445 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#22331 - CGI_10019780 superfamily 214531 238 278 1.91E-05 42.9741 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#22332 - CGI_10019781 superfamily 214531 13 56 3.28E-08 46.0557 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#22332 - CGI_10019781 superfamily 214531 58 99 1.47E-07 44.1297 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#22333 - CGI_10010471 superfamily 242203 1 252 2.94E-62 200.989 cl00935 Brix superfamily - - Brix domain; Brix domain. Q#22336 - CGI_10010474 superfamily 245202 8 53 2.61E-23 84.6091 cl09927 S1_like superfamily C - "S1_like: Ribosomal protein S1-like RNA-binding domain. Found in a wide variety of RNA-associated proteins. Originally identified in S1 ribosomal protein. This superfamily also contains the Cold Shock Domain (CSD), which is a homolog of the S1 domain. Both domains are members of the Oligonucleotide/oligosaccharide Binding (OB) fold." 
Q#22340 - CGI_10010478 superfamily 245856 2 184 9.84E-71 217.958 cl12060 AP2Ec superfamily N - "AP endonuclease family 2; These endonucleases play a role in DNA repair. Cleave phosphodiester bonds at apurinic or apyrimidinic sites; the alignment also contains hexulose-6-phosphate isomerases, enzymes that catalyze the epimerization of D-arabino-6-hexulose 3-phosphate to D-fructose 6-phosphate, via cleaving the phosphoesterbond with the sugar." Q#22341 - CGI_10001323 superfamily 241578 55 200 1.65E-31 120.474 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#22341 - CGI_10001323 superfamily 241578 458 609 8.17E-23 95.8214 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. 
The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#22341 - CGI_10001323 superfamily 245213 417 450 5.54E-12 61.4986 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#22341 - CGI_10001323 superfamily 241578 235 392 2.23E-24 100.38 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#22342 - CGI_10001324 superfamily 247947 51 85 0.0001726 34.6677 cl17393 HTH_Hin_like superfamily - - "Helix-turn-helix domain of Hin and related proteins, a family of DNA-binding domains unique to bacteria and represented by the Hin protein of Salmonella. The basic HTH domain is a simple fold comprised of three core helices that form a right-handed helical bundle. The principal DNA-protein interface is formed by the third helix, the recognition helix, inserting itself into the major groove of the DNA. A diverse array of HTH domains participate in a variety of functions that depend on their DNA-binding properties. HTH_Hin represents one of the simplest versions of the HTH domains; the characterization of homologous relationships between various sequence-diverse HTH domain families remains difficult. The Hin recombinase induces the site-specific inversion of a chromosomal DNA segment containing a promoter, which controls the alternate expression of two genes by reversibly switching orientation. The Hin recombinase consists of a single polypeptide chain containing a DNA-binding domain (HTH_Hin) and a catalytic domain." Q#22343 - CGI_10001516 superfamily 207654 228 273 8.37E-15 67.0826 cl02574 Annexin superfamily C - Annexin; This family of annexins also includes giardin that has been shown to function as an annexin. 
Q#22345 - CGI_10016952 superfamily 243119 289 333 0.00498261 34.3317 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#22347 - CGI_10016954 superfamily 245201 2 136 4.65E-31 120.804 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#22348 - CGI_10016955 superfamily 245201 43 240 9.83E-46 161.636 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#22349 - CGI_10016956 superfamily 241748 213 521 7.10E-177 502.936 cl00279 APP_MetAP superfamily - - "A family including aminopeptidase P, aminopeptidase M, and prolidase. 
Also known as metallopeptidase family M24. This family of enzymes is able to cleave amido-, imido- and amidino-containing bonds. Members exhibit relatively narrow substrate specificity compared to other metallo-aminopeptidases, suggesting they play roles in regulation of biological processes rather than general protein degradation." Q#22350 - CGI_10016957 superfamily 246748 52 383 1.55E-126 374.586 cl14876 Zinc_peptidase_like superfamily - - "Zinc peptidases M18, M20, M28, and M42; Zinc peptidases play vital roles in metabolic and signaling pathways throughout all kingdoms of life. This family corresponds to several clans in the MEROPS database, including the MH clan, which contains 4 families (M18, M20, M28, M42). The peptidase M20 family includes carboxypeptidases such as the glutamate carboxypeptidase from Pseudomonas, the thermostable carboxypeptidase Ss1 of broad specificity from archaea and yeast Gly-X carboxypeptidase. The dipeptidases include bacterial dipeptidase, peptidase V (PepV), a eukaryotic, non-specific dipeptidase, and two Xaa-His dipeptidases (carnosinases). There is also the bacterial aminopeptidase, peptidase T (PepT) that acts only on tripeptide substrates and has therefore been termed a tripeptidase. Peptidase family M28 contains aminopeptidases and carboxypeptidases, and has co-catalytic zinc ions. However, several enzymes in this family utilize other first row transition metal ions such as cobalt and manganese. Each zinc ion is tetrahedrally co-ordinated, with three amino acid ligands plus activated water; one aspartate residue binds both metal ions. The aminopeptidases in this family are also called bacterial leucyl aminopeptidases, but are able to release a variety of N-terminal amino acids. IAP aminopeptidase and aminopeptidase Y preferentially release basic amino acids while glutamate carboxypeptidase II preferentially releases C-terminal glutamates. Glutamate carboxypeptidase II and plasma glutamate carboxypeptidase hydrolyze dipeptides. 
Peptidase families M18 and M42 contain metalloaminopeptidases. M18 is widely distributed in bacteria and eukaryotes. However, only yeast aminopeptidase I and mammalian aspartyl aminopeptidase have been characterized in detail. Some of M42 (also known as glutamyl aminopeptidase) enzymes exhibit aminopeptidase specificity while others also have acylaminoacylpeptidase activity (i.e. hydrolysis of acylated N-terminal residues)." Q#22351 - CGI_10016958 superfamily 246748 1 56 2.63E-25 96.0863 cl14876 Zinc_peptidase_like superfamily N - "Zinc peptidases M18, M20, M28, and M42; Zinc peptidases play vital roles in metabolic and signaling pathways throughout all kingdoms of life. This family corresponds to several clans in the MEROPS database, including the MH clan, which contains 4 families (M18, M20, M28, M42). The peptidase M20 family includes carboxypeptidases such as the glutamate carboxypeptidase from Pseudomonas, the thermostable carboxypeptidase Ss1 of broad specificity from archaea and yeast Gly-X carboxypeptidase. The dipeptidases include bacterial dipeptidase, peptidase V (PepV), a eukaryotic, non-specific dipeptidase, and two Xaa-His dipeptidases (carnosinases). There is also the bacterial aminopeptidase, peptidase T (PepT) that acts only on tripeptide substrates and has therefore been termed a tripeptidase. Peptidase family M28 contains aminopeptidases and carboxypeptidases, and has co-catalytic zinc ions. However, several enzymes in this family utilize other first row transition metal ions such as cobalt and manganese. Each zinc ion is tetrahedrally co-ordinated, with three amino acid ligands plus activated water; one aspartate residue binds both metal ions. The aminopeptidases in this family are also called bacterial leucyl aminopeptidases, but are able to release a variety of N-terminal amino acids. 
IAP aminopeptidase and aminopeptidase Y preferentially release basic amino acids while glutamate carboxypeptidase II preferentially releases C-terminal glutamates. Glutamate carboxypeptidase II and plasma glutamate carboxypeptidase hydrolyze dipeptides. Peptidase families M18 and M42 contain metalloaminopeptidases. M18 is widely distributed in bacteria and eukaryotes. However, only yeast aminopeptidase I and mammalian aspartyl aminopeptidase have been characterized in detail. Some of M42 (also known as glutamyl aminopeptidase) enzymes exhibit aminopeptidase specificity while others also have acylaminoacylpeptidase activity (i.e. hydrolysis of acylated N-terminal residues)." Q#22352 - CGI_10016959 superfamily 241743 22 140 3.92E-35 121.647 cl00274 ML superfamily - - "The ML (MD-2-related lipid-recognition) domain is present in MD-1, MD-2, GM2 activator protein, Niemann-Pick type C2 (Npc2) protein, phosphatidylinositol/phosphatidylglycerol transfer protein (PG/PI-TP), mite allergen Der p 2 and several proteins of unknown function in plants, animals and fungi. These single-domain proteins form two anti-parallel beta-pleated sheets stabilized by three disulfide bonds and with an accessible central hydrophobic cavity, and are predicted to mediate diverse biological functions through interaction with specific lipids." Q#22353 - CGI_10016960 superfamily 219898 1 54 7.96E-09 51.0712 cl18533 DUF1751 superfamily N - "Eukaryotic integral membrane protein (DUF1751); This domain is found in eukaryotic integral membrane proteins. YOL107W, a Saccharomyces cerevisiae protein, has been shown to localise COP II vesicles." Q#22355 - CGI_10016962 superfamily 241832 3 90 3.87E-49 158.286 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. 
They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#22355 - CGI_10016962 superfamily 243175 104 227 1.33E-45 150.549 cl02776 GST_C_family superfamily - - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. 
Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." Q#22357 - CGI_10016964 superfamily 241584 209 299 2.09E-18 81.7739 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#22357 - CGI_10016964 superfamily 241584 304 388 2.09E-08 52.4987 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#22357 - CGI_10016964 superfamily 241584 105 204 1.66E-07 49.8023 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. 
Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#22357 - CGI_10016964 superfamily 241584 403 494 2.56E-06 46.3355 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#22358 - CGI_10016965 superfamily 241584 247 331 3.17E-10 57.5063 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#22358 - CGI_10016965 superfamily 241584 172 242 1.26E-08 52.8839 cl00065 FN3 superfamily N - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. 
Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#22358 - CGI_10016965 superfamily 241584 73 166 1.38E-07 49.8023 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#22358 - CGI_10016965 superfamily 241584 346 437 3.68E-06 45.5651 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#22360 - CGI_10016967 superfamily 241607 503 538 1.50E-15 72.3369 cl00097 KAZAL_FS superfamily C - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chymotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. 
The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue-specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as BM-40 or osteonectin, the Gallus gallus Flik protein, as well as agrin, which has a long array of FS domains. The Kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters), which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." 
Q#22364 - CGI_10016971 superfamily 243092 19 320 3.95E-57 191.394 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#22365 - CGI_10016972 superfamily 245201 202 406 3.85E-42 150.465 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." 
Q#22365 - CGI_10016972 superfamily 216276 7 86 4.02E-18 79.5143 cl15639 Activin_recp superfamily - - "Activin types I and II receptor domain; This Pfam entry consists of both TGF-beta receptor types. This is an alignment of the hydrophilic cysteine-rich ligand-binding domains. Both receptor types (type I and II) possess a 9 amino acid cysteine box, with the consensus CCX{4-5}CN. The type I receptors also possess 7 extracellular residues preceding the cysteine box." Q#22365 - CGI_10016972 superfamily 243113 168 194 2.67E-09 53.2682 cl02621 TGF_beta_GS superfamily - - Transforming growth factor beta type I GS-motif; This motif is found in the transforming growth factor beta (TGF-beta) type I which regulates cell growth and differentiation. The name of the GS motif comes from its highly conserved GSGSGLP signature in the cytoplasmic juxtamembrane region immediately preceding the protein's kinase domain. Point mutations in the GS motif modify the signaling ability of the type I receptor. Q#22370 - CGI_10009208 superfamily 241734 37 449 0 539.003 cl00261 PLPDE_III superfamily - - "Type III Pyridoxal 5-phosphate (PLP)-Dependent Enzymes; The fold type III PLP-dependent enzyme family is predominantly composed of two-domain proteins with similarity to bacterial alanine racemases (AR) including eukaryotic ornithine decarboxylases (ODC), prokaryotic diaminopimelate decarboxylases (DapDC), biosynthetic arginine decarboxylases (ADC), carboxynorspermidine decarboxylases (CANSDC), and similar proteins. AR-like proteins contain an N-terminal PLP-binding TIM-barrel domain and a C-terminal beta-sandwich domain. They exist as homodimers with active sites that lie at the interface between the TIM barrel domain of one subunit and the beta-sandwich domain of the other subunit. These proteins play important roles in the biosynthesis of amino acids and polyamine. 
The family also includes the single-domain YBL036c-like proteins, which contain a single PLP-binding TIM-barrel domain without any N- or C-terminal extensions. Due to the lack of a second domain, these proteins may possess only limited D- to L-alanine racemase activity or non-specific racemase activity." Q#22373 - CGI_10009211 superfamily 245456 126 395 0 570.657 cl10970 AP_MHD_Cterm superfamily - - "C-terminal domain of adaptor protein (AP) complexes medium mu subunits and its homologs (MHD); This family corresponds to the C-terminal domain of heterotetrameric AP complexes medium mu subunits and its homologs existing in monomeric stonins, delta-subunit of the heteroheptameric coat protein I (delta-COPI), a protein encoded by a pro-death gene referred as MuD (also known as MUDENG, mu-2 related death-inducing gene), an endocytic adaptor syp1, the mammalian FCH domain only proteins (FCHo1/2), SH3-containing GRB2-like protein 3-interacting protein 1 (SGIP1), and related proteins. AP complexes participate in the formation of intracellular coated transport vesicles and select cargo molecules for incorporation into the coated vesicles in the late secretory and endocytic pathways. Stonins have been characterized as clathrin-dependent AP-2 mu chain related factors and may act as cargo-specific sorting adaptors in endocytosis. Coat protein complex I (COPI)-coated vesicles function in the early secretory pathway. They mediate the retrograde transport from the Golgi to the ER, and intra-Golgi transport. MuD is distantly related to the C-terminal domain of mu2 subunit of AP-2. It is able to induce cell death by itself and plays an important role in cell death in various tissues. Syp1 represents a novel type of endocytic adaptor protein that participates in endocytosis, promotes vesicle tabulation, and contributes to cell polarity and stress responses. 
It shares the same domain architecture with its two ubiquitously expressed mammalian counterparts, FCHo1/2, which represent key initial proteins ultimately controlling cellular nutrient uptake, receptor regulation, and synaptic vesicle retrieval. They bind specifically to the plasma membrane and recruit the scaffold proteins eps15 and intersectin, which subsequently engage the adaptor complex AP2 and clathrin, leading to coated vesicle formation. Another mammalian neuronal-specific protein SGIP1 does have a C-terminal MHD and has been classified into this family as well. It is an endophilin-interacting protein that plays an obligatory role in the regulation of energy homeostasis. It is also involved in clathrin-mediated endocytosis by interacting with phospholipids and eps15." Q#22373 - CGI_10009211 superfamily 242876 4 111 6.54E-06 44.2673 cl02092 Clat_adaptor_s superfamily N - Clathrin adaptor complex small chain; Clathrin adaptor complex small chain. Q#22374 - CGI_10009212 superfamily 241754 392 695 2.64E-44 165.101 cl00286 Motor_domain superfamily - - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. Q#22374 - CGI_10009212 superfamily 203134 363 394 0.00167551 38.0465 cl04866 CHORD superfamily N - "CHORD; CHORD represents a Zn binding domain. Silencing of the C. elegans CHORD-containing gene results in semisterility and embryo lethality, suggesting an essential function of the wild-type gene in nematode development." Q#22375 - CGI_10009214 superfamily 241883 130 191 1.03E-06 45.518 cl00466 ATP-synt_C superfamily - - ATP synthase subunit C; ATP synthase subunit C. Q#22375 - CGI_10009214 superfamily 241883 264 325 1.03E-06 45.518 cl00466 ATP-synt_C superfamily - - ATP synthase subunit C; ATP synthase subunit C. Q#22375 - CGI_10009214 superfamily 241883 43 92 0.00415562 34.7324 cl00466 ATP-synt_C superfamily C - ATP synthase subunit C; ATP synthase subunit C. 
Q#22376 - CGI_10009215 superfamily 243082 1403 1773 4.27E-125 399.324 cl02553 Peptidase_C19 superfamily - - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." Q#22376 - CGI_10009215 superfamily 241643 5 41 2.21E-07 50.1503 cl00153 UBA superfamily - - "Ubiquitin Associated domain. The UBA domain is a commonly occurring sequence motif in some members of the ubiquitination pathway, UV excision repair proteins, and certain protein kinases. Although its specific role is so far unknown, it has been suggested that UBA domains are involved in conferring protein target specificity. The domain, a compact three helix bundle, has a conserved GFP-loop and the proline is thought to be critical for binding. The UBA domain is distinct from the conserved three helical domain seen in the N-terminus of EF-TS and eukaryotic NAC proteins." Q#22377 - CGI_10009216 superfamily 245201 26 318 0 550.997 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. 
These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#22380 - CGI_10003919 superfamily 243034 29 93 0.00248949 36.5892 cl02429 TPR superfamily N - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#22381 - CGI_10003920 superfamily 241587 3 57 1.10E-10 51.521 cl00069 GGL superfamily - - "G protein gamma subunit-like motifs, the alpha-helical G-gamma chain dimerizes with the G-beta propeller subunit as part of the heterotrimeric G-protein complex; involved in signal transduction via G-protein-coupled receptors" Q#22382 - CGI_10003921 superfamily 241583 208 408 4.35E-56 193.994 cl00064 ZnMc superfamily - - "Zinc-dependent metalloprotease. 
This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#22382 - CGI_10003921 superfamily 216572 76 177 2.80E-12 65.3738 cl03265 Pep_M12B_propep superfamily - - Reprolysin family propeptide; This region is the propeptide for members of peptidase family M12B. The propeptide contains a sequence motif similar to the "cysteine switch" of the matrixins. This motif is found at the C terminus of the alignment but is not well aligned. Q#22382 - CGI_10003921 superfamily 246918 978 1030 3.66E-07 48.7371 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#22382 - CGI_10003921 superfamily 246918 517 554 0.000340432 39.8775 cl15278 TSP_1 superfamily N - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#22383 - CGI_10003922 superfamily 248312 127 284 1.92E-09 54.6669 cl17758 PMP22_Claudin superfamily - - PMP-22/EMP/MP20/Claudin family; PMP-22/EMP/MP20/Claudin family. Q#22384 - CGI_10003923 superfamily 207922 2 69 8.02E-19 73.4345 cl03352 Ribosomal_L38e superfamily - - Ribosomal L38e protein family; Ribosomal L38e protein family. Q#22385 - CGI_10002695 superfamily 241764 109 187 1.35E-26 101.585 cl00299 MIT superfamily - - "MIT: domain contained within Microtubule Interacting and Trafficking molecules. The MIT domain is found in sorting nexins, the nuclear thiol protease PalBH, the AAA protein spastin and archaebacterial proteins with similar domain architecture, vacuolar sorting proteins and others. The molecular function of the MIT domain is unclear." 
Q#22385 - CGI_10002695 superfamily 247743 269 330 3.06E-10 57.5411 cl17189 AAA superfamily C - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#22385 - CGI_10002695 superfamily 247743 322 357 2.66E-05 42.2017 cl17189 AAA superfamily N - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#22386 - CGI_10002696 superfamily 243066 22 123 1.76E-20 85.7469 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. 
The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#22386 - CGI_10002696 superfamily 198867 133 240 6.60E-12 61.5884 cl06652 BACK superfamily - - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation)." Q#22387 - CGI_10002697 superfamily 243092 2 277 5.01E-20 86.2348 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom 
surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#22388 - CGI_10002698 superfamily 241550 82 366 2.62E-95 289.869 cl00015 nt_trans superfamily - - "nucleotidyl transferase superfamily; nt_trans (nucleotidyl transferase) This superfamily includes the class I amino-acyl tRNA synthetases, pantothenate synthetase (PanC), ATP sulfurylase, and the cytidylyltransferases, all of which have a conserved dinucleotide-binding domain." Q#22389 - CGI_10002699 superfamily 246925 15 166 1.25E-09 57.3654 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#22390 - CGI_10002700 superfamily 220534 1 222 3.46E-78 237.699 cl10820 RLL superfamily - - Putative carnitine deficiency-associated protein; This family of proteins conserved from nematodes to humans is of approximately 250 amino acids. It is purported to be carnitine deficiency-associated protein but this could not be confirmed. It carries a characteristic RLL sequence-motif. The function is unknown. Q#22391 - CGI_10002701 superfamily 246669 45 109 2.16E-21 87.5968 cl14603 C2 superfamily N - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. 
Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#22391 - CGI_10002701 superfamily 246708 1 19 2.00E-05 42.1178 cl14781 PI-PLC-Y superfamily N - "Phosphatidylinositol-specific phospholipase C, Y domain; This associates with pfam00388 to form a single structural unit." Q#22392 - CGI_10016747 superfamily 247683 242 298 8.32E-21 86.971 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." 
Q#22392 - CGI_10016747 superfamily 245601 366 593 7.92E-14 69.2732 cl11399 HP superfamily - - "Histidine phosphatase domain found in a functionally diverse set of proteins, mostly phosphatases; contains a His residue which is phosphorylated during the reaction; Catalytic domain of a functionally diverse set of proteins, most of which are phosphatases. The conserved catalytic core of this domain contains a His residue which is phosphorylated in the reaction. This set of proteins includes cofactor-dependent and cofactor-independent phosphoglycerate mutases (dPGM, and BPGM respectively), fructose-2,6-bisphosphatase (F26BP)ase, Sts-1, SixA, histidine acid phosphatases, phytases, and related proteins. Functions include roles in metabolism, signaling, or regulation, for example F26BPase affects glycolysis and gluconeogenesis through controlling the concentration of F26BP; BPGM controls the concentration of 2,3-BPG (the main allosteric effector of hemoglobin in human blood cells); human Sts-1 is a T-cell regulator; Escherichia coli Six A participates in the ArcB-dependent His-to-Asp phosphorelay signaling system; phytases scavenge phosphate from extracellular sources. Deficiency and mutation in many of the human members result in disease, for example erythrocyte BPGM deficiency is a disease associated with a decrease in the concentration of 2,3-BPG. Clinical applications include the use of prostatic acid phosphatase (PAP) as a serum marker for prostate cancer. Agricultural applications include the addition of phytases to animal feed." Q#22392 - CGI_10016747 superfamily 241643 22 59 1.18E-05 43.2167 cl00153 UBA superfamily - - "Ubiquitin Associated domain. The UBA domain is a commonly occurring sequence motif in some members of the ubiquitination pathway, UV excision repair proteins, and certain protein kinases. Although its specific role is so far unknown, it has been suggested that UBA domains are involved in conferring protein target specificity. 
The domain, a compact three helix bundle, has a conserved GFP-loop and the proline is thought to be critical for binding. The UBA domain is distinct from the conserved three helical domain seen in the N-terminus of EF-TS and eukaryotic NAC proteins." Q#22393 - CGI_10016748 superfamily 209411 250 369 1.14E-10 60.0397 cl12008 FANCE_c-term superfamily N - "Fanconi anemia complementation group E protein, C-terminal domain; Fanconi Anemia (FA) is an autosomal recessive disorder associated with increased susceptibility to various cancers, bone marrow failure, cardiac, renal, and limb malformations, and other characteristics. Cells are highly sensitive to DNA damaging agents. A multi-subunit protein complex, the FA core complex, is responsible for ubiquitination of the protein FANCD2 in response to DNA damage. This monoubiquitination results in a downstream effect on homology-directed DNA repair. FANCE is part of the FA core complex and its C-terminal domain, which is modeled here, has been shown to directly interact with FANCD2. The domain contains a five-fold repeat of a structural unit similar to ARM and HEAT repeats. FANCE appears conserved in metazoa and in plants." Q#22395 - CGI_10016750 superfamily 241624 118 303 2.19E-14 70.0778 cl00120 PP2Cc superfamily N - "Serine/threonine phosphatases, family 2C, catalytic domain; The protein architecture and deduced catalytic mechanism of PP2C phosphatases are similar to the PP1, PP2A, PP2B family of protein Ser/Thr phosphatases, with which PP2C shares no sequence similarity." Q#22396 - CGI_10016751 superfamily 245087 9 272 3.69E-33 125.434 cl09515 PCNA superfamily - - "Proliferating Cell Nuclear Antigen (PCNA) domain found in eukaryotes and archaea. These polymerase processivity factors play a role in DNA replication and repair. PCNA encircles duplex DNA in its central cavity, providing a DNA-bound platform for the attachment of the polymerase. 
The trimeric PCNA ring is structurally similar to the dimeric ring formed by the DNA polymerase processivity factors in bacteria (beta subunit DNA polymerase III holoenzyme) and in bacteriophages (catalytic subunits in T4 and RB69). This structural correspondence further substantiates the mechanistic connection between eukaryotic and prokaryotic DNA replication that has been suggested on biochemical grounds. PCNA is also involved with proteins involved in cell cycle processes such as DNA repair and apoptosis. Many of these proteins contain a highly conserved motif known as the PIP-box (PCNA interacting protein box) which contains the sequence Qxx[LIM]xxF[FY]." Q#22397 - CGI_10016752 superfamily 246597 9 186 2.51E-125 353.437 cl13995 MPP_superfamily superfamily - - "metallophosphatase superfamily, metallophosphatase domain; Metallophosphatases (MPPs), also known as metallophosphoesterases, phosphodiesterases (PDEs), binuclear metallophosphoesterases, and dimetal-containing phosphoesterases (DMPs), represent a diverse superfamily of enzymes with a conserved domain containing an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. This superfamily includes: the phosphoprotein phosphatases (PPPs), Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination." Q#22399 - CGI_10016754 superfamily 243091 66 191 9.27E-27 100.101 cl02566 SET superfamily - - "SET domain; SET domains are protein lysine methyltransferase enzymes. SET domains appear to be protein-protein interaction domains. 
It has been demonstrated that SET domains mediate interactions with a family of proteins that display similarity with dual-specificity phosphatases (dsPTPases). A subset of SET domains have been called PR domains. These domains are divergent in sequence from other SET domains, but also appear to mediate protein-protein interaction. The SET domain consists of two regions known as SET-N and SET-C. SET-C forms an unusual and conserved knot-like structure of probably functional importance. Additionally to SET-N and SET-C, an insert region (SET-I) and flanking regions of high structural variability form part of the overall structure." Q#22400 - CGI_10016755 superfamily 247805 509 662 5.38E-05 43.48 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#22400 - CGI_10016755 superfamily 222427 1055 1330 3.33E-119 376.14 cl18674 Helicase_C_4 superfamily - - "Helicase_C-like; Strawberry notch proteins carry DExD/H-box groups and Helicase_C domains. These proteins promote the expression of diverse targets, potentially through interactions with transcriptional activator or repressor complexes." Q#22401 - CGI_10016756 superfamily 247057 122 183 1.04E-21 88.5051 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. 
They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#22401 - CGI_10016756 superfamily 248012 224 325 1.92E-17 77.618 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#22401 - CGI_10016756 superfamily 247057 53 121 9.93E-17 74.646 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#22402 - CGI_10016757 superfamily 243058 332 436 1.84E-06 46.1535 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. 
ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#22405 - CGI_10016760 superfamily 247755 1148 1368 1.03E-121 379.53 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#22405 - CGI_10016760 superfamily 247755 525 726 5.27E-105 332.512 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#22405 - CGI_10016760 superfamily 216049 824 1101 4.52E-15 76.1706 cl18356 ABC_membrane superfamily - - ABC transporter transmembrane region; This family represents a unit of six transmembrane helices. Many members of the ABC transporter family (pfam00005) have two such regions. 
Q#22407 - CGI_10016762 superfamily 247755 1130 1350 2.58E-122 381.071 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#22407 - CGI_10016762 superfamily 247755 507 708 1.22E-105 334.438 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#22407 - CGI_10016762 superfamily 216049 212 463 9.32E-17 81.1782 cl18356 ABC_membrane superfamily - - ABC transporter transmembrane region; This family represents a unit of six transmembrane helices. Many members of the ABC transporter family (pfam00005) have two such regions. Q#22407 - CGI_10016762 superfamily 216049 806 1083 2.72E-15 76.5558 cl18356 ABC_membrane superfamily - - ABC transporter transmembrane region; This family represents a unit of six transmembrane helices. Many members of the ABC transporter family (pfam00005) have two such regions. 
Q#22408 - CGI_10016763 superfamily 243058 25 137 0.000200212 39.9904 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#22408 - CGI_10016763 superfamily 247804 536 569 0.00361142 36.0059 cl17250 SANT superfamily C - "'SWI3, ADA2, N-CoR and TFIIIB' DNA-binding domains. Tandem copies of the domain bind telomeric DNA tandem repeats as part of the capping complex. Binding is sequence dependent for repeats which contain the G/C rich motif [C2-3 A (CA)1-6]. The domain is also found in regulatory transcriptional repressor complexes where it also binds DNA." Q#22409 - CGI_10016764 superfamily 245210 30 408 8.04E-115 371.891 cl09938 cond_enzymes superfamily - - "Condensing enzymes; Family of enzymes that catalyze a (decarboxylating or non-decarboxylating) Claisen-like condensation reaction. Members share strong structural similarity, and are involved in the synthesis and degradation of fatty acids, and the production of polyketides, a diverse group of natural products." 
Q#22409 - CGI_10016764 superfamily 247637 1139 1248 0.000119572 44.8674 cl16912 MDR superfamily NC - "Medium chain reductase/dehydrogenase (MDR)/zinc-dependent alcohol dehydrogenase-like family; The medium chain reductase/dehydrogenases (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH. MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P) binding-Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES. The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH) , quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones. ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines. Other MDR members have only a catalytic zinc, and some contain no coordinated zinc." 
Q#22410 - CGI_10016765 superfamily 243092 107 371 3.32E-34 128.607 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#22411 - CGI_10016766 superfamily 203903 70 115 2.24E-13 65.2613 cl07067 cwf21 superfamily - - cwf21 domain; The cwf21 family is involved in mRNA splicing. It has been isolated as a subcomplex of the spliceosome in Schizosaccharomyces pombe. The function of the cwf21 domain is to bind directly to the spliceosomal protein Prp8. Mutations in the cwf21 domain prevent Prp8 from binding. The structure of this domain has recently been solved which shows this domain to be composed of two alpha helices. 
Q#22413 - CGI_10016768 superfamily 246679 4 129 8.47E-79 237.623 cl14632 Glo_EDI_BRP_like superfamily - - "This domain superfamily is found in a variety of structurally related metalloproteins, including the type I extradiol dioxygenases, glyoxalase I and a group of antibiotic resistance proteins; This domain superfamily is found in a variety of structurally related metalloproteins, including the type I extradiol dioxygenases, glyoxalase I and a group of antibiotic resistance proteins. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). Type I extradiol dioxygenases catalyze the incorporation of both atoms of molecular oxygen into aromatic substrates, which results in the cleavage of aromatic rings. They are key enzymes in the degradation of aromatic compounds. Type I extradiol dioxygenases include class I and class II enzymes. Class I and II enzymes show sequence similarity; the two-domain class II enzymes evolved from a class I enzyme through gene duplication. Glyoxylase I catalyzes the glutathione-dependent inactivation of toxic methylglyoxal, requiring zinc or nickel ions for activity. The antibiotic resistance proteins in this family use a variety of mechanisms to block the function of antibiotics. Bleomycin resistance protein (BLMA) sequesters bleomycin's activity by directly binding to it. Whereas, three types of fosfomycin resistance proteins employ different mechanisms to render fosfomycin inactive by modifying the fosfomycin molecule. Although the proteins in this superfamily are functionally distinct, their structures are similar. The difference among the three dimensional structures of the three types of proteins in this superfamily is interesting from an evolutionary perspective. Both glyoxalase I and BLMA show domain swapping between subunits. 
However, there is no domain swapping for type 1 extradiol dioxygenases." Q#22413 - CGI_10016768 superfamily 246679 139 251 3.99E-09 52.8054 cl14632 Glo_EDI_BRP_like superfamily - - "This domain superfamily is found in a variety of structurally related metalloproteins, including the type I extradiol dioxygenases, glyoxalase I and a group of antibiotic resistance proteins; This domain superfamily is found in a variety of structurally related metalloproteins, including the type I extradiol dioxygenases, glyoxalase I and a group of antibiotic resistance proteins. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). Type I extradiol dioxygenases catalyze the incorporation of both atoms of molecular oxygen into aromatic substrates, which results in the cleavage of aromatic rings. They are key enzymes in the degradation of aromatic compounds. Type I extradiol dioxygenases include class I and class II enzymes. Class I and II enzymes show sequence similarity; the two-domain class II enzymes evolved from a class I enzyme through gene duplication. Glyoxylase I catalyzes the glutathione-dependent inactivation of toxic methylglyoxal, requiring zinc or nickel ions for activity. The antibiotic resistance proteins in this family use a variety of mechanisms to block the function of antibiotics. Bleomycin resistance protein (BLMA) sequesters bleomycin's activity by directly binding to it. Whereas, three types of fosfomycin resistance proteins employ different mechanisms to render fosfomycin inactive by modifying the fosfomycin molecule. Although the proteins in this superfamily are functionally distinct, their structures are similar. The difference among the three dimensional structures of the three types of proteins in this superfamily is interesting from an evolutionary perspective. 
Both glyoxalase I and BLMA show domain swapping between subunits. However, there is no domain swapping for type 1 extradiol dioxygenases." Q#22414 - CGI_10016769 superfamily 150420 13 123 7.98E-12 62.0602 cl18042 Jnk-SapK_ap_N superfamily C - JNK_SAPK-associated protein-1; This is the N-terminal 200 residues of a set of proteins conserved from yeasts to humans. Most of the proteins in this entry have an RhoGEF pfam00621 domain at their C-terminal end. Q#22414 - CGI_10016769 superfamily 221118 326 388 9.40E-06 42.7096 cl12985 RILP superfamily - - Rab interacting lysosomal protein; RILP contains a domain which contains two coiled-coil regions and is found mainly in the cytosol. RILP is recruited onto late endosomal and lysosomal membranes by Rab7 and acts as a downstream effector of Rab7. This recruitment process is important for phagosome maturation and fusion with late endosomes and lysosomes. Q#22415 - CGI_10016770 superfamily 216301 1 155 7.79E-50 160.122 cl03099 EMP24_GP25L superfamily - - emp24/gp25L/p24 family/GOLD; Members of this family are implicated in bringing cargo forward from the ER and binding to coat proteins by their cytoplasmic domains. This domain corresponds closely to the beta-strand rich GOLD domain described in. The GOLD domain is always found combined with lipid- or membrane-association domains. Q#22416 - CGI_10016771 superfamily 247861 185 283 2.64E-06 45.5804 cl17307 SpoU_methylase superfamily C - SpoU rRNA Methylase family; This family of proteins probably use S-AdoMet. Q#22417 - CGI_10016772 superfamily 243066 2 78 1.64E-20 86.4529 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. 
The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#22417 - CGI_10016772 superfamily 219619 343 393 1.15E-11 60.6843 cl18518 Ion_trans_2 superfamily N - Ion channel; This family includes the two membrane helix type ion channels found in bacteria. Q#22422 - CGI_10016777 superfamily 152008 330 375 0.00730259 34.6649 cl13085 DUF3234 superfamily C - Protein of unknown function (DUF3234); This bacterial family of proteins has no known function. Some members in this family of proteins are annotated as TTHA0547 however this cannot be confirmed. 
Q#22424 - CGI_10016779 superfamily 247792 367 411 1.10E-12 63.6188 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#22425 - CGI_10016780 superfamily 214020 5 97 1.37E-13 64.7762 cl17165 SKA2 superfamily - - "Spindle and kinetochore-associated protein 2; SKA2, also called FAM33A, is a component of the SKA complex, which is formed by the association of three subunits (SKA1, SKA2, and SKA3). The SKA complex is essential for accurate cell division. It functions with the Ndc80 network to establish stable kinetochore-microtubule interactions, which are crucial for the highly orchestrated chromosome movements during mitosis. The biological unit is a W-shaped homodimer of the three-subunit complex. SKA2 has also been identified as a glucocorticoid receptor-interacting protein and may be involved in regulating cancer cell proliferation." Q#22425 - CGI_10016780 superfamily 151803 176 247 3.49E-06 43.6869 cl12898 DUF3161 superfamily - - Protein of unknown function (DUF3161); This eukaryotic family of proteins has no known function. Q#22426 - CGI_10016781 superfamily 243074 6 51 5.90E-08 49.0421 cl02535 F-box-like superfamily - - F-box-like; This is an F-box-like family. 
Q#22428 - CGI_10016783 superfamily 221180 57 130 2.17E-18 76.9843 cl13206 Vma12 superfamily C - "Endoplasmic reticulum-based factor for assembly of V-ATPase; The yeast vacuolar proton-translocating ATPase (V-ATPase) is the best characterized member of the V-ATPase family. A total of thirteen genes are required for encoding the subunits of the enzyme complex itself and an additional three for providing factors necessary for the assembly of the whole. Vma12 is one of these latter, all three of which are localised to the endoplasmic reticulum." Q#22429 - CGI_10016784 superfamily 220379 22 137 6.01E-23 95.5747 cl10734 DRY_EERY superfamily - - "Alternative splicing regulator; This entry represents the conserved N-terminal region of SWAP (suppressor-of-white-apricot protein) proteins. This region contains two highly conserved motifs, viz: DRY and EERY, which appear to be the sites for alternative splicing of exons 2 and 3 of the SWAP mRNA. These proteins are thus thought to be involved in auto-regulation of pre-mRNA splicing. Most family members are associated with two Surp domains pfam01805 and an Arginine- serine-rich binding region towards the C-terminus." Q#22429 - CGI_10016784 superfamily 243154 181 231 2.79E-15 71.8497 cl02715 Surp superfamily - - Surp module; This domain is also known as the SWAP domain. SWAP stands for Suppressor-of-White-APricot. It has been suggested that these domains may be RNA binding. Q#22429 - CGI_10016784 superfamily 243154 418 450 0.000168382 40.2633 cl02715 Surp superfamily C - Surp module; This domain is also known as the SWAP domain. SWAP stands for Suppressor-of-White-APricot. It has been suggested that these domains may be RNA binding. 
Q#22430 - CGI_10016785 superfamily 241623 158 439 6.50E-179 507.924 cl00119 PI3Kc_like superfamily - - "Phosphoinositide 3-kinase (PI3K)-like family, catalytic domain; The PI3K-like catalytic domain family is part of a larger superfamily that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and RIO kinases. Members of the family include PI3K, phosphoinositide 4-kinase (PI4K), PI3K-related protein kinases (PIKKs), and TRansformation/tRanscription domain-Associated Protein (TRRAP). PI3Ks catalyze the transfer of the gamma-phosphoryl group from ATP to the 3-hydroxyl of the inositol ring of D-myo-phosphatidylinositol (PtdIns) or its derivatives, while PI4K catalyze the phosphorylation of the 4-hydroxyl of PtdIns. PIKKs are protein kinases that catalyze the phosphorylation of serine/threonine residues, especially those that are followed by a glutamine. PI3Ks play an important role in a variety of fundamental cellular processes, including cell motility, the Ras pathway, vesicle trafficking and secretion, immune cell activation and apoptosis. PI4Ks produce PtdIns(4)P, the major precursor to important signaling phosphoinositides. PIKKs have diverse functions including cell-cycle checkpoints, genome surveillance, mRNA surveillance, and translation control." Q#22430 - CGI_10016785 superfamily 202180 505 535 1.25E-11 59.7848 cl03505 FATC superfamily - - "FATC domain; The FATC domain is named after FRAP, ATM, TRRAP C-terminal. The solution structure of the FATC domain suggests it plays a role in redox-dependent structural and cellular stability." Q#22438 - CGI_10023210 superfamily 150820 18 304 5.42E-63 206.994 cl10894 DuoxA superfamily - - "Dual oxidase maturation factor; DuoxA (Dual oxidase maturation factor) is the essential protein necessary for the final release of DUOX2 (an NADPH:O2 oxidoreductase flavoprotein) from the endoplasmic reticulum. 
Dual oxidases (DUOX1 and DUOX2) constitute the catalytic core of the hydrogen peroxide generator, which generates H2O2 at the apical membrane of thyroid follicular cells, essential for iodination of thyroglobulin by thyroid peroxidases. DuoxA carries five membrane-integral regions including a reverse signal-anchor with external N-terminus (type III) and two N-glycosylation sites. It is conserved from nematodes to humans." Q#22439 - CGI_10023211 superfamily 150820 18 303 7.18E-68 219.32 cl10894 DuoxA superfamily - - "Dual oxidase maturation factor; DuoxA (Dual oxidase maturation factor) is the essential protein necessary for the final release of DUOX2 (an NADPH:O2 oxidoreductase flavoprotein) from the endoplasmic reticulum. Dual oxidases (DUOX1 and DUOX2) constitute the catalytic core of the hydrogen peroxide generator, which generates H2O2 at the apical membrane of thyroid follicular cells, essential for iodination of thyroglobulin by thyroid peroxidases. DuoxA carries five membrane-integral regions including a reverse signal-anchor with external N-terminus (type III) and two N-glycosylation sites. It is conserved from nematodes to humans." Q#22440 - CGI_10023212 superfamily 243066 3 107 4.65E-11 59.9385 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. 
POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#22440 - CGI_10023212 superfamily 217312 256 380 7.43E-07 49.5213 cl03827 NPH3 superfamily N - "NPH3 family; Phototropism of Arabidopsis thaliana seedlings in response to a blue light source is initiated by nonphototropic hypocotyl 1 (NPH1), a light-activated serine-threonine protein kinase. Mutations in NPH3 disrupt early signaling occurring downstream of the NPH1 photoreceptor. The NPH3 gene encodes a NPH1-interacting protein. NPH3 is a member of a large protein family, apparently specific to higher plants, and may function as an adapter or scaffold protein to bring together the enzymatic components of a NPH1-activated phosphorelay." Q#22441 - CGI_10023213 superfamily 247723 548 613 3.40E-20 85.8016 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." 
Q#22441 - CGI_10023213 superfamily 241762 7 47 3.28E-14 68.6346 cl00297 R3H superfamily N - "R3H domain. The name of the R3H domain comes from the characteristic spacing of the most conserved arginine and histidine residues. R3H domains are found in proteins together with ATPase domains, SF1 helicase domains, SF2 DEAH helicase domains, Cys-rich repeats, ring-type zinc fingers, and KH domains. The function of the domain is predicted to bind ssDNA or ssRNA in a sequence-specific manner." Q#22442 - CGI_10023214 superfamily 248097 37 167 8.47E-22 85.7798 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#22443 - CGI_10023215 superfamily 248097 21 151 7.33E-22 85.7798 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#22444 - CGI_10023216 superfamily 248097 4 134 1.70E-18 76.1498 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#22445 - CGI_10023217 superfamily 248097 4 135 7.89E-19 76.9202 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#22447 - CGI_10023219 superfamily 217293 1 196 4.17E-46 158.565 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#22447 - CGI_10023219 superfamily 202474 203 286 6.11E-16 74.9977 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. 
Q#22448 - CGI_10023220 superfamily 217293 1 196 4.32E-39 139.305 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#22448 - CGI_10023220 superfamily 202474 203 286 5.15E-14 69.2197 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#22452 - CGI_10023224 superfamily 247683 72 125 0.00013035 37.8523 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#22452 - CGI_10023224 superfamily 205121 171 195 0.00124956 34.7872 cl18263 zf-met superfamily - - "Zinc-finger of C2H2 type; This is a zinc-finger domain with the CxxCx(12)Hx(6)H motif, found in multiple copies in a wide range of proteins from plants to metazoans. Some member proteins, particularly those from plants, are annotated as being RNA-binding." 
Q#22453 - CGI_10023225 superfamily 217545 46 352 7.02E-107 319.265 cl04056 Peptidase_C54 superfamily - - Peptidase family C54; Peptidase family C54. Q#22458 - CGI_10023230 superfamily 215754 69 151 4.23E-18 76.9084 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#22458 - CGI_10023230 superfamily 215754 178 252 1.98E-14 66.508 cl02813 Mito_carr superfamily N - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#22458 - CGI_10023230 superfamily 215754 1 64 5.63E-11 57.2632 cl02813 Mito_carr superfamily N - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#22459 - CGI_10023231 superfamily 241578 30 196 4.00E-39 141.211 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#22459 - CGI_10023231 superfamily 241578 258 405 2.05E-29 114.311 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). 
Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#22459 - CGI_10023231 superfamily 241578 446 601 2.09E-05 43.7559 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." 
Q#22461 - CGI_10023233 superfamily 241570 139 250 1.86E-22 93.1593 cl00047 CAP_ED superfamily - - "effector domain of the CAP family of transcription factors; members include CAP (or cAMP receptor protein (CRP)), which binds cAMP, FNR (fumarate and nitrate reduction), which uses an iron-sulfur cluster to sense oxygen) and CooA, a heme containing CO sensor. In all cases binding of the effector leads to conformational changes and the ability to activate transcription. Cyclic nucleotide-binding domain similar to CAP are also present in cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) and vertebrate cyclic nucleotide-gated ion-channels. Cyclic nucleotide-monophosphate binding domain; proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues; the best studied is the prokaryotic catabolite gene activator, CAP, where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure; three conserved glycine residues are thought to be essential for maintenance of the structural integrity of the beta-barrel; CooA is a homodimeric transcription factor that belongs to CAP family; cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain; cAPK's are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain; cGPK's are single chain enzymes that include the two copies of the domain in their N-terminal section; also found in vertebrate cyclic nucleotide-gated ion-channels" Q#22462 - CGI_10023234 superfamily 242183 26 69 0.00972993 32.2178 cl00907 Glutaminase superfamily N - Glutaminase; This family of enzymes deaminates glutamine to glutamate EC:3.5.1.2. Q#22464 - CGI_10023236 superfamily 217203 31 565 0 583.482 cl03678 CDC45 superfamily - - "CDC45-like protein; CDC45 is an essential gene required for initiation of DNA replication in S. 
cerevisiae, forming a complex with MCM5/CDC46. Homologues of CDC45 have been identified in human, mouse and smut fungus among others." Q#22465 - CGI_10023237 superfamily 248458 191 357 0.000667393 41.1453 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." 
Q#22465 - CGI_10023237 superfamily 227469 392 632 6.02E-70 233.295 cl12196 UFD1 superfamily C - "Ubiquitin fusion-degradation protein [Posttranslational modification, protein turnover, chaperones]" Q#22465 - CGI_10023237 superfamily 199528 12 130 2.48E-10 61.8764 cl15392 PRK10429 superfamily C - melibiose:sodium symporter; Provisional Q#22466 - CGI_10023238 superfamily 241581 45 142 3.48E-12 62.789 cl00062 FHA superfamily - - "Forkhead associated domain (FHA); found in eukaryotic and prokaryotic proteins. Putative nuclear signalling domain. FHA domains may bind phosphothreonine, phosphoserine and sometimes phosphotyrosine. In eukaryotes, many FHA domain-containing proteins localize to the nucleus, where they participate in establishing or maintaining cell cycle checkpoints, DNA repair, or transcriptional regulation. Members of the FHA family include: Dun1, Rad53, Cds1, Mek1, KAPP(kinase-associated protein phosphatase),and Ki-67 (a human nuclear protein related to cell proliferation)." Q#22466 - CGI_10023238 superfamily 245201 169 440 4.00E-79 250.926 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#22467 - CGI_10023239 superfamily 241567 147 379 3.80E-71 232.491 cl00042 CASc superfamily - - "Caspase, interleukin-1 beta converting enzyme (ICE) homologues; Cysteine-dependent aspartate-directed proteases that mediate programmed cell death (apoptosis). Caspases are synthesized as inactive zymogens and activated by proteolysis of the peptide backbone adjacent to an aspartate. 
The resulting two subunits associate to form an (alpha)2(beta)2-tetramer which is the active enzyme. Activation of caspases can be mediated by other caspase homologs." Q#22467 - CGI_10023239 superfamily 246680 11 97 6.18E-08 50.6632 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#22467 - CGI_10023239 superfamily 246680 394 478 1.00E-07 49.8928 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." 
Q#22467 - CGI_10023239 superfamily 241567 529 660 4.32E-47 166.645 cl00042 CASc superfamily C - "Caspase, interleukin-1 beta converting enzyme (ICE) homologues; Cysteine-dependent aspartate-directed proteases that mediate programmed cell death (apoptosis). Caspases are synthesized as inactive zymogens and activated by proteolysis of the peptide backbone adjacent to an aspartate. The resulting two subunits associate to form an (alpha)2(beta)2-tetramer which is the active enzyme. Activation of caspases can be mediated by other caspase homologs." Q#22468 - CGI_10023240 superfamily 241567 14 129 3.61E-23 91.1227 cl00042 CASc superfamily N - "Caspase, interleukin-1 beta converting enzyme (ICE) homologues; Cysteine-dependent aspartate-directed proteases that mediate programmed cell death (apoptosis). Caspases are synthesized as inactive zymogens and activated by proteolysis of the peptide backbone adjacent to an aspartate. The resulting two subunits associate to form an (alpha)2(beta)2-tetramer which is the active enzyme. Activation of caspases can be mediated by other caspase homologs." Q#22469 - CGI_10023241 superfamily 242978 943 1352 1.59E-45 175.476 cl02310 Glyco_hydro_81 superfamily N - "Glycosyl hydrolase family 81; Family of eukaryotic beta-1,3-glucanases. Within the Aspergillus fumigatus protein two perfectly conserved Glu residues (E550 or E554) have been proposed as putative nucleophiles of the active site of the Engl1 endoglucanase, while the proton donor would be D475. The endo-beta-1,3-glucanase activity is essential for efficient spore release." 
Q#22471 - CGI_10023243 superfamily 245596 100 358 2.87E-20 88.3003 cl11394 Glyco_tranf_GTA_type superfamily - - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#22472 - CGI_10023244 superfamily 204041 29 178 3.10E-30 110.364 cl07367 GLTP superfamily - - Glycolipid transfer protein (GLTP); GLTP is a cytosolic protein that catalyzes the intermembrane transfer of glycolipids. Q#22473 - CGI_10023245 superfamily 204041 30 172 1.42E-26 100.349 cl07367 GLTP superfamily - - Glycolipid transfer protein (GLTP); GLTP is a cytosolic protein that catalyzes the intermembrane transfer of glycolipids. Q#22474 - CGI_10023246 superfamily 241600 137 346 2.83E-81 249.08 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. 
The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#22475 - CGI_10023247 superfamily 243264 12 399 5.80E-165 471.775 cl02993 P2X_receptor superfamily - - ATP P2X receptor; ATP P2X receptor. Q#22476 - CGI_10023248 superfamily 243066 7 90 1.09E-16 75.2821 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#22476 - CGI_10023248 superfamily 219619 339 402 1.95E-10 56.8323 cl18518 Ion_trans_2 superfamily - - Ion channel; This family includes the two membrane helix type ion channels found in bacteria. 
Q#22478 - CGI_10023250 superfamily 241578 119 199 3.58E-07 48.331 cl00057 vWFA superfamily C - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#22479 - CGI_10023251 superfamily 247727 56 148 3.88E-06 42.8023 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." 
Q#22480 - CGI_10023252 superfamily 247725 215 317 1.30E-28 111.224 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#22482 - CGI_10023254 superfamily 241571 316 426 4.02E-26 103.646 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#22482 - CGI_10023254 superfamily 243051 504 618 6.59E-17 78.5737 cl02479 MAM superfamily N - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." 
Q#22482 - CGI_10023254 superfamily 241568 434 494 0.000331012 38.9832 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#22482 - CGI_10023254 superfamily 246918 271 310 7.85E-10 55.6707 cl15278 TSP_1 superfamily N - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#22482 - CGI_10023254 superfamily 246918 113 156 0.000302985 39.1071 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#22482 - CGI_10023254 superfamily 243093 160 252 0.00374932 36.2941 cl02568 WSC superfamily - - WSC domain; This domain may be involved in carbohydrate binding. Q#22483 - CGI_10023255 superfamily 247069 94 260 3.19E-32 119.797 cl15787 SEC14 superfamily - - "Sec14p-like lipid-binding domain. Found in secretory proteins, such as S. cerevisiae phosphatidylinositol transfer protein (Sec14p), and in lipid regulated proteins such as RhoGAPs, RhoGEFs and neurofibromin (NF1). SEC14 domain of Dbl is known to associate with G protein beta/gamma subunits." Q#22483 - CGI_10023255 superfamily 247643 35 75 6.95E-07 46.0405 cl16919 CRAL_TRIO_N superfamily - - "CRAL/TRIO, N-terminal domain; This all-alpha domain is found to the N-terminus of pfam00650." Q#22484 - CGI_10023256 superfamily 247069 75 243 1.25E-29 112.479 cl15787 SEC14 superfamily - - "Sec14p-like lipid-binding domain. Found in secretory proteins, such as S. 
cerevisiae phosphatidylinositol transfer protein (Sec14p), and in lipid regulated proteins such as RhoGAPs, RhoGEFs and neurofibromin (NF1). SEC14 domain of Dbl is known to associate with G protein beta/gamma subunits." Q#22484 - CGI_10023256 superfamily 247643 16 57 1.31E-10 56.7891 cl16919 CRAL_TRIO_N superfamily - - "CRAL/TRIO, N-terminal domain; This all-alpha domain is found to the N-terminus of pfam00650." Q#22485 - CGI_10023257 superfamily 243066 23 113 3.38E-24 95.3124 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#22485 - CGI_10023257 superfamily 219619 284 360 5.37E-10 55.6767 cl18518 Ion_trans_2 superfamily - - Ion channel; This family includes the two membrane helix type ion channels found in bacteria. 
Q#22486 - CGI_10023258 superfamily 246597 131 215 6.37E-15 68.4086 cl13995 MPP_superfamily superfamily N - "metallophosphatase superfamily, metallophosphatase domain; Metallophosphatases (MPPs), also known as metallophosphoesterases, phosphodiesterases (PDEs), binuclear metallophosphoesterases, and dimetal-containing phosphoesterases (DMPs), represent a diverse superfamily of enzymes with a conserved domain containing an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. This superfamily includes: the phosphoprotein phosphatases (PPPs), Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination." Q#22486 - CGI_10023258 superfamily 246597 10 67 3.27E-12 60.7047 cl13995 MPP_superfamily superfamily C - "metallophosphatase superfamily, metallophosphatase domain; Metallophosphatases (MPPs), also known as metallophosphoesterases, phosphodiesterases (PDEs), binuclear metallophosphoesterases, and dimetal-containing phosphoesterases (DMPs), represent a diverse superfamily of enzymes with a conserved domain containing an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. This superfamily includes: the phosphoprotein phosphatases (PPPs), Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). 
The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination." Q#22491 - CGI_10023263 superfamily 247938 56 127 2.45E-36 121.936 cl17384 TAF12 superfamily - - "TATA Binding Protein (TBP) Associated Factor 12 (TAF12) is one of several TAFs that bind TBP and is involved in forming Transcription Factor IID (TFIID) complex; The TATA Binding Protein (TBP) Associated Factor 12 (TAF12) is one of several TAFs that bind TBP and are involved in forming the TFIID complex. TFIID is one of the seven General Transcription Factors (GTFs) (TFIIA, TFIIB, TFIID, TFIIE, TFIIF, and TFIID) that are involved in accurate initiation of transcription by RNA polymerase II in eukaryotes. TFIID plays an important role in the recognition of promoter DNA and assembly of the pre-initiation complex. TFIID complex is composed of the TBP and at least 13 TAFs. TAFs are named after their electrophoretic mobility in polyacrylamide gels in different species. A new, unified nomenclature has been suggested for the pol II TAFs to show the relationship between TAF orthologs and paralogs. Several hypotheses are proposed for TAFs function such as serving as activator-binding sites, core-promoter recognition or a role in essential catalytic activity. These TAFs, with the help of specific activators, are required only for expression of a subset of genes and are not universally involved for transcription as are GTFs. In yeast and human cells, TAFs have been found as components of other complexes besides TFIID. Several TAFs interact via histone-fold (HFD) motifs; the HFD is the interaction motif involved in heterodimerization of the core histones and their assembly into nucleosome octamers. The minimal HFD contains three alpha-helices linked by two loops and is found in core histones, TAFs and many other transcription factors. 
TFIID has a histone octamer-like substructure. TAF12 domain interacts with TAF4 and makes a novel histone-like heterodimer that binds DNA and has a core promoter function of a subset of genes." Q#22492 - CGI_10023264 superfamily 205480 212 297 7.29E-12 62.2933 cl16219 DUF4078 superfamily - - "Domain of unknown function (DUF4078); This family is found from fungi to humans, but its exact function is not known." Q#22492 - CGI_10023264 superfamily 221501 518 612 0.000549782 39.0006 cl13679 RCR superfamily N - "Chitin synthesis regulation, resistance to Congo red; RCR proteins are ER membrane proteins that regulate chitin deposition in fungal cell walls. Although chitin, a linear polymer of beta-1,4-linked N-acetylglucosamine, constitutes only 2% of the cell wall it plays a vital role in the overall protection of the cell wall against stress, noxious chemicals and osmotic pressure changes. Congo red is a cell wall-disrupting benzidine-type dye extensively used in many cell wall mutant studies that specifically targets chitin in yeast cells and inhibits growth. RCR proteins render the yeasts resistant to Congo red by diminishing the content of chitin in the cell wall. RCR proteins are probably regulating chitin synthase III interact directly with ubiquitin ligase Rsp5, and the VPEY motif is necessary for this, via interaction with the WW domains of Rsp5." Q#22493 - CGI_10023265 superfamily 202715 15 113 1.38E-47 150.036 cl04194 Tctex-1 superfamily - - Tctex-1 family; Tctex-1 is a dynein light chain. It has been shown that Tctex-1 can bind to the cytoplasmic tail of rhodopsin. C-terminal rhodopsin mutations responsible for retinitis pigmentosa inhibit this interaction. Q#22495 - CGI_10003607 superfamily 241839 26 178 1.95E-39 145.02 cl00396 PHO4 superfamily C - Phosphate transporter family; This family includes PHO-4 from Neurospora crassa which is a is a Na(+)-phosphate symporter. This family also contains the leukaemia virus receptor. 
Q#22495 - CGI_10003607 superfamily 241839 427 570 2.56E-36 136.16 cl00396 PHO4 superfamily N - Phosphate transporter family; This family includes PHO-4 from Neurospora crassa which is a is a Na(+)-phosphate symporter. This family also contains the leukaemia virus receptor. Q#22496 - CGI_10003608 superfamily 218674 414 590 5.63E-13 67.9406 cl05292 Miff superfamily - - Mitochondrial and peroxisomal fission factor Mff; This protein has a role in mitochondrial and peroxisomal fission. Q#22500 - CGI_10003612 superfamily 243034 484 588 2.74E-05 43.1376 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#22500 - CGI_10003612 superfamily 243034 23 141 0.000230964 40.056 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 
amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#22501 - CGI_10003613 superfamily 243034 15 114 4.64E-12 62.7828 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that 
is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#22501 - CGI_10003613 superfamily 243034 365 429 7.19E-05 41.2116 cl02429 TPR superfamily C - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#22502 - CGI_10003339 superfamily 248458 322 604 6.19E-11 63.1017 cl17904 
MFS superfamily - - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#22502 - CGI_10003339 superfamily 248458 84 212 6.27E-07 50.7753 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. 
MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#22503 - CGI_10003593 superfamily 241573 60 346 3.29E-81 255.334 cl00051 CysPc superfamily - - "Calpains, domains IIa, IIb; calcium-dependent cytoplasmic cysteine proteinases, papain-like. Functions in cytoskeletal remodeling processes, cell differentiation, apoptosis and signal transduction." Q#22503 - CGI_10003593 superfamily 241653 369 429 0.00323096 36.5075 cl00165 Calpain_III superfamily C - "Calpain, subdomain III. 
Calpains are calcium-activated cytoplasmic cysteine proteinases, participate in cytoskeletal remodeling processes, cell differentiation, apoptosis and signal transduction. Catalytic domain and the two calmodulin-like domains are separated by C2-like domain III. Domain III plays an important role in calcium-induced activation of calpain involving electrostatic interactions with subdomain II. Proposed to mediate calpain's interaction with phospholipids and translocation to cytoplasmic/nuclear membranes. CD includes subdomain III of typical and atypical calpains." Q#22504 - CGI_10003594 superfamily 241573 82 370 4.41E-81 261.112 cl00051 CysPc superfamily - - "Calpains, domains IIa, IIb; calcium-dependent cytoplasmic cysteine proteinases, papain-like. Functions in cytoskeletal remodeling processes, cell differentiation, apoptosis and signal transduction." Q#22504 - CGI_10003594 superfamily 241653 386 495 9.63E-09 53.8636 cl00165 Calpain_III superfamily - - "Calpain, subdomain III. Calpains are calcium-activated cytoplasmic cysteine proteinases, participate in cytoskeletal remodeling processes, cell differentiation, apoptosis and signal transduction. Catalytic domain and the two calmodulin-like domains are separated by C2-like domain III. Domain III plays an important role in calcium-induced activation of calpain involving electrostatic interactions with subdomain II. Proposed to mediate calpain's interaction with phospholipids and translocation to cytoplasmic/nuclear membranes. CD includes subdomain III of typical and atypical calpains." Q#22506 - CGI_10007980 superfamily 241580 68 148 8.45E-35 126.9 cl00061 FH superfamily - - "Forkhead (FH), also known as a "winged helix". FH is named for the Drosophila fork head protein, a transcription factor which promotes terminal rather than segmental development. 
This family of transcription factor domains, which bind to B-DNA as monomers, are also found in the Hepatocyte nuclear factor (HNF) proteins, which provide tissue-specific gene regulation. The structure contains 2 flexible loops or "wings" in the C-terminal region, hence the term winged helix." Q#22509 - CGI_10007983 superfamily 242274 56 134 7.19E-08 48.8247 cl01053 SGNH_hydrolase superfamily NC - "SGNH_hydrolase, or GDSL_hydrolase, is a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the typical Ser-His-Asp(Glu) triad from other serine hydrolases, but may lack the carboxlic acid." Q#22510 - CGI_10007984 superfamily 205872 225 250 0.00158673 35.1629 cl16353 zf-CCHC_2 superfamily - - Zinc knuckle; This is a zinc-binding domain of the form CxxCxxxGHxxxxC from a variety of different species. Q#22512 - CGI_10007986 superfamily 242826 30 119 1.11E-39 131.267 cl01993 Ribosomal_S26e superfamily - - Ribosomal protein S26e; Ribosomal protein S26e. Q#22516 - CGI_10018818 superfamily 246918 453 505 6.30E-14 67.6119 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#22516 - CGI_10018818 superfamily 246918 396 448 2.98E-12 62.6043 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#22516 - CGI_10018818 superfamily 246918 550 602 8.19E-12 61.4487 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. 
Q#22516 - CGI_10018818 superfamily 152683 57 157 1.09E-09 56.5273 cl13656 Methyltransf_FA superfamily - - "Farnesoic acid 0-methyl transferase; This domain family is found in bacteria and eukaryotes, and is approximately 110 amino acids in length.Farnesoic acid O-methyl transferase (FAMeT) is the enzyme that catalyzes the formation of methyl farnesoate (MF) from farnesoic acid (FA) in the biosynthetic pathway of juvenile hormone (JH)." Q#22520 - CGI_10018822 superfamily 243035 40 168 4.11E-14 65.3337 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. 
Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#22523 - CGI_10018825 superfamily 118685 215 388 2.19E-27 105.655 cl10854 DUF2365 superfamily - - Uncharacterized conserved protein (DUF2365); This is a family of conserved proteins found from nematodes to humans. The function is unknown. Q#22524 - CGI_10018826 superfamily 242203 134 302 7.82E-61 194.875 cl00935 Brix superfamily - - Brix domain; Brix domain. Q#22525 - CGI_10018827 superfamily 218954 2 185 1.37E-77 238.759 cl05646 Isy1 superfamily C - Isy1-like splicing family; Isy1 protein is important in the optimisation of splicing. 
Q#22525 - CGI_10018827 superfamily 218954 241 269 2.21E-05 43.4631 cl05646 Isy1 superfamily N - Isy1-like splicing family; Isy1 protein is important in the optimisation of splicing. Q#22526 - CGI_10018828 superfamily 205524 16 118 2.68E-20 84.7852 cl17723 Hydrolase_6 superfamily - - Haloacid dehalogenase-like hydrolase; This family is part of the HAD superfamily. Q#22527 - CGI_10018829 superfamily 241568 369 427 2.70E-08 50.9244 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#22529 - CGI_10018831 superfamily 243035 80 114 0.00248968 35.3582 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#22530 - CGI_10018832 superfamily 248264 85 202 7.98E-08 50.3134 cl17710 DDE_4 superfamily C - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#22531 - CGI_10018833 superfamily 242173 47 186 3.08E-16 71.5142 cl00891 Cu-Zn_Superoxide_Dismutase superfamily - - "Copper/zinc superoxide dismutase (SOD). superoxide dismutases catalyse the conversion of superoxide radicals to molecular oxygen. Three evolutionarily distinct families of SODs are known, of which the copper/zinc-binding family is one. Defects in the human SOD1 gene causes familial amyotrophic lateral sclerosis (Lou Gehrig's disease). Cytoplasmic and periplasmic SODs exist as dimers, whereas chloroplastic and extracellular enzymes exist as tetramers. Structure supports independent functional evolution in prokaryotes (P-class) and eukaryotes (E-class) [PMID:.8176730]." Q#22532 - CGI_10018834 superfamily 242173 281 421 1.10E-24 98.093 cl00891 Cu-Zn_Superoxide_Dismutase superfamily - - "Copper/zinc superoxide dismutase (SOD). superoxide dismutases catalyse the conversion of superoxide radicals to molecular oxygen. Three evolutionarily distinct families of SODs are known, of which the copper/zinc-binding family is one. Defects in the human SOD1 gene causes familial amyotrophic lateral sclerosis (Lou Gehrig's disease). Cytoplasmic and periplasmic SODs exist as dimers, whereas chloroplastic and extracellular enzymes exist as tetramers. Structure supports independent functional evolution in prokaryotes (P-class) and eukaryotes (E-class) [PMID:.8176730]." 
Q#22541 - CGI_10005080 superfamily 248312 33 183 9.44E-10 54.2904 cl17758 PMP22_Claudin superfamily - - PMP-22/EMP/MP20/Claudin family; PMP-22/EMP/MP20/Claudin family. Q#22542 - CGI_10005081 superfamily 247692 10 52 0.00610398 32.5841 cl17068 AFD_class_I superfamily C - "Adenylate forming domain, Class I; This family includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain." Q#22544 - CGI_10009755 superfamily 245814 213 276 1.50E-07 47.8059 cl11960 Ig superfamily N - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#22544 - CGI_10009755 superfamily 245814 3 34 0.00017239 38.6405 cl11960 Ig superfamily N - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#22544 - CGI_10009755 superfamily 241584 38 136 0.00294579 35.1647 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#22545 - CGI_10009756 superfamily 243157 24 104 8.38E-47 158.445 cl02720 PB1 superfamily - - "The PB1 domain is a modular domain mediating specific protein-protein interactions which play a role in many critical cell processes, such as osteoclastogenesis, angiogenesis, early cardiovascular development, and cell polarity. A canonical PB1-PB1 interaction, which involves heterodimerization of two PB1 domain, is required for the formation of macromolecular signaling complexes ensuring specificity and fidelity during cellular signaling. The interaction between two PB1 domain depends on the type of PB1. There are three types of PB1 domains: type I which contains an OPCA motif, acidic aminoacid cluster, type II which contains a basic cluster, and type I/II which contains both an OPCA motif and a basic cluster. Interactions of PB1 domains with other protein domains have been described as a noncanonical PB1-interactions. 
The PB1 domain module is conserved in amoebas, fungi, animals, and plants." Q#22546 - CGI_10009757 superfamily 248458 33 419 1.25E-39 150.542 cl17904 MFS superfamily - - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." 
Q#22547 - CGI_10009758 superfamily 243098 1059 1102 4.99E-12 63.3859 cl02573 TUDOR superfamily - - "Tudor domains are found in many eukaryotic organisms and have been implicated in protein-protein interactions in which methylated protein substrates bind to these domains. For example, the Tudor domain of Survival of Motor Neuron (SMN) binds to symmetrically dimethylated arginines of arginine-glycine (RG) rich sequences found in the C-terminal tails of Sm proteins. The SMN protein is linked to spinal muscular atrophy. Another example is the tandem tudor domains of 53BP1, which bind to histone H4 specifically dimethylated at Lys20 (H4-K20me2). 53BP1 is a key transducer of the DNA damage checkpoint signal." Q#22547 - CGI_10009758 superfamily 243098 1285 1331 7.49E-12 63.0007 cl02573 TUDOR superfamily - - "Tudor domains are found in many eukaryotic organisms and have been implicated in protein-protein interactions in which methylated protein substrates bind to these domains. For example, the Tudor domain of Survival of Motor Neuron (SMN) binds to symmetrically dimethylated arginines of arginine-glycine (RG) rich sequences found in the C-terminal tails of Sm proteins. The SMN protein is linked to spinal muscular atrophy. Another example is the tandem tudor domains of 53BP1, which bind to histone H4 specifically dimethylated at Lys20 (H4-K20me2). 53BP1 is a key transducer of the DNA damage checkpoint signal." Q#22547 - CGI_10009758 superfamily 243098 547 599 7.63E-12 63.0007 cl02573 TUDOR superfamily - - "Tudor domains are found in many eukaryotic organisms and have been implicated in protein-protein interactions in which methylated protein substrates bind to these domains. For example, the Tudor domain of Survival of Motor Neuron (SMN) binds to symmetrically dimethylated arginines of arginine-glycine (RG) rich sequences found in the C-terminal tails of Sm proteins. The SMN protein is linked to spinal muscular atrophy. 
Another example is the tandem tudor domains of 53BP1, which bind to histone H4 specifically dimethylated at Lys20 (H4-K20me2). 53BP1 is a key transducer of the DNA damage checkpoint signal." Q#22547 - CGI_10009758 superfamily 243098 772 817 1.84E-11 61.8451 cl02573 TUDOR superfamily - - "Tudor domains are found in many eukaryotic organisms and have been implicated in protein-protein interactions in which methylated protein substrates bind to these domains. For example, the Tudor domain of Survival of Motor Neuron (SMN) binds to symmetrically dimethylated arginines of arginine-glycine (RG) rich sequences found in the C-terminal tails of Sm proteins. The SMN protein is linked to spinal muscular atrophy. Another example is the tandem tudor domains of 53BP1, which bind to histone H4 specifically dimethylated at Lys20 (H4-K20me2). 53BP1 is a key transducer of the DNA damage checkpoint signal." Q#22547 - CGI_10009758 superfamily 247792 45 78 0.00100413 38.966 cl17238 RING superfamily N - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#22547 - CGI_10009758 superfamily 241563 193 230 7.27E-07 48.2444 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#22547 - CGI_10009758 superfamily 128778 274 360 0.00111396 39.5555 cl17972 BBC superfamily N - B-Box C-terminal domain; Coiled coil region C-terminal to (some) B-Box domains Q#22547 - CGI_10009758 superfamily 243098 1491 1522 0.00893674 36.0368 cl02573 TUDOR superfamily C - "Tudor domains are found in many eukaryotic organisms and have been implicated in protein-protein interactions in which methylated protein substrates bind to these domains. For example, the Tudor domain of Survival of Motor Neuron (SMN) binds to symmetrically dimethylated arginines of arginine-glycine (RG) rich sequences found in the C-terminal tails of Sm proteins. The SMN protein is linked to spinal muscular atrophy. Another example is the tandem tudor domains of 53BP1, which bind to histone H4 specifically dimethylated at Lys20 (H4-K20me2). 53BP1 is a key transducer of the DNA damage checkpoint signal." Q#22548 - CGI_10009759 superfamily 241564 106 172 3.69E-25 96.5659 cl00035 BIR superfamily - - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." Q#22548 - CGI_10009759 superfamily 241564 25 93 9.36E-20 81.5431 cl00035 BIR superfamily - - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. 
This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." Q#22548 - CGI_10009759 superfamily 247792 300 338 0.000641928 37.04 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#22549 - CGI_10009760 superfamily 241563 13 51 1.34E-05 42.8516 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#22550 - CGI_10009761 superfamily 148298 10 59 0.00747764 31.661 cl05897 Prokineticin superfamily N - "Prokineticin; This family consists of several prokineticin proteins and related BM8 sequences. The suprachiasmatic nucleus (SCN) controls the circadian rhythm of physiological and behavioural processes in mammals. It has been shown that prokineticin 2 (PK2), a cysteine-rich secreted protein, functions as an output molecule from the SCN circadian clock. PK2 messenger RNA is rhythmically expressed in the SCN, and the phase of PK2 rhythm is responsive to light entrainment. Molecular and genetic studies have revealed that PK2 is a gene that is controlled by a circadian clock." 
Q#22551 - CGI_10009762 superfamily 247792 175 220 1.78E-12 59.3156 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#22552 - CGI_10009763 superfamily 241567 226 436 1.02E-60 199.749 cl00042 CASc superfamily - - "Caspase, interleukin-1 beta converting enzyme (ICE) homologues; Cysteine-dependent aspartate-directed proteases that mediate programmed cell death (apoptosis). Caspases are synthesized as inactive zymogens and activated by proteolysis of the peptide backbone adjacent to an aspartate. The resulting two subunits associate to form an (alpha)2(beta)2-tetramer which is the active enzyme. Activation of caspases can be mediated by other caspase homologs." Q#22552 - CGI_10009763 superfamily 246680 9 86 1.77E-07 48.4688 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 
They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#22553 - CGI_10009764 superfamily 243066 33 139 1.66E-17 78.0429 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#22553 - CGI_10009764 superfamily 198867 153 248 8.03E-16 72.9615 cl06652 BACK superfamily - - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation)." Q#22556 - CGI_10009767 superfamily 111397 432 509 1.38E-08 52.7286 cl03620 HYR superfamily - - "HYR domain; This domain is known as the HYR (Hyalin Repeat) domain, after the protein hyalin that is composed exclusively of this repeat. This domain probably corresponds to a new superfamily in the immunoglobulin fold. The function of this domain is uncertain it may be involved in cell adhesion." 
Q#22561 - CGI_10012203 superfamily 241599 214 272 6.34E-23 89.9952 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#22563 - CGI_10012205 superfamily 243077 16 72 2.00E-14 65.2593 cl02542 DnaJ superfamily - - "DnaJ domain or J-domain. DnaJ/Hsp40 (heat shock protein 40) proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperonine family. Hsp40 proteins are characterized by the presence of a J domain, which mediates the interaction with Hsp70. They may contain other domains as well, and the architectures provide a means of classification." Q#22564 - CGI_10012206 superfamily 247918 105 260 0.00190818 38.3266 cl17364 PMT_2 superfamily - - Dolichyl-phosphate-mannose-protein mannosyltransferase; This family contains members that are not captured by pfam02366. Q#22566 - CGI_10012208 superfamily 241599 142 199 4.23E-14 64.1868 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#22566 - CGI_10012208 superfamily 119045 199 229 1.02E-11 57.3243 cl11160 Engrail_1_C_sig superfamily - - Engrailed homeobox C-terminal signature domain; Engrailed homeobox proteins are characterized by the presence of a conserved region of some 20 amino-acid residues located at the C-terminal of the 'homeobox' domain. This domain of approximately 20 residues forms a kind of a signature pattern for this subfamily of proteins. 
Q#22567 - CGI_10012209 superfamily 241599 127 185 9.44E-21 82.2912 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#22567 - CGI_10012209 superfamily 119045 184 212 4.63E-11 55.0131 cl11160 Engrail_1_C_sig superfamily - - Engrailed homeobox C-terminal signature domain; Engrailed homeobox proteins are characterized by the presence of a conserved region of some 20 amino-acid residues located at the C-terminal of the 'homeobox' domain. This domain of approximately 20 residues forms a kind of a signature pattern for this subfamily of proteins. Q#22569 - CGI_10012211 superfamily 243035 18 110 1.80E-12 58.7853 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. 
CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#22570 - CGI_10012212 superfamily 247068 1143 1238 1.23E-10 60.4049 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#22570 - CGI_10012212 superfamily 247068 1251 1337 1.32E-06 48.4638 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#22570 - CGI_10012212 superfamily 247068 532 621 2.66E-06 47.3082 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#22570 - CGI_10012212 superfamily 247068 733 825 5.78E-06 46.1526 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#22570 - CGI_10012212 superfamily 247068 630 725 2.15E-05 44.6118 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#22570 - CGI_10012212 superfamily 247068 954 1012 2.64E-05 44.2266 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#22570 - CGI_10012212 superfamily 247068 1041 1135 0.00012682 42.3006 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#22570 - CGI_10012212 superfamily 247068 429 524 0.000159861 41.9154 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#22571 - CGI_10012213 superfamily 247736 85 163 3.38E-05 39.9446 cl17182 NAT_SF superfamily - - "N-Acyltransferase superfamily: Various enzymes that characteristically catalyze the transfer of an acyl group to a substrate; NAT (N-Acyltransferase) is a large superfamily of enzymes that mostly catalyze the transfer of an acyl group to a substrate and are implicated in a variety of functions, ranging from bacterial antibiotic resistance to circadian rhythms in mammals. Members include GCN5-related N-Acetyltransferases (GNAT) such as Aminoglycoside N-acetyltransferases, Histone N-acetyltransferase (HAT) enzymes, and Serotonin N-acetyltransferase, which catalyze the transfer of an acetyl group to a substrate. The kinetic mechanism of most GNATs involves the ordered formation of a ternary complex: the reaction begins with Acetyl Coenzyme A (AcCoA) binding, followed by binding of substrate, then direct transfer of the acetyl group from AcCoA to the substrate, followed by product and subsequent CoA release. Other family members include Arginine/ornithine N-succinyltransferase, Myristoyl-CoA: protein N-myristoyltransferase, and Acyl-homoserinelactone synthase which have a similar catalytic mechanism but differ in types of acyl groups transferred. Leucyl/phenylalanyl-tRNA-protein transferase and FemXAB nonribosomal peptidyltransferases which catalyze similar peptidyltransferase reactions are also included." 
Q#22572 - CGI_10012214 superfamily 220695 35 163 8.87E-05 42.1807 cl18571 7TM_GPCR_Srx superfamily C - Serpentine type 7TM GPCR chemoreceptor Srx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srx is part of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#22573 - CGI_10012215 superfamily 247736 128 206 1.94E-05 41.1002 cl17182 NAT_SF superfamily - - "N-Acyltransferase superfamily: Various enzymes that characteristically catalyze the transfer of an acyl group to a substrate; NAT (N-Acyltransferase) is a large superfamily of enzymes that mostly catalyze the transfer of an acyl group to a substrate and are implicated in a variety of functions, ranging from bacterial antibiotic resistance to circadian rhythms in mammals. Members include GCN5-related N-Acetyltransferases (GNAT) such as Aminoglycoside N-acetyltransferases, Histone N-acetyltransferase (HAT) enzymes, and Serotonin N-acetyltransferase, which catalyze the transfer of an acetyl group to a substrate. The kinetic mechanism of most GNATs involves the ordered formation of a ternary complex: the reaction begins with Acetyl Coenzyme A (AcCoA) binding, followed by binding of substrate, then direct transfer of the acetyl group from AcCoA to the substrate, followed by product and subsequent CoA release. Other family members include Arginine/ornithine N-succinyltransferase, Myristoyl-CoA: protein N-myristoyltransferase, and Acyl-homoserinelactone synthase which have a similar catalytic mechanism but differ in types of acyl groups transferred. Leucyl/phenylalanyl-tRNA-protein transferase and FemXAB nonribosomal peptidyltransferases which catalyze similar peptidyltransferase reactions are also included." 
Q#22574 - CGI_10012216 superfamily 247736 87 164 1.86E-06 43.4114 cl17182 NAT_SF superfamily - - "N-Acyltransferase superfamily: Various enzymes that characteristically catalyze the transfer of an acyl group to a substrate; NAT (N-Acyltransferase) is a large superfamily of enzymes that mostly catalyze the transfer of an acyl group to a substrate and are implicated in a variety of functions, ranging from bacterial antibiotic resistance to circadian rhythms in mammals. Members include GCN5-related N-Acetyltransferases (GNAT) such as Aminoglycoside N-acetyltransferases, Histone N-acetyltransferase (HAT) enzymes, and Serotonin N-acetyltransferase, which catalyze the transfer of an acetyl group to a substrate. The kinetic mechanism of most GNATs involves the ordered formation of a ternary complex: the reaction begins with Acetyl Coenzyme A (AcCoA) binding, followed by binding of substrate, then direct transfer of the acetyl group from AcCoA to the substrate, followed by product and subsequent CoA release. Other family members include Arginine/ornithine N-succinyltransferase, Myristoyl-CoA: protein N-myristoyltransferase, and Acyl-homoserinelactone synthase which have a similar catalytic mechanism but differ in types of acyl groups transferred. Leucyl/phenylalanyl-tRNA-protein transferase and FemXAB nonribosomal peptidyltransferases which catalyze similar peptidyltransferase reactions are also included." Q#22577 - CGI_10012219 superfamily 245835 446 644 3.25E-88 275.681 cl12013 BAR superfamily - - "The Bin/Amphiphysin/Rvs (BAR) domain, a dimerization module that binds membranes and detects membrane curvature; BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions including organelle biogenesis, membrane trafficking or remodeling, and cell division and migration. Mutations in BAR containing proteins have been linked to diseases and their inactivation in cells leads to altered membrane dynamics. 
A BAR domain with an additional N-terminal amphipathic helix (an N-BAR) can drive membrane curvature. These N-BAR domains are found in amphiphysins and endophilins, among others. BAR domains are also frequently found alongside domains that determine lipid specificity, such as the Pleckstrin Homology (PH) and Phox Homology (PX) domains which are present in beta centaurins (ACAPs and ASAPs) and sorting nexins, respectively. A FES-CIP4 Homology (FCH) domain together with a coiled coil region is called the F-BAR domain and is present in Pombe/Cdc15 homology (PCH) family proteins, which include Fes/Fes tyrosine kinases, PACSIN or syndapin, CIP4-like proteins, and srGAPs, among others. The Inverse (I)-BAR or IRSp53/MIM homology Domain (IMD) is found in multi-domain proteins, such as IRSp53 and MIM, that act as scaffolding proteins and transducers of a variety of signaling pathways that link membrane dynamics and the underlying actin cytoskeleton. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The I-BAR domain induces membrane protrusions in the opposite direction compared to classical BAR and F-BAR domains, which produce membrane invaginations. BAR domains that also serve as protein interaction domains include those of arfaptin and OPHN1-like proteins, among others, which bind to Rac and Rho GAP domains, respectively." Q#22577 - CGI_10012219 superfamily 243088 303 426 5.99E-62 203.317 cl02563 PX_domain superfamily - - "The Phox Homology domain, a phosphoinositide binding module; The PX domain is a phosphoinositide (PI) binding module involved in targeting proteins to membranes. Proteins containing PX domains interact with PIs and have been implicated in highly diverse functions such as cell signaling, vesicular trafficking, protein sorting, lipid modification, cell polarity and division, activation of T and B cells, and cell survival. 
Many members of this superfamily bind phosphatidylinositol-3-phosphate (PI3P) but in some cases, other PIs such as PI4P or PI(3,4)P2, among others, are the preferred substrates. In addition to protein-lipid interaction, the PX domain may also be involved in protein-protein interaction, as in the cases of p40phox, p47phox, and some sorting nexins (SNXs). The PX domain is conserved from yeast to humans and is found in more than 100 proteins. The majority of PX domain-containing proteins are SNXs, which play important roles in endosomal sorting." Q#22577 - CGI_10012219 superfamily 247683 61 115 8.64E-16 72.7447 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#22582 - CGI_10009571 superfamily 247725 85 187 6.18E-18 82.1342 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. 
They are generally involved in targeting proteins to the appropriate cellular location or in interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and other proteins." Q#22583 - CGI_10009572 superfamily 247755 425 663 4.16E-105 332.547 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#22583 - CGI_10009572 superfamily 247755 1202 1241 4.04E-11 62.9076 cl17201 ABC_ATPase superfamily C - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#22584 - CGI_10009573 superfamily 247755 1 132 1.59E-68 213.906 cl17201 ABC_ATPase superfamily N - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. 
The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#22585 - CGI_10009574 superfamily 241832 37 105 2.81E-25 94.1528 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#22585 - CGI_10009574 superfamily 243175 115 184 1.32E-07 46.4618 cl02776 GST_C_family superfamily C - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. 
In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." Q#22585 - CGI_10009574 superfamily 247755 1 30 3.28E-06 44.418 cl17201 ABC_ATPase superfamily N - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. 
ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#22586 - CGI_10009575 superfamily 241594 4009 4364 8.66E-143 453.175 cl00077 HECTc superfamily - - "HECT domain; C-terminal catalytic domain of a subclass of Ubiquitin-protein ligase (E3). It binds specific ubiquitin-conjugating enzymes (E2), accepts ubiquitin from E2, transfers ubiquitin to substrate lysine side chains, and transfers additional ubiquitin molecules to the end of growing ubiquitin chains." Q#22586 - CGI_10009575 superfamily 241643 1346 1381 0.000177401 42.4463 cl00153 UBA superfamily - - "Ubiquitin Associated domain. The UBA domain is a commonly occurring sequence motif in some members of the ubiquitination pathway, UV excision repair proteins, and certain protein kinases. Although its specific role is so far unknown, it has been suggested that UBA domains are involved in conferring protein target specificity. The domain, a compact three helix bundle, has a conserved GFP-loop and the proline is thought to be critical for binding. The UBA domain is distinct from the conserved three helical domain seen in the N-terminus of EF-TS and eukaryotic NAC proteins." Q#22586 - CGI_10009575 superfamily 218859 431 833 2.27E-74 256.141 cl15653 DUF913 superfamily - - Domain of Unknown Function (DUF913); Members of this family are found in various ubiquitin protein ligases. Q#22586 - CGI_10009575 superfamily 222719 3015 3125 5.48E-36 135.932 cl16839 DUF4414 superfamily - - Domain of unknown function (DUF4414); This family is frequently found on DNA binding proteins of the URE-B1 type and on ligases. 
Q#22586 - CGI_10009575 superfamily 207713 1649 1710 6.62E-13 68.1149 cl02729 WWE superfamily - - WWE domain; The WWE domain is named after three of its conserved residues and is predicted to mediate specific protein- protein interactions in ubiquitin and ADP ribose conjugation systems. Q#22586 - CGI_10009575 superfamily 244870 2693 2743 1.36E-09 59.2964 cl08238 PA superfamily C - "PA: Protease-associated (PA) domain. The PA domain is an insert domain in a diverse fraction of proteases. The significance of the PA domain to many of the proteins in which it is inserted is undetermined. It may be a protein-protein interaction domain. At peptidase active sites, the PA domain may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate. Proteins into which the PA domain is inserted include the following: i) various signal peptide peptidases including, hSPPL2a and 2b which catalyze the intramembrane proteolysis of tumor necrosis factor alpha, ii) various proteins containing a C3H2C3 RING finger including, Arabidopsis ReMembR-H2 protein and various E3 ubiquitin ligases such as human GRAIL (gene related to anergy in lymphocytes), iii) EDEM3 (ER-degradation-enhancing mannosidase-like 3 protein), iv) various plant vacuolar sorting receptors such as Pisum sativum BP-80, v) glutamate carboxypeptidase II (GCPII), vi) yeast aminopeptidase Y, vii) Vibrio metschnikovii VapT, a sodium dodecyl sulfate (SDS) resistant extracellular alkaline serine protease, viii) lactocepin (a cell envelope-associated protease from Lactobacillus paracasei subsp. paracasei NCDO 151), ix) various subtilisin-like proteases such as melon Cucumisin, and x) human TfR (transferrin receptor) 1 and 2." 
Q#22587 - CGI_10009576 superfamily 241628 64 327 8.94E-60 198.99 cl00130 PseudoU_synth superfamily N - "Pseudouridine synthases catalyze the isomerization of specific uridines in an RNA molecule to pseudouridines (5-ribosyluracil, psi); Pseudouridine synthases contain the RsuA/RluD, TruA, TruB and TruD families. This group consists of eukaryotic, bacterial and archaeal pseudouridine synthases. Some psi sites such as psi55, 13, 38 and 39 in tRNA are highly conserved, being in the same position in eubacteria, archaebacteria and eukaryotes. Other psi sites occur in a more restricted fashion, for example psi2604 in 23S RNA made by E. coli RluF has only been detected in E. coli. Human dyskerin with the help of guide RNAs makes the hundreds of pseudouridines present in rRNA and small nuclear RNAs (snRNAs). Mutations in human dyskerin cause X-linked dyskeratosis congenita. Missense mutation in human PUS1 causes mitochondrial myopathy and sideroblastic anemia (MLASA)." Q#22588 - CGI_10009577 superfamily 217473 281 439 6.52E-24 102.058 cl03978 Mab-21 superfamily N - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#22591 - CGI_10003833 superfamily 243051 11 97 2.50E-18 75.0796 cl02479 MAM superfamily N - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. 
In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal cord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#22593 - CGI_10003835 superfamily 243051 67 154 6.06E-17 73.1809 cl02479 MAM superfamily N - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal cord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#22595 - CGI_10003837 superfamily 245201 18 211 1.79E-51 172.806 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." 
Q#22597 - CGI_10008458 superfamily 245226 6 45 5.91E-07 45.3687 cl10012 DnaQ_like_exo superfamily C - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#22602 - CGI_10008463 superfamily 245226 432 505 9.21E-06 44.5989 cl10012 DnaQ_like_exo superfamily C - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. 
The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#22612 - CGI_10004578 superfamily 243051 463 617 5.84E-24 98.9893 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal cord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region."
Q#22612 - CGI_10004578 superfamily 241571 337 409 4.35E-08 51.6443 cl00049 CUB superfamily C - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#22612 - CGI_10004578 superfamily 241583 115 287 4.66E-39 142.325 cl00064 ZnMc superfamily - - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#22612 - CGI_10004578 superfamily 241609 623 661 1.30E-06 46.9998 cl00100 KR superfamily C - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#22616 - CGI_10004583 superfamily 192997 336 484 1.06E-12 67.2215 cl18184 Sterol-sensing superfamily - - "Sterol-sensing domain of SREBP cleavage-activation; Sterol regulatory element-binding proteins (SREBPs) are membrane-bound transcription factors that promote lipid synthesis in animal cells. They are embedded in the membranes of the endoplasmic reticulum (ER) in a helical hairpin orientation and are released from the ER by a two-step proteolytic process. Proteolysis begins when the SREBPs are cleaved at Site-1, which is located at a leucine residue in the middle of the hydrophobic loop in the lumen of the ER. Upon proteolytic processing SREBP can activate the expression of genes involved in cholesterol biosynthesis and uptake. SCAP stimulates cleavage of SREBPs via fusion of their two C-termini.
This domain is the transmembrane region that traverses the membrane eight times and is the sterol-sensing domain of the cleavage protein. WD40 domains are found towards the C-terminus." Q#22617 - CGI_10007004 superfamily 241546 603 763 1.36E-38 140.768 cl00011 PLAT superfamily - - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats. The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#22617 - CGI_10007004 superfamily 241546 785 914 1.56E-37 137.686 cl00011 PLAT superfamily - - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats. The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#22617 - CGI_10007004 superfamily 241546 280 398 7.74E-35 129.982 cl00011 PLAT superfamily - - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats. The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#22617 - CGI_10007004 superfamily 241546 411 498 2.66E-30 116.886 cl00011 PLAT superfamily C - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel.
The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats. The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#22617 - CGI_10007004 superfamily 241546 159 263 3.82E-23 96.47 cl00011 PLAT superfamily - - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats. The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#22617 - CGI_10007004 superfamily 241546 560 583 0.000140152 41.3864 cl00011 PLAT superfamily N - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats. The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#22618 - CGI_10007005 superfamily 245599 217 440 9.51E-93 281.946 cl11397 NR_LBD superfamily - - "The ligand binding domain of nuclear receptors, a family of ligand-activated transcription regulators; Ligand-binding domain (LBD) of nuclear receptor (NR): Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions in metazoans, from development, reproduction, to homeostasis and metabolism. The superfamily contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified.
The members of the family include receptors of steroids, thyroid hormone, retinoids, cholesterol by-products, lipids and heme. With few exceptions, NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD)." Q#22618 - CGI_10007005 superfamily 207662 100 192 6.82E-63 200.471 cl02596 NR_DBD_like superfamily - - "DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers; DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. It interacts with a specific DNA site upstream of the target gene and modulates the rate of transcriptional initiation. Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions, from development, reproduction, to homeostasis and metabolism in animals (metazoans). The family contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). Most nuclear receptors bind as homodimers or heterodimers to their target sites, which consist of two hexameric half-sites. Specificity is determined by the half-site sequence, the relative orientation of the half-sites and the number of spacer nucleotides between the half-sites. However, a growing number of nuclear receptors have been reported to bind to DNA as monomers." 
Q#22620 - CGI_10007007 superfamily 247866 7 233 6.66E-20 85.582 cl17312 PhyH superfamily - - "Phytanoyl-CoA dioxygenase (PhyH); This family is made up of several eukaryotic phytanoyl-CoA dioxygenase (PhyH) proteins, ectoine hydroxylases and a number of bacterial deoxygenases. PhyH is a peroxisomal enzyme catalyzing the first step of phytanic acid alpha-oxidation. PhyH deficiency causes Refsum's disease (RD) which is an inherited neurological syndrome biochemically characterized by the accumulation of phytanic acid in plasma and tissues." Q#22623 - CGI_10025039 superfamily 243065 31 191 0.00247569 35.8805 cl02516 VWD superfamily - - von Willebrand factor type D domain; Luciferin-2-monooxygenase from Vargula hilgendorfii contains a vwd domain. Its function is unrelated but the similarity is very strong by several methods. Q#22627 - CGI_10025043 superfamily 247744 89 124 0.00233722 37.6332 cl17190 NK superfamily C - "Nucleoside/nucleotide kinase (NK) is a protein superfamily consisting of multiple families of enzymes that share structural similarity and are functionally related to the catalysis of the reversible phosphate group transfer from nucleoside triphosphates to nucleosides/nucleotides, nucleoside monophosphates, or sugars. Members of this family play a wide variety of essential roles in nucleotide metabolism, the biosynthesis of coenzymes and aromatic compounds, as well as the metabolism of sugar and sulfate." Q#22628 - CGI_10025044 superfamily 216981 59 153 2.55E-13 62.9354 cl17087 OTU superfamily C - "OTU-like cysteine protease; This family is comprised of a group of predicted cysteine proteases, homologous to the Ovarian Tumour (OTU) gene in Drosophila. Members include proteins from eukaryotes, viruses and pathogenic bacteria. The conserved cysteine and histidine, and possibly the aspartate, represent the catalytic residues in this putative group of proteases."
Q#22630 - CGI_10025046 superfamily 248097 3 117 8.73E-18 74.2238 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#22631 - CGI_10025047 superfamily 110440 396 416 0.000256929 38.5429 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#22632 - CGI_10025048 superfamily 110440 294 321 0.00309907 34.6909 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#22635 - CGI_10025051 superfamily 141815 6 323 5.07E-128 371.312 cl04275 Mtc superfamily - - Tricarboxylate carrier; Tricarboxylate carrier. Q#22636 - CGI_10025052 superfamily 241599 165 223 1.83E-21 84.9876 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner."
Q#22637 - CGI_10025053 superfamily 241599 198 256 3.21E-21 84.9876 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#22638 - CGI_10025054 superfamily 241599 182 240 2.26E-21 85.3728 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#22642 - CGI_10025058 superfamily 247911 40 391 5.65E-132 385.153 cl17357 Fumble superfamily - - "Fumble; Fumble is required for cell division in Drosophila. Mutants lacking fumble exhibit abnormalities in bipolar spindle organisation, chromosome segregation, and contractile ring formation. Analyses have demonstrated that fumble encodes three protein isoforms, all of which contain a domain with high similarity to the pantothenate kinases of A. nidulans and mouse. A role of fumble in membrane synthesis has been proposed." Q#22643 - CGI_10025059 superfamily 245201 131 219 4.46E-08 50.3129 cl09925 PKc_like superfamily C - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins."
Q#22644 - CGI_10025060 superfamily 241624 413 588 1.09E-60 204.483 cl00120 PP2Cc superfamily N - "Serine/threonine phosphatases, family 2C, catalytic domain; The protein architecture and deduced catalytic mechanism of PP2C phosphatases are similar to the PP1, PP2A, PP2B family of protein Ser/Thr phosphatases, with which PP2C shares no sequence similarity." Q#22644 - CGI_10025060 superfamily 241624 14 105 1.72E-21 93.9788 cl00120 PP2Cc superfamily C - "Serine/threonine phosphatases, family 2C, catalytic domain; The protein architecture and deduced catalytic mechanism of PP2C phosphatases are similar to the PP1, PP2A, PP2B family of protein Ser/Thr phosphatases, with which PP2C shares no sequence similarity." Q#22648 - CGI_10025064 superfamily 248054 97 147 2.07E-09 54.4004 cl17500 NAD_binding_8 superfamily N - NAD(P)-binding Rossmann-like domain; NAD(P)-binding Rossmann-like domain. Q#22649 - CGI_10025065 superfamily 243074 169 211 9.08E-13 63.6797 cl02535 F-box-like superfamily - - F-box-like; This is an F-box-like family. 
Q#22649 - CGI_10025065 superfamily 243092 453 563 1.08E-08 55.4188 cl02567 WD40 superfamily NC - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#22652 - CGI_10025068 superfamily 243034 748 858 1.68E-13 69.7163 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#22652 - CGI_10025068 superfamily 243034 1150 1261 2.09E-11 63.5532 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the
number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#22652 - CGI_10025068 superfamily 243034 628 738 7.67E-11 62.0124 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the
p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#22652 - CGI_10025068 superfamily 243034 1031 1138 1.71E-10 60.8568 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#22652 - CGI_10025068 superfamily 243034 167 223 3.78E-10 60.0864 cl02429 TPR superfamily C - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved
in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#22652 - CGI_10025068 superfamily 243034 868 974 6.08E-10 59.316 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p
co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#22652 - CGI_10025068 superfamily 243034 511 614 8.68E-10 58.9308 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#22652 - CGI_10025068 superfamily 243034 951 1058 5.54E-09 56.6196 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions,
but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#22652 - CGI_10025068 superfamily 243034 428 538 6.46E-07 50.0712 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the
cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#22652 - CGI_10025068 superfamily 243034 354 458 0.00055894 41.2116 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#22652 - CGI_10025068 superfamily 248422 1553 1864 1.38E-52 189.454 cl17868 CHAT superfamily - - CHAT domain; These proteins appear to be related to peptidases in peptidase clan CD that includes the caspases. This domain has been termed the CHAT domain for Caspase HetF Associated with Tprs. This family has been identified as a sister group to the separins.
Q#22653 - CGI_10025069 superfamily 243034 40 110 2.23E-05 43.5228 cl02429 TPR superfamily N - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#22653 - CGI_10025069 superfamily 248012 601 706 1.78E-11 62.2909 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. 
Q#22654 - CGI_10025070 superfamily 243034 158 264 0.000622728 38.5152 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#22654 - CGI_10025070 superfamily 248012 427 532 7.65E-12 62.6761 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#22656 - CGI_10025072 superfamily 247986 37 132 1.04E-11 62.3906 cl17432 PBPb superfamily C - "Bacterial periplasmic transport systems use membrane-bound complexes and substrate-bound, membrane-associated, periplasmic binding proteins (PBPs) to transport a wide variety of substrates, such as, amino acids, peptides, sugars, vitamins and inorganic ions. PBPs have two cell-membrane translocation functions: bind substrate, and interact with the membrane bound complex. 
A diverse group of periplasmic transport receptors for lysine/arginine/ornithine (LAO), glutamine, histidine, sulfate, phosphate, molybdate, and methanol are included in the PBPb CD." Q#22656 - CGI_10025072 superfamily 197504 243 358 1.82E-07 48.8249 cl18192 PBPe superfamily - - Eukaryotic homologues of bacterial periplasmic substrate binding proteins; Prokaryotic homologues are represented by a separate alignment: PBPb Q#22657 - CGI_10025073 superfamily 247724 568 770 7.77E-75 244.367 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#22657 - CGI_10025073 superfamily 243072 409 447 2.59E-07 49.6894 cl02529 ANK superfamily C - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. 
The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#22660 - CGI_10025076 superfamily 246671 379 507 2.52E-29 112.516 cl14606 Reeler_cohesin_like superfamily - - "Domains similar to the eukaryotic reeler domain and bacterial cohesins; This diverse family summarizes a set of distantly related domains, as revealed by structural similarity." Q#22661 - CGI_10025077 superfamily 242173 257 394 1.84E-19 86.537 cl00891 Cu-Zn_Superoxide_Dismutase superfamily - - "Copper/zinc superoxide dismutase (SOD). superoxide dismutases catalyse the conversion of superoxide radicals to molecular oxygen. Three evolutionarily distinct families of SODs are known, of which the copper/zinc-binding family is one. Defects in the human SOD1 gene causes familial amyotrophic lateral sclerosis (Lou Gehrig's disease). Cytoplasmic and periplasmic SODs exist as dimers, whereas chloroplastic and extracellular enzymes exist as tetramers. Structure supports independent functional evolution in prokaryotes (P-class) and eukaryotes (E-class) [PMID:.8176730]." Q#22661 - CGI_10025077 superfamily 242173 412 558 3.17E-16 77.2922 cl00891 Cu-Zn_Superoxide_Dismutase superfamily - - "Copper/zinc superoxide dismutase (SOD). superoxide dismutases catalyse the conversion of superoxide radicals to molecular oxygen. Three evolutionarily distinct families of SODs are known, of which the copper/zinc-binding family is one. Defects in the human SOD1 gene causes familial amyotrophic lateral sclerosis (Lou Gehrig's disease). Cytoplasmic and periplasmic SODs exist as dimers, whereas chloroplastic and extracellular enzymes exist as tetramers. 
Structure supports independent functional evolution in prokaryotes (P-class) and eukaryotes (E-class) [PMID:.8176730]." Q#22661 - CGI_10025077 superfamily 242173 728 873 6.25E-15 73.4402 cl00891 Cu-Zn_Superoxide_Dismutase superfamily - - "Copper/zinc superoxide dismutase (SOD). superoxide dismutases catalyse the conversion of superoxide radicals to molecular oxygen. Three evolutionarily distinct families of SODs are known, of which the copper/zinc-binding family is one. Defects in the human SOD1 gene causes familial amyotrophic lateral sclerosis (Lou Gehrig's disease). Cytoplasmic and periplasmic SODs exist as dimers, whereas chloroplastic and extracellular enzymes exist as tetramers. Structure supports independent functional evolution in prokaryotes (P-class) and eukaryotes (E-class) [PMID:.8176730]." Q#22661 - CGI_10025077 superfamily 242173 637 714 7.52E-06 45.7059 cl00891 Cu-Zn_Superoxide_Dismutase superfamily N - "Copper/zinc superoxide dismutase (SOD). superoxide dismutases catalyse the conversion of superoxide radicals to molecular oxygen. Three evolutionarily distinct families of SODs are known, of which the copper/zinc-binding family is one. Defects in the human SOD1 gene causes familial amyotrophic lateral sclerosis (Lou Gehrig's disease). Cytoplasmic and periplasmic SODs exist as dimers, whereas chloroplastic and extracellular enzymes exist as tetramers. Structure supports independent functional evolution in prokaryotes (P-class) and eukaryotes (E-class) [PMID:.8176730]." Q#22662 - CGI_10025078 superfamily 247723 124 199 2.79E-48 168.377 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. 
This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#22662 - CGI_10025078 superfamily 247723 221 298 2.81E-42 151.266 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#22663 - CGI_10025079 superfamily 245206 4 225 6.06E-96 301.936 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. 
The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#22663 - CGI_10025079 superfamily 245226 291 460 1.28E-40 147.725 cl10012 DnaQ_like_exo superfamily - - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. 
Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#22664 - CGI_10025080 superfamily 241760 44 92 2.74E-23 93.5655 cl00295 ZZ superfamily - - "Zinc finger, ZZ type. Zinc finger present in dystrophin, CBP/p300 and many other proteins. The ZZ motif coordinates one or two zinc ions and most likely participates in ligand binding or molecular scaffolding. Many proteins containing ZZ motifs have other zinc-binding motifs as well, and the majority serve as scaffolds in pathways involving acetyltransferase, protein kinase, or ubiquitin-related activity. ZZ proteins can be grouped into the following functional classes: chromatin modifying, cytoskeletal scaffolding, ubiquitin binding or conjugating, and membrane receptor or ion-channel modifying proteins." Q#22665 - CGI_10025081 superfamily 244870 21 137 7.96E-44 142.513 cl08238 PA superfamily - - "PA: Protease-associated (PA) domain. The PA domain is an insert domain in a diverse fraction of proteases. The significance of the PA domain to many of the proteins in which it is inserted is undetermined. It may be a protein-protein interaction domain. At peptidase active sites, the PA domain may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate. 
Proteins into which the PA domain is inserted include the following: i) various signal peptide peptidases including, hSPPL2a and 2b which catalyze the intramembrane proteolysis of tumor necrosis factor alpha, ii) various proteins containing a C3H2C3 RING finger including, Arabidopsis ReMembR-H2 protein and various E3 ubiquitin ligases such as human GRAIL (gene related to anergy in lymphocytes), iii) EDEM3 (ER-degradation-enhancing mannosidase-like 3 protein), iv) various plant vacuolar sorting receptors such as Pisum sativum BP-80, v) glutamate carboxypeptidase II (GCPII), vi) yeast aminopeptidase Y, vii) Vibrio metschnikovii VapT, a sodium dodecyl sulfate (SDS) resistant extracellular alkaline serine protease, viii) lactocepin (a cell envelope-associated protease from Lactobacillus paracasei subsp. paracasei NCDO 151), ix) various subtilisin-like proteases such as melon Cucumisin, and x) human TfR (transferrin receptor) 1 and 2." Q#22666 - CGI_10000247 superfamily 243072 80 195 5.82E-33 120.566 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#22666 - CGI_10000247 superfamily 243072 141 262 1.80E-29 110.551 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#22666 - CGI_10000247 superfamily 243072 14 130 1.08E-17 78.1942 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#22666 - CGI_10000247 superfamily 243072 236 300 3.18E-05 41.9855 cl02529 ANK superfamily C - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#22668 - CGI_10006912 superfamily 247724 34 111 1.77E-10 54.8487 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#22672 - CGI_10006916 superfamily 243092 138 469 7.19E-42 151.334 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#22673 - CGI_10006917 superfamily 243035 134 254 2.31E-17 75.3489 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#22673 - CGI_10006917 superfamily 243035 42 94 1.65E-09 53.7578 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. 
CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#22674 - CGI_10006918 superfamily 243035 53 173 5.22E-18 76.1193 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. 
Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. 
A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#22675 - CGI_10006919 superfamily 243035 53 173 1.29E-18 77.6601 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. 
C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#22676 - CGI_10006920 superfamily 243035 52 172 1.50E-17 74.5785 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#22677 - CGI_10006921 superfamily 243035 53 173 3.23E-19 79.2009 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#22678 - CGI_10006922 superfamily 243035 52 172 2.11E-17 74.1933 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. 
CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#22679 - CGI_10008112 superfamily 248097 228 356 3.09E-14 68.0606 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#22680 - CGI_10008113 superfamily 248097 235 363 2.03E-16 74.2238 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. 
Q#22681 - CGI_10008114 superfamily 248097 151 279 4.29E-15 71.9126 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#22681 - CGI_10008114 superfamily 248097 395 523 2.54E-14 69.6014 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#22682 - CGI_10008115 superfamily 248097 109 220 7.83E-14 64.979 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#22684 - CGI_10008117 superfamily 243072 1180 1295 2.27E-21 92.4466 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#22684 - CGI_10008117 superfamily 243037 715 871 7.37E-65 217.973 cl02440 DAGK_acc superfamily - - Diacylglycerol kinase accessory domain; Diacylglycerol (DAG) is a second messenger that acts as a protein kinase C activator. This domain is assumed to be an accessory domain: its function is unknown. Q#22684 - CGI_10008117 superfamily 248019 566 686 2.16E-49 172.866 cl17465 DAGK_cat superfamily - - "Diacylglycerol kinase catalytic domain; Diacylglycerol (DAG) is a second messenger that acts as a protein kinase C activator. The catalytic domain is assumed from the finding of bacterial homologues. YegS is the Escherichia coli protein in this family whose crystal structure reveals an active site in the inter-domain cleft formed by four conserved sequence motifs, revealing a novel metal-binding site. The residues of this site are conserved across the family." 
Q#22684 - CGI_10008117 superfamily 241566 445 503 6.85E-07 48.2195 cl00040 C1 superfamily - - "Protein kinase C conserved region 1 (C1) . Cysteine-rich zinc binding domain. Some members of this domain family bind phorbol esters and diacylglycerol, some are reported to bind RasGTP. May occur in tandem arrangement. Diacylglycerol (DAG) is a second messenger, released by activation of Phospholipase D. Phorbol Esters (PE) can act as analogues of DAG and mimic its downstream effects in, for example, tumor promotion. Protein Kinases C are activated by DAG/PE, this activation is mediated by their N-terminal conserved region (C1). DAG/PE binding may be phospholipid dependent. C1 domains may also mediate DAG/PE signals in chimaerins (a family of Rac GTPase activating proteins), RasGRPs (exchange factors for Ras/Rap1), and Munc13 isoforms (scaffolding proteins involved in exocytosis)." Q#22688 - CGI_10008121 superfamily 248097 9 122 4.51E-26 95.795 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#22689 - CGI_10007757 superfamily 243258 2 193 1.19E-97 284.444 cl02977 Ribosomal_L15e superfamily - - Ribosomal L15; Ribosomal L15. Q#22690 - CGI_10007758 superfamily 213107 12 51 0.00279755 35.3236 cl02594 DD_R_PKA superfamily - - "Dimerization/Docking domain of the Regulatory subunit of cAMP-dependent protein kinase and similar domains; cAMP-dependent protein kinase (PKA) is a serine/threonine kinase (STK), catalyzing the transfer of the gamma-phosphoryl group from ATP to serine/threonine residues on protein substrates. The inactive PKA holoenzyme is a heterotetramer composed of two phosphorylated and active catalytic subunits with a dimer of regulatory (R) subunits. Activation is achieved through the binding of the important second messenger cAMP to the R subunits, which leads to the dissociation of PKA into the R dimer and two active subunits. 
There are two classes of R subunits, RI and RII; each exists as two isoforms (alpha and beta) from distinct genes. These functionally non-redundant R isoforms allow for specificity in PKA signaling. The R subunit contains an N-terminal dimerization/docking (D/D) domain, a linker with an inhibitory sequence (IS), and two c-AMP binding domains. RI and RII subunits are distinguished by their IS; RII subunits contain a phosphorylation site and are both substrates and inhibitors while RI subunits are pseudo-substrates. RI subunits require ATP and Mg ions to form a stable holoenzyme while RII subunits do not. The D/D domain dimerizes to form a four-helix bundle that serves as a docking site for A-kinase-anchoring proteins (AKAPs), which facilitates the localization of PKA to specific sites in the cell. PKA is present ubiquitously in cells and interacts with many different downstream targets. It plays a role in the regulation of diverse processes such as growth, development, memory, metabolism, gene expression, immunity, and lipolysis." Q#22691 - CGI_10007759 superfamily 218811 4 490 9.03E-111 342.802 cl09392 API5 superfamily - - "Apoptosis inhibitory protein 5 (API5); This family consists of apoptosis inhibitory protein 5 (API5) sequences from several organisms. Apoptosis or programmed cell death is a physiological form of cell death that occurs in embryonic development and organ formation. It is characterized by biochemical and morphological changes such as DNA fragmentation and cell volume shrinkage. API5 is an anti apoptosis gene located in human chromosome 11, whose expression prevents the programmed cell death that occurs upon the deprivation of growth factors." Q#22692 - CGI_10007760 superfamily 216167 5 150 1.35E-51 174.312 cl02999 DNA_photolyase superfamily - - DNA photolyase; This domain binds a light harvesting cofactor. 
Q#22693 - CGI_10007761 superfamily 243096 41 213 3.06E-43 152.452 cl02571 RhoGEF superfamily - - Guanine nucleotide exchange factor for Rho/Rac/Cdc42-like GTPases; Also called Dbl-homologous (DH) domain. It appears that PH domains invariably occur C-terminal to RhoGEF/DH domains. Q#22693 - CGI_10007761 superfamily 247725 219 349 1.26E-19 84.9986 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting the protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and other proteins." Q#22694 - CGI_10007762 superfamily 243050 102 156 5.32E-29 103.784 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." 
Q#22694 - CGI_10007762 superfamily 243050 24 72 7.86E-23 87.6064 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#22696 - CGI_10007764 superfamily 243091 655 767 2.11E-12 66.2039 cl02566 SET superfamily - - "SET domain; SET domains are protein lysine methyltransferase enzymes. SET domains appear to be protein-protein interaction domains. It has been demonstrated that SET domains mediate interactions with a family of proteins that display similarity with dual-specificity phosphatases (dsPTPases). A subset of SET domains have been called PR domains. These domains are divergent in sequence from other SET domains, but also appear to mediate protein-protein interaction. The SET domain consists of two regions known as SET-N and SET-C. SET-C forms an unusual and conserved knot-like structure of probably functional importance. Additionally to SET-N and SET-C, an insert region (SET-I) and flanking regions of high structural variability form part of the overall structure." 
Q#22698 - CGI_10007766 superfamily 245816 4 163 1.44E-44 147.03 cl11964 CYTH-like_Pase superfamily - - "CYTH-like (also known as triphosphate tunnel metalloenzyme (TTM)-like) Phosphatases; CYTH-like superfamily enzymes hydrolyze triphosphate-containing substrates and require metal cations as cofactors. They have a unique active site located at the center of an eight-stranded antiparallel beta barrel tunnel (the triphosphate tunnel). The name CYTH originated from the gene designation for bacterial class IV adenylyl cyclases (CyaB), and from thiamine triphosphatase. Class IV adenylate cyclases catalyze the conversion of ATP to 3',5'-cyclic AMP (cAMP) and PPi. Thiamine triphosphatase is a soluble cytosolic enzyme which converts thiamine triphosphate to thiamine diphosphate. This domain superfamily also contains RNA triphosphatases, membrane-associated polyphosphate polymerases, tripolyphosphatases, nucleoside triphosphatases, nucleoside tetraphosphatases and other proteins with unknown functions." Q#22699 - CGI_10007767 superfamily 243015 17 135 1.18E-17 75.3762 cl02381 Tim17 superfamily - - "Tim17/Tim22/Tim23/Pmp24 family; The pre-protein translocase of the mitochondrial outer membrane (Tom) allows the import of pre-proteins from the cytoplasm. Tom forms a complex with a number of proteins, including Tim17. Tim17 and Tim23 are thought to form the translocation channel of the inner membrane. This family includes Tim17, Tim22 and Tim23. This family also includes Pmp24 a peroxisomal protein. The involvement of this domain in the targeting of PMP24 remains to be proved. PMP24 was known as Pmp27 in." Q#22700 - CGI_10007768 superfamily 245819 483 645 1.36E-59 200.498 cl11967 Nucleotidyl_cyc_III superfamily - - "Class III nucleotidyl cyclases; Class III nucleotidyl cyclases are the largest, most diverse group of nucleotidyl cyclases (NC's) containing prokaryotic and eukaryotic proteins. 
They can be divided into two major groups; the mononucleotidyl cyclases (MNC's) and the diguanylate cyclases (DGC's). The MNC's, which include the adenylate cyclases (AC's) and the guanylate cyclases (GC's), have a conserved cyclase homology domain (CHD), while the DGC's have a conserved GGDEF domain, named after a conserved motif within this subgroup. Their products, cyclic guanylyl and adenylyl nucleotides, are second messengers that play important roles in eukaryotic signal transduction and prokaryotic sensory pathways." Q#22700 - CGI_10007768 superfamily 219526 223 468 2.29E-70 231.354 cl06648 HNOBA superfamily - - "Heme NO binding associated; The HNOBA domain is found associated with the HNOB domain and pfam00211 in soluble cyclases and signalling proteins. The HNOB domain is predicted to function as a heme-dependent sensor for gaseous ligands, and transduce diverse downstream signals, in both bacteria and animals." Q#22700 - CGI_10007768 superfamily 203730 25 189 3.94E-63 210.203 cl18246 HNOB superfamily - - "Heme NO binding; The HNOB (Heme NO Binding) domain, is a predominantly alpha-helical domain and binds heme via a covalent linkage to histidine. The HNOB domain is predicted to function as a heme-dependent sensor for gaseous ligands, and transduce diverse downstream signals, in both bacteria and animals." Q#22701 - CGI_10007769 superfamily 248097 130 263 1.76E-16 74.9942 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#22701 - CGI_10007769 superfamily 248097 271 404 1.59E-14 69.6014 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#22701 - CGI_10007769 superfamily 218915 27 61 0.00222185 36.1219 cl15704 DUF972 superfamily NC - Protein of unknown function (DUF972); This family consists of several hypothetical bacterial sequences. The function of this family is unknown. 
Q#22703 - CGI_10021566 superfamily 245205 246 309 0.000228456 39.1046 cl09930 RPA_2b-aaRSs_OBF_like superfamily N - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." Q#22704 - CGI_10021567 superfamily 208843 84 254 1.51E-107 324.056 cl08275 RHD-n superfamily - - "N-terminal sub-domain of the Rel homology domain (RHD); Proteins containing the Rel homology domain (RHD) are metazoan transcription factors. The RHD is composed of two structural sub-domains; this model characterizes the N-terminal sub-domain, which may be distantly related to the DNA-binding domain found in P53. The C-terminal sub-domain has an immunoglobulin-like fold and serves as a dimerization module that also binds DNA (see cd00102). 
The RHD is found in NF-kappa B, nuclear factor of activated T-cells (NFAT), the tonicity-responsive enhancer binding protein (TonEBP), and the arthropod proteins Dorsal and Relish (Rel)." Q#22704 - CGI_10021567 superfamily 247038 259 361 9.71E-57 187.911 cl15674 IPT superfamily - - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated T-cells (NFAT) transcription factors which are mainly monomers." Q#22705 - CGI_10021568 superfamily 245201 75 407 0 682.662 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#22705 - CGI_10021568 superfamily 247725 1095 1228 6.81E-71 235.656 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. 
They are generally involved in targeting the protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and other proteins." Q#22705 - CGI_10021568 superfamily 241566 1036 1085 3.07E-14 69.8283 cl00040 C1 superfamily - - "Protein kinase C conserved region 1 (C1) . Cysteine-rich zinc binding domain. Some members of this domain family bind phorbol esters and diacylglycerol, some are reported to bind RasGTP. May occur in tandem arrangement. Diacylglycerol (DAG) is a second messenger, released by activation of Phospholipase D. Phorbol Esters (PE) can act as analogues of DAG and mimic its downstream effects in, for example, tumor promotion. Protein Kinases C are activated by DAG/PE, this activation is mediated by their N-terminal conserved region (C1). DAG/PE binding may be phospholipid dependent. C1 domains may also mediate DAG/PE signals in chimaerins (a family of Rac GTPase activating proteins), RasGRPs (exchange factors for Ras/Rap1), and Munc13 isoforms (scaffolding proteins involved in exocytosis)." Q#22705 - CGI_10021568 superfamily 243036 1257 1525 1.86E-54 193.223 cl02434 CNH superfamily - - "CNH domain; Domain found in NIK1-like kinase, mouse citron and yeast ROM1, ROM2. Unpublished observations." Q#22705 - CGI_10021568 superfamily 241620 1591 1616 1.36E-05 44.5052 cl00113 CRIB superfamily C - "PAK (p21 activated kinase) Binding Domain (PBD), binds Cdc42p- and/or Rho-like small GTPases; also known as the Cdc42/Rac interactive binding (CRIB) motif; has been shown to inhibit transcriptional activation and cell transformation mediated by the Ras-Rac pathway. CRIB-containing effector proteins are functionally diverse and include serine/threonine kinases, tyrosine kinases, actin-binding proteins, and adapter molecules." 
Q#22705 - CGI_10021568 superfamily 243054 484 652 0.0023925 40.1216 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#22706 - CGI_10021569 superfamily 243077 10 39 0.00389708 37.51 cl02542 DnaJ superfamily C - "DnaJ domain or J-domain. DnaJ/Hsp40 (heat shock protein 40) proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperonine family. Hsp40 proteins are characterized by the presence of a J domain, which mediates the interaction with Hsp70. They may contain other domains as well, and the architectures provide a means of classification." Q#22707 - CGI_10021570 superfamily 247903 2 99 9.03E-29 104.686 cl17349 Peptidase_M54 superfamily N - "Peptidase family M54, also called archaemetzincins or archaelysins; Peptidase M54 (archaemetzincin or archaelysin) is a zinc-dependent aminopeptidase that contains the consensus zinc-binding sequence HEXXHXXGXXH/D and a conserved Met residue at the active site, and is thus classified as a metzincin. Archaemetzincins, first identified in archaea, are also found in bacteria and eukaryotes, including two human members, archaemetzincin-1 and -2 (AMZ1 and AMZ2). 
AMZ1 is mainly found in the liver and heart while AMZ2 is primarily expressed in testis and heart; both have been reported to degrade synthetic substrates and peptides. The Peptidase M54 family contains an extended metzincin consensus sequence of HEXXHXXGX3CX4CXMX17CXXC such that a second zinc ion is bound to four cysteines, thus resembling a zinc finger. Phylogenetic analysis of this family reveals a complex evolutionary process involving a series of lateral gene transfer, gene loss and genetic duplication events." Q#22711 - CGI_10021574 superfamily 244547 602 676 2.08E-08 52.5841 cl06893 UME superfamily - - "UME (NUC010) domain; This domain is characteristic of UVSB PI-3 kinase, MEI-41 and ESR1." Q#22712 - CGI_10021575 superfamily 217473 96 326 1.95E-29 115.925 cl03978 Mab-21 superfamily - - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#22713 - CGI_10021576 superfamily 241599 514 564 1.05E-08 52.2457 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#22713 - CGI_10021576 superfamily 241599 425 476 1.11E-07 49.5493 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#22715 - CGI_10021578 superfamily 217473 76 133 4.63E-07 48.9005 cl03978 Mab-21 superfamily N - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. 
Q#22717 - CGI_10021580 superfamily 246597 179 482 0 625.042 cl13995 MPP_superfamily superfamily - - "metallophosphatase superfamily, metallophosphatase domain; Metallophosphatases (MPPs), also known as metallophosphoesterases, phosphodiesterases (PDEs), binuclear metallophosphoesterases, and dimetal-containing phosphoesterases (DMPs), represent a diverse superfamily of enzymes with a conserved domain containing an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. This superfamily includes: the phosphoprotein phosphatases (PPPs), Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination." 
Q#22717 - CGI_10021580 superfamily 243034 41 131 5.45E-21 87.8207 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#22721 - CGI_10021584 superfamily 218087 55 91 0.000935533 36.2935 cl18440 SWIM superfamily - - "SWIM zinc finger; This domain is found in bacterial, archaeal and eukaryotic proteins. It is predicted to be organised into two N-terminal beta-strands and a C-terminal alpha helix, thus possibly adopting a fold similar to that of the C2H2 zinc finger (pfam00096). SWIM is thought to be a versatile domain that can interact with DNA or proteins in different contexts." 
Q#22723 - CGI_10021586 superfamily 247856 137 192 6.25E-18 74.8917 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#22723 - CGI_10021586 superfamily 247856 62 124 2.86E-07 45.6165 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#22727 - CGI_10021590 superfamily 220533 217 684 3.57E-115 362.042 cl12375 Dpy19 superfamily N - "Q-cell neuroblast polarisation; Dpy-19, formerly known as DUF2211, is a transmembrane domain family that is required to orient the neuroblast cells, QR and QL accurately on the anterior-posterior axis: QL and QR are born in the same anterior-posterior position, but polarise and migrate left-right asymmetrically, QL migrating towards the posterior and QR migrating towards the anterior. It is also required, with unc-40, to express mab-5 correctly in the Q cell descendants. The Dpy-19 protein derives from the C. elegans DUMPY mutant." Q#22730 - CGI_10013987 superfamily 246664 543 944 9.35E-160 477.837 cl14561 An_peroxidase_like superfamily - - "Animal heme peroxidases and related proteins; A diverse family of enzymes, which includes prostaglandin G/H synthase, thyroid peroxidase, myeloperoxidase, linoleate diol synthase, lactoperoxidase, peroxinectin, peroxidasin, and others. 
Despite its name, this family is not restricted to metazoans: members are found in fungi, plants, and bacteria as well." Q#22730 - CGI_10013987 superfamily 246664 4 311 3.80E-138 421.598 cl14561 An_peroxidase_like superfamily N - "Animal heme peroxidases and related proteins; A diverse family of enzymes, which includes prostaglandin G/H synthase, thyroid peroxidase, myeloperoxidase, linoleate diol synthase, lactoperoxidase, peroxinectin, peroxidasin, and others. Despite its name, this family is not restricted to metazoans: members are found in fungi, plants, and bacteria as well." Q#22730 - CGI_10013987 superfamily 246664 438 480 1.25E-05 47.305 cl14561 An_peroxidase_like superfamily C - "Animal heme peroxidases and related proteins; A diverse family of enzymes, which includes prostaglandin G/H synthase, thyroid peroxidase, myeloperoxidase, linoleate diol synthase, lactoperoxidase, peroxinectin, peroxidasin, and others. Despite its name, this family is not restricted to metazoans: members are found in fungi, plants, and bacteria as well." Q#22732 - CGI_10013989 superfamily 198845 210 295 1.53E-19 81.5556 cl04394 BRICHOS superfamily - - "BRICHOS domain; The BRICHOS domain is about 100 amino acids long. It is found in a variety of proteins implicated in dementia, respiratory distress and cancer. Its exact function is unknown; roles that have been proposed for it include (a) in targeting of the protein to the secretory pathway, (b) intramolecular chaperone-like function, and (c) assisting the specialised intracellular protease processing system. This C-terminal domain is embedded in the endoplasmic reticulum lumen, and binds to the N-terminal, transmembrane, SP_C, pfam08999, provided that it is in non-helical conformation. Thus the Brichos domain of proSP-C is a chaperone that induces alpha-helix formation of an aggregation-prone TM region." 
Q#22734 - CGI_10013991 superfamily 245847 1476 1625 4.59E-27 110.52 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#22734 - CGI_10013991 superfamily 245213 1845 1882 0.000372607 41.083 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#22734 - CGI_10013991 superfamily 147730 2247 2437 5.25E-46 167.589 cl05347 TSP_C superfamily - - Thrombospondin C-terminal region; This region is found at the C-terminus of thrombospondin and related proteins. Q#22734 - CGI_10013991 superfamily 241611 2439 2590 6.79E-15 75.1176 cl00102 PTX superfamily - - "Pentraxins are plasma proteins characterized by their pentameric discoid assembly and their Ca2+ dependent ligand binding, such as Serum amyloid P component (SAP) and C-reactive Protein (CRP), which are cytokine-inducible acute-phase proteins implicated in innate immunity. CRP binds to ligands containing phosphocholine, SAP binds to amyloid fibrils, DNA, chromatin, fibronectin, C4-binding proteins and glycosaminoglycans. "Long" pentraxins have N-terminal extensions to the common pentraxin domain; one group, the neuronal pentraxins, may be involved in synapse formation and remodeling, and they may also be able to form heteromultimers." 
Q#22734 - CGI_10013991 superfamily 219525 1737 1780 2.37E-08 53.5769 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#22734 - CGI_10013991 superfamily 202235 1998 2032 5.33E-08 52.0023 cl15981 TSP_3 superfamily - - Thrombospondin type 3 repeat; The thrombospondin repeat is a short aspartate rich repeat which binds to calcium ions. The repeat was initially identified in thrombospondin proteins that contained 7 of these repeats. The repeat lacks defined secondary structure. Q#22734 - CGI_10013991 superfamily 205157 1926 1962 2.81E-07 50.2287 cl18264 EGF_3 superfamily - - EGF domain; This family includes a variety of EGF-like domain homologues. This family includes the C-terminal domain of the malaria parasite MSP1 protein. Q#22734 - CGI_10013991 superfamily 244965 1051 1173 5.71E-07 50.0109 cl08459 PA14 superfamily - - "PA14 domain; This domain forms an insert in bacterial beta-glucosidases and is found in other glycosidases, glycosyltransferases, proteases, amidases, yeast adhesins, and bacterial toxins, including anthrax protective antigen (PA). The domain also occurs in a Dictyostelium prespore-cell-inducing factor Psi and in fibrocystin, the mammalian protein whose mutation leads to polycystic kidney and hepatic disease. The crystal structure of PA shows that this domain (named PA14 after its location in the PA20 pro-peptide) has a beta-barrel structure. The PA14 domain sequence suggests a binding function, rather than a catalytic role. The PA14 domain distribution is compatible with carbohydrate binding." Q#22734 - CGI_10013991 superfamily 202235 2191 2223 5.05E-06 46.2243 cl15981 TSP_3 superfamily - - Thrombospondin type 3 repeat; The thrombospondin repeat is a short aspartate rich repeat which binds to calcium ions. The repeat was initially identified in thrombospondin proteins that contained 7 of these repeats. The repeat lacks defined secondary structure. 
Q#22734 - CGI_10013991 superfamily 202235 2118 2151 1.34E-05 45.0687 cl15981 TSP_3 superfamily - - Thrombospondin type 3 repeat; The thrombospondin repeat is a short aspartate rich repeat which binds to calcium ions. The repeat was initially identified in thrombospondin proteins that contained 7 of these repeats. The repeat lacks defined secondary structure. Q#22734 - CGI_10013991 superfamily 246925 550 824 3.54E-05 46.965 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#22734 - CGI_10013991 superfamily 202235 2162 2189 6.42E-05 43.1427 cl15981 TSP_3 superfamily - - Thrombospondin type 3 repeat; The thrombospondin repeat is a short aspartate rich repeat which binds to calcium ions. The repeat was initially identified in thrombospondin proteins that contained 7 of these repeats. The repeat lacks defined secondary structure. Q#22734 - CGI_10013991 superfamily 247068 1366 1436 0.00128044 39.6403 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. 
Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#22738 - CGI_10013995 superfamily 245596 22 166 1.51E-65 202.042 cl11394 Glyco_tranf_GTA_type superfamily C - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#22740 - CGI_10013997 superfamily 247097 18 51 0.00966308 29.2697 cl15839 ShK superfamily - - ShK domain-like; This domain is found in several C. elegans proteins. The domain is 30 amino acids long and rich in cysteine residues. There are 6 conserved cysteine positions in the domain that form three disulphide bridges. The domain is found in the potassium channel inhibitor ShK in sea anemone. Q#22741 - CGI_10013998 superfamily 247097 30 63 0.00168174 34.6625 cl15839 ShK superfamily - - ShK domain-like; This domain is found in several C. elegans proteins.
The domain is 30 amino acids long and rich in cysteine residues. There are 6 conserved cysteine positions in the domain that form three disulphide bridges. The domain is found in the potassium channel inhibitor ShK in sea anemone. Q#22741 - CGI_10013998 superfamily 220611 98 146 0.0052066 34.4427 cl10864 Laps superfamily NC - Learning-associated protein; This is a family of 121-amino acid secretory proteins. Laps functions in the regulation of neuronal cell adhesion and/or movement and synapse attachment. Laps binds to the ApC/EBP (Aplysia CCAAT/enhancer binding protein) promoter and activates the transcription of ApC/EBP mRNA. Q#22742 - CGI_10013999 superfamily 247097 56 92 0.00594732 33.9638 cl15839 ShK superfamily - - ShK domain-like; This domain is found in several C. elegans proteins. The domain is 30 amino acids long and rich in cysteine residues. There are 6 conserved cysteine positions in the domain that form three disulphide bridges. The domain is found in the potassium channel inhibitor ShK in sea anemone. Q#22743 - CGI_10014000 superfamily 245319 5 48 4.70E-12 57.9961 cl10505 CBF superfamily N - CBF/Mak21 family; CBF/Mak21 family. Q#22744 - CGI_10014001 superfamily 245319 95 148 6.15E-17 74.5597 cl10505 CBF superfamily NC - CBF/Mak21 family; CBF/Mak21 family. Q#22745 - CGI_10013345 superfamily 243066 4 102 2.17E-20 84.2061 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer.
The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#22745 - CGI_10013345 superfamily 198867 112 220 1.31E-09 54.6548 cl06652 BACK superfamily - - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation)." Q#22747 - CGI_10013347 superfamily 247787 19 295 0 529.108 cl17233 RecA-like_NTPases superfamily - - "RecA-like NTPases. This family includes the NTP binding domain of F1 and V1 H+ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. This group also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion." Q#22747 - CGI_10013347 superfamily 215848 304 406 1.46E-32 119.307 cl08258 ATP-synt_ab_C superfamily - - "ATP synthase alpha/beta chain, C terminal domain; ATP synthase alpha/beta chain, C terminal domain. " Q#22748 - CGI_10013348 superfamily 243072 400 525 1.55E-32 123.263 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#22748 - CGI_10013348 superfamily 243072 467 591 7.41E-32 121.337 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#22748 - CGI_10013348 superfamily 243072 532 655 1.21E-23 97.8394 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#22749 - CGI_10013349 superfamily 216101 88 601 1.41E-120 372.396 cl08288 Carn_acyltransf superfamily - - Choline/Carnitine o-acyltransferase; Choline/Carnitine o-acyltransferase. Q#22750 - CGI_10013350 superfamily 247804 506 551 0.000101813 40.6366 cl17250 SANT superfamily - - "'SWI3, ADA2, N-CoR and TFIIIB' DNA-binding domains. Tandem copies of the domain bind telomeric DNA tandem repeats as part of the capping complex. Binding is sequence dependent for repeats which contain the G/C rich motif [C2-3 A (CA)1-6]. The domain is also found in regulatory transcriptional repressor complexes where it also binds DNA." Q#22758 - CGI_10005012 superfamily 115579 78 124 0.000586607 38.6003 cl06133 SSP160 superfamily N - Special lobe-specific silk protein SSP160; This family consists of several special lobe-specific silk protein SSP160 sequences which appear to be specific to Chironomus (Midge) species.
Q#22760 - CGI_10000904 superfamily 247038 30 74 4.93E-11 56.5551 cl15674 IPT superfamily C - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated T cells (NFAT) transcription factors which are mainly monomers." Q#22761 - CGI_10004935 superfamily 243072 222 319 5.71E-22 88.9798 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#22761 - CGI_10004935 superfamily 207800 13 142 7.48E-29 108.656 cl02972 Ephrin_RBD superfamily - - "Receptor Binding Domain of Ephrins; Ephrins and their receptors EphR play an important role in cell communication in normal physiology, as well as in disease pathogenesis. Binding of the ephrin ligand to EphR requires cell-cell contact, since both molecules are anchored to the plasma membrane. The resulting downstream signals occur bidirectionally in both EphR-expressing cells (forward signaling, depending on Eph kinase activity) and ephrin-expressing cells (reverse signaling). Eph signaling controls cell morphology, adhesion, migration and invasion.
Ephrins can be subdivided into 2 groups, A and B, depending on their respective receptors EphA or EphB. The nine human EphA receptors bind to five GPI-linked ephrin-A ligands and the five EphB receptors bind to three transmembrane ephrin-B ligands. Interactions are promiscuous within each class, and some Eph receptors can also bind to ephrins of the other class. All Ephrins contain a highly conserved extracellular receptor binding domain, which is characterized by this domain hierarchy." Q#22762 - CGI_10004936 superfamily 246953 40 406 3.27E-179 507.566 cl15414 V-ATPase_C superfamily - - V-ATPase subunit C; V-ATPase subunit C. Q#22763 - CGI_10004937 superfamily 246925 44 179 0.00125821 39.6462 cl15309 LRR_RI superfamily NC - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#22765 - CGI_10004939 superfamily 247068 316 391 3.69E-07 48.4638 cl15786 CA_like superfamily N - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#22765 - CGI_10004939 superfamily 247068 399 491 3.93E-06 45.3822 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#22765 - CGI_10004939 superfamily 247068 33 127 0.000532502 38.8338 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#22765 - CGI_10004939 superfamily 247068 197 286 0.000688666 38.4486 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#22765 - CGI_10004939 superfamily 247068 125 181 0.00961028 34.9818 cl15786 CA_like superfamily N - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#22767 - CGI_10008416 superfamily 248312 22 221 3.08E-06 45.0369 cl17758 PMP22_Claudin superfamily - - PMP-22/EMP/MP20/Claudin family; PMP-22/EMP/MP20/Claudin family. Q#22769 - CGI_10008418 superfamily 248312 20 210 1.41E-11 59.6745 cl17758 PMP22_Claudin superfamily - - PMP-22/EMP/MP20/Claudin family; PMP-22/EMP/MP20/Claudin family. Q#22770 - CGI_10008419 superfamily 243073 377 411 4.69E-07 46.3093 cl02533 SOCS superfamily - - "SOCS (suppressors of cytokine signaling) box. 
The SOCS box is found in the C-terminal region of CIS/SOCS family proteins (in combination with a SH2 domain), ASBs (ankyrin repeat-containing proteins with a SOCS box), SSBs (SPRY domain-containing proteins with a SOCS box), and WSBs (WD40 repeat-containing proteins with a SOCS box), as well as, other miscellaneous proteins. The function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions." Q#22771 - CGI_10008420 superfamily 241737 27 194 1.07E-69 212.002 cl00264 Ferritin_like superfamily - - "Ferritin-like superfamily of diiron-containing four-helix-bundle proteins; Ferritin-like, diiron-carboxylate proteins participate in a range of functions including iron regulation, mono-oxygenation, and reactive radical production. These proteins are characterized by the fact that they catalyze dioxygen-dependent oxidation-hydroxylation reactions within diiron centers; one exception is manganese catalase, which catalyzes peroxide-dependent oxidation-reduction within a dimanganese center. Diiron-carboxylate proteins are further characterized by the presence of duplicate metal ligands, glutamates and histidines (ExxH) and two additional glutamates within a four-helix bundle. Outside of these conserved residues there is little obvious homology. Members include bacterioferritin, ferritin, rubrerythrin, aromatic and alkene monooxygenase hydroxylases (AAMH), ribonucleotide reductase R2 (RNRR2), acyl-ACP-desaturases (Acyl_ACP_Desat), manganese (Mn) catalases, demethoxyubiquinone hydroxylases (DMQH), DNA protecting proteins (DPS), and ubiquinol oxidases (AOX), and the aerobic cyclase system, Fe-containing subunit (ACSF)." 
Q#22772 - CGI_10008421 superfamily 241550 8 155 1.26E-95 282.919 cl00015 nt_trans superfamily - - "nucleotidyl transferase superfamily; nt_trans (nucleotidyl transferase) This superfamily includes the class I amino-acyl tRNA synthetases, pantothenate synthetase (PanC), ATP sulfurylase, and the cytidylyltransferases, all of which have a conserved dinucleotide-binding domain." Q#22772 - CGI_10008421 superfamily 241550 204 313 6.55E-57 183.614 cl00015 nt_trans superfamily N - "nucleotidyl transferase superfamily; nt_trans (nucleotidyl transferase) This superfamily includes the class I amino-acyl tRNA synthetases, pantothenate synthetase (PanC), ATP sulfurylase, and the cytidylyltransferases, all of which have a conserved dinucleotide-binding domain." Q#22773 - CGI_10008422 superfamily 248011 501 581 0.00304856 39.017 cl17457 PKD superfamily - - "polycystic kidney disease I (PKD) domain; similar to other cell-surface modules, with an IG-like fold; domain probably functions as a ligand binding site in protein-protein or protein-carbohydrate interactions; a single instance of the repeat is presented here. The domain is also found in microbial collagenases and chitinases." Q#22773 - CGI_10008422 superfamily 241546 2277 2396 9.18E-44 158.593 cl00011 PLAT superfamily - - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats. The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#22773 - CGI_10008422 superfamily 243061 299 394 1.62E-24 102.035 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains.
These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#22774 - CGI_10000281 superfamily 147296 3 123 8.47E-35 121.168 cl04901 Cytochrom_B558a superfamily C - Cytochrome b558 alpha-subunit; Cytochrome b-245 light chain (p22-phox) is one of the key electron transfer elements of the NADPH oxidase in phagocytes. Q#22775 - CGI_10011351 superfamily 241563 25 58 5.68E-06 44.0072 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#22776 - CGI_10011352 superfamily 241563 62 104 7.16E-07 46.7036 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#22779 - CGI_10011355 superfamily 247098 385 510 4.45E-62 201.6 cl15841 COG0229 superfamily - - "Conserved domain frequently associated with peptide methionine sulfoxide reductase [Posttranslational modification, protein turnover, chaperones]" Q#22779 - CGI_10011355 superfamily 222150 258 283 0.000460298 38.1417 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#22779 - CGI_10011355 superfamily 222150 202 227 0.00362949 35.4453 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#22779 - CGI_10011355 superfamily 222150 286 309 0.00760343 34.6749 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain.
Q#22780 - CGI_10011356 superfamily 243035 2 59 1.12E-10 52.237 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations.
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#22783 - CGI_10011359 superfamily 245716 1309 1329 0.00349481 37.2237 cl11592 zf-CCCH superfamily N - Zinc finger C-x8-C-x5-C-x3-H type (and similar); Zinc finger C-x8-C-x5-C-x3-H type (and similar). Q#22783 - CGI_10011359 superfamily 245716 1284 1304 0.00935467 35.6829 cl11592 zf-CCCH superfamily - - Zinc finger C-x8-C-x5-C-x3-H type (and similar); Zinc finger C-x8-C-x5-C-x3-H type (and similar). Q#22784 - CGI_10011360 superfamily 241822 127 192 1.22E-12 60.3497 cl00373 Ribosomal_S18 superfamily - - Ribosomal protein S18; Ribosomal protein S18. Q#22785 - CGI_10001566 superfamily 247683 25 88 1.58E-10 61.7276 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. 
SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#22786 - CGI_10001567 superfamily 151110 1 230 2.68E-66 207.051 cl11201 UPF0552 superfamily - - Uncharacterized protein family UPF0552; This family of proteins has no known function. Q#22788 - CGI_10004756 superfamily 243555 19 204 4.81E-06 47.771 cl03871 Chitin_bind_3 superfamily - - "Chitin binding domain; This domain is found associated with a wide variety of cellulose binding domain. This domain however is a chitin binding domain. This domain is found in isolation in baculoviral spheroidins and spindolins, protein of unknown function." Q#22789 - CGI_10004757 superfamily 245814 173 196 2.63E-05 40.0825 cl11960 Ig superfamily N - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#22791 - CGI_10000895 superfamily 241750 69 176 4.07E-25 98.805 cl00281 metallo-dependent_hydrolases superfamily C - "Superfamily of metallo-dependent hydrolases (also called amidohydrolase superfamily) is a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. 
The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. The family includes urease alpha, adenosine deaminase, phosphotriesterase dihydroorotases, allantoinases, hydantoinases, AMP-, adenine and cytosine deaminases, imidazolonepropionase, aryldialkylphosphatase, chlorohydrolases, formylmethanofuran dehydrogenases and others." Q#22791 - CGI_10000895 superfamily 245874 2 49 0.00171724 35.0946 cl12111 TNFR superfamily C - "Tumor necrosis factor receptor (TNFR) domain; superfamily of TNF-like receptor domains. When bound to TNF-like cytokines, TNFRs trigger multiple signal transduction pathways, they are involved in inflammation response, apoptosis, autoimmunity and organogenesis. TNFRs domains are elongated with generally three tandem repeats of cysteine-rich domains (CRDs). They fit in the grooves between protomers within the ligand trimer. Some TNFRs, such as NGFR and HveA, bind ligands with no structural similarity to TNF and do not bind ligand trimers." Q#22793 - CGI_10002237 superfamily 241563 68 109 7.22E-07 46.7036 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#22794 - CGI_10002238 superfamily 241563 68 109 5.07E-07 47.0888 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#22795 - CGI_10002239 superfamily 241563 68 109 4.41E-07 47.0888 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#22795 - CGI_10002239 superfamily 241563 28 59 0.00178565 36.6884 cl00034 BBOX superfamily N - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#22796 - CGI_10013494 superfamily 247724 426 587 1.08E-73 235.043 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. 
Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#22796 - CGI_10013494 superfamily 247856 23 79 4.38E-05 41.7645 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#22798 - CGI_10013496 superfamily 245013 89 139 0.0067352 33.0025 cl09115 Ribosomal_L32p superfamily C - Ribosomal L32p protein family; Ribosomal L32p protein family. Q#22799 - CGI_10013497 superfamily 241884 5 232 2.18E-153 428.664 cl00467 Ntn_hydrolase superfamily - - "The Ntn hydrolases (N-terminal nucleophile) are a diverse superfamily of enzymes that are activated autocatalytically via an N-terminally located nucleophilic amino acid. N-terminal nucleophile (NTN-) hydrolase superfamily, which contains a four-layered alpha, beta, beta, alpha core structure. This family of hydrolases includes penicillin acylase, the 20S proteasome alpha and beta subunits, and glutamate synthase. The mechanism of activation of these proteins is conserved, although they differ in their substrate specificities. All known members catalyze the hydrolysis of amide bonds in either proteins or small molecules, and each one of them is synthesized as a preprotein. For each, an autocatalytic endoproteolytic process generates a new N-terminal residue. This mature N-terminal residue is central to catalysis and acts as both a polarizing base and a nucleophile during the reaction. The N-terminal amino group acts as the proton acceptor and activates either the nucleophilic hydroxyl in a Ser or Thr residue or the nucleophilic thiol in a Cys residue. 
The position of the N-terminal nucleophile in the active site and the mechanism of catalysis are conserved in this family, despite considerable variation in the protein sequences." Q#22801 - CGI_10013499 superfamily 241867 8 329 1.24E-96 293.353 cl00446 Lactamase_B superfamily - - Metallo-beta-lactamase superfamily; Metallo-beta-lactamase superfamily. Q#22803 - CGI_10013501 superfamily 241568 109 162 0.00117666 35.9016 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#22804 - CGI_10013502 superfamily 241600 341 558 9.93E-62 204.397 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." 
Q#22804 - CGI_10013502 superfamily 243035 166 238 9.37E-05 41.0662 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#22804 - CGI_10013502 superfamily 243035 114 162 0.00264732 36.829 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. 
CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#22805 - CGI_10013503 superfamily 241874 47 618 0 726.666 cl00456 SLC5-6-like_sbd superfamily - - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. 
SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#22810 - CGI_10011516 superfamily 241574 12 162 2.05E-49 167.763 cl00053 PTPc superfamily N - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#22810 - CGI_10011516 superfamily 241574 247 366 2.26E-08 52.9733 cl00053 PTPc superfamily N - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. 
Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#22811 - CGI_10011517 superfamily 243034 48 127 1.79E-07 48.1452 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#22814 - CGI_10011520 superfamily 218219 15 164 9.91E-41 137.836 cl04693 PRELI superfamily - - "PRELI-like family; This family includes a conserved region found in the PRELI protein and yeast YLR168C gene MSF1 product. The function of this protein is unknown, though it is thought to be involved in intra-mitochondrial protein sorting. This region is also found in a number of other eukaryotic proteins." 
Q#22815 - CGI_10011521 superfamily 241677 464 612 5.27E-100 303.614 cl00197 cyclophilin superfamily - - "cyclophilin: cyclophilin-type peptidylprolyl cis- trans isomerases. This family contains eukaryotic, bacterial and archaeal proteins which exhibit a peptidylprolyl cis- trans isomerases activity (PPIase, Rotamase) and in addition bind the immunosuppressive drug cyclosporin (CsA). Immunosuppression in vertebrates is believed to be the result of the cyclophilin A-cyclosporin protein drug complex binding to and inhibiting the protein-phosphatase calcineurin. PPIase is an enzyme which accelerates protein folding by catalyzing the cis-trans isomerization of the peptide bonds preceding proline residues. Cyclophilins are a diverse family in terms of function and have been implicated in protein folding processes which depend on catalytic /chaperone-like activities. This group contains human cyclophilin 40, a co-chaperone of the hsp90 chaperone system; human cyclophilin A, a chaperone in the HIV-1 infectious process and; human cyclophilin H, a component of the U4/U6 snRNP, whose isomerization or chaperoning activities may play a role in RNA splicing." 
Q#22815 - CGI_10011521 superfamily 243092 57 219 4.53E-09 56.5744 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#22817 - CGI_10011523 superfamily 245201 223 528 0 597.151 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." 
Q#22817 - CGI_10011523 superfamily 241622 779 867 1.27E-10 59.5756 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#22817 - CGI_10011523 superfamily 204097 30 190 5.24E-61 210.882 cl07500 DUF1908 superfamily N - Domain of unknown function (DUF1908); This domain is found in a set of hypothetical/structural eukaryotic proteins. Q#22818 - CGI_10011524 superfamily 247068 220 317 2.41E-24 97.7693 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#22818 - CGI_10011524 superfamily 247068 326 406 5.40E-10 56.553 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#22818 - CGI_10011524 superfamily 247068 104 210 4.54E-06 44.997 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#22818 - CGI_10011524 superfamily 216897 454 532 2.21E-23 94.2852 cl03463 Gal_Lectin superfamily - - Galactose binding lectin domain; Galactose binding lectin domain. Q#22818 - CGI_10011524 superfamily 247068 25 49 0.000937609 37.7143 cl15786 CA_like superfamily N - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#22820 - CGI_10011526 superfamily 247068 175 275 7.49E-20 86.5985 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." 
Q#22820 - CGI_10011526 superfamily 247068 395 484 1.24E-16 76.9685 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#22820 - CGI_10011526 superfamily 247068 74 167 1.38E-16 76.9685 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#22820 - CGI_10011526 superfamily 247068 507 612 3.39E-13 66.9533 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#22820 - CGI_10011526 superfamily 247068 286 380 1.74E-12 65.0273 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#22821 - CGI_10011528 superfamily 247068 517 615 3.67E-23 96.9989 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#22821 - CGI_10011528 superfamily 247068 428 509 3.65E-17 79.6649 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#22821 - CGI_10011528 superfamily 247068 741 837 1.72E-13 68.8793 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#22821 - CGI_10011528 superfamily 247068 1182 1263 1.45E-12 66.1829 cl15786 CA_like superfamily C - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#22821 - CGI_10011528 superfamily 247068 1277 1353 2.07E-12 65.7977 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#22821 - CGI_10011528 superfamily 247068 324 399 6.31E-12 64.2569 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#22821 - CGI_10011528 superfamily 247068 899 967 1.30E-11 63.4865 cl15786 CA_like superfamily C - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#22821 - CGI_10011528 superfamily 247068 626 732 6.12E-07 49.6194 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#22821 - CGI_10011528 superfamily 247068 269 313 3.15E-06 47.3082 cl15786 CA_like superfamily N - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#22821 - CGI_10011528 superfamily 247068 815 888 0.000501615 40.4107 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#22821 - CGI_10011528 superfamily 247068 59 131 0.00452775 37.3291 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#22823 - CGI_10006408 superfamily 197732 819 852 5.01E-07 47.6323 cl18195 ZnF_U1 superfamily - - "U1-like zinc finger; Family of C2H2-type zinc fingers, present in matrin, U1 small nuclear ribonucleoprotein C and other RNA-binding proteins." Q#22824 - CGI_10006409 superfamily 151853 319 401 1.18E-21 88.991 cl12943 Suppressor_APC superfamily - - "Adenomatous polyposis coli tumour suppressor protein; The tumour suppressor protein, APC, has a nuclear export activity as well as many different intracellular functions. The structure consists of three alpha-helices forming two separate antiparallel coiled coils." Q#22825 - CGI_10006410 superfamily 222150 35 60 0.000574582 37.7565 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#22825 - CGI_10006410 superfamily 222150 150 173 0.00186678 36.2157 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#22826 - CGI_10006411 superfamily 112485 19 65 2.55E-07 43.016 cl04205 UPF0184 superfamily N - Uncharacterized protein family (UPF0184); Uncharacterised protein family (UPF0184). Q#22827 - CGI_10006412 superfamily 247805 39 241 9.24E-72 226.597 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 
Q#22827 - CGI_10006412 superfamily 247905 256 389 1.39E-30 114.257 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#22828 - CGI_10006413 superfamily 247805 46 248 5.76E-84 260.11 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#22828 - CGI_10006413 superfamily 247905 263 389 2.32E-35 128.51 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#22830 - CGI_10006415 superfamily 241703 156 431 6.26E-94 288.003 cl00226 nuc_hydro superfamily - - "nuc_hydro: Nucleoside hydrolases. 
Nucleoside hydrolases cleave the N-glycosidic bond in nucleosides generating ribose and the respective base. These enzymes vary in their substrate specificity. This group contains eukaryotic, bacterial and archeal proteins similar to the inosine-uridine preferring nucleoside hydrolase from Crithidia fasciculata, the xanthosine-inosine-uridine-adenosine-preferring nucleoside hydrolase RihC from Salmonella enterica serovar Typhimurium, the purine-specific inosine-adenosine-guanosine-preferring nucleoside hydrolase from Trypanosoma vivax and, pyrimidine-specific uridine-cytidine preferring nucleoside hydrolases such as URH1 from Saccharomyces cerevisiae, RihA and RihB from Escherichia coli. Nucleoside hydrolases are of interest as a target for antiprotozoan drugs as, no nucleoside hydrolase activity or genes encoding these enzymes have been detected in humans and, parasitic protozoans lack de novo purine synthesis relying on nucleoside hydrolase to scavenge purine and/or pyrimidines from the environment." Q#22830 - CGI_10006415 superfamily 241703 5 166 7.45E-57 191.318 cl00226 nuc_hydro superfamily C - "nuc_hydro: Nucleoside hydrolases. Nucleoside hydrolases cleave the N-glycosidic bond in nucleosides generating ribose and the respective base. These enzymes vary in their substrate specificity. This group contains eukaryotic, bacterial and archeal proteins similar to the inosine-uridine preferring nucleoside hydrolase from Crithidia fasciculata, the xanthosine-inosine-uridine-adenosine-preferring nucleoside hydrolase RihC from Salmonella enterica serovar Typhimurium, the purine-specific inosine-adenosine-guanosine-preferring nucleoside hydrolase from Trypanosoma vivax and, pyrimidine-specific uridine-cytidine preferring nucleoside hydrolases such as URH1 from Saccharomyces cerevisiae, RihA and RihB from Escherichia coli. 
Nucleoside hydrolases are of interest as a target for antiprotozoan drugs as, no nucleoside hydrolase activity or genes encoding these enzymes have been detected in humans and, parasitic protozoans lack de novo purine synthesis relying on nucleoside hydrolase to scavenge purine and/or pyrimidines from the environment." Q#22831 - CGI_10006416 superfamily 247692 98 604 0 744.631 cl17068 AFD_class_I superfamily - - "Adenylate forming domain, Class I; This family includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain." Q#22833 - CGI_10006418 superfamily 247792 494 531 1.28E-08 51.6776 cl17238 RING superfamily C - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#22834 - CGI_10006419 superfamily 214781 244 357 2.58E-14 68.5228 cl02747 NRF superfamily - - N-terminal domain in C. elegans NRF-6 (Nose Resistant to Fluoxetine-4) and NDG-4 (resistant to nordihydroguaiaretic acid-4); Also present in several other worm and fly proteins. 
Q#22836 - CGI_10007801 superfamily 245596 990 1208 8.21E-86 278.311 cl11394 Glyco_tranf_GTA_type superfamily - - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#22836 - CGI_10007801 superfamily 243072 413 538 1.50E-40 147.53 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#22836 - CGI_10007801 superfamily 243072 76 203 1.70E-36 135.974 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. 
The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#22836 - CGI_10007801 superfamily 243072 211 340 1.62E-31 121.337 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#22837 - CGI_10007802 superfamily 245603 74 441 6.02E-167 480.382 cl11403 pepsin_retropepsin_like superfamily - - "Cellular and retroviral pepsin-like aspartate proteases; This family includes both cellular and retroviral pepsin-like aspartate proteases. The cellular pepsin and pepsin-like enzymes are twice as long as their retroviral counterparts. The cellular pepsin-like aspartic proteases are found in mammals, plants, fungi and bacteria. These well known and extensively characterized enzymes include pepsins, chymosin, rennin, cathepsins, and fungal aspartic proteases. Several have long been known to be medically (rennin, cathepsin D and E, pepsin) or commercially (chymosin) important. The eukaryotic pepsin-like proteases contain two domains possessing similar topological features. The N- and C-terminal domains, although structurally related by a 2-fold axis, have only limited sequence homology except in the vicinity of the active site. This suggests that the enzymes evolved by an ancient duplication event. 
The eukaryotic pepsin-like proteases have two active site ASP residues with each N- and C-terminal lobe contributing one residue. While the fungal and mammalian pepsins are bilobal proteins, retropepsins function as dimers and the monomer resembles structure of the N- or C-terminal domains of eukaryotic enzyme. The active site motif (Asp-Thr/Ser-Gly-Ser) is conserved between the retroviral and eukaryotic proteases and between the N-and C-terminal of eukaryotic pepsin-like proteases. The retropepsin-like family includes pepsin-like aspartate proteases from retroviruses, retrotransposons and retroelements; as well as eukaryotic DNA-damage-inducible proteins (DDIs), and bacterial aspartate peptidases. Retropepsin is synthesized as part of the POL polyprotein that contains an aspartyl-protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. This family of aspartate proteases is classified by MEROPS as the peptidase family A1 (pepsin A) and A2 (retropepsin family)." Q#22840 - CGI_10007805 superfamily 246680 42 115 1.71E-06 46.426 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." 
Q#22840 - CGI_10007805 superfamily 246680 387 446 2.95E-05 42.574 cl14633 DD_superfamily superfamily C - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#22841 - CGI_10007806 superfamily 241832 72 153 1.16E-38 133.049 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." 
Q#22842 - CGI_10007807 superfamily 247725 136 258 3.11E-74 237.205 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#22842 - CGI_10007807 superfamily 243056 510 691 6.38E-51 176.396 cl02495 RabGAP-TBC superfamily - - "Rab-GTPase-TBC domain; Identification of a TBC domain in GYP6_YEAST and GYP7_YEAST, which are GTPase activator proteins of yeast Ypt6 and Ypt7, implies that these domains are GTPase activator proteins of Rab-like small GTPases." Q#22842 - CGI_10007807 superfamily 247725 293 431 7.17E-44 153.519 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#22843 - CGI_10007808 superfamily 248097 124 255 0.000251268 38.4002 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. 
Q#22848 - CGI_10015835 superfamily 243066 30 128 8.86E-16 71.1828 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#22849 - CGI_10015836 superfamily 245814 478 541 3.32E-06 45.5579 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#22849 - CGI_10015836 superfamily 245814 338 391 0.000443859 39.0095 cl11960 Ig superfamily N - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#22853 - CGI_10015840 superfamily 243139 218 272 8.99E-08 49.9912 cl02676 HSA superfamily C - HSA; This domain is predicted to bind DNA and is often found associated with helicases. Q#22855 - CGI_10015842 superfamily 248458 175 295 0.00504403 38.0637 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 
Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#22855 - CGI_10015842 superfamily 245596 1 161 3.43E-70 228.346 cl11394 Glyco_tranf_GTA_type superfamily N - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#22858 - CGI_10015845 superfamily 220080 709 845 6.50E-55 186.986 cl07526 DUF1900 superfamily - - "Domain of unknown function (DUF1900); This domain is predominantly found in the structural protein coronin, and is duplicated in some sequences. It has no known function." 
Q#22858 - CGI_10015845 superfamily 220080 230 346 1.11E-37 138.45 cl07526 DUF1900 superfamily - - "Domain of unknown function (DUF1900); This domain is predominantly found in the structural protein coronin, and is duplicated in some sequences. It has no known function." Q#22858 - CGI_10015845 superfamily 243092 506 710 1.41E-21 95.8648 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#22858 - CGI_10015845 superfamily 243092 78 259 1.04E-14 74.2936 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#22858 - CGI_10015845 superfamily 149883 8 69 0.000776504 38.7593 cl07525 DUF1899 superfamily - - Domain of unknown function (DUF1899); This set of domains is found in various eukaryotic proteins. Function is unknown. Q#22859 - CGI_10015846 superfamily 243362 302 416 3.99E-44 153.735 cl03262 DnaJ_C superfamily N - C-terminal substrate binding domain of DnaJ and HSP40; The C-terminal region of the DnaJ/Hsp40 protein mediates oligomerization and binding to denatured polypeptide substrate. DnaJ/Hsp40 is a widely conserved heat-shock protein. It prevents the aggregation of unfolded substrate and forms a ternary complex with both substrate and DnaK/Hsp70; the N-terminal J-domain of DnaJ/Hsp40 stimulates the ATPase activity of DnaK/Hsp70. 
Q#22859 - CGI_10015846 superfamily 199908 236 296 2.00E-21 88.0814 cl16908 DnaJ_zf superfamily - - "Zinc finger domain of DnaJ and HSP40; Central/middle or CxxCxGxG-motif containing domain of DnaJ/Hsp40 (heat shock protein 40). DnaJ proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperonin family. Hsp40 proteins are characterized by the presence of an N-terminal J domain, which mediates the interaction with Hsp70. This central domain contains four repeats of a CxxCxGxG motif and binds to two Zinc ions. It has been implicated in substrate binding." Q#22859 - CGI_10015846 superfamily 243077 78 132 5.85E-17 75.2745 cl02542 DnaJ superfamily - - "DnaJ domain or J-domain. DnaJ/Hsp40 (heat shock protein 40) proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperonine family. Hsp40 proteins are characterized by the presence of a J domain, which mediates the interaction with Hsp70. They may contain other domains as well, and the architectures provide a means of classification." Q#22860 - CGI_10015847 superfamily 248194 35 210 0.00126358 39.9262 cl17640 PRMT5 superfamily N - "PRMT5 arginine-N-methyltransferase; The human homologue of yeast Skb1 (Shk1 kinase-binding protein 1) is PRMT5, an arginine-N-methyltransferase. These proteins appear to be key mitotic regulators. They play a role in Jak signalling in higher eukaryotes." 
Q#22862 - CGI_10015849 superfamily 247041 50 92 0.00218029 34.2485 cl15692 CE4_SF superfamily NC - "Catalytic NodB homology domain of the carbohydrate esterase 4 superfamily; The carbohydrate esterase 4 (CE4) superfamily mainly includes chitin deacetylases (EC 3.5.1.41), bacterial peptidoglycan N-acetylglucosamine deacetylases (EC 3.5.1.-), and acetylxylan esterases (EC 3.1.1.72), which catalyze the N- or O-deacetylation of substrates such as acetylated chitin, peptidoglycan, and acetylated xylan, respectively. Members in this superfamily contain a NodB homology domain that adopts a deformed (beta/alpha)8 barrel fold, which encompasses a mononuclear metalloenzyme employing a conserved His-His-Asp zinc-binding triad, closely associated with the conserved catalytic base (aspartic acid) and acid (histidine) to carry out acid/base catalysis. The NodB homology domain of CE4 superfamily is remotely related to the 7-stranded beta/alpha barrel catalytic domain of the superfamily consisting of family 38 glycoside hydrolases (GH38), family 57 heat stable retaining glycoside hydrolases (GH57), lactam utilization protein LamB/YcsF family proteins, and YdjC-family proteins." Q#22864 - CGI_10015851 superfamily 241594 4931 5300 1.89E-105 346.089 cl00077 HECTc superfamily - - "HECT domain; C-terminal catalytic domain of a subclass of Ubiquitin-protein ligase (E3). It binds specific ubiquitin-conjugating enzymes (E2), accepts ubiquitin from E2, transfers ubiquitin to substrate lysine side chains, and transfers additional ubiquitin molecules to the end of growing ubiquitin chains." Q#22864 - CGI_10015851 superfamily 242903 3268 3417 1.32E-69 234.961 cl02148 APC10-like superfamily - - "APC10-like DOC1 domains in E3 ubiquitin ligases that mediate substrate ubiquitination; This family contains the single domain protein, APC10, a subunit of the anaphase-promoting complex (APC), as well as the DOC1 domain of multi-domain proteins present in E3 ubiquitin ligases. 
E3 ubiquitin ligases mediate substrate ubiquitination (or ubiquitylation), a component of the ubiquitin-26S proteasome pathway for selective proteolytic degradation. The APC, a multi-protein complex (or cyclosome), is a cell cycle-regulated, E3 ubiquitin ligase that controls important transitions in mitosis and the G1 phase by ubiquitinating regulatory proteins, thereby targeting them for degradation. APC10-like DOC1 domains such as those present in HECT (Homologous to the E6-AP Carboxyl Terminus) and Cullin-RING (Really Interesting New Gene) E3 ubiquitin ligase proteins, HECTD3, and CUL7, respectively, are also included in this hierarchy. CUL7 is a member of the Cullin-RING ligase family and functions as a molecular scaffold assembling a SCF-ROC1-like E3 ubiquitin ligase complex consisting of Skp1, CUL7, Fbx29 F-box protein, and ROC1 (RING-box protein 1) and promotes ubiquitination. CUL7 is a multi-domain protein with a C-terminal cullin domain that binds ROC1 and a centrally positioned APC10/DOC1 domain. HECTD3 contains a C-terminal HECT domain which contains the active site for ubiquitin transfer onto substrates, and an N-terminal APC10 domain which is responsible for substrate recognition and binding. An APC10/DOC1 domain homolog is also present in HERC2 (HECT domain and RLD2), a large multi-domain protein with three RCC1-like domains (RLDs), additional internal domains including zinc finger ZZ-type and Cyt-b5 (Cytochrome b5-like Heme/Steroid binding) domains, and a C-terminal HECT domain. Recent studies have shown that the protein complex HERC2-RNF8 coordinates ubiquitin-dependent assembly of DNA repair factors on damaged chromosomes. Also included in this hierarchy is an uncharacterized APC10/DOC1-like domain found in a multi-domain protein, which also contains CUB, zinc finger ZZ-type, and EF-hand domains. 
The APC10/DOC1 domain forms a beta-sandwich structure that is related in architecture to the galactose-binding domain-like fold; their sequences are quite dissimilar, however, and are not included here." Q#22864 - CGI_10015851 superfamily 151952 2887 2963 7.68E-42 152.205 cl13031 Cul7 superfamily - - "Mouse development and cellular proliferation protein Cullin-7; The Cullin Ring Ligase family member, Cul7, is required for normal mouse development and cellular proliferation. Cul7 has a CPH domain which is a p53 interaction domain. The CPH domain interaction surface of P53 is present in the tetramerisation domain." Q#22864 - CGI_10015851 superfamily 115363 2152 2210 4.70E-29 114.776 cl05972 MIB_HERC2 superfamily - - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). Q#22864 - CGI_10015851 superfamily 242849 1440 1512 1.46E-23 99.2004 cl02041 Cyt-b5 superfamily - - Cytochrome b5-like Heme/Steroid binding domain; This family includes heme binding domains from a diverse range of proteins. This family also includes proteins that bind to steroids. The family includes progesterone receptors. Many members of this subfamily are membrane anchored by an N-terminal transmembrane alpha helix. This family also includes a domain in some chitin synthases. There is no known ligand for this domain in the chitin synthases. Q#22864 - CGI_10015851 superfamily 201217 4724 4773 4.05E-15 74.0992 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#22864 - CGI_10015851 superfamily 241760 3041 3085 5.10E-15 73.7724 cl00295 ZZ superfamily - - "Zinc finger, ZZ type. Zinc finger present in dystrophin, CBP/p300 and many other proteins. The ZZ motif coordinates one or two zinc ions and most likely participates in ligand binding or molecular scaffolding. 
Many proteins containing ZZ motifs have other zinc-binding motifs as well, and the majority serve as scaffolds in pathways involving acetyltransferase, protein kinase, or ubiquitin-related activity. ZZ proteins can be grouped into the following functional classes: chromatin modifying, cytoskeletal scaffolding, ubiquitin binding or conjugating, and membrane receptor or ion-channel modifying proteins." Q#22864 - CGI_10015851 superfamily 241760 3213 3257 5.10E-15 73.7724 cl00295 ZZ superfamily - - "Zinc finger, ZZ type. Zinc finger present in dystrophin, CBP/p300 and many other proteins. The ZZ motif coordinates one or two zinc ions and most likely participates in ligand binding or molecular scaffolding. Many proteins containing ZZ motifs have other zinc-binding motifs as well, and the majority serve as scaffolds in pathways involving acetyltransferase, protein kinase, or ubiquitin-related activity. ZZ proteins can be grouped into the following functional classes: chromatin modifying, cytoskeletal scaffolding, ubiquitin binding or conjugating, and membrane receptor or ion-channel modifying proteins." Q#22864 - CGI_10015851 superfamily 201217 689 740 1.24E-14 72.5584 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#22864 - CGI_10015851 superfamily 201217 3570 3619 2.13E-14 72.1732 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#22864 - CGI_10015851 superfamily 242903 3096 3152 2.63E-14 74.3323 cl02148 APC10-like superfamily C - "APC10-like DOC1 domains in E3 ubiquitin ligases that mediate substrate ubiquitination; This family contains the single domain protein, APC10, a subunit of the anaphase-promoting complex (APC), as well as the DOC1 domain of multi-domain proteins present in E3 ubiquitin ligases.
E3 ubiquitin ligases mediate substrate ubiquitination (or ubiquitylation), a component of the ubiquitin-26S proteasome pathway for selective proteolytic degradation. The APC, a multi-protein complex (or cyclosome), is a cell cycle-regulated, E3 ubiquitin ligase that controls important transitions in mitosis and the G1 phase by ubiquitinating regulatory proteins, thereby targeting them for degradation. APC10-like DOC1 domains such as those present in HECT (Homologous to the E6-AP Carboxyl Terminus) and Cullin-RING (Really Interesting New Gene) E3 ubiquitin ligase proteins, HECTD3, and CUL7, respectively, are also included in this hierarchy. CUL7 is a member of the Cullin-RING ligase family and functions as a molecular scaffold assembling a SCF-ROC1-like E3 ubiquitin ligase complex consisting of Skp1, CUL7, Fbx29 F-box protein, and ROC1 (RING-box protein 1) and promotes ubiquitination. CUL7 is a multi-domain protein with a C-terminal cullin domain that binds ROC1 and a centrally positioned APC10/DOC1 domain. HECTD3 contains a C-terminal HECT domain which contains the active site for ubiquitin transfer onto substrates, and an N-terminal APC10 domain which is responsible for substrate recognition and binding. An APC10/DOC1 domain homolog is also present in HERC2 (HECT domain and RLD2), a large multi-domain protein with three RCC1-like domains (RLDs), additional internal domains including zinc finger ZZ-type and Cyt-b5 (Cytochrome b5-like Heme/Steroid binding) domains, and a C-terminal HECT domain. Recent studies have shown that the protein complex HERC2-RNF8 coordinates ubiquitin-dependent assembly of DNA repair factors on damaged chromosomes. Also included in this hierarchy is an uncharacterized APC10/DOC1-like domain found in a multi-domain protein, which also contains CUB, zinc finger ZZ-type, and EF-hand domains. 
The APC10/DOC1 domain forms a beta-sandwich structure that is related in architecture to the galactose-binding domain-like fold; their sequences are quite dissimilar, however, and are not included here." Q#22864 - CGI_10015851 superfamily 201217 3622 3673 4.02E-14 71.0176 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#22864 - CGI_10015851 superfamily 201217 4618 4669 5.88E-14 70.6324 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#22864 - CGI_10015851 superfamily 201217 4566 4615 1.85E-13 69.0916 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#22864 - CGI_10015851 superfamily 201217 3728 3777 2.38E-13 69.0916 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#22864 - CGI_10015851 superfamily 201217 795 844 2.85E-13 68.7064 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#22864 - CGI_10015851 superfamily 201217 4777 4825 4.59E-12 65.2396 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#22864 - CGI_10015851 superfamily 201217 847 896 7.21E-12 64.4692 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#22864 - CGI_10015851 superfamily 201217 3780 3828 8.07E-11 61.7728 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. 
Q#22864 - CGI_10015851 superfamily 201217 3515 3567 3.58E-09 56.7652 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#22864 - CGI_10015851 superfamily 205718 673 702 1.27E-05 45.9442 cl16296 RCC1_2 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#22864 - CGI_10015851 superfamily 201217 637 686 1.48E-05 45.9796 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#22864 - CGI_10015851 superfamily 201217 3676 3725 3.32E-05 45.2092 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#22864 - CGI_10015851 superfamily 201217 743 792 0.000282698 42.5128 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#22864 - CGI_10015851 superfamily 205718 4708 4735 0.00253677 39.3958 cl16296 RCC1_2 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#22865 - CGI_10015852 superfamily 248097 132 255 1.06E-20 85.0094 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#22866 - CGI_10015853 superfamily 248097 6 49 2.00E-08 46.1042 cl17543 C1q superfamily N - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#22868 - CGI_10015855 superfamily 245814 11 94 5.13E-12 62.908 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#22869 - CGI_10015856 superfamily 219571 291 789 0 661.822 cl06695 Cas1_AcylT superfamily - - "10 TM Acyl Transferase domain found in Cas1p; Cas1p protein of Cryptococcus neoformans is required for the synthesis of O-acetylated glucuronoxylomannans, a constituent of the capsule, and is critical for its virulence. The multi TM domain of the Cas1p was unified with the 10 TM Sugar Acyltransferase superfamily. This superfamily is comprised of members from the OatA, MdoC, OpgC, NolL and GumG families in addition to the Cas1p family. The Cas1p protein has a N terminal PC-Esterase domain with the opposing Acyl esterase activity." Q#22869 - CGI_10015856 superfamily 242274 80 262 1.28E-07 52.4098 cl01053 SGNH_hydrolase superfamily - - "SGNH_hydrolase, or GDSL_hydrolase, is a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the typical Ser-His-Asp(Glu) triad from other serine hydrolases, but may lack the carboxylic acid." Q#22870 - CGI_10015857 superfamily 241596 30 88 6.33E-05 37.1935 cl00081 HLH superfamily - - "Helix-loop-helix domain, found in specific DNA- binding proteins that act as transcription factors; 60-100 amino acids long.
A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE 5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved in both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#22871 - CGI_10005177 superfamily 243035 162 235 1.16E-07 49.1554 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#22871 - CGI_10005177 superfamily 243035 265 381 3.03E-07 47.9998 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#22871 - CGI_10005177 superfamily 243035 37 107 1.13E-05 43.3774 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. 
CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#22873 - CGI_10005180 superfamily 243519 12 136 5.52E-79 245.209 cl03757 phosphohexomutase superfamily C - "The alpha-D-phosphohexomutase superfamily includes several related enzymes that catalyze a reversible intramolecular phosphoryl transfer on their sugar substrates. 
Members of this family include the phosphoglucomutases (PGM1 and PGM2), phosphoglucosamine mutase (PNGM), phosphoacetylglucosamine mutase (PAGM), the bacterial phosphomannomutase ManB, the bacterial phosphoglucosamine mutase GlmM, and the bifunctional phosphomannomutase/phosphoglucomutase (PMM/PGM). These enzymes play important and diverse roles in carbohydrate metabolism in organisms from bacteria to humans. Each of these enzymes has four domains with a centrally located active site formed by four loops, one from each domain. All four domains are included in this alignment model." Q#22874 - CGI_10009715 superfamily 206084 81 105 1.30E-07 49.1364 cl16472 zf-C2HC_2 superfamily - - zinc-finger of a C2HC-type; This family contains a number of divergent C2H2 type zinc fingers. Q#22874 - CGI_10009715 superfamily 206084 523 547 1.33E-06 46.0548 cl16472 zf-C2HC_2 superfamily - - zinc-finger of a C2HC-type; This family contains a number of divergent C2H2 type zinc fingers. Q#22874 - CGI_10009715 superfamily 206084 695 719 1.44E-06 46.0548 cl16472 zf-C2HC_2 superfamily - - zinc-finger of a C2HC-type; This family contains a number of divergent C2H2 type zinc fingers. Q#22874 - CGI_10009715 superfamily 206084 390 413 4.84E-06 44.514 cl16472 zf-C2HC_2 superfamily - - zinc-finger of a C2HC-type; This family contains a number of divergent C2H2 type zinc fingers. Q#22874 - CGI_10009715 superfamily 206084 254 277 9.57E-06 43.7436 cl16472 zf-C2HC_2 superfamily - - zinc-finger of a C2HC-type; This family contains a number of divergent C2H2 type zinc fingers. Q#22874 - CGI_10009715 superfamily 206084 319 342 0.000221277 39.5064 cl16472 zf-C2HC_2 superfamily - - zinc-finger of a C2HC-type; This family contains a number of divergent C2H2 type zinc fingers. Q#22874 - CGI_10009715 superfamily 206084 186 207 0.000251343 39.5064 cl16472 zf-C2HC_2 superfamily - - zinc-finger of a C2HC-type; This family contains a number of divergent C2H2 type zinc fingers. 
Q#22874 - CGI_10009715 superfamily 206084 456 478 0.000685112 38.3508 cl16472 zf-C2HC_2 superfamily - - zinc-finger of a C2HC-type; This family contains a number of divergent C2H2 type zinc fingers. Q#22874 - CGI_10009715 superfamily 206084 622 645 0.00155421 37.1952 cl16472 zf-C2HC_2 superfamily - - zinc-finger of a C2HC-type; This family contains a number of divergent C2H2 type zinc fingers. Q#22874 - CGI_10009715 superfamily 206084 9 31 0.00199067 36.81 cl16472 zf-C2HC_2 superfamily - - zinc-finger of a C2HC-type; This family contains a number of divergent C2H2 type zinc fingers. Q#22874 - CGI_10009715 superfamily 206084 581 603 0.00583525 35.2692 cl16472 zf-C2HC_2 superfamily - - zinc-finger of a C2HC-type; This family contains a number of divergent C2H2 type zinc fingers. Q#22875 - CGI_10009716 superfamily 245201 31 217 3.61E-64 209.4 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#22875 - CGI_10009716 superfamily 241764 297 361 2.45E-21 88.3361 cl00299 MIT superfamily - - "MIT: domain contained within Microtubule Interacting and Trafficking molecules. The MIT domain is found in sorting nexins, the nuclear thiol protease PalBH, the AAA protein spastin and archaebacterial proteins with similar domain architecture, vacuolar sorting proteins and others. The molecular function of the MIT domain is unclear." Q#22875 - CGI_10009716 superfamily 241764 400 471 7.63E-08 49.956 cl00299 MIT superfamily - - "MIT: domain contained within Microtubule Interacting and Trafficking molecules. 
The MIT domain is found in sorting nexins, the nuclear thiol protease PalBH, the AAA protein spastin and archaebacterial proteins with similar domain architecture, vacuolar sorting proteins and others. The molecular function of the MIT domain is unclear." Q#22875 - CGI_10009716 superfamily 245201 186 289 1.39E-05 46.4088 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#22876 - CGI_10009717 superfamily 241613 93 127 3.84E-08 47.9718 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#22876 - CGI_10009717 superfamily 241613 54 88 1.20E-07 46.431 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by 
endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#22877 - CGI_10009718 superfamily 243035 1045 1180 1.41E-15 76.5045 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. 
Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#22877 - CGI_10009718 superfamily 241571 2614 2719 2.35E-14 72.8302 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#22877 - CGI_10009718 superfamily 241571 915 1020 3.19E-14 72.445 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." 
Q#22877 - CGI_10009718 superfamily 241613 1190 1224 5.13E-10 57.987 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#22877 - CGI_10009718 superfamily 243035 2744 2803 1.11E-07 52.6222 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model."
Q#22877 - CGI_10009718 superfamily 241613 1514 1548 7.51E-06 46.0458 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#22877 - CGI_10009718 superfamily 245847 538 691 8.20E-17 80.8593 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#22877 - CGI_10009718 superfamily 241609 2271 2355 1.42E-06 48.9258 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#22877 - CGI_10009718 superfamily 241609 278 361 1.53E-06 48.9258 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. 
They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#22877 - CGI_10009718 superfamily 241571 2459 2514 1.52E-05 45.8663 cl00049 CUB superfamily C - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#22877 - CGI_10009718 superfamily 192535 1711 1888 0.000256021 44.1238 cl18179 7TM_GPCR_Srsx superfamily N - Serpentine type 7TM GPCR chemoreceptor Srsx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srsx is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#22877 - CGI_10009718 superfamily 241571 801 851 0.000661459 40.8587 cl00049 CUB superfamily C - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#22877 - CGI_10009718 superfamily 241619 708 773 0.00159432 39.4844 cl00112 PAN_APPLE superfamily - - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." 
Q#22877 - CGI_10009718 superfamily 217525 1651 1758 0.00411758 38.8845 cl04035 Serpentine_r_xa superfamily C - "Caenorhabditis serpentine receptor-like protein, class xa; This family contains various Caenorhabditis proteins, some of which are annotated as being serpentine receptors, mainly of the xa class." Q#22878 - CGI_10009719 superfamily 247724 62 255 1.07E-115 341.1 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#22878 - CGI_10009719 superfamily 243185 263 349 1.33E-36 130.32 cl02787 Translation_Factor_II_like superfamily - - "Translation_Factor_II_like: Elongation factor Tu (EF-Tu) domain II-like proteins. Elongation factor Tu consists of three structural domains, this family represents the second domain. 
Domain II adopts a beta barrel structure and is involved in binding to charged tRNA. Domain II is found in other proteins such as elongation factor G and translation initiation factor IF-2. This group also includes the C2 subdomain of domain IV of IF-2 that has the same fold as domain II of (EF-Tu). Like IF-2 from certain prokaryotes such as Thermus thermophilus, mitochondrial IF-2 lacks domain II, which is thought to be involved in binding of E.coli IF-2 to 30S subunits." Q#22878 - CGI_10009719 superfamily 243184 354 447 3.98E-30 112.437 cl02786 Translation_factor_III superfamily - - "Domain III of Elongation factor (EF) Tu (EF-TU) and EF-G. Elongation factors (EF) EF-Tu and EF-G participate in the elongation phase during protein biosynthesis on the ribosome. Their functional cycles depend on GTP binding and its hydrolysis. The EF-Tu complexed with GTP and aminoacyl-tRNA delivers tRNA to the ribosome, whereas EF-G stimulates translocation, a process in which tRNA and mRNA movements occur in the ribosome. Experimental data showed that: (1) intrinsic GTPase activity of EF-G is influenced by excision of its domain III; (2) that EF-G lacking domain III has a 1,000-fold decreased GTPase activity on the ribosome and, a slightly decreased affinity for GTP; and (3) EF-G lacking domain III does not stimulate translocation, despite the physical presence of domain IV which is also very important for translocation. These findings indicate an essential contribution of domain III to activation of GTP hydrolysis. Domains III and V of EF-G have the same fold (although they are not completely superimposable), the double split beta-alpha-beta fold. This fold is observed in a large number of ribonucleotide binding proteins and is also referred to as the ribonucleoprotein (RNP) or RNA recognition (RRM) motif. This domain III is found in several elongation factors, as well as in peptide chain release factors and in GT-1 family of GTPase (GTPBP1)." 
Q#22879 - CGI_10009720 superfamily 241599 387 417 5.08E-05 41.0749 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#22880 - CGI_10009721 superfamily 245201 528 663 3.95E-24 102.7 cl09925 PKc_like superfamily C - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#22880 - CGI_10009721 superfamily 245201 963 1129 8.41E-19 87.9714 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#22881 - CGI_10009722 superfamily 221913 683 904 2.15E-57 196.607 cl18626 AAA_12 superfamily - - AAA domain; This family of domains contain a P-loop motif that is characteristic of the AAA superfamily. Many of the proteins in this family are conjugative transfer proteins. Q#22881 - CGI_10009722 superfamily 222005 504 550 1.37E-06 47.3468 cl18632 AAA_19 superfamily C - Part of AAA domain; Part of AAA domain. 
Q#22882 - CGI_10009724 superfamily 247916 881 937 3.98E-07 48.9183 cl17362 Transglut_core superfamily - - "Transglutaminase-like superfamily; This family includes animal transglutaminases and other bacterial proteins of unknown function. Sequence conservation in this superfamily primarily involves three motifs that centre around conserved cysteine, histidine, and aspartate residues that form the catalytic triad in the structurally characterized transglutaminase, the human blood clotting factor XIIIa'. On the basis of the experimentally demonstrated activity of the Methanobacterium phage pseudomurein endoisopeptidase, it is proposed that many, if not all, microbial homologues of the transglutaminases are proteases and that the eukaryotic transglutaminases have evolved from an ancestral protease." Q#22883 - CGI_10009725 superfamily 243072 162 294 2.27E-27 104.388 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#22884 - CGI_10000167 superfamily 241583 50 106 1.14E-11 57.1958 cl00064 ZnMc superfamily C - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." 
Q#22886 - CGI_10000483 superfamily 243119 18 69 0.00062541 33.9465 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#22887 - CGI_10000542 superfamily 241563 56 92 0.00490572 35.5328 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#22889 - CGI_10003625 superfamily 241752 1823 2186 2.18E-72 248.724 cl00283 ADP_ribosyl superfamily - - "ADP_ribosylating enzymes catalyze the transfer of ADP_ribose from NAD+ to substrates. Bacterial toxins are cytoplasmic and catalyze the transfer of a single ADP_ribose unit to eukaryotic elongation factor 2, halting protein synthesis and killing the cell. Poly(ADP-ribose) polymerases (PARPS 1-3, VPARP, tankyrase) catalyze the addition of up to 100 ADP_ribose units from NAD+. PARPs 1 and 2 are localized in the nucleus, bind DNA, and are activated by DNA damage. VPARP is part of the vault ribonucleoprotein complex. Tankyrases regulate telomere length in part through poly(ADP_ribosylation) of telomere repeat binding factor 1 (TRF1). Poly(ADP-ribose) polymerase catalyses the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage.
The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active." Q#22889 - CGI_10003625 superfamily 242589 1710 1806 2.92E-34 129.732 cl01581 WGR superfamily - - "WGR domain; The WGR domain is found in a variety of eukaryotic poly(ADP-ribose) polymerases (PARPs) as well as the putative Escherichia coli molybdate metabolism regulator and related bacterial proteins, a small family of bacterial DNA ligases, and various other bacterial proteins of unknown function. It has been called WGR after the most conserved central motif of the domain. The domain occurs in single-domain proteins and in a variety of domain architectures, and is between 70 and 80 residues in length. It has been proposed to function as a nucleic acid binding domain." Q#22889 - CGI_10003625 superfamily 243072 455 577 4.31E-23 98.2246 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#22889 - CGI_10003625 superfamily 243072 335 510 2.52E-20 90.1354 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#22889 - CGI_10003625 superfamily 243072 949 1095 3.33E-17 80.8906 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#22889 - CGI_10003625 superfamily 243072 248 393 1.03E-13 70.8754 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#22889 - CGI_10003625 superfamily 243072 839 966 2.29E-12 66.6382 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#22889 - CGI_10003625 superfamily 243072 559 690 7.05E-12 65.0974 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#22889 - CGI_10003625 superfamily 243072 1240 1403 6.46E-10 59.3194 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#22889 - CGI_10003625 superfamily 243072 1558 1627 4.46E-06 47.3782 cl02529 ANK superfamily N - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#22889 - CGI_10003625 superfamily 243072 734 877 6.78E-05 43.5263 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#22889 - CGI_10003625 superfamily 243072 1191 1285 0.00315963 38.5187 cl02529 ANK superfamily N - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#22889 - CGI_10003625 superfamily 242219 23 77 9.55E-10 58.4156 cl00955 Ribosomal_L34e superfamily C - Ribosomal protein L34e; Ribosomal protein L34e. Q#22890 - CGI_10003626 superfamily 203149 12 90 1.40E-17 74.1713 cl09350 V-SNARE superfamily - - Vesicle transport v-SNARE protein N-terminus; V-SNARE proteins are required for protein traffic between eukaryotic organelles. The v-SNAREs on transport vesicles interact with t-SNAREs on target membranes in order to facilitate this. This domain is the N-terminal half of the V-Snare proteins. Q#22890 - CGI_10003626 superfamily 152787 122 187 2.56E-07 45.6665 cl18053 V-SNARE_C superfamily - - Snare region anchored in the vesicle membrane C-terminus; Within the SNARE proteins interactions in the C-terminal half of the SNARE helix are critical to the driving of membrane fusion; whereas interactions in the N-terminal half of the SNARE domain are important for promoting priming or docking of the vesicle pfam05008. Q#22891 - CGI_10003627 superfamily 247786 9 98 9.76E-21 86.8785 cl17232 F420_oxidored superfamily - - NADP oxidoreductase coenzyme F420-dependent; NADP oxidoreductase coenzyme F420-dependent. 
Q#22891 - CGI_10003627 superfamily 242267 272 373 3.42E-05 42.2772 cl01043 Ferric_reduct superfamily N - "Ferric reductase like transmembrane component; This family includes a common region in the transmembrane proteins mammalian cytochrome B-245 heavy chain (gp91-phox), ferric reductase transmembrane component in yeast and respiratory burst oxidase from mouse-ear cress. This may be a family of flavocytochromes capable of moving electrons across the plasma membrane. The Frp1 protein from S. pombe is a ferric reductase component and is required for cell surface ferric reductase activity, mutants in frp1 are deficient in ferric iron uptake. Cytochrome B-245 heavy chain is a FAD-dependent dehydrogenase; it also has electron transferase activity which reduces molecular oxygen to superoxide anion, a precursor in the production of microbicidal oxidants. Mutations in the sequence of cytochrome B-245 heavy chain (gp91-phox) lead to the X-linked chronic granulomatous disease. The bactericidal ability of phagocytic cells is reduced and is characterized by the absence of a functional plasma membrane associated NADPH oxidase. The chronic granulomatous disease gene codes for the beta chain of cytochrome B-245 and cytochrome B-245 is missing from patients with the disease." Q#22892 - CGI_10000759 superfamily 247058 1 168 3.61E-49 161.574 cl15762 crotonase-like superfamily - - "Crotonase/Enoyl-Coenzyme A (CoA) hydratase superfamily. This superfamily contains a diverse set of enzymes including enoyl-CoA hydratase, naphthoate synthase, methylmalonyl-CoA decarboxylase, 3-hydroxybutyryl-CoA dehydratase, and dienoyl-CoA isomerase. Many of these play important roles in fatty acid metabolism. In addition to a conserved structural core and the formation of trimers (or dimers of trimers), a common feature in this superfamily is the stabilization of an enolate anion intermediate derived from an acyl-CoA substrate.
This is accomplished by two conserved backbone NH groups in active sites that form an oxyanion hole." Q#22897 - CGI_10001047 superfamily 241563 62 96 3.19E-05 41.8892 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#22897 - CGI_10001047 superfamily 110440 483 509 0.0028551 35.8465 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#22899 - CGI_10000701 superfamily 245856 572 804 1.53E-33 133.303 cl12060 AP2Ec superfamily C - "AP endonuclease family 2; These endonucleases play a role in DNA repair. Cleave phosphodiester bonds at apurinic or apyrimidinic sites; the alignment also contains hexulose-6-phosphate isomerases, enzymes that catalyze the epimerization of D-arabino-6-hexulose 3-phosphate to D-fructose 6-phosphate, via cleaving the phosphoesterbond with the sugar."
Q#22903 - CGI_10001705 superfamily 247794 1 145 4.38E-93 278.272 cl17240 FDH_GDH_like superfamily N - "Formate/glycerate dehydrogenases, D-specific 2-hydroxy acid dehydrogenases and related dehydrogenases; The formate/glycerate dehydrogenase like family contains a diverse group of enzymes such as formate dehydrogenase (FDH), glycerate dehydrogenase (GDH), D-lactate dehydrogenase, L-alanine dehydrogenase, and S-Adenosylhomocysteine hydrolase, that share a common 2-domain structure. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar domains of the alpha/beta Rossmann fold NAD+ binding form. The NAD(P) binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD(P) is bound, primarily to the C-terminal portion of the 2nd (internal) domain. While many members of this family are dimeric, alanine DH is hexameric and phosphoglycerate DH is tetrameric. 2-hydroxyacid dehydrogenases are enzymes that catalyze the conversion of a wide variety of D-2-hydroxy acids to their corresponding keto acids. The general mechanism is (R)-lactate + acceptor to pyruvate + reduced acceptor. Formate dehydrogenase (FDH) catalyzes the NAD+-dependent oxidation of formate ion to carbon dioxide with the concomitant reduction of NAD+ to NADH. FDHs of this family contain no metal ions or prosthetic groups. Catalysis occurs though direct transfer of a hydride ion to NAD+ without the stages of acid-base catalysis typically found in related dehydrogenases." Q#22904 - CGI_10001557 superfamily 248097 18 137 2.36E-28 102.343 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. 
Q#22905 - CGI_10010911 superfamily 241750 1430 1787 1.52E-172 532.797 cl00281 metallo-dependent_hydrolases superfamily - - "Superfamily of metallo-dependent hydrolases (also called amidohydrolase superfamily) is a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. The family includes urease alpha, adenosine deaminase, phosphotriesterase dihydroorotases, allantoinases, hydantoinases, AMP-, adenine and cytosine deaminases, imidazolonepropionase, aryldialkylphosphatase, chlorohydrolases, formylmethanofuran dehydrogenases and others." Q#22905 - CGI_10010911 superfamily 241555 182 358 7.01E-95 306.345 cl00020 GAT_1 superfamily - - "Type 1 glutamine amidotransferase (GATase1)-like domain; Type 1 glutamine amidotransferase (GATase1)-like domain. This group contains proteins similar to Class I glutamine amidotransferases, the intracellular PH1704 from Pyrococcus horikoshii, the C-terminal of the large catalase: Escherichia coli HP-II, Sinorhizobium meliloti Rm1021 ThuA, the A4 beta-galactosidase middle domain and peptidase E. The majority of proteins in this group have a reactive Cys found in the sharp turn between a beta strand and an alpha helix termed the nucleophile elbow. For Class I glutamine amidotransferases proteins which transfer ammonia from the amide side chain of glutamine to an acceptor substrate, this Cys forms a Cys-His-Glu catalytic triad in the active site. 
Glutamine amidotransferase activity can be found in a range of biosynthetic enzymes included in this cd: glutamine amidotransferase, formylglycinamide ribonucleotide, GMP synthetase, anthranilate synthase component II, glutamine-dependent carbamoyl phosphate synthase (CPSase), cytidine triphosphate synthetase, gamma-glutamyl hydrolase, imidazole glycerol phosphate synthase, and cobyric acid synthase. For Pyrococcus horikoshii PH1704, the Cys of the nucleophile elbow together with a different His and a Glu from an adjacent monomer form a catalytic triad different from the typical GATase1 triad. Peptidase E is believed to be a serine peptidase having a Ser-His-Glu catalytic triad which differs from the Cys-His-Glu catalytic triad of typical GATase1 domains, by having a Ser in place of the reactive Cys at the nucleophile elbow. The E. coli HP-II C-terminal domain, S. meliloti Rm1021 ThuA and the A4 beta-galactosidase middle domain lack the catalytic triad typical of GATase1 domains. GATase1-like domains can occur either as single polypeptides, as in Class I glutamine amidotransferases, or as domains in a much larger multifunctional synthase protein, such as CPSase. Peptidase E has a circular permutation in the common core of a typical GATase1 domain." Q#22905 - CGI_10010911 superfamily 241720 1281 1407 8.02E-35 131.653 cl00245 MGS-like superfamily - - "MGS-like domain. This domain composes the whole protein of methylglyoxal synthetase, which catalyzes the enolization of dihydroxyacetone phosphate (DHAP) to produce methylglyoxal. The family also includes the C-terminal domain in carbamoyl phosphate synthetase (CPS) where it catalyzes the last phosphorylation of a carboxyphosphate intermediate to form the product carbamoyl phosphate and may also play a regulatory role. This family also includes inosine monophosphate cyclohydrolase. The known structures in this family show a common phosphate binding site." 
Q#22905 - CGI_10010911 superfamily 247809 516 720 1.38E-76 255.689 cl17255 ATP-grasp_4 superfamily - - ATP-grasp domain; This family includes a diverse set of enzymes that possess ATP-dependent carboxylate-amine ligase activity. Q#22905 - CGI_10010911 superfamily 198801 4 142 5.04E-66 221.864 cl03056 CPSase_sm_chain superfamily - - "Carbamoyl-phosphate synthase small chain, CPSase domain; The carbamoyl-phosphate synthase domain is in the amino terminus of protein. Carbamoyl-phosphate synthase catalyzes the ATP-dependent synthesis of carbamyl-phosphate from glutamine or ammonia and bicarbonate. This important enzyme initiates both the urea cycle and the biosynthesis of arginine and/or pyrimidines. The carbamoyl-phosphate synthase (CPS) enzyme in prokaryotes is a heterodimer of a small and large chain. The small chain promotes the hydrolysis of glutamine to ammonia, which is used by the large chain to synthesise carbamoyl phosphate. See pfam00289. The small chain has a GATase domain in the carboxyl terminus. See pfam00117." Q#22905 - CGI_10010911 superfamily 217204 1894 2036 4.77E-51 179.272 cl03681 OTCace_N superfamily - - "Aspartate/ornithine carbamoyltransferase, carbamoyl-P binding domain; Aspartate/ornithine carbamoyltransferase, carbamoyl-P binding domain. " Q#22905 - CGI_10010911 superfamily 217231 808 930 5.87E-46 163.902 cl15983 CPSase_L_D3 superfamily - - "Carbamoyl-phosphate synthetase large chain, oligomerisation domain; Carbamoyl-phosphate synthase catalyzes the ATP-dependent synthesis of carbamyl-phosphate from glutamine or ammonia and bicarbonate. The carbamoyl-phosphate synthase (CPS) enzyme in prokaryotes is a heterodimer of a small and large chain." Q#22905 - CGI_10010911 superfamily 215776 2041 2191 1.96E-45 163.919 cl18343 OTCace superfamily - - "Aspartate/ornithine carbamoyltransferase, Asp/Orn binding domain; Aspartate/ornithine carbamoyltransferase, Asp/Orn binding domain. 
" Q#22905 - CGI_10010911 superfamily 247809 1033 1213 4.45E-22 97.7576 cl17255 ATP-grasp_4 superfamily - - ATP-grasp domain; This family includes a diverse set of enzymes that possess ATP-dependent carboxylate-amine ligase activity. Q#22905 - CGI_10010911 superfamily 201133 944 1038 7.24E-19 85.2265 cl02837 CPSase_L_chain superfamily - - "Carbamoyl-phosphate synthase L chain, N-terminal domain; Carbamoyl-phosphate synthase catalyzes the ATP-dependent synthesis of carbamyl-phosphate from glutamine or ammonia and bicarbonate. This important enzyme initiates both the urea cycle and the biosynthesis of arginine and/or pyrimidines. The carbamoyl-phosphate synthase (CPS) enzyme in prokaryotes is a heterodimer of a small and large chain. The small chain promotes the hydrolysis of glutamine to ammonia, which is used by the large chain to synthesise carbamoyl phosphate. See pfam00988. The small chain has a GATase domain in the carboxyl terminus. See pfam00117." Q#22905 - CGI_10010911 superfamily 201133 411 510 2.83E-16 77.9077 cl02837 CPSase_L_chain superfamily - - "Carbamoyl-phosphate synthase L chain, N-terminal domain; Carbamoyl-phosphate synthase catalyzes the ATP-dependent synthesis of carbamyl-phosphate from glutamine or ammonia and bicarbonate. This important enzyme initiates both the urea cycle and the biosynthesis of arginine and/or pyrimidines. The carbamoyl-phosphate synthase (CPS) enzyme in prokaryotes is a heterodimer of a small and large chain. The small chain promotes the hydrolysis of glutamine to ammonia, which is used by the large chain to synthesise carbamoyl phosphate. See pfam00988. The small chain has a GATase domain in the carboxyl terminus. See pfam00117." Q#22906 - CGI_10010912 superfamily 248458 283 665 3.06E-09 58.0941 cl17904 MFS superfamily - - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. 
MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#22907 - CGI_10010913 superfamily 248458 86 181 9.45E-09 56.5533 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. 
Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#22907 - CGI_10010913 superfamily 248458 369 722 2.69E-08 55.0125 cl17904 MFS superfamily - - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. 
The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#22908 - CGI_10010914 superfamily 245201 289 567 0 566.871 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#22908 - CGI_10010914 superfamily 243090 160 279 1.23E-50 174.47 cl02565 RGS superfamily N - "Regulator of G protein signaling (RGS) domain superfamily; The RGS domain is an essential part of the Regulator of G-protein Signaling (RGS) protein family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play critical regulatory roles as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. 
While inactive, G-alpha-subunits bind GDP, which is released and replaced by GTP upon agonist activation. GTP binding leads to dissociation of the alpha-subunit and the beta-gamma-dimer, allowing them to interact with effector molecules and propagate signaling cascades associated with cellular growth, survival, migration, and invasion. Deactivation of the G-protein signaling controlled by the RGS domain accelerates GTPase activity of the alpha subunit by hydrolysis of GTP to GDP, which results in the reassociation of the alpha-subunit with the beta-gamma-dimer and thereby inhibition of downstream activity. As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. RGS proteins are also involved in apoptosis and cell proliferation, as well as modulation of cardiac development. Several RGS proteins can fine-tune immune responses, while others play important roles in neuronal signal modulation. Some RGS proteins are principal elements needed for proper vision." Q#22908 - CGI_10010914 superfamily 243090 30 91 1.03E-35 133.253 cl02565 RGS superfamily C - "Regulator of G protein signaling (RGS) domain superfamily; The RGS domain is an essential part of the Regulator of G-protein Signaling (RGS) protein family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play critical regulatory roles as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. While inactive, G-alpha-subunits bind GDP, which is released and replaced by GTP upon agonist activation. 
GTP binding leads to dissociation of the alpha-subunit and the beta-gamma-dimer, allowing them to interact with effector molecules and propagate signaling cascades associated with cellular growth, survival, migration, and invasion. Deactivation of the G-protein signaling controlled by the RGS domain accelerates GTPase activity of the alpha subunit by hydrolysis of GTP to GDP, which results in the reassociation of the alpha-subunit with the beta-gamma-dimer and thereby inhibition of downstream activity. As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. RGS proteins are also involved in apoptosis and cell proliferation, as well as modulation of cardiac development. Several RGS proteins can fine-tune immune responses, while others play important roles in neuronal signal modulation. Some RGS proteins are principal elements needed for proper vision." Q#22908 - CGI_10010914 superfamily 247725 649 770 6.12E-35 129.736 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting the protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and other proteins." Q#22908 - CGI_10010914 superfamily 245597 548 614 6.24E-06 44.6588 cl11395 Pkinase_C superfamily - - Protein kinase C terminal domain; Protein kinase C terminal domain. 
Q#22910 - CGI_10010916 superfamily 243056 924 1093 7.80E-42 154.002 cl02495 RabGAP-TBC superfamily N - "Rab-GTPase-TBC domain; Identification of a TBC domain in GYP6_YEAST and GYP7_YEAST, which are GTPase activator proteins of yeast Ypt6 and Ypt7, implies that these domains are GTPase activator proteins of Rab-like small GTPases." Q#22910 - CGI_10010916 superfamily 243142 91 231 5.53E-28 111.179 cl02689 RUN superfamily - - "RUN domain; This domain is present in several proteins that are linked to the functions of GTPases in the Rap and Rab families. They could hence play important roles in multiple Ras-like GTPase signalling pathways. The domain comprises six conserved regions, which in some proteins have considerable insertions between them. The domain core is thought to take up a predominantly alpha fold, with basic amino acids in regions A and D possibly playing a functional role in interactions with Ras GTPases." Q#22911 - CGI_10010917 superfamily 246954 36 467 2.18E-85 282.207 cl15415 Sec1 superfamily C - Sec1 family; Sec1 family. Q#22911 - CGI_10010917 superfamily 246954 475 717 3.80E-38 148.543 cl15415 Sec1 superfamily N - Sec1 family; Sec1 family. Q#22912 - CGI_10010918 superfamily 246669 193 300 6.82E-06 45.9059 cl14603 C2 superfamily C - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. 
However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#22912 - CGI_10010918 superfamily 246669 1033 1152 1.25E-30 119.265 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#22912 - CGI_10010918 superfamily 220800 919 1017 0.000437078 40.3663 cl11172 Membr_traf_MHD superfamily - - "Munc13 (mammalian uncoordinated) homology domain; Munc13 proteins constitute a family of three highly homologous molecules (Munc13-1, Munc13-2 and Munc13-3) with homology to Caenorhabditis elegans unc-13p. Munc13 proteins contain a phorbol ester-binding C1 domain and two C2 domains, which are Ca2+/phospholipid binding domains. Sequence analyses have uncovered two regions called Munc13 homology domains 1 (MHD1) and 2 (MHD2) that are arranged between two flanking C2 domains. 
MHD1 and MHD2 domains are present in a wide variety of proteins from Arabidopsis thaliana, C. elegans, Drosophila melanogaster, mouse, rat and human, some of which may function in a Munc13-like manner to regulate membrane trafficking. The MHD1 and MHD2 domains are predicted to be alpha-helical." Q#22913 - CGI_10010919 superfamily 247642 87 209 1.36E-44 148.533 cl16917 Complex1_30kDa superfamily - - "Respiratory-chain NADH dehydrogenase, 30 Kd subunit; Respiratory-chain NADH dehydrogenase, 30 Kd subunit. " Q#22914 - CGI_10010920 superfamily 241642 35 94 2.24E-05 40.5566 cl00152 t_SNARE superfamily - - "Soluble NSF (N-ethylmaleimide-sensitive fusion protein)-Attachment protein (SNAP) REceptor domain; these alpha-helical motifs form twisted and parallel heterotetrameric helix bundles; the core complex contains one helix from a protein that is anchored in the vesicle membrane (synaptobrevin), one helix from a protein of the target membrane (syntaxin), and two helices from another protein anchored in the target membrane (SNAP-25); their interaction forms a core which is composed of a polar zero layer, a flanking leucine-zipper layer acts as a water tight shield to isolate ionic interactions in the zero layer from the surrounding solvent" Q#22914 - CGI_10010920 superfamily 241642 180 238 0.0089198 32.8845 cl00152 t_SNARE superfamily - - "Soluble NSF (N-ethylmaleimide-sensitive fusion protein)-Attachment protein (SNAP) REceptor domain; these alpha-helical motifs form twisted and parallel heterotetrameric helix bundles; the core complex contains one helix from a protein that is anchored in the vesicle membrane (synaptobrevin), one helix from a protein of the target membrane (syntaxin), and two helices from another protein anchored in the target membrane (SNAP-25); their interaction forms a core which is composed of a polar zero layer, a flanking leucine-zipper layer acts as a water tight shield to isolate ionic interactions in the zero layer from the surrounding solvent" 
Q#22917 - CGI_10010923 superfamily 247856 22 79 6.32E-10 51.0093 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#22920 - CGI_10010926 superfamily 245201 29 337 0 613.605 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#22922 - CGI_10001740 superfamily 241563 96 121 0.00201011 36.3032 cl00034 BBOX superfamily N - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#22924 - CGI_10002037 superfamily 242004 114 348 1.10E-70 222.145 cl00650 Cu-oxidase_4 superfamily - - Multi-copper polyphenol oxidoreductase laccase; Laccases are multi-copper oxidoreductases able to oxidise a wide variety of phenolic and non-phenolic compounds and are widely distributed among both prokaryotes and eukaryotes. There are two main active catalytic sites with conserved histidines that are capable of binding four copper atoms. 
Q#22925 - CGI_10002038 superfamily 241567 408 617 3.65E-26 107.786 cl00042 CASc superfamily - - "Caspase, interleukin-1 beta converting enzyme (ICE) homologues; Cysteine-dependent aspartate-directed proteases that mediate programmed cell death (apoptosis). Caspases are synthesized as inactive zymogens and activated by proteolysis of the peptide backbone adjacent to an aspartate. The resulting two subunits associate to form an (alpha)2(beta)2-tetramer which is the active enzyme. Activation of caspases can be mediated by other caspase homologs." Q#22925 - CGI_10002038 superfamily 245814 258 335 5.68E-10 57.553 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#22925 - CGI_10002038 superfamily 245814 123 195 2.64E-07 49.0409 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#22925 - CGI_10002038 superfamily 246680 16 102 0.000116362 41.2442 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#22926 - CGI_10002039 superfamily 247986 126 222 7.30E-05 42.7454 cl17432 PBPb superfamily C - "Bacterial periplasmic transport systems use membrane-bound complexes and substrate-bound, membrane-associated, periplasmic binding proteins (PBPs) to transport a wide variety of substrates, such as, amino acids, peptides, sugars, vitamins and inorganic ions. PBPs have two cell-membrane translocation functions: bind substrate, and interact with the membrane bound complex. A diverse group of periplasmic transport receptors for lysine/arginine/ornithine (LAO), glutamine, histidine, sulfate, phosphate, molybdate, and methanol are included in the PBPb CD." 
Q#22926 - CGI_10002039 superfamily 197504 328 460 1.02E-10 59.2253 cl18192 PBPe superfamily - - Eukaryotic homologues of bacterial periplasmic substrate binding proteins; Prokaryotic homologues are represented by a separate alignment: PBPb Q#22929 - CGI_10001790 superfamily 246748 355 597 5.33E-111 341.109 cl14876 Zinc_peptidase_like superfamily - - "Zinc peptidases M18, M20, M28, and M42; Zinc peptidases play vital roles in metabolic and signaling pathways throughout all kingdoms of life. This family corresponds to several clans in the MEROPS database, including the MH clan, which contains 4 families (M18, M20, M28, M42). The peptidase M20 family includes carboxypeptidases such as the glutamate carboxypeptidase from Pseudomonas, the thermostable carboxypeptidase Ss1 of broad specificity from archaea and yeast Gly-X carboxypeptidase. The dipeptidases include bacterial dipeptidase, peptidase V (PepV), a eukaryotic, non-specific dipeptidase, and two Xaa-His dipeptidases (carnosinases). There is also the bacterial aminopeptidase, peptidase T (PepT) that acts only on tripeptide substrates and has therefore been termed a tripeptidase. Peptidase family M28 contains aminopeptidases and carboxypeptidases, and has co-catalytic zinc ions. However, several enzymes in this family utilize other first row transition metal ions such as cobalt and manganese. Each zinc ion is tetrahedrally co-ordinated, with three amino acid ligands plus activated water; one aspartate residue binds both metal ions. The aminopeptidases in this family are also called bacterial leucyl aminopeptidases, but are able to release a variety of N-terminal amino acids. IAP aminopeptidase and aminopeptidase Y preferentially release basic amino acids while glutamate carboxypeptidase II preferentially releases C-terminal glutamates. Glutamate carboxypeptidase II and plasma glutamate carboxypeptidase hydrolyze dipeptides. Peptidase families M18 and M42 contain metalloaminopeptidases. 
M18 is widely distributed in bacteria and eukaryotes. However, only yeast aminopeptidase I and mammalian aspartyl aminopeptidase have been characterized in detail. Some of M42 (also known as glutamyl aminopeptidase) enzymes exhibit aminopeptidase specificity while others also have acylaminoacylpeptidase activity (i.e. hydrolysis of acylated N-terminal residues)." Q#22929 - CGI_10001790 superfamily 244870 139 343 1.27E-75 245.662 cl08238 PA superfamily - - "PA: Protease-associated (PA) domain. The PA domain is an insert domain in a diverse fraction of proteases. The significance of the PA domain to many of the proteins in which it is inserted is undetermined. It may be a protein-protein interaction domain. At peptidase active sites, the PA domain may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate. Proteins into which the PA domain is inserted include the following: i) various signal peptide peptidases including, hSPPL2a and 2b which catalyze the intramembrane proteolysis of tumor necrosis factor alpha, ii) various proteins containing a C3H2C3 RING finger including, Arabidopsis ReMembR-H2 protein and various E3 ubiquitin ligases such as human GRAIL (gene related to anergy in lymphocytes), iii) EDEM3 (ER-degradation-enhancing mannosidase-like 3 protein), iv) various plant vacuolar sorting receptors such as Pisum sativum BP-80, v) glutamate carboxypeptidase II (GCPII), vi) yeast aminopeptidase Y, vii) Vibrio metschnikovii VapT, a sodium dodecyl sulfate (SDS) resistant extracellular alkaline serine protease, viii) lactocepin (a cell envelope-associated protease from Lactobacillus paracasei subsp. paracasei NCDO 151), ix) various subtilisin-like proteases such as melon Cucumisin, and x) human TfR (transferrin receptor) 1 and 2." 
Q#22929 - CGI_10001790 superfamily 202944 625 748 6.09E-32 121.224 cl07854 TFR_dimer superfamily - - Transferrin receptor-like dimerisation domain; This domain is involved in dimerisation of the transferrin receptor as shown in its crystal structure. Q#22929 - CGI_10001790 superfamily 246748 61 115 3.25E-09 57.6025 cl14876 Zinc_peptidase_like superfamily C - "Zinc peptidases M18, M20, M28, and M42; Zinc peptidases play vital roles in metabolic and signaling pathways throughout all kingdoms of life. This family corresponds to several clans in the MEROPS database, including the MH clan, which contains 4 families (M18, M20, M28, M42). The peptidase M20 family includes carboxypeptidases such as the glutamate carboxypeptidase from Pseudomonas, the thermostable carboxypeptidase Ss1 of broad specificity from archaea and yeast Gly-X carboxypeptidase. The dipeptidases include bacterial dipeptidase, peptidase V (PepV), a eukaryotic, non-specific dipeptidase, and two Xaa-His dipeptidases (carnosinases). There is also the bacterial aminopeptidase, peptidase T (PepT) that acts only on tripeptide substrates and has therefore been termed a tripeptidase. Peptidase family M28 contains aminopeptidases and carboxypeptidases, and has co-catalytic zinc ions. However, several enzymes in this family utilize other first row transition metal ions such as cobalt and manganese. Each zinc ion is tetrahedrally co-ordinated, with three amino acid ligands plus activated water; one aspartate residue binds both metal ions. The aminopeptidases in this family are also called bacterial leucyl aminopeptidases, but are able to release a variety of N-terminal amino acids. IAP aminopeptidase and aminopeptidase Y preferentially release basic amino acids while glutamate carboxypeptidase II preferentially releases C-terminal glutamates. Glutamate carboxypeptidase II and plasma glutamate carboxypeptidase hydrolyze dipeptides. Peptidase families M18 and M42 contain metalloaminopeptidases. 
M18 is widely distributed in bacteria and eukaryotes. However, only yeast aminopeptidase I and mammalian aspartyl aminopeptidase have been characterized in detail. Some of M42 (also known as glutamyl aminopeptidase) enzymes exhibit aminopeptidase specificity while others also have acylaminoacylpeptidase activity (i.e. hydrolysis of acylated N-terminal residues)." Q#22930 - CGI_10001791 superfamily 243072 41 174 1.22E-27 105.158 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#22931 - CGI_10001792 superfamily 248012 2 62 4.50E-06 40.6389 cl17458 TIR_2 superfamily N - TIR domain; This is a family of bacterial Toll-like receptors. Q#22932 - CGI_10001793 superfamily 247953 11 159 4.01E-75 229.246 cl17399 3-HAO superfamily - - 3-hydroxyanthranilic acid dioxygenase; In eukaryotes 3-hydroxyanthranilic acid dioxygenase (EC:1.13.11.6) is part of the kynurenine pathway for the degradation of tryptophan and the biosynthesis of nicotinic acid.The prokaryotic homolog is involved in the 2-nitrobenzoate degradation pathway. Q#22933 - CGI_10002105 superfamily 217473 146 369 1.43E-24 103.984 cl03978 Mab-21 superfamily - - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. 
Q#22934 - CGI_10010666 superfamily 222351 394 478 5.67E-19 83.3003 cl15817 Glyco_transf_7N superfamily C - "N-terminal region of glycosyl transferase group 7; This is the N-terminal half of a family of galactosyltransferases from a wide range of Metazoa with three related galactosyltransferases activities, all three of which are possessed by one sequence in some cases. EC:2.4.1.90, N-acetyllactosamine synthase; EC:2.4.1.38, Beta-N-acetylglucosaminyl-glycopeptide beta-1,4- galactosyltransferase; and EC:2.4.1.22 Lactose synthase. Note that N-acetyllactosamine synthase is a component of Lactose synthase along with alpha-lactalbumin, in the absence of alpha-lactalbumin EC:2.4.1.90 is the catalyzed reaction." Q#22934 - CGI_10010666 superfamily 246031 249 330 2.50E-12 63.7128 cl12567 Beta-Casp superfamily C - Beta-Casp domain; The beta-CASP domain is found C terminal to the beta-lactamase domain in pre-mRNA 3'-end-processing endonuclease. The active site of this enzyme is located at the interface of these two domains. Q#22934 - CGI_10010666 superfamily 241867 25 180 2.55E-09 55.0393 cl00446 Lactamase_B superfamily - - Metallo-beta-lactamase superfamily; Metallo-beta-lactamase superfamily. Q#22935 - CGI_10010667 superfamily 246681 358 569 4.04E-37 136.966 cl14643 SRPBCC superfamily - - "START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC (SRPBCC) ligand-binding domain superfamily; SRPBCC domains have a deep hydrophobic ligand-binding pocket; they bind diverse ligands. Included in this superfamily are the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of mammalian STARD1-STARD15, and the C-terminal catalytic domains of the alpha oxygenase subunit of Rieske-type non-heme iron aromatic ring-hydroxylating oxygenases (RHOs_alpha_C), as well as the SRPBCC domains of phosphatidylinositol transfer proteins (PITPs), Bet v 1 (the major pollen allergen of white birch, Betula verrucosa), CoxG, CalC, and related proteins. 
Other members of this superfamily include PYR/PYL/RCAR plant proteins, the aromatase/cyclase (ARO/CYC) domains of proteins such as Streptomyces glaucescens tetracenomycin, and the SRPBCC domains of Streptococcus mutans Smu.440 and related proteins." Q#22936 - CGI_10010668 superfamily 247755 819 1066 1.32E-97 311.04 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#22936 - CGI_10010668 superfamily 248376 512 811 9.47E-46 167.585 cl17822 MutS_III superfamily - - "MutS domain III; This domain is found in proteins of the MutS family (DNA mismatch repair proteins) and is found associated with pfam00488, pfam05188, pfam01624 and pfam05190. The MutS family of proteins is named after the Salmonella typhimurium MutS protein involved in mismatch repair; other members of the family included the eukaryotic MSH 1,2,3, 4,5 and 6 proteins. These have various roles in DNA repair and recombination. Human MSH has been implicated in non-polyposis colorectal carcinoma (HNPCC) and is a mismatch binding protein. The aligned region corresponds with domain III, which is central to the structure of Thermus aquaticus MutS as characterized in." Q#22936 - CGI_10010668 superfamily 216613 214 323 8.09E-27 107.274 cl03286 MutS_I superfamily - - "MutS domain I; This domain is found in proteins of the MutS family (DNA mismatch repair proteins) and is found associated with pfam00488, pfam05188, pfam05192 and pfam05190. 
The MutS family of proteins is named after the Salmonella typhimurium MutS protein involved in mismatch repair; other members of the family included the eukaryotic MSH 1,2,3, 4,5 and 6 proteins. These have various roles in DNA repair and recombination. Human MSH has been implicated in non-polyposis colorectal carcinoma (HNPCC) and is a mismatch binding protein. The aligned region corresponds with globular domain I, which is involved in DNA binding, in Thermus aquaticus MutS as characterized in." Q#22936 - CGI_10010668 superfamily 218486 345 486 2.36E-09 56.6005 cl04975 MutS_II superfamily - - "MutS domain II; This domain is found in proteins of the MutS family (DNA mismatch repair proteins) and is found associated with pfam00488, pfam01624, pfam05192 and pfam05190. The MutS family of proteins is named after the Salmonella typhimurium MutS protein involved in mismatch repair; other members of the family included the eukaryotic MSH 1,2,3, 4,5 and 6 proteins. These have various roles in DNA repair and recombination. Human MSH has been implicated in non-polyposis colorectal carcinoma (HNPCC) and is a mismatch binding protein. This domain corresponds to domain II in Thermus aquaticus MutS as characterized in, and resembles RNAse-H-like domains (see pfam00075)." Q#22937 - CGI_10010669 superfamily 247941 70 198 5.70E-07 46.9453 cl17387 Methyltransf_21 superfamily - - "Methyltransferase FkbM domain; This family has members from bacteria to human, and appears to be a methyltransferase." Q#22940 - CGI_10010672 superfamily 217952 19 278 1.63E-73 235.266 cl04439 Gcd10p superfamily - - Gcd10p family; eIF-3 is a multi-subunit complex that stimulates translation initiation in vitro at several different steps. This family corresponds to the gamma subunit of eIF3. The Yeast protein Gcd10p has also been shown to be part of a complex with the methyltransferase Gcd14p that is involved in modifying tRNA. 
Q#22942 - CGI_10010674 superfamily 247743 462 625 7.90E-27 107.617 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#22942 - CGI_10010674 superfamily 247743 208 347 2.76E-12 65.2451 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#22943 - CGI_10010675 superfamily 242274 29 75 0.00110179 34.1334 cl01053 SGNH_hydrolase superfamily C - "SGNH_hydrolase, or GDSL_hydrolase, is a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the typical Ser-His-Asp(Glu) triad from other serine hydrolases, but may lack the carboxylic acid." 
Q#22944 - CGI_10010676 superfamily 242910 49 153 4.11E-51 179.791 cl02159 Peptidase_C13 superfamily C - "Peptidase C13 family; Members of this family are asparaginyl peptidases. The blood fluke parasite Schistosoma mansoni has at least five Clan CA cysteine peptidases in its digestive tract including cathepsins B (2 isoforms), C, F and L. All have been recombinantly expressed as active enzymes, albeit in various stages of activation. In addition, a Clan CD peptidase, termed asparaginyl endopeptidase or 'legumain' has been identified. This has formerly been characterized as a 'haemoglobinase', but this term is probably incorrect. Two cDNAs have been described for Schistosoma mansoni legumain; one encodes an active enzyme whereas the active site cysteine residue encoded by the second cDNA is substituted by an asparagine residue. Both forms have been recombinantly expressed." Q#22944 - CGI_10010676 superfamily 242910 150 195 4.56E-10 59.6083 cl02159 Peptidase_C13 superfamily N - "Peptidase C13 family; Members of this family are asparaginyl peptidases. The blood fluke parasite Schistosoma mansoni has at least five Clan CA cysteine peptidases in its digestive tract including cathepsins B (2 isoforms), C, F and L. All have been recombinantly expressed as active enzymes, albeit in various stages of activation. In addition, a Clan CD peptidase, termed asparaginyl endopeptidase or 'legumain' has been identified. This has formerly been characterized as a 'haemoglobinase', but this term is probably incorrect. Two cDNAs have been described for Schistosoma mansoni legumain; one encodes an active enzyme whereas the active site cysteine residue encoded by the second cDNA is substituted by an asparagine residue. Both forms have been recombinantly expressed." Q#22946 - CGI_10010678 superfamily 241565 146 221 0.00959049 34.6586 cl00038 BRCT superfamily - - "Breast Cancer Suppressor Protein (BRCA1), carboxy-terminal domain. 
The BRCT domain is found within many DNA damage repair and cell cycle checkpoint proteins. The unique diversity of this domain superfamily allows BRCT modules to interact forming homo/hetero BRCT multimers, BRCT-non-BRCT interactions, and interactions within DNA strand breaks." Q#22948 - CGI_10010680 superfamily 248469 7 93 8.15E-05 39.2755 cl17915 HAD_like superfamily C - "Haloacid dehalogenase-like hydrolases. The haloacid dehalogenase-like (HAD) superfamily includes L-2-haloacid dehalogenase, epoxide hydrolase, phosphoserine phosphatase, phosphomannomutase, phosphoglycolate phosphatase, P-type ATPase, and many others, all of which use a nucleophilic aspartate in their phosphoryl transfer reaction. All members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. Members of this superfamily are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases." Q#22948 - CGI_10010680 superfamily 240421 86 141 0.000435797 38.1448 cl14774 PTZ00445 superfamily N - p36-like protein; Provisional Q#22949 - CGI_10010681 superfamily 241607 428 476 2.54E-17 76.9593 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chymotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. 
The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#22949 - CGI_10010681 superfamily 248458 32 220 6.54E-08 53.4717 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. 
Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#22949 - CGI_10010681 superfamily 248458 312 397 0.000379282 41.9157 cl17904 MFS superfamily NC - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. 
Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#22950 - CGI_10010682 superfamily 241823 9 182 6.10E-78 237.857 cl00376 Ribosomal_L10_P0 superfamily - - "Ribosomal protein L10 family; composed of the large subunit ribosomal protein called L10 in bacteria, P0 in eukaryotes, and L10e in archaea, as well as uncharacterized P0-like eukaryotic proteins. In all three kingdoms, L10 forms a tight complex with multiple copies of the small acidic protein L12(e). This complex forms a stalk structure on the large subunit of the ribosome. The N-terminal domain (NTD) of L10 interacts with L11 protein and forms the base of the L7/L12 stalk, while the extended C-terminal helix binds to two or three dimers of the NTD of L7/L12 (L7 and L12 are identical except for an acetylated N-terminus). The L7/L12 stalk is known to contain the binding site for elongation factors G and Tu (EF-G and EF-Tu, respectively); however, there is disagreement as to whether or not L10 is involved in forming the binding site. The stalk is believed to be associated with GTPase activities in protein synthesis. In a neuroblastoma cell line, L10 has been shown to interact with the SH3 domain of Src and to activate the binding of the Nck1 adaptor protein with skeletal proteins such as the Wiskott-Aldrich Syndrome Protein (WASP) and the WASP-interacting protein (WIP). 
Some eukaryotic P0 sequences have an additional C-terminal domain homologous with acidic proteins P1 and P2." Q#22950 - CGI_10010682 superfamily 215914 231 313 8.54E-08 48.7728 cl18353 Ribosomal_60s superfamily - - "60s Acidic ribosomal protein; This family includes archaebacterial L12, eukaryotic P0, P1 and P2." Q#22954 - CGI_10010686 superfamily 217598 394 498 3.72E-49 171.113 cl04130 KCNQ_channel superfamily N - KCNQ voltage-gated potassium channel; This family matches to the C-terminal tail of KCNQ type potassium channels. Q#22954 - CGI_10010686 superfamily 219619 177 227 3.67E-15 71.4699 cl18518 Ion_trans_2 superfamily N - Ion channel; This family includes the two membrane helix type ion channels found in bacteria. Q#22955 - CGI_10020035 superfamily 247802 80 277 1.32E-119 352.6 cl17248 RIO superfamily - - "RIO kinase family, catalytic domain. The RIO kinase catalytic domain family is part of a larger superfamily, that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and phosphoinositide 3-kinase (PI3K). RIO kinases are atypical protein serine kinases present in archaea, bacteria and eukaryotes. Serine kinases catalyze the transfer of the gamma-phosphoryl group from ATP to serine residues in protein substrates. RIO kinases contain a kinase catalytic signature, but otherwise show very little sequence similarity to typical PKs. The RIO catalytic domain is truncated compared to the catalytic domains of typical PKs, with deletions of the loops responsible for substrate binding. Most organisms contain at least two RIO kinases, RIO1 and RIO2. A third protein, RIO3, is present in multicellular eukaryotes. In yeast, RIO1 and RIO2 are essential for survival. They function as non-ribosomal factors necessary for late 18S rRNA processing. RIO1 is also required for proper cell cycle progression and chromosome maintenance. 
The biological substrates for RIO kinases are still unknown." Q#22955 - CGI_10020035 superfamily 220140 9 92 1.52E-35 127.277 cl07723 Rio2_N superfamily - - "Rio2, N-terminal; Members of this family are found in Rio2, and are structurally homologous to the winged helix (wHTH) domain. They adopt a structure consisting of four alpha helices followed by two beta strands and a fifth alpha helix. The domain confers DNA binding properties to the protein, as per other winged helix domains." Q#22958 - CGI_10020038 superfamily 243066 10 100 1.88E-21 88.3789 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#22958 - CGI_10020038 superfamily 219619 336 400 3.81E-09 53.3655 cl18518 Ion_trans_2 superfamily - - Ion channel; This family includes the two membrane helix type ion channels found in bacteria. Q#22959 - CGI_10020039 superfamily 247068 721 822 6.47E-22 93.5321 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#22959 - CGI_10020039 superfamily 247068 854 927 8.89E-11 60.7901 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#22959 - CGI_10020039 superfamily 247068 1170 1256 2.55E-09 56.553 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#22959 - CGI_10020039 superfamily 247068 1074 1161 6.84E-09 55.3974 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#22959 - CGI_10020039 superfamily 247068 1281 1366 1.56E-08 54.2418 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#22959 - CGI_10020039 superfamily 247068 615 706 3.98E-07 50.0046 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#22959 - CGI_10020039 superfamily 247068 299 383 5.26E-07 49.6194 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#22959 - CGI_10020039 superfamily 245213 1388 1425 2.03E-06 46.701 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#22959 - CGI_10020039 superfamily 247068 509 603 0.000103554 42.6858 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#22959 - CGI_10020039 superfamily 216265 1470 1618 6.21E-20 88.8988 cl03079 Cadherin_C superfamily - - Cadherin cytoplasmic region; Cadherins are vital in cell-cell adhesion during tissue differentiation. Cadherins are linked to the cytoskeleton by catenins. Catenins bind to the cytoplasmic tail of the cadherin. Cadherins cluster to form foci of homophilic binding units. A key determinant to the strength of the binding that it is mediated by cadherins is the juxtamembrane region of the cadherin. This region induces clustering and also binds to the protein p120ctn. 
Q#22959 - CGI_10020039 superfamily 247068 214 287 0.00114601 39.2551 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#22961 - CGI_10020041 superfamily 245010 106 203 0.00161731 37.3568 cl09111 Prefoldin superfamily - - "Prefoldin is a hexameric molecular chaperone complex, found in both eukaryotes and archaea, that binds and stabilizes newly synthesized polypeptides allowing them to fold correctly. The complex contains two alpha and four beta subunits, the two subunits being evolutionarily related. In archaea, there is usually only one gene for each subunit while in eukaryotes there are two or more paralogous genes encoding each subunit adding heterogeneity to the structure of the hexamer. The structure of the complex consists of a double beta barrel assembly with six protruding coiled-coils." Q#22961 - CGI_10020041 superfamily 241563 50 84 0.00649119 35.1476 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#22962 - CGI_10020042 superfamily 246918 298 349 1.92E-14 69.1527 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. 
Q#22962 - CGI_10020042 superfamily 241583 161 241 3.78E-09 56.0927 cl00064 ZnMc superfamily N - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#22963 - CGI_10020043 superfamily 241583 277 375 2.65E-14 73.0415 cl00064 ZnMc superfamily N - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#22963 - CGI_10020043 superfamily 246918 486 536 4.42E-14 69.1527 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#22963 - CGI_10020043 superfamily 243051 774 903 0.00342769 38.1005 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." 
Q#22965 - CGI_10020045 superfamily 216239 39 355 8.85E-117 353.534 cl18361 IRK superfamily - - Inward rectifier potassium channel; Inward rectifier potassium channel. Q#22966 - CGI_10020046 superfamily 216239 50 369 2.17E-99 303.458 cl18361 IRK superfamily - - Inward rectifier potassium channel; Inward rectifier potassium channel. Q#22967 - CGI_10020047 superfamily 243034 321 436 8.90E-05 41.2116 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#22967 - CGI_10020047 superfamily 215821 202 289 1.63E-22 92.6886 cl18346 FKBP_C superfamily - - FKBP-type peptidyl-prolyl cis-trans isomerase; FKBP-type peptidyl-prolyl cis-trans isomerase. 
Q#22968 - CGI_10020048 superfamily 241763 8 451 0 616.526 cl00298 Peptidase_C1 superfamily - - "C1 Peptidase family (MEROPS database nomenclature), also referred to as the papain family; composed of two subfamilies of cysteine peptidases (CPs), C1A (papain) and C1B (bleomycin hydrolase). Papain-like enzymes are mostly endopeptidases with some exceptions like cathepsins B, C, H and X, which are exopeptidases. Papain-like CPs have different functions in various organisms. Plant CPs are used to mobilize storage proteins in seeds while mammalian CPs are primarily lysosomal enzymes responsible for protein degradation in the lysosome. Papain-like CPs are synthesized as inactive proenzymes with N-terminal propeptide regions, which are removed upon activation. Bleomycin hydrolase (BH) is a CP that detoxifies bleomycin by hydrolysis of an amide group. It acts as a carboxypeptidase on its C-terminus to convert itself into an aminopeptidase and peptide ligase. BH is found in all tissues in mammals as well as in many other eukaryotes. It forms a hexameric ring barrel structure with the active sites imbedded in the central channel. Some members of the C1 family are proteins classified as non-peptidase homologs which lack peptidase activity or have missing active site residues." Q#22969 - CGI_10020049 superfamily 220948 24 112 1.28E-37 124.456 cl12595 DUF2615 superfamily - - "Protein of unknown function (DUF2615); This small. approximately 100 residue, family is conserved from worms to humans. It is cysteine-rich with a characteristic FDxCEC sequence motif. The function is not known." Q#22970 - CGI_10020050 superfamily 241578 209 359 7.86E-29 115.467 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. 
The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#22970 - CGI_10020050 superfamily 241578 1022 1172 2.49E-26 108.148 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." 
Q#22970 - CGI_10020050 superfamily 241578 395 544 6.92E-15 74.6354 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#22970 - CGI_10020050 superfamily 245213 712 748 8.95E-10 56.491 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#22970 - CGI_10020050 superfamily 245213 902 938 8.95E-10 56.491 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#22970 - CGI_10020050 superfamily 245213 864 900 1.70E-09 55.7206 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#22970 - CGI_10020050 superfamily 245213 941 976 1.81E-09 55.7206 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#22970 - CGI_10020050 superfamily 245213 751 786 2.68E-09 55.3354 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#22970 - CGI_10020050 superfamily 245213 788 824 3.91E-09 54.565 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#22970 - CGI_10020050 superfamily 245213 676 710 4.68E-09 54.565 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#22970 - CGI_10020050 superfamily 245213 128 163 1.83E-08 52.639 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#22970 - CGI_10020050 superfamily 245213 828 862 6.39E-08 51.0982 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#22970 - CGI_10020050 superfamily 245213 165 201 1.39E-06 47.2462 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#22970 - CGI_10020050 superfamily 245213 978 1014 1.39E-06 47.2462 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#22970 - CGI_10020050 superfamily 248133 1308 1567 7.44E-60 209.379 cl17579 Fructosamin_kin superfamily - - Fructosamine kinase; This family includes eukaryotic fructosamine-3-kinase enzymes. The family also includes bacterial members that have not been characterized but probably have a similar or identical function. Q#22970 - CGI_10020050 superfamily 241578 1208 1289 1.74E-05 45.3602 cl00057 vWFA superfamily C - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." 
Q#22971 - CGI_10020051 superfamily 241578 147 297 4.38E-29 110.074 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#22971 - CGI_10020051 superfamily 245213 103 139 4.69E-08 48.787 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#22972 - CGI_10020052 superfamily 242575 81 203 1.93E-05 42.2238 cl01548 YccV-like superfamily - - Hemimethylated DNA-binding protein YccV like; YccV is a hemimethylated DNA binding protein which has been shown to regulate dnaA gene expression. 
The structure of one of the hypothetical proteins in this family has been solved and it forms a beta sheet structure with a terminating alpha helix. Q#22974 - CGI_10020054 superfamily 243072 170 269 6.14E-07 48.5338 cl02529 ANK superfamily C - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#22979 - CGI_10020059 superfamily 247792 140 194 3.15E-05 40.892 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#22982 - CGI_10020062 superfamily 245201 173 486 6.01E-61 204.64 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." 
Q#22983 - CGI_10020063 superfamily 247064 8 167 1.17E-47 164.773 cl15774 Hemerythrin-like superfamily - - "Hemerythrin family; Hemerythrin (Hr) and related proteins are found in bacteria, archaea and eukaryotes. They are non-heme diiron oxygen transport proteins. In addition to oxygen transport, members are involved in cadmium fixation and host anti-bacterial defense. They have the same "four alpha helix bundle" motif and similar active site structures. Some members, like Hr, form oligomers, the octameric form being most prevalent, while others are monomeric." Q#22983 - CGI_10020063 superfamily 243074 203 244 9.70E-14 66.7613 cl02535 F-box-like superfamily - - F-box-like; This is an F-box-like family. Q#22983 - CGI_10020063 superfamily 199166 519 583 2.71E-07 50.4036 cl15308 AMN1 superfamily C - "Antagonist of mitotic exit network protein 1; Amn1 has been functionally characterized in Saccharomyces cerevisiae as a component of the Antagonist of MEN pathway (AMEN). The AMEN network is activated by MEN (mitotic exit network) via an active Cdc14, and in turn switches off MEN. Amn1 constitutes one of the alternative mechanisms by which MEN may be disrupted. Specifically, Amn1 binds Tem1 (Termination of M-phase, a GTPase that belongs to the RAS superfamily), and disrupts its association with Cdc15, the primary downstream target. Amn1 is a leucine-rich repeat (LRR) protein, with 12 repeats in the S. cerevisiae ortholog. As a negative regulator of the signal transduction pathway MEN, overexpression of AMN1 slows the growth of wild type cells. The function of the vertebrate members of this family has not been determined experimentally, they have fewer LRRs that determine the extent of this model." Q#22983 - CGI_10020063 superfamily 199166 353 403 0.00119763 39.618 cl15308 AMN1 superfamily NC - "Antagonist of mitotic exit network protein 1; Amn1 has been functionally characterized in Saccharomyces cerevisiae as a component of the Antagonist of MEN pathway (AMEN). 
The AMEN network is activated by MEN (mitotic exit network) via an active Cdc14, and in turn switches off MEN. Amn1 constitutes one of the alternative mechanisms by which MEN may be disrupted. Specifically, Amn1 binds Tem1 (Termination of M-phase, a GTPase that belongs to the RAS superfamily), and disrupts its association with Cdc15, the primary downstream target. Amn1 is a leucine-rich repeat (LRR) protein, with 12 repeats in the S. cerevisiae ortholog. As a negative regulator of the signal transduction pathway MEN, overexpression of AMN1 slows the growth of wild type cells. The function of the vertebrate members of this family has not been determined experimentally, they have fewer LRRs that determine the extent of this model." Q#22986 - CGI_10020066 superfamily 219542 135 250 9.50E-42 148.159 cl18517 Cu-oxidase_3 superfamily - - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. Q#22986 - CGI_10020066 superfamily 215896 309 430 4.98E-28 110.848 cl18351 Cu-oxidase superfamily N - Multicopper oxidase; Many of the proteins in this family contain multiple similar copies of this plastocyanin-like domain. Q#22986 - CGI_10020066 superfamily 219541 538 687 6.02E-27 107.169 cl18516 Cu-oxidase_2 superfamily - - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. Q#22987 - CGI_10020067 superfamily 245208 18 650 0 566.958 cl09933 ACAD superfamily - - "Acyl-CoA dehydrogenase; Both mitochondrial acyl-CoA dehydrogenases (ACAD) and peroxisomal acyl-CoA oxidases (AXO) catalyze the alpha,beta dehydrogenation of the corresponding trans-enoyl-CoA by FAD, which becomes reduced. The reduced form of ACAD is reoxidized in the oxidative half-reaction by electron-transferring flavoprotein (ETF), from which the electrons are transferred to the mitochondrial respiratory chain coupled with ATP synthesis. 
In contrast, AXO catalyzes a different oxidative half-reaction, in which the reduced FAD is reoxidized by molecular oxygen. The ACAD family includes the eukaryotic beta-oxidation enzymes, short (SCAD), medium (MCAD), long (LCAD) and very-long (VLCAD) chain acyl-CoA dehydrogenases. These enzymes all share high sequence similarity, but differ in their substrate specificities. The ACAD family also includes amino acid catabolism enzymes such as Isovaleryl-CoA dehydrogenase (IVD), short/branched chain acyl-CoA dehydrogenases (SBCAD), Isobutyryl-CoA dehydrogenase (IBDH), glutaryl-CoA dehydrogenase (GCD) and Crotonobetainyl-CoA dehydrogenase. The mitochondrial ACAD's are generally homotetramers, except for VLCAD, which is a homodimer. Related enzymes include the SOS adaptive response protein aidB, Naphthocyclinone hydroxylase (NcnH), and Dibenzothiophene (DBT) desulfurization enzyme C (DszC)" Q#22989 - CGI_10020069 superfamily 247856 65 126 5.84E-13 60.6393 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#22989 - CGI_10020069 superfamily 247856 100 174 3.59E-09 49.8537 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#22989 - CGI_10020069 superfamily 244899 36 86 0.00257561 34.3878 cl08302 S-100 superfamily N - "S-100: S-100 domain, which represents the largest family within the superfamily of proteins carrying the Ca-binding EF-hand motif. Note that this S-100 hierarchy contains only S-100 EF-hand domains, other EF-hands have been modeled separately. S100 proteins are expressed exclusively in vertebrates, and are implicated in intracellular and extracellular regulatory activities. Intracellularly, S100 proteins act as Ca-signaling or Ca-buffering proteins. The most unusual characteristic of certain S100 proteins is their occurrence in extracellular space, where they act in a cytokine-like manner through RAGE, the receptor for advanced glycation products. Structural data suggest that many S100 members exist within cells as homo- or heterodimers and even oligomers; oligomerization contributes to their functional diversification. Upon binding calcium, most S100 proteins change conformation to a more open structure exposing a hydrophobic cleft. This hydrophobic surface represents the interaction site of S100 proteins with their target proteins. There is experimental evidence showing that many S100 proteins have multiple binding partners with diverse mode of interaction with different targets. In addition to S100 proteins (such as S100A1,-3,-4,-6,-7,-10,-11,and -13), this group includes the ''fused'' gene family, a group of calcium binding S100-related proteins. The ''fused'' gene family includes multifunctional epidermal differentiation proteins - profilaggrin, trichohyalin, repetin, hornerin, and cornulin; functionally these proteins are associated with keratin intermediate filaments and partially crosslinked to the cell envelope. These ''fused'' gene proteins contain N-terminal sequence with two Ca-binding EF-hands motif, which may be associated with calcium signaling in epidermal cells and autoprocessing in a calcium-dependent manner. 
In contrast to S100 proteins, "fused" gene family proteins contain an extraordinary high number of almost perfect peptide repeats with regular array of polar and charged residues similar to many known cell envelope proteins." Q#22992 - CGI_10020072 superfamily 217390 150 292 6.14E-22 89.5412 cl18407 TPT superfamily - - Triose-phosphate Transporter family; This family includes transporters with a specificity for triose phosphate. Q#22993 - CGI_10020073 superfamily 247723 9 85 1.05E-41 138.217 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#22993 - CGI_10020073 superfamily 199156 105 121 0.00495866 33.5708 cl15298 zf-CCHC superfamily - - "Zinc knuckle; The zinc knuckle is a zinc binding motif composed of the the following CX2CX4HX4C where X can be any amino acid. The motifs are mostly from retroviral gag proteins (nucleocapsid). Prototype structure is from HIV. Also contains members involved in eukaryotic gene regulation, such as C. elegans GLH-1. Structure is an 18-residue zinc finger." 
Q#22994 - CGI_10020074 superfamily 221741 314 409 2.36E-10 58.1111 cl15057 Cadherin-like superfamily - - "Cadherin-like beta sandwich domain; This domain is found in several bacterial, metazoan and chlorophyte algal proteins. A profile-profile comparison recovered the cadherin domain and a comparison of the predicted structure of this domain with the crystal structure of the cadherin showed a congruent seven stranded secondary structure. The domain is widespread in bacteria and seen in the firmicutes, actinobacteria, certain proteobacteria, bacteroides and chlamydiae with an expansion in Clostridium. In contrast, it is limited in its distribution in eukaryotes suggesting that it was derived through lateral transfer from bacteria. In prokaryotes, this domain is widely fused to other domains such as FNIII (Fibronectin Type III), TIG, SLH (S-layer homology), discoidin, cell-wall-binding repeat domain and alpha-amylase-like glycohydrolases. These associations are suggestive of a carbohydrate-binding function for this cadherin-like domain. In animal proteins it is associated with an ATP-grasp domain." Q#22994 - CGI_10020074 superfamily 242274 507 609 0.000259362 42.0094 cl01053 SGNH_hydrolase superfamily C - "SGNH_hydrolase, or GDSL_hydrolase, is a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the typical Ser-His-Asp(Glu) triad from other serine hydrolases, but may lack the carboxlic acid." Q#23001 - CGI_10002852 superfamily 243267 7 284 4.94E-39 140.827 cl03000 Innexin superfamily - - "Innexin; This family includes the drosophila proteins Ogre and shaking-B, and the C. elegans proteins Unc-7 and Unc-9. Members of this family are integral membrane proteins which are involved in the formation of gap junctions. This family has been named the Innexins." 
Q#23009 - CGI_10020009 superfamily 241563 300 338 4.00E-07 46.8967 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#23009 - CGI_10020009 superfamily 188588 25 81 8.55E-05 40.6499 cl14866 exo_TIGR04073 superfamily - - "putative exosortase-associated protein, TIGR04073 family; Members of this protein family are found in beta, gamma, and delta proteobacteria, and in the verrucomicrobia. Twenty-two of twenty-four species encountered contain the PEP-CTERM/exosortase system for modulating extracellular polysaccharide biosynthesis production, suggesting a role in protein sorting. The N-terminal signal sequence is divergent and not included in the model. PSI-BLAST and HMM searches suggest a distant sequence relationship between a region of this protein of about 100 amino acids and a corresponding region of the very large eukaryotic protein vps13, associated with vacuolar protein sorting in yeast." Q#23011 - CGI_10020011 superfamily 241563 38 76 3.48E-06 44.5855 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#23011 - CGI_10020011 superfamily 110440 497 524 0.00449441 35.0761 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. 
Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#23012 - CGI_10020012 superfamily 222150 874 895 0.00820373 35.4453 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#23013 - CGI_10020013 superfamily 246597 77 280 9.63E-31 116.013 cl13995 MPP_superfamily superfamily - - "metallophosphatase superfamily, metallophosphatase domain; Metallophosphatases (MPPs), also known as metallophosphoesterases, phosphodiesterases (PDEs), binuclear metallophosphoesterases, and dimetal-containing phosphoesterases (DMPs), represent a diverse superfamily of enzymes with a conserved domain containing an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. This superfamily includes: the phosphoprotein phosphatases (PPPs), Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination." Q#23014 - CGI_10020014 superfamily 217452 44 151 3.20E-28 106.347 cl12276 Tropomodulin superfamily N - Tropomodulin; Tropomodulin is a novel tropomyosin regulatory protein that binds to the end of erythrocyte tropomyosin and blocks head-to-tail association of tropomyosin along actin filaments. Limited proteolysis shows this protein is composed of two domains. The amino terminal domain contains the tropomyosin binding function.
Q#23015 - CGI_10020015 superfamily 193256 2393 2634 3.11E-70 241.005 cl18189 AAA_8 superfamily - - "P-loop containing dynein motor region D4; The 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This particular family is the D4 ATP-binding region of the motor." Q#23015 - CGI_10020015 superfamily 193257 2997 3227 2.13E-53 190.967 cl15086 AAA_9 superfamily - - "ATP-binding dynein motor region D5; The 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This particular family is the D5 ATP-binding region of the motor, but has lost its P-loop." Q#23015 - CGI_10020015 superfamily 193251 2033 2304 5.04E-50 182.443 cl18188 AAA_7 superfamily - - "P-loop containing dynein motor region D3; the 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This particular family is the D3 and is an ATP binding site." Q#23015 - CGI_10020015 superfamily 193253 2646 2975 7.73E-46 172.529 cl15084 MT superfamily - - "Microtubule-binding stalk of dynein motor; the 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. 
The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This family is the region between D4 and D5 and is the two predicted alpha-helical coiled coil segments that form the stalk supporting the ATP-sensitive microtubule binding component." Q#23015 - CGI_10020015 superfamily 247743 1724 1869 1.98E-07 52.6828 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#23017 - CGI_10020017 superfamily 241600 18 235 1.40E-68 212.486 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." 
Q#23018 - CGI_10020018 superfamily 241600 9 180 6.65E-52 167.417 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#23020 - CGI_10020020 superfamily 247085 900 1007 3.97E-28 111.443 cl15820 RICIN superfamily - - "Ricin-type beta-trefoil; Carbohydrate-binding domain formed from presumed gene triplication. The domain is found in a variety of molecules serving diverse functions such as enzymatic activity, inhibitory toxicity and signal transduction. Highly specific ligand binding occurs on exposed surfaces of the compact domain structure." Q#23020 - CGI_10020020 superfamily 245596 792 878 1.61E-33 131.943 cl11394 Glyco_tranf_GTA_type superfamily N - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold.
This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#23020 - CGI_10020020 superfamily 241815 603 737 2.01E-16 79.4103 cl00361 Transcrip_reg superfamily C - "Transcriptional regulator; This is a family of transcriptional regulators. In mammals, it activates the transcription of mitochondrially-encoded COX1. In bacteria, it negatively regulates the quorum-sensing response regulator by binding to its promoter region." Q#23023 - CGI_10020023 superfamily 247085 78 194 2.75E-29 106.821 cl15820 RICIN superfamily - - "Ricin-type beta-trefoil; Carbohydrate-binding domain formed from presumed gene triplication. The domain is found in a variety of molecules serving diverse functions such as enzymatic activity, inhibitory toxicity and signal transduction. Highly specific ligand binding occurs on exposed surfaces of the compact domain structure." Q#23023 - CGI_10020023 superfamily 245596 1 37 3.57E-18 79.1705 cl11394 Glyco_tranf_GTA_type superfamily NC - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes.
To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#23023 - CGI_10020023 superfamily 245596 23 65 1.51E-08 52.2066 cl11394 Glyco_tranf_GTA_type superfamily N - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#23025 - CGI_10020025 superfamily 247085 687 803 2.34E-28 111.443 cl15820 RICIN superfamily - - "Ricin-type beta-trefoil; Carbohydrate-binding domain formed from presumed gene triplication. 
The domain is found in a variety of molecules serving diverse functions such as enzymatic activity, inhibitory toxicity and signal transduction. Highly specific ligand binding occurs on exposed surfaces of the compact domain structure." Q#23025 - CGI_10020025 superfamily 245596 606 645 1.55E-18 86.1041 cl11394 Glyco_tranf_GTA_type superfamily NC - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#23025 - CGI_10020025 superfamily 245596 647 674 1.19E-08 56.0586 cl11394 Glyco_tranf_GTA_type superfamily N - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein.
Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#23028 - CGI_10020028 superfamily 243034 48 121 5.51E-08 46.6044 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc 
transferase; three copies of the repeat are present here" Q#23029 - CGI_10020029 superfamily 247746 55 122 0.00373586 36.0822 cl17192 ATP-synt_B superfamily N - "ATP synthase B/B' CF(0); Part of the CF(0) (base unit) of the ATP synthase. The base unit is thought to translocate protons through membrane (inner membrane in mitochondria, thylakoid membrane in plants, cytoplasmic membrane in bacteria). The B subunits are thought to interact with the stalk of the CF(1) subunits. This domain should not be confused with the ab CF(1) proteins (in the head of the ATP synthase) which are found in pfam00006" Q#23030 - CGI_10020030 superfamily 243034 28 112 4.69E-09 49.3008 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#23031 - CGI_10020031 superfamily 
110440 489 513 0.00706678 34.6909 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#23032 - CGI_10020032 superfamily 247068 46 139 6.55E-25 100.899 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#23032 - CGI_10020032 superfamily 147567 575 838 1.45E-31 125.495 cl12309 DAG1 superfamily - - "Dystroglycan (Dystrophin-associated glycoprotein 1); Dystroglycan is one of the dystrophin-associated glycoproteins, which is encoded by a 5.5 kb transcript in human. The protein product is cleaved into two non-covalently associated subunits, [alpha] (N-terminal) and [beta] (C-terminal). In skeletal muscle the dystroglycan complex works as a transmembrane linkage between the extracellular matrix and the cytoskeleton.
[alpha]-dystroglycan is extracellular and binds to merosin ([alpha]-2 laminin) in the basement membrane, while [beta]-dystroglycan is a transmembrane protein and binds to dystrophin, which is a large rod-like cytoskeletal protein, absent in Duchenne muscular dystrophy patients. Dystrophin binds to intracellular actin cables. In this way, the dystroglycan complex, which links the extracellular matrix to the intracellular actin cables, is thought to provide structural integrity in muscle tissues. The dystroglycan complex is also known to serve as an agrin receptor in muscle, where it may regulate agrin-induced acetylcholine receptor clustering at the neuromuscular junction. There is also evidence which suggests the function of dystroglycan as a part of the signal transduction pathway because it is shown that Grb2, a mediator of the Ras-related signal pathway, can interact with the cytoplasmic domain of dystroglycan. In general, aberrant expression of dystrophin-associated protein complex underlies the pathogenesis of Duchenne muscular dystrophy, Becker muscular dystrophy and severe childhood autosomal recessive muscular dystrophy. Interestingly, no genetic disease has been described for either [alpha]- or [beta]-dystroglycan. Dystroglycan is widely distributed in non-muscle tissues as well as in muscle tissues. During epithelial morphogenesis of kidney, the dystroglycan complex is shown to act as a receptor for the basement membrane. Dystroglycan expression in mouse brain and neural retina has also been reported. However, the physiological role of dystroglycan in non-muscle tissues has remained unclear." 
Q#23032 - CGI_10020032 superfamily 206765 157 278 2.46E-28 111.267 cl16905 alpha_DG_C superfamily - - "C-terminal domain of alpha dystroglycan; Dystroglycan is a glycoprotein widely distributed in skeletal muscle and other tissues; the pre-protein is cleaved into two subunits (alpha and beta) that form a complex which links the extracellular matrix to the cytoskeleton. This C-terminal domain of the alpha-subunit appears to contact neighboring cadherin-like repeats of alpha dystroglycan, and may also be involved in interactions with other components of the dystrophin-dystroglycan-complex (DGC). DGC has been shown to interact with extracellular matrix components such as laminin, perlecan and m-agrin, suggesting that the complex may play various different roles depending on the extracellular ligand." Q#23036 - CGI_10012986 superfamily 219542 11 115 5.45E-37 133.136 cl18517 Cu-oxidase_3 superfamily - - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. Q#23036 - CGI_10012986 superfamily 219541 397 569 2.09E-29 112.947 cl18516 Cu-oxidase_2 superfamily - - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. Q#23036 - CGI_10012986 superfamily 215896 126 311 2.33E-16 76.5648 cl18351 Cu-oxidase superfamily - - Multicopper oxidase; Many of the proteins in this family contain multiple similar copies of this plastocyanin-like domain. Q#23037 - CGI_10012987 superfamily 219542 5 115 2.26E-35 129.669 cl18517 Cu-oxidase_3 superfamily - - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. Q#23037 - CGI_10012987 superfamily 219541 403 577 8.72E-27 106.399 cl18516 Cu-oxidase_2 superfamily - - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. 
Q#23037 - CGI_10012987 superfamily 215896 126 316 1.72E-16 76.95 cl18351 Cu-oxidase superfamily - - Multicopper oxidase; Many of the proteins in this family contain multiple similar copies of this plastocyanin-like domain. Q#23038 - CGI_10012988 superfamily 219542 11 115 2.48E-37 134.291 cl18517 Cu-oxidase_3 superfamily - - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. Q#23038 - CGI_10012988 superfamily 219541 400 574 1.25E-27 108.325 cl18516 Cu-oxidase_2 superfamily - - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. Q#23038 - CGI_10012988 superfamily 215896 126 312 2.25E-18 82.3428 cl18351 Cu-oxidase superfamily - - Multicopper oxidase; Many of the proteins in this family contain multiple similar copies of this plastocyanin-like domain. Q#23039 - CGI_10012989 superfamily 247805 33 184 3.90E-12 61.9696 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#23040 - CGI_10012991 superfamily 241546 3 117 9.40E-18 80.0756 cl00011 PLAT superfamily - - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats. The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#23040 - CGI_10012991 superfamily 215847 225 632 2.87E-80 269.316 cl09510 Lipoxygenase superfamily N - Lipoxygenase; Lipoxygenase. Q#23041 - CGI_10012992 superfamily 215847 2 40 5.43E-06 43.2039 cl09510 Lipoxygenase superfamily NC - Lipoxygenase; Lipoxygenase.
Q#23042 - CGI_10012993 superfamily 247805 30 142 6.62E-12 58.888 cl17251 DEXDc superfamily C - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#23043 - CGI_10012994 superfamily 243072 52 121 1.03E-09 51.6154 cl02529 ANK superfamily N - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#23044 - CGI_10012995 superfamily 243072 66 223 1.94E-10 58.9342 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#23044 - CGI_10012995 superfamily 217473 327 557 2.40E-08 54.6785 cl03978 Mab-21 superfamily C - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. 
Q#23045 - CGI_10003400 superfamily 187403 2 360 1.26E-165 520.063 cl14649 BRO1_Alix_like superfamily - - "Protein-interacting Bro1-like domain of mammalian Alix and related domains; This superfamily includes the Bro1-like domains of mammalian Alix (apoptosis-linked gene-2 interacting protein X), His-Domain type N23 protein tyrosine phosphatase (HD-PTP, also known as PTPN23), RhoA-binding proteins Rhophilin-1 and Rhophilin-2, Brox, Bro1 and Rim20 (also known as PalA) from Saccharomyces cerevisiae, and related domains. Alix, HD-PTP, Brox, Bro1 and Rim20 interact with the ESCRT (Endosomal Sorting Complexes Required for Transport) system. Alix, also known as apoptosis-linked gene-2 interacting protein 1 (AIP1), participates in membrane remodeling processes during the budding of enveloped viruses, vesicle budding inside late endosomal multivesicular bodies (MVBs), and the abscission reactions of mammalian cell division. It also functions in apoptosis. HD-PTP functions in cell migration and endosomal trafficking, Bro1 in endosomal trafficking, and Rim20 in the response to the external pH via the Rim101 pathway. Bro1-like domains are boomerang-shaped, and part of the domain is a tetratricopeptide repeat (TPR)-like structure. Bro1-like domains bind components of the ESCRT-III complex: CHMP4 (in the case of Alix, HD-PTP, and Brox) and Snf7 (in the case of yeast Bro1, and Rim20). The single domain protein human Brox, and the isolated Bro1-like domains of Alix, HD-PTP and Rhophilin can bind human immunodeficiency virus type 1 (HIV-1) nucleocapsid. Alix, HD-PTP, Bro1, and Rim20 also have a V-shaped (V) domain, which in the case of Alix, has been shown to be a dimerization domain and to contain a binding site for the retroviral late assembly (L) domain YPXnL motif, which is partially conserved in this superfamily. Alix, HD-PTP and Bro1 also have a proline-rich region (PRR); the Alix PRR binds multiple partners. 
Rhophilin-1, and -2, in addition to this Bro1-like domain, have an N-terminal Rho-binding domain and a C-terminal PDZ (PS.D.-95, Disc-large, ZO-1) domain. HD-PTP is encoded by the PTPN23 gene, a tumor suppressor gene candidate frequently absent in human kidney, breast, lung, and cervical tumors. This protein has a C-terminal, catalytically inactive tyrosine phosphatase domain." Q#23045 - CGI_10003400 superfamily 187408 365 523 2.08E-48 180.179 cl14654 V_Alix_like superfamily C - "Protein-interacting V-domain of mammalian Alix and related domains; This superfamily contains the V-shaped (V) domain of mammalian Alix (apoptosis-linked gene-2 interacting protein X), His-Domain type N23 protein tyrosine phosphatase (HD-PTP, also known as PTPN23), Bro1 and Rim20 (also known as PalA) from Saccharomyces cerevisiae, and related domains. Alix, HD-PTP, Bro1, and Rim20 all interact with the ESCRT (Endosomal Sorting Complexes Required for Transport) system. Alix, also known as apoptosis-linked gene-2 interacting protein 1 (AIP1), participates in membrane remodeling processes during the budding of enveloped viruses, vesicle budding inside late endosomal multivesicular bodies (MVBs), and the abscission reactions of mammalian cell division. It also functions in apoptosis. HD-PTP functions in cell migration and endosomal trafficking, Bro1 in endosomal trafficking, and Rim20 in the response to the external pH via the Rim101 pathway. The Alix V-domain contains a binding site, partially conserved in this superfamily, for the retroviral late assembly (L) domain YPXnL motif. The Alix V-domain is also a dimerization domain. Members of this superfamily have an N-terminal Bro1-like domain, which binds components of the ESCRT-III complex. The Bro1-like domains of Alix and HD-PTP can also bind human immunodeficiency virus type 1 (HIV-1) nucleocapsid. 
Many members, including Alix, HD-PTP, and Bro1, also have a proline-rich region (PRR), which binds multiple partners in Alix, including Tsg101 (tumor susceptibility gene 101, a component of ESCRT-1) and the apoptotic protein ALG-2. The C-terminal portion (V-domain and PRR) of Bro1 interacts with Doa4, a ubiquitin thiolesterase needed to remove ubiquitin from MVB cargoes; it interacts with a YPxL motif in Doa4s catalytic domain to stimulate its deubiquitination activity. Rim20 may bind the ESCRT-III subunit Snf7, bringing the protease Rim13 (a YPxL-containing transcription factor) into proximity with Rim101, and promoting the proteolytic activation of Rim101. HD-PTP is encoded by the PTPN23 gene, a tumor suppressor gene candidate often absent in human kidney, breast, lung, and cervical tumors. HD-PTP has a C-terminal catalytically inactive tyrosine phosphatase domain." Q#23047 - CGI_10016207 superfamily 246669 19 49 1.42E-14 64.4267 cl14603 C2 superfamily C - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." 
Q#23048 - CGI_10016208 superfamily 247723 386 439 5.96E-14 67.3273 cl17169 RRM_SF superfamily C - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#23049 - CGI_10016209 superfamily 241752 1806 1925 2.92E-44 159.022 cl00283 ADP_ribosyl superfamily - - "ADP_ribosylating enzymes catalyze the transfer of ADP_ribose from NAD+ to substrates. Bacterial toxins are cytoplasmic and catalyze the transfer of a single ADP_ribose unit to eukaryotic elongation factor 2, halting protein synthesis and killing the cell. Poly(ADP-ribose) polymerases (PARPS 1-3, VPARP, tankyrase) catalyze the addition of up to 100 ADP_ribose units from NAD+. PARPs 1 and 2 are localized in the nucleaus, bind DNA, and are activated by DNA damage. VPARP is part of the vault ribonucleoprotein complex. Tankyrases regulates telomere length in part through poy(ADP_ribosylation) of telomere repeat binding factor 1 (TRF1). Poly(ADP-ribose) polymerase catalyses the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. 
Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active." Q#23049 - CGI_10016209 superfamily 247723 192 263 1.16E-13 68.8332 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#23049 - CGI_10016209 superfamily 241554 858 1029 1.87E-51 181.7 cl00019 Macro superfamily - - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. 
Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#23049 - CGI_10016209 superfamily 241554 1077 1266 5.02E-22 95.4051 cl00019 Macro superfamily - - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#23049 - CGI_10016209 superfamily 247723 62 123 2.17E-09 56.5417 cl17169 RRM_SF superfamily C - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. 
The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#23049 - CGI_10016209 superfamily 247723 275 343 2.70E-08 53.4601 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#23049 - CGI_10016209 superfamily 241554 1344 1502 1.91E-07 51.9627 cl00019 Macro superfamily - - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." 
Q#23049 - CGI_10016209 superfamily 207713 1660 1714 0.00100365 39.6101 cl02729 WWE superfamily - - WWE domain; The WWE domain is named after three of its conserved residues and is predicted to mediate specific protein-protein interactions in ubiquitin and ADP ribose conjugation systems. Q#23053 - CGI_10016213 superfamily 247724 141 273 1.39E-08 53.5656 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." 
Q#23054 - CGI_10016214 superfamily 247723 10 116 3.37E-29 105.533 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#23055 - CGI_10016215 superfamily 245814 288 358 4.70E-08 50.5655 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#23055 - CGI_10016215 superfamily 214507 216 271 0.000905022 37.4096 cl15307 LRRCT superfamily - - Leucine rich repeat C-terminal domain; Leucine rich repeat C-terminal domain. 
Q#23055 - CGI_10016215 superfamily 243030 32 64 0.00245547 36.1415 cl02423 LRRNT superfamily - - Leucine rich repeat N-terminal domain; Leucine Rich Repeats pfam00560 are short sequence motifs present in a number of proteins with diverse functions and cellular locations. Leucine Rich Repeats are often flanked by cysteine rich domains. This domain is often found at the N-terminus of tandem leucine rich repeats. Q#23057 - CGI_10016217 superfamily 248458 385 562 1.37E-18 86.2137 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. 
Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#23057 - CGI_10016217 superfamily 248458 27 207 3.97E-06 47.6937 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." 
Q#23057 - CGI_10016217 superfamily 248188 339 404 0.00245589 38.861 cl17634 Herpes_BBRF1 superfamily NC - BRRF1-like protein; Family of herpesvirus proteins including Epstein-Barr virus protein BBRF1. Q#23059 - CGI_10016219 superfamily 241550 393 542 4.61E-82 258.266 cl00015 nt_trans superfamily - - "nucleotidyl transferase superfamily; nt_trans (nucleotidyl transferase) This superfamily includes the class I amino-acyl tRNA synthetases, pantothenate synthetase (PanC), ATP sulfurylase, and the cytidylyltransferases, all of which have a conserved dinucleotide-binding domain." Q#23059 - CGI_10016219 superfamily 247725 30 118 2.55E-29 112.773 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting proteins to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and other proteins." Q#23061 - CGI_10016221 superfamily 241878 33 81 7.51E-09 48.518 cl00460 CMD superfamily - - "Carboxymuconolactone decarboxylase family; Carboxymuconolactone decarboxylase (CMD) EC:4.1.1.44 is involved in protocatechuate catabolism. In some bacteria a gene fusion event leads to expression of CMD with a hydrolase involved in the same pathway. In these bifunctional proteins CMD represents the C-terminal domain, pfam00561 represents the N-terminal domain." 
Q#23062 - CGI_10016222 superfamily 247684 166 482 5.31E-46 167.839 cl17037 NBD_sugar-kinase_HSP70_actin superfamily N - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#23062 - CGI_10016222 superfamily 247684 41 144 6.44E-07 50.7388 cl17037 NBD_sugar-kinase_HSP70_actin superfamily N - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#23065 - CGI_10016225 superfamily 241638 148 271 1.21E-07 48.5185 cl00147 TNF superfamily - - "Tumor Necrosis Factor; TNF superfamily members include the cytokines: TNF (TNF-alpha), LT (lymphotoxin-alpha, TNF-beta), CD40 ligand, Apo2L (TRAIL), Fas ligand, and osteoprotegerin (OPG) ligand. These proteins generally have an intracellular N-terminal domain, a short transmembrane segment, an extracellular stalk, and a globular TNF-like extracellular domain of about 150 residues. 
They initiate apoptosis by binding to related receptors, some of which have intracellular death domains. They generally form homo- or hetero- trimeric complexes.TNF cytokines bind one elongated receptor molecule along each of three clefts formed by neighboring monomers of the trimer with ligand trimerization a requiste for receptor binding." Q#23065 - CGI_10016225 superfamily 241691 90 172 0.00432852 36.0911 cl00213 DNA_BRE_C superfamily N - "DNA breaking-rejoining enzymes, C-terminal catalytic domain. The DNA breaking-rejoining enzyme superfamily includes type IB topoisomerases and tyrosine recombinases that share the same fold in their catalytic domain containing six conserved active site residues. The best-studied members of this diverse superfamily include human topoisomerase I, the bacteriophage lambda integrase, the bacteriophage P1 Cre recombinase, the yeast Flp recombinase and the bacterial XerD/C recombinases. Their overall reaction mechanism is essentially identical and involves cleavage of a single strand of a DNA duplex by nucleophilic attack of a conserved tyrosine to give a 3' phosphotyrosyl protein-DNA adduct. In the second rejoining step, a terminal 5' hydroxyl attacks the covalent adduct to release the enzyme and generate duplex DNA. The enzymes differ in that topoisomerases cleave and then rejoin the same 5' and 3' termini, whereas a site-specific recombinase transfers a 5' hydroxyl generated by recombinase cleavage to a new 3' phosphate partner located in a different duplex region. Many DNA breaking-rejoining enzymes also have N-terminal domains, which show little sequence or structure similarity." Q#23068 - CGI_10016228 superfamily 241638 123 245 3.34E-10 55.0669 cl00147 TNF superfamily - - "Tumor Necrosis Factor; TNF superfamily members include the cytokines: TNF (TNF-alpha), LT (lymphotoxin-alpha, TNF-beta), CD40 ligand, Apo2L (TRAIL), Fas ligand, and osteoprotegerin (OPG) ligand. 
These proteins generally have an intracellular N-terminal domain, a short transmembrane segment, an extracellular stalk, and a globular TNF-like extracellular domain of about 150 residues. They initiate apoptosis by binding to related receptors, some of which have intracellular death domains. They generally form homo- or hetero- trimeric complexes.TNF cytokines bind one elongated receptor molecule along each of three clefts formed by neighboring monomers of the trimer with ligand trimerization a requiste for receptor binding." Q#23071 - CGI_10016231 superfamily 205988 80 197 5.01E-38 138.975 cl16417 Dzip-like_N superfamily - - Iguana/Dzip1-like DAZ-interacting protein N-terminal; The DAZ gene-product - Deleted in Azoospermia - and a closely related sequence are required early in germ-cell development in order to maintain germ-cell populations. This family is the N-terminal region that is the only part of the protein in some fungi and lower metazoa. Q#23072 - CGI_10016232 superfamily 243034 106 184 4.06E-13 65.094 cl02429 TPR superfamily N - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; 
examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#23072 - CGI_10016232 superfamily 243034 203 297 4.01E-09 53.538 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#23072 - CGI_10016232 superfamily 243034 315 401 2.54E-06 45.0636 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including 
bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#23073 - CGI_10002224 superfamily 247824 34 332 4.91E-94 285.734 cl17270 APH_ChoK_like superfamily - - "Aminoglycoside 3'-phosphotransferase (APH) and Choline Kinase (ChoK) family. The APH/ChoK family is part of a larger superfamily that includes the catalytic domains of other kinases, such as the typical serine/threonine/tyrosine protein kinases (PKs), RIO kinases, actin-fragmin kinase (AFK), and phosphoinositide 3-kinase (PI3K). The family is composed of APH, ChoK, ethanolamine kinase (ETNK), macrolide 2'-phosphotransferase (MPH2'), an unusual homoserine kinase, and uncharacterized proteins with similarity to the N-terminal domain of acyl-CoA dehydrogenase 10 (ACAD10). The members of this family catalyze the transfer of the gamma-phosphoryl group from ATP (or CTP) to small molecule substrates such as aminoglycosides, macrolides, choline, ethanolamine, and homoserine. 
Phosphorylation of the antibiotics, aminoglycosides and macrolides, leads to their inactivation and to bacterial antibiotic resistance. Phosphorylation of choline, ethanolamine, and homoserine serves as precursors to the synthesis of important biological compounds, such as the major phospholipids, phosphatidylcholine and phosphatidylethanolamine and the amino acids, threonine, methionine, and isoleucine." Q#23074 - CGI_10002225 superfamily 241754 6 337 0 577.417 cl00286 Motor_domain superfamily - - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. Q#23075 - CGI_10002226 superfamily 216739 461 495 0.000235864 39.7234 cl03383 PC_rep superfamily - - Proteasome/cyclosome repeat; Proteasome/cyclosome repeat. Q#23075 - CGI_10002226 superfamily 216739 708 739 0.00428656 36.2566 cl03383 PC_rep superfamily - - Proteasome/cyclosome repeat; Proteasome/cyclosome repeat. Q#23077 - CGI_10002389 superfamily 248097 144 266 1.26E-20 84.6242 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#23078 - CGI_10002437 superfamily 247743 5 142 6.12E-19 80.6531 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." 
Q#23078 - CGI_10002437 superfamily 204202 212 271 2.60E-08 49.1761 cl07827 Vps4_C superfamily - - Vps4 C terminal oligomerisation domain; This domain is found at the C terminal of ATPase proteins involved in vacuolar sorting. It forms an alpha helix structure and is required for oligomerisation. Q#23079 - CGI_10002438 superfamily 191619 1 149 6.27E-48 156.046 cl06071 DUF1241 superfamily - - Protein of unknown function (DUF1241); This family consists of several programmed cell death 10 protein (PDCD10 or TFAR15) sequences. The function of this family is unknown. Q#23080 - CGI_10002439 superfamily 241578 1927 2116 1.33E-76 254.761 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#23080 - CGI_10002439 superfamily 247743 97 253 1.36E-37 140.508 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. 
The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#23080 - CGI_10002439 superfamily 247743 990 1137 7.34E-24 100.833 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#23080 - CGI_10002439 superfamily 247743 426 569 1.07E-20 91.588 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." 
Q#23080 - CGI_10002439 superfamily 247743 709 798 2.52E-07 51.5272 cl17189 AAA superfamily N - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#23081 - CGI_10002440 superfamily 248458 246 418 1.02E-12 67.7241 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. 
Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#23082 - CGI_10002441 superfamily 248458 375 553 1.28E-08 55.7829 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 
Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#23082 - CGI_10002441 superfamily 248458 67 173 0.00165568 39.6045 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. 
Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#23084 - CGI_10010019 superfamily 241583 49 194 8.23E-40 135.024 cl00064 ZnMc superfamily - - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#23085 - CGI_10010020 superfamily 241804 206 490 2.00E-38 142.554 cl00348 COG0182 superfamily - - "Predicted translation initiation factor 2B subunit, eIF-2B alpha/beta/delta family [Translation, ribosomal structure and biogenesis]" Q#23086 - CGI_10010021 superfamily 247744 9 211 1.50E-92 272.573 cl17190 NK superfamily - - "Nucleoside/nucleotide kinase (NK) is a protein superfamily consisting of multiple families of enzymes that share structural similarity and are functionally related to the catalysis of the reversible phosphate group transfer from nucleoside triphosphates to nucleosides/nucleotides, nucleoside monophosphates, or sugars. Members of this family play a wide variety of essential roles in nucleotide metabolism, the biosynthesis of coenzymes and aromatic compounds, as well as the metabolism of sugar and sulfate." Q#23089 - CGI_10010024 superfamily 241571 40 136 1.09E-18 82.4602 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#23089 - CGI_10010024 superfamily 241571 146 245 5.08E-16 74.7562 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." 
Q#23089 - CGI_10010024 superfamily 241609 253 321 1.50E-17 78.1165 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#23094 - CGI_10010029 superfamily 246954 33 626 1.64E-145 436.287 cl15415 Sec1 superfamily - - Sec1 family; Sec1 family. Q#23095 - CGI_10010030 superfamily 241575 123 194 0.00209304 35.7111 cl00054 DSRM superfamily - - "Double-stranded RNA binding motif. Binding is not sequence specific but is highly specific for double stranded RNA. Found in a variety of proteins including dsRNA dependent protein kinase PKR, RNA helicases, Drosophila staufen protein, E. coli RNase III, RNases H1, and dsRNA dependent adenosine deaminases." Q#23096 - CGI_10010031 superfamily 243263 45 335 6.41E-44 158.725 cl02990 ASC superfamily C - Amiloride-sensitive sodium channel; Amiloride-sensitive sodium channel. Q#23096 - CGI_10010031 superfamily 243263 340 358 6.25E-05 43.5506 cl02990 ASC superfamily N - Amiloride-sensitive sodium channel; Amiloride-sensitive sodium channel. Q#23098 - CGI_10010033 superfamily 243035 5 109 5.67E-16 68.8005 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. 
Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. 
Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#23099 - CGI_10010034 superfamily 248458 59 432 2.36E-34 132.052 cl17904 MFS superfamily - - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." 
Q#23100 - CGI_10010035 superfamily 243035 2 69 1.94E-08 46.613 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#23102 - CGI_10004532 superfamily 243092 5 65 9.81E-10 53.4928 cl02567 WD40 superfamily NC - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#23103 - CGI_10004533 superfamily 215754 186 267 2.19E-14 66.508 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#23103 - CGI_10004533 superfamily 215754 81 156 1.57E-09 53.4112 cl02813 Mito_carr superfamily N - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#23103 - CGI_10004533 superfamily 215754 11 59 3.83E-06 43.396 cl02813 Mito_carr superfamily N - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#23104 - CGI_10004534 superfamily 247724 8 202 1.68E-78 237.105 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." 
Q#23108 - CGI_10003069 superfamily 248275 70 94 6.37E-07 46.034 cl17721 zf-C2H2_jaz superfamily - - "Zinc-finger double-stranded RNA-binding; This domain family is found in archaea and eukaryotes, and is approximately 30 amino acids in length. The mammalian members of this group occur multiple times along the protein, joined by flexible linkers, and are referred to as JAZ - dsRNA-binding ZF protein - zinc-fingers. The JAZ proteins are expressed in all tissues tested and localise in the nucleus, particularly the nucleolus. JAZ preferentially binds to double-stranded (ds) RNA or RNA/DNA hybrids rather than DNA. In addition to binding double-stranded RNA, these zinc-fingers are required for nucleolar localisation." Q#23110 - CGI_10003071 superfamily 248275 70 94 7.79E-07 45.2636 cl17721 zf-C2H2_jaz superfamily - - "Zinc-finger double-stranded RNA-binding; This domain family is found in archaea and eukaryotes, and is approximately 30 amino acids in length. The mammalian members of this group occur multiple times along the protein, joined by flexible linkers, and are referred to as JAZ - dsRNA-binding ZF protein - zinc-fingers. The JAZ proteins are expressed in all tissues tested and localise in the nucleus, particularly the nucleolus. JAZ preferentially binds to double-stranded (ds) RNA or RNA/DNA hybrids rather than DNA. In addition to binding double-stranded RNA, these zinc-fingers are required for nucleolar localisation." Q#23111 - CGI_10003072 superfamily 242161 13 80 2.64E-33 112.493 cl00875 PTZ00255 superfamily C - 60S ribosomal protein L37a; Provisional Q#23115 - CGI_10012160 superfamily 245213 41 77 5.29E-05 41.083 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#23115 - CGI_10012160 superfamily 245213 266 295 0.000450334 38.3866 cl09941 EGF_CA superfamily N - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#23115 - CGI_10012160 superfamily 245213 298 332 0.00110847 37.231 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#23115 - CGI_10012160 superfamily 245213 371 406 0.00402572 35.305 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#23115 - CGI_10012160 superfamily 245213 340 368 0.00861947 34.5346 cl09941 EGF_CA superfamily N - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#23116 - CGI_10012161 superfamily 112833 35 162 1.99E-69 211.056 cl04372 DUF367 superfamily - - Domain of unknown function (DUF367); Domain of unknown function (DUF367). Q#23116 - CGI_10012161 superfamily 217870 1 31 1.53E-10 54.0401 cl04386 RLI superfamily - - "Possible Fer4-like domain in RNase L inhibitor, RLI; Possible metal-binding domain in endoribonuclease RNase L inhibitor. Found at the N-terminal end of RNase L inhibitor proteins, adjacent to the 4Fe-4S binding domain, fer4, pfam00037. Also often found adjacent to the DUF367 domain pfam04034 in uncharacterized proteins. The RNase L system plays a major role in the anti-viral and anti-proliferative activities of interferons, and could possibly play a more general role in the regulation of RNA stability in mammalian cells. Inhibitory activity requires concentration-dependent association of RLI with RNase L." 
Q#23120 - CGI_10005266 superfamily 245335 28 277 1.06E-125 365.801 cl10571 GT_MraY-like superfamily - - "Glycosyltransferase 4 (GT4) includes both eukaryotic and prokaryotic UDP-D-N-acetylhexosamine:polyprenol phosphate D-N-acetylhexosamine-1-phosphate transferases. They catalyze the transfer of a D-N-acetylhexosamine 1-phosphate to a membrane-bound polyprenol phosphate, which is the initiation step of protein N-glycosylation in eukaryotes and peptidoglycan biosynthesis in bacteria. One member, D-N-acetylhexosamine 1-phosphate transferase (GPT) is a eukaryotic enzyme, which is specific for UDP-GlcNAc as donor substrate and dolichol-phosphate as the membrane bound acceptor. The bacterial members MraY, WecA, and WbpL/WbcO utilize undecaprenol phosphate as the acceptor substrate, but use different UDP-sugar donor substrates. MraY-type transferases are highly specific for UDP-N-acetylmuramate-pentapeptide, whereas WecA proteins are selective for UDP-N-acetylglucosamine (UDP-GlcNAc). The WbcO/WbpL substrate specificity has not yet been determined, but the structure of their biosynthetic endproducts implies that UDP-N-acetyl-D-fucosamine (UDP-FucNAc) and/or UDP-N-acetyl-D-quinosamine (UDP-QuiNAc) are used. The eukaryotic reaction is the first step in the assembly of dolichol-linked oligosaccharide intermediates and is essential for N-glycosylation. The prokaryotic reactions lead to the formation of polyprenol-linked oligosaccharides involved in bacterial cell wall and peptidoglycan assembly. Archaeal and eukaryotic enzymes may use the same substrates and are evolutionarily closer than the bacterial enzyme. Archaea possess the same N-glycosylation pathway as eukaryotes. A glycosyl transferase gene Mv1751 in M. voltae encodes for the enzyme that carries out the first step in the pathway, the attachment of GlcNAc to a dolichol lipid carrier in the membrane. 
A lethal mutation in the alg7 (GPT) gene in Saccharomyces cerevisiae was successfully complemented with Mv1751, the archaea gene." Q#23121 - CGI_10005267 superfamily 241559 26 186 3.90E-12 66.9507 cl00030 CH superfamily - - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." Q#23121 - CGI_10005267 superfamily 109460 258 282 0.000348983 42.0254 cl02859 Calponin superfamily - - Calponin family repeat; Calponin family repeat. Q#23121 - CGI_10005267 superfamily 109460 339 362 0.000543267 41.255 cl02859 Calponin superfamily - - Calponin family repeat; Calponin family repeat. Q#23122 - CGI_10005268 superfamily 202000 1 129 1.92E-17 80.5957 cl03375 XRCC1_N superfamily - - XRCC1 N terminal domain; XRCC1 N terminal domain. Q#23123 - CGI_10005269 superfamily 247856 101 154 3.78E-08 46.7721 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#23125 - CGI_10005272 superfamily 245598 53 90 3.22E-09 54.978 cl11396 Patatin_and_cPLA2 superfamily N - "Patatins and Phospholipases; Patatin-like phospholipase. This family consists of various patatin glycoproteins from plants. The patatin protein accounts for up to 40% of the total soluble protein in potato tubers. 
Patatin is a storage protein, but it also has the enzymatic activity of a lipid acyl hydrolase, catalyzing the cleavage of fatty acids from membrane lipids. Members of this family have also been found in vertebrates. This family also includes the catalytic domain of cytosolic phospholipase A2 (PLA2; EC 3.1.1.4), which hydrolyzes the sn-2-acyl ester bond of phospholipids to release arachidonic acid. At the active site, cPLA2 contains a serine nucleophile through which the catalytic mechanism is initiated. The active site is partially covered by a solvent-accessible flexible lid. cPLA2 displays interfacial activation as it exists in both "closed lid" and "open lid" forms." Q#23125 - CGI_10005272 superfamily 247856 187 257 0.000470259 37.9125 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#23126 - CGI_10005273 superfamily 215847 188 629 4.01E-88 290.117 cl09510 Lipoxygenase superfamily N - Lipoxygenase; Lipoxygenase. Q#23126 - CGI_10005273 superfamily 241546 3 87 6.88E-12 63.1269 cl00011 PLAT superfamily N - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats. The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." 
Q#23128 - CGI_10011003 superfamily 245227 1 341 0 601.507 cl10013 Glycosyltransferase_GTB_type superfamily - - "Glycosyltransferases catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. The acceptor molecule can be a lipid, a protein, a heterocyclic compound, or another carbohydrate residue. The structures of the formed glycoconjugates are extremely diverse, reflecting a wide range of biological functions. The members of this family share a common GTB topology, one of the two protein topologies observed for nucleotide-sugar-dependent glycosyltransferases. GTB proteins have distinct N- and C- terminal domains each containing a typical Rossmann fold. The two domains have high structural homology despite minimal sequence homology. The large cleft that separates the two domains includes the catalytic center and permits a high degree of flexibility." Q#23129 - CGI_10011004 superfamily 241580 126 203 4.26E-45 152.323 cl00061 FH superfamily - - "Forkhead (FH), also known as a "winged helix". FH is named for the Drosophila fork head protein, a transcription factor which promotes terminal rather than segmental development. This family of transcription factor domains, which bind to B-DNA as monomers, are also found in the Hepatocyte nuclear factor (HNF) proteins, which provide tissue-specific gene regulation. The structure contains 2 flexible loops or "wings" in the C-terminal region, hence the term winged helix." Q#23132 - CGI_10011007 superfamily 201217 333 381 9.99E-10 54.8392 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#23132 - CGI_10011007 superfamily 201217 151 200 1.07E-09 54.454 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. 
Q#23132 - CGI_10011007 superfamily 201217 203 250 6.25E-08 49.4464 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#23132 - CGI_10011007 superfamily 201217 256 328 3.02E-06 44.824 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#23132 - CGI_10011007 superfamily 201217 432 482 0.00185697 36.3496 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#23133 - CGI_10011008 superfamily 214781 183 288 5.66E-14 69.2932 cl02747 NRF superfamily - - N-terminal domain in C. elegans NRF-6 (Nose Resistant to Fluoxetine-4) and NDG-4 (resistant to nordihydroguaiaretic acid-4); Also present in several other worm and fly proteins. Q#23134 - CGI_10011009 superfamily 246723 19 640 0 652.832 cl14813 GluZincin superfamily - - "Peptidase Gluzincin family (thermolysin-like proteinases, TLPs) includes peptidases M1, M2, M3, M4, M13, M32 and M36 (fungalysins); Gluzincin family (thermolysin-like peptidases or TLPs) includes several zinc-dependent metallopeptidases such as the M1, M2, M3, M4, M13, M32, M36 peptidases (MEROPS classification), and contain HEXXH and EXXXD motifs as part of their active site. All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. M1 family includes aminopeptidase N (APN) and leukotriene A4 hydrolase (LTA4H). APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types. LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity such that the two activities occupy different, but overlapping sites. 
The peptidase M3 or neurolysin-like family, includes M3, M2 and M32 metallopeptidases. The M3 peptidases have two subfamilies: M3A, includes thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (3.4.24.16), and the mitochondrial intermediate peptidase; M3B contains oligopeptidase F. M2 peptidase angiotensin converting enzyme (ACE, EC 3.4.15.1) catalyzes the conversion of decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II. ACE is a key part of the renin-angiotensin system that regulates blood pressure, thus ACE inhibitors are important for the treatment of hypertension. M32 family includes two eukaryotic enzymes from protozoa Trypanosoma cruzi, a causative agent of Chagas' disease, and Leishmania major, a parasite that causes leishmaniasis, making them attractive targets for drug development. The M4 family includes secreted protease thermolysin (EC 3.4.24.27), pseudolysin, aureolysin, neutral protease as well as fungalysin and bacillolysin (EC 3.4.24.28) that degrade extracellular proteins and peptides for bacterial nutrition, especially prior to sporulation. Thermolysin is widely used as a nonspecific protease to obtain fragments for peptide sequencing as well as in production of the artificial sweetener aspartame. M13 family includes neprilysin (EC 3.4.24.11) and endothelin-converting enzyme I (ECE-1, EC 3.4.24.71), which fulfill a broad range of physiological roles due to the greater variation in the S2' subsite allowing substrate specificity and are prime therapeutic targets for selective inhibition. Peptidase M36 (fungamysin) family includes endopeptidases from pathogenic fungi. Fungalysin hydrolyzes extracellular matrix proteins such as elastin and keratin. Aspergillus fumigatus causes the pulmonary disease aspergillosis by invading the lungs of immuno-compromised animals and secreting fungalysin that possibly breaks down proteinaceous structural barriers." 
Q#23136 - CGI_10011011 superfamily 243064 38 96 6.86E-07 43.9434 cl02512 NTR_like superfamily NC - "NTR_like domain; a beta barrel with an oligosaccharide/oligonucleotide-binding fold found in netrins, complement proteins, tissue inhibitors of metalloproteases (TIMP), and procollagen C-proteinase enhancers (PCOLCE), amongst others. In netrins, the domain plays a role in controlling axon branching in neural development, while the common function of these modules in TIMPs appears to be binding to metzincins. A subset of this family is also known as the C345C domain because it occurs as a C-terminal domain in complement C3, C4 and C5. In C5, the domain interacts with various partners during the formation of the membrane attack complex." Q#23137 - CGI_10011012 superfamily 247905 190 296 5.09E-12 62.2552 cl17351 HELICc superfamily N - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#23137 - CGI_10011012 superfamily 247805 43 130 0.00035616 39.5499 cl17251 DEXDc superfamily N - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 
Q#23140 - CGI_10011015 superfamily 243064 22 67 4.03E-10 52.361 cl02512 NTR_like superfamily C - "NTR_like domain; a beta barrel with an oligosaccharide/oligonucleotide-binding fold found in netrins, complement proteins, tissue inhibitors of metalloproteases (TIMP), and procollagen C-proteinase enhancers (PCOLCE), amongst others. In netrins, the domain plays a role in controlling axon branching in neural development, while the common function of these modules in TIMPs appears to be binding to metzincins. A subset of this family is also known as the C345C domain because it occurs as a C-terminal domain in complement C3, C4 and C5. In C5, the domain interacts with various partners during the formation of the membrane attack complex." Q#23141 - CGI_10005713 superfamily 245202 81 169 2.07E-41 136.622 cl09927 S1_like superfamily - - "S1_like: Ribosomal protein S1-like RNA-binding domain. Found in a wide variety of RNA-associated proteins. Originally identified in S1 ribosomal protein. This superfamily also contains the Cold Shock Domain (CSD), which is a homolog of the S1 domain. Both domains are members of the Oligonucleotide/oligosaccharide Binding (OB) fold." Q#23141 - CGI_10005713 superfamily 243703 2 81 5.23E-39 130.38 cl04309 RNAP_Rpb7_N_like superfamily - - "RNAP_Rpb7_N_like: This conserved domain represents the N-terminal ribonucleoprotein (RNP) domain of the Rpb7 subunit of eukaryotic RNA polymerase (RNAP) II and its homologs, Rpa43 of eukaryotic RNAP I, Rpc25 of eukaryotic RNAP III, and RpoE (subunit E) of archaeal RNAP. These proteins have, in addition to their N-terminal RNP domain, a C-terminal oligonucleotide-binding (OB) domain. Each of these subunits heterodimerizes with another RNAP subunit (Rpb7 to Rpb4, Rpc25 to Rpc17, RpoE to RpoF, and Rpa43 to Rpa14). 
The heterodimer is thought to tether the RNAP to a given promoter via its interactions with a promoter-bound transcription factor.The heterodimer is also thought to bind and position nascent RNA as it exits the polymerase complex." Q#23143 - CGI_10005715 superfamily 245201 78 341 4.91E-163 472.888 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#23144 - CGI_10005716 superfamily 248318 2212 2274 1.91E-05 45.4637 cl17764 FYVE superfamily - - "FYVE domain; Zinc-binding domain; targets proteins to membrane lipids via interaction with phosphatidylinositol-3-phosphate, PI3P; present in Fab1, YOTB, Vac1, and EEA1;" Q#23144 - CGI_10005716 superfamily 218263 1340 1486 0.00112522 40.1795 cl04748 DUF547 superfamily - - "Protein of unknown function, DUF547; Family of uncharacterized proteins from C. elegans and A. thaliana." Q#23144 - CGI_10005716 superfamily 221338 1754 1852 0.00751899 39.9267 cl13402 PORR superfamily C - "Plant organelle RNA recognition domain; This family, which was previously known as DUF860, has been shown to be a component of group II intron ribonucleoprotein particles in maize chloroplasts. The domain is required for the splicing of the introns with which it associates, and promotes splicing in the context of a heterodimer with the RNase III-domain protein RNC1. All of the members are predicted to localise to mitochondria or chloroplasts. It seems likely that most PORR proteins function in organellar RNA metabolism." 
Q#23145 - CGI_10005717 superfamily 195671 51 163 9.09E-34 118.708 cl08257 Ribosomal_L11 superfamily - - "Ribosomal protein L11. Ribosomal protein L11, together with proteins L10 and L7/L12, and 23S rRNA, form the L7/L12 stalk on the surface of the large subunit of the ribosome. The homologous eukaryotic cytoplasmic protein is also called 60S ribosomal protein L12, which is distinct from the L12 involved in the formation of the L7/L12 stalk. The C-terminal domain (CTD) of L11 is essential for binding 23S rRNA, while the N-terminal domain (NTD) contains the binding site for the antibiotics thiostrepton and micrococcin. L11 and 23S rRNA form an essential part of the GTPase-associated region (GAR). Based on differences in the relative positions of the L11 NTD and CTD during the translational cycle, L11 is proposed to play a significant role in the binding of initiation factors, elongation factors, and release factors to the ribosome. Several factors, including the class I release factors RF1 and RF2, are known to interact directly with L11. In eukaryotes, L11 has been implicated in regulating the levels of ubiquinated p53 and MDM2 in the MDM2-p53 feedback loop, which is responsible for apoptosis in response to DNA damage. In bacteria, the "stringent response" to harsh conditions allows bacteria to survive, and ribosomes that lack L11 are deficient in stringent factor stimulation." Q#23146 - CGI_10005718 superfamily 248318 10 72 6.72E-06 44.6934 cl17764 FYVE superfamily - - "FYVE domain; Zinc-binding domain; targets proteins to membrane lipids via interaction with phosphatidylinositol-3-phosphate, PI3P; present in Fab1, YOTB, Vac1, and EEA1;" Q#23147 - CGI_10005719 superfamily 195671 51 163 1.62E-33 117.937 cl08257 Ribosomal_L11 superfamily - - "Ribosomal protein L11. Ribosomal protein L11, together with proteins L10 and L7/L12, and 23S rRNA, form the L7/L12 stalk on the surface of the large subunit of the ribosome. 
The homologous eukaryotic cytoplasmic protein is also called 60S ribosomal protein L12, which is distinct from the L12 involved in the formation of the L7/L12 stalk. The C-terminal domain (CTD) of L11 is essential for binding 23S rRNA, while the N-terminal domain (NTD) contains the binding site for the antibiotics thiostrepton and micrococcin. L11 and 23S rRNA form an essential part of the GTPase-associated region (GAR). Based on differences in the relative positions of the L11 NTD and CTD during the translational cycle, L11 is proposed to play a significant role in the binding of initiation factors, elongation factors, and release factors to the ribosome. Several factors, including the class I release factors RF1 and RF2, are known to interact directly with L11. In eukaryotes, L11 has been implicated in regulating the levels of ubiquinated p53 and MDM2 in the MDM2-p53 feedback loop, which is responsible for apoptosis in response to DNA damage. In bacteria, the "stringent response" to harsh conditions allows bacteria to survive, and ribosomes that lack L11 are deficient in stringent factor stimulation." Q#23148 - CGI_10005720 superfamily 112836 654 782 3.28E-79 255.239 cl04373 DUF382 superfamily - - Domain of unknown function (DUF382); This domain is specific to the human splicing factor 3b subunit 2 and it's orthologues. Splicing factor 3b subunit 2 or SAP145 is a suppressor of U2 snRNA mutations. Pre-mRNA splicing is catalyzed by a large ribonucleoprotein complex called the spliceosome. Spliceosomes are multi-component enzymes that catalyze pre-mRNA splicing and form step-wise by the ordered interaction of UsnRNPs and non-snRNP proteins with short conserved regions of the pre-mRNA at the 5' and 3' splice sites and branch site. Q#23148 - CGI_10005720 superfamily 207686 786 844 4.00E-20 86.2547 cl02643 PSP superfamily - - PSP; Proline rich domain found in numerous spliceosome associated proteins. 
Q#23148 - CGI_10005720 superfamily 207684 11 44 2.73E-08 51.6107 cl02640 SAP superfamily - - "SAP domain; The SAP (after SAF-A/B, Acinus and PIAS) motif is a putative DNA/RNA binding domain found in diverse nuclear and cytoplasmic proteins." Q#23149 - CGI_10005721 superfamily 215866 7 156 1.79E-40 141.309 cl18349 Arrestin_N superfamily - - "Arrestin (or S-antigen), N-terminal domain; Ig-like beta-sandwich fold. Scop reports duplication with C-terminal domain." Q#23149 - CGI_10005721 superfamily 243212 180 309 2.80E-26 102.037 cl02844 Arrestin_C superfamily - - "Arrestin (or S-antigen), C-terminal domain; Ig-like beta-sandwich fold. Scop reports duplication with N-terminal domain." Q#23150 - CGI_10005722 superfamily 246954 134 450 3.76E-96 303.008 cl15415 Sec1 superfamily N - Sec1 family; Sec1 family. Q#23150 - CGI_10005722 superfamily 246954 9 131 1.01E-44 164.722 cl15415 Sec1 superfamily C - Sec1 family; Sec1 family. Q#23151 - CGI_10005723 superfamily 243175 106 241 5.70E-55 177.441 cl02776 GST_C_family superfamily - - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. 
The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." Q#23151 - CGI_10005723 superfamily 241832 11 78 1.84E-20 83.0879 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." 
Q#23152 - CGI_10005724 superfamily 248223 2 244 4.07E-22 94.761 cl17669 ampG superfamily C - muropeptide transporter; Validated Q#23156 - CGI_10011183 superfamily 242902 36 98 3.83E-11 59.9531 cl02144 TLD superfamily C - TLD; This domain is predicted to be an enzyme and is often found associated with pfam01476. Q#23157 - CGI_10011184 superfamily 242902 25 115 3.50E-14 69.1978 cl02144 TLD superfamily C - TLD; This domain is predicted to be an enzyme and is often found associated with pfam01476. Q#23159 - CGI_10011187 superfamily 248458 55 180 2.46E-11 65.0277 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. 
Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#23159 - CGI_10011187 superfamily 248458 235 398 1.23E-09 59.6349 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." 
Q#23159 - CGI_10011187 superfamily 195686 397 528 5.65E-34 129.051 cl08291 TCTP superfamily - - Translationally controlled tumour protein; Translationally controlled tumour protein. Q#23159 - CGI_10011187 superfamily 247725 531 623 2.06E-12 65.1138 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#23160 - CGI_10011188 superfamily 197361 365 456 2.68E-06 45.8031 cl15254 UBAN superfamily - - "polyubiquitin binding domain of NEMO and related proteins; NEMO (NF-kappaB essential modulator) is a regulatory subunit of the kinase complex IKK, which is involved in the activation of NF-kappaB via phosporylation of inhibitory IkappaBs. This mechanism requires the binding of NEMO to ubiquinated substrates. Binding is achieved via the UBAN motif (ubiquitin binding in ABIN and NEMO), which is described in this model. This region of NEMO has also been named CoZi (for coiled-coil 2 and leucine zipper). ABINs (A20-binding inhibitors of NF-kappaB) are sensors for ubiquitin that are involved in regulation of apoptosis, ABIN-1 is presumed to inhibit signalling via the NF-kappaB route. The UBAN motif is also found in optineurin, the product of a gene associated with glaucoma, which has been characterized as a negative regulator of NF-kappaB as well." Q#23165 - CGI_10002505 superfamily 241584 58 90 0.00961139 33.6239 cl00065 FN3 superfamily N - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. 
Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#23166 - CGI_10002506 superfamily 241584 102 164 0.00103474 35.8627 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#23167 - CGI_10005326 superfamily 243035 32 146 4.02E-26 96.5349 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#23168 - CGI_10005327 superfamily 243035 138 235 4.29E-14 65.7189 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#23168 - CGI_10005327 superfamily 243035 38 130 2.46E-05 41.4514 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. 
CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#23169 - CGI_10005328 superfamily 245206 12 271 1.38E-17 81.6456 cl09931 NADB_Rossmann superfamily C - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. 
NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#23171 - CGI_10005330 superfamily 245814 43 106 0.00231058 35.9279 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#23175 - CGI_10008161 superfamily 241607 96 128 9.45E-06 41.4866 cl00097 KAZAL_FS superfamily C - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chyomotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. 
Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#23175 - CGI_10008161 superfamily 241607 55 87 0.0051986 33.7826 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chyomotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. 
Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#23176 - CGI_10008162 superfamily 216347 302 721 1.16E-121 374.18 cl08309 Cu_amine_oxid superfamily - - "Copper amine oxidase, enzyme domain; Copper amine oxidases are a ubiquitous and novel group of quinoenzymes that catalyze the oxidative deamination of primary amines to the corresponding aldehydes, with concomitant reduction of molecular oxygen to hydrogen peroxide. The enzymes are dimers of identical 70-90 kDa subunits, each of which contains a single copper ion and a covalently bound cofactor formed by the post-translational modification of a tyrosine side chain to 2,4,5-trihydroxyphenylalanine quinone (TPQ). This family corresponds to the catalytic domain of the enzyme." Q#23176 - CGI_10008162 superfamily 145726 50 138 0.00244241 36.9398 cl08353 Cu_amine_oxidN2 superfamily - - "Copper amine oxidase, N2 domain; This domain is the first or second structural domain in copper amine oxidases, it is known as the N2 domain. Its function is uncertain. The catalytic domain can be found in pfam01179. Copper amine oxidases are a ubiquitous and novel group of quinoenzymes that catalyze the oxidative deamination of primary amines to the corresponding aldehydes, with concomitant reduction of molecular oxygen to hydrogen peroxide. The enzymes are dimers of identical 70-90 kDa subunits, each of which contains a single copper ion and a covalently bound cofactor formed by the post-translational modification of a tyrosine side chain to 2,4,5-trihydroxyphenylalanine quinone (TPQ)." 
Q#23182 - CGI_10004315 superfamily 243134 30 149 3.30E-37 130.077 cl02663 Fasciclin superfamily - - "Fasciclin domain; This extracellular domain is found repeated four times in grasshopper fasciclin I as well as in proteins from mammals, sea urchins, plants, yeast and bacteria." Q#23182 - CGI_10004315 superfamily 243134 162 285 4.21E-29 108.506 cl02663 Fasciclin superfamily - - "Fasciclin domain; This extracellular domain is found repeated four times in grasshopper fasciclin I as well as in proteins from mammals, sea urchins, plants, yeast and bacteria." Q#23186 - CGI_10004319 superfamily 110440 293 318 0.00139182 36.2317 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#23187 - CGI_10006058 superfamily 248097 65 184 3.16E-23 90.4022 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#23188 - CGI_10006059 superfamily 246925 327 539 3.70E-08 55.0542 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. 
This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#23188 - CGI_10006059 superfamily 246925 141 273 2.90E-07 51.9726 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#23188 - CGI_10006059 superfamily 214507 654 693 0.00555966 36.254 cl15307 LRRCT superfamily C - Leucine rich repeat C-terminal domain; Leucine rich repeat C-terminal domain. Q#23188 - CGI_10006059 superfamily 241640 750 813 0.00987397 37.4188 cl00149 Tryp_SPc superfamily C - Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues. Q#23190 - CGI_10006061 superfamily 247725 32 126 3.31E-15 72.2598 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." 
Q#23191 - CGI_10006063 superfamily 217293 27 238 8.33E-75 238.687 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#23191 - CGI_10006063 superfamily 202474 245 338 8.25E-26 105.428 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#23192 - CGI_10006064 superfamily 247639 45 398 1.80E-150 434.39 cl16914 O-FucT_like superfamily - - "GDP-fucose protein O-fucosyltransferase and related proteins; O-fucosyltransferase-like proteins are GDP-fucose dependent enzymes with similarities to the family 1 glycosyltransferases (GT1). They are soluble ER proteins that may be proteolytically cleaved from a membrane-associated preprotein, and are involved in the O-fucosylation of protein substrates, the core fucosylation of growth factor receptors, and other processes." Q#23193 - CGI_10006065 superfamily 247639 50 416 0 568.054 cl16914 O-FucT_like superfamily - - "GDP-fucose protein O-fucosyltransferase and related proteins; O-fucosyltransferase-like proteins are GDP-fucose dependent enzymes with similarities to the family 1 glycosyltransferases (GT1). They are soluble ER proteins that may be proteolytically cleaved from a membrane-associated preprotein, and are involved in the O-fucosylation of protein substrates, the core fucosylation of growth factor receptors, and other processes." Q#23194 - CGI_10006066 superfamily 245814 27 89 0.00640732 34.7723 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#23197 - CGI_10007625 superfamily 247743 174 310 1.33E-14 72.1787 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#23198 - CGI_10007626 superfamily 241599 99 157 2.60E-24 95.7732 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#23198 - CGI_10007626 superfamily 146451 414 434 0.000694242 37.3387 cl08404 OAR superfamily - - OAR domain; OAR domain. Q#23199 - CGI_10007627 superfamily 216554 25 188 6.01E-38 134.914 cl15977 zf-DHHC superfamily - - DHHC palmitoyltransferase; This family includes the well known DHHC zinc binding domain as well as three of the four conserved transmembrane regions found in this family of palmitoyltransferase enzymes. 
Q#23200 - CGI_10007628 superfamily 241640 428 664 3.36E-77 248.73 cl00149 Tryp_SPc superfamily - - Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues. Q#23200 - CGI_10007628 superfamily 243040 104 227 1.46E-18 82.5546 cl02447 CRD_FZ superfamily - - "CRD_domain cysteine-rich domain, also known as Fz (frizzled) domain; CRD_FZ is an essential component of a number of cell surface receptors, which are involved in multiple signal transduction pathways, particularly in modulating the activity of the Wnt proteins, which play a fundamental role in the early development of metazoans. CRD is also found in secreted frizzled related proteins (SFRPs), which lack the transmembrane segment found in the frizzled protein. The CRD domain is also present in the alpha-1 chain of mouse type XVIII collagen, in carboxypeptidase Z, several receptor tyrosine kinases, and the mosaic transmembrane serine protease corin. The CRD domain is well conserved in metazoans - 10 frizzled proteins have been identified in mammals, 4 in Drosophila and 3 in Caenorhabditis elegans. CRD domains have also been identified in multiple tandem copies in a Dictyostelium discoideum protein. Very little is known about the mechanism by which CRD domains interact with their ligands. The domain contains 10 conserved cysteines." 
Q#23200 - CGI_10007628 superfamily 241613 240 274 7.26E-09 52.5942 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#23200 - CGI_10007628 superfamily 243040 24 100 5.27E-08 51.3535 cl02447 CRD_FZ superfamily C - "CRD_domain cysteine-rich domain, also known as Fz (frizzled) domain; CRD_FZ is an essential component of a number of cell surface receptors, which are involved in multiple signal transduction pathways, particularly in modulating the activity of the Wnt proteins, which play a fundamental role in the early development of metazoans. CRD is also found in secreted frizzled related proteins (SFRPs), which lack the transmembrane segment found in the frizzled protein. The CRD domain is also present in the alpha-1 chain of mouse type XVIII collagen, in carboxypeptidase Z, several receptor tyrosine kinases, and the mosaic transmembrane serine protease corin. The CRD domain is well conserved in metazoans - 10 frizzled proteins have been identified in mammals, 4 in Drosophila and 3 in Caenorhabditis elegans. CRD domains have also been identified in multiple tandem copies in a Dictyostelium discoideum protein. Very little is known about the mechanism by which CRD domains interact with their ligands. 
The domain contains 10 conserved cysteines." Q#23200 - CGI_10007628 superfamily 243061 317 410 2.62E-06 45.9206 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#23201 - CGI_10007629 superfamily 241583 186 373 3.60E-73 233.617 cl00064 ZnMc superfamily - - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#23201 - CGI_10007629 superfamily 243051 416 567 4.43E-42 148.68 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#23201 - CGI_10007629 superfamily 243051 47 125 1.82E-19 85.5073 cl02479 MAM superfamily C - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. 
MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#23202 - CGI_10007630 superfamily 241583 1 167 2.16E-65 210.505 cl00064 ZnMc superfamily - - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#23202 - CGI_10007630 superfamily 243051 203 353 2.03E-40 143.287 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. 
In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#23202 - CGI_10007630 superfamily 243051 334 461 1.61E-24 98.9893 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#23205 - CGI_10007633 superfamily 216062 22 91 2.04E-05 40.499 cl02928 TGFb_propeptide superfamily C - TGF-beta propeptide; This propeptide is known as latency associated peptide (LAP) in TGF-beta. LAP is a homodimer which is disulfide linked to TGF-beta binding protein. Q#23206 - CGI_10007634 superfamily 243062 11 113 3.96E-47 148.578 cl02510 TGF_beta superfamily - - Transforming growth factor beta like domain; Transforming growth factor beta like domain. Q#23208 - CGI_10016976 superfamily 216653 80 203 5.90E-15 68.3927 cl08331 Na_Ca_ex superfamily - - "Sodium/calcium exchanger protein; This is a family of sodium/calcium exchanger integral membrane proteins. 
This family covers the integral membrane regions of the proteins. Sodium/calcium exchangers regulate intracellular Ca2+ concentrations in many cells; cardiac myocytes, epithelial cells, neurons retinal rod photoreceptors and smooth muscle cells. Ca2+ is moved into or out of the cytosol depending on Na+ concentration. In humans and rats there are 3 isoforms; NCX1 NCX2 and NCX3." Q#23209 - CGI_10016977 superfamily 216653 395 548 6.74E-23 94.5862 cl08331 Na_Ca_ex superfamily - - "Sodium/calcium exchanger protein; This is a family of sodium/calcium exchanger integral membrane proteins. This family covers the integral membrane regions of the proteins. Sodium/calcium exchangers regulate intracellular Ca2+ concentrations in many cells; cardiac myocytes, epithelial cells, neurons retinal rod photoreceptors and smooth muscle cells. Ca2+ is moved into or out of the cytosol depending on Na+ concentration. In humans and rats there are 3 isoforms; NCX1 NCX2 and NCX3." Q#23209 - CGI_10016977 superfamily 207627 111 198 9.28E-21 87.3051 cl02522 Calx-beta superfamily - - Calx-beta domain; Calx-beta domain. Q#23209 - CGI_10016977 superfamily 207627 223 311 5.82E-17 76.5243 cl02522 Calx-beta superfamily - - Calx-beta domain; Calx-beta domain. Q#23210 - CGI_10016978 superfamily 216653 653 812 6.95E-20 87.2674 cl08331 Na_Ca_ex superfamily - - "Sodium/calcium exchanger protein; This is a family of sodium/calcium exchanger integral membrane proteins. This family covers the integral membrane regions of the proteins. Sodium/calcium exchangers regulate intracellular Ca2+ concentrations in many cells; cardiac myocytes, epithelial cells, neurons retinal rod photoreceptors and smooth muscle cells. Ca2+ is moved into or out of the cytosol depending on Na+ concentration. In humans and rats there are 3 isoforms; NCX1 NCX2 and NCX3." 
Q#23210 - CGI_10016978 superfamily 216653 42 202 1.00E-14 72.2447 cl08331 Na_Ca_ex superfamily - - "Sodium/calcium exchanger protein; This is a family of sodium/calcium exchanger integral membrane proteins. This family covers the integral membrane regions of the proteins. Sodium/calcium exchangers regulate intracellular Ca2+ concentrations in many cells; cardiac myocytes, epithelial cells, neurons retinal rod photoreceptors and smooth muscle cells. Ca2+ is moved into or out of the cytosol depending on Na+ concentration. In humans and rats there are 3 isoforms; NCX1 NCX2 and NCX3." Q#23210 - CGI_10016978 superfamily 207627 451 547 9.41E-14 68.4303 cl02522 Calx-beta superfamily - - Calx-beta domain; Calx-beta domain. Q#23210 - CGI_10016978 superfamily 207627 338 425 3.50E-11 60.7263 cl02522 Calx-beta superfamily - - Calx-beta domain; Calx-beta domain. Q#23211 - CGI_10016979 superfamily 245814 350 423 0.00704012 34.7885 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#23212 - CGI_10016980 superfamily 245622 105 161 1.06E-16 72.6422 cl11446 Rhomboid superfamily C - "Rhomboid family; This family contains integral membrane proteins that are related to Drosophila rhomboid protein. Members of this family are found in bacteria and eukaryotes. Rhomboid promotes the cleavage of the membrane-anchored TGF-alpha-like growth factor Spitz, allowing it to activate the Drosophila EGF receptor. 
Analysis has shown that Rhomboid-1 is an intramembrane serine protease (EC:3.4.21.105). Parasite-encoded rhomboid enzymes are also important for invasion of host cells by Toxoplasma and the malaria parasite." Q#23214 - CGI_10016982 superfamily 220735 330 949 3.38E-166 505.647 cl15660 Ufd2P_core superfamily - - "Ubiquitin elongating factor core; This is the most conserved part of the core region of Ufd2P ubiquitin elongating factor or E4, running from helix alpha-11 to alpha-38. It consists of 31 helices of variable length connected by loops of variable size forming a compact unit; the helical packing pattern of the compact unit consists of five structural repeats that resemble tandem Armadillo (ARM) repeats. This domain is involved in ubiquitination as it binds Cdc48p and escorts ubiquitinated proteins from Cdc48p to the proteasome for degradation. The core is structurally similar to the nuclear transporter protein importin-alpha. The core is associated with the U-box at the C-terminus, pfam04564, which has ligase activity." Q#23214 - CGI_10016982 superfamily 248098 965 1036 1.60E-27 107.764 cl17544 U-box superfamily - - U-box domain; This domain is related to the Ring finger pfam00097 but lacks the zinc binding residues. Q#23215 - CGI_10016983 superfamily 114359 7 308 9.62E-69 224.788 cl17943 DUF791 superfamily - - Protein of unknown function (DUF791); This family consists of several eukaryotic proteins of unknown function. Q#23216 - CGI_10016984 superfamily 216686 25 202 1.72E-41 143.232 cl18377 Galactosyl_T superfamily - - "Galactosyltransferase; This family includes the galactosyltransferases UDP-galactose:2-acetamido-2-deoxy-D-glucose3beta-galactosyltransferase and UDP-Gal:beta-GlcNAc beta 1,3-galactosyltranferase. Specific galactosyltransferases transfer galactose to GlcNAc terminal chains in the synthesis of the lacto-series oligosaccharides types 1 and 2." 
Q#23217 - CGI_10016985 superfamily 113585 127 241 1.85E-26 101.632 cl04775 DUF716 superfamily - - "Family of unknown function (DUF716); This family is equally distributed in both metazoa and plants. Annotation associated with a member from Nicotiana tabacum suggest that it may be involved in response to viral attack in plants. However, no clear function has been assigned to this family." Q#23218 - CGI_10016986 superfamily 247805 22 94 5.75E-10 51.8763 cl17251 DEXDc superfamily C - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#23219 - CGI_10016987 superfamily 247905 190 296 5.28E-12 62.2552 cl17351 HELICc superfamily N - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#23219 - CGI_10016987 superfamily 247805 43 130 7.16E-05 41.8611 cl17251 DEXDc superfamily N - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#23220 - CGI_10016988 superfamily 218140 353 771 1.50E-76 258.682 cl04579 Anoctamin superfamily - - "Calcium-activated chloride channel; The family carries eight putative transmembrane domains, and, although it has no similarity to other known channel proteins, it is clearly a calcium-activated ionic channel. 
It is expressed in various secretory epithelia, the retina and sensory neurons, and mediates receptor-activated chloride currents in diverse physiological processes." Q#23221 - CGI_10016989 superfamily 247905 6 109 4.70E-19 80.7448 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#23221 - CGI_10016989 superfamily 221155 183 302 3.65E-08 50.444 cl13152 RIG-I_C-RD superfamily - - "C-terminal domain of RIG-I; This family of proteins represents the regulatory domain RD of RIG-I, a protein which initiates a signalling cascade that provides essential antiviral protection for the host. The RD domain binds viral RNA, activating the RIG-I ATPase by RNA-dependant dimerisation. The structure of RD contains a zinc-binding domain and is thought to confer ligand specificity." Q#23222 - CGI_10016990 superfamily 247805 109 260 7.98E-24 96.6375 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#23225 - CGI_10016993 superfamily 152119 182 234 0.00246823 36.1287 cl13182 DUF3278 superfamily N - Protein of unknown function (DUF3278); This bacterial family of proteins has no known function. 
Q#23228 - CGI_10016997 superfamily 241613 117 152 8.09E-05 38.3418 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#23228 - CGI_10016997 superfamily 246918 54 106 2.50E-11 56.0559 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#23230 - CGI_10016999 superfamily 241583 392 604 3.09E-25 105.398 cl00064 ZnMc superfamily - - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#23231 - CGI_10017000 superfamily 217293 49 259 1.02E-37 137.379 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#23231 - CGI_10017000 superfamily 202474 266 357 7.45E-12 63.4417 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. 
Q#23233 - CGI_10017002 superfamily 206050 28 124 1.19E-25 96.9595 cl16449 KIAA1430 superfamily - - KIAA1430 homologue; This is a family of KIAA1430 homologues. The function is not known. Q#23234 - CGI_10017003 superfamily 241599 103 159 2.01E-22 88.0692 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#23234 - CGI_10017003 superfamily 146451 266 284 0.000383259 37.3387 cl08404 OAR superfamily - - OAR domain; OAR domain. Q#23235 - CGI_10017004 superfamily 246228 482 682 9.68E-55 187.104 cl13209 CPSF73-100_C superfamily - - Pre-mRNA 3'-end-processing endonuclease polyadenylation factor C-term; This is the C-terminal conserved region of the pre-mRNA 3'-end-processing of the polyadenylation factor CPSF-73/CPSF-100 proteins. The exact function of this domain is not known. Q#23235 - CGI_10017004 superfamily 246031 251 372 4.23E-43 151.538 cl12567 Beta-Casp superfamily - - Beta-Casp domain; The beta-CASP domain is found C terminal to the beta-lactamase domain in pre-mRNA 3'-end-processing endonuclease. The active site of this enzyme is located at the interface of these two domains. Q#23235 - CGI_10017004 superfamily 241867 25 185 2.13E-11 62.358 cl00446 Lactamase_B superfamily - - Metallo-beta-lactamase superfamily; Metallo-beta-lactamase superfamily. Q#23235 - CGI_10017004 superfamily 203663 385 427 5.07E-09 53.2623 cl06522 RMMBL superfamily - - RNA-metabolising metallo-beta-lactamase; The metallo-beta-lactamase fold contains five sequence motifs. The first four motifs are found in pfam00753 and are common to all metallo-beta-lactamases. The fifth motif appears to be specific to function. This entry represents the fifth motif from metallo-beta-lactamases involved in RNA metabolism. 
Q#23236 - CGI_10017005 superfamily 247976 167 280 2.09E-49 162.74 cl17422 RF-1 superfamily - - "RF-1 domain; This domain is found in peptide chain release factors such as RF-1 and RF-2, and a number of smaller proteins of unknown function. This domain contains the peptidyl-tRNA hydrolase activity. The domain contains a highly conserved motif GGQ, where the glutamine is thought to coordinate the water that mediates the hydrolysis." Q#23236 - CGI_10017005 superfamily 248299 34 138 8.44E-22 88.3607 cl17745 PCRF superfamily - - PCRF domain; This domain is found in peptide chain release factors. Q#23239 - CGI_10011798 superfamily 247856 7 30 0.00564633 30.9789 cl17302 EFh superfamily C - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#23240 - CGI_10011799 superfamily 216212 350 832 8.75E-84 279.562 cl03037 HCO3_cotransp superfamily - - HCO3- transporter family; This family contains Band 3 anion exchange proteins that exchange CL-/HCO3-. This family also includes cotransporters of Na+/HCO3-. Q#23240 - CGI_10011799 superfamily 241651 236 320 0.00020501 41.1566 cl00163 PTS_IIA_fru superfamily N - "PTS_IIA, PTS system, fructose/mannitol specific IIA subunit. The bacterial phosphoenolpyruvate: sugar phosphotransferase system (PTS) is a multi-protein system involved in the regulation of a variety of metabolic and transcriptional processes. This family is one of four structurally and functionally distinct group IIA PTS system cytoplasmic enzymes, necessary for the uptake of carbohydrates across the cytoplasmic membrane and their phosphorylation." 
Q#23241 - CGI_10011800 superfamily 219209 42 373 2.27E-29 116.301 cl06089 RNA_pol_I_A49 superfamily - - "A49-like RNA polymerase I associated factor; Saccharomyces cerevisiae A49 is a specific subunit associated with RNA polymerase I (Pol I) in eukaryotes. Pol I maintains transcription activities in A49 deletion mutants. However, such mutants are deficient in transcription activity at low temperatures. Deletion analysis of the fission yeast homolog indicates that only the C-terminal two thirds are required for function. Transcript analysis has demonstrated that A49 is maximising transcription of ribosomal DNA." Q#23242 - CGI_10011801 superfamily 245596 74 369 0 517.913 cl11394 Glyco_tranf_GTA_type superfamily - - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." 
Q#23242 - CGI_10011801 superfamily 247085 387 510 2.67E-22 92.5686 cl15820 RICIN superfamily - - "Ricin-type beta-trefoil; Carbohydrate-binding domain formed from presumed gene triplication. The domain is found in a variety of molecules serving diverse functions such as enzymatic activity, inhibitory toxicity and signal transduction. Highly specific ligand binding occurs on exposed surfaces of the compact domain structure." Q#23243 - CGI_10011802 superfamily 241599 5 50 1.33E-07 47.2381 cl00084 homeodomain superfamily N - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#23244 - CGI_10011803 superfamily 247856 161 224 1.80E-08 49.0833 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#23244 - CGI_10011803 superfamily 215821 56 150 3.30E-37 127.357 cl18346 FKBP_C superfamily - - FKBP-type peptidyl-prolyl cis-trans isomerase; FKBP-type peptidyl-prolyl cis-trans isomerase. Q#23247 - CGI_10011806 superfamily 245201 248 514 1.60E-144 420.312 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." 
Q#23248 - CGI_10011807 superfamily 216686 106 297 3.37E-52 172.893 cl18377 Galactosyl_T superfamily - - "Galactosyltransferase; This family includes the galactosyltransferases UDP-galactose:2-acetamido-2-deoxy-D-glucose3beta-galactosyltransferase and UDP-Gal:beta-GlcNAc beta 1,3-galactosyltranferase. Specific galactosyltransferases transfer galactose to GlcNAc terminal chains in the synthesis of the lacto-series oligosaccharides types 1 and 2." Q#23252 - CGI_10012539 superfamily 243095 631 879 3.46E-57 195.637 cl02570 RhoGAP superfamily - - "RhoGAP: GTPase-activator protein (GAP) for Rho-like GTPases; GAPs towards Rho/Rac/Cdc42-like small GTPases. Small GTPases (G proteins) cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when bound to GDP. The Rho family of small G proteins, which includes Cdc42Hs, activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. G proteins generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. The RhoGAPs are one of the major classes of regulators of Rho G proteins." Q#23252 - CGI_10012539 superfamily 241900 264 483 9.22E-106 330.815 cl00490 EEP superfamily - - "Exonuclease-Endonuclease-Phosphatase (EEP) domain superfamily; This large superfamily includes the catalytic domain (exonuclease/endonuclease/phosphatase or EEP domain) of a diverse set of proteins including the ExoIII family of apurinic/apyrimidinic (AP) endonucleases, inositol polyphosphate 5-phosphatases (INPP5), neutral sphingomyelinases (nSMases), deadenylases (such as the vertebrate circadian-clock regulated nocturnin), bacterial cytolethal distending toxin B (CdtB), deoxyribonuclease 1 (DNase1), the endonuclease domain of the non-LTR retrotransposon LINE-1, and related domains. 
These diverse enzymes share a common catalytic mechanism of cleaving phosphodiester bonds; their substrates range from nucleic acids to phospholipids and perhaps proteins." Q#23253 - CGI_10012540 superfamily 245838 190 452 2.41E-45 158.762 cl12018 Peptidase_M48 superfamily - - Peptidase family M48; Peptidase family M48. Q#23254 - CGI_10012541 superfamily 243043 168 187 7.11E-05 38.3766 cl02453 IlGF_like superfamily N - "Insulin/insulin-like growth factor/relaxin family; insulin family of proteins. Members include a number of active peptides which are evolutionary related including insulin, relaxin, prorelaxin, insulin-like growth factors I and II, mammalian Leydig cell-specific insulin-like peptide (gene INSL3), early placenta insulin-like peptide (ELIP; gene INSL4), insect prothoracicotropic hormone (bombyxin), locust insulin-related peptide (LIRP), molluscan insulin-related peptides 1 to 5 (MIP), and C. elegans insulin-like peptides. Typically, the active forms of these peptide hormones are composed of two chains (A and B) linked by two disulfide bonds; the arrangement of four cysteines is conserved in the "A" chain: Cys1 is linked by a disulfide bond to Cys3, Cys2 and Cys4 are linked by interchain disulfide bonds to cysteines in the "B" chain. This alignment contains both chains, plus the intervening linker region, arranged as found in the propeptide form. Propeptides are cleaved to yield two separate chains linked covalently by the two disulfide bonds." Q#23255 - CGI_10012542 superfamily 243043 52 146 1.20E-06 43.2346 cl02453 IlGF_like superfamily - - "Insulin/insulin-like growth factor/relaxin family; insulin family of proteins. 
Members include a number of active peptides which are evolutionary related including insulin, relaxin, prorelaxin, insulin-like growth factors I and II, mammalian Leydig cell-specific insulin-like peptide (gene INSL3), early placenta insulin-like peptide (ELIP; gene INSL4), insect prothoracicotropic hormone (bombyxin), locust insulin-related peptide (LIRP), molluscan insulin-related peptides 1 to 5 (MIP), and C. elegans insulin-like peptides. Typically, the active forms of these peptide hormones are composed of two chains (A and B) linked by two disulfide bonds; the arrangement of four cysteines is conserved in the "A" chain: Cys1 is linked by a disulfide bond to Cys3, Cys2 and Cys4 are linked by interchain disulfide bonds to cysteines in the "B" chain. This alignment contains both chains, plus the intervening linker region, arranged as found in the propeptide form. Propeptides are cleaved to yield two separate chains linked covalently by the two disulfide bonds." Q#23257 - CGI_10012544 superfamily 248458 348 732 5.28E-29 118.57 cl17904 MFS superfamily - - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. 
Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#23258 - CGI_10012545 superfamily 241571 6 130 3.12E-20 83.6158 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#23258 - CGI_10012545 superfamily 241571 137 259 2.07E-12 62.0446 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#23260 - CGI_10008246 superfamily 246921 322 374 1.65E-09 55.0741 cl15299 FG-GAP superfamily - - "FG-GAP repeat; This family contains the extracellular repeat that is found in up to seven copies in alpha integrins. This repeat has been predicted to fold into a beta propeller structure. The repeat is called the FG-GAP repeat after two conserved motifs in the repeat. The FG-GAP repeats are found in the N terminus of integrin alpha chains, a region that has been shown to be important for ligand binding. A putative Ca2+ binding motif is found in some of the repeats." Q#23260 - CGI_10008246 superfamily 246921 254 305 5.32E-08 50.8369 cl15299 FG-GAP superfamily - - "FG-GAP repeat; This family contains the extracellular repeat that is found in up to seven copies in alpha integrins. 
This repeat has been predicted to fold into a beta propeller structure. The repeat is called the FG-GAP repeat after two conserved motifs in the repeat. The FG-GAP repeats are found in the N terminus of integrin alpha chains, a region that has been shown to be important for ligand binding. A putative Ca2+ binding motif is found in some of the repeats." Q#23260 - CGI_10008246 superfamily 246921 31 78 0.00838168 35.4289 cl15299 FG-GAP superfamily - - "FG-GAP repeat; This family contains the extracellular repeat that is found in up to seven copies in alpha integrins. This repeat has been predicted to fold into a beta propeller structure. The repeat is called the FG-GAP repeat after two conserved motifs in the repeat. The FG-GAP repeats are found in the N terminus of integrin alpha chains, a region that has been shown to be important for ligand binding. A putative Ca2+ binding motif is found in some of the repeats." Q#23261 - CGI_10008247 superfamily 245230 34 466 0 912.444 cl10017 Tubulin_FtsZ superfamily - - "Tubulin/FtsZ: Family includes tubulin alpha-, beta-, gamma-, delta-, and epsilon-tubulins as well as FtsZ, all of which are involved in polymer formation. Tubulin is the major component of microtubules, but also exists as a heterodimer and as a curved oligomer. Microtubules exist in all eukaryotic cells and are responsible for many functions, including cellular transport, cell motility, and mitosis. FtsZ forms a ring-shaped septum at the site of bacterial cell division, which is required for constriction of cell membrane and cell envelope to yield two daughter cells. FtsZ can polymerize into tubes, sheets, and rings in vitro and is ubiquitous in eubacteria, archaea, and chloroplasts." 
Q#23266 - CGI_10007018 superfamily 216686 3 123 3.07E-19 80.8301 cl18377 Galactosyl_T superfamily N - "Galactosyltransferase; This family includes the galactosyltransferases UDP-galactose:2-acetamido-2-deoxy-D-glucose3beta-galactosyltransferase and UDP-Gal:beta-GlcNAc beta 1,3-galactosyltranferase. Specific galactosyltransferases transfer galactose to GlcNAc terminal chains in the synthesis of the lacto-series oligosaccharides types 1 and 2." Q#23269 - CGI_10007021 superfamily 241874 9 456 2.52E-125 387.992 cl00456 SLC5-6-like_sbd superfamily - - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#23269 - CGI_10007021 superfamily 248289 679 715 0.00203628 37.1104 cl17735 VWC superfamily C - von Willebrand factor type C domain; The high cutoff was used to prevent overlap with pfam00094. 
Q#23269 - CGI_10007021 superfamily 214565 597 662 0.00374413 36.3865 cl18312 VWC_out superfamily - - von Willebrand factor (vWF) type C domain; von Willebrand factor (vWF) type C domain. Q#23269 - CGI_10007021 superfamily 248289 515 552 0.00526457 35.9548 cl17735 VWC superfamily C - von Willebrand factor type C domain; The high cutoff was used to prevent overlap with pfam00094. Q#23271 - CGI_10006688 superfamily 242406 1 67 3.26E-10 53.3641 cl01271 DUF1768 superfamily N - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. Q#23275 - CGI_10003751 superfamily 146263 125 170 2.10E-07 46.9139 cl04138 SK_channel superfamily C - Calcium-activated SK potassium channel; Calcium-activated SK potassium channel. Q#23275 - CGI_10003751 superfamily 216423 169 199 4.00E-07 48.0015 cl18367 Glyco_hydro_35 superfamily NC - Glycosyl hydrolases family 35; Glycosyl hydrolases family 35. Q#23276 - CGI_10003752 superfamily 198825 268 340 4.03E-37 129.841 cl03763 CaMBD superfamily - - "Calmodulin binding domain; Small-conductance Ca2+-activated K+ channels (SK channels) are independent of voltage and gated solely by intracellular Ca2+. These membrane channels are heteromeric complexes that comprise pore-forming alpha-subunits and the Ca2+-binding protein calmodulin (CaM). CaM binds to the SK channel through this the CaM-binding domain (CaMBD), which is located in an intracellular region of the alpha-subunit immediately carboxy-terminal to the pore. Channel opening is triggered when Ca2+ binds the EF hands in the N-lobe of CaM. The structure of this domain complexed with CaM is known. This domain forms an elongated dimer with a CaM molecule bound at each end; each CaM wraps around three alpha-helices, two from one CaMBD subunit and one from the other." 
Q#23276 - CGI_10003752 superfamily 146263 11 95 2.45E-28 107.005 cl04138 SK_channel superfamily N - Calcium-activated SK potassium channel; Calcium-activated SK potassium channel. Q#23276 - CGI_10003752 superfamily 219619 175 250 2.34E-10 56.0619 cl18518 Ion_trans_2 superfamily - - Ion channel; This family includes the two membrane helix type ion channels found in bacteria. Q#23277 - CGI_10020447 superfamily 241559 19 134 2.61E-18 81.5883 cl00030 CH superfamily - - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." Q#23277 - CGI_10020447 superfamily 241559 163 268 9.88E-16 74.2695 cl00030 CH superfamily - - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." Q#23277 - CGI_10020447 superfamily 241559 285 391 5.15E-11 60.4023 cl00030 CH superfamily - - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." 
Q#23277 - CGI_10020447 superfamily 241559 408 511 1.53E-07 50.0019 cl00030 CH superfamily - - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." Q#23277 - CGI_10020447 superfamily 247829 531 560 2.36E-13 70.2946 cl17275 PRTase_typeII superfamily N - "Phosphoribosyltransferase (PRTase) type II; This family contains two enzymes that play an important role in NAD production by either allowing quinolinic acid (QA), quinolinate phosphoribosyl transferase (QAPRTase), or nicotinic acid (NA), nicotinate phosphoribosyltransferase (NAPRTase), to be used in the synthesis of NAD. QAPRTase catalyses the reaction of quinolinic acid (QA) with 5-phosphoribosyl-1-pyrophosphate (PRPP) in the presence of Mg2+ to produce nicotinic acid mononucleotide (NAMN), pyrophosphate and carbon dioxide, an important step in the de novo synthesis of NAD. NAPRTase catalyses a similar reaction leading to NAMN and pyrophosphate, using nicotinic acid and PRPP as substrates, used in the NAD salvage pathway." Q#23278 - CGI_10020448 superfamily 247068 13 89 7.66E-16 73.1165 cl15786 CA_like superfamily C - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. 
Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#23278 - CGI_10020448 superfamily 247068 130 217 3.11E-11 60.0197 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#23279 - CGI_10020449 superfamily 243319 7 189 6.69E-110 314.981 cl03141 Ribosomal_S7e superfamily - - Ribosomal protein S7e; Ribosomal protein S7e. Q#23280 - CGI_10020450 superfamily 241644 5 102 5.08E-29 104.977 cl00154 UBCc superfamily C - "Ubiquitin-conjugating enzyme E2, catalytic (UBCc) domain. This is part of the ubiquitin-mediated protein degradation pathway in which a thiol-ester linkage forms between a conserved cysteine and the C-terminus of ubiquitin and complexes with ubiquitin protein ligase enzymes, E3. This pathway regulates many fundamental cellular processes. There are also other E2s which form thiol-ester linkages without the use of E3s as well as several UBC homologs (TSG101, Mms2, Croc-1 and similar proteins) which lack the active site cysteine essential for ubiquitination and appear to function in DNA repair pathways which were omitted from the scope of this CD." 
Q#23281 - CGI_10020451 superfamily 241756 164 237 1.61E-24 98.9345 cl00289 FIG superfamily N - "FIG, FBPase/IMPase/glpX-like domain. A superfamily of metal-dependent phosphatases with various substrates. Fructose-1,6-bisphospatase (both the major and the glpX-encoded variant) hydrolyze fructose-1,6,-bisphosphate to fructose-6-phosphate in gluconeogenesis. Inositol-monophosphatases and inositol polyphosphatases play vital roles in eukaryotic signalling, as they participate in metabolizing the messenger molecule Inositol-1,4,5-triphosphate. Many of these enzymes are inhibited by Li+." Q#23281 - CGI_10020451 superfamily 241756 45 162 7.14E-22 91.2305 cl00289 FIG superfamily C - "FIG, FBPase/IMPase/glpX-like domain. A superfamily of metal-dependent phosphatases with various substrates. Fructose-1,6-bisphospatase (both the major and the glpX-encoded variant) hydrolyze fructose-1,6,-bisphosphate to fructose-6-phosphate in gluconeogenesis. Inositol-monophosphatases and inositol polyphosphatases play vital roles in eukaryotic signalling, as they participate in metabolizing the messenger molecule Inositol-1,4,5-triphosphate. Many of these enzymes are inhibited by Li+." Q#23282 - CGI_10020452 superfamily 221744 19 285 3.84E-09 55.9051 cl18614 CABIT superfamily - - "Cell-cycle sustaining, positive selection,; The 'CABIT' domain (for 'cysteine-containing, all- in Themis') is found in a newly identified gene family that has three mammalian homologues (Themis, Icb1 and 9130404H23Rik) that encode proteins with two CABIT domains and a highly conserved proline-rich region. In contrast, Fam59A, Fam59B and related proteins from mammals to cnidarians, including the insect Serrano proteins, have a single copy of the CABIT domain, a proline-rich region and often a C-terminal SAM (sterile-motif) domain. Multiple-sequence alignment has predicted that the CABIT domain adopts an all-strand structure with at least 12 strands, ie a dyad of six-stranded beta-barrel units. 
The CABIT domain contains a nearly absolutely conserved cysteine residue which is likely to be central to its function. CABIT domain proteins function downstream of tyrosine kinase signalling and interact with GRB2." Q#23283 - CGI_10020453 superfamily 221744 45 309 7.30E-14 69.3871 cl18614 CABIT superfamily - - "Cell-cycle sustaining, positive selection,; The 'CABIT' domain (for 'cysteine-containing, all- in Themis') is found in a newly identified gene family that has three mammalian homologues (Themis, Icb1 and 9130404H23Rik) that encode proteins with two CABIT domains and a highly conserved proline-rich region. In contrast, Fam59A, Fam59B and related proteins from mammals to cnidarians, including the insect Serrano proteins, have a single copy of the CABIT domain, a proline-rich region and often a C-terminal SAM (sterile-motif) domain. Multiple-sequence alignment has predicted that the CABIT domain adopts an all-strand structure with at least 12 strands, ie a dyad of six-stranded beta-barrel units. The CABIT domain contains a nearly absolutely conserved cysteine residue which is likely to be central to its function. CABIT domain proteins function downstream of tyrosine kinase signalling and interact with GRB2." Q#23284 - CGI_10020454 superfamily 218122 306 608 2.17E-89 283.334 cl04558 Choline_transpo superfamily - - Plasma-membrane choline transporter; This family represents a high-affinity plasma-membrane choline transporter in C.elegans which is thought to be rate-limiting for ACh synthesis in cholinergic nerve terminals. Q#23285 - CGI_10020455 superfamily 245201 946 1186 3.20E-34 133.431 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. 
These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#23287 - CGI_10020457 superfamily 219127 184 506 9.94E-70 227.963 cl05943 MIG-14_Wnt-bd superfamily - - "Wnt-binding factor required for Wnt secretion; MIG-14 is a Wnt-binding factor. Newly synthesised EGL-20/Wnt binds to MIG-14 in the Golgi, targetting the Wnt to the cell membrane for secretion. AP-2-mediated endocytosis and retromer retrieval at the sorting endosome would recycle MIG-14 to the Golgi, where it can bind to EGL-20/Wnt for next cycle of secretion." Q#23288 - CGI_10020458 superfamily 221744 54 303 9.42E-21 92.8842 cl18614 CABIT superfamily - - "Cell-cycle sustaining, positive selection,; The 'CABIT' domain (for 'cysteine-containing, all- in Themis') is found in a newly identified gene family that has three mammalian homologues (Themis, Icb1 and 9130404H23Rik) that encode proteins with two CABIT domains and a highly conserved proline-rich region. In contrast, Fam59A, Fam59B and related proteins from mammals to cnidarians, including the insect Serrano proteins, have a single copy of the CABIT domain, a proline-rich region and often a C-terminal SAM (sterile-motif) domain. Multiple-sequence alignment has predicted that the CABIT domain adopts an all-strand structure with at least 12 strands, ie a dyad of six-stranded beta-barrel units. The CABIT domain contains a nearly absolutely conserved cysteine residue which is likely to be central to its function. CABIT domain proteins function downstream of tyrosine kinase signalling and interact with GRB2." 
Q#23289 - CGI_10020459 superfamily 221744 92 301 7.23E-09 54.7495 cl18614 CABIT superfamily - - "Cell-cycle sustaining, positive selection,; The 'CABIT' domain (for 'cysteine-containing, all- in Themis') is found in a newly identified gene family that has three mammalian homologues (Themis, Icb1 and 9130404H23Rik) that encode proteins with two CABIT domains and a highly conserved proline-rich region. In contrast, Fam59A, Fam59B and related proteins from mammals to cnidarians, including the insect Serrano proteins, have a single copy of the CABIT domain, a proline-rich region and often a C-terminal SAM (sterile-motif) domain. Multiple-sequence alignment has predicted that the CABIT domain adopts an all-strand structure with at least 12 strands, ie a dyad of six-stranded beta-barrel units. The CABIT domain contains a nearly absolutely conserved cysteine residue which is likely to be central to its function. CABIT domain proteins function downstream of tyrosine kinase signalling and interact with GRB2." Q#23291 - CGI_10020461 superfamily 248097 14 125 4.56E-19 77.3054 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#23292 - CGI_10020462 superfamily 202711 38 210 6.18E-92 269.607 cl04190 Mob1_phocein superfamily - - "Mob1/phocein family; Mob1 is an essential Saccharomyces cerevisiae protein, identified from a two-hybrid screen, that binds Mps1p, a protein kinase essential for spindle pole body duplication and mitotic checkpoint regulation. Mob1 contains no known structural motifs; however MOB1 is a member of a conserved gene family and shares sequence similarity with a nonessential yeast gene, MOB2. Mob1 is a phosphoprotein in vivo and a substrate for the Mps1p kinase in vitro. Conditional alleles of MOB1 cause a late nuclear division arrest at restrictive temperature. This family also includes phocein, a rat protein that by yeast two hybrid interacts with striatin." 
Q#23293 - CGI_10020463 superfamily 215859 479 696 1.49E-70 230.566 cl18347 Peptidase_S9 superfamily - - Prolyl oligopeptidase family; Prolyl oligopeptidase family. Q#23294 - CGI_10020464 superfamily 248097 59 167 1.11E-12 60.7418 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#23298 - CGI_10020468 superfamily 245225 38 494 5.08E-49 181.672 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily - - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein coupled receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine- binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain.
These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#23298 - CGI_10020468 superfamily 245225 556 754 6.11E-28 117.344 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily C - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein coupled receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine- binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain.
These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#23299 - CGI_10020469 superfamily 222150 456 481 5.54E-05 41.2233 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#23299 - CGI_10020469 superfamily 222150 344 369 7.11E-05 40.8381 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#23299 - CGI_10020469 superfamily 222150 373 395 0.000233624 39.2973 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#23299 - CGI_10020469 superfamily 222150 401 425 0.000417124 38.5269 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#23299 - CGI_10020469 superfamily 222150 428 453 0.00146095 36.9861 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#23302 - CGI_10020472 superfamily 243061 667 764 2.19E-24 99.3386 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#23302 - CGI_10020472 superfamily 243061 771 876 2.60E-22 93.5606 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#23302 - CGI_10020472 superfamily 243061 96 203 1.69E-07 50.033 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. 
These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#23302 - CGI_10020472 superfamily 238012 233 262 0.00295393 36.9486 cl11390 EGF_Lam superfamily C - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#23303 - CGI_10020473 superfamily 241571 791 842 1.55E-06 47.7923 cl00049 CUB superfamily N - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#23303 - CGI_10020473 superfamily 243061 544 636 5.36E-20 87.3974 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#23303 - CGI_10020473 superfamily 243061 187 274 1.87E-08 53.4998 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#23303 - CGI_10020473 superfamily 243061 367 446 3.25E-06 46.5662 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. 
Q#23303 - CGI_10020473 superfamily 243061 643 682 0.000243293 40.7882 cl02509 SRCR superfamily C - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#23303 - CGI_10020473 superfamily 238012 37 67 0.00267647 37.3338 cl11390 EGF_Lam superfamily C - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#23303 - CGI_10020473 superfamily 238012 119 152 0.00519408 36.1782 cl11390 EGF_Lam superfamily C - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#23304 - CGI_10020474 superfamily 241571 1727 1846 7.18E-11 62.0446 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#23304 - CGI_10020474 superfamily 243061 1155 1257 7.27E-32 122.836 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. 
These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#23304 - CGI_10020474 superfamily 243061 406 514 4.68E-13 68.1374 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#23304 - CGI_10020474 superfamily 243061 704 817 3.22E-10 59.663 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#23304 - CGI_10020474 superfamily 243061 23 107 4.31E-09 56.1962 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#23304 - CGI_10020474 superfamily 243061 1055 1148 2.87E-07 50.8034 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#23304 - CGI_10020474 superfamily 243061 217 321 3.66E-05 44.255 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#23304 - CGI_10020474 superfamily 241574 2050 2087 0.000154204 44.1138 cl00053 PTPc superfamily C - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. 
The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#23304 - CGI_10020474 superfamily 243051 1343 1505 0.000545166 41.1821 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal cord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#23304 - CGI_10020474 superfamily 205157 1599 1622 0.00143984 38.6727 cl18264 EGF_3 superfamily N - EGF domain; This family includes a variety of EGF-like domain homologues. This family includes the C-terminal domain of the malaria parasite MSP1 protein.
Q#23304 - CGI_10020474 superfamily 238012 833 867 0.00860652 36.5634 cl11390 EGF_Lam superfamily C - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#23305 - CGI_10020475 superfamily 243072 172 306 2.69E-32 118.64 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#23306 - CGI_10020476 superfamily 247676 3 129 4.52E-59 183.61 cl17012 GINS_A superfamily - - "Alpha-helical domain of GINS complex proteins; Sld5, Psf1, Psf2 and Psf3; The GINS complex is involved in both initiation and elongation stages of eukaryotic chromosome replication, with GINS being the component that most likely serves as the replicative helicase that unwinds duplex DNA ahead of the moving replication fork. In eukaryotes, GINS is a tetrameric arrangement of four subunits Sld5, Psf1, Psf2 and Psf3. The GINS complex has been found in eukaryotes and archaea, but not in bacteria. The four subunits of the complex are homologous and consist of two domains each, termed the alpha-helical (A) and beta-strand (B) domains. The A and B domains of Sld5/Psf1 are permuted with respect to Psf1/Psf3." 
Q#23307 - CGI_10020477 superfamily 247757 22 433 0 601.806 cl17203 Fer4_NifH superfamily - - "The Fer4_NifH superfamily contains a variety of proteins which share a common ATP-binding domain. Functionally, proteins in this superfamily use the energy from hydrolysis of NTP to transfer electron or ion." Q#23309 - CGI_10020479 superfamily 247692 58 655 0 942.743 cl17068 AFD_class_I superfamily - - "Adenylate forming domain, Class I; This family includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain." Q#23310 - CGI_10020480 superfamily 202484 1 44 2.52E-11 53.3868 cl03798 zf-Tim10_DDP superfamily N - Tim10/DDP family zinc finger; Putative zinc binding domain with four conserved cysteine residues. This domain is found in the human disease protein TIMM8A. Members of this family such as Tim9 and Tim10 are involved in mitochondrial protein import. Members of this family seem to be localised to the mitochondrial intermembrane space. Q#23312 - CGI_10020482 superfamily 192729 7 60 1.57E-34 125.242 cl14978 IRF-2BP1_2 superfamily - - Interferon regulatory factor 2-binding protein zinc finger; IRF-2BP1 and IRF-2BP2 are nuclear transcriptional repressor proteins and can inhibit both enhancer-activated and basal transcription. They both contain N-terminal zinc finger represented in this family and C-terminal RING finger domains. Q#23314 - CGI_10020484 superfamily 241645 12 125 7.19E-56 172.035 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. 
Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#23316 - CGI_10001715 superfamily 217293 25 229 1.61E-41 145.468 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#23316 - CGI_10001715 superfamily 202474 239 330 3.22E-23 95.4132 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#23318 - CGI_10001569 superfamily 241754 230 547 6.01E-136 428.26 cl00286 Motor_domain superfamily - - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. Q#23319 - CGI_10006946 superfamily 245864 1 394 5.59E-50 177.275 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. 
Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#23321 - CGI_10006948 superfamily 247792 38 71 9.15E-07 44.3588 cl17238 RING superfamily C - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#23323 - CGI_10006950 superfamily 241597 166 236 1.31E-20 88.5093 cl00082 HMG-box superfamily - - "High Mobility Group (HMG)-box is found in a variety of eukaryotic chromosomal proteins and transcription factors. HMGs bind to the minor groove of DNA and have been classified by DNA binding preferences. Two phylogenetically distinct groups of Class I proteins bind DNA in a sequence specific fashion and contain a single HMG box.
One group (SOX-TCF) includes transcription factors, TCF-1, -3, -4; and also SRY and LEF-1, which bind four-way DNA junctions and duplex DNA targets. The second group (MATA) includes fungal mating type gene products MC, MATA1 and Ste11. Class II and III proteins (HMGB-UBF) bind DNA in a non-sequence specific fashion and contain two or more tandem HMG boxes. Class II members include non-histone chromosomal proteins, HMG1 and HMG2, which bind to bent or distorted DNA such as four-way DNA junctions, synthetic DNA cruciforms, kinked cisplatin-modified DNA, DNA bulges, cross-overs in supercoiled DNA, and can cause looping of linear DNA. Class III members include nucleolar and mitochondrial transcription factors, UBF and mtTF1, which bind four-way DNA junctions." Q#23324 - CGI_10006951 superfamily 241896 116 317 7.30E-124 356.426 cl00483 UDG_like superfamily - - "Uracil-DNA glycosylases (UDG) and related enzymes; Uracil-DNA glycosylases (UDG) catalyze the removal of uracil from DNA, which initiates the DNA base excision repair pathway. Uracil in DNA can arise as a result of mis-incorporation of dUMP residues by DNA polymerase or via deamination of cytosine. Uracil in DNA mispaired with guanine is one of the major pro-mutagenic events, causing G:C->A:T mutations. Thus, UDG is an essential enzyme for maintaining the integrity of genetic information. At least five UDG families have been characterized so far; these families share similar overall folds and common active site motifs. They demonstrate different substrate specificities, but often the function of one enzyme can be complemented by the other. Family 1 enzymes are active against uracil in both ssDNA and dsDNA, and recognize uracil explicitly in an extrahelical conformation via a combination of protein and bound-water interactions. Family 2 enzymes are mismatch specific and explicitly recognize the widowed guanine on the complementary strand, rather than the extrahelical scissile pyrimidine.
This allows a broader specificity so that some Family 2 enzymes can excise uracil as well as 3, N(4)-ethenocytosine from mismatches with guanine. A Family 3 UDG from human was first characterized to remove uracil from ssDNA, hence the name hSMUG (single-strand-selective monofunctional uracil-DNA glycosylase). However, subsequent research has shown that hSMUG1 and its rat ortholog can remove uracil and its oxidized pyrimidine derivatives from both ssDNA and dsDNA. Enzymes in Families 4 and 5 are both thermostable. Family 4 enzymes specifically recognize uracil in a manner similar to human UDG (Family 1), rather than guanine in the complementary strand DNA, as does E. coli MUG (Family 2). These results suggest that the mechanism by which Family 4 UDGs remove uracils from DNA is similar to that of Family 1 enzymes. Although Family 5 enzymes are close relatives of Family 4, they show different substrate specificities." Q#23325 - CGI_10006952 superfamily 247044 3 138 7.45E-62 200.941 cl15697 ADF_gelsolin superfamily - - Actin depolymerization factor/cofilin- and gelsolin-like domains; Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. Q#23325 - CGI_10006952 superfamily 247683 465 518 2.10E-30 112.493 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55.
SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#23326 - CGI_10006953 superfamily 247724 13 208 4.45E-72 234.091 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." 
Q#23326 - CGI_10006953 superfamily 243066 471 568 2.62E-14 69.642 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#23328 - CGI_10006955 superfamily 247941 173 292 0.00879039 35.5661 cl17387 Methyltransf_21 superfamily C - "Methyltransferase FkbM domain; This family has members from bacteria to human, and appears to be a methyltransferase." Q#23332 - CGI_10007277 superfamily 245210 1 105 4.95E-56 179.982 cl09938 cond_enzymes superfamily N - "Condensing enzymes; Family of enzymes that catalyze a (decarboxylating or non-decarboxylating) Claisen-like condensation reaction. Members are share strong structural similarity, and are involved in the synthesis and degradation of fatty acids, and the production of polyketides, a diverse group of natural products." Q#23333 - CGI_10007278 superfamily 248097 10 123 6.43E-20 79.6166 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. 
Q#23334 - CGI_10007279 superfamily 245210 41 237 1.44E-67 215.421 cl09938 cond_enzymes superfamily C - "Condensing enzymes; Family of enzymes that catalyze a (decarboxylating or non-decarboxylating) Claisen-like condensation reaction. Members are share strong structural similarity, and are involved in the synthesis and degradation of fatty acids, and the production of polyketides, a diverse group of natural products." Q#23335 - CGI_10007280 superfamily 243096 981 1165 4.64E-29 117.014 cl02571 RhoGEF superfamily - - Guanine nucleotide exchange factor for Rho/Rac/Cdc42-like GTPases; Also called Dbl-homologous (DH) domain. It appears that PH domains invariably occur C-terminal to RhoGEF/DH domains. Q#23335 - CGI_10007280 superfamily 247725 1212 1305 2.58E-28 111.988 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#23335 - CGI_10007280 superfamily 241566 726 771 1.08E-08 54.0352 cl00040 C1 superfamily - - "Protein kinase C conserved region 1 (C1) . Cysteine-rich zinc binding domain. Some members of this domain family bind phorbol esters and diacylglycerol, some are reported to bind RasGTP. May occur in tandem arrangement. Diacylglycerol (DAG) is a second messenger, released by activation of Phospholipase D. Phorbol Esters (PE) can act as analogues of DAG and mimic its downstream effects in, for example, tumor promotion. Protein Kinases C are activated by DAG/PE, this activation is mediated by their N-terminal conserved region (C1). 
DAG/PE binding may be phospholipid dependent. C1 domains may also mediate DAG/PE signals in chimaerins (a family of Rac GTPase activating proteins), RasGRPs (exchange factors for Ras/Rap1), and Munc13 isoforms (scaffolding proteins involved in exocytosis)." Q#23336 - CGI_10007281 superfamily 245201 30 217 4.94E-37 133.029 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#23337 - CGI_10007282 superfamily 217473 114 205 5.01E-05 43.893 cl03978 Mab-21 superfamily N - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#23338 - CGI_10007283 superfamily 192604 21 96 3.66E-14 66.1686 cl11135 PACT_coil_coil superfamily - - "Pericentrin-AKAP-450 domain of centrosomal targeting protein; This domain is a coiled-coil region close to the C-terminus of centrosomal proteins that is directly responsible for recruiting AKAP-450 and pericentrin to the centrosome. Hence the suggested name for this region is a PACT domain (pericentrin-AKAP-450 centrosomal targeting). This domain is also present at the C-terminus of coiled-coil proteins from Drosophila and S. pombe, and that from the Drosophila protein is sufficient for targeting to the centrosome in mammalian cells. The function of these proteins is unknown but they seem good candidates for having a centrosomal or spindle pole body location. 
The final 22 residues of this domain in AKAP-450 appear specifically to be a calmodulin-binding domain indicating that this member at least is likely to contribute to centrosome assembly." Q#23343 - CGI_10013128 superfamily 247740 106 155 0.00427014 36.455 cl17186 TIM_phosphate_binding superfamily NC - "TIM barrel proteins share a structurally conserved phosphate binding motif and in general share an eight beta/alpha closed barrel structure. Specific for this family is the conserved phosphate binding site at the edges of strands 7 and 8. The phosphate comes either from the substrate, as in the case of inosine monophosphate dehydrogenase (IMPDH), or from ribulose-5-phosphate 3-epimerase (RPE) or from cofactors, like FMN." Q#23346 - CGI_10013131 superfamily 241752 643 709 5.09E-20 86.9897 cl00283 ADP_ribosyl superfamily N - "ADP_ribosylating enzymes catalyze the transfer of ADP_ribose from NAD+ to substrates. Bacterial toxins are cytoplasmic and catalyze the transfer of a single ADP_ribose unit to eukaryotic elongation factor 2, halting protein synthesis and killing the cell. Poly(ADP-ribose) polymerases (PARPS 1-3, VPARP, tankyrase) catalyze the addition of up to 100 ADP_ribose units from NAD+. PARPs 1 and 2 are localized in the nucleaus, bind DNA, and are activated by DNA damage. VPARP is part of the vault ribonucleoprotein complex. Tankyrases regulates telomere length in part through poy(ADP_ribosylation) of telomere repeat binding factor 1 (TRF1). Poly(ADP-ribose) polymerase catalyses the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active." 
Q#23346 - CGI_10013131 superfamily 241752 136 188 1.57E-18 82.7525 cl00283 ADP_ribosyl superfamily C - "ADP_ribosylating enzymes catalyze the transfer of ADP_ribose from NAD+ to substrates. Bacterial toxins are cytoplasmic and catalyze the transfer of a single ADP_ribose unit to eukaryotic elongation factor 2, halting protein synthesis and killing the cell. Poly(ADP-ribose) polymerases (PARPS 1-3, VPARP, tankyrase) catalyze the addition of up to 100 ADP_ribose units from NAD+. PARPs 1 and 2 are localized in the nucleaus, bind DNA, and are activated by DNA damage. VPARP is part of the vault ribonucleoprotein complex. Tankyrases regulates telomere length in part through poy(ADP_ribosylation) of telomere repeat binding factor 1 (TRF1). Poly(ADP-ribose) polymerase catalyses the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active." Q#23346 - CGI_10013131 superfamily 145501 414 567 0.00346884 37.288 cl03566 Cornifin superfamily - - Cornifin (SPRR) family; SPRR genes (formerly SPR) encode a novel class of polypeptides (small proline rich proteins) that are strongly induced during differentiation of human epidermal keratinocytes in vitro and in vivo. The most characteristic feature of the SPRR gene family resides in the structure of the central segments of the encoded polypeptides that are built up from tandemly repeated units of either eight (SPRR1 and SPRR3) or nine (SPRR2) amino acids with the general consensus XKXPEPXX where X is any amino acid. Q#23348 - CGI_10013133 superfamily 241583 247 434 1.67E-59 200.105 cl00064 ZnMc superfamily - - "Zinc-dependent metalloprotease. 
This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#23348 - CGI_10013133 superfamily 241571 618 736 4.79E-23 95.557 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#23348 - CGI_10013133 superfamily 243035 495 610 2.71E-09 55.7037 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. 
Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#23348 - CGI_10013133 superfamily 216301 27 216 2.38E-40 146.64 cl03099 EMP24_GP25L superfamily - - emp24/gp25L/p24 family/GOLD; Members of this family are implicated in bringing cargo forward from the ER and binding to coat proteins by their cytoplasmic domains. This domain corresponds closely to the beta-strand rich GOLD domain described in. The GOLD domain is always found combined with lipid- or membrane-association domains. 
Q#23349 - CGI_10013134 superfamily 241578 187 358 1.89E-20 90.9165 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#23349 - CGI_10013134 superfamily 217211 408 474 4.75E-08 51.9014 cl03691 Cache_1 superfamily - - Cache domain; Cache domain. Q#23351 - CGI_10013136 superfamily 248097 66 189 1.04E-14 66.905 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#23353 - CGI_10013138 superfamily 248097 9 119 1.83E-08 48.0302 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#23356 - CGI_10013141 superfamily 241831 18 81 5.60E-13 59.4104 cl00386 BolA superfamily N - BolA-like protein; This family consist of the morphoprotein BolA from E. coli and its various homologues. In E. coli over expression of this protein causes round morphology and may be involved in switching the cell between elongation and septation systems during cell division. 
The expression of BolA is growth rate regulated and is induced during the transition into the stationary phase. BolA is also induced by stress during early stages of growth and may have a general role in stress response. It has also been suggested that BolA can induce the transcription of penicillin binding proteins 6 and 5. Q#23357 - CGI_10009439 superfamily 241600 2 194 1.65E-54 174.736 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#23358 - CGI_10009440 superfamily 241600 9 153 2.75E-46 163.18 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." 
Q#23358 - CGI_10009440 superfamily 241600 372 488 8.49E-28 111.563 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#23358 - CGI_10009440 superfamily 241600 178 255 4.64E-21 91.9182 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#23358 - CGI_10009440 superfamily 248275 616 637 0.00491457 35.6336 cl17721 zf-C2H2_jaz superfamily C - "Zinc-finger double-stranded RNA-binding; This domain family is found in archaea and eukaryotes, and is approximately 30 amino acids in length. The mammalian members of this group occur multiple times along the protein, joined by flexible linkers, and are referred to as JAZ - dsRNA-binding ZF protein - zinc-fingers. 
The JAZ proteins are expressed in all tissues tested and localise in the nucleus, particularly the nucleolus. JAZ preferentially binds to double-stranded (ds) RNA or RNA/DNA hybrids rather than DNA. In addition to binding double-stranded RNA, these zinc-fingers are required for nucleolar localisation." Q#23360 - CGI_10009442 superfamily 241613 632 664 2.69E-06 45.6606 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#23361 - CGI_10009443 superfamily 241613 438 469 1.64E-11 60.2981 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and 
maintaining the modular structure" Q#23362 - CGI_10009444 superfamily 243035 113 206 5.05E-08 49.1554 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. 
Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#23373 - CGI_10002849 superfamily 110440 484 510 9.62E-05 40.0837 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#23373 - CGI_10002849 superfamily 241563 59 95 0.000353698 38.6144 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#23373 - CGI_10002849 superfamily 128778 103 212 0.00981233 35.3183 cl17972 BBC superfamily - - B-Box C-terminal domain; Coiled coil region C-terminal to (some) B-Box domains Q#23374 - CGI_10002044 superfamily 241600 1 102 2.58E-49 159.328 cl00085 FReD superfamily C - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#23375 - CGI_10002045 superfamily 241600 27 197 3.61E-69 212.871 cl00085 FReD superfamily C - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." 
Q#23376 - CGI_10002046 superfamily 247905 39 173 4.87E-07 47.6177 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#23376 - CGI_10002046 superfamily 243778 222 296 7.17E-19 80.3459 cl04503 HA2 superfamily - - "Helicase associated domain (HA2); This presumed domain is about 90 amino acid residues in length. It is found is a diverse set of RNA helicases. Its function is unknown, however it seems likely to be involved in nucleic acid binding." Q#23380 - CGI_10015782 superfamily 247856 155 201 4.09E-13 61.7949 cl17302 EFh superfamily C - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#23382 - CGI_10015784 superfamily 241599 111 160 1.40E-20 83.0616 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." 
Q#23382 - CGI_10015784 superfamily 217600 218 264 1.42E-05 42.1018 cl04137 TF_Otx superfamily C - Otx1 transcription factor; Otx1 transcription factor. Q#23383 - CGI_10015785 superfamily 221433 484 615 2.13E-35 130.496 cl13553 DUF3585 superfamily - - Protein of unknown function (DUF3585); This domain is found in eukaryotes. This domain is typically between 135 and 149 amino acids in length and is found associated with pfam00307. Q#23383 - CGI_10015785 superfamily 241559 40 82 0.000346346 39.57 cl00030 CH superfamily N - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." Q#23384 - CGI_10015786 superfamily 247068 80 167 2.56E-13 65.7977 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#23384 - CGI_10015786 superfamily 247068 185 257 2.53E-08 51.5454 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#23386 - CGI_10015788 superfamily 217293 27 225 1.72E-45 157.795 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#23386 - CGI_10015788 superfamily 202474 232 315 6.01E-10 57.6637 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#23387 - CGI_10015789 superfamily 202474 1 164 6.02E-09 51.8857 cl08379 Neur_chan_memb superfamily - - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#23389 - CGI_10015791 superfamily 245206 5 260 6.07E-90 269.702 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. 
Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#23391 - CGI_10015793 superfamily 215988 99 199 2.78E-17 74.9856 cl18355 Ligase_CoA superfamily C - "CoA-ligase; This family includes the CoA ligases Succinyl-CoA synthetase alpha and beta chains, malate CoA ligase and ATP-citrate lyase. Some members of the family utilise ATP others use GTP." Q#23392 - CGI_10015794 superfamily 245206 40 311 6.16E-91 275.114 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#23393 - CGI_10015795 superfamily 216152 119 398 1.56E-87 272.265 cl02988 Glyco_transf_10 superfamily - - "Glycosyltransferase family 10 (fucosyltransferase); This family of Fucosyltransferases are the enzymes transferring fucose from GDP-Fucose to GlcNAc in an alpha1,3 linkage. 
This family is know as glycosyltransferase family 10." Q#23394 - CGI_10015796 superfamily 245819 447 630 3.23E-57 192.409 cl11967 Nucleotidyl_cyc_III superfamily - - "Class III nucleotidyl cyclases; Class III nucleotidyl cyclases are the largest, most diverse group of nucleotidyl cyclases (NC's) containing prokaryotic and eukaryotic proteins. They can be divided into two major groups; the mononucleotidyl cyclases (MNC's) and the diguanylate cyclases (DGC's). The MNC's, which include the adenylate cyclases (AC's) and the guanylate cyclases (GC's), have a conserved cyclase homology domain (CHD), while the DGC's have a conserved GGDEF domain, named after a conserved motif within this subgroup. Their products, cyclic guanylyl and adenylyl nucleotides, are second messengers that play important roles in eukaryotic signal transduction and prokaryotic sensory pathways." Q#23394 - CGI_10015796 superfamily 219812 120 353 3.10E-14 71.9536 cl07121 NIT superfamily - - "Nitrate and nitrite sensing; The nitrate- and nitrite sensing domain (NIT) is found in receptor components of signal transducing pathways in bacteria which control gene expression, cellular motility and enzyme activity in response to nitrate and nitrite concentrations. The NIT domain is predicted to be all alpha-helical in structure." Q#23394 - CGI_10015796 superfamily 219526 391 432 1.03E-08 54.5475 cl06648 HNOBA superfamily N - "Heme NO binding associated; The HNOBA domain is found associated with the HNOB domain and pfam00211 in soluble cyclases and signalling proteins. The HNOB domain is predicted to function as a heme-dependent sensor for gaseous ligands, and transduce diverse downstream signals, in both bacteria and animals." 
Q#23395 - CGI_10015797 superfamily 245819 457 618 3.14E-61 203.58 cl11967 Nucleotidyl_cyc_III superfamily - - "Class III nucleotidyl cyclases; Class III nucleotidyl cyclases are the largest, most diverse group of nucleotidyl cyclases (NC's) containing prokaryotic and eukaryotic proteins. They can be divided into two major groups; the mononucleotidyl cyclases (MNC's) and the diguanylate cyclases (DGC's). The MNC's, which include the adenylate cyclases (AC's) and the guanylate cyclases (GC's), have a conserved cyclase homology domain (CHD), while the DGC's have a conserved GGDEF domain, named after a conserved motif within this subgroup. Their products, cyclic guanylyl and adenylyl nucleotides, are second messengers that play important roles in eukaryotic signal transduction and prokaryotic sensory pathways." Q#23395 - CGI_10015797 superfamily 219812 130 367 3.16E-15 75.0352 cl07121 NIT superfamily - - "Nitrate and nitrite sensing; The nitrate- and nitrite sensing domain (NIT) is found in receptor components of signal transducing pathways in bacteria which control gene expression, cellular motility and enzyme activity in response to nitrate and nitrite concentrations. The NIT domain is predicted to be all alpha-helical in structure." Q#23395 - CGI_10015797 superfamily 219526 401 444 4.37E-08 52.6215 cl06648 HNOBA superfamily N - "Heme NO binding associated; The HNOBA domain is found associated with the HNOB domain and pfam00211 in soluble cyclases and signalling proteins. The HNOB domain is predicted to function as a heme-dependent sensor for gaseous ligands, and transduce diverse downstream signals, in both bacteria and animals." Q#23397 - CGI_10015799 superfamily 248458 47 399 5.92E-19 86.5989 cl17904 MFS superfamily - - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. 
MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#23398 - CGI_10015800 superfamily 247804 36 77 4.20E-06 41.7922 cl17250 SANT superfamily - - "'SWI3, ADA2, N-CoR and TFIIIB' DNA-binding domains. Tandem copies of the domain bind telomeric DNA tandem repeatsas part of the capping complex. Binding is sequence dependent for repeats which contain the G/C rich motif [C2-3 A (CA)1-6]. The domain is also found in regulatory transcriptional repressor complexes where it also binds DNA." 
Q#23400 - CGI_10008880 superfamily 247916 161 278 9.41E-18 79.7522 cl17362 Transglut_core superfamily - - "Transglutaminase-like superfamily; This family includes animal transglutaminases and other bacterial proteins of unknown function. Sequence conservation in this superfamily primarily involves three motifs that centre around conserved cysteine, histidine, and aspartate residues that form the catalytic triad in the structurally characterized transglutaminase, the human blood clotting factor XIIIa'. On the basis of the experimentally demonstrated activity of the Methanobacterium phage pseudomurein endoisopeptidase, it is proposed that many, if not all, microbial homologues of the transglutaminases are proteases and that the eukaryotic transglutaminases have evolved from an ancestral protease." Q#23407 - CGI_10003890 superfamily 112929 1 151 5.62E-43 155.815 cl04414 Sec34 superfamily - - "Sec34-like family; Sec34 and Sec35 form a sub-complex, in a seven protein complex that includes Dor1 (pfam04124). This complex is thought to be important for tether vesicles to the Golgi." Q#23411 - CGI_10014510 superfamily 243035 19 85 1.42E-13 61.4817 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#23412 - CGI_10014511 superfamily 243035 145 246 4.12E-19 79.9713 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#23412 - CGI_10014511 superfamily 243035 44 143 4.14E-16 71.4969 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. 
CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#23419 - CGI_10014518 superfamily 222150 356 378 0.00128218 36.9861 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#23419 - CGI_10014518 superfamily 222150 382 406 0.00158009 36.6009 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#23419 - CGI_10014518 superfamily 222150 409 434 0.00940429 34.2897 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. 
Q#23420 - CGI_10014520 superfamily 247856 292 354 2.17E-16 73.3509 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#23420 - CGI_10014520 superfamily 247856 230 282 7.34E-12 60.6393 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#23420 - CGI_10014520 superfamily 247856 29 90 2.29E-10 56.4021 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#23420 - CGI_10014520 superfamily 247856 103 164 2.31E-10 56.4021 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#23420 - CGI_10014520 superfamily 247856 365 417 9.86E-07 46.0017 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#23420 - CGI_10014520 superfamily 247856 159 219 0.00287496 35.6013 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#23421 - CGI_10014521 superfamily 247856 24 86 1.06E-18 78.3585 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#23421 - CGI_10014521 superfamily 247856 198 257 3.21E-09 52.5501 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#23421 - CGI_10014521 superfamily 247856 98 150 1.36E-06 44.8461 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#23421 - CGI_10014521 superfamily 247856 265 328 6.34E-05 40.2237 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#23422 - CGI_10014522 superfamily 247856 26 88 8.43E-18 73.3509 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#23422 - CGI_10014522 superfamily 247856 99 157 4.79E-09 49.0833 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#23423 - CGI_10014523 superfamily 242232 68 117 3.32E-12 57.9532 cl00984 TM2 superfamily - - "TM2 domain; This family is composed of a pair of transmembrane alpha helices connected by a short linker. The function of this domain is unknown, however it occurs in a wide range or protein contexts." Q#23424 - CGI_10014524 superfamily 242232 68 117 3.32E-12 57.9532 cl00984 TM2 superfamily - - "TM2 domain; This family is composed of a pair of transmembrane alpha helices connected by a short linker. The function of this domain is unknown, however it occurs in a wide range or protein contexts." Q#23425 - CGI_10014525 superfamily 247856 622 683 3.01E-18 80.2845 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#23425 - CGI_10014525 superfamily 247856 414 476 3.38E-16 74.1213 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#23425 - CGI_10014525 superfamily 247856 488 547 3.89E-15 71.0397 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#23425 - CGI_10014525 superfamily 247856 548 610 1.76E-14 69.4989 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#23425 - CGI_10014525 superfamily 247856 228 290 1.94E-14 69.1137 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#23425 - CGI_10014525 superfamily 247856 144 198 4.14E-14 68.3433 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#23425 - CGI_10014525 superfamily 247856 69 131 1.74E-13 66.4173 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#23425 - CGI_10014525 superfamily 247856 13 69 2.79E-13 66.0321 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#23425 - CGI_10014525 superfamily 247856 321 383 3.87E-11 59.8689 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#23426 - CGI_10026210 superfamily 222090 275 466 1.14E-17 81.5502 cl18636 Methyltransf_22 superfamily N - Methyltransferase domain; This family appears to be a methyltransferase domain. Q#23427 - CGI_10026211 superfamily 241571 64 163 1.46E-07 47.7923 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." 
Q#23427 - CGI_10026211 superfamily 241613 169 198 4.75E-06 42.1938 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#23430 - CGI_10026214 superfamily 243362 109 259 9.86E-42 143.72 cl03262 DnaJ_C superfamily - - C-terminal substrate binding domain of DnaJ and HSP40; The C-terminal region of the DnaJ/Hsp40 protein mediates oligomerization and binding to denatured polypeptide substrate. DnaJ/Hsp40 is a widely conserved heat-shock protein. It prevents the aggregation of unfolded substrate and forms a ternary complex with both substrate and DnaK/Hsp70; the N-terminal J-domain of DnaJ/Hsp40 stimulates the ATPase activity of DnaK/Hsp70. Q#23430 - CGI_10026214 superfamily 243077 7 57 1.42E-24 94.1493 cl02542 DnaJ superfamily - - "DnaJ domain or J-domain. DnaJ/Hsp40 (heat shock protein 40) proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperonine family. Hsp40 proteins are characterized by the presence of a J domain, which mediates the interaction with Hsp70. They may contain other domains as well, and the architectures provide a means of classification." 
Q#23431 - CGI_10026215 superfamily 243072 127 240 0.000462666 38.9039 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#23434 - CGI_10026218 superfamily 218588 1 361 1.22E-154 442.271 cl05147 FIBP superfamily - - Acidic fibroblast growth factor binding (FIBP); Acidic fibroblast growth factor (aFGF) intracellular binding protein (FIBP) is a protein found mainly in the nucleus that is thought to be involved in the intracellular function of aFGF. Q#23437 - CGI_10026221 superfamily 241749 36 64 0.003152 32.7429 cl00280 globin_like superfamily C - superfamily containing globins and truncated hemoglobins Q#23438 - CGI_10026222 superfamily 241749 1 79 4.17E-16 68.5665 cl00280 globin_like superfamily N - superfamily containing globins and truncated hemoglobins Q#23439 - CGI_10026223 superfamily 146263 454 568 1.09E-33 126.65 cl04138 SK_channel superfamily - - Calcium-activated SK potassium channel; Calcium-activated SK potassium channel. Q#23439 - CGI_10026223 superfamily 219619 666 725 1.27E-11 61.8399 cl18518 Ion_trans_2 superfamily - - Ion channel; This family includes the two membrane helix type ion channels found in bacteria. Q#23439 - CGI_10026223 superfamily 198825 745 814 0.00483984 36.2377 cl03763 CaMBD superfamily - - "Calmodulin binding domain; Small-conductance Ca2+-activated K+ channels (SK channels) are independent of voltage and gated solely by intracellular Ca2+. These membrane channels are heteromeric complexes that comprise pore-forming alpha-subunits and the Ca2+-binding protein calmodulin (CaM). 
CaM binds to the SK channel through this the CaM-binding domain (CaMBD), which is located in an intracellular region of the alpha-subunit immediately carboxy-terminal to the pore. Channel opening is triggered when Ca2+ binds the EF hands in the N-lobe of CaM. The structure of this domain complexed with CaM is known. This domain forms an elongated dimer with a CaM molecule bound at each end; each CaM wraps around three alpha-helices, two from one CaMBD subunit and one from the other." Q#23440 - CGI_10026224 superfamily 243092 114 218 0.000736541 38.8552 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#23441 - CGI_10026225 superfamily 215827 136 316 1.12E-29 117.569 cl02830 Tyrosinase superfamily - - Common central domain of tyrosinase; This family also contains polyphenol oxidases and some hemocyanins. 
Binds two copper ions via two sets of three histidines. This family is related to pfam00372. Q#23442 - CGI_10026226 superfamily 215827 144 320 6.07E-31 121.036 cl02830 Tyrosinase superfamily - - Common central domain of tyrosinase; This family also contains polyphenol oxidases and some hemocyanins. Binds two copper ions via two sets of three histidines. This family is related to pfam00372. Q#23443 - CGI_10026227 superfamily 215827 67 245 1.37E-43 157.245 cl02830 Tyrosinase superfamily - - Common central domain of tyrosinase; This family also contains polyphenol oxidases and some hemocyanins. Binds two copper ions via two sets of three histidines. This family is related to pfam00372. Q#23443 - CGI_10026227 superfamily 243119 771 805 5.82E-05 41.6606 cl02629 CBM_14 superfamily N - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#23444 - CGI_10026228 superfamily 116798 34 174 8.18E-30 107.763 cl17955 Lipocalin_2 superfamily - - "Lipocalin-like domain; Lipocalins are transporters for small hydrophobic molecules, such as lipids, steroid hormones, bilins, and retinoids. The structure is an eight-stranded beta barrel." Q#23445 - CGI_10026229 superfamily 116798 34 179 1.69E-25 96.5918 cl17955 Lipocalin_2 superfamily - - "Lipocalin-like domain; Lipocalins are transporters for small hydrophobic molecules, such as lipids, steroid hormones, bilins, and retinoids. The structure is an eight-stranded beta barrel." 
Q#23446 - CGI_10026230 superfamily 247905 68 151 2.42E-06 46.0769 cl17351 HELICc superfamily C - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#23446 - CGI_10026230 superfamily 215827 260 438 1.35E-39 143.378 cl02830 Tyrosinase superfamily - - Common central domain of tyrosinase; This family also contains polyphenol oxidases and some hemocyanins. Binds two copper ions via two sets of three histidines. This family is related to pfam00372. Q#23451 - CGI_10026235 superfamily 243035 12 104 1.61E-14 65.3337 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#23454 - CGI_10026238 superfamily 246680 6 94 6.22E-14 65.5252 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#23455 - CGI_10026239 superfamily 202111 95 168 4.76E-32 112.691 cl03448 ODC_AZ superfamily - - Ornithine decarboxylase antizyme; This family consists of ornithine decarboxylase antizyme proteins. The polyamine biosynthetic enzyme ornithine decarboxylase (ODC) is degraded by the 26 S proteasome via a ubiquitin-independent pathway. Its degradation is greatly accelerated by association with the polyamine-induced regulatory protein antizyme 1 (AZ1). Q#23456 - CGI_10026240 superfamily 241571 347 464 1.63E-10 58.963 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#23456 - CGI_10026240 superfamily 245213 469 502 4.94E-05 41.4682 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#23456 - CGI_10026240 superfamily 241583 119 300 6.24E-31 119.983 cl00064 ZnMc superfamily - - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#23456 - CGI_10026240 superfamily 243051 514 669 2.11E-10 59.2865 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." 
Q#23457 - CGI_10026241 superfamily 245836 335 522 1.15E-124 366.881 cl12015 Adenylation_DNA_ligase_like superfamily - - "Adenylation domain of proteins similar to ATP-dependent polynucleotide ligases; ATP-dependent polynucleotide ligases catalyze the phosphodiester bond formation of nicked nucleic acid substrates using ATP as a cofactor in a three step reaction mechanism. This family includes ATP-dependent DNA and RNA ligases. DNA ligases play a vital role in the diverse processes of DNA replication, recombination and repair. ATP-dependent DNA ligases have a highly modular architecture, consisting of a unique arrangement of two or more discrete domains, including a DNA-binding domain, an adenylation or nucleotidyltransferase (NTase) domain, and an oligonucleotide/oligosaccharide binding (OB)-fold domain. The adenylation domain binds ATP and contains many active site residues. Together with the C-terminal OB-fold domain, it comprises a catalytic core unit that is common to most members of the ATP-dependent DNA ligase family. The catalytic core contains six conserved sequence motifs (I, III, IIIa, IV, V and VI) that define this family of related nucleotidyltransferases including eukaryotic GRP-dependent mRNA-capping enzymes. The catalytic core contains both the active site as well as many DNA-binding residues. The RNA circularization protein from archaea and bacteria contains the minimal catalytic unit, the adenylation domain, but does not contain an OB-fold domain. This family also includes the m3G-cap binding domain of snurportin, a nuclear import adaptor that binds m3G-capped spliceosomal U small nucleoproteins (snRNPs), but doesn't have enzymatic activity." Q#23458 - CGI_10026242 superfamily 248097 47 168 1.60E-27 101.188 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. 
Q#23460 - CGI_10026244 superfamily 217643 55 331 4.07E-111 331.429 cl04182 Solute_trans_a superfamily - - "Organic solute transporter Ostalpha; This family is a transmembrane organic solute transport protein. In vertebrates these proteins form a complex with Ostbeta, and function as bile transporters. In plants they may transport brassinosteroid-like compounds and act as regulators of cell death." Q#23461 - CGI_10026245 superfamily 248458 2 205 0.00108752 38.8341 cl17904 MFS superfamily NC - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. 
Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#23462 - CGI_10026246 superfamily 247723 16 88 2.70E-37 125.486 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#23463 - CGI_10026247 superfamily 243088 148 267 5.99E-34 124.577 cl02563 PX_domain superfamily - - "The Phox Homology domain, a phosphoinositide binding module; The PX domain is a phosphoinositide (PI) binding module involved in targeting proteins to membranes. Proteins containing PX domains interact with PIs and have been implicated in highly diverse functions such as cell signaling, vesicular trafficking, protein sorting, lipid modification, cell polarity and division, activation of T and B cells, and cell survival. Many members of this superfamily bind phosphatidylinositol-3-phosphate (PI3P) but in some cases, other PIs such as PI4P or PI(3,4)P2, among others, are the preferred substrates. 
In addition to protein-lipid interaction, the PX domain may also be involved in protein-protein interaction, as in the cases of p40phox, p47phox, and some sorting nexins (SNXs). The PX domain is conserved from yeast to humans and is found in more than 100 proteins. The majority of PX domain-containing proteins are SNXs, which play important roles in endosomal sorting." Q#23463 - CGI_10026247 superfamily 149621 407 488 1.64E-13 66.9244 cl07303 Nexin_C superfamily - - Sorting nexin C terminal; This region is found a the C terminal of proteins belonging to the sorting nexin family. It is found on proteins which also contain pfam00787. Q#23465 - CGI_10026249 superfamily 247723 28 99 1.64E-25 97.8013 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." 
Q#23468 - CGI_10026252 superfamily 247723 357 430 1.18E-41 148.262 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#23468 - CGI_10026252 superfamily 207684 10 42 2.40E-07 48.9143 cl02640 SAP superfamily - - "SAP domain; The SAP (after SAF-A/B, Acinus and PIAS) motif is a putative DNA/RNA binding domain found in diverse nuclear and cytoplasmic proteins." Q#23471 - CGI_10026255 superfamily 241580 250 327 7.10E-46 156.945 cl00061 FH superfamily - - "Forkhead (FH), also known as a "winged helix". FH is named for the Drosophila fork head protein, a transcription factor which promotes terminal rather than segmental development. This family of transcription factor domains, which bind to B-DNA as monomers, are also found in the Hepatocyte nuclear factor (HNF) proteins, which provide tissue-specific gene regulation. The structure contains 2 flexible loops or "wings" in the C-terminal region, hence the term winged helix." 
Q#23471 - CGI_10026255 superfamily 241581 51 138 1.17E-11 62.0186 cl00062 FHA superfamily - - "Forkhead associated domain (FHA); found in eukaryotic and prokaryotic proteins. Putative nuclear signalling domain. FHA domains may bind phosphothreonine, phosphoserine and sometimes phosphotyrosine. In eukaryotes, many FHA domain-containing proteins localize to the nucleus, where they participate in establishing or maintaining cell cycle checkpoints, DNA repair, or transcriptional regulation. Members of the FHA family include: Dun1, Rad53, Cds1, Mek1, KAPP(kinase-associated protein phosphatase),and Ki-67 (a human nuclear protein related to cell proliferation)." Q#23472 - CGI_10026256 superfamily 243092 360 657 2.85E-31 123.599 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#23472 - CGI_10026256 superfamily 243092 59 480 1.04E-26 110.117 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#23473 - CGI_10026257 superfamily 241645 32 118 5.72E-43 137.788 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. 
The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#23474 - CGI_10026258 superfamily 244972 169 232 4.47E-06 45.0278 cl08475 PIG-X superfamily C - PIG-X / PBN1; Mammalian PIG-X and yeast PBN1 are essential components of glycosylphosphatidylinositol-mannosyltransferase I. These enzymes are involved in the transfer of sugar molecules. Q#23476 - CGI_10026260 superfamily 243084 2488 2588 1.98E-44 159.257 cl02556 Bromodomain superfamily - - Bromodomain. Bromodomains are found in many chromatin-associated proteins and in nuclear histone acetyltransferases. They interact specifically with acetylated lysine. Q#23476 - CGI_10026260 superfamily 243137 167 226 1.74E-17 80.3251 cl02674 DDT superfamily - - "DDT domain; This domain is approximately 60 residues in length, and is predicted to be a DNA binding domain. The DDT domain is named after (DNA binding homeobox and Different Transcription factors). It is exclusively associated with nuclear domains, and is thought to be arranged into three alpha helices." Q#23476 - CGI_10026260 superfamily 247999 2367 2414 5.55E-10 58.2708 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#23476 - CGI_10026260 superfamily 247999 2425 2471 4.32E-09 55.681 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. 
Q#23476 - CGI_10026260 superfamily 247999 323 367 6.27E-06 46.3296 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#23477 - CGI_10026261 superfamily 241748 321 550 1.45E-103 316.425 cl00279 APP_MetAP superfamily - - "A family including aminopeptidase P, aminopeptidase M, and prolidase. Also known as metallopeptidase family M24. This family of enzymes is able to cleave amido-, imido- and amidino-containing bonds. Members exibit relatively narrow substrate specificity compared to other metallo-aminopeptidases, suggesting they play roles in regulation of biological processes rather than general protein degradation." Q#23477 - CGI_10026261 superfamily 216431 1 131 0.00235998 37.2625 cl08317 Creatinase_N superfamily - - Creatinase/Prolidase N-terminal domain; This family includes the N-terminal non-catalytic domains from creatinase and prolidase. The exact function of this domain is uncertain. Q#23478 - CGI_10026262 superfamily 241748 11 39 4.60E-09 51.4081 cl00279 APP_MetAP superfamily N - "A family including aminopeptidase P, aminopeptidase M, and prolidase. Also known as metallopeptidase family M24. This family of enzymes is able to cleave amido-, imido- and amidino-containing bonds. Members exibit relatively narrow substrate specificity compared to other metallo-aminopeptidases, suggesting they play roles in regulation of biological processes rather than general protein degradation." Q#23479 - CGI_10026263 superfamily 241748 339 460 7.14E-36 132.685 cl00279 APP_MetAP superfamily C - "A family including aminopeptidase P, aminopeptidase M, and prolidase. Also known as metallopeptidase family M24. This family of enzymes is able to cleave amido-, imido- and amidino-containing bonds. 
Members exibit relatively narrow substrate specificity compared to other metallo-aminopeptidases, suggesting they play roles in regulation of biological processes rather than general protein degradation." Q#23479 - CGI_10026263 superfamily 216431 64 165 0.000153888 40.3441 cl08317 Creatinase_N superfamily - - Creatinase/Prolidase N-terminal domain; This family includes the N-terminal non-catalytic domains from creatinase and prolidase. The exact function of this domain is uncertain. Q#23481 - CGI_10026265 superfamily 243175 92 213 2.29E-61 189.836 cl02776 GST_C_family superfamily - - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. 
In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." Q#23481 - CGI_10026265 superfamily 241832 5 76 1.12E-32 114.332 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#23484 - CGI_10002806 superfamily 247743 1740 1773 0.00103285 40.3564 cl17189 AAA superfamily C - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. 
The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#23484 - CGI_10002806 superfamily 247743 2140 2179 0.00723678 37.66 cl17189 AAA superfamily C - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#23487 - CGI_10002160 superfamily 245210 15 403 0 513.565 cl09938 cond_enzymes superfamily - - "Condensing enzymes; Family of enzymes that catalyze a (decarboxylating or non-decarboxylating) Claisen-like condensation reaction. Members are share strong structural similarity, and are involved in the synthesis and degradation of fatty acids, and the production of polyketides, a diverse group of natural products." Q#23489 - CGI_10002162 superfamily 199156 213 227 0.00493622 33.5708 cl15298 zf-CCHC superfamily - - "Zinc knuckle; The zinc knuckle is a zinc binding motif composed of the the following CX2CX4HX4C where X can be any amino acid. The motifs are mostly from retroviral gag proteins (nucleocapsid). Prototype structure is from HIV. Also contains members involved in eukaryotic gene regulation, such as C. elegans GLH-1. Structure is an 18-residue zinc finger." Q#23492 - CGI_10002608 superfamily 247038 422 516 9.65E-22 91.6082 cl15674 IPT superfamily - - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. 
They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." Q#23492 - CGI_10002608 superfamily 247038 517 614 1.28E-17 79.1905 cl15674 IPT superfamily - - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." Q#23492 - CGI_10002608 superfamily 247042 10 164 6.78E-28 117.34 cl15693 Sema superfamily N - "The Sema domain, a protein interacting module, of semaphorins and plexins; Both semaphorins and plexins have a Sema domain on their N-termini. Plexins function as receptors for the semaphorins. Evolutionarily, plexins may be the ancestor of semaphorins. Semaphorins are regulatory molecules in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems, and cancer. Semaphorins can be divided into 7 classes. Vertebrates have members in classes 3-7, whereas classes 1 and 2 are known only in invertebrates. 
Class 2 and 3 semaphorins are secreted; classes 1 and 4 through 6 are transmembrane proteins; and class 7 is membrane associated via glycosylphosphatidylinositol (GPI) linkage. Plexins are a large family of transmembrane proteins, which are divided into four types (A-D) according to sequence similarity. In vertebrates, type A plexins serve as co-receptors for neuropilins to mediate the signalling of class 3 semaphorins. Plexins serve as direct receptors for several other members of the semaphorin family: class 6 semaphorins signal through type A plexins and class 4 semaphorins through type B plexins. This family also includes the MET and RON receptor tyrosine kinases. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves to recognize and bind receptors." Q#23492 - CGI_10002608 superfamily 247038 616 714 5.77E-10 57.4269 cl15674 IPT superfamily - - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." Q#23492 - CGI_10002608 superfamily 243104 371 420 7.77E-07 47.1628 cl02601 PSI superfamily - - "Plexin repeat; A cysteine rich repeat found in several different extracellular receptors. The function of the repeat is unknown. Three copies of the repeat are found Plexin. Two copies of the repeat are found in mahogany protein. A related C. elegans protein contains four copies of the repeat. 
The Met receptor contains a single copy of the repeat. The Pfam alignment shows 6 conserved cysteine residues that may form three conserved disulphide bridges, whereas shows 8 conserved cysteines. The pattern of conservation suggests that cysteines 5 and 7 (that are not absolutely conserved) form a disulphide bridge (Personal observation. A Bateman)." Q#23493 - CGI_10007575 superfamily 247065 38 85 7.10E-12 57.3546 cl15777 GGCT_like superfamily N - "GGCT-like domains, also called AIG2-like family. Gamma-glutamyl cyclotransferase (GGCT) catalyzes the formation of pyroglutamic acid (5-oxoproline) from dipeptides containing gamma-glutamyl, and is a dimeric protein. In Homo sapiens, the protein is encoded by the gene C7orf24, and the enzyme participates in the gamma-glutamyl cycle. Hereditary defects in the gamma-glutamyl cycle have been described for some of the genes involved, but not for C7orf24. The synthesis and metabolism of glutathione (L-gamma-glutamyl-L-cysteinylglycine) ties the gamma-glutamyl cycle to numerous cellular processes; glutathione acts as a ubiquitous reducing agent in reductive mechanisms involved in protein and DNA synthesis, transport processes, enzyme activity, and metabolism. AIG2 (avrRpt2-induced gene) is an Arabidopsis protein that exhibits RPS2- and avrRpt2-dependent induction early after infection with Pseudomonas syringae pv maculicola strain ES4326 carrying avrRpt2. avrRpt2 is an avirulence gene that can convert virulent strains of P. syringae to avirulence on Arabidopsis thaliana, soybean, and bean. The family also includes bacterial tellurite-resistance proteins (trgB); tellurium (Te) compounds are used in industrial processes and had been used as antimicrobial agents in the past. Some members have been described proteins involved in cation transport (chaC)." Q#23494 - CGI_10007576 superfamily 246669 1571 1747 4.94E-79 259.563 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. 
C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#23494 - CGI_10007576 superfamily 246669 1434 1556 1.34E-61 208.263 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." 
Q#23494 - CGI_10007576 superfamily 203315 62 111 0.00257325 38.2527 cl05335 zf-piccolo superfamily - - "Piccolo Zn-finger; This (predicted) Zinc finger is found in the bassoon and piccolo proteins. There are eight conserved cysteines, suggesting that it coordinates two zinc ligands." Q#23495 - CGI_10007577 superfamily 247727 268 367 2.18E-15 72.4626 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#23495 - CGI_10007577 superfamily 247727 166 307 1.74E-05 45.3044 cl17173 AdoMet_MTases superfamily C - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#23496 - CGI_10007578 superfamily 245202 576 626 2.49E-10 57.9738 cl09927 S1_like superfamily - - "S1_like: Ribosomal protein S1-like RNA-binding domain. Found in a wide variety of RNA-associated proteins. Originally identified in S1 ribosomal protein. 
This superfamily also contains the Cold Shock Domain (CSD), which is a homolog of the S1 domain. Both domains are members of the Oligonucleotide/oligosaccharide Binding (OB) fold." Q#23496 - CGI_10007578 superfamily 245202 237 298 4.03E-10 57.2034 cl09927 S1_like superfamily - - "S1_like: Ribosomal protein S1-like RNA-binding domain. Found in a wide variety of RNA-associated proteins. Originally identified in S1 ribosomal protein. This superfamily also contains the Cold Shock Domain (CSD), which is a homolog of the S1 domain. Both domains are members of the Oligonucleotide/oligosaccharide Binding (OB) fold." Q#23496 - CGI_10007578 superfamily 245202 400 462 1.98E-07 49.1142 cl09927 S1_like superfamily - - "S1_like: Ribosomal protein S1-like RNA-binding domain. Found in a wide variety of RNA-associated proteins. Originally identified in S1 ribosomal protein. This superfamily also contains the Cold Shock Domain (CSD), which is a homolog of the S1 domain. Both domains are members of the Oligonucleotide/oligosaccharide Binding (OB) fold." Q#23496 - CGI_10007578 superfamily 245202 83 124 0.000110446 41.025 cl09927 S1_like superfamily - - "S1_like: Ribosomal protein S1-like RNA-binding domain. Found in a wide variety of RNA-associated proteins. Originally identified in S1 ribosomal protein. This superfamily also contains the Cold Shock Domain (CSD), which is a homolog of the S1 domain. Both domains are members of the Oligonucleotide/oligosaccharide Binding (OB) fold." Q#23496 - CGI_10007578 superfamily 245202 725 788 8.91E-09 53.3715 cl09927 S1_like superfamily - - "S1_like: Ribosomal protein S1-like RNA-binding domain. Found in a wide variety of RNA-associated proteins. Originally identified in S1 ribosomal protein. This superfamily also contains the Cold Shock Domain (CSD), which is a homolog of the S1 domain. Both domains are members of the Oligonucleotide/oligosaccharide Binding (OB) fold." 
Q#23496 - CGI_10007578 superfamily 193374 800 832 0.000416189 39.0716 cl15152 SUZ-C superfamily - - SUZ-C motif; The SUZ-C domain is a conserved motif found in one or more copies in several RNA-binding proteins. It is always found at the C-terminus of the protein and appears to be required for localization of the protein to specific subcellular structures. It was first characterized in the C. elegans protein Szy-20 which localizes to the centrosome. It is widely distributed in eukaryotes. Q#23497 - CGI_10007579 superfamily 227778 93 281 1.90E-09 54.8143 cl17122 VPS24 superfamily - - Conserved protein implicated in secretion [Cell motility and secretion] Q#23500 - CGI_10007582 superfamily 241575 934 978 5.60E-08 51.1191 cl00054 DSRM superfamily C - "Double-stranded RNA binding motif. Binding is not sequence specific but is highly specific for double stranded RNA. Found in a variety of proteins including dsRNA dependent protein kinase PKR, RNA helicases, Drosophila staufen protein, E. coli RNase III, RNases H1, and dsRNA dependent adenosine deaminases." Q#23500 - CGI_10007582 superfamily 243107 869 909 1.13E-13 67.1472 cl02611 G-patch superfamily - - "G-patch domain; This domain is found in a number of RNA binding proteins, and is also found in proteins that contain RNA binding domains. This suggests that this domain may have an RNA binding function. This domain has seven highly conserved glycines." Q#23503 - CGI_10007585 superfamily 248097 50 172 1.30E-15 69.2162 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system.
Q#23505 - CGI_10007587 superfamily 241977 1 176 7.46E-43 142.939 cl00607 PUA superfamily - - "PUA domain; The PUA domain named after Pseudouridine synthase and Archaeosine transglycosylase, was detected in archaeal and eukaryotic pseudouridine synthases, archaeal archaeosine synthases, a family of predicted ATPases that may be involved in RNA modification, a family of predicted archaeal and bacterial rRNA methylases. Additionally, the PUA domain was detected in a family of eukaryotic proteins that also contain a domain homologous to the translation initiation factor eIF1/SUI1; these proteins may comprise a novel type of translation factors. Unexpectedly, the PUA domain was detected also in bacterial and yeast glutamate kinases; this is compatible with the demonstrated role of these enzymes in the regulation of the expression of other genes. It is predicted that the PUA domain is an RNA binding domain." Q#23506 - CGI_10002861 superfamily 243072 100 221 2.05E-15 75.1126 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#23506 - CGI_10002861 superfamily 243072 302 459 2.15E-13 69.3346 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#23506 - CGI_10002861 superfamily 243072 398 524 1.96E-12 66.253 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#23506 - CGI_10002861 superfamily 243072 196 350 3.44E-10 59.7046 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#23507 - CGI_10002862 superfamily 243034 84 175 1.00E-17 75.8795 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#23508 - CGI_10002863 superfamily 241748 186 433 4.98E-75 237.856 cl00279 APP_MetAP superfamily - - "A family including aminopeptidase P, aminopeptidase M, and prolidase. Also known as metallopeptidase family M24. This family of enzymes is able to cleave amido-, imido- and amidino-containing bonds. Members exibit relatively narrow substrate specificity compared to other metallo-aminopeptidases, suggesting they play roles in regulation of biological processes rather than general protein degradation." 
Q#23508 - CGI_10002863 superfamily 244955 2 147 5.54E-24 96.925 cl08433 AMP_N superfamily - - "Aminopeptidase P, N-terminal domain; This domain is structurally very similar to the creatinase N-terminal domain (pfam01321). However, little or no sequence similarity exists between the two families." Q#23509 - CGI_10002864 superfamily 242904 236 517 3.50E-79 251.646 cl02149 CDC73 superfamily - - "RNA pol II accessory factor, Cdc73 family; RNA pol II accessory factor, Cdc73 family. " Q#23510 - CGI_10002865 superfamily 241799 14 232 9.32E-85 254.409 cl00339 SugarP_isomerase superfamily - - "SugarP_isomerase: Sugar Phosphate Isomerase family; includes type A ribose 5-phosphate isomerase (RPI_A), glucosamine-6-phosphate (GlcN6P) deaminase, and 6-phosphogluconolactonase (6PGL). RPI catalyzes the reversible conversion of ribose-5-phosphate to ribulose 5-phosphate, the first step of the non-oxidative branch of the pentose phosphate pathway. GlcN6P deaminase catalyzes the reversible conversion of GlcN6P to D-fructose-6-phosphate (Fru6P) and ammonium, the last step of the metabolic pathway of N-acetyl-D-glucosamine-6-phosphate. 6PGL converts 6-phosphoglucono-1,5-lactone to 6-phosphogluconate, the second step of the oxidative phase of the pentose phosphate pathway." Q#23511 - CGI_10010628 superfamily 247866 140 282 1.32E-17 79.4188 cl17312 PhyH superfamily N - "Phytanoyl-CoA dioxygenase (PhyH); This family is made up of several eukaryotic phytanoyl-CoA dioxygenase (PhyH) proteins, ectoine hydroxylases and a number of bacterial deoxygenases. PhyH is a peroxisomal enzyme catalyzing the first step of phytanic acid alpha-oxidation. PhyH deficiency causes Refsum's disease (RD) which is an inherited neurological syndrome biochemically characterized by the accumulation of phytanic acid in plasma and tissues." 
Q#23512 - CGI_10010629 superfamily 247866 60 287 1.59E-20 87.8932 cl17312 PhyH superfamily - - "Phytanoyl-CoA dioxygenase (PhyH); This family is made up of several eukaryotic phytanoyl-CoA dioxygenase (PhyH) proteins, ectoine hydroxylases and a number of bacterial deoxygenases. PhyH is a peroxisomal enzyme catalyzing the first step of phytanic acid alpha-oxidation. PhyH deficiency causes Refsum's disease (RD) which is an inherited neurological syndrome biochemically characterized by the accumulation of phytanic acid in plasma and tissues." Q#23513 - CGI_10010630 superfamily 245868 65 298 4.44E-81 248.099 cl12093 YIF1 superfamily - - YIF1; YIF1 (Yip1 interacting factor) is an integral membrane protein that is required for membrane fusion of ER derived vesicles. It also plays a role in the biogenesis of ER derived COPII transport vesicles. Q#23514 - CGI_10010631 superfamily 218605 4 74 5.11E-24 87.3085 cl05186 SRP9-21 superfamily - - "Signal recognition particle 9 kDa protein (SRP9); This family consists of several eukaryotic SRP9 proteins. SRP9 together with the Alu-homologous region of 7SL RNA and SRP14 comprise the "Alu domain" of SRP, which mediates pausing of synthesis of ribosome associated nascent polypeptides that have been engaged by the targeting domain of SRP. This family also contains the homologous fungal SRP21." Q#23515 - CGI_10010632 superfamily 247725 20 127 2.68E-70 219.857 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." 
Q#23515 - CGI_10010632 superfamily 204060 379 413 0.000160897 39.3551 cl07401 VASP_tetra superfamily - - VASP tetramerisation domain; Vasodilator-stimulated phosphoprotein (VASP) is an actin cytoskeletal regulatory protein. This region corresponds to the tetramerisation domain which forms a right handed alpha helical coiled coil structure. Q#23517 - CGI_10010634 superfamily 247736 50 108 9.36E-08 45.7297 cl17182 NAT_SF superfamily - - "N-Acyltransferase superfamily: Various enzymes that characteristically catalyze the transfer of an acyl group to a substrate; NAT (N-Acyltransferase) is a large superfamily of enzymes that mostly catalyze the transfer of an acyl group to a substrate and are implicated in a variety of functions, ranging from bacterial antibiotic resistance to circadian rhythms in mammals. Members include GCN5-related N-Acetyltransferases (GNAT) such as Aminoglycoside N-acetyltransferases, Histone N-acetyltransferase (HAT) enzymes, and Serotonin N-acetyltransferase, which catalyze the transfer of an acetyl group to a substrate. The kinetic mechanism of most GNATs involves the ordered formation of a ternary complex: the reaction begins with Acetyl Coenzyme A (AcCoA) binding, followed by binding of substrate, then direct transfer of the acetyl group from AcCoA to the substrate, followed by product and subsequent CoA release. Other family members include Arginine/ornithine N-succinyltransferase, Myristoyl-CoA: protein N-myristoyltransferase, and Acyl-homoserinelactone synthase which have a similar catalytic mechanism but differ in types of acyl groups transferred. Leucyl/phenylalanyl-tRNA-protein transferase and FemXAB nonribosomal peptidyltransferases which catalyze similar peptidyltransferase reactions are also included." Q#23518 - CGI_10010635 superfamily 241607 94 134 1.72E-08 48.4202 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. 
Kazal inhibitors inhibit serine proteases, such as trypsin, chymotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as BM-40 or osteonectin, the Gallus gallus Flik protein, as well as agrin, which has a long array of FS domains. The Kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters), which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD."
Q#23519 - CGI_10010636 superfamily 243034 436 533 0.000387197 39.2856 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#23519 - CGI_10010636 superfamily 111114 185 217 5.95E-05 41.3972 cl03482 HAT superfamily - - HAT (Half-A-TPR) repeat; The HAT (Half A TPR) repeat is found in several RNA processing proteins. Q#23519 - CGI_10010636 superfamily 214642 84 116 0.000277585 39.0694 cl02592 HAT superfamily - - HAT (Half-A-TPR) repeats; Present in several RNA-binding proteins. Structurally and sequentially thought to be similar to TPRs. 
Q#23521 - CGI_10010638 superfamily 192535 13 292 1.10E-08 54.139 cl18179 7TM_GPCR_Srsx superfamily - - Serpentine type 7TM GPCR chemoreceptor Srsx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srsx is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#23523 - CGI_10010640 superfamily 243053 762 972 1.55E-62 212.113 cl02485 RasGEF superfamily - - "Guanine nucleotide exchange factor for Ras-like small GTPases. Small GTP-binding proteins of the Ras superfamily function as molecular switches in fundamental events such as signal transduction, cytoskeleton dynamics and intracellular trafficking. Guanine-nucleotide-exchange factors (GEFs) positively regulate these GTP-binding proteins in response to a variety of signals. GEFs catalyze the dissociation of GDP from the inactive GTP-binding proteins. GTP can then bind and induce structural changes that allow interaction with effectors." Q#23523 - CGI_10010640 superfamily 243038 205 333 6.19E-62 206.426 cl02442 DEP superfamily - - "DEP domain, named after Dishevelled, Egl-10, and Pleckstrin, where this domain was first discovered. The function of this domain is still not clear, but it is believed to be important for the membrane association of the signaling proteins in which it is present. New studies show that the DEP domain of Sst2, a yeast RGS protein is necessary and sufficient for receptor interaction." Q#23523 - CGI_10010640 superfamily 241570 355 466 4.41E-21 90.4629 cl00047 CAP_ED superfamily - - "effector domain of the CAP family of transcription factors; members include CAP (or cAMP receptor protein (CRP)), which binds cAMP, FNR (fumarate and nitrate reduction), which uses an iron-sulfur cluster to sense oxygen) and CooA, a heme containing CO sensor. 
In all cases binding of the effector leads to conformational changes and the ability to activate transcription. Cyclic nucleotide-binding domain similar to CAP are also present in cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) and vertebrate cyclic nucleotide-gated ion-channels. Cyclic nucleotide-monophosphate binding domain; proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues; the best studied is the prokaryotic catabolite gene activator, CAP, where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure; three conserved glycine residues are thought to be essential for maintenance of the structural integrity of the beta-barrel; CooA is a homodimeric transcription factor that belongs to CAP family; cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain; cAPK's are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain; cGPK's are single chain enzymes that include the two copies of the domain in their N-terminal section; also found in vertebrate cyclic nucleotide-gated ion-channels" Q#23523 - CGI_10010640 superfamily 241570 34 134 2.52E-17 79.6774 cl00047 CAP_ED superfamily - - "effector domain of the CAP family of transcription factors; members include CAP (or cAMP receptor protein (CRP)), which binds cAMP, FNR (fumarate and nitrate reduction), which uses an iron-sulfur cluster to sense oxygen) and CooA, a heme containing CO sensor. In all cases binding of the effector leads to conformational changes and the ability to activate transcription. Cyclic nucleotide-binding domain similar to CAP are also present in cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) and vertebrate cyclic nucleotide-gated ion-channels. 
Cyclic nucleotide-monophosphate binding domain; proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues; the best studied is the prokaryotic catabolite gene activator, CAP, where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure; three conserved glycine residues are thought to be essential for maintenance of the structural integrity of the beta-barrel; CooA is a homodimeric transcription factor that belongs to CAP family; cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain; cAPK's are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain; cGPK's are single chain enzymes that include the two copies of the domain in their N-terminal section; also found in vertebrate cyclic nucleotide-gated ion-channels" Q#23523 - CGI_10010640 superfamily 243067 502 609 5.65E-09 55.1112 cl02520 REM superfamily - - "Guanine nucleotide exchange factor for Ras-like GTPases; N-terminal domain (RasGef_N), also called REM domain (Ras exchanger motif). This domain is common in nucleotide exchange factors for Ras-like small GTPases and is typically found immediately N-terminal to the RasGef (Cdc25-like) domain. REM contacts the GTPase and is assumed to participate in the catalytic activity of the exchange factor. Proteins with the REM domain include Sos1 and Sos2, which relay signals from tyrosine-kinase mediated signalling to Ras, RasGRP1-4, RasGRF1,2, CNrasGEF, and RAP-specific nucleotide exchange factors, to name a few." Q#23528 - CGI_10003050 superfamily 247724 17 113 1.63E-59 182.608 cl17170 Ras_like_GTPase superfamily N - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. 
The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#23529 - CGI_10003051 superfamily 247723 9 77 6.32E-16 69.6413 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. 
The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#23529 - CGI_10003051 superfamily 199156 101 116 0.000955302 35.4968 cl15298 zf-CCHC superfamily - - "Zinc knuckle; The zinc knuckle is a zinc binding motif composed of the following CX2CX4HX4C where X can be any amino acid. The motifs are mostly from retroviral gag proteins (nucleocapsid). Prototype structure is from HIV. Also contains members involved in eukaryotic gene regulation, such as C. elegans GLH-1. Structure is an 18-residue zinc finger." Q#23531 - CGI_10002193 superfamily 248022 72 440 1.20E-28 115.838 cl17468 Aa_trans superfamily - - "Transmembrane amino acid transporter protein; This transmembrane region is found in many amino acid transporters including UNC-47 and MTR. UNC-47 encodes a vesicular amino butyric acid (GABA) transporter, (VGAT). UNC-47 is predicted to have 10 transmembrane domains. MTR is a N system amino acid transporter system protein involved in methyltryptophan resistance. Other members of this family include proline transporters and amino acid permeases." Q#23533 - CGI_10005496 superfamily 247999 4 57 0.000499565 37.9618 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#23535 - CGI_10005498 superfamily 242274 68 215 3.71E-05 42.0142 cl01053 SGNH_hydrolase superfamily - - "SGNH_hydrolase, or GDSL_hydrolase, is a diverse family of lipases and esterases. 
The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the typical Ser-His-Asp(Glu) triad from other serine hydrolases, but may lack the carboxylic acid." Q#23537 - CGI_10027788 superfamily 220695 9 127 2.33E-05 42.9511 cl18571 7TM_GPCR_Srx superfamily NC - Serpentine type 7TM GPCR chemoreceptor Srx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srx is part of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#23539 - CGI_10027790 superfamily 220695 96 236 5.52E-07 48.3439 cl18571 7TM_GPCR_Srx superfamily NC - Serpentine type 7TM GPCR chemoreceptor Srx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srx is part of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#23540 - CGI_10027791 superfamily 220695 100 238 4.62E-05 43.3363 cl18571 7TM_GPCR_Srx superfamily NC - Serpentine type 7TM GPCR chemoreceptor Srx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srx is part of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. 
Q#23543 - CGI_10027794 superfamily 243035 177 248 4.27E-05 41.0662 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#23545 - CGI_10027796 superfamily 241564 8 50 2.02E-10 53.0567 cl00035 BIR superfamily N - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." Q#23546 - CGI_10027797 superfamily 243035 105 179 0.00165238 36.829 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. 
Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. 
Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#23549 - CGI_10027800 superfamily 110440 256 283 0.00936898 33.1501 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase); proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#23550 - CGI_10027801 superfamily 241563 62 97 2.26E-05 41.504 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#23550 - CGI_10027801 superfamily 128778 98 212 0.00783064 34.9331 cl17972 BBC superfamily - - B-Box C-terminal domain; Coiled coil region C-terminal to (some) B-Box domains Q#23551 - CGI_10027802 superfamily 128778 162 276 0.00314408 36.4739 cl17972 BBC superfamily - - B-Box C-terminal domain; Coiled coil region C-terminal to (some) B-Box domains Q#23552 - CGI_10027803 superfamily 202638 100 326 2.04E-37 136.569 cl18230 Anp1 superfamily - - "Anp1; The members of this family (Anp1, Van1 and Mnn9) are membrane proteins required for proper Golgi function. These proteins co-localise within the cis Golgi, and are physically associated in two distinct complexes." 
Q#23553 - CGI_10027804 superfamily 245847 250 398 7.55E-36 129.394 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#23554 - CGI_10027805 superfamily 243034 529 629 5.86E-16 75.1091 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#23554 - CGI_10027805 superfamily 243034 420 512 2.36E-13 67.4052 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, 
yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#23554 - CGI_10027805 superfamily 243034 751 822 5.78E-07 48.5304 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several 
TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#23554 - CGI_10027805 superfamily 243034 598 704 6.23E-06 45.4488 cl02429 TPR superfamily C - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#23554 - CGI_10027805 superfamily 149463 255 330 1.48E-29 113.464 cl07144 DUF1736 superfamily - - Domain of unknown function (DUF1736); This domain of unknown function is found in various hypothetical metazoan proteins. 
Q#23554 - CGI_10027805 superfamily 247918 90 227 0.00129406 39.097 cl17364 PMT_2 superfamily - - Dolichyl-phosphate-mannose-protein mannosyltransferase; This family contains members that are not captured by pfam02366. Q#23555 - CGI_10027806 superfamily 246669 192 327 1.44E-35 126.929 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#23556 - CGI_10027808 superfamily 243035 155 275 2.73E-22 89.2161 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. 
Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. 
Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#23556 - CGI_10027808 superfamily 243035 68 116 0.0019864 36.0386 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. 
Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#23557 - CGI_10027809 superfamily 243035 45 111 8.65E-12 58.7853 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#23557 - CGI_10027809 superfamily 241592 153 193 4.40E-19 79.9488 cl00074 H2A superfamily N - "Histone 2A; H2A is a subunit of the nucleosome. 
The nucleosome is an octamer containing two H2A, H2B, H3, and H4 subunits. The H2A subunit performs essential roles in maintaining structural integrity of the nucleosome, chromatin condensation, and binding of specific chromatin-associated proteins." Q#23558 - CGI_10027810 superfamily 207411 93 133 2.01E-11 57.8396 cl01438 zf-AN1 superfamily - - "AN1-like Zinc finger; Zinc finger at the C-terminus of An1, a ubiquitin-like protein in Xenopus laevis. The following pattern describes the zinc finger. C-X2-C-X(9-12)-C-X(1-2)-C-X4-C-X2-H-X5-H-X-C Where X can be any amino acid, and numbers in brackets indicate the number of residues." Q#23559 - CGI_10027811 superfamily 241570 327 436 3.33E-06 45.3946 cl00047 CAP_ED superfamily - - "effector domain of the CAP family of transcription factors; members include CAP (or cAMP receptor protein (CRP)), which binds cAMP, FNR (fumarate and nitrate reduction), which uses an iron-sulfur cluster to sense oxygen) and CooA, a heme containing CO sensor. In all cases binding of the effector leads to conformational changes and the ability to activate transcription. Cyclic nucleotide-binding domain similar to CAP are also present in cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) and vertebrate cyclic nucleotide-gated ion-channels. 
Cyclic nucleotide-monophosphate binding domain; proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues; the best studied is the prokaryotic catabolite gene activator, CAP, where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure; three conserved glycine residues are thought to be essential for maintenance of the structural integrity of the beta-barrel; CooA is a homodimeric transcription factor that belongs to CAP family; cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain; cAPK's are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain; cGPK's are single chain enzymes that include the two copies of the domain in their N-terminal section; also found in vertebrate cyclic nucleotide-gated ion-channels" Q#23560 - CGI_10027812 superfamily 221683 20 100 4.61E-25 98.4951 cl15002 UPF0489 superfamily - - UPF0489 domain; This family is probably an enzyme which is related to the Arginase family. Q#23561 - CGI_10027813 superfamily 222150 297 320 0.0010057 36.6009 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#23562 - CGI_10027814 superfamily 243072 137 219 2.53E-13 67.7938 cl02529 ANK superfamily N - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#23563 - CGI_10027815 superfamily 207794 136 277 2.95E-44 160.841 cl02948 GH20_hexosaminidase superfamily N - "Beta-N-acetylhexosaminidases of glycosyl hydrolase family 20 (GH20) catalyze the removal of beta-1,4-linked N-acetyl-D-hexosamine residues from the non-reducing ends of N-acetyl-beta-D-hexosaminides including N-acetylglucosides and N-acetylgalactosides. These enzymes are broadly distributed in microorganisms, plants and animals, and play roles in various key physiological and pathological processes. These processes include cell structural integrity, energy storage, cellular signaling, fertilization, pathogen defense, viral penetration, the development of carcinomas, inflammatory events and lysosomal storage disorders. The GH20 enzymes include the eukaryotic beta-N-acetylhexosaminidases A and B, the bacterial chitobiases, dispersin B, and lacto-N-biosidase. The GH20 hexosaminidases are thought to act via a catalytic mechanism in which the catalytic nucleophile is not provided by the solvent or the enzyme, but by the substrate itself." Q#23563 - CGI_10027815 superfamily 207794 52 120 2.46E-40 149.67 cl02948 GH20_hexosaminidase superfamily C - "Beta-N-acetylhexosaminidases of glycosyl hydrolase family 20 (GH20) catalyze the removal of beta-1,4-linked N-acetyl-D-hexosamine residues from the non-reducing ends of N-acetyl-beta-D-hexosaminides including N-acetylglucosides and N-acetylgalactosides. These enzymes are broadly distributed in microorganisms, plants and animals, and play roles in various key physiological and pathological processes. These processes include cell structural integrity, energy storage, cellular signaling, fertilization, pathogen defense, viral penetration, the development of carcinomas, inflammatory events and lysosomal storage disorders. The GH20 enzymes include the eukaryotic beta-N-acetylhexosaminidases A and B, the bacterial chitobiases, dispersin B, and lacto-N-biosidase. 
The GH20 hexosaminidases are thought to act via a catalytic mechanism in which the catalytic nucleophile is not provided by the solvent or the enzyme, but by the substrate itself." Q#23564 - CGI_10027816 superfamily 243109 1 102 3.34E-68 217.812 cl02614 SPRY superfamily N - "SPRY domain; SPRY domains, first identified in the SP1A kinase of Dictyostelium and rabbit Ryanodine receptor (hence the name), are homologous to B30.2. SPRY domains have been identified in at least 11 protein families, covering a wide range of functions, including regulation of cytokine signaling (SOCS), RNA metabolism (DDX1 and hnRNP), immunity to retroviruses (TRIM5alpha), intracellular calcium release (ryanodine receptors or RyR) and regulatory and developmental processes (HERC1 and Ash2L). B30.2 also contains residues in the N-terminus that form a distinct PRY domain structure; i.e. B30.2 domain consists of PRY and SPRY subdomains. B30.2 domains comprise the C-terminus of three protein families: BTNs (receptor glycoproteins of immunoglobulin superfamily); several TRIM proteins (composed of RING/B-box/coiled-coil or RBCC core); Stonutoxin (secreted poisonous protein of the stonefish Synanceia horrida). While SPRY domains are evolutionarily ancient, B30.2 domains are a more recent adaptation where the SPRY/PRY combination is a possible component of immune defense. Mutations found in the SPRY-containing proteins have shown to cause Mediterranean fever and Opitz syndrome." Q#23564 - CGI_10027816 superfamily 214806 392 489 3.86E-18 80.0321 cl15966 CRA superfamily - - "CT11-RanBPM; protein-protein interaction domain present in crown eukaryotes (plants, animals, fungi)" Q#23564 - CGI_10027816 superfamily 128914 174 230 5.36E-06 44.099 cl15352 CTLH superfamily - - C-terminal to LisH motif; Alpha-helical motif of unknown function. 
Q#23564 - CGI_10027816 superfamily 199226 135 167 0.000440166 38.1844 cl11662 LisH superfamily - - "LisH; The LisH (lis homology) domain mediates protein dimerisation and tetramerisation. The LisH domain is found in Sif2, a component of the Set3 complex which is responsible for repressing meiotic genes. It has been shown that the LisH domain helps mediate interaction with components of the Set3 complex." Q#23566 - CGI_10027818 superfamily 245201 19 335 0 610.135 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#23567 - CGI_10027819 superfamily 216686 2 175 2.64E-42 144.388 cl18377 Galactosyl_T superfamily - - "Galactosyltransferase; This family includes the galactosyltransferases UDP-galactose:2-acetamido-2-deoxy-D-glucose3beta-galactosyltransferase and UDP-Gal:beta-GlcNAc beta 1,3-galactosyltransferase. Specific galactosyltransferases transfer galactose to GlcNAc terminal chains in the synthesis of the lacto-series oligosaccharides types 1 and 2." Q#23569 - CGI_10027821 superfamily 241782 419 667 2.38E-74 246.314 cl00321 AAT_I superfamily C - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. 
PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phosphorylase family (fold type V)." Q#23570 - CGI_10027822 superfamily 245206 62 286 1.44E-78 242.148 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. 
Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#23571 - CGI_10027823 superfamily 241782 2 139 3.24E-27 104.176 cl00321 AAT_I superfamily N - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phosphorylase family (fold type V)." 
Q#23572 - CGI_10027824 superfamily 241623 570 922 1.46E-169 500.792 cl00119 PI3Kc_like superfamily - - "Phosphoinositide 3-kinase (PI3K)-like family, catalytic domain; The PI3K-like catalytic domain family is part of a larger superfamily that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and RIO kinases. Members of the family include PI3K, phosphoinositide 4-kinase (PI4K), PI3K-related protein kinases (PIKKs), and TRansformation/tRanscription domain-Associated Protein (TRRAP). PI3Ks catalyze the transfer of the gamma-phosphoryl group from ATP to the 3-hydroxyl of the inositol ring of D-myo-phosphatidylinositol (PtdIns) or its derivatives, while PI4K catalyze the phosphorylation of the 4-hydroxyl of PtdIns. PIKKs are protein kinases that catalyze the phosphorylation of serine/threonine residues, especially those that are followed by a glutamine. PI3Ks play an important role in a variety of fundamental cellular processes, including cell motility, the Ras pathway, vesicle trafficking and secretion, immune cell activation and apoptosis. PI4Ks produce PtdIns(4)P, the major precursor to important signaling phosphoinositides. PIKKs have diverse functions including cell-cycle checkpoints, genome surveillance, mRNA surveillance, and translation control." Q#23572 - CGI_10027824 superfamily 246669 256 411 3.85E-37 138.214 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. 
Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#23572 - CGI_10027824 superfamily 241742 453 575 1.48E-22 96.5577 cl00271 PI3Ka superfamily - - "Phosphoinositide 3-kinase family, accessory domain (PIK domain); PIK domain is conserved in PI3 and PI4-kinases. Its role is unclear, but it has been suggested to be involved in substrate presentation. Phosphoinositide 3-kinases play an important role in a variety of fundamental cellular processes and can be divided into three main classes, defined by their substrate specificity and domain architecture." Q#23572 - CGI_10027824 superfamily 207610 123 225 4.33E-15 72.7212 cl02484 PI3K_rbd superfamily - - "PI3-kinase family, ras-binding domain; Certain members of the PI3K family possess Ras-binding domains in their N-termini. These regions show some similarity (although not highly significant similarity) to Ras-binding pfam00788 domains (unpublished observation)." Q#23572 - CGI_10027824 superfamily 198687 34 105 3.97E-08 51.8913 cl02483 PI3K_p85B superfamily - - "PI3-kinase family, p85-binding domain; PI3-kinase family, p85-binding domain. " Q#23573 - CGI_10027825 superfamily 247866 44 271 9.64E-16 74.026 cl17312 PhyH superfamily - - "Phytanoyl-CoA dioxygenase (PhyH); This family is made up of several eukaryotic phytanoyl-CoA dioxygenase (PhyH) proteins, ectoine hydroxylases and a number of bacterial deoxygenases. PhyH is a peroxisomal enzyme catalyzing the first step of phytanic acid alpha-oxidation. 
PhyH deficiency causes Refsum's disease (RD) which is an inherited neurological syndrome biochemically characterized by the accumulation of phytanic acid in plasma and tissues." Q#23574 - CGI_10027826 superfamily 243110 123 312 7.56E-11 60.9061 cl02616 MACPF superfamily - - "MAC/Perforin domain; The membrane-attack complex (MAC) of the complement system forms transmembrane channels. These channels disrupt the phospholipid bilayer of target cells, leading to cell lysis and death. A number of proteins participate in the assembly of the MAC. Freshly activated C5b binds to C6 to form a C5b-6 complex, then to C7 forming the C5b-7 complex. The C5b-7 complex binds to C8, which is composed of three chains (alpha, beta, and gamma), thus forming the C5b-8 complex. C5b-8 subsequently binds to C9 and acts as a catalyst in the polymerisation of C9. Active MAC has a subunit composition of C5b-C6-C7-C8-C9{n}. Perforin is a protein found in cytolytic T-cell and killer cells. In the presence of calcium, perforin polymerises into transmembrane tubules and is capable of lysing, non-specifically, a variety of target cells. There are a number of regions of similarity in the sequences of complement components C6, C7, C8-alpha, C8-beta, C9 and perforin. The X-ray crystal structure of a MACPF domain reveals that it shares a common fold with bacterial cholesterol dependent cytolysins (pfam01289) such as perfringolysin O. Three key pieces of evidence suggests that MACPF domains and CDCs are homologous: Functional similarity (pore formation), conservation of three glycine residues at a hinge in both families and conservation of a complex core fold." Q#23575 - CGI_10027827 superfamily 243110 124 336 4.82E-18 83.2477 cl02616 MACPF superfamily - - "MAC/Perforin domain; The membrane-attack complex (MAC) of the complement system forms transmembrane channels. These channels disrupt the phospholipid bilayer of target cells, leading to cell lysis and death. 
A number of proteins participate in the assembly of the MAC. Freshly activated C5b binds to C6 to form a C5b-6 complex, then to C7 forming the C5b-7 complex. The C5b-7 complex binds to C8, which is composed of three chains (alpha, beta, and gamma), thus forming the C5b-8 complex. C5b-8 subsequently binds to C9 and acts as a catalyst in the polymerisation of C9. Active MAC has a subunit composition of C5b-C6-C7-C8-C9{n}. Perforin is a protein found in cytolytic T-cell and killer cells. In the presence of calcium, perforin polymerises into transmembrane tubules and is capable of lysing, non-specifically, a variety of target cells. There are a number of regions of similarity in the sequences of complement components C6, C7, C8-alpha, C8-beta, C9 and perforin. The X-ray crystal structure of a MACPF domain reveals that it shares a common fold with bacterial cholesterol dependent cytolysins (pfam01289) such as perfringolysin O. Three key pieces of evidence suggests that MACPF domains and CDCs are homologous: Functional similarity (pore formation), conservation of three glycine residues at a hinge in both families and conservation of a complex core fold." Q#23577 - CGI_10027829 superfamily 241563 6 44 2.11E-07 45.4059 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#23579 - CGI_10027832 superfamily 118742 51 173 2.83E-37 127.198 cl10907 DUF2054 superfamily - - "Uncharacterized conserved protein (DUF2054); This entry contains 14 conserved cysteines, three of which are CC-dimers. The region is of approximately 200 residues in length but its function is unknown." Q#23580 - CGI_10027833 superfamily 241600 589 786 1.23E-70 232.516 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. 
Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#23581 - CGI_10027834 superfamily 222080 156 291 1.31E-35 128.945 cl16256 Transglut_core2 superfamily C - Transglutaminase-like superfamily; Transglutaminase-like superfamily. Q#23581 - CGI_10027834 superfamily 242575 337 367 0.00131441 36.7911 cl01548 YccV-like superfamily N - Hemimethylated DNA-binding protein YccV like; YccV is a hemimethylated DNA binding protein which has been shown to regulate dnaA gene expression. The structure of one of the hypothetical proteins in this family has been solved and it forms a beta sheet structure with a terminating alpha helix. Q#23583 - CGI_10027836 superfamily 222269 27 285 1.05E-89 270.349 cl18657 Cupin_8 superfamily - - Cupin-like domain; This cupin like domain shares similarity to the JmjC domain. Q#23584 - CGI_10027837 superfamily 245874 25 117 3.23E-05 42.0282 cl12111 TNFR superfamily - - "Tumor necrosis factor receptor (TNFR) domain; superfamily of TNF-like receptor domains. When bound to TNF-like cytokines, TNFRs trigger multiple signal transduction pathways, they are involved in inflammation response, apoptosis, autoimmunity and organogenesis. TNFRs domains are elongated with generally three tandem repeats of cysteine-rich domains (CRDs). They fit in the grooves between protomers within the ligand trimer. 
Some TNFRs, such as NGFR and HveA, bind ligands with no structural similarity to TNF and do not bind ligand trimers." Q#23586 - CGI_10027839 superfamily 248020 26 353 1.04E-54 187.672 cl17466 Sulfatase superfamily - - Sulfatase; Sulfatase. Q#23587 - CGI_10027840 superfamily 248020 226 547 5.54E-55 191.909 cl17466 Sulfatase superfamily - - Sulfatase; Sulfatase. Q#23589 - CGI_10027842 superfamily 248458 145 290 1.12E-15 77.3541 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. 
Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#23589 - CGI_10027842 superfamily 248458 352 540 2.26E-09 58.0941 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." 
Q#23590 - CGI_10027843 superfamily 247755 490 682 1.51E-49 170.322 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#23590 - CGI_10027843 superfamily 247755 327 399 8.34E-36 132.187 cl17201 ABC_ATPase superfamily N - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#23590 - CGI_10027843 superfamily 247755 176 239 1.31E-14 71.7108 cl17201 ABC_ATPase superfamily C - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. 
ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#23590 - CGI_10027843 superfamily 221805 394 455 1.87E-19 84.171 cl14896 ABC_tran_2 superfamily C - ABC transporter; This domain is related to pfam00005. Q#23591 - CGI_10027844 superfamily 245201 32 302 4.42E-20 90.7589 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." 
Q#23591 - CGI_10027844 superfamily 243092 984 1360 1.44E-13 71.5972 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#23592 - CGI_10027845 superfamily 217473 192 346 3.73E-25 105.14 cl03978 Mab-21 superfamily N - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#23593 - CGI_10027846 superfamily 217380 225 518 3.92E-44 162.494 cl18406 TTL superfamily - - "Tubulin-tyrosine ligase family; Tubulins and microtubules are subjected to several post-translational modifications of which the reversible detyrosination/tyrosination of the carboxy-terminal end of most alpha-tubulins has been extensively analysed. This modification cycle involves a specific carboxypeptidase and the activity of the tubulin-tyrosine ligase (TTL). 
The true physiological function of TTL has so far not been established. Tubulin-tyrosine ligase (TTL) catalyzes the ATP-dependent post-translational addition of a tyrosine to the carboxy terminal end of detyrosinated alpha-tubulin. In normally cycling cells, the tyrosinated form of tubulin predominates. However, in breast cancer cells, the detyrosinated form frequently predominates, with a correlation to tumour aggressiveness. On the other hand, 3-nitrotyrosine has been shown to be incorporated, by TTL, into the carboxy terminal end of detyrosinated alpha-tubulin. This reaction is not reversible by the carboxypeptidase enzyme. Cells cultured in 3-nitrotyrosine rich medium showed evidence of altered microtubule structure and function, including altered cell morphology, epithelial barrier dysfunction, and apoptosis. Bacterial homologs of TTL are predicted to form peptide tags. Some of these are fused to a 2-oxoglutarate Fe(II)-dependent dioxygenase domain." Q#23596 - CGI_10027849 superfamily 151555 34 154 3.56E-31 112.167 cl12666 CENP-M superfamily N - "Centromere protein M (CENP-M); The prime candidate for specifying centromere identity is the array of nucleosomes assembled with CENP-A. CENP-A recruits a nucleosome associated complex (NAC) comprised of CENP-M along with two other proteins. Assembly of the CENP-A NAC at centromeres is partly dependent on CENP-M. The CENP-A NAC is essential, as disruption of the complex causes errors of chromosome alignment and segregation that preclude cell survival." Q#23599 - CGI_10027852 superfamily 247750 12 28 1.91E-06 43.0479 cl17196 E1_enzyme_family superfamily C - "Superfamily of activating enzymes (E1) of the ubiquitin-like proteins. This family includes classical ubiquitin-activating enzymes E1, ubiquitin-like (ubl) activating enzymes and other mechanistic homologs, like MoeB, Thif1 and others. 
The common reaction mechanism catalyzed by MoeB, ThiF and the E1 enzymes begins with a nucleophilic attack of the C-terminal carboxylate of MoaD, ThiS and ubiquitin, respectively, on the alpha-phosphate of an ATP molecule bound at the active site of the activating enzymes, leading to the formation of a high-energy acyladenylate intermediate and subsequently to the formation of a thiocarboxylate at the C termini of MoaD and ThiS." Q#23600 - CGI_10027853 superfamily 247856 186 266 0.00573378 34.4457 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#23601 - CGI_10027854 superfamily 243034 513 607 4.51E-15 72.4127 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of 
the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#23603 - CGI_10027856 superfamily 247727 51 139 2.80E-07 47.266 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#23604 - CGI_10027857 superfamily 241995 12 115 1.36E-37 134.383 cl00635 Ntn_Asparaginase_2_like superfamily C - "Ntn-hydrolase superfamily, L-Asparaginase type 2-like enzymes. This family includes Glycosylasparaginase, Taspase 1 and L-Asparaginase type 2 enzymes. Glycosylasparaginase catalyzes the hydrolysis of the glycosylamide bond of asparagine-linked glycoprotein. Taspase1 catalyzes the cleavage of the Mix Lineage Leukemia (MLL) nuclear protein and transcription factor TFIIA. L-Asparaginase type 2 hydrolyzes L-asparagine to L-aspartate and ammonia. The proenzymes of this family undergo autoproteolytic cleavage before a threonine to generate alpha and beta subunits. The threonine becomes the N-terminal residue of the beta subunit and is the catalytic residue." Q#23604 - CGI_10027857 superfamily 241995 125 225 1.09E-18 81.6106 cl00635 Ntn_Asparaginase_2_like superfamily N - "Ntn-hydrolase superfamily, L-Asparaginase type 2-like enzymes. This family includes Glycosylasparaginase, Taspase 1 and L-Asparaginase type 2 enzymes. 
Glycosylasparaginase catalyzes the hydrolysis of the glycosylamide bond of asparagine-linked glycoprotein. Taspase1 catalyzes the cleavage of the Mix Lineage Leukemia (MLL) nuclear protein and transcription factor TFIIA. L-Asparaginase type 2 hydrolyzes L-asparagine to L-aspartate and ammonia. The proenzymes of this family undergo autoproteolytic cleavage before a threonine to generate alpha and beta subunits. The threonine becomes the N-terminal residue of the beta subunit and is the catalytic residue." Q#23605 - CGI_10027858 superfamily 241578 140 402 4.93E-153 449.899 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#23605 - CGI_10027858 superfamily 247044 622 742 5.55E-78 248.443 cl15697 ADF_gelsolin superfamily - - Actin depolymerization factor/cofilin- and gelsolin-like domains; Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. 
Q#23605 - CGI_10027858 superfamily 218277 531 631 4.33E-40 143.44 cl04773 Sec23_helical superfamily - - "Sec23/Sec24 helical domain; COPII-coated vesicles carry proteins from the endoplasmic reticulum to the Golgi complex. This vesicular transport can be reconstituted by using three cytosolic components containing five proteins: the small GTPase Sar1p, the Sec23p/24p complex, and the Sec13p/Sec31p complex. This domain is composed of five alpha helices." Q#23605 - CGI_10027858 superfamily 219707 415 516 9.31E-27 105.291 cl06871 Sec23_BS superfamily - - Sec23/Sec24 beta-sandwich domain; Sec23/Sec24 beta-sandwich domain. Q#23605 - CGI_10027858 superfamily 203092 70 110 9.12E-17 75.6747 cl04769 zf-Sec23_Sec24 superfamily - - "Sec23/Sec24 zinc finger; COPII-coated vesicles carry proteins from the endoplasmic reticulum to the Golgi complex. This vesicular transport can be reconstituted by using three cytosolic components containing five proteins: the small GTPase Sar1p, the Sec23p/24p complex, and the Sec13p/Sec31p complex. This domain is found to be zinc binding domain." Q#23606 - CGI_10027859 superfamily 247856 23 86 1.57E-12 57.5577 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#23607 - CGI_10027860 superfamily 219896 37 215 3.51E-33 119.228 cl18532 CIA30 superfamily - - "Complex I intermediate-associated protein 30 (CIA30); This protein is associated with mitochondrial Complex I intermediate-associated protein 30 (CIA30) in human and mouse. The family is also present in Schizosaccharomyces pombe which does not contain the NADH dehydrogenase component of complex I, or many of the other essential subunits. 
This means it is possible that this family of protein may not be directly involved in oxidative phosphorylation." Q#23608 - CGI_10027861 superfamily 241984 70 354 5.62E-168 483.3 cl00615 Membrane-FADS-like superfamily - - "The membrane fatty acid desaturase (Membrane_FADS)-like CD includes membrane FADSs, alkane hydroxylases, beta carotene ketolases (CrtW-like), hydroxylases (CrtR-like), and other related proteins. They are present in all groups of organisms with the exception of archaea. Membrane FADSs are non-heme, iron-containing, oxygen-dependent enzymes involved in regioselective introduction of double bonds in fatty acyl aliphatic chains. They play an important role in the maintenance of the proper structure and functioning of biological membranes. Alkane hydroxylases are bacterial, integral-membrane di-iron enzymes that share a requirement for iron and oxygen for activity similar to that of membrane FADSs, and are involved in the initial oxidation of inactivated alkanes. Beta-carotene ketolase and beta-carotene hydroxylase are carotenoid biosynthetic enzymes for astaxanthin and zeaxanthin, respectively. This superfamily domain has extensive hydrophobic regions that would be capable of spanning the membrane bilayer at least twice. Comparison of these sequences also reveals three regions of conserved histidine cluster motifs that contain eight histidine residues: HXXX(X)H, HXX(X)HH, and HXXHH (an additional conserved histidine residue is seen between clusters 2 and 3). Spectroscopic and genetic evidence point to a nitrogen-rich coordination environment located in the cytoplasm with as many as eight histidines coordinating the two iron ions and a carboxylate residue bridging the two metals in the Pseudomonas oleovorans alkane hydroxylase (AlkB). In addition, the eight histidine residues are reported to be catalytically essential and proposed to be the ligands for the iron atoms contained within the rat stearoyl CoA delta-9 desaturase." 
Q#23608 - CGI_10027861 superfamily 247736 465 607 7.08E-38 137.144 cl17182 NAT_SF superfamily - - "N-Acyltransferase superfamily: Various enzymes that characteristically catalyze the transfer of an acyl group to a substrate; NAT (N-Acyltransferase) is a large superfamily of enzymes that mostly catalyze the transfer of an acyl group to a substrate and are implicated in a variety of functions, ranging from bacterial antibiotic resistance to circadian rhythms in mammals. Members include GCN5-related N-Acetyltransferases (GNAT) such as Aminoglycoside N-acetyltransferases, Histone N-acetyltransferase (HAT) enzymes, and Serotonin N-acetyltransferase, which catalyze the transfer of an acetyl group to a substrate. The kinetic mechanism of most GNATs involves the ordered formation of a ternary complex: the reaction begins with Acetyl Coenzyme A (AcCoA) binding, followed by binding of substrate, then direct transfer of the acetyl group from AcCoA to the substrate, followed by product and subsequent CoA release. Other family members include Arginine/ornithine N-succinyltransferase, Myristoyl-CoA: protein N-myristoyltransferase, and Acyl-homoserinelactone synthase which have a similar catalytic mechanism but differ in types of acyl groups transferred. Leucyl/phenylalanyl-tRNA-protein transferase and FemXAB nonribosomal peptidyltransferases which catalyze similar peptidyltransferase reactions are also included." Q#23608 - CGI_10027861 superfamily 117132 5 85 1.55E-11 59.9797 cl07243 Lipid_DES superfamily - - Sphingolipid Delta4-desaturase (DES); Sphingolipids are important membrane signalling molecules involved in many different cellular functions in eukaryotes. Sphingolipid delta 4-desaturase catalyzes the formation of (E)-sphing-4-enine. Some proteins in this family have bifunctional delta 4-desaturase/C-4-hydroxylase activity. 
Delta 4-desaturated sphingolipids may play a role in early signalling required for entry into meiotic and spermatid differentiation pathways during Drosophila spermatogenesis. This small domain associates with FA_desaturase pfam00487 and appears to be specific to sphingolipid delta 4-desaturase. Q#23609 - CGI_10027862 superfamily 222150 270 295 0.000129789 38.9121 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#23609 - CGI_10027862 superfamily 222150 299 326 0.000133036 38.9121 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#23611 - CGI_10027864 superfamily 241559 73 154 1.68E-10 56.9355 cl00030 CH superfamily C - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." Q#23611 - CGI_10027864 superfamily 216033 213 304 2.49E-08 50.41 cl16959 Filamin superfamily N - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#23612 - CGI_10027865 superfamily 241568 8 71 0.00161031 35.1312 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." 
Q#23613 - CGI_10027866 superfamily 247856 19 73 3.16E-05 40.2237 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#23614 - CGI_10027867 superfamily 243060 381 481 4.57E-09 55.4628 cl02507 SEA superfamily - - "SEA domain; Domain found in Sea urchin sperm protein, Enterokinase, Agrin (SEA). Proposed function of regulating or binding carbohydrate side chains. Recently a proteolytic activity has been shown for a SEA domain." Q#23614 - CGI_10027867 superfamily 243060 710 803 1.40E-06 47.7588 cl02507 SEA superfamily - - "SEA domain; Domain found in Sea urchin sperm protein, Enterokinase, Agrin (SEA). Proposed function of regulating or binding carbohydrate side chains. Recently a proteolytic activity has been shown for a SEA domain." Q#23616 - CGI_10027869 superfamily 202715 138 237 6.85E-35 121.916 cl04194 Tctex-1 superfamily - - Tctex-1 family; Tctex-1 is a dynein light chain. It has been shown that Tctex-1 can bind to the cytoplasmic tail of rhodopsin. C-terminal rhodopsin mutations responsible for retinitis pigmentosa inhibit this interaction. Q#23617 - CGI_10027870 superfamily 241570 315 426 1.42E-18 82.7589 cl00047 CAP_ED superfamily - - "effector domain of the CAP family of transcription factors; members include CAP (or cAMP receptor protein (CRP)), which binds cAMP, FNR (fumarate and nitrate reduction), which uses an iron-sulfur cluster to sense oxygen, and CooA, a heme containing CO sensor. In all cases binding of the effector leads to conformational changes and the ability to activate transcription.
Cyclic nucleotide-binding domain similar to CAP are also present in cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) and vertebrate cyclic nucleotide-gated ion-channels. Cyclic nucleotide-monophosphate binding domain; proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues; the best studied is the prokaryotic catabolite gene activator, CAP, where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure; three conserved glycine residues are thought to be essential for maintenance of the structural integrity of the beta-barrel; CooA is a homodimeric transcription factor that belongs to CAP family; cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain; cAPK's are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain; cGPK's are single chain enzymes that include the two copies of the domain in their N-terminal section; also found in vertebrate cyclic nucleotide-gated ion-channels" Q#23617 - CGI_10027870 superfamily 219619 184 233 6.96E-11 59.1435 cl18518 Ion_trans_2 superfamily N - Ion channel; This family includes the two membrane helix type ion channels found in bacteria. Q#23618 - CGI_10027871 superfamily 207794 186 310 3.35E-73 237.96 cl02948 GH20_hexosaminidase superfamily C - "Beta-N-acetylhexosaminidases of glycosyl hydrolase family 20 (GH20) catalyze the removal of beta-1,4-linked N-acetyl-D-hexosamine residues from the non-reducing ends of N-acetyl-beta-D-hexosaminides including N-acetylglucosides and N-acetylgalactosides. These enzymes are broadly distributed in microorganisms, plants and animals, and play roles in various key physiological and pathological processes. 
These processes include cell structural integrity, energy storage, cellular signaling, fertilization, pathogen defense, viral penetration, the development of carcinomas, inflammatory events and lysosomal storage disorders. The GH20 enzymes include the eukaryotic beta-N-acetylhexosaminidases A and B, the bacterial chitobiases, dispersin B, and lacto-N-biosidase. The GH20 hexosaminidases are thought to act via a catalytic mechanism in which the catalytic nucleophile is not provided by the solvent or the enzyme, but by the substrate itself." Q#23618 - CGI_10027871 superfamily 207794 311 413 3.34E-17 81.9544 cl02948 GH20_hexosaminidase superfamily NC - "Beta-N-acetylhexosaminidases of glycosyl hydrolase family 20 (GH20) catalyze the removal of beta-1,4-linked N-acetyl-D-hexosamine residues from the non-reducing ends of N-acetyl-beta-D-hexosaminides including N-acetylglucosides and N-acetylgalactosides. These enzymes are broadly distributed in microorganisms, plants and animals, and play roles in various key physiological and pathological processes. These processes include cell structural integrity, energy storage, cellular signaling, fertilization, pathogen defense, viral penetration, the development of carcinomas, inflammatory events and lysosomal storage disorders. The GH20 enzymes include the eukaryotic beta-N-acetylhexosaminidases A and B, the bacterial chitobiases, dispersin B, and lacto-N-biosidase. The GH20 hexosaminidases are thought to act via a catalytic mechanism in which the catalytic nucleophile is not provided by the solvent or the enzyme, but by the substrate itself." Q#23618 - CGI_10027871 superfamily 243574 1 51 2.21E-12 63.888 cl03918 CHB_HEX superfamily N - Putative carbohydrate binding domain; This domain represents the N terminal domain in chitobiases and beta-hexosaminidases EC:3.2.1.52. It is composed of a beta sandwich structure that is similar in structure to the cellulose binding domain of cellulase from Cellulomonas fimi. 
This suggests that this may be a carbohydrate binding domain. Q#23619 - CGI_10027872 superfamily 247724 14 54 9.94E-05 39.0968 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#23621 - CGI_10002519 superfamily 242248 82 150 4.93E-11 57.9469 cl01008 DUF423 superfamily - - Protein of unknown function (DUF423); This family of proteins with unknown function is a possible integral membrane protein from Caenorhabditis elegans. This family of proteins has GO references indicating the protein is involved in nematode larval development and is a positive regulator of growth rate. 
Q#23623 - CGI_10003191 superfamily 245304 166 471 7.32E-130 391.538 cl10459 Peptidases_S8_S53 superfamily - - "Peptidase domain in the S8 and S53 families; Members of the peptidases S8 (subtilisin and kexin) and S53 (sedolisin) family include endopeptidases and exopeptidases. The S8 family has an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure and are not homologous to trypsin. Serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The S53 family contains a catalytic triad Glu/Asp/Ser with an additional acidic residue Asp in the oxyanion hole, similar to that of subtilisin. The serine residue here is the nucleophilic equivalent of the serine residue in the S8 family, while glutamic acid has the same role here as the histidine base. However, the aspartic acid residue that acts as an electrophile is quite different. In S53, it follows glutamic acid, while in S8 it precedes histidine. The stability of these enzymes may be enhanced by calcium; some members have been shown to bind up to 4 ions via binding sites with different affinity. There is a great diversity in the characteristics of their members: some contain disulfide bonds, some are intracellular while others are extracellular, some function at extreme temperatures, and others at high or low pH values." Q#23623 - CGI_10003191 superfamily 201820 551 638 4.12E-30 115.03 cl08326 P_proprotein superfamily - - Proprotein convertase P-domain; A unique feature of the eukaryotic subtilisin-like proprotein convertases is the presence of an additional highly conserved sequence of approximately 150 residues (P domain) located immediately downstream of the catalytic domain. Q#23624 - CGI_10003192 superfamily 241609 354 433 7.25E-24 95.1375 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. 
They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#23624 - CGI_10003192 superfamily 241583 88 180 5.58E-42 148.488 cl00064 ZnMc superfamily C - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#23627 - CGI_10019365 superfamily 243050 178 223 2.00E-13 62.8588 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#23628 - CGI_10019366 superfamily 245603 94 161 7.48E-07 43.8644 cl11403 pepsin_retropepsin_like superfamily C - "Cellular and retroviral pepsin-like aspartate proteases; This family includes both cellular and retroviral pepsin-like aspartate proteases. The cellular pepsin and pepsin-like enzymes are twice as long as their retroviral counterparts. 
The cellular pepsin-like aspartic proteases are found in mammals, plants, fungi and bacteria. These well known and extensively characterized enzymes include pepsins, chymosin, rennin, cathepsins, and fungal aspartic proteases. Several have long been known to be medically (rennin, cathepsin D and E, pepsin) or commercially (chymosin) important. The eukaryotic pepsin-like proteases contain two domains possessing similar topological features. The N- and C-terminal domains, although structurally related by a 2-fold axis, have only limited sequence homology except in the vicinity of the active site. This suggests that the enzymes evolved by an ancient duplication event. The eukaryotic pepsin-like proteases have two active site ASP residues with each N- and C-terminal lobe contributing one residue. While the fungal and mammalian pepsins are bilobal proteins, retropepsins function as dimers and the monomer resembles structure of the N- or C-terminal domains of eukaryotic enzyme. The active site motif (Asp-Thr/Ser-Gly-Ser) is conserved between the retroviral and eukaryotic proteases and between the N- and C-terminal of eukaryotic pepsin-like proteases. The retropepsin-like family includes pepsin-like aspartate proteases from retroviruses, retrotransposons and retroelements; as well as eukaryotic DNA-damage-inducible proteins (DDIs), and bacterial aspartate peptidases. Retropepsin is synthesized as part of the POL polyprotein that contains an aspartyl-protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. This family of aspartate proteases is classified by MEROPS as the peptidase family A1 (pepsin A) and A2 (retropepsin family)." Q#23628 - CGI_10019366 superfamily 199156 24 39 0.000170836 36.6524 cl15298 zf-CCHC superfamily - - "Zinc knuckle; The zinc knuckle is a zinc binding motif composed of the following CX2CX4HX4C where X can be any amino acid.
The motifs are mostly from retroviral gag proteins (nucleocapsid). Prototype structure is from HIV. Also contains members involved in eukaryotic gene regulation, such as C. elegans GLH-1. Structure is an 18-residue zinc finger." Q#23630 - CGI_10019368 superfamily 241889 46 254 4.29E-64 202.612 cl00474 PAP2_like superfamily - - "PAP2_like proteins, a super-family of histidine phosphatases and vanadium haloperoxidases, includes type 2 phosphatidic acid phosphatase or lipid phosphate phosphatase (LPP), Glucose-6-phosphatase, Phosphatidylglycerophosphatase B and bacterial acid phosphatase, vanadium chloroperoxidases, vanadium bromoperoxidases, and several other mostly uncharacterized subfamilies. Several members of this superfamily have been predicted to be transmembrane proteins." Q#23631 - CGI_10019369 superfamily 241705 79 188 1.30E-45 148.094 cl00228 HIT_like superfamily - - "HIT family: HIT (Histidine triad) proteins, named for a motif related to the sequence HxHxH/Qxx (x, a hydrophobic amino acid), are a superfamily of nucleotide hydrolases and transferases, which act on the alpha-phosphate of ribonucleotides. On the basis of sequence, substrate specificity, structure, evolution and mechanism, HIT proteins are classified in the literature into three major branches: the Hint branch, which consists of adenosine 5' -monophosphoramide hydrolases, the Fhit branch, that consists of diadenosine polyphosphate hydrolases, and the GalT branch consisting of specific nucleoside monophosphate transferases. Further sequence analysis reveals several new closely related, yet uncharacterized subgroups." Q#23633 - CGI_10019371 superfamily 247724 10 165 2.32E-50 162.831 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity.
The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#23634 - CGI_10019372 superfamily 241564 27 92 1.62E-22 87.3211 cl00035 BIR superfamily - - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." Q#23634 - CGI_10019372 superfamily 241564 136 202 6.25E-19 77.6911 cl00035 BIR superfamily - - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. 
This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." Q#23637 - CGI_10019375 superfamily 243035 29 148 8.16E-22 89.2161 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces.
Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#23637 - CGI_10019375 superfamily 243035 277 379 8.99E-12 61.4817 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model."
Q#23638 - CGI_10019377 superfamily 241568 200 253 0.00122031 36.2868 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#23640 - CGI_10019379 superfamily 243035 123 233 3.10E-16 73.8081 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration.
CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#23640 - CGI_10019379 superfamily 243035 258 371 1.94E-15 71.4969 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. 
Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. 
A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#23640 - CGI_10019379 superfamily 243035 29 98 0.00014823 39.9106 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. 
C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#23641 - CGI_10019380 superfamily 243035 162 233 3.14E-05 41.0662 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#23642 - CGI_10019381 superfamily 243035 177 290 1.23E-14 68.4153 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#23642 - CGI_10019381 superfamily 243035 24 135 1.70E-14 68.0301 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. 
CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#23643 - CGI_10019382 superfamily 242274 3 54 0.000663362 34.1334 cl01053 SGNH_hydrolase superfamily NC - "SGNH_hydrolase, or GDSL_hydrolase, is a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the typical Ser-His-Asp(Glu) triad from other serine hydrolases, but may lack the carboxylic acid." 
Q#23646 - CGI_10019385 superfamily 241619 26 82 7.95E-06 40.1457 cl00112 PAN_APPLE superfamily N - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#23647 - CGI_10019386 superfamily 241655 36 207 3.62E-46 152.543 cl00167 VMO-I superfamily - - "Vitelline membrane outer layer protein I (VMO-I) domain, VMO-I is one of the proteins found in the outer layer of the vitelline membrane of poultry eggs; VMO-I, lysozyme, and VMO-II are tightly bound to ovomucin; this complex forms the backbone of the outer layer; VMO-I has three distinct internal repeats; all three repeats are used to define the domain here; VMO-I has recently been shown to synthesize N-acetylchito-oligosaccharides from N-acetylglucosamine; may be a carbohydrate-binding protein; member of the beta-prism-fold family" Q#23648 - CGI_10019387 superfamily 246751 319 551 1.07E-23 101.399 cl14883 Lipase superfamily - - "Lipase. Lipases are esterases that can hydrolyze long-chain acyl-triglycerides into di- and monoglycerides, glycerol, and free fatty acids at a water/lipid interface. A typical feature of lipases is "interfacial activation", the process of becoming active at the lipid/water interface, although several examples of lipases have been identified that do not undergo interfacial activation. The active site of a lipase contains a catalytic triad consisting of Ser - His - Asp/Glu, but unlike most serine proteases, the active site is buried inside the structure. 
A "lid" or "flap" covers the active site, making it inaccessible to solvent and substrates. The lid opens during the process of interfacial activation, allowing the lipid substrate access to the active site." Q#23649 - CGI_10019388 superfamily 205803 82 189 1.16E-23 93.3709 cl16326 Helicase_C_3 superfamily C - Helicase conserved C-terminal domain; This domain family is found in a wide variety of helicases and helicase-related proteins. Q#23651 - CGI_10019390 superfamily 243689 56 95 3.48E-06 46.0825 cl04271 IBN_N superfamily N - Importin-beta N-terminal domain; Importin-beta N-terminal domain. Q#23651 - CGI_10019390 superfamily 202500 392 422 0.000131608 40.9641 cl03819 HEAT superfamily - - HEAT repeat; The HEAT repeat family is related to armadillo/beta-catenin-like repeats (see pfam00514). Q#23652 - CGI_10019391 superfamily 242137 1 205 4.41E-30 112.801 cl00847 PAC2 superfamily - - "PAC2 family; This PAC2 (Proteasome assembly chaperone) family of proteins is found in bacteria, archaea and eukaryotes. Proteins in this family are typically between 247 and 307 amino acids in length. These proteins function as a chaperone for the 26S proteasome. The 26S proteasome mediates ubiquitin-dependent proteolysis in eukaryotic cells. A number of studies including very recent ones have revealed that assembly of its 20S catalytic core particle is an ordered process that involves several conserved proteasome assembly chaperones (PACs). Two heterodimeric chaperones, PAC1-PAC2 and PAC3-PAC4, promote the assembly of rings composed of seven alpha subunits." Q#23653 - CGI_10019392 superfamily 247743 358 493 2.29E-24 100.683 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. 
Members of the AAA+ ATPases function as molecular chaperones, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#23653 - CGI_10019392 superfamily 216502 556 759 2.85E-75 245.579 cl03209 Peptidase_M41 superfamily - - Peptidase family M41; Peptidase family M41. Q#23655 - CGI_10005458 superfamily 192997 320 472 6.72E-46 163.907 cl18184 Sterol-sensing superfamily - - "Sterol-sensing domain of SREBP cleavage-activation; Sterol regulatory element-binding proteins (SREBPs) are membrane-bound transcription factors that promote lipid synthesis in animal cells. They are embedded in the membranes of the endoplasmic reticulum (ER) in a helical hairpin orientation and are released from the ER by a two-step proteolytic process. Proteolysis begins when the SREBPs are cleaved at Site-1, which is located at a leucine residue in the middle of the hydrophobic loop in the lumen of the ER. Upon proteolytic processing SREBP can activate the expression of genes involved in cholesterol biosynthesis and uptake. SCAP stimulates cleavage of SREBPs via fusion of their two C-termini. This domain is the transmembrane region that traverses the membrane eight times and is the sterol-sensing domain of the cleavage protein. WD40 domains are found towards the C-terminus." 
Q#23655 - CGI_10005458 superfamily 243092 1048 1270 1.28E-17 84.3088 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#23655 - CGI_10005458 superfamily 243092 820 875 0.00140583 40.7812 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#23659 - CGI_10005462 superfamily 248391 23 250 3.11E-63 200.279 cl17837 DUF1295 superfamily - - Protein of unknown function (DUF1295); This family contains a number of bacterial and eukaryotic proteins of unknown function that are approximately 300 residues long. 
Q#23660 - CGI_10005463 superfamily 243054 321 428 0.00929143 36.6548 cl02488 SPEC superfamily NC - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#23664 - CGI_10006353 superfamily 247692 121 611 3.60E-140 438.416 cl17068 AFD_class_I superfamily - - "Adenylate forming domain, Class I; This family includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain." Q#23664 - CGI_10006353 superfamily 245206 1141 1375 1.78E-77 257.467 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. 
Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#23664 - CGI_10006353 superfamily 245206 755 1052 5.16E-72 243.712 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#23664 - CGI_10006353 superfamily 245209 636 690 4.23E-10 57.9498 cl09936 PP-binding superfamily - - Phosphopantetheine attachment site; A 4'-phosphopantetheine prosthetic group is attached through a serine. This prosthetic group acts as a 'swinging arm' for the attachment of activated fatty acid and amino-acid groups. This domain forms a four helix bundle. This family includes members not included in Prosite. The inclusion of these members is supported by sequence analysis and functional evidence. 
The related domain of Vibrio anguillarum angR has the attachment serine replaced by an alanine. Q#23665 - CGI_10006354 superfamily 218793 133 349 4.91E-19 83.7349 cl12319 CNPase superfamily - - "2',3'-cyclic nucleotide 3'-phosphodiesterase (CNP or CNPase); This family consists of the eukaryotic protein 2',3'-cyclic nucleotide 3'-phosphodiesterase (CNP). 2',3'-cyclic nucleotide 3'-phosphodiesterase (CNP) is one of the earliest myelin-related proteins expressed in differentiating oligodendrocytes and Schwann cells. CNP is abundant in the central nervous system and in oligodendrocytes. This protein is also found in mammalian photoreceptor cells, testis and lymphocytes. Although the biological function of CNP is unknown, it is thought to play a significant role in the formation of the myelin sheath, where it comprises 4% of total protein. CNP selectively cleaves 2',3'-cyclic nucleotides to produce 2'-nucleotides in vitro. Although physiologically relevant substrates with 2',3'-cyclic termini are still unknown, numerous cyclic phosphate containing RNAs occur transiently within eukaryotic cells. Other known protein families capable of hydrolysing 2',3'-cyclic nucleotides include tRNA ligases and plant cyclic phosphodiesterases. The catalytic domains from all these proteins contain two tetra-peptide motifs H-X-T/S-X, where X is usually a hydrophobic residue. Mutation of either histidine in CNP abolishes enzymatic activity. CNPases belong to the 2H phosphoesterase superfamily. They share a common active site, characterized by two conserved histidines, with the bacterial tRNA-ligating enzyme LigT, vertebrate myelin-associated 2',3' phosphodiesterases, plant Arabidopsis thaliana CPDases and several bacterial and viral proteins." Q#23665 - CGI_10006354 superfamily 247807 3 78 4.47E-07 47.2897 cl17253 AAA_17 superfamily C - AAA domain; AAA domain. 
Q#23666 - CGI_10006355 superfamily 218793 133 349 4.63E-19 83.7349 cl12319 CNPase superfamily - - "2',3'-cyclic nucleotide 3'-phosphodiesterase (CNP or CNPase); This family consists of the eukaryotic protein 2',3'-cyclic nucleotide 3'-phosphodiesterase (CNP). 2',3'-cyclic nucleotide 3'-phosphodiesterase (CNP) is one of the earliest myelin-related proteins expressed in differentiating oligodendrocytes and Schwann cells. CNP is abundant in the central nervous system and in oligodendrocytes. This protein is also found in mammalian photoreceptor cells, testis and lymphocytes. Although the biological function of CNP is unknown, it is thought to play a significant role in the formation of the myelin sheath, where it comprises 4% of total protein. CNP selectively cleaves 2',3'-cyclic nucleotides to produce 2'-nucleotides in vitro. Although physiologically relevant substrates with 2',3'-cyclic termini are still unknown, numerous cyclic phosphate containing RNAs occur transiently within eukaryotic cells. Other known protein families capable of hydrolysing 2',3'-cyclic nucleotides include tRNA ligases and plant cyclic phosphodiesterases. The catalytic domains from all these proteins contain two tetra-peptide motifs H-X-T/S-X, where X is usually a hydrophobic residue. Mutation of either histidine in CNP abolishes enzymatic activity. CNPases belong to the 2H phosphoesterase superfamily. They share a common active site, characterized by two conserved histidines, with the bacterial tRNA-ligating enzyme LigT, vertebrate myelin-associated 2',3' phosphodiesterases, plant Arabidopsis thaliana CPDases and several bacterial and viral proteins." Q#23666 - CGI_10006355 superfamily 247807 3 78 7.10E-07 46.5193 cl17253 AAA_17 superfamily C - AAA domain; AAA domain. 
Q#23667 - CGI_10006356 superfamily 247858 93 275 3.32E-29 110.17 cl17304 2OG-FeII_Oxy_3 superfamily - - 2OG-Fe(II) oxygenase superfamily; This family contains members of the 2-oxoglutarate (2OG) and Fe(II)-dependent oxygenase superfamily. Q#23668 - CGI_10006357 superfamily 203913 1 131 4.13E-06 44.1301 cl07084 P4Ha_N superfamily - - "Prolyl 4-Hydroxylase alpha-subunit, N-terminal region; The members of this family are eukaryotic proteins, and include all three isoforms of the prolyl 4-hydroxylase alpha subunit. This enzyme (EC:1.14.11.2) is important in the post-translational modification of collagen, as it catalyzes the formation of 4-hydroxyproline. In vertebrates, the complete enzyme is an alpha2-beta2 tetramer; the beta-subunit is identical to protein disulphide isomerase. The function of the N-terminal region featured in this family does not seem to be known." Q#23669 - CGI_10006358 superfamily 243072 585 704 5.45E-21 90.5206 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#23669 - CGI_10006358 superfamily 243072 653 799 1.81E-19 86.2834 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#23669 - CGI_10006358 superfamily 218713 209 316 6.54E-13 67.7524 cl05332 MRG superfamily NC - "MRG; This family consists of three different eukaryotic proteins (mortality factor 4 (MORF4/MRG15), male-specific lethal 3(MSL-3) and ESA1-associated factor 3(EAF3)). It is thought that the MRG family is involved in transcriptional regulation via histone acetylation. It contains 2 chromo domains and a leucine zipper motif." Q#23669 - CGI_10006358 superfamily 218713 484 549 1.16E-10 60.8188 cl05332 MRG superfamily N - "MRG; This family consists of three different eukaryotic proteins (mortality factor 4 (MORF4/MRG15), male-specific lethal 3(MSL-3) and ESA1-associated factor 3(EAF3)). It is thought that the MRG family is involved in transcriptional regulation via histone acetylation. It contains 2 chromo domains and a leucine zipper motif." Q#23669 - CGI_10006358 superfamily 152153 10 73 1.45E-05 43.7434 cl18050 Tudor-knot superfamily - - RNA binding activity-knot of a chromodomain; This is a novel knotted tudor domain which is required for binding to RNA. The knot influences the loop conformation of the helical turn Ht2 - residues 61-63 - that is located at the side opposite the knot in the tudor domain-chromodomain; stabilisation of Ht2 is essential for RNA binding. 
Q#23670 - CGI_10006359 superfamily 247684 328 745 6.30E-89 290.333 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#23671 - CGI_10003308 superfamily 245201 28 282 9.70E-61 197.742 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. 
These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#23673 - CGI_10003310 superfamily 241550 408 642 1.31E-99 323.415 cl00015 nt_trans superfamily N - "nucleotidyl transferase superfamily; nt_trans (nucleotidyl transferase) This superfamily includes the class I amino-acyl tRNA synthetases, pantothenate synthetase (PanC), ATP sulfurylase, and the cytidylyltransferases, all of which have a conserved dinucleotide-binding domain." Q#23673 - CGI_10003310 superfamily 241550 40 182 7.27E-82 273.724 cl00015 nt_trans superfamily C - "nucleotidyl transferase superfamily; nt_trans (nucleotidyl transferase) This superfamily includes the class I amino-acyl tRNA synthetases, pantothenate synthetase (PanC), ATP sulfurylase, and the cytidylyltransferases, all of which have a conserved dinucleotide-binding domain." Q#23673 - CGI_10003310 superfamily 245839 642 835 1.36E-75 249.776 cl12020 Anticodon_Ia_like superfamily - - "Anticodon-binding domain of class Ia aminoacyl tRNA synthetases and similar domains; This domain is found in a variety of class Ia aminoacyl tRNA synthetases, C-terminal to the catalytic core domain. It recognizes and specifically binds to the anticodon of the tRNA. Aminoacyl tRNA synthetases catalyze the transfer of cognate amino acids to the 3'-end of their tRNAs by specifically recognizing cognate from non-cognate amino acids. Members include valyl-, leucyl-, isoleucyl-, cysteinyl-, arginyl-, and methionyl-tRNA synthethases. This superfamily also includes a domain from MshC, an enzyme in the mycothiol biosynthetic pathway." Q#23674 - CGI_10003311 superfamily 247639 11 128 3.05E-31 113.71 cl16914 O-FucT_like superfamily N - "GDP-fucose protein O-fucosyltransferase and related proteins; O-fucosyltransferase-like proteins are GDP-fucose dependent enzymes with similarities to the family 1 glycosyltransferases (GT1). 
They are soluble ER proteins that may be proteolytically cleaved from a membrane-associated preprotein, and are involved in the O-fucosylation of protein substrates, the core fucosylation of growth factor receptors, and other processes." Q#23675 - CGI_10004374 superfamily 248021 7 82 8.39E-36 124.745 cl17467 CTP_transf_1 superfamily N - Cytidylyltransferase family; The members of this family are integral membrane protein cytidylyltransferases. The family includes phosphatidate cytidylyltransferase EC:2.7.7.41 as well as Sec59 from yeast. Sec59 is a dolichol kinase EC:2.7.1.108. Q#23676 - CGI_10003560 superfamily 217473 321 586 8.44E-26 108.606 cl03978 Mab-21 superfamily - - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#23676 - CGI_10003560 superfamily 215866 20 104 2.61E-12 65.4244 cl18349 Arrestin_N superfamily N - "Arrestin (or S-antigen), N-terminal domain; Ig-like beta-sandwich fold. Scop reports duplication with C-terminal domain." Q#23677 - CGI_10003562 superfamily 218140 394 969 5.16E-133 411.221 cl04579 Anoctamin superfamily - - "Calcium-activated chloride channel; The family carries eight putative transmembrane domains, and, although it has no similarity to other known channel proteins, it is clearly a calcium-activated ionic channel. It is expressed in various secretory epithelia, the retina and sensory neurons, and mediates receptor-activated chloride currents in diverse physiological processes." Q#23678 - CGI_10003563 superfamily 148752 29 215 2.10E-49 161.934 cl06383 DGCR6 superfamily - - "DiGeorge syndrome critical region 6 (DGCR6) protein; This family contains DiGeorge syndrome critical region 6 (DGCR6) proteins (approximately 200 residues long) of a number of vertebrates. 
DGCR6 is a candidate for involvement in the DiGeorge syndrome pathology by playing a role in neural crest cell migration into the third and fourth pharyngeal pouches, the structures from which derive the organs affected in DiGeorge syndrome. Also found in this family is the Drosophila melanogaster gonadal protein gdl." Q#23679 - CGI_10004129 superfamily 247057 40 108 1.99E-29 103.117 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#23680 - CGI_10004130 superfamily 222150 59 84 0.00332989 34.6749 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#23681 - CGI_10004131 superfamily 201536 212 309 2.27E-39 138.532 cl03053 UDPG_MGDP_dh superfamily - - "UDP-glucose/GDP-mannose dehydrogenase family, central domain; The UDP-glucose/GDP-mannose dehydrogenaseses are a small group of enzymes which possesses the ability to catalyze the NAD-dependent 2-fold oxidation of an alcohol to an acid without the release of an aldehyde intermediate." 
Q#23681 - CGI_10004131 superfamily 248301 331 446 3.02E-32 118.448 cl17747 UDPG_MGDP_dh_C superfamily - - "UDP-glucose/GDP-mannose dehydrogenase family, UDP binding domain; The UDP-glucose/GDP-mannose dehydrogenaseses are a small group of enzymes which possesses the ability to catalyze the NAD-dependent 2-fold oxidation of an alcohol to an acid without the release of an aldehyde intermediate." Q#23682 - CGI_10004132 superfamily 245596 20 236 3.27E-88 277.181 cl11394 Glyco_tranf_GTA_type superfamily - - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#23682 - CGI_10004132 superfamily 247677 508 666 1.38E-45 159.732 cl17013 W2 superfamily - - "C-terminal domain of eIF4-gamma/eIF5/eIF2b-epsilon; This domain is found at the C-terminus of several translation initiation factors, including the epsilon chain of eIF2b, where it has been found to catalyze the conversion of eIF2.GDP to its active eIF2.GTP form. 
The structure of the domain resembles that of a set of concatenated HEAT repeats." Q#23682 - CGI_10004132 superfamily 193687 339 417 2.57E-31 117.679 cl00160 LbetaH superfamily - - "Left-handed parallel beta-Helix (LbetaH or LbH) domain: The alignment contains 5 turns, each containing three imperfect tandem repeats of a hexapeptide repeat motif (X-[STAV]-X-[LIV]-[GAED]-X). Proteins containing hexapeptide repeats are often enzymes showing acyltransferase activity, however, some subfamilies in this hierarchy also show activities related to ion transport or translation initiation. Many are trimeric in their active forms." Q#23683 - CGI_10004133 superfamily 248097 15 100 4.62E-17 71.9126 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#23684 - CGI_10004134 superfamily 248097 12 116 2.46E-17 75.7646 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#23684 - CGI_10004134 superfamily 248097 178 303 2.95E-15 69.9866 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#23685 - CGI_10004135 superfamily 248097 14 117 1.63E-15 67.6754 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#23686 - CGI_10007022 superfamily 241546 2450 2571 1.08E-50 178.624 cl00011 PLAT superfamily - - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats.The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." 
Q#23686 - CGI_10007022 superfamily 248011 484 529 0.00126599 40.1726 cl17457 PKD superfamily N - "polycystic kidney disease I (PKD) domain; similar to other cell-surface modules, with an IG-like fold; domain probably functions as a ligand binding site in protein-protein or protein-carbohydrate interactions; a single instance of the repeat is presented here. The domain is also found in microbial collagenases and chitinases." Q#23686 - CGI_10007022 superfamily 243093 1 59 6.13E-10 59.081 cl02568 WSC superfamily N - WSC domain; This domain may be involved in carbohydrate binding. Q#23686 - CGI_10007022 superfamily 243086 2340 2380 4.55E-09 55.8442 cl02559 GPS superfamily - - "Latrophilin/CL-1-like GPS domain; Domain present in latrophilin/CL-1, sea urchin REJ and polycystin." Q#23686 - CGI_10007022 superfamily 248011 77 146 0.0013277 40.0882 cl17457 PKD superfamily - - "polycystic kidney disease I (PKD) domain; similar to other cell-surface modules, with an IG-like fold; domain probably functions as a ligand binding site in protein-protein or protein-carbohydrate interactions; a single instance of the repeat is presented here. The domain is also found in microbial collagenases and chitinases." Q#23686 - CGI_10007022 superfamily 248011 1074 1100 0.00753818 37.8158 cl17457 PKD superfamily N - "polycystic kidney disease I (PKD) domain; similar to other cell-surface modules, with an IG-like fold; domain probably functions as a ligand binding site in protein-protein or protein-carbohydrate interactions; a single instance of the repeat is presented here. The domain is also found in microbial collagenases and chitinases." Q#23687 - CGI_10007023 superfamily 241832 2 134 3.72E-41 136.491 cl00388 Thioredoxin_like superfamily N - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. 
They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#23688 - CGI_10007024 superfamily 241832 36 68 4.78E-09 48.2805 cl00388 Thioredoxin_like superfamily C - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#23690 - CGI_10007026 superfamily 242031 5 183 3.47E-57 180.081 cl00689 TYW3 superfamily - - Methyltransferase TYW3; The methyltransferase TYW3 (tRNA-yW- synthesising protein 3) has been identified in yeast to be involved in wybutosine (yW) biosynthesis. 
yW is a complexly modified guanosine residue that contains a tricyclic base and is found at the 3' position adjacent the anticodon of phenylalanine tRNA. TYW3 is an N-4 methylase that methylates yW-86 to yield yW-72 in an Ado-Met-dependent manner. Q#23691 - CGI_10007027 superfamily 243037 319 476 2.72E-65 210.121 cl02440 DAGK_acc superfamily - - Diacylglycerol kinase accessory domain; Diacylglycerol (DAG) is a second messenger that acts as a protein kinase C activator. This domain is assumed to be an accessory domain: its function is unknown. Q#23691 - CGI_10007027 superfamily 248019 175 300 7.51E-36 129.724 cl17465 DAGK_cat superfamily - - "Diacylglycerol kinase catalytic domain; Diacylglycerol (DAG) is a second messenger that acts as a protein kinase C activator. The catalytic domain is assumed from the finding of bacterial homologues. YegS is the Escherichia coli protein in this family whose crystal structure reveals an active site in the inter-domain cleft formed by four conserved sequence motifs, revealing a novel metal-binding site. The residues of this site are conserved across the family." Q#23691 - CGI_10007027 superfamily 241566 81 133 7.18E-07 46.6787 cl00040 C1 superfamily - - "Protein kinase C conserved region 1 (C1) . Cysteine-rich zinc binding domain. Some members of this domain family bind phorbol esters and diacylglycerol, some are reported to bind RasGTP. May occur in tandem arrangement. Diacylglycerol (DAG) is a second messenger, released by activation of Phospholipase D. Phorbol Esters (PE) can act as analogues of DAG and mimic its downstream effects in, for example, tumor promotion. Protein Kinases C are activated by DAG/PE, this activation is mediated by their N-terminal conserved region (C1). DAG/PE binding may be phospholipid dependent. 
C1 domains may also mediate DAG/PE signals in chimaerins (a family of Rac GTPase activating proteins), RasGRPs (exchange factors for Ras/Rap1), and Munc13 isoforms (scaffolding proteins involved in exocytosis)." Q#23693 - CGI_10007029 superfamily 243064 176 268 0.00702378 34.5775 cl02512 NTR_like superfamily C - "NTR_like domain; a beta barrel with an oligosaccharide/oligonucleotide-binding fold found in netrins, complement proteins, tissue inhibitors of metalloproteases (TIMP), and procollagen C-proteinase enhancers (PCOLCE), amongst others. In netrins, the domain plays a role in controlling axon branching in neural development, while the common function of these modules in TIMPs appears to be binding to metzincins. A subset of this family is also known as the C345C domain because it occurs as a C-terminal domain in complement C3, C4 and C5. In C5, the domain interacts with various partners during the formation of the membrane attack complex." Q#23695 - CGI_10007031 superfamily 245596 2 180 9.18E-88 264.32 cl11394 Glyco_tranf_GTA_type superfamily - - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. 
The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#23697 - CGI_10007033 superfamily 245240 7 149 3.05E-43 156.344 cl10034 LabA_like/DUF88 superfamily - - "LabA_like proteins; The LabA-like superfamily is composed of a well conserved group of bacterial proteins with no defined function. LabA, a member from Synechococcus elongatus PCC 7942, has been shown to play a role in cyanobacterial circadian timing." Q#23697 - CGI_10007033 superfamily 246749 952 1022 5.28E-38 138.711 cl14879 LabA_like_C superfamily - - "C-terminal domain of LabA_like proteins; This C-terminal domain is found in a well conserved group of mainly bacterial proteins with no defined function, which contain an N-terminal LabA-like domain. LabA from Synechococcus elongatus PCC 7942, (which does not contain this C-terminal domain) has been shown to play a role in cyanobacterial circadian timing. LabA-like C-terminal domains described here may be related to the LOTUS domain family (which also co-occurs with LabA-like N-terminal domains)." Q#23697 - CGI_10007033 superfamily 247723 167 237 8.01E-22 92.3358 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. 
The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#23697 - CGI_10007033 superfamily 246749 795 863 6.22E-33 124.12 cl14879 LabA_like_C superfamily - - "C-terminal domain of LabA_like proteins; This C-terminal domain is found in a well conserved group of mainly bacterial proteins with no defined function, which contain an N-terminal LabA-like domain. LabA from Synechococcus elongatus PCC 7942, (which does not contain this C-terminal domain) has been shown to play a role in cyanobacterial circadian timing. LabA-like C-terminal domains described here may be related to the LOTUS domain family (which also co-occurs with LabA-like N-terminal domains)." Q#23697 - CGI_10007033 superfamily 247723 495 572 7.41E-27 107.422 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." 
Q#23697 - CGI_10007033 superfamily 246749 699 767 6.01E-24 98.5196 cl14879 LabA_like_C superfamily - - "C-terminal domain of LabA_like proteins; This C-terminal domain is found in a well conserved group of mainly bacterial proteins with no defined function, which contain an N-terminal LabA-like domain. LabA from Synechococcus elongatus PCC 7942, (which does not contain this C-terminal domain) has been shown to play a role in cyanobacterial circadian timing. LabA-like C-terminal domains described here may be related to the LOTUS domain family (which also co-occurs with LabA-like N-terminal domains)." Q#23697 - CGI_10007033 superfamily 246749 868 939 7.59E-19 83.5625 cl14879 LabA_like_C superfamily - - "C-terminal domain of LabA_like proteins; This C-terminal domain is found in a well conserved group of mainly bacterial proteins with no defined function, which contain an N-terminal LabA-like domain. LabA from Synechococcus elongatus PCC 7942, (which does not contain this C-terminal domain) has been shown to play a role in cyanobacterial circadian timing. LabA-like C-terminal domains described here may be related to the LOTUS domain family (which also co-occurs with LabA-like N-terminal domains)." Q#23697 - CGI_10007033 superfamily 246749 587 643 1.93E-12 64.9091 cl14879 LabA_like_C superfamily - - "C-terminal domain of LabA_like proteins; This C-terminal domain is found in a well conserved group of mainly bacterial proteins with no defined function, which contain an N-terminal LabA-like domain. LabA from Synechococcus elongatus PCC 7942, (which does not contain this C-terminal domain) has been shown to play a role in cyanobacterial circadian timing. LabA-like C-terminal domains described here may be related to the LOTUS domain family (which also co-occurs with LabA-like N-terminal domains)." 
Q#23697 - CGI_10007033 superfamily 246749 1169 1237 1.51E-09 56.842 cl14879 LabA_like_C superfamily - - "C-terminal domain of LabA_like proteins; This C-terminal domain is found in a well conserved group of mainly bacterial proteins with no defined function, which contain an N-terminal LabA-like domain. LabA from Synechococcus elongatus PCC 7942, (which does not contain this C-terminal domain) has been shown to play a role in cyanobacterial circadian timing. LabA-like C-terminal domains described here may be related to the LOTUS domain family (which also co-occurs with LabA-like N-terminal domains)." Q#23697 - CGI_10007033 superfamily 246749 1029 1098 5.23E-09 54.8818 cl14879 LabA_like_C superfamily - - "C-terminal domain of LabA_like proteins; This C-terminal domain is found in a well conserved group of mainly bacterial proteins with no defined function, which contain an N-terminal LabA-like domain. LabA from Synechococcus elongatus PCC 7942, (which does not contain this C-terminal domain) has been shown to play a role in cyanobacterial circadian timing. LabA-like C-terminal domains described here may be related to the LOTUS domain family (which also co-occurs with LabA-like N-terminal domains)." Q#23698 - CGI_10007034 superfamily 243263 83 235 8.90E-28 111.744 cl02990 ASC superfamily NC - Amiloride-sensitive sodium channel; Amiloride-sensitive sodium channel. Q#23698 - CGI_10007034 superfamily 243263 19 78 4.72E-09 55.1066 cl02990 ASC superfamily C - Amiloride-sensitive sodium channel; Amiloride-sensitive sodium channel. Q#23699 - CGI_10007035 superfamily 243263 16 111 5.61E-21 86.693 cl02990 ASC superfamily C - Amiloride-sensitive sodium channel; Amiloride-sensitive sodium channel. Q#23701 - CGI_10009597 superfamily 241600 140 227 1.46E-38 136.987 cl00085 FReD superfamily NC - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. 
The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#23702 - CGI_10009598 superfamily 241750 75 319 7.47E-28 109.294 cl00281 metallo-dependent_hydrolases superfamily - - "Superfamily of metallo-dependent hydrolases (also called amidohydrolase superfamily) is a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. The family includes urease alpha, adenosine deaminase, phosphotriesterase dihydroorotases, allantoinases, hydantoinases, AMP-, adenine and cytosine deaminases, imidazolonepropionase, aryldialkylphosphatase, chlorohydrolases, formylmethanofuran dehydrogenases and others." Q#23703 - CGI_10009599 superfamily 241600 1 57 1.10E-13 61.4875 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. 
C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#23704 - CGI_10009600 superfamily 241600 142 342 4.00E-74 230.59 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#23705 - CGI_10009601 superfamily 247725 330 471 5.36E-67 217.183 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#23705 - CGI_10009601 superfamily 243096 147 326 1.28E-28 112.776 cl02571 RhoGEF superfamily - - Guanine nucleotide exchange factor for Rho/Rac/Cdc42-like GTPases; Also called Dbl-homologous (DH) domain. 
It appears that PH domains invariably occur C-terminal to RhoGEF/DH domains. Q#23707 - CGI_10009603 superfamily 247792 18 68 3.60E-06 44.744 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#23708 - CGI_10009604 superfamily 248054 6 220 5.86E-14 70.0239 cl17500 NAD_binding_8 superfamily - - NAD(P)-binding Rossmann-like domain; NAD(P)-binding Rossmann-like domain. Q#23710 - CGI_10009606 superfamily 245882 36 416 2.05E-174 497.969 cl12119 Alpha_L_fucos superfamily - - Alpha-L-fucosidase; Alpha-L-fucosidase. Q#23713 - CGI_10009609 superfamily 243362 304 345 0.00565431 36.2491 cl03262 DnaJ_C superfamily N - C-terminal substrate binding domain of DnaJ and HSP40; The C-terminal region of the DnaJ/Hsp40 protein mediates oligomerization and binding to denatured polypeptide substrate. DnaJ/Hsp40 is a widely conserved heat-shock protein. It prevents the aggregation of unfolded substrate and forms a ternary complex with both substrate and DnaK/Hsp70; the N-terminal J-domain of DnaJ/Hsp40 stimulates the ATPase activity of DnaK/Hsp70. Q#23713 - CGI_10009609 superfamily 110440 431 457 0.00670749 34.6909 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. 
The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#23716 - CGI_10003716 superfamily 243263 72 424 3.92E-44 160.651 cl02990 ASC superfamily - - Amiloride-sensitive sodium channel; Amiloride-sensitive sodium channel. Q#23717 - CGI_10003717 superfamily 243263 70 430 2.66E-68 226.135 cl02990 ASC superfamily - - Amiloride-sensitive sodium channel; Amiloride-sensitive sodium channel. Q#23719 - CGI_10003958 superfamily 221522 226 573 5.62E-120 362.082 cl18610 KBP_C superfamily - - KIF-1 binding protein C terminal; This family of proteins is found in bacteria and eukaryotes. Proteins in this family are typically between 365 and 621 amino acids in length. There is a conserved LLP sequence motif. KBP is a binding partner for KIF1Balpha that is a regulator of its transport function and thus represents a type of kinesin interacting protein. Q#23720 - CGI_10003959 superfamily 247725 431 566 1.19E-91 291.901 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." 
Q#23720 - CGI_10003959 superfamily 218982 624 717 1.53E-23 97.6211 cl05682 NumbF superfamily - - NUMB domain; This presumed domain is found in the Numb family of proteins adjacent to the PTB domain. Q#23720 - CGI_10003959 superfamily 218982 1011 1104 1.53E-23 97.6211 cl05682 NumbF superfamily - - NUMB domain; This presumed domain is found in the Numb family of proteins adjacent to the PTB domain. Q#23720 - CGI_10003959 superfamily 247725 932 953 2.92E-07 50.3808 cl17171 PH-like superfamily N - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#23720 - CGI_10003959 superfamily 247724 232 293 0.00770896 37.7586 cl17170 Ras_like_GTPase superfamily NC - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#23721 - CGI_10003960 superfamily 241568 666 718 0.000215757 40.9092 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#23721 - CGI_10003960 superfamily 241568 589 630 0.0034153 37.4424 cl00043 CCP superfamily C - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. 
The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#23721 - CGI_10003960 superfamily 111397 718 801 2.33E-12 65.055 cl03620 HYR superfamily - - "HYR domain; This domain is known as the HYR (Hyalin Repeat) domain, after the protein hyalin that is composed exclusively of this repeat. This domain probably corresponds to a new superfamily in the immunoglobulin fold. The function of this domain is uncertain it may be involved in cell adhesion." Q#23721 - CGI_10003960 superfamily 219525 1331 1376 1.14E-10 59.3549 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#23721 - CGI_10003960 superfamily 111397 1 63 5.59E-10 58.1214 cl03620 HYR superfamily N - "HYR domain; This domain is known as the HYR (Hyalin Repeat) domain, after the protein hyalin that is composed exclusively of this repeat. This domain probably corresponds to a new superfamily in the immunoglobulin fold. The function of this domain is uncertain it may be involved in cell adhesion." Q#23721 - CGI_10003960 superfamily 219525 414 454 3.44E-08 52.0361 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#23721 - CGI_10003960 superfamily 111397 492 571 1.62E-06 47.7211 cl03620 HYR superfamily - - "HYR domain; This domain is known as the HYR (Hyalin Repeat) domain, after the protein hyalin that is composed exclusively of this repeat. This domain probably corresponds to a new superfamily in the immunoglobulin fold. The function of this domain is uncertain it may be involved in cell adhesion." Q#23721 - CGI_10003960 superfamily 219525 1118 1168 0.000334807 40.095 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. 
Q#23721 - CGI_10003960 superfamily 241568 68 119 0.00443967 36.7844 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#23721 - CGI_10003960 superfamily 241568 806 857 0.00443967 36.7844 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#23723 - CGI_10016614 superfamily 245226 26 221 1.51E-40 143.964 cl10012 DnaQ_like_exo superfamily - - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. 
The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#23723 - CGI_10016614 superfamily 245226 296 369 0.000450435 39.9603 cl10012 DnaQ_like_exo superfamily N - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. 
DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#23724 - CGI_10016615 superfamily 245226 25 191 4.70E-25 100.822 cl10012 DnaQ_like_exo superfamily - - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." 
Q#23725 - CGI_10016616 superfamily 245226 22 106 5.40E-24 91.1919 cl10012 DnaQ_like_exo superfamily C - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#23726 - CGI_10016617 superfamily 242406 3 87 3.87E-11 55.2901 cl01271 DUF1768 superfamily N - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. Q#23727 - CGI_10016618 superfamily 246918 265 317 1.87E-15 73.0047 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. 
Q#23727 - CGI_10016618 superfamily 246918 436 488 6.98E-15 71.4639 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#23727 - CGI_10016618 superfamily 216897 17 83 1.18E-14 71.5585 cl03463 Gal_Lectin superfamily - - Galactose binding lectin domain; Galactose binding lectin domain. Q#23727 - CGI_10016618 superfamily 246918 322 374 8.15E-14 68.3823 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#23727 - CGI_10016618 superfamily 246918 208 260 2.37E-12 64.1451 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#23727 - CGI_10016618 superfamily 246918 397 430 0.000186547 41.0331 cl15278 TSP_1 superfamily N - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#23727 - CGI_10016618 superfamily 246918 101 144 0.00257133 37.5663 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#23728 - CGI_10016619 superfamily 219621 30 157 4.45E-20 82.4091 cl06777 Rrp15p superfamily - - Rrp15p; Rrp15p is required for the formation of 60S ribosomal subunits. 
Q#23729 - CGI_10016620 superfamily 247905 349 461 4.24E-26 103.857 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#23729 - CGI_10016620 superfamily 247805 131 337 1.23E-55 187.692 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#23730 - CGI_10016621 superfamily 246908 335 435 6.12E-49 164.87 cl15255 SH2 superfamily - - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." Q#23730 - CGI_10016621 superfamily 243073 444 491 6.69E-15 69.6355 cl02533 SOCS superfamily - - "SOCS (suppressors of cytokine signaling) box. 
The SOCS box is found in the C-terminal region of CIS/SOCS family proteins (in combination with a SH2 domain), ASBs (ankyrin repeat-containing proteins with a SOCS box), SSBs (SPRY domain-containing proteins with a SOCS box), and WSBs (WD40 repeat-containing proteins with a SOCS box), as well as, other miscellaneous proteins. The function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions." Q#23731 - CGI_10016622 superfamily 243092 11 284 2.32E-33 131.303 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#23731 - CGI_10016622 superfamily 241597 936 999 5.50E-05 42.2208 cl00082 HMG-box superfamily - - "High Mobility Group (HMG)-box is found in a variety of eukaryotic chromosomal proteins and transcription factors. HMGs bind to the minor groove of DNA and have been classified by DNA binding preferences. Two phylogenically distinct groups of Class I proteins bind DNA in a sequence specific fashion and contain a single HMG box. One group (SOX-TCF) includes transcription factors, TCF-1, -3, -4; and also SRY and LEF-1, which bind four-way DNA junctions and duplex DNA targets. The second group (MATA) includes fungal mating type gene products MC, MATA1 and Ste11. Class II and III proteins (HMGB-UBF) bind DNA in a non-sequence specific fashion and contain two or more tandem HMG boxes. Class II members include non-histone chromosomal proteins, HMG1 and HMG2, which bind to bent or distorted DNA such as four-way DNA junctions, synthetic DNA cruciforms, kinked cisplatin-modified DNA, DNA bulges, cross-overs in supercoiled DNA, and can cause looping of linear DNA. Class III members include nucleolar and mitochondrial transcription factors, UBF and mtTF1, which bind four-way DNA junctions." Q#23731 - CGI_10016622 superfamily 204888 495 521 8.86E-07 47.0466 cl13738 DUF3639 superfamily - - "Protein of unknown function (DUF3639); This domain family is found in eukaryotes, and is approximately 30 amino acids in length. The family is found in association with pfam00400. There are two completely conserved residues (E and R) that may be functionally important." Q#23733 - CGI_10016624 superfamily 241789 12 128 5.48E-21 82.524 cl00328 Ribosomal_L14 superfamily - - Ribosomal protein L14p/L23e; Ribosomal protein L14p/L23e. Q#23734 - CGI_10016625 superfamily 246925 188 356 1.24E-16 80.0921 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. 
LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#23735 - CGI_10016626 superfamily 241571 222 337 2.67E-34 122.906 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#23735 - CGI_10016626 superfamily 241583 129 183 9.88E-12 61.8182 cl00064 ZnMc superfamily C - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#23736 - CGI_10016627 superfamily 246597 196 452 5.18E-147 422.795 cl13995 MPP_superfamily superfamily - - "metallophosphatase superfamily, metallophosphatase domain; Metallophosphatases (MPPs), also known as metallophosphoesterases, phosphodiesterases (PDEs), binuclear metallophosphoesterases, and dimetal-containing phosphoesterases (DMPs), represent a diverse superfamily of enzymes with a conserved domain containing an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. This superfamily includes: the phosphoprotein phosphatases (PPPs), Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). 
The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination." Q#23739 - CGI_10016630 superfamily 241548 4 459 0 826.754 cl00013 Lyase_I_like superfamily - - "Lyase class I_like superfamily: contains the lyase class I family, histidine ammonia-lyase and phenylalanine ammonia-lyase, which catalyze similar beta-elimination reactions; Lyase class I_like superfamily of enzymes that catalyze beta-elimination reactions and are active as homotetramers. The four active sites of the homotetrameric enzyme are each formed by residues from three different subunits. This superfamily contains the lyase class I family, histidine ammonia-lyase and phenylalanine ammonia-lyase. The lyase class I family comprises proteins similar to class II fumarase, aspartase, adenylosuccinate lyase, argininosuccinate lyase, and 3-carboxy-cis, cis-muconate lactonizing enzyme which, for the most part catalyze similar beta-elimination reactions in which a C-N or C-O bond is cleaved with the release of fumarate as one of the products. Histidine or phenylalanine ammonia-lyase catalyze a beta-elimination of ammonia from histidine and phenylalanine, respectively." Q#23740 - CGI_10016631 superfamily 245201 71 183 5.27E-28 108.778 cl09925 PKc_like superfamily C - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." 
Q#23741 - CGI_10016632 superfamily 247856 499 557 1.58E-08 51.7797 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#23741 - CGI_10016632 superfamily 246925 158 434 1.50E-28 115.53 cl15309 LRR_RI superfamily - - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#23742 - CGI_10016633 superfamily 246925 141 432 1.98E-25 106.286 cl15309 LRR_RI superfamily - - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#23745 - CGI_10016636 superfamily 193409 408 476 1.78E-17 78.0113 cl15178 Kri1_C superfamily N - KRI1-like family C-terminal; The yeast member of this family (Kri1p) is found to be required for 40S ribosome biogenesis in the nucleolus. 
This is the C-terminal domain of the family. Q#23745 - CGI_10016636 superfamily 218482 292 354 6.96E-12 62.2793 cl04969 Kri1 superfamily C - KRI1-like family; The yeast member of this family (Kri1p) is found to be required for 40S ribosome biogenesis in the nucleolus. Q#23746 - CGI_10008055 superfamily 241591 26 97 1.79E-27 102.314 cl00073 H15 superfamily - - "linker histone 1 and histone 5 domains; the basic subunit of chromatin is the nucleosome, consisting of an octamer of core histones, two full turns of DNA, a linker histone (H1 or H5) and a variable length of linker DNA; H1/H5 are chromatin-associated proteins that bind to the exterior of nucleosomes and dramatically stabilize the highly condensed states of chromatin fibers; stabilization of higher order folding occurs through electrostatic neutralization of the linker DNA segments, through a highly positively charged carboxy- terminal domain known as the AKP helix (Ala, Lys, Pro); thought to be involved in specific protein-protein and protein-DNA interactions and play a role in suppressing core histone tail domain acetylation in the chromatin fiber" Q#23746 - CGI_10008055 superfamily 241592 162 264 5.40E-68 209.376 cl00074 H2A superfamily N - "Histone 2A; H2A is a subunit of the nucleosome. The nucleosome is an octamer containing two H2A, H2B, H3, and H4 subunits. The H2A subunit performs essential roles in maintaining structural integrity of the nucleosome, chromatin condensation, and binding of specific chromatin-associated proteins." Q#23747 - CGI_10008056 superfamily 241592 22 101 4.51E-39 127.232 cl00074 H2A superfamily - - "Histone 2A; H2A is a subunit of the nucleosome. The nucleosome is an octamer containing two H2A, H2B, H3, and H4 subunits. The H2A subunit performs essential roles in maintaining structural integrity of the nucleosome, chromatin condensation, and binding of specific chromatin-associated proteins." 
Q#23748 - CGI_10008057 superfamily 241592 33 121 1.27E-51 160.375 cl00074 H2A superfamily - - "Histone 2A; H2A is a subunit of the nucleosome. The nucleosome is an octamer containing two H2A, H2B, H3, and H4 subunits. The H2A subunit performs essential roles in maintaining structural integrity of the nucleosome, chromatin condensation, and binding of specific chromatin-associated proteins." Q#23749 - CGI_10008058 superfamily 241592 23 119 7.20E-43 139.007 cl00074 H2A superfamily - - "Histone 2A; H2A is a subunit of the nucleosome. The nucleosome is an octamer containing two H2A, H2B, H3, and H4 subunits. The H2A subunit performs essential roles in maintaining structural integrity of the nucleosome, chromatin condensation, and binding of specific chromatin-associated proteins." Q#23750 - CGI_10008059 superfamily 241591 26 97 7.94E-25 93.8399 cl00073 H15 superfamily - - "linker histone 1 and histone 5 domains; the basic subunit of chromatin is the nucleosome, consisting of an octamer of core histones, two full turns of DNA, a linker histone (H1 or H5) and a variable length of linker DNA; H1/H5 are chromatin-associated proteins that bind to the exterior of nucleosomes and dramatically stabilize the highly condensed states of chromatin fibers; stabilization of higher order folding occurs through electrostatic neutralization of the linker DNA segments, through a highly positively charged carboxy- terminal domain known as the AKP helix (Ala, Lys, Pro); thought to be involved in specific protein-protein and protein-DNA interactions and play a role in suppressing core histone tail domain acetylation in the chromatin fiber" Q#23751 - CGI_10008060 superfamily 241592 1 136 5.16E-83 242.503 cl00074 H2A superfamily - - "Histone 2A; H2A is a subunit of the nucleosome. The nucleosome is an octamer containing two H2A, H2B, H3, and H4 subunits. 
The H2A subunit performs essential roles in maintaining structural integrity of the nucleosome, chromatin condensation, and binding of specific chromatin-associated proteins." Q#23752 - CGI_10008061 superfamily 241592 22 101 4.51E-39 127.232 cl00074 H2A superfamily - - "Histone 2A; H2A is a subunit of the nucleosome. The nucleosome is an octamer containing two H2A, H2B, H3, and H4 subunits. The H2A subunit performs essential roles in maintaining structural integrity of the nucleosome, chromatin condensation, and binding of specific chromatin-associated proteins." Q#23753 - CGI_10008062 superfamily 241592 25 121 1.52E-51 160.375 cl00074 H2A superfamily - - "Histone 2A; H2A is a subunit of the nucleosome. The nucleosome is an octamer containing two H2A, H2B, H3, and H4 subunits. The H2A subunit performs essential roles in maintaining structural integrity of the nucleosome, chromatin condensation, and binding of specific chromatin-associated proteins." Q#23754 - CGI_10008063 superfamily 241592 23 119 7.20E-43 139.007 cl00074 H2A superfamily - - "Histone 2A; H2A is a subunit of the nucleosome. The nucleosome is an octamer containing two H2A, H2B, H3, and H4 subunits. The H2A subunit performs essential roles in maintaining structural integrity of the nucleosome, chromatin condensation, and binding of specific chromatin-associated proteins." Q#23755 - CGI_10008064 superfamily 243072 170 216 1.42E-09 55.8526 cl02529 ANK superfamily C - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#23755 - CGI_10008064 superfamily 215882 356 480 4.92E-10 57.2906 cl09511 FERM_M superfamily - - FERM central domain; This domain is the central structural domain of the FERM domain. Q#23755 - CGI_10008064 superfamily 247725 461 554 0.000276777 39.8865 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#23756 - CGI_10008065 superfamily 241592 23 119 7.20E-43 139.007 cl00074 H2A superfamily - - "Histone 2A; H2A is a subunit of the nucleosome. The nucleosome is an octamer containing two H2A, H2B, H3, and H4 subunits. The H2A subunit performs essential roles in maintaining structural integrity of the nucleosome, chromatin condensation, and binding of specific chromatin-associated proteins." Q#23757 - CGI_10008066 superfamily 241592 1 136 5.16E-83 242.503 cl00074 H2A superfamily - - "Histone 2A; H2A is a subunit of the nucleosome. The nucleosome is an octamer containing two H2A, H2B, H3, and H4 subunits. The H2A subunit performs essential roles in maintaining structural integrity of the nucleosome, chromatin condensation, and binding of specific chromatin-associated proteins." Q#23758 - CGI_10008067 superfamily 241592 22 101 4.51E-39 127.232 cl00074 H2A superfamily - - "Histone 2A; H2A is a subunit of the nucleosome. The nucleosome is an octamer containing two H2A, H2B, H3, and H4 subunits. 
The H2A subunit performs essential roles in maintaining structural integrity of the nucleosome, chromatin condensation, and binding of specific chromatin-associated proteins." Q#23759 - CGI_10008068 superfamily 241592 25 121 1.52E-51 160.375 cl00074 H2A superfamily - - "Histone 2A; H2A is a subunit of the nucleosome. The nucleosome is an octamer containing two H2A, H2B, H3, and H4 subunits. The H2A subunit performs essential roles in maintaining structural integrity of the nucleosome, chromatin condensation, and binding of specific chromatin-associated proteins." Q#23760 - CGI_10008069 superfamily 241592 23 119 7.20E-43 139.007 cl00074 H2A superfamily - - "Histone 2A; H2A is a subunit of the nucleosome. The nucleosome is an octamer containing two H2A, H2B, H3, and H4 subunits. The H2A subunit performs essential roles in maintaining structural integrity of the nucleosome, chromatin condensation, and binding of specific chromatin-associated proteins." Q#23761 - CGI_10008070 superfamily 241591 23 97 2.26E-26 98.0771 cl00073 H15 superfamily - - "linker histone 1 and histone 5 domains; the basic subunit of chromatin is the nucleosome, consisting of an octamer of core histones, two full turns of DNA, a linker histone (H1 or H5) and a variable length of linker DNA; H1/H5 are chromatin-associated proteins that bind to the exterior of nucleosomes and dramatically stabilize the highly condensed states of chromatin fibers; stabilization of higher order folding occurs through electrostatic neutralization of the linker DNA segments, through a highly positively charged carboxy- terminal domain known as the AKP helix (Ala, Lys, Pro); thought to be involved in specific protein-protein and protein-DNA interactions and play a role in suppressing core histone tail domain acetylation in the chromatin fiber" Q#23762 - CGI_10008071 superfamily 241592 1 136 5.16E-83 242.503 cl00074 H2A superfamily - - "Histone 2A; H2A is a subunit of the nucleosome. 
The nucleosome is an octamer containing two H2A, H2B, H3, and H4 subunits. The H2A subunit performs essential roles in maintaining structural integrity of the nucleosome, chromatin condensation, and binding of specific chromatin-associated proteins." Q#23763 - CGI_10008072 superfamily 241592 22 101 4.51E-39 127.232 cl00074 H2A superfamily - - "Histone 2A; H2A is a subunit of the nucleosome. The nucleosome is an octamer containing two H2A, H2B, H3, and H4 subunits. The H2A subunit performs essential roles in maintaining structural integrity of the nucleosome, chromatin condensation, and binding of specific chromatin-associated proteins." Q#23764 - CGI_10008073 superfamily 241592 25 121 1.52E-51 160.375 cl00074 H2A superfamily - - "Histone 2A; H2A is a subunit of the nucleosome. The nucleosome is an octamer containing two H2A, H2B, H3, and H4 subunits. The H2A subunit performs essential roles in maintaining structural integrity of the nucleosome, chromatin condensation, and binding of specific chromatin-associated proteins." Q#23765 - CGI_10008074 superfamily 241592 23 119 7.20E-43 139.007 cl00074 H2A superfamily - - "Histone 2A; H2A is a subunit of the nucleosome. The nucleosome is an octamer containing two H2A, H2B, H3, and H4 subunits. The H2A subunit performs essential roles in maintaining structural integrity of the nucleosome, chromatin condensation, and binding of specific chromatin-associated proteins." 
Q#23766 - CGI_10008075 superfamily 241591 23 97 2.26E-26 98.0771 cl00073 H15 superfamily - - "linker histone 1 and histone 5 domains; the basic subunit of chromatin is the nucleosome, consisting of an octamer of core histones, two full turns of DNA, a linker histone (H1 or H5) and a variable length of linker DNA; H1/H5 are chromatin-associated proteins that bind to the exterior of nucleosomes and dramatically stabilize the highly condensed states of chromatin fibers; stabilization of higher order folding occurs through electrostatic neutralization of the linker DNA segments, through a highly positively charged carboxy- terminal domain known as the AKP helix (Ala, Lys, Pro); thought to be involved in specific protein-protein and protein-DNA interactions and play a role in suppressing core histone tail domain acetylation in the chromatin fiber" Q#23767 - CGI_10008076 superfamily 241592 1 136 5.16E-83 242.503 cl00074 H2A superfamily - - "Histone 2A; H2A is a subunit of the nucleosome. The nucleosome is an octamer containing two H2A, H2B, H3, and H4 subunits. The H2A subunit performs essential roles in maintaining structural integrity of the nucleosome, chromatin condensation, and binding of specific chromatin-associated proteins." Q#23768 - CGI_10008077 superfamily 241592 22 101 4.51E-39 127.232 cl00074 H2A superfamily - - "Histone 2A; H2A is a subunit of the nucleosome. The nucleosome is an octamer containing two H2A, H2B, H3, and H4 subunits. The H2A subunit performs essential roles in maintaining structural integrity of the nucleosome, chromatin condensation, and binding of specific chromatin-associated proteins." Q#23769 - CGI_10008078 superfamily 241592 33 121 6.85E-52 161.145 cl00074 H2A superfamily - - "Histone 2A; H2A is a subunit of the nucleosome. The nucleosome is an octamer containing two H2A, H2B, H3, and H4 subunits. 
The H2A subunit performs essential roles in maintaining structural integrity of the nucleosome, chromatin condensation, and binding of specific chromatin-associated proteins." Q#23770 - CGI_10008079 superfamily 241592 23 119 7.20E-43 139.007 cl00074 H2A superfamily - - "Histone 2A; H2A is a subunit of the nucleosome. The nucleosome is an octamer containing two H2A, H2B, H3, and H4 subunits. The H2A subunit performs essential roles in maintaining structural integrity of the nucleosome, chromatin condensation, and binding of specific chromatin-associated proteins." Q#23771 - CGI_10008080 superfamily 241591 23 97 2.26E-26 98.0771 cl00073 H15 superfamily - - "linker histone 1 and histone 5 domains; the basic subunit of chromatin is the nucleosome, consisting of an octamer of core histones, two full turns of DNA, a linker histone (H1 or H5) and a variable length of linker DNA; H1/H5 are chromatin-associated proteins that bind to the exterior of nucleosomes and dramatically stabilize the highly condensed states of chromatin fibers; stabilization of higher order folding occurs through electrostatic neutralization of the linker DNA segments, through a highly positively charged carboxy- terminal domain known as the AKP helix (Ala, Lys, Pro); thought to be involved in specific protein-protein and protein-DNA interactions and play a role in suppressing core histone tail domain acetylation in the chromatin fiber" Q#23772 - CGI_10008081 superfamily 241592 1 136 5.16E-83 242.503 cl00074 H2A superfamily - - "Histone 2A; H2A is a subunit of the nucleosome. The nucleosome is an octamer containing two H2A, H2B, H3, and H4 subunits. The H2A subunit performs essential roles in maintaining structural integrity of the nucleosome, chromatin condensation, and binding of specific chromatin-associated proteins." Q#23773 - CGI_10008082 superfamily 241592 22 101 4.51E-39 127.232 cl00074 H2A superfamily - - "Histone 2A; H2A is a subunit of the nucleosome. 
The nucleosome is an octamer containing two H2A, H2B, H3, and H4 subunits. The H2A subunit performs essential roles in maintaining structural integrity of the nucleosome, chromatin condensation, and binding of specific chromatin-associated proteins." Q#23774 - CGI_10008083 superfamily 241592 33 121 6.85E-52 161.145 cl00074 H2A superfamily - - "Histone 2A; H2A is a subunit of the nucleosome. The nucleosome is an octamer containing two H2A, H2B, H3, and H4 subunits. The H2A subunit performs essential roles in maintaining structural integrity of the nucleosome, chromatin condensation, and binding of specific chromatin-associated proteins." Q#23775 - CGI_10008084 superfamily 241592 23 119 7.20E-43 139.007 cl00074 H2A superfamily - - "Histone 2A; H2A is a subunit of the nucleosome. The nucleosome is an octamer containing two H2A, H2B, H3, and H4 subunits. The H2A subunit performs essential roles in maintaining structural integrity of the nucleosome, chromatin condensation, and binding of specific chromatin-associated proteins." 
Q#23776 - CGI_10008085 superfamily 241591 23 97 2.32E-26 98.0771 cl00073 H15 superfamily - - "linker histone 1 and histone 5 domains; the basic subunit of chromatin is the nucleosome, consisting of an octamer of core histones, two full turns of DNA, a linker histone (H1 or H5) and a variable length of linker DNA; H1/H5 are chromatin-associated proteins that bind to the exterior of nucleosomes and dramatically stabilize the highly condensed states of chromatin fibers; stabilization of higher order folding occurs through electrostatic neutralization of the linker DNA segments, through a highly positively charged carboxy- terminal domain known as the AKP helix (Ala, Lys, Pro); thought to be involved in specific protein-protein and protein-DNA interactions and play a role in suppressing core histone tail domain acetylation in the chromatin fiber" Q#23777 - CGI_10008086 superfamily 241592 1 157 4.59E-80 235.955 cl00074 H2A superfamily - - "Histone 2A; H2A is a subunit of the nucleosome. The nucleosome is an octamer containing two H2A, H2B, H3, and H4 subunits. The H2A subunit performs essential roles in maintaining structural integrity of the nucleosome, chromatin condensation, and binding of specific chromatin-associated proteins." Q#23778 - CGI_10008087 superfamily 241592 22 101 4.51E-39 127.232 cl00074 H2A superfamily - - "Histone 2A; H2A is a subunit of the nucleosome. The nucleosome is an octamer containing two H2A, H2B, H3, and H4 subunits. The H2A subunit performs essential roles in maintaining structural integrity of the nucleosome, chromatin condensation, and binding of specific chromatin-associated proteins." Q#23780 - CGI_10008089 superfamily 247739 21 199 1.91E-40 138.173 cl17185 LPLAT superfamily - - "Lysophospholipid acyltransferases (LPLATs) of glycerophospholipid biosynthesis; Lysophospholipid acyltransferase (LPLAT) superfamily members are acyltransferases of de novo and remodeling pathways of glycerophospholipid biosynthesis. 
These proteins catalyze the incorporation of an acyl group from either acylCoAs or acyl-acyl carrier proteins (acylACPs) into acceptors such as glycerol 3-phosphate, dihydroxyacetone phosphate or lyso-phosphatidic acid. Included in this superfamily are LPLATs such as glycerol-3-phosphate 1-acyltransferase (GPAT, PlsB), 1-acyl-sn-glycerol-3-phosphate acyltransferase (AGPAT, PlsC), lysophosphatidylcholine acyltransferase 1 (LPCAT-1), lysophosphatidylethanolamine acyltransferase (LPEAT, also known as MBOAT2, membrane-bound O-acyltransferase domain-containing protein 2), lipid A biosynthesis lauroyl/myristoyl acyltransferase, 2-acylglycerol O-acyltransferase (MGAT), dihydroxyacetone phosphate acyltransferase (DHAPAT, also known as 1 glycerol-3-phosphate O-acyltransferase 1) and Tafazzin (the protein product of the Barth syndrome (TAZ) gene)." Q#23781 - CGI_10008090 superfamily 247792 12 55 1.15E-07 48.596 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#23782 - CGI_10008091 superfamily 247792 14 61 1.67E-07 48.2108 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions 
such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#23783 - CGI_10008092 superfamily 247792 11 54 5.44E-08 49.7516 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#23784 - CGI_10008093 superfamily 247792 465 510 1.19E-06 46.67 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#23784 - CGI_10008093 superfamily 247792 9 54 1.52E-06 46.67 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably 
involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#23785 - CGI_10008094 superfamily 243060 202 273 3.17E-09 54.6924 cl02507 SEA superfamily C - "SEA domain; Domain found in Sea urchin sperm protein, Enterokinase, Agrin (SEA). Proposed function of regulating or binding carbohydrate side chains. Recently a proteolytic activity has been shown for a SEA domain." Q#23785 - CGI_10008094 superfamily 243060 338 386 0.00670777 35.4324 cl02507 SEA superfamily C - "SEA domain; Domain found in Sea urchin sperm protein, Enterokinase, Agrin (SEA). Proposed function of regulating or binding carbohydrate side chains. Recently a proteolytic activity has been shown for a SEA domain." Q#23788 - CGI_10005334 superfamily 221377 208 275 1.64E-05 43.2263 cl13449 DUF3504 superfamily C - Domain of unknown function (DUF3504); This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 156 to 173 amino acids in length. Q#23790 - CGI_10005336 superfamily 241643 159 196 5.57E-06 43.2167 cl00153 UBA superfamily - - "Ubiquitin Associated domain. The UBA domain is a commonly occurring sequence motif in some members of the ubiquitination pathway, UV excision repair proteins, and certain protein kinases. Although its specific role is so far unknown, it has been suggested that UBA domains are involved in conferring protein target specificity. The domain, a compact three helix bundle, has a conserved GFP-loop and the proline is thought to be critical for binding. The UBA domain is distinct from the conserved three helical domain seen in the N-terminus of EF-TS and eukaryotic NAC proteins." 
Q#23790 - CGI_10005336 superfamily 241752 279 348 4.98E-08 50.636 cl00283 ADP_ribosyl superfamily N - "ADP_ribosylating enzymes catalyze the transfer of ADP_ribose from NAD+ to substrates. Bacterial toxins are cytoplasmic and catalyze the transfer of a single ADP_ribose unit to eukaryotic elongation factor 2, halting protein synthesis and killing the cell. Poly(ADP-ribose) polymerases (PARPs 1-3, VPARP, tankyrase) catalyze the addition of up to 100 ADP_ribose units from NAD+. PARPs 1 and 2 are localized in the nucleus, bind DNA, and are activated by DNA damage. VPARP is part of the vault ribonucleoprotein complex. Tankyrases regulate telomere length in part through poly(ADP_ribosylation) of telomere repeat binding factor 1 (TRF1). Poly(ADP-ribose) polymerase catalyses the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active." Q#23792 - CGI_10007932 superfamily 245599 214 424 1.38E-94 286.113 cl11397 NR_LBD superfamily - - "The ligand binding domain of nuclear receptors, a family of ligand-activated transcription regulators; Ligand-binding domain (LBD) of nuclear receptor (NR): Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions in metazoans, from development, reproduction, to homeostasis and metabolism. The superfamily contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. The members of the family include receptors of steroids, thyroid hormone, retinoids, cholesterol by-products, lipids and heme. 
With few exceptions, NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD)." Q#23792 - CGI_10007932 superfamily 207662 67 158 1.18E-63 202.095 cl02596 NR_DBD_like superfamily - - "DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers; DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. It interacts with a specific DNA site upstream of the target gene and modulates the rate of transcriptional initiation. Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions, from development, reproduction, to homeostasis and metabolism in animals (metazoans). The family contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). Most nuclear receptors bind as homodimers or heterodimers to their target sites, which consist of two hexameric half-sites. Specificity is determined by the half-site sequence, the relative orientation of the half-sites and the number of spacer nucleotides between the half-sites. However, a growing number of nuclear receptors have been reported to bind to DNA as monomers." Q#23793 - CGI_10007933 superfamily 220904 244 332 2.51E-09 54.6105 cl12494 DUF2781 superfamily N - Protein of unknown function (DUF2781); This is a eukaryotic family of uncharacterized proteins. Some of the proteins in this family are annotated as membrane proteins. 
Q#23795 - CGI_10007935 superfamily 246908 146 239 1.39E-53 177.482 cl15255 SH2 superfamily - - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." Q#23795 - CGI_10007935 superfamily 247683 22 78 4.22E-30 111.627 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." 
Q#23795 - CGI_10007935 superfamily 246908 88 161 7.13E-17 75.955 cl15255 SH2 superfamily - - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." Q#23795 - CGI_10007935 superfamily 245201 255 509 1.66E-154 444.15 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#23797 - CGI_10007937 superfamily 241546 343 440 2.56E-29 110.828 cl00011 PLAT superfamily C - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats. The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." 
Q#23798 - CGI_10007938 superfamily 216363 125 232 4.83E-22 87.9109 cl08312 UPF0029 superfamily - - Uncharacterized protein family UPF0029; Uncharacterized protein family UPF0029. Q#23800 - CGI_10007600 superfamily 241600 1 171 1.20E-55 176.662 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#23801 - CGI_10007601 superfamily 241600 62 232 1.61E-57 183.981 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#23802 - CGI_10007602 superfamily 241600 56 225 3.18E-58 185.522 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. 
Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#23803 - CGI_10007603 superfamily 241600 58 229 5.53E-55 177.433 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#23804 - CGI_10007604 superfamily 241600 18 192 2.84E-56 179.359 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. 
C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#23805 - CGI_10007605 superfamily 215647 239 457 3.53E-29 115.784 cl18338 7tm_2 superfamily - - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GPCRs). They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#23805 - CGI_10007605 superfamily 243029 162 218 1.65E-14 68.9165 cl02422 HRM superfamily - - Hormone receptor domain; This extracellular domain contains four conserved cysteines that probably form disulphide bridges. The domain is found in a variety of hormone receptors. It may be a ligand binding domain. 
Q#23806 - CGI_10007606 superfamily 245599 237 466 2.65E-54 183.407 cl11397 NR_LBD superfamily - - "The ligand binding domain of nuclear receptors, a family of ligand-activated transcription regulators; Ligand-binding domain (LBD) of nuclear receptor (NR): Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions in metazoans, from development, reproduction, to homeostasis and metabolism. The superfamily contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. The members of the family include receptors of steroids, thyroid hormone, retinoids, cholesterol by-products, lipids and heme. With few exceptions, NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD)." Q#23806 - CGI_10007606 superfamily 207662 83 162 2.46E-45 154.353 cl02596 NR_DBD_like superfamily - - "DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers; DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. It interacts with a specific DNA site upstream of the target gene and modulates the rate of transcriptional initiation. Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions, from development, reproduction, to homeostasis and metabolism in animals (metazoans). The family contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). 
Most nuclear receptors bind as homodimers or heterodimers to their target sites, which consist of two hexameric half-sites. Specificity is determined by the half-site sequence, the relative orientation of the half-sites and the number of spacer nucleotides between the half-sites. However, a growing number of nuclear receptors have been reported to bind to DNA as monomers." Q#23808 - CGI_10004080 superfamily 216363 63 143 4.79E-09 49.7762 cl08312 UPF0029 superfamily C - Uncharacterized protein family UPF0029; Uncharacterized protein family UPF0029. Q#23809 - CGI_10004081 superfamily 220371 59 198 2.25E-29 107.776 cl10720 Bud13 superfamily - - "Pre-mRNA-splicing factor of RES complex; This entry is characterized by proteins with alternating conserved and low-complexity regions. Bud13 together with Snu17p and a newly identified factor, Pml1p/Ylr016c, form a novel trimeric complex called the RES complex, the pre-mRNA retention and splicing complex. Subunits of this complex are not essential for viability of yeasts but they are required for efficient splicing in vitro and in vivo. Furthermore, inactivation of this complex causes pre-mRNA leakage from the nucleus. Bud13 contains a unique, phylogenetically conserved C-terminal region of unknown function." Q#23811 - CGI_10004083 superfamily 241594 32 64 0.00178889 34.2032 cl00077 HECTc superfamily N - "HECT domain; C-terminal catalytic domain of a subclass of Ubiquitin-protein ligase (E3). It binds specific ubiquitin-conjugating enzymes (E2), accepts ubiquitin from E2, transfers ubiquitin to substrate lysine side chains, and transfers additional ubiquitin molecules to the end of growing ubiquitin chains." Q#23817 - CGI_10005825 superfamily 245304 13 484 0 773.765 cl10459 Peptidases_S8_S53 superfamily - - "Peptidase domain in the S8 and S53 families; Members of the peptidases S8 (subtilisin and kexin) and S53 (sedolisin) family include endopeptidases and exopeptidases.
The S8 family has an Asp/His/Ser catalytic triad similar to that found in trypsin-like proteases, but do not share their three-dimensional structure and are not homologous to trypsin. Serine acts as a nucleophile, aspartate as an electrophile, and histidine as a base. The S53 family contains a catalytic triad Glu/Asp/Ser with an additional acidic residue Asp in the oxyanion hole, similar to that of subtilisin. The serine residue here is the nucleophilic equivalent of the serine residue in the S8 family, while glutamic acid has the same role here as the histidine base. However, the aspartic acid residue that acts as an electrophile is quite different. In S53, it follows glutamic acid, while in S8 it precedes histidine. The stability of these enzymes may be enhanced by calcium; some members have been shown to bind up to 4 ions via binding sites with different affinity. There is a great diversity in the characteristics of their members: some contain disulfide bonds, some are intracellular while others are extracellular, some function at extreme temperatures, and others at high or low pH values." Q#23817 - CGI_10005825 superfamily 221649 776 963 1.48E-71 237.529 cl13955 TPPII superfamily - - "Tripeptidyl peptidase II; This domain family is found in bacteria and eukaryotes, and is approximately 190 amino acids in length. The family is found in association with pfam00082. Tripeptidyl peptidase II (TPPII) is a crucial component of the proteolytic cascade acting downstream of the 26S proteasome in the ubiquitin-proteasome pathway. It is an amino peptidase belonging to the subtilase family removing tripeptides from the free N terminus of oligopeptides." Q#23818 - CGI_10005826 superfamily 241599 37 93 6.91E-21 82.2912 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." 
Q#23820 - CGI_10005828 superfamily 219502 407 661 2.66E-75 242.35 cl06625 Nucleos_tra2_C superfamily - - Na+ dependent nucleoside transporter C-terminus; This family consists of nucleoside transport proteins. Rat CNT 2 is a purine-specific Na+-nucleoside cotransporter localised to the bile canalicular membrane. CNT 1 is a Na+-dependent nucleoside transporter selective for pyrimidine nucleosides and adenosine; it also transports the anti-viral nucleoside analogues AZT and ddC. This alignment covers the C-terminus of this family of transporters. Q#23820 - CGI_10005828 superfamily 201962 225 295 2.17E-20 86.2756 cl03347 Nucleos_tra2_N superfamily - - Na+ dependent nucleoside transporter N-terminus; This family consists of nucleoside transport proteins. Rat CNT 2 is a purine-specific Na+-nucleoside cotransporter localised to the bile canalicular membrane. Rat CNT 1 is a Na+-dependent nucleoside transporter selective for pyrimidine nucleosides and adenosine; it also transports the anti-viral nucleoside analogues AZT and ddC. This alignment covers the N terminus of this family Q#23820 - CGI_10005828 superfamily 219507 304 399 1.06E-13 68.0347 cl18514 Gate superfamily - - "Nucleoside recognition; This region in the nucleoside transporter proteins is responsible for determining nucleoside specificity in the human CNT1 and CNT2 proteins. In the FeoB proteins, which are believed to be Fe2+ transporters, it includes the membrane pore region, so the function of this region is likely to be more general than just nucleoside specificity. This family may represent the pore and gate, with a wide potential range of specificity. Hence its name 'Gate'."
Q#23822 - CGI_10013776 superfamily 241548 1 222 6.86E-102 305.208 cl00013 Lyase_I_like superfamily NC - "Lyase class I_like superfamily: contains the lyase class I family, histidine ammonia-lyase and phenylalanine ammonia-lyase, which catalyze similar beta-elimination reactions; Lyase class I_like superfamily of enzymes that catalyze beta-elimination reactions and are active as homotetramers. The four active sites of the homotetrameric enzyme are each formed by residues from three different subunits. This superfamily contains the lyase class I family, histidine ammonia-lyase and phenylalanine ammonia-lyase. The lyase class I family comprises proteins similar to class II fumarase, aspartase, adenylosuccinate lyase, argininosuccinate lyase, and 3-carboxy-cis, cis-muconate lactonizing enzyme which, for the most part catalyze similar beta-elimination reactions in which a C-N or C-O bond is cleaved with the release of fumarate as one of the products. Histidine or phenylalanine ammonia-lyase catalyze a beta-elimination of ammonia from histidine and phenylalanine, respectively." Q#23823 - CGI_10013777 superfamily 152488 1 82 7.78E-05 37.5877 cl13485 DUF3534 superfamily C - Domain of unknown function (DUF3534); This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is about 150 amino acids in length. This domain is found associated with pfam00595. This domain has a conserved GILD sequence motif. Q#23824 - CGI_10013778 superfamily 247856 221 283 1.48E-08 52.1649 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#23826 - CGI_10013780 superfamily 217210 587 1103 1.32E-175 531.468 cl10595 Ald_Xan_dh_C2 superfamily - - Molybdopterin-binding domain of aldehyde dehydrogenase; Molybdopterin-binding domain of aldehyde dehydrogenase. Q#23826 - CGI_10013780 superfamily 243326 472 579 6.46E-38 139.19 cl03161 Ald_Xan_dh_C superfamily - - "Aldehyde oxidase and xanthine dehydrogenase, a/b hammerhead domain; Aldehyde oxidase and xanthine dehydrogenase, a/b hammerhead domain. " Q#23826 - CGI_10013780 superfamily 201981 24 77 2.92E-27 107.566 cl08334 Fer2_2 superfamily N - [2Fe-2S] binding domain; [2Fe-2S] binding domain. Q#23826 - CGI_10013780 superfamily 244932 324 422 6.95E-14 69.8377 cl08390 CO_deh_flav_C superfamily - - CO dehydrogenase flavoprotein C-terminal domain; CO dehydrogenase flavoprotein C-terminal domain. Q#23827 - CGI_10013781 superfamily 247723 1487 1581 5.88E-58 196.725 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." 
Q#23827 - CGI_10013781 superfamily 217210 859 1082 2.50E-79 274.155 cl10595 Ald_Xan_dh_C2 superfamily N - Molybdopterin-binding domain of aldehyde dehydrogenase; Molybdopterin-binding domain of aldehyde dehydrogenase. Q#23827 - CGI_10013781 superfamily 217210 594 857 2.24E-57 209.827 cl10595 Ald_Xan_dh_C2 superfamily C - Molybdopterin-binding domain of aldehyde dehydrogenase; Molybdopterin-binding domain of aldehyde dehydrogenase. Q#23827 - CGI_10013781 superfamily 243326 479 586 5.85E-43 154.213 cl03161 Ald_Xan_dh_C superfamily - - "Aldehyde oxidase and xanthine dehydrogenase, a/b hammerhead domain; Aldehyde oxidase and xanthine dehydrogenase, a/b hammerhead domain. " Q#23827 - CGI_10013781 superfamily 201981 5 80 3.32E-36 133.76 cl08334 Fer2_2 superfamily - - [2Fe-2S] binding domain; [2Fe-2S] binding domain. Q#23827 - CGI_10013781 superfamily 244932 328 428 4.86E-14 70.6081 cl08390 CO_deh_flav_C superfamily - - CO dehydrogenase flavoprotein C-terminal domain; CO dehydrogenase flavoprotein C-terminal domain. Q#23827 - CGI_10013781 superfamily 243107 1420 1459 3.19E-09 55.2438 cl02611 G-patch superfamily - - "G-patch domain; This domain is found in a number of RNA binding proteins, and is also found in proteins that contain RNA binding domains. This suggests that this domain may have an RNA binding function. This domain has seven highly conserved glycines." Q#23828 - CGI_10013782 superfamily 192507 9 89 5.61E-30 111.53 cl10955 WGG superfamily - - Pre-rRNA-processing protein TSR2; This entry represents the central conserved section of a family of proteins described as pre-rRNA-processing protein TSR2. The region has a distinctive WGG motif but the function is unknown. 
Q#23829 - CGI_10013783 superfamily 193607 292 423 5.36E-64 212.047 cl15237 Deltex_C superfamily - - "Domain found at the C-terminus of deltex-like; The deltex family of proteins is involved in the regulation of Notch signaling, and therefore may play roles in cell-to-cell communications that regulate mechanisms determining cell fate. They have a central RING-type zinc finger domain and contain a C-terminal domain, described here, that is also found in other domain architectures. Deltex-1 (DTX1) contains a RING finger and two WWE domains, indicating that it may be an E3 ubiquitin ligase. Human deltex 3-like, which contains an additional N-terminal domain (presumably with ubiquitin ligase activity) is also described as E3 ubiquitin-protein ligase DTX3L, B-lymphoma- and BAL-associated protein (BBAP), or rhysin-2. DTX3L mediates monoubiquitination of K91 of histone H4 in response to DNA damage." Q#23829 - CGI_10013783 superfamily 247792 245 283 2.55E-09 54.7592 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#23829 - CGI_10013783 superfamily 245213 632 668 9.36E-07 47.2462 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins.
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#23829 - CGI_10013783 superfamily 241554 16 155 6.07E-33 126.616 cl00019 Macro superfamily N - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#23829 - CGI_10013783 superfamily 241554 123 211 1.65E-05 44.8329 cl00019 Macro superfamily N - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." 
Q#23831 - CGI_10013785 superfamily 219122 2322 2458 1.18E-27 115.855 cl05933 DUF1162 superfamily C - Protein of unknown function (DUF1162); This family represents a conserved region within several hypothetical eukaryotic proteins. Family members might be vacuolar protein sorting related-proteins. Q#23831 - CGI_10013785 superfamily 204985 6 63 1.05E-08 55.6407 cl14987 Chorein_N superfamily C - "N-terminal region of Chorein, a TM vesicle-mediated sorter; Although mutations in the full-length vacuolar protein sorting 13A (VPS13A) protein in vertebrates lead to the disease of chorea-acanthocytosis, the exact function of any of the regions within the protein is not yet known. This region is the proposed leucine zipper at the N-terminus. The full-length protein is a transmembrane protein with a presumed role in vesicle-mediated sorting and intracellular protein transport." Q#23832 - CGI_10013786 superfamily 219122 3 98 4.60E-06 47.6743 cl05933 DUF1162 superfamily N - Protein of unknown function (DUF1162); This family represents a conserved region within several hypothetical eukaryotic proteins. Family members might be vacuolar protein sorting related-proteins. Q#23834 - CGI_10013788 superfamily 248097 161 248 0.0016244 37.2446 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#23834 - CGI_10013788 superfamily 248097 429 514 0.00269832 36.8594 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#23835 - CGI_10013789 superfamily 247785 42 336 4.51E-77 240.279 cl17231 HpcH_HpaI superfamily - - "HpcH/HpaI aldolase/citrate lyase family; This family includes 2,4-dihydroxyhept-2-ene-1,7-dioic acid aldolase and 4-hydroxy-2-oxovalerate aldolase." 
Q#23837 - CGI_10013791 superfamily 247999 398 435 0.00310428 35.6506 cl17445 PHD superfamily C - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#23841 - CGI_10013795 superfamily 247723 44 136 6.31E-38 138.97 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#23841 - CGI_10013795 superfamily 152200 1243 1378 1.55E-17 82.4391 cl13245 N-SET superfamily - - "COMPASS (Complex proteins associated with Set1p) component N; The n-SET or N-SET domain is a component of the COMPASS complex, associated with SET1, conserved in yeasts and in other eukaryotes up to humans. The COMPASS complex functions to methylate the fourth lysine of Histone 3 and for the silencing of genes close to the telomeres of chromosomes. This domain promotes trimethylation in conjunction with an RRM domain and is necessary for binding of the Spp1 component of COMPASS into the complex." 
Q#23842 - CGI_10004100 superfamily 247723 118 166 0.00968562 33.5771 cl17169 RRM_SF superfamily N - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#23845 - CGI_10004103 superfamily 241568 706 740 0.0011993 37.8276 cl00043 CCP superfamily N - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." 
Q#23845 - CGI_10004103 superfamily 242059 19 426 6.37E-76 256.134 cl00738 MBOAT superfamily - - "MBOAT, membrane-bound O-acyltransferase family; The MBOAT (membrane bound O-acyl transferase) family of membrane proteins contains a variety of acyltransferase enzymes. A conserved histidine has been suggested to be the active site residue." Q#23848 - CGI_10000673 superfamily 218839 7 165 6.57E-53 169.802 cl05501 Med7 superfamily - - MED7 protein; This family consists of several eukaryotic proteins which are homologues of the yeast MED7 protein. Activation of gene transcription in metazoans is a multi-step process that is triggered by factors that recognise transcriptional enhancer sites in DNA. These factors work with co-activators such as MED7 to direct transcriptional initiation by the RNA polymerase II apparatus. Q#23849 - CGI_10001248 superfamily 217859 42 85 9.40E-15 65.7599 cl04376 P34-Arc superfamily NC - "Arp2/3 complex, 34 kD subunit p34-Arc; Arp2/3 protein complex has been implicated in the control of actin polymerisation in cells. The human complex consists of seven subunits which include the actin related Arp2 and Arp3, and five others referred to as p41-Arc, p34-Arc, p21-Arc, p20-Arc, and p16-Arc. This family represents the p34-Arc subunit." Q#23851 - CGI_10010828 superfamily 241754 10 220 1.69E-86 289.275 cl00286 Motor_domain superfamily N - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. Q#23852 - CGI_10010829 superfamily 241754 2 356 0 632.424 cl00286 Motor_domain superfamily - - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. 
Q#23852 - CGI_10010829 superfamily 247725 1605 1715 2.54E-41 150.044 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting proteins to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and other proteins." Q#23852 - CGI_10010829 superfamily 241581 472 572 1.70E-06 48.5366 cl00062 FHA superfamily - - "Forkhead associated domain (FHA); found in eukaryotic and prokaryotic proteins. Putative nuclear signalling domain. FHA domains may bind phosphothreonine, phosphoserine and sometimes phosphotyrosine. In eukaryotes, many FHA domain-containing proteins localize to the nucleus, where they participate in establishing or maintaining cell cycle checkpoints, DNA repair, or transcriptional regulation. Members of the FHA family include: Dun1, Rad53, Cds1, Mek1, KAPP (kinase-associated protein phosphatase), and Ki-67 (a human nuclear protein related to cell proliferation)." Q#23852 - CGI_10010829 superfamily 221593 1167 1317 5.59E-44 159.077 cl13857 DUF3694 superfamily - - "Kinesin protein; This domain family is found in eukaryotes, and is typically between 131 and 151 amino acids in length. The family is found in association with pfam00225, pfam00498. There is a single completely conserved residue W that may be functionally important." Q#23852 - CGI_10010829 superfamily 241754 1737 1848 6.03E-37 144.825 cl00286 Motor_domain superfamily C - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes.
Q#23852 - CGI_10010829 superfamily 221571 834 880 1.39E-07 50.5791 cl13810 KIF1B superfamily - - "Kinesin protein 1B; This domain family is found in eukaryotes, and is approximately 50 amino acids in length. The family is found in association with pfam00225, pfam00498. KIF1B is an anterograde motor for transport of mitochondria in axons of neuronal cells." Q#23853 - CGI_10010830 superfamily 246925 79 361 6.79E-29 114.375 cl15309 LRR_RI superfamily - - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#23854 - CGI_10010831 superfamily 244840 22 124 6.53E-34 118.043 cl08021 Flavoprotein superfamily N - "Flavoprotein; This family contains diverse flavoprotein enzymes. This family includes epidermin biosynthesis protein, EpiD, which has been shown to be a flavoprotein that binds FMN. This enzyme catalyzes the removal of two reducing equivalents from the cysteine residue of the C-terminal meso-lanthionine of epidermin to form a --C==C-- double bond. This family also includes the B chain of dipicolinate synthase a small polar molecule that accumulates to high concentrations in bacterial endospores, and is thought to play a role in spore heat resistance, or the maintenance of heat resistance. dipicolinate synthase catalyzes the formation of dipicolinic acid from dihydroxydipicolinic acid. This family also includes phenyl-acrylic acid decarboxylase (EC:4.1.1.-)." 
Q#23855 - CGI_10010832 superfamily 243175 126 251 1.30E-59 187.089 cl02776 GST_C_family superfamily - - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." 
Q#23855 - CGI_10010832 superfamily 241832 23 93 1.20E-32 115.601 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#23860 - CGI_10010837 superfamily 241832 4 73 2.41E-25 96.464 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). 
Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#23860 - CGI_10010837 superfamily 243175 157 271 8.82E-18 76.1222 cl02776 GST_C_family superfamily - - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. 
Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." Q#23860 - CGI_10010837 superfamily 243175 84 192 4.73E-13 63.0254 cl02776 GST_C_family superfamily - - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. 
Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." Q#23861 - CGI_10010838 superfamily 241832 4 73 2.37E-27 100.316 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#23861 - CGI_10010838 superfamily 243175 84 186 3.43E-12 59.5586 cl02776 GST_C_family superfamily - - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. 
This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." Q#23862 - CGI_10010839 superfamily 243490 2 48 0.00494575 31.883 cl03656 PS_Dcarbxylase superfamily NC - "Phosphatidylserine decarboxylase; This is a family of phosphatidylserine decarboxylases, EC:4.1.1.65. These enzymes catalyze the reaction: Phosphatidyl-L-serine <=> phosphatidylethanolamine + CO2. Phosphatidylserine decarboxylase plays a central role in the biosynthesis of aminophospholipids by converting phosphatidylserine to phosphatidylethanolamine." Q#23863 - CGI_10010840 superfamily 247746 106 213 0.00544196 36.8526 cl17192 ATP-synt_B superfamily - - "ATP synthase B/B' CF(0); Part of the CF(0) (base unit) of the ATP synthase. The base unit is thought to translocate protons through membrane (inner membrane in mitochondria, thylakoid membrane in plants, cytoplasmic membrane in bacteria). 
The B subunits are thought to interact with the stalk of the CF(1) subunits. This domain should not be confused with the ab CF(1) proteins (in the head of the ATP synthase) which are found in pfam00006" Q#23864 - CGI_10010841 superfamily 247750 179 308 8.45E-58 195.201 cl17196 E1_enzyme_family superfamily N - "Superfamily of activating enzymes (E1) of the ubiquitin-like proteins. This family includes classical ubiquitin-activating enzymes E1, ubiquitin-like (ubl) activating enzymes and other mechanistic homologues, like MoeB, Thif1 and others. The common reaction mechanism catalyzed by MoeB, ThiF and the E1 enzymes begins with a nucleophilic attack of the C-terminal carboxylate of MoaD, ThiS and ubiquitin, respectively, on the alpha-phosphate of an ATP molecule bound at the active site of the activating enzymes, leading to the formation of a high-energy acyladenylate intermediate and subsequently to the formation of a thiocarboxylate at the C termini of MoaD and ThiS." Q#23864 - CGI_10010841 superfamily 247750 34 111 3.91E-43 155.91 cl17196 E1_enzyme_family superfamily NC - "Superfamily of activating enzymes (E1) of the ubiquitin-like proteins. This family includes classical ubiquitin-activating enzymes E1, ubiquitin-like (ubl) activating enzymes and other mechanistic homologues, like MoeB, Thif1 and others. The common reaction mechanism catalyzed by MoeB, ThiF and the E1 enzymes begins with a nucleophilic attack of the C-terminal carboxylate of MoaD, ThiS and ubiquitin, respectively, on the alpha-phosphate of an ATP molecule bound at the active site of the activating enzymes, leading to the formation of a high-energy acyladenylate intermediate and subsequently to the formation of a thiocarboxylate at the C termini of MoaD and ThiS." Q#23864 - CGI_10010841 superfamily 202124 84 146 9.03E-12 59.8684 cl08340 UBACT superfamily - - Repeat in ubiquitin-activating (UBA) protein; Repeat in ubiquitin-activating (UBA) protein.
Q#23865 - CGI_10010842 superfamily 247750 116 308 1.20E-70 229.098 cl17196 E1_enzyme_family superfamily N - "Superfamily of activating enzymes (E1) of the ubiquitin-like proteins. This family includes classical ubiquitin-activating enzymes E1, ubiquitin-like (ubl) activating enzymes and other mechanistic homologues, like MoeB, Thif1 and others. The common reaction mechanism catalyzed by MoeB, ThiF and the E1 enzymes begins with a nucleophilic attack of the C-terminal carboxylate of MoaD, ThiS and ubiquitin, respectively, on the alpha-phosphate of an ATP molecule bound at the active site of the activating enzymes, leading to the formation of a high-energy acyladenylate intermediate and subsequently to the formation of a thiocarboxylate at the C termini of MoaD and ThiS." Q#23865 - CGI_10010842 superfamily 247750 1 48 1.98E-22 96.2043 cl17196 E1_enzyme_family superfamily NC - "Superfamily of activating enzymes (E1) of the ubiquitin-like proteins. This family includes classical ubiquitin-activating enzymes E1, ubiquitin-like (ubl) activating enzymes and other mechanistic homologues, like MoeB, Thif1 and others. The common reaction mechanism catalyzed by MoeB, ThiF and the E1 enzymes begins with a nucleophilic attack of the C-terminal carboxylate of MoaD, ThiS and ubiquitin, respectively, on the alpha-phosphate of an ATP molecule bound at the active site of the activating enzymes, leading to the formation of a high-energy acyladenylate intermediate and subsequently to the formation of a thiocarboxylate at the C termini of MoaD and ThiS." Q#23866 - CGI_10005516 superfamily 218284 49 93 9.18E-13 59.5755 cl04786 SOUL superfamily C - SOUL heme-binding protein; This family represents a group of putative heme-binding proteins. Our family includes archaeal and bacterial homologues. Q#23866 - CGI_10005516 superfamily 243051 23 49 0.0010389 34.6337 cl02479 MAM superfamily N - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain.
MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#23868 - CGI_10005518 superfamily 198738 180 264 4.26E-46 155.504 cl02599 Ets superfamily - - Ets-domain; Ets-domain. Q#23870 - CGI_10005520 superfamily 220626 1 198 1.10E-33 125.426 cl18564 GpcrRhopsn4 superfamily N - "Rhodopsin-like GPCR transmembrane domain; This region of 270 amino acids is the seven transmembrane alpha-helical domains included within five GPCRRHODOPSN4 motifs of a G-protein-coupled-receptor (GPCR) protein, conserved from nematodes to humans. GPCRs are integral membrane receptors whose intracellular actions are mediated by signalling pathways involving G proteins and downstream secondary messengers." Q#23872 - CGI_10005522 superfamily 220626 154 410 6.43E-62 205.933 cl18564 GpcrRhopsn4 superfamily - - "Rhodopsin-like GPCR transmembrane domain; This region of 270 amino acids is the seven transmembrane alpha-helical domains included within five GPCRRHODOPSN4 motifs of a G-protein-coupled-receptor (GPCR) protein, conserved from nematodes to humans. GPCRs are integral membrane receptors whose intracellular actions are mediated by signalling pathways involving G proteins and downstream secondary messengers." 
Q#23873 - CGI_10005523 superfamily 245029 308 417 8.53E-06 43.7904 cl09190 MAPEG superfamily - - "MAPEG family; This family has been called MAPEG (Membrane Associated Proteins in Eicosanoid and Glutathione metabolism). It includes proteins such as Prostaglandin E synthase. This enzyme catalyzes the synthesis of PGE2 from PGH2 (produced by cyclooxygenase from arachidonic acid). Because of structural similarities in the active sites of FLAP, LTC4 synthase and PGE synthase, substrates for each enzyme can compete with one another and modulate synthetic activity." Q#23876 - CGI_10025860 superfamily 219542 53 160 2.70E-40 136.603 cl18517 Cu-oxidase_3 superfamily - - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. Q#23877 - CGI_10025861 superfamily 215647 23 237 1.46E-40 146.214 cl18338 7tm_2 superfamily - - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GPCRs). They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling."
Q#23879 - CGI_10025863 superfamily 216363 278 373 3.96E-12 62.1026 cl08312 UPF0029 superfamily - - Uncharacterized protein family UPF0029; Uncharacterized protein family UPF0029. Q#23879 - CGI_10025863 superfamily 242748 148 185 0.00435338 35.2196 cl01853 COG4467 superfamily C - "Regulator of replication initiation timing [Replication, recombination, and repair]" Q#23881 - CGI_10025865 superfamily 216363 7 72 1.60E-15 65.9546 cl08312 UPF0029 superfamily N - Uncharacterized protein family UPF0029; Uncharacterized protein family UPF0029. Q#23884 - CGI_10025868 superfamily 246669 197 277 2.86E-22 89.6394 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#23884 - CGI_10025868 superfamily 246669 1 71 2.20E-13 64.5075 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands.
Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#23885 - CGI_10025869 superfamily 238191 27 528 7.14E-105 328.908 cl18907 Esterase_lipase superfamily - - "Esterases and lipases (includes fungal lipases, cholinesterases, etc.) These enzymes act on carboxylic esters (EC: 3.1.1.-). The catalytic apparatus involves three residues (catalytic triad): a serine, a glutamate or aspartate and a histidine. These catalytic residues are responsible for the nucleophilic attack on the carbonyl carbon atom of the ester bond. In contrast with other alpha/beta hydrolase fold family members, p-nitrobenzyl esterase and acetylcholine esterase have a Glu instead of Asp at the active site carboxylate." Q#23886 - CGI_10025870 superfamily 243051 688 838 9.87E-34 128.65 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg).
In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#23886 - CGI_10025870 superfamily 243051 531 636 4.91E-30 118.249 cl02479 MAM superfamily N - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#23886 - CGI_10025870 superfamily 243051 856 1007 1.65E-28 113.627 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. 
Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#23886 - CGI_10025870 superfamily 241571 337 459 2.37E-10 59.3482 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#23886 - CGI_10025870 superfamily 245213 1006 1041 0.000350906 39.9274 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#23886 - CGI_10025870 superfamily 243051 1050 1111 1.80E-10 60.4421 cl02479 MAM superfamily C - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. 
Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#23886 - CGI_10025870 superfamily 241583 192 251 1.35E-08 54.4995 cl00064 ZnMc superfamily C - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#23887 - CGI_10025871 superfamily 243051 29 158 8.52E-28 103.997 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. 
In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#23888 - CGI_10025872 superfamily 241583 156 336 1.44E-54 188.549 cl00064 ZnMc superfamily - - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#23888 - CGI_10025872 superfamily 243051 919 1070 2.34E-27 110.16 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#23888 - CGI_10025872 superfamily 243051 520 648 2.90E-19 86.6629 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. 
Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#23888 - CGI_10025872 superfamily 241571 391 513 9.45E-11 60.5038 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#23888 - CGI_10025872 superfamily 245213 651 684 0.000160525 40.6978 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#23888 - CGI_10025872 superfamily 245213 783 810 0.000520271 39.157 cl09941 EGF_CA superfamily N - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#23888 - CGI_10025872 superfamily 243051 693 773 2.88E-13 68.9165 cl02479 MAM superfamily C - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#23888 - CGI_10025872 superfamily 243051 819 876 1.01E-05 45.4466 cl02479 MAM superfamily C - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. 
Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#23890 - CGI_10025874 superfamily 243051 2 35 0.000584709 34.661 cl02479 MAM superfamily N - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#23891 - CGI_10025875 superfamily 243051 624 777 1.54E-30 118.634 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. 
MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#23891 - CGI_10025875 superfamily 243051 318 470 4.45E-27 108.619 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." 
Q#23891 - CGI_10025875 superfamily 243051 186 315 6.59E-20 87.8185 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#23891 - CGI_10025875 superfamily 245213 588 615 0.000157681 40.3126 cl09941 EGF_CA superfamily N - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#23891 - CGI_10025875 superfamily 245213 477 506 0.00066285 38.3866 cl09941 EGF_CA superfamily N - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#23891 - CGI_10025875 superfamily 243051 515 575 3.01E-08 52.7653 cl02479 MAM superfamily C - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#23892 - CGI_10025876 superfamily 241583 1250 1432 4.98E-53 187.008 cl00064 ZnMc superfamily - - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." 
Q#23892 - CGI_10025876 superfamily 243051 1613 1758 6.42E-36 136.739 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#23892 - CGI_10025876 superfamily 243051 929 1082 1.06E-28 115.938 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. 
In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#23892 - CGI_10025876 superfamily 243051 2734 2842 6.41E-25 104.767 cl02479 MAM superfamily N - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#23892 - CGI_10025876 superfamily 243051 542 662 6.17E-21 93.2113 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. 
In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#23892 - CGI_10025876 superfamily 243051 681 788 3.05E-19 88.2037 cl02479 MAM superfamily N - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#23892 - CGI_10025876 superfamily 241571 2509 2631 7.41E-12 65.1262 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#23892 - CGI_10025876 superfamily 241571 373 495 5.00E-11 62.815 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." 
Q#23892 - CGI_10025876 superfamily 245213 888 920 3.57E-05 44.1646 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#23892 - CGI_10025876 superfamily 245213 791 826 0.00459524 37.6162 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#23892 - CGI_10025876 superfamily 243051 2671 2750 7.84E-12 65.8621 cl02479 MAM superfamily N - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). 
In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#23892 - CGI_10025876 superfamily 241583 230 287 5.12E-08 54.1143 cl00064 ZnMc superfamily C - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#23892 - CGI_10025876 superfamily 243051 8 80 8.26E-07 50.4269 cl02479 MAM superfamily NC - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#23892 - CGI_10025876 superfamily 241583 2358 2414 8.31E-07 50.6475 cl00064 ZnMc superfamily C - "Zinc-dependent metalloprotease. 
This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#23892 - CGI_10025876 superfamily 243051 860 886 1.88E-06 49.2986 cl02479 MAM superfamily N - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#23892 - CGI_10025876 superfamily 241571 2020 2079 1.68E-05 45.8663 cl00049 CUB superfamily C - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#23892 - CGI_10025876 superfamily 241583 1869 1922 0.00109409 41.4027 cl00064 ZnMc superfamily C - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." 
Q#23893 - CGI_10025877 superfamily 243051 23 131 5.78E-28 103.612 cl02479 MAM superfamily N - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#23894 - CGI_10025878 superfamily 243051 91 245 8.20E-25 97.0633 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. 
In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#23894 - CGI_10025878 superfamily 243051 1 73 4.67E-16 72.4105 cl02479 MAM superfamily N - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#23895 - CGI_10025879 superfamily 243051 209 279 0.000503598 38.4785 cl02479 MAM superfamily C - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. 
In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#23896 - CGI_10025880 superfamily 243051 1 36 0.00865902 35.4041 cl02479 MAM superfamily NC - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#23897 - CGI_10025881 superfamily 245312 150 375 1.20E-06 48.7996 cl10482 KefB superfamily - - "Kef-type K+ transport systems, membrane components [Inorganic ion transport and metabolism]" Q#23900 - CGI_10025885 superfamily 243045 99 191 3.01E-11 62.6507 cl02459 PAS superfamily - - "PAS domain; PAS motifs appear in archaea, eubacteria and eukarya. Probably the most surprising identification of a PAS domain was that in EAG-like K+-channels. PAS domains have been found to bind ligands, and to act as sensors for light and oxygen in signal transduction." 
Q#23900 - CGI_10025885 superfamily 241596 15 66 4.51E-08 52.2163 cl00081 HLH superfamily - - "Helix-loop-helix domain, found in specific DNA- binding proteins that act as transcription factors; 60-100 amino acids long. A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE 5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved in both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#23900 - CGI_10025885 superfamily 243045 273 349 0.00209482 38.3831 cl02459 PAS superfamily N - "PAS domain; PAS motifs appear in archaea, eubacteria and eukarya. Probably the most surprising identification of a PAS domain was that in EAG-like K+-channels. PAS domains have been found to bind ligands, and to act as sensors for light and oxygen in signal transduction." Q#23901 - CGI_10025887 superfamily 238191 601 1082 1.56E-118 376.672 cl18907 Esterase_lipase superfamily - - "Esterases and lipases (includes fungal lipases, cholinesterases, etc.) These enzymes act on carboxylic esters (EC: 3.1.1.-). 
The catalytic apparatus involves three residues (catalytic triad): a serine, a glutamate or aspartate and a histidine.These catalytic residues are responsible for the nucleophilic attack on the carbonyl carbon atom of the ester bond. In contrast with other alpha/beta hydrolase fold family members, p-nitrobenzyl esterase and acetylcholine esterase have a Glu instead of Asp at the active site carboxylate." Q#23901 - CGI_10025887 superfamily 238191 99 568 3.86E-113 362.035 cl18907 Esterase_lipase superfamily - - "Esterases and lipases (includes fungal lipases, cholinesterases, etc.) These enzymes act on carboxylic esters (EC: 3.1.1.-). The catalytic apparatus involves three residues (catalytic triad): a serine, a glutamate or aspartate and a histidine.These catalytic residues are responsible for the nucleophilic attack on the carbonyl carbon atom of the ester bond. In contrast with other alpha/beta hydrolase fold family members, p-nitrobenzyl esterase and acetylcholine esterase have a Glu instead of Asp at the active site carboxylate." Q#23902 - CGI_10025888 superfamily 238191 28 537 3.81E-126 387.843 cl18907 Esterase_lipase superfamily - - "Esterases and lipases (includes fungal lipases, cholinesterases, etc.) These enzymes act on carboxylic esters (EC: 3.1.1.-). The catalytic apparatus involves three residues (catalytic triad): a serine, a glutamate or aspartate and a histidine.These catalytic residues are responsible for the nucleophilic attack on the carbonyl carbon atom of the ester bond. In contrast with other alpha/beta hydrolase fold family members, p-nitrobenzyl esterase and acetylcholine esterase have a Glu instead of Asp at the active site carboxylate." 
Q#23902 - CGI_10025888 superfamily 238012 644 683 0.00466715 35.793 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#23903 - CGI_10025889 superfamily 241877 320 346 0.000333899 40.9401 cl00459 MIT_CorA-like superfamily N - "metal ion transporter CorA-like divalent cation transporter superfamily; This superfamily of essential membrane proteins is involved in transporting divalent cations (uptake or efflux) across membranes. They are found in most bacteria and archaea, and in some eukaryotes. It is a functionally diverse group which includes the Mg2+ transporters of Escherichia coli and Salmonella typhimurium CorAs (which can also transport Co2+, and Ni2+ ), the CorA Co2+ transporter from the hyperthermophilic Thermotoga maritima, and the Zn2+ transporter Salmonella typhimurium ZntB, which mediates the efflux of Zn2+ (and Cd2+). It includes five Saccharomyces cerevisiae members: i) two plasma membrane proteins, the Mg2+ transporter Alr1p/Swc3p and the putative Mg2+ transporter, Alr2p, ii) two mitochondrial inner membrane Mg2+ transporters: Mfm1p/Lpe10p, and Mrs2p, and iii) and the vacuole membrane protein Mnr2p, a putative Mg2+ transporter. It also includes a family of Arabidopsis thaliana members (AtMGTs), some of which are localized to distinct tissues, and not all of which can transport Mg2+. 
Thermotoga maritima CorA and Vibrio parahaemolyticus and Salmonella typhimurium ZntB form funnel-shaped homopentamers, the tip of the funnel is formed from two C-terminal transmembrane (TM) helices from each monomer, and the large opening of the funnel from the N-terminal cytoplasmic domains. The GMN signature motif of the MIT superfamily occurs just after TM1, mutation within this motif is known to abolish Mg2+ transport through Salmonella typhimurium CorA, Mrs2p, and Alr1p. Natural variants such as GVN and GIN, as in some ZntB family proteins, may be associated with the transport of different divalent cations, such as zinc and cadmium. The functional diversity of MIT transporters may also be due to minor structural differences regulating gating, substrate selection, and transport." Q#23904 - CGI_10025890 superfamily 217311 8 391 2.90E-105 325.06 cl18402 DUF229 superfamily N - Protein of unknown function (DUF229); Members of this family are uncharacterized. They are 500-1200 amino acids in length and share a long region conservation that probably corresponds to several domains. The Go annotation for the protein indicates that it is involved in nematode larval development and has a positive regulation on growth rate. Q#23905 - CGI_10025891 superfamily 241889 1338 1473 2.29E-27 110.798 cl00474 PAP2_like superfamily - - "PAP2_like proteins, a super-family of histidine phosphatases and vanadium haloperoxidases, includes type 2 phosphatidic acid phosphatase or lipid phosphate phosphatase (LPP), Glucose-6-phosphatase, Phosphatidylglycerophosphatase B and bacterial acid phosphatase, vanadium chloroperoxidases, vanadium bromoperoxidases, and several other mostly uncharacterized subfamilies. Several members of this superfamily have been predicted to be transmembrane proteins." 
Q#23905 - CGI_10025891 superfamily 243066 380 479 1.71E-08 54.1605 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#23908 - CGI_10025894 superfamily 247684 17 431 2.83E-93 302.659 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein 
EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#23908 - CGI_10025894 superfamily 178029 564 857 1.48E-116 362.187 cl18093 PLN02407 superfamily - - diphosphomevalonate decarboxylase Q#23912 - CGI_10025898 superfamily 247907 1963 2144 2.93E-26 108.66 cl17353 LamG superfamily - - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." Q#23912 - CGI_10025898 superfamily 247068 1223 1324 5.24E-26 106.244 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." 
Q#23912 - CGI_10025898 superfamily 247068 553 655 3.52E-24 100.851 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#23912 - CGI_10025898 superfamily 247068 889 999 6.36E-23 96.9989 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#23912 - CGI_10025898 superfamily 247068 338 434 5.60E-17 80.0501 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#23912 - CGI_10025898 superfamily 247068 1008 1101 6.79E-17 79.6649 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#23912 - CGI_10025898 superfamily 247068 1333 1430 1.29E-16 78.8945 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#23912 - CGI_10025898 superfamily 247068 1445 1541 5.65E-16 76.9685 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#23912 - CGI_10025898 superfamily 247068 663 761 6.36E-16 76.9685 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#23912 - CGI_10025898 superfamily 247068 1109 1213 2.08E-12 66.5681 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#23912 - CGI_10025898 superfamily 247068 446 535 2.12E-12 66.5681 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#23912 - CGI_10025898 superfamily 245213 2462 2499 2.73E-11 61.8838 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. 
Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#23912 - CGI_10025898 superfamily 247907 2241 2331 3.00E-09 57.8133 cl17353 LamG superfamily C - "Laminin G domain; Laminin G-like domains are usually Ca++ mediated receptors that can have binding sites for steroids, beta1 integrins, heparin, sulfatides, fibulin-1, and alpha-dystroglycans. Proteins that contain LamG domains serve a variety of purposes including signal transduction via cell-surface steroid receptors, adhesion, migration and differentiation through mediation of cell adhesion molecules." Q#23912 - CGI_10025898 superfamily 247068 776 873 3.38E-09 56.9382 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#23912 - CGI_10025898 superfamily 247068 219 324 4.07E-08 53.8566 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#23912 - CGI_10025898 superfamily 247068 1549 1663 5.71E-08 53.0862 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#23912 - CGI_10025898 superfamily 247068 110 208 5.33E-07 50.3898 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#23912 - CGI_10025898 superfamily 245213 1927 1960 0.00285383 38.3866 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#23912 - CGI_10025898 superfamily 216265 2549 2652 2.10E-18 85.432 cl03079 Cadherin_C superfamily - - Cadherin cytoplasmic region; Cadherins are vital in cell-cell adhesion during tissue differentiation. Cadherins are linked to the cytoskeleton by catenins. Catenins bind to the cytoplasmic tail of the cadherin. Cadherins cluster to form foci of homophilic binding units. A key determinant of the strength of the binding that is mediated by cadherins is the juxtamembrane region of the cadherin. This region induces clustering and also binds to the protein p120ctn. Q#23912 - CGI_10025898 superfamily 247068 6 99 0.00272563 38.8338 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. 
Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#23912 - CGI_10025898 superfamily 247068 1685 1766 0.00304001 38.4486 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#23913 - CGI_10025899 superfamily 245201 240 475 1.74E-22 95.7665 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#23914 - CGI_10025900 superfamily 247692 48 254 6.69E-56 185.032 cl17068 AFD_class_I superfamily N - "Adenylate forming domain, Class I; This family includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. 
The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain." Q#23916 - CGI_10025902 superfamily 247724 1 86 2.41E-22 87.1959 cl17170 Ras_like_GTPase superfamily N - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#23920 - CGI_10025906 superfamily 218759 9 136 8.59E-33 113.989 cl05405 DUF842 superfamily - - "Eukaryotic protein of unknown function (DUF842); This family consists of a number of conserved eukaryotic proteins of unknown function. 
The sequences carry three sets of CxxxC motifs, which might suggest a type of zinc-finger formation." Q#23921 - CGI_10023479 superfamily 245206 21 264 6.36E-143 410.683 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#23921 - CGI_10023479 superfamily 242376 334 423 1.96E-23 93.829 cl01225 SCP2 superfamily - - "SCP-2 sterol transfer family; This domain is involved in binding sterols. It is found in the SCP2 protein, as well as the C terminus of the enzyme estradiol 17 beta-dehydrogenase EC:1.1.1.62. The UNC-24 protein contains an SPFH domain pfam01145." Q#23922 - CGI_10023480 superfamily 245210 53 455 1.00E-115 347.714 cl09938 cond_enzymes superfamily - - "Condensing enzymes; Family of enzymes that catalyze a (decarboxylating or non-decarboxylating) Claisen-like condensation reaction. Members share strong structural similarity, and are involved in the synthesis and degradation of fatty acids, and the production of polyketides, a diverse group of natural products." 
Q#23924 - CGI_10023482 superfamily 221308 1 101 4.67E-22 85.7586 cl13364 DUF3429 superfamily N - Protein of unknown function (DUF3429); This family of proteins are functionally uncharacterized. This protein is found in bacteria and eukaryotes. Proteins in this family are typically between 147 to 245 amino acids in length. Q#23925 - CGI_10023483 superfamily 248097 778 898 8.36E-23 95.795 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#23925 - CGI_10023483 superfamily 248097 659 779 2.41E-16 76.9202 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#23925 - CGI_10023483 superfamily 247743 119 160 0.000761599 39.4904 cl17189 AAA superfamily C - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#23926 - CGI_10023484 superfamily 245201 29 174 6.02E-15 73.0397 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. 
These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#23929 - CGI_10023487 superfamily 241763 194 422 5.32E-87 268.37 cl00298 Peptidase_C1 superfamily - - "C1 Peptidase family (MEROPS database nomenclature), also referred to as the papain family; composed of two subfamilies of cysteine peptidases (CPs), C1A (papain) and C1B (bleomycin hydrolase). Papain-like enzymes are mostly endopeptidases with some exceptions like cathepsins B, C, H and X, which are exopeptidases. Papain-like CPs have different functions in various organisms. Plant CPs are used to mobilize storage proteins in seeds while mammalian CPs are primarily lysosomal enzymes responsible for protein degradation in the lysosome. Papain-like CPs are synthesized as inactive proenzymes with N-terminal propeptide regions, which are removed upon activation. Bleomycin hydrolase (BH) is a CP that detoxifies bleomycin by hydrolysis of an amide group. It acts as a carboxypeptidase on its C-terminus to convert itself into an aminopeptidase and peptide ligase. BH is found in all tissues in mammals as well as in many other eukaryotes. It forms a hexameric ring barrel structure with the active sites imbedded in the central channel. Some members of the C1 family are proteins classified as non-peptidase homologs which lack peptidase activity or have missing active site residues." Q#23929 - CGI_10023487 superfamily 207618 46 89 0.0013996 36.6442 cl02508 Somatomedin_B superfamily - - Somatomedin B domain; Somatomedin B domain. Q#23930 - CGI_10023488 superfamily 217955 685 887 1.34E-65 220.219 cl04442 Utp21 superfamily - - "Utp21 specific WD40 associated putative domain; Utp21 is a subunit of U3 snoRNP, which is essential for synthesis of 18S rRNA." 
Q#23930 - CGI_10023488 superfamily 243092 270 604 6.41E-26 108.962 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#23930 - CGI_10023488 superfamily 243092 92 356 1.35E-21 95.8648 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#23931 - CGI_10023489 superfamily 241675 119 385 2.39E-49 168.634 cl00195 SIR2 superfamily - - "SIR2 superfamily of proteins includes silent information regulator 2 (Sir2) enzymes which catalyze NAD+-dependent protein/histone deacetylation, where the acetyl group from the lysine epsilon-amino group is transferred to the ADP-ribose moiety of NAD+, producing nicotinamide and the novel metabolite O-acetyl-ADP-ribose. Sir2 proteins, also known as sirtuins, are found in all eukaryotes and many archaea and prokaryotes and have been shown to regulate gene silencing, DNA repair, metabolic enzymes, and life span. 
The most-studied function, gene silencing, involves the inactivation of chromosome domains containing key regulatory genes by packaging them into a specialized chromatin structure that is inaccessible to DNA-binding proteins. The oligomerization state of Sir2 appears to be organism-dependent, sometimes occurring as a monomer and sometimes as a multimer. Also included in this superfamily is a group of uncharacterized Sir2-like proteins which lack certain key catalytic residues and conserved zinc binding cysteines." Q#23932 - CGI_10023490 superfamily 247757 22 279 5.81E-67 210.951 cl17203 Fer4_NifH superfamily - - "The Fer4_NifH superfamily contains a variety of proteins which share a common ATP-binding domain. Functionally, proteins in this superfamily use the energy from hydrolysis of NTP to transfer electron or ion." Q#23933 - CGI_10023491 superfamily 241599 38 94 4.69E-11 55.7125 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#23934 - CGI_10023492 superfamily 248020 38 290 1.90E-06 49.3852 cl17466 Sulfatase superfamily - - Sulfatase; Sulfatase. Q#23935 - CGI_10023493 superfamily 222150 706 733 0.00619037 35.8305 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#23936 - CGI_10023494 superfamily 220281 237 491 1.11E-34 135.148 cl12364 Ndc1_Nup superfamily N - "Nucleoporin protein Ndc1-Nup; Ndc1 is a nucleoporin protein that is a component of the Nuclear Pore Complex, and, in fungi, also of the Spindle Pole Body. It consists of six transmembrane segments, three lumenal loops, both concentrated at the N-terminus and cytoplasmic domains largely at the C-terminus, all of which are well conserved." 
Q#23937 - CGI_10023495 superfamily 241622 44 119 3.49E-16 69.5178 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either an N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95 (post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#23938 - CGI_10023496 superfamily 216897 43 125 4.13E-26 99.678 cl03463 Gal_Lectin superfamily - - Galactose binding lectin domain; Galactose binding lectin domain. Q#23938 - CGI_10023496 superfamily 243029 139 198 1.43E-08 50.976 cl02422 HRM superfamily - - Hormone receptor domain; This extracellular domain contains four conserved cysteines that probably form disulphide bridges. The domain is found in a variety of hormone receptors. It may be a ligand binding domain. Q#23939 - CGI_10023497 superfamily 244859 110 315 1.31E-12 66.0309 cl08171 HtrL_YibB superfamily - - "Bacterial protein of unknown function (HtrL_YibB); The protein from this rare, uncharacterized protein family is designated HtrL or YibB in E. coli, where its gene is found in a region of LPS core biosynthesis genes. 
Homologues are found in Shigella flexneri, Campylobacter jejuni, and Caenorhabditis elegans only. The htrL gene may represent an insertion to the LPS core biosynthesis region, rather than an LPS biosynthetic protein." Q#23942 - CGI_10023500 superfamily 220691 5 228 0.00206907 37.5974 cl18569 7TM_GPCR_Srv superfamily N - Serpentine type 7TM GPCR chemoreceptor Srv; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srv is a member of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#23943 - CGI_10023501 superfamily 241642 405 461 0.000340471 38.6306 cl00152 t_SNARE superfamily - - "Soluble NSF (N-ethylmaleimide-sensitive fusion protein)-Attachment protein (SNAP) REceptor domain; these alpha-helical motifs form twisted and parallel heterotetrameric helix bundles; the core complex contains one helix from a protein that is anchored in the vesicle membrane (synaptobrevin), one helix from a protein of the target membrane (syntaxin), and two helices from another protein anchored in the target membrane (SNAP-25); their interaction forms a core which is composed of a polar zero layer, a flanking leucine-zipper layer acts as a water tight shield to isolate ionic interactions in the zero layer from the surrounding solvent" Q#23944 - CGI_10023502 superfamily 192393 13 114 5.56E-29 105.046 cl10788 EAF superfamily - - RNA polymerase II transcription elongation factor; Members of this family act as transcriptional transactivators of ELL and ELL2 elongation activities. Eaf proteins form a stable heterodimer complex with ELL proteins to facilitate the binding of RNA polymerase II to activate transcription elongation. The N-terminus of approx 120 residues is globular and highly conserved. 
Q#23945 - CGI_10023504 superfamily 218200 9 35 1.77E-06 45.8211 cl04660 Glyco_transf_54 superfamily N - "N-Acetylglucosaminyltransferase-IV (GnT-IV) conserved region; The complex-type of oligosaccharides are synthesised through elongation by glycosyltransferases after trimming of the precursor oligosaccharides transferred to proteins in the endoplasmic reticulum. N-Acetylglucosaminyltransferases (GnTs) take part in the formation of branches in the biosynthesis of complex-type sugar chains. In vertebrates, six GnTs, designated as GnT-I to -VI, which catalyze the transfer of GlcNAc to the core mannose residues of Asn-linked sugar chains, have been identified. GnT-IV (EC:2.4.1.145) catalyzes the transfer of GlcNAc from UDP-GlcNAc to the GlcNAc1-2Man1-3 arm of core oligosaccharide [Gn2(22)core oligosaccharide] and forms GlcNAc1-4(GlcNAc1-2)Man1-3 structure on the core oligosaccharide (Gn3(2,4,2)core oligosaccharide). In some members the conserved region occupies all but the very far N-terminus, where there is a signal sequence on all members. For other members the conserved region does not occupy the entire protein but is still to the N-terminus of the protein." Q#23946 - CGI_10023505 superfamily 216653 165 305 2.15E-26 101.905 cl08331 Na_Ca_ex superfamily - - "Sodium/calcium exchanger protein; This is a family of sodium/calcium exchanger integral membrane proteins. This family covers the integral membrane regions of the proteins. Sodium/calcium exchangers regulate intracellular Ca2+ concentrations in many cells; cardiac myocytes, epithelial cells, neurons, retinal rod photoreceptors and smooth muscle cells. Ca2+ is moved into or out of the cytosol depending on Na+ concentration. In humans and rats there are 3 isoforms; NCX1, NCX2 and NCX3." 
Q#23947 - CGI_10023506 superfamily 246750 28 143 0.00186933 36.8414 cl14880 CBM6-CBM35-CBM36_like superfamily - - "Carbohydrate Binding Module 6 (CBM6) and CBM35_like superfamily; Carbohydrate binding module family 6 (CBM6, family 6 CBM), also known as cellulose binding domain family VI (CBD VI), and related CBMs (CBM35 and CBM36). These are non-catalytic carbohydrate binding domains found in a range of enzymes that display activities against a diverse range of carbohydrate targets, including mannan, xylan, beta-glucans, cellulose, agarose, and arabinans. These domains facilitate the strong binding of the appended catalytic modules to their dedicated, insoluble substrates. Many of these CBMs are associated with glycoside hydrolase (GH) domains. CBM6 is an unusual CBM as it represents a chimera of two distinct binding sites with different modes of binding: binding site I within the loop regions and binding site II on the concave face of the beta-sandwich fold. CBM36s are calcium-dependent xylan binding domains. CBM35s display conserved specificity through extensive sequence similarity, but divergent function through their appended catalytic modules. This alignment model also contains the C-terminal domains of bacterial insecticidal toxins, where they may be involved in determining insect specificity through carbohydrate binding functionality." Q#23948 - CGI_10023507 superfamily 243353 27 67 1.80E-14 63.2172 cl03225 GRIP superfamily - - "GRIP domain; The GRIP (golgin-97, RanBP2alpha,Imh1p and p230/golgin-245) domain is found in many large coiled-coil proteins. It has been shown to be sufficient for targeting to the Golgi. The GRIP domain contains a completely conserved tyrosine residue. At least some of these domains have been shown to bind to GTPase Arl1, see structures in." 
Q#23950 - CGI_10023509 superfamily 245599 382 540 6.05E-30 116.939 cl11397 NR_LBD superfamily - - "The ligand binding domain of nuclear receptors, a family of ligand-activated transcription regulators; Ligand-binding domain (LBD) of nuclear receptor (NR): Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions in metazoans, from development, reproduction, to homeostasis and metabolism. The superfamily contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. The members of the family include receptors of steroids, thyroid hormone, retinoids, cholesterol by-products, lipids and heme. With few exceptions, NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD)." Q#23950 - CGI_10023509 superfamily 243104 574 617 3.52E-05 42.1469 cl02601 PSI superfamily - - "Plexin repeat; A cysteine rich repeat found in several different extracellular receptors. The function of the repeat is unknown. Three copies of the repeat are found in Plexin. Two copies of the repeat are found in mahogany protein. A related C. elegans protein contains four copies of the repeat. The Met receptor contains a single copy of the repeat. The Pfam alignment shows 6 conserved cysteine residues that may form three conserved disulphide bridges, whereas shows 8 conserved cysteines. The pattern of conservation suggests that cysteines 5 and 7 (that are not absolutely conserved) form a disulphide bridge (Personal observation. A Bateman)." Q#23953 - CGI_10023512 superfamily 245201 355 499 1.29E-18 84.9809 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. 
It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#23954 - CGI_10023513 superfamily 246921 289 332 1.50E-10 58.5409 cl15299 FG-GAP superfamily - - "FG-GAP repeat; This family contains the extracellular repeat that is found in up to seven copies in alpha integrins. This repeat has been predicted to fold into a beta propeller structure. The repeat is called the FG-GAP repeat after two conserved motifs in the repeat. The FG-GAP repeats are found in the N terminus of integrin alpha chains, a region that has been shown to be important for ligand binding. A putative Ca2+ binding motif is found in some of the repeats." Q#23954 - CGI_10023513 superfamily 246921 214 281 7.38E-05 41.9773 cl15299 FG-GAP superfamily - - "FG-GAP repeat; This family contains the extracellular repeat that is found in up to seven copies in alpha integrins. This repeat has been predicted to fold into a beta propeller structure. The repeat is called the FG-GAP repeat after two conserved motifs in the repeat. The FG-GAP repeats are found in the N terminus of integrin alpha chains, a region that has been shown to be important for ligand binding. A putative Ca2+ binding motif is found in some of the repeats." Q#23955 - CGI_10023514 superfamily 241574 8 140 3.02E-37 135.406 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. 
Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#23955 - CGI_10023514 superfamily 241574 186 370 1.56E-22 93.8045 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#23957 - CGI_10012621 superfamily 215648 388 609 1.08E-29 117.697 cl02802 7tm_3 superfamily - - "7 transmembrane sweet-taste receptor of 3 GCPR; This is a domain of seven transmembrane regions that forms the C-terminus of some subclass 3 G-coupled-protein receptors. It is often associated with a downstream cysteine-rich linker domain, NCD3G pfam07562, which is the human sweet-taste receptor, and the N-terminal domain, ANF_receptor pfam01094. The seven TM regions assemble in such a way as to produce a docking pocket into which such molecules as cyclamate and lactisole have been found to bind and consequently confer the taste of sweetness." Q#23957 - CGI_10012621 superfamily 217211 206 261 0.000785652 38.4194 cl03691 Cache_1 superfamily C - Cache domain; Cache domain. Q#23958 - CGI_10012622 superfamily 248022 317 473 6.93E-11 62.6803 cl17468 Aa_trans superfamily NC - "Transmembrane amino acid transporter protein; This transmembrane region is found in many amino acid transporters including UNC-47 and MTR. 
UNC-47 encodes a vesicular amino butyric acid (GABA) transporter, (VGAT). UNC-47 is predicted to have 10 transmembrane domains. MTR is a N system amino acid transporter system protein involved in methyltryptophan resistance. Other members of this family include proline transporters and amino acid permeases." Q#23958 - CGI_10012622 superfamily 248022 136 185 7.93E-05 43.8055 cl17468 Aa_trans superfamily C - "Transmembrane amino acid transporter protein; This transmembrane region is found in many amino acid transporters including UNC-47 and MTR. UNC-47 encodes a vesicular amino butyric acid (GABA) transporter, (VGAT). UNC-47 is predicted to have 10 transmembrane domains. MTR is a N system amino acid transporter system protein involved in methyltryptophan resistance. Other members of this family include proline transporters and amino acid permeases." Q#23960 - CGI_10012624 superfamily 142634 1021 1340 7.37E-166 500.593 cl11429 RNAP_largest_subunit_C superfamily - - "Largest subunit of RNA polymerase (RNAP), C-terminal domain; RNA polymerase (RNAP) is a large multi-subunit complex responsible for the synthesis of RNA. It is the principal enzyme of the transcription process, and is the final target in many regulatory pathways that control gene expression in all living cells. At least three distinct RNAP complexes are found in eukaryotic nuclei, RNAP I, RNAP II, and RNAP III, for the synthesis of ribosomal RNA precursor, mRNA precursor, and 5S and tRNA, respectively. A single distinct RNAP complex is found in prokaryotes and archaea, which may be responsible for the synthesis of all RNAs. Structure studies revealed that prokaryotic and eukaryotic RNAPs share a conserved crab-claw-shape structure. The largest and the second largest subunits each make up one clamp, one jaw, and part of the cleft. 
The largest RNAP subunit (Rpb1) interacts with the second-largest RNAP subunit (Rpb2) to form the DNA entry and RNA exit channels in addition to the catalytic center of RNA synthesis. The region covered by this domain makes up part of the foot and jaw structures. In archaea, some photosynthetic organisms, and some organelles, this domain exists as a separate subunit, while it forms the C-terminal region of the RNAP largest subunit in eukaryotes and bacteria." Q#23960 - CGI_10012624 superfamily 245715 246 548 1.21E-139 430.787 cl11591 RNA_pol_Rpb1_2 superfamily - - "RNA polymerase Rpb1, domain 2; RNA polymerases catalyze the DNA dependent polymerisation of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). This domain, domain 2, contains the active site. The invariant motif -NADFDGD- binds the active site magnesium ion." Q#23960 - CGI_10012624 superfamily 218370 12 354 5.30E-101 326.564 cl04880 RNA_pol_Rpb1_1 superfamily - - "RNA polymerase Rpb1, domain 1; RNA polymerases catalyze the DNA dependent polymerisation of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). This domain, domain 1, represents the clamp domain, which is a mobile domain involved in positioning the DNA, maintenance of the transcription bubble and positioning of the nascent RNA strand." Q#23960 - CGI_10012624 superfamily 218361 526 700 8.52E-45 160.864 cl04873 RNA_pol_Rpb1_3 superfamily - - "RNA polymerase Rpb1, domain 3; RNA polymerases catalyze the DNA dependent polymerisation of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). This domain, domain 3, represents the pore domain. The 3' end of RNA is positioned close to this domain. 
The pore delimited by this domain is thought to act as a channel through which nucleotides enter the active site and/or where the 3' end of the RNA may be extruded during back-tracking." Q#23960 - CGI_10012624 superfamily 218372 725 831 8.14E-37 136.346 cl04881 RNA_pol_Rpb1_4 superfamily - - "RNA polymerase Rpb1, domain 4; RNA polymerases catalyze the DNA dependent polymerisation of RNA. Prokaryotes contain a single RNA polymerase compared to three in eukaryotes (not including mitochondrial and chloroplast polymerases). This domain, domain 4, represents the funnel domain. The funnel contains the binding site for some elongation factors." Q#23961 - CGI_10012625 superfamily 247799 155 272 1.30E-35 128.516 cl17245 KH-I superfamily - - "K homology RNA-binding domain, type I. KH binds single-stranded RNA or DNA. It is found in a wide variety of proteins including ribosomal proteins, transcription factors and post-transcriptional modifiers of mRNA. There are two different KH domains that belong to different protein folds, but they share a single KH motif. The KH motif is folded into a beta alpha alpha beta unit. In addition to the core, type II KH domains (e.g. ribosomal protein S3) include N-terminal extension and type I KH domains (e.g. hnRNP K) contain C-terminal extension." Q#23962 - CGI_10012626 superfamily 243310 26 226 1.23E-52 176.66 cl03120 ELO superfamily - - "GNS1/SUR4 family; Members of this family are involved in long chain fatty acid elongation systems that produce the 26-carbon precursors for ceramide and sphingolipid synthesis. Predicted to be integral membrane proteins, in eukaryotes they are probably located on the endoplasmic reticulum. Yeast ELO3 affects plasma membrane H+-ATPase activity, and may act on a glucose-signaling pathway that controls the expression of several genes that are transcriptionally regulated by glucose such as PMA1." 
Q#23962 - CGI_10012626 superfamily 220228 227 275 4.09E-11 58.4512 cl09661 CENP-X superfamily N - "CENP-S associating Centromere protein X; The centromere, essential for faithful chromosome segregation during mitosis, has a network of constitutive centromere-associated (CCAN) proteins associating with it during mitosis. So far in vertebrates at least 15 centromere proteins have been identified, which are divided into several subclasses based on functional and biochemical analyses. These provide a platform for the formation of a functional kinetochore during mitosis. CENP-S is one that does not associate with the CENP-H-containing complex but rather interacts with CENP-X to form a stable assembly of outer kinetochore proteins that functions downstream of other components of the CCAN. This complex may directly allow efficient and stable formation of the outer kinetochore on the CCAN platform." Q#23963 - CGI_10012627 superfamily 243310 26 262 1.62E-65 206.321 cl03120 ELO superfamily - - "GNS1/SUR4 family; Members of this family are involved in long chain fatty acid elongation systems that produce the 26-carbon precursors for ceramide and sphingolipid synthesis. Predicted to be integral membrane proteins, in eukaryotes they are probably located on the endoplasmic reticulum. Yeast ELO3 affects plasma membrane H+-ATPase activity, and may act on a glucose-signaling pathway that controls the expression of several genes that are transcriptionally regulated by glucose such as PMA1." Q#23964 - CGI_10012628 superfamily 241580 200 277 5.22E-21 87.9946 cl00061 FH superfamily - - "Forkhead (FH), also known as a "winged helix". FH is named for the Drosophila fork head protein, a transcription factor which promotes terminal rather than segmental development. This family of transcription factor domains, which bind to B-DNA as monomers, are also found in the Hepatocyte nuclear factor (HNF) proteins, which provide tissue-specific gene regulation. 
The structure contains 2 flexible loops or "wings" in the C-terminal region, hence the term winged helix." Q#23965 - CGI_10012629 superfamily 243092 260 514 2.28E-41 150.563 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#23965 - CGI_10012629 superfamily 150420 39 185 0.00136772 38.1779 cl18042 Jnk-SapK_ap_N superfamily N - JNK_SAPK-associated protein-1; This is the N-terminal 200 residues of a set of proteins conserved from yeasts to humans. Most of the proteins in this entry have an RhoGEF pfam00621 domain at their C-terminal end. 
Q#23966 - CGI_10012630 superfamily 241900 457 762 4.47E-113 354.637 cl00490 EEP superfamily - - "Exonuclease-Endonuclease-Phosphatase (EEP) domain superfamily; This large superfamily includes the catalytic domain (exonuclease/endonuclease/phosphatase or EEP domain) of a diverse set of proteins including the ExoIII family of apurinic/apyrimidinic (AP) endonucleases, inositol polyphosphate 5-phosphatases (INPP5), neutral sphingomyelinases (nSMases), deadenylases (such as the vertebrate circadian-clock regulated nocturnin), bacterial cytolethal distending toxin B (CdtB), deoxyribonuclease 1 (DNase1), the endonuclease domain of the non-LTR retrotransposon LINE-1, and related domains. These diverse enzymes share a common catalytic mechanism of cleaving phosphodiester bonds; their substrates range from nucleic acids to phospholipids and perhaps proteins." Q#23966 - CGI_10012630 superfamily 246908 29 132 2.24E-31 119.853 cl15255 SH2 superfamily - - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." Q#23966 - CGI_10012630 superfamily 247057 994 1052 1.79E-12 64.6268 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. 
Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#23967 - CGI_10012631 superfamily 245206 20 286 2.27E-91 274.872 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#23968 - CGI_10012632 superfamily 245206 1 256 1.43E-88 266.783 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. 
NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#23969 - CGI_10012633 superfamily 241782 37 294 1.31E-42 150.955 cl00321 AAT_I superfamily C - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. 
Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phosphorylase family (fold type V)." Q#23971 - CGI_10012635 superfamily 245206 7 252 3.08E-112 325.905 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#23973 - CGI_10012637 superfamily 241546 941 1049 8.88E-46 161.938 cl00011 PLAT superfamily - - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats. The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." 
Q#23973 - CGI_10012637 superfamily 245670 173 358 1.93E-59 203.558 cl11519 DENN superfamily - - DENN (AEX-3) domain; DENN (after differentially expressed in neoplastic vs normal cells) is a domain which occurs in several proteins involved in Rab-mediated processes or regulation of MAPK signalling pathways. Q#23973 - CGI_10012637 superfamily 243635 9 109 2.57E-24 99.7164 cl04085 uDENN superfamily - - uDENN domain; This region is always found associated with pfam02141. It is predicted to form an all beta domain. Q#23973 - CGI_10012637 superfamily 243142 856 929 1.36E-17 81.5187 cl02689 RUN superfamily N - "RUN domain; This domain is present in several proteins that are linked to the functions of GTPases in the Rap and Rab families. They could hence play important roles in multiple Ras-like GTPase signalling pathways. The domain comprises six conserved regions, which in some proteins have considerable insertions between them. The domain core is thought to take up a predominantly alpha fold, with basic amino acids in regions A and D possibly playing a functional role in interactions with Ras GTPases." Q#23973 - CGI_10012637 superfamily 208095 484 559 3.45E-14 69.5447 cl04084 dDENN superfamily - - dDENN domain; This region is always found associated with pfam02141. It is predicted to form a globular domain. This domain is predicted to be completely alpha helical. Although not statistically supported it has been suggested that this domain may be similar to members of the Rho/Rac/Cdc42 GEF family. Q#23973 - CGI_10012637 superfamily 243142 1133 1266 5.45E-08 52.6287 cl02689 RUN superfamily - - "RUN domain; This domain is present in several proteins that are linked to the functions of GTPases in the Rap and Rab families. They could hence play important roles in multiple Ras-like GTPase signalling pathways. The domain comprises six conserved regions, which in some proteins have considerable insertions between them. 
The domain core is thought to take up a predominantly alpha fold, with basic amino acids in regions A and D possibly playing a functional role in interactions with Ras GTPases." Q#23974 - CGI_10000408 superfamily 241743 26 190 1.11E-14 67.9834 cl00274 ML superfamily - - "The ML (MD-2-related lipid-recognition) domain is present in MD-1, MD-2, GM2 activator protein, Niemann-Pick type C2 (Npc2) protein, phosphatidylinositol/phosphatidylglycerol transfer protein (PG/PI-TP), mite allergen Der p 2 and several proteins of unknown function in plants, animals and fungi. These single-domain proteins form two anti-parallel beta-pleated sheets stabilized by three disulfide bonds and with an accessible central hydrophobic cavity, and are predicted to mediate diverse biological functions through interaction with specific lipids." Q#23975 - CGI_10018635 superfamily 246918 224 276 1.54E-16 73.3899 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#23975 - CGI_10018635 superfamily 243072 167 212 0.00306055 36.2075 cl02529 ANK superfamily NC - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#23976 - CGI_10018636 superfamily 245847 34 154 1.76E-21 85.6861 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." 
Q#23978 - CGI_10018638 superfamily 241550 363 686 2.80E-126 385.347 cl00015 nt_trans superfamily - - "nucleotidyl transferase superfamily; nt_trans (nucleotidyl transferase) This superfamily includes the class I amino-acyl tRNA synthetases, pantothenate synthetase (PanC), ATP sulfurylase, and the cytidylyltransferases, all of which have a conserved dinucleotide-binding domain." Q#23978 - CGI_10018638 superfamily 245839 695 829 3.55E-17 79.4561 cl12020 Anticodon_Ia_like superfamily - - "Anticodon-binding domain of class Ia aminoacyl tRNA synthetases and similar domains; This domain is found in a variety of class Ia aminoacyl tRNA synthetases, C-terminal to the catalytic core domain. It recognizes and specifically binds to the anticodon of the tRNA. Aminoacyl tRNA synthetases catalyze the transfer of cognate amino acids to the 3'-end of their tRNAs by specifically recognizing cognate from non-cognate amino acids. Members include valyl-, leucyl-, isoleucyl-, cysteinyl-, arginyl-, and methionyl-tRNA synthetases. This superfamily also includes a domain from MshC, an enzyme in the mycothiol biosynthetic pathway." Q#23978 - CGI_10018638 superfamily 248097 255 371 1.27E-23 98.1062 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#23979 - CGI_10018639 superfamily 248097 70 192 2.80E-30 109.277 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#23980 - CGI_10018640 superfamily 248012 5 86 6.21E-10 54.8913 cl17458 TIR_2 superfamily N - TIR domain; This is a family of bacterial Toll-like receptors. Q#23980 - CGI_10018640 superfamily 248012 166 279 0.000174084 39.0981 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#23981 - CGI_10018641 superfamily 248012 2 86 4.98E-08 46.4169 cl17458 TIR_2 superfamily N - TIR domain; This is a family of bacterial Toll-like receptors. 
Q#23982 - CGI_10018642 superfamily 248097 42 128 3.29E-21 84.239 cl17543 C1q superfamily C - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#23983 - CGI_10018643 superfamily 247068 64 160 1.69E-14 66.5681 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#23984 - CGI_10018644 superfamily 247723 102 187 9.50E-27 99.6845 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. 
The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#23985 - CGI_10018645 superfamily 243092 28 369 8.12E-11 61.1968 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#23986 - CGI_10018646 superfamily 220695 73 198 0.00483171 37.5583 cl18571 7TM_GPCR_Srx superfamily C - Serpentine type 7TM GPCR chemoreceptor Srx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srx is part of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. 
Q#23987 - CGI_10018647 superfamily 245201 608 821 2.47E-66 222.797 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#23987 - CGI_10018647 superfamily 241584 499 588 0.00184247 37.4759 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#23987 - CGI_10018647 superfamily 245213 29 61 0.00514554 35.9154 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#23988 - CGI_10018648 superfamily 192535 50 323 6.97E-05 42.583 cl18179 7TM_GPCR_Srsx superfamily - - Serpentine type 7TM GPCR chemoreceptor Srsx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srsx is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#23989 - CGI_10018649 superfamily 222032 146 298 2.58E-39 136.608 cl16218 CPSF100_C superfamily - - "Cleavage and polyadenylation factor 2 C-terminal; This family lies at the C-terminus of many fungal and plant cleavage and polyadenylation specificity factor subunit 2 proteins. The exact function of the domain is not known, but is likely to function as a binding domain for the protein within the overall CPSF complex." Q#23989 - CGI_10018649 superfamily 203663 64 101 2.28E-06 43.6323 cl06522 RMMBL superfamily - - RNA-metabolising metallo-beta-lactamase; The metallo-beta-lactamase fold contains five sequence motifs. The first four motifs are found in pfam00753 and are common to all metallo-beta-lactamases. The fifth motif appears to be specific to function. This entry represents the fifth motif from metallo-beta-lactamases involved in RNA metabolism. 
Q#23990 - CGI_10018650 superfamily 241613 629 663 2.82E-09 55.6758 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#23990 - CGI_10018650 superfamily 241613 1960 1994 3.61E-08 52.5942 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#23990 - CGI_10018650 superfamily 241613 2297 2328 6.74E-08 51.8238 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and 
transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#23990 - CGI_10018650 superfamily 241613 2090 2123 2.24E-07 50.283 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#23990 - CGI_10018650 superfamily 245213 2384 2422 6.47E-07 49.1722 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. 
Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#23990 - CGI_10018650 superfamily 241613 591 623 2.16E-06 47.5866 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#23990 - CGI_10018650 superfamily 241613 1999 2031 2.55E-06 47.2014 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#23990 - CGI_10018650 superfamily 241613 2251 2284 5.59E-06 46.0458 cl00104 LDLa superfamily 
- - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#23990 - CGI_10018650 superfamily 241613 2127 2159 1.22E-05 45.2754 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#23990 - CGI_10018650 superfamily 241613 2169 2202 1.31E-05 45.2754 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids 
are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#23990 - CGI_10018650 superfamily 241613 682 705 5.52E-05 43.3494 cl00104 LDLa superfamily N - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#23990 - CGI_10018650 superfamily 241613 1918 1952 5.61E-05 43.3494 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are 
functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#23990 - CGI_10018650 superfamily 241613 2053 2077 0.000457802 40.653 cl00104 LDLa superfamily N - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#23990 - CGI_10018650 superfamily 241613 554 586 0.000460554 40.653 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#23990 - CGI_10018650 superfamily 241613 464 
498 0.000470317 40.653 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#23990 - CGI_10018650 superfamily 241752 2496 2586 4.35E-29 115.88 cl00283 ADP_ribosyl superfamily N - "ADP_ribosylating enzymes catalyze the transfer of ADP_ribose from NAD+ to substrates. Bacterial toxins are cytoplasmic and catalyze the transfer of a single ADP_ribose unit to eukaryotic elongation factor 2, halting protein synthesis and killing the cell. Poly(ADP-ribose) polymerases (PARPs 1-3, VPARP, tankyrase) catalyze the addition of up to 100 ADP_ribose units from NAD+. PARPs 1 and 2 are localized in the nucleus, bind DNA, and are activated by DNA damage. VPARP is part of the vault ribonucleoprotein complex. Tankyrases regulate telomere length in part through poly(ADP_ribosylation) of telomere repeat binding factor 1 (TRF1). Poly(ADP-ribose) polymerase catalyses the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. 
Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active." Q#23990 - CGI_10018650 superfamily 214531 1766 1806 3.25E-11 61.4636 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#23990 - CGI_10018650 superfamily 214531 908 951 6.08E-07 49.1373 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#23990 - CGI_10018650 superfamily 214531 866 906 1.20E-06 48.3669 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#23990 - CGI_10018650 superfamily 214531 1394 1427 1.76E-05 44.9001 cl18310 LY superfamily C - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#23990 - CGI_10018650 superfamily 214531 319 354 1.98E-05 44.5149 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#23990 - CGI_10018650 superfamily 215683 1739 1781 3.11E-05 44.0831 cl18339 Ldl_recept_b superfamily - - Low-density lipoprotein receptor repeat class B; This domain is also known as the YWTD motif after the most conserved region of the repeat. 
The YWTD repeat is found in multiple tandem repeats and has been predicted to form a beta-propeller structure. Q#23990 - CGI_10018650 superfamily 221695 2364 2387 4.38E-05 43.5978 cl18612 cEGF superfamily - - "Complement Clr-like EGF-like; cEGF, or complement Clr-like EGF, domains have six conserved cysteine residues disulfide-bonded into the characteristic pattern 'ababcc'. They are found in blood coagulation proteins such as fibrillin, Clr and Cls, thrombomodulin, and the LDL receptor. The core fold of the EGF domain consists of two small beta-hairpins packed against each other. Two major structural variants have been identified based on the structural context of the C-terminal cysteine residue of disulfide 'c' in the C-terminal hairpin: hEGFs and cEGFs. In cEGFs the C-terminal thiol resides on the C-terminal beta-sheet, resulting in long loop-lengths between the cysteine residues of disulfide 'c', typically C[10+]XC. These longer loop-lengths may have arisen by selective cysteine loss from a four-disulfide EGF template such as laminin or integrin. Tandem cEGF domains have five linking residues between terminal cysteines of adjacent domains. cEGF domains may or may not bind calcium in the linker region. cEGF domains with the consensus motif CXN4X[F,Y]XCXC are hydroxylated exclusively on the asparagine residue." Q#23990 - CGI_10018650 superfamily 214531 964 998 0.000618923 40.2777 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#23990 - CGI_10018650 superfamily 215683 1020 1059 0.00163695 39.0755 cl18339 Ldl_recept_b superfamily - - Low-density lipoprotein receptor repeat class B; This domain is also known as the YWTD motif after the most conserved region of the repeat. 
The YWTD repeat is found in multiple tandem repeats and has been predicted to form a beta-propeller structure. Q#23990 - CGI_10018650 superfamily 214531 1443 1485 0.00523534 37.5813 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#23990 - CGI_10018650 superfamily 215683 1825 1865 0.00561475 37.5347 cl18339 Ldl_recept_b superfamily - - Low-density lipoprotein receptor repeat class B; This domain is also known as the YWTD motif after the most conserved region of the repeat. The YWTD repeat is found in multiple tandem repeats and has been predicted to form a beta-propeller structure. Q#23990 - CGI_10018650 superfamily 214531 1181 1228 0.00639906 37.1961 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#23990 - CGI_10018650 superfamily 214531 1670 1707 0.00724299 37.1961 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#23991 - CGI_10018651 superfamily 189857 23 140 2.91E-23 92.313 cl07832 Caveolin superfamily - - "Caveolin; All three known Caveolin forms have the FEDVIAEP caveolin 'signature motif' within their hydrophilic N-terminal domain. Caveolin 2 (Cav-2) is co-localised and co-expressed with Cav-1/VIP21, forms heterodimers with it and needs Cav-1 for proper membrane localisation. Cav-3 has greater protein sequence similarity to Cav-1 than to Cav-2. 
Cellular processes caveolins are involved in include vesicular transport, cholesterol homeostasis, signal transduction, and tumour suppression." Q#23992 - CGI_10018652 superfamily 243066 136 239 2.80E-06 44.9892 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#23993 - CGI_10018653 superfamily 220603 30 82 4.91E-21 85.5263 cl10849 DUF2360 superfamily C - Predicted coiled-coil domain-containing protein (DUF2360); This is the conserved 140 amino acid region of a family of proteins conserved from nematodes to humans. One C. elegans member is annotated as a Daf-16-dependent longevity protein 1 but this could not be confirmed. The function is unknown. Q#23993 - CGI_10018653 superfamily 220603 139 179 2.27E-16 72.4295 cl10849 DUF2360 superfamily N - Predicted coiled-coil domain-containing protein (DUF2360); This is the conserved 140 amino acid region of a family of proteins conserved from nematodes to humans. One C. elegans member is annotated as a Daf-16-dependent longevity protein 1 but this could not be confirmed. The function is unknown. 
Q#23994 - CGI_10018654 superfamily 243689 29 95 5.01E-08 51.0901 cl04271 IBN_N superfamily - - Importin-beta N-terminal domain; Importin-beta N-terminal domain. Q#23994 - CGI_10018654 superfamily 219817 122 249 1.26E-07 51.0797 cl07129 Xpo1 superfamily - - "Exportin 1-like protein; The sequences featured in this family are similar to a region close to the N-terminus of yeast exportin 1 (Xpo1, Crm1). This region is found just C-terminal to an importin-beta N-terminal domain (pfam03810) in many members of this family. Exportin 1 is a nuclear export receptor that interacts with leucine-rich nuclear export signal (NES) sequences, and Ran-GTP, and is involved in translocation of proteins out of the nucleus." Q#23995 - CGI_10018655 superfamily 246925 93 174 0.000240533 40.4166 cl15309 LRR_RI superfamily NC - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#23998 - CGI_10018658 superfamily 217380 314 605 6.49E-62 208.718 cl18406 TTL superfamily - - "Tubulin-tyrosine ligase family; Tubulins and microtubules are subjected to several post-translational modifications of which the reversible detyrosination/tyrosination of the carboxy-terminal end of most alpha-tubulins has been extensively analysed. This modification cycle involves a specific carboxypeptidase and the activity of the tubulin-tyrosine ligase (TTL). The true physiological function of TTL has so far not been established. 
Tubulin-tyrosine ligase (TTL) catalyzes the ATP-dependent post-translational addition of a tyrosine to the carboxy terminal end of detyrosinated alpha-tubulin. In normally cycling cells, the tyrosinated form of tubulin predominates. However, in breast cancer cells, the detyrosinated form frequently predominates, with a correlation to tumour aggressiveness. On the other hand, 3-nitrotyrosine has been shown to be incorporated, by TTL, into the carboxy terminal end of detyrosinated alpha-tubulin. This reaction is not reversible by the carboxypeptidase enzyme. Cells cultured in 3-nitrotyrosine rich medium showed evidence of altered microtubule structure and function, including altered cell morphology, epithelial barrier dysfunction, and apoptosis. Bacterial homologs of TTL are predicted to form peptide tags. Some of these are fused to a 2-oxoglutarate Fe(II)-dependent dioxygenase domain." Q#23999 - CGI_10018659 superfamily 241754 170 565 1.73E-108 331.843 cl00286 Motor_domain superfamily - - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. Q#24000 - CGI_10018660 superfamily 247724 77 207 1.54E-13 68.3394 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#24000 - CGI_10018660 superfamily 217881 201 238 1.25E-10 57.5614 cl04390 Abhydro_lipase superfamily N - Partial alpha/beta-hydrolase lipase region; This family corresponds to a N-terminal part of an alpha/beta hydrolase domain. Q#24000 - CGI_10018660 superfamily 218405 221 386 0.00890406 36.7057 cl18455 DUF676 superfamily C - Putative serine esterase (DUF676); This family of proteins are probably serine esterase type enzymes with an alpha/beta hydrolase fold. Q#24003 - CGI_10013015 superfamily 245864 43 280 7.42E-26 105.053 cl12078 p450 superfamily C - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. 
These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#24004 - CGI_10013016 superfamily 243051 163 242 8.65E-14 65.8349 cl02479 MAM superfamily NC - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal cord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich regions." Q#24012 - CGI_10013024 superfamily 243069 209 314 7.92E-08 50.037 cl02525 Band_7 superfamily - - "The band 7 domain of flotillin (reggie) like proteins. This group contains proteins similar to stomatin, prohibitin, flotillin, HflK/C and podocin. Many of these band 7 domain-containing proteins are lipid raft-associated. Individual proteins of this band 7 domain family may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. 
Microdomains formed from flotillin proteins may in addition be dynamic units with their own regulatory functions. Flotillins have been implicated in signal transduction, vesicle trafficking, cytoskeleton rearrangement and are known to interact with a variety of proteins. Stomatin interacts with and regulates members of the degenerin/epithelial Na+ channel family in mechanosensory cells of Caenorhabditis elegans and vertebrate neurons and participates in trafficking of Glut1 glucose transporters. Prohibitin may act as a chaperone for the stabilization of mitochondrial proteins. Prokaryotic HflK/C plays a role in the decision between lysogenic and lytic cycle growth during lambda phage infection. Flotillins have been implicated in the progression of prion disease, in the pathogenesis of neurodegenerative diseases such as Parkinson's and Alzheimer's disease, and in cancer invasion and metastasis. Mutations in the podocin gene give rise to autosomal recessive steroid resistant nephrotic syndrome" Q#24013 - CGI_10013025 superfamily 243058 407 526 1.10E-21 92.7627 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#24013 - CGI_10013025 superfamily 243058 255 374 1.94E-18 83.5179 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. 
An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#24013 - CGI_10013025 superfamily 243058 492 633 4.81E-18 82.3623 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#24013 - CGI_10013025 superfamily 243058 178 288 2.76E-10 59.2503 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. 
ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#24013 - CGI_10013025 superfamily 243069 941 1034 0.000798856 39.2514 cl02525 Band_7 superfamily N - "The band 7 domain of flotillin (reggie) like proteins. This group contains proteins similar to stomatin, prohibitin, flotillin, HlfK/C and podicin. Many of these band 7 domain-containing proteins are lipid raft-associated. Individual proteins of this band 7 domain family may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. Microdomains formed from flotillin proteins may in addition be dynamic units with their own regulatory functions. Flotillins have been implicated in signal transduction, vesicle trafficking, cytoskeleton rearrangement and are known to interact with a variety of proteins. Stomatin interacts with and regulates members of the degenerin/epithelia Na+ channel family in mechanosensory cells of Caenorhabditis elegans and vertebrate neurons and participates in trafficking of Glut1 glucose transporters. Prohibitin may act as a chaperone for the stabilization of mitochondrial proteins. Prokaryotic HflK/C plays a role in the decision between lysogenic and lytic cycle growth during lambda phage infection. Flotillins have been implicated in the progression of prion disease, in the pathogenesis of neurodegenerative diseases such as Parkinson's and Alzheimer's disease and, in cancer invasion and metastasis. Mutations in the podicin gene give rise to autosomal recessive steroid resistant nephritic syndrome" Q#24013 - CGI_10013025 superfamily 243058 376 436 0.000302679 40.1053 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. 
An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#24014 - CGI_10013026 superfamily 222150 713 736 5.47E-09 53.1645 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#24014 - CGI_10013026 superfamily 222150 683 710 5.60E-07 47.3865 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#24014 - CGI_10013026 superfamily 246975 728 749 4.13E-05 41.9489 cl15478 zf-C2H2 superfamily - - "Zinc finger, C2H2 type; The C2H2 zinc finger is the classical zinc finger domain. The two conserved cysteines and histidines co-ordinate a zinc ion. The following pattern describes the zinc finger. #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C] Where X can be any amino acid, and numbers in brackets indicate the number of residues. The positions marked # are those that are important for the stable fold of the zinc finger. The final position can be either his or cys. The C2H2 zinc finger is composed of two short beta strands followed by an alpha helix. The amino terminal part of the helix binds the major groove in DNA binding zinc fingers. The accepted consensus binding sequence for Sp1 is usually defined by the asymmetric hexanucleotide core GGGCGG but this sequence does not include, among others, the GAG (=CTC) repeat that constitutes a high-affinity site for Sp1 binding to the wt1 promoter." 
Q#24016 - CGI_10013028 superfamily 241644 33 178 5.25E-36 124.237 cl00154 UBCc superfamily - - "Ubiquitin-conjugating enzyme E2, catalytic (UBCc) domain. This is part of the ubiquitin-mediated protein degradation pathway in which a thiol-ester linkage forms between a conserved cysteine and the C-terminus of ubiquitin and complexes with ubiquitin protein ligase enzymes, E3. This pathway regulates many fundamental cellular processes. There are also other E2s which form thiol-ester linkages without the use of E3s as well as several UBC homologs (TSG101, Mms2, Croc-1 and similar proteins) which lack the active site cysteine essential for ubiquitination and appear to function in DNA repair pathways which were omitted from the scope of this CD." Q#24017 - CGI_10013029 superfamily 244994 202 285 2.28E-17 74.965 cl08520 Cdc6_C superfamily - - "Winged-helix domain of essential DNA replication protein Cell division control protein (Cdc6), which mediates DNA binding; This model characterizes the winged-helix, C-terminal domain of the Cell division control protein (Cdc6_C). Cdc6 (also known as Cell division cycle 6 or Cdc18) functions as a regulator at the early stages of DNA replication, by helping to recruit and load the Minichromosome Maintenance Complex (MCM) onto DNA and may have additional roles in the control of mitotic entry. Precise duplication of chromosomal DNA is required for genomic stability during replication. Cdc6 has an essential role in DNA replication and irregular expression of Cdc6 may lead to genomic instability. Cdc6 over-expression is observed in many cancerous lesions. DNA replication begins when an origin recognition complex (ORC) binds to a replication origin site on the chromatin. Studies indicate that Cdc6 interacts with ORC through the Orc1 subunit, and that this association increases the specificity of the ORC-origins interaction. 
Further studies suggest that hydrolysis of Cdc6-bound ATP promotes the association of the replication licensing factor Cdt1 with origins through an interaction with Orc6 and this in turn promotes the loading of MCM2-7 helicase onto chromatin. The MCM2-7 complex promotes the unwinding of DNA origins, and the binding of additional factors to initiate the DNA replication. S-Cdk (S-phase cyclin and cyclin-dependent kinase complex) prevents rereplication by causing the Cdc6 protein to dissociate from ORC and prevents the Cdc6 and MCM proteins from reassembling at any origin. By phosphorylating Cdc6, S-Cdk also triggers Cdc6's ubiquitination. The Cdc6 protein is composed of three domains, an N-terminal AAA+ domain with Walker A and B, and Sensor-1 and -2 motifs. The central region contains a conserved nucleotide binding/ATPase domain and is a member of the ATPase superfamily. The C-terminal domain (Cdc6_C) is a conserved winged-helix domain that possibly mediates protein-protein interactions or direct DNA interactions. Cdc6 is conserved in eukaryotes, and related genes are found in Archaea. The winged helix fold structure of Cdc6_C is similar to the structures of other eukaryotic replication initiators without apparent sequence similarity." Q#24017 - CGI_10013029 superfamily 247743 21 71 0.000418847 38.3497 cl17189 AAA superfamily N - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." 
Q#24018 - CGI_10013030 superfamily 241900 227 536 5.03E-77 248.372 cl00490 EEP superfamily - - "Exonuclease-Endonuclease-Phosphatase (EEP) domain superfamily; This large superfamily includes the catalytic domain (exonuclease/endonuclease/phosphatase or EEP domain) of a diverse set of proteins including the ExoIII family of apurinic/apyrimidinic (AP) endonucleases, inositol polyphosphate 5-phosphatases (INPP5), neutral sphingomyelinases (nSMases), deadenylases (such as the vertebrate circadian-clock regulated nocturnin), bacterial cytolethal distending toxin B (CdtB), deoxyribonuclease 1 (DNase1), the endonuclease domain of the non-LTR retrotransposon LINE-1, and related domains. These diverse enzymes share a common catalytic mechanism of cleaving phosphodiester bonds; their substrates range from nucleic acids to phospholipids and perhaps proteins." Q#24020 - CGI_10013032 superfamily 241610 10 63 2.02E-13 58.8006 cl00101 KU superfamily - - BPTI/Kunitz family of serine protease inhibitors; Structure is a disulfide rich alpha+beta fold. BPTI (bovine pancreatic trypsin inhibitor) is an extensively studied model structure. Q#24022 - CGI_10001355 superfamily 243161 4 69 9.02E-06 41.9962 cl02739 THAP superfamily C - "THAP domain; The THAP domain is a putative DNA-binding domain (DBD) and probably also binds a zinc ion. It features the conserved C2CH architecture (consensus sequence: Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His). Other universal features include the location of the domain at the N-termini of proteins, its size of about 90 residues, a C-terminal AVPTIF box and several other conserved residues. Orthologues of the human THAP domain have been identified in other vertebrates and probably worms and flies, but not in other eukaryotes or any prokaryotes." 
Q#24028 - CGI_10001842 superfamily 241578 11 181 1.26E-06 47.1754 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#24029 - CGI_10001843 superfamily 190706 28 240 5.09E-21 92.0788 cl04201 Glyco_hydro_79n superfamily - - "Glycosyl hydrolase family 79, N-terminal domain; Family of endo-beta-N-glucuronidase, or heparanase. Heparan sulfate proteoglycans (HSPGs) play a key role in the self- assembly, insolubility and barrier properties of basement membranes and extracellular matrices. Hence, cleavage of heparan sulfate (HS) affects the integrity and functional state of tissues and thereby fundamental normal and pathological phenomena involving cell migration and response to changes in the extracellular micro-environment. Heparanase degrades HS at specific intra-chain sites. The enzyme is synthesised as a latent approximately 65 kDa protein that is processed at the N-terminus into a highly active approximately 50 kDa form. 
Experimental evidence suggests that heparanase may facilitate both tumour cell invasion and neovascularization, both critical steps in cancer progression. The enzyme is also involved in cell migration associated with inflammation and autoimmunity." Q#24029 - CGI_10001843 superfamily 190706 292 322 0.00358929 37.7657 cl04201 Glyco_hydro_79n superfamily N - "Glycosyl hydrolase family 79, N-terminal domain; Family of endo-beta-N-glucuronidase, or heparanase. Heparan sulfate proteoglycans (HSPGs) play a key role in the self- assembly, insolubility and barrier properties of basement membranes and extracellular matrices. Hence, cleavage of heparan sulfate (HS) affects the integrity and functional state of tissues and thereby fundamental normal and pathological phenomena involving cell migration and response to changes in the extracellular micro-environment. Heparanase degrades HS at specific intra-chain sites. The enzyme is synthesised as a latent approximately 65 kDa protein that is processed at the N-terminus into a highly active approximately 50 kDa form. Experimental evidence suggests that heparanase may facilitate both tumour cell invasion and neovascularization, both critical steps in cancer progression. The enzyme is also involved in cell migration associated with inflammation and autoimmunity." Q#24032 - CGI_10001846 superfamily 247866 3 133 4.30E-26 100.605 cl17312 PhyH superfamily N - "Phytanoyl-CoA dioxygenase (PhyH); This family is made up of several eukaryotic phytanoyl-CoA dioxygenase (PhyH) proteins, ectoine hydroxylases and a number of bacterial deoxygenases. PhyH is a peroxisomal enzyme catalyzing the first step of phytanic acid alpha-oxidation. PhyH deficiency causes Refsum's disease (RD) which is an inherited neurological syndrome biochemically characterized by the accumulation of phytanic acid in plasma and tissues." 
Q#24034 - CGI_10014237 superfamily 216152 39 321 1.29E-49 170.958 cl02988 Glyco_transf_10 superfamily N - "Glycosyltransferase family 10 (fucosyltransferase); This family of Fucosyltransferases are the enzymes transferring fucose from GDP-Fucose to GlcNAc in an alpha1,3 linkage. This family is know as glycosyltransferase family 10." Q#24035 - CGI_10014238 superfamily 241637 18 76 4.28E-07 48.0734 cl00146 TFIIS_I superfamily - - N-terminal domain (domain I) of transcription elongation factor S-II (TFIIS); similar to a domain found in elongin A and CRSP70; likely to be involved in transcription; domain I from TFIIS interacts with RNA polymerase II holoenzyme Q#24035 - CGI_10014238 superfamily 219215 548 634 4.60E-21 89.704 cl06096 Elongin_A superfamily - - "RNA polymerase II transcription factor SIII (Elongin) subunit A; This family represents a conserved region within RNA polymerase II transcription factor SIII (Elongin) subunit A. In mammals, the Elongin complex activates elongation by RNA polymerase II by suppressing transient pausing of the polymerase at many sites within transcription units. Elongin is a heterotrimer composed of A, B, and C subunits of 110, 18, and 15 kilodaltons, respectively. Subunit A has been shown to function as the transcriptionally active component of Elongin." Q#24036 - CGI_10014239 superfamily 242156 1 62 7.44E-29 100.171 cl00869 PTH2_family superfamily N - "Peptidyl-tRNA hydrolase, type 2 (PTH2)_like . Peptidyl-tRNA hydrolase activity releases tRNA from the premature translation termination product peptidyl-tRNA. Two structurally different enzymes have been reported to encode such activity, Pth present in bacteria and eukaryotes and Pth2 present in archaea and eukaryotes." 
Q#24037 - CGI_10014240 superfamily 248011 386 471 0.000445352 39.3566 cl17457 PKD superfamily - - "polycystic kidney disease I (PKD) domain; similar to other cell-surface modules, with an IG-like fold; domain probably functions as a ligand binding site in protein-protein or protein-carbohydrate interactions; a single instance of the repeat is presented here. The domain is also found in microbial collagenases and chitinases." Q#24038 - CGI_10014241 superfamily 243069 266 317 9.89E-14 67.9426 cl02525 Band_7 superfamily C - "The band 7 domain of flotillin (reggie) like proteins. This group contains proteins similar to stomatin, prohibitin, flotillin, HlfK/C and podicin. Many of these band 7 domain-containing proteins are lipid raft-associated. Individual proteins of this band 7 domain family may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. Microdomains formed from flotillin proteins may in addition be dynamic units with their own regulatory functions. Flotillins have been implicated in signal transduction, vesicle trafficking, cytoskeleton rearrangement and are known to interact with a variety of proteins. Stomatin interacts with and regulates members of the degenerin/epithelia Na+ channel family in mechanosensory cells of Caenorhabditis elegans and vertebrate neurons and participates in trafficking of Glut1 glucose transporters. Prohibitin may act as a chaperone for the stabilization of mitochondrial proteins. Prokaryotic HflK/C plays a role in the decision between lysogenic and lytic cycle growth during lambda phage infection. Flotillins have been implicated in the progression of prion disease, in the pathogenesis of neurodegenerative diseases such as Parkinson's and Alzheimer's disease and, in cancer invasion and metastasis. 
Mutations in the podicin gene give rise to autosomal recessive steroid resistant nephritic syndrome" Q#24039 - CGI_10014242 superfamily 243069 60 275 1.41E-86 260.542 cl02525 Band_7 superfamily - - "The band 7 domain of flotillin (reggie) like proteins. This group contains proteins similar to stomatin, prohibitin, flotillin, HlfK/C and podicin. Many of these band 7 domain-containing proteins are lipid raft-associated. Individual proteins of this band 7 domain family may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. Microdomains formed from flotillin proteins may in addition be dynamic units with their own regulatory functions. Flotillins have been implicated in signal transduction, vesicle trafficking, cytoskeleton rearrangement and are known to interact with a variety of proteins. Stomatin interacts with and regulates members of the degenerin/epithelia Na+ channel family in mechanosensory cells of Caenorhabditis elegans and vertebrate neurons and participates in trafficking of Glut1 glucose transporters. Prohibitin may act as a chaperone for the stabilization of mitochondrial proteins. Prokaryotic HflK/C plays a role in the decision between lysogenic and lytic cycle growth during lambda phage infection. Flotillins have been implicated in the progression of prion disease, in the pathogenesis of neurodegenerative diseases such as Parkinson's and Alzheimer's disease and, in cancer invasion and metastasis. Mutations in the podicin gene give rise to autosomal recessive steroid resistant nephritic syndrome" Q#24040 - CGI_10014243 superfamily 243069 31 246 9.11E-80 245.52 cl02525 Band_7 superfamily - - "The band 7 domain of flotillin (reggie) like proteins. This group contains proteins similar to stomatin, prohibitin, flotillin, HlfK/C and podicin. Many of these band 7 domain-containing proteins are lipid raft-associated. 
Individual proteins of this band 7 domain family may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. Microdomains formed from flotillin proteins may in addition be dynamic units with their own regulatory functions. Flotillins have been implicated in signal transduction, vesicle trafficking, cytoskeleton rearrangement and are known to interact with a variety of proteins. Stomatin interacts with and regulates members of the degenerin/epithelia Na+ channel family in mechanosensory cells of Caenorhabditis elegans and vertebrate neurons and participates in trafficking of Glut1 glucose transporters. Prohibitin may act as a chaperone for the stabilization of mitochondrial proteins. Prokaryotic HflK/C plays a role in the decision between lysogenic and lytic cycle growth during lambda phage infection. Flotillins have been implicated in the progression of prion disease, in the pathogenesis of neurodegenerative diseases such as Parkinson's and Alzheimer's disease and, in cancer invasion and metastasis. Mutations in the podicin gene give rise to autosomal recessive steroid resistant nephritic syndrome" Q#24040 - CGI_10014243 superfamily 243069 250 345 1.00E-28 111.085 cl02525 Band_7 superfamily N - "The band 7 domain of flotillin (reggie) like proteins. This group contains proteins similar to stomatin, prohibitin, flotillin, HlfK/C and podicin. Many of these band 7 domain-containing proteins are lipid raft-associated. Individual proteins of this band 7 domain family may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. Microdomains formed from flotillin proteins may in addition be dynamic units with their own regulatory functions. Flotillins have been implicated in signal transduction, vesicle trafficking, cytoskeleton rearrangement and are known to interact with a variety of proteins. 
Stomatin interacts with and regulates members of the degenerin/epithelia Na+ channel family in mechanosensory cells of Caenorhabditis elegans and vertebrate neurons and participates in trafficking of Glut1 glucose transporters. Prohibitin may act as a chaperone for the stabilization of mitochondrial proteins. Prokaryotic HflK/C plays a role in the decision between lysogenic and lytic cycle growth during lambda phage infection. Flotillins have been implicated in the progression of prion disease, in the pathogenesis of neurodegenerative diseases such as Parkinson's and Alzheimer's disease and, in cancer invasion and metastasis. Mutations in the podicin gene give rise to autosomal recessive steroid resistant nephritic syndrome" Q#24041 - CGI_10014244 superfamily 243069 58 272 4.36E-84 253.994 cl02525 Band_7 superfamily - - "The band 7 domain of flotillin (reggie) like proteins. This group contains proteins similar to stomatin, prohibitin, flotillin, HlfK/C and podicin. Many of these band 7 domain-containing proteins are lipid raft-associated. Individual proteins of this band 7 domain family may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. Microdomains formed from flotillin proteins may in addition be dynamic units with their own regulatory functions. Flotillins have been implicated in signal transduction, vesicle trafficking, cytoskeleton rearrangement and are known to interact with a variety of proteins. Stomatin interacts with and regulates members of the degenerin/epithelia Na+ channel family in mechanosensory cells of Caenorhabditis elegans and vertebrate neurons and participates in trafficking of Glut1 glucose transporters. Prohibitin may act as a chaperone for the stabilization of mitochondrial proteins. Prokaryotic HflK/C plays a role in the decision between lysogenic and lytic cycle growth during lambda phage infection. 
Flotillins have been implicated in the progression of prion disease, in the pathogenesis of neurodegenerative diseases such as Parkinson's and Alzheimer's disease and, in cancer invasion and metastasis. Mutations in the podicin gene give rise to autosomal recessive steroid resistant nephritic syndrome" Q#24042 - CGI_10014245 superfamily 241619 27 98 1.20E-09 54.3981 cl00112 PAN_APPLE superfamily - - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#24042 - CGI_10014245 superfamily 241619 289 351 1.57E-08 50.9313 cl00112 PAN_APPLE superfamily N - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#24042 - CGI_10014245 superfamily 241619 106 178 4.22E-08 49.7757 cl00112 PAN_APPLE superfamily - - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. 
Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#24042 - CGI_10014245 superfamily 241619 205 267 0.00131641 36.6789 cl00112 PAN_APPLE superfamily N - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#24043 - CGI_10014246 superfamily 241571 1 108 4.54E-24 96.3274 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#24043 - CGI_10014246 superfamily 241571 114 188 6.46E-16 73.6006 cl00049 CUB superfamily C - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#24044 - CGI_10014247 superfamily 247725 122 215 4.16E-15 72.2501 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. 
This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#24044 - CGI_10014247 superfamily 247725 15 117 3.25E-06 46.1269 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#24045 - CGI_10014248 superfamily 148425 65 238 1.63E-13 67.4734 cl06049 NPDC1 superfamily N - Neural proliferation differentiation control-1 protein (NPDC1); This family consists of several neural proliferation differentiation control-1 (NPDC1) proteins. NPDC1 plays a role in the control of neural cell proliferation and differentiation. It has been suggested that NPDC1 may be involved in the development of several secretion glands. This family also contains the C-terminal region of the C. elegans protein CAB-1 which is known to interact with AEX-3. Q#24046 - CGI_10014249 superfamily 248281 38 122 1.72E-15 70.7623 cl17727 GT1 superfamily - - "GT1, myb-like, SANT family; GT-1, a myb-like protein, is one of the GT trihelix transcription factors. GT-1 binds the GT cis-element of rbcS-3A, a light-induced gene, as a dimer. Arabidopsis GT-1 is a trans-activator and acts in the stabilization of components of the transcrtiption pre-initiation complex comprised of TFIIA-TBP-TATA. The isolated GT-1 DNA-binding domain is sufficient to bind DNA. This region closely resemble the myb domain, but with longer helices. 
It has been proposed that GT-1 may respond to light signals via calcium-dependent phosphorylation to create a light-modulated molecular switch. These proteins are members of the SANT/myb group. SANT is named after 'SWI3, ADA2, N-CoR and TFIIIB', several factors that share this domain. The SANT domain resembles the 3 alpha-helix bundle of the DNA-binding Myb domains and is found in a diverse set of proteins." Q#24047 - CGI_10014250 superfamily 245864 76 438 3.24E-69 229.087 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#24048 - CGI_10014251 superfamily 245864 27 495 8.26E-101 313.061 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. 
They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#24049 - CGI_10014252 superfamily 245847 237 394 3.33E-35 132.091 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#24049 - CGI_10014252 superfamily 245814 612 669 1.83E-05 44.0171 cl11960 Ig superfamily C - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#24049 - CGI_10014252 superfamily 245814 517 592 0.0080258 35.5427 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#24049 - CGI_10014252 superfamily 245814 706 769 2.34E-09 55.1055 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#24051 - CGI_10014254 superfamily 243077 282 336 1.67E-20 83.7489 cl02542 DnaJ superfamily - - "DnaJ domain or J-domain. DnaJ/Hsp40 (heat shock protein 40) proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperonine family. Hsp40 proteins are characterized by the presence of a J domain, which mediates the interaction with Hsp70. They may contain other domains as well, and the architectures provide a means of classification." 
Q#24051 - CGI_10014254 superfamily 242232 3 50 7.69E-12 59.8792 cl00984 TM2 superfamily - - "TM2 domain; This family is composed of a pair of transmembrane alpha helices connected by a short linker. The function of this domain is unknown; however, it occurs in a wide range of protein contexts." Q#24052 - CGI_10014255 superfamily 247724 22 178 2.22E-58 182.656 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#24053 - CGI_10014256 superfamily 247683 10 60 1.24E-06 46.6871 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). 
SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#24053 - CGI_10014256 superfamily 219574 514 694 2.67E-29 116.227 cl06698 DC_STAMP superfamily - - "DC-STAMP-like protein; This is a family of sequences which are similar to a region of the dendritic cell-specific transmembrane protein (DC-STAMP). This is thought to be a novel receptor protein that shares no identity with other multimembrane-spanning proteins. It is thought to have seven putative transmembrane regions, two of which are found in the region featured in this family. DC-STAMP is also described as having potential N-linked glycosylation sites and a potential phosphorylation site for PKC, but these are not conserved throughout the family." Q#24053 - CGI_10014256 superfamily 221781 170 256 0.00559776 38.3308 cl15101 FUSC-like superfamily C - FUSC-like inner membrane protein yccS; This family has similarities to the fusaric acid resistance protein family. The proteins are lodged in the inner membrane. Q#24054 - CGI_10014257 superfamily 243069 52 232 2.05E-08 51.2144 cl02525 Band_7 superfamily - - "The band 7 domain of flotillin (reggie) like proteins. This group contains proteins similar to stomatin, prohibitin, flotillin, HflK/C and podocin. Many of these band 7 domain-containing proteins are lipid raft-associated. 
Individual proteins of this band 7 domain family may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. Microdomains formed from flotillin proteins may in addition be dynamic units with their own regulatory functions. Flotillins have been implicated in signal transduction, vesicle trafficking, cytoskeleton rearrangement and are known to interact with a variety of proteins. Stomatin interacts with and regulates members of the degenerin/epithelial Na+ channel family in mechanosensory cells of Caenorhabditis elegans and vertebrate neurons and participates in trafficking of Glut1 glucose transporters. Prohibitin may act as a chaperone for the stabilization of mitochondrial proteins. Prokaryotic HflK/C plays a role in the decision between lysogenic and lytic cycle growth during lambda phage infection. Flotillins have been implicated in the progression of prion disease, in the pathogenesis of neurodegenerative diseases such as Parkinson's and Alzheimer's disease, and in cancer invasion and metastasis. 
Mutations in the podocin gene give rise to autosomal recessive steroid-resistant nephrotic syndrome" Q#24056 - CGI_10014259 superfamily 247905 248 359 5.76E-11 59.9441 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#24056 - CGI_10014259 superfamily 247805 38 183 7.41E-10 56.5768 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#24058 - CGI_10014261 superfamily 243016 47 174 5.44E-42 139.388 cl02384 NOT2_3_5 superfamily - - "NOT2 / NOT3 / NOT5 family; NOT1, NOT2, NOT3, NOT4 and NOT5 form a nuclear complex that negatively regulates the basal and activated transcription of many genes. This family includes NOT2, NOT3 and NOT5." Q#24061 - CGI_10014264 superfamily 241578 51 207 1.13E-20 91.5076 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#24063 - CGI_10014266 superfamily 241550 17 70 8.16E-28 106.711 cl00015 nt_trans superfamily C - "nucleotidyl transferase superfamily; nt_trans (nucleotidyl transferase) This superfamily includes the class I amino-acyl tRNA synthetases, pantothenate synthetase (PanC), ATP sulfurylase, and the cytidylyltransferases, all of which have a conserved dinucleotide-binding domain." Q#24064 - CGI_10014267 superfamily 241550 1 223 1.17E-81 249.423 cl00015 nt_trans superfamily - - "nucleotidyl transferase superfamily; nt_trans (nucleotidyl transferase) This superfamily includes the class I amino-acyl tRNA synthetases, pantothenate synthetase (PanC), ATP sulfurylase, and the cytidylyltransferases, all of which have a conserved dinucleotide-binding domain." Q#24066 - CGI_10014269 superfamily 241600 197 349 7.48E-48 162.41 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. 
C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#24066 - CGI_10014269 superfamily 241600 74 193 2.81E-35 128.897 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#24067 - CGI_10006828 superfamily 246918 147 206 3.25E-10 54.9003 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#24067 - CGI_10006828 superfamily 246918 94 141 2.16E-05 41.0331 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#24068 - CGI_10006829 superfamily 248097 63 186 3.16E-27 100.803 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#24068 - CGI_10006829 superfamily 217473 13 64 0.000138655 40.041 cl03978 Mab-21 superfamily NC - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. 
Q#24070 - CGI_10024664 superfamily 114259 12 194 7.39E-70 212.975 cl05211 DUF758 superfamily - - "Domain of unknown function (DUF758); Family of eukaryotic proteins with unknown function, which are induced by tumour necrosis factor." Q#24071 - CGI_10024665 superfamily 241566 1513 1561 7.51E-11 60.5836 cl00040 C1 superfamily - - "Protein kinase C conserved region 1 (C1) . Cysteine-rich zinc binding domain. Some members of this domain family bind phorbol esters and diacylglycerol, some are reported to bind RasGTP. May occur in tandem arrangement. Diacylglycerol (DAG) is a second messenger, released by activation of Phospholipase D. Phorbol Esters (PE) can act as analogues of DAG and mimic its downstream effects in, for example, tumor promotion. Protein Kinases C are activated by DAG/PE, this activation is mediated by their N-terminal conserved region (C1). DAG/PE binding may be phospholipid dependent. C1 domains may also mediate DAG/PE signals in chimaerins (a family of Rac GTPase activating proteins), RasGRPs (exchange factors for Ras/Rap1), and Munc13 isoforms (scaffolding proteins involved in exocytosis)." Q#24071 - CGI_10024665 superfamily 241754 157 668 0 769.06 cl00286 Motor_domain superfamily - - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. Q#24071 - CGI_10024665 superfamily 243095 1624 1812 1.43E-75 251.589 cl02570 RhoGAP superfamily - - "RhoGAP: GTPase-activator protein (GAP) for Rho-like GTPases; GAPs towards Rho/Rac/Cdc42-like small GTPases. Small GTPases (G proteins) cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when bound to GDP. The Rho family of small G proteins, which includes Cdc42Hs, activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. 
G proteins generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. The RhoGAPs are one of the major classes of regulators of Rho G proteins." Q#24071 - CGI_10024665 superfamily 241754 806 936 1.35E-54 205.128 cl00286 Motor_domain superfamily N - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. Q#24071 - CGI_10024665 superfamily 241645 31 126 4.50E-14 70.79 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#24071 - CGI_10024665 superfamily 210118 984 1005 0.000269484 40.7719 cl15479 IQ superfamily - - IQ calmodulin-binding motif; Calmodulin-binding motif. Q#24073 - CGI_10024667 superfamily 241684 477 882 0 668.104 cl00205 HMG-CoA_reductase superfamily - - "Hydroxymethylglutaryl-coenzyme A (HMG-CoA) reductase (HMGR); Hydroxymethylglutaryl-coenzyme A (HMG-CoA) reductase (HMGR) is a tightly regulated enzyme, which catalyzes the synthesis of coenzyme A and mevalonate in isoprenoid synthesis. 
In mammals, this is the rate limiting committed step in cholesterol biosynthesis. Bacteria, such as Pseudomonas mevalonii, which rely solely on mevalonate for their carbon source, catalyze the reverse reaction, using an NAD-dependent HMGR to deacetylate mevalonate into 3-hydroxy-3-methylglutaryl-CoA. There are two classes of HMGR: class I enzymes which are found predominantly in eukaryotes and contain N-terminal membrane regions and class II enzymes which are found primarily in prokaryotes and are soluble as they lack the membrane region. With the exception of Archaeoglobus fulgidus, most archaea are assigned to class I, based on sequence similarity of the active site, even though they lack membrane regions. Yeast and human HMGR are divergent in their N-terminal regions, but are conserved in their active site. In contrast, human and bacterial HMGR differ in their active site architecture. While the prokaryotic enzyme is a homodimer, the eukaryotic enzyme is a homotetramer." Q#24073 - CGI_10024667 superfamily 192997 119 249 7.42E-14 70.3031 cl18184 Sterol-sensing superfamily - - "Sterol-sensing domain of SREBP cleavage-activation; Sterol regulatory element-binding proteins (SREBPs) are membrane-bound transcription factors that promote lipid synthesis in animal cells. They are embedded in the membranes of the endoplasmic reticulum (ER) in a helical hairpin orientation and are released from the ER by a two-step proteolytic process. Proteolysis begins when the SREBPs are cleaved at Site-1, which is located at a leucine residue in the middle of the hydrophobic loop in the lumen of the ER. Upon proteolytic processing SREBP can activate the expression of genes involved in cholesterol biosynthesis and uptake. SCAP stimulates cleavage of SREBPs via fusion of their two C-termini. This domain is the transmembrane region that traverses the membrane eight times and is the sterol-sensing domain of the cleavage protein. WD40 domains are found towards the C-terminus." 
Q#24074 - CGI_10024668 superfamily 243066 1 99 1.19E-18 80.3541 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#24074 - CGI_10024668 superfamily 198867 114 214 1.27E-11 60.818 cl06652 BACK superfamily - - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation)." Q#24077 - CGI_10024671 superfamily 241564 138 204 2.09E-27 105.04 cl00035 BIR superfamily - - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." 
Q#24077 - CGI_10024671 superfamily 241564 5 69 2.16E-13 65.7499 cl00035 BIR superfamily - - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." Q#24077 - CGI_10024671 superfamily 247792 337 375 0.000800774 37.4252 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#24077 - CGI_10024671 superfamily 247792 477 516 0.0019258 36.2696 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#24079 - CGI_10024673 
superfamily 205992 219 260 2.25E-16 70.7217 cl16421 DUF4187 superfamily N - "Domain of unknown function (DUF4187); This family is found at the very C-terminus of proteins that carry a G-patch domain, pfam01585. The domain is short and cysteine-rich." Q#24079 - CGI_10024673 superfamily 243107 67 111 3.02E-14 64.8738 cl02611 G-patch superfamily - - "G-patch domain; This domain is found in a number of RNA binding proteins, and is also found in proteins that contain RNA binding domains. This suggests that this domain may have an RNA binding function. This domain has seven highly conserved glycines." Q#24084 - CGI_10024678 superfamily 247723 28 104 3.22E-54 167.023 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#24086 - CGI_10024680 superfamily 217962 91 244 1.77E-17 76.9156 cl09558 TPD52 superfamily N - "Tumour protein D52 family; The hD52 gene was originally identified through its elevated expression level in human breast carcinoma. 
Cloning of D52 homologues from other species has indicated that D52 may play roles in calcium-mediated signal transduction and cell proliferation. Two human homologues of hD52, hD53 and hD54, have also been identified, demonstrating the existence of a novel gene/protein family. These proteins have an amino terminal coiled-coil that allows members to form homo- and heterodimers with each other." Q#24091 - CGI_10024685 superfamily 246925 85 300 1.60E-23 98.9669 cl15309 LRR_RI superfamily - - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#24093 - CGI_10024687 superfamily 247858 139 324 2.44E-32 120.571 cl17304 2OG-FeII_Oxy_3 superfamily - - 2OG-Fe(II) oxygenase superfamily; This family contains members of the 2-oxoglutarate (2OG) and Fe(II)-dependent oxygenase superfamily. Q#24094 - CGI_10024688 superfamily 243175 168 284 1.91E-62 195.493 cl02776 GST_C_family superfamily - - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. 
This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." Q#24094 - CGI_10024688 superfamily 241832 97 153 2.19E-22 88.3959 cl00388 Thioredoxin_like superfamily N - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). 
Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#24095 - CGI_10024689 superfamily 247918 1 215 5.29E-44 157.874 cl17364 PMT_2 superfamily - - Dolichyl-phosphate-mannose-protein mannosyltransferase; This family contains members that are not captured by pfam02366. Q#24095 - CGI_10024689 superfamily 197746 310 365 3.93E-09 53.4991 cl02624 MIR superfamily - - Domain in ryanodine and inositol trisphosphate receptors and protein O-mannosyltransferases; Domain in ryanodine and inositol trisphosphate receptors and protein O-mannosyltransferases. Q#24095 - CGI_10024689 superfamily 197746 378 426 3.08E-06 45.4099 cl02624 MIR superfamily - - Domain in ryanodine and inositol trisphosphate receptors and protein O-mannosyltransferases; Domain in ryanodine and inositol trisphosphate receptors and protein O-mannosyltransferases. Q#24095 - CGI_10024689 superfamily 197746 243 299 0.000458728 38.8615 cl02624 MIR superfamily - - Domain in ryanodine and inositol trisphosphate receptors and protein O-mannosyltransferases; Domain in ryanodine and inositol trisphosphate receptors and protein O-mannosyltransferases. Q#24096 - CGI_10024690 superfamily 243175 52 104 0.000129991 37.176 cl02776 GST_C_family superfamily C - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. 
In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." Q#24096 - CGI_10024690 superfamily 241832 8 37 0.000429706 34.8531 cl00388 Thioredoxin_like superfamily N - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. 
The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#24097 - CGI_10024691 superfamily 245598 934 1239 1.49E-176 528.511 cl11396 Patatin_and_cPLA2 superfamily - - "Patatins and Phospholipases; Patatin-like phospholipase. This family consists of various patatin glycoproteins from plants. The patatin protein accounts for up to 40% of the total soluble protein in potato tubers. Patatin is a storage protein, but it also has the enzymatic activity of a lipid acyl hydrolase, catalyzing the cleavage of fatty acids from membrane lipids. Members of this family have also been found in vertebrates. This family also includes the catalytic domain of cytosolic phospholipase A2 (PLA2; EC 3.1.1.4) hydrolyzes the sn-2-acyl ester bond of phospholipids to release arachidonic acid. At the active site, cPLA2 contains a serine nucleophile through which the catalytic mechanism is initiated. The active site is partially covered by a solvent-accessible flexible lid. cPLA2 displays interfacial activation as it exists in both "closed lid" and "open lid" forms." Q#24097 - CGI_10024691 superfamily 241570 610 717 6.48E-21 90.8481 cl00047 CAP_ED superfamily - - "effector domain of the CAP family of transcription factors; members include CAP (or cAMP receptor protein (CRP)), which binds cAMP, FNR (fumarate and nitrate reduction), which uses an iron-sulfur cluster to sense oxygen) and CooA, a heme containing CO sensor. In all cases binding of the effector leads to conformational changes and the ability to activate transcription. 
Cyclic nucleotide-binding domain similar to CAP are also present in cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) and vertebrate cyclic nucleotide-gated ion-channels. Cyclic nucleotide-monophosphate binding domain; proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues; the best studied is the prokaryotic catabolite gene activator, CAP, where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure; three conserved glycine residues are thought to be essential for maintenance of the structural integrity of the beta-barrel; CooA is a homodimeric transcription factor that belongs to CAP family; cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain; cAPK's are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain; cGPK's are single chain enzymes that include the two copies of the domain in their N-terminal section; also found in vertebrate cyclic nucleotide-gated ion-channels" Q#24097 - CGI_10024691 superfamily 241570 485 596 1.28E-16 78.1366 cl00047 CAP_ED superfamily - - "effector domain of the CAP family of transcription factors; members include CAP (or cAMP receptor protein (CRP)), which binds cAMP, FNR (fumarate and nitrate reduction), which uses an iron-sulfur cluster to sense oxygen) and CooA, a heme containing CO sensor. In all cases binding of the effector leads to conformational changes and the ability to activate transcription. Cyclic nucleotide-binding domain similar to CAP are also present in cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) and vertebrate cyclic nucleotide-gated ion-channels. 
Cyclic nucleotide-monophosphate binding domain; proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues; the best studied is the prokaryotic catabolite gene activator, CAP, where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure; three conserved glycine residues are thought to be essential for maintenance of the structural integrity of the beta-barrel; CooA is a homodimeric transcription factor that belongs to CAP family; cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain; cAPK's are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain; cGPK's are single chain enzymes that include the two copies of the domain in their N-terminal section; also found in vertebrate cyclic nucleotide-gated ion-channels" Q#24097 - CGI_10024691 superfamily 241570 177 298 8.37E-15 72.7438 cl00047 CAP_ED superfamily - - "effector domain of the CAP family of transcription factors; members include CAP (or cAMP receptor protein (CRP)), which binds cAMP, FNR (fumarate and nitrate reduction), which uses an iron-sulfur cluster to sense oxygen) and CooA, a heme containing CO sensor. In all cases binding of the effector leads to conformational changes and the ability to activate transcription. Cyclic nucleotide-binding domain similar to CAP are also present in cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) and vertebrate cyclic nucleotide-gated ion-channels. 
Cyclic nucleotide-monophosphate binding domain; proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues; the best studied is the prokaryotic catabolite gene activator, CAP, where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure; three conserved glycine residues are thought to be essential for maintenance of the structural integrity of the beta-barrel; CooA is a homodimeric transcription factor that belongs to CAP family; cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain; cAPK's are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain; cGPK's are single chain enzymes that include the two copies of the domain in their N-terminal section; also found in vertebrate cyclic nucleotide-gated ion-channels" Q#24098 - CGI_10024692 superfamily 207662 261 336 6.17E-32 119.144 cl02596 NR_DBD_like superfamily - - "DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers; DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. It interacts with a specific DNA site upstream of the target gene and modulates the rate of transcriptional initiation. Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions, from development, reproduction, to homeostasis and metabolism in animals (metazoans). The family contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. 
NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). Most nuclear receptors bind as homodimers or heterodimers to their target sites, which consist of two hexameric half-sites. Specificity is determined by the half-site sequence, the relative orientation of the half-sites and the number of spacer nucleotides between the half-sites. However, a growing number of nuclear receptors have been reported to bind to DNA as monomers." Q#24098 - CGI_10024692 superfamily 207662 174 241 8.21E-27 104.566 cl02596 NR_DBD_like superfamily - - "DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers; DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. It interacts with a specific DNA site upstream of the target gene and modulates the rate of transcriptional initiation. Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions, from development, reproduction, to homeostasis and metabolism in animals (metazoans). The family contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). Most nuclear receptors bind as homodimers or heterodimers to their target sites, which consist of two hexameric half-sites. Specificity is determined by the half-site sequence, the relative orientation of the half-sites and the number of spacer nucleotides between the half-sites. However, a growing number of nuclear receptors have been reported to bind to DNA as monomers." 
Q#24098 - CGI_10024692 superfamily 245599 517 663 7.49E-16 75.8649 cl11397 NR_LBD superfamily - - "The ligand binding domain of nuclear receptors, a family of ligand-activated transcription regulators; Ligand-binding domain (LBD) of nuclear receptor (NR): Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions in metazoans, from development, reproduction, to homeostasis and metabolism. The superfamily contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. The members of the family include receptors of steroids, thyroid hormone, retinoids, cholesterol by-products, lipids and heme. With few exceptions, NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD)." Q#24099 - CGI_10024693 superfamily 247684 98 241 7.97E-07 47.1983 cl17037 NBD_sugar-kinase_HSP70_actin superfamily C - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. 
The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#24101 - CGI_10024695 superfamily 246962 85 368 2.93E-19 88.2343 cl15430 Nucleoside_tran superfamily C - "Nucleoside transporter; This is a family of nucleoside transporters. In mammalian cells nucleoside transporters transport nucleoside across the plasma membrane and are essential for nucleotide synthesis via the salvage pathways for cells that lack their own de novo synthesis pathways. Also in this family is mouse and human nucleolar protein HNP36, a protein of unknown function; although it has been hypothesised to be a plasma membrane nucleoside transporter." 
Q#24102 - CGI_10024696 superfamily 243092 8 231 5.71E-24 102.028 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#24102 - CGI_10024696 superfamily 217837 442 549 1.30E-13 68.0029 cl04367 Utp12 superfamily - - Dip2/Utp12 Family; This domain is found at the C-terminus of proteins containing WD40 repeats. These proteins are part of the U3 ribonucleoprotein the yeast protein is called Utp12 or DIP2. Q#24104 - CGI_10024698 superfamily 217210 687 1255 0 665.903 cl10595 Ald_Xan_dh_C2 superfamily - - Molybdopterin-binding domain of aldehyde dehydrogenase; Molybdopterin-binding domain of aldehyde dehydrogenase. Q#24104 - CGI_10024698 superfamily 243326 573 679 7.88E-42 150.861 cl03161 Ald_Xan_dh_C superfamily - - "Aldehyde oxidase and xanthine dehydrogenase, a/b hammerhead domain; Aldehyde oxidase and xanthine dehydrogenase, a/b hammerhead domain. 
" Q#24104 - CGI_10024698 superfamily 201981 72 146 6.65E-33 123.745 cl08334 Fer2_2 superfamily - - [2Fe-2S] binding domain; [2Fe-2S] binding domain. Q#24104 - CGI_10024698 superfamily 244932 406 508 5.58E-25 102.194 cl08390 CO_deh_flav_C superfamily - - CO dehydrogenase flavoprotein C-terminal domain; CO dehydrogenase flavoprotein C-terminal domain. Q#24105 - CGI_10024699 superfamily 241777 9 167 0.00107303 38.4356 cl00316 Cation_efflux superfamily C - "Cation efflux family; Members of this family are integral membrane proteins, that are found to increase tolerance to divalent metal ions such as cadmium, zinc, and cobalt. These proteins are thought to be efflux pumps that remove these ions from cells." Q#24106 - CGI_10024700 superfamily 243100 285 338 5.07E-07 46.0677 cl02576 B_zip1 superfamily - - "basic leucine zipper DNA-binding and multimerization region of GCN4 and related proteins; Basic leucine zipper (bZIP) transcription factors act in networks of homo- and hetero-dimers in the regulation in a diverse set of cellular pathways. Classical leucine zippers have alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization creates a pair of basic regions that bind DNA and undergo conformational change. GCN4 was identified in Saccharomyces cerevisiae from mutations in a deficiency in activation with the general amino acid control pathway. GCN4 encodes a trans-activator of amino acid biosynthetic genes containing 2 acidic activation domains and a C-terminal bZIP domain, comprised of a basic alpha-helical DNA-binding region and a coiled-coil dimerization region." Q#24107 - CGI_10024701 superfamily 218899 129 281 3.57E-25 100.818 cl05570 DUF947 superfamily - - Domain of unknown function (DUF947); Family of eukaryotic proteins with unknown function. 
Q#24107 - CGI_10024701 superfamily 218899 297 423 4.57E-17 78.0916 cl05570 DUF947 superfamily - - Domain of unknown function (DUF947); Family of eukaryotic proteins with unknown function. Q#24108 - CGI_10024702 superfamily 243092 21 346 7.02E-19 83.9236 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#24109 - CGI_10024703 superfamily 247912 9 264 1.70E-23 102.58 cl17358 Beta-lactamase superfamily N - Beta-lactamase; This family appears to be distantly related to pfam00905 and PF00768 D-alanyl-D-alanine carboxypeptidase. Q#24109 - CGI_10024703 superfamily 216290 908 1026 5.36E-21 91.1957 cl03089 Cu2_monooxygen superfamily - - "Copper type II ascorbate-dependent monooxygenase, N-terminal domain; The N and C-terminal domains of members of this family adopt the same PNGase F-like fold." 
Q#24109 - CGI_10024703 superfamily 217685 1041 1122 9.51E-12 64.2776 cl04225 Cu2_monoox_C superfamily C - "Copper type II ascorbate-dependent monooxygenase, C-terminal domain; The N and C-terminal domains of members of this family adopt the same PNGase F-like fold." Q#24109 - CGI_10024703 superfamily 247912 373 482 1.33E-08 56.7409 cl17358 Beta-lactamase superfamily C - Beta-lactamase; This family appears to be distantly related to pfam00905 and PF00768 D-alanyl-D-alanine carboxypeptidase. Q#24109 - CGI_10024703 superfamily 247912 510 645 7.77E-05 44.7997 cl17358 Beta-lactamase superfamily N - Beta-lactamase; This family appears to be distantly related to pfam00905 and PF00768 D-alanyl-D-alanine carboxypeptidase. Q#24112 - CGI_10024706 superfamily 247746 14 124 0.00354878 36.4674 cl17192 ATP-synt_B superfamily - - "ATP synthase B/B' CF(0); Part of the CF(0) (base unit) of the ATP synthase. The base unit is thought to translocate protons through membrane (inner membrane in mitochondria, thylakoid membrane in plants, cytoplasmic membrane in bacteria). The B subunits are thought to interact with the stalk of the CF(1) subunits. This domain should not be confused with the ab CF(1) proteins (in the head of the ATP synthase) which are found in pfam00006" Q#24113 - CGI_10024707 superfamily 110440 395 419 0.00949093 33.9205 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats."
Q#24114 - CGI_10008451 superfamily 243072 70 200 2.34E-15 74.3422 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#24114 - CGI_10008451 superfamily 149414 214 276 1.06E-21 90.7938 cl07091 TRP_2 superfamily - - Transient receptor ion channel II; This domain is found in the transient receptor ion channel (Trp) family of proteins. There is strong evidence that Trp proteins are structural elements of calcium-ion entry channels activated by G protein-coupled receptors. This domain does not tend to appear with the TRP domain (pfam06011) but is often found to the C-terminus of Ankyrin repeats (pfam00023). Q#24115 - CGI_10008452 superfamily 112708 134 224 9.47E-19 78.2323 cl04323 Sec20 superfamily - - Sec20; Sec20 is a membrane glycoprotein associated with secretory pathway. Q#24116 - CGI_10008453 superfamily 240667 18 198 3.90E-52 176.329 cl18929 TIN2_N superfamily - - "N-terminal domain of TRF-interacting nuclear factor 2; shelterin complex protein of telomeres; TIN2 is one of the six proteins of shelterin complex, which acts to protect telomeres from DNA damage repair machinery. TIN2 binds directly to TRF1 and TRF2 and stabilizes TRF2 complex-telomere binding by tethering it to the TRF1 complex. TIN2 binding to TRF2 is primarily via the TRF binding motif (TBM) region and the N-terminus, while the far C-terminal region has lower affinity. The TIN2 TBM, but not the N-terminal region, is involved in TIN2 binding to TRF1. Truncation of the TIN2 N-terminus in mouse results in telomere elongation, suggesting a negative regulatory function of this region. 
Three shelterin components (TRF1, TRF2, POT1) bind DNA and 3 components (TIN2, RAP1, TPP1) are recruited by these DNA binding factors. TRF1 activity at telomeres is regulated in part by selective ubiquitination and degradation. Ubiquitination of TRF1 is mediated by Fbx4, which binds TRF1 in the TRFH domain, via a small GTPase module. When bound to telomeres, TIN2 acts to protect TRF1 from SCF-Fbx4 mediated ubiquitination. F-box proteins act in substrate recognition as part of Skp1-Cul1-Rbx1-F-box (SCF) protein complexes. Tankyrase-mediated ADP-ribosylation releases TRF1 from telomeres, rendering them susceptible to ubiquitination and degradation, promoting telomere elongation. TIN2 also binds PIP1, which recruits POT1 to telomeres." Q#24117 - CGI_10008454 superfamily 241874 842 1376 0 803.8 cl00456 SLC5-6-like_sbd superfamily - - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT."
Q#24117 - CGI_10008454 superfamily 241600 1744 1961 2.39E-86 283.362 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#24117 - CGI_10008454 superfamily 245213 1627 1663 1.10E-12 65.3506 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#24117 - CGI_10008454 superfamily 245213 1589 1624 6.55E-10 57.2614 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. 
Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#24117 - CGI_10008454 superfamily 245213 1667 1701 9.79E-09 53.7946 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#24117 - CGI_10008454 superfamily 245213 1552 1586 8.56E-06 45.3202 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#24117 - CGI_10008454 superfamily 245213 1703 1729 4.27E-05 43.3942 cl09941 EGF_CA superfamily C - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. 
Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#24117 - CGI_10008454 superfamily 241874 429 690 1.74E-118 387.43 cl00456 SLC5-6-like_sbd superfamily N - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#24117 - CGI_10008454 superfamily 241874 270 428 2.41E-63 227.957 cl00456 SLC5-6-like_sbd superfamily C - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. 
SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#24118 - CGI_10008455 superfamily 241578 25 184 1.14E-48 174.017 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#24118 - CGI_10008455 superfamily 241578 213 379 8.97E-43 157.005 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). 
Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#24118 - CGI_10008455 superfamily 248289 3002 3065 0.00970296 37.1104 cl17735 VWC superfamily - - von Willebrand factor type C domain; The high cutoff was used to prevent overlap with pfam00094. Q#24121 - CGI_10001859 superfamily 247805 37 168 8.88E-07 44.6356 cl17251 DEXDc superfamily C - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#24123 - CGI_10001861 superfamily 248097 45 127 8.84E-13 60.7418 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#24127 - CGI_10007048 superfamily 241554 716 852 1.58E-24 101.568 cl00019 Macro superfamily - - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. 
Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#24129 - CGI_10007050 superfamily 247750 305 560 1.07E-134 412.068 cl17196 E1_enzyme_family superfamily C - "Superfamily of activating enzymes (E1) of the ubiquitin-like proteins. This family includes classical ubiquitin-activating enzymes E1, ubiquitin-like (ubl) activating enzymes and other mechanistic homologes, like MoeB, Thif1 and others. The common reaction mechanism catalyzed by MoeB, ThiF and the E1 enzymes begins with a nucleophilic attack of the C-terminal carboxylate of MoaD, ThiS and ubiquitin, respectively, on the alpha-phosphate of an ATP molecule bound at the active site of the activating enzymes, leading to the formation of a high-energy acyladenylate intermediate and subsequently to the formation of a thiocarboxylate at the C termini of MoaD and ThiS." Q#24129 - CGI_10007050 superfamily 247750 628 785 1.76E-72 246.432 cl17196 E1_enzyme_family superfamily N - "Superfamily of activating enzymes (E1) of the ubiquitin-like proteins. This family includes classical ubiquitin-activating enzymes E1, ubiquitin-like (ubl) activating enzymes and other mechanistic homologes, like MoeB, Thif1 and others. The common reaction mechanism catalyzed by MoeB, ThiF and the E1 enzymes begins with a nucleophilic attack of the C-terminal carboxylate of MoaD, ThiS and ubiquitin, respectively, on the alpha-phosphate of an ATP molecule bound at the active site of the activating enzymes, leading to the formation of a high-energy acyladenylate intermediate and subsequently to the formation of a thiocarboxylate at the C termini of MoaD and ThiS." 
Q#24129 - CGI_10007050 superfamily 247750 41 159 5.39E-55 192.097 cl17196 E1_enzyme_family superfamily N - "Superfamily of activating enzymes (E1) of the ubiquitin-like proteins. This family includes classical ubiquitin-activating enzymes E1, ubiquitin-like (ubl) activating enzymes and other mechanistic homologes, like MoeB, Thif1 and others. The common reaction mechanism catalyzed by MoeB, ThiF and the E1 enzymes begins with a nucleophilic attack of the C-terminal carboxylate of MoaD, ThiS and ubiquitin, respectively, on the alpha-phosphate of an ATP molecule bound at the active site of the activating enzymes, leading to the formation of a high-energy acyladenylate intermediate and subsequently to the formation of a thiocarboxylate at the C termini of MoaD and ThiS." Q#24129 - CGI_10007050 superfamily 202124 533 595 1.00E-10 59.098 cl08340 UBACT superfamily - - Repeat in ubiquitin-activating (UBA) protein; Repeat in ubiquitin-activating (UBA) protein. Q#24130 - CGI_10007051 superfamily 110440 311 337 0.00126976 36.2317 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#24131 - CGI_10007052 superfamily 247750 44 201 1.07E-76 241.039 cl17196 E1_enzyme_family superfamily N - "Superfamily of activating enzymes (E1) of the ubiquitin-like proteins. This family includes classical ubiquitin-activating enzymes E1, ubiquitin-like (ubl) activating enzymes and other mechanistic homologes, like MoeB, Thif1 and others. 
The common reaction mechanism catalyzed by MoeB, ThiF and the E1 enzymes begins with a nucleophilic attack of the C-terminal carboxylate of MoaD, ThiS and ubiquitin, respectively, on the alpha-phosphate of an ATP molecule bound at the active site of the activating enzymes, leading to the formation of a high-energy acyladenylate intermediate and subsequently to the formation of a thiocarboxylate at the C termini of MoaD and ThiS." Q#24133 - CGI_10007054 superfamily 247750 39 196 2.14E-78 244.891 cl17196 E1_enzyme_family superfamily N - "Superfamily of activating enzymes (E1) of the ubiquitin-like proteins. This family includes classical ubiquitin-activating enzymes E1, ubiquitin-like (ubl) activating enzymes and other mechanistic homologes, like MoeB, Thif1 and others. The common reaction mechanism catalyzed by MoeB, ThiF and the E1 enzymes begins with a nucleophilic attack of the C-terminal carboxylate of MoaD, ThiS and ubiquitin, respectively, on the alpha-phosphate of an ATP molecule bound at the active site of the activating enzymes, leading to the formation of a high-energy acyladenylate intermediate and subsequently to the formation of a thiocarboxylate at the C termini of MoaD and ThiS." Q#24133 - CGI_10007054 superfamily 247750 3 67 0.00104388 38.1293 cl17196 E1_enzyme_family superfamily NC - "Superfamily of activating enzymes (E1) of the ubiquitin-like proteins. This family includes classical ubiquitin-activating enzymes E1, ubiquitin-like (ubl) activating enzymes and other mechanistic homologes, like MoeB, Thif1 and others. The common reaction mechanism catalyzed by MoeB, ThiF and the E1 enzymes begins with a nucleophilic attack of the C-terminal carboxylate of MoaD, ThiS and ubiquitin, respectively, on the alpha-phosphate of an ATP molecule bound at the active site of the activating enzymes, leading to the formation of a high-energy acyladenylate intermediate and subsequently to the formation of a thiocarboxylate at the C termini of MoaD and ThiS." 
Q#24136 - CGI_10007057 superfamily 247750 346 631 1.99E-155 465.611 cl17196 E1_enzyme_family superfamily C - "Superfamily of activating enzymes (E1) of the ubiquitin-like proteins. This family includes classical ubiquitin-activating enzymes E1, ubiquitin-like (ubl) activating enzymes and other mechanistic homologes, like MoeB, Thif1 and others. The common reaction mechanism catalyzed by MoeB, ThiF and the E1 enzymes begins with a nucleophilic attack of the C-terminal carboxylate of MoaD, ThiS and ubiquitin, respectively, on the alpha-phosphate of an ATP molecule bound at the active site of the activating enzymes, leading to the formation of a high-energy acyladenylate intermediate and subsequently to the formation of a thiocarboxylate at the C termini of MoaD and ThiS." Q#24136 - CGI_10007057 superfamily 247750 33 173 1.45E-73 244.099 cl17196 E1_enzyme_family superfamily N - "Superfamily of activating enzymes (E1) of the ubiquitin-like proteins. This family includes classical ubiquitin-activating enzymes E1, ubiquitin-like (ubl) activating enzymes and other mechanistic homologes, like MoeB, Thif1 and others. The common reaction mechanism catalyzed by MoeB, ThiF and the E1 enzymes begins with a nucleophilic attack of the C-terminal carboxylate of MoaD, ThiS and ubiquitin, respectively, on the alpha-phosphate of an ATP molecule bound at the active site of the activating enzymes, leading to the formation of a high-energy acyladenylate intermediate and subsequently to the formation of a thiocarboxylate at the C termini of MoaD and ThiS." Q#24136 - CGI_10007057 superfamily 247750 699 856 4.86E-72 244.891 cl17196 E1_enzyme_family superfamily N - "Superfamily of activating enzymes (E1) of the ubiquitin-like proteins. This family includes classical ubiquitin-activating enzymes E1, ubiquitin-like (ubl) activating enzymes and other mechanistic homologes, like MoeB, Thif1 and others. 
The common reaction mechanism catalyzed by MoeB, ThiF and the E1 enzymes begins with a nucleophilic attack of the C-terminal carboxylate of MoaD, ThiS and ubiquitin, respectively, on the alpha-phosphate of an ATP molecule bound at the active site of the activating enzymes, leading to the formation of a high-energy acyladenylate intermediate and subsequently to the formation of a thiocarboxylate at the C termini of MoaD and ThiS." Q#24136 - CGI_10007057 superfamily 247750 269 311 1.63E-19 89.2489 cl17196 E1_enzyme_family superfamily N - "Superfamily of activating enzymes (E1) of the ubiquitin-like proteins. This family includes classical ubiquitin-activating enzymes E1, ubiquitin-like (ubl) activating enzymes and other mechanistic homologes, like MoeB, Thif1 and others. The common reaction mechanism catalyzed by MoeB, ThiF and the E1 enzymes begins with a nucleophilic attack of the C-terminal carboxylate of MoaD, ThiS and ubiquitin, respectively, on the alpha-phosphate of an ATP molecule bound at the active site of the activating enzymes, leading to the formation of a high-energy acyladenylate intermediate and subsequently to the formation of a thiocarboxylate at the C termini of MoaD and ThiS." Q#24136 - CGI_10007057 superfamily 202124 605 666 2.63E-09 54.8608 cl08340 UBACT superfamily - - Repeat in ubiquitin-activating (UBA) protein; Repeat in ubiquitin-activating (UBA) protein. Q#24137 - CGI_10007058 superfamily 217316 111 185 0.00107319 37.606 cl03832 DUF234 superfamily - - Archaea bacterial proteins of unknown function; Archaea bacterial proteins of unknown function. Q#24137 - CGI_10007058 superfamily 110440 481 507 0.0038662 35.4613 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. 
The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#24138 - CGI_10005341 superfamily 217249 182 325 3.72E-81 245.994 cl03742 Prp18 superfamily - - "Prp18 domain; The splicing factor Prp18 is required for the second step of pre-mRNA splicing. The structure of a large fragment of the Saccharomyces cerevisiae Prp18 is known. This fragment is fully active in yeast splicing in vitro and includes the sequences of Prp18 that have been evolutionarily conserved. The core structure consists of five alpha-helices that adopt a novel fold. The most highly conserved region of Prp18, a nearly invariant stretch of 19 aa, forms part of a loop between two alpha-helices and may interact with the U5 small nuclear ribonucleoprotein particles." Q#24138 - CGI_10005341 superfamily 207680 75 104 6.61E-09 51.2047 cl02632 PRP4 superfamily - - pre-mRNA processing factor 4 (PRP4) like; This small domain is found on PRP4 ribonucleoproteins. PRP4 is a U4/U6 small nuclear ribonucleoprotein that is involved in pre-mRNA processing. Q#24140 - CGI_10005343 superfamily 243212 179 294 8.11E-10 55.4278 cl02844 Arrestin_C superfamily - - "Arrestin (or S-antigen), C-terminal domain; Ig-like beta-sandwich fold. Scop reports duplication with N-terminal domain." 
Q#24141 - CGI_10005344 superfamily 241613 2058 2092 3.56E-09 56.061 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#24141 - CGI_10005344 superfamily 241613 2861 2894 5.03E-09 55.6758 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#24141 - CGI_10005344 superfamily 241613 384 417 8.42E-09 54.9054 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and 
transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#24141 - CGI_10005344 superfamily 241613 2775 2807 9.68E-09 54.5202 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#24141 - CGI_10005344 superfamily 241613 302 336 1.04E-08 54.5202 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and 
the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#24141 - CGI_10005344 superfamily 241613 2099 2133 3.41E-08 52.9794 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#24141 - CGI_10005344 superfamily 241613 2947 2981 6.25E-08 52.209 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing 
and maintaining the modular structure" Q#24141 - CGI_10005344 superfamily 241613 262 293 1.26E-07 51.4386 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#24141 - CGI_10005344 superfamily 241613 2817 2851 1.68E-07 51.0534 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#24141 - CGI_10005344 superfamily 241613 183 217 1.93E-07 51.0534 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the 
receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#24141 - CGI_10005344 superfamily 241613 2569 2603 2.11E-07 50.6682 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#24141 - CGI_10005344 superfamily 241613 2527 2564 2.16E-07 50.6682 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very 
low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#24141 - CGI_10005344 superfamily 241613 2904 2936 9.89E-07 48.7422 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#24141 - CGI_10005344 superfamily 241613 1730 1762 4.15E-06 47.2014 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide 
isomer and is necessary in establishing and maintaining the modular structure" Q#24141 - CGI_10005344 superfamily 241613 2691 2724 5.24E-06 46.8162 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#24141 - CGI_10005344 superfamily 241613 1810 1846 5.29E-06 46.8162 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#24141 - CGI_10005344 superfamily 241613 341 376 0.000128418 42.579 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role 
in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#24141 - CGI_10005344 superfamily 245213 546 579 0.000146077 42.6238 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#24141 - CGI_10005344 superfamily 241613 2024 2049 0.000291531 41.4234 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#24141 - CGI_10005344 superfamily 241613 468 501 0.0037813 38.3418 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#24141 - CGI_10005344 superfamily 214531 1571 1613 3.40E-12 64.9304 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. 
Also present in a variety of molecules similar to gp300/megalin. Q#24141 - CGI_10005344 superfamily 214531 745 787 1.37E-11 63.0044 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#24141 - CGI_10005344 superfamily 215683 721 762 3.69E-11 61.8023 cl18339 Ldl_recept_b superfamily - - Low-density lipoprotein receptor repeat class B; This domain is also known as the YWTD motif after the most conserved region of the repeat. The YWTD repeat is found in multiple tandem repeats and has been predicted to form a beta-propeller structure. Q#24141 - CGI_10005344 superfamily 214531 2376 2416 1.07E-09 57.6116 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#24141 - CGI_10005344 superfamily 214531 981 1015 3.22E-09 56.0709 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#24141 - CGI_10005344 superfamily 214531 2286 2328 8.67E-09 54.9153 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#24141 - CGI_10005344 superfamily 214531 658 697 8.47E-08 52.2189 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. 
Also present in a variety of molecules similar to gp300/megalin. Q#24141 - CGI_10005344 superfamily 215683 1548 1588 5.56E-07 49.4759 cl18339 Ldl_recept_b superfamily - - Low-density lipoprotein receptor repeat class B; This domain is also known as the YWTD motif after the most conserved region of the repeat. The YWTD repeat is found in multiple tandem repeats and has been predicted to form a beta-propeller structure. Q#24141 - CGI_10005344 superfamily 215683 2350 2391 6.31E-07 49.4759 cl18339 Ldl_recept_b superfamily - - Low-density lipoprotein receptor repeat class B; This domain is also known as the YWTD motif after the most conserved region of the repeat. The YWTD repeat is found in multiple tandem repeats and has been predicted to form a beta-propeller structure. Q#24141 - CGI_10005344 superfamily 214531 24 59 1.29E-06 48.7521 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#24141 - CGI_10005344 superfamily 214531 3233 3274 2.14E-06 47.9817 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#24141 - CGI_10005344 superfamily 214531 3152 3182 4.59E-06 46.8261 cl18310 LY superfamily N - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. 
Q#24141 - CGI_10005344 superfamily 214531 1401 1442 5.42E-06 46.8261 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#24141 - CGI_10005344 superfamily 214531 3186 3230 6.84E-06 46.4409 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#24141 - CGI_10005344 superfamily 214531 2246 2284 1.80E-05 45.2853 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#24141 - CGI_10005344 superfamily 221695 2156 2179 9.30E-05 42.8274 cl18612 cEGF superfamily - - "Complement Clr-like EGF-like; cEGF, or complement Clr-like EGF, domains have six conserved cysteine residues disulfide-bonded into the characteristic pattern 'ababcc'. They are found in blood coagulation proteins such as fibrillin, Clr and Cls, thrombomodulin, and the LDL receptor. The core fold of the EGF domain consists of two small beta-hairpins packed against each other. Two major structural variants have been identified based on the structural context of the C-terminal cysteine residue of disulfide 'c' in the C-terminal hairpin: hEGFs and cEGFs. In cEGFs the C-terminal thiol resides on the C-terminal beta-sheet, resulting in long loop-lengths between the cysteine residues of disulfide 'c', typically C[10+]XC. These longer loop-lengths may have arisen by selective cysteine loss from a four-disulfide EGF template such as laminin or integrin. 
Tandem cEGF domains have five linking residues between terminal cysteines of adjacent domains. cEGF domains may or may not bind calcium in the linker region. cEGF domains with the consensus motif CXN4X[F,Y]XCXC are hydroxylated exclusively on the asparagine residue." Q#24141 - CGI_10005344 superfamily 221695 3008 3031 0.000212083 42.057 cl18612 cEGF superfamily - - "Complement Clr-like EGF-like; cEGF, or complement Clr-like EGF, domains have six conserved cysteine residues disulfide-bonded into the characteristic pattern 'ababcc'. They are found in blood coagulation proteins such as fibrillin, Clr and Cls, thrombomodulin, and the LDL receptor. The core fold of the EGF domain consists of two small beta-hairpins packed against each other. Two major structural variants have been identified based on the structural context of the C-terminal cysteine residue of disulfide 'c' in the C-terminal hairpin: hEGFs and cEGFs. In cEGFs the C-terminal thiol resides on the C-terminal beta-sheet, resulting in long loop-lengths between the cysteine residues of disulfide 'c', typically C[10+]XC. These longer loop-lengths may have arisen by selective cysteine loss from a four-disulfide EGF template such as laminin or integrin. Tandem cEGF domains have five linking residues between terminal cysteines of adjacent domains. cEGF domains may or may not bind calcium in the linker region. cEGF domains with the consensus motif CXN4X[F,Y]XCXC are hydroxylated exclusively on the asparagine residue." Q#24141 - CGI_10005344 superfamily 214531 1498 1528 0.000242004 41.8185 cl18310 LY superfamily N - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. 
Q#24141 - CGI_10005344 superfamily 214531 2417 2458 0.002686 38.7369 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#24141 - CGI_10005344 superfamily 245213 2176 2216 0.00348874 38.382 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#24141 - CGI_10005344 superfamily 245213 3028 3060 0.00397568 38.382 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#24141 - CGI_10005344 superfamily 215683 633 673 0.00582345 37.9199 cl18339 Ldl_recept_b superfamily - - Low-density lipoprotein receptor repeat class B; This domain is also known as the YWTD motif after the most conserved region of the repeat. The YWTD repeat is found in multiple tandem repeats and has been predicted to form a beta-propeller structure. 
Q#24141 - CGI_10005344 superfamily 245213 2134 2169 0.00919747 37.231 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#24142 - CGI_10005345 superfamily 241613 9 43 2.43E-09 53.7498 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#24142 - CGI_10005345 superfamily 245213 125 156 0.00151249 36.8458 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. 
Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#24142 - CGI_10005345 superfamily 214531 278 323 3.48E-09 53.3745 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#24142 - CGI_10005345 superfamily 214531 326 366 1.42E-06 46.0557 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#24142 - CGI_10005345 superfamily 214531 235 271 4.68E-05 41.4333 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#24142 - CGI_10005345 superfamily 215683 209 250 0.000291329 39.0755 cl18339 Ldl_recept_b superfamily - - Low-density lipoprotein receptor repeat class B; This domain is also known as the YWTD motif after the most conserved region of the repeat. The YWTD repeat is found in multiple tandem repeats and has been predicted to form a beta-propeller structure. 
Q#24142 - CGI_10005345 superfamily 241613 47 80 0.000628272 37.9981 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#24142 - CGI_10005345 superfamily 214531 565 600 0.00187236 36.8109 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#24142 - CGI_10005345 superfamily 214531 368 405 0.00425452 35.6553 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#24143 - CGI_10005346 superfamily 215821 37 129 3.51E-42 140.453 cl18346 FKBP_C superfamily - - FKBP-type peptidyl-prolyl cis-trans isomerase; FKBP-type peptidyl-prolyl cis-trans isomerase. 
Q#24145 - CGI_10006969 superfamily 241613 51 86 3.42E-05 39.1122 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#24146 - CGI_10006970 superfamily 247724 18 188 7.77E-54 171.958 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. 
The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#24147 - CGI_10006971 superfamily 246597 1 112 4.09E-47 159.361 cl13995 MPP_superfamily superfamily N - "metallophosphatase superfamily, metallophosphatase domain; Metallophosphatases (MPPs), also known as metallophosphoesterases, phosphodiesterases (PDEs), binuclear metallophosphoesterases, and dimetal-containing phosphoesterases (DMPs), represent a diverse superfamily of enzymes with a conserved domain containing an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. This superfamily includes: the phosphoprotein phosphatases (PPPs), Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination." Q#24147 - CGI_10006971 superfamily 145792 202 227 2.88E-07 45.3295 cl10597 Antistasin superfamily - - Antistasin family; Members of this family are inhibitors of trypsin family proteases. This domain is highly disulphide bonded. The domain is also found in some large extracellular proteins in multiple copies. 
Q#24156 - CGI_10021509 superfamily 220695 37 238 8.74E-06 45.2623 cl18571 7TM_GPCR_Srx superfamily C - Serpentine type 7TM GPCR chemoreceptor Srx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srx is part of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#24157 - CGI_10021510 superfamily 202351 289 402 4.86E-17 77.9223 cl03662 Na_Pi_cotrans superfamily C - Na+/Pi-cotransporter; This is a family of mainly mammalian type II renal Na+/Pi-cotransporters with other related sequences from lower eukaryotes and bacteria some of which are also Na+/Pi-cotransporters. In the kidney the type II renal Na+/Pi-cotransporters protein allows re-absorption of filtered Pi in the proximal tubule. Q#24157 - CGI_10021510 superfamily 202351 32 155 1.21E-10 59.0476 cl03662 Na_Pi_cotrans superfamily C - Na+/Pi-cotransporter; This is a family of mainly mammalian type II renal Na+/Pi-cotransporters with other related sequences from lower eukaryotes and bacteria some of which are also Na+/Pi-cotransporters. In the kidney the type II renal Na+/Pi-cotransporters protein allows re-absorption of filtered Pi in the proximal tubule. Q#24158 - CGI_10021511 superfamily 245206 150 217 9.23E-11 58.064 cl09931 NADB_Rossmann superfamily C - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. 
Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#24159 - CGI_10021512 superfamily 148662 3 81 3.54E-47 147.08 cl06288 SF3b10 superfamily - - Splicing factor 3B subunit 10 (SF3b10); This family consists of several eukaryotic splicing factor 3B subunit 10 (SF3b10) proteins. SF3b10 is a 10 kDa subunit of the splicing factor SF3b. SF3b associates with the splicing factor SF3a and a 12S RNA unit to form the U2 small nuclear ribonucleoproteins complex. SF3b10 and SF3b14b are also thought to facilitate the interaction of U2 with the branch site. Q#24160 - CGI_10021513 superfamily 217473 96 320 4.91E-28 113.999 cl03978 Mab-21 superfamily - - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#24161 - CGI_10021514 superfamily 201217 284 333 1.87E-11 59.4616 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#24161 - CGI_10021514 superfamily 201217 386 437 6.68E-11 57.9208 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#24161 - CGI_10021514 superfamily 201217 110 159 6.75E-08 49.0612 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#24161 - CGI_10021514 superfamily 201217 59 107 1.48E-06 45.2092 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. 
Q#24161 - CGI_10021514 superfamily 201217 162 211 4.57E-06 44.0536 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#24161 - CGI_10021514 superfamily 201217 337 383 5.61E-05 40.5868 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#24161 - CGI_10021514 superfamily 205718 198 227 0.000824553 37.0846 cl16296 RCC1_2 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#24166 - CGI_10021519 superfamily 241597 2 64 6.99E-07 42.9912 cl00082 HMG-box superfamily - - "High Mobility Group (HMG)-box is found in a variety of eukaryotic chromosomal proteins and transcription factors. HMGs bind to the minor groove of DNA and have been classified by DNA binding preferences. Two phylogenically distinct groups of Class I proteins bind DNA in a sequence specific fashion and contain a single HMG box. One group (SOX-TCF) includes transcription factors, TCF-1, -3, -4; and also SRY and LEF-1, which bind four-way DNA junctions and duplex DNA targets. The second group (MATA) includes fungal mating type gene products MC, MATA1 and Ste11. Class II and III proteins (HMGB-UBF) bind DNA in a non-sequence specific fashion and contain two or more tandem HMG boxes. Class II members include non-histone chromosomal proteins, HMG1 and HMG2, which bind to bent or distorted DNA such as four-way DNA junctions, synthetic DNA cruciforms, kinked cisplatin-modified DNA, DNA bulges, cross-overs in supercoiled DNA, and can cause looping of linear DNA. Class III members include nucleolar and mitochondrial transcription factors, UBF and mtTF1, which bind four-way DNA junctions." 
Q#24167 - CGI_10021520 superfamily 241874 748 1261 0 563.258 cl00456 SLC5-6-like_sbd superfamily - - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#24168 - CGI_10021521 superfamily 241874 24 576 0 554.013 cl00456 SLC5-6-like_sbd superfamily - - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. 
NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#24170 - CGI_10021523 superfamily 241599 127 181 3.78E-21 84.2172 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#24170 - CGI_10021523 superfamily 146451 248 262 0.00271964 34.6423 cl08404 OAR superfamily - - OAR domain; OAR domain. Q#24172 - CGI_10021525 superfamily 243066 7 62 3.98E-09 53.7109 cl02518 BTB superfamily N - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." 
Q#24172 - CGI_10021525 superfamily 219619 362 419 5.66E-08 49.8988 cl18518 Ion_trans_2 superfamily - - Ion channel; This family includes the two membrane helix type ion channels found in bacteria. Q#24173 - CGI_10021526 superfamily 219619 324 379 9.71E-08 49.1284 cl18518 Ion_trans_2 superfamily - - Ion channel; This family includes the two membrane helix type ion channels found in bacteria. Q#24173 - CGI_10021526 superfamily 243066 1 27 0.000959056 37.1473 cl02518 BTB superfamily N - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#24174 - CGI_10021527 superfamily 198867 284 383 5.41E-20 86.6264 cl06652 BACK superfamily - - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation)." 
Q#24174 - CGI_10021527 superfamily 243066 168 275 1.29E-15 73.8057 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#24174 - CGI_10021527 superfamily 243146 616 662 1.38E-09 54.975 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#24174 - CGI_10021527 superfamily 243146 578 627 2.73E-06 45.6271 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. 
" Q#24175 - CGI_10021528 superfamily 247684 7 436 9.55E-92 290.333 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#24176 - CGI_10021529 superfamily 100116 271 319 1.07E-10 58.5108 cl10082 NF-X1-zinc-finger superfamily - - "Presumably a zinc binding domain, which has been shown to bind to DNA in the human nuclear transcriptional repressor NF-X1. The zinc finger can be characterized by the pattern C-X(1-6)-H-X-C-X3-C(H/C)-X(3-4)-(H/C)-X(1-10)-C. The NF-X1 zinc finger co-occurs with atypical RING-finger and R3H domains. Human NF-X1 is involved in the transcriptional repression of major histocompatibility complex class II genes. 
The drosophila homolog encoded by stc (shuttle craft) plays a role in embryonic development, and the Arabidopsis homologue AtNFXL1 has been shown to function in the response to trichothecene and other defense mechanisms." Q#24176 - CGI_10021529 superfamily 100116 377 425 1.77E-10 58.1256 cl10082 NF-X1-zinc-finger superfamily - - "Presumably a zinc binding domain, which has been shown to bind to DNA in the human nuclear transcriptional repressor NF-X1. The zinc finger can be characterized by the pattern C-X(1-6)-H-X-C-X3-C(H/C)-X(3-4)-(H/C)-X(1-10)-C. The NF-X1 zinc finger co-occurs with atypical RING-finger and R3H domains. Human NF-X1 is involved in the transcriptional repression of major histocompatibility complex class II genes. The drosophila homolog encoded by stc (shuttle craft) plays a role in embryonic development, and the Arabidopsis homologue AtNFXL1 has been shown to function in the response to trichothecene and other defense mechanisms." Q#24176 - CGI_10021529 superfamily 100116 456 504 2.13E-10 57.7404 cl10082 NF-X1-zinc-finger superfamily - - "Presumably a zinc binding domain, which has been shown to bind to DNA in the human nuclear transcriptional repressor NF-X1. The zinc finger can be characterized by the pattern C-X(1-6)-H-X-C-X3-C(H/C)-X(3-4)-(H/C)-X(1-10)-C. The NF-X1 zinc finger co-occurs with atypical RING-finger and R3H domains. Human NF-X1 is involved in the transcriptional repression of major histocompatibility complex class II genes. The drosophila homolog encoded by stc (shuttle craft) plays a role in embryonic development, and the Arabidopsis homologue AtNFXL1 has been shown to function in the response to trichothecene and other defense mechanisms." Q#24176 - CGI_10021529 superfamily 100116 324 372 1.47E-09 55.4292 cl10082 NF-X1-zinc-finger superfamily - - "Presumably a zinc binding domain, which has been shown to bind to DNA in the human nuclear transcriptional repressor NF-X1. 
The zinc finger can be characterized by the pattern C-X(1-6)-H-X-C-X3-C(H/C)-X(3-4)-(H/C)-X(1-10)-C. The NF-X1 zinc finger co-occurs with atypical RING-finger and R3H domains. Human NF-X1 is involved in the transcriptional repression of major histocompatibility complex class II genes. The drosophila homolog encoded by stc (shuttle craft) plays a role in embryonic development, and the Arabidopsis homologue AtNFXL1 has been shown to function in the response to trichothecene and other defense mechanisms." Q#24176 - CGI_10021529 superfamily 100116 216 261 6.52E-06 44.6436 cl10082 NF-X1-zinc-finger superfamily - - "Presumably a zinc binding domain, which has been shown to bind to DNA in the human nuclear transcriptional repressor NF-X1. The zinc finger can be characterized by the pattern C-X(1-6)-H-X-C-X3-C(H/C)-X(3-4)-(H/C)-X(1-10)-C. The NF-X1 zinc finger co-occurs with atypical RING-finger and R3H domains. Human NF-X1 is involved in the transcriptional repression of major histocompatibility complex class II genes. The drosophila homolog encoded by stc (shuttle craft) plays a role in embryonic development, and the Arabidopsis homologue AtNFXL1 has been shown to function in the response to trichothecene and other defense mechanisms." Q#24176 - CGI_10021529 superfamily 100116 661 711 0.00825406 35.3988 cl10082 NF-X1-zinc-finger superfamily - - "Presumably a zinc binding domain, which has been shown to bind to DNA in the human nuclear transcriptional repressor NF-X1. The zinc finger can be characterized by the pattern C-X(1-6)-H-X-C-X3-C(H/C)-X(3-4)-(H/C)-X(1-10)-C. The NF-X1 zinc finger co-occurs with atypical RING-finger and R3H domains. Human NF-X1 is involved in the transcriptional repression of major histocompatibility complex class II genes. The drosophila homolog encoded by stc (shuttle craft) plays a role in embryonic development, and the Arabidopsis homologue AtNFXL1 has been shown to function in the response to trichothecene and other defense mechanisms." 
Q#24177 - CGI_10021530 superfamily 241594 249 600 2.95E-146 429.678 cl00077 HECTc superfamily - - "HECT domain; C-terminal catalytic domain of a subclass of Ubiquitin-protein ligase (E3). It binds specific ubiquitin-conjugating enzymes (E2), accepts ubiquitin from E2, transfers ubiquitin to substrate lysine side chains, and transfers additional ubiquitin molecules to the end of growing ubiquitin chains." Q#24178 - CGI_10021531 superfamily 247805 517 656 1.08E-26 108.579 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#24178 - CGI_10021531 superfamily 247905 791 922 4.38E-26 106.168 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#24178 - CGI_10021531 superfamily 247792 215 260 6.41E-12 62.8484 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 
finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#24178 - CGI_10021531 superfamily 244870 47 122 1.06E-10 60.9926 cl08238 PA superfamily N - "PA: Protease-associated (PA) domain. The PA domain is an insert domain in a diverse fraction of proteases. The significance of the PA domain to many of the proteins in which it is inserted is undetermined. It may be a protein-protein interaction domain. At peptidase active sites, the PA domain may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate. Proteins into which the PA domain is inserted include the following: i) various signal peptide peptidases including, hSPPL2a and 2b which catalyze the intramembrane proteolysis of tumor necrosis factor alpha, ii) various proteins containing a C3H2C3 RING finger including, Arabidopsis ReMembR-H2 protein and various E3 ubiquitin ligases such as human GRAIL (gene related to anergy in lymphocytes), iii) EDEM3 (ER-degradation-enhancing mannosidase-like 3 protein), iv) various plant vacuolar sorting receptors such as Pisum sativum BP-80, v) glutamate carboxypeptidase II (GCPII), vi) yeast aminopeptidase Y, vii) Vibrio metschnikovii VapT, a sodium dodecyl sulfate (SDS) resistant extracellular alkaline serine protease, viii) lactocepin (a cell envelope-associated protease from Lactobacillus paracasei subsp. paracasei NCDO 151), ix) various subtilisin-like proteases such as melon Cucumisin, and x) human TfR (transferrin receptor) 1 and 2." Q#24178 - CGI_10021531 superfamily 247804 1163 1202 5.16E-06 45.6442 cl17250 SANT superfamily - - "'SWI3, ADA2, N-CoR and TFIIIB' DNA-binding domains. Tandem copies of the domain bind telomeric DNA tandem repeats as part of the capping complex. Binding is sequence dependent for repeats which contain the G/C rich motif [C2-3 A (CA)1-6]. 
The domain is also found in regulatory transcriptional repressor complexes where it also binds DNA." Q#24178 - CGI_10021531 superfamily 220113 1217 1327 2.41E-48 169.684 cl07653 SLIDE superfamily - - "SLIDE; The SLIDE domain adopts a secondary structure comprising a main core of three alpha-helices. It has a role in DNA binding, contacting DNA target sites similar to c-Myb (pfam00249) repeats or homeodomains." Q#24178 - CGI_10021531 superfamily 220112 1061 1139 1.06E-18 84.2504 cl07652 HAND superfamily C - "HAND; The HAND domain adopts a secondary structure consisting of four alpha helices, three of which (H2, H3, H4) form an L-like configuration. Helix H2 runs antiparallel to helices H3 and H4, packing closely against helix H4, whilst helix H1 reposes in the concave surface formed by these three helices and runs perpendicular to them. The domain confers DNA and nucleosome binding properties to the protein." Q#24179 - CGI_10021532 superfamily 243555 20 182 1.23E-15 70.4978 cl03871 Chitin_bind_3 superfamily - - "Chitin binding domain; This domain is found associated with a wide variety of cellulose binding domain. This domain however is a chitin binding domain. This domain is found in isolation in baculoviral spheroidins and spindolins, protein of unknown function." Q#24181 - CGI_10021534 superfamily 243072 38 82 2.21E-11 60.0898 cl02529 ANK superfamily C - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#24183 - CGI_10021536 superfamily 241782 267 660 2.18E-130 391.181 cl00321 AAT_I superfamily - - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phosphorylase family (fold type V)." Q#24183 - CGI_10021536 superfamily 241659 7 46 0.00500433 35.9587 cl00175 alpha-crystallin-Hsps_p23-like superfamily C - "alpha-crystallin domain (ACD) found in alpha-crystallin-type small heat shock proteins, and a similar domain found in p23 (a cochaperone for Hsp90) and in other p23-like proteins.; The alpha-crystallin-Hsps_p23-like superfamily includes the alpha-crystallin domain (ACD) of alpha-crystallin-type small heat shock proteins (sHsps) and a similar domain found in p23-like proteins. 
sHsps are small stress induced proteins with monomeric masses between 12-43 kDa, whose common feature is this ACD. sHsps are generally active as large oligomers consisting of multiple subunits, and are believed to be ATP-independent chaperones that prevent aggregation and are important in refolding in combination with other Hsps. p23 is a cochaperone of the Hsp90 chaperoning pathway. It binds Hsp90 and participates in the folding of a number of Hsp90 clients including the progesterone receptor. p23 also has a passive chaperoning activity. p23 in addition may act as the cytosolic prostaglandin E2 synthase. Included in this superfamily is the p23-like C-terminal CHORD-SGT1 (CS) domain of suppressor of G2 allele of Skp1 (Sgt1) and the p23-like domains of human butyrate-induced transcript 1 (hB-ind1), NUD (nuclear distribution) C, Melusin, and NAD(P)H cytochrome b5 (NCB5) oxidoreductase (OR)." Q#24185 - CGI_10021538 superfamily 110440 213 239 0.00199074 35.0761 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase); proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#24186 - CGI_10021540 superfamily 241563 68 109 4.40E-07 47.0888 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#24189 - CGI_10000776 superfamily 242881 8 184 6.60E-116 334.156 cl02099 CK_II_beta superfamily - - Casein kinase II regulatory subunit; Casein kinase II regulatory subunit. Q#24197 - CGI_10014699 superfamily 241563 35 66 0.00940683 33.6068 cl00034 BBOX superfamily N - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#24198 - CGI_10014700 superfamily 241570 250 359 6.91E-17 77.7514 cl00047 CAP_ED superfamily - - "effector domain of the CAP family of transcription factors; members include CAP (or cAMP receptor protein (CRP)), which binds cAMP, FNR (fumarate and nitrate reduction), which uses an iron-sulfur cluster to sense oxygen) and CooA, a heme containing CO sensor. In all cases binding of the effector leads to conformational changes and the ability to activate transcription. Cyclic nucleotide-binding domain similar to CAP are also present in cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) and vertebrate cyclic nucleotide-gated ion-channels. 
Cyclic nucleotide-monophosphate binding domain; proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues; the best studied is the prokaryotic catabolite gene activator, CAP, where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure; three conserved glycine residues are thought to be essential for maintenance of the structural integrity of the beta-barrel; CooA is a homodimeric transcription factor that belongs to CAP family; cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain; cAPK's are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain; cGPK's are single chain enzymes that include the two copies of the domain in their N-terminal section; also found in vertebrate cyclic nucleotide-gated ion-channels" Q#24199 - CGI_10014701 superfamily 247723 43 169 3.85E-45 149.423 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." 
Q#24199 - CGI_10014701 superfamily 245716 172 196 0.000651843 36.4533 cl11592 zf-CCCH superfamily - - Zinc finger C-x8-C-x5-C-x3-H type (and similar); Zinc finger C-x8-C-x5-C-x3-H type (and similar). Q#24199 - CGI_10014701 superfamily 245716 13 39 0.00148981 35.2267 cl11592 zf-CCCH superfamily - - Zinc finger C-x8-C-x5-C-x3-H type (and similar); Zinc finger C-x8-C-x5-C-x3-H type (and similar). Q#24200 - CGI_10014702 superfamily 241546 21 108 4.23E-27 106.591 cl00011 PLAT superfamily N - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats. The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#24201 - CGI_10014703 superfamily 147799 1 74 1.50E-08 46.4219 cl05422 Apc13p superfamily - - Apc13p protein; The anaphase-promoting complex (APC) is a conserved multi-subunit ubiquitin ligase required for the degradation of key cell cycle regulators. Members of this family are components of the anaphase-promoting complex homologous to Apc13p. Q#24202 - CGI_10014704 superfamily 241739 82 286 1.81E-58 202.411 cl00268 class_II_aaRS-like_core superfamily C - "Class II tRNA amino-acyl synthetase-like catalytic core domain. Class II amino acyl-tRNA synthetases (aaRS) share a common fold and generally attach an amino acid to the 3' OH of ribose of the appropriate tRNA. PheRS is an exception in that it attaches the amino acid at the 2'-OH group, like class I aaRSs. These enzymes are usually homodimers. This domain is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. The substrate specificity of this reaction is further determined by additional domains. 
Interestingly, this domain is also found in asparagine synthase A (AsnA), in the accessory subunit of mitochondrial polymerase gamma and in the bacterial ATP phosphoribosyltransferase regulatory subunit HisZ." Q#24202 - CGI_10014704 superfamily 241739 688 817 5.24E-49 175.832 cl00268 class_II_aaRS-like_core superfamily N - "Class II tRNA amino-acyl synthetase-like catalytic core domain. Class II amino acyl-tRNA synthetases (aaRS) share a common fold and generally attach an amino acid to the 3' OH of ribose of the appropriate tRNA. PheRS is an exception in that it attaches the amino acid at the 2'-OH group, like class I aaRSs. These enzymes are usually homodimers. This domain is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. The substrate specificity of this reaction is further determined by additional domains. Interestingly, this domain is also found in asparagine synthase A (AsnA), in the accessory subunit of mitochondrial polymerase gamma and in the bacterial ATP phosphoribosyltransferase regulatory subunit HisZ." Q#24202 - CGI_10014704 superfamily 241546 602 686 6.28E-23 95.8056 cl00011 PLAT superfamily C - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats. The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#24202 - CGI_10014704 superfamily 245205 2 67 4.95E-11 60.2717 cl09930 RPA_2b-aaRSs_OBF_like superfamily - - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. 
One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." Q#24202 - CGI_10014704 superfamily 243086 481 521 2.42E-09 54.3034 cl02559 GPS superfamily - - "Latrophilin/CL-1-like GPS domain; Domain present in latrophilin/CL-1, sea urchin REJ and polycystin." Q#24206 - CGI_10014709 superfamily 241698 102 320 1.43E-79 244.481 cl00220 cysteine_hydrolases superfamily - - "Cysteine hydrolases; This family contains amidohydrolases, like CSHase (N-carbamoylsarcosine amidohydrolase), involved in creatine metabolism and nicotinamidase, converting nicotinamide to nicotinic acid and ammonia in the pyridine nucleotide cycle. It also contains isochorismatase, an enzyme that catalyzes the conversion of isochorismate to 2,3-dihydroxybenzoate and pyruvate, via the hydrolysis of the vinyl ether bond, and other related enzymes with unknown function." 
Q#24206 - CGI_10014709 superfamily 247856 24 88 3.64E-09 52.5501 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#24207 - CGI_10014710 superfamily 247792 6 59 1.22E-09 54.7592 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#24207 - CGI_10014710 superfamily 128778 160 278 0.000285129 39.9407 cl17972 BBC superfamily - - B-Box C-terminal domain; Coiled coil region C-terminal to (some) B-Box domains Q#24207 - CGI_10014710 superfamily 241563 105 129 0.0040291 35.918 cl00034 BBOX superfamily C - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#24209 - CGI_10014712 superfamily 221425 1240 1533 4.84E-80 267.694 cl13536 Nup96 superfamily - - Nuclear protein 96; Nup96 (often known by the name of its yeast homolog Nup145C) is part of the Nup84 heptameric complex in the nuclear pore complex. Nup96 complexes with Sec13 in the middle of the heptamer. 
The function of the heptamer is to coat the curvature of the nuclear pore complex between the inner and outer nuclear membranes. Nup96 is predicted to be an alpha helical solenoid. The interaction between Nup96 and Sec13 is the point of curvature in the heptameric complex. Q#24209 - CGI_10014712 superfamily 202886 706 846 1.99E-56 193.979 cl04399 Nucleoporin2 superfamily - - Nucleoporin autopeptidase; Nucleoporin autopeptidase. Q#24209 - CGI_10014712 superfamily 222274 110 177 1.56E-05 45.1696 cl18658 Nucleoporin_FG superfamily C - "Nucleoporin FG repeat region; This family includes a number of FG repeats that are found in nucleoporin proteins. This family includes the yeast nucleoporins Nup116, Nup100, Nup49, Nup57 and Nup 145." Q#24209 - CGI_10014712 superfamily 222274 226 323 0.00766887 36.6952 cl18658 Nucleoporin_FG superfamily - - "Nucleoporin FG repeat region; This family includes a number of FG repeats that are found in nucleoporin proteins. This family includes the yeast nucleoporins Nup116, Nup100, Nup49, Nup57 and Nup 145." Q#24210 - CGI_10014713 superfamily 243088 53 161 4.97E-09 53.566 cl02563 PX_domain superfamily - - "The Phox Homology domain, a phosphoinositide binding module; The PX domain is a phosphoinositide (PI) binding module involved in targeting proteins to membranes. Proteins containing PX domains interact with PIs and have been implicated in highly diverse functions such as cell signaling, vesicular trafficking, protein sorting, lipid modification, cell polarity and division, activation of T and B cells, and cell survival. Many members of this superfamily bind phosphatidylinositol-3-phosphate (PI3P) but in some cases, other PIs such as PI4P or PI(3,4)P2, among others, are the preferred substrates. In addition to protein-lipid interaction, the PX domain may also be involved in protein-protein interaction, as in the cases of p40phox, p47phox, and some sorting nexins (SNXs). 
The PX domain is conserved from yeast to humans and is found in more than 100 proteins. The majority of PX domain-containing proteins are SNXs, which play important roles in endosomal sorting." Q#24210 - CGI_10014713 superfamily 245835 250 294 1.25E-06 47.1161 cl12013 BAR superfamily N - "The Bin/Amphiphysin/Rvs (BAR) domain, a dimerization module that binds membranes and detects membrane curvature; BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions including organelle biogenesis, membrane trafficking or remodeling, and cell division and migration. Mutations in BAR containing proteins have been linked to diseases and their inactivation in cells leads to altered membrane dynamics. A BAR domain with an additional N-terminal amphipathic helix (an N-BAR) can drive membrane curvature. These N-BAR domains are found in amphiphysins and endophilins, among others. BAR domains are also frequently found alongside domains that determine lipid specificity, such as the Pleckstrin Homology (PH) and Phox Homology (PX) domains which are present in beta centaurins (ACAPs and ASAPs) and sorting nexins, respectively. A FES-CIP4 Homology (FCH) domain together with a coiled coil region is called the F-BAR domain and is present in Pombe/Cdc15 homology (PCH) family proteins, which include Fes/Fes tyrosine kinases, PACSIN or syndapin, CIP4-like proteins, and srGAPs, among others. The Inverse (I)-BAR or IRSp53/MIM homology Domain (IMD) is found in multi-domain proteins, such as IRSp53 and MIM, that act as scaffolding proteins and transducers of a variety of signaling pathways that link membrane dynamics and the underlying actin cytoskeleton. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. 
The I-BAR domain induces membrane protrusions in the opposite direction compared to classical BAR and F-BAR domains, which produce membrane invaginations. BAR domains that also serve as protein interaction domains include those of arfaptin and OPHN1-like proteins, among others, which bind to Rac and Rho GAP domains, respectively." Q#24211 - CGI_10014714 superfamily 245206 4 255 6.75E-90 269.281 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#24212 - CGI_10014715 superfamily 241884 47 242 8.04E-107 325.294 cl00467 Ntn_hydrolase superfamily - - "The Ntn hydrolases (N-terminal nucleophile) are a diverse superfamily of enzymes that are activated autocatalytically via an N-terminally located nucleophilic amino acid. N-terminal nucleophile (NTN-) hydrolase superfamily, which contains a four-layered alpha, beta, beta, alpha core structure. This family of hydrolases includes penicillin acylase, the 20S proteasome alpha and beta subunits, and glutamate synthase. The mechanism of activation of these proteins is conserved, although they differ in their substrate specificities. 
All known members catalyze the hydrolysis of amide bonds in either proteins or small molecules, and each one of them is synthesized as a preprotein. For each, an autocatalytic endoproteolytic process generates a new N-terminal residue. This mature N-terminal residue is central to catalysis and acts as both a polarizing base and a nucleophile during the reaction. The N-terminal amino group acts as the proton acceptor and activates either the nucleophilic hydroxyl in a Ser or Thr residue or the nucleophilic thiol in a Cys residue. The position of the N-terminal nucleophile in the active site and the mechanism of catalysis are conserved in this family, despite considerable variation in the protein sequences." Q#24212 - CGI_10014715 superfamily 245595 405 696 3.19E-101 314.459 cl11393 Peptidase_M14_like superfamily - - "M14 family of metallocarboxypeptidases and related proteins; The M14 family of metallocarboxypeptidases (MCPs), also known as funnelins, are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Two major subfamilies of the M14 family, defined based on sequence and structural homology, are the A/B and N/E subfamilies. Enzymes belonging to the A/B subfamily are normally synthesized as inactive precursors containing preceding signal peptide, followed by an N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases. The A/B enzymes can be further divided based on their substrate specificity; Carboxypeptidase A-like (CPA-like) enzymes favor hydrophobic residues while carboxypeptidase B-like (CPB-like) enzymes only cleave the basic residues lysine or arginine. The A forms have slightly different specificities, with Carboxypeptidase A1 (CPA1) preferring aliphatic and small aromatic residues, and CPA2 preferring the bulky aromatic side chains. 
Enzymes belonging to the N/E subfamily enzymes are not produced as inactive precursors and instead rely on their substrate specificity and subcellular compartmentalization to prevent inappropriate cleavage. They contain an extra C-terminal transthyretin-like domain, thought to be involved in folding or formation of oligomers. MCPs can also be classified based on their involvement in specific physiological processes; the pancreatic MCPs participate only in alimentary digestion and include carboxypeptidase A and B (A/B subfamily), while others, namely regulatory MCPs or the N/E subfamily, are involved in more selective reactions, mainly in non-digestive tissues and fluids, acting on blood coagulation/fibrinolysis, inflammation and local anaphylaxis, pro-hormone and neuropeptide processing, cellular response and others. Another MCP subfamily, is that of succinylglutamate desuccinylase /aspartoacylase, which hydrolyzes N-acetyl-L-aspartate (NAA), and deficiency in which is the established cause of Canavan disease. Another subfamily (referred to as subfamily C) includes an exceptional type of activity in the MCP family, that of dipeptidyl-peptidase activity of gamma-glutamyl-(L)-meso-diaminopimelate peptidase I which is involved in bacterial cell wall metabolism." Q#24212 - CGI_10014715 superfamily 216944 309 381 1.45E-10 58.3627 cl03496 Propep_M14 superfamily - - "Carboxypeptidase activation peptide; Carboxypeptidases are found in abundance in pancreatic secretions. The pro-segment moiety (activation peptide) accounts for up to a quarter of the total length of the peptidase, and is responsible for modulation of folding and activity of the pro-enzyme." Q#24213 - CGI_10014716 superfamily 217293 37 235 1.24E-36 133.912 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. 
Q#24213 - CGI_10014716 superfamily 202474 242 297 7.32E-05 42.6409 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#24214 - CGI_10000941 superfamily 245670 86 267 1.01E-60 198.574 cl11519 DENN superfamily - - DENN (AEX-3) domain; DENN (after differentially expressed in neoplastic vs normal cells) is a domain which occurs in several proteins involved in Rab-mediated processes or regulation of MAPK signalling pathways. Q#24214 - CGI_10000941 superfamily 208095 304 369 8.54E-17 74.9375 cl04084 dDENN superfamily - - dDENN domain; This region is always found associated with pfam02141. It is predicted to form a globular domain. This domain is predicted to be completely alpha helical. Although not statistically supported it has been suggested that this domain may be similar to members of the Rho/Rac/Cdc42 GEF family. Q#24214 - CGI_10000941 superfamily 243635 4 79 4.29E-15 70.8264 cl04085 uDENN superfamily - - uDENN domain; This region is always found associated with pfam02141. It is predicted to form an all beta domain. Q#24215 - CGI_10025581 superfamily 248097 77 181 2.83E-14 65.7494 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#24218 - CGI_10025584 superfamily 245847 2 92 1.42E-07 44.855 cl12042 FA58C superfamily N - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#24220 - CGI_10025586 superfamily 203031 105 164 5.33E-07 46.1672 cl04548 FLYWCH superfamily - - "FLYWCH zinc finger domain; Mutations in the mod(mdg4) gene have effects on variegation (PEV), the properties of insulator sequences, correct path-finding of growing nerve cells, meiotic pairing of chromosomes, and apoptosis. 
The occurrence of FLYWCH motifs in mod(mdg4) gene product and other proteins is discussed in." Q#24221 - CGI_10025587 superfamily 247856 106 159 0.000367049 37.1421 cl17302 EFh superfamily N - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#24222 - CGI_10025588 superfamily 244539 110 334 3.53E-36 132.814 cl06868 FNR_like superfamily - - "Ferredoxin reductase (FNR), an FAD and NAD(P) binding protein, was initially identified as a chloroplast reductase activity, catalyzing the electron transfer from reduced iron-sulfur protein ferredoxin to NADP+ as the final step in the electron transport mechanism of photosystem I. FNR transfers electrons from reduced ferredoxin to FAD (forming FADH2 via a semiquinone intermediate) and then transfers a hydride ion to convert NADP+ to NADPH. FNR has since been shown to utilize a variety of electron acceptors and donors and has a variety of physiological functions including nitrogen assimilation, dinitrogen fixation, steroid hydroxylation, fatty acid metabolism, oxygenase activity, and methane assimilation in many organisms. FNR has an NAD(P)-binding sub-domain of the alpha/beta class and a discrete (usually N-terminal) flavin sub-domain which vary in orientation with respect to the NAD(P) binding domain. The N-terminal moiety may contain a flavin prosthetic group (as in flavoenzymes) or use flavin as a substrate. Because flavins such as FAD can exist in oxidized, semiquinone (one-electron reduced), or fully reduced hydroquinone forms, FNR can interact with one and 2 electron carriers. FNR has a strong preference for NADP(H) vs NAD(H)." 
Q#24222 - CGI_10025588 superfamily 203841 250 423 6.98E-25 99.3344 cl17716 NAD_binding_6 superfamily - - Ferric reductase NAD binding domain; Ferric reductase NAD binding domain. Q#24222 - CGI_10025588 superfamily 242267 21 72 1.15E-07 49.596 cl01043 Ferric_reduct superfamily N - "Ferric reductase like transmembrane component; This family includes a common region in the transmembrane proteins mammalian cytochrome B-245 heavy chain (gp91-phox), ferric reductase transmembrane component in yeast and respiratory burst oxidase from mouse-ear cress. This may be a family of flavocytochromes capable of moving electrons across the plasma membrane. The Frp1 protein from S. pombe is a ferric reductase component and is required for cell surface ferric reductase activity; mutants in frp1 are deficient in ferric iron uptake. Cytochrome B-245 heavy chain is a FAD-dependent dehydrogenase; it also has electron transferase activity, which reduces molecular oxygen to superoxide anion, a precursor in the production of microbicidal oxidants. Mutations in the sequence of cytochrome B-245 heavy chain (gp91-phox) lead to the X-linked chronic granulomatous disease. The bactericidal ability of phagocytic cells is reduced and is characterized by the absence of a functional plasma membrane associated NADPH oxidase. The chronic granulomatous disease gene codes for the beta chain of cytochrome B-245 and cytochrome B-245 is missing from patients with the disease." Q#24225 - CGI_10025591 superfamily 248097 128 256 5.38E-19 80.0018 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#24227 - CGI_10025593 superfamily 241760 287 331 2.49E-15 72.2925 cl00295 ZZ superfamily - - "Zinc finger, ZZ type. Zinc finger present in dystrophin, CBP/p300 and many other proteins. The ZZ motif coordinates one or two zinc ions and most likely participates in ligand binding or molecular scaffolding. 
Many proteins containing ZZ motifs have other zinc-binding motifs as well, and the majority serve as scaffolds in pathways involving acetyltransferase, protein kinase, or ubiquitin-related activity. ZZ proteins can be grouped into the following functional classes: chromatin modifying, cytoskeletal scaffolding, ubiquitin binding or conjugating, and membrane receptor or ion-channel modifying proteins." Q#24227 - CGI_10025593 superfamily 243157 27 83 1.25E-08 53.6965 cl02720 PB1 superfamily N - "The PB1 domain is a modular domain mediating specific protein-protein interactions which play a role in many critical cell processes, such as osteoclastogenesis, angiogenesis, early cardiovascular development, and cell polarity. A canonical PB1-PB1 interaction, which involves heterodimerization of two PB1 domains, is required for the formation of macromolecular signaling complexes ensuring specificity and fidelity during cellular signaling. The interaction between two PB1 domains depends on the type of PB1. There are three types of PB1 domains: type I which contains an OPCA motif, acidic amino acid cluster, type II which contains a basic cluster, and type I/II which contains both an OPCA motif and a basic cluster. Interactions of PB1 domains with other protein domains have been described as noncanonical PB1 interactions. The PB1 domain module is conserved in amoebas, fungi, animals, and plants." Q#24228 - CGI_10025594 superfamily 247044 7 121 1.85E-50 172.793 cl15697 ADF_gelsolin superfamily - - Actin depolymerization factor/cofilin- and gelsolin-like domains; Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. 
Q#24228 - CGI_10025594 superfamily 247044 619 716 5.72E-39 140.51 cl15697 ADF_gelsolin superfamily - - Actin depolymerization factor/cofilin- and gelsolin-like domains; Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. Q#24228 - CGI_10025594 superfamily 247044 391 491 1.54E-38 139.328 cl15697 ADF_gelsolin superfamily - - Actin depolymerization factor/cofilin- and gelsolin-like domains; Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. Q#24228 - CGI_10025594 superfamily 247044 521 604 1.03E-29 114.251 cl15697 ADF_gelsolin superfamily - - Actin depolymerization factor/cofilin- and gelsolin-like domains; Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. Q#24228 - CGI_10025594 superfamily 247044 252 350 4.97E-29 112.343 cl15697 ADF_gelsolin superfamily - - Actin depolymerization factor/cofilin- and gelsolin-like domains; Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. 
Q#24228 - CGI_10025594 superfamily 247044 134 207 1.81E-26 105.012 cl15697 ADF_gelsolin superfamily - - Actin depolymerization factor/cofilin- and gelsolin-like domains; Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. Q#24228 - CGI_10025594 superfamily 207613 784 819 4.47E-08 50.7805 cl02491 VHP superfamily - - Villin headpiece domain; Villin headpiece domain. Q#24230 - CGI_10025596 superfamily 243077 623 682 1.03E-19 84.1341 cl02542 DnaJ superfamily - - "DnaJ domain or J-domain. DnaJ/Hsp40 (heat shock protein 40) proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperonine family. Hsp40 proteins are characterized by the presence of a J domain, which mediates the interaction with Hsp70. They may contain other domains as well, and the architectures provide a means of classification." 
Q#24230 - CGI_10025596 superfamily 243034 505 595 2.26E-16 75.8795 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#24230 - CGI_10025596 superfamily 243034 297 369 1.37E-10 58.9308 cl02429 TPR superfamily N - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number 
of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#24230 - CGI_10025596 superfamily 243034 340 484 7.38E-08 50.8416 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 
subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#24230 - CGI_10025596 superfamily 241547 37 289 3.59E-94 295.862 cl00012 alpha_CA superfamily - - "Carbonic anhydrase alpha (vertebrate-like) group. Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionary distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidine residues and a fourth conserved histidine plays a potential role in proton transfer." Q#24233 - CGI_10025599 superfamily 217293 1 38 1.30E-06 44.5459 cl03788 Neur_chan_LBD superfamily NC - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#24234 - CGI_10025600 superfamily 217293 315 509 5.59E-38 140.846 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#24234 - CGI_10025600 superfamily 217293 33 222 1.79E-32 125.438 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. 
Q#24234 - CGI_10025600 superfamily 202474 517 612 2.44E-05 44.9521 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#24234 - CGI_10025600 superfamily 202474 230 280 0.00473856 38.0185 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#24235 - CGI_10025601 superfamily 248097 155 283 5.60E-13 63.4382 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#24236 - CGI_10025602 superfamily 247724 7 173 7.95E-75 224.714 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. 
Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#24237 - CGI_10025603 superfamily 243072 74 194 5.75E-22 88.5946 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#24237 - CGI_10025603 superfamily 243073 230 275 1.64E-06 43.9981 cl02533 SOCS superfamily - - "SOCS (suppressors of cytokine signaling) box. The SOCS box is found in the C-terminal region of CIS/SOCS family proteins (in combination with a SH2 domain), ASBs (ankyrin repeat-containing proteins with a SOCS box), SSBs (SPRY domain-containing proteins with a SOCS box), and WSBs (WD40 repeat-containing proteins with a SOCS box), as well as, other miscellaneous proteins. The function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions." Q#24238 - CGI_10025604 superfamily 189332 49 283 1.59E-76 252.623 cl14874 Luminal_IRE1_like superfamily - - "The Luminal domain, a dimerization domain, of Inositol-requiring protein 1-like proteins; The Luminal domain is a dimerization domain present in Inositol-requiring protein 1 (IRE1), eukaryotic translation Initiation Factor 2-Alpha Kinase 3 (EIF2AK3), and similar proteins. 
IRE1 and EIF2AK3 are serine/threonine protein kinases (STKs) and are type I transmembrane proteins that are localized in the endoplasmic reticulum (ER). They are kinase receptors that are activated through the release of BiP, a chaperone bound to their luminal domains under unstressed conditions. This results in dimerization through their luminal domains, allowing trans-autophosphorylation of their kinase domains and activation. They play roles in the signaling of the unfolded protein response (UPR), which is activated when protein misfolding is detected in the ER in order to decrease the synthesis of new proteins and increase the capacity of the ER to cope with the stress. IRE1, also called Endoplasmic reticulum (ER)-to-nucleus signaling protein (or ERN), contains an endoribonuclease domain in its cytoplasmic side and acts as an ER stress sensor. It is the oldest and most conserved component of the UPR in eukaryotes. Its activation results in the cleavage of its mRNA substrate, HAC1 in yeast and Xbp1 in metazoans, promoting a splicing event that enables translation into a transcription factor which activates the UPR. EIF2AK3, also called PKR-like Endoplasmic Reticulum Kinase (PERK), phosphorylates the alpha subunit of eIF-2, resulting in the downregulation of protein synthesis. It functions as the central regulator of translational control during the UPR pathway. In addition to the eIF-2 alpha subunit, EIF2AK3 also phosphorylates Nrf2, a leucine zipper transcription factor which regulates cellular redox status and promotes cell survival during the UPR." Q#24238 - CGI_10025604 superfamily 246937 745 870 2.79E-62 206.668 cl15368 RNase_Ire1_like superfamily - - "RNase domain (also known as the kinase extension nuclease domain) of Ire1 and RNase L; This RNase domain is found in the multi-functional protein Ire1; Ire1 also contains a type I transmembrane serine/threonine protein kinase (STK) domain, and a Luminal dimerization domain. 
Ire1 is essential for the endoplasmic reticulum (ER) unfolded protein response (UPR). The UPR is activated when protein misfolding is detected in the ER in order to reduce the synthesis of new proteins and increase the capacity of the ER to cope with the stress. IRE1 acts as an ER stress sensor; IRE1 dimerizes through its N-terminal luminal domain and forms oligomers, promoting trans-autophosphorylation by its cytosolic kinase domain which stimulates its endoribonuclease (RNase) activity and results in the cleavage of its mRNA substrate, Hac1 in yeast and Xbp1 in metazoans, thus promoting a splicing event that enables translation into a transcription factor which activates the UPR. This RNase domain is also found in Ribonuclease L (RNase L), sometimes referred to as the 2-5A-dependent RNase. RNase L is a highly regulated, latent endoribonuclease widely expressed in most mammalian tissues. It is involved in the mediation of the antiviral and pro-apoptotic activities of the interferon-inducible 2-5A system; the interferon (IFN)-inducible 2'-5'-oligoadenylate synthetase (OAS)/RNase L pathway blocks infections by certain types of viruses through cleavage of viral and cellular single-stranded RNA. RNase L has been shown to have an impact on the pathogenesis of prostate cancer; the RNase L gene, RNASEL, has been identified as a strong candidate for the hereditary prostate cancer 1 (HPC1) allele." Q#24238 - CGI_10025604 superfamily 245201 490 739 1.47E-47 168.954 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. 
These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#24240 - CGI_10025606 superfamily 189332 58 84 1.80E-06 43.0745 cl14874 Luminal_IRE1_like superfamily C - "The Luminal domain, a dimerization domain, of Inositol-requiring protein 1-like proteins; The Luminal domain is a dimerization domain present in Inositol-requiring protein 1 (IRE1), eukaryotic translation Initiation Factor 2-Alpha Kinase 3 (EIF2AK3), and similar proteins. IRE1 and EIF2AK3 are serine/threonine protein kinases (STKs) and are type I transmembrane proteins that are localized in the endoplasmic reticulum (ER). They are kinase receptors that are activated through the release of BiP, a chaperone bound to their luminal domains under unstressed conditions. This results in dimerization through their luminal domains, allowing trans-autophosphorylation of their kinase domains and activation. They play roles in the signaling of the unfolded protein response (UPR), which is activated when protein misfolding is detected in the ER in order to decrease the synthesis of new proteins and increase the capacity of the ER to cope with the stress. IRE1, also called Endoplasmic reticulum (ER)-to-nucleus signaling protein (or ERN), contains an endoribonuclease domain in its cytoplasmic side and acts as an ER stress sensor. It is the oldest and most conserved component of the UPR in eukaryotes. Its activation results in the cleavage of its mRNA substrate, HAC1 in yeast and Xbp1 in metazoans, promoting a splicing event that enables translation into a transcription factor which activates the UPR. EIF2AK3, also called PKR-like Endoplasmic Reticulum Kinase (PERK), phosphorylates the alpha subunit of eIF-2, resulting in the downregulation of protein synthesis. It functions as the central regulator of translational control during the UPR pathway. 
In addition to the eIF-2 alpha subunit, EIF2AK3 also phosphorylates Nrf2, a leucine zipper transcription factor which regulates cellular redox status and promotes cell survival during the UPR." Q#24241 - CGI_10025607 superfamily 192522 625 714 7.42E-27 106.103 cl10974 DUF2404 superfamily - - Putative integral membrane protein conserved region (DUF2404); This domain is conserved from plants to humans. The function is not known. Q#24242 - CGI_10025608 superfamily 221329 31 183 1.85E-40 137.53 cl13389 DUF3456 superfamily - - "TLR4 regulator and MIR-interacting MSAP; This family of proteins, found from plants to humans, is PRAT4 (A and B), a Protein Associated with Toll-like receptor 4. The Toll family of receptors - TLRs - plays an essential role in innate recognition of microbial products, the first line of defence against bacterial infection. PRAT4A influences the subcellular distribution and the strength of TLR responses and alters the relative activity of each TLR. PRAT4B regulates TLR4 trafficking to the cell surface and the extent of its expression there. TLR4 recognizes lipopolysaccharide (LPS), one of the most immuno-stimulatory glycolipids constituting the outer membrane of the Gram-negative bacteria. This family has also been described as a SAP-like MIR-interacting protein family." 
Q#24243 - CGI_10025609 superfamily 243092 45 354 2.15E-18 82.768 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#24244 - CGI_10025610 superfamily 243507 125 322 7.86E-86 270.38 cl03728 Alpha_kinase superfamily - - "Alpha-kinase family; This family is a novel family of eukaryotic protein kinase catalytic domains, which have no detectable similarity to conventional kinases. The family contains myosin heavy chain kinases and Elongation Factor-2 kinase and a bifunctional ion channel. This family is known as the alpha-kinase family. The structure of the kinase domain revealed unexpected similarity to eukaryotic protein kinases in the catalytic core as well as to metabolic enzymes with ATP-grasp domains." 
Q#24245 - CGI_10025611 superfamily 247723 1198 1252 0.00197591 39.2105 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#24246 - CGI_10025612 superfamily 243072 324 379 4.53E-09 55.8526 cl02529 ANK superfamily C - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#24246 - CGI_10025612 superfamily 241584 458 543 4.00E-08 52.4987 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. 
Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#24246 - CGI_10025612 superfamily 241645 1116 1203 2.54E-06 46.9076 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#24248 - CGI_10025614 superfamily 220679 174 247 4.13E-09 54.6405 cl18567 Methyltransf_16 superfamily C - Putative methyltransferase; Putative methyltransferase. Q#24248 - CGI_10025614 superfamily 220679 337 417 0.000721822 38.8473 cl18567 Methyltransf_16 superfamily N - Putative methyltransferase; Putative methyltransferase. Q#24249 - CGI_10025615 superfamily 247723 111 134 1.78E-10 54.8691 cl17169 RRM_SF superfamily C - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. 
This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#24250 - CGI_10025616 superfamily 198867 101 181 3.01E-10 57.5535 cl06652 BACK superfamily - - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation)." Q#24250 - CGI_10025616 superfamily 243066 2 75 4.53E-08 51.0789 cl02518 BTB superfamily N - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." 
Q#24250 - CGI_10025616 superfamily 243146 313 359 0.00193573 36.6574 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#24251 - CGI_10025617 superfamily 240425 1 156 3.87E-59 187.331 cl18912 PTZ00464 superfamily C - SNF-7-like protein; Provisional Q#24252 - CGI_10025618 superfamily 241644 551 690 4.62E-49 168.92 cl00154 UBCc superfamily - - "Ubiquitin-conjugating enzyme E2, catalytic (UBCc) domain. This is part of the ubiquitin-mediated protein degradation pathway in which a thiol-ester linkage forms between a conserved cysteine and the C-terminus of ubiquitin and complexes with ubiquitin protein ligase enzymes, E3. This pathway regulates many fundamental cellular processes. There are also other E2s which form thiol-ester linkages without the use of E3s as well as several UBC homologs (TSG101, Mms2, Croc-1 and similar proteins) which lack the active site cysteine essential for ubiquitination and appear to function in DNA repair pathways which were omitted from the scope of this CD." Q#24252 - CGI_10025618 superfamily 248054 12 78 4.71E-08 50.9336 cl17500 NAD_binding_8 superfamily - - NAD(P)-binding Rossmann-like domain; NAD(P)-binding Rossmann-like domain. Q#24252 - CGI_10025618 superfamily 248054 223 269 0.00511682 36.8045 cl17500 NAD_binding_8 superfamily N - NAD(P)-binding Rossmann-like domain; NAD(P)-binding Rossmann-like domain. Q#24253 - CGI_10025619 superfamily 217046 37 174 1.18E-09 53.3512 cl03599 Reticulon superfamily - - "Reticulon; Reticulon, also known as neuroendocrine-specific protein (NSP), is a protein of unknown function which associates with the endoplasmic reticulum. This family represents the C-terminal domain of the three reticulon isoforms and their homologues." 
Q#24254 - CGI_10025620 superfamily 241832 26 71 2.89E-10 53.2743 cl00388 Thioredoxin_like superfamily N - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#24254 - CGI_10025620 superfamily 243175 132 189 8.68E-10 52.6828 cl02776 GST_C_family superfamily N - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). 
Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." Q#24255 - CGI_10025621 superfamily 245209 49 123 1.41E-08 48.541 cl09936 PP-binding superfamily N - Phosphopantetheine attachment site; A 4'-phosphopantetheine prosthetic group is attached through a serine. This prosthetic group acts as a a 'swinging arm' for the attachment of activated fatty acid and amino-acid groups. This domain forms a four helix bundle. This family includes members not included in Prosite. The inclusion of these members is supported by sequence analysis and functional evidence. The related domain of Vibrio anguillarum angR has the attachment serine replaced by an alanine. Q#24256 - CGI_10025622 superfamily 247724 3 176 1.36E-134 381.853 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. 
The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#24256 - CGI_10025622 superfamily 247724 190 288 1.68E-69 215.832 cl17170 Ras_like_GTPase superfamily N - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. 
The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#24259 - CGI_10025625 superfamily 247684 23 442 1.62E-78 256.435 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#24264 - CGI_10001184 superfamily 110440 255 280 0.00423951 34.3057 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#24265 - CGI_10001185 superfamily 241563 60 98 3.40E-06 44.2503 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#24267 - CGI_10001283 superfamily 192535 49 145 0.000112454 41.0422 cl18179 7TM_GPCR_Srsx superfamily C - Serpentine type 7TM GPCR chemoreceptor Srsx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srsx is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#24270 - CGI_10006705 superfamily 216456 41 178 0.00162264 37.6882 cl03182 RYDR_ITPR superfamily N - "RIH domain; The RIH (RyR and IP3R Homology) domain is an extracellular domain from two types of calcium channels. This region is found in the ryanodine receptor and the inositol-1,4,5-trisphosphate receptor. This domain may form a binding site for IP3." 
Q#24271 - CGI_10006706 superfamily 241563 43 78 5.04E-06 44.0072 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#24271 - CGI_10006706 superfamily 110440 504 531 0.00933903 34.3057 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#24273 - CGI_10006708 superfamily 243097 123 256 6.74E-30 115.82 cl02572 PIPKc superfamily C - "Phosphatidylinositol phosphate kinases (PIPK) catalyze the phosphorylation of phosphatidylinositol phosphate on the fourth or fifth hydroxyl of the inositol ring, to form phosphatidylinositol bisphosphate. CD alignment includes type II phosphatidylinositol phosphate kinases (PIPKII-beta), type I and II PIPK (-alpha, -beta, and -gamma) kinases and related yeast Fab1p and Mss4p kinases. Signaling by phosphorylated species of phosphatidylinositol regulates secretion, vesicular trafficking, membrane translocation, cell adhesion, chemotaxis, DNA synthesis, and cell cycling. The catalytic core domains of PIPKs are structurally similar to PI3K, PI4K, and cAMP-dependent protein kinases (PKA), the dimerization region is a unique feature of the PIPKs." 
Q#24273 - CGI_10006708 superfamily 243097 364 412 1.28E-09 57.2692 cl02572 PIPKc superfamily N - "Phosphatidylinositol phosphate kinases (PIPK) catalyze the phosphorylation of phosphatidylinositol phosphate on the fourth or fifth hydroxyl of the inositol ring, to form phosphatidylinositol bisphosphate. CD alignment includes type II phosphatidylinositol phosphate kinases (PIPKII-beta), type I and II PIPK (-alpha, -beta, and -gamma) kinases and related yeast Fab1p and Mss4p kinases. Signaling by phosphorylated species of phosphatidylinositol regulates secretion, vesicular trafficking, membrane translocation, cell adhesion, chemotaxis, DNA synthesis, and cell cycling. The catalytic core domains of PIPKs are structurally similar to PI3K, PI4K, and cAMP-dependent protein kinases (PKA), the dimerization region is a unique feature of the PIPKs." Q#24274 - CGI_10006709 superfamily 247727 69 174 0.000694576 37.7095 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#24277 - CGI_10006712 superfamily 151147 12 70 2.19E-11 58.2537 cl11240 DUF2475 superfamily - - Protein of unknown function (DUF2475); This family of proteins has no known function. Q#24277 - CGI_10006712 superfamily 151147 264 289 0.00043857 37.453 cl11240 DUF2475 superfamily C - Protein of unknown function (DUF2475); This family of proteins has no known function. 
Q#24278 - CGI_10006713 superfamily 241570 186 349 3.05E-10 58.1062 cl00047 CAP_ED superfamily - - "effector domain of the CAP family of transcription factors; members include CAP (or cAMP receptor protein (CRP)), which binds cAMP, FNR (fumarate and nitrate reduction), which uses an iron-sulfur cluster to sense oxygen) and CooA, a heme containing CO sensor. In all cases binding of the effector leads to conformational changes and the ability to activate transcription. Cyclic nucleotide-binding domain similar to CAP are also present in cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) and vertebrate cyclic nucleotide-gated ion-channels. Cyclic nucleotide-monophosphate binding domain; proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues; the best studied is the prokaryotic catabolite gene activator, CAP, where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure; three conserved glycine residues are thought to be essential for maintenance of the structural integrity of the beta-barrel; CooA is a homodimeric transcription factor that belongs to CAP family; cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain; cAPK's are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain; cGPK's are single chain enzymes that include the two copies of the domain in their N-terminal section; also found in vertebrate cyclic nucleotide-gated ion-channels" Q#24278 - CGI_10006713 superfamily 241570 37 168 2.99E-06 45.7798 cl00047 CAP_ED superfamily - - "effector domain of the CAP family of transcription factors; members include CAP (or cAMP receptor protein (CRP)), which binds cAMP, FNR (fumarate and nitrate reduction), which uses an iron-sulfur cluster to sense oxygen) and CooA, a heme containing CO sensor. 
In all cases binding of the effector leads to conformational changes and the ability to activate transcription. Cyclic nucleotide-binding domain similar to CAP are also present in cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) and vertebrate cyclic nucleotide-gated ion-channels. Cyclic nucleotide-monophosphate binding domain; proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues; the best studied is the prokaryotic catabolite gene activator, CAP, where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure; three conserved glycine residues are thought to be essential for maintenance of the structural integrity of the beta-barrel; CooA is a homodimeric transcription factor that belongs to CAP family; cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain; cAPK's are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain; cGPK's are single chain enzymes that include the two copies of the domain in their N-terminal section; also found in vertebrate cyclic nucleotide-gated ion-channels" Q#24279 - CGI_10006714 superfamily 246669 161 281 2.07E-26 100.792 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. 
Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#24282 - CGI_10002019 superfamily 245213 50 83 0.000113784 39.157 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#24282 - CGI_10002019 superfamily 245213 89 120 0.000222311 38.3866 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#24282 - CGI_10002019 superfamily 245213 341 372 0.000777743 36.8458 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#24282 - CGI_10002019 superfamily 245213 302 335 0.00299523 35.305 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#24288 - CGI_10002401 superfamily 241609 38 113 2.46E-20 82.4259 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." 
Q#24288 - CGI_10002401 superfamily 241609 121 198 2.72E-18 76.6479 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#24288 - CGI_10002401 superfamily 241613 3 28 0.00016003 37.9566 cl00104 LDLa superfamily N - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#24288 - CGI_10002401 superfamily 241613 196 226 0.000912847 35.6454 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are 
functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#24293 - CGI_10005923 superfamily 245205 66 143 1.11E-06 44.1509 cl09930 RPA_2b-aaRSs_OBF_like superfamily - - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." Q#24297 - CGI_10005927 superfamily 248264 6 54 0.000430113 38.3722 cl17710 DDE_4 superfamily N - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. 
This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#24298 - CGI_10005928 superfamily 220379 66 117 6.90E-18 78.6259 cl10734 DRY_EERY superfamily C - "Alternative splicing regulator; This entry represents the conserved N-terminal region of SWAP (suppressor-of-white-apricot protein) proteins. This region contains two highly conserved motifs, viz: DRY and EERY, which appear to be the sites for alternative splicing of exons 2 and 3 of the SWAP mRNA. These proteins are thus thought to be involved in auto-regulation of pre-mRNA splicing. Most family members are associated with two Surp domains pfam01805 and an Arginine- serine-rich binding region towards the C-terminus." Q#24300 - CGI_10005930 superfamily 245206 26 291 3.56E-90 272.176 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." 
Q#24301 - CGI_10005931 superfamily 247058 48 251 4.90E-53 182.375 cl15762 crotonase-like superfamily - - "Crotonase/Enoyl-Coenzyme A (CoA) hydratase superfamily. This superfamily contains a diverse set of enzymes including enoyl-CoA hydratase, napthoate synthase, methylmalonyl-CoA decarboxylase, 3-hydoxybutyryl-CoA dehydratase, and dienoyl-CoA isomerase. Many of these play important roles in fatty acid metabolism. In addition to a conserved structural core and the formation of trimers (or dimers of trimers), a common feature in this superfamily is the stabilization of an enolate anion intermediate derived from an acyl-CoA substrate. This is accomplished by two conserved backbone NH groups in active sites that form an oxyanion hole." Q#24302 - CGI_10005932 superfamily 248264 48 94 0.000924846 34.9054 cl17710 DDE_4 superfamily N - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#24303 - CGI_10002673 superfamily 241563 297 333 5.76E-06 44.3924 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#24304 - CGI_10002674 superfamily 248097 140 265 2.33E-31 114.285 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. 
Q#24304 - CGI_10002674 superfamily 248097 2 122 5.80E-23 91.5578 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#24305 - CGI_10002675 superfamily 241563 65 97 3.38E-05 41.5539 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#24307 - CGI_10017694 superfamily 241766 52 327 9.83E-126 364.126 cl00303 PNP_UDP_1 superfamily - - Phosphorylase superfamily; Members of this family include: purine nucleoside phosphorylase (PNP) Uridine phosphorylase (UdRPase) 5'-methylthioadenosine phosphorylase (MTA phosphorylase) Q#24309 - CGI_10017696 superfamily 152459 290 519 1.02E-71 234.588 cl13461 DUF3512 superfamily - - Domain of unknown function (DUF3512); This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 231 to 249 amino acids in length. This domain is found associated with pfam00439. Q#24309 - CGI_10017696 superfamily 243084 139 235 2.73E-48 164.891 cl02556 Bromodomain superfamily - - Bromodomain. Bromodomains are found in many chromatin-associated proteins and in nuclear histone acetyltransferases. They interact specifically with acetylated lysine. Q#24311 - CGI_10017698 superfamily 241580 85 162 1.33E-48 163.108 cl00061 FH superfamily - - "Forkhead (FH), also known as a "winged helix". FH is named for the Drosophila fork head protein, a transcription factor which promotes terminal rather than segmental development. This family of transcription factor domains, which bind to B-DNA as monomers, are also found in the Hepatocyte nuclear factor (HNF) proteins, which provide tissue-specific gene regulation. 
The structure contains 2 flexible loops or "wings" in the C-terminal region, hence the term winged helix." Q#24313 - CGI_10017701 superfamily 241580 70 147 6.83E-51 166.96 cl00061 FH superfamily - - "Forkhead (FH), also known as a "winged helix". FH is named for the Drosophila fork head protein, a transcription factor which promotes terminal rather than segmental development. This family of transcription factor domains, which bind to B-DNA as monomers, are also found in the Hepatocyte nuclear factor (HNF) proteins, which provide tissue-specific gene regulation. The structure contains 2 flexible loops or "wings" in the C-terminal region, hence the term winged helix." Q#24314 - CGI_10017702 superfamily 245206 10 339 0 536.025 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#24315 - CGI_10017703 superfamily 245603 174 247 0.000672922 37.316 cl11403 pepsin_retropepsin_like superfamily - - "Cellular and retroviral pepsin-like aspartate proteases; This family includes both cellular and retroviral pepsin-like aspartate proteases. 
The cellular pepsin and pepsin-like enzymes are twice as long as their retroviral counterparts. The cellular pepsin-like aspartic proteases are found in mammals, plants, fungi and bacteria. These well known and extensively characterized enzymes include pepsins, chymosin, rennin, cathepsins, and fungal aspartic proteases. Several have long been known to be medically (rennin, cathepsin D and E, pepsin) or commercially (chymosin) important. The eukaryotic pepsin-like proteases contain two domains possessing similar topological features. The N- and C-terminal domains, although structurally related by a 2-fold axis, have only limited sequence homology except in the vicinity of the active site. This suggests that the enzymes evolved by an ancient duplication event. The eukaryotic pepsin-like proteases have two active site ASP residues with each N- and C-terminal lobe contributing one residue. While the fungal and mammalian pepsins are bilobal proteins, retropepsins function as dimers and the monomer resembles structure of the N- or C-terminal domains of eukaryotic enzyme. The active site motif (Asp-Thr/Ser-Gly-Ser) is conserved between the retroviral and eukaryotic proteases and between the N-and C-terminal of eukaryotic pepsin-like proteases. The retropepsin-like family includes pepsin-like aspartate proteases from retroviruses, retrotransposons and retroelements; as well as eukaryotic DNA-damage-inducible proteins (DDIs), and bacterial aspartate peptidases. Retropepsin is synthesized as part of the POL polyprotein that contains an aspartyl-protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. This family of aspartate proteases is classified by MEROPS as the peptidase family A1 (pepsin A) and A2 (retropepsin family)." 
Q#24315 - CGI_10017703 superfamily 247057 9 63 3.29E-07 46.9077 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#24315 - CGI_10017703 superfamily 247057 74 134 2.52E-06 44.5933 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#24316 - CGI_10017704 superfamily 244895 21 483 9.15E-127 381.124 cl08294 Peptidase_M17 superfamily - - "Cytosol aminopeptidase family, N-terminal and catalytic domains. Family M17 contains zinc- and manganese-dependent exopeptidases ( EC 3.4.11.1), including leucine aminopeptidase. 
They catalyze removal of amino acids from the N-terminus of a protein and play a key role in protein degradation and in the metabolism of biologically active peptides. They do not contain HEXXH motif (which is used as one of the signature patterns to group the peptidase families) in the metal-binding site. The two associated zinc ions and the active site are entirely enclosed within the C-terminal catalytic domain in leucine aminopeptidase. The enzyme is a hexamer, with the catalytic domains clustered around the three-fold axis, and the two trimers related to one another by a two-fold rotation. The N-terminal domain is structurally similar to the ADP-ribose binding Macro domain. This family includes proteins from bacteria, archaea, animals and plants." Q#24317 - CGI_10017705 superfamily 198867 158 251 1.24E-18 82.004 cl06652 BACK superfamily - - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation)." Q#24317 - CGI_10017705 superfamily 243066 32 150 5.55E-12 63.0201 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. 
POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#24317 - CGI_10017705 superfamily 243146 392 438 1.07E-08 52.2786 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#24317 - CGI_10017705 superfamily 243146 356 403 1.68E-08 51.4051 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#24317 - CGI_10017705 superfamily 243146 452 498 8.90E-07 46.3975 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#24317 - CGI_10017705 superfamily 243146 487 540 0.00475513 35.7151 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#24318 - CGI_10017706 superfamily 112128 212 415 4.50E-92 279.37 cl03992 TF_AP-2 superfamily - - Transcription factor AP-2; Transcription factor AP-2. Q#24319 - CGI_10017707 superfamily 242902 222 390 2.66E-41 145.927 cl02144 TLD superfamily - - TLD; This domain is predicted to be an enzyme and is often found associated with pfam01476. Q#24320 - CGI_10017708 superfamily 244906 116 186 2.97E-12 61.8324 cl08315 CAP_GLY superfamily - - "CAP-Gly domain; Cytoskeleton-associated proteins (CAPs) are involved in the organisation of microtubules and transportation of vesicles and organelles along the cytoskeletal network. A conserved motif, CAP-Gly, has been identified in a number of CAPs, including CLIP-170 and dynactins. The crystal structure of Caenorhabditis elegans F53F4.3 protein CAP-Gly domain was recently solved. The domain contains three beta-strands. The most conserved sequence, GKNDG, is located in two consecutive sharp turns on the surface, forming the entrance to a groove." 
Q#24320 - CGI_10017708 superfamily 244906 217 278 1.65E-07 48.3504 cl08315 CAP_GLY superfamily - - "CAP-Gly domain; Cytoskeleton-associated proteins (CAPs) are involved in the organisation of microtubules and transportation of vesicles and organelles along the cytoskeletal network. A conserved motif, CAP-Gly, has been identified in a number of CAPs, including CLIP-170 and dynactins. The crystal structure of Caenorhabditis elegans F53F4.3 protein CAP-Gly domain was recently solved. The domain contains three beta-strands. The most conserved sequence, GKNDG, is located in two consecutive sharp turns on the surface, forming the entrance to a groove." Q#24320 - CGI_10017708 superfamily 207637 310 384 0.000213476 39.1282 cl02541 CIDE_N superfamily - - "CIDE_N domain, found at the N-terminus of the CIDE (cell death-inducing DFF45-like effector) proteins, as well as CAD nuclease (caspase-activated DNase/DNA fragmentation factor, DFF40) and its inhibitor, ICAD(DFF45). These proteins are associated with the chromatin condensation and DNA fragmentation events of apoptosis; the CIDE_N domain is thought to regulate the activity of ICAD/DFF45, and the CAD/DFF40 and CIDE nucleases during apoptosis. The CIDE-N domain is also found in the FSP27/CIDE-C protein." Q#24321 - CGI_10017709 superfamily 243082 672 955 8.58E-40 148.444 cl02553 Peptidase_C19 superfamily - - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. 
The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." Q#24321 - CGI_10017709 superfamily 244906 113 185 1.26E-15 73.3884 cl08315 CAP_GLY superfamily - - "CAP-Gly domain; Cytoskeleton-associated proteins (CAPs) are involved in the organisation of microtubules and transportation of vesicles and organelles along the cytoskeletal network. A conserved motif, CAP-Gly, has been identified in a number of CAPs, including CLIP-170 and dynactins. The crystal structure of Caenorhabditis elegans F53F4.3 protein CAP-Gly domain was recently solved. The domain contains three beta-strands. The most conserved sequence, GKNDG, is located in two consecutive sharp turns on the surface, forming the entrance to a groove." Q#24321 - CGI_10017709 superfamily 244906 467 538 3.30E-13 66.4548 cl08315 CAP_GLY superfamily - - "CAP-Gly domain; Cytoskeleton-associated proteins (CAPs) are involved in the organisation of microtubules and transportation of vesicles and organelles along the cytoskeletal network. A conserved motif, CAP-Gly, has been identified in a number of CAPs, including CLIP-170 and dynactins. The crystal structure of Caenorhabditis elegans F53F4.3 protein CAP-Gly domain was recently solved. The domain contains three beta-strands. The most conserved sequence, GKNDG, is located in two consecutive sharp turns on the surface, forming the entrance to a groove." Q#24321 - CGI_10017709 superfamily 244906 229 290 6.50E-06 44.8836 cl08315 CAP_GLY superfamily - - "CAP-Gly domain; Cytoskeleton-associated proteins (CAPs) are involved in the organisation of microtubules and transportation of vesicles and organelles along the cytoskeletal network. A conserved motif, CAP-Gly, has been identified in a number of CAPs, including CLIP-170 and dynactins. The crystal structure of Caenorhabditis elegans F53F4.3 protein CAP-Gly domain was recently solved. 
The domain contains three beta-strands. The most conserved sequence, GKNDG, is located in two consecutive sharp turns on the surface, forming the entrance to a groove." Q#24322 - CGI_10017710 superfamily 242883 25 206 2.77E-64 200.314 cl02103 Maf1 superfamily - - Maf1 regulator; Maf1 is a negative regulator of RNA polymerase III. It targets the initiation factor TFIIIB. Q#24323 - CGI_10017711 superfamily 247792 675 717 0.00558591 35.8844 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-(N/C/H)-X2-C-X(4-48)-C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#24323 - CGI_10017711 superfamily 241645 288 359 5.89E-13 65.6504 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." 
Q#24324 - CGI_10017712 superfamily 247724 16 164 6.99E-77 230.008 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#24325 - CGI_10017713 superfamily 245201 243 504 2.77E-49 170.11 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." 
Q#24325 - CGI_10017713 superfamily 247725 119 215 2.46E-23 94.8349 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting proteins to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and other proteins." Q#24326 - CGI_10017714 superfamily 248100 1088 1148 3.19E-07 49.0748 cl17546 PQ-loop superfamily - - "PQ loop repeat; Members of this family are all membrane bound proteins possessing a pair of repeats each spanning two transmembrane helices connected by a loop. The PQ motif found on loop 2 is critical for the localisation of cystinosin to lysosomes. However, the PQ motif appears not to be a general lysosome-targeting motif. It is thought likely to possess a more general function. Most probably this involves a glutamine residue." Q#24326 - CGI_10017714 superfamily 248100 925 985 1.63E-05 44.0672 cl17546 PQ-loop superfamily - - "PQ loop repeat; Members of this family are all membrane bound proteins possessing a pair of repeats each spanning two transmembrane helices connected by a loop. The PQ motif found on loop 2 is critical for the localisation of cystinosin to lysosomes. However, the PQ motif appears not to be a general lysosome-targeting motif. It is thought likely to possess a more general function. Most probably this involves a glutamine residue." 
Q#24327 - CGI_10017715 superfamily 243092 13 307 4.38E-73 231.455 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#24328 - CGI_10017716 superfamily 222150 563 588 1.88E-06 45.4605 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#24328 - CGI_10017716 superfamily 222150 535 559 3.92E-05 41.6085 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#24329 - CGI_10017717 superfamily 241597 259 319 4.39E-16 73.4219 cl00082 HMG-box superfamily - - "High Mobility Group (HMG)-box is found in a variety of eukaryotic chromosomal proteins and transcription factors. HMGs bind to the minor groove of DNA and have been classified by DNA binding preferences. Two phylogenically distinct groups of Class I proteins bind DNA in a sequence specific fashion and contain a single HMG box. 
One group (SOX-TCF) includes transcription factors, TCF-1, -3, -4; and also SRY and LEF-1, which bind four-way DNA junctions and duplex DNA targets. The second group (MATA) includes fungal mating type gene products MC, MATA1 and Ste11. Class II and III proteins (HMGB-UBF) bind DNA in a non-sequence specific fashion and contain two or more tandem HMG boxes. Class II members include non-histone chromosomal proteins, HMG1 and HMG2, which bind to bent or distorted DNA such as four-way DNA junctions, synthetic DNA cruciforms, kinked cisplatin-modified DNA, DNA bulges, cross-overs in supercoiled DNA, and can cause looping of linear DNA. Class III members include nucleolar and mitochondrial transcription factors, UBF and mtTF1, which bind four-way DNA junctions." Q#24330 - CGI_10017718 superfamily 218660 8 152 3.36E-38 130.147 cl05277 DUF788 superfamily - - Protein of unknown function (DUF788); This family consists of several eukaryotic proteins of unknown function. Q#24332 - CGI_10017720 superfamily 243034 29 126 2.80E-08 51.2268 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein 
complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#24332 - CGI_10017720 superfamily 243034 172 248 1.96E-07 48.5304 cl02429 TPR superfamily C - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#24333 - CGI_10017721 superfamily 215733 333 590 9.80E-12 64.8939 cl02811 E1-E2_ATPase superfamily - - E1-E2 ATPase; E1-E2 ATPase. 
Q#24333 - CGI_10017721 superfamily 222006 695 803 3.07E-09 55.6914 cl16182 Hydrolase_like2 superfamily - - Putative hydrolase of sodium-potassium ATPase alpha subunit; This is a putative hydrolase of the sodium-potassium ATPase alpha subunit. Q#24334 - CGI_10017722 superfamily 244906 58 127 3.18E-19 79.1664 cl08315 CAP_GLY superfamily - - "CAP-Gly domain; Cytoskeleton-associated proteins (CAPs) are involved in the organisation of microtubules and transportation of vesicles and organelles along the cytoskeletal network. A conserved motif, CAP-Gly, has been identified in a number of CAPs, including CLIP-170 and dynactins. The crystal structure of Caenorhabditis elegans F53F4.3 protein CAP-Gly domain was recently solved. The domain contains three beta-strands. The most conserved sequence, GKNDG, is located in two consecutive sharp turns on the surface, forming the entrance to a groove." Q#24336 - CGI_10017724 superfamily 241826 20 49 1.94E-05 37.5413 cl00380 Ribosomal_L36 superfamily - - Ribosomal protein L36; Ribosomal protein L36. Q#24338 - CGI_10017726 superfamily 215686 5 135 3.38E-06 42.4045 cl18340 Lipocalin superfamily - - "Lipocalin / cytosolic fatty-acid binding protein family; Lipocalins are transporters for small hydrophobic molecules, such as lipids, steroid hormones, bilins, and retinoids. The family also encompasses the enzyme prostaglandin D synthase (EC:5.3.99.2). Alignment subsumes both the lipocalin and fatty acid binding protein signatures from PROSITE. This is supported on structural and functional grounds. The structure is an eight-stranded beta barrel." Q#24341 - CGI_10017729 superfamily 248012 82 175 0.00311852 35.3768 cl17458 TIR_2 superfamily C - TIR domain; This is a family of bacterial Toll-like receptors. 
Q#24342 - CGI_10017730 superfamily 247866 10 128 5.62E-05 41.284 cl17312 PhyH superfamily N - "Phytanoyl-CoA dioxygenase (PhyH); This family is made up of several eukaryotic phytanoyl-CoA dioxygenase (PhyH) proteins, ectoine hydroxylases and a number of bacterial deoxygenases. PhyH is a peroxisomal enzyme catalyzing the first step of phytanic acid alpha-oxidation. PhyH deficiency causes Refsum's disease (RD) which is an inherited neurological syndrome biochemically characterized by the accumulation of phytanic acid in plasma and tissues." Q#24343 - CGI_10002729 superfamily 245882 21 396 8.83E-164 470.235 cl12119 Alpha_L_fucos superfamily - - Alpha-L-fucosidase; Alpha-L-fucosidase. Q#24344 - CGI_10002730 superfamily 217701 71 164 7.10E-06 43.4809 cl04237 Retrotrans_gag superfamily - - "Retrotransposon gag protein; Gag or Capsid-like proteins from LTR retrotransposons. There is a central motif QGXXEXXXXXFXXLXXH that is common to Retroviridae gag-proteins, but is poorly conserved." Q#24347 - CGI_10002733 superfamily 220964 47 123 0.00496917 35.6545 cl12630 DUF2869 superfamily NC - Protein of unknown function (DUF2869); This bacterial family of proteins has no known function. Q#24350 - CGI_10015678 superfamily 216411 1069 1189 9.03E-19 84.6358 cl15974 MARVEL superfamily N - "Membrane-associating domain; MARVEL domain-containing proteins are often found in lipid-associating proteins - such as Occludin and MAL family proteins. It may be part of the machinery of membrane apposition events, such as transport vesicle biogenesis." Q#24351 - CGI_10015679 superfamily 247692 128 480 2.26E-44 160.149 cl17068 AFD_class_I superfamily - - "Adenylate forming domain, Class I; This family includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. 
The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain." Q#24351 - CGI_10015679 superfamily 247692 1 153 3.94E-06 47.9746 cl17068 AFD_class_I superfamily C - "Adenylate forming domain, Class I; This family includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain." Q#24352 - CGI_10015680 superfamily 247692 128 480 1.23E-40 149.749 cl17068 AFD_class_I superfamily - - "Adenylate forming domain, Class I; This family includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain." Q#24352 - CGI_10015680 superfamily 247692 1 65 5.89E-06 47.1398 cl17068 AFD_class_I superfamily C - "Adenylate forming domain, Class I; This family includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. 
The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain." Q#24353 - CGI_10015681 superfamily 247692 128 480 1.58E-35 135.111 cl17068 AFD_class_I superfamily - - "Adenylate forming domain, Class I; This family includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain." Q#24353 - CGI_10015681 superfamily 247692 1 65 8.52E-06 46.7546 cl17068 AFD_class_I superfamily C - "Adenylate forming domain, Class I; This family includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain." Q#24358 - CGI_10015686 superfamily 248458 40 376 1.34E-07 51.9309 cl17904 MFS superfamily - - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. 
MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#24360 - CGI_10015688 superfamily 148067 294 385 7.88E-30 111.595 cl05643 DUF1011 superfamily - - Protein of unknown function (DUF1011); Family of uncharacterized eukaryotic proteins. Q#24361 - CGI_10015689 superfamily 241578 276 433 6.85E-37 133.571 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. 
The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation, and immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#24362 - CGI_10015690 superfamily 245201 46 235 2.52E-55 181.281 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#24363 - CGI_10015691 superfamily 241645 43 97 5.24E-16 72.9103 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. 
The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#24364 - CGI_10015692 superfamily 216167 153 292 4.06E-32 123.466 cl02999 DNA_photolyase superfamily - - DNA photolyase; This domain binds a light harvesting cofactor. Q#24365 - CGI_10015693 superfamily 220692 37 338 2.24E-08 53.7473 cl18570 7TM_GPCR_Srw superfamily - - Serpentine type 7TM GPCR chemoreceptor Srw; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srw is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. The genes encoding Srw do not appear to be under as strong an adaptive evolutionary pressure as those of Srz. Q#24366 - CGI_10015694 superfamily 220692 43 335 6.79E-24 99.2009 cl18570 7TM_GPCR_Srw superfamily - - Serpentine type 7TM GPCR chemoreceptor Srw; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srw is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. The genes encoding Srw do not appear to be under as strong an adaptive evolutionary pressure as those of Srz. Q#24367 - CGI_10015695 superfamily 201362 8 112 8.47E-36 124.785 cl08277 Motile_Sperm superfamily - - MSP (Major sperm protein) domain; Major sperm proteins are involved in sperm motility. 
These proteins oligomerise to form filaments. This family contains many other proteins. Q#24372 - CGI_10015700 superfamily 241578 762 938 6.60E-47 167.392 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation, and immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#24372 - CGI_10015700 superfamily 207701 519 635 2.30E-38 140.893 cl02699 VIT superfamily - - Vault protein inter-alpha-trypsin domain; Inter-alpha-trypsin inhibitors (ITIs) consist of one light chain and a variable set of heavy chains. ITIs play a role in extracellular matrix (ECM) stabilisation and tumour metastasis as well as in plasma protease inhibition. The vault protein inter-alpha-trypsin (VIT) domain described here is found to the N-terminus of a von Willebrand factor type A domain (pfam00092) in ITI heavy chains (ITIHs) and their precursors. Q#24372 - CGI_10015700 superfamily 241578 32 208 5.84E-30 118.472 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). 
Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation, and immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#24373 - CGI_10015701 superfamily 217473 1 73 8.77E-08 51.9821 cl03978 Mab-21 superfamily N - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#24376 - CGI_10007241 superfamily 248279 63 181 3.83E-38 139.009 cl17725 zf-HC5HC2H superfamily - - "PHD-like zinc-binding domain; The members of this family are annotated as containing PHD domain, but the zinc-binding region here is not typical of PHD domains. The conformation here is a well-conserved cysteine-histidine rich region spanning 90 residues, where the Cys and His are arranged as HxxC(31)CxxC(6)CxxCxxxxCxxxxHxxC (21)CxxH." Q#24376 - CGI_10007241 superfamily 247999 20 54 2.26E-10 57.2703 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. 
Q#24382 - CGI_10003314 superfamily 245213 511 546 6.02E-09 53.7946 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#24382 - CGI_10003314 superfamily 245213 359 394 7.54E-09 53.4094 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#24382 - CGI_10003314 superfamily 245213 549 584 1.83E-07 49.1722 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#24382 - CGI_10003314 superfamily 245213 435 470 2.43E-07 48.787 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#24382 - CGI_10003314 superfamily 245213 397 432 2.89E-07 48.787 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#24382 - CGI_10003314 superfamily 245213 473 508 3.09E-06 45.7054 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#24382 - CGI_10003314 superfamily 245213 321 357 0.000159523 40.6978 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#24382 - CGI_10003314 superfamily 241583 649 846 4.65E-50 178.336 cl00064 ZnMc superfamily N - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#24383 - CGI_10003315 superfamily 245213 38 73 3.74E-09 49.5574 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#24383 - CGI_10003315 superfamily 245213 76 111 2.54E-06 42.2386 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#24384 - CGI_10004837 superfamily 243072 14 150 2.36E-29 106.699 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#24386 - CGI_10004839 superfamily 218771 2 142 7.60E-08 47.1944 cl05420 Synaphin superfamily - - "Synaphin protein; This family consists of several eukaryotic synaphin 1 and 2 proteins. Synaphin/complexin is a cytosolic protein that preferentially binds to syntaxin within the SNARE complex. Synaphin promotes SNAREs to form precomplexes that oligomerise into higher order structures. A peptide from the central, syntaxin binding domain of synaphin competitively inhibits these two proteins from interacting and prevents SNARE complexes from oligomerising. It is thought that oligomerisation of SNARE complexes into a higher order structure creates a SNARE scaffold for efficient, regulated fusion of synaptic vesicles. Synaphin promotes neuronal exocytosis by promoting interaction between the complementary syntaxin and synaptobrevin transmembrane regions that reside in opposing membranes prior to fusion." 
Q#24388 - CGI_10004841 superfamily 241583 249 444 1.42E-67 227.113 cl00064 ZnMc superfamily - - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#24388 - CGI_10004841 superfamily 245321 461 537 3.14E-33 125.045 cl10507 Disintegrin superfamily - - Disintegrin; Disintegrin. Q#24388 - CGI_10004841 superfamily 246968 540 685 6.32E-33 125.935 cl15456 ADAM_CR superfamily - - ADAM cysteine-rich; ADAMs are membrane-anchored proteases that proteolytically modify cell surface and extracellular matrix (ECM) in order to alter cell behaviour. It has been shown that the cysteine-rich domain of ADAM13 regulates the protein's metalloprotease activity. Q#24388 - CGI_10004841 superfamily 216572 70 219 2.22E-20 89.2562 cl03265 Pep_M12B_propep superfamily - - Reprolysin family propeptide; This region is the propeptide for members of peptidase family M12B. The propeptide contains a sequence motif similar to the "cysteine switch" of the matrixins. This motif is found at the C terminus of the alignment but is not well aligned. Q#24391 - CGI_10001742 superfamily 245814 81 142 6.92E-09 48.9423 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#24391 - CGI_10001742 superfamily 245814 4 63 0.0012159 35.1737 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#24393 - CGI_10003776 superfamily 241841 11 124 9.75E-43 139.575 cl00399 MoaE superfamily - - "MoaE family. Members of this family are involved in biosynthesis of the molybdenum cofactor (Moco), an essential cofactor for a diverse group of redox enzymes. Moco biosynthesis is an evolutionarily conserved pathway present in eubacteria, archaea and eukaryotes. Moco contains a tricyclic pyranopterin, termed molybdopterin (MPT), which carries the cis-dithiolene group responsible for molybdenum ligation. This dithiolene group is generated by MPT synthase in the second major step in Moco biosynthesis. MPT synthase is a heterotetramer consisting of two large (MoaE) and two small (MoaD) subunits." Q#24394 - CGI_10003777 superfamily 240441 1430 1484 4.64E-23 95.7095 cl18913 Na_channel_gate superfamily - - Inactivation gate of the voltage-gated sodium channel alpha subunits; This region is part of the intracellular linker between domains III and IV of the alpha subunits of voltage-gated sodium channels. It is responsible for fast inactivation of the channel and essential for proper physiological function. 
Q#24394 - CGI_10003777 superfamily 219069 1126 1176 5.03E-05 45.4458 cl05828 Na_trans_assoc superfamily N - "Sodium ion transport-associated; Members of this family contain a region found exclusively in eukaryotic sodium channels or their subunits, many of which are voltage-gated. Members very often also contain between one and four copies of pfam00520 and, less often, one copy of pfam00612." Q#24395 - CGI_10003778 superfamily 247065 6 116 7.11E-13 61.2066 cl15777 GGCT_like superfamily - - "GGCT-like domains, also called AIG2-like family. Gamma-glutamyl cyclotransferase (GGCT) catalyzes the formation of pyroglutamic acid (5-oxoproline) from dipeptides containing gamma-glutamyl, and is a dimeric protein. In Homo sapiens, the protein is encoded by the gene C7orf24, and the enzyme participates in the gamma-glutamyl cycle. Hereditary defects in the gamma-glutamyl cycle have been described for some of the genes involved, but not for C7orf24. The synthesis and metabolism of glutathione (L-gamma-glutamyl-L-cysteinylglycine) ties the gamma-glutamyl cycle to numerous cellular processes; glutathione acts as a ubiquitous reducing agent in reductive mechanisms involved in protein and DNA synthesis, transport processes, enzyme activity, and metabolism. AIG2 (avrRpt2-induced gene) is an Arabidopsis protein that exhibits RPS2- and avrRpt2-dependent induction early after infection with Pseudomonas syringae pv maculicola strain ES4326 carrying avrRpt2. avrRpt2 is an avirulence gene that can convert virulent strains of P. syringae to avirulence on Arabidopsis thaliana, soybean, and bean. The family also includes bacterial tellurite-resistance proteins (trgB); tellurium (Te) compounds are used in industrial processes and had been used as antimicrobial agents in the past. Some members have been described as proteins involved in cation transport (chaC)." 
Q#24397 - CGI_10006787 superfamily 245847 27 152 6.49E-07 44.4156 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#24398 - CGI_10006788 superfamily 248012 131 231 7.72E-24 93.4112 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#24398 - CGI_10006788 superfamily 243058 21 123 0.00210027 36.1384 cl02500 ARM superfamily C - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#24399 - CGI_10006789 superfamily 217403 141 245 2.61E-14 66.6774 cl18408 2OG-FeII_Oxy superfamily - - "2OG-Fe(II) oxygenase superfamily; This family contains members of the 2-oxoglutarate (2OG) and Fe(II)-dependent oxygenase superfamily. This family includes the C-terminal of prolyl 4-hydroxylase alpha subunit. The holoenzyme has the activity EC:1.14.11.2 catalyzing the reaction: Procollagen L-proline + 2-oxoglutarate + O2 <=> procollagen trans- 4-hydroxy-L-proline + succinate + CO2. The full enzyme consists of a alpha2 beta2 complex with the alpha subunit contributing most of the parts of the active site. The family also includes lysyl hydrolases, isopenicillin synthases and AlkB." 
Q#24399 - CGI_10006789 superfamily 222608 11 88 0.000198434 38.7747 cl18680 DIOX_N superfamily N - non-haem dioxygenase in morphine synthesis N-terminal; This is the highly conserved N-terminal region of proteins with 2-oxoglutarate/Fe(II)-dependent dioxygenase activity. Q#24400 - CGI_10006790 superfamily 201664 5 175 4.25E-54 177.035 cl18216 NAD_Gly3P_dh_N superfamily - - NAD-dependent glycerol-3-phosphate dehydrogenase N-terminus; NAD-dependent glycerol-3-phosphate dehydrogenase (GPDH) catalyzes the interconversion of dihydroxyacetone phosphate and L-glycerol-3-phosphate. This family represents the N-terminal NAD-binding domain. Q#24400 - CGI_10006790 superfamily 116100 194 343 9.01E-53 173.153 cl08454 NAD_Gly3P_dh_C superfamily - - NAD-dependent glycerol-3-phosphate dehydrogenase C-terminus; NAD-dependent glycerol-3-phosphate dehydrogenase (GPDH) catalyzes the interconversion of dihydroxyacetone phosphate and L-glycerol-3-phosphate. This family represents the C-terminal substrate-binding domain. Q#24401 - CGI_10006791 superfamily 243054 177 373 0.000331716 41.6624 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#24401 - CGI_10006791 superfamily 219569 686 751 0.00759299 37.5176 cl06693 MFMR superfamily NC - "G-box binding protein MFMR; This region is found to the N-terminus of the pfam00170 transcription factor domain. It is between 150 and 200 amino acids in length. 
The N-terminal half is rather rich in proline residues and has been termed the PRD (proline rich domain), whereas the C-terminal half is more polar and has been called the MFMR (multifunctional mosaic region). It has been suggested that this family is composed of three sub-families called A, B and C, classified according to motif composition. It has been suggested that some of these motifs may be involved in mediating protein-protein interactions. The MFMR region contains a nuclear localisation signal in bZIP opaque and GBF-2. The MFMR also contains a transregulatory activity in TAF-1. The MFMR in CPRF-2 contains cytoplasmic retention signals." Q#24402 - CGI_10006792 superfamily 245010 957 1068 8.21E-26 105.004 cl09111 Prefoldin superfamily - - "Prefoldin is a hexameric molecular chaperone complex, found in both eukaryotes and archaea, that binds and stabilizes newly synthesized polypeptides allowing them to fold correctly. The complex contains two alpha and four beta subunits, the two subunits being evolutionarily related. In archaea, there is usually only one gene for each subunit while in eukaryotes there are two or more paralogous genes encoding each subunit adding heterogeneity to the structure of the hexamer. The structure of the complex consists of a double beta barrel assembly with six protruding coiled-coils." Q#24402 - CGI_10006792 superfamily 245201 18 305 1.56E-97 325.679 cl09925 PKc_like superfamily C - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." 
Q#24402 - CGI_10006792 superfamily 245201 452 917 1.89E-67 239.779 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#24403 - CGI_10006793 superfamily 243058 626 747 4.17E-26 105.859 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#24403 - CGI_10006793 superfamily 243058 925 1046 1.03E-16 78.5103 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. 
ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#24404 - CGI_10006794 superfamily 247057 333 372 0.000292483 38.4537 cl15755 SAM_superfamily superfamily C - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#24405 - CGI_10006795 superfamily 247057 501 539 0.000144799 39.9153 cl15755 SAM_superfamily superfamily C - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." 
Q#24406 - CGI_10006796 superfamily 245213 816 850 1.42E-05 44.935 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#24406 - CGI_10006796 superfamily 245213 1825 1859 1.42E-05 44.935 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#24406 - CGI_10006796 superfamily 245213 646 680 0.000108109 42.2386 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#24406 - CGI_10006796 superfamily 245213 1655 1689 0.000108109 42.2386 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#24406 - CGI_10006796 superfamily 245213 520 554 0.000262631 41.083 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#24406 - CGI_10006796 superfamily 245213 1529 1563 0.000262631 41.083 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#24406 - CGI_10006796 superfamily 245213 344 380 0.000559912 40.3126 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#24406 - CGI_10006796 superfamily 245213 944 980 0.000648207 39.9274 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#24406 - CGI_10006796 superfamily 245213 1953 1989 0.000648207 39.9274 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#24406 - CGI_10006796 superfamily 245213 900 943 0.000840541 39.9274 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#24406 - CGI_10006796 superfamily 245213 1909 1952 0.000840541 39.9274 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#24406 - CGI_10006796 superfamily 245213 246 284 0.00121324 39.157 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#24406 - CGI_10006796 superfamily 245213 1486 1528 0.00184683 38.7718 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#24406 - CGI_10006796 superfamily 245213 477 519 0.00184683 38.7718 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#24406 - CGI_10006796 superfamily 245213 1570 1611 8.76E-05 42.7212 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#24406 - CGI_10006796 superfamily 245213 561 602 8.76E-05 42.7212 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#24406 - CGI_10006796 superfamily 245213 731 766 0.000122553 42.336 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#24406 - CGI_10006796 superfamily 245213 1740 1775 0.000122553 42.336 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#24406 - CGI_10006796 superfamily 205157 1445 1481 0.000313163 40.9839 cl18264 EGF_3 superfamily - - EGF domain; This family includes a variety of EGF-like domain homologues. This family includes the C-terminal domain of the malaria parasite MSP1 protein. Q#24406 - CGI_10006796 superfamily 205157 436 472 0.000313163 40.9839 cl18264 EGF_3 superfamily - - EGF domain; This family includes a variety of EGF-like domain homologues. This family includes the C-terminal domain of the malaria parasite MSP1 protein. Q#24406 - CGI_10006796 superfamily 205157 1785 1823 0.000469316 40.5987 cl18264 EGF_3 superfamily - - EGF domain; This family includes a variety of EGF-like domain homologues. This family includes the C-terminal domain of the malaria parasite MSP1 protein. Q#24406 - CGI_10006796 superfamily 205157 776 814 0.000469316 40.5987 cl18264 EGF_3 superfamily - - EGF domain; This family includes a variety of EGF-like domain homologues. This family includes the C-terminal domain of the malaria parasite MSP1 protein. Q#24406 - CGI_10006796 superfamily 205157 208 244 0.00120772 39.4431 cl18264 EGF_3 superfamily - - EGF domain; This family includes a variety of EGF-like domain homologues. This family includes the C-terminal domain of the malaria parasite MSP1 protein. Q#24406 - CGI_10006796 superfamily 245213 1866 1905 0.00146475 39.2544 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#24406 - CGI_10006796 superfamily 245213 857 896 0.00146475 39.2544 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#24406 - CGI_10006796 superfamily 245213 1696 1732 0.00241119 38.484 cl09941 EGF_CA superfamily C - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#24406 - CGI_10006796 superfamily 245213 687 723 0.00241119 38.484 cl09941 EGF_CA superfamily C - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#24408 - CGI_10016843 superfamily 243179 46 161 7.31E-05 39.4383 cl02781 tetraspanin_LEL superfamily - - "Tetraspanin, extracellular domain or large extracellular loop (LEL). Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. The tetraspanin family contains CD9, CD63, CD37, CD53, CD82, CD151, and CD81, amongst others. Tetraspanins are involved in diverse processes such as cell activation and proliferation, adhesion and motility, differentiation, cancer, and others. Their various functions may relate to their ability to act as molecular facilitators, grouping specific cell-surface proteins and affecting formation and stability of signaling complexes. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web", which may also include integrins."
Q#24409 - CGI_10016844 superfamily 247684 8 42 6.42E-11 55.8147 cl17037 NBD_sugar-kinase_HSP70_actin superfamily N - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#24410 - CGI_10016845 superfamily 247684 68 221 1.20E-64 209.895 cl17037 NBD_sugar-kinase_HSP70_actin superfamily NC - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#24411 - CGI_10016846 superfamily 247684 9 208 4.32E-71 226.458 cl17037 NBD_sugar-kinase_HSP70_actin superfamily N - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#24412 - CGI_10016847 superfamily 243179 111 211 8.32E-11 56.7723 cl02781 tetraspanin_LEL superfamily - - "Tetraspanin, extracellular domain or large extracellular loop (LEL). Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. The tetraspanin family contains CD9, CD63, CD37, CD53, CD82, CD151, and CD81, amongst others. 
Tetraspanins are involved in diverse processes such as cell activation and proliferation, adhesion and motility, differentiation, cancer, and others. Their various functions may relate to their ability to act as molecular facilitators, grouping specific cell-surface proteins and affecting formation and stability of signaling complexes. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web", which may also include integrins." Q#24413 - CGI_10016848 superfamily 243179 137 255 2.89E-11 58.6983 cl02781 tetraspanin_LEL superfamily - - "Tetraspanin, extracellular domain or large extracellular loop (LEL). Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. The tetraspanin family contains CD9, CD63, CD37, CD53, CD82, CD151, and CD81, amongst others. Tetraspanins are involved in diverse processes such as cell activation and proliferation, adhesion and motility, differentiation, cancer, and others. Their various functions may relate to their ability to act as molecular facilitators, grouping specific cell-surface proteins and affecting formation and stability of signaling complexes. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web", which may also include integrins." Q#24414 - CGI_10016849 superfamily 245847 150 305 2.24E-32 122.076 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes."
Q#24414 - CGI_10016849 superfamily 243035 316 452 3.49E-22 92.7529 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations.
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#24414 - CGI_10016849 superfamily 216897 30 108 1.26E-13 66.9361 cl03463 Gal_Lectin superfamily - - Galactose binding lectin domain; Galactose binding lectin domain. Q#24415 - CGI_10016850 superfamily 217293 24 227 1.28E-32 122.741 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#24418 - CGI_10016854 superfamily 219605 37 240 3.83E-51 167.121 cl06746 DUF1637 superfamily - - "Protein of unknown function (DUF1637); This family contains many eukaryotic hypothetical proteins. The region featured in this family is approximately 120 residues long. According to InterPro annotation, some members of this family may belong to the cupin superfamily." Q#24419 - CGI_10016855 superfamily 199166 83 175 5.35E-06 45.396 cl15308 AMN1 superfamily N - "Antagonist of mitotic exit network protein 1; Amn1 has been functionally characterized in Saccharomyces cerevisiae as a component of the Antagonist of MEN pathway (AMEN). The AMEN network is activated by MEN (mitotic exit network) via an active Cdc14, and in turn switches off MEN. 
Amn1 constitutes one of the alternative mechanisms by which MEN may be disrupted. Specifically, Amn1 binds Tem1 (Termination of M-phase, a GTPase that belongs to the RAS superfamily), and disrupts its association with Cdc15, the primary downstream target. Amn1 is a leucine-rich repeat (LRR) protein, with 12 repeats in the S. cerevisiae ortholog. As a negative regulator of the signal transduction pathway MEN, overexpression of AMN1 slows the growth of wild type cells. The function of the vertebrate members of this family has not been determined experimentally, they have fewer LRRs that determine the extent of this model." Q#24420 - CGI_10016856 superfamily 247727 61 163 8.38E-08 47.8099 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#24421 - CGI_10016857 superfamily 245202 74 165 4.65E-33 118.036 cl09927 S1_like superfamily - - "S1_like: Ribosomal protein S1-like RNA-binding domain. Found in a wide variety of RNA-associated proteins. Originally identified in S1 ribosomal protein. This superfamily also contains the Cold Shock Domain (CSD), which is a homolog of the S1 domain. Both domains are members of the Oligonucleotide/oligosaccharide Binding (OB) fold." Q#24421 - CGI_10016857 superfamily 206550 24 63 7.30E-13 61.6424 cl16842 ECR1_N superfamily - - Exosome complex exonuclease RRP4 N-terminal region; ECR1_N is an N-terminal region of the exosome complex exonuclease RRP proteins. 
It is a G-rich domain which structurally is a rudimentary single hybrid fold with a permuted topology. Q#24422 - CGI_10016858 superfamily 245201 248 510 4.12E-179 531.348 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#24422 - CGI_10016858 superfamily 246908 136 229 2.81E-57 193.759 cl15255 SH2 superfamily - - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." Q#24422 - CGI_10016858 superfamily 247683 77 131 1.43E-24 99.409 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. 
They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#24422 - CGI_10016858 superfamily 198904 1161 1265 2.89E-10 59.4309 cl07494 F_actin_bind superfamily - - "F-actin binding; The F-actin binding domain forms a compact bundle of four antiparallel alpha-helices, which are arranged in a left-handed topology. Binding of F-actin to the F-actin binding domain may result in cytoplasmic retention and subcellular distribution of the protein, as well as possible inhibition of protein function." Q#24423 - CGI_10016859 superfamily 220621 167 302 2.31E-09 54.1794 cl18563 DUF2358 superfamily - - Uncharacterized conserved protein (DUF2358); DUF2358 is a family of conserved proteins found from plants to humans. The function is unknown. Q#24424 - CGI_10016860 superfamily 216554 64 255 6.28E-36 128.366 cl15977 zf-DHHC superfamily - - DHHC palmitoyltransferase; This family includes the well known DHHC zinc binding domain as well as three of the four conserved transmembrane regions found in this family of palmitoyltransferase enzymes. Q#24425 - CGI_10016861 superfamily 216254 44 117 3.41E-11 55.3318 cl08303 Recep_L_domain superfamily C - Receptor L domain; The L domains from these receptors make up the bilobal ligand binding site. Each L domain consists of a single-stranded right hand beta-helix. This Pfam entry is missing the first 50 amino acid residues of the domain. 
Q#24426 - CGI_10016862 superfamily 241832 29 126 3.74E-37 135.089 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#24426 - CGI_10016862 superfamily 242886 404 502 2.13E-22 93.1394 cl02107 Evr1_Alr superfamily - - Erv1 / Alr family; Biogenesis of Fe/S clusters involves a number of essential mitochondrial proteins. Erv1p of Saccharomyces cerevisiae mitochondria is required for the maturation of Fe/S proteins in the cytosol. The ALR (augmenter of liver regeneration) represents a mammalian orthologue of yeast Erv1p. Both Erv1p and full-length ALR are located in the mitochondrial intermembrane space and are thought to operate downstream of the mitochondrial ABC transporter. Q#24427 - CGI_10016863 superfamily 245201 670 935 1.11E-120 371.099 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases.
These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#24427 - CGI_10016863 superfamily 245814 417 490 7.44E-10 57.1139 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#24427 - CGI_10016863 superfamily 245814 186 264 2.27E-07 49.4099 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#24427 - CGI_10016863 superfamily 245213 484 519 2.46E-06 45.7054 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#24427 - CGI_10016863 superfamily 245213 524 559 4.22E-06 45.3202 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#24427 - CGI_10016863 superfamily 245814 72 165 0.000355467 39.7961 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#24428 - CGI_10016864 superfamily 217838 197 374 1.85E-53 184.303 cl04368 Leo1 superfamily NC - Leo1-like protein; Members of this family are part of the Paf1/RNA polymerase II complex. 
The Paf1 complex probably functions during the elongation phase of transcription. The Leo1 subunit of the yeast Paf1-complex binds RNA and contributes to complex recruitment. The subunit acts by co-ordinating co-transcriptional chromatin modifications and helping recruitment of mRNA 3prime-end processing factors. Q#24429 - CGI_10016865 superfamily 199168 201 221 0.00726791 33.8644 cl15310 LRR_TYP superfamily - - "Leucine-rich repeats, typical (most populated) subfamily; Leucine-rich repeats, typical (most populated) subfamily. " Q#24430 - CGI_10016866 superfamily 218176 66 282 2.19E-29 112.453 cl14959 Pex19 superfamily - - Pex19 protein family; Pex19 protein family. Q#24431 - CGI_10016867 superfamily 245201 23 268 1.40E-77 255.137 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#24433 - CGI_10016869 superfamily 217895 29 136 0.00473408 36.0819 cl04401 CD20 superfamily C - "CD20-like family; This family includes the CD20 protein and the beta subunit of the high affinity receptor for IgE Fc. The high affinity receptor for IgE is a tetrameric structure consisting of a single IgE-binding alpha subunit, a single beta subunit, and two disulfide-linked gamma subunits. The alpha subunit of Fc epsilon RI and most Fc receptors are homologous members of the Ig superfamily. By contrast, the beta and gamma subunits from Fc epsilon RI are not homologous to the Ig superfamily. 
Both molecules have four putative transmembrane segments and a probable topology where both amino- and carboxy termini protrude into the cytoplasm. This family also includes LR8 like proteins from humans, mice and rats. The function of the human LR8 protein is unknown although it is known to be strongly expressed in the lung fibroblasts. This family also includes sarcospan, a transmembrane component of dystrophin-associated glycoprotein. Loss of the sarcoglycan complex and sarcospan alone is sufficient to cause muscular dystrophy. The role of the sarcoglycan complex and sarcospan is thought to be to strengthen the dystrophin axis connecting the basement membrane with the cytoskeleton." Q#24434 - CGI_10016870 superfamily 243128 478 659 3.08E-27 112.038 cl02652 MIF4G superfamily - - "MIF4G domain; MIF4G is named after Middle domain of eukaryotic initiation factor 4G (eIF4G). Also occurs in NMD2p and CBP80. The domain is rich in alpha-helices and may contain multiple alpha-helical repeats. In eIF4G, this domain binds eIF4A, eIF3, RNA and DNA." Q#24434 - CGI_10016870 superfamily 243129 765 871 8.83E-22 93.4721 cl02653 MA3 superfamily - - "MA3 domain; Domain in DAP-5, eIF4G, MA-3 and other proteins. Highly alpha-helical. May contain repeats and/or regions similar to MIF4G domains." Q#24435 - CGI_10016871 superfamily 241920 107 197 5.04E-23 90.6467 cl00519 Oligomerisation superfamily - - "Oligomerisation domain; In yeasts, this domain is required for the oligomerisation of ATP synthase subunit 9 into a ring structure." Q#24436 - CGI_10016872 superfamily 243267 458 821 1.19E-101 321.1 cl03000 Innexin superfamily - - "Innexin; This family includes the drosophila proteins Ogre and shaking-B, and the C. elegans proteins Unc-7 and Unc-9. Members of this family are integral membrane proteins which are involved in the formation of gap junctions. This family has been named the Innexins." 
Q#24436 - CGI_10016872 superfamily 241782 44 402 3.35E-28 116.486 cl00321 AAT_I superfamily - - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phophorylase family (fold type V)." Q#24437 - CGI_10016873 superfamily 243267 445 808 6.52E-101 318.789 cl03000 Innexin superfamily - - "Innexin; This family includes the drosophila proteins Ogre and shaking-B, and the C. elegans proteins Unc-7 and Unc-9. Members of this family are integral membrane proteins which are involved in the formation of gap junctions. This family has been named the Innexins." 
Q#24437 - CGI_10016873 superfamily 241782 55 389 8.44E-28 115.331 cl00321 AAT_I superfamily - - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phophorylase family (fold type V)." Q#24438 - CGI_10016874 superfamily 243267 32 421 2.30E-68 224.03 cl03000 Innexin superfamily - - "Innexin; This family includes the drosophila proteins Ogre and shaking-B, and the C. elegans proteins Unc-7 and Unc-9. Members of this family are integral membrane proteins which are involved in the formation of gap junctions. This family has been named the Innexins." 
Q#24439 - CGI_10016875 superfamily 241578 403 544 2.00E-16 77.717 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#24439 - CGI_10016875 superfamily 241578 548 646 3.42E-13 68.087 cl00057 vWFA superfamily C - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. 
Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#24439 - CGI_10016875 superfamily 241578 208 374 4.52E-08 52.4568 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#24443 - CGI_10016879 superfamily 241578 93 202 1.21E-10 58.457 cl00057 vWFA superfamily C - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#24444 - CGI_10016880 superfamily 241578 23 192 1.75E-26 102.306 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#24445 - CGI_10016881 superfamily 241578 14 170 5.72E-29 109.689 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. 
The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#24446 - CGI_10004043 superfamily 241600 108 244 1.86E-43 152.395 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#24446 - CGI_10004043 superfamily 241600 251 407 2.85E-43 151.624 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. 
C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#24448 - CGI_10004045 superfamily 216897 33 99 1.40E-14 66.1657 cl03463 Gal_Lectin superfamily - - Galactose binding lectin domain; Galactose binding lectin domain. Q#24448 - CGI_10004045 superfamily 246918 125 160 4.43E-05 39.4923 cl15278 TSP_1 superfamily N - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#24449 - CGI_10006075 superfamily 218118 85 149 2.14E-10 55.3129 cl04552 CD225 superfamily - - "Interferon-induced transmembrane protein; This family includes the human leukocyte antigen CD225, which is an interferon inducible transmembrane protein, and is associated with interferon induced cell growth suppression." Q#24449 - CGI_10006075 superfamily 218118 180 249 2.27E-06 44.1421 cl04552 CD225 superfamily C - "Interferon-induced transmembrane protein; This family includes the human leukocyte antigen CD225, which is an interferon inducible transmembrane protein, and is associated with interferon induced cell growth suppression." Q#24449 - CGI_10006075 superfamily 218118 142 184 0.000810051 36.4381 cl04552 CD225 superfamily C - "Interferon-induced transmembrane protein; This family includes the human leukocyte antigen CD225, which is an interferon inducible transmembrane protein, and is associated with interferon induced cell growth suppression." Q#24450 - CGI_10006076 superfamily 218118 46 111 4.15E-16 68.4096 cl04552 CD225 superfamily - - "Interferon-induced transmembrane protein; This family includes the human leukocyte antigen CD225, which is an interferon inducible transmembrane protein, and is associated with interferon induced cell growth suppression." 
Q#24451 - CGI_10006077 superfamily 218118 73 126 1.12E-09 51.4609 cl04552 CD225 superfamily C - "Interferon-induced transmembrane protein; This family includes the human leukocyte antigen CD225, which is an interferon inducible transmembrane protein, and is associated with interferon induced cell growth suppression." Q#24454 - CGI_10006081 superfamily 247769 83 195 4.93E-11 59.6605 cl17215 HDc superfamily C - Metal dependent phosphohydrolases with conserved 'HD' motif Q#24455 - CGI_10010687 superfamily 222445 61 134 1.69E-10 55.0722 cl16466 R3H-assoc superfamily C - "R3H-associated N-terminal domain; This family is found at the N-terminus of R3H, pfam01424, domain-containing proteins. The function is not known." Q#24456 - CGI_10010688 superfamily 241571 24 138 6.23E-36 122.136 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#24459 - CGI_10010691 superfamily 243035 63 130 5.17E-13 61.0965 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#24459 - CGI_10010691 superfamily 241568 30 55 0.00265289 32.82 cl00043 CCP superfamily N - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#24460 - CGI_10010693 superfamily 243035 9 124 8.45E-29 102.698 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. 
CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#24461 - CGI_10010695 superfamily 238012 73 121 2.59E-12 61.2162 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#24466 - CGI_10010813 superfamily 165418 23 85 0.0091106 32.224 cl14511 PHA03147 superfamily C - hypothetical protein; Provisional Q#24468 - CGI_10010815 superfamily 243066 131 184 5.42E-12 58.7185 cl02518 BTB superfamily C - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#24473 - CGI_10010820 superfamily 245814 219 292 1.16E-06 45.9431 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. 
The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#24474 - CGI_10010821 superfamily 219727 150 410 2.04E-139 421.249 cl18522 BOP1NT superfamily - - BOP1NT (NUC169) domain; This N terminal domain is found in BOP1-like WD40 proteins. Q#24474 - CGI_10010821 superfamily 243092 416 751 4.41E-24 103.569 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#24475 - CGI_10010822 superfamily 241756 8 202 7.92E-68 210.856 cl00289 FIG superfamily - - "FIG, FBPase/IMPase/glpX-like domain. A superfamily of metal-dependent phosphatases with various substrates. Fructose-1,6-bisphospatase (both the major and the glpX-encoded variant) hydrolyze fructose-1,6,-bisphosphate to fructose-6-phosphate in gluconeogenesis. Inositol-monophosphatases and inositol polyphosphatases play vital roles in eukaryotic signalling, as they participate in metabolizing the messenger molecule Inositol-1,4,5-triphosphate. Many of these enzymes are inhibited by Li+." Q#24477 - CGI_10010824 superfamily 216981 159 283 2.13E-28 106.463 cl17087 OTU superfamily - - "OTU-like cysteine protease; This family is comprised of a group of predicted cysteine proteases, homologous to the Ovarian Tumour (OTU) gene in Drosophila. Members include proteins from eukaryotes, viruses and pathogenic bacterium. The conserved cysteine and histidine, and possibly the aspartate, represent the catalytic residues in this putative group of proteases." Q#24478 - CGI_10010825 superfamily 247724 6 174 5.05E-96 299.666 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#24478 - CGI_10010825 superfamily 247724 253 420 1.00E-93 293.503 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. 
Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#24478 - CGI_10010825 superfamily 243185 455 537 5.45E-33 122.971 cl02787 Translation_Factor_II_like superfamily - - "Translation_Factor_II_like: Elongation factor Tu (EF-Tu) domain II-like proteins. Elongation factor Tu consists of three structural domains, this family represents the second domain. Domain II adopts a beta barrel structure and is involved in binding to charged tRNA. Domain II is found in other proteins such as elongation factor G and translation initiation factor IF-2. This group also includes the C2 subdomain of domain IV of IF-2 that has the same fold as domain II of (EF-Tu). Like IF-2 from certain prokaryotes such as Thermus thermophilus, mitochondrial IF-2 lacks domain II, which is thought to be involved in binding of E.coli IF-2 to 30S subunits." Q#24478 - CGI_10010825 superfamily 243185 209 239 5.19E-09 54.4054 cl02787 Translation_Factor_II_like superfamily C - "Translation_Factor_II_like: Elongation factor Tu (EF-Tu) domain II-like proteins. Elongation factor Tu consists of three structural domains, this family represents the second domain. Domain II adopts a beta barrel structure and is involved in binding to charged tRNA. Domain II is found in other proteins such as elongation factor G and translation initiation factor IF-2. This group also includes the C2 subdomain of domain IV of IF-2 that has the same fold as domain II of (EF-Tu). Like IF-2 from certain prokaryotes such as Thermus thermophilus, mitochondrial IF-2 lacks domain II, which is thought to be involved in binding of E.coli IF-2 to 30S subunits." Q#24479 - CGI_10010826 superfamily 148541 96 225 7.22E-13 62.4521 cl06160 DUF1301 superfamily - - Protein of unknown function (DUF1301); This family contains a number of eukaryotic proteins of unknown function that are approximately 160 residues long. 
Q#24481 - CGI_10008150 superfamily 241571 397 513 5.55E-11 60.5038 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#24481 - CGI_10008150 superfamily 245213 522 551 0.00142614 37.231 cl09941 EGF_CA superfamily N - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#24481 - CGI_10008150 superfamily 241583 169 350 5.47E-33 125.376 cl00064 ZnMc superfamily - - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#24482 - CGI_10008151 superfamily 241683 16 340 6.02E-171 497.423 cl00204 PFK superfamily C - "Phosphofructokinase, a key regulatory enzyme in glycolysis, catalyzes the phosphorylation of fructose-6-phosphate to fructose-1,6-biphosphate. The members belong to PFK family that includes ATP- and pyrophosphate (PPi)- dependent phosphofructokinases. Some members evolved by gene duplication and thus have a large C-terminal/N-terminal extension comprising a second PFK domain. Generally, ATP-PFKs are allosteric homotetramers, and PPi-PFKs are dimeric and nonallosteric except for plant PPi-PFKs which are allosteric heterotetramers." 
Q#24483 - CGI_10008152 superfamily 241709 4 159 1.29E-60 188.713 cl00232 Ribosomal_L19e superfamily - - "Ribosomal protein L19e. L19e is found in the large ribosomal subunit of eukaryotes and archaea. L19e is distinct from the ribosomal subunit L19, which is found in prokaryotes. It consists of two small globular domains connected by an extended segment. It is located toward the surface of the large subunit, with one exposed end involved in forming the intersubunit bridge with the small subunit. The other exposed end is involved in forming the translocon binding site, along with L22, L23, L24, L29, and L31e subunits." Q#24485 - CGI_10008154 superfamily 243066 52 142 4.03E-16 76.1169 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#24486 - CGI_10008155 superfamily 241992 445 870 0 663.157 cl00628 Piwi-like superfamily - - "Piwi-like: PIWI domain. Domain found in proteins involved in RNA silencing. RNA silencing refers to a group of related gene-silencing mechanisms mediated by short RNA molecules, including siRNAs, miRNAs, and heterochromatin-related guide RNAs. 
The central component of the RNA-induced silencing complex (RISC) and related complexes is Argonaute. The PIWI domain is the C-terminal portion of Argonaute and consists of two subdomains, one of which provides the 5' anchoring of the guide RNA and the other, the catalytic site for slicing. This domain is also found in closely related proteins, including the Piwi subfamily, where it is believed to perform a crucial role in germline cells, via a similar mechanism." Q#24486 - CGI_10008155 superfamily 241765 280 400 1.63E-38 140.53 cl00301 PAZ superfamily - - "PAZ domain, named PAZ after the proteins Piwi Argonaut and Zwille. PAZ is found in two families of proteins that are essential components of RNA-mediated gene-silencing pathways, including RNA interference, the piwi and Dicer families. PAZ functions as a nucleic-acid binding domain, with a strong preference for single-stranded nucleic acids (RNA or DNA) or RNA duplexes with single-stranded 3' overhangs. It has been suggested that the PAZ domain provides a unique mode for the recognition of the two 3'-terminal nucleotides in single-stranded nucleic acids and buries the 3' OH group, and that it might recognize characteristic 3' overhangs in siRNAs within RISC (RNA-induced silencing) and other complexes. This parent model also contains structures of an archaeal PAZ domain." Q#24486 - CGI_10008155 superfamily 219976 228 280 4.12E-20 86.0725 cl07356 DUF1785 superfamily - - Domain of unknown function (DUF1785); This region is found in argonaute proteins and often co-occurs with pfam02179 and pfam02171. Q#24486 - CGI_10008155 superfamily 219677 935 963 0.00116116 38.1876 cl18521 EGF_2 superfamily - - EGF-like domain; This family contains EGF domains found in a variety of extracellular proteins. 
Q#24487 - CGI_10008156 superfamily 217437 59 171 1.95E-26 100.099 cl03944 GILT superfamily - - "Gamma interferon inducible lysosomal thiol reductase (GILT); This family includes the two characterized human gamma-interferon-inducible lysosomal thiol reductase (GILT) sequences. It also contains several other eukaryotic putative proteins with similarity to GILT. The aligned region contains three conserved cysteine residues. In addition, the two GILT sequences possess a C-X(2)-C motif that is shared by some of the other sequences in the family. This motif is thought to be associated with disulphide bond reduction." Q#24488 - CGI_10008157 superfamily 217437 55 166 1.27E-23 92.3953 cl03944 GILT superfamily - - "Gamma interferon inducible lysosomal thiol reductase (GILT); This family includes the two characterized human gamma-interferon-inducible lysosomal thiol reductase (GILT) sequences. It also contains several other eukaryotic putative proteins with similarity to GILT. The aligned region contains three conserved cysteine residues. In addition, the two GILT sequences possess a C-X(2)-C motif that is shared by some of the other sequences in the family. This motif is thought to be associated with disulphide bond reduction." Q#24489 - CGI_10008158 superfamily 243100 624 685 1.83E-10 58.7295 cl02576 B_zip1 superfamily - - "basic leucine zipper DNA-binding and multimerization region of GCN4 and related proteins; Basic leucine zipper (bZIP) transcription factors act in networks of homo- and hetero-dimers in the regulation in a diverse set of cellular pathways. Classical leucine zippers have alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization creates a pair of basic regions that bind DNA and undergo conformational change. GCN4 was identified in Saccharomyces cerevisiae from mutations in a deficiency in activation with the general amino acid control pathway. 
GCN4 encodes a trans-activator of amino acid biosynthetic genes containing 2 acidic activation domains and a C-terminal bZIP domain, comprised of a basic alpha-helical DNA-binding region and a coiled-coil dimerization region." Q#24489 - CGI_10008158 superfamily 243100 945 1006 1.83E-10 58.7295 cl02576 B_zip1 superfamily - - "basic leucine zipper DNA-binding and multimerization region of GCN4 and related proteins; Basic leucine zipper (bZIP) transcription factors act in networks of homo- and hetero-dimers in the regulation in a diverse set of cellular pathways. Classical leucine zippers have alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization creates a pair of basic regions that bind DNA and undergo conformational change. GCN4 was identified in Saccharomyces cerevisiae from mutations in a deficiency in activation with the general amino acid control pathway. GCN4 encodes a trans-activator of amino acid biosynthetic genes containing 2 acidic activation domains and a C-terminal bZIP domain, comprised of a basic alpha-helical DNA-binding region and a coiled-coil dimerization region." Q#24491 - CGI_10003006 superfamily 220545 241 279 7.34E-18 74.9698 cl15332 DUF2296 superfamily C - "Predicted integral membrane metal-binding protein (DUF2296); This domain, found in various hypothetical bacterial and eukaryotic metal-binding proteins, has no known function." Q#24492 - CGI_10003007 superfamily 245225 349 778 1.22E-39 152.397 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily - - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. 
This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein coupled receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine-binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulators and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#24492 - CGI_10003007 superfamily 245225 4 301 1.04E-17 85.3723 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily N - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. 
This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein coupled receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine-binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulators and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." 
Q#24494 - CGI_10011255 superfamily 241874 29 601 0 579.134 cl00456 SLC5-6-like_sbd superfamily - - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#24495 - CGI_10011256 superfamily 241622 152 233 4.08E-14 65.2806 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. 
This CD contains two distinct structural subgroups with either an N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95 (post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#24500 - CGI_10011261 superfamily 241622 264 308 2.41E-06 44.938 cl00117 PDZ superfamily C - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either an N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95 (post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." 
Q#24500 - CGI_10011261 superfamily 247792 18 53 0.00171209 35.8844 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#24505 - CGI_10011266 superfamily 243175 152 254 1.31E-50 164.669 cl02776 GST_C_family superfamily - - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. 
GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." Q#24505 - CGI_10011266 superfamily 241832 57 134 4.58E-13 62.2337 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#24507 - CGI_10011268 superfamily 245213 333 368 5.01E-09 52.639 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#24507 - CGI_10011268 superfamily 245213 292 330 0.000163803 39.5422 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#24511 - CGI_10005439 superfamily 248264 2 136 1.75E-13 66.4917 cl17710 DDE_4 superfamily N - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#24520 - CGI_10003178 superfamily 241600 7 82 4.50E-20 80.3623 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. 
The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#24524 - CGI_10003231 superfamily 247684 15 94 0.000352035 41.5084 cl17037 NBD_sugar-kinase_HSP70_actin superfamily C - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#24524 - CGI_10003231 superfamily 241563 152 185 0.00612224 34.7624 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#24526 - CGI_10002108 superfamily 246597 137 426 4.57E-94 290.329 cl13995 MPP_superfamily superfamily - - "metallophosphatase superfamily, metallophosphatase domain; Metallophosphatases (MPPs), also known as metallophosphoesterases, phosphodiesterases (PDEs), binuclear metallophosphoesterases, and dimetal-containing phosphoesterases (DMPs), represent a diverse superfamily of enzymes with a conserved domain containing an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. This superfamily includes: the phosphoprotein phosphatases (PPPs), Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination." Q#24528 - CGI_10004333 superfamily 241815 1 174 1.32E-31 115.371 cl00361 Transcrip_reg superfamily C - "Transcriptional regulator; This is a family of transcriptional regulators. In mammals, it activates the transcription of mitochondrially-encoded COX1. In bacteria, it negatively regulates the quorum-sensing response regulator by binding to its promoter region." 
Q#24529 - CGI_10004334 superfamily 247085 219 335 7.24E-30 111.443 cl15820 RICIN superfamily - - "Ricin-type beta-trefoil; Carbohydrate-binding domain formed from presumed gene triplication. The domain is found in a variety of molecules serving diverse functions such as enzymatic activity, inhibitory toxicity and signal transduction. Highly specific ligand binding occurs on exposed surfaces of the compact domain structure." Q#24529 - CGI_10004334 superfamily 245596 76 206 1.05E-69 222.08 cl11394 Glyco_tranf_GTA_type superfamily N - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#24530 - CGI_10004335 superfamily 247085 442 558 5.89E-28 108.747 cl15820 RICIN superfamily - - "Ricin-type beta-trefoil; Carbohydrate-binding domain formed from presumed gene triplication. The domain is found in a variety of molecules serving diverse functions such as enzymatic activity, inhibitory toxicity and signal transduction. 
Highly specific ligand binding occurs on exposed surfaces of the compact domain structure." Q#24530 - CGI_10004335 superfamily 245596 193 396 4.76E-101 310.675 cl11394 Glyco_tranf_GTA_type superfamily N - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#24530 - CGI_10004335 superfamily 245596 104 153 1.00E-15 76.4741 cl11394 Glyco_tranf_GTA_type superfamily NC - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. 
To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#24530 - CGI_10004335 superfamily 245596 28 102 1.55E-13 69.9257 cl11394 Glyco_tranf_GTA_type superfamily NC - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." 
Q#24534 - CGI_10009043 superfamily 247941 144 294 3.22E-13 65.4348 cl17387 Methyltransf_21 superfamily - - "Methyltransferase FkbM domain; This family has members from bacteria to human, and appears to be a methyltransferase." Q#24535 - CGI_10009044 superfamily 214507 138 188 0.000185995 37.4096 cl15307 LRRCT superfamily - - Leucine rich repeat C-terminal domain; Leucine rich repeat C-terminal domain. Q#24535 - CGI_10009044 superfamily 243030 33 59 0.00609023 33.0599 cl02423 LRRNT superfamily - - Leucine rich repeat N-terminal domain; Leucine Rich Repeats pfam00560 are short sequence motifs present in a number of proteins with diverse functions and cellular locations. Leucine Rich Repeats are often flanked by cysteine rich domains. This domain is often found at the N-terminus of tandem leucine rich repeats. Q#24536 - CGI_10009045 superfamily 215827 218 395 1.53E-37 139.526 cl02830 Tyrosinase superfamily - - Common central domain of tyrosinase; This family also contains polyphenol oxidases and some hemocyanins. Binds two copper ions via two sets of three histidines. This family is related to pfam00372. Q#24537 - CGI_10009046 superfamily 248097 87 172 1.06E-18 77.6906 cl17543 C1q superfamily C - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#24537 - CGI_10009046 superfamily 243100 42 82 1.58E-05 39.8548 cl02576 B_zip1 superfamily N - "basic leucine zipper DNA-binding and multimerization region of GCN4 and related proteins; Basic leucine zipper (bZIP) transcription factors act in networks of homo- and hetero-dimers in the regulation in a diverse set of cellular pathways. Classical leucine zippers have alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization creates a pair of basic regions that bind DNA and undergo conformational change. 
GCN4 was identified in Saccharomyces cerevisiae from mutations in a deficiency in activation with the general amino acid control pathway. GCN4 encodes a trans-activator of amino acid biosynthetic genes containing 2 acidic activation domains and a C-terminal bZIP domain, comprised of a basic alpha-helical DNA-binding region and a coiled-coil dimerization region." Q#24538 - CGI_10009047 superfamily 248097 13 145 1.13E-22 87.7058 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#24539 - CGI_10009048 superfamily 248097 63 195 6.74E-24 92.3282 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#24540 - CGI_10009049 superfamily 248097 63 195 4.82E-22 87.3206 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#24541 - CGI_10015759 superfamily 243035 1 72 1.54E-12 58.0149 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#24544 - CGI_10015762 superfamily 248097 77 202 3.46E-24 93.0986 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#24545 - CGI_10015763 superfamily 248097 52 153 1.53E-16 73.8386 cl17543 C1q superfamily C - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#24550 - CGI_10015768 superfamily 243066 36 132 3.98E-17 73.8792 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#24551 - CGI_10015769 superfamily 243066 238 334 6.95E-15 70.0272 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. 
The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#24551 - CGI_10015769 superfamily 243066 1 71 5.81E-05 41.1373 cl02518 BTB superfamily N - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#24553 - CGI_10015771 superfamily 243066 36 132 6.14E-17 73.494 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. 
The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#24556 - CGI_10015774 superfamily 241578 209 397 3.78E-16 78.1582 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#24556 - CGI_10015774 superfamily 217211 473 533 3.72E-07 49.5902 cl03691 Cache_1 superfamily - - Cache domain; Cache domain. Q#24556 - CGI_10015774 superfamily 219821 142 182 6.44E-05 42.7434 cl07136 VWA_N superfamily N - "VWA N-terminal; This domain is found at the N-terminus of proteins containing von Willebrand factor type A (VWA, pfam00092) and Cache (pfam02743) domains. It has been found in vertebrates, Drosophila and C. elegans but has not yet been identified in other eukaryotes. 
It is probably involved in the function of some voltage-dependent calcium channel subunits." Q#24558 - CGI_10015776 superfamily 241874 9 534 0 647.7 cl00456 SLC5-6-like_sbd superfamily - - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#24559 - CGI_10015777 superfamily 241874 191 758 0 670.311 cl00456 SLC5-6-like_sbd superfamily - - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. 
SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#24560 - CGI_10015778 superfamily 222150 1012 1036 0.000341318 40.0677 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#24560 - CGI_10015778 superfamily 222150 1155 1175 0.000398031 40.0677 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#24560 - CGI_10015778 superfamily 222150 1124 1149 0.00241841 37.7565 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#24560 - CGI_10015778 superfamily 222150 1043 1065 0.00247716 37.7565 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#24561 - CGI_10005629 superfamily 197448 26 84 1.85E-15 68.6689 cl15240 Reelin_subrepeat_like superfamily N - "Tandem repeat subunit of reelin and related proteins; Reelin is an extracellular glycoprotein involved in neuronal development, specifically in the brain cortex. It contains 8 tandemly repeated units, each of which is composed of two highly similar subrepeats and a central EGF domain. This model characterizes the subrepeats, which directly contact each other in a compact arrangement. Consecutive reelin repeat units are packed together to form an overall rod-like molecular structure. 
Reelin repeats 5 and 6 are reported to interact with neuronal receptors, the apolipoprotein E receptor 2 (ApoER2) and the very-low-density lipoprotein receptor (VLDLR), triggering a signaling cascade upon binding and subsequent tyrosine phosphorylation of the cytoplasmic disabled-1 (Dab1). Genetic deficiency of reelin, or ApoER2 and VLDLR, or Dab1, all exhibit the same phenotypes, including ataxia, cortical layer inversion and abnormal positioning patterns." Q#24563 - CGI_10008252 superfamily 216939 15 73 2.91E-06 39.9537 cl03492 PC4 superfamily N - Transcriptional Coactivator p15 (PC4); p15 has a bipartite structure composed of an amino-terminal regulatory domain and a carboxy-terminal cryptic DNA-binding domain. The DNA-binding activity of the carboxy-terminal is disguised by the amino-terminal p15 domain. Activity is controlled by protein kinases that target the regulatory domain. Q#24564 - CGI_10008253 superfamily 216939 24 66 5.73E-05 36.4869 cl03492 PC4 superfamily N - Transcriptional Coactivator p15 (PC4); p15 has a bipartite structure composed of an amino-terminal regulatory domain and a carboxy-terminal cryptic DNA-binding domain. The DNA-binding activity of the carboxy-terminal is disguised by the amino-terminal p15 domain. Activity is controlled by protein kinases that target the regulatory domain. Q#24566 - CGI_10008255 superfamily 191243 12 34 0.00186434 32.7995 cl05016 zf-U11-48K superfamily - - U11-48K-like CHHC zinc finger; This zinc binding domain has four conserved zinc chelating residues in a CHHC pattern. This domain is predicted to have an RNA-binding function. Q#24571 - CGI_10008260 superfamily 247724 24 62 8.94E-07 42.919 cl17170 Ras_like_GTPase superfamily NC - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. 
The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#24572 - CGI_10008261 superfamily 243056 373 517 4.21E-33 126.267 cl02495 RabGAP-TBC superfamily N - "Rab-GTPase-TBC domain; Identification of a TBC domain in GYP6_YEAST and GYP7_YEAST, which are GTPase activator proteins of yeast Ypt6 and Ypt7, implies that these domains are GTPase activator proteins of Rab-like small GTPases." Q#24574 - CGI_10013480 superfamily 241568 192 255 6.80E-05 39.7536 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. 
Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#24574 - CGI_10013480 superfamily 241568 12 66 0.000951348 36.2868 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#24575 - CGI_10013481 superfamily 245814 1813 1882 2.71E-11 63.2771 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#24575 - CGI_10013481 superfamily 245814 2541 2607 8.24E-11 61.7363 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#24575 - CGI_10013481 superfamily 245814 1909 1975 1.69E-10 60.9659 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#24575 - CGI_10013481 superfamily 245814 2095 2164 2.89E-10 60.1955 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#24575 - CGI_10013481 superfamily 245814 2358 2425 6.56E-10 59.0399 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. 
The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#24575 - CGI_10013481 superfamily 245814 704 769 7.98E-10 59.0399 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#24575 - CGI_10013481 superfamily 245814 1720 1784 2.15E-09 57.4991 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#24575 - CGI_10013481 superfamily 245814 1057 1124 3.32E-09 57.1139 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#24575 - CGI_10013481 superfamily 245814 1625 1694 4.01E-09 56.7287 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#24575 - CGI_10013481 superfamily 245213 3762 3802 8.55E-09 54.9502 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. 
EGF_CA can be found in tandem repeat arrangements." Q#24575 - CGI_10013481 superfamily 245213 3493 3532 4.00E-07 50.3278 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#24575 - CGI_10013481 superfamily 245814 435 500 7.75E-07 50.1803 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#24575 - CGI_10013481 superfamily 245814 525 587 8.42E-07 49.7951 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. 
A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#24575 - CGI_10013481 superfamily 241578 34 192 1.80E-06 49.8718 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#24575 - CGI_10013481 superfamily 245814 2286 2338 2.63E-06 48.6395 cl11960 Ig superfamily C - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#24575 - CGI_10013481 superfamily 245213 3577 3611 2.84E-05 44.935 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#24575 - CGI_10013481 superfamily 245213 3660 3689 8.89E-05 43.3942 cl09941 EGF_CA superfamily C - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#24575 - CGI_10013481 superfamily 245213 3617 3655 0.000140351 42.6238 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#24575 - CGI_10013481 superfamily 245213 3533 3568 0.00166489 39.5422 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#24575 - CGI_10013481 superfamily 241668 3255 3478 2.29E-42 158.375 cl00186 nidG2 superfamily - - "Nidogen, G2 domain; Nidogen is an important component of the basement membrane, an extracellular sheet-like matrix. Nidogen is a multifunctional protein that interacts with many other basement membrane proteins, like collagen, perlecan, lamin, and has a potential role in the assembly and connection of networks. Nidogen consists of 3 globular domains (G1-G3), G3 is the lamin-binding domain, while G2 binds collagen IV and perlecan. Also found in hemicentin, a protein which functions at various cell-cell and cell-matrix junctions and might assist in refining broad regions of cell contact into oriented, line-shaped junctions. Nidogen G2 consists of an N-terminal EGF-like domain (excluded from this alignment model) and an 11-stranded beta-barrel with a central helix, a topology that exhibits high structural similarity to the green fluorescent proteins of Cnidaria." Q#24575 - CGI_10013481 superfamily 246918 3092 3144 3.64E-17 79.5531 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#24575 - CGI_10013481 superfamily 246918 2978 3030 7.87E-16 75.7011 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. 
Q#24575 - CGI_10013481 superfamily 245814 1330 1412 1.52E-14 72.9232 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#24575 - CGI_10013481 superfamily 246918 2922 2973 7.03E-14 70.3083 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#24575 - CGI_10013481 superfamily 245814 604 682 7.78E-14 70.9972 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#24575 - CGI_10013481 superfamily 246918 3206 3256 2.39E-13 68.7675 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#24575 - CGI_10013481 superfamily 245814 2186 2252 6.33E-13 67.8171 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. 
The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#24575 - CGI_10013481 superfamily 245814 958 1035 7.06E-13 68.3008 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#24575 - CGI_10013481 superfamily 245814 778 863 1.22E-12 67.6498 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#24575 - CGI_10013481 superfamily 245814 2769 2830 2.58E-12 65.8911 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#24575 - CGI_10013481 superfamily 245814 2629 2706 3.26E-12 65.9896 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#24575 - CGI_10013481 superfamily 245814 1234 1305 8.10E-12 65.1089 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. 
A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#24575 - CGI_10013481 superfamily 245814 1140 1210 8.25E-12 65.1089 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#24575 - CGI_10013481 superfamily 245814 1998 2071 8.65E-12 64.7132 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#24575 - CGI_10013481 superfamily 245814 2447 2511 9.58E-12 64.3878 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#24575 - CGI_10013481 superfamily 245814 1425 1508 7.49E-11 62.1376 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#24575 - CGI_10013481 superfamily 245814 880 959 5.79E-10 59.4412 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#24575 - CGI_10013481 superfamily 245814 1529 1592 7.59E-10 58.5723 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. 
The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#24575 - CGI_10013481 superfamily 245814 2865 2905 6.62E-07 50.0979 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#24575 - CGI_10013481 superfamily 246918 3050 3086 3.25E-05 44.8851 cl15278 TSP_1 superfamily N - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#24575 - CGI_10013481 superfamily 246918 3167 3201 0.000209734 42.5739 cl15278 TSP_1 superfamily N - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#24576 - CGI_10013482 superfamily 217473 195 447 4.98E-31 122.859 cl03978 Mab-21 superfamily - - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. 
Q#24577 - CGI_10013483 superfamily 246664 81 557 8.07E-155 457.542 cl14561 An_peroxidase_like superfamily - - "Animal heme peroxidases and related proteins; A diverse family of enzymes, which includes prostaglandin G/H synthase, thyroid peroxidase, myeloperoxidase, linoleate diol synthase, lactoperoxidase, peroxinectin, peroxidasin, and others. Despite its name, this family is not restricted to metazoans: members are found in fungi, plants, and bacteria as well." Q#24578 - CGI_10013484 superfamily 241563 13 58 0.00727187 33.8499 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#24579 - CGI_10013485 superfamily 245848 15 227 6.98E-22 91.1789 cl12043 Amidinotransf superfamily - - "Amidinotransferase; This family contains glycine (EC:2.1.4.1) and inosamine (EC:2.1.4.2) amidinotransferases, enzymes involved in creatine and streptomycin biosynthesis respectively. This family also includes arginine deiminases, EC:3.5.3.6. These enzymes catalyze the reaction: arginine + H2O <=> citrulline + NH3. Also found in this family is the Streptococcus anti tumour glycoprotein." Q#24580 - CGI_10013486 superfamily 245848 15 272 1.29E-24 100.424 cl12043 Amidinotransf superfamily - - "Amidinotransferase; This family contains glycine (EC:2.1.4.1) and inosamine (EC:2.1.4.2) amidinotransferases, enzymes involved in creatine and streptomycin biosynthesis respectively. This family also includes arginine deiminases, EC:3.5.3.6. These enzymes catalyze the reaction: arginine + H2O <=> citrulline + NH3. Also found in this family is the Streptococcus anti tumour glycoprotein." 
Q#24584 - CGI_10013490 superfamily 246710 24 161 7.40E-38 138.789 cl14783 DOMON_like superfamily - - "Domon-like ligand-binding domains; DOMON-like domains can be found in all three kingdoms of life and are a diverse group of ligand binding domains that have been shown to interact with sugars and hemes. DOMON domains were initially thought to confer protein-protein interactions. They were subsequently found as a heme-binding motif in cellobiose dehydrogenase, an extracellular fungal oxidoreductase that degrades both lignin and cellulose, and in ethylbenzene dehydrogenase, an enzyme that aids in the anaerobic degradation of hydrocarbons. The domain interacts with sugars in the type 9 carbohydrate binding modules (CBM9), which are present in a variety of glycosyl hydrolases, and it can also be found at the N-terminus of sensor histidine kinases." Q#24584 - CGI_10013490 superfamily 246710 586 707 1.33E-27 109.9 cl14783 DOMON_like superfamily - - "Domon-like ligand-binding domains; DOMON-like domains can be found in all three kingdoms of life and are a diverse group of ligand binding domains that have been shown to interact with sugars and hemes. DOMON domains were initially thought to confer protein-protein interactions. They were subsequently found as a heme-binding motif in cellobiose dehydrogenase, an extracellular fungal oxidoreductase that degrades both lignin and cellulose, and in ethylbenzene dehydrogenase, an enzyme that aids in the anaerobic degradation of hydrocarbons. The domain interacts with sugars in the type 9 carbohydrate binding modules (CBM9), which are present in a variety of glycosyl hydrolases, and it can also be found at the N-terminus of sensor histidine kinases." Q#24584 - CGI_10013490 superfamily 216290 184 311 4.77E-33 125.093 cl03089 Cu2_monooxygen superfamily - - "Copper type II ascorbate-dependent monooxygenase, N-terminal domain; The N and C-terminal domains of members of this family adopt the same PNGase F-like fold." 
Q#24584 - CGI_10013490 superfamily 217685 328 485 1.03E-29 116.28 cl04225 Cu2_monoox_C superfamily - - "Copper type II ascorbate-dependent monooxygenase, C-terminal domain; The N and C-terminal domains of members of this family adopt the same PNGase F-like fold." Q#24584 - CGI_10013490 superfamily 217685 705 789 1.20E-24 101.642 cl04225 Cu2_monoox_C superfamily N - "Copper type II ascorbate-dependent monooxygenase, C-terminal domain; The N and C-terminal domains of members of this family adopt the same PNGase F-like fold." Q#24586 - CGI_10013492 superfamily 222070 201 347 6.44E-18 78.8736 cl18634 DDE_3 superfamily - - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#24587 - CGI_10011531 superfamily 245847 116 247 2.06E-19 81.8341 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#24587 - CGI_10011531 superfamily 245847 1 122 1.57E-12 62.1889 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." 
Q#24591 - CGI_10011536 superfamily 245847 27 72 0.000804648 34.4546 cl12042 FA58C superfamily N - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#24593 - CGI_10011538 superfamily 245603 110 191 4.73E-06 43.3792 cl11403 pepsin_retropepsin_like superfamily - - "Cellular and retroviral pepsin-like aspartate proteases; This family includes both cellular and retroviral pepsin-like aspartate proteases. The cellular pepsin and pepsin-like enzymes are twice as long as their retroviral counterparts. The cellular pepsin-like aspartic proteases are found in mammals, plants, fungi and bacteria. These well known and extensively characterized enzymes include pepsins, chymosin, rennin, cathepsins, and fungal aspartic proteases. Several have long been known to be medically (rennin, cathepsin D and E, pepsin) or commercially (chymosin) important. The eukaryotic pepsin-like proteases contain two domains possessing similar topological features. The N- and C-terminal domains, although structurally related by a 2-fold axis, have only limited sequence homology except in the vicinity of the active site. This suggests that the enzymes evolved by an ancient duplication event. The eukaryotic pepsin-like proteases have two active site ASP residues with each N- and C-terminal lobe contributing one residue. While the fungal and mammalian pepsins are bilobal proteins, retropepsins function as dimers and the monomer resembles structure of the N- or C-terminal domains of eukaryotic enzyme. The active site motif (Asp-Thr/Ser-Gly-Ser) is conserved between the retroviral and eukaryotic proteases and between the N-and C-terminal of eukaryotic pepsin-like proteases. 
The retropepsin-like family includes pepsin-like aspartate proteases from retroviruses, retrotransposons and retroelements; as well as eukaryotic DNA-damage-inducible proteins (DDIs), and bacterial aspartate peptidases. Retropepsin is synthesized as part of the POL polyprotein that contains an aspartyl-protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. This family of aspartate proteases is classified by MEROPS as the peptidase family A1 (pepsin A) and A2 (retropepsin family)." Q#24594 - CGI_10011539 superfamily 241563 61 97 9.71E-06 44.0072 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#24595 - CGI_10007963 superfamily 218328 141 330 1.91E-99 296.183 cl04844 XAP5 superfamily - - "XAP5, circadian clock regulator; This protein is found in a wide range of eukaryotes. It is a nuclear protein and is suggested to be DNA binding. In plants, this family is essential for correct circadian clock functioning by acting as a light-quality regulator coordinating the activities of blue and red light signalling pathways during plant growth - inhibiting growth in red light but promoting growth in blue light." Q#24597 - CGI_10007965 superfamily 247999 26 70 3.27E-08 46.8214 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#24601 - CGI_10007969 superfamily 248264 8 49 0.000273184 38.7574 cl17710 DDE_4 superfamily N - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. 
Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#24602 - CGI_10007970 superfamily 247905 567 620 1.04E-09 56.8625 cl17351 HELICc superfamily N - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#24602 - CGI_10007970 superfamily 247805 406 550 2.90E-09 55.4212 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#24602 - CGI_10007970 superfamily 248264 321 381 0.0042336 37.2166 cl17710 DDE_4 superfamily C - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." 
Q#24604 - CGI_10024240 superfamily 248318 274 322 4.41E-18 76.7057 cl17764 FYVE superfamily - - "FYVE domain; Zinc-binding domain; targets proteins to membrane lipids via interaction with phosphatidylinositol-3-phosphate, PI3P; present in Fab1, YOTB, Vac1, and EEA1;" Q#24605 - CGI_10024241 superfamily 247725 216 257 2.74E-06 44.2278 cl17171 PH-like superfamily C - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#24605 - CGI_10024241 superfamily 149853 19 78 4.55E-06 43.2093 cl07491 Phe_ZIP superfamily - - "Phenylalanine zipper; The phenylalanine zipper consists of aromatic side chains from ten phenylalanine residues that are stacked within a hydrophobic core. This zipper mediates dimerisation of various proteins, such as APS, SH2-B and Lnk." Q#24606 - CGI_10024242 superfamily 246908 30 125 1.79E-63 194.562 cl15255 SH2 superfamily - - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 
They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." Q#24607 - CGI_10024245 superfamily 247724 285 559 9.50E-156 453.156 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#24608 - CGI_10024246 superfamily 247724 543 810 1.59E-139 418.102 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. 
The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#24608 - CGI_10024246 superfamily 247724 314 545 3.09E-114 352.233 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#24609 - CGI_10024247 superfamily 248054 5 218 3.46E-14 70.7943 cl17500 NAD_binding_8 superfamily - - NAD(P)-binding Rossmann-like domain; NAD(P)-binding Rossmann-like domain. Q#24610 - CGI_10024248 superfamily 241640 589 798 1.08E-67 227.159 cl00149 Tryp_SPc superfamily - - Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues. Q#24610 - CGI_10024248 superfamily 241571 386 486 2.85E-25 102.876 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#24610 - CGI_10024248 superfamily 241571 27 133 2.99E-22 94.0162 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#24610 - CGI_10024248 superfamily 241571 226 329 5.61E-21 90.5494 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." 
Q#24610 - CGI_10024248 superfamily 241613 516 550 1.13E-11 61.4537 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#24610 - CGI_10024248 superfamily 243051 924 1066 2.73E-09 56.5901 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal cord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." 
Q#24610 - CGI_10024248 superfamily 243051 799 916 1.34E-06 48.143 cl02479 MAM superfamily N - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal cord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#24613 - CGI_10024251 superfamily 149750 182 207 1.15E-09 52.9472 cl07409 NHR2 superfamily NC - NHR2 domain like; The NHR2 (Nervy homology 2) domain is found in the ETO protein where it mediates oligomerisation and protein-protein interactions. It forms an alpha-helical tetramer. Q#24615 - CGI_10024253 superfamily 241578 64 230 1.53E-41 146.219 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#24615 - CGI_10024253 superfamily 241578 278 431 1.72E-05 43.4342 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#24617 - CGI_10024255 superfamily 241573 80 383 1.01E-106 325.441 cl00051 CysPc superfamily - - "Calpains, domains IIa, IIb; calcium-dependent cytoplasmic cysteine proteinases, papain-like. Functions in cytoskeletal remodeling processes, cell differentiation, apoptosis and signal transduction." 
Q#24617 - CGI_10024255 superfamily 241653 406 553 6.26E-39 139.741 cl00165 Calpain_III superfamily - - "Calpain, subdomain III. Calpains are calcium-activated cytoplasmic cysteine proteinases, participate in cytoskeletal remodeling processes, cell differentiation, apoptosis and signal transduction. Catalytic domain and the two calmodulin-like domains are separated by C2-like domain III. Domain III plays an important role in calcium-induced activation of calpain involving electrostatic interactions with subdomain II. Proposed to mediate calpain's interaction with phospholipids and translocation to cytoplasmic/nuclear membranes. CD includes subdomain III of typical and atypical calpains." Q#24619 - CGI_10024257 superfamily 244539 243 608 4.61E-124 376.182 cl06868 FNR_like superfamily - - "Ferredoxin reductase (FNR), an FAD and NAD(P) binding protein, was initially identified as a chloroplast reductase activity, catalyzing the electron transfer from reduced iron-sulfur protein ferredoxin to NADP+ as the final step in the electron transport mechanism of photosystem I. FNR transfers electrons from reduced ferredoxin to FAD (forming FADH2 via a semiquinone intermediate) and then transfers a hydride ion to convert NADP+ to NADPH. FNR has since been shown to utilize a variety of electron acceptors and donors and has a variety of physiological functions including nitrogen assimilation, dinitrogen fixation, steroid hydroxylation, fatty acid metabolism, oxygenase activity, and methane assimilation in many organisms. FNR has an NAD(P)-binding sub-domain of the alpha/beta class and a discrete (usually N-terminal) flavin sub-domain which vary in orientation with respect to the NAD(P) binding domain. The N-terminal moiety may contain a flavin prosthetic group (as in flavoenzymes) or use flavin as a substrate. 
Because flavins such as FAD can exist in oxidized, semiquinone (one-electron reduced), or fully reduced hydroquinone forms, FNR can interact with one- and two-electron carriers. FNR has a strong preference for NADP(H) vs NAD(H)." Q#24619 - CGI_10024257 superfamily 241863 67 193 6.90E-21 90.1443 cl00438 Flavodoxin_2 superfamily - - Flavodoxin-like fold; This family consists of a domain with a flavodoxin-like fold. The family includes bacterial and eukaryotic NAD(P)H dehydrogenase (quinone) EC:1.6.99.2. These enzymes catalyze the NAD(P)H-dependent two-electron reductions of quinones and protect cells against damage by free radicals and reactive oxygen species. This enzyme uses a FAD co-factor. The equation for this reaction is:- NAD(P)H + acceptor <=> NAD(P)(+) + reduced acceptor. This enzyme is also involved in the bioactivation of prodrugs used in chemotherapy. The family also includes acyl carrier protein phosphodiesterase EC:3.1.4.14. This enzyme converts holo-ACP to apo-ACP by hydrolytic cleavage of the phosphopantetheine residue from ACP. This family is related to pfam03358 and pfam00258. Q#24620 - CGI_10024258 superfamily 247725 195 331 1.21E-77 247.837 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting the protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." 
Q#24620 - CGI_10024258 superfamily 247725 333 429 2.28E-46 160.352 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting the protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#24620 - CGI_10024258 superfamily 241597 560 610 5.26E-14 68.4233 cl00082 HMG-box superfamily C - "High Mobility Group (HMG)-box is found in a variety of eukaryotic chromosomal proteins and transcription factors. HMGs bind to the minor groove of DNA and have been classified by DNA binding preferences. Two phylogenetically distinct groups of Class I proteins bind DNA in a sequence specific fashion and contain a single HMG box. One group (SOX-TCF) includes transcription factors, TCF-1, -3, -4; and also SRY and LEF-1, which bind four-way DNA junctions and duplex DNA targets. The second group (MATA) includes fungal mating type gene products MC, MATA1 and Ste11. Class II and III proteins (HMGB-UBF) bind DNA in a non-sequence specific fashion and contain two or more tandem HMG boxes. Class II members include non-histone chromosomal proteins, HMG1 and HMG2, which bind to bent or distorted DNA such as four-way DNA junctions, synthetic DNA cruciforms, kinked cisplatin-modified DNA, DNA bulges, cross-overs in supercoiled DNA, and can cause looping of linear DNA. Class III members include nucleolar and mitochondrial transcription factors, UBF and mtTF1, which bind four-way DNA junctions." Q#24621 - CGI_10024259 superfamily 248469 455 578 0.000444903 39.6607 cl17915 HAD_like superfamily - - "Haloacid dehalogenase-like hydrolases. 
The haloacid dehalogenase-like (HAD) superfamily includes L-2-haloacid dehalogenase, epoxide hydrolase, phosphoserine phosphatase, phosphomannomutase, phosphoglycolate phosphatase, P-type ATPase, and many others, all of which use a nucleophilic aspartate in their phosphoryl transfer reaction. All members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. Members of this superfamily are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases." Q#24621 - CGI_10024259 superfamily 217255 124 310 1.70E-41 150.597 cl03746 DDHD superfamily - - "DDHD domain; The DDHD domain is 180 residues long and contains four conserved residues that may form a metal binding site. The domain is named after these four residues. This pattern of conservation of metal binding residues is often seen in phosphoesterase domains. This domain is found in retinal degeneration B proteins, as well as a family of probable phospholipases. It has been shown that this domain is found in a longer C terminal region that binds to PYK2 tyrosine kinase. These proteins have been called N-terminal domain-interacting receptor (Nir1, Nir2 and Nir3). This suggests that this region is involved in functionally important interactions in other members of this family." Q#24622 - CGI_10024260 superfamily 241578 15 251 8.09E-84 269.626 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#24622 - CGI_10024260 superfamily 234316 438 488 1.18E-05 44.0128 cl14012 Rhs_assc_core superfamily C - "RHS repeat-associated core domain; This model represents a conserved unique core sequence shared by large numbers of proteins. It is occasional in the Archaea (Methanosarcina barkeri) but common in bacteria and eukaryotes. Most fall into two large classes. One class consists of long proteins in which two classes of repeats are abundant: an FG-GAP repeat (pfam01839) class, and an RHS repeat (pfam05593) or YD repeat (TIGR01643). This class includes secreted bacterial insecticidal toxins and intercellular signalling proteins such as the teneurins in animals. The other class consists of uncharacterized proteins shorter than 400 amino acids, where this core domain of about 75 amino acids tends to occur in the N-terminal half. Over twenty such proteins are found in Pseudomonas putida alone; little sequence similarity or repeat structure is found among these proteins outside the region modeled by this domain." 
Q#24624 - CGI_10024262 superfamily 241874 23 591 0 819.238 cl00456 SLC5-6-like_sbd superfamily - - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#24625 - CGI_10024263 superfamily 245029 27 134 5.37E-20 80.3843 cl09190 MAPEG superfamily - - "MAPEG family; This family has been called MAPEG (Membrane Associated Proteins in Eicosanoid and Glutathione metabolism). It includes proteins such as Prostaglandin E synthase. This enzyme catalyzes the synthesis of PGE2 from PGH2 (produced by cyclooxygenase from arachidonic acid). Because of structural similarities in the active sites of FLAP, LTC4 synthase and PGE synthase, substrates for each enzyme can compete with one another and modulate synthetic activity." Q#24626 - CGI_10024264 superfamily 245206 6 204 8.49E-36 131.09 cl09931 NADB_Rossmann superfamily C - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. 
The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#24627 - CGI_10024265 superfamily 245029 58 148 3.33E-17 73.0656 cl09190 MAPEG superfamily - - "MAPEG family; This family has been called MAPEG (Membrane Associated Proteins in Eicosanoid and Glutathione metabolism). It includes proteins such as Prostaglandin E synthase. This enzyme catalyzes the synthesis of PGE2 from PGH2 (produced by cyclooxygenase from arachidonic acid). Because of structural similarities in the active sites of FLAP, LTC4 synthase and PGE synthase, substrates for each enzyme can compete with one another and modulate synthetic activity." Q#24629 - CGI_10024267 superfamily 247723 781 857 3.27E-46 160.254 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. 
The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#24629 - CGI_10024267 superfamily 247723 674 755 2.21E-40 144.284 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#24629 - CGI_10024267 superfamily 247723 555 626 9.05E-40 142.129 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. 
RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#24629 - CGI_10024267 superfamily 247723 370 448 1.42E-41 147.552 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#24629 - CGI_10024267 superfamily 247723 28 92 2.02E-25 101.616 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. 
RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#24629 - CGI_10024267 superfamily 247723 258 322 8.49E-17 76.6362 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#24630 - CGI_10024268 superfamily 242880 4 239 1.41E-143 405.028 cl02098 14-3-3 superfamily - - "14-3-3 domain; 14-3-3 domain is an essential part of 14-3-3 proteins, a ubiquitous class of regulatory, phosphoserine/threonine-binding proteins found in all eukaryotic cells, including yeast, protozoa and mammalian cells. 
14-3-3 proteins play important roles in many biological processes that are regulated by phosphorylation, including cell cycle regulation, cell proliferation, protein trafficking, metabolic regulation and apoptosis. More than 300 binding partners of the 14-3-3 domain have been identified in all subcellular compartments and include transcription factors, signaling molecules, tumor suppressors, biosynthetic enzymes, cytoskeletal proteins and apoptosis factors. 14-3-3 binding can alter the conformation, localization, stability, phosphorylation state, activity as well as molecular interactions of a target protein. They function only as dimers, some preferring strictly homodimeric interaction, while others form heterodimers. Binding of the 14-3-3 domain to its target occurs in a phosphospecific manner where it binds to one of two consensus sequences of their target proteins; RSXpSXP (mode-1) and RXXXpSXP (mode-2). In some instances, 14-3-3 domain containing proteins are involved in regulation and signaling of a number of cellular processes in phosphorylation-independent manner. Many organisms express multiple isoforms: there are seven mammalian 14-3-3 family members (beta, gamma, eta, theta, epsilon, sigma, zeta), each encoded by a distinct gene, while plants contain up to 13 isoforms. The flexible C-terminal segment of 14-3-3 isoforms shows the highest sequence variability and may significantly contribute to individual isoform uniqueness by playing an important regulatory role by occupying the ligand binding groove and blocking the binding of inappropriate ligands in a distinct manner. Elevated amounts of 14-3-3 proteins are found in the cerebrospinal fluid of patients with Creutzfeldt-Jakob disease. In protozoa, like Plasmodium or Cryptosporidium parvum 14-3-3 proteins play an important role in key steps of parasite development." Q#24631 - CGI_10024269 superfamily 243166 51 241 4.73E-26 101.601 cl02759 TRAM_LAG1_CLN8 superfamily - - TLC domain; TLC domain. 
Q#24632 - CGI_10024270 superfamily 203013 269 296 7.14E-09 50.7022 cl04519 zf-HIT superfamily - - HIT zinc finger; This presumed zinc finger contains up to 6 cysteine residues that could coordinate zinc. The domain is named after the HIT protein. This domain is also found in the Thyroid receptor interacting protein 3 (TRIP-3) that specifically interacts with the ligand binding domain of the thyroid receptor. Q#24633 - CGI_10024271 superfamily 222150 183 206 1.03E-05 41.2233 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#24633 - CGI_10024271 superfamily 222150 153 180 0.000986726 35.8305 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#24634 - CGI_10024272 superfamily 148805 2 175 1.12E-39 138.738 cl06443 NICE-3 superfamily - - NICE-3 protein; This family consists of several eukaryotic NICE-3 and related proteins. The gene coding for NICE-3 is part of the epidermal differentiation complex (EDC) which comprises a large number of genes that are of crucial importance for the maturation of the human epidermis. The function of NICE-3 is unknown. Q#24636 - CGI_10024274 superfamily 241677 34 199 2.25E-75 227.14 cl00197 cyclophilin superfamily - - "cyclophilin: cyclophilin-type peptidylprolyl cis-trans isomerases. This family contains eukaryotic, bacterial and archaeal proteins which exhibit peptidylprolyl cis-trans isomerase activity (PPIase, Rotamase) and in addition bind the immunosuppressive drug cyclosporin (CsA). Immunosuppression in vertebrates is believed to be the result of the cyclophilin A-cyclosporin protein drug complex binding to and inhibiting the protein-phosphatase calcineurin. PPIase is an enzyme which accelerates protein folding by catalyzing the cis-trans isomerization of the peptide bonds preceding proline residues. 
Cyclophilins are a diverse family in terms of function and have been implicated in protein folding processes which depend on catalytic/chaperone-like activities. This group contains human cyclophilin 40, a co-chaperone of the hsp90 chaperone system; human cyclophilin A, a chaperone in the HIV-1 infectious process; and human cyclophilin H, a component of the U4/U6 snRNP, whose isomerization or chaperoning activities may play a role in RNA splicing." Q#24638 - CGI_10024276 superfamily 242323 149 263 1.48E-12 62.526 cl01132 FA_hydroxylase superfamily - - "Fatty acid hydroxylase superfamily; This superfamily includes fatty acid and carotene hydroxylases and sterol desaturases. Beta-carotene hydroxylase is involved in zeaxanthin synthesis by hydroxylating beta-carotene, but the enzyme may be involved in other pathways. This family includes C-5 sterol desaturase and C-4 sterol methyl oxidase. Members of this family are involved in cholesterol biosynthesis and biosynthesis of plant cuticular wax. These enzymes contain two copies of a HXHH motif. Members of this family are integral membrane proteins." Q#24639 - CGI_10024277 superfamily 183292 151 194 3.69E-05 44.042 cl18135 PRK11728 superfamily NC - hydroxyglutarate oxidase; Provisional Q#24639 - CGI_10024277 superfamily 183782 2 34 0.000307738 41.4223 cl18137 PRK12834 superfamily C - putative FAD-binding dehydrogenase; Reviewed Q#24640 - CGI_10024278 superfamily 183782 2 34 0.000247619 41.0371 cl18137 PRK12834 superfamily C - putative FAD-binding dehydrogenase; Reviewed Q#24641 - CGI_10024279 superfamily 241739 1 201 3.36E-128 370.342 cl00268 class_II_aaRS-like_core superfamily N - "Class II tRNA amino-acyl synthetase-like catalytic core domain. Class II amino acyl-tRNA synthetases (aaRS) share a common fold and generally attach an amino acid to the 3' OH of ribose of the appropriate tRNA. PheRS is an exception in that it attaches the amino acid at the 2'-OH group, like class I aaRSs. These enzymes are usually homodimers. 
This domain is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. The substrate specificity of this reaction is further determined by additional domains. Interestingly, this domain is also found in asparagine synthase A (AsnA), in the accessory subunit of mitochondrial polymerase gamma and in the bacterial ATP phosphoribosyltransferase regulatory subunit HisZ." Q#24641 - CGI_10024279 superfamily 241738 201 291 4.14E-28 104.893 cl00266 HGTP_anticodon superfamily - - "HGTP anticodon binding domain, as found at the C-terminus of histidyl, glycyl, threonyl and prolyl tRNA synthetases, which are classified as a group of class II aminoacyl-tRNA synthetases (aaRS). In aaRSs, the anticodon binding domain is responsible for specificity in tRNA-binding, so that the activated amino acid is transferred to a ribose 3' OH group of the appropriate tRNA only. This domain is also found in the accessory subunit of mitochondrial polymerase gamma (Pol gamma b)." Q#24643 - CGI_10024281 superfamily 248458 123 251 1.72E-10 61.5609 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. 
Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#24643 - CGI_10024281 superfamily 248458 363 548 6.77E-06 46.9233 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 
Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#24644 - CGI_10024282 superfamily 248458 122 242 6.65E-09 56.5533 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. 
Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#24644 - CGI_10024282 superfamily 248458 360 535 0.000437501 41.1453 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." 
Q#24645 - CGI_10024283 superfamily 243309 1 128 9.60E-59 189.98 cl03119 FpgNei_N superfamily - - "N-terminal domain of Fpg (formamidopyrimidine-DNA glycosylase, MutM)_Nei (endonuclease VIII) base-excision repair DNA glycosylases; DNA glycosylases maintain genome integrity by recognizing base lesions created by ionizing radiation, alkylating or oxidizing agents, and endogenous reactive oxygen species. These enzymes initiate the base-excision repair process, which is completed with the help of enzymes such as phosphodiesterases, AP endonucleases, DNA polymerases and DNA ligases. DNA glycosylases cleave the N-glycosyl bond between the sugar and the damaged base, creating an AP (apurinic/apyrimidinic) site. The FpgNei DNA glycosylases represent one of the two structural superfamilies of DNA glycosylases that recognize oxidized bases (the other is the HTH-GPD superfamily exemplified by Escherichia coli Nth). Most FpgNei DNA glycosylases use their N-terminal proline residue as the key catalytic nucleophile, and the reaction proceeds via a Schiff base intermediate. One exception is mouse Nei-like glycosylase 3 (Neil3) which forms a Schiff base intermediate via its N-terminal valine. In addition to this FpgNei_N domain, FpgNei proteins have a helix-two-turn-helix (H2TH) domain and a zinc (or zincless)-finger motif which also contribute residues to the active site. FpgNei DNA glycosylases have a broad substrate specificity. They are bifunctional: in addition to the glycosylase (recognition) activity, they have a lyase (cleaving) activity on the phosphodiester backbone of the DNA at the AP site. This superfamily includes eukaryotic, bacterial, and viral proteins." Q#24645 - CGI_10024283 superfamily 150080 247 285 2.16E-11 58.6478 cl07797 Neil1-DNA_bind superfamily - - "Endonuclease VIII-like 1, DNA bind; Members of this family are predominantly found in Endonuclease VIII-like 1 and adopt a glucocorticoid receptor-like fold. They allow for DNA binding." 
Q#24645 - CGI_10024283 superfamily 115485 135 201 1.13E-08 51.941 cl06065 H2TH superfamily C - Formamidopyrimidine-DNA glycosylase H2TH domain; Formamidopyrimidine-DNA glycosylase (Fpg) is a DNA repair enzyme that excises oxidized purines from damaged DNA. This family is the central domain containing the DNA-binding helix-two turn-helix domain. Q#24647 - CGI_10024285 superfamily 247792 500 543 0.000212299 40.1216 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#24648 - CGI_10024286 superfamily 243082 1084 1408 1.32E-166 503.381 cl02553 Peptidase_C19 superfamily - - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquitinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin. The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." 
Q#24648 - CGI_10024286 superfamily 247725 36 154 1.40E-34 130.436 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#24648 - CGI_10024286 superfamily 246669 216 332 9.19E-13 67.2094 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#24649 - CGI_10024287 superfamily 242205 1 160 2.58E-60 186.356 cl00937 Ribosomal_L21e superfamily - - Ribosomal protein L21e; Ribosomal protein L21e. 
Q#24650 - CGI_10024288 superfamily 243066 14 102 1.62E-11 58.3333 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#24651 - CGI_10024289 superfamily 242194 18 99 1.08E-30 116.515 cl00921 Ribosomal_L31e superfamily - - "Eukaryotic/archaeal ribosomal protein L31; Ribosomal protein L31e, which is present in archaea and eukaryotes, binds the 23S rRNA and is one of six protein components encircling the polypeptide exit tunnel. It is a component of the eukaryotic 60S (large) ribosomal subunit, and the archaeal 50S (large) ribosomal subunit." Q#24653 - CGI_10024291 superfamily 243035 134 246 2.15E-07 47.6146 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. 
Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. 
A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#24654 - CGI_10024292 superfamily 245814 38 116 0.00923442 32.8625 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#24655 - CGI_10024293 superfamily 247684 8 433 3.83E-116 353.891 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#24656 - CGI_10024294 superfamily 247684 12 110 5.20E-21 85.7919 cl17037 NBD_sugar-kinase_HSP70_actin superfamily C - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#24657 - CGI_10024295 superfamily 247684 1 313 1.21E-77 249.887 cl17037 NBD_sugar-kinase_HSP70_actin superfamily N - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#24658 - CGI_10024296 superfamily 247684 5 428 6.74E-106 326.927 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#24659 - CGI_10024297 superfamily 243238 182 684 0 666.232 cl02915 Voltage_gated_ClC superfamily - - "CLC voltage-gated chloride channel. The ClC chloride channels catalyse the selective flow of Cl- ions across cell membranes, thereby regulating electrical excitation in skeletal muscle and the flow of salt and water across epithelial barriers. This domain is found in the halogen ions (Cl-, Br- and I-) transport proteins of the ClC family. 
The ClC channels are found in all three kingdoms of life and perform a variety of functions including cellular excitability regulation, cell volume regulation, membrane potential stabilization, acidification of intracellular organelles, signal transduction, transepithelial transport in animals, and the extreme acid resistance response in eubacteria. They lack any structural or sequence similarity to other known ion channels and exhibit unique properties of ion permeation and gating. Unlike cation-selective ion channels, which form oligomers containing a single pore along the axis of symmetry, the ClC channels form two-pore homodimers with one pore per subunit without axial symmetry. Although lacking the typical voltage-sensor found in cation channels, all studied ClC channels are gated (opened and closed) by transmembrane voltage. The gating is conferred by the permeating ion itself, acting as the gating charge. In addition, eukaryotic and some prokaryotic ClC channels have two additional C-terminal CBS (cystathionine beta synthase) domains of putative regulatory function." Q#24659 - CGI_10024297 superfamily 246936 715 852 6.44E-34 126.598 cl15354 CBS_pair superfamily - - "The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. 
It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase)." Q#24660 - CGI_10024298 superfamily 246681 3 224 1.82E-128 366.581 cl14643 SRPBCC superfamily - - "START/RHO_alpha_C/PITP/Bet_v1/CoxG/CalC (SRPBCC) ligand-binding domain superfamily; SRPBCC domains have a deep hydrophobic ligand-binding pocket; they bind diverse ligands. Included in this superfamily are the steroidogenic acute regulatory protein (StAR)-related lipid transfer (START) domains of mammalian STARD1-STARD15, and the C-terminal catalytic domains of the alpha oxygenase subunit of Rieske-type non-heme iron aromatic ring-hydroxylating oxygenases (RHOs_alpha_C), as well as the SRPBCC domains of phosphatidylinositol transfer proteins (PITPs), Bet v 1 (the major pollen allergen of white birch, Betula verrucosa), CoxG, CalC, and related proteins. Other members of this superfamily include PYR/PYL/RCAR plant proteins, the aromatase/cyclase (ARO/CYC) domains of proteins such as Streptomyces glaucescens tetracenomycin, and the SRPBCC domains of Streptococcus mutans Smu.440 and related proteins." Q#24661 - CGI_10024299 superfamily 241570 171 256 4.53E-09 54.6394 cl00047 CAP_ED superfamily - - "effector domain of the CAP family of transcription factors; members include CAP (or cAMP receptor protein (CRP)), which binds cAMP, FNR (fumarate and nitrate reduction), which uses an iron-sulfur cluster to sense oxygen) and CooA, a heme containing CO sensor. 
In all cases binding of the effector leads to conformational changes and the ability to activate transcription. Cyclic nucleotide-binding domain similar to CAP are also present in cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) and vertebrate cyclic nucleotide-gated ion-channels. Cyclic nucleotide-monophosphate binding domain; proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues; the best studied is the prokaryotic catabolite gene activator, CAP, where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure; three conserved glycine residues are thought to be essential for maintenance of the structural integrity of the beta-barrel; CooA is a homodimeric transcription factor that belongs to CAP family; cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain; cAPK's are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain; cGPK's are single chain enzymes that include the two copies of the domain in their N-terminal section; also found in vertebrate cyclic nucleotide-gated ion-channels" Q#24662 - CGI_10024300 superfamily 241570 298 408 1.87E-19 85.0701 cl00047 CAP_ED superfamily - - "effector domain of the CAP family of transcription factors; members include CAP (or cAMP receptor protein (CRP)), which binds cAMP, FNR (fumarate and nitrate reduction), which uses an iron-sulfur cluster to sense oxygen) and CooA, a heme containing CO sensor. In all cases binding of the effector leads to conformational changes and the ability to activate transcription. Cyclic nucleotide-binding domain similar to CAP are also present in cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) and vertebrate cyclic nucleotide-gated ion-channels. 
Cyclic nucleotide-monophosphate binding domain; proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues; the best studied is the prokaryotic catabolite gene activator, CAP, where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure; three conserved glycine residues are thought to be essential for maintenance of the structural integrity of the beta-barrel; CooA is a homodimeric transcription factor that belongs to CAP family; cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain; cAPK's are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain; cGPK's are single chain enzymes that include the two copies of the domain in their N-terminal section; also found in vertebrate cyclic nucleotide-gated ion-channels" Q#24668 - CGI_10024306 superfamily 241570 258 335 4.37E-05 41.9278 cl00047 CAP_ED superfamily C - "effector domain of the CAP family of transcription factors; members include CAP (or cAMP receptor protein (CRP)), which binds cAMP, FNR (fumarate and nitrate reduction), which uses an iron-sulfur cluster to sense oxygen) and CooA, a heme containing CO sensor. In all cases binding of the effector leads to conformational changes and the ability to activate transcription. Cyclic nucleotide-binding domain similar to CAP are also present in cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) and vertebrate cyclic nucleotide-gated ion-channels. 
Cyclic nucleotide-monophosphate binding domain; proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues; the best studied is the prokaryotic catabolite gene activator, CAP, where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure; three conserved glycine residues are thought to be essential for maintenance of the structural integrity of the beta-barrel; CooA is a homodimeric transcription factor that belongs to CAP family; cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain; cAPK's are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain; cGPK's are single chain enzymes that include the two copies of the domain in their N-terminal section; also found in vertebrate cyclic nucleotide-gated ion-channels" Q#24669 - CGI_10024307 superfamily 243092 305 616 7.38E-52 183.69 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed 
ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#24669 - CGI_10024307 superfamily 217837 753 860 1.54E-20 88.8037 cl04367 Utp12 superfamily - - Dip2/Utp12 Family; This domain is found at the C-terminus of proteins containing WD40 repeats. These proteins are part of the U3 ribonucleoprotein the yeast protein is called Utp12 or DIP2. Q#24669 - CGI_10024307 superfamily 243092 58 200 2.30E-09 58.1152 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#24670 - CGI_10024308 superfamily 241883 11 116 1.92E-34 118.258 cl00466 ATP-synt_C superfamily - - ATP synthase subunit C; ATP synthase subunit C. 
Q#24670 - CGI_10024308 superfamily 241883 108 150 5.81E-05 38.1992 cl00466 ATP-synt_C superfamily N - ATP synthase subunit C; ATP synthase subunit C. Q#24671 - CGI_10024309 superfamily 241593 45 160 1.20E-08 53.0342 cl00075 HATPase_c superfamily - - "Histidine kinase-like ATPases; This family includes several ATP-binding proteins for example: histidine kinase, DNA gyrase B, topoisomerases, heat shock protein HSP90, phytochrome-like ATPases and DNA mismatch repair proteins" Q#24672 - CGI_10026329 superfamily 243096 504 635 1.86E-28 113.932 cl02571 RhoGEF superfamily C - Guanine nucleotide exchange factor for Rho/Rac/Cdc42-like GTPases; Also called Dbl-homologous (DH) domain. It appears that PH domains invariably occur C-terminal to RhoGEF/DH domains. Q#24672 - CGI_10026329 superfamily 248318 799 853 3.93E-22 91.7285 cl17764 FYVE superfamily - - "FYVE domain; Zinc-binding domain; targets proteins to membrane lipids via interaction with phosphatidylinositol-3-phosphate, PI3P; present in Fab1, YOTB, Vac1, and EEA1;" Q#24672 - CGI_10026329 superfamily 247725 656 754 3.58E-17 78.6458 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#24672 - CGI_10026329 superfamily 247725 870 956 2.94E-16 75.8747 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. 
All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#24673 - CGI_10026330 superfamily 248318 36 88 8.50E-18 74.3945 cl17764 FYVE superfamily - - "FYVE domain; Zinc-binding domain; targets proteins to membrane lipids via interaction with phosphatidylinositol-3-phosphate, PI3P; present in Fab1, YOTB, Vac1, and EEA1;" Q#24673 - CGI_10026330 superfamily 247725 112 195 1.41E-14 66.2447 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#24674 - CGI_10026331 superfamily 243072 6 95 9.17E-09 48.5338 cl02529 ANK superfamily N - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#24675 - CGI_10026332 superfamily 243072 6 95 1.02E-08 48.1486 cl02529 ANK superfamily N - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. 
The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#24677 - CGI_10026334 superfamily 241600 121 333 3.87E-88 266.028 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#24678 - CGI_10026335 superfamily 243082 84 276 6.90E-20 85.1104 cl02553 Peptidase_C19 superfamily C - "Peptidase C19 contains ubiquitinyl hydrolases. They are intracellular peptidases that remove ubiquitin molecules from polyubiquinated peptides by cleavage of isopeptide bonds. They hydrolyse bonds involving the carboxyl group of the C-terminal Gly residue of ubiquitin The purpose of the de-ubiquitination is thought to be editing of the ubiquitin conjugates, which could rescue them from degradation, as well as recycling of the ubiquitin. The ubiquitin/proteasome system is responsible for most protein turnover in the mammalian cell, and with over 50 members, family C19 is one of the largest families of peptidases in the human genome." 
Q#24679 - CGI_10026336 superfamily 245201 14 288 2.34E-56 189.653 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#24680 - CGI_10026337 superfamily 247724 13 209 2.67E-80 240.925 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." 
Q#24681 - CGI_10026338 superfamily 243066 845 945 1.40E-06 47.6856 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#24682 - CGI_10026339 superfamily 241782 152 551 2.08E-119 359.21 cl00321 AAT_I superfamily - - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. 
The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phophorylase family (fold type V)." Q#24683 - CGI_10026340 superfamily 246722 1545 1721 5.20E-80 263.463 cl14812 PIN_SF superfamily - - "PIN (PilT N terminus) domain: Superfamily; PIN_SF The PIN (PilT N terminus) domain belongs to a large nuclease superfamily with representatives from eukaryota, eubacteria, and archaea. PIN domains were originally named for their sequence similarity to the N-terminal domain of an annotated pili biogenesis protein, PilT, a domain fusion between a PIN-domain and a PilT ATPase domain. The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues) is geometrically similar in the active center of structure-specific 5' nucleases (also known as Flap endonuclease-1-like), PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Seen here, are two major divisions in the PIN domain superfamily. The first major division, the structure-specific 5' nuclease family, is represented by FEN1, the 5'-3' exonuclease of DNA polymerase I, and T4 RNase H nuclease PIN domains. These 5' nucleases are involved in DNA replication, repair, and recombination. They are capable of both 5'-3' exonucleolytic activity and cleaving bifurcated DNA, in an endonucleolytic, structure-specific manner. 
Unique to FEN1-like nucleases, the PIN domain has a helical arch/clamp region (I domain) of variable length (approximately 16 to 800 residues) and, inserted within the C-terminal region of the PIN domain, a H3TH (helix-3-turn-helix) domain, an atypical helix-hairpin-helix-2-like region. Both the H3TH domain (not included here) and the helical arch/clamp region are involved in DNA binding. With the exception of Mkt1, these nucleases have a carboxylate rich active site that is involved in binding essential divalent metal ion cofactors (Mg2+, Mn2+, Zn2+, or Co2+). The second major division of the PIN domain superfamily, the VapC-Smg6 family, includes such eukaryotic ribonucleases as, Smg6, an essential factor in nonsense-mediated mRNA decay; Rrp44, the catalytic subunit of the exosome; and Nob1, a ribosome assembly factor critical in pre-rRNA processing. A large percentage of members in this family are bacterial ribonuclease toxins of TA operons such as Mycobacterium tuberculosis VapC and Neisseria gonorrhoeae FitB, as well as, archaeal homologs, Pyrobaculum aerophilum Pea0151 and P. aerophilum Pae2754. Also included are the eukaryotic Fcf1/ Utp24 (FAF1-copurifying factor 1/U three-associated protein 24) and Utp23-like proteins. Components of the small subunit processome, Fcf1/Utp24 and Utp23 are essential proteins involved in pre-rRNA processing and 40S ribosomal subunit assembly." Q#24683 - CGI_10026340 superfamily 220722 968 1074 1.38E-20 90.5565 cl11040 EST1 superfamily - - Telomerase activating protein Est1; Est1 is a protein which recruits or activates telomerase at the site of polymerisation. Q#24686 - CGI_10026343 superfamily 243051 26 173 3.93E-12 63.5237 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. 
Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#24687 - CGI_10026344 superfamily 191444 21 93 0.0063329 31.5257 cl05558 IL17 superfamily - - Interleukin-17; IL-17 is a potent proinflammatory cytokine produced by activated memory T cells. The IL-17 family is thought to represent a distinct signaling system that appears to have been highly conserved across vertebrate evolution. Q#24688 - CGI_10026345 superfamily 243119 60 112 1.34E-05 38.9541 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#24688 - CGI_10026345 superfamily 243119 3 52 0.000474979 34.7169 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. 
Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#24689 - CGI_10026346 superfamily 241810 96 198 5.49E-58 181.977 cl00354 KOW superfamily C - "KOW: an acronym for the authors' surnames (Kyrpides, Ouzounis and Woese); KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. The KOW motif contains an invariant glycine residue and comprises alternating blocks of hydrophilic and hydrophobic residues." Q#24691 - CGI_10026348 superfamily 241740 9 122 2.34E-36 123.881 cl00269 cytidine_deaminase-like superfamily - - "Cytidine and deoxycytidylate deaminase zinc-binding region. The family contains cytidine deaminases, nucleoside deaminases, deoxycytidylate deaminases and riboflavin deaminases. Also included are the apoBec family of mRNA editing enzymes. All members are Zn dependent. The zinc ion in the active site plays a central role in the proposed catalytic mechanism, activating a water molecule to form a hydroxide ion that performs a nucleophilic attack on the substrate." Q#24692 - CGI_10026349 superfamily 215691 191 249 9.44E-05 40.2618 cl15766 Pyr_redox superfamily C - Pyridine nucleotide-disulphide oxidoreductase; This family includes both class I and class II oxidoreductases and also NADH oxidases and peroxidases. This domain is actually a small NADH binding domain within a larger FAD binding domain. 
Q#24694 - CGI_10026351 superfamily 247984 24 200 2.34E-53 182.425 cl17430 FtsJ superfamily - - "FtsJ-like methyltransferase; This family consists of FtsJ from various bacterial and archaeal sources FtsJ is a methyltransferase, but actually has no effect on cell division. FtsJ's substrate is the 23S rRNA. The 1.5 A crystal structure of FtsJ in complex with its cofactor S-adenosylmethionine revealed that FtsJ has a methyltransferase fold. This family also includes the N terminus of flaviviral NS5 protein. It has been hypothesised that the N-terminal domain of NS5 is a methyltransferase involved in viral RNA capping." Q#24694 - CGI_10026351 superfamily 221275 230 314 1.49E-27 109.298 cl13325 DUF3381 superfamily C - "Domain of unknown function (DUF3381); This domain is functionally uncharacterized. This domain is found in eukaryotes. This presumed domain is typically between 156 to 174 amino acids in length. This domain is found associated with pfam07780, pfam01728." Q#24694 - CGI_10026351 superfamily 219572 532 680 5.39E-26 106.609 cl06696 Spb1_C superfamily - - Spb1 C-terminal domain; This presumed domain is found at the C-terminus of a family of FtsJ-like methyltransferases. Members of this family are involved in 60S ribosomal biogenesis. Q#24696 - CGI_10026353 superfamily 247755 1156 1376 2.56E-118 370.671 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." 
Q#24696 - CGI_10026353 superfamily 247755 566 707 1.02E-54 191.143 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#24696 - CGI_10026353 superfamily 216049 840 1107 1.04E-25 108.913 cl18356 ABC_membrane superfamily - - ABC transporter transmembrane region; This family represents a unit of six transmembrane helices. Many members of the ABC transporter family (pfam00005) have two such regions. Q#24696 - CGI_10026353 superfamily 216049 290 518 3.60E-24 104.29 cl18356 ABC_membrane superfamily - - ABC transporter transmembrane region; This family represents a unit of six transmembrane helices. Many members of the ABC transporter family (pfam00005) have two such regions. Q#24699 - CGI_10026356 superfamily 248022 341 590 6.46E-10 59.9839 cl17468 Aa_trans superfamily N - "Transmembrane amino acid transporter protein; This transmembrane region is found in many amino acid transporters including UNC-47 and MTR. UNC-47 encodes a vesicular amino butyric acid (GABA) transporter, (VGAT). UNC-47 is predicted to have 10 transmembrane domains. MTR is a N system amino acid transporter system protein involved in methyltryptophan resistance. Other members of this family include proline transporters and amino acid permeases." 
Q#24699 - CGI_10026356 superfamily 248022 149 230 1.46E-07 52.6651 cl17468 Aa_trans superfamily C - "Transmembrane amino acid transporter protein; This transmembrane region is found in many amino acid transporters including UNC-47 and MTR. UNC-47 encodes a vesicular amino butyric acid (GABA) transporter, (VGAT). UNC-47 is predicted to have 10 transmembrane domains. MTR is a N system amino acid transporter system protein involved in methyltryptophan resistance. Other members of this family include proline transporters and amino acid permeases." Q#24700 - CGI_10026357 superfamily 243092 434 619 0.00282846 39.2404 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#24702 - CGI_10026359 superfamily 243854 1 154 1.29E-97 282.966 cl04709 ASF1_hist_chap superfamily - - ASF1 like histone chaperone; This family includes the yeast and human ASF1 protein. 
These proteins have histone chaperone activity. ASF1 participates in both the replication-dependent and replication-independent pathways. The three-dimensional structure has been determined as a compact immunoglobulin-like beta sandwich fold topped by three helical linkers. Q#24704 - CGI_10026361 superfamily 242465 282 319 7.22E-05 42.0436 cl01378 LicD superfamily C - "LicD family; The LICD family of proteins show high sequence similarity and are involved in phosphorylcholine metabolism. There is evidence to show that LicD2 mutants have a reduced ability to take up choline, have decreased ability to adhere to host cells and are less virulent. These proteins are part of the nucleotidyltransferase superfamily." Q#24705 - CGI_10026362 superfamily 217390 158 299 1.29E-26 107.646 cl18407 TPT superfamily - - Triose-phosphate Transporter family; This family includes transporters with a specificity for triose phosphate. Q#24705 - CGI_10026362 superfamily 217473 435 662 1.75E-25 107.836 cl03978 Mab-21 superfamily - - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#24705 - CGI_10026362 superfamily 248313 21 148 5.65E-08 52.2334 cl17759 EamA superfamily - - EamA-like transporter family; This family includes many hypothetical membrane proteins of unknown function. Many of the proteins contain two copies of the aligned region. The family used to be known as DUF6. Q#24706 - CGI_10026363 superfamily 220691 206 316 0.00982061 36.827 cl18569 7TM_GPCR_Srv superfamily NC - Serpentine type 7TM GPCR chemoreceptor Srv; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srv is a member of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. 
Q#24707 - CGI_10026365 superfamily 241677 142 299 5.66E-108 313.81 cl00197 cyclophilin superfamily - - "cyclophilin: cyclophilin-type peptidylprolyl cis- trans isomerases. This family contains eukaryotic, bacterial and archeal proteins which exhibit a peptidylprolyl cis- trans isomerases activity (PPIase, Rotamase) and in addition bind the immunosuppressive drug cyclosporin (CsA). Immunosuppression in vertebrates is believed to be the result of the cyclophilin A-cyclosporin protein drug complex binding to and inhibiting the protein-phosphatase calcineurin. PPIase is an enzyme which accelerates protein folding by catalyzing the cis-trans isomerization of the peptide bonds preceding proline residues. Cyclophilins are a diverse family in terms of function and have been implicated in protein folding processes which depend on catalytic /chaperone-like activities. This group contains human cyclophilin 40, a co-chaperone of the hsp90 chaperone system; human cyclophilin A, a chaperone in the HIV-1 infectious process and; human cyclophilin H, a component of the U4/U6 snRNP, whose isomerization or chaperoning activities may play a role in RNA splicing." Q#24707 - CGI_10026365 superfamily 247723 9 81 1.18E-48 158.54 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. 
The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#24708 - CGI_10026366 superfamily 247755 56 213 6.72E-42 153.523 cl17201 ABC_ATPase superfamily C - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#24708 - CGI_10026366 superfamily 247755 986 1032 1.69E-22 97.2834 cl17201 ABC_ATPase superfamily NC - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#24711 - CGI_10026369 superfamily 217473 325 552 3.77E-25 107.066 cl03978 Mab-21 superfamily - - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. 
Q#24711 - CGI_10026369 superfamily 217473 1036 1137 5.37E-13 69.7013 cl03978 Mab-21 superfamily NC - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#24712 - CGI_10026370 superfamily 248012 515 643 2.58E-13 66.8325 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#24712 - CGI_10026370 superfamily 248012 340 450 1.28E-10 59.1285 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#24712 - CGI_10026370 superfamily 247057 253 318 5.67E-09 53.46 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#24712 - CGI_10026370 superfamily 248012 75 190 0.00225983 36.7869 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#24713 - CGI_10026371 superfamily 217925 11 120 4.85E-27 99.1177 cl04417 Ctr superfamily - - "Ctr copper transporter family; The redox active metal copper is an essential cofactor in critical biological processes such as respiration, iron transport, oxidative stress protection, hormone production, and pigmentation. 
A widely conserved family of high-affinity copper transport proteins (Ctr proteins) mediates copper uptake at the plasma membrane. A series of clustered methionine residues in the hydrophilic extracellular domain, and an MXXXM motif in the second transmembrane domain, are important for copper uptake. These methionine probably coordinate copper during the process of metal transport." Q#24714 - CGI_10026372 superfamily 217925 48 92 0.00308086 34.7893 cl04417 Ctr superfamily N - "Ctr copper transporter family; The redox active metal copper is an essential cofactor in critical biological processes such as respiration, iron transport, oxidative stress protection, hormone production, and pigmentation. A widely conserved family of high-affinity copper transport proteins (Ctr proteins) mediates copper uptake at the plasma membrane. A series of clustered methionine residues in the hydrophilic extracellular domain, and an MXXXM motif in the second transmembrane domain, are important for copper uptake. These methionine probably coordinate copper during the process of metal transport." Q#24716 - CGI_10026374 superfamily 248097 60 183 1.04E-13 64.2086 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#24717 - CGI_10026375 superfamily 248097 220 343 1.34E-14 69.2162 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#24717 - CGI_10026375 superfamily 248097 73 159 1.21E-12 63.4382 cl17543 C1q superfamily C - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#24718 - CGI_10026376 superfamily 248097 59 182 1.01E-12 61.127 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. 
Q#24719 - CGI_10026377 superfamily 248097 118 241 2.40E-18 78.0758 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#24720 - CGI_10026378 superfamily 222429 7 85 2.50E-14 65.3396 cl18676 Myb_DNA-bind_5 superfamily - - Myb/SANT-like DNA-binding domain; This presumed domain appears to be related to other Myb/SANT like DNA binding domains. This family is greatly expanded in arthropods and higher eukaryotes. Q#24721 - CGI_10026379 superfamily 248097 59 184 5.77E-18 76.1498 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#24722 - CGI_10026380 superfamily 243077 10 65 1.78E-20 85.6749 cl02542 DnaJ superfamily - - "DnaJ domain or J-domain. DnaJ/Hsp40 (heat shock protein 40) proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperonine family. Hsp40 proteins are characterized by the presence of a J domain, which mediates the interaction with Hsp70. They may contain other domains as well, and the architectures provide a means of classification." Q#24722 - CGI_10026380 superfamily 248275 309 334 1.32E-08 51.4268 cl17721 zf-C2H2_jaz superfamily - - "Zinc-finger double-stranded RNA-binding; This domain family is found in archaea and eukaryotes, and is approximately 30 amino acids in length. The mammalian members of this group occur multiple times along the protein, joined by flexible linkers, and are referred to as JAZ - dsRNA-binding ZF protein - zinc-fingers. The JAZ proteins are expressed in all tissues tested and localise in the nucleus, particularly the nucleolus. JAZ preferentially binds to double-stranded (ds) RNA or RNA/DNA hybrids rather than DNA. In addition to binding double-stranded RNA, these zinc-fingers are required for nucleolar localisation." 
Q#24724 - CGI_10026382 superfamily 247683 268 337 1.73E-05 42.1096 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#24725 - CGI_10026383 superfamily 246908 504 609 3.12E-56 189.457 cl15255 SH2 superfamily - - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." 
Q#24725 - CGI_10026383 superfamily 214269 614 765 1.07E-38 141.978 cl17108 iSH2_PI3K_IA_R superfamily - - "Inter-Src homology 2 (iSH2) helical domain of Class IA Phosphoinositide 3-kinase Regulatory subunits; PI3Ks catalyze the transfer of the gamma-phosphoryl group from ATP to the 3-hydroxyl of the inositol ring of D-myo-phosphatidylinositol (PtdIns) or its derivatives. They play an important role in a variety of fundamental cellular processes, including cell motility, the Ras pathway, vesicle trafficking and secretion, immune cell activation, and apoptosis. They are classified according to their substrate specificity, regulation, and domain structure. Class IA PI3Ks are heterodimers of a p110 catalytic (C) subunit and a p85-related regulatory (R) subunit. The R subunit down-regulates PI3K basal activity, stabilizes the C subunit, and plays a role in the activation downstream of tyrosine kinases. All R subunits contain two SH2 domains that flank an intervening helical domain (iSH2), which binds to the N-terminal adaptor-binding domain (ABD) of the catalytic subunit. In vertebrates, there are three genes (PIK3R1, PIK3R2, and PIK3R3) that encode for different Class IA PI3K R subunits." Q#24725 - CGI_10026383 superfamily 243095 298 471 7.19E-28 111.625 cl02570 RhoGAP superfamily - - "RhoGAP: GTPase-activator protein (GAP) for Rho-like GTPases; GAPs towards Rho/Rac/Cdc42-like small GTPases. Small GTPases (G proteins) cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when bound to GDP. The Rho family of small G proteins, which includes Cdc42Hs, activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. G proteins generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. 
The RhoGAPs are one of the major classes of regulators of Rho G proteins." Q#24725 - CGI_10026383 superfamily 241566 214 263 5.86E-09 53.65 cl00040 C1 superfamily - - "Protein kinase C conserved region 1 (C1) . Cysteine-rich zinc binding domain. Some members of this domain family bind phorbol esters and diacylglycerol, some are reported to bind RasGTP. May occur in tandem arrangement. Diacylglycerol (DAG) is a second messenger, released by activation of Phospholipase D. Phorbol Esters (PE) can act as analogues of DAG and mimic its downstream effects in, for example, tumor promotion. Protein Kinases C are activated by DAG/PE, this activation is mediated by their N-terminal conserved region (C1). DAG/PE binding may be phospholipid dependent. C1 domains may also mediate DAG/PE signals in chimaerins (a family of Rac GTPase activating proteins), RasGRPs (exchange factors for Ras/Rap1), and Munc13 isoforms (scaffolding proteins involved in exocytosis)." Q#24725 - CGI_10026383 superfamily 247057 121 170 2.94E-08 51.8565 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." 
Q#24725 - CGI_10026383 superfamily 246908 782 884 5.29E-26 104.033 cl15255 SH2 superfamily - - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." Q#24725 - CGI_10026383 superfamily 241566 45 86 0.000155676 40.5155 cl00040 C1 superfamily C - "Protein kinase C conserved region 1 (C1) . Cysteine-rich zinc binding domain. Some members of this domain family bind phorbol esters and diacylglycerol, some are reported to bind RasGTP. May occur in tandem arrangement. Diacylglycerol (DAG) is a second messenger, released by activation of Phospholipase D. Phorbol Esters (PE) can act as analogues of DAG and mimic its downstream effects in, for example, tumor promotion. Protein Kinases C are activated by DAG/PE, this activation is mediated by their N-terminal conserved region (C1). DAG/PE binding may be phospholipid dependent. C1 domains may also mediate DAG/PE signals in chimaerins (a family of Rac GTPase activating proteins), RasGRPs (exchange factors for Ras/Rap1), and Munc13 isoforms (scaffolding proteins involved in exocytosis)." Q#24726 - CGI_10026385 superfamily 245206 6 333 0 549.445 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. 
The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#24728 - CGI_10026387 superfamily 243690 127 287 1.31E-16 75.8233 cl04276 Mtp superfamily N - Golgi 4-transmembrane spanning transporter; Golgi 4-transmembrane spanning transporter. Q#24730 - CGI_10026389 superfamily 248012 279 421 1.06E-22 93.9272 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#24731 - CGI_10026390 superfamily 245213 45 82 2.54E-05 42.2386 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#24733 - CGI_10026392 superfamily 247724 111 146 0.000594677 38.8564 cl17170 Ras_like_GTPase superfamily NC - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. 
The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#24736 - CGI_10026395 superfamily 247725 121 264 2.10E-37 137.835 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#24736 - CGI_10026395 superfamily 243095 365 559 1.11E-97 311.257 cl02570 RhoGAP superfamily - - "RhoGAP: GTPase-activator protein (GAP) for Rho-like GTPases; GAPs towards Rho/Rac/Cdc42-like small GTPases. 
Small GTPases (G proteins) cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when bound to GDP. The Rho family of small G proteins, which includes Cdc42Hs, activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. G proteins generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. The RhoGAPs are one of the major classes of regulators of Rho G proteins." Q#24739 - CGI_10026398 superfamily 220666 73 156 2.26E-18 80.0966 cl10951 Tmemb_185A superfamily N - "Transmembrane Fragile-X-F protein; This is a family of conserved transmembrane proteins that appear in humans to be expressed from a region upstream of the FragileXF site and to be intimately linked with the Fragile-X syndrome. Absence of TMEM185A does not necessarily lead to developmental delay, but might in combination with other, yet unknown, factors. Otherwise, the lack of the TMEM185A protein is either disposable (redundant) or its function can be complemented by the highly similar chromosome 2 retro-pseudogene product, TMEM185B." Q#24741 - CGI_10026400 superfamily 245201 1 136 3.03E-46 157.123 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." 
Q#24744 - CGI_10000799 superfamily 215691 160 241 5.65E-08 49.8918 cl15766 Pyr_redox superfamily - - Pyridine nucleotide-disulphide oxidoreductase; This family includes both class I and class II oxidoreductases and also NADH oxidases and peroxidases. This domain is actually a small NADH binding domain within a larger FAD binding domain. Q#24746 - CGI_10000801 superfamily 217572 337 400 5.78E-11 59.0684 cl08392 NIR_SIR_ferr superfamily - - Nitrite/Sulfite reductase ferredoxin-like half domain; Sulfite and Nitrite reductases are key to both biosynthetic assimilation of sulfur and nitrogen and dissimilation of oxidized anions for energy transduction. Two copies of this repeat are found in Nitrite and Sulfite reductases and form a single structural domain. Q#24747 - CGI_10000803 superfamily 245203 11 515 0 712.845 cl09928 Molybdopterin-Binding superfamily - - "Molybdopterin-Binding (MopB) domain of the MopB superfamily of proteins, a large, diverse, heterogeneous superfamily of enzymes that, in general, bind molybdopterin as a cofactor. The MopB domain is found in a wide variety of molybdenum- and tungsten-containing enzymes, including formate dehydrogenase-H (Fdh-H) and -N (Fdh-N), several forms of nitrate reductase (Nap, Nas, NarG), dimethylsulfoxide reductase (DMSOR), thiosulfate reductase, formylmethanofuran dehydrogenase, and arsenite oxidase. Molybdenum is present in most of these enzymes in the form of molybdopterin, a modified pterin ring with a dithiolene side chain, which is responsible for ligating the Mo. In many bacterial and archaeal species, molybdopterin is in the form of a dinucleotide, with two molybdopterin dinucleotide units per molybdenum. These proteins can function as monomers, heterodimers, or heterotrimers, depending on the protein and organism. 
Also included in the MopB superfamily is the eukaryotic/eubacterial protein domain family of the 75-kDa subunit/Nad11/NuoG (second domain) of respiratory complex 1/NADH-quinone oxidoreductase which is postulated to have lost an ancestral formate dehydrogenase activity and only vestigial sequence evidence remains of a molybdopterin binding site." Q#24747 - CGI_10000803 superfamily 245204 522 645 1.62E-55 187.01 cl09929 MopB_CT superfamily - - "Molybdopterin-Binding, C-terminal (MopB_CT) domain of the MopB superfamily of proteins, a large, diverse, heterogeneous superfamily of enzymes that, in general, bind molybdopterin as a cofactor. The MopB domain is found in a wide variety of molybdenum- and tungsten-containing enzymes, including formate dehydrogenase-H (Fdh-H) and -N (Fdh-N), several forms of nitrate reductase (Nap, Nas, NarG), dimethylsulfoxide reductase (DMSOR), thiosulfate reductase, formylmethanofuran dehydrogenase, and arsenite oxidase. Molybdenum is present in most of these enzymes in the form of molybdopterin, a modified pterin ring with a dithiolene side chain, which is responsible for ligating the Mo. In many bacterial and archaeal species, molybdopterin is in the form of a dinucleotide, with two molybdopterin dinucleotide units per molybdenum. These proteins can function as monomers, heterodimers, or heterotrimers, depending on the protein and organism. Also included in the MopB superfamily is the eukaryotic/eubacterial protein domain family of the 75-kDa subunit/Nad11/NuoG (second domain) of respiratory complex 1/NADH-quinone oxidoreductase which is postulated to have lost an ancestral formate dehydrogenase activity and only vestigial sequence evidence remains of a molybdopterin binding site. This hierarchy is of the conserved MopB_CT domain present in many, but not all, MopB homologs." 
Q#24747 - CGI_10000803 superfamily 242297 772 823 5.50E-13 65.3851 cl01093 Fer2_BFD superfamily - - "BFD-like [2Fe-2S] binding domain; The two Fe ions are each coordinated by two conserved cysteine residues. This domain occurs alone in small proteins such as Bacterioferritin-associated ferredoxin (BFD). The function of BFD is not known, but it may be a general redox and/or regulatory component involved in the iron storage or mobilisation functions of bacterioferritin in bacteria. This domain is also found in nitrate reductase proteins in association with Nitrite and sulphite reductase 4Fe-4S domain (pfam01077), Nitrite/Sulfite reductase ferredoxin-like half domain (pfam03460) and Pyridine nucleotide-disulphide oxidoreductase (pfam00070). It is also found in NifU nitrogen fixation proteins, in association with NifU-like N terminal domain (pfam01592) and NifU-like domain (pfam01106)." Q#24748 - CGI_10014146 superfamily 218787 8 145 1.45E-61 189.404 cl05443 ESCRT-II superfamily - - "ESCRT-II complex subunit; This family of conserved eukaryotic proteins are subunits of the endosome associated complex ESCRT-II which recruits transport machinery for protein sorting at the multivesicular body (MVB). This protein complex transiently associates with the endosomal membrane and thereby initiates the formation of ESCRT-III, a membrane-associated protein complex that functions immediately downstream of ESCRT-II during sorting of MVB cargo. ESCRT-II in turn functions downstream of ESCRT-I, a protein complex that binds to ubiquitinated endosomal cargo." 
Q#24749 - CGI_10014147 superfamily 243092 420 675 3.90E-68 226.833 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#24749 - CGI_10014147 superfamily 208922 93 225 5.85E-51 173.919 cl08418 TAF5_NTD2 superfamily - - "TAF5_NTD2 is the second conserved N-terminal region of TATA Binding Protein (TBP) Associated Factor 5 (TAF5), involved in forming Transcription Factor IID (TFIID); The TATA Binding Protein (TBP) Associated Factor 5 (TAF5) is one of several TAFs that bind TBP and are involved in forming Transcription Factor IID (TFIID) complex. TAF5 contains three domains, two conserved sequence motifs at the N-terminal and one at the C-terminal region. TFIID is one of seven General Transcription Factors (GTF) (TFIIA, TFIIB, TFIID, TFIIE, TFIIF, and TFIID) involved in accurate initiation of transcription by RNA polymerase II in eukaryotes. 
TFIID plays an important role in the recognition of promoter DNA and assembly of the preinitiation complex. TFIID complex is composed of the TBP and at least 13 TAFs. In yeast and human cells, TAFs have been found as components of other complexes besides TFIID. TAF5 may play a major role in forming TFIID and its related complexes. TAFs from various species were originally named by their predicted molecular weight or their electrophoretic mobility in polyacrylamide gels. A new, unified nomenclature for the pol II TAFs has been suggested to show the relationship between TAF orthologs and paralogs. TAF5 has a paralog gene (TAF5L) which has a redundant function. Several hypotheses are proposed for TAFs functions such as serving as activator-binding sites, core-promoter recognition or a role in essential catalytic activity. C-terminus of TAF5 contains six WD40 repeats that likely form a closed beta propeller structure and may be involved in protein-protein interaction. The first part of the TAF5 N-terminal (TAF5_NTD1) homodimerizes in the absence of other TAFs. The second conserved N-terminal part of TAF5 (TAF5_NTD2) has an alpha-helical domain. One study has shown that TAF5_NTD2 homodimerizes only at high concentration of calcium but not any other metals. No dimerization was observed in other structural studies of TAF_NTD2. Several TAFs interact via histone-fold (HFD) motifs; HFD is the interaction motif involved in heterodimerization of the core histones and their assembly into nucleosome octamer. However, TAF5 does not have a HFD motif." Q#24751 - CGI_10014149 superfamily 218883 148 310 0.00113315 40.196 cl09398 DUF936 superfamily NC - Plant protein of unknown function (DUF936); This family consists of several hypothetical proteins from Arabidopsis thaliana and Oryza sativa. The function of this family is unknown. 
Q#24752 - CGI_10014150 superfamily 248028 571 755 1.31E-21 95.6517 cl17474 Steroid_dh superfamily N - "3-oxo-5-alpha-steroid 4-dehydrogenase; This family consists of 3-oxo-5-alpha-steroid 4-dehydrogenases, EC:1.3.99.5 Also known as Steroid 5-alpha-reductase, the reaction catalyzed by this enzyme is: 3-oxo-5-alpha-steroid + acceptor <=> 3-oxo-delta(4)-steroid + reduced acceptor. The Steroid 5-alpha-reductase enzyme is responsible for the formation of dihydrotestosterone, this hormone promotes the differentiation of male external genitalia and the prostate during fetal development. In humans mutations in this enzyme can cause a form of male pseudohermaphroditism in which the external genitalia and prostate fail to develop normally. A related enzyme also found in plants is DET2, a steroid reductase from Arabidopsis. Mutations in this enzyme cause defects in light-regulated development." Q#24752 - CGI_10014150 superfamily 214781 126 224 2.41E-12 64.6708 cl02747 NRF superfamily - - N-terminal domain in C. elegans NRF-6 (Nose Resistant to Fluoxetine-4) and NDG-4 (resistant to nordihydroguaiaretic acid-4); Also present in several other worm and fly proteins. Q#24753 - CGI_10014151 superfamily 247787 104 476 0 774.089 cl17233 RecA-like_NTPases superfamily - - "RecA-like NTPases. This family includes the NTP binding domain of F1 and V1 H+ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. This group also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion." Q#24753 - CGI_10014151 superfamily 215848 494 636 8.71E-24 96.9653 cl08258 ATP-synt_ab_C superfamily - - "ATP synthase alpha/beta chain, C terminal domain; ATP synthase alpha/beta chain, C terminal domain. 
" Q#24753 - CGI_10014151 superfamily 217261 40 102 1.03E-12 64.0836 cl18399 ATP-synt_ab_N superfamily - - "ATP synthase alpha/beta family, beta-barrel domain; This family includes the ATP synthase alpha and beta subunits the ATP synthase associated with flagella." Q#24754 - CGI_10014152 superfamily 241581 354 454 1.03E-08 52.7738 cl00062 FHA superfamily - - "Forkhead associated domain (FHA); found in eukaryotic and prokaryotic proteins. Putative nuclear signalling domain. FHA domains may bind phosphothreonine, phosphoserine and sometimes phosphotyrosine. In eukaryotes, many FHA domain-containing proteins localize to the nucleus, where they participate in establishing or maintaining cell cycle checkpoints, DNA repair, or transcriptional regulation. Members of the FHA family include: Dun1, Rad53, Cds1, Mek1, KAPP(kinase-associated protein phosphatase),and Ki-67 (a human nuclear protein related to cell proliferation)." Q#24755 - CGI_10014153 superfamily 247044 1507 1612 5.72E-32 122.765 cl15697 ADF_gelsolin superfamily - - Actin depolymerization factor/cofilin- and gelsolin-like domains; Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. Q#24755 - CGI_10014153 superfamily 247044 1196 1289 5.62E-23 96.5376 cl15697 ADF_gelsolin superfamily - - Actin depolymerization factor/cofilin- and gelsolin-like domains; Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. 
Q#24755 - CGI_10014153 superfamily 247044 1634 1737 9.05E-15 72.6493 cl15697 ADF_gelsolin superfamily - - Actin depolymerization factor/cofilin- and gelsolin-like domains; Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. Q#24755 - CGI_10014153 superfamily 247044 1316 1389 4.95E-08 52.7554 cl15697 ADF_gelsolin superfamily - - Actin depolymerization factor/cofilin- and gelsolin-like domains; Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. Q#24759 - CGI_10014157 superfamily 243267 31 395 1.79E-130 382.732 cl03000 Innexin superfamily - - "Innexin; This family includes the drosophila proteins Ogre and shaking-B, and the C. elegans proteins Unc-7 and Unc-9. Members of this family are integral membrane proteins which are involved in the formation of gap junctions. This family has been named the Innexins." Q#24760 - CGI_10014158 superfamily 216212 722 1226 0 675.547 cl03037 HCO3_cotransp superfamily - - HCO3- transporter family; This family contains Band 3 anion exchange proteins that exchange CL-/HCO3-. This family also includes cotransporters of Na+/HCO3-. Q#24760 - CGI_10014158 superfamily 192379 1381 1695 8.98E-57 201.906 cl10764 Hyccin superfamily - - "Hyccin; Members of this family of proteins may have a role in the beta-catenin-Tcf/Lef signaling pathway, as well as in the process of myelination of the central and peripheral nervous system. Defects in Hyccin are the cause of hypomyelination with congenital cataracts. This disorder is characterized by congenital cataracts, progressive neurologic impairment, and diffuse myelin deficiency. 
Affected individuals experience progressive pyramidal and cerebellar dysfunction, muscle weakness and wasting prevailing in the lower limbs." Q#24760 - CGI_10014158 superfamily 241651 597 697 0.00015517 42.236 cl00163 PTS_IIA_fru superfamily N - "PTS_IIA, PTS system, fructose/mannitol specific IIA subunit. The bacterial phosphoenolpyruvate: sugar phosphotransferase system (PTS) is a multi-protein system involved in the regulation of a variety of metabolic and transcriptional processes. This family is one of four structurally and functionally distinct group IIA PTS system cytoplasmic enzymes, necessary for the uptake of carbohydrates across the cytoplasmic membrane and their phosphorylation." Q#24761 - CGI_10014159 superfamily 241640 5 130 7.44E-23 97.7322 cl00149 Tryp_SPc superfamily C - Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues. Q#24762 - CGI_10014160 superfamily 220695 44 168 3.83E-05 43.7215 cl18571 7TM_GPCR_Srx superfamily C - Serpentine type 7TM GPCR chemoreceptor Srx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srx is part of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#24763 - CGI_10014161 superfamily 241640 42 275 1.45E-75 232.552 cl00149 Tryp_SPc superfamily - - Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues. 
Q#24764 - CGI_10014162 superfamily 245864 55 523 6.74E-86 275.311 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#24765 - CGI_10014163 superfamily 241567 106 159 5.78E-12 60.0214 cl00042 CASc superfamily C - "Caspase, interleukin-1 beta converting enzyme (ICE) homologues; Cysteine-dependent aspartate-directed proteases that mediate programmed cell death (apoptosis). Caspases are synthesized as inactive zymogens and activated by proteolysis of the peptide backbone adjacent to an aspartate. The resulting two subunits associate to form an (alpha)2(beta)2-tetramer which is the active enzyme. Activation of caspases can be mediated by other caspase homologs." Q#24767 - CGI_10019601 superfamily 248097 318 448 3.64E-16 74.609 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. 
Q#24771 - CGI_10019605 superfamily 247725 140 250 2.60E-49 168.623 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting the protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and other proteins." Q#24771 - CGI_10019605 superfamily 219977 12 88 7.65E-13 65.384 cl18539 Vps51 superfamily - - "Vps51/Vps67; This family includes a presumed domain found in a number of components of vesicular transport. The VFT tethering complex (also known as GARP complex, Golgi associated retrograde protein complex, Vps53 tethering complex) is a conserved eukaryotic docking complex which is involved in recycling of proteins from endosomes to the late Golgi. Vps51 (also known as Vps67) is a subunit of VFT and interacts with the SNARE Tlg1. Cog1_N is the N-terminus of the Cog1 subunit of the eight-unit Conserved Oligomeric Golgi (COG) complex that participates in retrograde vesicular transport and is required to maintain normal Golgi structure and function. The subunits are located in two lobes and Cog1 serves to bind the two lobes together probably via the highly conserved N-terminal domain of approximately 85 residues." Q#24772 - CGI_10019606 superfamily 241607 45 79 1.69E-06 42.257 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chymotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. 
The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters), which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#24772 - CGI_10019606 superfamily 241607 86 120 1.93E-06 41.8718 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chymotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. 
The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters), which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#24775 - CGI_10019609 superfamily 247723 119 195 2.62E-47 159.089 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." 
Q#24775 - CGI_10019609 superfamily 247723 204 273 8.24E-45 152.036 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#24777 - CGI_10019611 superfamily 218077 381 429 8.40E-09 52.3362 cl04505 DUF543 superfamily - - Domain of unknown function (DUF543); This family of short eukaryotic proteins has no known function. Most of the members of this family are only 80 amino acid residues long. However the Arabidopsis homologue is over 300 residues long. The presumed domain contains a conserved amino terminal cysteine and a conserved motif GXGXGXG in the carboxy terminal half that may be functionally important. Q#24778 - CGI_10019612 superfamily 221898 2993 3091 1.65E-19 87.2658 cl16030 DUF3883 superfamily - - Domain of unknown function (DUF3883); This domain is uncharacterized. It is found on restriction endonucleases. Q#24780 - CGI_10019614 superfamily 241596 21 46 0.000835082 36.8083 cl00081 HLH superfamily N - "Helix-loop-helix domain, found in specific DNA-binding proteins that act as transcription factors; 60-100 amino acids long. 
A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE (5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#24780 - CGI_10019614 superfamily 243123 60 98 1.07E-13 64.8869 cl02638 Hairy_orange superfamily - - "Hairy Orange; The Orange domain is found in the Drosophila proteins Hesr-1, Hairy, and Enhancer of Split. The Orange domain is proposed to mediate specific protein-protein interaction between Hairy and Scute." Q#24782 - CGI_10019616 superfamily 241596 50 107 2.52E-10 56.0683 cl00081 HLH superfamily - - "Helix-loop-helix domain, found in specific DNA-binding proteins that act as transcription factors; 60-100 amino acids long. 
A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE (5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#24782 - CGI_10019616 superfamily 243123 125 159 6.03E-11 57.5681 cl02638 Hairy_orange superfamily - - "Hairy Orange; The Orange domain is found in the Drosophila proteins Hesr-1, Hairy, and Enhancer of Split. The Orange domain is proposed to mediate specific protein-protein interaction between Hairy and Scute." Q#24783 - CGI_10019617 superfamily 192535 21 213 0.00355704 37.1902 cl18179 7TM_GPCR_Srsx superfamily C - Serpentine type 7TM GPCR chemoreceptor Srsx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srsx is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#24785 - CGI_10019619 superfamily 201806 122 595 5.88E-92 296.703 cl18220 Peptidase_M8 superfamily N - Leishmanolysin; Leishmanolysin. 
Q#24789 - CGI_10019623 superfamily 222150 643 666 1.10E-05 43.5345 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#24789 - CGI_10019623 superfamily 222150 560 584 1.84E-05 42.7641 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#24789 - CGI_10019623 superfamily 222150 671 696 0.00014493 40.4529 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#24789 - CGI_10019623 superfamily 222150 588 613 0.00249598 36.6009 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#24789 - CGI_10019623 superfamily 222150 535 557 0.0053565 35.8305 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#24790 - CGI_10019624 superfamily 222150 363 387 1.19E-05 43.1493 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#24790 - CGI_10019624 superfamily 222150 446 469 1.33E-05 42.7641 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#24790 - CGI_10019624 superfamily 222150 474 498 0.00471542 35.4453 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#24791 - CGI_10019625 superfamily 218364 383 733 8.29E-61 212.542 cl04875 PigN superfamily - - Phosphatidylinositolglycan class N (PIG-N); Phosphatidylinositolglycan class N (PIG-N) is a mammalian homologue of the yeast protein MCD4P and is expressed in the endoplasmic reticulum. PIG-N is essential for glycosylphosphatidylinositol anchor synthesis. Glycosylphosphatidylinositol (GPI)-anchored proteins are cell surface-localised proteins that serve many important cellular functions. Q#24791 - CGI_10019625 superfamily 248020 143 308 0.000129862 43.6072 cl17466 Sulfatase superfamily N - Sulfatase; Sulfatase. 
Q#24793 - CGI_10019627 superfamily 152053 423 436 0.00398407 35.5839 cl13123 Cu-binding_MopE superfamily N - "Protein metal binding site; This family of proteins represents a unique protein copper binding site that involves a tryptophan metabolite, kynurenine in the protein MopE. The production of kynurenine by modification of tryptophan and its involvement in copper binding is an innate property of MopE." Q#24797 - CGI_10004029 superfamily 243072 1141 1263 1.47E-34 130.581 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#24797 - CGI_10004029 superfamily 243072 945 1054 3.83E-19 85.8982 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#24797 - CGI_10004029 superfamily 243072 995 1164 2.03E-16 78.1942 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#24797 - CGI_10004029 superfamily 243072 1238 1323 4.67E-07 50.0746 cl02529 ANK superfamily C - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#24798 - CGI_10004030 superfamily 247684 13 435 1.37E-103 321.149 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). 
Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#24799 - CGI_10002337 superfamily 216686 3 123 9.50E-20 82.3709 cl18377 Galactosyl_T superfamily N - "Galactosyltransferase; This family includes the galactosyltransferases UDP-galactose:2-acetamido-2-deoxy-D-glucose3beta-galactosyltransferase and UDP-Gal:beta-GlcNAc beta 1,3-galactosyltransferase. Specific galactosyltransferases transfer galactose to GlcNAc terminal chains in the synthesis of the lacto-series oligosaccharides types 1 and 2." Q#24802 - CGI_10007972 superfamily 218753 85 149 8.04E-20 88.1092 cl05390 Tcp11 superfamily C - T-complex protein 11; This family consists of several eukaryotic T-complex protein 11 (Tcp11) related sequences. Tcp11 is only expressed in fertile adult mammalian testes and is thought to be important in sperm function and fertility. The family also contains the yeast Sok1 protein which is known to suppress cyclic AMP-dependent protein kinase mutants. Q#24803 - CGI_10007973 superfamily 246723 13 369 1.27E-157 457.796 cl14813 GluZincin superfamily - - "Peptidase Gluzincin family (thermolysin-like proteinases, TLPs) includes peptidases M1, M2, M3, M4, M13, M32 and M36 (fungalysins); Gluzincin family (thermolysin-like peptidases or TLPs) includes several zinc-dependent metallopeptidases such as the M1, M2, M3, M4, M13, M32, M36 peptidases (MEROPS classification), and contain HEXXH and EXXXD motifs as part of their active site. All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. M1 family includes aminopeptidase N (APN) and leukotriene A4 hydrolase (LTA4H). 
APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types. LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity such that the two activities occupy different, but overlapping sites. The peptidase M3 or neurolysin-like family, includes M3, M2 and M32 metallopeptidases. The M3 peptidases have two subfamilies: M3A, includes thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (3.4.24.16), and the mitochondrial intermediate peptidase; M3B contains oligopeptidase F. M2 peptidase angiotensin converting enzyme (ACE, EC 3.4.15.1) catalyzes the conversion of decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II. ACE is a key part of the renin-angiotensin system that regulates blood pressure, thus ACE inhibitors are important for the treatment of hypertension. M32 family includes two eukaryotic enzymes from protozoa Trypanosoma cruzi, a causative agent of Chagas' disease, and Leishmania major, a parasite that causes leishmaniasis, making them attractive targets for drug development. The M4 family includes secreted protease thermolysin (EC 3.4.24.27), pseudolysin, aureolysin, neutral protease as well as fungalysin and bacillolysin (EC 3.4.24.28) that degrade extracellular proteins and peptides for bacterial nutrition, especially prior to sporulation. Thermolysin is widely used as a nonspecific protease to obtain fragments for peptide sequencing as well as in production of the artificial sweetener aspartame. M13 family includes neprilysin (EC 3.4.24.11) and endothelin-converting enzyme I (ECE-1, EC 3.4.24.71), which fulfill a broad range of physiological roles due to the greater variation in the S2' subsite allowing substrate specificity and are prime therapeutic targets for selective inhibition. Peptidase M36 (fungamysin) family includes endopeptidases from pathogenic fungi. 
Fungalysin hydrolyzes extracellular matrix proteins such as elastin and keratin. Aspergillus fumigatus causes the pulmonary disease aspergillosis by invading the lungs of immuno-compromised animals and secreting fungalysin that possibly breaks down proteinaceous structural barriers." Q#24804 - CGI_10007974 superfamily 246723 13 369 1.25E-152 449.707 cl14813 GluZincin superfamily - - "Peptidase Gluzincin family (thermolysin-like proteinases, TLPs) includes peptidases M1, M2, M3, M4, M13, M32 and M36 (fungalysins); Gluzincin family (thermolysin-like peptidases or TLPs) includes several zinc-dependent metallopeptidases such as the M1, M2, M3, M4, M13, M32, M36 peptidases (MEROPS classification), and contain HEXXH and EXXXD motifs as part of their active site. All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. M1 family includes aminopeptidase N (APN) and leukotriene A4 hydrolase (LTA4H). APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types. LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity such that the two activities occupy different, but overlapping sites. The peptidase M3 or neurolysin-like family, includes M3, M2 and M32 metallopeptidases. The M3 peptidases have two subfamilies: M3A, includes thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (3.4.24.16), and the mitochondrial intermediate peptidase; M3B contains oligopeptidase F. M2 peptidase angiotensin converting enzyme (ACE, EC 3.4.15.1) catalyzes the conversion of decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II. ACE is a key part of the renin-angiotensin system that regulates blood pressure, thus ACE inhibitors are important for the treatment of hypertension. 
M32 family includes two eukaryotic enzymes from protozoa Trypanosoma cruzi, a causative agent of Chagas' disease, and Leishmania major, a parasite that causes leishmaniasis, making them attractive targets for drug development. The M4 family includes secreted protease thermolysin (EC 3.4.24.27), pseudolysin, aureolysin, neutral protease as well as fungalysin and bacillolysin (EC 3.4.24.28) that degrade extracellular proteins and peptides for bacterial nutrition, especially prior to sporulation. Thermolysin is widely used as a nonspecific protease to obtain fragments for peptide sequencing as well as in production of the artificial sweetener aspartame. M13 family includes neprilysin (EC 3.4.24.11) and endothelin-converting enzyme I (ECE-1, EC 3.4.24.71), which fulfill a broad range of physiological roles due to the greater variation in the S2' subsite allowing substrate specificity and are prime therapeutic targets for selective inhibition. Peptidase M36 (fungamysin) family includes endopeptidases from pathogenic fungi. Fungalysin hydrolyzes extracellular matrix proteins such as elastin and keratin. Aspergillus fumigatus causes the pulmonary disease aspergillosis by invading the lungs of immuno-compromised animals and secreting fungalysin that possibly breaks down proteinaceous structural barriers." 
Q#24806 - CGI_10007976 superfamily 241593 27 145 7.18E-12 63.4601 cl00075 HATPase_c superfamily - - "Histidine kinase-like ATPases; This family includes several ATP-binding proteins for example: histidine kinase, DNA gyrase B, topoisomerases, heat shock protein HSP90, phytochrome-like ATPases and DNA mismatch repair proteins" Q#24808 - CGI_10002521 superfamily 241613 101 133 1.67E-08 52.5942 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#24808 - CGI_10002521 superfamily 241613 226 260 1.72E-08 52.5942 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in 
establishing and maintaining the modular structure" Q#24808 - CGI_10002521 superfamily 241613 187 218 7.11E-08 50.6682 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#24808 - CGI_10002521 superfamily 241613 268 300 2.88E-05 42.9642 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#24808 - CGI_10002521 superfamily 241613 141 174 0.00108546 38.3418 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol 
metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#24808 - CGI_10002521 superfamily 214531 647 689 2.43E-12 63.7748 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#24808 - CGI_10002521 superfamily 215683 623 664 2.79E-12 63.3431 cl18339 Ldl_recept_b superfamily - - Low-density lipoprotein receptor repeat class B; This domain is also known as the YWTD motif after the most conserved region of the repeat. The YWTD repeat is found in multiple tandem repeats and has been predicted to form a beta-propeller structure. Q#24808 - CGI_10002521 superfamily 214531 560 599 4.86E-09 54.1449 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#24808 - CGI_10002521 superfamily 215683 581 620 8.03E-05 41.7719 cl18339 Ldl_recept_b superfamily - - Low-density lipoprotein receptor repeat class B; This domain is also known as the YWTD motif after the most conserved region of the repeat. 
The YWTD repeat is found in multiple tandem repeats and has been predicted to form a beta-propeller structure. Q#24808 - CGI_10002521 superfamily 241613 345 379 0.0082257 35.7346 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#24810 - CGI_10002523 superfamily 244906 413 480 5.16E-18 78.7212 cl08315 CAP_GLY superfamily - - "CAP-Gly domain; Cytoskeleton-associated proteins (CAPs) are involved in the organisation of microtubules and transportation of vesicles and organelles along the cytoskeletal network. A conserved motif, CAP-Gly, has been identified in a number of CAPs, including CLIP-170 and dynactins. The crystal structure of Caenorhabditis elegans F53F4.3 protein CAP-Gly domain was recently solved. The domain contains three beta-strands. The most conserved sequence, GKNDG, is located in two consecutive sharp turns on the surface, forming the entrance to a groove." Q#24811 - CGI_10002524 superfamily 217685 234 386 3.36E-18 81.6116 cl04225 Cu2_monoox_C superfamily - - "Copper type II ascorbate-dependent monooxygenase, C-terminal domain; The N and C-terminal domains of members of this family adopt the same PNGase F-like fold." 
Q#24811 - CGI_10002524 superfamily 216290 96 213 1.75E-09 55.3722 cl03089 Cu2_monooxygen superfamily - - "Copper type II ascorbate-dependent monooxygenase, N-terminal domain; The N and C-terminal domains of members of this family adopt the same PNGase F-like fold." Q#24818 - CGI_10007573 superfamily 241574 136 304 3.89E-58 198.194 cl00053 PTPc superfamily N - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#24818 - CGI_10007573 superfamily 241574 306 433 4.71E-06 47.1953 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." 
Q#24818 - CGI_10007573 superfamily 238012 677 718 0.00290467 36.5634 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#24821 - CGI_10000563 superfamily 243061 76 176 9.04E-34 117.058 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#24821 - CGI_10000563 superfamily 243061 25 73 0.000125486 38.0918 cl02509 SRCR superfamily N - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#24822 - CGI_10008482 superfamily 242889 221 289 6.46E-09 53.3999 cl02111 PCI superfamily C - "PCI domain; This domain has also been called the PINT motif (Proteasome, Int-6, Nip-1 and TRIP-15)." Q#24826 - CGI_10008486 superfamily 241563 68 109 3.09E-06 44.7776 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#24826 - CGI_10008486 superfamily 241563 28 59 0.00187435 36.6884 cl00034 BBOX superfamily N - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#24827 - CGI_10008487 superfamily 245208 1 371 0 693.718 cl09933 ACAD superfamily - - "Acyl-CoA dehydrogenase; Both mitochondrial acyl-CoA dehydrogenases (ACAD) and peroxisomal acyl-CoA oxidases (AXO) catalyze the alpha,beta dehydrogenation of the corresponding trans-enoyl-CoA by FAD, which becomes reduced. The reduced form of ACAD is reoxidized in the oxidative half-reaction by electron-transferring flavoprotein (ETF), from which the electrons are transferred to the mitochondrial respiratory chain coupled with ATP synthesis. In contrast, AXO catalyzes a different oxidative half-reaction, in which the reduced FAD is reoxidized by molecular oxygen. The ACAD family includes the eukaryotic beta-oxidation enzymes, short (SCAD), medium (MCAD), long (LCAD) and very-long (VLCAD) chain acyl-CoA dehydrogenases. These enzymes all share high sequence similarity, but differ in their substrate specificities. The ACAD family also includes amino acid catabolism enzymes such as Isovaleryl-CoA dehydrogenase (IVD), short/branched chain acyl-CoA dehydrogenases(SBCAD), Isobutyryl-CoA dehydrogenase (IBDH), glutaryl-CoA deydrogenase (GCD) and Crotonobetainyl-CoA dehydrogenase. The mitochondrial ACAD's are generally homotetramers, except for VLCAD, which is a homodimer. 
Related enzymes include the SOS adaptive response protein aidB, Naphthocyclinone hydroxylase (NcnH), and Dibenzothiophene (DBT) desulfurization enzyme C (DszC)" Q#24829 - CGI_10008489 superfamily 241642 181 236 3.59E-09 51.7274 cl00152 t_SNARE superfamily - - "Soluble NSF (N-ethylmaleimide-sensitive fusion protein)-Attachment protein (SNAP) REceptor domain; these alpha-helical motifs form twisted and parallel heterotetrameric helix bundles; the core complex contains one helix from a protein that is anchored in the vesicle membrane (synaptobrevin), one helix from a protein of the target membrane (syntaxin), and two helices from another protein anchored in the target membrane (SNAP-25); their interaction forms a core which is composed of a polar zero layer, a flanking leucine-zipper layer acts as a water tight shield to isolate ionic interactions in the zero layer from the surrounding solvent" Q#24830 - CGI_10008490 superfamily 190308 36 172 2.25E-53 184.83 cl18163 Fringe superfamily N - "Fringe-like; The drosophila protein fringe (FNG) is a glucosaminyltransferase that controls the response of the Notch receptor to specific ligands. FNG is localised to the Golgi apparatus (not secreted as previously thought). Modification of Notch occurs through glycosylation by FNG. The xenopus homologue, lunatic fringe, has been implicated in a variety of functions." Q#24830 - CGI_10008490 superfamily 241563 200 241 0.0058701 35.5328 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#24836 - CGI_10001059 superfamily 220679 15 133 2.59E-09 52.3293 cl18567 Methyltransf_16 superfamily N - Putative methyltransferase; Putative methyltransferase. 
Q#24844 - CGI_10001533 superfamily 241566 213 258 2.07E-11 58.6576 cl00040 C1 superfamily - - "Protein kinase C conserved region 1 (C1) . Cysteine-rich zinc binding domain. Some members of this domain family bind phorbol esters and diacylglycerol, some are reported to bind RasGTP. May occur in tandem arrangement. Diacylglycerol (DAG) is a second messenger, released by activation of Phospholipase D. Phorbol Esters (PE) can act as analogues of DAG and mimic its downstream effects in, for example, tumor promotion. Protein Kinases C are activated by DAG/PE, this activation is mediated by their N-terminal conserved region (C1). DAG/PE binding may be phospholipid dependent. C1 domains may also mediate DAG/PE signals in chimaerins (a family of Rac GTPase activating proteins), RasGRPs (exchange factors for Ras/Rap1), and Munc13 isoforms (scaffolding proteins involved in exocytosis)." Q#24844 - CGI_10001533 superfamily 241645 130 196 2.11E-12 62.0314 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#24845 - CGI_10001534 superfamily 248097 3 122 4.82E-13 61.127 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. 
Q#24848 - CGI_10024616 superfamily 246936 786 864 4.65E-14 70.7441 cl15354 CBS_pair superfamily N - "The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase)." Q#24848 - CGI_10024616 superfamily 246936 697 752 0.000176894 41.8541 cl15354 CBS_pair superfamily C - "The CBS domain, named after human CBS, is a small domain originally identified in cystathionine beta-synthase and is subsequently found in a wide range of different proteins. CBS domains usually occur in tandem repeats. They associate to form a so-called Bateman domain or a CBS pair based on crystallographic studies in bacteria. The CBS pair was used as a basis for this cd hierarchy since the human CBS proteins can adopt the typical core structure and form an intramolecular CBS pair. 
The interface between the two CBS domains forms a cleft that is a potential ligand binding site. The CBS pair coexists with a variety of other functional domains and this has been used to help in its classification here. It has been proposed that the CBS domain may play a regulatory role, although its exact function is unknown. Mutations of conserved residues within this domain are associated with a variety of human hereditary diseases, including congenital myotonia, idiopathic generalized epilepsy, hypercalciuric nephrolithiasis, and classic Bartter syndrome (CLC chloride channel family members), Wolff-Parkinson-White syndrome (gamma 2 subunit of AMP-activated protein kinase), retinitis pigmentosa (IMP dehydrogenase-1), and homocystinuria (cystathionine beta-synthase)." Q#24848 - CGI_10024616 superfamily 243238 109 460 7.54E-63 223.684 cl02915 Voltage_gated_ClC superfamily C - "CLC voltage-gated chloride channel. The ClC chloride channels catalyse the selective flow of Cl- ions across cell membranes, thereby regulating electrical excitation in skeletal muscle and the flow of salt and water across epithelial barriers. This domain is found in the halogen ions (Cl-, Br- and I-) transport proteins of the ClC family. The ClC channels are found in all three kingdoms of life and perform a variety of functions including cellular excitability regulation, cell volume regulation, membrane potential stabilization, acidification of intracellular organelles, signal transduction, transepithelial transport in animals, and the extreme acid resistance response in eubacteria. They lack any structural or sequence similarity to other known ion channels and exhibit unique properties of ion permeation and gating. Unlike cation-selective ion channels, which form oligomers containing a single pore along the axis of symmetry, the ClC channels form two-pore homodimers with one pore per subunit without axial symmetry. 
Although lacking the typical voltage-sensor found in cation channels, all studied ClC channels are gated (opened and closed) by transmembrane voltage. The gating is conferred by the permeating ion itself, acting as the gating charge. In addition, eukaryotic and some prokaryotic ClC channels have two additional C-terminal CBS (cystathionine beta synthase) domains of putative regulatory function." Q#24848 - CGI_10024616 superfamily 243238 540 672 3.31E-41 159.741 cl02915 Voltage_gated_ClC superfamily N - "CLC voltage-gated chloride channel. The ClC chloride channels catalyse the selective flow of Cl- ions across cell membranes, thereby regulating electrical excitation in skeletal muscle and the flow of salt and water across epithelial barriers. This domain is found in the halogen ions (Cl-, Br- and I-) transport proteins of the ClC family. The ClC channels are found in all three kingdoms of life and perform a variety of functions including cellular excitability regulation, cell volume regulation, membrane potential stabilization, acidification of intracellular organelles, signal transduction, transepithelial transport in animals, and the extreme acid resistance response in eubacteria. They lack any structural or sequence similarity to other known ion channels and exhibit unique properties of ion permeation and gating. Unlike cation-selective ion channels, which form oligomers containing a single pore along the axis of symmetry, the ClC channels form two-pore homodimers with one pore per subunit without axial symmetry. Although lacking the typical voltage-sensor found in cation channels, all studied ClC channels are gated (opened and closed) by transmembrane voltage. The gating is conferred by the permeating ion itself, acting as the gating charge. In addition, eukaryotic and some prokaryotic ClC channels have two additional C-terminal CBS (cystathionine beta synthase) domains of putative regulatory function." 
Q#24851 - CGI_10024619 superfamily 217293 1 157 7.83E-32 119.66 cl03788 Neur_chan_LBD superfamily N - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#24851 - CGI_10024619 superfamily 202474 166 250 1.25E-21 90.7908 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#24851 - CGI_10024619 superfamily 202474 295 326 3.10E-05 43.4113 cl08379 Neur_chan_memb superfamily N - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#24852 - CGI_10024620 superfamily 247684 76 499 2.16E-95 301.889 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). 
Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#24854 - CGI_10024622 superfamily 215648 140 366 9.02E-23 96.8959 cl02802 7tm_3 superfamily - - "7 transmembrane sweet-taste receptor of 3 GCPR; This is a domain of seven transmembrane regions that forms the C-terminus of some subclass 3 G-coupled-protein receptors. It is often associated with a downstream cysteine-rich linker domain, NCD3G pfam07562, which is the human sweet-taste receptor, and the N-terminal domain, ANF_receptor pfam01094. The seven TM regions assemble in such a way as to produce a docking pocket into which such molecules as cyclamate and lactisole have been found to bind and consequently confer the taste of sweetness." Q#24855 - CGI_10024623 superfamily 241733 10 85 1.87E-49 152.784 cl00259 Sm_like superfamily - - "Sm and related proteins; The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes." Q#24856 - CGI_10024624 superfamily 247725 1459 1581 7.35E-64 216.047 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. 
They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#24856 - CGI_10024624 superfamily 247725 2128 2265 2.89E-45 163.171 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#24856 - CGI_10024624 superfamily 243096 1950 2123 6.15E-42 155.148 cl02571 RhoGEF superfamily - - Guanine nucleotide exchange factor for Rho/Rac/Cdc42-like GTPases; Also called Dbl-homologous (DH) domain. It appears that PH domains invariably occur C-terminal to RhoGEF/DH domains. Q#24856 - CGI_10024624 superfamily 243096 1258 1427 1.63E-39 148.215 cl02571 RhoGEF superfamily - - Guanine nucleotide exchange factor for Rho/Rac/Cdc42-like GTPases; Also called Dbl-homologous (DH) domain. It appears that PH domains invariably occur C-terminal to RhoGEF/DH domains. 
Q#24856 - CGI_10024624 superfamily 243054 543 761 1.02E-19 90.9679 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#24856 - CGI_10024624 superfamily 241584 2622 2714 1.47E-15 75.6107 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." 
Q#24856 - CGI_10024624 superfamily 243054 657 874 1.17E-12 69.3967 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#24856 - CGI_10024624 superfamily 243054 877 1107 2.26E-12 68.2411 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#24856 - CGI_10024624 superfamily 247069 49 178 4.47E-10 60.8618 cl15787 SEC14 superfamily - - "Sec14p-like lipid-binding domain. Found in secretory proteins, such as S. cerevisiae phosphatidylinositol transfer protein (Sec14p), and in lipid regulated proteins such as RhoGAPs, RhoGEFs and neurofibromin (NF1). SEC14 domain of Dbl is known to associate with G protein beta/gamma subunits." 
Q#24856 - CGI_10024624 superfamily 243054 320 541 1.53E-08 56.3 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#24856 - CGI_10024624 superfamily 243054 1064 1215 3.56E-08 55.1444 cl02488 SPEC superfamily N - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#24856 - CGI_10024624 superfamily 243054 194 415 5.36E-08 54.7592 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) 
at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#24856 - CGI_10024624 superfamily 245201 2739 2993 3.94E-52 188.171 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#24856 - CGI_10024624 superfamily 247683 2410 2465 5.84E-24 99.0515 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." 
Q#24856 - CGI_10024624 superfamily 247683 1661 1723 2.98E-19 85.5253 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#24856 - CGI_10024624 superfamily 245814 2535 2618 7.19E-15 73.6936 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#24857 - CGI_10024625 superfamily 218703 4 215 6.19E-85 253.923 cl05325 BCAS2 superfamily - - Breast carcinoma amplified sequence 2 (BCAS2); This family consists of several eukaryotic sequences of unknown function. 
The mammalian members of this family are annotated as breast carcinoma amplified sequence 2 (BCAS2) proteins. BCAS2 is a putative spliceosome associated protein. Q#24860 - CGI_10024628 superfamily 248458 158 291 5.22E-19 87.3693 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." 
Q#24860 - CGI_10024628 superfamily 248458 387 525 6.90E-06 46.9233 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." 
Q#24861 - CGI_10024629 superfamily 246723 97 515 1.81E-110 350.065 cl14813 GluZincin superfamily C - "Peptidase Gluzincin family (thermolysin-like proteinases, TLPs) includes peptidases M1, M2, M3, M4, M13, M32 and M36 (fungalysins); Gluzincin family (thermolysin-like peptidases or TLPs) includes several zinc-dependent metallopeptidases such as the M1, M2, M3, M4, M13, M32, M36 peptidases (MEROPS classification), and contain HEXXH and EXXXD motifs as part of their active site. All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. M1 family includes aminopeptidase N (APN) and leukotriene A4 hydrolase (LTA4H). APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types. LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity such that the two activities occupy different, but overlapping sites. The peptidase M3 or neurolysin-like family, includes M3, M2 and M32 metallopeptidases. The M3 peptidases have two subfamilies: M3A, includes thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (3.4.24.16), and the mitochondrial intermediate peptidase; M3B contains oligopeptidase F. M2 peptidase angiotensin converting enzyme (ACE, EC 3.4.15.1) catalyzes the conversion of decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II. ACE is a key part of the renin-angiotensin system that regulates blood pressure, thus ACE inhibitors are important for the treatment of hypertension. M32 family includes two eukaryotic enzymes from protozoa Trypanosoma cruzi, a causative agent of Chagas' disease, and Leishmania major, a parasite that causes leishmaniasis, making them attractive targets for drug development. 
The M4 family includes secreted protease thermolysin (EC 3.4.24.27), pseudolysin, aureolysin, neutral protease as well as fungalysin and bacillolysin (EC 3.4.24.28) that degrade extracellular proteins and peptides for bacterial nutrition, especially prior to sporulation. Thermolysin is widely used as a nonspecific protease to obtain fragments for peptide sequencing as well as in production of the artificial sweetener aspartame. M13 family includes neprilysin (EC 3.4.24.11) and endothelin-converting enzyme I (ECE-1, EC 3.4.24.71), which fulfill a broad range of physiological roles due to the greater variation in the S2' subsite allowing substrate specificity and are prime therapeutic targets for selective inhibition. Peptidase M36 (fungamysin) family includes endopeptidases from pathogenic fungi. Fungalysin hydrolyzes extracellular matrix proteins such as elastin and keratin. Aspergillus fumigatus causes the pulmonary disease aspergillosis by invading the lungs of immuno-compromised animals and secreting fungalysin that possibly breaks down proteinaceous structural barriers." Q#24861 - CGI_10024629 superfamily 246723 529 711 3.57E-88 290.744 cl14813 GluZincin superfamily N - "Peptidase Gluzincin family (thermolysin-like proteinases, TLPs) includes peptidases M1, M2, M3, M4, M13, M32 and M36 (fungalysins); Gluzincin family (thermolysin-like peptidases or TLPs) includes several zinc-dependent metallopeptidases such as the M1, M2, M3, M4, M13, M32, M36 peptidases (MEROPS classification), and contain HEXXH and EXXXD motifs as part of their active site. All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. M1 family includes aminopeptidase N (APN) and leukotriene A4 hydrolase (LTA4H). 
APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types. LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity such that the two activities occupy different, but overlapping sites. The peptidase M3 or neurolysin-like family, includes M3, M2 and M32 metallopeptidases. The M3 peptidases have two subfamilies: M3A, includes thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (3.4.24.16), and the mitochondrial intermediate peptidase; M3B contains oligopeptidase F. M2 peptidase angiotensin converting enzyme (ACE, EC 3.4.15.1) catalyzes the conversion of decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II. ACE is a key part of the renin-angiotensin system that regulates blood pressure, thus ACE inhibitors are important for the treatment of hypertension. M32 family includes two eukaryotic enzymes from protozoa Trypanosoma cruzi, a causative agent of Chagas' disease, and Leishmania major, a parasite that causes leishmaniasis, making them attractive targets for drug development. The M4 family includes secreted protease thermolysin (EC 3.4.24.27), pseudolysin, aureolysin, neutral protease as well as fungalysin and bacillolysin (EC 3.4.24.28) that degrade extracellular proteins and peptides for bacterial nutrition, especially prior to sporulation. Thermolysin is widely used as a nonspecific protease to obtain fragments for peptide sequencing as well as in production of the artificial sweetener aspartame. M13 family includes neprilysin (EC 3.4.24.11) and endothelin-converting enzyme I (ECE-1, EC 3.4.24.71), which fulfill a broad range of physiological roles due to the greater variation in the S2' subsite allowing substrate specificity and are prime therapeutic targets for selective inhibition. Peptidase M36 (fungamysin) family includes endopeptidases from pathogenic fungi. 
Fungalysin hydrolyzes extracellular matrix proteins such as elastin and keratin. Aspergillus fumigatus causes the pulmonary disease aspergillosis by invading the lungs of immuno-compromised animals and secreting fungalysin that possibly breaks down proteinaceous structural barriers." Q#24862 - CGI_10024630 superfamily 220692 5 282 2.14E-17 81.0965 cl18570 7TM_GPCR_Srw superfamily - - Serpentine type 7TM GPCR chemoreceptor Srw; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srw is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. The genes encoding Srw do not appear to be under as strong an adaptive evolutionary pressure as those of Srz. Q#24864 - CGI_10024632 superfamily 241624 72 349 1.14E-76 238.765 cl00120 PP2Cc superfamily - - "Serine/threonine phosphatases, family 2C, catalytic domain; The protein architecture and deduced catalytic mechanism of PP2C phosphatases are similar to the PP1, PP2A, PP2B family of protein Ser/Thr phosphatases, with which PP2C shares no sequence similarity." Q#24865 - CGI_10024633 superfamily 243555 18 207 3.87E-16 77.0462 cl03871 Chitin_bind_3 superfamily - - "Chitin binding domain; This domain is found associated with a wide variety of cellulose binding domain. This domain however is a chitin binding domain. This domain is found in isolation in baculoviral spheroidins and spindolins, protein of unknown function." Q#24866 - CGI_10024634 superfamily 243035 31 136 0.00160833 35.6734 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. 
This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. 
In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#24867 - CGI_10024635 superfamily 247769 247 420 1.06E-13 68.5201 cl17215 HDc superfamily - - Metal dependent phosphohydrolases with conserved 'HD' motif Q#24868 - CGI_10024636 superfamily 243066 39 129 7.56E-21 88.7641 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#24868 - CGI_10024636 superfamily 219619 366 430 2.58E-08 51.8247 cl18518 Ion_trans_2 superfamily - - Ion channel; This family includes the two membrane helix type ion channels found in bacteria. 
Q#24870 - CGI_10024638 superfamily 247916 104 211 9.12E-05 42.0026 cl17362 Transglut_core superfamily - - "Transglutaminase-like superfamily; This family includes animal transglutaminases and other bacterial proteins of unknown function. Sequence conservation in this superfamily primarily involves three motifs that centre around conserved cysteine, histidine, and aspartate residues that form the catalytic triad in the structurally characterized transglutaminase, the human blood clotting factor XIIIa'. On the basis of the experimentally demonstrated activity of the Methanobacterium phage pseudomurein endoisopeptidase, it is proposed that many, if not all, microbial homologues of the transglutaminases are proteases and that the eukaryotic transglutaminases have evolved from an ancestral protease." Q#24872 - CGI_10024640 superfamily 245201 12 318 0 587.545 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#24873 - CGI_10024641 superfamily 248318 980 1033 2.53E-18 80.9429 cl17764 FYVE superfamily - - "FYVE domain; Zinc-binding domain; targets proteins to membrane lipids via interaction with phosphatidylinositol-3-phosphate, PI3P; present in Fab1, YOTB, Vac1, and EEA1;" Q#24873 - CGI_10024641 superfamily 247725 5 147 8.04E-46 162.84 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. 
All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#24873 - CGI_10024641 superfamily 219103 190 338 1.86E-41 149.059 cl05893 Myotub-related superfamily - - "Myotubularin-related; This family represents a region within eukaryotic myotubularin-related proteins that is sometimes found with pfam02893. Myotubularin is a dual-specific lipid phosphatase that dephosphorylates phosphatidylinositol 3-phosphate and phosphatidylinositol (3,5)-bi-phosphate. Mutations in gene encoding myotubularin-related proteins have been associated with disease." Q#24873 - CGI_10024641 superfamily 206020 398 452 7.21E-34 125.313 cl18286 Y_phosphatase_m superfamily - - "Myotubularin Y_phosphatase-like; This short region is highly conserved and seems to be common to many myotubularin proteins with protein tyrosine pyrophosphate activity. As the family has a number of highly conserved residues such as histidine, cysteine, glutamine and aspartate, it is possible that this represents a catalytic core of the active enzymatic part of the proteins." Q#24876 - CGI_10024644 superfamily 241611 121 234 0.00188384 36.5976 cl00102 PTX superfamily - - "Pentraxins are plasma proteins characterized by their pentameric discoid assembly and their Ca2+ dependent ligand binding, such as Serum amyloid P component (SAP) and C-reactive Protein (CRP), which are cytokine-inducible acute-phase proteins implicated in innate immunity. CRP binds to ligands containing phosphocholine, SAP binds to amyloid fibrils, DNA, chromatin, fibronectin, C4-binding proteins and glycosaminoglycans. 
"Long" pentraxins have N-terminal extensions to the common pentraxin domain; one group, the neuronal pentraxins, may be involved in synapse formation and remodeling, and they may also be able to form heteromultimers." Q#24877 - CGI_10024645 superfamily 242849 58 110 1.72E-06 45.2725 cl02041 Cyt-b5 superfamily C - Cytochrome b5-like Heme/Steroid binding domain; This family includes heme binding domains from a diverse range of proteins. This family also includes proteins that bind to steroids. The family includes progesterone receptors. Many members of this subfamily are membrane anchored by an N-terminal transmembrane alpha helix. This family also includes a domain in some chitin synthases. There is no known ligand for this domain in the chitin synthases. Q#24877 - CGI_10024645 superfamily 242849 283 331 0.00769479 34.4869 cl02041 Cyt-b5 superfamily C - Cytochrome b5-like Heme/Steroid binding domain; This family includes heme binding domains from a diverse range of proteins. This family also includes proteins that bind to steroids. The family includes progesterone receptors. Many members of this subfamily are membrane anchored by an N-terminal transmembrane alpha helix. This family also includes a domain in some chitin synthases. There is no known ligand for this domain in the chitin synthases. Q#24878 - CGI_10024646 superfamily 241574 472 726 1.55E-104 324.154 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. 
Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#24878 - CGI_10024646 superfamily 246908 209 307 6.00E-56 187.221 cl15255 SH2 superfamily - - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." Q#24878 - CGI_10024646 superfamily 246908 315 414 1.06E-54 184.02 cl15255 SH2 superfamily - - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." Q#24882 - CGI_10024651 superfamily 248458 292 425 4.44E-11 64.2573 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. 
MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#24882 - CGI_10024651 superfamily 248458 10 121 3.76E-09 58.0941 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. 
Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#24883 - CGI_10024652 superfamily 243078 3 119 1.86E-33 121.55 cl02544 VHS_ENTH_ANTH superfamily - - "VHS, ENTH and ANTH domain superfamily; composed of proteins containing a VHS, ENTH or ANTH domain. The VHS domain is present in Vps27 (Vacuolar Protein Sorting), Hrs (Hepatocyte growth factor-regulated tyrosine kinase substrate) and STAM (Signal Transducing Adaptor Molecule). It is located at the N-termini of proteins involved in intracellular membrane trafficking. The epsin N-terminal homology (ENTH) domain is an evolutionarily conserved protein module found primarily in proteins that participate in clathrin-mediated endocytosis. A set of proteins previously designated as harboring an ENTH domain in fact contains a highly similar, yet unique module referred to as an AP180 N-terminal homology (ANTH) domain. 
VHS, ENTH and ANTH domains are structurally similar and are composed of a superhelix of eight alpha helices. ENTH and ANTH (E/ANTH) domains bind both inositol phospholipids and proteins and contribute to the nucleation and formation of clathrin coats on membranes. ENTH domains also function in the development of membrane curvature through lipid remodeling during the formation of clathrin-coated vesicles. E/ANTH domain-bearing proteins have recently been shown to function with adaptor protein-1 and GGA adaptors at the trans-Golgi network, which suggests that E/ANTH domains are universal components of the machinery for clathrin-mediated membrane budding." Q#24884 - CGI_10024653 superfamily 245605 7 265 1.51E-149 422.012 cl11409 RNAP_RPB11_RPB3 superfamily - - "RPB11 and RPB3 subunits of RNA polymerase; The eukaryotic RPB11 and RPB3 subunits of RNA polymerase (RNAP), as well as their archaeal (L and D subunits) and bacterial (alpha subunit) counterparts, are involved in the assembly of RNAP, a large multi-subunit complex responsible for the synthesis of RNA. It is the principal enzyme of the transcription process, and is a final target in many regulatory pathways that control gene expression in all living cells. At least three distinct RNAP complexes are found in eukaryotic nuclei: RNAP I, RNAP II, and RNAP III, for the synthesis of ribosomal RNA precursor, mRNA precursor, and 5S and tRNA, respectively. A single distinct RNAP complex is found in prokaryotes and archaea, which may be responsible for the synthesis of all RNAs. The assembly of the two largest eukaryotic RNAP subunits that provide most of the enzyme's catalytic functions depends on the presence of RPB3/RPB11 heterodimer subunits. This is also true for the archaeal (D/L subunits) and bacterial (alpha subunit) counterparts." Q#24885 - CGI_10024654 superfamily 243073 20 54 9.84E-07 44.7685 cl02533 SOCS superfamily - - "SOCS (suppressors of cytokine signaling) box. 
The SOCS box is found in the C-terminal region of CIS/SOCS family proteins (in combination with a SH2 domain), ASBs (ankyrin repeat-containing proteins with a SOCS box), SSBs (SPRY domain-containing proteins with a SOCS box), and WSBs (WD40 repeat-containing proteins with a SOCS box), as well as, other miscellaneous proteins. The function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions." Q#24885 - CGI_10024654 superfamily 245201 116 298 1.81E-20 87.5538 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." 
Q#24886 - CGI_10024655 superfamily 243092 56 307 8.82E-68 226.447 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#24886 - CGI_10024655 superfamily 222456 570 729 1.21E-55 188.226 cl16480 Katanin_con80 superfamily - - "con80 domain of Katanin; The con80 domain of katanin is the C-terminal region of the protein that binds to the N-terminal domain of katanin-p60, the catalytic ATPase. The complex associates with a specific subregion of the mitotic spindle leading to increased microtubule disassembly and targeting of p60 to the spindle poles. The assembly and function of the mitotic spindle requires the activity of a number of microtubule-binding proteins. Katanin, a heterodimeric microtubule-severing ATPase, is found localized at mitotic spindle poles. 
A proposed model is that katanin is targeted to spindle poles through a combination of direct microtubule binding by the p60 subunit and through interactions between the WD40 domain and an unknown protein." Q#24889 - CGI_10024658 superfamily 243073 102 144 3.40E-06 40.9165 cl02533 SOCS superfamily - - "SOCS (suppressors of cytokine signaling) box. The SOCS box is found in the C-terminal region of CIS/SOCS family proteins (in combination with a SH2 domain), ASBs (ankyrin repeat-containing proteins with a SOCS box), SSBs (SPRY domain-containing proteins with a SOCS box), and WSBs (WD40 repeat-containing proteins with a SOCS box), as well as, other miscellaneous proteins. The function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions." Q#24890 - CGI_10024659 superfamily 243072 49 174 2.85E-35 122.877 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#24892 - CGI_10024661 superfamily 227778 46 196 1.76E-09 53.6587 cl17122 VPS24 superfamily - - Conserved protein implicated in secretion [Cell motility and secretion] Q#24894 - CGI_10024663 superfamily 241600 1 179 7.54E-77 231.36 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. 
The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#24895 - CGI_10001824 superfamily 146263 43 162 1.52E-44 152.844 cl04138 SK_channel superfamily - - Calcium-activated SK potassium channel; Calcium-activated SK potassium channel. Q#24895 - CGI_10001824 superfamily 198825 337 410 2.52E-15 70.9056 cl03763 CaMBD superfamily - - "Calmodulin binding domain; Small-conductance Ca2+-activated K+ channels (SK channels) are independent of voltage and gated solely by intracellular Ca2+. These membrane channels are heteromeric complexes that comprise pore-forming alpha-subunits and the Ca2+-binding protein calmodulin (CaM). CaM binds to the SK channel through the CaM-binding domain (CaMBD), which is located in an intracellular region of the alpha-subunit immediately carboxy-terminal to the pore. Channel opening is triggered when Ca2+ binds the EF hands in the N-lobe of CaM. The structure of this domain complexed with CaM is known. This domain forms an elongated dimer with a CaM molecule bound at each end; each CaM wraps around three alpha-helices, two from one CaMBD subunit and one from the other." Q#24895 - CGI_10001824 superfamily 219619 241 319 1.57E-11 60.2991 cl18518 Ion_trans_2 superfamily - - Ion channel; This family includes the two membrane helix type ion channels found in bacteria. Q#24895 - CGI_10001824 superfamily 191182 387 445 0.00779299 34.9694 cl04917 Nsp1_C superfamily N - Nsp1-like C-terminal region; This family probably forms a coiled-coil. This important region of Nsp1 is involved in binding Nup82. 
Q#24896 - CGI_10001825 superfamily 244837 67 342 1.34E-63 215.631 cl07971 Glyco_hydro_3 superfamily - - Glycosyl hydrolase family 3 N terminal domain; Glycosyl hydrolase family 3 N terminal domain. Q#24896 - CGI_10001825 superfamily 222669 651 718 7.85E-16 73.582 cl17048 Fn3-like superfamily - - Fibronectin type III-like domain; This domain has a fibronectin type III-like structure. It is often found in association with pfam00933 and pfam01915. Its function is unknown. Q#24897 - CGI_10001826 superfamily 241607 64 88 1.36E-05 38.0198 cl00097 KAZAL_FS superfamily C - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chymotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters), which may regulate the specificity of anion uptake. 
The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#24897 - CGI_10001826 superfamily 241607 22 56 7.14E-06 38.8043 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chyomotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#24898 - CGI_10001827 superfamily 241607 191 214 0.000127747 38.0198 cl00097 KAZAL_FS superfamily C - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chyomotrypsin, avian ovomucoids, and elastases. 
The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#24898 - CGI_10001827 superfamily 241607 16 40 0.000150091 37.6346 cl00097 KAZAL_FS superfamily C - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chyomotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. 
Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#24898 - CGI_10001827 superfamily 241607 58 92 0.000259775 36.8642 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chyomotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. 
Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#24898 - CGI_10001827 superfamily 241607 149 183 0.000458642 36.479 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chyomotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. 
The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#24898 - CGI_10001827 superfamily 241607 97 134 0.00325163 33.807 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chyomotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#24899 - CGI_10001526 superfamily 217598 3 107 1.15E-48 163.409 cl04130 KCNQ_channel superfamily N - KCNQ voltage-gated potassium channel; This family matches to the C-terminal tail of KCNQ type potassium channels. 
Q#24901 - CGI_10001528 superfamily 243072 294 423 1.61E-31 117.485 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#24901 - CGI_10001528 superfamily 243072 373 481 3.99E-27 105.543 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#24901 - CGI_10001528 superfamily 243072 178 353 3.51E-09 54.3118 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#24902 - CGI_10001529 superfamily 219789 28 128 1.88E-30 108.122 cl07070 cwf18 superfamily - - cwf18 pre-mRNA splicing factor; The cwf18 family is involved in mRNA splicing. It has been isolated as a subcomplex of the spliceosome in Schizosaccharomyces pombe. 
Q#24903 - CGI_10001981 superfamily 245206 1166 1397 2.67E-50 185.566 cl09931 NADB_Rossmann superfamily N - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#24903 - CGI_10001981 superfamily 244888 16 310 1.01E-68 236.145 cl08282 Acyl_transf_1 superfamily - - Acyl transferase domain; Acyl transferase domain. Q#24903 - CGI_10001981 superfamily 245209 1442 1514 2.79E-06 47.6301 cl09936 PP-binding superfamily - - Phosphopantetheine attachment site; A 4'-phosphopantetheine prosthetic group is attached through a serine. This prosthetic group acts as a 'swinging arm' for the attachment of activated fatty acid and amino-acid groups. This domain forms a four helix bundle. This family includes members not included in Prosite. The inclusion of these members is supported by sequence analysis and functional evidence. The related domain of Vibrio anguillarum angR has the attachment serine replaced by an alanine. Q#24910 - CGI_10002946 superfamily 243072 798 915 1.02E-32 124.418 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. 
The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#24910 - CGI_10002946 superfamily 243072 692 852 3.53E-14 70.8754 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#24910 - CGI_10002946 superfamily 243072 613 747 2.89E-09 56.2378 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#24916 - CGI_10013230 superfamily 241578 903 1064 4.15E-39 144.357 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#24916 - CGI_10013230 superfamily 241578 79 238 9.42E-08 51.9086 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#24916 - CGI_10013230 superfamily 241578 715 885 6.34E-22 95.1476 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. 
The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#24916 - CGI_10013230 superfamily 241578 497 672 2.17E-09 57.0129 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." 
Q#24916 - CGI_10013230 superfamily 241578 298 440 0.000360612 41.123 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#24919 - CGI_10013233 superfamily 241584 915 1011 2.56E-09 56.3507 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#24919 - CGI_10013233 superfamily 241584 832 907 2.49E-08 53.2691 cl00065 FN3 superfamily N - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. 
Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#24919 - CGI_10013233 superfamily 245814 519 594 1.58E-14 71.3824 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#24919 - CGI_10013233 superfamily 245814 431 495 1.17E-11 62.4243 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#24919 - CGI_10013233 superfamily 245814 319 413 3.86E-11 61.6421 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. 
The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#24919 - CGI_10013233 superfamily 245814 228 309 5.91E-11 61.405 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#24919 - CGI_10013233 superfamily 245814 664 753 1.42E-10 60.331 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#24919 - CGI_10013233 superfamily 245814 110 196 4.43E-06 46.6089 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#24919 - CGI_10013233 superfamily 245814 617 661 5.63E-06 45.8605 cl11960 Ig superfamily N - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#24919 - CGI_10013233 superfamily 245814 29 93 7.35E-05 42.394 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. 
A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#24920 - CGI_10013234 superfamily 238012 241 285 6.16E-06 43.1118 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#24920 - CGI_10013234 superfamily 241584 42 96 0.000734441 37.4035 cl00065 FN3 superfamily N - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#24921 - CGI_10013235 superfamily 243072 25 102 2.40E-11 58.549 cl02529 ANK superfamily C - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#24922 - CGI_10013236 superfamily 241583 291 408 4.13E-23 98.8499 cl00064 ZnMc superfamily N - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#24922 - CGI_10013236 superfamily 216572 10 137 1.75E-05 44.1879 cl03265 Pep_M12B_propep superfamily - - Reprolysin family propeptide; This region is the propeptide for members of peptidase family M12B. The propeptide contains a sequence motif similar to the "cysteine switch" of the matrixins. This motif is found at the C terminus of the alignment but is not well aligned. Q#24923 - CGI_10013237 superfamily 241782 112 437 2.23E-112 340.127 cl00321 AAT_I superfamily - - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. 
Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phosphorylase family (fold type V)." Q#24924 - CGI_10013238 superfamily 247724 444 563 1.01E-14 73.1391 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." 
Q#24926 - CGI_10013240 superfamily 246680 147 228 2.59E-09 51.9526 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#24926 - CGI_10013240 superfamily 246680 47 106 5.96E-09 50.8366 cl14633 DD_superfamily superfamily C - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." 
Q#24927 - CGI_10013241 superfamily 243175 136 204 2.45E-16 71.1179 cl02776 GST_C_family superfamily N - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." 
Q#24927 - CGI_10013241 superfamily 241832 2 58 3.57E-18 76.1234 cl00388 Thioredoxin_like superfamily N - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#24928 - CGI_10013242 superfamily 243175 65 168 3.30E-13 61.8732 cl02776 GST_C_family superfamily - - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). 
Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." Q#24928 - CGI_10013242 superfamily 241832 6 45 1.58E-11 56.8634 cl00388 Thioredoxin_like superfamily N - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). 
Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#24932 - CGI_10013248 superfamily 245201 52 299 2.08E-69 230.869 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#24933 - CGI_10013249 superfamily 241565 43 118 0.000457919 40.3827 cl00038 BRCT superfamily - - "Breast Cancer Suppressor Protein (BRCA1), carboxy-terminal domain. The BRCT domain is found within many DNA damage repair and cell cycle checkpoint proteins. The unique diversity of this domain superfamily allows BRCT modules to interact forming homo/hetero BRCT multimers, BRCT-non-BRCT interactions, and interactions within DNA strand breaks." Q#24933 - CGI_10013249 superfamily 241752 307 609 1.77E-47 175.536 cl00283 ADP_ribosyl superfamily - - "ADP_ribosylating enzymes catalyze the transfer of ADP_ribose from NAD+ to substrates. Bacterial toxins are cytoplasmic and catalyze the transfer of a single ADP_ribose unit to eukaryotic elongation factor 2, halting protein synthesis and killing the cell. Poly(ADP-ribose) polymerases (PARPS 1-3, VPARP, tankyrase) catalyze the addition of up to 100 ADP_ribose units from NAD+. PARPs 1 and 2 are localized in the nucleus, bind DNA, and are activated by DNA damage. VPARP is part of the vault ribonucleoprotein complex.
Tankyrases regulate telomere length in part through poly(ADP_ribosylation) of telomere repeat binding factor 1 (TRF1). Poly(ADP-ribose) polymerase catalyses the covalent attachment of ADP-ribose units from NAD+ to itself and to a limited number of other DNA binding proteins, which decreases their affinity for DNA. Poly(ADP-ribose) polymerase is a regulatory component induced by DNA damage. The carboxyl-terminal region is the most highly conserved region of the protein. Experiments have shown that a carboxyl 40 kDa fragment is still catalytically active." Q#24933 - CGI_10013249 superfamily 241578 916 1086 3.59E-30 119.627 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#24933 - CGI_10013249 superfamily 207701 662 779 2.37E-29 115.469 cl02699 VIT superfamily - - Vault protein inter-alpha-trypsin domain; Inter-alpha-trypsin inhibitors (ITIs) consist of one light chain and a variable set of heavy chains.
ITIs play a role in extracellular matrix (ECM) stabilisation and tumour metastasis as well as in plasma protease inhibition. The vault protein inter-alpha-trypsin (VIT) domain described here is found to the N-terminus of a von Willebrand factor type A domain (pfam00092) in ITI heavy chains (ITIHs) and their precursors. Q#24933 - CGI_10013249 superfamily 247684 1262 1321 9.02E-18 85.9071 cl17037 NBD_sugar-kinase_HSP70_actin superfamily C - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#24941 - CGI_10008906 superfamily 248097 287 412 1.73E-31 116.596 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. 
Q#24941 - CGI_10008906 superfamily 248097 144 269 5.12E-27 104.269 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#24941 - CGI_10008906 superfamily 248097 38 105 2.10E-07 48.8006 cl17543 C1q superfamily C - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#24943 - CGI_10008908 superfamily 248097 26 110 4.27E-17 71.9126 cl17543 C1q superfamily N - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#24948 - CGI_10008913 superfamily 247736 1 49 0.000577056 34.1217 cl17182 NAT_SF superfamily C - "N-Acyltransferase superfamily: Various enzymes that characteristically catalyze the transfer of an acyl group to a substrate; NAT (N-Acyltransferase) is a large superfamily of enzymes that mostly catalyze the transfer of an acyl group to a substrate and are implicated in a variety of functions, ranging from bacterial antibiotic resistance to circadian rhythms in mammals. Members include GCN5-related N-Acetyltransferases (GNAT) such as Aminoglycoside N-acetyltransferases, Histone N-acetyltransferase (HAT) enzymes, and Serotonin N-acetyltransferase, which catalyze the transfer of an acetyl group to a substrate. The kinetic mechanism of most GNATs involves the ordered formation of a ternary complex: the reaction begins with Acetyl Coenzyme A (AcCoA) binding, followed by binding of substrate, then direct transfer of the acetyl group from AcCoA to the substrate, followed by product and subsequent CoA release. Other family members include Arginine/ornithine N-succinyltransferase, Myristoyl-CoA: protein N-myristoyltransferase, and Acyl-homoserinelactone synthase which have a similar catalytic mechanism but differ in types of acyl groups transferred. Leucyl/phenylalanyl-tRNA-protein transferase and FemXAB nonribosomal peptidyltransferases which catalyze similar peptidyltransferase reactions are also included." 
Q#24949 - CGI_10008914 superfamily 248264 194 361 8.87E-16 74.9661 cl17710 DDE_4 superfamily - - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#24952 - CGI_10002301 superfamily 243035 89 201 7.84E-05 39.5254 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose.
Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#24953 - CGI_10003409 superfamily 241563 104 139 0.000196434 39.6279 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#24953 - CGI_10003409 superfamily 110440 569 596 0.00461656 35.4613 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. 
The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#24956 - CGI_10010217 superfamily 247792 20 67 6.72E-07 47.0552 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#24956 - CGI_10010217 superfamily 241563 157 199 0.00891025 34.7624 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#24957 - CGI_10010218 superfamily 243072 546 604 2.71E-06 48.1486 cl02529 ANK superfamily C - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats."
Q#24957 - CGI_10010218 superfamily 247724 1283 1459 1.47E-38 144.401 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#24957 - CGI_10010218 superfamily 245201 1897 2052 1.68E-26 111.821 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." 
Q#24957 - CGI_10010218 superfamily 246925 1095 1231 0.000241586 44.2686 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#24958 - CGI_10010219 superfamily 246679 27 150 2.10E-49 159.23 cl14632 Glo_EDI_BRP_like superfamily - - "This domain superfamily is found in a variety of structurally related metalloproteins, including the type I extradiol dioxygenases, glyoxalase I and a group of antibiotic resistance proteins; This domain superfamily is found in a variety of structurally related metalloproteins, including the type I extradiol dioxygenases, glyoxalase I and a group of antibiotic resistance proteins. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). Type I extradiol dioxygenases catalyze the incorporation of both atoms of molecular oxygen into aromatic substrates, which results in the cleavage of aromatic rings. They are key enzymes in the degradation of aromatic compounds. Type I extradiol dioxygenases include class I and class II enzymes. Class I and II enzymes show sequence similarity; the two-domain class II enzymes evolved from a class I enzyme through gene duplication. Glyoxylase I catalyzes the glutathione-dependent inactivation of toxic methylglyoxal, requiring zinc or nickel ions for activity. 
The antibiotic resistance proteins in this family use a variety of mechanisms to block the function of antibiotics. Bleomycin resistance protein (BLMA) sequesters bleomycin's activity by directly binding to it. Whereas, three types of fosfomycin resistance proteins employ different mechanisms to render fosfomycin inactive by modifying the fosfomycin molecule. Although the proteins in this superfamily are functionally distinct, their structures are similar. The difference among the three dimensional structures of the three types of proteins in this superfamily is interesting from an evolutionary perspective. Both glyoxalase I and BLMA show domain swapping between subunits. However, there is no domain swapping for type 1 extradiol dioxygenases." Q#24959 - CGI_10010220 superfamily 186642 3 72 1.53E-21 80.4142 cl05017 UPF0203 superfamily - - Uncharacterized protein family (UPF0203); This family of proteins is functionally uncharacterized. Q#24960 - CGI_10010221 superfamily 241563 81 123 0.00042611 38.6144 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#24961 - CGI_10010222 superfamily 241563 61 99 0.00130414 37.0736 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#24962 - CGI_10010223 superfamily 245226 245 375 3.15E-19 83.1188 cl10012 DnaQ_like_exo superfamily C - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#24966 - CGI_10013610 superfamily 197504 146 261 1.37E-07 48.0545 cl18192 PBPe superfamily - - Eukaryotic homologues of bacterial periplasmic substrate binding proteins; Prokaryotic homologues are represented by a separate alignment: PBPb Q#24967 - CGI_10013611 superfamily 248458 112 462 9.18E-35 133.208 cl17904 MFS superfamily - - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. 
MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#24968 - CGI_10013612 superfamily 241622 4 82 1.11E-17 79.1478 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. 
Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95 (post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#24968 - CGI_10013612 superfamily 128974 394 416 0.000677574 37.9676 cl00302 ZM superfamily - - "ZASP-like motif; Short motif (26 amino acids) present in an alpha-actinin-binding protein, ZASP, and similar molecules." Q#24969 - CGI_10013614 superfamily 239148 61 209 3.04E-73 221.345 cl02780 MIT_C superfamily - - "MIT_C; domain found C-terminal to MIT (contained within Microtubule Interacting and Trafficking molecules) domains, as well as in some bacterial proteins. The function of this domain is unknown." Q#24969 - CGI_10013614 superfamily 241764 18 50 2.28E-05 40.4853 cl00299 MIT superfamily N - "MIT: domain contained within Microtubule Interacting and Trafficking molecules. The MIT domain is found in sorting nexins, the nuclear thiol protease PalBH, the AAA protein spastin and archaebacterial proteins with similar domain architecture, vacuolar sorting proteins and others. The molecular function of the MIT domain is unclear." Q#24971 - CGI_10013616 superfamily 227412 54 238 4.47E-22 90.9909 cl18811 YIP1 superfamily N - "Rab GTPase interacting factor, Golgi membrane protein [Intracellular trafficking and secretion]" Q#24973 - CGI_10013618 superfamily 217414 231 614 1.94E-61 211.419 cl03927 Otopetrin superfamily - - "Protein of unknown function, DUF270; Protein of unknown function, DUF270.
" Q#24975 - CGI_10013620 superfamily 243066 86 188 1.01E-16 73.4205 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#24976 - CGI_10013621 superfamily 247065 1012 1106 1.92E-07 50.421 cl15777 GGCT_like superfamily - - "GGCT-like domains, also called AIG2-like family. Gamma-glutamyl cyclotransferase (GGCT) catalyzes the formation of pyroglutamic acid (5-oxoproline) from dipeptides containing gamma-glutamyl, and is a dimeric protein. In Homo sapiens, the protein is encoded by the gene C7orf24, and the enzyme participates in the gamma-glutamyl cycle. Hereditary defects in the gamma-glutamyl cycle have been described for some of the genes involved, but not for C7orf24. The synthesis and metabolism of glutathione (L-gamma-glutamyl-L-cysteinylglycine) ties the gamma-glutamyl cycle to numerous cellular processes; glutathione acts as a ubiquitous reducing agent in reductive mechanisms involved in protein and DNA synthesis, transport processes, enzyme activity, and metabolism. 
AIG2 (avrRpt2-induced gene) is an Arabidopsis protein that exhibits RPS2- and avrRpt2-dependent induction early after infection with Pseudomonas syringae pv maculicola strain ES4326 carrying avrRpt2. avrRpt2 is an avirulence gene that can convert virulent strains of P. syringae to avirulence on Arabidopsis thaliana, soybean, and bean. The family also includes bacterial tellurite-resistance proteins (trgB); tellurium (Te) compounds are used in industrial processes and had been used as antimicrobial agents in the past. Some members have been described as proteins involved in cation transport (chaC)." Q#24978 - CGI_10013623 superfamily 243058 12 133 2.74E-09 51.1611 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#24980 - CGI_10003623 superfamily 219740 45 121 0.0064787 34.7022 cl06992 Peptidase_S64 superfamily N - "Peptidase family S64; This family of fungal proteins is involved in the processing of membrane bound transcription factor Stp1. The processing causes the signalling domain of Stp1 to be passed to the nucleus where several permease genes are induced. The permeases are important for uptake of amino acids, and processing of Stp1 only occurs in an amino acid-rich environment. This family is predicted to be distantly related to the trypsin family (MEROPS:S1) and to have a typical trypsin-like catalytic triad." 
Q#24981 - CGI_10019010 superfamily 245596 68 240 2.26E-93 276.77 cl11394 Glyco_tranf_GTA_type superfamily C - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#24982 - CGI_10019011 superfamily 245814 481 555 4.98E-07 49.7951 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#24982 - CGI_10019011 superfamily 245814 78 155 1.54E-06 48.6395 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#24982 - CGI_10019011 superfamily 245814 1738 1815 2.13E-06 47.8691 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#24982 - CGI_10019011 superfamily 245814 180 255 6.96E-06 46.3283 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. 
A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#24982 - CGI_10019011 superfamily 245814 1289 1364 6.31E-05 43.6319 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#24982 - CGI_10019011 superfamily 245814 1591 1659 0.000406241 40.9355 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#24982 - CGI_10019011 superfamily 245814 683 758 0.00076627 40.1651 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#24982 - CGI_10019011 superfamily 245814 380 441 0.00183736 39.0095 cl11960 Ig superfamily C - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#24982 - CGI_10019011 superfamily 245814 584 654 0.0058035 37.4687 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#24982 - CGI_10019011 superfamily 245814 273 358 1.83E-10 60.2116 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. 
The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#24982 - CGI_10019011 superfamily 245814 1479 1569 1.24E-07 51.7373 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#24982 - CGI_10019011 superfamily 245814 1380 1468 7.20E-07 49.4261 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#24982 - CGI_10019011 superfamily 245814 2137 2191 1.71E-05 45.0904 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#24982 - CGI_10019011 superfamily 245814 880 948 3.23E-05 44.4185 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#24982 - CGI_10019011 superfamily 245814 2041 2117 6.43E-05 43.6481 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. 
A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#24982 - CGI_10019011 superfamily 245814 1936 2026 8.02E-05 43.2629 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#24982 - CGI_10019011 superfamily 245814 771 861 0.000346177 41.3369 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#24982 - CGI_10019011 superfamily 245814 1039 1065 0.0053484 37.3861 cl11960 Ig superfamily N - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#24983 - CGI_10019012 superfamily 245814 117 178 4.37E-08 47.8184 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#24983 - CGI_10019012 superfamily 245814 22 103 4.96E-07 45.122 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#24984 - CGI_10019013 superfamily 247912 45 405 2.49E-29 119.143 cl17358 Beta-lactamase superfamily - - Beta-lactamase; This family appears to be distantly related to pfam00905 and PF00768 D-alanyl-D-alanine carboxypeptidase. Q#24985 - CGI_10019014 superfamily 247724 282 442 8.92E-24 99.0674 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#24985 - CGI_10019014 superfamily 247912 18 215 6.18E-11 63.2892 cl17358 Beta-lactamase superfamily N - Beta-lactamase; This family appears to be distantly related to pfam00905 and PF00768 D-alanyl-D-alanine carboxypeptidase. 
Q#24988 - CGI_10019017 superfamily 241564 533 601 6.69E-28 108.507 cl00035 BIR superfamily - - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." Q#24988 - CGI_10019017 superfamily 241564 649 698 3.35E-22 92.3287 cl00035 BIR superfamily N - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." Q#24988 - CGI_10019017 superfamily 241564 266 336 4.80E-09 54.2123 cl00035 BIR superfamily - - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." 
Q#24988 - CGI_10019017 superfamily 247792 839 869 1.35E-05 43.9076 cl17238 RING superfamily C - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#24990 - CGI_10019019 superfamily 245248 44 505 2.12E-72 239.437 cl10080 RPE65 superfamily - - "Retinal pigment epithelial membrane protein; This family represents a retinal pigment epithelial membrane receptor which is abundantly expressed in retinal pigment epithelium, and binds plasma retinal binding protein. The family also includes the sequence related neoxanthin cleavage enzyme in plants and lignostilbene-alpha,beta-dioxygenase in bacteria." Q#24991 - CGI_10019020 superfamily 220249 105 173 6.93E-15 66.0896 cl09695 H_lectin superfamily - - "H-type lectin domain; The H-type lectin domain is a unit of six beta chains, combined into a homo-hexamer. It is involved in self/non-self recognition of cells, through binding with carbohydrates. It is sometimes found in association with the F5_F8_type_C domain pfam00754." Q#24993 - CGI_10019023 superfamily 248097 197 324 1.07E-19 82.6982 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. 
Q#24994 - CGI_10019024 superfamily 128778 73 169 1.43E-05 43.4075 cl17972 BBC superfamily C - B-Box C-terminal domain; Coiled coil region C-terminal to (some) B-Box domains Q#24994 - CGI_10019024 superfamily 241563 27 65 4.21E-05 41.3108 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#24996 - CGI_10019026 superfamily 207685 14 65 1.02E-21 82.2218 cl02642 PABP superfamily C - "Poly-adenylate binding protein, unique domain; The region featured in this family is found towards the C-terminus of poly(A)-binding proteins (PABPs). These are eukaryotic proteins that, through their binding of the 3' poly(A) tail on mRNA, have very important roles in the pathways of gene expression. They seem to provide a scaffold on which other proteins can bind and mediate processes such as export, translation and turnover of the transcripts. Moreover, they may act as antagonists to the binding of factors that allow mRNA degradation, regulating mRNA longevity. PABPs are also involved in nuclear transport. PABPs interact with poly(A) tails via RNA-recognition motifs (pfam00076). Note that the PABP C-terminal region is also found in members of the hyperplastic discs protein (HYD) family of ubiquitin ligases that contain HECT domains - these are also included in this family." Q#24997 - CGI_10019027 superfamily 241571 40 153 7.19E-27 104.802 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#24997 - CGI_10019027 superfamily 241571 168 291 6.09E-14 68.593 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." 
Q#24999 - CGI_10019029 superfamily 248306 94 394 0 539.408 cl17752 Coprogen_oxidas superfamily - - Coproporphyrinogen III oxidase; Coproporphyrinogen III oxidase. Q#25000 - CGI_10019030 superfamily 248013 14 30 0.000309576 36.3972 cl17459 CHROMO superfamily NC - "Chromatin organization modifier (chromo) domain is a conserved region of around 50 amino acids found in a variety of chromosomal proteins, which appear to play a role in the functional organization of the eukaryotic nucleus. Experimental evidence implicates the chromo domain in the binding activity of these proteins to methylated histone tails and maybe RNA. May occur as a single instance, in a tandem arrangement or followed by a related "chromo shadow" domain." Q#25001 - CGI_10019031 superfamily 149431 278 371 1.04E-34 130.511 cl07111 LLGL superfamily - - LLGL2; This domain is found in lethal giant larvae homolog 2 (LLGL2) proteins and syntaxin-binding proteins like tomosyn. It has been identified in eukaryotes and tends to be found together with WD repeats (pfam00400). 
Q#25001 - CGI_10019031 superfamily 243092 49 258 2.97E-11 64.6636 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#25002 - CGI_10019032 superfamily 217474 55 388 7.35E-76 246.88 cl03979 PAE superfamily - - Pectinacetylesterase; Pectinacetylesterase. Q#25004 - CGI_10019034 superfamily 241563 120 159 4.08E-06 44.3924 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#25004 - CGI_10019034 superfamily 245027 217 286 0.00223897 37.3728 cl09176 FlgN superfamily C - FlgN protein; This family includes the FlgN protein and export chaperone involved in flagellar synthesis. 
Q#25005 - CGI_10019035 superfamily 241563 125 164 3.73E-05 41.696 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#25006 - CGI_10019036 superfamily 245849 1 101 5.98E-15 66.497 cl12045 Ubiq_cyt_C_chap superfamily - - Ubiquinol-cytochrome C chaperone; Ubiquinol-cytochrome C chaperone. Q#25009 - CGI_10019039 superfamily 246664 171 546 9.33E-176 504.801 cl14561 An_peroxidase_like superfamily - - "Animal heme peroxidases and related proteins; A diverse family of enzymes, which includes prostaglandin G/H synthase, thyroid peroxidase, myeloperoxidase, linoleate diol synthase, lactoperoxidase, peroxinectin, peroxidasin, and others. Despite its name, this family is not restricted to metazoans: members are found in fungi, plants, and bacteria as well." Q#25009 - CGI_10019039 superfamily 246664 12 45 0.0009823 40.3981 cl14561 An_peroxidase_like superfamily C - "Animal heme peroxidases and related proteins; A diverse family of enzymes, which includes prostaglandin G/H synthase, thyroid peroxidase, myeloperoxidase, linoleate diol synthase, lactoperoxidase, peroxinectin, peroxidasin, and others. Despite its name, this family is not restricted to metazoans: members are found in fungi, plants, and bacteria as well." Q#25010 - CGI_10019040 superfamily 246664 181 519 2.18E-148 433.925 cl14561 An_peroxidase_like superfamily - - "Animal heme peroxidases and related proteins; A diverse family of enzymes, which includes prostaglandin G/H synthase, thyroid peroxidase, myeloperoxidase, linoleate diol synthase, lactoperoxidase, peroxinectin, peroxidasin, and others. Despite its name, this family is not restricted to metazoans: members are found in fungi, plants, and bacteria as well." 
Q#25010 - CGI_10019040 superfamily 246664 21 88 0.00326286 38.8573 cl14561 An_peroxidase_like superfamily C - "Animal heme peroxidases and related proteins; A diverse family of enzymes, which includes prostaglandin G/H synthase, thyroid peroxidase, myeloperoxidase, linoleate diol synthase, lactoperoxidase, peroxinectin, peroxidasin, and others. Despite its name, this family is not restricted to metazoans: members are found in fungi, plants, and bacteria as well." Q#25011 - CGI_10019041 superfamily 111397 9 85 6.68E-07 44.6395 cl03620 HYR superfamily - - "HYR domain; This domain is known as the HYR (Hyalin Repeat) domain, after the protein hyalin that is composed exclusively of this repeat. This domain probably corresponds to a new superfamily in the immunoglobulin fold. The function of this domain is uncertain; it may be involved in cell adhesion." Q#25012 - CGI_10019042 superfamily 241646 128 171 8.24E-05 39.7414 cl00156 WAP superfamily - - "whey acidic protein-type four-disulfide core domains. Members of the family include whey acidic protein, elafin (elastase-specific inhibitor), caltrin-like protein (a calcium transport inhibitor) and other extracellular proteinase inhibitors. A group of proteins containing 8 characteristically-spaced cysteine residues forming disulphide bonds, have been termed '4-disulphide core' proteins. Protease inhibition occurs by insertion of the inhibitory loop into the active site pocket and interference with the catalytic residues of the protease." 
Q#25014 - CGI_10005624 superfamily 241900 447 561 0.000301068 41.4336 cl00490 EEP superfamily N - "Exonuclease-Endonuclease-Phosphatase (EEP) domain superfamily; This large superfamily includes the catalytic domain (exonuclease/endonuclease/phosphatase or EEP domain) of a diverse set of proteins including the ExoIII family of apurinic/apyrimidinic (AP) endonucleases, inositol polyphosphate 5-phosphatases (INPP5), neutral sphingomyelinases (nSMases), deadenylases (such as the vertebrate circadian-clock regulated nocturnin), bacterial cytolethal distending toxin B (CdtB), deoxyribonuclease 1 (DNase1), the endonuclease domain of the non-LTR retrotransposon LINE-1, and related domains. These diverse enzymes share a common catalytic mechanism of cleaving phosphodiester bonds; their substrates range from nucleic acids to phospholipids and perhaps proteins." Q#25014 - CGI_10005624 superfamily 241900 128 322 0.00251361 38.6165 cl00490 EEP superfamily N - "Exonuclease-Endonuclease-Phosphatase (EEP) domain superfamily; This large superfamily includes the catalytic domain (exonuclease/endonuclease/phosphatase or EEP domain) of a diverse set of proteins including the ExoIII family of apurinic/apyrimidinic (AP) endonucleases, inositol polyphosphate 5-phosphatases (INPP5), neutral sphingomyelinases (nSMases), deadenylases (such as the vertebrate circadian-clock regulated nocturnin), bacterial cytolethal distending toxin B (CdtB), deoxyribonuclease 1 (DNase1), the endonuclease domain of the non-LTR retrotransposon LINE-1, and related domains. These diverse enzymes share a common catalytic mechanism of cleaving phosphodiester bonds; their substrates range from nucleic acids to phospholipids and perhaps proteins." Q#25015 - CGI_10005625 superfamily 149752 53 80 1.91E-12 61.0201 cl07411 zf-LYAR superfamily - - "LYAR-type C2HC zinc finger; This C2HC zinc finger is found in LYAR proteins, which are involved in cell growth regulation." 
Q#25017 - CGI_10005627 superfamily 246918 145 197 3.07E-14 66.0711 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#25017 - CGI_10005627 superfamily 246918 202 254 2.32E-12 61.0635 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#25017 - CGI_10005627 superfamily 246918 86 129 3.51E-06 43.7295 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#25023 - CGI_10010905 superfamily 243179 64 172 0.000741713 36.3283 cl02781 tetraspanin_LEL superfamily - - "Tetraspanin, extracellular domain or large extracellular loop (LEL). Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. The tetraspanin family contains CD9, CD63, CD37, CD53, CD82, CD151, and CD81, amongst others. Tetraspanins are involved in diverse processes such as cell activation and proliferation, adhesion and motility, differentiation, cancer, and others. Their various functions may relate to their ability to act as molecular facilitators, grouping specific cell-surface proteins and affecting formation and stability of signaling complexes. Tetraspanins associate laterally with one another and cluster dynamically with numerous parnter domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web", which may also include integrins." Q#25024 - CGI_10010906 superfamily 198675 1501 1700 1.66E-71 239.855 cl02436 COLFI superfamily - - "Fibrillar collagen C-terminal domain; Found at C-termini of fibrillar collagens: Ephydatia muelleri procollagen EMF1 alpha, vertebrate collagens alpha(1)III, alpha(1)II, alpha(2)V etc." 
Q#25024 - CGI_10010906 superfamily 214560 36 197 2.12E-18 85.4868 cl18311 TSPN superfamily - - Thrombospondin N-terminal -like domains; Heparin-binding and cell adhesion domain of thrombospondin Q#25028 - CGI_10010910 superfamily 110440 522 549 0.00319677 35.8465 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is me diated by the NHL repeats." Q#25029 - CGI_10009232 superfamily 241600 125 354 1.56E-90 272.962 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#25030 - CGI_10009233 superfamily 243115 4 129 9.09E-20 82.3917 cl02623 WIF superfamily - - "WIF domain; The WIF domain is found in the RYK tyrosine kinase receptors and WIF, the Wnt-inhibitory- factor. The domain is extracellular and contains two conserved cysteines that may form a disulphide bridge. This domain is Wnt binding in WIF, and it has been suggested that RYK may also bind to Wnt. 
The WIF domain is a member of the immunoglobulin superfamily, and it comprises nine beta-strands and two alpha-helices, with two of the beta-strands (6 and 9) interrupted by four and six residues of irregular secondary structure, respectively. Considering that the activity of Wnts depends on the presence of a palmitoylated cysteine residue in their amino-terminal polypeptide segment, Wnt proteins are lipid-modified and can act as stem cell growth factors, it is likely that the WIF domain recognises and binds to Wnts that have been activated by palmitoylation and that the recognition of palmitoylated Wnts by WIF-1 is effected by its WIF domain rather than by its EGF domains. A strong binding affinity for palmitoylated cysteine residues would further explain the remarkably high affinity of human WIF-1 not only for mammalian Wnts, but also for Wnts from Xenopus and Drosophila." Q#25032 - CGI_10009235 superfamily 241888 82 311 8.90E-94 280.258 cl00473 BI-1-like superfamily - - "BAX inhibitor (BI)-1/YccA-like protein family; Mammalian members of the BAX inhibitor (BI)-1 like family of small transmembrane proteins have been shown to have an antiapoptotic effect either by stimulating the antiapoptotic function of Bcl-2, a well-characterized oncogene, or by inhibiting the proapoptotic effect of Bax, another member of the Bcl-2 family. Their broad tissue distribution and high degree of conservation suggests an important regulatory role. This superfamily also contains the lifeguard(LFG)-like proteins and other subfamilies which appear to be related by common descent and also function as inhibitors of apoptosis. In plants, BI-1 like proteins play a role in pathogen resistance. A prokaryotic member, Escherichia coli YccA, has been shown to interact with ATP-dependent protease FtsH, which degrades abnormal membrane proteins as part of a quality control mechanism to keep the integrity of biological membranes." 
Q#25033 - CGI_10009236 superfamily 241888 13 217 1.76E-53 173.943 cl00473 BI-1-like superfamily - - "BAX inhibitor (BI)-1/YccA-like protein family; Mammalian members of the BAX inhibitor (BI)-1 like family of small transmembrane proteins have been shown to have an antiapoptotic effect either by stimulating the antiapoptotic function of Bcl-2, a well-characterized oncogene, or by inhibiting the proapoptotic effect of Bax, another member of the Bcl-2 family. Their broad tissue distribution and high degree of conservation suggests an important regulatory role. This superfamily also contains the lifeguard(LFG)-like proteins and other subfamilies which appear to be related by common descent and also function as inhibitors of apoptosis. In plants, BI-1 like proteins play a role in pathogen resistance. A prokaryotic member, Escherichia coli YccA, has been shown to interact with ATP-dependent protease FtsH, which degrades abnormal membrane proteins as part of a quality control mechanism to keep the integrity of biological membranes." Q#25034 - CGI_10009237 superfamily 110440 311 337 0.00181163 35.8465 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is me diated by the NHL repeats." Q#25035 - CGI_10009238 superfamily 110440 316 342 0.00122744 36.2317 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. 
The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is me diated by the NHL repeats." Q#25038 - CGI_10009241 superfamily 248264 1 56 7.45E-06 43.3798 cl17710 DDE_4 superfamily N - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#25039 - CGI_10009242 superfamily 241600 64 150 1.85E-28 110.059 cl00085 FReD superfamily NC - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#25039 - CGI_10009242 superfamily 248097 210 333 3.53E-16 73.0682 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. 
Q#25040 - CGI_10009243 superfamily 221377 40 192 3.28E-16 75.583 cl13449 DUF3504 superfamily - - Domain of unknown function (DUF3504); This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 156 to 173 amino acids in length. Q#25044 - CGI_10007830 superfamily 194336 222 309 7.47E-16 74.2045 cl02517 ZU5 superfamily - - ZU5 domain; Domain present in ZO-1 and Unc5-like netrin receptors Domain of unknown function. Q#25044 - CGI_10007830 superfamily 246680 517 566 4.96E-05 41.9257 cl14633 DD_superfamily superfamily C - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#25045 - CGI_10007831 superfamily 220639 48 153 3.60E-34 118.375 cl10916 DUF2315 superfamily - - Uncharacterized conserved protein (DUF2315); This is a family of small conserved proteins found from worms to humans. The function is not known. Q#25046 - CGI_10007832 superfamily 241599 3 32 1.40E-07 45.3121 cl00084 homeodomain superfamily N - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." 
Q#25047 - CGI_10007833 superfamily 243555 24 216 2.09E-17 77.0462 cl03871 Chitin_bind_3 superfamily - - "Chitin binding domain; This domain is found associated with a wide variety of cellulose binding domain. This domain however is a chitin binding domain. This domain is found in isolation in baculoviral spheroidins and spindolins, protein of unknown function." Q#25049 - CGI_10007835 superfamily 241578 201 357 1.62E-28 109.689 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#25050 - CGI_10007836 superfamily 241584 888 967 0.00132221 38.6315 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. 
Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#25051 - CGI_10007837 superfamily 241578 108 262 7.12E-20 88.5026 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#25051 - CGI_10007837 superfamily 245213 36 72 4.54E-11 59.9578 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. 
EGF_CA can be found in tandem repeat arrangements." Q#25052 - CGI_10007838 superfamily 241584 1220 1300 0.000456936 40.5575 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#25054 - CGI_10007840 superfamily 241578 269 426 2.68E-29 114.311 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." 
Q#25054 - CGI_10007840 superfamily 243035 47 119 5.07E-19 83.8233 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#25054 - CGI_10007840 superfamily 243035 188 229 8.97E-05 41.4514 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. 
CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#25056 - CGI_10004361 superfamily 241574 391 619 1.59E-94 299.501 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. 
Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#25056 - CGI_10004361 superfamily 241574 685 902 2.55E-16 78.7817 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#25058 - CGI_10004363 superfamily 243179 108 199 5.89E-24 92.4293 cl02781 tetraspanin_LEL superfamily - - "Tetraspanin, extracellular domain or large extracellular loop (LEL). Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. The tetraspanin family contains CD9, CD63, CD37, CD53, CD82, CD151, and CD81, amongst others. Tetraspanins are involved in diverse processes such as cell activation and proliferation, adhesion and motility, differentiation, cancer, and others. Their various functions may relate to their ability to act as molecular facilitators, grouping specific cell-surface proteins and affecting formation and stability of signaling complexes. 
Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web", which may also include integrins." Q#25059 - CGI_10004364 superfamily 241546 553 671 1.16E-55 186.221 cl00011 PLAT superfamily - - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats. The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#25059 - CGI_10004364 superfamily 241546 275 396 1.92E-47 163.109 cl00011 PLAT superfamily - - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats. The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#25059 - CGI_10004364 superfamily 241546 18 132 1.78E-30 116.5 cl00011 PLAT superfamily - - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats. The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#25059 - CGI_10004364 superfamily 241546 148 265 8.99E-29 111.493 cl00011 PLAT superfamily - - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. 
It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats. The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#25059 - CGI_10004364 superfamily 241546 408 518 1.89E-11 61.802 cl00011 PLAT superfamily - - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats. The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#25060 - CGI_10023443 superfamily 118676 305 509 1.31E-24 101.731 cl10845 SCHIP-1 superfamily - - Schwannomin-interacting protein 1; Members of this family are coiled coil protein involved in linking membrane proteins to the cytoskeleton. Q#25060 - CGI_10023443 superfamily 210118 64 85 1.45E-05 42.3127 cl15479 IQ superfamily - - IQ calmodulin-binding motif; Calmodulin-binding motif. Q#25061 - CGI_10023444 superfamily 245201 4 256 1.18E-127 382.231 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." 
Q#25062 - CGI_10023445 superfamily 241705 54 134 9.57E-27 99.5838 cl00228 HIT_like superfamily N - "HIT family: HIT (Histidine triad) proteins, named for a motif related to the sequence HxHxH/Qxx (x, a hydrophobic amino acid), are a superfamily of nucleotide hydrolases and transferases, which act on the alpha-phosphate of ribonucleotides. On the basis of sequence, substrate specificity, structure, evolution and mechanism, HIT proteins are classified in the literature into three major branches: the Hint branch, which consists of adenosine 5' -monophosphoramide hydrolases, the Fhit branch, that consists of diadenosine polyphosphate hydrolases, and the GalT branch consisting of specific nucleoside monophosphate transferases. Further sequence analysis reveals several new closely related, yet uncharacterized subgroups." Q#25064 - CGI_10023447 superfamily 247792 350 388 0.00219428 37.04 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#25067 - CGI_10023450 superfamily 245205 138 217 3.18E-12 59.5589 cl09930 RPA_2b-aaRSs_OBF_like superfamily - - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. 
One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." Q#25071 - CGI_10023455 superfamily 247724 6 160 1.56E-33 118.415 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. 
The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#25072 - CGI_10023456 superfamily 247724 6 147 4.69E-34 119.956 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#25073 - CGI_10023457 superfamily 247856 18 67 6.66E-15 64.4913 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. 
Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#25074 - CGI_10023458 superfamily 247856 79 139 6.37E-19 76.0473 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#25074 - CGI_10023458 superfamily 247856 8 104 9.42E-09 47.9277 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#25075 - CGI_10023459 superfamily 247856 96 152 1.20E-15 67.5729 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#25075 - CGI_10023459 superfamily 247856 21 80 1.57E-07 44.8461 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. 
EF-hands tend to occur in pairs or higher copy numbers." Q#25078 - CGI_10023462 superfamily 248458 717 1022 8.47E-43 160.557 cl17904 MFS superfamily - - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#25083 - CGI_10023468 superfamily 217062 70 336 8.59E-49 166.676 cl12266 Branch superfamily - - "Core-2/I-Branching enzyme; This is a family of two different beta-1,6-N-acetylglucosaminyltransferase enzymes, I-branching enzyme and core-2 branching enzyme . 
I-branching enzyme is responsible for the production of the blood group I-antigen during embryonic development. Core-2 branching enzyme forms crucial side-chain branches in O-glycans." Q#25088 - CGI_10023473 superfamily 248097 49 178 3.01E-16 71.1422 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#25089 - CGI_10023474 superfamily 241563 121 159 2.43E-05 40.9256 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#25090 - CGI_10023475 superfamily 243035 121 232 1.58E-06 44.533 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration.
CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#25091 - CGI_10023476 superfamily 217473 568 634 1.98E-11 63.9233 cl03978 Mab-21 superfamily NC - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#25091 - CGI_10023476 superfamily 217473 300 461 1.24E-08 55.4489 cl03978 Mab-21 superfamily C - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. 
elegans these proteins are required for several aspects of embryonic development. Q#25092 - CGI_10023477 superfamily 217473 182 463 8.52E-32 122.859 cl03978 Mab-21 superfamily - - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#25093 - CGI_10023478 superfamily 217473 155 444 3.82E-23 97.4357 cl03978 Mab-21 superfamily - - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#25094 - CGI_10004261 superfamily 241583 280 480 1.51E-81 265.641 cl00064 ZnMc superfamily - - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#25094 - CGI_10004261 superfamily 216572 32 181 5.42E-15 73.463 cl03265 Pep_M12B_propep superfamily - - Reprolysin family propeptide; This region is the propeptide for members of peptidase family M12B. The propeptide contains a sequence motif similar to the "cysteine switch" of the matrixins. This motif is found at the C terminus of the alignment but is not well aligned. Q#25094 - CGI_10004261 superfamily 246918 578 629 1.78E-13 67.2267 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#25094 - CGI_10004261 superfamily 246918 921 979 0.00170451 37.9515 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#25094 - CGI_10004261 superfamily 204025 1173 1204 0.00201515 37.6161 cl07344 PLAC superfamily - - PLAC (protease and lacunin) domain; The PLAC (protease and lacunin) domain is a short six-cysteine region that is usually found at the C terminal of proteins. 
It is found in a range of proteins including PACE4 (paired basic amino acid cleaving enzyme 4) and the extracellular matrix protein lacunin. Q#25094 - CGI_10004261 superfamily 246918 1113 1165 0.00218657 37.5663 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#25094 - CGI_10004261 superfamily 246918 985 1037 0.0034405 37.1811 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#25095 - CGI_10004262 superfamily 246722 5 219 2.35E-115 339.095 cl14812 PIN_SF superfamily - - "PIN (PilT N terminus) domain: Superfamily; PIN_SF The PIN (PilT N terminus) domain belongs to a large nuclease superfamily with representatives from eukaryota, eubacteria, and archaea. PIN domains were originally named for their sequence similarity to the N-terminal domain of an annotated pili biogenesis protein, PilT, a domain fusion between a PIN-domain and a PilT ATPase domain. The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues) is geometrically similar in the active center of structure-specific 5' nucleases (also known as Flap endonuclease-1-like), PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Seen here, are two major divisions in the PIN domain superfamily. The first major division, the structure-specific 5' nuclease family, is represented by FEN1, the 5'-3' exonuclease of DNA polymerase I, and T4 RNase H nuclease PIN domains. These 5' nucleases are involved in DNA replication, repair, and recombination. They are capable of both 5'-3' exonucleolytic activity and cleaving bifurcated DNA, in an endonucleolytic, structure-specific manner. 
Unique to FEN1-like nucleases, the PIN domain has a helical arch/clamp region (I domain) of variable length (approximately 16 to 800 residues) and, inserted within the C-terminal region of the PIN domain, a H3TH (helix-3-turn-helix) domain, an atypical helix-hairpin-helix-2-like region. Both the H3TH domain (not included here) and the helical arch/clamp region are involved in DNA binding. With the exception of Mkt1, these nucleases have a carboxylate rich active site that is involved in binding essential divalent metal ion cofactors (Mg2+, Mn2+, Zn2+, or Co2+). The second major division of the PIN domain superfamily, the VapC-Smg6 family, includes such eukaryotic ribonucleases as, Smg6, an essential factor in nonsense-mediated mRNA decay; Rrp44, the catalytic subunit of the exosome; and Nob1, a ribosome assembly factor critical in pre-rRNA processing. A large percentage of members in this family are bacterial ribonuclease toxins of TA operons such as Mycobacterium tuberculosis VapC and Neisseria gonorrhoeae FitB, as well as, archaeal homologs, Pyrobaculum aerophilum Pea0151 and P. aerophilum Pae2754. Also included are the eukaryotic Fcf1/ Utp24 (FAF1-copurifying factor 1/U three-associated protein 24) and Utp23-like proteins. Components of the small subunit processome, Fcf1/Utp24 and Utp23 are essential proteins involved in pre-rRNA processing and 40S ribosomal subunit assembly." Q#25095 - CGI_10004262 superfamily 246724 221 290 5.33E-42 143.071 cl14815 H3TH_StructSpec-5'-nucleases superfamily - - "H3TH domains of structure-specific 5' nucleases (or flap endonuclease-1-like) involved in DNA replication, repair, and recombination; The 5' nucleases of this superfamily are capable of both 5'-3' exonucleolytic activity and cleaving bifurcated or branched DNA, in an endonucleolytic, structure-specific manner, and are involved in DNA replication, repair, and recombination. 
The superfamily includes the H3TH (helix-3-turn-helix) domains of Flap Endonuclease-1 (FEN1), Exonuclease-1 (EXO1), Mkt1, Gap Endonuclease 1 (GEN1) and Xeroderma pigmentosum complementation group G (XPG) nuclease. Also included are the H3TH domains of the 5'-3' exonucleases of DNA polymerase I and single domain protein homologs, as well as, the bacteriophage T4 RNase H, T5-5'nuclease, and other homologs. These nucleases contain a PIN (PilT N terminus) domain with a helical arch/clamp region/I domain (not included here) and inserted within the C-terminal region of the PIN domain is an atypical helix-hairpin-helix-2 (HhH2)-like region. This atypical HhH2 region, the H3TH domain, has an extended loop with at least three turns between the first two helices, and only three of the four helices appear to be conserved. Both the H3TH domain and the helical arch/clamp region are involved in DNA binding. Studies suggest that a glycine-rich loop in the H3TH domain contacts the phosphate backbone of the template strand in the downstream DNA duplex. Typically, the nucleases within this superfamily have a carboxylate rich active site that is involved in binding essential divalent metal ion cofactors (i. e., Mg2+, Mn2+, Zn2+, or Co2+) required for nuclease activity. The first metal binding site is composed entirely of Asp/Glu residues from the PIN domain, whereas, the second metal binding site is composed generally of two Asp residues from the PIN domain and one or two Asp residues from the H3TH domain. Together with the helical arch and network of amino acids interacting with metal binding ions, the H3TH region defines a positively charged active-site DNA-binding groove in structure-specific 5' nucleases." 
Q#25095 - CGI_10004262 superfamily 246722 292 341 3.77E-20 87.5596 cl14812 PIN_SF superfamily N - "PIN (PilT N terminus) domain: Superfamily; PIN_SF The PIN (PilT N terminus) domain belongs to a large nuclease superfamily with representatives from eukaryota, eubacteria, and archaea. PIN domains were originally named for their sequence similarity to the N-terminal domain of an annotated pili biogenesis protein, PilT, a domain fusion between a PIN-domain and a PilT ATPase domain. The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues) is geometrically similar in the active center of structure-specific 5' nucleases (also known as Flap endonuclease-1-like), PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Seen here, are two major divisions in the PIN domain superfamily. The first major division, the structure-specific 5' nuclease family, is represented by FEN1, the 5'-3' exonuclease of DNA polymerase I, and T4 RNase H nuclease PIN domains. These 5' nucleases are involved in DNA replication, repair, and recombination. They are capable of both 5'-3' exonucleolytic activity and cleaving bifurcated DNA, in an endonucleolytic, structure-specific manner. Unique to FEN1-like nucleases, the PIN domain has a helical arch/clamp region (I domain) of variable length (approximately 16 to 800 residues) and, inserted within the C-terminal region of the PIN domain, a H3TH (helix-3-turn-helix) domain, an atypical helix-hairpin-helix-2-like region. Both the H3TH domain (not included here) and the helical arch/clamp region are involved in DNA binding. With the exception of Mkt1, these nucleases have a carboxylate rich active site that is involved in binding essential divalent metal ion cofactors (Mg2+, Mn2+, Zn2+, or Co2+). 
The second major division of the PIN domain superfamily, the VapC-Smg6 family, includes such eukaryotic ribonucleases as, Smg6, an essential factor in nonsense-mediated mRNA decay; Rrp44, the catalytic subunit of the exosome; and Nob1, a ribosome assembly factor critical in pre-rRNA processing. A large percentage of members in this family are bacterial ribonuclease toxins of TA operons such as Mycobacterium tuberculosis VapC and Neisseria gonorrhoeae FitB, as well as, archaeal homologs, Pyrobaculum aerophilum Pea0151 and P. aerophilum Pae2754. Also included are the eukaryotic Fcf1/ Utp24 (FAF1-copurifying factor 1/U three-associated protein 24) and Utp23-like proteins. Components of the small subunit processome, Fcf1/Utp24 and Utp23 are essential proteins involved in pre-rRNA processing and 40S ribosomal subunit assembly." Q#25096 - CGI_10004263 superfamily 247805 170 374 1.00E-87 276.673 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 
Q#25096 - CGI_10004263 superfamily 247905 389 516 2.39E-30 116.954 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#25097 - CGI_10004264 superfamily 245201 83 325 2.88E-76 239.344 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#25100 - CGI_10005040 superfamily 247725 211 301 1.11E-30 116.324 cl17171 PH-like superfamily N - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." 
Q#25100 - CGI_10005040 superfamily 220215 19 96 1.85E-24 98.0662 cl09630 FERM_N superfamily - - FERM N-terminal domain; This domain is the N-terminal ubiquitin-like structural domain of the FERM domain. Q#25100 - CGI_10005040 superfamily 215882 104 180 7.18E-19 83.099 cl09511 FERM_M superfamily C - FERM central domain; This domain is the central structural domain of the FERM domain. Q#25100 - CGI_10005040 superfamily 192138 316 353 7.09E-09 52.6211 cl07378 FA superfamily - - "FERM adjacent (FA); This region is found adjacent to Band 4.1 / FERM domains (pfam00373) in a subset of FERM containing protein. The region has been hypothesised to play a role in regulatory adaptation, based on similarity to other protein kinase substrates." Q#25101 - CGI_10005041 superfamily 247744 1 120 8.26E-38 127.369 cl17190 NK superfamily N - "Nucleoside/nucleotide kinase (NK) is a protein superfamily consisting of multiple families of enzymes that share structural similarity and are functionally related to the catalysis of the reversible phosphate group transfer from nucleoside triphosphates to nucleosides/nucleotides, nucleoside monophosphates, or sugars. Members of this family play a wide variety of essential roles in nucleotide metabolism, the biosynthesis of coenzymes and aromatic compounds, as well as the metabolism of sugar and sulfate." Q#25102 - CGI_10005042 superfamily 207627 546 625 1.45E-07 51.0963 cl02522 Calx-beta superfamily - - Calx-beta domain; Calx-beta domain. Q#25102 - CGI_10005042 superfamily 243086 1230 1283 1.54E-06 47.3822 cl02559 GPS superfamily - - "Latrophilin/CL-1-like GPS domain; Domain present in latrophilin/CL-1, sea urchin REJ and polycystin." Q#25102 - CGI_10005042 superfamily 207627 174 265 0.00835173 36.4635 cl02522 Calx-beta superfamily - - Calx-beta domain; Calx-beta domain. 
Q#25104 - CGI_10021189 superfamily 247684 104 345 5.53E-35 132.401 cl17037 NBD_sugar-kinase_HSP70_actin superfamily N - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#25104 - CGI_10021189 superfamily 247684 5 105 5.48E-22 94.6515 cl17037 NBD_sugar-kinase_HSP70_actin superfamily C - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#25106 - CGI_10021191 superfamily 245206 2 40 1.68E-10 53.6049 cl09931 NADB_Rossmann superfamily NC - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. 
NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#25107 - CGI_10021192 superfamily 247068 218 297 9.74E-10 56.553 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#25107 - CGI_10021192 superfamily 247068 99 192 1.76E-06 46.923 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#25107 - CGI_10021192 superfamily 245213 709 743 0.000203585 39.9274 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#25107 - CGI_10021192 superfamily 245213 671 706 0.000407661 39.157 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#25107 - CGI_10021192 superfamily 245213 556 591 0.000685012 38.3866 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#25107 - CGI_10021192 superfamily 245213 483 515 0.00106711 38.0014 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#25107 - CGI_10021192 superfamily 241584 299 390 0.00241082 37.0907 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#25107 - CGI_10021192 superfamily 245213 407 441 0.00502036 35.6902 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#25108 - CGI_10021193 superfamily 189857 30 129 1.99E-30 108.106 cl07832 Caveolin superfamily - - "Caveolin; All three known Caveolin forms have the FEDVIAEP caveolin 'signature motif' within their hydrophilic N-terminal domain. Caveolin 2 (Cav-2) is co-localised and co-expressed with Cav-1/VIP21, forms heterodimers with it and needs Cav-1 for proper membrane localisation. Cav-3 has greater protein sequence similarity to Cav-1 than to Cav-2. Cellular processes caveolins are involved in include vesicular transport, cholesterol homeostasis, signal transduction, and tumour suppression." Q#25110 - CGI_10021195 superfamily 203031 94 140 0.00142372 35.7668 cl04548 FLYWCH superfamily N - "FLYWCH zinc finger domain; Mutations in the mod(mdg4) gene have effects on variegation (PEV), the properties of insulator sequences, correct path-finding of growing nerve cells, meiotic pairing of chromosomes, and apoptosis. The occurrence of FLYWCH motifs in mod(mdg4) gene product and other proteins is discussed in." Q#25115 - CGI_10021201 superfamily 242406 122 207 5.71E-05 40.2673 cl01271 DUF1768 superfamily C - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. Q#25116 - CGI_10021202 superfamily 241584 176 269 0.000121533 39.7871 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. 
Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#25120 - CGI_10021207 superfamily 216897 3 82 2.04E-22 83.4997 cl03463 Gal_Lectin superfamily - - Galactose binding lectin domain; Galactose binding lectin domain. Q#25124 - CGI_10021211 superfamily 245213 146 182 9.28E-10 51.4834 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#25124 - CGI_10021211 superfamily 245213 108 143 2.39E-08 48.0166 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#25125 - CGI_10021212 superfamily 247727 103 207 4.54E-08 48.9655 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#25126 - CGI_10021213 superfamily 241613 255 296 4.44E-09 53.7498 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#25126 - CGI_10021213 superfamily 244394 91 168 3.71E-18 81.6511 cl06508 MANEC superfamily - - "MANEC domain; This region of similarity, comprising 8 conserved cysteines, is found in the N-terminal region of several membrane-associated and extracellular proteins. 
Although formerly called MANSC (for motif at N terminus with seven cysteines) it has now been renamed MANEC (motif at N terminus with eight cysteines) by Richard Mitter and Stephen Fitzgerald after the discovery of an eighth conserved cysteine. It is postulated that this domain may play a role in the formation of protein complexes involving various protease activators and inhibitors." Q#25127 - CGI_10021214 superfamily 218643 205 474 4.17E-99 302.468 cl05243 DUF766 superfamily - - Protein of unknown function (DUF766); This family consists of several eukaryotic proteins of unknown function. Q#25127 - CGI_10021214 superfamily 218569 22 116 0.0019588 37.544 cl05100 PP1_inhibitor superfamily C - "PKC-activated protein phosphatase-1 inhibitor; Contractility of vascular smooth muscle depends on phosphorylation of myosin light chains, and is modulated by hormonal control of myosin phosphatase activity. Signaling pathways activate kinases such as PKC or Rho-dependent kinases that phosphorylate the myosin phosphatase inhibitor protein called CPI-17. Phosphorylation of CPI-17 at Thr-38 enhances its inhibitory potency 1000-fold, creating a molecular switch for regulating contraction." Q#25128 - CGI_10021215 superfamily 246597 74 324 4.89E-55 188.716 cl13995 MPP_superfamily superfamily - - "metallophosphatase superfamily, metallophosphatase domain; Metallophosphatases (MPPs), also known as metallophosphoesterases, phosphodiesterases (PDEs), binuclear metallophosphoesterases, and dimetal-containing phosphoesterases (DMPs), represent a diverse superfamily of enzymes with a conserved domain containing an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. 
This superfamily includes: the phosphoprotein phosphatases (PPPs), Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination." Q#25129 - CGI_10021217 superfamily 150002 80 164 1.99E-20 86.6367 cl07690 Tap-RNA_bind superfamily - - "Tap, RNA-binding; Members of this family adopt a structure consisting of an alpha+beta sandwich with an antiparallel beta-sheet, arranged in a 2(beta-alpha-beta) motif. They are mainly found in mRNA export factors, and mediate the sequence nonspecific nuclear export of cellular mRNAs as well as the sequence-specific export of retroviral mRNAs bearing the constitutive transport element." Q#25129 - CGI_10021217 superfamily 198842 560 600 5.95E-15 70.4088 cl04338 TAP_C superfamily N - "TAP C-terminal domain; The vertebrate Tap protein is a member of the NXF family of shuttling transport receptors for nuclear export of mRNA. Tap has a modular structure, and its most C-terminal domain is important for binding to FG repeat-containing nuclear pore proteins (FG-nucleoporins) and is sufficient to mediate nuclear shuttling. The structure of the C-terminal domain is composed of four helices. The structure is related to the UBA domain." Q#25129 - CGI_10021217 superfamily 245009 354 503 1.76E-07 49.5885 cl09109 NTF2_like superfamily - - "Nuclear transport factor 2 (NTF2-like) superfamily. This family includes members of the NTF2 family, Delta-5-3-ketosteroid isomerases, Scytalone Dehydratases, and the beta subunit of Ring hydroxylating dioxygenases. 
This family is a classic example of divergent evolution wherein the proteins have many common structural details but diverge greatly in their function. For example, nuclear transport factor 2 (NTF2) mediates the nuclear import of RanGDP and binds to both RanGDP and FxFG repeat-containing nucleoporins while Ketosteroid isomerases catalyze the isomerization of delta-5-3-ketosteroid to delta-4-3-ketosteroid, by intramolecular transfer of the C4-beta proton to the C6-beta position. While the function of the beta sub-unit of the Ring hydroxylating dioxygenases is not known, Scytalone Dehydratases catalyzes two reactions in the biosynthetic pathway that produces fungal melanin. Members of the NTF2-like superfamily are widely distributed among bacteria, archaea and eukaryotes." Q#25131 - CGI_10021219 superfamily 243072 15 141 4.70E-32 122.107 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#25131 - CGI_10021219 superfamily 243072 82 213 4.09E-28 110.936 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#25131 - CGI_10021219 superfamily 243072 187 365 9.45E-08 51.2302 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#25131 - CGI_10021219 superfamily 243146 666 707 1.98E-07 49.1474 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#25131 - CGI_10021219 superfamily 243146 560 608 5.65E-06 45.0528 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#25131 - CGI_10021219 superfamily 243073 493 537 7.24E-06 44.7825 cl02533 SOCS superfamily - - "SOCS (suppressors of cytokine signaling) box. The SOCS box is found in the C-terminal region of CIS/SOCS family proteins (in combination with a SH2 domain), ASBs (ankyrin repeat-containing proteins with a SOCS box), SSBs (SPRY domain-containing proteins with a SOCS box), and WSBs (WD40 repeat-containing proteins with a SOCS box), as well as, other miscellaneous proteins. The function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions." Q#25131 - CGI_10021219 superfamily 243146 781 823 1.18E-05 43.8042 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. 
" Q#25131 - CGI_10021219 superfamily 243146 731 789 0.000350647 39.5799 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#25131 - CGI_10021219 superfamily 243146 629 678 0.00043828 39.1947 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#25133 - CGI_10004451 superfamily 218122 125 350 1.30E-63 208.605 cl04558 Choline_transpo superfamily N - Plasma-membrane choline transporter; This family represents a high-affinity plasma-membrane choline transporter in C.elegans which is thought to be rate-limiting for ACh synthesis in cholinergic nerve terminals. Q#25136 - CGI_10004454 superfamily 218122 301 625 1.29E-91 289.497 cl04558 Choline_transpo superfamily - - Plasma-membrane choline transporter; This family represents a high-affinity plasma-membrane choline transporter in C.elegans which is thought to be rate-limiting for ACh synthesis in cholinergic nerve terminals. Q#25138 - CGI_10004456 superfamily 247792 11 62 1.92E-06 45.8996 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#25138 - CGI_10004456 superfamily 110440 615 642 1.30E-06 45.8617 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. 
The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#25138 - CGI_10004456 superfamily 241563 158 193 3.05E-06 45.0207 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#25138 - CGI_10004456 superfamily 241563 99 140 0.00197081 36.6884 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#25139 - CGI_10004457 superfamily 241567 58 134 0.000124114 40.6846 cl00042 CASc superfamily C - "Caspase, interleukin-1 beta converting enzyme (ICE) homologues; Cysteine-dependent aspartate-directed proteases that mediate programmed cell death (apoptosis). Caspases are synthesized as inactive zymogens and activated by proteolysis of the peptide backbone adjacent to an aspartate. The resulting two subunits associate to form an (alpha)2(beta)2-tetramer which is the active enzyme. Activation of caspases can be mediated by other caspase homologs." Q#25140 - CGI_10004458 superfamily 241832 36 160 1.31E-83 261.758 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. 
They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#25141 - CGI_10004459 superfamily 241567 12 247 2.14E-26 105.375 cl00042 CASc superfamily - - "Caspase, interleukin-1 beta converting enzyme (ICE) homologues; Cysteine-dependent aspartate-directed proteases that mediate programmed cell death (apoptosis). Caspases are synthesized as inactive zymogens and activated by proteolysis of the peptide backbone adjacent to an aspartate. The resulting two subunits associate to form an (alpha)2(beta)2-tetramer which is the active enzyme. Activation of caspases can be mediated by other caspase homologs." Q#25141 - CGI_10004459 superfamily 241567 261 328 0.0001711 41.432 cl00042 CASc superfamily C - "Caspase, interleukin-1 beta converting enzyme (ICE) homologues; Cysteine-dependent aspartate-directed proteases that mediate programmed cell death (apoptosis). Caspases are synthesized as inactive zymogens and activated by proteolysis of the peptide backbone adjacent to an aspartate. The resulting two subunits associate to form an (alpha)2(beta)2-tetramer which is the active enzyme. Activation of caspases can be mediated by other caspase homologs." Q#25143 - CGI_10002572 superfamily 243072 60 186 1.95E-31 117.87 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. 
The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#25144 - CGI_10002573 superfamily 245847 17 164 5.85E-13 68.5329 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#25144 - CGI_10002573 superfamily 243065 1168 1348 3.00E-10 60.4933 cl02516 VWD superfamily - - von Willebrand factor type D domain; Luciferin-2-monooxygenase from Vargula hilgendorfii contains a vwd domain. Its function is unrelated but the similarity is very strong by several methods. Q#25145 - CGI_10002574 superfamily 245847 34 175 1.65E-07 47.7321 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#25150 - CGI_10003797 superfamily 243100 279 332 4.07E-14 66.0981 cl02576 B_zip1 superfamily - - "basic leucine zipper DNA-binding and multimerization region of GCN4 and related proteins; Basic leucine zipper (bZIP) transcription factors act in networks of homo- and hetero-dimers in the regulation in a diverse set of cellular pathways. Classical leucine zippers have alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization creates a pair of basic regions that bind DNA and undergo conformational change. GCN4 was identified in Saccharomyces cerevisiae from mutations in a deficiency in activation with the general amino acid control pathway. 
GCN4 encodes a trans-activator of amino acid biosynthetic genes containing 2 acidic activation domains and a C-terminal bZIP domain, comprised of a basic alpha-helical DNA-binding region and a coiled-coil dimerization region." Q#25151 - CGI_10003798 superfamily 247792 23 64 2.53E-12 63.6188 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#25151 - CGI_10003798 superfamily 245674 155 200 4.26E-05 43.0466 cl11531 DUF904 superfamily C - Protein of unknown function (DUF904); This family consists of several bacterial and archaeal hypothetical proteins of unknown function. Q#25151 - CGI_10003798 superfamily 221533 184 251 0.00595525 36.5208 cl13726 TMF_DNA_bd superfamily - - "TATA element modulatory factor 1 DNA binding; This is the middle region of a family of TATA element modulatory factor 1 proteins conserved in eukaryotes that contains at its N-terminal section a number of leucine zippers that could potentially form coiled coil structures. The whole proteins bind to the TATA element of some RNA polymerase II promoters and repress their activity by competing with the binding of TATA binding protein. TMFs are evolutionarily conserved golgins that bind Rab6, a ubiquitous ras-like GTP-binding Golgi protein, and contribute to Golgi organisation in animal and plant cells." 
Q#25152 - CGI_10003799 superfamily 247725 1 99 1.62E-64 206.447 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#25152 - CGI_10003799 superfamily 204041 315 460 2.09E-50 170.841 cl07367 GLTP superfamily - - Glycolipid transfer protein (GLTP); GLTP is a cytosolic protein that catalyzes the intermembrane transfer of glycolipids. Q#25153 - CGI_10003800 superfamily 246751 216 432 9.61E-86 266.802 cl14883 Lipase superfamily C - "Lipase. Lipases are esterases that can hydrolyze long-chain acyl-triglycerides into di- and monoglycerides, glycerol, and free fatty acids at a water/lipid interface. A typical feature of lipases is "interfacial activation", the process of becoming active at the lipid/water interface, although several examples of lipases have been identified that do not undergo interfacial activation . The active site of a lipase contains a catalytic triad consisting of Ser - His - Asp/Glu, but unlike most serine proteases, the active site is buried inside the structure. A "lid" or "flap" covers the active site, making it inaccessible to solvent and substrates. The lid opens during the process of interfacial activation, allowing the lipid substrate access to the active site." Q#25154 - CGI_10003801 superfamily 245814 201 266 3.97E-05 40.5503 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. 
The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#25155 - CGI_10003802 superfamily 247866 17 136 1.10E-12 61.3144 cl17312 PhyH superfamily N - "Phytanoyl-CoA dioxygenase (PhyH); This family is made up of several eukaryotic phytanoyl-CoA dioxygenase (PhyH) proteins, ectoine hydroxylases and a number of bacterial deoxygenases. PhyH is a peroxisomal enzyme catalyzing the first step of phytanic acid alpha-oxidation. PhyH deficiency causes Refsum's disease (RD) which is an inherited neurological syndrome biochemically characterized by the accumulation of phytanic acid in plasma and tissues." Q#25156 - CGI_10006360 superfamily 247856 967 1033 9.10E-26 103.453 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#25156 - CGI_10006360 superfamily 247856 839 904 3.13E-21 90.3566 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#25156 - CGI_10006360 superfamily 247856 726 792 4.58E-20 86.8898 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#25156 - CGI_10006360 superfamily 243095 536 637 7.76E-23 99.712 cl02570 RhoGAP superfamily N - "RhoGAP: GTPase-activator protein (GAP) for Rho-like GTPases; GAPs towards Rho/Rac/Cdc42-like small GTPases. Small GTPases (G proteins) cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when bound to GDP. The Rho family of small G proteins, which includes Cdc42Hs, activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. G proteins generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. The RhoGAPs are one of the major classes of regulators of Rho G proteins." Q#25156 - CGI_10006360 superfamily 243038 54 124 1.19E-21 92.2154 cl02442 DEP superfamily C - "DEP domain, named after Dishevelled, Egl-10, and Pleckstrin, where this domain was first discovered. The function of this domain is still not clear, but it is believed to be important for the membrane association of the signaling proteins in which it is present. New studies show that the DEP domain of Sst2, a yeast RGS protein is necessary and sufficient for receptor interaction." 
Q#25156 - CGI_10006360 superfamily 243095 164 286 5.21E-21 94.3192 cl02570 RhoGAP superfamily C - "RhoGAP: GTPase-activator protein (GAP) for Rho-like GTPases; GAPs towards Rho/Rac/Cdc42-like small GTPases. Small GTPases (G proteins) cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when bound to GDP. The Rho family of small G proteins, which includes Cdc42Hs, activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. G proteins generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. The RhoGAPs are one of the major classes of regulators of Rho G proteins." Q#25156 - CGI_10006360 superfamily 192987 1078 1168 0.00366401 37.5519 cl13724 TMF_TATA_bd superfamily - - "TATA element modulatory factor 1 TATA binding; This is the C-terminal conserved coiled coil region of a family of TATA element modulatory factor 1 proteins conserved in eukaryotes. The proteins bind to the TATA element of some RNA polymerase II promoters and repress their activity by competing with the binding of TATA binding protein. TMF1_TATA_bd is the most conserved part of the TMFs. TMFs are evolutionarily conserved golgins that bind Rab6, a ubiquitous ras-like GTP-binding Golgi protein, and contribute to Golgi organisation in animal and plant cells. The Rab6-binding domain appears to be the same region as this C-terminal family." Q#25157 - CGI_10006361 superfamily 247725 4 105 2.08E-68 221.165 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. 
All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#25157 - CGI_10006361 superfamily 216381 365 693 1.04E-67 227.473 cl03136 Oxysterol_BP superfamily - - Oxysterol-binding protein; Oxysterol-binding protein. Q#25158 - CGI_10006362 superfamily 241574 513 740 3.60E-102 322.228 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#25158 - CGI_10006362 superfamily 241574 803 1032 6.08E-74 244.418 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." 
Q#25158 - CGI_10006362 superfamily 241609 65 146 2.59E-07 49.6962 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#25162 - CGI_10018981 superfamily 241547 19 70 1.02E-16 70.7613 cl00012 alpha_CA superfamily N - "Carbonic anhydrase alpha (vertebrate-like) group. Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionary distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidine residues and a fourth conserved histidine plays a potential role in proton transfer." Q#25163 - CGI_10018982 superfamily 241547 1 30 7.02E-08 46.1529 cl00012 alpha_CA superfamily N - "Carbonic anhydrase alpha (vertebrate-like) group. Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. 
They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionary distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidine residues and a fourth conserved histidine plays a potential role in proton transfer." Q#25165 - CGI_10018984 superfamily 247725 529 621 7.11E-17 78.8593 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#25165 - CGI_10018984 superfamily 150957 1332 1439 5.86E-31 119.643 cl11034 Vps39_2 superfamily - - "Vacuolar sorting protein 39 domain 2; This domain is found on the vacuolar sorting protein Vps39 which is a component of the C-Vps complex. Vps39 is thought to be required for the fusion of endosomes and other types of transport intermediates with the vacuole. In Saccharomyces cerevisiae, Vps39 has been shown to stimulate nucleotide exchange. This domain is involved in localisation and in mediating the interactions of Vps39 with Vps11." Q#25165 - CGI_10018984 superfamily 243036 623 876 2.63E-24 104.627 cl02434 CNH superfamily - - "CNH domain; Domain found in NIK1-like kinase, mouse citron and yeast ROM1, ROM2. Unpublished observations." 
Q#25165 - CGI_10018984 superfamily 220718 1042 1143 1.66E-22 94.9984 cl11033 Vps39_1 superfamily - - "Vacuolar sorting protein 39 domain 1; This domain is found on the vacuolar sorting protein Vps39 which is a component of the C-Vps complex. Vps39 is thought to be required for the fusion of endosomes and other types of transport intermediates with the vacuole. In Saccharomyces cerevisiae, Vps39 has been shown to stimulate nucleotide exchange. The precise function of this domain has not been characterized." Q#25165 - CGI_10018984 superfamily 220068 2 49 1.55E-15 73.9218 cl07486 NUP50 superfamily C - "NUP50 (Nucleoporin 50 kDa); Nucleoporin 50 kDa (NUP50) acts as a cofactor for the importin-alpha:importin-beta heterodimer, which in turn allows for transportation of many nuclear-targeted proteins through nuclear pore complexes. The C terminus of NUP50 binds importin-beta through RAN-GTP, the N terminus binds the C terminus of importin-alpha, while a central domain binds importin-beta. NUP50:importin-alpha:importin-beta then binds cargo and can stimulate nuclear import. The N-terminal domain of NUP50 is also able to actively displace nuclear localisation signals from importin-alpha." Q#25165 - CGI_10018984 superfamily 207690 265 293 1.76E-05 43.8777 cl02656 zf-RanBP superfamily - - Zn-finger in Ran binding protein and others; Zn-finger in Ran binding protein and others. Q#25165 - CGI_10018984 superfamily 207690 325 352 2.96E-05 43.1073 cl02656 zf-RanBP superfamily - - Zn-finger in Ran binding protein and others; Zn-finger in Ran binding protein and others. Q#25165 - CGI_10018984 superfamily 207690 205 233 8.07E-05 41.9517 cl02656 zf-RanBP superfamily - - Zn-finger in Ran binding protein and others; Zn-finger in Ran binding protein and others. Q#25166 - CGI_10018985 superfamily 241622 6 87 3.72E-21 86.8518 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). 
Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#25166 - CGI_10018985 superfamily 241622 211 291 2.91E-19 81.8442 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. 
The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#25170 - CGI_10018989 superfamily 245208 1 367 1.52E-102 310.354 cl09933 ACAD superfamily - - "Acyl-CoA dehydrogenase; Both mitochondrial acyl-CoA dehydrogenases (ACAD) and peroxisomal acyl-CoA oxidases (AXO) catalyze the alpha,beta dehydrogenation of the corresponding trans-enoyl-CoA by FAD, which becomes reduced. The reduced form of ACAD is reoxidized in the oxidative half-reaction by electron-transferring flavoprotein (ETF), from which the electrons are transferred to the mitochondrial respiratory chain coupled with ATP synthesis. In contrast, AXO catalyzes a different oxidative half-reaction, in which the reduced FAD is reoxidized by molecular oxygen. The ACAD family includes the eukaryotic beta-oxidation enzymes, short (SCAD), medium (MCAD), long (LCAD) and very-long (VLCAD) chain acyl-CoA dehydrogenases. These enzymes all share high sequence similarity, but differ in their substrate specificities. The ACAD family also includes amino acid catabolism enzymes such as Isovaleryl-CoA dehydrogenase (IVD), short/branched chain acyl-CoA dehydrogenases (SBCAD), Isobutyryl-CoA dehydrogenase (IBDH), glutaryl-CoA dehydrogenase (GCD) and Crotonobetainyl-CoA dehydrogenase. The mitochondrial ACAD's are generally homotetramers, except for VLCAD, which is a homodimer. 
Related enzymes include the SOS adaptive response protein aidB, Naphthocyclinone hydroxylase (NcnH), and Dibenzothiophene (DBT) desulfurization enzyme C (DszC)" Q#25171 - CGI_10018990 superfamily 247723 313 401 5.10E-48 162.909 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#25171 - CGI_10018990 superfamily 247723 466 541 2.17E-44 152.476 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. 
The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#25171 - CGI_10018990 superfamily 241888 130 256 7.37E-26 105.316 cl00473 BI-1-like superfamily N - "BAX inhibitor (BI)-1/YccA-like protein family; Mammalian members of the BAX inhibitor (BI)-1 like family of small transmembrane proteins have been shown to have an antiapoptotic effect either by stimulating the antiapoptotic function of Bcl-2, a well-characterized oncogene, or by inhibiting the proapoptotic effect of Bax, another member of the Bcl-2 family. Their broad tissue distribution and high degree of conservation suggests an important regulatory role. This superfamily also contains the lifeguard(LFG)-like proteins and other subfamilies which appear to be related by common descent and also function as inhibitors of apoptosis. In plants, BI-1 like proteins play a role in pathogen resistance. A prokaryotic member, Escherichia coli YccA, has been shown to interact with ATP-dependent protease FtsH, which degrades abnormal membrane proteins as part of a quality control mechanism to keep the integrity of biological membranes." Q#25172 - CGI_10018991 superfamily 197840 96 159 2.75E-23 89.9746 cl18196 PUR superfamily - - DNA/RNA-binding repeats in PUR-alpha/beta/gamma and in hypothetical proteins from spirochetes and the Bacteroides-Cytophaga-Flexibacter bacteria; DNA/RNA-binding repeats in PUR-alpha/beta/gamma and in hypothetical proteins from spirochetes and the Bacteroides-Cytophaga-Flexibacter bacteria. 
Q#25172 - CGI_10018991 superfamily 197840 20 81 3.21E-23 89.9746 cl18196 PUR superfamily - - DNA/RNA-binding repeats in PUR-alpha/beta/gamma and in hypothetical proteins from spirochetes and the Bacteroides-Cytophaga-Flexibacter bacteria; DNA/RNA-binding repeats in PUR-alpha/beta/gamma and in hypothetical proteins from spirochetes and the Bacteroides-Cytophaga-Flexibacter bacteria. Q#25172 - CGI_10018991 superfamily 197840 169 227 2.39E-19 79.189 cl18196 PUR superfamily - - DNA/RNA-binding repeats in PUR-alpha/beta/gamma and in hypothetical proteins from spirochetes and the Bacteroides-Cytophaga-Flexibacter bacteria; DNA/RNA-binding repeats in PUR-alpha/beta/gamma and in hypothetical proteins from spirochetes and the Bacteroides-Cytophaga-Flexibacter bacteria. Q#25174 - CGI_10018993 superfamily 241958 44 432 1.75E-74 242.036 cl00573 SDF superfamily - - Sodium:dicarboxylate symporter family; Sodium:dicarboxylate symporter family. Q#25175 - CGI_10018994 superfamily 247723 41 139 1.05E-57 188.132 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." 
Q#25175 - CGI_10018994 superfamily 241568 446 496 3.80E-06 44.376 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#25176 - CGI_10018995 superfamily 243051 17 173 2.09E-48 164.473 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal chord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#25178 - CGI_10018997 superfamily 214531 783 826 4.66E-10 56.8413 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. 
Also present in a variety of molecules similar to gp300/megalin. Q#25178 - CGI_10018997 superfamily 214531 198 240 1.14E-08 52.9893 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#25178 - CGI_10018997 superfamily 215683 174 215 2.18E-08 52.1723 cl18339 Ldl_recept_b superfamily - - Low-density lipoprotein receptor repeat class B; This domain is also known as the YWTD motif after the most conserved region of the repeat. The YWTD repeat is found in multiple tandem repeats and has been predicted to form a beta-propeller structure. Q#25178 - CGI_10018997 superfamily 214531 467 509 7.31E-08 50.6781 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#25178 - CGI_10018997 superfamily 214531 833 870 3.31E-07 48.7521 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#25178 - CGI_10018997 superfamily 214531 431 463 5.82E-05 41.8185 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#25178 - CGI_10018997 superfamily 214531 510 548 0.00120172 37.9665 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. 
Also present in a variety of molecules similar to gp300/megalin. Q#25178 - CGI_10018997 superfamily 214531 749 779 0.00235795 37.1961 cl18310 LY superfamily N - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#25178 - CGI_10018997 superfamily 214531 692 734 0.0049423 36.4257 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#25179 - CGI_10018998 superfamily 247683 340 388 0.000427327 39.4437 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#25179 - CGI_10018998 superfamily 207632 777 808 0.00150835 37.4617 cl02531 Plectin superfamily - - "Plectin repeat; This family includes repeats from plectin, desmoplakin, envoplakin and bullous pemphigoid antigen." 
Q#25180 - CGI_10018999 superfamily 218802 136 177 3.14E-06 43.5042 cl05462 DUF862 superfamily N - "PPPDE putative peptidase domain; The PPPDE superfamily (after Permuted Papain fold Peptidases of DsRNA viruses and Eukaryotes), consists of predicted thiol peptidases with a circularly permuted papain-like fold. The inference of the likely DUB function of the PPPDE superfamily proteins is based on the fusions of the catalytic domain to Ub-binding PUG (PUB)/UBA domains and a novel alpha-helical Ub-associated domain (the PUL domain, after PLAP, Ufd3p and Lub1p)." Q#25182 - CGI_10019001 superfamily 218802 100 129 0.00117096 35.8002 cl05462 DUF862 superfamily N - "PPPDE putative peptidase domain; The PPPDE superfamily (after Permuted Papain fold Peptidases of DsRNA viruses and Eukaryotes), consists of predicted thiol peptidases with a circularly permuted papain-like fold. The inference of the likely DUB function of the PPPDE superfamily proteins is based on the fusions of the catalytic domain to Ub-binding PUG (PUB)/UBA domains and a novel alpha-helical Ub-associated domain (the PUL domain, after PLAP, Ufd3p and Lub1p)." Q#25183 - CGI_10019002 superfamily 218802 258 346 0.00226453 36.9558 cl05462 DUF862 superfamily N - "PPPDE putative peptidase domain; The PPPDE superfamily (after Permuted Papain fold Peptidases of DsRNA viruses and Eukaryotes), consists of predicted thiol peptidases with a circularly permuted papain-like fold. The inference of the likely DUB function of the PPPDE superfamily proteins is based on the fusions of the catalytic domain to Ub-binding PUG (PUB)/UBA domains and a novel alpha-helical Ub-associated domain (the PUL domain, after PLAP, Ufd3p and Lub1p)." Q#25186 - CGI_10019005 superfamily 243263 128 240 1.83E-15 74.3666 cl02990 ASC superfamily NC - Amiloride-sensitive sodium channel; Amiloride-sensitive sodium channel. 
Q#25186 - CGI_10019005 superfamily 243263 243 266 3.56E-07 49.7138 cl02990 ASC superfamily N - Amiloride-sensitive sodium channel; Amiloride-sensitive sodium channel. Q#25187 - CGI_10019006 superfamily 243263 8 123 8.63E-23 94.397 cl02990 ASC superfamily NC - Amiloride-sensitive sodium channel; Amiloride-sensitive sodium channel. Q#25187 - CGI_10019006 superfamily 243263 119 149 2.87E-08 52.025 cl02990 ASC superfamily N - Amiloride-sensitive sodium channel; Amiloride-sensitive sodium channel. Q#25188 - CGI_10019008 superfamily 217316 16 79 0.00977156 34.5244 cl03832 DUF234 superfamily C - Archaea bacterial proteins of unknown function; Archaea bacterial proteins of unknown function. Q#25191 - CGI_10015410 superfamily 247684 16 434 1.10E-97 306.126 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. 
The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#25192 - CGI_10015411 superfamily 247068 165 271 3.95E-20 85.8281 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#25195 - CGI_10015414 superfamily 198896 1646 1672 0.00371454 37.5865 cl07394 Ca_chan_IQ superfamily - - "Voltage gated calcium channel IQ domain; Voltage gated calcium channels control cellular calcium entry in response to changes in membrane potential. The isoleucine-glutamine (IQ) motif in the voltage gated calcium channel IQ domain interacts with hydrophobic pockets of Ca2+/calmodulin. The interaction regulates two self-regulatory calcium dependent feedback mechanism, calcium dependent inactivation (CDI), and calcium-dependent facilitation (CDF)." Q#25197 - CGI_10015416 superfamily 243072 98 234 1.60E-25 99.7654 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#25197 - CGI_10015416 superfamily 243073 350 393 0.000225407 38.5756 cl02533 SOCS superfamily - - "SOCS (suppressors of cytokine signaling) box. The SOCS box is found in the C-terminal region of CIS/SOCS family proteins (in combination with a SH2 domain), ASBs (ankyrin repeat-containing proteins with a SOCS box), SSBs (SPRY domain-containing proteins with a SOCS box), and WSBs (WD40 repeat-containing proteins with a SOCS box), as well as, other miscellaneous proteins. The function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions." Q#25201 - CGI_10015420 superfamily 241596 283 333 1.66E-09 54.5275 cl00081 HLH superfamily - - "Helix-loop-helix domain, found in specific DNA- binding proteins that act as transcription factors; 60-100 amino acids long. 
A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE 5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved in both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#25204 - CGI_10015423 superfamily 243050 120 178 6.28E-33 118.53 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. 
The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#25204 - CGI_10015423 superfamily 243050 62 115 5.50E-30 110.121 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#25204 - CGI_10015423 superfamily 241599 264 328 8.16E-17 74.202 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#25205 - CGI_10015424 superfamily 245201 112 368 9.85E-45 155.858 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. 
These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#25206 - CGI_10015425 superfamily 215731 136 384 6.76E-82 253.675 cl08245 Gln-synt_C superfamily - - "Glutamine synthetase, catalytic domain; Glutamine synthetase, catalytic domain. " Q#25206 - CGI_10015425 superfamily 217811 50 130 9.82E-16 71.7659 cl08409 Gln-synt_N superfamily - - "Glutamine synthetase, beta-Grasp domain; Glutamine synthetase, beta-Grasp domain. " Q#25207 - CGI_10015426 superfamily 215731 109 357 2.69E-84 259.067 cl08245 Gln-synt_C superfamily - - "Glutamine synthetase, catalytic domain; Glutamine synthetase, catalytic domain. " Q#25207 - CGI_10015426 superfamily 217811 24 103 6.94E-19 79.8551 cl08409 Gln-synt_N superfamily - - "Glutamine synthetase, beta-Grasp domain; Glutamine synthetase, beta-Grasp domain. " Q#25208 - CGI_10015427 superfamily 244906 599 667 2.89E-24 98.7515 cl08315 CAP_GLY superfamily - - "CAP-Gly domain; Cytoskeleton-associated proteins (CAPs) are involved in the organisation of microtubules and transportation of vesicles and organelles along the cytoskeletal network. A conserved motif, CAP-Gly, has been identified in a number of CAPs, including CLIP-170 and dynactins. The crystal structure of Caenorhabditis elegans F53F4.3 protein CAP-Gly domain was recently solved. The domain contains three beta-strands. The most conserved sequence, GKNDG, is located in two consecutive sharp turns on the surface, forming the entrance to a groove." Q#25212 - CGI_10015910 superfamily 243134 310 434 2.12E-25 102.342 cl02663 Fasciclin superfamily - - "Fasciclin domain; This extracellular domain is found repeated four times in grasshopper fasciclin I as well as in proteins from mammals, sea urchins, plants, yeast and bacteria." 
Q#25212 - CGI_10015910 superfamily 243134 589 711 2.40E-23 96.5643 cl02663 Fasciclin superfamily - - "Fasciclin domain; This extracellular domain is found repeated four times in grasshopper fasciclin I as well as in proteins from mammals, sea urchins, plants, yeast and bacteria." Q#25212 - CGI_10015910 superfamily 243134 449 575 1.27E-18 83.0824 cl02663 Fasciclin superfamily - - "Fasciclin domain; This extracellular domain is found repeated four times in grasshopper fasciclin I as well as in proteins from mammals, sea urchins, plants, yeast and bacteria." Q#25212 - CGI_10015910 superfamily 243134 153 297 2.68E-18 81.9268 cl02663 Fasciclin superfamily - - "Fasciclin domain; This extracellular domain is found repeated four times in grasshopper fasciclin I as well as in proteins from mammals, sea urchins, plants, yeast and bacteria." Q#25213 - CGI_10015911 superfamily 241833 197 424 7.59E-25 102.601 cl00389 SIS superfamily - - SIS domain. SIS (Sugar ISomerase) domains are found in many phosphosugar isomerases and phosphosugar binding proteins. SIS domains are also found in proteins that regulate the expression of genes involved in synthesis of phosphosugars. Q#25213 - CGI_10015911 superfamily 241833 22 162 3.20E-12 65.2365 cl00389 SIS superfamily C - SIS domain. SIS (Sugar ISomerase) domains are found in many phosphosugar isomerases and phosphosugar binding proteins. SIS domains are also found in proteins that regulate the expression of genes involved in synthesis of phosphosugars. Q#25216 - CGI_10015914 superfamily 221506 192 361 3.93E-26 108.76 cl13686 BSMAP superfamily - - Brain specific membrane anchored protein; This family of proteins is found in eukaryotes. Proteins in this family are typically between 285 and 331 amino acids in length. BSMAP has a putative transmembrane domain and is predicted to be a type I membrane glycoprotein. 
Q#25216 - CGI_10015914 superfamily 221506 1 112 1.27E-11 64.8473 cl13686 BSMAP superfamily N - Brain specific membrane anchored protein; This family of proteins is found in eukaryotes. Proteins in this family are typically between 285 and 331 amino acids in length. BSMAP has a putative transmembrane domain and is predicted to be a type I membrane glycoprotein. Q#25216 - CGI_10015914 superfamily 220131 987 1249 0.00965798 38.4114 cl11721 DUF1943 superfamily - - "Domain of unknown function (DUF1943); Members of this family adopt a structure consisting of several large open beta-sheets. Their exact function has not, as yet, been determined." Q#25217 - CGI_10015915 superfamily 216112 468 818 3.27E-96 309.997 cl02964 RNB superfamily - - RNB domain; This domain is the catalytic domain of ribonuclease II. Q#25217 - CGI_10015915 superfamily 246722 21 196 6.08E-32 124.301 cl14812 PIN_SF superfamily - - "PIN (PilT N terminus) domain: Superfamily; PIN_SF The PIN (PilT N terminus) domain belongs to a large nuclease superfamily with representatives from eukaryota, eubacteria, and archaea. PIN domains were originally named for their sequence similarity to the N-terminal domain of an annotated pili biogenesis protein, PilT, a domain fusion between a PIN-domain and a PilT ATPase domain. The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues) is geometrically similar in the active center of structure-specific 5' nucleases (also known as Flap endonuclease-1-like), PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Seen here, are two major divisions in the PIN domain superfamily. The first major division, the structure-specific 5' nuclease family, is represented by FEN1, the 5'-3' exonuclease of DNA polymerase I, and T4 RNase H nuclease PIN domains. 
These 5' nucleases are involved in DNA replication, repair, and recombination. They are capable of both 5'-3' exonucleolytic activity and cleaving bifurcated DNA, in an endonucleolytic, structure-specific manner. Unique to FEN1-like nucleases, the PIN domain has a helical arch/clamp region (I domain) of variable length (approximately 16 to 800 residues) and, inserted within the C-terminal region of the PIN domain, a H3TH (helix-3-turn-helix) domain, an atypical helix-hairpin-helix-2-like region. Both the H3TH domain (not included here) and the helical arch/clamp region are involved in DNA binding. With the exception of Mkt1, these nucleases have a carboxylate rich active site that is involved in binding essential divalent metal ion cofactors (Mg2+, Mn2+, Zn2+, or Co2+). The second major division of the PIN domain superfamily, the VapC-Smg6 family, includes such eukaryotic ribonucleases as, Smg6, an essential factor in nonsense-mediated mRNA decay; Rrp44, the catalytic subunit of the exosome; and Nob1, a ribosome assembly factor critical in pre-rRNA processing. A large percentage of members in this family are bacterial ribonuclease toxins of TA operons such as Mycobacterium tuberculosis VapC and Neisseria gonorrhoeae FitB, as well as, archaeal homologs, Pyrobaculum aerophilum Pea0151 and P. aerophilum Pae2754. Also included are the eukaryotic Fcf1/ Utp24 (FAF1-copurifying factor 1/U three-associated protein 24) and Utp23-like proteins. Components of the small subunit processome, Fcf1/Utp24 and Utp23 are essential proteins involved in pre-rRNA processing and 40S ribosomal subunit assembly." Q#25218 - CGI_10015916 superfamily 215825 64 263 1.05E-55 187.124 cl02828 Calreticulin superfamily N - Calreticulin family; Calreticulin family. Q#25218 - CGI_10015916 superfamily 215825 21 64 5.20E-08 52.3042 cl02828 Calreticulin superfamily C - Calreticulin family; Calreticulin family. 
Q#25219 - CGI_10015917 superfamily 215825 128 423 2.63E-111 338.122 cl02828 Calreticulin superfamily - - Calreticulin family; Calreticulin family. Q#25222 - CGI_10015920 superfamily 245814 60 120 1.33E-07 49.4099 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#25222 - CGI_10015920 superfamily 245814 262 317 0.000159055 40.1651 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#25222 - CGI_10015920 superfamily 245814 346 432 1.27E-09 55.5893 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#25223 - CGI_10015921 superfamily 222432 566 659 3.27E-10 58.0882 cl16451 Bravo_FIGEY superfamily N - C-terminal domain of Fibronectin type III; This is the very C-terminal region of neural adhesion molecule L1 proteins that are also known as Bravo or NrCAM. It lies upstream of the IG and Fn3 domains and has the highly conserved motif FIGEY. The function is not known. Q#25223 - CGI_10015921 superfamily 245814 333 419 1.34E-08 52.8929 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#25223 - CGI_10015921 superfamily 245814 39 112 1.24E-06 47.055 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. 
A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#25223 - CGI_10015921 superfamily 245814 245 312 2.95E-05 42.8777 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#25224 - CGI_10015922 superfamily 245814 169 254 6.55E-10 56.3597 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#25224 - CGI_10015922 superfamily 245814 23 90 3.74E-06 45.083 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#25224 - CGI_10015922 superfamily 245814 409 486 0.000372316 39.1465 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#25229 - CGI_10019043 superfamily 241583 121 156 9.82E-07 45.2547 cl00064 ZnMc superfamily C - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." 
Q#25232 - CGI_10019046 superfamily 238012 960 1004 1.06E-13 68.1498 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#25232 - CGI_10019046 superfamily 238012 415 461 9.69E-12 62.757 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#25232 - CGI_10019046 superfamily 238012 746 790 1.66E-11 61.9866 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#25232 - CGI_10019046 superfamily 238012 462 503 6.65E-10 57.3642 cl11390 EGF_Lam superfamily C - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement 
membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#25232 - CGI_10019046 superfamily 238012 907 958 3.23E-09 55.4382 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#25232 - CGI_10019046 superfamily 238012 1007 1054 2.61E-08 52.7418 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#25232 - CGI_10019046 superfamily 238012 297 348 2.47E-06 46.9638 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first 
three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#25232 - CGI_10019046 superfamily 238012 853 899 5.85E-05 42.7266 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#25232 - CGI_10019046 superfamily 238012 359 405 0.000903288 39.2598 cl11390 EGF_Lam superfamily C - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#25232 - CGI_10019046 superfamily 243198 56 295 3.37E-105 336.254 cl02806 Laminin_N superfamily - - Laminin N-terminal (Domain VI); Laminin N-terminal (Domain VI). Q#25232 - CGI_10019046 superfamily 243080 579 711 8.70E-28 111.759 cl02548 Laminin_B superfamily - - Laminin B (Domain IV); Laminin B (Domain IV). 
Q#25232 - CGI_10019046 superfamily 225368 1136 1243 0.000340212 40.8626 cl01058 NtpF superfamily - - Archaeal/vacuolar-type H+-ATPase subunit H [Energy production and conversion] Q#25232 - CGI_10019046 superfamily 191136 1217 1301 0.00169609 39.4486 cl04860 Transcrip_act superfamily NC - Transcriptional activator; This family of proteins may act as a transcriptional activator. It plays a role in stress response in plants. Q#25233 - CGI_10019047 superfamily 241782 589 828 6.77E-28 116.193 cl00321 AAT_I superfamily C - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phophorylase family (fold type V)." 
Q#25233 - CGI_10019047 superfamily 217293 238 376 6.50E-22 95.3923 cl03788 Neur_chan_LBD superfamily C - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#25233 - CGI_10019047 superfamily 217293 8 131 5.88E-17 80.7547 cl03788 Neur_chan_LBD superfamily N - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#25233 - CGI_10019047 superfamily 202474 140 239 2.33E-10 60.3601 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#25233 - CGI_10019047 superfamily 202474 425 511 5.68E-10 59.2045 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#25234 - CGI_10019048 superfamily 222150 1056 1081 2.35E-05 43.1493 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#25234 - CGI_10019048 superfamily 222150 965 987 0.00174723 37.7565 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#25235 - CGI_10019049 superfamily 245029 17 147 1.93E-16 71.1396 cl09190 MAPEG superfamily - - "MAPEG family; This family is has been called MAPEG (Membrane Associated Proteins in Eicosanoid and Glutathione metabolism). It includes proteins such as Prostaglandin E synthase. This enzyme catalyzes the synthesis of PGE2 from PGH2 (produced by cyclooxygenase from arachidonic acid). Because of structural similarities in the active sites of FLAP, LTC4 synthase and PGE synthase, substrates for each enzyme can compete with one another and modulate synthetic activity." 
Q#25236 - CGI_10019050 superfamily 241678 8 404 0 574.202 cl00198 Phosphoglycerate_kinase superfamily - - "Phosphoglycerate kinase (PGK) is a monomeric enzyme which catalyzes the transfer of the high-energy phosphate group of 1,3-bisphosphoglycerate to ADP, forming ATP and 3-phosphoglycerate. This reaction represents the first of the two substrate-level phosphorylation events in the glycolytic pathway. Substrate-level phosphorylation is defined as production of ATP by a process, which is catalyzed by water-soluble enzymes in the cytosol; not involving membranes and ion gradients." Q#25237 - CGI_10019051 superfamily 218520 20 185 5.59E-10 54.5918 cl05007 EBP superfamily - - "Emopamil binding protein; Emopamil binding protein (EBP) is as a gene that encodes a non-glycosylated type I integral membrane protein of endoplasmic reticulum and shows high level expression in epithelial tissues. The EBP protein has emopamil binding domains, including the sterol acceptor site and the catalytic centre, which show Delta8-Delta7 sterol isomerase activity. Human sterol isomerase, a homologue of mouse EBP, is suggested not only to play a role in cholesterol biosynthesis, but also to affect lipoprotein internalisation. In humans, mutations of EBP are known to cause the genetic disorder of X-linked dominant chondrodysplasia punctata (CDPX2). This syndrome of humans is lethal in most males, and affected females display asymmetric hyperkeratotic skin and skeletal abnormalities." 
Q#25240 - CGI_10019054 superfamily 247684 15 71 1.82E-05 40.5085 cl17037 NBD_sugar-kinase_HSP70_actin superfamily C - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#25243 - CGI_10019057 superfamily 241832 505 608 1.66E-51 173.511 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. 
The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#25243 - CGI_10019057 superfamily 241832 45 144 1.98E-38 136.972 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#25243 - CGI_10019057 superfamily 241832 160 258 2.48E-37 134.275 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. 
They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#25243 - CGI_10019057 superfamily 241832 280 369 1.61E-19 84.8446 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#25243 - CGI_10019057 superfamily 241832 375 464 1.51E-18 81.9991 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. 
Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#25244 - CGI_10019058 superfamily 241600 342 399 1.60E-18 82.6735 cl00085 FReD superfamily C - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#25245 - CGI_10019059 superfamily 241600 327 536 7.66E-44 155.476 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. 
C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#25245 - CGI_10019059 superfamily 149439 240 306 2.64E-18 79.6766 cl07120 Rpn3_C superfamily - - Proteasome regulatory subunit C-terminal; This eukaryotic domain is found at the C-terminus of 26S proteasome regulatory subunits such as the non-ATPase Rpn3 subunit which is essential for proteasomal function. It occurs together with the PCI/PINT domain (pfam01399). Q#25245 - CGI_10019059 superfamily 242889 168 258 7.72E-18 78.823 cl02111 PCI superfamily - - "PCI domain; This domain has also been called the PINT motif (Proteasome, Int-6, Nip-1 and TRIP-15)." Q#25245 - CGI_10019059 superfamily 243034 82 115 0.0060424 34.7975 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the 
Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#25246 - CGI_10019060 superfamily 242889 108 206 1.83E-23 92.2809 cl02111 PCI superfamily - - "PCI domain; This domain has also been called the PINT motif (Proteasome, Int-6, Nip-1 and TRIP-15)." Q#25246 - CGI_10019060 superfamily 149439 213 279 2.35E-19 79.6766 cl07120 Rpn3_C superfamily - - Proteasome regulatory subunit C-terminal; This eukaryotic domain is found at the C-terminus of 26S proteasome regulatory subunits such as the non-ATPase Rpn3 subunit which is essential for proteasomal function. It occurs together with the PCI/PINT domain (pfam01399). Q#25247 - CGI_10019061 superfamily 217293 27 112 2.89E-25 95.3923 cl03788 Neur_chan_LBD superfamily C - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#25248 - CGI_10019062 superfamily 217293 9 111 6.47E-32 120.815 cl03788 Neur_chan_LBD superfamily N - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#25248 - CGI_10019062 superfamily 202474 227 339 2.58E-28 110.821 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#25248 - CGI_10019062 superfamily 202474 118 168 2.25E-11 61.9009 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. 
Q#25248 - CGI_10019062 superfamily 217293 176 220 3.54E-07 49.1683 cl03788 Neur_chan_LBD superfamily N - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#25249 - CGI_10019063 superfamily 217293 15 221 1.79E-70 219.041 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#25249 - CGI_10019063 superfamily 202474 228 278 8.40E-12 61.9009 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#25250 - CGI_10019064 superfamily 218652 19 318 3.26E-128 382.027 cl12311 CLPTM1 superfamily N - "Cleft lip and palate transmembrane protein 1 (CLPTM1); This family consists of several eukaryotic cleft lip and palate transmembrane protein 1 sequences. Cleft lip with or without cleft palate is a common birth defect that is genetically complex. The nonsyndromic forms have been studied genetically using linkage and candidate-gene association studies with only partial success in defining the loci responsible for orofacial clefting. CLPTM1 encodes a transmembrane protein and has strong homology to two Caenorhabditis elegans genes, suggesting that CLPTM1 may belong to a new gene family. This family also contains the human cisplatin resistance related protein CRR9p which is associated with CDDP-induced apoptosis." 
Q#25251 - CGI_10019065 superfamily 224127 45 164 8.61E-08 51.9627 cl18702 Gid superfamily C - "NAD(FAD)-utilizing enzyme possibly involved in translation [Translation, ribosomal structure and biogenesis]" Q#25253 - CGI_10019067 superfamily 235416 33 114 4.98E-12 66.7076 cl18883 PRK05335 superfamily N - tRNA (uracil-5-)-methyltransferase Gid; Reviewed Q#25253 - CGI_10019067 superfamily 243130 416 458 2.32E-09 53.6485 cl02655 CUE superfamily - - "CUE domain; CUE domains have been shown to bind ubiquitin. It has been suggested that CUE domains are related to pfam00627 and this has been confirmed by the structure of the domain. CUE domains also occur in two proteins of the IL-1 signal transduction pathway, tollip and TAB2." Q#25253 - CGI_10019067 superfamily 206103 260 328 0.000953087 37.4894 cl16485 GIDA_assoc_3 superfamily - - "GidA associated domain 3; The GidA associated domain 3 is a motif that has been identified at the C-terminus of protein GidA. It consists of 4 helices, the last three being rather short and forming a small bundle at the top end of the first longer one. It is here named helical domain 3 because in GidA it is preceded by two other C-terminal helical domains (based on crystal structures). GidA is a tRNA modification enzyme found in bacteria and mitochondria. Based on mutational analysis this domain has been suggested to be implicated in binding of the D-stem of tRNA and to be responsible for the interaction with protein MnmE. Structures of GidA in complex with either tRNA or MnmE are missing. Reported to bind to Pfam family MnmE, pfam12631." 
Q#25255 - CGI_10019069 superfamily 243092 57 334 1.12E-09 58.5004 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#25255 - CGI_10019069 superfamily 203864 477 504 1.87E-06 45.4951 cl06967 NUC153 superfamily - - NUC153 domain; This small domain is found in a novel nucleolar family. Q#25255 - CGI_10019069 superfamily 131316 308 438 0.00273386 39.2153 cl17983 benz_CoA_red_C superfamily N - "benzoyl-CoA reductase, subunit C; This model describes C subunit of benzoyl-CoA reductase, a 4-subunit enzyme. Many aromatic compounds are metabolized by way of benzoyl-CoA. This enzyme acts under anaerobic conditions." Q#25256 - CGI_10019070 superfamily 248097 205 230 0.00214581 36.089 cl17543 C1q superfamily NC - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. 
Q#25257 - CGI_10019071 superfamily 241645 384 460 2.35E-13 67.5997 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#25257 - CGI_10019071 superfamily 243092 745 1066 1.21E-15 77.7604 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and 
bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#25257 - CGI_10019071 superfamily 190637 661 731 2.13E-13 67.811 cl04081 HELP superfamily - - "HELP motif; The founding member of the EMAP protein family is the 75 kDa Echinoderm Microtubule-Associated Protein, so-named for its abundance in sea urchin, sand dollar and starfish eggs. The Hydrophobic EMAP-Like Protein (HELP) motif was identified initially in the human EMAP-Like Protein 2 (EML2) and subsequently in the entire EMAP Protein family. The HELP motif is approximately 60-70 amino acids in length and is conserved amongst metazoans. Although the HELP motif is hydrophobic, there is no evidence that EMAP-Like Proteins are membrane-associated. All members of the EMAP-Like Protein family, identified to-date, are constructed with an amino terminal HELP motif followed by a WD domain. In C. elegans, EMAP-Like Protein-1 (ELP-1) is required for touch sensation indicating that ELP-1 may play a role in mechanosensation. The localization of ELP-1 to microtubules and adhesion sites implies that ELP-1 may transmit forces between the body surface and the touch receptor neurons." 
Q#25257 - CGI_10019071 superfamily 243092 905 1262 2.68E-13 70.8268 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#25258 - CGI_10019072 superfamily 242141 74 355 2.19E-121 361.765 cl00851 Fumerase superfamily - - "Fumarate hydratase (Fumerase); This family consists of several bacterial fumarate hydratase proteins FumA and FumB. Fumarase, or fumarate hydratase (EC 4.2.1.2), is a component of the citric acid cycle. In facultative anaerobes such as Escherichia coli, fumarase also engages in the reductive pathway from oxaloacetate to succinate during anaerobic growth. Three fumarases, FumA, FumB, and FumC, have been reported in E. coli. fumA and fumB genes are homologous and encode products of identical sizes which form thermolabile dimers of Mr 120,000. 
FumA and FumB are class I enzymes and are members of the iron-dependent hydrolases, which include aconitase and malate hydratase. The active FumA contains a 4Fe-4S centre, and it can be inactivated upon oxidation to give a 3Fe-4S centre." Q#25258 - CGI_10019072 superfamily 242096 359 568 3.78E-100 304.445 cl00795 Fumerase_C superfamily - - "Fumarase C-terminus; This family consists of the C terminal region of several bacterial fumarate hydratase proteins (FumA and FumB). Fumarase, or fumarate hydratase (EC 4.2.1.2), is a component of the citric acid cycle. In facultative anaerobes such as Escherichia coli, fumarase also engages in the reductive pathway from oxaloacetate to succinate during anaerobic growth." Q#25260 - CGI_10019074 superfamily 242075 25 218 4.80E-74 225.519 cl00764 EMG1 superfamily - - EMG1/NEP1 methyltransferase; Members of this family are essential for 40S ribosomal biogenesis. The structure of EMG1 has revealed that it is a novel member of the superfamily of alpha/beta knot fold methyltransferases. Q#25261 - CGI_10019075 superfamily 241571 167 270 5.89E-24 94.0162 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#25262 - CGI_10019076 superfamily 148158 119 196 5.85E-23 88.7287 cl05731 ICAT superfamily - - "Beta-catenin-interacting protein ICAT; This family consists of several eukaryotic beta-catenin-interacting (ICAT) proteins. Beta-catenin is a multifunctional protein involved in both cell adhesion and transcriptional activation. Transcription mediated by the beta-catenin/Tcf complex is involved in embryological development and is upregulated in various cancers. ICAT selectively inhibits beta-catenin/Tcf binding in vivo, without disrupting beta-catenin/cadherin interactions." 
Q#25263 - CGI_10019077 superfamily 247041 55 325 4.19E-95 288.061 cl15692 CE4_SF superfamily - - "Catalytic NodB homology domain of the carbohydrate esterase 4 superfamily; The carbohydrate esterase 4 (CE4) superfamily mainly includes chitin deacetylases (EC 3.5.1.41), bacterial peptidoglycan N-acetylglucosamine deacetylases (EC 3.5.1.-), and acetylxylan esterases (EC 3.1.1.72), which catalyze the N- or O-deacetylation of substrates such as acetylated chitin, peptidoglycan, and acetylated xylan, respectively. Members in this superfamily contain a NodB homology domain that adopts a deformed (beta/alpha)8 barrel fold, which encompasses a mononuclear metalloenzyme employing a conserved His-His-Asp zinc-binding triad, closely associated with the conserved catalytic base (aspartic acid) and acid (histidine) to carry out acid/base catalysis. The NodB homology domain of CE4 superfamily is remotely related to the 7-stranded beta/alpha barrel catalytic domain of the superfamily consisting of family 38 glycoside hydrolases (GH38), family 57 heat stable retaining glycoside hydrolases (GH57), lactam utilization protein LamB/YcsF family proteins, and YdjC-family proteins." Q#25265 - CGI_10019079 superfamily 241567 383 617 4.44E-81 258.299 cl00042 CASc superfamily - - "Caspase, interleukin-1 beta converting enzyme (ICE) homologues; Cysteine-dependent aspartate-directed proteases that mediate programmed cell death (apoptosis). Caspases are synthesized as inactive zymogens and activated by proteolysis of the peptide backbone adjacent to an aspartate. The resulting two subunits associate to form an (alpha)2(beta)2-tetramer which is the active enzyme. Activation of caspases can be mediated by other caspase homologs." 
Q#25265 - CGI_10019079 superfamily 246680 10 65 0.0013872 37.5664 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#25266 - CGI_10019080 superfamily 243092 49 341 2.18E-89 272.286 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are 
proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#25267 - CGI_10019081 superfamily 221564 33 105 8.85E-20 79.1779 cl13797 P5-ATPase superfamily C - "P5-type ATPase cation transporter; This domain family is found in eukaryotes, and is typically between 110 and 126 amino acids in length. The family is found in association with pfam00122, pfam00702. P-type ATPases comprise a large superfamily of proteins, present in both prokaryotes and eukaryotes, that transport inorganic cations and other substrates across cell membranes." Q#25271 - CGI_10014273 superfamily 244897 36 214 1.28E-18 83.6894 cl08298 PTZ00007 superfamily C - (NAP-L) nucleosome assembly protein -L; Provisional Q#25272 - CGI_10014274 superfamily 247675 45 339 8.24E-131 378.379 cl17011 Arginase_HDAC superfamily - - "Arginase-like and histone-like hydrolases; Arginase-like/histone-like hydrolase superfamily includes metal-dependent enzymes that belong to Arginase-like amidino hydrolase family and histone/histone-like deacetylase class I, II, IV family, respectively. These enzymes catalyze hydrolysis of amide bond. Arginases are known to be involved in control of cellular levels of arginine and ornithine, in histidine and arginine degradation and in clavulanic acid biosynthesis. Deacetylases play a role in signal transduction through histone and/or other protein modification and can repress/activate transcription of a number of different genes. They participate in different cellular processes including cell cycle regulation, DNA damage response, embryonic development, cytokine signaling important for immune response and post-translational control of the acetyl coenzyme A synthetase. Mammalian histone deacetylases are known to be involved in progression of different tumors. Specific inhibitors of mammalian histone deacetylases are an emerging class of promising novel anticancer drugs." 
Q#25274 - CGI_10014276 superfamily 219542 104 204 2.04E-41 147.388 cl18517 Cu-oxidase_3 superfamily - - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. Q#25274 - CGI_10014276 superfamily 219541 493 634 3.41E-28 110.636 cl18516 Cu-oxidase_2 superfamily N - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. Q#25274 - CGI_10014276 superfamily 215896 264 383 9.11E-24 98.5212 cl18351 Cu-oxidase superfamily N - Multicopper oxidase; Many of the proteins in this family contain multiple similar copies of this plastocyanin-like domain. Q#25275 - CGI_10014277 superfamily 248097 41 160 6.24E-19 78.0758 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#25276 - CGI_10014278 superfamily 215896 31 125 1.95E-20 81.5724 cl18351 Cu-oxidase superfamily N - Multicopper oxidase; Many of the proteins in this family contain multiple similar copies of this plastocyanin-like domain. Q#25277 - CGI_10014279 superfamily 219542 44 144 1.01E-41 136.603 cl18517 Cu-oxidase_3 superfamily - - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. Q#25278 - CGI_10014280 superfamily 219541 169 322 2.19E-29 111.021 cl18516 Cu-oxidase_2 superfamily - - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. Q#25278 - CGI_10014280 superfamily 215896 2 66 5.74E-16 73.8684 cl18351 Cu-oxidase superfamily N - Multicopper oxidase; Many of the proteins in this family contain multiple similar copies of this plastocyanin-like domain. 
Q#25279 - CGI_10014281 superfamily 219542 133 231 9.19E-41 138.143 cl18517 Cu-oxidase_3 superfamily - - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. Q#25280 - CGI_10014282 superfamily 219542 161 263 9.85E-40 142.766 cl18517 Cu-oxidase_3 superfamily - - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. Q#25280 - CGI_10014282 superfamily 219541 550 698 6.95E-27 107.169 cl18516 Cu-oxidase_2 superfamily - - Multicopper oxidase; This entry contains many divergent copper oxidase-like domains that are not recognised by the pfam00394 model. Q#25280 - CGI_10014282 superfamily 215896 320 440 5.56E-26 104.684 cl18351 Cu-oxidase superfamily N - Multicopper oxidase; Many of the proteins in this family contain multiple similar copies of this plastocyanin-like domain. Q#25281 - CGI_10014283 superfamily 247727 134 242 0.000380788 38.5703 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#25283 - CGI_10014285 superfamily 241580 291 363 9.37E-36 128.441 cl00061 FH superfamily - - "Forkhead (FH), also known as a "winged helix". FH is named for the Drosophila fork head protein, a transcription factor which promotes terminal rather than segmental development. 
This family of transcription factor domains, which bind to B-DNA as monomers, are also found in the Hepatocyte nuclear factor (HNF) proteins, which provide tissue-specific gene regulation. The structure contains 2 flexible loops or "wings" in the C-terminal region, hence the term winged helix." Q#25284 - CGI_10014286 superfamily 243035 70 188 2.54E-30 117.336 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. 
C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#25284 - CGI_10014286 superfamily 215827 383 559 7.89E-33 127.585 cl02830 Tyrosinase superfamily - - Common central domain of tyrosinase; This family also contains polyphenol oxidases and some hemocyanins. Binds two copper ions via two sets of three histidines. This family is related to pfam00372. Q#25285 - CGI_10014287 superfamily 246676 5 184 1.77E-36 127.791 cl14616 Cyt_b561 superfamily - - "Eukaryotic cytochrome b(561); Cytochrome b(561) is a family of endosomal or secretory vesicle-specific electron transport proteins. They are integral membrane proteins that bind two heme groups non-covalently, and may have six alpha-helical trans-membrane segments. This is an exclusively eukaryotic family. Members of the prokaryotic cytochrome b561 family are not deemed homologous." 
Q#25286 - CGI_10014288 superfamily 243066 11 114 3.75E-16 74.5761 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#25286 - CGI_10014288 superfamily 198867 91 209 3.73E-05 42.3285 cl06652 BACK superfamily - - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation)." Q#25286 - CGI_10014288 superfamily 243146 304 344 6.18E-05 41.1207 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#25286 - CGI_10014288 superfamily 243146 337 380 0.000742988 37.641 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. 
" Q#25289 - CGI_10014291 superfamily 129885 80 252 2.40E-42 146.731 cl17977 nst superfamily C - "UDP-galactose transporter; The 10-12 TMS Nucleotide Sugar Transporters (TC 2.A.7.10)Nucleotide-sugar transporters (NSTs) are found in the Golgi apparatus and the endoplasmic reticulum of eukaryotic cells. Members of the family have been sequenced from yeast, protozoans and animals. Animals such as C. elegans possess many of these transporters. Humans have at least two closely related isoforms of the UDP-galactose:UMP exchange transporter.NSTs generally appear to function by antiport mechanisms, exchanging a nucleotide-sugar for a nucleotide. Thus, CMP-sialic acid is exchanged for CMP; GDP-mannose is preferentially exchanged for GMP, and UDP-galactose and UDP-N-acetylglucosamine are exchanged for UMP (or possibly UDP). Other nucleotide sugars (e.g., GDP-fucose, UDP-xylose, UDP-glucose, UDP-N-acetylgalactosamine, etc.) may also be transported in exchange for various nucleotides, but their transporters have not been molecularly characterized. Each compound appears to be translocated by its own transport protein. Transport allows the compound, synthesized in the cytoplasm, to be exported to the lumen of the Golgi apparatus or the endoplasmic reticulum where it is used for the synthesis of glycoproteins and glycolipids." Q#25290 - CGI_10010224 superfamily 202894 71 138 9.81E-26 94.595 cl04406 Mpv17_PMP22 superfamily - - "Mpv17 / PMP22 family; The 22-kDa peroxisomal membrane protein (PMP22) is a major component of peroxisomal membranes. PMP22 seems to be involved in pore forming activity and may contribute to the unspecific permeability of the organelle membrane. PMP22 is synthesised on free cytosolic ribosomes and then directed to the peroxisome membrane by specific targeting information. Mpv17 is a closely related peroxisomal protein. In mouse, the Mpv17 protein is involved in the development of early-onset glomerulosclerosis. More recently a homolog of Mpv17 in S. 
cerevisiae has been found to be an integral membrane protein of the inner mitochondrial membrane where it has been proposed to have a role in ethanol metabolism and tolerance during heat-shock. Defects in MPV17 are associated with mitochondrial DNA depletion syndrome (MDDS) and Navajo neurohepatopathy (NNH). MDDS is a clinically heterogeneous group of disorders characterized by a reduction in mitochondrial DNA (mtDNA) copy number. Primary mtDNA depletion is inherited as an autosomal recessive trait and may affect single organs, typically muscle or liver, or multiple tissues. Individuals with the hepatocerebral form of mitochondrial DNA depletion syndrome have early progressive liver failure and neurologic abnormalities, hypoglycemia, and increased lactate in body fluids. NNH is an autosomal recessive disease that is prevalent among Navajo children in the South Western states of America. The major clinical features are hepatopathy, peripheral neuropathy, corneal anesthesia and scarring, acral mutilation, cerebral leukoencephalopathy, failure to thrive, and recurrent metabolic acidosis with intercurrent infections. Infantile, childhood, and classic forms of NNH have been described. Mitochondrial DNA depletion was detected in the livers of patients, suggesting a primary defect in mtDNA maintenance." Q#25291 - CGI_10010225 superfamily 247755 1869 2093 3.27E-109 349.111 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family.
ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#25291 - CGI_10010225 superfamily 247755 833 1050 3.29E-105 337.555 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#25292 - CGI_10010226 superfamily 248097 6 124 1.05E-17 73.8386 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#25293 - CGI_10010227 superfamily 248097 125 254 3.66E-18 77.6906 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#25294 - CGI_10010229 superfamily 248097 74 203 7.50E-18 76.1498 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#25295 - CGI_10010230 superfamily 247727 79 176 1.09E-10 56.2842 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). 
There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#25296 - CGI_10010231 superfamily 241913 54 165 2.04E-28 103.021 cl00509 hot_dog superfamily - - "The hotdog fold was initially identified in the E. coli FabA (beta-hydroxydecanoyl-acyl carrier protein (ACP)-dehydratase) structure and subsequently in 4HBT (4-hydroxybenzoyl-CoA thioesterase) from Pseudomonas. A number of other seemingly unrelated proteins also share the hotdog fold. These proteins have related, but distinct, catalytic activities that include metabolic roles such as thioester hydrolysis in fatty acid metabolism, and degradation of phenylacetic acid and the environmental pollutant 4-chlorobenzoate. This superfamily also includes the PaaI-like protein FapR, a non-catalytic bacterial homolog involved in transcriptional regulation of fatty acid biosynthesis." Q#25297 - CGI_10010232 superfamily 117316 211 250 2.09E-07 47.0093 cl07385 zf-RING-like superfamily - - RING-like domain; This is a zinc finger domain that is related to the C3HC4 RING finger domain (pfam00097). Q#25298 - CGI_10010233 superfamily 247805 225 270 5.26E-06 44.783 cl17251 DEXDc superfamily C - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 
Q#25299 - CGI_10010234 superfamily 247905 498 615 7.71E-28 110.79 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#25299 - CGI_10010234 superfamily 247805 225 438 6.87E-53 184.996 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#25302 - CGI_10010237 superfamily 150797 12 181 9.74E-57 179.11 cl10866 DUF2366 superfamily - - Uncharacterized conserved protein (DUF2366); This is a family of proteins conserved from nematodes to humans. The function is not known. Q#25303 - CGI_10010238 superfamily 241584 169 255 6.69E-05 42.4835 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." 
Q#25303 - CGI_10010238 superfamily 241584 463 559 0.000380513 39.7871 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#25304 - CGI_10007383 superfamily 245235 565 959 0 818.045 cl10023 POLBc superfamily - - "DNA polymerase type-B family catalytic domain. DNA-directed DNA polymerases elongate DNA by adding nucleotide triphosphate (dNTP) residues to the 5'-end of the growing chain of DNA. DNA-directed DNA polymerases are multifunctional with both synthetic (polymerase) and degradative modes (exonucleases) and play roles in the processes of DNA replication, repair, and recombination. DNA-dependent DNA polymerases can be classified in six main groups based upon their phylogenetic relationships with E. coli polymerase I (class A), E. coli polymerase II (class B), E. coli polymerase III (class C), euryarchaeota polymerase II (class D), human polymerase beta (class x), E. coli UmuC/DinB, and eukaryotic RAP 30/Xeroderma pigmentosum variant (class Y). Family B DNA polymerases include E. coli DNA polymerase II, some eubacterial phage DNA polymerases, nuclear replicative DNA polymerases (alpha, delta, epsilon, and zeta), and eukaryotic viral and plasmid-borne enzymes. DNA polymerase is made up of distinct domains and sub-domains. The polymerase domain of DNA polymerase type B (Pol domain) is responsible for the template-directed polymerization of dNTPs onto the growing primer strand of duplex DNA that is usually magnesium dependent. 
In general, the architecture of the Pol domain has been likened to a right hand with fingers, thumb, and palm sub-domains with a deep groove to accommodate the nucleic acid substrate. There are a few conserved motifs in the Pol domain of family B DNA polymerases. The conserved aspartic acid residues in the DTDS motifs of the palm sub-domain is crucial for binding to divalent metal ion and is suggested to be important for polymerase catalysis." Q#25304 - CGI_10007383 superfamily 245226 290 519 8.53E-155 462.046 cl10012 DnaQ_like_exo superfamily - - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." 
Q#25304 - CGI_10007383 superfamily 222632 997 1069 5.57E-31 117.892 cl16754 zf-C4pol superfamily - - "C4-type zinc-finger of DNA polymerase delta; In fission yeast this zinc-finger domain appears is the region of Pol3 that binds directly to the B-subunit, Cdc1. Pol delta is a hetero-tetrameric enzyme comprising four evolutionarily well-conserved proteins: the catalytic subunit Pol3 and three smaller subunits Cdc1, Cdc27 and Cdm1." Q#25305 - CGI_10007384 superfamily 241645 8 94 0.000115435 40.3592 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#25306 - CGI_10007385 superfamily 114645 81 201 1.73E-07 50.5275 cl05479 MCLC superfamily NC - Mid-1-related chloride channel (MCLC); This family consists of several mid-1-related chloride channels. mid-1-related chloride channel (MCLC) proteins function as a chloride channel when incorporated in the planar lipid bilayer. Q#25307 - CGI_10007386 superfamily 243035 221 325 7.30E-26 100.002 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. 
This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model.
In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#25316 - CGI_10007395 superfamily 246748 107 192 0.00752775 35.531 cl14876 Zinc_peptidase_like superfamily C - "Zinc peptidases M18, M20, M28, and M42; Zinc peptidases play vital roles in metabolic and signaling pathways throughout all kingdoms of life. This family corresponds to several clans in the MEROPS database, including the MH clan, which contains 4 families (M18, M20, M28, M42). The peptidase M20 family includes carboxypeptidases such as the glutamate carboxypeptidase from Pseudomonas, the thermostable carboxypeptidase Ss1 of broad specificity from archaea and yeast Gly-X carboxypeptidase. The dipeptidases include bacterial dipeptidase, peptidase V (PepV), a eukaryotic, non-specific dipeptidase, and two Xaa-His dipeptidases (carnosinases). There is also the bacterial aminopeptidase, peptidase T (PepT) that acts only on tripeptide substrates and has therefore been termed a tripeptidase. Peptidase family M28 contains aminopeptidases and carboxypeptidases, and has co-catalytic zinc ions. However, several enzymes in this family utilize other first row transition metal ions such as cobalt and manganese. Each zinc ion is tetrahedrally co-ordinated, with three amino acid ligands plus activated water; one aspartate residue binds both metal ions. The aminopeptidases in this family are also called bacterial leucyl aminopeptidases, but are able to release a variety of N-terminal amino acids. 
IAP aminopeptidase and aminopeptidase Y preferentially release basic amino acids while glutamate carboxypeptidase II preferentially releases C-terminal glutamates. Glutamate carboxypeptidase II and plasma glutamate carboxypeptidase hydrolyze dipeptides. Peptidase families M18 and M42 contain metalloaminopeptidases. M18 is widely distributed in bacteria and eukaryotes. However, only yeast aminopeptidase I and mammalian aspartyl aminopeptidase have been characterized in detail. Some of M42 (also known as glutamyl aminopeptidase) enzymes exhibit aminopeptidase specificity while others also have acylaminoacylpeptidase activity (i.e. hydrolysis of acylated N-terminal residues)." Q#25319 - CGI_10002766 superfamily 245595 3 241 1.08E-67 213.152 cl11393 Peptidase_M14_like superfamily - - "M14 family of metallocarboxypeptidases and related proteins; The M14 family of metallocarboxypeptidases (MCPs), also known as funnelins, are zinc-binding carboxypeptidases (CPs) which hydrolyze single, C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. Two major subfamilies of the M14 family, defined based on sequence and structural homology, are the A/B and N/E subfamilies. Enzymes belonging to the A/B subfamily are normally synthesized as inactive precursors containing preceding signal peptide, followed by an N-terminal pro-region linked to the enzyme; these proenzymes are called procarboxypeptidases. The A/B enzymes can be further divided based on their substrate specificity; Carboxypeptidase A-like (CPA-like) enzymes favor hydrophobic residues while carboxypeptidase B-like (CPB-like) enzymes only cleave the basic residues lysine or arginine. The A forms have slightly different specificities, with Carboxypeptidase A1 (CPA1) preferring aliphatic and small aromatic residues, and CPA2 preferring the bulky aromatic side chains.
Enzymes belonging to the N/E subfamily enzymes are not produced as inactive precursors and instead rely on their substrate specificity and subcellular compartmentalization to prevent inappropriate cleavage. They contain an extra C-terminal transthyretin-like domain, thought to be involved in folding or formation of oligomers. MCPs can also be classified based on their involvement in specific physiological processes; the pancreatic MCPs participate only in alimentary digestion and include carboxypeptidase A and B (A/B subfamily), while others, namely regulatory MCPs or the N/E subfamily, are involved in more selective reactions, mainly in non-digestive tissues and fluids, acting on blood coagulation/fibrinolysis, inflammation and local anaphylaxis, pro-hormone and neuropeptide processing, cellular response and others. Another MCP subfamily, is that of succinylglutamate desuccinylase /aspartoacylase, which hydrolyzes N-acetyl-L-aspartate (NAA), and deficiency in which is the established cause of Canavan disease. Another subfamily (referred to as subfamily C) includes an exceptional type of activity in the MCP family, that of dipeptidyl-peptidase activity of gamma-glutamyl-(L)-meso-diaminopimelate peptidase I which is involved in bacterial cell wall metabolism." Q#25320 - CGI_10017177 superfamily 247725 832 1057 2.06E-81 266.81 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." 
Q#25320 - CGI_10017177 superfamily 241622 1180 1253 4.00E-15 73.3698 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#25320 - CGI_10017177 superfamily 241622 1090 1175 8.85E-13 66.8214 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes.
This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#25320 - CGI_10017177 superfamily 246680 1537 1609 0.0043876 37.4407 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#25321 - CGI_10017178 superfamily 243187 567 739 2.33E-102 316.819 cl02789 EFG_like_IV superfamily - - "Elongation Factor G-like domain IV. This family includes the translational elongation factor termed EF-2 (for Archaea and Eukarya) and EF-G (for Bacteria), ribosomal protection proteins that mediate tetracycline resistance and, an evolutionarily conserved U5 snRNP-specific protein (U5-116kD). In complex with GTP, EF-G/EF-2 promotes the translocation step of translation. 
During translocation the peptidyl-tRNA is moved from the A site to the P site of the small subunit of ribosome and the mRNA is shifted one codon relative to the ribosome. It has been shown that EF-G/EF-2_IV domain mimics the shape of anticodon arm of the tRNA in the structurally homologous ternary complex of Phe-tRNA, EF-Tu (another translational elongation factor) and GTP analog. The tip portion of this domain is found in a position that overlaps the anticodon arm of the A-site tRNA, implying that EF-G/EF-2 displaces the A-site tRNA to the P-site by physical interaction with the anticodon arm." Q#25321 - CGI_10017178 superfamily 247724 29 244 1.03E-98 308.777 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions."
Q#25321 - CGI_10017178 superfamily 243185 387 480 5.99E-46 160.036 cl02787 Translation_Factor_II_like superfamily - - "Translation_Factor_II_like: Elongation factor Tu (EF-Tu) domain II-like proteins. Elongation factor Tu consists of three structural domains, this family represents the second domain. Domain II adopts a beta barrel structure and is involved in binding to charged tRNA. Domain II is found in other proteins such as elongation factor G and translation initiation factor IF-2. This group also includes the C2 subdomain of domain IV of IF-2 that has the same fold as domain II of (EF-Tu). Like IF-2 from certain prokaryotes such as Thermus thermophilus, mitochondrial IF-2 lacks domain II, which is thought to be involved in binding of E.coli IF-2 to 30S subunits." Q#25321 - CGI_10017178 superfamily 243183 735 814 5.41E-39 139.983 cl02785 Elongation_Factor_C superfamily - - "Elongation factor G C-terminus. This domain includes the carboxyl terminal regions of elongation factors (EFs) bacterial EF-G, eukaryotic and archeal EF-2 and eukaryotic mitochondrial mtEFG1s and mtEFG2s. This group also includes proteins similar to the ribosomal protection proteins Tet(M) and Tet(O), BipA, LepA and, spliceosomal proteins: human 116kD U5 small nuclear ribonucleoprotein (snRNP) protein (U5-116 kD) and yeast counterpart Snu114p. This domain adopts a ferredoxin-like fold consisting of an alpha-beta sandwich with anti-parallel beta-sheets, resembling the topology of domain III found in the elongation factors EF-G and eukaryotic EF-2, with which it forms the C-terminal block. The two domains however are not superimposable and domain III lacks some of the characteristics of this domain. EF-2/EF-G in complex with GTP, promotes the translocation step of translation. During translocation the peptidyl-tRNA is moved from the A site to the P site, the uncharged tRNA from the P site to the E-site and, the mRNA is shifted one codon relative to the ribosome. 
Tet(M) and Tet(O) mediate Tc resistance. Typical Tcs bind to the ribosome and inhibit the elongation phase of protein synthesis, by inhibiting the occupation of site A by aminoacyl-tRNA. Tet(M) and Tet(O) catalyze the release of tetracycline (Tc) from the ribosome in a GTP-dependent manner. BipA is a highly conserved protein with global regulatory properties in Escherichia coli. Yeast Snu114p is essential for cell viability and for splicing in vivo. Experiments suggest that GTP binding and probably GTP hydrolysis is important for the function of the U5-116 kD/Snu114p. The function of LepA proteins is unknown." Q#25322 - CGI_10017179 superfamily 241581 136 234 3.95E-20 86.6714 cl00062 FHA superfamily - - "Forkhead associated domain (FHA); found in eukaryotic and prokaryotic proteins. Putative nuclear signalling domain. FHA domains may bind phosphothreonine, phosphoserine and sometimes phosphotyrosine. In eukaryotes, many FHA domain-containing proteins localize to the nucleus, where they participate in establishing or maintaining cell cycle checkpoints, DNA repair, or transcriptional regulation. Members of the FHA family include: Dun1, Rad53, Cds1, Mek1, KAPP(kinase-associated protein phosphatase),and Ki-67 (a human nuclear protein related to cell proliferation)." Q#25322 - CGI_10017179 superfamily 190615 338 404 2.07E-07 49.5276 cl04028 dsRNA_bind superfamily - - "Double stranded RNA binding domain; This domain is a divergent double stranded RNA-binding domain. It is found in members of the Dicer protein family which function in RNA interference, an evolutionarily conserved mechanism for gene silencing using double-stranded RNA (dsRNA) molecules." 
Q#25323 - CGI_10017180 superfamily 243066 30 131 1.56E-25 101.54 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#25323 - CGI_10017180 superfamily 198867 142 240 2.28E-12 63.7167 cl06652 BACK superfamily - - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation)." Q#25323 - CGI_10017180 superfamily 243146 381 426 3.97E-09 53.4342 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#25323 - CGI_10017180 superfamily 243146 430 472 1.66E-08 51.5082 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#25323 - CGI_10017180 superfamily 243146 331 378 3.89E-08 50.3526 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. 
" Q#25323 - CGI_10017180 superfamily 243146 487 538 3.32E-06 44.8567 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#25323 - CGI_10017180 superfamily 243146 528 574 9.06E-05 40.5093 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#25324 - CGI_10017181 superfamily 115560 872 913 0.0026928 38.3196 cl06117 MEA1 superfamily N - "Male enhanced antigen 1 (MEA1); This family consists of several mammalian male enhanced antigen 1 (MEA1) proteins. The Mea-1 gene is found to be localised in primary and secondary spermatocytes and spermatids, but the protein products are detected only in spermatids. Intensive transcription of Mea-1 gene and specific localisation of the gene product suggest that Mea-1 may play a important role in the late stage of spermatogenesis." Q#25329 - CGI_10017186 superfamily 241739 1190 1452 1.19E-159 488.259 cl00268 class_II_aaRS-like_core superfamily - - "Class II tRNA amino-acyl synthetase-like catalytic core domain. Class II amino acyl-tRNA synthetases (aaRS) share a common fold and generally attach an amino acid to the 3' OH of ribose of the appropriate tRNA. PheRS is an exception in that it attaches the amino acid at the 2'-OH group, like class I aaRSs. These enzymes are usually homodimers. This domain is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. The substrate specificity of this reaction is further determined by additional domains. Intererestingly, this domain is also found is asparagine synthase A (AsnA), in the accessory subunit of mitochondrial polymerase gamma and in the bacterial ATP phosphoribosyltransferase regulatory subunit HisZ." 
Q#25329 - CGI_10017186 superfamily 241738 1458 1680 9.94E-75 248.752 cl00266 HGTP_anticodon superfamily - - "HGTP anticodon binding domain, as found at the C-terminus of histidyl, glycyl, threonyl and prolyl tRNA synthetases, which are classified as a group of class II aminoacyl-tRNA synthetases (aaRS). In aaRSs, the anticodon binding domain is responsible for specificity in tRNA-binding, so that the activated amino acid is transferred to a ribose 3' OH group of the appropriate tRNA only. This domain is also found in the accessory subunit of mitochondrial polymerase gamma (Pol gamma b)." Q#25329 - CGI_10017186 superfamily 241550 368 500 7.99E-73 244.854 cl00015 nt_trans superfamily N - "nucleotidyl transferase superfamily; nt_trans (nucleotidyl transferase) This superfamily includes the class I amino-acyl tRNA synthetases, pantothenate synthetase (PanC), ATP sulfurylase, and the cytidylyltransferases, all of which have a conserved dinucleotide-binding domain." Q#25329 - CGI_10017186 superfamily 243175 72 155 2.30E-26 105.864 cl02776 GST_C_family superfamily - - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). 
Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." Q#25329 - CGI_10017186 superfamily 241805 815 864 1.37E-18 82.2832 cl00349 S15_NS1_EPRS_RNA-bind superfamily - - "S15/NS1/EPRS_RNA-binding domain. This short domain consists of a helix-turn-helix structure, which can bind to several types of RNA. It is found in the ribosomal protein S15, the influenza A viral nonstructural protein (NSA) and in several eukaryotic aminoacyl tRNA synthetases (aaRSs), where it occurs as a single or a repeated unit. It is involved in both protein-RNA interactions by binding tRNA and protein-protein interactions in the formation of tRNA-synthetases into multienzyme complexes. While this domain lacks significant sequence similarity between the subgroups in which it is found, they share similar electrostatic surface potentials and thus are likely to bind to RNA via the same mechanism." 
Q#25329 - CGI_10017186 superfamily 241805 745 794 3.85E-17 78.4312 cl00349 S15_NS1_EPRS_RNA-bind superfamily - - "S15/NS1/EPRS_RNA-binding domain. This short domain consists of a helix-turn-helix structure, which can bind to several types of RNA. It is found in the ribosomal protein S15, the influenza A viral nonstructural protein (NSA) and in several eukaryotic aminoacyl tRNA synthetases (aaRSs), where it occurs as a single or a repeated unit. It is involved in both protein-RNA interactions by binding tRNA and protein-protein interactions in the formation of tRNA-synthetases into multienzyme complexes. While this domain lacks significant sequence similarity between the subgroups in which it is found, they share similar electrostatic surface potentials and thus are likely to bind to RNA via the same mechanism." Q#25329 - CGI_10017186 superfamily 241805 1022 1071 1.05E-16 76.8904 cl00349 S15_NS1_EPRS_RNA-bind superfamily - - "S15/NS1/EPRS_RNA-binding domain. This short domain consists of a helix-turn-helix structure, which can bind to several types of RNA. It is found in the ribosomal protein S15, the influenza A viral nonstructural protein (NSA) and in several eukaryotic aminoacyl tRNA synthetases (aaRSs), where it occurs as a single or a repeated unit. It is involved in both protein-RNA interactions by binding tRNA and protein-protein interactions in the formation of tRNA-synthetases into multienzyme complexes. While this domain lacks significant sequence similarity between the subgroups in which it is found, they share similar electrostatic surface potentials and thus are likely to bind to RNA via the same mechanism." Q#25329 - CGI_10017186 superfamily 241805 951 997 1.57E-15 73.8088 cl00349 S15_NS1_EPRS_RNA-bind superfamily - - "S15/NS1/EPRS_RNA-binding domain. This short domain consists of a helix-turn-helix structure, which can bind to several types of RNA. 
It is found in the ribosomal protein S15, the influenza A viral nonstructural protein (NSA) and in several eukaryotic aminoacyl tRNA synthetases (aaRSs), where it occurs as a single or a repeated unit. It is involved in both protein-RNA interactions by binding tRNA and protein-protein interactions in the formation of tRNA-synthetases into multienzyme complexes. While this domain lacks significant sequence similarity between the subgroups in which it is found, they share similar electrostatic surface potentials and thus are likely to bind to RNA via the same mechanism." Q#25329 - CGI_10017186 superfamily 241805 1096 1141 2.88E-13 67.2604 cl00349 S15_NS1_EPRS_RNA-bind superfamily - - "S15/NS1/EPRS_RNA-binding domain. This short domain consists of a helix-turn-helix structure, which can bind to several types of RNA. It is found in the ribosomal protein S15, the influenza A viral nonstructural protein (NSA) and in several eukaryotic aminoacyl tRNA synthetases (aaRSs), where it occurs as a single or a repeated unit. It is involved in both protein-RNA interactions by binding tRNA and protein-protein interactions in the formation of tRNA-synthetases into multienzyme complexes. While this domain lacks significant sequence similarity between the subgroups in which it is found, they share similar electrostatic surface potentials and thus are likely to bind to RNA via the same mechanism." Q#25329 - CGI_10017186 superfamily 241805 881 928 2.11E-09 55.7044 cl00349 S15_NS1_EPRS_RNA-bind superfamily - - "S15/NS1/EPRS_RNA-binding domain. This short domain consists of a helix-turn-helix structure, which can bind to several types of RNA. It is found in the ribosomal protein S15, the influenza A viral nonstructural protein (NSA) and in several eukaryotic aminoacyl tRNA synthetases (aaRSs), where it occurs as a single or a repeated unit. 
It is involved in both protein-RNA interactions by binding tRNA and protein-protein interactions in the formation of tRNA-synthetases into multienzyme complexes. While this domain lacks significant sequence similarity between the subgroups in which it is found, they share similar electrostatic surface potentials and thus are likely to bind to RNA via the same mechanism." Q#25329 - CGI_10017186 superfamily 241550 201 295 8.89E-47 170.125 cl00015 nt_trans superfamily C - "nucleotidyl transferase superfamily; nt_trans (nucleotidyl transferase) This superfamily includes the class I amino-acyl tRNA synthetases, pantothenate synthetase (PanC), ATP sulfurylase, and the cytidylyltransferases, all of which have a conserved dinucleotide-binding domain." Q#25329 - CGI_10017186 superfamily 217810 499 674 2.00E-34 131.993 cl04341 tRNA-synt_1c_C superfamily - - "tRNA synthetases class I (E and Q), anti-codon binding domain; Other tRNA synthetase sub-families are too dissimilar to be included. This family includes only glutamyl and glutaminyl tRNA synthetases. In some organisms, a single glutamyl-tRNA synthetase aminoacylates both tRNA(Glu) and tRNA(Gln)." Q#25330 - CGI_10017187 superfamily 241811 5 55 6.35E-25 88.9737 cl00355 Ribosomal_S14 superfamily - - Ribosomal protein S14p/S29e; This family includes both ribosomal S14 from prokaryotes and S29 from eukaryotes. Q#25331 - CGI_10017188 superfamily 198867 158 253 2.65E-22 92.6067 cl06652 BACK superfamily - - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation)." 
Q#25331 - CGI_10017188 superfamily 243066 38 149 7.69E-18 79.9689 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#25331 - CGI_10017188 superfamily 245847 327 408 6.32E-05 42.1044 cl12042 FA58C superfamily N - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#25331 - CGI_10017188 superfamily 245847 486 561 0.00528729 36.3264 cl12042 FA58C superfamily N - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#25332 - CGI_10017189 superfamily 241592 2 35 4.03E-09 52.6571 cl00074 H2A superfamily C - "Histone 2A; H2A is a subunit of the nucleosome. The nucleosome is an octamer containing two H2A, H2B, H3, and H4 subunits. The H2A subunit performs essential roles in maintaining structural integrity of the nucleosome, chromatin condensation, and binding of specific chromatin-associated proteins." 
Q#25333 - CGI_10017190 superfamily 245319 209 799 1.91E-99 334.317 cl10505 CBF superfamily - - CBF/Mak21 family; CBF/Mak21 family. Q#25334 - CGI_10017191 superfamily 242542 26 199 2.72E-08 50.6924 cl01505 YhhN superfamily - - "YhhN-like protein; The members of this family are similar to the hypothetical protein yhhN expressed by E. coli. Many of the members of this family are annotated as being possible transmembrane proteins, and in fact they all have a high proportion of hydrophobic residues." Q#25335 - CGI_10017192 superfamily 242542 44 231 4.82E-25 97.6867 cl01505 YhhN superfamily - - "YhhN-like protein; The members of this family are similar to the hypothetical protein yhhN expressed by E. coli. Many of the members of this family are annotated as being possible transmembrane proteins, and in fact they all have a high proportion of hydrophobic residues." Q#25336 - CGI_10017193 superfamily 242542 44 231 4.82E-25 97.6867 cl01505 YhhN superfamily - - "YhhN-like protein; The members of this family are similar to the hypothetical protein yhhN expressed by E. coli. Many of the members of this family are annotated as being possible transmembrane proteins, and in fact they all have a high proportion of hydrophobic residues." Q#25337 - CGI_10017194 superfamily 247637 77 405 0 551.714 cl16912 MDR superfamily - - "Medium chain reductase/dehydrogenase (MDR)/zinc-dependent alcohol dehydrogenase-like family; The medium chain reductase/dehydrogenases (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH. MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). 
The MDR proteins have 2 domains: a C-terminal NAD(P) binding-Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES. The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH) , quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones. ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines. Other MDR members have only a catalytic zinc, and some contain no coordinated zinc." Q#25338 - CGI_10017195 superfamily 245835 130 242 0.00238179 38.0895 cl12013 BAR superfamily N - "The Bin/Amphiphysin/Rvs (BAR) domain, a dimerization module that binds membranes and detects membrane curvature; BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions including organelle biogenesis, membrane trafficking or remodeling, and cell division and migration. Mutations in BAR containing proteins have been linked to diseases and their inactivation in cells leads to altered membrane dynamics. A BAR domain with an additional N-terminal amphipathic helix (an N-BAR) can drive membrane curvature. These N-BAR domains are found in amphiphysins and endophilins, among others. 
BAR domains are also frequently found alongside domains that determine lipid specificity, such as the Pleckstrin Homology (PH) and Phox Homology (PX) domains which are present in beta centaurins (ACAPs and ASAPs) and sorting nexins, respectively. A FES-CIP4 Homology (FCH) domain together with a coiled coil region is called the F-BAR domain and is present in Pombe/Cdc15 homology (PCH) family proteins, which include Fes/Fes tyrosine kinases, PACSIN or syndapin, CIP4-like proteins, and srGAPs, among others. The Inverse (I)-BAR or IRSp53/MIM homology Domain (IMD) is found in multi-domain proteins, such as IRSp53 and MIM, that act as scaffolding proteins and transducers of a variety of signaling pathways that link membrane dynamics and the underlying actin cytoskeleton. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The I-BAR domain induces membrane protrusions in the opposite direction compared to classical BAR and F-BAR domains, which produce membrane invaginations. BAR domains that also serve as protein interaction domains include those of arfaptin and OPHN1-like proteins, among others, which bind to Rac and Rho GAP domains, respectively." Q#25341 - CGI_10017198 superfamily 247725 45 153 4.90E-64 203.325 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." 
Q#25342 - CGI_10017199 superfamily 219977 1 62 1.53E-10 59.2208 cl18539 Vps51 superfamily N - "Vps51/Vps67; This family includes a presumed domain found in a number of components of vesicular transport. The VFT tethering complex (also known as GARP complex, Golgi associated retrograde protein complex, Vps53 tethering complex) is a conserved eukaryotic docking complex which is involved in recycling of proteins from endosomes to the late Golgi. Vps51 (also known as Vps67) is a subunit of VFT and interacts with the SNARE Tlg1. Cog1_N is the N-terminus of the Cog1 subunit of the eight-unit Conserved Oligomeric Golgi (COG) complex that participates in retrograde vesicular transport and is required to maintain normal Golgi structure and function. The subunits are located in two lobes and Cog1 serves to bind the two lobes together probably via the highly conserved N-terminal domain of approximately 85 residues." Q#25343 - CGI_10017200 superfamily 243045 356 452 4.91E-17 77.6735 cl02459 PAS superfamily - - "PAS domain; PAS motifs appear in archaea, eubacteria and eukarya. Probably the most surprising identification of a PAS domain was that in EAG-like K+-channels. PAS domains have been found to bind ligands, and to act as sensors for light and oxygen in signal transduction." Q#25343 - CGI_10017200 superfamily 241596 89 141 2.45E-13 66.0835 cl00081 HLH superfamily - - "Helix-loop-helix domain, found in specific DNA-binding proteins that act as transcription factors; 60-100 amino acids long.
A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE 5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved in both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#25343 - CGI_10017200 superfamily 243045 170 237 1.66E-09 55.7171 cl02459 PAS superfamily C - "PAS domain; PAS motifs appear in archaea, eubacteria and eukarya. Probably the most surprising identification of a PAS domain was that in EAG-like K+-channels. PAS domains have been found to bind ligands, and to act as sensors for light and oxygen in signal transduction." Q#25345 - CGI_10017202 superfamily 243269 19 420 5.85E-113 344.637 cl03012 Ammonium_transp superfamily - - Ammonium Transporter Family; Ammonium Transporter Family. Q#25348 - CGI_10017205 superfamily 243069 25 248 1.14E-88 267.476 cl02525 Band_7 superfamily - - "The band 7 domain of flotillin (reggie) like proteins. This group contains proteins similar to stomatin, prohibitin, flotillin, HlfK/C and podicin. Many of these band 7 domain-containing proteins are lipid raft-associated. 
Individual proteins of this band 7 domain family may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. Microdomains formed from flotillin proteins may in addition be dynamic units with their own regulatory functions. Flotillins have been implicated in signal transduction, vesicle trafficking, cytoskeleton rearrangement and are known to interact with a variety of proteins. Stomatin interacts with and regulates members of the degenerin/epithelia Na+ channel family in mechanosensory cells of Caenorhabditis elegans and vertebrate neurons and participates in trafficking of Glut1 glucose transporters. Prohibitin may act as a chaperone for the stabilization of mitochondrial proteins. Prokaryotic HflK/C plays a role in the decision between lysogenic and lytic cycle growth during lambda phage infection. Flotillins have been implicated in the progression of prion disease, in the pathogenesis of neurodegenerative diseases such as Parkinson's and Alzheimer's disease and, in cancer invasion and metastasis. Mutations in the podicin gene give rise to autosomal recessive steroid resistant nephritic syndrome" Q#25352 - CGI_10017209 superfamily 247744 8 53 0.00023105 39.5962 cl17190 NK superfamily C - "Nucleoside/nucleotide kinase (NK) is a protein superfamily consisting of multiple families of enzymes that share structural similarity and are functionally related to the catalysis of the reversible phosphate group transfer from nucleoside triphosphates to nucleosides/nucleotides, nucleoside monophosphates, or sugars. Members of this family play a wide variety of essential roles in nucleotide metabolism, the biosynthesis of coenzymes and aromatic compounds, as well as the metabolism of sugar and sulfate." 
Q#25355 - CGI_10017212 superfamily 218109 89 169 6.37E-15 69.2765 cl12292 Gly_transf_sug superfamily - - "Glycosyltransferase sugar-binding region containing DXD motif; The DXD motif is a short conserved motif found in many families of glycosyltransferases, which add a range of different sugars to other sugars, phosphates and proteins. DXD-containing glycosyltransferases all use nucleoside diphosphate sugars as donors and require divalent cations, usually manganese. The DXD motif is expected to play a carbohydrate binding role in sugar-nucleoside diphosphate and manganese dependent glycosyltransferases." Q#25356 - CGI_10017213 superfamily 247725 46 145 3.18E-45 155.943 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#25356 - CGI_10017213 superfamily 243094 292 615 5.39E-147 431.526 cl02569 RasGAP superfamily - - "Ras GTPase Activating Domain; RasGAP functions as an enhancer of the hydrolysis of GTP that is bound to Ras-GTPases. Proteins having a RasGAP domain include p120GAP, IQGAP, Rab5-activating protein 6, and Neurofibromin, among others. Although the Rho (Ras homolog) GTPases are most closely related to members of the Ras family, RhoGAP and RasGAP exhibit no similarity at their amino acid sequence level. RasGTPases function as molecular switches in a large number of signaling pathways. They are in the on state when bound to GTP, and in the off state when bound to GDP. 
The RasGAP domain speeds up the hydrolysis of GTP in Ras-like proteins, acting as a negative regulator." Q#25356 - CGI_10017213 superfamily 246669 160 294 6.49E-33 122.861 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#25358 - CGI_10013584 superfamily 241584 13 101 0.000435696 37.8611 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases."
Q#25358 - CGI_10013584 superfamily 241571 104 191 3.65E-06 44.3255 cl00049 CUB superfamily C - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#25360 - CGI_10013586 superfamily 248012 2 109 8.97E-18 74.9216 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#25362 - CGI_10013588 superfamily 219556 97 179 8.41E-34 123.459 cl06678 AdoMet_MTase superfamily C - Predicted AdoMet-dependent methyltransferase; Proteins in this family have been predicted to function as AdoMet-dependent methyltransferases. Q#25365 - CGI_10013591 superfamily 242274 215 339 3.30E-05 42.7846 cl01053 SGNH_hydrolase superfamily C - "SGNH_hydrolase, or GDSL_hydrolase, is a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the typical Ser-His-Asp(Glu) triad from other serine hydrolases, but may lack the carboxylic acid." Q#25367 - CGI_10013593 superfamily 198738 69 152 1.44E-41 138.608 cl02599 Ets superfamily - - Ets-domain; Ets-domain. Q#25368 - CGI_10013594 superfamily 241597 758 826 2.93E-16 75.4125 cl00082 HMG-box superfamily - - "High Mobility Group (HMG)-box is found in a variety of eukaryotic chromosomal proteins and transcription factors. HMGs bind to the minor groove of DNA and have been classified by DNA binding preferences. Two phylogenetically distinct groups of Class I proteins bind DNA in a sequence specific fashion and contain a single HMG box. One group (SOX-TCF) includes transcription factors, TCF-1, -3, -4; and also SRY and LEF-1, which bind four-way DNA junctions and duplex DNA targets. The second group (MATA) includes fungal mating type gene products MC, MATA1 and Ste11.
Class II and III proteins (HMGB-UBF) bind DNA in a non-sequence specific fashion and contain two or more tandem HMG boxes. Class II members include non-histone chromosomal proteins, HMG1 and HMG2, which bind to bent or distorted DNA such as four-way DNA junctions, synthetic DNA cruciforms, kinked cisplatin-modified DNA, DNA bulges, cross-overs in supercoiled DNA, and can cause looping of linear DNA. Class III members include nucleolar and mitochondrial transcription factors, UBF and mtTF1, which bind four-way DNA junctions." Q#25368 - CGI_10013594 superfamily 219470 324 364 7.14E-12 63.0888 cl06557 BCNT superfamily NC - Bucentaur or craniofacial development; Bucentaur or craniofacial development protein 1 (BCNT) in ruminents has a different domain architecture to that in mouse and human. For this reason it has been used as a model for molecular evolution. Both bovine and human BCNTs are phosphorylated by casein kinase II in vitro. Q#25368 - CGI_10013594 superfamily 219470 233 266 1.26E-11 62.3184 cl06557 BCNT superfamily NC - Bucentaur or craniofacial development; Bucentaur or craniofacial development protein 1 (BCNT) in ruminents has a different domain architecture to that in mouse and human. For this reason it has been used as a model for molecular evolution. Both bovine and human BCNTs are phosphorylated by casein kinase II in vitro. Q#25368 - CGI_10013594 superfamily 219431 891 932 1.27E-06 47.0404 cl06504 zf-CW superfamily - - "CW-type Zinc Finger; This domain appears to be a zinc finger. The alignment shows four conserved cysteine residues and a conserved tryptophan. It was first identified by, and is predicted to be a "highly specialised mononuclear four-cysteine zinc finger...that plays a role in DNA binding and/or promoting protein-protein interactions in complicated eukaryotic processes including...chromatin methylation status and early embryonic development." Weak homology to pfam00628 further evidences these predictions (personal obs: C Yeats). 
Twelve different CW-domain-containing protein subfamilies are described, with different subfamilies being characteristic of vertebrates, higher plants and other animals in which this domain is found." Q#25369 - CGI_10013595 superfamily 243092 369 546 1.78E-07 53.1076 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment."
Q#25369 - CGI_10013595 superfamily 243092 249 413 0.00017812 43.8628 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#25372 - CGI_10013598 superfamily 243035 74 153 4.80E-12 59.5557 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. 
Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. 
Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#25372 - CGI_10013598 superfamily 243035 1 55 0.000285313 37.9846 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. 
Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#25372 - CGI_10013598 superfamily 243035 174 213 5.56E-07 46.0538 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#25373 - CGI_10003340 superfamily 243161 10 96 1.04E-16 72.0417 cl02739 THAP superfamily - - "THAP domain; The THAP domain is a putative DNA-binding domain (DBD) and probably also binds a zinc ion. It features the conserved C2CH architecture (consensus sequence: Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His). Other universal features include the location of the domain at the N-termini of proteins, its size of about 90 residues, a C-terminal AVPTIF box and several other conserved residues. Orthologues of the human THAP domain have been identified in other vertebrates and probably worms and flies, but not in other eukaryotes or any prokaryotes." Q#25374 - CGI_10002823 superfamily 247684 7 382 0 633.996 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. 
The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#25377 - CGI_10002826 superfamily 243015 113 193 4.46E-14 65.361 cl02381 Tim17 superfamily N - "Tim17/Tim22/Tim23/Pmp24 family; The pre-protein translocase of the mitochondrial outer membrane (Tom) allows the import of pre-proteins from the cytoplasm. Tom forms a complex with a number of proteins, including Tim17. Tim17 and Tim23 are thought to form the translocation channel of the inner membrane. This family includes Tim17, Tim22 and Tim23. This family also includes Pmp24 a peroxisomal protein. The involvement of this domain in the targeting of PMP24 remains to be proved. PMP24 was known as Pmp27 in." Q#25378 - CGI_10002827 superfamily 246597 4 288 0 630.797 cl13995 MPP_superfamily superfamily - - "metallophosphatase superfamily, metallophosphatase domain; Metallophosphatases (MPPs), also known as metallophosphoesterases, phosphodiesterases (PDEs), binuclear metallophosphoesterases, and dimetal-containing phosphoesterases (DMPs), represent a diverse superfamily of enzymes with a conserved domain containing an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. This superfamily includes: the phosphoprotein phosphatases (PPPs), Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination." 
Q#25379 - CGI_10002549 superfamily 243092 299 431 0.00307857 38.47 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#25379 - CGI_10002549 superfamily 110440 519 546 0.003736 35.4613 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." 
Q#25379 - CGI_10002549 superfamily 241563 60 99 0.00382853 35.5328 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#25383 - CGI_10000827 superfamily 243034 29 86 3.22E-05 37.3596 cl02429 TPR superfamily C - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#25384 - CGI_10009280 superfamily 248247 213 269 1.04E-10 60.3106 cl17693 Integrin_beta superfamily C - "Integrin, beta chain; Integrins have been found in animals and their homologues have also been found in cyanobacteria, probably due to horizontal gene transfer. 
The sequences repeats have been trimmed due to an overlap with EGF." Q#25384 - CGI_10009280 superfamily 248247 25 61 8.05E-09 54.5326 cl17693 Integrin_beta superfamily NC - "Integrin, beta chain; Integrins have been found in animals and their homologues have also been found in cyanobacteria, probably due to horizontal gene transfer. The sequences repeats have been trimmed due to an overlap with EGF." Q#25385 - CGI_10009281 superfamily 248247 7 298 1.50E-166 477.096 cl17693 Integrin_beta superfamily N - "Integrin, beta chain; Integrins have been found in animals and their homologues have also been found in cyanobacteria, probably due to horizontal gene transfer. The sequences repeats have been trimmed due to an overlap with EGF." Q#25386 - CGI_10009282 superfamily 243034 624 711 8.63E-06 44.6784 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr 
phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#25386 - CGI_10009282 superfamily 203444 87 120 9.97E-14 69.2491 cl05761 PRP1_N superfamily N - "PRP1 splicing factor, N-terminal; This domain is specific to the N-terminal part of the prp1 splicing factor, which is involved in mRNA splicing (and possibly also poly(A)+ RNA nuclear export and cell cycle progression). This domain is specific to the N terminus of the RNA splicing factor encoded by prp1. It is involved in mRNA splicing and possibly also poly(A)and RNA nuclear export and cell cycle progression." Q#25386 - CGI_10009282 superfamily 214642 386 418 0.00138057 37.5286 cl02592 HAT superfamily - - HAT (Half-A-TPR) repeats; Present in several RNA-binding proteins. Structurally and sequentially thought to be similar to TPRs. Q#25386 - CGI_10009282 superfamily 243034 335 398 0.00535964 36.204 cl02429 TPR superfamily N - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for 
peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#25388 - CGI_10009284 superfamily 243065 1681 1839 1.06E-09 59.3377 cl02516 VWD superfamily - - von Willebrand factor type D domain; Luciferin-2-monooxygenase from Vargula hilgendorfii contains a vwd domain. Its function is unrelated but the similarity is very strong by several methods. Q#25388 - CGI_10009284 superfamily 219677 2049 2074 0.00807612 36.6468 cl18521 EGF_2 superfamily - - EGF-like domain; This family contains EGF domains found in a variety of extracellular proteins. Q#25389 - CGI_10009285 superfamily 215647 601 830 9.34E-41 150.452 cl18338 7tm_2 superfamily - - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GCPRs).They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97 ; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#25390 - CGI_10009286 superfamily 243142 30 141 3.98E-09 54.9399 cl02689 RUN superfamily - - "RUN domain; This domain is present in several proteins that are linked to the functions of GTPases in the Rap and Rab families. 
They could hence play important roles in multiple Ras-like GTPase signalling pathways. The domain comprises six conserved regions, which in some proteins have considerable insertions between them. The domain core is thought to take up a predominantly alpha fold, with basic amino acids in regions A and D possibly playing a functional role in interactions with Ras GTPases." Q#25391 - CGI_10009287 superfamily 149667 1 94 7.61E-08 46.9799 cl07343 GON superfamily N - GON domain; The GON domain is found in the ADAMTS (a disintegrin and metalloproteinase domain with thrombospondin type-1 modules) family of proteins. It contains several conserved cysteine residues. Q#25392 - CGI_10009288 superfamily 149667 2 52 0.000216418 37.7351 cl07343 GON superfamily N - GON domain; The GON domain is found in the ADAMTS (a disintegrin and metalloproteinase domain with thrombospondin type-1 modules) family of proteins. It contains several conserved cysteine residues. Q#25393 - CGI_10009289 superfamily 246680 12 89 0.000166549 39.8776 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." 
Q#25394 - CGI_10009290 superfamily 243035 40 169 6.54E-18 75.3489 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#25399 - CGI_10009586 superfamily 197448 50 108 2.32E-17 72.9061 cl15240 Reelin_subrepeat_like superfamily N - "Tandem repeat subunit of reelin and related proteins; Reelin is an extracellular glycoprotein involved in neuronal development, specifically in the brain cortex. It contains 8 tandemly repeated units, each of which is composed of two highly similar subrepeats and a central EGF domain. This model characterizes the subrepeats, which directly contact each other in a compact arrangement. Consecutive reelin repeat units are packed together to form an overall rod-like molecular structure. Reelin repeats 5 and 6 are reported to interact with neuronal receptors, the apolipoprotein E receptor 2 (ApoER2) and the very-low-density lipoprotein receptor (VLDLR), triggering a signaling cascade upon binding and subsequent tyrosine phosphorylation of the cytoplasmic disabled-1 (Dab1). Genetic deficiency of reelin, or ApoER2 and VLDLR, or Dab1, all exhibit the same phenotypes, including ataxia, cortical layer inversion and abnormal positioning patterns." Q#25400 - CGI_10009587 superfamily 243267 57 423 7.64E-127 373.872 cl03000 Innexin superfamily - - "Innexin; This family includes the drosophila proteins Ogre and shaking-B, and the C. 
elegans proteins Unc-7 and Unc-9. Members of this family are integral membrane proteins which are involved in the formation of gap junctions. This family has been named the Innexins." Q#25401 - CGI_10009588 superfamily 243267 26 394 2.45E-98 300.685 cl03000 Innexin superfamily - - "Innexin; This family includes the drosophila proteins Ogre and shaking-B, and the C. elegans proteins Unc-7 and Unc-9. Members of this family are integral membrane proteins which are involved in the formation of gap junctions. This family has been named the Innexins." Q#25404 - CGI_10009591 superfamily 241578 26 110 4.88E-11 56.2425 cl00057 vWFA superfamily C - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#25405 - CGI_10009592 superfamily 248054 33 247 3.40E-14 70.7943 cl17500 NAD_binding_8 superfamily - - NAD(P)-binding Rossmann-like domain; NAD(P)-binding Rossmann-like domain. Q#25406 - CGI_10009593 superfamily 243069 551 706 1.11E-05 45.4364 cl02525 Band_7 superfamily N - "The band 7 domain of flotillin (reggie) like proteins. 
This group contains proteins similar to stomatin, prohibitin, flotillin, HflK/C and podocin. Many of these band 7 domain-containing proteins are lipid raft-associated. Individual proteins of this band 7 domain family may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. Microdomains formed from flotillin proteins may in addition be dynamic units with their own regulatory functions. Flotillins have been implicated in signal transduction, vesicle trafficking, cytoskeleton rearrangement and are known to interact with a variety of proteins. Stomatin interacts with and regulates members of the degenerin/epithelial Na+ channel family in mechanosensory cells of Caenorhabditis elegans and vertebrate neurons and participates in trafficking of Glut1 glucose transporters. Prohibitin may act as a chaperone for the stabilization of mitochondrial proteins. Prokaryotic HflK/C plays a role in the decision between lysogenic and lytic cycle growth during lambda phage infection. Flotillins have been implicated in the progression of prion disease, in the pathogenesis of neurodegenerative diseases such as Parkinson's and Alzheimer's disease, and in cancer invasion and metastasis. Mutations in the podocin gene give rise to autosomal recessive steroid resistant nephrotic syndrome" Q#25407 - CGI_10009594 superfamily 221582 475 621 2.05E-11 63.0451 cl13834 Milton superfamily - - "Kinesin associated protein; This domain family is found in eukaryotes, and is typically between 143 and 173 amino acids in length. The family is found in association with pfam04849. This family is a region of the protein milton. Milton recruits the heavy chain of kinesin to mitochondria to allow the motor movement function of kinesin." 
Q#25408 - CGI_10009595 superfamily 245323 2046 2289 2.51E-132 418.57 cl10511 Beach superfamily - - "BEACH (Beige and Chediak-Higashi) domains, implicated in membrane trafficking, are present in a family of proteins conserved throughout eukaryotes. This group contains human lysosomal trafficking regulator (LYST), LPS-responsive and beige-like anchor (LRBA) and neurobeachin. Disruption of LYST leads to Chediak-Higashi syndrome, characterized by severe immunodeficiency, albinism, poor blood coagulation and neurologic problems. Neurobeachin is a candidate gene linked to autism. LRBA seems to be upregulated in several cancer types. It has been shown that the BEACH domain itself is important for the function of these proteins." Q#25408 - CGI_10009595 superfamily 245323 2322 2467 1.58E-64 224.044 cl10511 Beach superfamily N - "BEACH (Beige and Chediak-Higashi) domains, implicated in membrane trafficking, are present in a family of proteins conserved throughout eukaryotes. This group contains human lysosomal trafficking regulator (LYST), LPS-responsive and beige-like anchor (LRBA) and neurobeachin. Disruption of LYST leads to Chediak-Higashi syndrome, characterized by severe immunodeficiency, albinism, poor blood coagulation and neurologic problems. Neurobeachin is a candidate gene linked to autism. LRBA seems to be upregulated in several cancer types. It has been shown that the BEACH domain itself is important for the function of these proteins." Q#25408 - CGI_10009595 superfamily 247725 1928 2027 3.31E-34 130.493 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting proteins to the appropriate cellular location or interacting with a binding partner. 
This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#25408 - CGI_10009595 superfamily 243092 2591 2844 6.68E-17 83.1532 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#25409 - CGI_10001544 superfamily 247723 79 122 5.82E-20 78.4692 cl17169 RRM_SF superfamily C - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. 
RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#25410 - CGI_10001545 superfamily 243152 59 189 1.57E-41 138.576 cl02712 PGRP superfamily - - "Peptidoglycan recognition proteins (PGRPs) are pattern recognition receptors that bind, and in certain cases, hydrolyze peptidoglycans (PGNs) of bacterial cell walls. PGRPs have been divided into three classes: short PGRPs (PGRP-S), that are small (20 kDa) extracellular proteins; intermediate PGRPs (PGRP-I) that are 40-45 kDa and are predicted to be transmembrane proteins; and long PGRPs (PGRP-L), up to 90 kDa, which may be either intracellular or transmembrane. Several structures of PGRPs are known in insects and mammals, some bound with substrates like Muramyl Tripeptide (MTP) or Tracheal Cytotoxin (TCT). The substrate binding site is conserved in PGRP-LCx, PGRP-LE, and PGRP-Ialpha proteins. This family includes Zn-dependent N-Acetylmuramoyl-L-alanine Amidase, EC:3.5.1.28. This enzyme cleaves the amide bond between N-acetylmuramoyl and L-amino acids, preferentially D-lactyl-L-Ala, in bacterial cell walls. The structure for the bacteriophage T7 lysozyme shows that two of the conserved histidines and a cysteine are zinc binding residues. Site-directed mutagenesis of T7 lysozyme indicates that two conserved residues, a Tyr and a Lys, are important for amidase activity." Q#25411 - CGI_10007036 superfamily 151906 82 172 1.00E-16 72.0085 cl12990 LEDGF superfamily - - Lens epithelium-derived growth factor (LEDGF); LEDGF is a chromatin-associated protein that protects cells from stress-induced apoptosis. It is the binding partner of HIV-1 integrase in human cells. 
The integrase binding domain (IBD) of LEDGF is a compact right-handed bundle composed of five alpha-helices. The residues essential for the interaction with the integrase are present in the inter-helical loop regions of the bundle structure. Q#25413 - CGI_10007038 superfamily 245353 235 253 0.00879225 36.1478 cl10644 APO_RNA-bind superfamily NC - "APO RNA-binding; This domain contains conserved cysteine and histidine residues. It resembles zinc fingers, and binds to zinc. This domain functions as an RNA-binding domain." Q#25414 - CGI_10007039 superfamily 241884 8 220 5.08E-149 416.635 cl00467 Ntn_hydrolase superfamily - - "The Ntn hydrolases (N-terminal nucleophile) are a diverse superfamily of enzymes that are activated autocatalytically via an N-terminally located nucleophilic amino acid. The N-terminal nucleophile (NTN-) hydrolase superfamily contains a four-layered alpha, beta, beta, alpha core structure. This family of hydrolases includes penicillin acylase, the 20S proteasome alpha and beta subunits, and glutamate synthase. The mechanism of activation of these proteins is conserved, although they differ in their substrate specificities. All known members catalyze the hydrolysis of amide bonds in either proteins or small molecules, and each one of them is synthesized as a preprotein. For each, an autocatalytic endoproteolytic process generates a new N-terminal residue. This mature N-terminal residue is central to catalysis and acts as both a polarizing base and a nucleophile during the reaction. The N-terminal amino group acts as the proton acceptor and activates either the nucleophilic hydroxyl in a Ser or Thr residue or the nucleophilic thiol in a Cys residue. The position of the N-terminal nucleophile in the active site and the mechanism of catalysis are conserved in this family, despite considerable variation in the protein sequences." 
Q#25415 - CGI_10007040 superfamily 217390 275 415 1.07E-05 43.7025 cl18407 TPT superfamily - - Triose-phosphate Transporter family; This family includes transporters with a specificity for triose phosphate. Q#25416 - CGI_10007041 superfamily 246597 79 156 9.15E-30 110.441 cl13995 MPP_superfamily superfamily C - "metallophosphatase superfamily, metallophosphatase domain; Metallophosphatases (MPPs), also known as metallophosphoesterases, phosphodiesterases (PDEs), binuclear metallophosphoesterases, and dimetal-containing phosphoesterases (DMPs), represent a diverse superfamily of enzymes with a conserved domain containing an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. This superfamily includes: the phosphoprotein phosphatases (PPPs), Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination." Q#25417 - CGI_10007042 superfamily 241600 67 278 7.46E-102 299.156 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. 
C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#25418 - CGI_10007043 superfamily 241600 72 267 1.64E-95 282.592 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#25419 - CGI_10007044 superfamily 243035 27 64 3.78E-05 38.3698 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#25422 - CGI_10001643 superfamily 243092 31 289 3.41E-13 68.1304 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#25425 - CGI_10025748 superfamily 222608 26 126 9.47E-15 65.7386 cl18680 DIOX_N superfamily - - non-haem dioxygenase in morphine synthesis N-terminal; This is the highly conserved N-terminal region of proteins with 2-oxoglutarate/Fe(II)-dependent dioxygenase activity. Q#25426 - CGI_10025749 superfamily 245847 15 87 6.86E-06 40.2326 cl12042 FA58C superfamily C - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." 
Q#25428 - CGI_10025753 superfamily 241574 190 382 7.82E-80 256.359 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#25428 - CGI_10025753 superfamily 241574 444 672 2.55E-56 192.416 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#25428 - CGI_10025753 superfamily 241568 9 45 0.0088027 35.1312 cl00043 CCP superfamily N - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. 
Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#25429 - CGI_10025754 superfamily 191444 96 179 3.88E-13 61.5713 cl05558 IL17 superfamily - - Interleukin-17; IL-17 is a potent proinflammatory cytokine produced by activated memory T cells. The IL-17 family is thought to represent a distinct signaling system that appears to have been highly conserved across vertebrate evolution. Q#25430 - CGI_10025755 superfamily 247724 1 158 2.67E-69 217.747 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." 
Q#25430 - CGI_10025755 superfamily 181730 217 320 4.69E-12 62.4767 cl18118 PRK09256 superfamily N - hypothetical protein; Provisional Q#25432 - CGI_10025757 superfamily 241640 326 583 2.64E-82 260.286 cl00149 Tryp_SPc superfamily - - Trypsin-like serine protease; Many of these are synthesized as inactive precursor zymogens that are cleaved during limited proteolysis to generate their active forms. Alignment contains also inactive enzymes that have substitutions of the catalytic triad residues. Q#25432 - CGI_10025757 superfamily 245847 46 185 1.48E-17 80.0889 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#25434 - CGI_10025759 superfamily 204299 464 597 1.53E-48 166.799 cl10718 VEFS-Box superfamily - - "VEFS-Box of polycomb protein; The VEFS-Box (VRN2-EMF2-FIS2-Su(z)12) box is the C-terminal region of these proteins, characterized by an acidic cluster and a tryptophan/methionine-rich sequence, the acidic-W/M domain. Some of these sequences are associated with a zinc-finger domain about 100 residues towards the N-terminus. This protein is one of the polycomb cluster of proteins which control HOX gene transcription as it functions in heterochromatin-mediated repression." Q#25435 - CGI_10025760 superfamily 245882 21 402 1.05E-174 498.354 cl12119 Alpha_L_fucos superfamily - - Alpha-L-fucosidase; Alpha-L-fucosidase. Q#25436 - CGI_10025761 superfamily 245882 22 405 1.62E-165 474.857 cl12119 Alpha_L_fucos superfamily - - Alpha-L-fucosidase; Alpha-L-fucosidase. Q#25437 - CGI_10025762 superfamily 245882 22 405 6.94E-162 465.612 cl12119 Alpha_L_fucos superfamily - - Alpha-L-fucosidase; Alpha-L-fucosidase. 
Q#25439 - CGI_10025764 superfamily 248022 10 393 3.58E-51 185.174 cl17468 Aa_trans superfamily - - "Transmembrane amino acid transporter protein; This transmembrane region is found in many amino acid transporters including UNC-47 and MTR. UNC-47 encodes a vesicular amino butyric acid (GABA) transporter, (VGAT). UNC-47 is predicted to have 10 transmembrane domains. MTR is a N system amino acid transporter system protein involved in methyltryptophan resistance. Other members of this family include proline transporters and amino acid permeases." Q#25440 - CGI_10025765 superfamily 247725 16 67 6.19E-05 42.8394 cl17171 PH-like superfamily C - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#25440 - CGI_10025765 superfamily 247724 377 438 0.000299116 41.0746 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#25441 - CGI_10025766 superfamily 150822 10 33 0.000267047 36.6469 cl10896 WRW superfamily C - "Mitochondrial F1F0-ATP synthase, subunit f; This is a family of small proteins of approximately 110 amino acids, which are highly conserved from nematodes to humans. Some members of the family have been annotated in Swiss-Prot as being the f subunit of mitochondrial F1F0-ATP synthase but this could not be confirmed. The sequence has a well-conserved WRW motif. The exact function of the protein is not known." Q#25443 - CGI_10025768 superfamily 247723 115 188 1.47E-47 155.875 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. 
The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#25444 - CGI_10025769 superfamily 241587 208 261 5.17E-09 53.0618 cl00069 GGL superfamily - - "G protein gamma subunit-like motifs, the alpha-helical G-gamma chain dimerizes with the G-beta propeller subunit as part of the heterotrimeric G-protein complex; involved in signal transduction via G-protein-coupled receptors" Q#25444 - CGI_10025769 superfamily 243090 279 403 1.42E-46 160.096 cl02565 RGS superfamily - - "Regulator of G protein signaling (RGS) domain superfamily; The RGS domain is an essential part of the Regulator of G-protein Signaling (RGS) protein family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play critical regulatory roles as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. While inactive, G-alpha-subunits bind GDP, which is released and replaced by GTP upon agonist activation. GTP binding leads to dissociation of the alpha-subunit and the beta-gamma-dimer, allowing them to interact with effector molecules and propagate signaling cascades associated with cellular growth, survival, migration, and invasion. Deactivation of the G-protein signaling controlled by the RGS domain accelerates GTPase activity of the alpha subunit by hydrolysis of GTP to GDP, which results in the reassociation of the alpha-subunit with the beta-gamma-dimer and thereby inhibition of downstream activity. 
As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. RGS proteins are also involved in apoptosis and cell proliferation, as well as modulation of cardiac development. Several RGS proteins can fine-tune immune responses, while others play important roles in neuronal signals modulation. Some RGS proteins are principal elements needed for proper vision." Q#25444 - CGI_10025769 superfamily 243038 19 106 1.06E-10 58.4557 cl02442 DEP superfamily - - "DEP domain, named after Dishevelled, Egl-10, and Pleckstrin, where this domain was first discovered. The function of this domain is still not clear, but it is believed to be important for the membrane association of the signaling proteins in which it is present. New studies show that the DEP domain of Sst2, a yeast RGS protein is necessary and sufficient for receptor interaction." Q#25445 - CGI_10025770 superfamily 247986 666 820 6.00E-10 58.9238 cl17432 PBPb superfamily N - "Bacterial periplasmic transport systems use membrane-bound complexes and substrate-bound, membrane-associated, periplasmic binding proteins (PBPs) to transport a wide variety of substrates, such as, amino acids, peptides, sugars, vitamins and inorganic ions. PBPs have two cell-membrane translocation functions: bind substrate, and interact with the membrane bound complex. A diverse group of periplasmic transport receptors for lysine/arginine/ornithine (LAO), glutamine, histidine, sulfate, phosphate, molybdate, and methanol are included in the PBPb CD." 
Q#25445 - CGI_10025770 superfamily 247986 483 565 7.74E-07 49.679 cl17432 PBPb superfamily C - "Bacterial periplasmic transport systems use membrane-bound complexes and substrate-bound, membrane-associated, periplasmic binding proteins (PBPs) to transport a wide variety of substrates, such as, amino acids, peptides, sugars, vitamins and inorganic ions. PBPs have two cell-membrane translocation functions: bind substrate, and interact with the membrane bound complex. A diverse group of periplasmic transport receptors for lysine/arginine/ornithine (LAO), glutamine, histidine, sulfate, phosphate, molybdate, and methanol are included in the PBPb CD." Q#25445 - CGI_10025770 superfamily 245225 67 413 6.09E-31 125.875 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily - - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein coupled receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine-binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. 
The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#25446 - CGI_10025771 superfamily 149077 638 748 1.38E-42 152.008 cl06719 TMC superfamily - - "TMC domain; These sequences are similar to a region conserved amongst various protein products of the transmembrane channel-like (TMC) gene family, such as Transmembrane channel-like protein 3 and EVIN2 - this region is termed the TMC domain. Mutations in these genes are implicated in a number of human conditions, such as deafness and epidermodysplasia verruciformis. TMC proteins are thought to have important cellular roles, and may be modifiers of ion channels or transporters." Q#25447 - CGI_10025772 superfamily 245882 24 407 0 527.629 cl12119 Alpha_L_fucos superfamily - - Alpha-L-fucosidase; Alpha-L-fucosidase. Q#25448 - CGI_10025773 superfamily 241578 105 261 1.54E-43 156.298 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#25448 - CGI_10025773 superfamily 243061 1 97 3.54E-31 118.984 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#25448 - CGI_10025773 superfamily 246918 346 398 2.06E-12 63.7599 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#25448 - CGI_10025773 superfamily 246918 803 854 2.88E-12 63.3747 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#25448 - CGI_10025773 superfamily 246918 746 797 1.14E-11 61.8339 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#25448 - CGI_10025773 superfamily 246918 860 911 2.14E-11 60.6783 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#25448 - CGI_10025773 superfamily 246918 575 626 8.33E-11 59.1375 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#25448 - CGI_10025773 superfamily 246918 632 683 9.58E-11 59.1375 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#25448 - CGI_10025773 superfamily 246918 518 569 9.70E-11 59.1375 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. 
Q#25448 - CGI_10025773 superfamily 246918 461 512 5.63E-10 56.8263 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#25448 - CGI_10025773 superfamily 246918 404 455 6.17E-10 56.4411 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#25448 - CGI_10025773 superfamily 246918 689 740 8.52E-10 56.0559 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#25448 - CGI_10025773 superfamily 246918 290 341 1.43E-09 55.6707 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#25449 - CGI_10025774 superfamily 245814 181 203 0.00488523 33.6167 cl11960 Ig superfamily N - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#25449 - CGI_10025774 superfamily 245814 115 170 0.000440639 36.9893 cl11960 Ig superfamily N - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. 
A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#25450 - CGI_10025775 superfamily 248264 13 106 1.64E-08 51.469 cl17710 DDE_4 superfamily N - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#25454 - CGI_10025779 superfamily 215724 1 279 1.53E-112 330.735 cl14706 wnt superfamily - - "wnt family; Wnt genes have been identified in vertebrates and invertebrates but not in plants, unicellular eukaryotes or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathway) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families." Q#25456 - CGI_10025781 superfamily 198867 20 127 1.95E-05 42.3285 cl06652 BACK superfamily - - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation)." 
Q#25457 - CGI_10025782 superfamily 243092 1561 1679 2.19E-20 93.5536 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#25457 - CGI_10025782 superfamily 247743 456 566 0.00376699 38.72 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." 
Q#25458 - CGI_10025783 superfamily 243066 18 122 3.55E-21 89.2137 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#25458 - CGI_10025783 superfamily 198867 132 228 9.70E-16 73.5296 cl06652 BACK superfamily - - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation)." Q#25458 - CGI_10025783 superfamily 243146 361 406 1.45E-10 57.2862 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#25458 - CGI_10025783 superfamily 243146 456 503 0.000895489 37.641 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#25463 - CGI_10025788 superfamily 243072 359 468 2.13E-07 50.0746 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. 
The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#25464 - CGI_10025789 superfamily 241750 8 264 1.43E-68 215.13 cl00281 metallo-dependent_hydrolases superfamily - - "Superfamily of metallo-dependent hydrolases (also called amidohydrolase superfamily) is a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. The family includes urease alpha, adenosine deaminase, phosphotriesterase dihydroorotases, allantoinases, hydantoinases, AMP-, adenine and cytosine deaminases, imidazolonepropionase, aryldialkylphosphatase, chlorohydrolases, formylmethanofuran dehydrogenases and others." Q#25465 - CGI_10025790 superfamily 241750 56 300 2.50E-60 195.1 cl00281 metallo-dependent_hydrolases superfamily - - "Superfamily of metallo-dependent hydrolases (also called amidohydrolase superfamily) is a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. 
The family includes urease alpha, adenosine deaminase, phosphotriesterase dihydroorotases, allantoinases, hydantoinases, AMP-, adenine and cytosine deaminases, imidazolonepropionase, aryldialkylphosphatase, chlorohydrolases, formylmethanofuran dehydrogenases and others." Q#25466 - CGI_10025791 superfamily 241832 89 170 3.16E-37 126.479 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#25467 - CGI_10025792 superfamily 110440 84 110 0.0018744 33.5353 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase); proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats."
Q#25467 - CGI_10025792 superfamily 110440 125 152 0.00603037 32.3797 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase); proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#25470 - CGI_10025795 superfamily 247723 1200 1283 2.23E-51 178.611 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)."
Q#25470 - CGI_10025795 superfamily 248011 600 666 1.75E-09 57.5065 cl17457 PKD superfamily - - "polycystic kidney disease I (PKD) domain; similar to other cell-surface modules, with an IG-like fold; domain probably functions as a ligand binding site in protein-protein or protein-carbohydrate interactions; a single instance of the repeat is presented here. The domain is also found in microbial collagenases and chitinases." Q#25470 - CGI_10025795 superfamily 243078 1482 1616 6.68E-27 109.674 cl02544 VHS_ENTH_ANTH superfamily - - "VHS, ENTH and ANTH domain superfamily; composed of proteins containing a VHS, ENTH or ANTH domain. The VHS domain is present in Vps27 (Vacuolar Protein Sorting), Hrs (Hepatocyte growth factor-regulated tyrosine kinase substrate) and STAM (Signal Transducing Adaptor Molecule). It is located at the N-termini of proteins involved in intracellular membrane trafficking. The epsin N-terminal homology (ENTH) domain is an evolutionarily conserved protein module found primarily in proteins that participate in clathrin-mediated endocytosis. A set of proteins previously designated as harboring an ENTH domain in fact contains a highly similar, yet unique module referred to as an AP180 N-terminal homology (ANTH) domain. VHS, ENTH and ANTH domains are structurally similar and are composed of a superhelix of eight alpha helices. ENTH and ANTH (E/ANTH) domains bind both inositol phospholipids and proteins and contribute to the nucleation and formation of clathrin coats on membranes. ENTH domains also function in the development of membrane curvature through lipid remodeling during the formation of clathrin-coated vesicles. E/ANTH domain-bearing proteins have recently been shown to function with adaptor protein-1 and GGA adaptors at the trans-Golgi network, which suggests that E/ANTH domains are universal components of the machinery for clathrin-mediated membrane budding."
Q#25470 - CGI_10025795 superfamily 243154 1362 1413 1.46E-17 80.3392 cl02715 Surp superfamily - - Surp module; This domain is also known as the SWAP domain. SWAP stands for Suppressor-of-White-APricot. It has been suggested that these domains may be RNA binding. Q#25470 - CGI_10025795 superfamily 203903 1784 1827 5.93E-08 52.1645 cl07067 cwf21 superfamily - - cwf21 domain; The cwf21 family is involved in mRNA splicing. It has been isolated as a subcomplex of the splicosome in Schizosaccharomyces pombe. The function of the cwf21 domain is to bind directly to the spliceosomal protein Prp8. Mutations in the cwf21 domain prevent Prp8 from binding. The structure of this domain has recently been solved which shows this domain to be composed of two alpha helices. Q#25470 - CGI_10025795 superfamily 248011 688 765 4.68E-07 50.1422 cl17457 PKD superfamily - - "polycystic kidney disease I (PKD) domain; similar to other cell-surface modules, with an IG-like fold; domain probably functions as a ligand binding site in protein-protein or protein-carbohydrate interactions; a single instance of the repeat is presented here. The domain is also found in microbial collagenases and chitinases." Q#25472 - CGI_10025798 superfamily 154924 239 349 6.58E-46 154.085 cl02467 C4 superfamily - - C-terminal tandem repeated domain in type 4 procollagen; Duplicated domain in C-terminus of type 4 collagens. Mutations in alpha-5 collagen IV are associated with X-linked Alport syndrome. Q#25472 - CGI_10025798 superfamily 154924 167 235 2.03E-31 115.472 cl02467 C4 superfamily N - C-terminal tandem repeated domain in type 4 procollagen; Duplicated domain in C-terminus of type 4 collagens. Mutations in alpha-5 collagen IV are associated with X-linked Alport syndrome. Q#25472 - CGI_10025798 superfamily 154924 1 46 1.01E-13 66.166 cl02467 C4 superfamily N - C-terminal tandem repeated domain in type 4 procollagen; Duplicated domain in C-terminus of type 4 collagens. 
Mutations in alpha-5 collagen IV are associated with X-linked Alport syndrome. Q#25472 - CGI_10025798 superfamily 154924 64 86 1.36E-05 42.6688 cl02467 C4 superfamily C - C-terminal tandem repeated domain in type 4 procollagen; Duplicated domain in C-terminus of type 4 collagens. Mutations in alpha-5 collagen IV are associated with X-linked Alport syndrome. Q#25475 - CGI_10025801 superfamily 154924 135 201 5.04E-29 105.842 cl02467 C4 superfamily N - C-terminal tandem repeated domain in type 4 procollagen; Duplicated domain in C-terminus of type 4 collagens. Mutations in alpha-5 collagen IV are associated with X-linked Alport syndrome. Q#25477 - CGI_10025803 superfamily 154924 72 169 1.57E-48 158.999 cl02467 C4 superfamily - - C-terminal tandem repeated domain in type 4 procollagen; Duplicated domain in C-terminus of type 4 collagens. Mutations in alpha-5 collagen IV are associated with X-linked Alport syndrome. Q#25477 - CGI_10025803 superfamily 154924 183 289 8.61E-39 133.669 cl02467 C4 superfamily - - C-terminal tandem repeated domain in type 4 procollagen; Duplicated domain in C-terminus of type 4 collagens. Mutations in alpha-5 collagen IV are associated with X-linked Alport syndrome. Q#25478 - CGI_10025804 superfamily 241626 129 248 4.96E-56 178.952 cl00125 RHOD superfamily - - "Rhodanese Homology Domain (RHOD); an alpha beta fold domain found duplicated in the rhodanese protein. The cysteine containing enzymatically active version of the domain is also found in the Cdc25 class of protein phosphatases and a variety of proteins such as sulfide dehydrogenases and certain stress proteins such as senesence specific protein 1 in plants, PspE and GlpE in bacteria and cyanide and arsenate resistance proteins. Inactive versions (no active site cysteine) are also seen in dual specificity phosphatases, ubiquitin hydrolases from yeast and in sulfuryltransferases, where they are believed to play a regulatory role in multidomain proteins." 
Q#25481 - CGI_10008988 superfamily 245598 83 279 6.02E-76 241.029 cl11396 Patatin_and_cPLA2 superfamily - - "Patatins and Phospholipases; Patatin-like phospholipase. This family consists of various patatin glycoproteins from plants. The patatin protein accounts for up to 40% of the total soluble protein in potato tubers. Patatin is a storage protein, but it also has the enzymatic activity of a lipid acyl hydrolase, catalyzing the cleavage of fatty acids from membrane lipids. Members of this family have also been found in vertebrates. This family also includes the catalytic domain of cytosolic phospholipase A2 (PLA2; EC 3.1.1.4) hydrolyzes the sn-2-acyl ester bond of phospholipids to release arachidonic acid. At the active site, cPLA2 contains a serine nucleophile through which the catalytic mechanism is initiated. The active site is partially covered by a solvent-accessible flexible lid. cPLA2 displays interfacial activation as it exists in both "closed lid" and "open lid" forms." Q#25481 - CGI_10008988 superfamily 247856 397 422 0.00758481 34.6617 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#25482 - CGI_10008989 superfamily 241563 62 98 0.000324078 38.6144 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#25482 - CGI_10008989 superfamily 241563 8 52 0.00362434 35.5328 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#25484 - CGI_10008991 superfamily 199156 90 105 0.001918 32.8004 cl15298 zf-CCHC superfamily - - "Zinc knuckle; The zinc knuckle is a zinc binding motif composed of the following CX2CX4HX4C where X can be any amino acid. The motifs are mostly from retroviral gag proteins (nucleocapsid). Prototype structure is from HIV. Also contains members involved in eukaryotic gene regulation, such as C. elegans GLH-1. Structure is an 18-residue zinc finger." Q#25485 - CGI_10008992 superfamily 243119 359 403 3.85E-06 43.9717 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#25486 - CGI_10008993 superfamily 243119 61 107 7.18E-06 39.7346 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges.
Chitin binding has been demonstrated for a protein containing only two of these domains. Q#25488 - CGI_10008995 superfamily 241958 72 504 3.17E-100 310.601 cl00573 SDF superfamily - - Sodium:dicarboxylate symporter family; Sodium:dicarboxylate symporter family. Q#25489 - CGI_10008996 superfamily 217293 1 151 1.10E-46 160.491 cl03788 Neur_chan_LBD superfamily N - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#25489 - CGI_10008996 superfamily 202474 158 354 8.03E-30 115.058 cl08379 Neur_chan_memb superfamily - - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#25491 - CGI_10006490 superfamily 241600 200 335 3.24E-57 186.677 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#25492 - CGI_10006493 superfamily 243035 103 189 2.53E-19 79.9713 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. 
This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. 
In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#25493 - CGI_10006494 superfamily 243066 3 87 1.44E-19 82.6653 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#25493 - CGI_10006494 superfamily 198867 97 207 7.65E-15 69.6776 cl06652 BACK superfamily - - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation)." 
Q#25494 - CGI_10006495 superfamily 243035 324 439 6.03E-20 84.9789 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#25494 - CGI_10006495 superfamily 243066 22 123 7.71E-26 100.384 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#25494 - CGI_10006495 superfamily 198867 135 241 4.15E-08 50.8028 cl06652 BACK superfamily - - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. 
The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation)." Q#25496 - CGI_10009011 superfamily 217473 2935 3021 6.08E-07 52.7525 cl03978 Mab-21 superfamily N - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#25498 - CGI_10009013 superfamily 243100 649 694 0.0041261 36.148 cl02576 B_zip1 superfamily - - "basic leucine zipper DNA-binding and multimerization region of GCN4 and related proteins; Basic leucine zipper (bZIP) transcription factors act in networks of homo- and hetero-dimers in the regulation in a diverse set of cellular pathways. Classical leucine zippers have alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization creates a pair of basic regions that bind DNA and undergo conformational change. GCN4 was identified in Saccharomyces cerevisiae from mutations in a deficiency in activation with the general amino acid control pathway. GCN4 encodes a trans-activator of amino acid biosynthetic genes containing 2 acidic activation domains and a C-terminal bZIP domain, comprised of a basic alpha-helical DNA-binding region and a coiled-coil dimerization region." Q#25499 - CGI_10009014 superfamily 241623 246 495 1.35E-33 127.461 cl00119 PI3Kc_like superfamily - - "Phosphoinositide 3-kinase (PI3K)-like family, catalytic domain; The PI3K-like catalytic domain family is part of a larger superfamily that includes the catalytic domains of other kinases such as the typical serine/threonine/tyrosine protein kinases (PKs), aminoglycoside phosphotransferase, choline kinase, and RIO kinases. 
Members of the family include PI3K, phosphoinositide 4-kinase (PI4K), PI3K-related protein kinases (PIKKs), and TRansformation/tRanscription domain-Associated Protein (TRRAP). PI3Ks catalyze the transfer of the gamma-phosphoryl group from ATP to the 3-hydroxyl of the inositol ring of D-myo-phosphatidylinositol (PtdIns) or its derivatives, while PI4K catalyze the phosphorylation of the 4-hydroxyl of PtdIns. PIKKs are protein kinases that catalyze the phosphorylation of serine/threonine residues, especially those that are followed by a glutamine. PI3Ks play an important role in a variety of fundamental cellular processes, including cell motility, the Ras pathway, vesicle trafficking and secretion, immune cell activation and apoptosis. PI4Ks produce PtdIns(4)P, the major precursor to important signaling phosphoinositides. PIKKs have diverse functions including cell-cycle checkpoints, genome surveillance, mRNA surveillance, and translation control." Q#25500 - CGI_10009015 superfamily 241647 102 128 6.16E-05 41.7446 cl00157 WW superfamily - - Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs. 
Q#25500 - CGI_10009015 superfamily 241647 222 249 0.000125087 40.589 cl00157 WW superfamily - - Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs. Q#25500 - CGI_10009015 superfamily 241647 332 361 0.000295811 39.4334 cl00157 WW superfamily - - Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs. Q#25500 - CGI_10009015 superfamily 207669 575 620 1.10E-10 58.6098 cl02610 FF superfamily - - "FF domain; This domain has been predicted to be involved in protein-protein interaction. This domain was recently shown to bind the hyperphosphorylated C-terminal repeat domain of RNA polymerase II, confirming its role in protein-protein interactions." 
Q#25500 - CGI_10009015 superfamily 207669 506 552 5.61E-10 56.6838 cl02610 FF superfamily - - "FF domain; This domain has been predicted to be involved in protein-protein interaction. This domain was recently shown to bind the hyperphosphorylated C-terminal repeat domain of RNA polymerase II, confirming its role in protein-protein interactions." Q#25500 - CGI_10009015 superfamily 207669 439 485 4.19E-09 53.9874 cl02610 FF superfamily - - "FF domain; This domain has been predicted to be involved in protein-protein interaction. This domain was recently shown to bind the hyperphosphorylated C-terminal repeat domain of RNA polymerase II, confirming its role in protein-protein interactions." Q#25500 - CGI_10009015 superfamily 207669 820 880 4.90E-06 45.1278 cl02610 FF superfamily - - "FF domain; This domain has been predicted to be involved in protein-protein interaction. This domain was recently shown to bind the hyperphosphorylated C-terminal repeat domain of RNA polymerase II, confirming its role in protein-protein interactions." Q#25500 - CGI_10009015 superfamily 207669 763 813 9.61E-06 44.3574 cl02610 FF superfamily - - "FF domain; This domain has been predicted to be involved in protein-protein interaction. This domain was recently shown to bind the hyperphosphorylated C-terminal repeat domain of RNA polymerase II, confirming its role in protein-protein interactions." Q#25500 - CGI_10009015 superfamily 207669 705 739 0.00991833 35.1126 cl02610 FF superfamily C - "FF domain; This domain has been predicted to be involved in protein-protein interaction. This domain was recently shown to bind the hyperphosphorylated C-terminal repeat domain of RNA polymerase II, confirming its role in protein-protein interactions." 
Q#25501 - CGI_10009016 superfamily 247684 10 441 9.32E-79 257.206 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#25502 - CGI_10009017 superfamily 247684 10 519 8.86E-91 290.718 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#25505 - CGI_10009020 superfamily 247684 10 383 4.18E-66 221.767 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#25513 - CGI_10009934 superfamily 243035 50 123 4.83E-09 50.311 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. 
Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer.
A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#25515 - CGI_10009936 superfamily 246723 116 771 0 684.418 cl14813 GluZincin superfamily - - "Peptidase Gluzincin family (thermolysin-like proteinases, TLPs) includes peptidases M1, M2, M3, M4, M13, M32 and M36 (fungalysins); Gluzincin family (thermolysin-like peptidases or TLPs) includes several zinc-dependent metallopeptidases such as the M1, M2, M3, M4, M13, M32, M36 peptidases (MEROPS classification), and contain HEXXH and EXXXD motifs as part of their active site. All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. M1 family includes aminopeptidase N (APN) and leukotriene A4 hydrolase (LTA4H). APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types. LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity such that the two activities occupy different, but overlapping sites. The peptidase M3 or neurolysin-like family, includes M3, M2 and M32 metallopeptidases. The M3 peptidases have two subfamilies: M3A, includes thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (3.4.24.16), and the mitochondrial intermediate peptidase; M3B contains oligopeptidase F. M2 peptidase angiotensin converting enzyme (ACE, EC 3.4.15.1) catalyzes the conversion of decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II. 
ACE is a key part of the renin-angiotensin system that regulates blood pressure, thus ACE inhibitors are important for the treatment of hypertension. M32 family includes two eukaryotic enzymes from protozoa Trypanosoma cruzi, a causative agent of Chagas' disease, and Leishmania major, a parasite that causes leishmaniasis, making them attractive targets for drug development. The M4 family includes secreted protease thermolysin (EC 3.4.24.27), pseudolysin, aureolysin, neutral protease as well as fungalysin and bacillolysin (EC 3.4.24.28) that degrade extracellular proteins and peptides for bacterial nutrition, especially prior to sporulation. Thermolysin is widely used as a nonspecific protease to obtain fragments for peptide sequencing as well as in production of the artificial sweetener aspartame. M13 family includes neprilysin (EC 3.4.24.11) and endothelin-converting enzyme I (ECE-1, EC 3.4.24.71), which fulfill a broad range of physiological roles due to the greater variation in the S2' subsite allowing substrate specificity and are prime therapeutic targets for selective inhibition. Peptidase M36 (fungamysin) family includes endopeptidases from pathogenic fungi. Fungalysin hydrolyzes extracellular matrix proteins such as elastin and keratin. Aspergillus fumigatus causes the pulmonary disease aspergillosis by invading the lungs of immuno-compromised animals and secreting fungalysin that possibly breaks down proteinaceous structural barriers." Q#25516 - CGI_10009937 superfamily 245864 23 233 3.99E-79 247.192 cl12078 p450 superfamily N - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. 
Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#25517 - CGI_10009938 superfamily 217473 113 386 1.83E-36 139.037 cl03978 Mab-21 superfamily - - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#25518 - CGI_10009939 superfamily 247984 109 312 1.52E-26 107.311 cl17430 FtsJ superfamily - - "FtsJ-like methyltransferase; This family consists of FtsJ from various bacterial and archaeal sources FtsJ is a methyltransferase, but actually has no effect on cell division. FtsJ's substrate is the 23S rRNA. The 1.5 A crystal structure of FtsJ in complex with its cofactor S-adenosylmethionine revealed that FtsJ has a methyltransferase fold. This family also includes the N terminus of flaviviral NS5 protein. It has been hypothesised that the N-terminal domain of NS5 is a methyltransferase involved in viral RNA capping." Q#25519 - CGI_10009940 superfamily 242897 7 110 2.63E-22 85.5103 cl02129 ParBc superfamily - - ParB-like nuclease domain; ParB-like nuclease domain. 
Q#25520 - CGI_10009941 superfamily 241599 6 63 1.67E-19 77.6688 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#25521 - CGI_10009942 superfamily 241599 67 117 1.75E-17 71.8908 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#25522 - CGI_10009943 superfamily 241599 624 681 4.69E-21 88.4544 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#25522 - CGI_10009943 superfamily 241599 745 802 2.23E-20 86.5284 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#25522 - CGI_10009943 superfamily 241578 261 414 2.97E-38 141.198 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#25522 - CGI_10009943 superfamily 207701 9 126 2.16E-15 74.3167 cl02699 VIT superfamily - - Vault protein inter-alpha-trypsin domain; Inter-alpha-trypsin inhibitors (ITIs) consist of one light chain and a variable set of heavy chains. ITIs play a role in extracellular matrix (ECM) stabilisation and tumour metastasis as well as in plasma protease inhibition. The vault protein inter-alpha-trypsin (VIT) domain described here is found to the N-terminus of a von Willebrand factor type A domain (pfam00092) in ITI heavy chains (ITIHs) and their precursors. Q#25525 - CGI_10020721 superfamily 198827 1 84 2.30E-22 85.9439 cl03803 BAF superfamily - - Barrier to autointegration factor; The BAF protein has a SAM-domain-like bundle of orthogonally packed alpha-hairpins - one classic and one pseudo helix-hairpin-helix motif. The protein is involved in the prevention of retroviral DNA integration. Q#25526 - CGI_10020722 superfamily 198827 81 169 1.35E-49 156.821 cl03803 BAF superfamily - - Barrier to autointegration factor; The BAF protein has a SAM-domain-like bundle of orthogonally packed alpha-hairpins - one classic and one pseudo helix-hairpin-helix motif. The protein is involved in the prevention of retroviral DNA integration. 
Q#25528 - CGI_10020724 superfamily 198827 1 74 3.57E-43 136.79 cl03803 BAF superfamily - - Barrier to autointegration factor; The BAF protein has a SAM-domain-like bundle of orthogonally packed alpha-hairpins - one classic and one pseudo helix-hairpin-helix motif. The protein is involved in the prevention of retroviral DNA integration. Q#25530 - CGI_10020726 superfamily 241571 58 164 2.06E-25 101.72 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#25530 - CGI_10020726 superfamily 241571 172 273 9.40E-20 85.5418 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#25530 - CGI_10020726 superfamily 241571 411 514 1.13E-19 85.5418 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#25530 - CGI_10020726 superfamily 241571 290 396 4.99E-17 77.8378 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#25530 - CGI_10020726 superfamily 241600 533 606 6.12E-16 76.5103 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. 
C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#25531 - CGI_10020727 superfamily 247805 48 246 7.75E-88 269.74 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#25531 - CGI_10020727 superfamily 247905 270 391 4.18E-21 88.834 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#25532 - CGI_10020728 superfamily 245836 182 396 1.14E-91 281.825 cl12015 Adenylation_DNA_ligase_like superfamily - - "Adenylation domain of proteins similar to ATP-dependent polynucleotide ligases; ATP-dependent polynucleotide ligases catalyze the phosphodiester bond formation of nicked nucleic acid substrates using ATP as a cofactor in a three step reaction mechanism. This family includes ATP-dependent DNA and RNA ligases. DNA ligases play a vital role in the diverse processes of DNA replication, recombination and repair. 
ATP-dependent DNA ligases have a highly modular architecture, consisting of a unique arrangement of two or more discrete domains, including a DNA-binding domain, an adenylation or nucleotidyltransferase (NTase) domain, and an oligonucleotide/oligosaccharide binding (OB)-fold domain. The adenylation domain binds ATP and contains many active site residues. Together with the C-terminal OB-fold domain, it comprises a catalytic core unit that is common to most members of the ATP-dependent DNA ligase family. The catalytic core contains six conserved sequence motifs (I, III, IIIa, IV, V and VI) that define this family of related nucleotidyltransferases including eukaryotic GRP-dependent mRNA-capping enzymes. The catalytic core contains both the active site as well as many DNA-binding residues. The RNA circularization protein from archaea and bacteria contains the minimal catalytic unit, the adenylation domain, but does not contain an OB-fold domain. This family also includes the m3G-cap binding domain of snurportin, a nuclear import adaptor that binds m3G-capped spliceosomal U small nucleoproteins (snRNPs), but doesn't have enzymatic activity." Q#25532 - CGI_10020728 superfamily 217793 399 495 3.20E-29 111.018 cl04328 mRNA_cap_C superfamily - - "mRNA capping enzyme, C-terminal domain; mRNA capping enzyme, C-terminal domain. " Q#25532 - CGI_10020728 superfamily 241574 41 106 8.99E-07 47.2788 cl00053 PTPc superfamily N - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. 
Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#25534 - CGI_10020730 superfamily 248312 117 259 1.74E-07 48.5037 cl17758 PMP22_Claudin superfamily - - PMP-22/EMP/MP20/Claudin family; PMP-22/EMP/MP20/Claudin family. Q#25535 - CGI_10020731 superfamily 238191 11 513 5.46E-119 365.502 cl18907 Esterase_lipase superfamily - - "Esterases and lipases (includes fungal lipases, cholinesterases, etc.) These enzymes act on carboxylic esters (EC: 3.1.1.-). The catalytic apparatus involves three residues (catalytic triad): a serine, a glutamate or aspartate and a histidine.These catalytic residues are responsible for the nucleophilic attack on the carbonyl carbon atom of the ester bond. In contrast with other alpha/beta hydrolase fold family members, p-nitrobenzyl esterase and acetylcholine esterase have a Glu instead of Asp at the active site carboxylate." Q#25536 - CGI_10020732 superfamily 219673 105 289 6.54E-58 189.046 cl06835 COPIIcoated_ERV superfamily - - "Endoplasmic reticulum vesicle transporter; This family is conserved from plants and fungi to humans. Erv46 works in close conjunction with Erv41 and together they form a complex which cycles between the endoplasmic reticulum and Golgi complex. Erv46-41 interacts strongly with the endoplasmic reticulum glucosidase II. Mammalian glucosidase II comprises a catalytic alpha-subunit and a 58 kDa beta subunit, which is required for ER localisation. All proteins identified biochemically as Erv41p-Erv46p interactors are localised to the early secretory pathway and are involved in protein maturation and processing in the ER and/or sorting into COPII vesicles for transport to the Golgi." Q#25537 - CGI_10020733 superfamily 243077 334 386 1.70E-20 85.2897 cl02542 DnaJ superfamily - - "DnaJ domain or J-domain. 
DnaJ/Hsp40 (heat shock protein 40) proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperonine family. Hsp40 proteins are characterized by the presence of a J domain, which mediates the interaction with Hsp70. They may contain other domains as well, and the architectures provide a means of classification." Q#25538 - CGI_10020734 superfamily 191444 100 170 9.85E-09 49.2449 cl05558 IL17 superfamily - - Interleukin-17; IL-17 is a potent proinflammatory cytokine produced by activated memory T cells. The IL-17 family is thought to represent a distinct signaling system that appears to have been highly conserved across vertebrate evolution. Q#25539 - CGI_10020735 superfamily 242274 14 210 7.51E-72 220.202 cl01053 SGNH_hydrolase superfamily - - "SGNH_hydrolase, or GDSL_hydrolase, is a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the typical Ser-His-Asp(Glu) triad from other serine hydrolases, but may lack the carboxlic acid." Q#25540 - CGI_10020736 superfamily 241568 43 98 6.78E-05 39.3684 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." 
Q#25541 - CGI_10020737 superfamily 241571 394 506 1.17E-19 86.3122 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#25542 - CGI_10020738 superfamily 241571 725 840 5.52E-26 104.802 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#25542 - CGI_10020738 superfamily 241571 255 367 6.71E-23 95.9422 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#25542 - CGI_10020738 superfamily 241571 614 724 2.25E-20 88.6234 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#25542 - CGI_10020738 superfamily 241571 146 254 2.95E-17 79.7638 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#25542 - CGI_10020738 superfamily 241571 373 499 4.82E-16 76.297 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#25542 - CGI_10020738 superfamily 241571 57 120 2.86E-06 47.0219 cl00049 CUB superfamily N - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#25543 - CGI_10020739 superfamily 242323 113 221 2.51E-20 86.0231 cl01132 FA_hydroxylase superfamily - - "Fatty acid hydroxylase superfamily; This superfamily includes fatty acid and carotene hydroxylases and sterol desaturases. 
Beta-carotene hydroxylase is involved in zeaxanthin synthesis by hydroxylating beta-carotene, but the enzyme may be involved in other pathways. This family includes C-5 sterol desaturase and C-4 sterol methyl oxidase. Members of this family are involved in cholesterol biosynthesis and biosynthesis a plant cuticular wax. These enzymes contain two copies of a HXHH motif. Members of this family are integral membrane proteins." Q#25545 - CGI_10020741 superfamily 243035 46 124 6.01E-05 42.2218 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. 
Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#25545 - CGI_10020741 superfamily 241578 215 399 1.47E-21 93.6129 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#25546 - CGI_10020742 superfamily 243035 180 250 3.34E-06 45.3034 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. 
Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#25546 - CGI_10020742 superfamily 243035 378 454 0.000414782 38.755 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. 
Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. 
Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#25546 - CGI_10020742 superfamily 241619 257 329 0.000327049 39.0992 cl00112 PAN_APPLE superfamily - - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#25547 - CGI_10020743 superfamily 241737 123 290 1.04E-87 262.143 cl00264 Ferritin_like superfamily - - "Ferritin-like superfamily of diiron-containing four-helix-bundle proteins; Ferritin-like, diiron-carboxylate proteins participate in a range of functions including iron regulation, mono-oxygenation, and reactive radical production. These proteins are characterized by the fact that they catalyze dioxygen-dependent oxidation-hydroxylation reactions within diiron centers; one exception is manganese catalase, which catalyzes peroxide-dependent oxidation-reduction within a dimanganese center. Diiron-carboxylate proteins are further characterized by the presence of duplicate metal ligands, glutamates and histidines (ExxH) and two additional glutamates within a four-helix bundle. Outside of these conserved residues there is little obvious homology. 
Members include bacterioferritin, ferritin, rubrerythrin, aromatic and alkene monooxygenase hydroxylases (AAMH), ribonucleotide reductase R2 (RNRR2), acyl-ACP-desaturases (Acyl_ACP_Desat), manganese (Mn) catalases, demethoxyubiquinone hydroxylases (DMQH), DNA protecting proteins (DPS), and ubiquinol oxidases (AOX), and the aerobic cyclase system, Fe-containing subunit (ACSF)." Q#25550 - CGI_10020746 superfamily 241748 17 197 5.73E-89 270.742 cl00279 APP_MetAP superfamily C - "A family including aminopeptidase P, aminopeptidase M, and prolidase. Also known as metallopeptidase family M24. This family of enzymes is able to cleave amido-, imido- and amidino-containing bonds. Members exhibit relatively narrow substrate specificity compared to other metallo-aminopeptidases, suggesting they play roles in regulation of biological processes rather than general protein degradation." Q#25550 - CGI_10020746 superfamily 241748 284 333 1.94E-14 70.823 cl00279 APP_MetAP superfamily N - "A family including aminopeptidase P, aminopeptidase M, and prolidase. Also known as metallopeptidase family M24. This family of enzymes is able to cleave amido-, imido- and amidino-containing bonds. Members exhibit relatively narrow substrate specificity compared to other metallo-aminopeptidases, suggesting they play roles in regulation of biological processes rather than general protein degradation." Q#25552 - CGI_10020748 superfamily 247724 5 172 3.67E-41 138.446 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. 
This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#25554 - CGI_10020750 superfamily 241574 9 88 0.000689559 34.5065 cl00053 PTPc superfamily C - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#25555 - CGI_10020751 superfamily 241574 127 258 1.12E-31 117.324 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. 
This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#25555 - CGI_10020751 superfamily 241574 301 383 4.55E-07 47.6033 cl00053 PTPc superfamily C - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#25556 - CGI_10020752 superfamily 247905 418 485 4.01E-12 63.4108 cl17351 HELICc superfamily N - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#25556 - CGI_10020752 superfamily 247805 3 144 6.79E-06 44.6356 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. 
This domain contains the ATP-binding region. Q#25557 - CGI_10020753 superfamily 241563 61 100 0.000228004 40.1552 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#25558 - CGI_10020754 superfamily 247856 2 35 6.43E-05 41.7645 cl17302 EFh superfamily N - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#25558 - CGI_10020754 superfamily 241620 89 105 0.000968744 37.8033 cl00113 CRIB superfamily C - "PAK (p21 activated kinase) Binding Domain (PBD), binds Cdc42p- and/or Rho-like small GTPases; also known as the Cdc42/Rac interactive binding (CRIB) motif; has been shown to inhibit transcriptional activation and cell transformation mediated by the Ras-Rac pathway. CRIB-containing effector proteins are functionally diverse and include serine/threonine kinases, tyrosine kinases, actin-binding proteins, and adapter molecules." Q#25559 - CGI_10020755 superfamily 241832 90 192 9.52E-10 54.1537 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. 
The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#25560 - CGI_10020756 superfamily 241610 684 736 8.74E-22 90.0018 cl00101 KU superfamily - - BPTI/Kunitz family of serine protease inhibitors; Structure is a disulfide rich alpha+beta fold. BPTI (bovine pancreatic trypsin inhibitor) is an extensively studied model structure. Q#25561 - CGI_10020757 superfamily 241874 11 495 0 634.905 cl00456 SLC5-6-like_sbd superfamily - - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." 
Q#25562 - CGI_10020758 superfamily 216301 22 199 1.24E-51 166.671 cl03099 EMP24_GP25L superfamily - - emp24/gp25L/p24 family/GOLD; Members of this family are implicated in bringing cargo forward from the ER and binding to coat proteins by their cytoplasmic domains. This domain corresponds closely to the beta-strand rich GOLD domain described in. The GOLD domain is always found combined with lipid- or membrane-association domains. Q#25565 - CGI_10010772 superfamily 217473 99 333 1.82E-24 102.828 cl03978 Mab-21 superfamily - - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#25566 - CGI_10010773 superfamily 242902 64 126 2.13E-14 69.9682 cl02144 TLD superfamily C - TLD; This domain is predicted to be an enzyme and is often found associated with pfam01476. Q#25566 - CGI_10010773 superfamily 247724 366 425 0.00762412 36.0152 cl17170 Ras_like_GTPase superfamily NC - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. 
The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#25567 - CGI_10010774 superfamily 242902 38 130 3.57E-11 59.9531 cl02144 TLD superfamily C - TLD; This domain is predicted to be an enzyme and is often found associated with pfam01476. Q#25571 - CGI_10010778 superfamily 243034 101 178 5.74E-07 47.3748 cl02429 TPR superfamily C - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transciption, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accomodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#25571 - CGI_10010778 superfamily 242223 279 375 5.73E-26 100.925 cl00960 Fic 
superfamily - - "Fic/DOC family; This family consists of the Fic (filamentation induced by cAMP) protein and doc (death on curing). The Fic protein is involved in cell division and is suggested to be involved in the synthesis of PAB or folate, indicating that the Fic protein and cAMP are involved in a regulatory mechanism of cell division via folate metabolism. This family contains a central conserved motif HPFXXGNG in most members. The exact molecular function of these proteins is uncertain. P1 lysogens of Escherichia coli carry the prophage as a stable low copy number plasmid. The frequency with which viable cells cured of prophage are produced is about 10(-5) per cell per generation. A significant part of this remarkable stability can be attributed to a plasmid-encoded mechanism that causes death of cells that have lost P1. In other words, the lysogenic cells appear to be addicted to the presence of the prophage. The plasmid withdrawal response depends on a gene named doc (death on curing) that is represented by this family. Doc induces a reversible growth arrest of E. coli cells by targeting the protein synthesis machinery. Doc hosts the C-terminal domain of its antitoxin partner Phd (prevents host death) through fold complementation, a domain that is intrinsically disordered in solution but that folds into an alpha-helix on binding to Doc. This domain forms complexes with Phd antitoxins containing pfam02604." Q#25572 - CGI_10010779 superfamily 207662 816 896 2.13E-54 185.066 cl02596 NR_DBD_like superfamily - - "DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers; DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. It interacts with a specific DNA site upstream of the target gene and modulates the rate of transcriptional initiation. 
Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions, from development, reproduction, to homeostasis and metabolism in animals (metazoans). The family contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). Most nuclear receptors bind as homodimers or heterodimers to their target sites, which consist of two hexameric half-sites. Specificity is determined by the half-site sequence, the relative orientation of the half-sites and the number of spacer nucleotides between the half-sites. However, a growing number of nuclear receptors have been reported to bind to DNA as monomers." Q#25572 - CGI_10010779 superfamily 247905 527 640 8.03E-24 99.2344 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#25572 - CGI_10010779 superfamily 245599 977 1171 3.21E-77 253.084 cl11397 NR_LBD superfamily - - "The ligand binding domain of nuclear receptors, a family of ligand-activated transcription regulators; Ligand-binding domain (LBD) of nuclear receptor (NR): Nuclear receptors form a superfamily of 
ligand-activated transcription regulators, which regulate various physiological functions in metazoans, from development, reproduction, to homeostasis and metabolism. The superfamily contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. The members of the family include receptors of steroids, thyroid hormone, retinoids, cholesterol by-products, lipids and heme. With few exceptions, NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD)." Q#25572 - CGI_10010779 superfamily 247805 258 466 7.05E-40 147.021 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#25574 - CGI_10010781 superfamily 246918 44 93 2.22E-09 50.2779 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#25574 - CGI_10010781 superfamily 243119 101 144 0.00039136 36.2577 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#25576 - CGI_10006341 superfamily 247805 39 147 3.77E-11 56.1916 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 
Q#25577 - CGI_10006343 superfamily 241756 19 332 7.89E-172 483.208 cl00289 FIG superfamily - - "FIG, FBPase/IMPase/glpX-like domain. A superfamily of metal-dependent phosphatases with various substrates. Fructose-1,6-bisphosphatase (both the major and the glpX-encoded variant) hydrolyze fructose-1,6-bisphosphate to fructose-6-phosphate in gluconeogenesis. Inositol-monophosphatases and inositol polyphosphatases play vital roles in eukaryotic signalling, as they participate in metabolizing the messenger molecule Inositol-1,4,5-triphosphate. Many of these enzymes are inhibited by Li+." Q#25578 - CGI_10006344 superfamily 243106 1895 2034 7.20E-50 175.08 cl02608 BAH superfamily - - "BAH, or Bromo Adjacent Homology domain (also called ELM1 and BAM for Bromo Adjacent Motif). BAH domains have first been described as domains found in the polybromo protein and Yeast Rsc1/Rsc2 (Remodeling of the Structure of Chromatin). They also occur in mammalian DNA methyltransferases and the MTA1 subunits of histone deacetylase complexes. A BAH domain is also found in Yeast Sir3p and in the origin receptor complex protein 1 (Orc1p), where it was found to interact with the N-terminal lobe of the silence information regulator 1 protein (Sir1p), confirming the initial hypothesis that BAH plays a role in protein-protein interactions." Q#25578 - CGI_10006344 superfamily 243091 1402 1521 3.26E-37 139.007 cl02566 SET superfamily - - "SET domain; SET domains are protein lysine methyltransferase enzymes. SET domains appear to be protein-protein interaction domains. It has been demonstrated that SET domains mediate interactions with a family of proteins that display similarity with dual-specificity phosphatases (dsPTPases). A subset of SET domains have been called PR domains. These domains are divergent in sequence from other SET domains, but also appear to mediate protein-protein interaction. The SET domain consists of two regions known as SET-N and SET-C. 
SET-C forms an unusual and conserved knot-like structure of probably functional importance. Additionally to SET-N and SET-C, an insert region (SET-I) and flanking regions of high structural variability form part of the overall structure." Q#25578 - CGI_10006344 superfamily 243084 1677 1781 2.38E-27 109.787 cl02556 Bromodomain superfamily - - Bromodomain. Bromodomains are found in many chromatin-associated proteins and in nuclear histone acetyltransferases. They interact specifically with acetylated lysine. Q#25578 - CGI_10006344 superfamily 197795 1351 1398 9.19E-13 65.8838 cl02673 AWS superfamily - - associated with SET domains; subdomain of PRESET Q#25578 - CGI_10006344 superfamily 247999 1822 1862 2.12E-05 44.5102 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#25579 - CGI_10006345 superfamily 220642 4 93 4.59E-13 69.2021 cl10921 DAP3 superfamily C - "Mitochondrial ribosomal death-associated protein 3; This is a family of conserved proteins which were originally described as death-associated-protein-3 (DAP-3). The proteins carry a P-loop DNA-binding motif, and induce apoptosis. DAP3 has been shown to be a pro-apoptotic factor in the mitochondrial matrix and to be crucial for mitochondrial biogenesis and so has also been designated as MRP-S29 (mitochondrial ribosomal protein subunit 29)." Q#25580 - CGI_10006346 superfamily 241571 20 135 3.61E-08 50.8739 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#25580 - CGI_10006346 superfamily 245213 144 174 0.000239551 38.7718 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#25580 - CGI_10006346 superfamily 245213 185 213 0.000348003 38.3866 cl09941 EGF_CA superfamily N - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#25581 - CGI_10006347 superfamily 245847 9 110 3.43E-08 49.038 cl12042 FA58C superfamily N - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#25582 - CGI_10001173 superfamily 243051 205 361 5.82E-42 148.295 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. 
Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal cord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#25582 - CGI_10001173 superfamily 243051 22 184 8.26E-40 142.132 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal cord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#25582 - CGI_10001173 superfamily 243051 367 498 7.95E-34 125.183 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. 
MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal cord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#25584 - CGI_10022939 superfamily 241589 11 141 8.18E-47 156.64 cl00071 GLECT superfamily - - "Galectin/galactose-binding lectin. This domain exclusively binds beta-galactosides, such as lactose, and does not require metal ions for activity. GLECT domains occur as homodimers or tandemly repeated domains. They are developmentally regulated and may be involved in differentiation, cell-cell interaction and cellular regulation." Q#25584 - CGI_10022939 superfamily 241589 204 338 2.66E-40 139.691 cl00071 GLECT superfamily - - "Galectin/galactose-binding lectin. This domain exclusively binds beta-galactosides, such as lactose, and does not require metal ions for activity. GLECT domains occur as homodimers or tandemly repeated domains. They are developmentally regulated and may be involved in differentiation, cell-cell interaction and cellular regulation." Q#25585 - CGI_10022940 superfamily 241589 63 197 1.83E-41 138.536 cl00071 GLECT superfamily - - "Galectin/galactose-binding lectin. This domain exclusively binds beta-galactosides, such as lactose, and does not require metal ions for activity. 
GLECT domains occur as homodimers or tandemly repeated domains. They are developmentally regulated and may be involved in differentiation, cell-cell interaction and cellular regulation." Q#25586 - CGI_10022941 superfamily 241589 64 198 1.09E-42 141.617 cl00071 GLECT superfamily - - "Galectin/galactose-binding lectin. This domain exclusively binds beta-galactosides, such as lactose, and does not require metal ions for activity. GLECT domains occur as homodimers or tandemly repeated domains. They are developmentally regulated and may be involved in differentiation, cell-cell interaction and cellular regulation." Q#25588 - CGI_10022943 superfamily 242201 204 284 3.71E-13 67.4998 cl00933 ClpS superfamily - - "ATP-dependent Clp protease adaptor protein ClpS; In the bacterial cytosol, ATP-dependent protein degradation is performed by several different chaperone-protease pairs, including ClpAP. ClpS directly influences the ClpAP machine by binding to the N-terminal domain of the chaperone ClpA. The degradation of ClpAP substrates, both SsrA-tagged proteins and ClpA itself, is specifically inhibited by ClpS. ClpS modifies ClpA substrate specificity, potentially redirecting degradation by ClpAP toward aggregated proteins." Q#25589 - CGI_10022944 superfamily 241868 40 206 1.34E-26 100.257 cl00447 Nudix_Hydrolase superfamily - - "Nudix hydrolase is a superfamily of enzymes found in all three kingdoms of life, and it catalyzes the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+ for their activity. Members of this family are recognized by a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which forms a structural motif that functions as a metal binding and catalytic site. Substrates of nudix hydrolase include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. 
These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance and "house-cleaning" enzymes. Substrate specificity is used to define child families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. This superfamily consists of at least nine families: IPP (isopentenyl diphosphate) isomerase, ADP ribose pyrophosphatase, mutT pyrophosphohydrolase, coenzyme-A pyrophosphatase, MTH1-7,8-dihydro-8-oxoguanine-triphosphatase, diadenosine tetraphosphate hydrolase, NADH pyrophosphatase, GDP-mannose hydrolase and the c-terminal portion of the mutY adenine glycosylase." Q#25590 - CGI_10022945 superfamily 241599 198 252 4.48E-16 71.5056 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#25593 - CGI_10022948 superfamily 247724 122 235 3.99E-65 203.788 cl17170 Ras_like_GTPase superfamily N - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#25593 - CGI_10022948 superfamily 218609 71 124 3.31E-20 83.5771 cl05189 Destabilase superfamily C - "Destabilase; Destabilase is an endo-epsilon(gamma-Glu)-Lys isopeptidase, which cleaves isopeptide bonds formed by transglutaminase (Factor XIIIa) between glutamine gamma-carboxamide and the epsilon-amino group of lysine." Q#25596 - CGI_10022951 superfamily 247042 16 449 2.68E-148 451.928 cl15693 Sema superfamily - - "The Sema domain, a protein interacting module, of semaphorins and plexins; Both semaphorins and plexins have a Sema domain on their N-termini. Plexins function as receptors for the semaphorins. Evolutionarily, plexins may be the ancestor of semaphorins. Semaphorins are regulatory molecules in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems, and cancer. Semaphorins can be divided into 7 classes. Vertebrates have members in classes 3-7, whereas classes 1 and 2 are known only in invertebrates. Class 2 and 3 semaphorins are secreted; classes 1 and 4 through 6 are transmembrane proteins; and class 7 is membrane associated via glycosylphosphatidylinositol (GPI) linkage. 
Plexins are a large family of transmembrane proteins, which are divided into four types (A-D) according to sequence similarity. In vertebrates, type A plexins serve as co-receptors for neuropilins to mediate the signalling of class 3 semaphorins. Plexins serve as direct receptors for several other members of the semaphorin family: class 6 semaphorins signal through type A plexins and class 4 semaphorins through type B plexins. This family also includes the MET and RON receptor tyrosine kinases. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves to recognize and bind receptors." Q#25596 - CGI_10022951 superfamily 246918 754 805 4.72E-13 65.6859 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#25596 - CGI_10022951 superfamily 246918 634 681 2.58E-12 63.7599 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#25596 - CGI_10022951 superfamily 246918 566 610 1.17E-11 61.8339 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#25596 - CGI_10022951 superfamily 246918 809 863 5.81E-09 54.1299 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#25596 - CGI_10022951 superfamily 243104 452 500 2.28E-06 46.3924 cl02601 PSI superfamily - - "Plexin repeat; A cysteine rich repeat found in several different extracellular receptors. The function of the repeat is unknown. Three copies of the repeat are found in Plexin. Two copies of the repeat are found in mahogany protein. A related C. elegans protein contains four copies of the repeat. The Met receptor contains a single copy of the repeat. The Pfam alignment shows 6 conserved cysteine residues that may form three conserved disulphide bridges, whereas shows 8 conserved cysteines. 
The pattern of conservation suggests that cysteines 5 and 7 (that are not absolutely conserved) form a disulphide bridge (Personal observation. A Bateman)." Q#25596 - CGI_10022951 superfamily 246918 510 561 5.02E-05 42.1887 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#25596 - CGI_10022951 superfamily 246918 869 909 0.00283037 37.1811 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#25597 - CGI_10022952 superfamily 247725 92 235 3.85E-90 272.61 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#25597 - CGI_10022952 superfamily 243100 241 270 0.000192122 39.4696 cl02576 B_zip1 superfamily N - "basic leucine zipper DNA-binding and multimerization region of GCN4 and related proteins; Basic leucine zipper (bZIP) transcription factors act in networks of homo- and hetero-dimers in the regulation in a diverse set of cellular pathways. Classical leucine zippers have alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization creates a pair of basic regions that bind DNA and undergo conformational change. GCN4 was identified in Saccharomyces cerevisiae from mutations in a deficiency in activation with the general amino acid control pathway. 
GCN4 encodes a trans-activator of amino acid biosynthetic genes containing 2 acidic activation domains and a C-terminal bZIP domain, comprised of a basic alpha-helical DNA-binding region and a coiled-coil dimerization region." Q#25600 - CGI_10022955 superfamily 241584 779 888 1.46E-14 71.3735 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#25600 - CGI_10022955 superfamily 241584 1010 1098 3.85E-08 52.4987 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#25600 - CGI_10022955 superfamily 241584 909 1000 0.00242382 37.8611 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. 
Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#25600 - CGI_10022955 superfamily 243035 27 156 3.18E-12 65.4037 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. 
C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#25600 - CGI_10022955 superfamily 245814 599 670 4.75E-11 60.5358 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#25600 - CGI_10022955 superfamily 245814 696 779 7.85E-11 60.397 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. 
The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#25600 - CGI_10022955 superfamily 245814 187 281 7.97E-11 60.3567 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#25600 - CGI_10022955 superfamily 245814 405 469 1.76E-10 59.413 cl11960 Ig superfamily C - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#25600 - CGI_10022955 superfamily 245814 302 361 9.99E-09 54.0984 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#25600 - CGI_10022955 superfamily 245814 497 577 1.51E-08 53.6633 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#25601 - CGI_10022956 superfamily 222150 583 610 0.000206749 40.8381 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#25601 - CGI_10022956 superfamily 222150 553 580 0.000250026 40.8381 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. 
Q#25604 - CGI_10022959 superfamily 245847 195 337 6.23E-17 76.0561 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#25604 - CGI_10022959 superfamily 241619 101 147 0.000216778 38.7173 cl00112 PAN_APPLE superfamily C - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#25605 - CGI_10022960 superfamily 245226 211 378 4.81E-20 85.8152 cl10012 DnaQ_like_exo superfamily - - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. 
DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#25606 - CGI_10022961 superfamily 247805 185 399 1.58E-78 249.709 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#25606 - CGI_10022961 superfamily 247905 409 537 6.58E-35 128.51 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#25608 - CGI_10022963 superfamily 216686 84 266 3.11E-43 149.01 cl18377 Galactosyl_T superfamily - - "Galactosyltransferase; This family includes the galactosyltransferases UDP-galactose:2-acetamido-2-deoxy-D-glucose3beta-galactosyltransferase and UDP-Gal:beta-GlcNAc beta 1,3-galactosyltransferase. Specific galactosyltransferases transfer galactose to GlcNAc terminal chains in the synthesis of the lacto-series oligosaccharides types 1 and 2." 
Q#25611 - CGI_10004058 superfamily 247856 16 69 5.38E-10 51.7797 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#25612 - CGI_10004059 superfamily 247723 113 186 2.81E-26 97.8049 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#25613 - CGI_10004060 superfamily 241832 388 484 1.75E-40 145.153 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. 
They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#25613 - CGI_10004060 superfamily 241832 141 244 1.48E-39 142.457 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#25613 - CGI_10004060 superfamily 241832 515 616 1.13E-35 131.671 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. 
Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#25613 - CGI_10004060 superfamily 241832 265 366 3.34E-35 130.13 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." 
Q#25613 - CGI_10004060 superfamily 241832 23 128 1.10E-31 120.656 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#25613 - CGI_10004060 superfamily 241832 639 731 6.36E-24 98.1588 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). 
Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#25616 - CGI_10001334 superfamily 219917 38 293 9.18E-42 147.908 cl07264 Ribonuc_P_40 superfamily - - "Ribonuclease P 40kDa (Rpp40) subunit; The tRNA processing enzyme ribonuclease P (RNase P) consists of an RNA molecule and at least eight protein subunits. Subunits hpop1, Rpp21, Rpp29, Rpp30, Rpp38, and Rpp40 (this entry) are involved in extensive, but weak, protein-protein interactions in the holoenzyme complex." Q#25619 - CGI_10001617 superfamily 241563 63 94 0.000255408 36.4964 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#25620 - CGI_10001711 superfamily 246680 169 247 1.18E-12 61.1974 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." 
Q#25620 - CGI_10001711 superfamily 245874 28 119 0.00135184 35.865 cl12111 TNFR superfamily C - "Tumor necrosis factor receptor (TNFR) domain; superfamily of TNF-like receptor domains. When bound to TNF-like cytokines, TNFRs trigger multiple signal transduction pathways, they are involved in inflammation response, apoptosis, autoimmunity and organogenesis. TNFRs domains are elongated with generally three tandem repeats of cysteine-rich domains (CRDs). They fit in the grooves between protomers within the ligand trimer. Some TNFRs, such as NGFR and HveA, bind ligands with no structural similarity to TNF and do not bind ligand trimers." Q#25621 - CGI_10001712 superfamily 246680 425 499 3.86E-12 62.353 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#25622 - CGI_10001852 superfamily 240441 1153 1242 2.90E-20 87.6203 cl18913 Na_channel_gate superfamily - - Inactivation gate of the voltage-gated sodium channel alpha subunits; This region is part of the intracellular linker between domains III and IV of the alpha subunits of voltage-gated sodium channels. It is responsible for fast inactivation of the channel and essential for proper physiological function. 
Q#25622 - CGI_10001852 superfamily 219069 703 903 2.63E-14 73.5654 cl05828 Na_trans_assoc superfamily - - "Sodium ion transport-associated; Members of this family contain a region found exclusively in eukaryotic sodium channels or their subunits, many of which are voltage-gated. Members very often also contain between one and four copies of pfam00520 and, less often, one copy of pfam00612." Q#25624 - CGI_10001854 superfamily 248012 108 196 2.65E-10 56.4321 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#25627 - CGI_10001881 superfamily 247805 43 149 5.98E-09 50.0284 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#25636 - CGI_10008799 superfamily 247866 16 211 2.00E-12 64.0108 cl17312 PhyH superfamily - - "Phytanoyl-CoA dioxygenase (PhyH); This family is made up of several eukaryotic phytanoyl-CoA dioxygenase (PhyH) proteins, ectoine hydroxylases and a number of bacterial deoxygenases. PhyH is a peroxisomal enzyme catalyzing the first step of phytanic acid alpha-oxidation. PhyH deficiency causes Refsum's disease (RD) which is an inherited neurological syndrome biochemically characterized by the accumulation of phytanic acid in plasma and tissues." Q#25637 - CGI_10008800 superfamily 220634 24 580 7.63E-118 385.368 cl12379 DUF2146 superfamily C - Uncharacterized conserved protein (DUF2146); This is a family of proteins conserved from plants to humans. In Dictyostelium it is annotated as Mss11p but this could not be confirmed. Mss11p is required for the activation of pseudo-hyphal and invasive growth by Ste12p in yeast. Q#25637 - CGI_10008800 superfamily 220634 707 997 2.27E-62 227.436 cl12379 DUF2146 superfamily N - Uncharacterized conserved protein (DUF2146); This is a family of proteins conserved from plants to humans. 
In Dictyostelium it is annotated as Mss11p but this could not be confirmed. Mss11p is required for the activation of pseudo-hyphal and invasive growth by Ste12p in yeast. Q#25638 - CGI_10008801 superfamily 245819 493 669 3.11E-33 126.54 cl11967 Nucleotidyl_cyc_III superfamily - - "Class III nucleotidyl cyclases; Class III nucleotidyl cyclases are the largest, most diverse group of nucleotidyl cyclases (NC's) containing prokaryotic and eukaryotic proteins. They can be divided into two major groups; the mononucleotidyl cyclases (MNC's) and the diguanylate cyclases (DGC's). The MNC's, which include the adenylate cyclases (AC's) and the guanylate cyclases (GC's), have a conserved cyclase homology domain (CHD), while the DGC's have a conserved GGDEF domain, named after a conserved motif within this subgroup. Their products, cyclic guanylyl and adenylyl nucleotides, are second messengers that play important roles in eukaryotic signal transduction and prokaryotic sensory pathways." Q#25638 - CGI_10008801 superfamily 219812 154 401 1.81E-15 75.8056 cl07121 NIT superfamily - - "Nitrate and nitrite sensing; The nitrate- and nitrite sensing domain (NIT) is found in receptor components of signal transducing pathways in bacteria which control gene expression, cellular motility and enzyme activity in response to nitrate and nitrite concentrations. The NIT domain is predicted to be all alpha-helical in structure." Q#25638 - CGI_10008801 superfamily 219526 453 479 0.000173052 42.2211 cl06648 HNOBA superfamily N - "Heme NO binding associated; The HNOBA domain is found associated with the HNOB domain and pfam00211 in soluble cyclases and signalling proteins. The HNOB domain is predicted to function as a heme-dependent sensor for gaseous ligands, and transduce diverse downstream signals, in both bacteria and animals." 
Q#25639 - CGI_10008802 superfamily 247792 17 76 0.00157565 36.9872 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#25640 - CGI_10008803 superfamily 217390 170 318 8.55E-18 78.7557 cl18407 TPT superfamily - - Triose-phosphate Transporter family; This family includes transporters with a specificity for triose phosphate. Q#25641 - CGI_10001928 superfamily 221349 128 192 4.92E-08 48.504 cl13416 Git3_C superfamily - - "G protein-coupled glucose receptor regulating Gpa2 C-term; Git3 is one of six proteins required for glucose-triggered adenylate cyclase activation, and is a G protein-coupled receptor responsible for the activation of adenylate cyclase through Gpa2 - heterotrimeric G protein alpha subunit, part of the glucose-detection pathway. Git3 contains seven predicted transmembrane domains, a third cytoplasmic loop and a cytoplasmic tail. This family is the conserved C-terminal domain of the member proteins." Q#25642 - CGI_10001929 superfamily 217617 81 274 3.38E-30 116.362 cl15988 Sulfotransfer_2 superfamily - - "Sulfotransferase family; This family includes a variety of sulfotransferase enzymes. Chondroitin 6-sulfotransferase catalyzes the transfer of sulfate to position 6 of the N-acetylgalactosamine residue of chondroitin. This family also includes Heparan sulfate 2-O-sulfotransferase (HS2ST) and Heparan sulfate 6-sulfotransferase (HS6ST). 
Heparan sulfate (HS) is a co-receptor for a number of growth factors, morphogens, and adhesion proteins. HS biosynthetic modifications may determine the strength and outcome of HS-ligand interactions. Mice that lack HS2ST undergo developmental failure only after midgestation, the most dramatic effect being the complete failure of kidney development. Heparan sulphate 6-O-sulfotransferase (HS6ST) catalyzes the transfer of sulphate from adenosine 3'-phosphate, 5'-phosphosulphate to the 6th position of the N-sulphoglucosamine residue in heparan sulphate." Q#25643 - CGI_10001930 superfamily 247723 1 80 9.93E-31 115.842 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#25646 - CGI_10024930 superfamily 243096 796 984 6.40E-36 135.888 cl02571 RhoGEF superfamily - - Guanine nucleotide exchange factor for Rho/Rac/Cdc42-like GTPases; Also called Dbl-homologous (DH) domain. It appears that PH domains invariably occur C-terminal to RhoGEF/DH domains. 
Q#25647 - CGI_10024931 superfamily 243069 27 199 6.17E-103 297.971 cl02525 Band_7 superfamily - - "The band 7 domain of flotillin (reggie) like proteins. This group contains proteins similar to stomatin, prohibitin, flotillin, HlfK/C and podicin. Many of these band 7 domain-containing proteins are lipid raft-associated. Individual proteins of this band 7 domain family may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. Microdomains formed from flotillin proteins may in addition be dynamic units with their own regulatory functions. Flotillins have been implicated in signal transduction, vesicle trafficking, cytoskeleton rearrangement and are known to interact with a variety of proteins. Stomatin interacts with and regulates members of the degenerin/epithelia Na+ channel family in mechanosensory cells of Caenorhabditis elegans and vertebrate neurons and participates in trafficking of Glut1 glucose transporters. Prohibitin may act as a chaperone for the stabilization of mitochondrial proteins. Prokaryotic HflK/C plays a role in the decision between lysogenic and lytic cycle growth during lambda phage infection. Flotillins have been implicated in the progression of prion disease, in the pathogenesis of neurodegenerative diseases such as Parkinson's and Alzheimer's disease and, in cancer invasion and metastasis. Mutations in the podicin gene give rise to autosomal recessive steroid resistant nephritic syndrome" Q#25648 - CGI_10024932 superfamily 243035 47 163 4.06E-24 91.9125 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. 
Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. 
A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#25649 - CGI_10024933 superfamily 243035 157 272 8.02E-24 93.4533 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. 
C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#25649 - CGI_10024933 superfamily 243035 41 123 1.26E-14 67.6449 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#25650 - CGI_10024934 superfamily 243035 6 50 0.0060132 31.4362 cl02432 CLECT superfamily NC - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#25652 - CGI_10024936 superfamily 245226 216 383 4.95E-20 86.9708 cl10012 DnaQ_like_exo superfamily - - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. 
Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#25653 - CGI_10024937 superfamily 247684 1 272 3.94E-43 155.527 cl17037 NBD_sugar-kinase_HSP70_actin superfamily N - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#25658 - CGI_10024943 superfamily 245612 333 501 6.64E-75 245.797 cl11426 Amidase superfamily C - Amidase; Amidase. 
Q#25662 - CGI_10024948 superfamily 193253 9 168 5.36E-19 82.7773 cl15084 MT superfamily N - "Microtubule-binding stalk of dynein motor; the 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This family is the region between D4 and D5 and is the two predicted alpha-helical coiled coil segments that form the stalk supporting the ATP-sensitive microtubule binding component." Q#25663 - CGI_10024949 superfamily 245226 18 48 0.00356857 32.2122 cl10012 DnaQ_like_exo superfamily NC - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. 
Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#25665 - CGI_10024951 superfamily 217473 78 333 8.26E-25 104.369 cl03978 Mab-21 superfamily - - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#25666 - CGI_10024952 superfamily 245847 149 272 0.00592723 34.7856 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#25668 - CGI_10024955 superfamily 241867 47 240 6.24E-55 180.104 cl00446 Lactamase_B superfamily C - Metallo-beta-lactamase superfamily; Metallo-beta-lactamase superfamily. Q#25670 - CGI_10024957 superfamily 248458 31 204 3.11E-05 44.6121 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. 
Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#25670 - CGI_10024957 superfamily 248458 282 408 0.000706861 40.3749 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. 
Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#25673 - CGI_10024961 superfamily 247724 72 312 2.14E-52 174.879 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. 
Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#25675 - CGI_10024963 superfamily 217836 12 93 0.00496026 34.1786 cl09556 Sas10_Utp3 superfamily - - "Sas10/Utp3/C1D family; This family contains Utp3 and LCP5 which are components of the U3 ribonucleoprotein complex. It also includes the human C1D protein and Saccharomyces cerevisiae YHR081W (rrp47), an exosome-associated protein required for the 3' processing of stable RNAs, and Sas10 which has been identified as a regulator of chromatin silencing. This family also includes the human protein Neuroguidin an initiation factor 4E (eIF4E) binding protein." Q#25677 - CGI_10024965 superfamily 201479 2 105 1.05E-29 114.263 cl02994 Transglut_N superfamily - - Transglutaminase family; Transglutaminase family. Q#25677 - CGI_10024965 superfamily 247916 254 344 6.53E-16 73.5711 cl17362 Transglut_core superfamily - - "Transglutaminase-like superfamily; This family includes animal transglutaminases and other bacterial proteins of unknown function. Sequence conservation in this superfamily primarily involves three motifs that centre around conserved cysteine, histidine, and aspartate residues that form the catalytic triad in the structurally characterized transglutaminase, the human blood clotting factor XIIIa'. On the basis of the experimentally demonstrated activity of the Methanobacterium phage pseudomurein endoisopeptidase, it is proposed that many, if not all, microbial homologues of the transglutaminases are proteases and that the eukaryotic transglutaminases have evolved from an ancestral protease." Q#25677 - CGI_10024965 superfamily 216198 574 659 1.00E-12 65.4128 cl08295 Transglut_C superfamily - - "Transglutaminase family, C-terminal ig like domain; Transglutaminase family, C-terminal ig like domain. 
" Q#25681 - CGI_10024969 superfamily 247792 20 61 2.74E-07 45.1292 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#25682 - CGI_10024970 superfamily 245531 411 484 1.30E-15 72.7817 cl11158 BEN superfamily - - "BEN domain; The BEN domain is found in diverse animal proteins such as BANP/SMAR1, NAC1 and the Drosophila mod(mdg4) isoform C, in the chordopoxvirus virosomal protein E5R and in several proteins of polydnaviruses. Computational analysis suggests that the BEN domain mediates protein-DNA and protein-protein interactions during chromatin organisation and transcription." Q#25683 - CGI_10024971 superfamily 214773 222 425 3.42E-54 184.933 cl18315 CAP10 superfamily - - Putative lipopolysaccharide-modifying enzyme; Putative lipopolysaccharide-modifying enzyme. Q#25683 - CGI_10024971 superfamily 216033 23 119 2.62E-13 66.2032 cl16959 Filamin superfamily - - Filamin/ABP280 repeat; Filamin/ABP280 repeat. Q#25684 - CGI_10024972 superfamily 206083 1209 1233 0.00578003 36.432 cl16471 zf-C2H2_6 superfamily - - C2H2-type zinc finger; C2H2-type zinc finger. Q#25685 - CGI_10024973 superfamily 241862 51 296 4.15E-05 43.1438 cl00437 COG0428 superfamily - - Predicted divalent heavy-metal cations transporter [Inorganic ion transport and metabolism] Q#25686 - CGI_10024974 superfamily 248469 295 414 1.10E-09 56.6095 cl17915 HAD_like superfamily - - "Haloacid dehalogenase-like hydrolases. 
The haloacid dehalogenase-like (HAD) superfamily includes L-2-haloacid dehalogenase, epoxide hydrolase, phosphoserine phosphatase, phosphomannomutase, phosphoglycolate phosphatase, P-type ATPase, and many others, all of which use a nucleophilic aspartate in their phosphoryl transfer reaction. All members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. Members of this superfamily are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases." Q#25686 - CGI_10024974 superfamily 241581 6 91 0.00275383 36.5954 cl00062 FHA superfamily - - "Forkhead associated domain (FHA); found in eukaryotic and prokaryotic proteins. Putative nuclear signalling domain. FHA domains may bind phosphothreonine, phosphoserine and sometimes phosphotyrosine. In eukaryotes, many FHA domain-containing proteins localize to the nucleus, where they participate in establishing or maintaining cell cycle checkpoints, DNA repair, or transcriptional regulation. Members of the FHA family include: Dun1, Rad53, Cds1, Mek1, KAPP(kinase-associated protein phosphatase),and Ki-67 (a human nuclear protein related to cell proliferation)." Q#25687 - CGI_10024975 superfamily 241677 2 119 1.82E-71 213.273 cl00197 cyclophilin superfamily N - "cyclophilin: cyclophilin-type peptidylprolyl cis- trans isomerases. This family contains eukaryotic, bacterial and archeal proteins which exhibit a peptidylprolyl cis- trans isomerases activity (PPIase, Rotamase) and in addition bind the immunosuppressive drug cyclosporin (CsA). Immunosuppression in vertebrates is believed to be the result of the cyclophilin A-cyclosporin protein drug complex binding to and inhibiting the protein-phosphatase calcineurin. PPIase is an enzyme which accelerates protein folding by catalyzing the cis-trans isomerization of the peptide bonds preceding proline residues. 
Cyclophilins are a diverse family in terms of function and have been implicated in protein folding processes which depend on catalytic /chaperone-like activities. This group contains human cyclophilin 40, a co-chaperone of the hsp90 chaperone system; human cyclophilin A, a chaperone in the HIV-1 infectious process and; human cyclophilin H, a component of the U4/U6 snRNP, whose isomerization or chaperoning activities may play a role in RNA splicing." Q#25688 - CGI_10024976 superfamily 241591 25 100 2.04E-26 97.6919 cl00073 H15 superfamily - - "linker histone 1 and histone 5 domains; the basic subunit of chromatin is the nucleosome, consisting of an octamer of core histones, two full turns of DNA, a linker histone (H1 or H5) and a variable length of linker DNA; H1/H5 are chromatin-associated proteins that bind to the exterior of nucleosomes and dramatically stabilize the highly condensed states of chromatin fibers; stabilization of higher order folding occurs through electrostatic neutralization of the linker DNA segments, through a highly positively charged carboxy- terminal domain known as the AKP helix (Ala, Lys, Pro); thought to be involved in specific protein-protein and protein-DNA interactions and play a role in suppressing core histone tail domain acetylation in the chromatin fiber" Q#25689 - CGI_10000359 superfamily 243035 58 156 5.05E-24 91.5273 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. 
Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. 
Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#25690 - CGI_10002240 superfamily 216686 87 273 7.61E-40 147.855 cl18377 Galactosyl_T superfamily - - "Galactosyltransferase; This family includes the galactosyltransferases UDP-galactose:2-acetamido-2-deoxy-D-glucose3beta-galactosyltransferase and UDP-Gal:beta-GlcNAc beta 1,3-galactosyltranferase. Specific galactosyltransferases transfer galactose to GlcNAc terminal chains in the synthesis of the lacto-series oligosaccharides types 1 and 2." Q#25690 - CGI_10002240 superfamily 245814 706 777 2.21E-13 67.6368 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#25691 - CGI_10002241 superfamily 247856 158 208 6.42E-06 41.7645 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#25691 - CGI_10002241 superfamily 242164 100 152 0.0096658 33.3122 cl00878 Ribosomal_S24e superfamily C - Ribosomal protein S24e; Ribosomal protein S24e. 
Q#25692 - CGI_10002242 superfamily 243074 97 141 9.50E-18 78.7025 cl02535 F-box-like superfamily - - F-box-like; This is an F-box-like family. Q#25693 - CGI_10002243 superfamily 243092 21 137 0.000465467 38.0848 cl02567 WD40 superfamily C - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#25694 - CGI_10002101 superfamily 248097 2 105 1.31E-18 75.7646 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#25695 - CGI_10002102 superfamily 248097 82 188 1.70E-19 80.387 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#25696 - CGI_10002103 superfamily 243072 30 118 2.34E-08 52.771 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. 
The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#25696 - CGI_10002103 superfamily 149414 196 258 1.36E-17 78.4674 cl07091 TRP_2 superfamily - - Transient receptor ion channel II; This domain is found in the transient receptor ion channel (Trp) family of proteins. There is strong evidence that Trp proteins are structural elements of calcium-ion entry channels activated by G protein-coupled receptors. This domain does not tend to appear with the TRP domain (pfam06011) but is often found to the C-terminus of Ankyrin repeats (pfam00023). Q#25697 - CGI_10003039 superfamily 222429 8 85 1.21E-19 82.6736 cl18676 Myb_DNA-bind_5 superfamily - - Myb/SANT-like DNA-binding domain; This presumed domain appears to be related to other Myb/SANT like DNA binding domains. This family is greatly expanded in arthropods and higher eukaryotes. Q#25699 - CGI_10003041 superfamily 222429 6 71 7.09E-10 54.1688 cl18676 Myb_DNA-bind_5 superfamily - - Myb/SANT-like DNA-binding domain; This presumed domain appears to be related to other Myb/SANT like DNA binding domains. This family is greatly expanded in arthropods and higher eukaryotes. Q#25700 - CGI_10003042 superfamily 222150 596 621 1.76E-05 42.7641 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#25703 - CGI_10007224 superfamily 247725 732 911 4.84E-74 244.555 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. 
They are generally involved in targeting the protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and other proteins." Q#25703 - CGI_10007224 superfamily 243096 592 724 1.95E-21 93.9016 cl02571 RhoGEF superfamily N - Guanine nucleotide exchange factor for Rho/Rac/Cdc42-like GTPases; Also called Dbl-homologous (DH) domain. It appears that PH domains invariably occur C-terminal to RhoGEF/DH domains. Q#25703 - CGI_10007224 superfamily 243095 1091 1266 2.85E-48 172.42 cl02570 RhoGAP superfamily - - "RhoGAP: GTPase-activator protein (GAP) for Rho-like GTPases; GAPs towards Rho/Rac/Cdc42-like small GTPases. Small GTPases (G proteins) cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when bound to GDP. The Rho family of small G proteins, which includes Cdc42Hs, activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. G proteins generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. The RhoGAPs are one of the major classes of regulators of Rho G proteins." Q#25703 - CGI_10007224 superfamily 246669 957 1065 1.97E-32 124.157 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. 
Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#25704 - CGI_10007225 superfamily 215648 553 726 3.89E-19 87.6511 cl02802 7tm_3 superfamily - - "7 transmembrane sweet-taste receptor of 3 GCPR; This is a domain of seven transmembrane regions that forms the C-terminus of some subclass 3 G-coupled-protein receptors. It is often associated with a downstream cysteine-rich linker domain, NCD3G pfam07562, which is the human sweet-taste receptor, and the N-terminal domain, ANF_receptor pfam01094. The seven TM regions assemble in such a way as to produce a docking pocket into which such molecules as cyclamate and lactisole have been found to bind and consequently confer the taste of sweetness." Q#25704 - CGI_10007225 superfamily 245225 51 345 1.27E-14 76.1275 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily - - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein couples receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine- binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). 
In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#25705 - CGI_10007226 superfamily 241900 46 334 6.09E-89 275.787 cl00490 EEP superfamily - - "Exonuclease-Endonuclease-Phosphatase (EEP) domain superfamily; This large superfamily includes the catalytic domain (exonuclease/endonuclease/phosphatase or EEP domain) of a diverse set of proteins including the ExoIII family of apurinic/apyrimidinic (AP) endonucleases, inositol polyphosphate 5-phosphatases (INPP5), neutral sphingomyelinases (nSMases), deadenylases (such as the vertebrate circadian-clock regulated nocturnin), bacterial cytolethal distending toxin B (CdtB), deoxyribonuclease 1 (DNase1), the endonuclease domain of the non-LTR retrotransposon LINE-1, and related domains. 
These diverse enzymes share a common catalytic mechanism of cleaving phosphodiester bonds; their substrates range from nucleic acids to phospholipids and perhaps proteins." Q#25707 - CGI_10007228 superfamily 241618 45 79 2.12E-07 43.0267 cl00111 PAH superfamily - - "Pancreatic Hormone domain, a regulator of pancreatic and gastrointestinal functions; neuropeptide Y (NPY)b, peptide YY (PYY), and pancreatic polypetide (PP) are closely related; propeptide is enzymatically cleaved to yield the mature active peptide with amidated C-terminal ends; receptor binding and activation functions may reside in the N- and C-termini respectively; occurs in neurons, intestinal endocrine cells, and pancreas; exist as monomers and dimers" Q#25708 - CGI_10007229 superfamily 246748 351 566 3.14E-25 106.138 cl14876 Zinc_peptidase_like superfamily - - "Zinc peptidases M18, M20, M28, and M42; Zinc peptidases play vital roles in metabolic and signaling pathways throughout all kingdoms of life. This family corresponds to several clans in the MEROPS database, including the MH clan, which contains 4 families (M18, M20, M28, M42). The peptidase M20 family includes carboxypeptidases such as the glutamate carboxypeptidase from Pseudomonas, the thermostable carboxypeptidase Ss1 of broad specificity from archaea and yeast Gly-X carboxypeptidase. The dipeptidases include bacterial dipeptidase, peptidase V (PepV), a eukaryotic, non-specific dipeptidase, and two Xaa-His dipeptidases (carnosinases). There is also the bacterial aminopeptidase, peptidase T (PepT) that acts only on tripeptide substrates and has therefore been termed a tripeptidase. Peptidase family M28 contains aminopeptidases and carboxypeptidases, and has co-catalytic zinc ions. However, several enzymes in this family utilize other first row transition metal ions such as cobalt and manganese. Each zinc ion is tetrahedrally co-ordinated, with three amino acid ligands plus activated water; one aspartate residue binds both metal ions. 
The aminopeptidases in this family are also called bacterial leucyl aminopeptidases, but are able to release a variety of N-terminal amino acids. IAP aminopeptidase and aminopeptidase Y preferentially release basic amino acids while glutamate carboxypeptidase II preferentially releases C-terminal glutamates. Glutamate carboxypeptidase II and plasma glutamate carboxypeptidase hydrolyze dipeptides. Peptidase families M18 and M42 contain metalloaminopeptidases. M18 is widely distributed in bacteria and eukaryotes. However, only yeast aminopeptidase I and mammalian aspartyl aminopeptidase have been characterized in detail. Some of M42 (also known as glutamyl aminopeptidase) enzymes exhibit aminopeptidase specificity while others also have acylaminoacylpeptidase activity (i.e. hydrolysis of acylated N-terminal residues)." Q#25708 - CGI_10007229 superfamily 244870 206 270 9.59E-10 57.4098 cl08238 PA superfamily C - "PA: Protease-associated (PA) domain. The PA domain is an insert domain in a diverse fraction of proteases. The significance of the PA domain to many of the proteins in which it is inserted is undetermined. It may be a protein-protein interaction domain. At peptidase active sites, the PA domain may participate in substrate binding and/or promoting conformational changes, which influence the stability and accessibility of the site to substrate. 
Proteins into which the PA domain is inserted include the following: i) various signal peptide peptidases including, hSPPL2a and 2b which catalyze the intramembrane proteolysis of tumor necrosis factor alpha, ii) various proteins containing a C3H2C3 RING finger including, Arabidopsis ReMembR-H2 protein and various E3 ubiquitin ligases such as human GRAIL (gene related to anergy in lymphocytes), iii) EDEM3 (ER-degradation-enhancing mannosidase-like 3 protein), iv) various plant vacuolar sorting receptors such as Pisum sativum BP-80, v) glutamate carboxypeptidase II (GCPII), vi) yeast aminopeptidase Y, vii) Vibrio metschnikovii VapT, a sodium dodecyl sulfate (SDS) resistant extracellular alkaline serine protease, viii) lactocepin (a cell envelope-associated protease from Lactobacillus paracasei subsp. paracasei NCDO 151), ix) various subtilisin-like proteases such as melon Cucumisin, and x) human TfR (transferrin receptor) 1 and 2." Q#25709 - CGI_10007230 superfamily 214545 791 931 8.28E-50 174.045 cl10551 CULLIN superfamily - - Cullin; Cullin. Q#25709 - CGI_10007230 superfamily 245539 1038 1103 9.07E-27 105.714 cl11186 Cullin_Nedd8 superfamily - - "Cullin protein neddylation domain; This is the neddylation site of cullin proteins which are a family of structurally related proteins containing an evolutionarily conserved cullin domain. With the exception of APC2, each member of the cullin family is modified by Nedd8 and several cullins function in Ubiquitin-dependent proteolysis, a process in which the 26S proteasome recognises and subsequently degrades a target protein tagged with K48-linked poly-ubiquitin chains. Cullins are molecular scaffolds responsible for assembling the ROC1/Rbx1 RING-based E3 ubiquitin ligases, of which several play a direct role in tumorigenesis. 
Nedd8/Rub1 is a small ubiquitin-like protein, which was originally found to be conjugated to Cdc53, a cullin component of the SCF (Skp1-Cdc53/CUL1-F-box protein) E3 Ub ligase complex in Saccharomyces cerevisiae, and Nedd8 modification has now emerged as a regulatory pathway of fundamental importance for cell cycle control and for embryogenesis in metazoans. The only identified Nedd8 substrates are cullins. Neddylation results in covalent conjugation of a Nedd8 moiety onto a conserved cullin lysine residue." Q#25710 - CGI_10007231 superfamily 247727 42 152 6.99E-18 76.6998 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#25714 - CGI_10015997 superfamily 247923 12 153 6.15E-25 95.4754 cl17369 Glyco_tran_28_C superfamily - - Glycosyltransferase family 28 C-terminal domain; The glycosyltransferase family 28 includes monogalactosyldiacylglycerol synthase (EC 2.4.1.46) and UDP-N-acetylglucosamine transferase (EC 2.4.1.-). Structural analysis suggests the C-terminal domain contains the UDP-GlcNAc binding site. Q#25716 - CGI_10015999 superfamily 241563 68 109 3.84E-05 41.696 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#25717 - CGI_10016000 superfamily 245864 43 398 4.88E-44 160.136 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#25718 - CGI_10016001 superfamily 245864 120 354 4.16E-50 175.544 cl12078 p450 superfamily N - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. 
These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#25721 - CGI_10016005 superfamily 248097 2 77 2.77E-12 59.201 cl17543 C1q superfamily C - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#25722 - CGI_10016006 superfamily 246918 150 209 4.31E-09 50.6631 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#25722 - CGI_10016006 superfamily 246918 20 69 1.46E-06 43.7295 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#25722 - CGI_10016006 superfamily 246918 85 144 0.00370273 34.0995 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#25724 - CGI_10016008 superfamily 243091 317 430 1.08E-07 50.4107 cl02566 SET superfamily - - "SET domain; SET domains are protein lysine methyltransferase enzymes. SET domains appear to be protein-protein interaction domains. It has been demonstrated that SET domains mediate interactions with a family of proteins that display similarity with dual-specificity phosphatases (dsPTPases). A subset of SET domains have been called PR domains. These domains are divergent in sequence from other SET domains, but also appear to mediate protein-protein interaction. The SET domain consists of two regions known as SET-N and SET-C. SET-C forms an unusual and conserved knot-like structure of probably functional importance. Additionally to SET-N and SET-C, an insert region (SET-I) and flanking regions of high structural variability form part of the overall structure." 
Q#25724 - CGI_10016008 superfamily 243091 153 205 1.80E-05 43.6348 cl02566 SET superfamily N - "SET domain; SET domains are protein lysine methyltransferase enzymes. SET domains appear to be protein-protein interaction domains. It has been demonstrated that SET domains mediate interactions with a family of proteins that display similarity with dual-specificity phosphatases (dsPTPases). A subset of SET domains have been called PR domains. These domains are divergent in sequence from other SET domains, but also appear to mediate protein-protein interaction. The SET domain consists of two regions known as SET-N and SET-C. SET-C forms an unusual and conserved knot-like structure of probably functional importance. Additionally to SET-N and SET-C, an insert region (SET-I) and flanking regions of high structural variability form part of the overall structure." Q#25724 - CGI_10016008 superfamily 243091 484 599 0.000137295 40.7807 cl02566 SET superfamily - - "SET domain; SET domains are protein lysine methyltransferase enzymes. SET domains appear to be protein-protein interaction domains. It has been demonstrated that SET domains mediate interactions with a family of proteins that display similarity with dual-specificity phosphatases (dsPTPases). A subset of SET domains have been called PR domains. These domains are divergent in sequence from other SET domains, but also appear to mediate protein-protein interaction. The SET domain consists of two regions known as SET-N and SET-C. SET-C forms an unusual and conserved knot-like structure of probably functional importance. Additionally to SET-N and SET-C, an insert region (SET-I) and flanking regions of high structural variability form part of the overall structure." Q#25725 - CGI_10016009 superfamily 243072 478 617 1.33E-14 71.6458 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. 
The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#25725 - CGI_10016009 superfamily 243072 369 520 4.07E-13 67.4086 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#25725 - CGI_10016009 superfamily 243073 736 774 1.01E-07 49.7761 cl02533 SOCS superfamily - - "SOCS (suppressors of cytokine signaling) box. The SOCS box is found in the C-terminal region of CIS/SOCS family proteins (in combination with a SH2 domain), ASBs (ankyrin repeat-containing proteins with a SOCS box), SSBs (SPRY domain-containing proteins with a SOCS box), and WSBs (WD40 repeat-containing proteins with a SOCS box), as well as, other miscellaneous proteins. The function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions." Q#25725 - CGI_10016009 superfamily 243091 132 243 1.85E-10 59.2703 cl02566 SET superfamily - - "SET domain; SET domains are protein lysine methyltransferase enzymes. SET domains appear to be protein-protein interaction domains. 
It has been demonstrated that SET domains mediate interactions with a family of proteins that display similarity with dual-specificity phosphatases (dsPTPases). A subset of SET domains have been called PR domains. These domains are divergent in sequence from other SET domains, but also appear to mediate protein-protein interaction. The SET domain consists of two regions known as SET-N and SET-C. SET-C forms an unusual and conserved knot-like structure of probably functional importance. Additionally to SET-N and SET-C, an insert region (SET-I) and flanking regions of high structural variability form part of the overall structure." Q#25726 - CGI_10016010 superfamily 220532 16 196 2.41E-31 119.798 cl12374 ATG13 superfamily - - "Autophagy-related protein 13; Members of this family of phosphoproteins are involved in cytoplasm to vacuole transport (Cvt), and more specifically in Cvt vesicle formation. They are probably involved in the switching machinery regulating the conversion between the Cvt pathway and autophagy. Finally, ATG13 is also required for glycogen storage." Q#25734 - CGI_10014823 superfamily 241874 31 593 0 586.068 cl00456 SLC5-6-like_sbd superfamily - - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. 
NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#25737 - CGI_10014826 superfamily 241729 320 732 0 836.573 cl00254 NOS_oxygenase superfamily - - "Nitric oxide synthase (NOS) produces nitric oxide (NO) by catalyzing a five-electron heme-based oxidation of a guanidine nitrogen of L-arginine to L-citrulline via two successive monooxygenation reactions producing N(omega)-hydroxy-L-arginine (NHA) as an intermediate. In mammals, there are three distinct NOS isozymes: neuronal (nNOS or NOS-1), cytokine-inducible (iNOS or NOS-2) and endothelial (eNOS or NOS-3). Nitric oxide synthases are homodimers. In eukaryotes, each monomer has an N-terminal oxygenase domain which binds to the substrate L-Arg, zinc, and to the cofactors heme and 5,6,7,8-(6R)-tetrahydrobiopterin (BH4). Eukaryotic NOSs also have a C-terminal electron supplying reductase region, which is homologous to cytochrome P450 reductase and binds NADH, FAD and FMN. While prokaryotes can produce NO as a byproduct of denitrification, using a completely different set of enzymes than NOS, a few prokaryotes also have a NOS which consists solely of the NOS oxygenase domain. Prokaryotic NOS binds to the substrate L-Arg, zinc, and to the cofactors heme and tetrahydrofolate." Q#25737 - CGI_10014826 superfamily 244539 1021 1430 0 648.237 cl06868 FNR_like superfamily - - "Ferredoxin reductase (FNR), an FAD and NAD(P) binding protein, was initially identified as a chloroplast reductase activity, catalyzing the electron transfer from reduced iron-sulfur protein ferredoxin to NADP+ as the final step in the electron transport mechanism of photosystem I. 
FNR transfers electrons from reduced ferredoxin to FAD (forming FADH2 via a semiquinone intermediate) and then transfers a hydride ion to convert NADP+ to NADPH. FNR has since been shown to utilize a variety of electron acceptors and donors and has a variety of physiological functions including nitrogen assimilation, dinitrogen fixation, steroid hydroxylation, fatty acid metabolism, oxygenase activity, and methane assimilation in many organisms. FNR has an NAD(P)-binding sub-domain of the alpha/beta class and a discrete (usually N-terminal) flavin sub-domain which varies in orientation with respect to the NAD(P) binding domain. The N-terminal moiety may contain a flavin prosthetic group (as in flavoenzymes) or use flavin as a substrate. Because flavins such as FAD can exist in oxidized, semiquinone (one-electron reduced), or fully reduced hydroquinone forms, FNR can interact with one- and two-electron carriers. FNR has a strong preference for NADP(H) vs NAD(H)." Q#25737 - CGI_10014826 superfamily 241622 6 89 3.63E-12 64.5102 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. 
The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#25737 - CGI_10014826 superfamily 241863 782 955 1.69E-30 119.419 cl00438 Flavodoxin_2 superfamily - - Flavodoxin-like fold; This family consists of a domain with a flavodoxin-like fold. The family includes bacterial and eukaryotic NAD(P)H dehydrogenase (quinone) EC:1.6.99.2. These enzymes catalyze the NAD(P)H-dependent two-electron reductions of quinones and protect cells against damage by free radicals and reactive oxygen species. This enzyme uses a FAD co-factor. The equation for this reaction is:- NAD(P)H + acceptor <=> NAD(P)(+) + reduced acceptor. This enzyme is also involved in the bioactivation of prodrugs used in chemotherapy. The family also includes acyl carrier protein phosphodiesterase EC:3.1.4.14. This enzyme converts holo-ACP to apo-ACP by hydrolytic cleavage of the phosphopantetheine residue from ACP. This family is related to pfam03358 and pfam00258. Q#25739 - CGI_10014828 superfamily 191444 67 131 0.000172218 36.9185 cl05558 IL17 superfamily - - Interleukin-17; IL-17 is a potent proinflammatory cytokine produced by activated memory T cells. The IL-17 family is thought to represent a distinct signaling system that appears to have been highly conserved across vertebrate evolution. Q#25741 - CGI_10014830 superfamily 243035 41 165 7.52E-32 111.943 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. 
This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. 
In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#25742 - CGI_10014831 superfamily 207794 83 128 4.91E-15 76.4822 cl02948 GH20_hexosaminidase superfamily C - "Beta-N-acetylhexosaminidases of glycosyl hydrolase family 20 (GH20) catalyze the removal of beta-1,4-linked N-acetyl-D-hexosamine residues from the non-reducing ends of N-acetyl-beta-D-hexosaminides including N-acetylglucosides and N-acetylgalactosides. These enzymes are broadly distributed in microorganisms, plants and animals, and play roles in various key physiological and pathological processes. These processes include cell structural integrity, energy storage, cellular signaling, fertilization, pathogen defense, viral penetration, the development of carcinomas, inflammatory events and lysosomal storage disorders. The GH20 enzymes include the eukaryotic beta-N-acetylhexosaminidases A and B, the bacterial chitobiases, dispersin B, and lacto-N-biosidase. The GH20 hexosaminidases are thought to act via a catalytic mechanism in which the catalytic nucleophile is not provided by the solvent or the enzyme, but by the substrate itself." Q#25742 - CGI_10014831 superfamily 111707 47 75 2.22E-08 53.5728 cl03741 Glyco_hydro_20b superfamily N - "Glycosyl hydrolase family 20, domain 2; This domain has a zincin-like fold." Q#25743 - CGI_10014832 superfamily 243077 107 148 0.000473195 38.6805 cl02542 DnaJ superfamily C - "DnaJ domain or J-domain. DnaJ/Hsp40 (heat shock protein 40) proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. 
They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperonine family. Hsp40 proteins are characterized by the presence of a J domain, which mediates the interaction with Hsp70. They may contain other domains as well, and the architectures provide a means of classification." Q#25743 - CGI_10014832 superfamily 247999 543 579 0.000204857 39.8878 cl17445 PHD superfamily C - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#25744 - CGI_10014833 superfamily 241733 6 73 3.99E-43 136.112 cl00259 Sm_like superfamily - - "Sm and related proteins; The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes." Q#25745 - CGI_10014834 superfamily 240523 1 165 1.86E-55 176.659 cl18941 DAXX_histone_binding superfamily - - "Histone binding domain of the death-domain associated protein (DAXX); DAXX is a nuclear protein that modulates transcription of various genes and is involved in cell death and/or the suppression of growth. DAXX is also a histone chaperone conserved in Metazoa that acts specifically on histone H3.3. This alignment models a functional domain of DAXX that interacts with the histone H3.3-H4 dimer, and in doing so competes with DNA binding and interactions between the histone chaperone ASF1/CIA and the H3-H4 dimer." 
Q#25746 - CGI_10004608 superfamily 243062 187 298 5.21E-14 66.1453 cl02510 TGF_beta superfamily - - Transforming growth factor beta like domain; Transforming growth factor beta like domain. Q#25747 - CGI_10004609 superfamily 243062 219 327 1.87E-13 64.9897 cl02510 TGF_beta superfamily - - Transforming growth factor beta like domain; Transforming growth factor beta like domain. Q#25748 - CGI_10004610 superfamily 152135 1437 1513 3.00E-18 82.3635 cl18049 Mif2 superfamily - - "Mif2/CENP-C like; Mif2 is a yeast DNA-binding kinetochore protein which is orthologous to mammalian CENP-C, the inner-kinetochore centromere (CEN) binding protein. Mif2 binds in the CDEIII region of the budding-yeast centromere, and has been shown to recruit a substantial subset of all inner and outer kinetochore proteins. Mif2 adopts a cupin fold and is extremely similar both in polypeptide chain conformation and in dimer geometry to the dimerisation domain of a bacterial transcription factor. The Mif2 dimer appears to be part of an enhanceosome-like structure that nucleates kinetochore assembly in budding yeast." Q#25750 - CGI_10004612 superfamily 243100 135 188 4.95E-12 61.8609 cl02576 B_zip1 superfamily - - "basic leucine zipper DNA-binding and multimerization region of GCN4 and related proteins; Basic leucine zipper (bZIP) transcription factors act in networks of homo- and hetero-dimers in the regulation in a diverse set of cellular pathways. Classical leucine zippers have alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization creates a pair of basic regions that bind DNA and undergo conformational change. GCN4 was identified in Saccharomyces cerevisiae from mutations in a deficiency in activation with the general amino acid control pathway. 
GCN4 encodes a trans-activator of amino acid biosynthetic genes containing 2 acidic activation domains and a C-terminal bZIP domain, comprised of a basic alpha-helical DNA-binding region and a coiled-coil dimerization region." Q#25752 - CGI_10002640 superfamily 247787 8 171 6.01E-32 116.529 cl17233 RecA-like_NTPases superfamily N - "RecA-like NTPases. This family includes the NTP binding domain of F1 and V1 H+ATPases, DnaB and related helicases as well as bacterial RecA and related eukaryotic and archaeal recombinases. This group also includes bacterial conjugation proteins and related DNA transfer proteins involved in type II and type IV secretion." Q#25753 - CGI_10002641 superfamily 218215 140 255 2.81E-17 78.2947 cl04683 Pinin_SDK_memA superfamily - - pinin/SDK/memA/ protein conserved region; Members of this family have very varied localisations within the eukaryotic cell. pinin is known to localise at the desmosomes and is implicated in anchoring intermediate filaments to the desmosomal plaque. SDK2/3 is a dynamically localised nuclear protein thought to be involved in modulation of alternative pre-mRNA splicing. memA is a tumour marker preferentially expressed in human melanoma cell lines. A common feature of the members of this family is that they may all participate in regulating protein-protein interactions. Q#25753 - CGI_10002641 superfamily 147050 6 139 7.41E-13 65.4672 cl04684 Pinin_SDK_N superfamily - - pinin/SDK conserved region; SDK2/3 is localised in nuclear speckles where as pinin is known to localise at the desmosomes where it is thought to be involved in anchoring intermediate filaments to the desmosomal plaque. The role of SDK2/3 in the nucleus is thought to be concerned with modulation of alternative pre-mRNA splicing. pinin has also been implicated as a tumour suppressor. The conserved region is found at the N-terminus of the member proteins. 
Q#25754 - CGI_10002642 superfamily 242899 7 156 3.18E-51 162.345 cl02135 TRAPP superfamily - - "Transport protein particle (TRAPP) component; TRAPP plays a key role in the targeting and/or fusion of ER-to-Golgi transport vesicles with their acceptor compartment. TRAPP is a large multimeric protein that contains at least 10 subunits. This family contains many TRAPP family proteins. The Bet3 subunit is one of the better characterized TRAPP proteins and has a dimeric structure with hydrophobic channels. The channel entrances are located on a putative membrane-interacting surface that is distinctively flat, wide and decorated with positively charged residues. Bet3 is proposed to localise TRAPP to the Golgi." Q#25756 - CGI_10022967 superfamily 245612 1 114 2.55E-45 154.007 cl11426 Amidase superfamily N - Amidase; Amidase. Q#25757 - CGI_10022968 superfamily 243072 36 166 2.05E-23 90.9058 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#25758 - CGI_10022969 superfamily 243072 17 117 1.93E-13 62.401 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#25759 - CGI_10022970 superfamily 213389 171 311 2.87E-14 71.1663 cl17092 STING_C superfamily C - "C-terminal domain of STING; STING (stimulator of interferon genes, also known as MITA, ERIS, MPYS and TMEM173) is a master regulator that mediates cytokine production in response to microbial invasion by directly sensing bacterial secondary messengers such as the cyclic dinucleotide bis-(3'-5')-cyclic dimeric GMP (c-di-GMP) and leading to the activation of IFN regulatory factor 3 (IRF3) through TANK-binding kinase 1 (TBK1) stimulation. STING is also a signaling adaptor in the IFN response to cytosolic DNA. This detection of foreign materials is the first step to a successful immune response. STING is localized in the ER and comprised of a predicted N-terminal transmembrane region and a C-terminal c-di-GMP binding domain." Q#25759 - CGI_10022970 superfamily 213389 498 640 1.59E-08 53.4471 cl17092 STING_C superfamily C - "C-terminal domain of STING; STING (stimulator of interferon genes, also known as MITA, ERIS, MPYS and TMEM173) is a master regulator that mediates cytokine production in response to microbial invasion by directly sensing bacterial secondary messengers such as the cyclic dinucleotide bis-(3'-5')-cyclic dimeric GMP (c-di-GMP) and leading to the activation of IFN regulatory factor 3 (IRF3) through TANK-binding kinase 1 (TBK1) stimulation. STING is also a signaling adaptor in the IFN response to cytosolic DNA. This detection of foreign materials is the first step to a successful immune response. STING is localized in the ER and comprised of a predicted N-terminal transmembrane region and a C-terminal c-di-GMP binding domain." Q#25759 - CGI_10022970 superfamily 248012 60 115 0.00321805 36.8677 cl17458 TIR_2 superfamily NC - TIR domain; This is a family of bacterial Toll-like receptors. Q#25759 - CGI_10022970 superfamily 248012 360 467 0.00490399 36.0165 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. 
Q#25760 - CGI_10022971 superfamily 216363 115 214 2.16E-16 71.7326 cl08312 UPF0029 superfamily - - Uncharacterized protein family UPF0029; Uncharacterized protein family UPF0029. Q#25761 - CGI_10022972 superfamily 247805 300 453 5.68E-19 85.0816 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#25761 - CGI_10022972 superfamily 247905 616 750 4.97E-13 67.2628 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#25761 - CGI_10022972 superfamily 213148 501 614 1.55E-11 62.7146 cl17041 helicase_insert_domain superfamily - - "helical domain inserted in SF2-type helicase domain in Hef-, MDA5- and FancM-like proteins; This helical domain can be found inserted in a subset of SF2-type DEAD-box related helicases, like archaeal Hef helicase, MDA5-like helicases and FancM-like helicases. The exact function of this domain is unknown, but seems to play a role in interaction with nucleotides and/or the stabilization of the nucleotide complex." 
Q#25761 - CGI_10022972 superfamily 221155 823 931 2.31E-08 53.1404 cl13152 RIG-I_C-RD superfamily - - "C-terminal domain of RIG-I; This family of proteins represents the regulatory domain RD of RIG-I, a protein which initiates a signalling cascade that provides essential antiviral protection for the host. The RD domain binds viral RNA, activating the RIG-I ATPase by RNA-dependant dimerisation. The structure of RD contains a zinc-binding domain and is thought to confer ligand specificity." Q#25761 - CGI_10022972 superfamily 246680 112 185 3.84E-05 42.9436 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#25762 - CGI_10022973 superfamily 247805 508 653 3.66E-18 83.5408 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 
Q#25762 - CGI_10022973 superfamily 247905 927 1062 8.58E-16 76.1224 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#25762 - CGI_10022973 superfamily 221155 1134 1250 2.58E-13 68.5484 cl13152 RIG-I_C-RD superfamily - - "C-terminal domain of RIG-I; This family of proteins represents the regulatory domain RD of RIG-I, a protein which initiates a signalling cascade that provides essential antiviral protection for the host. The RD domain binds viral RNA, activating the RIG-I ATPase by RNA-dependant dimerisation. The structure of RD contains a zinc-binding domain and is thought to confer ligand specificity." Q#25762 - CGI_10022973 superfamily 213148 712 846 4.20E-12 65.0258 cl17041 helicase_insert_domain superfamily - - "helical domain inserted in SF2-type helicase domain in Hef-, MDA5- and FancM-like proteins; This helical domain can be found inserted in a subset of SF2-type DEAD-box related helicases, like archaeal Hef helicase, MDA5-like helicases and FancM-like helicases. The exact function of this domain is unknown, but seems to play a role in interaction with nucleotides and/or the stabilization of the nucleotide complex." 
Q#25762 - CGI_10022973 superfamily 246680 208 281 0.00628998 36.3952 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#25764 - CGI_10022975 superfamily 243066 69 158 1.61E-22 89.5345 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." 
Q#25764 - CGI_10022975 superfamily 109845 279 318 1.48E-10 55.8527 cl02971 Pentapeptide superfamily - - "Pentapeptide repeats (8 copies); These repeats are found in many cyanobacterial proteins. The repeats were first identified in hglK. The function of these repeats is unknown. The structure of this repeat has been predicted to be a beta-helix. The repeat can be approximately described as A(D/N)LXX, where X can be any amino acid." Q#25764 - CGI_10022975 superfamily 109845 194 238 1.70E-06 44.2967 cl02971 Pentapeptide superfamily - - "Pentapeptide repeats (8 copies); These repeats are found in many cyanobacterial proteins. The repeats were first identified in hglK. The function of these repeats is unknown. The structure of this repeat has been predicted to be a beta-helix. The repeat can be approximately described as A(D/N)LXX, where X can be any amino acid." Q#25764 - CGI_10022975 superfamily 109845 170 208 7.92E-06 42.3707 cl02971 Pentapeptide superfamily - - "Pentapeptide repeats (8 copies); These repeats are found in many cyanobacterial proteins. The repeats were first identified in hglK. The function of these repeats is unknown. The structure of this repeat has been predicted to be a beta-helix. The repeat can be approximately described as A(D/N)LXX, where X can be any amino acid." Q#25764 - CGI_10022975 superfamily 109845 249 293 0.000296305 37.7483 cl02971 Pentapeptide superfamily - - "Pentapeptide repeats (8 copies); These repeats are found in many cyanobacterial proteins. The repeats were first identified in hglK. The function of these repeats is unknown. The structure of this repeat has been predicted to be a beta-helix. The repeat can be approximately described as A(D/N)LXX, where X can be any amino acid." Q#25767 - CGI_10022978 superfamily 222150 479 504 2.08E-06 45.0753 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. 
Q#25767 - CGI_10022978 superfamily 222150 451 476 3.42E-06 44.3049 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#25768 - CGI_10022979 superfamily 222370 4 76 4.27E-18 75.2521 cl16386 Longin superfamily - - "Regulated-SNARE-like domain; Longin is one of the approximately 26 components required for transporting proteins from the ER to the plasma membrane, via the Golgi apparatus. It is necessary for the steps of the transfer from the ER to the Golgi complex. Longins are the only R-SNAREs that are common to all eukaryotes, and they are characterized by a conserved N-terminal domain with a profilin-like fold called a longin domain." Q#25768 - CGI_10022979 superfamily 201526 96 152 2.62E-15 67.9485 cl09522 Synaptobrevin superfamily C - Synaptobrevin; Synaptobrevin. Q#25769 - CGI_10022980 superfamily 242186 28 200 9.69E-56 177.364 cl00911 AMMECR1 superfamily - - "AMMECR1; This family consists of several AMMECR1 as well as several uncharacterized proteins. The contiguous gene deletion syndrome AMME is characterized by Alport syndrome, midface hypoplasia, mental retardation and elliptocytosis and is caused by a deletion in Xq22.3, comprising several genes including COL4A5, FACL4 and AMMECR1. This family contains sequences from several eukaryotic species as well as archaebacteria and it has been suggested that the AMMECR1 protein may have a basic cellular function, potentially in either the transcription, replication, repair or translation machinery." Q#25771 - CGI_10022982 superfamily 241559 876 961 1.56E-09 56.5503 cl00030 CH superfamily N - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). 
The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." Q#25772 - CGI_10022983 superfamily 243092 10 265 1.29E-22 94.7092 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#25773 - CGI_10022984 superfamily 247725 1 160 1.81E-66 215.199 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. 
They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#25773 - CGI_10022984 superfamily 219103 155 268 7.00E-45 155.222 cl05893 Myotub-related superfamily - - "Myotubularin-related; This family represents a region within eukaryotic myotubularin-related proteins that is sometimes found with pfam02893. Myotubularin is a dual-specific lipid phosphatase that dephosphorylates phosphatidylinositol 3-phosphate and phosphatidylinositol (3,5)-bi-phosphate. Mutations in gene encoding myotubularin-related proteins have been associated with disease." Q#25773 - CGI_10022984 superfamily 206020 330 384 3.45E-24 96.0378 cl18286 Y_phosphatase_m superfamily - - "Myotubularin Y_phosphatase-like; This short region is highly conserved and seems to be common to many myotubularin proteins with protein tyrosine pyrophosphate activity. As the family has a number of highly conserved residues such as histidine, cysteine, glutamine and aspartate, it is possible that this represents a catalytic core of the active enzymatic part of the proteins." Q#25774 - CGI_10022985 superfamily 206084 108 132 0.0016907 35.6544 cl16472 zf-C2HC_2 superfamily - - zinc-finger of a C2HC-type; This family contains a number of divergent C2H2 type zinc fingers. Q#25778 - CGI_10022989 superfamily 241574 77 247 1.13E-64 211.29 cl00053 PTPc superfamily C - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. 
Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#25778 - CGI_10022989 superfamily 241574 267 490 9.14E-15 72.6185 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#25779 - CGI_10022990 superfamily 241874 48 489 2.24E-93 301.733 cl00456 SLC5-6-like_sbd superfamily - - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. 
They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#25779 - CGI_10022990 superfamily 241874 635 786 5.29E-61 216.546 cl00456 SLC5-6-like_sbd superfamily C - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#25782 - CGI_10022993 superfamily 245206 289 558 1.80E-126 374.881 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. 
Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#25782 - CGI_10022993 superfamily 241578 25 255 8.24E-90 279.642 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." 
Q#25783 - CGI_10022994 superfamily 243092 102 426 1.08E-53 182.535 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#25784 - CGI_10022995 superfamily 241572 181 269 3.71E-05 41.0701 cl00050 CYCLIN superfamily - - "Cyclin box fold. Protein binding domain functioning in cell-cycle and transcription control. Present in cyclins, TFIIB and Retinoblastoma (RB).The cyclins consist of 8 classes of cell cycle regulators that regulate cyclin dependent kinases (CDKs). TFIIB is a transcription factor that binds the TATA box. Cyclins, TFIIB and RB contain 2 copies of the domain." Q#25785 - CGI_10022996 superfamily 245201 875 1102 1.80E-36 139.679 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. 
It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#25785 - CGI_10022996 superfamily 245201 1495 1736 2.40E-33 130.434 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#25785 - CGI_10022996 superfamily 241584 1339 1430 2.52E-13 68.6771 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#25785 - CGI_10022996 superfamily 241584 510 590 5.09E-13 67.5215 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. 
Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#25785 - CGI_10022996 superfamily 241584 38 128 7.28E-12 64.4399 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#25785 - CGI_10022996 superfamily 245814 763 832 1.93E-09 56.7287 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#25785 - CGI_10022996 superfamily 245814 1264 1332 4.05E-09 55.9583 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#25785 - CGI_10022996 superfamily 241584 183 279 6.82E-07 49.4171 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#25785 - CGI_10022996 superfamily 245814 674 738 1.69E-05 44.7875 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#25785 - CGI_10022996 superfamily 241584 290 343 3.23E-05 44.4095 cl00065 FN3 superfamily N - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. 
Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#25785 - CGI_10022996 superfamily 241584 385 435 0.000147441 42.0983 cl00065 FN3 superfamily N - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#25786 - CGI_10022997 superfamily 241584 13 106 4.28E-08 54.0395 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#25787 - CGI_10022998 superfamily 241584 8 87 3.25E-09 55.5803 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. 
Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#25787 - CGI_10022998 superfamily 241584 407 499 5.14E-08 52.1135 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#25787 - CGI_10022998 superfamily 241584 115 193 1.09E-07 51.3431 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#25787 - CGI_10022998 superfamily 241584 305 396 9.79E-07 48.2615 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. 
Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#25787 - CGI_10022998 superfamily 241584 204 295 2.81E-06 46.7207 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#25788 - CGI_10022999 superfamily 241584 497 587 2.18E-12 64.0547 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#25788 - CGI_10022999 superfamily 241584 90 184 2.97E-11 60.9731 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. 
Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#25788 - CGI_10022999 superfamily 241584 195 288 2.55E-10 57.8915 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#25788 - CGI_10022999 superfamily 241584 599 685 1.06E-09 56.3507 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#25788 - CGI_10022999 superfamily 241584 399 486 2.48E-09 55.1951 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. 
Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#25788 - CGI_10022999 superfamily 241584 311 389 1.54E-08 52.8839 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#25788 - CGI_10022999 superfamily 241584 25 80 1.79E-06 46.7207 cl00065 FN3 superfamily N - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#25790 - CGI_10023001 superfamily 243096 1296 1468 4.21E-18 84.6568 cl02571 RhoGEF superfamily - - Guanine nucleotide exchange factor for Rho/Rac/Cdc42-like GTPases; Also called Dbl-homologous (DH) domain. It appears that PH domains invariably occur C-terminal to RhoGEF/DH domains. 
Q#25790 - CGI_10023001 superfamily 245814 1093 1162 1.23E-08 54.4175 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#25790 - CGI_10023001 superfamily 245814 1672 1754 2.01E-10 59.8264 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#25790 - CGI_10023001 superfamily 245814 971 1054 9.14E-07 49.0409 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. 
A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#25790 - CGI_10023001 superfamily 245814 1785 1825 0.000274288 41.5724 cl11960 Ig superfamily C - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#25791 - CGI_10023002 superfamily 245814 14 76 1.39E-13 61.3511 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#25792 - CGI_10023003 superfamily 241599 177 234 7.96E-21 83.832 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." 
Q#25793 - CGI_10023005 superfamily 243072 43 165 1.11E-36 131.737 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#25794 - CGI_10023006 superfamily 243072 18 148 3.63E-32 113.633 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#25796 - CGI_10028051 superfamily 148902 38 106 0.000796804 37.4327 cl06535 EMI superfamily - - EMI domain; The Pfam alignment is truncated at the C-terminus and does not include the final cysteine defined in Callebaut et al. This is to stop the family overlapping with other domains. Q#25797 - CGI_10028052 superfamily 241583 244 475 1.74E-81 264.621 cl00064 ZnMc superfamily - - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." 
Q#25797 - CGI_10028052 superfamily 189857 704 826 4.33E-21 91.5426 cl07832 Caveolin superfamily - - "Caveolin; All three known Caveolin forms have the FEDVIAEP caveolin 'signature motif' within their hydrophilic N-terminal domain. Caveolin 2 (Cav-2) is co-localised and co-expressed with Cav-1/VIP21, forms heterodimers with it and needs Cav-1 for proper membrane localisation. Cav-3 has greater protein sequence similarity to Cav-1 than to Cav-2. Cellular processes caveolins are involved in include vesicular transport, cholesterol homeostasis, signal transduction, and tumour suppression." Q#25797 - CGI_10028052 superfamily 216572 89 182 4.39E-06 46.1139 cl03265 Pep_M12B_propep superfamily N - Reprolysin family propeptide; This region is the propeptide for members of peptidase family M12B. The propeptide contains a sequence motif similar to the "cysteine switch" of the matrixins. This motif is found at the C terminus of the alignment but is not well aligned. Q#25798 - CGI_10028053 superfamily 246613 10 70 2.30E-31 111.648 cl14058 lectin_L-type superfamily NC - "legume lectins; The L-type (legume-type) lectins are a highly diverse family of carbohydrate binding proteins that generally display no enzymatic activity toward the sugars they bind. This family includes arcelin, concanavalinA, the lectin-like receptor kinases, the ERGIC-53/VIP36/EMP46 type1 transmembrane proteins, and an alpha-amylase inhibitor. L-type lectins have a dome-shaped beta-barrel carbohydrate recognition domain with a curved seven-stranded beta-sheet referred to as the "front face" and a flat six-stranded beta-sheet referred to as the "back face". This domain homodimerizes so that adjacent back sheets form a contiguous 12-stranded sheet and homotetramers occur by a back-to-back association of these homodimers. Though L-type lectins exhibit both sequence and structural similarity to one another, their carbohydrate binding specificities differ widely." 
Q#25801 - CGI_10028056 superfamily 241594 13 158 1.38E-62 198.173 cl00077 HECTc superfamily N - "HECT domain; C-terminal catalytic domain of a subclass of Ubiquitin-protein ligase (E3). It binds specific ubiquitin-conjugating enzymes (E2), accepts ubiquitin from E2, transfers ubiquitin to substrate lysine side chains, and transfers additional ubiquitin molecules to the end of growing ubiquitin chains." Q#25802 - CGI_10028057 superfamily 217473 96 320 8.39E-24 101.673 cl03978 Mab-21 superfamily - - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#25803 - CGI_10028058 superfamily 245201 92 379 0 552.372 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#25803 - CGI_10028058 superfamily 116877 471 559 2.90E-15 71.9435 cl07051 MRP-S33 superfamily - - "Mitochondrial ribosomal subunit S27; This family of proteins corresponds to mitochondrial ribosomal subunit S27 in prokaryotes and to subunit S33 in humans. It is a small 106 residue protein. The evolutionary history of the mitoribosomal proteome that is encoded by a diverse subset of eukaryotic genomes, reveals an ancestral ribosome of alpha-proteobacterial descent that more than doubled its protein content in most eukaryotic lineages. Several new MRPs have originated via duplication of existing MRPs as well as by recruitment from outside of the mitoribosomal proteome." 
Q#25804 - CGI_10028059 superfamily 219448 215 304 2.21E-16 75.8145 cl06523 DRMBL superfamily - - DNA repair metallo-beta-lactamase; The metallo-beta-lactamase fold contains five sequence motifs. The first four motifs are found in pfam00753 and are common to all metallo-beta-lactamases. The fifth motif appears to be specific to function. This entry represents the fifth motif from metallo-beta-lactamases involved in DNA repair. Q#25804 - CGI_10028059 superfamily 241867 28 129 1.04E-11 63.3414 cl00446 Lactamase_B superfamily N - Metallo-beta-lactamase superfamily; Metallo-beta-lactamase superfamily. Q#25808 - CGI_10028063 superfamily 217390 224 278 0.00011852 40.2357 cl18407 TPT superfamily N - Triose-phosphate Transporter family; This family includes transporters with a specificity for triose phosphate. Q#25809 - CGI_10028064 superfamily 248097 68 184 1.04E-15 71.9126 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#25810 - CGI_10028065 superfamily 248097 51 82 5.10E-08 46.1042 cl17543 C1q superfamily C - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#25811 - CGI_10028066 superfamily 243035 95 205 2.44E-19 80.8117 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. 
For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#25812 - CGI_10028067 superfamily 248097 74 190 2.17E-16 75.7646 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#25812 - CGI_10028067 superfamily 248097 368 466 1.17E-10 59.201 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#25813 - CGI_10028068 superfamily 241563 61 96 0.000506209 38.6144 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#25814 - CGI_10028069 superfamily 220215 78 174 0.00272488 37.5898 cl09630 FERM_N superfamily - - FERM N-terminal domain; This domain is the N-terminal ubiquitin-like structural domain of the FERM domain. Q#25814 - CGI_10028069 superfamily 246925 583 689 0.00812472 38.4906 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#25816 - CGI_10028071 superfamily 247725 782 872 3.26E-47 165.213 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. 
They are generally involved in targeting the protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#25816 - CGI_10028071 superfamily 247057 14 79 1.27E-28 111.229 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#25816 - CGI_10028071 superfamily 241622 228 304 1.95E-10 59.1174 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. 
The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#25816 - CGI_10028071 superfamily 247725 1017 1091 1.60E-20 89.3652 cl17171 PH-like superfamily C - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting the protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#25816 - CGI_10028071 superfamily 151082 87 187 1.04E-09 57.1667 cl11167 CRIC_ras_sig superfamily - - "Connector enhancer of kinase suppressor of ras; The CRIC - Connector enhancer of kinase suppressor of ras - domain functions as a scaffold in several signal cascades and acts on proliferation, differentiation and apoptosis." Q#25817 - CGI_10028072 superfamily 215827 57 228 6.05E-39 142.222 cl02830 Tyrosinase superfamily - - Common central domain of tyrosinase; This family also contains polyphenol oxidases and some hemocyanins. Binds two copper ions via two sets of three histidines. This family is related to pfam00372. 
Q#25819 - CGI_10028074 superfamily 245596 55 272 7.85E-113 328.387 cl11394 Glyco_tranf_GTA_type superfamily - - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#25822 - CGI_10028077 superfamily 243555 73 130 3.82E-05 43.1486 cl03871 Chitin_bind_3 superfamily N - "Chitin binding domain; This domain is found associated with a wide variety of cellulose binding domain. This domain however is a chitin binding domain. This domain is found in isolation in baculoviral spheroidins and spindolins, protein of unknown function." Q#25823 - CGI_10028078 superfamily 245206 8 254 6.06E-118 340.542 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. 
NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#25824 - CGI_10028079 superfamily 215827 312 489 2.81E-41 151.082 cl02830 Tyrosinase superfamily - - Common central domain of tyrosinase; This family also contains polyphenol oxidases and some hemocyanins. Binds two copper ions via two sets of three histidines. This family is related to pfam00372. Q#25826 - CGI_10028081 superfamily 243072 180 281 4.66E-13 67.0234 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#25826 - CGI_10028081 superfamily 243072 238 383 6.91E-08 51.2302 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#25826 - CGI_10028081 superfamily 243073 685 724 1.01E-06 46.7199 cl02533 SOCS superfamily - - "SOCS (suppressors of cytokine signaling) box. The SOCS box is found in the C-terminal region of CIS/SOCS family proteins (in combination with a SH2 domain), ASBs (ankyrin repeat-containing proteins with a SOCS box), SSBs (SPRY domain-containing proteins with a SOCS box), and WSBs (WD40 repeat-containing proteins with a SOCS box), as well as, other miscellaneous proteins. The function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions." Q#25826 - CGI_10028081 superfamily 243072 83 230 0.000123867 41.2151 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#25827 - CGI_10028082 superfamily 216839 16 123 1.34E-23 93.5551 cl12260 TFIIE_alpha superfamily - - "TFIIE alpha subunit; The general transcription factor TFIIE has an essential role in eukaryotic transcription initiation together with RNA polymerase II and other general factors. Human TFIIE consists of two subunits TFIIE-alpha and TFIIE-beta, and joins the pre-initiation complex after RNA polymerase II and TFIIF. 
This family consists of the conserved amino terminal region of eukaryotic TFIIE-alpha and proteins from archaebacteria that are presumed to be TFIIE-alpha subunits also Archaeoglobus fulgidus tfe." Q#25828 - CGI_10028083 superfamily 217293 117 322 3.84E-53 181.292 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#25828 - CGI_10028083 superfamily 202474 329 540 5.53E-20 88.4797 cl08379 Neur_chan_memb superfamily - - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#25836 - CGI_10028091 superfamily 245814 333 395 1.64E-05 43.2467 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#25837 - CGI_10028092 superfamily 243152 29 156 5.99E-35 120.471 cl02712 PGRP superfamily - - "Peptidoglycan recognition proteins (PGRPs) are pattern recognition receptors that bind, and in certain cases, hydrolyze peptidoglycans (PGNs) of bacterial cell walls. 
PGRPs have been divided into three classes: short PGRPs (PGRP-S), that are small (20 kDa) extracellular proteins; intermediate PGRPs (PGRP-I) that are 40-45 kDa and are predicted to be transmembrane proteins; and long PGRPs (PGRP-L), up to 90 kDa, which may be either intracellular or transmembrane. Several structures of PGRPs are known in insects and mammals, some bound with substrates like Muramyl Tripeptide (MTP) or Tracheal Cytotoxin (TCT). The substrate binding site is conserved in PGRP-LCx, PGRP-LE, and PGRP-Ialpha proteins. This family includes Zn-dependent N-Acetylmuramoyl-L-alanine Amidase, EC:3.5.1.28. This enzyme cleaves the amide bond between N-acetylmuramoyl and L-amino acids, preferentially D-lactyl-L-Ala, in bacterial cell walls. The structure for the bacteriophage T7 lysozyme shows that two of the conserved histidines and a cysteine are zinc binding residues. Site-directed mutagenesis of T7 lysozyme indicates that two conserved residues, a Tyr and a Lys, are important for amidase activity." Q#25849 - CGI_10028104 superfamily 245201 33 290 1.88E-82 255.907 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#25850 - CGI_10028105 superfamily 246918 148 207 6.29E-12 59.5227 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#25850 - CGI_10028105 superfamily 246918 84 143 8.78E-06 42.1887 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. 
Q#25851 - CGI_10028107 superfamily 248097 83 174 5.09E-14 64.979 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#25852 - CGI_10028108 superfamily 248097 52 123 7.56E-11 55.7342 cl17543 C1q superfamily C - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#25853 - CGI_10028109 superfamily 248097 22 70 9.08E-08 46.1042 cl17543 C1q superfamily C - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#25854 - CGI_10028110 superfamily 248097 217 342 3.93E-16 73.4534 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#25855 - CGI_10028111 superfamily 243066 6 91 1.36E-10 54.0961 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#25856 - CGI_10028112 superfamily 244886 7 94 4.90E-21 80.7733 cl08278 Rotamase_2 superfamily - - PPIC-type PPIASE domain; PPIC-type PPIASE domain. 
Q#25857 - CGI_10028113 superfamily 243035 22 126 2.85E-15 67.3297 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#25858 - CGI_10028114 superfamily 246918 746 798 2.98E-15 72.2343 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#25858 - CGI_10028114 superfamily 246918 917 969 8.73E-14 67.9971 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#25858 - CGI_10028114 superfamily 246918 575 627 3.55E-13 66.0711 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#25858 - CGI_10028114 superfamily 246918 632 684 4.87E-12 62.9895 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#25858 - CGI_10028114 superfamily 246918 803 855 6.59E-10 56.8263 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#25860 - CGI_10028116 superfamily 241835 89 148 1.01E-11 56.899 cl00392 Ribosomal_L35p superfamily - - Ribosomal protein L35; Ribosomal protein L35. Q#25862 - CGI_10028118 superfamily 246918 193 243 2.75E-12 61.4487 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#25862 - CGI_10028118 superfamily 246918 249 306 1.44E-11 59.1375 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. 
Q#25862 - CGI_10028118 superfamily 246918 78 130 3.12E-10 55.6707 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#25862 - CGI_10028118 superfamily 246918 135 187 1.36E-06 45.2703 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#25862 - CGI_10028118 superfamily 246918 14 68 4.28E-05 40.6479 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#25863 - CGI_10028119 superfamily 241874 67 268 1.75E-88 284.183 cl00456 SLC5-6-like_sbd superfamily C - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#25865 - CGI_10028121 superfamily 243049 56 82 0.000317032 38.113 cl02472 IGFBP superfamily NC - Insulin-like growth factor binding protein; Insulin-like growth factor binding protein. 
Q#25866 - CGI_10028122 superfamily 203013 5 33 4.80E-07 46.0798 cl04519 zf-HIT superfamily - - HIT zinc finger; This presumed zinc finger contains up to 6 cysteine residues that could coordinate zinc. The domain is named after the HIT protein. This domain is also found in the Thyroid receptor interacting protein 3 (TRIP-3) that specifically interacts with the ligand binding domain of the thyroid receptor. Q#25868 - CGI_10028124 superfamily 247063 58 122 2.42E-12 59.823 cl15768 TGS superfamily - - "The TGS domain, named after the ThrRS, GTPase, and SpoT/RelA proteins where it occurs, is structurally similar to ubiquitin. TGS is a small domain of about 50 amino acid residues with a predominantly beta-sheet structure. There is no direct information on the function of the TGS domain, but its presence in two types of regulatory proteins (the GTPases and guanosine polyphosphate phosphohydrolases/synthetases) suggests a ligand (most likely nucleotide)-binding, regulatory role." Q#25869 - CGI_10028125 superfamily 247725 1268 1383 3.26E-58 199.469 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#25869 - CGI_10028125 superfamily 247725 1908 2022 3.12E-56 193.691 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. 
All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#25869 - CGI_10028125 superfamily 247725 2253 2368 1.87E-53 185.602 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#25869 - CGI_10028125 superfamily 247725 2436 2525 1.91E-11 63.6498 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#25869 - CGI_10028125 superfamily 207690 1486 1514 0.000528203 40.4109 cl02656 zf-RanBP superfamily - - Zn-finger in Ran binding protein and others; Zn-finger in Ran binding protein and others. 
Q#25870 - CGI_10028126 superfamily 242274 4 213 5.55E-97 284.952 cl01053 SGNH_hydrolase superfamily - - "SGNH_hydrolase, or GDSL_hydrolase, is a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the typical Ser-His-Asp(Glu) triad from other serine hydrolases, but may lack the carboxylic acid." Q#25874 - CGI_10028130 superfamily 110440 486 510 0.000428477 38.1577 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#25874 - CGI_10028130 superfamily 241563 59 95 0.000496662 38.2292 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#25875 - CGI_10005875 superfamily 244824 110 602 0 680.888 cl07893 AmyAc_family superfamily - - "Alpha amylase catalytic domain family; The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. 
A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; and C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost this catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase." Q#25875 - CGI_10005875 superfamily 114894 1088 1128 6.29E-11 64.2684 cl17945 GDE_C superfamily C - "Amylo-alpha-1,6-glucosidase; This family includes human glycogen branching enzyme. This enzyme contains a number of distinct catalytic activities. It has been shown for the yeast homologue that mutations in this region disrupt the enzymes Amylo-alpha-1,6-glucosidase (EC:3.2.1.33)." Q#25876 - CGI_10005876 superfamily 241677 1 154 3.28E-104 297.811 cl00197 cyclophilin superfamily - - "cyclophilin: cyclophilin-type peptidylprolyl cis-trans isomerases. This family contains eukaryotic, bacterial and archaeal proteins which exhibit peptidylprolyl cis-trans isomerase activity (PPIase, Rotamase) and in addition bind the immunosuppressive drug cyclosporin (CsA). Immunosuppression in vertebrates is believed to be the result of the cyclophilin A-cyclosporin protein drug complex binding to and inhibiting the protein-phosphatase calcineurin. 
PPIase is an enzyme which accelerates protein folding by catalyzing the cis-trans isomerization of the peptide bonds preceding proline residues. Cyclophilins are a diverse family in terms of function and have been implicated in protein folding processes which depend on catalytic /chaperone-like activities. This group contains human cyclophilin 40, a co-chaperone of the hsp90 chaperone system; human cyclophilin A, a chaperone in the HIV-1 infectious process and; human cyclophilin H, a component of the U4/U6 snRNP, whose isomerization or chaperoning activities may play a role in RNA splicing." Q#25878 - CGI_10005878 superfamily 242903 222 307 2.55E-32 124.248 cl02148 APC10-like superfamily C - "APC10-like DOC1 domains in E3 ubiquitin ligases that mediate substrate ubiquitination; This family contains the single domain protein, APC10, a subunit of the anaphase-promoting complex (APC), as well as the DOC1 domain of multi-domain proteins present in E3 ubiquitin ligases. E3 ubiquitin ligases mediate substrate ubiquitination (or ubiquitylation), a component of the ubiquitin-26S proteasome pathway for selective proteolytic degradation. The APC, a multi-protein complex (or cyclosome), is a cell cycle-regulated, E3 ubiquitin ligase that controls important transitions in mitosis and the G1 phase by ubiquitinating regulatory proteins, thereby targeting them for degradation. APC10-like DOC1 domains such as those present in HECT (Homologous to the E6-AP Carboxyl Terminus) and Cullin-RING (Really Interesting New Gene) E3 ubiquitin ligase proteins, HECTD3, and CUL7, respectively, are also included in this hierarchy. CUL7 is a member of the Cullin-RING ligase family and functions as a molecular scaffold assembling a SCF-ROC1-like E3 ubiquitin ligase complex consisting of Skp1, CUL7, Fbx29 F-box protein, and ROC1 (RING-box protein 1) and promotes ubiquitination. 
CUL7 is a multi-domain protein with a C-terminal cullin domain that binds ROC1 and a centrally positioned APC10/DOC1 domain. HECTD3 contains a C-terminal HECT domain which contains the active site for ubiquitin transfer onto substrates, and an N-terminal APC10 domain which is responsible for substrate recognition and binding. An APC10/DOC1 domain homolog is also present in HERC2 (HECT domain and RLD2), a large multi-domain protein with three RCC1-like domains (RLDs), additional internal domains including zinc finger ZZ-type and Cyt-b5 (Cytochrome b5-like Heme/Steroid binding) domains, and a C-terminal HECT domain. Recent studies have shown that the protein complex HERC2-RNF8 coordinates ubiquitin-dependent assembly of DNA repair factors on damaged chromosomes. Also included in this hierarchy is an uncharacterized APC10/DOC1-like domain found in a multi-domain protein, which also contains CUB, zinc finger ZZ-type, and EF-hand domains. The APC10/DOC1 domain forms a beta-sandwich structure that is related in architecture to the galactose-binding domain-like fold; their sequences are quite dissimilar, however, and are not included here." Q#25878 - CGI_10005878 superfamily 241760 1345 1393 3.07E-17 78.1258 cl00295 ZZ superfamily - - "Zinc finger, ZZ type. Zinc finger present in dystrophin, CBP/p300 and many other proteins. The ZZ motif coordinates one or two zinc ions and most likely participates in ligand binding or molecular scaffolding. Many proteins containing ZZ motifs have other zinc-binding motifs as well, and the majority serve as scaffolds in pathways involving acetyltransferase, protein kinase, or ubiquitin-related activity. ZZ proteins can be grouped into the following functional classes: chromatin modifying, cytoskeletal scaffolding, ubiquitin binding or conjugating, and membrane receptor or ion-channel modifying proteins." 
Q#25878 - CGI_10005878 superfamily 241571 848 938 0.0019144 38.5475 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#25879 - CGI_10005879 superfamily 243054 9373 9588 3.29E-32 129.873 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#25879 - CGI_10005879 superfamily 243054 9485 9695 7.21E-30 122.939 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#25879 - CGI_10005879 superfamily 243054 8831 9045 1.55E-24 107.146 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some 
sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#25879 - CGI_10005879 superfamily 243054 9050 9250 1.28E-22 101.368 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#25879 - CGI_10005879 superfamily 243054 8720 8936 4.50E-21 96.7459 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#25879 - CGI_10005879 superfamily 243054 8178 8394 7.35E-21 96.3607 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, 
alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#25879 - CGI_10005879 superfamily 243054 6800 7021 3.70E-20 94.0495 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#25879 - CGI_10005879 superfamily 243054 7307 7518 4.83E-20 93.6643 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#25879 - CGI_10005879 superfamily 241559 49 153 5.44E-19 87.7515 cl00030 CH superfamily - 
- "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." Q#25879 - CGI_10005879 superfamily 243054 7636 7850 4.84E-17 84.8047 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#25879 - CGI_10005879 superfamily 243054 424 610 8.07E-16 80.9527 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#25879 - CGI_10005879 superfamily 243054 8608 8827 3.14E-15 79.0267 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal 
structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#25879 - CGI_10005879 superfamily 243054 8396 8606 2.70E-14 76.3303 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#25879 - CGI_10005879 superfamily 243054 7527 7742 8.78E-14 74.4043 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#25879 - CGI_10005879 superfamily 243054 7852 
8065 3.70E-11 66.3152 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#25879 - CGI_10005879 superfamily 243054 6669 6914 1.70E-10 64.004 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#25879 - CGI_10005879 superfamily 243054 6559 6719 1.40E-08 58.226 cl02488 SPEC superfamily C - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and 
separated by a linker of 5 residues; two copies of the repeat are present here" Q#25879 - CGI_10005879 superfamily 243054 288 516 1.02E-06 52.448 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#25879 - CGI_10005879 superfamily 141488 9945 10014 4.28E-34 130.646 cl02524 GAS2 superfamily - - Growth-Arrest-Specific Protein 2 Domain; Growth-Arrest-Specific Protein 2 Domain. Q#25879 - CGI_10005879 superfamily 243054 9268 9368 8.51E-10 60.8065 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#25879 - CGI_10005879 superfamily 243054 9707 9762 1.20E-05 48.0949 cl02488 SPEC superfamily C - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by 
proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#25879 - CGI_10005879 superfamily 243054 864 1072 2.75E-05 48.2108 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#25879 - CGI_10005879 superfamily 243054 8091 8173 5.63E-05 45.7837 cl02488 SPEC superfamily N - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#25879 - CGI_10005879 superfamily 207632 4263 4295 0.00900486 38.5857 cl02531 Plectin superfamily - - "Plectin repeat; This family includes repeats from plectin, desmoplakin, envoplakin and bullous 
pemphigoid antigen." Q#25880 - CGI_10005880 superfamily 241559 17 102 1.40E-19 78.8919 cl00030 CH superfamily - - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." Q#25881 - CGI_10003120 superfamily 247727 374 445 0.000322746 39.7207 cl17173 AdoMet_MTases superfamily C - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#25882 - CGI_10003121 superfamily 198738 520 602 5.49E-52 174.817 cl02599 Ets superfamily - - Ets-domain; Ets-domain. Q#25882 - CGI_10003121 superfamily 247057 382 454 3.18E-24 97.4809 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. 
They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#25883 - CGI_10002464 superfamily 245201 38 245 6.57E-100 295.764 cl09925 PKc_like superfamily C - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#25885 - CGI_10003376 superfamily 217309 142 553 5.83E-140 418.254 cl09289 EMP70 superfamily - - Endomembrane protein 70; Endomembrane protein 70. Q#25888 - CGI_10001426 superfamily 221377 231 375 1.15E-09 55.9378 cl13449 DUF3504 superfamily - - Domain of unknown function (DUF3504); This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 156 to 173 amino acids in length. Q#25889 - CGI_10009492 superfamily 247723 356 427 6.38E-31 117.262 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. 
RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#25889 - CGI_10009492 superfamily 247792 798 831 3.19E-06 45.5144 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#25889 - CGI_10009492 superfamily 247723 659 730 3.22E-19 83.8589 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. 
The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#25889 - CGI_10009492 superfamily 245674 922 967 0.000319947 39.965 cl11531 DUF904 superfamily C - Protein of unknown function (DUF904); This family consists of several bacterial and archaeal hypothetical proteins of unknown function. Q#25889 - CGI_10009492 superfamily 245716 126 147 0.00354632 36.4533 cl11592 zf-CCCH superfamily N - Zinc finger C-x8-C-x5-C-x3-H type (and similar); Zinc finger C-x8-C-x5-C-x3-H type (and similar). Q#25890 - CGI_10009493 superfamily 248024 11 324 1.88E-69 221.638 cl17470 SBF superfamily - - "Sodium Bile acid symporter family; This family consists of Na+/bile acid co-transporters. These transmembrane proteins function in the liver in the uptake of bile acids from portal blood plasma a process mediated by the co-transport of Na+. Also in the family is ARC3 from S. cerevisiae, this is a putative transmembrane protein involved in resistance to arsenic compounds." Q#25891 - CGI_10009494 superfamily 241578 320 464 1.78E-20 89.273 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. 
There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#25891 - CGI_10009494 superfamily 241578 509 670 1.04E-12 66.5462 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#25891 - CGI_10009494 superfamily 245213 195 231 1.18E-10 58.0318 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. 
EGF_CA can be found in tandem repeat arrangements." Q#25891 - CGI_10009494 superfamily 245213 273 309 2.15E-08 51.4834 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#25891 - CGI_10009494 superfamily 245213 155 192 2.20E-07 48.4018 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#25891 - CGI_10009494 superfamily 245213 234 270 1.15E-06 46.4758 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. 
EGF_CA can be found in tandem repeat arrangements." Q#25892 - CGI_10009495 superfamily 243362 177 336 4.35E-50 166.447 cl03262 DnaJ_C superfamily - - C-terminal substrate binding domain of DnaJ and HSP40; The C-terminal region of the DnaJ/Hsp40 protein mediates oligomerization and binding to denatured polypeptide substrate. DnaJ/Hsp40 is a widely conserved heat-shock protein. It prevents the aggregation of unfolded substrate and forms a ternary complex with both substrate and DnaK/Hsp70; the N-terminal J-domain of DnaJ/Hsp40 stimulates the ATPase activity of DnaK/Hsp70. Q#25892 - CGI_10009495 superfamily 243077 4 57 1.33E-25 97.6161 cl02542 DnaJ superfamily - - "DnaJ domain or J-domain. DnaJ/Hsp40 (heat shock protein 40) proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperonine family. Hsp40 proteins are characterized by the presence of a J domain, which mediates the interaction with Hsp70. They may contain other domains as well, and the architectures provide a means of classification." Q#25893 - CGI_10009496 superfamily 243072 402 523 7.26E-36 133.663 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#25893 - CGI_10009496 superfamily 243072 201 326 2.66E-31 120.566 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. 
The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#25893 - CGI_10009496 superfamily 243072 65 227 5.42E-25 102.462 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#25894 - CGI_10009497 superfamily 246680 12 92 3.15E-17 71.3594 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." 
Q#25895 - CGI_10009498 superfamily 247637 11 330 9.62E-130 394.643 cl16912 MDR superfamily - - "Medium chain reductase/dehydrogenase (MDR)/zinc-dependent alcohol dehydrogenase-like family; The medium chain reductase/dehydrogenases (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH. MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P) binding-Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES. The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH) , quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones. ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines. Other MDR members have only a catalytic zinc, and some contain no coordinated zinc." Q#25895 - CGI_10009498 superfamily 243179 716 794 3.45E-05 43.2903 cl02781 tetraspanin_LEL superfamily N - "Tetraspanin, extracellular domain or large extracellular loop (LEL). Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. 
Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. The tetraspanin family contains CD9, CD63, CD37, CD53, CD82, CD151, and CD81, amongst others. Tetraspanins are involved in diverse processes such as cell activation and proliferation, adhesion and motility, differentiation, cancer, and others. Their various functions may relate to their ability to act as molecular facilitators, grouping specific cell-surface proteins and affecting formation and stability of signaling complexes. Tetraspanins associate laterally with one another and cluster dynamically with numerous parnter domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web", which may also include integrins." Q#25897 - CGI_10009500 superfamily 241563 18 50 0.00119894 36.9315 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#25897 - CGI_10009500 superfamily 241717 54 130 0.00742698 36.4017 cl00240 RRF superfamily N - "Ribosome recycling factor (RRF). Ribosome recycling factor dissociates the posttermination complex, composed of the ribosome, deacylated tRNA, and mRNA, after termination of translation. Thus ribosomes are "recycled" and ready for another round of protein synthesis. RRF is believed to bind the ribosome at the A-site in a manner that mimics tRNA, but the specific mechanisms remain unclear. RRF is essential for bacterial growth. It is not necessary for cell growth in archaea or eukaryotes, but is found in mitochondria or chloroplasts of some eukaryotic species." 
Q#25898 - CGI_10009501 superfamily 242116 27 561 0 1081.23 cl00817 MM_CoA_mutase superfamily - - "Coenzyme B12-dependent-methylmalonyl coenzyme A (CoA) mutase (MCM)-like family; contains proteins similar to MCM, and the large subunit of Streptomyces coenzyme B12-dependent isobutyryl-CoA mutase (ICM). MCM catalyzes the isomerization of methylmalonyl-CoA to succinyl-CoA. The reaction proceeds via radical intermediates beginning with a substrate-induced homolytic cleavage of the Co-C bond of coenzyme B12 to produce cob(II)alamin and the deoxyadenosyl radical. MCM plays an important role in the conversion of propionyl-CoA to succinyl-CoA during the degradation of propionate for the Krebs cycle. In higher animals, MCM is involved in the breakdown of odd-chain fatty acids, several amino acids, and cholesterol. Methylobacterium extorquens MCM participates in the glyoxylate regeneration pathway. In M. extorquens, MCM forms a complex with MeaB; MeaB may protect MCM from irreversible inactivation. In some bacteria, MCM is involved in the reverse metabolic reaction, the rearrangement of succinyl-CoA to methylmalonyl-CoA. Examples include Propionbacterium shermanni MCM during propionic acid fermentation, E.coli MCM in a pathway for the conversion of succinate to propionate and Streptomyces MCM in polyketide biosynthesis. P. shermanni and Streptomyces cinnamonensis MCMs are alpha/beta heterodimers, with both subunits being homologous members of this family. It has been shown for P. shermanni MCM that only the alpha subunit binds coenzyme B12 and substrates. Human MCM is a homodimer with two active sites. Mouse and E.coli MCMs are also homodimers. ICM from S. cinnamonensis is comprised of a large and a small subunit. The holoenzyme appears to be an alpha2beta2 heterotetramer with up to 2 molecules of coenzyme B12 bound. The small subunit binds coenzyme B12. 
ICM catalyzes the reversible rearrangement of n-butyryl-CoA to isobutyryl-CoA (intermediates in fatty acid and valine catabolism, which in S. cinnamonensis can be converted to methylmalonyl-CoA and used in polyketide synthesis). In humans, impaired activity of MCM results in methylmalonic aciduria, a disorder of propionic acid metabolism." Q#25898 - CGI_10009501 superfamily 241759 603 724 1.33E-53 181.251 cl00293 B12-binding_like superfamily - - "B12 binding domain (B12-BD). Most of the members bind different cobalamid derivates, like B12 (adenosylcobamide) or methylcobalamin or methyl-Co(III) 5-hydroxybenzimidazolylcobamide. This domain is found in several enzymes, such as glutamate mutase, methionine synthase and methylmalonyl-CoA mutase. Cobalamin undergoes a conformational change on binding the protein; the dimethylbenzimidazole group, which is coordinated to the cobalt in the free cofactor, moves away from the corrin and is replaced by a histidine contributed by the protein. The sequence Asp-X-His-X-X-Gly, which contains this histidine ligand, is conserved in many cobalamin-binding proteins. Not all members of this family contain the conserved binding motif." Q#25900 - CGI_10002367 superfamily 247724 19 179 1.98E-95 278.666 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#25902 - CGI_10004759 superfamily 243100 5 27 0.000809067 32.9212 cl02576 B_zip1 superfamily NC - "basic leucine zipper DNA-binding and multimerization region of GCN4 and related proteins; Basic leucine zipper (bZIP) transcription factors act in networks of homo- and hetero-dimers in the regulation in a diverse set of cellular pathways. Classical leucine zippers have alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization creates a pair of basic regions that bind DNA and undergo conformational change. GCN4 was identified in Saccharomyces cerevisiae from mutations in a deficiency in activation with the general amino acid control pathway. GCN4 encodes a trans-activator of amino acid biosynthetic genes containing 2 acidic activation domains and a C-terminal bZIP domain, comprised of a basic alpha-helical DNA-binding region and a coiled-coil dimerization region." Q#25903 - CGI_10004760 superfamily 243058 574 686 1.43E-15 76.9695 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. 
An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#25903 - CGI_10004760 superfamily 243058 834 946 1.43E-15 76.9695 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#25903 - CGI_10004760 superfamily 243058 461 594 3.39E-10 60.7911 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. 
ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#25903 - CGI_10004760 superfamily 243058 651 727 2.03E-07 52.3167 cl02500 ARM superfamily C - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#25903 - CGI_10004760 superfamily 243058 911 987 2.03E-07 52.3167 cl02500 ARM superfamily C - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#25903 - CGI_10004760 superfamily 243058 310 416 7.26E-06 47.6943 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. 
An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#25903 - CGI_10004760 superfamily 151853 97 165 7.51E-07 50.0858 cl12943 Suppressor_APC superfamily - - "Adenomatous polyposis coli tumour suppressor protein; The tumour suppressor protein, APC, has a nuclear export activity as well as many different intracellular functions. The structure consists of three alpha-helices forming two separate antiparallel coiled coils." Q#25903 - CGI_10004760 superfamily 147850 2151 2176 0.000667382 40.4728 cl05471 APC_crr superfamily - - APC cysteine-rich region; This short region is found repeated in the mid region of the adenomatous polyposis proteins (APCs). In the human protein many cancer-linked SNPs are found near the first three occurrences of the motif. These repeats bind beta-catenin. Q#25904 - CGI_10004761 superfamily 243146 260 306 3.23E-10 54.975 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#25904 - CGI_10004761 superfamily 243146 226 271 8.32E-07 45.6271 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#25905 - CGI_10004762 superfamily 242179 23 100 2.81E-46 145.583 cl00897 Ribosomal_S27e superfamily - - Ribosomal protein S27; Ribosomal protein S27. 
Q#25906 - CGI_10004763 superfamily 147555 1 243 6.31E-65 207.005 cl05148 CRF-BP superfamily - - "Corticotropin-releasing factor binding protein (CRF-BP); This family consists of several eukaryotic corticotropin-releasing factor binding proteins (CRF-BP or CRH-BP). Corticotropin-releasing hormone (CRH) plays multiple roles in vertebrate species. In mammals, it is the major hypothalamic releasing factor for pituitary adrenocorticotropin secretion, and is a neurotransmitter or neuromodulator at other sites in the central nervous system. In non-mammalian vertebrates, CRH not only acts as a neurotransmitter and hypophysiotropin, it also acts as a potent thyrotropin-releasing factor, allowing CRH to regulate both the adrenal and thyroid axes, especially in development. CRH-BP is thought to play an inhibitory role in which it binds CRH and other CRH-like ligands and prevents the activation of CRH receptors. There is however evidence that CRH-BP may also exhibit diverse extra and intracellular roles in a cell specific fashion and at specific times in development." Q#25907 - CGI_10004764 superfamily 245008 1617 1681 1.71E-12 65.2872 cl09101 E_set superfamily - - "Early set domain associated with the catalytic domain of sugar utilizing enzymes at either the N or C terminus; The E or "early" set domains of sugar utilizing enzymes are associated with different types of catalytic domains at either the N-terminal or C-terminal end. These domains may be related to the immunoglobulin and/or fibronectin type III superfamilies. Members of this family include alpha amylase, sialidase, galactose oxidase, cellulase, cellulose, hyaluronate lyase, chitobiase, and chitinase. A subset of these members were recently identified as members of the CBM48 (Carbohydrate Binding Module 48) family. 
Members of the CBM48 family include pullulanase, maltooligosyl trehalose synthase, starch branching enzyme, glycogen branching enzyme, glycogen debranching enzyme, isoamylase, and the beta subunit of AMP-activated protein kinase." Q#25907 - CGI_10004764 superfamily 207794 328 757 1.47E-153 479.866 cl02948 GH20_hexosaminidase superfamily - - "Beta-N-acetylhexosaminidases of glycosyl hydrolase family 20 (GH20) catalyze the removal of beta-1,4-linked N-acetyl-D-hexosamine residues from the non-reducing ends of N-acetyl-beta-D-hexosaminides including N-acetylglucosides and N-acetylgalactosides. These enzymes are broadly distributed in microorganisms, plants and animals, and play roles in various key physiological and pathological processes. These processes include cell structural integrity, energy storage, cellular signaling, fertilization, pathogen defense, viral penetration, the development of carcinomas, inflammatory events and lysosomal storage disorders. The GH20 enzymes include the eukaryotic beta-N-acetylhexosaminidases A and B, the bacterial chitobiases, dispersin B, and lacto-N-biosidase. The GH20 hexosaminidases are thought to act via a catalytic mechanism in which the catalytic nucleophile is not provided by the solvent or the enzyme, but by the substrate itself." Q#25907 - CGI_10004764 superfamily 207794 1189 1598 5.92E-153 478.325 cl02948 GH20_hexosaminidase superfamily - - "Beta-N-acetylhexosaminidases of glycosyl hydrolase family 20 (GH20) catalyze the removal of beta-1,4-linked N-acetyl-D-hexosamine residues from the non-reducing ends of N-acetyl-beta-D-hexosaminides including N-acetylglucosides and N-acetylgalactosides. These enzymes are broadly distributed in microorganisms, plants and animals, and play roles in various key physiological and pathological processes. 
These processes include cell structural integrity, energy storage, cellular signaling, fertilization, pathogen defense, viral penetration, the development of carcinomas, inflammatory events and lysosomal storage disorders. The GH20 enzymes include the eukaryotic beta-N-acetylhexosaminidases A and B, the bacterial chitobiases, dispersin B, and lacto-N-biosidase. The GH20 hexosaminidases are thought to act via a catalytic mechanism in which the catalytic nucleophile is not provided by the solvent or the enzyme, but by the substrate itself." Q#25907 - CGI_10004764 superfamily 243574 35 190 2.03E-24 102.793 cl03918 CHB_HEX superfamily - - Putative carbohydrate binding domain; This domain represents the N terminal domain in chitobiases and beta-hexosaminidases EC:3.2.1.52. It is composed of a beta sandwich structure that is similar in structure to the cellulose binding domain of cellulase from Cellulomonas fimi. This suggests that this may be a carbohydrate binding domain. Q#25907 - CGI_10004764 superfamily 243574 903 1051 3.74E-21 93.1631 cl03918 CHB_HEX superfamily - - Putative carbohydrate binding domain; This domain represents the N terminal domain in chitobiases and beta-hexosaminidases EC:3.2.1.52. It is composed of a beta sandwich structure that is similar in structure to the cellulose binding domain of cellulase from Cellulomonas fimi. This suggests that this may be a carbohydrate binding domain. Q#25907 - CGI_10004764 superfamily 111707 1130 1191 0.000395492 40.8612 cl03741 Glyco_hydro_20b superfamily N - "Glycosyl hydrolase family 20, domain 2; This domain has a zincin-like fold." Q#25908 - CGI_10008695 superfamily 241994 4 113 9.48E-49 169.239 cl00632 ATP-synt_F superfamily - - "ATP synthase (F/14-kDa) subunit; This family includes 14-kDa subunit from vATPases, which is in the peripheral catalytic part of the complex. The family also includes archaebacterial ATP synthase subunit F." 
Q#25908 - CGI_10008695 superfamily 246680 256 340 0.0012204 38.3368 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#25909 - CGI_10008696 superfamily 241645 229 296 0.0015872 36.091 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#25910 - CGI_10008697 superfamily 243084 2196 2292 4.28E-51 177.952 cl02556 Bromodomain superfamily - - Bromodomain. 
Bromodomains are found in many chromatin-associated proteins and in nuclear histone acetyltransferases. They interact specifically with acetylated lysine. Q#25910 - CGI_10008697 superfamily 241617 661 733 5.85E-20 87.4601 cl00110 MBD superfamily - - "MeCP2, MBD1, MBD2, MBD3, MBD4, CLLD8-like, and BAZ2A-like proteins constitute a family of proteins that share the methyl-CpG-binding domain (MBD). The MBD consists of about 70 residues and is defined as the minimal region required for binding to methylated DNA by a methyl-CpG-binding protein which binds specifically to methylated DNA. The MBD can recognize a single symmetrically methylated CpG either as naked DNA or within chromatin. MeCP2, MBD1 and MBD2 (and likely MBD3) form complexes with histone deacetylase and are involved in histone deacetylase-dependent repression of transcription. MBD4 is an endonuclease that forms a complex with the DNA mismatch-repair protein MLH1. The MBDs present in putative chromatin remodelling subunit, BAZ2A, and putative histone methyltransferase, CLLD8, represent two phylogenetically distinct groups within the MBD protein family." Q#25910 - CGI_10008697 superfamily 247999 2034 2081 2.64E-13 67.9008 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#25910 - CGI_10008697 superfamily 247999 2088 2131 8.89E-10 57.5004 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#25910 - CGI_10008697 superfamily 243137 1004 1029 0.00596248 37.1827 cl02674 DDT superfamily C - "DDT domain; This domain is approximately 60 residues in length, and is predicted to be a DNA binding domain. 
The DDT domain is named after (DNA binding homeobox and Different Transcription factors). It is exclusively associated with nuclear domains, and is thought to be arranged into three alpha helices." Q#25913 - CGI_10008700 superfamily 243100 491 555 2.97E-10 56.8035 cl02576 B_zip1 superfamily - - "basic leucine zipper DNA-binding and multimerization region of GCN4 and related proteins; Basic leucine zipper (bZIP) transcription factors act in networks of homo- and hetero-dimers in the regulation in a diverse set of cellular pathways. Classical leucine zippers have alpha helices with leucine residues 7 amino acids apart, which stabilize dimerization with a parallel leucine zipper domain. Dimerization creates a pair of basic regions that bind DNA and undergo conformational change. GCN4 was identified in Saccharomyces cerevisiae from mutations in a deficiency in activation with the general amino acid control pathway. GCN4 encodes a trans-activator of amino acid biosynthetic genes containing 2 acidic activation domains and a C-terminal bZIP domain, comprised of a basic alpha-helical DNA-binding region and a coiled-coil dimerization region." Q#25913 - CGI_10008700 superfamily 197676 90 114 0.0022606 36.2897 cl18194 ZnF_C2H2 superfamily - - zinc finger; zinc finger. Q#25916 - CGI_10008703 superfamily 247068 841 938 6.70E-19 84.6725 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#25916 - CGI_10008703 superfamily 247068 458 549 4.23E-17 79.2797 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#25916 - CGI_10008703 superfamily 247068 1064 1159 1.75E-15 74.6573 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#25916 - CGI_10008703 superfamily 247068 263 352 2.03E-13 68.8793 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#25916 - CGI_10008703 superfamily 247068 1167 1267 3.70E-13 68.1089 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#25916 - CGI_10008703 superfamily 247068 1275 1381 1.27E-09 57.3234 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#25916 - CGI_10008703 superfamily 247068 361 450 1.62E-08 54.2418 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#25916 - CGI_10008703 superfamily 247068 551 610 2.00E-08 53.8566 cl15786 CA_like superfamily N - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. 
This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#25916 - CGI_10008703 superfamily 247068 635 712 3.94E-07 50.0046 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#25916 - CGI_10008703 superfamily 247068 1406 1446 5.96E-07 49.6194 cl15786 CA_like superfamily C - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#25916 - CGI_10008703 superfamily 247068 946 1053 4.01E-06 46.923 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#25916 - CGI_10008703 superfamily 247068 795 830 0.000532012 40.3746 cl15786 CA_like superfamily N - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#25916 - CGI_10008703 superfamily 247068 121 166 0.00497101 37.3291 cl15786 CA_like superfamily N - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. 
They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#25917 - CGI_10008704 superfamily 242323 406 515 3.17E-19 83.7119 cl01132 FA_hydroxylase superfamily - - "Fatty acid hydroxylase superfamily; This superfamily includes fatty acid and carotene hydroxylases and sterol desaturases. Beta-carotene hydroxylase is involved in zeaxanthin synthesis by hydroxylating beta-carotene, but the enzyme may be involved in other pathways. This family includes C-5 sterol desaturase and C-4 sterol methyl oxidase. Members of this family are involved in cholesterol biosynthesis and biosynthesis of plant cuticular wax. These enzymes contain two copies of a HXHH motif. Members of this family are integral membrane proteins." Q#25918 - CGI_10008705 superfamily 244824 90 531 1.06E-135 408.157 cl07893 AmyAc_family superfamily - - "Alpha amylase catalytic domain family; The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; and C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. 
Other members of this family have lost this catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase." Q#25919 - CGI_10008706 superfamily 241563 101 137 3.47E-06 44.7776 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#25920 - CGI_10008707 superfamily 241782 36 459 0 628.078 cl00321 AAT_I superfamily - - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. 
The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phosphorylase family (fold type V)." Q#25921 - CGI_10008708 superfamily 241648 8 35 0.000152378 38.5531 cl00158 ZnF_GATA superfamily C - Zinc finger DNA binding domain; binds specifically to DNA consensus sequence [AT]GATA[AG] promoter elements; a subset of family members may also bind protein; zinc-finger consensus topology is C-X(2)-C-X(17)-C-X(2)-C Q#25922 - CGI_10008709 superfamily 221997 11 102 3.94E-05 37.8018 cl18631 Complex1_LYR_2 superfamily - - "Complex1_LYR-like; This is a family of proteins carrying the LYR motif of family Complex1_LYR, pfam05347, likely to be involved in Fe-S cluster biogenesis in mitochondria." Q#25923 - CGI_10008710 superfamily 217940 9 74 4.52E-09 50.3709 cl18433 TAP42 superfamily C - TAP42-like family; The TOR signalling pathway activates a cell-growth program in response to nutrients. TIP41 (pfam04176) interacts with TAP42 and negatively regulates the TOR signaling pathway. Q#25924 - CGI_10007951 superfamily 243061 38 94 1.87E-05 39.3723 cl02509 SRCR superfamily NC - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. 
Q#25925 - CGI_10007952 superfamily 221913 683 864 2.46E-40 150.383 cl18626 AAA_12 superfamily - - AAA domain; This family of domains contain a P-loop motif that is characteristic of the AAA superfamily. Many of the proteins in this family are conjugative transfer proteins. Q#25925 - CGI_10007952 superfamily 247743 241 281 0.00487696 38.2811 cl17189 AAA superfamily C - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#25926 - CGI_10007953 superfamily 243061 2627 2727 4.51E-37 138.244 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#25926 - CGI_10007953 superfamily 243061 1852 1953 1.87E-21 93.1754 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#25926 - CGI_10007953 superfamily 243061 1532 1629 2.76E-21 92.7902 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. 
Q#25926 - CGI_10007953 superfamily 243061 2185 2289 1.03E-20 90.8642 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#25926 - CGI_10007953 superfamily 243061 907 1008 1.49E-20 90.479 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#25926 - CGI_10007953 superfamily 243061 1633 1733 1.33E-19 87.7826 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#25926 - CGI_10007953 superfamily 243061 2419 2507 2.06E-19 87.3974 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#25926 - CGI_10007953 superfamily 243061 1749 1843 9.25E-19 85.4714 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#25926 - CGI_10007953 superfamily 243061 580 678 1.38E-18 84.701 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. 
Q#25926 - CGI_10007953 superfamily 243061 2074 2171 8.49E-18 82.3898 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#25926 - CGI_10007953 superfamily 243061 1207 1309 1.81E-16 78.5378 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#25926 - CGI_10007953 superfamily 243061 233 299 1.18E-15 76.2266 cl02509 SRCR superfamily C - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#25926 - CGI_10007953 superfamily 243061 819 900 1.77E-15 75.8414 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#25926 - CGI_10007953 superfamily 243061 1433 1521 7.65E-15 73.9154 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#25926 - CGI_10007953 superfamily 243061 116 206 9.99E-14 70.4486 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. 
Q#25926 - CGI_10007953 superfamily 243061 425 526 4.61E-13 68.5226 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#25926 - CGI_10007953 superfamily 243061 335 422 1.63E-12 66.9818 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#25926 - CGI_10007953 superfamily 243061 696 791 2.52E-12 66.3362 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#25926 - CGI_10007953 superfamily 243061 1 61 2.77E-12 66.2114 cl02509 SRCR superfamily N - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#25926 - CGI_10007953 superfamily 243061 1956 2055 5.00E-12 65.441 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#25926 - CGI_10007953 superfamily 243061 2304 2395 5.44E-12 65.441 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. 
Q#25926 - CGI_10007953 superfamily 243061 1312 1412 6.03E-12 65.441 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#25926 - CGI_10007953 superfamily 243061 2516 2618 1.99E-11 63.9002 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#25926 - CGI_10007953 superfamily 243061 1105 1197 3.99E-07 50.8034 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#25926 - CGI_10007953 superfamily 243061 1017 1099 8.26E-06 46.9514 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#25927 - CGI_10007954 superfamily 243040 277 375 7.84E-19 82.9398 cl02447 CRD_FZ superfamily - - "CRD_domain cysteine-rich domain, also known as Fz (frizzled) domain; CRD_FZ is an essential component of a number of cell surface receptors, which are involved in multiple signal transduction pathways, particularly in modulating the activity of the Wnt proteins, which play a fundamental role in the early development of metazoans. CRD is also found in secreted frizzled related proteins (SFRPs), which lack the transmembrane segment found in the frizzled protein. The CRD domain is also present in the alpha-1 chain of mouse type XVIII collagen, in carboxypeptidase Z, several receptor tyrosine kinases, and the mosaic transmembrane serine protease corin. 
The CRD domain is well conserved in metazoans - 10 frizzled proteins have been identified in mammals, 4 in Drosophila and 3 in Caenorhabditis elegans. CRD domains have also been identified in multiple tandem copies in a Dictyostelium discoideum protein. Very little is known about the mechanism by which CRD domains interact with their ligands. The domain contains 10 conserved cysteines." Q#25927 - CGI_10007954 superfamily 245814 192 265 8.80E-07 47.0987 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#25927 - CGI_10007954 superfamily 245814 100 160 1.59E-06 46.3283 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#25927 - CGI_10007954 superfamily 245814 512 580 1.17E-08 52.8929 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. 
The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#25927 - CGI_10007954 superfamily 245814 407 490 1.11E-07 49.8113 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#25927 - CGI_10007954 superfamily 245814 41 65 0.00500649 35.4357 cl11960 Ig superfamily N - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#25928 - CGI_10007955 superfamily 241758 5 147 8.63E-20 80.4918 cl00292 AANH_like superfamily - - "Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases, ATP sulphurylases Universal Stress Response protein and electron transfer flavoprotein (ETF). The domain forms an alpha/beta/alpha fold which binds to Adenosine nucleotide." Q#25929 - CGI_10007956 superfamily 241758 150 248 1.02E-15 70.8618 cl00292 AANH_like superfamily N - "Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases, ATP sulphurylases Universal Stress Response protein and electron transfer flavoprotein (ETF). The domain forms an alpha/beta/alpha fold which binds to Adenosine nucleotide." Q#25929 - CGI_10007956 superfamily 241758 48 117 2.71E-08 50.061 cl00292 AANH_like superfamily N - "Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases, ATP sulphurylases Universal Stress Response protein and electron transfer flavoprotein (ETF). The domain forms an alpha/beta/alpha fold which binds to Adenosine nucleotide." Q#25932 - CGI_10007959 superfamily 241659 10 87 1.06E-26 101.579 cl00175 alpha-crystallin-Hsps_p23-like superfamily - - "alpha-crystallin domain (ACD) found in alpha-crystallin-type small heat shock proteins, and a similar domain found in p23 (a cochaperone for Hsp90) and in other p23-like proteins.; The alpha-crystallin-Hsps_p23-like superfamily includes the alpha-crystallin domain (ACD) of alpha-crystallin-type small heat shock proteins (sHsps) and a similar domain found in p23-like proteins. sHsps are small stress induced proteins with monomeric masses between 12-43 kDa, whose common feature is this ACD. sHsps are generally active as large oligomers consisting of multiple subunits, and are believed to be ATP-independent chaperones that prevent aggregation and are important in refolding in combination with other Hsps. p23 is a cochaperone of the Hsp90 chaperoning pathway. 
It binds Hsp90 and participates in the folding of a number of Hsp90 clients including the progesterone receptor. p23 also has a passive chaperoning activity. p23 in addition may act as the cytosolic prostaglandin E2 synthase. Included in this superfamily is the p23-like C-terminal CHORD-SGT1 (CS) domain of suppressor of G2 allele of Skp1 (Sgt1) and the p23-like domains of human butyrate-induced transcript 1 (hB-ind1), NUD (nuclear distribution) C, Melusin, and NAD(P)H cytochrome b5 (NCB5) oxidoreductase (OR)." Q#25932 - CGI_10007959 superfamily 243034 272 377 3.03E-16 73.5683 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#25933 - CGI_10007960 superfamily 247684 1 427 1.44E-106 328.468 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - 
"Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#25934 - CGI_10007961 superfamily 241563 44 81 5.99E-08 49.7852 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#25936 - CGI_10027339 superfamily 248264 163 202 0.000566725 37.987 cl17710 DDE_4 superfamily C - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. 
This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#25938 - CGI_10027341 superfamily 244843 81 613 1.44E-122 376.571 cl08040 Ggt superfamily - - Gamma-glutamyltransferase [Amino acid transport and metabolism] Q#25939 - CGI_10027342 superfamily 243066 28 121 1.50E-35 128.44 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#25939 - CGI_10027342 superfamily 219619 375 427 2.19E-07 48.7432 cl18518 Ion_trans_2 superfamily N - Ion channel; This family includes the two membrane helix type ion channels found in bacteria. Q#25940 - CGI_10027343 superfamily 245225 93 418 3.21E-16 79.4749 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily - - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. 
This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein coupled receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine- binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." 
Q#25940 - CGI_10027343 superfamily 247986 445 551 5.38E-08 53.1458 cl17432 PBPb superfamily C - "Bacterial periplasmic transport systems use membrane-bound complexes and substrate-bound, membrane-associated, periplasmic binding proteins (PBPs) to transport a wide variety of substrates, such as, amino acids, peptides, sugars, vitamins and inorganic ions. PBPs have two cell-membrane translocation functions: bind substrate, and interact with the membrane bound complex. A diverse group of periplasmic transport receptors for lysine/arginine/ornithine (LAO), glutamine, histidine, sulfate, phosphate, molybdate, and methanol are included in the PBPb CD." Q#25940 - CGI_10027343 superfamily 197504 658 788 5.11E-21 90.8116 cl18192 PBPe superfamily - - Eukaryotic homologues of bacterial periplasmic substrate binding proteins; Prokaryotic homologues are represented by a separate alignment: PBPb Q#25941 - CGI_10027344 superfamily 247986 334 411 2.46E-07 50.8346 cl17432 PBPb superfamily C - "Bacterial periplasmic transport systems use membrane-bound complexes and substrate-bound, membrane-associated, periplasmic binding proteins (PBPs) to transport a wide variety of substrates, such as, amino acids, peptides, sugars, vitamins and inorganic ions. PBPs have two cell-membrane translocation functions: bind substrate, and interact with the membrane bound complex. A diverse group of periplasmic transport receptors for lysine/arginine/ornithine (LAO), glutamine, histidine, sulfate, phosphate, molybdate, and methanol are included in the PBPb CD." 
Q#25941 - CGI_10027344 superfamily 197504 540 677 3.17E-31 119.316 cl18192 PBPe superfamily - - Eukaryotic homologues of bacterial periplasmic substrate binding proteins; Prokaryotic homologues are represented by a separate alignment: PBPb Q#25943 - CGI_10027346 superfamily 204985 4 95 1.65E-28 114.576 cl14987 Chorein_N superfamily C - "N-terminal region of Chorein, a TM vesicle-mediated sorter; Although mutations in the full-length vacuolar protein sorting 13A (VPS13A) protein in vertebrates lead to the disease of chorea-acanthocytosis, the exact function of any of the regions within the protein is not yet known. This region is the proposed leucine zipper at the N-terminus. The full-length protein is a transmembrane protein with a presumed role in vesicle-mediated sorting and intracellular protein transport." Q#25943 - CGI_10027346 superfamily 219122 2504 2585 4.63E-09 59.2303 cl05933 DUF1162 superfamily NC - Protein of unknown function (DUF1162); This family represents a conserved region within several hypothetical eukaryotic proteins. Family members might be vacuolar protein sorting related-proteins. 
Q#25944 - CGI_10027347 superfamily 243092 9 320 3.31E-69 234.922 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#25944 - CGI_10027347 superfamily 219240 816 1229 0 552.935 cl18502 COPI_C superfamily - - Coatomer (COPI) alpha subunit C-terminus; This family represents the C-terminus (approximately 500 residues) of the eukaryotic coatomer alpha subunit. Coatomer (COPI) is a large cytosolic protein complex which forms a coat around vesicles budding from the Golgi apparatus. Such coatomer-coated vesicles have been proposed to play a role in many distinct steps of intracellular transport. Note that many family members also contain the pfam04053 domain. Q#25945 - CGI_10027348 superfamily 241762 200 253 8.68E-23 89.9693 cl00297 R3H superfamily - - "R3H domain. 
The name of the R3H domain comes from the characteristic spacing of the most conserved arginine and histidine residues. R3H domains are found in proteins together with ATPase domains, SF1 helicase domains, SF2 DEAH helicase domains, Cys-rich repeats, ring-type zinc fingers, and KH domains. The function of the domain is predicted to bind ssDNA or ssRNA in a sequence-specific manner." Q#25947 - CGI_10027350 superfamily 245201 271 566 6.59E-70 231.654 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#25949 - CGI_10027352 superfamily 215821 55 152 1.00E-25 96.5406 cl18346 FKBP_C superfamily - - FKBP-type peptidyl-prolyl cis-trans isomerase; FKBP-type peptidyl-prolyl cis-trans isomerase. Q#25952 - CGI_10027355 superfamily 247856 125 182 4.11E-12 62.9505 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#25955 - CGI_10027358 superfamily 243660 57 113 0.0012898 37.2841 cl04176 TDT superfamily N - "The Tellurite-resistance/Dicarboxylate Transporter (TDT) family; The Tellurite-resistance/Dicarboxylate Transporter (TDT) family includes members from all three kingdoms, but only three members of the family have been functionally characterized: the TehA protein of E. 
coli functioning as a tellurite-resistance uptake permease, the Mae1 protein of S. pombe functioning in the uptake of malate and other dicarboxylates, and the sulfite efflux pump (SSU1) of Saccharomyces cerevisiae. In plants, the plasma membrane protein SLAC1 (Slow Anion Channel-Associated 1), which is preferentially expressed in guard cells, encodes a distant homolog of fungal and bacterial dicarboxylate/malic acid transport proteins. SLAC1 is essential in mediating stomatal responses to physiological and stress stimuli. Members of the TDT family exhibit 10 putative transmembrane alpha-helical spanners (TMSs)." Q#25957 - CGI_10027360 superfamily 244539 51 304 1.23E-116 338.387 cl06868 FNR_like superfamily - - "Ferredoxin reductase (FNR), an FAD and NAD(P) binding protein, was initially identified as a chloroplast reductase activity, catalyzing the electron transfer from reduced iron-sulfur protein ferredoxin to NADP+ as the final step in the electron transport mechanism of photosystem I. FNR transfers electrons from reduced ferredoxin to FAD (forming FADH2 via a semiquinone intermediate) and then transfers a hydride ion to convert NADP+ to NADPH. FNR has since been shown to utilize a variety of electron acceptors and donors and has a variety of physiological functions including nitrogen assimilation, dinitrogen fixation, steroid hydroxylation, fatty acid metabolism, oxygenase activity, and methane assimilation in many organisms. FNR has an NAD(P)-binding sub-domain of the alpha/beta class and a discrete (usually N-terminal) flavin sub-domain which vary in orientation with respect to the NAD(P) binding domain. The N-terminal moiety may contain a flavin prosthetic group (as in flavoenzymes) or use flavin as a substrate. Because flavins such as FAD can exist in oxidized, semiquinone (one- electron reduced), or fully reduced hydroquinone forms, FNR can interact with one and 2 electron carriers. FNR has a strong preference for NADP(H) vs NAD(H)." 
Q#25959 - CGI_10027362 superfamily 241659 38 119 0.00283733 33.3907 cl00175 alpha-crystallin-Hsps_p23-like superfamily N - "alpha-crystallin domain (ACD) found in alpha-crystallin-type small heat shock proteins, and a similar domain found in p23 (a cochaperone for Hsp90) and in other p23-like proteins.; The alpha-crystallin-Hsps_p23-like superfamily includes the alpha-crystallin domain (ACD) of alpha-crystallin-type small heat shock proteins (sHsps) and a similar domain found in p23-like proteins. sHsps are small stress induced proteins with monomeric masses between 12-43 kDa, whose common feature is this ACD. sHsps are generally active as large oligomers consisting of multiple subunits, and are believed to be ATP-independent chaperones that prevent aggregation and are important in refolding in combination with other Hsps. p23 is a cochaperone of the Hsp90 chaperoning pathway. It binds Hsp90 and participates in the folding of a number of Hsp90 clients including the progesterone receptor. p23 also has a passive chaperoning activity. p23 in addition may act as the cytosolic prostaglandin E2 synthase. Included in this superfamily is the p23-like C-terminal CHORD-SGT1 (CS) domain of suppressor of G2 allele of Skp1 (Sgt1) and the p23-like domains of human butyrate-induced transcript 1 (hB-ind1), NUD (nuclear distribution) C, Melusin, and NAD(P)H cytochrome b5 (NCB5) oxidoreductase (OR)." Q#25960 - CGI_10027363 superfamily 241782 63 319 4.40E-25 103.7 cl00321 AAT_I superfamily C - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. 
PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phosphorylase family (fold type V)." Q#25961 - CGI_10027364 superfamily 241782 6 139 3.76E-12 63.6389 cl00321 AAT_I superfamily C - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. 
Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phosphorylase family (fold type V)." Q#25962 - CGI_10027365 superfamily 241782 70 279 6.70E-13 67.8761 cl00321 AAT_I superfamily C - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. 
Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phosphorylase family (fold type V)." Q#25964 - CGI_10027367 superfamily 247057 506 551 6.12E-05 41.1336 cl15755 SAM_superfamily superfamily N - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#25965 - CGI_10027368 superfamily 221744 92 176 0.00228637 36.2599 cl18614 CABIT superfamily N - "Cell-cycle sustaining, positive selection,; The 'CABIT' domain (for 'cysteine-containing, all- in Themis') is found in a newly identified gene family that has three mammalian homologues (Themis, Icb1 and 9130404H23Rik) that encode proteins with two CABIT domains and a highly conserved proline-rich region. In contrast, Fam59A, Fam59B and related proteins from mammals to cnidarians, including the insect Serrano proteins, have a single copy of the CABIT domain, a proline-rich region and often a C-terminal SAM (sterile-motif) domain. 
Multiple-sequence alignment has predicted that the CABIT domain adopts an all-strand structure with at least 12 strands, ie a dyad of six-stranded beta-barrel units. The CABIT domain contains a nearly absolutely conserved cysteine residue which is likely to be central to its function. CABIT domain proteins function downstream of tyrosine kinase signalling and interact with GRB2." Q#25966 - CGI_10027369 superfamily 217160 44 346 6.03E-66 220.168 cl18393 DUF187 superfamily - - "Uncharacterized BCR, COG1649; Uncharacterized BCR, COG1649. " Q#25972 - CGI_10027375 superfamily 217915 1144 1380 4.03E-43 166.528 cl14957 Spc97_Spc98 superfamily N - Spc97 / Spc98 family; The spindle pole body (SPB) functions as the microtubule-organising centre in yeast. Members of this family are spindle pole body (SPB) components such as Spc97 and Spc98 that form a complex with gamma-tubulin. This family of proteins includes the grip motif 1 and grip motif 2. Q#25972 - CGI_10027375 superfamily 217915 299 594 3.03E-38 151.505 cl14957 Spc97_Spc98 superfamily C - Spc97 / Spc98 family; The spindle pole body (SPB) functions as the microtubule-organising centre in yeast. Members of this family are spindle pole body (SPB) components such as Spc97 and Spc98 that form a complex with gamma-tubulin. This family of proteins includes the grip motif 1 and grip motif 2. Q#25972 - CGI_10027375 superfamily 247746 585 627 0.00490008 37.8332 cl17192 ATP-synt_B superfamily NC - "ATP synthase B/B' CF(0); Part of the CF(0) (base unit) of the ATP synthase. The base unit is thought to translocate protons through membrane (inner membrane in mitochondria, thylakoid membrane in plants, cytoplasmic membrane in bacteria). The B subunits are thought to interact with the stalk of the CF(1) subunits. 
This domain should not be confused with the ab CF(1) proteins (in the head of the ATP synthase) which are found in pfam00006" Q#25975 - CGI_10027378 superfamily 241578 7 146 1.29E-14 67.7018 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#25976 - CGI_10027379 superfamily 241626 66 126 1.45E-06 43.057 cl00125 RHOD superfamily - - "Rhodanese Homology Domain (RHOD); an alpha beta fold domain found duplicated in the rhodanese protein. The cysteine containing enzymatically active version of the domain is also found in the Cdc25 class of protein phosphatases and a variety of proteins such as sulfide dehydrogenases and certain stress proteins such as senescence specific protein 1 in plants, PspE and GlpE in bacteria and cyanide and arsenate resistance proteins. Inactive versions (no active site cysteine) are also seen in dual specificity phosphatases, ubiquitin hydrolases from yeast and in sulfuryltransferases, where they are believed to play a regulatory role in multidomain proteins." 
Q#25977 - CGI_10027380 superfamily 241832 4 121 2.93E-48 152.5 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#25978 - CGI_10027381 superfamily 220692 1 275 1.08E-11 63.3773 cl18570 7TM_GPCR_Srw superfamily - - Serpentine type 7TM GPCR chemoreceptor Srw; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srw is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. The genes encoding Srw do not appear to be under as strong an adaptive evolutionary pressure as those of Srz. Q#25980 - CGI_10027383 superfamily 247804 603 645 4.06E-07 48.3406 cl17250 SANT superfamily - - "'SWI3, ADA2, N-CoR and TFIIIB' DNA-binding domains. Tandem copies of the domain bind telomeric DNA tandem repeats as part of the capping complex. Binding is sequence dependent for repeats which contain the G/C rich motif [C2-3 A (CA)1-6]. 
The domain is also found in regulatory transcriptional repressor complexes where it also binds DNA." Q#25980 - CGI_10027383 superfamily 203011 429 513 2.46E-38 138.879 cl04515 SWIRM superfamily - - SWIRM domain; This SWIRM domain is a small alpha-helical domain of about 85 amino acid residues found in chromosomal proteins. It contains a helix-turn helix motif and binds to DNA. Q#25981 - CGI_10027384 superfamily 242876 13 152 8.03E-33 115.914 cl02092 Clat_adaptor_s superfamily - - Clathrin adaptor complex small chain; Clathrin adaptor complex small chain. Q#25982 - CGI_10027385 superfamily 241599 48 104 2.57E-16 68.424 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#25985 - CGI_10027388 superfamily 241599 175 231 5.37E-21 83.832 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#25986 - CGI_10027390 superfamily 241750 120 178 3.15E-15 69.9804 cl00281 metallo-dependent_hydrolases superfamily NC - "Superfamily of metallo-dependent hydrolases (also called amidohydrolase superfamily) is a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. 
The family includes urease alpha, adenosine deaminase, phosphotriesterase dihydroorotases, allantoinases, hydantoinases, AMP-, adenine and cytosine deaminases, imidazolonepropionase, aryldialkylphosphatase, chlorohydrolases, formylmethanofuran dehydrogenases and others." Q#25986 - CGI_10027390 superfamily 241750 63 108 1.16E-09 54.5261 cl00281 metallo-dependent_hydrolases superfamily NC - "Superfamily of metallo-dependent hydrolases (also called amidohydrolase superfamily) is a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. The family includes urease alpha, adenosine deaminase, phosphotriesterase dihydroorotases, allantoinases, hydantoinases, AMP-, adenine and cytosine deaminases, imidazolonepropionase, aryldialkylphosphatase, chlorohydrolases, formylmethanofuran dehydrogenases and others." Q#25987 - CGI_10027391 superfamily 247724 16 172 4.86E-17 76.4948 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#25989 - CGI_10027393 superfamily 247755 11 92 5.62E-27 101.568 cl17201 ABC_ATPase superfamily C - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#25990 - CGI_10027394 superfamily 247743 351 493 0.000585778 40.2071 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. 
The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#25991 - CGI_10027395 superfamily 247684 37 389 0 703.685 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#25992 - CGI_10027396 superfamily 247724 20 179 3.25E-68 209.055 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. 
The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#25993 - CGI_10027397 superfamily 243033 8 84 1.08E-06 42.6906 cl02428 Ependymin superfamily N - Ependymin; Ependymin. Q#25995 - CGI_10027399 superfamily 177822 249 472 1.42E-15 76.4973 cl18088 PLN02164 superfamily N - sulfotransferase Q#25996 - CGI_10027400 superfamily 177822 389 612 1.20E-15 77.2677 cl18088 PLN02164 superfamily N - sulfotransferase Q#25996 - CGI_10027400 superfamily 177822 62 285 4.21E-15 75.7269 cl18088 PLN02164 superfamily N - sulfotransferase Q#25997 - CGI_10027401 superfamily 243074 215 251 1.91E-05 41.3381 cl02535 F-box-like superfamily C - F-box-like; This is an F-box-like family. Q#25998 - CGI_10027402 superfamily 218900 2 251 5.90E-99 291.879 cl05572 Aph-1 superfamily - - "Aph-1 protein; This family consists of several eukaryotic Aph-1 proteins.Gamma-secretase catalyzes the intramembrane proteolysis of Notch, beta-amyloid precursor protein, and other substrates as part of a new signaling paradigm and as a key step in the pathogenesis of Alzheimer's disease. 
It is thought that the presenilin heterodimer comprises the catalytic site and that a highly glycosylated form of nicastrin associates with it. Aph-1 and Pen-2, two membrane proteins genetically linked to gamma-secretase, associate directly with presenilin and nicastrin in the active protease complex. Co-expression of all four proteins leads to marked increases in presenilin heterodimers, full glycosylation of nicastrin, and enhanced gamma-secretase activity." Q#25999 - CGI_10027403 superfamily 245202 27 98 5.32E-32 114.5 cl09927 S1_like superfamily - - "S1_like: Ribosomal protein S1-like RNA-binding domain. Found in a wide variety of RNA-associated proteins. Originally identified in S1 ribosomal protein. This superfamily also contains the Cold Shock Domain (CSD), which is a homolog of the S1 domain. Both domains are members of the Oligonucleotide/oligosaccharide Binding (OB) fold." Q#26001 - CGI_10027405 superfamily 243169 169 256 1.47E-40 145.366 cl02766 NGN superfamily - - "N-Utilization Substance G (NusG) N-terminal (NGN) domain Superfamily; The N-Utilization Substance G (NusG) and its eukaryotic homolog Spt5 are involved in transcription elongation and termination. NusG contains an NGN domain at its N-terminus and Kyrpides Ouzounis and Woese (KOW) repeats at its C-terminus in bacteria and archaea. The eukaryotic ortholog, Spt5, is a large protein composed of an acidic N-terminus, an NGN domain, and multiple KOW motifs at its C-terminus. Spt5 forms a Spt4-Spt5 complex that is an essential RNA Polymerase II elongation factor. NusG was originally discovered as an N-dependent antitermination enhancing activity in Escherichia coli and has a variety of functions, such as being involved in RNA polymerase elongation and Rho-termination in bacteria. Orthologs of the NusG gene exist in all bacteria, but its functions and requirements are different. 
The diverse activities suggest that, after diverging from a common ancestor, NusG proteins became specialized in different bacteria." Q#26001 - CGI_10027405 superfamily 241810 463 513 6.88E-26 102.604 cl00354 KOW superfamily - - "KOW: an acronym for the authors' surnames (Kyrpides, Ouzounis and Woese); KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. The KOW motif contains an invariants glycine residue and comprises alternating blocks of hydrophilic and hydrophobic residues." Q#26001 - CGI_10027405 superfamily 241810 695 746 2.72E-24 97.9405 cl00354 KOW superfamily - - "KOW: an acronym for the authors' surnames (Kyrpides, Ouzounis and Woese); KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. The KOW motif contains an invariants glycine residue and comprises alternating blocks of hydrophilic and hydrophobic residues." Q#26001 - CGI_10027405 superfamily 241810 414 462 9.61E-22 90.639 cl00354 KOW superfamily - - "KOW: an acronym for the authors' surnames (Kyrpides, Ouzounis and Woese); KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. The KOW motif contains an invariants glycine residue and comprises alternating blocks of hydrophilic and hydrophobic residues." 
Q#26001 - CGI_10027405 superfamily 241810 989 1043 9.65E-22 91.0386 cl00354 KOW superfamily - - "KOW: an acronym for the authors' surnames (Kyrpides, Ouzounis and Woese); KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. The KOW motif contains an invariants glycine residue and comprises alternating blocks of hydrophilic and hydrophobic residues." Q#26001 - CGI_10027405 superfamily 241810 589 631 4.77E-19 82.9529 cl00354 KOW superfamily - - "KOW: an acronym for the authors' surnames (Kyrpides, Ouzounis and Woese); KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. The KOW motif contains an invariants glycine residue and comprises alternating blocks of hydrophilic and hydrophobic residues." Q#26001 - CGI_10027405 superfamily 241810 268 304 1.55E-15 72.5014 cl00354 KOW superfamily - - "KOW: an acronym for the authors' surnames (Kyrpides, Ouzounis and Woese); KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. The KOW motif contains an invariants glycine residue and comprises alternating blocks of hydrophilic and hydrophobic residues." 
Q#26001 - CGI_10027405 superfamily 248304 767 860 3.15E-16 76.7257 cl17750 CTD superfamily - - "Spt5 C-terminal nonapeptide repeat binding Spt4; The C-terminal domain of the transcription elongation factor protein Spt5 is necessary for binding to Spt4 to form the functional complex that regulates early transcription elongation by RNA polymerase II. The complex may be involved in pre-mRNA processing through its association with mRNA capping enzymes. This CTD domain carries a regular nonapeptide repeat that can be present in up to 18 copies, as in S. pombe. The repeat has a characteristic TPA motif." Q#26001 - CGI_10027405 superfamily 221333 104 163 0.000372099 40.1181 cl13391 Spt5_N superfamily N - "Spt5 transcription elongation factor, acidic N-terminal; This is the very acidic N-terminal region of the early transcription elongation factor Spt5. The Spt5-Spt4 complex regulates early transcription elongation by RNA polymerase II and has an imputed role in pre-mRNA processing via its physical association with mRNA capping enzymes. The actual function of this N-terminal domain is not known although it is dispensable for binding to Spt4." Q#26003 - CGI_10027407 superfamily 245201 89 338 1.70E-56 188.861 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#26006 - CGI_10027410 superfamily 202085 54 94 6.87E-09 52.7478 cl03401 zf-CXXC superfamily - - "CXXC zinc finger domain; This domain contains eight conserved cysteine residues that bind to two zinc ions. 
The CXXC domain is found in a variety of chromatin-associated proteins. This domain binds to nonmethyl-CpG dinucleotides. The domain is characterized by two CGXCXXC repeats. The RecQ helicase has a single repeat that also binds to zinc, but this has not been included in this family. The DNA binding interface has been identified by NMR." Q#26007 - CGI_10027411 superfamily 219165 125 313 2.96E-66 223.331 cl06019 LMF1 superfamily C - "Lipase maturation factor; This family of transmembrane proteins includes the lipase maturation factor, LMF1. Lipoprotein lipase and hepatic lipase require LMF1 to fold into their active states. The precise role of LMF1 in lipase folding has yet to be determined." Q#26007 - CGI_10027411 superfamily 219165 353 580 3.12E-59 204.457 cl06019 LMF1 superfamily N - "Lipase maturation factor; This family of transmembrane proteins includes the lipase maturation factor, LMF1. Lipoprotein lipase and hepatic lipase require LMF1 to fold into their active states. The precise role of LMF1 in lipase folding has yet to be determined." Q#26008 - CGI_10027412 superfamily 247724 613 711 1.77E-07 51.1828 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#26008 - CGI_10027412 superfamily 246925 333 438 0.00597282 38.8758 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#26009 - CGI_10027413 superfamily 218970 52 560 1.43E-78 259.882 cl09405 DUF1032 superfamily - - Protein of unknown function (DUF1032); This family consists of several conserved eukaryotic proteins of unknown function. Q#26010 - CGI_10027414 superfamily 248012 187 320 7.93E-21 86.2232 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#26011 - CGI_10027415 superfamily 248012 113 238 1.70E-17 75.8228 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. 
Q#26012 - CGI_10027416 superfamily 247057 698 769 2.84E-39 141.055 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#26012 - CGI_10027416 superfamily 247057 610 672 5.33E-35 128.581 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#26012 - CGI_10027416 superfamily 247057 536 598 5.73E-31 117.329 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. 
This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#26014 - CGI_10027418 superfamily 241763 8 209 4.03E-83 248.308 cl00298 Peptidase_C1 superfamily - - "C1 Peptidase family (MEROPS database nomenclature), also referred to as the papain family; composed of two subfamilies of cysteine peptidases (CPs), C1A (papain) and C1B (bleomycin hydrolase). Papain-like enzymes are mostly endopeptidases with some exceptions like cathepsins B, C, H and X, which are exopeptidases. Papain-like CPs have different functions in various organisms. Plant CPs are used to mobilize storage proteins in seeds while mammalian CPs are primarily lysosomal enzymes responsible for protein degradation in the lysosome. Papain-like CPs are synthesized as inactive proenzymes with N-terminal propeptide regions, which are removed upon activation. Bleomycin hydrolase (BH) is a CP that detoxifies bleomycin by hydrolysis of an amide group. It acts as a carboxypeptidase on its C-terminus to convert itself into an aminopeptidase and peptide ligase. BH is found in all tissues in mammals as well as in many other eukaryotes. It forms a hexameric ring barrel structure with the active sites imbedded in the central channel. Some members of the C1 family are proteins classified as non-peptidase homologs which lack peptidase activity or have missing active site residues." 
Q#26018 - CGI_10027423 superfamily 243066 24 122 5.97E-11 59.9385 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#26020 - CGI_10003966 superfamily 243051 449 603 9.79E-34 125.953 cl02479 MAM superfamily - - "Meprin, A5 protein, and protein tyrosine phosphatase Mu (MAM) domain. MAM is an extracellular domain which mediates protein-protein interactions and is found in a diverse set of proteins, many of which are known to function in cell adhesion. Members include: type IIB receptor protein tyrosine phosphatases (such as RPTPmu), meprins (plasma membrane metalloproteases), neuropilins (receptors of secreted semaphorins), and zonadhesins (sperm-specific membrane proteins which bind to the extracellular matrix of the egg). In meprin A and neuropilin-1 and -2, MAM is involved in homo-oligomerization. In RPTPmu, it has been associated with both homophilic adhesive (trans) interactions and lateral (cis) receptor oligomerization. 
In a GPI-anchored protein that is expressed in cells in the embryonic chicken spinal cord, MDGA1, the MAM domain has been linked to heterophilic interactions with axon-rich region." Q#26020 - CGI_10003966 superfamily 245213 179 211 6.89E-05 40.6978 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#26021 - CGI_10003967 superfamily 246976 1 371 4.50E-68 229.22 cl15483 Dymeclin superfamily C - "Dyggve-Melchior-Clausen syndrome protein; Dymeclin (Dyggve-Melchior-Clausen syndrome protein) contains a large number of leucine and isoleucine residues and a total of 17 repeated dileucine motifs. It is characteristically about 700 residues long and present in plants and animals. Mutations in the gene coding for this protein in humans give rise to the disorder Dyggve-Melchior-Clausen syndrome (DMC, MIM 223800) which is an autosomal-recessive disorder characterized by the association of a spondylo-epi-metaphyseal dysplasia and mental retardation. DYM transcripts are widely expressed throughout human development and Dymeclin is not an integral membrane protein of the ER, but rather a peripheral membrane protein dynamically associated with the Golgi apparatus." 
Q#26022 - CGI_10003968 superfamily 246976 33 265 2.84E-72 236.154 cl15483 Dymeclin superfamily N - "Dyggve-Melchior-Clausen syndrome protein; Dymeclin (Dyggve-Melchior-Clausen syndrome protein) contains a large number of leucine and isoleucine residues and a total of 17 repeated dileucine motifs. It is characteristically about 700 residues long and present in plants and animals. Mutations in the gene coding for this protein in humans give rise to the disorder Dyggve-Melchior-Clausen syndrome (DMC, MIM 223800) which is an autosomal-recessive disorder characterized by the association of a spondylo-epi-metaphyseal dysplasia and mental retardation. DYM transcripts are widely expressed throughout human development and Dymeclin is not an integral membrane protein of the ER, but rather a peripheral membrane protein dynamically associated with the Golgi apparatus." Q#26023 - CGI_10003969 superfamily 247097 5 39 0.00769078 31.2674 cl15839 ShK superfamily - - ShK domain-like; This domain of is found in several C. elegans proteins. The domain is 30 amino acids long and rich in cysteine residues. There are 6 conserved cysteine positions in the domain that form three disulphide bridges. The domain is found in the potassium channel inhibitor ShK in sea anemone. Q#26024 - CGI_10003970 superfamily 219653 105 366 7.40E-124 360.868 cl06813 N2227 superfamily - - N2227-like protein; This family features sequences that are similar to a region of hypothetical yeast gene product N2227. This is thought to be expressed during meiosis and may be involved in the defence response to stressful conditions. Q#26025 - CGI_10003971 superfamily 241574 34 123 1.55E-06 47.1953 cl00053 PTPc superfamily C - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. 
This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#26025 - CGI_10003971 superfamily 241574 240 308 4.45E-05 42.9582 cl00053 PTPc superfamily N - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#26027 - CGI_10003973 superfamily 110440 128 154 0.000697326 35.0761 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#26029 - CGI_10006607 superfamily 215724 7 321 2.31E-170 478.652 cl14706 wnt superfamily - - "wnt family; Wnt genes have been identified in vertebrates and invertebrates but not in plants, unicellular eukaryotes or prokaryotes. In humans, 19 WNT proteins are known. 
Because of their insolubility little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathway) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families." Q#26031 - CGI_10006609 superfamily 247986 44 269 6.11E-13 65.4722 cl17432 PBPb superfamily - - "Bacterial periplasmic transport systems use membrane-bound complexes and substrate-bound, membrane-associated, periplasmic binding proteins (PBPs) to transport a wide variety of substrates, such as, amino acids, peptides, sugars, vitamins and inorganic ions. PBPs have two cell-membrane translocation functions: bind substrate, and interact with the membrane bound complex. A diverse group of periplasmic transport receptors for lysine/arginine/ornithine (LAO), glutamine, histidine, sulfate, phosphate, molybdate, and methanol are included in the PBPb CD." Q#26033 - CGI_10006611 superfamily 149105 191 272 0.00798519 36.6441 cl12353 TMPIT superfamily C - "TMPIT-like protein; A number of members of this family are annotated as being transmembrane proteins induced by tumour necrosis factor alpha, but no literature was found to support this." Q#26035 - CGI_10003936 superfamily 247856 608 666 7.20E-06 44.8461 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#26036 - CGI_10003937 superfamily 247724 15 175 1.16E-57 180.85 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. 
The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#26037 - CGI_10003938 superfamily 247724 1 107 5.95E-31 109.974 cl17170 Ras_like_GTPase superfamily N - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#26038 - CGI_10003939 superfamily 247724 15 176 7.04E-66 202.421 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. 
Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#26039 - CGI_10003940 superfamily 247724 7 76 1.34E-24 91.0988 cl17170 Ras_like_GTPase superfamily N - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#26040 - CGI_10003941 superfamily 241874 21 566 0 689.603 cl00456 SLC5-6-like_sbd superfamily - - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. 
SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#26041 - CGI_10003942 superfamily 148169 60 263 5.88E-30 112.515 cl05744 Alpha-2-MRAP_C superfamily - - "Alpha-2-macroglobulin RAP, C-terminal domain; The alpha-2-macroglobulin receptor-associated protein (RAP) is an intracellular glycoprotein that binds to the alpha-2-macroglobulin receptor and other members of the low density lipoprotein receptor family. The protein inhibits binding of all currently known ligands of these receptors. Two different studies have provided conflicting domain boundaries." Q#26042 - CGI_10003943 superfamily 246680 6 88 1.65E-22 86.5303 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 
They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#26044 - CGI_10000002 superfamily 247684 18 110 6.39E-54 174.503 cl17037 NBD_sugar-kinase_HSP70_actin superfamily N - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#26045 - CGI_10000003 superfamily 241600 1 74 4.47E-20 79.9771 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. 
The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#26051 - CGI_10000012 superfamily 245839 12 130 9.13E-53 167.782 cl12020 Anticodon_Ia_like superfamily - - "Anticodon-binding domain of class Ia aminoacyl tRNA synthetases and similar domains; This domain is found in a variety of class Ia aminoacyl tRNA synthetases, C-terminal to the catalytic core domain. It recognizes and specifically binds to the anticodon of the tRNA. Aminoacyl tRNA synthetases catalyze the transfer of cognate amino acids to the 3'-end of their tRNAs by specifically recognizing cognate from non-cognate amino acids. Members include valyl-, leucyl-, isoleucyl-, cysteinyl-, arginyl-, and methionyl-tRNA synthetases. This superfamily also includes a domain from MshC, an enzyme in the mycothiol biosynthetic pathway." Q#26051 - CGI_10000012 superfamily 245839 106 156 1.54E-05 42.5385 cl12020 Anticodon_Ia_like superfamily N - "Anticodon-binding domain of class Ia aminoacyl tRNA synthetases and similar domains; This domain is found in a variety of class Ia aminoacyl tRNA synthetases, C-terminal to the catalytic core domain. It recognizes and specifically binds to the anticodon of the tRNA. Aminoacyl tRNA synthetases catalyze the transfer of cognate amino acids to the 3'-end of their tRNAs by specifically recognizing cognate from non-cognate amino acids. Members include valyl-, leucyl-, isoleucyl-, cysteinyl-, arginyl-, and methionyl-tRNA synthetases. This superfamily also includes a domain from MshC, an enzyme in the mycothiol biosynthetic pathway." 
Q#26052 - CGI_10000013 superfamily 243184 52 145 4.29E-31 109.741 cl02786 Translation_factor_III superfamily - - "Domain III of Elongation factor (EF) Tu (EF-TU) and EF-G. Elongation factors (EF) EF-Tu and EF-G participate in the elongation phase during protein biosynthesis on the ribosome. Their functional cycles depend on GTP binding and its hydrolysis. The EF-Tu complexed with GTP and aminoacyl-tRNA delivers tRNA to the ribosome, whereas EF-G stimulates translocation, a process in which tRNA and mRNA movements occur in the ribosome. Experimental data showed that: (1) intrinsic GTPase activity of EF-G is influenced by excision of its domain III; (2) that EF-G lacking domain III has a 1,000-fold decreased GTPase activity on the ribosome and, a slightly decreased affinity for GTP; and (3) EF-G lacking domain III does not stimulate translocation, despite the physical presence of domain IV which is also very important for translocation. These findings indicate an essential contribution of domain III to activation of GTP hydrolysis. Domains III and V of EF-G have the same fold (although they are not completely superimposable), the double split beta-alpha-beta fold. This fold is observed in a large number of ribonucleotide binding proteins and is also referred to as the ribonucleoprotein (RNP) or RNA recognition (RRM) motif. This domain III is found in several elongation factors, as well as in peptide chain release factors and in GT-1 family of GTPase (GTPBP1)." Q#26052 - CGI_10000013 superfamily 243185 3 47 4.95E-15 66.7623 cl02787 Translation_Factor_II_like superfamily N - "Translation_Factor_II_like: Elongation factor Tu (EF-Tu) domain II-like proteins. Elongation factor Tu consists of three structural domains, this family represents the second domain. Domain II adopts a beta barrel structure and is involved in binding to charged tRNA. Domain II is found in other proteins such as elongation factor G and translation initiation factor IF-2. 
This group also includes the C2 subdomain of domain IV of IF-2 that has the same fold as domain II of (EF-Tu). Like IF-2 from certain prokaryotes such as Thermus thermophilus, mitochondrial IF-2 lacks domain II, which is thought to be involved in binding of E.coli IF-2 to 30S subunits." Q#26053 - CGI_10000015 superfamily 149349 2 90 1.23E-17 71.8922 cl07026 SRI superfamily - - "SRI (Set2 Rpb1 interacting) domain; The SRI (Set2 Rpb1 interacting) domain mediates RNA polymerase II interaction and couples histone H3 K36 methylation with transcript elongation. This domain is conserved from yeast to humans. Members of this family form a compact, closed three-helix bundle, with an up-down-up topology. The first and second helices are antiparallel to each other and are of similar length; the third helix, which is packed across helices alpha1 and alpha2 is slightly shorter, consisting of only 15 amino acids. Most conserved hydrophobic residues are largely buried in the interior of the structure and form an extensive and contiguous hydrophobic core that stabilises the packing of the three-helix bundle. This domain mediates RNA polymerase II interaction and couples histone H3 K36 methylation with transcript elongation." Q#26054 - CGI_10000016 superfamily 243035 1 77 9.64E-15 64.1781 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. 
For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#26055 - CGI_10000017 superfamily 242113 1 117 2.22E-26 99.7086 cl00814 Cyclase superfamily N - Putative cyclase; Proteins in this family are thought to be cyclase enzymes. They are found in proteins involved in antibiotic synthesis. However they are also found in organisms that do not make antibiotics pointing to a wider role for these proteins. The proteins contain a conserved motif HXGTHXDXPXH that is likely to form part of the active site. Q#26058 - CGI_10000019 superfamily 222090 50 141 9.79E-08 48.423 cl18636 Methyltransf_22 superfamily N - Methyltransferase domain; This family appears to be a methyltransferase domain. Q#26059 - CGI_10000021 superfamily 245864 50 105 1.57E-10 56.1326 cl12078 p450 superfamily N - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." 
Q#26064 - CGI_10000029 superfamily 220672 5 63 3.37E-09 50.323 cl10957 Frag1 superfamily C - "Frag1/DRAM/Sfk1 family; This family includes Frag1, DRAM and Sfk1 proteins. Frag1 (FGF receptor activating protein 1) is a protein that is conserved from fungi to humans. There are four potential iso-prenylation sites throughout the peptide, viz CILW, CIIW and CIGL. Frag1 is a membrane-spanning protein that is ubiquitously expressed in adult tissues suggesting an important cellular function. Dram is a family of proteins conserved from nematodes to humans with six hydrophobic transmembrane regions and an Endoplasmic Reticulum signal peptide. It is a lysosomal protein that induces macro-autophagy as an effector of p53-mediated death, where p53 is the tumour-suppressor gene that is frequently mutated in cancer. Expression of Dram is stress-induced. This region is also part of a family of small plasma membrane proteins, referred to as Sfk1, that may act together with or upstream of Stt4p to generate normal levels of the essential phospholipid PI4P, thus allowing proper localisation of Stt4p to the actin cytoskeleton." Q#26065 - CGI_10000030 superfamily 246722 1 83 2.61E-58 177.816 cl14812 PIN_SF superfamily N - "PIN (PilT N terminus) domain: Superfamily; PIN_SF The PIN (PilT N terminus) domain belongs to a large nuclease superfamily with representatives from eukaryota, eubacteria, and archaea. PIN domains were originally named for their sequence similarity to the N-terminal domain of an annotated pili biogenesis protein, PilT, a domain fusion between a PIN-domain and a PilT ATPase domain. 
The structural properties of the PIN domain indicate its putative active center, consisting of invariant acidic amino acid residues (putative metal-binding residues) is geometrically similar in the active center of structure-specific 5' nucleases (also known as Flap endonuclease-1-like), PIN-domain ribonucleases of eukaryotic rRNA editing proteins, and bacterial toxins of toxin-antitoxin (TA) operons. Seen here, are two major divisions in the PIN domain superfamily. The first major division, the structure-specific 5' nuclease family, is represented by FEN1, the 5'-3' exonuclease of DNA polymerase I, and T4 RNase H nuclease PIN domains. These 5' nucleases are involved in DNA replication, repair, and recombination. They are capable of both 5'-3' exonucleolytic activity and cleaving bifurcated DNA, in an endonucleolytic, structure-specific manner. Unique to FEN1-like nucleases, the PIN domain has a helical arch/clamp region (I domain) of variable length (approximately 16 to 800 residues) and, inserted within the C-terminal region of the PIN domain, a H3TH (helix-3-turn-helix) domain, an atypical helix-hairpin-helix-2-like region. Both the H3TH domain (not included here) and the helical arch/clamp region are involved in DNA binding. With the exception of Mkt1, these nucleases have a carboxylate rich active site that is involved in binding essential divalent metal ion cofactors (Mg2+, Mn2+, Zn2+, or Co2+). The second major division of the PIN domain superfamily, the VapC-Smg6 family, includes such eukaryotic ribonucleases as, Smg6, an essential factor in nonsense-mediated mRNA decay; Rrp44, the catalytic subunit of the exosome; and Nob1, a ribosome assembly factor critical in pre-rRNA processing. A large percentage of members in this family are bacterial ribonuclease toxins of TA operons such as Mycobacterium tuberculosis VapC and Neisseria gonorrhoeae FitB, as well as, archaeal homologs, Pyrobaculum aerophilum Pea0151 and P. aerophilum Pae2754. 
Also included are the eukaryotic Fcf1/ Utp24 (FAF1-copurifying factor 1/U three-associated protein 24) and Utp23-like proteins. Components of the small subunit processome, Fcf1/Utp24 and Utp23 are essential proteins involved in pre-rRNA processing and 40S ribosomal subunit assembly." Q#26067 - CGI_10000032 superfamily 247724 11 158 2.00E-41 139.216 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#26068 - CGI_10000033 superfamily 215779 1 61 1.03E-19 77.5511 cl02819 Ribosomal_S3_C superfamily N - "Ribosomal protein S3, C-terminal domain; This family contains a central domain pfam00013, hence the amino and carboxyl terminal domains are stored separately. This is a minimal carboxyl-terminal domain. Some are much longer." 
Q#26070 - CGI_10000035 superfamily 241599 177 227 1.42E-10 55.7125 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#26073 - CGI_10000038 superfamily 241782 1 147 1.09E-52 175.647 cl00321 AAT_I superfamily N - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phosphorylase family (fold type V)." Q#26075 - CGI_10000041 superfamily 241552 3 47 8.73E-11 52.3756 cl00017 Cyt_c_Oxidase_VIa superfamily N - "Cytochrome c oxidase subunit VIa. 
Cytochrome c oxidase (CcO), the terminal oxidase in the respiratory chains of eukaryotes and most bacteria, is a multi-chain transmembrane protein located in the inner membrane of mitochondria and the cell membrane of prokaryotes. It catalyzes the reduction of O2 and simultaneously pumps protons across the membrane. The number of subunits varies from three to five in bacteria and up to 13 in mammalian mitochondria. Subunits I, II, and III of mammalian CcO are encoded within the mitochondrial genome and the remaining 10 subunits are encoded within the nuclear genome. Found only in eukaryotes, subunit VIa is expressed in two tissue-specific isoforms in mammals but not fish. VIa-H is the heart and skeletal muscle isoform; VIa-L is the liver or non-muscle isoform. Mammalian VIa-H induces a slip in CcO (decrease in proton/electron stoichiometry) at high intramitochondrial ATP/ADP ratios, while VIa-L induces a permanent slip in CcO, depending on the presence of cardiolipin and palmitate." Q#26076 - CGI_10000040 superfamily 245864 3 35 3.65E-10 53.8214 cl12078 p450 superfamily N - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. 
While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#26078 - CGI_10000044 superfamily 110440 85 111 0.00619405 32.3797 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#26080 - CGI_10000045 superfamily 241832 15 103 2.48E-32 109.937 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." 
Q#26082 - CGI_10000047 superfamily 192703 1 133 1.41E-60 185.552 cl12637 YbhQ superfamily - - Putative inner membrane protein YbhQ; This family is conserved in Proteobacteria. The function is not known but most members are annotated as being inner membrane protein YbhQ. Q#26083 - CGI_10000049 superfamily 242042 1 67 2.24E-41 131.032 cl00712 RNA_pol_N superfamily - - RNA polymerases N / 8 kDa subunit; RNA polymerases N / 8 kDa subunit. Q#26085 - CGI_10000050 superfamily 221911 1 45 7.46E-12 58.3539 cl18625 Fer2_3 superfamily N - 2Fe-2S iron-sulfur cluster binding domain; The 2Fe-2S ferredoxin family have a general core structure consisting of beta(2)-alpha-beta(2) which is a beta-grasp type fold. The domain is around one hundred amino acids with four conserved cysteine residues to which the 2Fe-2S cluster is ligated. Q#26086 - CGI_10000052 superfamily 241874 1 212 2.06E-46 161.848 cl00456 SLC5-6-like_sbd superfamily C - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease.
They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#26090 - CGI_10000055 superfamily 241592 1 136 1.63E-82 240.962 cl00074 H2A superfamily - - "Histone 2A; H2A is a subunit of the nucleosome. The nucleosome is an octamer containing two H2A, H2B, H3, and H4 subunits. The H2A subunit performs essential roles in maintaining structural integrity of the nucleosome, chromatin condensation, and binding of specific chromatin-associated proteins." Q#26091 - CGI_10000056 superfamily 243058 4 89 0.00922513 31.516 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#26092 - CGI_10000057 superfamily 222090 51 238 2.75E-17 77.313 cl18636 Methyltransf_22 superfamily N - Methyltransferase domain; This family appears to be a methyltransferase domain. Q#26093 - CGI_10000058 superfamily 203444 19 59 2.46E-15 66.5527 cl05761 PRP1_N superfamily C - "PRP1 splicing factor, N-terminal; This domain is specific to the N-terminal part of the prp1 splicing factor, which is involved in mRNA splicing (and possibly also poly(A)+ RNA nuclear export and cell cycle progression). This domain is specific to the N terminus of the RNA splicing factor encoded by prp1. 
It is involved in mRNA splicing and possibly also poly(A)+ RNA nuclear export and cell cycle progression." Q#26097 - CGI_10000062 superfamily 243064 19 62 1.53E-05 38.1087 cl02512 NTR_like superfamily C - "NTR_like domain; a beta barrel with an oligosaccharide/oligonucleotide-binding fold found in netrins, complement proteins, tissue inhibitors of metalloproteases (TIMP), and procollagen C-proteinase enhancers (PCOLCE), amongst others. In netrins, the domain plays a role in controlling axon branching in neural development, while the common function of these modules in TIMPs appears to be binding to metzincins. A subset of this family is also known as the C345C domain because it occurs as a C-terminal domain in complement C3, C4 and C5. In C5, the domain interacts with various partners during the formation of the membrane attack complex." Q#26098 - CGI_10000063 superfamily 241594 1 158 1.22E-69 216.277 cl00077 HECTc superfamily N - "HECT domain; C-terminal catalytic domain of a subclass of Ubiquitin-protein ligase (E3). It binds specific ubiquitin-conjugating enzymes (E2), accepts ubiquitin from E2, transfers ubiquitin to substrate lysine side chains, and transfers additional ubiquitin molecules to the end of growing ubiquitin chains." Q#26099 - CGI_10000064 superfamily 243141 1 35 4.35E-06 39.9922 cl02687 RWD superfamily C - "RWD domain; This domain was identified in WD40 repeat proteins and Ring finger domain proteins. The function of this domain is unknown. GCN2 is the alpha-subunit of the only translation initiation factor (eIF2 alpha) kinase that appears in all eukaryotes. Its function requires an interaction with GCN1 via the domain at its N-terminus, which is termed the RWD domain after three major RWD-containing proteins: RING finger-containing proteins, WD-repeat-containing proteins, and yeast DEAD (DEXD)-like helicases.
The structure forms an alpha + beta sandwich fold consisting of two layers: a four-stranded antiparallel beta-sheet, and three side-by-side alpha-helices." Q#26100 - CGI_10000067 superfamily 247044 1 94 9.34E-33 112.344 cl15697 ADF_gelsolin superfamily - - Actin depolymerization factor/cofilin- and gelsolin-like domains; Actin depolymerization factor/cofilin-like domains are present in a family of essential eukaryotic actin regulatory proteins; these proteins enhance the turnover rate of actin and interact with actin monomers as well as actin filaments. Q#26101 - CGI_10000069 superfamily 248289 39 96 0.00122727 34.3219 cl17735 VWC superfamily - - von Willebrand factor type C domain; The high cutoff was used to prevent overlap with pfam00094. Q#26102 - CGI_10000070 superfamily 248469 47 153 1.45E-13 63.9283 cl17915 HAD_like superfamily - - "Haloacid dehalogenase-like hydrolases. The haloacid dehalogenase-like (HAD) superfamily includes L-2-haloacid dehalogenase, epoxide hydrolase, phosphoserine phosphatase, phosphomannomutase, phosphoglycolate phosphatase, P-type ATPase, and many others, all of which use a nucleophilic aspartate in their phosphoryl transfer reaction. All members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. Members of this superfamily are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases." Q#26104 - CGI_10000071 superfamily 245010 2 88 1.80E-15 66.6417 cl09111 Prefoldin superfamily - - "Prefoldin is a hexameric molecular chaperone complex, found in both eukaryotes and archaea, that binds and stabilizes newly synthesized polypeptides allowing them to fold correctly. The complex contains two alpha and four beta subunits, the two subunits being evolutionarily related. 
In archaea, there is usually only one gene for each subunit while in eukaryotes there are two or more paralogous genes encoding each subunit, adding heterogeneity to the structure of the hexamer. The structure of the complex consists of a double beta barrel assembly with six protruding coiled-coils." Q#26106 - CGI_10000073 superfamily 245206 4 274 1.29E-103 303.775 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#26107 - CGI_10000074 superfamily 245013 13 63 0.00886898 31.8469 cl09115 Ribosomal_L32p superfamily C - Ribosomal L32p protein family; Ribosomal L32p protein family. Q#26108 - CGI_10000075 superfamily 194545 1 89 2.08E-62 186.166 cl03131 Dynein_light superfamily - - Dynein light chain type 1; Dynein light chain type 1. Q#26111 - CGI_10000078 superfamily 247741 1 131 1.02E-69 215.196 cl17187 Aldolase_Class_I superfamily N - "Class I aldolases; Class I aldolases. The class I aldolases use an active-site lysine which stabilizes a reaction intermediate via Schiff base formation, and have TIM beta/alpha barrel fold.
The members of this family include 2-keto-3-deoxy-6-phosphogluconate (KDPG) and 2-keto-4-hydroxyglutarate (KHG) aldolases, transaldolase, dihydrodipicolinate synthase sub-family, Type I 3-dehydroquinate dehydratase, DeoC and DhnA proteins, and metal-independent fructose-1,6-bisphosphate aldolase. Although structurally similar, the class II aldolases use a different mechanism and are believed to have an independent evolutionary origin." Q#26112 - CGI_10000080 superfamily 242849 3 76 4.29E-27 96.1188 cl02041 Cyt-b5 superfamily - - Cytochrome b5-like Heme/Steroid binding domain; This family includes heme binding domains from a diverse range of proteins. This family also includes proteins that bind to steroids. The family includes progesterone receptors. Many members of this subfamily are membrane anchored by an N-terminal transmembrane alpha helix. This family also includes a domain in some chitin synthases. There is no known ligand for this domain in the chitin synthases. Q#26115 - CGI_10000082 superfamily 198827 1 89 6.91E-56 169.532 cl03803 BAF superfamily - - Barrier to autointegration factor; The BAF protein has a SAM-domain-like bundle of orthogonally packed alpha-hairpins - one classic and one pseudo helix-hairpin-helix motif. The protein is involved in the prevention of retroviral DNA integration. Q#26116 - CGI_10000077 superfamily 241788 16 106 1.67E-14 64.4342 cl00327 Ribosomal_L22 superfamily C - "Ribosomal protein L22/L17e. L22 (L17 in eukaryotes) is a core protein of the large ribosomal subunit. It is the only ribosomal protein that interacts with all six domains of 23S rRNA, and is one of the proteins important for directing the proper folding and stabilizing the conformation of 23S rRNA. L22 is the largest protein contributor to the surface of the polypeptide exit channel, the tunnel through which the polypeptide product passes. L22 is also one of six proteins located at the putative translocon binding site on the exterior surface of the ribosome." 
Q#26117 - CGI_10000083 superfamily 243146 68 115 1.52E-08 46.7827 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#26118 - CGI_10000084 superfamily 191369 1 109 5.72E-27 99.8741 cl05372 DUF837 superfamily C - Protein of unknown function (DUF837); This family consists of several eukaryotic proteins of unknown function. One of the family members is a circulating cathodic antigen (CCA) found in Schistosoma mansoni (Blood fluke). Q#26119 - CGI_10000085 superfamily 202823 1 47 3.71E-15 65.2529 cl08408 Ribosomal_L2_C superfamily N - "Ribosomal Proteins L2, C-terminal domain; Ribosomal Proteins L2, C-terminal domain. " Q#26120 - CGI_10000086 superfamily 247866 37 125 6.83E-09 50.5288 cl17312 PhyH superfamily C - "Phytanoyl-CoA dioxygenase (PhyH); This family is made up of several eukaryotic phytanoyl-CoA dioxygenase (PhyH) proteins, ectoine hydroxylases and a number of bacterial deoxygenases. PhyH is a peroxisomal enzyme catalyzing the first step of phytanic acid alpha-oxidation. PhyH deficiency causes Refsum's disease (RD) which is an inherited neurological syndrome biochemically characterized by the accumulation of phytanic acid in plasma and tissues." Q#26122 - CGI_10000088 superfamily 241600 54 187 7.42E-55 175.507 cl00085 FReD superfamily C - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. 
C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#26125 - CGI_10000093 superfamily 241888 86 216 5.46E-40 139.053 cl00473 BI-1-like superfamily C - "BAX inhibitor (BI)-1/YccA-like protein family; Mammalian members of the BAX inhibitor (BI)-1 like family of small transmembrane proteins have been shown to have an antiapoptotic effect either by stimulating the antiapoptotic function of Bcl-2, a well-characterized oncogene, or by inhibiting the proapoptotic effect of Bax, another member of the Bcl-2 family. Their broad tissue distribution and high degree of conservation suggests an important regulatory role. This superfamily also contains the lifeguard(LFG)-like proteins and other subfamilies which appear to be related by common descent and also function as inhibitors of apoptosis. In plants, BI-1 like proteins play a role in pathogen resistance. A prokaryotic member, Escherichia coli YccA, has been shown to interact with ATP-dependent protease FtsH, which degrades abnormal membrane proteins as part of a quality control mechanism to keep the integrity of biological membranes." Q#26129 - CGI_10000098 superfamily 247057 40 128 7.87E-44 140.951 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. 
They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#26130 - CGI_10000100 superfamily 247755 1 53 1.51E-24 91.8883 cl17201 ABC_ATPase superfamily N - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#26135 - CGI_10000110 superfamily 245864 3 179 3.30E-10 56.903 cl12078 p450 superfamily C - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. 
While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#26137 - CGI_10000112 superfamily 243778 45 68 0.00010507 37.2035 cl04503 HA2 superfamily C - "Helicase associated domain (HA2); This presumed domain is about 90 amino acid residues in length. It is found in a diverse set of RNA helicases. Its function is unknown, however it seems likely to be involved in nucleic acid binding." Q#26141 - CGI_10000113 superfamily 219316 1 154 3.80E-62 192.436 cl06268 B9-C2 superfamily - - "Ciliary basal body-associated, B9 protein; The B9-C2 domain is found in proteins associated with the ciliary basal body. B9 domains were identified as a specific family of C2 domains. There are three sub-families represented by this family, notably, Mks1-Xbx7, Stumpy-Tza1 and Tza2 groups of proteins. Mutations in human Mks1 result in the developmental disorder Meckel-Gruber syndrome; mutations in mouse Stumpy lead to perinatal hydrocephalus and severe polycystic kidney disease. All the three distinct types of B9-C2 proteins cooperatively localise to the basal body or centrosome of cilia." Q#26142 - CGI_10000115 superfamily 241879 93 339 1.37E-24 99.9388 cl00462 TM_ABC_iron-siderophores_like superfamily - - "Transmembrane subunit (TM), of Periplasmic Binding Protein (PBP)-dependent ATP-Binding Cassette (ABC) transporters involved in the uptake of siderophores, heme, vitamin B12, or the divalent cations Mg2+ and Zn2+. PBP-dependent ABC transporters consist of a PBP, two TMs, and two cytoplasmic ABCs, and are mainly involved in importing solutes from the environment. The solute is captured by the PBP which delivers it to a gated translocation pathway formed by the two TMs. The TMs are bundles of alpha helices that transverse the cytoplasmic membrane multiple times.
The two ABCs bind and hydrolyze ATP and drive the transport reaction. Each TM has a prominent cytoplasmic loop which contacts an ABC and represents a conserved motif. The two TMs form either a homodimer (e.g. in the case of the BtuC subunits of the Escherichia coli BtuCD vitamin B12 transporter), a heterodimer (e.g. the TroC and TroD subunits of the Treponema pallidum general transition metal transporter, TroBCD), or a pseudo-heterodimer (e.g. the FhuB protein of the E. coli ferrichrome transporter, FhuBC). FhuB contains two tandem TMs which associate to form the pseudo-heterodimer. Both FhuB TMs are found in this hierarchy." Q#26148 - CGI_10000124 superfamily 243175 53 172 1.14E-59 183.989 cl02776 GST_C_family superfamily - - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. 
Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." Q#26148 - CGI_10000124 superfamily 241832 2 45 1.43E-20 81.6636 cl00388 Thioredoxin_like superfamily N - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#26150 - CGI_10000126 superfamily 241584 49 107 4.25E-08 46.3355 cl00065 FN3 superfamily C - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. 
Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#26152 - CGI_10000129 superfamily 247905 1 63 2.03E-17 72.6556 cl17351 HELICc superfamily N - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#26153 - CGI_10000130 superfamily 238211 28 142 2.74E-42 141.645 cl18908 TS_Pyrimidine_HMase superfamily C - "Thymidylate synthase and pyrimidine hydroxymethylase: Thymidylate synthase (TS) and deoxycytidylate hydroxymethylase (dCMP-HMase) are homologs that catalyze analogous alkylation of C5 of pyrimidine nucleotides. Both enzymes are involved in the biosynthesis of DNA precursors and are active as homodimers. However, they exhibit distinct pyrimidine base specificities and differ in the details of their catalyzed reactions. TS is biologically ubiquitous and catalyzes the conversion of dUMP and methylene-tetrahydrofolate (CH2THF) to dTMP and dihydrofolate (DHF). It also acts as a regulator of its own expression by binding and inactivating its own RNA. Due to its key role in the de novo pathway for thymidylate synthesis and, hence, DNA synthesis, it is one of the most conserved enzymes across species and phyla. 
TS is a well-recognized target for anticancer chemotherapy, as well as a valuable new target against infectious diseases. Interestingly, in several protozoa, a single polypeptide chain codes for both, dihydrofolate reductase (DHFR) and thymidylate synthase (TS), forming a bifunctional enzyme (DHFR-TS), possibly through gene fusion at a single evolutionary point. DHFR-TS is also active as a dimer. Virus encoded dCMP-HMase catalyzes the reversible conversion of dCMP and CH2THF to hydroxymethyl-dCMP and THF. This family also includes dUMP hydroxymethylase, which is encoded by several bacteriophages that infect Bacillus subtilis, for their own protection against the host restriction system, and contain hydroxymethyl-dUMP instead of dTMP in their DNA." Q#26154 - CGI_10000131 superfamily 242559 30 371 0 535.743 cl01529 GH99_GH71_like superfamily - - "Glycoside hydrolase families 71, 99, and related domains; This superfamily of glycoside hydrolases contains families GH71 and GH99 (following the CAZY nomenclature), as well as other members with undefined function and specificity." Q#26159 - CGI_10000135 superfamily 247692 17 100 1.76E-28 107.247 cl17068 AFD_class_I superfamily NC - "Adenylate forming domain, Class I; This family includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain." Q#26161 - CGI_10000136 superfamily 198738 1 46 1.02E-20 77.6934 cl02599 Ets superfamily N - Ets-domain; Ets-domain. 
Q#26162 - CGI_10000138 superfamily 247725 1 45 8.46E-14 64.6495 cl17171 PH-like superfamily N - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting proteins to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#26162 - CGI_10000138 superfamily 247725 113 201 3.76E-12 59.7978 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting proteins to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#26163 - CGI_10000137 superfamily 241572 16 91 3.75E-17 71.886 cl00050 CYCLIN superfamily - - "Cyclin box fold. Protein binding domain functioning in cell-cycle and transcription control. Present in cyclins, TFIIB and Retinoblastoma (RB). The cyclins consist of 8 classes of cell cycle regulators that regulate cyclin dependent kinases (CDKs). TFIIB is a transcription factor that binds the TATA box. Cyclins, TFIIB and RB contain 2 copies of the domain."
Q#26165 - CGI_10000141 superfamily 218556 1 330 2.20E-33 132.427 cl05076 RRN3 superfamily N - RNA polymerase I specific transcription initiation factor RRN3; This family consists of several eukaryotic proteins which are homologous to the yeast RRN3 protein. RRN3 is one of the RRN genes specifically required for the transcription of rDNA by RNA polymerase I (Pol I) in Saccharomyces cerevisiae. Q#26166 - CGI_10000144 superfamily 221913 49 255 4.92E-67 210.089 cl18626 AAA_12 superfamily - - AAA domain; This family of domains contain a P-loop motif that is characteristic of the AAA superfamily. Many of the proteins in this family are conjugative transfer proteins. Q#26166 - CGI_10000144 superfamily 222258 2 38 9.08E-06 44.096 cl18656 AAA_30 superfamily NC - AAA domain; This family of domains contain a P-loop motif that is characteristic of the AAA superfamily. Many of the proteins in this family are conjugative transfer proteins. There is a Walker A and Walker B. Q#26167 - CGI_10000140 superfamily 241619 33 100 0.000914271 34.4801 cl00112 PAN_APPLE superfamily - - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." 
Q#26168 - CGI_10000145 superfamily 197504 1 48 6.21E-17 72.7073 cl18192 PBPe superfamily N - Eukaryotic homologues of bacterial periplasmic substrate binding proteins; Prokaryotic homologues are represented by a separate alignment: PBPb Q#26170 - CGI_10000147 superfamily 242205 1 160 2.58E-60 186.356 cl00937 Ribosomal_L21e superfamily - - Ribosomal protein L21e; Ribosomal protein L21e. Q#26173 - CGI_10000096 superfamily 147609 12 176 1.24E-17 75.4707 cl05205 p25-alpha superfamily - - "p25-alpha; This family encodes a 25 kDa protein that is phosphorylated by a Ser/Thr-Pro kinase. It has been described as a brain specific protein, but it is found in Tetrahymena thermophila." Q#26176 - CGI_10000158 superfamily 247755 2 239 9.92E-96 283.239 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#26177 - CGI_10000159 superfamily 241914 1 181 2.47E-70 215.742 cl00510 Permease superfamily - - Permease; This domain functions as a permease. In a hypothetical protein from Neisseria it is involved in L-glutamate import into the cell. In Arabidopsis ABC transporter I family member 14 it is involved in lipid transfer within the cell. Q#26178 - CGI_10000156 superfamily 242730 162 203 0.00535667 36.4703 cl01825 Phage_Mu_Gam superfamily C - Bacteriophage Mu Gam like protein; This family consists of bacterial and phage Gam proteins. 
The gam gene of bacteriophage Mu encodes a protein which protects linear double stranded DNA from exonuclease degradation in vitro and in vivo. Q#26182 - CGI_10000162 superfamily 241879 81 326 1.09E-35 131.14 cl00462 TM_ABC_iron-siderophores_like superfamily - - "Transmembrane subunit (TM), of Periplasmic Binding Protein (PBP)-dependent ATP-Binding Cassette (ABC) transporters involved in the uptake of siderophores, heme, vitamin B12, or the divalent cations Mg2+ and Zn2+. PBP-dependent ABC transporters consist of a PBP, two TMs, and two cytoplasmic ABCs, and are mainly involved in importing solutes from the environment. The solute is captured by the PBP which delivers it to a gated translocation pathway formed by the two TMs. The TMs are bundles of alpha helices that transverse the cytoplasmic membrane multiple times. The two ABCs bind and hydrolyze ATP and drive the transport reaction. Each TM has a prominent cytoplasmic loop which contacts an ABC and represents a conserved motif. The two TMs form either a homodimer (e.g. in the case of the BtuC subunits of the Escherichia coli BtuCD vitamin B12 transporter), a heterodimer (e.g. the TroC and TroD subunits of the Treponema pallidum general transition metal transporter, TroBCD), or a pseudo-heterodimer (e.g. the FhuB protein of the E. coli ferrichrome transporter, FhuBC). FhuB contains two tandem TMs which associate to form the pseudo-heterodimer. Both FhuB TMs are found in this hierarchy." Q#26183 - CGI_10000163 superfamily 239148 13 161 7.37E-73 218.648 cl02780 MIT_C superfamily - - "MIT_C; domain found C-terminal to MIT (contained within Microtubule Interacting and Trafficking molecules) domains, as well as in some bacterial proteins. The function of this domain is unknown." Q#26186 - CGI_10000164 superfamily 216363 2 57 1.02E-09 49.7762 cl08312 UPF0029 superfamily N - Uncharacterized protein family UPF0029; Uncharacterized protein family UPF0029. 
Q#26189 - CGI_10023130 superfamily 245862 258 318 1.47E-06 46.8045 cl12076 THUMP superfamily N - "THUMP domain, predicted to bind RNA; The THUMP domain is named after THioUridine synthases, RNA Methyltransferases and Pseudo-uridine synthases. It is predicted to be an RNA-binding domain and probably functions by delivering a variety of RNA modification enzymes to their targets." Q#26189 - CGI_10023130 superfamily 245862 18 105 3.38E-06 45.6489 cl12076 THUMP superfamily C - "THUMP domain, predicted to bind RNA; The THUMP domain is named after THioUridine synthases, RNA Methyltransferases and Pseudo-uridine synthases. It is predicted to be an RNA-binding domain and probably functions by delivering a variety of RNA modification enzymes to their targets." Q#26189 - CGI_10023130 superfamily 247727 323 497 4.13E-37 135.117 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#26191 - CGI_10023132 superfamily 247098 60 190 3.98E-74 222.401 cl15841 COG0229 superfamily - - "Conserved domain frequently associated with peptide methionine sulfoxide reductase [Posttranslational modification, protein turnover, chaperones]" Q#26192 - CGI_10023133 superfamily 247775 1 63 2.46E-07 46.5651 cl17221 ArsB_NhaD_permease superfamily N - "Anion permease ArsB/NhaD. These permeases have been shown to translocate sodium, arsenate, antimonite, sulfate and organic anions across biological membranes in all three kingdoms of life. 
A typical anion permease contains 8-13 transmembrane helices and can function either independently as a chemiosmotic transporter or as a channel-forming subunit of an ATP-driven anion pump." Q#26193 - CGI_10023134 superfamily 207690 127 151 3.57E-05 39.9937 cl02656 zf-RanBP superfamily - - Zn-finger in Ran binding protein and others; Zn-finger in Ran binding protein and others. Q#26194 - CGI_10023135 superfamily 241624 22 182 8.40E-49 177.904 cl00120 PP2Cc superfamily C - "Serine/threonine phosphatases, family 2C, catalytic domain; The protein architecture and deduced catalytic mechanism of PP2C phosphatases are similar to the PP1, PP2A, PP2B family of protein Ser/Thr phosphatases, with which PP2C shares no sequence similarity." Q#26195 - CGI_10023136 superfamily 241574 413 623 1.16E-89 286.019 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#26195 - CGI_10023136 superfamily 241574 674 853 1.89E-17 82.2485 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. 
Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#26196 - CGI_10023137 superfamily 241750 2 182 2.44E-33 120.371 cl00281 metallo-dependent_hydrolases superfamily N - "Superfamily of metallo-dependent hydrolases (also called amidohydrolase superfamily) is a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. The family includes urease alpha, adenosine deaminase, phosphotriesterase dihydroorotases, allantoinases, hydantoinases, AMP-, adenine and cytosine deaminases, imidazolonepropionase, aryldialkylphosphatase, chlorohydrolases, formylmethanofuran dehydrogenases and others." Q#26198 - CGI_10023139 superfamily 245864 154 259 3.66E-27 108.135 cl12078 p450 superfamily NC - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. 
While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#26200 - CGI_10023141 superfamily 243263 38 427 1.55E-55 192.238 cl02990 ASC superfamily - - Amiloride-sensitive sodium channel; Amiloride-sensitive sodium channel. Q#26201 - CGI_10023142 superfamily 243045 227 326 6.80E-16 74.5919 cl02459 PAS superfamily - - "PAS domain; PAS motifs appear in archaea, eubacteria and eukarya. Probably the most surprising identification of a PAS domain was that in EAG-like K+-channels. PAS domains have been found to bind ligands, and to act as sensors for light and oxygen in signal transduction." Q#26201 - CGI_10023142 superfamily 243045 86 156 1.41E-10 59.1839 cl02459 PAS superfamily C - "PAS domain; PAS motifs appear in archaea, eubacteria and eukarya. Probably the most surprising identification of a PAS domain was that in EAG-like K+-channels. PAS domains have been found to bind ligands, and to act as sensors for light and oxygen in signal transduction." Q#26201 - CGI_10023142 superfamily 241596 8 52 4.61E-07 47.9791 cl00081 HLH superfamily - - "Helix-loop-helix domain, found in specific DNA-binding proteins that act as transcription factors; 60-100 amino acids long. 
A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE (5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#26202 - CGI_10023143 superfamily 243161 3 60 2.42E-05 40.0702 cl02739 THAP superfamily C - "THAP domain; The THAP domain is a putative DNA-binding domain (DBD) and probably also binds a zinc ion. It features the conserved C2CH architecture (consensus sequence: Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His). Other universal features include the location of the domain at the N-termini of proteins, its size of about 90 residues, a C-terminal AVPTIF box and several other conserved residues. Orthologues of the human THAP domain have been identified in other vertebrates and probably worms and flies, but not in other eukaryotes or any prokaryotes." Q#26203 - CGI_10023144 superfamily 248097 104 134 7.19E-06 41.0966 cl17543 C1q superfamily C - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. 
Q#26205 - CGI_10023146 superfamily 241596 67 124 2.26E-11 57.2239 cl00081 HLH superfamily - - "Helix-loop-helix domain, found in specific DNA-binding proteins that act as transcription factors; 60-100 amino acids long. A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE (5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#26206 - CGI_10023147 superfamily 241758 52 237 3.79E-84 256.791 cl00292 AANH_like superfamily - - "Adenine nucleotide alpha hydrolases superfamily including N type ATP PPases, ATP sulphurylases Universal Stress Response protein and electron transfer flavoprotein (ETF). The domain forms an alpha/beta/alpha fold which binds to Adenosine nucleotide." Q#26206 - CGI_10023147 superfamily 129370 201 302 7.13E-16 72.921 cl11741 TIGR00269 superfamily - - "TIGR00269 family protein; [Hypothetical proteins, Conserved]." 
Q#26207 - CGI_10023148 superfamily 241900 93 345 2.24E-150 426.968 cl00490 EEP superfamily - - "Exonuclease-Endonuclease-Phosphatase (EEP) domain superfamily; This large superfamily includes the catalytic domain (exonuclease/endonuclease/phosphatase or EEP domain) of a diverse set of proteins including the ExoIII family of apurinic/apyrimidinic (AP) endonucleases, inositol polyphosphate 5-phosphatases (INPP5), neutral sphingomyelinases (nSMases), deadenylases (such as the vertebrate circadian-clock regulated nocturnin), bacterial cytolethal distending toxin B (CdtB), deoxyribonuclease 1 (DNase1), the endonuclease domain of the non-LTR retrotransposon LINE-1, and related domains. These diverse enzymes share a common catalytic mechanism of cleaving phosphodiester bonds; their substrates range from nucleic acids to phospholipids and perhaps proteins." Q#26208 - CGI_10023149 superfamily 243072 137 263 7.49E-25 97.8394 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#26208 - CGI_10023149 superfamily 243072 54 194 2.33E-12 63.1714 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#26208 - CGI_10023149 superfamily 243073 335 379 3.08E-06 43.9981 cl02533 SOCS superfamily - - "SOCS (suppressors of cytokine signaling) box. The SOCS box is found in the C-terminal region of CIS/SOCS family proteins (in combination with a SH2 domain), ASBs (ankyrin repeat-containing proteins with a SOCS box), SSBs (SPRY domain-containing proteins with a SOCS box), and WSBs (WD40 repeat-containing proteins with a SOCS box), as well as, other miscellaneous proteins. The function of the SOCS box is the recruitment of the ubiquitin-transferase system. The SOCS box interacts with Elongins B and C, Cullin-5 or Cullin-2, Rbx-1, and E2. Therefore, SOCS-box-containing proteins probably function as E3 ubiquitin ligases and mediate the degradation of proteins associated through their N-terminal regions." Q#26209 - CGI_10023150 superfamily 247097 106 142 0.000406166 35.1194 cl15839 ShK superfamily - - ShK domain-like; This domain is found in several C. elegans proteins. The domain is 30 amino acids long and rich in cysteine residues. There are 6 conserved cysteine positions in the domain that form three disulphide bridges. The domain is found in the potassium channel inhibitor ShK in sea anemone. Q#26211 - CGI_10023152 superfamily 243066 327 355 2.14E-05 42.1549 cl02518 BTB superfamily C - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. 
POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#26214 - CGI_10023156 superfamily 243039 1 153 2.99E-98 284.148 cl02446 MATH superfamily - - "MATH (meprin and TRAF-C homology) domain; an independent folding unit with an eight-stranded beta-sandwich structure found in meprins, TRAFs and other proteins. Meprins comprise a class of extracellular metalloproteases which are anchored to the membrane and are capable of cleaving growth factors, extracellular matrix proteins, and biologically active peptides. TRAF molecules serve as adapter proteins that link cell surface receptors of the Tumor Necrosis Factor and Interleukin-1/Toll-like families to downstream kinase cascades, which results in the activation of transcription factors and the regulation of cell survival, proliferation and stress responses in the immune and inflammatory systems. Other members include the ubiquitin ligases, TRIM37 and SPOP, and the ubiquitin-specific proteases, HAUSP and Ubp21p. A large number of uncharacterized members mostly from lineage-specific expansions in C. elegans and rice contain MATH and BTB domains, similar to SPOP. The MATH domain has been shown to bind peptide/protein substrates in TRAFs and HAUSP. It is possible that the MATH domain in other members of this superfamily also interacts with various protein substrates. The TRAF domain may also be involved in the trimerization of TRAFs. Based on homology, it is postulated that the MATH domain in meprins may be involved in its tetramer assembly and that the MATH domain, in general, may take part in diverse modular arrangements defined by adjacent multimerization domains." 
Q#26216 - CGI_10023158 superfamily 243039 319 492 2.74E-100 302.252 cl02446 MATH superfamily - - "MATH (meprin and TRAF-C homology) domain; an independent folding unit with an eight-stranded beta-sandwich structure found in meprins, TRAFs and other proteins. Meprins comprise a class of extracellular metalloproteases which are anchored to the membrane and are capable of cleaving growth factors, extracellular matrix proteins, and biologically active peptides. TRAF molecules serve as adapter proteins that link cell surface receptors of the Tumor Necrosis Factor and Interleukin-1/Toll-like families to downstream kinase cascades, which results in the activation of transcription factors and the regulation of cell survival, proliferation and stress responses in the immune and inflammatory systems. Other members include the ubiquitin ligases, TRIM37 and SPOP, and the ubiquitin-specific proteases, HAUSP and Ubp21p. A large number of uncharacterized members mostly from lineage-specific expansions in C. elegans and rice contain MATH and BTB domains, similar to SPOP. The MATH domain has been shown to bind peptide/protein substrates in TRAFs and HAUSP. It is possible that the MATH domain in other members of this superfamily also interacts with various protein substrates. The TRAF domain may also be involved in the trimerization of TRAFs. Based on homology, it is postulated that the MATH domain in meprins may be involved in its tetramer assembly and that the MATH domain, in general, may take part in diverse modular arrangements defined by adjacent multimerization domains." Q#26218 - CGI_10023160 superfamily 241754 34 704 0 1253.26 cl00286 Motor_domain superfamily - - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. 
Q#26218 - CGI_10023160 superfamily 247683 1070 1121 9.81E-31 116.745 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#26218 - CGI_10023160 superfamily 218855 731 936 1.88E-62 211.777 cl10652 Myosin_TH1 superfamily - - Myosin tail; Myosin tail. Q#26219 - CGI_10023161 superfamily 241820 11 89 3.20E-16 69.9511 cl00368 Ribosomal_S16 superfamily - - Ribosomal protein S16; Ribosomal protein S16. Q#26220 - CGI_10023162 superfamily 245847 127 184 5.39E-12 59.2881 cl12042 FA58C superfamily N - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#26224 - CGI_10023166 superfamily 241733 54 135 1.13E-39 130.038 cl00259 Sm_like superfamily - - "Sm and related proteins; The eukaryotic Sm and Sm-like (LSm) proteins associate with RNA to form the core domain of the ribonucleoprotein particles involved in a variety of RNA processing events including pre-mRNA splicing, telomere replication, and mRNA degradation. 
Members of this family share a highly conserved Sm fold containing an N-terminal helix followed by a strongly bent five-stranded antiparallel beta-sheet. Sm-like proteins exist in archaea as well as prokaryotes that form heptameric and hexameric ring structures similar to those found in eukaryotes." Q#26226 - CGI_10023168 superfamily 217617 23 69 0.00317808 33.5437 cl15988 Sulfotransfer_2 superfamily N - "Sulfotransferase family; This family includes a variety of sulfotransferase enzymes. Chondroitin 6-sulfotransferase catalyzes the transfer of sulfate to position 6 of the N-acetylgalactosamine residue of chondroitin. This family also includes Heparan sulfate 2-O-sulfotransferase (HS2ST) and Heparan sulfate 6-sulfotransferase (HS6ST). Heparan sulfate (HS) is a co-receptor for a number of growth factors, morphogens, and adhesion proteins. HS biosynthetic modifications may determine the strength and outcome of HS-ligand interactions. Mice that lack HS2ST undergo developmental failure only after midgestation, the most dramatic effect being the complete failure of kidney development. Heparan sulphate 6-O-sulfotransferase (HS6ST) catalyzes the transfer of sulphate from adenosine 3'-phosphate, 5'-phosphosulphate to the 6th position of the N-sulphoglucosamine residue in heparan sulphate." Q#26233 - CGI_10000174 superfamily 242161 1 41 2.71E-18 72.4319 cl00875 PTZ00255 superfamily NC - 60S ribosomal protein L37a; Provisional Q#26236 - CGI_10000180 superfamily 241783 1 42 3.71E-07 47.1788 cl00322 Ribosomal_L1 superfamily N - "Ribosomal protein L1. The L1 protein, located near the E-site of the ribosome, forms part of the L1 stalk along with 23S rRNA. In bacteria and archaea, L1 functions both as a ribosomal protein that binds rRNA, and as a translation repressor that binds its own mRNA. Like several other large ribosomal subunit proteins, L1 displays RNA chaperone activity. L1 is one of the largest ribosomal proteins. 
It is composed of two domains that cycle between open and closed conformations via a hinge motion. The RNA-binding site of L1 is highly conserved, with both mRNA and rRNA binding the same binding site." Q#26237 - CGI_10000175 superfamily 245205 19 96 6.78E-17 70.7297 cl09930 RPA_2b-aaRSs_OBF_like superfamily - - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." Q#26243 - CGI_10005236 superfamily 241594 68 141 0.000142987 38.8256 cl00077 HECTc superfamily N - "HECT domain; C-terminal catalytic domain of a subclass of Ubiquitin-protein ligase (E3). It binds specific ubiquitin-conjugating enzymes (E2), accepts ubiquitin from E2, transfers ubiquitin to substrate lysine side chains, and transfers additional ubiquitin molecules to the end of growing ubiquitin chains." 
Q#26245 - CGI_10005238 superfamily 218735 4 243 3.49E-16 75.3219 cl15681 IER superfamily - - "Immediate early response protein (IER); This family consists of several eukaryotic immediate early response (IER) 2 and 5 proteins. The role of IER5 is unclear although it plays an important role in mediating the cellular response to mitogenic signals. Again, little is known about the function of IER2 although it is thought to play a role in mediating the cellular responses to a variety of extracellular signals." Q#26246 - CGI_10005239 superfamily 245814 377 455 7.21E-06 44.7368 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#26246 - CGI_10005239 superfamily 245814 287 361 0.0021379 37.0997 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#26248 - CGI_10000190 superfamily 241563 18 50 0.00365979 35.3907 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#26248 - CGI_10000190 superfamily 128778 66 161 0.00853797 35.3183 cl17972 BBC superfamily C - B-Box C-terminal domain; Coiled coil region C-terminal to (some) B-Box domains Q#26249 - CGI_10000191 superfamily 247740 4 79 9.07E-34 119.521 cl17186 TIM_phosphate_binding superfamily N - "TIM barrel proteins share a structurally conserved phosphate binding motif and in general share an eight beta/alpha closed barrel structure. Specific for this family is the conserved phosphate binding site at the edges of strands 7 and 8. The phosphate comes either from the substrate, as in the case of inosine monophosphate dehydrogenase (IMPDH), or from ribulose-5-phosphate 3-epimerase (RPE) or from cofactors, like FMN." Q#26252 - CGI_10000194 superfamily 241786 3 171 5.12E-60 193.515 cl00325 Ribosomal_L4 superfamily N - Ribosomal protein L4/L1 family; This family includes Ribosomal L4/L1 from eukaryotes and archaebacteria and L4 from eubacteria. L4 from yeast has been shown to bind rRNA. Q#26252 - CGI_10000194 superfamily 222716 181 260 1.11E-26 100.372 cl16836 Ribos_L4_asso_C superfamily - - 60S ribosomal protein L4 C-terminal domain; This family is found at the very C-terminal of 60 ribosomal L4 proteins. 
Q#26255 - CGI_10000195 superfamily 246597 1 72 6.51E-24 91.5662 cl13995 MPP_superfamily superfamily NC - "metallophosphatase superfamily, metallophosphatase domain; Metallophosphatases (MPPs), also known as metallophosphoesterases, phosphodiesterases (PDEs), binuclear metallophosphoesterases, and dimetal-containing phosphoesterases (DMPs), represent a diverse superfamily of enzymes with a conserved domain containing an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. This superfamily includes: the phosphoprotein phosphatases (PPPs), Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination." 
Q#26258 - CGI_10000199 superfamily 247792 184 236 3.12E-16 70.4427 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#26258 - CGI_10000199 superfamily 214806 73 159 1.98E-14 66.1649 cl15966 CRA superfamily - - "CT11-RanBPM; protein-protein interaction domain present in crown eukaryotes (plants, animals, fungi)" Q#26258 - CGI_10000199 superfamily 128914 14 66 1.68E-09 52.1882 cl15352 CTLH superfamily - - C-terminal to LisH motif; Alpha-helical motif of unknown function. Q#26263 - CGI_10000206 superfamily 243035 76 116 7.36E-08 46.459 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#26263 - CGI_10000206 superfamily 243061 17 60 6.56E-06 40.9131 cl02509 SRCR superfamily N - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#26265 - CGI_10000210 superfamily 242031 5 183 3.47E-57 180.081 cl00689 TYW3 superfamily - - Methyltransferase TYW3; The methyltransferase TYW3 (tRNA-yW- synthesising protein 3) has been identified in yeast to be involved in wybutosine (yW) biosynthesis. yW is a complexly modified guanosine residue that contains a tricyclic base and is found at the 3' position adjacent the anticodon of phenylalanine tRNA. TYW3 is an N-4 methylase that methylates yW-86 to yield yW-72 in an Ado-Met-dependent manner. Q#26267 - CGI_10000212 superfamily 241600 7 187 2.61E-89 263.332 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#26268 - CGI_10000213 superfamily 243072 30 118 2.22E-09 52.0006 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#26268 - CGI_10000213 superfamily 149414 169 198 7.97E-05 38.4066 cl07091 TRP_2 superfamily N - Transient receptor ion channel II; This domain is found in the transient receptor ion channel (Trp) family of proteins. There is strong evidence that Trp proteins are structural elements of calcium-ion entry channels activated by G protein-coupled receptors. This domain does not tend to appear with the TRP domain (pfam06011) but is often found to the C-terminus of Ankyrin repeats (pfam00023). Q#26269 - CGI_10000214 superfamily 245847 25 167 1.34E-18 77.9821 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#26273 - CGI_10000215 superfamily 243092 42 284 1.46E-13 69.286 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed 
ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#26277 - CGI_10020228 superfamily 241571 383 502 6.23E-12 62.4298 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#26277 - CGI_10020228 superfamily 245213 345 375 0.00067863 37.6162 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#26277 - CGI_10020228 superfamily 241583 149 333 1.28E-46 161.585 cl00064 ZnMc superfamily - - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#26278 - CGI_10020229 superfamily 201778 3 114 3.32E-18 78.4046 cl18219 GFO_IDH_MocA superfamily - - "Oxidoreductase family, NAD-binding Rossmann fold; This family of enzymes utilise NADP or NAD. This family is called the GFO/IDH/MOCA family in swiss-prot." Q#26279 - CGI_10020230 superfamily 242406 704 854 5.09E-57 192.421 cl01271 DUF1768 superfamily - - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. 
The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. Q#26279 - CGI_10020230 superfamily 247742 196 488 3.34E-36 141.851 cl17188 enolase_like superfamily N - "Enolase-superfamily, characterized by the presence of an enolate anion intermediate which is generated by abstraction of the alpha-proton of the carboxylate substrate by an active site residue and is stabilized by coordination to the essential Mg2+ ion. Enolase superfamily contains different enzymes, like enolases, glutarate-, fucanate- and galactonate dehydratases, o-succinylbenzoate synthase, N-acylamino acid racemase, L-alanine-DL-glutamate epimerase, mandelate racemase, muconate lactonizing enzyme and 3-methylaspartase." Q#26281 - CGI_10020232 superfamily 241578 255 411 2.41E-26 105.837 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." 
Q#26281 - CGI_10020232 superfamily 241578 27 194 2.19E-41 147.818 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#26281 - CGI_10020232 superfamily 241578 452 623 4.65E-11 61.25 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. 
Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#26282 - CGI_10020233 superfamily 241578 115 277 4.59E-40 144.742 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#26282 - CGI_10020233 superfamily 241578 306 457 5.91E-24 99.2882 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#26282 - CGI_10020233 superfamily 241578 505 653 2.14E-09 56.0823 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." 
Q#26285 - CGI_10020236 superfamily 243054 527 706 8.93E-33 126.791 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#26285 - CGI_10020236 superfamily 243054 316 523 2.30E-27 110.998 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#26285 - CGI_10020236 superfamily 241559 67 171 4.84E-26 103.93 cl00030 CH superfamily - - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." 
Q#26285 - CGI_10020236 superfamily 243054 207 417 3.49E-14 71.7079 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#26285 - CGI_10020236 superfamily 241559 1 48 5.81E-06 45.3843 cl00030 CH superfamily N - "Calponin homology domain; actin-binding domain which may be present as a single copy or in tandem repeats (which increases binding affinity). The CH domain is found in cytoskeletal and signal transduction proteins, including actin-binding proteins like spectrin, alpha-actinin, dystrophin, utrophin, and fimbrin, proteins essential for regulation of cell shape (cortexillins), and signaling proteins (Vav)." Q#26287 - CGI_10020238 superfamily 247725 2952 3053 2.54E-40 147.741 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." 
Q#26287 - CGI_10020238 superfamily 243054 2346 2556 1.30E-37 143.74 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#26287 - CGI_10020238 superfamily 243054 1924 2133 1.28E-36 141.044 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#26287 - CGI_10020238 superfamily 243054 544 753 1.54E-33 132.184 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine 
(L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#26287 - CGI_10020238 superfamily 243054 1817 2027 8.91E-33 129.873 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#26287 - CGI_10020238 superfamily 243054 754 962 1.51E-32 129.103 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#26287 - CGI_10020238 superfamily 243054 1607 1816 2.49E-31 125.636 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an 
antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#26287 - CGI_10020238 superfamily 243054 2136 2343 1.64E-30 123.325 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#26287 - CGI_10020238 superfamily 243054 228 436 5.94E-30 121.399 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#26287 - CGI_10020238 superfamily 243054 1284 1495 5.96E-30 121.399 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in 
some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#26287 - CGI_10020238 superfamily 243054 437 646 7.89E-30 121.013 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#26287 - CGI_10020238 superfamily 243054 122 330 1.23E-28 117.547 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#26287 - CGI_10020238 superfamily 243054 1501 1710 1.29E-27 114.85 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, 
alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#26287 - CGI_10020238 superfamily 243054 860 1066 3.90E-27 113.309 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#26287 - CGI_10020238 superfamily 243054 2558 2747 1.20E-25 108.687 cl02488 SPEC superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#26287 - CGI_10020238 superfamily 243054 1074 1281 2.11E-24 105.22 cl02488 SPEC 
superfamily - - "Spectrin repeats, found in several proteins involved in cytoskeletal structure; family members include spectrin, alpha-actinin and dystrophin; the spectrin repeat forms a three helix bundle with the second helix interrupted by proline in some sequences; the repeats are independent folding units; tandem repeats are found in differing numbers and arrange in an antiparallel manner to form dimers; the repeats are defined by a characteristic tryptophan (W) residue in helix A and a leucine (L) at the carboxyl end of helix C and separated by a linker of 5 residues; two copies of the repeat are present here" Q#26287 - CGI_10020238 superfamily 247683 2 30 1.53E-05 45.3838 cl17036 SH3 superfamily N - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#26288 - CGI_10020239 superfamily 241694 3 99 6.47E-18 76.5587 cl00216 L-asparaginase_like superfamily NC - "Bacterial L-asparaginases and related enzymes; Asparaginases (amidohydrolases, E.C. 3.5.1.1) are dimeric or tetrameric enzymes that catalyze the hydrolysis of asparagine to aspartic acid and ammonia. 
In bacteria, there are two classes of amidohydrolases, one highly specific for asparagine and localized to the periplasm (type II L-asparaginase), and a second (asparaginase- glutaminase) present in the cytosol (type I L-asparaginase) that hydrolyzes both asparagine and glutamine with similar specificities and has a lower affinity for its substrate. Bacterial L-asparaginases (type II) are potent antileukemic agents and have been used in the treatment of acute lymphoblastic leukemia (ALL). A conserved threonine residue is thought to supply the nucleophile hydroxy-group that attacks the amide bond. Many bacterial L-asparaginases have both L-asparagine and L-glutamine hydrolysis activities, to a different degree, and some of them are annotated as asparaginase/glutaminase. This wider family also includes a subunit of an archaeal Glu-tRNA amidotransferase." Q#26290 - CGI_10020241 superfamily 201616 33 61 3.88E-08 45.005 cl03112 ER superfamily N - Enhancer of rudimentary; Enhancer of rudimentary is a protein of unknown function that is highly conserved in plants and animals. This protein is found to be an enhancer of the rudimentary gene. Q#26293 - CGI_10020244 superfamily 216653 99 227 5.53E-30 112.305 cl08331 Na_Ca_ex superfamily - - "Sodium/calcium exchanger protein; This is a family of sodium/calcium exchanger integral membrane proteins. This family covers the integral membrane regions of the proteins. Sodium/calcium exchangers regulate intracellular Ca2+ concentrations in many cells; cardiac myocytes, epithelial cells, neurons retinal rod photoreceptors and smooth muscle cells. Ca2+ is moved into or out of the cytosol depending on Na+ concentration. In humans and rats there are 3 isoforms; NCX1 NCX2 and NCX3." Q#26294 - CGI_10020245 superfamily 216653 60 149 7.64E-16 71.0891 cl08331 Na_Ca_ex superfamily C - "Sodium/calcium exchanger protein; This is a family of sodium/calcium exchanger integral membrane proteins. 
This family covers the integral membrane regions of the proteins. Sodium/calcium exchangers regulate intracellular Ca2+ concentrations in many cells; cardiac myocytes, epithelial cells, neurons retinal rod photoreceptors and smooth muscle cells. Ca2+ is moved into or out of the cytosol depending on Na+ concentration. In humans and rats there are 3 isoforms; NCX1 NCX2 and NCX3." Q#26296 - CGI_10020247 superfamily 241874 90 453 4.90E-164 481.405 cl00456 SLC5-6-like_sbd superfamily N - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." 
Q#26296 - CGI_10020247 superfamily 241874 11 86 1.93E-31 126.637 cl00456 SLC5-6-like_sbd superfamily C - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#26297 - CGI_10020248 superfamily 218118 147 180 0.00290619 34.5121 cl04552 CD225 superfamily NC - "Interferon-induced transmembrane protein; This family includes the human leukocyte antigen CD225, which is an interferon inducible transmembrane protein, and is associated with interferon induced cell growth suppression." 
Q#26298 - CGI_10020249 superfamily 241874 11 110 8.95E-42 144.356 cl00456 SLC5-6-like_sbd superfamily C - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#26303 - CGI_10000216 superfamily 221397 18 116 1.50E-05 45.3891 cl14983 DUF3535 superfamily NC - "Domain of unknown function (DUF3535); This presumed domain is functionally uncharacterized. This domain is found in eukaryotes. This domain is typically between 439 to 459 amino acids in length. This domain is found associated with pfam00271, pfam02985, pfam00176. This domain has two completely conserved residues (P and K) that may be functionally important." Q#26303 - CGI_10000216 superfamily 110440 352 379 0.00882847 33.9205 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. 
The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#26306 - CGI_10000220 superfamily 247856 42 98 0.000139865 35.9865 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#26306 - CGI_10000220 superfamily 247856 5 65 0.000455243 34.8309 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#26307 - CGI_10000222 superfamily 248438 34 138 1.70E-05 44.5405 cl17884 COG1214 superfamily C - "Inactive homolog of metal-dependent proteases, putative molecular chaperone [Posttranslational modification, protein turnover, chaperones]" Q#26308 - CGI_10000224 superfamily 245106 1 52 3.32E-22 83.0772 cl09615 UBA_e1_C superfamily N - Ubiquitin-activating enzyme e1 C-terminal domain; This presumed domain found at the C-terminus of Ubiquitin-activating enzyme e1 proteins is functionally uncharacterized. 
Q#26309 - CGI_10003877 superfamily 247797 9 130 9.22E-49 157.577 cl17243 PRK13975 superfamily N - thymidylate kinase; Provisional Q#26310 - CGI_10003878 superfamily 247797 4 111 4.95E-51 163.355 cl17243 PRK13975 superfamily C - thymidylate kinase; Provisional Q#26311 - CGI_10003879 superfamily 245201 80 350 8.30E-180 504.458 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#26312 - CGI_10000226 superfamily 243064 23 114 1.03E-27 100.953 cl02512 NTR_like superfamily C - "NTR_like domain; a beta barrel with an oligosaccharide/oligonucleotide-binding fold found in netrins, complement proteins, tissue inhibitors of metalloproteases (TIMP), and procollagen C-proteinase enhancers (PCOLCE), amongst others. In netrins, the domain plays a role in controlling axon branching in neural development, while the common function of these modules in TIMPs appears to be binding to metzincins. A subset of this family is also known as the C345C domain because it occurs as a C-terminal domain in complement C3, C4 and C5. In C5, the domain interacts with various partners during the formation of the membrane attack complex." Q#26314 - CGI_10000229 superfamily 241563 51 87 7.98E-05 40.5404 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#26314 - CGI_10000229 superfamily 110440 477 503 0.000237115 38.9281 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#26316 - CGI_10000235 superfamily 241659 58 129 1.38E-21 83.4667 cl00175 alpha-crystallin-Hsps_p23-like superfamily C - "alpha-crystallin domain (ACD) found in alpha-crystallin-type small heat shock proteins, and a similar domain found in p23 (a cochaperone for Hsp90) and in other p23-like proteins.; The alpha-crystallin-Hsps_p23-like superfamily includes the alpha-crystallin domain (ACD) of alpha-crystallin-type small heat shock proteins (sHsps) and a similar domain found in p23-like proteins. sHsps are small stress induced proteins with monomeric masses between 12-43 kDa, whose common feature is this ACD. sHsps are generally active as large oligomers consisting of multiple subunits, and are believed to be ATP-independent chaperones that prevent aggregation and are important in refolding in combination with other Hsps. p23 is a cochaperone of the Hsp90 chaperoning pathway. It binds Hsp90 and participates in the folding of a number of Hsp90 clients including the progesterone receptor. p23 also has a passive chaperoning activity. p23 in addition may act as the cytosolic prostaglandin E2 synthase. 
Included in this superfamily is the p23-like C-terminal CHORD-SGT1 (CS) domain of suppressor of G2 allele of Skp1 (Sgt1) and the p23-like domains of human butyrate-induced transcript 1 (hB-ind1), NUD (nuclear distribution) C, Melusin, and NAD(P)H cytochrome b5 (NCB5) oxidoreductase (OR)." Q#26316 - CGI_10000235 superfamily 149931 1 61 0.000236439 35.8923 cl07592 Siah-Interact_N superfamily - - "Siah interacting protein, N terminal; The N terminal domain of Siah interacting protein (SIP) adopts a helical hairpin structure with a hydrophobic core stabilised by a classic knobs-and-holes arrangement of side chains contributed by the two amphipathic helices. Little is known about this domain's function, except that it is crucial for interactions with Siah. It has also been hypothesised that SIP can dimerise through this N terminal domain." Q#26319 - CGI_10000241 superfamily 247724 32 123 0.00155986 35.51 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. 
The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#26320 - CGI_10000242 superfamily 247724 20 191 5.05E-46 153.963 cl17170 Ras_like_GTPase superfamily N - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#26323 - CGI_10006299 superfamily 248020 30 296 4.02E-35 133.744 cl17466 Sulfatase superfamily C - Sulfatase; Sulfatase. 
Q#26324 - CGI_10006300 superfamily 242206 364 455 9.32E-50 170.881 cl00938 Rieske superfamily - - "Rieske domain; a [2Fe-2S] cluster binding domain commonly found in Rieske non-heme iron oxygenase (RO) systems such as naphthalene and biphenyl dioxygenases, as well as in plant/cyanobacterial chloroplast b6f and mitochondrial cytochrome bc(1) complexes. The Rieske domain can be divided into two subdomains, with an incomplete six-stranded, antiparallel beta-barrel at one end, and an iron-sulfur cluster binding subdomain at the other. The Rieske iron-sulfur center contains a [2Fe-2S] cluster, which is involved in electron transfer, and is liganded to two histidine and two cysteine residues present in conserved sequences called Rieske motifs. In RO systems, the N-terminal Rieske domain of the alpha subunit acts as an electron shuttle that accepts electrons from a reductase or ferredoxin component and transfers them to the mononuclear iron in the alpha subunit C-terminal domain to be used for catalysis." Q#26324 - CGI_10006300 superfamily 215691 621 701 2.74E-11 61.0626 cl15766 Pyr_redox superfamily - - Pyridine nucleotide-disulphide oxidoreductase; This family includes both class I and class II oxidoreductases and also NADH oxidases and peroxidases. This domain is actually a small NADH binding domain within a larger FAD binding domain. Q#26324 - CGI_10006300 superfamily 248054 567 649 9.04E-08 51.9195 cl17500 NAD_binding_8 superfamily N - NAD(P)-binding Rossmann-like domain; NAD(P)-binding Rossmann-like domain. Q#26325 - CGI_10006301 superfamily 247727 55 185 1.22E-08 51.2767 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). 
There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#26325 - CGI_10006301 superfamily 221654 240 300 2.87E-21 85.8128 cl13964 WBS_methylT superfamily N - "Methyltransferase involved in Williams-Beuren syndrome; This domain family is found in eukaryotes, and is typically between 72 and 83 amino acids in length. The family is found in association with pfam08241. This family is made up of S-adenosylmethionine-dependent methyltransferases. The proteins are deleted in Williams-Beuren syndrome (WBS), a complex developmental disorder with multisystemic manifestations including supravalvular aortic stenosis (SVAS) and a specific cognitive phenotype." Q#26325 - CGI_10006301 superfamily 247727 19 89 0.000227619 40.7173 cl17173 AdoMet_MTases superfamily C - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#26326 - CGI_10006302 superfamily 243077 39 91 3.74E-13 61.0221 cl02542 DnaJ superfamily - - "DnaJ domain or J-domain. DnaJ/Hsp40 (heat shock protein 40) proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperonine family. 
Hsp40 proteins are characterized by the presence of a J domain, which mediates the interaction with Hsp70. They may contain other domains as well, and the architectures provide a means of classification." Q#26327 - CGI_10006303 superfamily 220904 2 122 6.21E-26 96.2121 cl12494 DUF2781 superfamily N - Protein of unknown function (DUF2781); This is a eukaryotic family of uncharacterized proteins. Some of the proteins in this family are annotated as membrane proteins. Q#26333 - CGI_10000246 superfamily 246748 70 126 2.57E-13 65.3065 cl14876 Zinc_peptidase_like superfamily C - "Zinc peptidases M18, M20, M28, and M42; Zinc peptidases play vital roles in metabolic and signaling pathways throughout all kingdoms of life. This family corresponds to several clans in the MEROPS database, including the MH clan, which contains 4 families (M18, M20, M28, M42). The peptidase M20 family includes carboxypeptidases such as the glutamate carboxypeptidase from Pseudomonas, the thermostable carboxypeptidase Ss1 of broad specificity from archaea and yeast Gly-X carboxypeptidase. The dipeptidases include bacterial dipeptidase, peptidase V (PepV), a eukaryotic, non-specific dipeptidase, and two Xaa-His dipeptidases (carnosinases). There is also the bacterial aminopeptidase, peptidase T (PepT) that acts only on tripeptide substrates and has therefore been termed a tripeptidase. Peptidase family M28 contains aminopeptidases and carboxypeptidases, and has co-catalytic zinc ions. However, several enzymes in this family utilize other first row transition metal ions such as cobalt and manganese. Each zinc ion is tetrahedrally co-ordinated, with three amino acid ligands plus activated water; one aspartate residue binds both metal ions. The aminopeptidases in this family are also called bacterial leucyl aminopeptidases, but are able to release a variety of N-terminal amino acids. 
IAP aminopeptidase and aminopeptidase Y preferentially release basic amino acids while glutamate carboxypeptidase II preferentially releases C-terminal glutamates. Glutamate carboxypeptidase II and plasma glutamate carboxypeptidase hydrolyze dipeptides. Peptidase families M18 and M42 contain metalloaminopeptidases. M18 is widely distributed in bacteria and eukaryotes. However, only yeast aminopeptidase I and mammalian aspartyl aminopeptidase have been characterized in detail. Some of M42 (also known as glutamyl aminopeptidase) enzymes exhibit aminopeptidase specificity while others also have acylaminoacylpeptidase activity (i.e. hydrolysis of acylated N-terminal residues)." Q#26334 - CGI_10000248 superfamily 214507 348 399 6.29E-08 48.9656 cl15307 LRRCT superfamily - - Leucine rich repeat C-terminal domain; Leucine rich repeat C-terminal domain. Q#26335 - CGI_10000256 superfamily 241884 1 128 7.10E-59 183.229 cl00467 Ntn_hydrolase superfamily N - "The Ntn hydrolases (N-terminal nucleophile) are a diverse superfamily of enzymes that are activated autocatalytically via an N-terminally located nucleophilic amino acid. N-terminal nucleophile (NTN-) hydrolase superfamily, which contains a four-layered alpha, beta, beta, alpha core structure. This family of hydrolases includes penicillin acylase, the 20S proteasome alpha and beta subunits, and glutamate synthase. The mechanism of activation of these proteins is conserved, although they differ in their substrate specificities. All known members catalyze the hydrolysis of amide bonds in either proteins or small molecules, and each one of them is synthesized as a preprotein. For each, an autocatalytic endoproteolytic process generates a new N-terminal residue. This mature N-terminal residue is central to catalysis and acts as both a polarizing base and a nucleophile during the reaction. 
The N-terminal amino group acts as the proton acceptor and activates either the nucleophilic hydroxyl in a Ser or Thr residue or the nucleophilic thiol in a Cys residue. The position of the N-terminal nucleophile in the active site and the mechanism of catalysis are conserved in this family, despite considerable variation in the protein sequences." Q#26336 - CGI_10000257 superfamily 241884 18 61 1.49E-08 47.6387 cl00467 Ntn_hydrolase superfamily NC - "The Ntn hydrolases (N-terminal nucleophile) are a diverse superfamily of enzymes that are activated autocatalytically via an N-terminally located nucleophilic amino acid. N-terminal nucleophile (NTN-) hydrolase superfamily, which contains a four-layered alpha, beta, beta, alpha core structure. This family of hydrolases includes penicillin acylase, the 20S proteasome alpha and beta subunits, and glutamate synthase. The mechanism of activation of these proteins is conserved, although they differ in their substrate specificities. All known members catalyze the hydrolysis of amide bonds in either proteins or small molecules, and each one of them is synthesized as a preprotein. For each, an autocatalytic endoproteolytic process generates a new N-terminal residue. This mature N-terminal residue is central to catalysis and acts as both a polarizing base and a nucleophile during the reaction. The N-terminal amino group acts as the proton acceptor and activates either the nucleophilic hydroxyl in a Ser or Thr residue or the nucleophilic thiol in a Cys residue. The position of the N-terminal nucleophile in the active site and the mechanism of catalysis are conserved in this family, despite considerable variation in the protein sequences." Q#26338 - CGI_10000249 superfamily 248264 2 87 2.17E-06 43.3798 cl17710 DDE_4 superfamily N - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. 
Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#26339 - CGI_10000258 superfamily 246918 115 161 2.28E-08 47.1963 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#26339 - CGI_10000258 superfamily 241610 1 25 1.53E-05 39.5893 cl00101 KU superfamily N - BPTI/Kunitz family of serine protease inhibitors; Structure is a disulfide rich alpha+beta fold. BPTI (bovine pancreatic trypsin inhibitor) is an extensively studied model structure. Q#26342 - CGI_10000265 superfamily 242173 15 155 8.33E-23 88.8482 cl00891 Cu-Zn_Superoxide_Dismutase superfamily - - "Copper/zinc superoxide dismutase (SOD). superoxide dismutases catalyse the conversion of superoxide radicals to molecular oxygen. Three evolutionarily distinct families of SODs are known, of which the copper/zinc-binding family is one. Defects in the human SOD1 gene cause familial amyotrophic lateral sclerosis (Lou Gehrig's disease). Cytoplasmic and periplasmic SODs exist as dimers, whereas chloroplastic and extracellular enzymes exist as tetramers. Structure supports independent functional evolution in prokaryotes (P-class) and eukaryotes (E-class) [PMID: 8176730]." Q#26343 - CGI_10000266 superfamily 148061 77 256 4.74E-77 234.642 cl18026 FRG1 superfamily - - "FRG1-like family; The human FRG1 gene maps to human chromosome 4q35 and has been identified as a candidate for facioscapulohumeral muscular dystrophy. Currently, the function of FRG1 is unknown." 
Q#26344 - CGI_10000267 superfamily 243519 1 121 2.07E-68 216.704 cl03757 phosphohexomutase superfamily N - "The alpha-D-phosphohexomutase superfamily includes several related enzymes that catalyze a reversible intramolecular phosphoryl transfer on their sugar substrates. Members of this family include the phosphoglucomutases (PGM1 and PGM2), phosphoglucosamine mutase (PNGM), phosphoacetylglucosamine mutase (PAGM), the bacterial phosphomannomutase ManB, the bacterial phosphoglucosamine mutase GlmM, and the bifunctional phosphomannomutase/phosphoglucomutase (PMM/PGM). These enzymes play important and diverse roles in carbohydrate metabolism in organisms from bacteria to humans. Each of these enzymes has four domains with a centrally located active site formed by four loops, one from each domain. All four domains are included in this alignment model." Q#26345 - CGI_10000268 superfamily 242885 10 68 8.49E-26 94.5866 cl02106 IF4E superfamily C - Eukaryotic initiation factor 4E; Eukaryotic initiation factor 4E. Q#26347 - CGI_10000273 superfamily 243179 76 177 8.91E-23 88.7154 cl02781 tetraspanin_LEL superfamily - - "Tetraspanin, extracellular domain or large extracellular loop (LEL). Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. The tetraspanin family contains CD9, CD63, CD37, CD53, CD82, CD151, and CD81, amongst others. Tetraspanins are involved in diverse processes such as cell activation and proliferation, adhesion and motility, differentiation, cancer, and others. Their various functions may relate to their ability to act as molecular facilitators, grouping specific cell-surface proteins and affecting formation and stability of signaling complexes. 
Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web", which may also include integrins." Q#26349 - CGI_10000271 superfamily 206009 29 61 7.20E-16 66.8078 cl16430 Clathrin_H_link superfamily N - "Clathrin-H-link; This short domain is found on clathrins, and often appears on proteins directly downstream from the Clathrin-link domain pfam09268." Q#26351 - CGI_10000283 superfamily 241670 1 93 9.46E-11 55.0567 cl00188 BPI superfamily N - "BPI/LBP/CETP domain; Bactericidal permeability-increasing protein (BPI) / Lipopolysaccharide-binding protein (LBP) / Cholesteryl ester transfer protein (CETP) domain; binds to and neutralizes lipopolysaccharides from the outer membrane of Gram-negative bacteria.; Apolar pockets on the concave surface bind a molecule of phosphatidylcholine, primarily by interacting with their acyl chains; this suggests that the pockets may also bind the acyl chains of lipopolysaccharide." Q#26355 - CGI_10000289 superfamily 241563 61 112 0.00136106 37.844 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#26356 - CGI_10004197 superfamily 215839 575 642 1.38E-06 46.7762 cl15968 GHMP_kinases_N superfamily - - "GHMP kinases N terminal domain; This family includes homoserine kinases, galactokinases and mevalonate kinases." Q#26356 - CGI_10004197 superfamily 193687 107 163 0.00439801 36.4507 cl00160 LbetaH superfamily - - "Left-handed parallel beta-Helix (LbetaH or LbH) domain: The alignment contains 5 turns, each containing three imperfect tandem repeats of a hexapeptide repeat motif (X-[STAV]-X-[LIV]-[GAED]-X). 
Proteins containing hexapeptide repeats are often enzymes showing acyltransferase activity, however, some subfamilies in this hierarchy also show activities related to ion transport or translation initiation. Many are trimeric in their active forms." Q#26357 - CGI_10004198 superfamily 217293 2 201 1.62E-42 150.861 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#26357 - CGI_10004198 superfamily 202474 219 329 1.59E-10 59.5897 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#26358 - CGI_10004199 superfamily 217293 34 240 8.43E-43 152.017 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#26358 - CGI_10004199 superfamily 202474 247 378 9.05E-09 54.5821 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#26359 - CGI_10004200 superfamily 152695 26 82 3.13E-05 39.3155 cl13667 PIP49_C superfamily C - Pancreatitis induced protein 49 C terminal; This protein is found in bacteria and eukaryotes. Proteins in this family are typically between 344 to 431 amino acids in length. This protein has a single completely conserved residue C that may be functionally important. PIP49 is a putative transmembrane protein which is induced to express during pancreatitis. Q#26360 - CGI_10004201 superfamily 152695 126 318 2.01E-20 87.0803 cl13667 PIP49_C superfamily - - Pancreatitis induced protein 49 C terminal; This protein is found in bacteria and eukaryotes. 
Proteins in this family are typically between 344 to 431 amino acids in length. This protein has a single completely conserved residue C that may be functionally important. PIP49 is a putative transmembrane protein which is induced to express during pancreatitis. Q#26361 - CGI_10004202 superfamily 219779 17 80 5.04E-08 46.4915 cl07044 DPM3 superfamily - - "Dolichol-phosphate mannosyltransferase subunit 3 (DPM3); This family corresponds to subunit 3 of dolichol-phosphate mannosyltransferase, an enzyme which generates mannosyl donors for glycosylphosphatidylinositols, N-glycan and protein O- and C-mannosylation. DPM3 is an integral membrane protein and plays a role in stabilising the dolichol-phosphate mannosyl transferase complex." Q#26362 - CGI_10004203 superfamily 222150 451 476 8.55E-05 41.9937 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#26362 - CGI_10004203 superfamily 222150 541 566 0.00309464 37.3713 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#26362 - CGI_10004203 superfamily 246975 557 577 0.00390938 37.3265 cl15478 zf-C2H2 superfamily - - "Zinc finger, C2H2 type; The C2H2 zinc finger is the classical zinc finger domain. The two conserved cysteines and histidines co-ordinate a zinc ion. The following pattern describes the zinc finger. #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C] Where X can be any amino acid, and numbers in brackets indicate the number of residues. The positions marked # are those that are important for the stable fold of the zinc finger. The final position can be either his or cys. The C2H2 zinc finger is composed of two short beta strands followed by an alpha helix. The amino terminal part of the helix binds the major groove in DNA binding zinc fingers. 
The accepted consensus binding sequence for Sp1 is usually defined by the asymmetric hexanucleotide core GGGCGG but this sequence does not include, among others, the GAG (=CTC) repeat that constitutes a high-affinity site for Sp1 binding to the wt1 promoter." Q#26362 - CGI_10004203 superfamily 222150 569 594 0.0049237 36.9861 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#26364 - CGI_10004205 superfamily 247739 6 76 1.89E-06 42.6433 cl17185 LPLAT superfamily N - "Lysophospholipid acyltransferases (LPLATs) of glycerophospholipid biosynthesis; Lysophospholipid acyltransferase (LPLAT) superfamily members are acyltransferases of de novo and remodeling pathways of glycerophospholipid biosynthesis. These proteins catalyze the incorporation of an acyl group from either acylCoAs or acyl-acyl carrier proteins (acylACPs) into acceptors such as glycerol 3-phosphate, dihydroxyacetone phosphate or lyso-phosphatidic acid. Included in this superfamily are LPLATs such as glycerol-3-phosphate 1-acyltransferase (GPAT, PlsB), 1-acyl-sn-glycerol-3-phosphate acyltransferase (AGPAT, PlsC), lysophosphatidylcholine acyltransferase 1 (LPCAT-1), lysophosphatidylethanolamine acyltransferase (LPEAT, also known as, MBOAT2, membrane-bound O-acyltransferase domain-containing protein 2), lipid A biosynthesis lauroyl/myristoyl acyltransferase, 2-acylglycerol O-acyltransferase (MGAT), dihydroxyacetone phosphate acyltransferase (DHAPAT, also known as 1 glycerol-3-phosphate O-acyltransferase 1) and Tafazzin (the protein product of the Barth syndrome (TAZ) gene)." Q#26365 - CGI_10000294 superfamily 247069 14 43 0.00155365 36.456 cl15787 SEC14 superfamily NC - "Sec14p-like lipid-binding domain. Found in secretory proteins, such as S. cerevisiae phosphatidylinositol transfer protein (Sec14p), and in lipid regulated proteins such as RhoGAPs, RhoGEFs and neurofibromin (NF1). SEC14 domain of Dbl is known to associate with G protein beta/gamma subunits." 
Q#26367 - CGI_10000295 superfamily 248264 85 202 7.80E-08 50.3134 cl17710 DDE_4 superfamily C - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#26369 - CGI_10000296 superfamily 243092 3 98 9.27E-07 45.0184 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#26371 - CGI_10000299 superfamily 241568 17 42 0.000395032 33.5904 cl00043 CCP superfamily N - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#26371 - CGI_10000299 superfamily 243035 53 73 0.000854508 33.8706 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. 
CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#26372 - CGI_10000300 superfamily 247805 233 327 9.83E-08 51.184 cl17251 DEXDc superfamily C - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 
Q#26372 - CGI_10000300 superfamily 217017 492 578 0.00833588 38.1273 cl17780 Herpes_ori_bp superfamily NC - "Origin of replication binding protein; This Pfam family represents the herpesvirus origin of replication binding protein, probably involved in DNA replication." Q#26374 - CGI_10000301 superfamily 241563 62 102 0.000446566 38.2292 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#26375 - CGI_10000303 superfamily 241607 57 71 0.00263411 31.5363 cl00097 KAZAL_FS superfamily C - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chymotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. 
The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters) , which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#26376 - CGI_10000304 superfamily 247805 47 82 3.68E-14 64.4281 cl17251 DEXDc superfamily C - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#26378 - CGI_10000307 superfamily 241786 2 191 9.27E-43 144.162 cl00325 Ribosomal_L4 superfamily - - Ribosomal protein L4/L1 family; This family includes Ribosomal L4/L1 from eukaryotes and archaebacteria and L4 from eubacteria. L4 from yeast has been shown to bind rRNA. Q#26381 - CGI_10018665 superfamily 245040 65 87 0.000116897 37.0684 cl09238 CY superfamily NC - "Cystatin-like domain; Cystatins are a family of cysteine protease inhibitors that occur mainly as single domain proteins. However some extracellular proteins such as kininogen, His-rich glycoprotein and fetuin also contain these domains." Q#26382 - CGI_10018666 superfamily 245040 26 88 0.000143548 36.5107 cl09238 CY superfamily C - "Cystatin-like domain; Cystatins are a family of cysteine protease inhibitors that occur mainly as single domain proteins. However some extracellular proteins such as kininogen, His-rich glycoprotein and fetuin also contain these domains." Q#26384 - CGI_10018668 superfamily 220239 573 713 6.96E-28 110.021 cl09673 DUF2013 superfamily - - Protein of unknown function (DUF2013); This region is found at the C terminal of a group of cytoskeletal proteins. Q#26384 - CGI_10018668 superfamily 247683 1 54 1.99E-13 66.1846 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. 
Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#26385 - CGI_10018669 superfamily 198738 51 134 1.16E-46 155.889 cl02599 Ets superfamily - - Ets-domain; Ets-domain. Q#26386 - CGI_10018670 superfamily 241645 714 803 1.10E-20 88.7004 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#26386 - CGI_10018670 superfamily 241566 107 139 4.60E-06 45.1379 cl00040 C1 superfamily C - "Protein kinase C conserved region 1 (C1) . Cysteine-rich zinc binding domain. 
Some members of this domain family bind phorbol esters and diacylglycerol, some are reported to bind RasGTP. May occur in tandem arrangement. Diacylglycerol (DAG) is a second messenger, released by activation of Phospholipase D. Phorbol Esters (PE) can act as analogues of DAG and mimic its downstream effects in, for example, tumor promotion. Protein Kinases C are activated by DAG/PE, this activation is mediated by their N-terminal conserved region (C1). DAG/PE binding may be phospholipid dependent. C1 domains may also mediate DAG/PE signals in chimaerins (a family of Rac GTPase activating proteins), RasGRPs (exchange factors for Ras/Rap1), and Munc13 isoforms (scaffolding proteins involved in exocytosis)." Q#26387 - CGI_10018671 superfamily 111929 157 244 1.18E-24 95.9306 cl03885 Str_synth superfamily - - Strictosidine synthase; Strictosidine synthase (E.C. 4.3.3.2) is a key enzyme in alkaloid biosynthesis. It catalyzes the condensation of tryptamine with secologanin to form strictosidine. Q#26388 - CGI_10018672 superfamily 222340 13 30 0.0024068 32.4429 cl18665 GNAT_acetyltr_2 superfamily C - GNAT acetyltransferase 2; This domain has N-acetyltransferase activity. It has a GCN5-related N-acetyltransferase (GNAT) fold. Q#26389 - CGI_10018673 superfamily 217293 4 163 8.30E-29 107.719 cl03788 Neur_chan_LBD superfamily C - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#26390 - CGI_10018674 superfamily 217293 4 196 2.98E-37 135.068 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. 
Q#26390 - CGI_10018674 superfamily 202474 204 291 1.46E-10 59.2045 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#26391 - CGI_10018675 superfamily 245213 1573 1610 1.83E-05 44.5498 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#26391 - CGI_10018675 superfamily 245213 1952 1985 0.000225252 41.4682 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#26391 - CGI_10018675 superfamily 245213 1357 1390 0.000303919 41.083 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#26391 - CGI_10018675 superfamily 245213 1658 1694 0.000337492 40.6978 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#26391 - CGI_10018675 superfamily 245213 1824 1864 0.000448451 40.3126 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#26391 - CGI_10018675 superfamily 245213 1530 1563 0.00287516 38.0014 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#26391 - CGI_10018675 superfamily 245213 1487 1520 0.00322858 38.0014 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#26391 - CGI_10018675 superfamily 245213 162 199 0.0033161 38.0014 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#26391 - CGI_10018675 superfamily 245213 359 396 0.00450282 37.6162 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#26391 - CGI_10018675 superfamily 246918 1999 2050 4.85E-13 66.8415 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#26391 - CGI_10018675 superfamily 243065 812 976 9.75E-09 55.9109 cl02516 VWD superfamily - - von Willebrand factor type D domain; Luciferin-2-monooxygenase from Vargula hilgendorfii contains a vwd domain. Its function is unrelated but the similarity is very strong by several methods. Q#26391 - CGI_10018675 superfamily 243124 506 663 8.06E-07 50.1181 cl02648 NIDO superfamily - - Nidogen-like; This is a nidogen-like domain (NIDO) domain and is an extracellular domain found in nidogen and hypothetical proteins of unknown function. Q#26391 - CGI_10018675 superfamily 246918 206 257 1.68E-06 47.9667 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#26391 - CGI_10018675 superfamily 241578 1909 1949 4.68E-06 48.5352 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#26391 - CGI_10018675 superfamily 241578 1860 1908 7.73E-06 48.15 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#26391 - CGI_10018675 superfamily 221695 1638 1661 1.83E-05 44.3682 cl18612 cEGF superfamily - - "Complement Clr-like EGF-like; cEGF, or complement Clr-like EGF, domains have six conserved cysteine residues disulfide-bonded into the characteristic pattern 'ababcc'. 
They are found in blood coagulation proteins such as fibrillin, Clr and Cls, thrombomodulin, and the LDL receptor. The core fold of the EGF domain consists of two small beta-hairpins packed against each other. Two major structural variants have been identified based on the structural context of the C-terminal cysteine residue of disulfide 'c' in the C-terminal hairpin: hEGFs and cEGFs. In cEGFs the C-terminal thiol resides on the C-terminal beta-sheet, resulting in long loop-lengths between the cysteine residues of disulfide 'c', typically C[10+]XC. These longer loop-lengths may have arisen by selective cysteine loss from a four-disulfide EGF template such as laminin or integrin. Tandem cEGF domains have five linking residues between terminal cysteines of adjacent domains. cEGF domains may or may not bind calcium in the linker region. cEGF domains with the consensus motif CXN4X[F,Y]XCXC are hydroxylated exclusively on the asparagine residue." Q#26391 - CGI_10018675 superfamily 241578 1390 1437 3.21E-05 46.224 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. 
Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#26391 - CGI_10018675 superfamily 221695 1294 1317 0.000239616 41.2866 cl18612 cEGF superfamily - - "Complement Clr-like EGF-like; cEGF, or complement Clr-like EGF, domains have six conserved cysteine residues disulfide-bonded into the characteristic pattern 'ababcc'. They are found in blood coagulation proteins such as fibrillin, Clr and Cls, thrombomodulin, and the LDL receptor. The core fold of the EGF domain consists of two small beta-hairpins packed against each other. Two major structural variants have been identified based on the structural context of the C-terminal cysteine residue of disulfide 'c' in the C-terminal hairpin: hEGFs and cEGFs. In cEGFs the C-terminal thiol resides on the C-terminal beta-sheet, resulting in long loop-lengths between the cysteine residues of disulfide 'c', typically C[10+]XC. These longer loop-lengths may have arisen by selective cysteine loss from a four-disulfide EGF template such as laminin or integrin. Tandem cEGF domains have five linking residues between terminal cysteines of adjacent domains. cEGF domains may or may not bind calcium in the linker region. cEGF domains with the consensus motif CXN4X[F,Y]XCXC are hydroxylated exclusively on the asparagine residue." Q#26391 - CGI_10018675 superfamily 221695 1719 1740 0.000462472 40.5162 cl18612 cEGF superfamily - - "Complement Clr-like EGF-like; cEGF, or complement Clr-like EGF, domains have six conserved cysteine residues disulfide-bonded into the characteristic pattern 'ababcc'. They are found in blood coagulation proteins such as fibrillin, Clr and Cls, thrombomodulin, and the LDL receptor. The core fold of the EGF domain consists of two small beta-hairpins packed against each other. 
Two major structural variants have been identified based on the structural context of the C-terminal cysteine residue of disulfide 'c' in the C-terminal hairpin: hEGFs and cEGFs. In cEGFs the C-terminal thiol resides on the C-terminal beta-sheet, resulting in long loop-lengths between the cysteine residues of disulfide 'c', typically C[10+]XC. These longer loop-lengths may have arisen by selective cysteine loss from a four-disulfide EGF template such as laminin or integrin. Tandem cEGF domains have five linking residues between terminal cysteines of adjacent domains. cEGF domains may or may not bind calcium in the linker region. cEGF domains with the consensus motif CXN4X[F,Y]XCXC are hydroxylated exclusively on the asparagine residue." Q#26391 - CGI_10018675 superfamily 245213 1783 1822 0.000513518 40.41 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#26395 - CGI_10018680 superfamily 241782 47 346 1.70E-17 81.358 cl00321 AAT_I superfamily - - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. 
Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phosphorylase family (fold type V)." Q#26397 - CGI_10018683 superfamily 245201 523 775 7.76E-90 289.437 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." 
Q#26397 - CGI_10018683 superfamily 241613 252 286 1.02E-06 46.8162 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#26397 - CGI_10018683 superfamily 247764 842 976 2.80E-27 110.261 cl17210 AtpH superfamily N - "F0F1-type ATP synthase, delta subunit (mitochondrial oligomycin sensitivity protein) [Energy production and conversion]" Q#26398 - CGI_10018684 superfamily 245814 734 798 0.000798041 39.7799 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#26398 - CGI_10018684 superfamily 243061 1506 1606 2.02E-39 144.407 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. 
These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#26398 - CGI_10018684 superfamily 215647 1180 1396 8.80E-39 147.37 cl18338 7tm_2 superfamily - - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GPCRs). They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#26398 - CGI_10018684 superfamily 243086 1109 1152 1.31E-13 68.1705 cl02559 GPS superfamily - - "Latrophilin/CL-1-like GPS domain; Domain present in latrophilin/CL-1, sea urchin REJ and polycystin." Q#26398 - CGI_10018684 superfamily 243086 1796 1833 1.38E-09 56.6146 cl02559 GPS superfamily - - "Latrophilin/CL-1-like GPS domain; Domain present in latrophilin/CL-1, sea urchin REJ and polycystin." 
Q#26399 - CGI_10018685 superfamily 243092 47 201 2.96E-20 92.0128 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#26400 - CGI_10018686 superfamily 247805 102 242 1.06E-27 109.349 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 
Q#26400 - CGI_10018686 superfamily 247905 267 425 8.73E-07 47.6177 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#26400 - CGI_10018686 superfamily 219532 569 670 1.41E-31 119.341 cl06657 OB_NTP_bind superfamily - - "Oligonucleotide/oligosaccharide-binding (OB)-fold; This family is found towards the C-terminus of the DEAD-box helicases (pfam00270). In these helicases it is apparently always found in association with pfam04408. There do seem to be a couple of instances where it occurs by itself - . The structure PDB:3i4u adopts an OB-fold. helicases (pfam00270). In these helicases it is apparently always found in association with pfam04408. This C-terminal domain of the yeast helicase contains an oligonucleotide/oligosaccharide-binding (OB)-fold which seems to be placed at the entrance of the putative nucleic acid cavity. It also constitutes the binding site for the G-patch-containing domain of Pfa1p. When found on DEAH/RHA helicases, this domain is central to the regulation of the helicase activity through its binding of both RNA and G-patch domain proteins." Q#26400 - CGI_10018686 superfamily 243778 478 549 1.12E-16 76.4939 cl04503 HA2 superfamily C - "Helicase associated domain (HA2); This presumed domain is about 90 amino acid residues in length. It is found is a diverse set of RNA helicases. 
Its function is unknown, however it seems likely to be involved in nucleic acid binding." Q#26401 - CGI_10018687 superfamily 241600 102 313 2.93E-98 291.452 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#26404 - CGI_10000302 superfamily 243161 4 61 1.38E-05 40.4962 cl02739 THAP superfamily C - "THAP domain; The THAP domain is a putative DNA-binding domain (DBD) and probably also binds a zinc ion. It features the conserved C2CH architecture (consensus sequence: Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His). Other universal features include the location of the domain at the N-termini of proteins, its size of about 90 residues, a C-terminal AVPTIF box and several other conserved residues. Orthologues of the human THAP domain have been identified in other vertebrates and probably worms and flies, but not in other eukaryotes or any prokaryotes." Q#26409 - CGI_10000314 superfamily 243035 22 62 0.00872661 31.051 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. 
Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. 
A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#26410 - CGI_10000317 superfamily 217473 72 295 1.66E-27 112.458 cl03978 Mab-21 superfamily - - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#26413 - CGI_10000321 superfamily 215754 38 132 4.52E-24 92.3164 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#26413 - CGI_10000321 superfamily 215754 155 196 6.49E-09 50.3296 cl02813 Mito_carr superfamily NC - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#26418 - CGI_10000324 superfamily 247038 58 94 0.00607227 32.389 cl15674 IPT superfamily C - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." Q#26419 - CGI_10000325 superfamily 243094 49 102 9.45E-19 79.39 cl02569 RasGAP superfamily N - "Ras GTPase Activating Domain; RasGAP functions as an enhancer of the hydrolysis of GTP that is bound to Ras-GTPases. Proteins having a RasGAP domain include p120GAP, IQGAP, Rab5-activating protein 6, and Neurofibromin, among others. 
Although the Rho (Ras homolog) GTPases are most closely related to members of the Ras family, RhoGAP and RasGAP exhibit no similarity at their amino acid sequence level. RasGTPases function as molecular switches in a large number of signaling pathways. They are in the on state when bound to GTP, and in the off state when bound to GDP. The RasGAP domain speeds up the hydrolysis of GTP in Ras-like proteins acting as a negative regulator." Q#26420 - CGI_10000331 superfamily 247947 3 36 0.0039478 30.8157 cl17393 HTH_Hin_like superfamily - - "Helix-turn-helix domain of Hin and related proteins, a family of DNA-binding domains unique to bacteria and represented by the Hin protein of Salmonella. The basic HTH domain is a simple fold comprised of three core helices that form a right-handed helical bundle. The principal DNA-protein interface is formed by the third helix, the recognition helix, inserting itself into the major groove of the DNA. A diverse array of HTH domains participate in a variety of functions that depend on their DNA-binding properties. HTH_Hin represents one of the simplest versions of the HTH domains; the characterization of homologous relationships between various sequence-diverse HTH domain families remains difficult. The Hin recombinase induces the site-specific inversion of a chromosomal DNA segment containing a promoter, which controls the alternate expression of two genes by reversibly switching orientation. The Hin recombinase consists of a single polypeptide chain containing a DNA-binding domain (HTH_Hin) and a catalytic domain." Q#26423 - CGI_10000318 superfamily 216981 175 204 0.00183692 35.5862 cl17087 OTU superfamily C - "OTU-like cysteine protease; This family is comprised of a group of predicted cysteine proteases, homologous to the Ovarian Tumour (OTU) gene in Drosophila. Members include proteins from eukaryotes, viruses and pathogenic bacterium. 
The conserved cysteine and histidine, and possibly the aspartate, represent the catalytic residues in this putative group of proteases." Q#26426 - CGI_10000339 superfamily 242730 9 87 0.00794311 35.3147 cl01825 Phage_Mu_Gam superfamily C - Bacteriophage Mu Gam like protein; This family consists of bacterial and phage Gam proteins. The gam gene of bacteriophage Mu encodes a protein which protects linear double stranded DNA from exonuclease degradation in vitro and in vivo. Q#26427 - CGI_10000338 superfamily 241564 32 99 6.61E-31 111.974 cl00035 BIR superfamily - - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." 
Q#26427 - CGI_10000338 superfamily 247792 262 301 4.24E-05 40.1216 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#26428 - CGI_10000290 superfamily 247736 12 45 0.000131603 35.5883 cl17182 NAT_SF superfamily N - "N-Acyltransferase superfamily: Various enzymes that characteristically catalyze the transfer of an acyl group to a substrate; NAT (N-Acyltransferase) is a large superfamily of enzymes that mostly catalyze the transfer of an acyl group to a substrate and are implicated in a variety of functions, ranging from bacterial antibiotic resistance to circadian rhythms in mammals. Members include GCN5-related N-Acetyltransferases (GNAT) such as Aminoglycoside N-acetyltransferases, Histone N-acetyltransferase (HAT) enzymes, and Serotonin N-acetyltransferase, which catalyze the transfer of an acetyl group to a substrate. The kinetic mechanism of most GNATs involves the ordered formation of a ternary complex: the reaction begins with Acetyl Coenzyme A (AcCoA) binding, followed by binding of substrate, then direct transfer of the acetyl group from AcCoA to the substrate, followed by product and subsequent CoA release. Other family members include Arginine/ornithine N-succinyltransferase, Myristoyl-CoA: protein N-myristoyltransferase, and Acyl-homoserinelactone synthase which have a similar catalytic mechanism but differ in types of acyl groups transferred. 
Leucyl/phenylalanyl-tRNA-protein transferase and FemXAB nonribosomal peptidyltransferases which catalyze similar peptidyltransferase reactions are also included." Q#26429 - CGI_10000291 superfamily 247736 36 119 2.60E-15 67.8077 cl17182 NAT_SF superfamily C - "N-Acyltransferase superfamily: Various enzymes that characteristically catalyze the transfer of an acyl group to a substrate; NAT (N-Acyltransferase) is a large superfamily of enzymes that mostly catalyze the transfer of an acyl group to a substrate and are implicated in a variety of functions, ranging from bacterial antibiotic resistance to circadian rhythms in mammals. Members include GCN5-related N-Acetyltransferases (GNAT) such as Aminoglycoside N-acetyltransferases, Histone N-acetyltransferase (HAT) enzymes, and Serotonin N-acetyltransferase, which catalyze the transfer of an acetyl group to a substrate. The kinetic mechanism of most GNATs involves the ordered formation of a ternary complex: the reaction begins with Acetyl Coenzyme A (AcCoA) binding, followed by binding of substrate, then direct transfer of the acetyl group from AcCoA to the substrate, followed by product and subsequent CoA release. Other family members include Arginine/ornithine N-succinyltransferase, Myristoyl-CoA: protein N-myristoyltransferase, and Acyl-homoserinelactone synthase which have a similar catalytic mechanism but differ in types of acyl groups transferred. Leucyl/phenylalanyl-tRNA-protein transferase and FemXAB nonribosomal peptidyltransferases which catalyze similar peptidyltransferase reactions are also included." Q#26434 - CGI_10000350 superfamily 110440 484 510 9.36E-05 40.0837 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. 
The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#26434 - CGI_10000350 superfamily 241563 59 95 0.000455243 38.2292 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#26434 - CGI_10000350 superfamily 110440 525 552 0.00441933 35.4613 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#26437 - CGI_10000349 superfamily 247725 1 86 9.97E-38 125.097 cl17171 PH-like superfamily C - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins."
Q#26447 - CGI_10000374 superfamily 192604 229 282 4.62E-14 67.7094 cl11135 PACT_coil_coil superfamily C - "Pericentrin-AKAP-450 domain of centrosomal targeting protein; This domain is a coiled-coil region close to the C-terminus of centrosomal proteins that is directly responsible for recruiting AKAP-450 and pericentrin to the centrosome. Hence the suggested name for this region is a PACT domain (pericentrin-AKAP-450 centrosomal targeting). This domain is also present at the C-terminus of coiled-coil proteins from Drosophila and S. pombe, and that from the Drosophila protein is sufficient for targeting to the centrosome in mammalian cells. The function of these proteins is unknown but they seem good candidates for having a centrosomal or spindle pole body location. The final 22 residues of this domain in AKAP-450 appear specifically to be a calmodulin-binding domain indicating that this member at least is likely to contribute to centrosome assembly." Q#26450 - CGI_10000375 superfamily 245847 6 126 5.52E-23 88.7677 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#26454 - CGI_10000381 superfamily 240425 6 148 8.85E-05 40.9555 cl18912 PTZ00464 superfamily C - SNF-7-like protein; Provisional Q#26455 - CGI_10000384 superfamily 241792 1 120 1.00E-77 228.978 cl00332 Ribosomal_S11 superfamily - - Ribosomal protein S11; Ribosomal protein S11. Q#26457 - CGI_10000286 superfamily 245213 192 227 1.47E-05 40.6978 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#26457 - CGI_10000286 superfamily 245213 1 37 2.06E-05 40.3126 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#26457 - CGI_10000286 superfamily 245213 78 113 4.26E-05 39.5422 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#26457 - CGI_10000286 superfamily 245213 153 189 9.67E-05 38.3866 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#26457 - CGI_10000286 superfamily 245213 115 151 0.000192385 37.6162 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#26457 - CGI_10000286 superfamily 245213 39 75 0.000222918 37.231 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#26464 - CGI_10000369 superfamily 218149 16 80 1.72E-21 83.0848 cl04588 tRNA_synt_1c_R1 superfamily N - "Glutaminyl-tRNA synthetase, non-specific RNA binding region part 1; This is a region found N terminal to the catalytic domain of glutaminyl-tRNA synthetase (EC 6.1.1.18) in eukaryotes but not in Escherichia coli. This region is thought to bind RNA in a non-specific manner, enhancing interactions between the tRNA and enzyme, but is not essential for enzyme function." Q#26466 - CGI_10000401 superfamily 238191 1 332 1.09E-59 200.636 cl18907 Esterase_lipase superfamily N - "Esterases and lipases (includes fungal lipases, cholinesterases, etc.) These enzymes act on carboxylic esters (EC: 3.1.1.-). The catalytic apparatus involves three residues (catalytic triad): a serine, a glutamate or aspartate and a histidine.These catalytic residues are responsible for the nucleophilic attack on the carbonyl carbon atom of the ester bond. In contrast with other alpha/beta hydrolase fold family members, p-nitrobenzyl esterase and acetylcholine esterase have a Glu instead of Asp at the active site carboxylate." Q#26468 - CGI_10000404 superfamily 242274 6 165 3.61E-05 41.629 cl01053 SGNH_hydrolase superfamily - - "SGNH_hydrolase, or GDSL_hydrolase, is a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the typical Ser-His-Asp(Glu) triad from other serine hydrolases, but may lack the carboxlic acid." Q#26469 - CGI_10000405 superfamily 243069 44 171 3.61E-58 181.6 cl02525 Band_7 superfamily - - "The band 7 domain of flotillin (reggie) like proteins. This group contains proteins similar to stomatin, prohibitin, flotillin, HlfK/C and podicin. Many of these band 7 domain-containing proteins are lipid raft-associated. 
Individual proteins of this band 7 domain family may cluster to form membrane microdomains which may in turn recruit multiprotein complexes. Microdomains formed from flotillin proteins may in addition be dynamic units with their own regulatory functions. Flotillins have been implicated in signal transduction, vesicle trafficking, cytoskeleton rearrangement and are known to interact with a variety of proteins. Stomatin interacts with and regulates members of the degenerin/epithelia Na+ channel family in mechanosensory cells of Caenorhabditis elegans and vertebrate neurons and participates in trafficking of Glut1 glucose transporters. Prohibitin may act as a chaperone for the stabilization of mitochondrial proteins. Prokaryotic HflK/C plays a role in the decision between lysogenic and lytic cycle growth during lambda phage infection. Flotillins have been implicated in the progression of prion disease, in the pathogenesis of neurodegenerative diseases such as Parkinson's and Alzheimer's disease and, in cancer invasion and metastasis. 
Mutations in the podicin gene give rise to autosomal recessive steroid resistant nephritic syndrome" Q#26470 - CGI_10000402 superfamily 241591 44 117 2.54E-28 102.699 cl00073 H15 superfamily - - "linker histone 1 and histone 5 domains; the basic subunit of chromatin is the nucleosome, consisting of an octamer of core histones, two full turns of DNA, a linker histone (H1 or H5) and a variable length of linker DNA; H1/H5 are chromatin-associated proteins that bind to the exterior of nucleosomes and dramatically stabilize the highly condensed states of chromatin fibers; stabilization of higher order folding occurs through electrostatic neutralization of the linker DNA segments, through a highly positively charged carboxy- terminal domain known as the AKP helix (Ala, Lys, Pro); thought to be involved in specific protein-protein and protein-DNA interactions and play a role in suppressing core histone tail domain acetylation in the chromatin fiber" Q#26473 - CGI_10000416 superfamily 248009 4 118 4.07E-37 131.953 cl17455 UPF0027 superfamily NC - Uncharacterized protein family UPF0027; Uncharacterized protein family UPF0027. Q#26476 - CGI_10000414 superfamily 242406 55 190 4.54E-24 94.1953 cl01271 DUF1768 superfamily - - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. Q#26480 - CGI_10000423 superfamily 245202 39 126 1.32E-32 113.799 cl09927 S1_like superfamily - - "S1_like: Ribosomal protein S1-like RNA-binding domain. Found in a wide variety of RNA-associated proteins. Originally identified in S1 ribosomal protein. This superfamily also contains the Cold Shock Domain (CSD), which is a homolog of the S1 domain. Both domains are members of the Oligonucleotide/oligosaccharide Binding (OB) fold." 
Q#26486 - CGI_10000430 superfamily 219525 74 119 2.63E-06 42.021 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#26486 - CGI_10000430 superfamily 219525 126 175 0.000650476 35.4726 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#26488 - CGI_10000426 superfamily 241563 61 96 0.00500799 35.1476 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#26493 - CGI_10000441 superfamily 222150 432 457 1.93E-05 41.9937 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#26493 - CGI_10000441 superfamily 222150 402 420 0.00368209 35.4453 cl16282 zf-H2C2_2 superfamily C - Zinc-finger double domain; Zinc-finger double domain. Q#26494 - CGI_10000437 superfamily 244574 11 37 0.000828521 35.058 cl06998 LEM_like superfamily - - "LEM-like domain of lamina-associated polypeptide 2 (LAP2) and similar proteins; LAP2, also termed thymopoietin (TP), or thymopoietin-related peptide (TPRP), is composed of isoform alpha and isoforms beta/gamma and may be involved in chromatin organization and postmitotic reassembly. Some of the LAP2 isoforms are inner nuclear membrane proteins that can bind to nuclear lamins and chromatin, while others are nonmembrane nuclear polypeptides. All LAP2 isoforms contain an N-terminal lamina-associated polypeptide-Emerin-MAN1 (LEM)-domain that is connected to a highly divergent LEM-like domain by an unstructured linker. Both LEM and LEM-like domains share the same structural fold, mainly composed of two large parallel alpha helices. However, their biochemical nature of the solvent-accessible residues is completely different, which indicates the two domains may target different protein surfaces. 
The LEM domain is responsible for the interaction with the nonspecific DNA binding protein barrier-to-autointegration factor (BAF), and the LEM-like domain is involved in chromosome binding. The family also includes the yeast helix-extension-helix domain-containing proteins, Heh1p (formerly called Src1p) and Heh2p, and their uncharacterized homologs found mainly in fungi and several in bacteria. Heh1p and Heh2p are inner nuclear membrane proteins that might interact with nuclear pore complexes (NPCs). Heh1p is involved in mitosis. It functions at the interface between subtelomeric gene expression and transcription export (TREX)-dependent messenger RNA export through NPCs. The function of Heh2p remains ill-defined. Both Heh1p and Heh2p contain a LEM-like domain (also termed HeH domain), but lack a LEM domain." Q#26499 - CGI_10000454 superfamily 216363 7 73 2.53E-13 60.5618 cl08312 UPF0029 superfamily N - Uncharacterized protein family UPF0029; Uncharacterized protein family UPF0029. Q#26502 - CGI_10000453 superfamily 247858 19 68 5.51E-09 49.3086 cl17304 2OG-FeII_Oxy_3 superfamily N - 2OG-Fe(II) oxygenase superfamily; This family contains members of the 2-oxoglutarate (2OG) and Fe(II)-dependent oxygenase superfamily. Q#26503 - CGI_10000464 superfamily 241554 12 124 4.86E-22 85.3899 cl00019 Macro superfamily - - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." 
Q#26506 - CGI_10000466 superfamily 245201 16 239 3.57E-83 252.457 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#26507 - CGI_10000462 superfamily 247723 59 137 6.30E-47 150.848 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." 
Q#26507 - CGI_10000462 superfamily 247723 166 198 5.73E-17 72.2668 cl17169 RRM_SF superfamily C - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#26507 - CGI_10000462 superfamily 247723 3 34 1.53E-06 43.3315 cl17169 RRM_SF superfamily N - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. 
The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#26508 - CGI_10000469 superfamily 248420 101 302 1.04E-16 76.5889 cl17866 ABC2_membrane_2 superfamily - - ABC-2 family transporter protein; This family is related to the ABC-2 membrane transporter family. Q#26509 - CGI_10000470 superfamily 247755 313 520 2.80E-78 246.54 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#26509 - CGI_10000470 superfamily 247755 2 198 6.88E-63 206.095 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." 
Q#26510 - CGI_10000471 superfamily 222128 205 317 6.30E-13 63.5511 cl18638 HlyD_3 superfamily - - HlyD family secretion protein; This is a family of largely bacterial haemolysin translocator HlyD proteins. Q#26510 - CGI_10000471 superfamily 205711 42 90 1.36E-08 50.5649 cl18273 Biotin_lipoyl_2 superfamily - - Biotin-lipoyl like; Biotin-lipoyl like. Q#26511 - CGI_10000473 superfamily 243179 1 92 3.02E-07 44.2609 cl02781 tetraspanin_LEL superfamily - - "Tetraspanin, extracellular domain or large extracellular loop (LEL). Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. The tetraspanin family contains CD9, CD63, CD37, CD53, CD82, CD151, and CD81, amongst others. Tetraspanins are involved in diverse processes such as cell activation and proliferation, adhesion and motility, differentiation, cancer, and others. Their various functions may relate to their ability to act as molecular facilitators, grouping specific cell-surface proteins and affecting formation and stability of signaling complexes. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web", which may also include integrins." Q#26512 - CGI_10000475 superfamily 246683 21 186 1.48E-77 237.404 cl14648 Aldose_epim superfamily C - "aldose 1-epimerase superfamily; Aldose 1-epimerases or mutarotases are key enzymes of carbohydrate metabolism; they catalyze the interconversion of the alpha- and beta-anomers of hexose sugars such as glucose and galactose. This interconversion is an important step that allows anomer specific metabolic conversion of sugars.
Studies of the catalytic mechanism of the best known member of the family, galactose mutarotase, have shown a glutamate and a histidine residue to be critical for catalysis; the glutamate serves as the active site base to initiate the reaction by removing the proton from the C-1 hydroxyl group of the sugar substrate and the histidine as the active site acid to protonate the C-5 ring oxygen." Q#26516 - CGI_10000478 superfamily 217293 19 89 4.06E-10 57.2575 cl03788 Neur_chan_LBD superfamily N - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#26516 - CGI_10000478 superfamily 202474 98 176 4.10E-07 48.8041 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#26518 - CGI_10000485 superfamily 218267 160 249 1.00E-25 104.054 cl04754 LMBR1 superfamily N - "LMBR1-like membrane protein; Members of this family are integral membrane proteins that are around 500 residues in length. LMBR1 is not involved in preaxial polydactyly, as originally thought. Vertebrate members of this family may play a role in limb development. A member of this family has been shown to be a lipocalin membrane receptor" Q#26519 - CGI_10000488 superfamily 245225 24 361 5.42E-55 189.444 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily - - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. 
This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein couples receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine- binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#26521 - CGI_10000434 superfamily 241600 2 133 1.82E-55 174.736 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. 
The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#26522 - CGI_10000491 superfamily 245596 32 55 6.25E-07 43.347 cl11394 Glyco_tranf_GTA_type superfamily N - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#26523 - CGI_10000494 superfamily 245206 4 227 4.83E-94 277.668 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. 
The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#26524 - CGI_10000492 superfamily 247856 106 154 8.14E-07 42.9201 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#26527 - CGI_10000497 superfamily 247856 47 100 3.83E-07 42.9201 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#26528 - CGI_10000501 superfamily 241600 1 73 8.95E-19 76.1251 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. 
The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#26529 - CGI_10000507 superfamily 218182 14 217 1.39E-66 207.302 cl18445 ERG2_Sigma1R superfamily - - "ERG2 and Sigma1 receptor like protein; This family consists of the fungal C-8 sterol isomerase and mammalian sigma1 receptor. C-8 sterol isomerase (delta-8--delta-7 sterol isomerase), catalyzes a reaction in ergosterol biosynthesis, which results in unsaturation at C-7 in the B ring of sterols. Sigma 1 receptor is a low molecular mass mammalian protein located in the endoplasmic reticulum, which interacts with endogenous steroid hormones, such as progesterone and testosterone. It also binds the sigma ligands, which are a set of chemically unrelated drugs including haloperidol, pentazocine, and ditolylguanidine. Sigma1 effectors are not well understood, but sigma1 agonists have been observed to affect NMDA receptor function, the alpha-adrenergic system and opioid analgesia." Q#26531 - CGI_10000508 superfamily 245213 265 295 0.00475853 34.9198 cl09941 EGF_CA superfamily N - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions.
Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#26531 - CGI_10000508 superfamily 241583 1 94 1.74E-16 76.841 cl00064 ZnMc superfamily N - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#26531 - CGI_10000508 superfamily 241571 160 257 0.00833294 34.6955 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#26535 - CGI_10000510 superfamily 247805 52 94 2.20E-05 41.554 cl17251 DEXDc superfamily N - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#26539 - CGI_10000523 superfamily 241563 75 115 1.43E-06 45.9332 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#26541 - CGI_10000526 superfamily 242274 72 166 1.03E-07 48.771 cl01053 SGNH_hydrolase superfamily N - "SGNH_hydrolase, or GDSL_hydrolase, is a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the typical Ser-His-Asp(Glu) triad from other serine hydrolases, but may lack the carboxylic acid."
Q#26543 - CGI_10000534 superfamily 243263 48 126 6.68E-13 63.581 cl02990 ASC superfamily C - Amiloride-sensitive sodium channel; Amiloride-sensitive sodium channel. Q#26544 - CGI_10000535 superfamily 245814 237 310 1.65E-05 42.0911 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#26544 - CGI_10000535 superfamily 245814 141 208 1.98E-05 41.7059 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#26549 - CGI_10000546 superfamily 241600 1 53 3.55E-21 82.6735 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. 
The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#26550 - CGI_10000543 superfamily 244819 9 63 0.000508951 36.2234 cl07874 zf-AD superfamily C - "Zinc-finger associated domain (zf-AD); The zf-AD domain, also known as ZAD, forms an atypical treble-cleft-like zinc co-ordinating fold. The zf-AD domain is thought to be involved in mediating dimer formation, but does not bind to DNA." Q#26552 - CGI_10000547 superfamily 199156 48 63 0.00251462 31.2596 cl15298 zf-CCHC superfamily - - "Zinc knuckle; The zinc knuckle is a zinc binding motif composed of the following CX2CX4HX4C where X can be any amino acid. The motifs are mostly from retroviral gag proteins (nucleocapsid). Prototype structure is from HIV. Also contains members involved in eukaryotic gene regulation, such as C. elegans GLH-1. Structure is an 18-residue zinc finger." Q#26553 - CGI_10000545 superfamily 245612 1 116 6.57E-43 147.459 cl11426 Amidase superfamily N - Amidase; Amidase. Q#26555 - CGI_10000552 superfamily 242889 59 103 5.16E-06 41.8198 cl02111 PCI superfamily N - "PCI domain; This domain has also been called the PINT motif (Proteasome, Int-6, Nip-1 and TRIP-15)." Q#26556 - CGI_10000555 superfamily 241754 29 118 1.15E-35 126.917 cl00286 Motor_domain superfamily NC - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. Q#26560 - CGI_10000560 superfamily 216363 297 392 3.17E-14 68.2658 cl08312 UPF0029 superfamily - - Uncharacterized protein family UPF0029; Uncharacterized protein family UPF0029.
Q#26561 - CGI_10000573 superfamily 201369 1 37 6.78E-12 56.485 cl02914 EF1G superfamily N - "Elongation factor 1 gamma, conserved domain; Elongation factor 1 gamma, conserved domain. " Q#26563 - CGI_10000577 superfamily 222150 327 352 0.000310078 38.5269 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#26563 - CGI_10000577 superfamily 222150 467 490 0.0023613 36.2157 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#26563 - CGI_10000577 superfamily 222150 355 380 0.00794783 34.6749 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#26566 - CGI_10000579 superfamily 241691 202 328 1.56E-05 43.2696 cl00213 DNA_BRE_C superfamily N - "DNA breaking-rejoining enzymes, C-terminal catalytic domain. The DNA breaking-rejoining enzyme superfamily includes type IB topoisomerases and tyrosine recombinases that share the same fold in their catalytic domain containing six conserved active site residues. The best-studied members of this diverse superfamily include human topoisomerase I, the bacteriophage lambda integrase, the bacteriophage P1 Cre recombinase, the yeast Flp recombinase and the bacterial XerD/C recombinases. Their overall reaction mechanism is essentially identical and involves cleavage of a single strand of a DNA duplex by nucleophilic attack of a conserved tyrosine to give a 3' phosphotyrosyl protein-DNA adduct. In the second rejoining step, a terminal 5' hydroxyl attacks the covalent adduct to release the enzyme and generate duplex DNA. The enzymes differ in that topoisomerases cleave and then rejoin the same 5' and 3' termini, whereas a site-specific recombinase transfers a 5' hydroxyl generated by recombinase cleavage to a new 3' phosphate partner located in a different duplex region. Many DNA breaking-rejoining enzymes also have N-terminal domains, which show little sequence or structure similarity." 
Q#26569 - CGI_10000585 superfamily 197729 29 47 0.00102316 34.2049 cl11732 LRRcap superfamily - - occurring C-terminal to leucine-rich repeats; A motif occurring C-terminal to leucine-rich repeats in "sds22-like" and "typical" LRR-containing proteins. Q#26570 - CGI_10000591 superfamily 213393 191 231 1.12E-14 68.4336 cl17094 talin-RS superfamily C - rod-segment of the talin C-terminal domain; The talin rod-segment characterized by this model interacts with its N-terminal FERM domain to mask its integrin-binding site and interferes with interactions between the FERM domain and the cellular membrane. Talin is a large and ubiquitous cytoskeletal protein concentrated at focal adhesion sites. It is involved in linking integrins to the actin cytoskeleton. Q#26573 - CGI_10000590 superfamily 247724 43 323 3.49E-139 410.73 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication.
Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#26573 - CGI_10000590 superfamily 243185 338 420 2.91E-25 100.64 cl02787 Translation_Factor_II_like superfamily - - "Translation_Factor_II_like: Elongation factor Tu (EF-Tu) domain II-like proteins. Elongation factor Tu consists of three structural domains, this family represents the second domain. Domain II adopts a beta barrel structure and is involved in binding to charged tRNA. Domain II is found in other proteins such as elongation factor G and translation initiation factor IF-2. This group also includes the C2 subdomain of domain IV of IF-2 that has the same fold as domain II of (EF-Tu). Like IF-2 from certain prokaryotes such as Thermus thermophilus, mitochondrial IF-2 lacks domain II, which is thought to be involved in binding of E.coli IF-2 to 30S subunits." Q#26573 - CGI_10000590 superfamily 243187 543 623 1.03E-07 50.4715 cl02789 EFG_like_IV superfamily C - "Elongation Factor G-like domain IV. This family includes the translational elongation factor termed EF-2 (for Archaea and Eukarya) and EF-G (for Bacteria), ribosomal protection proteins that mediate tetracycline resistance and, an evolutionarily conserved U5 snRNP-specific protein (U5-116kD). In complex with GTP, EF-G/EF-2 promotes the translocation step of translation. During translocation the peptidyl-tRNA is moved from the A site to the P site of the small subunit of ribosome and the mRNA is shifted one codon relative to the ribosome. It has been shown that EF-G/EF-2_IV domain mimics the shape of anticodon arm of the tRNA in the structurally homologous ternary complex of Petra, EF-Tu (another translational elongation factor) and GTP analog.
The tip portion of this domain is found in a position that overlaps the anticodon arm of the A-site tRNA, implying that EF-G/EF-2 displaces the A-site tRNA to the P-site by physical interaction with the anticodon arm." Q#26574 - CGI_10000596 superfamily 216152 110 376 9.62E-49 170.572 cl02988 Glyco_transf_10 superfamily - - "Glycosyltransferase family 10 (fucosyltransferase); This family of Fucosyltransferases are the enzymes transferring fucose from GDP-Fucose to GlcNAc in an alpha1,3 linkage. This family is known as glycosyltransferase family 10." Q#26575 - CGI_10000595 superfamily 195671 13 143 1.14E-44 145.672 cl08257 Ribosomal_L11 superfamily - - "Ribosomal protein L11. Ribosomal protein L11, together with proteins L10 and L7/L12, and 23S rRNA, form the L7/L12 stalk on the surface of the large subunit of the ribosome. The homologous eukaryotic cytoplasmic protein is also called 60S ribosomal protein L12, which is distinct from the L12 involved in the formation of the L7/L12 stalk. The C-terminal domain (CTD) of L11 is essential for binding 23S rRNA, while the N-terminal domain (NTD) contains the binding site for the antibiotics thiostrepton and micrococcin. L11 and 23S rRNA form an essential part of the GTPase-associated region (GAR). Based on differences in the relative positions of the L11 NTD and CTD during the translational cycle, L11 is proposed to play a significant role in the binding of initiation factors, elongation factors, and release factors to the ribosome. Several factors, including the class I release factors RF1 and RF2, are known to interact directly with L11. In eukaryotes, L11 has been implicated in regulating the levels of ubiquitinated p53 and MDM2 in the MDM2-p53 feedback loop, which is responsible for apoptosis in response to DNA damage. In bacteria, the "stringent response" to harsh conditions allows bacteria to survive, and ribosomes that lack L11 are deficient in stringent factor stimulation."
Q#26578 - CGI_10000594 superfamily 241576 50 119 3.48E-45 145.288 cl00055 MH1 superfamily C - "N-terminal Mad Homology 1 (MH1) domain; The MH1 is a small DNA-binding domain present in SMAD (small mothers against decapentaplegic) family of proteins, which are signal transducers and transcriptional modulators that mediate multiple signaling pathways. MH1 binds to the DNA major groove in an unusual manner via a beta hairpin structure. It negatively regulates the functions of the MH2 domain, the C-terminal domain of SMAD. Receptor-regulated SMAD proteins (R-SMADs, including SMAD1, SMAD2, SMAD3, SMAD5, and SMAD9) are activated by phosphorylation by transforming growth factor (TGF)-beta type I receptors. The active R-SMAD associates with a common mediator SMAD (Co-SMAD or SMAD4) and other cofactors, which together translocate to the nucleus to regulate gene expression. The inhibitory or antagonistic SMADs (I-SMADs, including SMAD6 and SMAD7) negatively regulate TGF-beta signaling by competing with R-SMADs for type I receptor or Co-SMADs. MH1 domains of R-SMAD and SMAD4 contain a nuclear localization signal as well as DNA-binding activity. The activated R-SMAD/SMAD4 complex then binds with very low affinity to a DNA sequence CAGAC called SMAD-binding element (SBE) via the MH1 domain." Q#26584 - CGI_10000588 superfamily 218425 171 305 2.12E-39 144.763 cl04931 eIF-3_zeta superfamily C - "Eukaryotic translation initiation factor 3 subunit 7 (eIF-3); This family is made up of eukaryotic translation initiation factor 3 subunit 7 (eIF-3 zeta/eIF3 p66/eIF3d). Eukaryotic initiation factor 3 is a multi-subunit complex that is required for binding of mRNA to 40 S ribosomal subunits, stabilisation of ternary complex binding to 40 S subunits, and dissociation of 40 and 60 S subunits. These functions and the complex nature of eIF3 suggest multiple interactions with many components of the translational machinery. 
The gene coding for the protein has been implicated in cancer in mammals." Q#26585 - CGI_10010457 superfamily 247805 261 393 7.69E-16 73.9108 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#26586 - CGI_10010458 superfamily 221155 146 259 9.30E-19 79.334 cl13152 RIG-I_C-RD superfamily - - "C-terminal domain of RIG-I; This family of proteins represents the regulatory domain RD of RIG-I, a protein which initiates a signalling cascade that provides essential antiviral protection for the host. The RD domain binds viral RNA, activating the RIG-I ATPase by RNA-dependant dimerisation. The structure of RD contains a zinc-binding domain and is thought to confer ligand specificity." Q#26587 - CGI_10010459 superfamily 247805 235 385 9.18E-22 93.1708 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 
Q#26587 - CGI_10010459 superfamily 247905 585 696 1.18E-18 83.8264 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#26587 - CGI_10010459 superfamily 221155 772 880 1.73E-13 68.5484 cl13152 RIG-I_C-RD superfamily - - "C-terminal domain of RIG-I; This family of proteins represents the regulatory domain RD of RIG-I, a protein which initiates a signalling cascade that provides essential antiviral protection for the host. The RD domain binds viral RNA, activating the RIG-I ATPase by RNA-dependant dimerisation. The structure of RD contains a zinc-binding domain and is thought to confer ligand specificity." Q#26587 - CGI_10010459 superfamily 246680 2 75 0.00801799 35.7402 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 
They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#26590 - CGI_10010462 superfamily 193687 4 149 6.01E-62 190.614 cl00160 LbetaH superfamily - - "Left-handed parallel beta-Helix (LbetaH or LbH) domain: The alignment contains 5 turns, each containing three imperfect tandem repeats of a hexapeptide repeat motif (X-[STAV]-X-[LIV]-[GAED]-X). Proteins containing hexapeptide repeats are often enzymes showing acyltransferase activity, however, some subfamilies in this hierarchy also show activities related to ion transport or translation initiation. Many are trimeric in their active forms." Q#26591 - CGI_10010463 superfamily 243555 21 207 4.26E-20 85.1354 cl03871 Chitin_bind_3 superfamily - - "Chitin binding domain; This domain is found associated with a wide variety of cellulose binding domain. This domain however is a chitin binding domain. This domain is found in isolation in baculoviral spheroidins and spindolins, protein of unknown function." Q#26592 - CGI_10010464 superfamily 243134 348 459 2.08E-22 95.0235 cl02663 Fasciclin superfamily - - "Fasciclin domain; This extracellular domain is found repeated four times in grasshopper fasciclin I as well as in proteins from mammals, sea urchins, plants, yeast and bacteria." Q#26592 - CGI_10010464 superfamily 243134 641 748 9.67E-19 84.238 cl02663 Fasciclin superfamily - - "Fasciclin domain; This extracellular domain is found repeated four times in grasshopper fasciclin I as well as in proteins from mammals, sea urchins, plants, yeast and bacteria." Q#26592 - CGI_10010464 superfamily 243134 9 130 4.36E-13 67.6744 cl02663 Fasciclin superfamily - - "Fasciclin domain; This extracellular domain is found repeated four times in grasshopper fasciclin I as well as in proteins from mammals, sea urchins, plants, yeast and bacteria." 
Q#26592 - CGI_10010464 superfamily 243134 501 605 1.87E-11 62.6668 cl02663 Fasciclin superfamily - - "Fasciclin domain; This extracellular domain is found repeated four times in grasshopper fasciclin I as well as in proteins from mammals, sea urchins, plants, yeast and bacteria." Q#26592 - CGI_10010464 superfamily 243134 762 895 1.19E-09 57.274 cl02663 Fasciclin superfamily - - "Fasciclin domain; This extracellular domain is found repeated four times in grasshopper fasciclin I as well as in proteins from mammals, sea urchins, plants, yeast and bacteria." Q#26592 - CGI_10010464 superfamily 243134 929 1048 4.29E-09 55.7332 cl02663 Fasciclin superfamily - - "Fasciclin domain; This extracellular domain is found repeated four times in grasshopper fasciclin I as well as in proteins from mammals, sea urchins, plants, yeast and bacteria." Q#26592 - CGI_10010464 superfamily 243134 213 319 1.21E-08 54.1924 cl02663 Fasciclin superfamily - - "Fasciclin domain; This extracellular domain is found repeated four times in grasshopper fasciclin I as well as in proteins from mammals, sea urchins, plants, yeast and bacteria." Q#26598 - CGI_10010470 superfamily 220704 57 269 1.92E-122 353.507 cl11011 DUF2419 superfamily C - "Protein of unknown function (DUF2419); This is a family of conserved proteins found from plants to humans. The function is not known. A few members are annotated as being cobyrinic acid a,c-diamide synthetase but this could not be confirmed." Q#26599 - CGI_10000605 superfamily 118307 6 118 3.90E-44 142.223 cl10754 Keratin_assoc superfamily - - "Keratinocyte-associated protein 2; Members of this family comprise various keratinocyte-associated proteins. Their exact function has not, as yet, been determined." 
Q#26600 - CGI_10000607 superfamily 247856 44 93 2.34E-08 46.3869 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#26602 - CGI_10000611 superfamily 218216 8 150 9.20E-59 181.678 cl04685 P16-Arc superfamily - - "ARP2/3 complex 16 kDa subunit (p16-Arc); The Arp2/3 protein complex has been implicated in the control of actin polymerisation. The human complex consists of seven subunits which include the actin related proteins Arp2 and Arp3, and five others referred to as p41-Arc, p34-Arc, p21-Arc, p20-Arc, and p16-Arc. The precise function of p16-Arc is currently unknown. Its structure consists of a single domain containing a bundle of seven alpha helices." Q#26604 - CGI_10000614 superfamily 219316 1 55 4.36E-15 66.0907 cl06268 B9-C2 superfamily N - "Ciliary basal body-associated, B9 protein; The B9-C2 domain is found in proteins associated with the ciliary basal body. B9 domains were identified as a specific family of C2 domains. There are three sub-families represented by this family, notably, Mks1-Xbx7, Stumpy-Tza1 and Tza2 groups of proteins. Mutations in human Mks1 result in the developmental disorder Mechler-Gruber syndrome; mutations in mouse Stumpy lead to perinatal hydrocephalus and severe polycystic kidney disease. All the three distinct types of B9-C2 proteins cooperatively localise to the basal body or centrosome of cilia." 
Q#26604 - CGI_10000614 superfamily 248445 33 79 0.00420777 32.9903 cl17891 ProX superfamily C - "ABC-type proline/glycine betaine transport systems, periplasmic components [Amino acid transport and metabolism]" Q#26605 - CGI_10000615 superfamily 242996 5 89 9.07E-22 82.6432 cl02346 Tmemb_14 superfamily - - Transmembrane proteins 14C; This family of short membrane proteins are as yet uncharacterized. Q#26607 - CGI_10000613 superfamily 110440 481 508 0.00129764 37.0021 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#26607 - CGI_10000613 superfamily 110440 523 550 0.00933853 34.3057 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#26608 - CGI_10000624 superfamily 242046 59 221 1.44E-63 197.477 cl00718 TOPRIM superfamily - - "Topoisomerase-primase domain. 
This is a nucleotidyl transferase/hydrolase domain found in type IA, type IIA and type IIB topoisomerases, bacterial DnaG-type primases, small primase-like proteins from bacteria and archaea, OLD family nucleases from bacteria and archaea, and bacterial DNA repair proteins of the RecR/M family. This domain has two conserved motifs, one of which centers at a conserved glutamate and the other one at two conserved aspartates (DxD). This glutamate and two aspartates cluster together to form a highly acidic surface patch. The conserved glutamate may act as a general base in nucleotide polymerization by primases and in strand joining in topoisomerases and, as a general acid in strand cleavage by topoisomerases and nucleases. The DXD motif may co-ordinate Mg2+, a cofactor required for full catalytic function." Q#26609 - CGI_10000619 superfamily 241563 148 190 5.29E-06 42.8516 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#26609 - CGI_10000619 superfamily 128778 204 306 0.00336968 35.7035 cl17972 BBC superfamily - - B-Box C-terminal domain; Coiled coil region C-terminal to (some) B-Box domains Q#26610 - CGI_10000627 superfamily 209366 21 45 5.62E-11 53.35 cl11604 zf-A20 superfamily - - A20-like zinc finger; The A20 Zn-finger of bovine/human Rabex5/rabGEF1 is a Ubiquitin Binding Domain. The zinc finger mediates self-association in A20. These fingers also mediate IL-1-induced NF-kappa B activation. Q#26612 - CGI_10000630 superfamily 245205 10 82 7.45E-16 71.0572 cl09930 RPA_2b-aaRSs_OBF_like superfamily - - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. 
One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." Q#26612 - CGI_10000630 superfamily 241739 97 315 6.29E-61 198.944 cl00268 class_II_aaRS-like_core superfamily C - "Class II tRNA amino-acyl synthetase-like catalytic core domain. Class II amino acyl-tRNA synthetases (aaRS) share a common fold and generally attach an amino acid to the 3' OH of ribose of the appropriate tRNA. PheRS is an exception in that it attaches the amino acid at the 2'-OH group, like class I aaRSs. These enzymes are usually homodimers. This domain is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. The substrate specificity of this reaction is further determined by additional domains. Interestingly, this domain is also found in asparagine synthase A (AsnA), in the accessory subunit of mitochondrial polymerase gamma and in the bacterial ATP phosphoribosyltransferase regulatory subunit HisZ." 
Q#26614 - CGI_10000629 superfamily 248264 1 107 1.87E-11 58.0174 cl17710 DDE_4 superfamily N - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#26618 - CGI_10000631 superfamily 241629 38 112 9.39E-22 84.569 cl00133 SCP superfamily C - "SCP: SCP-like extracellular protein domain, found in eukaryotes and prokaryotes. This family includes plant pathogenesis-related protein 1 (PR-1), which accumulates after infections with pathogens, and may act as an anti-fungal agent or be involved in cell wall loosening. This family also includes CRISPs, mammalian cysteine-rich secretory proteins, which combine SCP with a C-terminal cysteine rich domain, and allergen 5 from vespid venom. Roles for CRISP, in response to pathogens, fertilization, and sperm maturation have been proposed. One member, Tex31 from the venom duct of Conus textile, has been shown to possess proteolytic activity sensitive to serine protease inhibitors. The human GAPR-1 protein has been reported to dimerize, and such a dimer may form an active site containing a catalytic triad. SCP has also been proposed to be a Ca++ chelating serine protease. The Ca++-chelating function would fit with various signaling processes that members of this family, such as the CRISPs, are involved in, and is supported by sequence and structural evidence of a conserved pocket containing two histidines and a glutamate. It also may explain how helothermine, a toxic peptide secreted by the beaded lizard, blocks Ca++ transporting ryanodine receptors. 
Little is known about the biological roles of the bacterial and archaeal SCP domains." Q#26622 - CGI_10000661 superfamily 247724 41 201 0.000314242 38.9768 cl17170 Ras_like_GTPase superfamily C - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#26624 - CGI_10000639 superfamily 245836 101 258 6.50E-75 232.913 cl12015 Adenylation_DNA_ligase_like superfamily - - "Adenylation domain of proteins similar to ATP-dependent polynucleotide ligases; ATP-dependent polynucleotide ligases catalyze the phosphodiester bond formation of nicked nucleic acid substrates using ATP as a cofactor in a three step reaction mechanism. This family includes ATP-dependent DNA and RNA ligases. DNA ligases play a vital role in the diverse processes of DNA replication, recombination and repair. 
ATP-dependent DNA ligases have a highly modular architecture, consisting of a unique arrangement of two or more discrete domains, including a DNA-binding domain, an adenylation or nucleotidyltransferase (NTase) domain, and an oligonucleotide/oligosaccharide binding (OB)-fold domain. The adenylation domain binds ATP and contains many active site residues. Together with the C-terminal OB-fold domain, it comprises a catalytic core unit that is common to most members of the ATP-dependent DNA ligase family. The catalytic core contains six conserved sequence motifs (I, III, IIIa, IV, V and VI) that define this family of related nucleotidyltransferases including eukaryotic GRP-dependent mRNA-capping enzymes. The catalytic core contains both the active site as well as many DNA-binding residues. The RNA circularization protein from archaea and bacteria contains the minimal catalytic unit, the adenylation domain, but does not contain an OB-fold domain. This family also includes the m3G-cap binding domain of snurportin, a nuclear import adaptor that binds m3G-capped spliceosomal U small nucleoproteins (snRNPs), but doesn't have enzymatic activity." Q#26624 - CGI_10000639 superfamily 151975 25 63 3.37E-08 49.401 cl13054 Snurportin1 superfamily - - Snurportin1; Snurportin1 is a novel nuclear import receptor which contains an N-terminal importin beta binding domain which is essential for its function of a snRNP-specific nuclear import receptor. Snurportin1 interacts with m3G-cap where it enhances the m3G-cap dependent nuclear import of U snRNPs in Xenopus laevis oocytes and digitonin-permeabilized HeLa cells. Q#26629 - CGI_10000668 superfamily 245225 4 120 2.66E-15 70.0328 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily NC - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. 
This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein couples receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine- binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#26631 - CGI_10000677 superfamily 245814 101 169 0.00046786 37.0012 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. 
The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#26632 - CGI_10000688 superfamily 241563 38 77 5.78E-05 40.7336 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#26632 - CGI_10000688 superfamily 197380 290 347 0.0078335 36.8309 cl16909 SdiA-regulated superfamily C - "SdiA-regulated; This model represents a bacterial family of proteins that may be regulated by SdiA, a member of the LuxR family of transcriptional regulators. The C-terminal domain included in the alignment forms a five-bladed beta-propeller structure. The X-ray structure of Escherichia coli yjiK (C-terminal domain) exhibits binding of calcium ions (Ca++) in what appears to be an evolutionarily conserved site. Sequence analysis suggests a distant relationship to proteins that are characterized as containing NHL-repeats. The latter also form beta-propeller structures, with several examples known to form six-bladed beta-propellers. Several of the six-bladed beta-propellers containing NHL repeats have been characterized functionally, including members with enzymatic functions that are dependent on metal ions. No functional characterization is available for this family of five-bladed propellers, though." 
Q#26633 - CGI_10000689 superfamily 243035 38 85 9.81E-06 39.1402 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#26634 - CGI_10000690 superfamily 241817 21 186 1.42E-48 161.976 cl00365 F1-ATPase_gamma superfamily C - "mitochondrial ATP synthase gamma subunit; The F-ATPase is found in bacterial plasma membranes, mitochondrial inner membranes and in chloroplast thylakoid membranes. It has also been found in the archaea Methanosarcina barkeri. It uses a proton gradient to drive ATP synthesis and hydrolyzes ATP to build the proton gradient. The extrinisic membrane domain of F-ATPases is composed of alpha, beta, gamma, delta, and epsilon (not present in bacteria) subunits with a stoichiometry of 3:3:1:1:1. Alpha and beta subunit form the globular catalytic moiety, a hexameric ring of alternating subunits. Gamma, delta and epsilon subunits form a stalk, connecting F1 to F0, the integral membrane proton translocating domain." Q#26635 - CGI_10000692 superfamily 241563 8 52 0.00998834 34.3772 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#26636 - CGI_10000695 superfamily 243072 251 362 1.53E-20 86.2834 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#26636 - CGI_10000695 superfamily 115363 4 65 9.06E-05 40.0478 cl05972 MIB_HERC2 superfamily - - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). Q#26637 - CGI_10000696 superfamily 218676 125 411 1.98E-16 79.2995 cl14911 Peptidase_M13_N superfamily C - "Peptidase family M13; M13 peptidases are well-studied proteases found in a wide range of organisms including mammals and bacteria. In mammals they participate in processes such as cardiovascular development, blood-pressure regulation, nervous control of respiration, and regulation of the function of neuropeptides in the central nervous system. In bacteria they may be used for digestion of milk." Q#26638 - CGI_10000697 superfamily 244307 71 421 3.13E-112 340.191 cl06123 DHR2_DOCK superfamily - - "Dock Homology Region 2, a GEF domain, of Dedicator of Cytokinesis proteins; DOCK proteins comprise a family of atypical guanine nucleotide exchange factors (GEFs) that lack the conventional Dbl homology (DH) domain. As GEFs, they activate the small GTPases Rac and Cdc42 by exchanging bound GDP for free GTP. They are also called the CZH (CED-5, Dock180, and MBC-zizimin homology) family, after the first family members identified. Dock180 was first isolated as a binding partner for the adaptor protein Crk. 
The Caenorhabditis elegans protein, Ced-5, is essential for cell migration and phagocytosis, while the Drosophila ortholog, Myoblast city (MBC), is necessary for myoblast fusion and dorsal closure. DOCKs are divided into four classes (A-D) based on sequence similarity and domain architecture: class A includes Dock1 (or Dock180), 2 and 5; class B includes Dock3 and 4; class C includes Dock6, 7, and 8; and class D includes Dock9, 10 and 11. All DOCKs contain two homology domains: the DHR-1 (Dock homology region-1), also called CZH1, and DHR-2 (also called CZH2 or Docker). This alignment model represents the DHR-2 domain of DOCK proteins, which contains the catalytic GEF activity for Rac and/or Cdc42." Q#26639 - CGI_10000698 superfamily 241547 84 186 3.97E-19 81.9455 cl00012 alpha_CA superfamily N - "Carbonic anhydrase alpha (vertebrate-like) group. Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionary distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidine residues and a fourth conserved histidine plays a potential role in proton transfer." Q#26641 - CGI_10000704 superfamily 245248 1 116 1.26E-29 110.01 cl10080 RPE65 superfamily N - "Retinal pigment epithelial membrane protein; This family represents a retinal pigment epithelial membrane receptor which is abundantly expressed in retinal pigment epithelium, and binds plasma retinal binding protein. 
The family also includes the sequence related neoxanthin cleavage enzyme in plants and lignostilbene-alpha,beta-dioxygenase in bacteria." Q#26642 - CGI_10000705 superfamily 242274 61 174 7.73E-05 40.5725 cl01053 SGNH_hydrolase superfamily N - "SGNH_hydrolase, or GDSL_hydrolase, is a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the typical Ser-His-Asp(Glu) triad from other serine hydrolases, but may lack the carboxylic acid." Q#26644 - CGI_10000699 superfamily 248264 312 438 3.24E-05 42.6094 cl17710 DDE_4 superfamily C - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#26646 - CGI_10000709 superfamily 247724 8 208 1.37E-83 255.153 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#26649 - CGI_10000718 superfamily 241584 54 96 0.000232046 36.3203 cl00065 FN3 superfamily N - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#26651 - CGI_10000720 superfamily 222428 104 416 1.71E-162 462.821 cl18675 AAA_34 superfamily - - P-loop containing NTP hydrolase pore-1; P-loop containing NTP hydrolase pore-1. Q#26654 - CGI_10000736 superfamily 241600 2 128 3.96E-36 125.431 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. 
C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#26656 - CGI_10000731 superfamily 244083 22 138 4.32E-33 115.04 cl05417 PLA2_like superfamily - - "PLA2_like: Phospholipase A2, a super-family of secretory and cytosolic enzymes; the latter are either Ca dependent or Ca independent. PLA2 cleaves the sn-2 position of the glycerol backbone of phospholipids (PC or phosphatidylethanolamine), usually in a metal-dependent reaction, to generate lysophospholipid (LysoPL) and a free fatty acid (FA). The resulting products are either dietary or used in synthetic pathways for leukotrienes and prostaglandins. Often, arachidonic acid is released as a free fatty acid and acts as second messenger in signaling networks. Secreted PLA2s have also been found to specifically bind to a variety of soluble and membrane proteins in mammals, including receptors. As a toxin, PLA2 is a potent presynaptic neurotoxin which blocks nerve terminals by binding to the nerve membrane and hydrolyzing stable membrane lipids. The products of the hydrolysis (LysoPL and FA) cannot form bilayers leading to a change in membrane conformation and ultimately to a block in the release of neurotransmitters. PLA2 may form dimers or oligomers." 
Q#26658 - CGI_10000682 superfamily 183292 83 149 0.00313386 35.5676 cl18135 PRK11728 superfamily NC - hydroxyglutarate oxidase; Provisional Q#26660 - CGI_10000745 superfamily 183292 1 35 0.00153896 36.7232 cl18135 PRK11728 superfamily C - hydroxyglutarate oxidase; Provisional Q#26661 - CGI_10000749 superfamily 245596 86 147 2.89E-24 94.1857 cl11394 Glyco_tranf_GTA_type superfamily C - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#26663 - CGI_10000757 superfamily 241578 1 124 6.54E-22 85.8062 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. 
The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and immune defenses. In integrins these domains form heterodimers, while in vWF they form multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed the MIDAS motif, which is a characteristic feature of most, if not all, A domains." Q#26664 - CGI_10000758 superfamily 248293 46 115 0.000447463 38.1038 cl17739 MADF_DNA_bdg superfamily - - Alcohol dehydrogenase transcription factor Myb/SANT-like; The myb/SANT-like domain in Adf-1 (MADF) is an approximately 80-amino-acid module that directs sequence specific DNA binding to a site consisting of multiple tri-nucleotide repeats. The MADF domain is found in one or more copies in eukaryotic and viral proteins and is often associated with the BESS domain. It is likely that the MADF domain is more closely related to the myb/SANT domain than it is to other HTH domains. Q#26667 - CGI_10000766 superfamily 243635 139 192 3.11E-10 55.0333 cl04085 uDENN superfamily N - uDENN domain; This region is always found associated with pfam02141. It is predicted to form an all beta domain. Q#26667 - CGI_10000766 superfamily 245670 236 261 0.00238243 36.7898 cl11519 DENN superfamily NC - DENN (AEX-3) domain; DENN (after differentially expressed in neoplastic vs normal cells) is a domain which occurs in several proteins involved in Rab-mediated processes or regulation of MAPK signalling pathways. 
Q#26669 - CGI_10000761 superfamily 220672 80 274 1.55E-37 133.526 cl10957 Frag1 superfamily - - "Frag1/DRAM/Sfk1 family; This family includes Frag1, DRAM and Sfk1 proteins. Frag1 (FGF receptor activating protein 1) is a protein that is conserved from fungi to humans. There are four potential iso-prenylation sites throughout the peptide, viz CILW, CIIW and CIGL. Frag1 is a membrane-spanning protein that is ubiquitously expressed in adult tissues suggesting an important cellular function. Dram is a family of proteins conserved from nematodes to humans with six hydrophobic transmembrane regions and an Endoplasmic Reticulum signal peptide. It is a lysosomal protein that induces macro-autophagy as an effector of p53-mediated death, where p53 is the tumour-suppressor gene that is frequently mutated in cancer. Expression of Dram is stress-induced. This region is also part of a family of small plasma membrane proteins, referred to as Sfk1, that may act together with or upstream of Stt4p to generate normal levels of the essential phospholipid PI4P, thus allowing proper localisation of Stt4p to the actin cytoskeleton." Q#26669 - CGI_10000761 superfamily 220672 2 46 4.86E-05 42.2338 cl10957 Frag1 superfamily N - "Frag1/DRAM/Sfk1 family; This family includes Frag1, DRAM and Sfk1 proteins. Frag1 (FGF receptor activating protein 1) is a protein that is conserved from fungi to humans. There are four potential iso-prenylation sites throughout the peptide, viz CILW, CIIW and CIGL. Frag1 is a membrane-spanning protein that is ubiquitously expressed in adult tissues suggesting an important cellular function. Dram is a family of proteins conserved from nematodes to humans with six hydrophobic transmembrane regions and an Endoplasmic Reticulum signal peptide. It is a lysosomal protein that induces macro-autophagy as an effector of p53-mediated death, where p53 is the tumour-suppressor gene that is frequently mutated in cancer. Expression of Dram is stress-induced. 
This region is also part of a family of small plasma membrane proteins, referred to as Sfk1, that may act together with or upstream of Stt4p to generate normal levels of the essential phospholipid PI4P, thus allowing proper localisation of Stt4p to the actin cytoskeleton." Q#26670 - CGI_10000779 superfamily 245225 12 378 8.58E-96 291.868 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily - - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein couples receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine- binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. 
Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#26677 - CGI_10000795 superfamily 247684 4 149 8.07E-23 96.6476 cl17037 NBD_sugar-kinase_HSP70_actin superfamily N - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#26678 - CGI_10000794 superfamily 247941 115 255 0.000111174 40.3969 cl17387 Methyltransf_21 superfamily - - "Methyltransferase FkbM domain; This family has members from bacteria to human, and appears to be a methyltransferase." 
Q#26679 - CGI_10000806 superfamily 241619 8 42 0.00332337 31.6378 cl00112 PAN_APPLE superfamily NC - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#26687 - CGI_10000818 superfamily 241563 61 96 0.000156612 40.9256 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#26689 - CGI_10000788 superfamily 207668 80 113 1.52E-13 60.283 cl02609 TFIIS_C superfamily - - Transcription factor S-II (TFIIS); Transcription factor S-II (TFIIS). Q#26690 - CGI_10000823 superfamily 248097 72 185 2.02E-20 83.0834 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#26691 - CGI_10000824 superfamily 248097 5 95 8.19E-17 70.3718 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#26692 - CGI_10000755 superfamily 241563 89 124 0.00242239 36.3032 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#26693 - CGI_10000829 superfamily 245205 8 103 1.30E-33 115.402 cl09930 RPA_2b-aaRSs_OBF_like superfamily - - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." Q#26694 - CGI_10000825 superfamily 245205 3 47 8.94E-06 39.9137 cl09930 RPA_2b-aaRSs_OBF_like superfamily N - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. 
RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." Q#26695 - CGI_10000821 superfamily 245201 92 241 9.63E-37 131.525 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#26695 - CGI_10000821 superfamily 247824 8 117 5.10E-05 42.2115 cl17270 APH_ChoK_like superfamily NC - "Aminoglycoside 3'-phosphotransferase (APH) and Choline Kinase (ChoK) family. The APH/ChoK family is part of a larger superfamily that includes the catalytic domains of other kinases, such as the typical serine/threonine/tyrosine protein kinases (PKs), RIO kinases, actin-fragmin kinase (AFK), and phosphoinositide 3-kinase (PI3K). 
The family is composed of APH, ChoK, ethanolamine kinase (ETNK), macrolide 2'-phosphotransferase (MPH2'), an unusual homoserine kinase, and uncharacterized proteins with similarity to the N-terminal domain of acyl-CoA dehydrogenase 10 (ACAD10). The members of this family catalyze the transfer of the gamma-phosphoryl group from ATP (or CTP) to small molecule substrates such as aminoglycosides, macrolides, choline, ethanolamine, and homoserine. Phosphorylation of the antibiotics, aminoglycosides and macrolides, leads to their inactivation and to bacterial antibiotic resistance. Phosphorylation of choline, ethanolamine, and homoserine serves as precursors to the synthesis of important biological compounds, such as the major phospholipids, phosphatidylcholine and phosphatidylethanolamine and the amino acids, threonine, methionine, and isoleucine." Q#26696 - CGI_10000836 superfamily 242406 83 123 0.000414434 36.4153 cl01271 DUF1768 superfamily C - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. Q#26699 - CGI_10000840 superfamily 152473 1 47 1.69E-15 71.8264 cl12034 DUF3524 superfamily N - Domain of unknown function (DUF3524); This presumed domain is functionally uncharacterized. This domain is found in bacteria and eukaryotes. This domain is about 170 amino acids in length. This domain is found associated with pfam00534. This domain has two conserved sequence motifs: HENQ and FNS. This domain has a single completely conserved residue S that may be functionally important. Q#26699 - CGI_10000840 superfamily 245227 179 273 1.09E-08 54.7002 cl10013 Glycosyltransferase_GTB_type superfamily NC - "Glycosyltransferases catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. 
The acceptor molecule can be a lipid, a protein, a heterocyclic compound, or another carbohydrate residue. The structures of the formed glycoconjugates are extremely diverse, reflecting a wide range of biological functions. The members of this family share a common GTB topology, one of the two protein topologies observed for nucleotide-sugar-dependent glycosyltransferases. GTB proteins have distinct N- and C- terminal domains each containing a typical Rossmann fold. The two domains have high structural homology despite minimal sequence homology. The large cleft that separates the two domains includes the catalytic center and permits a high degree of flexibility." Q#26700 - CGI_10000841 superfamily 152473 12 123 1.07E-49 158.111 cl12034 DUF3524 superfamily C - Domain of unknown function (DUF3524); This presumed domain is functionally uncharacterized. This domain is found in bacteria and eukaryotes. This domain is about 170 amino acids in length. This domain is found associated with pfam00534. This domain has two conserved sequence motifs: HENQ and FNS. This domain has a single completely conserved residue S that may be functionally important. Q#26703 - CGI_10000811 superfamily 241832 60 172 7.18E-64 196.255 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). 
Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#26704 - CGI_10000812 superfamily 248020 1 263 1.05E-18 83.9499 cl17466 Sulfatase superfamily N - Sulfatase; Sulfatase. Q#26705 - CGI_10000822 superfamily 245201 42 132 8.33E-09 54.5202 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#26707 - CGI_10000851 superfamily 191331 148 334 3.75E-29 113.662 cl05293 RNA_pol_Rpc82 superfamily - - "RNA polymerase III subunit RPC82; This family consists of several DNA-directed RNA polymerase III polypeptides which are related to the Saccharomyces cerevisiae RPC82 protein. RNA polymerase C (III) promotes the transcription of tRNA and 5S RNA genes. In Saccharomyces cerevisiae, the enzyme is composed of 15 subunits, ranging from 160 to about 10 kDa." Q#26707 - CGI_10000851 superfamily 191971 7 68 7.71E-11 57.9932 cl07012 HTH_9 superfamily - - "RNA polymerase III subunit RPC82 helix-turn-helix domain; This family consists of several DNA-directed RNA polymerase III polypeptides which are related to the Saccharomyces cerevisiae RPC82 protein. RNA polymerase C (III) promotes the transcription of tRNA and 5S RNA genes. In Saccharomyces cerevisiae, the enzyme is composed of 15 subunits, ranging from 160 to about 10 kDa. This region is probably a DNA-binding helix-turn-helix." 
Q#26708 - CGI_10000869 superfamily 245206 3 271 5.35E-139 396.817 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#26710 - CGI_10000871 superfamily 149426 8 157 1.18E-26 104.4 cl18038 SEFIR superfamily - - "SEFIR domain; This family comprises IL17 receptors (IL17Rs) and SEF proteins. The latter are feedback inhibitors of FGF signalling and are also thought to be receptors. Due to its similarity to the TIR domain (pfam01582), the SEFIR region is thought to be involved in homotypic interactions with other SEFIR/TIR-domain-containing proteins. Thus, SEFs and IL17Rs may be involved in TOLL/IL1R-like signalling pathways." Q#26712 - CGI_10000872 superfamily 245205 19 77 0.000284618 36.0617 cl09930 RPA_2b-aaRSs_OBF_like superfamily C - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. 
One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." Q#26715 - CGI_10000883 superfamily 241578 17 135 4.21E-09 51.6864 cl00057 vWFA superfamily C - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and immune defenses. In integrins these domains form heterodimers, while in vWF they form multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. 
Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#26717 - CGI_10000887 superfamily 246680 84 151 0.00486941 33.0778 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#26723 - CGI_10000874 superfamily 245205 151 225 1.36E-06 46.0769 cl09930 RPA_2b-aaRSs_OBF_like superfamily - - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. 
The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." Q#26723 - CGI_10000874 superfamily 245531 404 480 0.000480543 38.499 cl11158 BEN superfamily - - "BEN domain; The BEN domain is found in diverse animal proteins such as BANP/SMAR1, NAC1 and the Drosophila mod(mdg4) isoform C, in the chordopoxvirus virosomal protein E5R and in several proteins of polydnaviruses. Computational analysis suggests that the BEN domain mediates protein-DNA and protein-protein interactions during chromatin organisation and transcription." Q#26725 - CGI_10000893 superfamily 241832 74 177 4.58E-41 141.686 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). 
Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#26725 - CGI_10000893 superfamily 241832 198 299 6.22E-37 130.516 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#26725 - CGI_10000893 superfamily 241832 321 373 3.05E-24 95.4624 cl00388 Thioredoxin_like superfamily C - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. 
The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#26725 - CGI_10000893 superfamily 241832 17 61 7.52E-08 49.7791 cl00388 Thioredoxin_like superfamily N - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#26727 - CGI_10000902 superfamily 217836 2 47 4.33E-11 55.3645 cl09556 Sas10_Utp3 superfamily N - "Sas10/Utp3/C1D family; This family contains Utp3 and LCP5 which are components of the U3 ribonucleoprotein complex. It also includes the human C1D protein and Saccharomyces cerevisiae YHR081W (rrp47), an exosome-associated protein required for the 3' processing of stable RNAs, and Sas10 which has been identified as a regulator of chromatin silencing. 
This family also includes the human protein Neuroguidin, an initiation factor 4E (eIF4E) binding protein." Q#26730 - CGI_10000897 superfamily 217648 328 646 1.44E-76 253.429 cl15557 Glyco_hydro_65m superfamily - - "Glycosyl hydrolase family 65 central catalytic domain; This family of glycosyl hydrolases contains vacuolar acid trehalase and maltose phosphorylase. Maltose phosphorylase (MP) is a dimeric enzyme that catalyzes the conversion of maltose and inorganic phosphate into beta-D-glucose-1-phosphate and glucose. The central domain is the catalytic domain, which binds a phosphate ion that is proximal to the highly conserved Glu. The arrangement of the phosphate and the glutamate is thought to cause nucleophilic attack on the anomeric carbon atom. The catalytic domain also forms the majority of the dimerisation interface." Q#26732 - CGI_10000912 superfamily 217473 163 307 3.05E-26 108.992 cl03978 Mab-21 superfamily N - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#26732 - CGI_10000912 superfamily 216981 641 670 0.00763493 35.5862 cl17087 OTU superfamily C - "OTU-like cysteine protease; This family is comprised of a group of predicted cysteine proteases, homologous to the Ovarian Tumour (OTU) gene in Drosophila. Members include proteins from eukaryotes, viruses and pathogenic bacterium. The conserved cysteine and histidine, and possibly the aspartate, represent the catalytic residues in this putative group of proteases." Q#26733 - CGI_10000908 superfamily 242406 1 63 0.0029917 34.1041 cl01271 DUF1768 superfamily N - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate.
Q#26735 - CGI_10000913 superfamily 215647 969 1191 5.68E-10 59.9297 cl18338 7tm_2 superfamily - - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GPCRs). They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#26736 - CGI_10000910 superfamily 248097 13 106 1.19E-19 81.9278 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#26736 - CGI_10000910 superfamily 248097 129 173 8.96E-06 43.0226 cl17543 C1q superfamily C - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#26737 - CGI_10000916 superfamily 216363 72 177 6.20E-25 94.0741 cl08312 UPF0029 superfamily - - Uncharacterized protein family UPF0029; Uncharacterized protein family UPF0029. Q#26739 - CGI_10000932 superfamily 218118 31 80 1.10E-08 47.6089 cl04552 CD225 superfamily C - "Interferon-induced transmembrane protein; This family includes the human leukocyte antigen CD225, which is an interferon inducible transmembrane protein, and is associated with interferon induced cell growth suppression."
Q#26740 - CGI_10000934 superfamily 247905 228 387 1.18E-23 95.3824 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#26740 - CGI_10000934 superfamily 247805 8 217 8.41E-56 185.766 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#26741 - CGI_10000935 superfamily 247805 86 226 8.00E-29 112.431 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#26741 - CGI_10000935 superfamily 243778 469 559 2.14E-39 139.667 cl04503 HA2 superfamily - - "Helicase associated domain (HA2); This presumed domain is about 90 amino acid residues in length. It is found is a diverse set of RNA helicases. Its function is unknown, however it seems likely to be involved in nucleic acid binding." 
Q#26741 - CGI_10000935 superfamily 247905 313 372 2.45E-06 46.0509 cl17351 HELICc superfamily C - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#26741 - CGI_10000935 superfamily 219532 593 633 7.35E-06 44.6126 cl06657 OB_NTP_bind superfamily C - "Oligonucleotide/oligosaccharide-binding (OB)-fold; This family is found towards the C-terminus of the DEAD-box helicases (pfam00270). In these helicases it is apparently always found in association with pfam04408. There do seem to be a couple of instances where it occurs by itself - . The structure PDB:3i4u adopts an OB-fold. helicases (pfam00270). In these helicases it is apparently always found in association with pfam04408. This C-terminal domain of the yeast helicase contains an oligonucleotide/oligosaccharide-binding (OB)-fold which seems to be placed at the entrance of the putative nucleic acid cavity. It also constitutes the binding site for the G-patch-containing domain of Pfa1p. When found on DEAH/RHA helicases, this domain is central to the regulation of the helicase activity through its binding of both RNA and G-patch domain proteins." 
Q#26745 - CGI_10007265 superfamily 241554 1346 1485 7.44E-38 141.244 cl00019 Macro superfamily - - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#26745 - CGI_10007265 superfamily 247723 386 454 8.72E-18 81.1596 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." 
Q#26745 - CGI_10007265 superfamily 247723 294 365 2.26E-17 80.0388 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#26745 - CGI_10007265 superfamily 247723 473 541 0.00130316 39.2105 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. 
The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#26745 - CGI_10007265 superfamily 241554 1153 1305 1.86E-45 164.366 cl00019 Macro superfamily - - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#26745 - CGI_10007265 superfamily 241554 1662 1773 1.75E-21 93.8643 cl00019 Macro superfamily - - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." 
Q#26745 - CGI_10007265 superfamily 247723 548 616 6.08E-10 58.0476 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#26745 - CGI_10007265 superfamily 219837 109 176 0.000354025 41.5003 cl07160 Fork_head_N superfamily C - "Forkhead N-terminal region; The region described in this family is found towards the N-terminus of various eukaryotic fork head/HNF-3-related transcription factors (which contain the pfam00250 domain). These proteins play key roles in embryogenesis, maintenance of differentiated cell states, and tumorigenesis." Q#26747 - CGI_10007267 superfamily 192997 1042 1198 1.20E-34 133.091 cl18184 Sterol-sensing superfamily - - "Sterol-sensing domain of SREBP cleavage-activation; Sterol regulatory element-binding proteins (SREBPs) are membrane-bound transcription factors that promote lipid synthesis in animal cells. They are embedded in the membranes of the endoplasmic reticulum (ER) in a helical hairpin orientation and are released from the ER by a two-step proteolytic process. 
Proteolysis begins when the SREBPs are cleaved at Site-1, which is located at a leucine residue in the middle of the hydrophobic loop in the lumen of the ER. Upon proteolytic processing SREBP can activate the expression of genes involved in cholesterol biosynthesis and uptake. SCAP stimulates cleavage of SREBPs via fusion of their two C-termini. This domain is the transmembrane region that traverses the membrane eight times and is the sterol-sensing domain of the cleavage protein. WD40 domains are found towards the C-terminus." Q#26747 - CGI_10007267 superfamily 243092 2335 2483 7.28E-06 49.2556 cl02567 WD40 superfamily N - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment."
Q#26748 - CGI_10007268 superfamily 241862 268 430 6.54E-19 85.1004 cl00437 COG0428 superfamily N - Predicted divalent heavy-metal cations transporter [Inorganic ion transport and metabolism] Q#26749 - CGI_10007269 superfamily 241596 12 55 7.71E-05 35.6527 cl00081 HLH superfamily N - "Helix-loop-helix domain, found in specific DNA- binding proteins that act as transcription factors; 60-100 amino acids long. A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE 5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved in both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#26750 - CGI_10007270 superfamily 220692 26 326 3.39E-15 74.5481 cl18570 7TM_GPCR_Srw superfamily - - Serpentine type 7TM GPCR chemoreceptor Srw; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srw is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. 
The genes encoding Srw do not appear to be under as strong an adaptive evolutionary pressure as those of Srz. Q#26753 - CGI_10000920 superfamily 215691 237 286 2.07E-06 44.499 cl15766 Pyr_redox superfamily N - Pyridine nucleotide-disulphide oxidoreductase; This family includes both class I and class II oxidoreductases and also NADH oxidases and peroxidases. This domain is actually a small NADH binding domain within a larger FAD binding domain. Q#26753 - CGI_10000920 superfamily 248054 60 85 0.00613321 33.9848 cl17500 NAD_binding_8 superfamily C - NAD(P)-binding Rossmann-like domain; NAD(P)-binding Rossmann-like domain. Q#26754 - CGI_10000921 superfamily 243066 41 172 4.12E-08 48.7677 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#26755 - CGI_10000906 superfamily 218713 120 299 2.89E-74 228.766 cl05332 MRG superfamily - - "MRG; This family consists of three different eukaryotic proteins (mortality factor 4 (MORF4/MRG15), male-specific lethal 3(MSL-3) and ESA1-associated factor 3(EAF3)). It is thought that the MRG family is involved in transcriptional regulation via histone acetylation. 
It contains 2 chromo domains and a leucine zipper motif." Q#26756 - CGI_10000947 superfamily 215754 231 322 9.87E-22 87.694 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#26756 - CGI_10000947 superfamily 215754 134 225 8.40E-20 82.3012 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#26756 - CGI_10000947 superfamily 215754 26 132 2.26E-13 64.582 cl02813 Mito_carr superfamily N - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#26758 - CGI_10000936 superfamily 245206 5 61 2.37E-08 47.8269 cl09931 NADB_Rossmann superfamily NC - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#26760 - CGI_10000950 superfamily 248097 1 61 5.55E-09 47.645 cl17543 C1q superfamily N - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#26761 - CGI_10000958 superfamily 245201 173 216 2.02E-06 45.7914 cl09925 PKc_like superfamily NC - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. 
It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#26762 - CGI_10000953 superfamily 245201 1 157 1.21E-104 306.649 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#26763 - CGI_10000960 superfamily 243040 1 57 2.40E-30 107.497 cl02447 CRD_FZ superfamily N - "CRD_domain cysteine-rich domain, also known as Fz (frizzled) domain; CRD_FZ is an essential component of a number of cell surface receptors, which are involved in multiple signal transduction pathways, particularly in modulating the activity of the Wnt proteins, which play a fundamental role in the early development of metazoans. CRD is also found in secreted frizzled related proteins (SFRPs), which lack the transmembrane segment found in the frizzled protein. The CRD domain is also present in the alpha-1 chain of mouse type XVIII collagen, in carboxypeptidase Z, several receptor tyrosine kinases, and the mosaic transmembrane serine protease corin. The CRD domain is well conserved in metazoans - 10 frizzled proteins have been identified in mammals, 4 in Drosophila and 3 in Caenorhabditis elegans. CRD domains have also been identified in multiple tandem copies in a Dictyostelium discoideum protein. Very little is known about the mechanism by which CRD domains interact with their ligands. 
The domain contains 10 conserved cysteines." Q#26765 - CGI_10000889 superfamily 245227 16 68 4.93E-35 122.704 cl10013 Glycosyltransferase_GTB_type superfamily C - "Glycosyltransferases catalyze the transfer of sugar moieties from activated donor molecules to specific acceptor molecules, forming glycosidic bonds. The acceptor molecule can be a lipid, a protein, a heterocyclic compound, or another carbohydrate residue. The structures of the formed glycoconjugates are extremely diverse, reflecting a wide range of biological functions. The members of this family share a common GTB topology, one of the two protein topologies observed for nucleotide-sugar-dependent glycosyltransferases. GTB proteins have distinct N- and C- terminal domains each containing a typical Rossmann fold. The two domains have high structural homology despite minimal sequence homology. The large cleft that separates the two domains includes the catalytic center and permits a high degree of flexibility." Q#26766 - CGI_10000975 superfamily 241748 1 191 1.79E-75 230.537 cl00279 APP_MetAP superfamily N - "A family including aminopeptidase P, aminopeptidase M, and prolidase. Also known as metallopeptidase family M24. This family of enzymes is able to cleave amido-, imido- and amidino-containing bonds. Members exibit relatively narrow substrate specificity compared to other metallo-aminopeptidases, suggesting they play roles in regulation of biological processes rather than general protein degradation." Q#26770 - CGI_10000993 superfamily 248469 65 154 3.02E-16 73.9435 cl17915 HAD_like superfamily C - "Haloacid dehalogenase-like hydrolases. The haloacid dehalogenase-like (HAD) superfamily includes L-2-haloacid dehalogenase, epoxide hydrolase, phosphoserine phosphatase, phosphomannomutase, phosphoglycolate phosphatase, P-type ATPase, and many others, all of which use a nucleophilic aspartate in their phosphoryl transfer reaction. 
All members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. Members of this superfamily are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases." Q#26770 - CGI_10000993 superfamily 248469 259 311 1.46E-07 48.9055 cl17915 HAD_like superfamily N - "Haloacid dehalogenase-like hydrolases. The haloacid dehalogenase-like (HAD) superfamily includes L-2-haloacid dehalogenase, epoxide hydrolase, phosphoserine phosphatase, phosphomannomutase, phosphoglycolate phosphatase, P-type ATPase, and many others, all of which use a nucleophilic aspartate in their phosphoryl transfer reaction. All members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. Members of this superfamily are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases." Q#26772 - CGI_10000954 superfamily 245206 61 307 1.06E-117 343.03 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." 
Q#26773 - CGI_10000955 superfamily 241831 62 124 8.46E-16 67.5814 cl00386 BolA superfamily - - BolA-like protein; This family consists of the morphoprotein BolA from E. coli and its various homologues. In E. coli overexpression of this protein causes round morphology and may be involved in switching the cell between elongation and septation systems during cell division. The expression of BolA is growth rate regulated and is induced during the transition into the stationary phase. BolA is also induced by stress during early stages of growth and may have a general role in stress response. It has also been suggested that BolA can induce the transcription of penicillin binding proteins 6 and 5. Q#26775 - CGI_10000877 superfamily 238012 26 64 7.78E-05 40.4154 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#26775 - CGI_10000877 superfamily 238012 122 153 0.00627126 34.6374 cl11390 EGF_Lam superfamily N - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#26776 - CGI_10001003 superfamily 243146 256 302 4.89E-09 51.8934 cl02701 Kelch_3 superfamily - -
"Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#26776 - CGI_10001003 superfamily 243146 218 267 6.24E-07 46.0123 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#26777 - CGI_10001007 superfamily 241568 151 206 4.46E-05 40.1388 cl00043 CCP superfamily - - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#26777 - CGI_10001007 superfamily 241619 45 116 0.00397976 34.4801 cl00112 PAN_APPLE superfamily - - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." Q#26782 - CGI_10001001 superfamily 247724 2 187 6.27E-75 226.005 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. 
The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#26783 - CGI_10001013 superfamily 202711 40 204 2.39E-56 179.856 cl04190 Mob1_phocein superfamily - - "Mob1/phocein family; Mob1 is an essential Saccharomyces cerevisiae protein, identified from a two-hybrid screen, that binds Mps1p, a protein kinase essential for spindle pole body duplication and mitotic checkpoint regulation. Mob1 contains no known structural motifs; however MOB1 is a member of a conserved gene family and shares sequence similarity with a nonessential yeast gene, MOB2. Mob1 is a phosphoprotein in vivo and a substrate for the Mps1p kinase in vitro. Conditional alleles of MOB1 cause a late nuclear division arrest at restrictive temperature. This family also includes phocein, a rat protein that by yeast two hybrid interacts with striatin." Q#26787 - CGI_10000977 superfamily 241578 1 77 1.12E-11 56.531 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). 
Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#26789 - CGI_10001031 superfamily 243175 91 206 1.99E-57 179.315 cl02776 GST_C_family superfamily - - "C-terminal, alpha helical domain of the Glutathione S-transferase family; Glutathione S-transferase (GST) family, C-terminal alpha helical domain; a large, diverse group of cytosolic dimeric proteins involved in cellular detoxification by catalyzing the conjugation of glutathione (GSH) with a wide range of endogenous and xenobiotic alkylating agents, including carcinogens, therapeutic drugs, environmental toxins and products of oxidative stress. In addition, GSTs also show GSH peroxidase activity and are involved in the synthesis of prostaglandins and leukotrienes. This family, also referred to as soluble GSTs, is the largest family of GSH transferases and is only distantly related to the mitochondrial GSTs (GSTK). Soluble GSTs bear no structural similarity to microsomal GSTs (MAPEG family) and display additional activities unique to their group, such as catalyzing thiolysis, reduction and isomerization of certain compounds. 
The GST fold contains an N-terminal thioredoxin-fold domain and a C-terminal alpha helical domain, with an active site located in a cleft between the two domains. GSH binds to the N-terminal domain while the hydrophobic substrate occupies a pocket in the C-terminal domain. Based on sequence similarity, different classes of GSTs have been identified, which display varying tissue distribution, substrate specificities and additional specific activities. In humans, GSTs display polymorphisms which may influence individual susceptibility to diseases such as cancer, arthritis, allergy and sclerosis. Some GST family members with non-GST functions include glutaredoxin 2, the CLIC subfamily of anion channels, prion protein Ure2p, crystallins, metaxins, stringent starvation protein A, and aminoacyl-tRNA synthetases." Q#26789 - CGI_10001031 superfamily 241832 2 73 4.98E-36 123.064 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." 
Q#26790 - CGI_10001033 superfamily 247794 2 334 8.29E-164 464.957 cl17240 FDH_GDH_like superfamily - - "Formate/glycerate dehydrogenases, D-specific 2-hydroxy acid dehydrogenases and related dehydrogenases; The formate/glycerate dehydrogenase like family contains a diverse group of enzymes such as formate dehydrogenase (FDH), glycerate dehydrogenase (GDH), D-lactate dehydrogenase, L-alanine dehydrogenase, and S-Adenosylhomocysteine hydrolase, that share a common 2-domain structure. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar domains of the alpha/beta Rossmann fold NAD+ binding form. The NAD(P) binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD(P) is bound, primarily to the C-terminal portion of the 2nd (internal) domain. While many members of this family are dimeric, alanine DH is hexameric and phosphoglycerate DH is tetrameric. 2-hydroxyacid dehydrogenases are enzymes that catalyze the conversion of a wide variety of D-2-hydroxy acids to their corresponding keto acids. The general mechanism is (R)-lactate + acceptor to pyruvate + reduced acceptor. Formate dehydrogenase (FDH) catalyzes the NAD+-dependent oxidation of formate ion to carbon dioxide with the concomitant reduction of NAD+ to NADH. FDHs of this family contain no metal ions or prosthetic groups. Catalysis occurs through direct transfer of a hydride ion to NAD+ without the stages of acid-base catalysis typically found in related dehydrogenases." Q#26791 - CGI_10001034 superfamily 241952 57 520 0 567.378 cl00566 PntB superfamily - - NAD/NADP transhydrogenase beta subunit [Energy production and conversion] Q#26791 - CGI_10001034 superfamily 204792 6 76 3.08E-06 45.7822 cl13395 DUF3464 superfamily NC - Protein of unknown function (DUF3464); This family of proteins are functionally uncharacterized. 
This protein is found in bacteria and eukaryotes. Proteins in this family are typically between 137 to 196 amino acids in length. Q#26794 - CGI_10001009 superfamily 245596 36 139 5.05E-45 148.114 cl11394 Glyco_tranf_GTA_type superfamily C - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#26798 - CGI_10001014 superfamily 218802 117 146 0.00671562 35.415 cl05462 DUF862 superfamily N - "PPPDE putative peptidase domain; The PPPDE superfamily (after Permuted Papain fold Peptidases of DsRNA viruses and Eukaryotes), consists of predicted thiol peptidases with a circularly permuted papain-like fold. The inference of the likely DUB function of the PPPDE superfamily proteins is based on the fusions of the catalytic domain to Ub-binding PUG (PUB)/UBA domains and a novel alpha-helical Ub-associated domain (the PUL domain, after PLAP, Ufd3p and Lub1p)." 
Q#26799 - CGI_10001039 superfamily 247755 1 112 7.40E-41 138.068 cl17201 ABC_ATPase superfamily N - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#26800 - CGI_10001017 superfamily 191582 22 99 0.00024704 36.4385 cl05954 DUF1180 superfamily NC - Protein of unknown function (DUF1180); This family consists of several hypothetical mammalian proteins of around 190 residues in length. The function of this family is unknown. Q#26802 - CGI_10001041 superfamily 241609 29 97 2.89E-24 88.9743 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#26803 - CGI_10001044 superfamily 245205 134 208 4.91E-07 46.0769 cl09930 RPA_2b-aaRSs_OBF_like superfamily - - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. 
One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." Q#26803 - CGI_10001044 superfamily 245205 19 98 0.000532639 37.2173 cl09930 RPA_2b-aaRSs_OBF_like superfamily - - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. 
RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." Q#26805 - CGI_10001025 superfamily 241868 5 85 1.17E-10 54.0702 cl00447 Nudix_Hydrolase superfamily N - "Nudix hydrolase is a superfamily of enzymes found in all three kingdoms of life, and it catalyzes the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+ for their activity. Members of this family are recognized by a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which forms a structural motif that functions as a metal binding and catalytic site. Substrates of nudix hydrolase include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance and "house-cleaning" enzymes. Substrate specificity is used to define child families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. 
This superfamily consists of at least nine families: IPP (isopentenyl diphosphate) isomerase, ADP ribose pyrophosphatase, mutT pyrophosphohydrolase, coenzyme-A pyrophosphatase, MTH1-7,8-dihydro-8-oxoguanine-triphosphatase, diadenosine tetraphosphate hydrolase, NADH pyrophosphatase, GDP-mannose hydrolase and the c-terminal portion of the mutY adenine glycosylase." Q#26806 - CGI_10000854 superfamily 128937 3 69 7.55E-15 65.3616 cl02743 DM9 superfamily - - Repeats found in Drosophila proteins; Repeats found in Drosophila proteins. Q#26806 - CGI_10000854 superfamily 128937 79 142 5.63E-10 51.8796 cl02743 DM9 superfamily - - Repeats found in Drosophila proteins; Repeats found in Drosophila proteins. Q#26807 - CGI_10001070 superfamily 241583 125 285 8.69E-30 113.05 cl00064 ZnMc superfamily - - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#26807 - CGI_10001070 superfamily 241571 290 352 0.000751857 37.3919 cl00049 CUB superfamily C - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#26809 - CGI_10001112 superfamily 220662 81 141 0.00444554 34.7386 cl10947 DUF2217 superfamily NC - Uncharacterized conserved protein (DUF2217); This is a family of conserved proteins of from 500 - 600 residues found from worms to humans. Its function is not known. Q#26813 - CGI_10001111 superfamily 217925 206 286 1.18E-05 43.2637 cl04417 Ctr superfamily N - "Ctr copper transporter family; The redox active metal copper is an essential cofactor in critical biological processes such as respiration, iron transport, oxidative stress protection, hormone production, and pigmentation. 
A widely conserved family of high-affinity copper transport proteins (Ctr proteins) mediates copper uptake at the plasma membrane. A series of clustered methionine residues in the hydrophilic extracellular domain, and an MXXXM motif in the second transmembrane domain, are important for copper uptake. These methionine probably coordinate copper during the process of metal transport." Q#26817 - CGI_10001094 superfamily 222150 59 84 0.00148518 33.1342 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#26820 - CGI_10001130 superfamily 247794 6 160 3.14E-23 92.5848 cl17240 FDH_GDH_like superfamily C - "Formate/glycerate dehydrogenases, D-specific 2-hydroxy acid dehydrogenases and related dehydrogenases; The formate/glycerate dehydrogenase like family contains a diverse group of enzymes such as formate dehydrogenase (FDH), glycerate dehydrogenase (GDH), D-lactate dehydrogenase, L-alanine dehydrogenase, and S-Adenosylhomocysteine hydrolase, that share a common 2-domain structure. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar domains of the alpha/beta Rossmann fold NAD+ binding form. The NAD(P) binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD(P) is bound, primarily to the C-terminal portion of the 2nd (internal) domain. While many members of this family are dimeric, alanine DH is hexameric and phosphoglycerate DH is tetrameric. 2-hydroxyacid dehydrogenases are enzymes that catalyze the conversion of a wide variety of D-2-hydroxy acids to their corresponding keto acids. The general mechanism is (R)-lactate + acceptor to pyruvate + reduced acceptor. Formate dehydrogenase (FDH) catalyzes the NAD+-dependent oxidation of formate ion to carbon dioxide with the concomitant reduction of NAD+ to NADH. 
FDHs of this family contain no metal ions or prosthetic groups. Catalysis occurs through direct transfer of a hydride ion to NAD+ without the stages of acid-base catalysis typically found in related dehydrogenases." Q#26821 - CGI_10001131 superfamily 247794 11 82 4.01E-34 119.549 cl17240 FDH_GDH_like superfamily N - "Formate/glycerate dehydrogenases, D-specific 2-hydroxy acid dehydrogenases and related dehydrogenases; The formate/glycerate dehydrogenase like family contains a diverse group of enzymes such as formate dehydrogenase (FDH), glycerate dehydrogenase (GDH), D-lactate dehydrogenase, L-alanine dehydrogenase, and S-Adenosylhomocysteine hydrolase, that share a common 2-domain structure. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar domains of the alpha/beta Rossmann fold NAD+ binding form. The NAD(P) binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD(P) is bound, primarily to the C-terminal portion of the 2nd (internal) domain. While many members of this family are dimeric, alanine DH is hexameric and phosphoglycerate DH is tetrameric. 2-hydroxyacid dehydrogenases are enzymes that catalyze the conversion of a wide variety of D-2-hydroxy acids to their corresponding keto acids. The general mechanism is (R)-lactate + acceptor to pyruvate + reduced acceptor. Formate dehydrogenase (FDH) catalyzes the NAD+-dependent oxidation of formate ion to carbon dioxide with the concomitant reduction of NAD+ to NADH. FDHs of this family contain no metal ions or prosthetic groups. Catalysis occurs through direct transfer of a hydride ion to NAD+ without the stages of acid-base catalysis typically found in related dehydrogenases." 
Q#26822 - CGI_10001132 superfamily 247792 20 70 5.61E-07 47.0552 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#26822 - CGI_10001132 superfamily 241563 157 199 0.0017981 36.6884 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#26826 - CGI_10001193 superfamily 218200 1 107 4.80E-29 112.075 cl04660 Glyco_transf_54 superfamily N - "N-Acetylglucosaminyltransferase-IV (GnT-IV) conserved region; The complex-type of oligosaccharides are synthesised through elongation by glycosyltransferases after trimming of the precursor oligosaccharides transferred to proteins in the endoplasmic reticulum. N-Acetylglucosaminyltransferases (GnTs) take part in the formation of branches in the biosynthesis of complex-type sugar chains. In vertebrates, six GnTs, designated as GnT-I to -VI, which catalyze the transfer of GlcNAc to the core mannose residues of Asn-linked sugar chains, have been identified. GnT-IV (EC:2.4.1.145) catalyzes the transfer of GlcNAc from UDP-GlcNAc to the GlcNAc1-2Man1-3 arm of core oligosaccharide [Gn2(22)core oligosaccharide] and forms GlcNAc1-4(GlcNAc1-2)Man1-3 structure on the core oligosaccharide (Gn3(2,4,2)core oligosaccharide). 
In some members the conserved region occupies all but the very for N-terminal, where there is a signal sequence on all members. For other members the conserved region does not occupy the entire protein but is still to the N-terminus of the protein." Q#26827 - CGI_10001194 superfamily 218200 30 258 8.60E-81 253.058 cl04660 Glyco_transf_54 superfamily - - "N-Acetylglucosaminyltransferase-IV (GnT-IV) conserved region; The complex-type of oligosaccharides are synthesised through elongation by glycosyltransferases after trimming of the precursor oligosaccharides transferred to proteins in the endoplasmic reticulum. N-Acetylglucosaminyltransferases (GnTs) take part in the formation of branches in the biosynthesis of complex-type sugar chains. In vertebrates, six GnTs, designated as GnT-I to -VI, which catalyze the transfer of GlcNAc to the core mannose residues of Asn-linked sugar chains, have been identified. GnT-IV (EC:2.4.1.145) catalyzes the transfer of GlcNAc from UDP-GlcNAc to the GlcNAc1-2Man1-3 arm of core oligosaccharide [Gn2(22)core oligosaccharide] and forms GlcNAc1-4(GlcNAc1-2)Man1-3 structure on the core oligosaccharide (Gn3(2,4,2)core oligosaccharide). In some members the conserved region occupies all but the very for N-terminal, where there is a signal sequence on all members. For other members the conserved region does not occupy the entire protein but is still to the N-terminus of the protein." Q#26827 - CGI_10001194 superfamily 242729 329 346 0.00881929 36.4795 cl01823 DUF2331 superfamily N - Uncharacterized protein conserved in bacteria (DUF2331); Members of this family of hypothetical bacterial proteins have no known function. Q#26831 - CGI_10001157 superfamily 220704 17 291 5.19E-142 404.353 cl11011 DUF2419 superfamily - - "Protein of unknown function (DUF2419); This is a family of conserved proteins found from plants to humans. The function is not known. 
A few members are annotated as being cobyrinic acid a,c-diamide synthetase but this could not be confirmed." Q#26832 - CGI_10001158 superfamily 243064 28 58 5.56E-06 39.7062 cl02512 NTR_like superfamily NC - "NTR_like domain; a beta barrel with an oligosaccharide/oligonucleotide-binding fold found in netrins, complement proteins, tissue inhibitors of metalloproteases (TIMP), and procollagen C-proteinase enhancers (PCOLCE), amongst others. In netrins, the domain plays a role in controlling axon branching in neural development, while the common function of these modules in TIMPs appears to be binding to metzincins. A subset of this family is also known as the C345C domain because it occurs as a C-terminal domain in complement C3, C4 and C5. In C5, the domain interacts with various partners during the formation of the membrane attack complex." Q#26834 - CGI_10001078 superfamily 217685 10 114 7.93E-20 83.1524 cl04225 Cu2_monoox_C superfamily N - "Copper type II ascorbate-dependent monooxygenase, C-terminal domain; The N and C-terminal domains of members of this family adopt the same PNGase F-like fold." Q#26835 - CGI_10001225 superfamily 245201 219 335 2.89E-17 78.0473 cl09925 PKc_like superfamily C - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#26838 - CGI_10001221 superfamily 247861 104 214 1.61E-15 69.8479 cl17307 SpoU_methylase superfamily C - SpoU rRNA Methylase family; This family of proteins probably use S-AdoMet. 
Q#26838 - CGI_10001221 superfamily 244541 10 81 1.23E-13 62.9388 cl06870 SpoU_sub_bind superfamily - - RNA 2'-O ribose methyltransferase substrate binding; This domain is a RNA 2'-O ribose methyltransferase substrate binding domain. Q#26840 - CGI_10001223 superfamily 247792 189 230 3.93E-10 55.5296 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#26842 - CGI_10001235 superfamily 241563 61 97 0.000168337 40.5404 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#26843 - CGI_10001236 superfamily 192604 217 264 8.36E-11 56.5386 cl11135 PACT_coil_coil superfamily C - "Pericentrin-AKAP-450 domain of centrosomal targeting protein; This domain is a coiled-coil region close to the C-terminus of centrosomal proteins that is directly responsible for recruiting AKAP-450 and pericentrin to the centrosome. Hence the suggested name for this region is a PACT domain (pericentrin-AKAP-450 centrosomal targeting). This domain is also present at the C-terminus of coiled-coil proteins from Drosophila and S. pombe, and that from the Drosophila protein is sufficient for targeting to the centrosome in mammalian cells. 
The function of these proteins is unknown but they seem good candidates for having a centrosomal or spindle pole body location. The final 22 residues of this domain in AKAP-450 appear specifically to be a calmodulin-binding domain indicating that this member at least is likely to contribute to centrosome assembly." Q#26846 - CGI_10001244 superfamily 247038 152 213 0.000306295 38.2046 cl15674 IPT superfamily - - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." Q#26846 - CGI_10001244 superfamily 247038 32 92 0.000708734 37.0114 cl15674 IPT superfamily C - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." 
Q#26847 - CGI_10001263 superfamily 219849 168 283 1.19E-38 140.397 cl09597 RIH_assoc superfamily - - "RyR and IP3R Homology associated; This eukaryotic domain is found in ryanodine receptors (RyR) and inositol 1,4,5-trisphosphate receptors (IP3R) which together form a superfamily of homotetrameric ligand-gated intracellular Ca2+ channels. There seems to be no known function for this domain. Also see the IP3-binding domain pfam01365 and pfam02815." Q#26849 - CGI_10001262 superfamily 241565 467 518 2.49E-11 60.0278 cl00038 BRCT superfamily C - "Breast Cancer Suppressor Protein (BRCA1), carboxy-terminal domain. The BRCT domain is found within many DNA damage repair and cell cycle checkpoint proteins. The unique diversity of this domain superfamily allows BRCT modules to interact forming homo/hetero BRCT multimers, BRCT-non-BRCT interactions, and interactions within DNA strand breaks." Q#26853 - CGI_10001270 superfamily 248458 59 387 1.32E-17 82.3617 cl17904 MFS superfamily - - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. 
Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#26854 - CGI_10001281 superfamily 243250 190 609 9.16E-175 506.801 cl02959 Glyco_hydro_9 superfamily - - Glycosyl hydrolase family 9; Glycosyl hydrolase family 9. Q#26854 - CGI_10001281 superfamily 248295 77 172 0.000687539 38.543 cl17741 CBM_2 superfamily - - Cellulose binding domain; Two tryptophan residues are involved in cellulose binding. Cellulose binding domain found in bacteria. Q#26857 - CGI_10001297 superfamily 245819 299 476 4.47E-65 216.291 cl11967 Nucleotidyl_cyc_III superfamily - - "Class III nucleotidyl cyclases; Class III nucleotidyl cyclases are the largest, most diverse group of nucleotidyl cyclases (NC's) containing prokaryotic and eukaryotic proteins. They can be divided into two major groups; the mononucleotidyl cyclases (MNC's) and the diguanylate cyclases (DGC's). The MNC's, which include the adenylate cyclases (AC's) and the guanylate cyclases (GC's), have a conserved cyclase homology domain (CHD), while the DGC's have a conserved GGDEF domain, named after a conserved motif within this subgroup. Their products, cyclic guanylyl and adenylyl nucleotides, are second messengers that play important roles in eukaryotic signal transduction and prokaryotic sensory pathways." 
Q#26857 - CGI_10001297 superfamily 245201 3 228 3.64E-30 120.719 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#26857 - CGI_10001297 superfamily 219526 245 286 4.68E-06 47.2287 cl06648 HNOBA superfamily N - "Heme NO binding associated; The HNOBA domain is found associated with the HNOB domain and pfam00211 in soluble cyclases and signalling proteins. The HNOB domain is predicted to function as a heme-dependent sensor for gaseous ligands, and transduce diverse downstream signals, in both bacteria and animals." Q#26858 - CGI_10001287 superfamily 245823 1 441 1.07E-154 454.047 cl11976 SNF superfamily - - Sodium:neurotransmitter symporter family; Sodium:neurotransmitter symporter family. Q#26859 - CGI_10001190 superfamily 238191 25 499 1.45E-121 368.968 cl18907 Esterase_lipase superfamily - - "Esterases and lipases (includes fungal lipases, cholinesterases, etc.) These enzymes act on carboxylic esters (EC: 3.1.1.-). The catalytic apparatus involves three residues (catalytic triad): a serine, a glutamate or aspartate and a histidine. These catalytic residues are responsible for the nucleophilic attack on the carbonyl carbon atom of the ester bond. In contrast with other alpha/beta hydrolase fold family members, p-nitrobenzyl esterase and acetylcholine esterase have a Glu instead of Asp at the active site carboxylate."
Q#26860 - CGI_10001191 superfamily 220695 102 239 9.89E-09 54.5071 cl18571 7TM_GPCR_Srx superfamily NC - Serpentine type 7TM GPCR chemoreceptor Srx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srx is part of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#26861 - CGI_10001300 superfamily 192604 72 99 0.00139536 33.0414 cl11135 PACT_coil_coil superfamily C - "Pericentrin-AKAP-450 domain of centrosomal targeting protein; This domain is a coiled-coil region close to the C-terminus of centrosomal proteins that is directly responsible for recruiting AKAP-450 and pericentrin to the centrosome. Hence the suggested name for this region is a PACT domain (pericentrin-AKAP-450 centrosomal targeting). This domain is also present at the C-terminus of coiled-coil proteins from Drosophila and S. pombe, and that from the Drosophila protein is sufficient for targeting to the centrosome in mammalian cells. The function of these proteins is unknown but they seem good candidates for having a centrosomal or spindle pole body location. The final 22 residues of this domain in AKAP-450 appear specifically to be a calmodulin-binding domain indicating that this member at least is likely to contribute to centrosome assembly." Q#26865 - CGI_10001310 superfamily 245206 133 371 1.63E-113 334.957 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. 
NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#26865 - CGI_10001310 superfamily 243072 1 119 1.71E-25 99.7654 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#26866 - CGI_10001311 superfamily 243072 305 431 1.75E-26 105.543 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#26866 - CGI_10001311 superfamily 243072 502 600 1.58E-17 79.735 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). 
ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#26866 - CGI_10001311 superfamily 243072 180 330 1.32E-15 74.3422 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#26874 - CGI_10001329 superfamily 241563 72 108 5.08E-05 41.1188 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#26875 - CGI_10001305 superfamily 243035 8 45 4.31E-05 36.809 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. 
For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model."
Q#26878 - CGI_10001346 superfamily 248012 2 125 6.24E-11 55.3573 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#26879 - CGI_10001347 superfamily 245201 11 110 0.00146832 40.5432 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#26880 - CGI_10001348 superfamily 216599 11 325 3.89E-157 450.574 cl18372 B56 superfamily - - "Protein phosphatase 2A regulatory B subunit (B56 family); Protein phosphatase 2A (PP2A) is a major intracellular protein phosphatase that regulates multiple aspects of cell growth and metabolism. The ability of this widely distributed heterotrimeric enzyme to act on a diverse array of substrates is largely controlled by the nature of its regulatory B subunit. There are multiple families of B subunits (See also pfam01240), this family is called the B56 family." Q#26884 - CGI_10001375 superfamily 183292 1 32 0.00633887 36.338 cl18135 PRK11728 superfamily C - hydroxyglutarate oxidase; Provisional Q#26886 - CGI_10001397 superfamily 245814 17 74 3.17E-05 42.0911 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. 
A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#26888 - CGI_10001428 superfamily 248097 179 305 3.44E-23 92.3282 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#26890 - CGI_10001293 superfamily 217380 131 408 4.60E-103 313.492 cl18406 TTL superfamily - - "Tubulin-tyrosine ligase family; Tubulins and microtubules are subjected to several post-translational modifications of which the reversible detyrosination/tyrosination of the carboxy-terminal end of most alpha-tubulins has been extensively analysed. This modification cycle involves a specific carboxypeptidase and the activity of the tubulin-tyrosine ligase (TTL). The true physiological function of TTL has so far not been established. Tubulin-tyrosine ligase (TTL) catalyzes the ATP-dependent post-translational addition of a tyrosine to the carboxy terminal end of detyrosinated alpha-tubulin. In normally cycling cells, the tyrosinated form of tubulin predominates. However, in breast cancer cells, the detyrosinated form frequently predominates, with a correlation to tumour aggressiveness. On the other hand, 3-nitrotyrosine has been shown to be incorporated, by TTL, into the carboxy terminal end of detyrosinated alpha-tubulin. This reaction is not reversible by the carboxypeptidase enzyme. Cells cultured in 3-nitrotyrosine rich medium showed evidence of altered microtubule structure and function, including altered cell morphology, epithelial barrier dysfunction, and apoptosis. Bacterial homologs of TTL are predicted to form peptide tags. Some of these are fused to a 2-oxoglutarate Fe(II)-dependent dioxygenase domain." Q#26891 - CGI_10001331 superfamily 245622 176 338 6.35E-30 112.703 cl11446 Rhomboid superfamily - - "Rhomboid family; This family contains integral membrane proteins that are related to Drosophila rhomboid protein. 
Members of this family are found in bacteria and eukaryotes. Rhomboid promotes the cleavage of the membrane-anchored TGF-alpha-like growth factor Spitz, allowing it to activate the Drosophila EGF receptor. Analysis has shown that Rhomboid-1 is an intramembrane serine protease (EC:3.4.21.105). Parasite-encoded rhomboid enzymes are also important for invasion of host cells by Toxoplasma and the malaria parasite." Q#26891 - CGI_10001331 superfamily 247856 63 90 0.000666924 37.5273 cl17302 EFh superfamily N - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#26893 - CGI_10001431 superfamily 247057 102 138 5.17E-07 47.0888 cl15755 SAM_superfamily superfamily N - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#26894 - CGI_10001432 superfamily 243146 291 336 3.16E-06 44.5746 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. 
" Q#26894 - CGI_10001432 superfamily 243146 340 385 0.000145861 39.739 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#26894 - CGI_10001432 superfamily 243146 259 302 0.000335913 38.6935 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#26900 - CGI_10001456 superfamily 245603 307 374 0.00107496 37.5716 cl11403 pepsin_retropepsin_like superfamily C - "Cellular and retroviral pepsin-like aspartate proteases; This family includes both cellular and retroviral pepsin-like aspartate proteases. The cellular pepsin and pepsin-like enzymes are twice as long as their retroviral counterparts. The cellular pepsin-like aspartic proteases are found in mammals, plants, fungi and bacteria. These well known and extensively characterized enzymes include pepsins, chymosin, rennin, cathepsins, and fungal aspartic proteases. Several have long been known to be medically (rennin, cathepsin D and E, pepsin) or commercially (chymosin) important. The eukaryotic pepsin-like proteases contain two domains possessing similar topological features. The N- and C-terminal domains, although structurally related by a 2-fold axis, have only limited sequence homology except in the vicinity of the active site. This suggests that the enzymes evolved by an ancient duplication event. The eukaryotic pepsin-like proteases have two active site ASP residues with each N- and C-terminal lobe contributing one residue. While the fungal and mammalian pepsins are bilobal proteins, retropepsins function as dimers and the monomer resembles structure of the N- or C-terminal domains of eukaryotic enzyme. The active site motif (Asp-Thr/Ser-Gly-Ser) is conserved between the retroviral and eukaryotic proteases and between the N-and C-terminal of eukaryotic pepsin-like proteases. 
The retropepsin-like family includes pepsin-like aspartate proteases from retroviruses, retrotransposons and retroelements; as well as eukaryotic DNA-damage-inducible proteins (DDIs), and bacterial aspartate peptidases. Retropepsin is synthesized as part of the POL polyprotein that contains an aspartyl-protease, a reverse transcriptase, RNase H, and an integrase. The POL polyprotein undergoes specific enzymatic cleavage to yield the mature proteins. This family of aspartate proteases is classified by MEROPS as the peptidase family A1 (pepsin A) and A2 (retropepsin family)." Q#26901 - CGI_10001457 superfamily 243179 90 147 1.37E-13 63.1542 cl02781 tetraspanin_LEL superfamily N - "Tetraspanin, extracellular domain or large extracellular loop (LEL). Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. The tetraspanin family contains CD9, CD63, CD37, CD53, CD82, CD151, and CD81, amongst others. Tetraspanins are involved in diverse processes such as cell activation and proliferation, adhesion and motility, differentiation, cancer, and others. Their various functions may relate to their ability to act as molecular facilitators, grouping specific cell-surface proteins and affecting formation and stability of signaling complexes. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web", which may also include integrins."
Q#26903 - CGI_10001459 superfamily 220695 31 147 0.00211045 37.9435 cl18571 7TM_GPCR_Srx superfamily C - Serpentine type 7TM GPCR chemoreceptor Srx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srx is part of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#26904 - CGI_10001382 superfamily 247755 839 1060 1.45E-104 330.236 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#26904 - CGI_10001382 superfamily 247755 2 209 1.59E-102 324.843 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." 
Q#26908 - CGI_10001505 superfamily 218284 110 171 3.53E-07 46.4787 cl04786 SOUL superfamily NC - SOUL heme-binding protein; This family represents a group of putative heme-binding proteins. Our family includes archaeal and bacterial homologues. Q#26909 - CGI_10001506 superfamily 246679 42 169 2.35E-57 178.1 cl14632 Glo_EDI_BRP_like superfamily - - "This domain superfamily is found in a variety of structurally related metalloproteins, including the type I extradiol dioxygenases, glyoxalase I and a group of antibiotic resistance proteins; This domain superfamily is found in a variety of structurally related metalloproteins, including the type I extradiol dioxygenases, glyoxalase I and a group of antibiotic resistance proteins. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). Type I extradiol dioxygenases catalyze the incorporation of both atoms of molecular oxygen into aromatic substrates, which results in the cleavage of aromatic rings. They are key enzymes in the degradation of aromatic compounds. Type I extradiol dioxygenases include class I and class II enzymes. Class I and II enzymes show sequence similarity; the two-domain class II enzymes evolved from a class I enzyme through gene duplication. Glyoxylase I catalyzes the glutathione-dependent inactivation of toxic methylglyoxal, requiring zinc or nickel ions for activity. The antibiotic resistance proteins in this family use a variety of mechanisms to block the function of antibiotics. Bleomycin resistance protein (BLMA) sequesters bleomycin's activity by directly binding to it. Whereas, three types of fosfomycin resistance proteins employ different mechanisms to render fosfomycin inactive by modifying the fosfomycin molecule. Although the proteins in this superfamily are functionally distinct, their structures are similar. 
The difference among the three dimensional structures of the three types of proteins in this superfamily is interesting from an evolutionary perspective. Both glyoxalase I and BLMA show domain swapping between subunits. However, there is no domain swapping for type 1 extradiol dioxygenases." Q#26914 - CGI_10001325 superfamily 193257 61 283 6.81E-65 219.857 cl15086 AAA_9 superfamily - - "ATP-binding dynein motor region D5; The 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This particular family is the D5 ATP-binding region of the motor, but has lost its P-loop." Q#26914 - CGI_10001325 superfamily 193253 4 35 0.00669367 38.4793 cl15084 MT superfamily N - "Microtubule-binding stalk of dynein motor; the 380 kDa motor unit of dynein belongs to the AAA class of chaperone-like ATPases. The core of the 380 kDa motor unit contains a concatenated chain of six AAA modules, of which four correspond to the ATP binding sites with P-loop signatures described previously, and two are modules in which the P loop has been lost in evolution. This family is the region between D4 and D5 and is the two predicted alpha-helical coiled coil segments that form the stalk supporting the ATP-sensitive microtubule binding component." Q#26915 - CGI_10001550 superfamily 243037 282 335 1.27E-20 86.6203 cl02440 DAGK_acc superfamily C - Diacylglycerol kinase accessory domain; Diacylglycerol (DAG) is a second messenger that acts as a protein kinase C activator. This domain is assumed to be an accessory domain: its function is unknown. 
Q#26915 - CGI_10001550 superfamily 248019 1 18 0.00177641 36.4933 cl17465 DAGK_cat superfamily NC - "Diacylglycerol kinase catalytic domain; Diacylglycerol (DAG) is a second messenger that acts as a protein kinase C activator. The catalytic domain is assumed from the finding of bacterial homologues. YegS is the Escherichia coli protein in this family whose crystal structure reveals an active site in the inter-domain cleft formed by four conserved sequence motifs, revealing a novel metal-binding site. The residues of this site are conserved across the family." Q#26916 - CGI_10001551 superfamily 243037 1 74 2.07E-27 104.339 cl02440 DAGK_acc superfamily N - Diacylglycerol kinase accessory domain; Diacylglycerol (DAG) is a second messenger that acts as a protein kinase C activator. This domain is assumed to be an accessory domain: its function is unknown. Q#26917 - CGI_10001552 superfamily 247057 17 81 4.14E-32 109.039 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#26918 - CGI_10001553 superfamily 247805 82 279 8.30E-88 268.969 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. 
Q#26918 - CGI_10001553 superfamily 247905 290 415 7.03E-32 118.109 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#26919 - CGI_10001554 superfamily 247805 1 168 3.94E-82 250.48 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#26919 - CGI_10001554 superfamily 247905 179 308 1.31E-32 118.88 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#26920 - CGI_10001578 superfamily 241580 2 71 6.60E-10 54.6267 cl00061 FH superfamily - - "Forkhead (FH), also known as a "winged helix". 
FH is named for the Drosophila fork head protein, a transcription factor which promotes terminal rather than segmental development. This family of transcription factor domains, which bind to B-DNA as monomers, are also found in the Hepatocyte nuclear factor (HNF) proteins, which provide tissue-specific gene regulation. The structure contains 2 flexible loops or "wings" in the C-terminal region, hence the term winged helix." Q#26923 - CGI_10001581 superfamily 241563 35 77 3.69E-06 44.3924 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#26924 - CGI_10001562 superfamily 241574 292 346 1.54E-17 81.0929 cl00053 PTPc superfamily N - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#26924 - CGI_10001562 superfamily 241574 416 500 9.01E-10 57.5957 cl00053 PTPc superfamily N - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. 
Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#26925 - CGI_10001630 superfamily 245840 35 139 0.00262846 35.7904 cl12022 Ribosomal_L18e superfamily C - Ribosomal protein L18e/L15; This family includes eukaryotic L18 as well as prokaryotic L15. Q#26926 - CGI_10001722 superfamily 245211 301 886 0 621.485 cl09939 RNR_PFL superfamily - - "Ribonucleotide reductase and Pyruvate formate lyase; Ribonucleotide reductase (RNR) and pyruvate formate lyase (PFL) are believed to have diverged from a common ancestor. They have a structurally similar ten-stranded alpha-beta barrel domain that hosts the active site, and are radical enzymes. RNRs are found in all organisms and provide the only mechanism by which nucleotides are converted to deoxynucleotides. RNRs are separated into three classes based on their metallocofactor usage. Class I RNRs use a diiron-tyrosyl radical while Class II RNRs use coenzyme B12 (adenosylcobalamin, AdoCbl). Class III RNRs use an FeS cluster and S-adenosylmethionine to generate a glycyl radical. PFL, an essential enzyme in anaerobic bacteria, catalyzes the conversion of pyruvate and CoA to acteylCoA and formate in a mechanism that uses a glycyl radical." Q#26926 - CGI_10001722 superfamily 217585 23 114 8.06E-22 91.9244 cl12279 ATP-cone superfamily - - ATP cone domain; ATP cone domain. Q#26928 - CGI_10001726 superfamily 218479 149 284 5.04E-46 153.424 cl04965 DapB_C superfamily - - "Dihydrodipicolinate reductase, C-terminus; Dihydrodipicolinate reductase (DapB) reduces the alpha,beta-unsaturated cyclic imine, dihydro-dipicolinate. This reaction is the second committed step in the biosynthesis of L-lysine and its precursor meso-diaminopimelate, which are critical for both protein and cell wall biosynthesis. 
The C-terminal domain of DapB has been proposed to be the substrate- binding domain." Q#26928 - CGI_10001726 superfamily 216304 4 145 2.08E-12 61.8772 cl18363 DapB_N superfamily - - "Dihydrodipicolinate reductase, N-terminus; Dihydrodipicolinate reductase (DapB) reduces the alpha,beta-unsaturated cyclic imine, dihydro-dipicolinate. This reaction is the second committed step in the biosynthesis of L-lysine and its precursor meso-diaminopimelate, which are critical for both protein and cell wall biosynthesis. The N-terminal domain of DapB binds the dinucleotide NADPH." Q#26929 - CGI_10001729 superfamily 247861 40 183 5.76E-38 129.554 cl17307 SpoU_methylase superfamily - - SpoU rRNA Methylase family; This family of proteins probably use S-AdoMet. Q#26930 - CGI_10001730 superfamily 241550 211 312 1.07E-55 186.631 cl00015 nt_trans superfamily N - "nucleotidyl transferase superfamily; nt_trans (nucleotidyl transferase) This superfamily includes the class I amino-acyl tRNA synthetases, pantothenate synthetase (PanC), ATP sulfurylase, and the cytidylyltransferases, all of which have a conserved dinucleotide-binding domain." Q#26930 - CGI_10001730 superfamily 241550 5 116 2.40E-45 158.896 cl00015 nt_trans superfamily C - "nucleotidyl transferase superfamily; nt_trans (nucleotidyl transferase) This superfamily includes the class I amino-acyl tRNA synthetases, pantothenate synthetase (PanC), ATP sulfurylase, and the cytidylyltransferases, all of which have a conserved dinucleotide-binding domain." Q#26930 - CGI_10001730 superfamily 245839 371 482 9.75E-15 71.439 cl12020 Anticodon_Ia_like superfamily N - "Anticodon-binding domain of class Ia aminoacyl tRNA synthetases and similar domains; This domain is found in a variety of class Ia aminoacyl tRNA synthetases, C-terminal to the catalytic core domain. It recognizes and specifically binds to the anticodon of the tRNA. 
Aminoacyl tRNA synthetases catalyze the transfer of cognate amino acids to the 3'-end of their tRNAs by specifically recognizing cognate from non-cognate amino acids. Members include valyl-, leucyl-, isoleucyl-, cysteinyl-, arginyl-, and methionyl-tRNA synthethases. This superfamily also includes a domain from MshC, an enzyme in the mycothiol biosynthetic pathway." Q#26931 - CGI_10001731 superfamily 247740 75 291 1.69E-57 185.415 cl17186 TIM_phosphate_binding superfamily - - "TIM barrel proteins share a structurally conserved phosphate binding motif and in general share an eight beta/alpha closed barrel structure. Specific for this family is the conserved phosphate binding site at the edges of strands 7 and 8. The phosphate comes either from the substrate, as in the case of inosine monophosphate dehydrogenase (IMPDH), or from ribulose-5-phosphate 3-epimerase (RPE) or from cofactors, like FMN." Q#26932 - CGI_10001732 superfamily 244842 149 572 0 813.611 cl08031 ThiC superfamily - - "ThiC family; ThiC is found within the thiamine biosynthesis operon. ThiC is involved in pyrimidine biosynthesis. The precise catalytic function of ThiC is still not known. ThiC participates in the formation of 4-Amino-5-hydroxymethyl-2-methylpyrimidine from AIR, an intermediate in the de novo pyrimidine biosynthesis." Q#26932 - CGI_10001732 superfamily 222303 18 108 2.43E-20 86.3897 cl16343 ThiC-associated superfamily - - "ThiC-associated domain; This domain is most frequently found at the N-terminus of the ThiC family of proteins, pfam01964. The function is not known." Q#26934 - CGI_10001549 superfamily 218280 33 63 1.66E-06 42.2065 cl04781 Rad21_Rec8_N superfamily NC - "N terminus of Rad21 / Rec8 like protein; This family represents a conserved N-terminal region found in eukaryotic cohesins of the Rad21, Rec8 and Scc1 families. Members of this family mediate sister chromatid cohesion during mitosis and meiosis, as part of the cohesin complex. 
Cohesion is necessary for homologous recombination (including double-strand break repair) and correct chromatid segregation. These proteins may also be involved in chromosome condensation. Dissociation at the metaphase to anaphase transition causes loss of cohesion and chromatid segregation." Q#26935 - CGI_10001743 superfamily 243035 31 145 9.44E-14 66.4893 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. 
C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#26935 - CGI_10001743 superfamily 243035 169 279 4.74E-07 47.2294 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. 
pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#26936 - CGI_10001744 superfamily 242274 421 534 1.11E-06 47.7922 cl01053 SGNH_hydrolase superfamily N - "SGNH_hydrolase, or GDSL_hydrolase, is a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the typical Ser-His-Asp(Glu) triad from other serine hydrolases, but may lack the carboxylic acid." Q#26937 - CGI_10001603 superfamily 248469 13 123 4.40E-09 52.7575 cl17915 HAD_like superfamily - - "Haloacid dehalogenase-like hydrolases. The haloacid dehalogenase-like (HAD) superfamily includes L-2-haloacid dehalogenase, epoxide hydrolase, phosphoserine phosphatase, phosphomannomutase, phosphoglycolate phosphatase, P-type ATPase, and many others, all of which use a nucleophilic aspartate in their phosphoryl transfer reaction. All members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. Members of this superfamily are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases." Q#26937 - CGI_10001603 superfamily 248469 166 231 7.24E-08 49.2907 cl17915 HAD_like superfamily N - "Haloacid dehalogenase-like hydrolases. The haloacid dehalogenase-like (HAD) superfamily includes L-2-haloacid dehalogenase, epoxide hydrolase, phosphoserine phosphatase, phosphomannomutase, phosphoglycolate phosphatase, P-type ATPase, and many others, all of which use a nucleophilic aspartate in their phosphoryl transfer reaction. All members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. Members of this superfamily are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases."
Q#26938 - CGI_10001604 superfamily 245201 147 430 0 526.318 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#26938 - CGI_10001604 superfamily 243090 14 145 4.14E-38 136.561 cl02565 RGS superfamily - - "Regulator of G protein signaling (RGS) domain superfamily; The RGS domain is an essential part of the Regulator of G-protein Signaling (RGS) protein family, a diverse group of multifunctional proteins that regulate cellular signaling events downstream of G-protein coupled receptors (GPCRs). RGS proteins play critical regulatory roles as GTPase activating proteins (GAPs) of the heterotrimeric G-protein G-alpha-subunits. While inactive, G-alpha-subunits bind GDP, which is released and replaced by GTP upon agonist activation. GTP binding leads to dissociation of the alpha-subunit and the beta-gamma-dimer, allowing them to interact with effectors molecules and propagate signaling cascades associated with cellular growth, survival, migration, and invasion. Deactivation of the G-protein signaling controlled by the RGS domain accelerates GTPase activity of the alpha subunit by hydrolysis of GTP to GDP, which results in the reassociation of the alpha-subunit with the beta-gamma-dimer and thereby inhibition of downstream activity. 
As a major G-protein regulator, RGS domain containing proteins are involved in many crucial cellular processes such as regulation of intracellular trafficking, glial differentiation, embryonic axis formation, skeletal and muscle development, and cell migration during early embryogenesis. RGS proteins are also involved in apoptosis and cell proliferation, as well as modulation of cardiac development. Several RGS proteins can fine-tune immune responses, while others play important roles in neuronal signals modulation. Some RGS proteins are principal elements needed for proper vision." Q#26939 - CGI_10001989 superfamily 215763 207 244 2.92E-07 45.6739 cl02815 HTH_AraC superfamily - - "Bacterial regulatory helix-turn-helix proteins, AraC family; In the absence of arabinose, the N-terminal arm of AraC binds to the DNA binding domain (pfam00165) and helps to hold the two DNA binding domains in a relative orientation that favours DNA looping. In the presence of arabinose, the arms bind over the arabinose on the dimerisation domain, thus freeing the DNA-binding domains. The freed DNA-binding domains are then able to assume a conformation suitable for binding to the adjacent DNA sites that are utilised when AraC activates transcription, and hence AraC ceases looping the DNA when arabinose is added." 
Q#26940 - CGI_10001990 superfamily 182155 1 296 1.66E-144 411.511 cl08072 PRK09936 superfamily - - hypothetical protein; Provisional Q#26941 - CGI_10001991 superfamily 241593 126 228 4.47E-28 103.495 cl00075 HATPase_c superfamily - - "Histidine kinase-like ATPases; This family includes several ATP-binding proteins for example: histidine kinase, DNA gyrase B, topoisomerases, heat shock protein HSP90, phytochrome-like ATPases and DNA mismatch repair proteins" Q#26941 - CGI_10001991 superfamily 241595 11 73 5.16E-12 58.378 cl00080 HisKA superfamily - - "Histidine Kinase A (dimerization/phosphoacceptor) domain; Histidine Kinase A dimers are formed through parallel association of 2 domains creating 4-helix bundles; usually these domains contain a conserved His residue and are activated via trans-autophosphorylation by the catalytic domain of the histidine kinase. They subsequently transfer the phosphoryl group to the Asp acceptor residue of a response regulator protein. Two-component signalling systems, consisting of a histidine protein kinase that senses a signal input and a response regulator that mediates the output, are ancient and evolutionarily conserved signaling mechanisms in prokaryotes and eukaryotes." Q#26942 - CGI_10001993 superfamily 248290 4 116 3.10E-35 122.684 cl17736 REC superfamily - - "Signal receiver domain; originally thought to be unique to bacteria (CheY, OmpR, NtrC, and PhoB), now recently identified in eukaryotes ETR1 Arabidopsis thaliana; this domain receives the signal from the sensor partner in a two-component system; contains a phosphoacceptor site that is phosphorylated by histidine kinase homologs; usually found N-terminal to a DNA binding effector domain; forms homodimers" Q#26942 - CGI_10001993 superfamily 247909 128 221 3.43E-33 116.78 cl17355 trans_reg_C superfamily - - "Effector domain of response regulator.
Bacteria and certain eukaryotes like protozoa and higher plants use two-component signal transduction systems to detect and respond to changes in the environment. The system consists of a sensor histidine kinase and a response regulator. The former autophosphorylates in a histidine residue on detecting an external stimulus. The phosphate is then transferred to an invariant aspartate residue in a highly conserved receiver domain of the response regulator. Phosphorylation activates a variable effector domain of the response regulator, which triggers the cellular response. The C-terminal effector domain contains DNA and RNA polymerase binding sites. Several dimers or monomers bind head to tail to small tandem repeats upstream of the genes. The RNA polymerase binding sites interact with the alpha or sigma subunit of RNA polymerase." Q#26943 - CGI_10001995 superfamily 243005 1 102 4.29E-55 169.277 cl02363 CusF_Ec superfamily - - "Copper binding periplasmic protein CusF; CusF is a periplasmic protein involved in copper and silver resistance in Escherichia coli. CusF forms a five-stranded beta-barrel OB fold. Cu(I) binds to H36, M47 and M49 which are conserved residues in the protein." Q#26944 - CGI_10001996 superfamily 222128 199 300 6.97E-12 61.2399 cl18638 HlyD_3 superfamily - - HlyD family secretion protein; This is a family of largely bacterial haemolysin translocator HlyD proteins. Q#26947 - CGI_10002000 superfamily 241917 2 211 8.39E-59 184.708 cl00514 Nitro_FMN_reductase superfamily - - "Proteins of this family catalyze the reduction of flavin or nitrocompounds using NAD(P)H as electron donor in an obligatory two-electron transfer, utilizing FMN or FAD as cofactor. They are often found to be homodimers. Enzymes of this family are described as NAD(P)H:FMN oxidoreductases, oxygen-insensitive nitroreductase, flavin reductase P, dihydropteridine reductase, NADH oxidase or NADH dehydrogenase."
Q#26948 - CGI_10002001 superfamily 242218 1 372 0 709.385 cl00954 GCS2 superfamily - - "Glutamate-cysteine ligase family 2(GCS2); Also known as gamma-glutamylcysteine synthetase and gamma-ECS (EC:6.3.2.2). This enzyme catalyzes the first and rate limiting step in de novo glutathione biosynthesis. Members of this family are found in archaea, bacteria and plants. May and Leaver discuss the possible evolutionary origins of glutamate-cysteine ligase enzymes in different organisms and suggest that it evolved independently in different eukaryotes, from an ancestral bacterial enzyme. They also state that Arabidopsis thaliana gamma-glutamylcysteine synthetase is structurally unrelated to mammalian, yeast and Escherichia coli homologues. In plants, there are separate cytosolic and chloroplast forms of the enzyme." Q#26949 - CGI_10002002 superfamily 247798 67 764 7.69E-117 369.088 cl17244 OM_channels superfamily - - "Porin superfamily. These outer membrane channels share a beta-barrel structure that differ in strand and shear number. Classical (gram-negative) porins are non-specific channels for small hydrophilic molecules and form 16 beta-stranded barrels (16,20), which associate as trimers. Maltoporin-like channels have specificities for various sugars and form 18 beta-stranded barrels (18,22), which associate as trimers. Ligand-gated protein channels cooperate with a TonB associated inner membrane complex to actively transport ligands via the proton motive force and they form monomeric, (22,24) barrels. The 150-200 N-terminal residues form a plug that blocks the channel from the periplasmic end." Q#26950 - CGI_10002003 superfamily 221235 3 130 2.91E-39 137.12 cl13275 DUF3327 superfamily - - Domain of unknown function (DUF3327); Domain of unknown function (DUF3327).
Q#26950 - CGI_10002003 superfamily 225375 133 284 2.77E-10 58.9569 cl18715 COG2819 superfamily C - Predicted hydrolase of the alpha/beta superfamily [General function prediction only] Q#26951 - CGI_10002004 superfamily 247692 511 1000 0 567.843 cl17068 AFD_class_I superfamily - - "Adenylate forming domain, Class I; This family includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain." Q#26951 - CGI_10002004 superfamily 245209 1022 1084 1.05E-08 53.7126 cl09936 PP-binding superfamily - - Phosphopantetheine attachment site; A 4'-phosphopantetheine prosthetic group is attached through a serine. This prosthetic group acts as a 'swinging arm' for the attachment of activated fatty acid and amino-acid groups. This domain forms a four helix bundle. This family includes members not included in Prosite. The inclusion of these members is supported by sequence analysis and functional evidence. The related domain of Vibrio anguillarum angR has the attachment serine replaced by an alanine. Q#26952 - CGI_10002006 superfamily 247755 1 170 8.30E-56 177.628 cl17201 ABC_ATPase superfamily N - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family.
ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#26954 - CGI_10001608 superfamily 241862 200 349 1.13E-21 91.6488 cl00437 COG0428 superfamily N - Predicted divalent heavy-metal cations transporter [Inorganic ion transport and metabolism] Q#26955 - CGI_10001609 superfamily 247725 2 17 0.000819563 33.8994 cl17171 PH-like superfamily N - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#26956 - CGI_10001115 superfamily 245205 139 217 8.29E-10 54.1661 cl09930 RPA_2b-aaRSs_OBF_like superfamily - - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. 
The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." Q#26956 - CGI_10001115 superfamily 245205 19 98 0.000822439 36.8321 cl09930 RPA_2b-aaRSs_OBF_like superfamily - - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." 
Q#26958 - CGI_10001443 superfamily 247684 51 470 4.60E-87 279.933 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#26959 - CGI_10001444 superfamily 243035 1 64 0.00759993 31.051 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. 
Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. 
A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#26960 - CGI_10002078 superfamily 244888 1 274 1.20E-78 242.963 cl08282 Acyl_transf_1 superfamily - - Acyl transferase domain; Acyl transferase domain. Q#26961 - CGI_10002080 superfamily 247755 206 435 1.53E-111 331.068 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#26961 - CGI_10002080 superfamily 241857 64 192 3.62E-16 76.1632 cl00427 TM_PBP2 superfamily C - "Transmembrane subunit (TM) found in Periplasmic Binding Protein (PBP)-dependent ATP-Binding Cassette (ABC) transporters which generally bind type 2 PBPs. These types of transporters consist of a PBP, two TMs, and two cytoplasmic ABC ATPase subunits, and are mainly involved in importing solutes from the environment. The solute is captured by the PBP which delivers it to a gated translocation pathway formed by the two TMs. The two ABCs bind and hydrolyze ATP and drive the transport reaction. For these transporters the ABCs and TMs are on independent polypeptide chains. These systems transport a diverse range of substrates. 
Most are specific for a single substrate or a group of related substrates; however some transporters are more promiscuous, transporting structurally diverse substrates such as the histidine/lysine and arginine transporter in Enterobacteriaceae. In the latter case, this is achieved through binding different PBPs with different specificities to the TMs. For other promiscuous transporters such as the multiple-sugar transporter Msm of Streptococcus mutans, the PBP has a wide substrate specificity. These transporters include the maltose-maltodextrin, phosphate and sulfate transporters, among others." Q#26962 - CGI_10002081 superfamily 241857 406 613 2.55E-21 92.3416 cl00427 TM_PBP2 superfamily - - "Transmembrane subunit (TM) found in Periplasmic Binding Protein (PBP)-dependent ATP-Binding Cassette (ABC) transporters which generally bind type 2 PBPs. These types of transporters consist of a PBP, two TMs, and two cytoplasmic ABC ATPase subunits, and are mainly involved in importing solutes from the environment. The solute is captured by the PBP which delivers it to a gated translocation pathway formed by the two TMs. The two ABCs bind and hydrolyze ATP and drive the transport reaction. For these transporters the ABCs and TMs are on independent polypeptide chains. These systems transport a diverse range of substrates. Most are specific for a single substrate or a group of related substrates; however some transporters are more promiscuous, transporting structurally diverse substrates such as the histidine/lysine and arginine transporter in Enterobacteriaceae. In the latter case, this is achieved through binding different PBPs with different specificities to the TMs. For other promiscuous transporters such as the multiple-sugar transporter Msm of Streptococcus mutans, the PBP has a wide substrate specificity. These transporters include the maltose-maltodextrin, phosphate and sulfate transporters, among others." 
Q#26962 - CGI_10002081 superfamily 247850 35 347 6.99E-100 309.758 cl17296 PBP_like_2 superfamily - - PBP superfamily domain; This domain belongs to the periplasmic binding protein superfamily. Q#26963 - CGI_10002082 superfamily 247724 201 378 4.10E-65 208.852 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#26963 - CGI_10002082 superfamily 205348 33 125 0.00234039 35.9229 cl16145 GTP-bdg_N superfamily - - "GTP-binding GTPase N-terminal; This is the N-terminal region of GTP-binding HflX-like proteins. The full-length members bind and interact with the 50S ribosome and are GTPases, hydrolysing GTP/GDP/ATP/ADP. This N-terminal region is necessary for stability of the whole protein." 
Q#26964 - CGI_10002089 superfamily 247780 209 435 3.54E-102 306.385 cl17226 NAD_bind_amino_acid_DH superfamily - - "NAD(P) binding domain of amino acid dehydrogenase-like proteins; Amino acid dehydrogenase(DH)-like NAD(P)-binding domains are members of the Rossmann fold superfamily and are found in glutamate, leucine, and phenylalanine DHs, methylene tetrahydrofolate DH, methylene-tetrahydromethanopterin DH, methylene-tetrahydrofolate DH/cyclohydrolase, Shikimate DH-like proteins, malate oxidoreductases, and glutamyl tRNA reductase. Amino acid DHs catalyze the deamination of amino acids to keto acids with NAD(P)+ as a cofactor. The NAD(P)-binding Rossmann fold superfamily includes a wide variety of protein families including NAD(P)-binding domains of alcohol DHs, tyrosine-dependent oxidoreductases, glyceraldehyde-3-phosphate DH, lactate/malate DHs, formate/glycerate DHs, siroheme synthases, 6-phosphogluconate DH, amino acid DHs, repressor rex, NAD-binding potassium channel domain, CoA-binding, and ornithine cyclodeaminase-like domains. These domains have an alpha-beta-alpha configuration. NAD binding involves numerous hydrogen and van der Waals contacts." Q#26964 - CGI_10002089 superfamily 202408 63 190 1.77E-60 195.412 cl08368 ELFV_dehydrog_N superfamily - - "Glu/Leu/Phe/Val dehydrogenase, dimerisation domain; Glu/Leu/Phe/Val dehydrogenase, dimerisation domain. " Q#26965 - CGI_10002090 superfamily 246713 3 120 1.63E-30 108.868 cl14786 ENDO3c superfamily N - "endonuclease III; includes endonuclease III (DNA-(apurinic or apyrimidinic site) lyase), alkylbase DNA glycosidases (Alka-family) and other DNA glycosidases" Q#26966 - CGI_10002091 superfamily 241739 104 249 8.59E-81 258.659 cl00268 class_II_aaRS-like_core superfamily C - "Class II tRNA amino-acyl synthetase-like catalytic core domain. Class II amino acyl-tRNA synthetases (aaRS) share a common fold and generally attach an amino acid to the 3' OH of ribose of the appropriate tRNA. 
PheRS is an exception in that it attaches the amino acid at the 2'-OH group, like class I aaRSs. These enzymes are usually homodimers. This domain is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. The substrate specificity of this reaction is further determined by additional domains. Interestingly, this domain is also found in asparagine synthase A (AsnA), in the accessory subunit of mitochondrial polymerase gamma and in the bacterial ATP phosphoribosyltransferase regulatory subunit HisZ." Q#26966 - CGI_10002091 superfamily 241739 391 525 8.52E-56 191.635 cl00268 class_II_aaRS-like_core superfamily N - "Class II tRNA amino-acyl synthetase-like catalytic core domain. Class II amino acyl-tRNA synthetases (aaRS) share a common fold and generally attach an amino acid to the 3' OH of ribose of the appropriate tRNA. PheRS is an exception in that it attaches the amino acid at the 2'-OH group, like class I aaRSs. These enzymes are usually homodimers. This domain is primarily responsible for ATP-dependent formation of the enzyme bound aminoacyl-adenylate. The substrate specificity of this reaction is further determined by additional domains. Interestingly, this domain is also found in asparagine synthase A (AsnA), in the accessory subunit of mitochondrial polymerase gamma and in the bacterial ATP phosphoribosyltransferase regulatory subunit HisZ." Q#26966 - CGI_10002091 superfamily 245205 2 100 2.50E-31 118.778 cl09930 RPA_2b-aaRSs_OBF_like superfamily N - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. 
RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." Q#26966 - CGI_10002091 superfamily 218125 525 614 2.41E-24 98.2955 cl18443 Bactofilin superfamily - - "Polymer-forming cytoskeletal; This is a family of bactofilins, a functionally diverse class of cytoskeletal, polymer-forming, proteins that is widely conserved among bacteria. In the example species C. crescentus, two bactofilins assemble into a membrane-associated laminar structure that shows cell-cycle-dependent polar localisation and acts as a platform for the recruitment of a cell wall biosynthetic enzyme involved in polar morphogenesis. Bactofilins display distinct subcellular distributions and dynamics in different bacterial species, suggesting that they are versatile structural elements that have adopted a range of different cellular functions." Q#26966 - CGI_10002091 superfamily 145868 282 371 3.56E-05 42.2402 cl08382 GAD superfamily - - GAD domain; This domain is found in some members of the GatB and aspartyl tRNA synthetases. 
Q#26967 - CGI_10002094 superfamily 247755 359 586 4.55E-95 293.363 cl17201 ABC_ATPase superfamily - - "ATP-binding cassette transporter nucleotide-binding domain; ABC transporters are a large family of proteins involved in the transport of a wide variety of different compounds, like sugars, ions, peptides, and more complex organic molecules. The nucleotide-binding domain shows the highest similarity between all members of the family. ABC transporters are a subset of nucleotide hydrolases that contain a signature motif, Q-loop, and H-loop/switch region, in addition to, the Walker A motif/P-loop and Walker B motif commonly found in a number of ATP- and GTP-binding and hydrolyzing proteins." Q#26967 - CGI_10002094 superfamily 216049 37 307 1.45E-33 128.943 cl18356 ABC_membrane superfamily - - ABC transporter transmembrane region; This family represents a unit of six transmembrane helices. Many members of the ABC transporter family (pfam00005) have two such regions. Q#26968 - CGI_10001813 superfamily 243034 276 372 2.02E-05 43.1376 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein 
complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#26968 - CGI_10001813 superfamily 243034 161 270 0.000150165 40.4412 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#26968 - CGI_10001813 superfamily 248374 410 565 1.02E-61 202.999 cl17820 Asp_Arg_Hydrox superfamily - - "Aspartyl/Asparaginyl beta-hydroxylase; Iron (II)/2-oxoglutarate (2-OG)-dependent oxygenases catalyze oxidative reactions in a range of metabolic processes. 
Proline 3-hydroxylase hydroxylates proline at position 3, the first of a 2-OG oxygenase catalyzing oxidation of a free alpha-amino acid. The structure of proline 3-hydroxylase contains the conserved motifs present in other 2-OG oxygenases including a jelly roll strand core and residues binding iron and 2-oxoglutarate, consistent with divergent evolution within the extended family. This family represents the arginine, asparagine and proline hydroxylases. The aspartyl/asparaginyl beta-hydroxylase (EC:1.14.11.16) specifically hydroxylates one aspartic or asparagine residue in certain epidermal growth factor-like domains of a number of proteins." Q#26970 - CGI_10001815 superfamily 248097 153 256 3.67E-09 52.6526 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#26970 - CGI_10001815 superfamily 221533 46 84 0.00354595 34.5948 cl13726 TMF_DNA_bd superfamily N - "TATA element modulatory factor 1 DNA binding; This is the middle region of a family of TATA element modulatory factor 1 proteins conserved in eukaryotes that contains at its N-terminal section a number of leucine zippers that could potentially form coiled coil structures. The whole proteins bind to the TATA element of some RNA polymerase II promoters and repress their activity by competing with the binding of TATA binding protein. TMFs are evolutionarily conserved golgins that bind Rab6, a ubiquitous ras-like GTP-binding Golgi protein, and contribute to Golgi organisation in animal and plant cells." Q#26971 - CGI_10002228 superfamily 245818 57 212 4.33E-26 104.176 cl11966 Rel-Spo_like superfamily - - "RelA- and SpoT-like ppGpp Synthetases and Hydrolases, catalytic domain; The Rel-Spo superfamily includes the catalytic domains of Escherichia coli ppGpp synthetase (RelA), ppGpp synthetase/hydrolase (SpoT), and related proteins. RelA synthesizes (p)ppGpp in response to amino-acid starvation and in association with ribosomes. 
(p)ppGpp triggers the bacterial stringent response. SpoT catalyzes (p)ppGpp synthesis under carbon limitation in a ribosome-independent manner. It also catalyzes (p)ppGpp degradation. Gram-negative bacteria have two enzymes involved in (p)ppGpp metabolism while most Gram-positive organisms have a single Rel-Spo enzyme (Rel), which both synthesizes and degrades (p)ppGpp. The Arabidopsis thaliana Rel-Spo proteins, At-RSH1,-2, and-3 appear to regulate a rapid (p)ppGpp-mediated response to pathogens and other stresses. This catalytic domain is found in association with an N-terminal HD domain and a C-terminal metal dependent phosphohydrolase domain (TGS). Some Rel-Spo proteins also have a C-terminal regulatory ACT domain." Q#26971 - CGI_10002228 superfamily 218331 367 505 5.71E-42 149.727 cl08427 PAP_RNA-bind superfamily - - Poly(A) polymerase predicted RNA binding domain; Based on its similarity structurally to the RNA recognition motif this domain is thought to be RNA binding. Q#26973 - CGI_10002230 superfamily 191179 3 437 0 538.418 cl04912 Menin superfamily C - "Menin; MEN1, the gene responsible for multiple endocrine neoplasia type 1, is a tumour suppressor gene that encodes a protein called Menin which may be an atypical GTPase stimulated by nm23." Q#26973 - CGI_10002230 superfamily 191179 614 712 5.79E-16 80.4153 cl04912 Menin superfamily N - "Menin; MEN1, the gene responsible for multiple endocrine neoplasia type 1, is a tumour suppressor gene that encodes a protein called Menin which may be an atypical GTPase stimulated by nm23." Q#26974 - CGI_10002231 superfamily 218652 39 86 0.000717403 36.5026 cl12311 CLPTM1 superfamily C - "Cleft lip and palate transmembrane protein 1 (CLPTM1); This family consists of several eukaryotic cleft lip and palate transmembrane protein 1 sequences. Cleft lip with or without cleft palate is a common birth defect that is genetically complex. 
The nonsyndromic forms have been studied genetically using linkage and candidate-gene association studies with only partial success in defining the loci responsible for orofacial clefting. CLPTM1 encodes a transmembrane protein and has strong homology to two Caenorhabditis elegans genes, suggesting that CLPTM1 may belong to a new gene family. This family also contains the human cisplatin resistance related protein CRR9p which is associated with CDDP-induced apoptosis." Q#26976 - CGI_10002026 superfamily 246612 44 281 2.93E-37 135.976 cl14057 BPL_LplA_LipB superfamily - - "Biotin/lipoate A/B protein ligase family; This family includes biotin protein ligase, lipoate-protein ligase A and B. Biotin is covalently attached at the active site of certain enzymes that transfer carbon dioxide from bicarbonate to organic acids to form cellular metabolites. Biotin protein ligase (BPL) is the enzyme responsible for attaching biotin to a specific lysine at the active site of biotin enzymes. Each organism probably has only one BPL. Biotin attachment is a two step reaction that results in the formation of an amide linkage between the carboxyl group of biotin and the epsilon-amino group of the modified lysine. Lipoate-protein ligase A (LPLA) catalyzes the formation of an amide linkage between lipoic acid and a specific lysine residue in lipoate dependent enzymes. The unusual biosynthesis pathway of lipoic acid is mechanistically intertwined with attachment of the cofactor." Q#26978 - CGI_10002028 superfamily 241688 84 376 4.96E-89 271.732 cl00210 Isoprenoid_Biosyn_C1 superfamily - - "Isoprenoid Biosynthesis enzymes, Class 1; Superfamily of trans-isoprenyl diphosphate synthases (IPPS) and class I terpene cyclases which either synthesize geranyl/farnesyl diphosphates (GPP/FPP) or longer chained products from isoprene precursors, isopentenyl diphosphate (IPP) and dimethylallyl diphosphate (DMAPP), or use geranyl (C10)-, farnesyl (C15)-, or geranylgeranyl (C20)-diphosphate as substrate. 
These enzymes produce a myriad of precursors for such end products as steroids, cholesterol, sesquiterpenes, heme, carotenoids, retinoids, and diterpenes; and are widely distributed among archaea, bacteria, and eukaryota. The enzymes in this superfamily share the same 'isoprenoid synthase fold' and include several subgroups. The head-to-tail (HT) IPPS catalyze the successive 1'-4 condensation of the 5-carbon IPP to the growing isoprene chain to form linear, all-trans, C10-, C15-, C20-, C25-, C30-, C35-, C40-, C45-, or C50-isoprenoid diphosphates. Cyclic monoterpenes, diterpenes, and sesquiterpenes, are formed from their respective linear isoprenoid diphosphates by class I terpene cyclases. The head-to-head (HH) IPPS catalyze the successive 1'-1 condensation of 2 farnesyl or 2 geranylgeranyl isoprenoid diphosphates. Cyclization of these 30- and 40-carbon linear forms is catalyzed by class II cyclases. Both the isoprenoid chain elongation reactions and the class I terpene cyclization reactions proceed via electrophilic alkylations in which a new carbon-carbon single bond is generated through interaction between a highly reactive electron-deficient allylic carbocation and an electron-rich carbon-carbon double bond. The catalytic site consists of a large central cavity formed by mostly antiparallel alpha helices with two aspartate-rich regions located on opposite walls. These residues mediate binding of prenyl phosphates via bridging Mg2+ ions, inducing proposed conformational changes that close the active site to solvent, stabilizing reactive carbocation intermediates. Generally, the enzymes in this family exhibit an all-trans reaction pathway; an exception is the cis-trans terpene cyclase, trichodiene synthase. Mechanistically and structurally distinct, class II terpene cyclases and cis-IPPS are not included in this CD." 
Q#26979 - CGI_10002029 superfamily 248388 3 167 2.50E-09 52.3309 cl17834 Ser_hydrolase superfamily - - "Serine hydrolase; Members of this family have serine hydrolase activity. They contain a conserved serine hydrolase motif, GXSXG/A, where the serine is a putative nucleophile. This family has an alpha-beta hydrolase fold. Eukaryotic members of this family have a conserved LXCXE motif, which binds to retinoblastomas. This motif is absent from prokaryotic members of this family." Q#26980 - CGI_10002030 superfamily 220792 6 104 1.22E-20 81.2983 cl11150 EPL1 superfamily C - Enhancer of polycomb-like; This is a family of EPL1 (Enhancer of polycomb-like) proteins. The EPL1 protein is a member of a histone acetyltransferase complex which is involved in transcriptional activation of selected genes. Q#26981 - CGI_10001734 superfamily 245862 54 266 5.10E-39 138.485 cl12076 THUMP superfamily - - "THUMP domain, predicted to bind RNA; The THUMP domain is named after THioUridine synthases, RNA Methyltransferases and Pseudo-uridine synthases. It is predicted to be an RNA-binding domain and probably functions by delivering a variety of RNA modification enzymes to their targets." Q#26982 - CGI_10001735 superfamily 192535 58 292 6.79E-06 46.0498 cl18179 7TM_GPCR_Srsx superfamily C - Serpentine type 7TM GPCR chemoreceptor Srsx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srsx is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#26984 - CGI_10001737 superfamily 215754 113 214 4.19E-17 74.5972 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. 
Q#26984 - CGI_10001737 superfamily 215754 21 112 6.54E-15 68.434 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#26984 - CGI_10001737 superfamily 215754 216 303 2.09E-14 67.2784 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#26986 - CGI_10001749 superfamily 241563 68 109 2.84E-05 41.696 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#26987 - CGI_10001716 superfamily 243092 37 318 4.28E-26 105.495 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#26989 - CGI_10001718 superfamily 207794 15 288 8.51E-123 363.841 cl02948 GH20_hexosaminidase superfamily - - "Beta-N-acetylhexosaminidases of glycosyl hydrolase family 20 (GH20) catalyze the removal of beta-1,4-linked N-acetyl-D-hexosamine residues from the non-reducing ends of N-acetyl-beta-D-hexosaminides including N-acetylglucosides and N-acetylgalactosides. These enzymes are broadly distributed in microorganisms, plants and animals, and play roles in various key physiological and pathological processes. These processes include cell structural integrity, energy storage, cellular signaling, fertilization, pathogen defense, viral penetration, the development of carcinomas, inflammatory events and lysosomal storage disorders. The GH20 enzymes include the eukaryotic beta-N-acetylhexosaminidases A and B, the bacterial chitobiases, dispersin B, and lacto-N-biosidase. The GH20 hexosaminidases are thought to act via a catalytic mechanism in which the catalytic nucleophile is not provided by the solvent or the enzyme, but by the substrate itself." Q#26989 - CGI_10001718 superfamily 207794 291 422 2.06E-54 186.264 cl02948 GH20_hexosaminidase superfamily N - "Beta-N-acetylhexosaminidases of glycosyl hydrolase family 20 (GH20) catalyze the removal of beta-1,4-linked N-acetyl-D-hexosamine residues from the non-reducing ends of N-acetyl-beta-D-hexosaminides including N-acetylglucosides and N-acetylgalactosides. These enzymes are broadly distributed in microorganisms, plants and animals, and play roles in various key physiological and pathological processes. These processes include cell structural integrity, energy storage, cellular signaling, fertilization, pathogen defense, viral penetration, the development of carcinomas, inflammatory events and lysosomal storage disorders. The GH20 enzymes include the eukaryotic beta-N-acetylhexosaminidases A and B, the bacterial chitobiases, dispersin B, and lacto-N-biosidase. 
The GH20 hexosaminidases are thought to act via a catalytic mechanism in which the catalytic nucleophile is not provided by the solvent or the enzyme, but by the substrate itself." Q#26996 - CGI_10001856 superfamily 248458 124 220 1.37E-07 50.0049 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." 
Q#26997 - CGI_10001857 superfamily 248458 34 99 2.55E-06 45.3825 cl17904 MFS superfamily NC - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#27002 - CGI_10002072 superfamily 243310 60 161 1.42E-27 104.628 cl03120 ELO superfamily C - "GNS1/SUR4 family; Members of this family are involved in long chain fatty acid elongation systems that produce the 26-carbon precursors for ceramide and sphingolipid synthesis. 
Predicted to be integral membrane proteins, in eukaryotes they are probably located on the endoplasmic reticulum. Yeast ELO3 affects plasma membrane H+-ATPase activity, and may act on a glucose-signaling pathway that controls the expression of several genes that are transcriptionally regulated by glucose such as PMA1." Q#27004 - CGI_10001058 superfamily 245226 16 115 1.03E-12 61.9329 cl10012 DnaQ_like_exo superfamily N - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." 
Q#27006 - CGI_10009783 superfamily 245847 2 92 3.94E-09 49.4774 cl12042 FA58C superfamily N - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#27007 - CGI_10009784 superfamily 247742 2 97 1.40E-08 51.3289 cl17188 enolase_like superfamily N - "Enolase-superfamily, characterized by the presence of an enolate anion intermediate which is generated by abstraction of the alpha-proton of the carboxylate substrate by an active site residue and is stabilized by coordination to the essential Mg2+ ion. Enolase superfamily contains different enzymes, like enolases, glutarate-, fucanate- and galactonate dehydratases, o-succinylbenzoate synthase, N-acylamino acid racemase, L-alanine-DL-glutamate epimerase, mandelate racemase, muconate lactonizing enzyme and 3-methylaspartase." Q#27008 - CGI_10009785 superfamily 247742 19 50 5.03E-09 51.3289 cl17188 enolase_like superfamily N - "Enolase-superfamily, characterized by the presence of an enolate anion intermediate which is generated by abstraction of the alpha-proton of the carboxylate substrate by an active site residue and is stabilized by coordination to the essential Mg2+ ion. Enolase superfamily contains different enzymes, like enolases, glutarate-, fucanate- and galactonate dehydratases, o-succinylbenzoate synthase, N-acylamino acid racemase, L-alanine-DL-glutamate epimerase, mandelate racemase, muconate lactonizing enzyme and 3-methylaspartase." Q#27009 - CGI_10009786 superfamily 245201 32 311 1.74E-162 489.254 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. 
These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#27009 - CGI_10009786 superfamily 149105 1000 1103 0.00129413 41.2665 cl12353 TMPIT superfamily C - "TMPIT-like protein; A number of members of this family are annotated as being transmembrane proteins induced by tumour necrosis factor alpha, but no literature was found to support this." Q#27010 - CGI_10009787 superfamily 245879 96 170 0.000104555 40.4194 cl12116 DUSP superfamily - - DUSP domain; The DUSP (domain present in ubiquitin-specific protease) domain is found at the N-terminus of Ubiquitin-specific proteases. The structure of this domain has been solved. Its tripod-like structure consists of a 3-fold alpha-helical bundle supporting a triple-stranded anti-parallel beta-sheet. Q#27011 - CGI_10009788 superfamily 243066 12 104 2.05E-21 82.9861 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." 
Q#27012 - CGI_10009789 superfamily 248097 30 152 8.21E-18 74.9942 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#27013 - CGI_10009790 superfamily 246918 80 124 9.22E-08 44.8851 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#27024 - CGI_10013283 superfamily 247941 6 146 1.54E-08 50.0269 cl17387 Methyltransf_21 superfamily - - "Methyltransferase FkbM domain; This family has members from bacteria to human, and appears to be a methyltransferase." Q#27025 - CGI_10013284 superfamily 247941 52 192 1.37E-05 42.3229 cl17387 Methyltransf_21 superfamily - - "Methyltransferase FkbM domain; This family has members from bacteria to human, and appears to be a methyltransferase." Q#27026 - CGI_10013285 superfamily 247856 50 104 7.52E-07 46.3869 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#27026 - CGI_10013285 superfamily 247856 177 231 0.00180305 36.3717 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#27027 - CGI_10013286 superfamily 247856 403 457 7.18E-07 46.7721 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. 
Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#27027 - CGI_10013286 superfamily 243056 121 298 8.95E-38 138.208 cl02495 RabGAP-TBC superfamily N - "Rab-GTPase-TBC domain; Identification of a TBC domain in GYP6_YEAST and GYP7_YEAST, which are GTPase activator proteins of yeast Ypt6 and Ypt7, implies that these domains are GTPase activator proteins of Rab-like small GTPases." Q#27029 - CGI_10013288 superfamily 217293 16 224 1.11E-58 194.774 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#27029 - CGI_10013288 superfamily 202474 231 340 2.50E-17 80.3905 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#27032 - CGI_10000915 superfamily 241563 65 103 4.78E-07 46.8967 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#27032 - CGI_10000915 superfamily 128778 121 200 0.00414063 36.0887 cl17972 BBC superfamily C - B-Box C-terminal domain; Coiled coil region C-terminal to (some) B-Box domains Q#27036 - CGI_10007289 superfamily 247684 65 230 9.95E-39 139.72 cl17037 NBD_sugar-kinase_HSP70_actin superfamily C - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." 
Q#27037 - CGI_10007290 superfamily 247684 43 306 4.37E-40 145.883 cl17037 NBD_sugar-kinase_HSP70_actin superfamily N - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#27039 - CGI_10007292 superfamily 221825 168 294 5.89E-07 50.7022 cl15135 DUF3827 superfamily NC - "Domain of unknown function (DUF3827); This family contains the human KIAA1549 protein which has been found to be fused fused to BRAF gene in many cases of pilocytic astrocytomas. The fusion is due mainly to a tandem duplication of 2 Mb at 7q34. Although nothing is known about the function of KIAA1549 protein, the BRAF protein is a well characterized oncoprotein. 
It is a serine/threonine protein kinase which is implicated in MAP/ERK signalling, a critical pathway for the regulation of cell division, differentiation and secretion." Q#27041 - CGI_10007294 superfamily 243072 222 330 1.88E-30 115.559 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#27041 - CGI_10007294 superfamily 241565 354 427 5.75E-07 47.3163 cl00038 BRCT superfamily - - "Breast Cancer Suppressor Protein (BRCA1), carboxy-terminal domain. The BRCT domain is found within many DNA damage repair and cell cycle checkpoint proteins. The unique diversity of this domain superfamily allows BRCT modules to interact forming homo/hetero BRCT multimers, BRCT-non-BRCT interactions, and interactions within DNA strand breaks." Q#27041 - CGI_10007294 superfamily 241565 463 558 0.000350899 38.8958 cl00038 BRCT superfamily - - "Breast Cancer Suppressor Protein (BRCA1), carboxy-terminal domain. The BRCT domain is found within many DNA damage repair and cell cycle checkpoint proteins. The unique diversity of this domain superfamily allows BRCT modules to interact forming homo/hetero BRCT multimers, BRCT-non-BRCT interactions, and interactions within DNA strand breaks." Q#27042 - CGI_10007295 superfamily 241594 1344 1700 4.35E-142 444.7 cl00077 HECTc superfamily - - "HECT domain; C-terminal catalytic domain of a subclass of Ubiquitin-protein ligase (E3). 
It binds specific ubiquitin-conjugating enzymes (E2), accepts ubiquitin from E2, transfers ubiquitin to substrate lysine side chains, and transfers additional ubiquitin molecules to the end of growing ubiquitin chains." Q#27042 - CGI_10007295 superfamily 241647 947 972 1.11E-06 47.5226 cl00157 WW superfamily - - Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs. Q#27042 - CGI_10007295 superfamily 241647 1127 1157 0.000617718 39.4334 cl00157 WW superfamily - - Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs. Q#27042 - CGI_10007295 superfamily 246669 191 317 4.25E-44 158.721 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. 
C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#27043 - CGI_10007296 superfamily 247684 5 118 9.67E-18 78.3995 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. 
The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#27044 - CGI_10025282 superfamily 219502 312 556 2.02E-68 221.549 cl06625 Nucleos_tra2_C superfamily - - Na+ dependent nucleoside transporter C-terminus; This family consists of nucleoside transport proteins. Rat CNT 2 is a purine-specific Na+-nucleoside cotransporter localised to the bile canalicular membrane. CNT 1 is a a Na+-dependent nucleoside transporter selective for pyrimidine nucleosides and adenosine it also transports the anti-viral nucleoside analogues AZT and ddC. This alignment covers the C-terminus of this family of transporters. Q#27044 - CGI_10025282 superfamily 201962 130 200 7.20E-18 78.5716 cl03347 Nucleos_tra2_N superfamily - - Na+ dependent nucleoside transporter N-terminus; This family consists of nucleoside transport proteins. Rat CNT 2 is a purine-specific Na+-nucleoside cotransporter localised to the bile canalicular membrane. Rat CNT 1 is a a Na+-dependent nucleoside transporter selective for pyrimidine nucleosides and adenosine it also transports the anti-viral nucleoside analogues AZT and ddC. This alignment covers the N terminus of this family Q#27044 - CGI_10025282 superfamily 219507 209 309 1.50E-08 52.6267 cl18514 Gate superfamily - - "Nucleoside recognition; This region in the nucleoside transporter proteins are responsible for determining nucleoside specificity in the human CNT1 and CNT2 proteins. In the FeoB proteins, which are believed to be Fe2+ transporters, it includes the membrane pore region, so the function of this region is likely to be more general than just nucleoside specificity. 
This family may represent the pore and gate, with a wide potential range of specificity. Hence its name 'Gate'." Q#27045 - CGI_10025283 superfamily 245596 748 934 1.30E-70 236.821 cl11394 Glyco_tranf_GTA_type superfamily N - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#27045 - CGI_10025283 superfamily 245596 569 626 6.83E-08 53.4656 cl11394 Glyco_tranf_GTA_type superfamily C - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. 
To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#27045 - CGI_10025283 superfamily 245596 664 719 1.16E-06 49.9988 cl11394 Glyco_tranf_GTA_type superfamily C - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#27049 - CGI_10025287 superfamily 216363 58 166 4.37E-12 58.6358 cl08312 UPF0029 superfamily - - Uncharacterized protein family UPF0029; Uncharacterized protein family UPF0029. 
Q#27050 - CGI_10025288 superfamily 243152 132 221 1.70E-11 58.4542 cl02712 PGRP superfamily - - "Peptidoglycan recognition proteins (PGRPs) are pattern recognition receptors that bind, and in certain cases, hydrolyze peptidoglycans (PGNs) of bacterial cell walls. PGRPs have been divided into three classes: short PGRPs (PGRP-S), that are small (20 kDa) extracellular proteins; intermediate PGRPs (PGRP-I) that are 40-45 kDa and are predicted to be transmembrane proteins; and long PGRPs (PGRP-L), up to 90 kDa, which may be either intracellular or transmembrane. Several structures of PGRPs are known in insects and mammals, some bound with substrates like Muramyl Tripeptide (MTP) or Tracheal Cytotoxin (TCT). The substrate binding site is conserved in PGRP-LCx, PGRP-LE, and PGRP-Ialpha proteins. This family includes Zn-dependent N-Acetylmuramoyl-L-alanine Amidase, EC:3.5.1.28. This enzyme cleaves the amide bond between N-acetylmuramoyl and L-amino acids, preferentially D-lactyl-L-Ala, in bacterial cell walls. The structure for the bacteriophage T7 lysozyme shows that two of the conserved histidines and a cysteine are zinc binding residues. Site-directed mutagenesis of T7 lysozyme indicates that two conserved residues, a Tyr and a Lys, are important for amidase activity." Q#27050 - CGI_10025288 superfamily 248086 29 85 3.86E-07 45.2394 cl17532 SH3_3 superfamily - - Bacterial SH3 domain; Bacterial SH3 domain. Q#27051 - CGI_10025289 superfamily 222150 512 538 0.000415004 39.2973 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#27051 - CGI_10025289 superfamily 219837 321 430 0.000549895 39.9595 cl07160 Fork_head_N superfamily - - "Forkhead N-terminal region; The region described in this family is found towards the N-terminus of various eukaryotic fork head/HNF-3-related transcription factors (which contain the pfam00250 domain). 
These proteins play key roles in embryogenesis, maintenance of differentiated cell states, and tumorigenesis." Q#27051 - CGI_10025289 superfamily 222150 625 647 0.000644599 38.9121 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#27051 - CGI_10025289 superfamily 222150 541 564 0.00216807 37.3713 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#27051 - CGI_10025289 superfamily 222150 652 677 0.002423 37.3713 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#27051 - CGI_10025289 superfamily 222150 596 620 0.00486098 36.2157 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#27052 - CGI_10025290 superfamily 245814 433 497 4.74E-10 56.7287 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#27052 - CGI_10025290 superfamily 245814 250 320 5.22E-09 53.6471 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. 
A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#27052 - CGI_10025290 superfamily 245814 332 408 1.11E-11 61.3672 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#27052 - CGI_10025290 superfamily 245814 195 229 0.00152348 37.0997 cl11960 Ig superfamily N - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#27053 - CGI_10025291 superfamily 222150 131 156 5.84E-05 40.8381 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#27053 - CGI_10025291 superfamily 222150 160 184 0.000149526 39.6825 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. 
Q#27053 - CGI_10025291 superfamily 222150 43 65 0.00064615 38.1417 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#27056 - CGI_10025294 superfamily 243066 13 117 7.09E-24 96.5325 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#27056 - CGI_10025294 superfamily 198867 125 223 1.79E-16 75.6579 cl06652 BACK superfamily - - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation)." Q#27056 - CGI_10025294 superfamily 243146 357 405 8.40E-06 43.419 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#27056 - CGI_10025294 superfamily 243146 325 368 6.91E-05 41.0047 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. 
" Q#27056 - CGI_10025294 superfamily 243146 505 554 0.00200877 36.6574 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#27062 - CGI_10025300 superfamily 241754 209 482 1.01E-88 278.3 cl00286 Motor_domain superfamily - - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. Q#27062 - CGI_10025300 superfamily 243859 30 94 2.32E-11 60.8066 cl04722 PLAC8 superfamily C - PLAC8 family; This family includes the Placenta-specific gene 8 protein. Q#27063 - CGI_10025301 superfamily 242406 24 104 0.00284028 33.7189 cl01271 DUF1768 superfamily C - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. Q#27064 - CGI_10025302 superfamily 214507 24 77 1.50E-05 37.7948 cl15307 LRRCT superfamily - - Leucine rich repeat C-terminal domain; Leucine rich repeat C-terminal domain. Q#27065 - CGI_10025303 superfamily 217617 580 755 3.05E-21 93.6348 cl15988 Sulfotransfer_2 superfamily N - "Sulfotransferase family; This family includes a variety of sulfotransferase enzymes. Chondroitin 6-sulfotransferase catalyzes the transfer of sulfate to position 6 of the N-acetylgalactosamine residue of chondroitin. This family also includes Heparan sulfate 2-O-sulfotransferase (HS2ST) and Heparan sulfate 6-sulfotransferase (HS6ST). Heparan sulfate (HS) is a co-receptor for a number of growth factors, morphogens, and adhesion proteins. HS biosynthetic modifications may determine the strength and outcome of HS-ligand interactions. Mice that lack HS2ST undergo developmental failure only after midgestation,the most dramatic effect being the complete failure of kidney development. 
Heparan sulphate 6- O -sulfotransferase (HS6ST) catalyzes the transfer of sulphate from adenosine 3'-phosphate, 5'-phosphosulphate to the 6th position of the N -sulphoglucosamine residue in heparan sulphate." Q#27066 - CGI_10025304 superfamily 248458 1 126 2.84E-11 63.8721 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." 
Q#27068 - CGI_10025306 superfamily 241584 141 243 3.60E-09 54.0395 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#27068 - CGI_10025306 superfamily 245201 347 509 8.52E-36 134.201 cl09925 PKc_like superfamily C - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#27070 - CGI_10025308 superfamily 220695 30 200 0.000531205 39.4843 cl18571 7TM_GPCR_Srx superfamily C - Serpentine type 7TM GPCR chemoreceptor Srx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srx is part of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#27071 - CGI_10025309 superfamily 177822 2 290 1.01E-20 89.2089 cl18088 PLN02164 superfamily - - sulfotransferase Q#27072 - CGI_10025310 superfamily 241566 271 321 3.79E-07 47.4868 cl00040 C1 superfamily - - "Protein kinase C conserved region 1 (C1) . 
Cysteine-rich zinc binding domain. Some members of this domain family bind phorbol esters and diacylglycerol, some are reported to bind RasGTP. May occur in tandem arrangement. Diacylglycerol (DAG) is a second messenger, released by activation of Phospholipase D. Phorbol Esters (PE) can act as analogues of DAG and mimic its downstream effects in, for example, tumor promotion. Protein Kinases C are activated by DAG/PE, this activation is mediated by their N-terminal conserved region (C1). DAG/PE binding may be phospholipid dependent. C1 domains may also mediate DAG/PE signals in chimaerins (a family of Rac GTPase activating proteins), RasGRPs (exchange factors for Ras/Rap1), and Munc13 isoforms (scaffolding proteins involved in exocytosis)." Q#27077 - CGI_10025315 superfamily 241763 128 347 2.52E-95 284.902 cl00298 Peptidase_C1 superfamily - - "C1 Peptidase family (MEROPS database nomenclature), also referred to as the papain family; composed of two subfamilies of cysteine peptidases (CPs), C1A (papain) and C1B (bleomycin hydrolase). Papain-like enzymes are mostly endopeptidases with some exceptions like cathepsins B, C, H and X, which are exopeptidases. Papain-like CPs have different functions in various organisms. Plant CPs are used to mobilize storage proteins in seeds while mammalian CPs are primarily lysosomal enzymes responsible for protein degradation in the lysosome. Papain-like CPs are synthesized as inactive proenzymes with N-terminal propeptide regions, which are removed upon activation. Bleomycin hydrolase (BH) is a CP that detoxifies bleomycin by hydrolysis of an amide group. It acts as a carboxypeptidase on its C-terminus to convert itself into an aminopeptidase and peptide ligase. BH is found in all tissues in mammals as well as in many other eukaryotes. It forms a hexameric ring barrel structure with the active sites imbedded in the central channel. 
Some members of the C1 family are proteins classified as non-peptidase homologs which lack peptidase activity or have missing active site residues." Q#27077 - CGI_10025315 superfamily 244586 43 98 2.04E-12 61.4906 cl07031 Inhibitor_I29 superfamily - - Cathepsin propeptide inhibitor domain (I29); This domain is found at the N-terminus of some C1 peptidases such as Cathepsin L where it acts as a propeptide. There are also a number of proteins that are composed solely of multiple copies of this domain such as the peptidase inhibitor salarin. This family is classified as I29 by MEROPS. Q#27078 - CGI_10025316 superfamily 247769 221 403 3.16E-09 55.0381 cl17215 HDc superfamily - - Metal dependent phosphohydrolases with conserved 'HD' motif Q#27083 - CGI_10025321 superfamily 219080 15 105 1.05E-07 45.7946 cl05851 DUF1115 superfamily - - Protein of unknown function (DUF1115); This family represents the C-terminus of hypothetical eukaryotic proteins of unknown function. Q#27084 - CGI_10025322 superfamily 217473 187 373 1.30E-25 107.066 cl03978 Mab-21 superfamily N - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#27085 - CGI_10025323 superfamily 246680 1714 1783 1.28E-11 63.76 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 
They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#27086 - CGI_10025324 superfamily 247059 42 212 4.77E-101 293.194 cl15763 Clp_protease_like superfamily - - "Caseinolytic protease (ClpP) is an ATP-dependent protease; Clp protease (caseinolytic protease; ClpP; endopeptidase Clp; Peptidase S14; ATP-dependent protease, ClpAP)-like enzymes are highly conserved serine proteases and belong to the ClpP/Crotonase superfamily. Included in this family are Clp proteases that are involved in a number of cellular processes such as degradation of misfolded proteins, regulation of short-lived proteins and housekeeping removal of dysfunctional proteins. They are also implicated in the control of cell growth, targeting DNA-binding protein from starved cells. The functional Clp protease is comprised of two components: a proteolytic component and one of several regulatory ATPase components, both of which are required for effective levels of protease activity in the presence of ATP. Active site consists of the triad Ser, His and Asp, preferring hydrophobic or non-polar residues at P1 or P1' positions. The protease exists as a tetradecamer made up of two heptameric rings stacked back-to-back such that the catalytic triad of each subunit is located at the interface between three monomers, thus making oligomerization essential for function. Another family included in this class of enzymes is the signal peptide peptidase A (SppA; S49) which is involved in the cleavage of signal peptides after their removal from the precursor proteins by signal peptidases. Mutagenesis studies suggest that the catalytic center of SppA comprises a Ser-Lys dyad and not the usual Ser-His-Asp catalytic triad found in the majority of serine proteases. 
In addition to the carboxyl-terminal protease domain that is conserved in all the S49 family members, the E. coli SppA contains an amino-terminal domain. Others, including sohB peptidase, protein C, protein 1510-N and archaeal signal peptide peptidase, do not contain the amino-terminal domain. The third family included in this hierarchy is nodulation formation efficiency D (NfeD) which is a membrane-bound Clp-class protease and only found in bacteria and archaea. Majority of the NfeD genomes have been shown to possess operons containing a homologous NfeD/stomatin gene pair, causing NfeD to be previously named stomatin operon partner protein (STOPP). NfeD homologs can be divided into two groups: long and short forms. Long-form homologs have a putative ClpP-class serine protease domain while the short form homologs do not. Downstream from the ClpP-class domain is the so-called NfeD or DUF107 domain. N-terminal region of the NfeD homolog PH1510 from Pyrococcus horikoshii has been shown to possess serine protease activity having a Ser-Lys catalytic dyad." Q#27087 - CGI_10025325 superfamily 247724 57 208 2.53E-93 274.397 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#27089 - CGI_10025327 superfamily 245201 26 280 1.43E-52 172.806 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#27090 - CGI_10025328 superfamily 241622 335 425 7.42E-15 71.829 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. 
This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95 (post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#27090 - CGI_10025328 superfamily 246669 490 611 7.83E-22 93.0811 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#27090 - CGI_10025328 superfamily 246669 842 984 2.14E-20 89.3693 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. 
Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#27092 - CGI_10025330 superfamily 241642 172 226 5.59E-13 62.513 cl00152 t_SNARE superfamily - - "Soluble NSF (N-ethylmaleimide-sensitive fusion protein)-Attachment protein (SNAP) REceptor domain; these alpha-helical motifs form twisted and parallel heterotetrameric helix bundles; the core complex contains one helix from a protein that is anchored in the vesicle membrane (synaptobrevin), one helix from a protein of the target membrane (syntaxin), and two helices from another protein anchored in the target membrane (SNAP-25); their interaction forms a core which is composed of a polar zero layer, a flanking leucine-zipper layer acts as a water tight shield to isolate ionic interactions in the zero layer from the surrounding solvent" Q#27092 - CGI_10025330 superfamily 241634 32 119 5.29E-09 52.3481 cl00143 SynN superfamily - - "Syntaxin N-terminus domain; syntaxins are nervous system-specific proteins implicated in the docking of synaptic vesicles with the presynaptic plasma membrane; they are a family of receptors for intracellular transport vesicles; each target membrane may be identified by a specific member of the syntaxin family; syntaxins contain a moderately well conserved amino-terminal domain, called Habc, whose structure is an antiparallel three-helix bundle; a 
linker of about 30 amino acids connects this to the carboxy-terminal region, designated H3 (t_SNARE), of the syntaxin cytoplasmic domain; the highly conserved H3 region forms a single, long alpha-helix when it is part of the core SNARE complex and anchors the protein on the cytoplasmic surface of cellular membranes; H3 is not included in defining this domain" Q#27094 - CGI_10025332 superfamily 247769 515 688 2.25E-08 53.4973 cl17215 HDc superfamily - - Metal dependent phosphohydrolases with conserved 'HD' motif Q#27094 - CGI_10025332 superfamily 248010 277 421 5.46E-26 105.541 cl17456 GAF superfamily - - "GAF domain; This domain is present in cGMP-specific phosphodiesterases, adenylyl and guanylyl cyclases, phytochromes, FhlA and NifA. Adenylyl and guanylyl cyclases catalyze ATP and GTP to the second messengers cAMP and cGMP, respectively, these products up-regulating catalytic activity by binding to the regulatory GAF domain(s). The opposite hydrolysis reaction is catalyzed by phosphodiesterase. cGMP-dependent 3',5'-cyclic phosphodiesterase catalyzes the conversion of guanosine 3',5'-cyclic phosphate to guanosine 5'-phosphate. Here too, cGMP regulates catalytic activity by GAF-domain binding. Phytochromes are regulatory photoreceptors in plants and bacteria which exist in two thermally-stable states that are reversibly inter-convertible by light: the Pr state absorbs maximally in the red region of the spectrum, while the Pfr state absorbs maximally in the far-red region. This domain is also found in FhlA (formate hydrogen lyase transcriptional activator) and NifA, a transcriptional activator which is required for activation of most Nif operons which are directly involved in nitrogen fixation. NifA interacts with sigma-54." Q#27094 - CGI_10025332 superfamily 248010 121 229 6.10E-09 55.08 cl17456 GAF superfamily - - "GAF domain; This domain is present in cGMP-specific phosphodiesterases, adenylyl and guanylyl cyclases, phytochromes, FhlA and NifA. 
Adenylyl and guanylyl cyclases catalyze ATP and GTP to the second messengers cAMP and cGMP, respectively, these products up-regulating catalytic activity by binding to the regulatory GAF domain(s). The opposite hydrolysis reaction is catalyzed by phosphodiesterase. cGMP-dependent 3',5'-cyclic phosphodiesterase catalyzes the conversion of guanosine 3',5'-cyclic phosphate to guanosine 5'-phosphate. Here too, cGMP regulates catalytic activity by GAF-domain binding. Phytochromes are regulatory photoreceptors in plants and bacteria which exist in two thermally-stable states that are reversibly inter-convertible by light: the Pr state absorbs maximally in the red region of the spectrum, while the Pfr state absorbs maximally in the far-red region. This domain is also found in FhlA (formate hydrogen lyase transcriptional activator) and NifA, a transcriptional activator which is required for activation of most Nif operons which are directly involved in nitrogen fixation. NifA interacts with sigma-54." Q#27097 - CGI_10025335 superfamily 243061 8 107 1.38E-41 135.162 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#27099 - CGI_10025337 superfamily 243061 1 59 6.88E-22 88.553 cl02509 SRCR superfamily N - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#27100 - CGI_10025338 superfamily 222263 62 112 0.00887463 33.0601 cl16321 DDE_4_2 superfamily C - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. 
This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#27101 - CGI_10025339 superfamily 243061 373 473 3.52E-43 149.029 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#27101 - CGI_10025339 superfamily 243061 36 136 1.64E-40 141.711 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#27101 - CGI_10025339 superfamily 243061 139 235 1.21E-36 130.925 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#27105 - CGI_10021709 superfamily 216363 7 73 6.34E-09 48.2354 cl08312 UPF0029 superfamily N - Uncharacterized protein family UPF0029; Uncharacterized protein family UPF0029. Q#27107 - CGI_10021711 superfamily 245864 97 456 2.96E-81 261.059 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. 
Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#27108 - CGI_10021712 superfamily 247799 397 459 1.76E-15 72.5927 cl17245 KH-I superfamily - - "K homology RNA-binding domain, type I. KH binds single-stranded RNA or DNA. It is found in a wide variety of proteins including ribosomal proteins, transcription factors and post-transcriptional modifiers of mRNA. There are two different KH domains that belong to different protein folds, but they share a single KH motif. The KH motif is folded into a beta alpha alpha beta unit. In addition to the core, type II KH domains (e.g. ribosomal protein S3) include N-terminal extension and type I KH domains (e.g. hnRNP K) contain C-terminal extension." Q#27108 - CGI_10021712 superfamily 247799 194 257 2.37E-14 69.1259 cl17245 KH-I superfamily - - "K homology RNA-binding domain, type I. KH binds single-stranded RNA or DNA. It is found in a wide variety of proteins including ribosomal proteins, transcription factors and post-transcriptional modifiers of mRNA. There are two different KH domains that belong to different protein folds, but they share a single KH motif. The KH motif is folded into a beta alpha alpha beta unit. In addition to the core, type II KH domains (e.g. 
ribosomal protein S3) include N-terminal extension and type I KH domains (e.g. hnRNP K) contain C-terminal extension." Q#27108 - CGI_10021712 superfamily 247799 301 366 1.54E-12 64.1183 cl17245 KH-I superfamily - - "K homology RNA-binding domain, type I. KH binds single-stranded RNA or DNA. It is found in a wide variety of proteins including ribosomal proteins, transcription factors and post-transcriptional modifiers of mRNA. There are two different KH domains that belong to different protein folds, but they share a single KH motif. The KH motif is folded into a beta alpha alpha beta unit. In addition to the core, type II KH domains (e.g. ribosomal protein S3) include N-terminal extension and type I KH domains (e.g. hnRNP K) contain C-terminal extension." Q#27108 - CGI_10021712 superfamily 247799 512 575 1.66E-12 63.7331 cl17245 KH-I superfamily - - "K homology RNA-binding domain, type I. KH binds single-stranded RNA or DNA. It is found in a wide variety of proteins including ribosomal proteins, transcription factors and post-transcriptional modifiers of mRNA. There are two different KH domains that belong to different protein folds, but they share a single KH motif. The KH motif is folded into a beta alpha alpha beta unit. In addition to the core, type II KH domains (e.g. ribosomal protein S3) include N-terminal extension and type I KH domains (e.g. hnRNP K) contain C-terminal extension." Q#27108 - CGI_10021712 superfamily 149918 725 753 3.55E-09 53.6633 cl07567 DUF1897 superfamily C - "Domain of unknown function (DUF1897); This domain is found in Psi proteins produced by Drosophila, and in various eukaryotic hypothetical proteins. It has no known function." Q#27109 - CGI_10021713 superfamily 217617 91 346 1.08E-28 111.739 cl15988 Sulfotransfer_2 superfamily - - "Sulfotransferase family; This family includes a variety of sulfotransferase enzymes. 
Chondroitin 6-sulfotransferase catalyzes the transfer of sulfate to position 6 of the N-acetylgalactosamine residue of chondroitin. This family also includes Heparan sulfate 2-O-sulfotransferase (HS2ST) and Heparan sulfate 6-sulfotransferase (HS6ST). Heparan sulfate (HS) is a co-receptor for a number of growth factors, morphogens, and adhesion proteins. HS biosynthetic modifications may determine the strength and outcome of HS-ligand interactions. Mice that lack HS2ST undergo developmental failure only after midgestation, the most dramatic effect being the complete failure of kidney development. Heparan sulphate 6-O-sulfotransferase (HS6ST) catalyzes the transfer of sulphate from adenosine 3'-phosphate, 5'-phosphosulphate to the 6th position of the N-sulphoglucosamine residue in heparan sulphate." Q#27110 - CGI_10021714 superfamily 248097 88 181 1.02E-12 66.1346 cl17543 C1q superfamily C - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#27110 - CGI_10021714 superfamily 245213 321 361 0.00252437 36.9432 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#27110 - CGI_10021714 superfamily 238012 469 531 0.00771948 35.4078 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#27114 - CGI_10021718 superfamily 241794 218 339 8.29E-37 129.847 cl00334 Ribosomal_S9 superfamily - - Ribosomal protein S9/S16; This family includes small ribosomal subunit S9 from prokaryotes and S16 from eukaryotes. Q#27116 - CGI_10021720 superfamily 241607 910 945 1.66E-06 46.4942 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chymotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. 
Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters), which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#27116 - CGI_10021720 superfamily 241607 1048 1068 7.02E-06 44.5682 cl00097 KAZAL_FS superfamily C - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chymotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters), which may regulate the specificity of anion uptake. 
The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#27116 - CGI_10021720 superfamily 241607 979 1014 5.43E-05 42.3219 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chymotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters), which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#27118 - CGI_10021722 superfamily 243072 47 116 0.00267844 35.4371 cl02529 ANK superfamily N - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#27120 - CGI_10021724 superfamily 248458 311 491 1.67E-15 76.5837 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." 
Q#27121 - CGI_10021725 superfamily 241768 146 322 2.01E-44 151.493 cl00305 Sua5_yciO_yrdC superfamily - - Telomere recombination; This domain has been shown to bind preferentially to dsRNA. The domain is found in SUA5 as well as HypF and YrdC. It has also been shown to be required for telomere recombination in yeast. Q#27121 - CGI_10021725 superfamily 243161 4 69 4.86E-07 46.6186 cl02739 THAP superfamily C - "THAP domain; The THAP domain is a putative DNA-binding domain (DBD) and probably also binds a zinc ion. It features the conserved C2CH architecture (consensus sequence: Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His). Other universal features include the location of the domain at the N-termini of proteins, its size of about 90 residues, a C-terminal AVPTIF box and several other conserved residues. Orthologues of the human THAP domain have been identified in other vertebrates and probably worms and flies, but not in other eukaryotes or any prokaryotes." Q#27122 - CGI_10021726 superfamily 246940 256 476 1.15E-05 45.4022 cl15377 Radical_SAM superfamily - - "Radical SAM superfamily. Enzymes of this family generate radicals by combining a 4Fe-4S cluster and S-adenosylmethionine (SAM) in close proximity. They are characterized by a conserved CxxxCxxC motif, which coordinates the conserved iron-sulfur cluster. Mechanistically, they share the transfer of a single electron from the iron-sulfur cluster to SAM, which leads to its reductive cleavage to methionine and a 5'-deoxyadenosyl radical, which, in turn, abstracts a hydrogen from the appropriately positioned carbon atom. Depending on the enzyme, SAM is consumed during this process or it is restored and reused. Radical SAM enzymes catalyze steps in metabolism, DNA repair, the biosynthesis of vitamins and coenzymes, and the biosynthesis of many antibiotics.
Examples are biotin synthase (BioB), lipoyl synthase (LipA), pyruvate formate-lyase (PFL), coproporphyrinogen oxidase (HemN), lysine 2,3-aminomutase (LAM), anaerobic ribonucleotide reductase (ARR), and MoaA, an enzyme of the biosynthesis of molybdopterin." Q#27122 - CGI_10021726 superfamily 216191 105 207 8.74E-30 113.364 cl03017 UPF0004 superfamily - - Uncharacterized protein family UPF0004; This family is the N terminal half of the Prosite family. The C-terminal half has been shown to be related to MiaB proteins. This domain is nearly always found in conjunction with pfam04055 and pfam01938 although its function is uncertain. Q#27122 - CGI_10021726 superfamily 242412 518 592 6.39E-06 44.1437 cl01282 TRAM superfamily - - TRAM domain; This small domain has no known function. However it may perform a nucleic acid binding role (Bateman A. unpublished observation). Q#27125 - CGI_10021729 superfamily 201778 17 129 9.65E-24 93.8126 cl18219 GFO_IDH_MocA superfamily - - "Oxidoreductase family, NAD-binding Rossmann fold; This family of enzymes utilise NADP or NAD. This family is called the GFO/IDH/MOCA family in swiss-prot."
Q#27126 - CGI_10021730 superfamily 247792 478 522 0.00546242 35.114 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#27126 - CGI_10021730 superfamily 154937 93 168 9.45E-07 46.8275 cl02489 SWIB superfamily - - SWIB/MDM2 domain; This family includes the SWIB domain and the MDM2 domain. The p53-associated protein (MDM2) is an inhibitor of the p53 tumour suppressor gene binding the transactivation domain and down regulating the ability of p53 to activate transcription. This family contains the p53 binding domain of MDM2. Q#27129 - CGI_10021733 superfamily 246664 356 739 2.90E-173 508.653 cl14561 An_peroxidase_like superfamily - - "Animal heme peroxidases and related proteins; A diverse family of enzymes, which includes prostaglandin G/H synthase, thyroid peroxidase, myeloperoxidase, linoleate diol synthase, lactoperoxidase, peroxinectin, peroxidasin, and others. Despite its name, this family is not restricted to metazoans: members are found in fungi, plants, and bacteria as well." Q#27129 - CGI_10021733 superfamily 246664 206 302 2.10E-05 46.5088 cl14561 An_peroxidase_like superfamily C - "Animal heme peroxidases and related proteins; A diverse family of enzymes, which includes prostaglandin G/H synthase, thyroid peroxidase, myeloperoxidase, linoleate diol synthase, lactoperoxidase, peroxinectin, peroxidasin, and others. 
Despite its name, this family is not restricted to metazoans: members are found in fungi, plants, and bacteria as well." Q#27129 - CGI_10021733 superfamily 218628 699 836 0.00123979 41.0237 cl05221 Spheroidin superfamily N - "Entomopoxvirus spheroidin protein; Entomopoxviruses (EPVs) are large (300-400 nm) oval-shaped viruses replicating in the cytoplasm of their insect host cells. At the end of their replicative cycle EPVs virions are occluded in a highly expressed protein called spheroidin. This protein forms large (5-20 mm long) oval-shaped occlusion bodies (OBs) called spherules. The infectious cycle of EPVs begins with the ingestion by the insect host of the spherules, their dissolution by the alkaline reducing conditions of the midgut fluid and the release of virions in the midgut lumen. The infective particles first replicate in midgut epithelial cells, then pass the gut barrier to colonise the internal tissues, mainly the fat body cells. Whilst spheroidin has been demonstrated to be non-essential for viral replication, it plays an essential role in the natural biological cycle of the virus in protecting virions from adverse environmental conditions (e.g. UV degradation) and thus improving transmission efficacy. In this respect, spheroidins are functionally similar to polyhedrins of baculoviruses or cypoviruses." Q#27130 - CGI_10021734 superfamily 246664 127 478 3.35E-159 463.585 cl14561 An_peroxidase_like superfamily - - "Animal heme peroxidases and related proteins; A diverse family of enzymes, which includes prostaglandin G/H synthase, thyroid peroxidase, myeloperoxidase, linoleate diol synthase, lactoperoxidase, peroxinectin, peroxidasin, and others. Despite its name, this family is not restricted to metazoans: members are found in fungi, plants, and bacteria as well." Q#27130 - CGI_10021734 superfamily 245814 27 125 0.0043132 36.135 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. 
The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#27130 - CGI_10021734 superfamily 246918 521 564 0.00765362 34.8699 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#27131 - CGI_10021735 superfamily 241900 1671 1920 5.26E-43 162.472 cl00490 EEP superfamily - - "Exonuclease-Endonuclease-Phosphatase (EEP) domain superfamily; This large superfamily includes the catalytic domain (exonuclease/endonuclease/phosphatase or EEP domain) of a diverse set of proteins including the ExoIII family of apurinic/apyrimidinic (AP) endonucleases, inositol polyphosphate 5-phosphatases (INPP5), neutral sphingomyelinases (nSMases), deadenylases (such as the vertebrate circadian-clock regulated nocturnin), bacterial cytolethal distending toxin B (CdtB), deoxyribonuclease 1 (DNase1), the endonuclease domain of the non-LTR retrotransposon LINE-1, and related domains. These diverse enzymes share a common catalytic mechanism of cleaving phosphodiester bonds; their substrates range from nucleic acids to phospholipids and perhaps proteins." Q#27132 - CGI_10021736 superfamily 248012 173 240 1.39E-05 42.6457 cl17458 TIR_2 superfamily C - TIR domain; This is a family of bacterial Toll-like receptors. Q#27133 - CGI_10021737 superfamily 248012 3 99 1.11E-07 45.7273 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. 
Q#27135 - CGI_10021739 superfamily 248012 890 1011 1.58E-11 63.0613 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#27135 - CGI_10021739 superfamily 246925 521 765 0.000333676 42.7278 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#27137 - CGI_10021741 superfamily 248012 473 565 1.21E-09 56.5129 cl17458 TIR_2 superfamily C - TIR domain; This is a family of bacterial Toll-like receptors. Q#27140 - CGI_10025139 superfamily 238191 23 524 3.34E-156 460.646 cl18907 Esterase_lipase superfamily - - "Esterases and lipases (includes fungal lipases, cholinesterases, etc.) These enzymes act on carboxylic esters (EC: 3.1.1.-). The catalytic apparatus involves three residues (catalytic triad): a serine, a glutamate or aspartate and a histidine.These catalytic residues are responsible for the nucleophilic attack on the carbonyl carbon atom of the ester bond. In contrast with other alpha/beta hydrolase fold family members, p-nitrobenzyl esterase and acetylcholine esterase have a Glu instead of Asp at the active site carboxylate." Q#27142 - CGI_10025141 superfamily 248097 93 220 8.57E-16 70.3718 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#27144 - CGI_10025143 superfamily 242443 1 126 2.15E-42 146.333 cl01342 Peptidase_A22B superfamily N - "Signal peptide peptidase; The members of this family are membrane proteins. 
In some proteins this region is found associated with pfam02225. This family corresponds with Merops subfamily A22B, the type example of which is signal peptide peptidase. There is a sequence-similarity relationship with pfam01080." Q#27149 - CGI_10025148 superfamily 247724 2 70 5.73E-16 70.6419 cl17170 Ras_like_GTPase superfamily N - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#27151 - CGI_10025150 superfamily 242902 42 104 4.07E-12 59.9531 cl02144 TLD superfamily C - TLD; This domain is predicted to be an enzyme and is often found associated with pfam01476. Q#27152 - CGI_10025151 superfamily 222150 408 433 9.60E-06 42.7641 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. 
Q#27153 - CGI_10025152 superfamily 222150 77 102 0.00142044 32.749 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#27154 - CGI_10025153 superfamily 246597 28 303 1.77E-130 385.743 cl13995 MPP_superfamily superfamily - - "metallophosphatase superfamily, metallophosphatase domain; Metallophosphatases (MPPs), also known as metallophosphoesterases, phosphodiesterases (PDEs), binuclear metallophosphoesterases, and dimetal-containing phosphoesterases (DMPs), represent a diverse superfamily of enzymes with a conserved domain containing an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. This superfamily includes: the phosphoprotein phosphatases (PPPs), Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination." Q#27154 - CGI_10025153 superfamily 217260 339 499 1.02E-38 138.926 cl03752 5_nucleotid_C superfamily - - "5'-nucleotidase, C-terminal domain; 5'-nucleotidase, C-terminal domain. " Q#27156 - CGI_10025155 superfamily 243035 64 187 1.00E-21 86.1345 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. 
Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. 
A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#27157 - CGI_10025156 superfamily 247683 327 381 1.32E-26 103.158 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#27157 - CGI_10025156 superfamily 247725 398 503 9.17E-42 148.178 cl17171 PH-like superfamily C - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. 
This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#27157 - CGI_10025156 superfamily 147559 549 642 1.02E-12 66.0674 cl05152 Tmemb_9 superfamily - - "TMEM9; This family contains several eukaryotic transmembrane proteins which are homologous to human transmembrane protein 9. The TMEM9 gene encodes a 183 amino-acid protein that contains an N-terminal signal peptide, a single transmembrane region, three potential N-glycosylation sites and three conserved cys-rich domains in the N-terminus, but no known functional domains. The protein is highly conserved between species from Caenorhabditis elegans to man and belongs to a novel family of transmembrane proteins. The exact function of TMEM9 is unknown although it has been found to be widely expressed and localised to the late endosomes and lysosomes. Members of this family contain pfam03128 repeats in their N-terminal region." Q#27158 - CGI_10025157 superfamily 191268 204 332 7.35E-05 41.1031 cl15960 Mt_ATP-synt_B superfamily C - "Mitochondrial ATP synthase B chain precursor (ATP-synt_B); The Fo sector of the ATP synthase is a membrane bound complex which mediates proton transport. It is composed of nine different polypeptide subunits (a, b, c, d, e, f, g F6, A6L)." Q#27160 - CGI_10025159 superfamily 241619 31 93 6.12E-06 40.2581 cl00112 PAN_APPLE superfamily - - "PAN/APPLE-like domain; present in N-terminal (N) domains of plasminogen/ hepatocyte growth factor proteins, plasma prekallikrein/coagulation factor XI and microneme antigen proteins, plant receptor-like protein kinases, and various nematode and leech anti-platelet proteins. Common structural features include two disulfide bonds that link the alpha-helix to the central region of the protein. PAN domains have significant functional versatility, fulfilling diverse biological functions by mediating protein-protein or protein-carbohydrate interactions." 
Q#27161 - CGI_10025160 superfamily 245213 60 90 2.33E-05 39.5422 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#27162 - CGI_10025161 superfamily 243123 46 78 0.000415462 35.997 cl02638 Hairy_orange superfamily - - "Hairy Orange; The Orange domain is found in the Drosophila proteins Hesr-1, Hairy, and Enhancer of Split. The Orange domain is proposed to mediate specific protein-protein interaction between Hairy and Scute." Q#27164 - CGI_10025163 superfamily 247856 705 763 7.69E-08 50.6241 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#27164 - CGI_10025163 superfamily 247856 292 347 0.000307237 39.8385 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#27164 - CGI_10025163 superfamily 247856 796 853 0.000781083 38.6829 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#27166 - CGI_10025165 superfamily 247723 65 146 4.49E-54 180.375 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#27166 - CGI_10025165 superfamily 247723 153 233 9.07E-51 171.149 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. 
This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#27166 - CGI_10025165 superfamily 247723 555 646 2.79E-51 172.94 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." 
Q#27168 - CGI_10025169 superfamily 247905 777 900 3.87E-24 100.005 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#27168 - CGI_10025169 superfamily 247805 434 577 1.60E-23 98.5635 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#27168 - CGI_10025169 superfamily 243130 175 210 0.00032755 39.7559 cl02655 CUE superfamily - - "CUE domain; CUE domains have been shown to bind ubiquitin. It has been suggested that CUE domains are related to pfam00627 and this has been confirmed by the structure of the domain. CUE domains also occur in two protein of the IL-1 signal transduction pathway, tollip and TAB2." Q#27169 - CGI_10025170 superfamily 247637 8 348 0 512.039 cl16912 MDR superfamily - - "Medium chain reductase/dehydrogenase (MDR)/zinc-dependent alcohol dehydrogenase-like family; The medium chain reductase/dehydrogenases (MDR)/zinc-dependent alcohol dehydrogenase-like family, which contains the zinc-dependent alcohol dehydrogenase (ADH-Zn) and related proteins, is a diverse group of proteins related to the first identified member, class I mammalian ADH. MDRs display a broad range of activities and are distinguished from the smaller short chain dehydrogenases (~ 250 amino acids vs. 
the ~ 350 amino acids of the MDR). The MDR proteins have 2 domains: a C-terminal NAD(P) binding-Rossmann fold domain of a beta-alpha form and an N-terminal catalytic domain with distant homology to GroES. The MDR group contains a host of activities, including the founding alcohol dehydrogenase (ADH) , quinone reductase, sorbitol dehydrogenase, formaldehyde dehydrogenase, butanediol DH, ketose reductase, cinnamyl reductase, and numerous others. The zinc-dependent alcohol dehydrogenases (ADHs) catalyze the NAD(P)(H)-dependent interconversion of alcohols to aldehydes or ketones. ADH-like proteins typically form dimers (typically higher plants, mammals) or tetramers (yeast, bacteria), and generally have 2 tightly bound zinc atoms per subunit, a catalytic zinc at the active site and a structural zinc in a lobe of the catalytic domain. The active site zinc is coordinated by a histidine, two cysteines, and a water molecule. The second zinc seems to play a structural role, affects subunit interactions, and is typically coordinated by 4 cysteines. Other MDR members have only a catalytic zinc, and some contain no coordinated zinc." Q#27173 - CGI_10025174 superfamily 245235 1 220 2.37E-136 395.43 cl10023 POLBc superfamily N - "DNA polymerase type-B family catalytic domain. DNA-directed DNA polymerases elongate DNA by adding nucleotide triphosphate (dNTP) residues to the 5'-end of the growing chain of DNA. DNA-directed DNA polymerases are multifunctional with both synthetic (polymerase) and degradative modes (exonucleases) and play roles in the processes of DNA replication, repair, and recombination. DNA-dependent DNA polymerases can be classified in six main groups based upon their phylogenetic relationships with E. coli polymerase I (class A), E. coli polymerase II (class B), E. coli polymerase III (class C), euryarchaeota polymerase II (class D), human polymerase beta (class x), E. coli UmuC/DinB, and eukaryotic RAP 30/Xeroderma pigmentosum variant (class Y). 
Family B DNA polymerases include E. coli DNA polymerase II, some eubacterial phage DNA polymerases, nuclear replicative DNA polymerases (alpha, delta, epsilon, and zeta), and eukaryotic viral and plasmid-borne enzymes. DNA polymerase is made up of distinct domains and sub-domains. The polymerase domain of DNA polymerase type B (Pol domain) is responsible for the template-directed polymerization of dNTPs onto the growing primer strand of duplex DNA that is usually magnesium dependent. In general, the architecture of the Pol domain has been likened to a right hand with fingers, thumb, and palm sub-domains with a deep groove to accommodate the nucleic acid substrate. There are a few conserved motifs in the Pol domain of family B DNA polymerases. The conserved aspartic acid residues in the DTDS motifs of the palm sub-domain is crucial for binding to divalent metal ion and is suggested to be important for polymerase catalysis." Q#27173 - CGI_10025174 superfamily 222632 222 248 0.000931897 36.2293 cl16754 zf-C4pol superfamily N - "C4-type zinc-finger of DNA polymerase delta; In fission yeast this zinc-finger domain appears is the region of Pol3 that binds directly to the B-subunit, Cdc1. Pol delta is a hetero-tetrameric enzyme comprising four evolutionarily well-conserved proteins: the catalytic subunit Pol3 and three smaller subunits Cdc1, Cdc27 and Cdm1." Q#27175 - CGI_10025176 superfamily 247804 96 140 2.44E-14 68.371 cl17250 SANT superfamily - - "'SWI3, ADA2, N-CoR and TFIIIB' DNA-binding domains. Tandem copies of the domain bind telomeric DNA tandem repeatsas part of the capping complex. Binding is sequence dependent for repeats which contain the G/C rich motif [C2-3 A (CA)1-6]. The domain is also found in regulatory transcriptional repressor complexes where it also binds DNA." Q#27175 - CGI_10025176 superfamily 247804 148 188 3.09E-11 59.5114 cl17250 SANT superfamily - - "'SWI3, ADA2, N-CoR and TFIIIB' DNA-binding domains. 
Tandem copies of the domain bind telomeric DNA tandem repeats as part of the capping complex. Binding is sequence dependent for repeats which contain the G/C rich motif [C2-3 A (CA)1-6]. The domain is also found in regulatory transcriptional repressor complexes where it also binds DNA." Q#27175 - CGI_10025176 superfamily 247804 44 88 7.23E-11 58.3558 cl17250 SANT superfamily - - "'SWI3, ADA2, N-CoR and TFIIIB' DNA-binding domains. Tandem copies of the domain bind telomeric DNA tandem repeats as part of the capping complex. Binding is sequence dependent for repeats which contain the G/C rich motif [C2-3 A (CA)1-6]. The domain is also found in regulatory transcriptional repressor complexes where it also binds DNA." Q#27175 - CGI_10025176 superfamily 220175 432 561 1.54E-11 62.9235 cl07814 Cmyb_C superfamily - - "C-myb, C-terminal; Members of this family are predominantly found in the proto-oncogene c-myb and the viral transforming protein myb. Truncation of the domain results in 'activation' of c-myb and subsequent tumourigenesis." Q#27178 - CGI_10025179 superfamily 247723 13 88 5.36E-50 159.703 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. 
The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#27179 - CGI_10025180 superfamily 241677 4 162 3.90E-114 323.44 cl00197 cyclophilin superfamily - - "cyclophilin: cyclophilin-type peptidylprolyl cis- trans isomerases. This family contains eukaryotic, bacterial and archeal proteins which exhibit a peptidylprolyl cis- trans isomerases activity (PPIase, Rotamase) and in addition bind the immunosuppressive drug cyclosporin (CsA). Immunosuppression in vertebrates is believed to be the result of the cyclophilin A-cyclosporin protein drug complex binding to and inhibiting the protein-phosphatase calcineurin. PPIase is an enzyme which accelerates protein folding by catalyzing the cis-trans isomerization of the peptide bonds preceding proline residues. Cyclophilins are a diverse family in terms of function and have been implicated in protein folding processes which depend on catalytic /chaperone-like activities. This group contains human cyclophilin 40, a co-chaperone of the hsp90 chaperone system; human cyclophilin A, a chaperone in the HIV-1 infectious process and; human cyclophilin H, a component of the U4/U6 snRNP, whose isomerization or chaperoning activities may play a role in RNA splicing." Q#27183 - CGI_10010961 superfamily 247038 554 595 3.24E-10 57.3255 cl15674 IPT superfamily C - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. 
In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." Q#27183 - CGI_10010961 superfamily 247042 235 320 0.00100029 40.6853 cl15693 Sema superfamily N - "The Sema domain, a protein interacting module, of semaphorins and plexins; Both semaphorins and plexins have a Sema domain on their N-termini. Plexins function as receptors for the semaphorins. Evolutionarily, plexins may be the ancestor of semaphorins. Semaphorins are regulatory molecules in the development of the nervous system and in axonal guidance. They also play important roles in other biological processes, such as angiogenesis, immune regulation, respiration systems, and cancer. Semaphorins can be divided into 7 classes. Vertebrates have members in classes 3-7, whereas classes 1 and 2 are known only in invertebrates. Class 2 and 3 semaphorins are secreted; classes 1 and 4 through 6 are transmembrane proteins; and class 7 is membrane associated via glycosylphosphatidylinositol (GPI) linkage. Plexins are a large family of transmembrane proteins, which are divided into four types (A-D) according to sequence similarity. In vertebrates, type A plexins serve as co-receptors for neuropilins to mediate the signalling of class 3 semaphorins. Plexins serve as direct receptors for several other members of the semaphorin family: class 6 semaphorins signal through type A plexins and class 4 semaphorins through type B plexins. This family also includes the MET and RON receptor tyrosine kinases. The Sema domain is located at the N-terminus and contains four disulfide bonds formed by eight conserved cysteine residues. It serves to recognize and bind receptors." Q#27184 - CGI_10010963 superfamily 243060 163 251 5.46E-05 41.9808 cl02507 SEA superfamily - - "SEA domain; Domain found in Sea urchin sperm protein, Enterokinase, Agrin (SEA). 
Proposed function of regulating or binding carbohydrate side chains. Recently a proteolytic activity has been shown for a SEA domain." Q#27185 - CGI_10010964 superfamily 247984 21 203 7.38E-65 205.537 cl17430 FtsJ superfamily - - "FtsJ-like methyltransferase; This family consists of FtsJ from various bacterial and archaeal sources FtsJ is a methyltransferase, but actually has no effect on cell division. FtsJ's substrate is the 23S rRNA. The 1.5 A crystal structure of FtsJ in complex with its cofactor S-adenosylmethionine revealed that FtsJ has a methyltransferase fold. This family also includes the N terminus of flaviviral NS5 protein. It has been hypothesised that the N-terminal domain of NS5 is a methyltransferase involved in viral RNA capping." Q#27186 - CGI_10010965 superfamily 247724 255 527 2.52E-152 453.487 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. 
Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#27186 - CGI_10010965 superfamily 243185 561 641 3.41E-46 160.909 cl02787 Translation_Factor_II_like superfamily - - "Translation_Factor_II_like: Elongation factor Tu (EF-Tu) domain II-like proteins. Elongation factor Tu consists of three structural domains, this family represents the second domain. Domain II adopts a beta barrel structure and is involved in binding to charged tRNA. Domain II is found in other proteins such as elongation factor G and translation initiation factor IF-2. This group also includes the C2 subdomain of domain IV of IF-2 that has the same fold as domain II of (EF-Tu). Like IF-2 from certain prokaryotes such as Thermus thermophilus, mitochondrial IF-2 lacks domain II, which is thought to be involved in binding of E.coli IF-2 to 30S subunits." Q#27186 - CGI_10010965 superfamily 243187 733 841 3.45E-44 156.443 cl02789 EFG_like_IV superfamily - - "Elongation Factor G-like domain IV. This family includes the translational elongation factor termed EF-2 (for Archaea and Eukarya) and EF-G (for Bacteria), ribosomal protection proteins that mediate tetracycline resistance and, an evolutionarily conserved U5 snRNP-specific protein (U5-116kD). In complex with GTP, EF-G/EF-2 promotes the translocation step of translation. During translocation the peptidyl-tRNA is moved from the A site to the P site of the small subunit of ribosome and the mRNA is shifted one codon relative to the ribosome. It has been shown that EF-G/EF-2_IV domain mimics the shape of anticodon arm of the tRNA in the structurally homologous ternary complex of Phe-tRNA, EF-Tu (another translational elongation factor) and GTP analog. 
The tip portion of this domain is found in a position that overlaps the anticodon arm of the A-site tRNA, implying that EF-G/EF-2 displaces the A-site tRNA to the P-site by physical interaction with the anticodon arm." Q#27186 - CGI_10010965 superfamily 243183 854 931 5.80E-36 132.059 cl02785 Elongation_Factor_C superfamily - - "Elongation factor G C-terminus. This domain includes the carboxyl terminal regions of elongation factors (EFs) bacterial EF-G, eukaryotic and archeal EF-2 and eukaryotic mitochondrial mtEFG1s and mtEFG2s. This group also includes proteins similar to the ribosomal protection proteins Tet(M) and Tet(O), BipA, LepA and, spliceosomal proteins: human 116kD U5 small nuclear ribonucleoprotein (snRNP) protein (U5-116 kD) and yeast counterpart Snu114p. This domain adopts a ferredoxin-like fold consisting of an alpha-beta sandwich with anti-parallel beta-sheets, resembling the topology of domain III found in the elongation factors EF-G and eukaryotic EF-2, with which it forms the C-terminal block. The two domains however are not superimposable and domain III lacks some of the characteristics of this domain. EF-2/EF-G in complex with GTP, promotes the translocation step of translation. During translocation the peptidyl-tRNA is moved from the A site to the P site, the uncharged tRNA from the P site to the E-site and, the mRNA is shifted one codon relative to the ribosome. Tet(M) and Tet(O) mediate Tc resistance. Typical Tcs bind to the ribosome and inhibit the elongation phase of protein synthesis, by inhibiting the occupation of site A by aminoacyl-tRNA. Tet(M) and Tet(O) catalyze the release of tetracycline (Tc) from the ribosome in a GTP-dependent manner. BipA is a highly conserved protein with global regulatory properties in Escherichia coli. Yeast Snu114p is essential for cell viability and for splicing in vivo. Experiments suggest that GTP binding and probably GTP hydrolysis is important for the function of the U5-116 kD/Snu114p. 
The function of LepA proteins is unknown." Q#27187 - CGI_10010966 superfamily 243072 15 140 1.36E-31 115.173 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#27189 - CGI_10005933 superfamily 111397 6 61 0.00196511 35.0095 cl03620 HYR superfamily N - "HYR domain; This domain is known as the HYR (Hyalin Repeat) domain, after the protein hyalin that is composed exclusively of this repeat. This domain probably corresponds to a new superfamily in the immunoglobulin fold. The function of this domain is uncertain it may be involved in cell adhesion." Q#27190 - CGI_10011897 superfamily 246669 366 485 2.95E-51 175.836 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. 
C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#27190 - CGI_10011897 superfamily 246669 500 613 7.87E-38 138.082 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#27190 - CGI_10011897 superfamily 241578 635 861 1.28E-83 270.782 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. 
There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#27191 - CGI_10011898 superfamily 243066 18 121 2.95E-26 102.696 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#27191 - CGI_10011898 superfamily 198867 130 230 8.38E-23 93.1748 cl06652 BACK superfamily - - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation)." Q#27191 - CGI_10011898 superfamily 243146 369 415 1.91E-07 48.3235 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. 
" Q#27191 - CGI_10011898 superfamily 243146 304 355 1.31E-06 45.7302 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#27191 - CGI_10011898 superfamily 243146 406 448 1.87E-05 42.2634 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#27192 - CGI_10011899 superfamily 246669 17 136 4.10E-40 141.938 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#27192 - CGI_10011899 superfamily 246669 151 230 1.05E-26 104.57 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. 
Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#27192 - CGI_10011899 superfamily 241578 315 534 2.77E-81 256.915 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." 
Q#27193 - CGI_10011900 superfamily 246597 1 227 2.74E-69 214.452 cl13995 MPP_superfamily superfamily - - "metallophosphatase superfamily, metallophosphatase domain; Metallophosphatases (MPPs), also known as metallophosphoesterases, phosphodiesterases (PDEs), binuclear metallophosphoesterases, and dimetal-containing phosphoesterases (DMPs), represent a diverse superfamily of enzymes with a conserved domain containing an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. This superfamily includes: the phosphoprotein phosphatases (PPPs), Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination." Q#27194 - CGI_10011901 superfamily 222269 174 331 0.00818049 36.1474 cl18657 Cupin_8 superfamily C - Cupin-like domain; This cupin like domain shares similarity to the JmjC domain. Q#27195 - CGI_10011902 superfamily 222269 49 249 3.29E-08 51.5554 cl18657 Cupin_8 superfamily - - Cupin-like domain; This cupin like domain shares similarity to the JmjC domain. Q#27197 - CGI_10011904 superfamily 241570 581 697 3.44E-24 100.093 cl00047 CAP_ED superfamily - - "effector domain of the CAP family of transcription factors; members include CAP (or cAMP receptor protein (CRP)), which binds cAMP, FNR (fumarate and nitrate reduction), which uses an iron-sulfur cluster to sense oxygen) and CooA, a heme containing CO sensor. In all cases binding of the effector leads to conformational changes and the ability to activate transcription. 
Cyclic nucleotide-binding domain similar to CAP are also present in cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) and vertebrate cyclic nucleotide-gated ion-channels. Cyclic nucleotide-monophosphate binding domain; proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues; the best studied is the prokaryotic catabolite gene activator, CAP, where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure; three conserved glycine residues are thought to be essential for maintenance of the structural integrity of the beta-barrel; CooA is a homodimeric transcription factor that belongs to CAP family; cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain; cAPK's are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain; cGPK's are single chain enzymes that include the two copies of the domain in their N-terminal section; also found in vertebrate cyclic nucleotide-gated ion-channels" Q#27199 - CGI_10011906 superfamily 241899 146 336 2.04E-32 121.496 cl00489 60KD_IMP superfamily - - 60Kd inner membrane protein; 60Kd inner membrane protein. Q#27200 - CGI_10011907 superfamily 150812 1 60 1.11E-16 68.1838 cl10884 Tmemb_170 superfamily N - Putative transmembrane protein 170; Tmem170 is a family of putative transmembrane proteins conserved from nematodes to humans. The protein is only of approximately 130 amino acids in length. The function is unknown. Q#27202 - CGI_10011909 superfamily 248097 117 242 1.83E-16 72.683 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. 
Q#27203 - CGI_10011910 superfamily 243507 46 227 3.34E-28 108.597 cl03728 Alpha_kinase superfamily - - "Alpha-kinase family; This family is a novel family of eukaryotic protein kinase catalytic domains, which have no detectable similarity to conventional kinases. The family contains myosin heavy chain kinases and Elongation Factor-2 kinase and a bifunctional ion channel. This family is known as the alpha-kinase family. The structure of the kinase domain revealed unexpected similarity to eukaryotic protein kinases in the catalytic core as well as to metabolic enzymes with ATP-grasp domains." Q#27204 - CGI_10011911 superfamily 215827 161 327 9.70E-23 96.3835 cl02830 Tyrosinase superfamily - - Common central domain of tyrosinase; This family also contains polyphenol oxidases and some hemocyanins. Binds two copper ions via two sets of three histidines. This family is related to pfam00372. Q#27205 - CGI_10011912 superfamily 215827 146 324 7.29E-37 136.829 cl02830 Tyrosinase superfamily - - Common central domain of tyrosinase; This family also contains polyphenol oxidases and some hemocyanins. Binds two copper ions via two sets of three histidines. This family is related to pfam00372. Q#27206 - CGI_10011913 superfamily 215827 148 325 2.35E-34 130.281 cl02830 Tyrosinase superfamily - - Common central domain of tyrosinase; This family also contains polyphenol oxidases and some hemocyanins. Binds two copper ions via two sets of three histidines. This family is related to pfam00372. Q#27209 - CGI_10011916 superfamily 215827 38 215 6.09E-38 139.141 cl02830 Tyrosinase superfamily - - Common central domain of tyrosinase; This family also contains polyphenol oxidases and some hemocyanins. Binds two copper ions via two sets of three histidines. This family is related to pfam00372. 
Q#27210 - CGI_10004096 superfamily 246723 149 693 0 603.141 cl14813 GluZincin superfamily - - "Peptidase Gluzincin family (thermolysin-like proteinases, TLPs) includes peptidases M1, M2, M3, M4, M13, M32 and M36 (fungalysins); Gluzincin family (thermolysin-like peptidases or TLPs) includes several zinc-dependent metallopeptidases such as the M1, M2, M3, M4, M13, M32, M36 peptidases (MEROPS classification), and contain HEXXH and EXXXD motifs as part of their active site. All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. M1 family includes aminopeptidase N (APN) and leukotriene A4 hydrolase (LTA4H). APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types. LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity such that the two activities occupy different, but overlapping sites. The peptidase M3 or neurolysin-like family, includes M3, M2 and M32 metallopeptidases. The M3 peptidases have two subfamilies: M3A, includes thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (3.4.24.16), and the mitochondrial intermediate peptidase; M3B contains oligopeptidase F. M2 peptidase angiotensin converting enzyme (ACE, EC 3.4.15.1) catalyzes the conversion of decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II. ACE is a key part of the renin-angiotensin system that regulates blood pressure, thus ACE inhibitors are important for the treatment of hypertension. M32 family includes two eukaryotic enzymes from protozoa Trypanosoma cruzi, a causative agent of Chagas' disease, and Leishmania major, a parasite that causes leishmaniasis, making them attractive targets for drug development. 
The M4 family includes secreted protease thermolysin (EC 3.4.24.27), pseudolysin, aureolysin, neutral protease as well as fungalysin and bacillolysin (EC 3.4.24.28) that degrade extracellular proteins and peptides for bacterial nutrition, especially prior to sporulation. Thermolysin is widely used as a nonspecific protease to obtain fragments for peptide sequencing as well as in production of the artificial sweetener aspartame. M13 family includes neprilysin (EC 3.4.24.11) and endothelin-converting enzyme I (ECE-1, EC 3.4.24.71), which fulfill a broad range of physiological roles due to the greater variation in the S2' subsite allowing substrate specificity and are prime therapeutic targets for selective inhibition. Peptidase M36 (fungalysin) family includes endopeptidases from pathogenic fungi. Fungalysin hydrolyzes extracellular matrix proteins such as elastin and keratin. Aspergillus fumigatus causes the pulmonary disease aspergillosis by invading the lungs of immuno-compromised animals and secreting fungalysin that possibly breaks down proteinaceous structural barriers." Q#27212 - CGI_10004098 superfamily 243091 231 275 3.86E-09 53.65 cl02566 SET superfamily N - "SET domain; SET domains are protein lysine methyltransferase enzymes. SET domains appear to be protein-protein interaction domains. It has been demonstrated that SET domains mediate interactions with a family of proteins that display similarity with dual-specificity phosphatases (dsPTPases). A subset of SET domains have been called PR domains. These domains are divergent in sequence from other SET domains, but also appear to mediate protein-protein interaction. The SET domain consists of two regions known as SET-N and SET-C. SET-C forms an unusual and conserved knot-like structure of probably functional importance. Additionally to SET-N and SET-C, an insert region (SET-I) and flanking regions of high structural variability form part of the overall structure."
Q#27213 - CGI_10004099 superfamily 243092 37 334 7.46E-23 95.4796 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#27214 - CGI_10005452 superfamily 247743 2709 2854 1.89E-06 50.2223 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." 
Q#27217 - CGI_10022109 superfamily 241574 11 55 0.0002775 34.2567 cl00053 PTPc superfamily N - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#27218 - CGI_10022110 superfamily 241584 109 221 2.31E-08 51.3431 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#27218 - CGI_10022110 superfamily 241574 348 451 4.76E-18 82.2485 cl00053 PTPc superfamily C - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. 
Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#27219 - CGI_10022111 superfamily 245201 82 392 0 643.003 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#27220 - CGI_10022112 superfamily 243058 879 982 0.00223277 38.8348 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#27220 - CGI_10022112 superfamily 243267 1 86 2.03E-14 75.728 cl03000 Innexin superfamily N - "Innexin; This family includes the drosophila proteins Ogre and shaking-B, and the C. elegans proteins Unc-7 and Unc-9. Members of this family are integral membrane proteins which are involved in the formation of gap junctions. This family has been named the Innexins." 
Q#27221 - CGI_10022113 superfamily 243267 582 670 0.00816303 37.5932 cl03000 Innexin superfamily N - "Innexin; This family includes the drosophila proteins Ogre and shaking-B, and the C. elegans proteins Unc-7 and Unc-9. Members of this family are integral membrane proteins which are involved in the formation of gap junctions. This family has been named the Innexins." Q#27223 - CGI_10022115 superfamily 241563 68 109 9.87E-06 44.0072 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#27223 - CGI_10022115 superfamily 241563 24 59 0.00127207 37.844 cl00034 BBOX superfamily N - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#27223 - CGI_10022115 superfamily 241672 347 420 0.00297131 39.2596 cl00192 ribokinase_pfkB_like superfamily C - "ribokinase/pfkB superfamily: Kinases that accept a wide variety of substrates, including carbohydrates and aromatic small molecules, all are phosphorylated at a hydroxyl group. The superfamily includes ribokinase, fructokinase, ketohexokinase, 2-dehydro-3-deoxygluconokinase, 1-phosphofructokinase, the minor 6-phosphofructokinase (PfkB), inosine-guanosine kinase, and adenosine kinase. Even though there is a high degree of structural conservation within this superfamily, their multimerization level varies widely, monomeric (e.g. adenosine kinase), dimeric (e.g. ribokinase), and trimeric (e.g THZ kinase)." 
Q#27226 - CGI_10022118 superfamily 241874 9 548 0 714.066 cl00456 SLC5-6-like_sbd superfamily - - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#27227 - CGI_10022119 superfamily 241874 9 528 0 692.11 cl00456 SLC5-6-like_sbd superfamily - - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. 
NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#27230 - CGI_10022122 superfamily 245206 269 497 1.85E-79 249.905 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#27231 - CGI_10022123 superfamily 241599 1209 1264 5.62E-13 66.498 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#27231 - CGI_10022123 superfamily 202226 1102 1180 5.37E-29 113.542 cl08348 CUT superfamily - - "CUT domain; The CUT domain is a DNA-binding motif which can bind independently or in cooperation with the homeodomain, often found downstream of the CUT domain. 
Multiple copies of the CUT domain can exist in one protein." Q#27231 - CGI_10022123 superfamily 202226 926 1002 7.68E-29 112.772 cl08348 CUT superfamily - - "CUT domain; The CUT domain is a DNA-binding motif which can bind independently or in cooperation with the homeodomain, often found downstream of the CUT domain. Multiple copies of the CUT domain can exist in one protein." Q#27231 - CGI_10022123 superfamily 202226 538 622 7.65E-26 104.297 cl08348 CUT superfamily - - "CUT domain; The CUT domain is a DNA-binding motif which can bind independently or in cooperation with the homeodomain, often found downstream of the CUT domain. Multiple copies of the CUT domain can exist in one protein." Q#27232 - CGI_10022124 superfamily 222430 565 653 0.000940792 38.7664 cl16445 Nup54 superfamily N - "Nucleoporin complex subunit 54; This is the human Nup54 subunit of the nucleoporin complex, equivalent to Nup57 of yeast. Nup54, Nup58 and Nup62 all have similar affinities for importin-beta. It seems likely that they are the only FG-repeat nucleoporins of the central channel, and as such they would form a zone of equal affinity spanning the central channel. The diffusion of importin-beta import complexes through the central channel may be a stochastic process as the affinities are similar, whereas movement from cytoplasmic fibrils to the central channel and from the central channel to the nuclear basket would be facilitated by the subtle differences in affinity between them." Q#27233 - CGI_10022125 superfamily 241868 60 148 1.23E-11 58.6599 cl00447 Nudix_Hydrolase superfamily C - "Nudix hydrolase is a superfamily of enzymes found in all three kingdoms of life, and it catalyzes the hydrolysis of NUcleoside DIphosphates linked to other moieties, X. Enzymes belonging to this superfamily require a divalent cation, such as Mg2+ or Mn2+ for their activity.
Members of this family are recognized by a highly conserved 23-residue nudix motif (GX5EX7REUXEEXGU, where U = I, L or V), which forms a structural motif that functions as a metal binding and catalytic site. Substrates of nudix hydrolase include intact and oxidatively damaged nucleoside triphosphates, dinucleoside polyphosphates, nucleotide-sugars and dinucleotide enzymes. These substrates are metabolites or cell signaling molecules that require regulation during different stages of the cell cycle or during periods of stress. In general, the role of the nudix hydrolase is to sanitize the nucleotide pools and to maintain cell viability, thereby serving as surveillance and "house-cleaning" enzymes. Substrate specificity is used to define child families within the superfamily. Differences in substrate specificity are determined by the N-terminal extension or by residues in variable loop regions. Mechanistically, substrate hydrolysis occurs by a nucleophilic substitution reaction, with variation in the numbers and roles of divalent cations required. This superfamily consists of at least nine families: IPP (isopentenyl diphosphate) isomerase, ADP ribose pyrophosphatase, mutT pyrophosphohydrolase, coenzyme-A pyrophosphatase, MTH1-7,8-dihydro-8-oxoguanine-triphosphatase, diadenosine tetraphosphate hydrolase, NADH pyrophosphatase, GDP-mannose hydrolase and the c-terminal portion of the mutY adenine glycosylase." Q#27234 - CGI_10022126 superfamily 244824 8 378 4.55E-124 371.896 cl07893 AmyAc_family superfamily - - "Alpha amylase catalytic domain family; The Alpha-amylase family comprises the largest family of glycoside hydrolases (GH), with the majority of enzymes acting on starch, glycogen, and related oligo- and polysaccharides. These proteins catalyze the transformation of alpha-1,4 and alpha-1,6 glucosidic linkages with retention of the anomeric center. The protein is described as having 3 domains: A, B, C. 
A is a (beta/alpha) 8-barrel; B is a loop between the beta 3 strand and alpha 3 helix of A; and C is the C-terminal extension characterized by a Greek key. The majority of the enzymes have an active site cleft found between domains A and B where a triad of catalytic residues (Asp, Glu and Asp) performs catalysis. Other members of this family have lost this catalytic activity as in the case of the human 4F2hc, or only have 2 residues that serve as the catalytic nucleophile and the acid/base, such as Thermus A4 beta-galactosidase with 2 Glu residues (GH42) and human alpha-galactosidase with 2 Asp residues (GH31). The family members are quite extensive and include: alpha amylase, maltosyltransferase, cyclodextrin glycotransferase, maltogenic amylase, neopullulanase, isoamylase, 1,4-alpha-D-glucan maltotetrahydrolase, 4-alpha-glucotransferase, oligo-1,6-glucosidase, amylosucrase, sucrose phosphorylase, and amylomaltase." Q#27234 - CGI_10022126 superfamily 241832 407 577 5.28E-46 159.988 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." 
Q#27235 - CGI_10022127 superfamily 241832 32 202 1.72E-47 155.366 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#27236 - CGI_10022128 superfamily 245864 27 481 2.96E-136 403.968 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. 
While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#27237 - CGI_10022129 superfamily 215859 175 308 3.46E-07 49.5223 cl18347 Peptidase_S9 superfamily C - Prolyl oligopeptidase family; Prolyl oligopeptidase family. Q#27238 - CGI_10022130 superfamily 247941 259 406 1.51E-17 78.9168 cl17387 Methyltransf_21 superfamily - - "Methyltransferase FkbM domain; This family has members from bacteria to human, and appears to be a methyltransferase." Q#27241 - CGI_10022133 superfamily 219125 441 632 1.29E-85 268.427 cl05941 C5-epim_C superfamily - - D-glucuronyl C5-epimerase C-terminus; This family represents the C-terminus of D-glucuronyl C5-epimerase (EC:5.1.3.-). Glucuronyl C5-epimerases catalyze the conversion of D-glucuronic acid (GlcUA) to L-iduronic acid (IdceA) units during the biosynthesis of glycosaminoglycans. Q#27242 - CGI_10011551 superfamily 241546 12 124 8.96E-26 100.106 cl00011 PLAT superfamily - - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats. The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#27242 - CGI_10011551 superfamily 215847 224 362 3.80E-27 112.155 cl09510 Lipoxygenase superfamily NC - Lipoxygenase; Lipoxygenase. Q#27243 - CGI_10011552 superfamily 241546 3 108 6.52E-22 85.8536 cl00011 PLAT superfamily - - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel.
The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats. The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#27244 - CGI_10011553 superfamily 241546 6 110 1.04E-28 104.174 cl00011 PLAT superfamily - - "PLAT (Polycystin-1, Lipoxygenase, Alpha-Toxin) domain or LH2 (Lipoxygenase homology 2) domain. It consists of an eight stranded beta-barrel. The domain can be found in various domain architectures, in case of lipoxygenases, alpha toxin, lipases and polycystin, but also as a single domain or as repeats. The putative function of this domain is to facilitate access to sequestered membrane or micelle bound substrates." Q#27245 - CGI_10011554 superfamily 241659 70 163 1.45E-27 101.956 cl00175 alpha-crystallin-Hsps_p23-like superfamily - - "alpha-crystallin domain (ACD) found in alpha-crystallin-type small heat shock proteins, and a similar domain found in p23 (a cochaperone for Hsp90) and in other p23-like proteins.; The alpha-crystallin-Hsps_p23-like superfamily includes the alpha-crystallin domain (ACD) of alpha-crystallin-type small heat shock proteins (sHsps) and a similar domain found in p23-like proteins. sHsps are small stress induced proteins with monomeric masses between 12-43 kDa, whose common feature is this ACD. sHsps are generally active as large oligomers consisting of multiple subunits, and are believed to be ATP-independent chaperones that prevent aggregation and are important in refolding in combination with other Hsps. p23 is a cochaperone of the Hsp90 chaperoning pathway. It binds Hsp90 and participates in the folding of a number of Hsp90 clients including the progesterone receptor. p23 also has a passive chaperoning activity. p23 in addition may act as the cytosolic prostaglandin E2 synthase.
Included in this superfamily is the p23-like C-terminal CHORD-SGT1 (CS) domain of suppressor of G2 allele of Skp1 (Sgt1) and the p23-like domains of human butyrate-induced transcript 1 (hB-ind1), NUD (nuclear distribution) C, Melusin, and NAD(P)H cytochrome b5 (NCB5) oxidoreductase (OR)." Q#27245 - CGI_10011554 superfamily 218373 190 217 2.26E-07 46.2445 cl04882 SGS superfamily NC - "SGS domain; This domain was thought to be unique to the SGT1-like proteins, but is also found in calcyclin binding proteins." Q#27245 - CGI_10011554 superfamily 149931 3 73 3.13E-07 45.9075 cl07592 Siah-Interact_N superfamily - - "Siah interacting protein, N terminal; The N terminal domain of Siah interacting protein (SIP) adopts a helical hairpin structure with a hydrophobic core stabilised by a classic knobs-and-holes arrangement of side chains contributed by the two amphipathic helices. Little is known about this domain's function, except that it is crucial for interactions with Siah. It has also been hypothesised that SIP can dimerise through this N terminal domain." Q#27246 - CGI_10011555 superfamily 215859 294 342 0.00794524 36.4255 cl18347 Peptidase_S9 superfamily NC - Prolyl oligopeptidase family; Prolyl oligopeptidase family. Q#27249 - CGI_10011558 superfamily 241884 40 228 1.75E-144 406.198 cl00467 Ntn_hydrolase superfamily - - "The Ntn hydrolases (N-terminal nucleophile) are a diverse superfamily of enzymes that are activated autocatalytically via an N-terminally located nucleophilic amino acid. N-terminal nucleophile (NTN-) hydrolase superfamily, which contains a four-layered alpha, beta, beta, alpha core structure. This family of hydrolases includes penicillin acylase, the 20S proteasome alpha and beta subunits, and glutamate synthase. The mechanism of activation of these proteins is conserved, although they differ in their substrate specificities.
All known members catalyze the hydrolysis of amide bonds in either proteins or small molecules, and each one of them is synthesized as a preprotein. For each, an autocatalytic endoproteolytic process generates a new N-terminal residue. This mature N-terminal residue is central to catalysis and acts as both a polarizing base and a nucleophile during the reaction. The N-terminal amino group acts as the proton acceptor and activates either the nucleophilic hydroxyl in a Ser or Thr residue or the nucleophilic thiol in a Cys residue. The position of the N-terminal nucleophile in the active site and the mechanism of catalysis are conserved in this family, despite considerable variation in the protein sequences." Q#27249 - CGI_10011558 superfamily 204931 231 269 4.96E-10 53.6172 cl13849 Pr_beta_C superfamily - - "Proteasome beta subunits C terminal; This domain family is found in eukaryotes, and is approximately 40 amino acids in length. The family is found in association with pfam00227. There is a conserved GTT sequence motif. There is a single completely conserved residue Y that may be functionally important. This family includes the C terminal of the beta-type subunits of the proteasome, a multimeric complex that degrades proteins into peptides as part of the MHC class I-mediated Ag-presenting pathway." Q#27251 - CGI_10011560 superfamily 241889 21 178 9.22E-36 124.738 cl00474 PAP2_like superfamily - - "PAP2_like proteins, a super-family of histidine phosphatases and vanadium haloperoxidases, includes type 2 phosphatidic acid phosphatase or lipid phosphate phosphatase (LPP), Glucose-6-phosphatase, Phosphatidylglycerophosphatase B and bacterial acid phosphatase, vanadium chloroperoxidases, vanadium bromoperoxidases, and several other mostly uncharacterized subfamilies. Several members of this superfamily have been predicted to be transmembrane proteins." 
Q#27252 - CGI_10011561 superfamily 245201 29 257 8.06E-161 451.504 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#27254 - CGI_10011563 superfamily 241572 117 202 5.68E-14 66.108 cl00050 CYCLIN superfamily - - "Cyclin box fold. Protein binding domain functioning in cell-cycle and transcription control. Present in cyclins, TFIIB and Retinoblastoma (RB). The cyclins consist of 8 classes of cell cycle regulators that regulate cyclin dependent kinases (CDKs). TFIIB is a transcription factor that binds the TATA box. Cyclins, TFIIB and RB contain 2 copies of the domain." Q#27254 - CGI_10011563 superfamily 241572 216 296 2.41E-05 41.4553 cl00050 CYCLIN superfamily - - "Cyclin box fold. Protein binding domain functioning in cell-cycle and transcription control. Present in cyclins, TFIIB and Retinoblastoma (RB). The cyclins consist of 8 classes of cell cycle regulators that regulate cyclin dependent kinases (CDKs). TFIIB is a transcription factor that binds the TATA box. Cyclins, TFIIB and RB contain 2 copies of the domain." Q#27254 - CGI_10011563 superfamily 203895 16 53 0.000924894 36.4926 cl07036 TF_Zn_Ribbon superfamily - - TFIIB zinc-binding; The transcription factor TFIIB contains a zinc-binding motif near the N-terminus. This domain is involved in the interaction with RNA pol II and TFIIF and plays a crucial role in selecting the transcription initiation site. The domain adopts a zinc ribbon like structure.
Q#27255 - CGI_10011564 superfamily 241603 93 325 1.45E-74 231.875 cl00089 NUC superfamily - - DNA/RNA non-specific endonuclease; prokaryotic and eukaryotic double- and single-stranded DNA and RNA endonucleases also present in phosphodiesterases. They exist as monomers and homodimers. Q#27256 - CGI_10011565 superfamily 247948 406 457 2.56E-12 62.4086 cl17394 RINGv superfamily - - RING-variant domain; RING-variant domain. Q#27257 - CGI_10011566 superfamily 241570 515 632 2.07E-21 90.8481 cl00047 CAP_ED superfamily - - "effector domain of the CAP family of transcription factors; members include CAP (or cAMP receptor protein (CRP)), which binds cAMP, FNR (fumarate and nitrate reduction), which uses an iron-sulfur cluster to sense oxygen, and CooA, a heme containing CO sensor. In all cases binding of the effector leads to conformational changes and the ability to activate transcription. Cyclic nucleotide-binding domains similar to CAP are also present in cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) and vertebrate cyclic nucleotide-gated ion-channels.
Cyclic nucleotide-monophosphate binding domain; proteins that bind cyclic nucleotides (cAMP or cGMP) share a structural domain of about 120 residues; the best studied is the prokaryotic catabolite gene activator, CAP, where such a domain is known to be composed of three alpha-helices and a distinctive eight-stranded, antiparallel beta-barrel structure; three conserved glycine residues are thought to be essential for maintenance of the structural integrity of the beta-barrel; CooA is a homodimeric transcription factor that belongs to CAP family; cAMP- and cGMP-dependent protein kinases (cAPK and cGPK) contain two tandem copies of the cyclic nucleotide-binding domain; cAPK's are composed of two different subunits, a catalytic chain and a regulatory chain, which contains both copies of the domain; cGPK's are single chain enzymes that include the two copies of the domain in their N-terminal section; also found in vertebrate cyclic nucleotide-gated ion-channels" Q#27258 - CGI_10011567 superfamily 247069 81 210 2.24E-22 93.2186 cl15787 SEC14 superfamily - - "Sec14p-like lipid-binding domain. Found in secretory proteins, such as S. cerevisiae phosphatidylinositol transfer protein (Sec14p), and in lipid regulated proteins such as RhoGAPs, RhoGEFs and neurofibromin (NF1). SEC14 domain of Dbl is known to associate with G protein beta/gamma subunits." Q#27258 - CGI_10011567 superfamily 201362 265 368 1.42E-15 72.7831 cl08277 Motile_Sperm superfamily - - MSP (Major sperm protein) domain; Major sperm proteins are involved in sperm motility. These proteins oligomerise to form filaments. This family contains many other proteins. 
Q#27259 - CGI_10011568 superfamily 152683 246 327 2.12E-13 67.3129 cl13656 Methyltransf_FA superfamily - - "Farnesoic acid O-methyl transferase; This domain family is found in bacteria and eukaryotes, and is approximately 110 amino acids in length. Farnesoic acid O-methyl transferase (FAMeT) is the enzyme that catalyzes the formation of methyl farnesoate (MF) from farnesoic acid (FA) in the biosynthetic pathway of juvenile hormone (JH)." Q#27263 - CGI_10011572 superfamily 247805 563 695 3.87E-29 113.972 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#27264 - CGI_10011573 superfamily 247905 10 122 6.76E-08 49.5437 cl17351 HELICc superfamily N - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#27264 - CGI_10011573 superfamily 243778 176 266 3.94E-39 134.274 cl04503 HA2 superfamily - - "Helicase associated domain (HA2); This presumed domain is about 90 amino acid residues in length. It is found in a diverse set of RNA helicases. Its function is unknown, however it seems likely to be involved in nucleic acid binding."
Q#27266 - CGI_10011575 superfamily 248458 16 408 2.46E-11 63.4869 cl17904 MFS superfamily - - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#27267 - CGI_10011576 superfamily 221187 1 123 2.13E-40 137.038 cl13212 Malectin superfamily N - "Di-glucose binding within endoplasmic reticulum; Malectin is a membrane-anchored protein of the endoplasmic reticulum that recognises and binds Glc2-N-glycan. 
It carries a signal peptide from residues 1-26, a C-terminal transmembrane helix from residues 255-274, and a highly conserved central part of approximately 190 residues followed by an acidic, glutamate-rich region. Carbohydrate-binding is mediated by the four aromatic residues, Y67, Y89, Y116, and F117 and the aspartate at D186. NMR-based ligand-screening studies have shown binding of the protein to maltose and related oligosaccharides, on the basis of which the protein has been designated "malectin", and its endogenous ligand is found to be Glc2-high-mannose N-glycan." Q#27268 - CGI_10011577 superfamily 248472 1 112 8.08E-29 101.956 cl17918 Ribosomal_P1_P2_L12p superfamily - - "Ribosomal protein P1, P2, and L12p. Ribosomal proteins P1 and P2 are the eukaryotic proteins that are functionally equivalent to bacterial L7/L12. L12p is the archaeal homolog. Unlike other ribosomal proteins, the archaeal L12p and eukaryotic P1 and P2 do not share sequence similarity with their bacterial counterparts. They are part of the ribosomal stalk (called the L7/L12 stalk in bacteria), along with 28S rRNA and the proteins L11 and P0 in eukaryotes (23S rRNA, L11, and L10e in archaea). In bacterial ribosomes, L7/L12 homodimers bind the extended C-terminal helix of L10 to anchor the L7/L12 molecules to the ribosome. Eukaryotic P1/P2 heterodimers and archaeal L12p homodimers are believed to bind the L10 equivalent proteins, eukaryotic P0 and archaeal L10e, in a similar fashion. P1 and P2 (L12p, L7/L12) are the only proteins in the ribosome to occur as multimers, always appearing as sets of dimers. Recent data indicate that most archaeal species contain six copies of L12p (three homodimers), while eukaryotes have two copies each of P1 and P2 (two heterodimers). Bacteria may have four or six copies (two or three homodimers), depending on the species. As in bacteria, the stalk is crucial for binding of initiation, elongation, and release factors in eukaryotes and archaea." 
Q#27269 - CGI_10011578 superfamily 241578 69 241 1.70E-43 150.616 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#27270 - CGI_10010574 superfamily 247724 91 285 1.50E-76 240.13 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#27270 - CGI_10010574 superfamily 207690 17 43 0.00193577 36.1737 cl02656 zf-RanBP superfamily - - Zn-finger in Ran binding protein and others; Zn-finger in Ran binding protein and others. Q#27271 - CGI_10023791 superfamily 203134 388 451 0.00917297 34.9649 cl04866 CHORD superfamily - - "CHORD; CHORD represents a Zn binding domain. Silencing of the C. elegans CHORD-containing gene results in semisterility and embryo lethality, suggesting an essential function of the wild-type gene in nematode development." Q#27273 - CGI_10023793 superfamily 217865 2047 2410 0 578.041 cl12285 Not1 superfamily - - "CCR4-Not complex component, Not1; The Ccr4-Not complex is a global regulator of transcription that affects genes positively and negatively and is thought to regulate transcription factor TFIID." Q#27273 - CGI_10023793 superfamily 221802 1414 1567 2.05E-56 195.103 cl15117 DUF3819 superfamily - - Domain of unknown function (DUF3819); This is an uncharacterized domain that is found on the CCR4-Not complex component Not1. Not1 is a global regulator of transcription that affects genes positively and negatively and is thought to regulate transcription factor TFIID. 
Q#27274 - CGI_10023794 superfamily 241666 3 158 4.53E-16 71.594 cl00184 CAS_like superfamily N - "Clavaminic acid synthetase (CAS) -like; CAS is a trifunctional Fe(II)/ 2-oxoglutarate (2OG) oxygenase carrying out three reactions in the biosynthesis of clavulanic acid, an inhibitor of class A serine beta-lactamases. In general, Fe(II)-2OG oxygenases catalyze a hydroxylation reaction, which leads to the incorporation of an oxygen atom from dioxygen into a hydroxyl group and conversion of 2OG to succinate and CO2" Q#27275 - CGI_10023795 superfamily 248360 13 205 1.88E-57 183.631 cl17806 DER1 superfamily - - "Der1-like family; The endoplasmic reticulum (ER) of the yeast Saccharomyces cerevisiae contains a proteolytic system able to selectively degrade misfolded lumenal secretory proteins. For examination of the components involved in this degradation process, mutants were isolated. They could be divided into four complementation groups. The mutations led to stabilisation of two different substrates for this process. The mutant classes were called 'der' for 'degradation in the ER'. DER1 was cloned by complementation of the der1-2 mutation. The DER1 gene codes for a novel, hydrophobic protein, that is localised to the ER. Deletion of DER1 abolished degradation of the substrate proteins. The function of the Der1 protein seems to be specifically required for the degradation process associated with the ER. Interestingly, this family seems distantly related to the Rhomboid family of membrane peptidases, suggesting that this family may also mediate degradation of misfolded proteins (Bateman A pers. obs.)." 
Q#27276 - CGI_10023796 superfamily 243092 41 336 4.13E-34 134 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#27276 - CGI_10023796 superfamily 243056 420 601 9.28E-12 65.0729 cl02495 RabGAP-TBC superfamily - - "Rab-GTPase-TBC domain; Identification of a TBC domain in GYP6_YEAST and GYP7_YEAST, which are GTPase activator proteins of yeast Ypt6 and Ypt7, implies that these domains are GTPase activator proteins of Rab-like small GTPases." Q#27277 - CGI_10023797 superfamily 241770 210 272 4.44E-11 58.9464 cl00309 PRTases_typeI superfamily N - "Phosphoribosyl transferase (PRT)-type I domain; Phosphoribosyl transferase (PRT) domain. The type I PRTases are identified by a conserved PRPP binding motif which features two adjacent acidic residues surrounded by one or more hydrophobic residue. 
PRTases catalyze the displacement of the alpha-1'-pyrophosphate of 5-phosphoribosyl-alpha1-pyrophosphate (PRPP) by a nitrogen-containing nucleophile. The reaction products are an alpha-1 substituted ribose-5'-phosphate and a free pyrophosphate (PP). PRPP, an activated form of ribose-5-phosphate, is a key metabolite connecting nucleotide synthesis and salvage pathways. The type I PRTase family includes a range of diverse phosphoribosyl transferase enzymes and regulatory proteins of the nucleotide synthesis and salvage pathways, including adenine phosphoribosyltransferase EC:2.4.2.7., hypoxanthine-guanine-xanthine phosphoribosyltransferase, hypoxanthine phosphoribosyltransferase EC:2.4.2.8., ribose-phosphate pyrophosphokinase EC:2.7.6.1., amidophosphoribosyltransferase EC:2.4.2.14., orotate phosphoribosyltransferase EC:2.4.2.10., uracil phosphoribosyltransferase EC:2.4.2.9., and xanthine-guanine phosphoribosyltransferase EC:2.4.2.22." Q#27277 - CGI_10023797 superfamily 222383 37 121 1.17E-06 45.8654 cl16402 Pribosyltran_N superfamily N - "N-terminal domain of ribose phosphate pyrophosphokinase; This family is frequently found N-terminal to the Pribosyltran, pfam00156." Q#27278 - CGI_10023798 superfamily 245835 209 421 5.00E-69 223.731 cl12013 BAR superfamily - - "The Bin/Amphiphysin/Rvs (BAR) domain, a dimerization module that binds membranes and detects membrane curvature; BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions including organelle biogenesis, membrane trafficking or remodeling, and cell division and migration. Mutations in BAR containing proteins have been linked to diseases and their inactivation in cells leads to altered membrane dynamics. A BAR domain with an additional N-terminal amphipathic helix (an N-BAR) can drive membrane curvature. These N-BAR domains are found in amphiphysins and endophilins, among others. 
BAR domains are also frequently found alongside domains that determine lipid specificity, such as the Pleckstrin Homology (PH) and Phox Homology (PX) domains which are present in beta centaurins (ACAPs and ASAPs) and sorting nexins, respectively. A FES-CIP4 Homology (FCH) domain together with a coiled coil region is called the F-BAR domain and is present in Pombe/Cdc15 homology (PCH) family proteins, which include Fes/Fes tyrosine kinases, PACSIN or syndapin, CIP4-like proteins, and srGAPs, among others. The Inverse (I)-BAR or IRSp53/MIM homology Domain (IMD) is found in multi-domain proteins, such as IRSp53 and MIM, that act as scaffolding proteins and transducers of a variety of signaling pathways that link membrane dynamics and the underlying actin cytoskeleton. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The I-BAR domain induces membrane protrusions in the opposite direction compared to classical BAR and F-BAR domains, which produce membrane invaginations. BAR domains that also serve as protein interaction domains include those of arfaptin and OPHN1-like proteins, among others, which bind to Rac and Rho GAP domains, respectively." Q#27282 - CGI_10023802 superfamily 243035 566 683 1.73E-33 125.04 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. 
Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. 
Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#27283 - CGI_10023803 superfamily 243555 23 212 3.19E-11 58.5566 cl03871 Chitin_bind_3 superfamily - - "Chitin binding domain; This domain is found associated with a wide variety of cellulose binding domain. This domain however is a chitin binding domain. This domain is found in isolation in baculoviral spheroidins and spindolins, protein of unknown function." Q#27284 - CGI_10023804 superfamily 243555 23 212 1.94E-14 67.8014 cl03871 Chitin_bind_3 superfamily - - "Chitin binding domain; This domain is found associated with a wide variety of cellulose binding domain. This domain however is a chitin binding domain. This domain is found in isolation in baculoviral spheroidins and spindolins, protein of unknown function." Q#27285 - CGI_10023805 superfamily 243033 79 198 1.24E-08 50.7797 cl02428 Ependymin superfamily - - Ependymin; Ependymin. Q#27287 - CGI_10023807 superfamily 247639 1 100 8.38E-24 92.5243 cl16914 O-FucT_like superfamily N - "GDP-fucose protein O-fucosyltransferase and related proteins; O-fucosyltransferase-like proteins are GDP-fucose dependent enzymes with similarities to the family 1 glycosyltransferases (GT1). They are soluble ER proteins that may be proteolytically cleaved from a membrane-associated preprotein, and are involved in the O-fucosylation of protein substrates, the core fucosylation of growth factor receptors, and other processes." Q#27288 - CGI_10023808 superfamily 243039 1645 1783 8.12E-88 284.439 cl02446 MATH superfamily - - "MATH (meprin and TRAF-C homology) domain; an independent folding unit with an eight-stranded beta-sandwich structure found in meprins, TRAFs and other proteins. 
Meprins comprise a class of extracellular metalloproteases which are anchored to the membrane and are capable of cleaving growth factors, extracellular matrix proteins, and biologically active peptides. TRAF molecules serve as adapter proteins that link cell surface receptors of the Tumor Necrosis Factor and Interleukin-1/Toll-like families to downstream kinase cascades, which results in the activation of transcription factors and the regulation of cell survival, proliferation and stress responses in the immune and inflammatory systems. Other members include the ubiquitin ligases, TRIM37 and SPOP, and the ubiquitin-specific proteases, HAUSP and Ubp21p. A large number of uncharacterized members mostly from lineage-specific expansions in C. elegans and rice contain MATH and BTB domains, similar to SPOP. The MATH domain has been shown to bind peptide/protein substrates in TRAFs and HAUSP. It is possible that the MATH domain in other members of this superfamily also interacts with various protein substrates. The TRAF domain may also be involved in the trimerization of TRAFs. Based on homology, it is postulated that the MATH domain in meprins may be involved in its tetramer assembly and that the MATH domain, in general, may take part in diverse modular arrangements defined by adjacent multimerization domains." Q#27288 - CGI_10023808 superfamily 243035 344 472 4.82E-09 56.4741 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. 
Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. 
Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#27288 - CGI_10023808 superfamily 241584 1121 1215 8.96E-07 49.0319 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#27288 - CGI_10023808 superfamily 241584 1335 1431 4.36E-06 47.1059 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#27288 - CGI_10023808 superfamily 241584 1440 1531 8.35E-06 46.3355 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. 
FN3-like domains are also found in bacterial glycosyl hydrolases." Q#27288 - CGI_10023808 superfamily 222608 4 124 2.42E-25 104.259 cl18680 DIOX_N superfamily - - non-haem dioxygenase in morphine synthesis N-terminal; This is the highly conserved N-terminal region of proteins with 2-oxoglutarate/Fe(II)-dependent dioxygenase activity. Q#27288 - CGI_10023808 superfamily 217403 175 278 4.31E-22 94.4117 cl18408 2OG-FeII_Oxy superfamily - - "2OG-Fe(II) oxygenase superfamily; This family contains members of the 2-oxoglutarate (2OG) and Fe(II)-dependent oxygenase superfamily. This family includes the C-terminal of prolyl 4-hydroxylase alpha subunit. The holoenzyme has the activity EC:1.14.11.2 catalyzing the reaction: Procollagen L-proline + 2-oxoglutarate + O2 <=> procollagen trans- 4-hydroxy-L-proline + succinate + CO2. The full enzyme consists of a alpha2 beta2 complex with the alpha subunit contributing most of the parts of the active site. The family also includes lysyl hydrolases, isopenicillin synthases and AlkB." Q#27288 - CGI_10023808 superfamily 243066 1807 1911 1.35E-21 93.0657 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. 
The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#27288 - CGI_10023808 superfamily 245814 933 1006 6.71E-17 78.6402 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#27288 - CGI_10023808 superfamily 245814 830 911 1.31E-15 74.944 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#27288 - CGI_10023808 superfamily 245814 1029 1101 2.73E-13 68.2896 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. 
A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#27288 - CGI_10023808 superfamily 245814 502 598 3.29E-13 68.241 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#27288 - CGI_10023808 superfamily 245814 733 802 4.54E-08 52.8646 cl11960 Ig superfamily C - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#27288 - CGI_10023808 superfamily 245814 607 695 6.67E-08 52.5576 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#27289 - CGI_10023809 superfamily 222438 220 283 1.28E-11 60.189 cl16459 zf-C3Hc3H superfamily - - Potential DNA-binding domain; This domain is likely to be the DNA-binding domain of chromatin re-modelling proteins and helicases. Q#27291 - CGI_10023811 superfamily 245225 39 493 0 613.096 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily - - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein coupled receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine- binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. 
The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#27291 - CGI_10023811 superfamily 215648 588 800 5.91E-71 234.412 cl02802 7tm_3 superfamily - - "7 transmembrane sweet-taste receptor of 3 GCPR; This is a domain of seven transmembrane regions that forms the C-terminus of some subclass 3 G-coupled-protein receptors. It is often associated with a downstream cysteine-rich linker domain, NCD3G pfam07562, which is the human sweet-taste receptor, and the N-terminal domain, ANF_receptor pfam01094. The seven TM regions assemble in such a way as to produce a docking pocket into which such molecules as cyclamate and lactisole have been found to bind and consequently confer the taste of sweetness." Q#27291 - CGI_10023811 superfamily 219467 506 557 1.19E-12 64.2767 cl08456 NCD3G superfamily - - "Nine Cysteines Domain of family 3 GPCR; This conserved sequence contains several highly-conserved Cys residues that are predicted to form disulphide bridges. It is predicted to lie outside the cell membrane, tethered to the pfam00003 in several receptor proteins." Q#27293 - CGI_10023813 superfamily 248009 32 472 1.09E-172 505.211 cl17455 UPF0027 superfamily - - Uncharacterized protein family UPF0027; Uncharacterized protein family UPF0027. Q#27293 - CGI_10023813 superfamily 248009 468 740 3.44E-107 335.724 cl17455 UPF0027 superfamily N - Uncharacterized protein family UPF0027; Uncharacterized protein family UPF0027. 
Q#27294 - CGI_10023814 superfamily 248009 392 639 3.97E-77 254.061 cl17455 UPF0027 superfamily N - Uncharacterized protein family UPF0027; Uncharacterized protein family UPF0027. Q#27294 - CGI_10023814 superfamily 220964 125 211 0.000580219 39.1213 cl12630 DUF2869 superfamily C - Protein of unknown function (DUF2869); This bacterial family of proteins has no known function. Q#27296 - CGI_10023816 superfamily 207794 321 769 0 654.361 cl02948 GH20_hexosaminidase superfamily - - "Beta-N-acetylhexosaminidases of glycosyl hydrolase family 20 (GH20) catalyze the removal of beta-1,4-linked N-acetyl-D-hexosamine residues from the non-reducing ends of N-acetyl-beta-D-hexosaminides including N-acetylglucosides and N-acetylgalactosides. These enzymes are broadly distributed in microorganisms, plants and animals, and play roles in various key physiological and pathological processes. These processes include cell structural integrity, energy storage, cellular signaling, fertilization, pathogen defense, viral penetration, the development of carcinomas, inflammatory events and lysosomal storage disorders. The GH20 enzymes include the eukaryotic beta-N-acetylhexosaminidases A and B, the bacterial chitobiases, dispersin B, and lacto-N-biosidase. The GH20 hexosaminidases are thought to act via a catalytic mechanism in which the catalytic nucleophile is not provided by the solvent or the enzyme, but by the substrate itself." Q#27296 - CGI_10023816 superfamily 243574 11 160 1.12E-32 124.75 cl03918 CHB_HEX superfamily - - Putative carbohydrate binding domain; This domain represents the N terminal domain in chitobiases and beta-hexosaminidases EC:3.2.1.52. It is composed of a beta sandwich structure that is similar in structure to the cellulose binding domain of cellulase from Cellulomonas fimi. This suggests that this may be a carbohydrate binding domain. 
Q#27297 - CGI_10023817 superfamily 247639 23 267 1.63E-37 134.896 cl16914 O-FucT_like superfamily - - "GDP-fucose protein O-fucosyltransferase and related proteins; O-fucosyltransferase-like proteins are GDP-fucose dependent enzymes with similarities to the family 1 glycosyltransferases (GT1). They are soluble ER proteins that may be proteolytically cleaved from a membrane-associated preprotein, and are involved in the O-fucosylation of protein substrates, the core fucosylation of growth factor receptors, and other processes." Q#27298 - CGI_10023818 superfamily 247639 77 327 5.36E-36 132.585 cl16914 O-FucT_like superfamily - - "GDP-fucose protein O-fucosyltransferase and related proteins; O-fucosyltransferase-like proteins are GDP-fucose dependent enzymes with similarities to the family 1 glycosyltransferases (GT1). They are soluble ER proteins that may be proteolytically cleaved from a membrane-associated preprotein, and are involved in the O-fucosylation of protein substrates, the core fucosylation of growth factor receptors, and other processes." Q#27299 - CGI_10023819 superfamily 247639 83 347 3.43E-28 110.629 cl16914 O-FucT_like superfamily - - "GDP-fucose protein O-fucosyltransferase and related proteins; O-fucosyltransferase-like proteins are GDP-fucose dependent enzymes with similarities to the family 1 glycosyltransferases (GT1). They are soluble ER proteins that may be proteolytically cleaved from a membrane-associated preprotein, and are involved in the O-fucosylation of protein substrates, the core fucosylation of growth factor receptors, and other processes." Q#27302 - CGI_10023822 superfamily 247683 324 375 4.07E-20 82.6926 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). 
SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#27302 - CGI_10023822 superfamily 245835 2 111 5.83E-44 154.333 cl12013 BAR superfamily N - "The Bin/Amphiphysin/Rvs (BAR) domain, a dimerization module that binds membranes and detects membrane curvature; BAR domains are dimerization, lipid binding and curvature sensing modules found in many different proteins with diverse functions including organelle biogenesis, membrane trafficking or remodeling, and cell division and migration. Mutations in BAR containing proteins have been linked to diseases and their inactivation in cells leads to altered membrane dynamics. A BAR domain with an additional N-terminal amphipathic helix (an N-BAR) can drive membrane curvature. These N-BAR domains are found in amphiphysins and endophilins, among others. BAR domains are also frequently found alongside domains that determine lipid specificity, such as the Pleckstrin Homology (PH) and Phox Homology (PX) domains which are present in beta centaurins (ACAPs and ASAPs) and sorting nexins, respectively. A FES-CIP4 Homology (FCH) domain together with a coiled coil region is called the F-BAR domain and is present in Pombe/Cdc15 homology (PCH) family proteins, which include Fes/Fes tyrosine kinases, PACSIN or syndapin, CIP4-like proteins, and srGAPs, among others. 
The Inverse (I)-BAR or IRSp53/MIM homology Domain (IMD) is found in multi-domain proteins, such as IRSp53 and MIM, that act as scaffolding proteins and transducers of a variety of signaling pathways that link membrane dynamics and the underlying actin cytoskeleton. BAR domains form dimers that bind to membranes, induce membrane bending and curvature, and may also be involved in protein-protein interactions. The I-BAR domain induces membrane protrusions in the opposite direction compared to classical BAR and F-BAR domains, which produce membrane invaginations. BAR domains that also serve as protein interaction domains include those of arfaptin and OPHN1-like proteins, among others, which bind to Rac and Rho GAP domains, respectively." Q#27302 - CGI_10023822 superfamily 241602 191 230 3.46E-08 50.3033 cl00087 HR1 superfamily N - "Protein kinase C-related kinase homology region 1 (HR1) domain that binds Rho family small GTPases; The HR1 domain, also called the ACC (anti-parallel coiled-coil) finger domain or Rho-binding domain binds small GTPases from the Rho family. It is found in Rho effector proteins including PKC-related kinases such as vertebrate PRK1 (or PKN) and yeast PKC1 protein kinases C, as well as in rhophilins and Rho-associated kinase (ROCK). Rho family members function as molecular switches, cycling between inactive and active forms, controlling a variety of cellular processes. HR1 domains may occur in repeat arrangements (PKN contains three HR1 domains), separated by a short linker region." Q#27303 - CGI_10023823 superfamily 241571 12 105 1.46E-15 70.519 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." 
Q#27304 - CGI_10023824 superfamily 241571 42 137 4.94E-08 51.2591 cl00049 CUB superfamily N - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#27304 - CGI_10023824 superfamily 242323 385 491 8.37E-19 82.5563 cl01132 FA_hydroxylase superfamily - - "Fatty acid hydroxylase superfamily; This superfamily includes fatty acid and carotene hydroxylases and sterol desaturases. Beta-carotene hydroxylase is involved in zeaxanthin synthesis by hydroxylating beta-carotene, but the enzyme may be involved in other pathways. This family includes C-5 sterol desaturase and C-4 sterol methyl oxidase. Members of this family are involved in cholesterol biosynthesis and the biosynthesis of plant cuticular wax. These enzymes contain two copies of a HXHH motif. Members of this family are integral membrane proteins." Q#27306 - CGI_10023826 superfamily 247856 489 549 3.55E-11 59.4837 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#27306 - CGI_10023826 superfamily 246925 189 344 3.53E-17 81.2477 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)."
Q#27310 - CGI_10023830 superfamily 243179 133 241 1.01E-18 79.8843 cl02781 tetraspanin_LEL superfamily - - "Tetraspanin, extracellular domain or large extracellular loop (LEL). Tetraspanins are trans-membrane proteins with 4 trans-membrane segments. Both the N- and C-termini lie on the intracellular side of the membrane. This alignment model spans the extracellular domain between the 3rd and 4th trans-membrane segment. The tetraspanin family contains CD9, CD63, CD37, CD53, CD82, CD151, and CD81, amongst others. Tetraspanins are involved in diverse processes such as cell activation and proliferation, adhesion and motility, differentiation, cancer, and others. Their various functions may relate to their ability to act as molecular facilitators, grouping specific cell-surface proteins and affecting formation and stability of signaling complexes. Tetraspanins associate laterally with one another and cluster dynamically with numerous partner domains in membrane microdomains, forming a network of multimolecular complexes, the "tetraspanin web", which may also include integrins." Q#27311 - CGI_10023831 superfamily 246679 27 170 3.69E-67 203.143 cl14632 Glo_EDI_BRP_like superfamily - - "This domain superfamily is found in a variety of structurally related metalloproteins, including the type I extradiol dioxygenases, glyoxalase I and a group of antibiotic resistance proteins; This domain superfamily is found in a variety of structurally related metalloproteins, including the type I extradiol dioxygenases, glyoxalase I and a group of antibiotic resistance proteins. A bound metal ion is required for protein activities for the members of this superfamily. A variety of metal ions have been found in the catalytic centers of these proteins including Fe(II), Mn(II), Zn(II), Ni(II) and Mg(II). Type I extradiol dioxygenases catalyze the incorporation of both atoms of molecular oxygen into aromatic substrates, which results in the cleavage of aromatic rings.
They are key enzymes in the degradation of aromatic compounds. Type I extradiol dioxygenases include class I and class II enzymes. Class I and II enzymes show sequence similarity; the two-domain class II enzymes evolved from a class I enzyme through gene duplication. Glyoxylase I catalyzes the glutathione-dependent inactivation of toxic methylglyoxal, requiring zinc or nickel ions for activity. The antibiotic resistance proteins in this family use a variety of mechanisms to block the function of antibiotics. Bleomycin resistance protein (BLMA) sequesters bleomycin's activity by directly binding to it. Whereas, three types of fosfomycin resistance proteins employ different mechanisms to render fosfomycin inactive by modifying the fosfomycin molecule. Although the proteins in this superfamily are functionally distinct, their structures are similar. The difference among the three dimensional structures of the three types of proteins in this superfamily is interesting from an evolutionary perspective. Both glyoxalase I and BLMA show domain swapping between subunits. However, there is no domain swapping for type 1 extradiol dioxygenases." Q#27312 - CGI_10023832 superfamily 247724 25 164 3.04E-60 189.204 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. 
Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#27313 - CGI_10023833 superfamily 215754 272 368 5.97E-10 56.1076 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#27314 - CGI_10023834 superfamily 199528 4 115 0.000146301 40.8496 cl15392 PRK10429 superfamily C - melibiose:sodium symporter; Provisional Q#27318 - CGI_10024101 superfamily 241645 237 302 0.00179972 35.579 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." 
Q#27319 - CGI_10024102 superfamily 247986 216 296 0.00189382 37.7378 cl17432 PBPb superfamily N - "Bacterial periplasmic transport systems use membrane-bound complexes and substrate-bound, membrane-associated, periplasmic binding proteins (PBPs) to transport a wide variety of substrates, such as, amino acids, peptides, sugars, vitamins and inorganic ions. PBPs have two cell-membrane translocation functions: bind substrate, and interact with the membrane bound complex. A diverse group of periplasmic transport receptors for lysine/arginine/ornithine (LAO), glutamine, histidine, sulfate, phosphate, molybdate, and methanol are included in the PBPb CD." Q#27321 - CGI_10024104 superfamily 247792 28 73 4.30E-13 66.3152 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#27322 - CGI_10024105 superfamily 243072 473 589 9.13E-33 124.803 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#27322 - CGI_10024105 superfamily 243072 602 717 1.85E-25 104.003 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#27322 - CGI_10024105 superfamily 243072 669 773 2.50E-12 65.4826 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#27322 - CGI_10024105 superfamily 247792 802 844 0.00086172 38.5808 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#27322 - CGI_10024105 superfamily 115363 15 83 2.86E-29 112.85 cl05972 MIB_HERC2 superfamily - - Mib_herc2; Named "mib/herc2 domain" in. 
Usually the protein also contains an E3 ligase domain (either Ring or Hect). Q#27322 - CGI_10024105 superfamily 115363 166 233 1.41E-21 90.8941 cl05972 MIB_HERC2 superfamily - - Mib_herc2; Named "mib/herc2 domain" in. Usually the protein also contains an E3 ligase domain (either Ring or Hect). Q#27322 - CGI_10024105 superfamily 241760 92 136 4.13E-19 83.2767 cl00295 ZZ superfamily - - "Zinc finger, ZZ type. Zinc finger present in dystrophin, CBP/p300 and many other proteins. The ZZ motif coordinates one or two zinc ions and most likely participates in ligand binding or molecular scaffolding. Many proteins containing ZZ motifs have other zinc-binding motifs as well, and the majority serve as scaffolds in pathways involving acetyltransferase, protein kinase, or ubiquitin-related activity. ZZ proteins can be grouped into the following functional classes: chromatin modifying, cytoskeletal scaffolding, ubiquitin binding or conjugating, and membrane receptor or ion-channel modifying proteins." Q#27322 - CGI_10024105 superfamily 247792 866 895 0.000139667 41.2113 cl17238 RING superfamily C - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#27323 - CGI_10024106 superfamily 243038 421 504 2.94E-52 177.15 cl02442 DEP superfamily - - "DEP domain, named after Dishevelled, Egl-10, and Pleckstrin, where this domain was first discovered.
The function of this domain is still not clear, but it is believed to be important for the membrane association of the signaling proteins in which it is present. New studies show that the DEP domain of Sst2, a yeast RGS protein is necessary and sufficient for receptor interaction." Q#27323 - CGI_10024106 superfamily 241622 252 339 1.30E-21 90.7038 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archaebacteria, and eukaryotes. This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95 (post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#27323 - CGI_10024106 superfamily 198670 1 82 7.36E-41 145.632 cl02426 DIX superfamily - - DIX domain; The DIX domain is present in Dishevelled and axin. This domain is involved in homo- and hetero-oligomerisation. It is involved in the homo- oligomerisation of mouse axin. The axin DIX domain also interacts with the dishevelled DIX domain. The DIX domain has also been called the DAX domain.
Q#27323 - CGI_10024106 superfamily 221526 508 572 0.000259091 41.8735 cl13715 Dsh_C superfamily C - "Segment polarity protein dishevelled (Dsh) C terminal; This domain family is found in eukaryotes, and is typically between 177 and 207 amino acids in length. The family is found in association with pfam00778, pfam02377, pfam00610, pfam00595. The segment polarity gene dishevelled (dsh) is required for pattern formation of the embryonic segments. It is involved in the determination of body organisation through the Wingless pathway (analogous to the Wnt-1 pathway)." Q#27323 - CGI_10024106 superfamily 221526 833 857 0.00864878 37.2511 cl13715 Dsh_C superfamily N - "Segment polarity protein dishevelled (Dsh) C terminal; This domain family is found in eukaryotes, and is typically between 177 and 207 amino acids in length. The family is found in association with pfam00778, pfam02377, pfam00610, pfam00595. The segment polarity gene dishevelled (dsh) is required for pattern formation of the embryonic segments. It is involved in the determination of body organisation through the Wingless pathway (analogous to the Wnt-1 pathway)." 
Q#27324 - CGI_10024107 superfamily 247792 352 390 0.00897187 33.9584 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#27324 - CGI_10024107 superfamily 248318 42 88 0.00214041 35.8338 cl17764 FYVE superfamily - - "FYVE domain; Zinc-binding domain; targets proteins to membrane lipids via interaction with phosphatidylinositol-3-phosphate, PI3P; present in Fab1, YOTB, Vac1, and EEA1;" Q#27325 - CGI_10024108 superfamily 241548 51 443 0 609.547 cl00013 Lyase_I_like superfamily - - "Lyase class I_like superfamily: contains the lyase class I family, histidine ammonia-lyase and phenylalanine ammonia-lyase, which catalyze similar beta-elimination reactions; Lyase class I_like superfamily of enzymes that catalyze beta-elimination reactions and are active as homotetramers. The four active sites of the homotetrameric enzyme are each formed by residues from three different subunits. This superfamily contains the lyase class I family, histidine ammonia-lyase and phenylalanine ammonia-lyase. The lyase class I family comprises proteins similar to class II fumarase, aspartase, adenylosuccinate lyase, argininosuccinate lyase, and 3-carboxy-cis, cis-muconate lactonizing enzyme which, for the most part catalyze similar beta-elimination reactions in which a C-N or C-O bond is cleaved with the release of fumarate as one of the products. 
Histidine or phenylalanine ammonia-lyase catalyze a beta-elimination of ammonia from histidine and phenylalanine, respectively." Q#27328 - CGI_10024111 superfamily 241548 17 452 0 839.283 cl00013 Lyase_I_like superfamily - - "Lyase class I_like superfamily: contains the lyase class I family, histidine ammonia-lyase and phenylalanine ammonia-lyase, which catalyze similar beta-elimination reactions; Lyase class I_like superfamily of enzymes that catalyze beta-elimination reactions and are active as homotetramers. The four active sites of the homotetrameric enzyme are each formed by residues from three different subunits. This superfamily contains the lyase class I family, histidine ammonia-lyase and phenylalanine ammonia-lyase. The lyase class I family comprises proteins similar to class II fumarase, aspartase, adenylosuccinate lyase, argininosuccinate lyase, and 3-carboxy-cis, cis-muconate lactonizing enzyme which, for the most part catalyze similar beta-elimination reactions in which a C-N or C-O bond is cleaved with the release of fumarate as one of the products. Histidine or phenylalanine ammonia-lyase catalyze a beta-elimination of ammonia from histidine and phenylalanine, respectively." Q#27328 - CGI_10024111 superfamily 241983 497 754 1.27E-38 145.193 cl00614 ADP_ribosyl_GH superfamily - - "ADP-ribosylglycohydrolase; This family includes enzymes that ADP-ribosylations, for example ADP-ribosylarginine hydrolase EC:3.2.2.19 cleaves ADP-ribose-L-arginine. The family also includes dinitrogenase reductase activating glycohydrolase. Most surprisingly the family also includes jellyfish crystallins, these proteins appear to have lost the presumed active site residues." Q#27329 - CGI_10024112 superfamily 245230 3 455 0 868.497 cl10017 Tubulin_FtsZ superfamily - - "Tubulin/FtsZ: Family includes tubulin alpha-, beta-, gamma-, delta-, and epsilon-tubulins as well as FtsZ, all of which are involved in polymer formation. 
Tubulin is the major component of microtubules, but also exists as a heterodimer and as a curved oligomer. Microtubules exist in all eukaryotic cells and are responsible for many functions, including cellular transport, cell motility, and mitosis. FtsZ forms a ring-shaped septum at the site of bacterial cell division, which is required for constriction of cell membrane and cell envelope to yield two daughter cells. FtsZ can polymerize into tubes, sheets, and rings in vitro and is ubiquitous in eubacteria, archaea, and chloroplasts." Q#27330 - CGI_10024113 superfamily 245201 48 299 1.71E-67 217.773 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#27331 - CGI_10024114 superfamily 247744 338 517 6.06E-54 181.949 cl17190 NK superfamily - - "Nucleoside/nucleotide kinase (NK) is a protein superfamily consisting of multiple families of enzymes that share structural similarity and are functionally related to the catalysis of the reversible phosphate group transfer from nucleoside triphosphates to nucleosides/nucleotides, nucleoside monophosphates, or sugars. Members of this family play a wide variety of essential roles in nucleotide metabolism, the biosynthesis of coenzymes and aromatic compounds, as well as the metabolism of sugar and sulfate." 
Q#27331 - CGI_10024114 superfamily 241550 171 317 1.41E-49 168.998 cl00015 nt_trans superfamily - - "nucleotidyl transferase superfamily; nt_trans (nucleotidyl transferase) This superfamily includes the class I amino-acyl tRNA synthetases, pantothenate synthetase (PanC), ATP sulfurylase, and the cytidylyltransferases, all of which have a conserved dinucleotide-binding domain." Q#27333 - CGI_10024116 superfamily 203212 351 494 2.59E-32 125.204 cl04998 NDT80_PhoG superfamily - - "NDT80 / PhoG like DNA-binding family; This family includes the DNA-binding region of NDT80 as well as PhoG and its homologues. The family contains VIB-1. VIB-1 is thought to be a regulator of conidiation in Neurospora crassa and shares a region of similarity to PHOG, a possible phosphate nonrepressible acid phosphatase in Aspergillus nidulans. It has been found that vib-1 is not the structural gene for nonrepressible acid phosphatase, but rather may regulate nonrepressible acid phosphatase activity." Q#27333 - CGI_10024116 superfamily 206058 663 698 7.69E-15 70.9042 cl16455 MRF_C1 superfamily - - "Myelin gene regulatory factor -C-terminal domain 1; This domain is found just downstream of Peptidase_S74, pfam13884. The function is not known." Q#27333 - CGI_10024116 superfamily 222434 599 643 1.27E-05 44.1062 cl16452 Peptidase_S74 superfamily N - "Chaperone of endosialidase; This is the very C-terminal, chaperone, domain of the bacteriophage protein endosialidase. It releases itself, via the serine-lysine dyad at the N-terminus, from the remainder of the end-tail-spike. Cleavage occurs after the threonine which is the final residue of the End-tail-spike family, pfam12219. The endosialidase protein forms homotrimeric molecules in bacteriophages. The catalytic dyad allows this portion of the molecule to be cleaved from the more N-terminal region such that the latter can fold and presumably bind to DNA." 
Q#27335 - CGI_10024118 superfamily 245206 4 102 2.92E-13 70.188 cl09931 NADB_Rossmann superfamily N - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#27335 - CGI_10024118 superfamily 248053 257 336 0.000127232 41.8837 cl17499 Peptidase_M14NE-CP-C_like superfamily - - "Peptidase associated domain: C-terminal domain of M14 N/E carboxypeptidase; putative folding, regulation, or interaction domain; This domain is found C-terminal to the M14 carboxypeptidase (CP) N/E subfamily containing zinc-binding enzymes that hydrolyze single C-terminal amino acids from polypeptide chains, and have a recognition site for the free C-terminal carboxyl group, which is a key determinant of specificity. The N/E subfamily includes enzymatically active members (carboxypeptidase N, E, M, D, and Z), as well as non-active members (carboxypeptidase-like protein 1, -2, aortic CP-like protein, and adipocyte enhancer binding protein-1) which lack the critical active site and substrate-binding residues considered necessary for activity. 
The active N/E enzymes fulfill a variety of cellular functions, including prohormone processing, regulation of peptide hormone activity, alteration of protein-protein or protein-cell interactions and transcriptional regulation. For M14 CPs, it has been suggested that this domain may assist in folding of the CP domain, regulate enzyme activity, or be involved in interactions with other proteins or with membranes; for carboxypeptidase M, it may interact with the bradykinin 1 receptor at the cell surface. This domain may also be found in other peptidase families." Q#27336 - CGI_10024119 superfamily 245206 29 118 3.05E-23 92.6726 cl09931 NADB_Rossmann superfamily C - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#27337 - CGI_10024120 superfamily 246925 136 410 6.36E-10 60.8322 cl15309 LRR_RI superfamily - - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. 
LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#27337 - CGI_10024120 superfamily 220695 662 857 0.000181181 43.3363 cl18571 7TM_GPCR_Srx superfamily N - Serpentine type 7TM GPCR chemoreceptor Srx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srx is part of the Srg superfamily of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#27337 - CGI_10024120 superfamily 243030 25 49 0.00110261 38.4527 cl02423 LRRNT superfamily C - Leucine rich repeat N-terminal domain; Leucine Rich Repeats pfam00560 are short sequence motifs present in a number of proteins with diverse functions and cellular locations. Leucine Rich Repeats are often flanked by cysteine rich domains. This domain is often found at the N-terminus of tandem leucine rich repeats. Q#27338 - CGI_10024121 superfamily 243084 746 853 3.52E-37 136.267 cl02556 Bromodomain superfamily - - Bromodomain. Bromodomains are found in many chromatin-associated proteins and in nuclear histone acetyltransferases. They interact specifically with acetylated lysine. Q#27338 - CGI_10024121 superfamily 247999 675 719 9.03E-12 61.7376 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. 
Q#27338 - CGI_10024121 superfamily 241563 120 154 0.00200419 37.3167 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#27339 - CGI_10024122 superfamily 243035 231 336 5.34E-19 81.5121 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. 
C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#27339 - CGI_10024122 superfamily 152683 128 198 4.73E-09 53.0605 cl13656 Methyltransf_FA superfamily C - "Farnesoic acid 0-methyl transferase; This domain family is found in bacteria and eukaryotes, and is approximately 110 amino acids in length.Farnesoic acid O-methyl transferase (FAMeT) is the enzyme that catalyzes the formation of methyl farnesoate (MF) from farnesoic acid (FA) in the biosynthetic pathway of juvenile hormone (JH)." Q#27339 - CGI_10024122 superfamily 243035 30 89 3.71E-07 47.5946 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. 
Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. 
A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#27340 - CGI_10024123 superfamily 243035 252 326 6.98E-17 74.9637 cl02432 CLECT superfamily N - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. 
C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#27340 - CGI_10024123 superfamily 152683 144 247 1.39E-14 68.4685 cl13656 Methyltransf_FA superfamily - - "Farnesoic acid 0-methyl transferase; This domain family is found in bacteria and eukaryotes, and is approximately 110 amino acids in length.Farnesoic acid O-methyl transferase (FAMeT) is the enzyme that catalyzes the formation of methyl farnesoate (MF) from farnesoic acid (FA) in the biosynthetic pathway of juvenile hormone (JH)." Q#27340 - CGI_10024123 superfamily 243035 47 107 1.84E-10 57.2246 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. 
Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. 
A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#27342 - CGI_10024125 superfamily 243072 819 924 5.48E-29 114.403 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#27343 - CGI_10024126 superfamily 241613 499 533 3.98E-06 44.505 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#27343 - CGI_10024126 superfamily 246918 435 488 6.50E-15 69.9231 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. 
Q#27343 - CGI_10024126 superfamily 246918 369 423 8.28E-11 58.3671 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#27344 - CGI_10024127 superfamily 241613 692 726 7.33E-05 41.0382 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#27344 - CGI_10024127 superfamily 246918 629 681 1.03E-11 61.4487 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#27344 - CGI_10024127 superfamily 246918 203 247 7.91E-06 44.1147 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#27344 - CGI_10024127 superfamily 246918 573 623 0.00148212 37.5663 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#27345 - CGI_10024128 superfamily 221752 37 64 1.04E-08 47.7189 cl15069 SUZ superfamily N - SUZ domain; The SUZ domain is a conserved RNA-binding domain found in eukaryotes and enriched in positively charged amino acids. It was first characterized in the C.elegans protein Szy-20 where it has been shown to bind RNA and allow their localization to the centrosome. Warning- the domain has a compositionally biased character. 
Q#27348 - CGI_10024131 superfamily 188340 4 294 1.21E-48 175.459 cl18158 selen_PSTK_euk superfamily - - "L-seryl-tRNA(Sec) kinase, eukaryotic; Members of this protein are L-seryl-tRNA(Sec) kinase. This enzyme is part of a two-step pathway in Eukaryota and Archaea for performing selenocysteine biosynthesis by changing serine misacylated on selenocysteine-tRNA to selenocysteine. This enzyme performs the first step, phosphorylation of the OH group of the serine side chain. This family represents eukaryotic proteins with this activity." Q#27348 - CGI_10024131 superfamily 245201 283 335 0.000487292 41.368 cl09925 PKc_like superfamily N - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#27349 - CGI_10024132 superfamily 217575 149 281 2.91E-46 157.821 cl04090 eRF1_2 superfamily - - "eRF1 domain 2; The release factor eRF1 terminates protein biosynthesis by recognising stop codons at the A site of the ribosome and stimulating peptidyl-tRNA bond hydrolysis at the peptidyl transferase centre. The crystal structure of human eRF1 is known. The overall shape and dimensions of eRF1 resemble a tRNA molecule with domains 1, 2, and 3 of eRF1 corresponding to the anticodon loop, aminoacyl acceptor stem, and T stem of a tRNA molecule, respectively. The position of the essential GGQ motif at an exposed tip of domain 2 suggests that the Gln residue coordinates a water molecule to mediate the hydrolytic activity at the peptidyl transferase centre. A conserved groove on domain 1, 80 A from the GGQ motif, is proposed to form the codon recognition site. 
This family also includes other proteins for which the precise molecular function is unknown. Many of them are from Archaebacteria. These proteins may also be involved in translation termination but this awaits experimental verification." Q#27349 - CGI_10024132 superfamily 217574 13 145 6.17E-43 148.91 cl04089 eRF1_1 superfamily - - "eRF1 domain 1; The release factor eRF1 terminates protein biosynthesis by recognising stop codons at the A site of the ribosome and stimulating peptidyl-tRNA bond hydrolysis at the peptidyl transferase centre. The crystal structure of human eRF1 is known. The overall shape and dimensions of eRF1 resemble a tRNA molecule with domains 1, 2, and 3 of eRF1 corresponding to the anticodon loop, aminoacyl acceptor stem, and T stem of a tRNA molecule, respectively. The position of the essential GGQ motif at an exposed tip of domain 2 suggests that the Gln residue coordinates a water molecule to mediate the hydrolytic activity at the peptidyl transferase centre. A conserved groove on domain 1, 80 A from the GGQ motif, is proposed to form the codon recognition site. This family also includes other proteins for which the precise molecular function is unknown. Many of them are from Archaebacteria. These proteins may also be involved in translation termination but this awaits experimental verification." Q#27349 - CGI_10024132 superfamily 146221 284 421 1.06E-34 124.972 cl04091 eRF1_3 superfamily - - "eRF1 domain 3; The release factor eRF1 terminates protein biosynthesis by recognising stop codons at the A site of the ribosome and stimulating peptidyl-tRNA bond hydrolysis at the peptidyl transferase centre. The crystal structure of human eRF1 is known. The overall shape and dimensions of eRF1 resemble a tRNA molecule with domains 1, 2, and 3 of eRF1 corresponding to the anticodon loop, aminoacyl acceptor stem, and T stem of a tRNA molecule, respectively. 
The position of the essential GGQ motif at an exposed tip of domain 2 suggests that the Gln residue coordinates a water molecule to mediate the hydrolytic activity at the peptidyl transferase centre. A conserved groove on domain 1, 80 A from the GGQ motif, is proposed to form the codon recognition site. This family also includes other proteins for which the precise molecular function is unknown. Many of them are from Archaebacteria. These proteins may also be involved in translation termination but this awaits experimental verification." Q#27350 - CGI_10027125 superfamily 242571 1 310 5.97E-77 250.906 cl01544 Bestrophin superfamily - - "Bestrophin, RFP-TM, chloride channel; Bestrophin is a 68-kDa basolateral plasma membrane protein expressed in retinal pigment epithelial cells (RPE). It is encoded by the VMD2 gene, which is mutated in Best macular dystrophy, a disease characterized by a depressed light peak in the electrooculogram. VMD2 encodes a 585-amino acid protein with an approximate mass of 68 kDa which has been designated bestrophin. Bestrophin shares homology with the Caenorhabditis elegans RFP gene family, named for the presence of a conserved arginine (R), phenylalanine (F), proline (P), amino acid sequence motif. Bestrophin is a plasma membrane protein, localised to the basolateral surface of RPE cells consistent with a role for bestrophin in the generation or regulation of the EOG light peak. Bestrophin and other RFP family members represent a new class of chloride channels, indicating a direct role for bestrophin in generating the light peak. The VMD2 gene underlying Best disease was shown to represent the first human member of the RFP-TM protein family. More than 97% of the disease-causing mutations are located in the N-terminal RFP-TM domain implying important functional properties. 
The bestrophins are four-pass transmembrane chloride-channel proteins, and the RFP-TM or bestrophin domain extends from the N-terminus through approximately 350 amino acids and contains all of the TM domains as well as nearly all reported disease causing mutations. Interestingly, the RFP motif is not conserved evolutionarily back beyond Metazoa, neither is it in plant members." Q#27351 - CGI_10027126 superfamily 247757 8 85 6.61E-23 87.9843 cl17203 Fer4_NifH superfamily N - "The Fer4_NifH superfamily contains a variety of proteins which share a common ATP-binding domain. Functionally, proteins in this superfamily use the energy from hydrolysis of NTP to transfer electron or ion." Q#27352 - CGI_10027127 superfamily 217884 26 258 8.39E-55 187.652 cl04392 SRP-alpha_N superfamily - - "Signal recognition particle, alpha subunit, N-terminal; SRP is a complex of six distinct polypeptides and a 7S RNA that is essential for transferring nascent polypeptide chains that are destined for export from the cell to the translocation apparatus of the endoplasmic reticulum (ER) membrane. SRP binds hydrophobic signal sequences as they emerge from the ribosome, and arrests translation." Q#27352 - CGI_10027127 superfamily 247757 425 543 2.12E-43 153.083 cl17203 Fer4_NifH superfamily C - "The Fer4_NifH superfamily contains a variety of proteins which share a common ATP-binding domain. Functionally, proteins in this superfamily use the energy from hydrolysis of NTP to transfer electron or ion." Q#27352 - CGI_10027127 superfamily 243520 330 403 1.08E-07 49.472 cl03758 SRP54_N superfamily - - "SRP54-type protein, helical bundle domain; SRP54-type protein, helical bundle domain. " Q#27353 - CGI_10027128 superfamily 241596 195 256 7.77E-14 64.9279 cl00081 HLH superfamily - - "Helix-loop-helix domain, found in specific DNA- binding proteins that act as transcription factors; 60-100 amino acids long. 
A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE (5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." 
Q#27354 - CGI_10027129 superfamily 247684 40 429 0 546.393 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#27356 - CGI_10027131 superfamily 243040 32 146 2.87E-75 236.924 cl02447 CRD_FZ superfamily - - "CRD_domain cysteine-rich domain, also known as Fz (frizzled) domain; CRD_FZ is an essential component of a number of cell surface receptors, which are involved in multiple signal transduction pathways, particularly in modulating the activity of the Wnt proteins, which play a fundamental role in the early development of metazoans. 
CRD is also found in secreted frizzled related proteins (SFRPs), which lack the transmembrane segment found in the frizzled protein. The CRD domain is also present in the alpha-1 chain of mouse type XVIII collagen, in carboxypeptidase Z, several receptor tyrosine kinases, and the mosaic transmembrane serine protease corin. The CRD domain is well conserved in metazoans - 10 frizzled proteins have been identified in mammals, 4 in Drosophila and 3 in Caenorhabditis elegans. CRD domains have also been identified in multiple tandem copies in a Dictyostelium discoideum protein. Very little is known about the mechanism by which CRD domains interact with their ligands. The domain contains 10 conserved cysteines." Q#27357 - CGI_10027132 superfamily 247724 585 790 2.97E-89 280.977 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. 
Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#27357 - CGI_10027132 superfamily 245820 1 148 5.40E-37 143.556 cl11970 PriL superfamily N - "Archaeal/eukaryotic core primase: Large subunit, PriL; Primases synthesize the RNA primers required for DNA replication. Primases are grouped into two classes, bacteria/bacteriophage and archaeal/eukaryotic. The proteins in the two classes differ in structure and the replication apparatus components. The DNA replication machinery of archaeal organisms contains only the core primase, a simpler arrangement compared to eukaryotes. Archaeal/eukaryotic core primase is a heterodimeric enzyme consisting of a small catalytic subunit (PriS) and a large subunit (PriL). Although the catalytic activity resides within PriS, the PriL subunit is essential for primase function as disruption of the PriL gene in yeast is lethal. PriL is composed of two structural domains. Several functions have been proposed for PriL, such as the stabilization of PriS, involvement in the initiation of synthesis, the improvement of primase processivity, and the determination of product size." Q#27357 - CGI_10027132 superfamily 191640 347 432 9.06E-28 108.512 cl06121 DUF1279 superfamily - - Protein of unknown function (DUF1279); This family represents the C-terminus (approx. 120 residues) of a number of eukaryotic proteins of unknown function. Q#27358 - CGI_10027133 superfamily 241583 249 362 3.07E-23 96.1535 cl00064 ZnMc superfamily N - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." 
Q#27359 - CGI_10027134 superfamily 248054 54 272 2.66E-14 70.7943 cl17500 NAD_binding_8 superfamily - - NAD(P)-binding Rossmann-like domain; NAD(P)-binding Rossmann-like domain. Q#27360 - CGI_10027135 superfamily 248458 219 603 3.44E-22 97.3844 cl17904 MFS superfamily - - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." 
Q#27361 - CGI_10027136 superfamily 247723 333 403 6.90E-46 154.303 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#27363 - CGI_10027138 superfamily 247723 17 30 0.000293904 33.7353 cl17169 RRM_SF superfamily N - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. 
The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#27365 - CGI_10027140 superfamily 247684 4 383 8.84E-80 257.591 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#27366 - CGI_10027141 superfamily 214781 109 145 2.70E-12 63.9004 cl02747 NRF superfamily C - N-terminal domain in C. elegans NRF-6 (Nose Resistant to Fluoxetine-4) and NDG-4 (resistant to nordihydroguaiaretic acid-4); Also present in several other worm and fly proteins. 
Q#27367 - CGI_10027142 superfamily 214781 110 146 9.03E-13 65.4412 cl02747 NRF superfamily C - N-terminal domain in C. elegans NRF-6 (Nose Resistant to Fluoxetine-4) and NDG-4 (resistant to nordihydroguaiaretic acid-4); Also present in several other worm and fly proteins. Q#27367 - CGI_10027142 superfamily 225813 416 582 0.00209966 39.2813 cl18725 COG3274 superfamily N - Predicted O-acyltransferase [General function prediction only] Q#27369 - CGI_10027144 superfamily 241584 209 292 5.00E-07 48.2615 cl00065 FN3 superfamily - - "Fibronectin type 3 domain; One of three types of internal repeats found in the plasma protein fibronectin. Its tenth fibronectin type III repeat contains an RGD cell recognition sequence in a flexible loop between 2 strands. Approximately 2% of all animal proteins contain the FN3 repeat; including extracellular and intracellular proteins, membrane spanning cytokine receptors, growth hormone receptors, tyrosine phosphatase receptors, and adhesion molecules. FN3-like domains are also found in bacterial glycosyl hydrolases." Q#27369 - CGI_10027144 superfamily 245814 102 210 8.86E-14 68.4181 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#27369 - CGI_10027144 superfamily 245814 17 102 3.71E-07 48.6557 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. 
The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#27370 - CGI_10027145 superfamily 245814 1 68 8.80E-09 50.7951 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#27370 - CGI_10027145 superfamily 245814 175 249 2.35E-06 44.0333 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#27370 - CGI_10027145 superfamily 245814 94 161 3.71E-06 43.5687 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#27372 - CGI_10027147 superfamily 192535 140 165 0.00452151 35.6494 cl18179 7TM_GPCR_Srsx superfamily C - Serpentine type 7TM GPCR chemoreceptor Srsx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srsx is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#27373 - CGI_10027148 superfamily 192535 141 219 0.00324442 36.4198 cl18179 7TM_GPCR_Srsx superfamily N - Serpentine type 7TM GPCR chemoreceptor Srsx; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srsx is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. Q#27376 - CGI_10027151 superfamily 245814 197 278 6.10E-07 48.0987 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. 
The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#27376 - CGI_10027151 superfamily 245814 584 669 1.09E-06 47.3283 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#27376 - CGI_10027151 superfamily 245814 698 765 3.17E-05 42.7792 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#27376 - CGI_10027151 superfamily 245814 313 380 0.00038079 39.3124 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#27377 - CGI_10027152 superfamily 245814 447 510 1.14E-05 45.1727 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#27377 - CGI_10027152 superfamily 242406 1234 1371 8.55E-22 94.1953 cl01271 DUF1768 superfamily - - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. Q#27377 - CGI_10027152 superfamily 245814 782 869 3.55E-07 49.6395 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. 
The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#27377 - CGI_10027152 superfamily 245814 331 409 2.84E-06 46.9431 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#27377 - CGI_10027152 superfamily 245814 897 963 0.000275331 40.972 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." 
Q#27378 - CGI_10027153 superfamily 242406 3 122 6.00E-18 76.4761 cl01271 DUF1768 superfamily - - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. Q#27380 - CGI_10027155 superfamily 222450 75 98 0.00762069 32.2167 cl16469 zf-H2C2_5 superfamily - - C2H2-type zinc-finger domain; C2H2-type zinc-finger domain. Q#27382 - CGI_10027157 superfamily 247799 569 628 9.61E-20 85.6412 cl17245 KH-I superfamily - - "K homology RNA-binding domain, type I. KH binds single-stranded RNA or DNA. It is found in a wide variety of proteins including ribosomal proteins, transcription factors and post-transcriptional modifiers of mRNA. There are two different KH domains that belong to different protein folds, but they share a single KH motif. The KH motif is folded into a beta alpha alpha beta unit. In addition to the core, type II KH domains (e.g. ribosomal protein S3) include N-terminal extension and type I KH domains (e.g. hnRNP K) contain C-terminal extension." Q#27382 - CGI_10027157 superfamily 247799 639 701 1.14E-17 79.478 cl17245 KH-I superfamily - - "K homology RNA-binding domain, type I. KH binds single-stranded RNA or DNA. It is found in a wide variety of proteins including ribosomal proteins, transcription factors and post-transcriptional modifiers of mRNA. There are two different KH domains that belong to different protein folds, but they share a single KH motif. The KH motif is folded into a beta alpha alpha beta unit. In addition to the core, type II KH domains (e.g. ribosomal protein S3) include N-terminal extension and type I KH domains (e.g. hnRNP K) contain C-terminal extension." Q#27382 - CGI_10027157 superfamily 247799 713 775 7.08E-16 74.4705 cl17245 KH-I superfamily - - "K homology RNA-binding domain, type I. KH binds single-stranded RNA or DNA. 
It is found in a wide variety of proteins including ribosomal proteins, transcription factors and post-transcriptional modifiers of mRNA. There are two different KH domains that belong to different protein folds, but they share a single KH motif. The KH motif is folded into a beta alpha alpha beta unit. In addition to the core, type II KH domains (e.g. ribosomal protein S3) include N-terminal extension and type I KH domains (e.g. hnRNP K) contain C-terminal extension." Q#27382 - CGI_10027157 superfamily 247799 1030 1092 8.30E-16 74.0853 cl17245 KH-I superfamily - - "K homology RNA-binding domain, type I. KH binds single-stranded RNA or DNA. It is found in a wide variety of proteins including ribosomal proteins, transcription factors and post-transcriptional modifiers of mRNA. There are two different KH domains that belong to different protein folds, but they share a single KH motif. The KH motif is folded into a beta alpha alpha beta unit. In addition to the core, type II KH domains (e.g. ribosomal protein S3) include N-terminal extension and type I KH domains (e.g. hnRNP K) contain C-terminal extension." Q#27382 - CGI_10027157 superfamily 247799 423 482 1.19E-15 73.7001 cl17245 KH-I superfamily - - "K homology RNA-binding domain, type I. KH binds single-stranded RNA or DNA. It is found in a wide variety of proteins including ribosomal proteins, transcription factors and post-transcriptional modifiers of mRNA. There are two different KH domains that belong to different protein folds, but they share a single KH motif. The KH motif is folded into a beta alpha alpha beta unit. In addition to the core, type II KH domains (e.g. ribosomal protein S3) include N-terminal extension and type I KH domains (e.g. hnRNP K) contain C-terminal extension." Q#27382 - CGI_10027157 superfamily 247799 493 555 2.18E-15 72.9297 cl17245 KH-I superfamily - - "K homology RNA-binding domain, type I. KH binds single-stranded RNA or DNA. 
It is found in a wide variety of proteins including ribosomal proteins, transcription factors and post-transcriptional modifiers of mRNA. There are two different KH domains that belong to different protein folds, but they share a single KH motif. The KH motif is folded into a beta alpha alpha beta unit. In addition to the core, type II KH domains (e.g. ribosomal protein S3) include N-terminal extension and type I KH domains (e.g. hnRNP K) contain C-terminal extension." Q#27382 - CGI_10027157 superfamily 247799 1103 1165 2.51E-13 67.1517 cl17245 KH-I superfamily - - "K homology RNA-binding domain, type I. KH binds single-stranded RNA or DNA. It is found in a wide variety of proteins including ribosomal proteins, transcription factors and post-transcriptional modifiers of mRNA. There are two different KH domains that belong to different protein folds, but they share a single KH motif. The KH motif is folded into a beta alpha alpha beta unit. In addition to the core, type II KH domains (e.g. ribosomal protein S3) include N-terminal extension and type I KH domains (e.g. hnRNP K) contain C-terminal extension." Q#27382 - CGI_10027157 superfamily 247799 950 1007 7.27E-13 65.6109 cl17245 KH-I superfamily - - "K homology RNA-binding domain, type I. KH binds single-stranded RNA or DNA. It is found in a wide variety of proteins including ribosomal proteins, transcription factors and post-transcriptional modifiers of mRNA. There are two different KH domains that belong to different protein folds, but they share a single KH motif. The KH motif is folded into a beta alpha alpha beta unit. In addition to the core, type II KH domains (e.g. ribosomal protein S3) include N-terminal extension and type I KH domains (e.g. hnRNP K) contain C-terminal extension." Q#27382 - CGI_10027157 superfamily 247799 786 848 3.81E-12 63.6849 cl17245 KH-I superfamily - - "K homology RNA-binding domain, type I. KH binds single-stranded RNA or DNA. 
It is found in a wide variety of proteins including ribosomal proteins, transcription factors and post-transcriptional modifiers of mRNA. There are two different KH domains that belong to different protein folds, but they share a single KH motif. The KH motif is folded into a beta alpha alpha beta unit. In addition to the core, type II KH domains (e.g. ribosomal protein S3) include N-terminal extension and type I KH domains (e.g. hnRNP K) contain C-terminal extension." Q#27382 - CGI_10027157 superfamily 247799 352 409 5.30E-12 63.2997 cl17245 KH-I superfamily - - "K homology RNA-binding domain, type I. KH binds single-stranded RNA or DNA. It is found in a wide variety of proteins including ribosomal proteins, transcription factors and post-transcriptional modifiers of mRNA. There are two different KH domains that belong to different protein folds, but they share a single KH motif. The KH motif is folded into a beta alpha alpha beta unit. In addition to the core, type II KH domains (e.g. ribosomal protein S3) include N-terminal extension and type I KH domains (e.g. hnRNP K) contain C-terminal extension." Q#27382 - CGI_10027157 superfamily 247799 210 270 1.42E-09 55.9809 cl17245 KH-I superfamily - - "K homology RNA-binding domain, type I. KH binds single-stranded RNA or DNA. It is found in a wide variety of proteins including ribosomal proteins, transcription factors and post-transcriptional modifiers of mRNA. There are two different KH domains that belong to different protein folds, but they share a single KH motif. The KH motif is folded into a beta alpha alpha beta unit. In addition to the core, type II KH domains (e.g. ribosomal protein S3) include N-terminal extension and type I KH domains (e.g. hnRNP K) contain C-terminal extension." Q#27382 - CGI_10027157 superfamily 247799 282 341 1.50E-09 55.9809 cl17245 KH-I superfamily - - "K homology RNA-binding domain, type I. KH binds single-stranded RNA or DNA. 
It is found in a wide variety of proteins including ribosomal proteins, transcription factors and post-transcriptional modifiers of mRNA. There are two different KH domains that belong to different protein folds, but they share a single KH motif. The KH motif is folded into a beta alpha alpha beta unit. In addition to the core, type II KH domains (e.g. ribosomal protein S3) include N-terminal extension and type I KH domains (e.g. hnRNP K) contain C-terminal extension." Q#27382 - CGI_10027157 superfamily 247799 859 944 4.29E-09 54.8253 cl17245 KH-I superfamily - - "K homology RNA-binding domain, type I. KH binds single-stranded RNA or DNA. It is found in a wide variety of proteins including ribosomal proteins, transcription factors and post-transcriptional modifiers of mRNA. There are two different KH domains that belong to different protein folds, but they share a single KH motif. The KH motif is folded into a beta alpha alpha beta unit. In addition to the core, type II KH domains (e.g. ribosomal protein S3) include N-terminal extension and type I KH domains (e.g. hnRNP K) contain C-terminal extension." Q#27382 - CGI_10027157 superfamily 247799 137 198 8.95E-08 50.9733 cl17245 KH-I superfamily - - "K homology RNA-binding domain, type I. KH binds single-stranded RNA or DNA. It is found in a wide variety of proteins including ribosomal proteins, transcription factors and post-transcriptional modifiers of mRNA. There are two different KH domains that belong to different protein folds, but they share a single KH motif. The KH motif is folded into a beta alpha alpha beta unit. In addition to the core, type II KH domains (e.g. ribosomal protein S3) include N-terminal extension and type I KH domains (e.g. hnRNP K) contain C-terminal extension." 
Q#27384 - CGI_10027159 superfamily 248458 131 311 3.37E-09 56.1681 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#27385 - CGI_10027160 superfamily 248458 14 166 3.23E-08 51.5457 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. 
MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#27386 - CGI_10027161 superfamily 216897 30 108 3.66E-27 100.063 cl03463 Gal_Lectin superfamily - - Galactose binding lectin domain; Galactose binding lectin domain. Q#27386 - CGI_10027161 superfamily 216897 128 206 1.50E-25 95.826 cl03463 Gal_Lectin superfamily - - Galactose binding lectin domain; Galactose binding lectin domain. 
Q#27387 - CGI_10027162 superfamily 245531 94 175 7.47E-05 38.9115 cl11158 BEN superfamily - - "BEN domain; The BEN domain is found in diverse animal proteins such as BANP/SMAR1, NAC1 and the Drosophila mod(mdg4) isoform C, in the chordopoxvirus virosomal protein E5R and in several proteins of polydnaviruses. Computational analysis suggests that the BEN domain mediates protein-DNA and protein-protein interactions during chromatin organisation and transcription." Q#27388 - CGI_10027163 superfamily 241616 27 68 0.000119531 37.7516 cl00109 MADS superfamily - - "MADS: MCM1, Agamous, Deficiens, and SRF (serum response factor) box family of eukaryotic transcriptional regulators. Binds DNA and exists as hetero and homo-dimers. Composed of 2 main subgroups: SRF-like/Type I and MEF2-like (myocyte enhancer factor 2)/ Type II. These subgroups differ mainly in position of the alpha 2 helix responsible for the dimerization interface; Important in homeotic regulation in plants and in immediate-early development in animals. Also found in fungi." Q#27388 - CGI_10027163 superfamily 247999 131 182 0.000912949 35.2654 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#27389 - CGI_10027165 superfamily 243064 530 643 1.67E-41 146.237 cl02512 NTR_like superfamily - - "NTR_like domain; a beta barrel with an oligosaccharide/oligonucleotide-binding fold found in netrins, complement proteins, tissue inhibitors of metalloproteases (TIMP), and procollagen C-proteinase enhancers (PCOLCE), amongst others. In netrins, the domain plays a role in controlling axon branching in neural development, while the common function of these modules in TIMPs appears to be binding to metzincins. 
A subset of this family is also known as the C345C domain because it occurs as a C-terminal domain in complement C3, C4 and C5. In C5, the domain interacts with various partners during the formation of the membrane attack complex." Q#27389 - CGI_10027165 superfamily 238012 436 484 4.02E-09 53.5122 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#27389 - CGI_10027165 superfamily 238012 254 299 8.35E-08 49.6602 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#27389 - CGI_10027165 superfamily 238012 373 426 0.000145069 40.0302 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 
up to 22 copies" Q#27389 - CGI_10027165 superfamily 243198 17 253 2.33E-101 311.602 cl02806 Laminin_N superfamily - - Laminin N-terminal (Domain VI); Laminin N-terminal (Domain VI). Q#27392 - CGI_10027168 superfamily 152088 122 201 5.81E-11 55.6719 cl13155 DUF3259 superfamily - - Protein of unknown function (DUF3259); This eukaryotic family of proteins has no known function. Q#27393 - CGI_10027169 superfamily 243072 157 274 3.44E-23 96.6838 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#27393 - CGI_10027169 superfamily 243072 332 453 4.22E-19 85.1278 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#27393 - CGI_10027169 superfamily 243072 58 209 2.31E-11 62.401 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#27393 - CGI_10027169 superfamily 217473 572 891 1.06E-12 68.1605 cl03978 Mab-21 superfamily - - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. Q#27394 - CGI_10027170 superfamily 245201 414 708 4.27E-177 512.345 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#27394 - CGI_10027170 superfamily 245814 217 302 1.11E-28 110.793 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#27394 - CGI_10027170 superfamily 245814 114 194 1.05E-17 79.3508 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. 
The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#27394 - CGI_10027170 superfamily 245814 3 83 1.36E-14 70.612 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#27395 - CGI_10027171 superfamily 248241 32 482 0 526.055 cl17687 5_nucleotid superfamily - - "5' nucleotidase family; This family of eukaryotic proteins includes 5' nucleotidase enzymes, such as purine 5'-nucleotidase EC:3.1.3.5." 
Q#27396 - CGI_10027172 superfamily 243092 84 373 3.92E-90 284.227 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#27400 - CGI_10027176 superfamily 246680 42 115 4.96E-06 44.8852 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 
They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#27400 - CGI_10027176 superfamily 246680 389 448 9.47E-06 44.1148 cl14633 DD_superfamily superfamily C - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#27401 - CGI_10027177 superfamily 220374 40 140 1.14E-40 143.349 cl16011 MCM_bind superfamily - - "Mini-chromosome maintenance replisome factor; This entry is of proteins of approximately 600 residues in length containing alternating regions of conservation and low complexity. The Arabidopsis protein is a replisome factor found to bind with the mini-chromosome maintenance, MCM-binding, complex and is crucial for efficient DNA replication." Q#27401 - CGI_10027177 superfamily 222265 276 375 2.73E-37 133.115 cl16322 Racemase_4 superfamily - - Putative alanine racemase; This is a family of eukaryotic proteins which are putatively alanine racemase. Q#27402 - CGI_10027178 superfamily 241643 299 334 6.49E-10 54.0023 cl00153 UBA superfamily - - "Ubiquitin Associated domain. 
The UBA domain is a commonly occurring sequence motif in some members of the ubiquitination pathway, UV excision repair proteins, and certain protein kinases. Although its specific role is so far unknown, it has been suggested that UBA domains are involved in conferring protein target specificity. The domain, a compact three helix bundle, has a conserved GFP-loop and the proline is thought to be critical for binding. The UBA domain is distinct from the conserved three helical domain seen in the N-terminus of EF-TS and eukaryotic NAC proteins." Q#27403 - CGI_10027179 superfamily 216167 80 243 1.40E-37 135.792 cl02999 DNA_photolyase superfamily - - DNA photolyase; This domain binds a light harvesting cofactor. Q#27404 - CGI_10027180 superfamily 205718 214 242 1.07E-05 42.4774 cl16296 RCC1_2 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#27404 - CGI_10027180 superfamily 205718 320 348 0.000118369 39.3958 cl16296 RCC1_2 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#27404 - CGI_10027180 superfamily 201217 177 226 0.000414799 37.8904 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#27406 - CGI_10027182 superfamily 191444 81 161 2.10E-10 53.4821 cl05558 IL17 superfamily - - Interleukin-17; IL-17 is a potent proinflammatory cytokine produced by activated memory T cells. The IL-17 family is thought to represent a distinct signaling system that appears to have been highly conserved across vertebrate evolution. Q#27407 - CGI_10027183 superfamily 243060 143 235 4.56E-10 60.0852 cl02507 SEA superfamily - - "SEA domain; Domain found in Sea urchin sperm protein, Enterokinase, Agrin (SEA). Proposed function of regulating or binding carbohydrate side chains. Recently a proteolytic activity has been shown for a SEA domain." 
Q#27408 - CGI_10027184 superfamily 241599 158 206 1.69E-18 77.6688 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#27409 - CGI_10027185 superfamily 242181 4 316 4.13E-78 254.864 cl00900 Ldh_2 superfamily - - "Malate/L-lactate dehydrogenase; This family consists of bacterial and archaeal Malate/L-lactate dehydrogenase. L-lactate dehydrogenase, EC:1.1.1.27, catalyzes the reaction (S)-lactate + NAD(+) <=> pyruvate + NADH. Malate dehydrogenase, EC:1.1.1.37 and EC:1.1.1.82, catalyzes the reactions: (S)-malate + NAD(+) <=> oxaloacetate + NADH, and (S)-malate + NADP(+) <=> oxaloacetate + NADPH respectively." Q#27409 - CGI_10027185 superfamily 242181 338 659 2.42E-68 228.671 cl00900 Ldh_2 superfamily - - "Malate/L-lactate dehydrogenase; This family consists of bacterial and archaeal Malate/L-lactate dehydrogenase. L-lactate dehydrogenase, EC:1.1.1.27, catalyzes the reaction (S)-lactate + NAD(+) <=> pyruvate + NADH. Malate dehydrogenase, EC:1.1.1.37 and EC:1.1.1.82, catalyzes the reactions: (S)-malate + NAD(+) <=> oxaloacetate + NADH, and (S)-malate + NADP(+) <=> oxaloacetate + NADPH respectively." Q#27410 - CGI_10027186 superfamily 243152 91 219 1.49E-37 129.331 cl02712 PGRP superfamily - - "Peptidoglycan recognition proteins (PGRPs) are pattern recognition receptors that bind, and in certain cases, hydrolyze peptidoglycans (PGNs) of bacterial cell walls. PGRPs have been divided into three classes: short PGRPs (PGRP-S), that are small (20 kDa) extracellular proteins; intermediate PGRPs (PGRP-I) that are 40-45 kDa and are predicted to be transmembrane proteins; and long PGRPs (PGRP-L), up to 90 kDa, which may be either intracellular or transmembrane. 
Several structures of PGRPs are known in insects and mammals, some bound with substrates like Muramyl Tripeptide (MTP) or Tracheal Cytotoxin (TCT). The substrate binding site is conserved in PGRP-LCx, PGRP-LE, and PGRP-Ialpha proteins. This family includes Zn-dependent N-Acetylmuramoyl-L-alanine Amidase, EC:3.5.1.28. This enzyme cleaves the amide bond between N-acetylmuramoyl and L-amino acids, preferentially D-lactyl-L-Ala, in bacterial cell walls. The structure for the bacteriophage T7 lysozyme shows that two of the conserved histidines and a cysteine are zinc binding residues. Site-directed mutagenesis of T7 lysozyme indicates that two conserved residues, a Tyr and a Lys, are important for amidase activity." Q#27411 - CGI_10027187 superfamily 243152 91 218 9.15E-41 137.805 cl02712 PGRP superfamily - - "Peptidoglycan recognition proteins (PGRPs) are pattern recognition receptors that bind, and in certain cases, hydrolyze peptidoglycans (PGNs) of bacterial cell walls. PGRPs have been divided into three classes: short PGRPs (PGRP-S), that are small (20 kDa) extracellular proteins; intermediate PGRPs (PGRP-I) that are 40-45 kDa and are predicted to be transmembrane proteins; and long PGRPs (PGRP-L), up to 90 kDa, which may be either intracellular or transmembrane. Several structures of PGRPs are known in insects and mammals, some bound with substrates like Muramyl Tripeptide (MTP) or Tracheal Cytotoxin (TCT). The substrate binding site is conserved in PGRP-LCx, PGRP-LE, and PGRP-Ialpha proteins. This family includes Zn-dependent N-Acetylmuramoyl-L-alanine Amidase, EC:3.5.1.28. This enzyme cleaves the amide bond between N-acetylmuramoyl and L-amino acids, preferentially D-lactyl-L-Ala, in bacterial cell walls. The structure for the bacteriophage T7 lysozyme shows that two of the conserved histidines and a cysteine are zinc binding residues. 
Site-directed mutagenesis of T7 lysozyme indicates that two conserved residues, a Tyr and a Lys, are important for amidase activity." Q#27412 - CGI_10027188 superfamily 243152 679 807 1.31E-39 143.198 cl02712 PGRP superfamily - - "Peptidoglycan recognition proteins (PGRPs) are pattern recognition receptors that bind, and in certain cases, hydrolyze peptidoglycans (PGNs) of bacterial cell walls. PGRPs have been divided into three classes: short PGRPs (PGRP-S), that are small (20 kDa) extracellular proteins; intermediate PGRPs (PGRP-I) that are 40-45 kDa and are predicted to be transmembrane proteins; and long PGRPs (PGRP-L), up to 90 kDa, which may be either intracellular or transmembrane. Several structures of PGRPs are known in insects and mammals, some bound with substrates like Muramyl Tripeptide (MTP) or Tracheal Cytotoxin (TCT). The substrate binding site is conserved in PGRP-LCx, PGRP-LE, and PGRP-Ialpha proteins. This family includes Zn-dependent N-Acetylmuramoyl-L-alanine Amidase, EC:3.5.1.28. This enzyme cleaves the amide bond between N-acetylmuramoyl and L-amino acids, preferentially D-lactyl-L-Ala, in bacterial cell walls. The structure for the bacteriophage T7 lysozyme shows that two of the conserved histidines and a cysteine are zinc binding residues. Site-directed mutagenesis of T7 lysozyme indicates that two conserved residues, a Tyr and a Lys, are important for amidase activity." Q#27412 - CGI_10027188 superfamily 246680 66 138 1.22E-06 47.1964 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. 
They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#27414 - CGI_10011269 superfamily 245213 347 383 2.65E-06 45.3202 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#27414 - CGI_10011269 superfamily 245213 310 345 7.54E-06 44.1646 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#27414 - CGI_10011269 superfamily 246918 391 443 1.08E-15 73.0047 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#27414 - CGI_10011269 superfamily 219525 685 722 5.06E-06 44.7174 cl06646 GCC2_GCC3 superfamily N - GCC2 and GCC3; GCC2 and GCC3. 
Q#27414 - CGI_10011269 superfamily 219525 210 252 1.16E-05 43.947 cl06646 GCC2_GCC3 superfamily - - GCC2 and GCC3; GCC2 and GCC3. Q#27415 - CGI_10011270 superfamily 247866 48 249 7.31E-24 96.7528 cl17312 PhyH superfamily - - "Phytanoyl-CoA dioxygenase (PhyH); This family is made up of several eukaryotic phytanoyl-CoA dioxygenase (PhyH) proteins, ectoine hydroxylases and a number of bacterial deoxygenases. PhyH is a peroxisomal enzyme catalyzing the first step of phytanic acid alpha-oxidation. PhyH deficiency causes Refsum's disease (RD) which is an inherited neurological syndrome biochemically characterized by the accumulation of phytanic acid in plasma and tissues." Q#27416 - CGI_10011271 superfamily 247866 3 142 2.55E-21 87.508 cl17312 PhyH superfamily N - "Phytanoyl-CoA dioxygenase (PhyH); This family is made up of several eukaryotic phytanoyl-CoA dioxygenase (PhyH) proteins, ectoine hydroxylases and a number of bacterial deoxygenases. PhyH is a peroxisomal enzyme catalyzing the first step of phytanic acid alpha-oxidation. PhyH deficiency causes Refsum's disease (RD) which is an inherited neurological syndrome biochemically characterized by the accumulation of phytanic acid in plasma and tissues." 
Q#27417 - CGI_10011272 superfamily 247684 1 263 1.96E-130 384.354 cl17037 NBD_sugar-kinase_HSP70_actin superfamily N - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#27418 - CGI_10011273 superfamily 220692 29 332 9.68E-27 107.675 cl18570 7TM_GPCR_Srw superfamily - - Serpentine type 7TM GPCR chemoreceptor Srw; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srw is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. 
The genes encoding Srw do not appear to be under as strong an adaptive evolutionary pressure as those of Srz. Q#27419 - CGI_10011274 superfamily 220692 29 330 5.33E-15 73.7777 cl18570 7TM_GPCR_Srw superfamily - - Serpentine type 7TM GPCR chemoreceptor Srw; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srw is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. The genes encoding Srw do not appear to be under as strong an adaptive evolutionary pressure as those of Srz. Q#27420 - CGI_10011275 superfamily 220692 39 337 1.02E-20 90.3413 cl18570 7TM_GPCR_Srw superfamily - - Serpentine type 7TM GPCR chemoreceptor Srw; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srw is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. The genes encoding Srw do not appear to be under as strong an adaptive evolutionary pressure as those of Srz. Q#27421 - CGI_10011276 superfamily 220692 23 329 8.98E-22 93.0377 cl18570 7TM_GPCR_Srw superfamily - - Serpentine type 7TM GPCR chemoreceptor Srw; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srw is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. The genes encoding Srw do not appear to be under as strong an adaptive evolutionary pressure as those of Srz. 
Q#27422 - CGI_10011277 superfamily 241563 68 109 1.56E-06 46.3184 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#27422 - CGI_10011277 superfamily 241563 28 59 0.000874048 38.2292 cl00034 BBOX superfamily N - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#27424 - CGI_10011279 superfamily 245596 35 209 7.15E-94 276 cl11394 Glyco_tranf_GTA_type superfamily C - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." 
Q#27428 - CGI_10027965 superfamily 241574 87 306 1.65E-60 205.127 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#27428 - CGI_10027965 superfamily 241574 367 605 3.58E-42 154.281 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#27429 - CGI_10027966 superfamily 241597 410 451 3.43E-06 45.3024 cl00082 HMG-box superfamily C - "High Mobility Group (HMG)-box is found in a variety of eukaryotic chromosomal proteins and transcription factors. HMGs bind to the minor groove of DNA and have been classified by DNA binding preferences. Two phylogenically distinct groups of Class I proteins bind DNA in a sequence specific fashion and contain a single HMG box. One group (SOX-TCF) includes transcription factors, TCF-1, -3, -4; and also SRY and LEF-1, which bind four-way DNA junctions and duplex DNA targets. 
The second group (MATA) includes fungal mating type gene products MC, MATA1 and Ste11. Class II and III proteins (HMGB-UBF) bind DNA in a non-sequence specific fashion and contain two or more tandem HMG boxes. Class II members include non-histone chromosomal proteins, HMG1 and HMG2, which bind to bent or distorted DNA such as four-way DNA junctions, synthetic DNA cruciforms, kinked cisplatin-modified DNA, DNA bulges, cross-overs in supercoiled DNA, and can cause looping of linear DNA. Class III members include nucleolar and mitochondrial transcription factors, UBF and mtTF1, which bind four-way DNA junctions." Q#27430 - CGI_10027967 superfamily 243092 6 294 4.27E-67 225.292 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#27430 - CGI_10027967 superfamily 219795 530 777 1.68E-49 174.871 cl08476 PUL superfamily - - "PUL domain; The PUL (PLAP, Ufd3p and Lub1p) domain is a novel alpha-helical Ub-associated domain. It directly binds to Cdc48, a chaperone-like AAA ATPase that collects ubiquitylated substrates." Q#27430 - CGI_10027967 superfamily 204121 339 455 1.37E-48 167.403 cl08519 PFU superfamily - - PFU (PLAA family ubiquitin binding); This domain is found N terminal to pfam08324 and binds to ubiquitin. Q#27432 - CGI_10027969 superfamily 218123 575 767 2.56E-74 245.301 cl04559 CP2 superfamily - - CP2 transcription factor; This family represents a conserved region in the CP2 transcription factor family. Q#27433 - CGI_10027970 superfamily 243092 545 859 1.44E-38 147.096 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." 
Q#27433 - CGI_10027970 superfamily 247683 1065 1115 1.40E-24 99.1223 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#27435 - CGI_10027972 superfamily 242203 122 290 5.31E-51 169.067 cl00935 Brix superfamily - - Brix domain; Brix domain. Q#27436 - CGI_10027973 superfamily 248145 12 231 2.98E-37 134.299 cl17591 CAF1 superfamily C - CAF1 family ribonuclease; The major pathways of mRNA turnover in eukaryotes initiate with shortening of the polyA tail. CAF1 encodes a critical component of the major cytoplasmic deadenylase in yeast. Caf1p is required for normal mRNA deadenylation in vivo and localises to the cytoplasm. Caf1p copurifies with a Ccr4p-dependent polyA-specific exonuclease activity. Some members of this family include an inserted RNA binding domain pfam01424. This family of proteins is related to other exonucleases pfam00929 (Bateman A pers. obs.). The crystal structure of Saccharomyces cerevisiae Pop2 has been resolved at 2.3 Angstrom resolution. 
Q#27438 - CGI_10027975 superfamily 241645 13 84 4.48E-16 72.1293 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#27439 - CGI_10027976 superfamily 241782 2 283 9.03E-54 180.23 cl00321 AAT_I superfamily - - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. 
Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phosphorylase family (fold type V)." Q#27440 - CGI_10027977 superfamily 243095 785 980 3.70E-81 263.558 cl02570 RhoGAP superfamily - - "RhoGAP: GTPase-activator protein (GAP) for Rho-like GTPases; GAPs towards Rho/Rac/Cdc42-like small GTPases. Small GTPases (G proteins) cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when bound to GDP. The Rho family of small G proteins, which includes Cdc42Hs, activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. G proteins generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. The RhoGAPs are one of the major classes of regulators of Rho G proteins." Q#27440 - CGI_10027977 superfamily 241622 82 165 1.27E-19 85.311 cl00117 PDZ superfamily - - "PDZ domain, also called DHR (Dlg homologous region) or GLGF (after a conserved sequence motif). Many PDZ domains bind C-terminal polypeptides, though binding to internal (non-C-terminal) polypeptides and even to lipids has been demonstrated. Heterodimerization through PDZ-PDZ domain interactions adds to the domain's versatility, and PDZ domain-mediated interactions may be modulated dynamically through target phosphorylation. Some PDZ domains play a role in scaffolding supramolecular complexes. PDZ domains are found in diverse signaling proteins in bacteria, archebacteria, and eukaryotes. 
This CD contains two distinct structural subgroups with either a N- or C-terminal beta-strand forming the peptide-binding groove base. The circular permutation placing the strand on the N-terminus appears to be found in Eumetazoa only, while the C-terminal variant is found in all three kingdoms of life, and seems to co-occur with protease domains. PDZ domains have been named after PSD95(post synaptic density protein), DlgA (Drosophila disc large tumor suppressor), and ZO1, a mammalian tight junction protein." Q#27440 - CGI_10027977 superfamily 246669 658 730 7.17E-08 51.6839 cl14603 C2 superfamily C - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-stranded beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." Q#27441 - CGI_10027978 superfamily 248458 96 404 7.88E-11 61.9461 cl17904 MFS superfamily - - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. 
MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#27442 - CGI_10027979 superfamily 248012 11 118 2.51E-07 48.7281 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#27442 - CGI_10027979 superfamily 190233 262 320 5.16E-05 41.2858 cl08341 zf-TRAF superfamily - - TRAF-type zinc finger; TRAF-type zinc finger. 
Q#27443 - CGI_10027981 superfamily 243035 24 135 2.19E-05 39.5254 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. 
Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#27444 - CGI_10027982 superfamily 247792 16 62 0.00126749 36.6548 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#27445 - CGI_10027983 superfamily 214545 457 596 8.66E-59 196.772 cl10551 CULLIN superfamily - - Cullin; Cullin. Q#27445 - CGI_10027983 superfamily 245539 704 771 1.01E-30 116.114 cl11186 Cullin_Nedd8 superfamily - - "Cullin protein neddylation domain; This is the neddylation site of cullin proteins which are a family of structurally related proteins containing an evolutionarily conserved cullin domain. 
With the exception of APC2, each member of the cullin family is modified by Nedd8 and several cullins function in Ubiquitin-dependent proteolysis, a process in which the 26S proteasome recognises and subsequently degrades a target protein tagged with K48-linked poly-ubiquitin chains. Cullins are molecular scaffolds responsible for assembling the ROC1/Rbx1 RING-based E3 ubiquitin ligases, of which several play a direct role in tumorigenesis. Nedd8/Rub1 is a small ubiquitin-like protein, which was originally found to be conjugated to Cdc53, a cullin component of the SCF (Skp1-Cdc53/CUL1-F-box protein) E3 Ub ligase complex in Saccharomyces cerevisiae, and Nedd8 modification has now emerged as a regulatory pathway of fundamental importance for cell cycle control and for embryogenesis in metazoans. The only identified Nedd8 substrates are cullins. Neddylation results in covalent conjugation of a Nedd8 moiety onto a conserved cullin lysine residue." Q#27446 - CGI_10027984 superfamily 248097 295 417 3.98E-25 98.8766 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#27446 - CGI_10027984 superfamily 225368 68 166 0.00214554 36.2402 cl01058 NtpF superfamily - - Archaeal/vacuolar-type H+-ATPase subunit H [Energy production and conversion] Q#27447 - CGI_10027985 superfamily 248097 63 172 3.14E-19 79.2314 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#27449 - CGI_10027987 superfamily 145577 58 83 4.25E-05 36.8786 cl03632 Spin-Ssty superfamily N - Spin/Ssty Family; Spindlin (Spin) is a novel maternal transcript present in the unfertilised egg and early embryo. The Y-linked spermiogenesis -specific transcript (Ssty) is also expressed during gametogenesis and forms part of this Pfam family. Members of this family contain three copies of this 50 residue repeat. The repeat is predicted to contain four beta strands. 
Q#27450 - CGI_10027988 superfamily 243166 18 155 8.34E-11 58.0738 cl02759 TRAM_LAG1_CLN8 superfamily - - TLC domain; TLC domain. Q#27452 - CGI_10027990 superfamily 245226 197 365 2.02E-29 112.394 cl10012 DnaQ_like_exo superfamily - - "DnaQ-like (or DEDD) 3'-5' exonuclease domain superfamily; The DnaQ-like exonuclease superfamily is a structurally conserved group of 3'-5' exonucleases, which catalyze the excision of nucleoside monophosphates at the DNA or RNA termini in the 3'-5' direction. It is also called the DEDD superfamily, after the four invariant acidic residues present in the catalytic site of its members. The superfamily consists of DNA- and RNA-processing enzymes such as the proofreading domains of DNA polymerases, other DNA exonucleases, RNase D, RNase T, Oligoribonuclease and RNA exonucleases (REX). The DnaQ-like exonuclease domain contains three conserved sequence motifs termed ExoI, ExoII and ExoIII, which are clustered around the active site and contain four conserved acidic residues that serve as ligands for the two metal ions required for catalysis. The conservation patterns of the three motifs may vary among different subfamilies. DnaQ-like exonucleases are classified as DEDDy or DEDDh exonucleases depending on the variation of motif III as YX(3)D or HX(4)D, respectively. The significance of the motif differences is still unclear. Almost all RNase families in this superfamily are present only in eukaryotes and bacteria, but not in archaea, suggesting a later origin, which in some cases are accompanied by horizontal gene transfer." Q#27453 - CGI_10027991 superfamily 241818 107 150 0.000871686 36.4333 cl00366 PMSR superfamily NC - Peptide methionine sulfoxide reductase; This enzyme repairs damaged proteins. Methionine sulfoxide in proteins is reduced to methionine. 
Q#27455 - CGI_10027993 superfamily 241626 163 283 4.30E-59 188.197 cl00125 RHOD superfamily - - "Rhodanese Homology Domain (RHOD); an alpha beta fold domain found duplicated in the rhodanese protein. The cysteine containing enzymatically active version of the domain is also found in the Cdc25 class of protein phosphatases and a variety of proteins such as sulfide dehydrogenases and certain stress proteins such as senescence specific protein 1 in plants, PspE and GlpE in bacteria and cyanide and arsenate resistance proteins. Inactive versions (no active site cysteine) are also seen in dual specificity phosphatases, ubiquitin hydrolases from yeast and in sulfuryltransferases, where they are believed to play a regulatory role in multidomain proteins." Q#27457 - CGI_10027995 superfamily 212564 11 32 0.000185028 38.2143 cl17035 lambda-1 superfamily C - "inner capsid protein lambda-1 or VP3; The reovirus inner capsid protein lambda-1 displays nucleoside triphosphate phosphohydrolase (NTPase), RNA-5'-triphosphatase (RTPase), and RNA helicase activity and may play a role in the transcription of the virus genome, the unwinding or reannealing of double-stranded RNA during RNA synthesis. The RTPase activity constitutes the first step in the capping of RNA, resulting in a 5'-diphosphorylated RNA plus-strand. lambda1 is an Orthoreovirus core protein, VP3 is the homologous core protein in Aquareoviruses." Q#27459 - CGI_10027997 superfamily 243066 22 121 6.23E-09 54.1605 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. 
The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#27459 - CGI_10027997 superfamily 222150 752 777 2.22E-05 42.7641 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#27459 - CGI_10027997 superfamily 246975 739 760 0.00464601 35.7857 cl15478 zf-C2H2 superfamily - - "Zinc finger, C2H2 type; The C2H2 zinc finger is the classical zinc finger domain. The two conserved cysteines and histidines co-ordinate a zinc ion. The following pattern describes the zinc finger. #-X-C-X(1-5)-C-X3-#-X5-#-X2-H-X(3-6)-[H/C] Where X can be any amino acid, and numbers in brackets indicate the number of residues. The positions marked # are those that are important for the stable fold of the zinc finger. The final position can be either his or cys. The C2H2 zinc finger is composed of two short beta strands followed by an alpha helix. The amino terminal part of the helix binds the major groove in DNA binding zinc fingers. The accepted consensus binding sequence for Sp1 is usually defined by the asymmetric hexanucleotide core GGGCGG but this sequence does not include, among others, the GAG (=CTC) repeat that constitutes a high-affinity site for Sp1 binding to the wt1 promoter." Q#27460 - CGI_10027998 superfamily 243066 38 137 2.37E-10 58.3977 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. 
The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#27460 - CGI_10027998 superfamily 245531 697 771 5.01E-05 42.3783 cl11158 BEN superfamily - - "BEN domain; The BEN domain is found in diverse animal proteins such as BANP/SMAR1, NAC1 and the Drosophila mod(mdg4) isoform C, in the chordopoxvirus virosomal protein E5R and in several proteins of polydnaviruses. Computational analysis suggests that the BEN domain mediates protein-DNA and protein-protein interactions during chromatin organisation and transcription." Q#27465 - CGI_10028003 superfamily 217293 38 103 7.69E-07 46.4719 cl03788 Neur_chan_LBD superfamily C - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#27466 - CGI_10028004 superfamily 217293 33 238 2.76E-45 157.409 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#27466 - CGI_10028004 superfamily 202474 262 329 5.29E-08 51.8857 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. 
Q#27467 - CGI_10028005 superfamily 245864 1 213 9.21E-71 225.235 cl12078 p450 superfamily N - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." Q#27468 - CGI_10028006 superfamily 216686 1 124 6.33E-26 98.9345 cl18377 Galactosyl_T superfamily N - "Galactosyltransferase; This family includes the galactosyltransferases UDP-galactose:2-acetamido-2-deoxy-D-glucose3beta-galactosyltransferase and UDP-Gal:beta-GlcNAc beta 1,3-galactosyltransferase. Specific galactosyltransferases transfer galactose to GlcNAc terminal chains in the synthesis of the lacto-series oligosaccharides types 1 and 2." Q#27469 - CGI_10028007 superfamily 241629 76 212 6.11E-43 147.275 cl00133 SCP superfamily - - "SCP: SCP-like extracellular protein domain, found in eukaryotes and prokaryotes. 
This family includes plant pathogenesis-related protein 1 (PR-1), which accumulates after infections with pathogens, and may act as an anti-fungal agent or be involved in cell wall loosening. This family also includes CRISPs, mammalian cysteine-rich secretory proteins, which combine SCP with a C-terminal cysteine rich domain, and allergen 5 from vespid venom. Roles for CRISP, in response to pathogens, fertilization, and sperm maturation have been proposed. One member, Tex31 from the venom duct of Conus textile, has been shown to possess proteolytic activity sensitive to serine protease inhibitors. The human GAPR-1 protein has been reported to dimerize, and such a dimer may form an active site containing a catalytic triad. SCP has also been proposed to be a Ca++ chelating serine protease. The Ca++-chelating function would fit with various signaling processes that members of this family, such as the CRISPs, are involved in, and is supported by sequence and structural evidence of a conserved pocket containing two histidines and a glutamate. It also may explain how helothermine, a toxic peptide secreted by the beaded lizard, blocks Ca++ transporting ryanodine receptors. Little is known about the biological roles of the bacterial and archaeal SCP domains." Q#27469 - CGI_10028007 superfamily 241609 286 358 2.09E-14 67.8006 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#27470 - CGI_10028008 superfamily 247068 774 871 2.52E-12 65.0273 cl15786 CA_like superfamily - - "Cadherin repeat-like domain; Cadherins are glycoproteins involved in Ca2+-mediated cell-cell adhesion. 
The cadherin repeat domains occur as tandem repeats in the extracellular regions, which are thought to mediate cell-cell contact when bound to calcium. They play numerous roles in cell fate, signalling, proliferation, differentiation, and migration; members include E-, N-, P-, T-, VE-, CNR-, proto-, and FAT-family cadherin, desmocollin, and desmoglein, a large variety of domain architectures with varying repeat copy numbers. Cadherin-repeat containing proteins exist as monomers, homodimers, or heterodimers. This family also includes the cadherin-like repeats of extracellular alpha-dystroglycan." Q#27470 - CGI_10028008 superfamily 241629 74 212 2.81E-41 149.799 cl00133 SCP superfamily - - "SCP: SCP-like extracellular protein domain, found in eukaryotes and prokaryotes. This family includes plant pathogenesis-related protein 1 (PR-1), which accumulates after infections with pathogens, and may act as an anti-fungal agent or be involved in cell wall loosening. This family also includes CRISPs, mammalian cysteine-rich secretory proteins, which combine SCP with a C-terminal cysteine rich domain, and allergen 5 from vespid venom. Roles for CRISP, in response to pathogens, fertilization, and sperm maturation have been proposed. One member, Tex31 from the venom duct of Conus textile, has been shown to possess proteolytic activity sensitive to serine protease inhibitors. The human GAPR-1 protein has been reported to dimerize, and such a dimer may form an active site containing a catalytic triad. SCP has also been proposed to be a Ca++ chelating serine protease. The Ca++-chelating function would fit with various signaling processes that members of this family, such as the CRISPs, are involved in, and is supported by sequence and structural evidence of a conserved pocket containing two histidines and a glutamate. It also may explain how helothermine, a toxic peptide secreted by the beaded lizard, blocks Ca++ transporting ryanodine receptors. 
Little is known about the biological roles of the bacterial and archaeal SCP domains." Q#27470 - CGI_10028008 superfamily 241609 264 334 2.39E-15 73.1934 cl00100 KR superfamily - - "Kringle domain; Kringle domains are believed to play a role in binding mediators, such as peptides, other proteins, membranes, or phospholipids. They are autonomous structural domains, found in a varying number of copies, in blood clotting and fibrinolytic proteins, some serine proteases and plasma proteins. Plasminogen-like kringles possess affinity for free lysine and lysine-containing peptides." Q#27471 - CGI_10028009 superfamily 247724 81 260 7.16E-98 286.835 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." 
Q#27474 - CGI_10028012 superfamily 247856 147 193 1.41E-08 50.6241 cl17302 EFh superfamily N - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#27474 - CGI_10028012 superfamily 247856 60 112 5.22E-08 49.0833 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#27474 - CGI_10028012 superfamily 247856 190 233 0.000213913 38.6829 cl17302 EFh superfamily C - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#27476 - CGI_10028014 superfamily 241578 648 703 1.85E-07 51.1382 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#27476 - CGI_10028014 superfamily 243119 781 825 1.43E-05 43.9717 cl02629 CBM_14 superfamily - - Chitin binding Peritrophin-A domain; This domain is called the Peritrophin-A domain and is found in chitin binding proteins particularly peritrophic matrix proteins of insects and animal chitinases. Copies of the domain are also found in some baculoviruses. Relevant references that describe proteins with this domain include. It is an extracellular domain that contains six conserved cysteines that probably form three disulphide bridges. Chitin binding has been demonstrated for a protein containing only two of these domains. Q#27479 - CGI_10028017 superfamily 241900 179 223 0.00320375 37.6565 cl00490 EEP superfamily N - "Exonuclease-Endonuclease-Phosphatase (EEP) domain superfamily; This large superfamily includes the catalytic domain (exonuclease/endonuclease/phosphatase or EEP domain) of a diverse set of proteins including the ExoIII family of apurinic/apyrimidinic (AP) endonucleases, inositol polyphosphate 5-phosphatases (INPP5), neutral sphingomyelinases (nSMases), deadenylases (such as the vertebrate circadian-clock regulated nocturnin), bacterial cytolethal distending toxin B (CdtB), deoxyribonuclease 1 (DNase1), the endonuclease domain of the non-LTR retrotransposon LINE-1, and related domains. 
These diverse enzymes share a common catalytic mechanism of cleaving phosphodiester bonds; their substrates range from nucleic acids to phospholipids and perhaps proteins." Q#27480 - CGI_10028018 superfamily 243112 49 179 6.00E-52 167.428 cl02620 YDG_SRA superfamily - - "YDG/SRA domain; The function of this domain is unknown, it contains a conserved motif YDG after which it has been named." Q#27481 - CGI_10028019 superfamily 243032 668 989 5.24E-172 508.285 cl02427 Pumilio superfamily - - "Pumilio-family RNA binding domain; Puf repeats (also labelled PUM-HD or Pumilio homology domain) mediate sequence specific RNA binding in fly Pumilio, worm FBF-1 and FBF-2, and many other proteins such as vertebrate Pumilio. These proteins function as translational repressors in early embryonic development by binding to sequences in the 3' UTR of target mRNAs, such as the nanos response element (NRE) in fly Hunchback mRNA, or the point mutation element (PME) in worm fem-3 mRNA. Other proteins that contain Puf domains are also plausible RNA binding proteins. Yeast PUF1 (JSN1), for instance, appears to contain a single RNA-recognition motif (RRM) domain. Puf repeat proteins have been observed to function asymmetrically and may be responsible for creating protein gradients involved in the specification of cell fate and differentiation. Puf domains usually occur as a tandem repeat of 8 domains. This model encompasses all 8 tandem repeats. Some proteins may have fewer (canonical) repeats." Q#27481 - CGI_10028019 superfamily 247856 24 85 1.17E-14 70.6545 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." 
Q#27481 - CGI_10028019 superfamily 247856 99 149 7.50E-10 56.7873 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#27485 - CGI_10028023 superfamily 218688 14 47 0.00139123 33.22 cl05315 ATP-synt_E superfamily C - ATP synthase E chain; This family consists of several ATP synthase E chain sequences which are components of the CF(0) subunit. Q#27486 - CGI_10028024 superfamily 247739 156 258 1.45E-35 128.108 cl17185 LPLAT superfamily C - "Lysophospholipid acyltransferases (LPLATs) of glycerophospholipid biosynthesis; Lysophospholipid acyltransferase (LPLAT) superfamily members are acyltransferases of de novo and remodeling pathways of glycerophospholipid biosynthesis. These proteins catalyze the incorporation of an acyl group from either acylCoAs or acyl-acyl carrier proteins (acylACPs) into acceptors such as glycerol 3-phosphate, dihydroxyacetone phosphate or lyso-phosphatidic acid. Included in this superfamily are LPLATs such as glycerol-3-phosphate 1-acyltransferase (GPAT, PlsB), 1-acyl-sn-glycerol-3-phosphate acyltransferase (AGPAT, PlsC), lysophosphatidylcholine acyltransferase 1 (LPCAT-1), lysophosphatidylethanolamine acyltransferase (LPEAT, also known as, MBOAT2, membrane-bound O-acyltransferase domain-containing protein 2), lipid A biosynthesis lauroyl/myristoyl acyltransferase, 2-acylglycerol O-acyltransferase (MGAT), dihydroxyacetone phosphate acyltransferase (DHAPAT, also known as 1 glycerol-3-phosphate O-acyltransferase 1) and Tafazzin (the protein product of the Barth syndrome (TAZ) gene)." 
Q#27487 - CGI_10028025 superfamily 247739 1 95 7.10E-37 138.893 cl17185 LPLAT superfamily N - "Lysophospholipid acyltransferases (LPLATs) of glycerophospholipid biosynthesis; Lysophospholipid acyltransferase (LPLAT) superfamily members are acyltransferases of de novo and remodeling pathways of glycerophospholipid biosynthesis. These proteins catalyze the incorporation of an acyl group from either acylCoAs or acyl-acyl carrier proteins (acylACPs) into acceptors such as glycerol 3-phosphate, dihydroxyacetone phosphate or lyso-phosphatidic acid. Included in this superfamily are LPLATs such as glycerol-3-phosphate 1-acyltransferase (GPAT, PlsB), 1-acyl-sn-glycerol-3-phosphate acyltransferase (AGPAT, PlsC), lysophosphatidylcholine acyltransferase 1 (LPCAT-1), lysophosphatidylethanolamine acyltransferase (LPEAT, also known as, MBOAT2, membrane-bound O-acyltransferase domain-containing protein 2), lipid A biosynthesis lauroyl/myristoyl acyltransferase, 2-acylglycerol O-acyltransferase (MGAT), dihydroxyacetone phosphate acyltransferase (DHAPAT, also known as 1 glycerol-3-phosphate O-acyltransferase 1) and Tafazzin (the protein product of the Barth syndrome (TAZ) gene)." Q#27488 - CGI_10028026 superfamily 193607 265 395 9.36E-76 233.618 cl15237 Deltex_C superfamily - - "Domain found at the C-terminus of deltex-like; The deltex family of proteins is involved in the regulation of Notch signaling, and therefore may play roles in cell-to-cell communications that regulate mechanisms determining cell fate. They have a central RING-type zinc finger domain and contain a C-terminal domain, described here, that is also found in other domain architectures. Deltex-1 (DTX1) contains a RING finger and two WWE domains, indicating that it may be an E3 ubiquitin ligase. 
Human deltex 3-like, which contains an additional N-terminal domain (presumably with ubiquitin ligase activity) is also described as E3 ubiquitin-protein ligase DTX3L, B-lymphoma- and BAL-associated protein (BBAP), or rhysin-2. DTX3L mediates monoubiquitination of K91 of histone H4 in response to DNA damage." Q#27488 - CGI_10028026 superfamily 241554 72 197 1.13E-07 50.0367 cl00019 Macro superfamily C - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#27489 - CGI_10028027 superfamily 248020 28 359 4.86E-50 176.501 cl17466 Sulfatase superfamily - - Sulfatase; Sulfatase. Q#27490 - CGI_10028028 superfamily 241578 54 223 2.99E-13 68.7466 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. 
There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#27490 - CGI_10028028 superfamily 241578 447 510 1.43E-05 44.8642 cl00057 vWFA superfamily C - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#27490 - CGI_10028028 superfamily 241578 509 556 0.000413262 40.7938 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. 
These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#27491 - CGI_10028029 superfamily 241578 16 185 3.95E-13 66.8206 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#27492 - CGI_10028030 superfamily 243095 1319 1502 4.98E-53 186.122 cl02570 RhoGAP superfamily - - "RhoGAP: GTPase-activator protein (GAP) for Rho-like GTPases; GAPs towards Rho/Rac/Cdc42-like small GTPases. 
Small GTPases (G proteins) cluster into distinct families, and all act as molecular switches, active in their GTP-bound form but inactive when bound to GDP. The Rho family of small G proteins, which includes Cdc42Hs, activates effectors involved in a wide variety of developmental processes, including regulation of cytoskeleton formation, cell proliferation and the JNK signaling pathway. G proteins generally have a low intrinsic GTPase hydrolytic activity but there are family-specific groups of GAPs that enhance the rate of GTP hydrolysis by several orders of magnitude. The RhoGAPs are one of the major classes of regulators of Rho G proteins." Q#27492 - CGI_10028030 superfamily 247724 153 245 5.14E-06 46.7315 cl17170 Ras_like_GTPase superfamily N - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." 
Q#27493 - CGI_10028031 superfamily 245205 66 129 1.22E-05 40.6841 cl09930 RPA_2b-aaRSs_OBF_like superfamily N - "Replication protein A, class 2b aminoacyl-tRNA synthetases, and related proteins with oligonucleotide/oligosaccharide (OB) fold.; This superfamily includes two oligonucleotide/oligosaccharide binding fold (OBF) domain families. One of these contains the OBF domains of the large (RPA1, 70kDa), middle (RPA2, RPA4, 32kDa) and small (RPA3, 14 kDa) subunits of human heterotrimeric Replication protein A (RPA), and similar domains. RPA is a nuclear single-strand (ss) DNA-binding protein involved in most aspects of DNA metabolism. This family includes the four OBF domains of RPA1 [DNA-binding domain (DBD)-A, DBD-B, DBD-C, and RPA1N], the OBF domain of RPA2 (RPA2 DBD-D), RPA3, and the OBF domain of RPA4. The major DNA binding activity of human RPA and Saccharomyces cerevisiae RPA appears to be associated with DBD-A and -B, of RPA1. RPA1 DBD-C shows only weak ssDNA-binding activity and is involved in trimerization. The other OBF domain family in this superfamily is the N-terminal, anticodon recognition domain of class 2b aminoacyl-tRNA synthetases (aaRSs). aaRSs catalyze the specific attachment of amino acids to their cognate tRNAs during protein biosynthesis. Class 2b aaRSs include the homodimeric aspartyl-, asparaginyl-, and lysyl-tRNA synthetases." Q#27494 - CGI_10028032 superfamily 201810 84 161 7.94E-46 146.563 cl03226 Skp1 superfamily - - "Skp1 family, dimerisation domain; Skp1 family, dimerisation domain. " Q#27494 - CGI_10028032 superfamily 248291 1 111 2.34E-27 99.6752 cl17737 Skp1_POZ superfamily - - "Skp1 family, tetramerisation domain; Skp1 family, tetramerisation domain. " Q#27495 - CGI_10028033 superfamily 247724 225 442 5.42E-50 169.527 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. 
The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#27495 - CGI_10028033 superfamily 110047 141 223 1.72E-13 67.734 cl03072 GTP1_OBG superfamily N - "GTP1/OBG; The N-terminal domain of B. subtilis GTPase obgE has the OBG fold, which is formed by three glycine-rich regions inserted into a small 8-stranded beta-sandwich; these regions form six left-handed collagen-like helices packed and H-bonded together." Q#27496 - CGI_10028034 superfamily 242611 370 552 1.44E-70 232.038 cl01629 TPP_enzymes superfamily - - "Thiamine pyrophosphate (TPP) enzyme family, TPP-binding module; found in many key metabolic enzymes which use TPP (also known as thiamine diphosphate) as a cofactor. These enzymes include, among others, the E1 components of the pyruvate, the acetoin and the branched chain alpha-keto acid dehydrogenase complexes." 
Q#27496 - CGI_10028034 superfamily 245606 10 161 5.00E-57 193.517 cl11410 TPP_enzyme_PYR superfamily - - "Pyrimidine (PYR) binding domain of thiamine pyrophosphate (TPP)-dependent enzymes; Thiamine pyrophosphate (TPP) family, pyrimidine (PYR) binding domain; found in many key metabolic enzymes which use TPP (also known as thiamine diphosphate) as a cofactor. TPP binds in the cleft formed by a PYR domain and a PP domain. The PYR domain binds the aminopyrimidine ring of TPP, the PP domain binds the diphosphate residue. A polar interaction between the conserved glutamate of the PYR domain and the N1' of the TPP aminopyrimidine ring is shared by most TPP-dependent enzymes, and participates in the activation of TPP. The PYR and PP domains have a common fold, but do not share strong sequence conservation. The PP domain is not included in this group. Most TPP-dependent enzymes have the PYR and PP domains on the same subunit although these domains can be alternatively arranged in the primary structure. In the case of 2-oxoisovalerate dehydrogenase (2OXO), sulfopyruvate decarboxylase (ComDE), and the E1 component of human pyruvate dehydrogenase complex (E1-PDHc) the PYR and PP domains appear on different subunits. TPP-dependent enzymes are multisubunit proteins, the smallest catalytic unit being a dimer-of-active sites. For many of these enzymes the active sites lie between PP and PYR domains on different subunits. However, for the homodimeric enzymes 1-deoxy-D-xylulose 5-phosphate synthase (DXS) and Desulfovibrio africanus pyruvate:ferredoxin oxidoreductase (PFOR), each active site lies at the interface of the PYR and PP domains from the same subunit." 
Q#27496 - CGI_10028034 superfamily 247727 759 866 4.23E-11 61.2918 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#27496 - CGI_10028034 superfamily 215786 194 323 5.11E-41 148.083 cl18345 TPP_enzyme_M superfamily - - "Thiamine pyrophosphate enzyme, central domain; The central domain of TPP enzymes contains a 2-fold Rossmann fold." Q#27496 - CGI_10028034 superfamily 216379 814 920 0.00251146 39.3011 cl18366 NNMT_PNMT_TEMT superfamily N - NNMT/PNMT/TEMT family; NNMT/PNMT/TEMT family. Q#27497 - CGI_10028035 superfamily 219242 48 462 8.90E-74 245.678 cl06147 FPN1 superfamily - - "Ferroportin1 (FPN1); This family represents a conserved region approximately 100 residues long within eukaryotic Ferroportin1 (FPN1), a protein that may play a role in iron export from the cell. This family may represent a number of transmembrane regions in Ferroportin1." Q#27497 - CGI_10028035 superfamily 219242 529 609 3.22E-24 104.695 cl06147 FPN1 superfamily N - "Ferroportin1 (FPN1); This family represents a conserved region approximately 100 residues long within eukaryotic Ferroportin1 (FPN1), a protein that may play a role in iron export from the cell. This family may represent a number of transmembrane regions in Ferroportin1." 
Q#27500 - CGI_10028038 superfamily 241567 1 205 7.17E-23 92.6635 cl00042 CASc superfamily - - "Caspase, interleukin-1 beta converting enzyme (ICE) homologues; Cysteine-dependent aspartate-directed proteases that mediate programmed cell death (apoptosis). Caspases are synthesized as inactive zymogens and activated by proteolysis of the peptide backbone adjacent to an aspartate. The resulting two subunits associate to form an (alpha)2(beta)2-tetramer which is the active enzyme. Activation of caspases can be mediated by other caspase homologs." Q#27501 - CGI_10028039 superfamily 248012 13 94 4.45E-15 68.7585 cl17458 TIR_2 superfamily N - TIR domain; This is a family of bacterial Toll-like receptors. Q#27503 - CGI_10028041 superfamily 247792 16 65 3.24E-06 43.9736 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)-H-X(2-3)-(N/C/H)-X2-C-X(4-48)-C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#27506 - CGI_10028044 superfamily 243034 498 597 8.16E-23 94.3691 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, 
transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#27506 - CGI_10028044 superfamily 243034 566 665 5.02E-14 68.9459 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial 
targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#27506 - CGI_10028044 superfamily 243034 449 529 4.72E-13 66.2496 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#27506 - CGI_10028044 superfamily 243034 636 709 5.17E-09 54.3084 cl02429 TPR superfamily C - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the 
interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#27506 - CGI_10028044 superfamily 149463 284 360 6.85E-20 85.3441 cl07144 DUF1736 superfamily - - Domain of unknown function (DUF1736); This domain of unknown function is found in various hypothetical metazoan proteins. Q#27508 - CGI_10028046 superfamily 247986 373 477 2.55E-09 56.9978 cl17432 PBPb superfamily C - "Bacterial periplasmic transport systems use membrane-bound complexes and substrate-bound, membrane-associated, periplasmic binding proteins (PBPs) to transport a wide variety of substrates, such as, amino acids, peptides, sugars, vitamins and inorganic ions. PBPs have two cell-membrane translocation functions: bind substrate, and interact with the membrane bound complex. A diverse group of periplasmic transport receptors for lysine/arginine/ornithine (LAO), glutamine, histidine, sulfate, phosphate, molybdate, and methanol are included in the PBPb CD." 
Q#27508 - CGI_10028046 superfamily 197504 599 758 4.27E-24 99.6712 cl18192 PBPe superfamily - - Eukaryotic homologues of bacterial periplasmic substrate binding proteins; Prokaryotic homologues are represented by a separate alignment: PBPb Q#27508 - CGI_10028046 superfamily 245225 31 205 8.79E-16 78.6924 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily C - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein coupled receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine-binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. 
Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#27509 - CGI_10028047 superfamily 241984 74 358 2.34E-135 390.852 cl00615 Membrane-FADS-like superfamily - - "The membrane fatty acid desaturase (Membrane_FADS)-like CD includes membrane FADSs, alkane hydroxylases, beta carotene ketolases (CrtW-like), hydroxylases (CrtR-like), and other related proteins. They are present in all groups of organisms with the exception of archaea. Membrane FADSs are non-heme, iron-containing, oxygen-dependent enzymes involved in regioselective introduction of double bonds in fatty acyl aliphatic chains. They play an important role in the maintenance of the proper structure and functioning of biological membranes. Alkane hydroxylases are bacterial, integral-membrane di-iron enzymes that share a requirement for iron and oxygen for activity similar to that of membrane FADSs, and are involved in the initial oxidation of inactivated alkanes. Beta-carotene ketolase and beta-carotene hydroxylase are carotenoid biosynthetic enzymes for astaxanthin and zeaxanthin, respectively. This superfamily domain has extensive hydrophobic regions that would be capable of spanning the membrane bilayer at least twice. Comparison of these sequences also reveals three regions of conserved histidine cluster motifs that contain eight histidine residues: HXXX(X)H, HXX(X)HH, and HXXHH (an additional conserved histidine residue is seen between clusters 2 and 3). Spectroscopic and genetic evidence point to a nitrogen-rich coordination environment located in the cytoplasm with as many as eight histidines coordinating the two iron ions and a carboxylate residue bridging the two metals in the Pseudomonas oleovorans alkane hydroxylase (AlkB). 
In addition, the eight histidine residues are reported to be catalytically essential and proposed to be the ligands for the iron atoms contained within the rat stearoyl CoA delta-9 desaturase." Q#27510 - CGI_10028048 superfamily 247803 323 513 1.43E-45 162.049 cl17249 YlqF_related_GTPase superfamily - - "Circularly permuted YlqF-related GTPases; These proteins are found in bacteria, eukaryotes, and archaea. They all exhibit a circular permutation of the GTPase signature motifs so that the order of the conserved G box motifs is G4-G5-G1-G2-G3, with G4 and G5 being permuted from the C-terminal region of proteins in the Ras superfamily to the N-terminus of YlqF-related GTPases." Q#27510 - CGI_10028048 superfamily 242415 4 172 1.26E-41 152.383 cl01287 AE_Prim_S_like superfamily N - "AE_Prim_S_like: primase domain similar to that found in the small subunit of archaeal and eukaryotic (A/E) DNA primases. The replication machineries of A/Es are distinct from that of bacteria. Primases are DNA-dependent RNA polymerases which synthesis the short RNA primers required for DNA replication. In eukaryotes, this small catalytically active primase subunit (p50) and a larger primase subunit (p60), referred to jointly as the core primase, associate with the B subunit and the DNA polymerase alpha subunit in a complex, called Pol alpha-pri. In addition to its catalytic role in replication, eukaryotic DNA primase may play a role in coupling replication to DNA damage repair and in checkpoint control during S phase. Pfu41 and Pfu46 comprise the primase complex of the archaea Pyrococcus furiosus; these proteins have sequence identity to the eukaryotic p50 and p60 primase proteins respectively. Pfu41 preferentially uses dNTPs as substrate. Pfu46 regulates the primase activity of Pfu41. Also found in this group is the primase-polymerase (primpol) domain of replicases from archaeal plasmids including the ORF904 protein of pRN1 from Sulfolobus islandicus (pRN1 primpol). 
The pRN1 primpol domain exhibits DNA polymerase and primase activities; a cluster of active site residues (three acidic residues, and a histidine) is required for both these activities. The pRN1 primpol primase activity prefers dNTPs to rNTPs; however incorporation of dNTPs requires rNTP as cofactor. This group also includes the Pol domain of bacterial LigD proteins such as Mycobacterium tuberculosis (Mt)LigD. MtLigD contains an N-terminal Pol domain, a central phosphoesterase module, and a C-terminal ligase domain. LigD Pol plays a role in non-homologous end joining (NHEJ)-mediated repair of DNA double-strand breaks (DSB) in vivo, perhaps by filling in short 5'-overhangs with ribonucleotides; the filled in termini would be sealed by the associated LigD ligase domain. The MtLigD Pol domain is stimulated by manganese, is error-prone, and prefers adding rNTPs to dNTPs in vitro." Q#27511 - CGI_10007854 superfamily 247999 192 243 1.15E-09 54.804 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#27512 - CGI_10007855 superfamily 241675 103 333 1.39E-115 350.778 cl00195 SIR2 superfamily - - "SIR2 superfamily of proteins includes silent information regulator 2 (Sir2) enzymes which catalyze NAD+-dependent protein/histone deacetylation, where the acetyl group from the lysine epsilon-amino group is transferred to the ADP-ribose moiety of NAD+, producing nicotinamide and the novel metabolite O-acetyl-ADP-ribose. Sir2 proteins, also known as sirtuins, are found in all eukaryotes and many archaea and prokaryotes and have been shown to regulate gene silencing, DNA repair, metabolic enzymes, and life span. 
The most-studied function, gene silencing, involves the inactivation of chromosome domains containing key regulatory genes by packaging them into a specialized chromatin structure that is inaccessible to DNA-binding proteins. The oligomerization state of Sir2 appears to be organism-dependent, sometimes occurring as a monomer and sometimes as a multimer. Also included in this superfamily is a group of uncharacterized Sir2-like proteins which lack certain key catalytic residues and conserved zinc binding cysteines." Q#27512 - CGI_10007855 superfamily 241672 493 630 0.000647248 41.5696 cl00192 ribokinase_pfkB_like superfamily C - "ribokinase/pfkB superfamily: Kinases that accept a wide variety of substrates, including carbohydrates and aromatic small molecules, all are phosphorylated at a hydroxyl group. The superfamily includes ribokinase, fructokinase, ketohexokinase, 2-dehydro-3-deoxygluconokinase, 1-phosphofructokinase, the minor 6-phosphofructokinase (PfkB), inosine-guanosine kinase, and adenosine kinase. Even though there is a high degree of structural conservation within this superfamily, their multimerization level varies widely, monomeric (e.g. adenosine kinase), dimeric (e.g. ribokinase), and trimeric (e.g THZ kinase)." Q#27513 - CGI_10007856 superfamily 207794 228 683 6.15E-132 402.826 cl02948 GH20_hexosaminidase superfamily - - "Beta-N-acetylhexosaminidases of glycosyl hydrolase family 20 (GH20) catalyze the removal of beta-1,4-linked N-acetyl-D-hexosamine residues from the non-reducing ends of N-acetyl-beta-D-hexosaminides including N-acetylglucosides and N-acetylgalactosides. These enzymes are broadly distributed in microorganisms, plants and animals, and play roles in various key physiological and pathological processes. These processes include cell structural integrity, energy storage, cellular signaling, fertilization, pathogen defense, viral penetration, the development of carcinomas, inflammatory events and lysosomal storage disorders. 
The GH20 enzymes include the eukaryotic beta-N-acetylhexosaminidases A and B, the bacterial chitobiases, dispersin B, and lacto-N-biosidase. The GH20 hexosaminidases are thought to act via a catalytic mechanism in which the catalytic nucleophile is not provided by the solvent or the enzyme, but by the substrate itself." Q#27513 - CGI_10007856 superfamily 243574 2 91 0.000777845 39.6204 cl03918 CHB_HEX superfamily N - Putative carbohydrate binding domain; This domain represents the N terminal domain in chitobiases and beta-hexosaminidases EC:3.2.1.52. It is composed of a beta sandwich structure that is similar in structure to the cellulose binding domain of cellulase from Cellulomonas fimi. This suggests that this may be a carbohydrate binding domain. Q#27513 - CGI_10007856 superfamily 111707 166 230 0.00223049 37.7796 cl03741 Glyco_hydro_20b superfamily N - "Glycosyl hydrolase family 20, domain 2; This domain has a zincin-like fold." Q#27514 - CGI_10007857 superfamily 207794 323 774 2.90E-60 212.922 cl02948 GH20_hexosaminidase superfamily - - "Beta-N-acetylhexosaminidases of glycosyl hydrolase family 20 (GH20) catalyze the removal of beta-1,4-linked N-acetyl-D-hexosamine residues from the non-reducing ends of N-acetyl-beta-D-hexosaminides including N-acetylglucosides and N-acetylgalactosides. These enzymes are broadly distributed in microorganisms, plants and animals, and play roles in various key physiological and pathological processes. These processes include cell structural integrity, energy storage, cellular signaling, fertilization, pathogen defense, viral penetration, the development of carcinomas, inflammatory events and lysosomal storage disorders. The GH20 enzymes include the eukaryotic beta-N-acetylhexosaminidases A and B, the bacterial chitobiases, dispersin B, and lacto-N-biosidase. 
The GH20 hexosaminidases are thought to act via a catalytic mechanism in which the catalytic nucleophile is not provided by the solvent or the enzyme, but by the substrate itself." Q#27514 - CGI_10007857 superfamily 243574 16 172 2.20E-07 50.406 cl03918 CHB_HEX superfamily - - Putative carbohydrate binding domain; This domain represents the N terminal domain in chitobiases and beta-hexosaminidases EC:3.2.1.52. It is composed of a beta sandwich structure that is similar in structure to the cellulose binding domain of cellulase from Cellulomonas fimi. This suggests that this may be a carbohydrate binding domain. Q#27517 - CGI_10007860 superfamily 198675 1109 1308 2.02E-46 167.437 cl02436 COLFI superfamily - - "Fibrillar collagen C-terminal domain; Found at C-termini of fibrillar collagens: Ephydatia muelleri procollagen EMF1 alpha, vertebrate collagens alpha(1)III, alpha(1)II, alpha(2)V etc." Q#27517 - CGI_10007860 superfamily 248289 35 102 5.36E-07 48.5743 cl17735 VWC superfamily - - von Willebrand factor type C domain; The high cutoff was used to prevent overlap with pfam00094. Q#27518 - CGI_10007861 superfamily 241782 62 290 6.78E-52 178.344 cl00321 AAT_I superfamily N - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. 
The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phosphorylase family (fold type V)." Q#27519 - CGI_10007862 superfamily 198675 1107 1312 4.77E-78 257.189 cl02436 COLFI superfamily - - "Fibrillar collagen C-terminal domain; Found at C-termini of fibrillar collagens: Ephydatia muelleri procollagen EMF1 alpha, vertebrate collagens alpha(1)III, alpha(1)II, alpha(2)V etc." Q#27520 - CGI_10007863 superfamily 241641 82 144 2.13E-13 65.1777 cl00150 TY superfamily - - Thyroglobulin type I repeats.; The N-terminal region of human thyroglobulin contains 11 type-1 repeats TY repeats are proposed to be inhibitors of cysteine proteases Q#27521 - CGI_10028317 superfamily 241600 12 223 6.45E-98 286.829 cl00085 FReD superfamily - - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." 
Q#27522 - CGI_10028318 superfamily 247065 49 141 8.28E-18 75.459 cl15777 GGCT_like superfamily - - "GGCT-like domains, also called AIG2-like family. Gamma-glutamyl cyclotransferase (GGCT) catalyzes the formation of pyroglutamic acid (5-oxoproline) from dipeptides containing gamma-glutamyl, and is a dimeric protein. In Homo sapiens, the protein is encoded by the gene C7orf24, and the enzyme participates in the gamma-glutamyl cycle. Hereditary defects in the gamma-glutamyl cycle have been described for some of the genes involved, but not for C7orf24. The synthesis and metabolism of glutathione (L-gamma-glutamyl-L-cysteinylglycine) ties the gamma-glutamyl cycle to numerous cellular processes; glutathione acts as a ubiquitous reducing agent in reductive mechanisms involved in protein and DNA synthesis, transport processes, enzyme activity, and metabolism. AIG2 (avrRpt2-induced gene) is an Arabidopsis protein that exhibits RPS2- and avrRpt2-dependent induction early after infection with Pseudomonas syringae pv maculicola strain ES4326 carrying avrRpt2. avrRpt2 is an avirulence gene that can convert virulent strains of P. syringae to avirulence on Arabidopsis thaliana, soybean, and bean. The family also includes bacterial tellurite-resistance proteins (trgB); tellurium (Te) compounds are used in industrial processes and had been used as antimicrobial agents in the past. Some members have been described as proteins involved in cation transport (chaC)." Q#27524 - CGI_10028320 superfamily 207627 1095 1183 4.08E-10 59.5707 cl02522 Calx-beta superfamily - - Calx-beta domain; Calx-beta domain. Q#27524 - CGI_10028320 superfamily 207627 1335 1427 4.20E-10 59.5707 cl02522 Calx-beta superfamily - - Calx-beta domain; Calx-beta domain. Q#27524 - CGI_10028320 superfamily 207627 1576 1668 1.23E-09 58.0347 cl02522 Calx-beta superfamily - - Calx-beta domain; Calx-beta domain. 
Q#27524 - CGI_10028320 superfamily 207627 1217 1308 2.20E-09 57.2595 cl02522 Calx-beta superfamily - - Calx-beta domain; Calx-beta domain. Q#27524 - CGI_10028320 superfamily 207627 1452 1546 8.06E-09 55.7187 cl02522 Calx-beta superfamily - - Calx-beta domain; Calx-beta domain. Q#27526 - CGI_10028323 superfamily 245213 42 68 0.00196048 33.7642 cl09941 EGF_CA superfamily N - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#27529 - CGI_10028326 superfamily 241574 23 160 1.26E-53 168.941 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#27530 - CGI_10028327 superfamily 241574 3 140 3.98E-47 151.607 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. 
The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#27531 - CGI_10028328 superfamily 241574 61 188 1.32E-55 175.49 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." Q#27532 - CGI_10028329 superfamily 241574 11 146 6.03E-47 150.837 cl00053 PTPc superfamily - - "Protein tyrosine phosphatases (PTP) catalyze the dephosphorylation of phosphotyrosine peptides; they regulate phosphotyrosine levels in signal transduction pathways. The depth of the active site cleft renders the enzyme specific for phosphorylated Tyr (pTyr) residues, instead of pSer or pThr. This family has a distinctive active site signature motif, HCSAGxGRxG. Characterized as either transmembrane, receptor-like or non-transmembrane (soluble) PTPs. Receptor-like PTP domains tend to occur in two copies in the cytoplasmic region of the transmembrane proteins, only one copy may be active." 
Q#27533 - CGI_10028330 superfamily 221887 16 115 4.89E-21 85.2586 cl15229 ING superfamily - - "Inhibitor of growth proteins N-terminal histone-binding; Histones undergo numerous post-translational modifications, including acetylation and methylation, at residues which are then probable docking sites for various chromatin remodelling complexes. Inhibitor of growth proteins (INGs) specifically bind to residues that have been thus modified. INGs carry a well-characterized C-terminal PHD-type zinc-finger domain, binding with lysine 4-tri-methylated histone H3 (H3K4me3), as well as this N-terminal domain that binds unmodified H3 tails. Although these two regions can bind histones independently, together they increase the apparent association of the ING for the H3 tail." Q#27534 - CGI_10028331 superfamily 206050 241 331 0.000566768 38.024 cl16449 KIAA1430 superfamily - - KIAA1430 homologue; This is a family of KIAA1430 homologues. The function is not known. Q#27535 - CGI_10028332 superfamily 241597 13 57 0.000231838 39.9742 cl00082 HMG-box superfamily C - "High Mobility Group (HMG)-box is found in a variety of eukaryotic chromosomal proteins and transcription factors. HMGs bind to the minor groove of DNA and have been classified by DNA binding preferences. Two phylogenically distinct groups of Class I proteins bind DNA in a sequence specific fashion and contain a single HMG box. One group (SOX-TCF) includes transcription factors, TCF-1, -3, -4; and also SRY and LEF-1, which bind four-way DNA junctions and duplex DNA targets. The second group (MATA) includes fungal mating type gene products MC, MATA1 and Ste11. Class II and III proteins (HMGB-UBF) bind DNA in a non-sequence specific fashion and contain two or more tandem HMG boxes. 
Class II members include non-histone chromosomal proteins, HMG1 and HMG2, which bind to bent or distorted DNA such as four-way DNA junctions, synthetic DNA cruciforms, kinked cisplatin-modified DNA, DNA bulges, cross-overs in supercoiled DNA, and can cause looping of linear DNA. Class III members include nucleolar and mitochondrial transcription factors, UBF and mtTF1, which bind four-way DNA junctions." Q#27536 - CGI_10028333 superfamily 147490 470 526 1.49E-09 57.2172 cl09572 DUF719 superfamily NC - Protein of unknown function (DUF719); This family consists of several eukaryotic proteins of unknown function. Q#27537 - CGI_10028334 superfamily 197732 81 112 2.23E-08 47.6323 cl18195 ZnF_U1 superfamily - - "U1-like zinc finger; Family of C2H2-type zinc fingers, present in matrin, U1 small nuclear ribonucleoprotein C and other RNA-binding proteins." Q#27538 - CGI_10028335 superfamily 241874 190 622 2.04E-179 524.436 cl00456 SLC5-6-like_sbd superfamily N - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. 
They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#27538 - CGI_10028335 superfamily 241874 25 60 7.05E-18 86.079 cl00456 SLC5-6-like_sbd superfamily C - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#27538 - CGI_10028335 superfamily 241874 132 176 1.25E-15 78.7602 cl00456 SLC5-6-like_sbd superfamily C - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. 
SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#27539 - CGI_10028336 superfamily 241563 17 51 1.75E-07 48.2444 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#27540 - CGI_10028337 superfamily 241874 4 571 0 852.241 cl00456 SLC5-6-like_sbd superfamily - - "Solute carrier families 5 and 6-like; solute binding domain; This superfamily includes the solute-binding domain of SLC5 proteins (also called the sodium/glucose cotransporters or solute sodium symporters), SLC6 proteins (also called the sodium- and chloride-dependent neurotransmitter transporters or Na+/Cl--dependent transporters), and nucleobase-cation-symport-1 (NCS1) transporters. SLC5s co-transport Na+ with sugars, amino acids, inorganic ions or vitamins. SLC6s include Na+/Cl--dependent plasma membrane transporters for the monoamine neurotransmitters serotonin, dopamine, and norepinephrine, and the amino acid neurotransmitters GABA and glycine. NCS1s are essential components of salvage pathways for nucleobases and related metabolites; their known substrates include allantoin, uracil, thiamine, and nicotinamide riboside. Members of this superfamily are important in human physiology and disease. 
They contain a functional core of 10 transmembrane helices (TMs): an inverted structural repeat, TMs1-5 and TMs6-10; TMs numbered to conform to the SLC6 Aquifex aeolicus LeuT." Q#27540 - CGI_10028337 superfamily 241822 663 713 5.09E-14 67.8848 cl00373 Ribosomal_S18 superfamily - - Ribosomal protein S18; Ribosomal protein S18. Q#27542 - CGI_10028339 superfamily 247724 220 466 1.40E-68 225.429 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#27542 - CGI_10028339 superfamily 243185 477 585 4.47E-52 177.344 cl02787 Translation_Factor_II_like superfamily - - "Translation_Factor_II_like: Elongation factor Tu (EF-Tu) domain II-like proteins. Elongation factor Tu consists of three structural domains, this family represents the second domain. 
Domain II adopts a beta barrel structure and is involved in binding to charged tRNA. Domain II is found in other proteins such as elongation factor G and translation initiation factor IF-2. This group also includes the C2 subdomain of domain IV of IF-2 that has the same fold as domain II of (EF-Tu). Like IF-2 from certain prokaryotes such as Thermus thermophilus, mitochondrial IF-2 lacks domain II, which is thought to be involved in binding of E.coli IF-2 to 30S subunits." Q#27542 - CGI_10028339 superfamily 221359 582 688 1.70E-17 79.7975 cl13429 IF-2 superfamily - - "Translation-initiation factor 2; IF-2 is a translation initiator in each of the three main phylogenetic domains (Eukaryotes, Bacteria and Archaea). IF2 interacts with formylmethionine-tRNA, GTP, IF1, IF3 and both ribosomal subunits. Through these interactions, IF2 promotes the binding of the initiator tRNA to the A site in the smaller ribosomal subunit and catalyzes the hydrolysis of GTP following initiation-complex formation." Q#27543 - CGI_10028340 superfamily 147416 108 216 1.36E-12 61.6408 cl04988 Sprouty superfamily - - "Sprouty protein (Spry); This family consists of eukaryotic Sprouty protein homologues. Sprouty proteins have been revealed as inhibitors of the Ras/mitogen-activated protein kinase (MAPK) cascade, a pathway crucial for developmental processes initiated by activation of various receptor tyrosine kinases. The sprouty gene has been found to be expressed in the brain, cochlea, nasal organs, teeth, salivary gland, lungs, digestive tract, kidneys and limb buds in mice." Q#27546 - CGI_10028343 superfamily 241782 6 363 0 585.027 cl00321 AAT_I superfamily - - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. 
PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phosphorylase family (fold type V)." Q#27547 - CGI_10028344 superfamily 247856 84 143 5.87E-09 48.6981 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#27547 - CGI_10028344 superfamily 247856 26 73 0.0070834 32.1345 cl17302 EFh superfamily N - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. 
Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#27548 - CGI_10028345 superfamily 215754 146 233 1.24E-26 101.561 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#27548 - CGI_10028345 superfamily 215754 50 143 6.76E-23 91.1608 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#27548 - CGI_10028345 superfamily 215754 252 335 5.32E-21 85.768 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#27550 - CGI_10028347 superfamily 243035 25 66 3.36E-06 42.7302 cl02432 CLECT superfamily NC - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. 
CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#27551 - CGI_10028348 superfamily 247916 533 583 7.20E-07 48.1479 cl17362 Transglut_core superfamily N - "Transglutaminase-like superfamily; This family includes animal transglutaminases and other bacterial proteins of unknown function. Sequence conservation in this superfamily primarily involves three motifs that centre around conserved cysteine, histidine, and aspartate residues that form the catalytic triad in the structurally characterized transglutaminase, the human blood clotting factor XIIIa'. 
On the basis of the experimentally demonstrated activity of the Methanobacterium phage pseudomurein endoisopeptidase, it is proposed that many, if not all, microbial homologues of the transglutaminases are proteases and that the eukaryotic transglutaminases have evolved from an ancestral protease." Q#27552 - CGI_10028349 superfamily 215647 39 136 0.000494966 38.7437 cl18338 7tm_2 superfamily C - "7 transmembrane receptor (Secretin family); This family is known as Family B, the secretin-receptor family or family 2 of the G-protein-coupled receptors (GCPRs).They have been described in many animal species, but not in plants, fungi or prokaryotes. Three distinct sub-families are recognised. Subfamily B1 contains classical hormone receptors, such as receptors for secretin and glucagon, that are all involved in cAMP-mediated signalling pathways. Subfamily B2 contains receptors with long extracellular N-termini, such as the leukocyte cell-surface antigen CD97 ; calcium-independent receptors for latrotoxin, and brain-specific angiogenesis inhibitors amongst others. Subfamily B3 includes Methuselah and other Drosophila proteins. Other than the typical seven-transmembrane region, characteristic structural features include an amino-terminal extracellular domain involved in ligand binding, and an intracellular loop (IC3) required for specific G-protein coupling." Q#27558 - CGI_10028355 superfamily 243050 21 75 2.53E-34 122.068 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. 
LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#27558 - CGI_10028355 superfamily 243050 83 137 3.19E-31 113.683 cl02475 LIM superfamily - - "LIM is a small protein-protein interaction domain, containing two zinc fingers; LIM domains are identified in a diverse group of proteins with wide variety of biological functions, including gene expression regulation, cell fate determination, cytoskeleton organization, tumor formation and development. LIM domains function as adaptors or scaffolds to support the assembly of multimeric protein complexes. They perform their functions through interactions with other protein partners. LIM domains are 50-60 amino acids in size and share two characteristic highly conserved zinc finger motifs. The two zinc fingers contain eight conserved residues, mostly cysteines and histidines, which coordinately bond to two zinc atoms. The consensus sequence of LIM domain has been defined as C-x(2)-C-x(16,23)-H-x(2)-[CH]-x(2)-C-x(2)-C-x(16,21)-C-x(2,3)-[CHD] (where X denotes any amino acid)." Q#27558 - CGI_10028355 superfamily 241599 216 273 7.43E-16 71.5056 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#27563 - CGI_10028361 superfamily 217293 29 237 1.64E-60 198.626 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. 
This domain forms a pentameric arrangement in the known structure. Q#27563 - CGI_10028361 superfamily 202474 245 448 3.71E-07 49.5745 cl08379 Neur_chan_memb superfamily - - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#27566 - CGI_10028365 superfamily 241573 149 379 1.56E-11 66.581 cl00051 CysPc superfamily C - "Calpains, domains IIa, IIb; calcium-dependent cytoplasmic cysteine proteinases, papain-like. Functions in cytoskeletal remodeling processes, cell differentiation, apoptosis and signal transduction." Q#27566 - CGI_10028365 superfamily 241749 1133 1218 5.48E-08 53.5437 cl00280 globin_like superfamily N - superfamily containing globins and truncated hemoglobins Q#27566 - CGI_10028365 superfamily 241749 1337 1378 0.00148389 39.232 cl00280 globin_like superfamily C - superfamily containing globins and truncated hemoglobins Q#27569 - CGI_10028368 superfamily 247723 237 319 1.35E-40 142.415 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." 
Q#27569 - CGI_10028368 superfamily 247723 158 234 1.23E-38 136.921 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#27569 - CGI_10028368 superfamily 247723 333 404 3.16E-35 127.741 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. 
The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#27570 - CGI_10028369 superfamily 222269 52 271 1.01E-48 166.73 cl18657 Cupin_8 superfamily - - Cupin-like domain; This cupin like domain shares similarity to the JmjC domain. Q#27571 - CGI_10028370 superfamily 217533 30 177 6.15E-71 221.258 cl04048 Ist1 superfamily - - "Regulator of Vps4 activity in the MVB pathway; ESCRT-I, -II, and -III are endosomal sorting complexes required for transporting proteins and carry out cargo sorting and vesicle formation in the multivesicular bodies, MVBs, pathway. These complexes are transiently recruited from the cytoplasm to the endosomal membrane where they bind transmembrane proteins previously marked for degradation by mono-ubiquitination. Assembly of ESCRT-III, a complex composed of at least four subunits (Vps2, Vps24, Vps20, Snf7), is intimately linked with MVB vesicle formation, its disassembly being an essential step in the MVB vesicle formation, a reaction that is carried out by Vps4, an AAA-type ATPase. The family Ist1 is a regulator of Vps4 activity; by interacting with Did2 and Vps4, Ist1 appears to regulate the recruitment and oligomerisation of Vps4. Together Ist1, Did2, and Vta1 form a network of interconnected regulatory proteins that modulate Vps4 activity, thereby regulating the flow of cargo through the MVB pathway." Q#27576 - CGI_10028375 superfamily 217293 1 150 1.44E-29 113.497 cl03788 Neur_chan_LBD superfamily N - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. 
Q#27576 - CGI_10028375 superfamily 202474 175 263 0.000222105 40.7149 cl08379 Neur_chan_memb superfamily C - Neurotransmitter-gated ion-channel transmembrane region; This family includes the four transmembrane helices that form the ion channel. Q#27578 - CGI_10028377 superfamily 217293 72 246 1.74E-36 133.912 cl03788 Neur_chan_LBD superfamily - - Neurotransmitter-gated ion-channel ligand binding domain; This family is the extracellular ligand binding domain of these ion channels. This domain forms a pentameric arrangement in the known structure. Q#27580 - CGI_10028379 superfamily 247101 20 171 2.31E-15 71.4425 cl15849 Palm_thioest superfamily N - Palmitoyl protein thioesterase; Palmitoyl protein thioesterase. Q#27585 - CGI_10028384 superfamily 243519 27 502 0 641.479 cl03757 phosphohexomutase superfamily - - "The alpha-D-phosphohexomutase superfamily includes several related enzymes that catalyze a reversible intramolecular phosphoryl transfer on their sugar substrates. Members of this family include the phosphoglucomutases (PGM1 and PGM2), phosphoglucosamine mutase (PNGM), phosphoacetylglucosamine mutase (PAGM), the bacterial phosphomannomutase ManB, the bacterial phosphoglucosamine mutase GlmM, and the bifunctional phosphomannomutase/phosphoglucomutase (PMM/PGM). These enzymes play important and diverse roles in carbohydrate metabolism in organisms from bacteria to humans. Each of these enzymes has four domains with a centrally located active site formed by four loops, one from each domain. All four domains are included in this alignment model." Q#27586 - CGI_10028385 superfamily 248458 52 252 5.28E-07 49.6197 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. 
MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#27588 - CGI_10028387 superfamily 243061 396 496 2.23E-37 134.007 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#27588 - CGI_10028387 superfamily 243061 260 361 3.58E-37 133.236 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. 
These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#27588 - CGI_10028387 superfamily 243061 122 222 4.54E-36 130.155 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#27588 - CGI_10028387 superfamily 243061 12 113 2.57E-34 125.532 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#27590 - CGI_10028389 superfamily 245208 10 636 0 659.791 cl09933 ACAD superfamily - - "Acyl-CoA dehydrogenase; Both mitochondrial acyl-CoA dehydrogenases (ACAD) and peroxisomal acyl-CoA oxidases (AXO) catalyze the alpha,beta dehydrogenation of the corresponding trans-enoyl-CoA by FAD, which becomes reduced. The reduced form of ACAD is reoxidized in the oxidative half-reaction by electron-transferring flavoprotein (ETF), from which the electrons are transferred to the mitochondrial respiratory chain coupled with ATP synthesis. In contrast, AXO catalyzes a different oxidative half-reaction, in which the reduced FAD is reoxidized by molecular oxygen. The ACAD family includes the eukaryotic beta-oxidation enzymes, short (SCAD), medium (MCAD), long (LCAD) and very-long (VLCAD) chain acyl-CoA dehydrogenases. These enzymes all share high sequence similarity, but differ in their substrate specificities. The ACAD family also includes amino acid catabolism enzymes such as Isovaleryl-CoA dehydrogenase (IVD), short/branched chain acyl-CoA dehydrogenases(SBCAD), Isobutyryl-CoA dehydrogenase (IBDH), glutaryl-CoA deydrogenase (GCD) and Crotonobetainyl-CoA dehydrogenase. The mitochondrial ACAD's are generally homotetramers, except for VLCAD, which is a homodimer. 
Related enzymes include the SOS adaptive response protein aidB, Naphthocyclinone hydroxylase (NcnH), and Dibenzothiophene (DBT) desulfurization enzyme C (DszC)" Q#27591 - CGI_10028390 superfamily 241867 19 176 2.93E-23 93.387 cl00446 Lactamase_B superfamily - - Metallo-beta-lactamase superfamily; Metallo-beta-lactamase superfamily. Q#27595 - CGI_10028394 superfamily 243066 24 122 1.53E-11 61.8645 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#27595 - CGI_10028394 superfamily 222150 623 648 0.00917018 34.6749 cl16282 zf-H2C2_2 superfamily - - Zinc-finger double domain; Zinc-finger double domain. Q#27596 - CGI_10028395 superfamily 241782 20 388 1.37E-81 267.285 cl00321 AAT_I superfamily - - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. 
PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phophorylase family (fold type V)." Q#27596 - CGI_10028395 superfamily 241782 386 766 4.93E-66 224.528 cl00321 AAT_I superfamily - - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. 
Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phophorylase family (fold type V)." Q#27597 - CGI_10028396 superfamily 247723 13 85 1.20E-42 140.833 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." 
Q#27598 - CGI_10028397 superfamily 245206 18 230 4.20E-81 245.282 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#27599 - CGI_10028398 superfamily 248054 6 220 1.79E-18 80.4243 cl17500 NAD_binding_8 superfamily - - NAD(P)-binding Rossmann-like domain; NAD(P)-binding Rossmann-like domain. Q#27601 - CGI_10028401 superfamily 191182 290 391 2.81E-29 110.854 cl04917 Nsp1_C superfamily - - Nsp1-like C-terminal region; This family probably forms a coiled-coil. This important region of Nsp1 is involved in binding Nup82. Q#27601 - CGI_10028401 superfamily 222274 36 111 4.64E-07 47.866 cl18658 Nucleoporin_FG superfamily N - "Nucleoporin FG repeat region; This family includes a number of FG repeats that are found in nucleoporin proteins. This family includes the yeast nucleoporins Nup116, Nup100, Nup49, Nup57 and Nup 145." 
Q#27603 - CGI_10028403 superfamily 247792 8 54 3.01E-07 46.67 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine patterns; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#27605 - CGI_10028405 superfamily 191243 1 24 2.68E-05 38.1923 cl05016 zf-U11-48K superfamily - - U11-48K-like CHHC zinc finger; This zinc binding domain has four conserved zinc chelating residues in a CHHC pattern. This domain is predicted to have an RNA-binding function. Q#27609 - CGI_10028409 superfamily 221442 473 543 3.18E-07 48.6994 cl18607 Hydrolase_4 superfamily - - "Putative lysophospholipase; This domain is found in bacteria and eukaryotes and is approximately 110 amino acids in length. It is found in association with pfam00561. Many members are annotated as being lysophospholipases, and others as alpha-beta hydrolase fold-containing proteins." Q#27611 - CGI_10028411 superfamily 244539 1549 1730 2.77E-48 173.645 cl06868 FNR_like superfamily - - "Ferredoxin reductase (FNR), an FAD and NAD(P) binding protein, was initially identified as a chloroplast reductase activity, catalyzing the electron transfer from reduced iron-sulfur protein ferredoxin to NADP+ as the final step in the electron transport mechanism of photosystem I. FNR transfers electrons from reduced ferredoxin to FAD (forming FADH2 via a semiquinone intermediate) and then transfers a hydride ion to convert NADP+ to NADPH. 
FNR has since been shown to utilize a variety of electron acceptors and donors and has a variety of physiological functions including nitrogen assimilation, dinitrogen fixation, steroid hydroxylation, fatty acid metabolism, oxygenase activity, and methane assimilation in many organisms. FNR has an NAD(P)-binding sub-domain of the alpha/beta class and a discrete (usually N-terminal) flavin sub-domain which vary in orientation with respect to the NAD(P) binding domain. The N-terminal moeity may contain a flavin prosthetic group (as in flavoenzymes) or use flavin as a substrate. Because flavins such as FAD can exist in oxidized, semiquinone (one- electron reduced), or fully reduced hydroquinone forms, FNR can interact with one and 2 electron carriers. FNR has a strong preference for NADP(H) vs NAD(H)." Q#27611 - CGI_10028411 superfamily 247856 1120 1180 1.93E-07 50.6241 cl17302 EFh superfamily - - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#27611 - CGI_10028411 superfamily 246664 583 923 2.71E-110 363.928 cl14561 An_peroxidase_like superfamily C - "Animal heme peroxidases and related proteins; A diverse family of enzymes, which includes prostaglandin G/H synthase, thyroid peroxidase, myeloperoxidase, linoleate diol synthase, lactoperoxidase, peroxinectin, peroxidasin, and others. Despite its name, this family is not restricted to metazoans: members are found in fungi, plants, and bacteria as well." Q#27611 - CGI_10028411 superfamily 109874 63 189 1.89E-17 81.957 cl02980 Stathmin superfamily - - Stathmin family; The Stathmin family of proteins play an important role in the regulation of the microtubule cytoskeleton. 
They regulate microtubule dynamics by promoting depolymerization of microtubules and/or preventing polymerisation of tubulin heterodimers. Q#27611 - CGI_10028411 superfamily 109874 196 269 4.15E-08 53.4522 cl02980 Stathmin superfamily N - Stathmin family; The Stathmin family of proteins play an important role in the regulation of the microtubule cytoskeleton. They regulate microtubule dynamics by promoting depolymerization of microtubules and/or preventing polymerisation of tubulin heterodimers. Q#27611 - CGI_10028411 superfamily 242267 1375 1508 0.000130861 42.2772 cl01043 Ferric_reduct superfamily - - "Ferric reductase like transmembrane component; This family includes a common region in the transmembrane proteins mammalian cytochrome B-245 heavy chain (gp91-phox), ferric reductase transmembrane component in yeast and respiratory burst oxidase from mouse-ear cress. This may be a family of flavocytochromes capable of moving electrons across the plasma membrane. The Frp1 protein from S. pombe is a ferric reductase component and is required for cell surface ferric reductase activity, mutants in frp1 are deficient in ferric iron uptake. Cytochrome B-245 heavy chain is a FAD-dependent dehydrogenase it is also has electron transferase activity which reduces molecular oxygen to superoxide anion, a precursor in the production of microbicidal oxidants. Mutations in the sequence of cytochrome B-245 heavy chain (gp91-phox) lead to the X-linked chronic granulomatous disease. The bacteriocidal ability of phagocytic cells is reduced and is characterized by the absence of a functional plasma membrane associated NADPH oxidase. The chronic granulomatous disease gene codes for the beta chain of cytochrome B-245 and cytochrome B-245 is missing from patients with the disease." 
Q#27612 - CGI_10028412 superfamily 219153 117 282 9.72E-59 199.504 cl15854 DEAD_2 superfamily - - "DEAD_2; This represents a conserved region within a number of RAD3-like DNA-binding helicases that are seemingly ubiquitous - members include proteins of eukaryotic, bacterial and archaeal origin. RAD3 is involved in nucleotide excision repair, and forms part of the transcription factor TFIIH in yeast." Q#27612 - CGI_10028412 superfamily 248014 558 716 1.29E-49 173.907 cl17460 Csf4_U superfamily - - CRISPR/Cas system-associated DinG family helicase Csf4; CRISPR (Clustered Regularly Interspaced Short Palindromic Repeats) and associated Cas proteins comprise a system for heritable host defense by prokaryotic cells against phage and other foreign DNA; DinG family DNA helicase Q#27613 - CGI_10028413 superfamily 246723 108 611 0 648.855 cl14813 GluZincin superfamily - - "Peptidase Gluzincin family (thermolysin-like proteinases, TLPs) includes peptidases M1, M2, M3, M4, M13, M32 and M36 (fungalysins); Gluzincin family (thermolysin-like peptidases or TLPs) includes several zinc-dependent metallopeptidases such as the M1, M2, M3, M4, M13, M32, M36 peptidases (MEROPS classification), and contain HEXXH and EXXXD motifs as part of their active site. All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. M1 family includes aminopeptidase N (APN) and leukotriene A4 hydrolase (LTA4H). APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types. LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity such that the two activities occupy different, but overlapping sites. The peptidase M3 or neurolysin-like family, includes M3, M2 and M32 metallopeptidases. 
The M3 peptidases have two subfamilies: M3A, includes thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (3.4.24.16), and the mitochondrial intermediate peptidase; M3B contains oligopeptidase F. M2 peptidase angiotensin converting enzyme (ACE, EC 3.4.15.1) catalyzes the conversion of decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II. ACE is a key part of the renin-angiotensin system that regulates blood pressure, thus ACE inhibitors are important for the treatment of hypertension. M32 family includes two eukaryotic enzymes from protozoa Trypanosoma cruzi, a causative agent of Chagas' disease, and Leishmania major, a parasite that causes leishmaniasis, making them attractive targets for drug development. The M4 family includes secreted protease thermolysin (EC 3.4.24.27), pseudolysin, aureolysin, neutral protease as well as fungalysin and bacillolysin (EC 3.4.24.28) that degrade extracellular proteins and peptides for bacterial nutrition, especially prior to sporulation. Thermolysin is widely used as a nonspecific protease to obtain fragments for peptide sequencing as well as in production of the artificial sweetener aspartame. M13 family includes neprilysin (EC 3.4.24.11) and endothelin-converting enzyme I (ECE-1, EC 3.4.24.71), which fulfill a broad range of physiological roles due to the greater variation in the S2' subsite allowing substrate specificity and are prime therapeutic targets for selective inhibition. Peptidase M36 (fungamysin) family includes endopeptidases from pathogenic fungi. Fungalysin hydrolyzes extracellular matrix proteins such as elastin and keratin. Aspergillus fumigatus causes the pulmonary disease aspergillosis by invading the lungs of immuno-compromised animals and secreting fungalysin that possibly breaks down proteinaceous structural barriers." 
Q#27616 - CGI_10028416 superfamily 199166 308 420 1.02E-07 51.9444 cl15308 AMN1 superfamily N - "Antagonist of mitotic exit network protein 1; Amn1 has been functionally characterized in Saccharomyces cerevisiae as a component of the Antagonist of MEN pathway (AMEN). The AMEN network is activated by MEN (mitotic exit network) via an active Cdc14, and in turn switches off MEN. Amn1 constitutes one of the alternative mechanisms by which MEN may be disrupted. Specifically, Amn1 binds Tem1 (Termination of M-phase, a GTPase that belongs to the RAS superfamily), and disrupts its association with Cdc15, the primary downstream target. Amn1 is a leucine-rich repeat (LRR) protein, with 12 repeats in the S. cerevisiae ortholog. As a negative regulator of the signal transduction pathway MEN, overexpression of AMN1 slows the growth of wild type cells. The function of the vertebrate members of this family has not been determined experimentally, they have fewer LRRs that determine the extent of this model." Q#27616 - CGI_10028416 superfamily 199166 432 500 0.00570795 37.3068 cl15308 AMN1 superfamily C - "Antagonist of mitotic exit network protein 1; Amn1 has been functionally characterized in Saccharomyces cerevisiae as a component of the Antagonist of MEN pathway (AMEN). The AMEN network is activated by MEN (mitotic exit network) via an active Cdc14, and in turn switches off MEN. Amn1 constitutes one of the alternative mechanisms by which MEN may be disrupted. Specifically, Amn1 binds Tem1 (Termination of M-phase, a GTPase that belongs to the RAS superfamily), and disrupts its association with Cdc15, the primary downstream target. Amn1 is a leucine-rich repeat (LRR) protein, with 12 repeats in the S. cerevisiae ortholog. As a negative regulator of the signal transduction pathway MEN, overexpression of AMN1 slows the growth of wild type cells. 
The function of the vertebrate members of this family has not been determined experimentally, they have fewer LRRs that determine the extent of this model." Q#27620 - CGI_10012549 superfamily 248305 63 207 4.46E-13 69.6888 cl17751 Glyco_transf_22 superfamily C - Alg9-like mannosyltransferase family; Members of this family are mannosyltransferase enzymes. At least some members are localised in endoplasmic reticulum and involved in GPI anchor biosynthesis. Q#27621 - CGI_10012550 superfamily 241832 89 274 8.55E-74 229.546 cl00388 Thioredoxin_like superfamily - - "Protein Disulfide Oxidoreductases and Other Proteins with a Thioredoxin fold; The thioredoxin (TRX)-like superfamily is a large, diverse group of proteins containing a TRX fold. Many members contain a classic TRX domain with a redox active CXXC motif. They function as protein disulfide oxidoreductases (PDOs), altering the redox state of target proteins via the reversible oxidation of their active site dithiol. The PDO members of this superfamily include the families of TRX, protein disulfide isomerase (PDI), tlpA, glutaredoxin, NrdH redoxin, and bacterial Dsb proteins (DsbA, DsbC, DsbG, DsbE, DsbDgamma). Members of the superfamily that do not function as PDOs but contain a TRX-fold domain include phosducins, peroxiredoxins, glutathione (GSH) peroxidases, SCO proteins, GSH transferases (GST, N-terminal domain), arsenic reductases, TRX-like ferredoxins and calsequestrin, among others." Q#27623 - CGI_10012552 superfamily 241567 36 120 3.09E-22 90.7606 cl00042 CASc superfamily C - "Caspase, interleukin-1 beta converting enzyme (ICE) homologues; Cysteine-dependent aspartate-directed proteases that mediate programmed cell death (apoptosis). Caspases are synthesized as inactive zymogens and activated by proteolysis of the peptide backbone adjacent to an aspartate. The resulting two subunits associate to form an (alpha)2(beta)2-tetramer which is the active enzyme. 
Activation of caspases can be mediated by other caspase homologs." Q#27623 - CGI_10012552 superfamily 241567 128 197 2.27E-21 88.4263 cl00042 CASc superfamily N - "Caspase, interleukin-1 beta converting enzyme (ICE) homologues; Cysteine-dependent aspartate-directed proteases that mediate programmed cell death (apoptosis). Caspases are synthesized as inactive zymogens and activated by proteolysis of the peptide backbone adjacent to an aspartate. The resulting two subunits associate to form an (alpha)2(beta)2-tetramer which is the active enzyme. Activation of caspases can be mediated by other caspase homologs." Q#27624 - CGI_10012553 superfamily 241567 286 417 2.50E-34 128.487 cl00042 CASc superfamily N - "Caspase, interleukin-1 beta converting enzyme (ICE) homologues; Cysteine-dependent aspartate-directed proteases that mediate programmed cell death (apoptosis). Caspases are synthesized as inactive zymogens and activated by proteolysis of the peptide backbone adjacent to an aspartate. The resulting two subunits associate to form an (alpha)2(beta)2-tetramer which is the active enzyme. Activation of caspases can be mediated by other caspase homologs." Q#27624 - CGI_10012553 superfamily 246680 1 89 3.38E-12 62.4436 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. 
They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#27626 - CGI_10012555 superfamily 242880 5 232 1.98E-101 297.18 cl02098 14-3-3 superfamily - - "14-3-3 domain; 14-3-3 domain is an essential part of 14-3-3 proteins, a ubiquitous class of regulatory, phosphoserine/threonine-binding proteins found in all eukaryotic cells, including yeast, protozoa and mammalian cells. 14-3-3 proteins play important roles in many biological processes that are regulated by phosphorylation, including cell cycle regulation, cell proliferation, protein trafficking, metabolic regulation and apoptosis. More than 300 binding partners of the 14-3-3 domain have been identified in all subcellular compartments and include transcription factors, signaling molecules, tumor suppressors, biosynthetic enzymes, cytoskeletal proteins and apoptosis factors. 14-3-3 binding can alter the conformation, localization, stability, phosphorylation state, activity as well as molecular interactions of a target protein. They function only as dimers, some preferring strictly homodimeric interaction, while others form heterodimers. Binding of the 14-3-3 domain to its target occurs in a phosphospecific manner where it binds to one of two consensus sequences of their target proteins; RSXpSXP (mode-1) and RXXXpSXP (mode-2). In some instances, 14-3-3 domain containing proteins are involved in regulation and signaling of a number of cellular processes in phosphorylation-independent manner. Many organisms express multiple isoforms: there are seven mammalian 14-3-3 family members (beta, gamma, eta, theta, epsilon, sigma, zeta), each encoded by a distinct gene, while plants contain up to 13 isoforms. 
The flexible C-terminal segment of 14-3-3 isoforms shows the highest sequence variability and may significantly contribute to individual isoform uniqueness by playing an important regulatory role by occupying the ligand binding groove and blocking the binding of inappropriate ligands in a distinct manner. Elevated amounts of 14-3-3 proteins are found in the cerebrospinal fluid of patients with Creutzfeldt-Jakob disease. In protozoa, like Plasmodium or Cryptosporidium parvum 14-3-3 proteins play an important role in key steps of parasite development." Q#27627 - CGI_10012556 superfamily 242880 5 232 7.55E-103 301.032 cl02098 14-3-3 superfamily - - "14-3-3 domain; 14-3-3 domain is an essential part of 14-3-3 proteins, a ubiquitous class of regulatory, phosphoserine/threonine-binding proteins found in all eukaryotic cells, including yeast, protozoa and mammalian cells. 14-3-3 proteins play important roles in many biological processes that are regulated by phosphorylation, including cell cycle regulation, cell proliferation, protein trafficking, metabolic regulation and apoptosis. More than 300 binding partners of the 14-3-3 domain have been identified in all subcellular compartments and include transcription factors, signaling molecules, tumor suppressors, biosynthetic enzymes, cytoskeletal proteins and apoptosis factors. 14-3-3 binding can alter the conformation, localization, stability, phosphorylation state, activity as well as molecular interactions of a target protein. They function only as dimers, some preferring strictly homodimeric interaction, while others form heterodimers. Binding of the 14-3-3 domain to its target occurs in a phosphospecific manner where it binds to one of two consensus sequences of their target proteins; RSXpSXP (mode-1) and RXXXpSXP (mode-2). In some instances, 14-3-3 domain containing proteins are involved in regulation and signaling of a number of cellular processes in phosphorylation-independent manner. 
Many organisms express multiple isoforms: there are seven mammalian 14-3-3 family members (beta, gamma, eta, theta, epsilon, sigma, zeta), each encoded by a distinct gene, while plants contain up to 13 isoforms. The flexible C-terminal segment of 14-3-3 isoforms shows the highest sequence variability and may significantly contribute to individual isoform uniqueness by playing an important regulatory role by occupying the ligand binding groove and blocking the binding of inappropriate ligands in a distinct manner. Elevated amounts of 14-3-3 proteins are found in the cerebrospinal fluid of patients with Creutzfeldt-Jakob disease. In protozoa, like Plasmodium or Cryptosporidium parvum 14-3-3 proteins play an important role in key steps of parasite development." Q#27628 - CGI_10012557 superfamily 247724 319 593 2.81E-83 271.811 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. 
Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#27629 - CGI_10012558 superfamily 220393 4 197 1.89E-48 161.004 cl10751 Tmem26 superfamily C - "Transmembrane protein 26; The function of this family of transmembrane proteins has not, as yet, been determined." Q#27630 - CGI_10012559 superfamily 241613 39 73 1.75E-06 41.0382 cl00104 LDLa superfamily - - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#27630 - CGI_10012559 superfamily 246918 1 34 0.000117844 36.0255 cl15278 TSP_1 superfamily N - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#27631 - CGI_10012560 superfamily 215754 203 291 3.49E-18 77.2936 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#27631 - CGI_10012560 superfamily 215754 15 87 6.94E-13 63.0412 cl02813 Mito_carr superfamily N - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#27631 - CGI_10012560 superfamily 215754 98 197 5.36E-10 54.952 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. 
Q#27633 - CGI_10012562 superfamily 243061 173 274 7.47E-46 154.807 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#27633 - CGI_10012562 superfamily 243061 65 165 2.84E-38 134.777 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#27633 - CGI_10012562 superfamily 243061 279 379 1.26E-37 132.851 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#27634 - CGI_10012563 superfamily 241554 2 123 1.03E-22 92.7187 cl00019 Macro superfamily N - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#27634 - CGI_10012563 superfamily 241554 203 318 1.42E-08 51.456 cl00019 Macro superfamily - - "Macro domain, a high-affinity ADP-ribose binding module found in a variety of proteins as a stand-alone domain or in combination with other domains like in histone macroH2A and some PARPs (poly ADP-ribose polymerases). 
Some macro domains recognize poly ADP-ribose as a ligand. Previously identified as displaying an Appr-1"-p (ADP-ribose-1"-monophosphate) processing activity, the macro domain may play roles in distinct ADP-ribose pathways, such as the ADP-ribosylation of proteins, an important post-translational modification which occurs in DNA repair, transcription, chromatin biology, and long-term memory formation, among other processes." Q#27635 - CGI_10010099 superfamily 242406 185 287 1.12E-10 57.9865 cl01271 DUF1768 superfamily N - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. Q#27637 - CGI_10010101 superfamily 248012 2 91 4.97E-06 40.3844 cl17458 TIR_2 superfamily N - TIR domain; This is a family of bacterial Toll-like receptors. Q#27638 - CGI_10010102 superfamily 150859 52 92 2.96E-12 56.2993 cl10937 PRCC_Cterm superfamily - - Mitotic checkpoint protein PRCC_Cterm; This is the highly conserved C-terminal domain of the renal papillary carcinoma protein PRCC. The function of this domain is not known. Q#27639 - CGI_10010103 superfamily 244535 27 300 1.17E-08 54.2792 cl06858 DUF1704 superfamily N - Domain of unknown function (DUF1704); This family contains many hypothetical proteins. Q#27640 - CGI_10010104 superfamily 246597 69 174 3.18E-51 170.593 cl13995 MPP_superfamily superfamily C - "metallophosphatase superfamily, metallophosphatase domain; Metallophosphatases (MPPs), also known as metallophosphoesterases, phosphodiesterases (PDEs), binuclear metallophosphoesterases, and dimetal-containing phosphoesterases (DMPs), represent a diverse superfamily of enzymes with a conserved domain containing an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. 
This superfamily includes: the phosphoprotein phosphatases (PPPs), Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination." Q#27640 - CGI_10010104 superfamily 246597 249 333 7.37E-16 73.9079 cl13995 MPP_superfamily superfamily N - "metallophosphatase superfamily, metallophosphatase domain; Metallophosphatases (MPPs), also known as metallophosphoesterases, phosphodiesterases (PDEs), binuclear metallophosphoesterases, and dimetal-containing phosphoesterases (DMPs), represent a diverse superfamily of enzymes with a conserved domain containing an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. This superfamily includes: the phosphoprotein phosphatases (PPPs), Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination." 
Q#27641 - CGI_10010105 superfamily 247905 295 403 2.71E-10 59.5589 cl17351 HELICc superfamily N - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#27641 - CGI_10010105 superfamily 243778 455 547 3.21E-17 79.1903 cl04503 HA2 superfamily - - "Helicase associated domain (HA2); This presumed domain is about 90 amino acid residues in length. It is found in a diverse set of RNA helicases. Its function is unknown, however it seems likely to be involved in nucleic acid binding." Q#27641 - CGI_10010105 superfamily 219532 583 681 0.000183027 41.531 cl06657 OB_NTP_bind superfamily - - "Oligonucleotide/oligosaccharide-binding (OB)-fold; This family is found towards the C-terminus of the DEAD-box helicases (pfam00270). In these helicases it is apparently always found in association with pfam04408. There do seem to be a couple of instances where it occurs by itself - . The structure PDB:3i4u adopts an OB-fold. This C-terminal domain of the yeast helicase contains an oligonucleotide/oligosaccharide-binding (OB)-fold which seems to be placed at the entrance of the putative nucleic acid cavity. It also constitutes the binding site for the G-patch-containing domain of Pfa1p.
When found on DEAH/RHA helicases, this domain is central to the regulation of the helicase activity through its binding of both RNA and G-patch domain proteins." Q#27642 - CGI_10010106 superfamily 243129 227 338 6.81E-26 100.021 cl02653 MA3 superfamily - - "MA3 domain; Domain in DAP-5, eIF4G, MA-3 and other proteins. Highly alpha-helical. May contain repeats and/or regions similar to MIF4G domains." Q#27642 - CGI_10010106 superfamily 243129 67 172 9.79E-20 83.0629 cl02653 MA3 superfamily - - "MA3 domain; Domain in DAP-5, eIF4G, MA-3 and other proteins. Highly alpha-helical. May contain repeats and/or regions similar to MIF4G domains." Q#27644 - CGI_10010108 superfamily 118272 634 740 6.49E-31 118.308 cl10724 DUF2043 superfamily - - Uncharacterized conserved protein (DUF2043); This is a 100 residue conserved region of a family of proteins found from fungi to humans. This region contains three conserved Cysteines and a motif of {CP}{y/l}{HG}. Q#27647 - CGI_10010111 superfamily 110440 322 348 0.0025886 35.4613 cl03211 NHL superfamily - - "NHL repeat; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies. It is about 40 residues long and resembles the WD repeat pfam00400. The repeats have a catalytic activity in bovine PAM (peptidyl-glycine alpha-amidating monooxygenase), proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localised to the repeats. Human E3 ubiquitin-protein ligase TRIM32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats." Q#27649 - CGI_10010113 superfamily 217410 15 81 0.000194322 38.104 cl18409 DDE_1 superfamily N - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition.
This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction. Interestingly this family also includes the CENP-B protein. This domain in that protein appears to have lost the metal binding residues and is unlikely to have endonuclease activity. Centromere Protein B (CENP-B) is a DNA-binding protein localised to the centromere." Q#27651 - CGI_10010115 superfamily 245201 26 277 1.62E-130 386.799 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#27651 - CGI_10010115 superfamily 201217 472 521 1.04E-14 69.4768 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#27651 - CGI_10010115 superfamily 201217 419 469 3.61E-12 62.158 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#27651 - CGI_10010115 superfamily 201217 366 416 9.25E-09 52.528 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. Q#27651 - CGI_10010115 superfamily 201217 524 587 0.000958453 37.5052 cl08266 RCC1 superfamily - - Regulator of chromosome condensation (RCC1) repeat; Regulator of chromosome condensation (RCC1) repeat. 
Q#27653 - CGI_10017787 superfamily 241600 4 85 5.27E-29 104.245 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#27655 - CGI_10017789 superfamily 241958 43 448 4.59E-68 225.087 cl00573 SDF superfamily - - Sodium:dicarboxylate symporter family; Sodium:dicarboxylate symporter family. Q#27657 - CGI_10017791 superfamily 241958 3 93 3.37E-08 50.977 cl00573 SDF superfamily C - Sodium:dicarboxylate symporter family; Sodium:dicarboxylate symporter family. Q#27658 - CGI_10017792 superfamily 243142 84 204 2.43E-10 60.3327 cl02689 RUN superfamily - - "RUN domain; This domain is present in several proteins that are linked to the functions of GTPases in the Rap and Rab families. They could hence play important roles in multiple Ras-like GTPase signalling pathways. The domain comprises six conserved regions, which in some proteins have considerable insertions between them. The domain core is thought to take up a predominantly alpha fold, with basic amino acids in regions A and D possibly playing a functional role in interactions with Ras GTPases."
Q#27659 - CGI_10017793 superfamily 243066 26 127 2.41E-18 80.0424 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#27659 - CGI_10017793 superfamily 198867 138 245 5.91E-10 56.1956 cl06652 BACK superfamily - - "BTB And C-terminal Kelch; This domain is found associated with pfam00651 and pfam01344. The BACK domain is found juxtaposed to the BTB domain; they are separated by as little as two residues. This family appears to be closely related to the BTB domain (Finn RD, personal observation)." Q#27660 - CGI_10017794 superfamily 246597 129 424 1.21E-102 313.441 cl13995 MPP_superfamily superfamily - - "metallophosphatase superfamily, metallophosphatase domain; Metallophosphatases (MPPs), also known as metallophosphoesterases, phosphodiesterases (PDEs), binuclear metallophosphoesterases, and dimetal-containing phosphoesterases (DMPs), represent a diverse superfamily of enzymes with a conserved domain containing an active site consisting of two metal ions (usually manganese, iron, or zinc) coordinated with octahedral geometry by a cage of histidine, aspartate, and asparagine residues. 
This superfamily includes: the phosphoprotein phosphatases (PPPs), Mre11/SbcD-like exonucleases, Dbr1-like RNA lariat debranching enzymes, YfcE-like phosphodiesterases, purple acid phosphatases (PAPs), YbbF-like UDP-2,3-diacylglucosamine hydrolases, and acid sphingomyelinases (ASMases). The conserved domain is a double beta-sheet sandwich with a di-metal active site made up of residues located at the C-terminal side of the sheets. This domain is thought to allow for productive metal coordination." Q#27661 - CGI_10017795 superfamily 238012 427 466 0.0058094 35.4078 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#27662 - CGI_10017796 superfamily 221258 131 280 1.61E-33 121.329 cl13308 DUF3361 superfamily - - Domain of unknown function (DUF3361); This domain is functionally uncharacterized. This domain is found in eukaryotes. This presumed domain is typically between 154 to 168 amino acids in length. Q#27663 - CGI_10017797 superfamily 247725 277 406 2.33E-38 136.282 cl17171 PH-like superfamily - - "Pleckstrin homology-like domain; The PH-like family includes the PH domain, both the Shc-like and IRS-like PTB domains, the ran-binding domain, the EVH1 domain, a domain in neurobeachin and the third domain of FERM. All of these domains have a PH fold, but lack significant sequence similarity. They are generally involved in targeting to protein to the appropriate cellular location or interacting with a binding partner. 
This domain family possesses multiple functions including the ability to bind inositol phosphates and to other proteins." Q#27663 - CGI_10017797 superfamily 218231 35 210 1.20E-37 135.869 cl04708 ELMO_CED12 superfamily - - "ELMO/CED-12 family; This family represents a conserved domain which is found in a number of eukaryotic proteins including CED-12, ELMO I and ELMO II. ELMO1 is a component of signalling pathways that regulate phagocytosis and cell migration and is the mammalian orthologue of the C. elegans gene, ced-12. CED-12 is required for the engulfment of dying cells and cell migration. In mammalian cells, ELMO1 interacts with Dock180 as part of the CrkII/Dock180/Rac pathway responsible for phagocytosis and cell migration. ELMO1 is ubiquitously expressed, although its expression is highest in the spleen, an organ rich in immune cells. ELMO1 has a PH domain and a polyproline sequence motif at its C terminus which are not present in this alignment." Q#27666 - CGI_10017800 superfamily 241600 2 170 6.97E-76 228.279 cl00085 FReD superfamily N - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#27669 - CGI_10017803 superfamily 245814 380 448 0.000354523 39.0095 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. 
The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#27670 - CGI_10017804 superfamily 247065 4 96 8.31E-16 69.2958 cl15777 GGCT_like superfamily - - "GGCT-like domains, also called AIG2-like family. Gamma-glutamyl cyclotransferase (GGCT) catalyzes the formation of pyroglutamic acid (5-oxoproline) from dipeptides containing gamma-glutamyl, and is a dimeric protein. In Homo sapiens, the protein is encoded by the gene C7orf24, and the enzyme participates in the gamma-glutamyl cycle. Hereditary defects in the gamma-glutamyl cycle have been described for some of the genes involved, but not for C7orf24. The synthesis and metabolism of glutathione (L-gamma-glutamyl-L-cysteinylglycine) ties the gamma-glutamyl cycle to numerous cellular processes; glutathione acts as a ubiquitous reducing agent in reductive mechanisms involved in protein and DNA synthesis, transport processes, enzyme activity, and metabolism. AIG2 (avrRpt2-induced gene) is an Arabidopsis protein that exhibits RPS2- and avrRpt2-dependent induction early after infection with Pseudomonas syringae pv maculicola strain ES4326 carrying avrRpt2. avrRpt2 is an avirulence gene that can convert virulent strains of P. syringae to avirulence on Arabidopsis thaliana, soybean, and bean. The family also includes bacterial tellurite-resistance proteins (trgB); tellurium (Te) compounds are used in industrial processes and had been used as antimicrobial agents in the past. Some members have been described proteins involved in cation transport (chaC)." 
Q#27671 - CGI_10017805 superfamily 247724 13 174 1.12E-100 290.867 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#27672 - CGI_10017806 superfamily 220867 111 412 6.11E-37 145.7 cl11338 Med1 superfamily - - "Mediator of RNA polymerase II transcription subunit 1; Mediator complexes are basic necessities for linking transcriptional regulators to RNA polymerase II. This domain, Med1, is conserved from plants to fungi to humans and forms part of the Med9 submodule of the Srb/Med complex. it is one of three subunits essential for viability of the whole organism via its role in environmentally-directed cell-fate decisions. Med1 is part of the tail region of the Mediator complex." 
Q#27673 - CGI_10017807 superfamily 247724 9 172 1.31E-95 277.752 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#27674 - CGI_10017809 superfamily 241596 61 118 1.07E-17 74.1727 cl00081 HLH superfamily - - "Helix-loop-helix domain, found in specific DNA- binding proteins that act as transcription factors; 60-100 amino acids long. 
A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE 5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved in both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#27675 - CGI_10017810 superfamily 241596 50 110 3.02E-17 72.6319 cl00081 HLH superfamily - - "Helix-loop-helix domain, found in specific DNA- binding proteins that act as transcription factors; 60-100 amino acids long. 
A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE 5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved in both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." 
Q#27676 - CGI_10017811 superfamily 247792 22 69 1.07E-06 46.67 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#27676 - CGI_10017811 superfamily 241563 97 132 0.00287416 36.5463 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#27677 - CGI_10017812 superfamily 217473 249 397 1.23E-27 113.614 cl03978 Mab-21 superfamily N - Mab-21 protein; This family contains Mab-21 and Mab-21 like proteins. In C. elegans these proteins are required for several aspects of embryonic development. 
Q#27677 - CGI_10017812 superfamily 243034 709 742 0.00358486 36.2416 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#27678 - CGI_10017813 superfamily 241563 78 119 1.63E-05 42.8516 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." 
Q#27678 - CGI_10017813 superfamily 241563 38 69 0.00752424 34.7624 cl00034 BBOX superfamily N - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#27679 - CGI_10017814 superfamily 242274 13 126 5.79E-05 40.1873 cl01053 SGNH_hydrolase superfamily N - "SGNH_hydrolase, or GDSL_hydrolase, is a diverse family of lipases and esterases. The tertiary fold of the enzyme is substantially different from that of the alpha/beta hydrolase family and unique among all known hydrolases; its active site closely resembles the typical Ser-His-Asp(Glu) triad from other serine hydrolases, but may lack the carboxylic acid." Q#27680 - CGI_10017815 superfamily 216101 27 480 1.02E-134 404.753 cl08288 Carn_acyltransf superfamily - - Choline/Carnitine o-acyltransferase; Choline/Carnitine o-acyltransferase. Q#27682 - CGI_10017817 superfamily 216167 9 115 1.21E-32 121.925 cl02999 DNA_photolyase superfamily C - DNA photolyase; This domain binds a light harvesting cofactor. Q#27687 - CGI_10015570 superfamily 247683 296 353 3.20E-30 111.568 cl17036 SH3 superfamily - - "Src Homology 3 domain superfamily; Src Homology 3 (SH3) domains are protein interaction domains that bind proline-rich ligands with moderate affinity and selectivity, preferentially to PxxP motifs. Thus, they are referred to as proline-recognition domains (PRDs). SH3 domains are less selective and show more diverse specificity compared to other PRDs. They have been shown to bind peptide sequences that lack the PxxP motif; examples include the PxxDY motif of Eps8 and the RKxxYxxY sequence in SKAP55. 
SH3 domain containing proteins play versatile and diverse roles in the cell, including the regulation of enzymes, changing the subcellular localization of signaling pathway components, and mediating the formation of multiprotein complex assemblies, among others. Many members of this superfamily are adaptor proteins that associate with a number of protein partners, facilitating complex formation and signal transduction." Q#27687 - CGI_10015570 superfamily 217886 128 277 2.49E-40 142.754 cl04393 Peroxin-13_N superfamily - - "Peroxin 13, N-terminal region; Both termini of the Peroxin-13 are oriented to the cytosol. Peroxin-13 is required for peroxisomal association of peroxin-14." Q#27688 - CGI_10015571 superfamily 243066 126 225 1.24E-42 145.388 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." Q#27689 - CGI_10015572 superfamily 219619 179 209 2.38E-08 51.4395 cl18518 Ion_trans_2 superfamily NC - Ion channel; This family includes the two membrane helix type ion channels found in bacteria. 
Q#27690 - CGI_10015573 superfamily 247742 5 413 0 741.081 cl17188 enolase_like superfamily - - "Enolase-superfamily, characterized by the presence of an enolate anion intermediate which is generated by abstraction of the alpha-proton of the carboxylate substrate by an active site residue and is stabilized by coordination to the essential Mg2+ ion. Enolase superfamily contains different enzymes, like enolases, glutarate-, fucanate- and galactonate dehydratases, o-succinylbenzoate synthase, N-acylamino acid racemase, L-alanine-DL-glutamate epimerase, mandelate racemase, muconate lactonizing enzyme and 3-methylaspartase." Q#27693 - CGI_10015576 superfamily 241645 108 172 0.00887516 34.3797 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#27694 - CGI_10015577 superfamily 245622 477 612 2.44E-22 94.2134 cl11446 Rhomboid superfamily - - "Rhomboid family; This family contains integral membrane proteins that are related to Drosophila rhomboid protein. Members of this family are found in bacteria and eukaryotes. Rhomboid promotes the cleavage of the membrane-anchored TGF-alpha-like growth factor Spitz, allowing it to activate the Drosophila EGF receptor. 
Analysis has shown that Rhomboid-1 is an intramembrane serine protease (EC:3.4.21.105). Parasite-encoded rhomboid enzymes are also important for invasion of host cells by Toxoplasma and the malaria parasite." Q#27695 - CGI_10015578 superfamily 112128 253 458 7.27E-108 321.742 cl03992 TF_AP-2 superfamily - - Transcription factor AP-2; Transcription factor AP-2. Q#27697 - CGI_10015580 superfamily 248458 56 459 4.83E-32 124.734 cl17904 MFS superfamily - - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. 
Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#27700 - CGI_10015583 superfamily 241603 242 459 4.81E-21 94.3501 cl00089 NUC superfamily - - DNA/RNA non-specific endonuclease; prokaryotic and eukaryotic double- and single-stranded DNA and RNA endonucleases also present in phosphodiesterases. They exists as monomers and homodimers. Q#27700 - CGI_10015583 superfamily 241603 1739 1944 1.12E-18 87.0313 cl00089 NUC superfamily - - DNA/RNA non-specific endonuclease; prokaryotic and eukaryotic double- and single-stranded DNA and RNA endonucleases also present in phosphodiesterases. They exists as monomers and homodimers. Q#27700 - CGI_10015583 superfamily 241603 944 1104 5.74E-12 66.2305 cl00089 NUC superfamily C - DNA/RNA non-specific endonuclease; prokaryotic and eukaryotic double- and single-stranded DNA and RNA endonucleases also present in phosphodiesterases. They exists as monomers and homodimers. Q#27700 - CGI_10015583 superfamily 241603 1120 1226 1.24E-10 61.9933 cl00089 NUC superfamily N - DNA/RNA non-specific endonuclease; prokaryotic and eukaryotic double- and single-stranded DNA and RNA endonucleases also present in phosphodiesterases. They exists as monomers and homodimers. Q#27700 - CGI_10015583 superfamily 131388 504 749 6.86E-10 62.2245 cl17985 hydr_PhnA superfamily C - "phosphonoacetate hydrolase; This family consists of examples of phosphonoacetate hydrolase, an enzyme specific for the cleavage of the C-P bond in phosphonoacetate. Phosphonates are organic compounds with a direct C-P bond that is far less labile that the C-O-P bonds of phosphate attachment sites. Phosphonates may be degraded for phosphorus and energy by broad spectrum C-P lyase encoded by large operon or by specific enzymes for some of the more common phosphonates in nature. 
This family represents an enzyme from the latter category. It may be found encoded near genes for phosphonate transport and for other specific phosphonatases." Q#27700 - CGI_10015583 superfamily 131388 1270 1345 1.61E-06 51.4389 cl17985 hydr_PhnA superfamily C - "phosphonoacetate hydrolase; This family consists of examples of phosphonoacetate hydrolase, an enzyme specific for the cleavage of the C-P bond in phosphonoacetate. Phosphonates are organic compounds with a direct C-P bond that is far less labile than the C-O-P bonds of phosphate attachment sites. Phosphonates may be degraded for phosphorus and energy by broad spectrum C-P lyase encoded by large operon or by specific enzymes for some of the more common phosphonates in nature. This family represents an enzyme from the latter category. It may be found encoded near genes for phosphonate transport and for other specific phosphonatases." Q#27700 - CGI_10015583 superfamily 248017 1407 1481 0.0056047 40.0848 cl17463 iPGM_N superfamily N - "BPG-independent PGAM N-terminus (iPGM_N); This family represents the N-terminal region of the 2,3-bisphosphoglycerate-independent phosphoglycerate mutase (or phosphoglyceromutase or BPG-independent PGAM) protein (EC:5.4.2.1). The family is found in conjunction with pfam01676 (located in the C-terminal region of the protein)." Q#27701 - CGI_10015584 superfamily 220692 53 364 3.41E-11 62.6069 cl18570 7TM_GPCR_Srw superfamily - - Serpentine type 7TM GPCR chemoreceptor Srw; Chemoreception is mediated in Caenorhabditis elegans by members of the seven-transmembrane G-protein-coupled receptor class (7TM GPCRs) of proteins which are of the serpentine type. Srw is a solo family amongst the superfamilies of chemoreceptors. Chemoperception is one of the central senses of soil nematodes like C. elegans which are otherwise 'blind' and 'deaf'. The genes encoding Srw do not appear to be under as strong an adaptive evolutionary pressure as those of Srz. 
Q#27702 - CGI_10015585 superfamily 243092 3 121 4.14E-20 82.768 cl02567 WD40 superfamily NC - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#27703 - CGI_10015586 superfamily 216152 128 403 9.88E-69 226.811 cl02988 Glyco_transf_10 superfamily - - "Glycosyltransferase family 10 (fucosyltransferase); This family of Fucosyltransferases are the enzymes transferring fucose from GDP-Fucose to GlcNAc in an alpha1,3 linkage. This family is known as glycosyltransferase family 10." Q#27704 - CGI_10015587 superfamily 216152 13 369 1.71E-78 249.923 cl02988 Glyco_transf_10 superfamily - - "Glycosyltransferase family 10 (fucosyltransferase); This family of Fucosyltransferases are the enzymes transferring fucose from GDP-Fucose to GlcNAc in an alpha1,3 linkage. This family is known as glycosyltransferase family 10." 
Q#27705 - CGI_10015588 superfamily 216152 3 148 1.09E-36 132.823 cl02988 Glyco_transf_10 superfamily N - "Glycosyltransferase family 10 (fucosyltransferase); This family of Fucosyltransferases are the enzymes transferring fucose from GDP-Fucose to GlcNAc in an alpha1,3 linkage. This family is known as glycosyltransferase family 10." Q#27710 - CGI_10015593 superfamily 245622 568 692 3.81E-26 105.384 cl11446 Rhomboid superfamily - - "Rhomboid family; This family contains integral membrane proteins that are related to Drosophila rhomboid protein. Members of this family are found in bacteria and eukaryotes. Rhomboid promotes the cleavage of the membrane-anchored TGF-alpha-like growth factor Spitz, allowing it to activate the Drosophila EGF receptor. Analysis has shown that Rhomboid-1 is an intramembrane serine protease (EC:3.4.21.105). Parasite-encoded rhomboid enzymes are also important for invasion of host cells by Toxoplasma and the malaria parasite." Q#27712 - CGI_10009312 superfamily 245213 862 895 0.000287525 40.3126 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#27712 - CGI_10009312 superfamily 245213 945 979 0.0019243 37.6162 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#27712 - CGI_10009312 superfamily 245213 693 726 0.00369225 36.8458 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#27712 - CGI_10009312 superfamily 245213 482 522 0.00437357 36.8458 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#27712 - CGI_10009312 superfamily 245213 523 564 0.00928812 35.6902 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. 
Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#27712 - CGI_10009312 superfamily 241578 646 690 8.22E-06 46.9944 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#27712 - CGI_10009312 superfamily 241578 556 597 3.00E-05 45.4536 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. 
The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#27712 - CGI_10009312 superfamily 241578 730 772 0.00032146 42.372 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." 
Q#27712 - CGI_10009312 superfamily 241578 388 436 0.000763781 41.2164 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#27712 - CGI_10009312 superfamily 241578 311 354 0.00141931 40.446 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. 
Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#27712 - CGI_10009312 superfamily 243060 1112 1170 0.00371301 37.3584 cl02507 SEA superfamily N - "SEA domain; Domain found in Sea urchin sperm protein, Enterokinase, Agrin (SEA). Proposed function of regulating or binding carbohydrate side chains. Recently a proteolytic activity has been shown for a SEA domain." Q#27712 - CGI_10009312 superfamily 241578 899 940 0.00900837 37.7496 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses. In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." 
Q#27714 - CGI_10009314 superfamily 241900 96 337 1.71E-83 256.116 cl00490 EEP superfamily - - "Exonuclease-Endonuclease-Phosphatase (EEP) domain superfamily; This large superfamily includes the catalytic domain (exonuclease/endonuclease/phosphatase or EEP domain) of a diverse set of proteins including the ExoIII family of apurinic/apyrimidinic (AP) endonucleases, inositol polyphosphate 5-phosphatases (INPP5), neutral sphingomyelinases (nSMases), deadenylases (such as the vertebrate circadian-clock regulated nocturnin), bacterial cytolethal distending toxin B (CdtB), deoxyribonuclease 1 (DNase1), the endonuclease domain of the non-LTR retrotransposon LINE-1, and related domains. These diverse enzymes share a common catalytic mechanism of cleaving phosphodiester bonds; their substrates range from nucleic acids to phospholipids and perhaps proteins." Q#27717 - CGI_10020075 superfamily 241782 1 141 2.47E-21 88.1672 cl00321 AAT_I superfamily NC - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. 
Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phosphorylase family (fold type V)." Q#27718 - CGI_10020076 superfamily 241782 37 294 2.53E-40 144.791 cl00321 AAT_I superfamily C - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phosphorylase family (fold type V)." 
Q#27719 - CGI_10020077 superfamily 241782 4 96 8.87E-13 65.8256 cl00321 AAT_I superfamily N - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phosphorylase family (fold type V)." Q#27720 - CGI_10020078 superfamily 245206 25 293 3.10E-95 285.272 cl09931 NADB_Rossmann superfamily - - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. 
NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#27724 - CGI_10020082 superfamily 243110 131 321 1.02E-19 87.0997 cl02616 MACPF superfamily - - "MAC/Perforin domain; The membrane-attack complex (MAC) of the complement system forms transmembrane channels. These channels disrupt the phospholipid bilayer of target cells, leading to cell lysis and death. A number of proteins participate in the assembly of the MAC. Freshly activated C5b binds to C6 to form a C5b-6 complex, then to C7 forming the C5b-7 complex. The C5b-7 complex binds to C8, which is composed of three chains (alpha, beta, and gamma), thus forming the C5b-8 complex. C5b-8 subsequently binds to C9 and acts as a catalyst in the polymerisation of C9. Active MAC has a subunit composition of C5b-C6-C7-C8-C9{n}. Perforin is a protein found in cytolytic T-cell and killer cells. In the presence of calcium, perforin polymerises into transmembrane tubules and is capable of lysing, non-specifically, a variety of target cells. There are a number of regions of similarity in the sequences of complement components C6, C7, C8-alpha, C8-beta, C9 and perforin. The X-ray crystal structure of a MACPF domain reveals that it shares a common fold with bacterial cholesterol dependent cytolysins (pfam01289) such as perfringolysin O. 
Three key pieces of evidence suggest that MACPF domains and CDCs are homologous: Functional similarity (pore formation), conservation of three glycine residues at a hinge in both families and conservation of a complex core fold." Q#27725 - CGI_10020083 superfamily 246925 2 55 7.09E-05 38.1054 cl15309 LRR_RI superfamily NC - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#27727 - CGI_10020085 superfamily 243110 89 273 2.19E-19 87.0997 cl02616 MACPF superfamily - - "MAC/Perforin domain; The membrane-attack complex (MAC) of the complement system forms transmembrane channels. These channels disrupt the phospholipid bilayer of target cells, leading to cell lysis and death. A number of proteins participate in the assembly of the MAC. Freshly activated C5b binds to C6 to form a C5b-6 complex, then to C7 forming the C5b-7 complex. The C5b-7 complex binds to C8, which is composed of three chains (alpha, beta, and gamma), thus forming the C5b-8 complex. C5b-8 subsequently binds to C9 and acts as a catalyst in the polymerisation of C9. Active MAC has a subunit composition of C5b-C6-C7-C8-C9{n}. Perforin is a protein found in cytolytic T-cell and killer cells. In the presence of calcium, perforin polymerises into transmembrane tubules and is capable of lysing, non-specifically, a variety of target cells. There are a number of regions of similarity in the sequences of complement components C6, C7, C8-alpha, C8-beta, C9 and perforin. 
The X-ray crystal structure of a MACPF domain reveals that it shares a common fold with bacterial cholesterol dependent cytolysins (pfam01289) such as perfringolysin O. Three key pieces of evidence suggest that MACPF domains and CDCs are homologous: Functional similarity (pore formation), conservation of three glycine residues at a hinge in both families and conservation of a complex core fold." Q#27728 - CGI_10020086 superfamily 243110 127 342 1.18E-19 87.8701 cl02616 MACPF superfamily - - "MAC/Perforin domain; The membrane-attack complex (MAC) of the complement system forms transmembrane channels. These channels disrupt the phospholipid bilayer of target cells, leading to cell lysis and death. A number of proteins participate in the assembly of the MAC. Freshly activated C5b binds to C6 to form a C5b-6 complex, then to C7 forming the C5b-7 complex. The C5b-7 complex binds to C8, which is composed of three chains (alpha, beta, and gamma), thus forming the C5b-8 complex. C5b-8 subsequently binds to C9 and acts as a catalyst in the polymerisation of C9. Active MAC has a subunit composition of C5b-C6-C7-C8-C9{n}. Perforin is a protein found in cytolytic T-cell and killer cells. In the presence of calcium, perforin polymerises into transmembrane tubules and is capable of lysing, non-specifically, a variety of target cells. There are a number of regions of similarity in the sequences of complement components C6, C7, C8-alpha, C8-beta, C9 and perforin. The X-ray crystal structure of a MACPF domain reveals that it shares a common fold with bacterial cholesterol dependent cytolysins (pfam01289) such as perfringolysin O. Three key pieces of evidence suggest that MACPF domains and CDCs are homologous: Functional similarity (pore formation), conservation of three glycine residues at a hinge in both families and conservation of a complex core fold." 
Q#27731 - CGI_10020089 superfamily 241571 29 138 8.29E-19 78.223 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#27734 - CGI_10020092 superfamily 241900 285 558 5.33E-55 188.376 cl00490 EEP superfamily - - "Exonuclease-Endonuclease-Phosphatase (EEP) domain superfamily; This large superfamily includes the catalytic domain (exonuclease/endonuclease/phosphatase or EEP domain) of a diverse set of proteins including the ExoIII family of apurinic/apyrimidinic (AP) endonucleases, inositol polyphosphate 5-phosphatases (INPP5), neutral sphingomyelinases (nSMases), deadenylases (such as the vertebrate circadian-clock regulated nocturnin), bacterial cytolethal distending toxin B (CdtB), deoxyribonuclease 1 (DNase1), the endonuclease domain of the non-LTR retrotransposon LINE-1, and related domains. These diverse enzymes share a common catalytic mechanism of cleaving phosphodiester bonds; their substrates range from nucleic acids to phospholipids and perhaps proteins." Q#27737 - CGI_10020095 superfamily 202894 87 150 1.72E-13 61.853 cl04406 Mpv17_PMP22 superfamily - - "Mpv17 / PMP22 family; The 22-kDa peroxisomal membrane protein (PMP22) is a major component of peroxisomal membranes. PMP22 seems to be involved in pore forming activity and may contribute to the unspecific permeability of the organelle membrane. PMP22 is synthesised on free cytosolic ribosomes and then directed to the peroxisome membrane by specific targeting information. Mpv17 is a closely related peroxisomal protein. In mouse, the Mpv17 protein is involved in the development of early-onset glomerulosclerosis. More recently a homolog of Mpv17 in S. cerevisiae has been found to be an integral membrane protein of the inner mitochondrial membrane where it has been proposed to have a role in ethanol metabolism and tolerance during heat-shock. 
Defects in MPV17 are associated with mitochondrial DNA depletion syndrome (MDDS) and Navajo neurohepatopathy (NNH). MDDS is a clinically heterogeneous group of disorders characterized by a reduction in mitochondrial DNA (mtDNA) copy number. Primary mtDNA depletion is inherited as an autosomal recessive trait and may affect single organs, typically muscle or liver, or multiple tissues. Individuals with the hepatocerebral form of mitochondrial DNA depletion syndrome have early progressive liver failure and neurologic abnormalities, hypoglycemia, and increased lactate in body fluids. NNH is an autosomal recessive disease that is prevalent among Navajo children in the South Western states of America. The major clinical features are hepatopathy, peripheral neuropathy, corneal anesthesia and scarring, acral mutilation, cerebral leukoencephalopathy, failure to thrive, and recurrent metabolic acidosis with intercurrent infections. Infantile, childhood, and classic forms of NNH have been described. Mitochondrial DNA depletion was detected in the livers of patients, suggesting a primary defect in mtDNA maintenance." Q#27738 - CGI_10020096 superfamily 202894 93 156 1.30E-14 65.705 cl04406 Mpv17_PMP22 superfamily - - "Mpv17 / PMP22 family; The 22-kDa peroxisomal membrane protein (PMP22) is a major component of peroxisomal membranes. PMP22 seems to be involved in pore forming activity and may contribute to the unspecific permeability of the organelle membrane. PMP22 is synthesised on free cytosolic ribosomes and then directed to the peroxisome membrane by specific targeting information. Mpv17 is a closely related peroxisomal protein. In mouse, the Mpv17 protein is involved in the development of early-onset glomerulosclerosis. More recently a homolog of Mpv17 in S. cerevisiae has been found to be an integral membrane protein of the inner mitochondrial membrane where it has been proposed to have a role in ethanol metabolism and tolerance during heat-shock. 
Defects in MPV17 are associated with mitochondrial DNA depletion syndrome (MDDS) and Navajo neurohepatopathy (NNH). MDDS is a clinically heterogeneous group of disorders characterized by a reduction in mitochondrial DNA (mtDNA) copy number. Primary mtDNA depletion is inherited as an autosomal recessive trait and may affect single organs, typically muscle or liver, or multiple tissues. Individuals with the hepatocerebral form of mitochondrial DNA depletion syndrome have early progressive liver failure and neurologic abnormalities, hypoglycemia, and increased lactate in body fluids. NNH is an autosomal recessive disease that is prevalent among Navajo children in the South Western states of America. The major clinical features are hepatopathy, peripheral neuropathy, corneal anesthesia and scarring, acral mutilation, cerebral leukoencephalopathy, failure to thrive, and recurrent metabolic acidosis with intercurrent infections. Infantile, childhood, and classic forms of NNH have been described. Mitochondrial DNA depletion was detected in the livers of patients, suggesting a primary defect in mtDNA maintenance." Q#27739 - CGI_10020097 superfamily 217007 56 349 3.24E-116 350.747 cl11995 Syja_N superfamily - - SacI homology domain; This Pfam family represents a protein domain which shows homology to the yeast protein SacI. The SacI homology domain is most notably found at the amino terminal of the inositol 5'-phosphatase synaptojanin. Q#27740 - CGI_10020098 superfamily 241599 179 238 8.78E-14 66.498 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." 
Q#27742 - CGI_10018163 superfamily 246664 163 537 2.60E-140 413.509 cl14561 An_peroxidase_like superfamily - - "Animal heme peroxidases and related proteins; A diverse family of enzymes, which includes prostaglandin G/H synthase, thyroid peroxidase, myeloperoxidase, linoleate diol synthase, lactoperoxidase, peroxinectin, peroxidasin, and others. Despite its name, this family is not restricted to metazoans: members are found in fungi, plants, and bacteria as well." Q#27742 - CGI_10018163 superfamily 246664 59 101 1.80E-06 48.8458 cl14561 An_peroxidase_like superfamily C - "Animal heme peroxidases and related proteins; A diverse family of enzymes, which includes prostaglandin G/H synthase, thyroid peroxidase, myeloperoxidase, linoleate diol synthase, lactoperoxidase, peroxinectin, peroxidasin, and others. Despite its name, this family is not restricted to metazoans: members are found in fungi, plants, and bacteria as well." Q#27744 - CGI_10018165 superfamily 243066 23 123 1.95E-13 66.8721 cl02518 BTB superfamily - - "BTB/POZ domain; The BTB (for BR-C, ttk and bab) or POZ (for Pox virus and Zinc finger) domain is present near the N-terminus of a fraction of zinc finger (pfam00096) proteins and in proteins that contain the pfam01344 motif such as Kelch and a family of pox virus proteins. The BTB/POZ domain mediates homomeric dimerisation and in some instances heteromeric dimerisation. The structure of the dimerised PLZF BTB/POZ domain has been solved and consists of a tightly intertwined homodimer. The central scaffolding of the protein is made up of a cluster of alpha-helices flanked by short beta-sheets at both the top and bottom of the molecule. POZ domains from several zinc finger proteins have been shown to mediate transcriptional repression and to interact with components of histone deacetylase co-repressor complexes including N-CoR and SMRT. The POZ or BTB domain is also known as BR-C/Ttk or ZiN." 
Q#27745 - CGI_10018166 superfamily 241644 6 148 1.42E-56 177.009 cl00154 UBCc superfamily - - "Ubiquitin-conjugating enzyme E2, catalytic (UBCc) domain. This is part of the ubiquitin-mediated protein degradation pathway in which a thiol-ester linkage forms between a conserved cysteine and the C-terminus of ubiquitin and complexes with ubiquitin protein ligase enzymes, E3. This pathway regulates many fundamental cellular processes. There are also other E2s which form thiol-ester linkages without the use of E3s as well as several UBC homologs (TSG101, Mms2, Croc-1 and similar proteins) which lack the active site cysteine essential for ubiquitination and appear to function in DNA repair pathways which were omitted from the scope of this CD." Q#27748 - CGI_10018169 superfamily 245201 4 264 0 533.431 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#27749 - CGI_10018170 superfamily 245847 105 232 1.86E-12 61.3643 cl12042 FA58C superfamily - - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." 
Q#27751 - CGI_10018172 superfamily 246723 1145 1709 0 993.225 cl14813 GluZincin superfamily - - "Peptidase Gluzincin family (thermolysin-like proteinases, TLPs) includes peptidases M1, M2, M3, M4, M13, M32 and M36 (fungalysins); Gluzincin family (thermolysin-like peptidases or TLPs) includes several zinc-dependent metallopeptidases such as the M1, M2, M3, M4, M13, M32, M36 peptidases (MEROPS classification), and contain HEXXH and EXXXD motifs as part of their active site. All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. M1 family includes aminopeptidase N (APN) and leukotriene A4 hydrolase (LTA4H). APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types. LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity such that the two activities occupy different, but overlapping sites. The peptidase M3 or neurolysin-like family, includes M3, M2 and M32 metallopeptidases. The M3 peptidases have two subfamilies: M3A, includes thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (3.4.24.16), and the mitochondrial intermediate peptidase; M3B contains oligopeptidase F. M2 peptidase angiotensin converting enzyme (ACE, EC 3.4.15.1) catalyzes the conversion of decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II. ACE is a key part of the renin-angiotensin system that regulates blood pressure, thus ACE inhibitors are important for the treatment of hypertension. M32 family includes two eukaryotic enzymes from protozoa Trypanosoma cruzi, a causative agent of Chagas' disease, and Leishmania major, a parasite that causes leishmaniasis, making them attractive targets for drug development. 
The M4 family includes secreted protease thermolysin (EC 3.4.24.27), pseudolysin, aureolysin, neutral protease as well as fungalysin and bacillolysin (EC 3.4.24.28) that degrade extracellular proteins and peptides for bacterial nutrition, especially prior to sporulation. Thermolysin is widely used as a nonspecific protease to obtain fragments for peptide sequencing as well as in production of the artificial sweetener aspartame. M13 family includes neprilysin (EC 3.4.24.11) and endothelin-converting enzyme I (ECE-1, EC 3.4.24.71), which fulfill a broad range of physiological roles due to the greater variation in the S2' subsite allowing substrate specificity and are prime therapeutic targets for selective inhibition. Peptidase M36 (fungalysin) family includes endopeptidases from pathogenic fungi. Fungalysin hydrolyzes extracellular matrix proteins such as elastin and keratin. Aspergillus fumigatus causes the pulmonary disease aspergillosis by invading the lungs of immuno-compromised animals and secreting fungalysin that possibly breaks down proteinaceous structural barriers." Q#27751 - CGI_10018172 superfamily 246723 1734 2299 0 983.21 cl14813 GluZincin superfamily - - "Peptidase Gluzincin family (thermolysin-like proteinases, TLPs) includes peptidases M1, M2, M3, M4, M13, M32 and M36 (fungalysins); Gluzincin family (thermolysin-like peptidases or TLPs) includes several zinc-dependent metallopeptidases such as the M1, M2, M3, M4, M13, M32, M36 peptidases (MEROPS classification), and contain HEXXH and EXXXD motifs as part of their active site. All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. M1 family includes aminopeptidase N (APN) and leukotriene A4 hydrolase (LTA4H). APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types. 
LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity such that the two activities occupy different, but overlapping sites. The peptidase M3 or neurolysin-like family, includes M3, M2 and M32 metallopeptidases. The M3 peptidases have two subfamilies: M3A, includes thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (3.4.24.16), and the mitochondrial intermediate peptidase; M3B contains oligopeptidase F. M2 peptidase angiotensin converting enzyme (ACE, EC 3.4.15.1) catalyzes the conversion of decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II. ACE is a key part of the renin-angiotensin system that regulates blood pressure, thus ACE inhibitors are important for the treatment of hypertension. M32 family includes two eukaryotic enzymes from protozoa Trypanosoma cruzi, a causative agent of Chagas' disease, and Leishmania major, a parasite that causes leishmaniasis, making them attractive targets for drug development. The M4 family includes secreted protease thermolysin (EC 3.4.24.27), pseudolysin, aureolysin, neutral protease as well as fungalysin and bacillolysin (EC 3.4.24.28) that degrade extracellular proteins and peptides for bacterial nutrition, especially prior to sporulation. Thermolysin is widely used as a nonspecific protease to obtain fragments for peptide sequencing as well as in production of the artificial sweetener aspartame. M13 family includes neprilysin (EC 3.4.24.11) and endothelin-converting enzyme I (ECE-1, EC 3.4.24.71), which fulfill a broad range of physiological roles due to the greater variation in the S2' subsite allowing substrate specificity and are prime therapeutic targets for selective inhibition. Peptidase M36 (fungalysin) family includes endopeptidases from pathogenic fungi. Fungalysin hydrolyzes extracellular matrix proteins such as elastin and keratin. 
Aspergillus fumigatus causes the pulmonary disease aspergillosis by invading the lungs of immuno-compromised animals and secreting fungalysin that possibly breaks down proteinaceous structural barriers." Q#27751 - CGI_10018172 superfamily 246723 2325 2889 0 944.305 cl14813 GluZincin superfamily - - "Peptidase Gluzincin family (thermolysin-like proteinases, TLPs) includes peptidases M1, M2, M3, M4, M13, M32 and M36 (fungalysins); Gluzincin family (thermolysin-like peptidases or TLPs) includes several zinc-dependent metallopeptidases such as the M1, M2, M3, M4, M13, M32, M36 peptidases (MEROPS classification), and contain HEXXH and EXXXD motifs as part of their active site. All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. M1 family includes aminopeptidase N (APN) and leukotriene A4 hydrolase (LTA4H). APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types. LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity such that the two activities occupy different, but overlapping sites. The peptidase M3 or neurolysin-like family, includes M3, M2 and M32 metallopeptidases. The M3 peptidases have two subfamilies: M3A, includes thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (3.4.24.16), and the mitochondrial intermediate peptidase; M3B contains oligopeptidase F. M2 peptidase angiotensin converting enzyme (ACE, EC 3.4.15.1) catalyzes the conversion of decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II. ACE is a key part of the renin-angiotensin system that regulates blood pressure, thus ACE inhibitors are important for the treatment of hypertension. 
M32 family includes two eukaryotic enzymes from protozoa Trypanosoma cruzi, a causative agent of Chagas' disease, and Leishmania major, a parasite that causes leishmaniasis, making them attractive targets for drug development. The M4 family includes secreted protease thermolysin (EC 3.4.24.27), pseudolysin, aureolysin, neutral protease as well as fungalysin and bacillolysin (EC 3.4.24.28) that degrade extracellular proteins and peptides for bacterial nutrition, especially prior to sporulation. Thermolysin is widely used as a nonspecific protease to obtain fragments for peptide sequencing as well as in production of the artificial sweetener aspartame. M13 family includes neprilysin (EC 3.4.24.11) and endothelin-converting enzyme I (ECE-1, EC 3.4.24.71), which fulfill a broad range of physiological roles due to the greater variation in the S2' subsite allowing substrate specificity and are prime therapeutic targets for selective inhibition. Peptidase M36 (fungalysin) family includes endopeptidases from pathogenic fungi. Fungalysin hydrolyzes extracellular matrix proteins such as elastin and keratin. Aspergillus fumigatus causes the pulmonary disease aspergillosis by invading the lungs of immuno-compromised animals and secreting fungalysin that possibly breaks down proteinaceous structural barriers." Q#27751 - CGI_10018172 superfamily 246723 556 1119 0 921.578 cl14813 GluZincin superfamily - - "Peptidase Gluzincin family (thermolysin-like proteinases, TLPs) includes peptidases M1, M2, M3, M4, M13, M32 and M36 (fungalysins); Gluzincin family (thermolysin-like peptidases or TLPs) includes several zinc-dependent metallopeptidases such as the M1, M2, M3, M4, M13, M32, M36 peptidases (MEROPS classification), and contain HEXXH and EXXXD motifs as part of their active site. 
All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. M1 family includes aminopeptidase N (APN) and leukotriene A4 hydrolase (LTA4H). APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types. LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity such that the two activities occupy different, but overlapping sites. The peptidase M3 or neurolysin-like family, includes M3, M2 and M32 metallopeptidases. The M3 peptidases have two subfamilies: M3A, includes thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (3.4.24.16), and the mitochondrial intermediate peptidase; M3B contains oligopeptidase F. M2 peptidase angiotensin converting enzyme (ACE, EC 3.4.15.1) catalyzes the conversion of decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II. ACE is a key part of the renin-angiotensin system that regulates blood pressure, thus ACE inhibitors are important for the treatment of hypertension. M32 family includes two eukaryotic enzymes from protozoa Trypanosoma cruzi, a causative agent of Chagas' disease, and Leishmania major, a parasite that causes leishmaniasis, making them attractive targets for drug development. The M4 family includes secreted protease thermolysin (EC 3.4.24.27), pseudolysin, aureolysin, neutral protease as well as fungalysin and bacillolysin (EC 3.4.24.28) that degrade extracellular proteins and peptides for bacterial nutrition, especially prior to sporulation. Thermolysin is widely used as a nonspecific protease to obtain fragments for peptide sequencing as well as in production of the artificial sweetener aspartame. 
M13 family includes neprilysin (EC 3.4.24.11) and endothelin-converting enzyme I (ECE-1, EC 3.4.24.71), which fulfill a broad range of physiological roles due to the greater variation in the S2' subsite allowing substrate specificity and are prime therapeutic targets for selective inhibition. Peptidase M36 (fungalysin) family includes endopeptidases from pathogenic fungi. Fungalysin hydrolyzes extracellular matrix proteins such as elastin and keratin. Aspergillus fumigatus causes the pulmonary disease aspergillosis by invading the lungs of immuno-compromised animals and secreting fungalysin that possibly breaks down proteinaceous structural barriers." Q#27751 - CGI_10018172 superfamily 246723 2912 3494 0 834.908 cl14813 GluZincin superfamily - - "Peptidase Gluzincin family (thermolysin-like proteinases, TLPs) includes peptidases M1, M2, M3, M4, M13, M32 and M36 (fungalysins); Gluzincin family (thermolysin-like peptidases or TLPs) includes several zinc-dependent metallopeptidases such as the M1, M2, M3, M4, M13, M32, M36 peptidases (MEROPS classification), and contain HEXXH and EXXXD motifs as part of their active site. All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. M1 family includes aminopeptidase N (APN) and leukotriene A4 hydrolase (LTA4H). APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types. LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity such that the two activities occupy different, but overlapping sites. The peptidase M3 or neurolysin-like family, includes M3, M2 and M32 metallopeptidases. 
The M3 peptidases have two subfamilies: M3A, includes thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (3.4.24.16), and the mitochondrial intermediate peptidase; M3B contains oligopeptidase F. M2 peptidase angiotensin converting enzyme (ACE, EC 3.4.15.1) catalyzes the conversion of decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II. ACE is a key part of the renin-angiotensin system that regulates blood pressure, thus ACE inhibitors are important for the treatment of hypertension. M32 family includes two eukaryotic enzymes from protozoa Trypanosoma cruzi, a causative agent of Chagas' disease, and Leishmania major, a parasite that causes leishmaniasis, making them attractive targets for drug development. The M4 family includes secreted protease thermolysin (EC 3.4.24.27), pseudolysin, aureolysin, neutral protease as well as fungalysin and bacillolysin (EC 3.4.24.28) that degrade extracellular proteins and peptides for bacterial nutrition, especially prior to sporulation. Thermolysin is widely used as a nonspecific protease to obtain fragments for peptide sequencing as well as in production of the artificial sweetener aspartame. M13 family includes neprilysin (EC 3.4.24.11) and endothelin-converting enzyme I (ECE-1, EC 3.4.24.71), which fulfill a broad range of physiological roles due to the greater variation in the S2' subsite allowing substrate specificity and are prime therapeutic targets for selective inhibition. Peptidase M36 (fungalysin) family includes endopeptidases from pathogenic fungi. Fungalysin hydrolyzes extracellular matrix proteins such as elastin and keratin. Aspergillus fumigatus causes the pulmonary disease aspergillosis by invading the lungs of immuno-compromised animals and secreting fungalysin that possibly breaks down proteinaceous structural barriers." 
Q#27751 - CGI_10018172 superfamily 246723 35 485 0 679.288 cl14813 GluZincin superfamily C - "Peptidase Gluzincin family (thermolysin-like proteinases, TLPs) includes peptidases M1, M2, M3, M4, M13, M32 and M36 (fungalysins); Gluzincin family (thermolysin-like peptidases or TLPs) includes several zinc-dependent metallopeptidases such as the M1, M2, M3, M4, M13, M32, M36 peptidases (MEROPS classification), and contain HEXXH and EXXXD motifs as part of their active site. All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. M1 family includes aminopeptidase N (APN) and leukotriene A4 hydrolase (LTA4H). APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types. LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity such that the two activities occupy different, but overlapping sites. The peptidase M3 or neurolysin-like family, includes M3, M2 and M32 metallopeptidases. The M3 peptidases have two subfamilies: M3A, includes thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (3.4.24.16), and the mitochondrial intermediate peptidase; M3B contains oligopeptidase F. M2 peptidase angiotensin converting enzyme (ACE, EC 3.4.15.1) catalyzes the conversion of decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II. ACE is a key part of the renin-angiotensin system that regulates blood pressure, thus ACE inhibitors are important for the treatment of hypertension. M32 family includes two eukaryotic enzymes from protozoa Trypanosoma cruzi, a causative agent of Chagas' disease, and Leishmania major, a parasite that causes leishmaniasis, making them attractive targets for drug development. 
The M4 family includes secreted protease thermolysin (EC 3.4.24.27), pseudolysin, aureolysin, neutral protease as well as fungalysin and bacillolysin (EC 3.4.24.28) that degrade extracellular proteins and peptides for bacterial nutrition, especially prior to sporulation. Thermolysin is widely used as a nonspecific protease to obtain fragments for peptide sequencing as well as in production of the artificial sweetener aspartame. M13 family includes neprilysin (EC 3.4.24.11) and endothelin-converting enzyme I (ECE-1, EC 3.4.24.71), which fulfill a broad range of physiological roles due to the greater variation in the S2' subsite allowing substrate specificity and are prime therapeutic targets for selective inhibition. Peptidase M36 (fungalysin) family includes endopeptidases from pathogenic fungi. Fungalysin hydrolyzes extracellular matrix proteins such as elastin and keratin. Aspergillus fumigatus causes the pulmonary disease aspergillosis by invading the lungs of immuno-compromised animals and secreting fungalysin that possibly breaks down proteinaceous structural barriers." Q#27751 - CGI_10018172 superfamily 246723 480 525 4.38E-16 83.3839 cl14813 GluZincin superfamily N - "Peptidase Gluzincin family (thermolysin-like proteinases, TLPs) includes peptidases M1, M2, M3, M4, M13, M32 and M36 (fungalysins); Gluzincin family (thermolysin-like peptidases or TLPs) includes several zinc-dependent metallopeptidases such as the M1, M2, M3, M4, M13, M32, M36 peptidases (MEROPS classification), and contain HEXXH and EXXXD motifs as part of their active site. All peptidases in this family bind a single catalytic zinc ion which is tetrahedrally co-ordinated by three amino acid ligands and a water molecule that forms the nucleophile on activation during catalysis. M1 family includes aminopeptidase N (APN) and leukotriene A4 hydrolase (LTA4H). 
APN preferentially cleaves neutral amino acids from the N-terminus of oligopeptides and is present in a variety of human tissues and cell types. LTA4H is a bifunctional enzyme, possessing an aminopeptidase as well as an epoxide hydrolase activity such that the two activities occupy different, but overlapping sites. The peptidase M3 or neurolysin-like family, includes M3, M2 and M32 metallopeptidases. The M3 peptidases have two subfamilies: M3A, includes thimet oligopeptidase (TOP; endopeptidase 3.4.24.15), neurolysin (3.4.24.16), and the mitochondrial intermediate peptidase; M3B contains oligopeptidase F. M2 peptidase angiotensin converting enzyme (ACE, EC 3.4.15.1) catalyzes the conversion of decapeptide angiotensin I to the potent vasopressor octapeptide angiotensin II. ACE is a key part of the renin-angiotensin system that regulates blood pressure, thus ACE inhibitors are important for the treatment of hypertension. M32 family includes two eukaryotic enzymes from protozoa Trypanosoma cruzi, a causative agent of Chagas' disease, and Leishmania major, a parasite that causes leishmaniasis, making them attractive targets for drug development. The M4 family includes secreted protease thermolysin (EC 3.4.24.27), pseudolysin, aureolysin, neutral protease as well as fungalysin and bacillolysin (EC 3.4.24.28) that degrade extracellular proteins and peptides for bacterial nutrition, especially prior to sporulation. Thermolysin is widely used as a nonspecific protease to obtain fragments for peptide sequencing as well as in production of the artificial sweetener aspartame. M13 family includes neprilysin (EC 3.4.24.11) and endothelin-converting enzyme I (ECE-1, EC 3.4.24.71), which fulfill a broad range of physiological roles due to the greater variation in the S2' subsite allowing substrate specificity and are prime therapeutic targets for selective inhibition. Peptidase M36 (fungalysin) family includes endopeptidases from pathogenic fungi. 
Fungalysin hydrolyzes extracellular matrix proteins such as elastin and keratin. Aspergillus fumigatus causes the pulmonary disease aspergillosis by invading the lungs of immuno-compromised animals and secreting fungalysin that possibly breaks down proteinaceous structural barriers." Q#27758 - CGI_10018180 superfamily 243134 36 128 1.79E-20 83.4676 cl02663 Fasciclin superfamily C - "Fasciclin domain; This extracellular domain is found repeated four times in grasshopper fasciclin I as well as in proteins from mammals, sea urchins, plants, yeast and bacteria." Q#27758 - CGI_10018180 superfamily 243134 125 197 9.15E-15 67.2892 cl02663 Fasciclin superfamily C - "Fasciclin domain; This extracellular domain is found repeated four times in grasshopper fasciclin I as well as in proteins from mammals, sea urchins, plants, yeast and bacteria." Q#27759 - CGI_10018181 superfamily 243134 37 136 1.25E-31 111.202 cl02663 Fasciclin superfamily - - "Fasciclin domain; This extracellular domain is found repeated four times in grasshopper fasciclin I as well as in proteins from mammals, sea urchins, plants, yeast and bacteria." Q#27760 - CGI_10018182 superfamily 238191 28 536 1.02E-119 365.502 cl18907 Esterase_lipase superfamily - - "Esterases and lipases (includes fungal lipases, cholinesterases, etc.) These enzymes act on carboxylic esters (EC: 3.1.1.-). The catalytic apparatus involves three residues (catalytic triad): a serine, a glutamate or aspartate and a histidine. These catalytic residues are responsible for the nucleophilic attack on the carbonyl carbon atom of the ester bond. In contrast with other alpha/beta hydrolase fold family members, p-nitrobenzyl esterase and acetylcholine esterase have a Glu instead of Asp at the active site carboxylate." Q#27762 - CGI_10018184 superfamily 243072 1 118 2.93E-18 79.3498 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. 
The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#27762 - CGI_10018184 superfamily 243072 66 185 1.25E-17 77.809 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#27762 - CGI_10018184 superfamily 243072 131 252 1.01E-16 75.1126 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#27762 - CGI_10018184 superfamily 243072 196 325 1.74E-12 63.1714 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#27766 - CGI_10014441 superfamily 248264 45 188 2.60E-05 42.6094 cl17710 DDE_4 superfamily - - "DDE superfamily endonuclease; This family of proteins are related to pfam00665 and are probably endonucleases of the DDE superfamily. Transposase proteins are necessary for efficient DNA transposition. This domain is a member of the DDE superfamily, which contain three carboxylate residues that are believed to be responsible for coordinating metal ions needed for catalysis. The catalytic activity of this enzyme involves DNA cleavage at a specific site followed by a strand transfer reaction." Q#27767 - CGI_10014442 superfamily 243061 1 101 5.53E-44 140.17 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#27768 - CGI_10014443 superfamily 241594 950 1238 5.01E-76 256.338 cl00077 HECTc superfamily - - "HECT domain; C-terminal catalytic domain of a subclass of Ubiquitin-protein ligase (E3). It binds specific ubiquitin-conjugating enzymes (E2), accepts ubiquitin from E2, transfers ubiquitin to substrate lysine side chains, and transfers additional ubiquitin molecules to the end of growing ubiquitin chains." Q#27768 - CGI_10014443 superfamily 207685 839 900 4.11E-24 98.1242 cl02642 PABP superfamily - - "Poly-adenylate binding protein, unique domain; The region featured in this family is found towards the C-terminus of poly(A)-binding proteins (PABPs). These are eukaryotic proteins that, through their binding of the 3' poly(A) tail on mRNA, have very important roles in the pathways of gene expression. They seem to provide a scaffold on which other proteins can bind and mediate processes such as export, translation and turnover of the transcripts. Moreover, they may act as antagonists to the binding of factors that allow mRNA degradation, regulating mRNA longevity. 
PABPs are also involved in nuclear transport. PABPs interact with poly(A) tails via RNA-recognition motifs (pfam00076). Note that the PABP C-terminal region is also found in members of the hyperplastic discs protein (HYD) family of ubiquitin ligases that contain HECT domains - these are also included in this family." Q#27769 - CGI_10014444 superfamily 151983 178 228 1.31E-22 94.6757 cl13062 E3_UbLigase_EDD superfamily - - "E3 ubiquitin ligase EDD; EDD, the ER ubiquitin ligase from the HECT ligases, contains an N-terminal ubiquitin-associated domain which binds ubiquitin. Ubiquitin is recognised by helices alpha-1 and -3 in the UBA domain. EDD is involved in DNA damage repair pathways and binds to mono-ubiquitinated proteins." Q#27770 - CGI_10014445 superfamily 241607 586 638 2.82E-19 83.1225 cl00097 KAZAL_FS superfamily - - "Kazal type serine protease inhibitors and follistatin-like domains. Kazal inhibitors inhibit serine proteases, such as, trypsin, chymotrypsin, avian ovomucoids, and elastases. The inhibitory domain has one reactive site peptide bond, which serves the cognate enzyme as substrate. The reactive site peptide bond is a combining loop which has an identical conformation in all Kazal inhibitors and in all enzyme/inhibitor complexes. These Kazal domains (small hydrophobic core of alpha/beta structure with 3 to 4 disulfide bonds) often occur in tandem arrays. Similar domains are also present in follistatin (FS) and follistatin-like family members, which play an important role in tissue specific regulation. The FS domain consists of an N-terminal beta hairpin (FOLN/EGF-like domain) and a Kazal-like domain and has five disulfide bonds. Although the Kazal-like FS substructure is similar to Kazal proteinase inhibitors, no FS domain has yet been shown to be a proteinase inhibitor. Follistatin-like family members include SPARC, also known as, BM-40 or osteonectin, the Gallus gallus Flik protein, as well as, agrin which has a long array of FS domains. 
The kazal-type inhibitor domain has also been detected in an extracellular loop region of solute carrier 21 (SLC21) family members (organic anion transporters), which may regulate the specificity of anion uptake. The distant homolog, Ascidian trypsin inhibitor, is included in this CD." Q#27770 - CGI_10014445 superfamily 248458 199 406 9.37E-13 69.2649 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." 
Q#27771 - CGI_10014446 superfamily 243084 256 354 9.64E-47 164.864 cl02556 Bromodomain superfamily - - Bromodomain. Bromodomains are found in many chromatin-associated proteins and in nuclear histone acetyltransferases. They interact specifically with acetylated lysine. Q#27771 - CGI_10014446 superfamily 243083 375 455 3.12E-33 125.594 cl02554 PWWP superfamily - - "The PWWP domain, named for a conserved Pro-Trp-Trp-Pro motif, is a small domain consisting of 100-150 amino acids. The PWWP domain is found in numerous proteins that are involved in cell division, growth and differentiation. Most PWWP-domain proteins seem to be nuclear, often DNA-binding, proteins that function as transcription factors regulating a variety of developmental processes. The function of the PWWP domain is still not known precisely; however, based on the fact that other regions of PWWP-domain proteins are responsible for nuclear localization and DNA-binding, it is likely that the PWWP domain acts as a site for protein-protein binding interactions, influencing chromatin remodeling and thereby regulating transcriptional processes. Some PWWP-domain proteins have been linked to cancer or other diseases; some are known to function as growth factors." Q#27771 - CGI_10014446 superfamily 247999 195 235 8.61E-08 50.952 cl17445 PHD superfamily - - PHD-finger; PHD folds into an interleaved type of Zn-finger chelating 2 Zn ions in a similar manner to that of the RING and FYVE domains. Several PHD fingers have been identified as binding modules of methylated histone H3. Q#27771 - CGI_10014446 superfamily 203213 1412 1452 4.23E-06 45.957 cl04999 HTH_psq superfamily - - "helix-turn-helix, Psq domain; This DNA-binding motif is found in four copies in the pipsqueak protein of Drosophila melanogaster. In pipsqueak this domain binds to GAGA sequence." Q#27772 - CGI_10014447 superfamily 243166 85 284 3.29E-31 117.377 cl02759 TRAM_LAG1_CLN8 superfamily - - TLC domain; TLC domain. 
Q#27772 - CGI_10014447 superfamily 203928 15 83 2.44E-16 72.2143 cl07130 TRAM1 superfamily - - "TRAM1-like protein; This family comprises sequences that are similar to human TRAM1. This is a transmembrane protein of the endoplasmic reticulum, thought to be involved in the membrane transfer of secretory proteins. The region featured in this family is found N-terminal to the longevity-assurance protein region (pfam03798)." Q#27773 - CGI_10014448 superfamily 198738 226 311 2.19E-39 137.785 cl02599 Ets superfamily - - Ets-domain; Ets-domain. Q#27773 - CGI_10014448 superfamily 247057 9 74 9.75E-26 99.3849 cl15755 SAM_superfamily superfamily - - "SAM (Sterile alpha motif ); SAM (Sterile Alpha Motif) domain is a module consisting of approximately 70 amino acids. This domain is found in the Fungi/Metazoa group and in a restricted number of bacteria. Proteins with SAM domains are represented by a wide variety of domain architectures and have different intracellular localization, including nucleus, cytoplasm and membranes. SAM domains have diverse functions. They can interact with proteins, RNAs and membrane lipids, contain site of phosphorylation and/or kinase docking site, and play a role in protein homo and hetero dimerization/oligomerization in processes ranging from signal transduction to regulation of transcription. Mutations in SAM domains have been linked to several diseases." Q#27775 - CGI_10014450 superfamily 217414 345 764 3.73E-82 270.74 cl03927 Otopetrin superfamily - - "Protein of unknown function, DUF270; Protein of unknown function, DUF270. " Q#27778 - CGI_10028709 superfamily 189857 4 123 2.69E-41 135.841 cl07832 Caveolin superfamily - - "Caveolin; All three known Caveolin forms have the FEDVIAEP caveolin 'signature motif' within their hydrophilic N-terminal domain. Caveolin 2 (Cav-2) is co-localised and co-expressed with Cav-1/VIP21, forms heterodimers with it and needs Cav-1 for proper membrane localisation. 
Cav-3 has greater protein sequence similarity to Cav-1 than to Cav-2. Cellular processes caveolins are involved in include vesicular transport, cholesterol homeostasis, signal transduction, and tumour suppression." Q#27779 - CGI_10028710 superfamily 189857 8 126 1.88E-40 133.915 cl07832 Caveolin superfamily - - "Caveolin; All three known Caveolin forms have the FEDVIAEP caveolin 'signature motif' within their hydrophilic N-terminal domain. Caveolin 2 (Cav-2) is co-localised and co-expressed with Cav-1/VIP21, forms heterodimers with it and needs Cav-1 for proper membrane localisation. Cav-3 has greater protein sequence similarity to Cav-1 than to Cav-2. Cellular processes caveolins are involved in include vesicular transport, cholesterol homeostasis, signal transduction, and tumour suppression." Q#27780 - CGI_10028711 superfamily 189857 6 128 6.37E-39 130.063 cl07832 Caveolin superfamily - - "Caveolin; All three known Caveolin forms have the FEDVIAEP caveolin 'signature motif' within their hydrophilic N-terminal domain. Caveolin 2 (Cav-2) is co-localised and co-expressed with Cav-1/VIP21, forms heterodimers with it and needs Cav-1 for proper membrane localisation. Cav-3 has greater protein sequence similarity to Cav-1 than to Cav-2. Cellular processes caveolins are involved in include vesicular transport, cholesterol homeostasis, signal transduction, and tumour suppression." Q#27781 - CGI_10028712 superfamily 189857 83 211 4.73E-53 169.353 cl07832 Caveolin superfamily - - "Caveolin; All three known Caveolin forms have the FEDVIAEP caveolin 'signature motif' within their hydrophilic N-terminal domain. Caveolin 2 (Cav-2) is co-localised and co-expressed with Cav-1/VIP21, forms heterodimers with it and needs Cav-1 for proper membrane localisation. Cav-3 has greater protein sequence similarity to Cav-1 than to Cav-2. Cellular processes caveolins are involved in include vesicular transport, cholesterol homeostasis, signal transduction, and tumour suppression." 
Q#27782 - CGI_10028713 superfamily 189857 15 124 4.09E-30 106.951 cl07832 Caveolin superfamily N - "Caveolin; All three known Caveolin forms have the FEDVIAEP caveolin 'signature motif' within their hydrophilic N-terminal domain. Caveolin 2 (Cav-2) is co-localised and co-expressed with Cav-1/VIP21, forms heterodimers with it and needs Cav-1 for proper membrane localisation. Cav-3 has greater protein sequence similarity to Cav-1 than to Cav-2. Cellular processes caveolins are involved in include vesicular transport, cholesterol homeostasis, signal transduction, and tumour suppression." Q#27783 - CGI_10028714 superfamily 189857 2 132 2.82E-39 130.833 cl07832 Caveolin superfamily - - "Caveolin; All three known Caveolin forms have the FEDVIAEP caveolin 'signature motif' within their hydrophilic N-terminal domain. Caveolin 2 (Cav-2) is co-localised and co-expressed with Cav-1/VIP21, forms heterodimers with it and needs Cav-1 for proper membrane localisation. Cav-3 has greater protein sequence similarity to Cav-1 than to Cav-2. Cellular processes caveolins are involved in include vesicular transport, cholesterol homeostasis, signal transduction, and tumour suppression." Q#27784 - CGI_10028715 superfamily 189857 1 127 3.90E-46 148.552 cl07832 Caveolin superfamily - - "Caveolin; All three known Caveolin forms have the FEDVIAEP caveolin 'signature motif' within their hydrophilic N-terminal domain. Caveolin 2 (Cav-2) is co-localised and co-expressed with Cav-1/VIP21, forms heterodimers with it and needs Cav-1 for proper membrane localisation. Cav-3 has greater protein sequence similarity to Cav-1 than to Cav-2. Cellular processes caveolins are involved in include vesicular transport, cholesterol homeostasis, signal transduction, and tumour suppression." 
Q#27785 - CGI_10028716 superfamily 189857 1 123 4.75E-40 132.374 cl07832 Caveolin superfamily - - "Caveolin; All three known Caveolin forms have the FEDVIAEP caveolin 'signature motif' within their hydrophilic N-terminal domain. Caveolin 2 (Cav-2) is co-localised and co-expressed with Cav-1/VIP21, forms heterodimers with it and needs Cav-1 for proper membrane localisation. Cav-3 has greater protein sequence similarity to Cav-1 than to Cav-2. Cellular processes caveolins are involved in include vesicular transport, cholesterol homeostasis, signal transduction, and tumour suppression." Q#27786 - CGI_10028717 superfamily 245864 24 427 1.42E-52 189.026 cl12078 p450 superfamily - - "Cytochrome P450; Cytochrome P450s are haem-thiolate proteins involved in the oxidative degradation of various compounds. They are particularly well known for their role in the degradation of environmental toxins and mutagens. They can be divided into 4 classes, according to the method by which electrons from NAD(P)H are delivered to the catalytic site. Sequence conservation is relatively low within the family - there are only 3 absolutely conserved residues - but their general topography and structural fold are highly conserved. The conserved core is composed of a coil termed the 'meander', a four-helix bundle, helices J and K, and two sets of beta-sheets. These constitute the haem-binding loop (with an absolutely conserved cysteine that serves as the 5th ligand for the haem iron), the proton-transfer groove and the absolutely conserved EXXR motif in helix K. While prokaryotic P450s are soluble proteins, most eukaryotic P450s are associated with microsomal membranes. Their general enzymatic function is to catalyze regiospecific and stereospecific oxidation of non-activated hydrocarbons at physiological temperatures." 
Q#27786 - CGI_10028717 superfamily 189857 609 736 2.25E-46 162.034 cl07832 Caveolin superfamily - - "Caveolin; All three known Caveolin forms have the FEDVIAEP caveolin 'signature motif' within their hydrophilic N-terminal domain. Caveolin 2 (Cav-2) is co-localised and co-expressed with Cav-1/VIP21, forms heterodimers with it and needs Cav-1 for proper membrane localisation. Cav-3 has greater protein sequence similarity to Cav-1 than to Cav-2. Cellular processes caveolins are involved in include vesicular transport, cholesterol homeostasis, signal transduction, and tumour suppression." Q#27786 - CGI_10028717 superfamily 189857 490 597 1.13E-36 135.07 cl07832 Caveolin superfamily - - "Caveolin; All three known Caveolin forms have the FEDVIAEP caveolin 'signature motif' within their hydrophilic N-terminal domain. Caveolin 2 (Cav-2) is co-localised and co-expressed with Cav-1/VIP21, forms heterodimers with it and needs Cav-1 for proper membrane localisation. Cav-3 has greater protein sequence similarity to Cav-1 than to Cav-2. Cellular processes caveolins are involved in include vesicular transport, cholesterol homeostasis, signal transduction, and tumour suppression." Q#27786 - CGI_10028717 superfamily 189857 3 23 0.000563231 39.5406 cl07832 Caveolin superfamily C - "Caveolin; All three known Caveolin forms have the FEDVIAEP caveolin 'signature motif' within their hydrophilic N-terminal domain. Caveolin 2 (Cav-2) is co-localised and co-expressed with Cav-1/VIP21, forms heterodimers with it and needs Cav-1 for proper membrane localisation. Cav-3 has greater protein sequence similarity to Cav-1 than to Cav-2. Cellular processes caveolins are involved in include vesicular transport, cholesterol homeostasis, signal transduction, and tumour suppression." 
Q#27787 - CGI_10028718 superfamily 248458 37 176 0.000425494 41.1453 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#27787 - CGI_10028718 superfamily 248458 294 469 0.000714694 40.3749 cl17904 MFS superfamily N - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. 
MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#27788 - CGI_10028719 superfamily 246908 6 75 3.32E-18 74.0555 cl15255 SH2 superfamily N - "Src homology 2 (SH2) domain; In general, SH2 domains are involved in signal transduction; they bind pTyr-containing polypeptide ligands via two surface pockets, a pTyr and hydrophobic binding pocket, allowing proteins with SH2 domains to localize to tyrosine phosphorylated sites. 
They are present in a wide array of proteins including: adaptor proteins (Nck1, Crk, Grb2), scaffolds (Slp76, Shc, Dapp1), kinases (Src, Syk, Fps, Tec), phosphatases (Shp-1, Shp-2), transcription factors (STAT1), Ras signaling molecules (Ras-Gap), ubiquitination factors (c-Cbl), cytoskeleton regulators (Tensin), signal regulators (SAP), and phospholipid second messengers (PLCgamma), amongst others." Q#27789 - CGI_10028720 superfamily 247905 188 304 3.22E-15 70.7296 cl17351 HELICc superfamily - - "Helicase superfamily c-terminal domain; associated with DEXDc-, DEAD-, and DEAH-box proteins, yeast initiation factor 4A, Ski2p, and Hepatitis C virus NS3 helicases; this domain is found in a wide variety of helicases and helicase related proteins; may not be an autonomously folding unit, but an integral part of the helicase; 4 helicase superfamilies at present according to the organization of their signature motifs; all helicases share the ability to unwind nucleic acid duplexes with a distinct directional polarity; they utilize the free energy from nucleoside triphosphate hydrolysis to fuel their translocation along DNA, unwinding the duplex in the process" Q#27789 - CGI_10028720 superfamily 247805 1 146 1.02E-08 52.3396 cl17251 DEXDc superfamily - - DEAD-like helicases superfamily. A diverse family of proteins involved in ATP-dependent RNA or DNA unwinding. This domain contains the ATP-binding region. Q#27791 - CGI_10028722 superfamily 244819 9 64 0.000242416 37.7642 cl07874 zf-AD superfamily C - "Zinc-finger associated domain (zf-AD); The zf-AD domain, also known as ZAD, forms an atypical treble-cleft-like zinc co-ordinating fold. The zf-AD domain is thought to be involved in mediating dimer formation, but does not bind to DNA." 
Q#27792 - CGI_10028723 superfamily 247743 89 180 4.18E-07 48.6815 cl17189 AAA superfamily - - "The AAA+ (ATPases Associated with a wide variety of cellular Activities) superfamily represents an ancient group of ATPases belonging to the ASCE (for additional strand, catalytic E) division of the P-loop NTPase fold. The ASCE division also includes ABC, RecA-like, VirD4-like, PilT-like, and SF1/2 helicases. Members of the AAA+ ATPases function as molecular chaperons, ATPase subunits of proteases, helicases, or nucleic-acid stimulated ATPases. The AAA+ proteins contain several distinct features in addition to the conserved alpha-beta-alpha core domain structure and the Walker A and B motifs of the P-loop NTPases." Q#27792 - CGI_10028723 superfamily 247912 358 565 2.31E-18 84.8604 cl17358 Beta-lactamase superfamily N - Beta-lactamase; This family appears to be distantly related to pfam00905 and PF00768 D-alanyl-D-alanine carboxypeptidase. Q#27792 - CGI_10028723 superfamily 204202 254 313 9.82E-16 72.288 cl07827 Vps4_C superfamily - - Vps4 C terminal oligomerisation domain; This domain is found at the C terminal of ATPase proteins involved in vacuolar sorting. It forms an alpha helix structure and is required for oligomerisation. Q#27793 - CGI_10028724 superfamily 247912 37 333 4.06E-23 99.1128 cl17358 Beta-lactamase superfamily - - Beta-lactamase; This family appears to be distantly related to pfam00905 and PF00768 D-alanyl-D-alanine carboxypeptidase. Q#27794 - CGI_10028725 superfamily 247724 10 167 1.05E-41 139.583 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. 
This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#27798 - CGI_10028729 superfamily 243058 209 303 4.69E-10 57.3243 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." Q#27798 - CGI_10028729 superfamily 248012 337 436 7.66E-20 85.322 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#27800 - CGI_10028731 superfamily 216897 177 241 5.34E-12 59.2321 cl03463 Gal_Lectin superfamily - - Galactose binding lectin domain; Galactose binding lectin domain. 
Q#27801 - CGI_10028732 superfamily 243212 242 380 8.91E-19 82.7769 cl02844 Arrestin_C superfamily - - "Arrestin (or S-antigen), C-terminal domain; Ig-like beta-sandwich fold. Scop reports duplication with N-terminal domain." Q#27801 - CGI_10028732 superfamily 215866 110 219 5.95E-18 80.4471 cl18349 Arrestin_N superfamily N - "Arrestin (or S-antigen), N-terminal domain; Ig-like beta-sandwich fold. Scop reports duplication with C-terminal domain." Q#27802 - CGI_10028733 superfamily 243161 5 63 1.51E-14 66.6489 cl02739 THAP superfamily C - "THAP domain; The THAP domain is a putative DNA-binding domain (DBD) and probably also binds a zinc ion. It features the conserved C2CH architecture (consensus sequence: Cys - 2-4 residues - Cys - 35-50 residues - Cys - 2 residues - His). Other universal features include the location of the domain at the N-termini of proteins, its size of about 90 residues, a C-terminal AVPTIF box and several other conserved residues. Orthologues of the human THAP domain have been identified in other vertebrates and probably worms and flies, but not in other eukaryotes or any prokaryotes." Q#27805 - CGI_10028737 superfamily 241596 44 100 1.21E-12 63.0019 cl00081 HLH superfamily - - "Helix-loop-helix domain, found in specific DNA-binding proteins that act as transcription factors; 60-100 amino acids long. 
A DNA-binding basic region is followed by two alpha-helices separated by a variable loop region; HLH forms homo- and heterodimers, dimerization creates a parallel, left-handed, four helix bundle; the basic region N-terminal to the first amphipathic helix mediates high-affinity DNA-binding; there are several groups of HLH proteins: those (E12/E47) which bind specific hexanucleotide sequences such as E-box (5-CANNTG-3) or StRE (5-ATCACCCCAC-3), those lacking the basic domain (Emc, Id) function as negative regulators since they fail to bind DNA, those (hairy, E(spl), deadpan) which repress transcription although they can bind specific hexanucleotide sequences such as N-box (5-CACGc/aG-3), those which have a COE domain (Collier/Olf-1/EBF) which is involved both in dimerization and in DNA binding, and those which bind pentanucleotides ACGTG or GCGTG and have a PAS domain which allows the dimerization between PAS proteins, the binding of small molecules (e.g., dioxin), and interactions with non-PAS proteins." Q#27808 - CGI_10028740 superfamily 245847 68 145 1.66E-12 60.6481 cl12042 FA58C superfamily C - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#27808 - CGI_10028740 superfamily 245847 4 63 0.00995383 33.6842 cl12042 FA58C superfamily N - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." Q#27810 - CGI_10028742 superfamily 218390 1 310 1.11E-152 438.673 cl04895 PARG_cat superfamily - - "Poly (ADP-ribose) glycohydrolase (PARG); Poly(ADP-ribose) glycohydrolase (PARG), is a ubiquitously expressed exo- and endoglycohydrolase which mediates oxidative and excitotoxic neuronal death." 
Q#27811 - CGI_10028743 superfamily 245213 529 565 1.00E-05 43.3942 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#27811 - CGI_10028743 superfamily 245213 418 452 9.63E-05 40.3126 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#27811 - CGI_10028743 superfamily 245213 305 341 0.000136776 39.9274 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#27811 - CGI_10028743 superfamily 245213 491 527 0.00105706 37.231 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#27811 - CGI_10028743 superfamily 245213 46 77 0.00112919 37.231 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#27811 - CGI_10028743 superfamily 245213 118 153 0.00224924 36.4606 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." 
Q#27811 - CGI_10028743 superfamily 245213 568 604 0.00624363 35.305 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#27811 - CGI_10028743 superfamily 245213 380 415 0.00838311 34.9198 cl09941 EGF_CA superfamily - - "Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements." Q#27812 - CGI_10028744 superfamily 159801 18 357 3.41E-42 154.426 cl12141 Tweety_N superfamily - - "N-terminal domain of the protein encoded by the Drosophila tweety gene and related proteins, a family of chloride ion channels; The protein product of the Drosophila tweety (tty) gene is thought to form a trans-membrane protein with five membrane-spanning regions and a cytoplasmic C-terminus. This N-terminal domain contains the putative transmembrane spanning regions. Tweety has been suggested as a candidate for a large conductance chloride channel, both in vertebrate and insect cells. Three human homologs have been identified and designated TTYH1-3. 
TTYH2 has been associated with the progression of cancer, and Drosophila melanogaster tweety has been assumed to play a role in development. TTYH2 and TTYH3 bind to and are ubiquitinated by Nedd4-2, a HECT type E3 ubiquitin ligase, which most likely plays a role in controlling the cellular levels of tweety family proteins." Q#27813 - CGI_10028745 superfamily 245201 74 351 0 509.273 cl09925 PKc_like superfamily - - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#27815 - CGI_10028747 superfamily 248318 40 93 1.63E-19 82.0985 cl17764 FYVE superfamily - - "FYVE domain; Zinc-binding domain; targets proteins to membrane lipids via interaction with phosphatidylinositol-3-phosphate, PI3P; present in Fab1, YOTB, Vac1, and EEA1;" Q#27815 - CGI_10028747 superfamily 243058 130 263 0.000861293 37.6792 cl02500 ARM superfamily - - "Armadillo/beta-catenin-like repeats. An approximately 40 amino acid long tandemly repeated sequence motif first identified in the Drosophila segment polarity gene armadillo; these repeats were also found in the mammalian armadillo homolog beta-catenin, the junctional plaque protein plakoglobin, the adenomatous polyposis coli (APC) tumor suppressor protein, and a number of other proteins. ARM has been implicated in mediating protein-protein interactions, but no common features among the target proteins recognized by the ARM repeats have been identified; related to the HEAT domain; three consecutive copies of the repeat are represented by this alignment model." 
Q#27817 - CGI_10028749 superfamily 248030 100 467 2.17E-48 174.869 cl17476 Glyco_transf_7C superfamily N - "N-terminal domain of galactosyltransferase; This is the N-terminal domain of a family of galactosyltransferases from a wide range of Metazoa with three related galactosyltransferases activities, all three of which are possessed by one sequence in some cases. EC:2.4.1.90, N-acetyllactosamine synthase; EC:2.4.1.38, Beta-N-acetylglucosaminyl-glycopeptide beta-1,4- galactosyltransferase; and EC:2.4.1.22 Lactose synthase. Note that N-acetyllactosamine synthase is a component of Lactose synthase along with alpha-lactalbumin, in the absence of alpha-lactalbumin EC:2.4.1.90 is the catalyzed reaction." Q#27818 - CGI_10028750 superfamily 247038 295 397 4.72E-53 182.904 cl15674 IPT superfamily - - "Immunoglobulin-like fold, Plexins, Transcription factors (IPT). IPTs are also known as Transcription factor ImmunoGlobin (TIG) domains. They are present in intracellular transcription factors, cell surface receptors (such as plexins and scatter factor receptors), as well as, cyclodextrin glycosyltransferase and similar enzymes. Although they are involved in DNA binding in transcription factors, their function in other proteins is unknown. In these transcription factors, IPTs form homo- or heterodimers with the exception of the nuclear factor of activated Tcells (NFAT) transcription factors which are mainly monomers." Q#27818 - CGI_10028750 superfamily 243072 779 910 8.10E-29 114.018 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." 
Q#27818 - CGI_10028750 superfamily 243072 1064 1187 1.05E-27 110.936 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#27818 - CGI_10028750 superfamily 243072 855 991 1.07E-23 99.3802 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#27818 - CGI_10028750 superfamily 243072 1132 1268 1.07E-23 99.3802 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#27818 - CGI_10028750 superfamily 243072 717 838 2.05E-19 87.0538 cl02529 ANK superfamily - - "ankyrin repeats; ankyrin repeats mediate protein-protein interactions in very diverse families of proteins. The number of ANK repeats in a protein can range from 2 to over 20 (ankyrins, for example). ANK repeats may occur in combinations with other types of domains. 
The structural repeat unit contains two antiparallel helices and a beta-hairpin, repeats are stacked in a superhelical arrangement; this alignment contains 4 consecutive repeats." Q#27818 - CGI_10028750 superfamily 246680 1341 1415 1.12E-16 77.2904 cl14633 DD_superfamily superfamily - - "The Death Domain Superfamily of protein-protein interaction domains; The Death Domain (DD) superfamily includes the DD, Pyrin, CARD (Caspase activation and recruitment domain) and DED (Death Effector Domain) families. DDs are protein-protein interaction domains found in a variety of domain architectures. Their common feature is that they form homodimers by self-association or heterodimers by associating with other members of the DD superfamily. They serve as adaptors in signaling pathways and can recruit other proteins into signaling complexes. They are prominent components of the programmed cell death (apoptosis) pathway and are found in a number of other signaling pathways including those that impact innate immunity, inflammation, differentiation, and cancer." Q#27818 - CGI_10028750 superfamily 208843 97 288 1.26E-73 245.08 cl08275 RHD-n superfamily - - "N-terminal sub-domain of the Rel homology domain (RHD); Proteins containing the Rel homology domain (RHD) are metazoan transcription factors. The RHD is composed of two structural sub-domains; this model characterizes the N-terminal sub-domain, which may be distantly related to the DNA-binding domain found in P53. The C-terminal sub-domain has an immunoglobulin-like fold and serves as a dimerization module that also binds DNA (see cd00102). The RHD is found in NF-kappa B, nuclear factor of activated T-cells (NFAT), the tonicity-responsive enhancer binding protein (TonEBP), and the arthropod proteins Dorsal and Relish (Rel)." 
Q#27820 - CGI_10028752 superfamily 241971 4 264 1.33E-92 276.715 cl00599 Extradiol_Dioxygenase_3B_like superfamily - - "Subunit B of Class III Extradiol ring-cleavage dioxygenases; Dioxygenases catalyze the incorporation of both atoms of molecular oxygen into substrates using a variety of reaction mechanisms, resulting in the cleavage of aromatic rings. Two major groups of dioxygenases have been identified according to the cleavage site of the aromatic ring. Intradiol enzymes cleave the aromatic ring between two hydroxyl groups, whereas extradiol enzymes cleave the aromatic ring between a hydroxylated carbon and an adjacent non-hydroxylated carbon. Extradiol dioxygenases can be further divided into three classes. Class I and II enzymes are evolutionary related and show sequence similarity, with the two-domain class II enzymes evolving from the class I enzyme through gene duplication. Class III enzymes are different in sequence and structure and usually have two subunits, designated A and B. This model represents the catalytic subunit B of extradiol dioxygenase class III enzymes. Enzymes belonging to this family include Protocatechuate 4,5-dioxygenase (LigAB), 2'-aminobiphenyl-2,3-diol 1,2-dioxygenase (CarB), 4,5-DOPA Dioxygenase, 2,3-dihydroxyphenylpropionate 1,2-dioxygenase, and 3,4-dihydroxyphenylacetate (homoprotocatechuate) 2,3-dioxygenase (HPCD). There are also some family members that do not show the typical dioxygenase activity." Q#27822 - CGI_10028754 superfamily 207690 570 593 5.73E-07 46.9273 cl02656 zf-RanBP superfamily - - Zn-finger in Ran binding protein and others; Zn-finger in Ran binding protein and others. Q#27829 - CGI_10028761 superfamily 150162 43 112 7.37E-20 79.4949 cl09646 FOP_dimer superfamily - - FOP N terminal dimerisation domain; Fibroblast growth factor receptor 1 (FGFR1) oncogene partner (FOP) is a centrosomal protein that is involved in anchoring microtubules to subcellular structures. This domain includes a Lis-homology motif. 
It forms an alpha helical bundle and is involved in dimerisation. Q#27830 - CGI_10028762 superfamily 215754 9 102 5.77E-19 79.2196 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#27830 - CGI_10028762 superfamily 215754 102 192 1.05E-14 67.2784 cl02813 Mito_carr superfamily - - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#27830 - CGI_10028762 superfamily 215754 196 255 6.04E-09 51.4852 cl02813 Mito_carr superfamily C - Mitochondrial carrier protein; Mitochondrial carrier protein. Q#27832 - CGI_10028764 superfamily 241797 37 284 2.56E-80 248.313 cl00337 UbiA superfamily - - 4-hydroxybenzoate polyprenyltransferase and related prenyltransferases [Coenzyme metabolism] Q#27833 - CGI_10028766 superfamily 247724 26 225 6.06E-68 209.695 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. 
Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#27834 - CGI_10028767 superfamily 247724 27 226 2.42E-70 215.858 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#27835 - CGI_10028768 superfamily 247724 27 229 2.50E-70 215.858 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. 
This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#27837 - CGI_10028770 superfamily 241641 276 335 6.46E-15 68.6445 cl00150 TY superfamily - - Thyroglobulin type I repeats.; The N-terminal region of human thyroglobulin contains 11 type-1 repeats TY repeats are proposed to be inhibitors of cysteine proteases Q#27837 - CGI_10028770 superfamily 238155 159 276 2.49E-18 78.9558 cl08547 SPARC_EC superfamily - - "SPARC_EC; extracellular Ca2+ binding domain (containing 2 EF-hand motifs) of SPARC and related proteins (QR1, SC1/hevin, testican and tsc-36/FRP). SPARC (BM-40) is a multifunctional glycoprotein, a matricellular protein, that functions to regulate cell-matrix interactions; binds to such proteins as collagen and vitronectin and binds to endothelial cells thus inhibiting cellular proliferation. The EC domain interacts with a follistatin-like (FS) domain which appears to stabilize Ca2+ binding. The two EF-hands interact canonically but their conserved disulfide bonds confer a tight association between the EF-hand pair and an acid/amphiphilic N-terminal helix. Proposed active form involves a Ca2+ dependent symmetric homodimerization of EC-FS modules." 
Q#27839 - CGI_10028772 superfamily 241591 300 378 2.08E-07 48.3863 cl00073 H15 superfamily - - "linker histone 1 and histone 5 domains; the basic subunit of chromatin is the nucleosome, consisting of an octamer of core histones, two full turns of DNA, a linker histone (H1 or H5) and a variable length of linker DNA; H1/H5 are chromatin-associated proteins that bind to the exterior of nucleosomes and dramatically stabilize the highly condensed states of chromatin fibers; stabilization of higher order folding occurs through electrostatic neutralization of the linker DNA segments, through a highly positively charged carboxy- terminal domain known as the AKP helix (Ala, Lys, Pro); thought to be involved in specific protein-protein and protein-DNA interactions and play a role in suppressing core histone tail domain acetylation in the chromatin fiber" Q#27841 - CGI_10028774 superfamily 247723 1453 1532 6.98E-32 121.712 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." 
Q#27842 - CGI_10028775 superfamily 241677 27 166 2.79E-81 240.237 cl00197 cyclophilin superfamily - - "cyclophilin: cyclophilin-type peptidylprolyl cis-trans isomerases. This family contains eukaryotic, bacterial and archaeal proteins which exhibit peptidylprolyl cis-trans isomerase activity (PPIase, Rotamase) and in addition bind the immunosuppressive drug cyclosporin (CsA). Immunosuppression in vertebrates is believed to be the result of the cyclophilin A-cyclosporin protein drug complex binding to and inhibiting the protein-phosphatase calcineurin. PPIase is an enzyme which accelerates protein folding by catalyzing the cis-trans isomerization of the peptide bonds preceding proline residues. Cyclophilins are a diverse family in terms of function and have been implicated in protein folding processes which depend on catalytic/chaperone-like activities. This group contains human cyclophilin 40, a co-chaperone of the hsp90 chaperone system; human cyclophilin A, a chaperone in the HIV-1 infectious process; and human cyclophilin H, a component of the U4/U6 snRNP, whose isomerization or chaperoning activities may play a role in RNA splicing." Q#27843 - CGI_10028776 superfamily 247058 40 173 1.03E-39 140.774 cl15762 crotonase-like superfamily C - "Crotonase/Enoyl-Coenzyme A (CoA) hydratase superfamily. This superfamily contains a diverse set of enzymes including enoyl-CoA hydratase, naphthoate synthase, methylmalonyl-CoA decarboxylase, 3-hydroxybutyryl-CoA dehydratase, and dienoyl-CoA isomerase. Many of these play important roles in fatty acid metabolism. In addition to a conserved structural core and the formation of trimers (or dimers of trimers), a common feature in this superfamily is the stabilization of an enolate anion intermediate derived from an acyl-CoA substrate. This is accomplished by two conserved backbone NH groups in active sites that form an oxyanion hole."
Q#27843 - CGI_10028776 superfamily 247058 177 312 5.39E-38 136.151 cl15762 crotonase-like superfamily N - "Crotonase/Enoyl-Coenzyme A (CoA) hydratase superfamily. This superfamily contains a diverse set of enzymes including enoyl-CoA hydratase, naphthoate synthase, methylmalonyl-CoA decarboxylase, 3-hydroxybutyryl-CoA dehydratase, and dienoyl-CoA isomerase. Many of these play important roles in fatty acid metabolism. In addition to a conserved structural core and the formation of trimers (or dimers of trimers), a common feature in this superfamily is the stabilization of an enolate anion intermediate derived from an acyl-CoA substrate. This is accomplished by two conserved backbone NH groups in active sites that form an oxyanion hole." Q#27844 - CGI_10028777 superfamily 243035 134 246 4.31E-15 71.1117 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration.
CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#27845 - CGI_10028778 superfamily 244265 270 479 3.10E-107 320.365 cl05973 FAM20_C_like superfamily - - "C-terminal putative kinase domain of FAM20 (family with sequence similarity 20), Drosophila Four-jointed (Fj), and related proteins; Drosophila Fj is a Golgi kinase that phosphorylates Ser or Thr residues within extracellular cadherin domains of a transmembrane receptor Fat and its ligand, Dachsous (Ds). The Fat signaling pathway regulates growth, gene expression, and planar cell polarity (PCP). 
Defects from mutation in the Drosophila fj gene include loss of the intermediate leg joint, and a PCP defect in the eye. Fjx1, the murine homologue of Fj, has been shown to be involved in both the Fat and Hippo signaling pathways, these two pathways intersect at multiple points. The Hippo pathway is important in organ size control and in cancer. FAM20B is a xylose kinase that may regulate the number of glycosaminoglycan chains by phosphorylating the xylose residue in the glycosaminoglycan-protein linkage region of proteoglycans. This domain has homology to a kinase-active site, mutation of three conserved Asp residues at the Drosophila Fj putative active site abolished its ability to phosphorylate Ft and Ds cadherin domains. FAM20A may participate in enamel development and gingival homeostasis, FAM20B in proteoglycan production, and FAM20C in bone development. FAM20C, also called Dentin Matrix Protein 4, is abundant in the dentin matrix, and may participate in the differentiation of mesenchymal precursor cells into functional odontoblast-like cells. Mutations in FAM20C are associated with lethal Osteosclerotic Bone Dysplasia (Raine Syndrome), and mutations in FAM20A with Amelogenesis imperfecta (AI) and Gingival Hyperplasia Syndrome. This model includes the FAM20_C domain family, previously known as DUF1193; FAM20_C appears to be homologous to the catalytic domain of the phosphoinositide 3-kinase (PI3K)-like family." Q#27847 - CGI_10028780 superfamily 221663 878 1069 2.25E-38 143.206 cl14612 TFCD_C superfamily - - "Tubulin folding cofactor D C terminal; This domain family is found in eukaryotes, and is typically between 182 and 199 amino acids in length. The family is found in association with pfam02985. There is a single completely conserved residue R that may be functionally important. 
Tubulin folding cofactor D does not co-polymerise with microtubules either in vivo or in vitro, but instead modulates microtubule dynamics by sequestering beta-tubulin from GTP-bound alphabeta-heterodimers in microtubules." Q#27848 - CGI_10028781 superfamily 247724 5 171 3.83E-88 273.444 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#27848 - CGI_10028781 superfamily 247724 416 582 1.26E-61 204.015 cl17170 Ras_like_GTPase superfamily - - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. 
The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#27848 - CGI_10028781 superfamily 192013 220 302 5.37E-37 132.689 cl07101 EF_assoc_2 superfamily - - EF hand associated; This region predominantly appears near EF-hands (pfam00036) in GTP-binding proteins. It is found in all three eukaryotic kingdoms. Q#27848 - CGI_10028781 superfamily 192012 342 414 1.26E-30 115.032 cl07100 EF_assoc_1 superfamily - - "EF hand associated; This region typically appears on the C-terminus of EF hands in GTP-binding proteins such as Arht/Rhot (may be involved in mitochondrial homeostasis and apoptosis). The EF hand associated region is found in yeast, vertebrates and plants." 
Q#27849 - CGI_10028782 superfamily 243092 43 335 5.35E-25 101.643 cl02567 WD40 superfamily - - "WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel b-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment." Q#27852 - CGI_10028785 superfamily 199575 3 88 8.10E-40 138.897 cl15439 BTG superfamily N - BTG family; BTG family. Q#27853 - CGI_10028786 superfamily 245836 518 736 2.84E-148 439.684 cl12015 Adenylation_DNA_ligase_like superfamily - - "Adenylation domain of proteins similar to ATP-dependent polynucleotide ligases; ATP-dependent polynucleotide ligases catalyze the phosphodiester bond formation of nicked nucleic acid substrates using ATP as a cofactor in a three step reaction mechanism. This family includes ATP-dependent DNA and RNA ligases. DNA ligases play a vital role in the diverse processes of DNA replication, recombination and repair. 
ATP-dependent DNA ligases have a highly modular architecture, consisting of a unique arrangement of two or more discrete domains, including a DNA-binding domain, an adenylation or nucleotidyltransferase (NTase) domain, and an oligonucleotide/oligosaccharide binding (OB)-fold domain. The adenylation domain binds ATP and contains many active site residues. Together with the C-terminal OB-fold domain, it comprises a catalytic core unit that is common to most members of the ATP-dependent DNA ligase family. The catalytic core contains six conserved sequence motifs (I, III, IIIa, IV, V and VI) that define this family of related nucleotidyltransferases including eukaryotic GRP-dependent mRNA-capping enzymes. The catalytic core contains both the active site as well as many DNA-binding residues. The RNA circularization protein from archaea and bacteria contains the minimal catalytic unit, the adenylation domain, but does not contain an OB-fold domain. This family also includes the m3G-cap binding domain of snurportin, a nuclear import adaptor that binds m3G-capped spliceosomal U small nucleoproteins (snRNPs), but doesn't have enzymatic activity." Q#27853 - CGI_10028786 superfamily 244947 741 886 4.99E-89 281.288 cl08424 OBF_DNA_ligase_family superfamily - - "The Oligonucleotide/oligosaccharide binding (OB)-fold domain is a DNA-binding module that is part of the catalytic core unit of ATP dependent DNA ligases; ATP-dependent polynucleotide ligases catalyze phosphodiester bond formation using nicked nucleic acid substrates with the high energy nucleotide of ATP as a cofactor in a three step reaction mechanism. DNA ligases play a vital role in the diverse processes of DNA replication, recombination and repair. 
ATP dependent DNA ligases have a highly modular architecture consisting of a unique arrangement of two or more discrete domains including a DNA-binding domain, an adenylation (nucleotidyltransferase (NTase)) domain, and an oligonucleotide/oligosaccharide binding (OB)-fold domain. The adenylation and C-terminal OB-fold domains comprise a catalytic core unit that is common to most members of the ATP-dependent DNA ligase family. The catalytic core unit contains six conserved sequence motifs (I, III, IIIa, IV, V and VI) that define this family of related nucleotidyltransferases. The OB-fold domain contacts the nicked DNA substrate and is required for the ATP-dependent DNA ligase nucleotidylation step. The RxDK motif (motif VI), which is essential for ATP hydrolysis, is located in the OB-fold domain." Q#27853 - CGI_10028786 superfamily 193687 303 371 2.00E-24 98.8039 cl00160 LbetaH superfamily - - "Left-handed parallel beta-Helix (LbetaH or LbH) domain: The alignment contains 5 turns, each containing three imperfect tandem repeats of a hexapeptide repeat motif (X-[STAV]-X-[LIV]-[GAED]-X). Proteins containing hexapeptide repeats are often enzymes showing acyltransferase activity, however, some subfamilies in this hierarchy also show activities related to ion transport or translation initiation. Many are trimeric in their active forms." Q#27853 - CGI_10028786 superfamily 245596 4 159 1.54E-40 149.346 cl11394 Glyco_tranf_GTA_type superfamily N - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. 
To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#27854 - CGI_10028787 superfamily 242575 8 78 0.000806803 34.905 cl01548 YccV-like superfamily N - Hemimethylated DNA-binding protein YccV like; YccV is a hemimethylated DNA binding protein which has been shown to regulate dnaA gene expression. The structure of one of the hypothetical proteins in this family has been solved and it forms a beta sheet structure with a terminating alpha helix. Q#27855 - CGI_10028788 superfamily 245201 918 1005 1.40E-18 86.8216 cl09925 PKc_like superfamily C - "Protein Kinases, catalytic domain; The protein kinase superfamily is mainly composed of the catalytic domains of serine/threonine-specific and tyrosine-specific protein kinases. It also includes RIO kinases, which are atypical serine protein kinases, aminoglycoside phosphotransferases, and choline kinases. These proteins catalyze the transfer of the gamma-phosphoryl group from ATP to hydroxyl groups in specific substrates such as serine, threonine, or tyrosine residues of proteins." Q#27856 - CGI_10028789 superfamily 241782 5 72 0.00549735 34.6301 cl00321 AAT_I superfamily C - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. 
PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phosphorylase family (fold type V)." Q#27857 - CGI_10028790 superfamily 241563 97 138 1.57E-05 42.8516 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#27859 - CGI_10028792 superfamily 241563 65 106 4.99E-05 41.3108 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction."
Q#27861 - CGI_10028795 superfamily 241567 16 93 1.05E-05 43.7432 cl00042 CASc superfamily NC - "Caspase, interleukin-1 beta converting enzyme (ICE) homologues; Cysteine-dependent aspartate-directed proteases that mediate programmed cell death (apoptosis). Caspases are synthesized as inactive zymogens and activated by proteolysis of the peptide backbone adjacent to an aspartate. The resulting two subunits associate to form an (alpha)2(beta)2-tetramer which is the active enzyme. Activation of caspases can be mediated by other caspase homologs." Q#27862 - CGI_10028796 superfamily 241567 358 488 3.79E-08 52.9879 cl00042 CASc superfamily N - "Caspase, interleukin-1 beta converting enzyme (ICE) homologues; Cysteine-dependent aspartate-directed proteases that mediate programmed cell death (apoptosis). Caspases are synthesized as inactive zymogens and activated by proteolysis of the peptide backbone adjacent to an aspartate. The resulting two subunits associate to form an (alpha)2(beta)2-tetramer which is the active enzyme. Activation of caspases can be mediated by other caspase homologs." Q#27863 - CGI_10028797 superfamily 216712 20 222 1.26E-74 233.296 cl03360 LIM_bind superfamily - - "LIM-domain binding protein; The LIM-domain binding protein, binds to the LIM domain pfam00412 of LIM homeodomain proteins which are transcriptional regulators of development. Nuclear LIM interactor (NLI) / LIM domain-binding protein 1 (LDB1) is located in the nuclei of neuronal cells during development, it is co-expressed with Isl1 in early motor neuron differentiation and has a suggested role in the Isl1 dependent development of motor neurons. It is suggested that these proteins act synergistically to enhance transcriptional efficiency by acting as co-factors for LIM homeodomain and Otx class transcription factors both of which have essential roles in development. The Drosophila protein Chip is required for segmentation and activity of a remote wing margin enhancer. 
Chip is a ubiquitous chromosomal factor required for normal expression of diverse genes at many stages of development. It is suggested that Chip cooperates with different LIM domain proteins and other factors to structurally support remote enhancer-promoter interactions." Q#27864 - CGI_10028799 superfamily 218497 15 120 1.02E-19 87.2819 cl04984 COMPASS-Shg1 superfamily - - "COMPASS (Complex proteins associated with Set1p) component shg1; The Shg1 subunit is one of the eight subunits of the COMPASS complex, complex associated with SET1, conserved in yeasts and in other eukaryotes up to humans. It is associated with the region of the Set1 protein that is N-terminal to the C-terminus, ie Set1-560-900. The function of Shg1 seems to be to slightly inhibit histone 3 lysine 4 (H3K4) di- and tri-methylation, and it is a pioneer protein. The COMPASS complex functions to methylate the fourth lysine of Histone 3 and for silencing of genes close to the telomeres of chromosomes." Q#27865 - CGI_10028800 superfamily 243077 7 59 3.36E-14 67.5705 cl02542 DnaJ superfamily - - "DnaJ domain or J-domain. DnaJ/Hsp40 (heat shock protein 40) proteins are highly conserved and play crucial roles in protein translation, folding, unfolding, translocation, and degradation. They act primarily by stimulating the ATPase activity of Hsp70s, an important chaperonine family. Hsp40 proteins are characterized by the presence of a J domain, which mediates the interaction with Hsp70. They may contain other domains as well, and the architectures provide a means of classification." 
Q#27865 - CGI_10028800 superfamily 243034 183 260 1.35E-12 64.3236 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#27865 - CGI_10028800 superfamily 247723 353 420 1.05E-08 52.3073 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins.
RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#27867 - CGI_10028802 superfamily 241599 139 197 2.36E-20 82.6764 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#27870 - CGI_10028805 superfamily 247727 81 178 6.44E-06 42.8023 cl17173 AdoMet_MTases superfamily - - "S-adenosylmethionine-dependent methyltransferases (SAM or AdoMet-MTase), class I; AdoMet-MTases are enzymes that use S-adenosyl-L-methionine (SAM or AdoMet) as a substrate for methyltransfer, creating the product S-adenosyl-L-homocysteine (AdoHcy). There are at least five structurally distinct families of AdoMet-MTases, class I being the largest and most diverse. Within this class enzymes can be classified by different substrate specificities (small molecules, lipids, nucleic acids, etc.) and different target atoms for methylation (nitrogen, oxygen, carbon, sulfur, etc.)." Q#27871 - CGI_10028806 superfamily 203013 590 617 1.06E-07 49.1614 cl04519 zf-HIT superfamily - - HIT zinc finger; This presumed zinc finger contains up to 6 cysteine residues that could coordinate zinc. The domain is named after the HIT protein. This domain is also found in the Thyroid receptor interacting protein 3 (TRIP-3) that specifically interacts with the ligand binding domain of the thyroid receptor. 
Q#27871 - CGI_10028806 superfamily 241638 218 293 0.000536806 39.2504 cl00147 TNF superfamily C - "Tumor Necrosis Factor; TNF superfamily members include the cytokines: TNF (TNF-alpha), LT (lymphotoxin-alpha, TNF-beta), CD40 ligand, Apo2L (TRAIL), Fas ligand, and osteoprotegerin (OPG) ligand. These proteins generally have an intracellular N-terminal domain, a short transmembrane segment, an extracellular stalk, and a globular TNF-like extracellular domain of about 150 residues. They initiate apoptosis by binding to related receptors, some of which have intracellular death domains. They generally form homo- or hetero- trimeric complexes. TNF cytokines bind one elongated receptor molecule along each of three clefts formed by neighboring monomers of the trimer with ligand trimerization a requisite for receptor binding." Q#27872 - CGI_10028807 superfamily 241638 136 279 0.00038333 38.48 cl00147 TNF superfamily - - "Tumor Necrosis Factor; TNF superfamily members include the cytokines: TNF (TNF-alpha), LT (lymphotoxin-alpha, TNF-beta), CD40 ligand, Apo2L (TRAIL), Fas ligand, and osteoprotegerin (OPG) ligand. These proteins generally have an intracellular N-terminal domain, a short transmembrane segment, an extracellular stalk, and a globular TNF-like extracellular domain of about 150 residues. They initiate apoptosis by binding to related receptors, some of which have intracellular death domains. They generally form homo- or hetero- trimeric complexes. TNF cytokines bind one elongated receptor molecule along each of three clefts formed by neighboring monomers of the trimer with ligand trimerization a requisite for receptor binding." Q#27873 - CGI_10028808 superfamily 238191 221 334 9.74E-09 56.1864 cl18907 Esterase_lipase superfamily C - "Esterases and lipases (includes fungal lipases, cholinesterases, etc.) These enzymes act on carboxylic esters (EC: 3.1.1.-).
The catalytic apparatus involves three residues (catalytic triad): a serine, a glutamate or aspartate and a histidine.These catalytic residues are responsible for the nucleophilic attack on the carbonyl carbon atom of the ester bond. In contrast with other alpha/beta hydrolase fold family members, p-nitrobenzyl esterase and acetylcholine esterase have a Glu instead of Asp at the active site carboxylate." Q#27873 - CGI_10028808 superfamily 242244 77 169 0.00646632 37.4033 cl01002 DUF808 superfamily NC - Protein of unknown function (DUF808); This family consists of several bacterial proteins of unknown function. Q#27874 - CGI_10028809 superfamily 248458 27 203 8.16E-07 49.2345 cl17904 MFS superfamily C - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. 
Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#27875 - CGI_10028810 superfamily 241599 118 176 7.94E-25 95.388 cl00084 homeodomain superfamily - - "Homeodomain; DNA binding domains involved in the transcriptional regulation of key eukaryotic developmental processes; may bind to DNA as monomers or as homo- and/or heterodimers, in a sequence-specific manner." Q#27876 - CGI_10028811 superfamily 246925 339 531 9.30E-17 81.2477 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#27876 - CGI_10028811 superfamily 246925 589 734 1.21E-10 62.7582 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." 
Q#27876 - CGI_10028811 superfamily 247723 871 933 0.000357013 39.9422 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety of heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#27877 - CGI_10028812 superfamily 241810 981 1031 2.37E-12 64.073 cl00354 KOW superfamily - - "KOW: an acronym for the authors' surnames (Kyrpides, Ouzounis and Woese); KOW domain is known as an RNA-binding motif that is shared so far among some families of ribosomal proteins, the essential bacterial transcriptional elongation factor NusG, the eukaryotic chromatin elongation factor Spt5, the higher eukaryotic KIN17 proteins and Mtr4. The KOW motif contains an invariant glycine residue and comprises alternating blocks of hydrophilic and hydrophobic residues." Q#27879 - CGI_10028814 superfamily 219852 61 547 1.05E-06 51.237 cl15656 Sfi1 superfamily - - Sfi1 spindle body protein; This is a family of fungal spindle pole body proteins that play a role in spindle body duplication. They contain binding sites for calmodulin-like proteins called centrins which are present in microtubule-organising centres.
Q#27880 - CGI_10028815 superfamily 222269 7 237 3.73E-41 149.781 cl18657 Cupin_8 superfamily - - Cupin-like domain; This cupin like domain shares similarity to the JmjC domain. Q#27881 - CGI_10028816 superfamily 247856 8 57 0.000160562 36.7569 cl17302 EFh superfamily N - "EF-hand, calcium binding motif; A diverse superfamily of calcium sensors and calcium signal modulators; most examples in this alignment model have 2 active canonical EF hands. Ca2+ binding induces a conformational change in the EF-hand motif, leading to the activation or inactivation of target proteins. EF-hands tend to occur in pairs or higher copy numbers." Q#27883 - CGI_10028818 superfamily 241597 185 250 3.99E-20 85.7483 cl00082 HMG-box superfamily - - "High Mobility Group (HMG)-box is found in a variety of eukaryotic chromosomal proteins and transcription factors. HMGs bind to the minor groove of DNA and have been classified by DNA binding preferences. Two phylogenically distinct groups of Class I proteins bind DNA in a sequence specific fashion and contain a single HMG box. One group (SOX-TCF) includes transcription factors, TCF-1, -3, -4; and also SRY and LEF-1, which bind four-way DNA junctions and duplex DNA targets. The second group (MATA) includes fungal mating type gene products MC, MATA1 and Ste11. Class II and III proteins (HMGB-UBF) bind DNA in a non-sequence specific fashion and contain two or more tandem HMG boxes. Class II members include non-histone chromosomal proteins, HMG1 and HMG2, which bind to bent or distorted DNA such as four-way DNA junctions, synthetic DNA cruciforms, kinked cisplatin-modified DNA, DNA bulges, cross-overs in supercoiled DNA, and can cause looping of linear DNA. Class III members include nucleolar and mitochondrial transcription factors, UBF and mtTF1, which bind four-way DNA junctions." Q#27883 - CGI_10028818 superfamily 248469 464 574 3.22E-09 55.4539 cl17915 HAD_like superfamily - - "Haloacid dehalogenase-like hydrolases. 
The haloacid dehalogenase-like (HAD) superfamily includes L-2-haloacid dehalogenase, epoxide hydrolase, phosphoserine phosphatase, phosphomannomutase, phosphoglycolate phosphatase, P-type ATPase, and many others, all of which use a nucleophilic aspartate in their phosphoryl transfer reaction. All members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. Members of this superfamily are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases." Q#27883 - CGI_10028818 superfamily 248469 617 682 2.53E-08 52.7575 cl17915 HAD_like superfamily N - "Haloacid dehalogenase-like hydrolases. The haloacid dehalogenase-like (HAD) superfamily includes L-2-haloacid dehalogenase, epoxide hydrolase, phosphoserine phosphatase, phosphomannomutase, phosphoglycolate phosphatase, P-type ATPase, and many others, all of which use a nucleophilic aspartate in their phosphoryl transfer reaction. All members possess a highly conserved alpha/beta core domain, and many also possess a small cap domain, the fold and function of which is variable. Members of this superfamily are sometimes referred to as belonging to the DDDD superfamily of phosphohydrolases." Q#27884 - CGI_10028819 superfamily 243836 3 128 5.24E-51 160.539 cl04661 Polysacc_synt_4 superfamily - - Polysaccharide biosynthesis; This family of proteins plays a role in xylan biosynthesis in plant cell walls. Its precise role in xylan biosynthesis is unknown. Its function in other organisms is unknown. Q#27885 - CGI_10028820 superfamily 243129 308 419 2.98E-26 101.947 cl02653 MA3 superfamily - - "MA3 domain; Domain in DAP-5, eIF4G, MA-3 and other proteins. Highly alpha-helical. May contain repeats and/or regions similar to MIF4G domains." Q#27885 - CGI_10028820 superfamily 243129 148 253 7.42E-20 84.6037 cl02653 MA3 superfamily - - "MA3 domain; Domain in DAP-5, eIF4G, MA-3 and other proteins. Highly alpha-helical. 
May contain repeats and/or regions similar to MIF4G domains." Q#27887 - CGI_10021135 superfamily 242915 13 254 3.56E-58 187.192 cl02164 Utp11 superfamily - - "Utp11 protein; This protein is found to be part of a large ribonucleoprotein complex containing the U3 snoRNA. Depletion of the Utp proteins impedes production of the 18S rRNA, indicating that they are part of the active pre-rRNA processing complex. This large RNP complex has been termed the small subunit (SSU) processome." Q#27888 - CGI_10021136 superfamily 248012 85 197 6.75E-17 75.692 cl17458 TIR_2 superfamily - - TIR domain; This is a family of bacterial Toll-like receptors. Q#27889 - CGI_10021137 superfamily 243035 39 148 7.96E-26 95.7645 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. 
Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#27893 - CGI_10021141 superfamily 222313 78 118 1.69E-11 60.2834 cl18662 Methyltransf_32 superfamily N - Methyltransferase domain; This family appears to be a methyltransferase domain. Q#27894 - CGI_10021142 superfamily 247724 3 156 1.04E-47 158.853 cl17170 Ras_like_GTPase superfamily N - "Rat sarcoma (Ras)-like superfamily of small guanosine triphosphatases (GTPases); Ras-like GTPase superfamily. The Ras-like superfamily of small GTPases consists of several families with an extremely high degree of structural and functional similarity. 
The Ras superfamily is divided into at least four families in eukaryotes: the Ras, Rho, Rab, and Sar1/Arf families. This superfamily also includes proteins like the GTP translation factors, Era-like GTPases, and G-alpha chain of the heterotrimeric G proteins. Members of the Ras superfamily regulate a wide variety of cellular functions: the Ras family regulates gene expression, the Rho family regulates cytoskeletal reorganization and gene expression, the Rab and Sar1/Arf families regulate vesicle trafficking, and the Ran family regulates nucleocytoplasmic transport and microtubule organization. The GTP translation factor family regulates initiation, elongation, termination, and release in translation, and the Era-like GTPase family regulates cell division, sporulation, and DNA replication. Members of the Ras superfamily are identified by the GTP binding site, which is made up of five characteristic sequence motifs, and the switch I and switch II regions." Q#27895 - CGI_10021143 superfamily 241571 601 704 1.01E-07 51.6443 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#27895 - CGI_10021143 superfamily 238012 525 574 2.36E-07 49.6602 cl11390 EGF_Lam superfamily - - "Laminin-type epidermal growth factor-like domain; laminins are the major noncollagenous components of basement membranes that mediate cell adhesion, growth migration, and differentiation; the laminin-type epidermal growth factor-like module occurs in tandem arrays; the domain contains 4 disulfide bonds (loops a-d) the first three resemble epidermal growth factor (EGF); the number of copies of this domain in the different forms of laminins is highly variable ranging from 3 up to 22 copies" Q#27895 - CGI_10021143 superfamily 243146 790 838 1.29E-06 47.2839 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. 
" Q#27895 - CGI_10021143 superfamily 205157 383 421 1.34E-05 44.0655 cl18264 EGF_3 superfamily - - EGF domain; This family includes a variety of EGF-like domain homologues. This family includes the C-terminal domain of the malaria parasite MSP1 protein. Q#27895 - CGI_10021143 superfamily 243146 845 896 0.00795086 36.1131 cl02701 Kelch_3 superfamily - - "Galactose oxidase, central domain; Galactose oxidase, central domain. " Q#27896 - CGI_10021144 superfamily 241600 1 44 0.00315069 36.0643 cl00085 FReD superfamily C - "Fibrinogen-related domains (FReDs); C terminal globular domain of fibrinogen. Fibrinogen is involved in blood clotting, being activated by thrombin to assemble into fibrin clots. The N-termini of 2 times 3 chains come together to form a globular arrangement called the disulfide knot. The C termini of fibrinogen chains end in globular domains, which are not completely equivalent. C terminal globular domains of the gamma chains (C-gamma) dimerize and bind to the GPR motif of the N-terminal domain of the alpha chain, while the GHR motif of N-terminal domain of the beta chain binds to the C terminal globular domains of another beta chain (C-beta), which leads to lattice formation." Q#27897 - CGI_10021145 superfamily 217311 47 518 9.90E-101 317.741 cl18402 DUF229 superfamily - - Protein of unknown function (DUF229); Members of this family are uncharacterized. They are 500-1200 amino acids in length and share a long region conservation that probably corresponds to several domains. The Go annotation for the protein indicates that it is involved in nematode larval development and has a positive regulation on growth rate. Q#27898 - CGI_10021146 superfamily 247858 276 403 6.77E-17 78.1986 cl17304 2OG-FeII_Oxy_3 superfamily N - 2OG-Fe(II) oxygenase superfamily; This family contains members of the 2-oxoglutarate (2OG) and Fe(II)-dependent oxygenase superfamily. 
Q#27898 - CGI_10021146 superfamily 247858 22 102 8.14E-05 40.8469 cl17304 2OG-FeII_Oxy_3 superfamily - - 2OG-Fe(II) oxygenase superfamily; This family contains members of the 2-oxoglutarate (2OG) and Fe(II)-dependent oxygenase superfamily. Q#27902 - CGI_10021150 superfamily 243035 78 148 2.24E-11 57.2445 cl02432 CLECT superfamily C - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platlet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. 
Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." Q#27904 - CGI_10021152 superfamily 214531 612 652 6.47E-06 44.1297 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#27904 - CGI_10021152 superfamily 214531 566 609 7.33E-06 44.1297 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#27905 - CGI_10021153 superfamily 241568 676 712 0.00217836 37.0572 cl00043 CCP superfamily N - "Complement control protein (CCP) modules (aka short consensus repeats SCRs or SUSHI repeats) have been identified in several proteins of the complement system; SUSHI repeats (short complement-like repeat, SCR) are abundant in complement control proteins. 
The complement control protein (CCP) modules (also known as short consensus repeats SCRs or SUSHI repeats) contain approximately 60 amino acid residues and have been identified in several proteins of the complement system. Typically, 2 to 4 modules contribute to a binding site, implying that the orientation of the modules to each other is critical for function." Q#27905 - CGI_10021153 superfamily 241578 38 72 5.04E-05 44.298 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#27905 - CGI_10021153 superfamily 214531 506 557 0.0017937 37.1961 cl18310 LY superfamily - - Low-density lipoprotein-receptor YWTD domain; Type "B" repeats in low-density lipoprotein (LDL) receptor that plays a central role in mammalian cholesterol metabolism. Also present in a variety of molecules similar to gp300/megalin. Q#27906 - CGI_10021154 superfamily 241643 341 371 0.000209272 38.9795 cl00153 UBA superfamily - - "Ubiquitin Associated domain. 
The UBA domain is a commonly occurring sequence motif in some members of the ubiquitination pathway, UV excision repair proteins, and certain protein kinases. Although its specific role is so far unknown, it has been suggested that UBA domains are involved in conferring protein target specificity. The domain, a compact three helix bundle, has a conserved GFP-loop and the proline is thought to be critical for binding. The UBA domain is distinct from the conserved three helical domain seen in the N-terminus of EF-TS and eukaryotic NAC proteins." Q#27906 - CGI_10021154 superfamily 241645 4 59 0.000106898 40.3715 cl00155 UBQ superfamily - - "Ubiquitin-like proteins; Ubiquitin homologs; Includes ubiquitin and ubiquitin-like proteins. Ubiquitin-mediated proteolysis is part of the regulated turnover of proteins required for controlling cell cycle progression. Other family members are protein modifiers that perform a wide range of functions. Ubiquitination usually results in a covalent bond between the C-terminus of ubiquitin and the epsilon-amino group of a substrate lysine. The three-step mechanism requires an activating enzyme (E1) that forms a thiol ester with the C-terminal carboxy group, a conjugating enzyme (E2) that transiently carries the activated ubiquitin molecule as a thiol ester, and a ligase (E3) that transfers the activated ubiquitin from the E2 to the substrate lysine residue. In poly-ubiquitination, ubiquitin itself is the substrate." Q#27906 - CGI_10021154 superfamily 241643 390 426 0.000208411 38.9707 cl00153 UBA superfamily - - "Ubiquitin Associated domain. The UBA domain is a commonly occurring sequence motif in some members of the ubiquitination pathway, UV excision repair proteins, and certain protein kinases. Although its specific role is so far unknown, it has been suggested that UBA domains are involved in conferring protein target specificity. 
The domain, a compact three helix bundle, has a conserved GFP-loop and the proline is thought to be critical for binding. The UBA domain is distinct from the conserved three helical domain seen in the N-terminus of EF-TS and eukaryotic NAC proteins." Q#27907 - CGI_10010600 superfamily 241563 86 120 0.00379263 35.7759 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#27908 - CGI_10010601 superfamily 215724 31 343 1.86E-147 421.257 cl14706 wnt superfamily - - "wnt family; Wnt genes have been identified in vertebrates and invertebrates but not in plants, unicellular eukaryotes or prokaryotes. In humans, 19 WNT proteins are known. Because of their insolubility little is known about Wnt protein structure, but all have 23 or 24 Cys residues whose spacing is highly conserved. Signal transduction by Wnt proteins (including the Wnt/beta-catenin, the Wnt/Ca++, and the Wnt/polarity pathway) is mediated by receptors of the Frizzled and LDL-receptor-related protein (LRP) families." Q#27912 - CGI_10010606 superfamily 207662 41 113 9.44E-31 114.196 cl02596 NR_DBD_like superfamily - - "DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers; DNA-binding domain of nuclear receptors is composed of two C4-type zinc fingers. Each zinc finger contains a group of four Cys residues which co-ordinates a single zinc atom. It interacts with a specific DNA site upstream of the target gene and modulates the rate of transcriptional initiation. Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions, from development, reproduction, to homeostasis and metabolism in animals (metazoans). 
The family contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a flexible hinge and a C-terminal ligand binding domain (LBD). Most nuclear receptors bind as homodimers or heterodimers to their target sites, which consist of two hexameric half-sites. Specificity is determined by the half-site sequence, the relative orientation of the half-sites and the number of spacer nucleotides between the half-sites. However, a growing number of nuclear receptors have been reported to bind to DNA as monomers." Q#27912 - CGI_10010606 superfamily 245599 363 521 1.05E-13 68.7887 cl11397 NR_LBD superfamily - - "The ligand binding domain of nuclear receptors, a family of ligand-activated transcription regulators; Ligand-binding domain (LBD) of nuclear receptor (NR): Nuclear receptors form a superfamily of ligand-activated transcription regulators, which regulate various physiological functions in metazoans, from development, reproduction, to homeostasis and metabolism. The superfamily contains not only receptors for known ligands but also orphan receptors for which ligands do not exist or have not been identified. The members of the family include receptors of steroids, thyroid hormone, retinoids, cholesterol by-products, lipids and heme. With few exceptions, NRs share a common structural organization with a central well conserved DNA binding domain (DBD), a variable N-terminal domain, a non-conserved hinge and a C-terminal ligand binding domain (LBD)." 
Q#27913 - CGI_10010607 superfamily 243034 391 490 1.40E-11 61.242 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#27913 - CGI_10010607 superfamily 222269 128 327 1.07E-15 75.823 cl18657 Cupin_8 superfamily - - Cupin-like domain; This cupin like domain shares similarity to the JmjC domain. Q#27915 - CGI_10010609 superfamily 243069 191 329 3.51E-05 42.74 cl02525 Band_7 superfamily - - "The band 7 domain of flotillin (reggie) like proteins. This group contains proteins similar to stomatin, prohibitin, flotillin, HflK/C and podocin. Many of these band 7 domain-containing proteins are lipid raft-associated. Individual proteins of this band 7 domain family may cluster to form membrane microdomains which may in turn recruit multiprotein complexes.
Microdomains formed from flotillin proteins may in addition be dynamic units with their own regulatory functions. Flotillins have been implicated in signal transduction, vesicle trafficking, cytoskeleton rearrangement and are known to interact with a variety of proteins. Stomatin interacts with and regulates members of the degenerin/epithelial Na+ channel family in mechanosensory cells of Caenorhabditis elegans and vertebrate neurons and participates in trafficking of Glut1 glucose transporters. Prohibitin may act as a chaperone for the stabilization of mitochondrial proteins. Prokaryotic HflK/C plays a role in the decision between lysogenic and lytic cycle growth during lambda phage infection. Flotillins have been implicated in the progression of prion disease, in the pathogenesis of neurodegenerative diseases such as Parkinson's and Alzheimer's disease and, in cancer invasion and metastasis. Mutations in the podocin gene give rise to autosomal recessive steroid resistant nephrotic syndrome" Q#27916 - CGI_10010610 superfamily 245206 15 105 1.80E-29 106.553 cl09931 NADB_Rossmann superfamily N - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction."
Q#27917 - CGI_10010611 superfamily 245206 46 136 2.99E-32 115.027 cl09931 NADB_Rossmann superfamily N - "Rossmann-fold NAD(P)(+)-binding proteins; A large family of proteins that share a Rossmann-fold NAD(P)H/NAD(P)(+) binding (NADB) domain. The NADB domain is found in numerous dehydrogenases of metabolic pathways such as glycolysis, and many other redox enzymes. NAD binding involves numerous hydrogen-bonds and van der Waals contacts, in particular H-bonding of residues in a turn between the first strand and the subsequent helix of the Rossmann-fold topology. Characteristically, this turn exhibits a consensus binding pattern similar to GXGXXG, in which the first 2 glycines participate in NAD(P)-binding, and the third facilitates close packing of the helix to the beta-strand. Typically, proteins in this family contain a second domain in addition to the NADB domain, which is responsible for specifically binding a substrate and catalyzing a particular enzymatic reaction." Q#27918 - CGI_10005619 superfamily 219977 38 123 1.01E-19 85.4144 cl18539 Vps51 superfamily - - "Vps51/Vps67; This family includes a presumed domain found in a number of components of vesicular transport. The VFT tethering complex (also known as GARP complex, Golgi associated retrograde protein complex, Vps53 tethering complex) is a conserved eukaryotic docking complex which is involved recycling of proteins from endosomes to the late Golgi. Vps51 (also known as Vps67) is a subunit of VFT and interacts with the SNARE Tlg1. Cog1_N is the N-terminus of the Cog1 subunit of the eight-unit Conserved Oligomeric Golgi (COG) complex that participates in retrograde vesicular transport and is required to maintain normal Golgi structure and function. The subunits are located in two lobes and Cog1 serves to bind the two lobes together probably via the highly conserved N-terminal domain of approximately 85 residues." 
Q#27919 - CGI_10005620 superfamily 241563 61 100 0.000153652 40.5404 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#27921 - CGI_10009423 superfamily 116688 428 654 6.89E-141 421.448 cl06910 PROCN superfamily C - PROCN (NUC071) domain; The PROCN domain is the central domain in pre-mRNA splicing factors of PRO8 family. Q#27921 - CGI_10009423 superfamily 149261 113 264 6.24E-81 255.706 cl06909 PRO8NT superfamily - - "PRO8NT (NUC069), PrP8 N-terminal domain; The PRO8NT domain is found at the N-terminus of pre-mRNA splicing factors of PRO8 family. The NLS or nuclear localisation signal for these spliceosome proteins begins at the start and runs for 60 residues. N-terminal to this domain is a highly variable proline-rich region." Q#27922 - CGI_10009424 superfamily 111042 76 253 3.46E-19 85.895 cl17928 Ocular_alb superfamily C - Ocular albinism type 1 protein; Ocular albinism type 1 protein. Q#27923 - CGI_10009425 superfamily 193607 457 589 5.74E-67 215.899 cl15237 Deltex_C superfamily - - "Domain found at the C-terminus of deltex-like; The deltex family of proteins is involved in the regulation of Notch signaling, and therefore may play roles in cell-to-cell communications that regulate mechanisms determining cell fate. They have a central RING-type zinc finger domain and contain a C-terminal domain, described here, that is also found in other domain architectures. Deltex-1 (DTX1) contains a RING finger and two WWE domains, indicating that it may be an E3 ubiquitin ligase. Human deltex 3-like, which contains an additional N-terminal domain (presumably with ubiquitin ligase activity) is also described as E3 ubiquitin-protein ligase DTX3L, B-lymphoma- and BAL-associated protein (BBAP), or rhysin-2. 
DTX3L mediates monoubiquitination of K91 of histone H4 in response to DNA damage." Q#27923 - CGI_10009425 superfamily 247792 411 450 0.000196558 39.7364 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#27923 - CGI_10009425 superfamily 207713 7 77 9.68E-13 64.2629 cl02729 WWE superfamily - - WWE domain; The WWE domain is named after three of its conserved residues and is predicted to mediate specific protein- protein interactions in ubiquitin and ADP ribose conjugation systems. Q#27923 - CGI_10009425 superfamily 207713 83 160 1.12E-09 55.4033 cl02729 WWE superfamily - - WWE domain; The WWE domain is named after three of its conserved residues and is predicted to mediate specific protein- protein interactions in ubiquitin and ADP ribose conjugation systems. 
Q#27925 - CGI_10009427 superfamily 247684 1 334 1.97E-68 228.33 cl17037 NBD_sugar-kinase_HSP70_actin superfamily - - "Nucleotide-Binding Domain of the sugar kinase/HSP70/actin superfamily; This superfamily includes the actin family, the HSP70 family of molecular chaperones and nucleotide exchange factors, the ROK (repressor, ORF, kinase) family, the hexokinase family, the FGGY family (which includes glycerol kinase and similar carbohydrate kinases such as rhamnulokinase and xylulokinase), the exopolyphosphatase/guanosine pentaphosphate phosphohydrolase/nucleoside triphosphate diphosphohydrolase family, propionate kinase/acetate kinase family, glycerol dehydratase reactivase, 2-hydroxyglutaryl-CoA dehydratase component A, N-acetylglucosamine kinase, butyrate kinase 2, Escherichia coli YeaZ and similar glycoproteases, the cell shape-determining protein MreB, the plasmid DNA segregation factor ParM, cell cycle proteins FtsA, Pili assembly protein PilM, ethanolamine utilization protein EutJ, and similar proteins. The nucleotide-binding site residues are conserved; the nucleotide sits in a deep cleft formed between the two lobes of the nucleotide-binding domain (NBD). Substrate binding to superfamily members is associated with closure of this catalytic site cleft. The functional activities of several members of the superfamily, including hexokinases, actin, and HSP70s, are modulated by allosteric effectors, which may act on the cleft closure." Q#27928 - CGI_10009430 superfamily 150838 44 204 1.85E-50 164.945 cl10913 DUF2216 superfamily - - "Uncharacterized conserved proteins (DUF2216); This is the conserved N-terminal half of a proteins which are found from worms to humans. some annotation suggests it might be PKR, the Hepatitis delta antigen-interacting protein A, but this could not be confirmed." 
Q#27929 - CGI_10009431 superfamily 241563 26 62 3.69E-06 44.3924 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#27930 - CGI_10009432 superfamily 247858 131 228 9.32E-10 55.0866 cl17304 2OG-FeII_Oxy_3 superfamily N - 2OG-Fe(II) oxygenase superfamily; This family contains members of the 2-oxoglutarate (2OG) and Fe(II)-dependent oxygenase superfamily. Q#27931 - CGI_10009433 superfamily 217920 55 446 7.92E-154 443.675 cl09557 ERO1 superfamily - - Endoplasmic Reticulum Oxidoreductin 1 (ERO1); Members of this family are required for the formation of disulphide bonds in the ER. Q#27932 - CGI_10009434 superfamily 240521 314 434 1.82E-16 81.1992 cl18940 Syo1_like superfamily NC - "Fungal symportin 1 (syo1) and similar proteins; This family of eukaryotic proteins includes Saccharomyces cerevisiae Ydl063c and Chaetomium thermophilum Syo1, which mediate the co-import of two ribosomal proteins, Rpl5 and Rpl11 (which both interact with 5S rRNA) into the nucleus. Import precedes their association with rRNA and subsequent ribosome assembly in the nucleolus. The primary structure of syo1 is a mixture of Armadillo- (ARM, N-terminal part of syo1) and HEAT-repeats (C-terminal part of syo1)." Q#27932 - CGI_10009434 superfamily 218373 486 517 0.00936493 34.6885 cl04882 SGS superfamily C - "SGS domain; This domain was thought to be unique to the SGT1-like proteins, but is also found in calcyclin binding proteins." Q#27933 - CGI_10009435 superfamily 219316 25 187 1.85E-78 234.423 cl06268 B9-C2 superfamily - - "Ciliary basal body-associated, B9 protein; The B9-C2 domain is found in proteins associated with the ciliary basal body. B9 domains were identified as a specific family of C2 domains. 
There are three sub-families represented by this family, notably, Mks1-Xbx7, Stumpy-Tza1 and Tza2 groups of proteins. Mutations in human Mks1 result in the developmental disorder Mechler-Gruber syndrome; mutations in mouse Stumpy lead to perinatal hydrocephalus and severe polycystic kidney disease. All the three distinct types of B9-C2 proteins cooperatively localise to the basal body or centrosome of cilia." Q#27934 - CGI_10009436 superfamily 246902 208 400 1.29E-107 319.626 cl15239 PLDc_SF superfamily - - "Catalytic domain of phospholipase D superfamily proteins; Catalytic domain of phospholipase D (PLD) superfamily proteins. The PLD superfamily is composed of a large and diverse group of proteins including plant, mammalian and bacterial PLDs, bacterial cardiolipin (CL) synthases, bacterial phosphatidylserine synthases (PSS), eukaryotic phosphatidylglycerophosphate (PGP) synthase, eukaryotic tyrosyl-DNA phosphodiesterase 1 (Tdp1), and some bacterial endonucleases (Nuc and BfiI), among others. PLD enzymes hydrolyze phospholipid phosphodiester bonds to yield phosphatidic acid and a free polar head group. They can also catalyze the transphosphatidylation of phospholipids to acceptor alcohols. The majority of members in this superfamily contain a short conserved sequence motif (H-x-K-x(4)-D, where x represents any amino acid residue), called the HKD signature motif. There are varying expanded forms of this motif in different family members. Some members contain variant HKD motifs. Most PLD enzymes are monomeric proteins with two HKD motif-containing domains. Two HKD motifs from two domains form a single active site. Some PLD enzymes have only one copy of the HKD motif per subunit but form a functionally active dimer, which has a single active site at the dimer interface containing the two HKD motifs from both subunits. Different PLD enzymes may have evolved through domain fusion of a common catalytic core with separate substrate recognition domains. 
Despite their various catalytic functions and a very broad range of substrate specificities, the diverse group of PLD enzymes can bind to a phosphodiester moiety. Most of them are active as bi-lobed monomers or dimers, and may possess similar core structures for catalytic activity. They are generally thought to utilize a common two-step ping-pong catalytic mechanism, involving an enzyme-substrate intermediate, to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group." Q#27934 - CGI_10009436 superfamily 246902 17 186 3.67E-77 240.668 cl15239 PLDc_SF superfamily - - "Catalytic domain of phospholipase D superfamily proteins; Catalytic domain of phospholipase D (PLD) superfamily proteins. The PLD superfamily is composed of a large and diverse group of proteins including plant, mammalian and bacterial PLDs, bacterial cardiolipin (CL) synthases, bacterial phosphatidylserine synthases (PSS), eukaryotic phosphatidylglycerophosphate (PGP) synthase, eukaryotic tyrosyl-DNA phosphodiesterase 1 (Tdp1), and some bacterial endonucleases (Nuc and BfiI), among others. PLD enzymes hydrolyze phospholipid phosphodiester bonds to yield phosphatidic acid and a free polar head group. They can also catalyze the transphosphatidylation of phospholipids to acceptor alcohols. The majority of members in this superfamily contain a short conserved sequence motif (H-x-K-x(4)-D, where x represents any amino acid residue), called the HKD signature motif. There are varying expanded forms of this motif in different family members. Some members contain variant HKD motifs. Most PLD enzymes are monomeric proteins with two HKD motif-containing domains. 
Two HKD motifs from two domains form a single active site. Some PLD enzymes have only one copy of the HKD motif per subunit but form a functionally active dimer, which has a single active site at the dimer interface containing the two HKD motifs from both subunits. Different PLD enzymes may have evolved through domain fusion of a common catalytic core with separate substrate recognition domains. Despite their various catalytic functions and a very broad range of substrate specificities, the diverse group of PLD enzymes can bind to a phosphodiester moiety. Most of them are active as bi-lobed monomers or dimers, and may possess similar core structures for catalytic activity. They are generally thought to utilize a common two-step ping-pong catalytic mechanism, involving an enzyme-substrate intermediate, to cleave phosphodiester bonds. The two histidine residues from the two HKD motifs play key roles in the catalysis. Upon substrate binding, a histidine from one HKD motif could function as the nucleophile, attacking the phosphodiester bond to create a covalent phosphohistidine intermediate, while the other histidine residue from the second HKD motif could serve as a general acid, stabilizing the leaving group." Q#27935 - CGI_10009438 superfamily 241754 6 730 0 662.424 cl00286 Motor_domain superfamily - - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. Q#27935 - CGI_10009438 superfamily 245596 1473 1754 2.03E-76 256.466 cl11394 Glyco_tranf_GTA_type superfamily - - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. 
Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." Q#27935 - CGI_10009438 superfamily 245596 1396 1452 8.30E-10 60.3992 cl11394 Glyco_tranf_GTA_type superfamily C - "Glycosyltransferase family A (GT-A) includes diverse families of glycosyl transferases with a common GT-A type structural fold; Glycosyltransferases (GTs) are enzymes that synthesize oligosaccharides, polysaccharides, and glycoconjugates by transferring the sugar moiety from an activated nucleotide-sugar donor to an acceptor molecule, which may be a growing oligosaccharide, a lipid, or a protein. Based on the stereochemistry of the donor and acceptor molecules, GTs are classified as either retaining or inverting enzymes. To date, all GT structures adopt one of two possible folds, termed GT-A fold and GT-B fold. This hierarchy includes diverse families of glycosyl transferases with a common GT-A type structural fold, which has two tightly associated beta/alpha/beta domains that tend to form a continuous central sheet of at least eight beta-strands. The majority of the proteins in this superfamily are Glycosyltransferase family 2 (GT-2) proteins. But it also includes families GT-43, GT-6, GT-8, GT13 and GT-7; which are evolutionarily related to GT-2 and share structure similarities." 
Q#27935 - CGI_10009438 superfamily 245166 1841 1905 0.00315337 39.881 cl09823 Trep_Strep superfamily C - "Hypothetical bacterial integral membrane protein (Trep_Strep); This family consists of strongly hydrophobic proteins about 190 amino acids in length with a strongly basic motif near the C-terminus. It is found in rather few species, but in paralogous families of 12 members in the oral pathogenic spirochaete Treponema denticola and 2 in Streptococcus pneumoniae R6." Q#27936 - CGI_10026753 superfamily 241594 314 545 4.63E-07 50.3815 cl00077 HECTc superfamily - - "HECT domain; C-terminal catalytic domain of a subclass of Ubiquitin-protein ligase (E3). It binds specific ubiquitin-conjugating enzymes (E2), accepts ubiquitin from E2, transfers ubiquitin to substrate lysine side chains, and transfers additional ubiquitin molecules to the end of growing ubiquitin chains." Q#27938 - CGI_10026755 superfamily 242406 2 145 1.48E-21 86.4913 cl01271 DUF1768 superfamily - - Domain of unknown function (DUF1768); This is a domain of unknown function. It is alpha helical in structure. The GO annotation for this protein suggests it is involved in nematode larval development and has a positive regulation on growth rate. Q#27940 - CGI_10026757 superfamily 246925 80 220 0.00170199 39.261 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." 
Q#27940 - CGI_10026757 superfamily 214507 446 495 0.00240584 36.254 cl15307 LRRCT superfamily - - Leucine rich repeat C-terminal domain; Leucine rich repeat C-terminal domain. Q#27943 - CGI_10026760 superfamily 241578 844 983 9.54E-13 67.3166 cl00057 vWFA superfamily - - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#27943 - CGI_10026760 superfamily 242197 9 117 9.22E-45 158.094 cl00928 dsDNA_bind superfamily - - Double-stranded DNA-binding domain; This domain is believed to bind double-stranded DNA of 20 bases length. Q#27943 - CGI_10026760 superfamily 152683 522 624 4.17E-07 49.2085 cl13656 Methyltransf_FA superfamily - - "Farnesoic acid 0-methyl transferase; This domain family is found in bacteria and eukaryotes, and is approximately 110 amino acids in length.Farnesoic acid O-methyl transferase (FAMeT) is the enzyme that catalyzes the formation of methyl farnesoate (MF) from farnesoic acid (FA) in the biosynthetic pathway of juvenile hormone (JH)." 
Q#27943 - CGI_10026760 superfamily 241578 688 824 5.48E-06 46.6125 cl00057 vWFA superfamily N - "Von Willebrand factor type A (vWA) domain was originally found in the blood coagulation protein von Willebrand factor (vWF). Typically, the vWA domain is made up of approximately 200 amino acid residues folded into a classic a/b para-rossmann type of fold. The vWA domain, since its discovery, has drawn great interest because of its widespread occurrence and its involvement in a wide variety of important cellular functions. These include basal membrane formation, cell migration, cell differentiation, adhesion, haemostasis, signaling, chromosomal stability, malignant transformation and in immune defenses In integrins these domains form heterodimers while in vWF it forms multimers. There are different interaction surfaces of this domain as seen by the various molecules it complexes with. Ligand binding in most cases is mediated by the presence of a metal ion dependent adhesion site termed as the MIDAS motif that is a characteristic feature of most, if not all A domains." Q#27943 - CGI_10026760 superfamily 246918 434 479 5.83E-05 42.1887 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#27943 - CGI_10026760 superfamily 216897 367 430 9.97E-05 41.8981 cl03463 Gal_Lectin superfamily - - Galactose binding lectin domain; Galactose binding lectin domain. Q#27944 - CGI_10026761 superfamily 241571 299 403 3.39E-17 77.4526 cl00049 CUB superfamily - - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#27944 - CGI_10026761 superfamily 245847 50 125 0.00216275 36.7116 cl12042 FA58C superfamily C - "Coagulation factor 5/8 C-terminal domain, discoidin domain; Cell surface-attached carbohydrate-binding domain, present in eukaryotes and assumed to have horizontally transferred to eubacterial genomes." 
Q#27945 - CGI_10026762 superfamily 245303 2 142 3.41E-70 217.427 cl10447 GH18_chitinase-like superfamily C - "The GH18 (glycosyl hydrolase, family 18) type II chitinases hydrolyze chitin, an abundant polymer of beta-1,4-linked N-acetylglucosamine (GlcNAc) which is a major component of the cell wall of fungi and the exoskeleton of arthropods. Chitinases have been identified in viruses, bacteria, fungi, protozoan parasites, insects, and plants. The structure of the GH18 domain is an eight-stranded beta/alpha barrel with a pronounced active-site cleft at the C-terminal end of the beta-barrel. The GH18 family includes chitotriosidase, chitobiase, hevamine, zymocin-alpha, narbonin, SI-CLP (stabilin-1 interacting chitinase-like protein), IDGF (imaginal disc growth factor), CFLE (cortical fragment-lytic enzyme) spore hydrolase, the type III and type V plant chitinases, the endo-beta-N-acetylglucosaminidases, and the chitolectins. The GH85 (glycosyl hydrolase, family 85) ENGases (endo-beta-N-acetylglucosaminidases) are closely related to the GH18 chitinases and are included in this alignment model." Q#27948 - CGI_10026765 superfamily 148067 95 196 2.22E-27 101.194 cl05643 DUF1011 superfamily - - Protein of unknown function (DUF1011); Family of uncharacterized eukaryotic proteins. Q#27950 - CGI_10026767 superfamily 246918 53 106 2.61E-09 51.0483 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#27950 - CGI_10026767 superfamily 246918 1 47 7.92E-05 38.7219 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#27951 - CGI_10026768 superfamily 241547 3 231 6.28E-40 139.725 cl00012 alpha_CA superfamily N - "Carbonic anhydrase alpha (vertebrate-like) group. 
Carbonic anhydrases (CAs) are zinc-containing enzymes that catalyze the reversible hydration of carbon dioxide in a two-step mechanism: a nucleophilic attack of a zinc-bound hydroxide ion on carbon dioxide, followed by the regeneration of the active site by ionization of the zinc-bound water molecule and removal of a proton from the active site. They are ubiquitous enzymes involved in fundamental processes like photosynthesis, respiration, pH homeostasis and ion transport. There are three evolutionary distinct groups - alpha, beta and gamma carbonic anhydrases - which show no significant sequence identity or structural similarity. Most alpha CAs are monomeric enzymes. The zinc ion is complexed by three histidine residues and a fourth conserved histidine plays a potential role in proton transfer." Q#27952 - CGI_10026769 superfamily 248097 294 417 8.75E-28 106.195 cl17543 C1q superfamily - - C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system. Q#27954 - CGI_10026771 superfamily 241564 12 80 1.48E-25 97.3363 cl00035 BIR superfamily - - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." 
Q#27954 - CGI_10026771 superfamily 247792 258 296 0.000242833 38.1956 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#27955 - CGI_10026772 superfamily 241564 150 218 5.36E-25 95.0251 cl00035 BIR superfamily - - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." Q#27955 - CGI_10026772 superfamily 241564 6 71 5.83E-18 75.3799 cl00035 BIR superfamily - - "Baculoviral inhibition of apoptosis protein repeat domain; Found in inhibitors of apoptosis proteins (IAPs) and other proteins. In higher eukaryotes, BIR domains inhibit apoptosis by acting as direct inhibitors of the caspase family of protease enzymes. In yeast, BIR domains are involved in regulating cytokinesis. This novel fold is stabilized by zinc tetrahedrally coordinated by one histidine and three cysteine residues and resembles a classical zinc finger." 
Q#27956 - CGI_10026773 superfamily 241802 1 296 7.38E-51 176.909 cl00342 Trp-synth-beta_II superfamily - - "Tryptophan synthase beta superfamily (fold type II); this family of pyridoxal phosphate (PLP)-dependent enzymes catalyzes beta-replacement and beta-elimination reactions. This CD corresponds to aminocyclopropane-1-carboxylate deaminase (ACCD), tryptophan synthase beta chain (Trp-synth_B), cystathionine beta-synthase (CBS), O-acetylserine sulfhydrylase (CS), serine dehydratase (Ser-dehyd), threonine dehydratase (Thr-dehyd), diaminopropionate ammonia lyase (DAL), and threonine synthase (Thr-synth). ACCD catalyzes the conversion of 1-aminocyclopropane-1-carboxylate to alpha-ketobutyrate and ammonia. Tryptophan synthase folds into a tetramer, where the beta chain is the catalytic PLP-binding subunit and catalyzes the formation of L-tryptophan from indole and L-serine. CBS is a tetrameric hemeprotein that catalyzes condensation of serine and homocysteine to cystathionine. CS is a homodimer that catalyzes the formation of L-cysteine from O-acetyl-L-serine. Ser-dehyd catalyzes the conversion of L- or D-serine to pyruvate and ammonia. Thr-dehyd is active as a homodimer and catalyzes the conversion of L-threonine to 2-oxobutanoate and ammonia. DAL is also a homodimer and catalyzes the alpha, beta-elimination reaction of both L- and D-alpha, beta-diaminopropionate to form pyruvate and ammonia. Thr-synth catalyzes the formation of threonine and inorganic phosphate from O-phosphohomoserine." Q#27962 - CGI_10026779 superfamily 241563 68 109 2.94E-05 42.0812 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#27962 - CGI_10026779 superfamily 241717 121 213 0.00121363 38.9795 cl00240 RRF superfamily N - "Ribosome recycling factor (RRF). 
Ribosome recycling factor dissociates the posttermination complex, composed of the ribosome, deacylated tRNA, and mRNA, after termination of translation. Thus ribosomes are "recycled" and ready for another round of protein synthesis. RRF is believed to bind the ribosome at the A-site in a manner that mimics tRNA, but the specific mechanisms remain unclear. RRF is essential for bacterial growth. It is not necessary for cell growth in archaea or eukaryotes, but is found in mitochondria or chloroplasts of some eukaryotic species." Q#27964 - CGI_10026781 superfamily 241563 76 117 5.14E-06 43.622 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#27965 - CGI_10026782 superfamily 246669 682 836 5.36E-71 232.583 cl14603 C2 superfamily - - "C2 domain; The C2 domain was first identified in PKC. C2 domains fold into an 8-standed beta-sandwich that can adopt 2 structural arrangements: Type I and Type II, distinguished by a circular permutation involving their N- and C-terminal beta strands. Many C2 domains are Ca2+-dependent membrane-targeting modules that bind a wide variety of substances including bind phospholipids, inositol polyphosphates, and intracellular proteins. Most C2 domain proteins are either signal transduction enzymes that contain a single C2 domain, such as protein kinase C, or membrane trafficking proteins which contain at least two C2 domains, such as synaptotagmin 1. However, there are a few exceptions to this including RIM isoforms and some splice variants of piccolo/aczonin and intersectin which only have a single C2 domain. C2 domains with a calcium binding region have negatively charged residues, primarily aspartates, that serve as ligands for calcium ions." 
Q#27965 - CGI_10026782 superfamily 128928 520 576 1.99E-15 72.7172 cl02734 DM14 superfamily - - "Repeats in fly CG4713, worm Y37H9A.3 and human FLJ20241; Repeats in fly CG4713, worm Y37H9A.3 and human FLJ20241. " Q#27965 - CGI_10026782 superfamily 128928 368 422 2.24E-15 72.332 cl02734 DM14 superfamily - - "Repeats in fly CG4713, worm Y37H9A.3 and human FLJ20241; Repeats in fly CG4713, worm Y37H9A.3 and human FLJ20241. " Q#27965 - CGI_10026782 superfamily 128928 269 324 6.73E-13 65.3984 cl02734 DM14 superfamily - - "Repeats in fly CG4713, worm Y37H9A.3 and human FLJ20241; Repeats in fly CG4713, worm Y37H9A.3 and human FLJ20241. " Q#27965 - CGI_10026782 superfamily 128928 137 193 1.00E-10 59.2352 cl02734 DM14 superfamily - - "Repeats in fly CG4713, worm Y37H9A.3 and human FLJ20241; Repeats in fly CG4713, worm Y37H9A.3 and human FLJ20241. " Q#27966 - CGI_10026783 superfamily 247792 263 306 4.57E-08 51.6776 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in a proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#27966 - CGI_10026783 superfamily 149746 4 76 2.34E-24 99.2258 cl07884 DWNN superfamily - - DWNN domain; DWNN is a ubiquitin like domain found at the N terminus of the RBBP6 family of splicing-associated proteins. The DWNN domain is independently expressed in higher vertebrates so it may function as a novel ubiquitin-like modifier of other proteins. 
Q#27967 - CGI_10026784 superfamily 241754 13 348 8.15E-171 517.601 cl00286 Motor_domain superfamily - - Myosin and Kinesin motor domain. These ATPases belong to the P-loop NTPase family and provide the driving force in myosin and kinesin mediated processes. Q#27967 - CGI_10026784 superfamily 247746 343 436 0.00535037 37.7463 cl17192 ATP-synt_B superfamily N - "ATP synthase B/B' CF(0); Part of the CF(0) (base unit) of the ATP synthase. The base unit is thought to translocate protons through membrane (inner membrane in mitochondria, thylakoid membrane in plants, cytoplasmic membrane in bacteria). The B subunits are thought to interact with the stalk of the CF(1) subunits. This domain should not be confused with the ab CF(1) proteins (in the head of the ATP synthase) which are found in pfam00006" Q#27968 - CGI_10026785 superfamily 243106 2030 2149 4.28E-54 187.224 cl02608 BAH superfamily - - "BAH, or Bromo Adjacent Homology domain (also called ELM1 and BAM for Bromo Adjacent Motif). BAH domains have first been described as domains found in the polybromo protein and Yeast Rsc1/Rsc2 (Remodeling of the Structure of Chromatin). They also occur in mammalian DNA methyltransferases and the MTA1 subunits of histone deacetylase complexes. A BAH domain is also found in Yeast Sir3p and in the origin receptor complex protein 1 (Orc1p), where it was found to interact with the N-terminal lobe of the silence information regulator 1 protein (Sir1p), confirming the initial hypothesis that BAH plays a role in protein-protein interactions." Q#27970 - CGI_10026787 superfamily 215733 23 61 8.30E-06 47.1747 cl02811 E1-E2_ATPase superfamily C - E1-E2 ATPase; E1-E2 ATPase. Q#27970 - CGI_10026787 superfamily 218499 1045 1131 0.00859504 38.5601 cl04986 ALG3 superfamily NC - "ALG3 protein; The formation of N-glycosidic linkages of glycoproteins involves the ordered assembly of the common Glc3Man9GlcNAc2 core-oligosaccharide on the lipid carrier dolichyl pyrophosphate. 
Whereas early mannosylation steps occur on the cytoplasmic side of the endoplasmic reticulum with GDP-Man as donor, the final reactions from Man5GlcNAc2-PP-Dol to Man9GlcNAc2-PP-Dol on the lumenal side use Dol-P-Man. ALG3 gene encodes the Dol-P-Man:Man5GlcNAc2-PP-Dol mannosyltransferase." Q#27971 - CGI_10026788 superfamily 246925 263 389 1.60E-08 55.0542 cl15309 LRR_RI superfamily N - "Leucine-rich repeats (LRRs), ribonuclease inhibitor (RI)-like subfamily. LRRs are 20-29 residue sequence motifs present in many proteins that participate in protein-protein interactions and have different functions and cellular locations. LRRs correspond to structural units consisting of a beta strand (LxxLxLxxN/CxL conserved pattern) and an alpha helix. This alignment contains 12 strands corresponding to 11 full repeats, consistent with the extent observed in the subfamily acting as Ran GTPase Activating Proteins (RanGAP1)." Q#27972 - CGI_10026789 superfamily 241774 64 224 2.85E-34 121.505 cl00313 Ribosomal_S7 superfamily - - Ribosomal protein S7p/S5e; This family contains ribosomal protein S7 from prokaryotes and S5 from eukaryotes. Q#27974 - CGI_10026791 superfamily 241813 49 110 3.41E-11 55.5966 cl00359 Ribosomal_L27 superfamily - - Ribosomal L27 protein; Ribosomal L27 protein. Q#27976 - CGI_10026793 superfamily 241563 62 103 3.24E-05 41.696 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#27980 - CGI_10026797 superfamily 248281 72 144 7.81E-11 57.6655 cl17727 GT1 superfamily - - "GT1, myb-like, SANT family; GT-1, a myb-like protein, is one of the GT trihelix transcription factors. GT-1 binds the GT cis-element of rbcS-3A, a light-induced gene, as a dimer. 
Arabidopsis GT-1 is a trans-activator and acts in the stabilization of components of the transcription pre-initiation complex comprised of TFIIA-TBP-TATA. The isolated GT-1 DNA-binding domain is sufficient to bind DNA. This region closely resembles the myb domain, but with longer helices. It has been proposed that GT-1 may respond to light signals via calcium-dependent phosphorylation to create a light-modulated molecular switch. These proteins are members of the SANT/myb group. SANT is named after 'SWI3, ADA2, N-CoR and TFIIIB', several factors that share this domain. The SANT domain resembles the 3 alpha-helix bundle of the DNA-binding Myb domains and is found in a diverse set of proteins." Q#27980 - CGI_10026797 superfamily 241750 231 360 4.85E-07 49.1091 cl00281 metallo-dependent_hydrolases superfamily C - "Superfamily of metallo-dependent hydrolases (also called amidohydrolase superfamily) is a large group of proteins that show conservation in their 3-dimensional fold (TIM barrel) and in details of their active site. The vast majority of the members have a conserved metal binding site, involving four histidines and one aspartic acid residue. In the common reaction mechanism, the metal ion (or ions) deprotonate a water molecule for a nucleophilic attack on the substrate. The family includes urease alpha, adenosine deaminase, phosphotriesterase dihydroorotases, allantoinases, hydantoinases, AMP-, adenine and cytosine deaminases, imidazolonepropionase, aryldialkylphosphatase, chlorohydrolases, formylmethanofuran dehydrogenases and others." Q#27988 - CGI_10007779 superfamily 149667 41 140 1.93E-08 50.0615 cl07343 GON superfamily N - GON domain; The GON domain is found in the ADAMTS (a disintegrin and metalloproteinase domain with thrombospondin type-1 modules) family of proteins. It contains several conserved cysteine residues. 
Q#27990 - CGI_10007781 superfamily 247794 14 248 3.82E-37 134.186 cl17240 FDH_GDH_like superfamily - - "Formate/glycerate dehydrogenases, D-specific 2-hydroxy acid dehydrogenases and related dehydrogenases; The formate/glycerate dehydrogenase like family contains a diverse group of enzymes such as formate dehydrogenase (FDH), glycerate dehydrogenase (GDH), D-lactate dehydrogenase, L-alanine dehydrogenase, and S-Adenosylhomocysteine hydrolase, that share a common 2-domain structure. Despite often low sequence identity, these proteins typically have a characteristic arrangement of 2 similar domains of the alpha/beta Rossmann fold NAD+ binding form. The NAD(P) binding domain is inserted within the linear sequence of the mostly N-terminal catalytic domain. Structurally, these domains are connected by extended alpha helices and create a cleft in which NAD(P) is bound, primarily to the C-terminal portion of the 2nd (internal) domain. While many members of this family are dimeric, alanine DH is hexameric and phosphoglycerate DH is tetrameric. 2-hydroxyacid dehydrogenases are enzymes that catalyze the conversion of a wide variety of D-2-hydroxy acids to their corresponding keto acids. The general mechanism is (R)-lactate + acceptor to pyruvate + reduced acceptor. Formate dehydrogenase (FDH) catalyzes the NAD+-dependent oxidation of formate ion to carbon dioxide with the concomitant reduction of NAD+ to NADH. FDHs of this family contain no metal ions or prosthetic groups. Catalysis occurs through direct transfer of a hydride ion to NAD+ without the stages of acid-base catalysis typically found in related dehydrogenases." Q#27991 - CGI_10008930 superfamily 241782 70 539 6.08E-106 326.352 cl00321 AAT_I superfamily - - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. 
PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phosphorylase family (fold type V)." Q#27992 - CGI_10008931 superfamily 241782 79 460 1.80E-77 253.448 cl00321 AAT_I superfamily - - "Aspartate aminotransferase (AAT) superfamily (fold type I) of pyridoxal phosphate (PLP)-dependent enzymes. PLP combines with an alpha-amino acid to form a compound called a Schiff base or aldimine intermediate, which depending on the reaction, is the substrate in four kinds of reactions (1) transamination (movement of amino groups), (2) racemization (redistribution of enantiomers), (3) decarboxylation (removing COOH groups), and (4) various side-chain reactions depending on the enzyme involved. 
Pyridoxal phosphate (PLP) dependent enzymes were previously classified into alpha, beta and gamma classes, based on the chemical characteristics (carbon atom involved) of the reaction they catalyzed. The availability of several structures allowed a comprehensive analysis of the evolutionary classification of PLP dependent enzymes, and it was found that the functional classification did not always agree with the evolutionary history of these enzymes. Structure and sequence analysis has revealed that the PLP dependent enzymes can be classified into four major groups of different evolutionary origin: aspartate aminotransferase superfamily (fold type I), tryptophan synthase beta superfamily (fold type II), alanine racemase superfamily (fold type III), and D-amino acid superfamily (fold type IV) and Glycogen phosphorylase family (fold type V)." Q#27993 - CGI_10008932 superfamily 247792 24 72 0.0070601 35.4992 cl17238 RING superfamily - - "RING-finger (Really Interesting New Gene) domain, a specialized type of Zn-finger of 40 to 60 residues that binds two atoms of zinc; defined by the 'cross-brace' motif C-X2-C-X(9-39)-C-X(1-3)- H-X(2-3)-(N/C/H)-X2-C-X(4-48)C-X2-C; probably involved in mediating protein-protein interactions; identified in proteins with a wide range of functions such as viral replication, signal transduction, and development; has two variants, the C3HC4-type and a C3H2C3-type (RING-H2 finger), which have different cysteine/histidine pattern; a subset of RINGs are associated with B-Boxes (C-X2-H-X7-C-X7-C-X2-C-H-X2-H)" Q#27994 - CGI_10008933 superfamily 219941 1042 1208 5.59E-78 254.44 cl07298 TIP120 superfamily - - TATA-binding protein interacting (TIP20); TIP120 (also known as cullin-associated and neddylation-dissociated protein 1) is a TATA binding protein interacting protein that enhances transcription. 
Q#27999 - CGI_10008938 superfamily 241563 65 103 0.00197564 36.3032 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#28001 - CGI_10008940 superfamily 243035 47 171 1.07E-15 75.7341 cl02432 CLECT superfamily - - "C-type lectin (CTL)/C-type lectin-like (CTLD) domain; CLECT: C-type lectin (CTL)/C-type lectin-like (CTLD) domain; protein domains homologous to the carbohydrate-recognition domains (CRDs) of the C-type lectins. This group is chiefly comprised of eukaryotic CTLDs, but contains some, as yet functionally uncharacterized, bacterial CTLDs. Many CTLDs are calcium-dependent carbohydrate binding modules; other CTLDs bind protein ligands, lipids, and inorganic surfaces, including CaCO3 and ice. Animal C-type lectins are involved in such functions as extracellular matrix organization, endocytosis, complement activation, pathogen recognition, and cell-cell interactions. For example: mannose-binding lectin and lung surfactant proteins A and D bind carbohydrates on surfaces (e.g. pathogens, allergens, necrotic, and apoptotic cells) and mediate functions associated with killing and phagocytosis; P (platelet)-, E (endothelial)-, and L (leukocyte)- selectins (sels) mediate the initial attachment, tethering, and rolling of lymphocytes on inflamed vascular walls enabling subsequent lymphocyte adhesion and transmigration. CTLDs may bind a variety of carbohydrate ligands including mannose, N-acetylglucosamine, galactose, N-acetylgalactosamine, and fucose. Several CTLDs bind to protein ligands, and only some of these binding interactions are Ca2+-dependent; including the CTLDs of Coagulation Factors IX/X (IX/X) and Von Willebrand Factor (VWF) binding proteins, and natural killer cell receptors. 
C-type lectins, such as lithostathine, and some type II antifreeze glycoproteins function in a Ca2+-independent manner to bind inorganic surfaces. Many proteins in this group contain a single CTLD; these CTLDs associate with each other through several different surfaces to form dimers, trimers, or tetramers, from which ligand-binding sites project in different orientations. Various vertebrate type 1 transmembrane proteins including macrophage mannose receptor, endo180, phospholipase A2 receptor, and dendritic and epithelial cell receptor (DEC205) have extracellular domains containing 8 or more CTLDs; these CTLDs remain in the parent model. In some members (IX/X and VWF binding proteins), a loop extends to the adjoining domain to form a loop-swapped dimer. A similar conformation is seen in the macrophage mannose receptor CRD4's putative non-sugar bound form of the domain in the acid environment of the endosome. Lineage specific expansions of CTLDs have occurred in several animal lineages including Drosophila melanogaster and Caenorhabditis elegans; these CTLDs also remain in the parent model." 
Q#28001 - CGI_10008940 superfamily 241613 184 204 1.71E-05 44.1198 cl00104 LDLa superfamily C - "Low Density Lipoprotein Receptor Class A domain, a cysteine-rich repeat that plays a central role in mammalian cholesterol metabolism; the receptor protein binds LDL and transports it into cells by endocytosis; 7 successive cysteine-rich repeats of about 40 amino acids are present in the N-terminal of this multidomain membrane protein; other homologous domains occur in related receptors, including the very low-density lipoprotein receptor and the LDL receptor-related protein/alpha 2-macroglobulin receptor, and in proteins which are functionally unrelated, such as the C9 component of complement; the binding of calcium is required for in vitro formation of the native disulfide isomer and is necessary in establishing and maintaining the modular structure" Q#28001 - CGI_10008940 superfamily 241571 227 261 0.00197434 38.5252 cl00049 CUB superfamily C - "CUB domain; extracellular domain; present in proteins mostly known to be involved in development; not found in prokaryotes, plants and yeast." Q#28002 - CGI_10008941 superfamily 245010 14 118 9.20E-13 59.9391 cl09111 Prefoldin superfamily - - "Prefoldin is a hexameric molecular chaperone complex, found in both eukaryotes and archaea, that binds and stabilizes newly synthesized polypeptides allowing them to fold correctly. The complex contains two alpha and four beta subunits, the two subunits being evolutionarily related. In archaea, there is usually only one gene for each subunit while in eukaryotes there are two or more paralogous genes encoding each subunit adding heterogeneity to the structure of the hexamer. The structure of the complex consists of a double beta barrel assembly with six protruding coiled-coils." Q#28003 - CGI_10008942 superfamily 247799 61 177 1.05E-45 153.554 cl17245 KH-I superfamily - - "K homology RNA-binding domain, type I. KH binds single-stranded RNA or DNA. 
It is found in a wide variety of proteins including ribosomal proteins, transcription factors and post-transcriptional modifiers of mRNA. There are two different KH domains that belong to different protein folds, but they share a single KH motif. The KH motif is folded into a beta alpha alpha beta unit. In addition to the core, type II KH domains (e.g. ribosomal protein S3) include N-terminal extension and type I KH domains (e.g. hnRNP K) contain C-terminal extension." Q#28004 - CGI_10013873 superfamily 243034 214 295 3.81E-09 53.1528 cl02429 TPR superfamily - - "Tetratricopeptide repeat domain; typically contains 34 amino acids [WLF]-X(2)-[LIM]-[GAS]-X(2)-[YLF]-X(8)-[ASE]-X(3)-[FYL]-X(2)-[ASL]-X(4)-[PKE] is the consensus sequence; found in a variety of organisms including bacteria, cyanobacteria, yeast, fungi, plants, and humans in various subcellular locations; involved in a variety of functions including protein-protein interactions, but common features in the interaction partners have not been defined; involved in chaperone, cell-cycle, transcription, and protein transport complexes; the number of TPR motifs varies among proteins (1,3-11,13 15,16,19); 5-6 tandem repeats generate a right-handed helical structure with an amphipathic channel that is thought to accommodate an alpha-helix of a target protein; it has been proposed that TPR proteins preferably interact with WD-40 repeat proteins, but in many instances several TPR-proteins seem to aggregate to multi-protein complexes; examples of TPR-proteins include, Cdc16p, Cdc23p and Cdc27p components of the cyclosome/APC, the Pex5p/Pas10p receptor for peroxisomal targeting signals, the Tom70p co-receptor for mitochondrial targeting signals, Ser/Thr phosphatase 5C and the p110 subunit of O-GlcNAc transferase; three copies of the repeat are present here" Q#28004 - CGI_10013873 superfamily 215821 97 184 1.91E-18 79.2066 cl18346 FKBP_C superfamily - - FKBP-type peptidyl-prolyl cis-trans isomerase; FKBP-type 
peptidyl-prolyl cis-trans isomerase. Q#28005 - CGI_10013874 superfamily 247723 34 126 3.91E-48 168.976 cl17169 RRM_SF superfamily - - "RNA recognition motif (RRM) superfamily; RRM, also known as RBD (RNA binding domain) or RNP (ribonucleoprotein domain), is a highly abundant domain in eukaryotes found in proteins involved in post-transcriptional gene expression processes including mRNA and rRNA processing, RNA export, and RNA stability. This domain is 90 amino acids in length and consists of a four-stranded beta-sheet packed against two alpha-helices. RRM usually interacts with ssRNA, but is also known to interact with ssDNA as well as proteins. RRM binds a variable number of nucleotides, ranging from two to eight. The active site includes three aromatic side-chains located within the conserved RNP1 and RNP2 motifs of the domain. The RRM domain is found in a variety heterogeneous nuclear ribonucleoproteins (hnRNPs), proteins implicated in regulation of alternative splicing, and protein components of small nuclear ribonucleoproteins (snRNPs)." Q#28005 - CGI_10013874 superfamily 243061 724 825 2.13E-29 115.517 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#28005 - CGI_10013874 superfamily 243061 409 508 8.50E-18 82.0046 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#28005 - CGI_10013874 superfamily 243061 1520 1616 1.46E-17 81.2342 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. 
Q#28005 - CGI_10013874 superfamily 243061 1625 1728 3.07E-17 80.4638 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#28005 - CGI_10013874 superfamily 243061 521 610 2.09E-16 77.7674 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#28005 - CGI_10013874 superfamily 243061 992 1094 2.17E-16 77.7674 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#28005 - CGI_10013874 superfamily 243061 309 400 2.25E-14 71.9894 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#28005 - CGI_10013874 superfamily 243061 1097 1196 2.98E-13 68.5226 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#28005 - CGI_10013874 superfamily 243061 189 291 3.39E-11 62.7446 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. 
Q#28005 - CGI_10013874 superfamily 243061 1731 1828 4.24E-11 62.3594 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#28005 - CGI_10013874 superfamily 243061 1417 1507 1.27E-09 57.8618 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#28005 - CGI_10013874 superfamily 243061 627 717 1.74E-09 57.4766 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#28005 - CGI_10013874 superfamily 243061 1225 1301 6.17E-07 49.7726 cl02509 SRCR superfamily N - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#28005 - CGI_10013874 superfamily 243061 893 963 0.00444146 37.7066 cl02509 SRCR superfamily - - Scavenger receptor cysteine-rich domain; These domains are disulphide rich extracellular domains. These domains are found in several extracellular receptors and may be involved in protein-protein interactions. Q#28006 - CGI_10013875 superfamily 192073 5 80 2.21E-29 101.161 cl07254 Yos1 superfamily - - "Yos1-like; In yeast, Yos1 is a subunit of the Yip1p-Yif1p complex and is required for transport between the endoplasmic reticulum and the Golgi complex. Yos1 appears to be conserved in eukaryotes." 
Q#28007 - CGI_10013876 superfamily 241563 71 104 0.00806234 34.7624 cl00034 BBOX superfamily - - "B-Box-type zinc finger; zinc binding domain (CHC3H2); often present in combination with other motifs, like RING zinc finger, NHL motif, coiled-coil or RFP domain in functionally unrelated proteins, most likely mediating protein-protein interaction." Q#28009 - CGI_10013878 superfamily 241672 37 469 6.57E-110 335.525 cl00192 ribokinase_pfkB_like superfamily - - "ribokinase/pfkB superfamily: Kinases that accept a wide variety of substrates, including carbohydrates and aromatic small molecules, all are phosphorylated at a hydroxyl group. The superfamily includes ribokinase, fructokinase, ketohexokinase, 2-dehydro-3-deoxygluconokinase, 1-phosphofructokinase, the minor 6-phosphofructokinase (PfkB), inosine-guanosine kinase, and adenosine kinase. Even though there is a high degree of structural conservation within this superfamily, their multimerization level varies widely, monomeric (e.g. adenosine kinase), dimeric (e.g. ribokinase), and trimeric (e.g THZ kinase)." Q#28010 - CGI_10013879 superfamily 248458 69 439 1.94E-23 100.081 cl17904 MFS superfamily - - "The Major Facilitator Superfamily (MFS) is a large and diverse group of secondary transporters that includes uniporters, symporters, and antiporters. MFS proteins facilitate the transport across cytoplasmic or internal membranes of a variety of substrates including ions, sugar phosphates, drugs, neurotransmitters, nucleosides, amino acids, and peptides. They do so using the electrochemical potential of the transported substrates. Uniporters transport a single substrate, while symporters and antiporters transport two substrates in the same or in opposite directions, respectively, across membranes. MFS proteins are typically 400 to 600 amino acids in length, and the majority contain 12 transmembrane alpha helices (TMs) connected by hydrophilic loops. 
The N- and C-terminal halves of these proteins display weak similarity and may be the result of a gene duplication/fusion event. Based on kinetic studies and the structures of a few bacterial superfamily members, GlpT (glycerol-3-phosphate transporter), LacY (lactose permease), and EmrD (multidrug transporter), MFS proteins are thought to function through a single substrate binding site, alternating-access mechanism involving a rocker-switch type of movement. Bacterial members function primarily for nutrient uptake, and as drug-efflux pumps to confer antibiotic resistance. Some MFS proteins have medical significance in humans such as the glucose transporter Glut4, which is impaired in type II diabetes, and glucose-6-phosphate transporter (G6PT), which causes glycogen storage disease when mutated." Q#28011 - CGI_10013880 superfamily 241677 28 191 2.38E-87 257.185 cl00197 cyclophilin superfamily - - "cyclophilin: cyclophilin-type peptidylprolyl cis-trans isomerases. This family contains eukaryotic, bacterial and archaeal proteins which exhibit a peptidylprolyl cis-trans isomerase activity (PPIase, Rotamase) and in addition bind the immunosuppressive drug cyclosporin (CsA). Immunosuppression in vertebrates is believed to be the result of the cyclophilin A-cyclosporin protein drug complex binding to and inhibiting the protein-phosphatase calcineurin. PPIase is an enzyme which accelerates protein folding by catalyzing the cis-trans isomerization of the peptide bonds preceding proline residues. Cyclophilins are a diverse family in terms of function and have been implicated in protein folding processes which depend on catalytic /chaperone-like activities. This group contains human cyclophilin 40, a co-chaperone of the hsp90 chaperone system; human cyclophilin A, a chaperone in the HIV-1 infectious process; and human cyclophilin H, a component of the U4/U6 snRNP, whose isomerization or chaperoning activities may play a role in RNA splicing." 
Q#28012 - CGI_10013881 superfamily 241583 226 434 1.95E-91 295.302 cl00064 ZnMc superfamily - - "Zinc-dependent metalloprotease. This super-family of metalloproteases contains two major branches, the astacin-like proteases and the adamalysin/reprolysin-like proteases. Both branches have wide phylogenetic distribution, and contain sub-families, which are involved in vertebrate development and disease." Q#28012 - CGI_10013881 superfamily 216572 22 166 3.35E-28 112.368 cl03265 Pep_M12B_propep superfamily - - Reprolysin family propeptide; This region is the propeptide for members of peptidase family M12B. The propeptide contains a sequence motif similar to the "cysteine switch" of the matrixins. This motif is found at the C terminus of the alignment but is not well aligned. Q#28012 - CGI_10013881 superfamily 246918 530 582 1.67E-15 73.3899 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#28012 - CGI_10013881 superfamily 246918 1340 1390 8.24E-07 48.3519 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#28012 - CGI_10013881 superfamily 246918 981 1028 8.64E-05 42.1887 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#28012 - CGI_10013881 superfamily 246918 1289 1334 0.00244309 37.7702 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#28012 - CGI_10013881 superfamily 246918 1228 1285 0.00790784 36.4107 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#28013 - CGI_10013882 superfamily 242392 11 169 7.74E-64 196.232 cl01251 OHCU_decarbox superfamily - - OHCU decarboxylase; The proteins in this family are OHCU decarboxylase - enzymes of the purine catabolism that catalyze the conversion of OHCU into S(+)-allantoin. This is the third step of the conversion of uric acid (a purine derivative) to allantoin. 
Step one is catalyzed by urate oxidase (pfam01014) and step two is catalyzed by HIUases (pfam00576). Q#28014 - CGI_10013883 superfamily 245814 29 101 1.38E-05 42.0911 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#28014 - CGI_10013883 superfamily 245814 154 227 2.49E-05 41.2137 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#28017 - CGI_10013886 superfamily 245814 93 167 0.000647225 36.3131 cl11960 Ig superfamily - - "Immunoglobulin domain; Ig: immunoglobulin (Ig) domain found in the Ig superfamily. The Ig superfamily is a heterogenous group of proteins, built on a common fold comprised of a sandwich of two beta sheets. 
Members of this group are components of immunoglobulin, neuroglia, cell surface glycoproteins, such as, T-cell receptors, CD2, CD4, CD8, and membrane glycoproteins, such as, butyrophilin and chondroitin sulfate proteoglycan core protein. A predominant feature of most Ig domains is a disulfide bridge connecting the two beta-sheets with a tryptophan residue packed against the disulfide bond." Q#28018 - CGI_10013887 superfamily 217685 77 185 8.07E-26 98.9456 cl04225 Cu2_monoox_C superfamily - - "Copper type II ascorbate-dependent monooxygenase, C-terminal domain; The N and C-terminal domains of members of this family adopt the same PNGase F-like fold." Q#28018 - CGI_10013887 superfamily 216290 15 53 7.33E-13 62.3058 cl03089 Cu2_monooxygen superfamily N - "Copper type II ascorbate-dependent monooxygenase, N-terminal domain; The N and C-terminal domains of members of this family adopt the same PNGase F-like fold." Q#28019 - CGI_10013888 superfamily 216290 27 55 0.00134609 32.6454 cl03089 Cu2_monooxygen superfamily NC - "Copper type II ascorbate-dependent monooxygenase, N-terminal domain; The N and C-terminal domains of members of this family adopt the same PNGase F-like fold." Q#28020 - CGI_10013889 superfamily 217685 87 232 7.71E-49 161.733 cl04225 Cu2_monoox_C superfamily - - "Copper type II ascorbate-dependent monooxygenase, C-terminal domain; The N and C-terminal domains of members of this family adopt the same PNGase F-like fold." Q#28020 - CGI_10013889 superfamily 216290 11 61 2.70E-17 75.7877 cl03089 Cu2_monooxygen superfamily N - "Copper type II ascorbate-dependent monooxygenase, N-terminal domain; The N and C-terminal domains of members of this family adopt the same PNGase F-like fold." Q#28021 - CGI_10013890 superfamily 216290 73 122 3.74E-09 49.9794 cl03089 Cu2_monooxygen superfamily C - "Copper type II ascorbate-dependent monooxygenase, N-terminal domain; The N and C-terminal domains of members of this family adopt the same PNGase F-like fold." 
Q#28024 - CGI_10013893 superfamily 247692 196 546 3.15E-45 163.616 cl17068 AFD_class_I superfamily - - "Adenylate forming domain, Class I; This family includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain." Q#28024 - CGI_10013893 superfamily 247692 49 221 1.08E-09 59.4827 cl17068 AFD_class_I superfamily C - "Adenylate forming domain, Class I; This family includes acyl- and aryl-CoA ligases, as well as the adenylation domain of nonribosomal peptide synthetases and firefly luciferases. The adenylate-forming enzymes catalyze an ATP-dependent two-step reaction to first activate a carboxylate substrate as an adenylate and then transfer the carboxylate to the pantetheine group of either coenzyme A or an acyl-carrier protein. The active site of the domain is located at the interface of a large N-terminal subdomain and a smaller C-terminal subdomain." Q#28026 - CGI_10003117 superfamily 245225 1 202 1.80E-25 101.234 cl10011 Periplasmic_Binding_Protein_Type_1 superfamily C - "Type 1 periplasmic binding fold superfamily; Type 1 periplasmic binding fold superfamily. This model and hierarchy represent the ligand binding domains of the LacI family of transcriptional regulators, periplasmic binding proteins of the ABC-type transport systems, the family C G-protein coupled receptors (GPCRs), membrane bound guanylyl cyclases including the family of natriuretic peptide receptors (NPRs), and the N-terminal leucine/isoleucine/valine-binding protein (LIVBP)-like domains of the ionotropic glutamate receptors (iGluRs). 
In LacI-like transcriptional regulator and the bacterial periplasmic binding proteins the ligands are monosaccharides including lactose, ribose, fructose, xylose, arabinose, galactose/glucose, and other sugars, with a few exceptions. Periplasmic sugar binding proteins are one of the components of ABC transporters and are involved in the active transport of water-soluble ligands. The LacI family of proteins consists of transcriptional regulators related to the lac repressor. In this case, the sugar binding domain binds a sugar which changes the DNA binding activity of the repressor domain. The periplasmic binding proteins are the primary receptors for chemotaxis and transport of many sugar based solutes. The core structures of periplasmic binding proteins are classified into two types, and they differ in number and order of beta strands: type 1 has six beta strands, while type 2 has five beta strands per sub-domain. These two structural folds are thought to be distantly related via a common ancestor. Notably, while the N-terminal LIVBP-like domain of iGluRs belongs to the type 1 periplasmic-binding fold protein superfamily, the glutamate-binding domain of the iGluR is structurally similar to the type 2 periplasmic-binding fold." Q#28027 - CGI_10003118 superfamily 246918 4 55 2.04E-10 52.5891 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain. Q#28027 - CGI_10003118 superfamily 246918 60 111 2.04E-10 52.5891 cl15278 TSP_1 superfamily - - Thrombospondin type 1 domain; Thrombospondin type 1 domain.